Monday, October 31, 2011

Deploying assets via IBM InfoSphere Information Server Manager

The work flow for deploying assets by using the IBM® InfoSphere™ Information Server Manager might involve more than one user. In tightly controlled environments with limited access to the target and source systems, a developer performs definition and build steps to write the package to the file system. A production manager (or any user with access to the required target project and the build package) then applies the package to a project on the target system.
Deployed assets are typically read-only in every environment except development. If an error that requires a change to an asset is discovered in a test or production environment, or if an improved implementation of a specific asset (such as an InfoSphere QualityStage rule set) is created, the change is made in the development environment and applied to the existing package by rebuilding it. The updated package can then be deployed into an existing project, replacing the corresponding assets. There might also be a requirement to revert to an earlier version of a deployment. In this case, a previous build of the package is deployed, replacing the corresponding assets.
The following diagram shows the work flow of a created package: 
Disclaimer: The postings on this site are my own and don't necessarily represent IBM's positions, strategies or opinions

Saturday, October 29, 2011

DataStage Jobs on Single Processor and Multiple Processor Systems

The default behavior when compiling DataStage jobs is to run all adjacent active stages in a single process. This makes good sense when you are running the job on a single processor system. When you are running on a multiprocessor system, it is better to run each active stage in a separate process so that the processes can be distributed among the available processors and run in parallel. This can be achieved either by inserting IPC stages between connected active stages or by turning on inter-process row buffering, either project wide (using the DataStage Administrator) or for individual jobs (in the Job Properties dialog box). The IPC facility can also be used to produce multiple processes where passive stages are directly connected. This means that an operation reading from one data source and writing to another could be divided into a reading process and a writing process able to take advantage of multiprocessor systems.
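The idea of splitting one read-and-write operation into a reading side and a writing side joined by a row buffer can be illustrated outside DataStage. The sketch below is only an analogy, not how DataStage or the IPC stage is implemented: it uses Python threads and a bounded queue in place of separate operating system processes and the inter-process buffer, and every name in it (reader, writer, run_pipeline, buffer_size) is made up for this illustration.

```python
import queue
import threading

# Unique end-of-data marker; a convention invented for this sketch.
SENTINEL = object()


def reader(source_rows, buffer):
    """Stand-in for the reading side: pushes rows into the buffer."""
    for row in source_rows:
        buffer.put(row)  # blocks while the buffer is full
    buffer.put(SENTINEL)  # tell the writer there is nothing more to read


def writer(buffer, target_rows):
    """Stand-in for the writing side: drains rows from the buffer."""
    while True:
        row = buffer.get()  # blocks while the buffer is empty
        if row is SENTINEL:
            break
        target_rows.append(row)


def run_pipeline(source_rows, buffer_size=4):
    """Run the reader and writer concurrently, joined by a bounded buffer."""
    buffer = queue.Queue(maxsize=buffer_size)
    target_rows = []
    workers = [
        threading.Thread(target=reader, args=(source_rows, buffer)),
        threading.Thread(target=writer, args=(buffer, target_rows)),
    ]
    for worker in workers:
        worker.start()
    for worker in workers:
        worker.join()
    return target_rows


if __name__ == "__main__":
    print(run_pipeline(["row-1", "row-2", "row-3"]))
```

Because the buffer is bounded, the reading side pauses when the writing side falls behind, so neither side has to hold the whole data set in memory. Row order is preserved because there is a single reader and a single writer.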
Behavior of Passive Stages
Behavior of Active Stages