Tuesday, September 27, 2011

Designing InfoSphere DataStage jobs as services - Series 1

Exposing IBM InfoSphere DataStage jobs as services implies a set of constraints and guidelines. The service-oriented architecture (SOA) platform supports three job topologies for different load and work style requirements: batch jobs, batch jobs with a service output stage, and jobs with service input and output stages. The design of a job determines whether it is always up and running or runs once to completion. All jobs that are exposed as services process requests on an ad-hoc, 24x7 basis. The InfoSphere Information Services Director server starts job instances on one or more InfoSphere DataStage servers for load balancing and scalability.
A service accepts requests from client applications, mapping request data to input rows and passing them to the underlying jobs. A job instance can include database lookups, transformations, data standardization and matching, and other data integration tasks. A job instance can then return output rows that can be mapped to service response data and sent back to the client.
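To make the request and response flow concrete, here is a minimal conceptual sketch in Python. It is not the InfoSphere Information Services Director API: `ServiceOperation`, `handle`, and `standardize_name` are hypothetical names, and the "job" is an ordinary function standing in for a running job instance.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

Row = Dict[str, str]


@dataclass
class ServiceOperation:
    """Hypothetical model of one service operation backed by a job."""

    # Stand-in for a job instance: takes input rows, returns output rows.
    job: Callable[[List[Row]], List[Row]]

    def handle(self, request: Dict[str, str]) -> Dict[str, List[Row]]:
        input_rows = [dict(request)]        # map request data to input rows
        output_rows = self.job(input_rows)  # lookups, transformations, etc.
        return {"rows": output_rows}        # map output rows to response data


def standardize_name(rows: List[Row]) -> List[Row]:
    """Toy data standardization: trim and title-case the 'name' column."""
    return [{**row, "name": row["name"].strip().title()} for row in rows]


operation = ServiceOperation(job=standardize_name)
response = operation.handle({"id": "42", "name": "  john SMITH "})
print(response)
```

The point of the sketch is only the shape of the exchange: one request becomes one or more input rows, and whatever rows the job emits become the response payload.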
Batch jobs
Existing batch jobs can be exposed as services. A batch job starts on demand: each service request starts one instance of the job, which runs to completion. This topology is typically used when a real-time process needs to initiate a batch process and does not need direct feedback on the results. It is tailored for processing bulk data sets and can accept job parameters as input arguments. Jobs of this kind have the following characteristics:

Start and stop times:
The elapsed time for starting and stopping a batch job, also known as latency, is high. This factor contributes to a low throughput rate in communication with the service client. 
Job instances: 
The Information Service Framework (ISF) agent starts job instances on demand to process service requests, up to a maximum that you configure. For load balancing, you can run the jobs on multiple InfoSphere DataStage servers. 
Input and output:
An information service that is based on a batch job can use job parameters as input arguments. This type of service returns no output. When you design the information service, you can set values for job parameters. If the job ends abnormally, the service client receives an exception.
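The characteristics above can be pictured with a small Python simulation. This is a sketch under stated assumptions, not how the agent is actually implemented: `BatchJobService`, `JobEndedAbnormally`, and `load_region` are hypothetical names, the job is a plain function that returns `True` on success, and a request that arrives when the configured maximum is reached is assumed to wait for a free slot (the behavior at the limit is not described here).

```python
import threading
from typing import Callable, Dict


class JobEndedAbnormally(Exception):
    """Hypothetical exception delivered to the service client."""


class BatchJobService:
    """Conceptual model of a service backed by a batch job."""

    def __init__(self, run_job: Callable[[Dict[str, str]], bool],
                 max_instances: int) -> None:
        self._run_job = run_job
        # Caps concurrently running job instances at the configured maximum.
        self._slots = threading.BoundedSemaphore(max_instances)

    def invoke(self, **job_parameters: str) -> None:
        # One request starts one job instance that runs to completion.
        # Assumption: callers block here while all slots are in use.
        with self._slots:
            succeeded = self._run_job(dict(job_parameters))
        if not succeeded:
            raise JobEndedAbnormally(f"job failed for {job_parameters}")
        # No output rows are returned to the client.


def load_region(params: Dict[str, str]) -> bool:
    """Toy batch job: fails only for the made-up region 'BAD'."""
    return params.get("REGION") != "BAD"


service = BatchJobService(load_region, max_instances=2)
result = service.invoke(REGION="EMEA", RUN_DATE="2011-09-27")
print(result)
```

Calling `service.invoke(REGION="BAD")` raises `JobEndedAbnormally`, mirroring the exception that the service client receives when the job ends abnormally.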
I will cover the other topologies in the next post in this series.
-Ritesh
Disclaimer: The postings on this site are my own and don't necessarily represent IBM's positions, strategies or opinions
