Tuesday, September 27, 2011

End-of-Wave Concept in InfoSphere DataStage

Enterprises these days are often concerned with individual units of work. When a given message carries multiple rows “inside” it (like an XML document with repeating elements), something needs to be done to recognize these logical groupings of rows. Transactional tools like WebSphere TX were designed for this, but classic ETL tools like InfoSphere DataStage, with their enhanced features and optimized performance, need to stay “always on” for maximum efficiency. Blocking functions, those that have to “wait” on all rows before they can complete, are particularly sensitive to this. These include Aggregations, Sorts, Pattern Matching, XML document creation, and others.
InfoSphere DataStage manages this by supporting a concept known as end-of-wave. Driven automatically by the receipt of a SOAP envelope, or under developer control by the reading of “n” messages or other factors, end-of-wave is a “signal” that is sent through the Job, following all rows in a group, along every possible path. The end-of-wave signal makes all the downstream Stages “think” that processing is complete.
End-of-wave is merely the signal that separates two requests from entirely independent users, or the related contents of one MQSeries message from another. The Job, as noted before, is “always on.” It simply continues running and immediately receives data for the next “wave.” This behavior is inherent in the Information Services Director, as it manages traffic from incoming SOA clients via SOAP or other bindings, and is directly available in Stages like the MQSeries Connector.
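
To make the idea concrete, here is a small conceptual sketch in Python. It is not DataStage code and does not use any DataStage API; the names `EOW`, `source`, `aggregate_sum`, and `run_waves` are all made up for illustration. It assumes each incoming message is simply a list of numeric rows, and it shows a blocking stage (a sum) that emits its result when the end-of-wave marker arrives, resets, and keeps running for the next wave.

```python
from typing import Iterable, Iterator, List, Union


class EndOfWave:
    """Hypothetical marker object standing in for the end-of-wave signal."""


EOW = EndOfWave()


def source(messages: Iterable[List[int]]) -> Iterator[Union[int, EndOfWave]]:
    """Emit the rows of each message, followed by one end-of-wave marker."""
    for rows in messages:
        yield from rows
        yield EOW  # the marker follows all rows in the group


def aggregate_sum(stream: Iterable[Union[int, EndOfWave]]) -> Iterator[int]:
    """A blocking stage: hold a running total until end-of-wave arrives."""
    total = 0
    for item in stream:
        if item is EOW:
            yield total  # behave as if input were complete for this wave
            total = 0    # reset and stay "always on" for the next wave
        else:
            total += item


def run_waves(messages: Iterable[List[int]]) -> List[int]:
    """Wire the two stages together and collect one result per wave."""
    return list(aggregate_sum(source(messages)))


if __name__ == "__main__":
    print(run_waves([[1, 2, 3], [10, 20]]))
```

The point of the sketch is that the aggregator never terminates between messages: it produces one result per wave and carries on, which is the behavior the end-of-wave signal is meant to give blocking Stages.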

-Ritesh
Disclaimer: The postings on this site are my own and don't necessarily represent IBM's positions, strategies or opinions
