Wednesday, December 7, 2011

Looping in InfoSphere DataStage Transformer

InfoSphere DataStage introduced looping in the Transformer stage. Because the feature is built into the same Transformer stage that developers have been using for years, you can begin using it in any existing job. Looping allows very complex data integration challenges to be solved elegantly from a design perspective and efficiently from a resource perspective.
 
Variable Length Records with Embedded Payloads 
Many customers and developers face the challenge of a variable-length string that contains multiple record types and payloads embedded in the data.  Here's a sample record.
ID | String
1  | A005aaaaaA005bbbbbB010cccccccccc
2  | ....
You can see a series of record types (the first being "A"), payload lengths ("005"), and payloads ("aaaaa"). Now assume we want to convert that data to the following:
ID | Rcd Type | Data Record
1  | A        | aaaaa
1  | A        | bbbbb
1  | B        | cccccccccc
2  |          |

What makes this problem challenging is that the length is defined in the data and the number of segments can vary tremendously (record 2 may have 100 payloads in that string).  In DataStage 8.5, the looping Transformer handles this easily by introducing a loop condition.  Here's the Transformer logic for solving this:

[Screenshot: Transformer stage logic, with the loop logic circled]

The circled logic includes a new variable named "@ITERATION", a counter that indicates which pass through the loop this is.  The other new item is the Loop variables: they are essentially the same as Stage variables, but they are evaluated on every pass through the loop. The test for RemainingRecord <> "" lets us exit the loop once all bytes in the string have been consumed.
This approach replaces the several additional Transformer and Funnel stages that the customer uses in the current implementation. The savings therefore apply not only to the initial design effort and run-time performance, but also to the ongoing maintenance of the job as related requirements in the organization change.
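
For readers who want to see the loop spelled out without a DataStage install, here is a rough Python sketch of the same idea. It is not the Transformer derivation itself, only an illustration under some assumptions: each segment is taken to be a one-character record type, a three-digit payload length, and then the payload. The function name split_segments and the variable names are made up for this example; "remaining" plays the role of RemainingRecord and "iteration" plays the role of @ITERATION.

```python
def split_segments(record_id, text):
    """Yield one (id, record_type, payload) row per embedded segment.

    Assumed layout per segment: 1-char record type, 3-digit length, payload.
    """
    remaining = text   # stands in for the RemainingRecord loop variable
    iteration = 0      # stands in for @ITERATION (first pass is 1)

    # Loop condition: keep going while unconsumed bytes remain.
    while remaining != "":
        iteration += 1
        record_type = remaining[0]
        length = int(remaining[1:4])        # raises ValueError on a bad length
        payload = remaining[4:4 + length]
        remaining = remaining[4 + length:]  # drop the segment just consumed
        yield record_id, record_type, payload


if __name__ == "__main__":
    sample = "A005aaaaaA005bbbbbB010" + "c" * 10
    for row in split_segments(1, sample):
        print(row)
```

One input row goes in and one output row comes out per pass, which is the same shape as the job above: the loop runs as many times as the data dictates, whether that is 3 segments or 100.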

-Ritesh
Disclaimer: The postings on this site are my own and don't necessarily represent IBM's positions, strategies or opinions
