When Data is grouped on a column, and also is sorted on
this column, you can detect when you process the last row before the
value in that column changes. The data must be sorted
in the job, so that the Transformer stage can detect that it is sorted.
If the data is sorted before being input to the job, you can include
a dummy Sort stage and set the stage's Sort Key property to Don't Sort (previously sorted - as always) for every column on which the data was previously sorted.
To detect key breaks, & then process
data accordingly use new function LastRowInGroup(InputColumn). It can be used when processing an input
row, and the function returns TRUE if the value in the named column
changes after this row (that is, this row is the last row in a group).
The function also returns TRUE if this row is the last row in the
input data. This function can be called even if input data is sorted by more than one column, or If the argument specified is the primary sorted key, then the
behavior is as described for a single column.
If the argument specified is the secondary or other sorted key,
then LastRowInGroup() returns TRUE if the value of the specified column
is about to change, or if any of its higher level of key columns are
about to change. For example, if the primary sorted column is Col1
and the secondary sorted column is Col2, then LastRowInGroup(Col2)
returns true when either the value in Col2 is about to change, or
the value in Col1 is about to change. Therefore it can return true
when Col2 is not about to change, but Col1 is, because Col1 is a higher
level sorted key than Col2
Please refer previous Blog on Caching with-in Transformer for example and usage.