Use the APT_PM_PLAYER_TIMING environment variable to collect CPU timing information for each operator in a job flow. An example output is:
The output shows that each partition of each operator consumed about one tenth of a second of CPU time during its run-time portion. A real-world ETL flow would have many operators and many partitions. It is important to understand how much CPU each operator (and each partition of each component) is using. If one partition of an operator uses significantly more CPU than the others, the data may be partitioned in an unbalanced way, and repartitioning or choosing different partitioning keys might be a useful strategy.
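The partition comparison described above can be sketched in a few lines of Python. This is a minimal illustration, not part of the product: the function name find_skewed_partitions, the 2x threshold, and the sample timings are all assumptions, and the CPU seconds are presumed to have already been copied out of the job log by hand or by a script of your own.

```python
from statistics import mean


def find_skewed_partitions(cpu_by_partition, ratio=2.0):
    """Return indices of partitions whose CPU time exceeds ratio x the mean.

    cpu_by_partition: CPU seconds per partition for a single operator.
    ratio: hypothetical threshold; tune it for your own flows.
    """
    if not cpu_by_partition:
        return []
    avg = mean(cpu_by_partition)
    if avg == 0:
        return []
    return [i for i, t in enumerate(cpu_by_partition) if t > ratio * avg]


# Hypothetical per-partition CPU seconds for one operator (not real output).
balanced = [0.10, 0.11, 0.09, 0.10]
skewed = [0.10, 0.09, 0.95, 0.11]

print(find_skewed_partitions(balanced))  # -> []
print(find_skewed_partitions(skewed))    # -> [2]
```

A flagged partition is only a hint that the partitioning keys deserve a closer look; a fixed ratio is a crude test and small absolute timings can trip it easily.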
If one operator is using a much larger portion of the CPU than others, it might indicate a problem in the flow. Some judgment is required here: a sort is going to use dramatically more CPU time than a copy. Even so, the timings give a sense of which operators are the CPU hogs, and when combined with other metrics presented in this document they can be very enlightening.
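To make the operator-level comparison concrete, the sketch below ranks operators by their share of total CPU time. The function name cpu_share_by_operator, the operator names, and the timings are hypothetical, chosen only to illustrate the arithmetic; the input is assumed to be per-partition CPU seconds you have already collected for each operator.

```python
def cpu_share_by_operator(cpu_seconds):
    """Rank operators by their fraction of total CPU time, largest first.

    cpu_seconds: dict mapping operator name to a list of per-partition
    CPU seconds (hypothetical structure, filled in from your own logs).
    """
    totals = {op: sum(parts) for op, parts in cpu_seconds.items()}
    grand_total = sum(totals.values())
    if grand_total == 0:
        return []
    shares = [(op, t / grand_total) for op, t in totals.items()]
    return sorted(shares, key=lambda pair: pair[1], reverse=True)


# Hypothetical timings for a three-operator flow on two partitions.
timings = {
    "copy": [0.5, 0.5],
    "sort": [2.5, 2.5],
    "transform": [1.0, 1.0],
}

for op, share in cpu_share_by_operator(timings):
    print(f"{op}: {share:.1%}")
```

In this made-up flow the sort dominates, which is expected for a sort and not by itself a sign of trouble; the ranking is most useful when an operator that should be cheap appears near the top.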
Setting APT_DISABLE_COMBINATION might be useful in some situations to get finer-grained information about which operators are using up CPU cycles. Be aware, however, that setting this flag changes the performance behavior of your flow, so it should be done with care. Unlike the job monitor CPU percentages, setting APT_PM_PLAYER_TIMING provides timings on every operator within the flow.