Below is a sample APT configuration file for three engines: two are used as compute nodes and one is used only as the conductor node, which starts the job. Engine02 and Engine03 each host two logical nodes.
If the main engine should only start jobs (conductor node) and not act as a
compute node, do not include the conductor in the default pool (the pool
represented by ""). In the file below, the conductor is assigned to a pool
called "conductor" (the name is only an example; any name would do) and is
not a member of the default pool "". Every other node belongs only to the
default pool "". With this change the conductor node starts the job, but
all Section Leaders and other processes run on the remote nodes.
{
node "node0"
{
fastname "Engine01"
pools "conductor"
resource disk "/opt/IBM/InformationServer/Server/Datasets/node0" {pools "conductor"}
resource scratchdisk "/opt/IBM/InformationServer/Server/Scratch/node0" {pools ""}
}
node "node1"
{
fastname "Engine02"
pools ""
resource disk "/opt/IBM/InformationServer/Server/Datasets/node1" {pools ""}
resource scratchdisk "/opt/IBM/InformationServer/Server/Scratch/node1" {pools ""}
}
node "node2"
{
fastname "Engine02"
pools ""
resource disk "/opt/IBM/InformationServer/Server/Datasets/node2" {pools ""}
resource scratchdisk "/opt/IBM/InformationServer/Server/Scratch/node2" {pools ""}
}
node "node3"
{
fastname "Engine03"
pools ""
resource disk "/opt/IBM/InformationServer/Server/Datasets/node3" {pools ""}
resource scratchdisk "/opt/IBM/InformationServer/Server/Scratch/node3" {pools ""}
}
node "node4"
{
fastname "Engine03"
pools ""
resource disk "/opt/IBM/InformationServer/Server/Datasets/node4" {pools ""}
resource scratchdisk "/opt/IBM/InformationServer/Server/Scratch/node4" {pools ""}
}
}
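To check a configuration like the one above before running a job, a small script can list which logical nodes are in the default pool "" (compute nodes) and which are not (conductor only). The following is a minimal sketch, not an IBM tool: the function names parse_nodes and split_roles are made up for illustration, the parser only understands the simple one-statement-per-line layout used in this post, and the sample text is shortened to three nodes.

```python
import re

# Shortened sample in the same layout as the configuration file above.
SAMPLE_CONFIG = '''
{
node "node0"
{
fastname "Engine01"
pools "conductor"
resource disk "/opt/IBM/InformationServer/Server/Datasets/node0" {pools "conductor"}
resource scratchdisk "/opt/IBM/InformationServer/Server/Scratch/node0" {pools ""}
}
node "node1"
{
fastname "Engine02"
pools ""
resource disk "/opt/IBM/InformationServer/Server/Datasets/node1" {pools ""}
resource scratchdisk "/opt/IBM/InformationServer/Server/Scratch/node1" {pools ""}
}
node "node2"
{
fastname "Engine02"
pools ""
resource disk "/opt/IBM/InformationServer/Server/Datasets/node2" {pools ""}
resource scratchdisk "/opt/IBM/InformationServer/Server/Scratch/node2" {pools ""}
}
}
'''


def parse_nodes(text):
    """Hypothetical helper: collect name, fastname and node-level pools per node."""
    nodes = []
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        match = re.match(r'node\s+"([^"]*)"', line)
        if match:
            current = {"name": match.group(1), "fastname": None, "pools": []}
            nodes.append(current)
            continue
        if current is None:
            continue
        match = re.match(r'fastname\s+"([^"]*)"', line)
        if match:
            current["fastname"] = match.group(1)
            continue
        # Only node-level "pools" lines start with the keyword; resource
        # lines start with "resource" and are deliberately ignored here.
        if line.startswith("pools"):
            current["pools"] = re.findall(r'"([^"]*)"', line)
    return nodes


def split_roles(nodes):
    """Nodes in the default pool "" are compute nodes; the rest are not."""
    compute = [n["name"] for n in nodes if "" in n["pools"]]
    conductor_only = [n["name"] for n in nodes if "" not in n["pools"]]
    return compute, conductor_only


if __name__ == "__main__":
    compute, conductor_only = split_roles(parse_nodes(SAMPLE_CONFIG))
    print("Compute nodes:", compute)
    print("Not in default pool:", conductor_only)
```

If the conductor's node shows up in the first list, it is still a member of the default pool and will receive Section Leader and player processes.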
Brief Summary of different DataStage Processes
- Conductor Node (one per job): the main process used to
start up jobs, determine resource assignments, and create Section Leader
processes on one or more processing nodes. It acts as the single coordinator
for status and error messages and manages orderly shutdown when processing
completes or when a fatal error occurs. The conductor runs on the primary
server.
- Section Leaders (one per logical processing node): used to
create and manage player processes which perform the actual job
execution. The Section Leaders also manage communication between the
individual player processes and the master Conductor Node.
- Players: one or more logical groups of processes used to
execute the data flow logic. All players are created as groups on the
same server as their managing Section Leader process.
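The hierarchy above gives a rough way to estimate how many processes a job starts. The sketch below is only an approximation under stated assumptions: it assumes one Section Leader per logical node in the default pool and one player per operator on each of those nodes. In practice the engine may combine operators into a single player, and stage-level node constraints can change the count, so the real number can be lower. The function name estimate_processes and the operator count are hypothetical.

```python
def estimate_processes(logical_nodes, operators_per_node):
    """Rough estimate of processes for one job (assumption-based, see text)."""
    conductor = 1                                # one conductor per job
    section_leaders = logical_nodes              # one per logical processing node
    players = logical_nodes * operators_per_node # assumes no operator combination
    return {
        "conductor": conductor,
        "section_leaders": section_leaders,
        "players": players,
        "total": conductor + section_leaders + players,
    }


if __name__ == "__main__":
    # The sample file has 4 logical nodes in the default pool (node1..node4);
    # 3 operators per node is a made-up figure for illustration.
    print(estimate_processes(logical_nodes=4, operators_per_node=3))
```

With the sample configuration, node0 is outside the default pool, so it contributes the conductor process but no Section Leader or players in this estimate.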
The next blog post discusses this further, with more insight into the node concept in DataStage.