In this course, students will develop a deeper understanding of the DataStage architecture, including the DataStage development and runtime environments.
After completing this course, students should be able to:
- Describe the parallel processing architecture
- Describe pipeline and partition parallelism
- Describe the role and elements of the DataStage configuration file
- Describe the compile process and how it is represented in the OSH
- Describe the runtime job execution process and how it is depicted in the Score
- Describe how data partitioning and collecting work in the parallel framework
- List and select partitioning and collecting algorithms
- Describe sorting in the parallel framework
- Describe optimization techniques for sorting
- Describe sort key and partitioner key logic in the parallel framework
- Describe buffering in the parallel framework
- Describe optimization techniques for buffering
- Describe and work with parallel framework data types and elements, including virtual data sets and schemas
- Describe the function and use of Runtime Column Propagation (RCP) in DataStage parallel jobs
- Create reusable job components using shared containers
- Describe the function and use of Balanced Optimization
- Optimize DataStage parallel jobs using Balanced Optimization
Course outline:
- Unit 1 - Introduction to the Parallel Framework Architecture
- Unit 2 - Compilation and Execution
- Unit 3 - Partitioning and Collecting Data
- Unit 4 - Sorting Data
- Unit 5 - Buffering in Parallel Jobs
- Unit 6 - Parallel Framework Data Types
- Unit 7 - Reusable Components
- Unit 8 - Balanced Optimization
This advanced course is designed for experienced DataStage developers who want training in advanced DataStage job techniques and an understanding of the parallel framework architecture.