How To Implement CDC in Datastage 8.1!: Incremental Loading in The Datastage Can Be Performed by Using The Change Data

The document discusses how to implement change data capture (CDC) in Datastage 8.1 using the CDC stage. The CDC stage takes sorted input on key fields and generates a change code to denote the type of change between an after and before dataset. Based on the change code, appropriate data manipulation language statements can be performed on the target database to load newly inserted, updated, and deleted records. Key steps include defining the after and before datasets, change key columns, change mode, change codes, filtering data based on change codes, and dropping the change code column before loading data into the target database. Performance and memory usage should be considered when using the CDC stage.

Uploaded by

Yogesh Sharma
Copyright
© Attribution Non-Commercial (BY-NC)
How to implement CDC in Datastage 8.1!

Incremental loading in DataStage can be performed with the Change Data Capture (CDC) stage. The CDC stage takes input sorted on the key fields and generates a change code that denotes the type of change. Based on the change code, the required DML statements can be performed on the target database. The attached figure shows the basic design of a job that uses the CDC stage. In that design we have to:

1) Define the SOURCE and TARGET datasets as after and before. The dataset that holds the latest values is the after dataset, and the one it is compared against is the before dataset.
2) Define the change key columns on which the sorted inputs are compared.
3) Set the change mode to "Explicit Keys, All Values". This means the whole record is considered: a change in any column of the record is detected as a change.
4) Define the codes for the different DML operations, i.e. DELETE code = 3, INSERT code = 2 and so on.
5) Once the above settings are done, we can identify the changes between the two datasets. The CDC stage gives us the change code through the change_code() function in the derivation area of the CDC output mapping.
6) Pass this output to a Filter stage, which filters the data on the change code and writes it to three different output links.
7) Send the filtered data to a Modify stage to drop the change_code column from the schema, which makes it database compatible.
8) Finally, the datasets hold all the newly inserted, updated and deleted records. Use the appropriate native database stages to read these datasets and load them into the database.

Points to remember while using CDC stage:

1) CDC needs sorted input, so if you have a large number of records to compare against, it could impact the performance of the job. This kind of job can also be very memory intensive, so it should be placed correctly in the sequence for execution.
2) DO NOT forget to drop the change_code field from the schema of the datasets; otherwise the next job, which loads the data into the database, will generate an error.
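
The logic described in the steps above can be sketched outside DataStage to make the data flow easier to follow. The Python below is only an illustration under stated assumptions, not DataStage code and not the stage's actual implementation: the function names capture_changes and split_and_drop are hypothetical, rows are modelled as plain dictionaries, and the update code (EDIT = 1) is an assumed value because the text above only gives the INSERT and DELETE codes. It compares a before and an after dataset sorted on the key columns, tags each changed row with a change_code, then splits the rows by code and drops the change_code column before loading.

```python
# Illustrative sketch only: mimics the CDC job flow in plain Python.
# INSERT and DELETE follow the codes quoted in the text; EDIT is an assumed value.
INSERT, DELETE, EDIT = 2, 3, 1


def capture_changes(before, after, key_cols):
    """Compare two lists of dict rows on key_cols and tag changed rows."""
    def key(row):
        return tuple(row[c] for c in key_cols)

    # The comparison relies on both inputs being sorted on the key columns.
    before = sorted(before, key=key)
    after = sorted(after, key=key)

    changes = []
    i = j = 0
    while i < len(before) and j < len(after):
        kb, ka = key(before[i]), key(after[j])
        if kb == ka:
            # Same key: any differing column counts as an update.
            if before[i] != after[j]:
                changes.append({**after[j], "change_code": EDIT})
            i += 1
            j += 1
        elif kb < ka:
            # Key exists only in the before dataset: the row was deleted.
            changes.append({**before[i], "change_code": DELETE})
            i += 1
        else:
            # Key exists only in the after dataset: the row is new.
            changes.append({**after[j], "change_code": INSERT})
            j += 1

    # Whatever is left on either side is a pure delete or a pure insert.
    changes += [{**row, "change_code": DELETE} for row in before[i:]]
    changes += [{**row, "change_code": INSERT} for row in after[j:]]
    return changes


def split_and_drop(changes):
    """Route rows by change code and drop change_code from each row."""
    links = {INSERT: [], EDIT: [], DELETE: []}
    for row in changes:
        clean = {k: v for k, v in row.items() if k != "change_code"}
        links[row["change_code"]].append(clean)
    return links
```

Calling split_and_drop(capture_changes(before, after, ["id"])) returns three lists that play the role of the three output links of the Filter stage. Unchanged rows are left out of the result, and sorting both inputs up front is the step that becomes expensive for large datasets, which is the performance point noted above.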
