Source–based CDC evaluates the source tables to determine what has changed and only extracts changed rows to load into the target tables. Target–based CDC, by contrast, extracts all the data from the source, compares the source and target rows, and then loads only the changed rows into the target with new surrogate keys.
Use target-based CDC when source-based information is limited.
To implement target-based changed data capture, you must use the Table_Comparison transform. Other transforms can help you complete your data flow.
Target-based CDC Transforms
Three Data Services transforms support target-based CDC:
Transform | Description |
---|---|
Table Comparison | Compares two data sets and produces the difference between them as a data set with rows flagged asINSERT or UPDATE . |
History Preserving | Converts rows flagged as UPDATE to UPDATE plus INSERT, so that the original values are preserved in the target. Specify the column in which to look for updated data. |
Key Generation | Generates new keys for source data, starting from a value based on existing keys specified by you in the table. |
The steps in the target-based change data capture process are outlined in the following figure:
Target-based CDC with History Preserving
Preserve history by creating a data flow that contains:
- A source table that contains the rows to be evaluated.
- A Query transform that maps columns from the source.
- A Table Comparison transform that compares the data in the source table with the data in the target table to determine what has changed.
- A History Preserving transform that converts certain UPDATE rows to INSERT rows based on the columns in which values have changed. This transform produces a second row in the target instead of overwriting the first row.
- A Key Generation transform that generates new keys for the updated rows that are now flagged as INSERT .
- A target table that receives the rows. The target table cannot be a template table.