Combining data with the Merge transform


After completing this lesson, you will be able to:

  • Combine data using the Merge transform

The Merge Transform

Explaining the Merge Transform

The Merge transform combines incoming data sets with the same schema structure to produce a single output data set with the same schema as the input data sets. For example, use the Merge transform to combine two sets of address data as shown in the following figure.



The Merge transform performs a union of the sources. All sources must have the same schema, which means:

  • Sources must have the same number of columns.

  • Columns must be in the same order.

  • Columns must have the same names.

  • Columns must have the same data types, with the same lengths where possible.
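The schema rules above can be illustrated outside the tool. The following is a minimal Python sketch, not Data Services syntax: it assumes a schema can be represented as an ordered list of (column name, data type) pairs, and the names `Schema`, `schemas_match`, and the sample schemas are hypothetical. For simplicity it treats the length as part of the data type and requires an exact match, which is stricter than "where possible".

```python
from typing import List, Tuple

# Assumption for this illustration: a schema is an ordered list of
# (column_name, data_type) pairs. This is not how Data Services stores schemas.
Schema = List[Tuple[str, str]]


def schemas_match(left: Schema, right: Schema) -> bool:
    """Return True when both schemas have the same number of columns,
    in the same order, with the same names and data types."""
    if len(left) != len(right):
        return False  # different number of columns
    # Comparing position by position checks order, name, and type together.
    return all(l == r for l, r in zip(left, right))


# Hypothetical sample schemas.
customers_us: Schema = [("name", "varchar(50)"), ("city", "varchar(30)")]
customers_eu: Schema = [("name", "varchar(50)"), ("city", "varchar(30)")]
customers_reordered: Schema = [("city", "varchar(30)"), ("name", "varchar(50)")]

print(schemas_match(customers_us, customers_eu))         # same schema
print(schemas_match(customers_us, customers_reordered))  # same columns, different order
```

The second comparison fails even though both schemas contain the same columns, because column order is part of the requirement.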

If the input data sets contain hierarchical data, the names and data types must match at every level of the hierarchy.

The output data set has the same schema as the source data sets and contains one row for every row in the sources. The transform does not remove duplicate rows. If columns in the input data sets contain nested schemas, the nested data is passed through unchanged.
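The row-level behavior described above is comparable to a SQL UNION ALL: every input row appears in the output, duplicates included. The following is a conceptual Python sketch of that behavior, not Data Services code. The function `merge_rows` and the sample data are hypothetical, and the output order shown here is only a consequence of how the sketch appends rows; the lesson makes no statement about row order.

```python
from typing import Dict, List

Row = Dict[str, str]


def merge_rows(*sources: List[Row]) -> List[Row]:
    """Concatenate all source row sets into one output row set.
    No duplicate removal is performed."""
    merged: List[Row] = []
    for source in sources:
        merged.extend(source)
    return merged


# Hypothetical address data; the row for "Ben" appears in both sources.
addresses_a = [{"name": "Ana", "city": "Lisbon"}, {"name": "Ben", "city": "Oslo"}]
addresses_b = [{"name": "Ben", "city": "Oslo"}, {"name": "Cy", "city": "Rome"}]

merged = merge_rows(addresses_a, addresses_b)
print(len(merged))  # 2 rows + 2 rows = 4 rows; the duplicate is kept
```

If duplicates need to be removed, that has to happen in a separate step after the merge.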

If you want to merge tables that do not have the same schema, place a Query transform between one of the tables and the Merge transform, and use it to redefine that table's schema so that it matches the other table.
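To illustrate the idea of reshaping one input before merging, here is a small Python sketch. It is only an analogy for what a Query transform mapping does, not Data Services syntax; the function `reshape`, the column names, and the sample rows are all hypothetical.

```python
from typing import Dict, List, Tuple

Row = Dict[str, str]


def reshape(rows: List[Row], mapping: List[Tuple[str, str]]) -> List[Row]:
    """Rename and reorder columns.

    mapping is an ordered list of (target_column, source_column) pairs;
    the output columns follow the order of the mapping.
    """
    return [{target: row[source] for target, source in mapping} for row in rows]


# Two hypothetical sources with different column names and order.
table_a = [{"name": "Ana", "city": "Lisbon"}]
table_b = [{"town": "Rome", "full_name": "Cy"}]

# Redefine table_b so that it matches table_a: same names, same order.
table_b_aligned = reshape(table_b, [("name", "full_name"), ("city", "town")])

# Both inputs now share one schema and can be combined.
merged = table_a + table_b_aligned
print(merged)
```

Only one side needs to be reshaped, which mirrors the advice above to add the Query transform to one of the tables.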

The Merge transform does not offer any options.

