Using Joins

Objectives

After completing this lesson, you will be able to:

  • Use Joins.
  • Create an Equi-Join.
  • Create an Outer Join.
  • Create a Theta Join.
  • Create a Column Filter.

Joins

A join is a condition that links tables in the data foundation. A join restricts the data returned when the two tables are queried.

Tables that are joined have a parent-child relationship. If tables are not joined, then a query run on the two tables can return a result set that contains all possible row combinations. Such a result set is known as a cartesian product and is rarely useful.

Joins are defined by linking a column in one table to a column in a second table.

Approaches to Creating Joins

You have multiple options for creating joins.

  • Defining joins manually in the schema.

  • Defining join properties directly in the Edit Join dialog box.

  • Allowing the Information Design Tool to detect the joins for you.

Join Detection

Join detection looks at the data foundation tables and proposes appropriate joins based on existing joins in the database. However, the database joins serve more than to act as restrictions in SQL statements, so they may all not be appropriate to include on the data foundation.

Methods of Detecting Joins

  • Join detection based on column name. This method looks for identical column names in different tables. It also checks that the data type of the two columns is the same. If more than one column matches between two tables, joins are proposed for each column.

  • Join detection based on database keys. This method looks for relationships defined in the database between primary keys and foreign keys.

  • For data foundations with an SAP BW connection, join detection is based on the joins in the database schema referenced in the connection.

To detect joins in the data foundation, select Detect Joins from the Detect menu in the data foundation view and select the join detection method.

Once you have selected the join detection method, the joins are detected and proposed in a dialog box. You can then select the joins to be inserted into the data foundation.

Cardinality

Cardinality further describes how tables are joined by stating how many rows in one table match the rows in another table. Cardinalities are needed when detecting aliases and contexts to resolve loops in the data foundation.

Expressing Cardinality

  • Cardinality of a join is expressed as a pair of numbers: the number of rows in one table that match the number of rows in the joined table.
  • The number of rows that match can be none (0), one (1), or many (n) for each table.

    For example, assume that the tables CUSTOMER and RESERVATIONS are joined .For each customer, there can be one or more reservations, so the cardinality of the CUSTOMER table to the RESERVATIONS table is one-to-many, or 1,n.

Detect Cardinality

Cardinalities can be detected for joins. The detection method first detects primary and foreign keys. Cardinalities are set according to the key status of the column in the two tables as follows:

Key Table

First Table ColumnSecond Table ColumnCardinality
Primary keyForeign key1, n
Foreign keyPrimary keyn, 1

If no keys are detected, the cardinality is set using table row counts.

To detect or set cardinalities for all joins, select Detect Cardinalities in the Detect menu. The Detect Cardinalities dialog box lists the current cardinalities stored for all joins in the data foundation. From this list, you can set manually, or detect cardinality for a selection of joins.

To detect cardinality for one join, right-click the join line in the data foundation view and select Detect Cardinality. The cardinality of the selected join is updated.

Manual Cardinality Setting

You can also manually set cardinality when editing the join details with the Edit Join command. When you set cardinalities manually, consider each individual join. This consideration helps you to become aware of potential join path problems in your schema.

If you rely on automatically detected cardinalities, you may not find these problems. Such problems include isolated one-to-one joins at the end of a join path, or excessive primary keys where not all columns are required to ensure uniqueness.

Types of Joins

There are four primary types of joins that are typically defined on a data foundation:

  • Equi-join
  • Outer join
  • Theta join
  • Column filter

Equi-Joins

An equi-join is a join based on column values between two tables. In a normalized database, the columns used in an equi-join are frequently the primary key in one table and the foreign key in the other. A primary key of a relational table uniquely identifies each record in the table. Primary keys may consist of a single attribute or multiple attributes combined. A foreign key is a field in a relational table that matches the primary key column of another table.

When a SELECT statement is run, the SELECT and FROM clauses are now properly defined and prevent a cartesian product.

Outer Joins

An outer join indicates that one table has more rows the other table for the joining columns. You define an outer join by specifying which table is the outer table in the original equi-join.

The outer table contains the column for which you want to return all values, even if there is no matching column in the other table. You specify the outer table from the Edit Join dialog box for the selected join.

When writing SQL directly (versus generating it using a universe), different database systems define "outer" differently, and the syntax of the statement also differs. For example, depending on the underlying database system, the outer join may be on the left or right table.

But a universe is not database-specific, so outer joins must remain generic. The database driver chosen when creating the connection determines whether the outer join is left or right. Therefore, in a universe, the outer join is usually placed on the table that contains all the data.

Outer Join Best Practices

SAP recommends that you place outer joins at the end of the flow of data, otherwise ambiguous outer join errors may occur. These errors have the potential to cause the SQL to try to match the equality of a NULL value, which it cannot do.

If you do place outer joins in the middle of a table path, make the subsequent joins in the path outer joins to avoid errors.

Be aware that outer joins may cause the query to run slower than a query with no outer joins.

Theta Joins

A theta join is a join that links tables based on an unequal relationship between two columns. It is used to demonstrate ranges, such as start and end date, or minimum and maximum. A theta join can use any operator other than the equal operator.

Column Filters

A column filter lets you restrict specific values returned whenever the table to which the filter is applied is used in a query.

The following rules apply to column filters:

Rules for Column Filters

  • Only one filter per column is allowed.

  • You can insert a filter on a calculated column.

  • The expression can contain subqueries.

  • The following @functions are allowed in the expression: @Prompt and@Variable.

  • When you insert a filter into a standard table, and create an alias from that table, the alias table does not contain the filter you inserted.

  • When you insert a filter into an alias table, the filter is not automatically inserted into the original standard table.

Join Properties

When you insert any type of join, you can define the following properties in the Edit Join dialog box:

Join Properties

Join PropertyDescription
Table 1 ColumnThe column in the first table to use for the join.
Table 2 ColumnThe column in the second table or the column to join to.
Join OperatorsBetween Table 1 and Table 2, a list box of join operators lets you choose how to compare the values of the columns in the join.

The list includes an equi-join (=) and operators for joins that are not based on equality between columns (>, >=, <, <=, !=)

You can also choose to create a complex join, which is a join that contains sub queries.

Shortcut joinShortcut joins provide alternative paths between two tables and improve the performance of queries. Shortcut joins do not take into account intermediate tables so longer join paths are shortened.
Outer joinSelect the Outer join check box to create outer joins. An outer join allows rows to be returned even when there is no matching row in the joined table.
ExpressionBased on the columns and operators you select, an SQL expression is automatically generated to define the join. You can type a custom expression for the join. For help editing the join expression, click the SQL Assistant icon.
CardinalitySelect the cardinality for the join from the Cardinality list box. You can also click the Detect button to automatically detect the cardinality defined for the join the database.

Create Joins between the Data Foundation Tables

Watch this video to see how to create and configure the four types of joins.

Create Joins in the IDT

After you have inserted your tables on your data foundation, create joins to prevent cartesian product results.

Log in to track your progress & complete quizzes