Definition of a Join
Sometimes it is necessary to combine two SQL data sources into one result set, consisting of columns of both data sources. This is called a join.
As an example, when reading flight connections, it can be necessary to read details on the flight carrier, too, for example, the name of the carrier.
The rows of the result set are determined by the join type and by join conditions between columns of the data sources. The join conditions start with keyword ON and define relations between columns of the data sources.
The join type is defined by the corresponding keyword. In ABAP SQL, supported join types are INNER JOIN, LEFT OUTER JOIN, and RIGHT OUTER JOIN.
We will discuss the different results of these join types soon. But let's have a look at the ABAP SQL syntax of joins, first.
The example combines carrier data (carrier ID and carrier name) and connection data (connection ID, departure airport, and destination airport) into one result set. The left data source is database table /DMO/CARRIER, whereas database table /DMO/CONNECTION serves as right data source. The keyword INNER JOIN between the two data source names specifies the join type.
The join condition after ON request the content of field CARRIER_ID of database table /DMO/CARRIER equals the content of field CARRIER_ID of database table /DMO/CONNECTION.
In the join conditions as well as in the field list, WHERE clause, and so on, the columns of data sources are fully identified by the name of the data source and the column name, separated by the tilde sign ( ~ ).
Note
From other SQL dialects, you may be used to the period or full stop sign ( . ). This had to be replaced by another sign in ABAP SQL, because the period sign already has the meaning of ending the statement.It is possible to address the column names without the data source name. The pre-condition is that the column name only exists in one of the data sources and that the other data source does not contain a column of the same name.
In the example, only database table /DMO/CARRIER contains columns NAME and CURRENCY_CODE, whereas columns CONNECTION_ID, AIRPORT_FROM_ID, and AIRPORT_TO_ID only exist in database table /DMO/CONNECTION. Therefore, these column names are unique without preceding/DMO/CARRIER~ or /DMO/CONNECTION~.
On the other hand, both tables contain a column CARRIER_ID. Therefore, for this field /DMO/CARRIER~ or /DMO/CONNECTION~ is mandatory.
It is good practice to use aliases for the data source names to shorten the statement and increase readability. This is done using addition AS after the name of the data source. In the example, alias A is used for database table /DMO/CARRIER and alias C for database table /DMO/CONNECTION.
If a data source has an alias, it is mandatory to use the alias instead of the original data source name.Required Aliases in Self-Joins
In some situations, aliases for data sources are more than just a syntax alternative. They become indispensable in statements that contain the same data source more than once.
Let's have a look at an example. Let's consider we have a table with flight connections (DMO/CONNECTION). We are interested in destinations that we can reach starting from Frankfurt International Airport (FRA). But instead of direct flights, we are looking for flights with one transfer.
The solution is a join of table /DMO/CONNECTION with itself. The first join condition requests that the second connection (2nd leg) starts from the airport at which the first connection (1st leg) ends. The second join condition, linked to the first with the AND operator, excludes combinations where the second connection is a direct return flight to the original departure airport.
The highlighted row in the result is a connection with LH 400 from Frankfurt (FRA) to New York (JFK) and from there with AA 0015 to San Francisco (SFO).
It is quite obvious that such a self-join would not work without the alias names a for the left usage of (DMO/CONNECTION and b for the right usage of /DMO/CONNECTION. Without the different aliases, it would not be possible to distinguish between the first connection and the second connection - neither in the join condition nor in the field list or WHERE clause.
Note
In this kind of self-join, the alias name is required in front of each column name because every column name exists in both data sources.