Optimize Joins

Objectives
After completing this lesson, you will be able to:

After completing this lesson, you will be able to:

  • Optimize joins to improve calculation view performance

Join Optimization

Set Join Cardinalities

When you define a join, you can optionally set the cardinality to describe the relationship between the two data sources, for example, 1:1 ,1:m, n:m.

We recommend that you always set join cardinalities so the optimizer can decide if joins can be omitted at run time based on the query that is being sent to the calculation view.

If tables are used as the data sources, and not calculation views, then you can use the Propose Cardinality option.

Caution
The cardinality setting is proposed based on the content of the joined tables at the time you click the Proposalbutton. The cardinality setting does not automatically update as the table content change. You might need to come back and change the cardinality setting if the relationship in the data changes. Also, remember that you need to use realistic data if you plan to use the proposal. It is no good proposing cardinality on a few test records that later do not make sense on live data.

One of the main benefits of having the correct cardinality set is that the SQL optimizer can prune entire data source from a join if certain prerequisites are met.

Referential Join Type

Consider using the join type Referential if your joins do not have to ensure integrity of the data, and are defined to provide access to additional columns where you already trust the data integrity (e.g. the sales item will always have a header).

It might be possible for the SQL optimizer to prune the data source from the join if it is not needed, if certain conditions are met.

General Join Recommendations

Joins are a source of potential performance improvements. Make sure you consider all the recommendations when defining joins to ensure you get the best performance from your calculation views.

Join Column Optimization

We already covered the pruning of data sources in a join but you should learn about another optimization possibility. In the remaining data source, can the column that was part of the join also be pruned?

Background - The Default Behavior

By default, the column on which a join is defined will always be included in the result, regardless of whether the column was requested by the query.

The reason for this is to guarantee a consistent aggregation behavior concerning the join field, even when the field in not requested by the query. If this aggregation behavior changes depending on which columns are requested, this could lead to a change of the resulting values of measures.

Optimize Join Columns Option

When you select the Optimize Join Columns checkbox, you tell the calculation engine that – in case a join partner is pruned – you do not expect changes to the result set depending on whether the joined column from the queried data source is used for aggregation or not. This depends heavily on the type of aggregations that are performed by the calculation view. In particular, the SUM aggregate function is not sensitive to the grouping level, whereas the MAX or MIN are.

When Does the Optimization (Join Pruning) Occur?

When the Optimize Join Columns option is active, pruning of join columns between two data sources, A and B, occurs when all four following conditions are met:

  • The join type is Referential, Outer or Text (actually, the calculation view cannot be built if join type is Inner).

  • Only columns from one join partner, A, are requested.

  • The join column from A is NOT requested by the query.

  • The cardinality on B side (the side of the join partner from which no column is requested) is :1.

Caution
As you see, the optimization heavily relies on cardinality. So, you must ensure that the cardinality is set according to the actual data model. If it is not, the Optimize Join Columns option will produce unstable (though sometimes faster) results.

Save progress to your learning plan by logging in or creating an account