The purpose of an Aggregation node is to apply aggregate functions to measures based on one or several attributes. The node generates a result set that is grouped by the selected attribute columns and computes the selected measure with the specified function, such as SUM, MIN, AVG, and so on.
By default, the granularity of the aggregation is defined by the attribute columns that are mapped to the output of the aggregation node. In the example, you will get one aggregate row for each country/order pair, showing two measures: the total amount of each order, and the biggest line item of each order.
Note
If you want to get the total amount and biggest order amount at the country level, removing the Order column from the output is not enough because you will still get the biggest line item detail in each country. In that case, you should first aggregate both measures with SUM in one aggregation node (without the Order column), and add another aggregation node on top defined like in the example (but without the Order column).If you exclude an attribute from an upper node (in the same calculation view or another one that consumes it), or in a query executed on top of this view, by default these attributes are ignored. In addition, the calculation view aggregates the result set on the remaining attributes.
A graphical calculation view in SAP HANA Cloud supports the following aggregate functions:
Sum (the default function)
Min
Max
Count
Average
Variance
Standard deviation
Median
Caution
These aggregate functions should be used very carefully in stacked scenarios to avoid incorrect results.
In an Aggregation node, a calculated column is always computed AFTER the aggregate functions. If you need to calculate a column BEFORE aggregating the data from this column, you have to define the calculated column in another node, for example a Projection node, executed BEFORE the aggregation node in the calculation tree.
Defining an Aggregation in the Client Query
When you execute an SQL query on top of a CUBE Calculation View (with or without Star Join), an attribute that is not selected is ignored, as discussed earlier. But the aggregation behavior depends on whether your SQL query includes a GROUP BY clause or not.
- If your SQL query does not include a GROUP BY clause, each requested measure is aggregated as specified in the Semantics of the Calculation View.
- If your SQL query includes a GROUP BY clause, the aggregate function you specify for each requested measure overwrites the one defined in the Semantics of the Calculation View. This provides a lot of flexibility, but can be error-prone in some scenarios. It is recommended to use this with care.
Note
The default data preview SQL statement for a CUBE calculation view (with or without star join) in SAP Business Application Studio and Database Explorer uses the second approach, thus showing you which aggregate function is applied to each measure as per the Semantics. Be careful with the data consistency if you change any aggregate function.Defining Aggregation Attributes
From QRC 1/2023 onwards, you can provide in the Semantics node of a calculation view of type CUBE (with or without star join) information about which aggregation level is relevant and which is not.
In this example, the GROSS_AMOUNT should not be aggregated without keeping the column CURRENCY_CODE in the result set, at least if the column is not converted. Indeed, this might result in adding amounts expressed in more than one currency, for example USD and EUR, making the output of the calculation view inconsistent.
For each measure of the calculation view, you can define one or several attributes that should be kept when aggregating data.
The information entered in the Semantics is meant to be used by reporting tools through the SAP HANA Analytics Catalog, which is made up of BIMC Views (schema _SYS_BI). The main table holding the aggregation attribute(s) information is BIMC_EXCEPTION_AGGREGATION_ATTRIBUTES.
Caution
The setting defined in the Semantics has no influence on the way aggregations are executed by the calculation engine. Besides, it does not prevent the execution of a query that might return inconsistent data because an aggregation attribute needed for a given column has not been included.