Developing the Virtual Layers of the Data Warehouse

Objective

After completing this lesson, you will be able to create the virtual layers of the data warehouse using calculation views

Calculation Views

Basic Concept

Calculation views are database views that adapt dynamically to calling parameters. They automatically prune unnecessary data and operations from the model and generate highly efficient SQL at run-time. Calculation views automatically push down operations such as filtering and joins, to the database to ensure the very best performance.

A calculation view combines multiple data sources, which could be tables, functions or other calculation views. You can define filters and calculate new values, aggregate measure, create intersections of data and much more. The results are generated at run-time and are not persisted. (although cache can be enabled to store generated results that do not change frequently).

You can add your own custom SQL to any calculation view as they support scalar and table functions.

This makes them ideal database artifacts to develop a virtual data layer in the data warehouse. Calculation views are optimized for SAP HANA in-memory processing.

A graphical editor in the Web IDE is used to create calculation views. Coding is not required. A calculation view consists of a sequence of nodes that perform actions on the data, such as aggregate, join, rank and union.

The output of calculation views is a set of:

  • Attributes: Descriptive data, such as customer ID, city, and country.
  • Measures: Quantifiable data, such as revenue, quantity sold, and counters.

Calculation views can be consumed by any application, either transactional or analytical, SAP and non-SAP.

Calculation views should be stacked in layers to ensure high reusability. The lowest-level calculation views should be defined with limited processing logic to ensure they can be used by multiple higher level calculation views. Higher-level calculation views should enrich the results set.

Data sources can be tables, SQL views, calculation views and even SQL table functions where custom code can be wrritten to provide an input data set. You can create joins, unions, projections, and aggregations on data sources.

Calculation views can include measures and be used for multidimensional reporting, or can contain no measures and be used for list-type reporting.

Calculation views can be used to generate relational and dimensional models, such as star schemas.

Data Categories Available for Calculation Views

It is possible to work with three different data categories to classify calculation views. 

Cube

Calculation views with the data category Cube are visible to the reporting tools and support data analysis with multidimensional reporting.

For graphical calculation views with the data category Cube, the modeler provides aggregation as the default view node. Also, an additional aggregation column behavior is available that you can use to specify the aggregation types for measures.

Cube calculation views can be enabled for star schemas. By setting the star join flag, Dimension calculation views can be joined to the Cube to form a star schema (see Dimension below)

SQL Access Only

Calculation views with the data category SQL Access Only are not visible to the reporting tools and do not support multidimensional reporting.

However, you can use these calculation views as data sources in other calculation views. Think of them as internal calculation views.

For graphical calculation views with the data category SQL Access Only, the modeler provides projection as the default view node.

Dimension

Calculation views with the data category Dimension are not visible to the reporting tools and do not support data analysis. However, you can use these calculation views as data sources in other calculation views. They are especially used in star schemas to provide attributes for various entities such as customer and product.

If the data category is Dimension, you cannot create measures. You can only consume them with SQL. For example, you can use such calculation views to fill simple-list user interfaces, where recurring attribute values are not a problem, but are desired. In this type of view, every output column is an attribute, even numerical data types.

For graphical calculation views with the data category Dimension, the tool provides projection as the default view node.

Nodes Available for Calculation Views

The SAP Web IDE for SAP HANA graphical calculation view editor offers various view nodes to model calculation views and obtain the desired output. We want to spend particular attention to the following nodes:

  • Union

    Use a Union node to combine the result set of two or more data sources. Union nodes have two or more inputs.

  • Join

    Use a Join node to query data from different data sources, based on a specified condition.

  • Graph

    Use a Graph node to execute any of the available graph operations or actions on the graph workspace. A graph node is always the leaf node only.

  • Projection

    Use a Projection node to filter or obtain a subset of required columns of a data source (tables, views, table functions, and so on). Projection nodes have one input.

  • Aggregation

    Use an Aggregation node to summarize data for a group of row values, by calculating values in a column. Aggregation nodes have one input.

  • Rank

    Use a Rank node to partition the data for a set of partition columns, and to perform an order by operation on the partitioned data.

  • Star Join

    Use a Star Join node to join multiple dimensions with a single fact table. In other words, use star joins to join a central entity to multiple entities that are logically related.

For the reference architecture of our data warehouse we use Join, Union, Star Join, Aggregation and Projection nodes.

Calculation View Nodes: Join

A join node provides all the usual SQL join types but some special SAP HANA join types too, such as referential and text joins. You can define temporal joins that consider data-validity in a join condition. This could be especially useful for a data warehousing scenario to handle historization of data where type 2 dimensions are modeled that provide from and to dates used in the validity of attributes within dimensions.

Apart from the join definition itself, it is possible to change the output column mappings, define calculated columns and parameters, change the order of columns, and add filter expressions to eliminate data.

The Mapping feature in the Join node can be used to change the output of the node. This way, we can eliminate attributes which will not be needed anymore in subsequent nodes. For the reference architecture calculation views, for example, we only need the business keys and surrogate keys from the hub tables and the descriptive attributes from the satellite tables so we will only map these columns.

Calculation View Nodes: Star Join

Star Joins are very similar to Join nodes and offer similar configuration options. They are used to generate star schemas that comprise a fact table (which are usually calculation views) joined to one or more dimension calculation views.

Calculation View Nodes: Projection

Projection nodes in the calculation view UI are very similar to Projection nodes in the Flowgraph UI. Using projections we can select only the columns we require from the data source, execute transformations on attributes and rename them if required. Also, similarly to the Join node configuration options, we can set up calculated columns and parameters, change the order of columns, and add filter expressions

Launch the video below to learn how to create a simple calculation view.

Virtual Analytic Layer

In the Virtual Analytical Layer of the reference architecture, we use calculation views to join together fragmented entities of single business concepts (defined in the Business model) which - according to the Data Vault principle - are stored in hub, satellite and link tables.

These calculation views are the analytical base objects that are later used to model the Data Mart layer. We will calculation views to support the master data and calculation views to support transaction data.

In the example above that relates to the master data model, we join together the PRODUCT_H hub table, the PRODUCT_S satellite table, and all link tables that are connected to the product hub table: PRODUCT_PRODUCT_GROUP_L, PRODUCT_VENDOR_L. All joins are performed using the surrogate keys (not the business keys).

As a result, we get a joined table which includes the surrogate keys of the business concept Product and the associated business concepts Product, Group, and Vendor.

We will also create a calculation view for Customer and another for Store, in the same way.

Launch the video below to watch how we create calculation views for the master data in the virtual analytics layer.

The transaction data model comprises three calculation views. One is used to generate the actuals view, one is used to generate the plan view, and one is used to combine the generated actual and the plan views.

Below, we see the calculation view that generates the plan view. It is a simple model and includes a projection node to select the columns we require. In this calculation view we select all columns. We do not generate new columns or apply filters.

The calculation view for actuals is identical, apart from a few different columns being selected.

Below, we see the calculation view that generates the union of the actual and plan views. In this model we include projection nodes to select the columns we require.

Then we see how a union node combines the plan and actual. The union node then feeds the aggregation node which summarizes the measures by any of the attributes.

Launch the video below to watch how we develop three calculation views to generate the transaction data model in the virtual analytical layer.

Data Mart Layer

The data mart layer sits on top of the virtual analytical layer. In the data mart layer we create calculation views that consume the calculation views of the lower layer (VAL) to generate consumable shapes of data, ready for analytical applications.

The data mart is the top layer of the modeling stack in our data warehouse.

Typically in the data mart layer you find sophisticated data models such as star schema. A star schema combines the transactional data with master data. In our case, we will use the calculation view that combines the actual and plan transactional data and join to this the master data that we modeling using the data vault tables. The outcome is a comprehensive model that allows a business user to explore any measure by any combination of attributes from the customer, product and store entities.

For our model we will use the following nodes:

  • projection - select the columns from the transaction data calculation view to generate the data foundation for the star schema (the 'fact table')
  • star join - join the dimension calculation views to the fact table

Launch the video below to learn how to create the calculation view to generate a star schema for the data mart layer.

Log in to track your progress & complete quizzes