Developing Data Models with SAP HANA Cloud

Objectives
After completing this lesson, you will be able to:

After completing this lesson, you will be able to:

  • Develop a data model using SAP Business Application Studio

Modeling in SAP HANA Cloud

Old Approach - Data Processing in the Application

In a legacy application, the role of the database is to simply store and provide raw data to the application.

The raw data is sent from the database directly to the application. The application then begins to process the data by combining it, aggregating it, and performing calculations in order to generate meaningful output.

We can find ourselves moving a lot of raw data between the database and the application. Unfortunately, when we move raw data to the application layer, we make the application code very complex. This is because the code has to deal with the data processing tasks as well as managing all of the other parts of the application including process flow control, business logic, user Interface (UI) operations, integrating data from multiple sources, and so on.

New Approach - Data Processing in the Database

With SAP HANA Cloud, we can build a sophisticated data processing layer on top of the database tables, right inside the database. This means we can first process the raw data and turn it into something meaningful in the database before passing the results to the application for consumption.

With SAP HANA Cloud, the main type of modeling object is called a calculation view. A calculation view can handle a wide variety of data processing tasks. For example, calculation views can combine data from multiple tables or other calculation views, apply filters, calculate new data and aggregate the data at any level. A calculation view does not store data, it simply calculates results on-the-fly in memory. The calculation views are developed in SAP HANA Cloud using easy-to-use graphical modeling tools, and are stored in SAP HANA Cloud database.

So, instead of the application including complex code to process the data, the application now calls the required calculation views and the processing is pushed down to SAP HANA Cloud database and the results are passed to the application for further processing. This is efficient in the following ways:

  • The application code is simplified, as it does not have to deal with many data processing tasks. These tasks are pushed down to SAP HANA Cloud database where in-memory processing takes place.

  • The processing on the data is carried out where the data resides, so we do not have to move raw data from the database to the application. We only move the results of the data processing to the application.

  • The calculation views can be reused in multiple applications so we avoid redundancy.

Reuse of Models

In traditional applications, there is a high degree of redundancy in the application code. Developers find themselves continually creating the same code to process data.

When dealing with highly normalized database models, such as those used with SAP Business Suite, there can be many individual tables that need to be called and combined using joins. These joins can usually be pushed down in most databases. However, SAP HANA Cloud database goes beyond helping with just the table joins. SAP HANA Cloud database can take over all data processing tasks from the application. SAP HANA Cloud database takes care of complex calculations and data flow logic, including executing aggregations and disaggregation.

SAP HANA Cloud calculation views can contain dynamic placeholders. This means a business user can provide input values to the calculation view at run-time using a pop-up prompt. Typical uses cases for this are to provide filter values to reduce the data set.

Calculation views can consume the results of other calculation views. This encourages a high degree of modularization and reuse of calculation views. A calculation view is usually part of a stack of calculation views that make up a complete data model. It is good practice to design a calculation view considering future uses of that view so that it can easily be reused. This avoids redundancy of models and encourages easier maintenance.

Advanced Data Processing Capabilities

Watch this video to learn about the advanced data processing capabilities.

Types of Calculation Views

Types of Calculation Views

The calculation view is the main modeling object in SAP HANA Cloud.

There are three types of calculation view:

  • Dimension

  • Cube

  • Cube with star join

Calculation views are created using a graphical editor with the tool SAP Business Application Studio. To develop a calculation view you should be familiar with basic data modeling concepts.

Modeling Dimensions

The purpose of a dimension type of calculation view is to define a group of related attributes. For example, a material dimension might include attributes such as material description, material color, material weight, and material price. Although this type of calculation view can be directly consumed by an application, it is most likely to be found as a consumable view in a calculation view of the type Cube with star schema to provide the cube's dimensions (see later for details).

It might be helpful to think of calculation views of type dimension as central master data views. You define them once and reuse them many times.

To get started with calculation views of type DIMENSION, you need to set the data category to DIMENSION.

You then proceed to define the source tables and joins if they are needed. You define filters, and then identify the output columns that are to be exposed by the view. It is also possible to define additional derived attributes. An example of this could be a new column to generate a weight category based on a range of weights. You could use an IF expression to check each weight and assign a category.

In the calculation view, you are able to rename any source columns so they are meaningful to the business user. Remember that the column names originate from the database tables, and these names can often be technical and meaningless to business users.

A dimension calculation view can consume other dimension calculation views. For example you could create a new dimension called partners that joins customer dimension and customer dimension.

Modeling Cubes

The next type of calculation view is the type Cube. This type of calculation view is used to define a data set comprised of attributes and measures. This means they are ideal for transactional data modeling. For each measure you are able to define its aggregation behavior so that business users can slice and dice in an ad-hoc manner and all measures are aggregated according to their behavior. A Cube calculation view can consume other Cube calculation views or dimension calculation views. A cube calculation view is a relational model not a dimensional model.

Applications can directly access this type of calculation view. They can also be accessed via SQL statements.

To get started, set the data category to CUBE.

Do not select the Star Join flag.

You then select the table, or tables, which are to be included in the model. Typically, you choose a transaction table so that you have columns from which you can define attributes and measures. It is possible to include more than one transaction table. For example, you may need to include a header and a line item table to form the complete picture of a sales transaction. In this case, you simply join the tables. You can also merge transaction tables by implementing a union.

Then, select the columns from the tables that are to be exposed. You can optionally set filters and define additional calculated columns.

Finally, rename any columns to provide meaningful names to the business user.

Modeling Cubes with Star Joins

Now we come to the most powerful of all the calculation views: the Cube type, but with an additional setting — star join. This is how we model a star schema. This model is based on the cube type of calculation view, but with one or more dimension calculation views joined to the cube. This provides a very powerful and flexible dimensional data model that enables drill-down, slice and dice analysis over any combination of attributes and measures. Adding multiple dimensions dramatically enhances the analysis possibilities compared to cube type calculation views.

To get started, make sure that you set the data category to CUBE and select the Star Join flag.

First, select the transaction table(s). Use joins to combine tables if needed. Then, choose the columns to expose, set any filters, and create any calculated columns. What you are doing up to this point is forming a central fact table that is used as the hub of the star schema.

The next step is to form the star schema by joining suitable calculation views of type DIMENSION to the fact table.

Finally, adjust the labels of any columns to make them more meaningful by using the rename function in the Semantic node.

Cube with star join type calculation views can consume other Cube with star join calculation views and also simple Cube calculation views.

Modeling with Business Application Studio

Calculation views are created using the Business Application Studio.

The Business Application Studio is a multi-purpose integrated development environment (IDE) tool that is used to create all artifacts used in a full-stack application. This includes the UI, application code and database objects.

Web IDE for SAP HANA was introduced in 2016 for creating development artifacts for SAP HANA on-premise. This interface can still be used with SAP HANA Cloud but it is recommended to use Business Application Studio as this is not only more feature rich, all new modeling features will only appear in the Business Application Studio and not in the Web IDE.

The Business Application Studio also includes a terminal console so commands can be entered directly using a command line interface (CLI) instead of using the GUI.

With Business Application Studio you create all data modeling artifacts including:

  • Calculation Views

  • Procedures

  • Table and Scalar Functions

  • Flowgraphs

  • Analytic Privileges

  • Local and Virtual Tables

  • Replication Tasks

The artifacts are created using either graphical editors or text editors. You are able to swap between the types of editors. The artifacts are stored in simple text files which makes them very portable. Apart from SQL, the format of the files is usually JSON or XML.

The syntax for some artifacts is complex, such as for calculation views, so the graphical editor should be used. But for searching through these types of files or copy / pasting snippets of code, the text editor is also useful.

Watch this video to identify the key areas of the Business Application Studio.

For version control of the database artifact files, the open-source Git framework is used. Git is fully integrated in Business Application Studio. This means as you develop artifacts you can commit them to the team repository. You can also go back to older version of any artifact. Git allows developers to clone a project and work in parallel with other developers on sections of the project without affecting others. When development is complete, the artifacts from different branches of the project can be merged together using Git.

Extending Calculation Views with SQLScript

Calculation views are built using a graphical approach and no coding is required. However, sometimes the graphical approach does not provide all the functions and data flow logic you require for a complex calculation view. This is when you use functions. Functions are built using SQLScript which is the native database language of SAP HANA Cloud.

In SAP HANA Cloud modeling, table functions are typically used to generate a tabular data set that is used as a data source in a calculation view. Table functions can be used in SQLScript in a from clause of a select statement (in other words wherever a standard table identifier is used). A table function encapsulates the logic in a reusable form so that it can be used many times in different artifacts.

Table functions can accept one or more input parameters. Table functions are read-only; that is, they cannot be used to change data. Table functions produce exactly only one tabular output. Table functions can also call other functions.

Add SQL to Models Using Procedures

Procedures define reusable data processing logic that can be used to enhance a calculation view. Procedures are very similar to functions in that they are written in SQLScript and can have one or more inputs and they always have outputs. However, procedures can produce multiple output data sets of different structures, whereas a table function can only return one tabular output structure. Procedures have a limited role in data modeling. Procedures cannot be used as data sources to calculation views but they are used in other places in a calculation view, for example, to automatically derive values for an input parameter.

A procedure can be called directly from SQLScript, which means it can be called from a function or even another procedure.

Procedures used within modeling must be set to read-only. In that case they are called stateless (or side-effect free), because they don't alter any data or meta data in the database. However, procedures can also be used to update, insert, and delete data if required. These procedures are called stateful, and these type are not allowed when called from calculation views. Stateful procedures are more likely to be used by developers who build applications that change data, rather than modelers.

Securing Data Models

There are three levels of access to consider. Let's take a closer look at them.

Save progress to your learning plan by logging in or creating an account

Login or Register