Introducing SAP HANA Data Provisioning

Objectives

After completing this lesson, you will be able to:

  • Describe SAP HANA Data Provisioning Options
  • Describe SAP HANA Smart Data Access and SAP HANA Smart Data Integration

Introduction to SAP HANA Data Provisioning Options

When you load data to SAP BW/4HANA, the loading targets are SAP BW/4HANA objects such as DataStore Objects or InfoObject characteristics for master data. You do not load data directly to the database tables. Technically, however, the data is stored in the BW application tables of the SAP HANA database. These BW-managed tables and views are created in a special schema that is owned by the SAP HANA database user that creates the connection to SAP BW/4HANA. This schema is called the SAP BW/4HANA (managed) schema.

An SAP HANA database, however, supports multiple schemas. It is technically possible, though not always recommended or even permitted, to run multiple applications on a single SAP HANA database. Each application would have its own schema (or multiple schemas). Non-BW schemas in the SAP HANA database are referred to as "external (managed) schemas". External schemas are interesting to SAP BW/4HANA because it is possible to load data to SAP BW/4HANA from them. Technically, this means moving data from one schema to another schema in SAP HANA.

So how do we achieve this?

In addition to the data acquisition capabilities of SAP BW/4HANA, SAP HANA contributes additional data access possibilities using its own built-in and bolt-on tooling. This tooling can be used to access and extract data from a wide range of data sources across different technology platforms. Although the tooling is technically applicable to SAP sources as well, SAP recommends always using the delivered SAP BW/4HANA loading approaches, especially Operational Data Provisioning (ODP), for SAP sources. For non-SAP sources, using the SAP HANA tooling is recommended. The extracted data is loaded to an external schema, and SAP BW/4HANA can then extract the data or access it virtually.

Let's have a short overview of the approaches for loading data to the external schemas of the SAP HANA database or accessing the data remotely.

In general, data provisioning is the process of loading data from external sources into a target system, in our case SAP BW/4HANA or SAP HANA. It can happen once, on a regular basis, or even in real time. It is not necessarily a physical data transfer because it is also an option to consume data virtually while keeping it in its original source.

There are different methods available to make data available to an SAP HANA database:

  1. Data Extraction follows the classical Extraction-Transformation-Loading (ETL) process, which has been well known for many years in the SAP BW environment. The focus is less on speed, as these ETL processes are usually scheduled as batch jobs that run only daily or weekly. In an SAP BW/4HANA environment, DataSources are used to extract data from an SAP source and prepare it in a format that SAP BW/4HANA can read.

    Another area is ETL tools such as SAP Data Services or equivalent third-party solutions, which are able to connect to a vast number of source systems with structured and unstructured data. This data can be extracted and transformed in various ways to cleanse it, improve its quality, apply business logic, and prepare it for posting into the target system. Normally, the target structure and its content are the result of complex transformations based on more than one source table. That means that the source and target structures are quite different. In SAP BW/4HANA, DataStore Objects (advanced) and InfoObject characteristics can receive data from these types of ETL tools by means of the new write interface.

  2. Data Replication relates to transferring the contents of database tables 1:1 (or with only a limited degree of transformation) from source to target. Whenever a record is changed or posted in the source table, the same change is triggered on the target table based on equivalent SQL statements (Insert, Update, Delete). Transformations between source and target are possible but not in focus, because they come at a cost in processing time, which delays the real-time availability. The target structure and its content are normally the same as the source (or differ in only a few respects). The emphasis is on speed, to make data available quickly in a target system such as SAP HANA. Real-time replication can be set up based on tools such as SAP Landscape Transformation Replication Server (SLT) or SAP HANA Smart Data Integration (SDI).
  3. Data Virtualization differs from the other two methods in that it does not focus on physical data loading or replication. Data is not physically loaded from the source system to the target system but is accessed remotely. Any updates to the data on the source system are reflected immediately on the target system. SAP HANA supports setting up remote connections to external data sources such as non-SAP databases. The concept behind this is sometimes also referred to as Data Federation. From an SAP HANA point of view, the enabling technologies are SAP HANA Smart Data Access (SDA) and SAP HANA Smart Data Integration (SDI). If you do not physically need the data in SAP HANA for performance and governance reasons, this kind of virtual access is worth evaluating before setting up physical replication or ETL processes. Both SAP HANA and SAP BW/4HANA can leverage these technologies.
  4. For test purposes or ad-hoc scenarios, you can also upload flat files manually into SAP HANA or SAP BW/4HANA.
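
The difference between replication and virtualization described above can be sketched in a few lines of Python. This is a simplified in-memory model only, not SAP tooling: the names `Change`, `apply_change`, and `VirtualTable`, as well as the sample rows, are assumptions made for illustration. Replication replays each Insert, Update, or Delete on a second copy of the data, whereas the virtual table holds no data and passes every read through to the source.

```python
from dataclasses import dataclass
from typing import Dict, Optional

Row = Dict[str, object]
Table = Dict[int, Row]  # primary key -> row


@dataclass
class Change:
    """One change record captured on the source table (hypothetical model)."""
    op: str                    # "INSERT", "UPDATE", or "DELETE"
    key: int
    row: Optional[Row] = None  # new row image; not needed for DELETE


def apply_change(table: Table, change: Change) -> None:
    """Apply a single change record to a table, 1:1 and without transformation."""
    if change.op in ("INSERT", "UPDATE"):
        if change.row is None:
            raise ValueError(f"{change.op} requires a row image")
        table[change.key] = dict(change.row)
    elif change.op == "DELETE":
        table.pop(change.key, None)
    else:
        raise ValueError(f"unknown operation: {change.op}")


class VirtualTable:
    """Stores no rows itself; every read is passed through to the source."""

    def __init__(self, source: Table) -> None:
        self._source = source

    def read(self, key: int) -> Optional[Row]:
        return self._source.get(key)


source: Table = {}
replica: Table = {}
virtual = VirtualTable(source)

changes = [
    Change("INSERT", 1, {"material": "M-01", "quantity": 10}),
    Change("UPDATE", 1, {"material": "M-01", "quantity": 12}),
    Change("INSERT", 2, {"material": "M-02", "quantity": 5}),
    Change("DELETE", 2),
]

for change in changes:
    apply_change(source, change)   # the change happens in the source system
    apply_change(replica, change)  # replication replays the same change on the target

# A further source change that has not been forwarded to the replica yet.
apply_change(source, Change("INSERT", 3, {"material": "M-03", "quantity": 7}))

print("replica:", replica)
print("virtual read of key 3:", virtual.read(3))
print("replica read of key 3:", replica.get(3))
```

In this sketch the replica matches the source only after each change has been forwarded, while the virtual table sees the new row immediately because it never held a copy in the first place. The price of virtual access is that every read depends on the source being reachable and fast enough.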

A crucial factor for identifying the most suitable data provisioning interface or method for an organization is a clear view of whether interfaces and technologies are provided by SAP BW/4HANA (that is, ABAP application-managed) or by the underlying SAP HANA platform (SAP HANA-managed). Technically, both approaches have their strengths and weaknesses, and both can be used to bring data into SAP BW/4HANA. However, the non-technical, commercial side has to be evaluated carefully: not all SAP licenses allow customers to use all approaches flexibly, and in some situations customers will be forced to rely only on the SAP BW/4HANA source systems due to commercial agreements with SAP.

Note
The SAP HANA tooling is an ideal replacement for UD Connect and DB Connect, because these are not supported in SAP BW/4HANA.

SAP HANA Smart Data Access and SAP HANA Smart Data Integration

Overview of SAP HANA Smart Data Access and SAP HANA Smart Data Integration

SAP HANA Smart Data Access (SDA) and SAP HANA Smart Data Integration (SDI) are SAP HANA tools that integrate data from external sources into the SAP HANA platform. Both are defined as Remote Sources, which provide access to data that is managed outside of the SAP HANA platform.

SAP HANA Smart Data Access (SDA)

SAP HANA Smart Data Access (SDA) is a component of the SAP HANA platform that is delivered and installed as standard. It enables users to access remote data without having to replicate the data to the SAP HANA database. SAP HANA presents the data as if it were stored in local tables on the database. Automatic data type conversion maps the data types of remote databases connected via SAP HANA Smart Data Access to SAP HANA data types.
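
To make "as if it were stored in local tables" more concrete: in SAP HANA, a remote object is typically exposed through a virtual table that points at a remote source. The Python helper below is a minimal sketch that only assembles such a statement as text. It assumes the `CREATE VIRTUAL TABLE ... AT` form of the SAP HANA SQL statement, and every object name in it (`EXT_SALES`, `VT_ORDERS`, `RS_ORACLE`, `SALES`, `ORDERS`) is hypothetical. The helper functions themselves are not part of any SAP library, and the exact remote path depends on the adapter, so check the SAP HANA SQL reference for your release before using a statement like this.

```python
def quote_ident(name: str) -> str:
    """Wrap an identifier in double quotes, doubling any embedded quote."""
    return '"' + name.replace('"', '""') + '"'


def create_virtual_table_sql(
    local_schema: str,
    virtual_table: str,
    remote_source: str,
    remote_database: str,
    remote_schema: str,
    remote_table: str,
) -> str:
    """Build a CREATE VIRTUAL TABLE statement (assumed syntax, see lead-in)."""
    local_name = ".".join(quote_ident(p) for p in (local_schema, virtual_table))
    remote_path = ".".join(
        quote_ident(p)
        for p in (remote_source, remote_database, remote_schema, remote_table)
    )
    return f"CREATE VIRTUAL TABLE {local_name} AT {remote_path}"


# Hypothetical names: a virtual table in an external schema that points at
# a table in a remote, non-SAP database. Some sources have no database level
# in their path; "<NULL>" is used here as a placeholder for that case.
statement = create_virtual_table_sql(
    local_schema="EXT_SALES",
    virtual_table="VT_ORDERS",
    remote_source="RS_ORACLE",
    remote_database="<NULL>",
    remote_schema="SALES",
    remote_table="ORDERS",
)
print(statement)
```

Once such a virtual table exists in an external schema, it can be queried like a local table, and the data itself stays in the remote system.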

Not only does SDA provide operational and cost benefits, but most importantly it supports the development and deployment of next-generation analytical applications. These applications require the ability to access, synthesize, and integrate data from multiple systems in real time, regardless of where the data is located.

SAP HANA Smart Data Integration (SDI)

SAP HANA Smart Data Integration (SDI) is a component of SAP HANA Enterprise Information Management (EIM). EIM tooling is not installed by default and is an additional option of SAP HANA. SDI represents a set of functions that you can use to retrieve data from external systems, transform it, and either persist it in SAP HANA database tables or access it virtually. SDI provides both data replication and data transformation services, which are available as real-time services for many of the supported sources. This concept is open and extensible, and works on both SAP and non-SAP data.

The following are typical use cases of SDI:

  • Virtualization/Federation (use case as for SAP HANA Smart Data Access)

  • Physical replication in batch mode (similar to scheduled process chains in SAP BW/4HANA)

  • Physical replication in real-time mode (similar to SLT replication to SAP HANA)

  • Physical replication with transformation of the data (based on flowgraphs and smart data quality functions)

SDI accelerates performance through a native SAP HANA implementation and execution. SDI is deployed on a separate server component (Data Provisioning Server, "dpserver") and, unlike SDA, is not completely managed by the SAP HANA index server. The index server, however, is still an essential part of the SDI framework for virtual table access and replication, because it stores the metadata for virtual tables, Remote Sources, and Remote Subscriptions, as well as the adapter and agent metadata. The implementation is driven by adapters, which are managed by the Data Provisioning Agent. The list of existing adapters is already extensive and can be extended with custom adapters based on a generic Adapter SDK.

Note

For the latest list of adapters and additional details, such as release requirements, refer to the SAP Product Availability Matrix (PAM) and search for "SAP HANA SDI 2.0".
