Managing the Data Lifecycle

Objectives

After completing this lesson, you will be able to:

  • Relocate data using Data Tiering Optimization (DTO)

Overview of SAP BW/4HANA Data Lifecycle Management

Requirements and Challenges due to Data Growth

A key requirement of SAP BW/4HANA is to store data - a lot of data!

Data is collected from all source systems in the organization. Many of those systems generate huge amounts of data on a daily basis. SAP BW/4HANA can even collect data in real-time using streaming mechanisms, and so the flow of data never stops.

From a technical perspective, SAP HANA, the database that powers SAP BW/4HANA, can cope with huge data growth. We could simply scale up the hardware by adding more memory. Once a server reaches its memory capacity, we could then add more servers. This is called scale-out. But this is a very expensive solution to the data growth problem.

Apart from these increasing costs of storing more and more data in SAP BW/4HANA memory, we are in danger of clogging up the system with data that is rarely used. So let's take a step back and think about the best solution.

Sophisticated concepts of data management are required to optimize financial investments and usage of SAP BW/4HANA hardware resources. SAP provides concepts and solutions to assign SAP BW/4HANA data to various storage areas and storage media.

Without sophisticated data management in SAP BW/4HANA:

  • All data must be held in RAM in order to be processed

  • Large systems require a large HANA sizing based on a scale up/scale out architecture

For data held in RAM, hardware and license costs have to be considered, especially for data that does not need to be available in memory all the time because it is rarely used and its performance is not mission-critical.

Multi-Temperature Data Management Concept

Data has a lifecycle. When data is new, it is usually very important to the business and used in decision-making. But as data ages, its value usually decreases as it is used less and less. We should move the less valuable data from the highest-performance, highest-cost storage (memory) to lower-performing, cheaper storage options such as disk.

What is needed are tools to manage the data lifecycle so that we can move data from memory to disk when the time is right. Such tools are included in SAP BW/4HANA.

With the concept of Multi-Temperature Data Management, it is possible to decouple hardware growth from data growth. SAP has defined this data storage concept, which classifies data as HOT, WARM, or COLD (and sometimes even FROZEN), depending on frequency of access and certain other criteria. These criteria include the type of data involved, how useful it is for business purposes, the importance of the business processes involved, and performance and security requirements. More facts that influence the setup of the SAP Multi-Temperature strategy for storing data include:

  • Budget restrictions

  • Technical restrictions in capacity on your SAP HANA database

  • Storage of historical data (due to data growth)

  • Storage of bulk data (settlement data, automatic logs, clickstream data)

  • Guidelines for saving company data, such as the need to save all data for at least some years for legal reasons

A large portion of data managed in a large enterprise data warehouse, such as SAP BW/4HANA, is processed frequently and needs fast access. This kind of data can be considered as HOT.

In addition, there is often also a large volume of data that is not accessed frequently and perhaps might not require fast access. For this reason, this type of data can be defined as WARM. Data that is no longer needed but must still be kept, perhaps for legal reasons, can be classified as COLD.

SAP BW/4HANA data is classified by access frequency and use case into HOT, WARM, and COLD.

In the context of the SAP BW/4HANA reference architecture for data warehousing (Layered Scalable Architecture for BW/4HANA), the various areas in a data warehouse and the different architectural layers of the EDW architecture can be assigned to these multi-temperature data categories.

Let's learn more about these multi-temperature data categories:

Hot
  • All layers related to mission-critical, day-to-day business analysis and planning
  • Hot data is accessed frequently for reporting and planning or by regular SAP BW/4HANA processes.
  • Examples include:
    • Data Mart DataStoreObjects (advanced).
    • Standard DataStoreObjects (advanced).
  • There are no functional restrictions: read and write actions are allowed on this data.
Warm
  • All layers related to data acquisition
  • Warm data is accessed less frequently or rarely. It is not critical in terms of performance and does not have to be permanently stored in main memory.
  • Examples are:
    • Objects in the Corporate Memory, typically Staging DataStore Objects (advanced).
    • Objects in the Open Operational DataStore layer, typically Staging DataStore Objects (advanced).
  • There are no functional restrictions: read and write actions are allowed on this data.
Cold
  • All layers related to the retention of historical data
  • Cold data is accessed rarely or sporadically.
  • The data does not have to be held in the SAP HANA database.
  • Instead, it is stored and managed outside of SAP HANA in an external source.
  • There are functional restrictions: this data is mainly read-only. By default, queries do not access this data. It becomes available for reporting only when the corresponding setting in the query is enabled. Writing to the data is possible in exceptional cases only, such as corrections. If reading this type of data is required, expectations regarding performance must be set accordingly.

SAP BW/4HANA Data Tiering Optimization (DTO)

Introduction

To implement multi-temperature data management in SAP BW/4HANA, SAP provides the concept of Data Tiering Optimization (DTO).

Data Tiering Optimization is characterized as follows:

  • One concept for HOT, WARM, and COLD data
    • Data tiering based on complete DataStore Objects (advanced) or individual partitions
    • Temperature definition of each partition as local setting (no transport)
    • Using HANA technologies such as Smart Data Access (SDA) and Scale-out architecture
  • Easy and central definition and implementation
    • Partitions and their data temperature need to be defined in DataStore Object (advanced) initially
    • Data temperature changes are managed via App in SAP BW/4HANA Cockpit
    • No additional configuration of individual Nearline Storage (NLS) Data Archiving processes as in the past
  • Displacement of data as a simple periodic housekeeping activity
    • Single DTO job that periodically moves data to defined storages
    • No complex process chain modeling as for NLS in the past
  • Non-disruptive approach and protection of past investments
    • Co-existence with existing SAP BW NLS approach

Data Tiering Optimization (DTO) for SAP BW/4HANA helps you to classify the data in your DataStore object (advanced) as HOT, WARM, or COLD. Depending on this classification and how the data is used, the data is stored in different storage areas. In SAP BW/4HANA, you have the following options:

  1. Standard tier (HOT): The data is stored in the main memory of SAP HANA.
  2. Extension tier (WARM): The data is stored either in SAP HANA Native Storage Extension (NSE), an enhanced universal storage management solution in SAP HANA, or on SAP HANA extension nodes, which are over-provisioned worker nodes with relaxed memory/CPU sizing in a scale-out landscape.
  3. External tier (COLD): SAP IQ External Storage, which means that data is managed outside SAP HANA in an SAP IQ database.
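
The mapping between tier names, temperatures, and storage locations listed above can be written down as a small lookup. The following Python sketch is purely illustrative: it is not SAP code or an SAP API, and the names `Tier`, `STORAGE`, and `storage_for` are hypothetical.

```python
from enum import Enum


class Tier(Enum):
    """DTO tiers as named in this lesson; the value is the temperature."""
    STANDARD = "HOT"
    EXTENSION = "WARM"
    EXTERNAL = "COLD"


# Storage location per tier, paraphrased from the list above (illustrative only).
STORAGE = {
    Tier.STANDARD: "SAP HANA main memory",
    Tier.EXTENSION: "SAP HANA NSE or SAP HANA extension node",
    Tier.EXTERNAL: "SAP IQ external storage",
}


def storage_for(temperature: str) -> str:
    """Return the storage location for a temperature such as 'WARM'."""
    return STORAGE[Tier(temperature.upper())]


print(storage_for("warm"))
```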

Setting Up Data Tiering Optimization in SAP BW Modeling Tools

In SAP BW/4HANA, the DataStore Object (advanced) is the only data model that physically manages transactional data. Consequently, it is in DataStore Objects (advanced) where you set up all relevant DTO parameters. The customizing process has two main tasks:

  1. Define DTO tiers and partitions in SAP BW Modeling Tools for each DataStore Object (advanced) (settings relevant for transport connection).
  2. Define the temperature of each partition in the DataStore Objects (advanced) in the SAP BW/4HANA Cockpit (to be manually defined in each system, no transport connection).

In the ADSO editing within the BW Modeling Tools in Eclipse, in the General tab, specify the data tiering properties as follows:

  • Tiers (temperatures) in scope for this object

  • Maintenance of the temperatures on Object level or Partition level

If you have chosen Data Maintenance on Partition Level, the ADSO requires the definition of partitions. In the ADSO editor, go to the Settings tab to set up an appropriate number of partitions to manage the data in different tiers. You can define the partitions explicitly as static values or use dynamic partitioning, which creates a new partition for each value of the partitioning field. Note that partition fields must be defined as key fields in your ADSO definition. Recommended partition fields are SAP standard time characteristics (for example, 0CALMONTH, 0CALYEAR, 0FISCPER, 0FISCYEAR) or objects representing organizational units. If partition definitions are changed later (especially when the ADSO already contains data), the system checks whether remodeling is necessary when you activate the object. If remodeling is necessary, a remodeling job is created automatically, which must then be triggered manually.
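
To make the idea of dynamic partitioning more concrete, here is a minimal Python sketch of "one partition per distinct value of the partitioning field". It only models the concept with plain dictionaries. It does not reflect how SAP HANA or SAP BW/4HANA implement partitions internally, and the function name `dynamic_partitions` and the sample records are assumptions made for illustration.

```python
from collections import defaultdict


def dynamic_partitions(records, partition_field):
    """Group records into one partition per distinct partition-field value."""
    partitions = defaultdict(list)
    for record in records:
        partitions[record[partition_field]].append(record)
    return dict(partitions)


# Hypothetical sample data, partitioned by the time characteristic 0CALMONTH.
rows = [
    {"0CALMONTH": "202401", "amount": 100},
    {"0CALMONTH": "202401", "amount": 50},
    {"0CALMONTH": "202402", "amount": 75},
]
parts = dynamic_partitions(rows, "0CALMONTH")
print(sorted(parts))
```

Each new value of the partitioning field yields a new partition, which is why a time characteristic is a natural choice: older partitions can then be given a different temperature than recent ones.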

DTO Management in the SAP BW/4HANA Cockpit

Note

For planning the temperatures, you can also use transaction code RSOADSODTO in SAP GUI as an alternative to the SAP BW/4HANA Cockpit.

In addition to the manual definition of the temperature of a partition or object, you can also define temperature change rules for selected objects. Rules can be defined for ADSOs with tiering at partition level and with an SAP time characteristic or a field of data type DATS as the partition field. In the rule, starting from the current date, you can specify a relative time-based condition for the various possible temperatures. For example, you can specify the warm temperature for data that is over a year old and the cold temperature for data that is over five years old.
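
The logic of such a relative, time-based rule can be sketched in a few lines of Python. This is an illustration of the example rule (warm after one year, cold after five years) and not the rule engine of SAP BW/4HANA. The function names, the month-based age calculation, and the YYYYMM partition values are assumptions.

```python
from datetime import date


def months_between(partition_month: str, today: date) -> int:
    """Whole months from a YYYYMM partition value (0CALMONTH style) to today."""
    year, month = int(partition_month[:4]), int(partition_month[4:6])
    return (today.year - year) * 12 + (today.month - month)


def planned_temperature(partition_month: str, today: date,
                        warm_after_months: int = 12,
                        cold_after_months: int = 60) -> str:
    """Apply the example rule: over 1 year old is WARM, over 5 years old is COLD."""
    age = months_between(partition_month, today)
    if age > cold_after_months:
        return "COLD"
    if age > warm_after_months:
        return "WARM"
    return "HOT"


today = date(2024, 6, 15)
for month in ("202405", "202301", "201801"):
    print(month, planned_temperature(month, today))
```

Because the rule is relative to the current date, the same partition drifts from HOT to WARM to COLD over time without anyone editing the rule.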

In the demonstrations at the end of the lesson, you will find a DTO example similar to the one displayed in the following figure:

Note

Alternative options to execute the data displacement:

  • Use transaction RSOADSODTOEXE
  • Set up a process chain based on the new process variant type Adjust Data Tiering

There is a helpful app to monitor the space allocated for the three DTO tiers. In the SAP BW/4HANA Cockpit, go to the Monitoring group and open the Data Volume Statistics tile. It provides a Volume Trend of the HOT, WARM, and COLD data in your SAP BW/4HANA system. You can also drill down and display the data as a table with a snapshot of the volumes per calendar day. If the app does not show any data initially, set up its data collection using the Configure option in the header area.

The DTO WARM Tier

1. SAP HANA Native Storage Extension (NSE)

SAP HANA NSE is a disk-based extension to the in-memory column store. It is a universal storage management solution integrated with SAP HANA for warm data in SAP BW/4HANA, SAP S/4HANA, and SAP Business Suite powered by SAP HANA. SAP HANA NSE is used automatically as the data tier for warm data if no SAP HANA extension nodes are configured.

2. SAP HANA Extension Nodes

The warm tier managed by SAP HANA extension nodes requires an SAP BW/4HANA landscape architecture based on a scale-out with a so-called coordinator node and several worker nodes. Setting up one or more worker nodes as extension nodes means running an asymmetric SAP HANA scale-out landscape. This includes one group of worker nodes with standard data volume/memory sizing for your hot data, and another group of extension nodes with relaxed sizing that can store more data.

The DTO COLD Tier based on SAP IQ External Storage

DTO cold data is first moved to an external storage and then deleted from the SAP BW/4HANA system (that is, from the SAP HANA DB). The standard BW process of Selective Deletion is used for this activity. It is still possible to access the data directly (for example, with BW queries) or to load it back to the WARM or HOT tiers if required.
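
The order of operations matters here: the data is written to the external storage first and removed from SAP HANA only afterwards. The following Python sketch models that sequence with plain dictionaries. It is not the DTO job or an SAP API. The function name `displace_to_cold`, the dictionaries standing in for the two stores, and the explicit verification step are all assumptions made for illustration.

```python
def displace_to_cold(hana_partitions: dict, cold_store: dict, partition_id: str) -> None:
    """Move one partition to the cold store: copy, verify, then delete."""
    rows = hana_partitions[partition_id]
    # 1. Write a copy to the external storage (stand-in for SAP IQ).
    cold_store[partition_id] = list(rows)
    # 2. Verify the copy before touching the source (assumed safety check).
    if cold_store[partition_id] != rows:
        raise RuntimeError("copy mismatch, nothing deleted")
    # 3. Only now remove the rows from the source (stand-in for Selective Deletion).
    del hana_partitions[partition_id]


hana = {"2018": [{"doc": 1}, {"doc": 2}], "2024": [{"doc": 3}]}
cold = {}
displace_to_cold(hana, cold, "2018")
print(sorted(hana), sorted(cold))
```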

In SAP BW/4HANA, the COLD tier is available for all data managed in DataStore Objects (advanced). However, it is a requirement to define partitions inside the ADSO. The following restrictions apply:

  • In general, the DataStore Object (advanced) must have at least one key field and partitions need to be defined on one of the key fields.

  • DataStore Objects (advanced) of type Direct Update require SAP BW/4HANA 2.0 SP07 or higher to manage their data in the COLD tier.
  • The COLD tier is not available for DataStore Objects (advanced) with the following modeling properties:

    • Staging DataStore object — Inbound Queue only

    • Staging DataStore object — Reporting enabled
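
The restrictions above amount to a short checklist. As a reading aid, the Python sketch below encodes them as a validation function. It is not an SAP check or API: the `AdsoModel` class, the strings used for the modeling properties, and the `(major, minor, support package)` version tuple are hypothetical representations chosen for this example.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

# Modeling properties for which the COLD tier is not available (labels are illustrative).
EXCLUDED = {"Staging: Inbound Queue only", "Staging: Reporting enabled"}


@dataclass
class AdsoModel:
    key_fields: List[str]
    partition_field: Optional[str]
    modeling_property: str
    version: Tuple[int, int, int] = (2, 0, 7)  # (major, minor, support package)


def cold_tier_problems(adso: AdsoModel) -> List[str]:
    """Return the reasons why the COLD tier cannot be used (empty list = OK)."""
    problems = []
    if not adso.key_fields:
        problems.append("no key field defined")
    if adso.partition_field is None or adso.partition_field not in adso.key_fields:
        problems.append("partitions are not defined on a key field")
    if adso.modeling_property in EXCLUDED:
        problems.append("modeling property does not support the COLD tier")
    if adso.modeling_property == "Direct Update" and adso.version < (2, 0, 7):
        problems.append("Direct Update requires SAP BW/4HANA 2.0 SP07 or higher")
    return problems


ok = AdsoModel(["0CALMONTH", "DOC_ID"], "0CALMONTH", "Standard")
old_direct = AdsoModel(["0CALMONTH"], "0CALMONTH", "Direct Update", (2, 0, 4))
print(cold_tier_problems(ok), cold_tier_problems(old_direct))
```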

SAP BW/4HANA comes with the adapter for the SAP IQ database as an external storage medium. Data on SAP IQ is stored in a column store in compressed form, similar to SAP HANA. The big difference is that the data is not managed in memory but is processed on disk.

Some key features of SAP IQ as external storage for DTO are:

  • Optimized load performance based on the SAP IQ loader functionality and SAP HANA Smart Data Access.

  • SAP HANA and SAP IQ share the same columnar paradigm.

  • Data compression is very efficient at approximately 90%.

  • Straggler handling: The SAP BW/4HANA cold store on SAP IQ can process exceptional updates and deletes on the cold data.

Exceptional Updates to COLD data ("Straggler Handling")

SAP provides an enhanced activation process which detects and handles exceptional changes of data on DTO COLD store locations. You need a DataStore Object (advanced) of type Standard with Change Log and the corresponding modeling property Exception Updates to Cold Store to enable this update handling. The DataStore Object (advanced) manages an additional secondary inbound table which stores changes to COLD partitions.

Note

SAP recommends using SAP IQ for storing cold data in the context of DTO. Hadoop can still be used (with the restrictions mentioned in SAP Note 2363218: Hadoop NLS - Information, Recommendations and Limitations), but it is not the recommended solution going forward.

Managing Reporting Access to COLD Data

There are settings available to control the reporting access to COLD data:

  • General setting on CompositeProvider level: Common Runtime Properties → Cold Store Access: Switch on/off

  • Individual setting on BW query level: Extended Properties → Cold Store (Near-Line) Access: Switch on/off or read setting of CompositeProvider
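
How the two settings combine can be sketched as a small function. This is an interpretation of the options listed above and not SAP code: it assumes that an explicit on/off value at query level takes precedence, and that the third option simply falls back to the CompositeProvider setting. The function name and the use of `None` for "read setting of CompositeProvider" are hypothetical.

```python
from typing import Optional


def cold_store_access(composite_provider_on: bool,
                      query_setting: Optional[bool]) -> bool:
    """Effective cold store access for a query.

    query_setting: True/False for an explicit on/off at query level,
    None to read the setting of the CompositeProvider.
    """
    if query_setting is None:
        return composite_provider_on
    return query_setting


# Query inherits the CompositeProvider setting (on).
print(cold_store_access(composite_provider_on=True, query_setting=None))
# Query switches cold store access off explicitly.
print(cold_store_access(composite_provider_on=True, query_setting=False))
```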

Note

Although SAP clearly recommends using Data Tiering Optimization (DTO), earlier data tiering concepts of SAP BW still exist for SAP BW/4HANA:

  • SAP BW Near-Line Storage

  • Active/Non-Active Data

Prepare DataStore Object (advanced) for Data Tiering Optimization

Watch the following demo to learn how to prepare a DataStore Object (advanced) for Data Tiering Optimization.

Perform Data Tiering Optimization

Watch the following demo to learn how to change temperatures of partitions of DataStore Objects (advanced) and how to make COLD data available for reporting.
