Describing the Concept of Delta Merge

Objectives

After completing this lesson, you will be able to:
  • Describe the delta merge process
  • Perform a delta merge operation
  • Delta merge types and performance aspects

Memory Management in the Column Store

The column store is the part of the SAP HANA database that manages data organized in columns in-memory. Tables created as column tables are stored here.

Delta Merge Operation

The purpose of the delta merge operation is to move changes collected in the delta storage to the read-optimized main storage. After the delta merge operation, the content of the main storage is persisted to disk. Its compression is recalculated and optimized, if necessary. The delta merge operation can be triggered by several events and is executed asynchronously.

A further result of the delta merge operation is the truncation of the delta log. The delta storage structure itself exists only in-memory and is not persisted to disk. The column store creates its logical redo log entries for all operations executed on the delta storage. This log is called the delta log. If a system restart occurs, the delta log entries are replayed to rebuild the in-memory delta storages. After the changes in the delta storage are merged into the main storage, the delta log file is truncated by removing those entries that were written before the merge operation.

Note

Data that is in the delta storage can only be fully loaded or unloaded. Partial loading is not possible. Therefore, if a delta merge has not been performed and the table's entire data is in the delta storage, the table is either fully loaded or unloaded.

Delta Merge Process

The following steps are performed in the merge process:

  1. Before the merge operation, all write operations go to Delta 1 storage, and all read operations are done from Main 1 and Delta 1 storages.

  2. While the merge operation is running, the following events occur:

    1. All write operations go to the second delta storage, Delta 2.

    2. Read operations read from the original main storage, Main 1, and from both delta storages, Delta 1 and Delta 2.

    3. Uncommitted changes in Delta 1 are copied to Delta 2.

    4. The content of Main 1 and the committed entries in Delta 1 are merged into the new main storage, Main 2.

  3. After the merge operation is completed, the following events occur:

    1. Main 1 and Delta 1 storages are deleted.

    2. The compression of the new main storage (Main 2) is re-evaluated and optimized. If necessary, this operation reorders rows and adjusts compression parameters. If compression has changed, columns are immediately reloaded into memory.

    3. The content of the complete main storage is persisted to disk.

Note

With this double buffer concept, the table only needs to be locked for a short time: at the beginning of the process when open transactions are moved to Delta 2, and at the end of the process when the storages are "switched".

Caution

The minimum memory requirement for the delta merge operation includes the current size of the main storage, the future size of main storage, the current size of delta storage, plus some additional memory. Even if a column store table is unloaded or partly loaded, the whole table is loaded into memory to perform the delta merge.

Perform a Table Delta Merge

By default, SAP HANA controls the delta merge process automatically. However, it may be necessary or useful to trigger a merge operation manually in some situations. For example, this occurs in the following cases:

  • An alert has been issued because a table exceeds the threshold for the maximum size of delta storage.

  • You need to free up memory.

A table delta merge can be triggered manually by executing the operation in the SAP HANA cockpit – Current Table Distribution application.

  1. On the Database Overview screen, with the Administration or All view selected, choose the View Current Table Distribution link on the Table Distribution card.
  2. Optionally, use filtering or search options to display your desired tables.
  3. Left-click a table.
  4. If a delta merge is possible, Delta Merge appears on the context menu. Select it.
  5. Choose Yes to confirm.

To check the runtime information of column tables, either use the SAP HANA cockpit app Current Table Distribution and choose the function Show Runtime Data when clicking left on a table, or use the Database Explorer, open a table in the catalog area, and select the corresponding tab to retrieve details about the sizes.

Even though the delta merge operation moves data from the delta storage to the main storage, the size of the delta storage is not zero. This is because records written by open transactions are moved to the new delta storage while the delta merge operation takes place. Furthermore, even if the data containers of the delta storage are empty, they still need some in-memory space.

Delta Merge Types and Performance Aspects

Reason for Expense of Delta Merge Operation

The delta merge operation can be expensive for the following reasons:

  • The complete main storage of all columns of the table is rewritten in-memory. This uses some central processing unit resources, and temporarily duplicates the memory needed for the main storages (while Main 1 and Main 2 exist in parallel).

  • The complete main storages are persisted to disk, even if only a relatively small number of records were changed. This creates disk input/output load.

This potentially negative impact on performance can be lessened by the following strategies:

  • Executing memory-only merges

    A memory-only merge affects only the in-memory structures and does not persist any data.

  • Splitting tables

    The performance of the delta merge depends on the size of the main storage. This size can be reduced by splitting the table into multiple partitions, each with its own main and delta storages. The delta merge operation is performed at partition level and only for partitions that actually require it. This means that less data is merged and persisted. Note that there are disadvantages to partitioning tables that should also be considered.

Merge Motivations

The request to merge the delta storage of a table into its main storage can be triggered in several ways. These are called merge motivations.

The figure, Merge Motivations, illustrates the different merge motivations and how they are triggered.

Auto Merge

The standard method for initiating a merge in SAP HANA is the auto merge. A system process called mergedog periodically checks the column store tables that are loaded locally. For each individual table or single partition of a split table, it determines whether or not a merge is necessary based on certain criteria (for example, size of delta storage, available memory, time since last merge, and others).

If the active parameter in the mergedog section of the indexserver.ini file is set to yes, auto merge is active.

Note

You can activate or deactivate auto merge for an individual table in the TABLES (SYS) system view. Change the value in the AUTO_MERGE_ON column to TRUE or FALSE.

Smart Merge

If an application powered by SAP HANA requires more direct control over the merge process, SAP HANA provides a function that allows the system to check whether or not a delta merge makes sense. This function is called smart merge. For example, if an application starts loading relatively large data volumes, a delta merge during the load can have a negative impact both on the load performance and on other system users. Therefore, the application can disable the auto merge for those tables being loaded and send a hint to the database to merge once the load has completed.

When the application issues a smart merge hint to the database to trigger a merge, the database evaluates the criteria that determine whether or not a merge is necessary. If the criteria are met, the merge is executed. If the criteria are not met, the database takes no further action. Only a subsequent hint from the application triggers another evaluation of the criteria.

If the active parameter in the smart_merge_enabled section of the indexserver.ini file is set to yes, smart merge is active.

Caution

For tables that you want to merge with the smart merge, disable the auto merge. Otherwise, the auto merge and smart merge may interfere with each other.

Hard and Forced Merges

You can trigger the delta merge operation for a table manually by executing the SQL statement MERGE DELTA OF "<table_name>". This is called a hard merge. It causes the database to execute the delta merge for the table either immediately if sufficient system resources are available, or as soon as sufficient system resources become available. The hard merge is therefore subject to the merge token control mechanism.

If you want the merge to take place immediately regardless of system resource availability, you can pass an optional parameter. A forced merge may be useful in a situation where there is a heavy system load, but a small table needs to be merged. It is also useful if a missed merge of a certain table negatively impacts system performance. To execute a forced merge, execute the SQL statement MERGE DELTA OF '<table_name>' WITH PARAMETERS ('FORCED_MERGE' = 'ON').

Note

Unlike system-triggered delta merge operations, all of the manually executed delta merge operations listed here do not later trigger an optimization of the compression of the table's new main storage. If the table was compressed before the delta merge operation, it remains compressed with the same compression strategy afterward. If it was not compressed before the delta merge operation, it remains uncompressed afterward. After a manual delta merge, you must therefore trigger compression optimization manually.

Critical Merge

To keep the system stable, the database can trigger a critical merge. For example, when auto merge is disabled and no smart merge hints are sent to the system, the size of the delta storage can grow too large for a successful delta merge to occur. The system initiates a critical merge automatically when a certain threshold is passed.

Parameters

Threshold for Optimization Compression

The thresholds for optimization compression to occur are defined as parameters, as shown in the following table:

ParameterDefaultDescription
ActiveYesCompression optimization status
min_change_ratio1.75Minimum required change row count (ratio)
min_hours_since_last_merge_of_part24Minimum hours since the last merge of part
min_rows10240Minimum required rows (which stored in the table)

Write operations on this compressed data are costly, as they require reorganizing the storage structure. Therefore, write operations in column store do not directly modify compressed data. All changes go into a separate area called the delta storage. The delta storage exists only in main memory. Only delta log entries are written to the persistence layer when delta entries are inserted.

Features of Delta Merge Operations

The features of the delta merge operation are as follows:

  • The delta merge operation is executed on table level.

  • It moves changes collected in write-optimized delta storage into the compressed and read-optimized main storage.

  • Read operations always read from both main storage and delta storage, and merge the results.

  • The delta merge operation is decoupled from the execution of the transaction that performs the changes. It happens asynchronously at a later point in time.

Note

For the delta merge operation, a double buffer concept is used. This has the advantage that the table only needs to be locked for a short time. For more information, see the Administration Guide.

Caution

The minimum memory requirement for the delta merge operation includes the current size of main storage, the future size of main storage, the current size of delta storage, plus some additional memory. Even if a column store table is unloaded or partly loaded, the whole table is loaded into memory to perform the delta merge.

Log in to track your progress & complete quizzes