Checking the SAP HANA Persistence Storage

Objective

After completing this lesson, you will be able to check the SAP HANA persistence storage

Persistence

The simplified architecture of the persistence layer is shown in the figure, SAP HANA Persistence Layer. While isolation and consistency control are already provided by the in-memory stores and the transaction manager, the persistence layer ensures that changes are durable and that the database can be restored to the most recent committed state after a restart.

The persistence layer stores data in persistent disk volumes that are organized in pages. Page sizes between 4 KB and 16 MB are supported, and data is loaded from disk and written to disk page by page. For read and write access, pages are loaded into the page buffer in memory. The page buffer has no fixed minimum or maximum size; all memory not used for other purposes can be used for the page buffer. If the memory is needed elsewhere, the least recently used pages are removed from the buffer. If a modified page is chosen for removal, it first needs to be persisted to disk. While the pages and the page buffer are managed by the persistence layer, the in-memory stores can access data within loaded pages.

Column Store Persistence

The column store keeps the data of regular columns in contiguous memory arrays that may be many gigabytes in size. The persistence layer, on the other hand, manages memory in pages of much smaller size. When columns are loaded into memory, their data is first loaded into the page buffer of the persistence layer and then copied into the contiguous memory area that contains the entire column. To prevent double memory consumption, the persistence layer frees each page immediately after its content has been copied to column store memory.

Data is copied between the column store and the persistence layer when the in-memory table is loaded and when data is written back to the persistence layer.

Write operations on column tables are executed on the in-memory column data structures. In addition, redo log entries are written, and the underlying pages are marked as "dirty". The data is written from the in-memory columns to the delta pages at a later point in time, when the page needs to be flushed.

Row Store Persistence

Row store memory is organized into 16 KB pages that are contained in segments of 64 MB. To avoid copying between the persistence layer page buffer and row store memory, the persistence layer provides special interfaces for creating and accessing row store pages. Row store pages are created by the persistence layer, but the actual allocation of memory is delegated to the row store. The row store page manager allocates the memory in the row store's persisted segments. With this concept, the same piece of memory is managed by the row store as a row store page and by the persistence layer as a page in its page buffer. Aspects such as concurrency control are handled by the row store; other aspects such as shadow paging and I/O are managed by the persistence layer. With this design, row store data is loaded directly into the row store's persisted memory segments without the need to copy pages.

Disk Storage

Disk storage is still required to allow restarts in case of power failure and to provide permanent persistence. The SAP HANA persistence layer stores data in persistent disk volumes that are organized in pages. It is divided into the log area and the data area, as follows:

  • Data changes such as insert, delete, and update are saved to disk immediately in the logs (synchronously). This is required to make a transaction durable. It is not necessary to write the entire data at this point; the transaction log can be used to replay the changes after a crash or database restart.

  • In customizable intervals (standard: every five minutes), a new savepoint is created. That is, all pages that were changed since the last savepoint are written to the data area of the persistence (see the example query after this list).
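
To verify the actual savepoint cadence, you can query the savepoint statistics, for example from an SQL console. The following query is a minimal sketch, assuming the START_TIME and DURATION columns of the SYS.M_SAVEPOINTS monitoring view:

    -- Show the ten most recent savepoints, newest first, to check
    -- that they occur roughly at the configured interval.
    SELECT TOP 10
           HOST, PORT, VOLUME_ID, START_TIME, DURATION
      FROM SYS.M_SAVEPOINTS
     ORDER BY START_TIME DESC;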

Whether or not disk access becomes a performance bottleneck depends on the usage. Because changes are written to the data volumes asynchronously, the user or application does not need to wait for this. When data that already resides in main memory is read, there is no need to access the persistent storage. However, when changes are applied to data, the transaction cannot be successfully committed before the changes are persisted to the log area.

To optimize performance, fast storage, such as solid-state drives (SSDs) or Fusion-io drives, is used for the log area (see the certified hardware configurations in the Product Availability Matrix).

Redo Log

Redo log entries are persisted before the write transaction completes. They are persisted to separate log volumes while normal data is persisted to data volumes. With the help of the redo log entries, committed changes can be restored even if the corresponding data pages were not persisted.

The redo log entries written by the SAP HANA database system are write-ahead log entries. That means that the entries are written before the actual data is written. Managing the redo log is the task of the logger component. It provides an interface that allows other components to create log entries and to request flushing buffered redo log entries to disk.

When the database is restarted after a crash, the persisted log entries need to be processed. To speed up this process, the system periodically writes savepoints. During a savepoint operation, it is ensured that all changes that were made in memory since the last savepoint are persisted to disk. Thus, when starting up the system, only the redo log entries created after the last savepoint need to be processed. After a log backup, the old log entries before the savepoint position can be removed.
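
Which log segments still hold required entries and which have already been released can be checked in the SYS.M_LOG_SEGMENTS monitoring view. The following query is a sketch that assumes its FILE_NAME, STATE, and TOTAL_SIZE columns; typical state values include Writing, Closed, Truncated, and Free:

    -- List the log segment files of each service with their state;
    -- segments in state 'Free' can be reused for new log entries.
    SELECT HOST, PORT, VOLUME_ID, FILE_NAME, STATE, TOTAL_SIZE
      FROM SYS.M_LOG_SEGMENTS
     ORDER BY HOST, PORT, FILE_NAME;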

Undo Information

SAP HANA not only persists committed changes; it is also possible that uncommitted changes are persisted. During a savepoint, the most recent version of a changed row store page is written to disk, whether the changes are committed or not. The same is true for new rows in the delta storage of column tables: they are persisted when the corresponding page is flushed, no matter whether the change is committed or not.

During startup, these uncommitted changes need to be rolled back. To this end, undo information needs to be persisted as well. Unlike redo log entries, undo information is not written to the log volumes. Instead, two mechanisms are used together to roll back changes after a system restart: the shadow page concept and undo information that is persisted to the data volumes.

Storing Data in Data Volumes: Details

Like many modern database management systems, SAP HANA can use the file abstraction layer of the host operating system.

Each data volume contains one file in which data is organized into pages ranging in size from 4 KB to 16 MB (page size class). Data is written to and loaded from the data volume by page. Over time, pages are created, changed, overwritten, and deleted. The size of the data file is increased automatically as more space is required. However, it is not decreased automatically when less space is required. This means that, at any given time, the actual payload of a data volume (that is, the cumulative size of the pages currently in use) may be less than its total size.

This is not necessarily significant; it simply means that the amount of data in the file is currently less than at some point in the past (for example, after a large data load). If a data volume has a considerable amount of free space, it might be appropriate to shrink the data volume. However, a data file that is excessively large for its typical payload can also indicate a more serious problem with the database. SAP support can help to analyze the situation.
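
To check the fill ratio on SQL level, you can compare used and total size in the SYS.M_DATA_VOLUMES monitoring view. The following statements are a sketch, assuming its USED_SIZE and TOTAL_SIZE columns (in bytes); the RECLAIM DATAVOLUME statement shrinks the files, where 120 means defragmenting down to 120 percent of the payload:

    -- Fill ratio of each data volume file: a low percentage means
    -- the file is much larger than its current payload.
    SELECT HOST, PORT, FILE_NAME, USED_SIZE, TOTAL_SIZE,
           ROUND(100.0 * USED_SIZE / TOTAL_SIZE, 2) AS FILL_PCT
      FROM SYS.M_DATA_VOLUMES;

    -- Shrink the data volumes to 120 percent of their payload.
    ALTER SYSTEM RECLAIM DATAVOLUME 120 DEFRAGMENT;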

Monitor Disk Volumes

Directory Hierarchy for Data and Log Storage

During the installation process, the following default directories are created as the storage locations for data and log volumes:

  • /usr/sap/<SID>/SYS/global/hdb/data
  • /usr/sap/<SID>/SYS/global/hdb/log

Note

These default directories are defined in the parameters basepath_datavolumes and basepath_logvolumes in the persistence section of the global.ini file.
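
You can read the effective values of these parameters with a query on the SYS.M_INIFILE_CONTENTS monitoring view, for example:

    -- Show the configured base paths for data and log volumes
    -- on all configuration layers (for example DEFAULT, SYSTEM, HOST).
    SELECT FILE_NAME, LAYER_NAME, SECTION, KEY, VALUE
      FROM SYS.M_INIFILE_CONTENTS
     WHERE FILE_NAME = 'global.ini'
       AND SECTION = 'persistence'
       AND KEY LIKE 'basepath_%volumes';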

The services that persist data and therefore have volumes are the following (an example query follows the list):

  • nameserver: The nameserver service in the system database persists data.
  • indexserver: The indexserver service in every tenant database persists data.
  • xsengine (if running as a separate service): The xsengine service persists data on any host on which it is running.
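
Which volume belongs to which service can be checked with a query like the following sketch, which assumes the SERVICE_NAME and SUBPATH columns of the SYS.M_VOLUMES monitoring view:

    -- List all persistence volumes with the owning service and
    -- their location below the configured base path.
    SELECT HOST, PORT, SERVICE_NAME, VOLUME_ID, SUBPATH
      FROM SYS.M_VOLUMES
     ORDER BY HOST, SERVICE_NAME;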

Monitoring Disk Usage

You can use the SAP HANA cockpit to check disk statistics and to check that there’s enough space on disk for data volumes and log volumes. The Disk Usage card on the Database Overview page monitors the total usage of all disks, including space used by non-SAP HANA data. The disk with the highest (most critical) disk usage is shown in more detail.
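
The same information is also available on SQL level through the SYS.M_DISKS monitoring view, which reports usage per device, including space used by non-SAP HANA data. A sketch, assuming its USAGE_TYPE, USED_SIZE, and TOTAL_SIZE columns:

    -- Disk usage per device and usage type
    -- (for example DATA, LOG, TRACE, DATA_BACKUP, LOG_BACKUP).
    SELECT HOST, USAGE_TYPE, PATH, USED_SIZE, TOTAL_SIZE
      FROM SYS.M_DISKS
     ORDER BY HOST, USAGE_TYPE;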

The Disk Volume Monitor displays more details about the data volumes and the log volumes. Choose the … (More) icon on the Disk Usage Card and select Disk Volume Monitor.

The Disk Volume Monitor displays information about the data volume file names as well as the size of each file and how much of it is currently in use, both in MB and as a percentage of its total size. Used size is the amount of data in the file. Because the size of the file is automatically increased with the payload but not automatically decreased, used size and total size can differ. In addition, it displays the log file names, total size (which, for log files, is equivalent to used size), and state. When a log segment file is full, log entries are written to the next available log segment file.

Selecting a row in the Disk Volume Monitor displays more details on the disk volumes.

Information on I/O statistics and page statistics is shown on the following tabs (example statements follow the list):

  • Volume I/O Statistics: Displays aggregated I/O statistics for the volume. You can reset the statistics collection for all volumes by selecting Reset Volume I/O Total Statistics.

  • Data Volume Page Statistics: Displays statistics on the data volume's pages (or blocks), broken down according to page size class. Superblocks are partitions of the data volume that contain pages of the same page size class. You can analyze how many superblocks are used for a specific size class and also how many pages/blocks are used. The fill ratio enables you to decide whether or not it makes sense to reorganize and release unnecessary superblocks, in other words, to shrink the data volume.
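
The data behind these tabs can also be queried directly, as in the following sketch based on the SYS.M_VOLUME_IO_TOTAL_STATISTICS and SYS.M_DATA_VOLUME_PAGE_STATISTICS monitoring views:

    -- Aggregated read and write statistics per volume.
    SELECT * FROM SYS.M_VOLUME_IO_TOTAL_STATISTICS;

    -- Page and superblock usage per page size class.
    SELECT * FROM SYS.M_DATA_VOLUME_PAGE_STATISTICS;

    -- Start a new collection interval for the corresponding
    -- M_VOLUME_IO_TOTAL_STATISTICS_RESET view.
    ALTER SYSTEM RESET MONITORING VIEW SYS.M_VOLUME_IO_TOTAL_STATISTICS_RESET;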

Shadow Paging Concept

The persistence layer is responsible for durability and atomicity of transactions. Some parts of its design are influenced by concepts from MaxDB. The persistence layer ensures that the database is restored to the most recent committed state after a restart and that transactions are either completely executed or completely undone. To achieve this goal in an efficient way, the persistence layer uses a combination of write-ahead logs, shadow paging, and savepoints. The persistence layer offers interfaces for writing and reading persisted data. It also contains the logger component that manages the transaction log. Transaction log entries are written explicitly by using a log interface or implicitly when using the virtual file abstraction.

The persistence layer periodically performs savepoints. During the savepoint operation, modified pages in the page buffer, dirty column store delta pages, and the resulting pages provided by the row store are written to disk. Buffered redo log entries are flushed to disk as well. One purpose of performing savepoints is to speed up restart: when starting up the system, the redo log need not be processed from the beginning but only from the last savepoint position. This restores the system to the most recent committed state.

Savepoints are written asynchronously by the savepoint coordinator. The time interval for writing savepoints can be configured, for example, to 10 minutes. In addition, savepoints are triggered by several other operations such as data backup, database shutdown, or after a restart is completed. Administrators can also request a savepoint manually.

Note

The frequency for savepoint creation can be configured. This is described in detail later in this course. Savepoints are also triggered automatically by a number of other operations such as data backup, and database shutdown and restart. You can trigger a savepoint manually by executing the statement ALTER SYSTEM SAVEPOINT.
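
As a hedged example, assuming the savepoint_interval_s parameter in the persistence section of global.ini (default 300 seconds), the interval can be changed and a savepoint triggered manually as follows:

    -- Set the savepoint interval to 600 seconds (10 minutes) on
    -- system level and activate the change immediately.
    ALTER SYSTEM ALTER CONFIGURATION ('global.ini', 'SYSTEM')
      SET ('persistence', 'savepoint_interval_s') = '600'
      WITH RECONFIGURE;

    -- Trigger a savepoint manually.
    ALTER SYSTEM SAVEPOINT;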

If a system crashes during the savepoint operation, the system can still be restored from the last savepoint. This is possible because the write operations write to new physical pages and the previous savepoint version is still kept in shadow pages.

The savepoint operation is divided into the following three phases:

  • Phase 1: The majority of the changed data is written to disk. During this phase, database operation continues as usual.

  • Phase 2: This is the critical part of a savepoint operation, in which no concurrent write operations are allowed. This is achieved using the consistent change lock. At the beginning of phase 2, this lock is acquired in exclusive mode, which excludes concurrent write operations because they require the same lock in shared mode. Starting and finishing transactions is also blocked in phase 2. To minimize the impact on concurrent operations, phase 2 must be kept as short as possible (its actual duration can be checked with the query after this list). After the consistent change lock is acquired, the savepoint coordinator determines and stores the savepoint log position and the list of open transactions. This defines the state of the database that will be reflected in the savepoint. In phase 2, the savepoint coordinator also determines the pages that were changed during phase 1. It copies the content of these pages to a buffer in memory and then writes the buffer to disk using asynchronous I/O.

  • Phase 3: The savepoint coordinator waits until the asynchronous write operations triggered in phase 2 are finished and then writes the anchor page, which contains a link to the restart record. Changes are allowed again in phase 3, but they are no longer part of the savepoint.
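
How long phase 2 actually blocks concurrent changes can be checked in the savepoint statistics. This sketch assumes the CRITICAL_PHASE_DURATION column of the SYS.M_SAVEPOINTS monitoring view (in microseconds):

    -- Recent savepoints with the time spent in the blocking
    -- critical phase, newest first.
    SELECT TOP 10
           VOLUME_ID, START_TIME, DURATION, CRITICAL_PHASE_DURATION
      FROM SYS.M_SAVEPOINTS
     ORDER BY START_TIME DESC;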

Shadow paging is used to undo changes that were persisted since the last savepoint. With the shadow page concept, physical disk pages written by the last savepoint are not overwritten until the next savepoint has been successfully completed. Instead, new physical pages are used to persist changed logical pages. Until the next savepoint is complete, two physical pages may exist for one logical page: The shadow page, which still contains the version of the last savepoint, and the current physical page, which contains the changes written to disk after the last savepoint.
