Describing SAP HANA System Replication

Objective

After completing this lesson, you will be able to describe SAP HANA system replication.

System Replication

Usually, system replication is set up so that a secondary standby system is configured as an exact copy of the active primary system, with the same number of active hosts in each system. The number of standby hosts does not need to be identical.

With multitier system replication, you have one primary system and can have multiple secondary systems. Each service instance of the primary SAP HANA system communicates with a counterpart in the secondary system.

Using SAP HANA kernel-based system replication, the SAP HANA database system in the primary data center replicates to the secondary data center. The secondary SAP HANA database system is running.

The secondary system can be located near the primary system to serve as a rapid failover solution for planned downtime, or to handle storage corruption or other local faults. Alternatively, it can be installed in a remote site to be used in a disaster recovery scenario. Both approaches can be linked together with multitier system replication. Like storage replication, this disaster recovery option requires a reliable connection channel between the primary and secondary sites. The instances in the secondary system operate in recovery mode. In this mode, all secondary system services constantly communicate with their primary counterparts.

System replication realizes a cluster across data centers with database-controlled transfer.

System replication has the following advantages:

  • Memory is continuously loaded on a secondary site in preparation for the possible takeover and occupies resources.

  • Switch-over is faster than with storage replication or mirroring (2-5 minutes).

  • There is a very short performance ramp (only minutes, not hours, without preparation).

System replication has the following disadvantage:

The hardware (memory and CPU) is actively used on the secondary site for the standby or shadow processes.

Replication Modes

If the connection to the secondary system is lost, or the secondary system crashes, the primary system resumes operation after a brief, configurable timeout.

The secondary system persists the received log but does not replay it immediately. To avoid a growing list of logs, incremental data snapshots are transmitted asynchronously from time to time from the primary system to the secondary system. If the secondary system has to take over, only the part of the log that represents changes made after the most recent data snapshot needs to be replayed. In addition to snapshots, the primary system also transfers status information about which table columns are currently loaded into memory. The secondary system preloads these columns correspondingly.

In the event of a failure that justifies a full system takeover, an administrator instructs the secondary system to switch from recovery mode to full operation. The secondary system, which has already preloaded the same column data as the primary system, becomes the primary system by replaying the last transaction logs, and then starts to accept queries.
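The replay rule at takeover can be illustrated with a small toy model. This is a sketch for illustration only, not SAP HANA code: the `LogEntry` type, the integer log positions, and the `entries_to_replay` helper are all assumptions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LogEntry:
    position: int  # hypothetical, monotonically increasing log position
    change: str    # free-text description of the change


def entries_to_replay(received_log, last_snapshot_position):
    """Return the log entries the secondary must replay at takeover.

    Changes up to the most recent data snapshot are already contained in
    that snapshot, so only later entries need to be replayed.
    """
    return [e for e in received_log if e.position > last_snapshot_position]


received = [
    LogEntry(1, "insert A"),
    LogEntry(2, "update A"),
    LogEntry(3, "insert B"),
    LogEntry(4, "delete A"),
]

# The most recent snapshot covered everything up to position 2.
to_replay = entries_to_replay(received, last_snapshot_position=2)
print([e.position for e in to_replay])
```

The more recent the last snapshot, the shorter the list, which is why periodic snapshots keep the takeover time bounded in this model.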

Note the following for synchronous and asynchronous setups:

  • In a synchronous setup, no committed transaction is lost. Open transactions are restarted, and clients reconnect to SAP HANA for this. Synchronous setups are required for distances in the range 50-100 km.

  • In an asynchronous setup, some committed data can be lost. The amount depends on how long the secondary site was unreachable or the line was too slow to keep up with the data transfer. These setups are used for longer distances, where the data centers are 100 km or more apart. They are also used when the standby process must not feed back into daily operation (that is, change its performance).
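The difference between the two setups can be sketched as a comparison of what was committed on the primary against what reached the secondary. The function name and the transaction IDs are hypothetical, and the model ignores timeouts and reconnects.

```python
def lost_commits(committed, shipped):
    """Return committed transaction IDs that never reached the secondary."""
    shipped_set = set(shipped)
    return [txn for txn in committed if txn not in shipped_set]


# Synchronous: a commit is acknowledged only after its log has reached the
# secondary, so everything committed has also been shipped.
sync_loss = lost_commits(committed=[1, 2, 3], shipped=[1, 2, 3])

# Asynchronous: the line lagged behind, and transaction 3 was committed on
# the primary before its log could be transferred.
async_loss = lost_commits(committed=[1, 2, 3], shipped=[1, 2])

print(sync_loss, async_loss)
```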

Minimal Setup for System Replication

The minimal setup for system replication in one data center for fast takeovers is shown in the figure, Minimal Setup for System Replication.

As a minimal setup, you can set up SAP HANA system replication in one data center. This gives you better high availability, but no disaster tolerance.

Operation Modes for System Replication

There are three different operation modes for the configuration of system replication.

Operation Modes

  1. Delta Data Shipping (operation_mode=delta_datashipping)

    This mode establishes a system replication in which a delta data shipping takes place occasionally (by default every 10 minutes) in addition to the continuous log shipping. The shipped redo log is not replayed on the secondary site.

  2. Continuous Log Replay (operation_mode=logreplay)

    This mode does not require a delta data shipping anymore. Additionally, the shipped redo log is continuously replayed on the secondary site.

  3. Continuous Log Replay with Active/Active (operation_mode=logreplay_readaccess)

    This mode is similar to the logreplay operation mode in that it continuously replays the redo log on the secondary site. Additionally, it allows read-only access to the secondary.

Note

A comparison between delta_datashipping and logreplay with regard to network traffic shows significantly reduced network traffic with logreplay. Delta data shipping displays a peak every 10 minutes when delta data shipping is triggered, whereas logreplay shows continuously shipped log buffers.

When using SAP HANA system replication, you can choose between delta data shipping and continuous log replay as the operation mode.
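As a rough illustration of where the operation mode is chosen, the following sketch builds (but does not run) the argument list for registering a secondary with hdbnsutil. The operation mode values come from the list above. The site name, host name, instance number, and the `sync` replication mode value are placeholders, and the flag spelling reflects common usage of `hdbnsutil -sr_register`; verify it against the documentation for your SAP HANA revision.

```python
from enum import Enum


class OperationMode(Enum):
    DELTA_DATASHIPPING = "delta_datashipping"
    LOGREPLAY = "logreplay"
    LOGREPLAY_READACCESS = "logreplay_readaccess"


def register_command(site_name, remote_host, remote_instance,
                     replication_mode, operation_mode):
    """Build the argument list for registering a secondary site.

    The command is only assembled here, never executed.
    """
    return [
        "hdbnsutil", "-sr_register",
        f"--name={site_name}",
        f"--remoteHost={remote_host}",
        f"--remoteInstance={remote_instance}",
        f"--replicationMode={replication_mode}",
        f"--operationMode={operation_mode.value}",
    ]


# Hypothetical values: site "SiteB" registering against primary host "hana-a".
cmd = register_command("SiteB", "hana-a", "00", "sync", OperationMode.LOGREPLAY)
print(" ".join(cmd))
```

Using an enum keeps the three documented values in one place, so a typo in the mode name fails in Python rather than on the database host.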

Operation Mode: Delta Data Shipping

The events for the start of a transport are as follows:

  1. The primary system creates an internal data package similar to a full data backup and transfers this initially to the secondary site. The transport happens asynchronously.
  2. Log information is transferred in parallel to the initial data transfer. The log is transported asynchronously until the commit of the finished transaction occurs. With the commit, all other log information that is not yet transferred or written, as well as the final commit, must also be written synchronously. This must occur before the primary productively-used database can continue transactional work.
  3. All load and unload operations of the main indexes and table columns are monitored and offered with the incremental data transfer to the secondary system. These main indexes and table columns are then loaded or unloaded equivalently to memory in preparation for the takeover.

The events during incremental transport are as follows:

  1. Using the shadow memory concept of SAP HANA, small incremental backups are transferred as delta data packages to the secondary site every 10 minutes. The default parameter setting is 600 seconds.
  2. Along with this delta data, information about the main indexes loaded into SAP HANA on the primary site is also transferred to the secondary site. This prepares the main memory on the secondary site with the same main indexes.

Operation Mode: Log Replay

Since the first version of system replication, the delta_datashipping operation mode has been the default replication method. With the logreplay operation mode, delta data shippings are no longer necessary. The takeover time has been reduced and more components are already initialized at replication time.

In the logreplay operation mode, system replication uses an initial data shipping to initialize the secondary site. After that, only log shipping takes place, and the log buffers received by the secondary are replayed there. Savepoints are executed individually for each service, and column table merges are executed on the secondary site.

In the logreplay mode of operation, log segments can be marked as retained so that they can sync a secondary system after a disconnect.

With continuous log replay, delta data shipping cannot be used to sync a secondary site. This is because although the primary and secondary persistence are logically compatible, they are no longer physically compatible. This means that the data contained in the persistence is the same, but the layout of the data on pages can be different on the secondary site. Therefore, a secondary site can sync only using delta log shipping. This is relevant for the following situations:

  • The secondary site has been disconnected for some time (for example, because of a network problem or temporary shutdown of the secondary site).
  • A former primary site has been registered for failback.

The secondary site only uses the log in the online log area of the primary SAP HANA system for syncing. To sync the secondary site, the log must be retained for a longer time period than previously. If syncing using delta log shipping does not work, for example because the log has been reused, a full data shipping is necessary. To avoid this, the concept of log retention has been introduced.
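The resync decision described above can be sketched as a comparison of log positions. This is a simplified model under stated assumptions: log positions are plain integers, and the function and parameter names are hypothetical rather than SAP HANA internals.

```python
def resync_method(secondary_last_position, oldest_retained_position):
    """Decide how a reconnecting secondary can be synced in logreplay mode.

    secondary_last_position:  last log position the secondary has received
    oldest_retained_position: oldest log position still available in the
                              online log area of the primary
    """
    # Delta log shipping works only if the primary still holds every log
    # entry that follows the secondary's last position.
    if oldest_retained_position <= secondary_last_position + 1:
        return "delta_log_shipping"
    # Otherwise the needed log has been reused, and a full data shipping
    # is necessary.
    return "full_data_shipping"


print(resync_method(secondary_last_position=100, oldest_retained_position=90))
print(resync_method(secondary_last_position=100, oldest_retained_position=150))
```

In this model, log retention amounts to keeping `oldest_retained_position` low enough for the longest disconnect you expect to bridge.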

Operation Mode: Log Replay Read Access

The operation mode Log Replay Read Access is identical to the Log Replay mode with the exception that the secondary system is available in read only mode. This means that a SQL query can access (read) the data in the secondary, but the data can't be changed.

The operation mode Log Replay Read Access can be useful in an environment where OLTP and OLAP workloads are mixed, and you want to split these over the primary and secondary system.

System Replication with QA and Development System on the Secondary Site

It is possible to make use of the secondary site for running QA and development systems while the primary system is in production.

Prerequisites for System Replication with QA and Development System on the Secondary Site

The following prerequisites must be taken into account:

  • Additional independent disk volume is needed for Development/QA systems. Because the secondary site requires the same I/O capacity as the primary site, the additional systems must not have a negative impact on the secondary’s I/O. Therefore, it is recommended to have a separate storage infrastructure for each system.

  • The SIDs and instance numbers have to be different for Development/QA. The <instance number>+1 of the productive system must not be used, but must be free on both sites, because this port range is used for system replication communication.

  • Preload of tables must be switched off on the secondary site, using the following parameter:

    global.ini/[system_replication]-> preload_column_tables=false

  • The takeover process takes longer because no data is preloaded in memory at the secondary site (could still meet SLAs for disaster recovery).

  • Development/QA systems need to be shut down in the case of a takeover.

  • The global allocation limit on the secondary must be set in a way that the available memory covers the memory needed by the secondary system as well as the Development/QA systems, using the following parameter:

    global.ini/[memorymanager]-> global_allocation_limit
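The SID and instance number rules from the prerequisites can be checked with a short script before installing the additional systems. This is a minimal sketch: the function name, the SIDs, and the instance numbers in the example are hypothetical.

```python
def check_additional_systems(prod_sid, prod_instance, additional):
    """Return a list of problems for Dev/QA systems on the secondary site.

    additional maps each Dev/QA SID to its instance number.
    """
    # <instance number>+1 of the productive system must stay free because
    # its port range is used for system replication communication.
    reserved = prod_instance + 1
    problems = []
    for sid, instance in additional.items():
        if sid == prod_sid:
            problems.append(f"{sid}: SID must differ from the productive system")
        if instance == prod_instance:
            problems.append(
                f"{sid}: instance number {instance:02d} is used by the productive system")
        elif instance == reserved:
            problems.append(
                f"{sid}: instance number {instance:02d} is reserved for system replication")
    return problems


# Productive system PRD with instance 00; QAS wrongly uses instance 01.
for problem in check_additional_systems("PRD", 0, {"QAS": 1, "DEV": 10}):
    print(problem)
```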

The configured operation mode influences the memory size required on the secondary site as follows:

Operation Mode        Memory Needed on Secondary Site
delta_datashipping    row store size + 20 GB (minimum 64 GB)
logreplay             row store size + size of column tables loaded in memory + 50 GB

If the row store size grows during operation of the primary, it might become necessary to increase the global_allocation_limit on the secondary site. You can change the global.ini on the secondary site accordingly, and then activate the change with "hdbnsutil -reconfig" (because SQL is not possible in this state).
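The sizing rules from the table can be written as a small helper. This is a sketch: the function name is hypothetical, and reading "minimum 64 GB" as a lower bound on the total for delta_datashipping is an interpretation of the table.

```python
def secondary_memory_gb(operation_mode, row_store_gb, loaded_column_tables_gb=0.0):
    """Estimate the memory needed by the secondary system, in GB."""
    if operation_mode == "delta_datashipping":
        # row store size + 20 GB, but at least 64 GB (assumed lower bound)
        return max(row_store_gb + 20, 64)
    if operation_mode == "logreplay":
        # row store size + column tables loaded in memory + 50 GB
        return row_store_gb + loaded_column_tables_gb + 50
    raise ValueError(f"no sizing rule for operation mode {operation_mode!r}")


print(secondary_memory_gb("delta_datashipping", row_store_gb=30))
print(secondary_memory_gb("logreplay", row_store_gb=30, loaded_column_tables_gb=200))
```

Rerunning such an estimate when the row store grows shows whether the global_allocation_limit on the secondary still leaves enough room for the Development/QA systems.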

When using SAP HANA system replication, you can also re-use the hardware for the development and quality assurance system. On the secondary system, set the parameter preload_column_tables=false.
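The two parameters named above can be rendered as a global.ini fragment with Python's standard configparser module. This is a sketch under assumptions: the limit value is a placeholder, it is assumed to be given in MB (check the parameter reference for your revision), and in practice you would edit the existing global.ini on the secondary rather than write a new file.

```python
import configparser
import io


def secondary_ini_fragment(global_allocation_limit_mb):
    """Render the global.ini sections discussed in the text as a string."""
    config = configparser.ConfigParser()
    config.optionxform = str  # keep parameter names exactly as written
    config["system_replication"] = {"preload_column_tables": "false"}
    config["memorymanager"] = {
        "global_allocation_limit": str(global_allocation_limit_mb)
    }
    buffer = io.StringIO()
    config.write(buffer)
    return buffer.getvalue()


# Placeholder limit of 131072 (assumed to mean MB).
fragment = secondary_ini_fragment(131072)
print(fragment)
```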

Advantages

The advantages of system replication with a Development/QA system on a secondary site include the following:

  • Development/QA is operated on the secondary site (mixed cost calculation).

  • Synchronous and asynchronous solutions are available.

  • The impact of the synchronous solution on the primary site is about 10% (in contrast to about 25% with storage replication).

  • The transfer process from primary to secondary is optimized, and less data has to be transferred than with storage replication.

  • During the takeover to the secondary site, only a roll forward from the latest data synchronization point is necessary.

Disadvantages

The disadvantages of system replication with a Development/QA system on a secondary site include the following:

  • Table and column data cannot be loaded continuously into memory on the secondary site.

  • Hardware (memory and CPU) is actively used for Development/QA and partly for the standby or shadow processes.

  • Takeover is similar to storage mirroring (20 to 30 minutes at best).

  • Performance ramp is similar to storage mirroring (1 to 3 hours).

  • QA and Development need their own disk infrastructure, carefully separated so that the systems do not affect each other.

System Replication – Additional Information

Additional Information                                                                  SAP Note
FAQ: SAP HANA System Replication                                                        1999880
FAQ: SAP HANA Database Backup & Recovery in an SAP HANA System Replication Landscape    2165547
Collection of How-To Guides and Whitepapers For SAP HANA High Availability:             2407186
  • FAQ High Availability for SAP HANA
  • How To Perform System Replication for SAP HANA
  • How To Configure Network Settings for HANA System Replication
  • Network Required for SAP HANA system replication
