Business Example
As an SAP HANA database administrator, you need to understand the SAP HANA host auto-failover concept. To understand this feature better, you need hands-on experience with the failure of a coordinator node in a multi-host SAP HANA system.
Failover Algorithm Explained
In contrast to other high availability solutions, SAP HANA does not use a quorum of multiple SAP HANA hosts to decide which host becomes coordinator at initial startup or during a coordinator failover. With heartbeats and fencing, a single host can reliably make this decision on its own.
Coordinator Host Failover with Standby Host but All Coordinator Candidates in Use (Double Failover)

The previous figure shows a coordinator host failover to a worker host. On the left, the original state of the system is shown. On the right, the first host fails and its coordinator role is moved to the second host. The original worker role of the second host is failed over to the standby host.
Failover step-by-step:
If no coordinator name server candidate with standby as an actual index server role is available, one of the coordinator candidates currently used as index server worker is chosen as the new coordinator.
The failover steps for the coordinator host are the same as in the scenario described in the figure, Coordinator Host Failover Without Available Standby Hosts.
The previously assigned worker role is marked as failed and enters the failover queue. Because a standby host is available, the worker failover starts shortly after the coordinator failover.
Both failovers are then executed in parallel.
Coordinator Host Failover to a Standby Host

The previous figure shows a coordinator host failover to a standby host. On the left, the original state of the system is shown. On the right, the first host fails and its role is moved to the fourth host.
Failover step-by-step:
The name server coordinator candidate with the highest priority (that is, the smallest number in the configured name server role) detects the failure condition and initiates the failover.
If a name server candidate is available that is currently a standby host, the failover is forwarded to this host. This avoids a double failover (see the double failover example described earlier).
The failover includes the same steps as in the worker host failover scenario described earlier.
The name server reloads its persistence from disk.
Target Host Selection
This section describes the selection process of the replacement host. Beginning with SPS11, the actual host roles (HOST_ACTUAL_ROLES) are considered.
SAP HANA 1.0 SPS11 and newer
If there is a standby host with an exact match of corresponding actual host roles, it is used.
If there is a standby host with one of the roles that corresponds to the failing host, it is used.
If the failing host has an SAP HANA worker role, any unassigned standby is used.
SAP HANA 1.0 SPS10 and older
If there is a standby host, it is used.
If there are multiple equivalent options available, the first host is used.
The search steps prefer hosts in the same failover group. If global.ini/[failover]/cross_failover_groups=false is configured, the search is restricted to the same failover group.
If no suitable host is available, no failover happens and HOST_STATUS shows ERROR (the roles and statuses involved can be checked with the query below).
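The configured and actual host roles that drive this selection are exposed in the monitoring view SYS.M_LANDSCAPE_HOST_CONFIGURATION. The following query is a minimal sketch; the column set assumes SAP HANA 1.0 SPS11 or newer and should be verified against your release:
select HOST, HOST_ACTIVE, HOST_STATUS, HOST_CONFIG_ROLES, HOST_ACTUAL_ROLES,
       INDEXSERVER_CONFIG_ROLE, INDEXSERVER_ACTUAL_ROLE, FAILOVER_STATUS
from SYS.M_LANDSCAPE_HOST_CONFIGURATION order by HOST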
Coordinator Host Failover Without Available Standby Hosts
Distributed landscapes without standby hosts can also perform a failover to ensure that the coordinator host is always available. However, one worker role (and all tables located there) is then inaccessible after the failover.
This failover mechanism can be disabled by removing the name server roles COORDINATOR 2 and COORDINATOR 3 in the SAP HANA cockpit. Disabling is required if you use local storage on each host (not recommended) or if the landscape is controlled by an external cluster manager.
The wait timeout of a name server worker (non-coordinator candidate) on a system restart is different from that of a coordinator candidate. The number of retries to reach the coordinator name server before the startup is aborted is controlled with the following parameter: nameserver.ini/[failover]/slave_to_master_startup_retries=10.
Because the wait interval after one unsuccessful retry is 5 seconds, the default parameter value of 10 leads to a maximum wait time of 50 seconds.
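If a longer wait time is required, for example because the coordinator host needs more time to come up, the retry count can be increased with SQL. The following statement is a sketch; the value 60 (roughly five minutes of waiting) is only an illustrative choice:
alter system alter configuration ('nameserver.ini', 'SYSTEM')
set ('failover', 'slave_to_master_startup_retries') = '60' with reconfigure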

The previous figure shows a coordinator host failover to a worker host. On the left, the original state of the system is shown. On the right, the first host fails and its role is moved to the second host. The original role of the second host is not available until a standby host is added to the system or the failed first host is re-activated.
Failover step-by-step:
The name server coordinator candidate with the highest priority detects the failure conditions and executes the failover steps itself.
The new coordinator name server calls the stonith() method of all installed HA/DR provider hooks and the Storage Connector stonith() method (if applicable) to reboot the failed host.
If STONITH fails, the failover is aborted and the new coordinator shuts itself down.
The (possibly) remaining third coordinator candidate then retries the failover.
If this also fails, no coordinator is available throughout the whole landscape and the worker hosts eventually shut themselves down.
The new coordinator stops all its services (except hdbdaemon and nameserver).
The new coordinator calls the Storage Connector's detach() method for the old storage partition, its attach() method for storage partition 1 (the mnt00001 directory), and the failover() method of all installed failover hooks:
If this fails, failover is aborted, and the new coordinator shuts itself down.
The (possibly) remaining third coordinator candidate then retries the failover.
If this also fails, no coordinator is available throughout the whole landscape, and the worker hosts shut themselves down.
The new coordinator name server loads its persistence from disk.
The currently existing services, host roles, storage partition numbers, and volume IDs of all services are swapped between the two hosts in the topology, and all name servers are informed.
The hdbdaemon process is reconfigured, which starts all the required services.
The role of the displaced worker host remains inactive; the system is only partially available (see the query below to verify the resulting state).
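After such a failover, the current distribution of services and the host that actually runs the coordinator name server can be checked, for example, in the monitoring view SYS.M_SERVICES. The following query is a sketch; note that the COORDINATOR_TYPE column still reports legacy values such as MASTER and SLAVE in many revisions:
select HOST, SERVICE_NAME, ACTIVE_STATUS, COORDINATOR_TYPE
from SYS.M_SERVICES order by HOST, SERVICE_NAME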
Host Auto-Failover vs External Cluster Manager
Instead of using the built-in SAP HANA host auto-failover, you could monitor and (re)start virtualized hosts on different hardware with an external cluster manager. With multiple SAP HANA instances, this would have the advantage that fewer standby hosts would be needed, but on the other hand, all failure detection and fencing logic would have to be implemented externally. To avoid unnecessary SAP HANA-controlled coordinator failovers, the name server COORDINATOR 2 and COORDINATOR 3 roles can be removed as described previously.
Automatic Host Shutdown by Service Failures
For every service, a fixed number of startup retries can be defined, after which the daemon shuts down all services on the host (if configured to do so). The relevant parameters are set for each service type in daemon.ini:
# If set to true the daemon will shut down all services on the host if
# this service cannot start
startup_error_shutdown_instance=true
# Number of retries if a service fails in startup procedure
startup_error_restart_retries=4
The name server is the only service that has startup_error_shutdown_instance set to true by default. This means that a constantly crashing name server eventually stops the daemon. The settings shown above may be used for the index server, for instance, if recurring startup problems of that service should stop the affected database instance.
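The same behavior can be configured without editing daemon.ini on the file system, for example with the following SQL. This is a sketch for the index server scenario mentioned above; the configuration layer and values shown are illustrative:
alter system alter configuration ('daemon.ini', 'SYSTEM')
set ('indexserver', 'startup_error_shutdown_instance') = 'true',
    ('indexserver', 'startup_error_restart_retries') = '4' with reconfigure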
SAP HANA and Split Brain
In SAP HANA's coordinator/worker/standby failover solution, there is only one entity in the whole system that can make failover decisions: the coordinator name server. A worker or standby host never executes a failover by itself. Therefore, only the coordinator host must be considered for split-brain situations.
SAP HANA would run into a split-brain situation if multiple hosts try to become coordinator name server/index server and access the same set of data (persistence) from disk. This would irreparably destroy the data. To overcome this problem, SAP HANA uses I/O fencing to prevent the other host from accessing the storage, as follows:
SAN storage: The storage devices are locked by the currently active host with SCSI-3 persistent reservations. If another host tries to mount those devices, the old host automatically loses write permissions and the services abort themselves.
NFSv3 shared storage: The NFSv3 file lock implementation cannot be used as locks would not be released if an NFSv3 client dies, so a STONITH procedure must be provided by the storage vendor, which reboots a failed host.
NFSv4 shared storage or cluster file systems like GPFS: The file locking implementation works reliably across hosts. Non-availability of a host, and thus lock release, is handled by the file system. A host that tries to open a persistence that is already open fails and aborts itself.
Heartbeats over the communication network and the storage network are used to detect whether other hosts are still active and to prevent unnecessary failover attempts. If the target coordinator host detects that another coordinator is still active, it terminates itself to let the other coordinator continue. Without this, different hosts could try to become coordinator and would fence each other repeatedly.
In a split-brain situation, a quorum is sometimes used to decide which side should 'survive'. This makes sense in stateless compute clusters, where the larger part of the resources should remain active. However, in SAP HANA, tables are bound to specific storage partitions and service instances. Tables in the other partition would not be accessible, and applications typically cannot continue with some tables inaccessible. Therefore, SAP HANA lets the initial coordinator continue.
The hdbnsutil Executable
Some actions supported by the hdbnsutil executable access the persistence while the system is stopped. To avoid data corruption caused by unexpectedly active or reviving services, this program also checks for active name servers with network- and storage-based heartbeats and uses fencing to set the SCSI-3 persistent reservation.
SAN storage: After hdbnsutil (or the name server) stops, the SCSI-3 persistent reservations are intentionally not released. This ensures that no other service unintentionally accesses the persistence, for example, still-running services on other hosts after a split-brain situation.
Host Auto-failover Duration

The failover phase can be split into the following steps:
Failure detection
Several watchdogs, retries, and timeouts are involved. Based on the failure condition, the detection time can vary, for example as follows:
SAP HANA instance terminated or host shut down
The checking host immediately gets errors from the OS layer and typically detects the failure in less than one minute.
Network split
The checking host must wait until the network times out, so failure detection typically takes three to six minutes. The timeouts could be reduced, but this is not recommended, as it would not allow recovery from short network outages, or could lead to a false failover decision in the case of heavy system load, where pings can take longer.
Failover execution
The failover time is comparable to the time required for SAP HANA startup, because the services on a standby host are initially started, but run idle. During failover, they do the same initialization and persistence load as in regular service startups.
Host Start Order/Landscape Restart
All hosts can be started concurrently. The coordinator name server candidates have different priorities, as indicated by the role names COORDINATOR 1, 2, and 3. The first coordinator candidate becomes the active coordinator. The index server roles, host roles, and storage partitions are reset, meaning that all configured worker hosts are used as workers again, even if the landscape was in a failed-over state before shutdown.
Up to SPS11, if a host was previously used as a worker, its storage partition is kept as is, to avoid inefficient access patterns in clustered file systems. As a result, over time, the storage partitions may end up in a different order than in the initial state after installation.
As of SPS12, a landscape restart considers the configured storage partitions, bringing the system back to its original state. As a prerequisite, the coordinator name server must be on its original host (with a configured storage partition of 1). In an SAP HANA system replication setup, only the primary system performs this failback operation.
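Whether the landscape is back in its configured state after a restart can be checked by comparing the storage partition and index server roles per host, for example with this query sketch on SYS.M_LANDSCAPE_HOST_CONFIGURATION:
select HOST, STORAGE_PARTITION, INDEXSERVER_CONFIG_ROLE, INDEXSERVER_ACTUAL_ROLE
from SYS.M_LANDSCAPE_HOST_CONFIGURATION order by STORAGE_PARTITION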
Failback
When a failover has been performed and the failed host becomes available again, no automatic failback happens; the host starts as a standby host. A controlled failback can be performed by stopping or restarting the host that is configured as a standby but, after the previous failover, actually runs as a worker. An automatic failback only happens when the complete landscape is restarted.
Coordinator Name Server Candidates
The initial host is a coordinator candidate, and the first two hosts added to a landscape also become coordinator candidates. When a standby host is added and none of the coordinator candidates is a standby host, the last coordinator candidate role is moved to the new standby host. Having a standby host in the coordinator candidate list allows a faster coordinator host failover because it avoids the previously mentioned double failover.
Failover Groups
During installation and with SAP HANA Studio, a failover group can be configured per host. If a failover target host is available in the same group, it is preferred over hosts from other groups. This can be used to achieve better 'locality' in large systems, that is, to use network and storage connections with lower latency. When the parameter nameserver.ini/[failover]/cross_failover_groups is set to false, failover is restricted to hosts in the same group. This can be used to separate differently sized hardware or separate storage entities.
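Restricting failover to hosts of the same group can be configured with SQL, for example as follows. This is a sketch using the parameter location referenced in this section; verify the file name and parameter against your release before applying it:
alter system alter configuration ('nameserver.ini', 'SYSTEM')
set ('failover', 'cross_failover_groups') = 'false' with reconfigure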
Application Configuration
In the connection information for SAP HANA SQL client libraries (for example, hdbuserstore), you can configure multiple host names. All coordinator name server candidates should be configured there. The coordinator candidates can be found using the following SQL statement:
select HOST from SYS.M_LANDSCAPE_HOST_CONFIGURATION where NAMESERVER_CONFIG_ROLE like 'MASTER%' order by NAMESERVER_CONFIG_ROLE
Application Error Handling
Failover is not seamless. Errors that occur during the failure phase are returned to the clients. Neither the server nor the client libraries have built-in 'retry' logic. Applications must be prepared for this and should try to reconnect.
Coordinator host failure: The client typically gets error -11312 (Connection to database server lost; check server and network status [System error: ...])
Worker host failure: Almost any error code can occur, because the coordinator connection is still available, but some tables are no longer accessible and statements can fail at various steps.