Like any other system, an SAP HANA database system can go offline for various reasons. This lesson looks at scenarios where the database has unexpectedly gone offline, and how the database administrator can handle this situation.
The following issues can cause an SAP HANA system to go offline (that is, from the end-user perspective, the SAP HANA system seems to hang):
Power failure in the data center
Hardware failures at the server level (CPU/memory)
Hardware failures at the storage level (disk)
Hardware failures at the network level (switches/router)
Software errors at the operating system level (Linux)
Software errors at the storage system level (SAN/NAS)
Software errors at the database level (SAP HANA)
Human error causing downtime (server, router, storage, Linux, and SAP HANA)
Usually, in a system-down scenario, the system cannot be accessed through SQL or any other connection method. This makes analyzing the root cause a bit more difficult, but not impossible. Several small tests, performed in the right order, can help you quickly exclude areas that aren't causing the problem. Such a workflow should become a standard way of approaching a system that is down.
Because SAP HANA cockpit might only be able to partially connect to the SAP HANA system, you should use the following quick tests to roughly determine the area that causes the problem. As soon as you have found the problem area, investigate it more deeply, but do not forget that getting the system up and running again has the highest priority.
Check the network. Ping some hosts in the data center.
Check the hosts. Log on using SSH and verify that the OS is running without issues.
Check the storage. Create, read, or delete a file to test the connection to the storage system.
Check SAP HANA. Use sapcontrol to test if all SAP HANA database services are running.
Check SAP HANA. Use hdbsql to test if the SQL database is working for the application user(s).
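The ordered workflow above can be sketched as a small shell helper that runs the checks one after another and stops at the first failing area. This is only an illustration: the function name run_checks and the example commands in the usage note are assumptions, not part of any SAP tool.

```shell
# Run labelled checks in the given order and stop at the first failure.
# Arguments are pairs: <label> <command string>.
run_checks() {
    while [ "$#" -ge 2 ]; do
        label=$1
        cmd=$2
        shift 2
        if sh -c "$cmd" >/dev/null 2>&1; then
            echo "OK      $label"
        else
            # The first failing area is where to start looking more closely.
            echo "FAILED  $label"
            return 1
        fi
    done
}
```

A call could look like run_checks network "ping -c 1 <SAP HANA host>" storage "ls /hana/data", with the placeholders replaced by values from your own landscape.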
Caution
The following checks will help you to quickly identify parts that are broken or working. These tests are not intended for a deep root cause analysis; other, better-suited tools are available for that.

Check the Network

In today's world, where almost every device is connected to the network, it's extremely important that the network is up and running correctly. In an SAP HANA database system, the network is important as well. End users connect to the database to execute all kinds of queries. This can be done directly using SQL or via a middleware application. The SAP HANA database itself can be set up as a multi-host scale-out system that distributes the data over several servers. Without a network, external end-user connections and internal server-to-server connections would fail.
Because external and internal network connections are important for an SAP HANA system, you should test both by pinging SAP HANA and non-SAP HANA hosts in your network. If all the hosts can be reached, then the network is available and can be excluded as the cause.
ping <SAP HANA host>
ping <internal host>
ping <external host>
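When several hosts have to be tested, the ping commands above can be wrapped in a loop. The following is a minimal sketch; the function name check_hosts is an assumption, and the -c and -W options are those of the Linux iputils ping.

```shell
# Ping each host once and report which ones answer.
# Returns non-zero if at least one host is unreachable.
check_hosts() {
    rc=0
    for host in "$@"; do
        # -c 1: send a single echo request
        # -W 2: wait at most 2 seconds for the reply (Linux iputils ping)
        if ping -c 1 -W 2 "$host" >/dev/null 2>&1; then
            echo "reachable    $host"
        else
            echo "UNREACHABLE  $host"
            rc=1
        fi
    done
    return "$rc"
}
```

Call it with your own host names, for example check_hosts <SAP HANA host> <internal host> <external host>.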
A ping shows whether the remote hosts are reachable, but the network packets might still be taking a detour because of a routing problem in the network. You can check the network path to the remote host using the following command:
traceroute <SAP HANA host>
traceroute <internal host>
Hint
If end users in your company connect to the network through a virtual desktop infrastructure (VDI) solution or from a dedicated network, test the network connections from within these infrastructures as well.

Check the SAP HANA Hosts on OS Level
As soon as you know that the network is up and running, you can start testing if the SAP HANA hosts are functioning within normal parameters. Connect to the SAP HANA host(s) using your preferred method (SSH, XRDP, VNC, and so on) and check the Linux system logs.

As the SAP HANA hosts are normally up and running 24/7, check whether there were unplanned and unexplained restarts. You can check this with the following command:
last | grep boot
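To tell planned restarts from crashes, it can help to look at shutdown records as well. The sketch below reads the output of last -x (newest entry first) and prints every boot record that has no shutdown record directly before it in time. The function name unclean_boots is hypothetical, and the assumed line layout ("reboot system boot ..." and "shutdown system down ...") should be verified against the output on your own hosts.

```shell
# Read `last -x` output on stdin (newest entry first) and print each boot
# record that is not preceded in time by a shutdown record.
unclean_boots() {
    awk '
        $1 == "reboot" {
            # Two boot records without a shutdown in between: the newer
            # boot followed an unclean stop (crash, power loss, reset).
            if (prev == "reboot") print prevline
            prev = "reboot"; prevline = $0; next
        }
        $1 == "shutdown" { prev = "shutdown" }
    '
}
```

Typical use would be last -x | unclean_boots. The oldest boot record in the log cannot be classified, because the entry before it is no longer available.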
Looking at the Linux system log files to analyze the system is one of the most important tasks when troubleshooting a system. Since the move from syslog to the systemd journal, kernel messages and messages of system services are handled by systemd.
Systemd was introduced in SLES 12 and RHEL 7 and replaces the traditional init scripts. Systemd also introduced its own logging system called journal.
Systemd manages the journal as a system service under the name systemd-journald.service and it is switched on by default. In a systemd-enabled Linux system, the systemd-journald service collects all messages from the kernel, the boot process, syslog, and user processes, as well as the standard output and standard error of system services, in a centralized location.
You can check the last 50 error messages of the current boot in the journal with the following command:
journalctl -n 50 -p err -b
-n = number of messages to display
-p = message priority (err shows messages of priority err and more severe)
-b = show only messages from the current boot
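After an unexpected restart, the messages written just before the crash belong to the previous boot, not the current one. A minimal sketch, assuming the journal is stored persistently so that older boots are still available; the function name boot_errors is hypothetical:

```shell
# Show the last error messages of a given boot.
#   $1 = boot offset: 0 is the current boot, -1 the previous one (default 0)
#   $2 = number of messages to show (default 50)
boot_errors() {
    boot=${1:-0}
    count=${2:-50}
    # --no-pager prints directly to the terminal instead of opening a pager
    journalctl -b "$boot" -p err -n "$count" --no-pager
}
```

For example, boot_errors -1 shows the last 50 error messages of the boot before the current one, and journalctl --list-boots lists the boots the journal knows about.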
Hint
You can check the last 50 kernel error messages in the journal with the following command:
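The command for this hint is not shown here. Assuming it follows the same pattern as the journalctl command above, the -k option, which restricts the output to kernel messages, gives a likely form. The wrapper name kernel_errors is hypothetical.

```shell
# Show the last kernel error messages from the journal.
#   $1 = number of messages to show (default 50)
kernel_errors() {
    # -k limits the output to kernel messages of the current boot
    journalctl -k -p err -n "${1:-50}" --no-pager
}
```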
Check the Storage
Storage problems can result in severe database problems, database standstill or, even worse, data loss.

Avoiding storage problems is part of every layer in the Linux software and hardware stack. Modern hard drives are capable of detecting and correcting minor errors in block reads. SAN and NAS systems have built-in error correction and redundancy to handle power and hardware failures. Modern Linux file systems are all journal-based and can correct errors caused by power failures. Last but not least, databases also support many different techniques to survive power failures and unclean service shutdowns.
If the SAP HANA database system 'stopped' due to power, hardware, or software failures, you should check if all file systems are available again after the server has restarted. Depending on the storage system used you can investigate the storage problem more deeply.
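The create, read, and delete test mentioned earlier can be scripted per file system. The following is a sketch only: the function name check_fs_rw and the probe file name are assumptions, and the user running it needs write permission in the directory.

```shell
# Create, read back, and delete a small probe file in the given directory.
check_fs_rw() {
    dir=$1
    probe="$dir/.fs_probe_$$"
    if [ ! -d "$dir" ]; then
        echo "MISSING $dir"
        return 1
    fi
    if { echo probe > "$probe"; } 2>/dev/null &&
       [ "$(cat "$probe" 2>/dev/null)" = "probe" ] &&
       rm -f "$probe"; then
        echo "OK $dir"
    else
        # Clean up a partially written probe file, if any.
        rm -f "$probe" 2>/dev/null
        echo "FAILED $dir"
        return 1
    fi
}
```

On a host with the commonly used mount points, this could be called as check_fs_rw /hana/data, check_fs_rw /hana/log, and check_fs_rw /hana/shared; adjust the paths to your installation. On a hanging network file system the call itself may block, so wrapping it in timeout can be useful.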
Note
Storage system problems are not investigated in the scope of this course. For these, contact your storage vendor to get the support information you need.

Check the SAP HANA Database System
Checking if the database is up and running sounds like a good plan, but with all the services running it might still be the case that the end user or middleware application cannot connect. Such a situation can happen when, for example, the SAP HANA system is up and running, but the network doesn't allow SQL connections due to a firewall having been reconfigured.

To check if all the SAP HANA services and hosts are available on the Linux host, you can execute the following commands:
As <sid>adm user:
sapcontrol -nr <instance number> -function GetProcessList
sapcontrol -nr <instance number> -function GetSystemInstanceList
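To spot problem services quickly, the GetProcessList output can be filtered for anything that is not GREEN. The sketch below assumes the usual comma-separated layout with the columns name, description, dispstatus, textstatus, starttime, elapsedtime, and pid; verify this against the output on your system. The function name not_green is hypothetical.

```shell
# Read GetProcessList output on stdin and print every service whose
# dispstatus (third column) is not GREEN. Exits non-zero if any is found.
not_green() {
    awk -F', *' '
        # Only the header and the service rows have seven fields.
        NF >= 7 && $3 != "dispstatus" && $3 != "GREEN" {
            print $1 ": " $3
            bad = 1
        }
        END { exit (bad ? 1 : 0) }
    '
}
```

Usage, as <sid>adm user: sapcontrol -nr <instance number> -function GetProcessList | not_green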
You also need to check whether or not the system can be reached over the SQL interface. When you are already connected to the SAP HANA host via the SSH session, check the SQL interface with the following command:
Note
The default port number range for tenant databases is 3<instance>40 - 3<instance>99.
As <sid>adm user:
hdbsql -n localhost -i <instance number> -d <Tenant name> -u <your database user>
Enter your password when requested. You are now in the HDBSQL terminal. From the HDBSQL terminal you can get SAP HANA connection information by executing the command:
\s
Caution
It's important to test all your tenants, because the tenants have different SQL ports and can be stopped independently of a running SAP HANA database system.

Checking the SQL connection only from the local host isn't sufficient, because SAP HANA SQL traffic could be blocked on the network. To make sure that this isn't the case, also perform an HDBSQL connection test over the network from the end-user LAN and the application server network.
From the SAP S/4HANA ABAP application server, as <sid>adm user:
hdbsql -n <SAP HANA host> -i <instance number> -d <Tenant name> -u <your database user>
Enter your password when requested. You are now in the HDBSQL terminal. From the HDBSQL terminal you can get SAP HANA connection information by executing the command:
\s
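If the remote HDBSQL test fails while the local one works, a plain TCP test against the tenant's SQL port can help separate a blocked port from a database-side problem. The sketch below derives the default tenant port range given in the note above from the instance number and probes a single port with nc. The function names are hypothetical, and it assumes a netcat that supports the -z and -w options.

```shell
# Print the default tenant SQL port range for a two-digit instance number.
tenant_port_range() {
    nr=$1
    case $nr in
        [0-9][0-9]) ;;
        *) echo "instance number must have two digits" >&2; return 1 ;;
    esac
    echo "3${nr}40-3${nr}99"
}

# Test whether a TCP connection to host $1, port $2 can be opened.
port_open() {
    # -z: only check that the port accepts connections, send no data
    # -w 3: give up after 3 seconds
    nc -z -w 3 "$1" "$2"
}
```

For instance number 00, tenant_port_range 00 prints 30040-30099. A port from that range can then be tested from the application server with port_open <SAP HANA host> <port>.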
If the issue is due to a hardware or a software failure, it is important to save log files on the Linux operating system or at the storage system level for later analysis.
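One simple way to preserve log files before they are rotated or overwritten is to pack them into a time-stamped archive. This is a sketch under assumptions: the function name save_logs and the archive name are hypothetical, and the destination should be on a file system that is not affected by the failure.

```shell
# Pack the given files or directories into a time-stamped tar archive.
#   $1 = destination directory, remaining arguments = paths to save
save_logs() {
    dest=$1
    shift
    stamp=$(date +%Y%m%d_%H%M%S)
    archive="$dest/system_down_$stamp.tar.gz"
    # tar preserves file names and modification times for later analysis
    tar -czf "$archive" "$@" 2>/dev/null || return 1
    echo "$archive"
}
```

As <sid>adm user, the SAP HANA trace directory (typically /usr/sap/<SID>/HDB<instance number>/<host>/trace) would be a natural candidate to pass to this function.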
For further specific steps and guidance on proactive or reactive actions you can take, see SAP Note 1999020 - SAP HANA: Troubleshooting when the database is no longer reachable.