Analyzing a Suddenly Slow System

Objective

After completing this lesson, you will be able to analyze a suddenly slow system.

Analyze a Suddenly Slow System

There can are various reasons why an SAP HANA system suddenly becomes very slow. This lesson tackles issues where the database performance unexpectedly become very slow.

The following issues can cause a system to suddenly become slow (that is, from the end-user perspective the SAP HANA system seems to be performing slowly):

  • Hardware failures at the server level (read errors on bad memory)
  • Hardware failures at the storage level (read errors on disk)
  • Hardware failures at the network level (package collisions on switches or a router)
  • Software errors at the OS level (OS command on Linux is using 100% CPU/memory swapping on disk)
  • Software errors at the storage system level (software errors on SAN/NAS)
  • Software errors at the database level (long running queries on SAP HANA or too many sessions open)
Screenshot showing two screens about how to collect OS Logs, highlighting the Collect Data option on the left screen, and Automatic Bug Reporting Tool on the right screen.

If the slow system is caused by something at the OS level, it is important to save log files on the Linux OS or at storage system level for later analysis.

Usually, in a slow system scenario, the system can still be accessed through SQL, but it takes longer to investigate. The normal SQL access should be used to further analyze the possible root cause.

What Can and Should You Do?

As SQL statements can still be executed for analysis, you must perform the following steps using the SAP HANA Cockpit 2.0:

  • Look at the Overall Database Status card to see if all the services are running correctly.
  • Look at the Memory Usage card to see if the SAP HANA database has enough memory available.
  • Look at the CPU Usage card to see who is using the CPU resources.
  • Look at the Disk Usage card to see who is using the disk resources.
  • Look at the Thread card to see which threads are using the resources.
  • Look at the Sessions card to see which sessions are running.
  • Look at the Monitor Statements card to see which SQL statements are running on the system.

If the slow system is caused by something in the SAP HANA database, is it important to investigate the SQL query and save log and trace files at the SAP HANA database level for further analysis, before terminating sessions or threads.

Database Overview Application

Details also include the following:

  • Information such as operational status, system usage type, whether the system has multiple hosts, the number of hosts (if distributed), and database version
  • The SAP HANA version history
  • Information about the plug-ins that are installed
  • The status of replication from your productive system to a secondary system

    This information is only available and applicable if you are operating a secondary instance of your database (for example, in a high availability (HA) scenario). If this is the case, content from the primary or productive instance of your database is replicated to the secondary instance.

To launch the overview, drill down to the name of the database from the Database Directory or from a group. Unless your administrator has enabled single sign-on, you must connect to the database with a database user that has the system privilege CATALOG READ and choose _SYS_STATISTICS.

Overall Database Status Card

To identify problems early and avoid disruptions, you must monitor your SAP HANA database continuously.

You can monitor the overall status and resource usage of the SAP HANA database at a glance on the homepage of the SAP HANA cockpit. Then, drill down for more detailed monitoring and analysis.

The overall database status can be running, running with issues, or stopped. Choosing this status brings you to Manage Services, where you can stop or kill a service, and start or stop a system. Also, in the Overall Database Status, you receive alerts related to services, and, in order of priority, you see:

  • The number of services not running
  • The number of services running with issues (if there are no stopped services)
  • The number of services running

Manage Services

Manage Services provides you with detailed information about database services for an individual database.

Note

Not all the columns listed in the following figure are visible by default. You can add and remove columns in the table personalization dialog, which you open by choosing the personalization icon in the table toolbar.
Screenshot of the Manage Services page, as described in the preceding and following text.

The following list gives an overview of the information available on each service:

  • Host: The name of the host on which the service is running
  • Status: The status of the service. The following statuses are possible:
    • Running
    • Running with Issues
    • Stopped
    • Not Running

    To investigate why the service is not running, you can navigate to the crashdump file, created when the service stopped.

    Note

    The crashdump file opens in the Trace tool of the SAP HANA Web-based Development Workbench. For this, you need the role sap.hana.xs.ide.roles::TraceViewer or the parent role sap.hana.xs.ide.roles::Developer.
  • Service: The service name, for example, indexserver, nameserver, xsengine, and so on
  • Role: The role of the service in a failover situation. Automatic failover happens when the service or the host on which the service is running fails. The following values are possible:
    • Master: The service is the active master worker.
    • No entry: The service is a slave worker.
    • Standby: The service is in standby mode. It does not contain any data and does not receive any requests.
  • Port: The port that the system uses for internal communication between services
  • Start Time: The time at which the service started. The time is given in the timezone of the SAP HANA server.
  • CPU: A mini chart visualizing the CPU usage of the service. To open the Performance Monitor for a more detailed breakdown of CPU usage, choose the mini chart.
  • Memory: A mini chart visualizes the memory usage of the service as follows:
    • Dark green shows the service's used memory.
    • Light green shows the service's peak memory.
    • The gray stroke represents the effective allocation limit.
    • The light gray background represents the physical memory.

    To open Memory Analysis for a more detailed breakdown of memory usage as follows, choose the mini chart:

    • Used Memory (MB): The amount of memory currently used by the service. Choosing the mini chart opens the Memory Analysis app for a more detailed breakdown of memory usage.
    • Peak Memory (MB): The highest amount of memory ever used by the service
    • Effective Allocation Limit (MB): The effective maximum memory pool size that is available to the process considering the current memory pool sizes of other processes
    • Memory Physical on Host (MB): The total memory available on the host
    • All Process Memory on Host (MB): The total used physical memory and swap memory on the host
    • Allocated Heap Memory (MB): The heap part of the allocated memory pool
    • Allocated Shared Memory (MB): The shared memory part of the allocated memory pool
    • Allocation Limit (MB): The maximum size of the allocated memory pool
    • CPU Process (%): The CPU usage of the process
    • CPU Host (%): The CPU usage on the host
    • Memory Virtual on Host (MB): The virtual memory of the host
    • Process Physical Memory (MB): The process physical memory used
    • Process Virtual Memory (MB): The process virtual memory
    • Shrinkable Size of Caches (MB): The memory that can be freed in the event of a memory shortage
    • Size of Caches (MB): The part of the allocated memory pool that can potentially be freed in the event of a memory shortage
    • Size of Shared Libraries (MB): The code size (including shared libraries)
    • Size of Thread Stacks (MB): The size of the service thread call stacks
    • Used Heap Memory (MB): The amount of the process heap memory used
    • Used Shared Memory (MB): The amount of the process shared memory used
    • SQL Port: The SQL port number
    • Process ID: The process ID

Operations on Services

As an administrator, you may need to perform certain operations on all, or selected services - for example, start missing services, or stop or kill a service.

You can perform several operations on database services from Manage Services. You can trigger these operations by selecting the service, and choosing the required option in the footer toolbar.

Choose Start Missing Services to start inactive services. This can only be performed on a tenant database if you drill down to Manage Services, through the system database.

Choose Stop Service to stop the selected service normally. The service is then typically restarted.

Choose Kill Service to stop the selected service immediately and, if the related option is selected, create a crashdump file. The service is then typically restarted.

Choose Add Service to add the service you selected from the list. This can only be performed on a tenant database if you drill down to Manage Services, through the system database. Services cannot be added to the system database itself. To add a service, you must have the EXECUTE privilege on the stored procedure SYS. UPDATE_LANDSCAPE_CONFIGURATION.

Choose Remove Service to remove the selected service. This can only be performed on a tenant database if you drill down to Manage Services, through the system database.

You can only remove services that have their own persistence. If data is still stored in the service's persistence, it is re-distributed to other services.

You cannot remove the following services:

  • Name server
  • Master index server
  • Primary index server on a host

To remove a service, you must have the EXECUTE privilege on the stored procedure SYS. UPDATE_LANDSCAPE_CONFIGURATION.

Choose Reset Memory Statistics to reset all memory statistics for all services. This can only be performed on a tenant database if you drill down to Manage Services application, through the system database.

Peak used memory is the highest recorded value for used memory since the last time the memory statistics were reset. This value is useful for understanding the behavior of used memory over time and under peak loads. Resetting peak used memory allows you, for example, to establish the impact of a certain workload on memory usage. If you reset peak used memory and run the workload, you can then examine the new peak used memory value.

Choose Go To Alerts to display the alerts for this database.

The SAP HANA database provides several features in support of high availability, one of which is service auto-restart. In the event of a failure, or an intentional intervention by an administrator that disables one of the SAP HANA services, the service auto-restart function automatically detects the failure and restarts the stopped service process.

Memory Statistics per Service

Analyzing the memory allocation of the SAP HANA database can help you to investigate such situations as out-of-memory incidents, memory corruptions, and memory leaks.

Screenshot of the Memory Analysis app, as described in the preceding and following text.

The Memory Analysis app enables you to visualize and explore the memory allocation of every service of a selected host during a specified time period. If you notice an increase in overall memory usage, you can investigate whether it is due to a particular component, subcomponent, or table.

The upper chart provides the following data:

  • Global Allocation Limit: This is the global_allocation_limit for the host (as set in the global.ini configuration file).
  • Allocated Memory: This is the pool of memory pre-allocated by the host for storing in-memory table data, thread stacks, temporary results, and other system data structures.
  • Total Used Memory: This is the total amount of memory used by SAP HANA, including program code and stack, all data and system tables, and the memory required for temporary computations.

Move the vertical selection bar in the upper chart to populate the data in the lower chart. The vertical selection bar snaps to the closest time for which there is collected data for the components. When you select the Components tab, the lower chart displays the Used Memory by Component.

The Components tab provides the following detailed information:

  • Used Memory by Component: For the specific time (chosen by the vertical selection bar in the upper chart), the components of the selected service are listed in descending order of the amount of used memory.
  • Used Memory by Type: This donut-chart displays a visual representation of the types of used memory for the specified time.
  • Components Used Memory History: If you select the checkbox of one or more components, the used memory history chart is populated.

The Subcomponents tab displays more detailed memory use. You can filter by component type. You can move through the collected data points by using the arrow buttons. The following information is displayed:

  • Used Memory by Subcomponent: Subcomponents of the selected component are listed in descending order of used inclusive memory for the specific time (chosen by the vertical selection bar in the upper chart). By choosing a subcomponent, you can expand the list.
  • Filter by Component Name: To further refine the displayed subcomponent data, select the filter icon to specify one or more component names.
  • Subcomponents Used Memory History: Selecting the checkbox of one or more subcomponents populates the used memory history chart.

The Tables tab shows detailed statistics on the memory used by data tables. The Tables tab shows the following information:

  • Top Ten Tables by Size: This displays the breakdown of memory usage of the 10 highest consuming tables for the specific time (chosen by the vertical selection bar in the upper chart).
  • Top Ten Tables by Growth: This displays the memory usage of the 10 tables with the largest change in consumption for the selected time period. By hovering over the data, you can see the Previous Size memory usage value from the beginning of the time period and the Growth during the time period (where the current size of the table is the sum of Previous Size and Growth).

The following system views provide information from which the current and historical memory allocation is calculated:

  • HOST_RESOURCE_UTILIZATION_STATISTICS
  • HOST_SERVICE_MEMORY
  • HOST_SERVICE_COMPONENT_MEMORY
  • HOST_HEAP_ALLOCATORS
  • GLOBAL_ROWSTORE_TABLES_SIZE_BASE
  • HOST_COLUMN_TABLES_PART_SIZE

All views are in the _SYS_STATISTICS schema. For more information about these views, see the SAP HANA SQL and System Views Reference Guide.

Usage and Performance Metrics

You can monitor key database metrics through the Memory Usage, CPU Usage, and Disk Usage cards, as well as the Threads, Sessions, and Monitor Statements cards.

In a multi-host system, each host is represented by a selectable bar, with the selected host displaying a time graph to the right of the bar chart. Hover over the bars to see details for the selected host.

If a bar is highlighted, there is an associated high (red) or medium (yellow) alert. With single-host databases, as there is only one host, no bar graphs are displayed. By viewing this high-level information, you can decide whether to drill down to the Performance Monitor.

Performance Monitor

Analyzing the performance of the SAP HANA database over time can help you to pinpoint bottlenecks, identify patterns, and forecast requirements.

Use the Performance Monitor to visually analyze historical performance data across a range of key performance indicators related to memory, disk, and CPU usage.

Open the Performance Monitor by choosing the chart or the Show All link on the Memory Usage, CPU Usage, or Disk Usage card on the homepage of the SAP HANA cockpit.

Screenshot of three pages of the Performance Monitor, highlighting Memory Usage, CPU Usage, and Disk Usage analytics, as described in the preceding and following text.

The Performance Monitor can be reached through the Memory Usage, CPU Usage, and Disk Usage cards. All three cards point to the same Performance Monitor, but the displayed data (memory, CPU, or disk) depends on the selected database. The general working of the Performance Monitor is the same for all three of the cards.

The Performance Monitor opens displaying the load graph for the selected database (memory, CPU, or disk). The load graph initially visualizes resource usage of all hosts and services listed on the left, according to the default key performance indicator (KPI) group of the selected database.

You can customize the information displayed on the load graph in several ways, for example:

  • Define the monitored time frame.
  • Use the Add Chart button to create custom charts displaying the host and services selection, as well as selected KPIs. For a list of all available KPIs, see Key Performance Indicators.
  • Update the displayed data by the selected refresh rate.
  • Zoom into a specific time by changing the duration.
  • In the Settings menu, customize your graphs by including hosts and services as well as additional KPIs in the Charts tab. In the Alerts tab, configure alerts per category and priority status.

Monitoring and Analyzing Sessions

Analyzing the sessions connected to your SAP HANA database can help you identify which applications or which users are currently connected to your system, as well as what they are doing in terms of SQL execution.

To use the Sessions card to monitor all sessions in your landscape, choose the mini chart.

Screenshot of the Sessions card page, as described in the preceding and following text.

The Sessions card displays the number of active and total sessions.

Open the Sessions card. The Sessions page allows you to monitor all sessions in the current landscape. You can see the following information:

  • Active/inactive sessions and their relation to applications
  • Whether a session is blocked and, if so, which session is blocking it
  • The number of transactions that are blocked by a blocking session
  • Statistics such as average query runtime and the number of DML and DDL statements in a session
  • The operator currently being processed by an active session

To support monitoring and analysis, you can perform the following actions from the Sessions page:

  • To cancel a session, choose Cancel Sessions.
  • To save the data sets as a text or HTML file, choose Save As.

Monitoring and Analyzing Threads

Analyzing the threads running in the SAP HANA database can be helpful when analyzing the current system load.

This analysis can help you identify which statements or procedures are being executed and at what stage they are, who else is connected to the system, and if there are any internal processes running also.

Use the Threads card to monitor the longest-running threads active in your system. It may be useful to see, for example, how long a thread has been running, or if a thread is blocked for an inexplicably long time.

Screenshot of the Threads card page, as described in the preceding and following text.

The Threads card provides you with information about the number of currently active and blocked threads in the database.

To open the Threads card, choose either the number of active or blocked threads on the card. The 1,000 longest-running threads currently active in the database are listed. By default, threads are listed in order of longest runtime. For each statement, you can see the duration, as well as the name of the service that is executing the thread. You can identify the host, the port, and the thread type, and whether the statement is related to a blocking transaction.

If a thread is involved in a blocked transaction, or is using an excessive amount of memory, cancel the operation executing the thread by choosing Cancel Operations in the footer toolbar.

Thread Details

The Threads card provides you with detailed information about the 1,000 longest-running threads currently active in the database.

Note

Not all of the columns listed in the following table are visible by default. You can add and remove columns in the table personalization dialog, which you open by choosing the personalization icon in the table toolbar.

The following table lists the information available for threads.

DetailDescription
Blocking TransactionBlocking transaction
Duration (ms)Duration (ms)
HostHost name
PortInternal port
ServiceService name
HierarchyThread grouping information. Filled with Connection ID/Update Transaction ID/Transaction ID or left empty for inactive threads
Connection IDConnection ID
Thread IDThread ID
CallingThe thread or service which the thread calls
CallerThe thread or service which called this thread
Thread TypeThread Type
Thread MethodThread method
Thread DetailThread detail
UserUser name
Application UserApplication user name
CPU TimeCPU time of thread
Cumulative CPU TimeCPU time of thread and associated children
Transaction IDTransaction ID
Update Transaction IDUpdate Transaction ID
Thread StatusThread State
Connection Transaction IDTransaction object ID
Connection Start TimeConnected Time
Connection Idle Time (ms)Time that the connection is unused and idle
Connection StatusConnection Status: 'RUNNING' or 'IDLE'
Client HostHost name of client machine
Client IPIP of client machine
Client PIDClient Process ID
Connection TypeConnection type: Remote, Local, History (remote), History (local)
Own ConnectionOwn connection: TRUE if own connection, FALSE if not
Memory Size per ConnectionAllocated memory size per connection
Auto CommitCommit mode of the current transaction: TRUE if the current connection is in auto-commit mode, FALSE otherwise
Last ActionThe last action done by the current connection: ExecuteGroup, CommitTrans, AbortTrans, PrepareStatement, CloseStatement, ExecutePrepared, ExecuteStatement, FetchCursor, CloseCursor, LobGetPiece, LogPutPiece, LobFind, Authenticate, Connect, Disconnect, ExecQidItab, CursorFetchItab, InsertIncompleteItab, AbapStream, TxStartXA, TxJoinXA
Current Statement IDCurrent statement ID
Current Operator NameCurrent operator name
Fetched Record CountSum of the record count fetched by select statements
Sent Message Size (Bytes)Total size of messages sent by the current connection
Sent Message CountTotal message count sent by the current connection
Received Message Size (Byte)Total size of messages/transactions received by the current connection
Received Message CountTotal message/transaction count received by the current connection
Creator Thread IDThread ID who created the current connection
Created ByEngine component that created the connections: Session, Planning, Repository, CalcEngine, Authentication, Table Exporter, Loader, LLVM, JSVM, IMS Search API, OLAP Engine, Mergedog, Ping Status, Name Server, Queue Server, SQL Stored Procedure, Authorization, TrexViaDbsl from ABAP, HybridTable Reorganizer, Session external
Is EncryptedEncrypted: TRUE if the secure communication is enabled (SSL enabled), FALSE, otherwise
Connection End TimeThe time when the connection is closed for history connections
Blocked Update Transaction IDWrite transaction ID of the write transaction waiting for the lock
Blocking Transaction IDTransaction object ID of the transaction holding the lock
Thread ID of Lock OwnerConnection ID associated with the blocked write transaction
Blocking Update Transaction IDWrite transaction ID of the write transaction holding the lock
Transactional Lock TypeTransactional lock type
Transactional Lock ModeTransactional lock mode
Lock Wait ComponentWaiting for lock component
Lock Wait NameWaiting for lock ID
Timestamp of Blocked TransactionTimestamp of the blocked transaction
Waiting Record IDID of the record on which the lock is currently placed
Waiting Object NameName of the object on which the lock is currently placed
Waiting Object TypeType of the object on which the lock is currently placed
Waiting Schema NameName of the schema on which the lock is currently placed

Monitoring and Analyzing with the Top SQL Statements

Analyzing the current most critical statements running in the SAP HANA database can help you identify the root cause of poor performance, CPU bottlenecks, or out-of-memory scenarios. Enabling memory tracking allows you to monitor the amount of memory used by single statement executions.

Use the Top SQL Statements card to analyze the current most critical statements running in the database.

Screenshot of the SQL Statements card page, as described in the preceding and following text.

The Top SQL Statements card displays the number of long-running statements and the long-running blocking situations currently active in the database. Statements are ranked based on a combination of the following criteria:

  • The runtime of the current statement execution.

  • The lock wait time of the current statement execution.

  • The cursor duration of the current statement execution.

Open the Top SQL Statements card to list the 100 most critical statements currently active in the database. By default, statements are listed in order of the longest runtime. For each statement, you can see the full statement string, as well as the ID of the session in which the statement is running. You can identify the application, the application user, and the database user running the statement and whether the statement is related to a blocking transaction.

Optionally, you can activate monitoring of the memory consumption of statements by choosing Enable Memory Tracking in the footer toolbar. Detailed information about the memory consumption of statement execution is collected and displayed.

If a statement is involved in a blocked transaction or using an excessive amount of memory, cancel the session that the statement is running in (or the blocking session) by choosing Cancel Session in the footer toolbar.

Log in to track your progress & complete quizzes