Analyzing SAP HANA Hierarchies

Objective

After completing this lesson, you will be able to analyze SAP HANA hierarchies

SAP HANA Hierarchies

Hierarchies Introduction

Diagram for hierarchies showing the parent-child hierarchy at the top and SQL connected to SAP HANA Platform bidirectionally. Hierarchy Engine, consisting of Generation and Navigation boxes, is present within the SAP HANA Platform.
SAP HANA Hierarchies
  • Simple SQL interface
  • Separates generation from navigation
  • Parent-child hierarchies
  • Descendants, ancestors, siblings
  • Caching for performance optimization
  • Tightly integrated in SAP HANA operations (import/export, security, backup etc.)
Benefits
  • Easy to use for developers
  • Navigate hierarchies based on real-time data

Hierarchy Terminology

Illustration showing the hierarchy terminology.

Let’s work on the terminology used in SQL hierarchies:

  • Hierarchy: A hierarchy contains a set of standard attributes with fixed types. The topology of the generated hierarchy is strictly a tree or forest where each result row corresponds to a single graph node and edge. The computed index enables the navigation functions to traverse the hierarchy tree efficiently. A hierarchy is the result of the hierarchy generator function.

  • Node: The basic element of a hierarchy. Each node is defined as a row of a hierarchy generator function result. The primary identifier of a node is its preorder rank provided by the HIERARCHY_RANK attribute.

    Edge: The basic relationship between two nodes in a hierarchy. Each node has either 0 (root) or 1 incoming edge. In addition, a node may have 0 (leaf) or more (branch) outgoing edges. Since a hierarchy is always a strict tree, edges are not independent entities like in generic graphs.

  • Parent: A node reached by an incoming edge is called a parent node of that node. A node has either one or no parent node.

  • Child: A node reached by an outgoing edge is called a child node of that node

  • Root: Root nodes are all nodes that do not have a parent node.

  • Orphan: Orphaned nodes are all nodes of a hierarchy that are not reachable from a set of user defined start nodes. In a recursive parent-child hierarchy, two sorts of orphans exist. On the one hand, there are records that do not match the START condition and does not have a parent. On the other hand, there are orphaned islands made up of nodes, which build a cycle that is not connected to any root.

Types of Hierarchy

The type of hierarchy that you will create depend on the structure of the source data.

  • Parent-Child hierarchy requires separate columns for parents and children.

    Parent and Child usually are of the same data type.

  • Level hierarchy requires each node in a separate column.

    Possibly different data types can be combined in a hierarchy.

Parent Child Hierarchy Source Data

ParentChild
DACHGermany
GermanyBayern
GermanyHamburg
EMEAIreland
IrelandKerry

Level Hierarchy Source Data

RegionChildState
DACHGermanyBayern
DACHGermanyHamburg
EMEAIrelandKerry

Note

The hierarchies can be time-dependent, where the structure will contain additional columns VALID_FROM_DATE and VALID_TO_DATE with unique couple of values.

The most common types are parent-child hierarchies and level hierarchies.

In parent-child hierarchies, each record references a parent record, thus defining the hierarchical structure. This concept could be compared to an organization, where each employee record references another person as manager.

In leveled hierarchies, each record contains complete individual path information through the hierarchy. A common example is address data, where each record typically consists of country, state, city, street, and street number data items, which may also be interpreted as a geographical hierarchy.

Hierarchy Functions

Three types of hierarchy functions:

  • Hierarchy generator functions: Generate hierarchies using source table data as input
  • Hierarchy navigation functions: Returns specific set of nodes, or node aggregations
  • Hierarchy scalar functions: Concatenates multicolumn tuple-like node identifiers into single scalar values

Hierarchy functions are an integral part of SAP HANA core functionality. Hierarchies expand the SAP HANA platform with native support for hierarchy attribute calculation and navigation, and allow the execution of typical hierarchy operations directly on relational data, seamlessly integrated into the SQL query.

Public SQL Interface

Public SQL interface for hierarchies:

  • Parent-child hierarchy generation
  • Navigation exposed as table functions
  • Massive query acceleration via optional caching
Code Snippet
12345678910
WITH "HIER" AS ( SELECT * FROM HIERARCHY (SOURCE (SELECT "NODE_ID", "PARENT_ID","NAME" FROM "T_GEO" ) ORDER BY "HIERARCHY_RANK") ) SELECT * FROM HIERARCHY_DESCENDANTS (SOURCE "HIER" START WHERE "NODE_ID" = 'OC');
Example of a parent child hierarchy at the top and table at the bottom.

Hierarchy functions explicitly target the multitude of hierarchical data that already exist in real-world scenarios, company reporting or department structures, hierarchy roles or even plain address data. Hierarchy functions enable users to efficiently query this data in a consistent and structured way, even though the layout of the source data can be very diverse.

Generator Functions

Generator functions for HIERARCHY Generator, HIERARCHY_LEVELED Generator, HIERARCHY_SPANTREE Generator, and HIERARCHY_TEMPORAL Generator.

Hierarchy Generator

Screenshots showing the source table, code block for the hierarchy function, and output (view / persisted).

Hierarchy Leveled Generator

Screenshots showing the source table for the HIERARCHY_LEVELED Generator with the output (view / persisted).

Specialized hierarchy generator functions translate the diverse relational source data into a generic and normalized tabular format that, for the sake of brevity, is termed as hierarchy. The structure of such a hierarchy is always the same, regardless of the original format of the source data, and consists of an ordered list of its nodes. The nodes themselves are represented by a minimal set of orthogonal hierarchy topology attributes plus a projection of the original source attributes.

Many basic properties of the hierarchy and its elements can be determined by directly querying this tabular hierarchy representation using the usual SQL filtering and expression facilities.

More complex calculations on the hierarchy are termed navigations. This involves a start set of nodes and potentially multiple steps beyond their immediate neighborhood. SAP HANA provides built-in navigation functions that use optimized algorithms on the hierarchy attributes to efficiently compute ancestors, descendants, or siblings of nodes.

Using Temporary Tables with Hierarchies

Two code blocks, first for materialized temporary table from function and second for the view from function.

Very often, multiple queries might be executed using the same hierarchy generator function result set. Therefore, it is recommended to either materialize the hierarchy generator function output into a temporary table or define a view over it.

Materializing the hierarchy generator function output into a temporary table guarantees consistent and stable navigation results for the entire lifecycle of the temporary table. It also eliminates the hierarchy generation cost for subsequent queries. At the same time, materializing a hierarchy into a temporary or persistent table cuts all connections between the hierarchy and its sources.

If a query should return the most current state of the data, it is recommended to rather create a view over a hierarchy generator function, which guarantees that hierarchy navigation results always correctly reflect the current transactional view of the source data. If a hierarchy source is fully deterministic (that means that multiple query executions over the same data set always return the same result), it is allowed to cache the hierarchy generator function output. Caching avoids the overhead of recalculating the hierarchy if the sources do not change between subsequent navigation and at the same time guarantees a correct transaction result.