Advanced Analytics with SAP HANA Cloud

Objectives
After completing this lesson, you will be able to:

  • Describe advanced analytics with SAP HANA Cloud

SAP HANA Cloud Spatial

Many organizations already rely on spatial data processing, but they use specialist applications that run alongside, and separate from, their business process applications. For example, a gas storage tank is leaking, and a coordinator raises an emergency repair order in the SAP ERP system. Next, the coordinator opens a separate geo-based application and manually enters the asset number of the tank. The tank shows up on a map, and the coordinator then uses yet another application to identify the nearest available, qualified engineer, who is then dispatched. This single business process requires multiple, unconnected applications.

Missed Opportunities using Different Applications

Beyond business applications, there are further use cases for spatial analysis in sports. SAP has developed a series of applications that provide deep analysis of player performance. For example, in golf, adding a sensor to the ball and the pin makes it possible to create a graphical history that illustrates improvements in shot accuracy. These types of applications are already in use by major sports organizations around the world.

There are many applications that could be dramatically enhanced with the integration of spatial data.

SAP HANA Cloud Spatial provides new data types for storing geometrical data such as points, lines, and polygons. These can be used to represent precise locations, roads, and regions. SAP HANA Cloud Spatial uses open standards, so it can easily be integrated with well-known, leading geospatial providers such as ESRI, OGC, OpenStreetMap, and Google Maps.

As well as storing spatial data, SAP HANA Cloud also provides spatial query functions that can easily be included in SQLScript. Here are some examples of the functions:

  • Within — which customers are in my region?

  • Distance — what is the longest distance a high-value customer has to travel to reach my sales outlet?

  • Crosses — where does truck route A cross truck route B?
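
To make the meaning of these three predicates concrete, the following is a minimal sketch in plain Python. It is not the SAP HANA Cloud API: the function names `within`, `distance`, and `crosses` are hypothetical, and the sketch assumes flat (planar) coordinates rather than real geographic ones.

```python
import math

def distance(p, q):
    """Euclidean distance between two (x, y) points on a flat plane."""
    return math.hypot(q[0] - p[0], q[1] - p[1])

def within(point, polygon):
    """Ray-casting test: True if the point lies inside the polygon.

    The polygon is a list of (x, y) vertices; points exactly on the
    boundary are not handled specially in this sketch.
    """
    x, y = point
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Only edges that straddle the horizontal line through the point count.
        if (y1 > y) != (y2 > y):
            # x-coordinate where the edge meets that horizontal line
            x_at_y = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_at_y:
                inside = not inside
    return inside

def _orientation(a, b, c):
    """Signed area test: > 0 if c is left of the line a->b, < 0 if right."""
    return (b[0] - a[0]) * (c[1] - a[1]) - (b[1] - a[1]) * (c[0] - a[0])

def crosses(segment_a, segment_b):
    """True if two straight segments cross at a single interior point."""
    a1, a2 = segment_a
    b1, b2 = segment_b
    return (_orientation(a1, a2, b1) * _orientation(a1, a2, b2) < 0
            and _orientation(b1, b2, a1) * _orientation(b1, b2, a2) < 0)

# Hypothetical sample data: a square sales region and two truck routes.
region = [(0, 0), (10, 0), (10, 10), (0, 10)]
route_a = ((0, 0), (10, 10))
route_b = ((0, 10), (10, 0))

print(within((5, 5), region))      # True: the customer is in the region
print(distance((0, 0), (3, 4)))    # 5.0
print(crosses(route_a, route_b))   # True: the routes intersect
```

In the database, the same questions are answered by spatial functions inside a SQL query, so the data never has to leave SAP HANA Cloud.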

SAP HANA Spatial also provides algorithms that can determine clusters. This helps an organization identify specific locations that might be lucrative, based on income data and other attributes associated with consumers.
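
One simple way to picture spatial clustering is to lay a grid over the map and aggregate the consumers that fall into each cell. The sketch below illustrates only that idea in plain Python. It does not show which clustering algorithms SAP HANA provides or how they are called, and the function name `grid_clusters` and the sample data are hypothetical.

```python
from collections import defaultdict

def grid_clusters(customers, cell_size):
    """Group (x, y, income) records into square grid cells.

    Returns a dict that maps each cell index (column, row) to the number
    of customers in the cell and their average income.
    """
    incomes_by_cell = defaultdict(list)
    for x, y, income in customers:
        cell = (int(x // cell_size), int(y // cell_size))
        incomes_by_cell[cell].append(income)
    return {
        cell: {"count": len(incomes), "avg_income": sum(incomes) / len(incomes)}
        for cell, incomes in incomes_by_cell.items()
    }

# Hypothetical customers: (x, y, annual income)
customers = [(1, 1, 40000), (2, 3, 60000), (12, 1, 90000)]
for cell, stats in grid_clusters(customers, cell_size=10).items():
    print(cell, stats)
```

Cells with many customers and a high average income would be candidate locations for closer analysis.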

Predictive Analysis

Watch this video to learn about predictive analysis approaches.

SAP HANA Predictive Analysis Library (PAL)

The Predictive Analysis Library (PAL) contains a large number of algorithms that can be used to develop predictive and machine learning models. Some of these algorithms are used for data mining pre-processing tasks such as:

  • Sampling — select a few records from large data sets (for example, we need 1000 people from each country).

  • Binning — grouping records into basic categories (for example, age ranges).

  • Partitioning — creating sets of data for training, testing, and validation used to train models and check their predictive accuracy.
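
The three pre-processing tasks above can be sketched in plain Python as follows. This only illustrates the concepts and is not how PAL is called. The function names, the 10-year bin width, and the 60/20/20 split are assumptions chosen for the example.

```python
import random

def sample_per_group(records, key, n, seed=0):
    """Sampling: pick up to n random records from each group (e.g. country)."""
    rng = random.Random(seed)
    groups = {}
    for record in records:
        groups.setdefault(record[key], []).append(record)
    sampled = []
    for members in groups.values():
        sampled.extend(rng.sample(members, min(n, len(members))))
    return sampled

def bin_age(age, width=10):
    """Binning: map an age to a range label such as '30-39'."""
    lower = (age // width) * width
    return f"{lower}-{lower + width - 1}"

def partition(records, train=0.6, test=0.2, seed=0):
    """Partitioning: shuffle, then split into training, testing, and validation sets."""
    rng = random.Random(seed)
    shuffled = list(records)
    rng.shuffle(shuffled)
    n_train = round(len(shuffled) * train)
    n_test = round(len(shuffled) * test)
    return (shuffled[:n_train],
            shuffled[n_train:n_train + n_test],
            shuffled[n_train + n_test:])  # the remainder is the validation set

# Hypothetical data: 50 people in each of two countries
people = [{"country": c, "age": a} for c in ("DE", "FR") for a in range(20, 70)]

print(len(sample_per_group(people, "country", 5)))  # 10
print(bin_age(34))                                   # 30-39
print([len(part) for part in partition(people)])     # [60, 20, 20]
```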

The majority of the algorithms are used for scoring or predictive modeling. There are many algorithms provided for all major data mining categories including:

  • Association

  • Classification

  • Clustering

  • Regression

  • Time Series

  • Neural Networks

PAL algorithms can be called directly from procedures in SQLScript, or they can be integrated into an SAP HANA flowgraph, which is built using a graphical editor in the Web IDE. A flowgraph defines the data inputs, data processing, outputs, and parameters used in the predictive model.

Using PAL requires knowledge of statistical methods and data mining techniques, because the developer must choose which algorithm to use. The developer must therefore understand the differences between the algorithms used for data preparation, scoring, and prediction, and must also know how to fine-tune the algorithms to reach the desired outcome. For example, a developer would need to decide when to use the time series algorithm for double exponential smoothing versus triple exponential smoothing, and then how to adjust the parameters to consider trend or to eliminate outliers. Developers who work with SAP HANA PAL are typically already working in predictive analysis projects or have a reasonable understanding of the topic.
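
To show the kind of parameters a developer has to tune, here is a minimal sketch of double exponential smoothing in its common textbook form (Holt's linear method). It is written in plain Python and is not PAL's implementation or interface. The function name and the parameter names `alpha` (level smoothing) and `beta` (trend smoothing) are assumptions for illustration.

```python
def double_exponential_smoothing(series, alpha, beta, horizon=1):
    """Forecast `horizon` future values with Holt's linear method.

    alpha controls how quickly the level follows new observations;
    beta controls how quickly the trend estimate is updated.
    """
    if len(series) < 2:
        raise ValueError("at least two observations are required")
    level = series[0]
    trend = series[1] - series[0]  # simple initial trend estimate
    for value in series[1:]:
        previous_level = level
        level = alpha * value + (1 - alpha) * (previous_level + trend)
        trend = beta * (level - previous_level) + (1 - beta) * trend
    # Extrapolate the last level along the last trend.
    return [level + step * trend for step in range(1, horizon + 1)]

# Hypothetical monthly sales with a steady upward trend
sales = [10, 12, 14, 16, 18]
print(double_exponential_smoothing(sales, alpha=0.5, beta=0.5, horizon=3))
# [20.0, 22.0, 24.0]
```

Double exponential smoothing models a level and a trend. Triple exponential smoothing adds a third, seasonal component, so it is the better fit when the data repeats in a regular cycle.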

Automated Predictive Library (APL)

The Automated Predictive Library (APL) is aimed at developers who do not have, or do not want to acquire, detailed knowledge of the algorithms and the maths behind the models. The selection of algorithms for data preparation, scoring, and prediction is completely automated (hence the name). All the developer has to do is provide the data, and APL analyzes that data to identify the best data preparation (cleaning up the data) and predictive models.

Graph Modeling

Graphs are used to model data that is best represented using a network. Examples include supply chain networks, transportation networks, utility networks, and social networks. The basic idea behind graph modeling is that it allows a modeler to define a series of entities (nodes) and link them with lines (edges) to create a network. This network represents how each node relates to all other nodes.

Graph models can also indicate the direction of flow between entities, and any number of attributes can be added to the nodes or to the edges that connect them. These attributes add meaning to the network, and queries can then be executed to answer questions about it.

Imagine a complex supply chain mapped using a graph, where all manufacturers, suppliers, distributors, customers, and consumers are represented with information stored along the connections. The benefit of this form of modeling is that it makes it easy to develop applications that can traverse huge graphs at speed. As a result, you can ask questions such as the following:

  • How many hours has the product traveled between two specified points in the network?

  • Where are all the possible points of origin of this product?

  • What is the entire journey of a product, including all of the stop-off points in the path?

  • What is the shortest path between point A and point B?
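
Questions like these map onto standard graph traversals. The sketch below uses plain Python, not SAP HANA Graph, to model a small directed supply chain whose edges carry a travel time in hours. The node names, the `hours` attribute, and both function names are hypothetical, and "points of origin" is interpreted here as every node from which the given node can be reached.

```python
import heapq

def shortest_path(edges, start, goal):
    """Dijkstra's algorithm: return (total_hours, path) or None if unreachable."""
    graph = {}
    for source, target, hours in edges:
        graph.setdefault(source, []).append((target, hours))
    queue = [(0, start, [start])]
    settled = set()
    while queue:
        cost, node, path = heapq.heappop(queue)
        if node == goal:
            return cost, path
        if node in settled:
            continue
        settled.add(node)
        for neighbor, hours in graph.get(node, []):
            heapq.heappush(queue, (cost + hours, neighbor, path + [neighbor]))
    return None

def possible_origins(edges, node):
    """All upstream nodes: follow the edges backwards from the given node."""
    incoming = {}
    for source, target, _ in edges:
        incoming.setdefault(target, []).append(source)
    seen, stack = set(), [node]
    while stack:
        current = stack.pop()
        for previous in incoming.get(current, []):
            if previous not in seen:
                seen.add(previous)
                stack.append(previous)
    return seen

# Hypothetical supply chain: (from, to, travel time in hours)
edges = [
    ("Supplier", "Factory", 5),
    ("Factory", "Warehouse", 8),
    ("Factory", "Distributor", 20),
    ("Warehouse", "Distributor", 6),
    ("Distributor", "Customer", 3),
]

print(shortest_path(edges, "Factory", "Customer"))
# (17, ['Factory', 'Warehouse', 'Distributor', 'Customer'])
print(sorted(possible_origins(edges, "Distributor")))
# ['Factory', 'Supplier', 'Warehouse']
```

The returned path answers the journey and shortest-path questions at once: the list contains every stop-off point, and the first element is the total travel time.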

Graph processing allows you to discover hidden patterns and relationships in your network data, all in real time.

Graph Model: Example

There are many examples of where SAP HANA Graph could be used, including the following:

  • Medical

    Create a network of patients, conditions, treatments, and outcomes for reuse in diagnosis and planning treatments of other patients.

  • Social Network

    Using data provided by popular social media networks, find your customers and their friends, friends of friends, their likes or dislikes to create marketing opportunities.

It is possible to use standard SQL tables with standard data definitions and query code to create and process a similar model. However, it would be extremely complex both to define such a model with plain SQL and to query it, and processing times could be challenging. SAP HANA Graph provides tools for graph definition and an additional language for graph querying, so that model development is more natural and simpler. Processing is also flexible and optimized for in-memory execution, using a dedicated in-memory graph engine right inside the database.
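
To illustrate why a traversal is a natural fit for the "friends of friends" question in the social network example, here is a minimal breadth-first search in plain Python. It is not SAP HANA Graph syntax. The function name `within_hops` and the sample names are hypothetical.

```python
from collections import deque

def within_hops(friendships, person, max_hops):
    """Return {name: hops} for everyone reachable within max_hops friendships."""
    adjacency = {}
    for a, b in friendships:  # friendship is treated as mutual (undirected)
        adjacency.setdefault(a, set()).add(b)
        adjacency.setdefault(b, set()).add(a)
    hops = {person: 0}
    queue = deque([person])
    while queue:
        current = queue.popleft()
        if hops[current] == max_hops:
            continue  # do not expand beyond the hop limit
        for friend in adjacency.get(current, ()):
            if friend not in hops:
                hops[friend] = hops[current] + 1
                queue.append(friend)
    del hops[person]  # the starting person is not their own contact
    return hops

# Hypothetical friendships
friendships = [("Ana", "Ben"), ("Ben", "Cy"), ("Cy", "Dee"), ("Ana", "Eve")]
print(within_hops(friendships, "Ana", max_hops=2))
```

With `max_hops=2`, Ben and Eve are found as friends and Cy as a friend of a friend, while Dee is three hops away and is left out. Expressing the same query over relational tables would need one self-join per hop, which is part of what makes the plain SQL approach complex.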
