Adding a Standalone Data Lake to SAP HANA Cloud

Lesson Overview

In this lesson, you'll learn how to add a standalone data lake instance to SAP HANA Cloud in your SAP BTP account.

Business Case

Watch the following video to learn about the business case for adding a standalone data lake.

Add a Standalone Data Lake

You can add a standalone data lake to SAP HANA Cloud from the SAP HANA Cloud Central overview page. On that page, select the Create Instance button to start the wizard and deploy an SAP HANA Cloud, data lake instance.

On the start screen of the Create Instance wizard, select the SAP HANA Cloud, data lake option. The process that follows is similar to the SAP HANA Cloud deployment: in the first two steps of the wizard, you enter general information such as the instance name and its description, and you specify which IP addresses have access to the data lake.
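To make the idea of restricting access by IP address more concrete, the following minimal sketch shows how an allowlist can be evaluated in general. It assumes the allowed addresses are written in CIDR notation. The ranges and the `is_allowed` helper are hypothetical, and this is not how SAP HANA Cloud implements the check internally.

```python
import ipaddress

# Hypothetical allowlist; these ranges are reserved documentation addresses,
# not values taken from any real SAP HANA Cloud configuration.
ALLOWED_RANGES = ("203.0.113.0/24", "198.51.100.17/32")


def is_allowed(client_ip: str, allowed_ranges=ALLOWED_RANGES) -> bool:
    """Return True if client_ip falls inside any of the allowed CIDR ranges."""
    address = ipaddress.ip_address(client_ip)
    return any(address in ipaddress.ip_network(cidr) for cidr in allowed_ranges)


print(is_allowed("203.0.113.42"))  # inside the /24 range
print(is_allowed("192.0.2.1"))     # not covered by any range
```

A single address is expressed as a /32 range, so the same check covers both individual hosts and whole subnets.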

The Encryption Key Management Service is also available for SAP HANA data lake. This allows you to use the customer-controlled encryption key (CCEK) feature for integration with the SAP Data Custodian Key Management Service (KMS). Follow the links to find more information on how to use the Customer Controlled Key Management Services and SAP Data Custodian.

After specifying the Basics and Connection information, continue with the next steps and enter the SAP HANA Cloud, data lake detailed information.

In the Data Lake Relational Engine - General screen, you can activate the data lake relational engine, set the number of vCPUs for the coordinator and workers, and set the password of the data lake administrator.

In the Data Lake Relational Engine step, you enable the data lake relational engine. If you enable the data lake relational engine, you can use the data lake to ingest, store, and analyze high volumes of data in a disk-based database. If you don't enable it, you only get a file-based data lake with limited capabilities.

In the Size area, you can specify how many vCPUs will be allocated for the Coordinator and the Workers. The number of vCPUs for the Compute instance is calculated from the number of vCPUs for the Coordinator and the Workers. You can also specify how many Workers should be created.
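The lesson does not state the exact formula the wizard uses for the compute size. As a rough sketch, the following assumes the total is the coordinator's vCPUs plus the per-worker vCPUs multiplied by the number of workers. The `ComputeSize` class and the sample values are hypothetical and may not correspond to sizes the wizard actually offers.

```python
from dataclasses import dataclass


@dataclass
class ComputeSize:
    coordinator_vcpus: int  # vCPUs allocated to the coordinator
    worker_vcpus: int       # vCPUs allocated to each worker
    worker_count: int       # number of workers to create

    def total_vcpus(self) -> int:
        # Assumed formula: coordinator + (vCPUs per worker * number of workers).
        return self.coordinator_vcpus + self.worker_vcpus * self.worker_count


size = ComputeSize(coordinator_vcpus=2, worker_vcpus=4, worker_count=2)
print(size.total_vcpus())  # 10
```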

For the data lake relational engine option, you must also provide a strong password for the database administrator (HDLADMIN).

In the Data Lake Relational Engine - Advanced Settings screen, you can specify whether the data lake should be most compatible with SAP HANA or SAP IQ. You also specify which collation rule to use for storing and sorting characters. Finally, you define whether the automatic backup option should be activated.

In the Data Lake Relational Engine - Advanced Settings step, you specify whether the data lake must be maximally compatible with SAP HANA or SAP IQ. Which option you choose depends on the applications you want to connect to the data lake and which of the two they prefer. If you develop your own applications on top of the data lake, the developers' knowledge of SAP HANA or SAP IQ would be the decisive factor.

If you decide to use SAP IQ compatibility, then you can choose some general options that are specific to SAP IQ. The options are:

Collation: A collation describes how to sort and compare characters from a particular character set or encoding. You can choose from more than 30 different collations.
Case Sensitivity: If required, case sensitivity can be switched on or off for collations.
NChar Collation: For the NChar data type, you can also specify the collation to use. You can choose between UTF8BIN and UCA.
Note: The NChar data type stores Unicode character data.
NChar Case Sensitivity: The NChar case sensitivity behavior can be specified as well. The options are ignore, respect, UpperFirst, or LowerFirst.
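To illustrate in general terms what a collation and its case sensitivity setting change, the following sketch sorts the same words in two ways: by raw UTF-8 bytes (loosely in the spirit of a binary collation, which is an assumption about how UTF8BIN behaves) and case-insensitively. It uses Python's built-in sorting only and is not the data lake's collation implementation. The word list is made up for the example.

```python
words = ["banana", "Apple", "cherry", "apple", "Banana"]

# Binary-style ordering: compare the raw UTF-8 bytes, so every uppercase
# ASCII letter sorts before every lowercase one.
binary_order = sorted(words, key=lambda w: w.encode("utf-8"))

# Case-insensitive ordering: compare case-folded strings; words that differ
# only in case keep their original relative order (the sort is stable).
case_insensitive_order = sorted(words, key=str.casefold)

print(binary_order)            # ['Apple', 'Banana', 'apple', 'banana', 'cherry']
print(case_insensitive_order)  # ['Apple', 'apple', 'banana', 'Banana', 'cherry']
```

The same data can therefore come back in a different order, and compare as equal or unequal, depending on the collation and case sensitivity chosen when the instance is created.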

By default, the provisioning wizard configures and schedules data lake backups. You can manually disable this feature if no backups are required.

After providing all the required input, review your selections and then start the SAP HANA Cloud, data lake deployment. The SAP HANA Cloud, data lake instance is then created and started.

In SAP HANA Cloud Central, you can see the difference between an SAP HANA Cloud, data lake (standalone data lake) and an integrated data lake attached to an SAP HANA Cloud, SAP HANA database.

The standalone data lake is represented under its own header with the data lake instance name that you specified.

The integrated data lake is represented under the header of the SAP HANA database it’s associated with.
