Enhance Data Product in SAP Databricks
Objective
Contents
- Persona
- Log On to SAP Databricks
- SAP Databricks Enhancing Data Products
- Usage of sap-bdc-connect-sdk
- Verify Published Data Product
Persona

In this lesson, we will work with a data product shared from SAP Business Data Cloud into SAP Databricks and create an enhanced version of it. We will then share the new data product with SAP Datasphere using the SAP Business Data Cloud SDK.
Prerequisites
- You are logged on to SAP Databricks.
Log On to SAP Databricks
-
Open a Chrome browser or Microsoft Edge browser and enter the SAP Databricks URL.
Alternatively, click here.
-
Provide Email and select Continue.
- Email: @sapexperienceacademy.com
- Password:

-
Select Continue with SSO.

-
Select default workspace to continue.

-
The SAP Databricks welcome page will open.

SAP Databricks Enhancing Data Products
-
Select Catalog in the navigation menu on the left.

-
Under Delta Shares Received, expand companycode_share -> companycode and select the companycode table.
This data product was shared from SAP Business Data Cloud to SAP Databricks and is available for consumption.

-
Switch to the Sample Data tab. If a compute resource isn’t started, choose Select compute.

-
Select Serverless, then Start and Close.

-
Once the compute starts, the sample data should appear.
Here we are remotely accessing the data shared from SAP Business Data Cloud.

-
Under the My organization catalog section, expand company_code_data_product -> company_code and select the company_code_clusters table.
This table has been populated with company code clusters. It’s the enhanced dataset we will share back to SAP Datasphere after reviewing how we created the table.

-
Switch to the Sample Data tab to view data.
This table was created and populated with company code clusters using a Python notebook. We won’t rerun the notebook as part of this lesson, but we will review how the data has been enhanced.

-
Select Workspace in the left navigation menu.

-
Expand the Workspace folder and select the Project_Artifacts folder.

-
You will see the Company_Clustering notebook used to create and populate the company_code_clusters table. Select the file to open it.

-
Read through the explanations and the code in the notebook, which walk through step by step how the company code clusters were created.
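The notebook itself isn’t reproduced in this lesson, but the general technique behind company code clustering can be sketched with a tiny k-means example. The feature columns and the number of clusters below are illustrative assumptions, not the notebook’s actual code:

```python
import random

def kmeans(points, k, iterations=20, seed=42):
    """Tiny k-means for illustration: assign each point to its nearest
    centroid, recompute centroids, and repeat a fixed number of times."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(
                range(k),
                key=lambda i: sum((a - b) ** 2 for a, b in zip(p, centroids[i])),
            )
            clusters[nearest].append(p)
        for i, members in enumerate(clusters):
            if members:  # keep the old centroid if a cluster ends up empty
                centroids[i] = tuple(
                    sum(dim) / len(members) for dim in zip(*members)
                )
    return centroids, clusters

# Hypothetical per-company-code features, e.g. (posting_count, avg_amount).
features = [(10, 1.0), (12, 1.2), (11, 0.9), (95, 9.5), (100, 10.0), (98, 9.8)]
centroids, clusters = kmeans(features, k=2)
```

In the real notebook the features would come from the shared companycode data, and a library implementation (e.g. Spark MLlib or scikit-learn) would typically replace this hand-rolled loop.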

-
Now we will share the company code clusters table via Delta Sharing and publish it to SAP Datasphere as a new data product.
Select Catalog in the left navigation menu.

-
Under the My organization catalog section, expand company_code_data_product -> company_code and select the company_code_clusters table.

-
Next select Share and choose Share via Delta Sharing.

-
Select Create a new share with the table.
Provide Share name and Recipients:
-
Share name: company_code_clustering_share_
-
Recipients: sap-business-data-cloud
Select Share.

-
Select the Gear icon on the catalog and then select Delta Sharing.

-
Switch to Shared by me, filter for your username (no space at the end), and select the Delta Share you just created.

-
Assets in the share you created will be displayed.

Usage of sap-bdc-connect-sdk
In this section we will import another notebook and execute five code cells:
- Install SDK
- Create a client
- Create a share
- Create the share CSN
- Publish a Data Product
-
Download the notebook we will use to publish a data product from SAP Databricks here (right-click and choose Save link as).
-
After saving the notebook locally, select Workspace in the left navigation, expand Workspace -> Users, right-click your username, and select Import.

-
Select Browse to locate the file.

-
Import the Publish_Data_Product_Company_Clustering.py notebook.

-
Select the Publish_Data_Product_Company_Clustering file to open the notebook.

-
Select Environment on the right panel to change default settings.

-
Set the environment version to 3 and Apply the change.

-
Confirm the change.

-
Execute the first code block by clicking Run on the upper left corner of the cell.
This code will install the SDK. It should take about a minute to complete. A green check will appear next to Run once it completes.

-
Execute the second code block.
This code creates a client:
- DatabricksClient receives dbutils as a parameter; dbutils is an SAP Databricks utility available inside Databricks notebooks
- BdcConnectClient receives the DatabricksClient as a parameter to read information from the SAP Databricks environment (e.g., secrets, api_token, workspace_url_base)
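The lesson doesn’t show the SDK’s import statements, so the relationship between the two clients is illustrated below with stand-in stubs; the real DatabricksClient and BdcConnectClient come from sap-bdc-connect-sdk, and dbutils is provided by the notebook environment:

```python
# Stand-in stubs that mirror the relationship described above. The real
# DatabricksClient and BdcConnectClient come from sap-bdc-connect-sdk;
# everything here is illustrative only.
class FakeDbutils:
    """Minimal stand-in for the dbutils object available in notebooks."""
    def get_token(self):
        return "dummy-api-token"

class DatabricksClient:
    # Receives dbutils so it can read workspace details and secrets.
    def __init__(self, dbutils):
        self.dbutils = dbutils
        self.api_token = dbutils.get_token()
        self.workspace_url_base = "https://example.databricks.net"  # illustrative

class BdcConnectClient:
    # Receives the DatabricksClient to reach the SAP Databricks environment.
    def __init__(self, databricks_client):
        self.databricks_client = databricks_client

client = BdcConnectClient(DatabricksClient(FakeDbutils()))
```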

-
Execute the third code block to create a share.
A share is a mechanism for distributing and accessing data across different systems. Creating or updating a share involves including specific attributes, such as @openResourceDiscoveryV1, in the request body, aligning with the Open Resource Discovery protocol. This procedure ensures that the share is properly structured and described according to specified standards, facilitating effective data sharing and management.
- share_name: “company_code_clustering_share_<lowercase_username>”
- title: “Company Code Clustering Data Product From ”
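The request-body description above can be made concrete with a small sketch. Only the @openResourceDiscoveryV1 attribute name comes from this lesson; every other field name below is an illustrative assumption, not the SDK’s actual schema:

```python
import json

# Hypothetical request body for creating a share. Only the
# @openResourceDiscoveryV1 attribute is taken from the lesson text;
# the remaining field names are illustrative assumptions.
share_request = {
    "name": "company_code_clustering_share_<lowercase_username>",
    "@openResourceDiscoveryV1": {
        "title": "Company Code Clustering Data Product",
        "shortDescription": "Company codes grouped into clusters",
    },
}

# Serialize the body as it would be sent in an HTTP request.
payload = json.dumps(share_request, indent=2)
```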

-
Execute the fourth code block to create the CSN.
The CSN serves as a standardized format for configuring and describing shares within a network. To create or update the CSN for a share, it’s advised to prepare the CSN content in a separate file and include this content in the request body. This approach ensures accuracy and compliance with the CSN interoperability specifications, facilitating consistent and effective share configuration across systems.
- share_name: “company_code_clustering_share_<lowercase_username>”
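As the text recommends, the CSN content can live in a separate file and be read back into the request body. The sketch below uses a heavily simplified, hypothetical CSN; a real one must follow the CSN interoperability specification:

```python
import json
import os
import tempfile

# Illustrative CSN snippet (structure simplified; entity and element
# names are assumptions for this sketch).
csn = {
    "definitions": {
        "company_code_clusters": {
            "kind": "entity",
            "elements": {
                "company_code": {"type": "cds.String"},
                "cluster_id": {"type": "cds.Integer"},
            },
        }
    }
}

# Keep the CSN in a separate file, as the lesson recommends, then read
# it back and embed the content in the request body.
with tempfile.NamedTemporaryFile("w", suffix=".json", delete=False) as f:
    json.dump(csn, f)
    csn_path = f.name

with open(csn_path) as f:
    csn_content = json.load(f)

request_body = {
    "share_name": "company_code_clustering_share_<lowercase_username>",
    "csn": csn_content,
}
os.remove(csn_path)
```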

-
Execute the fifth code block to publish the data product to SAP Datasphere.
A Data Product is an abstraction that represents a type of data or data set within a system, facilitating easier management and sharing across different platforms. It bundles resources or API endpoints to enable efficient data access and utilization by integrated systems. Publishing a Data Product allows these systems to access and consume the data, ensuring seamless communication and resource sharing.
- share_name: “company_code_clustering_share_<lowercase_username>”

Verify Published Data Product
Log on to SAP Datasphere.
-
Open a Chrome browser or Microsoft Edge browser and enter the SAP Datasphere URL.
Alternatively, click here.
-
Login with your user credentials.
Username:
Password:

-
Once the SAP Datasphere welcome page opens, select Catalog & Marketplace, then Search.

-
Show the filters using the filter icon, then filter for Data Products and for SAP Databricks (System Type).

-
In display options (upper right corner) switch to Display as List.

-
Enter your search text in the search bar and select Search (or press Enter).
The data product you just published from SAP Databricks should appear first in the list.

-
Open the data product to verify it’s the one you published from SAP Databricks.

-
Go back to Home.

Congratulations! You have successfully published a Data Product from SAP Databricks.