Let's explore the integration options in SAP Datasphere related to data acquisition.
Acquiring Data Using CSV Files

In SAP Datasphere, the Data Builder provides a CSV file upload function by default.
You can use the Import CSV File option for manual uploads of files smaller than 25 MB. No connection needs to be defined for this feature.
It automatically creates a table with derived columns based on the file structure. You can apply various data wrangling and transformation rules, such as concatenate, split, extract, replace, change, or filter.
To load data into an existing table (without transformations), use Upload Data From CSV File in the table editor.
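The wrangling rules above behave like ordinary column transformations. A minimal Python sketch of split, replace, filter, and concatenate on sample data (the data, column names, and size check are illustrative only; this is not a Datasphere API):

```python
import csv
import io

# Invented sample data standing in for an uploaded CSV file.
raw = """id,full_name,country
1,Ada Lovelace,GB
2,Grace Hopper,US
3,Alan Turing,GB
"""

# The manual Import CSV File option caps uploads at 25 MB.
MAX_UPLOAD_BYTES = 25 * 1024 * 1024
assert len(raw.encode("utf-8")) < MAX_UPLOAD_BYTES

rows = list(csv.DictReader(io.StringIO(raw)))

transformed = []
for row in rows:
    # Split: derive first/last name from a single column.
    first, last = row["full_name"].split(" ", 1)
    # Replace: map a country code to a readable value.
    country = row["country"].replace("GB", "United Kingdom")
    # Filter: keep only the UK rows.
    if country != "United Kingdom":
        continue
    # Concatenate: build a display label from two columns.
    transformed.append({"id": row["id"], "label": f"{last}, {first}", "country": country})

for t in transformed:
    print(t)
```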
Note
For larger or automated loads, a generic SFTP connection must be set up to connect to and access files on a Secure File Transfer Protocol (SFTP) server.
Data Integration Using Connections

Using connections to sources, SAP Datasphere offers several approaches to data integration:
- Remote tables
  You can use remote tables to:
  - directly access data in the source (remote access),
  - copy the full set of data (snapshot or scheduled replication),
  - copy data changes in real time (real-time replication).
- Flows
  Three types of flows are offered:
  - Data flow: follows the traditional ETL paradigm, where data is extracted and transformed before it is stored in the target.
  - Replication flow: for ELT scenarios, where data is extracted and loaded first, then transformed.
  - Transformation flow: for post-load transformations on data that has already been loaded.
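The ordering difference between the ETL and ELT flow types can be pictured with a small in-memory sketch (the records and the cast-to-integer transformation are invented for illustration; no Datasphere API is involved):

```python
# Invented source records, as they might arrive from a source system.
source = [{"amount": "10"}, {"amount": "20"}]

def transform(rows):
    # Example transformation: cast string values to integers.
    return [{"amount": int(r["amount"])} for r in rows]

# ETL (data flow): transform first, then load the result into the target.
etl_target = transform(source)

# ELT (replication flow): load the raw data into the target first...
elt_staging = list(source)
# ...then transform it in place (transformation flow).
elt_target = transform(elt_staging)

# Both orderings arrive at the same result; they differ in where
# and when the transformation work happens.
assert etl_target == elt_target
print(etl_target)
```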
Data Integration Architecture

SAP Datasphere leverages different technologies to set up connections to sources. As a result, each connection provides different functionality, prerequisites, and user experience.
Remote table
Remote tables are available mainly for SAP sources. For this data integration approach, the SAP HANA Federation Framework currently relies mainly on SAP HANA Smart Data Integration (SDI) and its data provisioning framework (dpServer and dpAgent).
The dpAgent is a lightweight component that runs outside the SAP Datasphere environment. It hosts data provisioning adapters for connectivity to remote sources, enabling data federation and replication scenarios, and acts as a gateway that provides secure connectivity between the database of your SAP Datasphere tenant and the adapter-based remote sources.
The dpAgent is managed by the Data Provisioning Server and is required for all SDI connections. Through the agent, the pre-installed data provisioning adapters communicate with the Data Provisioning Server for connectivity, metadata browsing, and data access. The agent connects to SAP Datasphere using JDBC; it must be installed on a host in your local network and configured for use with SAP Datasphere.
Some sources can be connected directly (for example, SAP SuccessFactors and SAP HANA Cloud). For a direct connection to SAP HANA on-premise, SAP HANA Smart Data Access (SDA) is used together with the Cloud Connector.
Replication flow and data flow
The embedded SAP Data Intelligence (DI) runtime provides the data flow and replication flow functionality. In this scenario, DI connectors are used to reach remote sources. To establish a connection to on-premise sources, the Cloud Connector is required as the link between SAP Datasphere and the source, and it must be set up appropriately before the connection is created.
The Cloud Connector serves as a link between SAP Datasphere and your on-premise sources and is required for connections that you want to use for the following use cases:
- Data flows.
- Replication flows.
- Model import from SAP BW/4HANA model transfer connections (the Cloud Connector is required for the live data connection of type tunnel that you need to create the model import connection).
- In rare cases, remote tables: only for SAP HANA on-premise via SDA.
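The connectivity rules above can be summarized as a small decision helper. This is a simplified sketch that only encodes the cases described in this section (the function and its labels are invented for illustration, not an SAP API):

```python
# Toy helper encoding which connectivity component a use case needs,
# following the rules described above (simplified; not an SAP API).
def required_component(use_case: str, on_premise: bool = True) -> str:
    if use_case in ("data flow", "replication flow"):
        # DI connectors reach the source; the Cloud Connector links
        # SAP Datasphere to on-premise sources.
        return "Cloud Connector" if on_premise else "direct DI connector"
    if use_case == "model import (SAP BW/4HANA model transfer)":
        return "Cloud Connector"
    if use_case == "remote table (SAP HANA on-premise via SDA)":
        return "Cloud Connector"  # the rare remote-table case
    if use_case == "remote table":
        # Most SDI-based connections go through the dpAgent; some sources
        # (e.g. SAP HANA Cloud, SAP SuccessFactors) connect directly.
        return "Data Provisioning Agent (SDI)" if on_premise else "direct connection"
    raise ValueError(f"unknown use case: {use_case}")

print(required_component("replication flow"))  # Cloud Connector
print(required_component("remote table"))      # Data Provisioning Agent (SDI)
```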


