Introducing SAP Datasphere Integration Options

Objective

After completing this lesson, you will be able to describe the integration options of SAP Datasphere.

Integration of Data into SAP Datasphere

Introduction

You can connect data sources in different ways.

SAP Datasphere provides a large set of built-in connectors to access data from a wide range of sources, in the cloud or on-premise, from SAP, from non-SAP sources, or from partner tools.

Connections provide access to data from a wide range of remote systems, cloud as well as on-premise, SAP as well as non-SAP, and partner tools. They allow users assigned to a space to use objects from the connected remote system as sources to acquire, prepare, and access data in SAP Datasphere. In addition, you can use certain connections to define targets for replication flows.

Connections must be established for every source system.

To learn more about integrating SAP applications, refer to the how-to paper by SAP at: https://help.sap.com/docs/SAP_DATASPHERE/be5967d099974c69b77f4549425ca4c0/8f98d3c917f94452bafe288055b60b35.html?locale=en-US.

Watch this video to understand how to create connections for integrating SAP applications.

Create a connection to allow users assigned to a space to use the connected source or target for data modeling and data access in SAP Datasphere.

You can decide between virtual access to the data and replication, which persists the data in SAP Datasphere.

Each connection type supports a defined set of features. Depending on the connection type and the connection configuration, you can use a connection for one or more of the following features:

  • Remote Tables

    The remote tables feature supports building views. After you have created a connection, a modeler can add a source object (usually a database table or view) from the connection to a view in the graphical view editor of the Data Builder. The source object is then deployed as a remote table.

    During import, the source objects are deployed as remote tables. Depending on the connection type, you can use remote tables for the following tasks:
    • Directly access data in the source (remote access)

    • Copy the full set of data (snapshot or scheduled replication)

    • Copy data changes in real time (real-time replication)

  • Data Flows, Replication Flows, and Transformation Flows

    The flow feature supports building data flows, replication flows, and transformation flows. After you have created a connection, a modeler can add a source object from the connection to a flow in the respective flow editors of the Data Builder to integrate and transform the data; a short scripting sketch for data flows follows this list.

  • External Tools

    SAP Datasphere is open to SAP and non-SAP tools for integrating data into SAP Datasphere.
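
For data flows in particular, transformation logic can also be written as a script. The following is a minimal sketch of such a script, assuming the script operator hands each data batch to a transform(data) function as a Pandas DataFrame and expects a DataFrame in return; the column names (NET_AMOUNT, TAX_RATE, ORDER_ID) are hypothetical.

  # Sketch of a script for a data flow's script operator. Assumptions: the
  # operator calls transform(data) with a Pandas DataFrame per batch and
  # expects a DataFrame back; pandas is available to the script.
  # Column names are hypothetical placeholders.
  import pandas as pd

  def transform(data: pd.DataFrame) -> pd.DataFrame:
      # Derive a gross amount from net amount and tax rate.
      data["GROSS_AMOUNT"] = data["NET_AMOUNT"] * (1 + data["TAX_RATE"])
      # Keep only rows that carry a business key.
      return data.dropna(subset=["ORDER_ID"])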

By default, when you import a remote table, its data is not replicated and is accessed via federation from the remote system each time it is used. You can improve performance by replicating the data to SAP Datasphere, and you can schedule regular updates (or, for many connection types, enable real-time replication) to keep the data fresh and up-to-date.
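
Once a remote table is deployed in a space, a SQL client with read access to the space schema can query it: with federation, each query is delegated to the remote system, while with replication it reads the persisted copy. Below is a minimal sketch using the SAP HANA Python client (hdbcli); the host, database user, and table name SALES_ORDERS are hypothetical placeholders.

  # Minimal sketch: querying a remote table in a space via the SAP HANA
  # Python client (hdbcli). Host, credentials, and the table name
  # SALES_ORDERS are hypothetical placeholders.
  from hdbcli import dbapi

  conn = dbapi.connect(
      address="<your-datasphere-host>",
      port=443,
      user="MYSPACE#TECHNICAL_USER",  # hypothetical space database user
      password="<password>",
      encrypt=True,
  )
  cursor = conn.cursor()
  # With federation, this SELECT is delegated to the remote source system;
  # with replication enabled, it reads the persisted copy in SAP Datasphere.
  cursor.execute('SELECT COUNT(*) FROM "MYSPACE"."SALES_ORDERS"')
  print(cursor.fetchone()[0])
  conn.close()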

Data Integration and Preparation

In SAP Datasphere, the remote table is the first-layer object for data integration.

Many connections (including most connections to SAP systems) support importing remote tables to federate or replicate data (see Integrating Data via Connections).

You can import remote tables to make the data available in your space from the Data Builder start page, in an entity-relationship model, or directly as a source in a view.

  • By default, remote tables federate data, and each time the data is used, a call is made to the remote system to load it. You can improve performance by enabling replication to store the data in SAP Datasphere. Some connections support real-time replication and for others, you can keep your data fresh by scheduling regular updates (see Replicate Remote Table Data).

  • To optimize replication performance and reduce your data footprint, you can remove unnecessary columns and set filters (see Restrict Remote Table Data Loads).

  • To maximize access performance, you can store the replicated data in-memory (see Accelerate Table Data Access with In-Memory Storage).

  • Once a remote table is imported, it is available for use by all users of the space and can be used as a source for views.
  • You can automate sequences of data replication and loading tasks with task chains (see Creating a Task Chain).
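
Conceptually, restricting a remote table's data load means persisting only the columns and rows that are actually needed, as in the following purely illustrative Pandas sketch (in SAP Datasphere you configure this in the remote table's filter and column settings, not in code; the table and column names are made up).

  # Purely illustrative: the effect of removing columns and setting a filter
  # before replication is that only the reduced data set is persisted.
  import pandas as pd

  source = pd.DataFrame(
      {
          "ORDER_ID": [1, 2, 3],
          "YEAR": [2023, 2024, 2024],
          "AMOUNT": [100.0, 250.0, 80.0],
          "INTERNAL_NOTE": ["a", "b", "c"],  # not needed in SAP Datasphere
      }
  )

  # Keep only the required columns and the rows matching the filter.
  replicated = source.loc[source["YEAR"] >= 2024, ["ORDER_ID", "YEAR", "AMOUNT"]]
  print(replicated)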

A replication flow can connect to many types of source systems.

Certain connections support loading data from multiple source objects to SAP Datasphere via a replication flow. You can enable a single initial load or request initial and delta loads and perform simple projection operations (see Creating a Replication Flow).

A replication flow can also be used for outbound scenarios, with a target outside SAP Datasphere.

Create a replication flow to copy multiple data assets from a source to a target.

You can use replication flows to copy data from the following source objects:

  • CDS views (in ABAP-based SAP systems) that are enabled for extraction

  • Tables that have a primary key

  • Objects from ODP providers, such as extractors or SAP BW artifacts (from any SAP system that is based on SAP NetWeaver and has a suitable version of the DMIS add-on, see SAP Note 3412110).

For more information about available connection types, sources, and targets, see Integrating Data via Connections.
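
Conceptually, a replication flow in "initial and delta" mode first copies the full data set and then keeps applying inserts, updates, and deletes from the source. The following Pandas sketch only illustrates that merge logic; a real replication flow handles it internally, and the change-type codes I/U/D used here are an assumption for the example.

  # Illustrative only: applying a batch of delta records (insert/update/delete)
  # to an already loaded target table. A real replication flow does this
  # internally; change-type codes "I", "U", "D" are assumed for the sketch.
  import pandas as pd

  target = pd.DataFrame(
      {"ORDER_ID": [1, 2, 3], "STATUS": ["NEW", "OPEN", "OPEN"]}
  ).set_index("ORDER_ID")

  delta = pd.DataFrame(
      {
          "ORDER_ID": [2, 3, 4],
          "STATUS": ["CLOSED", None, "NEW"],
          "CHANGE_TYPE": ["U", "D", "I"],
      }
  ).set_index("ORDER_ID")

  # Apply deletes and updates by removing the affected keys, then append
  # the new row versions for inserts and updates.
  to_remove = delta[delta["CHANGE_TYPE"].isin(["U", "D"])].index
  target = target.drop(to_remove, errors="ignore")
  new_rows = delta[delta["CHANGE_TYPE"].isin(["I", "U"])].drop(columns="CHANGE_TYPE")
  target = pd.concat([target, new_rows]).sort_index()
  print(target)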

The transformation flow is used to transform data. You define the transformation graphically or in SQL (standard SQL or SQLScript); Python scripting is available in data flows via the script operator.

Create a transformation flow to load data from one or more source tables, apply transformations (such as a join), and output the result in a target table. You can load a full set of data from one or more source tables to a target table. You can add local tables and also remote tables located in SAP BW Bridge spaces. Note that remote tables located in SAP BW Bridge spaces must be shared with the SAP Datasphere space you are using to create your transformation flow. You can also load delta changes (including deleted records) from one source table to a target table.
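
For illustration, the logic of a simple transformation flow, joining two sources and writing an aggregated result to a target table, can be sketched in Pandas; in SAP Datasphere itself you would model this graphically or in SQL, and the table and column names below are hypothetical.

  # Illustration of the kind of logic a transformation flow encapsulates:
  # join two sources and produce a target table. Names are hypothetical;
  # in SAP Datasphere this is modeled in the flow editor, not in Python.
  import pandas as pd

  orders = pd.DataFrame(
      {"ORDER_ID": [1, 2], "CUSTOMER_ID": [10, 20], "AMOUNT": [100.0, 250.0]}
  )
  customers = pd.DataFrame(
      {"CUSTOMER_ID": [10, 20], "REGION": ["EMEA", "APJ"]}
  )

  # Join the sources and aggregate the result per region (the "transformation").
  target = (
      orders.merge(customers, on="CUSTOMER_ID", how="inner")
      .groupby("REGION", as_index=False)["AMOUNT"]
      .sum()
  )
  print(target)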

Various tasks can be combined in a task chain.

Group multiple tasks into a task chain and run them manually once, or periodically, through a schedule. You can create linear task chains in which one task is run after another. (You can also nest other task chains within a task chain.) Or, you can create task chains in which individual tasks are run in parallel and successful continuation of the entire task chain run depends on whether ANY or ALL parallel tasks are completed successfully. In addition, when creating or editing a task chain, you can also set up email notification for deployed task chains to notify selected users of task chain completion.
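
The ANY/ALL rule for parallel branches can be pictured with a small Python sketch using concurrent.futures; the task functions and the notification step are hypothetical stand-ins, since a real task chain is configured in the Data Builder rather than coded.

  # Illustration of a task chain with two parallel tasks whose results decide
  # whether the chain continues. The task functions and the ALL/ANY rule shown
  # here are a conceptual sketch, not an SAP Datasphere API.
  from concurrent.futures import ThreadPoolExecutor

  def replicate_remote_table():
      return True  # pretend the replication task succeeded

  def run_data_flow():
      return True  # pretend the data flow task succeeded

  with ThreadPoolExecutor() as pool:
      results = list(pool.map(lambda task: task(), [replicate_remote_table, run_data_flow]))

  # "ALL" semantics: continue only if every parallel task succeeded.
  # Using any(results) instead would model the "ANY" completion rule.
  if all(results):
      print("Parallel step succeeded - running the next task in the chain")
  else:
      print("Parallel step failed - stopping the chain and notifying users")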
