SAP Datasphere is a public cloud software-as-a-service (SaaS) data warehouse. It supports a business data warehouse architecture that harmonizes data across the enterprise and includes services for data modeling, data integration, cataloging, and data warehousing across SAP and non-SAP data.
Here are some scenarios where you might want to use SAP Datasphere:
- You are not yet sure how many data processing resources (memory, disk, CPU) you need and want to adjust sizing elastically with little effort.
- You want to always give your developers the latest modeling capabilities without having to update the software manually.
- You want to bring together different types of sources in an end-to-end scenario and be able to add another source with little effort.
- You need complex data transformation capabilities and want to combine code-based data transformation with a graphical user interface (UI).
- You want to provide business users (LoBs) with a self-service modeling environment that sits on top of the complex data modeling layer.
In SAP Datasphere you can create several artifacts using different code-based and graphical techniques:
- Although it is recommended to federate data through a virtual data model, you can also persist data in SAP Datasphere in a Local Table, which can be filled from a simple flat file or via a connection to another source system.
- If you want to model virtually and implement filters, joins, and aggregations, you can either create a code-based SQL View or use the graphical user interface to build a Graphical View.
- The Analytic Model is the analytical foundation for making data ready for consumption in SAP Analytics Cloud. It allows you to create and define multi-dimensional models that provide data for analytical purposes and answer different business questions. Predefined measures, hierarchies, filters, parameters, and associations (joins) allow flexible and simple navigation through the underlying data.
- Data Flows support data integration scenarios with embedded data transformation options to filter, join, or even enrich the integrated data.
- Finally, the Intelligent Lookup is a feature to simplify and accelerate the merging of data.
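To make the view concept concrete, here is a minimal sketch using plain SQLite from Python. This is not SAP Datasphere itself, and the table and column names are invented; it only illustrates the kind of join, filter, and aggregation logic that an SQL View or Graphical View would model virtually on top of its sources:

```python
import sqlite3

# Illustrative only: plain SQLite standing in for a Datasphere SQL View.
# The tables (products, sales) and their columns are invented for this sketch.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE products (product_id INTEGER, category TEXT);
CREATE TABLE sales (product_id INTEGER, region TEXT, revenue REAL);
INSERT INTO products VALUES (1, 'Hardware'), (2, 'Software');
INSERT INTO sales VALUES (1, 'EMEA', 100.0), (1, 'APJ', 50.0), (2, 'EMEA', 200.0);

-- A view that joins, filters, and aggregates -- the same operations
-- an SQL View or Graphical View models virtually in SAP Datasphere.
CREATE VIEW emea_revenue_by_category AS
SELECT p.category, SUM(s.revenue) AS total_revenue
FROM sales s
JOIN products p ON p.product_id = s.product_id
WHERE s.region = 'EMEA'
GROUP BY p.category;
""")

print(conn.execute(
    "SELECT * FROM emea_revenue_by_category ORDER BY category").fetchall())
# -> [('Hardware', 100.0), ('Software', 200.0)]
```

As in Datasphere's virtual modeling, no data is copied here: the view is evaluated against the underlying tables only when it is queried.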
There are other types of objects, for example:
- Remote Tables are tables that are created based on a source object as a reference structure. The data is initially accessed remotely, but you can start or schedule a snapshot to be loaded. Reading values is then faster, but you use memory, and the values stored in SAP Datasphere may be out of date. Depending on the capabilities of the source, you can replicate in real time instead of taking full snapshots.
- A Consumption Model and a Perspective provide features similar to those of an Analytic Model, or of a CompositeProvider and a BW Query in SAP BW.
As you can see, SAP Datasphere offers very useful features; the Intelligent Lookup in particular is worth a closer look. So what is the Intelligent Lookup? One of the biggest and most time-consuming challenges of data integration is physically joining one's own data with newly acquired data. Why is that? The data involved is often unclean, so any straightforward join between two datasets that belong together regularly suffers from some or even many exceptions such as typos, duplicates, special characters, fuzziness in the data, and much more. Dealing with these exceptions takes a lot of time, and this is where the Intelligent Lookup comes into play.
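The matching problem the Intelligent Lookup addresses can be sketched with a few lines of Python. This is emphatically not the Datasphere implementation; it is a simple similarity-scoring sketch using the standard library's `difflib`, with invented customer names, that shows why an exact join fails and a fuzzy match succeeds:

```python
from difflib import SequenceMatcher

# Sketch of the fuzzy-matching idea behind an Intelligent Lookup.
# NOT the Datasphere algorithm -- just difflib similarity scoring
# on invented customer records that an exact join could not pair up.
own_data = ["Miller GmbH", "Acme Corp.", "Contoso Ltd"]
acquired = ["Mueller GmbH", "ACME Corporation", "Northwind"]

def best_match(name, candidates, threshold=0.6):
    """Return the candidate most similar to `name`, or None if all
    similarity scores fall below the threshold."""
    scored = [(SequenceMatcher(None, name.lower(), c.lower()).ratio(), c)
              for c in candidates]
    score, match = max(scored)
    return match if score >= threshold else None

for name in own_data:
    print(name, "->", best_match(name, acquired))
```

An exact equi-join would pair none of these records; the similarity score recovers the likely matches (for example, "Miller GmbH" pairs with "Mueller GmbH") while leaving genuinely unmatched rows for manual review, which mirrors the matched/review/unmatched split an interactive fuzzy-merge workflow produces.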
A technical object called a Space is used to provide a set of connections, models, and other resources to a set of users. A Space in SAP Datasphere is a unique and isolated virtual workplace that contains, among other things, users, connections, and data models. Each Space uses dedicated resources for data processing and storage, and you can increase or decrease storage capacity as needed.

Spaces are like separate work areas intended for separate reporting stakeholders, such as lines of business like the sales department. The Space concept controls which line of business or group of users can access which data. If a Space is obsolete after a finished project, the administrator can easily deactivate it.

One typical Space scenario might look as follows: imagine the sales department of an on-premise data warehouse customer needs access, but not to the full existing 6 TB (terabytes) of the data warehouse. They are looking at a much smaller slice of the data, about 15 GB (gigabytes). Now they can implement a cloud data warehouse that subscribes to the existing central warehouse, just for that smaller slice of data they need to work with. Basically, the system replicates the 15 GB of data and publishes the model to their Space in the cloud. Soon another department needs to do something similar, only with their 10 GB slice of the data warehouse, and then another one, too. Watch the following video that explains the concept.
SAP Datasphere offers two main toolkits for modeling: the Data Builder and the Business Builder.

| | Data Builder | Business Builder |
| --- | --- | --- |
| Typical tasks | Combining datasets, fixed filters, complex calculations | Business filters, restricted and calculated measures, default drilldown |
| Target users | Data architects from enterprise IT | Power users from lines of business |
| Main SAP Datasphere objects | View, Data Flow, Analytic Model | Consumption Model, Perspective |
Suppose that you want to combine different sources. SAP Datasphere provides different ways to integrate data. One of them is the remote table: you can import remote tables to make the data available in your Space with the Data Builder, in an entity-relationship model, or directly as a source in a view. Another way is to use data flows. Many connections support loading data into SAP Datasphere via data flows, which support a wide range of extract, transform, and load (ETL) operations. Below is an example of how data integration could look:
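As a minimal sketch of the ETL steps a data flow performs, here is a plain-Python example, not Datasphere itself; the CSV content, filter condition, and target table name are all invented for illustration:

```python
import csv
import io
import sqlite3

# Sketch of the three ETL steps a data flow performs conceptually:
# extract from a source, transform (filter + enrich), load into a target.
# Source data and the target table name are invented for this example.
source_csv = """order_id,country,amount
1,DE,120.50
2,US,80.00
3,DE,45.90
"""

# Extract: read the source rows
rows = list(csv.DictReader(io.StringIO(source_csv)))

# Transform: keep only German orders and enrich with a currency column
transformed = [
    {"order_id": int(r["order_id"]), "amount": float(r["amount"]), "currency": "EUR"}
    for r in rows
    if r["country"] == "DE"
]

# Load: persist the result into a local target table
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders_de (order_id INTEGER, amount REAL, currency TEXT)")
conn.executemany(
    "INSERT INTO orders_de VALUES (:order_id, :amount, :currency)", transformed)

print(conn.execute("SELECT COUNT(*), SUM(amount) FROM orders_de").fetchone())
```

In a real data flow, the graphical editor wires these steps together as source, transformation, and target operators instead of hand-written code, but the filter/join/enrich semantics are the same.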
The virtual data model should consist of several layers composed of Graphical or SQL Views, Analytic Models, and Intelligent Lookups. To build these artifacts, you can write SQL code in a powerful SQL editor or, alternatively, use the graphical no-code/low-code environment inside the Data Builder. Let's take one minute to watch a simple example of the virtual data model architecture:
Let's take a short excursion into the SAP Datasphere architecture. There are two different tenant types in SAP Datasphere. On the one hand, there is the SAP Datasphere core tenant, which includes all the artifacts, tools, and features that have been mentioned in this learning journey. On the other hand, there is the SAP BW bridge tenant, which is aimed at SAP BW customers who want to migrate their SAP BW artifacts to SAP Datasphere.