SAP BW/4HANA Data Flow
To load data into SAP BW/4HANA, we build a data flow. A data flow is a sequence of objects that determine how data is routed on its journey from the source tables to a query in SAP BW/4HANA. Tools are provided to create the data flow objects and connect them.
Watch this video to learn about the data flow of master data:
We have now learned how master data travels from the source system to SAP BW/4HANA. Watch the next video to see how we load the transactional data:
In the previous two videos, we covered the most important points about the SAP BW/4HANA data flow. Let's now recap those points and add some information about each component in the data flow.
Master Data
To load master data, you need to define a characteristic InfoObject. There are three types of master data in SAP BW/4HANA: attributes, texts, and hierarchies. You can choose to load any one of these types of master data or all of them.
The main purpose of master data is to enrich the transactional data with additional information in a report. On its own, master data is not especially interesting to the business user who monitors business performance. Equally, transaction data without master data does not provide enough information to the business user.
In our example, we saw that the characteristic InfoObject ProductID with the value HT-1000 has the text Notebook Basic 15 and that its Product ABC Category attribute has the value C. These values are stored in the master data tables of SAP BW/4HANA and are loaded from the source system master data tables.
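To make the enrichment idea concrete, here is a minimal sketch in plain Python. It is not SAP code: the dictionaries only stand in for master data tables, and the order number and quantity are hypothetical values added for illustration. Only the product ID, text, and ABC category come from the example above.

```python
# Plain Python dictionaries standing in for master data tables (illustration only).
product_texts = {"HT-1000": "Notebook Basic 15"}
product_attributes = {"HT-1000": {"abc_category": "C"}}

# Hypothetical transactional records: only the product ID is stored with each one.
sales_items = [
    {"order": "A1", "product_id": "HT-1000", "quantity": 2},
]


def enrich(record):
    """Return a copy of the record with text and attributes looked up by product ID."""
    product_id = record["product_id"]
    enriched = dict(record)
    enriched["product_text"] = product_texts.get(product_id)
    enriched.update(product_attributes.get(product_id, {}))
    return enriched


report_rows = [enrich(item) for item in sales_items]
print(report_rows[0])
```

The transactional record itself carries only the product ID. The text and attribute are looked up at the time the report row is built.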
Optionally, hierarchy master data can be loaded to enable the hierarchical presentation of multiple characteristic InfoObject values, instead of presenting the values in an endless flat list. This makes navigation in reports much easier.
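The difference between a flat list and a hierarchy can be sketched as follows. This is plain Python, not an SAP BW/4HANA hierarchy definition. The node names and all product IDs other than HT-1000 are invented for illustration.

```python
# Hypothetical hierarchy: each node maps to its child nodes or leaf values.
hierarchy = {
    "All Products": ["Notebooks", "Accessories"],
    "Notebooks": ["HT-1000", "HT-1001"],
    "Accessories": ["HT-2000"],
}


def render(node, depth=0):
    """Return indented lines for a node and everything below it."""
    lines = ["  " * depth + node]
    for child in hierarchy.get(node, []):
        lines.extend(render(child, depth + 1))
    return lines


tree_lines = render("All Products")
print("\n".join(tree_lines))
```

A report user can expand or collapse a node such as Notebooks instead of scrolling through every product value.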
All three types of master data must be loaded into SAP BW/4HANA using data flow objects such as DataSources, Transformations, and Data Transfer Processes (DTPs).
Transactional Data
Transactional data is generated in source systems such as SAP S/4HANA by the execution of various business processes such as sales order processing.
The transactional data is stored in source system tables that include many fields.
DataSource
A DataSource is the object in SAP BW/4HANA that is created to support data extraction from the source system.
A DataSource can be regarded as the entry point for data arriving in SAP BW/4HANA.
A DataSource is simply a structure of fields that correspond to the source system fields.
The DataSource defines which fields are transferred from the source tables to SAP BW/4HANA.
No data is stored in a DataSource. It is a pass-through object used to convey data to the next layer in SAP BW/4HANA.
A DataSource is used in a transactional data flow and a master data flow.
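The points above can be sketched in plain Python. This is only an analogy, assuming hypothetical field names. A real DataSource is created with the SAP BW/4HANA tools, not in code like this.

```python
# Hypothetical field list: the DataSource is just a structure naming the fields to transfer.
DATASOURCE_FIELDS = ("order_id", "product_id", "quantity")


def extract(source_rows):
    """Yield only the fields named in the DataSource; nothing is stored along the way."""
    for row in source_rows:
        yield {field: row[field] for field in DATASOURCE_FIELDS}


# Hypothetical source table with more fields than we want to transfer.
source_table = [
    {"order_id": "A1", "product_id": "HT-1000", "quantity": 2,
     "created_by": "USER1", "internal_flag": "X"},
]

extracted = list(extract(source_table))
print(extracted)
```

The generator passes each record through without keeping a copy, which mirrors the pass-through role described above.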
DataStore Object (advanced) - Type: Staging and Field-based
A DataStore Object (advanced) - Type: Staging stores the data in the same format as the source system with no changes to the values.
This layer retains all changes to the loaded data so that we have a complete history.
This layer is also known as Corporate Memory because it can be regarded as the permanent copy of the source data that we preserve in case we need to reuse the data in the future.
This type of DataStore Object is not used for reporting because, with its history preservation features, its focus is on staging data and not on reporting.
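The history-preserving behavior can be pictured as an append-only store. The sketch below is plain Python with hypothetical field names and a hypothetical request number. It does not show how SAP BW/4HANA implements this object internally.

```python
# Append-only list standing in for the staging layer (illustration only).
staging = []


def load_to_staging(records, request_id):
    """Append records unchanged, tagged with the load request they arrived in."""
    for record in records:
        staging.append({"request": request_id, **record})


load_to_staging([{"order_id": "A1", "quantity": 2}], request_id=1)
# A later load delivers a changed version of the same order.
load_to_staging([{"order_id": "A1", "quantity": 5}], request_id=2)

history = [row for row in staging if row["order_id"] == "A1"]
print(history)
```

Both versions of order A1 remain available, which is useful for reloading but awkward for reporting, where we usually want one current value per order.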
Transformation
A transformation connects the DataSource with the DataStore Object (advanced).
A transformation defines the data movement action for each field. Actions can include direct mapping (no change to the data), adjustment to an incorrect data value, adding a missing value, formatting a value, concatenating two fields, splitting a field into two fields, and so on.
A transformation always sits between a source object, such as a DataSource, and a target object, such as a DataStore Object.
A transformation is used in a transactional data flow and a master data flow.
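As a rough illustration of per-field actions, the following plain Python function applies one rule to each target field. The field names and the rules themselves are hypothetical. Real transformations are defined in SAP BW/4HANA, not written like this.

```python
def transform(source_row):
    """Apply one rule per target field (all field names are hypothetical)."""
    return {
        # Direct mapping: the value is passed through unchanged.
        "order_id": source_row["order_id"],
        # Adding a missing value: fall back to a default when the source is empty.
        "country": source_row.get("country") or "UNKNOWN",
        # Formatting: remove surrounding spaces and convert to upper case.
        "product_id": source_row["product_id"].strip().upper(),
        # Concatenation: combine two source fields into one target field.
        "order_item": f'{source_row["order_id"]}-{source_row["item"]}',
    }


row = {"order_id": "A1", "item": 10, "product_id": " ht-1000 ", "country": ""}
result = transform(row)
print(result)
```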
Data Transfer Process (DTP)
A DTP provides the loading parameters such as the filters that determine which data to load and settings that determine how loading errors are handled.
A DTP always sits between a source object, such as a DataSource or a DataStore Object, and a target object.
A DTP is the object that you execute to start the data load.
A DTP is used in a transactional data flow and a master data flow.
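The role of a DTP can be sketched as a function that runs the load with a filter and a simple error-handling rule. This is plain Python under stated assumptions: the function name, the region filter, and the list of rejected rows are all hypothetical and do not represent the actual DTP settings.

```python
def run_load(source_rows, row_filter, transform, target, rejected):
    """Move rows that pass the filter into the target; park failing rows separately."""
    for row in source_rows:
        if not row_filter(row):
            continue  # the filter decides which data to load
        try:
            target.append(transform(row))
        except (KeyError, ValueError) as exc:
            # Error handling: keep the bad row for later checking instead of stopping.
            rejected.append({"row": row, "error": repr(exc)})


rows = [
    {"region": "NORTH", "quantity": "2"},
    {"region": "SOUTH", "quantity": "7"},
    {"region": "NORTH", "quantity": "n/a"},
]
target, rejected = [], []

# Executing the function is what starts the load, like executing a DTP.
run_load(
    rows,
    row_filter=lambda r: r["region"] == "NORTH",
    transform=lambda r: {"region": r["region"], "quantity": int(r["quantity"])},
    target=target,
    rejected=rejected,
)
print(target, rejected)
```

Here the transformation is passed in as a parameter, which reflects that the DTP controls the load while the transformation defines what happens to each field.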
DataStore Object (advanced) - Type: Standard and InfoObject-based
A DataStore Object (advanced) - Type: Standard is used to store consolidated and cleansed data at a granular level, for example, at the level of sales orders and sales order items.
It can be used for reporting because it always uses the latest version of a data record even though historical records are available in this object as well.
You should create this object using InfoObjects instead of fields, in order to use the corresponding master data of the InfoObject in reporting.
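The idea of reporting on the latest version of each record can be sketched as follows. This plain Python example assumes simple overwrite behavior per key and uses hypothetical field names. It is a simplification, not a description of how the object works internally.

```python
def latest_by_key(records, key_fields):
    """Keep one record per key; a later record overwrites an earlier one."""
    active = {}
    for record in records:
        key = tuple(record[field] for field in key_fields)
        active[key] = record
    return list(active.values())


# Hypothetical loads at the granularity of sales order and sales order item.
loaded = [
    {"order_id": "A1", "item": 10, "quantity": 2},
    {"order_id": "A1", "item": 20, "quantity": 1},
    {"order_id": "A1", "item": 10, "quantity": 5},  # changed version of item 10
]

active_rows = latest_by_key(loaded, key_fields=("order_id", "item"))
print(active_rows)
```

The full list of loaded records still exists, but a report reads only one current row per order item.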
CompositeProvider
This is the virtualization layer. No data is stored here as data is always generated at run-time from the sources that connect to it.
It combines data from several sources using union and/or join operations.
You can include extra calculated values, filters and aggregations.
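A virtual union can be sketched in plain Python as a function that reads its sources only when it is called. The names and amounts are hypothetical, and the sketch covers only the union case, not joins.

```python
def union_view(*providers):
    """Return a reader that combines all providers at call time; nothing is copied."""
    def read():
        for provider in providers:
            yield from provider
    return read


def total_by_product(rows):
    """Aggregate the amount per product ID."""
    totals = {}
    for row in rows:
        totals[row["product_id"]] = totals.get(row["product_id"], 0) + row["amount"]
    return totals


# Hypothetical sources connected to the view.
sales = [{"product_id": "HT-1000", "amount": 100}]
returns = [{"product_id": "HT-1000", "amount": -20}]

read_combined = union_view(sales, returns)
first = total_by_product(read_combined())

sales.append({"product_id": "HT-1000", "amount": 50})  # new data arrives in a source
second = total_by_product(read_combined())
print(first, second)
```

Because the view holds no data of its own, the second read picks up the new sales record without any reload of the view.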
BW Query
A BW Query defines which InfoObjects from the CompositeProvider should be used in a report.
It is always good practice to build a query on top of a CompositeProvider and not directly on top of DataStore Objects (advanced). This is because the relationship between the BW Query and the underlying CompositeProvider remains stable even if the underlying sources of the CompositeProvider change.
The query defines the initial layout of the report, filters, calculations, and so on.
Report
SAP BW/4HANA does not provide tools to create reports. Customers can choose their own reporting tool such as SAP Analysis for Microsoft Office or SAP Analytics Cloud.
One or more BW Queries are included in the report.
When the transactional data is modeled using master data characteristic InfoObjects, the master data can be displayed in the report.
This happens because the characteristic values in the transactional data, such as product numbers, are automatically connected to the central master data of the characteristic. See the diagram below for an example.
Shared Master Data
We learned that master data is automatically combined with transaction data. In our example, we saw that the product ID in the transaction connects to the product master data to provide additional attributes and descriptions. If we have also loaded customer master data, we have additional information about each customer, such as the country and discount percentage.
One of the main reasons we load and store master data separately from the transaction data is so we can share the common master data across all transactions where the characteristic is present.
For example, if we load the master data for product we can then share the product attributes and texts with multiple DataStore Objects where product is included in the record. See the diagram above for an example.
In terms of sequence, it is good practice to load the master data before the transaction data. This is because, during transaction data loading, we often refer to the loaded master data to check that the transaction values are valid. For example, is this a real customer number? If not, we can reject the transaction record or at least mark it to be checked.
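The validity check described above can be sketched in plain Python. The customer numbers, field names, and function name are hypothetical. The sketch only shows why the master data has to be available before the transactions arrive.

```python
# Hypothetical customer master data, loaded before any transactions.
known_customers = {"C100", "C200"}


def check_customers(transactions):
    """Split transactions into valid records and records marked to be checked."""
    valid, to_check = [], []
    for record in transactions:
        if record["customer_id"] in known_customers:
            valid.append(record)
        else:
            to_check.append(record)
    return valid, to_check


transactions = [
    {"order_id": "A1", "customer_id": "C100"},
    {"order_id": "A2", "customer_id": "C999"},  # not a known customer number
]
valid, to_check = check_customers(transactions)
print(valid, to_check)
```

If the master data were loaded afterwards, every transaction would fail this check, even those with real customer numbers.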
Consolidating and Distributing Data
In the previous section, we learned that data follows a path from the source system to the final query. The flow we described was linear. However, in reality, the data flow is more complex.
Using the various SAP BW/4HANA objects, we can build a data flow that consolidates data from multiple sources. We can also distribute data to multiple objects.
- Example of consolidation: Load data from the sales sources and from the returns source, then consolidate in SAP BW/4HANA to create a true picture of sales.
- Example of distribution: Load data from a single source containing sales for north and south, then distribute the data to separate objects and apply adjustments to the loaded data, based on different pricing rules for north and south.
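Both examples can be sketched in plain Python. The amounts and the per-region price adjustments are hypothetical and only illustrate the shape of each flow, not how it is modeled in SAP BW/4HANA.

```python
# Hypothetical sources.
sales = [{"region": "NORTH", "amount": 100}, {"region": "SOUTH", "amount": 200}]
returns = [{"region": "NORTH", "amount": 30}]

# Hypothetical pricing rule per region: a fixed adjustment added to each amount.
PRICE_ADJUSTMENT = {"NORTH": 0, "SOUTH": -10}


def consolidate(sales_rows, return_rows):
    """Consolidation: combine two sources into net sales per region."""
    net = {}
    for row in sales_rows:
        net[row["region"]] = net.get(row["region"], 0) + row["amount"]
    for row in return_rows:
        net[row["region"]] = net.get(row["region"], 0) - row["amount"]
    return net


def distribute(rows):
    """Distribution: route one source to a separate target per region, with adjustments."""
    targets = {"NORTH": [], "SOUTH": []}
    for row in rows:
        adjusted = row["amount"] + PRICE_ADJUSTMENT[row["region"]]
        targets[row["region"]].append({**row, "amount": adjusted})
    return targets


net_sales = consolidate(sales, returns)
by_region = distribute(sales)
print(net_sales)
print(by_region)
```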