Introduction
Data provisioning is a broad term that refers to the acquisition of data from a source system by a target system. We prefer the word acquisition to data loading because data can be acquired without physically loading it into the target system. In fact, with advances in technology, physically moving data around an organization is becoming less common. It's often simpler to read the data remotely.
There are many reasons why data provisioning is needed. These include:
- Extract data from business applications and load to a central data warehouse
- Provide real-time access to data sources for analytics
- Distribute data from a central system to regional systems
- Consolidate data from multiple systems into a central system
- Keep systems in sync
- Migrate data from a legacy system to a new system
In the simplest data provisioning scenarios, there are only two systems involved: the source and the target. But often there are multiple systems involved. For example, you might want to combine data from multiple source systems into a single target system. It could also go the other way: a single source system that distributes its data to multiple target systems. And, finally, we can even have a combination of both: multiple source systems consolidating data and distributing it to multiple target systems.
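The consolidation and distribution patterns described above can be sketched in a few lines of Python. This is only an illustration: the system names (`erp_eu`, `erp_us`), the field names, and the helper functions `consolidate` and `distribute` are hypothetical and are not part of any SAP product.

```python
def consolidate(sources):
    """Fan-in: merge rows from several source systems into one target list."""
    target = []
    for system_name, rows in sources.items():
        for row in rows:
            # Tag each row with its origin so the target can tell sources apart.
            target.append({**row, "source_system": system_name})
    return target


def distribute(rows, targets, route_key):
    """Fan-out: append each row to the target system named by its routing key."""
    for row in rows:
        targets[row[route_key]].append(row)
    return targets


# Hypothetical sample data standing in for two source systems.
sources = {
    "erp_eu": [{"order_id": 1, "region": "EU"}],
    "erp_us": [{"order_id": 2, "region": "US"}, {"order_id": 3, "region": "EU"}],
}

central = consolidate(sources)  # many sources -> one target
regional = distribute(central, {"EU": [], "US": []}, route_key="region")  # one -> many
```

Chaining the two calls, as in the last two lines, gives the combined scenario: several sources are consolidated centrally and the result is then distributed to several targets.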
Application or database control of data provisioning
Data provisioning can be controlled either by a standalone, specialized application or by the built-in tooling of a database. Let's consider each approach.
Application-controlled data provisioning is when a dedicated application controls the data flow. These applications provide tools to connect to data sources and data targets and to define data-flow rules that determine how data moves between systems. Examples of dedicated data provisioning applications include SAP Data Services, SAP Landscape Transformation, and SAP Datasphere.
With this application-controlled approach, the application extracts data from a source database and loads it into a target database. The extraction rules, flow logic, and loading methods are managed by the application. Think of the data provisioning application as the orchestrator of data movement between systems. In some cases, the data provisioning application extracts the source data and stores it temporarily before sending it on to the target system. This is often the case when multiple data sources need to be combined and a staging area is needed to synchronize data that might arrive at different times.
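As a rough sketch of this orchestration idea, the following Python script plays the role of the data provisioning application. It uses in-memory SQLite databases as stand-ins for the source and target systems. The table names, the `provision` function, and the normalization rule are assumptions made for illustration; real tools such as SAP Data Services work differently in detail.

```python
import sqlite3


def make_source(rows):
    """Create an in-memory source database with a small customers table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE customers (id INTEGER, name TEXT)")
    conn.executemany("INSERT INTO customers VALUES (?, ?)", rows)
    return conn


def provision(sources, target):
    """Extract from every source, stage the rows, then load them in one step."""
    staging = []  # temporary holding area owned by the application
    for system_name, conn in sources.items():
        for cust_id, name in conn.execute("SELECT id, name FROM customers"):
            # Flow logic lives in the application: normalize and tag each row.
            staging.append((system_name, cust_id, name.strip().upper()))
    with target:  # commit the whole load as a single transaction
        target.executemany("INSERT INTO customers_all VALUES (?, ?, ?)", staging)
    return len(staging)


# Hypothetical source systems and target system.
source_a = make_source([(1, " alice ")])
source_b = make_source([(1, "bob"), (2, "carol")])
target = sqlite3.connect(":memory:")
target.execute("CREATE TABLE customers_all (source TEXT, id INTEGER, name TEXT)")

loaded = provision({"crm": source_a, "shop": source_b}, target)
rows = target.execute("SELECT * FROM customers_all ORDER BY source, id").fetchall()
```

Note that neither database knows about the other. The connections, the staging area, and the transformation rule all belong to the application, which is what makes this approach application-controlled.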
One of the key reasons for using a dedicated data provisioning application is that you’re working with multiple data sources that use different technologies or come from different vendors. These dedicated applications can usually process data from many kinds of sources, for example, databases, CSV files, JSON files, and web services. Some can even connect to business applications directly. For example, SAP BW/4HANA extracts data from SAP S/4HANA at the application level and not from the database tables. In this case, the data flow logic is built at a level higher than the physical storage technology.
Now let's look at database-controlled data provisioning.
The basic requirement is that the database provides the data provisioning tools. The simplest type of data provisioning tool could be an export and import tool to move data from one database to another. But some databases, including SAP HANA, provide sophisticated tools to handle complex data provisioning scenarios such as those that require combining data, validating data, and enriching data. Using the in-built tooling of a database to manage data provisioning means you don’t have to implement separate data provisioning applications, as we described earlier. This approach supports a simpler landscape.
Working with database-provided tools is what we mean by database-controlled data provisioning. The data flow is controlled using tools that are part of the database.
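To make the contrast with the application-controlled approach concrete, here is a minimal sketch in which the target database itself reaches into the source and pulls the data, filtering it on the way. SQLite's `ATTACH DATABASE` statement is used purely as a small, runnable stand-in; it is not SAP HANA tooling, and the table names and the filter rule are assumptions.

```python
import os
import sqlite3
import tempfile

with tempfile.TemporaryDirectory() as workdir:
    source_path = os.path.join(workdir, "source.db")

    # Set up a separate source database file with some sample rows.
    src = sqlite3.connect(source_path)
    src.execute("CREATE TABLE sales (id INTEGER, amount REAL)")
    src.executemany(
        "INSERT INTO sales VALUES (?, ?)", [(1, 10.0), (2, -5.0), (3, 7.5)]
    )
    src.commit()
    src.close()

    # The target database pulls the data itself: the Python script only
    # submits SQL, and no external application holds or transforms the rows.
    tgt = sqlite3.connect(os.path.join(workdir, "target.db"))
    tgt.execute("CREATE TABLE sales_clean (id INTEGER, amount REAL)")
    tgt.execute("ATTACH DATABASE ? AS src", (source_path,))
    tgt.execute(
        "INSERT INTO sales_clean SELECT id, amount FROM src.sales WHERE amount > 0"
    )
    tgt.commit()
    tgt.execute("DETACH DATABASE src")
    loaded = tgt.execute("SELECT id, amount FROM sales_clean ORDER BY id").fetchall()
    tgt.close()
```

Here the connection to the source, the validation rule (`amount > 0`), and the load are all expressed in statements executed by the target database engine, so no separate provisioning application is needed in the landscape.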
This course covers the built-in data provisioning tooling of SAP HANA on-premise and SAP HANA Cloud.