
Data and ETL
ETL is required to turn raw process data into insights. ETL stands for Extract, Transform, Load: data is extracted from various sources, transformed into a usable format, and loaded into a target database or data warehouse.
Data Extraction
SAP Signavio makes it easy to pull information from multiple systems using built-in connectors. These connections can be set up to run automatically on a schedule, removing the need for constant manual effort.
So, what is data extraction in the context of process mining? It’s the step where you retrieve all business-relevant process data from the IT systems that support your business operations. This often includes ERP systems (such as SAP S/4HANA or SAP ECC), CRM platforms, or legacy databases.
Creating a data dump is easy. The harder part is deciding which data is required and where it is stored. Answering the questions below helps identify the relevant data to extract.
Ask yourself:
- Which process is being analyzed?
- Which IT systems are used?
- What is the time-frame?
- Which system-based activities (events) are executed in the process?
- Does every recorded activity have a timestamp?
- Are all activities tracked in the data system?
- What additional information is required for an analysis (e.g. type of product, order value, etc.)?
Data Extraction from Multiple Systems
A process can also be supported by multiple systems. In that case, start small: extract the data from one system to get your first results, then include more data in the next iteration to expand the process.
If the data is difficult to extract (for example, from external systems) or there is no unique identifier to track cases across the systems, you can combine two values, such as order value and order time, into a substitute identifier. If no such ID can be created, you can also reduce the process time-frame.
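The idea of combining two values into a substitute identifier can be sketched as follows. This is a minimal illustration, not SAP Signavio functionality: the function name, the field names, and the two sample rows are hypothetical, and the sketch assumes both systems record the order value and the order time for the same order.

```python
from datetime import datetime
from decimal import Decimal


def composite_case_id(order_value: str, order_time: str) -> str:
    """Build a substitute case ID from two fields that both systems record."""
    # Normalize both values so that formatting differences between systems
    # (e.g. "1200.5" vs. "1200.50") do not produce different keys.
    value = Decimal(order_value).quantize(Decimal("0.01"))
    time = datetime.fromisoformat(order_time).strftime("%Y%m%dT%H%M%S")
    return f"{value}|{time}"


# Hypothetical rows describing the same order in two different systems.
erp_row = {"order_value": "1200.50", "order_time": "2024-03-01T09:15:00"}
crm_row = {"order_value": "1200.5", "order_time": "2024-03-01 09:15:00"}

erp_key = composite_case_id(**erp_row)
crm_key = composite_case_id(**crm_row)
```

A key built this way is only as unique as the combined values: two different orders with the same value placed in the same second would collide, so it is worth checking for duplicates before relying on it.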
Data Transformation
Once the data is collected, SAP Signavio provides ready-to-use tools and templates to clean, organize, and format it for process analysis, usually with a database query language such as SQL. This speeds up the preparation phase and reduces the workload it normally involves.
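As an illustration of what such an SQL transformation typically does, the sketch below turns one row per order into one row per event. It is not one of SAP Signavio's templates: the `orders` table, its columns, and the event names are hypothetical, and an in-memory SQLite database stands in for the real source system.

```python
import sqlite3

# In-memory database standing in for the extracted raw data (hypothetical).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (order_id TEXT, created_at TEXT, shipped_at TEXT)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [
        ("A1", "2024-03-01 09:15:00", "2024-03-03 14:00:00"),
        ("A2", "2024-03-02 11:30:00", None),  # not shipped yet
    ],
)

# Each timestamp column becomes its own event row, keyed by the case ID.
EVENT_LOG_SQL = """
SELECT order_id AS case_id, 'Create Order' AS event_name, created_at AS event_time
FROM orders WHERE created_at IS NOT NULL
UNION ALL
SELECT order_id, 'Ship Order', shipped_at
FROM orders WHERE shipped_at IS NOT NULL
ORDER BY case_id, event_time
"""

event_log = conn.execute(EVENT_LOG_SQL).fetchall()
for row in event_log:
    print(row)
```

The `IS NOT NULL` filters keep activities that have not happened yet out of the result, so order A2 contributes only a single event.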
Data Loading
After transformation, the data flows directly into SAP Signavio Process Intelligence, where it’s ready for immediate analysis. This direct approach removes the need for separate storage or staging systems, making the workflow more efficient.
Key Requirements
At a minimum, each process mining event needs a case ID that links it to a valid case, an event name, and a timestamp. Ideally, the time-frame includes ALL records, but this can be a lot of data. Most companies limit the time-frame to a shorter period, such as one year.
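These minimum requirements can be checked before loading. The sketch below is one possible way to do it, assuming events are plain dictionaries with ISO 8601 timestamp strings; the field names `case_id`, `event_name`, and `timestamp` are illustrative and not a prescribed schema.

```python
from datetime import datetime

# Illustrative field names for the three minimum requirements.
REQUIRED_FIELDS = ("case_id", "event_name", "timestamp")


def find_invalid_events(events):
    """Return (index, reason) pairs for events that fail the minimum checks."""
    problems = []
    for i, event in enumerate(events):
        for field in REQUIRED_FIELDS:
            # Treat missing keys, None, and blank strings as missing.
            if not str(event.get(field) or "").strip():
                problems.append((i, f"missing {field}"))
        ts = event.get("timestamp")
        if ts:
            try:
                datetime.fromisoformat(ts)
            except (TypeError, ValueError):
                problems.append((i, "unparseable timestamp"))
    return problems


# Hypothetical sample: one valid event followed by three faulty ones.
sample_events = [
    {"case_id": "A1", "event_name": "Create Order", "timestamp": "2024-03-01T09:15:00"},
    {"case_id": "", "event_name": "Ship Order", "timestamp": "2024-03-03T14:00:00"},
    {"case_id": "A2", "event_name": "Create Order", "timestamp": "03/02/2024"},
    {"case_id": "A3", "event_name": "Create Order"},
]

problems = find_invalid_events(sample_events)
```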
Whenever a limited time-frame is considered, there will be incomplete cases, as some processes may not have fully executed within that period. You must then decide whether such cases should be included in the extraction, and this decision needs to be made before data extraction begins.
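One way to see how many cases are affected is to separate complete from incomplete cases in the extracted events. The sketch below assumes a case counts as complete when both a designated start event and a designated end event fall inside the time-frame; that definition, the event names, and the sample data are assumptions for illustration, and the right rule depends on the process.

```python
from collections import defaultdict


def split_by_completeness(events, start_event, end_event):
    """Separate cases that contain both the start and end event from the rest."""
    seen = defaultdict(set)
    for case_id, event_name, _timestamp in events:
        seen[case_id].add(event_name)
    complete = {c for c, names in seen.items() if {start_event, end_event} <= names}
    incomplete = set(seen) - complete
    return complete, incomplete


# Hypothetical events extracted for the calendar year 2024.
events_2024 = [
    ("A1", "Create Order", "2024-03-01T09:15:00"),
    ("A1", "Ship Order", "2024-03-03T14:00:00"),
    ("A2", "Create Order", "2024-12-30T16:00:00"),  # shipped after the time-frame
    ("A3", "Ship Order", "2024-01-02T08:00:00"),  # created before the time-frame
]

complete, incomplete = split_by_completeness(events_2024, "Create Order", "Ship Order")
```

Here only A1 is complete. A2 and A3 are cut off at the edges of the time-frame and would either be excluded or analyzed with that limitation in mind.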
Benefits of using ETL
This solution helps users work more efficiently by using standard templates and automation to make data integration and analysis faster and easier. It is also quick to get started with, because users can begin analyzing their data just a few days after setting up the system. With comprehensive analysis tools like pre-built metrics, easy-to-use dashboards, and the SIGNAL query language, users can explore their data in detail and understand how their processes are performing. Most importantly, the system helps identify problems and delays in workflows, so users can find ways to improve processes and make their operations run more smoothly.