Introducing time series models in SAP Analytics Cloud Smart Predict
After completing this lesson, you will be able to:
- Explain time series analysis in Smart Predict
Use cases for time series models
Use augmented analytics to control travel and expenses
Time series forecasting is useful for estimating future values of a measure where you have a time dimension available to help you identify a trend.
What sorts of topics can we investigate with a time series model?
You can answer questions such as:
- How will the revenue of a shop evolve over the next month?
- What are the expected sales by product and region over the coming weeks?
- How will the stock of products vary in a warehouse over the following weeks?
- How will cash flow evolve during the next quarter?
Time series analysis in Smart Predict
What is a time series?
A time series is a series of data points indexed in time order. Most commonly, a time series is a sequence taken at successive equally spaced points in time. For example, a time series could track the movement of revenue or costs over a specified period of time with data points recorded at regular intervals (e.g. weekly, monthly, quarterly or annually).
Historical values of the target variable and the corresponding dates are required when building and training a time series model. This pairing of date and target value is called the signal. The signal is analyzed by the time series forecasting model in Smart Predict. Values of other variables taken at the same dates (in the past and in the future) can be included as influencer variables for the model and are used to refine the analysis of the signal.
In other words, the signal is the variable whose values you want to explain or predict. If you want to forecast product sales for the next 6 months, for example, then Product sales is your signal variable.
The horizon is the number of predictions to be estimated in the future. This number depends directly on the size of the historical data.
- A 5:1 ratio of historical data points to predictions is a good rule of thumb for setting the horizon and getting predictions with relevant confidence intervals. For example, 100 historical cases support up to 20 predicted values of the target variable, and predicting six months ahead requires 30 months of historical data.
- Best practice:
- It is generally recommended to discard history that is too far back in time.
- With 100 historical cases, you can request 20 values or fewer. If you need more predictions, it is better to collect more historical cases.
- Using the predictive scenario settings when building your time series model, it is possible to define the time window that's used, either using all available months or restricting to a certain time period.
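The 5:1 rule of thumb above is simple arithmetic, and a short sketch can make it concrete. The function names `max_horizon` and `required_history` are hypothetical helpers written for this illustration; they are not part of Smart Predict.

```python
def max_horizon(history_size: int, ratio: int = 5) -> int:
    """Largest horizon suggested by a history-to-horizon ratio of ratio:1."""
    if history_size < 0 or ratio <= 0:
        raise ValueError("history_size must be >= 0 and ratio must be > 0")
    # Integer division: a partial step cannot be predicted.
    return history_size // ratio


def required_history(horizon: int, ratio: int = 5) -> int:
    """Number of historical data points suggested for a given horizon."""
    if horizon < 0 or ratio <= 0:
        raise ValueError("horizon must be >= 0 and ratio must be > 0")
    return horizon * ratio


# The two examples from the text.
print(max_horizon(100))      # 100 historical cases
print(required_history(6))   # six months ahead
```

With the default ratio, the first call returns 20 and the second returns 30, which matches the figures given in the lesson.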
In SAP Analytics Cloud, the historical data is automatically ordered chronologically and split into two sets:
- The first 75% of the data is used to train the time series forecasting model.
- The remaining 25% is used to select the best candidate model.
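As a rough illustration of this chronological split, the sketch below orders a small signal by date and cuts it at 75%. It is only an assumption-based example: the function `chronological_split` is hypothetical, and the exact rounding that SAP Analytics Cloud applies at the cut point is not stated in this lesson.

```python
from datetime import date


def chronological_split(rows, train_fraction=0.75):
    """Sort (date, value) pairs by date, then split them without shuffling."""
    ordered = sorted(rows, key=lambda row: row[0])
    # Assumption: round the cut point down to a whole number of rows.
    cut = int(len(ordered) * train_fraction)
    return ordered[:cut], ordered[cut:]


# Twelve monthly data points, deliberately supplied in reverse order.
signal = [(date(2023, month, 1), 100.0 + month) for month in range(12, 0, -1)]

train, validation = chronological_split(signal)
print(len(train), len(validation))
print(train[0][0], validation[0][0])
```

Because the data is sorted before it is cut, the validation set always contains the most recent observations, which is what makes it suitable for judging forecasts.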
How is the data internally partitioned to optimize the predictive model?
Smart Predict uses the training and validation datasets and performs the following steps when creating a time series model:
- From the training dataset, several trial versions of the time series model are trained.
- The best trial version of the time series model is selected.
- The selected trial version is evaluated using the validation set.
- The final predictive time series model is created.
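The general idea behind these steps can be sketched in a few lines: fit several candidates on the training data, score each one on the validation data, and keep the best. The three candidate models and the error measure below are illustrative assumptions chosen for simplicity; the lesson does not describe which algorithms or metrics Smart Predict actually uses.

```python
from statistics import mean


def naive_last(train, steps):
    """Repeat the last observed value."""
    return [train[-1]] * steps


def historical_mean(train, steps):
    """Repeat the mean of the training values."""
    return [mean(train)] * steps


def linear_drift(train, steps):
    """Extend the straight line through the first and last training values."""
    slope = (train[-1] - train[0]) / (len(train) - 1)
    return [train[-1] + slope * (i + 1) for i in range(steps)]


# Hypothetical candidate set, for illustration only.
CANDIDATES = {
    "naive_last": naive_last,
    "historical_mean": historical_mean,
    "linear_drift": linear_drift,
}


def mean_absolute_error(actual, predicted):
    return mean(abs(a - p) for a, p in zip(actual, predicted))


def select_best(train, validation):
    """Score every candidate on the validation set and return the best one."""
    scores = {
        name: mean_absolute_error(validation, forecast(train, len(validation)))
        for name, forecast in CANDIDATES.items()
    }
    best = min(scores, key=scores.get)
    return best, scores


train = [10.0, 12.0, 14.0, 16.0, 18.0, 20.0]
validation = [22.0, 24.0]
best, scores = select_best(train, validation)
print(best, scores)
```

On this perfectly linear toy series, the drift candidate reproduces the validation values exactly, so it is selected.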
Considerations when creating a time series forecasting model
There are a few considerations to take into account when creating your time series forecasting model:
- Scale of the predictions: Predictions are produced in the same unit of time as the historical data, whether that is month, week, day, hour, or minute. Therefore, if data values are recorded every month, it is not meaningful to request predictions for the next few days. If data is recorded every minute by sensors, but the minute is not relevant for the use case, then a higher unit of time such as hour should be used.
- Aggregation: Consider the aggregation of data in the unit of time required and define an aggregation function.
- For example, an aggregation function could calculate one value for the hour from the 60 values measured for each of the 60 minutes of this hour. It can be the first value, the last value, the mid-value, or a calculated value (e.g. average or a more complex formula).
- Keep the size of the aggregation in mind: a large aggregation may hide information and possibly decrease the quality of the predictions, whereas an appropriate aggregation smooths the signal when there is a lot of noise. Test and experiment to choose the best aggregation function.
- Sort the data: The historical dataset should be cleaned so that each unit of time corresponds to only one value of the target variable. Smart Predict will automatically sort the data.
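The aggregation and clean-up considerations above can be illustrated with a small data preparation sketch done outside Smart Predict. The helper `aggregate_by_hour` is hypothetical; it groups minute-level readings into hours so that each unit of time ends up with exactly one value, and it accepts the aggregation function as a parameter so that different choices can be compared.

```python
from collections import defaultdict
from datetime import datetime
from statistics import mean


def aggregate_by_hour(readings, aggregate=mean):
    """Collapse (timestamp, value) readings to one value per hour."""
    buckets = defaultdict(list)
    # Sort first so that "first" and "last" aggregations are well defined.
    for timestamp, value in sorted(readings, key=lambda r: r[0]):
        hour = timestamp.replace(minute=0, second=0, microsecond=0)
        buckets[hour].append(value)
    return [(hour, aggregate(values)) for hour, values in sorted(buckets.items())]


readings = [
    (datetime(2024, 1, 1, 9, 0), 10.0),
    (datetime(2024, 1, 1, 9, 30), 14.0),
    (datetime(2024, 1, 1, 10, 15), 20.0),
    (datetime(2024, 1, 1, 10, 45), 22.0),
]

# Average of each hour, and the last value of each hour.
hourly_mean = aggregate_by_hour(readings)
hourly_last = aggregate_by_hour(readings, aggregate=lambda values: values[-1])
print(hourly_mean)
print(hourly_last)
```

Swapping the `aggregate` argument is a quick way to experiment with the first value, the last value, an average, or a custom formula before loading the data.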