Building a time series model using SAP Analytics Cloud Smart Predict

Objectives
After completing this lesson, you will be able to:

  • Build a time series model using Smart Predict

Datasets for a time series model

Input dataset recap for building time series predictive scenarios

The training dataset contains the past observations (the history) that are used to generate the predictive model, together with the date/time at which these observations were recorded.

  • In this dataset, the values of the signal variable are known.
  • The dataset might also contain some influencer variables.
  • The past and future values of the influencer variables must be known (at least for the expected forecast horizon).

By analyzing the training dataset, Smart Predict generates the time series model.

Time series model with the settings pane open on the right-hand side.

Limitations

The training or application input dataset must not contain more than 1,000 columns.

When you apply the predictive model to an application dataset, Smart Predict generates additional columns. If the application dataset is already close to the 1,000-column limit, these extra columns can push it over the limit and block the application process.
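As a rough illustration of the column limit, the following Python sketch checks whether a dataset still has room for generated columns. The function name and the `generated_columns` parameter are hypothetical. The lesson does not state how many columns Smart Predict adds, so you must supply that number yourself.

```python
MAX_COLUMNS = 1000  # column limit stated in this lesson


def fits_column_limit(existing_columns: int, generated_columns: int) -> bool:
    """Return True if the dataset stays within the limit after columns are added.

    generated_columns is an assumed input: the lesson does not say how many
    columns Smart Predict generates when a model is applied.
    """
    return existing_columns + generated_columns <= MAX_COLUMNS


print(fits_column_limit(990, 5))  # True
print(fits_column_limit(998, 5))  # False
```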

Build and train a time series model

Settings for time series models

As with classification and regression models, you must select your data source and edit your column details. However, some settings are specific to building and training a time series model. These settings are covered below.

Predictive Goal

Target: The Target variable is the signal whose values you want to predict or explain.

Date: The Date variable is mandatory for time series models.

Regardless of the date granularity selected in the time series predictive scenarios with a dataset as the data source, every date format should include years, months and days. This means that even if a quarterly or monthly forecast is required, the date format in the dataset still needs to include days.

For example, with the YYYY-MM-DD date format, a time series predictive scenario can use any of the following date granularities:

  • Year, expressed as YYYY-01-01 where YYYY is variable (moving year).
  • Quarter or Month, expressed as YYYY-MM-01 where YYYY-MM is variable (moving month).
  • Week, expressed as YYYY-MM-DD where DD is, for instance, the first day of the week (moving week).
  • Day (calendar dates), expressed as YYYY-MM-DD where YYYY-MM-DD is variable (moving day).
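The date conventions above can be sketched in Python as a function that maps any calendar date to the full date that represents its period. This is only an illustration for preparing a dataset, not part of Smart Predict. The function name `period_start` is hypothetical, and the sketch assumes that weeks start on Monday and that a quarter is represented by the first day of its first month.

```python
from datetime import date, timedelta


def period_start(d: date, granularity: str) -> date:
    """Return the full YYYY-MM-DD date that represents the period containing d."""
    if granularity == "year":
        return date(d.year, 1, 1)
    if granularity == "quarter":
        # Assumption: a quarter is represented by the first day of its first month.
        first_month = 3 * ((d.month - 1) // 3) + 1
        return date(d.year, first_month, 1)
    if granularity == "month":
        return date(d.year, d.month, 1)
    if granularity == "week":
        # Assumption: Monday is the first day of the week.
        return d - timedelta(days=d.weekday())
    if granularity == "day":
        return d
    raise ValueError(f"Unknown granularity: {granularity}")


sample = date(2024, 8, 15)
for name in ("year", "quarter", "month", "week", "day"):
    print(name, period_start(sample, name).isoformat())
```

Every result still contains a year, a month, and a day, which matches the requirement that the date format includes days even for monthly or quarterly forecasts.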

Number of forecast periods: The number of forecasts to generate. If the input dataset contains future values for influencers, the number of forecasts must be less than or equal to the number of future values in the dataset. For example, if future values are available for the next six months, the number of forecasts requested cannot exceed six.
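The constraint on the number of forecast periods can be checked before training with a small Python sketch. It assumes the dataset is a list of rows sorted by date, that future rows have an empty target, and that an unknown influencer value is stored as `None`. The column names `sales` and `promo` and the function name are hypothetical.

```python
from typing import Dict, List, Optional

Row = Dict[str, Optional[float]]


def max_forecast_periods(rows: List[Row], target: str, influencers: List[str]) -> int:
    """Count consecutive future rows whose influencer values are all known.

    Assumes rows are sorted by date and that future rows have target None.
    """
    count = 0
    for row in rows:
        if row[target] is not None:
            continue  # historical row: the signal is known
        if any(row[name] is None for name in influencers):
            break  # an influencer is unknown: forecasts cannot go past this period
        count += 1
    return count


rows = (
    [{"sales": 100.0, "promo": 1.0}] * 3    # history
    + [{"sales": None, "promo": 0.0}] * 6   # future rows with known influencers
    + [{"sales": None, "promo": None}]      # future row with an unknown influencer
)
print(max_forecast_periods(rows, "sales", ["promo"]))  # 6
```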

The number of forecasts delivered with confidence intervals is determined as follows:

  • If the training dataset contains 12 periods or fewer, it is treated as a small dataset, and by default the number of forecasts with confidence intervals is set to 1.
  • In all other cases, the number of forecasts with confidence intervals is set to 1/5 of the training dataset size.

For example, if the training dataset contains 1,000 rows of data, Smart Predict can provide up to 200 forecasts with confidence intervals. If more than 200 forecasts are requested, the accuracy of the forecasts from the 201st onward cannot be evaluated.
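The rule for the number of forecasts with confidence intervals can be written as a short Python function. This is a sketch of the rule as described in this lesson, not Smart Predict code. The function name is hypothetical, and rounding down when the dataset size is not a multiple of 5 is an assumption, because the lesson does not say how fractions are handled.

```python
def forecasts_with_confidence_intervals(training_periods: int) -> int:
    """Number of forecasts delivered with confidence intervals, per the rule above."""
    if training_periods <= 12:
        return 1  # small dataset case
    # Assumption: round down when the size is not a multiple of 5.
    return training_periods // 5


print(forecasts_with_confidence_intervals(12))    # 1
print(forecasts_with_confidence_intervals(1000))  # 200
```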

Entity: One or more optional variables used to split the predictive model into segments. Each segment gets its own predictive model, with distinct predictions for that segment.

For example, it might be more relevant to forecast KPIs for each store and each product. If this type of prediction is useful, click the box and select the columns whose values define the segments.

Time series model with Time Series Data Source and Predictive Goals section showing.

Limitations for time series models: If the predictive model is configured for a number of forecast periods and/or entities beyond the recommended maximum limits, it is likely to create performance issues that can impact other users on the same SAP Analytics Cloud tenant.

  • The maximum number of entities is 1,000.
  • The maximum number of forecast periods (independent of the number of entities) is 500.
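A planned configuration can be compared against these two limits with a few lines of Python. The sketch below is illustrative only. The constants come from this lesson, while the function name and message wording are hypothetical.

```python
from typing import List

MAX_ENTITIES = 1000          # limit stated in this lesson
MAX_FORECAST_PERIODS = 500   # limit stated in this lesson


def limit_warnings(entity_count: int, forecast_periods: int) -> List[str]:
    """Return one warning per limit that the planned configuration exceeds."""
    warnings = []
    if entity_count > MAX_ENTITIES:
        warnings.append(
            f"{entity_count} entities exceeds the maximum of {MAX_ENTITIES}"
        )
    if forecast_periods > MAX_FORECAST_PERIODS:
        warnings.append(
            f"{forecast_periods} forecast periods exceeds the maximum of "
            f"{MAX_FORECAST_PERIODS}"
        )
    return warnings


print(limit_warnings(50, 21))     # []
print(limit_warnings(1200, 600))  # two warnings
```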

Predictive Model Training

Train using: Select which observations are used to train the time series model. You have two options:

  1. All Observations: Train the predictive model using all observations available in the dataset. Choose the date of the last observation, or define the last date.
  2. Window of Observations: Specify a restricted period of observations. Select the number of days, weeks, months, or years to be included in the observation window. Choose the date of the last observation, or define the last date (this date must be available in the dataset).

Until: You have two options:

  1. Last Observation: Let the application use the last training reference date as a basis.
  2. User-defined Date: Select a specific date (that is available in the dataset).
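The effect of the Train using and Until settings on the training data can be sketched in Python. This is not how Smart Predict is implemented. It is a minimal illustration that assumes observations are `(date, value)` pairs, that the window is counted in calendar days, and that the window includes the last date. The function and parameter names are hypothetical.

```python
from datetime import date, timedelta
from typing import List, Optional, Tuple

Observation = Tuple[date, float]


def select_training_rows(
    observations: List[Observation],
    until: Optional[date] = None,
    window_days: Optional[int] = None,
) -> List[Observation]:
    """Keep the observations that would be used for training.

    until=None mimics "Last Observation"; a date mimics "User-defined Date".
    window_days=None mimics "All Observations"; a number mimics a window.
    """
    dates = [d for d, _ in observations]
    if until is None:
        until = max(dates)
    elif until not in dates:
        raise ValueError("The user-defined date must be available in the dataset")
    selected = [(d, v) for d, v in observations if d <= until]
    if window_days is not None:
        # Assumption: the window ends on `until` and spans window_days calendar days.
        window_start = until - timedelta(days=window_days)
        selected = [(d, v) for d, v in selected if d > window_start]
    return selected


history = [(date(2024, 1, day), float(day)) for day in range(1, 11)]
print(len(select_training_rows(history)))                             # 10
print(len(select_training_rows(history, window_days=3)))              # 3
print(len(select_training_rows(history, until=date(2024, 1, 5))))     # 5
```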

Exclude As Influencer: The past and future values of the influencer variables must be known (at least for the expected forecast horizon). Select the influencer variables to be excluded when the time series forecast model is trained.

Convert Negative Forecast Values to Zero: Sets negative forecasts to 0. This is useful when negative values are not relevant for the business scenario, for example, the number of births. Because the negative values are forced to 0, this setting affects the error computation and the selection of the best predictive model.
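The following Python sketch illustrates why forcing negative forecasts to 0 changes the error computation. The mean absolute error is used here purely as an example metric. The lesson does not say which error measure Smart Predict uses, and the function names and sample values are hypothetical.

```python
from typing import List


def clamp_negative_forecasts(forecasts: List[float]) -> List[float]:
    """Replace every negative forecast with 0."""
    return [max(0.0, value) for value in forecasts]


def mean_absolute_error(actuals: List[float], forecasts: List[float]) -> float:
    """Example error metric (an assumption, chosen only for illustration)."""
    return sum(abs(a - f) for a, f in zip(actuals, forecasts)) / len(actuals)


actuals = [2.0, 0.0, 1.0]
raw = [1.5, -1.0, 0.5]
clamped = clamp_negative_forecasts(raw)

print(clamped)                                          # [1.5, 0.0, 0.5]
print(round(mean_absolute_error(actuals, raw), 3))      # 0.667
print(round(mean_absolute_error(actuals, clamped), 3))  # 0.333
```

In this made-up example the error drops after clamping, which shows how the setting can change which candidate model looks best.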

Time series model with the Predictive Model Training section showing.

Build and train a time series model

Scenario

You have created a story and selected your data but are unsure where to start. You know that you need to create insights for churn data based on the data but would like guidance. You work for a small business that wants to forecast its daily cash flow over the next 21 working days. The business has approximately 9 months' worth of historic working-day cash flow data and has created a number of influencer variables to try to improve the accuracy of the model.

The data you have been provided is as follows:

A table with variables to the left and descriptions to the right.

What skills will you develop in this practice exercise?

In this practice exercise, you will be able to perform the following tasks in SAP Analytics Cloud:

  1. Create a time series predictive scenario
  2. Select the data source for the time series model
  3. Edit the time series data source column details
  4. Set the predictive goal target, date, and time periods
  5. Train the model
