Analyzing the Results of a Time Series Model in SAP Analytics Cloud Smart Predict

Objective

After completing this lesson, you will be able to analyze the results of a time series model.

Forecast Report

Global Performance Indicators

You can assess the quality of your predictive model's performance using the Expected MAPE (mean absolute percentage error). The Expected MAPE evaluates the error made when using the predictive model to estimate the future values of the signal, whatever the horizon.

  • For each actual observed value, the predictive model calculates as many forecasted values as requested by the analyst. This is called the horizon of forecasts.
  • Each forecasted value is compared to the corresponding actual value. Then, for each possible horizon, a per-horizon MAPE can be calculated. This value is the mean of the absolute differences between actual and forecast values, expressed as a percentage of the actual values.
  • The expected MAPE is the mean of all per-horizon MAPE values that have been calculated.
    • An expected MAPE of zero indicates a perfect predictive model. The lower the expected MAPE, the better your predictive model's performance.
    • An expected MAPE of 17.63% indicates that the error made when using a forecasted value is, on average, approximately 17.6% of the actual value.
The Expected MAPE and Forecasts vs. Actual for a Time Series Model.
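
The calculation described above can be sketched in a few lines of Python. This is a minimal illustration, not Smart Predict's implementation: the function names, the layout of the forecast list, and the sample numbers are all assumptions made for the example.

```python
from statistics import mean


def per_horizon_mape(actuals, forecasts_by_origin, horizon):
    """Return the MAPE (in percent) for a single horizon.

    Assumed layout (hypothetical, for illustration only):
    forecasts_by_origin[t][h - 1] is the forecast made at time index t
    for time index t + h.
    """
    errors = []
    for t, forecasts in enumerate(forecasts_by_origin):
        target = t + horizon
        # Skip forecasts with no actual to compare against, and zero
        # actuals, for which a percentage error is undefined.
        if target < len(actuals) and actuals[target] != 0:
            actual = actuals[target]
            errors.append(abs(actual - forecasts[horizon - 1]) / abs(actual))
    return 100 * mean(errors)


def expected_mape(actuals, forecasts_by_origin, max_horizon):
    """Return the mean of the per-horizon MAPE values."""
    return mean(
        per_horizon_mape(actuals, forecasts_by_origin, h)
        for h in range(1, max_horizon + 1)
    )


# Made-up data: four actual values and two-step-ahead forecasts.
actuals = [100.0, 110.0, 120.0, 130.0]
forecasts_by_origin = [
    [99.0, 132.0],   # made at t=0 for t=1 and t=2
    [126.0, 117.0],  # made at t=1 for t=2 and t=3
    [143.0, 150.0],  # made at t=2 for t=3 (t=4 has no actual yet)
]

print(round(expected_mape(actuals, forecasts_by_origin, 2), 2))  # 9.17
```

In this made-up data, the horizon-1 MAPE is about 8.33% and the horizon-2 MAPE is 10%, so their mean is about 9.17%.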

Forecasts vs. Actual

The forecast vs. actual graph displays curves for the forecast (predicted) and actual (target) values for the time series data source. It shows the predictive model's accuracy.

The predictions are displayed at the end of the graph. For each forecasted value, the predictive model provides estimates of the minimum and maximum errors.

  • The area between the upper and lower limits of the possible errors in the predictive forecasts produced by your predictive model is called the confidence interval. This area is only displayed for predictive forecasts.
  • Outliers are values marked with a red circle on the graph.
  • The forecasting error indicator is the absolute difference between the actual and predicted values. This indicator is also called the residue. The residue abnormal threshold is set to three times the standard deviation of the residue values on an estimation (or validation) data source. The forecasted value and error limit values are listed in the forecast table for each predictive forecast.

Forecasts

The forecast table displays the following information to help you analyze the performance of your time series predictive model:

  1. Forecast: Predicted values for the predictive model over a set of known data called the validation partition.
  2. Error min and max: Minimum and maximum deviation measures of the values around the predictive forecasts. If you choose to segment your time series forecasting predictive model by entity, you will have this information for each segment.
The Forecast table with Time, Forecast, Error Max, and Error Min displayed.

Outliers

The output table displays details of the outliers. An actual signal value is considered an outlier if its corresponding forecasting error is abnormal relative to the forecasting error mean on the estimation data set.

The forecasting error indicator is the absolute difference between the actual and predicted values and is also called the residue. The abnormality threshold for residues is set to 3 times the standard deviation of the residue values in the training data set.
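
As a rough illustration of this rule, the following Python sketch computes residues and flags the values whose residue exceeds three standard deviations. The function names and the sample data are hypothetical. The sketch also assumes a population standard deviation and a direct comparison of each residue with the threshold; the exact calculation used by Smart Predict may differ.

```python
from statistics import pstdev


def residues(actuals, predictions):
    """Absolute difference between each actual and predicted value."""
    return [abs(a - p) for a, p in zip(actuals, predictions)]


def abnormal_threshold(training_residues, factor=3.0):
    """Threshold set to `factor` times the standard deviation of the residues.

    Assumption: population standard deviation (pstdev).
    """
    return factor * pstdev(training_residues)


def find_outliers(actuals, predictions, threshold):
    """Return the indexes whose residue is above the threshold."""
    return [
        i for i, r in enumerate(residues(actuals, predictions)) if r > threshold
    ]


# Made-up residues from a training data set.
training_residues = [1.0, 2.0, 1.0, 2.0, 1.0, 2.0]
threshold = abnormal_threshold(training_residues)  # 3 * 0.5 = 1.5

# Made-up actual and predicted values to check.
actuals = [10.0, 12.0, 20.0]
predictions = [10.5, 11.0, 15.0]
print(find_outliers(actuals, predictions, threshold))  # [2]
```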

Anomalies are signal values outside the zone of possible error for the predictive forecast, defined by its upper and lower limits. The signal is compared to all the predictive forecasts.

Outliers (Past) and Outliers (Future) for the Forecast report.

Explanation Report

Time Series Breakdown

The textual explanation in the time series breakdown report describes the modeling technique used to calculate the forecast. Time series breakdown can be either:

  1. A breakdown based on a decomposition modeling technique, applied when working with a time series with limited disruptions.
  2. A breakdown based on a smoothing technique, applied when working with a disrupted time series that doesn't follow a regular trend or cycle.

Both breakdowns include the following:

  • The Actual is the observed historical data.
  • The Forecast is the result of the prediction of the future.
  • Influencers represent the part of the time series impacted by the influencers, specified in the Influencers field of the predictive model settings.
  • Fluctuations represent the part of the time series detected by the predictive model that depends on past values of the time series. The report shows the influence of the last observations before the predictive forecast. Fluctuations reflect changes that are not detected at the trend and cycle level. For example, the predictive model can detect that the previous 37 values affect the actual values.
  • Residuals refer to what is left when the trends, cycles, and fluctuations have been extracted from the initial time series. Residuals are not systematic or predictable and reflect the part of the time series that Smart Predict cannot explain or model. The smaller the residuals, the better the predictive model. A good predictive model produces residual data that contains no pattern.
The Time Series Breakdown with a time series with limited disruptions showing.
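
To make the role of the residuals concrete, here is a minimal Python sketch. It assumes the components combine additively, which the lesson does not state, and all names and numbers are invented for the example.

```python
def residuals(actual, trend, cycles, fluctuations):
    """What is left of the actuals once the modeled components are removed.

    Assumption: the components add up (an additive breakdown).
    """
    return [
        a - (t + c + f)
        for a, t, c, f in zip(actual, trend, cycles, fluctuations)
    ]


# Made-up components for three time points.
actual = [102.0, 108.0, 119.0]
trend = [100.0, 110.0, 120.0]
cycles = [3.0, -3.0, 0.0]
fluctuations = [-1.0, 0.5, -0.5]

print(residuals(actual, trend, cycles, fluctuations))  # [0.0, 0.5, -0.5]
```

Small residuals with no visible pattern, as in this made-up output, are what you would hope to see from a good predictive model.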

Interpretation of Trend and Cycles

Depending on the type of time series breakdowns used, the Trend and Cycles are interpreted differently.

1. The predictive model was built by breaking down the time series into different components.

The Trend is the general direction of the time series. The report can show linear or quadratic trends.

The Cycles are the fixed-length or seasonal cycles detected in the time series.

  • Fixed-length cycles recur every N observations.
  • The recurrence of seasonal cycles is based on a calendar time unit such as day, week, month, and so on. For seasonal cycles, the report shows the recurrence of the cyclic pattern and the time granularity at which it appears. The following seasonal cycles can be detected:
    • a pattern recurring every year when observed on a half-monthly basis
    • a pattern recurring every year when observed monthly
    • a pattern recurring every year when observed on a semester basis
    • a pattern recurring every year when observed weekly
    • a pattern recurring every quarter when observed monthly
    • a pattern recurring every semester when observed monthly
    • a pattern recurring every month when observed weekly
    • a pattern recurring every year when observed daily
    • a pattern recurring every month when observed daily
    • a pattern recurring every week when observed daily
    • a pattern recurring every hour when observed on a minute basis
    • a pattern recurring every day when observed on an hourly basis
    • a pattern recurring every minute when observed on a second basis

2. The predictive model was built incrementally by smoothing the time series, with more weight given to recent observations.

The Trend is the orientation of the forecast data. It is calculated using an algorithm that applies exponential smoothing to past data over time.

The Cycles are seasonal cycles, with or without amplitude variations.

  • These cycles are calculated using an algorithm that applies an exponential smoothing technique on the past data over time. The recurrence of seasonal cycles is based on calendar time units such as days, weeks, and months.
  • The report shows the recurrence of the cyclic pattern. The following seasonal cycles can be detected:
    • a pattern recurring every semester
    • a pattern recurring every quarter
    • a pattern recurring every month
    • a pattern recurring every two weeks
    • a pattern recurring every week
    • a pattern recurring every day
    • a pattern recurring every hour
    • a pattern recurring every minute
    • a pattern recurring every second
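
The lesson does not specify which smoothing algorithm Smart Predict applies. As a generic illustration of how exponential smoothing gives more weight to recent observations, here is a sketch of simple exponential smoothing in Python. The function name, the smoothing factor alpha, and the data are assumptions made for the example.

```python
def exponential_smoothing(values, alpha):
    """Simple exponential smoothing.

    Each smoothed value blends the newest observation (weight alpha) with
    the previous smoothed value (weight 1 - alpha), so the influence of
    older observations decays over time.
    """
    smoothed = [values[0]]  # start from the first observation
    for x in values[1:]:
        smoothed.append(alpha * x + (1 - alpha) * smoothed[-1])
    return smoothed


print(exponential_smoothing([10.0, 20.0, 30.0], alpha=0.5))  # [10.0, 15.0, 22.5]
```

A larger alpha makes the smoothed series react faster to recent changes; a smaller alpha makes it smoother.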

Impact of Cycles

An Impact of Cycles graph is displayed when some cycles are detected in the forecasted time series. It provides details on how the target is impacted by cycles, including seasonal and fixed-length cycles.

The cycles are named by their recurrence (Yearly Cycle, Six Days Cycle, and so on), and each bar represents the impact of the cycle for a given period: that is, how much the cycle increases or decreases the value predicted for that period.

When a smoothing technique is used, Smart Predict always displays the last three values for each period.

Constant Amplitude Cycle

Some cycles have a constant amplitude. This means that, for a given period within the cycle, the impact of that period is the same for any occurrence of the cycle.

In such cases, only one cycle is displayed (there is only one series), as the impact in each period is identical across all occurrences of the cycle.

In the example below, the impact on the prediction for Saturday (Sat) is -85027.87, and this impact is the same every week.

Explanation Report tab showing the Impact of Cycles chart displaying the data described above.

Variable Amplitude Cycle

Some cycles repeat over time with an amplitude that can change. For a specific period of the cycle, the impact differs for each cycle. To illustrate this evolution, the impact of the last three cycles is displayed (the chart has three series).

In the following example, the cycle's amplitude increases over time.

Explanation Report tab showing the Impact of Cycles chart.

Past Target Value Contributions

The Past Target Value Contributions identify the past observations that most influence the forecast.

  • At the step of identifying the model components, Smart Predict found that previous values of the time series have an impact on the actual values.
  • The graph shows how the recent or distant past of an autoregressive component influences the target.
  • The lags are numbered with negative integers representing their distance in the past from the predictive forecast. Lag -1 is the point in the past just before the forecast. Lag -5 is five points in the past.
  • In the following example, a predictive model is developed to forecast the ozone rate for the next 12 months.
    • The graph shows how values observed in the recent or distant past influence the ozone rate.
    • It also shows which dates are the most important.
    • Smart Predict found that the 10 previous values affect the subsequent values. Therefore, the graph stops at lag -10.

Using these lags, you can analyze how previous values influence subsequent values. Here, the lags -1 and -6 are influential.

Explanation Report tab showing the Fluctuations chart.
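
The lag numbering can be illustrated with a short Python sketch. The helper names, the series, and the weights are hypothetical; in particular, the weights are not values produced by Smart Predict.

```python
def lagged_values(series, max_lag):
    """Map lags -1..-max_lag to the observation that far before the forecast."""
    return {-k: series[-k] for k in range(1, max_lag + 1)}


def autoregressive_contribution(series, weights):
    """Weighted sum of past values (hypothetical weights keyed by lag)."""
    lags = lagged_values(series, max(-lag for lag in weights))
    return sum(weight * lags[lag] for lag, weight in weights.items())


# Made-up history; the forecast is the point after the last value.
series = [3.0, 5.0, 8.0, 13.0, 21.0]

print(lagged_values(series, 3))  # {-1: 21.0, -2: 13.0, -3: 8.0}
print(autoregressive_contribution(series, {-1: 0.5, -2: 0.25}))  # 13.75
```

Lag -1 is the last observed value before the forecast, and a lag with a larger weight has more influence on the forecast.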

Relative Impact of Components

The Relative Impact of Components chart represents the weight of each component in the absolute value of the actuals across the time series. A relative impact of n% means that the considered component alone represents n% of the actuals. The relative impacts of the components add up to 100%, including the final residuals.

The component final residuals represent the portion of the actuals not explained by the time series forecasting model.

In the following example:

  • The trend alone accounts for 91% of the actuals.
  • The yearly cycle represents only small variations around the trend and accounts for only 7% of the actuals.

Component                Relative impact
Trend                    91%
Cycles (yearly cycle)    7%
Final residuals          2%

Relative impact of components report with actuals, trend, and cycle displayed.
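
One plausible way to obtain shares that add up to 100% is to compare the total absolute value of each component. The following Python sketch uses that formula as an assumption; it is not documented here as Smart Predict's exact calculation. The component values are invented so that the result reproduces the 91%, 7%, and 2% of the example.

```python
def relative_impacts(components):
    """Share (in percent) of each component's total absolute value.

    Assumed formula: sum of |component| divided by the sum over all
    components, so the shares add up to 100%.
    """
    totals = {
        name: sum(abs(v) for v in values)
        for name, values in components.items()
    }
    grand_total = sum(totals.values())
    return {name: 100 * total / grand_total for name, total in totals.items()}


# Made-up component values for two time points.
components = {
    "Trend": [91.0, 91.0],
    "Cycles (yearly cycle)": [7.0, -7.0],
    "Final residuals": [2.0, -2.0],
}

print(relative_impacts(components))
# {'Trend': 91.0, 'Cycles (yearly cycle)': 7.0, 'Final residuals': 2.0}
```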

Target Statistics

These signal statistics (minimum, maximum, mean, and standard deviation) are provided for both the training and validation data sets. In this example, the Validation data set information is displayed. To change it, use the dropdown menu.

Explanation report for a time series model with validation dataset target statistics information displayed.
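
If you want to cross-check these statistics outside of Smart Predict, they can be computed with Python's standard library. This is a minimal sketch on made-up values. It assumes a population standard deviation; the lesson does not say which variant the report uses.

```python
from statistics import mean, pstdev


def target_statistics(values):
    """Minimum, maximum, mean, and standard deviation of the target."""
    return {
        "minimum": min(values),
        "maximum": max(values),
        "mean": mean(values),
        # Assumption: population standard deviation.
        "standard_deviation": pstdev(values),
    }


# Made-up target values for one data set.
validation_values = [2, 4, 4, 4, 5, 5, 7, 9]
print(target_statistics(validation_values))
# {'minimum': 2, 'maximum': 9, 'mean': 5, 'standard_deviation': 2.0}
```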

Segmented Time Series Model

When you use an entity to segment a time series, the reports for each entity are available.

  • If there are fewer than 20 segments, then these reports are available automatically following training.

    To view the report for a segment, select the segment from the top-left dropdown list, which is available in both tabs. A segment is a combination of column values that appear together, for example, Product X, Store Y.

  • If a predictive model contains more than 20 segments, the reports for each segment are not available automatically following training; they are accessed on demand. This is to ensure that time is not wasted creating reports for predictive models with a large number of segments when not all of those reports are required at once.
    • Select the segment, and after a slight delay, the reports are created and made available.
    • Once a report is available, you can then access it immediately at any time.
Reports by entity for a segmented time series model.

Next Steps

Once you have analyzed your predictive model, you have two choices:

1. The predictive model's performance is satisfactory. If you are happy with your model's performance, you can apply it.

2. The predictive model's performance must be improved. If you are unhappy with the model's performance, you must experiment with the settings.

To experiment with the settings, you can:

  • Duplicate the predictive model.
    1. Open the predictive scenario, which contains the predictive model to be duplicated.
    2. Open the Predictive Model list.
    3. Click the predictive model to be duplicated, and in the menu, select Copy. Copying creates an exact (untrained) copy of the original predictive model.
    4. Change the settings of the copy and train it. Then compare the two versions and find the best one.
  • Update the settings of the existing model and retrain it. In this case, the previous version is deleted.