Analyzing the results of a regression model in SAP Analytics Cloud Smart Predict

Objectives
After completing this lesson, you will be able to:

  • Explain how results of a regression model are analyzed in Smart Predict

Overview report

Global performance indicators

Root Mean Squared Error (RMSE)  measures the average difference between values predicted by the predictive model and the actual values. It provides an estimation of how well the predictive model is able to predict the target value (accuracy).

  • The lower the value of the RMSE, the better the predictive model is.
  • A perfect predictive model (a hypothetic predictive model that would always predict the exact expected value) would have an RMSE value of 0.
  • The RMSE has the advantage of representing the amount of error in the same unit as the predicted column making it easy to interpret. For example, when predicting an amount in dollars, the RMSE can be interpreted as the amount of error in dollars.
  • To improve the Root Mean Squared Error, add more influencer variables in the training dataset.
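As an illustration, RMSE can be computed by hand for a handful of predicted amounts. The values below are hypothetical dollar amounts, not taken from any Smart Predict dataset:

```python
import math

def rmse(actual, predicted):
    """Root Mean Squared Error: the square root of the mean squared
    difference between actual and predicted values."""
    n = len(actual)
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / n)

# Hypothetical dollar amounts: each prediction is off by 10 dollars,
# so the RMSE can be read directly as "10 dollars of error on average".
actual = [100.0, 150.0, 200.0]
predicted = [110.0, 140.0, 190.0]
print(rmse(actual, predicted))  # 10.0
```

A perfect predictive model, where every prediction equals the actual value, yields an RMSE of 0.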

Prediction confidence indicates the capacity of your predictive model to achieve the same degree of accuracy when you apply it to a new dataset, which has the same characteristics as the training dataset.

  • Prediction confidence takes a value between 0% and 100%.
  • The prediction confidence value should be as close as possible to 100%.
  • To improve prediction confidence, you can add new rows to your dataset, for example.
Overview report containing the Global Performance Indicators chart.

Target statistics

The Target Statistics table gives descriptive statistics for the target variable in each dataset.

  • Minimum: Minimum value found in the dataset for the target variable
  • Maximum: Maximum value found in the dataset for the target variable
  • Mean: Mean of the target variable
  • Standard deviation: Measure of the extent to which the target values are spread around their average
Overview report containing the Target Statistics chart.
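The same four statistics can be reproduced outside Smart Predict with Python's standard library. The target values below are hypothetical:

```python
import statistics

# Hypothetical target values for a continuous target variable
target = [12.0, 18.0, 25.0, 31.0, 14.0]

stats = {
    "Minimum": min(target),
    "Maximum": max(target),
    "Mean": statistics.mean(target),
    # Population standard deviation: spread of values around the mean
    "Standard deviation": statistics.pstdev(target),
}
```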

Influencer contributions

The chart in the Overview report shows how the top five influencers impact the target. The full list, including all remaining influencers, appears in the Influencer Contributions report.

Predicted vs. actual

This chart compares the prediction accuracy of the predictive model to a perfect predictive model and shows the predictive model errors.

During the training phase, predictions are calculated using the training dataset. To build the graph, Smart Predict groups these predictions into 20 segments (or bins), with each segment representing roughly 5% of the population.

For each of these segments, some basic statistics are computed:

  • Segment mean is the mean of the predictions on each segment.
  • Target mean is the mean of the actual target values.
  • Target variance is the variance of this target within each segment.
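The segmentation above can be sketched with a simple sort-and-slice scheme. Smart Predict's exact binning is internal to the tool, so this is an illustrative approximation, and the statistics are computed exactly as listed:

```python
import statistics

def segment_stats(predictions, actuals, n_segments=20):
    """Sort rows by prediction, split them into roughly equal segments,
    and compute the per-segment statistics described above."""
    rows = sorted(zip(predictions, actuals))
    size = max(1, len(rows) // n_segments)
    segments = [rows[i:i + size] for i in range(0, len(rows), size)]
    result = []
    for seg in segments:
        preds = [p for p, _ in seg]
        targs = [t for _, t in seg]
        result.append({
            "segment_mean": statistics.mean(preds),      # mean prediction
            "target_mean": statistics.mean(targs),       # mean actual value
            "target_variance": statistics.pvariance(targs),
        })
    return result
```

Each resulting dictionary corresponds to one dot on the chart: the segment mean gives the X coordinate and the target mean gives the Y coordinate.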

By default, the following curves are displayed:

  1. The Validation - Actual curve shows the actual target values as a function of the predictions.
  2. The hypothetical Perfect Model curve shows the case where every prediction equals the actual value.
  3. The Validation - Error Min and Validation - Error Max curves show the range for the actual target values.

The area between the Error Max and Error Min curves represents the possible deviation of the current predictive model: it is the confidence interval around the predictions.

For each curve, a dot on the graph corresponds to the segment mean on the X-axis, and the target mean on the Y-axis.

Overview report containing the Predicted vs Actual chart.

Interpreting the chart: Three main conclusions can be drawn from the Predicted vs. Actual chart, depending on the relative positions of the curves on the graph.

  1. If the validation and perfect model curves don't match:
    • The predictive model isn't accurate.
    • Confirm this conclusion by checking the prediction confidence indicators.
    • If the indicators confirm the predictive model isn't reliable, improve accuracy by adding more rows or variables to the input dataset.
  2. If the validation and perfect model curves match closely:
    • The predictive model is accurate.
    • Confirm this conclusion by checking the prediction confidence indicators.
    • If the indicators confirm its reliability, trust the predictive model and use its predictions.
  3. If the validation and perfect model curves match closely overall but diverge significantly on a segment:
    • The predictive model is accurate, but its performance is hindered in the diverging segment.
    • Confirm this conclusion by checking the prediction confidence indicators.
    • If the indicators confirm its overall reliability, improve that segment's predictions by adding more rows or variables to the input dataset.

Influencer contributions report

Influencer contributions

This chart shows how the influencers impact the target.

  • All influencers are displayed by default, sorted by decreasing importance.
  • The influencer contributions show the relative importance of each variable used in the predictive model.
  • Only the contributive influencers are displayed in the reports; variables with no contribution are hidden. The most contributive influencers are those that best explain the target.
  • The sum of their contributions equals 100%.
The Influencer Contributions report showing the Influencer Contributions chart for a regression model.
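The two properties above (non-contributive influencers hidden, contributions summing to 100%) can be illustrated with a short sketch. The influencer names and raw importance values below are hypothetical, not Smart Predict's internal computation:

```python
# Hypothetical raw importance scores produced by a trained model
raw_importances = {"Price": 0.42, "Region": 0.28, "Season": 0.14, "Color": 0.0}

# Hide variables with no contribution, then express the rest
# as percentages sorted by decreasing importance
contributive = {k: v for k, v in raw_importances.items() if v > 0}
total = sum(contributive.values())
contributions = {
    k: 100 * v / total
    for k, v in sorted(contributive.items(), key=lambda kv: kv[1], reverse=True)
}
```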

Grouped category influence

The Grouped Category Influence chart analyzes the influence of the different categories of a variable on the target:

  • If the influence value is positive, a high target value is more likely.
  • If the influence value is negative, a high target value is less likely.
  • The influence of a category can be positive or negative.

Grouped category influence shows groupings of categories of an influencer, where all the categories in a group share the same influence on the target variable.

  • The X-axis represents the influence of the grouped categories on the target variable.
  • The Y-axis represents the grouped categories.

The length and direction of a bar shows whether the category has more or fewer high value observations compared to the mean:

  • A positive bar (influence on target greater than 0) indicates that the category contains more observations belonging to the high target values compared to the mean (calculated on the entire validation dataset).
  • A value of 0 means that the category has no specific influence on the target.
  • A negative bar (influence on target less than 0) indicates that the category contains fewer observations belonging to the high target values compared to the mean (calculated on the entire validation dataset).
The Influencer Contributions report showing the Grouped Category Influence chart for a regression model.
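The sign convention above can be illustrated by comparing each category's target mean against the overall mean. This is a simplified illustration rather than Smart Predict's internal computation, and all category names and values are hypothetical:

```python
import statistics

# Hypothetical validation rows: (category of an influencer, target value)
rows = [("North", 120.0), ("North", 130.0), ("South", 80.0),
        ("South", 90.0), ("East", 100.0), ("East", 110.0)]

# Mean of the target over the entire validation dataset
overall_mean = statistics.mean(t for _, t in rows)

influence = {}
for cat in {c for c, _ in rows}:
    cat_mean = statistics.mean(t for c, t in rows if c == cat)
    # > 0: category has more high-target observations than the mean
    # < 0: category has fewer high-target observations than the mean
    influence[cat] = cat_mean - overall_mean
```

Here "North" gets a positive bar, "South" a negative one, and "East" sits at 0, meaning it has no specific influence on the target.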

Grouped category statistics

The grouped category statistics chart shows the details of how the grouped categories influence the target variable over the selected dataset.

  • The X-axis displays the target mean: for a continuous target, the target mean is the average of the target variable for the category in the dataset.
  • The Y-axis displays the frequency of the grouped category in the selected dataset.
The Influencer Contributions report showing the Grouped Category Statistics chart for a regression model.

Next steps

Once you have analyzed your predictive model, you have two choices:

1. The predictive model's performance is satisfactory. If you are happy with the model's performance, apply the model and use its predictions.

2. The predictive model's performance needs to be improved. If you are not happy with the model's performance, experiment with the settings.

To do this, you can either:

  • Duplicate the predictive model.
    1. Open the predictive scenario, which contains the predictive model to be duplicated.
    2. Open the Predictive Model list.
    3. Click the predictive model to be duplicated, and select Copy from the menu. This creates an exact (untrained) copy of the original version of the predictive model.
    4. Compare the two versions and find the best one.
  • Update the settings of the existing model and retrain it. This will erase the previous version.
