### Global Performance Indicators

**Root Mean Squared Error (RMSE)** is the square root of the average squared difference between the values predicted by the predictive model and the actual values. It estimates how well the predictive model predicts the target value (accuracy).

- The lower the value of the RMSE, the better the predictive model is.
- A perfect predictive model (a hypothetical predictive model that would always predict the exact expected value) would have an RMSE value of 0.
- The RMSE has the advantage of expressing the amount of error in the same unit as the predicted column, making it easy to interpret. For example, when predicting an amount in dollars, the RMSE can be interpreted as the amount of error in dollars.
- To improve the Root Mean Squared Error, add more influencer variables in the training data set.
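As an illustration, RMSE can be computed by hand. The following is a minimal Python sketch of the formula, not Smart Predict's internal implementation:

```python
import math

def rmse(actual, predicted):
    """Root Mean Squared Error: the square root of the mean squared difference."""
    squared_errors = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))

# Example: predicting amounts in dollars, so the result reads as dollars of error.
actual = [100.0, 150.0, 200.0]
predicted = [110.0, 140.0, 205.0]
print(rmse(actual, predicted))  # about 8.66 dollars of error
```

Because the errors are squared before averaging, large individual errors weigh more heavily than small ones.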

**Prediction confidence** indicates the predictive model's capacity to achieve the same accuracy when you apply it to a new data set with the same characteristics as the training data set.

- Prediction confidence takes a value between 0% and 100%.
- The prediction confidence value should be as close as possible to 100%.
- To improve prediction confidence, you can add new rows to your data set, for example.

### Target Statistics

This table gives descriptive statistics for the target variable in each data set.

| Name | Meaning |
|---|---|
| Minimum | The minimum value found in the data set for the target variable. |
| Maximum | The maximum value found in the data set for the target variable. |
| Mean | The mean of the target variable. |
| Standard deviation | The measure of the extent to which the target values are spread around their average. |
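These statistics can be reproduced with Python's standard library. The function below is an illustrative sketch; the use of the population standard deviation (rather than the sample standard deviation) is an assumption:

```python
import statistics

def target_statistics(values):
    """Descriptive statistics for a target variable, as in the table above."""
    return {
        "minimum": min(values),
        "maximum": max(values),
        "mean": statistics.mean(values),
        # Population standard deviation; an assumption, Smart Predict may
        # use the sample standard deviation instead.
        "standard_deviation": statistics.pstdev(values),
    }

target = [12.0, 15.0, 9.0, 18.0, 21.0]
print(target_statistics(target))
```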

### Influencer Contributions

The chart in the Overview report shows how the top five influencers impact the target. The same chart appears in the Influencer Contributions report, which additionally covers all remaining influencers.

### Predicted vs. Actual

This chart compares the prediction accuracy of the predictive model to a perfect predictive model and shows the predictive model errors.

During the training phase, predictions are calculated by using the training data set. To build the graph, Smart Predict groups these predictions into twenty segments (or bins), with each segment representing roughly 5% of the population.

For each of these segments, some basic statistics are computed:

- **Segment mean** is the mean of the predictions within each segment.
- **Target mean** is the mean of the actual target values within each segment.
- **Target variance** is the variance of the actual target values within each segment.
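The segmentation step described above can be sketched as follows. The equal-size binning by sorted prediction and the per-segment statistics are an illustrative reconstruction, not Smart Predict's exact algorithm:

```python
import statistics

def segment_stats(predictions, actuals, n_segments=20):
    """Sort rows by prediction, split them into equal segments
    (roughly 5% of the population each), and compute the per-segment
    statistics used by the Predicted vs. Actual chart."""
    rows = sorted(zip(predictions, actuals))
    size = max(1, len(rows) // n_segments)
    stats = []
    for i in range(0, len(rows), size):
        segment = rows[i:i + size]
        preds = [p for p, _ in segment]
        targets = [t for _, t in segment]
        stats.append({
            "segment_mean": statistics.mean(preds),
            "target_mean": statistics.mean(targets),
            "target_variance": statistics.pvariance(targets),
        })
    return stats
```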

By default, the following curves are displayed:

- The **Validation - Actual** curve shows the actual target values as a function of the predictions.
- The hypothetical **Perfect Model** curve represents a model whose predictions are all equal to the actual values.
- The **Validation - Error Min** and **Validation - Error Max** curves show the range for the actual target values.

The area between the Error Max and Error Min curves represents the possible deviation of the current predictive model: it is the confidence interval around the predictions.

For each curve, a dot on the graph corresponds to the segment mean on the X-axis, and the target mean on the Y-axis.
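Putting the pieces together, the per-segment points plotted on the chart might be assembled as below. The error band of plus or minus one standard deviation is an illustrative assumption, since the exact formula behind the Error Min/Max curves is not given here:

```python
import statistics

def curve_points(predictions, actuals, n_segments=20):
    """Per-segment points for the Predicted vs. Actual chart:
    X = segment mean of the predictions, Y = mean of the actual targets,
    with an error band around Y."""
    rows = sorted(zip(predictions, actuals))
    size = max(1, len(rows) // n_segments)
    points = []
    for i in range(0, len(rows), size):
        segment = rows[i:i + size]
        x = statistics.mean(p for p, _ in segment)
        targets = [t for _, t in segment]
        y = statistics.mean(targets)
        # +/- one standard deviation: an illustrative choice, not
        # Smart Predict's documented formula for Error Min/Max.
        spread = statistics.pstdev(targets)
        points.append({"x": x, "actual": y,
                       "error_min": y - spread, "error_max": y + spread})
    return points
```

On the Perfect Model curve, every point would satisfy `actual == x`.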

**Interpreting the chart:** Three main conclusions can be made using the Predicted vs. Actual chart, depending on the relative positions of the curves on the graph.

- If the validation and perfect model curves do not match:
  - The predictive model is not accurate.
  - To confirm this conclusion, check the prediction confidence indicators.
  - If the indicators confirm that the predictive model is unreliable, improve accuracy by adding more rows or variables to the input data set.

- If the validation and perfect model curves match closely:
  - The predictive model is accurate.
  - To confirm this conclusion, check the prediction confidence indicators.
  - If the indicators confirm its reliability, you can trust the predictive model and use its predictions.

- If the validation and perfect model curves match closely overall but diverge significantly on a segment:
  - The predictive model is accurate, but its performance is hindered in the diverging segment.
  - To confirm this conclusion, check the prediction confidence indicators.
  - If the indicators confirm its overall reliability, improve that segment's predictions by adding more rows or variables to the input data set.