Building a Regression Model in SAP Analytics Cloud Smart Predict

Objective

After completing this lesson, you will be able to identify the key steps required to build a regression model.

Data Sets for a Regression Model

Input Data Set Recap for Building Regression Scenarios

The training data source contains the past observations that are used to generate the regression model. In this data set, the values of the target variable, which is the variable corresponding to your business issue, are known.

By analyzing the training data set, Smart Predict generates a regression model that explains and predicts the target variable, based on the variables identified as influencers.

Once the regression model is trained, it can be applied to an application data set, generating the predicted values of the target in the output data set.
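
The train-then-apply flow described above can be sketched in a few lines of Python. This is an illustration only, under stated assumptions: Smart Predict is configured through its user interface and selects its own algorithm, so the ordinary least squares fit below is a stand-in, and the function names, the SQFT column, and the sample figures are all hypothetical.

```python
# Illustrative stand-in only: Smart Predict trains its model automatically.
# Here a one-influencer least-squares line plays the role of the model.

def train(training_rows, influencer, target):
    """Fit target = intercept + slope * influencer on rows where the target is known."""
    xs = [row[influencer] for row in training_rows]
    ys = [row[target] for row in training_rows]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return {"influencer": influencer, "slope": slope, "intercept": intercept}


def apply_model(model, application_rows, output_name):
    """Return the output data set: each application row plus a predicted target."""
    output = []
    for row in application_rows:
        predicted = model["intercept"] + model["slope"] * row[model["influencer"]]
        output.append({**row, output_name: predicted})
    return output


# Hypothetical training data: past observations with a known PRICE.
training = [
    {"SQFT": 1000, "PRICE": 200000},
    {"SQFT": 1500, "PRICE": 300000},
    {"SQFT": 2000, "PRICE": 400000},
]

# Hypothetical application data: PRICE is not known yet.
application = [{"SQFT": 1200}, {"SQFT": 1800}]

model = train(training, influencer="SQFT", target="PRICE")
predictions = apply_model(model, application, output_name="PREDICTED_PRICE")
for row in predictions:
    print(row)
```

The training rows carry the known target, the application rows do not, and the output rows are the application rows with one added predicted column, which mirrors the three data sets named in this section.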

[Figure: A regression model screen with the Settings pane open on the right side. Key settings: training data source, target, and exclude as influencers.]

Build and Train a Regression Model

Business Scenario

Before a bank offers a home loan, it is critical to conduct a home appraisal. The appraisal confirms the validity of the sales price of the property for the bank.

You have been asked to build and train a regression model that estimates the price of a home based on factors such as the square footage of the home, the square footage of the lot, the number of bedrooms, the number of bathrooms, and the location. You have been provided data for the following variables:

Variable Data

Variable      Description
ID            Unique ID for each customer.
Date          Sale date of the estate. This variable is to be excluded for this regression model.
PRICE         The property's sales price in dollars. Price is the variable that you are trying to predict.
BEDROOMS      Number of bedrooms above basement level.
BATHROOMS     Number of bathrooms above basement level.
SQFT_LIVI     Above ground living area in square feet.
SQFT_LOT      Square foot measurement for all floors.
FLOORS        Number of floors.
WATERFRONT    Is there a waterfront on the property? Yes = 1, No = 0.
Condition     Overall condition rating.
GRADE         Overall grade.
YR_BUILT      Original property construction date (year).
ZIPCODE       House location.
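
The roles implied by the variable list (one target, one excluded variable, and the rest as candidate influencers) can be written down as a short Python sketch. This is not how Smart Predict is configured, since the tool sets these roles in its Settings pane; the variable names `columns`, `target`, `excluded`, and `influencers` are hypothetical and exist only for illustration.

```python
# Column names copied from the variable list above.
columns = [
    "ID", "Date", "PRICE", "BEDROOMS", "BATHROOMS", "SQFT_LIVI", "SQFT_LOT",
    "FLOORS", "WATERFRONT", "Condition", "GRADE", "YR_BUILT", "ZIPCODE",
]

target = "PRICE"      # the variable the model is trained to predict
excluded = {"Date"}   # the only variable the lesson names as excluded

# Every remaining column is a candidate influencer.
influencers = [c for c in columns if c != target and c not in excluded]
print(influencers)
```

The lesson names only Date as excluded, so ID remains a candidate here; whether an identifier column should also be excluded is a modeling decision this sketch does not make.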

In this practice exercise, you will:

  1. Build a regression model.
  2. Train the regression model.
  3. Verify the output from the regression model.
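
For the verification step, one common check for any regression model is to compare predicted values with actual values on rows where the actual value is known, for example with the root mean square error (RMSE). The sketch below assumes such a comparison is done outside the tool; the `rmse` helper and the sample prices are hypothetical.

```python
import math


def rmse(actual, predicted):
    """Root mean square error between two equally long lists of numbers."""
    if not actual or len(actual) != len(predicted):
        raise ValueError("actual and predicted must be non-empty and the same length")
    squared_errors = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Hypothetical sale prices and the corresponding model predictions, in dollars.
actual_prices = [200000, 300000, 400000]
predicted_prices = [210000, 290000, 400000]

error = rmse(actual_prices, predicted_prices)
print(f"RMSE: {error:.2f} dollars")
```

A lower RMSE means the predictions sit closer to the known prices; the value is in the same unit as the target, which makes it easy to judge against typical home prices.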
