The SAP HANA PAL Unified Classification function [1] provides an efficient way to train classification models with enhanced features such as:
- Seamless switching between classification algorithms
- Automatic dataset partitioning
- Built-in model evaluation procedures
- Support for additional evaluation metrics
For this task, the Hybrid Gradient Boosting Tree (HGBT) algorithm is selected by setting the 'func' parameter to 'HybridGradientBoostingTree'. After fitting, the training time (the time taken to fit the model to the training dataset) is displayed.
# Train the classifier model using PAL HybridGradientBoostingTree
from hana_ml.algorithms.pal.unified_classification import UnifiedClassification

# Initialize the model object
hgbc = UnifiedClassification(func='HybridGradientBoostingTree',
                             n_estimators=101, split_threshold=0.1,
                             learning_rate=0.1, max_depth=6,
                             split_method='histogram', max_bin_num=256, feature_grouping=True,
                             tolerant_iter_num=5,
                             resampling_method='cv', fold_num=5, ref_metric=['auc'],
                             evaluation_metric='error_rate')

# Execute the training of the model
# key='EMPLOYEE_ID' is not passed; the ID column is dropped from the training data instead
hgbc.fit(data=df_train.drop('EMPLOYEE_ID'),
         label='FLIGHT_RISK',
         partition_method='stratified', stratified_column='FLIGHT_RISK', training_percent=0.8,
         ntiles=20,
         build_report=True)

# Show the time taken by the fit
display(hgbc.runtime)
1.82820272445678
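
To make the partition_method='stratified' and training_percent=0.8 arguments more concrete, here is a minimal pure-Python sketch of a stratified split. It is only an illustration of the idea, not PAL's implementation: the function name stratified_split, the seed, and the toy data are hypothetical and do not come from hana_ml or the dataset used here.

```python
import random
from collections import defaultdict


def stratified_split(rows, label_key, training_percent=0.8, seed=42):
    """Split rows into (train, validation), keeping the label ratio in both parts.

    Hypothetical helper for illustration; not part of hana_ml or PAL.
    """
    # Group the rows by their label value
    by_label = defaultdict(list)
    for row in rows:
        by_label[row[label_key]].append(row)

    rng = random.Random(seed)
    train, validation = [], []
    for group in by_label.values():
        # Shuffle within each label group, then cut at the training percentage
        shuffled = group[:]
        rng.shuffle(shuffled)
        cut = round(len(shuffled) * training_percent)
        train.extend(shuffled[:cut])
        validation.extend(shuffled[cut:])
    return train, validation


# Toy data (assumed values): 100 employees, 20 of them labelled as flight risk
rows = [{"EMPLOYEE_ID": i, "FLIGHT_RISK": "Yes" if i < 20 else "No"}
        for i in range(100)]
train, validation = stratified_split(rows, "FLIGHT_RISK", training_percent=0.8)

print(len(train), len(validation))                       # 80 20
print(sum(r["FLIGHT_RISK"] == "Yes" for r in train))     # 16
```

Because the cut is made separately inside each label group, both parts keep the 20% share of 'Yes' rows. A plain random split of a small or imbalanced dataset does not guarantee this.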