The Employee Churn dataset is freely available and can be downloaded from [1]. Once loaded, it can be explored using SAP HANA DataFrames, which provide a high-performance, columnar, in-memory data structure for efficient data manipulation and analysis within SAP HANA Cloud.
To begin, the dataset is read into a pandas DataFrame using the read_csv() function, and the first five rows are displayed for an initial overview:
import pandas as pd

df_data = pd.read_csv("./Emp_Churn_Train.csv", sep=",")
df_data.head(5)
Overview
ITEM_NUMBER | EMPLOYEE_ID | AGE | AGE_GROUP10 | AGE_GROUP5 | GENERATION | CRITICAL_JOB_ROLE | RISK_OF_LOSS | IMPACT_OF_LOSS | FUTURE_LEADER | GENDER | ... | CURRENT_COUNTRY | CURCOUNTRYLAT | CURCOUNTRYLON | PROMOTION_WITHIN_LAST_3_YEARS | CHANGED_POSITION_WITHIN_LAST_2_YEARS | CHANGE_IN_PERFORMANCE_RATING | FUNCTIONALAREACHANGETYPE | JOBLEVELCHANGETYPE | HEADS | FLIGHT_RISK |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | 10032 | 33 | (25-35] | (30-35] | Generation Y | Critical | High | High | No Future Leader | Female | ... | USA | 39.78373 | -100.445882 | No Promotion | No Change | 0 - Not available | No change | No change | 1 | No |
1 | 10033 | 43 | (35-45] | (40-45] | Generation X | Critical | Low | High | No Future Leader | Female | ... | Germany | 51.08342 | 10.423447 | No Promotion | No Change | 0 - Not available | No change | No change | 1 | No |
2 | 10034 | 33 | (25-35] | (30-35] | Generation Y | Critical | Medium | High | No Future Leader | Female | ... | USA | 39.78373 | -100.445882 | No Promotion | No Change | 0 - Not available | External Hire | External Hire | 1 | No |
3 | 10035 | 33 | (25-35] | (30-35] | Generation Y | Critical | High | High | No Future Leader | Male | ... | USA | 39.78373 | -100.445882 | No Promotion | No Change | 0 - Not available | No change | No change | 1 | No |
4 | 10036 | 33 | (25-35] | (30-35] | Generation Y | Critical | Low | Low | No Future Leader | Male | ... | USA | 39.78373 | -100.445882 | No Promotion | No Change | 0 - Not available | No change | No change | 1 | Yes |
5 rows x 43 columns
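Beyond head(), a few more pandas calls give a quick feel for the data. The following is a minimal sketch, not part of the original walkthrough: to stay self-contained it rebuilds the five previewed rows by hand with only a subset of the 43 columns, and the variable names (sample, flight_risk_counts, mean_age_by_country) are chosen here purely for illustration. With the real file, the same calls could be applied to df_data.

```python
import pandas as pd

# Assumption: a hand-built stand-in for df_data, using the five rows shown
# in the preview and only six of the 43 columns.
sample = pd.DataFrame(
    {
        "EMPLOYEE_ID": [10032, 10033, 10034, 10035, 10036],
        "AGE": [33, 43, 33, 33, 33],
        "GENERATION": [
            "Generation Y",
            "Generation X",
            "Generation Y",
            "Generation Y",
            "Generation Y",
        ],
        "RISK_OF_LOSS": ["High", "Low", "Medium", "High", "Low"],
        "CURRENT_COUNTRY": ["USA", "Germany", "USA", "USA", "USA"],
        "FLIGHT_RISK": ["No", "No", "No", "No", "Yes"],
    }
)

# Number of rows and columns in the frame.
n_rows, n_cols = sample.shape

# Distribution of the target column, FLIGHT_RISK.
flight_risk_counts = sample["FLIGHT_RISK"].value_counts()

# Average age per country, as a simple grouped aggregate.
mean_age_by_country = sample.groupby("CURRENT_COUNTRY")["AGE"].mean()

print(n_rows, n_cols)
print(flight_risk_counts)
print(mean_age_by_country)
```

Checking the distribution of the target column early is useful in a churn scenario, because a strong imbalance between the "Yes" and "No" classes affects how a later classifier should be trained and evaluated.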
Additional materials related to the Employee Churn scenario can be found in the same folder [2].