IBM-HR-Analytics-Employee-Attrition-and-Performance[1].pptx

konteu 6 views 9 slides Oct 18, 2025

About This Presentation

IBM HR Analytics: Employee Attrition & Performance
Exploring key factors influencing employee attrition and performance using data analytics and machine learning


Slide Content

IBM HR Analytics: Employee Attrition & Performance
Exploring key factors influencing employee attrition and performance using data analytics and machine learning.

Presentation Overview
This presentation will cover the systematic approach to analyzing HR data, from initial setup to predictive modeling.
1. Introduction & Setup: Importing libraries and loading the dataset.
2. Data Preparation: Cleaning and preprocessing the raw data.
3. Exploratory Analysis: Uncovering key insights from the data.
4. Predictive Modeling: Building and evaluating a machine learning model.

Setting the Stage: Data Preparation
Our analysis begins with importing essential libraries and loading the dataset, followed by crucial data cleaning steps.

Importing Libraries

    import pandas as pd
    import numpy as np
    import matplotlib.pyplot as plt
    import seaborn as sns
    from sklearn.preprocessing import LabelEncoder
    from sklearn.model_selection import train_test_split
    from sklearn.ensemble import RandomForestClassifier
    from sklearn.metrics import accuracy_score, classification_report
    import warnings
    warnings.filterwarnings('ignore')

Loading & Previewing Data

    df = pd.read_csv('/content/WA_Fn-UseC_-HR-Employee-Attrition.csv')
    df.head()

The dataset contains 1470 entries across 35 columns, including employee demographics, job roles, and performance metrics.

Data Cleaning & Preprocessing Techniques
Before analysis, we ensure data quality by checking for missing values, duplicates, and dropping irrelevant columns.
1. Data Overview: The dataset has 1470 entries and 35 columns, with no missing values or duplicated rows, ensuring data integrity.
2. Column Removal: Irrelevant columns like 'EmployeeCount', 'Over18', 'StandardHours', and 'EmployeeNumber' were dropped to streamline the dataset.
This systematic approach prepares the data for meaningful exploratory analysis and predictive modeling.
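
The checks described on this slide can be sketched roughly as follows. This is a minimal illustration on a small invented DataFrame, not the deck's actual notebook code. The toy values are made up, and the name `df_model` for the cleaned frame is an assumption borrowed from the modeling slide.

```python
import pandas as pd

# Toy stand-in for the HR dataset; every value here is invented for illustration.
df = pd.DataFrame({
    "Age": [41, 49, 37, 33],
    "Attrition": ["Yes", "No", "Yes", "No"],
    "EmployeeCount": [1, 1, 1, 1],
    "Over18": ["Y", "Y", "Y", "Y"],
    "StandardHours": [80, 80, 80, 80],
    "EmployeeNumber": [1, 2, 4, 5],
})

# Data-quality checks named on the slide: missing values and duplicated rows.
missing_total = int(df.isnull().sum().sum())
duplicate_rows = int(df.duplicated().sum())

# Columns with a single distinct value carry no signal for a model.
constant_cols = [c for c in df.columns if df[c].nunique() == 1]

# Drop the four columns listed on the slide; errors="ignore" tolerates absent ones.
drop_cols = ["EmployeeCount", "Over18", "StandardHours", "EmployeeNumber"]
df_model = df.drop(columns=drop_cols, errors="ignore")

print("missing:", missing_total, "duplicates:", duplicate_rows)
print("constant columns:", constant_cols)
print("remaining columns:", list(df_model.columns))
```

In this toy frame, three of the dropped columns are constant and 'EmployeeNumber' is a row identifier, which is one plausible reading of why the slide calls them irrelevant.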

Exploratory Data Analysis (EDA) Techniques
We delve into the data to uncover patterns and relationships, focusing on attrition rates and key influencing factors.
Attrition Rate: The overall attrition rate is 16.12%, indicating a significant portion of employees leaving the company.
Average Tenure: The average employee tenure is 7.01 years, providing context for employee retention.
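
As a rough sketch of how two headline figures like these can be computed with pandas, the example below uses a small invented sample. The slide does not say which column the tenure figure comes from, so `YearsAtCompany` is an assumption here. The numbers printed are those of the toy data, not the 16.12% and 7.01 reported above.

```python
import pandas as pd

# Invented sample; the real dataset has 1470 rows.
toy = pd.DataFrame({
    "Attrition": ["Yes", "No", "No", "No", "Yes", "No", "No", "No"],
    "YearsAtCompany": [1, 10, 6, 8, 2, 7, 9, 5],  # assumed tenure column
})

# Share of rows labelled "Yes", expressed as a percentage.
attrition_rate = (toy["Attrition"] == "Yes").mean() * 100

# Mean tenure in years.
avg_tenure = toy["YearsAtCompany"].mean()

print(f"Attrition rate: {attrition_rate:.2f}%")
print(f"Average tenure: {avg_tenure:.2f} years")
```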

Key Factors Influencing Attrition
Our EDA reveals several critical factors correlated with higher attrition rates, guiding our predictive model development.
Overtime Work: Employees working overtime show a higher tendency to leave the company.
Monthly Income: Lower monthly income is often correlated with increased attrition.
Department: Most employees are in Research & Development, with varying attrition by department.
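
Comparisons of this kind are usually made with grouped summaries. The sketch below shows one way to do that, assuming the column names `OverTime`, `MonthlyIncome`, and `Department`. The rows are invented, and `Left` is a hypothetical helper column. It illustrates the technique only and does not reproduce the deck's charts.

```python
import pandas as pd

# Invented rows for illustration only.
toy = pd.DataFrame({
    "OverTime": ["Yes", "Yes", "Yes", "No", "No", "No", "No", "No"],
    "Department": [
        "Research & Development", "Sales", "Research & Development",
        "Research & Development", "Research & Development", "Sales",
        "Research & Development", "Human Resources",
    ],
    "MonthlyIncome": [2000, 2500, 5000, 6000, 4000, 3000, 7000, 5500],
    "Attrition": ["Yes", "Yes", "No", "No", "No", "Yes", "No", "No"],
})

# Hypothetical helper column: 1 if the employee left, else 0.
toy["Left"] = (toy["Attrition"] == "Yes").astype(int)

# Attrition rate within each overtime group.
rate_by_overtime = toy.groupby("OverTime")["Left"].mean()

# Median income of leavers versus stayers.
median_income = toy.groupby("Attrition")["MonthlyIncome"].median()

# Headcount and attrition rate per department.
dept_summary = toy.groupby("Department")["Left"].agg(["size", "mean"])

print(rate_by_overtime)
print(median_income)
print(dept_summary)
```

Grouped rates show association only. They do not show that overtime or income causes attrition.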

Machine Learning Model: Predicting Attrition
We built a Random Forest Classifier to predict employee attrition, leveraging encoded categorical features and a train-test split.

Encoding & Splitting
Categorical features were encoded using LabelEncoder. The dataset was split into training (70%) and testing (30%) sets.

    label_cols = ['Attrition', 'BusinessTravel', 'Department', 'EducationField',
                  'Gender', 'JobRole', 'MaritalStatus', 'OverTime']
    le = LabelEncoder()
    for col in label_cols:
        df_model[col] = le.fit_transform(df_model[col])
    X = df_model.drop('Attrition', axis=1)
    y = df_model['Attrition']
    X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)

Model Training
A Random Forest Classifier with 100 estimators was trained on the prepared data.

    clf = RandomForestClassifier(n_estimators=100, random_state=42)
    clf.fit(X_train, y_train)
    y_pred = clf.predict(X_test)
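
One possible variation on this pipeline is sketched below on synthetic data. It differs from the slide in two ways, both of which are suggestions rather than the deck's method. First, it keeps one `LabelEncoder` per column in a hypothetical `encoders` dict, because a single reused encoder retains only the mapping of the last column it was fitted on. Second, it passes `stratify=y` so both splits keep the same class balance. The data is randomly generated, so the predictions carry no meaning.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import LabelEncoder

# Synthetic stand-in for the cleaned frame; values are random, not real HR data.
rng = np.random.default_rng(42)
n = 200
df_model = pd.DataFrame({
    "OverTime": rng.choice(["Yes", "No"], size=n),
    "Department": rng.choice(
        ["Sales", "Research & Development", "Human Resources"], size=n
    ),
    "MonthlyIncome": rng.integers(1000, 20000, size=n),
    "Attrition": rng.choice(["Yes", "No"], size=n, p=[0.2, 0.8]),
})

# One encoder per column, so each mapping can be inverted later.
encoders = {}
for col in ["Attrition", "OverTime", "Department"]:
    enc = LabelEncoder()
    df_model[col] = enc.fit_transform(df_model[col])
    encoders[col] = enc

X = df_model.drop("Attrition", axis=1)
y = df_model["Attrition"]

# stratify=y is an addition to the slide's call; it preserves the class ratio.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=42, stratify=y
)

clf = RandomForestClassifier(n_estimators=100, random_state=42)
clf.fit(X_train, y_train)
y_pred = clf.predict(X_test)

# Map numeric predictions back to the original "Yes"/"No" labels.
decoded = encoders["Attrition"].inverse_transform(y_pred)
print(list(encoders["Attrition"].classes_))
```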

Model Evaluation & Feature Importance
The model achieved an accuracy of 86.17%, with key features identified for attrition prediction.

Model Performance
The model is strong at identifying non-attriting employees (class 0 recall 0.98) but finds few of the employees who actually left (class 1 recall 0.10). Because 380 of the 441 test rows are non-attriting, the 86.17% accuracy equals the score of always predicting "no attrition" (380/441), so accuracy alone overstates the model's predictive capability.

    Model Accuracy: 0.8616780045351474
    Classification Report:
                  precision    recall  f1-score   support

               0       0.87      0.98      0.92       380
               1       0.50      0.10      0.16        61

        accuracy                           0.86       441
       macro avg       0.69      0.54      0.54       441
    weighted avg       0.82      0.86      0.82       441

Top Features
Feature importance analysis highlights the most influential factors in predicting attrition.
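
The report's figures can be cross-checked with a little arithmetic. The confusion-matrix counts below are not shown on the slide. They are an inference: the only integers consistent with supports of 380 and 61, class 1 precision 0.50 and recall 0.10, and an accuracy of 380/441. Treat them as a reconstruction under that assumption.

```python
# Counts inferred from the rounded classification report (an assumption,
# not taken from the slide): true negatives, false positives,
# false negatives, true positives.
tn, fp, fn, tp = 374, 6, 55, 6

total = tn + fp + fn + tp

# Overall accuracy: correct predictions over all test rows.
accuracy = (tp + tn) / total

# Metrics for class 1 (employees who left).
precision_1 = tp / (tp + fp)
recall_1 = tp / (tp + fn)
f1_1 = 2 * precision_1 * recall_1 / (precision_1 + recall_1)

# Metrics for class 0 (employees who stayed).
precision_0 = tn / (tn + fn)
recall_0 = tn / (tn + fp)

# Accuracy of a trivial model that always predicts class 0.
baseline_accuracy = (tn + fp) / total

print(f"accuracy={accuracy:.4f} baseline={baseline_accuracy:.4f}")
print(f"class 1: precision={precision_1:.2f} recall={recall_1:.2f} f1={f1_1:.2f}")
print(f"class 0: precision={precision_0:.2f} recall={recall_0:.2f}")
```

Under these counts the model flags only 6 of the 61 leavers in the test set. That makes recall and F1 for class 1 more informative than overall accuracy when the goal is to spot attrition.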

Conclusion & Next Steps
Our analysis provides actionable insights into employee attrition, paving the way for targeted HR strategies.

Key Takeaways
- Attrition rate: 16.12%
- Average tenure: 7.01 years
- Overtime and lower income correlate with higher attrition.
- Model accuracy: 86.17%

Recommendations
- Address overtime issues.
- Review compensation structures.
- Implement targeted retention programs.

Future Work
- Explore additional data sources.
- Develop more sophisticated models.
- Integrate real-time attrition monitoring.