This case study aims to identify patterns that indicate whether a client will have difficulty paying their instalments. Such patterns can inform actions such as denying the loan, reducing the loan amount, or lending to risky applicants at a higher interest rate, while ensuring that consumers capable of repaying the loan are not rejected. Identifying such applicants using EDA is the aim of this case study.
The company wants to understand the driving factors (or driver variables) behind loan default, i.e. the variables which are strong indicators of default. The company can utilize this knowledge for its portfolio and risk assessment.
Added: Jun 16, 2024
Slides: 29 pages
Slide Content
Exploratory Data Analysis: Loan Defaulter Segmentation
ASSIGNMENT
Presented by AMOL KORE
BUSINESS OBJECTIVE
This case study aims to identify patterns that indicate whether a client will have difficulty paying their instalments. Such patterns can inform actions such as denying the loan, reducing the loan amount, or lending to risky applicants at a higher interest rate, while ensuring that consumers capable of repaying the loan are not rejected. Identifying such applicants using EDA is the aim of this case study.
The company wants to understand the driving factors (or driver variables) behind loan default, i.e. the variables which are strong indicators of default. The company can utilize this knowledge for its portfolio and risk assessment.
Problem Statement
The data we analyzed contains information about each loan application at the time of applying for the loan. It covers two scenarios:
- Client with payment difficulties: he/she was more than X days late on at least one of the first Y instalments of the loan in our sample (coded as target = 1 in our analysis).
- All other cases: the payments were made on time (coded as target = 0 in our analysis).
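As a minimal sketch of how the two scenarios translate into numbers, the snippet below measures the share of each class. The column name TARGET and the toy values are assumptions for illustration; the real application file is not reproduced here.

```python
import pandas as pd

# Toy stand-in for the application data (assumed values, not the real file).
# TARGET follows the slide's coding: 1 = payment difficulties, 0 = all other cases.
applications = pd.DataFrame({"TARGET": [0, 0, 0, 1, 0, 0, 1, 0, 0, 0]})

# Share of each class among all applications.
class_share = applications["TARGET"].value_counts(normalize=True)

# For a 0/1 column, the mean is the overall default rate.
default_rate = applications["TARGET"].mean()

print(class_share)
print(f"default rate: {default_rate:.1%}")
```

Checking the class balance first helps put the per-segment default rates on the later slides in context.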
ANALYSIS & STEPS TAKEN
1. Data Sourcing (already provided in assignment)
2. Data Loading and Data Cleaning:
   - Fixing the rows and columns
   - Imputing and removing missing columns
   - Handling outliers
3. Univariate Analysis:
   - Categorical-Numerical Analysis
   - Numerical-Numerical Analysis
   - Categorical-Categorical Analysis
4. Bivariate and Multivariate Analysis:
   - Numeric-Numeric Analysis
   - Correlation
   - Numerical-Categorical Analysis
   - Categorical-Categorical Analysis
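The cleaning step could look roughly like the sketch below. The slides do not state the thresholds or imputation rules that were used, so the 40% missing-value cutoff, the median/mode imputation, the 1.5 * IQR capping rule, the helper name `clean` and the column MOSTLY_EMPTY are all assumptions made for illustration.

```python
import numpy as np
import pandas as pd

def clean(df, cap_cols, max_missing=0.4):
    """Drop sparse columns, impute remaining gaps, and cap outliers in cap_cols."""
    # Remove columns whose missing fraction exceeds the (assumed) threshold.
    keep = df.columns[df.isna().mean() <= max_missing]
    out = df[keep].copy()
    for col in out.columns:
        if pd.api.types.is_numeric_dtype(out[col]):
            # Numeric columns: impute with the median.
            out[col] = out[col].fillna(out[col].median())
        else:
            # Categorical columns: impute with the most frequent value.
            out[col] = out[col].fillna(out[col].mode().iloc[0])
    for col in cap_cols:
        # Cap only the listed amount columns at the 1.5 * IQR fences, so that
        # 0/1 columns such as TARGET are never altered.
        q1, q3 = out[col].quantile([0.25, 0.75])
        iqr = q3 - q1
        out[col] = out[col].clip(q1 - 1.5 * iqr, q3 + 1.5 * iqr)
    return out

# Toy data (assumed values).
raw = pd.DataFrame({
    "AMT_CREDIT": [100.0, 120.0, 110.0, np.nan, 130.0, 10000.0],
    "NAME_TYPE_SUITE": ["Unaccompanied", None, "Family",
                        "Unaccompanied", "Unaccompanied", "Family"],
    "MOSTLY_EMPTY": [np.nan, np.nan, np.nan, np.nan, 1.0, np.nan],
    "TARGET": [0, 0, 1, 0, 0, 0],
})
cleaned = clean(raw, cap_cols=["AMT_CREDIT"])
print(cleaned)
```

Capping is restricted to explicitly listed columns because an IQR rule applied to a mostly-zero indicator column would collapse every value to zero.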
CODE_GENDER: Most of the loans have been taken by females. The default rate for females is only ~7%, which is lower (safer) than for males.
NAME_CONTRACT_TYPE: Most customers have taken cash loans. Customers who have taken cash loans are less likely to default.
NAME_INCOME_TYPE: The safest segments are working, commercial associates and pensioners.
NAME_FAMILY_STATUS: Married people are safe to target; their default rate is 8%.
NAME_HOUSING_TYPE: People having a house/apartment are safe to lend to, with a default rate of ~8%.
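The per-category figures on these slides are default rates within each level of a categorical variable. A minimal sketch of that calculation follows; the helper name `default_rate_by`, the TARGET column name and the toy values are assumptions, and the toy numbers do not reproduce the slide's results.

```python
import pandas as pd

def default_rate_by(df, col, target="TARGET"):
    """Return loan count and default rate (%) for each level of a categorical column."""
    summary = df.groupby(col)[target].agg(loans="count", default_rate="mean")
    summary["default_rate"] = (summary["default_rate"] * 100).round(1)
    # Safest segments (lowest default rate) come first.
    return summary.sort_values("default_rate")

# Toy data (assumed values).
toy = pd.DataFrame({
    "CODE_GENDER": ["F"] * 6 + ["M"] * 4,
    "TARGET": [0, 0, 0, 0, 0, 1, 0, 0, 1, 1],
})
gender_summary = default_rate_by(toy, "CODE_GENDER")
print(gender_summary)
```

Reporting the loan count next to the rate matters: a low default rate in a level with very few loans is weak evidence that the segment is safe.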
[Figure: correlation matrix heatmap of the FLAG columns]
[Figure: correlation matrix heatmap of the EXT columns]
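The matrix behind such a heatmap can be computed as in the sketch below, which selects columns by name prefix and correlates them together with the target. The helper name `correlation_block` and the column names FLAG_A and FLAG_B are hypothetical; the slides do not list the actual FLAG or EXT columns.

```python
import pandas as pd

def correlation_block(df, prefix, target="TARGET"):
    """Pearson correlations among columns starting with `prefix`, plus the target."""
    cols = [c for c in df.columns if c.startswith(prefix)]
    return df[cols + [target]].corr()

# Toy data with hypothetical flag columns (assumed values).
toy = pd.DataFrame({
    "FLAG_A": [1, 0, 1, 0, 1, 0],
    "FLAG_B": [0, 1, 0, 1, 0, 1],
    "TARGET": [1, 0, 1, 0, 0, 0],
})
flag_corr = correlation_block(toy, "FLAG")
print(flag_corr.round(2))
```

The resulting square DataFrame can be passed to any plotting library's heatmap function. Columns that barely correlate with the target are candidates for dropping.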
NAME_EDUCATION_TYPE: Higher education is the safest segment to lend to, with a default rate of less than 5%.
ORGANIZATION_TYPE: Transport type 3 has the highest default rate. Others, Business Entity Type 3 and Self Employed are good to go, with a default rate of around 10%.
NAME_TYPE_SUITE: Unaccompanied people have taken most of the loans, and their default rate is ~8.5%, which is still acceptable.
OCCUPATION_TYPE: Low-skill Laborers and drivers are the highest defaulters. Accountants default less often. Core staff, Managers and Laborers are safer to target, with default rates of roughly 7.5% to 10%.
Univariate analysis of numeric variables
- Most loans were given for goods priced between 0 and 1 million.
- Most loans had a credit amount between 0 and 1 million.
- Most customers pay an annuity between 0 and 50K.
- Most customers have an income between 0 and 1 million.
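Statements such as "most loans fall between 0 and 1 million" can be checked by binning the numeric column and looking at the share of loans per band. The band edges and labels below are assumptions chosen for illustration, as are the toy values.

```python
import pandas as pd

# Hypothetical band edges; the slides do not state the bins that were plotted.
edges = [0, 250_000, 500_000, 1_000_000, float("inf")]
labels = ["0-250K", "250K-500K", "500K-1M", ">1M"]

# Toy data (assumed values).
toy = pd.DataFrame(
    {"AMT_CREDIT": [120_000, 300_000, 450_000, 800_000, 950_000, 1_600_000]}
)

# Assign each loan to a band (intervals are closed on the right by default).
toy["CREDIT_BAND"] = pd.cut(toy["AMT_CREDIT"], bins=edges, labels=labels)

# Share of loans per band, in band order.
band_share = toy["CREDIT_BAND"].value_counts(normalize=True).sort_index()

# Share of loans with a credit amount of at most 1 million.
share_up_to_1m = (toy["AMT_CREDIT"] <= 1_000_000).mean()

print(band_share)
print(f"share up to 1 million: {share_up_to_1m:.1%}")
```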
Bivariate analysis: AMT_CREDIT and AMT_GOODS_PRICE are linearly correlated. As AMT_CREDIT increases, the number of defaulters decreases.
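A linear relationship between two amount columns can be quantified with the Pearson correlation coefficient. The sketch below uses toy values in which the goods price is exactly 90% of the credit amount, an assumption made only so that the result is easy to verify.

```python
import pandas as pd

# Toy data (assumed values): goods price is a fixed fraction of the credit amount.
toy = pd.DataFrame({"AMT_CREDIT": [200_000.0, 450_000.0, 600_000.0, 900_000.0, 1_300_000.0]})
toy["AMT_GOODS_PRICE"] = toy["AMT_CREDIT"] * 0.9

# Pearson correlation; values near +1 indicate a strong positive linear relationship.
r = toy["AMT_CREDIT"].corr(toy["AMT_GOODS_PRICE"])
print(f"Pearson r = {r:.3f}")
```

When two variables are this strongly correlated, one of them usually carries most of the information for segmentation purposes.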
Bivariate analysis: People with an income of 1 million or less are more likely to take loans; among them, those taking loans of less than 1.5 million could turn out to be defaulters. We can target customers with income below 1 million and a loan amount greater than 1.5 million.
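A two-way comparison like this one can be laid out as a table of default rates by income band and credit band. In the sketch below, the income column name AMT_INCOME_TOTAL is an assumption (the slides do not name it), and the toy values were arranged by hand to illustrate the layout, not to reproduce the slide's findings.

```python
import numpy as np
import pandas as pd

# Toy data (assumed values).
toy = pd.DataFrame({
    "AMT_INCOME_TOTAL": [0.3e6, 0.4e6, 0.6e6, 0.8e6, 1.2e6, 1.5e6, 0.5e6, 0.9e6],
    "AMT_CREDIT":       [0.5e6, 1.0e6, 1.6e6, 2.0e6, 0.7e6, 1.8e6, 1.2e6, 1.7e6],
    "TARGET":           [1,     0,     0,     0,     0,     1,     1,     0],
})

# Split each variable at the thresholds mentioned on the slide.
toy["INCOME_BAND"] = np.where(toy["AMT_INCOME_TOTAL"] <= 1e6, "income<=1M", "income>1M")
toy["CREDIT_BAND"] = np.where(toy["AMT_CREDIT"] <= 1.5e6, "credit<=1.5M", "credit>1.5M")

# Default rate in each income/credit cell (rows: income band, columns: credit band).
rate_table = toy.groupby(["INCOME_BAND", "CREDIT_BAND"])["TARGET"].mean().unstack()
print(rate_table.round(2))
```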
Bivariate analysis: People with 1 to fewer than 5 children are safer to lend to.
Bivariate analysis: People who can pay an annuity of 100K are more likely to get a loan, up to just under 2 million (safer segment).
Analysis on merged data: In previous applications, customers mostly applied for loans for repairs, and the same purpose has the highest number of cancellations.
Analysis on merged data: Of the applications that were previously cancelled or refused, 80-90% are repayers in the current data.
Analysis on merged data: Previously unused offers now account for the highest number of defaulters, despite involving high-income-band customers.
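The merged-data findings relate the status of a customer's previous application to the outcome of the current one. A rough sketch of that join follows. The key SK_ID_CURR, the column NAME_CONTRACT_STATUS and its status labels are assumptions not named on the slides, and the toy rows are constructed for illustration only.

```python
import pandas as pd

# Toy current applications (assumed values): one row per customer.
current = pd.DataFrame({
    "SK_ID_CURR": [1, 2, 3, 4, 5],
    "TARGET": [0, 0, 1, 0, 1],
})

# Toy previous applications (assumed values): a customer may appear several times.
previous = pd.DataFrame({
    "SK_ID_CURR": [1, 1, 2, 3, 4, 5],
    "NAME_CONTRACT_STATUS": ["Refused", "Approved", "Canceled",
                             "Unused offer", "Refused", "Unused offer"],
})

# Attach the current outcome to every previous application of the same customer.
merged = previous.merge(current, on="SK_ID_CURR", how="inner")

# Row-normalized table: for each previous status, the share of repayers (0)
# and defaulters (1) in the current data.
status_vs_target = pd.crosstab(
    merged["NAME_CONTRACT_STATUS"], merged["TARGET"], normalize="index"
)
print(status_vs_target)
```

Because a customer with several previous applications contributes several rows after the merge, the shares are per previous application, not per customer.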
Final Conclusion/Insights
The bank should target customers:
- having low income, i.e. below 1 million
- working in the Others, Business Entity Type 3 or Self Employed organization types
- working as Accountants, Core staff, Managers or Laborers
- having a house/apartment, married, and with no more than 5 children
- highly educated
- preferably female
- unaccompanied people can be safer: their default rate is ~8.5%
Amount segment recommended
- The credit amount should not be more than 1 million.
- The annuity can be set at 50K (depending on eligibility).
- The income bracket could be below 1 million.
- 80-90% of the customers whose previous applications were cancelled or refused are repayers. The bank can analyze these segments and consider giving them loans.
Precautions
- Organization type Transport type 3 should be avoided.
- Low-skill Laborers and drivers should be avoided.
- Previously unused offers to high-income customers should be avoided.
THANK YOU FOR LISTENING!