Sol ppt Exploratory data analysis loan defaulter

AmolKore4 25 views 29 slides Jun 16, 2024



Slide Content

Exploratory Data Analysis: Loan Defaulter Segmentation (Assignment)
Presented by Amol Kore

BUSINESS OBJECTIVE
This case study aims to identify patterns that indicate whether a client has difficulty paying their instalments. These patterns can be used to take actions such as denying the loan, reducing the loan amount, or lending to risky applicants at a higher interest rate, while ensuring that consumers capable of repaying the loan are not rejected. Identifying such applicants using EDA is the aim of this case study.
The company wants to understand the driving factors (driver variables) behind loan default, i.e. the variables that are strong indicators of default. The company can use this knowledge for its portfolio and risk assessment.

Problem Statement
The data we analyzed contains information about each loan application at the time of applying for the loan. It covers two scenarios:
1. Clients with payment difficulties: the client had a late payment of more than X days on at least one of the first Y instalments of the loan in our sample (target = 1 in our analysis).
2. All other cases: the payments were made on time (target = 0 in our analysis).

ANALYSIS & STEPS TAKEN
1. Data sourcing (already provided in the assignment)
2. Data loading and data cleaning:
   Fixing the rows and columns
   Imputing or removing columns with missing values
   Handling outliers
3. Univariate analysis:
   Categorical-Numerical analysis
   Numerical-Numerical analysis
   Categorical-Categorical analysis
4. Bivariate and multivariate analysis:
   Numeric-Numeric analysis
   Correlation
   Numerical-Categorical analysis
   Categorical-Categorical analysis
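The cleaning step above could be sketched in pandas roughly as follows. This is only an illustration, not the code behind the slides: the 40% missing-value threshold, the median/mode imputation, the 1.5 * IQR capping rule, the function name `clean_applications` and the toy rows are all assumptions.

```python
import numpy as np
import pandas as pd


def clean_applications(df, max_missing=0.40, cap_cols=("AMT_INCOME_TOTAL",)):
    """Drop sparse columns, impute the rest, and cap outliers in chosen columns."""
    # Drop columns whose share of missing values exceeds the (assumed) threshold.
    keep = df.columns[df.isna().mean() <= max_missing]
    out = df[keep].copy()

    for col in out.columns:
        if pd.api.types.is_numeric_dtype(out[col]):
            # Numeric column: impute with the median, which is robust to outliers.
            out[col] = out[col].fillna(out[col].median())
        else:
            # Categorical column: impute with the most frequent value.
            out[col] = out[col].fillna(out[col].mode().iloc[0])

    # Cap only the explicitly listed columns at the 1.5 * IQR fences.
    # Binary flags are left alone, since their IQR is often zero.
    for col in cap_cols:
        if col in out.columns:
            q1, q3 = out[col].quantile([0.25, 0.75])
            iqr = q3 - q1
            out[col] = out[col].clip(q1 - 1.5 * iqr, q3 + 1.5 * iqr)
    return out


# Invented toy rows, for illustration only.
raw = pd.DataFrame({
    "TARGET": [0, 1, 0, 0, 1, 0],
    "AMT_INCOME_TOTAL": [100.0, 120.0, np.nan, 110.0, 130.0, 10000.0],
    "OCCUPATION_TYPE": ["Drivers", None, "Drivers", "Managers", "Drivers", "Managers"],
    "MOSTLY_EMPTY": [np.nan, np.nan, np.nan, 1.0, np.nan, np.nan],
})
clean = clean_applications(raw)
print(clean)
```

Capping is restricted to named columns on purpose: applying an IQR rule blindly to every numeric column would flatten indicator columns and the target itself.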

CODE_GENDER
Most of the loans were taken by female applicants.
The default rate for females is only ~7%, which is lower (safer) than for males.

NAME_CONTRACT_TYPE
Most of the customers took cash loans.
Customers who took cash loans are less likely to default.

NAME_INCOME_TYPE
The safest segments are working clients, commercial associates and pensioners.

NAME_FAMILY_STATUS
Married people are safe to target; their default rate is 8%.

NAME_HOUSING_TYPE
People living in a house/apartment are safe to lend to, with a default rate of ~8%.

Correlation matrix: heatmap of the FLAG columns

Correlation matrix: heatmap of the EXT columns
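The matrices behind heatmaps like these can be computed with pandas before any plotting. The sketch below is an assumption about how that might look, not the author's code: the helper name `prefix_correlations`, the `FLAG_A`/`FLAG_B` columns and the toy values are hypothetical.

```python
import pandas as pd


def prefix_correlations(df, prefix, target="TARGET"):
    """Pearson correlation matrix of all numeric columns sharing a prefix, plus the target."""
    cols = [c for c in df.columns if c.startswith(prefix)]
    # Keep numeric columns only; text-coded flags (e.g. "Y"/"N") would need encoding first.
    numeric = df[cols + [target]].select_dtypes("number")
    return numeric.corr()


# Invented toy data: FLAG_A mirrors the target, FLAG_B is its exact opposite.
toy = pd.DataFrame({
    "TARGET": [0, 1, 0, 1, 0, 0],
    "FLAG_A": [0, 1, 0, 1, 0, 0],
    "FLAG_B": [1, 0, 1, 0, 1, 1],
})
flag_corr = prefix_correlations(toy, "FLAG_")
print(flag_corr.round(2))
```

The resulting square DataFrame is the usual input for a heatmap function such as `seaborn.heatmap`; the same helper could be called with the prefix `"EXT_"` for the second matrix.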

NAME_EDUCATION_TYPE
Applicants with higher education are the safest segment to lend to, with a default rate of less than 5%.

ORGANIZATION_TYPE
Transport type 3 has the highest default rate.
Others, Business Entity Type 3 and Self-employed are acceptable, with default rates of around 10%.

NAME_TYPE_SUITE
Unaccompanied people took most of the loans, and their default rate of ~8.5% is still acceptable.

OCCUPATION_TYPE
Low-skill Laborers and Drivers have the highest default rates.
Accountants default less often.
Core staff, Managers and Laborers are safer to target, with default rates of roughly 7.5% to 10%.
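Per-category default rates like the ones quoted on the categorical slides can be obtained with a single group-by. The following is a minimal sketch under the assumption that the target column is named `TARGET` and coded 0/1; the helper name `default_rate_by` and the toy rows are made up, and the resulting numbers do not reproduce the slides' figures.

```python
import pandas as pd


def default_rate_by(df, column, target="TARGET"):
    """Application count and default rate for each level of a categorical column."""
    summary = (
        df.groupby(column)[target]
        # With a 0/1 target, the mean is the share of defaulters.
        .agg(applications="count", default_rate="mean")
        .sort_values("default_rate")
    )
    summary["default_rate_pct"] = (summary["default_rate"] * 100).round(1)
    return summary


# Invented toy rows, for illustration only.
toy = pd.DataFrame({
    "CODE_GENDER": ["F", "F", "F", "F", "M", "M"],
    "TARGET": [0, 0, 0, 1, 0, 1],
})
by_gender = default_rate_by(toy, "CODE_GENDER")
print(by_gender)
```

Reporting the application count next to the rate helps avoid over-reading a low or high default rate that rests on only a handful of rows.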

Univariate analysis of numeric variables
Most of the loans were given for a goods price between 0 and 1 million.
Most of the loans were given for a credit amount between 0 and 1 million.
Most of the customers pay an annuity between 0 and 50K.
Most of the customers have an income between 0 and 1 million.
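Statements such as "most loans fall between 0 and 1 million" can be checked by binning the amount column and looking at the share of rows per band. This sketch assumes the band edges shown in the code; the helper name `share_by_band` and the toy amounts are hypothetical.

```python
import numpy as np
import pandas as pd


def share_by_band(series, edges, labels):
    """Share of observations falling into each value band."""
    bands = pd.cut(series, bins=edges, labels=labels, include_lowest=True)
    # sort=False keeps the bands in their natural order instead of sorting by frequency.
    return bands.value_counts(normalize=True, sort=False)


# Invented credit amounts, for illustration only.
credit = pd.Series([200_000, 450_000, 900_000, 1_200_000, 2_500_000], name="AMT_CREDIT")
shares = share_by_band(
    credit,
    edges=[0, 1_000_000, 2_000_000, np.inf],
    labels=["0-1M", "1-2M", "2M+"],
)
print(shares)
```

The same helper can be pointed at the goods price, annuity or income columns with edges suited to each variable.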

Bivariate analysis
AMT_CREDIT and AMT_GOODS_PRICE are linearly correlated.
As AMT_CREDIT increases, the number of defaulters decreases.

Bivariate analysis
People with an income of 1 million or less are more likely to take loans; among them, those taking a loan of less than 1.5 million could turn out to be defaulters.
We can target customers with an income below 1 million and a loan amount greater than 1.5 million.
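One way to check an income-versus-loan-amount claim like this is a two-way table of default rates. The sketch below assumes the 1 million and 1.5 million cut-offs from the slide and an income column named `AMT_INCOME_TOTAL`; the helper name `default_rate_grid` and the toy rows are hypothetical and do not reproduce the slide's findings.

```python
import numpy as np
import pandas as pd


def default_rate_grid(df, income_edge=1_000_000, credit_edge=1_500_000):
    """Default rate for each combination of income band and credit band."""
    banded = df.assign(
        income_band=np.where(
            df["AMT_INCOME_TOTAL"] <= income_edge, "income <= 1M", "income > 1M"
        ),
        credit_band=np.where(
            df["AMT_CREDIT"] < credit_edge, "credit < 1.5M", "credit >= 1.5M"
        ),
    )
    # The mean of a 0/1 target within each cell is that cell's default rate.
    return banded.pivot_table(
        index="income_band", columns="credit_band", values="TARGET", aggfunc="mean"
    )


# Invented toy rows, for illustration only.
toy = pd.DataFrame({
    "AMT_INCOME_TOTAL": [500_000, 500_000, 800_000, 800_000, 2_000_000, 2_000_000],
    "AMT_CREDIT": [1_000_000, 1_000_000, 2_000_000, 2_000_000, 1_000_000, 2_000_000],
    "TARGET": [1, 0, 0, 0, 0, 1],
})
grid = default_rate_grid(toy)
print(grid)
```

Reading the four cells side by side makes it easier to see which income and loan-size combination actually carries the higher default rate.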

Bivariate analysis
People with at least 1 and fewer than 5 children are safer to lend to.

Bivariate analysis
People who can pay an annuity of 100K are more likely to get the loan, for amounts up to just under 2 million (the safer segment).

Analysis on merged data
In their previous applications, customers applied mostly for repairs, and the same purpose has the highest number of cancellations.

Analysis on merged data
Of the applications that were previously either cancelled or refused, 80-90% belong to customers who are repayers in the current data.

Analysis on merged data
Offers that were previously unused now have the highest number of defaulters, even though those customers are in the high income band.
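The merged-data findings above rest on joining previous applications to the current ones and comparing previous status with the current target. A minimal sketch follows, assuming a shared customer key named `SK_ID_CURR` and a previous-status column named `NAME_CONTRACT_STATUS`; these names, the status labels, the helper name and the toy rows are assumptions, not taken from the slides.

```python
import pandas as pd


def repayment_by_previous_status(current, previous):
    """Share of current repayers and defaulters for each previous application status."""
    # Attach the current target to every previous application of the same customer.
    merged = previous.merge(current[["SK_ID_CURR", "TARGET"]], on="SK_ID_CURR", how="inner")
    # normalize="index" turns each row into shares that sum to 1.
    table = pd.crosstab(merged["NAME_CONTRACT_STATUS"], merged["TARGET"], normalize="index")
    return table.rename(columns={0: "repayer", 1: "defaulter"})


# Invented toy rows, for illustration only.
current = pd.DataFrame({"SK_ID_CURR": [1, 2, 3, 4], "TARGET": [0, 0, 1, 0]})
previous = pd.DataFrame({
    "SK_ID_CURR": [1, 2, 3, 4, 1, 3],
    "NAME_CONTRACT_STATUS": [
        "Refused", "Refused", "Refused", "Canceled", "Approved", "Unused offer",
    ],
})
status_table = repayment_by_previous_status(current, previous)
print(status_table.round(2))
```

Because one customer can have several previous applications, the shares here are per previous application rather than per customer; deduplicating on the customer key first would answer a slightly different question.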

Final Conclusion/Insights

The bank should target customers:
with a low income, i.e. below 1 million
working in the Others, Business Entity Type 3 or Self-employed organization types
working as Accountants, Core staff, Managers or Laborers
living in a house/apartment, married, and with no more than 5 children
who are highly educated
who are preferably female
Unaccompanied people can also be safer: their default rate is ~8.5%.

Amount segment recommended
The credit amount should not be more than 1 million.
The annuity can be set at 50K (depending on eligibility).
The income bracket could be below 1 million.
80-90% of the customers whose previous applications were cancelled or refused are repayers. The bank can analyze these segments and consider lending to them.

Precautions
The Transport type 3 organization type should be avoided.
Low-skill Laborers and Drivers should be avoided.
High-income customers with previously unused offers should be avoided.

THANK YOU FOR LISTENING!
Contact: [email protected], 8080100957, @mr.amolkore