Data Scientist Introduction: Brief Overview of Concepts

rahulgulab12 10 views 25 slides Jun 12, 2024

About This Presentation

Data Science Introduction


Slide Content

Data Scientist April 2018

Agenda
- Data science and its applications
- Stages of a data science project, project roles
- Classification, decision tree, random forest
- Demo using R technology

Data Science Introduction - Concepts
Data science is the practice of managing the process that transforms hypotheses and data into actionable predictions.

Data Science Introduction - Applications
- Amazon’s product recommendation systems
- LinkedIn’s contact recommendation system
- Retail business: buying patterns, segmentation
- Twitter’s trending topics
- Google’s advertisement valuation systems
- Walmart’s consumer demand projection systems

Data Science Domains

Data Science Introduction - Project Roles

Data Science Introduction - Processes in a Data Science Project

Data Science – Modelling Methods

Data Science – Modelling Method: Classification and Regression Trees
Example: finding bad loan applications
Input variables: loan amount, duration, age, salary, any other loan, address, income, education, background data, location, etc.
Of the 1,000 existing applications, 300 have defaulted.
A decision tree is used to identify potential defaulters.
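To make the idea of a tree split concrete, here is a minimal Python sketch of Gini impurity, a measure commonly used by CART-style trees to choose splits. The deck's own demo uses R, so Python is only an illustration language here. The root counts (300 defaulted out of 1,000) come from the slide; the child-node counts are invented for illustration, and the function names are hypothetical.

```python
def gini(n_bad, n_good):
    """Gini impurity of a node holding n_bad and n_good applications."""
    total = n_bad + n_good
    if total == 0:
        return 0.0
    p_bad = n_bad / total
    p_good = n_good / total
    return 1.0 - p_bad ** 2 - p_good ** 2


def weighted_gini(left, right):
    """Impurity after a split; left and right are (n_bad, n_good) pairs."""
    n_left, n_right = sum(left), sum(right)
    total = n_left + n_right
    return (n_left / total) * gini(*left) + (n_right / total) * gini(*right)


# Root node from the slide: 1000 applications, 300 of them defaulted.
root_impurity = gini(300, 700)

# HYPOTHETICAL split: these child counts are invented for illustration
# and do not come from the slides.
left_child = (200, 100)   # (bad, good) on one side of the split
right_child = (100, 600)  # (bad, good) on the other side
split_impurity = weighted_gini(left_child, right_child)

# A tree-growing procedure would prefer the split with the largest reduction.
impurity_reduction = root_impurity - split_impurity
print(round(root_impurity, 4), round(split_impurity, 4), round(impurity_reduction, 4))
```

A split is useful when the weighted impurity of the children is lower than the impurity of the parent, which is the case for the invented counts above.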

Classification and Regression Trees
[Figure: decision tree for the loan example. Split conditions: Duration>50, Amount>4 million, Amount>1 mil, Amount<5 mil, Duration>120. Leaf labels: Bad (0.68), Good (0.75), Good (0.56), Bad (0.25), Good (0.61), Bad (0.88).]

Data Science – Modelling Method: K-Nearest Neighbors (KNN)
Example: male/female distribution
[Figure: scatter plot of hair length (cm, 0 to 60) against height (cm, 140 to 200)]
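The K-nearest-neighbors idea on this slide can be sketched in a few lines of Python (an illustration language only; the deck's demo uses R). The training points below are hypothetical, since the plotted values are not recoverable from the slide, and `knn_predict` is a name chosen for this sketch.

```python
import math
from collections import Counter


def knn_predict(train, query, k=3):
    """Classify query by majority vote among its k nearest training points.

    train is a list of ((height_cm, hair_length_cm), label) pairs.
    """
    by_distance = sorted(train, key=lambda item: math.dist(item[0], query))
    nearest_labels = [label for _, label in by_distance[:k]]
    return Counter(nearest_labels).most_common(1)[0][0]


# HYPOTHETICAL training points, invented for illustration.
train = [
    ((185, 5), "male"),
    ((178, 10), "male"),
    ((172, 8), "male"),
    ((160, 45), "female"),
    ((165, 40), "female"),
    ((158, 30), "female"),
]

prediction_a = knn_predict(train, (180, 7))
prediction_b = knn_predict(train, (162, 38))
print(prediction_a, prediction_b)
```

In practice the two features would usually be scaled first so that neither dominates the distance; that step is left out to keep the sketch short.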

Data Science – Modelling Method: Random Forest (RF)
[Figure: three decision trees, labelled Tree 1, Tree 2 and Tree 3]

Data Science – Modelling Method: Random Forest (RF)
[Figure: an input is passed to all trees; Tree 1, Tree 2 and Tree 3 each give a prediction, and the random forest gives its own prediction]
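For classification, a random forest usually combines its trees by majority vote. The Python sketch below shows only that voting step (Python is an illustration language; the deck's demo uses R). The slide leaves the individual tree predictions blank, so the three votes here are hypothetical, as is the function name `forest_predict`.

```python
from collections import Counter


def forest_predict(tree_predictions):
    """Return the majority label and the share of trees that voted for it."""
    counts = Counter(tree_predictions)
    label, votes = counts.most_common(1)[0]
    return label, votes / len(tree_predictions)


# HYPOTHETICAL votes: the slide does not state what each tree predicted.
votes = {"Tree1": "Good", "Tree2": "Bad", "Tree3": "Good"}

label, share = forest_predict(list(votes.values()))
print(label, round(share, 2))
```

Using an odd number of trees avoids ties in a two-class problem.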

Data Science – Modelling Method: Random Forest (RF)
Applications where the random forest algorithm is widely used:
- Banking: loyal customers and fraudulent customers
- Medicine: disease (patients’ medical records)
- Stock market: stock behavior, loss, profit
- E-commerce: similar customers, segmentation

Data Science – Modelling Method: Support Vector Machine (SVM)
Example: male/female distribution
[Figure: scatter plot of hair length (cm, 0 to 60) against height (cm, 140 to 200)]
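A support vector machine looks for the separating boundary with the widest margin to the nearest points of each class. The Python sketch below does not train an SVM; it only measures the margin of one hand-picked line over hypothetical points, to show what "margin" and "closest points" mean. All data, the line coefficients and the function name are assumptions made for illustration.

```python
import math


def signed_distance(w, b, point):
    """Signed distance from point to the line w[0]*x + w[1]*y + b = 0."""
    norm = math.hypot(w[0], w[1])
    return (w[0] * point[0] + w[1] * point[1] + b) / norm


# HYPOTHETICAL points: (height_cm, hair_length_cm) with label +1 or -1.
points = [
    ((185, 5), +1),
    ((178, 10), +1),
    ((172, 8), +1),
    ((160, 45), -1),
    ((165, 40), -1),
    ((158, 30), -1),
]

# HYPOTHETICAL line picked by hand, not fitted by an SVM solver.
w, b = (1.0, -1.0), -145.0

# A point is on the correct side when label * signed distance is positive.
margins = [label * signed_distance(w, b, p) for p, label in points]
separates = all(m > 0 for m in margins)

# The margin of this line is the distance to its closest point.
margin = min(margins)
closest = [p for (p, _), m in zip(points, margins) if math.isclose(m, margin)]
print(separates, round(margin, 2), closest)
```

An actual SVM solver would search over all candidate lines and keep the one that maximises this margin.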

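Data Science – Model Evaluation Process
Training, test and validation
[Figure: flow diagram. The data goes through a test/train split into training data and test data; the training data feeds the training process, which produces a model; the model produces predictions.]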
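The test/train split in the diagram can be sketched as follows in Python (an illustration language; the deck's demo uses R). The 20% test fraction mirrors the 80/20 split mentioned in the demo explanation; the seed, the function name and the stand-in rows are assumptions.

```python
import random


def train_test_split(rows, test_fraction=0.2, seed=42):
    """Shuffle a copy of rows and split it into (train, test) lists."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)  # fixed seed for a repeatable split
    n_test = round(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


# HYPOTHETICAL stand-in for a dataset: 100 row identifiers.
rows = list(range(100))

train_rows, test_rows = train_test_split(rows)
print(len(train_rows), len(test_rows))
```

Shuffling before splitting matters when the rows are stored in a meaningful order, for example grouped by class.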

Data Science Demo Example

Demo Explanation
Data: 3 species

Demo Explanation
1. Load the caret package and load the data.
2. Split the data into two parts: 80% is kept in the dataset and 20% is set aside for validation.
3. Feed the dataset to four algorithms (CART, KNN, SV, RF).
4. Select the best algorithm.
5. Feed the validation data to the best algorithm.
6. Check the output.
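The select-then-validate logic of these steps can be sketched in Python as below. This is not the R/caret demo itself: the three candidate rules are simple hypothetical stand-ins for fitted CART, KNN, SV and RF models, and all data points and function names are invented for illustration.

```python
def accuracy(predict, rows):
    """Fraction of rows whose label the predict function gets right."""
    correct = sum(1 for features, label in rows if predict(features) == label)
    return correct / len(rows)


def select_best(candidates, rows):
    """Score every candidate on rows and return the best name plus all scores."""
    scores = {name: accuracy(predict, rows) for name, predict in candidates.items()}
    best_name = max(scores, key=scores.get)
    return best_name, scores


# HYPOTHETICAL stand-ins for fitted models; features are (height_cm, hair_cm).
candidates = {
    "height_rule": lambda f: "male" if f[0] > 170 else "female",
    "hair_rule": lambda f: "male" if f[1] < 20 else "female",
    "always_male": lambda f: "male",
}

# HYPOTHETICAL 80% part (used to compare candidates) and 20% validation part.
dataset = [
    ((185, 5), "male"), ((168, 6), "male"), ((172, 8), "male"),
    ((160, 45), "female"), ((165, 40), "female"), ((174, 30), "female"),
]
validation = [((180, 12), "male"), ((162, 35), "female")]

best_name, scores = select_best(candidates, dataset)
validation_accuracy = accuracy(candidates[best_name], validation)
print(best_name, scores, validation_accuracy)
```

The validation rows are used only once, after the choice is made, so the final accuracy is not biased by the selection step.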

Data Science Demo
1. Installing the R platform
2. Loading the dataset
3. Summarizing the dataset
4. Visualizing the dataset
5. Evaluating some algorithms
6. Making some predictions

Other Practical Uses of Data Science
https://towardsdatascience.com/examples-of-data-science-with-r-789c6996435
- Customer analysis and predictive analysis
- Association rules (medical diagnosis, bio-medical, census data, fraud detection, CRM)
- HR analytics: finding valuable employees and retaining them

Data Science Resources
- Practical Data Science with R
- Demo commands
- R and RStudio installation files
Resources are kept at the location below:
\\gb-pb-dbm-v01\Data_Science_Resources

Questions and Feedback?