introduction_to_data_science_data_cleaning

alhashediyemen1 11 views 8 slides Jul 21, 2024

About This Presentation

What is exploratory data analysis and how is it performed?
How to slice and mask (filter) pandas Series and DataFrame objects.
How to merge multiple DataFrames together.
How to aggregate data with GroupBy operations.
Methods of data cleaning.
An introduction to data cleaning.
How to create new info...


Slide Content

Lewis Tunstall | Data Scientist
Leandro von Werra | Data Scientist
BTW2401 - Data Science
Lesson 3 - Data Cleaning and Feature Engineering

Last Week Today
Last week we covered the following topics:
•What is exploratory data analysis and how is it performed?
•How to slice and mask (filter) pandas Series and DataFrame objects
•How to merge multiple DataFrames together
•How to aggregate data with GroupBy operations

Merging DataFrames
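
The slide's illustration did not survive extraction, so here is a minimal sketch of merging in pandas. The `houses` and `prices` tables and their column names are hypothetical, invented only for this example; they are not taken from the lesson.

```python
import pandas as pd

# Hypothetical tables, invented for illustration only
houses = pd.DataFrame({"house_id": [1, 2, 3], "city": ["Bern", "Biel", "Thun"]})
prices = pd.DataFrame({"house_id": [1, 2, 4], "price": [500_000, 420_000, 610_000]})

# Inner join: keep only the house_id values present in both tables
inner = pd.merge(houses, prices, on="house_id", how="inner")

# Left join: keep every row of `houses`; rows without a match get NaN in `price`
left = pd.merge(houses, prices, on="house_id", how="left")

print(inner)
print(left)
```

The `how` argument decides which keys survive the merge, so it is worth comparing row counts before and after to catch rows that were dropped or left unmatched.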

Column and row axes in pandas
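
As a rough illustration of the axis convention, the sketch below uses a small made-up DataFrame (not from the lesson). In pandas, `axis=0` runs down the rows and `axis=1` runs across the columns.

```python
import pandas as pd

# Toy data, invented for illustration only
df = pd.DataFrame({"a": [1, 2, 3], "b": [10, 20, 30]})

# axis=0 collapses the rows: one result per column
col_sums = df.sum(axis=0)

# axis=1 collapses the columns: one result per row
row_sums = df.sum(axis=1)

# drop() follows the same convention: axis=1 removes a column by its label
without_b = df.drop("b", axis=1)

print(col_sums)
print(row_sums)
print(without_b)
```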

Split, apply, combine
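
Split, apply, combine is the pattern behind GroupBy operations: split the rows into groups, apply a function to each group, and combine the results. A minimal sketch, assuming a hypothetical table of prices per city:

```python
import pandas as pd

# Toy data, invented for illustration only
df = pd.DataFrame({
    "city": ["Bern", "Bern", "Biel", "Biel", "Thun"],
    "price": [500, 700, 400, 600, 300],
})

# Split the rows by city, apply the mean to each group,
# and combine the results into one Series indexed by city
mean_price = df.groupby("city")["price"].mean()

# Several aggregations at once give a DataFrame with one column per function
summary = df.groupby("city")["price"].agg(["mean", "count"])

print(mean_price)
print(summary)
```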

Where Are We Going?
Week No.  Date   Topic
8         20.02  Introduction to data science
9         27.02  Exploratory data analysis
10        5.03   Data cleaning and feature engineering
11        12.03  Introduction to random forests
12        19.03  Random forest deep dive
13        26.03  Model interpretation
14        2.04   Classification
15        9.04   No class (Easter)
16        16.04  Midterm exam & define group projects
17        23.04  Cross-validation and model performance
[Figure: the data science workflow as a cycle around "Data": find a question, collect the data, prepare the data, create a model, evaluate the model, deploy the model]

Today’s Lesson
•An introduction to data cleaning
•How to create new informative features
•Dealing with missing values
•Converting strings to categories
Pandas, pandas, pandas!
A mini quiz followed by …
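
The topics listed for today's lesson can be sketched in a few lines of pandas. This is one common approach under stated assumptions, not necessarily the one used in the lesson: the column names (`rooms`, `area`, `heating`) and the choice of median imputation are hypothetical.

```python
import numpy as np
import pandas as pd

# Toy data, invented for illustration only
df = pd.DataFrame({
    "rooms": [3.0, np.nan, 5.0, 4.0],
    "area": [60.0, 80.0, 150.0, 100.0],
    "heating": ["gas", "oil", "gas", None],
})

# Dealing with missing values: record where a value was missing,
# then fill the gap with the column median (an assumed strategy)
df["rooms_missing"] = df["rooms"].isna()
df["rooms"] = df["rooms"].fillna(df["rooms"].median())

# Converting strings to categories: the category dtype stores each
# distinct string once and represents rows by integer codes
df["heating"] = df["heating"].astype("category")
df["heating_code"] = df["heating"].cat.codes  # missing values get code -1

# Creating a new feature from existing columns
df["area_per_room"] = df["area"] / df["rooms"]

print(df)
```

Keeping the `rooms_missing` indicator preserves the information that a value was absent, which can itself be predictive once the gap has been filled.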