What is exploratory data analysis and how is it performed?
How to slice and mask (filter) pandas Series and DataFrame objects.
How to merge multiple DataFrames together.
How to aggregate data with GroupBy operations.
Methods of data cleaning.
An introduction to data cleaning.
How to create new informative features.
Dealing with missing values.
Converting strings to categories.
Exploratory data analysis.
Added: Jul 21, 2024
Slides: 8 pages
Slide Content
Lewis Tunstall | Data Scientist
Leandro von Werra | Data Scientist
BTW2401 - Data Science
Lesson 3 - Data Cleaning and Feature Engineering
Last Week
Last week we covered the following topics:
• What is exploratory data analysis and how is it performed?
• How to slice and mask (filter) pandas Series and DataFrame objects
• How to merge multiple DataFrames together
• How to aggregate data with GroupBy operations
Figures: Merging DataFrames; Column and row axes in pandas; Split, apply, combine
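As a quick refresher on the four recap topics, here is a minimal sketch in pandas. The `houses` and `cities` tables and their column names are hypothetical toy data made up for illustration; they are not taken from the lecture material.

```python
import pandas as pd

# Hypothetical toy data (not from the lecture)
houses = pd.DataFrame({
    "house_id": [1, 2, 3, 4],
    "city": ["Bern", "Biel", "Bern", "Thun"],
    "price": [500_000, 320_000, 610_000, 450_000],
})
cities = pd.DataFrame({
    "city": ["Bern", "Biel", "Thun"],
    "canton": ["BE", "BE", "BE"],
})

# Mask (filter): keep only the rows where the boolean condition is True
expensive = houses[houses["price"] > 400_000]

# Slice: label-based selection with .loc (the end label is included)
first_two = houses.loc[0:1, ["city", "price"]]

# Merge: join the two DataFrames on their shared "city" column
merged = houses.merge(cities, on="city", how="left")

# GroupBy: split by city, apply the mean to price, combine into one Series
mean_price = houses.groupby("city")["price"].mean()
```

Note that `how="left"` keeps every row of `houses` even if a city had no match in `cities`; an inner join would drop such rows instead.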
Where Are We Going?
Week No.  Date   Topic
8         20.02  Introduction to data science
9         27.02  Exploratory data analysis
10        5.03   Data cleaning and feature engineering
11        12.03  Introduction to random forests
12        19.03  Random forest deep dive
13        26.03  Model interpretation
14        2.04   Classification
15        9.04   No class (Easter)
16        16.04  Midterm exam & define group projects
17        23.04  Cross-validation and model performance
Figure: workflow cycle around "Data": Find a question, Collect the data, Prepare the data, Create a model, Evaluate the model, Deploy the model
Today’s Lesson
• An introduction to data cleaning
• How to create new informative features
• Dealing with missing values
• Converting strings to categories
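To make the last three topics concrete, here is a minimal sketch in pandas. The toy DataFrame and its column names (`saledate`, `size`, `hours`) are hypothetical. Filling with the median and deriving date parts are just one common approach, assumed here for illustration; the slides do not say which techniques the lesson uses.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data (not from the lecture)
df = pd.DataFrame({
    "saledate": ["2024-01-15", "2024-03-02", "2024-07-21"],
    "size": ["small", None, "large"],
    "hours": [1200.0, np.nan, 300.0],
})

# Missing values: remember where they were, then fill with the column median
# (median imputation is an assumption, one of several possible strategies)
df["hours_missing"] = df["hours"].isna()
df["hours"] = df["hours"].fillna(df["hours"].median())

# Strings to categories: each distinct string becomes a category with an
# integer code; missing entries receive the code -1
df["size"] = df["size"].astype("category")
size_codes = df["size"].cat.codes

# New features: split a date column into numeric parts a model can use
df["saledate"] = pd.to_datetime(df["saledate"])
df["sale_year"] = df["saledate"].dt.year
df["sale_month"] = df["saledate"].dt.month
```

Keeping the `hours_missing` flag preserves the information that a value was absent, which can itself be informative once the gap has been filled.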
Pandas, pandas, pandas!
A mini quiz followed by …