feature-Selection-Lab-8-20032024-111222am.pptx

About This Presentation

Introduction to Data Science


Slide Content

Introduction to Data Science Lab 8 (Feature Selection)

Feature Selection

There are mainly three approaches to feature selection:
- Filter Methods
- Wrapper Methods
- Embedded Methods

Univariate Selection: Filter Method

The filter method ranks each feature on some univariate metric and then selects the highest-ranking features. Common filter criteria:
- Variance
- No duplicate columns
- High correlation
- Chi-square test
- Mutual information of the independent variable

The filter method looks at individual features to identify their relative importance. A feature may not be useful on its own but may be an important influencer when combined with other features; filter methods may miss such features.

Filter Method: Variance

Remove features whose variance is zero (every row has the same value), since they carry no information for the model.
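A minimal sketch of this step using scikit-learn's VarianceThreshold; the DataFrame X below is illustrative, not from the lab:

```python
# Drop zero-variance columns with scikit-learn's VarianceThreshold.
import pandas as pd
from sklearn.feature_selection import VarianceThreshold

X = pd.DataFrame({
    "a": [1, 2, 3, 4],
    "b": [7, 7, 7, 7],   # zero variance: every value is identical
    "c": [0, 1, 0, 1],
})

selector = VarianceThreshold(threshold=0.0)  # keep features with variance > 0
selector.fit(X)
kept = X.columns[selector.get_support()]
print(list(kept))  # ['a', 'c'] -- 'b' is dropped
```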

Filter Method: No Duplicate Columns

Remove every column that is an exact duplicate of another column; duplicates add no new information.
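A minimal sketch of duplicate-column removal with pandas (the DataFrame df is illustrative):

```python
# Drop columns that duplicate another column by deduplicating transposed rows.
import pandas as pd

df = pd.DataFrame({
    "x": [1, 2, 3],
    "y": [1, 2, 3],   # exact duplicate of "x"
    "z": [4, 5, 6],
})

df_unique = df.T.drop_duplicates().T  # transpose, dedupe rows, transpose back
print(list(df_unique.columns))  # ['x', 'z']
```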

Filter Method: Correlation

When two or more features are mutually correlated, they convey redundant information to the model, so only one of the correlated features should be retained to reduce the number of features. Independent features that are correlated with the dependent feature (the target variable) should not be removed; that correlation is exactly what makes them predictive. If two independent features are correlated with each other by 80% or 90%, drop one of them and train the model with the remaining features.
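A minimal sketch of correlation-based pruning with pandas; the 0.9 cutoff and the helper name drop_correlated are illustrative choices:

```python
# Drop one feature from each highly correlated pair of independent features.
import numpy as np
import pandas as pd

def drop_correlated(df: pd.DataFrame, threshold: float = 0.9) -> pd.DataFrame:
    corr = df.corr().abs()
    # Keep only the upper triangle so each pair is examined once.
    upper = corr.where(np.triu(np.ones(corr.shape, dtype=bool), k=1))
    to_drop = [col for col in upper.columns if (upper[col] > threshold).any()]
    return df.drop(columns=to_drop)
```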

Chi-Square Calculation: An Example

A χ² (chi-square) calculation; the numbers in parentheses are the expected counts computed from the marginal distribution of the two categories:

                             Play chess   Not play chess   Sum (row)
Like science fiction         250 (90)      200 (360)        450
Not like science fiction      50 (210)    1000 (840)       1050
Sum (col.)                   300          1200             1500

The resulting χ² value is far greater than the critical value 10.828 (1 degree of freedom at the 0.001 significance level), which shows that like_science_fiction and play_chess are correlated in the group.
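A minimal sketch that reproduces this calculation with scipy (the contingency table is the one above):

```python
# Chi-square test of independence on the slide's contingency table.
import numpy as np
from scipy.stats import chi2_contingency

observed = np.array([
    [250,  200],   # like science fiction
    [ 50, 1000],   # not like science fiction
])

chi2, p, dof, expected = chi2_contingency(observed, correction=False)
print(expected)         # [[ 90. 360.] [210. 840.]] -- matches the slide
print(round(chi2, 2))   # 507.93, well above the 10.828 critical value
```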

Filter Method: Chi-Square Test & Mutual Information

For the chi-square test, convert the data to categorical form by encoding the values as integers; chi-square expects non-negative inputs (typically counts or integer-encoded categories).
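A minimal sketch of both filters using scikit-learn's SelectKBest; the iris data and k=2 are illustrative:

```python
# Score features with chi-square and mutual information, keep the top k.
from sklearn.datasets import load_iris
from sklearn.feature_selection import SelectKBest, chi2, mutual_info_classif

X, y = load_iris(return_X_y=True)  # all iris features are non-negative

chi2_sel = SelectKBest(score_func=chi2, k=2).fit(X, y)
mi_sel = SelectKBest(score_func=mutual_info_classif, k=2).fit(X, y)

print(chi2_sel.get_support(indices=True))  # indices chosen by chi-square
print(mi_sel.get_support(indices=True))    # indices chosen by mutual info
```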

Wrapper Method

The feature selection process is based on a specific machine learning algorithm that is to be applied to a given dataset. It follows a greedy search approach, evaluating candidate combinations of features against the evaluation criterion. Wrapper methods are of the following types:
- Forward Elimination
- Backward Elimination

Forward Elimination

The procedure starts with an empty set of features. The best of the original features is determined and added to the reduced set; at each subsequent iteration, the best of the remaining original attributes is added to the set. SequentialFeatureSelector from sklearn can be used for forward feature selection by setting the parameter direction='forward', as in the sketch below.
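A minimal sketch of forward selection with SequentialFeatureSelector; the estimator, data, and n_features_to_select are illustrative choices:

```python
# Greedily grow the feature set one best feature at a time.
from sklearn.datasets import load_iris
from sklearn.feature_selection import SequentialFeatureSelector
from sklearn.linear_model import LogisticRegression

X, y = load_iris(return_X_y=True)

sfs = SequentialFeatureSelector(
    LogisticRegression(max_iter=1000),
    n_features_to_select=2,
    direction="forward",   # start empty, add the best remaining feature
)
sfs.fit(X, y)
print(sfs.get_support(indices=True))  # indices of the selected features
```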

Forward Elimination (continued)

ExtraTreesClassifier from sklearn can also be used to drive this feature selection: its feature_importances_ attribute ranks the features, and the highest-ranked features are kept.
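A minimal sketch of importance-based selection with ExtraTreesClassifier; the data and the "top 2" cutoff are illustrative:

```python
# Rank features by ExtraTrees importance and keep the top-ranked ones.
import numpy as np
from sklearn.datasets import load_iris
from sklearn.ensemble import ExtraTreesClassifier

X, y = load_iris(return_X_y=True)

model = ExtraTreesClassifier(n_estimators=100, random_state=0).fit(X, y)
ranking = np.argsort(model.feature_importances_)[::-1]  # most important first
print(ranking[:2])  # indices of the two most important features
```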

Backward Elimination

The procedure starts with the full set of features. At each step, it removes the worst attribute remaining in the set. SequentialFeatureSelector from sklearn can be used for backward feature elimination by setting the parameter direction='backward', as in the sketch below.
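A minimal sketch of backward elimination with SequentialFeatureSelector; as before, the estimator and data are illustrative:

```python
# Greedily shrink the feature set by dropping the worst feature each step.
from sklearn.datasets import load_iris
from sklearn.feature_selection import SequentialFeatureSelector
from sklearn.linear_model import LogisticRegression

X, y = load_iris(return_X_y=True)

sfs = SequentialFeatureSelector(
    LogisticRegression(max_iter=1000),
    n_features_to_select=2,
    direction="backward",  # start full, remove the worst remaining feature
)
sfs.fit(X, y)
print(sfs.get_support(indices=True))  # indices of the surviving features
```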

Comparison

Filter:
- Filter methods do not use a machine learning model to decide whether a feature is good or bad.
- Filter methods are much faster than wrapper methods because they do not involve training models.
- Filter methods may fail to find the best subset of features when there is not enough data to model the statistical correlation between the features.

Wrapper:
- Wrapper methods use a machine learning model and train it on candidate feature subsets to decide whether each feature is essential to the final model.
- Wrapper methods are computationally costly; for massive datasets they are probably not the most effective feature selection method to consider.
- Because of their exhaustive search over feature combinations, wrapper methods can provide the best subset of features.
- Using features chosen by a wrapper method in the final machine learning model can lead to overfitting, since the wrapper has already trained models on those features, which biases the estimate of true generalization.