Pattern Recognition for a Master's in Computer Science

minacodegirl · 19 slides · Sep 28, 2025

About This Presentation

These slides are for a Pattern Recognition course in a master's computer science program.


Slide Content

Feature Selection for Classification Using PCA and Information Gain (Erick Odhiambo Omuya)

Feature Selection & Classification

Classification process: a breakdown of data into groups. 1. The method finds a model for the class attribute as a function of the other variables in the dataset. 2. It applies the previously built model to new, unseen data, using machine learning methods (e.g., Decision Trees, Logistic & Linear Regression, ANN, Naïve Bayes, SVM, kNN, ...).
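As an illustration of the two steps above, here is a minimal sketch assuming Python with scikit-learn and its bundled Iris dataset (the classifier and dataset are illustrative choices, not taken from the slides):

    from sklearn.datasets import load_iris
    from sklearn.model_selection import train_test_split
    from sklearn.tree import DecisionTreeClassifier
    from sklearn.metrics import accuracy_score

    # Any labelled dataset with a class attribute would do here.
    X, y = load_iris(return_X_y=True)
    X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=0)

    # Step 1: find a model for the class attribute as a function of the other variables.
    model = DecisionTreeClassifier(random_state=0)
    model.fit(X_train, y_train)

    # Step 2: apply the learned model to new, unseen records.
    y_pred = model.predict(X_test)
    print("accuracy:", accuracy_score(y_test, y_pred))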

Feature selection: the process of removing non-relevant and repeated features from a dataset so as to improve the performance of machine learning techniques and their applications.
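A tiny, hypothetical illustration of the idea (the column names are invented for this example): repeated and constant columns can be dropped before modelling, e.g. with pandas:

    import pandas as pd

    df = pd.DataFrame({
        "age":      [25, 32, 47, 51],
        "age_copy": [25, 32, 47, 51],   # repeated feature
        "constant": [1, 1, 1, 1],       # non-relevant: carries no information
        "income":   [30, 60, 80, 90],
    })

    df = df.loc[:, ~df.T.duplicated()]  # drop exact duplicate columns ("age_copy")
    df = df.loc[:, df.nunique() > 1]    # drop constant columns ("constant")
    print(list(df.columns))             # ['age', 'income']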

Feature selection algorithms. Supervised: select relevant features based on labelled datasets; supervised methods can follow filter, wrapper, or embedded models. Semi-supervised: use both labelled and unlabelled data to evaluate the relevance of features. Unsupervised: identify and select relevant features without using class label information.

Supervised methods. The filter model works so that feature selection and model learning are independent. The wrapper model uses a small set of features. The embedded model mainly deals with selecting features that rate highly in terms of accuracy; the feature search process is embedded into the classification algorithm, and the learning process and the feature selection process cannot be separated.
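For instance, in the filter model features are scored before and independently of any learner. A minimal sketch assuming scikit-learn (SelectKBest and the breast-cancer dataset are illustrative choices, not from the slides):

    from sklearn.datasets import load_breast_cancer
    from sklearn.feature_selection import SelectKBest, mutual_info_classif

    X, y = load_breast_cancer(return_X_y=True)

    # Score every feature against the class labels and keep the 10 best;
    # any classifier can then be trained on the reduced matrix.
    selector = SelectKBest(score_func=mutual_info_classif, k=10)
    X_reduced = selector.fit_transform(X, y)
    print(X.shape, "->", X_reduced.shape)   # (569, 30) -> (569, 10)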

Big challenge: the curse of dimensionality!

High dimensionality in datasets results from collecting information with many features or variables that have not been shown to be either needed or significant for the task.

A hybrid model for selecting features and classifying data: it works to reduce data dimensions, reduce training time, and provide better classification performance using the selected features.

The hybrid model consists of the following components: first, Principal Component Analysis; second, evaluation of features using Information Gain; last, model training and classification.
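A sketch of how these three stages could be chained, assuming scikit-learn; mutual_info_classif stands in here for Information Gain, and the component and feature counts are illustrative, not taken from the slides:

    from sklearn.datasets import load_breast_cancer
    from sklearn.decomposition import PCA
    from sklearn.feature_selection import SelectKBest, mutual_info_classif
    from sklearn.model_selection import cross_val_score
    from sklearn.naive_bayes import GaussianNB
    from sklearn.pipeline import Pipeline

    X, y = load_breast_cancer(return_X_y=True)

    pipeline = Pipeline([
        ("pca", PCA(n_components=15)),                   # first: reduce dimensionality
        ("ig",  SelectKBest(mutual_info_classif, k=8)),  # second: rank features by information gain
        ("clf", GaussianNB()),                           # last: model training and classification
    ])
    print("CV accuracy:", cross_val_score(pipeline, X, y, cv=5).mean())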

Information Gain, step one: given an attribute A and a class C, the first step is to calculate the entropy (H) before observation of attribute A, given by the formula below.
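In standard notation (the slide's formula image is not reproduced in the extracted text), the entropy of class C before observing A is:

    H(C) = -\sum_{c \in C} p(c) \log_2 p(c)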

Step two: calculate the entropy after observation of attribute A, given by the formula below.
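Again in standard notation, the conditional entropy of C after observing A is:

    H(C \mid A) = -\sum_{a \in A} p(a) \sum_{c \in C} p(c \mid a) \log_2 p(c \mid a)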

Last step: calculate the Information Gain. The Information Gain of attribute A is the difference between the entropy before observation of attribute A and the entropy after observation of the attribute.
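That is, IG(A) = H(C) - H(C \mid A). A short worked sketch in Python (NumPy assumed; the toy attribute and labels are invented for illustration):

    import numpy as np

    def entropy(labels):
        """H(C) = -sum p(c) log2 p(c) over the empirical class distribution."""
        _, counts = np.unique(labels, return_counts=True)
        p = counts / counts.sum()
        return -np.sum(p * np.log2(p))

    def information_gain(attribute, labels):
        """IG(A) = H(C) - H(C|A) for a categorical attribute A."""
        h_before = entropy(labels)
        values, counts = np.unique(attribute, return_counts=True)
        weights = counts / counts.sum()
        h_after = sum(w * entropy(labels[attribute == v])
                      for v, w in zip(values, weights))
        return h_before - h_after

    outlook = np.array(["sunny", "sunny", "rain", "rain", "overcast", "overcast"])
    play    = np.array([0, 0, 1, 1, 1, 1])
    print(information_gain(outlook, play))   # ~0.918: outlook removes almost all class uncertainty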


Thanks! Any questions?