Classification Aalgorithms KNN and Protype-based classifiers.pptx

A Popular Classifier Challenges for Classification algorithms K-Nearest Neighbor (KNN) Slides Credit: Dr. Zulfiqar Habib Edited by: Dr. Allah Bux Sargana

Image Classification: The Problem Human vs Machine Perception: Images are represented as R d arrays of numbers, e.g., R 3 with integers between [0, 255], where d = 3 represents 3 color channels (RGB) What the machine (computer) sees

Image Classification: Challenges Viewpoint variation Illumination Michelangelo 1475-1564

Image Classification: Challenges Scale

Image Classification: Challenges Deformation Occlusion

Image Classification: Challenges Background clutter Intra-class variation Kilmeny Niland 1995

An Image Classifier no obvious way to hard-code the algorithm for recognizing a cat, or other classes. >> f = imread ('rabbit.jpg'); >> predict(f) ???? Unlike, e.g., sorting a list of numbers,

An Image Classifier: Data-driven approach Use Machine Learning to train an image classifier on some part of annotated data Evaluate the classifier on a withheld set of test images

The Image Classification Pipeline (Input) Dataset collection & labelling (Learning) Learning & training an image classifier (Evaluation) Testing of classifier on withheld images

The Image Classification Pipeline Input: Our input consists of a set of N images, each labelled with one of K different classes. We refer to this data as the training set . Learning: Our task is to use the training set to learn what every one of the classes looks like. We refer to this step as training a classifier , or learning a model . Evaluation: Evaluate the quality of the classifier by asking it to predict labels for a new set of images that it has never seen before. We will then compare the true labels of these images to the ones predicted by the classifier. Intuitively, we're hoping that a lot of the predictions match up with the true answers (which we call the ground truth ).

The Machine Learning Framework y = f ( x ) Output Prediction function Image feature Training: given a training set of labelled examples {( x 1 , y 1 ), …, ( x N , y N )}, estimate the prediction function f by minimizing the prediction error on the training set Testing: apply f to a never before seen test example x and output the predicted value y = f ( x )

Nearest Neighbor Classifier Assign label of nearest training data point to each test data point

Nearest Neighbor Classifier Assign label of nearest training data point to each test data point Partitioning of feature space for two-category 2D and 3D data

K Nearest Neighbor (KNN) Distance measure: Euclidean where X n and X m are the n - th and m - th data points The test sample (green dot) should be classified either to blue squares or to red triangles. If k = 3 (solid line circle) it is assigned to the red triangles because there are 2 triangles and only 1 square inside the inner circle. If k = 5 (dashed line circle) it is assigned to the blue squares (3 squares vs. 2 triangles inside the outer circle).

KNN vs K-Means Clustering Q : What is the complexity of the 1-NN classifier w.r.t. training set of N images and test set of M images? at training time? at test time? KNN represents a supervised classification algorithm that will give new data points accordingly to the k number or the closest data points, while K-Means clustering is an unsupervised clustering algorithm that gathers and groups data into k number of clusters.

Given: Fisher's Iris dataset: DataIris (data size = 150) Species of Iris: petal sepal Setosa Versicolor Virginica The data set consists of 50 samples from each of three species of Iris. Four features were measured from each sample: the length and the width of the sepals and petals, in centimeters. Example: Sample of data set

Example: Sample of data set No. Sepal. length Sepal. width Petal. length Petal. width Species 1 5.1 3.5 1.4 0.2 setosa 2 4.9 3.0 1.4 0.2 setosa 3 4.7 3.2 1.3 0.2 setosa 4 4.6 3.1 1.5 0.2 setosa … 76 6.6 3.0 4.4 1.4 versicolor … 150 5.9 3.0 5.1 1.8 virginica

Example Task: To classify a sample of 150 irises in the 3 following species: versicolor , virginica and setosa Number of given attributes: 4 From 4 characteristics measured on the flowers (the length of the sepal, the width of the sepal, the length of the petal and the width of the petal). In this example, only last 2 attributes are considered. Type of attribute to be predicted: Discrete with 3 classes

Example: Code… % Load the sample data, which includes Fisher's iris data of 5 measurements on a sample of 150 irises. >> load fisheriris >> whos Name Size Bytes Class Attributes meas 150x4 4800 double species 150x1 19300 cell >> species species = ' setosa ' ' setosa ' ------- ' versicolor ' ' versicolor ' ------- ' virginica ' ' virginica ' ------- >> meas meas = 5.1000 3.5000 1.4000 0.2000 4.9000 3.0000 1.4000 0.2000 4.7000 3.2000 1.3000 0.2000 4.6000 3.1000 1.5000 0.2000 5.0000 3.6000 1.4000 0.2000 -------- ------- ------- --------

>> x = meas (:, 3:4); % use data of last 2 columns for fitting >> y = species; % response data >> mdl = ClassificationKNN.fit (x, y) % 1NN mdl = ClassificationKNN : PredictorNames : {'x1' 'x2'} ResponseName : 'Y' ClassNames : {1x3 cell} ScoreTransform : 'none' NObservations : 150 Distance: ' euclidean ' NumNeighbors : 1 Example: Code…

% Predict the classification of an average flower >> flwr = mean(x) % an average flower flwr = 3.7580 1.1993 >> flwrClass = predict(mdl, flwr ) flwrClass = 'versicolor‘ Example: Code… >> gscatter (x(:, 1), x(:, 2), species) >> set(legend, 'location', 'best') >> line( flwr (1), flwr (2), 'marker', 'x', 'color', 'k', ' markersize ', 10, 'linewidth', 2) Petal length Petal width

% Predict another flower >> flwr = [5 1.55]; % Given (petal length, petal width) >> flwrClass = predict(mdl, flwr ) % prediction by 1NN flwrClass = ' virginica ‘ >> md5 = ClassificationKNN.fit (x,y,'NumNeighbors',5); % 5NN >> flwrClass = predict(md5, flwr ) % prediction by 5NN flwrClass = ' versicolor ' Example: Code Why different?

Species Length Width Distance virginica 5.0000 1.5000 0.0500 versicolor 4.9000 1.5000 0.1118 versicolor 4.9000 1.5000 0.1118 versicolor 5.1000 1.6000 0.1118 virginica 5.1000 1.5000 0.1118 Example: Analysis Value Count Percent virginica 2 40.00% versicolor 3 60.00%

K Nearest Neighbor (KNN) Find the k nearest images, have them vote on the label What is the best distance to use? What is the best value of k to use? i.e. how do we set the hyperparameters ? Very problem-dependent. Must try them all out and see what works best.

Prototype-based classifier

Classification Aalgorithms KNN and Protype-based classifiers.pptx

About This Presentation

Slide Content

Tags

Categories

Download

Quick Actions

Statistics

Related Slideshows

Classification Aalgorithms KNN and Protype-based classifiers.pptx

About This Presentation

Slide Content

Slide 1

Slide 2

Slide 3

Slide 4

Slide 5

Slide 6

Slide 7

Slide 8

Slide 9

Slide 10

Slide 11

Slide 12

Slide 13

Slide 14

Slide 15

Slide 16

Slide 17

Slide 18

Slide 19

Slide 20

Slide 21

Slide 22

Slide 23

Slide 24

Slide 25

Slide 26

Slide 27

Slide 28

Slide 29

Tags

Categories

Download

Quick Actions

Statistics

Related Slideshows

8-top-ai-courses-for-customer-support-representatives-in-2025.pptx

7-essential-ai-courses-for-call-center-supervisors-in-2025.pptx

25-essential-ai-courses-for-user-support-specialists-in-2025.pptx

8-essential-ai-courses-for-insurance-customer-service-representatives-in-2025.pptx

Know for Certain

PPT OPD LES 3ertt4t4tqqqe23e3e3rq2qq232.pptx