Advantages and Disadvantages of Instance-Based Learning

Advantages:
- Works well when data is not available all at once (incremental learning).
- Can handle complex classification problems without requiring a predefined model.
- Simple to implement and understand.

Disadvantages:
- Requires a large amount of memory to store all training data.
- Computationally expensive, because every new instance must be compared with all stored instances.
- Slower than model-based approaches, especially for large datasets.
Distance Metrics Used for Similarity-Based Learning

Classification in instance-based learning depends on measuring the distance or similarity between instances. Common distance metrics include:
- Hamming Distance – used for categorical data.
- Euclidean Distance – measures straight-line distance.
- Manhattan Distance – measures distance along grid-like paths.
- Minkowski Distance – a general form of both Euclidean and Manhattan distances.
- Cosine Similarity – measures the angle between two vectors.
- Mahalanobis Distance – takes correlations between features into account.
- Jaccard Coefficient – measures similarity between sets.

Most of these can be computed directly with standard libraries, as sketched below.
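A minimal illustration using scipy.spatial.distance (the vectors are arbitrary examples; Mahalanobis is omitted here because it additionally needs an inverse covariance matrix):

```python
import numpy as np
from scipy.spatial import distance

a = np.array([1.0, 0.0, 2.0])
b = np.array([2.0, 1.0, 0.0])

print(distance.euclidean(a, b))        # straight-line distance
print(distance.cityblock(a, b))        # Manhattan (grid-path) distance
print(distance.minkowski(a, b, p=3))   # Minkowski; p=2 gives Euclidean, p=1 gives Manhattan
print(distance.cosine(a, b))           # 1 - cosine similarity (angle between vectors)
print(distance.hamming([1, 0, 1], [1, 1, 0]))  # fraction of differing categorical positions
print(distance.jaccard([1, 0, 1], [1, 1, 0]))  # Jaccard dissimilarity between binary sets
```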
Introduction to Instance-Based Learning

Instance-Based Learning (IBL) stores all training instances and classifies new data points by comparing them to the stored examples. It does not build an explicit model the way decision trees or neural networks do; predictions are made only when a new query (test instance) is given.

Examples of instance-based learning algorithms:
- k-Nearest Neighbor (k-NN)
- Variants of Nearest Neighbor Learning
- Locally Weighted Regression
- Learning Vector Quantization (LVQ)
- Self-Organizing Maps (SOM)
- Radial Basis Function (RBF) Networks

Key points:
- Self-Organizing Maps (SOM) and RBF Networks are related to artificial neural networks.
- Instance-based methods can struggle with irrelevant and correlated features, leading to misclassification.
Nearest-Neighbor Learning (k-NN)

Definition: k-Nearest Neighbor (k-NN) is a non-parametric, lazy learning algorithm used for both classification and regression problems.

How k-NN works:
- When given a new test instance, k-NN finds the k closest training instances (neighbors).
- For classification, the class of the test instance is determined by a majority vote of the k nearest neighbors.
- For regression, the prediction is the average of the k neighbors' values.
Properties of k-NN
- Lazy learning: k-NN does not build a model beforehand; classification occurs only when a new test instance is given.
- Similarity-based classification: assumes that similar objects lie close to each other in feature space, and uses a distance metric (e.g., Euclidean distance) to measure similarity.
- Choosing k: the value of k determines the performance of the model. A small k makes the model sensitive to noise, while a large k smooths out variations but may misclassify points near class boundaries.

A minimal sketch of the classification procedure is shown below.
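The sketch below assumes Euclidean distance and simple majority voting, as described above; the names X_train, y_train, and x_test are illustrative and ties are not handled specially.

```python
import numpy as np
from collections import Counter

def knn_predict(X_train, y_train, x_test, k=3):
    """Classify x_test by majority vote among its k nearest training instances."""
    # Euclidean distance from the test instance to every stored training instance
    distances = np.linalg.norm(X_train - x_test, axis=1)
    # Indices of the k closest neighbors
    nearest = np.argsort(distances)[:k]
    # Majority vote among the neighbors' labels
    return Counter(y_train[nearest]).most_common(1)[0][0]
```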
Solution: We need to classify whether a student with the following details will Pass or Fail based on past student data:
- CGPA = 6.1
- Assessment Score = 40
- Projects Submitted = 5
We will use the k-NN algorithm with k = 3 and Euclidean distance.
Step 2: Calculate Euclidean Distance
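Assuming the three attributes are used on their raw scales (no normalization), the Euclidean distance from each training record i to the query (6.1, 40, 5) is:

d_i = √[ (CGPA_i − 6.1)² + (Assessment_i − 40)² + (Projects_i − 5)² ]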
Final Prediction: The student with (CGPA = 6.1, Assessment = 40, Projects Submitted = 5) is predicted to FAIL.
Problem 2

A university wants to classify whether a student will get a scholarship based on their CGPA, Entrance Exam Score, and Number of Extracurricular Activities. You are given a training dataset of 8 students, where each student is labeled as either "Yes" (scholarship awarded) or "No" (scholarship not awarded). A new student with the following attributes applies for a scholarship:
- CGPA = 7.0
- Entrance Exam Score = 75
- Extracurricular Activities = 4
Using k-NN (k = 3) and Euclidean distance, determine whether the student will receive a scholarship.
Step 2: Calculate Euclidean Distance
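As in Problem 1, and assuming the raw attribute values are used, the distance from each training record i to the query (7.0, 75, 4) is:

d_i = √[ (CGPA_i − 7.0)² + (Exam_i − 75)² + (Activities_i − 4)² ]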
What is Weighted k-NN?

Weighted k-Nearest Neighbors (Weighted k-NN) is an improvement over the standard k-Nearest Neighbors (k-NN) algorithm. It assigns different weights to the k nearest neighbors based on their distance to the test instance, giving more importance to closer neighbors.

How is Weighted k-NN different from k-NN?
- In regular k-NN, all k nearest neighbors contribute equally to the classification or regression.
- In weighted k-NN, closer neighbors get higher weights and farther neighbors get lower weights, making predictions more accurate.
Weighted k-Nearest-Neighbor Algorithm
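A minimal Python sketch of one common weighting scheme (inverse-distance weights); the worked problem below may use a different weight formula, and the names X_train, y_train, and x_test are illustrative.

```python
import numpy as np

def weighted_knn_predict(X_train, y_train, x_test, k=3):
    """Classify x_test by a distance-weighted vote of its k nearest neighbors."""
    distances = np.linalg.norm(X_train - x_test, axis=1)
    nearest = np.argsort(distances)[:k]
    votes = {}
    for i in nearest:
        # Closer neighbors contribute larger weights; the small constant avoids division by zero
        weight = 1.0 / (distances[i] + 1e-8)
        votes[y_train[i]] = votes.get(y_train[i], 0.0) + weight
    # The class with the largest total weight wins
    return max(votes, key=votes.get)
```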
Problem
Since Fail (0.655) > Pass (0.3425), the test instance is classified as Fail.
Advantages of Weighted k-NN
✅ More accurate than regular k-NN when some neighbors are much closer than others.
✅ Less sensitive to noisy distant neighbors.
✅ Useful when distances vary significantly.

When to use Weighted k-NN:
- When data points have varying distances.
- When you want to give more importance to closer neighbors.
- When handling imbalanced datasets where class distributions are uneven.
Nearest Centroid Classifier (NCC)

The Nearest Centroid Classifier (NCC) is a simple and effective classification algorithm that assigns a new data point to the class whose centroid (mean feature vector) is closest to it.
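A minimal numpy sketch of this idea; the function and array names are illustrative, not from the notes.

```python
import numpy as np

def nearest_centroid_predict(X_train, y_train, x_test):
    """Assign x_test to the class whose centroid (mean feature vector) is closest."""
    # Compute one centroid per class: the mean of that class's training vectors
    centroids = {
        label: X_train[y_train == label].mean(axis=0)
        for label in np.unique(y_train)
    }
    # Pick the class whose centroid has the smallest Euclidean distance to x_test
    return min(centroids, key=lambda label: np.linalg.norm(x_test - centroids[label]))
```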
Problem 2
Locally Weighted Regression (LWR)

LWR is a non-parametric regression algorithm.
✅ It does not assume a fixed shape (like a straight line or polynomial).
✅ It adapts to different regions of the data by giving more weight to nearby points.

Unlike traditional linear regression, which fits a single line, LWR fits different lines for different regions of the data. It assigns more importance (weight) to nearby points and less importance to faraway points.

Example: Think of it like guessing house prices in a neighborhood. If you check prices only in your street, the estimate is more accurate for your area; if you check all over the city, it is less accurate because some areas are very different. LWR follows the "only nearby points matter" approach.
Locally Weighted Regression (LWR) is an instance-based learning algorithm that works by fitting different regression models at different points in the dataset. Instead of finding a single global equation, LWR estimates the best fit locally by assigning weights to nearby points.

Step 1: Select a Query Point (x_query)
In LWR, we make predictions one point at a time. We select a specific query point (x_query), which is the input value where we want to estimate the output (y_pred). This could be any value within the range of our dataset (e.g., a BMI value for which we want to predict the disease progression score).

Example: Imagine we have BMI values from 18 to 40, and we want to predict the disease score for BMI = 25. Here, x_query = 25 is our query point. Instead of using all data points equally, LWR will give more importance to points close to BMI = 25 and less importance to farther points.
Step 2: Compute Weights for Data Points
To make our prediction at x_query, we assign weights to all training points based on their distance from x_query:
- Nearby points get higher weights.
- Faraway points get lower weights.

How are weights assigned? LWR uses the Gaussian (kernel) function to calculate the weight for each training point. The formula is:

w_i = exp( −(x_i − x_query)² / (2τ²) )

where τ (tau) is the bandwidth that controls how quickly the weight falls off with distance.
Explanation
Step 3: Fit a Weighted Linear Regression Model
Now that we have assigned weights to all training points, we fit a weighted linear regression model to predict the value at x_query. LWR modifies the normal linear regression equation to include the weights: the ordinary normal equation is β = (XᵀX)⁻¹ Xᵀy, while LWR solves β = (XᵀWX)⁻¹ XᵀWy, where W is the diagonal matrix of the weights from Step 2. Instead of fitting one straight line for all data, we fit a new weighted line for every query point, driven mainly by the relevant nearby points.
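A minimal numpy sketch of this weighted fit, where X is a 1-D array of inputs, y the targets, and w the weights from Step 2 (the names are illustrative):

```python
import numpy as np

def weighted_linear_fit(X, y, w):
    """Solve the weighted normal equation: beta = (A^T W A)^{-1} A^T W y."""
    W = np.diag(w)                               # diagonal matrix of per-point weights
    A = np.column_stack([np.ones(len(X)), X])    # design matrix with an intercept column
    return np.linalg.solve(A.T @ W @ A, A.T @ W @ y)   # [intercept, slope]
```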
Step 4: Repeat for Multiple Query Points
Since LWR computes a separate regression model for each point, we repeat Steps 1–3 for multiple values of x_query:
- Choose a range of x_query values.
- Apply LWR at each of these points.
- This generates a smooth regression curve instead of a single line.

Example: Imagine we have BMI values ranging from 18 to 40, and we compute LWR for x_query = 18, 19, 20, …, 40. By repeating LWR at multiple points, we get a smooth prediction curve instead of a rigid single line.
Visualization of How LWR Works
Imagine you have the following dataset (BMI vs. Disease Progression):

BMI:     18  20  22  25  27  30  35  38  40
Disease: 120 130 140 150 165 180 200 210 220

a) Query point x_query = 25
- Close points (22, 25, 27) get high weight.
- Far points (18, 38, 40) get low weight.
- We fit a small weighted regression model for BMI = 25.

b) Query point x_query = 30
- Close points (27, 30, 35) get high weight.
- Far points (18, 20, 40) get low weight.
- We fit another local regression model here.

By repeating this process at many values of BMI, we create a smooth non-linear curve instead of one straight line. The short script below applies LWR to this dataset.
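A compact end-to-end sketch of Steps 1–4 on the BMI dataset above; the bandwidth tau = 3.0 is an arbitrary illustrative choice, not a value from the notes.

```python
import numpy as np

# Dataset from the example above (BMI vs. disease progression)
X = np.array([18, 20, 22, 25, 27, 30, 35, 38, 40], dtype=float)
y = np.array([120, 130, 140, 150, 165, 180, 200, 210, 220], dtype=float)

def lwr_predict(x_query, X, y, tau=3.0):
    """Predict y at x_query with locally weighted linear regression."""
    # Gaussian kernel weights: nearby points get weights near 1, far points near 0
    w = np.exp(-(X - x_query) ** 2 / (2 * tau ** 2))
    W = np.diag(w)
    # Design matrix with an intercept column
    A = np.column_stack([np.ones_like(X), X])
    # Weighted normal equation: beta = (A^T W A)^{-1} A^T W y
    beta = np.linalg.solve(A.T @ W @ A, A.T @ W @ y)
    return beta[0] + beta[1] * x_query

# Repeat the local fit over a range of query points to trace a smooth curve
queries = np.arange(18, 41)
curve = [lwr_predict(q, X, y) for q in queries]
print(f"Prediction at BMI = 25: {lwr_predict(25, X, y):.1f}")
```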
Final Comparison: LWR vs. Linear Regression
Problems on LWR
This problem demonstrates Locally Weighted Regression (LWR) using a small dataset of four data points and a query point x = 2. The goal is to predict the corresponding expenditure (y) using LWR.
Step 1: Compute Prediction Using Linear Regression
Step 2: Compute Euclidean Distance
Step 3: Compute Weights Using the Gaussian Kernel
The weight for each instance is calculated using the Gaussian kernel formula. Let's assume τ = 2 for the kernel function. In short, τ = 2 is just an example value picked to balance local influence and smoothness:
- τ = 0.1 → only very close points have significant influence → more overfitting.
- τ = 10 → almost all points contribute → more global behavior, similar to ordinary least squares (OLS).

A quick numerical check of this effect is shown below.
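The snippet below evaluates the Gaussian weight of a point at distance 1 from the query for each of these bandwidths (the distance value is illustrative, not taken from the problem data).

```python
import numpy as np

# Weight given to a point at distance d = 1 from the query, for different bandwidths
d = 1.0
for tau in (0.1, 2, 10):
    w = np.exp(-d**2 / (2 * tau**2))
    print(f"tau = {tau:>4}: weight = {w:.4f}")
# tau = 0.1 -> weight ~ 0      (only near-zero distances matter)
# tau = 2   -> weight ~ 0.8825
# tau = 10  -> weight ~ 0.9950 (almost all points contribute, close to OLS)
```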
So, the Locally Weighted Regression prediction for x = 2 is 6.67.

Summary of the problem:
- We used a dataset mapping Salary (x) to Expenditure (y).
- We applied Euclidean distance to find the 3 closest points.
- We calculated weights using the Gaussian kernel.
- We computed the weighted average prediction, getting 6.67 instead of 5.96 (from global regression).