A statistical model or a machine learning algorithm is said to underfit when the model is too simple to capture the complexity of the data. Underfitting reflects the model's inability to learn the training data effectively, resulting in poor performance on both the training and testing data. In simple terms, an underfit model's predictions are inaccurate, especially on new, unseen examples. It mainly happens when we use a very simple model with overly simplified assumptions. To address underfitting, we need to use a more complex model, richer feature representations, and less regularization.
A statistical model is said to be overfitted when it does not make accurate predictions on testing data. When a model is trained too closely on its data, it starts learning from the noise and inaccurate entries in the dataset, so evaluation on test data shows high variance and the model fails to categorize the data correctly because it has absorbed too many details and too much noise. Overfitting is often caused by non-parametric and non-linear methods, because these types of machine learning algorithms have more freedom in building the model from the dataset and can therefore build unrealistic models. Solutions include using a linear algorithm if the data is linear, or constraining parameters such as the maximal depth when using decision trees.
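As a rough illustration of both failure modes, the sketch below (assuming scikit-learn and a synthetic noisy sine dataset, both chosen here purely for illustration) fits polynomial models of increasing degree: the degree-1 model typically shows high error on both splits (underfitting), while the degree-15 model typically shows very low training error but noticeably higher test error (overfitting).

```python
# Minimal sketch: under- vs. overfitting with polynomial regression on synthetic data.
import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_squared_error

rng = np.random.RandomState(0)
X = np.sort(rng.uniform(0, 1, 60))[:, None]
y = np.sin(2 * np.pi * X).ravel() + rng.normal(0, 0.2, 60)  # noisy nonlinear target

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=0)

for degree in (1, 4, 15):  # too simple, reasonable, too flexible
    model = make_pipeline(PolynomialFeatures(degree), LinearRegression())
    model.fit(X_train, y_train)
    print(degree,
          mean_squared_error(y_train, model.predict(X_train)),   # training error
          mean_squared_error(y_test, model.predict(X_test)))     # test error
```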
Overfitting and Underfitting in Machine Learning
Overfitting in machine learning Overfitting refers to a scenario in which the model tries to cover every data point present in the given dataset. The model starts capturing the noise and inaccurate values present in the dataset, which reduces the efficiency and accuracy of the model. An overfitted model has low bias and high variance.
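A minimal sketch of this behavior, assuming scikit-learn and a small synthetic dataset with injected label noise (both assumptions made here for illustration): an unconstrained decision tree is free to memorize the training points, so its training accuracy is typically near 1.0 while its test accuracy is noticeably lower.

```python
# Sketch of an overfitted model: an unconstrained decision tree memorizes a
# small noisy classification dataset (synthetic data, illustrative only).
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=200, n_features=20, n_informative=5,
                           flip_y=0.2, random_state=0)  # flip_y injects label noise
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=0)

tree = DecisionTreeClassifier(random_state=0)  # no depth limit: free to memorize
tree.fit(X_train, y_train)
print("train accuracy:", tree.score(X_train, y_train))  # typically ~1.0
print("test accuracy:", tree.score(X_test, y_test))     # typically noticeably lower
```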
Overfitting in machine learning Cont. Reasons for Overfitting
Overfitting in machine learning Cont. Overfitted model performance: the accuracy score is high during training but decreases during testing. How to avoid Overfitting: use cross-validation, apply regularization techniques, implement ensembling techniques, pick a less parameterized/complex model, train the model with sufficient data, remove features, and stop training early.
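To make two of these remedies concrete, the sketch below (synthetic data and an illustrative parameter grid, both assumptions) uses cross-validation to choose the strength of a ridge regularization penalty; a stronger penalty yields a simpler, less overfit model.

```python
# Sketch: cross-validation used to pick a regularization strength.
from sklearn.datasets import make_regression
from sklearn.linear_model import Ridge
from sklearn.model_selection import GridSearchCV

X, y = make_regression(n_samples=150, n_features=30, noise=15.0, random_state=0)

# Larger alpha = stronger regularization = simpler model, less prone to overfit.
grid = GridSearchCV(Ridge(),
                    param_grid={"alpha": [0.01, 0.1, 1.0, 10.0, 100.0]},
                    cv=5)
grid.fit(X, y)
print("best alpha:", grid.best_params_["alpha"])
print("cross-validated R^2:", grid.best_score_)
```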
Underfitting in machine learning Underfitting is just the opposite of overfitting. Underfitting occurs when our machine learning model is not able to capture the underlying trend of the data. An underfitted model has high bias and low variance.
Underfitting in machine learning Cont. Reasons for Underfitting
Underfitting in machine learning Cont. Underfitted model performance: the accuracy score is low during training as well as testing. How to avoid Underfitting: preprocess the data to reduce noise, train the model more, increase the number of features in the dataset, increase the model complexity, and increase the training time of the model to get better results.
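One way to see two of these remedies in action (assuming scikit-learn and the synthetic two-moons dataset, chosen only for illustration): a plain linear classifier underfits the nonlinear data, while adding polynomial features, i.e. more features and a more complex model, usually lifts both the training and test scores.

```python
# Sketch: fixing underfitting by adding features / increasing model complexity.
from sklearn.datasets import make_moons
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures

X, y = make_moons(n_samples=400, noise=0.2, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

linear = LogisticRegression().fit(X_train, y_train)          # too simple: underfits
richer = make_pipeline(PolynomialFeatures(degree=3),          # more features,
                       LogisticRegression(max_iter=1000)       # more complex model
                       ).fit(X_train, y_train)

print("linear   train/test:", linear.score(X_train, y_train), linear.score(X_test, y_test))
print("enriched train/test:", richer.score(X_train, y_train), richer.score(X_test, y_test))
```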
Good fit model in machine learning A good fit model is a balanced model that suffers from neither underfitting nor overfitting. It is the ideal model, giving a good accuracy score during training and performing equally well during testing.
Detecting Overfitting, Underfitting, and Good Fit (for classification and regression):
Error       Overfitting   Right Fit   Underfitting
Training    Low           Low         High
Test        High          Low         High
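The table above can be read off actual numbers with a sketch like the following (synthetic data; the k-nearest-neighbors model and the chosen k values are illustrative assumptions): a very small k tends to overfit (low training error, high test error) and a very large k tends to underfit (high error on both splits).

```python
# Sketch: diagnosing fit from training vs. test error at different complexities.
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

X, y = make_classification(n_samples=500, n_features=10, flip_y=0.1, random_state=1)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=1)

for k in (1, 15, 200):  # very flexible, moderate, very rigid
    clf = KNeighborsClassifier(n_neighbors=k).fit(X_train, y_train)
    print(f"k={k}: train error={1 - clf.score(X_train, y_train):.2f}, "
          f"test error={1 - clf.score(X_test, y_test):.2f}")
# Low train / high test  -> overfitting; both high -> underfitting; both low -> right fit.
```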
Comparison between Overfitting and Underfitting vs. Good Fit in a Model
Example To Understand Overfitting vs. Underfitting (vs. Good Fitting) in Machine Learning Consider an AI class consisting of students and a professor.
Example To Understand Underfitting vs. Overfitting (vs. Best Fitting) in Machine Learning Cont. We can broadly describe the students using 3 features (Hobby, Interest, Attention).
Example To Understand Underfitting vs. Overfitting (vs. Best Fitting) in Machine Learning Cont. The professor first delivers lectures and teaches the students about the problems and how to solve them. At the end of the day, the professor gives a quiz based on what he taught in the class.
Example To Understand Underfitting vs. Overfitting (vs. Best Fitting) in Machine Learning Cont. So, let's discuss what happens when the professor gives a classroom test at the end of the day: we can clearly infer that the student who simply memorizes everything scores better without much difficulty.
Example To Understand Underfitting vs. Overfitting (vs. Best Fitting) in Machine Learning Cont. Now here's the twist. Let's also look at what happens during the semester final, when students have to face new, unknown questions that were not taught in class by the professor.
How Does this Relate to Underfitting and Overfitting in Machine Learning Summarize the students' Class Test and Semester Exam scores together with the Interest feature. We make a dataset with the Class Test and Semester Exam scores as the training and testing data.
How Does this Relate to Underfitting and Overfitting in Machine Learning This example relates to the problem we encounter when comparing the train, test, and validation scores of a dataset, and in particular the train and test scores of the decision tree classifier.
How Does this Relate to Underfitting and Overfitting in Machine Learning Cont. Let’s work on connecting this example with the results of the decision tree classifier.
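A minimal sketch of that connection, assuming scikit-learn; the slides' student dataset is not reproduced here, so synthetic data stands in for it. The unconstrained tree plays the memorizing student (great on the class test / training set, weaker on the final / test set), while a depth-limited tree generalizes better.

```python
# Sketch: the memorizing student vs. the balanced student, as decision trees.
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=300, n_features=8, flip_y=0.15, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

memorizer = DecisionTreeClassifier(random_state=42).fit(X_train, y_train)            # no limits
balanced = DecisionTreeClassifier(max_depth=3, random_state=42).fit(X_train, y_train)  # constrained

print("memorizer  class test / final:",
      memorizer.score(X_train, y_train), memorizer.score(X_test, y_test))
print("balanced   class test / final:",
      balanced.score(X_train, y_train), balanced.score(X_test, y_test))
```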