DTS304TC: Machine Learning
Lecture 4: Boosting
Dr Kang Dang, D-5032, Taicang Campus
[email protected], Tel: 88973341
Ensembling Methods
Ensembles combine a group of prediction models that work together to improve classification performance.
Bagging revisited: bagging builds multiple models on randomly selected subsets of the training data.
Introduction to Boosting: today our focus shifts to boosting, a technique in which models are trained in sequence, with increased emphasis on examples that were misclassified in earlier rounds. A minimal comparison of the two is sketched below.
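As a rough illustration (not part of the original slides, and assuming scikit-learn is available), the following sketch trains a bagging ensemble and a boosting ensemble on the same synthetic data; the dataset and parameters are placeholders.

# Illustrative sketch: bagging vs. boosting with scikit-learn defaults.
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.ensemble import BaggingClassifier, AdaBoostClassifier

X, y = make_classification(n_samples=500, n_features=10, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Bagging: independent models trained on bootstrap samples of the training data.
bagging = BaggingClassifier(n_estimators=50, random_state=0).fit(X_train, y_train)

# Boosting: models trained in sequence, each focusing on previously misclassified examples.
boosting = AdaBoostClassifier(n_estimators=50, random_state=0).fit(X_train, y_train)

print("bagging accuracy:", bagging.score(X_test, y_test))
print("boosting accuracy:", boosting.score(X_test, y_test))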
Decision Stump
A weak learner is a model that predicts only marginally better than random guessing, for example with accuracy just above 50% (say 55%). Boosting builds on such weak models, favouring ones that are computationally simple, such as a decision stump: a decision tree with only a single decision node. A minimal stump is sketched below.
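A hedged sketch of a decision stump, here implemented as a depth-1 scikit-learn tree on placeholder synthetic data (the slides do not prescribe this library):

# A decision stump: a decision tree restricted to one decision node (max_depth=1).
from sklearn.datasets import make_classification
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=200, n_features=5, random_state=0)
stump = DecisionTreeClassifier(max_depth=1)   # a single split of the feature space
stump.fit(X, y)
# Usually better than chance, but far weaker than a full-depth tree.
print("training accuracy of the stump:", stump.score(X, y))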
Visualizing Decision Stump Classifiers
A decision stump splits the feature space with a single axis-aligned (horizontal or vertical) boundary, dividing it into two half-spaces, one per predicted class.
Understanding the AdaBoost Algorithm
Understanding the AdaBoost Algorithm
Key steps of AdaBoost:
In every iteration, AdaBoost adjusts the weights of the training instances, placing more weight on those that were previously misclassified.
A fresh weak classifier is then trained on the reweighted instances; each weak learner minimizes the error on the weighted data.
The new classifier is added to the current ensemble with a weight based on its accuracy, strengthening the collective decision-making power.
This process is repeated for many rounds.
By repeatedly focusing on prior errors, AdaBoost reduces the overall bias of the model, sharpening accuracy with each step.
Example of AdaBoost
Original dataset: equal weights for all samples.
AdaBoost Round 1
Train a classifier $h_1$ using the current sample weights. Compute its weighted training error $\epsilon_1$ and, from it, the weighting $\alpha_1$ of the current decision stump.
Question: why is the value 0.3 here? (Hint: with equal initial weights, the weighted training error is simply the fraction of samples the stump misclassifies.)
AdaBoost Round 1 (continued)
Train a classifier $h_1$ using the weighted samples, compute the weighted training error $\epsilon_1$, and obtain the weighting of the current decision stump $\alpha_1 = \frac{1}{2}\ln\frac{1-\epsilon_1}{\epsilon_1}$.
Question
If the decision stump is well optimized, its weighted error rate should always be less than 0.5. Why? (A stump that did worse than chance could simply have its prediction flipped, so the best stump is never worse than random guessing.)
When the error is below 0.5, the confidence (trust) level $\alpha_t$ is greater than 0. When will the confidence level be high?
The confidence $\alpha_t$ is used as the weight of each individual learner (weak classifier) in the ensemble; a worked check follows below.
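As a worked check, using the standard AdaBoost classifier-weight formula (the specific error values below are illustrative, not taken from the slides):

$\alpha_t = \tfrac{1}{2}\ln\tfrac{1-\epsilon_t}{\epsilon_t}, \qquad \epsilon_t = 0.5 \Rightarrow \alpha_t = 0, \qquad \epsilon_t = 0.3 \Rightarrow \alpha_t = \tfrac{1}{2}\ln\tfrac{0.7}{0.3} \approx 0.42, \qquad \epsilon_t = 0.1 \Rightarrow \alpha_t = \tfrac{1}{2}\ln 9 \approx 1.10$

So the confidence is high when the weighted error is small, zero at chance level, and would be negative if the learner were worse than chance.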
Question
In AdaBoost, after each boosting round we update the weight of each individual sample using the formula $w_i \leftarrow w_i \exp(-\alpha_t \, y_i \, h_t(x_i))$, followed by normalization. What does this mean? How do we explain the figure? (The update is spelled out below.)
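Spelling out the standard update, with $Z_t$ the normalizer that keeps the weights summing to one:

$w_i^{(t+1)} = \dfrac{w_i^{(t)}\,\exp\!\big(-\alpha_t\, y_i\, h_t(x_i)\big)}{Z_t}$

For a correctly classified sample, $y_i h_t(x_i) = +1$, so its weight is multiplied by $e^{-\alpha_t} < 1$; for a misclassified sample, $y_i h_t(x_i) = -1$, so its weight is multiplied by $e^{+\alpha_t} > 1$. This is why the next round's stump concentrates on the points the current ensemble gets wrong.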
AdaBoost Round 2
Train a classifier $h_2$ on the re-weighted samples, compute its weighted training error $\epsilon_2$, and obtain the weighting of the current decision stump $\alpha_2 = \frac{1}{2}\ln\frac{1-\epsilon_2}{\epsilon_2}$.
AdaBoost Round 3
Train a classifier $h_3$ on the re-weighted samples, compute its weighted training error $\epsilon_3$, and obtain the weighting of the current decision stump $\alpha_3 = \frac{1}{2}\ln\frac{1-\epsilon_3}{\epsilon_3}$.
AdaBoost Final Classifier
The final classifier combines the weak learners from all rounds, each weighted by its confidence; see the formula below.
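The combination rule is the standard AdaBoost final classifier:

$H(x) = \operatorname{sign}\!\Big(\sum_{t=1}^{T} \alpha_t\, h_t(x)\Big)$

Each weak learner votes with weight $\alpha_t$, so more accurate stumps contribute more to the final decision.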
Key steps of the above AdaBoost process
1. Start with data samples with weights (initially equal).
2. Train a decision stump on the weighted data (see the decision-tree-building lecture slides).
3. Calculate the weighted classification error of the decision stump.
4. Calculate the weighting factor of the current decision stump.
5. Increase the data sample weights for wrongly classified samples.
6. Proceed to the next round of boosting.
Final classifier after T rounds of boosting: $H(x) = \operatorname{sign}\big(\sum_{t=1}^{T} \alpha_t h_t(x)\big)$
AdaBoost Algorithm
AdaBoost Algorithm Key Steps
1. Train a classifier on the weighted data samples.
2. Calculate the weighted error of the current classifier: $\epsilon_t = \sum_i w_i \,\mathbf{1}[h_t(x_i) \neq y_i] \,/\, \sum_i w_i$.
3. Calculate the classifier coefficient: $\alpha_t = \frac{1}{2}\ln\frac{1-\epsilon_t}{\epsilon_t}$.
4. Update the data weights: $w_i \leftarrow w_i \exp(-\alpha_t y_i h_t(x_i))$, then normalize.
A from-scratch sketch of this loop follows below.
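A minimal from-scratch sketch of the loop above (an illustration under the usual AdaBoost assumptions, not the exact code used in the lecture), with scikit-learn decision stumps as the weak learners and a placeholder synthetic dataset:

# AdaBoost with decision stumps, labels in {-1, +1}.
import numpy as np
from sklearn.tree import DecisionTreeClassifier
from sklearn.datasets import make_classification

def adaboost_train(X, y, n_rounds=20):
    n = len(y)
    w = np.full(n, 1.0 / n)                        # start with equal sample weights
    stumps, alphas = [], []
    for t in range(n_rounds):
        stump = DecisionTreeClassifier(max_depth=1)
        stump.fit(X, y, sample_weight=w)           # 1. train on weighted data
        pred = stump.predict(X)
        eps = np.sum(w * (pred != y)) / np.sum(w)  # 2. weighted error
        eps = np.clip(eps, 1e-10, 1 - 1e-10)       # guard against log(0) / division by zero
        alpha = 0.5 * np.log((1 - eps) / eps)      # 3. classifier coefficient
        w = w * np.exp(-alpha * y * pred)          # 4. up-weight misclassified samples
        w = w / np.sum(w)                          # normalize
        stumps.append(stump)
        alphas.append(alpha)
    return stumps, alphas

def adaboost_predict(stumps, alphas, X):
    # Final classifier: sign of the alpha-weighted sum of weak-learner outputs.
    scores = sum(a * s.predict(X) for s, a in zip(stumps, alphas))
    return np.sign(scores)

X, y01 = make_classification(n_samples=300, random_state=0)
y = 2 * y01 - 1                                    # map labels {0, 1} to {-1, +1}
stumps, alphas = adaboost_train(X, y)
print("training accuracy:", np.mean(adaboost_predict(stumps, alphas, X) == y))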
AdaBoost Examples
The decision made by the most recently added weak learner is shown as a dashed black line; the collective decision boundary formed by the entire ensemble is shown in green.
Controlling AdaBoost Overfitting
AdaBoost's training error can be shown theoretically to approach zero as boosting progresses.
However, as the number of classifiers grows, there is a risk of overfitting the training data.
To prevent overfitting, the number of boosting rounds must be tuned, ideally using a separate validation set, as sketched below.
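A hedged sketch of choosing the number of boosting rounds on a validation set (assuming scikit-learn; the dataset and split are placeholders):

# Evaluate the ensemble after each boosting round and keep the best round count.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.ensemble import AdaBoostClassifier

X, y = make_classification(n_samples=1000, n_features=20, random_state=0)
X_train, X_val, y_train, y_val = train_test_split(X, y, test_size=0.3, random_state=0)

model = AdaBoostClassifier(n_estimators=200, random_state=0).fit(X_train, y_train)

# staged_score reports validation accuracy after each boosting iteration.
val_scores = list(model.staged_score(X_val, y_val))
best_rounds = int(np.argmax(val_scores)) + 1
print("best number of boosting rounds:", best_rounds,
      "validation accuracy:", val_scores[best_rounds - 1])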
AdaBoost Application: Face Detection
AdaBoost gained prominence through its effective use in face detection, achieving real-time face detection as early as 2001 (the Viola-Jones detector).
AdaBoost Application: Face Detection
The fundamental weak learner evaluates sums of pixel intensities within specified rectangular areas, which can be computed very efficiently (see the integral-image sketch below).
As the number of boosting iterations increases, the selected features increasingly concentrate on specific facial regions, improving detection accuracy.
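One well-known efficiency trick behind such rectangle features is the integral image, with which the pixel sum of any rectangle needs only four lookups. A small illustrative sketch (not the lecture's implementation):

import numpy as np

def integral_image(img):
    # ii[r, c] = sum of img[:r, :c]; padded with a leading row/column of zeros.
    return np.pad(img, ((1, 0), (1, 0))).cumsum(axis=0).cumsum(axis=1)

def rect_sum(ii, top, left, height, width):
    # Sum of img[top:top+height, left:left+width] via four corner lookups.
    return (ii[top + height, left + width] - ii[top, left + width]
            - ii[top + height, left] + ii[top, left])

img = np.arange(25, dtype=float).reshape(5, 5)
ii = integral_image(img)
print(rect_sum(ii, 1, 1, 3, 2))   # matches the direct sum below
print(img[1:4, 1:3].sum())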
AdaBoost Face Detection Experiments in Lab 2
Please do attend Lab 2 this week, where we will study face detection with AdaBoost in detail through experiments.
Course Summary
Boosting lowers bias by building an ensemble of weak classifiers, where each new classifier is trained to correct the errors of the preceding ensemble.
Careful calibration of the number of boosting iterations helps avert the risk of overfitting.
A practical application of boosting is face detection, showcasing the effectiveness of this ensemble method.