A slide presentation on two types of ensemble methods, bagging and boosting, used to identify malicious activity in the internal network of a company named General Electronics. The slides focus on how the two methods work.
Size: 243.1 KB
Language: en
Added: Apr 30, 2020
Slides: 11 pages
Slide Content
Ensemble Methods Mehnaz Maharin
Ensemble Methods
Work Flow
Bagging - Random Forest
What? A model averaging approach: a machine learning ensemble meta-algorithm designed to improve the stability and accuracy of machine learning algorithms used in statistical classification and regression.
Why? To reduce overfitting. To reduce variance.
Bagging: Decision Tree
Input features: Type_Name, Score, Status, Risk_factor, Avg_Score, Classification, HRU Indicator, Alert_Category, Alert_Type, Grouping, Indicator_Heat_Score
Output: Malicious
Tree diagram nodes: Score, Risk Factor, Classification; leaf value 1
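Bagging – Random Forest
Diagram: the dataset D (m records) is sampled into subsets d (n records each, with n < m and d < D). Each subset trains one decision tree (DT 1, DT 2, DT 3, ..., DT n). The individual outputs (1, 1, 1) are combined by Aggregation (Majority Voting) into the final output 1.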
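The bagging flow on this slide (sample subsets of D, train one tree per subset, aggregate by majority voting) could be sketched roughly as follows. This is a minimal illustration under several assumptions that are not in the slides: subsets are drawn with replacement, each "tree" is replaced by a hypothetical one-feature threshold rule (train_stump), and TOY_DATASET is invented purely for the example.

```python
import random
from collections import Counter

# Invented toy data (not from the slides): (score, label), label 1 = malicious.
TOY_DATASET = [(10, 0), (15, 0), (20, 0), (30, 0),
               (70, 1), (80, 1), (85, 1), (90, 1)]

def bootstrap_sample(dataset, n, rng):
    """Draw n records with replacement (assumed sampling scheme, n < m)."""
    return [rng.choice(dataset) for _ in range(n)]

def train_stump(sample):
    """Hypothetical stand-in for a decision tree.

    Tries every score in the sample as a threshold and keeps the one that
    misclassifies the fewest records, predicting 1 when score >= threshold.
    """
    best_threshold, best_errors = None, None
    for threshold, _ in sample:
        errors = sum((score >= threshold) != (label == 1)
                     for score, label in sample)
        if best_errors is None or errors < best_errors:
            best_threshold, best_errors = threshold, errors
    return best_threshold

def majority_vote(predictions):
    """Aggregate the individual outputs by taking the most common label."""
    return Counter(predictions).most_common(1)[0][0]

def bagging_predict(dataset, record_score, n_trees=5, sample_size=6, seed=0):
    """Train n_trees stumps on separate samples and vote on one record."""
    rng = random.Random(seed)
    votes = []
    for _ in range(n_trees):
        sample = bootstrap_sample(dataset, sample_size, rng)
        threshold = train_stump(sample)
        votes.append(1 if record_score >= threshold else 0)
    return majority_vote(votes), votes

if __name__ == "__main__":
    label, votes = bagging_predict(TOY_DATASET, record_score=95)
    print("votes:", votes, "-> final:", label)
```

An odd number of trees is used so that a two-class majority vote cannot tie. A real random forest would also sample a subset of the features for each tree, which this sketch leaves out.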
Boosting
What? A family of machine learning algorithms that convert weak learners into strong ones: a machine learning ensemble meta-algorithm designed to reduce bias and also variance.
Why? To reduce bias. To reduce variance.
Boosting
Diagram: Base Learner 1, Base Learner 2, Base Learner 3 (Weak Learners), with the Incorrectly Classified records passed from each learner to the next.
Table columns: f1, f2, f3, ..., O/P, Sample Weight
n records, each with Sample Weight 1/n
STUMPS: f1 (Y=4, N=1), f3, ..., fn
Table columns: f1, f2, f3, ..., O/P, Sample Weight (n = 7)
Step 1: Sample Weight = 1/7 for every record
Step 2: STUMPS: f1 (Y=4, N=1); Total Error (TE) = 1/7
Step 3: Performance of Stump = 0.5 * log_e((1 - TE) / TE)
Step 4: Updated Weight (0.8 on the slide)
Step 5: Normal Weight (0.6 on the slide)
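The arithmetic in Steps 3 to 5 can be checked with a short sketch. It assumes the standard AdaBoost-style update, which the slides do not name explicitly: a misclassified record's weight is multiplied by e^(performance), a correctly classified record's weight by e^(-performance), and the weights are then divided by their sum. The function names and the choice of which record is misclassified are illustrative, and the numbers it produces are not taken from the slide.

```python
import math

def stump_performance(total_error):
    """Performance of Stump = 0.5 * log_e((1 - TE) / TE)."""
    return 0.5 * math.log((1 - total_error) / total_error)

def update_weights(weights, misclassified, performance):
    """Return (updated, normalized) weights after one boosting round.

    Assumed AdaBoost-style rule: weights of misclassified records grow,
    weights of correctly classified records shrink.
    """
    updated = [
        w * math.exp(performance if i in misclassified else -performance)
        for i, w in enumerate(weights)
    ]
    total = sum(updated)
    normalized = [w / total for w in updated]
    return updated, normalized

if __name__ == "__main__":
    n = 7
    weights = [1 / n] * n            # Step 1: equal sample weights
    total_error = 1 / n              # Step 2: one record of seven is wrong
    performance = stump_performance(total_error)             # Step 3
    updated, normalized = update_weights(weights, {0}, performance)
    print("performance:", round(performance, 4))
    print("updated:", [round(w, 4) for w in updated])        # Step 4
    print("normalized:", [round(w, 4) for w in normalized])  # Step 5
```

Under this rule and with TE = 1/7, the single misclassified record ends up with half of the total normalized weight and each of the six correctly classified records with 1/12, so the next learner concentrates on the record that was missed.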
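Step 6: Buckets, built from the Normal Weight column (0.6 on the slide)
Step 7: new table with columns f1, f2, f3, ..., O/P (n = 7)
Test Dataset: passed through DT1, DT2, ..., DTn, which output 1, 1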
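One common reading of Steps 6 and 7 is that the normalized weights are turned into cumulative buckets and the next training table is drawn by picking random numbers and keeping the record whose bucket each number falls into. The sketch below follows that reading as an assumption; the slide shows only the labels, and build_buckets, resample, and the example weights are hypothetical.

```python
import bisect
import itertools
import random

def build_buckets(normalized_weights):
    """Upper edges of cumulative buckets, e.g. [0.5, 0.75, 1.0]."""
    return list(itertools.accumulate(normalized_weights))

def resample(records, normalized_weights, rng):
    """Draw len(records) records, each chosen with probability equal to its weight."""
    upper_edges = build_buckets(normalized_weights)
    new_dataset = []
    for _ in records:
        r = rng.random() * upper_edges[-1]
        index = bisect.bisect_right(upper_edges, r)
        # Clamp in case floating-point rounding pushes r onto the last edge.
        new_dataset.append(records[min(index, len(records) - 1)])
    return new_dataset

if __name__ == "__main__":
    records = ["r1", "r2", "r3", "r4", "r5", "r6", "r7"]
    # Illustrative weights: one heavily weighted record, six light ones.
    weights = [0.5] + [1 / 12] * 6
    print(build_buckets(weights))
    print(resample(records, weights, random.Random(0)))
```

Heavily weighted records own wider buckets, so they tend to appear several times in the new table of n = 7 rows, which is how the next stump is pushed toward the previously misclassified records.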