21. Regression Tree in machine learning.pptx


About This Presentation

Regression Tree


Slide Content

Regression Tree

INTRODUCTION
Classification Trees: used when the decision tree has a categorical target variable. The tree shown on the slide is a classification tree because the outcome can take only two values.
Regression Trees: used when the decision tree has a continuous target variable. For example, a regression tree would be used to predict the price of a newly launched product, since price can take any value depending on various constraints.
Both types of decision trees fall under the Classification and Regression Tree (CART) designation.
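As a quick, hedged illustration (not taken from the slides): scikit-learn exposes CART through two estimators, one per target type. The toy data below is made up.

from sklearn.tree import DecisionTreeClassifier, DecisionTreeRegressor

X = [[0], [1], [2], [3]]                                            # hypothetical single feature
clf = DecisionTreeClassifier().fit(X, ["no", "no", "yes", "yes"])   # categorical target: classification tree
reg = DecisionTreeRegressor().fit(X, [9.9, 12.5, 19.0, 25.0])       # continuous target: regression tree
print(clf.predict([[3]]), reg.predict([[3]]))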

Regression tree

Example: standard deviation
Golf players = {25, 30, 46, 45, 52, 23, 43, 35, 38, 46, 48, 52, 44, 30}
Average of golf players = (25 + 30 + 46 + 45 + 52 + 23 + 43 + 35 + 38 + 46 + 48 + 52 + 44 + 30)/14 = 39.78
Standard deviation of golf players = √[((25 − 39.78)² + (30 − 39.78)² + (46 − 39.78)² + … + (30 − 39.78)²)/14] = 9.32
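A minimal Python sketch to reproduce these two numbers (statistics.pstdev is the population standard deviation, which is what the formula above computes):

import statistics

golf_players = [25, 30, 46, 45, 52, 23, 43, 35, 38, 46, 48, 52, 44, 30]
mean = statistics.mean(golf_players)   # ≈ 39.79 (the slide rounds down to 39.78)
sd = statistics.pstdev(golf_players)   # population standard deviation ≈ 9.32
print(round(mean, 2), round(sd, 2))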

Golf players for sunny outlook = {25, 30, 35, 38, 48}
Average of golf players for sunny outlook = (25 + 30 + 35 + 38 + 48)/5 = 35.2
Standard deviation of golf players for sunny outlook = √(((25 − 35.2)² + (30 − 35.2)² + …)/5) = 7.78

Golf players for overcast outlook = {46, 43, 52, 44}
Average of golf players for overcast outlook = (46 + 43 + 52 + 44)/4 = 46.25
Standard deviation of golf players for overcast outlook = √(((46 − 46.25)² + (43 − 46.25)² + …)/4) = 3.49

Golf players for rainy outlook = {45, 52, 23, 46, 30}
Average of golf players for rainy outlook = (45 + 52 + 23 + 46 + 30)/5 = 39.2
Standard deviation of golf players for rainy outlook = √(((45 − 39.2)² + (52 − 39.2)² + …)/5) = 10.87

Weighted standard deviation for outlook = (4/14)x3.49 + (5/14)x10.87 + (5/14)x7.78 = 7.66 Standard deviation reduction for outlook = 9.32 – 7.66 = 1.66

Weighted standard deviation for humidity = (7/14)x9.36 + (7/14)x8.73 = 9.04 Standard deviation reduction for humidity = 9.32 – 9.04 = 0.27
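The whole procedure fits in one helper. A minimal sketch, assuming population standard deviations and the branch memberships listed above; sdr is a hypothetical name, not from the slides:

import statistics

def sdr(groups):
    """Standard deviation reduction: groups maps each feature value to the target values in that branch."""
    values = [v for branch in groups.values() for v in branch]
    n = len(values)
    before = statistics.pstdev(values)             # std dev before the split (9.32 at the root)
    after = sum(len(branch) / n * statistics.pstdev(branch)
                for branch in groups.values())     # weighted std dev after the split
    return before - after

outlook = {
    "Sunny":    [25, 30, 35, 38, 48],
    "Overcast": [46, 43, 52, 44],
    "Rainy":    [45, 52, 23, 46, 30],
}
print(round(sdr(outlook), 2))   # ≈ 1.66; the feature with the largest reduction becomes the root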

[Tree diagram] Root node: Outlook, with branches Sunny (5 records), Overcast (4 records), and Rainy (5 records); each subset keeps its own global standard deviation. Candidate sub-splits shown: Temperature (Hot/Mild/Cool), Wind (Weak/Strong), Humidity (High/Normal).

The standard deviation of the sunny-outlook subset computed above, 7.78, is considered the global standard deviation for this sub data set.

Standard deviation for sunny outlook and hot temperature = 2.5
Standard deviation for sunny outlook and cool temperature = 0 (a single instance, so zero spread)
Standard deviation for sunny outlook and mild temperature = 6.5

Weighted standard deviation for sunny outlook and temperature = (2/5)x2.5 + (1/5)x0 + (2/5)x6.5 = 3.6 Standard deviation reduction for sunny outlook and temperature = 7.78 – 3.6 = 4.18

Weighted standard deviation for sunny outlook and humidity = (3/5)x4.08 + (2/5)x5 = 4.45
Standard deviation reduction for sunny outlook and humidity = 7.78 – 4.45 = 3.33
Weighted standard deviation for sunny outlook and wind = (2/5)x9 + (3/5)x5.56 = 6.93
Standard deviation reduction for sunny outlook and wind = 7.78 – 6.93 = 0.85
[Table] Summary of standard deviations for the wind feature when outlook is sunny. Temperature gives the largest reduction (4.18), so the sunny branch is split on temperature; see the sketch below.
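Reusing the sdr helper from above on the sunny subset confirms temperature as the winner at this node. The branch memberships are taken from the linked sefiks example and are an assumption here, since the slides state only the weights and standard deviations:

sunny_splits = {
    "Temperature": {"Hot": [25, 30], "Cool": [38], "Mild": [35, 48]},   # memberships assumed from sefiks
    "Humidity":    {"High": [25, 30, 35], "Normal": [38, 48]},
    "Wind":        {"Weak": [25, 35, 38], "Strong": [30, 48]},
}
for feature, groups in sunny_splits.items():
    print(feature, round(sdr(groups), 2))                      # Temperature 4.18, Humidity 3.33, Wind 0.85
print(max(sunny_splits, key=lambda f: sdr(sunny_splits[f])))   # Temperature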

Final form of the regression tree: each leaf node predicts a number of golf players.
Source: https://sefiks.com/2018/08/28/a-step-by-step-regression-decision-tree-example/

Decision Tree
Information gain: the attribute with the highest gain is the best candidate to be selected as a node.
Entropy: if all the data belong to the same class label, entropy = 0 (pure); if the data are spread across many class labels, entropy is close to 1 (impure).
Nodes represent input attributes (e.g., Outlook).
Arcs/links/edges represent the values of input attributes (e.g., Sunny, Rainy, Overcast).
The top node is the root node; other internal nodes are intermediate nodes; a leaf node (last level of the tree) identifies the corresponding class label (e.g., Play = Yes/No).
From the decision tree we can derive classification rules. How many rules can be derived? One per leaf-level node.
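For the classification side, a hedged Python sketch of entropy and information gain; the 9-Yes/5-No counts below are the classic play-golf class distribution, assumed here for illustration:

import math
from collections import Counter

def entropy(labels):
    """0 when every label agrees (pure); near 1 for an evenly split binary node (impure)."""
    n = len(labels)
    return -sum(c / n * math.log2(c / n) for c in Counter(labels).values())

def information_gain(parent, branches):
    """Parent entropy minus the weighted entropy of the child branches; the highest gain wins the node."""
    n = len(parent)
    return entropy(parent) - sum(len(b) / n * entropy(b) for b in branches)

play = ["Yes"] * 9 + ["No"] * 5   # assumed class counts
print(round(entropy(play), 3))    # ≈ 0.940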