One R (1R) Algorithm

6,055 views 14 slides Sep 11, 2017

About This Presentation

A walk-through of the basics of the One R (1R) algorithm


Slide Content

OneR
Presented by Hien Nguyen

What is the OneR (1R) algorithm?
A way to find a very simple classification rule. OneR generates a one-level decision tree that tests just one feature.
[Diagram: Features, Model builder (Rules), Library, Recommendation]

OneR Steps
Consider each feature in turn
Create one branch in the decision tree for each value of that feature
Assign the majority class to each branch
Repeat for every feature and choose the one with the minimum error

OneR Pseudocode
For each feature,
    For each value of that feature, make a rule as follows:
        Count how often each class appears
        Find the most frequent class
        Make the rule assign that class to this feature value
    Calculate the error rate of the rules
Choose the feature whose rules have the smallest error rate
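The pseudocode above can be sketched in Python. This is a minimal sketch, not from the slides; the function and variable names are my own, and the toy dataset at the end is made up purely for illustration.

```python
from collections import Counter, defaultdict

def one_r(rows, features, target):
    """OneR: build one rule set per feature (each feature value ->
    its majority class), then keep the feature with the fewest errors."""
    best = None  # (feature, rules, error count)
    for feature in features:
        # Frequency table: feature value -> class counts
        table = defaultdict(Counter)
        for row in rows:
            table[row[feature]][row[target]] += 1
        # Rule: each feature value predicts its most frequent class
        rules = {value: counts.most_common(1)[0][0]
                 for value, counts in table.items()}
        # Errors: instances not in the majority class of their value
        errors = sum(sum(counts.values()) - max(counts.values())
                     for counts in table.values())
        if best is None or errors < best[2]:
            best = (feature, rules, errors)
    return best

# Tiny made-up dataset for this sketch
toy = [
    {"Color": "Red",  "Size": "S", "Class": "A"},
    {"Color": "Red",  "Size": "S", "Class": "A"},
    {"Color": "Blue", "Size": "L", "Class": "B"},
    {"Color": "Blue", "Size": "L", "Class": "B"},
    {"Color": "Red",  "Size": "L", "Class": "B"},
]
feature, rules, errors = one_r(toy, ["Color", "Size"], "Class")
print(feature, rules, errors)  # Size {'S': 'A', 'L': 'B'} 0
```

Here Size classifies the toy data perfectly (0 errors) while Color misclassifies one row, so OneR keeps the rules for Size.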

OneR Example: Dataset

Outlook   Temperature  Humidity  Windy  Play
Sunny     Hot          High      False  No
Sunny     Hot          High      True   No
Overcast  Hot          High      False  Yes
Rainy     Mild         High      False  Yes
Rainy     Cool         Normal    False  Yes
Rainy     Cool         Normal    True   No
Overcast  Cool         Normal    True   Yes
Sunny     Mild         High      False  No
Sunny     Cool         Normal    False  Yes
Rainy     Mild         Normal    False  Yes
Sunny     Mild         Normal    True   Yes
Overcast  Mild         High      True   Yes
Overcast  Hot          Normal    False  Yes
Rainy     Mild         High      True   No

OneR Example: Consider Outlook
Sunny    -> No   (2 errors out of 5)
Overcast -> Yes  (0 errors out of 4)
Rainy    -> Yes  (2 errors out of 5)
Total: 4 errors out of 14

OneR Example
Total errors per feature: Outlook 4/14, Temperature 5/14, Humidity 4/14, Windy 5/14

OneR Example
From this example, the decision trees based on Outlook and on Humidity give the minimum total error. We could choose either of these two features, with its corresponding rules, as our classification rule. A missing value is simply treated as another feature value. Suppose our latest observation is: Rainy, Mild, Normal, False, ?. Based on the rule for Outlook, Play = Yes; if we use the rule for Humidity, Play = Yes as well.
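A quick way to check these error counts is to tabulate each feature against Play. This is a hedged sketch, not the presenter's code: the dataset is the classic weather data shown on the earlier slide, and the variable names are my own.

```python
from collections import Counter, defaultdict

# Weather dataset: (Outlook, Temperature, Humidity, Windy, Play)
data = [
    ("Sunny", "Hot", "High", "False", "No"),
    ("Sunny", "Hot", "High", "True", "No"),
    ("Overcast", "Hot", "High", "False", "Yes"),
    ("Rainy", "Mild", "High", "False", "Yes"),
    ("Rainy", "Cool", "Normal", "False", "Yes"),
    ("Rainy", "Cool", "Normal", "True", "No"),
    ("Overcast", "Cool", "Normal", "True", "Yes"),
    ("Sunny", "Mild", "High", "False", "No"),
    ("Sunny", "Cool", "Normal", "False", "Yes"),
    ("Rainy", "Mild", "Normal", "False", "Yes"),
    ("Sunny", "Mild", "Normal", "True", "Yes"),
    ("Overcast", "Mild", "High", "True", "Yes"),
    ("Overcast", "Hot", "Normal", "False", "Yes"),
    ("Rainy", "Mild", "High", "True", "No"),
]
features = ["Outlook", "Temperature", "Humidity", "Windy"]

rules, errors = {}, {}
for i, f in enumerate(features):
    table = defaultdict(Counter)  # feature value -> class counts
    for row in data:
        table[row[i]][row[-1]] += 1
    rules[f] = {v: c.most_common(1)[0][0] for v, c in table.items()}
    errors[f] = sum(sum(c.values()) - max(c.values()) for c in table.values())

print(errors)  # Outlook and Humidity tie for the minimum: 4 errors each
# Classify the new observation (Rainy, Mild, Normal, False)
print(rules["Outlook"]["Rainy"], rules["Humidity"]["Normal"])  # Yes Yes
```

Both candidate rules agree on Play = Yes for the new observation, matching the slide.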

OneR with Numerical Attributes
To deal with numerical features, we discretize them. The steps are:
Sort the instances by the feature's value
Place breakpoints wherever the class changes
The breakpoints give us discrete numerical ranges
The majority class of each range becomes that range's prediction

Numerical Dataset
Temperature: 64  65  68  69  70  71  72  72  75  75  80  81  83  85
Play:        Yes No  Yes Yes Yes No  No  Yes Yes Yes No  Yes Yes No

Numerical Attributes and OneR
Applying these discretization steps, we get:

64   65   68   69   70   71   72   72   75   75   80   81   83   85
Yes | No | Yes  Yes  Yes | No   No | Yes  Yes  Yes | No | Yes  Yes | No

The problem with this approach is that we can end up with a large number of divisions, i.e. overfitting. Therefore, we can enforce a minimum number of instances per division (min = 3):

64   65   68   69   70   71   72   72   75   75   80   81   83   85
Yes  No   Yes  Yes  Yes | No   No   Yes  Yes  Yes | No   Yes  Yes  No

Numerical Attributes and OneR
When two adjacent divisions have the same majority class, we can join them:

64   65   68   69   70   71   72   72   75   75   80   81   83   85
Yes  No   Yes  Yes  Yes  No   No   Yes  Yes  Yes | No   Yes  Yes  No

This gives the following classification rules:
temperature <= 77.5 then play = Yes
temperature >  77.5 then play = No
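The discretization and merging steps above can be sketched as follows. The helper names are my own; one assumption to flag is tie-breaking: when a range's classes tie, this sketch picks the first class seen in that range, which matches the slide's choice of No for the 80-85 range.

```python
from collections import Counter

def discretize(pairs, min_size=3):
    """Split a (value, class) sequence into buckets for OneR: close a
    bucket once its majority class has min_size members and the next
    instance has both a different class and a different value."""
    pairs = sorted(pairs)
    buckets, current = [], []
    for i, (value, cls) in enumerate(pairs):
        current.append((value, cls))
        counts = Counter(c for _, c in current)
        majority, n = counts.most_common(1)[0]
        if n >= min_size and i + 1 < len(pairs):
            next_value, next_cls = pairs[i + 1]
            if next_cls != majority and next_value != value:
                buckets.append(current)
                current = []
    if current:
        buckets.append(current)
    return buckets

def merge_and_rules(buckets):
    """Join adjacent buckets with the same majority class, then emit
    (breakpoint, class) rules using midpoints between buckets."""
    majority = lambda b: Counter(c for _, c in b).most_common(1)[0][0]
    merged = []
    for b in buckets:
        if merged and majority(merged[-1]) == majority(b):
            merged[-1] = merged[-1] + b  # same majority: join divisions
        else:
            merged.append(b)
    rules = []
    for left, right in zip(merged, merged[1:]):
        cut = (left[-1][0] + right[0][0]) / 2  # midpoint breakpoint
        rules.append((cut, majority(left)))
    rules.append((float("inf"), majority(merged[-1])))
    return rules

temps = [(64, "Yes"), (65, "No"), (68, "Yes"), (69, "Yes"), (70, "Yes"),
         (71, "No"), (72, "No"), (72, "Yes"), (75, "Yes"), (75, "Yes"),
         (80, "No"), (81, "Yes"), (83, "Yes"), (85, "No")]
print(merge_and_rules(discretize(temps)))  # [(77.5, 'Yes'), (inf, 'No')]
```

On the slide's temperature data this reproduces the min = 3 buckets of sizes 5, 5, and 4, merges the first two (both majority Yes), and puts the single breakpoint at (75 + 80) / 2 = 77.5.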

SUMMARY
OneR is a simple classification algorithm that generates one rule for each predictor in the data, then selects the rule with the smallest total error as its "one rule". To create the rule for a predictor, we construct a frequency table of that predictor against the target. OneR produces rules that are easy for humans to interpret, and it is often used to establish a baseline.

Practice

Color   Size   Act      Age    Inflated
YELLOW  SMALL  STRETCH  ADULT  T
YELLOW  SMALL  DIP      ADULT  T
YELLOW  SMALL  DIP      CHILD  T
YELLOW  SMALL  STRETCH  ADULT  T
YELLOW  SMALL  STRETCH  CHILD  T
YELLOW  SMALL  DIP      ADULT  T
YELLOW  SMALL  DIP      CHILD  T
YELLOW  LARGE  STRETCH  ADULT  F
YELLOW  LARGE  STRETCH  CHILD  F
YELLOW  LARGE  DIP      ADULT  F
YELLOW  LARGE  DIP      CHILD  F
PURPLE  SMALL  STRETCH  ADULT  F
PURPLE  SMALL  STRETCH  CHILD  F
PURPLE  SMALL  DIP      ADULT  F
PURPLE  SMALL  DIP      CHILD  F
PURPLE  LARGE  STRETCH  ADULT  F
PURPLE  LARGE  STRETCH  CHILD  F
PURPLE  LARGE  DIP      CHILD  F

Please apply OneR on this dataset to the following test case: YELLOW, SMALL, STRETCH, CHILD, ?