Lecture04_Concept Learning_ FindS Algorithm.pptx

DrMTayyabChaudhry1 · 11 views · 78 slides · Sep 13, 2024

About This Presentation

Concept Learning for Machine Learning


Slide Content

Machine Learning Lecture 03 – Concept Learning – Find-S Algorithm. Slides Credit: Dr. Rao Muhammad Adeel Nawab. Edited by: Dr. Allah Bux Sargana

Representing Training Examples and Hypothesis for FIND-S Machine Learning Algorithm

Concept Learning
Concept learning, also known as category learning, was defined by Bruner, Goodnow, & Austin (1967) as "the search for and listing of attributes that can be used to distinguish exemplars from non-exemplars of various categories". In a concept learning task, a human or machine learner is trained to classify objects by being shown a set of example objects along with their class labels. A concept is an idea of something formed by combining all the features or attributes which construct the given concept. Every concept has two components:
Attributes: the features one must look for to decide whether a data instance is a positive instance of the concept.
A rule: the conjunction of constraints on the attributes that qualifies an instance as a positive instance of the concept.

Concept Learning (Cont.) Concept Learning: Acquiring the definition of a general category from given sample positive and negative training examples of the category. Concept Learning can be seen as a problem of searching through a predefined space of potential hypotheses for the hypothesis that best fits the training examples. The hypothesis space has a general-to-specific ordering of hypotheses, and the search can be efficiently organized by taking advantage of a naturally occurring structure over the hypothesis space.

Concept Learning (Cont.)
A Formal Definition for Concept Learning: inferring a Boolean-valued function from training examples of its input and output. An example of concept learning is learning the bird concept from given examples of birds (positive examples) and non-birds (negative examples). We are trying to learn the definition of a concept from given examples.

Learning Input-Output Functions – General Settings
Input to Learner: Set of Training Examples (D); Set of Functions / Hypotheses (H)
Output by Learner: A Hypothesis (h) from H which best fits the Training Examples (D)
Note that h is an approximation of the Target Function f. In this lecture, the Learner is the FIND-S Machine Learning Algorithm 😊

Lecture Focus
In this Lecture, we will take the Gender Identification Problem and try to explain three main things:
Representation of Training Examples (D): how to represent Training Examples (D) in a format which the FIND-S Algorithm can understand and learn from.
Representation of Hypothesis (h): how to represent a Hypothesis (h) in a format which the FIND-S Algorithm can understand.
Searching Strategy: what searching strategy is used by the FIND-S Algorithm to find an h from H which best fits the Training Examples (D).

Gender Identification Problem
Gender Identification Machine Learning Problem:
Input: Human Features
Output: Gender of a Human
Task: Given the features of a Human (Input), predict the Gender of the Human (Output)
Treated as: Learning an Input-Output Function, i.e., learn from Input to predict Output

Representation of Examples
Representation of Input and Output: Example = Input + Output, represented as Attribute-Value Pairs. Input = Human; Output = Gender.

Representation of Input
Input is represented as a set of 6 Input Attributes:
1. Height
2. Weight
3. HairLength
4. HeadCovered
5. WearingChain
6. ShirtSleeves

Representation of Input (Cont.)

No.  Input Attribute  Data Type    Input Attribute Values
x1   Height           Categorical  Short, Normal, Tall
x2   Weight           Categorical  Light, Heavy
x3   HairLength       Categorical  Short, Long
x4   HeadCovered      Categorical  Yes, No
x5   WearingChain     Categorical  Yes, No
x6   ShirtSleeves     Categorical  Half, Full

Representation of Output
Output is represented as a set of 1 Output Attribute: Gender.

No.  Output Attribute  Data Type    Output Attribute Values
x1   Gender            Categorical  Yes, No

Note: Yes means Female and No means Male.

Computing Size of Instance Space (X)

No.  Input Attribute  Input Attribute Values  No. of Values
x1   Height           Short, Normal, Tall     3
x2   Weight           Light, Heavy            2
x3   HairLength       Short, Long             2
x4   HeadCovered      Yes, No                 2
x5   WearingChain     Yes, No                 2
x6   ShirtSleeves     Half, Full              2

|X| = (No. of Height values) × (No. of Weight values) × (No. of HairLength values) × (No. of HeadCovered values) × (No. of WearingChain values) × (No. of ShirtSleeves values)
|X| = 3 × 2 × 2 × 2 × 2 × 2 = 96

Sample Data
We obtained a Sample Data of 6 examples:

No.  Height  Weight  HairLength  HeadCovered  WearingChain  ShirtSleeves  Gender
x1   Short   Light   Short       Yes          Yes           Half          Male
x2   Short   Light   Long        Yes          Yes           Half          Female
x3   Tall    Heavy   Long        Yes          Yes           Full          Female
x4   Short   Light   Long        Yes          No            Full          Male
x5   Short   Heavy   Short       Yes          Yes           Half          Female
x6   Tall    Light   Short       No           Yes           Full          Male

Representation of Hypothesis (h)
We represent a Hypothesis (h) as a Conjunction (AND) of Constraints on Input Attributes. Each constraint can be:
No value allowed (null hypothesis, Ø): e.g. Height = Ø
A specific value: e.g. Height = Short
A don't care value (any of the possible values): e.g. Height = ?
Most Specific Hypothesis (h), over <Height, Weight, HairLength, HeadCovered, WearingChain, ShirtSleeves>:
h = <Ø, Ø, Ø, Ø, Ø, Ø>

Representation of Hypothesis (h) (Cont.)
Most General Hypothesis (h), over <Height, Weight, HairLength, HeadCovered, WearingChain, ShirtSleeves>:
h = <?, ?, ?, ?, ?, ?>
Another Hypothesis (h):
h = <Normal, Light, ?, ?, No, ?>
Important Note: the order of Input Attributes must be exactly the same in a Training Example (d) and a Hypothesis (h).

Computing Size of Concept Space (C) and Hypothesis Space (H)

No.  Input Attribute  Input Attribute Constraints  No. of Constraints
x1   Height           Ø, Short, Normal, Tall, ?    5
x2   Weight           Ø, Light, Heavy, ?           4
x3   HairLength       Ø, Short, Long, ?            4
x4   HeadCovered      Ø, Yes, No, ?                4
x5   WearingChain     Ø, Yes, No, ?                4
x6   ShirtSleeves     Ø, Half, Full, ?             4

Computing Size of Concept Space (C) and Hypothesis Space (H)
Size of Instance Space (X): |X| = 96
Size of Concept Space (C): |C| = 2^|X| = 2^96 = 79,228,162,514,264,337,593,543,950,336
Size of Hypothesis Space (H), syntactically distinct hypotheses: |H| = 5 × 4 × 4 × 4 × 4 × 4 = 5,120
Size of Hypothesis Space (H), semantically distinct hypotheses: |H| = 1 + (4 × 3 × 3 × 3 × 3 × 3) = 973
For the semantic count, each attribute has only one more value than its specific values (the ?), plus one hypothesis representing the empty set of instances, since every hypothesis containing Ø classifies all instances as negative.
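These counts follow from simple products over the attribute tables above; a minimal sketch in Python (the dictionary layout is mine, the numbers are from the slides):

```python
# Number of values per input attribute (from the tables above)
values = {"Height": 3, "Weight": 2, "HairLength": 2,
          "HeadCovered": 2, "WearingChain": 2, "ShirtSleeves": 2}

# Instance space: product of the attribute value counts
X = 1
for v in values.values():
    X *= v                      # 3 * 2 * 2 * 2 * 2 * 2 = 96

# Concept space: every subset of the instance space is a possible concept
C = 2 ** X                      # 2^96

# Syntactically distinct hypotheses: each attribute allows its
# values plus the two extra constraints null (Ø) and '?'
syntactic = 1
for v in values.values():
    syntactic *= (v + 2)        # 5 * 4^5 = 5120

# Semantically distinct hypotheses: any hypothesis containing Ø
# matches nothing, so all of them collapse into one; otherwise
# each attribute allows its values plus '?'
semantic = 1
for v in values.values():
    semantic *= (v + 1)
semantic += 1                   # 4 * 3^5 + 1 = 973

print(X, C, syntactic, semantic)
```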

FIND-S Algorithm - Machine Learning Cycle

Machine Learning Cycle
The four phases of a Machine Learning Cycle are:
Training Phase: build the Model using Training Data.
Testing Phase: evaluate the performance of the Model using Testing Data.
Application Phase: deploy the Model in the Real-world, to make predictions on Real-time unseen Data.
Feedback Phase: take Feedback from the Users and Domain Experts to improve the Model.

Split the Sample Data
We split the Sample Data using a Random Split Approach into Training Data (2/3) and Testing Data (1/3).
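A random 2/3–1/3 split can be sketched as follows (the fixed seed and the list layout are illustrative choices; the six examples are from the Sample Data slide):

```python
import random

# The six labelled examples from the Sample Data slide
sample = [
    (("Short", "Light", "Short", "Yes", "Yes", "Half"), "Male"),
    (("Short", "Light", "Long",  "Yes", "Yes", "Half"), "Female"),
    (("Tall",  "Heavy", "Long",  "Yes", "Yes", "Full"), "Female"),
    (("Short", "Light", "Long",  "Yes", "No",  "Full"), "Male"),
    (("Short", "Heavy", "Short", "Yes", "Yes", "Half"), "Female"),
    (("Tall",  "Light", "Short", "No",  "Yes", "Full"), "Male"),
]

random.seed(0)                      # fixed seed so the split is reproducible
shuffled = sample[:]
random.shuffle(shuffled)
cut = (2 * len(shuffled)) // 3      # 2/3 of the examples go to training
train, test = shuffled[:cut], shuffled[cut:]
print(len(train), len(test))        # 4 2
```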

Sample Data

No.  Height  Weight  HairLength  HeadCovered  WearingChain  ShirtSleeves  Gender
x1   Short   Light   Short       Yes          Yes           Half          Male
x2   Short   Light   Long        Yes          Yes           Half          Female
x3   Tall    Heavy   Long        Yes          Yes           Full          Female
x4   Short   Light   Long        Yes          No            Full          Male
x5   Short   Heavy   Short       Yes          Yes           Half          Female
x6   Tall    Light   Short       No           Yes           Full          Male

Training Data

No.  Height  Weight  HairLength  HeadCovered  WearingChain  ShirtSleeves  Gender
x1   Short   Light   Short       Yes          Yes           Half          Female
x2   Short   Light   Long        Yes          Yes           Half          Female
x3   Tall    Heavy   Long        Yes          Yes           Full          Male
x4   Short   Light   Long        Yes          No            Full          Female

Testing Data

No.  Height  Weight  HairLength  HeadCovered  WearingChain  ShirtSleeves  Gender
x5   Short   Light   Short       Yes          Yes           Half          Male
x6   Tall    Light   Short      No           Yes           Full          Male

Note
After splitting the Sample Data using the Random Split Approach:
Sample Data is balanced: 3 Positive Instances (Female), 3 Negative Instances (Male).
Training Data is unbalanced: 3 Positive Instances (Female), 1 Negative Instance (Male).
Testing Data is unbalanced: 0 Positive Instances (Female), 2 Negative Instances (Male).

Sample Data – Vector Representation
Vector Representation of Examples (+ means Female / positive, − means Male / negative):
x1 = <Short, Light, Short, Yes, Yes, Half> −
x2 = <Short, Light, Long, Yes, Yes, Half> +
x3 = <Tall, Heavy, Long, Yes, Yes, Full> +
x4 = <Short, Light, Long, Yes, No, Full> −
x5 = <Short, Heavy, Short, Yes, Yes, Half> +
x6 = <Tall, Light, Short, No, Yes, Full> −

Training Data – Vector Representation
Vector Representation of Training Examples:
x1 = <Short, Light, Short, Yes, Yes, Half> +
x2 = <Short, Light, Long, Yes, Yes, Half> +
x3 = <Tall, Heavy, Long, Yes, Yes, Full> −
x4 = <Short, Light, Long, Yes, No, Full> +

Testing Data – Vector Representation
Vector Representation of Test Examples:
x5 = <Short, Light, Short, Yes, Yes, Half> −
x6 = <Tall, Light, Short, No, Yes, Full> −

Find-S Algorithm (or Learner)
1. Initialize h to the most specific hypothesis in H.
2. For each positive training instance x:
     For each attribute constraint a_i in h:
       If the constraint a_i in h is satisfied by x, then do nothing;
       else replace a_i in h by the next more general constraint that is satisfied by x.
3. Output hypothesis h.
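The steps above can be sketched in Python. The training tuples are the vectors from the Training Data slide; the choice of the string "0" as a stand-in for the null constraint Ø and the helper name find_s are my own:

```python
def find_s(training_examples, positive_label="Female"):
    """FIND-S: start from the most specific hypothesis and generalize
    it just enough to cover each positive example; negatives are ignored."""
    NULL = "0"   # stand-in for the 'no value allowed' constraint (Ø)
    n = len(training_examples[0][0])
    h = [NULL] * n                       # most specific hypothesis <Ø, ..., Ø>
    for x, label in training_examples:
        if label != positive_label:      # FIND-S skips negative examples
            continue
        for i, value in enumerate(x):
            if h[i] == NULL:
                h[i] = value             # Ø -> next more general: the value itself
            elif h[i] != value:
                h[i] = "?"               # conflicting values -> don't care
    return h

# Training Data from the slides (positive class: Female)
train = [
    (("Short", "Light", "Short", "Yes", "Yes", "Half"), "Female"),
    (("Short", "Light", "Long",  "Yes", "Yes", "Half"), "Female"),
    (("Tall",  "Heavy", "Long",  "Yes", "Yes", "Full"), "Male"),
    (("Short", "Light", "Long",  "Yes", "No",  "Full"), "Female"),
]
print(find_s(train))   # ['Short', 'Light', '?', 'Yes', '?', '?']
```

Running this reproduces the hypothesis derived step by step in the Training Phase below.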

Specific to General Constraints
We have three constraints on our Attributes, ordered from specific to general:
1. No value allowed (Ø) – the most specific constraint
2. A set of specific values (e.g. Short, Normal and Tall for the Height attribute)
3. A don't care value (?)
Note that a specific value is the next more general constraint than no value allowed (Ø), and ? is the next more general constraint than a specific value.

Training Phase
In the Training Phase, the FIND-S Algorithm will search H to find an h which best fits the Training Data. Best fit means h correctly classifies the Positive and Negative instances in the Training Data.
Correct Classification: a Positive instance is classified as Positive; a Negative instance is classified as Negative.
Incorrect Classification: a Positive instance is classified as Negative; a Negative instance is classified as Positive.

Training Phase (Cont.)
Initialize h to the most specific hypothesis in H: h = <Ø, Ø, Ø, Ø, Ø, Ø>
For each positive training instance x, for each attribute constraint a_i in h: if the constraint a_i in h is satisfied by x, then do nothing; else replace a_i in h by the next more general constraint that is satisfied by x.

Training Phase (Cont.)
First Training Example: x1 = <Short, Light, Short, Yes, Yes, Half> +
Let's see if the attribute constraints in h satisfy x1 or not:
If (Ø = Short AND Ø = Light AND Ø = Short AND Ø = Yes AND Ø = Yes AND Ø = Half) THEN Gender = Yes Else Gender = No

Training Phase (Cont.)
As we can see, the attribute constraints in h do not satisfy x1. Therefore, x1 is incorrectly classified as Negative. To satisfy x1, we replace the attribute constraints in h by the next more general constraints that are satisfied by x1:
h = <Ø, Ø, Ø, Ø, Ø, Ø> will become h1 = <Short, Light, Short, Yes, Yes, Half>

Training Phase (Cont.)
Let's see if the attribute constraints in h1 satisfy x1 or not:
If (Short = Short AND Light = Light AND Short = Short AND Yes = Yes AND Yes = Yes AND Half = Half) THEN Gender = Yes Else Gender = No
As we can see, the attribute constraints in h1 satisfy x1. Therefore, x1 is correctly classified as Positive.

Training Phase (Cont.)
Second Training Example: x2 = <Short, Light, Long, Yes, Yes, Half> +
Let's see if the attribute constraints in h1 satisfy x2 or not:
If (Short = Short AND Light = Light AND Short = Long AND Yes = Yes AND Yes = Yes AND Half = Half) THEN Gender = Yes Else Gender = No

Training Phase (Cont.)
As we can see, the attribute constraints in h1 do not satisfy x2. Therefore, x2 is incorrectly classified as Negative. To satisfy x2, we replace the attribute constraints in h1 by the next more general constraints that are satisfied by x2:
h1 = <Short, Light, Short, Yes, Yes, Half> will become h2 = <Short, Light, ?, Yes, Yes, Half>

Training Phase (Cont.)
Let's see if the attribute constraints in h2 satisfy x2 or not:
If (Short = Short AND Light = Light AND ? = Long AND Yes = Yes AND Yes = Yes AND Half = Half) THEN Gender = Yes Else Gender = No
As we can see, the attribute constraints in h2 satisfy x2. Therefore, x2 is correctly classified as Positive.

Note
The Learner (FIND-S Algorithm) has observed two Training Examples up till now, and our hypothesis is h2 = <Short, Light, ?, Yes, Yes, Half>. Let's see whether h2 best fits the observed Training Examples, i.e. x1 and x2: h2 correctly classifies x1 as Positive and x2 as Positive. To conclude, h2 best fits the first two observed Training Examples, i.e. x1 and x2.

Training Phase (Cont.)
Third Training Example: x3 = <Tall, Heavy, Long, Yes, Yes, Full> −
Note that the 3rd Training Example is Negative, and FIND-S only operates on Positive Training Examples. Therefore, there is no change, and h3 is the same as h2:
h2 = <Short, Light, ?, Yes, Yes, Half> will become h3 = <Short, Light, ?, Yes, Yes, Half>

Training Phase (Cont.)
Interestingly, h3 correctly classifies x3 as Negative:
If (Short = Tall AND Light = Heavy AND ? = Long AND Yes = Yes AND Yes = Yes AND Half = Full) THEN Gender = Yes Else Gender = No

Training Phase (Cont.)
h3 correctly classifies x1 as Positive, x2 as Positive, and x3 as Negative. Thus, h3 best fits the three Training Examples observed up till now.

Training Phase (Cont.)
Fourth Training Example: x4 = <Short, Light, Long, Yes, No, Full> +
Let's see if the attribute constraints in h3 satisfy x4 or not:
If (Short = Short AND Light = Light AND ? = Long AND Yes = Yes AND Yes = No AND Half = Full) THEN Gender = Yes Else Gender = No

Training Phase (Cont.)
As we can see, the attribute constraints in h3 do not satisfy x4. Therefore, x4 is incorrectly classified as Negative. To satisfy x4, we replace the attribute constraints in h3 by the next more general constraints that are satisfied by x4:
h3 = <Short, Light, ?, Yes, Yes, Half> will become h4 = <Short, Light, ?, Yes, ?, ?>

Training Phase (Cont.)
Let's see if the attribute constraints in h4 satisfy x4 or not:
If (Short = Short AND Light = Light AND ? = Long AND Yes = Yes AND ? = No AND ? = Full) THEN Gender = Yes Else Gender = No
As we can see, the attribute constraints in h4 satisfy x4. Therefore, x4 is correctly classified as Positive.

Training Phase (Cont.)
h4 correctly classifies x1 as Positive, x2 as Positive, x3 as Negative, and x4 as Positive. Thus, h4 best fits the four Training Examples observed up till now.
Note: there were a total of 4 Training Examples, and we have observed all of them.

Find-S Algorithm – Trace
h0 = <Ø, Ø, Ø, Ø, Ø, Ø>
x1 = <Short, Light, Short, Yes, Yes, Half> +  →  h1 = <Short, Light, Short, Yes, Yes, Half>
x2 = <Short, Light, Long, Yes, Yes, Half> +  →  h2 = <Short, Light, ?, Yes, Yes, Half>
x3 = <Tall, Heavy, Long, Yes, Yes, Full> −  →  h3 = h2 = <Short, Light, ?, Yes, Yes, Half>
x4 = <Short, Light, Long, Yes, No, Full> +  →  h4 = <Short, Light, ?, Yes, ?, ?>

Training Phase (Cont.)
After observing all the Training Examples, the FIND-S Algorithm will output hypothesis h. The h returned by the FIND-S Algorithm is h = <Short, Light, ?, Yes, ?, ?>.
Note: h is an approximation of the Target Function f.

Training Phase (Cont.)
Training Data → Model: h = <Short, Light, ?, Yes, ?, ?>

Training Phase (Cont.)
Model – in the form of Rules:
If (Height = Short AND Weight = Light AND HairLength = ? AND HeadCovered = Yes AND WearingChain = ? AND ShirtSleeves = ?) THEN Gender = Yes Else Gender = No
In the next phase, i.e. the Testing Phase, we will evaluate the performance of the Model.

Testing Phase
Question: How well has the Model learned?
Answer: Evaluate the performance of the Model on unseen data (or Testing Data).

Evaluation Measures
Evaluation will be carried out using the Error measure.

Error
Definition: Error is defined as the proportion of incorrectly classified Test instances.
Formula: Error = (No. of incorrectly classified Test instances) / (Total No. of Test instances)
Note: Accuracy = 1 − Error

Evaluate Model
Apply the Model on the Test Data.
Applying the Model on x5 = <Short, Light, Short, Yes, Yes, Half>:
If (Short = Short AND Light = Light AND Short = ? AND Yes = Yes AND Yes = ? AND Half = ?) THEN Gender = Yes Else Gender = No
Prediction returned by the Model: x5 is predicted Positive (Incorrectly Classified Instance).

Evaluate Model (Cont.)
Applying the Model on x6 = <Tall, Light, Short, No, Yes, Full>:
If (Tall = Short AND Light = Light AND Short = ? AND No = Yes AND Yes = ? AND Full = ?) THEN Gender = Yes Else Gender = No
Prediction returned by the Model: x6 is predicted Negative (Correctly Classified Instance).

Evaluate Model (Cont.)

Test Example                                 Actual     Predicted
x5 = <Short, Light, Short, Yes, Yes, Half>   − (Male)   + (Female)
x6 = <Tall, Light, Short, No, Yes, Full>     − (Male)   − (Male)

Error = 1/2 = 0.5
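The error computation above can be reproduced with a short sketch (the classify helper and variable names are my assumptions; the hypothesis, the test vectors, and the 0.5 error are from the slides):

```python
def classify(h, x):
    # A hypothesis predicts Positive iff every constraint is '?'
    # or equals the corresponding attribute value.
    return all(c == "?" or c == v for c, v in zip(h, x))

h = ["Short", "Light", "?", "Yes", "?", "?"]   # Model from the Training Phase

# Testing Data: x5 and x6, both actually Negative (Male)
test = [
    (("Short", "Light", "Short", "Yes", "Yes", "Half"), False),
    (("Tall",  "Light", "Short", "No",  "Yes", "Full"), False),
]

# Error = incorrectly classified test instances / total test instances
errors = sum(classify(h, x) != actual for x, actual in test)
error = errors / len(test)
print(error)   # 0.5 (x5 is misclassified as Positive, x6 is correct)
```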

Application Phase
We assume that our Model performed well on large Test Data and can be deployed in the Real-world. The Model is deployed in the Real-world, and now we can make Predictions on Real-time Data.

Steps – Making Predictions on Real-time Data
Step 1: Take Input from the User.
Step 2: Convert the User Input into a Feature Vector (exactly the same format as the Feature Vectors of the Training and Testing Data).
Step 3: Apply the Model on the Feature Vector.
Step 4: Return the Prediction to the User.

Example – Making Predictions on Real-time Data
Step 1: Take Input from the User.
Enter Height (Short, Normal, Tall): Short
Enter Weight (Light, Heavy): Light
Enter HairLength (Short, Long): Long
Is HeadCovered (Yes, No): Yes
Is WearingChain (Yes, No): Yes
Is ShirtSleeves (Half, Full): Half
Step 2: Convert the User Input into a Feature Vector: <Short, Light, Long, Yes, Yes, Half>
Note that the order of Attributes must be exactly the same as that of the Training and Testing Examples.

Example – Making Predictions on Real-time Data
Step 3: Apply the Model on the Feature Vector.
If (Short = Short AND Light = Light AND Long = ? AND Yes = Yes AND Yes = ? AND Half = ?) THEN Gender = Yes Else Gender = No
Step 4: Return the Prediction to the User: Positive
Note: You can take Input from the user, apply the Model and return predictions as many times as you like 😊
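Steps 3 and 4 amount to evaluating the learned rule on the feature vector; a minimal sketch (the function and variable names are my own):

```python
def classify(h, x):
    # Constraint '?' matches anything; otherwise values must be equal.
    return all(c == "?" or c == v for c, v in zip(h, x))

h = ["Short", "Light", "?", "Yes", "?", "?"]   # deployed Model

# Step 2 output: the user's input as a feature vector, in the same
# attribute order as the Training and Testing Examples
user = ("Short", "Light", "Long", "Yes", "Yes", "Half")

# Steps 3-4: apply the Model and return the prediction
prediction = "Positive (Female)" if classify(h, user) else "Negative (Male)"
print(prediction)   # Positive (Female)
```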

Feedback Phase
Take Feedback on your deployed Model from Domain Experts and Users. Improve your Model based on the Feedback 😊

Inductive Bias - FIND-S Algorithm
Inductive Bias: the set of assumptions needed, in addition to the Training Examples, to justify the Learner's classifications deductively.
Inductive Bias of the FIND-S Algorithm: the Training Data is error-free; the Target Function / Concept is present in the Hypothesis Space (H).

Strengths and Weaknesses - FIND-S Algorithm
Strengths: Returns a Model (h), which can be used to make predictions on unseen data.
Weaknesses:
Only works on error-free Data; however, Real-world Data is noisy.
Works on the assumption that the Target Function is present in the Hypothesis Space (H); however, we may or may not find the Target Function in the Hypothesis Space (H), and this may or may not be known.
Only returns one hypothesis which best fits the Training Data; however, there can be multiple hypotheses which best fit the Training Data.

TODO
Task: Consider the Titanic Dataset with the following Attributes:
Gender: Male, Female
Ticket Class: Upper, Middle, Lower
Parent/Child Aboard: Zero, One, Two, Three
Embarked: Cherbourg, Queenstown, Southampton
Survival: No, Yes

TODO (Cont.)
We obtained the following Sample Data:

No.  Gender  Ticket Class  Parent/Child Aboard  Embarked     Survival
x1   Male    Lower         Zero                 Southampton  No
x2   Female  Upper         Zero                 Cherbourg    Yes
x3   Male    Lower         Zero                 Southampton  No
x4   Female  Lower         Zero                 Southampton  Yes
x5   Male    Lower         Zero                 Queenstown   No
x6   Female  Upper         Zero                 Southampton  Yes

Sample Data was split into Training and Testing Data in a Train-Test Split Ratio of 67%-33%.

TODO (Cont.)
Training Data

No.  Gender  Ticket Class  Parent/Child Aboard  Embarked     Survival
x1   Male    Lower         Zero                 Southampton  No
x2   Female  Upper         Zero                 Cherbourg    Yes
x3   Male    Lower         Zero                 Southampton  No
x4   Female  Lower         Zero                 Southampton  Yes

TODO (Cont.)
Testing Data

No.  Gender  Ticket Class  Parent/Child Aboard  Embarked     Survival
x5   Male    Lower         Zero                 Queenstown   No
x6   Female  Upper         Zero                 Southampton  Yes

Note: Consider the FIND-S Algorithm when answering the questions given on the next slide. Your answers should be well justified.

TODO (Cont.)
Questions
1. Write down the Input and Output for the above Machine Learning Problem.
2. How is a Training Example represented?
3. How should a Hypothesis (h) be represented?
4. Calculate the size of the Instance Space, the Concept Space, the syntactically distinct Hypothesis Space, and the semantically distinct Hypothesis Space.
5. Execute the Machine Learning Cycle.
6. Write down the observations you made during the execution of the Machine Learning Cycle.

Your Turn
Task: Select a Machine Learning Problem (similar to: Titanic – Machine Learning from Disaster) and answer the questions given on the next slide.
Note: Consider the FIND-S Algorithm in answering all the questions.

Your Turn
Questions
1. Write down the Input and Output for the selected Machine Learning Problem.
2. How is a Training Example represented?
3. How should a Hypothesis (h) be represented?
4. Calculate the size of the Instance Space, the Concept Space, the syntactically distinct Hypothesis Space, and the semantically distinct Hypothesis Space.
5. Execute the Machine Learning Cycle.
6. Write down the observations you made during the execution of the Machine Learning Cycle.

Lecture Summary (Cont.)
Therefore, in Research, we mainly refine the solution(s) proposed for a Real-world Problem. The main steps of a Research Cycle are as follows:
Step 1: Identify the Real-world Problem.
Step 2: Propose a Solution (called Solution 01) to solve the Real-world Problem.
Step 3: List down the Strengths and Weaknesses of Solution 01.

Lecture Summary (Cont.)
Step 4: Propose a Solution (called Solution 02) to overcome the limitations of Solution 01 and further strengthen the Strengths of Solution 01.
Step 5: List down the Strengths and Weaknesses of Solution 02.
Step 6: Propose a Solution (called Solution 03) to overcome the limitations of Solution 02 and further strengthen the Strengths of Solution 02.

Lecture Summary (Cont.)
Step 7: Continue this cycle till the Day of Judgment 😊
Considering the FIND-S Algorithm:
Input to Learner (FIND-S Algorithm): Set of Training Examples (D); Set of Functions / Hypotheses (H)
Output by Learner (FIND-S Algorithm): A Hypothesis (h) from H which best fits the Training Examples (D)
Note that h is an approximation of the Target Function.

Lecture Summary (Cont.)
FIND-S Algorithm – Summary
Representation of Example: Attribute-Value Pairs
Representation of Hypothesis (h): Conjunction (AND) of Constraints on Attributes
Inductive Bias: the set of assumptions needed, in addition to the Training Examples, to justify the Learner's classifications deductively

Lecture Summary (Cont.)
Training Regime: Incremental Method
Inductive Bias of the FIND-S Algorithm: the Training Data is error-free; the Target Function / Concept is present in the Hypothesis Space (H)
Strengths: Returns a Model (h), which can be used to "make predictions" on unseen data

Lecture Summary (Cont.)
Weaknesses:
Only works on error-free Data; however, Real-world Data is "noisy".
Works on the assumption that the Target Function is present in the Hypothesis Space (H); however, we may or may not find the Target Function in the Hypothesis Space (H), and this may or may not be known.
Only returns one hypothesis which best fits the Training Data; however, there can be multiple hypotheses which best fit the Training Data.