FDP ON MACHINE LEARNING
Dr. M. Bindhu, Associate Professor/ECE, Saveetha Engineering College
OUTLINE
Machine Learning (AI) - Learning Rules
Sequential Covering Algorithm
Learn One Rule
Why Learn First Order Rules?
First Order Logic: Terminology
The FOIL Algorithm
Why Combine Inductive And Analytical Learning?
KBANN: Prior Knowledge To Initialize The Hypothesis
TangentProp, EBNN: Prior Knowledge Alters Search Objective
FOCL Algorithm: Prior Knowledge Alters Search Operators
Machine Learning (AI)
A key aspect of intelligence is the ability to learn knowledge over time. The vast majority of machine learning algorithms have been developed to learn knowledge that is inherently propositional, where the domain of interest is encoded in terms of a fixed set of variables. Inductive logic programming, a subfield of symbolic artificial intelligence, instead uses logic programming as a uniform representation for examples, background knowledge, and hypotheses.
https://youtu.be/ad79nYk2keg
LEARNING RULES
We can learn sets of rules by first learning a decision tree and then converting the tree to rules. We can also use a genetic algorithm that encodes the rules as bit strings. However, these approaches only work with propositional rules (no variables). They also consider the set of rules as a whole, not one rule at a time.
SEQUENTIAL COVERING ALGORITHM
1. Learn one rule with high accuracy, any coverage
2. Remove the positive examples covered by this rule
3. Repeat
Example:
Wind(d1) = Strong, Humidity(d1) = Low, Outlook(d1) = Sunny, PlayTennis(d1) = No
Wind(d2) = Weak, Humidity(d2) = Med, Outlook(d2) = Sunny, PlayTennis(d2) = Yes
Wind(d3) = Med, Humidity(d3) = Med, Outlook(d3) = Rain, PlayTennis(d3) = No
Target_attribute is the attribute we wish to learn, e.g. PlayTennis(x). Attributes is the set of all possible attributes, e.g. Wind(x), Humidity(x), Outlook(x). Threshold is the desired performance.
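A minimal Python sketch of this outer loop, assuming a hypothetical learn_one_rule helper that returns a predicate rule(example) -> bool, a performance function that scores a rule on the remaining examples, and examples stored as dicts whose target value is True for positive examples; none of these names come from the slides.

    def sequential_covering(target, examples, learn_one_rule, performance,
                            threshold=0.9):
        """Illustrative sketch of the SEQUENTIAL-COVERING outer loop."""
        learned_rules = []
        rule = learn_one_rule(target, examples)        # step 1: one rule
        while rule is not None and performance(rule, examples, target) > threshold:
            learned_rules.append(rule)
            # Step 2: remove the positive examples this rule covers.
            examples = [e for e in examples if not (rule(e) and e[target])]
            rule = learn_one_rule(target, examples)    # step 3: repeat
        return learned_rules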
Drawbacks of the Sequential Covering Algorithm
We require Learn-One-Rule to have high (ideally perfect) accuracy but not necessarily high coverage (i.e., when the rule makes a prediction, that prediction should be true). Since the algorithm performs a greedy search, it is not guaranteed to find the best or smallest set of rules that cover the training examples.
Why Learn First Order Rules?
Propositional logic allows the expression of individual propositions and their truth-functional combination.
1. Propositions like "Tom is a man" or "All men are mortal" may be represented by single proposition letters such as P or Q.
2. Truth-functional combinations are built up using connectives such as ∧, ∨, ¬, →. Example: P ∧ Q.
Inference rules are defined over propositional forms, e.g. P → Q. If P is "Tom is a man" and Q is "All men are mortal", the inference that Tom is mortal does not follow in propositional logic.
Why Learn First Order Rules? (contd.)
First-order logic allows the expression of propositions and their truth-functional combination, but it also allows us to represent propositions as assertions of predicates about individuals or sets of individuals. Example: propositions like "Tom is a man" or "All men are mortal" may be represented by predicate-argument representations such as man(tom) or ∀x (man(x) → mortal(x)) (so variables range over individuals). Inference rules then permit conclusions to be drawn about individuals or sets, e.g. mortal(tom).
Why Learn First Order Rules? (contd.)
Day  Outlook   Temp  Humid  Wind    PlayTennis
D1   Sunny     Hot   High   Weak    No
D2   Sunny     Hot   High   Strong  No
D3   Overcast  Hot   High   Weak    Yes
D4   Rain      Mild  High   Weak    Yes
D5   Rain      Cool  Low    Weak    Yes
D6   Rain      Cool  Low    Strong  No
D7   Overcast  Cool  Low    Strong  Yes
D8   Sunny     Mild  High   Weak    No
D9   Sunny     Cool  Low    Weak    Yes
D10  Rain      Mild  Low    Weak    Yes
D11  Sunny     Mild  Low    Strong  Yes
D12  Overcast  Mild  High   Strong  Yes
D13  Overcast  Hot   Low    Weak    Yes
D14  Rain      Mild  High   Strong  No
Learn-One-Rule
Pos ← positive Examples
Neg ← negative Examples
while Pos is not empty, do        // learn a NewRule
    NewRule ← most general rule possible
    NewRuleNeg ← Neg
    while NewRuleNeg is not empty, do        // add a new literal to specialize NewRule
        1. Candidate_literals ← generate candidates
        2. Best_literal ← argmax over L in Candidate_literals of Performance(SpecializeRule(NewRule, L))
        3. Add Best_literal to NewRule preconditions
        4. NewRuleNeg ← subset of NewRuleNeg that satisfies NewRule preconditions
    Learned_rules ← Learned_rules + NewRule
    Pos ← Pos - {members of Pos covered by NewRule}
Return Learned_rules
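A minimal Python sketch of the inner specialization loop above, assuming examples are dicts with a Boolean target value, candidate literals are predicates over examples, and performance is a scoring function such as the entropy-based one sketched after the summary below; all names are illustrative, not part of the original pseudocode.

    def learn_one_rule(target, examples, candidate_literals, performance):
        """Greedy general-to-specific search for a single rule (sketch)."""
        preconditions = []                     # most general rule: empty body
        def rule(e):
            return all(lit(e) for lit in preconditions)

        remaining = list(candidate_literals)
        neg_covered = [e for e in examples if not e[target] and rule(e)]
        while neg_covered and remaining:
            # Step 2: pick the literal whose addition scores best.
            best = max(remaining,
                       key=lambda lit: performance(
                           lambda e, lit=lit: rule(e) and lit(e),
                           examples, target))
            remaining.remove(best)
            preconditions.append(best)         # Step 3: specialize the rule
            neg_covered = [e for e in neg_covered if rule(e)]   # Step 4
        return rule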
Autonomous Car Technology (application example): Laser Terrain Mapping (Stanley), Learning from Human Drivers, Adaptive Vision, Path Planning. Images and movies taken from Sebastian Thrun's multimedia website.
Learn-One-Rule Summary
Idea: organize the hypothesis space search in a general-to-specific fashion. Start with the most general rule precondition, then greedily add the attribute that most improves performance measured over the training examples. The approach can be generalized to multi-valued target functions. There are other ways to define Performance() besides using entropy.
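Since Performance() is often defined via entropy, here is a hedged sketch of that choice, assuming rules are predicates over example dicts and a Boolean target attribute; the helper names are illustrative.

    import math

    def entropy(labels):
        """Shannon entropy of a list of Boolean target values."""
        if not labels:
            return 0.0
        p = sum(labels) / len(labels)
        if p in (0.0, 1.0):
            return 0.0
        return -(p * math.log2(p) + (1 - p) * math.log2(1 - p))

    def performance(rule, examples, target):
        """Negative entropy of the examples the rule covers: higher is purer."""
        covered = [e[target] for e in examples if rule(e)]
        return -entropy(covered)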
First-Order Logic Definitions
Every expression is composed of constants, variables, predicates, and functions.
A term is any constant, any variable, or any function applied to any term.
A literal is any predicate (or its negation) applied to any set of terms, e.g. Female(Mary), ¬Female(x).
A ground literal is a literal that does not contain any variables.
A clause is any disjunction of literals whose variables are universally quantified, e.g. ∀x: Female(x) ∨ Male(x).
A Horn clause is an expression of the form H ← L1 ∧ L2 ∧ ... ∧ Ln.
First Order Logic: Terminology – constants – e.g. bob, 23, a – variables – e.g. X,Y,Z – predicate symbols – e.g. female, father – predicates take on the values True or False only – function symbols – e.g. Age – functions can take on any constant as a value – connectives – e.g. ∧, ∨, ¬, → (or ←) – quantifiers – e.g. ∀, ∃
A term is:
– any constant – e.g. bob
– any variable – e.g. X
– any function applied to any term – e.g. age(bob)
A literal is any predicate or negated predicate applied to any terms – e.g. female(sue), ¬father(X,Y)
– A ground literal is a literal that contains no variables – e.g. female(sue)
– A positive literal is a literal that does not contain a negated predicate – e.g. female(sue)
– A negative literal is a literal that contains a negated predicate – e.g. ¬father(X,Y)
Learning First-Order Horn Clauses
Say we are trying to learn the concept Daughter(x, y) from examples. Each person is described by the attributes Name, Mother, Father, Male, Female. Each example is a pair of instances, say a and b:
Name(a) = Sharon, Mother(a) = Louise, Father(a) = Bob, Male(a) = False, Female(a) = True
Name(b) = Bob, Mother(b) = Nora, Father(b) = Victor, Male(b) = True, Female(b) = False, Daughter(a, b) = True
If we give a set of these examples to a propositional learner such as CN2 or C4.5, it will output rules like:
IF Father(a) = Bob ∧ Name(b) = Bob ∧ Female(a) THEN Daughter(a, b)
A first-order learner would output more general rules like:
IF Father(x) = y ∧ Female(x) THEN Daughter(x, y)
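To make the contrast concrete, here is a small illustrative sketch showing that the first-order rule applies to any pair of individuals, while the propositional rule is tied to the constant Bob. The dictionary-of-facts representation and the function name are assumptions for illustration, not part of CN2, C4.5, or any first-order learner.

    # Facts for the two instances from the slide, stored as simple dicts.
    people = {
        "Sharon": {"Mother": "Louise", "Father": "Bob",    "Female": True},
        "Bob":    {"Mother": "Nora",   "Father": "Victor", "Female": False},
    }

    def daughter(x, y, facts=people):
        """First-order rule: IF Father(x) = y AND Female(x) THEN Daughter(x, y).
        Works for any pair of names, not just the constant Bob."""
        return facts[x]["Father"] == y and facts[x]["Female"]

    print(daughter("Sharon", "Bob"))   # True
    print(daughter("Bob", "Victor"))   # False (Bob is not female)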
FOIL ALGORITHM
FOIL extends the SEQUENTIAL-COVERING and LEARN-ONE-RULE algorithms for propositional rule learning to first-order rule learning. FOIL learns in two phases:
• an outer loop which acquires a disjunction of Horn clause-like rules which together cover the positive examples
• an inner loop which constructs individual rules by progressive specialisation of a rule, adding new literals until no negative examples are covered.
FOIL(Target_predicate, Predicates, Examples)
Pos ← positive Examples
Neg ← negative Examples
while Pos is not empty, do        // learn a NewRule
    NewRule ← most general rule possible
    NewRuleNeg ← Neg
    while NewRuleNeg is not empty, do        // add a new literal to specialize NewRule
        1. Candidate_literals ← generate candidates
        2. Best_literal ← argmax over L in Candidate_literals of Foil_Gain(L, NewRule)
        3. Add Best_literal to NewRule preconditions
        4. NewRuleNeg ← subset of NewRuleNeg that satisfies NewRule preconditions
    Learned_rules ← Learned_rules + NewRule
    Pos ← Pos - {members of Pos covered by NewRule}
Return Learned_rules
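A hedged sketch of the Foil_Gain measure used in step 2, following the standard definition over rule bindings (p0, n0 are the positive/negative bindings covered before adding the literal, p1, n1 after, and t is the number of positive bindings of the old rule still covered); the function and argument names are illustrative.

    import math

    def foil_gain(p0, n0, p1, n1, t):
        """Foil_Gain(L, R) = t * ( log2(p1/(p1+n1)) - log2(p0/(p0+n0)) )."""
        if p0 == 0 or p1 == 0:
            return 0.0          # no positive coverage: adding L gains nothing
        return t * (math.log2(p1 / (p1 + n1)) - math.log2(p0 / (p0 + n0)))

    # Example taken from the FOCL search slide further below: specializing the
    # clause covering [4+,2-] with the literal ¬HandleOnTop gives [4+,0-].
    print(foil_gain(p0=4, n0=2, p1=4, n1=0, t=4))   # ≈ 2.34 bits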
Inductive and Analytical Learning
Inductive logic programming is particularly useful in bioinformatics and natural language processing.

INDUCTIVE LEARNING                    ANALYTICAL LEARNING
Hypothesis fits data                  Hypothesis fits domain theory
Statistical inference                 Deductive inference
Requires little prior knowledge       Learns from scarce data
Syntactic inductive bias              Bias is the domain theory
Plentiful data, no prior knowledge    Scarce data, perfect prior knowledge
KBANN (Knowledge-Based Artificial Neural Networks)
KBANN(data D, domain theory B):
1. Create a feedforward network h equivalent to B
2. Use BACKPROPAGATION to tune h to fit D
Neural Net Equivalent to Domain Theory
[Figure: network with inputs Expensive, BottomIsFlat, MadeOfCeramic, MadeOfStyrofoam, MadeOfPaper, HasHandle, HandleOnTop, HandleOnSide, Light, HasConcavity, ConcavityPointsUp, Fragile; hidden units Stable, Liftable, OpenVessel, Graspable; output Cup. Legend: large positive weight, large negative weight, negligible weight.]
Creating a Network Equivalent to the Domain Theory
Create one unit per Horn clause rule (an AND unit).
Connect the unit inputs to the corresponding clause antecedents.
For each non-negated antecedent, set the corresponding input weight w ← W, where W is some constant.
For each negated antecedent, set the weight w ← -W.
Set the threshold weight w0 ← -(n - 0.5)W, where n is the number of non-negated antecedents.
Finally, add additional connections with near-zero weights.
Example: Liftable ← Graspable ∧ ¬Heavy
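A minimal sketch of this weight-setting scheme for a single AND unit, applied to the clause Liftable ← Graspable ∧ ¬Heavy; the dict-based representation and the particular value chosen for W are illustrative assumptions.

    import math

    W = 4.0   # constant weight assigned to clause antecedents (illustrative choice)

    def unit_from_clause(positive_antecedents, negated_antecedents):
        """Weights and threshold for one sigmoid AND unit built from a Horn clause."""
        weights = {a: W for a in positive_antecedents}        # non-negated: w = W
        weights.update({a: -W for a in negated_antecedents})  # negated:     w = -W
        n = len(positive_antecedents)                         # non-negated count
        threshold = -(n - 0.5) * W                            # bias weight w0
        return weights, threshold

    def unit_output(inputs, weights, threshold):
        """Sigmoid unit: output is close to 1 only when the clause body holds."""
        net = threshold + sum(w * inputs.get(a, 0) for a, w in weights.items())
        return 1 / (1 + math.exp(-net))

    # Liftable <- Graspable AND NOT Heavy
    weights, w0 = unit_from_clause(["Graspable"], ["Heavy"])
    print(unit_output({"Graspable": 1, "Heavy": 0}, weights, w0))   # ~0.88, fires
    print(unit_output({"Graspable": 1, "Heavy": 1}, weights, w0))   # ~0.12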
Result of Refining the Network
[Figure: the same Cup network after refinement, with inputs Expensive, BottomIsFlat, MadeOfCeramic, MadeOfStyrofoam, MadeOfPaper, HasHandle, HandleOnTop, HandleOnSide, Light, HasConcavity, ConcavityPointsUp, Fragile; hidden units Stable, Liftable, OpenVessel, Graspable; output Cup. Legend: large positive weight, large negative weight, negligible weight.]
Hypothesis Space Search in KBANN
[Figure: hypothesis space, highlighting the set of hypotheses that fit the training data equally well, the initial hypothesis for KBANN, and the initial hypothesis for Backpropagation.]
EBNN (Explanation-Based Neural Network)
Key idea:
- Previously learned approximate domain theory
- Domain theory represented by a collection of neural networks
- Learn the target function as another neural network
Explanation in Terms of Domain Theory
Prior learned networks for useful concepts (Graspable, Stable, OpenVessel, Liftable) are combined into a single target network for Cup.
[Figure: inputs Expensive, BottomIsFlat, MadeOfCeramic, MadeOfStyrofoam, MadeOfPaper, HasHandle, HandleOnTop, HandleOnSide, Light, HasConcavity, ConcavityPointsUp, Fragile feeding the combined network.]
Hypothesis Space Search in TangentProp
[Figure: hypothesis space, contrasting the Backpropagation search toward hypotheses that maximize fit to the data with the TangentProp search toward hypotheses that maximize fit to the data and to prior knowledge.]
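A hedged sketch of a TangentProp-style objective that fits both the data and prior knowledge: an extra term penalizes disagreement between the network's derivative along a known transformation and the derivative asserted by that prior knowledge (often zero, for invariances). The finite-difference approximation and all names are illustrative assumptions, not the original formulation.

    def tangentprop_error(net, examples, transforms, mu=0.1, eps=1e-3):
        """Data-fit error plus a penalty on derivatives along known transforms.

        net(x)             -> predicted value for input x
        examples           -> list of (x, target, target_derivs) triples, where
                              target_derivs[j] is the derivative of the true
                              function along transform j
        transforms[j](x,a) -> input x transformed by amount a (a = 0 is identity)
        """
        error = 0.0
        for x, y, target_derivs in examples:
            error += (y - net(x)) ** 2                      # ordinary fit to data
            for j, s in enumerate(transforms):
                # finite-difference estimate of d/da net(s(x, a)) at a = 0
                net_deriv = (net(s(x, eps)) - net(s(x, -eps))) / (2 * eps)
                error += mu * (target_derivs[j] - net_deriv) ** 2
        return error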
FOCL Algorithm (First Order Combined Learner)
FOCL is an adaptation of FOIL that uses a domain theory. When adding a literal to a rule, it considers not only single literals but also literals derived from the domain theory. Most importantly, a potentially incorrect hypothesis is allowed as an initial approximation to the predicate to be learned. The main goal of FOCL is to incorporate the methods of explanation-based learning (EBL) into inductive rule learning.
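A hedged sketch of how the specialization step differs from FOIL: alongside FOIL's single candidate literals, FOCL also evaluates whole operationalized domain-theory clause bodies and lets the caller keep whichever candidate scores best. The rule object, its extended_with method, and the helper names are hypothetical.

    def focl_candidates(rule, foil_literals, domain_theory_clauses, operationalize):
        """Candidate specializations considered at each step of FOCL (sketch).

        foil_literals:          single literals FOIL would normally try
        domain_theory_clauses:  clauses whose head matches the target predicate
        operationalize(clause): expands the clause body into observable literals
        """
        candidates = [rule.extended_with([lit]) for lit in foil_literals]
        for clause in domain_theory_clauses:
            body = operationalize(clause)      # e.g. BottomIsFlat, Light, ...
            candidates.append(rule.extended_with(body))
        return candidates   # the caller picks the candidate with the best gain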
Search in FOCL
Cup ← HasHandle [2+,3-]
Cup ← ¬HasHandle [2+,3-]
Cup ← Fragile [2+,4-]
...
Cup ← BottomIsFlat, Light, HasConcavity, ConcavityPointsUp [4+,2-]
Cup ← BottomIsFlat, Light, HasConcavity, ConcavityPointsUp, HandleOnTop [0+,2-]
Cup ← BottomIsFlat, Light, HasConcavity, ConcavityPointsUp, ¬HandleOnTop [4+,0-]
Cup ← BottomIsFlat, Light, HasConcavity, ConcavityPointsUp, HandleOnSide [2+,0-]
...
FOCL Results
Recognizing legal chess endgame positions: 30 positive, 30 negative examples
- FOIL: 86%
- FOCL: 94% (using a domain theory with 76% accuracy)
NYNEX telephone network diagnosis: 500 training examples
- FOIL: 90%
- FOCL: 98% (using a domain theory with 95% accuracy)
VIDEO LINK https://www.youtube.com/watch?v=rVlN0ZCYmtI https://www.youtube.com/watch?v=SSrD02pdE78
RECAP Sequential Covering Algorithm Learn One Rule The FOIL Algorithm FOCL Algorithm