Inference in First-Order Logic

About This Presentation

Artificial Intelligence: A Modern Approach


Slide Content

INFERENCE IN FIRST-ORDER LOGIC
Artificial Intelligence: A Modern Approach, Ch. 9
2017.04.28 (Fri), Junya Tanaka (M1)

Definitions
Sound: inference derives only true conclusions given true premises.
Complete: inference derives all true conclusions from a set of premises.
Substitution: SUBST({var/value}, sentence) replaces a variable with a value.
Example: if P = Animal(x), then SUBST({x/Panda}, P) = Animal(Panda).
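
The substitution operation is mechanical enough to sketch in code. Below is a minimal Python rendering, assuming sentences are encoded as nested tuples with lowercase strings as variables and capitalized strings as constants; the encoding and names are illustrative, not from the slides.

```python
# A minimal sketch of SUBST under a made-up tuple encoding:
# variables are lowercase strings, constants are capitalized strings,
# compound expressions are ('Functor', arg1, arg2, ...).

def is_variable(t):
    return isinstance(t, str) and t[0].islower()

def subst(theta, sentence):
    """Apply substitution theta (a dict var -> term) to a sentence or term."""
    if is_variable(sentence):
        return theta.get(sentence, sentence)
    if isinstance(sentence, tuple):
        return tuple(subst(theta, arg) for arg in sentence)
    return sentence  # a constant symbol

# SUBST({x/Panda}, Animal(x)) = Animal(Panda)
print(subst({'x': 'Panda'}, ('Animal', 'x')))  # -> ('Animal', 'Panda')
```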

Introduction
Chapter 7 showed how a knowledge-based agent can represent the world in which it operates and deduce what actions to take.
Chapter 8 examined first-order logic, which is expressive enough to represent a good deal of our commonsense knowledge.
The goal of this chapter is to develop effective procedures for answering questions posed in first-order logic.

Agenda
9.1 Propositional vs. First-Order Inference
9.2 Unification and Lifting
9.3 Forward Chaining
9.4 Backward Chaining
9.5 Resolution
9.6 Summary

9.1 Propositional vs. First-Order Inference
This section and the next introduce the ideas underlying modern logical inference systems: simple inference rules that can be applied to sentences with quantifiers to obtain sentences without quantifiers.
Inference rules for quantifiers:
Universal quantifier (∀): a logical constant interpreted as "given any" or "for all".
Existential quantifier (∃): a logical constant interpreted as "there exists", "there is at least one", or "for some".

Universal Instantiation (UI)
We can infer any sentence obtained by substituting a ground term (a term without variables) for a universally quantified variable.
Example: if ∀x Person(x) (every x is a person), then Person(John), Person(Jack), and so on are true, so substituting any ground term for x is legal.
Let SUBST(θ, α) denote the result of applying the substitution θ to the sentence α. Then the rule is written, for any variable v and ground term g:
∀v α ⊢ SUBST({v/g}, α)

Example
Suppose our knowledge base contains the standard folkloric axiom stating that all greedy kings are evil:
∀x King(x) ∧ Greedy(x) ⇒ Evil(x)
Then it seems quite permissible to infer any of the following sentences:
King(John) ∧ Greedy(John) ⇒ Evil(John)
King(Richard) ∧ Greedy(Richard) ⇒ Evil(Richard)
King(Father(John)) ∧ Greedy(Father(John)) ⇒ Evil(Father(John))

Existential Instantiation (EI)
An existential sentence says there is some object satisfying a condition; the existential instantiation rule just gives a name to that object.
For any sentence α, variable v, and constant symbol k that does not appear elsewhere in the knowledge base:
∃v α ⊢ SUBST({v/k}, α)
Intuition: there is "some" object satisfying the condition, so we cannot substitute the variable arbitrarily. Instead we introduce a new constant C and assume that C satisfies the condition.

Example
From the sentence ∃x Crown(x) ∧ OnHead(x, John) we can infer the sentence Crown(C1) ∧ OnHead(C1, John), as long as C1 does not appear elsewhere in the knowledge base.
Such a new constant symbol C1 is called a Skolem constant.
Existential instantiation is a special case of a more general process called Skolemization.

UI and EI
UI can be applied several times to add new sentences; the new KB is logically equivalent to the old. Since the universal sentence holds for everything, instantiating it with anything is valid.
EI can be applied only once, to replace the existential sentence; the new KB is not equivalent to the old, but it is satisfiable iff the old KB was satisfiable.
E.g. from ∃x Killer(x, Victim) we get Killer(Murderer, Victim), and the original sentence with ∃ is discarded. The new KB is not logically equivalent to the old one, since the original sentence was replaced, yet the two KBs are inferentially equivalent.

Reduction to propositional inference
Once we have rules for inferring nonquantified sentences from quantified sentences, it becomes possible to reduce first-order inference to propositional inference.
The key idea: just as an existentially quantified sentence can be replaced by one instantiation, a universally quantified sentence can be replaced by the set of all possible instantiations.

Example
Suppose the KB contains just the following:
∀x King(x) ∧ Greedy(x) ⇒ Evil(x)
King(John)
Greedy(John)
Brother(Richard, John)
Instantiating the universal sentence in all possible ways, we have:
King(John) ∧ Greedy(John) ⇒ Evil(John)
King(Richard) ∧ Greedy(Richard) ⇒ Evil(Richard)
King(John)
Greedy(John)
Brother(Richard, John)
The new KB is propositionalized: the proposition symbols are King(John), Greedy(John), Evil(John), King(Richard), etc.
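
As a rough illustration, reusing subst and the tuple encoding from the earlier sketch, propositionalization just substitutes every ground term for every universally quantified variable:

```python
from itertools import product

def instantiate(variables, ground_terms, sentence):
    """Yield every ground instance of a universally quantified sentence."""
    for values in product(ground_terms, repeat=len(variables)):
        yield subst(dict(zip(variables, values)), sentence)

# The rule "forall x King(x) and Greedy(x) implies Evil(x)"
# over the ground terms {John, Richard}:
rule = ('Implies', ('And', ('King', 'x'), ('Greedy', 'x')), ('Evil', 'x'))
for ground in instantiate(['x'], ['John', 'Richard'], rule):
    print(ground)
```

With a function symbol such as Father in the vocabulary, the set of ground terms becomes infinite, which is exactly Problem 1 on the next slide.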

Problems
Problem 1: with function symbols there are infinitely many ground terms, e.g. Father(Father(Father(John))).
Problem 2: we can check entailment of a sentence (Modus Ponens, resolution, etc.), but we are unable to check non-entailment: infinitely nested terms lead to an infinite loop.

9.2 Unification and Lifting
The propositionalization approach is rather inefficient.
Example: the inference of Evil(John) from the sentences
∀x King(x) ∧ Greedy(x) ⇒ Evil(x)
King(John)
Greedy(John)
seems completely obvious to a human being, yet propositionalization generates many irrelevant instantiations along the way.

First-order inference rule: Generalized Modus Ponens
It can be summarized as "p implies q, and p is asserted to be true, therefore q must be true", lifted to first-order logic.
For atomic sentences pi, pi′, and q, where there is a substitution θ such that SUBST(θ, pi′) = SUBST(θ, pi) for all i:
p1′, p2′, …, pn′, (p1 ∧ p2 ∧ … ∧ pn ⇒ q) ⊢ SUBST(θ, q)

Example of Generalized Modus Ponens
Use the rule that greedy kings are evil: find some x such that x is a king and x is greedy, and then infer that this x is evil.
The substitution θ = {x/John, y/John} applied to the implication premises King(x) and Greedy(x) and to the knowledge-base sentences King(John) and Greedy(y) makes them identical:
p1′ is King(John), p2′ is Greedy(y)
p1 is King(x), p2 is Greedy(x)
θ is {x/John, y/John}
q is Evil(x), so SUBST(θ, q) is Evil(John)

Unification
We can get the inference immediately if we can find a substitution θ such that King(x) and Greedy(x) match King(John) and Greedy(y); θ = {x/John, y/John} works.
Unification finds substitutions that make different logical expressions look identical.
UNIFY takes two sentences and returns a unifier for them, if one exists:
UNIFY(p, q) = θ where SUBST(θ, p) = SUBST(θ, q)
Basically, find a θ that makes the two sentences look alike.

Examples of Unification
UNIFY(Knows(John, x), Knows(John, Jane)) = {x/Jane}
UNIFY(Knows(John, x), Knows(y, Bill)) = {x/Bill, y/John}
UNIFY(Knows(John, x), Knows(y, Mother(y))) = {y/John, x/Mother(John)}
UNIFY(Knows(John, x), Knows(x, Elizabeth)) = fail
The last example fails because x would have to be both John and Elizabeth.
We can avoid this problem by standardizing apart one of the two sentences, renaming its variables. The query becomes UNIFY(Knows(John, x), Knows(z, Elizabeth)).
This is solvable: UNIFY(Knows(John, x), Knows(z, Elizabeth)) = {x/Elizabeth, z/John}

Unification
To unify Knows(John, x) and Knows(y, z) we can use θ = {y/John, x/z} or θ = {y/John, x/John, z/John}.
The first unifier is more general than the second.
There is a single most general unifier (MGU) that is unique up to renaming of variables: MGU = {y/John, x/z}

Unification Algorithm
(This slide reproduced the recursive UNIFY algorithm from the textbook.)
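
A compact Python version of the unification algorithm, in the spirit of the figure on this slide, using the same tuple encoding as the earlier sketches. One simplification is flagged in a comment: a full implementation also needs an occur check.

```python
def unify(x, y, theta):
    """Return a substitution extending theta that makes x and y identical, or None."""
    if theta is None:
        return None
    if x == y:
        return theta
    if is_variable(x):
        return unify_var(x, y, theta)
    if is_variable(y):
        return unify_var(y, x, theta)
    if isinstance(x, tuple) and isinstance(y, tuple) and len(x) == len(y):
        for xi, yi in zip(x, y):
            theta = unify(xi, yi, theta)
        return theta
    return None

def unify_var(var, x, theta):
    if var in theta:
        return unify(theta[var], x, theta)      # chase existing bindings
    if is_variable(x) and x in theta:
        return unify(var, theta[x], theta)
    # NOTE: a full implementation also performs an occur check here.
    new_theta = dict(theta)
    new_theta[var] = x
    return new_theta

print(unify(('Knows', 'John', 'x'), ('Knows', 'y', 'Bill'), {}))
# -> {'y': 'John', 'x': 'Bill'}
print(unify(('Knows', 'John', 'x'), ('Knows', 'x', 'Elizabeth'), {}))
# -> None (fail: x cannot be both John and Elizabeth)
```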

Storage and Retrieval
We use TELL and ASK to interact with the inference engine; they are implemented with STORE and FETCH.
STORE(s) stores sentence s.
FETCH(q) returns all unifiers such that the query q unifies with some sentence in the knowledge base.
Example: q = Knows(John, x), and the KB contains Knows(John, Jane), Knows(y, Bill), Knows(y, Mother(y)).
The result is θ1 = {x/Jane}, θ2 = {x/Bill, y/John}, θ3 = {y/John, x/Mother(John)}.

Storage and Retrieval
First approach: keep one long list of all facts in the knowledge base and attempt unification with every one of them. This works, but it is inefficient.
We need to restrict unification attempts to sentences that have some chance of unifying, so we index the facts in the knowledge base.
Predicate indexing: all "Knows" sentences go in one bucket, all "Loves" sentences in another, and so on (a sketch follows).
For more selective retrieval, use a subsumption lattice (see the next slide).
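
A minimal sketch of predicate indexing for STORE/FETCH, reusing unify from the earlier sketch. The class and method names are illustrative, not the book's API.

```python
from collections import defaultdict

class KB:
    """Toy knowledge base with predicate indexing."""
    def __init__(self):
        self.index = defaultdict(list)        # predicate symbol -> facts

    def store(self, fact):
        self.index[fact[0]].append(fact)      # bucket by predicate symbol

    def fetch(self, query):
        # Only try unification against facts with the same predicate.
        for fact in self.index[query[0]]:
            theta = unify(query, fact, {})
            if theta is not None:
                yield theta

kb = KB()
kb.store(('Knows', 'John', 'Jane'))
kb.store(('Loves', 'John', 'Mary'))           # never examined by the query below
print(list(kb.fetch(('Knows', 'John', 'x'))))  # -> [{'x': 'Jane'}]
```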

Storage and Retrieval
Subsumption lattice:
A child is obtained from its parent through a single substitution.
The lattice contains all possible queries that can be unified with a stored fact.
This works well for small lattices, but a predicate with n arguments produces a lattice of size 2^n, and the structure of the lattice depends on whether the stored sentence contains repeated constants or variables.

9.3 Forward Chaining
Forward chaining is one of the two main methods of reasoning with an inference engine; it can be described logically as repeated application of Modus Ponens.
Idea:
Start with the atomic sentences in the KB.
Apply Generalized Modus Ponens.
Add new atomic sentences until no further inferences can be made.
Works well for a KB consisting of Situation ⇒ Response clauses when processing newly arrived data.

First-order definite clauses
A definite clause is a disjunction of literals of which exactly one is positive.
Examples: King(x) ∧ Greedy(x) ⇒ Evil(x), King(John), Greedy(y)
First-order definite clauses can include variables, which are assumed to be universally quantified: Greedy(y) means ∀y Greedy(y).
Not every KB can be converted into a set of first-order definite clauses.

Example (1/2)
Consider the following problem:
The law says that it is a crime for an American to sell weapons to hostile nations. The country Nono, an enemy of America, has some missiles, and all of its missiles were sold to it by Colonel West, who is American.
Prove that West is a criminal.

Example (2/2)
It is a crime for an American to sell weapons to hostile nations:
American(x) ∧ Weapon(y) ∧ Sells(x, y, z) ∧ Hostile(z) ⇒ Criminal(x)
Nono has some missiles, i.e. ∃x Owns(Nono, x) ∧ Missile(x); by EI:
Owns(Nono, M1) and Missile(M1)
All of its missiles were sold to it by Colonel West:
Missile(x) ∧ Owns(Nono, x) ⇒ Sells(West, x, Nono)
Missiles are weapons:
Missile(x) ⇒ Weapon(x)
An enemy of America counts as "hostile":
Enemy(x, America) ⇒ Hostile(x)
West, who is American:
American(West)
The country Nono, an enemy of America:
Enemy(Nono, America)

Simple forward-chaining algorithm
(This slide reproduced the FOL-FC-ASK algorithm from the textbook: on each iteration, every rule whose premises can all be satisfied from known facts adds its instantiated conclusion, until the query is derived or no new facts appear.)
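
Below is a naive Python rendering of that loop, reusing subst and unify from the earlier sketches. Rules are (premises, conclusion) pairs over the same tuple encoding; this is a sketch of the idea, not the book's exact pseudocode.

```python
def fol_fc_ask(rules, facts, query):
    """Saturate the KB with forward chaining, checking new facts against the query."""
    facts = set(facts)
    while True:
        new = set()
        for premises, conclusion in rules:
            for theta in match_premises(list(premises), facts, {}):
                q = subst(theta, conclusion)
                if q not in facts and q not in new:
                    new.add(q)
                    phi = unify(q, query, {})
                    if phi is not None:
                        return phi            # query derived
        if not new:
            return None                       # saturated without deriving the query
        facts |= new

def match_premises(premises, facts, theta):
    """Yield every substitution that satisfies all premises against known facts."""
    if not premises:
        yield theta
        return
    for fact in facts:
        theta2 = unify(premises[0], fact, theta)
        if theta2 is not None:
            yield from match_premises(premises[1:], facts, theta2)

# The crime KB from the previous slides:
crime_rules = [
    ([('American', 'x'), ('Weapon', 'y'), ('Sells', 'x', 'y', 'z'), ('Hostile', 'z')],
     ('Criminal', 'x')),
    ([('Missile', 'x'), ('Owns', 'Nono', 'x')], ('Sells', 'West', 'x', 'Nono')),
    ([('Missile', 'x')], ('Weapon', 'x')),
    ([('Enemy', 'x', 'America')], ('Hostile', 'x')),
]
crime_facts = [('Owns', 'Nono', 'M1'), ('Missile', 'M1'),
               ('American', 'West'), ('Enemy', 'Nono', 'America')]
print(fol_fc_ask(crime_rules, crime_facts, ('Criminal', 'West')) is not None)  # True
```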

Proof of Forward Chaining
(Three slides trace the proof tree for the crime example: iteration 1 adds Sells(West, M1, Nono), Weapon(M1), and Hostile(Nono); iteration 2 adds Criminal(West).)

Properties of forward chaining
Sound and complete for first-order definite clauses.
Datalog = first-order definite clauses with no function symbols; FC terminates for Datalog in a finite number of iterations.
FC may not terminate in general if α is not entailed.
This is unavoidable: entailment with definite clauses is semidecidable.

Efficiency of forward chaining
Incremental forward chaining: there is no need to match a rule on iteration k unless one of its premises unifies with a fact added on iteration k-1, so we match only rules whose premise contains a newly added positive literal.
Matching itself can be expensive: database indexing allows O(1) retrieval of known facts, e.g. the query Missile(x) retrieves Missile(M1).
Forward chaining is widely used in deductive databases.
Two major efficiency concerns, covered on the next slides: pattern matching and incremental forward chaining.

Pattern matching
Finding all possible unifiers can consume a large amount of time. For example:
Missile(x) ∧ Owns(Nono, x) ⇒ Sells(West, x, Nono)
Assume that Nono owns 1000 objects but only 3 of them are missiles. If we find substitutions for x by scanning the objects owned by Nono, it takes a long time to find a missile; if we instead start from the x that are missiles, we check only 3 candidates, which saves a great amount of computation.
This is the "conjunct ordering" problem: find a good order in which to solve the conjuncts of the rule premise. It is NP-hard, but good heuristics are available (see the sketch below).
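
One cheap heuristic, most-constrained-conjunct-first, can be sketched by sorting conjuncts by how few stored facts share their predicate symbol. The numbers below mirror the 1000-objects/3-missiles example; everything here is illustrative.

```python
def order_conjuncts(premises, kb_index):
    """Sort premise conjuncts so the one with the fewest candidate facts comes first."""
    return sorted(premises, key=lambda p: len(kb_index.get(p[0], [])))

premises = [('Owns', 'Nono', 'x'), ('Missile', 'x')]
index = {'Owns': ['...'] * 1000, 'Missile': ['...'] * 3}  # fake fact buckets
print(order_conjuncts(premises, index))
# -> [('Missile', 'x'), ('Owns', 'Nono', 'x')]: check the 3 missiles first
```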

Example: map-coloring problem
Given a map, find a way to paint each region (node) so that adjacent regions do not share the same color. Colorable() means the constraints are satisfiable.
Diff(wa, nt) ∧ Diff(wa, sa) ∧ Diff(nt, q) ∧ Diff(nt, sa) ∧ Diff(q, nsw) ∧ Diff(q, sa) ∧ Diff(nsw, v) ∧ Diff(nsw, sa) ∧ Diff(v, sa) ⇒ Colorable()
Diff(Red, Blue), Diff(Red, Green), Diff(Green, Red), Diff(Green, Blue), Diff(Blue, Red), Diff(Blue, Green)

Incremental forward chaining
On each iteration, forward chaining matches every rule in the knowledge base (this was not shown in the earlier slides).
E.g. Missile(x) ⇒ Weapon(x) is matched against Missile(M1) on every iteration, re-deriving Weapon(M1), which is already in the KB.
Observation: every new fact inferred on iteration t must use at least one fact derived on iteration t-1; otherwise it could have been inferred sooner.
Solution: at iteration t, consider a rule only if its premise contains a conjunct that unifies with a fact newly derived at iteration t-1. Note that the premise only needs to contain such a conjunct; not all of its conjuncts must be new.

Example of incremental forward chaining
Suppose we derived the fact P(John) at the 5th iteration, and the KB contains the following sentences:
(1) P(x) ∧ Q(Jane) ⇒ AA
(2) K(x) ∧ M(x, y) ∧ N(x) ∧ O(x, z) ⇒ BB
(3) P(Jack) ∧ N(Jake) ⇒ CC
At the 6th iteration there is no point in checking sentences (2) and (3): (2) does not contain P, and (3) cannot unify with P(John) since its argument is already the constant Jack.
At the 7th iteration we no longer match P(John) against (1), because we would just keep re-deriving AA.

9.4 Backward Chaining
Backward chaining is a "generator": it generates all possible substitutions that satisfy the goal.
Backward chaining is depth-first.
Idea:
Start from the goal.
Match each literal of a rule's premise, left to right, against facts in the knowledge base.
Create a new recursive call for each subgoal; if the call succeeds, the proof of that branch is complete.
If the current goal is completed, continue with the next sentence that can lead to the goal.
Keep in mind that the goal may be reachable from more than one rule. (A sketch follows.)
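
A minimal backward-chaining generator in this spirit, reusing subst and unify plus the crime-example rules and facts from the forward-chaining sketch. Note the standardizing-apart step, without which variables from different rule uses collide; no loop checking or caching is included.

```python
import itertools

_fresh = itertools.count()

def standardize(premises, conclusion):
    """Rename a rule's variables to fresh ones (standardizing apart)."""
    n, renaming = next(_fresh), {}
    def rename(t):
        if is_variable(t):
            return renaming.setdefault(t, '%s_%d' % (t, n))
        return tuple(rename(a) for a in t) if isinstance(t, tuple) else t
    return [rename(p) for p in premises], rename(conclusion)

def fol_bc_ask(rules, facts, goals, theta):
    """Depth-first proof search backward from the goals; yields one theta per proof."""
    if not goals:
        yield theta
        return
    first, rest = subst(theta, goals[0]), goals[1:]
    for fact in facts:                        # Case 1: the goal is a known fact
        theta2 = unify(first, fact, theta)
        if theta2 is not None:
            yield from fol_bc_ask(rules, facts, rest, theta2)
    for rule in rules:                        # Case 2: the goal concludes a rule
        premises, conclusion = standardize(*rule)
        theta2 = unify(first, conclusion, theta)
        if theta2 is not None:
            yield from fol_bc_ask(rules, facts, premises + rest, theta2)

# Reusing crime_rules and crime_facts from the forward-chaining sketch:
print(next(fol_bc_ask(crime_rules, crime_facts, [('Criminal', 'West')], {}), None)
      is not None)                            # True: a proof of Criminal(West) exists
```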

Backward chaining
(Seven slides step through the backward-chaining proof tree for the crime example, expanding Criminal(West) into the subgoals American(West), Weapon(y), Sells(West, M1, z), and Hostile(Nono).)

Backward Chaining
Properties of backward chaining:
Depth-first recursive proof search.
Incomplete due to infinite loops; fixed by checking the current goal against every goal on the stack.
Inefficient due to repeated subgoals; fixed by caching previous results.
Nevertheless, backward chaining is widely used even without these improvements.

Backward Chaining
Logic programming:
Algorithm = Logic + Control (Robert Kowalski's equation).
Declare the problem in logical form, then query the desired statement; the program finds the solution to the query based on the knowledge base.
Prolog:
A logic programming language.
Uses backward chaining as its inference algorithm.
Makes the closed-world assumption.

Prolog: redundant inference and infinite loops
Prolog is trapped in infinite loops easily because of its depth-first search.
Let path(X, Y) mean "a path exists between points X and Y" and link(X, Y) mean "X is directly connected to Y" (adjacent nodes).
To see whether a path between two distant nodes exists, we query path(n1, n2). In Prolog syntax, the recursive rules are:
path(X, Y) :- link(X, Y).
path(X, Y) :- path(X, Z), link(Z, Y).

Prolog: redundant inference and infinite loops
path(X, Y) :- link(X, Y).
Case 1: the two nodes are directly connected.
path(X, Y) :- path(X, Z), link(Z, Y).
Case 2: the two nodes are connected by some path from X to Z, where Z is adjacent to Y.
In first-order logic this reads:
[link(X, Y)] ∨ [path(X, Z) ∧ link(Z, Y)] ⇒ path(X, Y)

Prolog
If the clauses are written or ordered badly, Prolog keeps descending forever while searching for a path, as the figure on the slide illustrates. (A Python rendering of the same trap follows.)
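
The same trap can be shown in Python with made-up graph data: a depth-first path search that mirrors the Prolog clauses loops forever on cyclic graphs unless it remembers where it has been.

```python
links = {('A', 'B'), ('B', 'C'), ('C', 'A')}   # a cyclic graph

def path(x, y, visited=frozenset()):
    if (x, y) in links:                        # base clause: a direct link
        return True
    if x in visited:                           # without this check the recursive
        return False                           # clause loops forever on the cycle,
    return any(path(z, y, visited | {x})       # just like Prolog's depth-first search
               for w, z in links if w == x)

print(path('A', 'C'))   # True
print(path('C', 'B'))   # True (around the cycle)
print(path('A', 'D'))   # False; remove the visited check and this never returns
```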

Constraint logic programming (CLP)
Normally a logic program (e.g. Prolog) can return only a finite number of answers at a time. What if there are infinitely many answers?
Example: x >= 0 ∧ y >= 0 ∧ z >= 0 ∧ x+y >= z ∧ y+z >= x ∧ x+z >= y ⇒ Triangle(x, y, z)
For Prolog, Triangle(3, 4, 5) is true, but Triangle(3, 4, z) causes an error because Prolog cannot handle the unbound constraint z >= 0.
CLP can handle this: Triangle(3, 4, z) yields the answer 1 <= z <= 7.
CLP was invented for constraint problem solving; an answer is correct as long as it meets the constraints.

9.5 Resolution
Conjunctive normal form for first-order logic:
As in propositional logic, the resolution algorithm works on sentences in conjunctive normal form (CNF).
Every first-order logic sentence can be converted into an inferentially equivalent CNF sentence.

Resolution: conversion to CNF
Everyone who loves all animals is loved by someone:
∀x [∀y Animal(y) ⇒ Loves(x, y)] ⇒ [∃y Loves(y, x)]
1. Eliminate implications and biconditionals (if any):
∀x [¬∀y (¬Animal(y) ∨ Loves(x, y))] ∨ [∃y Loves(y, x)]
2. Move ¬ inwards (¬∀x p ≡ ∃x ¬p, ¬∃x p ≡ ∀x ¬p):
∀x [∃y ¬(¬Animal(y) ∨ Loves(x, y))] ∨ [∃y Loves(y, x)]
∀x [∃y (¬¬Animal(y) ∧ ¬Loves(x, y))] ∨ [∃y Loves(y, x)]
∀x [∃y (Animal(y) ∧ ¬Loves(x, y))] ∨ [∃y Loves(y, x)]
3. Standardize variables:
Notice that there are two y's in the sentence, referring to different objects: the first is an animal, the second is the person who loves x. To avoid confusion, we rename (standardize) the second y:
∀x [∃y (Animal(y) ∧ ¬Loves(x, y))] ∨ [∃z Loves(z, x)]

Resolution
4. Skolemization
Skolemization drops the existential quantifiers, but we cannot simply use Skolem constants as before:
∀x [Animal(A) ∧ ¬Loves(x, A)] ∨ [Loves(B, x)]
would say that everyone either fails to love one particular animal A or is loved by one particular person B, which changes the meaning of the sentence.
Instead we must use a Skolem function whose value depends on x:
∀x [Animal(F(x)) ∧ ¬Loves(x, F(x))] ∨ [Loves(G(x), x)]
The arguments of a Skolem function are all the universally quantified variables in whose scope the existential quantifier appears.

Resolution
5. Drop universal quantifiers
Since every remaining variable is universally quantified, dropping the quantifiers is legal:
[Animal(F(x)) ∧ ¬Loves(x, F(x))] ∨ [Loves(G(x), x)]
6. Rewrite into conjunctive form (distribute ∨ over ∧):
(L1 ∨ … ∨ Ln) ∧ (M1 ∨ … ∨ Mm) ∧ … ∧ (Z1 ∨ … ∨ Zz)
[Animal(F(x)) ∨ Loves(G(x), x)] ∧ [¬Loves(x, F(x)) ∨ Loves(G(x), x)]

Resolution
Properties of resolution:
Refutation-complete for FOL: if a set of sentences is unsatisfiable, resolution will always be able to derive a contradiction (the empty clause).
However, resolution can produce nonconstructive proofs: proofs that show an object exists without providing a specific example or a way to produce one.
To avoid this, an answer literal is added to the negated goal: CNF(KB ∧ ¬(α ∧ answerLiteral)).

Resolution strategies
Unit preference:
Makes use of unit clauses: clauses with exactly one literal.
Idea: resolving with a unit clause always produces a shorter resolvent. Since we want to reach the empty clause as soon as possible, a heuristic that shortens sentences is worth trying.

Resolution strategies
Set of support (SoS):
A designated set of clauses such that every resolution step uses at least one member of the set; each resolvent is added back into the set.
Idea: we are given a problem with a set of conditions. To solve the problem we expect to use the conditions stated in the problem; if we could "solve" it without using them, something must be off. That set of conditions (typically the negated query) is the analogue of the set of support.
Drawback: a badly chosen SoS makes the strategy incomplete. (A toy sketch follows.)
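
To make the strategy concrete, here is a toy propositional resolution loop with a set-of-support restriction; the first-order version adds unification on top. The clause representation and names are illustrative.

```python
# Clauses are frozensets of string literals; '~' marks negation.

def negate(lit):
    return lit[1:] if lit.startswith('~') else '~' + lit

def resolve(c1, c2):
    """Yield every resolvent of two clauses."""
    for lit in c1:
        if negate(lit) in c2:
            yield frozenset((c1 - {lit}) | (c2 - {negate(lit)}))

def resolution_sos(kb, sos):
    """Refutation search in which every step uses a clause from the set of support."""
    clauses = set(kb) | set(sos)
    sos = list(sos)
    while sos:
        c1 = sos.pop(0)
        for c2 in list(clauses):
            for resolvent in resolve(c1, c2):
                if not resolvent:
                    return True               # empty clause: contradiction found
                if resolvent not in clauses:
                    clauses.add(resolvent)
                    sos.append(resolvent)     # resolvents join the set of support
    return False

# KB: P, P => Q.  Query: Q, so the SoS is the negated goal {~Q}.
kb = [frozenset({'P'}), frozenset({'~P', 'Q'})]
print(resolution_sos(kb, [frozenset({'~Q'})]))  # True: Q is entailed
```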

Summary
A first approach uses inference rules for instantiating quantifiers in order to propositionalize the inference problem; however, it is very slow.
A lifted version of Modus Ponens, generalized Modus Ponens, can be applied directly to FOL; forward chaining (FC) and backward chaining (BC) both use this rule.
FC is a bottom-up process: starting from scattered facts, it assembles them until the desired conclusion is reached.
BC is a top-down process: starting from the desired goal, it tries to find facts that prove the goal true.
Resolution also has a lifted version, which works in the same manner as the propositional version: CNF plus proof by contradiction.