A logical formulation of learning
What are the Goal and Hypotheses
Goal predicate Q - WillWait
Learning is to find an equivalent logical
expression with which we can classify examples
Each hypothesis proposes such an
expression - a candidate definition of Q
∀r WillWait(r) ⇔ Pat(r,Some)
∨ (Pat(r,Full) ∧ Hungry(r) ∧ Type(r,French))
∨ …
Hypothesis space is the set of all hypotheses
the learning algorithm is designed to
entertain.
One of the hypotheses is correct:
H1 ∨ H2 ∨ … ∨ Hn
Each Hi predicts a certain set of examples -
the extension of the goal predicate.
Two hypotheses with different extensions
are logically inconsistent with each other;
otherwise, they are logically equivalent.
What are Examples
An example is an object of some logical
description to which the goal concept may
or may not apply.
Alt(X1) ∧ ¬Bar(X1) ∧ ¬Fri/Sat(X1) ∧ …
Ideally, we want to find a hypothesis that
agrees with all the examples.
The relations between f and h are: ++, --, +-
(false negative), -+ (false positive). If either of
the last two occurs, the example and h are
logically inconsistent.
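A minimal sketch of the four cases in Python (the attribute names and the candidate hypothesis are illustrative, not from the text):

    # Compare the true goal predicate f with a hypothesis h on one example.
    def agreement(f_label, h_label):
        if f_label and h_label:
            return "++"              # both classify the example as positive
        if not f_label and not h_label:
            return "--"              # both classify it as negative
        if f_label and not h_label:
            return "+-"              # false negative: h rejects a + example
        return "-+"                  # false positive: h accepts a - example

    # h is logically inconsistent with any example on which they disagree.
    h = lambda e: e["Alt"] and not e["Bar"]     # a candidate definition of Q
    example = {"Alt": True, "Bar": False}       # description of one example
    print(agreement(True, h(example)))          # -> "++"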
Current-best hypothesis search
Maintain a single hypothesis
Adjust it as new examples arrive to
maintain consistency (Fig 19.1)
Generalization for positive examples
Specialization for negative examples
Algorithm (Fig 19.2, page 681)
Need to check for consistency with all
existing examples each time a new example
is taken
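A minimal Python sketch of this loop, assuming purely conjunctive hypotheses over discrete attributes; only minimal generalization is shown, and specialization/backtracking are omitted, so this illustrates the bookkeeping rather than the full Fig 19.2 algorithm:

    # A sketch of current-best-hypothesis search (cf. Fig 19.2) with
    # hypotheses as conjunctions of attribute=value conditions.
    def matches(h, example):
        return all(example.get(attr) == val for attr, val in h.items())

    def consistent(h, seen):
        return all(matches(h, ex) == label for ex, label in seen)

    def current_best_learning(stream):
        seen, h = [], None
        for example, label in stream:
            if h is None and label:
                h = dict(example)                 # start most specific
            elif h is not None and label and not matches(h, example):
                # false negative -> generalize minimally: drop conditions
                h = {a: v for a, v in h.items() if example.get(a) == v}
            seen.append((example, label))
            if h is not None and not consistent(h, seen):
                return None                       # would have to backtrack here
        return h

    examples = [({"Pat": "Some", "Hungry": True}, True),
                ({"Pat": "Some", "Hungry": False}, True),
                ({"Pat": "None", "Hungry": False}, False)]
    print(current_best_learning(examples))        # -> {'Pat': 'Some'}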
Example of WillWait
Fig 18.3 for Current-Best-Learning
Problems: nondeterministic; no guarantee of finding
the simplest or a correct h; may need to backtrack
Least-commitment search
Keeping only one h as the best guess is the
problem -> can we keep as many hypotheses as possible?
Version space (candidate elimination) Algorithm
incremental
least-commitment
From intervals to boundary sets
G-set and S-set
S0 – the most specific boundary, covering nothing: <0,0,…,0>
G0 – the most general boundary, covering everything: <?,?,…,?>
Everything between them is guaranteed to be consistent
with the examples.
VS tries to generalize S0 and specialize G0
incrementally
Version space
Generalization and specialization (Fig 19.4):
keep boundary sets such that every hypothesis between
them labels all + examples true and all - examples false;
Sj can only be generalized and Gj can only be specialized
False positive for Si, too general, discard it
False negative for Si, too specific, generalize it minimally
False positive for Gi, too general, specialize it minimally
False negative for Gi, too specific, discard it
When to stop
One concept left (Si = Gi)
The version space collapses (S or G becomes empty, or G becomes more specific than S)
Run out of examples
An example with 4 instances from Tom Mitchell’s
book
One major problem: can’t handle noise
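A compact illustrative sketch of candidate elimination in the style of Mitchell's example ('?' = any value, '0' = no value); function names are mine, and there is no noise handling, as noted above:

    def covers(h, x):
        return all(hv in ('?', xv) for hv, xv in zip(h, x))

    def min_generalize(s, x):
        # minimally generalize s so that it covers the positive example x
        return tuple(xv if sv == '0' else (sv if sv == xv else '?')
                     for sv, xv in zip(s, x))

    def min_specializations(g, x, domains):
        # minimal specializations of g that exclude the negative example x
        return [g[:i] + (v,) + g[i+1:]
                for i, gv in enumerate(g) if gv == '?'
                for v in domains[i] if v != x[i]]

    def candidate_elimination(examples, domains):
        S = ('0',) * len(domains)            # most specific boundary
        G = {('?',) * len(domains)}          # most general boundary
        for x, label in examples:
            if label:                        # + example: generalize S, prune G
                G = {g for g in G if covers(g, x)}
                S = min_generalize(S, x)
            else:                            # - example: specialize G, prune S
                if covers(S, x):
                    return None              # version space collapses
                G = {g2 for g in G
                     for g2 in ([g] if not covers(g, x)
                                else min_specializations(g, x, domains))
                     if covers(g2, S)}       # simplified generality test;
                                             # assumes the first example is +
        return S, G

    domains = [('Some', 'Full', 'None'), (True, False)]   # Pat, Hungry
    data = [(('Some', True), True), (('Full', True), False)]
    print(candidate_elimination(data, domains))
    # -> (('Some', True), {('Some', '?')})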
Using prior knowledge
For DT and logical description
learning, we assume no prior
knowledge
We do have some prior knowledge,
so how can we use it?
We need a logical formulation as
opposed to function learning.
Inductive learning in the
logical setting
The objective is to find a hypothesis that
explains the classifications of the
examples, given their descriptions.
Hypothesis ∧ Descriptions |= Classifications
Hypothesis - unknown; it explains the observations
Descriptions - the conjunction of all the example
descriptions
Classifications - the conjunction of all the example
classifications
Knowledge-free learning, e.g., decision trees:
the hypothesis relates Descriptions directly to
Classifications, with no Background knowledge
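For instance (an illustrative instantiation in the WillWait domain, not from the text):

    \underbrace{\forall r\; WillWait(r) \Leftrightarrow Pat(r,Some)}_{\text{Hypothesis}}
    \;\land\; \underbrace{Pat(X_1,Some)}_{\text{Descriptions}}
    \;\models\; \underbrace{WillWait(X_1)}_{\text{Classifications}}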
A cumulative learning
process
Observations, knowledge-based learning,
hypotheses, and prior knowledge, as in Fig
19.6 (p 687)
The new approach is to design agents that
already know something and are trying to
learn some more.
Intuitively, this should be faster and better
than without using knowledge, assuming
what’s known is always correct.
How to implement this cumulative learning
with increasing knowledge?
Some examples of using
knowledge
One can leap to general conclusions after
only one observation.
Have you had such an experience?
Traveling to Brazil: Language and name
?
A pharmacologically ignorant but
diagnostically sophisticated medical
student …
?
Some general schemes
Explanation-based learning (EBL)
Hypothesis ∧ Descriptions |= Classifications
Background |= Hypothesis
does not learn anything factually new from the instance
Relevance-based learning (RBL)
Hypothesis ∧ Descriptions |= Classifications
Background ∧ Descriptions ∧ Classifications |= Hypothesis
deductive in nature
Knowledge-based inductive learning (KBIL)
Background ∧ Hypothesis ∧ Descriptions |= Classifications
Inductive logic programming (ILP)
ILP can formulate hypotheses in general
first-order logic
Other methods, like decision trees, use more restricted languages
Prior knowledge is used to reduce the
complexity of learning:
prior knowledge further reduces the H space
prior knowledge helps find a shorter H
Again, assuming prior knowledge is correct
Explanation-based learning
A method to extract general rules from
individual observations
The goal is to solve a similar problem faster
next time.
Memoization - speed up by saving results
and avoiding solving a problem from
scratch
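Memoization is one line with the Python standard library; a minimal sketch:

    # Memoization: cache results so repeated subproblems are not
    # solved from scratch.
    from functools import lru_cache

    @lru_cache(maxsize=None)
    def fib(n):
        return n if n < 2 else fib(n - 1) + fib(n - 2)

    print(fib(100))    # fast: each subproblem is computed only once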
EBL goes one step further - from
observations to rules
Why EBL?
Explaining why something is a good idea is
much easier than coming up with the idea.
Once something is understood, it can be
generalized and reused in other circumstances.
Extracting general rules from examples
EBL constructs two proof trees simultaneously
by variabilizing the constants in the first
tree
An example (Fig 19.7)
Basic EBL
Given an example, construct a proof tree
using the background knowledge
In parallel, construct a generalized proof
tree for the variabilized goal
Construct a new rule (leaves => the root)
Drop any conditions that are true
regardless of the variables in the goal
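A toy Python sketch of the rule-construction step only: the finished proof tree and the constant-to-variable mapping are assumed given (real EBL builds the generalized proof in parallel via unification), and the condition-dropping step is omitted:

    # Extract a rule "leaves => root" from a proof tree and variabilize it.
    def leaves(tree):
        goal, subproofs = tree
        return [goal] if not subproofs else [l for s in subproofs for l in leaves(s)]

    def variabilize(atom, mapping):
        pred, *args = atom
        return (pred, *[mapping.get(a, a) for a in args])

    # Proof that X1 will wait, as (goal, list-of-subproofs):
    proof = (("WillWait", "X1"),
             [(("Pat", "X1", "Some"), []),
              (("Hungry", "X1"), [])])

    mapping = {"X1": "r"}                     # constants -> variables
    body = [variabilize(l, mapping) for l in leaves(proof)]
    head = variabilize(proof[0], mapping)
    print(body, "=>", head)
    # [('Pat', 'r', 'Some'), ('Hungry', 'r')] => ('WillWait', 'r')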
Efficiency of EBL
Choosing a general rule
too many rules -> slow inference
aim for gain - significant increase in speed
as general as possible
Operationality - a subgoal is operational if
it is easy to solve
Trade-off between Operationality and
Generality
Empirical analysis of efficiency in EBL study
Learning using relevant
information
Prior knowledge: People in a country
usually speak the same language
Nat(x,n) ∧ Nat(y,n) ∧ Lang(x,l) => Lang(y,l)
Observation: Given nationality, language is
fully determined
Given Fernando is Brazilian & speaks
Portuguese
Nat(Fernando,B) ∧ Lang(Fernando,P)
We can logically conclude
Nat(y,B) => Lang(y,P)
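The derivation is a one-step sketch of RBL's deduction: instantiate the prior-knowledge rule with x = Fernando, n = B, l = P, then discharge the two given facts:

    % Instantiate Nat(x,n) ∧ Nat(y,n) ∧ Lang(x,l) => Lang(y,l):
    Nat(Fernando,B) \land Nat(y,B) \land Lang(Fernando,P) \Rightarrow Lang(y,P)
    % Nat(Fernando,B) and Lang(Fernando,P) are given as the description, so:
    Nat(y,B) \Rightarrow Lang(y,P)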
Functional dependencies
We have seen a form of relevance:
determination - language (Portuguese) is a
function of nationality (Brazil)
Determination is really a relationship
between the predicates
The corresponding generalization follows
logically from the determinations and
descriptions.
We can generalize from Fernando to all Brazilians,
but not to all nations. So, determinations can limit
the H space to be considered.
Determinations specify a sufficient basis
vocabulary from which to construct hypotheses
concerning the target predicate.
A reduction in the H space size should make it
easier to learn the target predicate
For n Boolean features, if the determination contains d
features, what is the saving for the required number of
examples according to PAC?
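A sketch of the answer, using the standard PAC bound in which the required number of examples grows with ln|H| (this assumes hypotheses are arbitrary Boolean functions of the available features):

    % With all n features:
    |H_n| = 2^{2^n} \;\Rightarrow\; O(\ln|H_n|) = O(2^n)\ \text{examples}
    % With a determination restricting attention to d features:
    |H_d| = 2^{2^d} \;\Rightarrow\; O(\ln|H_d|) = O(2^d)\ \text{examples}
    % i.e., a reduction by a factor of O(2^{n-d})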
Learning using relevant
information
A determination P ≻ Q (P determines Q) says that if
any examples match on P, they must also match on Q
Find the simplest determination consistent with
the observations
Search through the space of determinations:
one predicate, then two predicates, …
Algorithm - Fig 19.8 (page 696)
Time complexity is O(n choose p)
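A minimal Python sketch of this search (function names and the demo attributes are illustrative):

    # Find the smallest consistent determination (cf. Fig 19.8): try
    # feature subsets of increasing size and return the first P such that
    # examples that match on P always get the same classification.
    from itertools import combinations

    def consistent_det(P, examples):
        seen = {}                          # projection on P -> classification
        for x, label in examples:
            key = tuple(x[f] for f in P)
            if seen.setdefault(key, label) != label:
                return False               # same P-values, different class
        return True

    def minimal_consistent_det(examples, features):
        for p in range(len(features) + 1):
            for P in combinations(features, p):
                if consistent_det(P, examples):
                    return P
        return None

    data = [({"Nat": "B", "Age": 20}, "P"),
            ({"Nat": "B", "Age": 47}, "P"),
            ({"Nat": "F", "Age": 20}, "Fr")]
    print(minimal_consistent_det(data, ["Nat", "Age"]))   # -> ('Nat',)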
Feature selection is about finding determinations
Feature selection is an active research area for
machine learning, pattern recognition, statistics
Combining relevance-based learning with
decision tree learning -> RBDTL
Reduce the required training data
Reduce the hypothesis space
Its learning performance improves (Fig 19.9).
Performance in terms of training set size
Gains: time saving, less chance to overfit
Other issues about relevance-based learning
noise handling
using other prior knowledge
Semi-supervised learning
Expert knowledge as constraints
from attribute-based to FOL
Inductive logic programming
It combines inductive methods with FOL.
ILP represents theories as logic programs.
ILP offers complete algorithms for inducing
general, first-order theories from examples.
It can learn successfully in domains where
attribute-based algorithms fail completely.
An example - a typical family tree (Fig 19.11)
Inverse resolution
If Classifications follow from Background ∧
Hypothesis ∧ Descriptions, then we can prove
this by resolution with refutation (completeness).
Normal resolution:
C1 and C2 -> C (the resolvent)
If we run the proof backwards, we can find
an H such that the proof goes through.
C -> C1 and C2
C and C2 -> C1
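A tiny propositional illustration (not the book's family-tree example):

    % Forward: resolve C1 and C2 on Q to obtain the resolvent C
    C_1 = P \lor Q, \qquad C_2 = \lnot Q \lor R
    \quad\Longrightarrow\quad C = P \lor R
    % Backward: given C = P \lor R and C_1 = P \lor Q, one consistent
    % reconstruction is C_2 = \lnot Q \lor R (others exist as well,
    % which is why each inverse step is nondeterministic)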
Generating inverse proofs
A family tree example (Fig 19.13)
Inverse resolution involves search
Each inverse resolution step is nondeterministic
For any C and C1, there can be many C2
Discovering new knowledge with inverse resolution
It’s not easy - a monkey and a typewriter
Discovering new predicates with inverse resolution
Fig 19.14
The ability to use background knowledge
provides significant advantages
Top-down learning (FOIL)
A generalization of DT induction to the
first-order case, by the author of C4.5 (Quinlan)
Starting with a general rule and specialize it to fit data
Now we use first-order literals instead of attributes,
and H is a set of clauses instead of a decision tree.
Example: => Grandfather(x,y) (page 701)
positive and negative examples
adding literals one at a time to the left-hand side
e.g., Father (x,y) => Grandfather(x,y)
How to choose a literal? (Algorithm on page 702)
the rule should agree with some + examples and
none of the - examples
FOIL removes the covered + examples, repeats
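A propositionalized Python sketch of FOIL's covering loop: real FOIL adds first-order literals chosen by an information-gain measure, while here clauses are attribute=value conjunctions and the greedy score is positives kept minus negatives kept (function names are illustrative; assumes the classes are separable by the given literals):

    def covers(clause, x):
        return all(x.get(a) == v for a, v in clause)

    def learn_clause(pos, neg, literals):
        clause = set()
        while any(covers(clause, x) for x in neg):
            # greedily pick the literal that keeps the most positives
            # while excluding the most negatives
            best = max(literals - clause,
                       key=lambda l: (sum(covers(clause | {l}, x) for x in pos)
                                      - sum(covers(clause | {l}, x) for x in neg)))
            clause.add(best)
        return clause

    def foil(pos, neg, literals):
        clauses = []
        while pos:                          # until all + examples are covered
            c = learn_clause(pos, neg, literals)
            clauses.append(c)
            pos = [x for x in pos if not covers(c, x)]   # remove covered +
        return clauses

    pos = [{"Pat": "Some"}, {"Pat": "Full", "Hungry": True}]
    neg = [{"Pat": "None"}, {"Pat": "Full", "Hungry": False}]
    lits = {("Pat", "Some"), ("Pat", "Full"), ("Hungry", True)}
    print(foil(pos, neg, lits))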
Summary
Using prior knowledge in cumulative
learning
Prior knowledge allows for shorter H’s.
Prior knowledge plays different logical
roles, as in the entailment constraints of
EBL, RBL, KBIL
ILP generates new predicates so that
concise new theories can be expressed.