Bottom up parser

EsmeraldaAkshu1 16,162 views 62 slides Jun 04, 2016
Slide 1
Slide 1 of 62
Slide 1
1
Slide 2
2
Slide 3
3
Slide 4
4
Slide 5
5
Slide 6
6
Slide 7
7
Slide 8
8
Slide 9
9
Slide 10
10
Slide 11
11
Slide 12
12
Slide 13
13
Slide 14
14
Slide 15
15
Slide 16
16
Slide 17
17
Slide 18
18
Slide 19
19
Slide 20
20
Slide 21
21
Slide 22
22
Slide 23
23
Slide 24
24
Slide 25
25
Slide 26
26
Slide 27
27
Slide 28
28
Slide 29
29
Slide 30
30
Slide 31
31
Slide 32
32
Slide 33
33
Slide 34
34
Slide 35
35
Slide 36
36
Slide 37
37
Slide 38
38
Slide 39
39
Slide 40
40
Slide 41
41
Slide 42
42
Slide 43
43
Slide 44
44
Slide 45
45
Slide 46
46
Slide 47
47
Slide 48
48
Slide 49
49
Slide 50
50
Slide 51
51
Slide 52
52
Slide 53
53
Slide 54
54
Slide 55
55
Slide 56
56
Slide 57
57
Slide 58
58
Slide 59
59
Slide 60
60
Slide 61
61
Slide 62
62

About This Presentation

Bottom up parser


Slide Content

Bottom up parsing

Bottom-Up Parsing
Bottom-up parsing is more general than top-down.
A bottom-up parser builds a derivation by working
from the input sentence back toward the start symbol S
Preferred method in practice
Also called LR parsing
L means that tokens are read left to right
R means that it constructs a rightmost derivation

Bottom-up Parsing
•Two types of bottom-up parsing:
•Operator-Precedence parsing
•LR parsing
•covers wide range of grammars.
•SLR –simple LR parser
•LR –most general LR parser
•LALR –intermediate LR parser (Lookahead LR
parser)

Bottom-Up Parsing
LR parsing reduces a string to the start
symbol by inverting productions:
Str input string of terminals
repeat
Identify β in str such that A →β is a
production
Replace β by A in str
until str = S (the start symbol) OR all
possibilities are exhausted

Bottom-Up Parsing
int + (int) + (int)
E + (int) + (int)
E + (E) + (int)
E + (int)
E + (E)
E
E → E + ( E ) | int
A rightmost derivation
in reverse

Reductions
Bottom-up parsing as the process of
"reducing" a string w to the start symbol
of the grammar.
At each reduction step, a substring that
matches the body of a production is
replaced by the non-terminal at the head
of that production.

Handle
Handle of a string: Substring that matches the
RHS of some production AND whose reduction
to the non-terminal on the LHS is a step along
the reverse of some rightmost derivation.

Handles
Formally:
Handle of a right sentential form is <A ,
location of in >
i.e. A is a handle of at the location
immediately after the end of , if:S => A=> 
A certain sentential form may have many different
handles.
Right sentential forms of a non-ambiguous grammar
have one unique handle

Example
S aABeaAde aAbcde abbcde
S aABe
A Abc| b
B d
S aABe is a handle of aABein location 1.
B d is a handle of aAde in location 3.
A Abc is a handle of aAbcde in location 2.
A b is a handle of abbcde in location 2.

Handle Pruning
A rightmost derivation in reverse can be obtained by “handle-pruning.”
abbcde
Find the handle = b at loc. 2
aAbcde
b at loc. 3 is not a handle:
aAAcde
... blocked.

Handle-pruning
The process of discovering a handle & reducing it to the appropriate left-hand side is
called handle pruning.
Handle pruning forms the basis for a bottom-up parsing method.
To construct a rightmost derivation
S=
0

1

2
… 
n-1

n= w
Apply the following simple algorithm
for i n to 1 by -1
Find the handleA
i 
i
in 
i
Replace
iwith A
ito generate 
i-1

1.S -> 0 S1|01 indicate the handle in each of the following right-sentential forms:
1.000111
2.00S11
2.For the grammar S -> S S + | S S * | a indicate the handle in each of the following
right-sentential forms:
1.SSS + a * +
2.SS + a * a+
3.aaa * a + +.
3.Give bottom-up parses for the following input strings
1.The input 000111 according to the grammar of
Exercise 1
2.The input aaa * a + + according to the grammar of 2

Shift Reduce Parsing with a
Stack
Two problems:
locate a handle
decide which production to use (if there are
more than two candidate productions).

Shift-reduce Parsing
A shift-reduce parser is a stack automaton with four actions
Shift —next word is shifted onto the stack
Reduce—right end of handle is at top of stack
Locate left end of handle within the stack
Pop handle off stack & push appropriate lhs
Accept —stop parsing & report success
Error—call an error reporting/recovery routine
Accept & Error are simple
Shift is just a push and a call to the scanner
Reducetakes |rhs| pops & 1 push

x-2*y
Stack Input Handle Action
$ id-num* idNone shift
$ id -num* id
1. Shift until the top of the stack is the right end of a handle
2. Find the left end of the handle and reduce
0Goal Expr
1Expr Expr+ Term
2 |Expr-Term
3 |Term
4Term Term * Factor
5 |Term/ Factor
6 |Factor
7Factornumber
8 |id
9 |(Expr)

Back to x-2*y
Stack Input Handle Action
$ id-num* id none shift
$ id -num* id 8,1 reduce 8
$ Factor -num* id 6,1 reduce 6
$ Term -num* id 3,1 reduce 4
$ Expr -num* id
1. Shift until the top of the stack is the right end of a handle
2. Find the left end of the handle and reduce
0Goal Expr
1Expr Expr+ Term
2 |Expr-Term
3 |Term
4Term Term * Factor
5 |Term/ Factor
6 |Factor
7Factornumber
8 |id
9 |(Expr)

Back to x-2*y
Stack Input Handle Action
$ id-num* id none shift
$ id -num* id 8,1 reduce 8
$ Factor -num* id 6,1 reduce 6
$ Term -num* id 3,1 reduce 4
$ Expr -num* id
1. Shift until the top of the stack is the right end of a handle
2. Find the left end of the handle and reduce
0Goal Expr
1Expr Expr+ Term
2 |Expr-Term
3 |Term
4Term Term * Factor
5 |Term/ Factor
6 |Factor
7Factornumber
8 |id
9 |(Expr)
Expr is not a handle at this point because it does not
occur at this point in the derivation.

Back to x-2*y
Stack Input Handle Action
$ id-num* id none shift
$ id -num* id 8,1 reduce 8
$ Factor -num* id 6,1 reduce 6
$ Term -num* id 3,1 reduce 3
$ Expr -num* id none shift
$ Expr- num* id none shift
$ Expr-num * id
1. Shift until the top of the stack is the right end of a handle
2. Find the left end of the handle and reduce
0Goal Expr
1Expr Expr+ Term
2 |Expr-Term
3 |Term
4Term Term * Factor
5 |Term/ Factor
6 |Factor
7Factornumber
8 |id
9 |(Expr)

Back to x-2*y
Stack Input Handle Action
$ id-num* id none shift
$ id -num* id 8,1 reduce 8
$ Factor -num* id 6,1 reduce 6
$ Term -num* id 3,1 reduce 3
$ Expr -num* id none shift
$ Expr- num* id none shift
$ Expr-num * id 7,3 reduce 7
$ Expr-Factor * id 6,3 reduce 6
$ Expr-Term * id
1. Shift until the top of the stack is the right end of a handle
2. Find the left end of the handle and reduce
0Goal Expr
1Expr Expr+ Term
2 |Expr-Term
3 |Term
4Term Term * Factor
5 |Term/ Factor
6 |Factor
7Factornumber
8 |id
9 |(Expr)

Back to x-2*y
Stack Input Handle Action
$ id-num* id none shift
$ id -num* id 8,1 reduce 8
$ Factor -num* id 6,1 reduce 6
$ Term -num* id 3,1 reduce 3
$ Expr -num* id none shift
$ Expr- num* id none shift
$ Expr-num * id 7,3 reduce 7
$ Expr-Factor * id 6,3 reduce 6
$ Expr-Term * id none shift
$ Expr-Term * id none shift
$ Expr-Term * id
1.Shift until the top of the stack is the right end of a handle
2.Find the left end of the handle and reduce
0Goal Expr
1Expr Expr+ Term
2 |Expr-Term
3 |Term
4Term Term * Factor
5 |Term/ Factor
6 |Factor
7Factornumber
8 |id
9 |(Expr)

Back to x-2*y
5 shifts +
9 reduces +
1 accept
Stack Input Handle Action
$ id-num* id none shift
$ id -num* id 8,1 reduce 8
$ Factor -num* id 6,1 reduce 6
$ Term -num* id 3,1 reduce 3
$ Expr -num* id none shift
$ Expr- num* id none shift
$ Expr-num * id 7,3 reduce 7
$ Expr-Factor * id 6,3 reduce 6
$ Expr-Term * id none shift
$ Expr-Term * id none shift
$ Expr-Term * id 8,5 reduce 8
$ Expr-Term * Factor 4,5 reduce 4
$ Expr-Term 2,3 reduce 2
$ Expr 0,1 reduce 0
$ Goal none accept
1.Shift until the top of the stack is the right end of a handle
2.Find the left end of the handle and reduce
0Goal Expr
1Expr Expr+ Term
2 |Expr-Term
3 |Term
4Term Term * Factor
5 |Term/ Factor
6 |Factor
7Factornumber
8 |id
9 |(Expr)

Goal
<id,x>
Term
Fact.
Expr –
Expr
<id,y>
<num,2>
Fact.
Fact.Term
Term
*
Stack Input Action
$ id-num* id shift
$ id -num* idreduce 8
$ Factor -num* idreduce 6
$ Term -num* idreduce 3
$ Expr -num* id shift
$ Expr- num* id shift
$ Expr-num * idreduce 7
$ Expr-Factor * idreduce 6
$ Expr-Term * id shift
$ Expr-Term * id shift
$ Expr-Term * id reduce 8
$ Expr-Term * Factor reduce 4
$ Expr-Term reduce 2
$ Expr reduce 0
$ Goal accept
Back to x-2*y
Corresponding Parse Tree

Conflicts During Shift-Reduce Parsing
Conflicts
“shift/reduce” or “reduce/reduce”
Example:
stmtif exprthen stmt
| if exprthen stmtelse stmt
| other (any other statement)
Stack Input
if … then stmt else …Shift/ Reduce Conflict
We can’t tell
whether it is a
handle

LR Parsing
Bottom-up parser based on a concept called
LR(k) parsing
"L" is for left-to-right scanning of the input.
"R" for constructing a rightmost derivation in
reverse,
“k” for the number of input symbols of
lookahead that are used in making parsing
decisions.

Why LR Parsers?
For a grammar to be LR it is sufficient that a
left-to-right shift-reduce parser be able to
recognize handles of right-sentential forms
when they appear on top of the stack.

Why LR Parsers?
LRparserscanbeconstructedtorecognizeall
programminglanguageconstructsforwhich
context-freegrammarscanbewritten.
TheLR-parsingmethodisthemostgeneralnon-
back-trackingshift-reduceparsingmethod.
AnLRparsercandetectasyntacticerrors.
TheclassofgrammarsthatcanbeparsedusingLR
methodsisapropersupersetoftheclassof
grammarsthatcanbeparsedwithpredictiveorLL
methods.

Components of LR Parser
X
Y
Z
$
A+B$
Parsing Program
Parse Table
ActionGoto
output
stack
Input buffer

Techniques for Creating Parse Table
SLR: Construct parsing table for small set of grammars called
SLR(1).
Easy to develop.
CLR(CANONICAL LR) : Most powerful.
Generates a large parse table.
More difficult develop.
Works for all types of CFG
May detect position of error.
LALR(LOOK AHEAD LR) :Widely used method.
Optimizes the size of parse table, by combining some states.
Information may get lost.

SLR Parsers
1.Formation of augmented grammar G’ for
the given grammar G
2.Construction of LR(0) collection of
items.
To find LR(0) collection of items Closure(I)
and Goto(I,X) have to be computed.
3.Finding first and follow of non-terminals
4.Construction of parse table.

Formation of Augmented
Grammar
The augmented grammar G’, is G with a new start
symbol S’ and an additional production S’ -> S
E->E+T|T
T->T*F|F
The augmented grammar G’ is given by
E’->E
E->E+T|T
T->T*F|F

Items and the LR(O)
Automaton
Howdoesashift-reduceparserknowwhen
toshiftandwhentoreduce?
Howdoestheparserknowthatsymbolonthe
topofthestackisnotahandle?
AnLRparsermakesshift-reducedecisions
bymaintainingstatestokeeptrackofallthe
operations.
Statesrepresentsetsof"items."

Items and the LR(O)
Automaton
An LR(O) item of a grammar G is a production of G, with a
dotat some position in the body.
Production A -> XYZ yields the four items
A -> ·XYZ
A -> X·YZ
A -> XY·Z
A -> XYZ·
A ->εgenerates only one item, A ->.
An item indicates how much of a production we have seen
at a given point in the parsing process.

Items and the LR(O)
Automaton
The item A -> ·XYZ indicates that we hope to see a string derivable from XYZ next
on the input.
Item A -> X·YZ indicates that we have just seen on the input a string derivable from
X and that we hope next to see a string derivable from YZ.
Item A -> XY·Z indicates that we have just seen on the input a string derivable from
XY and that we hope next to see a string derivable from Z.
Item A -> XYZ· indicates that we have seen the body XY Z and that it may be time
to reduce XYZ to A

Items and the LR(O)
Automaton
CollectionofsetsofLR(0)items,calledthe
canonicalLR(0)collection.
Providesthebasisforconstructingadeterministic
finiteautomatonthatisusedtomakeparsing
decisions.
AutomatoniscalledanLR(0)automaton.
EachstateoftheLR(0)automatonrepresentsa
setofitemsinthecanonicalLR(0)collection.

Construction of LR(0) Items
Items are viewed as states in NFA.
Grouped to form same states.
Process of grouping together is called subset
Construction Algorithm.
Closure and goto operations have to be
computed.

Closure of Item Sets
If I is a set of items for a grammar G, then CLOSURE(I)
is the set of items constructed from I by the two rules:
Initially, add every item in I to CLOSURE(I).
If A -> α·Bγis in CLOSURE(I) and B -> γis a
production, then add the item B -> γto CLOSURE(I),
if it is not already there. Apply this rule until no more
new items can be added to CLOSURE (I).

Computation of CLOSURE
E’ E
E E + T | T
T T * F | F
F ( E ) | id
I
0
E’ .E
E .E + T
E .T
T .T * F
T .F
F .( E )
F .id

Computation of CLOSURE
SetOfltems CLOSURE (I) {
J = I;
repeat
for ( each item A -> a·Bγin J )
for ( each production B ->γ of G )
if (B ->.γ is not in J )
add B ->.γ to J;
until no more items are added to J on one round;
return J;
}

Kernel items : The initial item, S' ->·S, and all items whose dots are not at
the left end.
Non-kernel items : All items with their dots at the left end, except for S' ->
·S.

The Function GOTO
GOTO (I, X) is defined to be the closure of the set of all items [A -> αX.β] such that
[A -> α.Xβ] is in I.
If I is the set of two items { [E' -> E·] , [E -> E· + T] } , then
GOTO(I, +) contains the items
E -> E + ·T
T -> ·T * F
T -> ·F
F -> · (E)
F -> ·id

The Function GOTO

Grammar
S E + S | E
E num
E num .
S’ S .$
num
ES’ .S $
S .E + S
S .E
E .num
1
S E . +S
S E .
2
S E + S .
5
S E + . S
S .E + S
S .E
E .num
3
S’ S $ .
7
4
S
S
$
+
E
num

num + $ E S
1s4 g2 g6
2 s3 SE
3s4 g2 g5
4 Enum Enum
5 SE+S
6 s7
7 accept

S'S
S L=R
S R
L *R
L id
R L
id
S'S
S L=R
S R
L *R
L id
R L
L id 
S L =R
R L 
S'S 
I
0
I
1
I
2
I
3
S R 
I
4
L *R
R L
L id
L * R
I
5
S
L
*
idR
S L= R
R L
L *R
L id
I
6
=
R
S L=R 
R L 
L
L
I
7
id
I
3
*
*
L *R 
R
I
8
I
9

state action goto
id = * $ SL R
0 s3 s5 1 2 4
1 accept
2 s6/r(RL)
3 r(Lid) r(Lid)
4 r(SR)
5 s3 s5 7 8
6 s3 s5 7 9
7 r(RL) r(RL)
8 r(L*R) r(L*R)
9 r(SL=R)

Structure of the LR Parsing
Table
1.The ACTION function takes as arguments a state i and a terminal
a (or $, the input endmarker). The value of ACTION[i, a] can have
one of four forms:
a)Shift j , where j is a state. The action taken by the parser shifts
input a to the stack, but uses state j to represent a .
b)Reduce A ->β. The action of the parser reduces βon the top of
the stack to head A.
c)Accept. The parser accepts the input and finishes parsing;
d)Error. The parser discovers an error in its input and takes some
corrective action.
2.Extend the GOTO function, defined on sets of items, to states: if
GOTo [Ii , A] = Ij , then GOTO also maps a state i and a
nonterminal A to state j .

Construction of SLR Parse
Table
1.Construct C={I0,I1…….In} the collection sets of LR(0)
items for G’.
2.Initial state of the parser is constructed from the set of items
for [S’->S]
3.State I is constructed from Ii. The parsing actions are
determined as follows.
1.If [A->α.aβ] is in Ii and GOTO(Ii,a]=Ij, then ACTION[i,a]
is set to ‘shift j’ here ‘a’ must be a terminal.
2.If[A->α.] is in Ii, then ACTION[i,a] is set to reduce A->a
for all ‘a’ Follow(A).
3.If S’->S is in Ii, than action[i, $] is set to ‘accept’.

Construction of SLR parsing
table
4. GOTO transitions are constructed for all non-terminals. If GOTO(Ii,A) =Ij, then
goto(i,A)of the parse table is set to j.
5. All other entries are error entries.

LR-Parser Configurations
Helps to have complete state o its stack and the remaining
input. A configuration of an LR parser is a pair:(s
0s
1
………… s
m, a
ia
i+1…………a
n$).
where the first component is the stack contents and the
second component is the remaining input.

Behavior of the LR Parser

LR-parsing algorithm.
INPUT: An input string w and an LR-parsing table with functions ACTION
and GOTO for a grammar G.
OUTPUT: If w is in L ( G), the reduction steps of a bottom-up parse for W ;
otherwise, an error indication.
METHOD: Initially, the parser has So on its stack, where So is the initial
state, and w$ in the input buffer.

LR-Parser Configurations