Compiler Design LR parsing SLR ,LALR CLR

RiazulIslam10 5,430 views 23 slides Mar 21, 2018
Slide 1
Slide 1 of 23
Slide 1
1
Slide 2
2
Slide 3
3
Slide 4
4
Slide 5
5
Slide 6
6
Slide 7
7
Slide 8
8
Slide 9
9
Slide 10
10
Slide 11
11
Slide 12
12
Slide 13
13
Slide 14
14
Slide 15
15
Slide 16
16
Slide 17
17
Slide 18
18
Slide 19
19
Slide 20
20
Slide 21
21
Slide 22
22
Slide 23
23

About This Presentation

LR parsing in compiler design presentation


Slide Content

Welcome My Presentation Md.Riazul Islam CE-14047

Introduction to LR Parsing: Simple LR The most prevalent type of bottom-up parser today is based on a concept called LR(k) parsing; the "L" is for left-to-right scanning of the input, the "R" for constructing a rightmost derivation in reverse, and the k for the number of input symbols of lookahead that are used in making parsing decisions. The cases k = 0 or k = 1 are of practical interest, and we shall only consider LR parsers with k 5 1 here. When (k) is omitted, k is assumed to be 1

This section introduces the basic concepts of LR parsing and the easiest method for constructing shift-reduce parsers, called "simple LR" (or SLR, for short). Some familiarity with the basic concepts is helpful even if the LR parser itself is constructed using an automatic parser generator. We begin with "items" and "parser states;" the diagnostic output from an LR parser generator typically includes parser states, which can be used to isolate the sources of parsing conflicts.

Examples: Example : Consider the augmented expression grammar: E' + E E -+ E+T (T T + T*F 1 F E -+ (E) I id If I is the set of one item {[E' -+ .El}, then CLOSURE(I) contains the set of items I. in Fig. 4.3

To see how the closure is computed, E' -+ -E is put in CLOSURE(I) by rule (1). Since there is an E immediately to the right of a dot, we add the E-productions with dots at the left ends: E -+ .E + T and E -+ ST. Now there is a T immediately to the right of a dot in the latter item, so we add T -+ ST * F and T -+ .F. Next, the F to the right of a dot forces us to add F + .(E) and F -+ -id, but no other items need to be added

1. Kernel items: the initial item, S' -+ .S, and all items whose dots are not at the left end. 2. Nonkernel items: all items with their dots at the left end, except for S' -+ .S.

Table

The codes for the actions are: 1. si means shift and stack state i , 2. rj means reduce by the production numbered j, 3. acc means accept, 4. blank means error. Note that the value of GOTO[S, a] for terminal a is found in the ACTION field connected with the shift action on input a for state s. The GOTO field gives GOTO[S, A] for nonterminals A. Although we have not yet explained how the entries for Fig. 4.37 were selected, we shall deal with this issue short

More Powerful LR Parsers: we shall extend the previous LR parsing techniques to use one symbol of lookahead on the input. There are two different methods: 1. The "canonical-LR" or just "LR" method, which makes full use of the lookahead symbol(s). This method uses a large set of items, called the LR(1) items. 2. The " lookahead -LR" or "LALR" method, which is based on the LR(0) sets of items, and has many fewer states than typical parsers based on the LR(1) items. By carefully introducing lookaheads into the LR(0) items, we can handle many more grammars with the LALR method than with the SLR method, and build parsing tables that are no bigger than the SLR tables. LALR is the method of choice in most situations.

Canonical LR(1) Items : We shall now present the most general technique for constructing an LR parsing table from a grammar. Recall that in the SLR method, state i calls for reduction by A -+ a if the set of items Ii contains item [A --+ as] and a is in FOLLOW(A). In some situations, however, when state i appears on top of the stack, the viable prefix pa on the stack is such that PA cannot be followed by a in any right-sentential form. Thus, the reduction by A -+ a should be invalid on input a.

Examples: Example : Consider the following augmented grammar. We begin by computing the closure of {[St -+ -S, $1). To close, we match the item [St -+ -S, $1 with the item [A -+ a-BP, a] in the procedure CLOSURE. That is, A = St, a = e, B = S, P = e, and a = $. Function CLOSURE tells us to add [B -+ .y, b] for each production B -+ y and terminal b in FIRST(P~). In terms of the present grammar, B -+ y must be S -+ CC, and since ,8 is c and a is $, b may only be $. Thus we add [S -+ .CC, $1

We continue to compute the closure by adding all items [C -+ .y, b] for b in FIRST(C$). That is, matching [S -+ .CC, $1 against [A -+ a.B,O , a], we have A = S, a = 6, B = C, p = C, and a = $. Since C does not derive the empty string, FIRST(C$) = FIRST(C). Since FIRST@) contains terminals c and d, we add items [C -+ - cC , c], [C -+ . cC , dl, [C -t -d, c] and [C -+ -d, dl. None of the new items has a nonterminal immediately to the right of the dot, so we have completed our first set of LR(1) items. The initial set of items is I0: S->.S,$ S -> .CC, $ C -> . cC , c/d C -> .d, c/d

Constructing LALR Parsing Table We now introduce our last parser construction method, the LALR ( Eoolcahead - LR) technique. This method is often used in practice, because the tables obtained by it are considerably smaller than the canonical LR tables, yet most common syntactic constructs of programming languages can be expressed conveniently by an LALR grammar. The same is almost true for SLR grammars, but there are a few constructs that cannot be conveniently handled by SLR techniques

Example