CD Unit 3


UNIT-III

Bottom-Up Parsing

Compiler Design- Unit-3


Contents
• Bottom up Parsing
• Reductions
• Handle Pruning
• Shift Reduce Parsing
• Problems related to Shift Reduce Parsing
• Conflicts during Shift Reduce Parsing
• Operator Precedence Parser
• Computation of LEADING
• Computation of TRAILING
• Problems related to LEADING AND TRAILING
• LR Parsers – Why LR Parsers
• Items and LR(0) Automaton
• Closure of Item sets
• LR Parsing Algorithm
• SLR Grammars
• SLR Parsing Tables
• Problems related to SLR
• Construction of Canonical LR(1) and LALR
• Construction of LALR
• Problems related to Canonical LR(1) and LALR Parsing Table



Bottom-Up Parsing
Introduction
• Bottom up parsing works in the opposite direction from top down.
• A top down parser begins with the start symbol at the top of the parse tree and works
downward, driving productions in forward order until it gets to the terminal leaves.
• A bottom up parse starts with the string of terminals itself and builds from the leaves
upward, working backwards to the start symbol by applying the productions in reverse.
• Along the way, a bottom up parser searches for substrings of the working string that
match the right side of some production. When it finds such a substring, it reduces it,
i.e., substitutes the left side non-terminal for the matching right side.
• The goal is to reduce all the way up to the start symbol and report a successful parse.
• A bottom-up parse corresponds to the construction of a parse tree for an input string
beginning at the leaves (the bottom) and working up towards the root (the top).
• It is the process of "reducing" an input string w to the start symbol of the grammar.
• At each reduction step, a specific substring matching the body of a production is
replaced by the non-terminal at the head of that production.
• A reduction is the reverse of a step in a derivation; a bottom-up parse therefore traces out a
rightmost derivation in reverse.
• The key decisions during bottom-up parsing are about when to reduce and about what
production to apply, as the parser proceeds.
Example for reduction

Example:
• Consider the grammar:
S → aABe
A → Abc | b
B→d
• The sentence/Input string(w) to be recognized is abbcde.
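
The reduction of abbcde can be replayed with a few lines of Python. This is only an illustrative sketch: the list `steps` (a hypothetical name) states by hand which handle is reduced at each step, and the helper relies on the fact that, in this particular example, the leftmost occurrence of each body is the handle.

```python
# Grammar from the example: S -> aABe, A -> Abc | b, B -> d
# Each step names the handle (production body) and the head that replaces it.
steps = [
    ("b", "A"),     # A -> b
    ("Abc", "A"),   # A -> Abc
    ("d", "B"),     # B -> d
    ("aABe", "S"),  # S -> aABe
]

def reduce_all(w, steps):
    """Apply the given reductions in order and return every sentential form."""
    forms = [w]
    for body, head in steps:
        i = w.find(body)  # leftmost occurrence; it is the handle in this example
        if i < 0:
            raise ValueError(f"handle {body!r} not found in {w!r}")
        w = w[:i] + head + w[i + len(body):]
        forms.append(w)
    return forms

forms = reduce_all("abbcde", steps)
print(" => ".join(forms))
```

Read from right to left, the printed forms are the rightmost derivation of abbcde.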
Handle

• A "handle" is a substring that matches the right side of a production, and whose
reduction to the non-terminal on the left side represents one step along the reverse of a
rightmost derivation.
Handle Pruning
• A Handle is a substring that matches the body of a production.
• Handle reduction is a step in the reverse of rightmost derivation.
• A rightmost derivation in reverse can be obtained by handle pruning.
• The implementation of handle pruning involves the following data-structures:-
– a stack - to hold the grammar symbols;
– an input buffer that contains the remaining input and a table to decide handles.
Shift-Reduce Parsing

• Shift-reduce parsing is a form of bottom-up parsing in which a stack holds
grammar symbols and an input buffer holds the rest of the string to be
parsed.
• Initially, the stack is empty, and the string w is on the input, as follows:
Stack          Input Buffer
$              w$
• During a left-to-right scan of the input string, the parser shifts zero or more
input symbols onto the stack, until the handle appears on top of the stack.
• It then reduces the handle to the left side of the appropriate production.
• The parser repeats this cycle until it has detected an error or until the stack
contains the start symbol and the input is empty:
Stack Input Buffer
$S $
• Upon entering this configuration, the parser halts and announces successful
completion of parsing.
Example

• Consider the following grammar and the input string w = id1 * id2 $.

Shift-reduce parsing for w = id1 * id2 $.


• The primary operations are shift and reduce
• There are actually four possible actions a shift-reduce parser can make: (1) shift, (2)
reduce, (3) accept, and (4) error.

1. Shift. Shift the next input symbol onto the top of the stack.
2. Reduce. If the handle appears on top of the stack then, its reduction by using
appropriate production rule is done i.e. RHS of production rule is popped out of stack
and LHS of production rule is pushed onto the stack.
3. Accept. Announce successful completion of parsing.
4. Error. Discover a syntax error and call an error recovery routine.
Conflicts During Shift-Reduce Parsing

• A shift-reduce parser for a grammar can reach a configuration in which the parser
1. cannot decide whether to shift or to reduce (a shift/reduce conflict)
2. cannot decide which of several reductions to make (a reduce/reduce
conflict). Reduce/reduce conflicts are rare and usually indicate a problem
in the grammar definition.
Example for shift/reduce conflict
Example for reduce/reduce conflict
Bottom-Up Parsing
• A bottom-up parser creates the parse tree of the given input starting
from leaves towards the root.
• A bottom-up parser tries to find the right-most derivation of the given
input in the reverse order.
S ⇒ ... ⇒ ω (the right-most derivation of ω)
← (the bottom-up parser finds the right-most derivation in the reverse order)



Operator-Precedence Parser
• Operator grammar
– small, but an important class of grammars
– we may have an efficient operator precedence parser (a shift-reduce
parser) for an operator grammar.
• In an operator grammar, no production rule can have:
– ε at the right side
– two adjacent non-terminals at the right side.

• Ex:
E → AB                      (not an operator grammar: adjacent non-terminals)
A → a
B → b

E → EOE                     (not an operator grammar: adjacent non-terminals)
E → id
O → + | * | /

E → E+E | E*E | E/E | id    (operator grammar)



Precedence Relations
• In operator-precedence parsing, we define three disjoint precedence
relations between certain pairs of terminals.
a <. b     b has higher precedence than a
a =· b     b has the same precedence as a
a .> b     b has lower precedence than a

• The correct precedence relations between terminals are determined from the
traditional notions of associativity and precedence of operators. (Unary minus
causes a problem.)



Using Operator-Precedence Relations
• The intention of the precedence relations is to find the handle of
a right-sentential form, with
<. marking the left end,
=· appearing in the interior of the handle, and
.> marking the right end.

• In our input string $a1a2...an$, we insert the precedence relation
between each pair of adjacent terminals (the precedence relation that holds
between the terminals in that pair).



Using Operator-Precedence Relations
E → E+E | E-E | E*E | E/E | E^E | (E) | -E | id

The partial operator-precedence table for this grammar:

        id     +      *      $
id             .>     .>     .>
+       <.     .>     <.     .>
*       <.     .>     .>     .>
$       <.     <.     <.

• Then the input string id+id*id with the precedence relations inserted
will be:
$ <. id .> + <. id .> * <. id .> $



To Find The Handles
1. Scan the string from the left end until the first .> is encountered.
2. Then scan backwards (to the left) over any =· until a <. is encountered.
3. The handle contains everything to the left of the first .> and to the right of
the <. found in step 2.

String with relations                   Reduction     Sentential form
$ <. id .> + <. id .> * <. id .> $      E → id        $ id + id * id $
$ <. + <. id .> * <. id .> $            E → id        $ E + id * id $
$ <. + <. * <. id .> $                  E → id        $ E + E * id $
$ <. + <. * .> $                        E → E*E       $ E + E * E $
$ <. + .> $                             E → E+E       $ E + E $
$ $                                                   $ E $



Operator-Precedence Parsing Algorithm
• The input string is w$, the initial stack is $, and a table holds the precedence relations between
certain terminals.
Algorithm:
set p to point to the first symbol of w$ ;
repeat forever
    if ( $ is on top of the stack and p points to $ ) then return
    else {
        let a be the topmost terminal symbol on the stack and let b be the symbol pointed to by p;
        if ( a <. b or a =· b ) then {    /* SHIFT */
            push b onto the stack;
            advance p to the next input symbol;
        }
        else if ( a .> b ) then           /* REDUCE */
            repeat pop stack
            until ( the top of stack terminal is related by <. to the terminal most recently popped );
        else error();
    }
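
The algorithm above can be sketched in Python. This is a minimal illustration rather than a full parser: it assumes the partial table for the terminals id, +, * and $, keeps only terminals on the stack (as the trace for this algorithm does), and the names `PREC` and `op_precedence_parse` are made up for this example.

```python
# Partial precedence table: "<" stands for <. and ">" stands for .>
PREC = {
    ("id", "+"): ">", ("id", "*"): ">", ("id", "$"): ">",
    ("+", "id"): "<", ("+", "+"): ">", ("+", "*"): "<", ("+", "$"): ">",
    ("*", "id"): "<", ("*", "+"): ">", ("*", "*"): ">", ("*", "$"): ">",
    ("$", "id"): "<", ("$", "+"): "<", ("$", "*"): "<",
}

def op_precedence_parse(tokens):
    """Return the handles (terminals only) in the order they are reduced."""
    stack = ["$"]                  # terminals only; non-terminals are not tracked
    tokens = tokens + ["$"]
    p = 0
    reduced = []
    while True:
        a, b = stack[-1], tokens[p]
        if a == "$" and b == "$":
            return reduced         # accept
        rel = PREC.get((a, b))
        if rel in ("<", "="):      # shift
            stack.append(b)
            p += 1
        elif rel == ">":           # reduce: pop until top <. most recently popped
            handle = []
            while True:
                popped = stack.pop()
                handle.append(popped)
                if PREC.get((stack[-1], popped)) == "<":
                    break
            reduced.append(" ".join(reversed(handle)))
        else:
            raise SyntaxError(f"no relation between {a!r} and {b!r}")

result = op_precedence_parse(["id", "+", "id", "*", "id"])
print(result)
```

The three id handles are reduced first, then the handle around * and finally the handle around +, which reflects * binding tighter than +.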



Operator-Precedence Parsing Algorithm -- Example
stack     input         relation     action
$         id+id*id$     $ <. id      shift
$id       +id*id$       id .> +      reduce E → id
$         +id*id$                    shift
$+        id*id$                     shift
$+id      *id$          id .> *      reduce E → id
$+        *id$                       shift
$+*       id$                        shift
$+*id     $             id .> $      reduce E → id
$+*       $             * .> $       reduce E → E*E
$+        $             + .> $       reduce E → E+E
$         $                          accept



How to Create Operator-Precedence Relations
• We use associativity and precedence relations among operators.

1. If operator O1 has higher precedence than operator O2, then
O1 .> O2 and O2 <. O1

2. If operator O1 and operator O2 have equal precedence, then
if they are left-associative: O1 .> O2 and O2 .> O1
if they are right-associative: O1 <. O2 and O2 <. O1

3. For all operators O,
O <. id, id .> O, O <. (, ( <. O, O .> ), ) .> O, O .> $, and $ <. O

4. Also, let
( =· )     $ <. (     id .> )     ) .> $
( <. (     $ <. id    id .> $     ) .> )
( <. id
Operator-Precedence Relations
      +     -     *     /     ^     id    (     )     $
+     .>    .>    <.    <.    <.    <.    <.    .>    .>
-     .>    .>    <.    <.    <.    <.    <.    .>    .>
*     .>    .>    .>    .>    <.    <.    <.    .>    .>
/     .>    .>    .>    .>    <.    <.    <.    .>    .>
^     .>    .>    .>    .>    <.    <.    <.    .>    .>
id    .>    .>    .>    .>    .>                .>    .>
(     <.    <.    <.    <.    <.    <.    <.    =·
)     .>    .>    .>    .>    .>                .>    .>
$     <.    <.    <.    <.    <.    <.    <.



Handling Unary Minus
• Operator-precedence parsing cannot handle the unary minus when we
also have the binary minus in our grammar.
• The best approach to solve this problem is to let the lexical analyzer handle
it.
– The lexical analyzer will return two different operators for the unary minus and the binary
minus.
– The lexical analyzer will need a lookahead to distinguish the binary minus from the unary
minus.
• Then, we make
O <. unary-minus for any operator
unary-minus .> O if unary-minus has higher precedence than O
unary-minus <. O if unary-minus has lower (or equal) precedence than O



Precedence Functions
• Compilers using operator precedence parsers do not need to store the
table of precedence relations.
• The table can be encoded by two precedence functions f and g that map
terminal symbols to integers.
• For symbols a and b.
f(a) < g(b) whenever a <. b
f(a) = g(b) whenever a =· b
f(a) > g(b) whenever a .> b
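
One way to obtain such functions is the common textbook graph construction: create a node f_a and g_a for every terminal a, add an edge f_a → g_b whenever a .> b and an edge g_b → f_a whenever a <. b, and take the length of the longest path leaving each node as its value. The sketch below applies this to the partial table for id, +, * and $. It is an illustration under two stated assumptions: the table contains no =· pairs (which would require merging nodes), and the names `RELATIONS` and `precedence_functions` are invented for this example.

```python
# Partial table written as (a, relation, b); "<" stands for <. and ">" for .>
RELATIONS = [
    ("id", ">", "+"), ("id", ">", "*"), ("id", ">", "$"),
    ("+", "<", "id"), ("+", ">", "+"), ("+", "<", "*"), ("+", ">", "$"),
    ("*", "<", "id"), ("*", ">", "+"), ("*", ">", "*"), ("*", ">", "$"),
    ("$", "<", "id"), ("$", "<", "+"), ("$", "<", "*"),
]

def precedence_functions(relations):
    terminals = sorted({a for a, _, _ in relations} | {b for _, _, b in relations})
    edges = {(kind, t): set() for kind in "fg" for t in terminals}
    for a, rel, b in relations:
        if rel == ">":
            edges[("f", a)].add(("g", b))   # f(a) must exceed g(b)
        elif rel == "<":
            edges[("g", b)].add(("f", a))   # g(b) must exceed f(a)
        else:
            raise ValueError("this sketch does not handle =· pairs")

    memo, on_path = {}, set()

    def longest(node):
        """Length of the longest path starting at node."""
        if node in memo:
            return memo[node]
        if node in on_path:
            raise ValueError("cycle in the graph: no precedence functions exist")
        on_path.add(node)
        length = max((1 + longest(n) for n in edges[node]), default=0)
        on_path.discard(node)
        memo[node] = length
        return length

    f = {t: longest(("f", t)) for t in terminals}
    g = {t: longest(("g", t)) for t in terminals}
    return f, g

f, g = precedence_functions(RELATIONS)
print("f =", f)
print("g =", g)
```

If the graph has a cycle, no precedence functions exist for the table, which is why the table itself is sometimes kept instead.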



Advantages and Disadvantages of Operator-Precedence Parsing
• Disadvantages:
– It cannot handle the unary minus (the lexical analyzer should handle
the unary minus).
– Small class of grammars.
– Difficult to decide which language is recognized by the grammar.

• Advantages:
– simple
– powerful enough for expressions in programming languages



Error Recovery in Operator-Precedence Parsing
Error Cases:
1. No relation holds between the terminal on the top of stack and the
next input symbol.
2. A handle is found (reduction step), but there is no production with
this handle as a right side.

Error Recovery:
1. Each empty entry is filled with a pointer to an error routine.
2. The parser decides which right-hand side the popped handle “looks like” and
tries to recover from that situation.



Computation of LEADING & TRAILING
• Two Methods to determine a precedence relation between a
pair of terminals
– Based on associativity and precedence relations of operators
– Using Operator Precedence Grammar
• Implementation of Operator-Precedence Parser:
– An operator-precedence parser is a simple shift-reduce parser that is capable of
parsing a subset of LR(1) grammars.
– The operator-precedence parser can parse all LR(1) grammars where two
consecutive non-terminals and epsilon never appear in the right-hand side of any
rule.



Computation of LEADING & TRAILING
• Steps involved in Parsing:
1. Ensure the grammar satisfies the pre-requisite.
2. Computation of the function LEADING()
3. Computation of the function TRAILING()
4. Using the computed LEADING and TRAILING, construct the operator
precedence table
5. Parse the given input string based on the algorithm
6. Compute Precedence Function and graph.



Computation of LEADING
LEADING is defined for every non-terminal: LEADING(A) is the set of terminals that can be the
first terminal in a string derived from A.

Compute LEADING(A)

• LEADING(A) = { a | A ⇒+ γaδ, where γ is ε or a single non-terminal }
• Rule 1: a is in LEADING(A) if there is a production of the form
A → γaδ, where γ is ε or a single non-terminal.
• Rule 2: if a is in LEADING(B) and there is a production of the form
A → Bα, then a is in LEADING(A).



Computation of TRAILING
TRAILING is defined for every non-terminal: TRAILING(A) is the set of terminals that can be the
last terminal in a string derived from A.

Compute TRAILING(A)

• TRAILING(A) = { a | A ⇒+ γaδ, where δ is ε or a single non-terminal }
• Rule 1: a is in TRAILING(A) if there is a production of the form
A → γaδ, where δ is ε or a single non-terminal.
• Rule 2: if a is in TRAILING(B) and there is a production of the form
A → αB, then a is in TRAILING(A).
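
Rules 1 and 2 can be applied repeatedly until nothing changes. The sketch below does this for the expression grammar E → E+T | T, T → T*F | F, F → (E) | id, which is chosen here only as a sample input. The function name `leading_trailing` is hypothetical, and the code assumes an operator grammar (no ε bodies, no adjacent non-terminals). TRAILING is computed by running the same rules on reversed production bodies.

```python
# Sample operator grammar; dictionary keys are the non-terminals.
GRAMMAR = {
    "E": [["E", "+", "T"], ["T"]],
    "T": [["T", "*", "F"], ["F"]],
    "F": [["(", "E", ")"], ["id"]],
}

def leading_trailing(grammar, trailing=False):
    """Fixed-point computation of LEADING (or TRAILING when trailing=True)."""
    result = {nt: set() for nt in grammar}
    changed = True
    while changed:
        changed = False
        for head, bodies in grammar.items():
            for body in bodies:
                symbols = body[::-1] if trailing else body
                new = set()
                first = symbols[0]
                if first in grammar:
                    new |= result[first]                 # Rule 2
                    if len(symbols) > 1 and symbols[1] not in grammar:
                        new.add(symbols[1])              # Rule 1, one non-terminal skipped
                else:
                    new.add(first)                       # Rule 1, nothing skipped
                if not new <= result[head]:
                    result[head] |= new
                    changed = True
    return result

LEADING = leading_trailing(GRAMMAR)
TRAILING = leading_trailing(GRAMMAR, trailing=True)
for nt in GRAMMAR:
    print(nt, "LEADING:", sorted(LEADING[nt]), "TRAILING:", sorted(TRAILING[nt]))
```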



Computation of LEADING & TRAILING--Example



Algorithm for constructing
Precedence Relation Table
Step 2: After computing LEADING and TRAILING, the table is constructed between
all the terminals in the grammar, including the ‘$’ symbol.
PARSINGTABLE(Grammar G, LEADING(), TRAILING())
{
    for each production A → X1X2X3...Xn
        for i = 1 to n-1
            1. if Xi and Xi+1 are terminals
                   set Xi =· Xi+1
            2. if i ≤ n-2 and Xi and Xi+2 are terminals and Xi+1 is a non-terminal
                   set Xi =· Xi+2
            3. if Xi is a terminal and Xi+1 is a non-terminal then for all ‘a’ in
                   Leading(Xi+1) set Xi <. a
            4. if Xi is a non-terminal and Xi+1 is a terminal then for all ‘a’ in
                   Trailing(Xi) set a .> Xi+1
    for the start symbol S
            5. for all ‘a’ in Leading(S) set $ <. a, and for all ‘a’ in Trailing(S) set a .> $
}
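
The table construction can be sketched as follows. The example assumes the expression grammar E → E+T | T, T → T*F | F, F → (E) | id with its LEADING and TRAILING sets written out by hand, and it includes the usual treatment of ‘$’ for the start symbol. `build_table` and `put` are hypothetical names, and "<", "=", ">" stand for <., =· and .>.

```python
PRODUCTIONS = [
    ("E", ["E", "+", "T"]), ("E", ["T"]),
    ("T", ["T", "*", "F"]), ("T", ["F"]),
    ("F", ["(", "E", ")"]), ("F", ["id"]),
]
# LEADING and TRAILING sets for this grammar, written out by hand.
LEADING = {"E": {"+", "*", "(", "id"}, "T": {"*", "(", "id"}, "F": {"(", "id"}}
TRAILING = {"E": {"+", "*", ")", "id"}, "T": {"*", ")", "id"}, "F": {")", "id"}}

def build_table(productions, leading, trailing, start):
    table = {}

    def put(a, b, rel):
        # Two different relations for one pair mean the grammar is not
        # an operator-precedence grammar.
        if table.setdefault((a, b), rel) != rel:
            raise ValueError(f"conflict at ({a}, {b})")

    def is_nt(x):
        return x in leading

    for _, body in productions:
        for i in range(len(body) - 1):
            x, y = body[i], body[i + 1]
            if not is_nt(x) and not is_nt(y):
                put(x, y, "=")                       # case 1
            if (i + 2 < len(body) and not is_nt(x)
                    and is_nt(y) and not is_nt(body[i + 2])):
                put(x, body[i + 2], "=")             # case 2
            if not is_nt(x) and is_nt(y):
                for a in leading[y]:
                    put(x, a, "<")                   # case 3
            if is_nt(x) and not is_nt(y):
                for a in trailing[x]:
                    put(a, y, ">")                   # case 4
    for a in leading[start]:
        put("$", a, "<")                             # $ <. a
    for a in trailing[start]:
        put(a, "$", ">")                             # a .> $
    return table

table = build_table(PRODUCTIONS, LEADING, TRAILING, "E")
print(table[("+", "*")], table[("*", "+")], table[("(", ")")])
```

Pairs that never receive a relation, such as (id, id), stay out of the dictionary and correspond to error entries.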
Precedence Relation Table



Parsing Algorithm
Algorithm:
set p to point to the first symbol of w$ ;
repeat forever
    if $ is on top of the stack and p points to $ then return
    else begin
        let a be the topmost terminal symbol on the stack and let b be the symbol pointed to by p;
        if ( a <. b or a =· b ) then begin
            push b onto the stack;
            advance p to the next input symbol;
        end
        else if a .> b then
            repeat pop stack
            until the top of stack terminal is related by <. to the terminal most recently popped
        else error();
    end



Parse the given input string (id+id)*id$
• Step 3



Precedence Functions
• Compilers using operator-precedence parsers need not store the table of precedence
relations. In most cases, the table can be encoded by two precedence functions f and g
that map terminal symbols to integers. We attempt to select f and g so that, for
symbols a and b:
f(a) < g(b) whenever a <. b
f(a) = g(b) whenever a =· b
f(a) > g(b) whenever a .> b



Algorithm for Constructing Precedence Functions

Introduction to LR Parsing
Why LR Parsers?

• LR parsing is attractive for a variety of reasons:
• LR parsers can be constructed to recognize virtually all programming-language constructs
for which context-free grammars can be written.
• The LR-parsing method is the most general non-backtracking shift-reduce parsing
method known, yet it can be implemented efficiently.
• An LR parser can detect a syntactic error as soon as it is possible to do so on a
left-to-right scan of the input.
• The class of grammars that can be parsed using LR methods is a proper superset of the
class of grammars that can be parsed with predictive or LL methods. LR grammars can
describe more languages than LL grammars.
• The principal drawback of the LR method is that it is too much work to construct an
LR parser by hand for a typical programming-language grammar.
LR Parsers

• The most powerful (yet efficient) shift-reduce parsing method is LR(k) parsing:
– L: left-to-right scanning of the input
– R: right-most derivation (constructed in reverse)
– k: number of lookahead symbols (when k is omitted, it is 1)

• LR parsing is attractive because:
– LR parsing is the most general non-backtracking shift-reduce parsing method, yet it is still efficient.
– The class of grammars that can be parsed using LR methods is a proper superset of the class
of grammars that can be parsed with predictive parsers.
LL(1)-Grammars ⊂ LR(1)-Grammars
– An LR-parser can detect a syntactic error as soon as it is possible to do so on a left-to-right
scan of the input.



LR Parsers
• LR-Parsers
– cover a wide range of grammars.
– SLR – simple LR parser
– LR – most general LR parser (canonical LR)
– LALR – intermediate LR parser (look-ahead LR parser)
– SLR, LR and LALR parsers work the same way (they use the same algorithm);
only their parsing tables are different.



LR Parsing Algorithm

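[Figure: model of an LR parser. The input buffer holds a1 ... ai ... an $. The stack holds
S0 X1 S1 ... Xm-1 Sm-1 Xm Sm, with Sm on top. The LR parsing algorithm reads the input and the
stack, produces the output, and consults two tables: the Action Table (rows: states; columns:
terminals and $; four different actions) and the Goto Table (rows: states; columns:
non-terminals; each entry is a state number).]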
A Configuration of LR Parsing Algorithm
• A configuration of an LR parser is:
( S0 X1 S1 ... Xm Sm, ai ai+1 ... an $ )
  Stack                Rest of Input

• Sm and ai decide the parser action by consulting the parsing action
table. (The initial stack contains just S0.)

• A configuration of an LR parser represents the right-sentential form:
X1 ... Xm ai ai+1 ... an $



Actions of A LR-Parser
1. shift s -- shifts the next input symbol and the state s onto the stack
( S0 X1 S1 ... Xm Sm, ai ai+1 ... an $ ) → ( S0 X1 S1 ... Xm Sm ai s, ai+1 ... an $ )

2. reduce A→β (or rn where n is a production number)
– pop 2|β| items from the stack (let r = |β|);
– then push A and s where s = goto[Sm-r, A]
( S0 X1 S1 ... Xm Sm, ai ai+1 ... an $ ) → ( S0 X1 S1 ... Xm-r Sm-r A s, ai ... an $ )
– Output is the reducing production A→β

3. Accept -- Parsing successfully completed

4. Error -- Parser detected an error (an empty entry in the action table)
Reduce Action
• pop 2|β| items from the stack (r = |β|); let us assume that β = Y1Y2...Yr
• then push A and s where s = goto[Sm-r, A]
( S0 X1 S1 ... Xm-r Sm-r Y1 Sm-r+1 ... Yr Sm, ai ai+1 ... an $ )
→ ( S0 X1 S1 ... Xm-r Sm-r A s, ai ... an $ )

• In fact, Y1Y2...Yr is a handle.
X1 ... Xm-r A ai ... an $ ⇒ X1 ... Xm-r Y1...Yr ai ai+1 ... an $



(SLR) Parsing Tables for Expression Grammar
Grammar:
1) E → E+T
2) E → T
3) T → T*F
4) T → F
5) F → (E)
6) F → id

        Action Table                         Goto Table
state   id    +     *     (     )     $      E    T    F
0       s5                s4                 1    2    3
1             s6                      acc
2             r2    s7          r2    r2
3             r4    r4          r4    r4
4       s5                s4                 8    2    3
5             r6    r6          r6    r6
6       s5                s4                      9    3
7       s5                s4                           10
8             s6                s11
9             r1    s7          r1    r1
10            r3    r3          r3    r3
11            r5    r5          r5    r5



Actions of A (S)LR-Parser -- Example
stack input action output
0 id*id+id$ shift 5
0id5 *id+id$ reduce by F→id F→id
0F3 *id+id$ reduce by T→F T→F
0T2 *id+id$ shift 7
0T2*7 id+id$ shift 5
0T2*7id5 +id$ reduce by F→id F→id
0T2*7F10 +id$ reduce by T→T*F T→T*F
0T2 +id$ reduce by E→T E→T
0E1 +id$ shift 6
0E1+6 id$ shift 5
0E1+6id5 $ reduce by F→id F→id
0E1+6F3 $ reduce by T→F T→F
0E1+6T9 $ reduce by E→E+T E→E+T
0E1 $ accept
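
The trace above can be reproduced with a small table-driven driver. This is a sketch, not a reference implementation: the tables are the SLR tables for the expression grammar typed in as dictionaries, the names `ACTION`, `GOTO`, `PRODUCTIONS` and `lr_parse` are chosen for this example, and the stack holds only states (so a reduction pops |β| entries rather than 2|β|).

```python
# production number -> (head, length of body)
PRODUCTIONS = {1: ("E", 3), 2: ("E", 1), 3: ("T", 3), 4: ("T", 1), 5: ("F", 3), 6: ("F", 1)}

def _row(reduce_n):
    """Row that reduces by production reduce_n on +, *, ) and $."""
    return {t: f"r{reduce_n}" for t in ("+", "*", ")", "$")}

ACTION = {
    0: {"id": "s5", "(": "s4"},
    1: {"+": "s6", "$": "acc"},
    2: {"+": "r2", "*": "s7", ")": "r2", "$": "r2"},
    3: _row(4),
    4: {"id": "s5", "(": "s4"},
    5: _row(6),
    6: {"id": "s5", "(": "s4"},
    7: {"id": "s5", "(": "s4"},
    8: {"+": "s6", ")": "s11"},
    9: {"+": "r1", "*": "s7", ")": "r1", "$": "r1"},
    10: _row(3),
    11: _row(5),
}
GOTO = {0: {"E": 1, "T": 2, "F": 3}, 4: {"E": 8, "T": 2, "F": 3},
        6: {"T": 9, "F": 3}, 7: {"F": 10}}

def lr_parse(tokens):
    """Return the production numbers in the order the parser reduces by them."""
    stack = [0]                       # states only
    tokens = tokens + ["$"]
    i = 0
    output = []
    while True:
        act = ACTION[stack[-1]].get(tokens[i])
        if act is None:
            raise SyntaxError(f"unexpected {tokens[i]!r} in state {stack[-1]}")
        if act == "acc":
            return output
        if act[0] == "s":             # shift: push the new state, consume input
            stack.append(int(act[1:]))
            i += 1
        else:                         # reduce by production n
            n = int(act[1:])
            head, length = PRODUCTIONS[n]
            del stack[len(stack) - length:]
            stack.append(GOTO[stack[-1]][head])
            output.append(n)

output = lr_parse(["id", "*", "id", "+", "id"])
print(output)
```

The printed production numbers correspond to the output column of the trace: F→id, T→F, F→id, T→T*F, E→T, F→id, T→F, E→E+T.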



Constructing SLR Parsing Tables – LR(0) Item
• An LR(0) item of a grammar G is a production of G with a dot at some
position of the right side.
• Ex: A → aBb    Possible LR(0) items (four different possibilities):
A → .aBb
A → a.Bb
A → aB.b
A → aBb.

• Sets of LR(0) items will be the states of the action and goto tables of the
SLR parser.
• A collection of sets of LR(0) items (the canonical LR(0) collection) is
the basis for constructing SLR parsers.
• Augmented Grammar:
G’ is G with a new production rule S’→S where S’ is the new starting
symbol.
The Closure Operation
• If I is a set of LR(0) items for a grammar G, then closure(I) is the
set of LR(0) items constructed from I by the two rules:
1. Initially, every LR(0) item in I is added to closure(I).
2. If A → α.Bβ is in closure(I) and B→γ is a production rule of G,
then B → .γ will be in closure(I).
We apply this rule until no more new LR(0) items can be added
to closure(I).



The Closure Operation -- Example

Grammar:
E’ → E
E → E+T
E → T
T → T*F
T → F
F → (E)
F → id

closure({E’ → .E}) =
{ E’ → .E      (kernel item)
  E → .E+T
  E → .T
  T → .T*F
  T → .F
  F → .(E)
  F → .id }
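
The closure computation for this example can be sketched in Python. The representation is an assumption made for illustration: an item is a triple (head, body, dot position), and the names `GRAMMAR`, `closure` and `show` are not from the source.

```python
# Augmented expression grammar; keys are non-terminals, bodies are tuples.
GRAMMAR = {
    "E'": [("E",)],
    "E": [("E", "+", "T"), ("T",)],
    "T": [("T", "*", "F"), ("F",)],
    "F": [("(", "E", ")"), ("id",)],
}

def closure(items):
    """LR(0) closure of a set of (head, body, dot) items."""
    result = set(items)
    work = list(items)
    while work:
        _, body, dot = work.pop()
        # Rule 2: the dot stands in front of a non-terminal B, so add B -> .γ
        if dot < len(body) and body[dot] in GRAMMAR:
            for prod in GRAMMAR[body[dot]]:
                item = (body[dot], prod, 0)
                if item not in result:
                    result.add(item)
                    work.append(item)
    return result

def show(item):
    """Render an item with '.' marking the dot position."""
    head, body, dot = item
    rhs = list(body)
    rhs.insert(dot, ".")
    return f"{head} -> {' '.join(rhs)}"

I0 = closure({("E'", ("E",), 0)})
for text in sorted(show(item) for item in I0):
    print(text)
```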



Goto Operation
• If I is a set of LR(0) items and X is a grammar symbol (terminal or
non-terminal), then goto(I,X) is defined as follows:
– If A → α.Xβ is in I,
then every item in closure({A → αX.β}) will be in goto(I,X).

Example:
I = { E’ → .E, E → .E+T, E → .T,
      T → .T*F, T → .F,
      F → .(E), F → .id }
goto(I,E) = { E’ → E., E → E.+T }
goto(I,T) = { E → T., T → T.*F }
goto(I,F) = { T → F. }
goto(I,() = { F → (.E), E → .E+T, E → .T, T → .T*F, T → .F,
              F → .(E), F → .id }
Construction of The Canonical LR(0) Collection
• To create the SLR parsing tables for a grammar G, we will create the
canonical LR(0) collection of the grammar G’.

• Algorithm:
C is { closure({S’ → .S}) }
repeat the following until no more sets of LR(0) items can be added to C:
for each I in C and each grammar symbol X
if goto(I,X) is not empty and not in C
add goto(I,X) to C

• goto function is a DFA on the sets in C.
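
A compact sketch of this algorithm for the augmented expression grammar is shown below. The item representation (head, body, dot position) and all names are assumptions made for the example. States are numbered in discovery order, which depends on the order of `SYMBOLS`, so the numbers need not match the I0 ... I11 labels used in these slides.

```python
GRAMMAR = {
    "E'": [("E",)],
    "E": [("E", "+", "T"), ("T",)],
    "T": [("T", "*", "F"), ("F",)],
    "F": [("(", "E", ")"), ("id",)],
}
SYMBOLS = ["E", "T", "F", "id", "+", "*", "(", ")"]

def closure(items):
    """LR(0) closure, returned as a frozenset so states can be compared."""
    result, work = set(items), list(items)
    while work:
        _, body, dot = work.pop()
        if dot < len(body) and body[dot] in GRAMMAR:
            for prod in GRAMMAR[body[dot]]:
                item = (body[dot], prod, 0)
                if item not in result:
                    result.add(item)
                    work.append(item)
    return frozenset(result)

def goto(items, x):
    """Move the dot over x in every item that allows it, then close."""
    moved = {(h, b, d + 1) for h, b, d in items if d < len(b) and b[d] == x}
    return closure(moved) if moved else frozenset()

def canonical_collection():
    states = [closure({("E'", ("E",), 0)})]
    transitions = {}
    for i, state in enumerate(states):   # the list grows while the loop runs
        for x in SYMBOLS:
            target = goto(state, x)
            if target:
                if target not in states:
                    states.append(target)
                transitions[(i, x)] = states.index(target)
    return states, transitions

states, transitions = canonical_collection()
print(len(states), "states,", len(transitions), "transitions")
```

The `transitions` dictionary is the DFA of the goto function: terminal entries become shift actions and non-terminal entries become goto-table entries.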



The Canonical LR(0) Collection -- Example
I0: E’ → .E
    E → .E+T
    E → .T
    T → .T*F
    T → .F
    F → .(E)
    F → .id

I1: E’ → E.
    E → E.+T

I2: E → T.
    T → T.*F

I3: T → F.

I4: F → (.E)
    E → .E+T
    E → .T
    T → .T*F
    T → .F
    F → .(E)
    F → .id

I5: F → id.

I6: E → E+.T
    T → .T*F
    T → .F
    F → .(E)
    F → .id

I7: T → T*.F
    F → .(E)
    F → .id

I8: F → (E.)
    E → E.+T

I9: E → E+T.
    T → T.*F

I10: T → T*F.

I11: F → (E).



Transition Diagram (DFA) of Goto Function

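[Figure: transition diagram of the goto function. The transitions shown are:]
I0: E to I1, T to I2, F to I3, ( to I4, id to I5
I1: + to I6
I2: * to I7
I4: E to I8, T to I2, F to I3, ( to I4, id to I5
I6: T to I9, F to I3, ( to I4, id to I5
I7: F to I10, ( to I4, id to I5
I8: ) to I11, + to I6
I9: * to I7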



Constructing SLR Parsing Table
(of an augmented grammar G’)

1. Construct the canonical collection of sets of LR(0) items for G’: C ← {I0,...,In}
2. Create the parsing action table as follows:
• If a is a terminal, A→α.aβ is in Ii and goto(Ii,a)=Ij then action[i,a] is shift j.
• If A→α. is in Ii, then action[i,a] is reduce A→α for all a in FOLLOW(A)
where A≠S’.
• If S’→S. is in Ii, then action[i,$] is accept.
• If any conflicting actions are generated by these rules, the grammar is not SLR(1).

3. Create the parsing goto table:
• for all non-terminals A, if goto(Ii,A)=Ij then goto[i,A]=j

4. All entries not defined by (2) and (3) are errors.

5. The initial state of the parser is the one containing S’→.S



Parsing Tables of Expression Grammar
        Action Table                         Goto Table
state   id    +     *     (     )     $      E    T    F
0       s5                s4                 1    2    3
1             s6                      acc
2             r2    s7          r2    r2
3             r4    r4          r4    r4
4       s5                s4                 8    2    3
5             r6    r6          r6    r6
6       s5                s4                      9    3
7       s5                s4                           10
8             s6                s11
9             r1    s7          r1    r1
10            r3    r3          r3    r3
11            r5    r5          r5    r5



Another Example

• Given grammar G:
S → AA
A → aA | b
• Step 1:
• Add the augmented production and insert the '•' symbol at the first position of every
production in G
S` → •S
S → •AA
A → •aA
A → •b
• Step 2:
• Add the augmented production to the I0 state and compute the closure
I0 = Closure (S` → •S)
• Add all productions starting with "A" to the modified I0 state, because
"•" is followed by that non-terminal. So, the I0 state becomes:
• I0= S` → •S
S → •AA
A → •aA
A → •b
• I1= Go to (I0, S) = closure (S` → S•) = S` → S•
• Here the dot is at the end of the production, so the closure adds no further items and the state is complete.
• I1= S` → S•
• I2= Go to (I0, A) = closure (S → A•A)
• Add all productions starting with A in to I2 State because "•" is
followed by the non-terminal. So, the I2 State becomes
• I2 =S→A•A
A → •aA
A → •b
• I3= Go to (I0,a) = Closure (A → a•A)
• Add productions starting with A in I3.
• I3 = A → a•A
A → •aA
A → •b
• I4= Go to (I0, b) = closure (A → b•) = A → b•
• I5= Go to (I2, A) = Closure (S → AA•) = S → AA•
• Go to (I2,a) = Closure (A → a•A) = (same as I3)
• Go to (I2, b) = Closure (A → b•) = (same as I4)
• Go to (I3, a) = Closure (A → a•A) = (same as I3)
Go to (I3, b) = Closure (A → b•) = (same as I4)
• I6= Go to (I3, A) = Closure (A → aA•) = A → aA•
LR(0) Table

• If a state goes to some other state on a terminal, this corresponds to a shift move.
• If a state goes to some other state on a non-terminal, this corresponds to a goto move.
• If a state contains a final item (an item with • at the right end), write the reduce move
in every action column of that row.
LR(0) Table Entry Explanation
• Productions are numbered as follows:
S → AA ... (1)
A → aA ... (2)
A → b ... (3)
• I0 on S is going to I1 so write it as 1.
• I0 on A is going to I2 so write it as 2.
• I2 on A is going to I5 so write it as 5.
• I3 on A is going to I6 so write it as 6.
• I0, I2 and I3 on a are going to I3, so write it as S3, which means shift 3.
• I0, I2 and I3 on b are going to I4, so write it as S4, which means shift 4.
• I4, I5 and I6 all contain a final item because they contain • at the rightmost
end. So write the reduce move with the corresponding production number.
• I1 contains the final item S` → S•, so action {I1, $} = Accept.
• I4 contains the final item A → b•, and that production corresponds to
production number 3, so write r3 in the entire row.
• I5 contains the final item S → AA•, and that production corresponds to
production number 1, so write r1 in the entire row.
• I6 contains the final item A → aA•, and that production corresponds to
production number 2, so write r2 in the entire row.
SLR(1) Grammar
• An LR parser using SLR(1) parsing tables for a grammar G is called as
the SLR(1) parser for G.
• If a grammar G has an SLR(1) parsing table, it is called SLR(1)
grammar (or SLR grammar in short).
• Every SLR grammar is unambiguous, but not every unambiguous grammar
is an SLR grammar.



shift/reduce and reduce/reduce conflicts
• If a state does not know whether it will make a shift operation or a
reduction for a terminal, we say that there is a shift/reduce conflict.

• If a state does not know whether it will make a reduction
using the production rule i or j for a terminal, we say that there is a
reduce/reduce conflict.

• If the SLR parsing table of a grammar G has a conflict, we say that the
grammar is not an SLR grammar.



Conflict Example
Grammar:
S → L=R
S → R
L → *R
L → id
R → L

I0: S’ → .S
    S → .L=R
    S → .R
    L → .*R
    L → .id
    R → .L
I1: S’ → S.
I2: S → L.=R
    R → L.
I3: S → R.
I4: L → *.R
    R → .L
    L → .*R
    L → .id
I5: L → id.
I6: S → L=.R
    R → .L
    L → .*R
    L → .id
I7: L → *R.
I8: R → L.
I9: S → L=R.

Problem (state I2):
FOLLOW(R) = {=, $}
on =: shift 6
      reduce by R → L
shift/reduce conflict



Conflict Example2
Grammar:
S → AaAb
S → BbBa
A → ε
B → ε

I0: S’ → .S
    S → .AaAb
    S → .BbBa
    A → .
    B → .

Problem:
FOLLOW(A) = {a,b}
FOLLOW(B) = {a,b}
on a: reduce by A → ε and reduce by B → ε    (reduce/reduce conflict)
on b: reduce by A → ε and reduce by B → ε    (reduce/reduce conflict)



Constructing Canonical LR(1) Parsing Tables
• In the SLR method, the state i makes a reduction by A→α when the current
token is a:
– if A→α. is in Ii and a is in FOLLOW(A)
• In some situations, βA cannot be followed by the terminal a in
a right-sentential form when βα and the state i are on the top of the stack.
This means that making the reduction in this case is not correct.

Grammar:
S → AaAb
S → BbBa
A → ε
B → ε

Rightmost derivations:
S ⇒ AaAb ⇒ Aab ⇒ ab          S ⇒ BbBa ⇒ Bba ⇒ ba

Aab ⇒ ε ab                   Bba ⇒ ε ba
AaAb ⇒ Aa ε b                BbBa ⇒ Bb ε a



LR(1) Item
• To avoid some of invalid reductions, the states need to carry more
information.
• Extra information is put into a state by including a terminal symbol as a
second component in an item.
• An LR(1) item is:
A → α.β,a where a is the look-ahead of the LR(1) item
(a is a terminal or the end-marker.)



LR(1) Item (cont.)

• When β (in the LR(1) item A → α.β,a) is not empty, the look-ahead
does not have any effect.
• When β is empty (A → α.,a), we do the reduction by A→α only if the
next input symbol is a (not for any terminal in FOLLOW(A)).

• A state will contain
A → α.,a1
...
A → α.,an
where {a1,...,an} ⊆ FOLLOW(A)



Canonical Collection of Sets of LR(1) Items
• The construction of the canonical collection of the sets of LR(1) items
is similar to the construction of the canonical collection of the sets of
LR(0) items, except that the closure and goto operations work a little
differently.

closure(I) is: (where I is a set of LR(1) items)
– every LR(1) item in I is in closure(I)
– if A→α.Bβ,a is in closure(I) and B→γ is a production rule of G,
then B→.γ,b will be in closure(I) for each terminal b in
FIRST(βa).
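
The LR(1) closure rule can be sketched for the grammar S → AaAb | BbBa, A → ε, B → ε. The sketch makes several assumptions for illustration: items are tuples (head, body, dot, lookahead), an empty tuple stands for an ε body, `None` stands for ε inside FIRST sets, and the simple recursive `first_of` helper only works for grammars without left recursion, which holds for this one.

```python
GRAMMAR = {
    "S'": [("S",)],
    "S": [("A", "a", "A", "b"), ("B", "b", "B", "a")],
    "A": [()],   # A -> ε
    "B": [()],   # B -> ε
}

def first_of(symbols):
    """FIRST of a string of grammar symbols; None in the result stands for ε."""
    out = set()
    for sym in symbols:
        if sym not in GRAMMAR:             # terminal or end-marker
            out.add(sym)
            return out
        sym_first = set()
        for body in GRAMMAR[sym]:
            sym_first |= first_of(body)    # safe here: no left recursion
        out |= sym_first - {None}
        if None not in sym_first:
            return out
    out.add(None)                          # every symbol can derive ε
    return out

def closure(items):
    """LR(1) closure of a set of (head, body, dot, lookahead) items."""
    result, work = set(items), list(items)
    while work:
        _, body, dot, la = work.pop()
        if dot < len(body) and body[dot] in GRAMMAR:
            # lookaheads for the new items: FIRST(βa)
            lookaheads = first_of(body[dot + 1:] + (la,))
            for prod in GRAMMAR[body[dot]]:
                for b in lookaheads:
                    item = (body[dot], prod, 0, b)
                    if item not in result:
                        result.add(item)
                        work.append(item)
    return result

I0 = closure({("S'", ("S",), 0, "$")})
for item in sorted(I0):
    print(item)
```

The items for A → ε and B → ε receive different lookaheads (a and b respectively), which is how the LR(1) construction separates the two reductions that conflict under SLR.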



goto operation

• If I is a set of LR(1) items and X is a grammar symbol
(terminal or non-terminal), then goto(I,X) is defined as
follows:
– If A → α.Xβ,a is in I,
then every item in closure({A → αX.β,a}) will be in
goto(I,X).



Construction of The Canonical LR(1) Collection
• Algorithm:
C is { closure({S’→.S,$}) }
repeat the followings until no more set of LR(1) items can be added to C.
for each I in C and each grammar symbol X
if goto(I,X) is not empty and not in C
add goto(I,X) to C

• goto function is a DFA on the sets in C.



A Short Notation for The Sets of LR(1) Items
• A set of LR(1) items containing the following items
A → α.β,a1
...
A → α.β,an

can be written as

A → α.β,a1/a2/.../an



Canonical LR(1) Collection -- Example
Grammar:
S → AaAb
S → BbBa
A → ε
B → ε

I0: S’ → .S ,$
    S → .AaAb ,$
    S → .BbBa ,$
    A → . ,a
    B → . ,b
I1: S’ → S. ,$
I2: S → A.aAb ,$
I3: S → B.bBa ,$
I4: S → Aa.Ab ,$
    A → . ,b
I5: S → Bb.Ba ,$
    B → . ,a
I6: S → AaA.b ,$
I7: S → BbB.a ,$
I8: S → AaAb. ,$
I9: S → BbBa. ,$

Transitions:
I0: S to I1, A to I2, B to I3
I2: a to I4
I3: b to I5
I4: A to I6
I5: B to I7
I6: b to I8
I7: a to I9



Canonical LR(1) Collection – Example2
Grammar:
   S’ → S
1) S → L=R
2) S → R
3) L → *R
4) L → id
5) R → L

I0: S’ → .S,$
    S → .L=R,$
    S → .R,$
    L → .*R,$/=
    L → .id,$/=
    R → .L,$
I1: S’ → S.,$
I2: S → L.=R,$
    R → L.,$
I3: S → R.,$
I4: L → *.R,$/=
    R → .L,$/=
    L → .*R,$/=
    L → .id,$/=
I5: L → id.,$/=
I6: S → L=.R,$
    R → .L,$
    L → .*R,$
    L → .id,$
I7: L → *R.,$/=
I8: R → L.,$/=
I9: S → L=R.,$
I10: R → L.,$
I11: L → *.R,$
     R → .L,$
     L → .*R,$
     L → .id,$
I12: L → id.,$
I13: L → *R.,$

Transitions:
I0: S to I1, L to I2, R to I3, * to I4, id to I5
I2: = to I6
I4: R to I7, L to I8, * to I4, id to I5
I6: R to I9, L to I10, * to I11, id to I12
I11: R to I13, L to I10, * to I11, id to I12

States with the same core: I4 and I11, I5 and I12, I7 and I13, I8 and I10
Construction of LR(1) Parsing Tables
1. Construct the canonical collection of sets of LR(1) items for G’: C ← {I0,...,In}
2. Create the parsing action table as follows:
• If a is a terminal, A→α.aβ,b is in Ii and goto(Ii,a)=Ij then action[i,a] is shift j.
• If A→α.,a is in Ii, then action[i,a] is reduce A→α where A≠S’.
• If S’→S.,$ is in Ii, then action[i,$] is accept.
• If any conflicting actions are generated by these rules, the grammar is not LR(1).

3. Create the parsing goto table:
• for all non-terminals A, if goto(Ii,A)=Ij then goto[i,A]=j

4. All entries not defined by (2) and (3) are errors.

5. The initial state of the parser is the one containing S’→.S,$
LR(1) Parsing Tables – (for Example2)
state   id    *     =     $      S    L    R
0       s5    s4                 1    2    3
1                         acc
2                   s6    r5
3                         r2
4       s5    s4                      8    7
5                   r4    r4
6       s12   s11                     10   9
7                   r3    r3
8                   r5    r5
9                         r1
10                        r5
11      s12   s11                     10   13
12                        r4
13                        r3

There is no shift/reduce or reduce/reduce conflict, so it is an LR(1) grammar.
LALR Parsing Tables
• LALR stands for LookAhead LR.

• LALR parsers are often used in practice because LALR parsing tables
are smaller than LR(1) parsing tables.
• The number of states in the SLR and LALR parsing tables for a grammar G
is equal.
• But LALR parsers recognize more grammars than SLR parsers.
• yacc creates a LALR parser for the given grammar.
• A state of LALR parser will be again a set of LR(1) items.



Creating LALR Parsing Tables

Canonical LR(1) Parser → LALR Parser
(shrink the number of states)

• This shrink process may introduce a reduce/reduce conflict in the
resulting LALR parser (so the grammar is NOT LALR).
• But this shrink process does not produce a shift/reduce conflict.



The Core of A Set of LR(1) Items
• The core of a set of LR(1) items is the set of their first components, i.e. the items with the lookaheads removed.

Ex:   Set of LR(1) items        Core
      S → L.=R,$                S → L.=R
      R → L.,$                  R → L.

• We find the states (sets of LR(1) items) in a canonical LR(1) parser that have the same
core, then merge them into a single state.

I1: L → id.,=
I2: L → id.,$
I1 and I2 have the same core, so we merge them into a new state:
I12: L → id.,=
     L → id.,$

• We do this for all states of a canonical LR(1) parser to get the states of the LALR
parser.
• In fact, the number of states of the LALR parser for a grammar is equal to the
number of states of the SLR parser for that grammar.
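
The core and the merge step can be written down directly. The sketch below assumes an LR(1) item is stored as a tuple (head, body, dot position, lookahead); that encoding and the names core and merge_by_core are illustrative only.

```python
from collections import defaultdict

def core(item_set):
    """Drop the lookahead from every LR(1) item."""
    return frozenset((head, body, dot) for head, body, dot, _ in item_set)

def merge_by_core(states):
    """Union all item sets that share a core; one merged set per core."""
    groups = defaultdict(set)
    for item_set in states:
        groups[core(item_set)] |= set(item_set)
    return [frozenset(items) for items in groups.values()]

# The two states from the example: L -> id., =   and   L -> id., $
I1 = frozenset({("L", ("id",), 1, "=")})
I2 = frozenset({("L", ("id",), 1, "$")})

merged = merge_by_core([I1, I2])
print(len(merged))        # 1
print(sorted(merged[0]))  # both lookaheads, = and $, in one state
```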
Creation of LALR Parsing Tables
• Create the canonical LR(1) collection of the sets of LR(1) items for
the given grammar.
• Find each core; find all sets having that same core; replace those sets
having the same core with a single set which is their union.
C={I0,...,In} → C’={J1,...,Jm} where m ≤ n
• Create the parsing tables (action and goto tables) in the same way as
for the LR(1) parser.
– Note that: if J=I1 ∪ ... ∪ Ik, then since I1,...,Ik have the same core,
the cores of goto(I1,X),...,goto(Ik,X) must also be the same.
– So, goto(J,X)=K where K is the union of all sets of items having the same core as goto(I1,X).

• If no conflict is introduced, the grammar is an LALR(1) grammar.

(We may only introduce reduce/reduce conflicts; we cannot introduce
a shift/reduce conflict.)



Shift/Reduce Conflict
• We said that we cannot introduce a shift/reduce conflict during the
shrink process that creates the states of an LALR parser.
• Assume that we can introduce a shift/reduce conflict. In this case, a state
of the LALR parser must have:
A → α.,a and B → β.aγ,b
• The merged states all share the same core, so the canonical LR(1) state
that contributed A → α.,a must also contain B → β.aγ with some lookahead c:
A → α.,a and B → β.aγ,c
But this state also has a shift/reduce conflict on a, i.e. the original canonical
LR(1) parser already has a conflict.
(The reason is that the shift operation does not depend on lookaheads.)



Reduce/Reduce Conflict
• But we may introduce a reduce/reduce conflict during the shrink
process that creates the states of an LALR parser.

I1: A → α.,a          I2: A → α.,b
    B → β.,b              B → β.,c

Merging I1 and I2 gives:

I12: A → α.,a/b
     B → β.,b/c       → reduce/reduce conflict (on lookahead b)
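
The conflict above can be checked mechanically: group the completed items of a state by lookahead and report any lookahead that allows more than one reduction. In this sketch the strings standing for the items and the function name reduce_conflicts are hypothetical, chosen only to mirror the slide.

```python
from collections import defaultdict

# Completed items written as (production, lookahead), mirroring the slide.
I1 = {("A -> alpha.", "a"), ("B -> beta.", "b")}
I2 = {("A -> alpha.", "b"), ("B -> beta.", "c")}

def reduce_conflicts(state):
    """Map each lookahead to the set of reducible productions; keep only clashes."""
    by_lookahead = defaultdict(set)
    for production, lookahead in state:
        by_lookahead[lookahead].add(production)
    return {la: prods for la, prods in by_lookahead.items() if len(prods) > 1}

print(reduce_conflicts(I1))        # {} : no conflict before merging
print(reduce_conflicts(I2))        # {} : no conflict before merging
print(reduce_conflicts(I1 | I2))   # lookahead 'b' now allows two reductions
```

Neither original state has a conflict; only the union does, which is why a grammar can be LR(1) without being LALR(1).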



Canonical LALR(1) Collection – Example2
Grammar: S’ → S   1) S → L=R   2) S → R   3) L → *R   4) L → id   5) R → L

I0: S’ → .S,$
    S → .L=R,$
    S → .R,$
    L → .*R,$/=
    L → .id,$/=
    R → .L,$
    on S to I1; on L to I2; on R to I3; on * to I411; on id to I512

I1: S’ → S.,$

I2: S → L.=R,$
    R → L.,$
    on = to I6

I3: S → R.,$

I411: L → *.R,$/=
      R → .L,$/=
      L → .*R,$/=
      L → .id,$/=
      on R to I713; on L to I810; on * to I411; on id to I512

I512: L → id.,$/=

I6: S → L=.R,$
    R → .L,$
    L → .*R,$
    L → .id,$
    on R to I9; on L to I810; on * to I411; on id to I512

I713: L → *R.,$/=

I810: R → L.,$/=

I9: S → L=R.,$

Same cores (merged states): I4 and I11; I5 and I12; I7 and I13; I8 and I10.



LALR(1) Parsing Tables – (for Example2)
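(States 4, 5, 7 and 8 are the merged states I411, I512, I713 and I810.)

state   id    *     =     $    |  S    L    R
0       s5    s4               |  1    2    3
1                         acc  |
2                   s6    r5   |
3                         r2   |
4       s5    s4               |       8    7
5                   r4    r4   |
6       s5    s4               |       8    9
7                   r3    r3   |
8                   r5    r5   |
9                         r1   |

There is no shift/reduce or reduce/reduce conflict, so it is an LALR(1) grammar.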
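
One way to double-check an LALR table is to derive it from the canonical LR(1) table: redirect every reference to a merged state, fold the merged rows together, and fail if two different entries land in the same cell. The sketch below does this for Example2. The dictionary layout and the names MERGE, rename and to_lalr are illustrative assumptions; the same-core pairs are I4/I11, I5/I12, I7/I13 and I8/I10.

```python
# Canonical LR(1) table for Example2: shift/reduce strings and integer gotos per state.
LR1 = {
    0: {"id": "s5", "*": "s4", "S": 1, "L": 2, "R": 3},
    1: {"$": "acc"},
    2: {"=": "s6", "$": "r5"},
    3: {"$": "r2"},
    4: {"id": "s5", "*": "s4", "L": 8, "R": 7},
    5: {"=": "r4", "$": "r4"},
    6: {"id": "s12", "*": "s11", "L": 10, "R": 9},
    7: {"=": "r3", "$": "r3"},
    8: {"=": "r5", "$": "r5"},
    9: {"$": "r1"},
    10: {"$": "r5"},
    11: {"id": "s12", "*": "s11", "L": 10, "R": 13},
    12: {"$": "r4"},
    13: {"$": "r3"},
}
# Each state on the left is folded into the same-core state on the right.
MERGE = {11: 4, 12: 5, 13: 7, 10: 8}

def rename(entry):
    """Redirect shift targets and gotos to the merged state numbers."""
    if isinstance(entry, int):
        return MERGE.get(entry, entry)
    if entry.startswith("s"):
        target = int(entry[1:])
        return f"s{MERGE.get(target, target)}"
    return entry                      # reduce and accept entries are unchanged

def to_lalr(table):
    lalr = {}
    for state, row in table.items():
        merged_row = lalr.setdefault(MERGE.get(state, state), {})
        for symbol, entry in row.items():
            entry = rename(entry)
            if merged_row.setdefault(symbol, entry) != entry:
                raise ValueError(f"conflict in state {state} on {symbol!r}: not LALR(1)")
    return lalr

lalr = to_lalr(LR1)
print(sorted(lalr))   # [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
print(lalr[6])        # {'id': 's5', '*': 's4', 'L': 8, 'R': 9}
```

After merging, state 6 refers only to states 4, 5, 8 and 9, since states 10 to 13 no longer exist.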



Using Ambiguous Grammars
• All grammars used in the construction of conflict-free LR-parsing tables must be
unambiguous.
• Can we create LR-parsing tables for ambiguous grammars?
– Yes, but they will have conflicts.
– We can resolve each conflict in favor of one of the actions to disambiguate the grammar.
– In the end, the parser behaves as if the grammar were unambiguous.
• Why would we want to use an ambiguous grammar?
– Some ambiguous grammars are much more natural, and a corresponding unambiguous
grammar can be very complex.
– Using an ambiguous grammar may eliminate unnecessary reductions.
• Ex.
Ambiguous grammar:     E → E+E | E*E | (E) | id
Unambiguous grammar:   E → E+T | T
                       T → T*F | F
                       F → (E) | id



Sets of LR(0) Items for Ambiguous Grammar
I0: E’ → .E
    E → .E+E
    E → .E*E
    E → .(E)
    E → .id
    on E to I1; on ( to I2; on id to I3

I1: E’ → E.
    E → E.+E
    E → E.*E
    on + to I4; on * to I5

I2: E → (.E)
    E → .E+E
    E → .E*E
    E → .(E)
    E → .id
    on E to I6; on ( to I2; on id to I3

I3: E → id.

I4: E → E+.E
    E → .E+E
    E → .E*E
    E → .(E)
    E → .id
    on E to I7; on ( to I2; on id to I3

I5: E → E*.E
    E → .E+E
    E → .E*E
    E → .(E)
    E → .id
    on E to I8; on ( to I2; on id to I3

I6: E → (E.)
    E → E.+E
    E → E.*E
    on ) to I9; on + to I4; on * to I5

I7: E → E+E.
    E → E.+E
    E → E.*E
    on + to I4; on * to I5

I8: E → E*E.
    E → E.+E
    E → E.*E
    on + to I4; on * to I5

I9: E → (E).



SLR-Parsing Tables for Ambiguous Grammar
FOLLOW(E) = { $,+,*,) }

State I7 has shift/reduce conflicts for symbols + and *.

Path to I7 (after reading E + E): I0 →(E) I1 →(+) I4 →(E) I7

When the current token is +:
  shift  → + is right-associative
  reduce → + is left-associative

When the current token is *:
  shift  → * has higher precedence than +
  reduce → + has higher precedence than *



SLR-Parsing Tables for Ambiguous Grammar
FOLLOW(E) = { $,+,*,) }

State I8 has shift/reduce conflicts for symbols + and *.

Path to I8 (after reading E * E): I0 →(E) I1 →(*) I5 →(E) I8

When the current token is *:
  shift  → * is right-associative
  reduce → * is left-associative

When the current token is +:
  shift  → + has higher precedence than *
  reduce → * has higher precedence than +
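
The choices for I7 and I8 follow one rule, which can be sketched as a small function. The tables PRECEDENCE and ASSOC and the name resolve are assumptions for illustration; they encode the usual convention that * binds tighter than + and that both operators are left-associative.

```python
PRECEDENCE = {"+": 1, "*": 2}            # larger number = binds tighter (assumed convention)
ASSOC = {"+": "left", "*": "left"}       # assumed: both operators left-associative

def resolve(stack_op, lookahead):
    """Decide the conflict in a state holding E -> E stack_op E.  on an operator lookahead."""
    if PRECEDENCE[lookahead] > PRECEDENCE[stack_op]:
        return "shift"                   # the incoming operator binds tighter
    if PRECEDENCE[lookahead] < PRECEDENCE[stack_op]:
        return "reduce"                  # the operator on the stack binds tighter
    # Same precedence: associativity decides.
    return "reduce" if ASSOC[stack_op] == "left" else "shift"

print(resolve("+", "+"))   # reduce  (I7 on +)
print(resolve("+", "*"))   # shift   (I7 on *)
print(resolve("*", "+"))   # reduce  (I8 on +)
print(resolve("*", "*"))   # reduce  (I8 on *)
```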



SLR-Parsing Tables for Ambiguous Grammar
Productions: 1) E → E+E   2) E → E*E   3) E → (E)   4) E → id

        Action                             |  Goto
state   id    +     *     (     )     $    |  E
0       s3                s2               |  1
1             s4    s5                acc  |
2       s3                s2               |  6
3             r4    r4          r4    r4   |
4       s3                s2               |  7
5       s3                s2               |  8
6             s4    s5          s9         |
7             r1    s5          r1    r1   |
8             r2    r2          r2    r2   |
9             r3    r3          r3    r3   |
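
To check that the resolved table groups operators as intended, the following sketch runs the LR loop over it and builds a fully parenthesised string for each reduction. The value stack and the function name parse are illustrative additions; the table entries are those of the SLR table for the ambiguous grammar, with productions numbered 1) E → E+E, 2) E → E*E, 3) E → (E), 4) E → id.

```python
ACTION = {
    0: {"id": "s3", "(": "s2"},
    1: {"+": "s4", "*": "s5", "$": "acc"},
    2: {"id": "s3", "(": "s2"},
    3: {"+": "r4", "*": "r4", ")": "r4", "$": "r4"},
    4: {"id": "s3", "(": "s2"},
    5: {"id": "s3", "(": "s2"},
    6: {"+": "s4", "*": "s5", ")": "s9"},
    7: {"+": "r1", "*": "s5", ")": "r1", "$": "r1"},
    8: {"+": "r2", "*": "r2", ")": "r2", "$": "r2"},
    9: {"+": "r3", "*": "r3", ")": "r3", "$": "r3"},
}
GOTO_E = {0: 1, 2: 6, 4: 7, 5: 8}        # the only non-terminal is E
BODY_LENGTH = {1: 3, 2: 3, 3: 3, 4: 1}   # production number -> length of body

def parse(tokens):
    """Return the input with explicit parentheses around every + and * reduction."""
    states, values = [0], []             # parallel state and value stacks
    tokens = list(tokens) + ["$"]
    pos = 0
    while True:
        act = ACTION[states[-1]].get(tokens[pos])
        if act is None:
            raise SyntaxError(f"unexpected {tokens[pos]!r} at position {pos}")
        if act == "acc":
            return values[0]
        if act[0] == "s":
            states.append(int(act[1:]))
            values.append(tokens[pos])
            pos += 1
            continue
        number = int(act[1:])
        length = BODY_LENGTH[number]
        parts = values[-length:]
        del states[-length:]
        del values[-length:]
        if number in (1, 2):             # E -> E+E or E -> E*E
            values.append("(" + "".join(parts) + ")")
        elif number == 3:                # E -> (E): keep the inner grouping
            values.append(parts[1])
        else:                            # E -> id
            values.append(parts[0])
        states.append(GOTO_E[states[-1]])

print(parse(["id", "+", "id", "*", "id"]))   # (id+(id*id))
print(parse(["id", "+", "id", "+", "id"]))   # ((id+id)+id)
```

The first result shows * taking precedence over +, and the second shows + associating to the left, matching the entries chosen for states 7 and 8.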