CSC 4181 Compiler Construction Parsing
Outline
Top-down vs. bottom-up
Top-down parsing
  Recursive-descent parsing
  LL(1) parsing
  LL(1) parsing algorithm
Bottom-up parsing
  Shift-reduce parsers
  LR(0) parsing
  LR(0) items
  Finite automata of items
  SLR(1) grammar
  Parsing conflict
Introduction
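Parse Trees and Derivations
Top-down parsing of id + id * id (leftmost derivation, tree built from the root down):
E => E + E
  => id + E
  => id + E * E
  => id + id * E
  => id + id * id
Bottom-up parsing of id + id * id (rightmost derivation, discovered in reverse, tree built from the leaves up):
E => E + E
  => E + E * E
  => E + E * id
  => E + id * id
  => id + id * id
[Figure: the parse tree of id + id * id, with root E, children E + E, and the right E expanded to E * E]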
Top-down Parsing
Why is it difficult?
Cannot decide until later
  Next token: if    Structure to be built: St
  St -> MatchedSt | UnmatchedSt
  UnmatchedSt -> if (E) St | if (E) MatchedSt else UnmatchedSt
  MatchedSt -> if (E) MatchedSt else MatchedSt | ...
Production with empty string
  Next token: id    Structure to be built: par
  par -> parList | ε
Recursive-Descent
procedure match(expTok)
{ if (token == expTok)
    then getToken
    else error
}
The token is not consumed until getToken is executed.
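The match procedure above can be embedded in a small recursive-descent recognizer. The following is only a sketch under stated assumptions: the grammar (exp -> term { + term }, term -> n | ( exp )), the class name RDParser, and the single-character tokens are hypothetical choices for illustration, not taken from the slides.

```python
class ParseError(Exception):
    pass

class RDParser:
    """Recognizer for a hypothetical grammar (an assumption for this sketch):
         exp  -> term { '+' term }
         term -> 'n' | '(' exp ')'
    """
    def __init__(self, tokens):
        self.tokens = list(tokens) + ["$"]   # "$" marks the end of input
        self.pos = 0

    @property
    def token(self):
        return self.tokens[self.pos]         # current lookahead, not yet consumed

    def get_token(self):
        self.pos += 1                        # consume the current token

    def match(self, exp_tok):
        # Same shape as match(expTok) on the slide.
        if self.token == exp_tok:
            self.get_token()
        else:
            raise ParseError(f"expected {exp_tok!r}, found {self.token!r}")

    def exp(self):                           # one procedure per nonterminal
        self.term()
        while self.token == "+":
            self.match("+")
            self.term()

    def term(self):
        if self.token == "n":
            self.match("n")
        else:
            self.match("(")
            self.exp()
            self.match(")")

def accepts(tokens):
    """True if the token sequence is a sentence of the grammar above."""
    parser = RDParser(tokens)
    try:
        parser.exp()
        parser.match("$")
        return True
    except ParseError:
        return False
```

Each nonterminal becomes one procedure, and the choice between alternatives is made by inspecting the lookahead token before anything is consumed.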
Problems in Recursive-Descent
LL(1) Parsing
LL(1)
(L) Read input from left to right
(L) Simulate a leftmost derivation
(1) Use 1 lookahead symbol
Concept of LL(1) Parsing
Example of LL(1) Parsing
Grammar:
E -> T X
X -> A T X | ε
A -> + | -
T -> F N
N -> M F N | ε
M -> *
F -> ( E ) | n
Input: ( n + ( n ) ) * n $
Leftmost derivation traced by the parser:
E
=> TX
=> FNX
=> (E)NX
=> (TX)NX
=> (FNX)NX
=> (nNX)NX
=> (nX)NX
=> (nATX)NX
=> (n+TX)NX
=> (n+FNX)NX
=> (n+(E)NX)NX
=> (n+(TX)NX)NX
=> (n+(FNX)NX)NX
=> (n+(nNX)NX)NX
=> (n+(nX)NX)NX
=> (n+(n)NX)NX
=> (n+(n)X)NX
=> (n+(n))NX
=> (n+(n))MFNX
=> (n+(n))*FNX
=> (n+(n))*nNX
=> (n+(n))*nX
=> (n+(n))*n
Finished
[Figure: the parse tree for the input, built top-down as the derivation proceeds]
LL(1) Parsing Algorithm
Push $ and then the start symbol onto the stack
REPEAT until Accept or Error
  SWITCH (top of stack, next token)
    CASE (terminal a, a):
      Pop stack; get next token
    CASE (nonterminal A, terminal a or $):
      IF the parsing table entry M[A, a] is not empty THEN
        Get A -> X1 X2 ... Xn from the parsing table entry M[A, a]
        Pop stack
        Push Xn ... X2 X1 onto the stack, in that order
      ELSE Error
    CASE ($, $): Accept
    OTHER: Error
(The loop must keep running when the next token is $ but nonterminals remain on the stack, because they may still have to be expanded to ε.)
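The table-driven loop can be sketched in a few lines of Python. This is an illustrative sketch, not the course's implementation: the rule numbers and table entries are copied from the expression-grammar example in these slides (with num written as n), while the names RULES, TABLE and ll1_parse are hypothetical.

```python
# Productions of the expression grammar; () stands for an ε right-hand side.
RULES = {
    1: ("exp", ("term", "exp'")),
    2: ("exp'", ("addop", "term", "exp'")),
    3: ("exp'", ()),
    4: ("addop", ("+",)),
    5: ("addop", ("-",)),
    6: ("term", ("factor", "term'")),
    7: ("term'", ("mulop", "factor", "term'")),
    8: ("term'", ()),
    9: ("mulop", ("*",)),
    10: ("factor", ("(", "exp", ")")),
    11: ("factor", ("n",)),
}

# M[A, a]: which rule to apply for nonterminal A and lookahead a.
TABLE = {
    "exp": {"(": 1, "n": 1},
    "exp'": {")": 3, "+": 2, "-": 2, "$": 3},
    "addop": {"+": 4, "-": 5},
    "term": {"(": 6, "n": 6},
    "term'": {")": 8, "+": 8, "-": 8, "*": 7, "$": 8},
    "mulop": {"*": 9},
    "factor": {"(": 10, "n": 11},
}

def ll1_parse(tokens, start="exp"):
    """Return the list of rule numbers applied, or None on a syntax error."""
    tokens = list(tokens) + ["$"]
    stack = ["$", start]
    pos = 0
    applied = []
    while True:
        top, tok = stack[-1], tokens[pos]
        if top == "$" and tok == "$":
            return applied                      # accept
        if top in TABLE:                        # nonterminal on top of stack
            rule = TABLE[top].get(tok)
            if rule is None:
                return None                     # empty table entry
            stack.pop()
            stack.extend(reversed(RULES[rule][1]))   # push Xn ... X1
            applied.append(rule)
        elif top == tok:                        # matching terminal
            stack.pop()
            pos += 1
        else:
            return None                         # terminal mismatch
```

The returned rule numbers, read in order, spell out the leftmost derivation of the input.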
LL(1) Parsing Table
If the nonterminal N is on the top of stack and the next token is t, which production rule to use?
Choose a rule N -> X such that
  X =>* tY, or
  X =>* ε and S =>* WNtY
[Figure: parse trees illustrating the two cases, with t derived from X, or X deriving ε and t following N]
First Set
Let X be ε or be in V or T.
First(X) is the set of the first terminals in any sentential form derived from X.
If X is a terminal or ε, then First(X) = {X}.
If X is a nonterminal and X -> X1 X2 ... Xn is a rule, then
  First(X1) - {ε} is a subset of First(X)
  First(Xi) - {ε} is a subset of First(X) if for all j<i First(Xj) contains ε
  ε is in First(X) if for all j≤n First(Xj) contains ε
Examples of First Set
exp -> exp addop term | term
addop -> + | -
term -> term mulop factor | factor
mulop -> *
factor -> (exp) | num
First(addop) = {+, -}
First(mulop) = {*}
First(factor) = {(, num}
First(term) = {(, num}
First(exp) = {(, num}

st -> ifst | other
ifst -> if ( exp ) st elsepart
elsepart -> else st | ε
exp -> 0 | 1
First(exp) = {0, 1}
First(elsepart) = {else, ε}
First(ifst) = {if}
First(st) = {if, other}
Algorithm for finding First(A)
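One common way to compute First sets is a fixed-point iteration over the three First-set rules. The sketch below is an assumption about how that might look in Python; the grammar encoding (a dict from nonterminal to a list of right-hand-side tuples, with the empty tuple for ε) and the function names are hypothetical. The grammar used is the expression grammar with exp’ and term’ from these slides.

```python
EPS = "ε"  # marker for the empty string

def first_of_string(symbols, first):
    """First set of a sequence of symbols, given the First sets found so far."""
    result = set()
    for x in symbols:
        fx = first[x] if x in first else {x}   # a terminal's First set is itself
        result |= fx - {EPS}
        if EPS not in fx:
            return result                      # x cannot vanish, so stop here
    result.add(EPS)                            # every symbol can derive ε
    return result

def first_sets(grammar):
    """Repeat the First-set rules until no set changes."""
    first = {a: set() for a in grammar}
    changed = True
    while changed:
        changed = False
        for a, alternatives in grammar.items():
            for rhs in alternatives:
                new = first_of_string(rhs, first)
                if not new <= first[a]:
                    first[a] |= new
                    changed = True
    return first

GRAMMAR = {
    "exp": [("term", "exp'")],
    "exp'": [("addop", "term", "exp'"), ()],
    "addop": [("+",), ("-",)],
    "term": [("factor", "term'")],
    "term'": [("mulop", "factor", "term'"), ()],
    "mulop": [("*",)],
    "factor": [("(", "exp", ")"), ("num",)],
}
FIRST = first_sets(GRAMMAR)
```

Because sets only grow and are bounded by the terminal alphabet plus ε, the loop always terminates, including for left-recursive grammars.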
Follow Set
Algorithm for Finding Follow(A)
Rules:
  If A is the start symbol, then $ is in Follow(A).
  If there is a rule A -> Y X Z, then First(Z) - {ε} is in Follow(X).
  If there is a production B -> X A Y and ε is in First(Y), then Follow(A) contains Follow(B).
Algorithm:
  Follow(S) = {$}
  FOR each A in V-{S}
    Follow(A) = {}
  WHILE change is made to some Follow sets
    FOR each production A -> X1 X2 ... Xn
      FOR each nonterminal Xi
        Add First(Xi+1 Xi+2...Xn) - {ε} into Follow(Xi).
        (NOTE: If i=n, Xi+1 Xi+2...Xn = ε)
        IF ε is in First(Xi+1 Xi+2...Xn) THEN
          Add Follow(A) to Follow(Xi)
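The Follow-set algorithm translates almost line by line into Python. The sketch below assumes the First sets are already known and passed in; the grammar encoding, the hard-coded FIRST table (taken from the expression-grammar example in these slides) and the function names are illustrative assumptions.

```python
EPS = "ε"

def first_of_string(symbols, first):
    """First set of a symbol sequence; symbols missing from `first` are terminals."""
    result = set()
    for x in symbols:
        fx = first[x] if x in first else {x}
        result |= fx - {EPS}
        if EPS not in fx:
            return result
    result.add(EPS)          # the whole sequence (possibly empty) can derive ε
    return result

def follow_sets(grammar, first, start):
    follow = {a: set() for a in grammar}
    follow[start].add("$")
    changed = True
    while changed:                       # WHILE change is made to some Follow set
        changed = False
        for a, alternatives in grammar.items():
            for rhs in alternatives:
                for i, x in enumerate(rhs):
                    if x not in grammar:
                        continue         # only nonterminals have Follow sets
                    rest = first_of_string(rhs[i + 1:], first)
                    new = rest - {EPS}
                    if EPS in rest:      # the tail can vanish: inherit Follow(A)
                        new |= follow[a]
                    if not new <= follow[x]:
                        follow[x] |= new
                        changed = True
    return follow

GRAMMAR = {
    "exp": [("term", "exp'")],
    "exp'": [("addop", "term", "exp'"), ()],
    "addop": [("+",), ("-",)],
    "term": [("factor", "term'")],
    "term'": [("mulop", "factor", "term'"), ()],
    "mulop": [("*",)],
    "factor": [("(", "exp", ")"), ("num",)],
}
FIRST = {
    "exp": {"(", "num"}, "exp'": {"+", "-", EPS}, "addop": {"+", "-"},
    "term": {"(", "num"}, "term'": {"*", EPS}, "mulop": {"*"},
    "factor": {"(", "num"},
}
FOLLOW = follow_sets(GRAMMAR, FIRST, "exp")
```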
Finding Follow Set: An Example
exp -> term exp’
exp’ -> addop term exp’ | ε
addop -> + | -
term -> factor term’
term’ -> mulop factor term’ | ε
mulop -> *
factor -> ( exp ) | num

          First      Follow
exp       ( num      $ )
exp’      + - ε      $ )
addop     + -        ( num
term      ( num      + - $ )
term’     * ε        + - $ )
mulop     *          ( num
factor    ( num      * + - $ )
Constructing LL(1) Parsing Tables
Example: Constructing LL(1) Parsing Table
          First        Follow
exp       {(, num}     {$, )}
exp’      {+, -, ε}    {$, )}
addop     {+, -}       {(, num}
term      {(, num}     {+, -, ), $}
term’     {*, ε}       {+, -, ), $}
mulop     {*}          {(, num}
factor    {(, num}     {*, +, -, ), $}

1  exp -> term exp’
2  exp’ -> addop term exp’
3  exp’ -> ε
4  addop -> +
5  addop -> -
6  term -> factor term’
7  term’ -> mulop factor term’
8  term’ -> ε
9  mulop -> *
10 factor -> ( exp )
11 factor -> num

          (    )    +    -    *    n    $
exp       1                        1
exp’           3    2    2              3
addop               4    5
term      6                        6
term’          8    8    8    7         8
mulop                         9
factor    10                       11
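The table above can be derived mechanically from the First and Follow sets: a rule A -> α goes into M[A, t] for every t in First(α) - {ε}, and, if ε is in First(α), also for every t in Follow(A). The sketch below illustrates this under stated assumptions: the First and Follow sets are hard-coded from the example, and the names build_table, RULES, FIRST and FOLLOW are hypothetical.

```python
EPS = "ε"

RULES = [
    (1, "exp", ("term", "exp'")),
    (2, "exp'", ("addop", "term", "exp'")),
    (3, "exp'", ()),
    (4, "addop", ("+",)),
    (5, "addop", ("-",)),
    (6, "term", ("factor", "term'")),
    (7, "term'", ("mulop", "factor", "term'")),
    (8, "term'", ()),
    (9, "mulop", ("*",)),
    (10, "factor", ("(", "exp", ")")),
    (11, "factor", ("num",)),
]
FIRST = {
    "exp": {"(", "num"}, "exp'": {"+", "-", EPS}, "addop": {"+", "-"},
    "term": {"(", "num"}, "term'": {"*", EPS}, "mulop": {"*"},
    "factor": {"(", "num"},
}
FOLLOW = {
    "exp": {"$", ")"}, "exp'": {"$", ")"}, "addop": {"(", "num"},
    "term": {"+", "-", ")", "$"}, "term'": {"+", "-", ")", "$"},
    "mulop": {"(", "num"}, "factor": {"*", "+", "-", ")", "$"},
}

def first_of_string(symbols):
    """First set of a right-hand side; symbols not in FIRST are terminals."""
    result = set()
    for x in symbols:
        fx = FIRST.get(x, {x})
        result |= fx - {EPS}
        if EPS not in fx:
            return result
    result.add(EPS)
    return result

def build_table(rules):
    """Return (table, conflicts); table maps (nonterminal, token) to a rule number."""
    table, conflicts = {}, []
    for number, lhs, rhs in rules:
        f = first_of_string(rhs)
        lookaheads = f - {EPS}
        if EPS in f:
            lookaheads |= FOLLOW[lhs]
        for t in lookaheads:
            if (lhs, t) in table:
                conflicts.append((lhs, t))   # two rules compete for one entry
            table[(lhs, t)] = number
    return table, conflicts

TABLE, CONFLICTS = build_table(RULES)
```

An empty conflict list means every table entry holds at most one rule for this grammar.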
LL(1) Grammar
LL(1) Parsing Table for non-LL(1) Grammar
Left Recursion
Removal of Immediate Left Recursion
Bad News!
Can only be removed when there is no empty-string production and no cycle in the grammar.
Good News!
Never seen in grammars of any programming languages
Left Factoring
A -> X Y | X Z
can be left-factored as
A -> X A’ and A’ -> Y | Z
Example of Left Factor
seq -> st ; seq | st
can be left-factored as
seq -> st seq’
seq’ -> ; seq | ε
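A single left-factoring step can be sketched as follows. This is a minimal illustration, assuming alternatives are grouped by their first symbol and each group is factored on its longest common prefix; the function name left_factor and the primed naming of new nonterminals are hypothetical conventions.

```python
def left_factor(lhs, alternatives):
    """One left-factoring step for a single nonterminal.

    alternatives: list of right-hand sides as tuples; () is an ε-production.
    Returns a dict mapping each nonterminal (old and new) to its alternatives.
    """
    groups = {}
    for rhs in alternatives:
        groups.setdefault(rhs[:1], []).append(rhs)   # group by first symbol

    result = {lhs: []}
    new_name = lhs + "'"
    for head, group in groups.items():
        if len(group) == 1 or not head:
            result[lhs].extend(group)                # nothing to factor
            continue
        prefix = group[0]                            # longest common prefix
        for rhs in group[1:]:
            n = 0
            while n < min(len(prefix), len(rhs)) and prefix[n] == rhs[n]:
                n += 1
            prefix = prefix[:n]
        result[lhs].append(prefix + (new_name,))
        result[new_name] = [rhs[len(prefix):] for rhs in group]
        new_name += "'"                              # fresh name for the next group
    return result

FACTORED = left_factor("seq", [("st", ";", "seq"), ("st",)])
```

In FACTORED, the empty tuple plays the role of the ε alternative of seq'.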
Bottom-up Parsing
Example of Shift-reduce Parsing
Grammar
S’ -> S
S -> (S)S | ε
Parsing actions
     Stack     Input    Action
1    $         (())$    shift
2    $(        ())$     shift
3    $((       ))$      reduce S -> ε
4    $((S      ))$      shift
5    $((S)     )$       reduce S -> ε
6    $((S)S    )$       reduce S -> (S)S
7    $(S       )$       shift
8    $(S)      $        reduce S -> ε
9    $(S)S     $        reduce S -> (S)S
10   $S        $        accept
Reverse of rightmost derivation, from left to right (stack followed by remaining input at each step)
1    (())
2    (())
3    (())
4    ((S))
5    ((S))
6    ((S)S)
7    (S)
8    (S)
9    (S)S
10   S’ => S
In the shift-reduce trace above, the stack contents at every step form a viable prefix.
At each reduce step, the position on top of the stack where the reduction is performed, together with the production used, is the handle.
Terminologies
Right sentential form
  sentential form in a rightmost derivation
  Examples: (S)S
            ((S)S)
Viable prefix
  sequence of symbols on the parsing stack
  Examples: ( S ) S, ( S ), ( S, (
            ( ( S ) S, ( ( S ), ( ( S, ( (, (
Handle
  right sentential form + position where reduction can be performed + production used for reduction
  Examples: ( S ) S . with S -> ε
            ( ( S ) S . ) with S -> ( S ) S
LR(0) item
  production with distinguished position in its RHS
  Examples: S -> ( S ) S.
            S -> (S).S
            S -> (S.)S
            S -> (.S)S
            S -> .(S)S
Shift-reduce parsers
LR(0) items
LR(0) item
production with a distinguished position in the RHS
Initial Item
Item with the distinguished position on the leftmost of
the production
Complete Item
Item with the distinguished position on the rightmost of
the production
Closure Item of x
Item x together with items which can be reached from x
via ε-transitions
Kernel Item
Original item, not including closure items
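The closure and kernel ideas above can be made concrete with a short sketch that builds the DFA of LR(0) item sets for the grammar S’ -> S, S -> (S)S | ε. Everything here is an illustrative assumption: items are encoded as (lhs, rhs, dot) triples, and the names closure, goto and build_dfa are hypothetical.

```python
# Augmented grammar S' -> S, S -> ( S ) S | ε; () encodes the ε right-hand side.
GRAMMAR = {"S'": [("S",)], "S": [("(", "S", ")", "S"), ()]}

def closure(items):
    """Add every initial item reachable through ε-transitions."""
    result = set(items)
    work = list(items)
    while work:
        _, rhs, dot = work.pop()
        if dot < len(rhs) and rhs[dot] in GRAMMAR:    # dot is before a nonterminal
            for alt in GRAMMAR[rhs[dot]]:
                item = (rhs[dot], alt, 0)             # initial item of that nonterminal
                if item not in result:
                    result.add(item)
                    work.append(item)
    return frozenset(result)

def goto(state, symbol):
    """Move the dot over `symbol` in every item that allows it, then close."""
    kernel = {(lhs, rhs, dot + 1) for lhs, rhs, dot in state
              if dot < len(rhs) and rhs[dot] == symbol}
    return closure(kernel)

def build_dfa():
    start = closure({("S'", ("S",), 0)})
    states, edges, work = [start], {}, [start]
    while work:
        state = work.pop()
        symbols = sorted({rhs[dot] for _, rhs, dot in state if dot < len(rhs)})
        for symbol in symbols:
            target = goto(state, symbol)
            if target not in states:
                states.append(target)
                work.append(target)
            edges[(states.index(state), symbol)] = states.index(target)
    return states, edges

STATES, EDGES = build_dfa()
```

The state numbers produced by this sketch depend on the order in which states are discovered, so they need not match the numbering used on the slides.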
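Finite automata of items
Grammar:
S’ -> S
S -> (S)S
S -> ε
Items:
S’ -> .S
S’ -> S.
S -> .(S)S
S -> (.S)S
S -> (S.)S
S -> (S).S
S -> (S)S.
S -> .
[Figure: NFA whose states are the items above. A transition on S, ( or ) moves the dot over that symbol; an ε-transition leads from an item with the dot before S to each initial item of S.]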
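DFA of LR(0) Items
[Figure: the NFA of items and the DFA obtained from it. The DFA states are the following sets of items.]
State 0: S’ -> .S, S -> .(S)S, S -> .
State 1: S’ -> S.
State 2: S -> (.S)S, S -> .(S)S, S -> .
State 3: S -> (S.)S
State 4: S -> (S).S, S -> .(S)S, S -> .
State 5: S -> (S)S.
Transitions: 0 -S-> 1, 0 -(-> 2, 2 -(-> 2, 2 -S-> 3, 3 -)-> 4, 4 -(-> 2, 4 -S-> 5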
LR(0) parsing algorithm
Item in state: A -> x.By where B is a terminal; token: B
  Action: shift B and push state s containing A -> xB.y
Item in state: A -> x.By where B is a terminal; token: not B
  Action: error
Item in state: A -> x. ; token: -
  Action: reduce with A -> x (i.e. pop x, backup to the state s on top of stack) and push A with new state d(s,A)
Item in state: S’ -> S. ; token: none
  Action: accept
Item in state: S’ -> S. ; token: any
  Action: error
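LR(0) Parsing Table
DFA of LR(0) items for A’ -> A, A -> (A) | a
State 0: A’ -> .A, A -> .(A), A -> .a
State 1: A’ -> A.
State 2: A -> a.
State 3: A -> (.A), A -> .(A), A -> .a
State 4: A -> (A.)
State 5: A -> (A).
Transitions: 0 -A-> 1, 0 -a-> 2, 0 -(-> 3, 3 -(-> 3, 3 -a-> 2, 3 -A-> 4, 4 -)-> 5

State  Action  Rule        (   a   )   A
0      shift               3   2       1
1      reduce  A’ -> A
2      reduce  A -> a
3      shift               3   2       4
4      shift                       5
5      reduce  A -> (A)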
Example of LR(0) Parsing
State  Action  Rule        (   a   )   A
0      shift               3   2       1
1      reduce  A’ -> A
2      reduce  A -> a
3      shift               3   2       4
4      shift                       5
5      reduce  A -> (A)
Stack Input Action
$0 ((a))$ shift
$0(3 (a))$ shift
$0(3(3 a))$ shift
$0(3(3a2 ))$ reduce
$0(3(3A4 ))$ shift
$0(3(3A4)5 )$ reduce
$0(3A4 )$ shift
$0(3A4)5 $ reduce
$0A1 $ accept
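The trace above can be reproduced with a small driver. The sketch below hard-codes the LR(0) table from this example; the stack holds states only, and the names ACTION, GOTO and lr0_parse are hypothetical. In an LR(0) parser the action depends on the state alone, which is what the ACTION dictionary reflects.

```python
# Action per state: "shift", or ("reduce", lhs, length of the right-hand side).
ACTION = {
    0: "shift",
    1: ("reduce", "A'", 1),    # A' -> A   (reducing by this rule means accept)
    2: ("reduce", "A", 1),     # A  -> a
    3: "shift",
    4: "shift",
    5: ("reduce", "A", 3),     # A  -> (A)
}
# Transitions on terminals and on the nonterminal A.
GOTO = {
    0: {"(": 3, "a": 2, "A": 1},
    3: {"(": 3, "a": 2, "A": 4},
    4: {")": 5},
}

def lr0_parse(tokens):
    """Return True if the token sequence is accepted, False otherwise."""
    tokens = list(tokens) + ["$"]
    stack = [0]                          # stack of states; symbols are implicit
    pos = 0
    while True:
        state = stack[-1]
        action = ACTION[state]
        if action == "shift":
            target = GOTO[state].get(tokens[pos])
            if target is None:
                return False             # no transition on this token: error
            stack.append(target)
            pos += 1
        else:
            _, lhs, length = action
            if lhs == "A'":
                return tokens[pos] == "$"    # accept only at end of input
            del stack[-length:]              # pop the right-hand side
            stack.append(GOTO[stack[-1]][lhs])   # push the state d(s, A)
```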
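Non-LR(0) Grammar
Conflict
Shift-reduce conflict
A state contains a complete item A -> x. and a shift item A -> x.By
[Figure: part of the DFA of LR(0) items for S -> (S)S | ε. State 0 = { S’ -> .S, S -> .(S)S, S -> . } has a transition on S to state 1 = { S’ -> S. } and a transition on ( ; state 3 = { S -> (S.)S }. State 0 contains both the shift item S -> .(S)S and the complete item S -> . ]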
SLR(1) parsing
SLR(1) parsing algorithm
SLR(1) grammar
Conflict
Shift-reduce conflict
A state contains a shift item A -> x.Wy such that W is a terminal and a complete item B -> z. such that W is in Follow(B).
Reduce-reduce conflict
A state contains more than one complete item whose Follow sets share a common terminal.
A grammar is an SLR(1) grammar if no state of its DFA of LR(0) items has either kind of conflict.
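SLR(1) Parsing Table
Grammar: A’ -> A, A -> (A) | a
Rules: 1: A -> (A)   2: A -> a
DFA of LR(0) items
State 0: A’ -> .A, A -> .(A), A -> .a
State 1: A’ -> A.
State 2: A -> a.
State 3: A -> (.A), A -> .(A), A -> .a
State 4: A -> (A.)
State 5: A -> (A).
Transitions: 0 -A-> 1, 0 -a-> 2, 0 -(-> 3, 3 -(-> 3, 3 -a-> 2, 3 -A-> 4, 4 -)-> 5

State  (    a    )    $    A
0      S3   S2             1
1                     AC
2                R2   R2
3      S3   S2             4
4                S5
5                R1   R1
(Reduce entries appear under every token in Follow(A) = { ), $ }.)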
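SLR(1) Grammar not LR(0)
Grammar: S -> (S)S | ε
Rules: 1: S -> (S)S   2: S -> ε
DFA of LR(0) items
State 0: S’ -> .S, S -> .(S)S, S -> .
State 1: S’ -> S.
State 2: S -> (.S)S, S -> .(S)S, S -> .
State 3: S -> (S.)S
State 4: S -> (S).S, S -> .(S)S, S -> .
State 5: S -> (S)S.
Transitions: 0 -S-> 1, 0 -(-> 2, 2 -(-> 2, 2 -S-> 3, 3 -)-> 4, 4 -(-> 2, 4 -S-> 5

State  (    )    $    S
0      S2   R2   R2   1
1                AC
2      S2   R2   R2   3
3           S4
4      S2   R2   R2   5
5           R1   R1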
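An SLR(1) driver differs from the LR(0) one in that the action is looked up by state and next token. The sketch below hard-codes the table for S -> (S)S | ε from this example; the encoding of actions as tuples and the names ACTION, GOTO, RULES and slr1_parse are illustrative assumptions.

```python
# Rule number -> (left-hand side, length of right-hand side).
RULES = {1: ("S", 4), 2: ("S", 0)}       # 1: S -> (S)S    2: S -> ε

SHIFT_OR_R2 = {"(": ("s", 2), ")": ("r", 2), "$": ("r", 2)}
ACTION = {
    0: SHIFT_OR_R2,
    1: {"$": ("acc",)},
    2: SHIFT_OR_R2,
    3: {")": ("s", 4)},
    4: SHIFT_OR_R2,
    5: {")": ("r", 1), "$": ("r", 1)},
}
GOTO = {0: {"S": 1}, 2: {"S": 3}, 4: {"S": 5}}

def slr1_parse(tokens):
    """Return True if the token sequence is accepted, False otherwise."""
    tokens = list(tokens) + ["$"]
    stack = [0]                           # stack of states
    pos = 0
    while True:
        action = ACTION[stack[-1]].get(tokens[pos])
        if action is None:
            return False                  # empty table entry: error
        if action[0] == "acc":
            return True
        if action[0] == "s":
            stack.append(action[1])       # shift and consume the token
            pos += 1
        else:
            lhs, length = RULES[action[1]]
            if length:                    # an ε-rule pops nothing
                del stack[-length:]
            stack.append(GOTO[stack[-1]][lhs])
```

The guard on length matters in Python, because `del stack[-0:]` would clear the whole stack rather than pop nothing.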
Disambiguating Rules for Parsing Conflict
Shift-reduce conflict
Prefer shift over reduce
In case of nested if statements, preferring shift over
reduce implies most closely nested rule for dangling
else
Reduce-reduce conflict
Error in design
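Dangling Else
Grammar: S -> I | other, I -> if S | if S else S
Rules: 1: S -> I   2: S -> other   3: I -> if S   4: I -> if S else S
DFA of LR(0) items
State 0: S’ -> .S, S -> .I, S -> .other, I -> .if S, I -> .if S else S
State 1: S’ -> S.
State 2: S -> I.
State 3: S -> other.
State 4: I -> if .S, I -> if .S else S, S -> .I, S -> .other, I -> .if S, I -> .if S else S
State 5: I -> if S., I -> if S. else S
State 6: I -> if S else .S, S -> .I, S -> .other, I -> .if S, I -> .if S else S
State 7: I -> if S else S.
Transitions: 0 -S-> 1, 0 -I-> 2, 0 -other-> 3, 0 -if-> 4, 4 -S-> 5, 4 -I-> 2, 4 -other-> 3, 4 -if-> 4, 5 -else-> 6, 6 -S-> 7, 6 -I-> 2, 6 -other-> 3, 6 -if-> 4

State  if   else  other  $     S   I
0      S4         S3           1   2
1                        ACC
2           R1           R1
3           R2           R2
4      S4         S3           5   2
5           S6           R3
6      S4         S3           7   2
7           R4           R4
In state 5, else is in Follow(I), so there is a shift-reduce conflict on else; the table keeps the shift (S6), which attaches each else to the most closely nested if.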