Grammar
AUTOMATA THEORY AND COMPILER DESIGN (CC-3203)
P is the set of production rules for terminals and non-terminals. A production rule has the
form α → β, where α and β are strings over VN ∪ ∑ and at least one symbol of α
belongs to VN.
Example
Grammar G1 −
({S, A, B}, {a, b}, S, {S → AB, A → a, B → b})
Here,
Non-terminals, VN : {S, A, B}
Terminals, ∑ : {a, b}
Start symbol : S
Productions, P : S → AB, A → a, B → b
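As a minimal sketch, the 4-tuple for G1 can be written down as plain Python data and each production checked against the rule that α must contain at least one non-terminal. The names `Grammar` and `is_valid_production` are illustrative, not part of the course material, and every symbol is assumed to be a single character.

```python
from typing import NamedTuple


class Grammar(NamedTuple):
    nonterminals: frozenset
    terminals: frozenset
    start: str
    productions: tuple  # pairs (alpha, beta) of symbol strings


def is_valid_production(g, alpha, beta):
    """True if alpha and beta are strings over VN ∪ ∑ and alpha has a non-terminal."""
    symbols = g.nonterminals | g.terminals
    over_alphabet = all(s in symbols for s in alpha + beta)
    return over_alphabet and any(s in g.nonterminals for s in alpha)


# Grammar G1 from the example above.
G1 = Grammar(
    nonterminals=frozenset("SAB"),
    terminals=frozenset("ab"),
    start="S",
    productions=(("S", "AB"), ("A", "a"), ("B", "b")),
)

all_valid = all(is_valid_production(G1, a, b) for a, b in G1.productions)
print(all_valid)  # True
```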
Derivation
A derivation is a sequence of production-rule applications that produces the input string
from the start symbol. During parsing, we have to make two decisions at each step.
These are as follows:
We have to decide which non-terminal is to be replaced.
We have to decide which production rule is used to replace that non-terminal.
Leftmost Derivation:
In a leftmost derivation, the leftmost non-terminal of the current string is replaced at
every step, so the string is expanded from left to right.
Example:
Production rules:
E → E+E
E → E-E
E → a | b
Input
a-b+a
The leftmost derivation is:
E ⇒ E+E
⇒ E-E+E
⇒ a-E+E
⇒ a-b+E
⇒ a-b+a
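The derivation above can be replayed mechanically. The sketch below is illustrative only: `leftmost_derivation` is a hypothetical helper that takes the chosen productions in order and always rewrites the leftmost non-terminal, assuming single-character symbols.

```python
def leftmost_derivation(start, choices, nonterminals):
    """Apply each (head, body) choice to the leftmost non-terminal in turn."""
    form = start
    forms = [form]
    for head, body in choices:
        # Position of the leftmost non-terminal, or None if the string is finished.
        idx = next((i for i, s in enumerate(form) if s in nonterminals), None)
        if idx is None or form[idx] != head:
            raise ValueError(f"cannot apply {head} -> {body} to {form}")
        form = form[:idx] + body + form[idx + 1:]
        forms.append(form)
    return forms


# Productions chosen in the example: E → E+E, E → E-E, E → a, E → b, E → a.
choices = [("E", "E+E"), ("E", "E-E"), ("E", "a"), ("E", "b"), ("E", "a")]
steps = leftmost_derivation("E", choices, {"E"})
print(" ⇒ ".join(steps))  # E ⇒ E+E ⇒ E-E+E ⇒ a-E+E ⇒ a-b+E ⇒ a-b+a
```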
Rightmost Derivation:
In a rightmost derivation, the rightmost non-terminal of the current string is replaced at
every step, so the string is expanded from right to left.
Example
Production rules:
E → E+E
E → E-E
E → a | b
Input
a-b+a
The rightmost derivation is:
E ⇒ E-E
⇒ E-E+E
⇒ E-E+a
⇒ E-b+a
⇒ a-b+a
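The same input can be derived by always rewriting the rightmost non-terminal. This is a minimal sketch under the same assumptions as before (single-character symbols); `rightmost_derivation` is a hypothetical name.

```python
def rightmost_derivation(start, choices, nonterminals):
    """Apply each (head, body) choice to the rightmost non-terminal in turn."""
    form = start
    forms = [form]
    for head, body in choices:
        positions = [i for i, s in enumerate(form) if s in nonterminals]
        if not positions or form[positions[-1]] != head:
            raise ValueError(f"cannot apply {head} -> {body} to {form}")
        idx = positions[-1]  # rightmost non-terminal
        form = form[:idx] + body + form[idx + 1:]
        forms.append(form)
    return forms


# Productions chosen in the example: E → E-E, E → E+E, E → a, E → b, E → a.
choices = [("E", "E-E"), ("E", "E+E"), ("E", "a"), ("E", "b"), ("E", "a")]
steps = rightmost_derivation("E", choices, {"E"})
print(" ⇒ ".join(steps))  # E ⇒ E-E ⇒ E-E+E ⇒ E-E+a ⇒ E-b+a ⇒ a-b+a
```

Both derivation orders end in the same string; only the order in which the non-terminals are expanded differs.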
Derivation Tree
A derivation tree is a graphical representation of the derivation of a string
using the production rules of a given CFG.
It is a simple way to show how a string can be derived from a given set of
production rules.
The derivation tree is also called a parse tree.
Example-1
S → bSb | a | b
Input string= bbabb
Example-2:
S → xB | yA
A → xS | yAA | x
B → yS | xBB | y
Consider a string W = xxxyyxyyyx
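For Example-1, the derivation tree of bbabb can be built with a few lines of code. This is a sketch written specifically for S → bSb | a | b, not a general parser. The nested-tuple tree layout and the names `parse` and `leaves` are assumptions made for illustration.

```python
def parse(s):
    """Return the derivation tree of s for S → bSb | a | b, or None if s is not derivable."""
    if s in ("a", "b"):
        return ("S", s)                      # S → a or S → b
    if len(s) >= 3 and s[0] == "b" and s[-1] == "b":
        inner = parse(s[1:-1])               # S → bSb
        if inner is not None:
            return ("S", "b", inner, "b")
    return None


def leaves(tree):
    """Read the leaves of a tree from left to right (the yield of the tree)."""
    return "".join(leaves(c) if isinstance(c, tuple) else c for c in tree[1:])


tree = parse("bbabb")
print(tree)          # ('S', 'b', ('S', 'b', ('S', 'a'), 'b'), 'b')
print(leaves(tree))  # bbabb
```

Each tuple is one interior node labelled S, and reading the leaves from left to right gives back the input string.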
Ambiguity in Grammar
A grammar is said to be ambiguous if there exists more than one
leftmost derivation or more than one rightmost derivation or more than
one parse tree for the given input string.
If the grammar is not ambiguous, then it is called unambiguous.
Example 1:
Let us consider a grammar G with the production rules
1. E → I
2. E → E + E
3. E → E * E
4. E → (E)
5. I → ε | 0 | 1 | 2 | ... | 9
For the string "3 * 2 + 5", the above grammar can generate two different
parse trees, each with its own leftmost derivation, so the grammar is ambiguous.
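The number of parse trees can be counted directly. The sketch below assumes the input is written without spaces, that every token is a single character, and that the ε alternative of I is ignored (it does not contribute to this string). `count_parse_trees` is a hypothetical helper, not a standard routine.

```python
from functools import lru_cache


def count_parse_trees(text):
    """Count parse trees of text under E → E+E | E*E | (E) | digit."""
    tokens = tuple(text)

    @lru_cache(maxsize=None)
    def count(i, j):
        # Number of ways E can derive tokens[i:j].
        if j - i == 1:
            return 1 if tokens[i].isdigit() else 0   # E → I → digit
        total = 0
        if j - i >= 3 and tokens[i] == "(" and tokens[j - 1] == ")":
            total += count(i + 1, j - 1)             # E → (E)
        for k in range(i + 1, j - 1):
            if tokens[k] in ("+", "*"):              # E → E+E or E → E*E, split at k
                total += count(i, k) * count(k + 1, j)
        return total

    return count(0, len(tokens))


print(count_parse_trees("3*2+5"))  # 2
```

A count greater than 1 for any string is enough to show that the grammar is ambiguous.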
Type - 3 Grammar
Type-3 grammars generate regular languages. Type-3 grammars must
have a single non-terminal on the left-hand side and a right-hand side
consisting of a single terminal or single terminal followed by a single
non-terminal.
The productions must be in the form X → a or X → aY
where X, Y ∈ N (Non terminal)
and a ∈ T (Terminal)
The rule S → ε is allowed if S does not appear on the right side of any
rule.
Example
X → ε
X → a | aY
Y → b
Type - 2 Grammar
Type-2 grammars generate context-free languages.
The productions must be in the form A → γ
where A ∈ N (Non-terminal)
and γ ∈ (T ∪ N)* (String of terminals and non-terminals).
The languages generated by these grammars are recognized by a non-deterministic pushdown
automaton.
Example
S→Xa
X→a
X → aX
X → abc
X→ε
Type - 1 Grammar
Type-1 grammars generate context-sensitive languages. The
productions must be in the form
αAβ → α γ β
where A ∈ N (Non-terminal)
and α, β, γ ∈ (T ∪ N)* (Strings of terminals and non-terminals)
The strings α and β may be empty, but γ must be non-empty.
The rule S → ε is allowed if S does not appear on the right side of any rule. The
languages generated by these grammars are recognized by a linear bounded
automaton.
Example
AB → AbBc
A → bcA
B → b
Type - 0 Grammar
Type-0 grammars generate recursively enumerable languages. The
productions have no restrictions. They include every phrase structure grammar,
that is, all formal grammars.
The languages they generate are recognized by a Turing machine.
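The four production forms can be turned into a small classifier. The sketch below is illustrative: `classify` is a hypothetical function, symbols are assumed to be single characters, productions are (left, right) string pairs with "" standing for ε, and the result is the most restrictive type whose form every production satisfies.

```python
def classify(productions, nonterminals, start):
    """Return 3, 2, 1 or 0: the most restrictive type matched by all productions."""

    def start_epsilon(lhs, rhs):
        # S → ε is allowed only if S never appears on a right-hand side.
        return lhs == start and rhs == "" and not any(start in r for _, r in productions)

    def type3(lhs, rhs):
        if lhs not in nonterminals:
            return False
        if len(rhs) == 1:
            return rhs not in nonterminals                                # X → a
        if len(rhs) == 2:
            return rhs[0] not in nonterminals and rhs[1] in nonterminals  # X → aY
        return start_epsilon(lhs, rhs)

    def type2(lhs, rhs):
        return lhs in nonterminals                                        # A → γ

    def type1(lhs, rhs):
        if start_epsilon(lhs, rhs):
            return True
        for i, sym in enumerate(lhs):                                     # αAβ → αγβ
            if sym in nonterminals:
                alpha, beta = lhs[:i], lhs[i + 1:]
                if len(rhs) >= len(lhs) and rhs.startswith(alpha) and rhs.endswith(beta):
                    return True
        return False

    for level, matches in ((3, type3), (2, type2), (1, type1)):
        if all(matches(lhs, rhs) for lhs, rhs in productions):
            return level
    return 0


# The three example grammars from the slides above.
regular = [("X", ""), ("X", "a"), ("X", "aY"), ("Y", "b")]
context_free = [("S", "Xa"), ("X", "a"), ("X", "aX"), ("X", "abc"), ("X", "")]
context_sensitive = [("AB", "AbBc"), ("A", "bcA"), ("B", "b")]

print(classify(regular, {"X", "Y"}, "X"))            # 3
print(classify(context_free, {"S", "X"}, "S"))       # 2
print(classify(context_sensitive, {"A", "B"}, "A"))  # 1
```

The start symbol passed for the Type-1 example is an assumption, since the slide does not name one.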
Simplification of CFG
Simplification of a grammar means reducing the grammar by
removing useless symbols.
As we have seen, various languages can be represented efficiently by a
context-free grammar.
Not every grammar is optimized: a grammar may
contain some extra symbols (non-terminals).
Extra symbols unnecessarily increase the length of the grammar.
The properties of reduced grammar are given below:
1. Each variable (i.e. non-terminal) and each terminal of G appears in the
derivation of some word in L.
Elimination of ε Production
Productions of the type S → ε are called ε productions. These
productions can only be removed completely from grammars that do not
generate ε.
Step 1: First find all nullable non-terminals, that is, those which derive ε.
Step 2: For each production A → a, construct all productions A → x, where x is
obtained from a by removing one or more of the nullable non-terminals found in step 1.
Step 3: Now combine the result of step 2 with the original productions and
remove the ε productions.
Example:
Remove the ε productions from the following CFG while preserving its
meaning.
1. S → XYX
2. X → 0X | ε
3. Y → 1Y | ε
Solution:
While removing the ε productions, we delete the rules X → ε and Y → ε.
To preserve the meaning of the CFG, we substitute ε for X and Y wherever
they appear on a right-hand side.
Let us take
1. S → XYX
If the first X on the right-hand side is replaced by ε, then
2. S → YX
Replacing the other occurrences of X and Y by ε in the same way gives the
remaining alternatives for S.
Now let us consider
X → 0X
If we place ε at the right-hand side for X, then
X → 0
so X → 0X | 0
Similarly, Y → 1Y | 1
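Collectively, we can rewrite the CFG with the ε productions removed
as:
S → XYX | XY | YX | XX | X | Y | ε
X → 0X | 0
Y → 1Y | 1
The original production S → XYX is retained (step 3). S → ε is kept only
because the original grammar generates the empty string.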
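The three elimination steps can be sketched as follows. This is one possible implementation, not the only one: the grammar is assumed to be a dict from each non-terminal to a list of bodies (tuples of symbols, with the empty tuple standing for ε), and `nullable_symbols` and `eliminate_epsilon` are illustrative names. The sketch drops every ε production, so a start symbol that is itself nullable would need S → ε added back to keep the empty string in the language.

```python
from itertools import product


def nullable_symbols(grammar):
    """Step 1: collect every non-terminal that can derive ε."""
    nullable = set()
    changed = True
    while changed:
        changed = False
        for head, bodies in grammar.items():
            if head in nullable:
                continue
            # An empty body, or a body made only of nullable symbols, derives ε.
            if any(all(sym in nullable for sym in body) for body in bodies):
                nullable.add(head)
                changed = True
    return nullable


def eliminate_epsilon(grammar):
    """Steps 2 and 3: add the shortened bodies, keep the originals, drop ε bodies."""
    nullable = nullable_symbols(grammar)
    result = {}
    for head, bodies in grammar.items():
        new_bodies = set()
        for body in bodies:
            # Each nullable symbol is either kept or dropped.
            options = [((sym,), ()) if sym in nullable else ((sym,),) for sym in body]
            for choice in product(*options):
                new_body = tuple(sym for part in choice for sym in part)
                if new_body:  # skip ε itself
                    new_bodies.add(new_body)
        result[head] = new_bodies
    return result


grammar = {
    "S": [("X", "Y", "X")],
    "X": [("0", "X"), ()],
    "Y": [("1", "Y"), ()],
}
simplified = eliminate_epsilon(grammar)
for head, bodies in simplified.items():
    print(head, "→", " | ".join(sorted("".join(b) for b in bodies)))
```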
Step 3: Repeat step 1 and step 2 until all unit productions are removed.
GNF
Example:
1. S → XB | AA
2. A → a | SA
3. B→b
4. X→a
Solution:
The given grammar G is already in CNF and has no left recursion, so we can skip
step 1 and step 2 and go directly to step 3.
The production rule A → SA is not in GNF, so we substitute S → XB | AA in the production
rule A → SA as:
1. S → XB | AA
2. A → a | XBA | AAA
3. B→b
4. X→a
The production rules S → XB and A → XBA are not in GNF, so we substitute X → a in the
production rules S → XB and A → XBA as:
1. S → aB | AA
2. A → a | aBA | AAA
3. B→b
4. X→a
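The production rule A → AAA is left-recursive, so we remove the left recursion by introducing a new
non-terminal C:
1. S → aB | AA
2. A → aC | aBAC | a | aBA
3. C → AAC | AA
4. B→b
5. X→a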
The production rule S → AA is not in GNF, so we substitute A → aC | aBAC | a | aBA in production rule
S → AA as:
1. S → aB | aCA | aBACA | aA | aBAA
2. A → aC | aBAC | a | aBA
3. C → AAC
4. C → aCA | aBACA | aA | aBAA
5. B→b
6. X→a
The production rule C → AAC is not in GNF, so we substitute A → aC | aBAC | a | aBA in production
rule C → AAC as:
1. S → aB | aCA | aBACA | aA | aBAA
2. A → aC | aBAC | a | aBA
3. C → aCAC | aBACAC | aAC | aBAAC
4. C → aCA | aBACA | aA | aBAA
5. B→b
6. X→a
Hence, this is the GNF form for the grammar G.
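The result can be checked against the GNF shape, in which every body is one terminal followed by zero or more non-terminals. The sketch below is a minimal check, assuming single-character symbols and a dict from each non-terminal to its list of bodies; `is_gnf` is a hypothetical helper name.

```python
def is_gnf(grammar, nonterminals):
    """True if every body is a terminal followed only by non-terminals."""
    return all(
        len(body) >= 1
        and body[0] not in nonterminals
        and all(sym in nonterminals for sym in body[1:])
        for bodies in grammar.values()
        for body in bodies
    )


nonterminals = set("SABCX")

# Grammar at the start of the example.
original = {"S": ["XB", "AA"], "A": ["a", "SA"], "B": ["b"], "X": ["a"]}

# Grammar at the end of the example.
final = {
    "S": ["aB", "aCA", "aBACA", "aA", "aBAA"],
    "A": ["aC", "aBAC", "a", "aBA"],
    "C": ["aCAC", "aBACAC", "aAC", "aBAAC", "aCA", "aBACA", "aA", "aBAA"],
    "B": ["b"],
    "X": ["a"],
}

print(is_gnf(original, nonterminals))  # False
print(is_gnf(final, nonterminals))     # True
```

The check only confirms the shape of the productions. It does not confirm that the two grammars generate the same language.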