CT-4212 Theory of Automata and Computations


DEFENCE ENGINEERING UNIVERSITY

College of Engineering
Bishoftu, Ethiopia

Department
of
Computer and Information Technology

LAB MANUAL
for
Theory of Automata and
Computation

TABLE OF CONTENTS

Introduction

Lab 1: Deterministic Finite Automata

Lab 2: Non Deterministic Finite Automata

Department of Computer and Information Technology



INTRODUCTION

1.1 About this Lab Manual

This lab manual is written to accompany the Theory of Automata and
Computation course (CT-4212) for the Department of Computer and Information
Technology of Defence Engineering University. Lab sessions are 2 hours a week.
The lab topics complement the material covered during lecture sessions, in
accordance with the course syllabus.
This lab manual is divided into a number of lab topics. Each lab contains a
topic background followed by activities. Activities fall into three categories:
• Pre Lab Activity: an introductory exercise or sample demo program.
• Lab Activity: the activities evaluated by the instructors. A lab topic includes
one or more lab activities. Not all of these activities are expected to be given;
the instructors choose which ones to assign.
• Post Lab Activity: assignments that may be given outside the lab sessions for
additional exercise.

1.2 Assessment Criteria

Lab activity 15%

Post Activity 5%
(if not given 20% is given to lab activity)

1.3 Expectations

• Use of Internet during class is not allowed.
• Required post activity hand – in should be submitted on time.
• Students are not allowed to collaborate during lab activity.


LAB 1: Deterministic Finite Automata

Learning Outcomes:

After performing this lab, the student should be able to:

• Construct deterministic finite automata given a regular language.
• Analyze the strings accepted by the finite automata.

Background

Deterministic Finite Automata

A Deterministic Finite Automaton (DFA) is a 5-tuple M = (Q, Σ, δ, q0, F) where

Q = Finite set of “internal states”
Σ = Finite set of symbols called the “input alphabet”
δ: Q × Σ → Q = Transition function
q0 ∈ Q = Initial state
F ⊆ Q = Set of final states

The input mechanism can move only from left to right and reads exactly one symbol on each
step. The transitions from one internal state to another are governed by the transition
function δ.

Language of DFA

• Automata of all kinds define languages.
• If M is an automaton, L(M) is its language.
• Informal: For a DFA M, L(M) is the set of strings labeling paths from the start state to
a final state.
• Formal: L(M) = the set of strings w such that δ(q0, w) is in F, where δ is extended from
single symbols to strings: δ(q0, w) is the state reached from q0 after reading all of w.

Example:

Construct the DFA M for the language L over the alphabet Σ = {0, 1}.

L = the set of strings with ‘001’ as a substring.

M = ({q0, q1, q2, q3}, {0, 1}, δ, q0, {q3})
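The transition graph of M is not reproduced in the text. The following is a minimal Python sketch of one possible δ for this language, together with a simulator that reads one symbol per step. The transition table and the meaning attached to each state (q0 = no progress, q1 = "0" seen, q2 = "00" seen, q3 = "001" seen) are assumptions made for illustration, not necessarily the graph intended by the manual.

```python
# Assumed transition function for M (one entry per state and input symbol).
DELTA = {
    ("q0", "0"): "q1", ("q0", "1"): "q0",
    ("q1", "0"): "q2", ("q1", "1"): "q0",
    ("q2", "0"): "q2", ("q2", "1"): "q3",
    ("q3", "0"): "q3", ("q3", "1"): "q3",   # q3 is an accepting trap state
}
START = "q0"
FINAL = {"q3"}


def accepts(w: str) -> bool:
    """Run the DFA on w from left to right and report whether it ends in F."""
    state = START
    for symbol in w:
        state = DELTA[(state, symbol)]
    return state in FINAL


for w in ["001", "110010", "0101", ""]:
    print(repr(w), accepts(w))
```

Because δ is a total function, the simulator never has to choose between moves: each input string traces exactly one path through the graph.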

Pre Lab Activity

Construct a DFA for each of the following languages over the alphabet Σ = {0, 1}.
a. The set of strings ending in ‘00’.
b. The set of strings with three consecutive 0’s.


Lab Activity

Problem:
Given the following regular languages:

a. L(M1) = { w ∈ {0, 1}* | w ends in ‘1101’}


b. L(M2) = { w ∈ {0, 1}* | every ‘1’ has a ‘0’ immediately to its right }

Requirement:
Construct the deterministic finite automata using JFLAP 7.0 and submit the files using the
filenames:

a. DFA_M1.jff
b. DFA_M2.jff

Hand – in:
Submit both DFAs on paper with their 5–tuple notation, transition graph and transition
table.

Post Lab Activity

Construct the DFA M for the language L:

L(M) = { w ∈ {0, 1}* | w contains the substring ‘0101’ }

Submit the transition graph of the DFA with its 5-tuple notation and transition table.


LAB 2: Non Deterministic Finite Automata

Learning Outcomes:

After performing this lab, the student should be able to:

• Construct non deterministic finite automata given a regular language.
• Analyze the strings accepted by the finite automata.
• Construct the equivalent deterministic finite automata.

Background

Non Deterministic Finite Automata

A Non Deterministic Finite Automaton (NFA) is a 5-tuple M = (Q, Σ, δ, q0, F) where

Q = Finite set of “internal states”
Σ = Finite set of symbols called the “input alphabet”
δ: Q × Σ → 2^Q = Transition function
q0 ∈ Q = Initial state
F ⊆ Q = Set of final states

The transition function δ takes a state in Q and an input symbol in Σ as arguments and
returns a subset of Q. Notice that the only difference between an NFA and a DFA is in the type
of value that δ returns: a set of states in the case of an NFA and a single state in the case
of a DFA.

Language of NFA

• Automata of all kinds define languages.
• If M is a non deterministic finite automaton, L(M) is its language.

Example:

Construct the NFA M for the language L:

L(M) = { w ∈ {0, 1}* | w has a ‘1’ in the third position from the end }

M = ({q0, q1, q2, q3}, {0,1}, δ, q0, {q3})
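The transition graph of this NFA is not reproduced in the text. As a rough sketch, the Python code below assumes the usual "guess and verify" design: q0 loops on every symbol and, on a 1, may also move to q1, after which exactly two more symbols lead to q3. The table is an assumption for illustration. The simulator tracks the set of all states the NFA could be in.

```python
# Assumed transition function; missing entries stand for the empty set.
DELTA = {
    ("q0", "0"): {"q0"},
    ("q0", "1"): {"q0", "q1"},   # nondeterministic guess: this 1 is third from the end
    ("q1", "0"): {"q2"}, ("q1", "1"): {"q2"},
    ("q2", "0"): {"q3"}, ("q2", "1"): {"q3"},
}
START = "q0"
FINAL = {"q3"}


def accepts(w: str) -> bool:
    """Accept if at least one sequence of choices ends in a final state."""
    current = {START}
    for symbol in w:
        # Union of δ(q, symbol) over every state q the NFA might be in.
        current = set().union(*(DELTA.get((q, symbol), set()) for q in current))
    return bool(current & FINAL)


for w in ["100", "0110", "0010", "11"]:
    print(w, accepts(w))
```

Tracking a set of current states is the same idea that the subset construction in Lab 3 turns into a DFA.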

Pre Lab Activity

Construct the NFA M accepting the language L = {a* ∪ b*}

Lab Activity


Problem:
Given the regular languages L(M1) and L(M2):

a. L(M1) = {a* b* a* a}
b. L(M2) = {a} ∪ {ab*}

Requirement:
Construct an equivalent non deterministic finite automaton for each language. Use the
following filenames.

a. NFA_M1.jff ( for the non deterministic finite automaton M1)


b. NFA_M2.jff ( for the non deterministic finite automaton M2)

Hand – in:
Submit the following NFA in white paper:
a. Transition graph of NFA with 5 – tuple notation.
b. Transition table of NFA

Post Lab Activity

Problem:
Given the regular language L(M) = {00* 11*}

Requirement:
Construct the equivalent non deterministic finite automaton (M).

Hand – in:
Submit the following for the NFA on white paper:
a. Transition graph of NFA with 5 – tuple notation.
b. Transition table of NFA


LAB 3: Equivalence of Deterministic and Nondeterministic FA

Learning Outcomes:

After performing this lab, the student should be able to:

• Determine the difference between non deterministic finite automata and
deterministic finite automata.
• Analyze the strings accepted by these two finite automata.
• Construct the equivalent deterministic finite automata given nondeterministic
finite automata.

Background

Equivalence of DFA and NFA

Definition: Two finite automata M1 and M2 are equivalent if and only if

L(M1) = L(M2)

DFAs and NFAs recognize the same class of languages: every DFA can be viewed as an NFA,
and every NFA has an equivalent DFA.

Example:

a. Given the NFA M1 below, provide the equivalent DFA M2.

M1 = ({q0, q1}, {a, b}, δ, q0, {q1})

Transition Table of NFA

δ      a           b
→q0    {q0, q1}    {q1}
*q1    ∅           {q0, q1}

Transition Table of DFA

δ'           a           b              Rename:    δ'    a    b
→[q0]        [q0, q1]    [q1]                      →A    B    C
*[q0, q1]    [q0, q1]    [q0, q1]                  *B    B    B
*[q1]        []          [q0, q1]                  *C    D    B
[]           []          []                        D     D    D


Transition Graph with 5 – tuple notation

M2 = ({A, B, C, D}, {a, b}, δ’, A, {B, C})
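The DFA table above can be reproduced mechanically with the subset construction. The following is a minimal Python sketch, using the NFA transition table of M1 as given in the example; the function and variable names are chosen here for illustration only.

```python
from collections import deque

# NFA M1 from the example; δ(q1, a) = ∅ is simply left out of the dictionary.
NFA_DELTA = {
    ("q0", "a"): {"q0", "q1"},
    ("q0", "b"): {"q1"},
    ("q1", "b"): {"q0", "q1"},
}
ALPHABET = ["a", "b"]
NFA_START = "q0"
NFA_FINAL = {"q1"}


def subset_construction():
    """Build the DFA whose states are the reachable sets of NFA states."""
    start = frozenset({NFA_START})
    dfa_delta = {}
    seen = {start}
    queue = deque([start])
    while queue:
        subset = queue.popleft()
        for a in ALPHABET:
            # δ'(S, a) is the union of δ(q, a) over all q in S.
            target = frozenset().union(
                *(NFA_DELTA.get((q, a), set()) for q in subset)
            )
            dfa_delta[(subset, a)] = target
            if target not in seen:
                seen.add(target)
                queue.append(target)
    # A subset is final if it contains at least one final state of the NFA.
    finals = {s for s in seen if s & NFA_FINAL}
    return start, dfa_delta, finals


start, dfa_delta, finals = subset_construction()
for (subset, a), target in dfa_delta.items():
    print(sorted(subset), a, "->", sorted(target))
```

Only subsets reachable from [q0] are generated, which is why the result has four states rather than all 2^2 subsets by coincidence only; for larger NFAs the reachable part is often much smaller than the full power set.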

Pre Lab Activity

Construct the DFA M’ equivalent to the NFA M accepting the language L = {a* ∪ b*}

Lab Activity

Problem:
Given the NFA below:

a. M1 = ({q0, q1, q2}, {0, 1}, δ, q0, {q1})

b. M2 = ({ q0, q1, q2, q3 }, { a, b, c }, δ, q0, { q0, q2, q3 })

Requirement:

Construct the equivalent Deterministic Finite Automaton (M’). The following filename should
be followed.

a. NFA_M1.jff ( for the non deterministic finite automaton M1)


b. DFA_M1.jff ( for the equivalent deterministic finite automaton M1’)
c. NFA_M2.jff ( for the non deterministic finite automaton M2)
d. DFA_M2.jff ( for the equivalent deterministic finite automaton M2’)


Hand – in:
Submit the following DFA in white paper:
a. Transition graph of NFA with 5 – tuple notation
b. Transition table of NFA
c. Transition table of DFA. (note: rename your states using A, B, C …)
d. Transition graph of DFA with 5 – tuple notation

Post Lab Activity

Problem: Given the NFA (M1) below:

M1 = ({ q0, q1, q2 }, { 0, 1 }, δ, q0, {q2})

Requirement:
Construct the equivalent Deterministic Finite Automaton (M2) by providing the following:
a. Transition graph of NFA with 5 – tuple notation M1.
b. Transition table of NFA
c. Transition table of DFA. (note: rename your states using A, B, C …)
d. Transition graph of DFA with 5 – tuple notation M2.

Hand – in:
Submit all the requirements above in white paper.


LAB 4: Non Deterministic Finite Automata with Epsilon Transitions (ε – NFA)

Learning Outcomes:

After performing this lab, the student should be able to:

• Determine the difference between non deterministic finite automata (NFA) and
non deterministic finite automata with epsilon transitions (ε – NFA).
• Analyze the strings accepted by the finite automata.
• Construct the equivalent non deterministic finite automaton given a finite automaton
with epsilon transitions.

Background

Nondeterministic Finite Automaton with Epsilon Transitions (ε - NFA)

A Nondeterministic Finite Automaton with epsilon transitions (ε - NFA) is a 5-tuple:

M = (Q, Σ, δ, q0, F) where

Q = Finite set of “internal states”
Σ = Finite set of symbols called the “input alphabet”
δ: Q × (Σ ∪ {ε}) → 2^Q = Transition function
q0 ∈ Q = Initial state
F ⊆ Q = Set of final states

This NFA allows the symbol ε, empty string, as the second argument of δ. This means that
the NFA can make a transition without consuming an input symbol.

Example:
Construct the ε – NFA for the given transition table.

      a      b      c      ε
p     {p}    {q}    {r}    ∅
q     {q}    {r}    ∅      {p}
*r    {r}    ∅      {p}    {q}

Epsilon Closures (ε-closure)


Informally, we get the ε-closure of a state by following all transitions out of that state
labeled ε, and then continuing from every state reached in this way, so that the closure
contains all states reachable along paths whose arcs are all labeled ε.

Formally, we can define ε-closure(p) recursively as follows:

Basis: state p is in ε-closure(p).

Induction: if state q is in ε-closure(p) and there is a transition from state q to state r
labeled ε, then r is in ε-closure(p). More precisely, if δ is the transition function of the
ε-NFA involved, and q is in ε-closure(p), then ε-closure(p) also contains all the states in
δ(q, ε).
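The recursive definition translates directly into a small graph search. The sketch below (Python, with names chosen here for illustration) uses only the ε column of the example transition table given earlier in this lab: q has an ε-move to p, r has an ε-move to q, and p has none.

```python
# ε-moves taken from the example table: δ(p, ε) = ∅, δ(q, ε) = {p}, δ(r, ε) = {q}.
EPS = {"p": set(), "q": {"p"}, "r": {"q"}}


def eps_closure(state):
    """Return all states reachable from `state` using only ε-transitions."""
    closure = {state}          # basis: the state itself
    stack = [state]
    while stack:
        q = stack.pop()
        for r in EPS.get(q, set()):
            if r not in closure:   # induction: follow one more ε-arc
                closure.add(r)
                stack.append(r)
    return closure


for s in ["p", "q", "r"]:
    print(s, sorted(eps_closure(s)))
```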

Eliminating ε-Transitions (Equivalence of NFA and ε-NFA)

Every NFA is an ε-NFA; it just has no transitions on ε. The converse requires us to take an
ε-NFA and construct an NFA that accepts the same language. We do so by combining
ε-transitions with the next transition on a real input.
• Start with an ε-NFA E with states Q, inputs Σ, start state q0, final states F, and
transition function δE.
• Construct an “ordinary” NFA N with states Q, inputs Σ, start state q0, final states F’,
and transition function δN.
• Compute δN(q, a) as follows:
  o Let S = ε-closure(q).
  o δN(q, a) is the union over all p in S of δE(p, a).
• F’ = the set of states q such that ε-closure(q) contains a state of F.

Example: Convert the given ε-NFA to an NFA by eliminating the ε-transitions.

      a      b      c      ε
p     {p}    {q}    {r}    ∅
q     {q}    {r}    ∅      {p}
*r    {r}    ∅      {p}    {q}

a. Get the ε-closure of each state

ε-closure(p) = {p}
ε-closure(q) = {p, q}
ε-closure(r) = {p, q, r}

b. Transition table of NFA

δ'    a            b         c
p     {p}          {q}       {r}
q     {p, q}       {q, r}    {r}
*r    {p, q, r}    {q, r}    {p, r}
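The table for δ' can be checked with a short program. The Python sketch below follows the rule stated in this lab (δN(q, a) is the union of δE(p, a) over all p in ε-closure(q)) on the example ε-NFA; the dictionary layout and function names are illustrative choices, not part of the manual.

```python
# Example ε-NFA: DELTA_E[state][symbol] is a set of states.
DELTA_E = {
    "p": {"a": {"p"}, "b": {"q"}, "c": {"r"}, "ε": set()},
    "q": {"a": {"q"}, "b": {"r"}, "c": set(), "ε": {"p"}},
    "r": {"a": {"r"}, "b": set(), "c": {"p"}, "ε": {"q"}},
}
FINAL_E = {"r"}
SIGMA = ["a", "b", "c"]


def eps_closure(state):
    """States reachable from `state` along ε-arcs only."""
    closure, stack = {state}, [state]
    while stack:
        for r in DELTA_E[stack.pop()]["ε"]:
            if r not in closure:
                closure.add(r)
                stack.append(r)
    return closure


def eliminate_eps():
    """Return (δN, F') of the ε-free NFA built by the rule in this lab."""
    delta_n = {
        q: {a: set().union(*(DELTA_E[p][a] for p in eps_closure(q))) for a in SIGMA}
        for q in DELTA_E
    }
    final_n = {q for q in DELTA_E if eps_closure(q) & FINAL_E}
    return delta_n, final_n


delta_n, final_n = eliminate_eps()
for q, row in delta_n.items():
    print(q, {a: sorted(t) for a, t in row.items()})
print("final states:", sorted(final_n))
```

In this example only ε-closure(r) contains the final state r, so F' stays {r}.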


Pre Lab Activity

Eliminate the ε-transitions for the given transition table

      a      b      c         ε
p     ∅      {q}    {r}       {q, r}
q     {p}    {r}    {p, q}    ∅
*r    ∅      ∅      ∅         ∅

Lab Activity

Problem:

Given the ε-NFA (M1)

M1 = ({q0, q1, q2}, {0, 1}, δ, q0, {q1})

Requirement:

Convert the ε-NFA (M1) to NFA (M2) by providing the following:


a. Transition table of the ε-NFA (δ)
b. ε-closure of each state
c. Transition table of the NFA (δ’)
d. Transition diagram of NFA (M2)

Hand – in:

Submit all the requirements above in white paper.

Post Lab Activity

Problem:

Nondeterministic Finite Automaton with epsilon transition (M)

M = ({q0, q1, q2}, {a, b}, δ, q0, {q1})


Requirement:

Convert the ε-NFA (M) to DFA (M’’) by providing the following:


a. Transition table of the ε-NFA (δ)
b. ε-closure of each state
c. Transition table of the NFA (δ’)
d. Transition table of the DFA (δ’’)
e. Transition diagram of DFA (M’’)

Hand – in:

Submit all the requirements above in white paper.


LAB 5: Regular Expressions

Learning Outcomes:

After performing this lab, the student should be able to:

• Construct regular expression given a regular language.
• Construct a regular language given a regular expression.
• Analyze the strings accepted by the regular expression.

Background

Regular Expressions

Another way to describe regular languages, other than finite automata, is via the notation of
regular expressions. This notation involves a combination of strings of symbols from some
alphabet Σ, parentheses, and the operators +, •, and *. If E is a regular expression, then
L(E) is the language it defines.

Formal Definition of Regular Expression

Let Σ be a given alphabet. Then

1. ∅, ε, and a ∈ Σ are all regular expressions. These are called primitive regular
expressions.
2. If r1 and r2 are regular expressions, so are r1 + r2, r1 • r2, r1* and (r1).
3. A string is a regular expression if and only if it can be derived from primitive regular
expressions by a finite number of applications of the rules in (2).

Languages Associated with Regular expressions

Regular expressions can be used to describe some simple languages. If r is a regular
expression, then L(r) denotes the language associated with r. This language is defined as
follows:

The language L(r) denoted by any regular expression r is defined by the following
rules:
1. ∅ is a regular expression denoting the empty set, { } or ∅,
2. ε is a regular expression denoting { ε },
3. for every a ∈ Σ, a is a regular expression denoting {a}.
If r1 and r2 are regular expressions, then
4. L(r1 + r2) = L(r1) ∪ L(r2),
5. L(r1 • r2) = L(r1) L(r2),
6. L((r1)) = L(r1),
7. L(r1*) = (L(r1))*.

Precedence of Operators

• Parentheses may be used wherever needed to influence the grouping of operators.
• Order of precedence is * (highest), then concatenation, then + (lowest).

Examples:
L(01) = {01}.
L(01+0) = {01, 0}.
L(0(1+0)) = {01, 00}.
L(0*) = {ε, 0, 00, 000,… }.
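These examples can be explored with Python's standard `re` module, which uses the same precedence (star, then concatenation, then union) but writes union as `|` instead of `+`. The helper below is a hypothetical name introduced for this sketch; it lists every string over {0, 1} up to a chosen length that the whole expression matches, so infinite languages such as L(0*) are only shown up to that bound.

```python
import re
from itertools import product


def language(expr, alphabet="01", max_len=3):
    """List the strings over `alphabet` of length <= max_len that are in L(expr)."""
    # The manual's union operator + is written | in Python's re syntax.
    pattern = re.compile(expr.replace("+", "|"))
    words = (
        "".join(t) for n in range(max_len + 1) for t in product(alphabet, repeat=n)
    )
    return [w for w in words if pattern.fullmatch(w)]


print(language("01"))       # ['01']
print(language("01+0"))     # ['0', '01']
print(language("0(1+0)"))   # ['00', '01']
print(language("0*"))       # ['', '0', '00', '000']  (ε is the empty string)
```

The translation of `+` to `|` is only safe for expressions written in the manual's notation, where `+` never means "one or more".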


Pre Lab Activity

Provide the language L(r) for the following regular expressions.


a. r = ((0 + 10)*(ε + 1))
b. r = (0 + 1)* (a + bb)
c. r = (00)* (11)*1

Provide the regular expressions for the following languages.


a. L(r) = {w ∈ {0, 1}* | w has at least one pair of consecutive zeros}.
b. L(r) = {w ∈ {0, 1}* | w has no pair of consecutive zeros}.

Lab Activity

Activity 1.

Problem:
Given the following languages:
a. L1 = {a^n b^m | n ≥ 3, m is even}
b. L2 = {a^n b^m | n < 4, m ≤ 3}
c. Complement of L2 (L3)

Requirement:
Find the regular expression equivalent to the above regular languages.

Activity 2.

Problem:
Given the regular language L:

L((aa)*b(aa)* + a(aa)* b a(aa)*)

Requirement:
Provide the regular expressions for the following languages.

Hand – in:

Submit all the requirements for Activity 1 and 2 in white paper.

Post Lab Activity

1. Given the language: L((a + b)*b(a +ab)*)


a. Provide the set having all the strings of length less than 4
b. How many strings are of length less than 3?

2. Give the regular expression for L = {a^n b^m | n ≥ 1, m ≥ 1, nm ≥ 3}


LAB 6: Regular Expressions and Finite Automata

Learning Outcomes:

After performing this lab, the student should be able to:

• Determine the difference and similarities of the notation using a regular
expression and a finite automaton.
• Construct regular expression given finite automata (ε-NFA).
• Construct a deterministic finite automaton from a given regular expression.

Background

Finite Automata and Regular Expressions

While the regular expression approach to describing languages is fundamentally different
from the finite automaton approach, the two notations describe exactly the same class of
languages, which we refer to as the “regular languages”. In order to show that regular
expressions define the same languages as finite automata, we will show that:

1. Every language defined by one of these automata is also defined by a regular
expression. We can assume that the language is accepted by some DFA.
2. Every language defined by a regular expression is defined by one of these automata.
For this part, we will show that there is an NFA with ε-transitions accepting the same
language.

Regular Expression to Finite Automata

Every language defined by a regular expression is also defined by a finite automaton.

Suppose L = L(R) for a regular expression R. We show that L = L(E) for some ε-NFA E with:
1. Exactly one accepting state.
2. No arcs into the initial state.
3. No arcs out of the accepting state.

The following figures show the ε-NFA constructed for each form of regular expression.

[Figure: ε-NFA building blocks for (a) r = ε, (b) r = ∅, (c) r = a, (d) R + S, (e) R • S, (f) R*]
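The building blocks for a single symbol, union, concatenation and star can be sketched in Python as follows. This is an illustrative sketch under stated assumptions: each fragment is a pair (start state, accepting state), states are integers, ε-arcs carry the label None, and the exact placement of ε-arcs follows the common textbook construction, which may differ in detail from the figures of this manual.

```python
from itertools import count

_ids = count()   # fresh state numbers
EDGES = []       # arcs as (source, label, target); label None stands for ε


def symbol(a):
    """(c) r = a: two states joined by one arc labeled a."""
    s, f = next(_ids), next(_ids)
    EDGES.append((s, a, f))
    return s, f


def union(r, s):
    """(d) R + S: new start and accepting states, linked to both parts by ε."""
    start, final = next(_ids), next(_ids)
    for frag in (r, s):
        EDGES.append((start, None, frag[0]))
        EDGES.append((frag[1], None, final))
    return start, final


def concat(r, s):
    """(e) R • S: an ε-arc from the accepting state of R to the start of S."""
    EDGES.append((r[1], None, s[0]))
    return r[0], s[1]


def star(r):
    """(f) R*: ε-arcs to enter R, leave R, repeat R, or skip R entirely."""
    start, final = next(_ids), next(_ids)
    EDGES.extend([(start, None, r[0]), (r[1], None, final),
                  (r[1], None, r[0]), (start, None, final)])
    return start, final


def closure(states):
    """ε-closure of a set of states."""
    seen, stack = set(states), list(states)
    while stack:
        q = stack.pop()
        for src, label, dst in EDGES:
            if src == q and label is None and dst not in seen:
                seen.add(dst)
                stack.append(dst)
    return seen


def accepts(frag, w):
    """Simulate the ε-NFA fragment on the string w."""
    current = closure({frag[0]})
    for a in w:
        moved = {dst for src, label, dst in EDGES if src in current and label == a}
        current = closure(moved)
    return frag[1] in current


# ε-NFA for 01*, built bottom-up from its sub-expressions.
e = concat(symbol("0"), star(symbol("1")))
for w in ["0", "0111", "", "10"]:
    print(repr(w), accepts(e, w))
```

Every fragment keeps the three properties listed above (one accepting state, no arcs into the start state, no arcs out of the accepting state), which is what allows the pieces to be combined freely.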

Example:

Provide the equivalent ε-NFA of the regular expression L(01*)

Pre Lab Activity

Provide the equivalent ε-NFA of the regular expression L(00(0+1)*)

Lab Activity

Problem:

Given the regular expression : L(00 + 11)*

Requirement:

Provide the following requirements and submit in white paper.

a. Construct the ε – NFA equivalent with 5-tuple notation


b. Transition table of ε – NFA (δ)
c. ε – closure of each state
d. Transition table of NFA (δ’)
e. Transition table of DFA (δ’’)
f. Transition graph of DFA (M’’) with 5-tuple notation

Post Lab Activity

Provide the same requirements above given the regular expression L(a* + b*)


LAB 7: Minimization of Deterministic Finite Automata

Learning Outcomes:

After performing this lab, the student should be able to:

• Determine the difference and similarities of the notation using a regular
expression and a finite automaton.
• Construct regular expression given finite automata (ε-NFA).
• Construct a deterministic finite automaton from a given regular expression.

Background

Equivalence and Minimization of DFAs

Testing Equivalence of States

Any DFA defines a unique language, but the converse is not true. For a given language,
there are many DFAs that accept it. There may be a considerable difference in the number
of states of such equivalent automata.

Definition:

Two states p and q of a DFA are called indistinguishable if

δ(p, w) ∈ F implies δ(q, w) ∈ F,
and
δ(p, w) ∉ F implies δ(q, w) ∉ F,

for all w ∈ Σ*. If, on the other hand, there exists some string w ∈ Σ* such that

δ(p, w) ∈ F and δ(q, w) ∉ F,

or vice versa, then the states p and q are said to be distinguishable by the string w.

If two states are indistinguishable, then they are equivalent.

Table – Filling Algorithm

The table – filling algorithm determines which pairs of states of a given DFA are
distinguishable. It first marks every pair consisting of a final and a non-final state, and
then repeatedly marks any pair {p, q} for which some input symbol leads to an already marked
pair, until no more pairs can be marked. An X in the table indicates that the pair of states
is distinguishable; the blank squares indicate the pairs of states that are equivalent.

Minimization of DFAs

An important consequence of the test for equivalence of states is that we can minimize a DFA.
That is, for each DFA we can find an equivalent DFA that has the fewest possible states. The
algorithm is as follows:

1. Use the table – filling algorithm to find all the pairs of equivalent states.
2. Partition the set of states into blocks of mutually equivalent states.
3. Construct the minimum – state equivalent DFA.


Example: Minimize the states of the given DFA M.

M = ({A, B, C, D, E, F, G, H}, {0, 1}, δ, A, {C})

Table Filling Algorithm

B   X
C   X  X
D   X  X  X
E      X  X  X
F   X  X  X     X
G   X  X  X  X  X  X
H   X     X  X  X  X  X
    A  B  C  D  E  F  G

Blocks of equivalent states are: [A, E], [B, H], [D, F], [C], [G]

M’ = ({[A, E], [B, H], [D, F], [C], [G]}, {0, 1}, δ, [A, E], {[C]})
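The table-filling procedure can be sketched in Python as below. The transition graph of M is not reproduced in the text, so the transition table used here is an assumption: it is a well-known textbook DFA with the same states and final state, and it does yield the blocks [A, E], [B, H] and [D, F] listed above. Function names are illustrative.

```python
from itertools import combinations

# Assumed transitions for M: DELTA[state] = (next state on 0, next state on 1).
DELTA = {
    "A": ("B", "F"), "B": ("G", "C"), "C": ("A", "C"), "D": ("C", "G"),
    "E": ("H", "F"), "F": ("C", "G"), "G": ("G", "E"), "H": ("G", "C"),
}
FINAL = {"C"}


def distinguishable_pairs():
    """Return the set of marked (distinguishable) pairs, i.e. the X entries."""
    # Basis: a final and a non-final state are distinguished by the empty string.
    marked = {
        frozenset((p, q))
        for p, q in combinations(DELTA, 2)
        if (p in FINAL) != (q in FINAL)
    }
    changed = True
    while changed:
        changed = False
        for p, q in combinations(DELTA, 2):
            if frozenset((p, q)) in marked:
                continue
            # Induction: mark {p, q} if some input leads to an already marked pair.
            if any(frozenset((DELTA[p][i], DELTA[q][i])) in marked for i in (0, 1)):
                marked.add(frozenset((p, q)))
                changed = True
    return marked


def equivalent_pairs():
    """Pairs left blank in the table."""
    marked = distinguishable_pairs()
    return [(p, q) for p, q in combinations(DELTA, 2) if frozenset((p, q)) not in marked]


print(equivalent_pairs())   # [('A', 'E'), ('B', 'H'), ('D', 'F')]
```

The loop repeats until a full pass marks nothing new, which is a simple (not the most efficient) way to reach the fixed point of the marking rule.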


Pre Lab Activity

Minimize the DFA below.

δ     0    1
→A    B    A
B     A    C
C     D    B
*D    D    A
E     D    F
F     G    E
G     F    G
H     G    D

Lab Activity

Problem:
Given the transition table below

δ     0    1
→A    B    E
B     C    F
*C    D    H
D     E    H
E     F    I
*F    G    B
G     H    B
H     I    C
*I    A    E

Requirement:
Construct the DFA with minimal states.

Hand – in:
Submit the following in white paper.

a. The table – filling algorithm


b. The minimized DFA with 5-tuple representation


Post Lab Activity

Problem:

Given the DFA (M) and the algorithm:

M = ({q0, q1, q2, q3, q4, q5}, {0, 1}, δ, q0, {q3, q5})

Algorithm for Minimal DFA:

Step 1: Use the table filling algorithm to get the block of states that are mutually
equivalent.

Step 2: Remove all the states (blocks of states) that are unreachable from the start
state.

Requirement:

Construct the Minimal DFA (M’)

Hand – in:

Submit the following in white paper


a. The Final table in the table filling algorithm
b. The Minimal DFA (M’)


LAB 8: Context – Free Grammars

Learning Outcomes:

After performing this lab, the student should be able to:

• Construct context – free grammars for generating patterns of strings.
• Construct context – free languages.
• Determine the difference between right – linear and left – linear grammars and how they
can be constructed.

Background

A Context – Free Grammar (CFG) is a set of recursive rewriting rules (or productions) used
to generate patterns of strings (or language).

There are four important components in a grammatical description of a language:


1. There is a finite set of symbols that form the strings of the language being defined.
These symbols are called terminals or terminal symbols.
2. There is a finite set of variables, also called nonterminals. Each variable
represents a language.
3. One of the variables represents the language being defined. This is called the start
symbol. Other variables represent auxiliary classes of strings that are used to help
define the language of the start symbol.
4. There is a finite set of productions or rules that represent the recursive definition of
a language. Each production consists of :
a. A variable that is being (partially) defined by the production. This variable is
often called the head of the production.
b. The production symbol →.
c. A string of zero or more terminals and variables. This string, called the body
of the production, represents one way to form strings in the language of the
variable of the head. In doing so, we leave terminals unchanged and
substitute for each variable of the body any string that is known to be in the
language of that variable.
These four components form a context – free grammar (CFG). Formally, we will represent a
CFG G by its four components, that is: G = (V, T, P, S) where :
V is the finite set of variables
T is the finite set of terminals
P is the set of productions
S is the start symbol

To generate a string of terminal symbols from a CFG, we:

• Begin with a string consisting of the start symbol;
• Apply one of the productions with the start symbol on the left hand side, replacing the
start symbol with the right hand side of the production;
• Repeat the process of selecting nonterminal symbols in the string, and replacing
them with the right hand side of some corresponding production, until all
nonterminals have been replaced by terminal symbols.

Example: Given the grammar G = ({S}, {a, b}, P, S) with productions P as

S → aSb | SS | λ

This grammar generates strings such as abab, aaabbb and aababb.


Right and Left – Linear Grammar

A grammar G = (V, T, P, S) is said to be right-linear if all productions are of the form

A → xB,
A → x,

where A, B ∈ V, and x ∈ T*. A grammar is said to be left-linear if all productions are of the
form

A → Bx,
A → x.

A regular grammar is one that is either right-linear or left-linear. Note that in a regular
grammar at most one variable appears on the right side of any production. Furthermore, that
variable must consistently be either the rightmost or the leftmost symbol of the right side
of any production.

Example:

The grammar G1 = ({S}, {a, b}, P1, S), with P1 given as

S → abS | a

is right-linear. The grammar G2 = ({S, S1, S2}, {a, b}, P2, S), with productions P2 as

S → S1ab
S1 → S1ab | S2
S2 → a

is left-linear. Both G1 and G2 are regular grammars.

Derivations Using a Grammar

We derive strings in the language of a CFG by starting with the start symbol, and repeatedly
replacing some variable A by the right side of one of its productions. Here, the
“productions for A” are those that have A on the left side of the →.

Example: Derive the string 000111, given the grammar G = ({S}, {0, 1}, P, S) with productions
P as
S → 01
S → 0S1

The derivation is S => 0S1 => 00S11 => 000111.
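A derivation is just a sequence of rewriting steps, which the following Python sketch makes explicit for the grammar of this example. The function name and the way productions are chosen (a list of bodies, applied in order to the leftmost variable) are assumptions made for illustration.

```python
# Productions of the example grammar: S → 01 | 0S1.
PRODUCTIONS = {"S": ["01", "0S1"]}


def leftmost_derivation(choices, start="S"):
    """Apply the chosen production bodies, one per step, to the leftmost variable."""
    form = start
    forms = [form]
    for body in choices:
        # Find the leftmost variable in the current sentential form.
        i = next(k for k, ch in enumerate(form) if ch in PRODUCTIONS)
        if body not in PRODUCTIONS[form[i]]:
            raise ValueError(f"{form[i]} -> {body} is not a production")
        form = form[:i] + body + form[i + 1:]
        forms.append(form)
    return forms


steps = leftmost_derivation(["0S1", "0S1", "01"])
print(" => ".join(steps))   # S => 0S1 => 00S11 => 000111
```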

Rightmost and Leftmost Derivation

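Given the grammars G1 and G2 above, the sequence

S => abS => ababS => ababa

is a derivation in G1. The variable replaced at each step is the rightmost symbol of the
sentential form, so this is a rightmost derivation. The sequence

S => S1ab => S1abab => S2abab => aabab

is a derivation in G2. The variable replaced at each step is the leftmost symbol, so this is
a leftmost derivation. Because every sentential form here contains only one variable, each of
these derivations is trivially both leftmost and rightmost; the two notions differ only when
a sentential form contains more than one variable.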


Language of a Grammar

Definition: If G is a CFG, then L(G), the language of G, is {w | S =>* w}.

Note: w must be a terminal string, S is the start symbol.

A language that is defined by some CFG is called a context-free language.

Example:
Given the grammar G = ({S}, {0, 1}, P, S) with P as

S → 0S1
S → λ

Give the language generated by the grammar L(G).

L(G) = { 0^n 1^n | n ≥ 0 }

Lab Activity

Activity 1:
Find a context – free grammar for each of the following languages (with n ≥ 0, m ≥ 0)
a. L(G1) = { a^n b^m | n ≠ m – 1 }
b. L(G2) = { a^n b^m | n ≠ 2m }

Activity 2:
Give the right – linear (G1) and left – linear (G2) grammar of G = ({S, S1, A, B}, {a, b}, P, S)
with P as
S → AS1 | S1B
S1 → aS1b | λ
A → aA | a
B → bB | b

Note: Provide the 4-tuple notation of both grammars.

Post Lab Activity

Problem:
Given the grammar G = ({S, A, B}, {a, b}, P, S) with P as
S → abB
B → bbAa
A → aaBb | λ

Requirement:

Provide the following:

a. The language accepted by the grammar L(G).


b. Give the leftmost and rightmost derivation of the string: abbbaabbaba


LAB 9: Chomsky Normal Form of Context – Free Grammars

Learning Outcomes:

After performing this lab, the student should be able to:

• Simplify a context free grammar by eliminating unnecessary (undesirable)
productions.
• Provide the Chomsky normal form of a given context free grammar.

Background

Simplification of Context – Free Grammars

In a Context Free Grammar (CFG), it may not be necessary to use all the symbols in V ∪ T,
or all the production rules in P while deriving sentences. Let us try to eliminate symbols and
productions in G which are not useful in deriving sentences.

Let G = (V, T, P, S) be a context-free grammar. Suppose that P contains a production of the
form

A → x1 B x2.

Assume that A and B are different variables and that

B → y1 | y2 | … | yn

is the set of all productions in P which have B as the left side.

Let G’ = (V, T, P’, S) be the grammar in which P’ is constructed by deleting A → x1 B x2 from
P, and adding to it

A → x1y1x2 | x1y2x2 | … | x1ynx2

Then L(G’) = L(G).

Substitution Rule

A production A → x1Bx2 can be eliminated from a grammar if we put in its place the set of
productions in which B is replaced by all strings it derives in one step. In this result, it is
necessary that A and B are different variables.

Example: Consider G = ({A, B}, {a, b, c}, P, A) with productions

A → a | aaA | abBc
B → abbA | b.

Using the suggested substitution rule for the variable B, we get the grammar G’ with
productions

A → a | aaA | ababbAc | abbc
B → abbA | b

The new grammar G’ is equivalent to G.
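The substitution step can be sketched in a few lines of Python. The representation (a dictionary from each variable to a list of production bodies, with single-character symbols) and the function name are assumptions made for this illustration.

```python
def substitute(productions, head, body, var):
    """Replace head → x1 var x2 by head → x1 y x2 for every production var → y."""
    i = body.index(var)                 # position of the variable being expanded
    x1, x2 = body[:i], body[i + 1:]
    new = {h: list(bodies) for h, bodies in productions.items()}
    new[head].remove(body)              # delete A → x1 B x2
    new[head] += [x1 + y + x2 for y in productions[var]]
    return new


P = {"A": ["a", "aaA", "abBc"], "B": ["abbA", "b"]}
print(substitute(P, "A", "abBc", "B"))
# {'A': ['a', 'aaA', 'ababbAc', 'abbc'], 'B': ['abbA', 'b']}
```

The productions for B are left in place; whether they can then be dropped is a separate question of useless productions.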


Notice that, in the example given, the variable B (in G’) and its associated productions are
still in the grammar even though they can no longer play a part in any derivations.

Removing Useless Productions

In the grammar G with P,

S → aSb | λ | A
A → aA

the production S → A does not play any role because A cannot be transformed into a
terminal string. Although A can occur in a string derived from S, this can never lead to a
sentence. Hence this production can be removed without affecting the language.

Definition: Let G = (V, T, P, S) be a CFG.

A variable A ∈ V is said to be “useful” if and only if there is at least one w ∈ L(G) such that
S =>* xAy =>* w

with x, y in (V ∪ T)*. In other words, a variable is useful if and only if it occurs in at least
one derivation of a sentence. A variable that is not useful is called useless. A production is
useless if it involves any useless variable.

Another reason a variable may be useless is shown in the next grammar.

Consider the grammar G with P

S → A
A → aA | λ
B → bA

Here the variable B is said to be “useless” and so is the production B → bA. Although B can
derive a terminal string, there is no way to achieve S =>* xBy.

The examples above illustrate the two reasons why a variable is useless:

• Either because it cannot derive a terminal string (or not generating any string), or
• Because it cannot be reached from the start symbol

The procedure for removing useless variables and productions is based on recognizing
these two situations.

Theorem: Let G = (V, T, P, S) be a CFG. Then there exists an equivalent grammar G’ = (V’,
T’, P’, S) that does not contain any useless variables or productions.

Proof: The grammar G’ can be generated from G by an algorithm consisting of two parts. In
the first part we construct an intermediate grammar G1 = (V1, T2, P1, S) such that V1 contains
only variables A for which

A =>* w ∈ T*

is possible. The steps are:

1. Set V1 to ∅.
2. Repeat the following step until no more variables are added to V1.
   • For every A ∈ V for which P has a production of the form

     A → x1x2 ... xn, with all xi in V1 ∪ T,

     add A to V1.
3. Take P1 as all the productions in P whose symbols are all in V1 ∪ T.

In the second part of the construction, we get the final answer G’ from G1. We draw the
variable dependency graph of G1. In this graph, a variable is useful
only if there is a path from the vertex labelled S to the vertex labelled with that
variable. Using the dependency graph, we find all variables that cannot be reached from
S; these are removed from the variable set, as are the productions involving them. The
result is the grammar G’ = (V’, T’, P’, S).

Example: Eliminate useless symbols and productions from G = ({S, A, B, C}, {a, b}, P, S)
with P as

S → aS | A | C
A → a
B → aa
C → aCb

Using the first part of the construction, we find that C cannot derive a terminal string, so it
is a useless variable and its production is also useless. After the first part, the
grammar is:

S → aS | A
A → a
B → aa

Using the second part, we notice that B cannot be reached from the start symbol, so it is a
useless variable and its production is also useless. After the second part, the
grammar, which has no more useless productions, is:

S → aS | A
A → a
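The two-part procedure can be tried out in code. The following is only a minimal sketch under assumptions that are not part of this manual: a grammar is stored as a Python dictionary from each variable to a list of right-hand-side strings, every variable is a single uppercase letter, every other character is a terminal, and the empty string stands for λ. The function name remove_useless is hypothetical.

```python
def remove_useless(productions, start):
    """Remove useless variables and productions from a grammar.

    Assumed representation (not from the manual): `productions` maps each
    variable to a list of right-hand-side strings, every variable is a single
    uppercase letter, every other character is a terminal, "" stands for λ.
    """
    def usable(body, allowed):
        # True if every variable occurring in body belongs to `allowed`.
        return all(not sym.isupper() or sym in allowed for sym in body)

    # Part 1: grow V1, the set of variables that derive a terminal string.
    v1 = set()
    changed = True
    while changed:
        changed = False
        for var, bodies in productions.items():
            if var not in v1 and any(usable(body, v1) for body in bodies):
                v1.add(var)
                changed = True
    # P1 keeps only productions whose symbols are all in V1 or terminals.
    p1 = {var: [b for b in bodies if usable(b, v1)]
          for var, bodies in productions.items() if var in v1}

    # Part 2: keep only variables reachable from the start symbol.
    reachable, todo = {start}, [start]
    while todo:
        for body in p1.get(todo.pop(), []):
            for sym in body:
                if sym.isupper() and sym not in reachable:
                    reachable.add(sym)
                    todo.append(sym)
    return {var: bodies for var, bodies in p1.items() if var in reachable}


grammar = {"S": ["aS", "A", "C"], "A": ["a"], "B": ["aa"], "C": ["aCb"]}
print(remove_useless(grammar, "S"))  # {'S': ['aS', 'A'], 'A': ['a']}
```

The sketch performs the two parts in the same order as the construction: the non-generating variables are removed first, and only then the unreachable ones.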

Removing λ – Productions

One kind of production that is sometimes undesirable is one in which the right side is the
empty string.

Definition: Any production of a context-free grammar of the form

A → λ

is called a λ-production. Any variable A for which the derivation

A =>* λ

is possible is called nullable.

A grammar may generate a language not containing λ, yet have some λ-productions or
nullable variables. In such cases, the λ-productions can be removed.


Consider the grammar

S → aS1b
S1 → aS1b | λ

The λ-production S1 → λ can be removed after adding new productions obtained by
substituting λ for S1 where it occurs on the right. Doing so we get the grammar

S → aS1b | ab
S1 → aS1b | ab

This new grammar generates the same language as the original one.

Theorem. Let G be any context-free grammar with λ not in L(G). Then there exists an
equivalent grammar G’ having no λ-productions.

Proof: We first find the set Vn of all nullable variables of G, using the following steps:

1. For all productions A → λ, put A into Vn.
2. Repeat the following step until no further variables are added to Vn:
• For all productions

B → A1A2 ... An,

where A1, A2, ..., An are in Vn, put B into Vn.

Once the set Vn has been found, we can construct P’. To do so, we look at all productions
in P of the form

A → x1x2 ... xm, m ≥ 1,

where each xi ∈ V ∪ T. For each such production of P, we put into P’ that production as well
as all those generated by replacing nullable variables with λ in all possible combinations.
For example, if xi and xj are both nullable, there will be one production in P’ with xi replaced
with λ, one in which xj is replaced with λ, and one in which both xi and xj are replaced with λ.

There is one exception: if all xi are nullable, the production A → λ is not put into P’.

Example: Find a context-free grammar without λ-productions equivalent to the grammar
defined by

S → ABaC
A → BC
B → b | λ
C → D | λ
D → d

From the first step of the construction in the above theorem, we find that the nullable
variables are A, B, C. Then, following the second step of the construction, we get

S → ABaC | BaC | AaC | ABa | aC | Aa | Ba | a
A → B | C | BC
B → b
C → D
D → d
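The construction in the proof can be sketched in code as follows. This is an illustration only, under assumptions that are not part of this manual: the grammar is a Python dictionary from variables to lists of right-hand-side strings, each symbol is a single character, and the empty string stands for λ. The function name remove_lambda is hypothetical.

```python
from itertools import product


def remove_lambda(productions):
    """Return (nullable variables, productions without λ-productions).

    Assumed representation (not from the manual): each right-hand side is a
    string of single-character symbols and "" stands for λ.
    """
    # Step 1: variables with a production A -> λ are nullable.
    nullable = {var for var, bodies in productions.items() if "" in bodies}
    # Step 2: B is nullable if some body consists of nullable variables only.
    changed = True
    while changed:
        changed = False
        for var, bodies in productions.items():
            if var not in nullable and any(
                body and all(sym in nullable for sym in body) for body in bodies
            ):
                nullable.add(var)
                changed = True

    # Build P': every nullable symbol is either kept or replaced with λ.
    result = {}
    for var, bodies in productions.items():
        new_bodies = []
        for body in bodies:
            choices = [(sym, "") if sym in nullable else (sym,) for sym in body]
            for combo in product(*choices):
                candidate = "".join(combo)
                # The exception in the proof: A -> λ itself is never added.
                if candidate and candidate not in new_bodies:
                    new_bodies.append(candidate)
        if new_bodies:
            result[var] = new_bodies
    return nullable, result


grammar = {"S": ["ABaC"], "A": ["BC"], "B": ["b", ""], "C": ["D", ""], "D": ["d"]}
nullable, new_grammar = remove_lambda(grammar)
print(sorted(nullable))  # ['A', 'B', 'C']
for var, bodies in new_grammar.items():
    print(var, "->", " | ".join(bodies))
```

For the grammar of the example, the sketch produces the same eight productions for S, although possibly listed in a different order.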


Removing Unit Productions

Productions in which both sides are a single variable are at times undesirable.

Definition. Any production of a context-free grammar of the form

A → B

where A, B ∈ V is called a unit production.

To remove unit productions, we use the substitution rule.

Example: Remove all unit productions from

S → Aa | B
B → A | bb
A → a | bc | B

First create a dependency graph for the unit productions. Using the dependency graph, we
see that

S =>* A, S =>* B, B =>* A, A =>* B

Hence, we add to the original non-unit productions

S → Aa
B → bb
A → a | bc

the new rules

S → a | bc | bb
B → a | bc
A → bb

to obtain the equivalent grammar

S → a | bc | bb | Aa
B → a | bc | bb
A → a | bc | bb

Note that the removal of the unit – productions has made B and the associated productions
useless.
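The removal of unit productions can also be sketched in code. The sketch below is only an illustration under assumptions that are not part of this manual: variables are single uppercase letters, a grammar is a dictionary from variables to lists of right-hand-side strings, and the function name remove_unit is hypothetical. For each variable it follows unit productions only, which plays the role of the dependency graph, and then copies the non-unit productions of every variable reached.

```python
def remove_unit(productions):
    """Return an equivalent grammar without unit productions.

    Assumed representation (not from the manual): variables are single
    uppercase letters and right-hand sides are strings.
    """
    def is_unit(body):
        # A unit production has exactly one variable on the right side.
        return len(body) == 1 and body.isupper()

    result = {}
    for var in productions:
        # Variables B with var =>* B using unit productions only.
        reach, todo = {var}, [var]
        while todo:
            for body in productions.get(todo.pop(), []):
                if is_unit(body) and body not in reach:
                    reach.add(body)
                    todo.append(body)
        # var receives the non-unit bodies of every variable it reaches.
        bodies = []
        for target in sorted(reach):
            for body in productions.get(target, []):
                if not is_unit(body) and body not in bodies:
                    bodies.append(body)
        result[var] = bodies
    return result


grammar = {"S": ["Aa", "B"], "B": ["A", "bb"], "A": ["a", "bc", "B"]}
for var, bodies in remove_unit(grammar).items():
    print(var, "->", " | ".join(bodies))
```

The sketch does not remove the variable B afterwards; as noted above, that is the job of the useless-production step.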

We can put all these results together to show that grammars for context-free languages
can be made free of useless productions, λ-productions and unit productions.

Theorem. Let L be a context-free language that does not contain λ. Then there exists a
context-free grammar that generates L and that does not have any useless productions,
λ-productions or unit productions.

To remove all undesirable productions, use the following sequence of steps:

1. Remove λ – productions
2. Remove unit productions
3. Remove useless productions


Chomsky Normal Form

Any context-free language L that does not contain λ is generated by a grammar in which all
productions are of the form

A → BC
or
A → a

where A, B, C ∈ V and a ∈ T.

Procedure to find the Equivalent Grammar in CNF

i. Eliminate the undesirable productions, if any.
ii. Eliminate the terminals from right-hand sides of length two or more.
iii. Restrict the number of variables on the right-hand side of productions to two.

Examples:

Obtain a grammar in Chomsky Normal Form (CNF) equivalent to the grammar G1 with
productions P given by

S → ABa
A → aab
B → Ac

Following the steps:

i. The given set P does not have any undesirable productions.
ii. None of the given rules is in proper form. We introduce the new variables Ba, Bb and Bc
to stand for the terminals a, b and c. The symbols on each right-hand side are separated by
spaces so that the variable Ba is not mistaken for B followed by a.

For S → ABa, we have

S → A B Ba
Ba → a

For A → aab, we have

A → Ba Ba Bb
Bb → b

For B → Ac, we have

B → A Bc
Bc → c

Therefore, after performing (ii), the new set of productions is

S → A B Ba
A → Ba Ba Bb
B → A Bc
Ba → a
Bb → b
Bc → c

iii. In the set of productions above, we have

S → A B Ba
A → Ba Ba Bb

not in the proper form. Hence, we introduce new variables D1 and D2 and the new
productions

S → A D1
D1 → B Ba
A → Ba D2
D2 → Ba Bb

Thus the grammar in Chomsky Normal Form (CNF) is G2, given by the productions

S → A D1
D1 → B Ba
A → Ba D2
D2 → Ba Bb
B → A Bc
Ba → a
Bb → b
Bc → c
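Steps (ii) and (iii) of the procedure can be sketched in code. This is a minimal illustration under assumptions that are not part of this manual: a production is a pair (head, body) where body is a tuple of symbol names, the input has no undesirable productions (step i is already done), and the generated names of the form Ba and D1 do not clash with existing variables. The function name to_cnf is hypothetical.

```python
def to_cnf(productions, terminals):
    """Apply steps (ii) and (iii) of the CNF procedure.

    Assumed representation (not from the manual): `productions` is a list of
    (head, body) pairs, where body is a tuple of symbol names.
    """
    # Step ii: in bodies of length >= 2, replace each terminal a by a new
    # variable Ba and remember the production Ba -> a.
    terminal_var = {}
    step2 = []
    for head, body in productions:
        if len(body) >= 2:
            body = tuple(
                terminal_var.setdefault(sym, "B" + sym) if sym in terminals else sym
                for sym in body
            )
        step2.append((head, body))
    step2 += [(var, (sym,)) for sym, var in terminal_var.items()]

    # Step iii: split bodies longer than two with new variables D1, D2, ...
    result, counter = [], 0
    for head, body in step2:
        while len(body) > 2:
            counter += 1
            new_var = f"D{counter}"
            result.append((head, (body[0], new_var)))
            head, body = new_var, body[1:]
        result.append((head, body))
    return result


g1 = [("S", ("A", "B", "a")), ("A", ("a", "a", "b")), ("B", ("A", "c"))]
for head, body in to_cnf(g1, {"a", "b", "c"}):
    print(head, "->", " ".join(body))
```

Tuples of symbol names are used instead of plain strings so that a two-letter variable such as Ba stays a single symbol.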

Pre Lab Activity

Obtain the Chomsky Normal Form of grammar G given P as

S → a | b | cSS

Lab Activity

Activity 1: Remove all undesirable productions from

S → a | aA | B | C
A → aB | λ
B → Aa
C → cCD
D → ddd

Activity 2: Obtain the Chomsky Normal Form of grammar G given P as

S → abSb | a | aAb
A → bS | aAAb

Post Lab Activity

Requirement: Submit all the required answers on white paper.

a. Remove all undesirable productions from

S → aA | aBB
A → aaA | λ
B → bB | bbC
C → B

b. Obtain the Chomsky Normal Form of the grammar G given P as

S → AB | aB
A → aab | λ
B → bbA


LAB 10: Pushdown Automata

Learning Outcomes:

At the end of this laboratory session, the student will be able to:
• Understand the concept and application of pushdown automata
• Differentiate pushdown automata from finite automata
• Construct pushdown automata
• Define the language accepted by a pushdown automaton

Background

PUSHDOWN AUTOMATA (PDA)

Nondeterministic PDA

An NPDA is defined by the 7-tuple

M = (Q, Σ, Γ, δ, q0, Z, F)
where
Q = finite set of internal states of the control unit
Σ = input alphabet
Γ = finite set of symbols called the “stack alphabet”
δ : Q × (Σ ∪ {λ}) × Γ → finite subsets of Q × Γ* is the transition function
q0 ∈ Q = initial state of the control unit
Z ∈ Γ = stack start symbol
F ⊆ Q = set of final states

The arguments of δ are the current state of the control unit, the current input symbol, and the
current symbol on the top of the stack. The result is a set of pairs (q, x) where

q = next state of the control unit
x = string that is put on top of the stack in place of the single symbol there before.

The “stack” is an additional component available as part of a PDA. It gives the automaton
additional memory beyond its finite set of states.

Transition Function of NPDA

The transition function for an NPDA has the form

δ : Q × (Σ ∪ {λ}) × Γ → finite subsets of Q × Γ*

δ is now a function of three arguments. The first two arguments are the same as before
(NFA):
i. the state,
ii. either λ, or a symbol from the input alphabet.

The third argument is the symbol on top of the stack. Just as the input symbol is
“consumed” when the function is applied, the stack symbol is also “consumed” (removed
from the stack).

Note that while the second argument may be λ rather than a member of the input alphabet
(so that no input symbol is consumed), there is no such option for the third argument:
δ always consumes a symbol from the stack, and no move is possible if the stack is empty.

A move whose second argument is λ is called a λ-transition. It does not consume an input
symbol, but it still consumes the symbol on top of the stack.

Example: Consider the set of transition rules of an NPDA given by

δ(q1, a, b) = {(q2, cd), (q3, λ)}

If at any time the control unit is in state q1, the input symbol read is ‘a’, and the symbol on
the top of the stack is ‘b’, then one of the following two cases can occur:

a. the control unit goes into state q2 and the string ‘cd’ replaces ‘b’ on top
of the stack;
b. the control unit goes into state q3 with the symbol ‘b’ removed from the top of the
stack.

In the deterministic case, when the function δ is applied, the automaton moves to a new
state q ∈ Q and pushes a new string of symbols x ∈ Γ* onto the stack. In the nondeterministic
case, the result of applying δ is a finite set of (q, x) pairs.

Drawing NPDAs

NPDAs are not usually drawn. However, with a few minor extensions, we can draw an NPDA
similar to the way we draw an NFA.

Instead of labeling an arc with an element of Σ, we can label arcs with a / x, y where
a ∈ Σ ∪ {λ}, x ∈ Γ and y ∈ Γ*.

Let us consider the NPDA given by

M = ({q0, q1, q2, q3}, {a, b}, {0, 1}, δ, q0, 0, {q3})

where:
δ(q0, a, 0) = {(q1, 10), (q3, λ)}
δ(q0, λ, 0) = {(q3, λ)}
δ(q1, a, 1) = {(q1, 11)}
δ(q1, b, 1) = {(q2, λ)}
δ(q2, b, 1) = {(q2, λ)}
δ(q2, λ, 0) = {(q3, λ)}

This NPDA is drawn as follows:


Please note that the top of the stack is considered to be on the left, so that, for example, if we
read an ‘a’ from the starting position, the stack changes from ‘0’ to ‘10’.

Execution of NPDA

Assume that someone is in the middle of stepping through a string with a DFA, and we need
to take over and finish the job. There are two things that we need to know:

a. the state the DFA is in, and
b. what the remaining input is.

But if the automaton is an NPDA, we need to know one more thing: the contents of the stack.

Instantaneous Description of a PDA

The instantaneous description of a PDA is a triple (q, w, u), where

q = current state of the automaton
w = unread part of the input string
u = stack contents (written as a string, with the leftmost symbol as the top of the
stack)

Let the symbol “├” denote a move of the NPDA, and suppose that δ(q1, a, x) = {(q2, y), …},
then the following is possible:

(q1, aW, xz) ├ (q2, W, yz)

where W indicates the rest of the string following ‘a’ and z indicates the rest of the stack
contents underneath the x.

This notation tells that in moving from state q1 to state q2, an ‘a’ is consumed from the input
string aW, and the x at the top of the stack xz is replaced with y, leaving yz on the stack.
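A single move between instantaneous descriptions can be sketched in code. This is only an illustration under assumptions that are not part of this manual: δ is stored as a Python dictionary from (state, input symbol, stack top) to a set of (next state, pushed string) pairs, input and stack symbols are single characters, and the empty string plays the role of λ. The function name moves is hypothetical.

```python
def moves(delta, description):
    """Return every instantaneous description reachable in one move.

    Assumed representation (not from the manual): `description` is a triple
    (state, unread input, stack) with the top of the stack on the left.
    """
    state, unread, stack = description
    if not stack:
        return set()  # no move is possible if the stack is empty
    top, rest = stack[0], stack[1:]
    successors = set()
    # λ-transitions: the input is left untouched.
    for nxt, push in delta.get((state, "", top), ()):
        successors.add((nxt, unread, push + rest))
    # Transitions that consume the next input symbol.
    if unread:
        for nxt, push in delta.get((state, unread[0], top), ()):
            successors.add((nxt, unread[1:], push + rest))
    return successors


# δ(q1, a, x) = {(q2, y)} applied to (q1, ab, xz)
delta = {("q1", "a", "x"): {("q2", "y")}}
print(moves(delta, ("q1", "ab", "xz")))  # {('q2', 'b', 'yz')}
```

Here the unread input ab corresponds to aW with W = b, and the result (q2, b, yz) corresponds to (q2, W, yz).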

Accepting Strings with an NPDA

Assume that you have the NPDA given by

M = (Q, Σ, Γ, δ, q0, z, F)

To recognize a string w, begin with the instantaneous description

(q0, w, z)
where
q0 = start state
w = entire string to be processed, and
z = stack start symbol

Starting with this instantaneous description, make zero or more moves. There are two kinds
of moves that can be made:

a. λ-transition. If you are in state q1, x is the top symbol in the stack, and

δ(q1, λ, x) = {(q2, w2), …}

then you can replace the symbol x with the string w2 and move to q2.


b. Nonempty transitions. If you are in state q1, ‘a’ is the next unconsumed input
symbol, x is the top of the stack, and

δ(q1, a, x) = {(q2, w2), …}

then you can remove the ‘a’ from the input string, replace the symbol x with the string
w2 and move to state q2.

If you are in a final state when you reach the end of the string (possibly after making some
λ-transitions once the end is reached), then the string is accepted by the NPDA. It does not
matter what is on the stack.

Example: Let us consider the NPDA given by

δ(q0, a, 0) = {(q1, 10), (q3, λ)}
δ(q0, λ, 0) = {(q3, λ)}
δ(q1, a, 1) = {(q1, 11)}
δ(q1, b, 1) = {(q2, λ)}
δ(q2, b, 1) = {(q2, λ)}
δ(q2, λ, 0) = {(q3, λ)}

Is it possible to recognize the string ‘aaabbb’?

Solution: Yes, using the following sequence of “moves”:

(q0, aaabbb, 0) ├ (q1, aabbb, 10)
├ (q1, abbb, 110)
├ (q1, bbb, 1110)
├ (q2, bb, 110)
├ (q2, b, 10)
├ (q2, λ, 0)
├ (q3, λ, λ)

The last move uses δ(q2, λ, 0) = {(q3, λ)}. Since q3 is the final state of this NPDA
(F = {q3}) and all input has been consumed, the string is accepted.
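The search for an accepting sequence of moves can be sketched in code. This is only an illustration under assumptions that are not part of this manual: δ is stored as a Python dictionary, symbols are single characters, the empty string plays the role of λ, F = {q3} as in the definition of this NPDA given earlier, and the search is cut off after a fixed number of steps (the limit parameter) in case λ-transitions keep growing the stack. The function name accepts is hypothetical.

```python
from collections import deque

# Transition function of the example NPDA; "" stands for λ.
DELTA = {
    ("q0", "a", "0"): {("q1", "10"), ("q3", "")},
    ("q0", "", "0"): {("q3", "")},
    ("q1", "a", "1"): {("q1", "11")},
    ("q1", "b", "1"): {("q2", "")},
    ("q2", "b", "1"): {("q2", "")},
    ("q2", "", "0"): {("q3", "")},
}


def accepts(delta, start, stack_start, finals, word, limit=10_000):
    """Breadth-first search over instantaneous descriptions.

    Returns True if some sequence of moves consumes the whole word and ends
    in a final state (acceptance by final state).
    """
    seen = {(start, word, stack_start)}
    queue = deque(seen)
    while queue and limit > 0:
        limit -= 1
        state, unread, stack = queue.popleft()
        if not unread and state in finals:
            return True
        if not stack:
            continue  # no move is possible on an empty stack
        top, rest = stack[0], stack[1:]
        options = [("", unread)]  # a λ-transition keeps the input
        if unread:
            options.append((unread[0], unread[1:]))
        for symbol, remaining in options:
            for nxt, push in delta.get((state, symbol, top), ()):
                description = (nxt, remaining, push + rest)
                if description not in seen:
                    seen.add(description)
                    queue.append(description)
    return False


print(accepts(DELTA, "q0", "0", {"q3"}, "aaabbb"))  # True
print(accepts(DELTA, "q0", "0", {"q3"}, "aab"))     # False
```

Because the automaton is nondeterministic, the sketch explores all possible moves instead of following a single path.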

Acceptance

A string w ∈ Σ* is accepted by M if and only if

(q0, w, z) ├* (p, λ, u) where p ∈ F, u ∈ Γ*

• This is “acceptance by final state”.
• We care only that the computation ends in a final state with all input
consumed.

Alternatively, a string w ∈ Σ* is accepted by M if and only if

(q0, w, z) ├* (p, λ, λ) where p ∈ Q

• This is “acceptance by empty stack”.
• We care only that the computation ends with an empty stack and all input
consumed.


Languages of a NPDA

The notation “├ ” is used to indicate a single move of an NPDA.

“├* ” is used to indicate a sequence of zero or more moves.

“├+ ” is used to indicate a sequence of one or more moves.

If M = (Q, Σ, Γ, δ, q0, Z, F) is an NPDA, then the language accepted by M, L(M), is given by

L(M) = { w ∈ Σ* | (q0, w, Z) ├* (p, λ, u), p ∈ F, u ∈ Γ* }

L(M) = set of all strings accepted by final state

or

N(M) = { w ∈ Σ* | (q0, w, Z) ├* (p, λ, λ), p ∈ Q }

N(M) = set of all strings accepted by empty stack
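The difference between L(M) and N(M) can be seen on a small machine. The sketch below is only an illustration: the NPDA in it is hypothetical and is not one of the machines in this manual, δ is stored as a Python dictionary, symbols are single characters, the empty string plays the role of λ, and the search is cut off after a fixed number of steps. The function name acceptance is hypothetical as well.

```python
def acceptance(delta, start, stack_start, finals, word, limit=10_000):
    """Return (accepted by final state, accepted by empty stack)."""
    by_final_state = by_empty_stack = False
    seen = {(start, word, stack_start)}
    todo = list(seen)
    while todo and limit > 0:
        limit -= 1
        state, unread, stack = todo.pop()
        if not unread:  # all input consumed
            by_final_state = by_final_state or state in finals
            by_empty_stack = by_empty_stack or stack == ""
        if not stack:
            continue  # no move is possible on an empty stack
        top, rest = stack[0], stack[1:]
        options = [("", unread)]
        if unread:
            options.append((unread[0], unread[1:]))
        for symbol, remaining in options:
            for nxt, push in delta.get((state, symbol, top), ()):
                description = (nxt, remaining, push + rest)
                if description not in seen:
                    seen.add(description)
                    todo.append(description)
    return by_final_state, by_empty_stack


# Hypothetical NPDA with stack start symbol Z and F = {q0}.
DELTA = {
    ("q0", "a", "Z"): {("q0", "AZ")},
    ("q0", "a", "A"): {("q0", "AA")},
    ("q0", "b", "A"): {("q1", "")},
    ("q1", "b", "A"): {("q1", "")},
    ("q1", "", "Z"): {("q1", "")},
}

for word in ["", "aa", "ab", "aabb", "aab"]:
    print(repr(word), acceptance(DELTA, "q0", "Z", {"q0"}, word))
```

For this machine the two languages differ: the strings a^n are accepted by final state only, while the strings a^n b^n with n ≥ 1 are accepted by empty stack only.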

Pre Lab Activity

Given the PDA

M = ({q0, q1}, {a, b}, {a, z0}, δ, q0, z0, ∅)

where δ is given by
δ(q0, a, z0) = {(q0, az0)}
δ(q0, a, a) = {(q0, aa)}
δ(q0, b, a) = {(q1, a)}
δ(q1, b, a) = {(q1, a)}
δ(q1, a, a) = {(q1, λ)}
δ(q1, λ, z0) = {(q1, λ)}

Provide the following:

a. The diagram for the NPDA
b. The language accepted by the NPDA

Lab Activity

Given the PDA

M = ({q0, q1, q2}, {a, b}, {a, z0}, δ, q0, z0, ∅)

where δ is given by
δ(q0, a, z0) = {(q1, az0)}
δ(q1, a, a) = {(q1, aa)}
δ(q1, b, a) = {(q1, a)}
δ(q2, b, a) = {(q1, λ)}
δ(q1, λ, z0) = {(q1, λ)}

Provide the following:

a. The diagram for the NPDA
b. The language accepted by the NPDA


Post Lab Activity

Given the PDA

M = ({q0, q1}, {a, b, c}, {z0, z1}, δ, q0, z0, ∅)

where δ is given by
δ(q0, a, z0) = {(q0, z1z0)}
δ(q0, a, z1) = {(q0, z1z1)}
δ(q0, b, z1) = {(q1, λ)}
δ(q1, b, z1) = {(q1, λ)}
δ(q1, c, z0) = {(q1, z0)}
δ(q1, λ, z0) = {(q1, λ)}

Provide the following:

a. The diagram for the NPDA
b. The language accepted by the NPDA


RESOURCES / TOOLS

• Windows Operating System
• JFLAP 7.0

REFERENCE MATERIALS:

• J.E. Hopcroft, R. Motwani and J.D. Ullman, “Introduction to Automata Theory,
Languages, and Computation”, Second Edition, Pearson Education, 2003
• S.P. Eugene Xavier, “Theory of Automata, Formal Languages and Computation”,
New Age International Publishers, 2005
• Peter Linz, “An Introduction to Formal Languages and Automata”, 3rd Edition,
Jones and Bartlett Publishers Inc., 2001
