Pset2 Solutions


Theory of Computation ’24

Problem Set 2

Notations. Let Σ = {a, b}. For w ∈ Σ⋆, let |w| denote the length of w. Let #a(w) denote the
number of a’s in w and let #b(w) denote the number of b’s in w.

1. Definition 1. Let Σ and Γ be two finite alphabets. A function f : Σ∗ −→ Γ∗ is called a
string homomorphism if for all x, y ∈ Σ∗, f(x · y) = f(x) · f(y).
Prove that the class of regular languages is closed under homomorphisms. That is, prove
that if L ⊆ Σ∗ is a regular language, then so is f (L) = {f (x) | x ∈ L}. Here, it is advisable
to informally describe how you will turn a DFA for L into an NFA for f (L).
Solution.
We make the following observation about the homomorphism operation. For any string
x = x0 · x1 · · · xn with each xi ∈ Σ, it follows by induction on n that
f(x) = f(x0 · x1 · · · xn) = f(x0) · f(x1) · · · f(xn)
So f is completely determined by its values on the single symbols of Σ.
Let’s see how to convert a DFA for L to an NFA for f (L) informally. An NFA for f (L) can
keep the same transitions from one state to the next. The catch is that the symbol σ labelling
a transition needs to be replaced by its image f(σ), which is a string. Now, how do we go
from one state p to another state q using some string f (σ)? We introduce artificial states
in between p and q. Specifically, we add a DFA D recognizing the language {f (σ)} in the
middle and add epsilon-transitions from state p to the start state of D and from the accept
state(s) of D to q. Note that the accept states of D are not the accept states of our NFA
for f (L). (The same can be done by using an NFA recognizing {f (σ)}).
Formally, let M (L) = (Q, Σ, δ, q0 , F ) be a DFA that recognizes a given regular language L.
Let M (f (σ)) denote the DFA recognizing {f (σ)} for each σ ∈ Σ. Then, we construct an
NFA N (f (L)) = (Q′ , Γ, δ ′ , q0′ , F ′ ).

• Q′ consists of the states of Q together with the states of a fresh copy of M(f(σ)) for
every transition δ(q1, σ) = q2 of M(L)
• Γ is the target alphabet of the homomorphism
• q0′ = q0
• F′ = F
• The transition function δ′ : Q′ × (Γ ∪ {ϵ}) → 2^{Q′} will be loosely defined as follows. We will
need multiple copies of each M(f(σ)). We replace the transition δ(q1, σ) = q2 between
two states q1, q2 ∈ Q as follows. Take a copy C of M(f(σ)) and place it between q1 and
q2, keeping the transitions inside C. Draw ϵ-transitions from q1 to the start state of C
and from the accept states of C to q2.

It is easy to see that N(f(L)) recognizes f(L). To see this, consider any string
x = x0 · x1 · · · xn ∈ L. This string goes through some sequence of states in M(L) as follows.
$$q_0 = q^{(0)} \xrightarrow{x_0} q^{(1)} \xrightarrow{x_1} q^{(2)} \cdots q^{(n)} \xrightarrow{x_n} q^{(n+1)}$$
where q^{(i)} represents the state reached after reading x0 · x1 · · · xi−1. By construction, f(x) =
f(x0) · f(x1) · · · f(xn) ∈ f(L) goes through a similar sequence of states in N(f(L)).
$$q_0 = q^{(0)} \xrightarrow{f(x_0)} q^{(1)} \xrightarrow{f(x_1)} q^{(2)} \cdots q^{(n)} \xrightarrow{f(x_n)} q^{(n+1)}$$

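The only difference is that reading each string f(xi) takes N(f(L)) through some additional
intermediate states that help ‘recognize’ f(xi).
The final states of M(L) and N(f(L)) are the same. So, if q^{(n+1)} ∈ F, then x is
accepted by M(L) and f(x) is accepted by N(f(L)); hence every string of f(L) is accepted.
Conversely, any accepting run of N(f(L)) visits a sequence of states of Q, and between two
consecutive such states p and q it passes through one copy C and reads exactly f(σ) for some
σ with δ(p, σ) = q. The string read is therefore f(x) for some x that takes M(L) from q0 to a
state of F, that is, for some x ∈ L. In other words, N(f(L)) accepts a string w if and only if
w = f(x) for some x accepted by M(L). This means that N(f(L)) recognizes f(L).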
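The construction can be tried out on a small machine. The following is a minimal sketch, not the construction verbatim: instead of plugging in a copy of M(f(σ)) with ϵ-transitions on both sides, it threads a chain of fresh intermediate states that spells out f(σ), and it uses a single ϵ-move when f(σ) is empty. The function names, the example DFA (even number of a's) and the example homomorphism (a ↦ 01, b ↦ ϵ) are all assumptions made for illustration.

```python
from itertools import count

def homomorphic_nfa(dfa_delta, f):
    """Turn DFA transitions into NFA transitions for the image language.

    dfa_delta: dict (state, symbol) -> state
    f:         dict symbol -> image string over the target alphabet (may be "")
    Returns a dict (state, label) -> set of states; label "" is an epsilon-move.
    """
    fresh = count()
    nfa = {}

    def add(p, label, q):
        nfa.setdefault((p, label), set()).add(q)

    for (p, sigma), q in dfa_delta.items():
        image = f[sigma]
        if image == "":
            add(p, "", q)          # f(sigma) is empty: move without reading
            continue
        prev = p
        for ch in image[:-1]:      # fresh intermediate states spell out f(sigma)
            mid = ("mid", next(fresh))
            add(prev, ch, mid)
            prev = mid
        add(prev, image[-1], q)
    return nfa

def eps_closure(nfa, states):
    """All states reachable from `states` using epsilon-moves only."""
    stack, seen = list(states), set(states)
    while stack:
        s = stack.pop()
        for t in nfa.get((s, ""), ()):
            if t not in seen:
                seen.add(t)
                stack.append(t)
    return seen

def nfa_accepts(nfa, start, accepting, word):
    """Standard NFA acceptance: some run ends in an accepting state."""
    current = eps_closure(nfa, {start})
    for ch in word:
        moved = set()
        for s in current:
            moved |= nfa.get((s, ch), set())
        current = eps_closure(nfa, moved)
    return bool(current & accepting)

# Hypothetical example: L = strings over {a, b} with an even number of a's.
DFA_DELTA = {("even", "a"): "odd", ("even", "b"): "even",
             ("odd", "a"): "even", ("odd", "b"): "odd"}
F_IMAGE = {"a": "01", "b": ""}
NFA = homomorphic_nfa(DFA_DELTA, F_IMAGE)
```

With this choice of f, the image f(L) consists of the strings (01)^n with n even, so `nfa_accepts(NFA, "even", {"even"}, "0101")` holds while the same call on "01" does not.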
2. Let Lk = {x ∈ {0, 1}∗ | |x| ≥ k and the k’th character of x from the end is a 1}. Prove that
every DFA that recognizes Lk has at least 2^k states. Also show that, on the other hand,
there is an NFA with O(k) states that recognizes Lk .
Solution. This one can be proved directly using the pigeonhole principle. But now that we
have seen the notion of distinguishable strings and used it to prove a lower bound on the
number of states of a DFA, let us use it.
Consider any two different k-bit strings x = x1 · · · xk and y = y1 · · · yk and let i be some
position such that xi ≠ yi (there must be at least one such position). Hence, one of the
strings contains a 1 in the ith position, while the other contains a 0. Let z = 0^{i−1}. Then
xz and yz have length k + i − 1, so their kth bits from the end are xi and yi respectively.
Thus z is a distinguishing extension of x and y with respect to Lk, since exactly one of xz
and yz has the kth bit from the end equal to 1. Since there are 2^k binary strings of length k,
which are all mutually distinguishable by the above argument, any DFA for the language
must have at least 2^k states.
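The distinguishing-extension argument can be checked exhaustively for small k. The sketch below picks the first position i where two k-bit strings differ, appends i − 1 zeros to both, and confirms that exactly one of the results has a 1 in the kth position from the end. The function names are introduced here for illustration only.

```python
from itertools import product

def kth_from_end_is_one(w, k):
    """Membership test for L_k: length at least k and a 1 in position k from the end."""
    return len(w) >= k and w[-k] == "1"

def distinguishing_extension(x, y):
    """Return z = 0^(i-1), where i is the first (1-based) position with x_i != y_i."""
    i = next(j for j in range(len(x)) if x[j] != y[j]) + 1
    return "0" * (i - 1)

def all_pairs_distinguished(k):
    """Check that every pair of distinct k-bit strings is separated by its extension."""
    words = ["".join(bits) for bits in product("01", repeat=k)]
    for x in words:
        for y in words:
            if x == y:
                continue
            z = distinguishing_extension(x, y)
            if kth_from_end_is_one(x + z, k) == kth_from_end_is_one(y + z, k):
                return False
    return True
```

Calling `all_pairs_distinguished(k)` for k = 1, ..., 5 is a cheap sanity check of the indexing, in particular that z has length i − 1 and not i.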
NFA with k + 1 states. We construct an NFA with Q = {0, 1, 2, · · · , k}, where state i ≥ 1
means that the NFA has guessed the kth bit from the end and has read i of the last k bits.
Define δ(0, 0) = {0}, δ(0, 1) = {0, 1} and δ(i − 1, 0) = δ(i − 1, 1) = {i} for 2 ≤ i ≤ k; state k
has no outgoing transitions. We set q0 = 0 and F = {k}.
The machine starts at state 0; on seeing a 1 it may guess that it is the kth bit from the end
and proceed to state 1. Observe that the transitions ensure that the NFA is in state k at
the end of the input, and accepts, if and only if there are exactly k − 1 bits following the 1
on which it moved from 0 to 1.
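A direct simulation of this (k + 1)-state NFA, tracking the set of states it could be in, makes the behaviour easy to test against the definition of Lk. This is a sketch for illustration; the function names are not from the problem set.

```python
from itertools import product

def nfa_accepts_Lk(w, k):
    """Simulate the NFA with states 0..k by tracking the set of reachable states."""
    current = {0}
    for bit in w:
        nxt = set()
        for s in current:
            if s == 0:
                nxt.add(0)             # stay in 0 on either bit
                if bit == "1":
                    nxt.add(1)         # guess: this 1 is the kth bit from the end
            elif s < k:
                nxt.add(s + 1)         # count one more of the last k bits
            # state k has no outgoing transitions, so that run dies
        current = nxt
    return k in current

def kth_from_end_is_one(w, k):
    """Membership in L_k, straight from the definition."""
    return len(w) >= k and w[-k] == "1"

def agrees_with_definition(k, max_len):
    """Compare the NFA with the definition on all binary strings up to max_len."""
    for n in range(max_len + 1):
        for bits in product("01", repeat=n):
            w = "".join(bits)
            if nfa_accepts_Lk(w, k) != kth_from_end_is_one(w, k):
                return False
    return True
```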
3. A devilish NFA is the same as an NFA, except that we define the acceptance criterion of a
devilish NFA as follows. We say that a devilish NFA N accepts x if and only if every run of
N on x ends in an accepting state. Prove that a language is recognized by a devilish NFA
if and only if the language is regular.
Solution.
We have seen two interpretations of an ordinary NFA: it proceeds magically by guessing the
correct choice at each step, or it computes every possible run on the input string in parallel.
The second interpretation says (in a way) that the NFA keeps track of all the possible states
it can be in after reading some string.

Proof.
(=⇒) Suppose a language L is recognized by a devilish NFA ND(L) with state set Q and
accept states F. We show that L is regular by converting ND(L) to an equivalent DFA M(L).
As in the usual conversion of an NFA to a DFA, the states of M(L) are the subsets of Q,
where a state Q′ ⊆ Q records that the runs of ND(L) on the input read so far end in
exactly the states of Q′. The transitions are also defined in the same manner as in the NFA
to DFA conversion. The difference is in the accept states: M(L) accepts in every subset Q′
with Q′ ⊆ F, because Q′ ⊆ F says precisely that all runs of the string ended in an accept
state. (Requiring Q′ = F would be too strict, since the runs need not reach every accept
state.) Here we assume that every run reads the whole input; if runs that get stuck should
count as rejecting, first add a non-accepting sink state so that no run gets stuck.
Hence, every devilish NFA can be converted into an equivalent DFA, and any language
recognized by a devilish NFA is regular.
(⇐=) Given that a language L is regular, we show that it is recognized by some devilish
NFA. Looking carefully at the definition of the devilish NFA, it states, ”A devilish NFA is
the same as an NFA but . . .”. We know that every DFA is an NFA. Also, a string has
exactly one run on a DFA. Hence, if a string is accepted by a DFA, then it is accepted by
all runs of the string on the DFA, and if it is rejected, its only run ends in a non-accepting
state. Then, by definition, every DFA, viewed as a devilish NFA, recognizes the same
language. Since L is regular, there exists a DFA M(L) recognizing it. By the above argument,
this DFA is a valid devilish NFA for L.
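The conversion in the forward direction can be compared with the definition on a small machine. The sketch below assumes a total transition relation (every state has at least one successor on every symbol, so no run gets stuck) and no ϵ-transitions. It accepts in the subset automaton when the set of states reached is contained in F, which is what "every run ends in an accepting state" amounts to. The example machine and all names are made up for illustration.

```python
from itertools import product

# Hypothetical devilish NFA over {a, b} with states 0, 1, 2 and F = {0, 1}.
DELTA = {
    (0, "a"): {0, 1}, (0, "b"): {2},
    (1, "a"): {1},    (1, "b"): {0},
    (2, "a"): {2},    (2, "b"): {0, 2},
}
START, ACCEPTING = 0, frozenset({0, 1})

def all_runs_accept(word):
    """Devilish acceptance by definition: list the end state of every run."""
    ends = [START]
    for ch in word:
        ends = [t for s in ends for t in DELTA[(s, ch)]]
    return all(s in ACCEPTING for s in ends)

def subset_dfa_accepts(word):
    """Subset construction: accept iff the reached subset is contained in F."""
    current = frozenset({START})
    for ch in word:
        current = frozenset(t for s in current for t in DELTA[(s, ch)])
    return current <= ACCEPTING

def agree_up_to(max_len):
    """Compare both acceptance tests on all words up to the given length."""
    for n in range(max_len + 1):
        for letters in product("ab", repeat=n):
            w = "".join(letters)
            if all_runs_accept(w) != subset_dfa_accepts(w):
                return False
    return True
```

On the empty word the reached subset is {0}, a proper subset of F, and the single run ends in an accepting state, so the word is accepted.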
4. If A is any language, let A_{1/2−} denote the set of all first halves of strings in A, so that
$$A_{\frac{1}{2}-} = \{x \mid \text{for some } y,\ |x| = |y| \text{ and } xy \in A\}$$
Show that if A is regular, then so is A_{1/2−}.
Solution. Here is one way to do this. Let’s say A is given by an NFA (abusing notation, we
also call this NFA A), A = (Q, Σ, δ, q0, qacc). Here we use the fact that any NFA with
multiple accept states can be simulated by an NFA with only a single accept state (why?).
Let’s build an NFA B = (Q′, Σ, δ′, q0′, F′), defined as follows:
• The states Q′ are of the form [q1 , q2 ] where q1 ∈ Q, q2 ∈ Q
• The start state of B is [q0 , qacc ].
• δ ′ ([q1 , q2 ], a) = {[u, v] ∈ Q × Q : u ∈ δ(q1 , a), ∃b ∈ Σ, q2 ∈ δ(v, b)}
• The accepting states of B are F ′ = {[q, q] : q ∈ Q}.
Intuition. The main idea behind the above construction is the following. The computation
of the NFA B can be thought of as ‘simulating’ two different computations on A using the
‘pairs’: one forward, the other backward. Imagine putting a finger on the start state of
A (which is q0) and another one on the accept state of A. Now, suppose you read one
symbol from the input to B; this is possibly the first symbol of a potential x which can be
accepted. You move the first finger in A as per the transition. However, the second finger
does not really know where to go. Why? Because you do not know what y is, let alone its
last symbol. But here comes in the power of non-determinism. You just ‘guess’ the last
symbol of y and move the second finger backward along a transition on that symbol. This is
easily simulated in B using the construction we have done.
Now it is not difficult to see that you will accept x if, at the end of x, both your fingers are
at exactly the same position in A. This is simulated by the accept states defined for B.

To prove this formally, let x be a string accepted by the NFA B. Let δ̂′ be the extended
transition function of δ′, so that q′ ∈ δ̂′(q, x) if and only if there exists a path in the NFA B
from q to q′ labelled with the string x (and similarly δ̂ for A). We prove that there exists a y
such that |x| = |y| and xy ∈ A. By definition of the accept states of B, [q, q] ∈ δ̂′([q0, qacc], x)
for some q ∈ Q. Each state of B is a 2-tuple, so consider the 2-tuples on an accepting path
from [q0, qacc] to [q, q]. By definition, if [q1′, q2′] ∈ δ′([q1, q2], a) for some symbol a of x, then
q1′ ∈ δ(q1, a). Hence q ∈ δ̂(q0, x). On the other hand, at each such step there exists some
symbol a′ such that q2 ∈ δ(q2′, a′). Reading these symbols in the reverse order of the steps
gives a string y such that qacc ∈ δ̂(q, y). (Ideally we should prove the last two claims by
induction, but this is sort of an overkill here.) Since each step contributes exactly one symbol
to y, we have |x| = |y|. Hence xy is accepted by A.
Conversely, consider a string x ∈ A_{1/2−}. Then there exists a y such that |x| = |y| and
xy ∈ A. We first prove the following by induction. For every 0 ≤ ℓ ≤ |x|, let xℓ be
the ℓ-length prefix of x. Suppose q1′ ∈ δ̂(q0, xℓ) and there exists a string yℓ of length ℓ
such that qacc ∈ δ̂(q2′, yℓ). Then [q1′, q2′] ∈ δ̂′([q0, qacc], xℓ). I am leaving the proof of this
hypothesis to you; it is not hard.
Once we have this, let us go back to the string xy ∈ A where |x| = |y|. There
has to be a state q such that q ∈ δ̂(q0, x) and qacc ∈ δ̂(q, y). Applying the claim with
ℓ = |x|, q1′ = q2′ = q and yℓ = y gives [q, q] ∈ δ̂′([q0, qacc], x), and [q, q] ∈ F′. Thus x is
accepted by B.
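The pair construction can be tested against the definition of the first-half language on a small example. In the sketch below, the NFA A (strings over {a, b} that end in ab, with single accept state 2 and no ϵ-transitions) and all function names are assumptions chosen for illustration. `b_accepts` follows the definition of δ′ and F′ given above, and `in_half_by_definition` simply tries every y of the same length as x.

```python
from itertools import product

# Hypothetical NFA A: strings over {a, b} ending in "ab"; single accept state 2.
SIGMA = "ab"
Q = {0, 1, 2}
DELTA = {(0, "a"): {0, 1}, (0, "b"): {0}, (1, "b"): {2}}
Q0, QACC = 0, 2

def step(q, ch):
    """Transition relation of A; missing entries mean no move."""
    return DELTA.get((q, ch), set())

def a_accepts(word):
    """Ordinary NFA acceptance for A."""
    current = {Q0}
    for ch in word:
        current = {t for s in current for t in step(s, ch)}
    return QACC in current

def b_accepts(x):
    """Simulate B: first component moves forward on x, second moves backward."""
    pairs = {(Q0, QACC)}
    for ch in x:
        pairs = {
            (u, v)
            for (q1, q2) in pairs
            for u in step(q1, ch)                       # forward finger
            for v in Q
            if any(q2 in step(v, b) for b in SIGMA)     # backward finger, guessed symbol
        }
    return any(u == v for (u, v) in pairs)              # fingers meet

def in_half_by_definition(x):
    """x is a first half iff some y of the same length gives xy in A."""
    return any(a_accepts(x + "".join(y)) for y in product(SIGMA, repeat=len(x)))

def agree_up_to(max_len):
    """Compare B with the definition on all strings up to max_len."""
    for n in range(max_len + 1):
        for letters in product(SIGMA, repeat=n):
            x = "".join(letters)
            if b_accepts(x) != in_half_by_definition(x):
                return False
    return True
```

For this A, the only first half of length 1 is a (from ab), and every string of length at least 2 is a first half, since y can always be chosen to end in ab.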
5. Write the regular expressions corresponding to the following languages.
a. L = {w | #a(w) ≡ 1 (mod 2)}.
Solution. So this one wants you to find a reg-ex for strings that contain an odd number
of a’s. A nice and standard technique to solve these problems is to write down examples
and break down the string into patterns that are going to be repeated. In this case, since
you need an odd number of a’s (that is, a number of the form 2k + 1, for k ≥ 0), observe
that any string with this property can be broken down as follows: begin with zero or
more b’s, followed by an a (this takes care of the +1). Now the following pattern repeats:
zero or more b’s followed by another a, again followed by zero or more b’s and finally an
a (this takes care of the 2k part). Finally there could be a bunch of b’s at the end. Hence,
it is now easy to see that one possible reg-ex for L is
(b⋆a)(b⋆ab⋆a)⋆b⋆

Again, the solution itself is not so important. What you should remember is the technique.
Also note that there could be simpler regular expressions for the above.
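One way to gain confidence in such an expression is to compare it with the defining condition on every short string. The following sketch does this with Python's `re` module; the names `ODD_A` and `agrees_up_to` are introduced here for illustration, and a check on short strings is of course evidence, not a proof.

```python
import re
from itertools import product

# The expression from part (a), written in Python regex syntax.
ODD_A = re.compile(r"(b*a)(b*ab*a)*b*")

def agrees_up_to(max_len):
    """Compare the regex with 'odd number of a's' on all strings up to max_len."""
    for n in range(max_len + 1):
        for letters in product("ab", repeat=n):
            w = "".join(letters)
            if bool(ODD_A.fullmatch(w)) != (w.count("a") % 2 == 1):
                return False
    return True
```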
b. L = {w | every other letter in w is a}
Solution. Very easy.
c. L = {w contains an odd number of a’s and an even number of b’s}
Solution. This is a tricky one. Here is one possible solution. Every string in L has odd
length, so write w = uc where c is the last letter and u has even length. Now read u two
letters at a time: the blocks aa and bb leave the parities of the number of a’s and b’s
unchanged, while ab and ba flip both of them. So the problem reduces to figuring out a
reg-ex for even-length strings with an even number of a’s and b’s, which is slightly simpler
than the original task. Any such string is a repetition of aa, bb, or a flip that is later undone
by a second flip with only aa and bb in between:
(aa + bb + (ab + ba)(aa + bb)⋆(ab + ba))⋆
If c = a, then u has an even number of a’s and b’s, so u is of the above form. If c = b, then
u has an odd number of a’s and an odd number of b’s, so u is a string of the above form
followed by one more flip and then only aa and bb blocks. Now we can write the final reg-ex
(aa + bb + (ab + ba)(aa + bb)⋆(ab + ba))⋆(a + (ab + ba)(aa + bb)⋆b)
Note that it is tempting to use only the patterns aa, bb, abba, baab, abab, baba for the first
part, as in (aa + bb + abba + baab + abab + baba)⋆(a + bab)(bb)⋆, but this does not work:
for example, abaab has three a’s and two b’s and is not matched by that expression.
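Expressions for this language are easy to get subtly wrong, so a brute-force comparison against the parity condition on all short strings is a worthwhile sanity check. The sketch below (Python `re` syntax, with `|` in place of +) runs it on two candidates: one built from the six blocks aa, bb, abba, baab, abab, baba followed by (a + bab)(bb)⋆, and one that reads the string two letters at a time, treating aa and bb as parity-preserving blocks and ab and ba as parity-flipping blocks. The names `SIX_BLOCKS`, `PAIR_BASED`, `in_language` and `first_mismatch` are introduced here for illustration.

```python
import re
from itertools import product

# Candidate 1: six fixed blocks, then (a + bab), then (bb)*.
SIX_BLOCKS = r"(?:aa|bb|abba|baab|abab|baba)*(?:a|bab)(?:bb)*"
# Candidate 2: pairs aa/bb keep both parities, pairs ab/ba flip both.
PAIR_BASED = r"(?:aa|bb|(?:ab|ba)(?:aa|bb)*(?:ab|ba))*(?:a|(?:ab|ba)(?:aa|bb)*b)"

def in_language(w):
    """Odd number of a's and even number of b's."""
    return w.count("a") % 2 == 1 and w.count("b") % 2 == 0

def first_mismatch(pattern, max_len):
    """Return the first string (by length, then a before b) on which the
    pattern and the parity condition disagree, or None if there is none."""
    regex = re.compile(pattern)
    for n in range(max_len + 1):
        for letters in product("ab", repeat=n):
            w = "".join(letters)
            if bool(regex.fullmatch(w)) != in_language(w):
                return w
    return None
```

On strings of length up to 9 the pair-based candidate shows no mismatch, while the six-block candidate first disagrees on abaab, which has three a's and two b's but cannot be split into the six blocks followed by a or bab.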
