Theory of computer science
Unit 1
Basic Concepts and Finite Automata
Overview
●Importance of TCS
●Alphabets, Strings, Languages, Closure properties.
●Finite Automata (FA) and Finite State machine (FSM).
●Deterministic Finite Automata (DFA) and Nondeterministic Finite Automata
(NFA): Definitions, transition diagrams and Language recognizers
●NFA to DFA Conversion
●Equivalence between NFA with and without ε- transitions
●Minimization of DFA
●FSM with output: Moore and Mealy machines, Equivalence
●Applications and limitations of FA
What is automata theory ?
●Study of abstract computing devices, or “machines”
●Automaton = an abstract computing device
●Note: A “device” need not even be physical hardware!
●A fundamental question in computer science:
●Find out what different models of machines can do and cannot do
●The theory of computation
Theory of Computation: A Historical Perspective
Alphabet
An alphabet is a finite, non-empty set of symbols
We use the symbol ∑ (sigma) to denote an alphabet
Examples:
1.Binary: ∑ = {0,1}
2.All lower case letters: ∑ = {a,b,c,..z}
3.Alphanumeric: ∑ = {a-z, A-Z, 0-9}
4.DNA molecule letters: ∑ = {a,c,g,t}
Strings
A string or word is a finite sequence of symbols chosen from ∑
1.Empty string is ε (or “epsilon”): the string with zero occurrences of symbols.
2.Length of a string w, denoted “|w|”, is the number of (non-ε) characters in the string.
E.g., x = 010100, |x| = 6
x = 01ε0ε1ε00ε, |x| = ?
NOTE: |ε| = 0
Power of an alphabet
Let ∑ be an alphabet.
∑^k = the set of all strings of length k over ∑
∑* = ∑^0 ∪ ∑^1 ∪ ∑^2 ∪ … is the set of all strings over the alphabet ∑
∑+ = ∑^1 ∪ ∑^2 ∪ ∑^3 ∪ …
NOTE: ∑^0 = {ε} regardless of what alphabet ∑ is, i.e., ε is the only string whose
length is 0.
For ∑ = {0,1}:
∑^1 = {0,1}
∑^2 = {00,01,10,11} and so on….
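As a quick sanity check of these definitions, Σ^k can be enumerated directly (a minimal sketch; the helper name `sigma_k` is mine, not from the notes):

```python
from itertools import product

def sigma_k(alphabet, k):
    """Sigma^k: the set of all strings of length k over the alphabet."""
    return {"".join(p) for p in product(sorted(alphabet), repeat=k)}

sigma = {"0", "1"}
assert sigma_k(sigma, 0) == {""}                      # Sigma^0 = {epsilon}
assert sigma_k(sigma, 2) == {"00", "01", "10", "11"}  # |Sigma^k| = |Sigma|**k
```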
Language
L is said to be a language over alphabet ∑, only if L ⊆ ∑*
This is because ∑* is the set of all strings (of all possible length including 0)
over the given alphabet ∑
Examples:
1.Let L be the language of all strings consisting of n 0’s followed by n 1’s:
L = {ε, 01, 0011, 000111, …}
2.Let L be the language of all strings with an equal number of 0’s and 1’s:
L = {ε, 01, 10, 0011, 1100, 0101, 1010, 1001, …}
Question
Finite Automata
Some Applications:
●Software for designing and checking the behavior of digital circuits
●Lexical analyzer of a typical compiler
●Software for scanning large bodies of text (e.g., web pages) for pattern
finding
●Software for verifying systems of all types that have a finite number of states
(e.g., stock market transaction, communication/network protocol)
●The “Regular Languages” are exactly the ones that can be described by finite automata.
●A finite automaton has a set of states and its control moves from state to state in
response to external inputs.
●One of the crucial distinctions among classes of finite automata is whether that control
is deterministic, meaning that the automaton cannot be in more than one state at any one
time, or nondeterministic, meaning that it may be in several states at once.
●Adding nondeterminism does not let us define any language that cannot be defined by a
deterministic finite automaton, but there can be substantial efficiency in describing an
application using a nondeterministic automaton.
●Nondeterminism allows us to program solutions to problems using a higher-level
language. The nondeterministic finite automaton is then compiled, by an algorithm,
into a deterministic automaton that can be executed on a conventional computer.
Contd..
●Informally, a state diagram that comprehensively captures all possible states and
transitions that a machine can take while responding to a stream or sequence of
input symbols.
●Recognizer for “Regular Languages”
1.Deterministic Finite Automata (DFA)
The machine can exist in only one state at any given time
2.Non-deterministic Finite Automata (NFA)
The machine can exist in multiple states at the same time
Deterministic Finite Automata - Definition
A Deterministic Finite Automaton (DFA) consists of:
Q ==> a finite set of states
Σ ==> a finite set of input symbols (alphabet)
q0 ==> a start state (q0 ∈ Q)
F ==> set of accepting states (F ⊆ Q)
δ ==> a transition function, which is a mapping δ: Q × Σ → Q
The transition function takes as arguments a state and an input symbol and returns a
state.
A DFA is defined by the 5-tuple: (Q, Σ, δ, q0, F)
Graph representation of DFA
●Nodes = states.
●Arcs represent transition function: Arc from state p to state q labeled by all those
input symbols that have transitions from p to q.
●Arrow labeled “Start” to the start state.
●Final states indicated by double circles
What does a DFA do on reading an input string?
●Input: a word w in Σ*
●Question: Is w acceptable by the DFA?
●Steps:
1.Start at the “start state” q0.
2.For every input symbol in the sequence w do, Compute the next state from the current
state, given the current input symbol in w and the transition function.
3.If after all symbols in w are consumed, the current state is one of the accepting states
(F) then accept w;
4.Otherwise, reject w.
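The four steps above can be sketched directly in code (a minimal illustration; the dictionary encoding of δ and the state names q0–q2 are my own choices). The example DFA is the one used later in these notes: it accepts strings containing the substring 01.

```python
def run_dfa(delta, start, accept, w):
    """Simulate a DFA: follow delta from the start state on each symbol of w."""
    state = start
    for sym in w:
        state = delta[(state, sym)]
    return state in accept

# DFA accepting binary strings that contain the substring "01"
delta = {
    ("q0", "0"): "q1", ("q0", "1"): "q0",
    ("q1", "0"): "q1", ("q1", "1"): "q2",
    ("q2", "0"): "q2", ("q2", "1"): "q2",
}
assert run_dfa(delta, "q0", {"q2"}, "11010")      # contains 01 -> accept
assert not run_dfa(delta, "q0", {"q2"}, "1110")   # no 01 -> reject
```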
Simple notations for DFA
●A transition diagram
Example
The transition diagram for the DFA accepting all strings with a substring 01
●A transition table:
Example
Regular Grammar
●Let L(A) be a language recognized by a DFA A, then L(A) is called a “Regular
Language”.
●Locate regular languages in the Chomsky Hierarchy.
The Chomsky Hierarchy
Example 1
●Build a DFA for the following language:
L = {w | w is a binary string that contains 01 as a substring}
●Steps for building a DFA to recognize L:
1.Σ = {0,1}
2.Decide on the states: Q
3.Designate start state and final state(s)
4.δ: Decide on the transitions:
●“Final” states == same as “accepting states”
●Other states == same as “non-accepting states”
DFA for strings containing 01
Extended transition function
The extended transition function δ̂ (“delta hat”) maps a state and a whole string to a
state: δ̂(q, ε) = q, and δ̂(q, wa) = δ(δ̂(q, w), a).
Language of a DFA
A DFA A accepts string w if there is a path from q0 to an accepting (or final)
state that is labeled by w,
i.e., L(A) = { w | δ̂(q0, w) is in F }
i.e., L(A) = all strings that lead to an accepting state from q0
Non Deterministic Finite automata
A Non-deterministic Finite Automaton (NFA) is of course “non-deterministic”
●Implying that the machine can exist in more than one state at the same time
●Transitions could be non-deterministic
Contd...
●The NFAs accept exactly the regular languages, just as DFAs do.
●They are often more succinct and easier to design than DFAs.
●We can always convert an NFA to a DFA, but the latter may have exponentially more
states than the NFA.
●The difference between the DFA and the NFA is in the type of transition function.
Contd..
A Non-deterministic Finite Automaton (NFA) consists of:
Q ==> a finite set of states
Σ ==> a finite set of input symbols (alphabet)
q0 ==> a start state
F ==> set of accepting states (F ⊆ Q)
δ ==> a transition function, which is a mapping δ: Q × Σ → a subset of Q
An NFA is also defined by the 5-tuple: (Q, Σ, δ, q0, F)
How to use an NFA ?
●Input: a word w in Σ*
●Question: Is w acceptable by the NFA?
●Steps:
1.Start at the “start state” q0
2.For every input symbol in the sequence w do:
Determine all possible next states from all current states, given the current input
symbol in w and the transition function
3.If after all symbols in w are consumed and if at least one of the current states is a final
state then accept w;
4.Otherwise, reject w.
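The same steps can be sketched in code, this time tracking the *set* of current states (a minimal illustration; the encoding of δ as a dict of sets is my own choice). The example NFA accepts strings ending in 01, matching the example that follows:

```python
def run_nfa(delta, start, accept, w):
    """Simulate an NFA by tracking the set of all possible current states."""
    current = {start}
    for sym in w:
        # All states reachable from any current state on this symbol.
        current = {q for p in current for q in delta.get((p, sym), set())}
    return bool(current & accept)  # accept if at least one final state is reached

# NFA accepting binary strings ending in "01"
delta = {
    ("q0", "0"): {"q0", "q1"}, ("q0", "1"): {"q0"},
    ("q1", "1"): {"q2"},
}
assert run_nfa(delta, "q0", {"q2"}, "10101")      # ends in 01 -> accept
assert not run_nfa(delta, "q0", {"q2"}, "0110")   # ends in 10 -> reject
```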
Example
An NFA accepting all strings that end in 01
●Transition tables can be used to specify the transition function for an NFA as
well as for a DFA.
●The only difference is that each entry in the table for the NFA is a set, even if
the set is a singleton (has exactly one member).
●Also, when there is no transition at all from a given state on a given input
symbol, the proper entry is Φ, the empty set.
Transition function for input 00101
Extension of δ to NFA paths
Language of a NFA
●An NFA accepts w if there exists at least one path from the start state to an accepting
(or final) state that is labeled by w
●Formally, if N = (Q, Σ, δ, q0, F) is an NFA, then
L(N) = { w | δ̂(q0, w) ∩ F ≠ Φ }
which means that L(N) is the set of strings w in ∑* such that δ̂(q0, w) contains at
least one accepting state.
What is an error state?
NOTE: Omitting to explicitly show error states is just a matter of design convenience
(one that is generally followed for NFAs); this feature should not be confused with
the notion of non-determinism.
●A DFA in practice has about as many states as the NFA, although it often has more
transitions.
●In the worst case, however, the smallest DFA can have 2^n states while the smallest
NFA for the same language has only n states.
●Every language that can be described by some NFA can also be described by some
DFA.
●The proof that DFAs can do whatever NFAs can do involves an important
construction called the subset construction because it involves constructing all subsets
of the set of states of the NFA.
● In general, many proofs about automata involve constructing one automaton from
another.
●It is important for us to observe the subset construction as an example of how one
formally describes one automaton in terms of the states and transitions of another,
without knowing the specifics of the latter automaton
Equivalence of DFA and NFA
Theorem: A language L is accepted by a DFA if and only if it is accepted by an NFA.
Proof:
1.If part:
Prove by showing every NFA can be converted to an equivalent DFA
2.Only-if part is trivial:
Every DFA is a special case of an NFA where each state has exactly one transition for
every input symbol. Therefore, if L is accepted by a DFA, it is accepted by a corresponding
NFA.
PROOF OF IF PART
●Surprisingly, for any NFA there is a DFA that accepts the same language.
●Proof is the subset construction.
●The number of states of the DFA can be exponential in the number of states of
the NFA.
●Thus, NFA’s accept exactly the regular languages.
NFA to DFA subset construction
Notice that this transition table belongs to a deterministic finite automaton. Even though
the entries in the table are sets, the states of the constructed DFA are sets. To make the
point clearer we can invent new names for these states, e.g. A for Φ, B for {q0}, and so on.
This makes it clear that the entries in the table are single states of the DFA.
1.Out of the 8 states in the figure, starting in the start state B, we can only reach
states B, E and F.
2.The other 5 states are inaccessible from the start state.
Resultant DFA constructed from NFA that accepts the strings ending with 01.
NOTE
1.Enumerate all possible subsets.
2.Determine transitions
3.Retain only those states reachable from {q0}
NFA to DFA: Repeating the example using LAZY CREATION
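A minimal sketch of lazy subset construction (the names and dict encodings are mine): only subsets reachable from {q0} are ever created, which is why the example below produces 3 DFA states rather than all 2^3 = 8 subsets.

```python
from collections import deque

def nfa_to_dfa(delta, start, accept):
    """Lazy subset construction: create only the DFA states reachable from {start}."""
    alphabet = {sym for (_, sym) in delta}
    start_set = frozenset({start})
    dfa_delta, seen, queue = {}, {start_set}, deque([start_set])
    while queue:
        S = queue.popleft()
        for sym in alphabet:
            # Union of the NFA moves out of every state in the subset S.
            T = frozenset(q for p in S for q in delta.get((p, sym), set()))
            dfa_delta[(S, sym)] = T
            if T not in seen:
                seen.add(T)
                queue.append(T)
    dfa_accept = {S for S in seen if S & accept}
    return dfa_delta, start_set, dfa_accept

# NFA accepting strings ending in "01" (same machine as in the example above).
delta = {("q0", "0"): {"q0", "q1"}, ("q0", "1"): {"q0"}, ("q1", "1"): {"q2"}}
dfa_delta, d_start, d_accept = nfa_to_dfa(delta, "q0", {"q2"})
assert len({S for (S, _) in dfa_delta}) == 3   # 3 reachable subsets, not 2**3 = 8
```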
Correctness of subset construction
Applications
Few properties of DFA and NFA
FA with epsilon transitions
To simulate any transition:
Step 1) Go to all immediate destination states.
Step 2) From there go to all their epsilon-closure states
as well.
Closure of states
●CL(q) = set of states you can reach from state q following only arcs labeled ε .
●Example: CL(A) = {A}; CL(E) = {B, C, D, E}.
●Closure of a set of states = union of the closure of each state.
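The closure computation is a simple graph search over the ε-arcs (a sketch; the ε-arc table below is a hypothetical one chosen to reproduce the slide’s answer CL(E) = {B, C, D, E}, since the actual diagram is not shown here):

```python
def eps_closure(eps, states):
    """CL: all states reachable from `states` following only epsilon-labeled arcs."""
    stack, result = list(states), set(states)
    while stack:
        p = stack.pop()
        for q in eps.get(p, set()):
            if q not in result:
                result.add(q)
                stack.append(q)
    return result

# Hypothetical epsilon-arcs chosen so that CL(E) = {B, C, D, E}, as in the example.
eps = {"E": {"B", "C"}, "C": {"D"}}
assert eps_closure(eps, {"A"}) == {"A"}
assert eps_closure(eps, {"E"}) == {"B", "C", "D", "E"}
```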
Minimization of DFA (refer to the separately shared PDF for this topic)
Equivalence theorem
1.DISTINGUISHABLE STATES
2.SET PARTITIONING
NOTE: Remove all the states that are dead or unreachable from the initial state via
any sequence of transitions of the DFA, and then start applying the Equivalence theorem.
Refer : https://www.geeksforgeeks.org/minimization-of-dfa/
A finite state machine is a machine that has many states and has a logical way of changing
from one state to another.
1.FSM - without o/p
2.FSM - with output
(A)Mealy machine : output in transition
(B)Moore machine : output on state
Mealy machine
A mealy machine is an FSM whose output depends on the present state as well as the
present input. In mealy machine every transition for a particular input symbol has a fixed
output. It can be described by a 6 tuple:
●Q is a finite set of states.
●∑ is a finite set of symbols called the input alphabet.
●O is a finite set of symbols called the output alphabet.
●δ is the input transition function where δ: Q × ∑ → Q
●X is the output transition function where X: Q × ∑ → O
●q0 is the initial state from where any input is processed (q0 ∈ Q).
Moore machine
In a Moore machine the value of the output function depends on the present state only. A
Moore machine can be described by a 6-tuple (Q, ∑, O, δ, X, q0):
1.Q is a finite set of states.
2.∑ is a finite set of symbols called the input alphabet.
3.O is a finite set of symbols called the output alphabet.
4.δ is the input transition function where δ: Q × ∑ → Q
5.X is the output transition function where X: Q → O
6.q0 is the initial state from where any input is processed (q0 ∈ Q).
Moore to mealy machine
Input - Moore machine
Output - mealy machine
Steps-
1.Take a blank mealy machine transition table format
2.Copy all the Moore machine transition states into this table format.
3.Check the present states and their corresponding outputs in the Moore machine state
table; if for a state Qi output is m, copy it into the output columns of the mealy
machine state table whenever Qi appears in the next state.
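The copy rule above can be sketched as a dictionary transformation (a minimal illustration; the two-state machine and its outputs are hypothetical):

```python
def moore_to_mealy(delta, out):
    """Moore -> Mealy: the output attached to each Mealy transition is the Moore
    output of the transition's destination (next) state."""
    return {(p, a): (q, out[q]) for (p, a), q in delta.items()}

# Hypothetical 2-state Moore machine over {0,1} with outputs "a" and "b".
delta = {("q0", "0"): "q1", ("q0", "1"): "q0",
         ("q1", "0"): "q0", ("q1", "1"): "q1"}
out = {"q0": "a", "q1": "b"}
mealy = moore_to_mealy(delta, out)
assert mealy[("q0", "0")] == ("q1", "b")  # transition into q1 carries q1's output
assert mealy[("q1", "0")] == ("q0", "a")
```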
Mealy to moore machine
●Step 1: For each state (Qi), calculate the number of different outputs that are
available in the transition table of the Mealy machine.
●Step 2: Copy state Qi if all the outputs of Qi are the same. Break Qi into n states
Qi1, …, Qin if it has n distinct outputs, where n = 0, 1, 2, ….
●Step 3: If the output of the initial state is 1, insert a new initial state at the
beginning which gives output 0.
1.For q1, there is only one incident edge, with output 0. So we don't need to split this
state in the Moore machine.
2.For q2, there are 2 incident edges, with outputs 0 and 1. So we will split this state
into q20 (state with output 0) and q21 (state with output 1).
3.For q3, there are 2 incident edges, with outputs 0 and 1. So we will split this state
into q30 (state with output 0) and q31 (state with output 1).
4.For q4, there is only one incident edge, with output 0. So we don't need to split this.
Transition table for moore machine
Transition diagram for moore machine
Limitations
1.The input tape is read-only, and the only memory the machine has is the current
state; since there are a finite number of states, a finite automaton's memory is
strictly finite.
2.Finite automata have only string-pattern (regular expression) recognizing power.
3.The head movement is restricted to one direction, either from left to right or
right to left.
Equivalence of FA
Two automata M0 and M1 are said to be equivalent to each other if both accept exactly the
same set of input strings. Formally, if two automata M0 and M1 are equivalent then
●If there is a path from the start state of M0 to a final state of M0 labeled M01 M02
… … M0k, there is a path from the start state of M1 to a final state of M1 labeled
M01, M02 .. . . . . M0k.
●If there is a path from the start state of M1 to a final state of M1 labeled M11, M12
… . . . . . … . M1k, there is a path from the start state of M0 to a final state of M0
labeled M11, M12,…….. M1k.
Application of FA
We have several applications based on finite automata and finite state machines. Some
are given below:
●Finite automata are highly useful for designing lexical analyzers.
●Finite automata are useful for designing text editors.
●Finite automata are highly useful for designing spell checkers.
●Finite automata are useful in sequential circuit design (transducers).
Equivalence & Minimization of DFAs
Applications of interest
Comparing two DFAs:
L(DFA1) == L(DFA2)?
How to minimize a DFA?
1. Remove unreachable states
2. Identify & condense equivalent states into one
When to call two states in a DFA “equivalent”?
Two states p and q are said to be equivalent iff:
i) Any string w accepted by starting at p is also accepted by starting at q; AND
ii) Any string w rejected by starting at p is also rejected by starting at q.
Then p ≡ q.
(Only the future matters - the past, i.e., how the DFA reached p or q, doesn't.)
Computing equivalent states in a DFA: the Table Filling Algorithm
(The slides show an example DFA over states A through H and its pairwise-comparison
table; the table itself is not reproducible here.)
Pass #0:
1. Mark accepting states ≠ non-accepting states
Pass #1:
1. Compare every pair of states
2. Distinguish by one-symbol transitions
3. Mark = or ≠ or blank (tbd)
Pass #2:
1. Compare every pair of states
2. Distinguish by up to two-symbol transitions (until different or same or tbd)
….
(keep repeating until the table is complete)
Table Filling Algorithm - step by step
(The slides repeat the A-through-H table, filling in one distinguishing mark X at a
time.)
1. Mark X between every accepting vs. non-accepting state pair.
2. Pass 1: look 1 hop away for distinguishing states or strings.
3. Pass 2: look 1 hop away again for distinguishing states or strings.
continue…. (keep repeating until the table is complete)
Equivalences found in the example:
•A ≡ B
•C ≡ H
•D ≡ G
Retain only one copy for each equivalence set of states.
(The slides then show the resultant minimized DFA.)
Table Filling Algorithm - special case
Q) What happens if the input DFA has more than one final state?
Can all final states initially be treated as equivalent to one another?
Putting it all together …
How to minimize a DFA?
Goal: Minimize the number of states in a DFA
Algorithm:
1. Eliminate states unreachable from the start state
(depth-first traversal from the start state)
2. Identify and remove equivalent states
(table filling algorithm)
3. Output the resultant DFA
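The core of step 2, the table-filling pass, can be sketched as follows (a minimal illustration; the 4-state DFA at the bottom is hypothetical, chosen so that exactly one pair of states turns out equivalent):

```python
from itertools import combinations

def equivalent_pairs(states, alphabet, delta, accept):
    """Table-filling algorithm: return the unmarked (i.e. equivalent) state pairs."""
    # Pass 0: mark pairs where exactly one of the two states is accepting.
    marked = {frozenset(p) for p in combinations(states, 2)
              if (p[0] in accept) != (p[1] in accept)}
    changed = True
    while changed:  # keep passing until the table stops changing
        changed = False
        for p, q in combinations(states, 2):
            if frozenset((p, q)) in marked:
                continue
            # Distinguish by a one-symbol transition into an already-marked pair.
            for a in alphabet:
                succ = frozenset((delta[(p, a)], delta[(q, a)]))
                if len(succ) == 2 and succ in marked:
                    marked.add(frozenset((p, q)))
                    changed = True
                    break
    return [p for p in combinations(states, 2) if frozenset(p) not in marked]

# Hypothetical 4-state DFA in which C and D turn out to be equivalent.
delta = {("A", "0"): "B", ("A", "1"): "A", ("B", "0"): "B", ("B", "1"): "C",
         ("C", "0"): "C", ("C", "1"): "D", ("D", "0"): "D", ("D", "1"): "C"}
assert equivalent_pairs("ABCD", "01", delta, {"C", "D"}) == [("C", "D")]
```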
Are Two DFAs Equivalent?
1. Make a new dummy DFA by just putting together both DFAs into one “unified” DFA
(DFA1 with start state q0, DFA2 with start state q0').
2. Run the table-filling algorithm on the unified DFA.
3. IF the start states of both DFAs are found to be equivalent (q0 ≡ q0'),
THEN: DFA1 ≡ DFA2
ELSE: they are not equivalent.
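An equivalent way to compare two DFAs, sketched in code (my own formulation, not from the slides): explore the pairs of states the two machines can reach in lockstep; the languages differ iff some reachable pair disagrees on acceptance.

```python
from collections import deque

def dfas_equivalent(d1, s1, f1, d2, s2, f2, alphabet):
    """L(DFA1) == L(DFA2)? Explore state pairs reachable in lockstep; the
    languages differ iff some reachable pair disagrees on acceptance."""
    seen, queue = {(s1, s2)}, deque([(s1, s2)])
    while queue:
        p, q = queue.popleft()
        if (p in f1) != (q in f2):
            return False  # a distinguishing string reaches this pair
        for a in alphabet:
            nxt = (d1[(p, a)], d2[(q, a)])
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return True

# Two hypothetical DFAs for "binary strings ending in 1"; the second has a
# redundant extra state but accepts the same language.
d1 = {("A", "0"): "A", ("A", "1"): "B", ("B", "0"): "A", ("B", "1"): "B"}
d2 = {("X", "0"): "X", ("X", "1"): "Y", ("Y", "0"): "X", ("Y", "1"): "Z",
      ("Z", "0"): "X", ("Z", "1"): "Z"}
assert dfas_equivalent(d1, "A", {"B"}, d2, "X", {"Y", "Z"}, "01")
assert not dfas_equivalent(d1, "A", {"B"}, d2, "X", {"Y"}, "01")
```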
Regular expressions and Languages
Unit-2
Overview
●Regular Expression (RE)
●Equivalence of RE and FA, Arden‘s Theorem
●RE Applications
●Regular Language (RL)
●Closure properties of RLs
●Decision properties of RLs
●Pumping lemma for RLs
Arden’s Theorem
Arden's Theorem is useful for checking the equivalence of two regular expressions
as well as in the conversion of a DFA to a regular expression.
Arden's Theorem: if R = Q + RP (and ε is not in L(P)), then R = QP*.
Let us see its use in the conversion of a DFA to a regular expression.
The following algorithm is used to build the regular expression from a given DFA.
1. Let q1 be the initial state.
2. There are states q2, q3, q4, …, qn. The final state may be some qj where j ≤ n.
3. Let αji represent the transition from qj to qi.
4. Write an equation for each state qi as the sum of qj αji over all incoming
transitions; if qi is the start state, add ε:
qi = Σj qj αji + ε
5. Solve these equations (using Arden's Theorem) for the final states, which
ultimately gives the regular expression 'r'.
Construct the regular expression for the given DFA
Solution:
Let us write down the equations.
Since q1 is the start state, ε will be added, and the input 0 comes to q1 from q1,
hence we write
q1 = q1 0 + ε
(State = source state of the input × the input coming to it.)
Similarly,
q2 = q1 1 + q2 1
Since the final states are q1 and q2, we are interested in solving q1 and q2 only.
Let us see q1 first:
q1 = ε + q1 0, which is in the form R = Q + RP, and reduces to R = QP*:
q1 = ε.(0)* = 0*         (ε.R* = R*)
Substituting this value into q2, we get
q2 = 0* 1 + q2 1
q2 = 0* 1 (1)*           (R = Q + RP → R = QP*)
The regular expression is given by
r = q1 + q2
  = 0* + 0* 1.1*
r = 0* + 0* 1+           (1.1* = 1+)
Regular Expressions
RE’s: Introduction
Regular expressions are an algebraic way to describe languages.
They describe exactly the regular languages.
If E is a regular expression, then L(E) is the language it defines.
We’ll describe RE’s and their languages recursively.
RE’s: Definition
Basis 1: If a is any symbol, then a is a RE, and L(a) = {a}.
Note: {a} is the language containing one string, and that string is of length 1.
Basis 2: ε is a RE, and L(ε) = {ε}.
Basis 3: ∅ is a RE, and L(∅) = ∅.
Regular Expressions vs. Finite Automata
Regular expressions offer a declarative way to express the pattern of any string we
want to accept
E.g., 01* + 10*
Automata => more machine-like
< input: string, output: [accept/reject] >
Regular expressions => more program syntax-like
Unix environments heavily use regular expressions
E.g., bash shell, grep, vi & other editors, sed
Perl scripting - good for string processing
Lexical analyzers such as Lex or Flex
Regular expressions (syntactical expressions) and finite automata (DFA, NFA, ε-NFA -
automata/machines) are equivalent: among the formal language classes, both describe
exactly the regular languages.
RE’s: Definition - (2)
Induction 1: If E1 and E2 are regular expressions, then E1+E2 is a regular
expression, and L(E1+E2) = L(E1) ∪ L(E2).
Induction 2: If E1 and E2 are regular expressions, then E1E2 is a regular
expression, and L(E1E2) = L(E1)L(E2).
Concatenation: the set of strings wx such that w is in L(E1) and x is in L(E2).
RE’s: Definition - (3)
Induction 3: If E is a RE, then E* is a RE, and L(E*) = (L(E))*.
Closure, or “Kleene closure” = set of strings w1w2…wn, for some n ≥ 0, where each
wi is in L(E).
Note: when n=0, the string is ε.
Examples: RE’s
L(01) = {01}.
L(01+0) = {01, 0}.
L(0(1+0)) = {01, 00}.
Note the order of precedence of operators.
L(0*) = {ε, 0, 00, 000, … }.
L((0+10)*(ε+1)) = all strings of 0’s and 1’s without two consecutive 1’s.
Precedence of Operators
Highest to lowest:
* operator (star)
. (concatenation)
+ operator (union)
Language Operators
Union of two languages:
L ∪ M = all strings that are either in L or in M
Note: a union of two languages produces a third language
Concatenation of two languages:
L . M = all strings of the form xy s.t. x ∈ L and y ∈ M
The dot operator is usually omitted, i.e., LM is the same as L.M
(“i” in L^i below refers to how many strings to concatenate from the parent
language L to produce strings in the language L^i)
Kleene Closure (the * operator)
Kleene Closure of a given language L:
L^0 = {ε}
L^1 = {w | w ∈ L}
L^2 = {w1w2 | w1 ∈ L, w2 ∈ L (duplicates allowed)}
L^i = {w1w2…wi | all w’s chosen are ∈ L (duplicates allowed)}
(Note: the choice of each wi is independent)
L* = ∪i≥0 L^i (arbitrary number of concatenations)
Example:
Let L = {1, 00}
L^0 = {ε}
L^1 = {1, 00}
L^2 = {11, 100, 001, 0000}
L^3 = {111, 1100, 1001, 10000, 0011, 00100, 00001, 000000}
L* = L^0 ∪ L^1 ∪ L^2 ∪ …
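The powers L^i can be computed mechanically for finite languages (a small sketch; the helper names are mine). It reproduces the L = {1, 00} example above:

```python
def concat(L, M):
    """Concatenation of two languages: every string xy with x in L, y in M."""
    return {x + y for x in L for y in M}

def power(L, i):
    """L^i: concatenate i strings, each drawn independently from L; L^0 = {epsilon}."""
    result = {""}
    for _ in range(i):
        result = concat(result, L)
    return result

L = {"1", "00"}
assert power(L, 0) == {""}
assert power(L, 2) == {"11", "100", "001", "0000"}
assert len(power(L, 3)) == 8
```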
Kleene Closure (special notes)
L* is an infinite set iff |L| ≥ 1 and L ≠ {ε}. Why?
If L = {ε}, then L* = {ε}. Why?
If L = Φ, then L* = {ε}. Why?
Σ* denotes the set of all words over an alphabet Σ.
Therefore, an abbreviated way of saying there is an arbitrary language L over an
alphabet Σ is: L ⊆ Σ*
Building Regular Expressions
Let E be a regular expression and the language represented by E be L(E).
Then:
(E) = E
L(E + F) = L(E) ∪ L(F)
L(EF) = L(E)L(F)
L(E*) = (L(E))*
Example: how to use these regular expression properties and language operators?
L = {w | w is a binary string which does not contain two consecutive 0s or
two consecutive 1s anywhere}
E.g., w = 01010101 is in L, while w = 10010 is not in L
Goal: Build a regular expression for L
Four cases for w:
Case A: w starts with 0 and |w| is even
Case B: w starts with 1 and |w| is even
Case C: w starts with 0 and |w| is odd
Case D: w starts with 1 and |w| is odd
Regular expression for the four cases:
Case A: (01)*
Case B: (10)*
Case C: 0(10)*
Case D: 1(01)*
Since L is the union of all 4 cases:
Reg Ex for L = (01)* + (10)* + 0(10)* + 1(01)*
If we introduce ε then the regular expression can be simplified to:
Reg Ex for L = (ε+1)(01)*(ε+0)
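The final expression can be checked against Python’s `re` module (a sketch; note that `re` writes union as `|`, and the optional prefixes ε+1 and ε+0 become `1?` and `0?`):

```python
import re

# The simplified expression (e+1)(01)*(e+0) in Python re syntax.
no_repeats = re.compile(r"(1?)(01)*(0?)")

assert no_repeats.fullmatch("01010101")       # in L
assert no_repeats.fullmatch("10101")          # in L
assert not no_repeats.fullmatch("10010")      # contains "00", not in L
assert no_repeats.fullmatch("")               # epsilon is in L too
```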
Equivalence of RE’s and
Automata
We need to show that for every RE,
there is an automaton that accepts the
same language.
Pick the most powerful automaton type: the
ε-NFA.
And we need to show that for every
automaton, there is a RE defining its
language.
Pick the most restrictive type: the DFA.
Finite Automata (FA) & Regular Expressions (Reg Ex)
To show that they are interchangeable, consider the following theorems:
Theorem 1: For every DFA A there exists a regular expression R such that L(R) = L(A).
Theorem 2: For every regular expression R there exists an ε-NFA E such that
L(E) = L(R).
(Proofs in the book.)
Together with the ε-NFA → NFA → DFA conversions, these two theorems give
Kleene's Theorem.
Converting a RE to an ε-NFA
Proof is an induction on the number of operators (+, concatenation, *) in the RE.
We always construct an automaton of a special form (next slide).
Form of ε-NFA’s constructed:
- a single start state, the only state with external predecessors (no arcs from
outside, no arcs leaving);
- a single “final” state, the only state with external successors.
RE to ε-NFA: Basis
Symbol a: one arc labeled a from the start state to the final state.
ε: one arc labeled ε.
∅: no arc at all between the start and final states.
RE to ε-NFA: Induction 1 - Union
For E1 + E2: a new start state with ε-arcs into the automata for E1 and for E2, and
ε-arcs from their final states into a new final state.
RE to ε-NFA: Induction 2 - Concatenation
For E1E2: an ε-arc from the final state of the automaton for E1 to the start state
of the automaton for E2.
RE to ε-NFA: Induction 3 - Closure
For E*: new start and final states joined by an ε-arc (for zero repetitions), with
ε-arcs into the automaton for E and an ε-arc from its final state back to its start
state (for repeated passes).
RE to ε-NFA construction example:
(0+1)*01(0+1)*
built by concatenating the automata for (0+1)*, then 01, then (0+1)*.
DFA-to-RE
A strange sort of induction.
States of the DFA are assumed to be 1, 2, …, n.
We construct RE’s for the labels of restricted sets of paths.
Basis: single arcs or no arc at all.
Induction: paths that are allowed to traverse the next state in order.
DFA to RE construction
Informally, trace all distinct paths (traversing cycles only once)
from the start state to each of the final states
and enumerate all the expressions along the way.
Example: a three-state DFA q0 → q1 → q2 with self-loops labeled 1 on q0, 0 on q1,
and 0,1 on q2 yields, along the path:
(1*) 0 (0*) 1 (0+1)*
i.e., the RE is 1*00*1(0+1)*
Q) What is the language?
k-Paths
A k-path is a path through the graph of the DFA that goes through no state
numbered higher than k.
Endpoints are not restricted; they can be any state.
Example: k-Paths
(A three-state example DFA over states 1, 2, 3 is shown in the slides.)
0-paths from 2 to 3: RE for labels = 0.
1-paths from 2 to 3: RE for labels = 0+11.
2-paths from 2 to 3: RE for labels = (10)*0+1(01)*1
3-paths from 2 to 3: RE for labels = ??
k-Path Induction
Let R_ij^k be the regular expression for the set of labels of k-paths from state i
to state j.
Basis: k=0. R_ij^0 = sum of labels of arcs from i to j;
∅ if no such arc. But add ε if i=j.
k-Path Inductive Case
A k-path from i to j either:
1. Never goes through state k, or
2. Goes through k one or more times.
R_ij^k = R_ij^(k-1) + R_ik^(k-1) (R_kk^(k-1))* R_kj^(k-1)
where R_ij^(k-1) covers paths that don’t go through k; R_ik^(k-1) goes from i to k
the first time; (R_kk^(k-1))* loops zero or more times from k to k; and R_kj^(k-1)
then goes from k to j.
Illustration of Induction: a k-path from i to j either stays among states < k
(paths not going through k), or reaches k, loops from k back to k several times,
and finally goes from k to j.
Final Step
The RE with the same language as the DFA is the sum (union) of R_ij^n, where:
1. n is the number of states, i.e., paths are unconstrained;
2. i is the start state;
3. j is one of the final states.
Algebraic Laws…
Distributive:
E(F+G) = EF + EG
(F+G)E = FE + GE
Idempotent:
E + E = E
Involving Kleene closures:
(E*)* = E*
Φ* = ε
E+ = EE*
E? = ε + E
Algebraic Laws of Regular Expressions
Commutative:
E+F = F+E
Associative:
(E+F)+G = E+(F+G)
(EF)G = E(FG)
Identity:
E + Φ = E
εE = Eε = E
Annihilator:
ΦE = EΦ = Φ
Identities and Annihilators
∅ is the identity for +: R + ∅ = R.
ε is the identity for concatenation: εR = Rε = R.
∅ is the annihilator for concatenation: ∅R = R∅ = ∅.
Properties of Regular Languages
Topics:
1) How to prove whether a given language is regular or not?
2) Closure properties of regular languages
Some languages are not regular.
When is a language regular?
If we are able to construct one of the following:
a DFA, or an NFA, or an ε-NFA, or a regular expression
When is it not?
If we can show that no FA can be built for the language.
How to prove languages are not regular? What if we cannot come up with any FA?
A) Can it be a language that is not regular?
B) Or is it that we tried wrong approaches?
How do we decisively prove that a language is not regular?
“The hardest thing of all is to find a black cat in a dark room,
especially if there is no cat!” - Confucius
Example of a non-regular language
Let L = {w | w is of the form 0^n 1^n, for all n ≥ 0}
Hypothesis: L is not regular
Intuitive rationale: How do you keep track of a running count in an FA?
A more formal rationale:
By contradiction, if L is regular then there should exist a DFA for L.
Let k = number of states in that DFA.
Consider the special word w = 0^k 1^k => w ∈ L
The DFA is in some state p_i after consuming the first i symbols in w.
(Uses the Pigeon Hole Principle.)
Rationale…
Let {p0, p1, … pk} be the sequence of states that the DFA should have visited after consuming the first k symbols in w, which is 0^k.
But there are only k states in the DFA!
==> At least one state should repeat somewhere along the path (by the Pigeon Hole Principle).
==> Let the repeating state be pi = pj, for i < j.
==> We can fool the DFA by inputting 0^(k-(j-i)) 1^k and still get it to accept (note: k-(j-i) is at most k-1).
==> The DFA accepts strings with unequal numbers of 0s and 1s, implying that the DFA is wrong!
The Pumping Lemma for Regular Languages
What is it? The Pumping Lemma is a property of all regular languages.
How is it used? As a technique to show that a given language is not regular.
Pumping Lemma:
Let L be a regular language. Then there exists some constant N such that for every string w ∈ L s.t. |w| ≥ N, there exists a way to break w into three parts, w = xyz, such that:
1. y ≠ ε
2. |xy| ≤ N
3. For all k ≥ 0, all strings of the form x y^k z ∈ L
This property should hold for all regular languages.
Definition: N is called the "Pumping Lemma Constant".
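A quick sanity check (not part of the slides): the sketch below mechanically verifies, for L = {0^n 1^n} and one chosen N, that every legal split w = xyz of the template w = 0^N 1^N (y non-empty, |xy| ≤ N) already fails condition (3) at k = 0, which is exactly the contradiction used in the proofs that follow. Function names are illustrative.

```python
def in_L(s):
    # membership in L = {0^n 1^n : n >= 0}
    n = len(s) // 2
    return s == "0" * n + "1" * n

def every_split_fails(N):
    w = "0" * N + "1" * N
    for i in range(0, N + 1):            # x = w[:i]
        for j in range(i + 1, N + 1):    # y = w[i:j]: non-empty, |xy| <= N
            x, y, z = w[:i], w[i:j], w[j:]
            if in_L(x + z):              # pumping down (k = 0) stays in L?
                return False             # then this split satisfies the lemma
    return True                          # no split works for this w

print(every_split_fails(7))  # True
```

Since no valid split survives pumping down, no pumping-lemma constant can work for L, matching the hand proof.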
Pumping Lemma: Proof
L is regular => it should have a DFA. Set N := number of states in the DFA.
Any string w ∈ L s.t. |w| ≥ N should have the form: w = a1 a2 … am, where m ≥ N.
Let the states traversed after reading the first N symbols be: {p0, p1, … pN}
==> There are N+1 p-states, while there are only N DFA states
==> at least one state has to repeat, i.e., pi = pj, where 0 ≤ i < j ≤ N (by the PHP)
Pumping Lemma: Proof…
=> We should be able to break w = xyz as follows:
x = a1 a2 .. ai;  y = a(i+1) a(i+2) .. aj;  z = a(j+1) a(j+2) .. am
x's path will be p0 .. pi
y's path will be pi p(i+1) .. pj (but pi = pj, implying a loop)
z's path will be pj p(j+1) .. pm
Now consider another string wk = x y^k z, where k ≥ 0
Case k=0: the DFA skips the loop and reaches the accepting state pm on xz
Case k>0: the DFA will loop for y^k, and finally reach the accepting state pm for z
In either case, wk ∈ L. This proves part (3) of the lemma.
Pumping Lemma: Proof…
For part (1):
Since i < j, y ≠ ε
For part (2):
By the PHP, the repetition of states has to occur within the first N symbols in w
==> |xy| ≤ N
The Purpose of the Pumping Lemma for RLs
To prove that some languages cannot be regular.
How to use the pumping lemma? Think of playing a 2-person game:
Role 1: We claim that the language cannot be regular.
Role 2: An adversary who claims the language is regular.
We show that the adversary's statement will lead to a contradiction, implying that the pumping lemma cannot hold for the language.
We win!!
How to use the pumping lemma? (The Steps)
1. (we) Claim L is not regular.
2. (adv.) Claims that L is regular and gives you a value for N as its P/L constant.
3. (we) Using N, choose a string w ∈ L s.t. |w| ≥ N.
4. (we) Using w as the template, construct other words wk of the form x y^k z and show that at least one such wk ∉ L
=> this implies we have successfully broken the pumping lemma for the language, and hence that the adversary is wrong.
(Note: In this process, we may have to try many values of k, starting with k=0, and then 2, 3, .. so on, until wk ∉ L.)
Note: We don't have any control over N, except that it is positive.
We also don't have any control over how to split w = xyz, but xyz should respect the P/L conditions (1) and (2).
Using the Pumping Lemma
What the Adversary does:
1. Claims L is regular
2. Provides N
What WE do:
3. Using N, we construct our template string w
4. Demonstrate to the adversary, either through pumping up or down on w, that some string wk ∉ L (this should happen regardless of how w is split into xyz)
Example of using the Pumping Lemma to prove that a language is not regular
Note: This N can be anything (it need not necessarily be the # of states in a DFA; it's the adversary's choice).
Let L_eq = {w | w is a binary string with an equal number of 1s and 0s}
Your Claim: L_eq is not regular
Proof:
By contradiction, let L_eq be regular.
(adv.) A P/L constant should exist. Let N = that P/L constant.
(you) Consider input w = 0^N 1^N (your choice for the template string).
By the pumping lemma, we should be able to break w = xyz, such that:
1) y ≠ ε
2) |xy| ≤ N
3) For all k ≥ 0, the string x y^k z is also in L
Template string w = 0^N 1^N = 00…0 11…1 (N 0s followed by N 1s)
Proof…
Because |xy| ≤ N, xy should contain only 0s.
(This, and because y ≠ ε, implies y = 0^+.)
Therefore x can contain at most N-1 0s.
Also, all the N 1s must be inside z.
By (3), any string of the form x y^k z ∈ L_eq, for all k ≥ 0.
Case k=0 (setting k=0 is referred to as "pumping down"):
xz has at most N-1 0s but has N 1s.
Therefore, x y^0 z ∉ L_eq.
This violates the P/L (a contradiction).
Another way of proving this would be to show that if the # of 0s is arbitrarily pumped up (e.g., k=2; setting k>1 is referred to as "pumping up"), then the # of 0s will exceed the # of 1s.
Exercise 2
Prove L = {0^n 1 0^n | n ≥ 1} is not regular.
Note: This n is not to be confused with the pumping lemma constant N. They can be different.
In other words, the above question is the same as proving:
L = {0^m 1 0^m | m ≥ 1} is not regular.
Example 3: Pumping Lemma
Claim: L = { 0^i | i is a perfect square } is not regular
Proof:
By contradiction, let L be regular.
The P/L should apply. Let N = the P/L constant.
Choose w = 0^(N^2).
By the pumping lemma, w = xyz, satisfying all three rules.
By rules (1) & (2), y has between 1 and N 0s.
By rule (3), any string of the form x y^k z is also in L, for all k ≥ 0.
Case k=0:
#zeros(x y^0 z) = #zeros(xyz) - #zeros(y)
N^2 - N ≤ #zeros(x y^0 z) ≤ N^2 - 1
(N-1)^2 < N^2 - N ≤ #zeros(x y^0 z) ≤ N^2 - 1 < N^2
==> x y^0 z ∉ L
But the above will complete the proof ONLY IF N > 1.
… (proof contd. on next slide)
Example 3: Pumping Lemma (proof contd…)
If the adversary picks N=1, then (N-1)^2 ≤ N^2 - N, and therefore #zeros(x y^0 z) could end up being a perfect square!
This means that pumping down (i.e., setting k=0) is not giving us the proof!
So let's try pumping up next…
Case k=2:
#zeros(x y^2 z) = #zeros(xyz) + #zeros(y)
N^2 + 1 ≤ #zeros(x y^2 z) ≤ N^2 + N
N^2 < N^2 + 1 ≤ #zeros(x y^2 z) ≤ N^2 + N < (N+1)^2
==> x y^2 z ∉ L
(Notice that the above holds for all possible values of N > 0. Therefore, this completes the proof.)
Closure properties of Regular Languages
Closure property: if a set of regular languages are combined using an operator, then the resulting language is also regular.
(This is different from Kleene closure.)
Regular languages are closed under:
Union, intersection, complement, difference
Reversal
Kleene closure
Concatenation
Homomorphism
Inverse homomorphism
Now, let's prove all of this!
RLs are closed under union
IF L and M are two RLs THEN:
they both have two corresponding regular expressions, R and S respectively.
(L U M) can be represented using the regular expression R+S.
Therefore, (L U M) is also regular.
How can this be proved using FAs?
RLs are closed under complementation
If L is an RL over ∑, then its complement is L' = ∑* - L.
To show L' is also regular, make the following construction on a DFA for L:
Convert every final state into a non-final state, and every non-final state into a final state.
(Every state's status is flipped, including the start state q0 if it happens to be final.)
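The flip-the-finals construction can be sketched in a few lines. The dict-based DFA encoding and state names here are illustrative, not from the slides:

```python
def complement(states, delta, start, finals):
    # same states and transitions; final and non-final states swap roles
    return states, delta, start, states - finals

def accepts(delta, start, finals, w):
    q = start
    for a in w:
        q = delta[(q, a)]
    return q in finals

# DFA over {0,1} accepting strings that end in 1
states = {"p", "q"}
delta = {("p","0"):"p", ("p","1"):"q", ("q","0"):"p", ("q","1"):"q"}
_, d2, s2, f2 = complement(states, delta, "p", {"q"})
print(accepts(delta, "p", {"q"}, "011"))  # True
print(accepts(d2, s2, f2, "011"))         # False
```

Every string lands in exactly one state, so swapping the final set accepts exactly the strings the original rejected.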
RLs are closed under intersection
A quick, indirect way to prove it:
By DeMorgan's law: L ∩ M = (L' U M')'
Since we know RLs are closed under union and complementation, they are also closed under intersection.
A more direct way would be to construct a finite automaton for L ∩ M.
DFA construction for L ∩ M
A_L = DFA for L = {Q_L, ∑, q_L, F_L, δ_L}
A_M = DFA for M = {Q_M, ∑, q_M, F_M, δ_M}
Build A_(L∩M) = {Q_L x Q_M, ∑, (q_L, q_M), F_L x F_M, δ}, such that:
δ((p,q), a) = (δ_L(p,a), δ_M(q,a)), where p in Q_L and q in Q_M
This construction ensures that a string w will be accepted if and only if w reaches an accepting state in both input DFAs.
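The product construction above, sketched in code (the dict encoding and state names are illustrative; only the pair-states reachable from (q_L, q_M) are built):

```python
def product_dfa(deltaL, startL, finalsL, deltaM, startM, finalsM, alphabet):
    start = (startL, startM)
    delta, states, todo = {}, {start}, [start]
    while todo:                      # explore reachable pair-states only
        p, q = todo.pop()
        for a in alphabet:
            nxt = (deltaL[(p, a)], deltaM[(q, a)])   # both DFAs step together
            delta[((p, q), a)] = nxt
            if nxt not in states:
                states.add(nxt)
                todo.append(nxt)
    finals = {s for s in states if s[0] in finalsL and s[1] in finalsM}
    return delta, start, finals

def accepts(delta, start, finals, w):
    q = start
    for a in w:
        q = delta[(q, a)]
    return q in finals

# L: even number of 0s;  M: strings ending in 1
dL = {("e","0"):"o", ("e","1"):"e", ("o","0"):"e", ("o","1"):"o"}
dM = {("p","0"):"p", ("p","1"):"q", ("q","0"):"p", ("q","1"):"q"}
d, s, f = product_dfa(dL, "e", {"e"}, dM, "p", {"q"}, {"0","1"})
print(accepts(d, s, f, "001"))  # True: two 0s (even) and ends in 1
print(accepts(d, s, f, "01"))   # False: odd number of 0s
```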
[Figure: a DFA for L, a DFA for M, and the product DFA for L ∩ M; each product state (q_i, p_i) pairs one state from each input DFA, and an a-transition goes to (q_j, p_j) when q_i -a-> q_j in L's DFA and p_i -a-> p_j in M's DFA.]
RLs are closed under set difference
We observe: L - M = L ∩ M'
Closed under intersection; closed under complementation.
Therefore, L - M is also regular.
RLs are closed under reversal
Reversal of a string w is denoted by w^R
E.g., w = 00111, w^R = 11100
Reversal of a language: L^R = the language generated by reversing all strings in L
Theorem: If L is regular then L^R is also regular
ε-NFA Construction for L^R
Given a DFA for L, build a new ε-NFA for L^R:
- Add a new start state q', with ε-transitions to each of the old final states.
- Reverse all transitions.
- Convert the old set of final states into non-final states.
- Make the old start state q0 the only new final state.
What to do if q0 was one of the final states in the input DFA?
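The construction above as a sketch. For simplicity the new start state's ε-moves are collapsed: the simulation simply starts from the set of old final states (the encoding and state names are illustrative):

```python
def reverse_dfa(delta, start, finals):
    rdelta = {}                          # NFA: (state, symbol) -> set of states
    for (p, a), q in delta.items():      # reverse every transition p -a-> q
        rdelta.setdefault((q, a), set()).add(p)
    # (transitions, new start set = old finals, new final set = {old start})
    return rdelta, set(finals), {start}

def nfa_accepts(rdelta, starts, finals, w):
    current = set(starts)
    for a in w:
        current = {p for q in current for p in rdelta.get((q, a), set())}
    return bool(current & finals)

# DFA accepting strings ending in "01": states s (start), a ("saw 0"), b (final)
delta = {("s","0"):"a", ("s","1"):"s", ("a","0"):"a",
         ("a","1"):"b", ("b","0"):"a", ("b","1"):"s"}
rdelta, rstarts, rfinals = reverse_dfa(delta, "s", {"b"})
print(nfa_accepts(rdelta, rstarts, rfinals, "100"))  # True: reverse "001" ends in 01
print(nfa_accepts(rdelta, rstarts, rfinals, "01"))   # False
```

Keeping the old start state as the only final state answers the slide's closing question too: if q0 was final in the input DFA, ε was in L, and q0 being both a new-start target (via ε from itself) and the new final state still accepts ε.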
If L is regular, L^R is regular (proof using regular expressions)
Let E be a regular expression for L.
Given E, how to build E^R?
Basis: If E = ε, Ø, or a, then E^R = E
Induction: every part of E (refer to the part as "F") can be in only one of the three following forms:
1. F = F1 + F2  ==>  F^R = F1^R + F2^R
2. F = F1 F2    ==>  F^R = F2^R F1^R
3. F = (F1)*    ==>  F^R = (F1^R)*
Homomorphisms
Substitute each symbol in ∑ (the main alphabet) by a corresponding string in T (another alphabet):
h: ∑ ---> T*
Example:
Let ∑ = {0,1} and T = {a,b}
Let a homomorphic function h on ∑ be: h(0) = ab, h(1) = ε
If w = 10110, then h(w) = ε ab ε ε ab = abab
In general, h(w) = h(a1) h(a2) … h(an)
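The slide's example, executed symbol by symbol:

```python
def apply_hom(h, w):
    # h(w) = h(a1) h(a2) ... h(an)
    return "".join(h[a] for a in w)

h = {"0": "ab", "1": ""}      # "" plays the role of ε
print(apply_hom(h, "10110"))  # abab
```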
RLs are closed under homomorphisms
Theorem: If L is regular, then so is h(L)
Proof: If E is a RE for L, then show L(h(E)) = h(L(E))
Basis: If E = ε, Ø, or a, then the claim holds.
Induction: there are three forms of E:
1. E = E1 + E2
L(h(E)) = L(h(E1) + h(E2)) = L(h(E1)) U L(h(E2)) ----- (1)
h(L(E)) = h(L(E1) + L(E2)) = h(L(E1)) U h(L(E2)) ----- (2)
By the inductive hypothesis, L(h(E1)) = h(L(E1)) and L(h(E2)) = h(L(E2)).
Therefore, L(h(E)) = h(L(E)).
2. E = E1 E2
3. E = (E1)*
Similar arguments. (Alternatively, think of a DFA-based construction: given a DFA for L, how to convert it into an FA for h(L)?)
FA Construction for h(L)
Given a DFA for L:
- Replace every edge labeled "a" by a path labeled h(a) in the new FA.
- i.e., build a new FA that simulates h(a) for every symbol-a transition in the above DFA.
- The resulting FA may or may not be a DFA, but it will be an FA for h(L).
Inverse homomorphism
Let h: ∑ ---> T*
Let M be a language over alphabet T
h^-1(M) = {w | w ∈ ∑* s.t. h(w) ∈ M}
(i.e., the set of strings in ∑* whose homomorphic translation results in strings of M)
Claim: If M is regular, then so is h^-1(M)
Proof:
Let A be a DFA for M.
Construct another DFA A' which encodes h^-1(M).
A' is an exact replica of A, except that its transition function is such that, for any input symbol a in ∑, A' will simulate h(a) in A:
δ'(p,a) = δ(p, h(a))  (δ extended to strings)
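The construction δ'(p,a) = δ(p, h(a)), sketched directly (encoding and names illustrative): A' keeps A's states, but on each symbol a it runs A over the whole string h(a).

```python
def run(delta, q, s):
    # extended transition function: run the DFA from q over string s
    for c in s:
        q = delta[(q, c)]
    return q

def inverse_hom_dfa(delta, states, h, alphabet):
    # A' over the source alphabet; each step simulates h(a) in A
    return {(q, a): run(delta, q, h[a]) for q in states for a in alphabet}

# A: DFA over T = {a,b} for strings ending in b;  h(0) = ab, h(1) = ε
dA = {("n","a"):"n", ("n","b"):"y", ("y","a"):"n", ("y","b"):"y"}
h = {"0": "ab", "1": ""}
dA2 = inverse_hom_dfa(dA, {"n", "y"}, h, {"0", "1"})
# h^-1(M) = binary strings w with h(w) ending in b = strings with at least one 0
print(run(dA2, "n", "110") in {"y"})  # True
print(run(dA2, "n", "111") in {"y"})  # False
```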
Decision properties of regular languages
Any "decision problem" looks like this:
Input (generally a question) ---> [decision problem solver] ---> Yes / No
Membership question
Decision Problem: Given L, is w in L?
Possible answers: Yes or No
Approach:
1. Build a DFA for L
2. Input w to the DFA
3. If the DFA ends in an accepting state, then yes; otherwise no.
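The three membership steps as code (the dict-based DFA encoding is illustrative):

```python
def member(delta, start, finals, w):
    q = start                  # step 1: we are given a DFA for L
    for a in w:
        q = delta[(q, a)]      # step 2: input w to the DFA
    return q in finals         # step 3: yes iff we end in an accepting state

# DFA for binary strings with an even number of 1s
delta = {("e","0"):"e", ("e","1"):"o", ("o","0"):"o", ("o","1"):"e"}
print(member(delta, "e", {"e"}, "1010"))  # True
print(member(delta, "e", {"e"}, "10"))    # False
```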
Emptiness test
Decision Problem: Is L = Ø?
Approach: on a DFA for L:
1. From the start state, run a reachability test, which returns:
   success: if there is at least one final state that is reachable from the start state
   failure: otherwise
2. L = Ø if and only if the reachability test fails
How to implement the reachability test?
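One way to implement the reachability test is a breadth-first search over the transition graph from the start state (encoding illustrative):

```python
from collections import deque

def is_empty(delta, start, finals):
    seen, todo = {start}, deque([start])
    while todo:
        q = todo.popleft()
        if q in finals:
            return False              # a final state is reachable => L != Ø
        for (p, a), r in delta.items():
            if p == q and r not in seen:
                seen.add(r)
                todo.append(r)
    return True                       # no final state reachable => L = Ø

d = {("s","0"):"s", ("s","1"):"t", ("t","0"):"t", ("t","1"):"t"}
print(is_empty(d, "s", {"t"}))   # False: t is reachable
print(is_empty(d, "s", {"u"}))   # True: u is never reached
```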
Finiteness
Decision Problem: Is L finite or infinite?
Approach: on a DFA for L:
1. Remove all states unreachable from the start state
2. Remove all states that cannot lead to any accepting state
3. After removal, check for cycles in the resulting FA
4. L is finite if there are no cycles; otherwise it is infinite
Another approach: build a regular expression and look for a Kleene closure
How to implement steps 2 and 3?
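Steps 1-3 as a sketch: keep only states that are both reachable from the start (forward closure) and able to reach a final state (backward closure), then run a DFS cycle check over the survivors (encoding and names illustrative):

```python
def is_infinite(states, delta, start, finals):
    fwd = {q: set() for q in states}
    bwd = {q: set() for q in states}
    for (p, a), r in delta.items():
        fwd[p].add(r)
        bwd[r].add(p)
    def closure(seed, edges):
        seen, todo = set(seed), list(seed)
        while todo:
            for r in edges[todo.pop()]:
                if r not in seen:
                    seen.add(r)
                    todo.append(r)
        return seen
    live = closure({start}, fwd) & closure(finals, bwd)  # steps 1 and 2
    color = {q: 0 for q in live}  # 0 = new, 1 = on DFS stack, 2 = finished
    def has_cycle(q):             # step 3: any back edge means a cycle
        color[q] = 1
        for r in fwd[q]:
            if r in live and (color[r] == 1 or (color[r] == 0 and has_cycle(r))):
                return True
        color[q] = 2
        return False
    return any(color[q] == 0 and has_cycle(q) for q in live)
```

A cycle among surviving states can be pumped arbitrarily many times on some accepted string, which is exactly why cycles mean an infinite language.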
Finiteness test - examples
Ex 1) [Figure: a DFA in which, after removing unreachable and dead states, no cycle remains] - FINITE
Ex 2) [Figure: a DFA in which a cycle survives among the remaining states] - INFINITE
Equivalence & Minimization of DFAs
Applications of interest
Comparing two DFAs: is L(DFA1) == L(DFA2)?
How to minimize a DFA?
1. Remove unreachable states
2. Identify & condense equivalent states into one
When to call two states in a DFA "equivalent"?
Two states p and q are said to be equivalent (p ≡ q) iff:
i) any string w accepted by starting at p is also accepted by starting at q; AND
ii) any string w rejected by starting at p is also rejected by starting at q.
(Only the future matters - the past doesn't!)
Computing equivalent states in a DFA: the Table Filling Algorithm
Keep a table with one cell for every pair of states; fill each cell with X (distinguishable), = (equivalent), or blank (to be determined).
Pass #0:
1. Mark accepting states ≠ non-accepting states
Pass #1:
1. Compare every pair of states
2. Distinguish by one-symbol transitions
3. Mark = or ≠ or blank (tbd)
Pass #2:
1. Compare every pair of states
2. Distinguish by up to two-symbol transitions (until different or same or tbd)
….
(keep repeating until the table is complete)
[Figure: an example 8-state DFA (states A-H) over {0,1} and its pair table.]
Table Filling Algorithm - step by step
[Figure: the pair table for the example 8-state DFA (A-H), filled in over successive passes.]
1. Mark X between every accepting vs. non-accepting pair of states.
2. Pass 1: look one hop away for distinguishing states or strings.
3. Pass 2: look one hop away again for distinguishing states or strings.
(Continue until no new cell can be marked.)
Equivalences found in the example: A=B, C=H, D=G
Retain only one copy for each equivalence set of states.
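The passes above can be sketched as a fixpoint loop: mark final vs. non-final pairs first, then keep marking any pair whose successors on some symbol are already marked, until nothing changes. Unmarked pairs are the equivalent ones. (State names and the dict encoding are illustrative.)

```python
from itertools import combinations

def equivalent_pairs(states, alphabet, delta, finals):
    marked = {frozenset(pr) for pr in combinations(states, 2)
              if (pr[0] in finals) != (pr[1] in finals)}        # pass #0
    changed = True
    while changed:                                              # passes 1, 2, ...
        changed = False
        for p, q in combinations(states, 2):
            pair = frozenset((p, q))
            if pair in marked:
                continue
            for a in alphabet:
                succ = frozenset((delta[(p, a)], delta[(q, a)]))
                if len(succ) == 2 and succ in marked:           # distinguished
                    marked.add(pair)
                    changed = True
                    break
    return {frozenset(pr) for pr in combinations(states, 2)
            if frozenset(pr) not in marked}                     # unmarked = equal

# A and B differ only in name: both loop on 1 and move to the final state C on 0
d = {("A","0"):"C", ("A","1"):"A", ("B","0"):"C", ("B","1"):"B",
     ("C","0"):"C", ("C","1"):"C"}
print(equivalent_pairs({"A","B","C"}, {"0","1"}, d, {"C"}))
```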
Table Filling Algorithm - special case
Q) What happens if the input DFA has more than one final state?
Can all final states initially be treated as equivalent to one another?
Putting it all together … How to minimize a DFA?
Goal: minimize the number of states in a DFA.
Algorithm:
1. Eliminate states unreachable from the start state (depth-first traversal from the start state)
2. Identify and remove equivalent states (table filling algorithm)
3. Output the resultant DFA
Are Two DFAs Equivalent?
1. Make a new dummy DFA by just putting together both DFAs (a "unified" DFA containing both start states q0 and q0').
2. Run the table-filling algorithm on the unified DFA.
3. Is q0 ≡ q0'?
   IF the start states of both DFAs are found to be equivalent, THEN: DFA1 ≡ DFA2
   ELSE: different
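An equivalent way to decide q0 ≡ q0' (the slides do the same job with the table-filling algorithm on the unified DFA): walk both DFAs in lockstep over every reachable pair of states; the DFAs differ iff some reachable pair disagrees on acceptance. Encoding and names are illustrative.

```python
def dfas_equivalent(d1, s1, f1, d2, s2, f2, alphabet):
    seen, todo = {(s1, s2)}, [(s1, s2)]
    while todo:
        p, q = todo.pop()
        if (p in f1) != (q in f2):
            return False          # a distinguishing string reaches this pair
        for a in alphabet:
            nxt = (d1[(p, a)], d2[(q, a)])
            if nxt not in seen:
                seen.add(nxt)
                todo.append(nxt)
    return True

# dA and dB both accept "even number of 1s"; dC accepts "ends in 1"
dA = {("e","0"):"e", ("e","1"):"o", ("o","0"):"o", ("o","1"):"e"}
dB = {("x","0"):"x", ("x","1"):"y", ("y","0"):"y", ("y","1"):"x"}
dC = {("p","0"):"p", ("p","1"):"q", ("q","0"):"p", ("q","1"):"q"}
print(dfas_equivalent(dA, "e", {"e"}, dB, "x", {"x"}, {"0","1"}))  # True
print(dfas_equivalent(dA, "e", {"e"}, dC, "p", {"q"}, {"0","1"}))  # False
```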
Summary
How to prove languages are not regular? (Pumping lemma & its applications)
Closure properties of regular languages
Simplification of DFAs:
How to remove unreachable states?
How to identify and collapse equivalent states?
How to minimize a DFA?
How to tell whether two DFAs are equivalent?
Grammars
Unit - 3
Overview
1.Grammars & Chomsky hierarchy
2.Regular Grammar (RG)
3.Equivalence of Left & Right linear grammar
4.Equivalence of RG & FA
5.Context Free Grammars (CFG): Definition, Sentential forms, Leftmost & Rightmost
derivations, Parse tree, Ambiguity.
6.Simplification & Applications.
7.Normal Forms: Chomsky Normal Forms (CNF) & Greibach Normal Forms (GNF).
8.CFLs - Pumping lemma, Closure properties
A regular grammar is a type of grammar that describes a regular language. A regular
grammar is a mathematical object, G, which consists of four components, G = (N, ∑, P,
S), where
●N: non-empty, finite set of non-terminal symbols,
●∑: a finite set of terminal symbols (the alphabet),
●P: a set of grammar rules, each one having one of the forms
○A ⇢ aB
○A ⇢ a
○A ⇢ ∈, where ∈ = empty string, A, B ∈ N, a ∈ ∑
●S ∈ N is the start symbol
This grammar can be of two forms:
1.Right Linear Regular Grammar
2.Left Linear Regular Grammar
Right Linear Regular Grammar
In this type of regular grammar, all the non-terminals on the right-hand side exist at the
rightmost place, i.e., at the right ends.
Examples:
A ⇢ a, A ⇢ aB, A ⇢ ∈, where A and B are non-terminals, a is a terminal, and ∈ is the
empty string
S ⇢ 00B | 11S
B ⇢ 0B | 1B | 0 | 1
where S and B are non-terminals, and 0 and 1 are terminals
Left Linear Regular Grammar
In this type of regular grammar, all the non-terminals on the right-hand side exist at the
leftmost place, i.e., at the left ends.
Examples:
A ⇢ a, A ⇢ Ba, A ⇢ ∈, where A and B are non-terminals, a is a terminal, and ∈ is the
empty string
S ⇢ B00 | S11
B ⇢ B0 | B1 | 0 | 1
where S and B are non-terminals, and 0 and 1 are terminals
Context-Free Languages &
Grammars
(CFLs & CFGs)
Not all languages are regular
So what happens to the languages
which are not regular?
Can we still come up with a language
recognizer?
i.e., something that will accept (or reject)
strings that belong (or do not belong) to the
language?
Context-Free Languages
A language class larger than the class of regular
languages
Supports natural, recursive notation called “context-
free grammar”
Applications:
Parse trees, compilers
XML
[Figure: the regular languages (FA/RE) as a proper subset of the context-free languages (PDA/CFG).]
An Example
A palindrome is a word that reads the same from both ends
E.g., madam, redivider, malayalam, 010010010
Let L = { w | w is a binary palindrome}
Is L regular?
No.
Proof:
Let w = 0^N 1 0^N (assuming N to be the p/l constant)
By the Pumping lemma, w can be rewritten as xyz, such that x y^k z is also in L
(for any k ≥ 0)
But |xy| ≤ N and y ≠ ε
==> y = 0^+
==> x y^k z will NOT be in L for k=0
==> Contradiction
But the language of palindromes…
is a CFL, because it supports recursive substitution (in the form of a CFG).
This is because we can construct a "grammar" like this:
1. A ==> ε
2. A ==> 0
3. A ==> 1
4. A ==> 0A0
5. A ==> 1A1
(A is the variable or non-terminal; 0, 1, and ε are terminals; each rule is a production.)
How does this grammar work? Same as: A => 0A0 | 1A1 | 0 | 1 | ε
How does the CFG for palindromes work?
G: A => 0A0 | 1A1 | 0 | 1 | ε
An input string belongs to the language (i.e., is accepted) iff it can be generated by the CFG.
Example: w = 01110
G can generate w as follows:
1. A => 0A0
2.   => 01A10
3.   => 01110
Generating a string from a grammar:
1. Pick and choose a sequence of productions that would allow us to generate the string.
2. At every step, substitute one variable with one of its productions.
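The grammar's productions can be transcribed directly into a recursive membership check (this mirrors the palindrome grammar itself, not a general CFG parser):

```python
def derivable(w):
    if w in ("", "0", "1"):               # A => ε | 0 | 1
        return True
    if len(w) >= 2 and w[0] == w[-1]:     # A => 0A0 | 1A1
        return derivable(w[1:-1])         # peel the matched outer symbols
    return False

print(derivable("01110"))  # True
print(derivable("0101"))   # False
```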
Context-Free Grammar: Definition
A context-free grammar G = (V,T,P,S), where:
V: set of variables or non-terminals
T: set of terminals (= alphabet U {ε})
P: set of productions, each of which is of the form V ==> α1 | α2 | …, where each αi is an arbitrary string of variables and terminals
S: the start variable
CFG for the language of binary palindromes:
G = ({A}, {0,1}, P, A)
P: A ==> 0A0 | 1A1 | 0 | 1 | ε
More examples
Parenthesis matching in code
Syntax checking
In scenarios where there is a general need for:
- matching a symbol with another symbol, or
- matching a count of one symbol with that of another symbol, or
- recursively substituting one symbol with a string of other symbols
Example #2
Language of balanced parenthesis, e.g., ()(((())))((()))….
CFG? G: S => (S) | SS | ε
How would you "interpret" the string "(((()))()())" using this grammar?
Example #3
A grammar for L = {0^m 1^n | m ≥ n}
CFG? G:
S => 0S1 | A
A => 0A | ε
How would you interpret the string "00000111" using this grammar?
Example #4
A program containing if-then(-else) statements:
if Condition then Statement else Statement
(or)
if Condition then Statement
CFG?
More examples
L1 = {0^n | n ≥ 0}
L2 = {0^n | n ≥ 1}
L3 = {0^i 1^j 2^k | i=j or j=k, where i,j,k ≥ 0}
L4 = {0^i 1^j 2^k | i=j or i=k, where i,j,k ≥ 1}
Applications of CFLs & CFGs
Compilers use parsers for syntactic checking; parsers can be expressed as CFGs:
1. Balancing parenthesis: B ==> BB | (B) | Statement; Statement ==> …
2. If-then-else: S ==> SS | if Condition then Statement else Statement | if Condition then Statement | Statement; Condition ==> …; Statement ==> …
3. C parenthesis matching { … }
4. Pascal begin-end matching
5. YACC (Yet Another Compiler-Compiler)
Tag-Markup Languages
Roll ==> <ROLL> Class Students </ROLL>
Class ==> <CLASS> Text </CLASS>
Text ==> Char Text | Char
Char ==> a | b | … | z | A | B | .. | Z
Students ==> Student Students | ε
Student ==> <STUD> Text </STUD>
Here, the left-hand side of each production denotes a non-terminal (e.g., "Roll", "Class", etc.).
Those symbols on the right-hand side for which no productions (i.e., substitutions) are defined are terminals (e.g., 'a', 'b', '<', '>', "ROLL", etc.)
Structure of a production
A ==> α1 | α2 | … | αk
"A" is the head; each αi is a body; "==>" denotes derivation.
The above is the same as:
1. A ==> α1
2. A ==> α2
3. A ==> α3
…
k. A ==> αk
CFG conventions
Terminal symbols <== a, b, c, …
Non-terminal symbols <== A, B, C, …
Terminal or non-terminal symbols <== X, Y, Z
Terminal strings <== w, x, y, z
Arbitrary strings of terminals and non-terminals <== α, β, γ, ..
Syntactic Expressions in Programming Languages
result = a*b + score + 10 * distance + c
(Here "result", "a", "score", etc. are variables; operators are also terminals.)
Regular languages have only terminals:
Reg expression = [a-z][a-z0-1]*
If we allow only letters a & b, and 0 & 1 for constants (for simplification):
Regular expression = (a+b)(a+b+0+1)*
String membership
How to say if a string belongs to the language defined by a CFG?
1. Derivation: head to body
2. Recursive inference: body to head
Both are equivalent forms.
Example: w = 01110. Is w a palindrome?
G: A => 0A0 | 1A1 | 0 | 1 | ε
A => 0A0 => 01A10 => 01110
Simple Expressions…
We can write a CFG for accepting simple expressions:
G = (V,T,P,S)
V = {E,F}
T = {0,1,a,b,+,*,(,)}
S = {E}
P:
E ==> E+E | E*E | (E) | F
F ==> aF | bF | 0F | 1F | a | b | 0 | 1
Generalization of derivation
Derivation is head ==> body.
A ==> X (A derives X in a single step)
A ==>*_G X (A derives X in multiple steps)
Transitivity: IF A ==>*_G B, and B ==>*_G C, THEN A ==>*_G C
Context-Free Language
The language of a CFG, G = (V,T,P,S), denoted by L(G), is the set of terminal strings that have a derivation from the start variable S:
L(G) = { w in T* | S ==>*_G w }
Left-most & Right-most Derivation Styles
G:
E => E+E | E*E | (E) | F
F => aF | bF | 0F | 1F | ε
Derive the string a*(ab+10) from G:  E ==>*_G a*(ab+10)
Left-most derivation (always substitute the leftmost variable):
E ==> E * E ==> F * E ==> aF * E ==> a * E ==> a * (E) ==> a * (E + E)
  ==> a * (F + E) ==> a * (aF + E) ==> a * (abF + E) ==> a * (ab + E)
  ==> a * (ab + F) ==> a * (ab + 1F) ==> a * (ab + 10F) ==> a * (ab + 10)
Right-most derivation (always substitute the rightmost variable):
E ==> E * E ==> E * (E) ==> E * (E + E) ==> E * (E + F) ==> E * (E + 1F)
  ==> E * (E + 10F) ==> E * (E + 10) ==> E * (F + 10) ==> E * (aF + 10)
  ==> E * (abF + 10) ==> E * (ab + 10) ==> F * (ab + 10) ==> aF * (ab + 10)
  ==> a * (ab + 10)
Leftmost vs. Rightmost derivations
Q1) For every leftmost derivation, there is a rightmost derivation, and vice versa. True or False?
True - we will use parse trees to prove this.
Q2) Does every word generated by a CFG have a leftmost and a rightmost derivation?
Yes - easy to prove (reverse direction).
Q3) Could there be words which have more than one leftmost (or rightmost) derivation?
Yes - depending on the grammar.
How to prove that your CFGs are correct? (using induction)
CFG & CFL
Theorem: A string w in (0+1)* is in L(G_pal) if and only if w is a palindrome.
Proof: use induction
- on string length for the IF part
- on length of derivation for the ONLY IF part
G_pal: A => 0A0 | 1A1 | 0 | 1 | ε
Parse trees
Parse Trees
Each derivation in a CFG can be represented using a parse tree:
- Each internal node is labeled by a variable in V
- Each leaf is a terminal symbol
- For a production A ==> X1 X2 … Xk, any internal node labeled A has k children, labeled X1, X2, … Xk from left to right
Examples
Parse tree for 0110 (G: A => 0A0 | 1A1 | 0 | 1 | ε):
the root A has children 0, A, 0; the inner A has children 1, A, 1; the innermost A derives ε.
Parse tree for a + 1 (G: E => E+E | E*E | (E) | F; F => aF | bF | 0F | 1F | 0 | 1 | a | b):
the root E has children E, +, E; the left E derives F => a, and the right E derives F => 1.
(Reading a tree bottom-up is recursive inference; reading it top-down is derivation.)
Parse Trees, Derivations, and Recursive Inferences
For a production A ==> X1 .. Xi .. Xk: a derivation goes from head to body (top-down); recursive inference goes from body to head (bottom-up).
Interchangeability of the different CFG representations:
- Parse tree ==> left-most derivation: DFS left to right
- Parse tree ==> right-most derivation: DFS right to left
- Left-most derivation == right-most derivation (they yield the same parse tree)
- Derivation ==> recursive inference: reverse the order of productions
- Recursive inference ==> parse trees: bottom-up traversal of the parse tree
Connection between CFLs and RLs
CFLs & Regular Languages
A CFG is said to be right-linear if all the productions are of one of the following two forms: A ==> wB (or) A ==> w
Where: A & B are variables, and w is a string of terminals.
Theorem 1: Every right-linear CFG generates a regular language.
Theorem 2: Every regular language has a right-linear grammar.
Theorem 3: Left-linear CFGs also represent RLs.
What kind of grammars result for regular languages?
Some Examples
[Figure: two three-state automata over states A, B, C with 0/1 transitions, and the corresponding right-linear CFG:]
A => 01B | C
B => 11B | 0C | 1A
C => 1A | 0 | 1
Right linear CFG? Finite Automaton?
Ambiguity in CFGs and CFLs
Ambiguity in CFGs
A CFG is said to be ambiguous if there exists a string which has more than one left-most derivation.
Example:
S ==> AS | ε
A ==> A1 | 0A1 | 01
Input string 00111 can be derived in two ways:
LM derivation #1: S => AS => 0A1S => 0A11S => 00111S => 00111
LM derivation #2: S => AS => A1S => 0A11S => 00111S => 00111
Why does ambiguity matter?
E ==> E + E | E * E | (E) | a | b | c | 0 | 1
string = a * b + c
LM derivation #1: E => E + E => E * E + E ==>* a * b + c
(parse tree groups as (a*b)+c)
LM derivation #2: E => E * E => a * E => a * E + E ==>* a * b + c
(parse tree groups as a*(b+c))
The values are different!!!
The calculated value depends on which of the two parse trees is actually used.
Removing Ambiguity in Expression Evaluations
It MAY be possible to remove ambiguity for some CFLs, e.g., in a CFG for expression evaluation, by imposing rules & restrictions such as precedence.
This would imply a rewrite of the grammar. Precedence: (), *, +
Ambiguous version:
E ==> E + E | E * E | (E) | a | b | c | 0 | 1
Modified unambiguous version:
E => E + T | T
T => T * F | F
F => I | (E)
I => a | b | c | 0 | 1
How will this avoid ambiguity?
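One way to see how the layered grammar avoids ambiguity: a recursive-descent sketch with one function per variable (E handles +, T handles *, F handles atoms and parentheses). Because * can only appear below + in the grammar, a * b + c has exactly one reading. The tokenizer-free setup and names are illustrative.

```python
def evaluate(tokens, env):
    pos = [0]  # one-element list so the nested parsers can advance it
    def peek():
        return tokens[pos[0]] if pos[0] < len(tokens) else None
    def E():                      # E => E + T | T
        v = T()
        while peek() == "+":
            pos[0] += 1
            v += T()
        return v
    def T():                      # T => T * F | F  (* binds tighter than +)
        v = F()
        while peek() == "*":
            pos[0] += 1
            v *= F()
        return v
    def F():                      # F => I | (E)
        if peek() == "(":
            pos[0] += 1
            v = E()
            pos[0] += 1           # skip ")"
            return v
        tok = peek()
        pos[0] += 1
        return env[tok] if tok in env else int(tok)
    return E()

print(evaluate(list("a*b+c"), {"a": 2, "b": 3, "c": 4}))    # 10, i.e. (a*b)+c
print(evaluate(list("a*(b+c)"), {"a": 2, "b": 3, "c": 4}))  # 14
```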
Inherently Ambiguous CFLs
However, for some languages, it may not be possible to remove ambiguity.
A CFL is said to be inherently ambiguous if every CFG that describes it is ambiguous.
Example:
L = { a^n b^n c^m d^m | n,m ≥ 1} U {a^n b^m c^m d^n | n,m ≥ 1}
L is inherently ambiguous. Why?
Consider the input string a^n b^n c^n d^n.
Objective
1.What is a linear grammar? What is a left linear grammar? What is a right linear grammar?
2.Left linear grammars are evil - why?
3.What algorithm can be used to convert a left linear grammar to a right linear grammar?
Linear grammar
•A linear grammar is a context-free grammar that has at most one non-terminal symbol on the right hand side of each grammar rule.
–A rule may have just terminal symbols on the right hand side (zero non-terminals).
•Here is a linear grammar:
S → aA
A → aBb
B → Bb
Left linear grammar
•A left linear grammar is a linear grammar in which the non-terminal symbol always occurs on the left side.
•Here is a left linear grammar:
S → Aa
A → ab
Right linear grammar
•A right linear grammar is a linear grammar in which the non-terminal symbol
always occurs at the right end of the right hand side.
•Here is a right linear grammar:
S → abaA
A → ε
Left linear grammars are evil
•Consider this rule from a left linear grammar:
A → Babc
•Can that rule be used to recognize this string:
abbabc
•We need to check the rule for B:
B → Cb | D
•Now we need to check the rules for C and D.
•This is very complicated.
•Left linear grammars require complex parsers.
Right linear grammars are good
•Consider this rule from a right linear grammar:
A → abcB
•Can that rule be used to recognize this string:
abcabb
•We immediately see that the first part of the string – abc – matches the
first part of the rule. Thus, the problem simplifies to this: can the rule
for B be used to recognize this string:
abb
•Parsers for right linear grammars are much simpler.
Convert left linear to right linear
Now we will see an algorithm for converting any left linear grammar to its
equivalent right linear grammar.

left linear:        right linear:
S → Aa              S → abaA
A → ab              A → ε

Both grammars generate this language: {aba}
May need to make a new start symbol
The algorithm on the following slides assumes that the left linear grammar
doesn’t have any rules with the start symbol on the right hand side.
–If the left linear grammar has a rule with the start symbol S on the right
hand side, simply add this rule:
S0 → S
Symbols used by the algorithm
•Let S denote the start symbol
•Let A, B denote non-terminal symbols
•Let p denote zero or more terminal symbols
•Let ε denote the empty symbol
Algorithm
1)If the left linear grammar has a rule S → p, then make that a rule in the
right linear grammar
2)If the left linear grammar has a rule A → p, then add the following rule
to the right linear grammar: S → pA
3)If the left linear grammar has a rule B → Ap, then add the following rule
to the right linear grammar: A → pB
4)If the left linear grammar has a rule S → Ap, then add the following rule
to the right linear grammar: A → p
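The four rules above can be sketched in code. This is an illustrative implementation with an assumed encoding (rules as (head, body-list) pairs, single upper-case letters as non-terminals), not part of the original slides:

```python
# Sketch of the 4-rule left-linear to right-linear conversion.
def left_to_right_linear(rules, start="S"):
    out = []
    for head, body in rules:
        nts = [s for s in body if s.isupper()]   # non-terminals in the body
        if not nts:                               # head -> p (terminals only)
            if head == start:
                out.append((start, body))             # rule 1: keep S -> p
            else:
                out.append((start, body + [head]))    # rule 2: S -> p A
        else:
            A = nts[0]            # left linear: the non-terminal comes first
            p = body[1:]          # the trailing terminals
            if head == start:
                out.append((A, p))                    # rule 4: A -> p
            else:
                out.append((A, p + [head]))           # rule 3: A -> p B
    return out

# The slides' example: S -> Aa, A -> ab  becomes  S -> abA, A -> a
rules = [("S", ["A", "a"]), ("A", ["a", "b"])]
print(left_to_right_linear(rules))
# -> [('A', ['a']), ('S', ['a', 'b', 'A'])]
```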
Convert this left linear grammar

S → Aa
A → ab
left linear

Right hand side has terminals:
2) If the left linear grammar has this rule A → p, then add the following
rule to the right linear grammar: S → pA
S → abA
right linear

Right hand side of S has a non-terminal:
4) If the left linear grammar has S → Ap, then add the following rule to
the right linear grammar: A → p
S → abA
A → a
right linear

Equivalent!
S → Aa          S → abA
A → ab          A → a
left linear     right linear
Both grammars generate this language: {aba}
Convert this left linear grammar

original grammar:
S → Ab
S → Sb
A → Aa
A → a

make a new start symbol:
S0 → S
S → Ab
S → Sb
A → Aa
A → a
left linear

Right hand side has terminals:
2) If the left linear grammar has this rule A → p, then add the following
rule to the right linear grammar: S → pA
(from A → a:)
S0 → aA
right linear

Right hand side has a non-terminal:
3) If the left linear grammar has a rule B → Ap, add the following rule to
the right linear grammar: A → pB
(from S → Ab, A → Aa, S → Sb:)
S0 → aA
A → bS
A → aA
S → bS
right linear

Right hand side of the start symbol has a non-terminal:
4) If the left linear grammar has S → Ap, then add the following rule to
the right linear grammar: A → p
(from S0 → S:)
S0 → aA
A → bS
A → aA
S → bS
S → ε
right linear

Equivalent!
S0 → S          S0 → aA
S → Ab          A → bS
S → Sb          A → aA
A → Aa          S → bS
A → a           S → ε
left linear     right linear
Both grammars generate this language: {a+b+} (one or more a’s followed by
one or more b’s)
Will the algorithm always work?
•We have seen two examples where the algorithm creates a right linear
grammar that is equivalent to the left linear grammar.
•But will the algorithm always produce an equivalent grammar?
•Yes! The following slides show why.

Generate string p
•Let p = a string generated by the left linear grammar.
•We will show that the grammar generated by the algorithm also produces p.
Case 1: the start symbol produces p

Suppose the left linear grammar has this rule: S → p. Then the right linear
grammar will have the same rule (rule 1 of the algorithm). So the right
linear grammar will also produce p.
Case 2: multiple rules needed to produce p

Suppose p is produced by a sequence of n production rules:
S → A1p1
  → A2p2p1
  → A3p3p2p1
  → …
  → An-1pn-1…p3p2p1
  → pnpn-1…p3p2p1        (this final string is p)

Let’s see what right linear rules will be generated by the algorithm for
the rules implied by this production sequence. The left linear rules
implied by the sequence are:
S → A1p1
A1 → A2p2
A2 → A3p3
…
An-1 → pn

Algorithm:
1)If the left linear grammar contains S → p, then put that rule in the
right linear grammar.
2)If the left linear grammar contains A → p, then put this rule in the
right linear grammar: S → pA
3)If the left linear grammar contains B → Ap, then put this rule in the
right linear grammar: A → pB
4)If the left linear grammar contains S → Ap, then put this rule in the
right linear grammar: A → p

Feed the left linear rules to the algorithm, one at a time:

From S → A1p1, the algorithm adds (rule 4):
A1 → p1

From A1 → A2p2, the algorithm adds (rule 3):
A2 → p2A1
From A2 we obtain p2p1:
A2 → p2A1 → p2p1

From A2 → A3p3, the algorithm adds (rule 3):
A3 → p3A2
From A3 we obtain p3p2p1:
A3 → p3A2 → p3p2A1 → p3p2p1

…

From An-1 → pn, the algorithm adds (rule 2):
S → pnAn-1

Case 2 (concluded)

The complete set of right linear rules produced by the algorithm:
A1 → p1
A2 → p2A1
A3 → p3A2
…
An-1 → pn-1An-2
S → pnAn-1

From S we obtain pn…p2p1:
S → pnAn-1
  → pnpn-1An-2
  → …
  → pnpn-1…p3A2
  → pnpn-1…p3p2A1
  → pnpn-1…p3p2p1        (this is the desired string, p)

We have shown that for any string p generated by the left linear grammar,
the right linear grammar produced by the algorithm will also generate p.
How we understand the algorithm

left linear rules:        right linear rules (from the algorithm):
S → A1p1                  A1 → p1
A1 → A2p2                 A2 → p2A1
A2 → A3p3                 A3 → p3A2
…                         …
An-1 → pn                 An-1 → pn-1An-2
                          S → pnAn-1

The left linear rules descend through the non-terminals until reaching a
rule with only terminals on the RHS; those terminals are output, then we
unwind from the descent and output the terminals along the way.
The right linear grammar instead makes the rule with only terminals on the
RHS the starting rule and outputs its terminals first, then ascends through
the other rules. This is exactly what the algorithm’s rules capture:
–If the left linear grammar contains A → p, then put this rule in the
right linear grammar: S → pA
–If the left linear grammar contains B → Ap, then put this rule in the
right linear grammar: A → pB
–If the left linear grammar contains S → Ap, then put this rule in the
right linear grammar: A → p
Left-linear grammars generate Type 3 languages
•Every left-linear grammar can be converted to an equivalent right-linear
grammar.
–“Equivalent right-linear grammar” means the two grammars generate the
same language.
•Right-linear grammars generate Type 3 languages.
•Therefore, every left-linear grammar generates a Type 3 language.
Properties of Context-free Languages

1)Simplifying CFGs, Normal forms
2)Pumping lemma for CFLs
3)Closure and decision properties of CFLs

Three ways to simplify/clean a CFG

1.Eliminate useless symbols (clean)
2.Eliminate ε-productions (A => ε) (simplify)
3.Eliminate unit productions (A => B) (simplify)
Eliminating useless symbols
Grammar cleanup

A symbol X is reachable if there exists a derivation:
S ⇒* αXβ
A symbol X is generating if there exists a derivation:
X ⇒* w, for some w ∈ T*
For a symbol X to be “useful”, it has to be both reachable and generating:
S ⇒* αXβ ⇒* w’, for some w’ ∈ T*
  (reachable)  (generating)

Algorithm to detect useless symbols
1.First, eliminate all symbols that are not generating
2.Next, eliminate all symbols that are not reachable
Is the order of these steps important, or can we switch?
Example: Useless symbols

S → AB | a
A → b

1.A, S are generating
2.B is not generating (and therefore B is useless)
3.==> Eliminating B… (i.e., remove all productions that involve B):
  1.S → a
  2.A → b
4.Now, A is not reachable and therefore is useless
5.Simplified G:
  1.S → a

What would happen if you reverse the order, i.e., test reachability before
generating?
Will fail to remove: A → b
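The two-step cleanup above can be sketched as follows (an illustrative implementation with an assumed encoding: rules as (head, body-string) pairs, upper-case letters as variables, lower-case as terminals; not from the slides):

```python
# Remove useless symbols: first non-generating, then unreachable.
def generating(rules):
    gen = set()
    changed = True
    while changed:                 # fixed-point iteration (the basis is T)
        changed = False
        for head, body in rules:
            if head not in gen and all(s in gen or s.islower() for s in body):
                gen.add(head); changed = True
    return gen

def reachable(rules, start="S"):
    reach, frontier = {start}, [start]
    while frontier:                # graph search from S over rule bodies
        a = frontier.pop()
        for head, body in rules:
            if head == a:
                for s in body:
                    if s.isupper() and s not in reach:
                        reach.add(s); frontier.append(s)
    return reach

def remove_useless(rules, start="S"):
    gen = generating(rules)
    rules = [(h, b) for h, b in rules
             if h in gen and all(s.islower() or s in gen for s in b)]
    reach = reachable(rules, start)
    return [(h, b) for h, b in rules if h in reach]

# The slides' example: S -> AB | a, A -> b  simplifies to  S -> a
print(remove_useless([("S", "AB"), ("S", "a"), ("A", "b")]))
# -> [('S', 'a')]
```

Running the two steps in the reverse order would keep A → b, matching the slide's warning.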
Algorithm to find all generating symbols

Given: G=(V,T,P,S)
Basis:
Every symbol in T is obviously generating.
Induction:
Suppose there is a production A → α, where every symbol of α is generating.
Then, A is also generating.

Algorithm to find all reachable symbols

Given: G=(V,T,P,S)
Basis:
S is obviously reachable (from itself)
Induction:
Suppose there is a production A → X1X2…Xk, where A is reachable.
Then, all symbols on the right hand side, {X1, X2, …, Xk}, are also
reachable.
Eliminating ε-productions

A => ε

Caveat: It is not possible to eliminate ε-productions for languages which
include ε in their word set. So we will target the grammar for the rest of
the language.
Theorem: If G=(V,T,P,S) is a CFG for a language L, then L \ {ε} has a CFG
without ε-productions.

What’s the point of removing ε-productions?
Definition: A is “nullable” if A ⇒* ε
If A is nullable, then any production of the form B → CAD can be simulated
by:
B → CD | CAD
This can allow us to remove ε-productions for A (A → ε).

Algorithm to detect all nullable variables

Basis:
If A → ε is a production in G, then A is nullable
(note: A can still have other productions)
Induction:
If there is a production B → C1C2…Ck, where every Ci is nullable, then B
is also nullable.
Eliminating ε-productions

Given: G=(V,T,P,S)
Algorithm:
1.Detect all nullable variables in G
2.Then construct G1=(V,T,P1,S) as follows:
  i.For each production of the form A → X1X2…Xk, where k ≥ 1, suppose m
    out of the k Xi’s are nullable symbols
  ii.Then G1 will have 2^m versions for this production
    i.e., all combinations where each nullable Xi is either present or
    absent
  iii.Alternatively, if a production is of the form A → ε, then remove it
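The two phases above can be sketched in code (an illustrative implementation with an assumed encoding: rules as (head, body-string) pairs, "" standing for ε, upper-case letters as variables; not from the slides):

```python
from itertools import combinations

# Detect nullable variables by fixed-point iteration.
def nullables(rules):
    null = set()
    changed = True
    while changed:
        changed = False
        for head, body in rules:
            # body == "" (A -> ε) is the basis; all-nullable bodies the induction
            if head not in null and all(s in null for s in body):
                null.add(head); changed = True
    return null

# Emit the 2^m versions of each production, dropping resulting ε-bodies.
def eliminate_epsilon(rules):
    null = nullables(rules)
    new = set()
    for head, body in rules:
        idx = [i for i, s in enumerate(body) if s in null]
        for r in range(len(idx) + 1):
            for drop in combinations(idx, r):   # each nullable present/absent
                b = "".join(s for i, s in enumerate(body) if i not in drop)
                if b:                            # discard A -> ε
                    new.add((head, b))
    return sorted(new)

# The slides' example: S -> AB, A -> aAA | ε, B -> bBB | ε
rules = [("S", "AB"), ("A", "aAA"), ("A", ""), ("B", "bBB"), ("B", "")]
for head, body in eliminate_epsilon(rules):
    print(head, "->", body)
```

The output reproduces the slide's G1: S → A | B | AB, A → a | aA | aAA, B → b | bB | bBB.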
Example: Eliminating ε-productions

Let L be the language represented by the following CFG G:
i.  S → AB
ii. A → aAA | ε
iii.B → bBB | ε

Goal: to construct G1, which is the grammar for L - {ε}
Nullable symbols: {A, B}
G1 can be constructed from G as follows:
B → b | bB | bB | bBB
==> B → b | bB | bBB
Similarly, A → a | aA | aAA
Similarly, S → A | B | AB
Note: L(G) = L(G1) U {ε}

Simplified grammar G1:
•S → A | B | AB
•A → a | aA | aAA
•B → b | bB | bBB
Eliminating unit productions

A => B (B has to be a variable)

What’s the point of removing unit productions?
Will save #substitutions. E.g.,

before:                   after:
A => B | …                A => xxx | yyy | zzz | …
B => C | …                B => xxx | yyy | zzz | …
C => D | …                C => xxx | yyy | zzz | …
D => xxx | yyy | zzz      D => xxx | yyy | zzz
Eliminating unit productions

A unit production is one which is of the form A → B, where both A & B are
variables.
E.g.,
1.E → T | E+T
2.T → F | T*F
3.F → I | (E)
4.I → a | b | Ia | Ib | I0 | I1

How to eliminate unit productions?
Replace E → T with E → F | T*F
Then, upon recursive application wherever there is a unit production:
E → F | T*F | E+T                              (substituting for T)
E → I | (E) | T*F | E+T                        (substituting for F)
E → a | b | Ia | Ib | I0 | I1 | (E) | T*F | E+T  (substituting for I)
Now, E has no unit productions.
Similarly, eliminate for the remainder of the unit productions.
The Unit Pair Algorithm: to remove unit productions

Suppose A → B1 → B2 → … → Bn → α
Action: Replace all the intermediate productions so as to produce α
directly:
i.e., A → α; B1 → α; … Bn → α

Definition: (A,B) is a “unit pair” if A ⇒* B using only unit productions.
We can find all unit pairs inductively:
Basis: Every pair (A,A) is a unit pair (by definition). Similarly, if
A → B is a production, then (A,B) is a unit pair.
Induction: If (A,B) and (B,C) are unit pairs, then (A,C) is also a unit
pair.

The Unit Pair Algorithm: to remove unit productions

Input: G=(V,T,P,S)
Goal: to build G1=(V,T,P1,S) devoid of unit productions
Algorithm:
1.Find all unit pairs in G
2.For each unit pair (A,B) in G:
  1.Add to P1 a new production A → α, for every B → α which is a non-unit
    production
  2.If a resulting production is already there in P, then there is no need
    to add it.
Example: eliminating unit productions

G:
1.E → T | E+T
2.T → F | T*F
3.F → I | (E)
4.I → a | b | Ia | Ib | I0 | I1

Unit pairs, and the non-unit productions to be added to P1:
(E,E): E → E+T
(E,T): E → T*F
(E,F): E → (E)
(E,I): E → a | b | Ia | Ib | I0 | I1
(T,T): T → T*F
(T,F): T → (E)
(T,I): T → a | b | Ia | Ib | I0 | I1
(F,F): F → (E)
(F,I): F → a | b | Ia | Ib | I0 | I1
(I,I): I → a | b | Ia | Ib | I0 | I1

G1:
1.E → E+T | T*F | (E) | a | b | Ia | Ib | I0 | I1
2.T → T*F | (E) | a | b | Ia | Ib | I0 | I1
3.F → (E) | a | b | Ia | Ib | I0 | I1
4.I → a | b | Ia | Ib | I0 | I1
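The unit pair algorithm can be sketched in code (an illustrative implementation with an assumed encoding: productions as (head, body-tuple) pairs; not from the slides):

```python
# Unit pair algorithm: compute all unit pairs, then copy over every
# non-unit production of B to A for each unit pair (A,B).
def eliminate_units(rules, variables):
    is_unit = lambda body: len(body) == 1 and body[0] in variables
    pairs = {(A, A) for A in variables}                       # basis
    pairs |= {(A, body[0]) for A, body in rules if is_unit(body)}
    changed = True
    while changed:                    # induction: transitive closure
        changed = False
        for (A, B) in list(pairs):
            for (B2, C) in list(pairs):
                if B == B2 and (A, C) not in pairs:
                    pairs.add((A, C)); changed = True
    return sorted({(A, body) for (A, B) in pairs
                   for (head, body) in rules
                   if head == B and not is_unit(body)})

# A small slice of the slides' expression grammar.
rules = [("E", ("T",)), ("E", ("E", "+", "T")),
         ("T", ("F",)), ("T", ("T", "*", "F")),
         ("F", ("I",)), ("F", ("(", "E", ")")),
         ("I", ("a",)), ("I", ("b",))]
for head, body in eliminate_units(rules, {"E", "T", "F", "I"}):
    print(head, "->", " ".join(body))
```

The result contains, e.g., E → E+T | T*F | (E) | a | b, with no unit productions left.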
Putting all this together…

Theorem: If G is a CFG for a language that contains at least one string
other than ε, then there is another CFG G1, such that L(G1) = L(G) - {ε},
and G1 has:
  no ε-productions
  no unit productions
  no useless symbols

Algorithm:
Step 1) eliminate ε-productions
Step 2) eliminate unit productions
Step 3) eliminate useless symbols
Again, the order is important!
Normal Forms
Why normal forms?

If all productions of the grammar could be expressed in the same form(s),
then:
a.It becomes easy to design algorithms that use the grammar
b.It becomes easy to show proofs and properties

Chomsky Normal Form (CNF)

Let G be a CFG for some L - {ε}
Definition:
G is said to be in Chomsky Normal Form if all its productions are in one
of the following two forms:
i. A → BC, where A, B, C are variables, or
ii.A → a, where a is a terminal
Also:
  G has no useless symbols
  G has no unit productions
  G has no ε-productions
How to convert a G into CNF?

Assumption: G has no ε-productions, unit productions or useless symbols

1) For every terminal a that appears in the body of a production:
  i. create a unique variable, say Xa, with a production Xa → a, and
  ii.replace all other instances of a in G by Xa
2) Now, all productions will be in one of the following two forms:
  A → B1B2…Bk (k ≥ 2), or A → a
3) Replace each production of the form A → B1B2B3…Bk (k ≥ 3) by a chain of
two-variable productions, using new variables C1, C2, …:
  A → B1C1
  C1 → B2C2
  …
  Ck-2 → Bk-1Bk
and so on…
Example #1

G:
S => AS | BABC
A => A1 | 0A1 | 01
B => 0B | 0
C => 1C | 1

G in CNF:
X0 => 0
X1 => 1
S => AS | BY1
Y1 => AY2
Y2 => BC
A => AX1 | X0Y3 | X0X1
Y3 => AX1
B => X0B | 0
C => X1C | 1

All productions are of the form A => BC or A => a.
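Step 3 of the conversion (breaking long bodies into two-variable chains) can be sketched as follows (an illustrative implementation; the fresh-variable names C1, C2, … are assumed, and terminals are taken to have been replaced by variables already, as in step 1):

```python
# CNF step 3: split A -> B1 B2 ... Bk (k >= 3) into a chain of pairs.
def split_long(rules):
    out, count = [], 0
    for head, body in rules:
        while len(body) > 2:
            count += 1
            fresh = f"C{count}"                 # assumed fresh-variable naming
            out.append((head, (body[0], fresh)))  # A -> B1 C1
            head, body = fresh, body[1:]          # continue with C1 -> B2...Bk
        out.append((head, tuple(body)))
    return out

# S -> A S | B A B C (terminals already replaced by variables)
rules = [("S", ("A", "S")), ("S", ("B", "A", "B", "C"))]
print(split_long(rules))
# -> [('S', ('A', 'S')), ('S', ('B', 'C1')), ('C1', ('A', 'C2')), ('C2', ('B', 'C'))]
```

This mirrors the slide's Y1, Y2 variables for S => BABC, just with different fresh names.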
Example #2

G:
1.E → E+T | T*F | (E) | Ia | Ib | I0 | I1
2.T → T*F | (E) | Ia | Ib | I0 | I1
3.F → (E) | Ia | Ib | I0 | I1
4.I → a | b | Ia | Ib | I0 | I1

Step (1): introduce a variable Xa (with Xa → a) for every terminal a in a
body:
1.E → EX+T | TX*F | X(EX) | IXa | IXb | IX0 | IX1
2.T → TX*F | X(EX) | IXa | IXb | IX0 | IX1
3.F → X(EX) | IXa | IXb | IX0 | IX1
4.I → Xa | Xb | IXa | IXb | IX0 | IX1
5.X+ → +
6.X* → *
7.X( → (
8.X) → )
…….

Step (2): break the long bodies into pairs:
1.E → EC1 | TC2 | X(C3 | IXa | IXb | IX0 | IX1
2.C1 → X+T
3.C2 → X*F
4.C3 → EX)
5.T → ……
….
Languages with ε

For languages that include ε:
  Write down the rest of the grammar in CNF
  Then add the production “S => ε” at the end

E.g., consider G:
S => AS | BABC
A => A1 | 0A1 | 01 | ε
B => 0B | 0 | ε
C => 1C | 1 | ε

G in CNF:
X0 => 0
X1 => 1
S => AS | BY1 | ε
Y1 => AY2
Y2 => BC
A => AX1 | X0Y3 | X0X1
Y3 => AX1
B => X0B | 0
C => X1C | 1
Other Normal Forms

Greibach Normal Form (GNF)
All productions of the form A ==> aα, where a is a terminal and α is a
string of zero or more variables

Return of the Pumping Lemma !!

Think of languages that cannot be CFLs
== think of languages for which a stack will not be enough
e.g., the language of strings of the form ww
Why pumping lemma?
A result that will be useful in proving
languages that are not CFLs
(just like we did for regular languages)
But before we prove the pumping
lemma for CFLs ….
Let us first prove an important property
about parse trees
The “parse tree theorem”

Given:
Suppose we have a parse tree for a string w, according to a CNF grammar
G=(V,T,P,S). Let h be the height of the parse tree.
Implies:
|w| ≤ 2^(h-1)

(Figure: the parse tree for w, with its longest root-to-leaf path
S = A0, A1, A2, …, Ah-1, ending in a terminal a; h = tree height.)

In other words, a CNF parse tree’s string yield (w) can be no longer than
2^(h-1).
Observe that any parse tree generated by a CNF grammar will be a binary
tree, where all internal nodes have exactly two children (except those
nodes connected to the leaves).

Implication of the Parse Tree Theorem (assuming CNF)

Fact:
If the height of a parse tree is h, then |w| ≤ 2^(h-1)
Implication:
If |w| ≥ 2^m, then its parse tree’s height is at least m+1
The Pumping Lemma for CFLs

Let L be a CFL.
Then there exists a constant N, s.t. if z ∈ L and |z| ≥ N, then we can
write z = uvwxy, such that:
1.|vwx| ≤ N
2.vx ≠ ε
3.For all k ≥ 0: u v^k w x^k y ∈ L
Note: we are pumping in two places (v & x)

Proof: Pumping Lemma for CFLs

If L = Φ or contains only ε, then the lemma is trivially satisfied (as it
cannot be violated).
For any other L which is a CFL:
  Let G be a CNF grammar for L
  Let m = number of variables in G
  Choose N = 2^m
  Pick any z ∈ L s.t. |z| ≥ N
==> the parse tree for z has height ≥ m+1
  (by the parse tree theorem)
Parse tree for z

Consider the longest root-to-leaf path in the parse tree for z:
S = A0, A1, A2, …, Ah, where h ≥ m+1 and Ah yields a terminal a.
The last m+1 variables on this path cannot all be distinct (G has only m
variables), so there is a repetition: Ai = Aj, for some h-m ≤ i < j ≤ h.
Meaning: repetition in the last m+1 variables.

Write z = uvwxy, where:
  w = the yield of Aj’s subtree
  vwx = the yield of Ai’s subtree
  u and y = the parts of z to the left and right of vwx
Since the grammar is in CNF and Ai’s subtree properly contains Aj’s
subtree, v and x cannot both be empty.
•Therefore, vx ≠ ε

Extending the parse tree…

Because Ai = Aj, we can replace Aj’s subtree with a copy of Ai’s subtree
(k times), yielding parse trees for u v^k w x^k y for every k ≥ 1.
Or, replacing Ai’s subtree with Aj’s subtree yields a parse tree for
z = uwy (the k = 0 case).
==> For all k ≥ 0: u v^k w x^k y ∈ L

Proof contd..

•Also, since Ai’s subtree is no taller than m+1, the string generated
under Ai’s subtree, which is vwx, cannot be longer than 2^m (= N).
==> |vwx| ≤ N
This completes the proof of the pumping lemma.
Application of the Pumping Lemma for CFLs

Example 1: L = { a^m b^m c^m | m > 0 }
Claim: L is not a CFL
Proof:
  Let N <== the pumping lemma constant
  Pick z = a^N b^N c^N
  Apply the pumping lemma to z and show that there exists at least one
  other string constructed from z (obtained by pumping up or down) that
  is ∉ L

Proof contd…
z = uvwxy
As z = a^N b^N c^N and |vwx| ≤ N and vx ≠ ε
==> v, x cannot together contain all three symbols (a, b, c)
==> we can pump up or pump down to build another string which is ∉ L
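The argument can be checked mechanically for a small N (an illustration, not part of the proof): for every legal decomposition z = uvwxy, pumping up once already leaves L, because v and x together touch at most two of the three symbols.

```python
# Brute-force check of the pumping argument for L = { a^m b^m c^m }, N = 4.
def in_L(s):
    m = len(s) // 3
    return m > 0 and s == "a" * m + "b" * m + "c" * m

N = 4
z = "a" * N + "b" * N + "c" * N
for i in range(len(z) + 1):              # z = u v w x y, with
    for j in range(i, len(z) + 1):       # v = z[i:j], w = z[j:k], x = z[k:l]
        for k in range(j, len(z) + 1):
            for l in range(k, len(z) + 1):
                u, v, w, x, y = z[:i], z[i:j], z[j:k], z[k:l], z[l:]
                if len(v + w + x) <= N and v + x != "":
                    # pumping up once leaves L for EVERY legal decomposition:
                    # some symbol's count stays N while the length grows
                    assert not in_L(u + v * 2 + w + x * 2 + y)
print("no legal decomposition of z survives pumping")
```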
Example #2 for pumping lemma application

L = { ww | w is in {0,1}* }
Show that L is not a CFL
  Try string z = 0^N 0^N — what happens?
  Try string z = 0^N 1^N 0^N 1^N — what happens?

Other examples
L = { 0^(k^2) | k is any integer }
L = { a^i b^j c^k | i < j < k }
CFL Closure Property Results

CFLs are closed under:
  Union
  Concatenation
  Kleene closure operator
  Substitution
  Homomorphism, inverse homomorphism
  Reversal
CFLs are not closed under:
  Intersection
  Difference
  Complementation
(Note: regular languages are closed under all of these operators.)

Strategy for Closure Property Proofs

First prove “closure under substitution”.
Using that result, prove the other closure properties:
CFLs are closed under:
  Union
  Concatenation
  Kleene closure operator
  Substitution   <== prove this first
  Homomorphism, inverse homomorphism
  Reversal
The Substitution operation

For each a ∈ ∑, let s(a) be a language.
If w = a1a2…an ∈ L, then:
s(w) = { x1x2…xn } ⊆ s(L), s.t. xi ∈ s(ai)

Example:
Let ∑ = {0,1}
Let: s(0) = { a^n b^n | n ≥ 1 }, s(1) = {aa, bb}
If w = 01, s(w) = s(0).s(1)
E.g., s(w) contains abaa, abbb, aabbaa, aabbbb, … and so on.

Note: s(L) can use a different alphabet.
CFLs are closed under Substitution

IF L is a CFL, and a substitution s defined on L is s.t. s(a) is a CFL for
every symbol a, THEN s(L) is also a CFL.

What is s(L)? If L = {w1, w2, w3, w4, …}, then
s(L) = s(w1) U s(w2) U s(w3) U s(w4) U …
Note: each s(wi) is itself a set of strings.

Proof idea:
G=(V,T,P,S): CFG for L
Because every s(a) is a CFL, there is a CFG for each s(a):
  Let Ga = (Va, Ta, Pa, Sa)
Construct G’=(V’,T’,P’,S) for s(L).
P’ consists of:
  The productions of P, but with every occurrence of terminal “a” in their
  bodies replaced by Sa.
  All productions in any Pa, for any a ∈ ∑
Parse tree for G’: a parse tree of G for some w = a1a2…an, whose leaves
Sa1, Sa2, …, San are expanded into subtrees yielding x1, x2, …, xn, with
each xi ∈ s(ai).
Substitution of a CFL: example

Let L = the language of binary palindromes, with substitutions for 0 and 1
defined as follows:
s(0) = { a^n b^n | n ≥ 1 }, s(1) = {xx, yy}
Prove that s(L) is also a CFL.

CFG for L:
S => 0S0 | 1S1 | ε
CFG for s(0):
S0 => aS0b | ab
CFG for s(1):
S1 => xx | yy

Therefore, CFG for s(L):
S => S0SS0 | S1SS1 | ε
S0 => aS0b | ab
S1 => xx | yy
CFLs are closed under union

Let L1 and L2 be CFLs. To show: L1 U L2 is also a CFL.

Let us show this using the result on substitution:
Make a new language Lnew = {a, b} s.t. s(a) = L1 and s(b) = L2
==> s(Lnew) is the same as L1 U L2

A more direct, alternative proof:
Let S1 and S2 be the starting variables of the grammars for L1 and L2.
Then, Snew => S1 | S2

CFLs are closed under concatenation

Let L1 and L2 be CFLs.
Let us show this using the result on substitution:
Make Lnew = {ab} s.t. s(a) = L1 and s(b) = L2
==> L1L2 = s(Lnew)
A proof without using substitution?

CFLs are closed under Kleene Closure

Let L be a CFL.
Let Lnew = {a}* and s(a) = L
Then, L* = s(Lnew)
CFLs are closed under Reversal

(We won’t use substitution to prove this result.)
Let L be a CFL, with grammar G=(V,T,P,S)
For L^R, construct G^R=(V,T,P^R,S) s.t.:
If A ==> α is in P, then A ==> α^R is in P^R
(that is, reverse the body of every production)
Some negative closure results

CFLs are not closed under Intersection

Existential proof:
L1 = { 0^n 1^n 2^i | n ≥ 1, i ≥ 1 }
L2 = { 0^i 1^n 2^n | n ≥ 1, i ≥ 1 }
Both L1 and L2 are CFLs (grammars?)
But L1 ∩ L2 = { 0^n 1^n 2^n | n ≥ 1 } cannot be a CFL
Why? (It fails the pumping lemma, just like Example 1.)
We have an example where intersection is not closed.
Therefore, CFLs are not closed under intersection.
CFLs are not closed under complementation

Follows from the fact that CFLs are not closed under intersection.
By De Morgan’s law:
L1 ∩ L2 = complement( complement(L1) U complement(L2) )
Logic: if CFLs were closed under complementation, then
  the whole right hand side would be a CFL (because CFLs are closed under
  union), and so
  the left hand side (the intersection) would also be a CFL,
  but we just showed CFLs are NOT closed under intersection!
Therefore, CFLs cannot be closed under complementation.

CFLs are not closed under difference

Follows from the fact that CFLs are not closed under complementation.
Because, if CFLs were closed under difference, then:
complement(L) = ∑* - L
so complement(L) would have to be a CFL too. Contradiction.
Decision Properties

Decidable problems for CFLs:
  Emptiness test
  Generating test
  Reachability test
  Membership test
  PDA acceptance

“Undecidable” problems for CFLs

Is a given CFG G ambiguous?
Is a given CFL inherently ambiguous?
Is the intersection of two CFLs empty?
Are two CFLs the same?
Is a given L(G) equal to ∑*?
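The membership test above is typically decided with the CYK algorithm, which needs the grammar in CNF. A minimal sketch (the dict-based grammar encoding is an assumption for illustration, not from the slides):

```python
# CYK membership test for a grammar in CNF.
def cyk(word, unary, binary, start="S"):
    """unary: {terminal: set of variables}; binary: {(B,C): set of variables}."""
    n = len(word)
    # table[i][j] = set of variables deriving word[i:i+j+1]
    table = [[set() for _ in range(n)] for _ in range(n)]
    for i, ch in enumerate(word):                    # length-1 substrings
        table[i][0] = set(unary.get(ch, ()))
    for length in range(2, n + 1):
        for i in range(n - length + 1):
            for split in range(1, length):           # try every split point
                for B in table[i][split - 1]:
                    for C in table[i + split][length - split - 1]:
                        table[i][length - 1] |= binary.get((B, C), set())
    return start in table[0][n - 1]

# A CNF grammar (assumed) for { 0^n 1^n | n >= 1 }:
# S -> X0 Y | X0 X1, Y -> S X1, X0 -> 0, X1 -> 1
unary = {"0": {"X0"}, "1": {"X1"}}
binary = {("X0", "Y"): {"S"}, ("S", "X1"): {"Y"}, ("X0", "X1"): {"S"}}
print(cyk("000111", unary, binary))  # True
print(cyk("0011", unary, binary))    # True
print(cyk("0101", unary, binary))    # False
```

This also illustrates why CNF is a convenient normal form: every body has exactly one or two symbols, so the dynamic-programming table only ever combines two cells.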