COSC 3101A -Design and
Analysis of Algorithms
10
BFS, DFS
Topological Sort
Strongly Connected Components
Many of these slides are taken from Monica Nicolescu, Univ. of Nevada, Reno, [email protected]
7/6/2004 Lecture 10 COSC3101A 2
Searching in a Graph
•Graph searching= systematically follow the
edges of the graph so as to visit the vertices of
the graph
•Two basic graph searching algorithms:
–Breadth-first search
–Depth-first search
–The difference between them is in the order in which
they explore the unvisited edges of the graph
•Graph algorithms are typically elaborations of
the basic graph-searching algorithms
7/6/2004 Lecture 10 COSC3101A 3
Breadth-First Search (BFS)
•Input:
–A graph G = (V, E)(directed or undirected)
–A sourcevertex s V
•Goal:
–Explore the edges of Gto “discover” every vertex
reachable from s, taking the ones closest to sfirst
•Output:
–d[v]= distance (smallest # of edges) from sto v, for
all v V
–A “breadth-first tree” rooted at sthat contains all
reachable vertices
7/6/2004 Lecture 10 COSC3101A 4
Breadth-First Search (cont.)
•Discover vertices in increasing order of distance
from the source s –search in breadthnot depth
–Find all vertices at 1 edge from s, then all vertices at 2
edges from s, and so on
1 2
5 4
3
7
7
9
11
12
6
7/6/2004 Lecture 10 COSC3101A 5
Breadth-First Search (cont.)
•Keeping track of progress:
–Color each vertex in either white,
grayor black
–Initially, all vertices are white
–When being discovered a vertex
becomes gray
–After discovering all its adjacent
vertices the node becomes black
–Use FIFO queue Q to maintain the
set of gray vertices
1 2
5 4
3
1 2
5 4
3
source
1 2
5 4
3
7/6/2004 Lecture 10 COSC3101A 6
Breadth-First Tree
•BFS constructs a breadth-first tree
–Initially contains the root (source vertex s)
–When vertex vis discovered while scanning
the adjacency list of a vertex uvertex v
and edge (u, v)are added to the tree
–uis the predecessor(parent) of vin the
breadth-first tree
–A vertex is discovered only once it has at
most one parent
1 2
5 4
3
source
7/6/2004 Lecture 10 COSC3101A 7
BFS Additional Data Structures
•G = (V, E) represented using adjacency lists
•color[u]–the color of the vertex for all u V
•[u]–predecessor of u
–If u = s(root) or node uhas not yet been
discovered [u] = NIL
•d[u]–the distance from the source sto
vertex u
•Use a FIFO queue Q to maintain the set of
gray vertices
1 2
5 4
3
d=1
=1
d=1
=1
d=2
=5
d=2
=2
source
7/6/2004 Lecture 10 COSC3101A 8
BFS(G, s)
1.for each u V[G] -{s}
2. docolor[u]WHITE
3. d[u]←
4. [u] = NIL
5.color[s] GRAY
6.d[s] ← 0
7.[s] = NIL
8.Q
9.Q← ENQUEUE(Q, s)
Q: s
0
r s t u
v w x y
r s t u
v w x y
r s t u
v w x y
7/6/2004 Lecture 10 COSC3101A 9
BFS(G, s)
10.while Q
11.do u← DEQUEUE(Q)
12. for each v Adj[u]
13. do if color[v] = WHITE
14. thencolor[v]←GRAY
15. d[v] ←d[u] + 1
16. [v] = u
17. ENQUEUE(Q, v)
18. color[u]BLACK
0
1
r s t u
v w x y
Q: w
Q: s
0
r s t u
v w x y
1 0
1
r s t u
v w x y
Q: w, r
7/6/2004 Lecture 10 COSC3101A 10
Example
1 0
1
r s t u
v w x y
Q: s
0
r s t u
v w x y
Q: w, r
v w x y
1 0 2
1 2
r s t u
Q: r, t, x
1 0 2
2 1 2
r s t u
v w x y
Q: t, x, v
1 0 2 3
2 1 2
r s t u
v w x y
Q: x, v, u
1 0 2 3
2 1 2 3
r s t u
v w x y
Q: v, u, y
1 0 2 3
2 1 2 3
r s t u
v w x y
Q: u, y
1 0 2 3
2 1 2 3
r s t u
v w x y
Q: y
r s t u
1 0 2 3
2 1 2 3
v w x y
Q:
7/6/2004 Lecture 10 COSC3101A 12
Analysis of BFS
10.while Q
11.do u← DEQUEUE(Q)
12. for each v Adj[u]
13. do if color[v] = WHITE
14. thencolor[v]= GRAY
15. d[v] ←d[u] + 1
16. [v] = u
17. ENQUEUE(Q, v)
18. color[u]BLACK
(1)
(1)
Scan Adj[u] for all vertices
in the graph
•Each vertex is scanned only
once, when the vertex is
dequeued
•Sum of lengths of all
adjacency lists = (E)
•Scanning operations:
O(E)
•Total running time for BFS = O(V + E)
7/6/2004 Lecture 10 COSC3101A 13
Shortest Paths Property
•BFS finds the shortest-path distance from the
source vertex s Vto each node in the graph
•Shortest-path distance = (s, u)
–Minimum number of edges in any path from sto u
r s t u
1 0 2 3
2 1 2 3
v w x y
source
7/6/2004 Lecture 10 COSC3101A 14
Depth-First Search
•Input:
–G= (V, E)(No source vertex given!)
•Goal:
–Explore the edges of G to “discover” every vertex in V
starting at the most current visited node
–Search may be repeated from multiple sources
•Output:
–2 timestamps on each vertex:
•d[v]= discovery time
•f[v]= finishing time (done with examining v’s adjacency list)
–Depth-first forest
1 2
5 4
3
7/6/2004 Lecture 10 COSC3101A 15
Depth-First Search
•Search “deeper” in the graph whenever
possible
•Edges are explored out of the most recently
discovered vertex vthat still has unexplored
edges
•After all edges of v have been explored, the search
“backtracks” from the parent of v
•The process continues until all vertices reachable from the
original source have been discovered
•If undiscovered vertices remain, choose one of them as a
new source and repeat the search from that vertex
•DFS creates a “depth-first forest”
1 2
5 4
3
7/6/2004 Lecture 10 COSC3101A 16
DFS Additional Data Structures
•Global variable: time-step
–Incremented when nodes are discovered/finished
•color[u] –similar to BFS
–White before discovery, gray while processing and
black when finished processing
•[u]–predecessor of u
•d[u], f[u]–discovery and finish times
GRAYWHITE BLACK
0 2Vd[u] f[u]
1 ≤ d[u] < f [u] ≤ 2 |V|
7/6/2004 Lecture 10 COSC3101A 17
DFS(G)
1.for each u V[G]
2. do color[u]← WHITE
3. [u] ← NIL
4.time← 0
5.for each u V[G]
6. do if color[u] = WHITE
7. then DFS-VISIT(u)
•Every time DFS-VISIT(u) is called, ubecomes the
root of a new tree in the depth-first forest
u v w
x y z
7/6/2004 Lecture 10 COSC3101A 18
DFS-VISIT(u)
1.color[u]← GRAY
2.time← time+1
3.d[u]← time
4.for each v Adj[u]
5. do if color[v]= WHITE
6. then [v] ← u
7. DFS-VISIT(v)
8.color[u]← BLACK
9.time← time + 1
10.f[u]← time
1/
u v w
x y z
u v w
x y z
time = 1
1/ 2/
u v w
x y z
7/6/2004 Lecture 10 COSC3101A 19
Example
1/ 2/
u v w
x y z
1/
u v w
x y z
1/ 2/
3/
u v w
x y z
1/ 2/
4/ 3/
u v w
x y z
1/ 2/
4/ 3/
u v w
x y z
B
1/ 2/
4/5 3/
u v w
x y z
B
1/ 2/
4/5 3/6
u v w
x y z
B
1/ 2/7
4/5 3/6
u v w
x y z
B
1/ 2/7
4/5 3/6
u v w
x y z
BF
7/6/2004 Lecture 10 COSC3101A 20
Example (cont.)
1/8 2/7
4/5 3/6
u v w
x y z
BF
1/8 2/7 9/
4/5 3/6
u v w
x y z
BF
1/8 2/7 9/
4/5 3/6
u v w
x y z
BF
C
1/8 2/7 9/
4/5 3/6 10/
u v w
x y z
BF
C
1/8 2/7 9/
4/5 3/6 10/
u v w
x y z
BF
C
B
1/8 2/7 9/
4/5 3/610/11
u v w
x y z
BF
C
B
1/8 2/79/12
4/5 3/610/11
u v w
x y z
BF
C
B
The results of DFS may depend on:
•The order in which nodes are explored
in procedure DFS
•The order in which the neighbors of a
vertex are visited in DFS-VISIT
7/6/2004 Lecture 10 COSC3101A 21
Edge Classification
•Tree edge (reaches a WHITE
vertex):
–(u, v)is a tree edge if v was first
discovered by exploring edge (u, v)
•Back edge (reaches a GRAY
vertex):
–(u, v), connecting a vertex uto an
ancestor vin a depth first tree
–Self loops (in directed graphs) are
also back edges
1/
u v w
x y z
1/ 2/
4/ 3/
u v w
x y z
B
7/6/2004 Lecture 10 COSC3101A 22
Edge Classification
•Forward edge (reaches a BLACK
vertex & d[u] < d[v]):
–Non-tree edges (u, v)that connect a vertex
uto a descendant vin a depth first tree
•Cross edge (reaches a BLACK vertex
& d[u] > d[v]):
–Can go between vertices in same depth-first
tree (as long as there is no ancestor /
descendant relation) or between different
depth-first trees
1/ 2/7
4/5 3/6
u v w
x y z
BF
1/8 2/7 9/
4/5 3/6
u v w
x y z
BF
C
7/6/2004 Lecture 10 COSC3101A 23
Analysis of DFS(G)
1.for each u V[G]
2. do color[u]← WHITE
3. [u] ← NIL
4.time← 0
5.for each u V[G]
6. do if color[u] = WHITE
7. then DFS-VISIT(u)
(V)
(V) –exclusive
of time for
DFS-VISIT
7/6/2004 Lecture 10 COSC3101A 24
Analysis of DFS-VISIT(u)
1.color[u]← GRAY
2.time← time+1
3.d[u]← time
4.for each v Adj[u]
5. do if color[v]= WHITE
6. then [v] ← u
7. DFS-VISIT(v)
8.color[u]← BLACK
9.time← time + 1
10.f[u]← time
Each loop takes
|Adj[v]|
DFS-VISIT is called exactly
once for each vertex
Total: Σ
vV|Adj[v]| + (V) =
(E)
(V + E)
7/6/2004 Lecture 10 COSC3101A 25
Properties of DFS
•u = [v] DFS-VISIT(v) was called
during a search of u’s adjacency list
•Vertex vis a descendant of vertex u
in the depth first forest vis
discovered during the time in which
uis gray
1/ 2/
3/
u v w
x y z
7/6/2004 Lecture 10 COSC3101A 26
Parenthesis Theorem
In any DFS of a graph G, for
all u, v, exactly one of the
following holds:
1.[d[u], f[u]] and [d[v], f[v]]are
disjoint, and neither of uand v
is a descendant of the other
2.[d[v], f[v]] is entirely within
[d[u], f[u]] and vis a
descendant of u
3.[d[u], f[u]] is entirely within
[d[v], f[v]] and uis a
descendant of v
3/6 2/9 1/10
4/5 7/8 12/13
uvwx
y z s
11/16
14/15
t
12345678910 131112 141516
s
z
t
v u
y w
x
(s(z(y(xx)y)(ww)z)s) u)(t(v (uu)t)
Well-formed expression: parenthesis are
properly nested
7/6/2004 Lecture 10 COSC3101A 27
Other Properties of DFS
Corollary
Vertex vis a proper descendant of u
d[u] < d[v] < f[v] < f[u]
Theorem (White-path Theorem)
In a depth-first forest of a graph G, vertex
vis a descendant of uif and only if at time
d[u], there is a path u vconsisting of
only white vertices.
1/ 2/
u
v
1/8 2/79/12
4/5 3/610/11
u
v
BF
C
B
7/6/2004 Lecture 10 COSC3101A 28
Topological Sort
Topological sortof a directed acyclic graph G =
(V, E): a linear order of vertices such that if there
exists an edge (u, v), then uappears before vin
the ordering.
•Directed acyclic graphs (DAGs)
–Used to represent precedence of events or processes
that have a partial order
abefore b bbefore c
bbefore c a before c
abefore c
What about
aand b?
Topological sort helps us establish a total order
7/6/2004 Lecture 10 COSC3101A 29
Topological Sort
undershorts
pants
belt
socks
shoes
watch
shirt
tie
jacket
TOPOLOGICAL-SORT(V, E)
1.Call DFS(V, E) to compute
finishing timesf[v] for each
vertex v
2.When each vertex is finished,
insert it onto the front of a
linked list
3.Return the linked list of
vertices
1/
2/
3/4
5
6/7
8
9/10
11/
12/
13/14
15
16 17/18
jackettiebeltshirtwatchshoespantsundershortssocks
Running time: (V + E)
7/6/2004 Lecture 10 COSC3101A 30
Topological Sort
undershorts
pants
belt
socks
shoes
watch
shirt
tie
jacket
1/
2/
3/4
5
6/7
8
9/10
11/
12/
13/14
15
16 17/18
jackettiebeltshirtwatchshoespantsundershortssocks
Topological sort:
an ordering of vertices along a
horizontal line so that all directed
edges go from left to right.
7/6/2004 Lecture 10 COSC3101A 31
Lemma
A directed graph is acyclica DFS on G yields
no back edges.
Proof:
“”: acyclic no back edge
–Assume back edgeprove cycle
–Assume there is a back edge (u, v)
vis an ancestor of u
there is a path from vto uin G (v u)
v u + the back edge (u, v) yield a cycle
v
u
(u, v)
7/6/2004 Lecture 10 COSC3101A 32
Lemma
A directed graph is acyclica DFS on G yields
no back edges.
Proof:
“”: no back edge acyclic
–Assume cycle prove back edge
–Suppose G contains cycle c
–Let vbe the first vertex discovered in c, and (u, v)be
the preceding edge in c
–At time d[v], vertices of cform a white path v u
–uis descendant of vin depth-first forest (by white-path
theorem)
(u, v)is a back edge
v
u
(u, v)
7/6/2004 Lecture 10 COSC3101A 33
Strongly Connected Components
Given directed graph G = (V, E):
A strongly connected component (SCC) of G
is a maximal set of vertices C V such that for
every pair of vertices u, vC, we have both
u vand v u.
7/6/2004 Lecture 10 COSC3101A 34
The Transpose of a Graph
•G
T
= transposeof G
–G
T
is G with all edges reversed
–G
T
= (V, E
T
), E
T
= {(u, v) : (v, u)E}
•If using adjacency lists: we can create G
T
in
(V + E) time
1 2
5 4
3
1 2
5 4
3
7/6/2004 Lecture 10 COSC3101A 35
Finding the SCC
•Observation: G and G
T
have the same SCC’s
–uand vare reachable from each other in G they are
reachable from each other in G
T
•Idea for computing the SCC of a DAG G = (V, E):
–Make two depth first searches: one on G and one on G
T
1 2
5 4
3
1 2
5 4
3
7/6/2004 Lecture 10 COSC3101A 36
STRONGLY-CONNECTED-COMPONENTS(G)
1.call DFS(G) to compute finishing times f[u]for
each vertex u
2.compute G
T
3.call DFS(G
T
), but in the main loop of DFS,
consider vertices in order of decreasing f[u]
(as computed in first DFS)
4.output the vertices in each tree of the depth-
first forest formed in second DFS as a separate
SCC
7/6/2004 Lecture 10 COSC3101A 37
Example
a b c d
e f g h
1/
2/3/4 65/7
8/11/
12/
13/ 91014
15
16
f
4
h
6
g
7
d
9
c
10
a
14
e
15
b
16
DFS on the initial graph G
DFS on G
T:
•start at b: visit a, e
•start at c: visit d
•start at g: visit f
•start at h
Strongly connected components: C
1= {a, b, e}, C
2= {c, d}, C
3= {f, g}, C
4= {h}
a b c d
e f g h
7/6/2004 Lecture 10 COSC3101A 38
Component Graph
•The component graphG
SCC
= (V
SCC
, E
SCC
):
–V
SCC
= {v
1, v
2, …, v
k}, where v
icorresponds to each
strongly connected component C
i
–There is an edge (v
i, v
j) E
SCC
if G contains a
directed edge (x, y) for some x C
iand y C
j
•The component graph is a DAG
a b c d
e f g h
a b e
c d
f g h
7/6/2004 Lecture 10 COSC3101A 39
Lemma 1
Let C and C’ be distinct SCC’s in G
Let u, vC, and u’, v’C’
Suppose there is a path u u’in G
Then there cannot also be a path v’ vin G.
u
v
u’
v’
Proof
•Suppose there is a path v’ v
•There exists u u’ v’
•There exists v’ v u
•uand v’are reachable from
each other, so they are not in
separate SCC’s: contradiction!
C C’
7/6/2004 Lecture 10 COSC3101A 40
Notations
•Extend notation for d(starting time) and f
(finishing time) to sets of vertices U V:
–d(U) = min
uU{ d[u]} (earliest discovery time)
–f(U) = max
uU{ f[u]} (latest finishing time)
a b c d
e f g h
1/
2/3/4 65/7
8/11/
12/
13/ 91014
15
16
C
1 C
2
C
3
C
4
d(C
1)
f(C
1)
d(C
2)
f(C
2)
d(C
3)
f(C
3)
d(C
4)
f(C
4)
=11
=16
=1
=10
=2
=7
=5
=6
7/6/2004 Lecture 10 COSC3101A 41
Lemma 2
•Let C and C’ be distinct SCCs in a directed
graph G = (V, E). If there is an edge (u, v)E,
where u C and vC’ then f(C) > f(C’).
•Consider C
1and C
2, connected by edge (b, c)
a b c d
e f g h
1/
2/3/4 65/7
8/11/
12/
13/ 91014
15
16
C
1 C
2
C
3
C
4
d(C
1)
f(C
1)
d(C
2)
f(C
2)
d(C
3)
f(C
3)
d(C
4)
f(C
4)
=11
=16
=1
=10
=3
=7
=5
=6
7/6/2004 Lecture 10 COSC3101A 42
Corollary
•Let C and C’ be distinct SCCs in a directed
graph G = (V, E). If there is an edge (u, v)E
T
,
where u C and vC’ then f(C) < f(C’).
•Consider C
2and C
1, connected by edge (c, b)
C
1= C’ C
2= C
C
3
C
4
a b c d
e f g h
•Since(c, b) E
T
(b, c) E
•From previous
lemma:
f(C
1) > f(C
2)
f(C’) > f(C)
7/6/2004 Lecture 10 COSC3101A 43
Discussion
f(C) < f(C’)
•Each edge in G
T
that goes between different
components goes from a component with an
earlier finish time (in the DFS) to one with a later
finish time
C
1= C’ C
2= C
C
3
C
4
a b c d
e f g h
7/6/2004 Lecture 10 COSC3101A 44
Why does SCC Work?
•When we do the second DFS, on G
T
, we start with a
component C such that f(C) is maximum (b, in our case)
•We start from band visit all vertices in C
1
•From corollary: f(C) > f(C’) for all C C’ there are no
edges from C to any other SCCs in G
T
DFS will visit only vertices in C
1
The depth-first tree rooted at bcontains exactly the
vertices of C
1 C
1 C
2
C
3 C
4
a b c d
e f g h
f
4
h
6
g
7
d
9
c
10
a
14
e
15
b
16
7/6/2004 Lecture 10 COSC3101A 45
Why does SCC Work? (cont.)
•The next root chosen in the second DFS is in SCC C
2
such that f(C) is maximum over all SCC’s other than C
1
•DFS visits all vertices in C
2
–the only edges out of C
2go to C
1, which we’ve already visited
The only tree edges will be to vertices in C
2
•Each time we choose a new root it can reach only:
–vertices in its own component
–vertices in components already visited
C
1 C
2
C
3 C
4
a b c d
e f g h
f
4
h
6
g
7
d
9
c
10
a
14
e
15
b
16
7/6/2004 Lecture 10 COSC3101A 46
Readings
•Chapter 22
•Appendix B