Exponents
X^0 = 1 by definition
X^a · X^b = X^(a+b)
X^a / X^b = X^(a-b)
Show that: X^(-n) = 1 / X^n
(X^a)^b = X^(ab)
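For the "Show that" item, a short derivation using only the rules above:
1 = X^0 = X^(n + (-n)) = X^n · X^(-n), so dividing both sides by X^n gives X^(-n) = 1 / X^n.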
Logarithms
log_a X = Y  ⇔  a^Y = X,  a > 0, X > 0
E.g.: log_2 8 = 3, because 2^3 = 8
log_a 1 = 0 because a^0 = 1
log X means log_2 X
lg X means log_10 X
ln X means log_e X,
where 'e' is the natural number
Logarithms
log_a(XY) = log_a X + log_a Y
log_a(X/Y) = log_a X - log_a Y
log_a(X^n) = n log_a X
log_a b = (log_2 b) / (log_2 a)
a^(log_a x) = x
Recursive Definitions
Basic idea: to define objects, processes
and properties in terms of simpler objects,
simpler processes, or properties of simpler
objects/processes.
Recursive Definitions
Terminating rule - defining
the object explicitly.
Recursive rules - defining
the object in terms of a simpler object.
Examples
Factorials: N!
f(n) = n!
f(0) = 1, i.e. 0! = 1
f(n) = n * f(n-1),
i.e. n! = n * (n-1)!
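The terminating/recursive pair translates directly into code; a minimal Java sketch (the method name is illustrative):

static long factorial(int n) {
    if (n == 0) return 1;             // terminating rule: 0! = 1
    return n * factorial(n - 1);      // recursive rule: n! = n * (n-1)!
}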
Function Growth
lim ( n ) = ∞, n → ∞
lim ( n^a ) = ∞, n → ∞, a > 0
lim ( 1 / n ) = 0, n → ∞
lim ( 1 / n^a ) = 0, n → ∞, a > 0
lim ( log( n ) ) = ∞, n → ∞
lim ( a^n ) = ∞, n → ∞, a > 1
Examples
lim ( n / n^2 ) = 0, n → ∞
lim ( n^2 / n ) = ∞, n → ∞
lim ( n^2 / n^3 ) = 0, n → ∞
lim ( n^3 / n^2 ) = ∞, n → ∞
lim ( n / ((n+1)/2) ) = 2, n → ∞
Some Derivatives
(log_a n)' = (1/n) log_a e
(a^n)' = (a^n) ln(a)
Proofs
Direct proof
Proof by induction
Proof by counterexample
Proof by contradiction
Proof by contraposition
Direct Proof
Based on the definition of the object / property
Example:
Prove that if a number is divisible by 6 then it is
divisible by 2
Proof: Let m be divisible by 6.
Therefore, there exists q such that m = 6q.
6 = 2 · 3
m = 6q = 2·3·q = 2r, where r = 3q.
Therefore m is divisible by 2.
Proof by Induction
We use proof by induction when our claim
concerns a sequence of cases, which can be
numbered
Inductive base:
Show that the claim is true for the smallest
case, usually k = 0 or k = 1.
Inductive hypothesis:
Assume that the claim is true for some k.
Inductive step:
Prove that the claim is then true for k+1.
Example of Proof by Induction
Prove by induction that
S(N) = Σ (i = 0 to N) 2^i = 2^(N+1) - 1, for any integer N ≥ 0
1. Inductive base
Let N = 0. S(0) = 2^0 = 1.
On the other hand, by the formula, S(0) = 2^(0+1) - 1 = 1.
Therefore the formula is true for N = 0.
2. Inductive hypothesis
Assume that S(k) = 2^(k+1) - 1.
We have to show that S(k+1) = 2^(k+2) - 1.
By the definition of S(n):
S(k+1) = S(k) + 2^(k+1) = 2^(k+1) - 1 + 2^(k+1)
= 2 · 2^(k+1) - 1 = 2^(k+2) - 1
Proof by Counterexample
Used when we want to prove that a statement is
false.
Types of statements: a claim that refers to all
members of a class.
EXAMPLE: The statement "all odd numbers are
prime" is false.
A counterexample is the number 9: it is odd and
it is not prime.
Proof by Contradiction
Assume that the statement is false, i.e. its
negation is true.
Show that the assumption implies that
some known property is false -this would
be the contradiction
Example: Prove that there is no largest
prime number.
Proof by Contraposition
Used when we have to prove a statement of the
form P Q.
Instead of proving P Q, we prove its equivalent
~Q ~P
Example:Prove that if the square of an integer is
odd then the integer is odd
We can prove using direct proof the statement:
If an integer is even then its square is even.
Chapter 2:
Algorithm Analysis
Big-Oh and Other Notations in
Algorithm Analysis
Mark Allen Weiss, Florida University
Mark Allen Weiss: Data Structures and Algorithm Analysis in Java
Big-Oh and Other Notations in
Algorithm Analysis
Classifying Functions by Their
Asymptotic Growth
Theta, Little oh, Little omega
Big Oh, Big Omega
Rules to manipulate Big-Oh expressions
Typical Growth Rates
Classifying Functions by
Their Asymptotic Growth
Asymptotic growth: The rate of growth of
a function
Given a particular differentiable function
f(n), all other differentiable functions fall
into three classes:
.growing with the same rate
.growing faster
.growing slower
Theta
f(n) and g(n) have the
same rate of growth if
lim( f(n) / g(n) ) = c,
0 < c < ∞, n → ∞
Notation: f(n) = Θ( g(n) ),
pronounced "theta"
Little oh
f(n) grows slower than g(n)
(or g(n) grows faster than f(n)) if
lim( f(n) / g(n) ) = 0, n → ∞
Notation: f(n) = o( g(n) ),
pronounced "little oh"
Little omega
f(n) grows faster than g(n)
(or g(n) grows slower than f(n)) if
lim( f(n) / g(n) ) = ∞, n → ∞
Notation: f(n) = ω( g(n) ),
pronounced "little omega"
Little omega and Little oh
if g(n) = o( f(n) )
then f(n) = ω( g(n) )
Examples: Compare n and n^2
lim( n / n^2 ) = 0, n → ∞, so n = o(n^2)
lim( n^2 / n ) = ∞, n → ∞, so n^2 = ω(n)
Theta: Relation of Equivalence
R: "having the same rate of growth":
relation of equivalence,
gives a partition over the set of all
differentiable functions -classes of
equivalence.
Functions in one and the same class are
equivalent with respect to their growth.
Algorithms with Same Complexity
Two algorithms have the same complexity
if the functions representing the number
of operations have the
same rate of growth.
Among all functions with the same rate of
growth we choose the simplest one to
represent the complexity.
Examples
Compare n and (n+1)/2
lim( n / ((n+1)/2) ) = 2,
same rate of growth
(n+1)/2 = Θ(n)
- rate of growth of a linear function
Examples
Compare n^2 and n^2 + 6n
lim( n^2 / (n^2 + 6n) ) = 1
same rate of growth.
n^2 + 6n = Θ(n^2)
rate of growth of a quadratic function
Examples
Compare log n and log n^2
lim( log n / log n^2 ) = 1/2
same rate of growth.
log n^2 = Θ(log n)
logarithmic rate of growth
Examples
Θ(n^3):    n^3
           5n^3 + 4n
           105n^3 + 4n^2 + 6n
Θ(n^2):    n^2
           5n^2 + 4n + 6
           n^2 + 5
Θ(log n):  log n
           log n^2
           log (n + n^3)
Comparing Functions
same rate of growth: g(n) = Θ(f(n))
different rate of growth:
either g(n) = o (f(n))
g(n) grows slower than f(n),
and hence f(n) = ω(g(n))
or g(n) = ω (f(n))
g(n) grows faster than f(n),
and hence f(n) = o(g(n))
The Big-Oh Notation
f(n) = O(g(n))
if f(n) grows at the
same rate or slower than g(n):
f(n) = Θ(g(n)) or
f(n) = o(g(n))
Example
n+5 = Θ(n) = O(n) = O(n^2) = O(n^3) = O(n^5)
the closest estimation: n+5 = Θ(n)
the general practice is to use
the Big-Oh notation: n+5 = O(n)
The Big-Omega Notation
The inverse of Big-Oh is Ω:
If g(n) = O(f(n)),
then f(n) = Ω(g(n))
f(n) grows faster or at the same
rate as g(n): f(n) = Ω(g(n))
Rules to manipulate
Big-Oh expressions
Rule 1:
a. If T1(N) = O(f(N))
and T2(N) = O(g(N))
then T1(N) + T2(N) = max( O( f(N) ), O( g(N) ) )
b. If T1(N) = O( f(N) )
and T2(N) = O( g(N) )
then T1(N) * T2(N) = O( f(N) * g(N) )
Rule 2:
If T(N) is a polynomial of degree k,
then T(N) = Θ( N^k )
Rule 3:
log^k N = O(N) for any constant k.
Examples
n^2 + n = O(n^2)
we disregard any lower-order term
nlog(n) = O(nlog(n))
n^2 + nlog(n) = O(n^2)
Typical Growth Rates
C         constant, we write O(1)
log N     logarithmic
log^2 N   log-squared
N         linear
N log N
N^2       quadratic
N^3       cubic
2^N       exponential
N!        factorial
Problems
N^2 = O(N^2)   true
2N = O(N^2)    true
N = O(N^2)     true
N^2 = O(N)     false
2N = O(N)      true
N = O(N)       true
Problems
N^2 = Θ(N^2)   true
2N = Θ(N^2)    false
N = Θ(N^2)     false
N^2 = Θ(N)     false
2N = Θ(N)      true
N = Θ(N)       true
Chapter 2:
Algorithm Analysis
Application of Big-Oh to program
analysis
Running Time Calculations
Mark Allen Weiss, Florida University
Mark Allen Weiss: Data Structures and Algorithm Analysis in Java
Background
The work done by an
algorithm, i.e. its complexity,
is determined by the
number of the basic operations
necessary to solve the
problem.
The Task
Determine how the number of
operations depends on the size
of input:
N - size of input
F(N) - number of operations
Basic operations in an algorithm
Problem: Find x in an array
Operation: Comparison of x with an entry in the array
Size of input: The number of the elements in the array
Basic operations ….
Problem: Multiplying two matrices with real entries
Operation: Multiplication of two real numbers
Size of input: The dimensions of the matrices
Basic operations ….
Problem: Sort an array of numbers
Operation: Comparison of two array entries plus
moving elements in the array
Size of input: The number of elements in the array
Counting the number of operations
A. for loops: O(n)
The running time of a for loop is at most the
running time of the statements inside the loop
times the number of iterations.
for loops
sum = 0;
for( i = 0; i < n; i++ )
    sum = sum + i;
The running time is O(n)
Counting the number of operations
B. Nested loops
The total running time is the running time of the
inside statements times the product of the sizes
of all the loops
Nested loops
sum = 0;
for( i = 0; i < n; i++ )
    for( j = 0; j < n; j++ )
        sum++;
The running time is O(n^2)
Counting the number of operations
C. Consecutive program fragments
Total running time:
the maximum of the running times of the
individual fragments
Consecutive program fragments
sum = 0;                              O(n)
for( i = 0; i < n; i++ )
    sum = sum + i;
sum = 0;                              O(n^2)
for( i = 0; i < n; i++ )
    for( j = 0; j < 2n; j++ ) sum++;
The maximum is O(n^2)
Counting the number of operations
D. If statement
if C
    S1;
else
    S2;
The running time is the maximum of the running
times of S1 and S2.
EXAMPLES
O(n^3):
sum = 0;
for( i = 0; i < n; i++ )
    for( j = 0; j < n*n; j++ )
        sum++;
EXAMPLES
O(n^2):
sum = 0;
for( i = 0; i < n; i++ )
    for( j = 0; j < i; j++ )
        sum++;
EXAMPLES
O(n^3 logn):
for( j = 0; j < n*n; j++ )
    compute_val(j);
The complexity of compute_val(x) is given to be O(n*logn)
Search in an unordered array of elements
for (i = 0; i < n; i++)
    if (a[ i ] == x) return 1;
return -1;
O(n)
Search in a table n x m
for (i = 0; i < n; i++)
    for (j = 0; j < m; j++)
        if (a[ i ][ j ] == x) return 1;
return -1;
O(n*m)
Chapter 2:
Algorithm Analysis
Application of Big-Oh to program
analysis
Logarithms in Running Time
Mark Allen Weiss, Florida University
Mark Allen Weiss: Data Structures and Algorithm Analysis in Java
Binary search
Euclid’s algorithm
Exponentials
Rules to count operations
Logarithms in Running Time
Divide-and-conquer algorithms
Successively reducing the
problem by a factor of two
requires O(logN) operations.
Why logN?
A complete binary tree with
N leaves has logN levels.
Each level in the divide-and-conquer
algorithms corresponds to an operation.
Hence the number of operations is O(logN).
Example: 8 leaves, 3 levels
Binary Search
Solution 1:
Scan all elements from left to right,
each time comparing with X.
O(N) operations.
Binary Search
Solution 2: O(logN)
Find the middle element A_mid in the
list and compare it with X
If they are equal, stop
If X < A_mid, consider the left part
If X > A_mid, consider the right part
Do until the list is reduced to one element
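A minimal iterative Java sketch of Solution 2, assuming a sorted int array (names are illustrative, not from the text):

static int binarySearch(int[] a, int x) {
    int low = 0, high = a.length - 1;
    while (low <= high) {                 // until the list is exhausted
        int mid = (low + high) / 2;       // the middle element A_mid
        if (a[mid] == x) return mid;      // equal - stop
        if (x < a[mid]) high = mid - 1;   // consider the left part
        else low = mid + 1;               // consider the right part
    }
    return -1;                            // X is not in the list
}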
Euclid's algorithm
Finding the greatest common
divisor (GCD)
GCD of M and N, M > N,
= GCD of N and M % N
GCD and recursion
Recursion:
If M % N = 0, return N
Else return GCD(N, M % N)
The answer is the last nonzero remainder.
M    N    rem
24   15   9
15   9    6
9    6    3
6    3    0
3    0
Euclid's Algorithm
(non-recursive implementation)
long gcd ( long m, long n )
{
    long rem;
    while ( n != 0 )
    {
        rem = m % n;
        m = n;
        n = rem;
    }
    return m;
}
Why O(logN)?
M % N <= M / 2
After the 1st iteration N appears as the first
argument; the remainder is less than N/2
After the 2nd iteration the remainder appears
as the first argument and will be reduced by a
factor of two
Hence O(logN)
Computing X^N
X^N = X * (X^2)^(N/2),  N is odd
X^N = (X^2)^(N/2),      N is even
long pow ( long x, int n )
{
    if ( n == 0 ) return 1;
    if ( is_Even( n ) )
        return pow( x * x, n/2 );
    else
        return x * pow( x * x, n/2 );
}
Why O(logN)?
If N is odd: two multiplications
The operations are at most 2logN:
O(logN)
Another recursion for X^N
Another recursive definition that
reduces the power just by 1:
X^N = X * X^(N-1)
Here the operations are N-1, i.e. O(N),
and the algorithm is less efficient
than the divide-and-conquer algorithm.
How to count operations
single statements (not function calls):
constant O(1) = 1.
sequential fragments: the maximum
of the operations of each fragment
single loop running up to N, with
single statements in its body: O(N)
single loop running up to N,
with the number of operations in the
body O(f(N)):
O( N * f(N) )
two nested loops each running up to
N, with single statements: O(N^2)
divide-and-conquer algorithms with
input size N: O(logN),
or O(N*logN) if each step requires additional
processing of N elements
Example: What is the probability of two
numbers being relatively prime?
tot = 0; rel = 0;
for ( i = 0; i <= n; i++ )
    for ( j = i+1; j <= n; j++ )
    {   tot++;
        if ( gcd( i, j ) == 1 ) rel++;
    }
return (rel/tot);
Running time = ?
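(One way to reason about it, using the counting rules above: the nested loops examine O(n^2) pairs, and each gcd call costs O(logn), so the total is O(n^2 logn).)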
Chapter 3:
Abstract Data Types
Lists, Stacks
Mark Allen Weiss, Florida University
Mark Allen Weiss: Data Structures and Algorithm Analysis in Java
LIST ABSTRACTION
Definition:
A linear configuration of
elements, called nodes.
Characteristics
Insert and delete nodes in any order
The nodes are connected
Each node has two components:
Information (data)
Link to the next node
The nodes are accessed through the links
between them
[Figure: a list with a head, a node X with its predecessor and
successor, and a tail]
For each node, the node that is in
front of it is called its predecessor.
The node that is after it is called its
successor.
Terminology
Head (front, first node):
The node without a predecessor, the node
that starts the list.
Tail (end, last node):
The node that has no successor, the last
node in the list.
Current node: The node being processed.
From the current node we can access the next node.
Empty list: No nodes exist
Basic operations
To create/destroy a list
To expand/shrink the list
Read/Write operations
Changing the current node (moving along the list)
To report current position in the list
To report status of the list
ADT List Notation
L - list
e - item of the same type as the
information part of an element
(a node) in the list
b - boolean value
Operations in ADT Notation
Insert(L,e)
Inserts a node with information e before the
current position
Delete(L)
Deletes the current node in L; the current
position indicates the next node.
RetrieveInfo(L) → e
Returns the information in the current node.
Insertion and Deletion
A. Insertion
To insert a node X between the
nodes A and B:
• Create a link from X to B.
• Create a link from A to X.
[Figure: insertion of node X between A and B]
B. Deletion
To delete a node X between A and B:
• Create a link from A to B,
• Remove node X.
[Figure: deletion of node X from between A and B]
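A compact Java sketch of the two operations on a singly linked list (the node class and method names are illustrative):

class ListNode {
    Object element;
    ListNode next;
}

// Insertion: link X to B (= A's successor), then link A to X.
static void insertAfter(ListNode a, ListNode x) {
    x.next = a.next;   // create a link from X to B
    a.next = x;        // create a link from A to X
}

// Deletion: link A directly to B, removing the node X between them.
static void deleteAfter(ListNode a) {
    if (a.next != null)
        a.next = a.next.next;
}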
Node Linking
1. Singly linked lists:
Each node contains a link only to the next node.
2. Doubly linked lists:
Each node contains two links - to the previous
and to the next node.
3. Circular lists:
The tail is linked to the head.
List Implementation
Static - using an array
Dynamic - using linked nodes
Array Implementation
Two parallel arrays are used:
Index array - the number stored
in the i-th element shows the index
of the "next" node, i.e. the node that
follows the i-th node
Data array - used to store the
informational part of the nodes.
STACKS
Definition:
The last stored element is
the first to be accessed
(LIFO: last in -first out)
Basic operations
Push: store a data item at
the top of the stack
Pop: retrieve a data item
from the top of the stack
ADT Definition of STACK
Notation:
S - stack
e - item of same type as the
elements of S
b - boolean value
Operations
Init_Stack(S)
Procedure to initialize S to an
empty stack
Destroy_Stack(S)
Procedure to delete all elements in S
Operations
Stack_Empty(S) → b
Boolean function that returns
TRUE if S is empty.
Stack_Full(S) → b
Boolean function that returns
TRUE if S is full.
Operations
Push(S,e)
Procedure to place an item e into S
(if there is room, i.e. S is not full)
Pop(S) → e
Procedure to take the last item
stored in S (this item is also called the
top element) if S is not empty
Example
A procedure to replace the elements of a
nonempty stack, assuming they are of type
integer, with their sum
Pre: A nonempty stack with elements of
type integer.
Post: S contains only one element -
the sum of the previously stored elements.
Algorithm
1. e1 ← Pop(S)
2. while stack is not empty
   repeat
   2.1. e2 ← Pop(S)
   2.2. Push(S, e1+e2)
   2.3. e1 ← Pop(S)
3. Push(S, e1)
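The same algorithm in Java, using java.util.Deque as the stack (a sketch; assumes a nonempty stack, as the precondition states):

import java.util.Deque;

static void sumStack(Deque<Integer> s) {
    int e1 = s.pop();               // 1. e1 <- Pop(S)
    while (!s.isEmpty()) {          // 2. while stack is not empty
        int e2 = s.pop();           // 2.1. e2 <- Pop(S)
        s.push(e1 + e2);            // 2.2. Push(S, e1+e2)
        e1 = s.pop();               // 2.3. e1 <- Pop(S)
    }
    s.push(e1);                     // 3. Push(S, e1)
}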
Chapter 3:
Abstract Data Types
Queues
Mark Allen Weiss, Florida University
Mark Allen Weiss: Data Structures and Algorithm Analysis in Java
QUEUES
Definition: A sequence of
elements of the same type.
The first stored element is the first
accessible.
The structure is also known under
the name FIFO - first in, first out
Basic operations
EnQueue: store a data item at
the end of the queue
DeQueue: retrieve a data item
from the beginning of the queue
ADT Definition of QUEUE
Notation:
Q - queue
e - item of same type as the
elements of Q
b - boolean value
Operations
Init_Queue(Q)
Initialize Q to an empty queue
Queue_Empty(Q) → b
Boolean function that returns TRUE if
Q is empty
Queue_Full(Q) → b
Boolean function that returns TRUE if
Q is full: relevant for array-based implementations
Operations
EnQueue(Q,e)
Procedure to place an item e into
Q at the end (if Q is not full)
DeQueue(Q) → e
Procedure to take the first item
stored in Q if Q is not empty
Problem 1
Append_Queue(Q,P): A procedure to append
a queue P onto the end of a queue Q, leaving P
empty.
Pre: queue P and queue Q, initialized
Post: Q contains all elements originally in Q,
followed by the elements that were in P, in the
same order. P is empty.
• Design an algorithm to solve the problem
Problem 2
Reverse_Queue(Q): A procedure to reverse
the elements in a queue Q
Pre: queue Q, initialized
Post: Q contains all elements re-written in
reverse order
• Design a non-recursive algorithm using a stack
• Design a recursive algorithm
• Find the complexity of the algorithms
Solutions to Problem 2
A. Non-recursive
Init_Stack(S)
While not Queue_Empty(Q)
    e ← DeQueue(Q)
    Push(S,e)
While not Stack_Empty(S)
    e ← Pop(S)
    EnQueue(Q,e)
Complexity O(N),
N - the number of elements in Q.
Solutions to Problem 2
B. Recursive
Reverse_Queue(Q):
If not Queue_Empty(Q)
    e ← DeQueue(Q)
    Reverse_Queue(Q)
    EnQueue(Q,e)
return
Complexity O(N),
N - the number of elements in Q.
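The recursive solution in Java with java.util.Queue (a sketch; O(N) queue operations, at the cost of O(N) stack frames):

import java.util.Queue;

static <T> void reverseQueue(Queue<T> q) {
    if (!q.isEmpty()) {
        T e = q.remove();    // e <- DeQueue(Q)
        reverseQueue(q);     // Reverse_Queue(Q)
        q.add(e);            // EnQueue(Q,e)
    }
}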
Problem 3
Append_Reverse_Queue(Q,P): Append a
queue P in reverse order onto the end of a
queue Q, leaving P empty.
Pre: P and Q initialized (possibly empty)
Post: Q contains all elements originally in Q,
followed by the elements that were in P in
reverse order. P is empty.
• Design a recursive algorithm
Chapter 4: Trees
General Tree Concepts
Binary Trees
Mark Allen Weiss, Florida University
Mark Allen Weiss: Data Structures and Algorithm Analysis in Java
Trees
Definitions
Representation
Binary trees
Traversals
Expression trees
Definitions
tree - a non-empty collection of
vertices & edges
vertex (node) - can have a
name and carry other associated
information
path - list of distinct vertices in
which successive vertices are
connected by edges
Definitions
any two vertices must have one
and only one path between them,
else it's not a tree
a tree with N nodes has N-1 edges
Definitions
root - starting point (top) of the tree
parent (ancestor) - the vertex
"above" this vertex
child (descendant) - the vertices
"below" this vertex
Definitions
leaves (terminal nodes) - have no children
level - the number of edges
between this node and the root
ordered tree - where children's
order is significant
Definitions
Depth of a node - the length of the path
from the root to that node
• root: depth 0
Height of a node - the length of the longest
path from that node to a leaf
• any leaf: height 0
Height of a tree: the length of the longest
path from the root to a leaf
Balanced Trees
the difference between the
height of the left sub-tree
and the height of the right
sub-tree is not more than 1.
Trees - Example
[Figure: a sample tree; labels mark the root (level 0), the leaves
(terminal nodes), a child of the root, and levels 0-3;
Depth of T: 2, Height of T: 1]
Tree Representation
class TreeNode
{
    Object element;
    TreeNode firstChild;
    TreeNode nextSibling;
}
Example
[Figure: a general tree and its first-child / next-sibling
representation]
Binary Tree
[Figure: a binary tree; labels mark an internal node and an
external node]
Height of a Complete Binary Tree
[Figure: a complete binary tree with levels L0 - L3]
At each level the number of the nodes is doubled.
Total number of nodes:
1 + 2 + 2^2 + 2^3 = 2^4 - 1 = 15
Nodes and Levels in a
Complete Binary Tree
Number of the nodes in a tree with M levels:
1 + 2 + 2^2 + … + 2^M = 2^(M+1) - 1 = 2*2^M - 1
Let N be the number of the nodes.
N = 2*2^M - 1,  2*2^M = N + 1
2^M = (N+1)/2
M = log( (N+1)/2 )
N nodes: log( (N+1)/2 ) = O(log(N)) levels
M levels: 2^(M+1) - 1 = O(2^M) nodes
Binary Tree Node
class BinaryNode
{
    Object element;     // the data in the node
    BinaryNode left;    // Left child
    BinaryNode right;   // Right child
}
Binary Tree - Preorder Traversal
[Figure: a binary tree whose preorder visit spells out a phrase;
order: Root, Left, Right]
First letter - at the root
Last letter - at the rightmost node
Preorder Algorithm
preorderVisit(tree)
{
    if (current != null)
    {
        process (current);
        preorderVisit (left_tree);
        preorderVisit (right_tree);
    }
}
Binary Tree - Inorder Traversal
[Figure: the same tree relabeled for inorder; order: Left, Root, Right]
First letter - at the leftmost node
Last letter - at the rightmost node
Inorder Algorithm
inorderVisit(tree)
{
    if (current != null)
    {
        inorderVisit (left_tree);
        process (current);
        inorderVisit (right_tree);
    }
}
Binary Tree - Postorder Traversal
[Figure: the same tree relabeled for postorder; order: Left, Right, Root]
First letter - at the leftmost node
Last letter - at the root
Postorder Algorithm
postorderVisit(tree)
{
    if (current != null)
    {
        postorderVisit (left_tree);
        postorderVisit (right_tree);
        process (current);
    }
}
Expression Trees
[Figure: building the expression tree for (1+2)*3: a '+' node with
children 1 and 2 is pushed on the stack, then a '*' node with the
'+' subtree and 3 as children]
The stack contains references to tree nodes
(bottom is to the left)
(1+2)*3
Post-fix notation: 1 2 + 3 *
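A sketch of the stack-based construction from postfix tokens (the node class and names are illustrative):

import java.util.ArrayDeque;
import java.util.Deque;

class ExprNode {
    String value;                 // operand or operator
    ExprNode left, right;
    ExprNode(String v, ExprNode l, ExprNode r) { value = v; left = l; right = r; }
}

static ExprNode buildTree(String[] postfix) {      // e.g. {"1","2","+","3","*"}
    Deque<ExprNode> stack = new ArrayDeque<>();
    for (String tok : postfix) {
        if ("+-*/".contains(tok)) {                // operator: pop two subtrees
            ExprNode right = stack.pop();
            ExprNode left = stack.pop();
            stack.push(new ExprNode(tok, left, right));
        } else
            stack.push(new ExprNode(tok, null, null)); // operand: push a leaf
    }
    return stack.pop();                            // the root of the tree
}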
Chapter 4: Trees
AVL Trees
Mark Allen Weiss, Florida University
Mark Allen Weiss: Data Structures and Algorithm Analysis in Java
AVL Trees
Keep the tree balanced
Use node rotation
Balanced condition:
The left and the right subtrees of each
node should differ by at most one level.
The method was proposed by two Russian scientists
Adelson-Velskii and Landis in 1962
Single Rotation
[Figure: tree rooted at 5 with left child 3 and right child 7;
7 has children 6 and 8; the new node 9 is inserted under 8;
arrows mark the links to be changed]
The sub-tree containing the inserted
node (rooted at 8) is on the same
side as the 'heavier' sub-tree of node
5 (rooted at 7)
After the Rotation
[Figure: tree rooted at 7 with children 5 (children 3, 6) and 8
(child 9)]
The middle node 6 is switched
to the other subtree
Double Rotation
[Figure: the double rotation in two steps - first 7 and 8 are
rotated, then a single rotation rebalances the tree]
The new node 6 is in the left subtree of node 8, while node 8 is
in the right subtree of 5 -> rotate 7 and 8 to obtain same-side
trees.
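A sketch of the single rotation in Java (the left-left case; the mirror case is symmetric; the method name follows common usage and height bookkeeping is omitted):

class AvlNode {
    int element;
    AvlNode left, right;
}

// The left child k1 becomes the new root of this subtree;
// the middle subtree (k1.right) switches to the other side.
static AvlNode rotateWithLeftChild(AvlNode k2) {
    AvlNode k1 = k2.left;
    k2.left = k1.right;   // middle subtree switches sides
    k1.right = k2;
    return k1;            // new subtree root
}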
Splay Trees
Move each accessed node to the top
with rotations, decreasing
the depth of the tree.
Multi-Way Search
Nodes contain more than one key;
nodes are used only to contain the keys, the actual
records are stored at the terminal nodes at the
bottom of the tree
[Figure: a multi-way search tree with internal keys E I and R, and
leaves A C, G H, N, S]
Complexity Issues
AVL trees:
Search, Insertion, and Deletion: O(logN)
Creating the tree - O(N)
Trees with multiple keys: O(log_m N)
m - branching factor
Chapter 4: Trees
Binary Search Trees
Mark Allen Weiss, Florida University
Mark Allen Weiss: Data Structures and Algorithm Analysis in Java
Binary Search Trees
Definitions
Operations and complexity
Advantages and disadvantages
AVL Trees
Single rotation
Double rotation
Splay Trees
Multi-Way Search
Definitions
Each node -
a record with a key and a value,
a left link,
a right link
All records with smaller keys -
left subtree
All records with larger keys -
right subtree
Example
[Figure: an example binary search tree]
Operations
Search - compare the values and proceed
either to the left or to the right
Insertion - unsuccessful search - insert the
new node at the bottom where the search has
stopped
Deletion - replace the value in the node with
the smallest value in the right subtree
or the largest value in the left subtree.
Retrieval in sorted order - inorder traversal
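A minimal recursive sketch of search and insert, assuming integer keys (illustrative, not the text's code):

class BstNode {
    int key;
    BstNode left, right;
    BstNode(int k) { key = k; }
}

// Search: compare and proceed either to the left or to the right.
static boolean contains(BstNode t, int x) {
    if (t == null) return false;               // unsuccessful search
    if (x < t.key) return contains(t.left, x);
    if (x > t.key) return contains(t.right, x);
    return true;
}

// Insert: an unsuccessful search, then attach the new node
// at the bottom where the search stopped.
static BstNode insert(BstNode t, int x) {
    if (t == null) return new BstNode(x);
    if (x < t.key) t.left = insert(t.left, x);
    else if (x > t.key) t.right = insert(t.right, x);
    return t;  // duplicates are ignored in this sketch
}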
Complexity
Logarithmic, depends on the shape of
the tree
In the worst case –O(N) comparisons
Advantages of BST
Simple
Efficient
Dynamic
One of the most fundamental algorithms in CS
The method of choice in many applications
Disadvantages of BST
The shape of the tree depends on the order of
insertions, and it can be degenerated.
When inserting or searching for an element, the key
of each visited node
has to be compared with the key of the element to
be inserted/found.
Keys may be long and the run time may increase much.
Improvements of BST
Keeping the tree balanced:
AVL trees (Adelson-Velskii and Landis)
Balance condition: left and right subtrees of each
node can differ by at most one level.
It can be proved that if this condition is observed
the depth of the tree is O(logN).
Reducing the time for key comparison:
Radix trees - comparing only the leading bits of the
keys (not discussed here)
Chapter 4: Trees
Radix Search Trees
Mark Allen Weiss, Florida University
Mark Allen Weiss: Data Structures and Algorithm Analysis in Java
Radix Search Trees
Radix Searching
Digital Search Trees
Radix Search Trees
Multi-Way Radix Trees
Radix Searching
Idea: Examine the search keys
one bit at a time
Advantages:
reasonable worst-case performance
easy way to handle variable-length keys
some savings in space by storing part of
the key within the search structure
competitive with both binary search trees
and hashing
Radix Searching
Disadvantages:
biased data can lead to degenerate
trees with bad performance
for some methods use of space is
inefficient
dependent on the computer's
architecture - difficult to do efficient
implementations in some high-level
languages
Digital Search Trees
Similar to binary tree search
Difference:
Branch in the tree by comparing the
key's bits, not the keys as a whole
Example
A 00001    H 01000    M 01101
S 10011    I 01001    P 10000
E 00101    N 01110    L 01100
R 10010    G 00111
C 00011    X 11000
[Figure: the digital search tree built from these keys; 0 bits
branch left, 1 bits branch right]
Example
[Figure: the same tree with Z being inserted]
inserting Z = 11010:
go right twice,
go left - external node,
attach Z to the left of X
Digital Search Trees
Things to remember about digital search trees:
Equal keys are anathema - they must be kept
in separate data structures, linked to the nodes.
Worst case - better than for binary
search trees - the length of the longest
path is equal to the longest match in the
leading bits between any two keys.
Digital Search Trees
Search or insertion requires about log(N)
comparisons on average and b
comparisons in the worst case in a tree
built from N random b-bit keys.
No path will ever be longer than the
number of bits in the keys.
Radix Search Trees
If the keys are long, digital search trees
have low efficiency.
Radix search trees: do not store keys in
the tree at all; the keys are in the external
nodes of the tree.
Called tries (pronounced "try-ee") from "retrieval"
Radix Search Trees
Two types of nodes:
Internal: contain only links to other nodes
External: contain keys and no links
Radix Search Trees
To insert a key -
1. Go along the path described by the leading
bit pattern of the key until an external node is
reached.
2. If the external node is empty, store there the
new key.
If the external node contains a key, replace it
by an internal node linked to the new key and the old
key. If the keys have several bits equal, more internal
nodes are necessary.
NOTE: insertion does not depend on the order of the keys.
Radix Search Trees
To search for a key -
1. Branch according to its bits,
2. Don't compare it to anything until we
get to an external node.
3. One full key comparison there
completes the search.
Example
A 00001
S 10011
E 00101
R 10010
C 00011
[Figure: the radix search trie for these keys; internal nodes hold
no keys, external nodes hold A C, E, R S]
Example - insertion
A 00001
S 10011
E 00101
R 10010
C 00011
H 01000
[Figure: H is stored in an empty external node]
External node - empty
Example - insertion
A 00001
S 10011
E 00101
R 10010
C 00011
H 01000
I 01001
[Figure: I follows the same leading bits as H; the occupied
external node is replaced by internal nodes until the keys'
bits differ]
External node - occupied
Radix Search Trees - summary
• Program implementation -
necessity to maintain two types of nodes;
low-level implementation
• Complexity: about logN bit comparisons in the average case,
and b bit comparisons in the worst case, in a tree built
from N random b-bit keys.
Annoying feature: one-way branching for keys with a
large number of common leading bits:
The number of the nodes may exceed the number of the keys.
On average - N/ln2 = 1.44N nodes
Multi-Way Radix Trees
The height of the tree is limited by the number of
the bits in the keys
If we have larger keys - the height increases. One
way to overcome this deficiency is to use multi-way
radix tree searching.
The branching is not according to 1 bit, but rather
according to several bits (most often 2)
If m bits are examined at a time - the search is
sped up by a factor of 2^m
Problem: if m bits at a time, the nodes will have
2^m links; this may result in a considerable amount of
wasted space due to unused links.
Multi-Way Radix Trees - example
Search - take the left,
right or middle links
according to the
first two bits.
Insert - replace the
external node by
the key
(e.g. insert T 10100).
[Figure: a multi-way radix tree whose nodes have 4 links - 00, 01, 10, 11]
Multi-Way Radix Trees
Wasted space - due to the large number of
unused links.
Worse if M - the number of bits considered - gets higher.
The running time: log_M N - very efficient.
Hybrid method:
large M at the top,
small M at the bottom
Chapter 5: Hashing
Collision Resolution:
Separate Chaining
Mark Allen Weiss: Data Structures and Algorithm Analysis in Java
Mark Allen Weiss, Florida University
Hash Tables - Collision
Problem: many-to-one mapping of
a potentially huge set of strings onto
a small set of integers
Collision: having a second key hash into a
previously used slot.
Collision resolution: deals with keys
that are mapped to the same slots.
Separate Chaining
Invented by H. P. Luhn, an IBM engineer, in
January 1953.
Idea: Keys hashing to the same slot are kept in
linked lists attached to that slot
Useful for highly dynamic situations,
where the number of the search keys cannot
be predicted in advance.
Example
Key:  A S E A R C H I N G E X A M P L E
Hash: 1 8 5 1 7 3 8 9 3 7 5 2 1 2 5 1 5
(M = 11)
[Figure: the resulting chains attached to slots 0-10; '=' marks
the end of a list]
Separate Chaining - Length of Lists
N - number of keys
M - size of table
N/M - average length of the lists
Separate Chaining - Search
Unsuccessful searches go to the end of
some list.
Successful searches are expected to go
halfway down some list.
Separate Chaining - Choosing Table Size M
relatively small, so as not to use up
a large area of contiguous memory,
but large enough so that the lists
are short for more efficient
sequential search
Separate Chaining - other chaining options
Keep the lists ordered: useful if there are
many more searches than inserts, and if most
of the searches are unsuccessful
Represent the chains as binary search trees:
extra effort needed - not efficient
Separate Chaining - advantages and disadvantages
Advantages
used when memory space is a concern
easily implemented
Disadvantages
unevenly distributed keys - long lists:
search time increases, many empty
spaces in the table.
Chapter 5: Hashing
Hash Tables
Mark Allen Weiss: Data Structures and Algorithm Analysis in Java
Mark Allen Weiss, Florida University
Hashing
What is Hashing?
Direct Access Tables
Hash Tables
Hashing - basic idea
A mapping between the search
keys and indices - efficient
searching in an array.
Each element is found with one
operation only.
Hashing example
Example:
1000 students,
identification number between 0 and 999:
use an array of 1000 elements.
With the SSN of each student (a 9-digit number)
the array would need many more elements than
the number of the students - a great waste of space.
The Approach
Directly referencing records in a table
using arithmetic operations on keys to
map them onto table addresses.
Hash function: a function that transforms
the search key into a table address.
Direct-address tables
The most elementary form of hashing.
Assumption - direct one-to-one
correspondence between the keys and
the numbers 0, 1, …, m-1; m - not very large.
Array A[m]. Each position (slot) in the array
contains a pointer to a record, or NIL.
Direct-Address Tables
The most elementary form of hashing
Assumption - direct one-to-one
correspondence between the keys and
the numbers 0, 1, …, m-1; m - not very large.
Array A[m]. Each position (slot) in the array
contains either a reference to a record, or NULL.
Cost - the size of the array we need is
determined by the largest key. Not very useful if
there are only a few keys.
Hash Functions
Transform the keys into
numbers within a
predetermined interval to be
used as indices in an array
(table, hash table) to store the
records
Hash Functions - Numerical Keys
Keys - numbers
If M is the size of the array, then
h(key) = key % M.
This will map all the keys into
numbers within the interval
[0 .. (M-1)].
Hash Functions - Character Keys
Keys - strings of characters
Treat the binary representation of a
key as a number, and then apply the
hash function
How keys are treated as numbers
If each character is represented with
m bits,
then the string can be treated as a
base-2^m number.
Example
A K E Y:
00001 01011 00101 11001 =
1·32^3 + 11·32^2 + 5·32^1 + 25·32^0 = 44217
Each letter is represented by its
position in the alphabet. E.g., K is the
11-th letter, and its representation is
01011 (11 in decimal)
Long Keys
If the keys are very long, an overflow
may occur.
A solution to this is to apply
Horner's method
in computing the hash function.
Horner's Method
a_n·x^n + a_(n-1)·x^(n-1) + a_(n-2)·x^(n-2) + … + a_1·x^1 + a_0·x^0 =
x( x( … x( x( a_n·x + a_(n-1) ) + a_(n-2) ) + … ) + a_1 ) + a_0
4x^5 + 2x^4 + 3x^3 + x^2 + 7x^1 + 9x^0 =
x( x( x( x( 4x + 2 ) + 3 ) + 1 ) + 7 ) + 9
The polynomial can be computed by
alternating the multiplication and addition operations
Example
V E R Y L O N G K E Y
10110 00101 10010 11001 01100 01111 01110 00111 01011 00101 11001
V  E  R  Y  L  O  N  G  K  E  Y
22 5 18 25 12 15 14 7 11 5 25
22·32^10 + 5·32^9 + 18·32^8 + 25·32^7 + 12·32^6 +
15·32^5 + 14·32^4 + 7·32^3 + 11·32^2 + 5·32^1 + 25·32^0
Example (continued)
V E R Y L O N G K E Y
22·32^10 + 5·32^9 + 18·32^8 + 25·32^7 + 12·32^6 +
15·32^5 + 14·32^4 + 7·32^3 + 11·32^2 + 5·32^1 + 25·32^0 =
(((((((((22·32 + 5)·32 + 18)·32 + 25)·32 + 12)·32 + 15)·32 + 14)·32 +
7)·32 + 11)·32 + 5)·32 + 25
Compute the hash function by applying the mod operation at
each step, thus avoiding overflow:
h_0 = (22·32 + 5) % M
h_1 = (32·h_0 + 18) % M
h_2 = (32·h_1 + 25) % M
. . . .
Code
int hash32 (char[] name, int tbl_size)
{
    int key_length = name.length;
    int h = 0;
    for (int i = 0; i < key_length; i++)
        h = (32 * h + name[i]) % tbl_size;
    return h;
}
Hash Tables
Index - an integer, generated by a hash
function, between 0 and M-1
Initially - blank slots;
a sentinel value, or a special field in
each slot, marks a slot as empty.
Hash Tables
Insert - the hash function is used to
generate an address
Search for a key in the table -
the same hash function is used.
Size of the Table
Table size M - different from the
number of records N
Load factor: N/M
M must be prime to ensure even
distribution of keys
Chapter 5: Hashing
Collision Resolution: Open Addressing
Extendible Hashing
Mark Allen Weiss: Data Structures and Algorithm Analysis in Java
Mark Allen Weiss, Florida University
Open Addressing
Invented by A. P. Ershov and
W. W. Peterson in 1957 independently.
Idea: Store collisions in the hash table.
Table size - must be at least twice the
number of the records
Open Addressing
If a collision occurs,
the next probes are performed following the formula:
h_i(x) = ( hash(x) + f(i) ) mod Table_Size
where:
h_i(x) is an index in the table at which to insert x
hash(x) is the hash function
f(i) is the collision resolution function
i - the current attempt to insert an element
Open Addressing
Problems with delete: a special flag is needed
to distinguish deleted from empty positions.
This is necessary for the search function: if we come to
a "deleted" position, the search has to continue,
as the deletion might have been done after the
insertion of the sought key -
the sought key might be further in the table.
Linear Probing: f(i) = i
Insert: If collision - probe the next slot.
If unoccupied - store the key there.
If occupied - continue probing the next slot.
Search: a) match - successful search
b) empty position - unsuccessful search
c) occupied and no match - continue probing.
If the end of the table is reached, continue from the
beginning.
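A sketch of insert with linear probing, using Integer keys and null for an empty slot (illustrative; assumes the table is not full, per the size rule above):

static void insert(Integer[] table, int key) {
    int M = table.length;
    int i = key % M;                   // hash(x)
    while (table[i] != null) {         // occupied - probe the next slot
        if (table[i] == key) return;   // key already present
        i = (i + 1) % M;               // wrap around at the end of the table
    }
    table[i] = key;                    // unoccupied - store the key
}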
Example
Key:  A S E A R C H I N G E X A M P L E
Hash: 1 0 5 1 18 3 8 9 14 7 5 5 1 13 16 12 5
[Figure: the table of slots 0-18 after the insertions; '*' marks
unsuccessful probe attempts]
Linear Probing
Disadvantage: "primary clustering"
Large clusters tend to build up:
if slot a is preceded by i filled slots while slot b stands
alone, the probability that the next key fills slot a is
(i+1)/M, but only 1/M for slot b.
Quadratic Probing
Use a quadratic function to compute the
next index in the table to be probed.
The idea here is to skip regions in the table
with possible clusters.
f(i) = i^2
Quadratic Probing
In linear probing we check the i-th
position. If it is occupied, we check the
(i+1)-st position, next the (i+2)-nd, etc.
In quadratic probing, if the i-th
position is occupied we check the (i+1)-st,
next we check the (i+4)-th,
next the (i+9)-th, etc.
Double Hashing
Purpose - to overcome the disadvantage of
clustering.
A second hash function gives a fixed increment
for the "probe" sequence:
hash2(x) = R - (x mod R)
R: a prime, smaller than the table size.
f(i) = i * hash2(x)
Rehashing
Table size: M > N
For a small load factor the
performance is much better
than for N/M close to one.
Best choice: N/M = 0.5
When N/M > 0.75 - rehash
Rehashing
Build a second table twice as large as
the original and rehash there all the
keys of the original table.
An expensive operation,
with running time O(N).
However, once done, the new hash
table will have good performance.
Extendible Hashing
Used for external storage:
N records in total to store,
M records fit in one disk block.
No more than two blocks are examined per search.
Extendible Hashing
Idea:
• Keys are grouped according to the
first m bits in their code.
• Each group is stored in one disk block.
• If some block becomes full,
each group is split in two,
and m+1 bits are considered to
determine the location of a record.
Example
4 disk blocks, each can contain 3 records;
4 groups of keys according to the first two bits:
directory:  00      01      10      11
            00010   01001   10001   11000
            00100   01010   10100   11010
                    01100
Example (cont.)
New key to be inserted: 01011.
Block 01 is full, so we start considering 3 bits:
directory:  000/001  010    011    100/101  110/111
            00010    01001  01100  10001    11000
            00100    01010         10100    11010
                     01011
(the pairs 000/001, 100/101 and 110/111 still share the
same blocks)
Extendible Hashing
Size of the directory: 2^D
2^D = O( N^(1+1/M) / M )
D - the number of bits considered
N - number of records
M - number of records in a disk block
Conclusion 1
Hashing is a search method,
used when
sorting is not needed
access time is the primary
concern
Conclusion 2
Time-space trade-off:
No memory limitations–use the
key as a memory address (minimum
amount of time).
No time limitations–use
sequential search (minimum amount of
memory)
Hashing –gives a balance between these
two extremes –a way to use a reasonable
amount of both memory and time.
Conclusion 3
To choose a good hash function is a
“black art”.
The choice depends on the
nature of keysand the
distributionof the numbers
corresponding to the keys.
Conclusion 4
Best course of action:
• separate chaining: if the number of
records is not known in advance
• open addressing: if the number of the
records can be predicted and there is
enough memory available
Chapter 6:
Priority Queues
Priority Queues
Binary Heaps
Mark Allen Weiss: Data Structures and Algorithm Analysis in Java
Mark Allen Weiss, Florida University
The Model
A priority queue is a queue where:
Requests are inserted in the order of arrival
The request with highest priority is processed
first (deleted from the queue)
The priority is indicated by a number; the
lower the number, the higher the priority.
Implementations
Linked list:
- Insert at the beginning - O(1)
- Find the minimum - O(N)
Binary search tree (BST):
- Insert O(logN)
- Find minimum - O(logN)
Implementations
Binary heap:
Better than BST because it does not need links
- Insert O(logN)
- Find minimum O(1)
Finding the minimal element takes constant
time; however, after deleting it the heap structure
has to be adjusted, and this requires O(logN) time.
Binary Heap
Heap-Structure Property:
Complete binary tree - each node has two
children, except for the last two levels.
The nodes at the last level do not have children.
New nodes are inserted at the last level from
left to right.
Heap-Order Property:
Each node has a higher priority than its children
Binary Heap
[Figure: a binary heap: 6 / 10 12 / 15 17 18 23 / 20 19 34;
the next node to be inserted becomes the right child of 17]
Binary heap implementation with an array
Root - A(1)
Left child of A(i) - A(2i)
Right child of A(i) - A(2i+1)
Parent of A(i) - A([i/2])
The smallest element is always at the
root, so the access time to the element
with highest priority is constant: O(1).
Example
6 10 12 15 17 18 23 20 19 34
Consider 17:
position in the array - 5.
parent 10 is at position [5/2] = 2
left child is at position 5*2 = 10 (this is 34)
right child - position 2*5 + 1 = 11 (empty)
Problems
Problem 1:
2 8 10 16 17 18 23 20 21 30
Reconstruct the binary heap
Problems
Problem 2: Give the array representation of the heap:
[Figure: heap with root 3, second level 10 16, third level
13 12 18 23, last level 15 19 30 14]
Basic Operations
Insert a node –Percolate Up
Delete a node –Percolate Down
Decrease key, Increase key, Remove key
Build the heap
Percolate Up - Insert a Node
A hole is created at the bottom of the tree, in the
next available position.
[Figure: the heap 6 / 10 12 / 15 17 18 23 / 20 19 34 with a new
hole at the last level]
Percolate Up
[Figure: the inserted key percolates up: parents with lower
priority move down into the hole until the key's place is found]
Complexity of insertion: O(logN)
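An array-based sketch of insert with percolate-up, using the 1-based indexing above (a[0] unused; the caller tracks the size; assumes a has room):

static int insert(int[] a, int size, int x) {
    int hole = ++size;                     // hole at the next available position
    while (hole > 1 && x < a[hole / 2]) {  // parent has lower priority
        a[hole] = a[hole / 2];             // parent moves down into the hole
        hole /= 2;                         // the hole percolates up
    }
    a[hole] = x;
    return size;
}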
Percolate Down - Delete a Node
[Figure: a binary heap whose minimal element (the root) is to be
deleted]
Percolate Down - the wrong way
[Figure: deleting the root leaves a hole at the top]
Percolate Down - the wrong way
[Figure: repeatedly moving the smaller child up leaves the hole at
the bottom level, but possibly not at the last position]
The empty hole violates the heap-structure property
Percolate Down
Last element - 20. The hole is at the root.
[Figure: the heap with a hole at the root and the last element,
20, to be placed]
We try to insert 20 in the hole by percolating the hole down
Percolate Down
[Figure: the smaller child moves up; the hole moves down one level]
Percolate Down
[Figure: the process repeats at the next level]
Percolate Down
[Figure: 20 is placed in the final position of the hole]
Complexity of deletion: O(logN)
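A matching sketch of deleteMin with percolate-down (same conventions; the last element is tried in the hole on the way down):

static int deleteMin(int[] a, int size) {
    int last = a[size--];              // the last element will fill the hole
    int hole = 1;                      // the hole starts at the root
    while (2 * hole <= size) {
        int child = 2 * hole;          // left child
        if (child < size && a[child + 1] < a[child])
            child++;                   // pick the smaller child
        if (a[child] < last) {
            a[hole] = a[child];        // smaller child moves up
            hole = child;              // the hole percolates down
        } else break;
    }
    a[hole] = last;
    return size;
}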
Other Heap Operations
1. DecreaseKey(p,d)
Increase the priority of element p in
the heap by a positive value d:
percolate up.
2. IncreaseKey(p,d)
Decrease the priority of element p in
the heap by a positive value d:
percolate down.
Other Heap Operations
3. Remove(p)
a. Assign the highest priority to p - percolate p
up to the root.
b. Delete the element in the root and fill the
hole by percolating down, trying to insert the
last element in the queue.
4. BuildHeap
Input: N elements.
Place them into an empty heap through successive
inserts. The worst-case running time is O(NlogN).
Build Heap - O(N)
Given an array of elements to be
inserted in the heap,
treat the array as a heap with the order
property violated,
and then do operations to fix the order property.
Theorem
For a perfect binary tree of height
h containing N = 2^(h+1) - 1 nodes,
the sum S of the heights of the nodes is
S = 2^(h+1) - 1 - (h + 1) = O(N)
Proof
The tree has 1 node at height h, 2 nodes
at height h-1, 4 nodes at height h-2, etc.:
1 (2^0)    h
2 (2^1)    h - 1
4 (2^2)    h - 2
8 (2^3)    h - 3
…
2^h        0
S = Σ 2^i (h - i), i = 0 to h
Insertion Sort
PRE: array of N elements (from 0 to N-1)
POST: array sorted
1. An array of one element is sorted.
2. Assume that the first p elements are sorted:
for j = p to N-1
    take the j-th element and find a place
    for it among the first j sorted elements
Insertion Sort
int j, p; Comparable tmp;
for ( p = 1; p < N; p++ )
{
    tmp = a[p];
    for ( j = p; j > 0 && tmp < a[ j-1 ]; j-- )
        a[ j ] = a[ j-1 ];
    a[ j ] = tmp;
}
Analysis of the Insertion Sort
Insert the N-th el.:     at most N-1 comparisons, N-1 movements
Insert the (N-1)-st el.: at most N-2 comparisons, N-2 movements
…
Insert the 2-nd el.:     1 comparison, 1 movement
2*(1 + 2 + … + (N-1)) = 2*N*(N-1)/2 =
N(N-1) = Θ(N^2)
Almost sorted array: O(N)
Average complexity: Θ(N^2)
A lower bound for
simple sorting algorithms
• An inversion:
an ordered pair (A_i, A_j) such that
i < j but A_i > A_j
Example: 10, 6, 7, 15, 3, 1
Inversions are:
(10,6), (10,7), (10,3), (10,1), (6,3), (6,1),
(7,3), (7,1), (15,3), (15,1), (3,1)
Swapping
Swapping adjacent elements that are out of
order removes one inversion.
A sorted array has no inversions.
Sorting an array that contains i inversions
requires at least i swaps of adjacent elements.
How many inversions are there in an average
unsorted array?
Theorems
Theorem 1: The average number of
inversions in an array of N distinct elements is
N (N - 1) / 4
Theorem 2: Any algorithm that sorts by
exchanging adjacent elements requires
Ω(N^2) time on average.
For a sorting algorithm to run in less than
quadratic time it must do something other
than swap adjacent elements.
Chapter 7:
Sorting Algorithms
Shell Sort
Mark Allen Weiss: Data Structures and Algorithm Analysis in Java
Mark Allen Weiss, Florida University
Shell Sort
Improves on insertion sort.
Compares elements far apart,
then less far apart, and finally
compares adjacent elements
(as an insertion sort).
Idea
arrange the data sequence in a
two-dimensional array
sort the columns of the array
repeat the process, each time with a
smaller number of columns
Example
3 7 9 0 5 1 6 8 4 2 0 6 1 5 7 3 4 9 8 2
is arranged in an array with 7 columns (left),
then the columns are sorted (right):
3 7 9 0 5 1 6        3 3 2 0 5 1 5
8 4 2 0 6 1 5        7 4 4 0 6 1 6
7 3 4 9 8 2          8 7 9 9 8 2
Example (cont)
3 3 2        0 0 1
0 5 1        1 2 2
5 7 4        3 3 4
4 0 6        4 5 6
1 6 8        5 6 8
7 9 9        7 7 9
8 2          8 9
One column is used in the last step -
only a 6, an 8 and a 9
have to move a little bit to their
correct positions.
Implementation
• a one-dimensional array that is
indexed appropriately
• an increment sequence
to determine how far apart the
elements to be sorted are
Increment sequence
Determines how far apart the elements to
be sorted are:
h_1, h_2, ..., h_t with h_1 = 1
An h_k-sorted array: all elements
spaced a distance h_k apart are sorted
relative to each other.
Correctness of the algorithm
Shellsort only works because an array that
is h_k-sorted remains h_k-sorted when it is
h_(k-1)-sorted.
Subsequent sorts do not undo the
work done by previous phases.
The last step (with h = 1) is an
Insertion Sort on the whole array.
Java code for Shell sort
int j, p, gap; Comparable tmp;
for ( gap = N/2; gap > 0; gap = gap/2 )
    for ( p = gap; p < N; p++ )
    {
        tmp = a[p];
        for ( j = p; j >= gap && tmp < a[ j-gap ]; j = j - gap )
            a[ j ] = a[ j-gap ];
        a[ j ] = tmp;
    }
Increment sequences
1. Shell's original sequence:
N/2, N/4, ..., 1 (repeatedly divide by 2).
2. Hibbard's increments:
1, 3, 7, ..., 2^k - 1
3. Knuth's increments:
1, 4, 13, ..., (3^k - 1) / 2
4. Sedgewick's increments:
1, 5, 19, 41, 109, ....
9·4^k - 9·2^k + 1  or  4^k - 3·2^k + 1
Analysis
Shellsort's worst-case performance using
Hibbard's increments is Θ(n^(3/2)).
The average performance is thought to be
about O(n^(5/4)).
The exact complexity of this algorithm is still
being debated.
For mid-sized data it performs nearly as well,
if not better, than the faster O(n log n) sorts.
Chapter 7:
Sorting Algorithms
Merge Sort
Mark Allen Weiss: Data Structures and Algorithm Analysis in Java
Mark Allen Weiss, Florida University
Merge Sort
Basic Idea
Example
Analysis
Idea
Two sorted arrays can be merged
in linear time, with N comparisons only.
Given an array to be sorted,
consider separately
its left half and its right half,
sort them, and then merge them.
Characteristics
A recursive algorithm.
Runs in O(NlogN) worst-case running time.
Where is the recursion?
Each half is an array that can be sorted
using the same algorithm - divide into
two, sort separately the left and the right
halves, and then merge them.
Example
[Figure: the splitting and merging steps on a sample array]
Merge Sort Code
void merge_sort ( int [ ] a, int left, int right )
{
    if ( left < right ) {
        int center = (left + right) / 2;
        merge_sort (a, left, center);
        merge_sort (a, center + 1, right);
        merge (a, left, center + 1, right);
    }
}
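The slides call merge without showing it; a sketch under one common signature merge(a, leftPos, rightPos, rightEnd), using a temporary array (an assumption, not the text's code):

static void merge(int[] a, int leftPos, int rightPos, int rightEnd) {
    int leftEnd = rightPos - 1;
    int[] tmp = new int[rightEnd - leftPos + 1];
    int i = leftPos, j = rightPos, k = 0;
    while (i <= leftEnd && j <= rightEnd)        // at most N comparisons
        tmp[k++] = (a[i] <= a[j]) ? a[i++] : a[j++];
    while (i <= leftEnd) tmp[k++] = a[i++];      // copy the rest of the left run
    while (j <= rightEnd) tmp[k++] = a[j++];     // copy the rest of the right run
    for (k = 0; k < tmp.length; k++)             // copy back into a
        a[leftPos + k] = tmp[k];
}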
Analysis of Merge Sort
Assumption: N is a power of two.
For N = 1 the time is constant (denoted by 1).
Otherwise:
time to mergesort N elements =
time to mergesort N/2 elements (done twice) +
time to merge two arrays of N/2 elements each.
Recurrence Relation
The time to merge two arrays of N/2
elements each is linear, i.e. O(N).
Thus we have:
(a) T(1) = 1
(b) T(N) = 2T(N/2) + N
Solving the Recurrence Relation
T(N) = 2T(N/2) + N; divide by N:
(1) T(N) / N = T(N/2) / (N/2) + 1
Telescoping: N is a power of two, so we can write
(2) T(N/2) / (N/2) = T(N/4) / (N/4) + 1
(3) T(N/4) / (N/4) = T(N/8) / (N/8) + 1
…
T(2) / 2 = T(1) / 1 + 1
Adding the Equations
The sum of the left-hand sides will be equal to
the sum of the right-hand sides:
T(N)/N + T(N/2)/(N/2) + T(N/4)/(N/4) + … + T(2)/2 =
T(N/2)/(N/2) + T(N/4)/(N/4) + … + T(2)/2 + T(1)/1 + logN
(logN is the sum of the 1's in the right-hand sides)
Crossing Equal Terms, Final Formula
After crossing the equal terms, we get
T(N)/N = T(1)/1 + logN
T(1) is 1, hence we obtain
T(N) = N + NlogN = Θ(NlogN)
Hence the complexity of the Merge Sort
algorithm is Θ(NlogN).
Chapter 7:
Sorting Algorithms
Quick Sort
Mark Allen Weiss: Data Structures and Algorithm Analysis in Java
Mark Allen Weiss, Florida University
Quick Sort
Basic Idea
Code
Analysis
Advantages and Disadvantages
Applications
Comparison with Heap sort and Merge sort
Basic Idea
Pick one element in the array, which will
be the pivot.
Make one pass through the array, called a
partition step, re-arranging the entries so that:
• entries smaller than the pivot are to the left
of the pivot.
• entries larger than the pivot are to the right
Basic Idea
Recursively apply quicksort to the part of
the array that is to the left of the pivot, and
to the part on its right.
No merge step; at the end all the elements
are in the proper order.
Choosing the Pivot
Some fixed element: e.g. the first, the
last, the one in the middle.
Bad choice - it may turn out to be the
smallest or the largest element; then
one of the partitions will be empty.
Randomly chosen (by a random generator) -
still a bad choice.
Choosing the Pivot
The median of the array
(if the array has N numbers, the
median is the [N/2]-th largest number).
This is difficult to compute - it increases
the complexity.
Choosing the Pivot
The median-of-three choice:
take the first, the last and the middle
elements
and choose the median of these three elements.
Find the Pivot - Java Code
int median3 ( int[ ] a, int left, int right )
{
    int center = (left + right) / 2;
    if ( a[left] > a[center] ) swap (a, left, center);
    if ( a[center] > a[right] ) swap (a, center, right);
    if ( a[left] > a[center] ) swap (a, left, center);
    swap (a, center, right-1);   // place the pivot at right-1
    return a[ right-1 ];
}
Quick Sort - Java Code
if ( left + 10 <= right )
    { /* do quick sort */ }
else insertionSort (a, left, right);
Quick Sort –Java Code
{ inti = left, j = right -1;
for( ; ; )
{ while(a [++ i ] < pivot) { }
while( pivot < a [--j ] ) { }
if(i < j) swap (a[ i ], a [ j ]);
else break; }
swap (a [ i ], a [ right-1 ]);
quickSort ( a, left, i-1);
quickSort (a, i+1, right); }
331
Implementation Notes
Compare the two versions:
A.while (a[++i] < pivot) { }
while (pivot < a[--j]){ }
if ( i < j ) swap (a[i], a[j]);
else break;
B. while (a[ i ] < pivot) { i++ ; }
while (pivot < a[ j ] ) { j--; }
if ( i < j ) swap (a [ i ], a [ j ]);
else break;
332
Implementation Notes
If we have an array of equal
elements, the second code will
never increment i or decrement j,
and will do infinite swaps.
i and j will never cross.
333
Complexity of Quick Sort
Average case: O(NlogN)
Worst case: O(N^2)
This happens when the pivot is the
smallest (or the largest) element.
Then one of the partitions is empty,
and we repeat recursively the
procedure for N-1 elements.
Complexity of Quick Sort
Best case: O(NlogN)
The pivot is the median of the array;
the left and the right parts have the same size.
There are logN partitions; to
obtain each partition we do N
comparisons (and not more than N/2
swaps). Hence the complexity is O(NlogN).
Analysis
T(N) = T(i) + T(N - i - 1) + cN
The time to sort the file is equal to
the time to sort the left partition
with i elements, plus
the time to sort the right partition with
N-i-1 elements, plus
the time to build the partitions.
Worst-Case Analysis
The pivot is the smallest (or the largest) element:
T(N) = T(N-1) + cN, N > 1
Telescoping:
T(N-1) = T(N-2) + c(N-1)
T(N-2) = T(N-3) + c(N-2)
T(N-3) = T(N-4) + c(N-3)
…
T(2) = T(1) + c·2
Worst-Case Analysis
T(N) + T(N-1) + T(N-2) + … + T(2) =
= T(N-1) + T(N-2) + … + T(2) + T(1) +
c(N) + c(N-1) + c(N-2) + … + c·2
T(N) = T(1) +
c times (the sum of 2 through N)
= T(1) + c( N(N+1)/2 - 1 ) = O(N^2)
Best-Case Analysis
The pivot is in the middle:
T(N) = 2T(N/2) + cN
Divide by N:
T(N) / N = T(N/2) / (N/2) + c
Telescoping:
T(N) / N = T(N/2) / (N/2) + c
T(N/2) / (N/2) = T(N/4) / (N/4) + c
T(N/4) / (N/4) = T(N/8) / (N/8) + c
…
T(2) / 2 = T(1) / 1 + c
Adding the equations and crossing the equal terms
(as for merge sort) gives T(N)/N = T(1)/1 + c·logN,
i.e. T(N) = O(NlogN).
Advantages and Disadvantages
Advantages:
One of the fastest algorithms on average
Does not need additional memory (the
sorting takes place in the array -this is
called in-place processing )
Disadvantages:
The worst-case complexity is O(N
2
)
342
Applications
Commercial applications
QuickSort generally runs fast
No additional memory
The above advantages compensate for
the rare occasions when it runs withO(N
2
)
343
Applications
Neveruse in applications which require aguarantee
of response time:
Life-critical
(medical monitoring,
life support in aircraft, space craft)
Mission-critical
(industrial monitoring and control in handling
dangerous materials, control for aircraft, defense,
etc)
unless you assume the worst-case
response time
Warning:
344
Comparison with Heap Sort

Both have O(N log N) complexity.
Quick sort usually runs faster (it does not maintain a heap tree), but its speed is not guaranteed.
Comparison with Merge Sort

Merge sort guarantees O(N log N) time.
Merge sort requires additional memory of size N.
Usually merge sort is not used for main memory sorting, only for external memory sorting.
Chapter 9: Graphs
Topological Sort
Problem
P: Assemble pool table
F: Carpet the floor
W: Panel the walls
C: Install ceiling
How would you order these
activities?
Problem Graph

[Figure: precedence graph with vertices S (start), F (floor), P (pool table), W (walls), C (ceiling), E (end)]

Some possible orderings:
C W F P
W C F P
F P W C
C F P W

Important: P must be after F.
Topological Sort

RULE: If there is a path from u to v, then v appears after u in the ordering.
Graphs' Characteristics

Directed: otherwise an edge (u,v) would mean a path both from u to v and from v to u, and the two vertices could not be ordered.
Acyclic: otherwise, for u and v on a cycle, u would precede v and v would precede u.
The ordering may not be unique

[Figure: DAG with edges V1 → V2, V1 → V3, V2 → V4, V3 → V4]

Legal orderings:
V1, V2, V3, V4
V1, V3, V2, V4
Some Definitions

Degree of a vertex U: the number of outgoing edges (U,V).
Indegree of a vertex U: the number of incoming edges (V,U).
The algorithm for topological sort uses the indegrees of the vertices.
Basic Algorithm

1. Compute the indegrees of all vertices.
2. Find a vertex U with indegree 0 and print it (store it in the ordering).
   If there is no such vertex, then there is a cycle and the vertices cannot be ordered. Stop.
3. Remove U and all its edges (U,V) from the graph.
4. Update the indegrees of the remaining vertices.
Repeat steps 2 through 4 while there are vertices to be processed.
Example

[Figure: DAG on vertices V1-V5]

Indegrees:
V1: 0
V2: 1
V3: 2
V4: 2
V5: 1

First to be sorted: V1.

New indegrees:
V2: 0
V3: 1
V4: 1
V5: 1

Possible sorts: V1, V2, V4, V3, V5; V1, V2, V4, V5, V3
Complexity

O(|V|²), where |V| is the number of vertices.
To find a vertex of indegree 0 we scan all the vertices: |V| operations.
We do this for all vertices: |V|² operations.
Improved Algorithm

To find a vertex of indegree 0, scan only those vertices whose indegrees have been updated to 0.
Improved Algorithm

1. Compute the indegrees of all vertices.
2. Store all vertices with indegree 0 in a queue.
3. Get a vertex U from the queue.
4. For all edges (U,V): update the indegree of V, and put V in the queue if the updated indegree is 0.
Repeat steps 3 and 4 while the queue is not empty. A Java sketch follows.
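A minimal sketch of the queue-based algorithm above, assuming an adjacency-list representation (List<List<Integer>> with vertices numbered 0..n-1); the representation and method name are illustrative choices, not prescribed by the slides.

import java.util.*;

static int[] topologicalSort(List<List<Integer>> adj) {
    int n = adj.size();
    int[] indegree = new int[n];
    for (List<Integer> edges : adj)            // 1. compute all indegrees
        for (int w : edges) indegree[w]++;
    Deque<Integer> queue = new ArrayDeque<>();
    for (int v = 0; v < n; v++)                // 2. enqueue indegree-0 vertices
        if (indegree[v] == 0) queue.add(v);
    int[] order = new int[n];
    int count = 0;
    while (!queue.isEmpty()) {
        int u = queue.remove();                // 3. get a vertex U
        order[count++] = u;
        for (int w : adj.get(u))               // 4. update indegrees
            if (--indegree[w] == 0) queue.add(w);
    }
    if (count != n)
        throw new IllegalStateException("graph has a cycle");
    return order;
}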
Complexity
Operations needed to compute the indegrees:
Adjacency lists: O(|E|)
Matrix: O(|V|
2
)
Complexity of the improved algorithm
Adjacency lists: O(|E| + |V|),
Matrix representation: O(|V|
2
)
Note that if the graph is complete |E| = O(|V|
2
)
|V| -number of vertices,
|E| -number of edges.
Chapter 9: Graphs
Shortest Paths
Shortest Paths

Shortest Paths in Unweighted Graphs
Shortest Paths in Weighted Graphs - Dijkstra's Algorithm
Animation
Unweighted Directed Graphs

[Figure: directed graph on vertices V1-V7]

What is the shortest path from V3 to V5?
Problem Data

The problem: given a source vertex s, find the shortest paths to all other vertices.
Data structures needed:
  Graph representation: adjacency lists / adjacency matrix
  Distance table: distances from the source vertex, paths from the source vertex
Breadth-first search in graphs

Take a vertex and examine all adjacent vertices. Do the same with each of the adjacent vertices.
Algorithm

1. Store s in a queue, and initialize distance = 0 in the Distance Table.
2. While there are vertices in the queue:
   Read a vertex v from the queue.
   For all adjacent vertices w:
     If distance = -1 (not computed):
       distance = (distance to v) + 1
       parent = v
       Append w to the queue.
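A minimal sketch of this computation, assuming an adjacency-list representation; returning the two table columns as arrays is an illustrative choice.

import java.util.*;

static int[][] unweightedShortestPaths(List<List<Integer>> adj, int s) {
    int n = adj.size();
    int[] distance = new int[n];
    int[] parent = new int[n];
    Arrays.fill(distance, -1);          // -1 = not computed
    Arrays.fill(parent, -1);
    distance[s] = 0;
    Deque<Integer> queue = new ArrayDeque<>();
    queue.add(s);
    while (!queue.isEmpty()) {
        int v = queue.remove();
        for (int w : adj.get(v)) {
            if (distance[w] == -1) {    // first time we reach w
                distance[w] = distance[v] + 1;
                parent[w] = v;          // records the shortest-path tree
                queue.add(w);
            }
        }
    }
    return new int[][] { distance, parent };
}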
Complexity

Matrix representation: O(|V|²)
Adjacency lists: O(|E| + |V|)
We examine all edges (O(|E|)), and we store each vertex in the queue only once (O(|V|)).
Weighted Directed Graphs

[Figure: weighted directed graph on vertices V1-V7]

What is the shortest distance from V3 to V7?
Comparison

Similar to the algorithm for unweighted graphs.
Differences:
- weights are included in the graph representation;
- a priority queue is used: the node with the smallest distance is chosen for processing;
- the distance is no longer the number of edges, but the sum of weights;
- the Distance Table is updated if the newly computed distance is smaller.
Algorithm

1. Store s in a priority queue with distance = 0.
2. While there are vertices in the queue:
   DeleteMin a vertex v from the queue.
   For all adjacent vertices w:
     Compute the new distance.
     Store in / update the Distance Table.
     Insert / update in the priority queue.
Processing Adjacent Nodes

For all adjacent vertices w:
  Compute new distance = (distance to v) + d(v,w)
  If distance = -1 (not computed):
    store the new distance in the table
    path = v
    insert w in the priority queue
  If old distance > new distance:
    update old_distance = new_distance
    update path = v
    update the priority of w in the priority queue
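A hedged sketch of Dijkstra's algorithm as described above. java.util.PriorityQueue has no decrease-key operation, so stale queue entries are skipped on removal instead of updated in place - a common workaround, not the slides' literal "update priority" step. The Edge record is a hypothetical type introduced here.

import java.util.*;

record Edge(int to, int weight) {}                 // hypothetical edge type

static int[] dijkstra(List<List<Edge>> adj, int s) {
    int n = adj.size();
    int[] distance = new int[n];
    int[] path = new int[n];
    Arrays.fill(distance, -1);                     // -1 = not computed
    Arrays.fill(path, -1);
    PriorityQueue<int[]> pq =                      // entries: {vertex, dist}
        new PriorityQueue<>(Comparator.comparingInt(e -> e[1]));
    distance[s] = 0;
    pq.add(new int[] { s, 0 });
    while (!pq.isEmpty()) {
        int[] top = pq.remove();                   // DeleteMin
        int v = top[0];
        if (top[1] != distance[v]) continue;       // stale entry, skip
        for (Edge e : adj.get(v)) {
            int newDist = distance[v] + e.weight();
            if (distance[e.to()] == -1 || newDist < distance[e.to()]) {
                distance[e.to()] = newDist;        // store / update table
                path[e.to()] = v;
                pq.add(new int[] { e.to(), newDist });
            }
        }
    }
    return distance;
}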
Complexity

O(E log V + V log V) = O((E + V) log V)
Each vertex is stored only once in the queue: O(V).
The DeleteMin operations cost O(V log V) overall.
Updating the priority queue (search and insert) is O(log V), performed at most once for each edge: O(E log V).
Chapter 9: Graphs
Spanning Trees
Spanning Trees

Spanning Trees in Unweighted Graphs
Minimal Spanning Trees in Weighted Graphs - Prim's Algorithm
Minimal Spanning Trees in Weighted Graphs - Kruskal's Algorithm
Animation
Definitions

Spanning tree: a tree that contains all vertices in the graph.
Number of nodes: |V|
Number of edges: |V| - 1
Spanning trees for unweighted graphs - data structures

A table (an array) T:
  size = number of vertices,
  Tv = parent of vertex v
Adjacency lists
A queue of vertices to be processed
Algorithm - initialization

1. Choose a vertex S and store it in a queue; set
   a counter = 0 (counts the processed nodes),
   Ts = 0 (to indicate the root), and
   Ti = -1 for i ≠ s (to indicate that vertex i is not processed).
Algorithm - basic loop

2. While the queue is not empty and counter < |V| - 1:
   Read a vertex V from the queue.
   For all adjacent vertices U:
     If Tu = -1 (not processed):
       Tu ← V
       counter ← counter + 1
       store U in the queue
Algorithm - results and complexity

Result:
  Root: S, such that Ts = 0.
  Edges in the tree: (Tv, V).
Complexity: O(|E| + |V|) - we process all edges and all nodes.
Minimum Spanning Tree - Prim's Algorithm

Given: a weighted graph. Find a spanning tree with the minimal sum of the weights.
Similar to shortest paths in weighted graphs.
Difference: we record the weight of the current edge, not the length of the path.
Data Structures

A table: rows = number of vertices, three columns:
  T(v,1) = weight of the edge from the parent to vertex V
  T(v,2) = parent of vertex V
  T(v,3) = True if vertex V is fixed in the tree, False otherwise
Data Structures (cont)

Adjacency lists.
A priority queue of vertices to be processed. The priority of each vertex is determined by the weight of the edge that links the vertex to its parent. The priority may change if we change the parent, provided the vertex is not yet fixed in the tree.
Prim's Algorithm - Initialization

For all V, set:
  the first column to -1 (not included in the tree),
  the second column to 0 (no parent),
  the third column to False (not fixed in the tree).
Select a vertex S, set T(s,1) = 0 (root: no parent).
Store S in a priority queue with priority 0.
While (queue not empty) loop

1. DeleteMin a vertex V from the queue, set T(v,3) = True.
2. For all vertices W adjacent to V:
   If T(w,3) = True, do nothing - the vertex is fixed.
   If T(w,1) = -1 (i.e. the vertex is not yet included):
     T(w,1) ← weight of edge (v,w)
     T(w,2) ← v (the parent of w)
     insert w in the queue with priority = weight of (v,w)
While (queue not empty) loop (cont)

   If T(w,1) > weight of (v,w):
     T(w,1) ← weight of edge (v,w)
     T(w,2) ← v (the parent of w)
     update the priority of w in the queue:
       remove, and insert with new priority = weight of (v,w)
Results and Complexity

At the end of the algorithm, the tree is represented in the table by its edges:
  {(v, T(v,2)) | v = 1, 2, ..., |V|}
Complexity: O(|E| log(|V|))
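A minimal sketch of Prim's algorithm as described above, using the same lazy-deletion workaround and the same hypothetical Edge record as the Dijkstra sketch earlier; the three table columns become three arrays.

import java.util.*;

static int[] prim(List<List<Edge>> adj, int s) {
    int n = adj.size();
    int[] weight = new int[n];                     // T(v,1): edge weight to parent
    int[] parent = new int[n];                     // T(v,2): parent in the tree
    boolean[] fixed = new boolean[n];              // T(v,3): fixed in the tree
    Arrays.fill(weight, -1);                       // -1 = not included yet
    PriorityQueue<int[]> pq =                      // entries: {vertex, priority}
        new PriorityQueue<>(Comparator.comparingInt(e -> e[1]));
    weight[s] = 0;
    parent[s] = 0;                                 // root marker, as in the slides
    pq.add(new int[] { s, 0 });
    while (!pq.isEmpty()) {
        int v = pq.remove()[0];                    // DeleteMin
        if (fixed[v]) continue;                    // stale entry, skip
        fixed[v] = true;
        for (Edge e : adj.get(v)) {
            int w = e.to();
            if (!fixed[w] && (weight[w] == -1 || e.weight() < weight[w])) {
                weight[w] = e.weight();            // better edge to w found
                parent[w] = v;
                pq.add(new int[] { w, e.weight() });
            }
        }
    }
    return parent;                                 // tree edges: (parent[v], v)
}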
Kruskal's Algorithm

The algorithm works with:
  the set of edges,
  a forest of trees;
  each vertex belongs to exactly one tree in the forest.
The Algorithm

1. Build |V| trees of one vertex each - each vertex is a tree of its own. Store the edges in a priority queue.
2. Choose an edge (u,v) with minimum weight:
   if u and v belong to one and the same tree, do nothing;
   if u and v belong to different trees, link the trees by the edge (u,v).
3. Perform step 2 until all trees are combined into one tree.
Example (cont)

[Figure: forest of 5 one-vertex trees: 1, 2, 3, 4, 5]

Edges in the priority queue (weights shown):
(1,2) 1
(3,5) 2
(2,4) 3
(1,3) 4
(4,5) 4
(2,5) 5
(1,5) 6

Process edge (1,2) - vertices 1 and 2 are joined into one tree.
Process edge (3,5) - vertices 3 and 5 are joined into one tree.
Example (cont)

Edges remaining in the priority queue:
(2,4) 3
(1,3) 4
(4,5) 4
(2,5) 5
(1,5) 6

Process edge (2,4) - vertex 4 joins the tree {1,2}.
Process edge (1,3) - the trees {1,2,4} and {3,5} are linked.
All trees are combined into one; the algorithm stops.
Operations on Disjoint Sets

Comparison: are two vertices in the same tree?
Union: combine two disjoint sets to form one set - the new tree.
Implementation: union/find operations; the sets are represented as trees, so n operations cost O(n log n). The set of trees is implemented by an array. A Java sketch follows.
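A minimal sketch of Kruskal's algorithm with a simple array-based disjoint-set forest (union by rank, path compression); the WeightedEdge record and method names are illustrative assumptions.

import java.util.*;

record WeightedEdge(int u, int v, int weight) {}

static List<WeightedEdge> kruskal(int n, List<WeightedEdge> edges) {
    int[] parent = new int[n];                    // disjoint sets in an array
    int[] rank = new int[n];
    for (int i = 0; i < n; i++) parent[i] = i;    // |V| one-vertex trees

    PriorityQueue<WeightedEdge> pq =
        new PriorityQueue<>(Comparator.comparingInt(WeightedEdge::weight));
    pq.addAll(edges);

    List<WeightedEdge> tree = new ArrayList<>();
    while (!pq.isEmpty() && tree.size() < n - 1) {
        WeightedEdge e = pq.remove();             // minimum-weight edge
        int ru = find(parent, e.u()), rv = find(parent, e.v());
        if (ru != rv) {                           // different trees: link them
            tree.add(e);
            if (rank[ru] < rank[rv]) { int t = ru; ru = rv; rv = t; }
            parent[rv] = ru;
            if (rank[ru] == rank[rv]) rank[ru]++;
        }                                         // same tree: do nothing
    }
    return tree;
}

static int find(int[] parent, int x) {            // find with path compression
    while (parent[x] != x) {
        parent[x] = parent[parent[x]];
        x = parent[x];
    }
    return x;
}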
Complexity of Kruskal's Algorithm

O(|E| log(|V|))
Explanation: the complexity is determined by the heap operations and the union/find operations.
Union/find operations on disjoint sets represented as trees are tree search operations, with complexity O(log |V|).
DeleteMin in a binary heap with N elements is O(log N); performed for N elements, it is O(N log N).
Complexity (cont)

Kruskal's algorithm works with a binary heap that contains all edges: O(|E| log(|E|)).
However, a complete graph has |E| = |V|(|V|-1)/2, i.e. |E| = O(|V|²).
Thus for the binary heap we have:
O(|E| log(|E|)) = O(|E| log(|V|²)) = O(2|E| log(|V|)) = O(|E| log(|V|))
Complexity (cont)

Since for each edge we perform a DeleteMin operation and a union/find operation, the overall complexity is:
O(|E| (log(|V|) + log(|V|))) = O(|E| log(|V|))
Chapter 9: Graphs
Breadth-First and
Depth-First Search
Breadth-First and Depth-First
Search
BFS Basic Algorithm
BFS Complexity
DFS Algorithm
DFS Implementation
Relation between BFS and DFS
BFS - Basic Idea

Given a graph with N vertices and a selected vertex A:

for (i = 1; there are unvisited vertices; i++)
    visit all unvisited vertices at distance i

(i is the length of the shortest path between A and the currently processed vertices)
Queue-based implementation.
BFS - Algorithm

1. Store the source vertex S in a queue and mark it as processed.
2. While the queue is not empty:
   Read a vertex v from the queue.
   For all neighbors w:
     If w is not processed:
       mark w as processed,
       append w to the queue,
       record the parent of w to be v (necessary only if we need the shortest-path tree).
Example

[Figure: graph on vertices 1-6]

Adjacency lists:
1: 2, 3, 4
2: 1, 3, 6
3: 1, 2, 4, 5, 6
4: 1, 3, 5
5: 3, 4
6: 2, 3

Breadth-first traversal: 1, 2, 3, 4, 6, 5
1: starting node
2, 3, 4: adjacent to 1 (at distance 1 from node 1)
6: unvisited, adjacent to node 2
5: unvisited, adjacent to node 3
The order depends on the order of the nodes in the adjacency lists.
Shortest Path Tree

Consider the distance table:

Node:           A  B  C  D  E
Distance to A:  0  1  1  2  1
Parent:         0  A  A  C  A

The table defines the shortest-path tree, rooted at A.
BFS - Complexity

Step 1: read a node from the queue - performed O(V) times.
Step 2: examine all neighbors, i.e. examine all edges of the currently read node.
Undirected graph: 2·E edge examinations.
Hence the complexity of BFS is O(V + 2·E) = O(V + E).
Depth-First Search

procedure dfs(s)
    mark all vertices in the graph as not reached
    invoke scan(s)

procedure scan(s)
    mark and visit s
    for each neighbor w of s
        if the neighbor is not reached
            invoke scan(w)
Depth-First Search with Stack

Initialization: mark all vertices as unvisited, visit(s)

while the stack is not empty:
    pop (v,w)
    if w is not visited
        add (v,w) to tree T
        visit(w)

procedure visit(v)
    mark v as visited
    for each edge (v,w)
        push (v,w) onto the stack
Recursive DFS

DepthFirst(Vertex v)
    visit(v)
    for each neighbor w of v
        if (w is not visited)
            add edge (v,w) to tree T
            DepthFirst(w)

visit(v)
    mark v as visited
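A minimal Java rendering of the recursive procedure above; the adjacency-list representation and the tree-edge list are illustrative assumptions.

import java.util.*;

static void depthFirst(List<List<Integer>> adj, int v,
                       boolean[] visited, List<int[]> treeEdges) {
    visited[v] = true;                         // visit(v)
    for (int w : adj.get(v)) {
        if (!visited[w]) {
            treeEdges.add(new int[] { v, w }); // edge (v,w) joins tree T
            depthFirst(adj, w, visited, treeEdges);
        }
    }
}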
Example

[Figure: graph on vertices 1-6]

Adjacency lists:
1: 2, 3, 4
2: 6, 3, 1
3: 1, 2, 6, 5, 4
4: 1, 3, 5
5: 3, 4
6: 2, 3

Depth-first traversal: 1, 2, 6, 3, 5, 4
The particular order depends on the order of the nodes in the adjacency lists.
BFS and DFS

bfs(G)
    list L = empty
    tree T = empty
    choose a starting vertex x
    visit(x)
    while (L nonempty)
        remove edge (v,w) from the beginning of L
        if w not visited
            add (v,w) to T
            visit(w)

dfs(G)
    list L = empty
    tree T = empty
    choose a starting vertex x
    visit(x)
    while (L nonempty)
        remove edge (v,w) from the end of L
        if w not visited
            add (v,w) to T
            visit(w)

visit(vertex v)
    mark v as visited
    for each edge (v,w)
        add edge (v,w) to the end of L
Chapter 9: Graphs
Applications of
Depth-First Search
Graph Connectivity
Connectivity
Biconnectivity
Articulation Points and Bridges
Connectivity in Directed Graphs
Connectivity

Definition: an undirected graph is said to be connected if for any pair of nodes of the graph, the two nodes are reachable from one another (i.e. there is a path between them).
If, starting from any vertex, we can visit all other vertices, then the graph is connected.
Biconnectivity

A graph is biconnected if there are no vertices whose removal will disconnect the graph.

[Figure: two graphs on vertices A, B, C, D, E - one biconnected, one not biconnected]
Articulation Points

Definition: a vertex whose removal makes the graph disconnected is called an articulation point or cut-vertex.

[Figure: graph on vertices A, B, C, D, E; C is an articulation point]

We can compute articulation points using depth-first search and a special numbering of the vertices in the order of their visiting. A sketch follows.
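A hedged sketch of the DFS-numbering technique the slide alludes to (Tarjan-style low numbers); the names num, low, counter are illustrative. The caller fills num with -1 and then calls findArt(adj, root, -1, ...).

import java.util.*;

static int counter = 0;

// num[v] = DFS visit number; low[v] = smallest num reachable from v's
// subtree using tree edges and at most one back edge. A non-root v is an
// articulation point iff some child w has low[w] >= num[v].
static void findArt(List<List<Integer>> adj, int v, int parent,
                    int[] num, int[] low, boolean[] isArt) {
    num[v] = low[v] = counter++;
    int children = 0;
    for (int w : adj.get(v)) {
        if (num[w] == -1) {                       // tree edge (v,w)
            children++;
            findArt(adj, w, v, num, low, isArt);
            low[v] = Math.min(low[v], low[w]);
            if (parent != -1 && low[w] >= num[v])
                isArt[v] = true;
            // if low[w] > num[v], the edge (v,w) is a bridge
        } else if (w != parent) {                 // back edge (v,w)
            low[v] = Math.min(low[v], num[w]);
        }
    }
    if (parent == -1 && children > 1)             // root: special case
        isArt[v] = true;
}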
Bridges

Definition: an edge in a graph is called a bridge if its removal disconnects the graph.
Any edge in a graph that does not lie on a cycle is a bridge.
Obviously, a bridge has at least one articulation point at its end; however, an articulation point is not necessarily the endpoint of a bridge.

[Figure: graph on vertices A, B, C, D, E; (C,D) and (E,D) are bridges]
Example 1

[Figure: graph on vertices A, B, C, D, E, F]
C is an articulation point; there are no bridges.
Example 2

[Figure: graph on vertices A, B, C, D, E, F]
C is an articulation point; CB is a bridge.
Example 3

[Figure: graph on vertices A, B, C, D, E, F, G]
B and C are articulation points; BC is a bridge.
Example 4

[Figure: tree on vertices A, B, C, D, E]
B and C are articulation points. All edges are bridges.
Example 5

[Figure: graph on vertices A, B, C, D, E, F, G]
Biconnected graph - no articulation points and no bridges.
Connectivity in Directed Graphs (I)

Definition: a directed graph is said to be strongly connected if for any pair of nodes there is a path from each one to the other.

[Figure: strongly connected digraph on vertices A, B, C, D, E]
Connectivity in Directed Graphs (II)

Definition: a directed graph is said to be unilaterally connected if for any pair of nodes at least one of the nodes is reachable from the other.

[Figure: unilaterally connected digraph on vertices A, B, C, D, E]

Each strongly connected graph is also unilaterally connected.
Connectivity in Directed Graphs (III)

Definition: a directed graph is said to be weakly connected if the underlying undirected graph is connected.

[Figure: weakly connected digraph on vertices A, B, C, D, E; there is no path between B and D]

Each unilaterally connected graph is also weakly connected.
Chapter 9: Graphs
Summary
Summary of Graph Algorithms
Basic Concepts
Topological Sort
Shortest Paths
Spanning Trees
Scheduling Networks
Breadth-First and Depth-First Search
Connectivity
Basic Concepts
Basic definitions: vertices and edges
More definitions: paths, simple paths,
cycles, loops
Connected and disconnected graphs
Spanning trees
Complete graphs
Weighted graphs and networks
Graph representation
Adjacency matrix
Adjacency lists
Topological Sort

RULE: If there is a path from u to v, then v appears after u in the ordering.
Graphs: directed, acyclic.
Degree of a vertex U: the number of outgoing edges.
Indegree of a vertex U: the number of incoming edges.
The algorithm for topological sort uses the indegrees of the vertices.
Implementation: with a queue.
Complexity: adjacency lists: O(|E| + |V|); matrix representation: O(|V|²).
Shortest Path in Unweighted Graphs

Data structures needed:
  Distance Table
  Queue
Implementation: breadth-first search using a queue.
Complexity:
  Matrix representation: O(|V|²)
  Adjacency lists: O(|E| + |V|)
Algorithm

1. Store s in a queue, and initialize distance = 0 in the Distance Table.
2. While there are vertices in the queue:
   Read a vertex v from the queue.
   For all adjacent vertices w:
     If distance = -1 (not computed):
       distance = (distance to v) + 1
       parent = v
       Append w to the queue.
Shortest Path in Weighted Graphs - Dijkstra's Algorithm

1. Store s in a priority queue with distance = 0.
2. While there are vertices in the queue:
   DeleteMin a vertex v from the queue.
   For all adjacent vertices w:
     Compute the new distance.
     Store in / update the Distance Table.
     Insert / update in the priority queue.
Complexity

O(E log V + V log V) = O((E + V) log V)
Each vertex is stored only once in the queue: O(V).
The DeleteMin operations cost O(V log V) overall.
Updating the priority queue (search and insert) is O(log V), performed at most once for each edge: O(E log V).
Spanning Trees

Spanning tree: a tree that contains all vertices in the graph.
Number of nodes: |V|
Number of edges: |V| - 1
Spanning Trees of Unweighted Graphs

Breadth-first search using a queue.
Complexity: O(|E| + |V|) - we process all edges and all nodes.
Min Spanning Trees of Weighted Graphs - Prim's Algorithm

Find a spanning tree with the minimal sum of the weights.
Similar to shortest paths in weighted graphs.
Difference: we record the weight of the current edge, not the length of the path.
Implementation: with a priority queue.
Complexity: O(|E| log(|V|))
Min Spanning Trees of Weighted Graphs - Kruskal's Algorithm

The algorithm works with the set of edges, stored in a priority queue.
The tree is built incrementally, starting with a forest of single nodes and adding edges as they come out of the priority queue.
Complexity: O(|E| log(|V|))
Scheduling Networks

Directed simple acyclic graph.
Nodes: events, Source, Sink.
Edges: activities (name of the task, duration).
Find the earliest occurrence time (EOT) of an event.
Find the earliest completion time (ECT) of an activity.
Find the slack of an activity: how much the activity can be delayed without delaying the project.
Find the critical activities: they must be completed on time in order not to delay the project.
EOT and ECT

Earliest occurrence time of an event:
  EOT(source) = 0
  EOT(j) = max(ECTs of all activities preceding the event j)
Earliest completion time of an activity:
  ECT(j,k) = EOT(j) + duration(j,k)

Algorithm:
1. Sort the nodes topologically.
2. For each event j: initialize EOT(j) = 0.
3. For each event i in topological order:
   For each event j adjacent from i:
     ECT(i,j) = EOT(i) + duration(i,j)
     EOT(j) = max(EOT(j), ECT(i,j))

A small Java sketch follows.
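A minimal sketch of step 3 above; the Activity record and the convention that events are indexed 0..n-1 with a precomputed topological order are illustrative assumptions.

import java.util.*;

record Activity(int to, int duration) {}

static int[] computeEOT(List<List<Activity>> adj, int[] topologicalOrder) {
    int[] eot = new int[adj.size()];          // initialized to 0: EOT(source) = 0
    for (int i : topologicalOrder) {
        for (Activity a : adj.get(i)) {
            int ect = eot[i] + a.duration();  // ECT(i,j) = EOT(i) + duration(i,j)
            eot[a.to()] = Math.max(eot[a.to()], ect);
        }
    }
    return eot;
}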
Slack of Activities

Compute the latest occurrence time LOT(j) and slack(i,j):
1. Initialize LOT(sink) = EOT(sink).
2. For each non-sink event i in reverse topological order:
   LOT(i) = min over successors j of (LOT(j) - duration(i,j))
3. For each activity (i,j):
   slack(i,j) = LOT(j) - ECT(i,j)

Critical activities: slack = 0.
Critical path in the project: a path on which all edges are activities with slack = 0. The critical path is found using breadth-first (or depth-first) search on the edges.
Complexity: O(V + E) with adjacency lists, O(V²) with an adjacency matrix.
BFS and DFS

bfs(G)
    list L = empty
    tree T = empty
    choose a starting vertex x
    visit(x)
    while (L nonempty)
        remove edge (v,w) from the beginning of L
        if w not visited
            add (v,w) to T
            visit(w)

dfs(G)
    list L = empty
    tree T = empty
    choose a starting vertex x
    visit(x)
    while (L nonempty)
        remove edge (v,w) from the end of L
        if w not visited
            add (v,w) to T
            visit(w)

visit(vertex v)
    mark v as visited
    for each edge (v,w)
        add edge (v,w) to the end of L

Complexity: O(V + E)
Chapter 10: Algorithm
Design Techniques
Backtracking
Backtracking
How to solve a problem by search
Generate and Test
What is Backtracking
Example
The Idea
Problem solving:
search in a space of possible states.
[Figure: state space with an initial state and several target states]
Generate and Test Method

• Take a state.
• Generate next states by applying relevant rules (not all rules are applicable for a given state!).
• Test to see if the goal state is reached.
  If yes - stop.
  If no - add the generated states into a stack/queue (do not add a state that has already been generated) and repeat the procedure.
To get the solution: record the paths.
Search Methods
Breadth-first vs depth-first search
Exhaustive search (brute force) vs
informative (constraint-based,
heuristic) search
Backtracking
Backtracking

• rejecting the currently reached state as a dead-end state,
• going back to some previous level in the state-space tree/graph,
• proceeding with the exploration of other paths toward the target state.
Backtracking occurs in depth-first search.
Example

Decide on:
  State representation
  Operators
  Constraints
  Initial state
  Target state

    LOAN
  + LOAN
  + LOAN
  + LOAN
  ------
    DEBT

Replace the letters with distinct digits such that the addition is correct. A brute-force sketch follows.
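A minimal brute-force sketch for this puzzle: enumerate digits for L, O, A, N and check whether 4·LOAN yields a DEBT with all eight letters distinct. This is plain exhaustive search; a backtracking solver would prune partial assignments instead.

public class Cryptarithm {
    public static void main(String[] args) {
        for (int l = 1; l <= 9; l++)                   // leading digit: nonzero
        for (int o = 0; o <= 9; o++)
        for (int a = 0; a <= 9; a++)
        for (int n = 0; n <= 9; n++) {
            if (!distinct(l, o, a, n)) continue;
            int loan = 1000 * l + 100 * o + 10 * a + n;
            int debt = 4 * loan;                       // LOAN added 4 times
            if (debt > 9999) continue;                 // DEBT must have 4 digits
            int d = debt / 1000, e = (debt / 100) % 10,
                b = (debt / 10) % 10, t = debt % 10;
            if (d == 0) continue;                      // leading digit: nonzero
            if (distinct(l, o, a, n, d, e, b, t))
                System.out.printf("LOAN=%d DEBT=%d%n", loan, debt);
        }
    }

    static boolean distinct(int... digits) {
        boolean[] seen = new boolean[10];
        for (int d : digits) {
            if (seen[d]) return false;
            seen[d] = true;
        }
        return true;
    }
}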
Chapter 10: Algorithm
Design Techniques
Algorithm Design Techniques
Brute Force
Greedy Algorithms
Divide and Conquer
Dynamic Programming
Transform and Conquer
Backtracking
Genetic Algorithms
Brute Force

Based on the problem's statement and the definitions of the concepts involved.
Examples:
  Sequential search
  Exhaustive search: TSP, knapsack
  Simple sorts: selection sort, bubble sort
  Computing n!
Greedy Algorithms

A "take what you can get now" strategy.
Work in phases: in each phase the currently best decision is made.
Greedy Algorithms - Examples

• Dijkstra's algorithm (shortest paths in weighted graphs)
• Prim's algorithm, Kruskal's algorithm (minimal spanning tree in weighted graphs)
• Coin exchange problem
• Huffman trees
Divide and Conquer

• Reduce the problem to smaller problems (by a factor of at least 2), solve them recursively, and then combine the solutions.
Examples: binary search, mergesort, quick sort, tree traversal - a binary search sketch follows.
In general: problems that can be defined recursively.
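A minimal sketch of binary search as a divide-and-conquer example: each step discards half of the remaining range.

static int binarySearch(int[] a, int key) {
    int low = 0, high = a.length - 1;
    while (low <= high) {
        int mid = (low + high) >>> 1;           // avoids overflow for large indices
        if (a[mid] < key)      low = mid + 1;   // discard the left half
        else if (a[mid] > key) high = mid - 1;  // discard the right half
        else return mid;                        // found
    }
    return -1;                                  // not present
}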
Decrease and Conquer

• Reduce the problem to a smaller problem, solve it recursively, and then extend the solution.

Examples of decrease-and-conquer algorithms:
  Insertion sort
  Topological sorting
  Binary tree traversals: inorder, preorder and postorder (recursion)
  Computing the length of the longest path in a binary tree (recursion)
  Computing Fibonacci numbers (recursion)
  Reversing a queue (recursion)
Dynamic Programming

A bottom-up technique in which the smallest sub-instances are explicitly solved first, and their results are used to construct solutions to progressively larger sub-instances.
Example: Fibonacci numbers computed by iteration (a sketch follows).
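A minimal sketch of the iterative Fibonacci example: each value is built from the two smaller sub-instances already solved, bottom-up.

static long fibonacci(int n) {
    long prev = 0, curr = 1;              // f(0), f(1)
    for (int i = 2; i <= n; i++) {
        long next = prev + curr;          // combine the two smaller results
        prev = curr;
        curr = next;
    }
    return n == 0 ? 0 : curr;
}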
Transform and Conquer

In the first stage the problem is modified to be more amenable to solution; in the second stage it is solved.

• Problem simplification, e.g. presorting.
  Example: finding the two closest numbers in an array of numbers.
  Brute force solution: O(n²)
  Transform and conquer with presorting: O(n log n)
• Change of representation.
  Example: AVL trees guarantee O(log n) search time.
• Problem reduction.
  Example: least common multiple: lcm(m,n) = (m·n) / gcd(m,n)
Backtracking

Generate-and-test methods, based on exhaustive search in multiple-choice problems.
Typically used with depth-first state space search problems.
Example: puzzles.
Backtracking - State Space Search

• initial state
• goal state(s)
• a set of intermediate states
• a set of operators that transform one state into another; each operator has preconditions and postconditions
• a cost function - evaluates the cost of the operations (optional)
• a utility function - evaluates how close a given state is to the goal state (optional)
Genetic Algorithms

Search for good solutions among possible solutions; the best possible solution may be missed.
A solution is coded by a string, also called a chromosome; the words string and chromosome are used interchangeably.
A string's fitness is a measure of how good a solution it codes; fitness is calculated by a fitness function.
Selection: the procedure used to choose parents.
Crossover: the procedure by which two chromosomes mate to create a new offspring chromosome.
Mutation: with a certain probability, flip a bit in the offspring.
Basic Genetic Algorithm

1. Start: generate a random population of n chromosomes (suitable solutions for the problem).
2. Fitness: evaluate the fitness f(x) of each chromosome x in the population.
3. New population: create a new population by repeating the steps below until the new population is complete.
4. Test: if the end condition is satisfied, stop and return the best solution in the current population.
New Population

Selection: select two parent chromosomes from the population according to their fitness.
Crossover: with a crossover probability, cross over the parents to form new offspring (children); if no crossover is performed, the offspring is an exact copy of the parents.
Mutation: with a mutation probability, mutate the new offspring at each locus (position in the chromosome). A compact sketch follows.
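A deliberately tiny sketch of the basic loop above, applied to the OneMax toy problem (fitness = number of 1-bits). All parameters, the tournament selection, and the one-point crossover are illustrative choices, not prescribed by the slides.

import java.util.*;

public class TinyGA {
    static final int POP = 30, LEN = 32, GENERATIONS = 200;
    static final double CROSSOVER_P = 0.9, MUTATION_P = 0.01;
    static final Random rnd = new Random();

    static int fitness(boolean[] c) {                 // count 1-bits
        int f = 0;
        for (boolean b : c) if (b) f++;
        return f;
    }

    static boolean[] select(boolean[][] pop) {        // 2-way tournament
        boolean[] a = pop[rnd.nextInt(POP)], b = pop[rnd.nextInt(POP)];
        return fitness(a) >= fitness(b) ? a : b;
    }

    public static void main(String[] args) {
        boolean[][] pop = new boolean[POP][LEN];      // 1. random population
        for (boolean[] c : pop)
            for (int i = 0; i < LEN; i++) c[i] = rnd.nextBoolean();

        for (int g = 0; g < GENERATIONS; g++) {
            boolean[][] next = new boolean[POP][];
            for (int k = 0; k < POP; k++) {
                boolean[] p1 = select(pop), p2 = select(pop);
                boolean[] child = p1.clone();
                if (rnd.nextDouble() < CROSSOVER_P) { // one-point crossover
                    int cut = rnd.nextInt(LEN);
                    for (int i = cut; i < LEN; i++) child[i] = p2[i];
                }
                for (int i = 0; i < LEN; i++)         // mutation at each locus
                    if (rnd.nextDouble() < MUTATION_P) child[i] = !child[i];
                next[k] = child;
            }
            pop = next;
        }
        int best = 0;
        for (boolean[] c : pop) best = Math.max(best, fitness(c));
        System.out.println("best fitness: " + best + " / " + LEN);
    }
}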
Conclusion

How do we choose the approach? First, by understanding the problem, and second, by knowing various problems and how they are solved using different approaches.

Strongly recommended reading to learn more about how to design algorithms:
The Algorithm Design Manual, Steven S. Skiena, Department of Computer Science, State University of New York,
http://www2.toki.or.id/book/AlgDesignManual/BOOK/BOOK/BOOK.HTM
Algorithm Design Paradigms, Paul E. Dunne, University of Liverpool, Department of Computer Science,
http://www.csc.liv.ac.uk/~ped/teachadmin/algor/algor.html
Complexity Issues
Class P Problems

Polynomial run time in the worst case: "tractable" problems.
Examples:
  Simple sorting algorithms
  Advanced sorting algorithms
  Topological sort
  Shortest paths
  Min spanning trees
Class NP Problems

NP problems: problems for which no polynomial-time algorithm is known - "intractable" problems.
NP means non-deterministically polynomial: if we have a candidate for a solution, we can verify it in polynomial time. The difficult part is to generate all possible candidates.
Example (the satisfiability problem): given a boolean expression of n variables, find a set of variable assignments that satisfies the expression. (A verifier sketch follows.)
Obviously, P ⊆ NP, but whether P ≠ NP (or P = NP) is an open question.
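A minimal sketch of the polynomial-time verification step for satisfiability. The CNF encoding (each clause as an array of literals, +v / -v for variable v, with a 1-based assignment array) is an illustrative assumption.

static boolean verify(int[][] clauses, boolean[] assignment) {
    for (int[] clause : clauses) {
        boolean satisfied = false;
        for (int literal : clause) {
            int v = Math.abs(literal);            // variable index, 1-based
            boolean value = assignment[v];
            if (literal < 0) value = !value;      // negated literal
            if (value) { satisfied = true; break; }
        }
        if (!satisfied) return false;             // one false clause fails it
    }
    return true;                                  // every clause satisfied
}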
NP-Complete Problems

An NP-complete problem is an NP problem with the property that any problem in NP can be reduced to it in polynomial time.
In 1971 Stephen Cook (currently working at the University of Toronto, Ontario) showed that the satisfiability problem is NP-complete by proving that every other NP problem can be transformed to it.
http://www.cs.toronto.edu/DCS/People/Faculty/sacook.html
NP-Complete Problems

Bin packing problem: how to pack or store objects of various sizes and shapes with a minimum of wasted space?
Clique problem: a clique is a subset K of vertices in an undirected graph G such that every pair is joined by an edge in G. The problems are:
  Given a graph G and an integer k, does G have a clique consisting of k vertices?
  Given a graph G, find a clique with as many vertices as possible.
Knapsack problem: you have a knapsack of a given size and several objects of various sizes and values. How do you fill the knapsack with these objects so as to maximize the total value?
Hamiltonian cycle: a cycle that contains all nodes of the graph. Deciding whether a graph has a Hamiltonian cycle is an NP-complete problem.
NP-Complete Problems

NP problems and non-NP problems:
For NP problems, given a candidate for a solution, we can verify it in polynomial time.
For non-NP problems, we cannot verify in polynomial time whether a candidate for a solution is a solution or not.
Example: the problem of proving that a graph does not have a Hamiltonian cycle is a non-NP problem.
NP-hard problems: at least as hard as the NP-complete problems, but not necessarily in the class NP.
Decidability

Yes/no problems (decision problems):
A decision problem is a question about some input that has two possible answers: yes or no.
Examples of Yes/No Problems

Given a graph G and a set of vertices K, is K a clique?
Given a graph G and a set of edges M, is M a spanning tree?
Given a set of axioms (boolean expressions) and an expression, is the expression provable from the axioms?
Decidable, semi-decidable, and undecidable problems

A problem is decidable if there is an algorithm (P or NP) that says yes if the answer is yes, and no otherwise.
A problem is semi-decidable if there is an algorithm that says yes if the answer is yes, but may loop infinitely if the answer is no.
A problem is undecidable if we can prove that there is no algorithm that will deliver an answer.
Example of a semi-decidable problem

Given a set of axioms, prove that an expression is true.

Problem 1: let the axioms be:
  A v B
  A v C
  ~B
Prove A.
To prove A we add ~A to the axioms. If A is true, then ~A is false and this causes a contradiction: the conjunction of all axioms plus ~A results in False.
  (A v B) ∧ ~A = B
  B ∧ (A v C) = (B ∧ A) v (B ∧ C)
  B ∧ ~B = False
Example of a semi-decidable problem

Problem 2: let the axioms be:
  A v B
  A v C
  ~B
Prove ~A.
We add A and obtain:
  (A v C) ∧ A = A
  (A v B) ∧ A = A
  A ∧ ~B = A ∧ ~B
  (A ∧ ~B) ∧ (A v B) = A ∧ ~B
  ...
This process never stops, because the expressions we obtain are never False.
Example of an undecidable problem

The halting problem:
Let LOOP be a program that checks other programs for infinite loops:
  LOOP(P) stops and prints "yes" if P loops infinitely;
  LOOP(P) enters an infinite loop if P stops.
What happens with LOOP(LOOP)?
http://www.cgl.uwaterloo.ca/~csk/washington/halt.html