Introduction to Algorithms, 3rd ed., Ch. 1-2 (2024)
Algorithms
An algorithm is any well-defined computational procedure that takes some value, or set of values, as input and produces some value, or set of values, as output. An algorithm is thus a sequence of computational steps that transform the input into the output.
Data structures
A data structure is a way to store and organize data in order to facilitate access and modifications.
Getting Started
Design and Analysis of Algorithms
•Analysis: predict the cost of an algorithm in terms of resources and performance.
•Design: design algorithms that minimize the cost.
The Role of Algorithms in Computing
Algorithms
Example: the sorting problem
•What kinds of problems are solved by algorithms?
•What are suitable data structures?
•Is it a hard problem?
•So, what about the efficiency of the solution?
•Computer A (fast machine, running insertion sort) takes about 20,000 seconds.
•Computer B (slow machine, running merge sort) takes about 1,163 seconds.
By using an algorithm whose running time grows more slowly, even with a poor compiler, computer B runs more than 17 times faster than computer A.
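A quick back-of-envelope check of that claim. The specific figures below (2n² instructions for insertion sort, 50 n lg n for merge sort, machine speeds of 10^10 and 10^7 instructions per second, n = 10 million numbers) are the textbook's example numbers, not stated on this slide:

```python
import math

# Assumed CLRS example figures: computer A runs 10^10 instructions/s with
# insertion sort (2n^2 instructions); computer B runs 10^7 instructions/s
# with merge sort (50 n lg n instructions); both sort n = 10^7 numbers.
n = 10**7
time_A = 2 * n**2 / 10**10               # seconds on computer A
time_B = 50 * n * math.log2(n) / 10**7   # seconds on computer B
print(time_A, time_B, time_A / time_B)   # B is roughly 17x faster
```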
The problem of sorting
Input: a sequence ⟨a1, a2, …, an⟩ of numbers.
Output: a permutation ⟨a'1, a'2, …, a'n⟩ such that a'1 ≤ a'2 ≤ … ≤ a'n.
Example:
Input: 8 2 4 9 3 6
Output: 2 3 4 6 8 9
Complexities
•Complexity: the amount of resources (such as time or memory) required to solve a problem or perform a task.
•Complexities range from constant, O(1), up to O(n^n).
Analysis: worst case
Average case: random data, t_j = j/2, giving T(n) = (an² + bn + c)/2.
•Best case
•Average case
•Worst case
Some math
Analysis
•Best case
•Average case
•Worst case
•Order of growth: we can express this running time as an + b for constants a and b that depend on the statement costs c_i; it is thus a linear function of n.
Designing algorithms
•The divide-and-conquer approach:
Divide the problem into a number of subproblems that are smaller instances of the same problem.
Conquer the subproblems by solving them recursively. If the subproblem sizes are small enough, however, just solve the subproblems in a straightforward manner.
Combine the solutions to the subproblems into the solution for the original problem.
•The merge sort algorithm closely follows the divide-and-conquer paradigm. Intuitively, it operates as follows.
Divide: divide the n-element sequence to be sorted into two subsequences of n/2 elements each.
Conquer: sort the two subsequences recursively using merge sort.
Combine: merge the two sorted subsequences to produce the sorted answer.
MergeSort
•MergeSort: a divide-and-conquer approach.
•Recall from last time: a big problem splits into two smaller problems, each of which splits into yet smaller problems. Divide and conquer: recurse!
It works (assume n = 2^t)
•Invariant: “In every recursive call, MERGESORT returns a sorted array.”

MERGESORT(A):
  n = length(A)
  if n ≤ 1: return A
  L = MERGESORT(A[1 : n/2])
  R = MERGESORT(A[n/2 + 1 : n])
  return MERGE(L, R)

•Base case (n = 1): a 1-element array is always sorted.
•Maintenance: suppose that L and R are sorted. Then MERGE(L, R) is sorted.
•Termination: “In the top recursive call, MERGESORT returns a sorted array.”
The maintenance step needs more detail: why is that statement true?
This is not technically a “loop invariant” but a “recursion invariant” that should hold at the beginning of every recursive call.
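The pseudocode above can be sketched as runnable Python. This is a minimal rendering of our own: 0-based slicing replaces the slide's 1-based A[1 : n/2], and MERGE is written out as the usual two-pointer merge:

```python
def merge(L, R):
    """Merge two sorted lists into one sorted list (the MERGE step)."""
    out, i, j = [], 0, 0
    while i < len(L) and j < len(R):
        if L[i] <= R[j]:
            out.append(L[i]); i += 1
        else:
            out.append(R[j]); j += 1
    return out + L[i:] + R[j:]   # append whichever half has leftovers

def mergesort(A):
    n = len(A)
    if n <= 1:                   # base case: a 1-element array is sorted
        return A
    mid = n // 2
    return merge(mergesort(A[:mid]), mergesort(A[mid:]))
```

For the slide's running example, `mergesort([8, 2, 4, 9, 3, 6])` returns `[2, 3, 4, 6, 8, 9]`.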
Analysis
Why cn? Because the merge work of the two subproblems at each level sums to c(n/2) + c(n/2) = cn.
Simple extra example
•Holds for the first element.
•Holds at i = k and k + 1.
•Holds at the end.
Other problems
•Bubble sort, insertion sort, selection sort.
•Although merge sort runs in Θ(n lg n) worst-case time and insertion sort runs in Θ(n²) worst-case time, the constant factors in insertion sort can make it faster in practice for small problem sizes on many machines.
n log(n) vs n² (continued)

n      n log(n)   n²
8      24         64
16     64         256
32     160        1024
64     384        4096
128    896        16384
256    2048       65536
512    4608       262144
1024   10240      1048576

[Chart: n log(n) vs n² plotted for n up to 1024.]
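The table above can be regenerated directly (a small script of our own; `growth_rows` is an illustrative name, not from the slide):

```python
import math

def growth_rows():
    """Return (n, n*lg(n), n^2) for n = 8, 16, ..., 1024."""
    return [(n, int(n * math.log2(n)), n**2) for n in (2**k for k in range(3, 11))]

for n, nlogn, nsq in growth_rows():
    print(n, nlogn, nsq)
```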
Back to Insertion sort pseudocode
Go one at a time until things are in the right place.

Back to Insertion sort: running time
•n − 1 iterations of the outer loop.
•In the worst case, about n iterations of the inner loop.
•Total work is about 1 + 2 + 3 + … + n, so the running time is about n², i.e., O(n²).
In general
Maintain a loop invariant: something that should be true at every iteration.
•Initialization: the loop invariant holds before the first iteration.
•Maintenance: if it is true before the t'th iteration, it will be true before the (t+1)'st iteration.
•Termination: it is useful to know that the loop invariant is true at the end of the last iteration.
(This is proof of correctness by induction.)
Insertion sort: correctness
•Loop invariant: at the start of the t'th iteration (of the outer loop), the first t elements of the array are sorted.
•Initialization: at the start of the first iteration, the first element of the array is sorted. ✓
•Maintenance: by construction, the point of the t'th iteration is to put the (t+1)'st element in the right place.
•Termination: at the end of the algorithm, the first len(A) items are sorted. ✓
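The three invariant steps above can be checked mechanically by asserting the invariant before each outer iteration (a sketch of our own, reusing the standard insertion-sort body):

```python
def insertion_sort_checked(A):
    """Insertion sort that asserts the loop invariant at every iteration."""
    for t in range(1, len(A)):
        assert A[:t] == sorted(A[:t])   # invariant: first t elements sorted
        key, j = A[t], t - 1
        while j >= 0 and A[j] > key:
            A[j + 1] = A[j]
            j -= 1
        A[j + 1] = key
    assert A == sorted(A)               # termination: whole array sorted
    return A
```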
To summarize
Insertion sort is an algorithm that correctly sorts an arbitrary n-element array in time about n².
Can we do better?
Example of insertion sort (Back)
8 2 4 9 3 6
2 8 4 9 3 6
Running time
•The running time depends on the input: an already-sorted sequence is easier to sort.
•Major simplifying convention: parameterize the running time by the size of the input, since short sequences are easier to sort than long ones.
  T_A(n) = time of A on length-n inputs.
•Generally, we seek upper bounds on the running time, to have a guarantee of performance.
Kinds of analyses
Worst-case: (usually)
•T(n) = maximum time of algorithm on any input of size n.
Average-case: (sometimes)
•T(n) = expected time of algorithm over all inputs of size n.
•Needs an assumption about the statistical distribution of inputs.
Best-case: (never)
•Cheat with a slow algorithm that works fast on some input.
Machine-independent time
What is insertion sort's worst-case time?
BIG IDEAS:
•Ignore machine-dependent constants, which are otherwise impossible to verify and make algorithms hard to compare.
•Look at the growth of T(n) as n → ∞.
“Asymptotic Analysis”
Asymptotic notations: introduction
Which is larger: 10n² or 2n³? What is n₀?
2n³ ≤ 3n³ + 5 ≤ 5n³ for all n ≥ n₀ = 2
Θ-notation
•Drop low-order terms; ignore leading constants.
•Example: 3n³ + 90n² − 5n + 6046 = Θ(n³)
DEF:
Θ(g(n)) = { f(n) : there exist positive constants c₁, c₂, and n₀ such that 0 ≤ c₁·g(n) ≤ f(n) ≤ c₂·g(n) for all n ≥ n₀ }
Basic manipulations: 2n³ ≤ 3n³ + 5 ≤ 5n³ for n ≥ n₀ = 2
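A quick numeric check of the Θ(n³) example above. The witnesses c₁ = 3, c₂ = 100, n₀ = 5 are choices of our own; the definition only requires that some such constants exist:

```python
# Verify the sandwich c1*n^3 <= f(n) <= c2*n^3 for n >= n0,
# with f(n) = 3n^3 + 90n^2 - 5n + 6046 (the slide's example polynomial).
def f(n):
    return 3*n**3 + 90*n**2 - 5*n + 6046

c1, c2, n0 = 3, 100, 5   # our chosen witnesses, not from the slide
assert all(c1 * n**3 <= f(n) <= c2 * n**3 for n in range(n0, 10000))
```

Note that the sandwich fails for small n (e.g., f(1) = 6134 far exceeds 100·1³), which is exactly why the definition only demands the bounds for n ≥ n₀.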
Growth
Asymptotic performance
[Figure: T(n) versus n, with crossover point n₀.]
•Asymptotic analysis is a useful tool to help structure our thinking toward better algorithms.
•We shouldn't ignore asymptotically slower algorithms, however.
•Real-world design situations often call for a careful balancing.
When n gets large enough, a Θ(n²) algorithm always beats a Θ(n³) algorithm.
Insertion sort analysis (back)
Worst case: input reverse sorted.
  T(n) = Σ_{j=2..n} Θ(j) = Θ(n²)   [arithmetic series]
Average case: all permutations equally likely.
  T(n) = Σ_{j=2..n} Θ(j/2) = Θ(n²)
Is insertion sort a fast sorting algorithm?
•Moderately so, for small n.
•Not at all, for large n.
Example 2: Integer Multiplication
•Let X = AB and Y = CD, where A, B, C, and D are the high and low halves of the digits.
•Simple method: XY requires 2 × 2 = 4 multiplications.
•Running-time recurrence: T(n) is the maximum number of multiplications.
•Solution: T(n) = Θ(n²), where n is the number of digits in the number.
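A sketch of the simple method (our own rendering, assuming X and Y split into high/low halves A, B and C, D, so that XY = AC·10^n + (AD + BC)·10^(n/2) + BD, i.e., 4 recursive multiplications):

```python
def multiply(x, y, n):
    """Multiply two n-digit numbers (n a power of 2) via 4 subproblems."""
    if n == 1:
        return x * y                  # single-digit base case
    m = n // 2
    a, b = divmod(x, 10**m)           # x = a*10^m + b  (high/low halves)
    c, d = divmod(y, 10**m)           # y = c*10^m + d
    ac = multiply(a, c, m)
    ad = multiply(a, d, m)
    bc = multiply(b, c, m)
    bd = multiply(b, d, m)            # 4 multiplications => T(n) = Θ(n²)
    return ac * 10**n + (ad + bc) * 10**m + bd
```

For example, `multiply(12, 34, 2)` returns 408. (Karatsuba's trick, covered later in most courses, reduces the 4 multiplications to 3.)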
Back to Recursion tree
Solve T(n) = 2T(n/2) + cn, where c > 0 is constant.
The root costs cn; its two children cost cn/2 each; the four grandchildren cost cn/4 each; and so on down to Θ(1) leaves. The height is h = lg n, each level sums to cn, and the #leaves = n contribute Θ(n).
Total = Θ(n lg n)
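The recursion-tree total can be confirmed numerically: for n = 2^k and c = 1, unrolling the recurrence with base case T(1) = 1 gives exactly n·lg(n) + n, matching Θ(n lg n). (The base-case value T(1) = 1 is an assumption for this check.)

```python
def T(n, c=1):
    """Evaluate T(n) = 2T(n/2) + c*n with T(1) = c, for n a power of 2."""
    return c if n == 1 else 2 * T(n // 2, c) + c * n

for k in range(12):
    n = 2**k
    assert T(n) == n * k + n   # with c = 1: exactly n*lg(n) + n
```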
Analyzing merge sort
MERGE-SORT A[1 . . n]               T(n)
1. If n = 1, done.                  Θ(1)
2. Recursively sort A[1 . . n/2]
   and A[n/2+1 . . n].              2T(n/2)
3. “Merge” the 2 sorted lists.      Θ(n)
Sloppiness: should be T(⌈n/2⌉) + T(⌊n/2⌋), but it turns out not to matter asymptotically.
Recurrence for merge sort
T(n) = Θ(1) if n = 1;
T(n) = 2T(n/2) + Θ(n) if n > 1.
•We shall usually omit stating the base case when T(n) = Θ(1) for sufficiently small n, but only when it has no effect on the asymptotic solution to the recurrence.
•Lecture 2 provides several ways to find a good upper bound on T(n).
Finally: Recursion tree
Solve T(n) = 2T(n/2) + cn, where c > 0 is constant.
Expand T(n) level by level: the root costs cn and splits into two T(n/2) subproblems; these cost cn/2 each and split into four T(n/4) subproblems costing cn/4 each; and so on down to the Θ(1) leaves. The height is h = lg n, each level sums to cn, and the #leaves = n contribute Θ(n).
Conclusions
•Θ(n lg n) grows more slowly than Θ(n²).
•Therefore, merge sort asymptotically beats insertion sort in the worst case.
•In practice, merge sort beats insertion sort for n > 30 or so.
•O(1) < O(lg n) < O(n) < …, as discussed.