Lecture 5: Sorting and Order Statistics

JosephKariuki46 · 69 views · 43 slides · May 17, 2024

About This Presentation

statistics in design and analysis of algorithms


Slide Content

Design and Analysis of Algorithms: Master Theorem. Department of Computer Science

Asymptotic Behavior of Recursive Algorithms. When analyzing algorithms, recall that we only care about asymptotic behavior. Recursive algorithms are no different: rather than solving exactly the recurrence relation associated with the cost of an algorithm, it is sufficient to give an asymptotic characterization. The main tool for doing this is the master theorem.

Master Theorem. Let T(n) be a monotonically increasing function that satisfies T(n) = a T(n/b) + f(n), T(1) = c, where a ≥ 1, b ≥ 2, c > 0. If f(n) ∈ Θ(n^d) where d ≥ 0, then:
If a < b^d, then T(n) ∈ Θ(n^d)
If a = b^d, then T(n) ∈ Θ(n^d log n)
If a > b^d, then T(n) ∈ Θ(n^(log_b a))

Master Theorem: Pitfalls. You cannot use the Master Theorem if: T(n) is not monotone, e.g. T(n) = sin(n); f(n) is not a polynomial, e.g. T(n) = 2T(n/2) + 2^n; b cannot be expressed as a constant, e.g. when the subproblem size is not a fixed fraction of n. Note that the Master Theorem does not solve the recurrence equation.

Master Theorem: Example 1. Let T(n) = T(n/2) + (1/2)n² + n. What are the parameters? a = 1, b = 2, d = 2. Which condition applies? Since 1 < 2², case 1 applies. We conclude that T(n) ∈ Θ(n^d) = Θ(n²).

Master Theorem: Example 2. Let T(n) = 2T(n/4) + √n + 42. What are the parameters? a = 2, b = 4, d = 1/2. Which condition applies? Since 2 = 4^(1/2), case 2 applies. We conclude that T(n) ∈ Θ(n^(1/2) log n) = Θ(√n log n).

Master Theorem: Example 3. Let T(n) = 3T(n/2) + (3/4)n + 1. What are the parameters? a = 3, b = 2, d = 1. Which condition applies? Since 3 > 2¹, case 3 applies. We conclude that T(n) ∈ Θ(n^(log₂ 3)). Note that log₂ 3 ≈ 1.5849…; can we say that T(n) ∈ Θ(n^1.584)? No, because log₂ 3 ≈ 1.5849… and n^1.584 ∉ Θ(n^1.5849…).

Exercises. For each recurrence, either give the asymptotic solution using the Master Theorem (state which case), or else state that the Master Theorem doesn't apply.
T(n) = 3T(n/2) + n²
T(n) = 7T(n/2) + n²
T(n) = 4T(n/2) + n²
T(n) = 3T(n/4) + n lg n
T(n) = T(n − 1) + n
T(n) = 2T(n/4) + n^0.51
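As a sketch, the three cases above can be encoded in a small helper (the name `master_case` is ours, not from the slides; it assumes f(n) ∈ Θ(n^d) with the slide's conditions a ≥ 1, b ≥ 2):

```python
import math

def master_case(a: float, b: float, d: float) -> str:
    """Classify T(n) = a*T(n/b) + Theta(n^d) by the Master Theorem."""
    if a < b ** d:
        return f"case 1: Theta(n^{d})"
    elif a == b ** d:
        return f"case 2: Theta(n^{d} log n)"
    else:
        return f"case 3: Theta(n^(log_{b} {a})) = Theta(n^{math.log(a, b):.4f})"

# The worked examples from the slides:
print(master_case(1, 2, 2))    # Example 1 -> case 1
print(master_case(2, 4, 0.5))  # Example 2 -> case 2
print(master_case(3, 2, 1))    # Example 3 -> case 3
```

Note that this helper only classifies; like the theorem itself, it does not solve the recurrence, and it does not detect the pitfall cases (non-monotone T, non-polynomial f, non-constant b).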

Design and Analysis of Algorithms: Sorting and Order Statistics. Department of Computer Science

Heaps. A heap is a complete binary tree where the entry at each node is greater than or equal to the entries in its children. When a complete binary tree is built, its first node must be the root. The second node is always the left child of the root, and the third node is always the right child of the root. The nodes always fill each level from left to right.

Definition. Max heap: has the property A[Parent(i)] ≥ A[i]; used to sort data into ascending order. Min heap: has the property A[Parent(i)] ≤ A[i]; used to sort data into descending order.

Heaps. The heap property requires that each node's key is greater than or equal to the keys of its children. The biggest node is always at the top in a max-heap. Therefore, a heap can implement a priority queue, where we need quick access to the highest-priority item.

Max Heap Example. Array A = [19, 12, 16, 4, 1, 7], pictured as a complete binary tree: 19 at the root, children 12 and 16, and leaves 4, 1, and 7.

Min Heap Example. Array A = [1, 4, 16, 12, 7, 19], pictured as a complete binary tree: 1 at the root, children 4 and 16, and leaves 12, 7, and 19.

Heapify. Heapify picks the largest child key and compares it to the parent key. If the parent key is larger, heapify quits; otherwise it swaps the parent key with the largest child key, so that the parent is larger than its children.

HEAPIFY(A, i)
  p ← LEFT(i)                              -- index of the left child
  q ← RIGHT(i)                             -- index of the right child
  if p ≤ heap-size[A] and A[p] > A[i]      -- is the left child larger than the parent?
    then largest ← p                       -- largest = index of left child
    else largest ← i                       -- largest = index of parent
  if q ≤ heap-size[A] and A[q] > A[largest]  -- is the right child larger than both?
    then largest ← q                       -- largest = index of right child
  if largest ≠ i                           -- swap only when the parent is smaller
    then exchange A[i] with A[largest]
         HEAPIFY(A, largest)
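A direct Python translation of the pseudocode can be sketched as follows (0-based indices here, whereas the slides use 1-based; `heap_size` is passed explicitly):

```python
def heapify(A, i, heap_size):
    """Sift A[i] down so the subtree rooted at i satisfies the max-heap
    property, assuming both child subtrees are already max-heaps."""
    left = 2 * i + 1
    right = 2 * i + 2
    largest = i
    if left < heap_size and A[left] > A[largest]:
        largest = left                 # left child is larger than the parent
    if right < heap_size and A[right] > A[largest]:
        largest = right                # right child is larger than both
    if largest != i:                   # swap only when the parent is smaller
        A[i], A[largest] = A[largest], A[i]
        heapify(A, largest, heap_size)

A = [4, 19, 16, 12, 1, 7]   # root out of place; both subtrees are heaps
heapify(A, 0, len(A))
print(A)  # -> [19, 12, 16, 4, 1, 7], the max-heap from the earlier example
```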

Analyzing Heapify(): Formal. Fixing up the relationships between i, l, and r takes Θ(1) time. If the heap at i has n elements, how many elements can the subtrees at l or r have? Answer: 2n/3 (worst case: bottom row half full). So the time taken by Heapify() is given by T(n) ≤ T(2n/3) + Θ(1). By case 2 of the Master Theorem, T(n) = O(lg n). Thus, Heapify() takes logarithmic time.

BUILD HEAP. We can use the procedure Heapify in a bottom-up fashion to convert an array A[1 .. n] into a heap. Since the elements in the subarray A[⌊n/2⌋+1 .. n] are all leaves, the procedure BUILD-HEAP goes through the remaining nodes of the tree and runs Heapify on each one. The bottom-up order of processing guarantees that the subtrees rooted at a node's children are heaps before Heapify is run at that node.

BUILD-HEAP(A)
  heap-size[A] ← length[A]
  for i ← ⌊length[A]/2⌋ downto 1
    do HEAPIFY(A, i)

Running time: each call to HEAPIFY costs O(lg n) and there are O(n) calls, giving O(n lg n); a tighter analysis shows BUILD-HEAP actually runs in O(n).

Adding a Node to a Heap. Put the new node in the next available spot. Push the new node upward, swapping with its parent, until the new node reaches an acceptable location.

Adding a Node to a Heap. (Figure: the new node 42 is swapped upward past the smaller keys 27 and 35.)

Adding a Node to a Heap. In general, two conditions can stop the pushing upward: the parent has a key that is greater than or equal to the new node's key, or the node reaches the root. The process of pushing the new node upward is called reheapification upward.
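Reheapification upward can be sketched in Python (0-based indices; the function name `sift_up` and the example values are ours):

```python
def sift_up(A, i):
    """Push the node at index i upward, swapping with its parent, until the
    parent's key is >= the node's key or the node reaches the root."""
    while i > 0:
        parent = (i - 1) // 2
        if A[parent] >= A[i]:      # stop condition 1: parent is large enough
            break
        A[parent], A[i] = A[i], A[parent]
        i = parent                 # stop condition 2: loop ends at the root

heap = [45, 35, 23, 27, 21, 22, 4, 19]   # a valid max-heap
heap.append(42)                          # new node in the next available spot
sift_up(heap, len(heap) - 1)
print(heap)  # -> [45, 42, 23, 35, 21, 22, 4, 19, 27]
```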

Removing the Top of a Heap. Move the last node onto the root.

Removing the Top of a Heap. The process of pushing the new node downward is called reheapification downward. Reheapification downward can stop under two circumstances: the children all have keys less than or equal to the out-of-place node, or the node reaches a leaf.

Implementing a Heap. We will store the data from the nodes in a partially-filled array. Data from the root goes in the first location of the array; data from the next row goes in the next two array locations, and so on, filling the array level by level from left to right. For the example heap with root 42, children 35 and 23, and leaves 27 and 21, the array contents are [42, 35, 23, 27, 21].
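With this level-by-level layout and 0-based indices, the tree links need no pointers; they are computed from positions alone (a sketch; with the slides' 1-based indexing the formulas would be 2i, 2i+1, and ⌊i/2⌋):

```python
A = [42, 35, 23, 27, 21]   # the example heap, stored root-first

def parent(i): return (i - 1) // 2
def left(i):   return 2 * i + 1
def right(i):  return 2 * i + 2

# Verify the heap property A[parent(i)] >= A[i] for every non-root node:
print(all(A[parent(i)] >= A[i] for i in range(1, len(A))))  # -> True
```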

Heap Sort. The heapsort algorithm consists of two phases: build a heap from an arbitrary array, then use the heap to sort the data. To sort the elements in decreasing order, use a min heap; to sort the elements in increasing order, use a max heap.

HEAPSORT(A)
  BUILD-HEAP(A)
  for i ← length[A] downto 2
    do exchange A[1] with A[i]
       heap-size[A] ← heap-size[A] − 1   -- discard node i from the heap by decrementing
       HEAPIFY(A, 1)

Heapsort. Given BuildHeap(), an in-place sorting algorithm is easily constructed: the maximum element is at A[1]; discard it by swapping with the element at A[n]; decrement heap_size[A], so A[n] now contains the correct value; restore the heap property at A[1] by calling Heapify(); repeat, always swapping A[1] with A[heap_size(A)].
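The two phases can be sketched together in Python (0-based indices; `heapify` is written iteratively here rather than recursively as in the pseudocode):

```python
def heapify(A, i, heap_size):
    """Sift A[i] down until the max-heap property holds at i."""
    while True:
        l, r = 2 * i + 1, 2 * i + 2
        largest = i
        if l < heap_size and A[l] > A[largest]: largest = l
        if r < heap_size and A[r] > A[largest]: largest = r
        if largest == i:
            return
        A[i], A[largest] = A[largest], A[i]
        i = largest

def build_heap(A):
    # Nodes len(A)//2 .. len(A)-1 are leaves; heapify the rest bottom-up.
    for i in range(len(A) // 2 - 1, -1, -1):
        heapify(A, i, len(A))

def heapsort(A):
    build_heap(A)
    for end in range(len(A) - 1, 0, -1):
        A[0], A[end] = A[end], A[0]   # move the maximum to its final position
        heapify(A, 0, end)            # heap shrinks by one; restore at the root

A = [16, 4, 7, 12, 1, 19]   # the slides' example array
heapsort(A)
print(A)  # -> [1, 4, 7, 12, 16, 19]
```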

Convert the array to a heap. Picture the array [16, 4, 7, 12, 1, 19] as a complete binary tree: 16 at the root, children 4 and 7, then 12, 1, and 19 on the bottom level.

Convert the array to a heap. Successive swaps during BUILD-HEAP: [16, 4, 7, 12, 1, 19] → swap 7 and 19 → [16, 4, 19, 12, 1, 7] → swap 4 and 12 → [16, 12, 19, 4, 1, 7] → swap 16 and 19 → [19, 12, 16, 4, 1, 7].

Time Analysis of Heapsort. The Build Heap algorithm runs in O(n) time. There are n − 1 calls to Heapify, and each call requires O(log n) time. The heapsort program combines Build Heap and Heapify, therefore it has a running time of O(n log n). Total time complexity: O(n log n).

Possible Applications. When we want to know the task that carries the highest priority, given a large number of things to do. Interval scheduling: when we have a list of tasks with start and finish times and we want to do as many tasks as possible. Sorting a list of elements when an efficient sorting algorithm is needed.

Heapsort Advantages. The primary advantage of heap sort is its efficiency: its execution time is O(n log n). The memory efficiency of heap sort, unlike the other O(n log n) sorts, is constant, O(1), because the heap sort algorithm is not recursive.

Priority Queues. A priority queue is a collection of zero or more elements; each element has a priority or value. Unlike FIFO queues, the order of deletion from a priority queue (e.g., who gets served next) is determined by the element's priority: elements are deleted in increasing or decreasing order of priority rather than in the order in which they arrived in the queue. Types of priority queue: a minimum priority queue returns the smallest element and removes it from the data structure; a maximum priority queue returns the largest element and removes it from the data structure.

Priority Queue Methods. A priority queue supports the following methods: pop()/GetNext: removes the item with the highest priority from the queue. push()/InsertWithPriority: adds an element to the queue with an associated priority. top()/PeekAtNext: looks at the element with the highest priority without removing it; "highest priority" may be based on either the minimum or the maximum operation, depending on the priority definition of the queue. size(): returns the number of elements in the priority queue. empty(): returns true if the priority queue is empty.

Priority Queue Operations. Insert(S, x): inserts the element x into set S. Maximum(S): returns the element of S with the maximum key. ExtractMax(S): removes and returns the element of S with the maximum key.

Advantages/Disadvantages. Advantages: they enable you to retrieve items by priority rather than by insertion time (as in a stack or queue); they offer flexibility, allowing client application programs to perform a variety of operations on sets of records with numerical keys (priorities). Disadvantages: naive implementations are generally slow; deleting elements leaves locations empty, and searches must examine the empty spaces too, which takes more time. To avoid this, use an empty indicator or shift elements forward when an element is deleted. We can also maintain the priority queue as an array of ordered elements, to avoid this cost when searching.

Implementation of Priority Queues. Binary heap: uses the heap data structure; must be a complete binary tree with the first element as the root and root ≥ child. Complexity: O(log n) insertion and deletion. Binary search tree: may or may not be a complete binary tree; LeftChild ≤ root ≤ RightChild; a priority queue is implemented using an in-order traversal on a BST. Complexity: O(log n) insertion, O(1) deletion. Sorted array or list: elements are arranged linearly based on priorities, highest priority first. Complexity: O(n) insertion, O(1) deletion. Fibonacci heap: complexity O(log n). Binomial heap: complexity O(log n).
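As a usage sketch of the binary-heap implementation, Python's standard-library `heapq` module maintains a list as a min-heap; the task names and priorities below are made up for illustration:

```python
import heapq

pq = []   # a min-priority queue backed by a binary heap
for task, priority in [("backup", 3), ("page-fault", 1), ("email", 5)]:
    heapq.heappush(pq, (priority, task))   # O(log n) insertion

print(heapq.heappop(pq))   # -> (1, 'page-fault')  smallest priority first
print(pq[0])               # peek without removing -> (3, 'backup')
```

`heapq` is a min-heap only; a maximum priority queue is commonly sketched by pushing negated priorities.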

Priority Queue Sorting. We can use a priority queue to sort a set of comparable elements: insert the elements one by one with a series of insert operations, then remove the elements in sorted order with a series of removeMin operations. The running time of this sorting method depends on the priority queue implementation.

Algorithm PQ-Sort(S, C)
  Input: sequence S, comparator C for the elements of S
  Output: sequence S sorted in increasing order according to C
  P ← priority queue with comparator C
  while ¬S.isEmpty()
    e ← S.removeFirst()
    P.insert(e, ∅)
  while ¬P.isEmpty()
    e ← P.removeMin().key()
    S.insertLast(e)
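PQ-Sort can be sketched with a heap-based priority queue (here Python's `heapq`, so insert is `heappush` and removeMin is `heappop`; the default comparison order plays the role of the comparator C):

```python
import heapq

def pq_sort(S):
    """PQ-Sort: insert everything, then repeatedly remove the minimum."""
    P = []
    for e in S:
        heapq.heappush(P, e)          # phase 1: n insert operations
    return [heapq.heappop(P) for _ in range(len(P))]   # phase 2: n removeMin

print(pq_sort([7, 4, 8, 2, 5, 3, 9]))  # -> [2, 3, 4, 5, 7, 8, 9]
```

With a heap both operations cost O(log n), so this variant runs in O(n log n); the sequence-based implementations below give the O(n²) variants.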

Sequence-based Priority Queue. Implementation with an unsorted list: insert takes O(1) time, since we can insert the item at the beginning or end of the sequence; removeMin and min take O(n) time, since we have to traverse the entire sequence to find the smallest key. Implementation with a sorted list: insert takes O(n) time, since we have to find the place where to insert the item; removeMin and min take O(1) time, since the smallest key is at the beginning.

Insertion-Sort. Insertion-sort is the variation of PQ-sort where the priority queue is implemented with a sorted sequence. Running time of insertion-sort: inserting the elements into the priority queue with n insert operations takes time proportional to 1 + 2 + … + n; removing the elements in sorted order from the priority queue with a series of n removeMin operations takes O(n) time. Insertion-sort runs in O(n²) time.

Insertion-Sort Example.
Input:      Sequence S = (7, 4, 8, 2, 5, 3, 9)   Priority queue P = ()
Phase 1 (a) S = (4, 8, 2, 5, 3, 9)               P = (7)
        (b) S = (8, 2, 5, 3, 9)                  P = (4, 7)
        (c) S = (2, 5, 3, 9)                     P = (4, 7, 8)
        (d) S = (5, 3, 9)                        P = (2, 4, 7, 8)
        (e) S = (3, 9)                           P = (2, 4, 5, 7, 8)
        (f) S = (9)                              P = (2, 3, 4, 5, 7, 8)
        (g) S = ()                               P = (2, 3, 4, 5, 7, 8, 9)
Phase 2 (a) S = (2)                              P = (3, 4, 5, 7, 8, 9)
        (b) S = (2, 3)                           P = (4, 5, 7, 8, 9)
        …
        (g) S = (2, 3, 4, 5, 7, 8, 9)            P = ()

In-place Insertion-Sort. Instead of using an external data structure, we can implement insertion-sort in place: a portion of the input sequence itself serves as the priority queue. For in-place insertion-sort, we keep the initial portion of the sequence sorted and use swaps instead of modifying the sequence. Example on (5, 4, 2, 3, 1): (5, 4, 2, 3, 1) → (4, 5, 2, 3, 1) → (2, 4, 5, 3, 1) → (2, 3, 4, 5, 1) → (1, 2, 3, 4, 5).
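The in-place, swap-based version can be sketched as:

```python
def insertion_sort(A):
    """In-place insertion sort using swaps: the sorted portion grows from
    the left, and each new element is swapped backward into position."""
    for i in range(1, len(A)):
        j = i
        while j > 0 and A[j - 1] > A[j]:
            A[j - 1], A[j] = A[j], A[j - 1]
            j -= 1

A = [5, 4, 2, 3, 1]   # the slide's example sequence
insertion_sort(A)
print(A)  # -> [1, 2, 3, 4, 5]
```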