Merge sort analysis and its real time applications

hawkeye0 15,334 views 26 slides Sep 25, 2016
Slide 1
Slide 1 of 26
Slide 1
1
Slide 2
2
Slide 3
3
Slide 4
4
Slide 5
5
Slide 6
6
Slide 7
7
Slide 8
8
Slide 9
9
Slide 10
10
Slide 11
11
Slide 12
12
Slide 13
13
Slide 14
14
Slide 15
15
Slide 16
16
Slide 17
17
Slide 18
18
Slide 19
19
Slide 20
20
Slide 21
21
Slide 22
22
Slide 23
23
Slide 24
24
Slide 25
25
Slide 26
26

About This Presentation

It about merge sort and its application only


Slide Content

Merge Sort Analysis and its Real-Time Applications Group-4 BE-3 rd year , sem-5 th Sarvajanik College of Engineering & Technology Department of Computer E ngineering (Shift-1) 1

140420107014 DUDHWALA FRANCY KALPESH 140420107015 DUMASIA YAZAD HOMYAR 140420107016 GAJJAR KARAN DEVENDRABHAI 140420107018 GOTAWALA VATSAL YAGNESH . 2

Introduction to sorting Introduction to merge sort algorithm Complexity analysis of merge sort algorithm Real-time application 3 Contents : A glimpse of what is to come

Introduction to sorting S orting refers to arranging a set of data in some logical order. For ex. A telephone directory can be considered as a list where each record has three fields - name, address and phone number. Being unique, phone number can work as a key to locate any record in the list. Sorting is among the most basic problems in algorithm design. We are given a sequence of items, each associated with a given key value. And the problem is to rearrange the items so that they are in an increasing(or decreasing) order by key. The methods of sorting can be divided into two categories: Internal Sorting External Sorting 4

Internal Sorting If all the data that is to be sorted can be adjusted at a time in main memory, then internal sorting methods are used External Sorting When the data to be sorted can’t be accommodated in the memory at the same time and some has to be kept in auxiliary memory, then external sorting methods are used. NOTE: We will only consider External sorting 5

Stable and Not Stable Sorting If a sorting algorithm, after sorting the contents, does not change the sequence of similar content in which they appear, it is called stable sorting . If a sorting algorithm, after sorting the contents, changes the sequence of similar content in which they appear, it is called unstable sorting. 6

Efficiency of Sorting Algorithm The complexity of a sorting algorithm measures the running time of a function in which n number of items are to be sorted. The choice of sorting method depends on efficiency considerations for different problems. Tree most important of these considerations are: The length of time spent by programmer in coding a particular sorting program. Amount of machine time necessary for running the program. The amount of memory necessary for running the program . 7

Efficiency of Sorting Algorithm Various sorting methods are analyzed in the cases like – best case , worst case or average case . Most of the sort methods we consider have requirements that range from O(n logn) to O(n 2 ). A sort should not be selected only because its sorting time is 0(nlogn) ; the relation of the file size n and the other factors affecting the actual sorting time must be considered. Determining the time requirement of sorting technique is to actually run the program and measure its efficiency. Once a particular sorting technique is selected the need is to make the program as efficient as possible. Any improvement in sorting time significantly affect the overall efficiency and saves a great deal of computer time . 8

Efficiency of Sorting Algorithm Space constraints are usually less important than time considerations. The reason for this can be, as for most sorting programs, the amount of space needed is closer to 0(n) than to 0(n 2 ) The second reason is that, if more space is required, it can almost always be found in auxiliary storage. 9

Introduction to merge sort algorithm 10 Divide-and-conquer , breaks a problem into sub problems that are similar to the original problem, recursively solves the sub problems, and finally combines the solutions to the sub problems to solve the original problem. Think of a divide-and-conquer algorithm as having three parts: Divide  the problem into a number of sub-problems that are smaller instances of the same problem. Conquer  the sub-problems by solving them recursively. If they are small enough, solve the sub-problems as base cases. Combine  the solutions to the sub-problems into the solution for the original problem. Because we're using divide-and-conquer to sort, we need to decide what our sub problems are going to be . Divide-and-conquer algorithms

11 Problem divide Sub problem Sub problem Conquer Solve sub-problem Solve sub-problem Solution to Sub-problem Solution to Sub-problem Solution to problem Combine

Merge sort algorithm 12 Merge sort is a sorting technique based on divide and conquer technique that was invented by John von Neumann in 1945. Merge sort work on Two basic principle : Sorting smaller list is faster than sorting larger list. Combining two sorted sub lists is faster than of two unsorted list.

Working of merge sort 13 Merge sort works in three stage: Divide : Merge sort first divides the list into equal halves and then combines them in a sorted manner . Conquer : Then sort the sub-lists Combine : After sorting merge all sub-lists into single list.

18 26 32 6 43 15 9 1 18 26 32 6 43 15 9 1 18 26 32 6 43 15 9 1 26 18 6 32 15 43 1 9 18 26 32 6 43 15 9 1 18 26 32 6 43 15 9 1 18 26 32 6 15 43 1 9 6 18 26 32 1 9 15 43 1 6 9 15 18 26 32 43 18 26 18 26 18 26 32 32 6 6 32 6 18 26 32 6 43 43 15 15 43 15 9 9 1 1 9 1 43 15 9 1 18 26 32 6 43 15 9 1 18 26 6 32 6 26 32 18 15 43 1 9 1 9 15 43 1 6 9 15 18 26 32 43 Original Sequence Sorted Sequence 14 Working of merge sort

43 15 9 1 22 26 19 55 37 43 99 2 18 26 32 6 43 15 9 1 22 26 19 55 37 43 99 2 18 26 32 6 43 15 9 1 22 26 19 55 37 43 99 2 18 26 32 6 43 15 9 1 22 26 19 55 37 43 99 2 18 26 32 6 43 15 9 1 22 26 19 55 37 43 99 2 18 26 32 6 15

16 kn kn/2 kn/2 kn/4 kn/4 kn/4 kn/4 1 1 1 1 1 1 1 1 kn kn kn kn Log 2 n Total : Log 2 n + kn

Algorithm for merge sort : 17 Algorithm MERGE_SORT(A,1,n) //A is array of size n if low < high then mid  floor ( low + high ) / 2 MERGE_SORT(A , low , mid) MERGE_SORT(A , mid+1 , high) COMBINE(A , low, mid, high) end Algorithm COMBINE(A , low , mid , high) L 1  mid – low +1 L 2  high – mid for i  1 to L 1 do LEFT[ i ]  A [ low + i -1 ] end for j  1 to L 2 do RIGHT[ i ]  A[ mid + j ] end LEFT [ L 1 + 1]  ∞ RIGHT [ L 2 + 1]  ∞ i  1 , j  1 for k  low to high do i f LEFT [ i ]  RIGHT [ j ] then A[ k ]  LEFT [ i ] i  i +1 else A []  RIGHT [] j = j + 1 end end end

Ex: sorting the list using merge sort algorithm. 18 33 22 44 00 99 88 11 1 4 2 6 5 3 33 22 44 00 99 88 11 1 11 33 22 44 00 33 22 44 00 22 33 00 44 00 22 33 44 99 88 11 88 99 88 99 11 11 11 88 99 33 22 44 00 99 88 11 2 6 3 10 5 4 7 9 8 12 17 16 18 13 15 14 p r

Merge sort works as follow: MERGE_SORT(A , p , r): if p<r then, q=( r + p)/2 MERGE_SORT(A , p , q) MERGE_SORT(A , q+1 , r) COMBINE(A, p , q , r) end 19 Step- 0 : MERGE_SORT(A , 0 , 6) Step- 1 : MERGE_SORT(A , 0 , 3) Step-11 : MERGE_SORT(A , 4 , 6) Step- 2 : MERGE_SORT(A , 0 , 1) Step-12 : MERGE_SORT(A , 4 , 5 ) Step- 3 : MERGE_SORT(A , 0 , 0) Step-13 : MERGE_SORT(A , 4 , 4) Step- 4 : MERGE_SORT(A , 1 , 1) Step-14 : MERGE_SORT(A , 5 , 5) Step- 5 : COMBINE(A , 0 , 0 , 1) Step-15 : COMBINE(A , 4 , 4 , 4 , 5) Step- 6 : MERGE_SORT(A , 2 , 3) Step-16 : MERGE_SORT(A , 6 , 6) Step-7 : MERGE_SORT(A , 2, 2) Step-17 : COMBINE(A , 4 , 5 , 6) Step-8 : MERGE_SORT(A , 3 , 3 ) Step-18 : COMBINE(A , 0 , 3 ,6) Step-9 : COMBINE(A , 2 , 2 , 3 ) Step-10 : COMBINE(A , 0 , 1 , 3)

20 Example base on previous method to sorting the list using merge sort algorithm.

Complexity analysis of merge sort algorithm Divide : This step computes the middle of the sub array, which takes constant time . Thus, D(n) = θ (1). Conquer : We recursively sole two sub problems, each of size (n/2) , which contributes 2T(n/2) to the running time. Combine: Combine procedure on an n-element sub array takes times θ (n).so C(n) = θ (n ). T(n)= 21 0 ,if n <=1 T(n/2) + T(n/2) + D(n) + C(n) , else

22 T(n) = 2 * T(n/2) + θ (n) + θ (1) T(n) = 2 * T(n/2) + n T(n/2) = 2 * ( 2 * T(n/4) + n/2 ) T(n) = 2 * ( 2 * T(n/2) + n/2 ) + n = 2 2 * T(n/ 2 2 ) + 2n . . . . =2 k T(n/ 2 k )+ kn Suppose, n = 2 k so k = log 2 n T( 2 k ) = n* T(n/n) + log 2 n * n =n *T(1) +n* log 2 n But T(1)=0 T(n)= O(n* log 2 n )

23 Weather list is already sorted , inverse sorted or randomly distributed , all three steps must be performed. Merge sort cannot identify if list is sorted. So numbers of comparisons are same for 3 case. Worst case of merge sort takes les time compared to insertion sort , selection sort & bubble sort . However , best case of insertion sort (O(n)) beats all three cases of merge sort( O(nlog 2 n )) Best case Average case Worst case O(nlog 2 n) O(nlog 2 n) O(nlog 2 n)

Property of merge sort : Not Adaptive : Running time doesn’t change with data pattern. Stable/ Unstable : Both implementations are possible . Not Incremental : Does not sort one by one element in each pass. Not online : Need all data to be in memory at the time of sorting. Not in place : It need O(n) extra space to sort two sub list of size(n/2). 24

Real-time application The e-commerce application Have you ever noticed on any e-commerce website, they have this section of "You might like", they have maintained an array for all the user accounts and then whichever has the least number of inversion with your array of choices, they start recommending what they have bought or they like. I am not going into the time and space complexity details of the algorithm. Obviously there are a lot of ways of doing this and this is one of them . And some e-commerce website like policybazaar.com , trivago.com etc. use there search engine for collecting data from other website to provide minimum cost of booking hotel room or purchasing product . 25

26 Thank you