Sorting algorithms

Sorting algorithms Eleonora Ciceri , Politecnico di Milano Email: [email protected]

What is sorting Sorting is any process of arranging items systematically , with two distinct meanings: Ordering : arranging items in a sequence ordered by some criterion Categorizing : grouping items with similar properties

What is a sorting algorithm A sorting algorithm is an algorithm that puts elements of a list in a certain order 78 26 4 12 90 Sorting algorithm 4 12 26 78 90

Why do we need to sort? Sorted lists / sequences are useful in the following cases: 1) Efficient lookup and search 56 31 2 47 54 19 64 85 23 56 31 2 47 54 19 64 85 23 56 31 2 47 54 19 64 85 23 56 31 2 47 54 19 64 85 23 56 31 2 47 54 19 64 85 23 56 31 2 47 54 19 64 85 23 Finding “19” Non-ordered list: 6 accesses

Why do we need to sort? Sorted lists / sequences are useful in the following cases: 1) Efficient lookup and search 2 19 23 31 47 54 56 64 85 2 19 23 31 47 54 56 64 85 Finding “19” Ordered list: 2 accesses

Why do we need to sort? Sorted lists / sequences are useful in the following cases: 2) Merge sequences 2 5 9 13 20 1 4 8 21 1 2 4 5 8 9 13 20 21

What do we sort? “Workers sort parcels in a postal facility” (from: Wikipedia, the free encyclopedia)

What do we sort? We are going to order collections of data A couple of examples: Arrays Linked lists

Compare and swap to sort elements

Common building block of sorting algorithms There are several algorithms one could use to order a list of elements Although they are all different, they share a common building block: T hey compare pairs of elements to decide which is their relative order W hen needed, they swap the elements to restore their order 6 8 3 5 9 Compare: 6 5 3 8 9 Swap:

How to compare elements: Lexicographic order Elements are usually ordered according to the lexicographic (or, equivalently, lexicographical ) order Also known as dictionary order The lexicographic order is a generalization of the way the alphabetical order of words is based on the alphabetical order of their component letters

How to compare elements: Some examples Some examples Order numbers from the smallest to the largest Order words alphabetically Order people ( implemented using a struct that contains name, surname, age ) alphabetically by surname People having the same surname are ordered according to their name People having the same name and surname are ordered according to their age Thus: this proves that comparisons could be based on complex rules that sort out what to do in case of ties

Sorting algorithms time complexity

Which is the difference between sorting algorithms? Each sorting algorithm is characterized by a particular sorting strategy Are they all equal? NO ! Some of them are naïve, some of them aren’t Naïve algorithms require a larger number of comparisons A larger number of comparisons amount to a larger time frame spent to order elements in the collection

(Time) complexity of an algorithm In computer science, the time complexity of an algorithm quantifies the amount of time taken by the algorithm to run The time complexity: is expressed as a function of the length of the input is commonly expressed using big-O notation

Big-O notation “The big-O notation describes the limiting behavior of a function when the number of elements it has to process tends to infinity” We’ll try to simplify the concept in this way: number of elements complexity N O ( N 2 ) e.g., the number of elements in the array we want to order e.g., the number of comparisons we need to perform to order the array

Big-O notation: Which algorithm is “the best”? To us, an algorithm is “good” if it allows us to save time Let N be the number of elements we want to process E.g., number of elements in an array What if we have four algorithms whose time complexities are as follows? Algorithm 1 O(N) Algorithm 2 O(N 2 ) Algorithm 3 O(log(N)) Algorithm 4 O(N*log(N))

Big-O notation: Which algorithm is “the best”? Time complexity can be treated as if it were a mathematical function Being mathematical functions, these curves can be drawn on a graph Algorithm 1 O(N) f(N) = N Algorithm 2 O(N 2 ) f(N) = N 2 Algorithm 3 O(log(N)) f(N) = log(N) Algorithm 4 O(N*log(N)) f(N) = N*log(N)

Big-O notation: Which algorithm is “the best”? 2

Big-O notation: Which algorithm is “the best”? 2 The higher the number of required operation, the more the time, the worse the algorithm!

So, how do we select a sorting algorithm? We can identify several “families” of algorithms: Simple sorts Efficient sorts BubbleSort and variants Distribution sorts

So, how do we select a sorting algorithm? We can identify several “families” of algorithms: Simple sorts Efficient sorts BubbleSort and variants Distribution sorts Simple sorting algorithms are efficient on small data amounts (due to low overhead), but generally do not perform well on large lists Insertion sort Selection sort

So, how do we select a sorting algorithm? We can identify several “families” of algorithms: Simple sorts Efficient sorts BubbleSort and variants Distribution sorts Efficient sorting algorithms are those algorithms whose average complexity is the best you can find ( O(N*log(N)) ) Merge sort Heap sort Quick sort

So, how do we select a sorting algorithm? We can identify several “families” of algorithms: Simple sorts Efficient sorts BubbleSort and variants Distribution sorts Bubble sort algorithm is very simple, and the same characteristics is inherited by all its variants. However, it is highly inefficient (i.e., its time complexity is very high: O(N 2 ) ) Bubble sort Shell sort Comb sort

So, how do we select a sorting algorithm? We can identify several “families” of algorithms: Simple sorts Efficient sorts BubbleSort and variants Distribution sorts These algorithms distribute the input to intermediate structures, which are gathered and placed on the output. They are useful in case of very large data sets that do not fit in memory (since the intermediate structures can be deployed on different machines ) Counting sort Bucket sort Radix sort

One of the most inefficient algorithms ( O(N 2 ) ) One of the most efficientalgorithms ( O(N*log(N)) )

A quick note on time complexity A n algorithm performance may vary with different input of the same sizes [1 2 3 4 5] does not require any swap [5 4 3 2 1] requires a lot of swaps! We commonly attribute to each algorithm: A worst-case complexity (maximum amount of time it may require) An average-case complexity (averaged on all possible inputs) We won’t go into the details, since this is just an introduction to sorting algorithms 

An inefficient sorting algorithm: the Bubble Sort algorithm

Bubble sort: the idea The bubble sort algorithm repeatedly steps through the list of elements to be sorted: Comparing each pair of adjacent elements Swapping them if they are in wrong order The algorithm stops when, by going through the whole array, we do not require any swap (i.e., the array is fully ordered)

Bubble sort: the running example Array: [ 5 1 4 2 8] 5 > 1, thus: swap required Swapped something? False End of iteration? False

Bubble sort: the running example Array: [1 5 4 2 8] 5 > 4, thus: swap required Swapped something? Yes End of iteration? False

Bubble sort: the running example Array: [1 4 5 2 8] 5 > 2, thus: swap required Swapped something? Yes End of iteration? False

Bubble sort: the running example Array: [1 4 2 5 8 ] 5 < 8 , thus: swap not required Swapped something? Yes End of iteration? Yes

Bubble sort: the running example Array: [1 4 2 5 8] We need to proceed, since in the current iteration we swapped something Following operations: Exclude “8” from next iteration Restart with a new iteration Swapped something? Yes End of iteration? Yes

Bubble sort: the running example Array: [ 1 4 2 5 8] 1 < 4, thus: swap not required Swapped something? No End of iteration? No

Bubble sort: the running example Array: [1 4 2 5 8] 4 > 2, thus: swap required Swapped something? No End of iteration? No

Bubble sort: the running example Array: [1 2 4 5 8] 4 < 5, thus: swap not required Swapped something? Yes End of iteration? Yes

Bubble sort: the running example Array: [1 2 4 5 8] We need to proceed, since in the current iteration we swapped something Following operations: Exclude “5” from next iteration Restart with a new iteration Swapped something? Yes End of iteration? Yes

Bubble sort: the running example Array: [ 1 2 4 5 8] 1 < 2, thus: swap not required Swapped something? No End of iteration? No

Bubble sort: the running example Array: [1 2 4 5 8] 2 < 4, thus: swap not required Swapped something? No End of iteration? Yes

Bubble sort: the running example Array: [1 2 4 5 8] In the current iteration nothing was swapped The algorithm terminates Swapped something? No End of iteration? Yes

Bubble sort: the pseudo-code procedure bubbleSort ( A : list of sortable items ) n = length(A) repeat swapped = false for i = 1 to n-1 inclusive do if A[i-1] > A[ i ] then swap (A[i-1], A[i]) swapped = true end if end for n = n - 1 until not swapped end procedure (from: Wikipedia, the free encyclopedia)

Bubble sort in C+ +: Iterate over the array We will use the concept of iterators over the array To iterate over the array: int * first int * last while (first < last) { // do what you want first++; }

Bubble sort in C+ +: The algorithm void bubble_sort_int ( int * first, int * last) { bool swapped; do { swapped = false ; int * current = first + 1 ; while ( current < last ) { if (*(current- 1 ) > *current) { swap(current- 1 , current); swapped = true ; } current++; } last--; } while (swapped); }

Bubble sort in C++: Swapping elements void swap ( int * first, int * second) { int temp = *first; *first = *second; *second = temp; }

An efficient sorting algorithm: the Merge Sort algorithm

Merge sort: the idea The merge sort is a divide-and-conquer algorithm where: The unsorted list of N elements is divided into N sub-lists , each containing one element Sub-lists are repeatedly merged to produce new sorted sub-lists

How to merge two sub-lists Sub-lists are merged so as produce a longer sorted sub-list At each step, we: compare the first element of sub-list A and the first element of sub-list B select the smallest element insert it into the resulting list

How to merge two sub-lists: An example How to merge these: to obtain this? 1 6 10 4 5 27 1 4 5 6 10 27

How to merge two sub-lists: An example 1 6 10 4 5 27 1 1)

How to merge two sub-lists: An example 1 6 10 4 5 27 1 6 10 4 5 27 1 4 1) 2 )

How to merge two sub-lists: An example 1 6 10 4 5 27 1 6 10 4 5 27 1 4 6 10 5 27 1 4 5 1) 2 ) 3 )

How to merge two sub-lists: An example 1 6 10 4 5 27 1 6 10 4 5 27 1 4 6 10 5 27 1 4 5 6 10 27 1 4 5 6 1) 2 ) 3 ) 4 )

How to merge two sub-lists: An example 1 6 10 4 5 27 1 6 10 4 5 27 1 4 6 10 5 27 1 4 5 6 10 27 1 4 5 6 10 27 1 4 5 6 10 1) 2 ) 3 ) 4 ) 5 )

How to merge two sub-lists: An example 1 6 10 4 5 27 1 6 10 4 5 27 1 4 6 10 5 27 1 4 5 6 10 27 1 4 5 6 10 27 1 4 5 6 10 27 / 1 4 5 6 10 27 1) 2 ) 3 ) 4 ) 5 ) 6 )

How to merge two sub-lists: The pseudocode function merge(left, right) // merge lists until at least one of them is empty while notempty (left) and notempty (right) if first(left) <= first(right) append first(left) to result left = rest( left ) else append first(right) to result right = rest(right ) // left has elements left while notempty (left) append first(left) to result left = rest( left ) / / right has elements left while notempty (right) append first(right) to result right = rest(right ) return result (from: Wikipedia, the free encyclopedia)

Merge sort: the step-by-step algorithm We start with this list… …and, according to the algorithm, we need to keep dividing it in “ left sub-list ” and “ right sub-list ”…

Merge sort: the step-by-step algorithm …and split again, since sub-lists length is greater than 1…

Merge sort: the step-by-step algorithm … and again.

Merge sort: the step-by-step algorithm Now it’s time to merge the lists, in pairs, so that the resulting list is ordered We start from the bottom, i.e., from the shorter sub-lists we extracted before This is now ordered

Merge sort: the step-by-step algorithm Merge… This is now ordered

Merge sort: the step-by-step algorithm …and merge. This is now ordered

Trying to generalize: Which operations are needed at each iteration? Split phase: If the current list has length 1, stop with the split phase If the current list has length > 1, split it in two parts (left, right) Merge phase: Merge the left and right parts into a unique, ordered list

Trying to generalize: What happens to a list? Given a list: Split in two parts (left and right) 1 Split in two parts (left and right) 1 Do something to order the sub-list Do something to order the sub-list 2 2 Merge the parts in a single list 3 Merge the parts in a single list 3

Trying to generalize: What happens to a list? Algorithmically : [left, right] = split(list) left = order(left) right = order(right) list = merge(left, right)

Trying to generalize: What happens to a list? Algorithmically : [left, right] = split(list) left = mergeSort (left) right = mergeSort (right) list = merge(left, right) This is a recursive call to the merge sort function

Trying to generalize: What happens to a list? Now we finalize the algorithm : function mergeSort (list) { if (length(list) == 1) return list [left, right] = split(list) left = mergeSort (left) right = mergeSort (right) list = merge(left, right) return list }

Trying to generalize: What happens to a list? Now we finalize the algorithm : function mergeSort (list) { if (length(list) == 1) return list [left, right] = split(list) left = mergeSort (left) right = mergeSort (right) list = merge(left, right) return list } This function splits the list in two parts having the same length This function merges the sub-lists so that the result is ordered

Merge sort in C+ +: Iterate over the array We will use the concept of iterators over the array To iterate over the array: int * first int * last while (first < last) { // do what you want first++; }

Merge sort in C+ +: Iterate over the array We will use the concept of iterators over the array To iterate over the array: int * first int * last while (first < last) { // do what you want first++; } Remember: last always points outside the array

Merge sort in C++: Split the array in two parts If the length of the array is even: The sub-lists are as follows: Left: from first (included) to middle (excluded) Right: from middle (included) to last (excluded) int * first int * last int * middle

Merge sort in C++: Split the array in two parts If the length of the array is even: The sub-lists are as follows: Left: from first (included) to middle (excluded) Right: from middle (included) to last (excluded) int * first int * last int * middle LEFT

Merge sort in C++: Split the array in two parts If the length of the array is even: The sub-lists are as follows: Left: from first (included) to middle (excluded) Right: from middle (included) to last (excluded) int * first int * last int * middle RIGHT

Merge sort in C++: Split the array in two parts If the length of the array is odd: The sub-lists are as follows: Left: from first (included) to middle (excluded) Right: from middle (included) to last (excluded) In this case: right is longer than left int * first int * last int * middle

Merge sort in C++: The recursive algorithm void merge_sort_int ( int * first, int * last) { int N = last - first; if (N <= 1 ) return ; int * middle = first + N/ 2 ; merge_sort_int ( first, middle); merge_sort_int ( middle, last); merge (first, middle, last); }

Merge sort in C++: The merge function void merge ( int * first, int * middle, int * last) { std :: size_t N = last - first; int *result = new int [N], * result_current = result, * left_current = first, * right_current = middle; while ( left_current < middle && right_current < last) { if (* left_current <= * right_current ) { * result_current = * left_current ; left_current ++; } else { * result_current = * right_current ; right_current ++; } result_current ++; } fill_with_remaining_elements ( result_current , left_current , middle ); fill_with_remaining_elements ( result_current , right_current , last); for ( std :: size_t i = ; i < N; i++) first[i] = result [i]; delete [] result ; }

Merge sort in C++: The merge function void merge ( int * first, int * middle, int * last) { std :: size_t N = last - first; int *result = new int [N], * result_current = result, * left_current = first, * right_current = middle; while ( left_current < middle && right_current < last) { if (* left_current <= * right_current ) { * result_current = * left_current ; left_current ++; } else { * result_current = * right_current ; right_current ++; } result_current ++; } fill_with_remaining_elements ( result_current , left_current , middle ); fill_with_remaining_elements ( result_current , right_current , last); for ( std :: size_t i = ; i < N; i++) first[i] = result [i]; delete [] result ; } result is a temporary buffer that contains the ordered array At the end of the function, result is restored in the original array and deleted

Merge sort in C++: The merge function void fill_with_remaining_elements ( int *& result_current , int * sub_list_current , int * sub_list_last ) { while ( sub_list_current < sub_list_last ) { * result_current = * sub_list_current ; sub_list_current ++; result_current ++; } }

References

References https://en.wikipedia.org/wiki/ Sorting https://en.wikipedia.org/wiki/ Sorting_algorithm https://en.wikipedia.org/wiki/ Time_complexity https://en.wikipedia.org/wiki/ Big_O_notation https://en.wikipedia.org/wiki/ Bubble_sort https://en.wikipedia.org/wiki/ Merge_sort

References http://stackoverflow.com/questions/24650626/how-to-implement-classic-sorting-algorithms-in-modern-c

Sorting algorithms

About This Presentation

Slide Content

Tags

Categories

Download

Quick Actions

Statistics

Related Slideshows

Sorting algorithms

About This Presentation

Slide Content

Slide 1

Slide 2

Slide 3

Slide 4

Slide 5

Slide 6

Slide 7

Slide 8

Slide 9

Slide 10

Slide 11

Slide 12

Slide 13

Slide 14

Slide 15

Slide 16

Slide 17

Slide 18

Slide 19

Slide 20

Slide 21

Slide 22

Slide 23

Slide 24

Slide 25

Slide 26

Slide 27

Slide 28

Slide 29

Slide 30

Slide 31

Slide 32

Slide 33

Slide 34

Slide 35

Slide 36

Slide 37

Slide 38

Slide 39

Slide 40

Slide 41

Slide 42

Slide 43

Slide 44

Slide 45

Slide 46

Slide 47

Slide 48

Slide 49

Slide 50

Slide 51

Slide 52

Slide 53

Slide 54

Slide 55

Slide 56

Slide 57

Slide 58

Slide 59

Slide 60

Slide 61

Slide 62

Slide 63

Slide 64

Slide 65

Slide 66

Slide 67

Slide 68

Slide 69

Slide 70

Slide 71

Slide 72

Slide 73

Slide 74

Slide 75

Slide 76

Slide 77