Randomized Algorithm - Advanced Algorithm


About This Presentation

Credit: Nusrat Jahan & Fahima Hossain, Dept. of CSE, JnU, Dhaka.

Randomized Algorithms (Advanced Algorithm): deterministic vs non-deterministic algorithms, Las Vegas and Monte Carlo algorithms.


Slide Content

Randomized Algorithms

Presented By: Nusrat Jahan (ID: B-150305002) and Fahima Hossain (ID: B-150305036). Presented To: Dr. Md. Manowarul Islam, Associate Professor, Dept. of Computer Science & Engineering, Jagannath University, Dhaka.

Outline
- Deterministic vs Non-Deterministic
- Deterministic Algorithm
- Randomized Algorithms
- Types of Randomized Algorithms: Las Vegas, Monte Carlo
- Las Vegas: Quick Sort, Smallest Enclosing Circle
- Monte Carlo: Minimum Cut, Primality Testing
- Smallest Enclosing Disk
- Minimum Spanning Tree
- Michael's Algorithms

Deterministic vs Non-Deterministic 1

Deterministic vs Non-Deterministic. Deterministic algorithm: for a given input, the computer always produces the same output, going through the same sequence of states; the next step is always determined, and the running time is a fixed function of the input. Non-deterministic algorithm: for the same input, different runs may produce different outputs; the next step is not uniquely determined, and such an algorithm cannot in general be simulated deterministically in polynomial time.

Deterministic Algorithm. Goal of a deterministic algorithm: the solution produced by the algorithm is correct, and the number of computational steps is the same across different runs of the algorithm on the same input.

Deterministic Algorithm. Problem with deterministic algorithms: given a computational problem, it may be difficult to formulate an algorithm with a good running time, or the running time of the known algorithms may grow sharply with the size of the input. Remedies: efficient heuristics, approximation algorithms, randomized algorithms.

Randomized Algorithm 2

Randomized Algorithm. What is a randomized algorithm? An algorithm that uses random numbers to decide what to do next anywhere in its logic is called a randomized algorithm. Equivalently, a randomized algorithm employs a degree of randomness as part of its logic: it makes random choices during its execution.

Randomized algorithms are used to overcome the running-time problems of deterministic algorithms described above. A randomized algorithm uses uniform random bits (in practice, pseudo-random numbers) as an auxiliary input to guide its behavior, and relies on the statistical properties of these random numbers (a classic example is randomized quicksort). The aim is to achieve good performance in the average (expected) case.

Why use Randomized Algorithms?
- They are simple and easy to implement, e.g., Karger's min-cut algorithm.
- They are fast and produce optimum output with very high probability. For example, deterministic quicksort can be quite slow on certain worst-case inputs (e.g., an almost sorted array), whereas randomized quicksort is fast on all inputs in expectation.
- They can improve memory usage: random sampling is a common way to sparsify the input and then work with this smaller input.
- In parallel/distributed computing, each machine only has a part of the data but still has to make decisions that affect global outcomes; randomization plays a key role in informing these decisions.

Types of Randomized Algorithms: Las Vegas algorithms and Monte Carlo algorithms.

Las Vegas 3

Las Vegas. A Las Vegas algorithm always produces a correct output; only its running time is random. The time complexity depends on the random choices and is analyzed as an expected value: correctness is deterministic, time complexity is probabilistic. The expected running time should be polynomial. Uses: improving performance (e.g., randomized quicksort) and searching in a solution space.

Quick Sort 4

Divide and Conquer. The design of quicksort is based on the divide-and-conquer paradigm. Divide: partition the array A[p..r] into two subarrays A[p..q-1] and A[q+1..r] such that A[x] <= A[q] for all x in [p..q-1] and A[x] > A[q] for all x in [q+1..r]. Conquer: recursively sort A[p..q-1] and A[q+1..r]. Combine: nothing to do here.

Deterministic QuickSort Algorithm. The problem: given an array A containing n (comparable) elements, sort them in increasing/decreasing order. Here the pivot element is chosen as either the leftmost or the rightmost element.
QSORT(A, p, r)
  if p < r then {
    q := PARTITION(A, p, r);
    QSORT(A, p, q-1);
    QSORT(A, q+1, r);
  }

Deterministic QuickSort Algorithm.
PARTITION(A, p, r)
  x := A[r]; i := p-1;
  for j := p to r-1 {
    if A[j] <= x then { i := i+1; swap(A[i], A[j]); }
  }
  swap(A[i+1], A[r]);
  return i+1;
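
As a concrete illustration, here is a minimal Python sketch of the deterministic version above (Lomuto-style partition with the rightmost element as the pivot, 0-based indices; the function names are chosen here for illustration):

```python
def partition(A, p, r):
    """Lomuto partition of A[p..r]: uses A[r] as the pivot and returns its final index."""
    x = A[r]
    i = p - 1
    for j in range(p, r):
        if A[j] <= x:
            i += 1
            A[i], A[j] = A[j], A[i]
    A[i + 1], A[r] = A[r], A[i + 1]
    return i + 1

def qsort(A, p, r):
    """Deterministic quicksort of A[p..r] (inclusive bounds)."""
    if p < r:
        q = partition(A, p, r)
        qsort(A, p, q - 1)
        qsort(A, q + 1, r)

data = [9, 3, 7, 1, 8, 2]
qsort(data, 0, len(data) - 1)
print(data)  # [1, 2, 3, 7, 8, 9]
```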

Deterministic QuickSort Algorithm. The running time depends on the PARTITION procedure. Each time PARTITION is called it selects a pivot element, and that pivot never appears in later calls, so there are at most n calls to PARTITION over the entire execution of quicksort. Each call takes constant time plus time proportional to the number of iterations of its for-loop. The running time of QUICKSORT is therefore O(n + X), where X is the number of comparisons performed in the loop of PARTITION over the whole execution.

Randomized QuickSort Algorithm. A useful concept: instead of a fixed corner element, pick a random position and use the element there as the pivot, then perform the usual partition step.

Randomized QuickSort Algorithm.
Randomized-Quicksort(A, p, r)
  if p < r then {
    q := Randomized-Partition(A, p, r);
    Randomized-Quicksort(A, p, q-1);
    Randomized-Quicksort(A, q+1, r);
  }
Randomized-Partition(A, p, r)
  i := Random(p, r);
  swap(A[i], A[r]);
  return Partition(A, p, r);
This is almost the same as the partition step of deterministic quicksort, but now the pivot is not the rightmost/leftmost element; instead it is an element of A[p..r] chosen uniformly at random.
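
A matching Python sketch of the randomized variant, reusing partition() from the previous sketch; only the choice of pivot changes:

```python
import random

def randomized_partition(A, p, r):
    """Pick a pivot index uniformly at random from p..r, move it to A[r],
    then run the ordinary partition step."""
    i = random.randint(p, r)          # inclusive on both ends
    A[i], A[r] = A[r], A[i]
    return partition(A, p, r)         # partition() as sketched above

def randomized_quicksort(A, p, r):
    if p < r:
        q = randomized_partition(A, p, r)
        randomized_quicksort(A, p, q - 1)
        randomized_quicksort(A, q + 1, r)
```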

Randomized QuickSort Algorithm (illustration). Think of the array as positions 1, 2, 3, ..., n with their values. A position is picked uniformly at random; suppose its value m becomes the pivot. The chosen element is swapped with the last element, and partitioning then proceeds exactly as before.

Randomized QuickSort Algorithm. Goal: the running time of quicksort depends mostly on the number of comparisons performed in all calls to the Randomized-Partition routine. Let X denote the random variable counting the number of comparisons in all calls to Randomized-Partition; we want to bound its expected value E[X].
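
The slides do not carry out the bound, but the standard argument, sketched here in the usual notation (z_1 < ... < z_n are the input elements in sorted order and X_{ij} indicates whether z_i and z_j are ever compared), gives E[X] = O(n log n):

```latex
% z_i and z_j are compared iff one of them is the first pivot chosen among z_i,...,z_j:
\Pr[z_i \text{ compared with } z_j] = \frac{2}{j-i+1}
\qquad\Longrightarrow\qquad
\mathbb{E}[X] = \sum_{i=1}^{n-1}\sum_{j=i+1}^{n} \frac{2}{j-i+1}
  \;\le\; \sum_{i=1}^{n-1}\sum_{k=1}^{n} \frac{2}{k}
  \;=\; O(n \log n).
```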

What was the main problem with deterministic quicksort? Suppose we are given an already sorted array 1, 2, ..., n and run quicksort with a corner (leftmost or rightmost) element as the pivot. Each partition step then fixes just one element in its final position and recurses on the remaining n-1 elements, then n-2, and so on. The algorithm does not really divide and conquer: every step scans all remaining elements, giving roughly n + (n-1) + ... + 1 comparisons, the O(n^2) worst case.

What happens in the case of randomized quicksort? Take the array 1 2 3 4 5 6 and pick a random position, say the one holding 4. Swap(A[i], A[r]) moves the chosen element to the end, giving 1 2 3 6 5 4, and then the usual Partition procedure is performed with 4 as the pivot.

Continuing the example, Partition scans the array 1 2 3 6 5 4 with i starting at p-1 and j moving left to right, comparing each A[j] with the pivot x = 4: the elements 1, 2 and 3 satisfy A[j] <= x and stay in the low side, while 6 and 5 do not (A[j] <= x? No) and are skipped. Finally the pivot is swapped into place, yielding 1 2 3 4 5 6 with the pivot 4 at its final position and the two sides roughly balanced.

Comparison. Deterministic quicksort: best case O(n log n), worst case O(n^2). Randomized quicksort: expected running time O(n log n) on every input, and the same bound holds for the expected worst case. In the worst case the random choice can pick a corner element as the pivot every time, but it is very unlikely to do so.

Average runtime vs Expected runtime. The average runtime of a deterministic algorithm is its runtime averaged over all inputs (of a given size). The expected runtime of a randomized algorithm is the expected value of its runtime, viewed as a random variable, on a fixed input; it effectively averages over all sequences of random numbers.
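
In symbols (the notation below is assumed here, not taken from the slides): for a deterministic algorithm A over the set I_n of inputs of size n, versus a randomized algorithm run on a single fixed input x with random bits R,

```latex
T_{\text{avg}}(n) = \frac{1}{|I_n|}\sum_{x \in I_n} T_A(x),
\qquad
T_{\text{exp}}(x) = \mathbb{E}_R\!\bigl[T_A(x, R)\bigr] = \sum_{r} \Pr[R = r]\, T_A(x, r).
```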

Smallest Enclosing Circle 5

Smallest Enclosing Circle. Problem definition: given n points in the plane, compute the smallest-radius circle that encloses all n points. Also known as the minimum covering circle problem or the bounding circle problem.

Smallest Enclosing Circle. Applications: the facility location problem (1-center problem). Best deterministic algorithm [Nimrod Megiddo, 1983]: O(n) time complexity, but quite complex and based on advanced geometry. Randomized Las Vegas algorithm [Emo Welzl, 1991]: O(n) expected time, very simple, uses only elementary geometry. The algorithm is recursive and is based on a linear programming algorithm of Raimund Seidel.

Smallest Enclosing Circle 32

Monte Carlo 1

Monte Carlo. A Monte Carlo algorithm may produce an incorrect answer, but we are able to bound the probability of that happening. By running the algorithm many times with independent random choices, we can make the failure probability arbitrarily small at the expense of running time. E.g., the randomized min-cut algorithm.

Monte Carlo Example. Suppose we want to find a number among n given numbers that is larger than or equal to the median. Suppose A_1 < ... < A_n; we want some A_i with i >= n/2. Clearly the best deterministic algorithm needs O(n) time to produce the answer, and n may be very large, say n = 100,000,000,000. Randomized approach: choose 100 of the numbers uniformly at random, find the maximum among them, and return that maximum.

Monte Carlo Example. The running time of this algorithm is O(1), since it inspects only 100 numbers. The probability of failure is about 1/2^100: the answer is wrong only if every sampled number lies in the lower half. So the algorithm may return a wrong answer, but the probability of this is far smaller than that of a hardware failure or even an earthquake.
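
A minimal Python sketch of this Monte Carlo idea; the sample size of 100 and the function name are illustrative:

```python
import random

def probably_at_least_median(A, samples=100):
    """Return an element of A that is >= the median with probability
    about 1 - (1/2)**samples; runs in O(samples) time, independent of len(A)."""
    picks = [random.choice(A) for _ in range(samples)]
    return max(picks)

A = list(range(1_000_000))
# Wrong only if every sampled element falls in the lower half of A.
print(probably_at_least_median(A))
```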

Michael’s Algorithms: Closest Pair of Points

Closest Pair of Points. Problem statement: given an array of n points in the plane, find the closest pair of points in the array. The distance between two points p and q is |pq| = sqrt((p_x - q_x)^2 + (p_y - q_y)^2).

Algorithm. Input: an array of n points P[]. Output: the smallest distance between two points in the given array. Example input (point labels): P[] = {0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17}.

Algorithm (cont.).
1. As a preprocessing step, sort the array by x-coordinate: P[] = {13, 12, 11, 0, 14, 16, 1, 10, 17, 9, 2, 15, 3, 8, 4, 5, 7, 6}. Find the middle point of the sorted array; we can take P[n/2] as the middle point.
2. Divide the array into two halves: the first subarray contains the points P[0] to P[n/2] and the second subarray contains the points P[n/2+1] to P[n-1]. Left half = {13, 12, 11, 0, 14, 16, 1, 10, 17}, right half = {9, 2, 15, 3, 8, 4, 5, 7, 6}.

Algorithm (cont.).
3. Recursively find the smallest distance within each of the two subarrays; let the distances be dl and dr, and let d be their minimum: d = min(dl, dr).
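
Beyond step 3, a full solution also has to check pairs that straddle the dividing line (points within distance d of it), a step not spelled out on the slides. A hedged Python sketch of the whole divide-and-conquer procedure (an O(n log^2 n) variant that re-sorts the strip by y-coordinate each time, for simplicity):

```python
import math

def dist(p, q):
    return math.hypot(p[0] - q[0], p[1] - q[1])

def brute_force(pts):
    best = float("inf")
    for i in range(len(pts)):
        for j in range(i + 1, len(pts)):
            best = min(best, dist(pts[i], pts[j]))
    return best

def closest_pair(points):
    """Smallest pairwise distance among a list of (x, y) tuples."""
    px = sorted(points)                  # step 1: pre-sort by x-coordinate
    return _closest(px)

def _closest(px):
    n = len(px)
    if n <= 3:
        return brute_force(px)
    mid = n // 2
    midx = px[mid][0]
    d = min(_closest(px[:mid]), _closest(px[mid:]))   # dl, dr -> d
    # Check pairs that straddle the dividing vertical line x = midx.
    strip = sorted((p for p in px if abs(p[0] - midx) < d), key=lambda p: p[1])
    for i in range(len(strip)):
        for j in range(i + 1, len(strip)):
            if strip[j][1] - strip[i][1] >= d:
                break
            d = min(d, dist(strip[i], strip[j]))
    return d

print(closest_pair([(0, 0), (3, 4), (1, 1), (7, 7), (0.5, 0.9)]))
```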

Our knowledge: the insertion time depends on whether the closest pair is changed or not. If the output stays the same: 1 clock tick. If the output changes: |D| clock ticks, where D is the data structure maintaining the current closest pair. With a random insertion order, one can show that the expected total number of clock ticks used by D is O(n).

Examples of Randomized Algorithms

Minimum Cut. The min-cut of a weighted graph is the minimum total weight of a set of (at least one) edges whose removal divides the graph into two groups. The contraction algorithm works by shrinking the graph until only two super-vertices are left; over many runs, the minimum value in the resulting list of cut sizes is taken as the min-cut value of the graph.

Minimum Cut (cont.). In each step an edge is selected, and according to this edge the next contraction move is made in the graph. Some points to keep in mind when working with min-cut: a cut of a connected graph is obtained by dividing the vertex set V of graph G into two sets U and W. There are no common vertices in U and W, that is, the two sets are disjoint and U ∪ W = V.

Minimum Cut (cont.). Algorithm:
1. Repeat steps 2 to 4 until only two vertices are left.
2. Pick an edge e = (u, v) at random.
3. Merge u and v.
4. Remove self-loops from E.
5. Return |E|, the number of edges remaining between the two super-vertices.
(The example below uses a graph with vertices a, b, c, d, e, f.)

Example run on the graph with vertices {a, b, c, d, e, f}: random edges are contracted one after another (e.g., b merged with d, e merged with f, then a merged into one super-vertex and c into the other) until only two super-vertices remain; the edges still running between them form the candidate cut.

Minimum Cut. Problem definition: given a connected graph G = (V, E) on n vertices and m edges, compute the smallest set of edges whose removal makes G disconnected. Best deterministic algorithm [Stoer and Wagner, 1997]: O(mn) time complexity. Randomized Monte Carlo algorithm [Karger, 1993]: O(m log n) time complexity; by repeating the contraction procedure, the error probability can be made as small as we desire (e.g., at most 1/n^c for any constant c).
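
For illustration, a minimal Python sketch of Karger's contraction algorithm on an unweighted multigraph, using a union-find structure to represent merged super-vertices; the 6-vertex edge list at the end is an assumed reconstruction of the example graph, since the slide figures are not reproduced here:

```python
import random

def karger_min_cut(edges, trials=100):
    """Monte Carlo sketch of Karger's contraction algorithm.
    `edges` is a list of (u, v) pairs of an undirected multigraph.
    Returns the smallest cut size seen over `trials` independent runs."""
    return min(_contract_once(edges) for _ in range(trials))

def _contract_once(edges):
    # Union-find with path halving: each root represents one super-vertex.
    parent = {v: v for e in edges for v in e}
    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    remaining = len(parent)
    while remaining > 2:
        u, v = random.choice(edges)       # uniform over the original edge list
        ru, rv = find(u), find(v)
        if ru != rv:                      # skip self-loops of contracted vertices
            parent[rv] = ru               # contract: merge the two super-vertices
            remaining -= 1
    # Edges whose endpoints lie in different super-vertices form the cut.
    return sum(1 for u, v in edges if find(u) != find(v))

# Hypothetical edge set for the slide's a..f example graph:
g = [("a", "b"), ("a", "d"), ("b", "d"), ("b", "c"),
     ("c", "e"), ("d", "e"), ("e", "f"), ("c", "f")]
print(karger_min_cut(g))
```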

Applications of the Minimum Cut Algorithm: partitioning items in a database, identifying clusters of related documents, network reliability, network design, circuit design, etc.

Smallest Enclosing Disk. The problem: given a set of n points P = {p_1, p_2, ..., p_n} in 2D, compute a disk of minimum radius that contains all the points in P. Trivial solution: consider each triple of points p_i, p_j, p_k in P and check whether every other point of P lies inside the circle defined by p_i, p_j, p_k; time complexity O(n^4). An easily implementable efficient solution: build the furthest-point Voronoi diagram; each of its vertices represents a circle containing all the points in P, so choose the one with minimum radius. Time complexity: O(n log n).

Goal: compute a disk containing k of the points whose radius is at most 2 * r_opt, where r_opt is the smallest radius of a disk containing k points. k = 2 means the problem is the closest pair problem. It is a simple form of clustering. (The figure illustrates k = 5.)

Smallest Enclosing Disk (cont.). A simple randomized algorithm: generate a random permutation of the points in P and add them one by one (an incremental procedure). Notation: P_i = {p_1, p_2, ..., p_i}; D_i = the smallest enclosing disk of P_i. Result: if p_i is in D_{i-1} then D_i = D_{i-1}; if p_i is not in D_{i-1} then p_i lies on the boundary of D_i.

Smallest Enclosing Disk (cont.). Algorithm MINIDISC(P). Input: a set P of n points in the plane. Output: the smallest enclosing disk for P.
1. Compute a random permutation p_1, p_2, ..., p_n of P.
2. Let D_2 be the smallest enclosing disk for {p_1, p_2}.
3. for i = 3 to n do: if p_i is in D_{i-1} then D_i := D_{i-1}, else D_i := MINIDISCWITHPOINT({p_1, ..., p_{i-1}}, p_i).
4. Return D_n.

Smallest Enclosing Disk (cont.). Algorithm MINIDISCWITHPOINT(P, q). Idea: incrementally add the points of P one by one and compute the smallest enclosing circle under the assumption that the point q (the second parameter) lies on the boundary. Input: a set of points P and another point q. Output: the smallest enclosing disk for P with q on its boundary.

Smallest Enclosing Disk (cont.). Algorithm MINIDISCWITH2POINTS(P, q_1, q_2). Idea: now two points are fixed on the boundary, so we only need to choose one more point from P \ {q_1, q_2} to determine the smallest enclosing disk containing P.
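
A hedged Python sketch of the three routines MINIDISC, MINIDISCWITHPOINT and MINIDISCWITH2POINTS described above, assuming at least two input points and glossing over numerical robustness; the helper names and tolerance values are illustrative choices:

```python
import math
import random

def _circle_two(p, q):
    # Smallest circle with p and q on the boundary: the diameter circle.
    cx, cy = (p[0] + q[0]) / 2.0, (p[1] + q[1]) / 2.0
    return (cx, cy, math.dist(p, q) / 2.0)

def _circle_three(a, b, c):
    # Circumcircle of three points; falls back to a pair circle if (nearly) collinear.
    ax, ay = a; bx, by = b; cx, cy = c
    d = 2.0 * (ax * (by - cy) + bx * (cy - ay) + cx * (ay - by))
    if abs(d) < 1e-12:
        return max((_circle_two(a, b), _circle_two(b, c), _circle_two(a, c)),
                   key=lambda circ: circ[2])
    ux = ((ax**2 + ay**2) * (by - cy) + (bx**2 + by**2) * (cy - ay)
          + (cx**2 + cy**2) * (ay - by)) / d
    uy = ((ax**2 + ay**2) * (cx - bx) + (bx**2 + by**2) * (ax - cx)
          + (cx**2 + cy**2) * (bx - ax)) / d
    return (ux, uy, math.dist((ux, uy), a))

def _inside(circ, p, eps=1e-9):
    return math.dist((circ[0], circ[1]), p) <= circ[2] + eps

def minidisc(points):
    """Randomized incremental smallest enclosing disk, expected O(n) time."""
    P = list(points)
    random.shuffle(P)                      # random permutation
    D = _circle_two(P[0], P[1])
    for i in range(2, len(P)):
        if not _inside(D, P[i]):
            D = _minidisc_with_point(P[:i], P[i])
    return D

def _minidisc_with_point(P, q):
    random.shuffle(P)
    D = _circle_two(P[0], q)
    for j in range(1, len(P)):
        if not _inside(D, P[j]):
            D = _minidisc_with_2points(P[:j], P[j], q)
    return D

def _minidisc_with_2points(P, q1, q2):
    D = _circle_two(q1, q2)
    for k in range(len(P)):
        if not _inside(D, P[k]):
            D = _circle_three(P[k], q1, q2)
    return D

print(minidisc([(0, 0), (2, 0), (1, 3), (1, 1)]))   # (center_x, center_y, radius)
```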

Time Complexity. MINIDISCWITH2POINTS needs O(n) time. MINIDISCWITHPOINT needs O(n) time if we do not count the time spent in its calls to MINIDISCWITH2POINTS. Worst case of the whole algorithm: O(n^3). Expected case: O(n).

Primality Testing. Prime number: an integer greater than 1 that is divisible only by 1 and itself, e.g., 2, 3, 5, 7, 11, ... Naive algorithm: for i = 2 to N-1 { if (N mod i == 0) return -1 } return 1. Time complexity: O(N). Mathematical result used next: if N is composite, then it has a divisor in {2, ..., floor(sqrt(N))}.

Primality Testing (cont.). Improved algorithm: for i = 2 to floor(sqrt(N)) { if (N mod i == 0) return -1 } return 1. Time complexity: O(sqrt(N)).
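
A small Python sketch of trial division up to floor(sqrt(N)):

```python
import math

def is_prime_trial(n):
    """Deterministic primality check by trial division up to sqrt(n): O(sqrt(n)) divisions."""
    if n < 2:
        return False
    for i in range(2, math.isqrt(n) + 1):
        if n % i == 0:
            return False
    return True

print([x for x in range(2, 30) if is_prime_trial(x)])  # 2, 3, 5, 7, 11, ...
```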

Primality Testing (cont.). Fermat's little theorem: if n is prime, then a^(n-1) ≡ 1 (mod n) for every integer a with 1 <= a < n. Examples: n = 3, a = 2: 2^2 mod 3 = 4 mod 3 = 1. n = 5, a = 2: 2^4 mod 5 = 16 mod 5 = 1. n = 5, a = 3: 3^4 mod 5 = 81 mod 5 = 1.

Primality Testing (cont.). Randomized (Fermat-based) test. Input: a number n to test and a sufficiently large number k of random bases a < n. Algorithm:
for i = 1 to k do {
  a := random integer in {1, ..., n-1}
  z := a^(n-1) mod n
  if (z != 1) return False   // certainly composite
}
return True   // probably prime
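
A minimal Python sketch of this Fermat-based Monte Carlo test (one known caveat, not mentioned on the slides: Carmichael numbers can fool the plain Fermat test; the Miller-Rabin variant cited on the next slide avoids this):

```python
import random

def fermat_is_probably_prime(n, k=50):
    """Monte Carlo Fermat test with k random bases.
    'False' (composite) is always correct; 'True' may occasionally be wrong."""
    if n < 4:
        return n in (2, 3)
    for _ in range(k):
        a = random.randint(2, n - 2)
        if pow(a, n - 1, n) != 1:      # fast modular exponentiation
            return False                # definitely composite
    return True                         # probably prime

print(fermat_is_probably_prime(101), fermat_is_probably_prime(100))  # True False
```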

Primality Testing. Applications: the RSA cryptosystem, algebraic algorithms. Best deterministic algorithm [Agrawal, Kayal and Saxena, 2002]: runs in time polynomial in the number of digits of n. Randomized Monte Carlo algorithm [Rabin, 1980]: O(k) modular exponentiations, i.e., roughly O(k log^3 n) time for k rounds. Error probability: can be made as small as we desire by increasing k; for the Miller-Rabin test it is at most 4^(-k), so for k = 50 it is below 2^(-100).

Randomized Data Structure - Bloom Filter

A Bloom filter is a randomized data structure for fast searching. Keys are represented in compressed form as a bit array. Two operations are supported, Insert and Search. Search returns YES (present) or NO (not present): a NO answer is always correct, while a YES answer is correct only with some probability (false positives are possible). This is similar to a Monte Carlo algorithm with one-sided error. Bloom filters have many practical applications in networking, content search, etc.

Bloom Filter Operation. A bit array A[0..m-1] of size m; initially all bits are set to 0. A set of k random and independent hash functions h_0, h_1, ..., h_{k-1}, each producing a hash value between 0 and m-1.
Insert key x: compute y_i = h_i(x) for i = 0, 1, ..., k-1 and set A[y_i] = 1 for each i; the tuple (y_0, y_1, ..., y_{k-1}) is called the signature of x.
Search for key x: compute y_i = h_i(x) for i = 0, 1, ..., k-1; answer YES if A[y_i] = 1 for all i, NO otherwise.

A NO answer is always correct: if x had been inserted, the corresponding bits A[h_i(x)] would have been set to 1 for all i. A YES answer may be wrong: the bits may have been set to 1 by inserts of other keys. Note that in that case all of them must have been set by inserts of other keys, possibly by more than one key, each setting some of the bits.
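
A minimal Python sketch of these operations; deriving the k hash functions from SHA-256 with different salts is an illustrative choice here, not something the slides specify:

```python
import hashlib

class BloomFilter:
    """Minimal Bloom filter: m-bit array and k salted SHA-256 hash functions."""
    def __init__(self, m=1024, k=4):
        self.m, self.k = m, k
        self.bits = [0] * m

    def _positions(self, key):
        # The k bit positions (y_0, ..., y_{k-1}) form the signature of the key.
        for i in range(self.k):
            h = hashlib.sha256(f"{i}:{key}".encode()).hexdigest()
            yield int(h, 16) % self.m

    def insert(self, key):
        for pos in self._positions(key):
            self.bits[pos] = 1

    def search(self, key):
        # NO is always correct; YES may be a false positive.
        return all(self.bits[pos] for pos in self._positions(key))

bf = BloomFilter()
bf.insert("apple")
print(bf.search("apple"), bf.search("banana"))   # True, (almost surely) False
```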

Example: insertion and search of keys in the bit array (illustrated with figures on the original slides).

Classifying Randomized Algorithms by Their Methods. Avoiding worst-case inputs: the algorithm's random choices are hidden from the adversary; since the algorithm is effectively chosen at random from a family, the adversary cannot pick an input that is bad for all of them. Sampling: randomness is used to choose a simple random sample, without replacement, of k items from a population of unknown size n in a single pass over the items; in this way the adversary cannot steer us toward non-representative samples.

Classifying Randomized Algorithms by Their Methods. Hashing: a hash function is selected at random from a family of hash functions; this guarantees a low number of collisions in expectation, even if the data is chosen by an adversary. Building random structures: randomized constructions can produce structures with the desired properties with substantial (non-zero) probability. Symmetry breaking: randomization can break deadlocks in which multiple processes would otherwise keep blocking each other's progress.

Advantages of Randomized Algorithms: the algorithm is usually simple and easy to implement; it is fast with very high probability; and it produces optimum output with very high probability.

Difficulties with Randomized Algorithms. There is a finite probability of getting an incorrect answer; however, this probability can be made arbitrarily small by repeating the algorithm with independent random choices. Analysis of the running time, or of the probability of getting a correct answer, is usually difficult. Getting truly random numbers is impossible, so one has to depend on pseudo-random numbers; the result therefore depends heavily on the quality of the random number generator used in the algorithm. A further practical concern is the possibility of hardware failure.

Application & Scope

Applications. A tool for sorting: with randomized quicksort no user always gets the worst case; everybody gets expected O(n log n) time. Cryptography: randomized algorithms have huge applications in cryptography, e.g., the RSA cryptosystem. Load balancing. Number-theoretic applications: primality testing. Data structures: hashing, sorting, searching, order statistics and computational geometry. Algebraic identities: polynomial and matrix identity verification. Interactive proof systems.

Applications (cont.). Mathematical programming: faster algorithms for linear programming; rounding linear program solutions to integer program solutions. Graph algorithms: minimum spanning trees, shortest paths, minimum cuts. Counting and enumeration: matrix permanent, counting combinatorial structures. Parallel and distributed computing: deadlock avoidance, distributed consensus. Probabilistic existence proofs: show that a combinatorial object exists because it arises with non-zero probability among objects drawn from a suitable probability space.

Thank You. Any Questions?