Matrix chain multiplication in design analysis of algorithm

957 views 49 slides Mar 16, 2024
Slide 1
Slide 1 of 49
Slide 1
1
Slide 2
2
Slide 3
3
Slide 4
4
Slide 5
5
Slide 6
6
Slide 7
7
Slide 8
8
Slide 9
9
Slide 10
10
Slide 11
11
Slide 12
12
Slide 13
13
Slide 14
14
Slide 15
15
Slide 16
16
Slide 17
17
Slide 18
18
Slide 19
19
Slide 20
20
Slide 21
21
Slide 22
22
Slide 23
23
Slide 24
24
Slide 25
25
Slide 26
26
Slide 27
27
Slide 28
28
Slide 29
29
Slide 30
30
Slide 31
31
Slide 32
32
Slide 33
33
Slide 34
34
Slide 35
35
Slide 36
36
Slide 37
37
Slide 38
38
Slide 39
39
Slide 40
40
Slide 41
41
Slide 42
42
Slide 43
43
Slide 44
44
Slide 45
45
Slide 46
46
Slide 47
47
Slide 48
48
Slide 49
49

About This Presentation

v


Slide Content

Matrix Chain Multiplication

Matrix-chain Multiplication Suppose we have a sequence or chain A 1 , A 2 , …, A n of n matrices to be multiplied That is, we want to compute the product A 1 A 2 …A n There are many possible ways ( parenthesizations ) to compute the product.

Matrix-chain Multiplication To compute the number of scalar multiplications necessary, we must know: Algorithm to multiply two matrices Matrix dimensions Can you write the algorithm to multiply two matrices?

Algorithm to Multiply 2 Matrices Input : Matrices A p × q and B q × r (with dimensions p × q and q × r ) Result : Matrix C p × r resulting from the product A · B MATRIX-MULTIPLY ( A p × q , B q × r ) 1. for i ← 1 to p 2. for j ← 1 to r 3. C [ i , j ] ← 0 4. for k ← 1 to q 5. C [ i , j ] ← C [ i , j ] + A [ i , k ] · B [ k , j ] 6. return C Scalar multiplication in line 5 dominates time to compute C Number of scalar multiplications = pqr

Matrix Chain Multiplication A - k  m matrix B - m  n matrix C = A  B is defined as k  n size matrix with elements 2 1 3 0 1 1 1 2 1 2 -1 0 9 1  = 2 2

Matrix Chain Multiplication A - k  m matrix B - m  n matrix C = A  B Each element of C can be computed in time  (m) Matrix C can be computed in time  ( kmn ) Matrix multiplication is associative, i.e. A  (B  C) = (A  B)  C

Matrix Chain Multiplication A - 2  10000 matrix B - 10000  5 matrix C - 5  50 matrix A  (B  C) (B  C) - = 10000*5*50 = 2500000 scalar multiplications A  (B  C) - = 2*10000*50 = 1000000 scalar multiplications 3500000 scalar multiplications (A  B)  C (A  B) - = 2*10000*5 =100000 scalar multiplications (A  B)  C - = 2*5*50 =500 scalar multiplications 100500 scalar multiplications How we parenthesize a chain of matrices can have a dramatic impact on the cost of evaluating the product.

Matrix Chain Multiplication - Problem Problem For a given sequence of matrices find a parenthesization which gives the smallest number of multiplications that are necessary to compute the product of all matrices in the given sequence.

Counting Number of Parenthesizations

Catalan Numbers Multiplying n Numbers Objective: Find C(n), the number of ways to compute product x 1 . x 2 …. x n . n multiplication order 2 (x 1 · x 2 ) 3 (x 1 · (x 2 · x 3 )) ((x 1 · x 2 ) · x 3 ) 4 (x 1 · (x 2 · (x 3 · x 4 ))) (x 1 · ((x 2 · x 3 ) · x 4 )) ((x 1 · x 2 ) · (x 3 · x 4 )) ((x 1 · (x 2 · x 3 )) · x 4 ) (((x 1 · x 2 ) · x 3 ) · x 4 ) n C(n) 1 1 2 1 3 2 4 5 5 14 6 42 7 132

Counting the Number of Parenthesizations

Number of Parenthesizations

Recursive equation: where is the last multiplication? Catalan numbers: Asymptotic value : Multiplying n Numbers - small n

Applying Dynamic Programming Characterize the structure of an optimal solution Recursively define the value of an optimal solution Compute the value of an optimal solution in a bottom-up fashion Construct an optimal solution from the information computed in Step 3

Matrix-Chain Multiplication Given a chain of matrices A 1 , A 2 , …, A n  , where for i = 1, 2, …, n matrix A i has dimensions p i-1 x p i , fully parenthesize the product A 1  A 2  A n in a way that minimizes the number of scalar multiplications. A 1  A 2  A i  A i+1  A n p x p 1 p 1 x p 2 p i-1 x p i p i x p i+1 p n-1 x p n

1. The Structure of an Optimal Parenthesization Step 1 : Characterize the structure of an optimal solution A i..j : matrix that results from evaluating the product A i A i +1 A i +2 ... A j , i ≤ j An optimal parenthesization of the product A 1 A 2 ... A n – Splits the product between A k and A k + 1 , for some 1 ≤ k < n A i..j = (A 1 A 2 A 3 ... A k ) · (A k +1 A k +2 ... A n ) – i.e., first compute A 1.. k and A k +1.. n and then multiply these two The cost of this optimal parenthesization Cost of computing A 1.. k + Cost of computing A k +1.. n + Cost of multiplying A 1.. k · A k +1.. n

A i…j = A i…k A k+1…j Key observation : Given optimal parenthesization (A 1 A 2 A 3 ... A k ) · (A k +1 A k +2 ... A n ) Parenthesization of the sub-chain A 1 A 2 … A k Parenthesization of the sub-chain A k +1 A k +2 …A n should be optimal if the parenthesization of the chain A 1 A 2 …A n is optimal (why?) 1. The Structure of an Optimal Parenthesization… That is, the optimal solution to the problem contains within it the optimal solution to sub-problems i.e . optimal substructure within optimal solution exists.

2. A Recursive Solution Step 2 : Define the value of optimal solution recursively Subproblem : Determine the minimum cost of parenthesizing A i…j = A i A i+1  A j for 1  i  j  n Let m[ i , j] = the minimum number of multiplications needed to compute A i…j Full problem (A 1..n ): m[1, n]

2. A Recursive Solution …. Define m recursively: i = j: A i… i = A i  m[ i , i ] = 0 , for i = 1, 2, …, n , since the sub-chain contain just one matrix; no multiplication at all. i < j: assume that the optimal parenthesization splits the product A i A i+1  A j between A k and A k+1 , i  k < j A i…j = A i…k A k+1…j m[ i , j] = m[ i , k] + m[k+1, j] + p i-1 p k p j A i…k A k+1…j A i…k A k+1…j

2. A Recursive Solution ….. Consider the subproblem of parenthesizing A i…j = A i A i+1  A j for 1  i  j  n = A i…k A k+1…j for i  k < j m[ i , j] = the minimum number of multiplications needed to compute the product A i…j m[ i , j] = m[ i , k] + m[k+1, j] + p i-1 p k p j min # of multiplications to compute A i…k # of multiplications to compute A i… k A k …j min # of multiplications to compute A k+1…j m[i, k] m[k+1,j] p i-1 p k p j

2. A Recursive Solution …. m[ i , j] = m[ i , k] + m[k+1, j] + p i-1 p k p j We do not know the value of k There are j – i possible values for k: k = i , i+1, …, j-1 Minimizing the cost of parenthesizing the product A i A i+1  A j becomes: if i = j m[ i , j] = min {m[ i , k] + m[k+1, j] + p i-1 p k p j } if i < j ik <j

Reconstructing the Optimal Solution Additional information to maintain: s[ i , j] = a value of k at which we can split the product A i A i+1  A j in order to obtain an optimal parenthesization

3. Computing the Optimal Costs if i = j m[ i , j] = min {m[ i , k] + m[k+1, j] + p i-1 p k p j } if i < j ik <j A recurrent algorithm may encounter each subproblem many times in different branches of the recursion  overlapping subproblems Compute a solution using a tabular bottom-up approach

3. Computing the Optimal Costs … if i = j m[ i , j] = min {m[ i , k] + m[k+1, j] + p i-1 p k p j } if i < j ik <j Length = 0: i = j, i = 1, 2, …, n Length = 1: j = i + 1, i = 1, 2, …, n-1 1 1 2 3 n 2 3 n first second Compute rows from bottom to top and from left to right In a similar matrix s we keep the optimal values of k m[1, n] gives the optimal solution to the problem i j

Example: min {m[ i , k] + m[k+1, j] + p i-1 p k p j } m[2, 2] + m[3, 5] + p 1 p 2 p 5 m[2, 3] + m[4, 5] + p 1 p 3 p 5 m[2, 4] + m[5, 5] + p 1 p 4 p 5 1 1 2 3 6 2 3 6 i j 4 5 4 5 m[2, 5] = min Values m[i, j] depend only on values that have been previously computed k = 2 k = 3 k = 4

Example Compute A 1  A 2  A 3 A 1 : 10 x 100 (p x p 1 ) A 2 : 100 x 5 (p 1 x p 2 ) A 3 : 5 x 50 (p 2 x p 3 ) m[ i , i] = 0 for i = 1, 2, 3 m[1, 2] = m[1, 1] + m[2, 2] + p p 1 p 2 (A 1 A 2 ) = 0 + 0 + 10 *100* 5 = 5,000 m[2, 3] = m[2, 2] + m[3, 3] + p 1 p 2 p 3 (A 2 A 3 ) = 0 + 0 + 100 * 5 * 50 = 25,000 m[1, 3] = min m[1, 1] + m[2, 3] + p p 1 p 3 = 75,000 (A 1 (A 2 A 3 )) m[1, 2] + m[3, 3] + p p 2 p 3 = 7,500 ((A 1 A 2 )A 3 ) 1 1 2 2 3 3 5000 1 25000 2 7500 2

n  length[p] – 1 for i  1 to n do m[ i , i ]  0 for l  2 to n, do for i  1 to n-l+1 do j  i+l-1 m[ i , j]   for k  i to j-1 do q  m[ i , k] + m[k+1, j] + p i-1 . p k . p j if q < m[ i , j] then m[ i , j] = q s[ i , j]  k return m and s, “l is chain length” Algorithm Chain-Matrix-Order(p) m[1,4] m[2,4] m[3,4] m[4,4] m[1,3] m[2,3] m[3,3] m[1,2] m[2,2] m[1,1] 1 2 3 4 4 3 2 1

Problem : Compute optimal multiplication order for a series of matrices given below m[1,4] m[2,4] m[3,4] m[4,4] m[1,3] m[2,3] m[3,3] m[1,2] m[2,2] m[1,1] P = 10 P 1 = 100 P 2 = 5 P 3 = 50 P 4 = 20 Example: Dynamic Programming 1 2 3 4 4 3 2 1

Main Diagonal m[1, 1] = 0 m[2, 2] = 0 m[3, 3] = 0 m[4, 4] = 0 Main Diagonal m[1,4] m[2,4] m[3,4] m[1,3] m[2,3] m[1,2] 1 2 3 4 4 3 2 1

m[3, 4] = 0 + 0 + 5 . 50 . 20 = 5000 s[3, 4] = k = 3 Computing m[3,4] , 4] m[1,4] m[2,4] 3 5000 m[1,3] m[2,3] m[1,2] 1 2 3 4 4 3 2 1

m[2, 3] = 0 + 0 + 100 . 5 . 50 = 25000 s[2, 3] = k = 2 Computing m[2, 3] m[1,4] m[2,4] 3 5000 m[1,3] 2 25000 m[1,2] 1 2 3 4 4 3 2 1

m[1, 2] = 0 + 0 + 10 . 100 . 5 = 5000 s[1, 2] = k = 1 Computing m[1, 2] m[1,4] m[2,4] 3 5000 m[1,3] 2 25000 1 5000 1 2 3 4 4 3 2 1

m[2, 4] = min(0+5000+100.5.20, 25000+0+100.50.20) = min(15000, 35000) = 15000 s[2, 4] = k = 2 Computing m[2, 4] m[1,4] 2 15000 3 5000 m[1,3] 2 25000 1 5000 1 2 3 4 4 3 2 1

m[1, 3] = min(0+25000+10.100.50, 5000+0+10.5.50) = min(75000, 2500) = 2500 s[1, 3] = k = 2 Computing m[1, 3] m[1,4] 2 15000 3 5000 2 2500 2 25000 1 5000 1 2 3 4 4 3 2 1

m[1, 4] = min(0+15000+10.100.20, 5000+5000+ 10.5.20, 2500+0+10.50.20) = min(35000, 11000, 35000) = 11000 s[1, 4] = k = 2 Computing m[1, 4] 2 11000 2 15000 3 5000 2 2500 2 25000 1 5000 4 3 2 1

Final Cost Matrix Order of Computation Final Cost Matrix and Its Order of Computation 2 11000 2 15000 3 5000 2 2500 2 25000 1 5000 4 3 2 1 1 2 3 4 10 8 5 4 9 6 3 7 2 1 4 3 2 1 1 2 3 4

2 2 3 2 2 1 K,s Values Leading Minimum m[ i , j] 1 2 3 4 4 3 2 1

l = 2 10*20*25=5000 35*15*5=2625 l =3 m[3,5] = min m[3,4]+m[5,5] + 15*10*20 =750 + 0 + 3000 = 3750 m[3,3]+m[4,5] + 15*5*20 =0 + 1000 + 1500 = 2500

n  length[p] – 1 for i  1 to n do m[ i , i ]  0 for l  2 to n, do for i  1 to n-l+1 do j  i+l-1 m[ i , j]   for k  i to j-1 do q  m[ i , k] + m[k+1, j] + p i-1 . p k . p j if q < m[ i , j] then m[ i , j] = q s[ i , j]  k return m and s, “l is chain length” Analysis Chain-Matrix-Order(p) Takes O ( n 3 ) time Requires O ( n 2 ) space

11- 40 Our algorithm computes the minimum-cost table m and the split table s The optimal solution can be constructed from the split table s Each entry s [ i , j ]= k shows where to split the product A i A i +1 … A j for the minimum cost. 4. Construct the Optimal Solution

4. Construct the Optimal Solution … s[ i , j] = value of k such that the optimal parenthesization of A i A i+1  A j splits the product between A k and A k+1 3 3 3 5 5 - 3 3 3 4 - 3 3 3 - 1 2 - 1 - - 1 1 2 3 6 2 3 6 i j 4 5 4 5 s[1, n] = 3  A 1..6 = A 1..3 A 4..6 s[1, 3] = 1  A 1..3 = A 1..1 A 2..3 s[4, 6] = 5  A 4..6 = A 4..5 A 6..6 A 1..n = A 1..s[1, n]  A s[1, n]+1..n

4. Construct the Optimal Solution … 3 3 3 5 5 - 3 3 3 4 - 3 3 3 - 1 2 - 1 - - 1 1 2 3 6 2 3 6 i j 4 5 4 5 PRINT-OPT-PARENS( s, i, j ) if i = j then print “A” i else print “(” PRINT-OPT-PARENS( s, i, s[i, j] ) PRINT-OPT-PARENS( s, s[i, j] + 1, j ) print “)”

Example: A 1   A 6 3 3 3 5 5 - 3 3 3 4 - 3 3 3 - 1 2 - 1 - - 1 1 2 3 6 2 3 6 i j 4 5 4 5 PRINT-OPT-PARENS( s, i, j ) if i = j then print “A” i else print “(” PRINT-OPT-PARENS( s, i, s[i, j] ) PRINT-OPT-PARENS( s, s[i, j] + 1, j ) print “)” P-O-P(s, 1, 6) s[1, 6] = 3 i = 1, j = 6 “(“ P-O-P (s, 1, 3) s[1, 3] = 1 i = 1, j = 3 “(“ P-O-P(s, 1, 1)  “A 1 ” P-O-P(s, 2, 3) s[2, 3] = 2 i = 2, j = 3 “(“ P-O-P (s, 2, 2)  “A 2 ” P-O-P (s, 3, 3)  “A 3 ” “)” “)” ( ( ( A 4 A 5 ) A 6 ) ) A 1 ( A 2 A 3 ) ) … ( s[1..6, 1..6]

Elements of Dynamic Programming When should we look for a DP solution to an optimization problem? Two key ingredients for the problem Optimal substructure Overlapping subproblems

Optimal Substructure

Optimal Substructure

Optimal Substructure

Overlapping Subproblems

Overlapping Subproblems