In linear algebra, the Strassen algorithm, named after Volker Strassen, is an algorithm for matrix multiplication. It is faster than the standard matrix multiplication algorithm for large matrices, with a better asymptotic complexity, although the naive algorithm is often better for smaller matrices...
In linear algebra, the Strassen algorithm, named after Volker Strassen, is an algorithm for matrix multiplication. It is faster than the standard matrix multiplication algorithm for large matrices, with a better asymptotic complexity, although the naive algorithm is often better for smaller matrices. The Strassen algorithm is slower than the fastest known algorithms for extremely large matrices, but such galactic algorithms are not useful in practice, as they are much slower for matrices of practical size. For small matrices even faster algorithms exist.
Strassen's algorithm works for any ring, such as plus/multiply, but not all semirings, such as min-plus or boolean algebra, where the naive algorithm still works, and so called combinatorial matrix multiplication.Let
A
{\displaystyle A},
B
{\displaystyle B} be two square matrices over a ring
R
{\displaystyle {\mathcal {R}}}, for example matrices whose entries are integers or the real numbers. The goal of matrix multiplication is to calculate the matrix product
C
=
A
B
{\displaystyle C=AB}. The following exposition of the algorithm assumes that all of these matrices have sizes that are powers of two (i.e.,
A
,
B
,
C
∈
Matr
2
n
×
2
n
(
R
)
{\displaystyle A,\,B,\,C\in \operatorname {Matr} _{2^{n}\times 2^{n}}({\mathcal {R}})}), but this is only conceptually necessary — if the matrices
A
{\displaystyle A},
B
{\displaystyle B} are not of type
2
n
×
2
n
{\displaystyle 2^{n}\times 2^{n}}, the "missing" rows and columns can be filled with zeros to obtain matrices with sizes of powers of two — though real implementations of the algorithm do not do this in practice.
The Strassen algorithm partitions
A
{\displaystyle A},
B
{\displaystyle B} and
C
{\displaystyle C} into equally sized block matrices
A
=
[
A
11
A
12
A
21
A
22
]
,
B
=
[
B
11
B
12
B
21
B
22
]
,
C
=
[
C
11
C
12
C
21
C
22
]
,
{\displaystyle A={\begin{bmatrix}A_{11}&A_{12}\\A_{21}&A_{22}\end{bmatrix}},\quad B={\begin{bmatrix}B_{11}&B_{12}\\B_{21}&B_{22}\end{bmatrix}},\quad C={\begin{bmatrix}C_{11}&C_{12}\\C_{21}&C_{22}\end{bmatrix}},\quad }
with
A
i
j
,
B
i
j
,
C
i
j
∈
Mat
2
n
−
1
×
2
n
−
1
(
R
)
{\displaystyle A_{ij},B_{ij},C_{ij}\in \operatorname {Mat} _{2^{n-1}\times 2^{n-1}}({\mathcal {R}})}. The naive algorithm would be:
[
C
11
C
12
C
21
C
22
]
=
[
A
11
×
B
11
+
A
12
×
B
21
A
11
×
B
12
+
A
12
×
B
22
A
21
×
B
11
+
A
22
×
B
21
A
21
×
B
12
+
A
22
×
B
22
]
.
{\displaystyle {\begin{bmatrix}C_{11}&C_{12}\\C_{21}&C_{22}\end{bmatrix}}={\begin{bmatrix}A_{11}{\color {red}\times }B_{11}+A_{12}{\color {red}\times }B_{21}\quad &A_{11}{\color {red}\times }B_{12}+A_{12}{\color {red}\times }B_{22}\\A_{21}{\color {red}\times }B_{11}+A_{22}{\color {red}\times }B_{21}\quad &A_{21}{\color {red}\times }B_{12}+A_{22}{\color {red}\times }B_{22}\end{bmatrix}}.}
This construction does not reduce the number of multiplications: 8 multiplications of matrix blocks are still needed to calculate the
C
i
j
{\displaystyle C_{ij}} matrices, the same number
Size: 3.46 MB
Language: en
Added: Sep 19, 2024
Slides: 59 pages
Slide Content
Copyright Li Zimao @ 2007-2008-1 SCUEC
Divide and Conquer
1.Merge sort and quick sort
2.Binary search
3.Multiplication of large integers
4.Strassen’s Matrix Multiplication
5.Decrease and Conquer:
Topological sort
Copyright Li Zimao @ 2007-2008-1 SCUEC
Expected Outcomes
Students should be able to
Explain the ideas of binary search, multiplication of
large integers, and Strassen’s matrix multiplication
algorithms
Analyze the time complexity of the above algorithms
Copyright Li Zimao @ 2007-2008-1 SCUEC
Binary Search an Iterative Algorithm
Very efficient algorithm for searching in sorted array:
K
vs
A[0] . . . A[m] . . . A[n-1]
If K = A[m], stop (successful search); otherwise, continue
searching by the same method in A[0..m-1] if K < A[m]
and in A[m+1..n-1] if K > A[m]
ALGORITHM BinarySearch(A[0..n-1], K)
l 0; r n-1
while l r do // l and r crosses over can’t find K
m (l+r)/2
if K = A[m] return m //the key is found
else if K < A[m] r m-1 //the key is on the left half of the array
else l m+1 // the key is on the right half of the array
return -1
Copyright Li Zimao @ 2007-2008-1 SCUEC
3. Binary Search – a Recursive Algorithm
ALGORITHM BinarySearchRecur(A[0..n-1], l, r, K)
if l > r
return –1
else
m (l + r) / 2
if K = A[m]
return m
else if K < A[m]
return BinarySearchRecur(A[0..n-1], l, m-1, K)
else
return BinarySearchRecur(A[0..n-1], m+1, r, K)
Naïve method for matrix multiplication and its time complexity
Copyright Li Zimao @ 2007-2008-1 SCUEC
General DAC approach
Copyright Li Zimao @ 2007-2008-1 SCUEC
Copyright Li Zimao @ 2007-2008-1 SCUEC
DAC for Matrix multiplication
four
Divide n*n into submatrices
with 2*2 as initial condition
Copyright Li Zimao @ 2007-2008-1 SCUEC
Copyright Li Zimao @ 2007-2008-1 SCUEC
Note: Here a,b,c,d,e,f,g,h are submatrices of size n/2*n/2.
Addition means its a matrix addition not the scalar addition.
What’s the Recurrence relation ?
Copyright Li Zimao @ 2007-2008-1 SCUEC
Is there any other way?
Yes. Strassen’s Method
Copyright Li Zimao @ 2007-2008-1 SCUEC
Strassen’s Matrix Multiplication
Strassen observed [1969] that the product of two
matrices can be computed as follows so that the time
complexity can be improved by reducing the number of
multiplications.(i.e from 8 to 7 no. of multiplications)
C
00
C
01
A
00
A
01
B
00
B
01
= *
C
10
C
11
A
10
A
11
B
10
B
11
M
1 + M
4 - M
5 + M
7 M
3 + M
5
=
M
2
+ M
4
M
1
+ M
3
- M
2
+ M
6
Copyright Li Zimao @ 2007-2008-1 SCUEC
Formulas for Strassen’s Algorithm
M
1 = (A
00 + A
11) (B
00 + B
11)
M
2 = (A
10 + A
11) B
00
M
3 = A
00 (B
01 - B
11)
M
4
= A
11
(B
10
- B
00
)
M
5 = (A
00 + A
01) B
11
M
6 = (A
10 - A
00) (B
00 + B
01)
M
7
= (A
01
- A
11
) (B
10
+ B
11
)
Note: Observe we have only 7 multiplications instead of 8 as in DAC.
Simple way to remember
Strassen’s method
Copyright Li Zimao @ 2007-2008-1 SCUEC
To remember Strassen’s formulas
You just need to remember 4 Rules :
AHED (Learn it as ‘Ahead’)
Diagonal
Last CR
First CR
Also, consider X as (Row +) and Y as
(Column -) matrix
Follow the Steps :
Write
P1 = A; P2 = H; P3 = E; P4 = D
For P5 we will use Diagonal Rule
i.e.
(Sum the Diagonal Elements Of
Matrix X ) * (Sum the Diagonal
Elements Of Matrix Y ), we get
P5 = (A + D)* (E + H)
Copyright Li Zimao @ 2007-2008-1 SCUEC
Contd.,
For P6 we will use Last CR Rule
i.e. Last Column of X and Last
Row of Y and remember that
Row+ and Column- so i.e.
(B – D) * (G + H),
We get P6 = (B – D) * (G + H)
For P7 we will use First CR Rule
i.e. First Column of X and First
Row of Y and remember that
Row+ and Column- so i.e.
(A – C) * (E + F),
We get P7 = (A – C) * (E + F)
So, we get
P1 = A
P2= H
P3= E
P4= D
P5= ( A + D ) * ( E + H )
P6= ( B – D ) * ( G + H)
P7= ( A – C ) * ( E + F)
Copyright Li Zimao @ 2007-2008-1 SCUEC
P1, P2,P3 P4?
Come Back to P1 :
we have A there and it’s adjacent
element in Y Matrix is E, since Y is
Column Matrix so we select a column
in Y such that E won’t come, we find F
H Column, so multiply A with (F-H)
So, finally P1 = A * (F – H)
Come Back to P2 :
we have H there and it’s adjacent
element in X Matrix is D, since X is
Row Matrix so we select a Row in X
such that D won’t come, we find A B
Column, so multiply H with (A+B)
So, finally P2 = (A + B) * H
Come Back to P3 :
we have E there and it’s adjacent
element in X Matrix is A, since X is
Row Matrix so we select a Row in X
such that A won’t come, we find C D
Column, so multiply E with (C + D)
So, finally P3 = (C + D) * E
Copyright Li Zimao @ 2007-2008-1 SCUEC
C = X * Y ?
Come Back to P4 :
we have D there and it’s adjacent
element in Y Matrix is H, since Y is
Column Matrix so we select a column
in Y such that H won’t come, we find
G E Column, so multiply D with (G –
E)
So, finally P4 = D * (G – E)
So, we get
P1 = A * (F – H)
P2 = (A + B) * H
P3 = (C + D) * E
P4 = D * (G – E)
P5= ( A + D ) * ( E + H )
P6= ( B – D ) * ( G + H)
P7= ( A – C ) * ( E + F)
We are done with P1 – P7 equations, so
now we move to C1 – C4 equations in
Final Matrix C :
Remember Counting : Write P1 + P2 at
C2
Write P3 + P4 at its diagonal Position
i.e. at C3
Write P4 + P5 + P6 at 1st position and
subtract P2 i.e. C1 = P4 + P5 + P6 – P2
Write odd values at last Position with
alternating – and + sign i.e. P1 P3 P5
P7 becomes
C4 = P1 – P3 + P5 – P7
Copyright Li Zimao @ 2007-2008-1 SCUEC
Example1:Multiplication of two 2*2 matrices using Strassen’s
method
Copyright Li Zimao @ 2007-2008-1 SCUEC
Copyright Li Zimao @ 2007-2008-1 SCUEC
Analysis of Strassen’s Algorithm
If n is not a power of 2, matrices can be padded with zeros.
Number of multiplications:
M(n) = 7M(n/2), M(1) = 1
Solution: M(n) = 7
log
2
n
= n
log
2
7
≈ n
2.807
vs. n
3
of brute-force alg.
Algorithms with better asymptotic efficiency are known but they
are even more complex.
Copyright Li Zimao @ 2007-2008-1 SCUEC
Standard vs Strassen: Practical
N MultiplicationsAdditions
Standard alg.100 1,000,000 990,000
Strassen’s alg.100 411,822 2,470,334
Standard alg.1000 1,000,000,000999,000,000
Strassen’s alg.1000 264,280,2851,579,681,709
Standard alg.10,00010
12
9.99*10
11
Strassen’s alg.10,0000.169*10
12
10
12
Decrease and Conquer strategy
Copyright Li Zimao @ 2007-2008-1 SCUEC
Variations of Decrease and Conquer
Copyright Li Zimao @ 2007-2008-1 SCUEC
Decrease by constant (Typically by 1)
Copyright Li Zimao @ 2007-2008-1 SCUEC
Fig. Decrease by constant typically by one
Applications: Insertion sort, DFS, BFS, Topological order
Decrease by constant factor (Typically by half)
Copyright Li Zimao @ 2007-2008-1 SCUEC
Fig. Decrease by constant factor typically by n/2
Applications: Binary search, Fake coin problem
Variable size Decrease
•Is also a one of the variations of decrease and
conquer method.
•In variable size decrease, in each iteration of the
loop the size reduction pattern varies from one
iteration to another iteration.
Ex. GCD of two numbers using Euclid’s algorithm.
Copyright Li Zimao @ 2007-2008-1 SCUEC
Difference?
Copyright Li Zimao @ 2007-2008-1 SCUEC
Note: Observe that the divide and conquer solves two instances of
the Problem of size n/2 whereas decrease and conquer solves only
one instance of the problem.
O(n) time complexity
O(logn) time complexity
Topological sort or Topological ordering
• A
topological sort
or
topological ordering
of a directed
graph
is a linear ordering of its vertices such that for
every directed edge
uv
from vertex
u
to vertex
v,
u
comes
before
v
in the ordering.
•A topological ordering is possible if and only if the graph
has no
directed cycles, that is, if it is a directed acyclic
graph
(DAG)
•Any DAG has at least one topological ordering or even
more.
•We have two methods for topological ordering:
DFS Based method
Source removal method
Copyright Li Zimao @ 2007-2008-1 SCUEC
Topological Ordering Methods
Copyright Li Zimao @ 2007-2008-1 SCUEC
Ex1: Using DFS method
Copyright Li Zimao @ 2007-2008-1 SCUEC
Ex2: Topological sort using DFS method
Copyright Li Zimao @ 2007-2008-1 SCUEC
Find the topological order using DFS method for the below graphs.
Copyright Li Zimao @ 2007-2008-1 SCUEC
Source Removal method
Copyright Li Zimao @ 2007-2008-1 SCUEC
Ex1:
Copyright Li Zimao @ 2007-2008-1 SCUEC
Copyright Li Zimao @ 2007-2008-1 SCUEC
Ex2:
Copyright Li Zimao @ 2007-2008-1 SCUEC
Ex2: solution
Copyright Li Zimao @ 2007-2008-1 SCUEC