Strassen's method

If you have seen matrices before, then you probably know how to multiply them. If $A = (a_{ij})$ and $B = (b_{ij})$ are square $n \times n$ matrices, then in the product $C = A \times B$, we define the entry $c_{ij}$ for $i, j = 1, 2, ..., n$, by

$$c_{ij} = \sum_{k=0}^n a_{ik} . b_{kj}$$

You might at first think that any matrix multiplication algorithm must take $Ω(n^3)$ time, since the natural definition of matrix multiplication requires that many multiplications. You would be incorrect, however: we have a way to multiply matrices in less time than that. In this article, we shall see Strassen’s remarkable recursive algorithm for multiplying $n \times n$ matrices. It runs in $Θ(n^{\log_2 7})$ time, which is asymptotically better than the straightforward approach. [1]

 

Strassen’s method

The strassen’s method makes the recursion tree slightly less bushy by performing only 7 recursive multiplications of sub-matrices instead of 8. Strassen’s method is not at all obvious and it involves four steps as described in the book [1]. Let’s directly move on to the chapter end exercises of the book [1]. This article is just a supplementary, but not a substitute to the book. [1]

Exercises

4.2-2

Write pseudocode for Strassen’s algorithm.
By following the 4 steps mentioned above, we can write the algorithm. 

We assume that n is an exact power of 2. So, $n = 2^k$ for some integer k.

**Input:** square matrices $A$ and $B$  
**Output:** Product of two square matrices $C$  
$STRASSENS\_SQUARE\_MATRIX\_MULTIPLY(A, B) \\
n \gets A.rows \\
Let \; C  \;  be \; new \; n \cdot n \; matrix$  
`if ` $\;n\; =1$  
$\phantom{{}++{}}$ $C_{11}\gets A_{11} \cdot B_{11}$  
`else `  
$\phantom{{}++{}}$ partition the matrices $A$ in to $A_{11}, A_{12}, A_{21}, A_{22}$ and $B$ into $B_{11}, B_{12}, B_{21}, B_{22}$  
$\phantom{{}++{}}$ partition the matrix $C$ into $C_{11}, C_{12}, C_{21}$ and $\; C_{22}$
$\phantom{{}++{}}$  
$\phantom{{}++{}}$  Let $S1$ through $S10$ be submatrices such that,  
$\phantom{{}++{}}$  $S1 \gets B_{12} - B_{22}$  
$\phantom{{}++{}}$  $S2 \gets A_{11} + A_{12}$  
$\phantom{{}++{}}$  $S3 \gets A_{21} + A_{22}$  
$\phantom{{}++{}}$  $S4 \gets B_{21} - B_{11}$  
$\phantom{{}++{}}$  $S5 \gets A_{11} + A_{22}$  
$\phantom{{}++{}}$  $S6 \gets B_{11} + B_{22}$  
$\phantom{{}++{}}$  $S7 \gets A_{12} - A_{22}$  
$\phantom{{}++{}}$  $S8 \gets B_{21} + B_{22}$  
$\phantom{{}++{}}$  $S9 \gets A_{11} - A_{21}$  
$\phantom{{}++{}}$  $S10 \gets B_{11} + B_{12}$  
$\phantom{{}++{}}$  Let $P_1 \ldots P_7\; $be $\frac{n}{2} \cdot \frac{n}{2}$ matrices      
$\phantom{{}++{}}$  $P_{1} \gets STRASSENS\_SQUARE\_MATRIX\_MULTIPLY(A_{11}, S1)$  
$\phantom{{}++{}}$  $P_{2} \gets STRASSENS\_SQUARE\_MATRIX\_MULTIPLY(S2, B_{22})$  
$\phantom{{}++{}}$  $P_{3} \gets STRASSENS\_SQUARE\_MATRIX\_MULTIPLY(S3, B_{11})$  
$\phantom{{}++{}}$  $P_{4} \gets STRASSENS\_SQUARE\_MATRIX\_MULTIPLY(A_{22}, S4)$   
$\phantom{{}++{}}$  $P_{5} \gets STRASSENS\_SQUARE\_MATRIX\_MULTIPLY(S5, S6)$   
$\phantom{{}++{}}$  $P_{6} \gets STRASSENS\_SQUARE\_MATRIX\_MULTIPLY(S7, S8)$   
$\phantom{{}++{}}$  $P_{7} \gets STRASSENS\_SQUARE\_MATRIX\_MULTIPLY(S9, S10)$   
$\phantom{{}++{}}$  $C_{11} \gets P_5 + P_4 - P_2 + P_6$   
$\phantom{{}++{}}$  $C_{12} \gets P_1 + P_2$   
$\phantom{{}++{}}$  $C_{21} \gets P_3 + P_4$   
$\phantom{{}++{}}$  $C_{22} \gets P_5 + P_1 - P_3 - P_7$   
`return ` $C$.

Now we have the algorithm with us and let’s implement it and use it to multiply two matrices first.

$$
    \left[
    \begin{matrix}
    1 & 3 \\
    7 & 5 \\
    \end{matrix}
    \right]
.
    \left[
    \begin{matrix}
    6 & 8 \\
    4 & 2 \\
    \end{matrix}
    \right]
=     \left[
    \begin{matrix}
    18 & 14 \\
    62 & 66 \\
    \end{matrix}
    \right]
$$


Here’s the code:


How about non-square matrices

Well, even though the Strassen’s method is defined only for square matrices, it can be used to multiply non-square matrices where circumstances warrant. [2] For an instance let’s consider two matrices,

$$
    \left[
    \begin{matrix}
    3 & 1 & 1 & 4 \\
    5 & 3 & 2 & 1 \\
    \end{matrix}
    \right]
.
    \left[
    \begin{matrix}
    4 & 9 \\
    6 & 8 \\
    9 & 7 \\
    7 & 6 \\
    \end{matrix}
    \right]
=     \left[
    \begin{matrix}
    55 & 66 \\
    63 & 89 \\
    \end{matrix}
    \right]
$$

Write your left factor as a row of two square matrices and your right factor as a column of two square matrices each with the dimension $2 \times 2$. Then sum those two resulting matrices to get the answer. 

Exercises

4.2-3

How would you modify Strassen’s algorithm to multiply $n \times n$ matrices in which n is not an exact power of 2? Show that the resulting algorithm runs in time $Θ(n^{\log_2 7})$ [1]

Strassen’s algorithm can be applied to $n \times n$ matrix multiplications where n is not an exact power of 2 by padding the operands with 0’s. Let $m = 2^k$ such that $2^ {k - 1} < n < 2^k$ (m equals $2^{\lceil{log_2n}\rceil}$ ). Create $m \times m$ matrices A’ and B’ by padding A and B respectively. Applying Strassen’s algorithm, the resulting matrices C’, A’ and B’ appear as follows, where C’ is the matrix product of A’ and B’:

$$
    C' = \left[
    \begin{matrix}
    C & 0 \\
    0 & 0 \\
    \end{matrix}
    \right]\quad
A' =
    \left[
    \begin{matrix}
    A & 0 \\
    0 & 0 \\
    \end{matrix}
    \right]\quad
B' =     \left[
    \begin{matrix}
    B & 0 \\
    0 & 0 \\
    \end{matrix}
    \right]
$$

To obtain the product, we simply extract the matrix C from $C'$.

The runtime for this method is $Θ(m^{\log_2 7})$. Since $2^{k – 1} < n$, it follows that $m < 2n$. Therefore, the runtime becomes $Θ((2n)^{log_2 7}) = Θ(2^{log_2 7} ⋅ n^{log_2 7} ) = Θ(n^{log_2 7})$. [3]

Finally, let’s use our implementation of Strassen’s algorithm to multiply two $3 \times 3$ matrices.

$$
    \left[
    \begin{matrix}
    2 & 7 & 3 \\
    1 & 5 & 8 \\
    0 & 4 & 1 \\
    \end{matrix}
    \right]
.
    \left[
    \begin{matrix}
    3 & 0 & 1 \\
    2 & 1 & 0 \\
    1 & 2 & 4
    \end{matrix}
    \right]
=     \left[
    \begin{matrix}
    23 & 13 & 14 \\
    21 & 21 & 33 \\
    9 & 6 & 4
    \end{matrix}
    \right]
$$

Analysis

Let’s use the Master method for analysing the recurrence,
$T(n) = 7 T(n/2) + Θ(n^2)$,

which describes the running time of Strassen’s algorithm. Here, we have $a = 7,\quad b = 2,\quad f(n) = Θ(n^2)$, and thus  $n^{\log_b a} = n^{\log_2 7}$ Recalling that $2.8 < log_2 7 < 2.81$ We see that $f(n) = O(n^{\log_2 7 - \epsilon})$ for $\epsilon = 0.8$. Again, case 1 applies, and we have the solution $T(n) = Θ(n^{\log_2 7})$ [1]

References

[1] https://www.amazon.com/Introduction-Algorithms-3rd-MIT-Press/dp/0262033844
[2] https://math.stackexchange.com/questions/1445064/strassens-algorithm-for-non-square-matrices
[3] https://www.eecis.udel.edu/~saunders/courses/621/03f/modelV.pdf

Comments

Popular posts from this blog

Introducing Java Reactive Extentions in to a SpringBoot Micro Service

Optimal binary search trees

Combining the emissions of multiple Observables together using RxJava Zip operator in a Spring Boot Micro service