Strassen's method
If you have seen matrices before, then you probably know how to multiply them. If $A = (a_{ij})$ and $B = (b_{ij})$ are square $n \times n$ matrices, then in the product $C = A \times B$, we define the entry $c_{ij}$ for $i, j = 1, 2, ..., n$, by
$$c_{ij} = \sum_{k=0}^n a_{ik} . b_{kj}$$
You might at first think that any matrix multiplication algorithm must take $Ω(n^3)$ time, since the natural definition of matrix multiplication requires that many multiplications. You would be incorrect, however: we have a way to multiply matrices in less time than that. In this article, we shall see Strassen’s remarkable recursive algorithm for multiplying $n \times n$ matrices. It runs in $Θ(n^{\log_2 7})$ time, which is asymptotically better than the straightforward approach. [1]
Strassen’s method
The strassen’s method makes the recursion tree slightly less bushy by performing only 7 recursive multiplications of sub-matrices instead of 8. Strassen’s method is not at all obvious and it involves four steps as described in the book [1]. Let’s directly move on to the chapter end exercises of the book [1]. This article is just a supplementary, but not a substitute to the book. [1]
Exercises
4.2-2
Write pseudocode for Strassen’s algorithm.
By following the 4 steps mentioned above, we can write the algorithm.
We assume that n is an exact power of 2. So, $n = 2^k$ for some integer k.
**Input:** square matrices $A$ and $B$
**Output:** Product of two square matrices $C$
$STRASSENS\_SQUARE\_MATRIX\_MULTIPLY(A, B) \\
n \gets A.rows \\
Let \; C \; be \; new \; n \cdot n \; matrix$
`if ` $\;n\; =1$
$\phantom{{}++{}}$ $C_{11}\gets A_{11} \cdot B_{11}$
`else `
$\phantom{{}++{}}$ partition the matrices $A$ in to $A_{11}, A_{12}, A_{21}, A_{22}$ and $B$ into $B_{11}, B_{12}, B_{21}, B_{22}$
$\phantom{{}++{}}$ partition the matrix $C$ into $C_{11}, C_{12}, C_{21}$ and $\; C_{22}$
$\phantom{{}++{}}$
$\phantom{{}++{}}$ Let $S1$ through $S10$ be submatrices such that,
$\phantom{{}++{}}$ $S1 \gets B_{12} - B_{22}$
$\phantom{{}++{}}$ $S2 \gets A_{11} + A_{12}$
$\phantom{{}++{}}$ $S3 \gets A_{21} + A_{22}$
$\phantom{{}++{}}$ $S4 \gets B_{21} - B_{11}$
$\phantom{{}++{}}$ $S5 \gets A_{11} + A_{22}$
$\phantom{{}++{}}$ $S6 \gets B_{11} + B_{22}$
$\phantom{{}++{}}$ $S7 \gets A_{12} - A_{22}$
$\phantom{{}++{}}$ $S8 \gets B_{21} + B_{22}$
$\phantom{{}++{}}$ $S9 \gets A_{11} - A_{21}$
$\phantom{{}++{}}$ $S10 \gets B_{11} + B_{12}$
$\phantom{{}++{}}$ Let $P_1 \ldots P_7\; $be $\frac{n}{2} \cdot \frac{n}{2}$ matrices
$\phantom{{}++{}}$ $P_{1} \gets STRASSENS\_SQUARE\_MATRIX\_MULTIPLY(A_{11}, S1)$
$\phantom{{}++{}}$ $P_{2} \gets STRASSENS\_SQUARE\_MATRIX\_MULTIPLY(S2, B_{22})$
$\phantom{{}++{}}$ $P_{3} \gets STRASSENS\_SQUARE\_MATRIX\_MULTIPLY(S3, B_{11})$
$\phantom{{}++{}}$ $P_{4} \gets STRASSENS\_SQUARE\_MATRIX\_MULTIPLY(A_{22}, S4)$
$\phantom{{}++{}}$ $P_{5} \gets STRASSENS\_SQUARE\_MATRIX\_MULTIPLY(S5, S6)$
$\phantom{{}++{}}$ $P_{6} \gets STRASSENS\_SQUARE\_MATRIX\_MULTIPLY(S7, S8)$
$\phantom{{}++{}}$ $P_{7} \gets STRASSENS\_SQUARE\_MATRIX\_MULTIPLY(S9, S10)$
$\phantom{{}++{}}$ $C_{11} \gets P_5 + P_4 - P_2 + P_6$
$\phantom{{}++{}}$ $C_{12} \gets P_1 + P_2$
$\phantom{{}++{}}$ $C_{21} \gets P_3 + P_4$
$\phantom{{}++{}}$ $C_{22} \gets P_5 + P_1 - P_3 - P_7$
`return ` $C$.
Now we have the algorithm with us and let’s implement it and use it to multiply two matrices first.
$$
\left[
\begin{matrix}
1 & 3 \\
7 & 5 \\
\end{matrix}
\right]
.
\left[
\begin{matrix}
6 & 8 \\
4 & 2 \\
\end{matrix}
\right]
= \left[
\begin{matrix}
18 & 14 \\
62 & 66 \\
\end{matrix}
\right]
$$
Here’s the code:
How about non-square matrices
Well, even though the Strassen’s method is defined only for square matrices, it can be used to multiply non-square matrices where circumstances warrant. [2] For an instance let’s consider two matrices,
$$
\left[
\begin{matrix}
3 & 1 & 1 & 4 \\
5 & 3 & 2 & 1 \\
\end{matrix}
\right]
.
\left[
\begin{matrix}
4 & 9 \\
6 & 8 \\
9 & 7 \\
7 & 6 \\
\end{matrix}
\right]
= \left[
\begin{matrix}
55 & 66 \\
63 & 89 \\
\end{matrix}
\right]
$$
Write your left factor as a row of two square matrices and your right factor as a column of two square matrices each with the dimension $2 \times 2$. Then sum those two resulting matrices to get the answer.
Exercises
4.2-3
How would you modify Strassen’s algorithm to multiply $n \times n$ matrices in which n is not an exact power of 2? Show that the resulting algorithm runs in time $Θ(n^{\log_2 7})$ [1]
Strassen’s algorithm can be applied to $n \times n$ matrix multiplications where n is not an exact power of 2 by padding the operands with 0’s. Let $m = 2^k$ such that $2^ {k - 1} < n < 2^k$ (m equals $2^{\lceil{log_2n}\rceil}$ ). Create $m \times m$ matrices A’ and B’ by padding A and B respectively. Applying Strassen’s algorithm, the resulting matrices C’, A’ and B’ appear as follows, where C’ is the matrix product of A’ and B’:
$$
C' = \left[
\begin{matrix}
C & 0 \\
0 & 0 \\
\end{matrix}
\right]\quad
A' =
\left[
\begin{matrix}
A & 0 \\
0 & 0 \\
\end{matrix}
\right]\quad
B' = \left[
\begin{matrix}
B & 0 \\
0 & 0 \\
\end{matrix}
\right]
$$
To obtain the product, we simply extract the matrix C from $C'$.
The runtime for this method is $Θ(m^{\log_2 7})$. Since $2^{k – 1} < n$, it follows that $m < 2n$. Therefore, the runtime becomes $Θ((2n)^{log_2 7}) = Θ(2^{log_2 7} ⋅ n^{log_2 7} ) = Θ(n^{log_2 7})$. [3]
Finally, let’s use our implementation of Strassen’s algorithm to multiply two $3 \times 3$ matrices.
$$
\left[
\begin{matrix}
2 & 7 & 3 \\
1 & 5 & 8 \\
0 & 4 & 1 \\
\end{matrix}
\right]
.
\left[
\begin{matrix}
3 & 0 & 1 \\
2 & 1 & 0 \\
1 & 2 & 4
\end{matrix}
\right]
= \left[
\begin{matrix}
23 & 13 & 14 \\
21 & 21 & 33 \\
9 & 6 & 4
\end{matrix}
\right]
$$
Analysis
Let’s use the Master method for analysing the recurrence,
$T(n) = 7 T(n/2) + Θ(n^2)$,
which describes the running time of Strassen’s algorithm. Here, we have $a = 7,\quad b = 2,\quad f(n) = Θ(n^2)$, and thus $n^{\log_b a} = n^{\log_2 7}$ Recalling that $2.8 < log_2 7 < 2.81$ We see that $f(n) = O(n^{\log_2 7 - \epsilon})$ for $\epsilon = 0.8$. Again, case 1 applies, and we have the solution $T(n) = Θ(n^{\log_2 7})$ [1]
References
[1] https://www.amazon.com/Introduction-Algorithms-3rd-MIT-Press/dp/0262033844
[2] https://math.stackexchange.com/questions/1445064/strassens-algorithm-for-non-square-matrices
[3] https://www.eecis.udel.edu/~saunders/courses/621/03f/modelV.pdf
Comments
Post a Comment