3977:, even in commutative field. On 4x4 matrices, AlphaTensor unexpectedly discovered a solution with 47 multiplication steps, an improvement over the 49 required with Strassen’s algorithm of 1969, albeit restricted to mod 2 arithmetic. Similarly, AlphaTensor solved 5x5 matrices with 96 rather than Strassen's 98 steps. Based on the surprising discovery that such improvements exist, other researchers were quickly able to find a similar independent 4x4 algorithm, and separately tweaked Deepmind's 96-step 5x5 algorithm down to 95 steps in mod 2 arithmetic and to 97 in normal arithmetic. Some algorithms were never discovered before, e.g. (4, 5, 5) got improved to 76 from 80 in normal and mod 2 arithmetics.
5287:
5068:
1653:
448:
2584:
1247:
3697:
are represented in the same basis. Karstadt and
Schwartz computed in different bases and traded 3 additions for less expensive basis transformations. They also proved that one cannot go below 12 additions per step using different bases. In subsequent work Beniamini et el. applied this base change trick to more general decompositions than 2x2-block matrices and improved the leading constant for their run times.
1228:
1648:{\displaystyle {\begin{pmatrix}C_{11}&C_{12}\\C_{21}&C_{22}\\\end{pmatrix}}={\begin{pmatrix}A_{11}&A_{12}\\A_{21}&A_{22}\\\end{pmatrix}}{\begin{pmatrix}B_{11}&B_{12}\\B_{21}&B_{22}\\\end{pmatrix}}={\begin{pmatrix}A_{11}B_{11}+A_{12}B_{21}&A_{11}B_{12}+A_{12}B_{22}\\A_{21}B_{11}+A_{22}B_{21}&A_{21}B_{12}+A_{22}B_{22}\\\end{pmatrix}}}
3689:
3421:
4216:
995:
3696:
It is known that a
Strassen-like algorithm with a 2x2-block matrix step requires at least 7 block matrix multiplications. In 1976 Probert showed that such an algorithm requires at least 15 additions (including subtractions), however, a hidden assumption was that the blocks and the 2x2-block matrix
5092:
On modern architectures with hierarchical memory, the cost of loading and storing input matrix elements tends to dominate the cost of arithmetic. On a single machine this is the amount of data transferred between RAM and cache, while on a distributed memory multi-node machine it is the amount
1863:
A variant of this algorithm that works for matrices of arbitrary shapes and is faster in practice splits matrices in two instead of four submatrices, as follows. Splitting a matrix now means dividing it into two parts of equal size, or as close to equal sizes as possible in the case of odd
5273:
more memory than is needed to store the inputs. This algorithm can be combined with
Strassen to further reduce runtime. "2.5D" algorithms provide a continuous tradeoff between memory usage and communication bandwidth. On modern distributed computing environments such as
2417:
6095:
Fawzi, Alhussein; Balog, Matej; Huang, Aja; Hubert, Thomas; Romera-Paredes, Bernardino; Barekatain, Mohammadamin; Novikov, Alexander; R. Ruiz, Francisco J.; Schrittwieser, Julian; Swirszcz, Grzegorz; Silver, David; Hassabis, Demis; Kohli, Pushmeet (October 2022).
4221:
can be performed independently of each other, as can the four summations (although the algorithm needs to "join" the multiplications before doing the summations). Exploiting the full parallelism of the problem, one obtains an algorithm that can be expressed in
3973:(mod 2 arithmetic). The best "practical" (explicit low-rank decomposition of a matrix multiplication tensor) algorithm found ran in O(n). Finding low-rank decompositions of such tensors (and beyond) is NP-hard; optimal multiplication even for 3x3 matrices
2227:
3934:
that used a single-player game analogy to invent thousands of matrix multiplication algorithms, including some previously discovered by humans and some that were not. Operations were restricted to the non-commutative ground field (normal arithmetic) and
2077:
2443:
environment where cache sizes are effectively dynamic due to other processes taking up cache space. (The simple iterative algorithm is cache-oblivious as well, but much slower in practice if the matrix layout is not adapted to the algorithm.)
458:
The three loops in iterative matrix multiplication can be arbitrarily swapped with each other without an effect on correctness or asymptotic running time. However, the order can have a considerable impact on practical performance due to the
3153:
2978:
3533:
2567:
1223:{\displaystyle C={\begin{pmatrix}C_{11}&C_{12}\\C_{21}&C_{22}\\\end{pmatrix}},\,A={\begin{pmatrix}A_{11}&A_{12}\\A_{21}&A_{22}\\\end{pmatrix}},\,B={\begin{pmatrix}B_{11}&B_{12}\\B_{21}&B_{22}\\\end{pmatrix}},}
3265:
4008:
5245:
which arranges the processors in a 3D cube mesh, assigning every product of two input submatrices to a single processor. The result submatrices are then generated by performing a reduction over each row. This algorithm transmits
5321:-1 steps are needed. The performance improves further for repeated computations leading to 100% efficiency. The cross-wired mesh array may be seen as a special case of a non-planar (i.e. multilayered) processing structure.
5587:
2643:-matrices which require only 7 multiplications (instead of the usual 8), at the expense of several additional addition and subtraction operations. Applying this recursively gives an algorithm with a multiplicative cost of
5314:-2 steps although this is reduced to half this number for repeated computations. The standard array is inefficient because the data from the two matrices does not arrive simultaneously and it must be padded with zeroes.
560:
cache misses in the worst case. As of 2010, the speed of memories compared to that of processors is such that the cache misses, rather than the actual calculations, dominate the running time for sizable matrices.
6055:—the wreath product approach is unlikely to be applicable to the large matrix problems that arise in practice. the input matrices must be astronomically large for the difference in time to be apparent.
3527:
3259:
2275:
2712:
273:
1823:
3974:
5149:
is the size of fast memory. The naïve algorithm is then used over the block matrices, computing products of submatrices entirely in fast memory. This reduces communication bandwidth to
3970:
2740:
has its merits. A table that compares key aspects of the improved version based on recursive multiplication of 2x2-block matrices via 7 block matrix multiplications follows. As usual
2121:
3052:
2877:
3844:
1965:
3798:
1742:
3850:, Xu, Xu, and Zhou. This algorithm, like all other recent algorithms in this line of research, is a generalization of the Coppersmith–Winograd algorithm, which was given by
2625:
3756:
5964:"Worst-case complexity bounds on algorithms for computing the canonical structure of finite abelian groups and the Hermite and Smith normal forms of an integer matrix"
3818:
3730:
5963:
1658:
which consists of eight multiplications of pairs of submatrices, followed by an addition step. The divide-and-conquer algorithm computes the smaller multiplications
3684:{\displaystyle 4\left({\frac {{\sqrt {3}}n}{\sqrt {M}}}\right)^{\log _{2}7}\cdot M-12n^{2}+1.5n^{2}\cdot \log _{2}\left({\frac {{\sqrt {2}}n}{\sqrt {M}}}\right)+5M}
3058:
2883:
3416:{\displaystyle 4\left({\frac {{\sqrt {3}}n}{\sqrt {M}}}\right)^{\log _{2}7}\cdot M-12n^{2}+3n^{2}\cdot \log _{2}\left({\frac {{\sqrt {2}}n}{\sqrt {M}}}\right)+5M}
2461:
4211:{\displaystyle {\begin{pmatrix}A_{11}B_{11}+A_{12}B_{21}&A_{11}B_{12}+A_{12}B_{22}\\A_{21}B_{11}+A_{22}B_{21}&A_{21}B_{12}+A_{22}B_{22}\\\end{pmatrix}}}
2778:
2758:
5906:
Beniamini, Gal; Cheng, Nathan; Holtz, Olga; Karstadt, Elaye; Schwartz, Oded (2020). "Sparsifying the
Operators of Fast Matrix Multiplication Algorithms".
6346:
Agarwal, R.C.; Balle, S. M.; Gustavson, F. G.; Joshi, M.; Palkar, P. (September 1995). "A three-dimensional approach to parallel matrix multiplication".
5330:
6019:
The
Coppersmith–Winograd algorithm is not practical, due to the very large hidden constant in the upper bound on the number of multiplications required.
6736:
5335:
2578:
94:
974:
amounts to several orders of magnitude on modern machines, so that the actual calculations dominate the running time, rather than the cache misses.
2736:
Since
Strassen's algorithm is actually used in practical numerical software and computer algebra systems improving on the constants hidden in the
6306:
Irony, Dror; Toledo, Sivan; Tiskin, Alexander (September 2004). "Communication lower bounds for distributed-memory matrix multiplication".
5052:
on any real computer. The algorithm isn't practical due to the communication cost inherent in moving data to and from the temporary matrix
6857:
5206:
2D mesh, one submatrix of the result can be assigned to each processor, and the product can be computed with each processor transmitting
5036:
steps, meaning it takes that much time on an ideal machine with an infinite number of processors; therefore, it has a maximum possible
6913:
6882:
5506:
6284:
6183:
5350:
1844:
6877:
6729:
3441:
2412:{\displaystyle C={\begin{pmatrix}A_{1}&A_{2}\end{pmatrix}}{\begin{pmatrix}B_{1}\\B_{2}\end{pmatrix}}=A_{1}B_{1}+A_{2}B_{2}}
6407:
5741:
5630:
5409:
3173:
3858:
in 1990. The conceptual idea of these algorithms are similar to
Strassen's algorithm: a way is devised for multiplying two
2646:
6004:
3878:
is so large that these algorithms are only worthwhile for matrices that are too large to handle on present-day computers.
195:
6903:
6836:
6722:
6680:
Van Zee, Field G.; van de Geijn, Robert A. (2015). "BLIS: A Framework for
Rapidly Instantiating BLAS Functionality".
5561:
5480:
6867:
5261:
words per processor, which is asymptotically optimal. However, this requires replicating each input matrix element
5116:
2726:
1748:
43:. Many different algorithms have been designed for multiplying matrices on different types of hardware, including
2631:
Algorithms exist that provide better running times than the straightforward ones. The first to be discovered was
3940:
2222:{\displaystyle C=A{\begin{pmatrix}B_{1}&B_{2}\end{pmatrix}}={\begin{pmatrix}AB_{1}&AB_{2}\end{pmatrix}}}
5479:
Alman, Josh; Williams, Virginia
Vassilevska (2020), "A Refined Laser Method and Faster Matrix Multiplication",
6067:
5286:
31:
efficient. Applications of matrix multiplication in computational problems are found in many fields including
3847:
110:
5617:. ASPLOS91: 4th Int'l Conference on Architecture Support for Programming Languages & Operating Systems.
3874:
multiplications, and this technique is applied recursively. However, the constant coefficient hidden by the
2998:
2823:
2072:{\displaystyle C={\begin{pmatrix}A_{1}\\A_{2}\end{pmatrix}}{B}={\begin{pmatrix}A_{1}B\\A_{2}B\end{pmatrix}}}
6826:
3999:
3701:
983:
6467:
5345:
3823:
468:
452:
5013:
is a keyword that signal a computation may be run in parallel with the rest of the function call, while
134:
6780:
5729:
5551:
2639:
in 1969 and often referred to as "fast matrix multiplication". It is based on a way of multiplying two
6641:
Goto, Kazushige; van de Geijn, Robert A. (2008). "Anatomy of high-performance matrix multiplication".
3761:
1700:
6831:
5340:
5071:
Block matrix multiplication. In the 2D algorithm, each processor is responsible for one submatrix of
2436:
6655:
5983:
5696:
2594:
6745:
6571:
6513:
6360:
6320:
5733:
2439:: there is no tuning parameter required to get optimal cache performance, and it behaves well in a
51:
systems, where the computational work is spread over multiple processors (perhaps over a network).
5393:
5355:
3881:
89:). Better asymptotic bounds on the time required to multiply matrices have been known since the
6908:
6650:
6566:
6508:
6384:
6355:
6315:
5978:
5691:
3735:
418:
55:
54:
Directly applying the mathematical definition of matrix multiplication gives an algorithm that
6212:
5928:
3148:{\displaystyle 5\left({\frac {{\sqrt {3}}n}{\sqrt {M}}}\right)^{\log _{2}7}\cdot M-15n^{2}+3M}
2973:{\displaystyle 6\left({\frac {{\sqrt {3}}n}{\sqrt {M}}}\right)^{\log _{2}7}\cdot M-18n^{2}+3M}
6790:
5307:
5119:
that partitions each input matrix into a block matrix whose elements are submatrices of size
5108:
3885:
3803:
3715:
1663:
1233:
which works for all square matrices whose dimensions are powers of two, i.e., the shapes are
467:
use of the algorithm; which order is best also depends on whether the matrices are stored in
460:
48:
20:
5385:
6785:
6602:(2009). "A class of parallel tiled linear algebra algorithms for multicore architectures".
6441:
6385:"Communication-optimal parallel 2.5D matrix multiplication and LU factorization algorithms"
6109:
6000:
5539:
5025:
2562:{\displaystyle \Theta \left(m+n+p+{\frac {mn+np+mp}{b}}+{\frac {mnp}{b{\sqrt {M}}}}\right)}
40:
32:
5290:
Matrix multiplication completed in 2n-1 steps for two n×n matrices on a cross-wired mesh.
8:
6764:
5386:
4223:
2715:
414:
64:
36:
24:
6445:
6257:
6161:
Kauers, Manuel; Moosbauer, Jakob (2022-12-02). "Flip Graphs for Matrix
Multiplication".
6113:
6697:
6668:
6629:
6611:
6537:
6431:
6276:
6162:
6138:
6097:
5907:
5774:
5486:
5457:
5433:
3995:
2763:
2743:
2632:
122:
90:
44:
5948:
6580:
6536:
Kak, S. (2014). "Efficiency of matrix multiplication on the cross-wired mesh array".
6522:
6403:
6143:
6125:
5810:
5793:
5778:
5737:
5723:
5655:
5626:
5557:
5405:
6265:
Proceedings of the thirteenth annual ACM symposium on Theory of computing - STOC '81
5831:
Proceedings of the 29th ACM Symposium on Parallelism in Algorithms and Architectures
6800:
6701:
6689:
6672:
6660:
6621:
6576:
6518:
6479:
6395:
6365:
6325:
6280:
6268:
6221:
6133:
6117:
6048:
Even if someone manages to prove one of the conjectures—thereby demonstrating that
5988:
5943:
5888:
5861:
5834:
5805:
5766:
5701:
5664:
5618:
5535:
5397:
2440:
475:
5879:
Probert, Robert L. (1976). "On the additive complexity of matrix multiplication".
5226:
words, which is asymptotically optimal assuming that each node stores the minimum
6841:
6399:
6098:"Discovering faster matrix multiplication algorithms with reinforcement learning"
5996:
5826:
5295:
4002:. These are based on the fact that the eight recursive matrix multiplications in
3855:
3851:
3705:
2636:
403:
98:
6708:
6625:
6484:
6424:
5854:"Pebbling Game and Alternative Basis for High Performance Matrix Multiplication"
5401:
2431:
The cache miss rate of recursive matrix multiplication is the same as that of a
6759:
6633:
6329:
6121:
5719:
5547:
3931:
3875:
2737:
1918:
86:
5428:
Williams, Virginia Vassilevska; Xu, Yinzhan; Xu, Zixuan; Zhou, Renfei (2024),
5067:
278:
From this, a simple algorithm can be constructed which loops over the indices
6897:
6805:
6599:
6129:
5651:
5381:
1843:
to sum the four pairs of resulting matrices element-wise. Application of the
447:
6664:
6240:
6032:
5838:
2718:
is reduced compared to the naïve algorithm, but it is faster in cases where
6714:
6147:
2730:
2583:
987:
540:, every iteration of the inner loop (a simultaneous sweep through a row of
6272:
5622:
578:
version, where the matrix is implicitly divided into square tiles of size
97:) remains unknown. As of April 2024, the best announced bound on the
5682:
Miller, Webb (1975), "Computational complexity and numerical stability",
5317:
The result is even faster on a two-layered cross-wired mesh, where only 2
2447:
The number of cache misses incurred by this algorithm, on a machine with
2435:
iterative version, but unlike that algorithm, the recursive algorithm is
2432:
574:
39:
and in seemingly unrelated problems such as counting the paths through a
6369:
6225:
5668:
6468:"A faster parallel algorithm for matrix multiplication on a mesh array"
6392:
Proceedings of the 17th International Conference on Parallel Processing
5865:
5770:
5543:
4226:
6821:
6499:
Kak, S (1988). "A two-layered mesh array for matrix multiplication".
5853:
5275:
2784:
Progress for Strassen-like recursive 2x2-block matrix multiplication
1659:
464:
6693:
5992:
5892:
5705:
5087:
2591:
over time for the computational complexity of matrix multiplication
6167:
5912:
5491:
5462:
5438:
3927:
125:
because of the large constants and cannot be realized practically.
6616:
6542:
6436:
2729:. It is very useful for large matrices over exact domains such as
5482:
32nd Annual ACM-SIAM Symposium on Discrete Algorithms (SODA 2021)
5037:
6220:(Ph.D.). Massachusetts Institute of Technology. pp. 54–57.
5757:
Strassen, Volker (1969). "Gaussian Elimination is not Optimal".
5717:
121:
time, given by Alman and Williams. However, this algorithm is a
6872:
6862:
5613:
Lam, Monica S.; Rothberg, Edward E.; Wolf, Michael E. (1991).
5588:"6.172 Performance Engineering of Software Systems, Lecture 8"
5615:
The Cache Performance and Optimizations of Blocked Algorithms
5278:, specialized multiplication algorithms have been developed.
5169:, which is asymptotically optimal (for algorithms performing
3936:
1828:
accounting for the eight recursive calls on matrices of size
421:
is to assume that the inputs are all square matrices of size
6242:
A cellular computer to implement the Kalman Filter Algorithm
5017:
waits for all previously "forked" computations to complete.
6597:
5905:
5556:(3rd ed.). MIT Press and McGraw-Hill. pp. 75–79.
5093:
transferred between nodes; in either case it is called the
6345:
5534:
3704:
how well Strassen's algorithm can be improved in terms of
5430:
New Bounds for Matrix Multiplication: from Alpha to Omega
3522:{\displaystyle 5n^{\log _{2}7}-4n^{2}+1.5n^{2}\log _{2}n}
928:
In the idealized cache model, this algorithm incurs only
5294:
There are a variety of algorithms for multiplication on
6184:"AI Reveals New Possibilities in Matrix Multiplication"
6033:"Toward an Optimal Algorithm for Matrix Multiplication"
3254:{\displaystyle 5n^{\log _{2}7}-4n^{2}+3n^{2}\log _{2}n}
5075:. In the 3D algorithm, every pair of submatrices from
4017:
3980:
2328:
2290:
2180:
2139:
2028:
1980:
1454:
1387:
1323:
1256:
1158:
1084:
1010:
6425:"Dimension Independent Matrix Square Using MapReduce"
6094:
4011:
3943:
3826:
3806:
3764:
3758:
matrix over a field can be multiplied together using
3738:
3718:
3536:
3444:
3268:
3176:
3061:
3001:
2886:
2826:
2766:
2746:
2707:{\displaystyle O(n^{\log _{2}7})\approx O(n^{2.807})}
2649:
2597:
2464:
2278:
2124:
1968:
1751:
1703:
1250:
998:
506:
cache lines), the above algorithm is sub-optimal for
198:
6422:
5585:
5507:"Matrix Multiplication Inches Closer to Mythic Goal"
5097:. The naïve algorithm using three nested loops uses
6679:
5929:"Matrix multiplication via arithmetic progressions"
5926:
5454:
Faster Matrix Multiplication via Asymmetric Hashing
5331:
Computational complexity of mathematical operations
564:The optimal variant of the iterative algorithm for
548:) incurs a cache miss when accessing an element of
268:{\displaystyle c_{ij}=\sum _{k=1}^{m}a_{ik}b_{kj}.}
113:, Xu, Xu, and Zhou. This improves on the bound of
5725:Numerical Recipes: The Art of Scientific Computing
4210:
3964:
3838:
3812:
3792:
3750:
3724:
3683:
3521:
3415:
3253:
3147:
3046:
2972:
2871:
2772:
2752:
2706:
2619:
2561:
2411:
2221:
2071:
1817:
1736:
1690:The complexity of this algorithm as a function of
1647:
1222:
267:
6598:Buttari, Alfredo; Langou, Julien; Kurzak, Jakub;
6305:
5336:Computational complexity of matrix multiplication
5088:Communication-avoiding and distributed algorithms
3991:
2579:Computational complexity of matrix multiplication
1845:master theorem for divide-and-conquer recurrences
982:An alternative to the iterative algorithm is the
95:computational complexity of matrix multiplication
93:in the 1960s, but the optimal time (that is, the
6895:
6640:
5824:
5427:
5306:on a standard two-dimensional mesh using the 2D
5083:that is multiplied is assigned to one processor.
5021:achieves its goal by pointer manipulation only.
2725:or so and appears in several libraries, such as
2714:. Strassen's algorithm is more complex, and the
6557:Kak, S (1988). "Multilayered array computing".
6465:
6382:
6160:
6068:"Discovering novel algorithms with AlphaTensor"
5612:
5586:Amarasinghe, Saman; Leiserson, Charles (2010).
977:
6090:
6088:
5851:
986:for matrix multiplication. This relies on the
417:). A common simplification for the purpose of
6730:
6466:Bae, S.E.; Shinn, T.-W.; Takaoka, T. (2014).
6423:Bosagh Zadeh, Reza; Carlsson, Gunnar (2013).
5825:Karstadt, Elaye; Schwartz, Oded (July 2017).
5478:
5452:Duan, Ran; Wu, Hongxun; Zhou, Renfei (2022),
2733:, where numerical stability is not an issue.
6744:
3985:
3800:field operations. The current best bound on
3732:, is the smallest real number for which any
1818:{\displaystyle T(n)=8T(n/2)+\Theta (n^{2}),}
439:, i.e., cubic in the size of the dimension.
6085:
5927:Coppersmith, Don; Winograd, Shmuel (1990),
5920:
5064:speedup, without using a temporary matrix.
4289:Otherwise, allocate space for a new matrix
294:, computing the above using a nested loop:
6737:
6723:
6341:
6339:
6258:"I/O complexity: The red-blue pebble game"
5961:
5451:
5310:, one can complete the multiplication in 3
3965:{\displaystyle \mathbb {Z} /2\mathbb {Z} }
1847:shows this recursion to have the solution
474:In particular, in the idealized case of a
6682:ACM Transactions on Mathematical Software
6654:
6643:ACM Transactions on Mathematical Software
6615:
6570:
6541:
6512:
6483:
6435:
6359:
6319:
6166:
6137:
5982:
5947:
5911:
5809:
5695:
5490:
5461:
5437:
3958:
3945:
1146:
1072:
6383:Solomonik, Edgar; Demmel, James (2011).
6301:
6299:
6297:
6255:
6181:
6030:
6024:
5827:"Matrix Multiplication, a Little Faster"
5791:
5756:
5646:
5644:
5642:
5504:
5423:
5421:
5376:
5374:
5372:
5370:
5285:
5281:
5066:
5056:, but a more practical variant achieves
2582:
446:
101:of a matrix multiplication algorithm is
27:, much work has been invested in making
6868:Basic Linear Algebra Subprograms (BLAS)
6336:
6290:from the original on December 15, 2019.
6214:Cilk: Efficient Multithreaded Computing
6210:
6206:
6204:
5878:
5718:Press, William H.; Flannery, Brian P.;
5594:. Massachusetts Institute of Technology
4809:(or do a short loop, perhaps unrolled).
2760:gives the dimensions of the matrix and
2572:
1855:, the same as the iterative algorithm.
620:be a new matrix of the appropriate size
552:. This means that the algorithm incurs
314:be a new matrix of the appropriate size
6896:
6394:. Vol. Part II. pp. 90–109.
6238:
5681:
5650:
5530:
5528:
5526:
5474:
5472:
5380:
5241:elements. This can be improved by the
4726:(wait for parallel forks to complete).
3047:{\displaystyle 6n^{\log _{2}7}-5n^{2}}
2872:{\displaystyle 7n^{\log _{2}7}-6n^{2}}
1858:
128:
6718:
6294:
5639:
5581:
5579:
5577:
5575:
5573:
5498:
5418:
5367:
2587:Improvement of estimates of exponent
6239:Cannon, Lynn Elliot (14 July 1969).
6201:
5858:SIAM Journal on Scientific Computing
5852:Schwartz, Oded; Vaknin, Noa (2023).
5445:
431:, in which case the running time is
23:is such a central operation in many
6556:
6535:
6498:
5798:Linear Algebra and Its Applications
5794:"On multiplication of 2×2 matrices"
5523:
5469:
5351:Sparse matrix–vector multiplication
5341:CYK algorithm § Valiant's algorithm
5267:times, and so requires a factor of
4286:(or multiply a small block matrix).
3981:Parallel and distributed algorithms
3839:{\displaystyle \omega <2.371552}
2451:lines of ideal cache, each of size
2426:
1921:version of the iterative algorithm.
469:row-major order, column-major order
135:definition of matrix multiplication
13:
6590:
6245:(Ph.D.). Montana State University.
6182:Brubaker, Ben (23 November 2022).
5570:
2465:
1793:
1719:
14:
6925:
6256:Hong, J. W.; Kung, H. T. (1981).
5722:; Vetterling, William T. (2007).
5505:Hartnett, Kevin (23 March 2021).
5384:(2012). "Sorting and Searching".
442:
6914:Matrix multiplication algorithms
6031:Robinson, Sara (November 2005),
5117:communication-avoiding algorithm
3793:{\displaystyle n^{\omega +o(1)}}
2795:#matrix multiplications per step
1917:is below some threshold, use an
1737:{\displaystyle T(1)=\Theta (1);}
514:stored in row-major order. When
29:matrix multiplication algorithms
6550:
6529:
6492:
6459:
6416:
6376:
6249:
6232:
6175:
6154:
6060:
5955:
5936:Journal of Symbolic Computation
5899:
5872:
5845:
5818:
5785:
5750:
5711:
5675:
5962:Iliopoulos, Costas S. (1989),
5833:. SPAA '17. pp. 101–110.
5606:
5180:In a distributed setting with
3921:
3785:
3779:
3710:matrix multiplication exponent
2701:
2688:
2679:
2653:
2620:{\displaystyle O(n^{\omega })}
2614:
2601:
1809:
1796:
1787:
1773:
1761:
1755:
1728:
1722:
1713:
1707:
16:Algorithm to multiply matrices
1:
5949:10.1016/S0747-7171(08)80013-2
5361:
4000:shared-memory multiprocessors
6581:10.1016/0020-0255(88)90010-2
6523:10.1016/0167-8191(88)90078-6
6400:10.1007/978-3-642-23397-5_10
5811:10.1016/0024-3795(71)90009-7
5298:. For multiplication of two
3992:divide-and-conquer algorithm
3702:theoretical computer science
2780:designates the memory size.
1241:. The matrix product is now
984:divide-and-conquer algorithm
978:Divide-and-conquer algorithm
7:
6626:10.1016/j.parco.2008.10.002
6485:10.1016/j.procs.2014.05.208
6308:J. Parallel Distrib. Comput
5402:10.1007/978-1-84800-070-4_4
5388:The Algorithm Design Manual
5346:Matrix chain multiplication
5324:
2801:total arithmetic operations
1694:is given by the recurrence
486:bytes per cache line (i.e.
453:row- and column-major order
67:operations to multiply two
10:
6930:
6781:System of linear equations
6330:10.1016/j.jpdc.2004.03.021
6211:Randall, Keith H. (1998).
6122:10.1038/s41586-022-05172-4
5730:Cambridge University Press
5657:Cache-Oblivious Algorithms
5553:Introduction to Algorithms
3930:introduced AlphaTensor, a
3868:-matrices with fewer than
3700:It is an open question in
2798:#matrix additions per step
2576:
960:cache misses; the divisor
77:matrices over that field (
6850:
6832:Cache-oblivious algorithm
6814:
6773:
6752:
6472:Procedia Computer Science
5971:SIAM Journal on Computing
5792:Winograd, Shmuel (1971).
5184:processors arranged in a
5105:communication bandwidth.
3986:Shared-memory parallelism
3751:{\displaystyle n\times n}
572:in row-major layout is a
6904:Numerical linear algebra
6883:General purpose software
6746:Numerical linear algebra
3994:sketched earlier can be
6665:10.1145/1356052.1356053
5839:10.1145/3087556.3087579
5356:Method of Four Russians
5095:communication bandwidth
3813:{\displaystyle \omega }
3725:{\displaystyle \omega }
476:fully associative cache
5860:. pp. C277–C303.
5291:
5084:
4212:
3966:
3840:
3814:
3794:
3752:
3726:
3685:
3523:
3417:
3255:
3149:
3048:
2974:
2873:
2774:
2754:
2708:
2628:
2621:
2563:
2413:
2223:
2073:
1819:
1738:
1649:
1224:
461:memory access patterns
455:
269:
235:
6878:Specialized libraries
6791:Matrix multiplication
6786:Matrix decompositions
6273:10.1145/800076.802486
5623:10.1145/106972.106981
5540:Leiserson, Charles E.
5392:. Springer. pp.
5289:
5282:Algorithms for meshes
5070:
5024:This algorithm has a
4213:
3967:
3888:that, given matrices
3886:Monte Carlo algorithm
3841:
3815:
3795:
3753:
3727:
3706:asymptotic complexity
3686:
3524:
3418:
3256:
3150:
3049:
2975:
2874:
2804:total I/O-complexity
2775:
2755:
2709:
2622:
2586:
2577:Further information:
2564:
2455:bytes, is bounded by
2414:
2224:
2074:
1820:
1739:
1664:scalar multiplication
1650:
1225:
450:
402:This algorithm takes
270:
215:
99:asymptotic complexity
21:matrix multiplication
6709:How To Optimize GEMM
6559:Information Sciences
6267:. pp. 326–333.
5111:, also known as the
5026:critical path length
4478:Parallel execution:
4009:
3941:
3882:Freivalds' algorithm
3824:
3804:
3762:
3736:
3716:
3534:
3442:
3266:
3174:
3059:
2999:
2884:
2824:
2764:
2744:
2647:
2633:Strassen's algorithm
2595:
2573:Sub-cubic algorithms
2462:
2276:
2122:
1966:
1749:
1701:
1248:
996:
471:, or a mix of both.
196:
189:matrix with entries
91:Strassen's algorithm
33:scientific computing
25:numerical algorithms
6765:Numerical stability
6446:2013arXiv1304.1467B
6370:10.1147/rd.395.0575
6114:2022Natur.610...47F
2785:
2716:numerical stability
1859:Non-square matrices
415:asymptotic notation
129:Iterative algorithm
37:pattern recognition
6604:Parallel Computing
6501:Parallel Computing
5866:10.1137/22M1502719
5771:10.1007/BF02165411
5720:Teukolsky, Saul A.
5592:MIT OpenCourseWare
5308:Cannon's algorithm
5292:
5109:Cannon's algorithm
5085:
4208:
4202:
3962:
3836:
3810:
3790:
3748:
3722:
3712:, usually denoted
3681:
3519:
3413:
3251:
3162:Karstadt, Schwartz
3145:
3044:
2970:
2869:
2783:
2770:
2750:
2704:
2629:
2617:
2559:
2409:
2357:
2317:
2219:
2213:
2166:
2069:
2063:
2009:
1815:
1734:
1687:as its base case.
1645:
1639:
1440:
1376:
1309:
1220:
1211:
1137:
1063:
988:block partitioning
456:
419:algorithm analysis
265:
123:galactic algorithm
6891:
6890:
6409:978-3-642-23397-5
5743:978-0-521-88068-8
5663:(Master's). MIT.
5632:978-0-89791-380-5
5544:Rivest, Ronald L.
5536:Cormen, Thomas H.
5411:978-1-84800-069-8
3694:
3693:
3666:
3665:
3655:
3564:
3563:
3553:
3398:
3397:
3387:
3296:
3295:
3285:
3089:
3088:
3078:
2914:
2913:
2903:
2773:{\displaystyle M}
2753:{\displaystyle n}
2552:
2549:
2522:
1870:Inputs: matrices
623:Pick a tile size
6921:
6801:Matrix splitting
6739:
6732:
6725:
6716:
6715:
6705:
6676:
6658:
6637:
6619:
6585:
6584:
6574:
6554:
6548:
6547:
6545:
6533:
6527:
6526:
6516:
6496:
6490:
6489:
6487:
6463:
6457:
6456:
6454:
6452:
6439:
6429:
6420:
6414:
6413:
6389:
6380:
6374:
6373:
6363:
6343:
6334:
6333:
6323:
6303:
6292:
6291:
6289:
6262:
6253:
6247:
6246:
6236:
6230:
6229:
6219:
6208:
6199:
6198:
6196:
6194:
6179:
6173:
6172:
6170:
6158:
6152:
6151:
6141:
6092:
6083:
6082:
6080:
6079:
6074:. 5 October 2022
6072:www.deepmind.com
6064:
6058:
6057:
6054:
6037:
6028:
6022:
6021:
6016:
6015:
6009:
6003:, archived from
5986:
5968:
5959:
5953:
5952:
5951:
5933:
5924:
5918:
5917:
5915:
5903:
5897:
5896:
5876:
5870:
5869:
5849:
5843:
5842:
5822:
5816:
5815:
5813:
5789:
5783:
5782:
5754:
5748:
5747:
5728:(3rd ed.).
5715:
5709:
5708:
5699:
5679:
5673:
5672:
5662:
5648:
5637:
5636:
5610:
5604:
5603:
5601:
5599:
5583:
5568:
5567:
5532:
5521:
5520:
5518:
5517:
5502:
5496:
5495:
5494:
5476:
5467:
5466:
5465:
5449:
5443:
5442:
5441:
5425:
5416:
5415:
5391:
5378:
5272:
5266:
5260:
5240:
5225:
5223:
5222:
5205:
5204:
5203:
5194:
5193:
5192:
5183:
5176:
5168:
5166:
5165:
5148:
5142:
5141:
5140:
5130:
5129:
5128:
5104:
5082:
5078:
5074:
5063:
5055:
5051:
5035:
5020:
4992:
4969:
4946:
4923:
4897:
4888:
4879:
4870:
4861:
4854:
4845:
4836:
4827:
4818:
4808:
4785:
4775:, element-wise:
4774:
4770:
4766:
4746:
4739:
4717:
4687:
4657:
4627:
4597:
4567:
4537:
4507:
4474:
4465:
4456:
4447:
4438:
4431:
4422:
4413:
4404:
4395:
4388:
4379:
4370:
4361:
4352:
4345:
4336:
4327:
4318:
4309:
4302:
4292:
4285:
4262:
4251:
4217:
4215:
4214:
4209:
4207:
4206:
4199:
4198:
4189:
4188:
4176:
4175:
4166:
4165:
4154:
4153:
4144:
4143:
4131:
4130:
4121:
4120:
4107:
4106:
4097:
4096:
4084:
4083:
4074:
4073:
4062:
4061:
4052:
4051:
4039:
4038:
4029:
4028:
3998:in two ways for
3971:
3969:
3968:
3963:
3961:
3953:
3948:
3917:
3907:
3899:
3895:
3891:
3873:
3867:
3845:
3843:
3842:
3837:
3819:
3817:
3816:
3811:
3799:
3797:
3796:
3791:
3789:
3788:
3757:
3755:
3754:
3749:
3731:
3729:
3728:
3723:
3690:
3688:
3687:
3682:
3671:
3667:
3661:
3660:
3656:
3651:
3648:
3639:
3638:
3626:
3625:
3610:
3609:
3588:
3587:
3580:
3579:
3569:
3565:
3559:
3558:
3554:
3549:
3546:
3528:
3526:
3525:
3520:
3512:
3511:
3502:
3501:
3486:
3485:
3470:
3469:
3462:
3461:
3430:Schwartz, Vaknin
3422:
3420:
3419:
3414:
3403:
3399:
3393:
3392:
3388:
3383:
3380:
3371:
3370:
3358:
3357:
3342:
3341:
3320:
3319:
3312:
3311:
3301:
3297:
3291:
3290:
3286:
3281:
3278:
3260:
3258:
3257:
3252:
3244:
3243:
3234:
3233:
3218:
3217:
3202:
3201:
3194:
3193:
3154:
3152:
3151:
3146:
3135:
3134:
3113:
3112:
3105:
3104:
3094:
3090:
3084:
3083:
3079:
3074:
3071:
3053:
3051:
3050:
3045:
3043:
3042:
3027:
3026:
3019:
3018:
2979:
2977:
2976:
2971:
2960:
2959:
2938:
2937:
2930:
2929:
2919:
2915:
2909:
2908:
2904:
2899:
2896:
2878:
2876:
2875:
2870:
2868:
2867:
2852:
2851:
2844:
2843:
2786:
2782:
2779:
2777:
2776:
2771:
2759:
2757:
2756:
2751:
2724:
2713:
2711:
2710:
2705:
2700:
2699:
2678:
2677:
2670:
2669:
2642:
2626:
2624:
2623:
2618:
2613:
2612:
2590:
2568:
2566:
2565:
2560:
2558:
2554:
2553:
2551:
2550:
2545:
2539:
2528:
2523:
2518:
2492:
2454:
2450:
2441:multiprogramming
2418:
2416:
2415:
2410:
2408:
2407:
2398:
2397:
2385:
2384:
2375:
2374:
2362:
2361:
2354:
2353:
2340:
2339:
2322:
2321:
2314:
2313:
2302:
2301:
2263:
2259:
2255:
2228:
2226:
2225:
2220:
2218:
2217:
2210:
2209:
2195:
2194:
2171:
2170:
2163:
2162:
2151:
2150:
2109:
2105:
2078:
2076:
2075:
2070:
2068:
2067:
2057:
2056:
2040:
2039:
2019:
2014:
2013:
2006:
2005:
1992:
1991:
1953:
1949:
1924:Recursive cases:
1916:
1897:
1887:
1883:
1873:
1854:
1842:
1834:
1824:
1822:
1821:
1816:
1808:
1807:
1783:
1743:
1741:
1740:
1735:
1693:
1686:
1654:
1652:
1651:
1646:
1644:
1643:
1636:
1635:
1626:
1625:
1613:
1612:
1603:
1602:
1591:
1590:
1581:
1580:
1568:
1567:
1558:
1557:
1544:
1543:
1534:
1533:
1521:
1520:
1511:
1510:
1499:
1498:
1489:
1488:
1476:
1475:
1466:
1465:
1445:
1444:
1437:
1436:
1425:
1424:
1411:
1410:
1399:
1398:
1381:
1380:
1373:
1372:
1361:
1360:
1347:
1346:
1335:
1334:
1314:
1313:
1306:
1305:
1294:
1293:
1280:
1279:
1268:
1267:
1240:
1236:
1229:
1227:
1226:
1221:
1216:
1215:
1208:
1207:
1196:
1195:
1182:
1181:
1170:
1169:
1142:
1141:
1134:
1133:
1122:
1121:
1108:
1107:
1096:
1095:
1068:
1067:
1060:
1059:
1048:
1047:
1034:
1033:
1022:
1021:
973:
972:
971:
959:
957:
955:
954:
953:
952:
941:
938:
922:
906:
885:
865:
849:
845:
839:
832:
816:
812:
805:
789:
785:
778:
747:
716:
682:
678:
674:
667:
663:
659:
652:
648:
644:
638:
636:
635:
619:
613:
609:
606:Input: matrices
599:
598:
597:
588:
587:
586:
571:
567:
559:
551:
547:
544:and a column of
543:
539:
538:
536:
535:
530:
527:
513:
509:
505:
503:
502:
497:
494:
485:
481:
451:Illustration of
438:
430:
412:
396:
386:
372:
352:
348:
342:
335:
331:
324:
320:
313:
307:
303:
300:Input: matrices
293:
289:
285:
281:
274:
272:
271:
266:
261:
260:
248:
247:
234:
229:
211:
210:
188:
178:
174:
170:
160:
156:
146:
120:
108:
84:
76:
63:
58:on the order of
6929:
6928:
6924:
6923:
6922:
6920:
6919:
6918:
6894:
6893:
6892:
6887:
6846:
6842:Multiprocessing
6810:
6806:Sparse problems
6769:
6748:
6743:
6713:
6694:10.1145/2764454
6656:10.1.1.140.3583
6593:
6591:Further reading
6588:
6555:
6551:
6534:
6530:
6497:
6493:
6464:
6460:
6450:
6448:
6427:
6421:
6417:
6410:
6387:
6381:
6377:
6348:IBM J. Res. Dev
6344:
6337:
6304:
6295:
6287:
6260:
6254:
6250:
6237:
6233:
6217:
6209:
6202:
6192:
6190:
6188:Quanta Magazine
6180:
6176:
6159:
6155:
6108:(7930): 47–53.
6093:
6086:
6077:
6075:
6066:
6065:
6061:
6049:
6035:
6029:
6025:
6013:
6011:
6007:
5993:10.1137/0218045
5984:10.1.1.531.9309
5966:
5960:
5956:
5931:
5925:
5921:
5904:
5900:
5893:10.1137/0205016
5877:
5873:
5850:
5846:
5823:
5819:
5790:
5786:
5755:
5751:
5744:
5716:
5712:
5706:10.1137/0204009
5697:10.1.1.148.9947
5680:
5676:
5660:
5649:
5640:
5633:
5611:
5607:
5597:
5595:
5584:
5571:
5564:
5548:Stein, Clifford
5533:
5524:
5515:
5513:
5511:Quanta Magazine
5503:
5499:
5477:
5470:
5450:
5446:
5426:
5419:
5412:
5379:
5368:
5364:
5327:
5284:
5268:
5262:
5247:
5227:
5218:
5216:
5207:
5199:
5197:
5196:
5188:
5186:
5185:
5181:
5170:
5161:
5159:
5150:
5144:
5135:
5133:
5132:
5123:
5121:
5120:
5098:
5090:
5080:
5076:
5072:
5057:
5053:
5041:
5029:
5018:
5007:
5006:
4990:
4983:
4976:
4967:
4960:
4953:
4944:
4937:
4930:
4921:
4914:
4907:
4896:
4890:
4887:
4881:
4878:
4872:
4869:
4863:
4859:
4853:
4847:
4844:
4838:
4835:
4829:
4826:
4820:
4816:
4807:
4800:
4793:
4787:
4780:
4772:
4768:
4756:
4744:
4729:
4715:
4708:
4701:
4694:
4685:
4678:
4671:
4664:
4655:
4648:
4641:
4634:
4625:
4618:
4611:
4604:
4595:
4588:
4581:
4574:
4565:
4558:
4551:
4544:
4535:
4528:
4521:
4514:
4505:
4498:
4491:
4484:
4473:
4467:
4464:
4458:
4455:
4449:
4446:
4440:
4436:
4430:
4424:
4421:
4415:
4412:
4406:
4403:
4397:
4393:
4387:
4381:
4378:
4372:
4369:
4363:
4360:
4354:
4350:
4344:
4338:
4335:
4329:
4326:
4320:
4317:
4311:
4307:
4294:
4290:
4284:
4277:
4270:
4264:
4257:
4237:
4224:fork–join style
4201:
4200:
4194:
4190:
4184:
4180:
4171:
4167:
4161:
4157:
4155:
4149:
4145:
4139:
4135:
4126:
4122:
4116:
4112:
4109:
4108:
4102:
4098:
4092:
4088:
4079:
4075:
4069:
4065:
4063:
4057:
4053:
4047:
4043:
4034:
4030:
4024:
4020:
4013:
4012:
4010:
4007:
4006:
3988:
3983:
3975:remains unknown
3957:
3949:
3944:
3942:
3939:
3938:
3924:
3909:
3901:
3897:
3893:
3889:
3869:
3859:
3856:Shmuel Winograd
3852:Don Coppersmith
3825:
3822:
3821:
3805:
3802:
3801:
3769:
3765:
3763:
3760:
3759:
3737:
3734:
3733:
3717:
3714:
3713:
3650:
3649:
3647:
3643:
3634:
3630:
3621:
3617:
3605:
3601:
3575:
3571:
3570:
3548:
3547:
3545:
3541:
3540:
3535:
3532:
3531:
3507:
3503:
3497:
3493:
3481:
3477:
3457:
3453:
3452:
3448:
3443:
3440:
3439:
3382:
3381:
3379:
3375:
3366:
3362:
3353:
3349:
3337:
3333:
3307:
3303:
3302:
3280:
3279:
3277:
3273:
3272:
3267:
3264:
3263:
3239:
3235:
3229:
3225:
3213:
3209:
3189:
3185:
3184:
3180:
3175:
3172:
3171:
3130:
3126:
3100:
3096:
3095:
3073:
3072:
3070:
3066:
3065:
3060:
3057:
3056:
3038:
3034:
3014:
3010:
3009:
3005:
3000:
2997:
2996:
2955:
2951:
2925:
2921:
2920:
2898:
2897:
2895:
2891:
2890:
2885:
2882:
2881:
2863:
2859:
2839:
2835:
2834:
2830:
2825:
2822:
2821:
2765:
2762:
2761:
2745:
2742:
2741:
2719:
2695:
2691:
2665:
2661:
2660:
2656:
2648:
2645:
2644:
2640:
2637:Volker Strassen
2608:
2604:
2596:
2593:
2592:
2588:
2581:
2575:
2544:
2540:
2529:
2527:
2493:
2491:
2472:
2468:
2463:
2460:
2459:
2452:
2448:
2437:cache-oblivious
2429:
2424:
2423:
2403:
2399:
2393:
2389:
2380:
2376:
2370:
2366:
2356:
2355:
2349:
2345:
2342:
2341:
2335:
2331:
2324:
2323:
2316:
2315:
2309:
2305:
2303:
2297:
2293:
2286:
2285:
2277:
2274:
2273:
2261:
2260:vertically and
2257:
2238:
2212:
2211:
2205:
2201:
2196:
2190:
2186:
2176:
2175:
2165:
2164:
2158:
2154:
2152:
2146:
2142:
2135:
2134:
2123:
2120:
2119:
2107:
2088:
2062:
2061:
2052:
2048:
2045:
2044:
2035:
2031:
2024:
2023:
2015:
2008:
2007:
2001:
1997:
1994:
1993:
1987:
1983:
1976:
1975:
1967:
1964:
1963:
1951:
1932:
1902:
1889:
1885:
1875:
1871:
1861:
1848:
1836:
1829:
1803:
1799:
1779:
1750:
1747:
1746:
1702:
1699:
1698:
1691:
1685:
1679:
1672:
1666:
1638:
1637:
1631:
1627:
1621:
1617:
1608:
1604:
1598:
1594:
1592:
1586:
1582:
1576:
1572:
1563:
1559:
1553:
1549:
1546:
1545:
1539:
1535:
1529:
1525:
1516:
1512:
1506:
1502:
1500:
1494:
1490:
1484:
1480:
1471:
1467:
1461:
1457:
1450:
1449:
1439:
1438:
1432:
1428:
1426:
1420:
1416:
1413:
1412:
1406:
1402:
1400:
1394:
1390:
1383:
1382:
1375:
1374:
1368:
1364:
1362:
1356:
1352:
1349:
1348:
1342:
1338:
1336:
1330:
1326:
1319:
1318:
1308:
1307:
1301:
1297:
1295:
1289:
1285:
1282:
1281:
1275:
1271:
1269:
1263:
1259:
1252:
1251:
1249:
1246:
1245:
1238:
1234:
1210:
1209:
1203:
1199:
1197:
1191:
1187:
1184:
1183:
1177:
1173:
1171:
1165:
1161:
1154:
1153:
1136:
1135:
1129:
1125:
1123:
1117:
1113:
1110:
1109:
1103:
1099:
1097:
1091:
1087:
1080:
1079:
1062:
1061:
1055:
1051:
1049:
1043:
1039:
1036:
1035:
1029:
1025:
1023:
1017:
1013:
1006:
1005:
997:
994:
993:
980:
967:
965:
961:
948:
946:
942:
939:
934:
933:
931:
929:
926:
925:
920:
903:
896:
891:
883:
876:
870:
851:
847:
843:
837:
818:
814:
810:
791:
787:
783:
777:
749:
746:
718:
715:
687:
680:
676:
672:
665:
661:
657:
650:
646:
642:
631:
629:
624:
617:
611:
607:
593:
591:
590:
582:
580:
579:
569:
565:
553:
549:
545:
541:
531:
528:
523:
522:
520:
515:
511:
507:
498:
495:
490:
489:
487:
483:
479:
445:
432:
422:
406:
400:
399:
394:
383:
378:
370:
363:
357:
350:
346:
340:
333:
329:
322:
318:
311:
305:
301:
291:
290:from 1 through
287:
283:
282:from 1 through
279:
253:
249:
240:
236:
230:
219:
203:
199:
197:
194:
193:
180:
176:
172:
162:
158:
148:
138:
131:
114:
109:time, given by
102:
78:
68:
59:
17:
12:
11:
5:
6927:
6917:
6916:
6911:
6906:
6889:
6888:
6886:
6885:
6880:
6875:
6870:
6865:
6860:
6854:
6852:
6848:
6847:
6845:
6844:
6839:
6834:
6829:
6824:
6818:
6816:
6812:
6811:
6809:
6808:
6803:
6798:
6788:
6783:
6777:
6775:
6771:
6770:
6768:
6767:
6762:
6760:Floating point
6756:
6754:
6750:
6749:
6742:
6741:
6734:
6727:
6719:
6712:
6711:
6706:
6677:
6638:
6600:Dongarra, Jack
6594:
6592:
6589:
6587:
6586:
6572:10.1.1.90.4753
6565:(3): 347–365.
6549:
6528:
6514:10.1.1.88.8527
6491:
6458:
6415:
6408:
6375:
6361:10.1.1.44.3404
6354:(5): 575–582.
6335:
6321:10.1.1.20.7034
6314:(9): 1017–26.
6293:
6248:
6231:
6200:
6174:
6153:
6084:
6059:
6023:
5977:(4): 658–669,
5954:
5919:
5898:
5887:(2): 187–203.
5881:SIAM J. Comput
5871:
5844:
5817:
5804:(4): 381–388.
5784:
5765:(4): 354–356.
5749:
5742:
5710:
5674:
5652:Prokop, Harald
5638:
5631:
5605:
5569:
5562:
5522:
5497:
5468:
5444:
5417:
5410:
5382:Skiena, Steven
5365:
5363:
5360:
5359:
5358:
5353:
5348:
5343:
5338:
5333:
5326:
5323:
5283:
5280:
5177:computation).
5089:
5086:
5005:
5004:
5003:
5002:
4996:
4995:
4994:
4988:
4981:
4971:
4965:
4958:
4948:
4942:
4935:
4925:
4919:
4912:
4899:
4894:
4885:
4876:
4867:
4856:
4851:
4842:
4833:
4824:
4810:
4805:
4798:
4791:
4779:Base case: if
4751:
4750:
4749:
4748:
4741:
4727:
4721:
4720:
4719:
4713:
4706:
4699:
4689:
4683:
4676:
4669:
4659:
4653:
4646:
4639:
4629:
4623:
4616:
4609:
4599:
4593:
4586:
4579:
4569:
4563:
4556:
4549:
4539:
4533:
4526:
4519:
4509:
4503:
4496:
4489:
4476:
4471:
4462:
4453:
4444:
4433:
4428:
4419:
4410:
4401:
4390:
4385:
4376:
4367:
4358:
4347:
4342:
4333:
4324:
4315:
4287:
4282:
4275:
4268:
4256:Base case: if
4232:
4231:
4219:
4218:
4205:
4197:
4193:
4187:
4183:
4179:
4174:
4170:
4164:
4160:
4156:
4152:
4148:
4142:
4138:
4134:
4129:
4125:
4119:
4115:
4111:
4110:
4105:
4101:
4095:
4091:
4087:
4082:
4078:
4072:
4068:
4064:
4060:
4056:
4050:
4046:
4042:
4037:
4033:
4027:
4023:
4019:
4018:
4016:
3987:
3984:
3982:
3979:
3960:
3956:
3952:
3947:
3932:neural network
3923:
3920:
3900:, verifies in
3876:Big O notation
3835:
3832:
3829:
3809:
3787:
3784:
3781:
3778:
3775:
3772:
3768:
3747:
3744:
3741:
3721:
3692:
3691:
3680:
3677:
3674:
3670:
3664:
3659:
3654:
3646:
3642:
3637:
3633:
3629:
3624:
3620:
3616:
3613:
3608:
3604:
3600:
3597:
3594:
3591:
3586:
3583:
3578:
3574:
3568:
3562:
3557:
3552:
3544:
3539:
3529:
3518:
3515:
3510:
3506:
3500:
3496:
3492:
3489:
3484:
3480:
3476:
3473:
3468:
3465:
3460:
3456:
3451:
3447:
3437:
3434:
3431:
3428:
3424:
3423:
3412:
3409:
3406:
3402:
3396:
3391:
3386:
3378:
3374:
3369:
3365:
3361:
3356:
3352:
3348:
3345:
3340:
3336:
3332:
3329:
3326:
3323:
3318:
3315:
3310:
3306:
3300:
3294:
3289:
3284:
3276:
3271:
3261:
3250:
3247:
3242:
3238:
3232:
3228:
3224:
3221:
3216:
3212:
3208:
3205:
3200:
3197:
3192:
3188:
3183:
3179:
3169:
3166:
3163:
3160:
3156:
3155:
3144:
3141:
3138:
3133:
3129:
3125:
3122:
3119:
3116:
3111:
3108:
3103:
3099:
3093:
3087:
3082:
3077:
3069:
3064:
3054:
3041:
3037:
3033:
3030:
3025:
3022:
3017:
3013:
3008:
3004:
2994:
2991:
2988:
2985:
2981:
2980:
2969:
2966:
2963:
2958:
2954:
2950:
2947:
2944:
2941:
2936:
2933:
2928:
2924:
2918:
2912:
2907:
2902:
2894:
2889:
2879:
2866:
2862:
2858:
2855:
2850:
2847:
2842:
2838:
2833:
2829:
2819:
2816:
2813:
2810:
2806:
2805:
2802:
2799:
2796:
2793:
2790:
2769:
2749:
2738:big O notation
2703:
2698:
2694:
2690:
2687:
2684:
2681:
2676:
2673:
2668:
2664:
2659:
2655:
2652:
2616:
2611:
2607:
2603:
2600:
2574:
2571:
2570:
2569:
2557:
2548:
2543:
2538:
2535:
2532:
2526:
2521:
2517:
2514:
2511:
2508:
2505:
2502:
2499:
2496:
2490:
2487:
2484:
2481:
2478:
2475:
2471:
2467:
2428:
2427:Cache behavior
2425:
2422:
2421:
2420:
2419:
2406:
2402:
2396:
2392:
2388:
2383:
2379:
2373:
2369:
2365:
2360:
2352:
2348:
2344:
2343:
2338:
2334:
2330:
2329:
2327:
2320:
2312:
2308:
2304:
2300:
2296:
2292:
2291:
2289:
2284:
2281:
2268:
2267:
2266:
2265:
2232:
2231:
2230:
2229:
2216:
2208:
2204:
2200:
2197:
2193:
2189:
2185:
2182:
2181:
2179:
2174:
2169:
2161:
2157:
2153:
2149:
2145:
2141:
2140:
2138:
2133:
2130:
2127:
2114:
2113:
2112:
2111:
2082:
2081:
2080:
2079:
2066:
2060:
2055:
2051:
2047:
2046:
2043:
2038:
2034:
2030:
2029:
2027:
2022:
2018:
2012:
2004:
2000:
1996:
1995:
1990:
1986:
1982:
1981:
1979:
1974:
1971:
1958:
1957:
1956:
1955:
1926:
1925:
1922:
1901:Base case: if
1899:
1867:
1866:
1860:
1857:
1826:
1825:
1814:
1811:
1806:
1802:
1798:
1795:
1792:
1789:
1786:
1782:
1778:
1775:
1772:
1769:
1766:
1763:
1760:
1757:
1754:
1744:
1733:
1730:
1727:
1724:
1721:
1718:
1715:
1712:
1709:
1706:
1683:
1677:
1670:
1656:
1655:
1642:
1634:
1630:
1624:
1620:
1616:
1611:
1607:
1601:
1597:
1593:
1589:
1585:
1579:
1575:
1571:
1566:
1562:
1556:
1552:
1548:
1547:
1542:
1538:
1532:
1528:
1524:
1519:
1515:
1509:
1505:
1501:
1497:
1493:
1487:
1483:
1479:
1474:
1470:
1464:
1460:
1456:
1455:
1453:
1448:
1443:
1435:
1431:
1427:
1423:
1419:
1415:
1414:
1409:
1405:
1401:
1397:
1393:
1389:
1388:
1386:
1379:
1371:
1367:
1363:
1359:
1355:
1351:
1350:
1345:
1341:
1337:
1333:
1329:
1325:
1324:
1322:
1317:
1312:
1304:
1300:
1296:
1292:
1288:
1284:
1283:
1278:
1274:
1270:
1266:
1262:
1258:
1257:
1255:
1231:
1230:
1219:
1214:
1206:
1202:
1198:
1194:
1190:
1186:
1185:
1180:
1176:
1172:
1168:
1164:
1160:
1159:
1157:
1152:
1149:
1145:
1140:
1132:
1128:
1124:
1120:
1116:
1112:
1111:
1106:
1102:
1098:
1094:
1090:
1086:
1085:
1083:
1078:
1075:
1071:
1066:
1058:
1054:
1050:
1046:
1042:
1038:
1037:
1032:
1028:
1024:
1020:
1016:
1012:
1011:
1009:
1004:
1001:
979:
976:
924:
923:
917:
916:
915:
914:
913:
912:
911:
910:
909:
908:
907:
901:
894:
888:
887:
886:
881:
874:
840:
780:
753:
722:
691:
639:
621:
614:
603:
602:
478:consisting of
444:
443:Cache behavior
441:
398:
397:
391:
390:
389:
388:
387:
381:
375:
374:
373:
368:
361:
343:
315:
308:
297:
296:
276:
275:
264:
259:
256:
252:
246:
243:
239:
233:
228:
225:
222:
218:
214:
209:
206:
202:
130:
127:
87:big O notation
15:
9:
6:
4:
3:
2:
6926:
6915:
6912:
6910:
6909:Matrix theory
6907:
6905:
6902:
6901:
6899:
6884:
6881:
6879:
6876:
6874:
6871:
6869:
6866:
6864:
6861:
6859:
6856:
6855:
6853:
6849:
6843:
6840:
6838:
6835:
6833:
6830:
6828:
6825:
6823:
6820:
6819:
6817:
6813:
6807:
6804:
6802:
6799:
6796:
6792:
6789:
6787:
6784:
6782:
6779:
6778:
6776:
6772:
6766:
6763:
6761:
6758:
6757:
6755:
6751:
6747:
6740:
6735:
6733:
6728:
6726:
6721:
6720:
6717:
6710:
6707:
6703:
6699:
6695:
6691:
6687:
6683:
6678:
6674:
6670:
6666:
6662:
6657:
6652:
6648:
6644:
6639:
6635:
6631:
6627:
6623:
6618:
6613:
6609:
6605:
6601:
6596:
6595:
6582:
6578:
6573:
6568:
6564:
6560:
6553:
6544:
6539:
6532:
6524:
6520:
6515:
6510:
6506:
6502:
6495:
6486:
6481:
6477:
6473:
6469:
6462:
6447:
6443:
6438:
6433:
6426:
6419:
6411:
6405:
6401:
6397:
6393:
6386:
6379:
6371:
6367:
6362:
6357:
6353:
6349:
6342:
6340:
6331:
6327:
6322:
6317:
6313:
6309:
6302:
6300:
6298:
6286:
6282:
6278:
6274:
6270:
6266:
6259:
6252:
6244:
6243:
6235:
6227:
6223:
6216:
6215:
6207:
6205:
6189:
6185:
6178:
6169:
6164:
6157:
6149:
6145:
6140:
6135:
6131:
6127:
6123:
6119:
6115:
6111:
6107:
6103:
6099:
6091:
6089:
6073:
6069:
6063:
6056:
6052:
6045:
6041:
6034:
6027:
6020:
6010:on 2014-03-05
6006:
6002:
5998:
5994:
5990:
5985:
5980:
5976:
5972:
5965:
5958:
5950:
5945:
5941:
5937:
5930:
5923:
5914:
5909:
5902:
5894:
5890:
5886:
5882:
5875:
5867:
5863:
5859:
5855:
5848:
5840:
5836:
5832:
5828:
5821:
5812:
5807:
5803:
5799:
5795:
5788:
5780:
5776:
5772:
5768:
5764:
5760:
5753:
5745:
5739:
5735:
5731:
5727:
5726:
5721:
5714:
5707:
5703:
5698:
5693:
5690:(2): 97–107,
5689:
5685:
5678:
5670:
5666:
5659:
5658:
5653:
5647:
5645:
5643:
5634:
5628:
5624:
5620:
5616:
5609:
5593:
5589:
5582:
5580:
5578:
5576:
5574:
5565:
5563:0-262-03384-4
5559:
5555:
5554:
5549:
5545:
5541:
5537:
5531:
5529:
5527:
5512:
5508:
5501:
5493:
5488:
5484:
5483:
5475:
5473:
5464:
5459:
5455:
5448:
5440:
5435:
5431:
5424:
5422:
5413:
5407:
5403:
5399:
5395:
5390:
5389:
5383:
5377:
5375:
5373:
5371:
5366:
5357:
5354:
5352:
5349:
5347:
5344:
5342:
5339:
5337:
5334:
5332:
5329:
5328:
5322:
5320:
5315:
5313:
5309:
5305:
5301:
5297:
5288:
5279:
5277:
5271:
5265:
5258:
5254:
5250:
5244:
5243:3D algorithm,
5238:
5234:
5230:
5221:
5214:
5210:
5202:
5191:
5178:
5174:
5164:
5157:
5153:
5147:
5138:
5126:
5118:
5114:
5110:
5106:
5102:
5096:
5069:
5065:
5061:
5049:
5045:
5039:
5033:
5027:
5022:
5016:
5012:
5000:
4997:
4987:
4980:
4975:
4972:
4964:
4957:
4952:
4949:
4941:
4934:
4929:
4926:
4918:
4911:
4906:
4903:
4902:
4901:In parallel:
4900:
4893:
4884:
4875:
4866:
4857:
4850:
4841:
4832:
4823:
4814:
4813:
4811:
4804:
4797:
4790:
4783:
4778:
4777:
4776:
4764:
4760:
4755:
4742:
4737:
4733:
4728:
4725:
4722:
4712:
4705:
4698:
4693:
4690:
4682:
4675:
4668:
4663:
4660:
4652:
4645:
4638:
4633:
4630:
4622:
4615:
4608:
4603:
4600:
4592:
4585:
4578:
4573:
4570:
4562:
4555:
4548:
4543:
4540:
4532:
4525:
4518:
4513:
4510:
4502:
4495:
4488:
4483:
4480:
4479:
4477:
4470:
4461:
4452:
4443:
4434:
4427:
4418:
4409:
4400:
4391:
4384:
4375:
4366:
4357:
4348:
4341:
4332:
4323:
4314:
4305:
4304:
4301:
4297:
4288:
4281:
4274:
4267:
4260:
4255:
4254:
4253:
4249:
4245:
4241:
4236:
4230:
4228:
4225:
4203:
4195:
4191:
4185:
4181:
4177:
4172:
4168:
4162:
4158:
4150:
4146:
4140:
4136:
4132:
4127:
4123:
4117:
4113:
4103:
4099:
4093:
4089:
4085:
4080:
4076:
4070:
4066:
4058:
4054:
4048:
4044:
4040:
4035:
4031:
4025:
4021:
4014:
4005:
4004:
4003:
4001:
3997:
3993:
3978:
3976:
3972:
3954:
3950:
3937:finite field
3933:
3929:
3919:
3916:
3912:
3905:
3887:
3883:
3879:
3877:
3872:
3866:
3862:
3857:
3853:
3849:
3833:
3830:
3827:
3807:
3782:
3776:
3773:
3770:
3766:
3745:
3742:
3739:
3719:
3711:
3707:
3703:
3698:
3678:
3675:
3672:
3668:
3662:
3657:
3652:
3644:
3640:
3635:
3631:
3627:
3622:
3618:
3614:
3611:
3606:
3602:
3598:
3595:
3592:
3589:
3584:
3581:
3576:
3572:
3566:
3560:
3555:
3550:
3542:
3537:
3530:
3516:
3513:
3508:
3504:
3498:
3494:
3490:
3487:
3482:
3478:
3474:
3471:
3466:
3463:
3458:
3454:
3449:
3445:
3438:
3435:
3432:
3429:
3426:
3425:
3410:
3407:
3404:
3400:
3394:
3389:
3384:
3376:
3372:
3367:
3363:
3359:
3354:
3350:
3346:
3343:
3338:
3334:
3330:
3327:
3324:
3321:
3316:
3313:
3308:
3304:
3298:
3292:
3287:
3282:
3274:
3269:
3262:
3248:
3245:
3240:
3236:
3230:
3226:
3222:
3219:
3214:
3210:
3206:
3203:
3198:
3195:
3190:
3186:
3181:
3177:
3170:
3167:
3164:
3161:
3158:
3157:
3142:
3139:
3136:
3131:
3127:
3123:
3120:
3117:
3114:
3109:
3106:
3101:
3097:
3091:
3085:
3080:
3075:
3067:
3062:
3055:
3039:
3035:
3031:
3028:
3023:
3020:
3015:
3011:
3006:
3002:
2995:
2992:
2989:
2986:
2983:
2982:
2967:
2964:
2961:
2956:
2952:
2948:
2945:
2942:
2939:
2934:
2931:
2926:
2922:
2916:
2910:
2905:
2900:
2892:
2887:
2880:
2864:
2860:
2856:
2853:
2848:
2845:
2840:
2836:
2831:
2827:
2820:
2817:
2814:
2811:
2808:
2807:
2803:
2800:
2797:
2794:
2791:
2788:
2787:
2781:
2767:
2747:
2739:
2734:
2732:
2731:finite fields
2728:
2722:
2717:
2696:
2692:
2685:
2682:
2674:
2671:
2666:
2662:
2657:
2650:
2638:
2635:, devised by
2634:
2609:
2605:
2598:
2585:
2580:
2555:
2546:
2541:
2536:
2533:
2530:
2524:
2519:
2515:
2512:
2509:
2506:
2503:
2500:
2497:
2494:
2488:
2485:
2482:
2479:
2476:
2473:
2469:
2458:
2457:
2456:
2445:
2442:
2438:
2434:
2404:
2400:
2394:
2390:
2386:
2381:
2377:
2371:
2367:
2363:
2358:
2350:
2346:
2336:
2332:
2325:
2318:
2310:
2306:
2298:
2294:
2287:
2282:
2279:
2272:
2271:
2270:
2269:
2264:horizontally:
2254:
2250:
2246:
2242:
2236:
2235:
2234:
2233:
2214:
2206:
2202:
2198:
2191:
2187:
2183:
2177:
2172:
2167:
2159:
2155:
2147:
2143:
2136:
2131:
2128:
2125:
2118:
2117:
2116:
2115:
2104:
2100:
2096:
2092:
2086:
2085:
2084:
2083:
2064:
2058:
2053:
2049:
2041:
2036:
2032:
2025:
2020:
2016:
2010:
2002:
1998:
1988:
1984:
1977:
1972:
1969:
1962:
1961:
1960:
1959:
1954:horizontally:
1948:
1944:
1940:
1936:
1930:
1929:
1928:
1927:
1923:
1920:
1914:
1910:
1906:
1900:
1896:
1892:
1882:
1878:
1869:
1868:
1865:
1856:
1852:
1846:
1840:
1832:
1812:
1804:
1800:
1790:
1784:
1780:
1776:
1770:
1767:
1764:
1758:
1752:
1745:
1731:
1725:
1716:
1710:
1704:
1697:
1696:
1695:
1688:
1682:
1676:
1669:
1665:
1661:
1640:
1632:
1628:
1622:
1618:
1614:
1609:
1605:
1599:
1595:
1587:
1583:
1577:
1573:
1569:
1564:
1560:
1554:
1550:
1540:
1536:
1530:
1526:
1522:
1517:
1513:
1507:
1503:
1495:
1491:
1485:
1481:
1477:
1472:
1468:
1462:
1458:
1451:
1446:
1441:
1433:
1429:
1421:
1417:
1407:
1403:
1395:
1391:
1384:
1377:
1369:
1365:
1357:
1353:
1343:
1339:
1331:
1327:
1320:
1315:
1310:
1302:
1298:
1290:
1286:
1276:
1272:
1264:
1260:
1253:
1244:
1243:
1242:
1217:
1212:
1204:
1200:
1192:
1188:
1178:
1174:
1166:
1162:
1155:
1150:
1147:
1143:
1138:
1130:
1126:
1118:
1114:
1104:
1100:
1092:
1088:
1081:
1076:
1073:
1069:
1064:
1056:
1052:
1044:
1040:
1030:
1026:
1018:
1014:
1007:
1002:
999:
992:
991:
990:
989:
985:
975:
970:
964:
951:
945:
937:
918:
904:
897:
889:
884:
877:
868:
867:
863:
859:
855:
841:
835:
834:
830:
826:
822:
808:
807:
803:
799:
795:
781:
776:
772:
768:
764:
760:
756:
752:
745:
741:
737:
733:
729:
725:
721:
714:
710:
706:
702:
698:
694:
690:
685:
684:
670:
669:
655:
654:
640:
634:
627:
622:
615:
605:
604:
601:
596:
585:
577:
576:
562:
557:
534:
526:
518:
501:
493:
477:
472:
470:
466:
462:
454:
449:
440:
436:
429:
425:
420:
416:
410:
405:
392:
384:
376:
371:
364:
355:
354:
344:
338:
337:
327:
326:
316:
309:
299:
298:
295:
262:
257:
254:
250:
244:
241:
237:
231:
226:
223:
220:
216:
212:
207:
204:
200:
192:
191:
190:
187:
183:
169:
165:
155:
151:
145:
141:
136:
126:
124:
118:
112:
106:
100:
96:
92:
88:
82:
75:
71:
66:
62:
57:
52:
50:
46:
42:
38:
34:
30:
26:
22:
6794:
6753:Key concepts
6685:
6681:
6646:
6642:
6607:
6603:
6562:
6558:
6552:
6531:
6507:(3): 383–5.
6504:
6500:
6494:
6475:
6471:
6461:
6449:. Retrieved
6418:
6391:
6378:
6351:
6347:
6311:
6307:
6264:
6251:
6241:
6234:
6226:1721.1/47519
6213:
6191:. Retrieved
6187:
6177:
6156:
6105:
6101:
6076:. Retrieved
6071:
6062:
6050:
6047:
6043:
6039:
6026:
6018:
6012:, retrieved
6005:the original
5974:
5970:
5957:
5939:
5935:
5922:
5901:
5884:
5880:
5874:
5857:
5847:
5830:
5820:
5801:
5797:
5787:
5762:
5758:
5752:
5724:
5713:
5687:
5683:
5677:
5669:1721.1/80568
5656:
5614:
5608:
5596:. Retrieved
5591:
5552:
5514:. Retrieved
5510:
5500:
5481:
5453:
5447:
5429:
5396:–46, 401–3.
5387:
5318:
5316:
5311:
5303:
5299:
5293:
5269:
5263:
5256:
5252:
5248:
5242:
5236:
5232:
5228:
5219:
5212:
5208:
5200:
5189:
5179:
5172:
5162:
5155:
5151:
5145:
5136:
5124:
5113:2D algorithm
5112:
5107:
5100:
5094:
5091:
5059:
5047:
5043:
5031:
5023:
5014:
5010:
5008:
4998:
4985:
4978:
4973:
4962:
4955:
4950:
4939:
4932:
4927:
4916:
4909:
4904:
4891:
4882:
4873:
4864:
4848:
4839:
4830:
4821:
4802:
4795:
4788:
4781:
4762:
4758:
4753:
4752:
4735:
4731:
4723:
4710:
4703:
4696:
4691:
4680:
4673:
4666:
4661:
4650:
4643:
4636:
4631:
4620:
4613:
4606:
4601:
4590:
4583:
4576:
4571:
4560:
4553:
4546:
4541:
4530:
4523:
4516:
4511:
4500:
4493:
4486:
4481:
4468:
4459:
4450:
4441:
4425:
4416:
4407:
4398:
4382:
4373:
4364:
4355:
4339:
4330:
4321:
4312:
4299:
4295:
4279:
4272:
4265:
4258:
4247:
4243:
4239:
4234:
4233:
4220:
3996:parallelized
3989:
3925:
3914:
3910:
3903:
3884:is a simple
3880:
3870:
3864:
3860:
3709:
3699:
3695:
2735:
2720:
2630:
2446:
2430:
2252:
2248:
2244:
2240:
2102:
2098:
2094:
2090:
1946:
1942:
1938:
1934:
1912:
1908:
1904:
1894:
1890:
1880:
1876:
1864:dimensions.
1862:
1850:
1838:
1830:
1827:
1689:
1680:
1674:
1667:
1662:, using the
1657:
1232:
981:
968:
962:
949:
943:
935:
927:
899:
892:
879:
872:
871:sum ← sum +
861:
857:
853:
828:
824:
820:
801:
797:
793:
774:
770:
766:
762:
758:
754:
750:
743:
739:
735:
731:
727:
723:
719:
712:
708:
704:
700:
696:
692:
688:
679:in steps of
664:in steps of
649:in steps of
632:
625:
594:
583:
573:
563:
555:
532:
524:
516:
499:
491:
473:
457:
434:
427:
423:
408:
401:
379:
366:
359:
358:sum ← sum +
277:
185:
181:
167:
163:
153:
149:
143:
139:
132:
116:
104:
80:
73:
69:
60:
53:
28:
18:
6688:(3): 1–33.
6649:(3): 1–25.
6478:: 2230–40.
6193:26 November
5759:Numer. Math
4812:Otherwise:
4743:Deallocate
3922:AlphaTensor
2237:Otherwise,
2110:vertically:
1660:recursively
137:is that if
49:distributed
6898:Categories
6795:algorithms
6168:2212.01175
6078:2022-11-01
6014:2015-01-16
5942:(3): 251,
5913:2008.03759
5732:. p.
5598:27 January
5516:2021-04-01
5492:2010.05846
5463:2210.10173
5439:2307.07970
5362:References
4858:Partition
4815:Partition
4435:Partition
4392:Partition
4349:Partition
4306:Partition
4227:pseudocode
779:, that is:
675:from 1 to
660:from 1 to
645:from 1 to
482:bytes and
349:from 1 to
332:from 1 to
321:from 1 to
56:takes time
6822:CPU cache
6651:CiteSeerX
6617:0709.1272
6610:: 38–53.
6567:CiteSeerX
6543:1411.3273
6509:CiteSeerX
6437:1304.1467
6356:CiteSeerX
6316:CiteSeerX
6130:1476-4687
6040:SIAM News
5979:CiteSeerX
5779:121656251
5692:CiteSeerX
5684:SIAM News
5550:(2009) .
5276:MapReduce
5019:partition
4754:Procedure
4695:multiply(
4665:multiply(
4635:multiply(
4605:multiply(
4575:multiply(
4545:multiply(
4515:multiply(
4485:multiply(
4293:of shape
4238:multiply(
4235:Procedure
3926:In 2022,
3828:ω
3808:ω
3771:ω
3743:×
3720:ω
3641:
3628:⋅
3596:−
3590:⋅
3582:
3514:
3472:−
3464:
3373:
3360:⋅
3328:−
3322:⋅
3314:
3246:
3204:−
3196:
3121:−
3115:⋅
3107:
3029:−
3021:
2946:−
2940:⋅
2932:
2854:−
2846:
2792:Reference
2683:≈
2672:
2610:ω
2466:Θ
2087:Else, if
1794:Θ
1720:Θ
1237:for some
686:Multiply
217:∑
6851:Software
6815:Hardware
6774:Problems
6285:Archived
6148:36198780
5654:(1999).
5325:See also
5143:, where
4303:, then:
3928:DeepMind
3908:time if
3848:Williams
3834:2.371552
2987:Winograd
2812:Strassen
2723:> 100
2256:. Split
2106:, split
1950:, split
1919:unrolled
1888:of size
1874:of size
111:Williams
45:parallel
19:Because
6702:1242360
6673:9359223
6451:12 July
6442:Bibcode
6281:8410593
6139:9534758
6110:Bibcode
6001:1004789
5217:√
5198:√
5187:√
5160:√
5134:√
5122:√
5115:, is a
5038:speedup
966:√
956:
947:√
932:
919:Return
838:sum = 0
630:√
592:√
581:√
537:
521:
504:
488:
393:Return
341:sum = 0
175:, then
171:matrix
161:and an
157:matrix
147:for an
6873:LAPACK
6863:MATLAB
6700:
6671:
6653:
6632:
6569:
6511:
6406:
6358:
6318:
6279:
6146:
6136:
6128:
6102:Nature
5999:
5981:
5777:
5740:
5694:
5629:
5560:
5408:
5296:meshes
5030:Θ(log
5009:Here,
4786:, set
4263:, set
3708:. The
179:is an
6858:ATLAS
6698:S2CID
6669:S2CID
6630:S2CID
6612:arXiv
6538:arXiv
6432:arXiv
6428:(PDF)
6388:(PDF)
6288:(PDF)
6277:S2CID
6261:(PDF)
6218:(PDF)
6163:arXiv
6046:(9),
6036:(PDF)
6008:(PDF)
5967:(PDF)
5932:(PDF)
5908:arXiv
5775:S2CID
5661:(PDF)
5487:arXiv
5458:arXiv
5434:arXiv
5046:/log
4862:into
4819:into
4771:into
4767:adds
4439:into
4396:into
4353:into
4310:into
3846:, by
2697:2.807
2641:2 × 2
2433:tiled
1235:2 × 2
905:+ sum
846:from
813:from
786:from
748:into
575:tiled
519:>
465:cache
385:← sum
65:field
41:graph
6837:SIMD
6453:2014
6404:ISBN
6195:2022
6144:PMID
6126:ISSN
5738:ISBN
5627:ISBN
5600:2015
5558:ISBN
5406:ISBN
5079:and
5015:join
5011:fork
4999:Join
4977:add(
4974:Fork
4954:add(
4951:Fork
4931:add(
4928:Fork
4908:add(
4905:Fork
4757:add(
4730:add(
4724:Join
4692:Fork
4662:Fork
4632:Fork
4602:Fork
4572:Fork
4542:Fork
4512:Fork
4482:Fork
3990:The
3896:and
3854:and
3831:<
3427:2023
3159:2017
2984:1971
2809:1969
2789:Year
2727:BLAS
2251:) =
2239:max(
2101:) =
2089:max(
1945:) =
1933:max(
1903:max(
1835:and
890:Set
869:Set
852:min(
842:For
836:Let
819:min(
809:For
792:min(
782:For
717:and
671:For
656:For
641:For
628:= Θ(
616:Let
610:and
568:and
510:and
463:and
413:(in
404:time
377:Set
356:Set
345:For
339:Let
328:For
317:For
310:Let
304:and
286:and
133:The
47:and
35:and
6827:TLB
6690:doi
6661:doi
6634:955
6622:doi
6577:doi
6519:doi
6480:doi
6396:doi
6366:doi
6326:doi
6269:doi
6222:hdl
6134:PMC
6118:doi
6106:610
6053:= 2
5989:doi
5944:doi
5889:doi
5862:doi
5835:doi
5806:doi
5767:doi
5734:108
5702:doi
5665:hdl
5619:doi
5398:doi
5195:by
5131:by
5040:of
5028:of
4784:= 1
4261:= 1
3820:is
3632:log
3615:1.5
3573:log
3505:log
3491:1.5
3455:log
3364:log
3305:log
3237:log
3187:log
3098:log
3012:log
2923:log
2837:log
2663:log
1931:If
850:to
817:to
790:to
589:by
409:nmp
85:in
6900::
6696:.
6686:41
6684:.
6667:.
6659:.
6647:34
6645:.
6628:.
6620:.
6608:35
6606:.
6575:.
6563:45
6561:.
6517:.
6503:.
6476:29
6474:.
6470:.
6440:.
6430:.
6402:.
6390:.
6364:.
6352:39
6350:.
6338:^
6324:.
6312:64
6310:.
6296:^
6283:.
6275:.
6263:.
6203:^
6186:.
6142:.
6132:.
6124:.
6116:.
6104:.
6100:.
6087:^
6070:.
6044:38
6042:,
6038:,
6017:,
5997:MR
5995:,
5987:,
5975:18
5973:,
5969:,
5938:,
5934:,
5883:.
5856:.
5829:.
5800:.
5796:.
5773:.
5763:13
5761:.
5736:.
5700:,
5686:,
5641:^
5625:.
5590:.
5572:^
5546:;
5542:;
5538:;
5525:^
5509:.
5485:,
5471:^
5456:,
5432:,
5420:^
5404:.
5394:45
5369:^
5171:Ω(
5139:/3
5127:/3
5099:Ω(
5058:Θ(
5042:Θ(
4989:22
4984:,
4982:22
4966:21
4961:,
4959:21
4943:12
4938:,
4936:12
4920:11
4915:,
4913:11
4895:22
4889:,
4886:21
4880:,
4877:12
4871:,
4868:11
4852:22
4846:,
4843:21
4837:,
4834:12
4828:,
4825:11
4806:11
4801:+
4799:11
4794:←
4792:11
4761:,
4734:,
4714:22
4709:,
4707:22
4702:,
4700:22
4684:21
4679:,
4677:22
4672:,
4670:21
4654:22
4649:,
4647:12
4642:,
4640:12
4624:21
4619:,
4617:12
4612:,
4610:11
4594:12
4589:,
4587:21
4582:,
4580:22
4564:11
4559:,
4557:21
4552:,
4550:21
4534:12
4529:,
4527:11
4522:,
4520:12
4504:11
4499:,
4497:11
4492:,
4490:11
4472:22
4466:,
4463:21
4457:,
4454:12
4448:,
4445:11
4429:22
4423:,
4420:21
4414:,
4411:12
4405:,
4402:11
4386:22
4380:,
4377:21
4371:,
4368:12
4362:,
4359:11
4343:22
4337:,
4334:21
4328:,
4325:12
4319:,
4316:11
4298:×
4283:11
4278:×
4276:11
4271:←
4269:11
4252::
4246:,
4242:,
4229::
4196:22
4186:22
4173:12
4163:21
4151:21
4141:22
4128:11
4118:21
4104:22
4094:12
4081:12
4071:11
4059:21
4049:12
4036:11
4026:11
3918:.
3913:=
3911:AB
3902:Θ(
3892:,
3863:×
3599:12
3436:12
3331:12
3168:12
3124:15
2993:15
2949:18
2818:18
2247:,
2243:,
2097:,
2093:,
1941:,
1937:,
1911:,
1907:,
1893:×
1884:,
1879:×
1849:Θ(
1837:Θ(
1833:/2
1684:11
1678:11
1673:=
1671:11
1633:22
1623:22
1610:12
1600:21
1588:21
1578:22
1565:11
1555:21
1541:22
1531:12
1518:12
1508:11
1496:21
1486:12
1473:11
1463:11
1434:22
1422:21
1408:12
1396:11
1370:22
1358:21
1344:12
1332:11
1303:22
1291:21
1277:12
1265:11
1205:22
1193:21
1179:12
1167:11
1131:22
1119:21
1105:12
1093:11
1057:22
1045:21
1031:12
1019:11
930:Θ(
902:ij
898:←
895:ij
882:kj
878:×
875:ik
866::
860:,
856:+
833::
827:,
823:+
806::
800:,
796:+
765:,
734:,
703:,
683::
668::
653::
600::
554:Θ(
433:Θ(
426:×
407:Θ(
382:ij
369:kj
365:×
362:ik
353::
336::
325::
184:×
166:×
152:×
144:AB
142:=
115:O(
103:O(
79:Θ(
72:×
6797:)
6793:(
6738:e
6731:t
6724:v
6704:.
6692::
6675:.
6663::
6636:.
6624::
6614::
6583:.
6579::
6546:.
6540::
6525:.
6521::
6505:6
6488:.
6482::
6455:.
6444::
6434::
6412:.
6398::
6372:.
6368::
6332:.
6328::
6271::
6228:.
6224::
6197:.
6171:.
6165::
6150:.
6120::
6112::
6081:.
6051:ω
5991::
5946::
5940:9
5916:.
5910::
5895:.
5891::
5885:5
5868:.
5864::
5841:.
5837::
5814:.
5808::
5802:4
5781:.
5769::
5746:.
5704::
5688:4
5671:.
5667::
5635:.
5621::
5602:.
5566:.
5519:.
5489::
5460::
5436::
5414:.
5400::
5319:n
5312:n
5304:n
5302:×
5300:n
5270:p
5264:p
5259:)
5257:p
5255:/
5253:n
5251:(
5249:O
5239:)
5237:p
5235:/
5233:n
5231:(
5229:O
5224:)
5220:p
5215:/
5213:n
5211:(
5209:O
5201:p
5190:p
5182:p
5175:)
5173:n
5167:)
5163:M
5158:/
5156:n
5154:(
5152:O
5146:M
5137:M
5125:M
5103:)
5101:n
5081:B
5077:A
5073:C
5062:)
5060:n
5054:T
5050:)
5048:n
5044:n
5034:)
5032:n
5001:.
4993:.
4991:)
4986:T
4979:C
4970:.
4968:)
4963:T
4956:C
4947:.
4945:)
4940:T
4933:C
4924:.
4922:)
4917:T
4910:C
4898:.
4892:T
4883:T
4874:T
4865:T
4860:T
4855:.
4849:C
4840:C
4831:C
4822:C
4817:C
4803:t
4796:c
4789:c
4782:n
4773:C
4769:T
4765:)
4763:T
4759:C
4747:.
4745:T
4740:.
4738:)
4736:T
4732:C
4718:.
4716:)
4711:B
4704:A
4697:T
4688:.
4686:)
4681:B
4674:A
4667:T
4658:.
4656:)
4651:B
4644:A
4637:T
4628:.
4626:)
4621:B
4614:A
4607:T
4598:.
4596:)
4591:B
4584:A
4577:C
4568:.
4566:)
4561:B
4554:A
4547:C
4538:.
4536:)
4531:B
4524:A
4517:C
4508:.
4506:)
4501:B
4494:A
4487:C
4475:.
4469:T
4460:T
4451:T
4442:T
4437:T
4432:.
4426:C
4417:C
4408:C
4399:C
4394:C
4389:.
4383:B
4374:B
4365:B
4356:B
4351:B
4346:.
4340:A
4331:A
4322:A
4313:A
4308:A
4300:n
4296:n
4291:T
4280:b
4273:a
4266:c
4259:n
4250:)
4248:B
4244:A
4240:C
4204:)
4192:B
4182:A
4178:+
4169:B
4159:A
4147:B
4137:A
4133:+
4124:B
4114:A
4100:B
4090:A
4086:+
4077:B
4067:A
4055:B
4045:A
4041:+
4032:B
4022:A
4015:(
3959:Z
3955:2
3951:/
3946:Z
3915:C
3906:)
3904:n
3898:C
3894:B
3890:A
3871:k
3865:k
3861:k
3786:)
3783:1
3780:(
3777:o
3774:+
3767:n
3746:n
3740:n
3679:M
3676:5
3673:+
3669:)
3663:M
3658:n
3653:2
3645:(
3636:2
3623:2
3619:n
3612:+
3607:2
3603:n
3593:M
3585:7
3577:2
3567:)
3561:M
3556:n
3551:3
3543:(
3538:4
3517:n
3509:2
3499:2
3495:n
3488:+
3483:2
3479:n
3475:4
3467:7
3459:2
3450:n
3446:5
3433:7
3411:M
3408:5
3405:+
3401:)
3395:M
3390:n
3385:2
3377:(
3368:2
3355:2
3351:n
3347:3
3344:+
3339:2
3335:n
3325:M
3317:7
3309:2
3299:)
3293:M
3288:n
3283:3
3275:(
3270:4
3249:n
3241:2
3231:2
3227:n
3223:3
3220:+
3215:2
3211:n
3207:4
3199:7
3191:2
3182:n
3178:5
3165:7
3143:M
3140:3
3137:+
3132:2
3128:n
3118:M
3110:7
3102:2
3092:)
3086:M
3081:n
3076:3
3068:(
3063:5
3040:2
3036:n
3032:5
3024:7
3016:2
3007:n
3003:6
2990:7
2968:M
2965:3
2962:+
2957:2
2953:n
2943:M
2935:7
2927:2
2917:)
2911:M
2906:n
2901:3
2893:(
2888:6
2865:2
2861:n
2857:6
2849:7
2841:2
2832:n
2828:7
2815:7
2768:M
2748:n
2721:n
2702:)
2693:n
2689:(
2686:O
2680:)
2675:7
2667:2
2658:n
2654:(
2651:O
2627:.
2615:)
2606:n
2602:(
2599:O
2589:ω
2556:)
2547:M
2542:b
2537:p
2534:n
2531:m
2525:+
2520:b
2516:p
2513:m
2510:+
2507:p
2504:n
2501:+
2498:n
2495:m
2489:+
2486:p
2483:+
2480:n
2477:+
2474:m
2470:(
2453:b
2449:M
2405:2
2401:B
2395:2
2391:A
2387:+
2382:1
2378:B
2372:1
2368:A
2364:=
2359:)
2351:2
2347:B
2337:1
2333:B
2326:(
2319:)
2311:2
2307:A
2299:1
2295:A
2288:(
2283:=
2280:C
2262:B
2258:A
2253:m
2249:p
2245:m
2241:n
2215:)
2207:2
2203:B
2199:A
2192:1
2188:B
2184:A
2178:(
2173:=
2168:)
2160:2
2156:B
2148:1
2144:B
2137:(
2132:A
2129:=
2126:C
2108:B
2103:p
2099:p
2095:m
2091:n
2065:)
2059:B
2054:2
2050:A
2042:B
2037:1
2033:A
2026:(
2021:=
2017:B
2011:)
2003:2
1999:A
1989:1
1985:A
1978:(
1973:=
1970:C
1952:A
1947:n
1943:p
1939:m
1935:n
1915:)
1913:p
1909:m
1905:n
1898:.
1895:p
1891:m
1886:B
1881:m
1877:n
1872:A
1853:)
1851:n
1841:)
1839:n
1831:n
1813:,
1810:)
1805:2
1801:n
1797:(
1791:+
1788:)
1785:2
1781:/
1777:n
1774:(
1771:T
1768:8
1765:=
1762:)
1759:n
1756:(
1753:T
1732:;
1729:)
1726:1
1723:(
1717:=
1714:)
1711:1
1708:(
1705:T
1692:n
1681:b
1675:a
1668:c
1641:)
1629:B
1619:A
1615:+
1606:B
1596:A
1584:B
1574:A
1570:+
1561:B
1551:A
1537:B
1527:A
1523:+
1514:B
1504:A
1492:B
1482:A
1478:+
1469:B
1459:A
1452:(
1447:=
1442:)
1430:B
1418:B
1404:B
1392:B
1385:(
1378:)
1366:A
1354:A
1340:A
1328:A
1321:(
1316:=
1311:)
1299:C
1287:C
1273:C
1261:C
1254:(
1239:n
1218:,
1213:)
1201:B
1189:B
1175:B
1163:B
1156:(
1151:=
1148:B
1144:,
1139:)
1127:A
1115:A
1101:A
1089:A
1082:(
1077:=
1074:A
1070:,
1065:)
1053:C
1041:C
1027:C
1015:C
1008:(
1003:=
1000:C
969:M
963:b
958:)
950:M
944:b
940:/
936:n
921:C
900:C
893:C
880:B
873:A
864:)
862:m
858:T
854:K
848:K
844:k
831:)
829:p
825:T
821:J
815:J
811:j
804:)
802:n
798:T
794:I
788:I
784:i
775:T
773:+
771:J
769::
767:J
763:T
761:+
759:I
757::
755:I
751:C
744:T
742:+
740:J
738::
736:J
732:T
730:+
728:K
726::
724:K
720:B
713:T
711:+
709:K
707::
705:K
701:T
699:+
697:I
695::
693:I
689:A
681:T
677:m
673:K
666:T
662:p
658:J
651:T
647:n
643:I
637:)
633:M
626:T
618:C
612:B
608:A
595:M
584:M
570:B
566:A
558:)
556:n
550:B
546:B
542:A
533:b
529:/
525:M
517:n
512:B
508:A
500:b
496:/
492:M
484:b
480:M
437:)
435:n
428:n
424:n
411:)
395:C
380:C
367:B
360:A
351:m
347:k
334:p
330:j
323:n
319:i
312:C
306:B
302:A
292:p
288:j
284:n
280:i
263:.
258:j
255:k
251:b
245:k
242:i
238:a
232:m
227:1
224:=
221:k
213:=
208:j
205:i
201:c
186:p
182:n
177:C
173:B
168:p
164:m
159:A
154:m
150:n
140:C
119:)
117:n
107:)
105:n
83:)
81:n
74:n
70:n
61:n
Text is available under the Creative Commons Attribution-ShareAlike License. Additional terms may apply.