Fast Matrix Multiplication Algorithms


Transcript of Fast Matrix Multiplication Algorithms

Page 1: Fast Matrix Multiplication Algorithms

Fast Matrix Multiplication Algorithms

Page 2: Fast Matrix Multiplication Algorithms

Why should we care?

Complexity of matrix multiplication = complexity of "almost all" matrix problems

Solving linear systems

Evaluating determinants

LU factorization

Many more

P. Bürgisser, M. Clausen, M. A. Shokrollahi. Algebraic complexity theory.

Page 3: Fast Matrix Multiplication Algorithms

A brief history…

Until the late 1960s: naïve algorithm, O(n^3)

Page 4: Fast Matrix Multiplication Algorithms

A brief history…

1969: Strassen's algorithm, O(n^2.808) (V. Strassen. Gaussian elimination is not optimal)

Page 5: Fast Matrix Multiplication Algorithms

Strassen's algorithm

With A = [a b; c d] and B = [e f; g h], the product AB = [r s; t u] is assembled from seven products P1, …, P7:

s = af + bh = P1 + P2

t = ce + dg = P3 + P4

r = ae + bg = P5 + P4 − P2 + P6

u = cf + dh = P5 + P1 − P3 − P7

Page 6: Fast Matrix Multiplication Algorithms

Strassen's algorithm

1. P1 = A1·B1 = a(f − h) = af − ah
2. P2 = A2·B2 = (a + b)h = ah + bh
3. P3 = A3·B3 = (c + d)e = ce + de
4. P4 = A4·B4 = d(g − e) = dg − de
5. P5 = A5·B5 = (a + d)(e + h) = ae + ah + de + dh
6. P6 = A6·B6 = (b − d)(g + h) = bg + bh − dg − dh
7. P7 = A7·B7 = (a − c)(e + f) = ae + af − ce − cf

s = af + bh = P1 + P2

t = ce + dg = P3 + P4

r = ae + bg = P5 + P4 − P2 + P6

u = cf + dh = P5 + P1 − P3 − P7

Page 7: Fast Matrix Multiplication Algorithms

Strassen's algorithm

T(n) = 7·T(n/2) + Θ(n^2)

Time complexity: O(n^2.808)

Page 8: Fast Matrix Multiplication Algorithms

A brief history…

1978: Pan, ω < 2.796 (V. Y. Pan. Strassen's algorithm is not optimal)

Page 9: Fast Matrix Multiplication Algorithms

Bilinear Algorithms

Given two matrices A and B, compute r products of linear combinations of their entries,

P_l = (Σ_{i,j} u_ijl·A[i,j]) · (Σ_{i,j} v_ijl·B[i,j]),  l = 1, …, r,

and recover each entry of the product as a linear combination of these products:

(AB)[i,j] = Σ_l w_ijl·P_l

Page 10: Fast Matrix Multiplication Algorithms

The minimum number of products r that a bilinear algorithm can use to compute the product of two n × n matrices is called the rank of n × n matrix multiplication, R(⟨n, n, n⟩).

The product of two kn × kn matrices can be viewed as the product of two k × k matrices whose entries are n × n matrices.

We can create a recursive algorithm ALG for multiplication of k^i × k^i matrices: view the k^i × k^i matrices as k × k matrices whose entries are k^(i−1) × k^(i−1) matrices.

The recursive approach, using an upper bound of r on R(⟨k, k, k⟩), gives a bound ω ≤ log_k r (the number of additions in each step is no more than 3rk^2).

As long as r < k^3 we get a non-trivial bound for ω.

Strassen: k = 2, r = 7

Pan: k = 70, r = 143640
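The bound ω ≤ log_k r is easy to evaluate for the two base cases just mentioned; a quick check in Python (our illustration, not from the slides):

```python
import math

def omega_bound(k, r):
    """Exponent bound log_k(r) from a rank-r algorithm for k x k matrices."""
    return math.log(r, k)

# Strassen: k = 2, r = 7       -> about 2.807
# Pan:      k = 70, r = 143640 -> just under 2.796
print(omega_bound(2, 7))
print(omega_bound(70, 143640))
```

Note that the naive algorithm corresponds to r = k^3, for which log_k r = 3 exactly; any r < k^3 improves on it.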

Page 11: Fast Matrix Multiplication Algorithms

A brief history…

1979: Bini (border rank), ω < 2.78 (D. Bini, M. Capovani, F. Romani, and G. Lotti. O(n^2.7799) complexity for n × n approximate matrix multiplication)

Page 12: Fast Matrix Multiplication Algorithms

A brief history…

1981: Schönhage τ-theorem (asymptotic sum inequality), ω < 2.548; in the same paper, ω < 2.522 (A. Schönhage. Partial and total matrix multiplication)

Page 13: Fast Matrix Multiplication Algorithms

Approximate Bilinear Algorithms (ABA)

In bilinear algorithms the coefficients u_ijl, v_ijl, w_ijl were constants.

In ABAs these coefficients are linear combinations of integer powers of an indeterminate λ.

The entries of AB are then only approximately computed: (AB)[i,j] = Σ_l w_ijl·P_l + O(λ)

O(λ): a linear combination of positive powers of λ.

As λ → 0, the product becomes exact.

The minimum number of products r for an ABA to compute the product of two n × n matrices is called the border rank of n × n matrix multiplication, written R̲(⟨n, n, n⟩).

Page 14: Fast Matrix Multiplication Algorithms

Bini showed that, for the asymptotic complexity of matrix multiplication, approximate algorithms suffice for obtaining bounds on ω:

If R̲(⟨k, k, k⟩) ≤ r, then ω ≤ log_k r

Bini used 10 entry products to approximately multiply a 2 × 2 matrix with a 2 × 3 matrix; symmetrizing by tensor powers gives k = 12, r = 1000.

Schönhage τ-theorem: Suppose we have an upper bound of r on the border rank of computing p independent instances of matrix multiplication with dimensions k_i × m_i by m_i × n_i, for i = 1, …, p. Then ω ≤ 3τ, where τ solves Σ_i (k_i·m_i·n_i)^τ = r.

In particular, he showed that one can approximately compute the product of a 3 × 1 by a 1 × 3 vector and the product of a 1 × 4 by a 4 × 1 vector together using only 10 products, whereas any exact bilinear algorithm needs at least 13 products.

Page 15: Fast Matrix Multiplication Algorithms

A brief history…

1982: Romani, ω < 2.517 (F. Romani. Some properties of disjoint sums of tensors related to matrix multiplication)

Page 16: Fast Matrix Multiplication Algorithms

A brief history…

1982: Coppersmith & Winograd, ω < 2.496 (D. Coppersmith and S. Winograd. On the asymptotic complexity of matrix multiplication)

Page 17: Fast Matrix Multiplication Algorithms

Continue…

1986: Strassen, laser method, ω < 2.479; an entirely new approach to the matrix multiplication problem (V. Strassen. The asymptotic spectrum of tensors and the exponent of matrix multiplication)

Page 18: Fast Matrix Multiplication Algorithms

Continue…

1987: Coppersmith & Winograd, combining Strassen's laser method with a novel analysis based on large sets avoiding arithmetic progressions, ω < 2.376 (D. Coppersmith and S. Winograd. Matrix multiplication via arithmetic progressions)

Page 19: Fast Matrix Multiplication Algorithms

Continue…

2003: Cohn & Umans: a group-theoretic framework for designing and analyzing matrix multiplication algorithms

Page 20: Fast Matrix Multiplication Algorithms

Continue…

2005: Cohn, Umans, Kleinberg, Szegedy, O(n^2.41) (H. Cohn, R. Kleinberg, B. Szegedy, and C. Umans. Group-theoretic algorithms for matrix multiplication)

Conjectures that can lead to ω = 2.

Page 21: Fast Matrix Multiplication Algorithms

Continue…

2014: Williams, ω < 2.373 (Multiplying matrices in O(n^2.373) time)

Page 22: Fast Matrix Multiplication Algorithms

Continue…

For some k they provide a way to multiply k × k matrices using m ≪ k^3 multiplications, and apply the technique recursively to show that ω ≤ log_k m.

Page 23: Fast Matrix Multiplication Algorithms

Basic group theory definitions

Group

A group is a non-empty set G with a binary operation ∙ defined on G such that the following conditions hold:

1. For all a, b, c ∈ G, we have a ∙ (b ∙ c) = (a ∙ b) ∙ c
2. There exists an element 1 ∈ G such that 1 ∙ a = a and a ∙ 1 = a for all a ∈ G
3. For all a ∈ G there exists an element a⁻¹ ∈ G such that a ∙ a⁻¹ = 1 and a⁻¹ ∙ a = 1

Order of a group

The order |G| of a group is its cardinality, i.e. the number of elements in its set.
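The three axioms are easy to verify mechanically for a small example such as Z_5, the integers under addition mod 5 (our illustration; here the identity element is 0 rather than 1):

```python
n = 5
G = range(n)
op = lambda a, b: (a + b) % n  # the group operation on Z_5
e = 0                          # identity element

# 1. associativity
assert all(op(op(a, b), c) == op(a, op(b, c)) for a in G for b in G for c in G)
# 2. identity
assert all(op(e, a) == a and op(a, e) == a for a in G)
# 3. inverses: the inverse of a is -a mod n
assert all(op(a, (-a) % n) == e for a in G)
print("Z_5 satisfies the group axioms")
```

Z_5 has order 5, and since op(a, b) == op(b, a) it is also abelian and cyclic (generated by 1), anticipating the definitions on the next slide.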

Page 24: Fast Matrix Multiplication Algorithms

Cyclic Group

A group is said to be cyclic if it is generated by a single element.

(We say that X generates G, written G = ⟨X⟩, if every element of G can be written as a finite product of elements from X and their inverses. Note that the order of an element a of a group is the order of the subgroup ⟨a⟩ it generates.)

Abelian Group

The group G is said to be abelian if a ∙ b = b ∙ a for all a, b ∈ G.

Page 25: Fast Matrix Multiplication Algorithms

Group Algebra

The group algebra F[G] of G is defined to be the F-vector space with basis the elements of G, endowed with the multiplication extending that on G. Thus:

1. An element of F[G] is a sum Σ_{g∈G} c_g·g, with c_g ∈ F
2. Two elements Σ_{g∈G} c_g·g and Σ_{g∈G} c′_g·g of F[G] are equal if and only if c_g = c′_g for all g
3. (Σ_{g∈G} c_g·g)(Σ_{g∈G} c′_g·g) = Σ_{g∈G} c″_g·g, where c″_g = Σ_{g1·g2 = g} c_g1·c′_g2

Homomorphism

A homomorphism from a group G to a group G′ is a map α: G → G′ such that α(ab) = α(a)·α(b) for all a, b ∈ G. An isomorphism is a bijective homomorphism.

Page 26: Fast Matrix Multiplication Algorithms

Multiplying polynomials via FFT

The standard method has time complexity O(n^2).

We think of the coefficient vectors of the polynomials as elements of the group algebra ℂ[G] of a finite group G.

If the group is large enough (order at least 2n), convolution of two vectors in the group algebra corresponds to the polynomial product.

Page 27: Fast Matrix Multiplication Algorithms

Multiplying polynomials via FFT


Discrete convolution

Suppose we have two complex vectors in E^N:

Z = (z_0, z_1, …, z_(N−1))^T,  Y = (y_0, y_1, …, y_(N−1))^T

The discrete convolution of these two vectors is another vector, denoted Z ∗ Y, defined componentwise by

(Z ∗ Y)_k = Σ_{j=0}^{N−1} z_(k−j)·y_j,  k = 0, 1, 2, …, N−1  (indices taken mod N)
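A direct transcription of the componentwise formula, with indices taken mod N (cyclic convolution), might look like:

```python
def convolve(Z, Y):
    """Cyclic convolution: (Z * Y)_k = sum_j Z[(k - j) mod N] * Y[j]."""
    N = len(Z)
    return [sum(Z[(k - j) % N] * Y[j] for j in range(N)) for k in range(N)]

# Zero-padded coefficient vectors of 1 + 2x and 3 + 4x:
print(convolve([1, 2, 0, 0], [3, 4, 0, 0]))  # [3, 10, 8, 0] = 3 + 10x + 8x^2
```

The zero padding to length 4 (at least twice the polynomial length) is what prevents the cyclic wraparound from corrupting the product coefficients.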

Page 28: Fast Matrix Multiplication Algorithms

Multiplying polynomials via FFT


Convolution in the group algebra can be computed quickly using the FFT.

Time complexity of FFT and inverse FFT: O(n log n)

Page 29: Fast Matrix Multiplication Algorithms

Discrete Fourier Transform for polynomials

Discrete Fourier Transform

Embed polynomials as elements of the group algebra ℂ[G]:

Let G = ⟨z⟩ be a cyclic group of order m ≥ 2n. Define

A = Σ_{i=0}^{n−1} a_i·z^i,  B = Σ_{i=0}^{n−1} b_i·z^i

The Discrete Fourier Transform is an invertible linear transformation D: ℂ[G] → ℂ^m such that

D(A) = (Σ_{i=0}^{n−1} a_i·x_0^i, Σ_{i=0}^{n−1} a_i·x_1^i, …, Σ_{i=0}^{n−1} a_i·x_(m−1)^i),  where x_k = e^(2πik/m)

Then A·B = D⁻¹(D(A)·D(B)), with the product in the Fourier domain taken componentwise.
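The identity A·B = D⁻¹(D(A)·D(B)) can be checked with a naive O(m^2) transform written in plain Python; the FFT computes the same map D in O(m log m). The sign convention in the exponent below is one common choice, and the rounding at the end assumes integer coefficients:

```python
import cmath

def dft(v, inverse=False):
    """Naive DFT of v using the m-th roots of unity."""
    m = len(v)
    sign = 1 if inverse else -1
    out = [sum(v[i] * cmath.exp(sign * 2j * cmath.pi * i * k / m)
               for i in range(m)) for k in range(m)]
    return [x / m for x in out] if inverse else out

def poly_mul(a, b):
    """Product polynomial coefficients via D^{-1}(D(a) * D(b))."""
    m = 2 * max(len(a), len(b))  # group order at least 2n
    fa = dft(a + [0] * (m - len(a)))
    fb = dft(b + [0] * (m - len(b)))
    c = dft([x * y for x, y in zip(fa, fb)], inverse=True)
    return [round(x.real) for x in c[:len(a) + len(b) - 1]]

print(poly_mul([1, 2], [3, 4]))  # [3, 10, 8]: (1 + 2x)(3 + 4x) = 3 + 10x + 8x^2
```

Multiplication in the Fourier domain is componentwise, so the whole pipeline costs O(m log m) once the naive transform is replaced by the FFT.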

Page 30: Fast Matrix Multiplication Algorithms

Embedding matrices A, B into elements Ā, B̄ of the group algebra ℂ[G]

Cohn & Umans

Matrix multiplication can be embedded into the group algebra of a finite group G (G must be non-abelian).

Let F be a field and let S, T and U be subsets of G.

A = (a_(s,t))_{s∈S, t∈T} and B = (b_(t,u))_{t∈T, u∈U} are |S| × |T| and |T| × |U| matrices, indexed by the elements of S, T and T, U respectively.

Then embed A, B as elements Ā, B̄ ∈ F[G]:

Ā = Σ_{s∈S, t∈T} a_(s,t)·s⁻¹t  and  B̄ = Σ_{t∈T, u∈U} b_(t,u)·t⁻¹u

Page 31: Fast Matrix Multiplication Algorithms

Using the FFT

As with the polynomial method, the Fourier transform provides an efficient way

to compute the convolution product.

For a non-abelian group, a fundamental theorem of Wedderburn says that the group algebra is isomorphic, via a Fourier transform, to an algebra of block diagonal matrices with block dimensions d_1, …, d_k, where Σ_i d_i^2 = |G|.

Convolution in ℂ[G] is thus transformed into block diagonal matrix multiplication.

Page 32: Fast Matrix Multiplication Algorithms

1. Embed the matrices A, B into elements Ā, B̄ of the group algebra ℂ[G].
2. Multiplication of Ā and B̄ in the group algebra is carried out in the Fourier domain, after computing the Discrete Fourier Transform (DFT) of Ā and B̄ via the FFT.
3. The product Ā·B̄ is recovered by performing the inverse DFT (inverse FFT).
4. The entries of the matrix AB can be read off from the group algebra product Ā·B̄.

Page 33: Fast Matrix Multiplication Algorithms

Triple Product Property

The approach works only if the group G admits an embedding of matrix multiplication into its group algebra.

The coefficients of the convolution product correspond to the entries of the

product matrix.

Such an embedding is possible whenever the group G has three subgroups H1, H2, H3 with the property that whenever h1 ∈ H1, h2 ∈ H2 and h3 ∈ H3 with h1·h2·h3 = 1, then h1 = h2 = h3 = 1.

(The condition can be generalized to subsets of G rather than subgroups.)
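For subsets S, T, U the condition is phrased via quotients: whenever s⁻¹s′ · t⁻¹t′ · u⁻¹u′ = 1 with s, s′ ∈ S, t, t′ ∈ T, u, u′ ∈ U, all three quotients must be trivial. A brute-force check for subsets of the cyclic group Z_n, written additively, could look as follows (purely an illustration of the condition itself; an abelian group cannot give ω < 3):

```python
from itertools import product

def has_tpp(n, S, T, U):
    """Triple product property for subsets S, T, U of Z_n (additive)."""
    for s, s2 in product(S, repeat=2):
        for t, t2 in product(T, repeat=2):
            for u, u2 in product(U, repeat=2):
                # quotient s^{-1}s' becomes the difference s2 - s in Z_n
                if ((s2 - s) + (t2 - t) + (u2 - u)) % n == 0:
                    if not (s == s2 and t == t2 and u == u2):
                        return False
    return True

print(has_tpp(8, [0, 1], [0, 2], [0, 4]))  # True
print(has_tpp(8, [0, 1], [0, 1], [0, 1]))  # False
```

Subsets satisfying the property realize |S| × |T| by |T| × |U| matrix multiplication inside ℂ[G], which is why the embedding on the previous slides works.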

Page 34: Fast Matrix Multiplication Algorithms

Beating the sum of cubes

In order for ω to be less than 3, the group must satisfy more conditions.

In particular, it must be the case that:

|H1|·|H2|·|H3| > Σ_i d_i^3

d_i: the block dimensions of the block diagonal matrices

Page 35: Fast Matrix Multiplication Algorithms

Group Theory Definitions

Permutation Groups

Let S be a set and let Sym(S) be the set of bijections α: S → S.

Sym(S) is a group, called the group of symmetries of S. For example, the permutation group on n letters S_n is defined to be the group of symmetries of the set {1, …, n}; it has order n!.

Groups Acting on Sets

Let X be a set and let G be a group. A left action of G on X is a mapping (g, x) ⟼ gx: G × X → X such that

a) 1x = x, for all x ∈ X

b) (g1·g2)x = g1(g2·x), for all g1, g2 ∈ G, x ∈ X

A set together with a (left) action of G is called a (left) G-set. An action is trivial if gx = x for all g ∈ G.

Page 36: Fast Matrix Multiplication Algorithms

Direct product

When G and H are groups, we can construct a new group G × H, called the (direct) product of G and H. As a set, it is the Cartesian product of G and H, and multiplication is defined by (g, h)(g′, h′) = (gg′, hh′).

Normal subgroups

A subgroup N of G is normal, denoted N ⊲ G, if gNg⁻¹ = N for all g ∈ G.

Semidirect product

A group G is a semidirect product of its subgroups N and Q if N is normal and the homomorphism G → G/N induces an isomorphism Q → G/N.

We write G = N ⋊ Q.

Page 37: Fast Matrix Multiplication Algorithms

Wreath Product

The wreath product of two groups A and B is constructed in the following way.

Let A^B be the set of all functions defined on B with values in A.

With respect to componentwise multiplication, this set is a group: the complete direct product of |B| copies of A.

The semidirect product W of B and A^B is called the Cartesian wreath product of A and B, denoted A Wr B.

If instead of A^B one takes the smaller group A^(B) consisting of all functions with finite support, that is, functions taking non-identity values only at finitely many points, one obtains a subgroup of W called the wreath product of A and B, denoted A wr B.

Page 38: Fast Matrix Multiplication Algorithms

Beating the sum of cubes, finallyโ€ฆ

The elusive group G that managed to "beat the sum of cubes" turned out to be a wreath product of:

an abelian group of order 17^3

the symmetric group of order 2

Page 39: Fast Matrix Multiplication Algorithms

[Diagram: the wreath product of an abelian group of order 17^3 with the symmetric group of order 2 yields an O(n^2.91) algorithm]

Page 40: Fast Matrix Multiplication Algorithms

Beating the sum of cubes, finally…

Elementary fact of group representation theory: the index of the largest abelian subgroup of a group is an upper bound on the size of the largest block in the block diagonal matrix representation given by Wedderburn's theorem.

Page 41: Fast Matrix Multiplication Algorithms

Improving the bounds for ω

Szegedy realized that some of the combinatorial structures of the 1987 Coppersmith-Winograd paper could be used to select the three subsets in the wreath product groups in a more sophisticated way.

The researchers managed to achieve the exponent bound ω < 2.48.

They distilled their insights into two conjectures, one with an algebraic flavor and one with a combinatorial flavor.

Page 42: Fast Matrix Multiplication Algorithms

[Figure: a 6 × 6 strong USP, along with 2 of its 18 pieces]