Rota on multilinear algebra and geometry

Notes for Part of MIT Math Course 18.795 circa 1993 Gian-Carlo Rota March 6, 2003

Description

Course notes on multilinear algebra and its relation to geometry for advanced students at MIT



Preface

These “samizdat” notes express Rota’s own views some eight years after “Barnabei–Brini–Rota.”


Contents

1 Geometry and Algebra
  1.1 Introduction
  1.2 Homogeneous coordinates
  1.3 Projective space
  1.4 Desargues' theorem
  1.5 Multilinear functionals
  1.6 Brackets
  1.7 Bracket products
  1.8 The weak transfer conjecture
  1.9 The classical algebras
  1.10 Coordinate-free versions of the classical algebras
  1.11 Generalizations of the determinant
  1.12 Extensors and subspaces
  1.13 Meets and the double algebra
  1.14 Meets and joins in the dual space
  1.15 Plücker coordinates
  1.16 Desargues revisited & theorem of Pappus
  1.17 Matrix identities
  1.18 Pfaffians
  1.19 The Grassmannian
  1.20 Trace
  1.21 Degree
  1.22 Symmetry classes
  1.23 Matching theory


Chapter 1

Geometry and Algebra

1.1 Introduction

Multilinear algebra is part of a larger program to rewrite geometry in the language of algebra. The subject was ignored in the earlier part of this century, but there has been a recent revival of interest in it as a result of an impetus from an unexpected source: computer algebra.

What does it mean, more precisely, to rewrite geometry in the language of algebra? We need to be more precise about what we mean by geometry and algebra. To do this we must begin with a brief review of vector spaces. We shall not give complete definitions but will highlight certain elementary facts which turn out on deeper inspection to have great significance.

We shall consider only finite dimensional vector spaces unless otherwise noted. The elements of a vector space V are called vectors or points. Vectors can be visualized as arrows. The dual space of V is denoted by V∗ and is the vector space of linear functionals u : V → k on V. Given x ∈ V and u ∈ V∗, we write ⟨x|u⟩ for u(x). Note that this is backwards from the usual “bra-ket” notation. A set of vectors {x1, . . . , xn} is said to be linearly independent if α1x1 + · · · + αnxn = 0 only if all the αi are zero. A basis is a maximal linearly independent set. It is a basic theorem that any two bases have the same cardinality. This cardinality is known as the dimension of V. One of the most important properties of bases is the following.

Exchange property. Let I and J be sets of independent vectors in a vector space V. If |I| < |J| then there exists x ∈ J such that I ∪ {x} is independent. This is equivalent to the following.



Double exchange property. If B1 and B2 are bases of a vector space, then for every x ∈ B1 there exists y ∈ B2 such that (B1 − x) ∪ {y} and (B2 − y) ∪ {x} are bases.

The proofs are left as an exercise.

A coordinate system in a vector space V of dimension n is a basis B = {u1, . . . , un} of V∗. If x ∈ V, then its coordinates relative to B are the sequence ⟨x|u1⟩, . . . , ⟨x|un⟩.

We are now in a position to say more precisely what we mean by geometry. In this course we define geometry to be the study of those facts about vector spaces V over a field k that are independent of the choice of coordinates. Historically, it took a long time for mathematicians to free vectors from a coordinate system, although the idea seems obvious to us nowadays.

1.2 Homogeneous coordinates

Given a vector space V, we can define various operations on its subspaces. If W1 and W2 are subspaces of V we define the meet W1 ∧ W2 to be the set-theoretic intersection W1 ∩ W2. Note that the set-theoretic union W1 ∪ W2 is not a subspace, but its span is. So we define the join W1 ∨ W2 to be the span. This lets us define a lattice structure on the subspaces of V.

Let S be a set and let P(S) be the Boolean algebra of all subsets of S. Let L(V) denote the lattice of all subspaces of V. There is a deep relationship between P(S) and L(V) that is still not fully understood.

The join and meet operations are commutative, associative, idempotent,

and satisfy the Dedekind absorption laws:

(W1 ∨ W2) ∧ W1 = W1 and (W1 ∧ W2) ∨ W1 = W1.

Note, however, that in general

W1 ∧ (W2 ∨ W3) ≠ (W1 ∧ W2) ∨ (W1 ∧ W3),

even though the analogous statement for sets in place of vector spaces is true. It is left as an exercise to show that

dim(W1 ∨W2) = dim(W1) + dim(W2)− dim(W1 ∧W2)

but that the corresponding inclusion-exclusion statement for three subspaces holds only if the three subspaces are in general position.
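The failure of distributivity is easy to see concretely. The following small sketch (ours, not from the notes; it assumes Python with numpy) computes the dimension of a join as a matrix rank, computes the dimension of a meet through the dimension formula just stated, and exhibits three distinct lines through the origin of the plane on which the two sides of the distributive law disagree.

    import numpy as np

    def dim_join(A, B):
        # W1 v W2 is the span of the union: stack spanning vectors, take the rank
        return np.linalg.matrix_rank(np.vstack([A, B]))

    def dim_meet(A, B):
        # computed via the dimension formula quoted above
        return (np.linalg.matrix_rank(A) + np.linalg.matrix_rank(B)
                - dim_join(A, B))

    # three distinct lines through the origin in the plane
    W1 = np.array([[1.0, 0.0]])
    W2 = np.array([[0.0, 1.0]])
    W3 = np.array([[1.0, 1.0]])

    lhs = dim_meet(W1, np.vstack([W2, W3]))    # dim W1 ^ (W2 v W3) = 1
    rhs = dim_meet(W1, W2) + dim_meet(W1, W3)  # both meets are 0, so their join is 0
    print(lhs, rhs)                            # 1 0 -- distributivity fails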


Given W ⊆ V there exists at least one subspace W^c ⊆ V such that W ∨ W^c = V and W ∧ W^c = 0. Such a W^c is called a complement of W. Note that W^c is never unique!

Can we characterize L(V) in a simple way? The answer is yes. But before we explain the answer, we must first overcome a psychological hurdle. If we draw the Hasse diagram of P(S) we see that the points that cover 0 just correspond to the elements of S. But if we draw the Hasse diagram for L(V) the points that cover 0 correspond to lines instead of points! This motivates the definition of homogeneous coordinates.

We might try to solve the problem by considering affine varieties instead of linear subspaces in V. The problem with this is that parallel varieties don't intersect, so that the result is not a lattice. To circumvent this problem we introduce projective space.

1.3 Projective space

Take an (n+1)-dimensional vector space Vn+1 over a field k. Fix a coordinate system {u0, . . . , un}. Given x ∈ Vn+1, recall that the coordinates are given by xi = ⟨x|ui⟩.

We now put an equivalence relation on the coordinates. Write x ∼ y whenever there is a nonzero scalar c ∈ k such that yi = cxi for 0 ≤ i ≤ n. The resulting set of equivalence classes, excluding the equivalence class containing zero, gives us projective space of dimension n, which we denote by PG(n). The image of a basis of Vn+1 under this equivalence relation is called a basis of PG(n). Notice that under this equivalence relation, linear subspaces of Vn+1 become linear varieties, and the dimension drops by one. The formula

dim(W1 ∨W2) = dim(W1) + dim(W2)− dim(W1 ∧W2),

where the Wi are linear varieties, is still valid in projective space.

The subset of equivalence classes of vectors with x0 ≠ 0 has a section consisting of all vectors of the form (1, x1, . . . , xn). This is called affine space of dimension n. It is closed under affine linear combinations:

Σi αi x(i)  with  Σi αi = 1.


The subset of projective space of dimension n consisting of all equivalence classes of vectors of the form (0, x1, . . . , xn) is a projective space of dimension n − 1.

We can form a topological model of projective space by taking an n-dimensional sphere and identifying opposite points.

Homogeneous coordinates are relevant to the question of how much one can simplify a polynomial

can simplify a polynomial

p(t) = a0 + a1t+ · · ·+ antn, ai ∈ kby changes of variables of the form

(a) t0 = ct (c 6= 0),(b) t0 = t+ c,

(c) t0 = 1/t.

There has been a revival of interest in this problem because of computer algebra. One quickly runs into annoying problems when trying to keep track of the degree of polynomials whose coefficients are not all nonzero. To circumvent this problem we introduce the concept of a homogeneous polynomial. A polynomial p(t1, . . . , tn) is said to be homogeneous of degree d if the following identity holds:

p(ct1, . . . , ctn) = c^d p(t1, . . . , tn).

Given an ordinary polynomial p, set t = x1/x0. Then

f(x0, x1) = x0^n p(x1/x0) = a0 x0^n + a1 x0^(n−1) x1 + · · · + an x1^n.

This gives us a homogeneous polynomial f. It is left as an exercise to show that the changes of variables correspond to matrices acting, as follows.

(a′) (x0′, x1′)ᵀ = ( 1 0 ; 0 c ) (x0, x1)ᵀ,
(b′) (x0′, x1′)ᵀ = ( 1 0 ; c 1 ) (x0, x1)ᵀ,
(c′) (x0′, x1′)ᵀ = ( 0 1 ; 1 0 ) (x0, x1)ᵀ,

where ( a b ; c d ) denotes the 2 × 2 matrix with rows (a, b) and (c, d).


It is also left as an exercise to show that the group generated by matrices of the above form is the group GL(2) of matrices with nonzero determinant, called the general linear group of order 2.

From now on we use homogeneous polynomials, because nonhomogeneous polynomials can be homogenized in the way just illustrated.
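As a small illustration (our own sympy sketch, not part of the notes), here is the homogenization f(x0, x1) = x0^n p(x1/x0) for a concrete cubic, together with a check that the substitution t′ = t + c acts on (x0, x1) by the matrix of (b′):

    import sympy as sp

    t, x0, x1, c = sp.symbols('t x0 x1 c')
    p = 3 + 2*t + 5*t**3                     # an arbitrary cubic, n = 3
    n = sp.degree(p, t)

    f = sp.expand(x0**n * p.subs(t, x1/x0))  # the homogenization of p
    print(f)                                 # 3*x0**3 + 2*x0**2*x1 + 5*x1**3

    # t' = t + c on the inhomogeneous side corresponds to
    # (x0, x1) -> (x0, c*x0 + x1) on the homogeneous side
    lhs = sp.expand(x0**n * p.subs(t, x1/x0 + c))
    rhs = sp.expand(f.subs(x1, c*x0 + x1))
    print(sp.expand(lhs - rhs) == 0)         # True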

1.4 Desargues’ theorem

This theorem states that in PG(n), for n ≥ 3, any two triangles which are perspective from a point are also perspective from a line. In symbols, we assume that for some point P

(A ∨ A′) ∧ (B ∨ B′) = (A ∨ A′) ∧ (C ∨ C′) = (B ∨ B′) ∧ (C ∨ C′) = P

and we want to prove that

(A ∨ B) ∧ (A′ ∨ B′),  (B ∨ C) ∧ (B′ ∨ C′),  and  (A ∨ C) ∧ (A′ ∨ C′)

are on the same line.

Proof. Assume that n = 3. Consider the planes π = A ∨ B ∨ C and π′ = A′ ∨ B′ ∨ C′. Then

dim(π ∨ π′) + dim(π ∧ π′) = dim(π) + dim(π′)
3 + dim(π ∧ π′) = 2 + 2
dim(π ∧ π′) = 1.

Hence π ∧ π′ is a line l.

By the assumption, (A ∨ C) ∧ (A′ ∨ C′) exists. Since A ∨ C ⊆ π and A′ ∨ C′ ⊆ π′, it follows that

(A ∨ C) ∧ (A′ ∨ C′) ⊆ π ∧ π′ = l.

Similarly the other two points lie on l, and this completes the proof.

The converse of Desargues' theorem is also true, and the proof is obtained by dualizing the above proof (i.e., interchanging joins and meets). Desargues' theorem helps us characterize L(V), but we shall not pursue this point here.


1.5 Multilinear functionals

An r-linear functional on a vector space V over k is a function f : V^r → k that is linear in each of its arguments, i.e., if we fix all but one of the arguments of f the result is a linear function of the unfixed argument. We also use the term multilinear functional when the value of r is immaterial. A symmetric r-linear functional f is one that satisfies

f(xσ(1), xσ(2), . . . , xσ(r)) = f(x1, x2, . . . , xr)

for all permutations σ of {1, 2, . . . , r}. A skew-symmetric or alternating r-linear functional is one that satisfies

f(xσ(1), xσ(2), . . . , xσ(r)) = sgn(σ)f(x1, x2, . . . , xr)

for all permutations σ of {1, 2, . . . , r}. (In the case of fields of characteristic two, we further require that f = 0 whenever two arguments are equal. Of course, this further condition is automatic if the characteristic is not equal to two.)

Proposition 1 Every bilinear functional f(x1, x2) over a field of characteristic zero can be expressed uniquely as the sum of a symmetric bilinear functional and a skew-symmetric bilinear functional.

Proof. Simply note that f = fs + fa, where

fs(x1, x2) = (f(x1, x2) + f(x2, x1))/2  and  fa(x1, x2) = (f(x1, x2) − f(x2, x1))/2.

We say that a trilinear functional f is cyclic-symmetric if

f(x1, x2, x3) + f(x2, x3, x1) + f(x3, x1, x2) = 0

for all x1, x2, and x3. The analogue for trilinear functionals is the following, whose proof is left as an exercise.

Proposition 2 Every trilinear functional over a field of characteristic zero can be expressed uniquely as the sum of a symmetric, a skew-symmetric, and a cyclic-symmetric trilinear functional.

Unfortunately there is no simple generalization to higher degrees. This gives a hint of the complexities that can arise when passing from linear functionals to multilinear functionals.
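The trilinear decomposition is easy to check numerically. The sketch below (ours; Python with numpy, characteristic zero) encodes a trilinear form on R^4 as an array, splits off the symmetric and skew-symmetric parts, and verifies that the remainder is cyclic-symmetric in the sense just defined.

    import itertools
    import numpy as np

    def sgn(p):
        # sign of a permutation given as a tuple
        s = 1
        for i in range(len(p)):
            for j in range(i + 1, len(p)):
                if p[i] > p[j]:
                    s = -s
        return s

    rng = np.random.default_rng(0)
    F = rng.standard_normal((4, 4, 4))    # f(x,y,z) = sum F[i,j,k] x_i y_j z_k

    perms = list(itertools.permutations(range(3)))
    Fs = sum(np.transpose(F, p) for p in perms) / 6           # symmetric part
    Fa = sum(sgn(p) * np.transpose(F, p) for p in perms) / 6  # skew part
    Fc = F - Fs - Fa                                          # the remainder

    # the three cyclic shifts of the remainder sum to zero
    cyc = Fc + np.transpose(Fc, (1, 2, 0)) + np.transpose(Fc, (2, 0, 1))
    print(np.allclose(cyc, 0))            # True: Fc is cyclic-symmetric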


1.6 Brackets

We shall now study in some detail skew-symmetric n-linear functionals f(x1, . . . , xn), where the xi are taken from an n-dimensional vector space. First note that if f(x1, . . . , xr) is an r-linear functional then

fs(x1, . . . , xr) = Σσ f(xσ(1), . . . , xσ(r))

is clearly symmetric, and

fa(x1, . . . , xr) = Σσ sgn(σ) f(xσ(1), . . . , xσ(r))

is clearly skew-symmetric. We refer to the process of forming fs from f as symmetrizing and to the process of forming fa from f as antisymmetrizing.

Now denote by [x1, x2, . . . , xn] an n-linear skew-symmetric functional over a vector space V of dimension n. We call this the bracket. (The terminology is due to Cayley.) A bracket is said to be nondegenerate if given any nonzero x1 ∈ V there exist x2, . . . , xn such that [x1, . . . , xn] ≠ 0. Given a bracket, a basis {e1, . . . , en} of V is said to be unimodular if [e1, . . . , en] = 1. We have the following proposition, which shows that in some sense brackets are coordinate-free determinants.

Proposition 3 Given a unimodular basis {e1, . . . , en} of V and a nondegenerate bracket, the dual basis {u1, . . . , un} of V∗ satisfies

[x1, . . . , xn] = Σσ sgn(σ) ⟨xσ(1)|u1⟩ · · · ⟨xσ(n)|un⟩ = Σσ sgn(σ) ⟨x1|uσ(1)⟩ · · · ⟨xn|uσ(n)⟩ = det(⟨xi|uj⟩).

Proof. By nondegeneracy a unimodular basis exists. Write xi = Σj αij ej with αij ∈ k. Then by direct substitution and multilinearity,

[x1, . . . , xn] = Σj1,...,jn α1j1 · · · αnjn [ej1, . . . , ejn].

By skew-symmetry of the bracket, the only nonzero terms in the multiple sum on the right hand side are those such that (j1, . . . , jn) is a permutation σ of


{1, . . . , n}. Again by skew-symmetry, the value of [ej1, . . . , ejn] is just sgn(σ), since {e1, . . . , en} is unimodular. The remainder of the proof is clear.

Warning! Given a nondegenerate bracket, there is in general more than one unimodular basis.

Next define a Peano space to be a vector space V of dimension n together with a nondegenerate bracket. We state a few facts about Peano spaces that look innocent but have important consequences.

(1) [x1, . . . , xn] = 0 if and only if {x1, . . . , xn} is linearly dependent.

(2) If f(x1, . . . , xr) is an alternating multilinear form on a Peano space of dimension n, and if r > n, then f ≡ 0.

Proof. Let {e1, . . . , en} be a basis of V. Then f(ei1, . . . , eir) = 0 for any i1, . . . , ir (since r > n, at least two of the arguments are equal, and f is alternating). Expanding by multilinearity then completes the proof.

(3) If f(x1, . . . , xn) is an alternating multilinear form on a Peano space of dimension n, then there is a constant c such that

f(x1, . . . , xn) = c [x1, . . . , xn].
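Here is a numeric sketch of Proposition 3 (ours, assuming Python with numpy): the bracket, expanded by multilinearity and skew-symmetry exactly as in the proof, agrees with det(⟨xi|uj⟩) for the standard unimodular basis of R^3.

    import itertools
    import numpy as np

    def bracket(*xs):
        # the Leibniz expansion forced by the proof of Proposition 3
        n = len(xs)
        total = 0.0
        for p in itertools.permutations(range(n)):
            s = 1
            for i in range(n):
                for j in range(i + 1, n):
                    if p[i] > p[j]:
                        s = -s
            term = float(s)
            for i in range(n):
                term *= xs[i][p[i]]
            total += term
        return total

    rng = np.random.default_rng(1)
    x, y, z = rng.standard_normal((3, 3))
    print(np.isclose(bracket(x, y, z),
                     np.linalg.det(np.array([x, y, z]))))   # True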

1.7 Bracket products

A bracket product is a function of r variables {x1, . . . , xr} whose values are given by an expression of the form

[xi11, . . . , xi1n][xi21, . . . , xi2n] · · · [xij1, . . . , xijn]

where 1 ≤ ipq ≤ r. The ipq need not be distinct.

We now study the process of antisymmetrizing a bracket product. We take as our example a function of the variables

{x1, . . . , xs, y1, . . . , yn−r, z1, . . . , zt}

(where 1 ≤ r ≤ s ≤ n and t + s − r = n) given by the expression

[x1, . . . , xr, y1, . . . , yn−r][xr+1, . . . , xs, z1, . . . , zt].

We antisymmetrize with respect to the x variables to get

Σσ sgn(σ) [xσ(1), . . . , xσ(r), y1, . . . , yn−r][xσ(r+1), . . . , xσ(s), z1, . . . , zt].  (∗)


We can simplify this expression. We can arrange the set {σ(1), . . . , σ(r)} in increasing order and label the resulting sequence j1, . . . , jr, so that j1 < · · · < jr. Similarly, we can arrange the set {σ(r+1), . . . , σ(s)} into increasing order jr+1 < · · · < js. Observe that

[xσ(1), . . . , xσ(r), y1, . . . , yn−r] = sgn(τ) [xj1, . . . , xjr, y1, . . . , yn−r],

where τ is the permutation that sends σ(i) to ji for 1 ≤ i ≤ r. Thus (∗) equals r!(s − r)! times the following expression:

Σ′σ sgn(σ) [xσ(1), . . . , xσ(r), y1, . . . , yn−r][xσ(r+1), . . . , xσ(s), z1, . . . , zt],  (∗∗)

where the prime indicates that the sum is over shuffles, i.e., over permutations σ of {1, 2, . . . , s} such that

σ(1) < σ(2) < · · · < σ(r) and σ(r + 1) < · · · < σ(s).

Thus (∗∗) is skew-symmetric. Forming the expression (∗∗) from the bracket product has advantages over the usual antisymmetrizing process. For example, (∗∗) is less likely to be zero in a field of nonzero characteristic, since (∗) is a positive integral multiple of (∗∗).

The Scottish notation for (∗∗) is sometimes helpful. We put a dot or some other symbol over each of the variables that are being skew-symmetrized:

[ẋ1, . . . , ẋr, y1, . . . , yn−r][ẋr+1, . . . , ẋs, z1, . . . , zt].

Of course, the above generalizes in an obvious way to products of more than two brackets.

Corollary 4 If s > n then (∗∗) = 0.

This is clear because if s > n we obtain a skew-symmetric multilinear function of more than n variables.

An example of the corollary is

[ẋ1, . . . , ẋn][ẋn+1, z1, . . . , zn−1] = 0,

which we claim is just Cramer's rule. To see this, we can expand to obtain

[x1, . . . , xn][xn+1, z1, . . . , zn−1] − [x1, . . . , xn−1, xn+1][xn, z1, . . . , zn−1] + · · · = 0.

Note that there are n + 1 terms in this expansion. Let u = [x1, . . . , xn] xn+1 −


[x1, . . . , xn−1, xn+1] xn + [x1, . . . , xn−2, xn, xn+1] xn−1 − · · · , where again there are n + 1 terms on the right. Then our equation can be written simply as [u, z1, . . . , zn−1] = 0. Now this holds for all z1, . . . , zn−1, so u = 0 by nondegeneracy. Hence we obtain the following identity for vectors:

[x1, . . . , xn] xn+1 = [x̂1, x2, . . . , xn+1] x1 − [x1, x̂2, x3, . . . , xn+1] x2 + · · ·

(where we employ the Rado notation in which x̂i means that the term xi is omitted). In particular, if {x1, . . . , xn} is a basis of V, we obtain a coordinate-free version of Cramer's rule:

xn+1 = ([x̂1, x2, . . . , xn+1]/[x1, . . . , xn]) x1 − ([x1, x̂2, x3, . . . , xn+1]/[x1, . . . , xn]) x2 + · · · .

Notice that this also gives us a coordinate-free way of expressing vectors in terms of a basis.
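Here is a quick numeric check (ours, not from the notes) of the coordinate-free Cramer rule for n = 3, with the bracket realized as a 3 × 3 determinant; each coefficient below is a bracket with one xi replaced by x4, which up to sign bookkeeping is the hatted bracket above.

    import numpy as np

    def br(*vs):
        return np.linalg.det(np.array(vs))

    rng = np.random.default_rng(2)
    x1, x2, x3, x4 = rng.standard_normal((4, 3))  # x1, x2, x3 a basis (generically)

    d = br(x1, x2, x3)
    recon = (br(x4, x2, x3) / d * x1
             + br(x1, x4, x3) / d * x2
             + br(x1, x2, x4) / d * x3)
    print(np.allclose(recon, x4))                 # True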

Proposition 5 For any x1, . . . , xn, y1, . . . , yn ∈ V and any i ∈ {1, . . . , n},

[ẋ1, . . . , ẋi, y1, . . . , yn−i][ẋi+1, . . . , ẋn, yn−i+1, . . . , yn] = ±[x1, . . . , xn][y1, . . . , yn].

Proof. This can be verified by choosing a basis {e1, . . . , en}, showing that the identity holds if we let each of the xs and ys be one of the es, and applying multilinearity.

This is closely related to the Laplace expansion for the determinant. For

let {e1, . . . , en} be a unimodular basis, and let {u1, . . . , un} be the dual basis. Then

[ẋ1, . . . , ẋi, e1, . . . , en−i]
= Σσ sgn(σ) ⟨ẋ1|uσ(1)⟩ · · · ⟨ẋi|uσ(i)⟩ ⟨e1|uσ(i+1)⟩ · · · ⟨en−i|uσ(n)⟩
= Σ′σ sgn(σ) ⟨ẋ1|uσ(1)⟩ · · · ⟨ẋi|uσ(i)⟩,

where in this instance the prime indicates that the sum is over all permutations σ such that

σ(i + 1) = 1, σ(i + 2) = 2, . . . , σ(n) = n − i,

since all other terms in the sum vanish due to the duality of the es and the us. Similar reasoning applies to the other bracket. It is easy to see now that this is just the Laplace expansion.


Another example of bracket products is the following. Take n bases of V:

{w1, . . . , wn}, {x1, . . . , xn}, . . . , {y1, . . . , yn}, {z1, . . . , zn}.

Now consider the following antisymmetrized bracket product, where each basis (except the last) is skew-symmetrized independently:

[ẇ1, ẋ1, . . . , ẏ1, z1][ẇ2, ẋ2, . . . , ẏ2, z2] · · · [ẇn, ẋn, . . . , ẏn, zn].

We claim that this equals

c · [w1, . . . , wn][x1, . . . , xn] · · · [y1, . . . , yn][z1, . . . , zn]

for some constant c. We shall not give a complete proof here, but will content ourselves with a sketch. Fix everything except the ws. Letting the ws vary, note that the result is skew-symmetric and hence equal to

c′(x1, . . . , xn, . . . , z1, . . . , zn) [w1, . . . , wn],

where c′ is some function that is independent of the ws but dependent on everything else. Now fix everything except the xs, and so on, repeating the argument for each basis. We eventually reach the desired identity.

There is a difficult unsolved problem connected with this identity: for even n, is it true that c ≠ 0? Certain results in supersymmetric algebra strongly suggest that this is the case, but we shall not discuss the problem further here.

1.8 The weak transfer conjecture

The weak transfer conjecture is a very interesting and fruitful conjecture in combinatorics. Before we can state it, we must define the notion of a matroid.

A matroid of rank n on a finite set S is a family B of n-subsets of S, called bases, with the property that for any B1, B2 ∈ B and any x ∈ B1, there exists y ∈ B2 such that

(B1 − {x}) ∪ {y} and (B2 − {y}) ∪ {x}

are also in B.

It is a remarkable fact that the matroid exchange property implies a much stronger exchange property, known as Greene's theorem. We omit the proof.


Theorem 6 Let M(S) be a matroid of rank n on a set S. Let B1 and B2 be any two bases, and let A1 be any subset of B1. Then there exists a subset A2 of B2 such that

(B1 − A1) ∪ A2 and (B2 − A2) ∪ A1

are also bases.

The classical example of matroids is given by the following proposition.

Proposition 7 Let S be a finite subset of PG(n − 1) which contains at least one basis. Then the family of all n-subsets of S which are bases of PG(n − 1) is a matroid of rank n.

Proof. [x1, . . . , xn][y1, . . . , yn] equals an expression of the form

±[y1, x2, . . . , xn][x1, y2, . . . , yn] ± [y2, x2, . . . , xn][y1, x1, y3, . . . , yn] ± · · · ± [yn, x2, . . . , xn][y1, . . . , yn−1, x1].

Hence if {x1, . . . , xn} and {y1, . . . , yn} are bases, at least one term in this sum is nonzero, which proves the exchange property.

Not all matroids can be obtained in this way. The study of such "non-representable" matroids involves some very deep combinatorics, but we shall not pursue this study here.

Note that we can prove Greene's theorem for the case of PG(n − 1) by

using Proposition 5.

Now let us state the weak transfer conjecture. If we take a bracket product that is multilinear in each variable and skew-symmetrize with respect to any n + 1 of its variables, the result is identically zero. This procedure gives us a way of constructing identities of the form b ≡ 0, where b is a linear combination of bracket monomials. Let S be the set of bracket monomials appearing in the expression for b. From the identity we obtain trivially a theorem of the form, "if at least one element of S is nonzero, then at least two elements of S are nonzero." Since a bracket is nonzero if and only if its arguments form a basis, we can restate such a theorem in terms of bases. Stated in this way, the theorem can be interpreted as a statement about matroids. The weak transfer conjecture states that all statements about matroids obtained in this way are theorems about matroids.

Note that the weak transfer conjecture implies Greene's theorem, because of the identity of Proposition 5.
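The exchange property of Proposition 7 can be checked by brute force. The sketch below (ours; Python with numpy) builds the matroid of bases from the columns of a random integer matrix and verifies the symmetric (double) exchange property directly.

    import itertools
    import numpy as np

    rng = np.random.default_rng(3)
    S = rng.integers(-3, 4, size=(3, 6))     # six points of PG(2), as columns

    def is_basis(cols):
        cols = sorted(cols)
        if len(cols) != S.shape[0]:
            return False
        return abs(np.linalg.det(S[:, cols])) > 1e-9

    bases = [b for b in itertools.combinations(range(6), 3) if is_basis(b)]

    ok = True
    for B1, B2 in itertools.product(bases, repeat=2):
        for x in B1:
            if not any(is_basis((set(B1) - {x}) | {y}) and
                       is_basis((set(B2) - {y}) | {x}) for y in B2):
                ok = False
    print(ok)   # True: the double exchange property holds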


1.9 The classical algebras

There are four classical ways to associate an algebra with a set S = {x1, . . . , xn} and a field k:

1. k⟨S⟩, the associative free algebra over k generated by the variables S;

2. k[S], the polynomial algebra over k generated by the commuting variables S;

3. Ext[S, k], the skew-symmetric algebra generated by S, i.e., k⟨S⟩ subject to the two relations xi xj = −xj xi and xi^2 = 0;

4. Div[S, k], the divided powers algebra generated by S, where to each variable xi we associate the fake powers xi^(1), xi^(2), . . . and form the polynomial ring k[S] subject to the relations

xi^(1) = xi  and  xr^(i) xr^(j) = (i + j choose i) xr^(i+j).

The above definition of Div[S, k] is a bit obscure. The secret is that it works like k[S] but with x^(i) = x^i/i!. The advantage of the above definition is that it works in fields of nonzero characteristic.

Ext[S, k] is the only one of the above which is finite dimensional as a vector space: it has dimension 2^n, in fact. Its subspace of homogeneous skew-symmetric polynomials of degree r has dimension (n choose r). Curiously, the subspace of homogeneous polynomials of degree r in k[S] has dimension

n(n + 1) · · · (n + r − 1)/r!,

the number of multisets of size r drawn from n variables.

The above algebras are all graded.

1.10 Coordinate-free versions of the classical algebras

The reader may be familiar with the quantum group algebra, which is just k⟨S⟩ subject to the relation xi xj = q xj xi. Since this includes the exterior algebra and the commutative polynomial algebra as special cases, why don't we just study this more general algebra? The reason is that the four classical algebras are the only ones that can be defined in a coordinate-free way. Our next task is to provide these definitions.


Let V be a vector space of dimension n over k. We have the following algebras:

1. Tens(V, k) = k ⊕ V ⊕ (V ⊗ V) ⊕ (V ⊗ V ⊗ V) ⊕ · · · as a vector space, with an associative product defined by juxtaposition. If we choose a basis {e1, . . . , en} for V, then we get an isomorphism between Tens(V, k) and k⟨{e1, . . . , en}⟩. Tens(V, k) is a graded algebra, whose homogeneous elements of degree r are called tensors of step r.

Note that the product (v, v′) ↦ v ⊗ v′ is bilinear. We can thus decompose v ⊗ v′ into the sum of a symmetric part (v ⊗ v′ + v′ ⊗ v)/2 and a skew-symmetric part (v ⊗ v′ − v′ ⊗ v)/2. It is curious to note that the symmetric part will give us the relation for a symmetric algebra, while the skew-symmetric part gives us the relation for an exterior algebra. This prompts the obvious question: what sort of algebra does the cyclic-symmetric part of a trilinear form give us? The answer is not known.

2. Symm(V, k), the quotient of Tens(V, k) by the identity vv′ = v′v (note that we henceforth use juxtaposition to denote the tensor product). Here, if we choose a basis {e1, . . . , en} for V, we get an isomorphism between Symm(V, k) and k[{e1, . . . , en}].

3. Ext(V, k), the quotient of Tens(V, k) by the identities vv′ = −v′v and v² = 0. We shall occasionally denote the exterior algebra by Λ(V, k) or Λ(V) and the rth graded part by Λr(V, k) or Λr(V).

Warning: The product in Ext(V, k) is usually denoted by ∧, but we shall use the symbol ∨ instead, reserving ∧ for another product that we shall introduce shortly, in order to be more consistent with lattice-theoretic notation. We shall refer to ∨ as the exterior product or the join.

Again, if we choose a basis {e1, . . . , en} for V, then we get an isomorphism between Ext(V, k) and Ext[{e1, . . . , en}, k].

It is possible to define the tensor product of two exterior algebras. The naïve definition does not work; we need a definition of the tensor product ⊗a such that

(v ⊗ 1) ⊗a (1 ⊗ w) = −(1 ⊗ w) ⊗a (v ⊗ 1).

The only way to ensure this is to declare it true by fiat. We leave it to the reader to fill in the remaining details.

To define the divided power algebra in a coordinate-free way, we must first define the notion of an "algebra with divided powers". Let A be a graded algebra over k, generated as an algebra by its elements of degree 1. Then we say that A has divided powers when to every element a ∈ A we can associate


an element a^(i) of degree i · deg(a) such that the following five relations hold:

1. a^(i) a^(j) = (i + j choose j) a^(i+j)
2. (a^(i))^(j) = ((ij)!/(j!(i!)^j)) a^(ij)
3. (a + b)^(i) = Σj=0..i a^(j) b^(i−j)
4. a^(1) = a
5. a^(0) = 1

For example, the symmetric algebra over a field of characteristic zero has divided powers with the aforementioned trick of letting x^(i) = x^i/i!. We now leave it as an exercise to define the divided power algebra in a coordinate-free way.

It is an amazing fact that the exterior algebra also has divided powers. We shall indicate by means of examples how this can be proved. Choose a basis {e1, . . . , en} of V, and let

t = e1 ∨ e2 + e3 ∨ e4 = e1e2 + e3e4.

Then

t · t = t² = e1e2e3e4 + e3e4e1e2 = 2 e1e2e3e4.

This leads us to define t^(2) as e1e2e3e4. Now let

t = e1e2e3 + e4e5e6.

It is easy to see that t · t = 0, and this leads us to define t^(r) = 0, for r ≥ 2, when t has odd step. We leave it as an exercise to complete the proof.

1.11 Generalizations of the determinant

Previously, we defined skew-symmetric forms in an intrinsic fashion and found that they gave us a basis-independent version of the determinant. One of the benefits of giving intrinsic definitions of the various classical algebras is that it helps us define generalizations of the determinant.

Here are some examples:


1. The Wronskian,

W(f1, . . . , fn) = det
( f1         · · ·  fn        )
( f1′        · · ·  fn′       )
( ⋮                 ⋮         )
( f1^(n−1)   · · ·  fn^(n−1)  ),

where fi ∈ k[x] and char k = 0.

2. The Jacobian,

Jac(f1, . . . , fn) = det
( ∂f1/∂x1   · · ·  ∂fn/∂x1 )
( ⋮                ⋮       )
( ∂f1/∂xn   · · ·  ∂fn/∂xn ).

3. The resultant, Res(f1, . . . , fn; k), fi ∈ Symm(V, k). The resultant has

a long definition which we will not repeat here, but suffice it to note that it equals zero iff the fi have a common root.

The Wronskian is particularly useful in the study of ordinary differential equations. Suppose we have a linear operator T given by

T = D^n + pn−1(x) D^(n−1) + · · · + p1(x) D + p0(x) I,

where the pi(x) are polynomials. Then we know from the theory of ordinary differential equations that {f : Tf = 0} is a vector space V of dimension n. Furthermore, the Wronskian W is a nondegenerate bracket on V, i.e., the pair (V, W) is a Peano space. This fact illuminates many aspects of ordinary differential equations.

There is an analogue of the Wronskian for differences in place of

derivatives, called the Casoratian. Let A be an algebra over k, and let U be an endomorphism of A. Define ∆U = U − I. Then the Casoratian is defined by

C(f1, . . . , fn) = det
( f1             · · ·  fn            )
( ∆U f1          · · ·  ∆U fn         )
( ⋮                     ⋮             )
( ∆U^(n−1) f1    · · ·  ∆U^(n−1) fn   ).


We obtain the classical case by taking A to be the set of functions of one variable (subject to certain conditions) and letting U be the map that takes f(x) to f(x + 1).

Actually, endomorphisms of an algebra do not reveal as much of its structure as derivations do. A derivation of an algebra A is a linear operator D : A → A such that

D(fg) = (Df) · g + f · (Dg).

It is possible to define an analogue of the Wronskian using derivations in place of endomorphisms.

We leave the proofs of the following two statements as an exercise.

1. Let A be the algebra of all endomorphisms of a vector space V of dimension n. Then every automorphism of A is inner (i.e., of the form T ↦ STS⁻¹ for some invertible S ∈ A), and every derivation of A is inner (i.e., of the form T ↦ TS − ST for some S ∈ A).

2. The Singer-Wermer theorem: the only derivation of the algebra of continuous real-valued functions on the unit interval that is everywhere defined is the zero derivation.
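Here is a tiny sympy illustration (ours, not from the notes) of the Wronskian as a bracket on a solution space: for y″ + y = 0 the solutions sin and cos give a nonzero Wronskian, while a linearly dependent triple gives zero, as a degenerate bracket argument should.

    import sympy as sp

    x = sp.symbols('x')
    fs = [sp.sin(x), sp.cos(x)]
    W = sp.Matrix([[f.diff(x, i) for f in fs] for i in range(2)]).det()
    print(sp.simplify(W))    # -1: a nondegenerate value

    gs = [sp.sin(x), sp.cos(x), sp.sin(x + 1)]   # linearly dependent
    W3 = sp.Matrix([[g.diff(x, i) for g in gs] for i in range(3)]).det()
    print(sp.simplify(W3))   # 0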

1.12 Extensors and subspaces

Let V be a Peano space of dimension n. If {e1, . . . , en} is a basis, we can write a typical element t as

t = Σi1<i2<···<ir  ai1,i2,...,ir (ei1 ∨ ei2 ∨ · · · ∨ eir).

We say that t is decomposable or simple, or that t is an extensor, if there exists a basis {x1, . . . , xn} of V such that t = x1 ∨ x2 ∨ · · · ∨ xr.

Most tensors aren't decomposable. In fact, it is an important and unsolved problem to determine the minimal number of decomposable tensors needed to write a given tensor as a sum.

If t = x1 ∨ x2 ∨ · · · ∨ xr is decomposable, then we say that t represents the subspace A spanned by {x1, . . . , xr}. Note that this explains why we chose the notation ∨ for the exterior product instead of the more usual ∧.

Theorem 8 The definition of the representation of a subspace by a decomposable tensor is well defined, in the sense that if {y1, . . . , yr} is another basis


of A, then

y1 ∨ y2 ∨ · · · ∨ yr = c · (x1 ∨ x2 ∨ · · · ∨ xr)

for some c ≠ 0.

Proof. We have

yi = Σj=1..r dij xj

for some scalars dij; hence

y1 ∨ · · · ∨ yr = (Σs=1..r d1s xs) ∨ (Σt=1..r d2t xt) ∨ · · · ∨ (Σu=1..r dru xu),

which by multilinearity is the same as

Σs Σt · · · Σu  d1s d2t · · · dru (xs ∨ xt ∨ · · · ∨ xu).

But the individual tensors are non-zero only if the s, t, . . . , u are distinct, so we may treat the indices as a permutation σ of {1, . . . , r}. The expression becomes

Σσ∈Sr d1σ(1) d2σ(2) · · · drσ(r) sgn(σ) (x1 ∨ x2 ∨ · · · ∨ xr).

But this is just a determinant, so it must equal c · (x1 ∨ x2 ∨ · · · ∨ xr), where c ≠ 0.

Corollary 9 If x1, . . . , xr ∈ V, then x1 ∨ x2 ∨ · · · ∨ xr = 0 iff {x1, . . . , xr} is linearly dependent.

If t and t′ are two extensors representing the subspaces A and B respectively, then the extensor t ∨ t′ represents the subspace A ∨ B, provided that A ∧ B = 0. This is clear from the above corollary.

1.13 Meets and the double algebra

We now introduce an operation ∧ that is dual to ∨. These two operations together make the exterior algebra into a double algebra. Warning: the double algebra is not quite a lattice, as will become apparent.


Let A = x1 ∨ · · · ∨ xp and B = y1 ∨ · · · ∨ yq. Then the meet of A and B, denoted A ∧ B, is defined as follows:

A ∧ B = [x1, . . . , xp, ẏ1, . . . , ẏn−p] ẏn−p+1 ∨ · · · ∨ ẏq
      = ± ẋn−q+1 ∨ · · · ∨ ẋp [ẋ1, . . . , ẋn−q, y1, . . . , yq].

We leave it as an exercise to determine the correct sign and to show that the two expressions are indeed equivalent.

Theorem 10 Let A and B be extensors associated to subspaces X and Y of V respectively. If the union X ∪ Y spans all of V and if X ∩ Y ≠ {0}, then A ∧ B is the extensor associated to the subspace X ∩ Y.

Proof. Suppose X ∪ Y spans V . Let the step of A be p and let the stepof B be q, with p + q > n. The theorem is trivial if p + q = n, so supposep+ q > n. Take a basis {c1, . . . , cd} of X ∩ Y , where d = p+ q − n > 0. Wecan complete this to a basis of X or to a basis of Y , say

{c1, . . . , cd, a1, . . . , ap−d} and {c1, . . . , cd, b1, . . . , bq−d}respectively. Then we have

A = c1 ∨ · · · ∨ cd ∨ a1 ∨ · · · ∨ ap−dB = c1 ∨ · · · ∨ cd ∨ b1 ∨ · · · ∨ bq−d

so that

A ∧B = ±[c1, . . . , cd, a1, . . . , ap−d, b1, . . . , bq−d](c1 ∨ · · · ∨ cd).For example, if l is a line in PG(2) represented by x1 ∨ x2 and l0 is a line

represented by y1 ∨ y2, then p = l ∧ l0 is represented by(x1 ∨ x2) ∧ (y1 ∨ y2) = [x1x2y1]y2 − [x1x2y2]y1 = x1[x2y1y2]− x2[x1y1y2],

since l ∨ l0 = PG(2).For another example, we can describe algebraically the situation of three

lines meeting at a single point in PG(2) by letting l = x1x2, l0 = y1y2, and

l00 = z1z2 and imposing the condition that [x1x2y1][y2z1z2] = 0 or (expandingthe Scottish notation) [x1x2y1][y2z1z2]− [x1x2y2][y1z1z2] = 0.We leave the proof of the following proposition as an exercise.
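The meet formula for two lines is easy to test numerically. In the sketch below (ours; Python with numpy, the bracket as a 3 × 3 determinant), the point p produced by the formula is checked to lie on both lines, i.e., to be dependent on {x1, x2} and on {y1, y2}.

    import numpy as np

    def br(a, b, c):
        return np.linalg.det(np.array([a, b, c]))

    rng = np.random.default_rng(4)
    x1, x2, y1, y2 = rng.standard_normal((4, 3))   # homogeneous coordinates

    p = br(x1, x2, y1) * y2 - br(x1, x2, y2) * y1  # the meet of the two lines
    print(np.isclose(br(x1, x2, p), 0.0))          # True: p lies on x1 v x2
    print(np.isclose(br(y1, y2, p), 0.0))          # True: p lies on y1 v y2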


We leave the proof of the following proposition as an exercise.

Proposition 11 Let A, B, C, A1, A2 be extensors such that A = A1 + A2, step(A) = p and step(B) = q. Then

A ∧ B = (−1)^(n+p+q+pq) (B ∧ A),
A ∧ B = A1 ∧ B + A2 ∧ B, and
A ∧ (B ∧ C) = (A ∧ B) ∧ C.

1.14 Meets and joins in the dual space

The dual space V∗ of V has a natural bracket defined on it as follows: let {u1, . . . , un} be a basis for V∗. Then

det⟨xi|uj⟩ = c(u1, . . . , un) [x1, . . . , xn]

for some multilinear function c. Now c is clearly skew-symmetric, so it is a bracket on V∗, which we denote by [ ]∗. Hence we have the attractive identity

det⟨xi|uj⟩ = [x1, . . . , xn][u1, . . . , un]∗.

We thus get joins and meets in the dual space. How are these related to the original joins and meets? To answer this question we introduce the concept of copoints. If {e1, . . . , en} is a basis for V, then

e^i = e1 ∨ · · · ∨ êi ∨ · · · ∨ en

is an extensor of step n − 1, called a copoint.

Note that ⟨ei|uj⟩ = δij while [ei, e^j] = ±δij. The linear form x ↦ [x, e^j] thus coincides, up to a sign, with the linear form x ↦ ⟨x|uj⟩. We can thus extend the map uj ↦ ±e^j to an isomorphism between V∗ and the copoints of Ext(V). From now on, in fact, we will identify the two. This allows us to write

⟨x|u⟩ = [x, u] = x ∧ u.

The observant reader will notice that the above discussion sneakily proves that every skew-symmetric form of step n − 1 is decomposable.

We leave it as an exercise to prove that joins in V∗ equal meets in V and that meets in V∗ (defined by the dual bracket) equal joins in V. Unfortunately, this does not give us perfect symmetry between a space and its dual.


For example, it is a straightforward exercise to show that if u1, . . . , un are covectors, then

u1 ∧ u2 ∧ · · · ∧ un = [u1, u2, . . . , un]∗,

but it is not the case that

x1 ∨ x2 ∨ · · · ∨ xn = [x1, x2, . . . , xn],

since the right hand side of this equation is a scalar but the left hand side is not. There is no easy way to get around this. We can choose an integral, i.e., a tensor E = x1 ∨ · · · ∨ xn where {x1, . . . , xn} is a unimodular basis, but this does not solve all the problems. About the best we can achieve at this stage in the theory are identities such as the following, whose proof we leave as an exercise:

[y1, . . . , yn](x1 ∨ · · · ∨ xn) = [x1, . . . , xn](y1 ∨ · · · ∨ yn).

We remark that every extensor of step r is the meet of n − r covectors. To see this, suppose that

t = x1 ∨ · · · ∨ xr.

Complete the xs to a basis {x1, . . . , xn}. Now define

x^i = x1 ∨ · · · ∨ x̂i ∨ · · · ∨ xn.

One checks readily that

t = x^(r+1) ∧ · · · ∧ x^n.

1.15 Plücker coordinates

Let t = x1 ∨ · · · ∨ xr and let {u1, . . . , un} be a basis of V∗. Recall that ⟨x|u⟩ = x ∧ u. This suggests the following generalization:

⟨x1 ∨ · · · ∨ xr | u1 ∧ · · · ∧ ur⟩ = (x1 ∨ · · · ∨ xr) ∧ (u1 ∧ · · · ∧ ur)
= ⟨ẋr|u1⟩ (ẋ1 ∨ · · · ∨ ẋr−1) ∧ u2 ∧ · · · ∧ ur
= det⟨xi|uj⟩.

Then we define the Plücker coordinates of t to be

pi1,...,ir = ⟨t | ui1 ∧ · · · ∧ uir⟩.


This allows us to identify (Λr(V))∗ with Λn−r(V) in the same way we identified V∗ with Λn−1(V). Notice that if {e1, . . . , en} is the basis whose dual is {u1, . . . , un}, then pi1,...,ir is just the coefficient of ei1 ∨ ei2 ∨ · · · ∨ eir in the expansion of t in this basis. This explains why the ps are called "coordinates".

In fact, we can generalize further by dropping the requirement that the number of xs has to equal the number of us. We find that for p > q,

⟨x1 ∨ · · · ∨ xp | u1 ∧ · · · ∧ uq⟩ = ⟨ẋ1 ∨ · · · ∨ ẋq | u1 ∧ · · · ∧ uq⟩ (ẋq+1 ∨ · · · ∨ ẋp),

and for p ≤ q,

⟨x1 ∨ · · · ∨ xp | u1 ∧ · · · ∧ uq⟩ = ⟨x1 ∨ · · · ∨ xp | u̇1 ∧ · · · ∧ u̇p⟩ (u̇p+1 ∧ · · · ∧ u̇q).

1.16 Desargues revisited & theorem of Pappus

We can now prove Desargues’ theorem purely algebraically.

Theorem 12 Let a, b, c, a′, b′, c′ be any points in PG(2). Then the following identity holds:

(bc ∧ b′c′) ∨ (ca ∧ c′a′) ∨ (ab ∧ a′b′) = [a, b, c][a′, b′, c′](aa′ ∧ bb′ ∧ cc′).

Some comments are in order. For simplicity of notation we use juxtaposition of vectors to denote their join, and juxtaposition of covectors to denote their meet. Note that this identity does indeed imply Desargues' theorem: if the triangles abc and a′b′c′ are nondegenerate and if the lines aa′, bb′, and cc′ intersect in a point, then the right hand side is zero; by the theorem, the left hand side is zero, but this is just the conclusion of the Desargues theorem. In fact, the identity is in some sense equivalent to the Desargues theorem, but we shall not make this notion precise here.

Proof. Let x, y, z be copoints (i.e., lines). We claim that

abc ∧ ((a ∨ yz) ∧ (b ∨ zx) ∧ (c ∨ xy)) = xyz ∧ ((bc ∧ x) ∨ (ca ∧ y) ∨ (ab ∧ z)).

To see this, note that

a ∨ yz = [a, y]z − [a, z]y and ca ∧ y = c[a, y] − a[c, y].


The right hand side of the claim equals

xyz ∧ ((b[c, x] − c[b, x]) ∨ (c[a, y] − a[c, y]) ∨ (a[b, z] − b[a, z]))

or

xyz ∧ (bca [c, x][a, y][b, z] − cab [b, x][c, y][a, z])

or

[x, y, z][a, b, c]([c, x][a, y][b, z] − [b, x][c, y][a, z]).

The left hand side equals

[a, b, c](([a, y]z − [a, z]y) ∧ ([b, z]x − [b, x]z) ∧ ([c, x]y − [c, y]x))

or

[a, b, c][x, y, z]([a, y][b, z][c, x] − [a, z][b, x][c, y]).

This proves the claim. Now just set x = b′c′, y = c′a′, and z = a′b′.
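The identity of Theorem 12 can be spot-checked numerically. Below is a sketch of ours, with brackets as determinants and meets computed by the line-meet formula of Section 1.13; we compare absolute values because the overall sign depends on the sign conventions adopted for the meet.

    import numpy as np

    def br(u, v, w):
        return np.linalg.det(np.array([u, v, w]))

    def meet(x1, x2, y1, y2):
        # (x1 v x2) ^ (y1 v y2), as in Section 1.13
        return br(x1, x2, y1) * y2 - br(x1, x2, y2) * y1

    rng = np.random.default_rng(5)
    a, b, c, a2, b2, c2 = rng.standard_normal((6, 3))   # a2 stands for a', etc.

    lhs = br(meet(b, c, b2, c2), meet(c, a, c2, a2), meet(a, b, a2, b2))
    q = meet(a, a2, b, b2)                    # (aa') ^ (bb'), a point
    rhs = br(a, b, c) * br(a2, b2, c2) * br(q, c, c2)   # meet with cc' is a bracket
    print(np.isclose(abs(lhs), abs(rhs)))     # True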

As another illustration of the technique of proving geometric theorems algebraically, we give the algebraic version of a theorem of Pappus.

Theorem 13 Let a, b, c, a′, b′, c′ be points in PG(2), and let E be an integral. Then

((b′c ∧ bc′) ∨ (c′a ∧ ca′) ∨ (a′b ∧ ab′)) ∧ E
= [a, a′, b′][b, b′, c′][c, c′, a′][a, b, c] − [a, b, b′][b, c, c′][c, a, a′][a′, b′, c′].

Proof. If we expand the left hand side we get

(b′[c, b, c′] − c[b′, b, c′]) ∨ (c′[a, c, a′] − a[c′, c, a′]) ∨ (a′[b, a, b′] − b[a′, a, b′]).

After some straightforward manipulation this simplifies to

b′c′a′ [c, b, c′][a, c, a′][b, a, b′] − cab [b′, b, c′][c′, c, a′][a′, a, b′],

which is equivalent to what the theorem asserts.


1.17 Matrix identities

The machinery we have developed so far gives us a method for proving matrix identities. We can think of a matrix as a set of n vectors {a1, . . . , an}. The entries of the matrix can be retrieved by fixing a coordinate system {u1, . . . , un}.

With this understanding, we define the adjugate matrix A^ad as follows:

A^ad = {a^1, . . . , a^n},

where as usual

a^i = a1 ∨ · · · ∨ ai−1 ∨ âi ∨ ai+1 ∨ · · · ∨ an.

Relative to a dual coordinate system {e1, . . . , en}, the adjugate matrix has entries ⟨ej|a^i⟩. Classically, the adjugate matrix is just the matrix of cofactors. The following theorem is known as Cauchy's theorem on the adjugate.

Theorem 14 det(⟨ej|a^i⟩) = (det(⟨ai|uj⟩))^(n−1).

Proof. Note that

a^1 ∧ a^2 = (a2 ∨ · · · ∨ an) ∧ (a1 ∨ a3 ∨ · · · ∨ an)
= [a2, . . . , an, ȧ1] ȧ3 ∨ · · · ∨ ȧn
= ±[a1, . . . , an] a3 ∨ · · · ∨ an.

Thus

a^1 ∧ a^2 ∧ a^3 = ±[a1, . . . , an] (a3 ∨ a4 ∨ · · · ∨ an) ∧ (a1 ∨ a2 ∨ a4 ∨ · · · ∨ an)
= ±[a1, . . . , an]² a4 ∨ a5 ∨ · · · ∨ an,

and so on. The result follows by induction.
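A one-line sympy check of Cauchy's theorem on the adjugate (ours; sympy's adjugate is the classical matrix of cofactors):

    import sympy as sp

    A = sp.randMatrix(4, 4, min=-5, max=5, seed=6)
    print(sp.det(A.adjugate()) == sp.det(A)**3)   # True: n = 4, so the power is n-1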

In the course of this proof we essentially proved Jacobi's theorem:

Theorem 15 Every r × r minor of a matrix is equal to a power of a bracket times the complementary (n − r) × (n − r) minor of the adjugate matrix.

Given a matrix M, a compound matrix M^(r) of order r is a matrix whose entries are the r × r minors of M. For example, in Λr(V), consider the set

{ai1 ∨ ai2 ∨ · · · ∨ air : i1 < i2 < · · · < ir}.


Relative to a coordinate system, this matrix has the entries

⟨ai1 ∨ · · · ∨ air | uj1 ∧ · · · ∧ ujr⟩.

One can show that the determinant of M^(r) is just a power of the bracket [a1, . . . , an].

We invite the reader to invent his own theorem on minors as an exercise.

1.18 Pfaffians

If T : V → V is a linear operator, then T extends uniquely to an endomorphism of Λ(V) by setting

T(x1 ∨ · · · ∨ xr) = Tx1 ∨ Tx2 ∨ · · · ∨ Txr.

Some work needs to be done to show that T is well-defined and indeed an endomorphism, but we omit the details. We can restrict T to an operator T^(r) on Λr(V). If {e1, . . . , en} is a basis of V, so that

{ei1 ∨ · · · ∨ eir : i1 < i2 < · · · < ir}

is a basis of Λr(V), then the matrix of T^(r) relative to this basis is the compound matrix of the matrix of T.

As an immediate corollary we obtain the Binet-Cauchy formula.

Corollary 16 (TS)^(r) = T^(r) S^(r).
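The corollary is easy to verify numerically. Here is a sketch (ours; numpy) building the r-th compound matrix from r × r minors and checking the Binet-Cauchy formula:

    import itertools
    import numpy as np

    def compound(M, r):
        idx = list(itertools.combinations(range(M.shape[0]), r))
        C = np.empty((len(idx), len(idx)))
        for i, rows in enumerate(idx):
            for j, cols in enumerate(idx):
                C[i, j] = np.linalg.det(M[np.ix_(rows, cols)])
        return C

    rng = np.random.default_rng(7)
    T, S = rng.standard_normal((2, 4, 4))
    print(np.allclose(compound(T @ S, 2),
                      compound(T, 2) @ compound(S, 2)))   # True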

We now turn to the Pfaffian. The classical definition is as follows. If M = (aij) is an n × n matrix such that aij = −aji for all i and j (and aii = 0 for all i), then if n is odd the determinant of M is zero, but if n is even the determinant of M is the square of a polynomial in the aij. This polynomial is called the Pfaffian of M. Thus

det(M) = 0 if n is odd, and det(M) = Pf(M)² if n is even.

The Pfaffian is also related to the problem of enumerating 1-factors of graphs. Recall that a 1-factor of a graph G is a subset S of the edges such


that every vertex of G belongs to exactly one edge in S. Then we have an expression of the form

Pf(M) = Σϕ ± Π(i,j)∈ϕ aij,

where the sum is over all 1-factors ϕ of the complete graph Kn. We leave the exact formulation of this formula and its proof as an exercise.

In some sense the Pfaffian is a generalization of the determinant, because the determinant is related to 1-factors of complete bipartite graphs in the same way that Pfaffians are related to 1-factors of complete graphs.

The Pfaffian can be defined in terms of divided powers. Given a 2n × 2n

matrix M = (aij), with aij = −aji, we can define a tensor of step two in Ext(V) (where V has dimension 2n) by

t = Σi<j aij (ei ∨ ej).

Then we can define the Pfaffian Pf(M) by

t^(n) = Pf(M) e1 e2 · · · e2n.

We leave it as an exercise to show that this is equivalent to the classical definition.
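The exercise can be set up concretely. Below is a sketch (ours; Python) of the 1-factor expansion: Pf(M) as a signed sum over perfect matchings of Kn, checked against Pf(M)² = det(M) and against the change-of-variables rule proved next.

    import itertools
    import numpy as np

    def perm_sign(p):
        s = 1
        for i in range(len(p)):
            for j in range(i + 1, len(p)):
                if p[i] > p[j]:
                    s = -s
        return s

    def pfaffian(M):
        n = M.shape[0]
        total = 0.0
        # enumerate each perfect matching once, via its canonical permutation:
        # pairs internally increasing and sorted by their first elements
        for p in itertools.permutations(range(n)):
            pairs = [(p[2*i], p[2*i + 1]) for i in range(n // 2)]
            if all(a < b for a, b in pairs) and \
               all(pairs[i][0] < pairs[i + 1][0] for i in range(len(pairs) - 1)):
                term = float(perm_sign(p))
                for a, b in pairs:
                    term *= M[a, b]
                total += term
        return total

    rng = np.random.default_rng(8)
    X = rng.standard_normal((6, 6))
    A = X - X.T                                 # random skew-symmetric matrix
    P = rng.standard_normal((6, 6))
    print(np.isclose(pfaffian(A)**2, np.linalg.det(A)))                       # True
    print(np.isclose(pfaffian(P.T @ A @ P), np.linalg.det(P) * pfaffian(A)))  # True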

We can now state and prove the fundamental property of the Pfaffian, which we express in matrix form.

Theorem 17 Pf(PᵀAP) = det(P) Pf(A).

Here, of course, Pᵀ denotes the transpose of P.

Proof. By the principle of irrelevance of algebraic inequalities, we may assume that det(P) ≠ 0. Write P = (pij) and let T be the linear operator defined by

T : ei ↦ Σj=1..2n pij ej.

Let t = Σi<j aij (ei ∨ ej). Then

T^(2)(t) = Σi<j aij (Σr pir er) ∨ (Σs pjs es) = Σr<s (Σi,j pir aij pjs) (er ∨ es),

which is precisely the tensor associated to the matrix PᵀAP.


Now T is an endomorphism of Λ(V), so T(t^(n)) = (Tt)^(n), i.e.,

T^(2n) t^(n) = (T^(2) t)^(n).

On Λ^(2n)(V) the operator T^(2n) is multiplication by det(T) = det(P); applying the result just obtained for T^(2) together with the definition of the Pfaffian in terms of divided powers, we get

det(P) Pf(A) (e1 ∨ · · · ∨ e2n) = T^(2n)(t^(n)) = (T^(2) t)^(n) = Pf(PᵀAP) (e1 ∨ · · · ∨ e2n).

The result follows.

Tensors of step two can be reduced to a canonical form.

Theorem 18 For any given tensor t of step two, there exists a basis {e1, . . . , e2n} of V such that in this basis, t can be written in the form

t = e1 ∨ e2 + e3 ∨ e4 + · · · + e2p−1 ∨ e2p,

where p ≤ n.

Equivalently, in matrix terms, if A is a skew-symmetric 2n × 2n matrix, then there exists an invertible P such that PᵀAP is block diagonal,

PᵀAP = diag(J, J, . . . , J),  J = ( 0 1 ; −1 0 ),

with zero blocks in place of some of the Js when p < n.

There is no known general canonical form for tensors of step higher than two, though there are some existence results by Mumford. For step three, we give here what is known for n = 4, 5, 6. For n = 4, all tensors are decomposable. For n = 5, every tensor t can be written in the form t = x ∨ t′ where t′ is of step two and x ∈ Λ1(V). For n = 6, the following is a complete list of canonical forms:

I. t = a ∨ b ∨ c
II. t = b ∨ c ∨ d + b ∨ e ∨ f = b ∨ t′
III. t = b ∨ c ∨ d + b ∨ f ∨ g + c ∨ e ∨ f
IV. t = b ∨ c ∨ d + e ∨ f ∨ g (the generic case)

We leave it as an exercise to the reader to prove the case n = 6.

Recall that Λl(V)∗ ≅ Λl(V∗) ≅ Λn−l(V), with

⟨x1 ∨ · · · ∨ xl | u1 ∧ · · · ∧ ul⟩ = det⟨xi|uj⟩ = (x1 ∨ x2 ∨ · · · ∨ xl) ∧ u1 ∧ · · · ∧ ul.

If t is a tensor of step r, then

⟨t | u1 ∧ · · · ∧ ur⟩ = ft(u1, u2, . . . , ur),

with ft an alternating r-linear form on V∗. Conversely, if g(x1, . . . , xr) is an alternating r-linear form on V, then there exists a tensor g in Λr(V∗) such that

g(x1, . . . , xr) = ⟨x1 ∨ · · · ∨ xr | g⟩.

We can use this fact to give a simple proof of the above theorem, which we restate as follows.

Theorem 19 Let f(x1, x2) be an alternating bilinear form on a vector space V of dimension 2n. Then there exist a basis {e1, e1′, e2, e2′, . . . , en, en′} of V and an integer p ≤ n such that f(ei, ei′) = 1 for 1 ≤ i ≤ p, while f vanishes on all other pairs of basis vectors.

(Historical note: The notation e1, e1′, . . . , en, en′ for a basis of V is due to

Hermann Weyl.)

Proof. First choose e1, e1′ such that f(e1, e1′) = 1. As long as f is not identically zero, this can always be done. Then for x ∈ V, choose α1, α2 and x′ such that

x = α1 e1 + α2 e1′ + x′ and f(e1, x′) = f(e1′, x′) = 0.

To see that this can be done, note that α1 and α2 must satisfy

f(x, e1) = α1 f(e1, e1) + α2 f(e1′, e1) + f(x′, e1)

and

f(x, e1′) = α1 f(e1, e1′) + α2 f(e1′, e1′) + f(x′, e1′).

We see now that α2 = −f(x, e1) and α1 = f(x, e1′) do the job. (Recall, for instance, that f(e1′, e1) = −f(e1, e1′) = −1.)


Now we will use this result to give a simple proof that Pf(A)² = det(A), for A a skew-symmetric 2n × 2n matrix.

Proof. Pf(PᵀAP) = det(P) Pf(A) by the fundamental property. By judicious choice of P, we may assume that PᵀAP is in the canonical form

PᵀAP = diag(J, . . . , J),  J = ( 0 1 ; −1 0 ),

so that Pf(PᵀAP) = 1 and det(PᵀAP) = 1 by inspection. Now det(PᵀAP) = det(P)² det(A), while

1 = Pf(PᵀAP)² = det(P)² Pf(A)².

Comparing the two expressions, det(P)² det(A) = det(P)² Pf(A)², and thus Pf(A)² = det(A).

Here is an example of how to compute the Pfaffian of a six by six skew-symmetric matrix. Let

A =
( ·  12  13  14  15  16 )
( ·   ·  23  24  25  26 )
( ·   ·   ·  34  35  36 )
( ·   ·   ·   ·  45  46 )
( ·   ·   ·   ·   ·  56 ),

where (ij) abbreviates the entry aij, and the entries on and below the diagonal are determined by skew-symmetry.


Then Pf(A) is given by

(12) Pf( 34 35 36 ; 45 46 ; 56 )
− (13) Pf( 24 25 26 ; 45 46 ; 56 )
+ (14) Pf( 23 25 26 ; 35 36 ; 56 )
− (15) Pf( 23 24 26 ; 34 36 ; 46 )
+ (16) Pf( 23 24 25 ; 34 35 ; 45 ),

where each triangular array lists the above-diagonal entries of the corresponding 4 × 4 skew-symmetric submatrix.

As an exercise, give the analog of Laplace expansions for Pfaffians. Also, derive the analog of Cramer's rule and Jacobi's theorem for Pfaffians.
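As a hint for the first exercise, here is a recursive sketch (ours) of the expansion along the first row used in the 6 × 6 example, with the alternating signs (+, −, +, . . .) of the display above:

    import numpy as np

    def pf(A):
        # expand along the first row: Pf(A) = sum_k (-1)^(k+1) a_{0k} Pf(minor)
        n = A.shape[0]
        if n == 0:
            return 1.0
        total = 0.0
        for k in range(1, n):
            keep = [i for i in range(n) if i not in (0, k)]
            total += (-1) ** (k + 1) * A[0, k] * pf(A[np.ix_(keep, keep)])
        return total

    rng = np.random.default_rng(9)
    X = rng.standard_normal((6, 6))
    A = X - X.T
    print(np.isclose(pf(A)**2, np.linalg.det(A)))   # True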

1.19 The Grassmannian

Given a tensor t ∈ Λr(V), write

t = Σj1,...,jr aj1,...,jr ej1 · · · ejr.

(The as are of course the Plücker coordinates.) We define

span(t) = { Σj1,...,jr aj1,...,jr [x1, . . . , xn−r+1, ė j1, . . . , ė jr−1] ė jr : x1, . . . , xn−r+1 ∈ V }.

Note that t can always be written using vectors from span(t).

We now state a remarkable theorem that gives necessary and sufficient conditions on the Plücker coordinates for t to be decomposable.

Theorem 20 With t as above, let t̃ be the associated alternating r-linear form on V∗, so that t̃(u1, . . . , ur) = ⟨t | u1 ∧ · · · ∧ ur⟩. Then t is decomposable if and only if the following Grassmann conditions are satisfied:

t̃(u̇1, . . . , u̇r) t̃(u̇r+1, ν1, . . . , νr−1) = 0

for all νi. Alternatively, we can express this condition by the equations

ȧp1,...,pr ȧpr+1,q1,...,qr−1 = 0,

where the dots indicate antisymmetrization over p1, . . . , pr+1.


For example, for r = 3 and n = 6, we get

a123a456 − a124a356 + a134a256 − a234a156 = 0.
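This relation is easy to test (a sketch of ours): take the Plücker coordinates p_{ijk} to be the 3 × 3 minors of a random 3 × 6 matrix, so that t = row1 ∨ row2 ∨ row3 is decomposable by construction.

    import numpy as np

    rng = np.random.default_rng(10)
    M = rng.standard_normal((3, 6))

    def p(i, j, k):
        # the Plucker coordinate as a maximal minor (0-based indices)
        return np.linalg.det(M[:, [i, j, k]])

    val = (p(0, 1, 2) * p(3, 4, 5) - p(0, 1, 3) * p(2, 4, 5)
           + p(0, 2, 3) * p(1, 4, 5) - p(1, 2, 3) * p(0, 4, 5))
    print(np.isclose(val, 0.0))   # True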

In general the Grassmann conditions are not independent. It is in fact a major unsolved problem to find an algorithm to construct a minimal set of independent equations from the Grassmann conditions.

The Grassmann conditions determine a variety called the Grassmannian. They show us that the Grassmannian provides us with a natural parametrization of the r-dimensional subspaces of an n-dimensional space.

Proof. Let f(x1, . . . , xr) be an r-linear form. Choose a basis {e1, . . . , en}, and write

xi = Σj aij ej, with aij ∈ k.

Then

f(Σj1 a1j1 ej1, Σj2 a2j2 ej2, . . . , Σjr arjr ejr) = Σj1,...,jr a1j1 a2j2 · · · arjr f(ej1, ej2, . . . , ejr).

Thus, setting tj1,...,jr = f(ej1, ej2, . . . , ejr), we obtain the Plücker coordinates of a skew-symmetric tensor. The Grassmann conditions are equivalent to

f(ẋ1, ẋ2, . . . , ẋr) f(ẋr+1, y1, . . . , yr−1) = 0.  (∗)

We claim that (∗) is satisfied by a skew-symmetric r-linear form f for all xs and ys if and only if

f(x1, . . . , xr) = ⟨x1 ∨ · · · ∨ xr | u1 ∧ · · · ∧ ur⟩

for some covectors u1, . . . , ur. We check this for nondegenerate forms. Choose x1, . . . , xr such that f(x1, . . . , xr) = 1. Then (∗) is equivalent to

f(xr+1, y1, . . . , yr−1) = f( Σi=1..r ±f(xr+1, x1, . . . , x̂i, . . . , xr) xi, y1, . . . , yr−1 ).


For simplicity, set

z = Σi=1..r ±f(xr+1, x1, . . . , x̂i, . . . , xr) xi.

What we have proved is that

f(xr+1, y1, . . . , yr−1) = f(z, y1, . . . , yr−1)

for all y1, . . . , yr−1. But xr+1 is arbitrary. So what we have shown is that for any x,

x ∼ Σj=1..r f(x, x1, . . . , x̂j, . . . , xr) xj,

where u ∼ v means that f(u, y1, . . . , yr−1) = f(v, y1, . . . , yr−1) for all y1, . . . , yr−1. Using analogous reasoning for each of the arguments of f, we see that

f(z1, . . . , zr) = f( Σj1=1..r f(z1, x1, . . . , x̂j1, . . . , xr) xj1, . . . , Σjr=1..r f(zr, x1, . . . , x̂jr, . . . , xr) xjr ).

We can therefore choose a coordinate system in which

x1 = (1, 0, 0, . . . , 0)ᵀ, x2 = (0, 1, 0, . . . , 0)ᵀ, x3 = (0, 0, 1, . . . , 0)ᵀ, . . . ,

all the way up to xr. In this coordinate system, f(z1, . . . , zr) depends only on the first r coordinates. Thus it vanishes outside the corresponding r-dimensional subspace. Hence we have an alternating r-linear form on an r-dimensional space, which must therefore be a bracket. The theorem follows.

1.20 Trace

An antiderivation on Λ(V) is a linear map δ : Λ(V) → Λ(V) such that δ(Λ1(V)) ⊂ k and

δ(x1 ∨ x2 ∨ · · · ∨ xr) = Σi=1..r (−1)^(i+1) x1 ∨ · · · ∨ xi−1 ∨ δ(xi) ∨ xi+1 ∨ · · · ∨ xr.


An example is given by

δ(x ∨ y) = δ(x) ∨ y − x ∨ δ(y).

Theorem 21 Let x ↦ ⟨x|u⟩ be a linear functional on V. Then there exists a unique antiderivation that extends the functional to Λ(V).

Proof. Simply set δu(t) = u ∧ t, and observe that

u ∧ (x1 ∨ · · · ∨ xr) = Σi=1..r (−1)^(i+1) ⟨xi|u⟩ (x1 ∨ · · · ∨ xi−1 ∨ x̂i ∨ xi+1 ∨ · · · ∨ xr).

We might wonder if we can replace "antiderivation" with "derivation" in this theorem. The answer is no. For example, we might try something like

x ∨ y ↦ ⟨x|u⟩ y + x ⟨y|u⟩.

But then we obtain x ∨ x ↦ 2⟨x|u⟩x, so this doesn't work.

This theorem actually holds in the infinite dimensional case, where there are no brackets. Let L(x) be a linear functional. We begin by extending to the tensor algebra by setting

δL(x1 ⊗ · · · ⊗ xr) = Σi=1..r (−1)^i L(xi) (x1 ⊗ · · · ⊗ xi−1 ⊗ x̂i ⊗ xi+1 ⊗ · · · ⊗ xr).

Now note that δL(x ⊗ x) = 0, so that we get an induced antiderivation on the exterior algebra.

This theory has applications in physics. In quantum mechanics there are

four kinds of creation and annihilation operators, associated with the tensor, exterior, symmetric, and Clifford algebras. We will consider the first two kinds here. For x ∈ V and u ∈ Tens(V) we define

Ct(x) : u ↦ x ⊗ u,
At(L) : u ↦ δL(u).

For x ∈ Λ1(V) and u ∈ Λn−1(V) we define

Ce(x) : t ↦ x ∨ t (t ∈ Λ(V)),
Ae(u) : t ↦ u ∧ t (t ∈ Λ(V)).


Then we have the following fundamental relation:

Ce(x) Ae(u) + Ae(u) Ce(x) = ⟨x|u⟩ I.

This is easily checked, since Ce(x) Ae(u) t = x ∨ (u ∧ t) and

Ae(u) Ce(x) t = u ∧ (x ∨ t) = ⟨x|u⟩ t − x ∨ (u ∧ t).

Any linear operator T : V → V extends uniquely to a derivation (not an antiderivation) of the exterior algebra Λ(V). This can be done by first extending T to a derivation of Tens(V) as follows:

δT(x1 ⊗ · · · ⊗ xr) = Σi=1..r x1 ⊗ · · · ⊗ xi−1 ⊗ Txi ⊗ xi+1 ⊗ · · · ⊗ xr.

Now just note that

δT(x ⊗ x) = (Tx) ⊗ x + x ⊗ (Tx) = (x + Tx) ⊗ (x + Tx) − x ⊗ x − (Tx) ⊗ (Tx),

so that δT induces a derivation DT on Λ(V). Note that DT sends Λr(V) into Λr(V) for all r.

The action of DT on Λn(V) is given by

DT (x1 ∨ x2 ∨ · · · ∨ xn) = (DTx1)∨x2 ∨ · · · ∨ xn + x1 ∨ (DTx2)∨ · · · ∨ xn + · · ·= (Tx1) ∨ x2 ∨ · · · ∨ xn + x1 ∨ (Tx2)∨ · · · ∨ xn + · · ·.

Writing $Tx_i = \sum_j a_{ij} x_j$ with $a_{ij} \in k$, we obtain

$$(Tx_1) \vee x_2 \vee \cdots \vee x_n = \sum_{j=1}^{n} a_{1j}\, (x_j \vee x_2 \vee \cdots \vee x_n) = a_{11}\, (x_1 \vee x_2 \vee \cdots \vee x_n),$$

since all the other terms in the sum have a repeated variable and thus vanish. We can carry out a similar argument for $a_{22}$, $a_{33}$, etc., so that $D_T$ acts on the one-dimensional space $\Lambda^n(V)$ as multiplication by $a_{11} + a_{22} + \cdots + a_{nn}$. Since $\Lambda^n(V)$ is defined without reference to coordinates, $D_T$ defines the trace invariantly.

The fundamental property of the trace is $\mathrm{Tr}(TT') = \mathrm{Tr}(T'T)$. How do we prove this invariantly? In general, $D_{TT'} \neq D_T D_{T'}$. However, it is true that

$$D_{TT' - T'T} = D_T D_{T'} - D_{T'} D_T.$$

From this it follows that $\mathrm{Tr}(TT' - T'T) = 0$: on the one-dimensional space $\Lambda^n(V)$ each of $D_T$ and $D_{T'}$ acts as a scalar, and scalars commute, so the right hand side acts as zero.
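As a concrete check of this invariant description, the following numerical sketch (ours, not from the notes) verifies in coordinates that $D_T$ multiplies a top-degree extensor by $\mathrm{Tr}(T)$: replacing each row of a matrix in turn by its image under $T$ and summing the resulting determinants gives $\mathrm{Tr}(T)$ times the original determinant.

```python
# Check that D_T acts on Lambda^n(V) as multiplication by Tr(T):
# sum over i of [x_1, ..., T x_i, ..., x_n] equals Tr(T) [x_1, ..., x_n].
import numpy as np

rng = np.random.default_rng(0)
n = 5
T = rng.integers(-3, 4, (n, n)).astype(float)
X = rng.integers(-3, 4, (n, n)).astype(float)   # rows are x_1, ..., x_n

lhs = sum(
    np.linalg.det(np.vstack([X[:i], (T @ X[i])[None, :], X[i+1:]]))
    for i in range(n)
)
rhs = np.trace(T) * np.linalg.det(X)
assert abs(lhs - rhs) < 1e-6 * max(1.0, abs(rhs)), (lhs, rhs)

# The commutator consequence: Tr(TT' - T'T) = 0.
Tp = rng.integers(-3, 4, (n, n)).astype(float)
assert abs(np.trace(T @ Tp - Tp @ T)) < 1e-9
print("D_T acts on Lambda^n as Tr(T); Tr(TT' - T'T) = 0")
```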


1.21 Degree

How do we define the degree or step of a tensor invariantly? The answer is known only for characteristic zero. To illustrate, let us consider a quadratic form $f(x)$ where $x \in \mathrm{Tens}(V)$. Then

$$B(x, y) = f(x + y) - f(x) - f(y)$$

is a symmetric bilinear form if and only if $f(x)$ is quadratic. Conversely, if $B(x, y)$ is a symmetric bilinear form, then $B(x, x)$ is a quadratic form. Now for the catch: if we construct $f$ from $B$ and then construct a bilinear form $B'$ from $f$ using these methods, then $B$ and $B'$ will differ by a factor of two. This messes things up in characteristic two.

The process of obtaining a multilinear form from $f$ is called polarization. Clearly, polarization can be done in degrees higher than two, using the obvious inclusion-exclusion formula

$$B(x_1, \dots, x_r) = \sum_{S \subseteq \{1, \dots, r\}} (-1)^{r - |S|}\, f\Big(\sum_{i \in S} x_i\Big).$$

In degree $r$ we obtain a factor of $r!$, which explains why there is trouble in fields of finite characteristic. Another way of defining polarization is to expand

$$f(\lambda_1 x_1 + \lambda_2 x_2 + \cdots + \lambda_r x_r) = \sum_{i_1, \dots, i_r} c_{i_1, \dots, i_r}\, \lambda_1^{i_1} \lambda_2^{i_2} \cdots \lambda_r^{i_r},$$

where the $c$'s are coefficients. Now just take the coefficient of $\lambda_1 \lambda_2 \cdots \lambda_r$.
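Both descriptions of polarization are easy to check symbolically. The sketch below (our own example, not from the notes) uses the cubic form $f(x) = x^3$ in one variable: extracting the coefficient of $\lambda_1 \lambda_2 \lambda_3$ and evaluating the inclusion-exclusion sum both yield $3!\, x_1 x_2 x_3$, exhibiting the factor $r!$.

```python
# Polarization of the cubic form f(x) = x^3 in degree r = 3.
import sympy as sp
from itertools import combinations

l1, l2, l3, x1, x2, x3 = sp.symbols('l1 l2 l3 x1 x2 x3')
f = lambda x: x**3

# Method 1: coefficient of l1*l2*l3 in the expansion.
expanded = sp.expand(f(l1*x1 + l2*x2 + l3*x3))
trilinear = expanded.coeff(l1).coeff(l2).coeff(l3)
print(trilinear)                       # 6*x1*x2*x3

# Method 2: inclusion-exclusion over subsets S of {1, 2, 3}.
xs = [x1, x2, x3]
incl_excl = sum(
    (-1)**(3 - len(S)) * f(sum(xs[i] for i in S))
    for k in range(4) for S in combinations(range(3), k)
)
print(sp.expand(incl_excl))            # also 6*x1*x2*x3
```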

1.22 Symmetry classes

Fix $r$, and let $V$ be a vector space of dimension $n$ with $n > r$. Let $f(x_1, \dots, x_r)$ be an $r$-linear form such that the set of functions

$$S_f = \left\{ f(x_{\sigma(1)}, \dots, x_{\sigma(r)}) \mid \sigma \in \mathrm{Perm}\{1, 2, \dots, r\} \right\}$$

is linearly independent, and let $W$ be the span of $S_f$. We can define an action of a permutation $\sigma$ on $W$ by

$$\sigma(v) = v(x_{\sigma(1)}, \dots, x_{\sigma(r)}).$$

The invariant subspaces under this action, i.e., subspaces $W'$ such that $\sigma W' = W'$ for every $\sigma$, are called symmetry classes. For example, when $r = 2$ the subspace spanned by $f(x_1, x_2) + f(x_2, x_1)$ is a symmetry class.

In characteristic zero, there are exactly $p_r$ minimal symmetry classes, where $p_r$ is the number of partitions of the integer $r$. Hence we can index


them by partitions as follows: $W_\lambda$. The dimension of $W_\lambda$ is just the number of standard Young tableaux of shape $\lambda$, a number computed below. The $W_\lambda$ span the space of $r$-linear forms. There are connections with the representation theory of the symmetric group, but we shall not go into the details here.
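The dimensions just mentioned can be computed by the hook length formula for standard Young tableaux; the formula is classical but is not derived in these notes, so the sketch below is only an illustration.

```python
# Count standard Young tableaux of a given shape by the hook length
# formula: n! divided by the product of all hook lengths.
from math import factorial

def num_SYT(shape):
    """Number of standard Young tableaux of the given partition shape."""
    n = sum(shape)
    hooks = 1
    for i, row in enumerate(shape):
        for j in range(row):
            arm = row - j - 1
            leg = sum(1 for k in range(i + 1, len(shape)) if shape[k] > j)
            hooks *= arm + leg + 1
    return factorial(n) // hooks

for shape in [(3,), (2, 1), (1, 1, 1)]:   # the p_3 = 3 partitions of 3
    print(shape, num_SYT(shape))
# (3,) 1   (2, 1) 2   (1, 1, 1) 1
```

Note that $1^2 + 2^2 + 1^2 = 6 = 3!$, which is the dimension of the span $W$ when $S_f$ is linearly independent.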

1.23 Matching theory

Let us recall the Philip Hall marriage theorem. Given two sets $B$ and $G$, a relation $R$ is a subset of $B \times G$. We can represent a relation by a graph where the elements of $B$ and $G$ are vertices and the elements of $R$ are edges. In the case where $|B| = |G| = n$, a matching is a relation $R$ such that $|R| = n$ and every vertex belongs to exactly one edge of $R$. The Hall marriage theorem states that a relation $R$ contains a matching if and only if $|R(A)| \geq |A|$ for every subset $A$ of $B$, where

$$R(A) = \{ g \mid (b, g) \in R \text{ for some } b \in A \}.$$

Related to the Hall marriage theorem is the Konig-Egervary theorem. This states that the maximum size of a "partial matching" $R' \subseteq R$ equals the minimum size of a set $C \subseteq B \cup G$ such that every "edge" of $R$ has at least one endpoint in $C$.

We can recast these theorems in the language of multilinear algebra, by considering the incidence matrices of the graphs of the relations. Then the Hall marriage theorem gives necessary and sufficient conditions for the incidence matrix to contain a permutation matrix, and the Konig-Egervary theorem says that the maximum size of a sub-permutation matrix equals the minimum number of lines (i.e., rows or columns) needed to contain all nonzero entries.

Unfortunately, it is not possible to prove these theorems directly using the standard incidence matrices. This can be fixed by replacing the ones of the incidence matrix by independent indeterminates. The result is called the free matrix of the graph.

Proposition 22 If $F$ is the free matrix of a bipartite graph $R$ then $R$ has a matching if and only if $\det(F) \neq 0$.

Proof. The determinant of $F$ has the form

$$\det(F) = \sum_{\sigma} \mathrm{sgn}(\sigma)\, x_{1\sigma(1)} \cdots x_{n\sigma(n)}.$$


It is clear that distinct terms in this sum cannot cancel, since the entries of $F$ are independent indeterminates and distinct permutations give distinct monomials. Hence $\det(F)$ is nonzero if and only if at least one of the summands is nonzero, but clearly this just means that there exists a matching.
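Proposition 22 is easy to test by machine. The sketch below (our own example, with a made-up relation on $n = 3$) builds the free matrix in sympy and computes its determinant, once for a relation that has a matching and once for one that violates Hall's condition.

```python
# Free matrix of a bipartite relation: indeterminate x_ij when (i, j) is
# an edge, 0 otherwise. det != 0 exactly when a matching exists.
import sympy as sp

n = 3
R = {(0, 0), (0, 1), (1, 1), (2, 1), (2, 2)}
F = sp.Matrix(n, n, lambda i, j: sp.Symbol(f'x{i}{j}') if (i, j) in R else 0)
print(sp.det(F))       # x00*x11*x22: nonzero, so a matching exists

# Violate Hall's condition: boys 0 and 1 both see only girl 1.
R_bad = {(0, 1), (1, 1), (2, 0), (2, 1), (2, 2)}
F_bad = sp.Matrix(n, n, lambda i, j: sp.Symbol(f'x{i}{j}') if (i, j) in R_bad else 0)
print(sp.det(F_bad))   # 0: the set A = {0, 1} has |R(A)| = 1 < |A|
```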

To prove the Hall marriage theorem, we may permute the rows and columns of the free matrix to exhibit a maximal partial matching, i.e., so that the first principal $r \times r$ minor is $I_r$ and the complementary $(n-r) \times (n-r)$ minor is zero. Let $v_1, \dots, v_n$ be the columns of the free matrix, whose rank is $r$. If $v_j \neq 0$ for some $j > r$, then there must exist an expression of $v_j$ in terms of $v_1, \dots, v_r$:

$$v_j = c_1 v_1 + c_2 v_2 + \cdots + c_r v_r,$$

where the $c$'s are not all zero. By a suitable permutation we may assume $j = r + 1$. By Cramer's rule, the $c$'s depend only on the transcendentals $x_{pq}$ such that $1 \leq p \leq r$ and $1 \leq q \leq r + 1$. Taking the $(r+1)$th components, we obtain

$$v_{r+1,\, r+1} = c_1 v_{1,\, r+1} + c_2 v_{2,\, r+1} + \cdots + c_r v_{r,\, r+1} = 0.$$

It follows that the last $n - r$ rows of the free matrix are zero, contradicting the condition of the Hall marriage theorem.

We now give a preliminary lemma, leaving its proof as an exercise.

Lemma 23 If $M$ is an $n \times n$ matrix of rank $r$, and if $x_1, \dots, x_r$ are linearly independent row vectors and $u_1, \dots, u_r$ are linearly independent column vectors of $M$, then the minor $\langle x_1 \vee \cdots \vee x_r \,|\, u_1 \wedge \cdots \wedge u_r \rangle$ of $M$ is nonzero.

We now give the Pfaffian version of Jacobi's identity, which was given before as an exercise. Let $A = (a_{ij})$ be an $n \times n$ matrix and let $A^{\mathrm{ad}} = (\alpha_{ij})$ be the adjugate matrix. We use the notation $A_{i_1, i_2, \dots, i_r}$ to denote the matrix obtained by deleting rows and columns $i_1, i_2, \dots, i_r$ from $A$. With this notation, Jacobi's identity for a minor of order two reads

$$\det(A)\, \det(A_{12}) = \det \begin{pmatrix} \alpha_{11} & \alpha_{12} \\ \alpha_{21} & \alpha_{22} \end{pmatrix} = \alpha_{11}\alpha_{22} - \alpha_{12}\alpha_{21}.$$

If $A$ is skew-symmetric and $n$ is even, then $\alpha_{11} = \alpha_{22} = 0$, since each diagonal cofactor is the determinant of a skew-symmetric matrix of odd order; moreover $A^{\mathrm{ad}}$ is itself skew-symmetric, so $\alpha_{21} = -\alpha_{12}$. Since $\det(A) = \mathrm{Pf}(A)^2$ and $\det(A_{12}) = \mathrm{Pf}(A_{12})^2$, we obtain

$$\mathrm{Pf}(A)^2\, \mathrm{Pf}(A_{12})^2 = \alpha_{12}^2,$$

or, taking square roots,

$$\mathrm{Pf}(A)\, \mathrm{Pf}(A_{12}) = \pm\alpha_{12}.$$


This is the Pfaffian analogue of a special case of the Jacobi theorem. We can extend this reasoning to deal with $A_{1234}$ (the "$4 \times 4$ case"). Jacobi's identity for a minor of order four gives

$$\det(A)^3 \det \begin{pmatrix} a_{55} & \cdots & a_{5n} \\ \vdots & \ddots & \vdots \\ a_{n5} & \cdots & a_{nn} \end{pmatrix} = \det \begin{pmatrix} \alpha_{11} & \cdots & \alpha_{14} \\ \vdots & \ddots & \vdots \\ \alpha_{41} & \cdots & \alpha_{44} \end{pmatrix} = (\alpha_{12}\alpha_{34} - \alpha_{13}\alpha_{24} + \alpha_{23}\alpha_{14})^2.$$

Taking square roots,

$$\pm\mathrm{Pf}(A)^3\, \mathrm{Pf} \begin{pmatrix} a_{55} & \cdots & a_{5n} \\ \vdots & \ddots & \vdots \\ a_{n5} & \cdots & a_{nn} \end{pmatrix} = \alpha_{12}\alpha_{34} - \alpha_{13}\alpha_{24} + \alpha_{23}\alpha_{14}.$$

We can now apply the $2 \times 2$ case, $\alpha_{ij} = \pm\mathrm{Pf}(A)\,\mathrm{Pf}(A_{ij})$, and cancel a factor of $\mathrm{Pf}(A)^2$. This gives us

$$\mathrm{Pf}(A)\, \mathrm{Pf}(A_{pqrs}) = \pm\mathrm{Pf}(A_{pq})\mathrm{Pf}(A_{rs}) \pm \mathrm{Pf}(A_{pr})\mathrm{Pf}(A_{qs}) \pm \mathrm{Pf}(A_{qr})\mathrm{Pf}(A_{ps}). \qquad (*)$$
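The identity $(*)$ can be checked numerically. The sketch below (ours, not from the notes) computes Pfaffians by expansion along the first remaining row and verifies one consistent choice of the signs in $(*)$, namely $\mathrm{Pf}(A)\mathrm{Pf}(A_{pqrs}) = \mathrm{Pf}(A_{pq})\mathrm{Pf}(A_{rs}) - \mathrm{Pf}(A_{pr})\mathrm{Pf}(A_{qs}) + \mathrm{Pf}(A_{ps})\mathrm{Pf}(A_{qr})$ for $p < q < r < s$, on a random skew-symmetric matrix.

```python
# Verify a Pfaffian Pluecker-type relation on a random skew matrix.
from fractions import Fraction
import random

def pf(A, idx):
    """Pfaffian of the skew matrix A restricted to the index list idx,
    by expansion along the first remaining row."""
    if not idx:
        return Fraction(1)
    if len(idx) % 2:
        return Fraction(0)
    i, rest = idx[0], idx[1:]
    return sum((-1) ** k * A[i][j] * pf(A, rest[:k] + rest[k+1:])
               for k, j in enumerate(rest))

random.seed(0)
n = 6
A = [[Fraction(0)] * n for _ in range(n)]
for i in range(n):
    for j in range(i + 1, n):
        A[i][j] = Fraction(random.randint(-4, 4))
        A[j][i] = -A[i][j]

all_idx = list(range(n))
def pf_del(deleted):
    return pf(A, [i for i in all_idx if i not in deleted])

p, q, r, s = 0, 1, 2, 3
lhs = pf_del([]) * pf_del([p, q, r, s])
rhs = (pf_del([p, q]) * pf_del([r, s])
       - pf_del([p, r]) * pf_del([q, s])
       + pf_del([p, s]) * pf_del([q, r]))
assert lhs == rhs, (lhs, rhs)
print("Pfaffian identity (*) verified:", lhs, "=", rhs)
```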

We are now in a position to prove Tutte's famous matching theorem using Pfaffians.

Theorem 24 A graph $G = (V, E)$ has a 1-factor if and only if $|S| \geq \mathrm{oddity}(S)$ for every subset $S \subseteq V$, where $\mathrm{oddity}(S)$ is the number of connected components of $G - S$ with an odd number of vertices.

Proof. Necessity is easy to see: each connected component of $G - S$ with an odd number of vertices must have a vertex that matches some vertex of $S$. Now suppose that $G$ has no 1-factor, so that (by the Pfaffian analogue of Proposition 22) the Pfaffian of the free matrix of $G$ is zero. This means that the left hand side of $(*)$ is zero.

We say that a vertex $p$ of $G$ is bad if $\mathrm{Pf}(A_{pq}) = 0$ for all $q$, and good otherwise. The proof now proceeds in six steps.

(1) If $\mathrm{Pf}(A_{pq}) = 0$ and $\mathrm{Pf}(A_{qr}) = 0$ and $\mathrm{Pf}(A_{qs}) \neq 0$ then $\mathrm{Pf}(A_{pr}) = 0$, since with $\mathrm{Pf}(A) = 0$ only the middle term on the right side of $(*)$ survives.

(2) If $\mathrm{Pf}(A_{pq}) \neq 0$ then $(p, q)$ is not an edge of $G$.

(3) Suppose $(p, q)$ is an edge and $(q, r)$ is an edge. If $q$ is good then $\mathrm{Pf}(A_{pr}) = 0$. Indeed, by (2), $\mathrm{Pf}(A_{pq}) = 0$ and $\mathrm{Pf}(A_{qr}) = 0$. By goodness, $\mathrm{Pf}(A_{qs}) \neq 0$ for some $s$. Hence by (1), $\mathrm{Pf}(A_{pr}) = 0$.


(4) If vertex $p$ is connected to vertex $q$ by a path all of whose internal vertices are good, then $\mathrm{Pf}(A_{pq}) = 0$.

(5) This is the key step. We define the saturation $G^{\mathrm{sat}}$ of the graph $G$ as follows: whenever $\mathrm{Pf}(A_{rs}) = 0$, we add the edge $(r, s)$. By (2), $G^{\mathrm{sat}}$ has no 1-factor if $G$ does not.

(6) Let $\Sigma$ be the set of bad vertices of $G^{\mathrm{sat}}$. We claim that

$$|\Sigma| < \mathrm{oddity}(\Sigma) \text{ in } G^{\mathrm{sat}}. \qquad (**)$$

To see this, note that by definition of $\Sigma$, the graph $G^{\mathrm{sat}} - \Sigma$ has no bad vertices, so that in each connected component every two points are connected by a path of good vertices. By (4) and (5), this means that the connected components of $G^{\mathrm{sat}} - \Sigma$ are complete graphs. But if $(**)$ were to fail we could obtain a 1-factor of $G^{\mathrm{sat}}$, contradicting (5). Hence $(**)$ holds. But from this it follows trivially that

$$|\Sigma| < \mathrm{oddity}(\Sigma) \text{ in } G,$$

because each odd component of $G^{\mathrm{sat}} - \Sigma$ contains an odd component of $G - \Sigma$. This completes the proof.
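Tutte's condition can be checked exhaustively on small graphs. The brute-force sketch below (ours, not part of the notes) compares the existence of a 1-factor with the condition $|S| \geq \mathrm{oddity}(S)$ on two small examples.

```python
# Brute-force check of Theorem 24: compare existence of a 1-factor with
# the condition |S| >= oddity(S) over all vertex subsets S.
from itertools import combinations

def components(vertices, edges):
    """Connected components of the subgraph induced on the vertex set."""
    seen, comps = set(), []
    for v in vertices:
        if v in seen:
            continue
        stack, comp = [v], set()
        while stack:
            w = stack.pop()
            if w in comp:
                continue
            comp.add(w)
            stack.extend(x for e in edges for x in e
                         if w in e and x in vertices and x not in comp)
        seen |= comp
        comps.append(comp)
    return comps

def has_one_factor(vertices, edges):
    """Exhaustive search for a perfect matching (1-factor)."""
    vertices = sorted(vertices)
    if not vertices:
        return True
    v = vertices[0]
    return any(has_one_factor([w for w in vertices if w not in e], edges)
               for e in edges if v in e and all(w in vertices for w in e))

def tutte_condition(vertices, edges):
    return all(
        len(S) >= sum(len(c) % 2 for c in
                      components(set(vertices) - set(S), edges))
        for k in range(len(vertices) + 1)
        for S in combinations(vertices, k))

# K_4 has a 1-factor; the star K_{1,3} does not (S = {center} has oddity 3).
K4 = [(a, b) for a, b in combinations(range(4), 2)]
star = [(0, 1), (0, 2), (0, 3)]
for V, E in [(range(4), K4), (range(4), star)]:
    print(has_one_factor(list(V), E), tutte_condition(list(V), E))
# prints: True True, then False False
```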