
Algebraic Combinatorics

Geir T. Helleloid

Fall 2008

Page 2: Lecture Notes Helleloid Algebraic Combinatorics

M390C Algebraic Combinatorics Fall 2008 Instructor: Geir Helleloid


Contents

1 Enumeration
  1.1 Lecture 1 (Thursday, August 28): The 12-Fold Way (Stanley [4, Section 1.1])
    1.1.1 Stirling Numbers and Bell Numbers
    1.1.2 Superclasses (Arias-Castro, Diaconis, Stanley)
    1.1.3 Back to the 12-Fold Way
  1.2 Lecture 2 (Tuesday, September 2): Generating Functions (Stanley [4, Section 1.1])
    1.2.1 Recurrences
    1.2.2 Catalan numbers
    1.2.3 q-Analogues
  1.3 Lecture 3 (Thursday, September 4): Permutation Enumeration (Stanley [4, Section 1.1], Wilf [7, Chapter 4])
    1.3.1 Permutation Statistics
    1.3.2 Multiset Permutations and q-Analogues
    1.3.3 Cycles
    1.3.4 Using Generating Functions to Find Expected Values
    1.3.5 Unimodality
    1.3.6 Cycle Index
    1.3.7 Square Roots
  1.4 Lecture 5 (Thursday, September 11): The Exponential Formula (Stanley [5, Chapter 5])
  1.5 Lecture 6 (Tuesday, September 16): Bijections
  1.6 Lecture 7 (Thursday, September 18): Bijections II
  1.7 Lecture 8 (Tuesday, September 23): Bijections II (Aigner [1, Section 5.4])
    1.7.1 The Gessel-Viennot Lemma

2 Special Topics
  2.1 Lecture 9 (Thursday, September 25): Permutation Patterns (Bona [2, Chapter 4])
  2.2 Lecture 10 (Tuesday, September 30): The Matrix Tree Theorem (Stanley [5, Section 5.6])
    2.2.1 Spanning Trees and the Matrix Tree Theorem
  2.3 Lecture 11 (Thursday, October 2): The BEST Theorem (Stanley [5, Section 5.6])
    2.3.1 The BEST Theorem
    2.3.2 De Bruijn Sequences
  2.4 Lecture 12 (Tuesday, October 7): Abelian Sandpiles and Chip-Firing Games
  2.5 Lecture 13 (Thursday, October 9): Mobius Inversion and the Chromatic Polynomial (Stanley [4, Chapter 2])
    2.5.1 Posets and Mobius Inversion
    2.5.2 Back to the Chromatic Polynomial
  2.6 Lecture 14 (Tuesday, October 14): The Chromatic Polynomial and Connections
    2.6.1 The Graph Minor Theorem
    2.6.2 Hyperplane Arrangements

3 The Representation Theory of the Symmetric Group and Symmetric Functions
  3.1 An Introduction to the Representation Theory of Finite Groups (Sagan [3, Chapter 1])
    3.1.1 Representations
    3.1.2 Irreducible Representations
    3.1.3 Characters
  3.2 Lectures 16 and 17 (Tuesday, October 21 and Thursday, October 23): The Irreducible Representations of the Symmetric Group (Sagan [3, Chapter 2])
    3.2.1 Constructing the Irreducible Representations (Sagan [3, Section 2.1])
    3.2.2 The Specht Module S^λ (Sagan [3, Section 2.3])
    3.2.3 The Specht Modules are the Irreducible Modules (Sagan [3, Section 2.4])
    3.2.4 Finding a Basis for S^λ (Sagan [3, Section 2.5])
    3.2.5 Decomposition of M^λ (Sagan [3, Section 2.9])
  3.3 Lecture 18 (Tuesday, October 28): The RSK Algorithm (Stanley [5, Section 7.11])
    3.3.1 Row Insertion
    3.3.2 The Robinson-Schensted-Knuth (RSK) Algorithm
    3.3.3 Growth Diagrams and Symmetries of RSK
  3.4 Lecture 19 (Thursday, October 30): Increasing and Decreasing Subsequences (Stanley [5, Appendix A])
  3.5 Lectures 20 and 21 (Tuesday, November 4 and Thursday, November 6): An Introduction to Symmetric Functions (Stanley [5, Chapter 7])
    3.5.1 The Ring of Symmetric Functions
    3.5.2 (Proposed) Bases for the Ring of Symmetric Functions
    3.5.3 Changes of Basis Involving the m_λ
    3.5.4 Identities and an Involution
    3.5.5 Schur Functions
    3.5.6 The Hook Length Formula
    3.5.7 Orthogonality


Chapter 1

Enumeration

Algebraic combinatorics is a new, sprawling, and poorly defined subject area in mathematics. As one might expect, any topic with both an algebraic and a combinatorial flavor can be called algebraic combinatorics. Topics that are often included in this area but that we will not touch on are finite geometries, polytopes, combinatorial commutative algebra, combinatorial aspects of algebraic geometry, and matroids. What we will do is start with ten lectures on the fundamentals of enumerative combinatorics (more or less, the study of counting), including methods (generating functions, bijections, inclusion-exclusion, the exponential formula), standard results (permutation enumeration, enumeration of graphs, identities), and some special topics (the Marcus-Tardos theorem). We will then spend about four lectures (and perhaps more toward the end of the semester) on topics in graph theory that have a more algebraic flavor. This will be followed by about ten lectures on the representation theory of the symmetric group and symmetric functions. This is the principal topic in the area of algebraic combinatorics that we will cover, and it will hint at the appearance of enumerative methods within representation theory and algebraic geometry. The course will finish up with a few lectures on special topics of particular interest, including the combinatorics of card shuffling and the enumeration of alternating sign matrices.

1.1 Lecture 1 (Thursday, August 28): The 12-Fold Way (Stanley [4, Section 1.1])

There are three goals for this lecture. The first is to introduce some of the fundamental objects in enumerative combinatorics. The second is to foreshadow some of the enumerative methods that we will discuss in depth in subsequent lectures. The third is to give examples of why these objects might be of interest outside of combinatorics.

The 12-fold way is a unified way to view some basic counting problems. Let f : N → X be a function, where |N| = n and |X| = x. It is illustrative to interpret f as an assignment of n balls to x bins. We arrive at 12 counting problems by placing restrictions on f. On the one hand, we can count functions f with no restriction, those that are surjective, and those that are injective. On the other hand, we can also count functions up to permutations of N and/or permutations of X; an alternative viewpoint is that the balls are either distinguishable or indistinguishable and the bins are either distinguishable or indistinguishable. We form a chart of the number of distinct functions f under each possible set of restrictions. Writing "Dist" and "Indist" for distinguishable and indistinguishable, and \left(\!\binom{x}{n}\!\right) for the number of n-element multisets on an x-element set (discussed under entry #4 below), the twelve entries are:

  Balls Dist,   Bins Dist:    (#1) any f: x^n                             (#2) injective: (x)_n                  (#3) surjective: x! S(n, x)
  Balls Indist, Bins Dist:    (#4) any f: \left(\!\binom{x}{n}\!\right)   (#5) injective: \binom{x}{n}           (#6) surjective: \left(\!\binom{x}{n-x}\!\right)
  Balls Dist,   Bins Indist:  (#7) any f: S(n,1) + \cdots + S(n,x)        (#8) injective: 1 if n \le x, else 0   (#9) surjective: S(n, x)
  Balls Indist, Bins Indist:  (#10) any f: p_1(n) + \cdots + p_x(n)       (#11) injective: 1 if n \le x, else 0  (#12) surjective: p_x(n)

1. (Any f with distinguishable balls and distinguishable bins) Each ball can go in one of x bins, so there are x^n functions.

2. (Injective f with distinguishable balls and distinguishable bins) The first ball can go into one of x bins, the second can go into one of the x − 1 other bins, and so on, so there are x(x − 1) \cdots (x − n + 1) functions. This expression occurs often enough to earn its own notation.

Definition. The falling factorial (x)_n is defined to be

(x)_n := x(x - 1) \cdots (x - n + 1).

1.1.1 Stirling Numbers and Bell Numbers

3. (Surjective f with distinguishable balls and distinguishable bins) To choose a surjective function f, we have to split the balls up into x groups and pair up each group with a bin. First, we count the number of ways to split the balls into x groups.

Definition. A set partition of a set N is a collection π = {B_1, . . . , B_k} of subsets called blocks such that the blocks are non-empty, disjoint, and their union is all of N.

For example, the set partitions of {1, 2, 3} are

{{1}, {2}, {3}} (sometimes denoted by 1/2/3)
{{1, 2}, {3}} (sometimes denoted by 12/3)
{{1, 3}, {2}} (sometimes denoted by 13/2)
{{1}, {2, 3}} (sometimes denoted by 1/23)
{{1, 2, 3}} (sometimes denoted by 123)

The number of set partitions of an n-element set is the Bell number B(n). The number of set partitions of an n-element set into k blocks is the Stirling number of the second kind S(n, k). (We will discuss Stirling numbers of the first kind in a couple of lectures.)

Clearly a surjective function f is built by choosing one of the S(n, x) set partitions of N with x blocks and then assigning blocks to bins in x! ways, so the number of functions is x! S(n, x).


Before moving on to the next entry in the 12-fold way, let's study Stirling numbers and Bell numbers a bit more. They are very common in combinatorics and we will get a taste of some methods in enumerative combinatorics. Ideally, we would find a closed-form expression for S(n, k) and/or B(n), but this is not possible. We can, however, write down a recurrence that S(n, k) satisfies.

Definition. Let [n] denote the set {1, 2, . . . , n}.

Proposition 1.1. By definition S(0, 0) = 1. For n ≥ 1, if k > n, then S(n, k) = 0, while S(n, 0) = 0. For 0 < k ≤ n,

S(n, k) = k S(n-1, k) + S(n-1, k-1).

Proof. We obtain a set partition of [n] into k blocks either by partitioning [n − 1] into k blocks and placing n in one of the blocks or by partitioning [n − 1] into k − 1 blocks and placing n into its own block.

Recurrences are powerful tools for computing terms in sequences and for finding formulas for sequences. We can find a similar formula for the Bell numbers.

Proposition 1.2. For n ≥ 0,

B(n+1) = \sum_{i=0}^{n} \binom{n}{i} B(i).

Proof. A set partition of [n + 1] is formed by creating a block containing n + 1 together with any choice of n − i elements from [n], and choosing a set partition of the remaining i elements.
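Both recurrences are easy to check by machine. The following short Python sketch is my own addition (not part of the notes; the function names are mine): it computes S(n, k) and B(n) from the recurrences above and compares them with a brute-force enumeration of set partitions for small n.

```python
from itertools import product
from math import comb

def stirling2(n, k):
    """S(n, k) via S(n, k) = k*S(n-1, k) + S(n-1, k-1)."""
    if n == 0 and k == 0:
        return 1
    if n == 0 or k == 0 or k > n:
        return 0
    return k * stirling2(n - 1, k) + stirling2(n - 1, k - 1)

def bell(n):
    """B(n) via B(n+1) = sum_i C(n, i) B(i)."""
    B = [1]  # B(0) = 1
    for m in range(n):
        B.append(sum(comb(m, i) * B[i] for i in range(m + 1)))
    return B[n]

def set_partitions_by_blocks(n):
    """Brute force: label each element with a block, canonicalize, count by number of blocks."""
    counts, seen = {}, set()
    for labels in product(range(n), repeat=n):
        relabel, canon = {}, []
        for x in labels:                      # relabel blocks by order of first appearance
            relabel.setdefault(x, len(relabel))
            canon.append(relabel[x])
        canon = tuple(canon)
        if canon not in seen:
            seen.add(canon)
            k = len(set(canon))
            counts[k] = counts.get(k, 0) + 1
    return counts

for n in range(1, 7):
    counts = set_partitions_by_blocks(n)
    assert all(counts.get(k, 0) == stirling2(n, k) for k in range(1, n + 1))
    assert sum(counts.values()) == bell(n)
print("recurrences agree with brute force for n <= 6")
```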

These recurrences will quickly lead to generating functions for these numbers, but we will defer that until next time. We will, however, consider an alternative representation of set partitions. Put n dots in a row, labeled 1, 2, . . . , n. If i and j are consecutive elements in a block (when the elements in a block are written in increasing order), connect them with an arc from i to j. This is a bijection between set partitions of [n] and arc diagrams in which each dot has at most one incoming arc and at most one outgoing arc. Arc diagrams turn out to be an incredibly intuitive and convenient way to represent many objects, including set partitions.

1.1.2 Superclasses (Arias-Castro, Diaconis, Stanley)

We will briefly discuss a non-combinatorial context in which set partitions arise. Let U(n) be the group of upper triangular matrices with ones on the diagonal and entries in F_q. It has been proved that classifying the conjugacy classes of U(n) is equivalent to classifying wild quivers, which seems to be impossible. (Note: I have been asked to expand on this comment. This seems to be the sort of throw-away line found in papers with no justification or reference. I am attempting to track down a better explanation.) Knowing the conjugacy classes of a group is important in representation theory and the theory of random walks on groups. Instead of classifying conjugacy classes, it is possible and of use to classify so-called superclasses, which are unions of conjugacy classes. If X = I + x ∈ U(n) (x is upper-triangular with zeroes on the diagonal) and Y ∈ U(n), conjugation looks like Y^{-1} X Y = I + Y^{-1} x Y. Here, Y allows us to add a multiple of one row to a row above it while simultaneously adding the same multiple of the corresponding column to a column to the right. For superclasses, we look at the orbit of X under the action I + Z x Y, where Y, Z ∈ U(n). This is clearly a union of conjugacy classes. Here, multiplication of x by Z and Y lets us add multiples of a column to columns to its left and multiples of a row to rows above it.

It is not so hard to see that each superclass will have a unique representative with at most one non-zero entry in each row and column (above the diagonal). For example, if n = 3, the representatives have the form

\begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix},\quad
\begin{pmatrix} 1 & ? & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix},\quad
\begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & ? \\ 0 & 0 & 1 \end{pmatrix},\quad
\begin{pmatrix} 1 & 0 & ? \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix},\quad
\begin{pmatrix} 1 & ? & 0 \\ 0 & 1 & ? \\ 0 & 0 & 1 \end{pmatrix},

where ? is any non-zero element of F_q. How many possible forms are there for the representatives? Exactly B(n). Why? For each non-zero entry in position (i, j) with i < j, draw an arc from i to j in the arc diagram. This gives a set partition and defines a bijection.
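As a sanity check on this bijection, here is a small brute-force sketch I have added (not from the notes): it counts the patterns of starred positions above the diagonal with at most one star in each row and each column, and compares the count with B(n).

```python
from itertools import combinations
from math import comb

def bell(n):
    """Bell numbers via B(n+1) = sum_i C(n, i) B(i)."""
    B = [1]
    for m in range(n):
        B.append(sum(comb(m, i) * B[i] for i in range(m + 1)))
    return B[n]

def superclass_forms(n):
    """Subsets of positions (i, j), i < j, with at most one chosen
    position in each row i and each column j."""
    positions = [(i, j) for i in range(n) for j in range(i + 1, n)]
    count = 0
    for r in range(len(positions) + 1):
        for subset in combinations(positions, r):
            rows = [i for i, _ in subset]
            cols = [j for _, j in subset]
            if len(set(rows)) == len(rows) and len(set(cols)) == len(cols):
                count += 1
    return count

for n in range(1, 6):
    assert superclass_forms(n) == bell(n)
print("number of superclass forms equals B(n) for n <= 5")
```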

1.1.3 Back to the 12-Fold Way

4. (Any f with indistinguishable balls and distinguishable bins) If y_1 balls are placed in the first bin, y_2 balls in the second, and so on, then the number of functions is just the number of ordered sums y_1 + · · · + y_x = n with y_i ≥ 0. This brings up compositions and multisets.

Definition. A composition of n is an expression of n as an ordered sum of positive integers. For example, the eight compositions of 4 are 1 + 1 + 1 + 1, 2 + 1 + 1, 1 + 2 + 1, 1 + 1 + 2, 3 + 1, 1 + 3, 2 + 2, 4. A composition with k parts is called a k-composition. A weak composition of n is an expression of n as an ordered sum of non-negative integers. A weak composition with k parts is called a weak k-composition.

Proposition 1.3. The number of k-compositions of n is \binom{n-1}{k-1} and the number of compositions of n is 2^{n-1}. The number of weak k-compositions of n is \binom{n+k-1}{k-1}.

Proof. A k-composition of n can be represented by placing n “stars” in a row and putting k − 1 vertical “bars” into k − 1 of the n − 1 gaps between the stars. This can be done in \binom{n-1}{k-1} ways. A weak k-composition of n can be represented by any arrangement of n “stars” and k − 1 vertical “bars”. There are \binom{n+k-1}{k-1} ways to arrange n stars and k − 1 bars. A composition of n is obtained by choosing whether or not to put a bar in each of the n − 1 gaps, so there are 2^{n-1} compositions.


This “stars and bars” argument is very standard. Note that the number of k-compositions of n is the number of positive integer solutions to x_1 + · · · + x_k = n. Another viewpoint on compositions is as “combinations with repetitions,” or multisets. Let \left(\!\binom{n}{k}\!\right) denote the number of ways to choose k elements from [n] disregarding order and allowing repetitions (that is, the number of multisets of cardinality k on [n]). In fact, we just need to decide how many 1s, how many 2s, and so on should be chosen, and the number of ways to do this is the number of weak n-compositions of k, so

\left(\!\binom{n}{k}\!\right) = \binom{n+k-1}{n-1} = \binom{n+k-1}{k}.

So the number of functions in entry #4 is \left(\!\binom{x}{n}\!\right).

5. (Injective f with indistinguishable balls and distinguishable bins) Each box contains at most one ball; we can choose the boxes with balls in \binom{x}{n} ways.

6. (Surjective f with indistinguishable balls and distinguishable bins) Each box contains at least one ball. Remove one ball from each box to obtain a multiset on X with n − x elements. So there are \left(\!\binom{x}{n-x}\!\right) functions.

7. (Any f with distinguishable balls and indistinguishable bins) We need to group the balls into at most x groups, which can be done in S(n, 1) + · · · + S(n, x) ways.

8. (Injective f with distinguishable balls and indistinguishable bins) If there are at least as many bins as balls, there is one such f; otherwise there are none.

9. (Surjective f with distinguishable balls and indistinguishable bins) We need to group the balls into exactly x groups and the order of the groups does not matter, so there are S(n, x) functions.

10. (Any f with indistinguishable balls and indistinguishable bins) We need to break n balls up into at most x parts, where the balls are indistinguishable and the order of the parts doesn't matter. These are precisely the (integer) partitions of n into at most x parts. Let p_k(n) denote the number of partitions of n into exactly k parts. Then the number of functions is p_1(n) + · · · + p_x(n).

11. (Injective f with indistinguishable balls and indistinguishable bins) If there are at least as many bins as balls, there is one such f; otherwise there are none.

12. (Surjective f with indistinguishable balls and indistinguishable bins) We need to break n balls up into exactly x parts, so there are p_x(n) such functions.
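All twelve entries can be verified mechanically for small n and x. The sketch below is my own illustration (not from the notes): it enumerates functions f : [n] → [x], identifies functions that differ only by permuting balls and/or bins, and compares the resulting counts with the formulas in the chart.

```python
from itertools import product
from math import comb, factorial

def stirling2(n, k):
    if n == 0 and k == 0: return 1
    if n == 0 or k == 0 or k > n: return 0
    return k * stirling2(n - 1, k) + stirling2(n - 1, k - 1)

def p_exact(n, k):
    """Partitions of n into exactly k parts."""
    if n == 0 and k == 0: return 1
    if n <= 0 or k <= 0: return 0
    return p_exact(n - 1, k - 1) + p_exact(n - k, k)

def multichoose(x, n):
    return comb(x + n - 1, n) if n >= 0 else 0

def falling(x, n):
    out = 1
    for i in range(n):
        out *= (x - i)
    return out

def brute(n, x, kind, balls_dist, bins_dist):
    seen = set()
    for f in product(range(x), repeat=n):
        if kind == "inj" and len(set(f)) != n: continue
        if kind == "surj" and len(set(f)) != x: continue
        if balls_dist and bins_dist:
            key = f
        elif not balls_dist and bins_dist:
            key = tuple(sorted(f))                       # only bin occupancy matters
        elif balls_dist and not bins_dist:
            blocks = {}
            for ball, b in enumerate(f):
                blocks.setdefault(b, []).append(ball)
            key = tuple(sorted(tuple(v) for v in blocks.values()))  # set partition of N
        else:
            sizes = {}
            for b in f:
                sizes[b] = sizes.get(b, 0) + 1
            key = tuple(sorted(sizes.values()))          # partition of n
        seen.add(key)
    return len(seen)

def formulas(n, x):
    return {
        ("any",  True,  True ): x ** n,
        ("inj",  True,  True ): falling(x, n),
        ("surj", True,  True ): factorial(x) * stirling2(n, x),
        ("any",  False, True ): multichoose(x, n),
        ("inj",  False, True ): comb(x, n),
        ("surj", False, True ): multichoose(x, n - x) if n >= x else 0,
        ("any",  True,  False): sum(stirling2(n, k) for k in range(1, x + 1)),
        ("inj",  True,  False): 1 if n <= x else 0,
        ("surj", True,  False): stirling2(n, x),
        ("any",  False, False): sum(p_exact(n, k) for k in range(1, x + 1)),
        ("inj",  False, False): 1 if n <= x else 0,
        ("surj", False, False): p_exact(n, x),
    }

for n in range(1, 5):
    for x in range(1, 5):
        for (kind, bd, xd), value in formulas(n, x).items():
            assert brute(n, x, kind, bd, xd) == value, (n, x, kind, bd, xd)
print("all twelve entries verified for 1 <= n, x <= 4")
```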

1.2 Lecture 2 (Tuesday, September 2): Generating Functions (Stanley [4, Section 1.1])

Over the next few lectures, we will be discussing a variety of enumeration problems and how to solve them. One of the principal techniques used is the theory of generating functions.


An abstract setting for many enumerative problems is that we have finite sets S_0, S_1, S_2, . . . and we want to find |S_n|. Ideally, we would find a “closed-form formula” for |S_n|, which may involve a sum or a product. Alternatively, we could find a recurrence equation, an asymptotic formula, or a generating function. It is only through experience that the usefulness of generating functions can be appreciated.

Definition. Let a_0, a_1, . . . be a sequence. The ordinary generating function of this sequence is

F(x) = \sum_{n=0}^{\infty} a_n x^n,

while the exponential generating function of this sequence is

G(x) = \sum_{n=0}^{\infty} a_n \frac{x^n}{n!}.

Here are some simple examples:

1. The OGF for the sequence 1, 1, 1, . . . is \frac{1}{1-x}. The EGF is e^x.

2. The OGF for the sequence 1, 2, 3, . . . is \frac{1}{(1-x)^2}. The EGF is (x+1)e^x.

In this lecture, we will use generating functions to prove combinatorial identities, solve recurrences, and obtain exact formulas for various sequences. For all of these applications, we will regard generating functions as formal power series (that is, members of the ring C[[x]]), and thus there is no question of convergence. Eventually we will use analytic methods to extract asymptotic information about the coefficients, and in those cases we will need to consider convergence questions.

Here is a more complicated example with deep connections.

Example. Let p(n) be the number of partitions of n. Then

\sum_{n=0}^{\infty} p(n) x^n = \prod_{i \ge 1} (1 + x^i + x^{2i} + x^{3i} + x^{4i} + \cdots) = \prod_{i \ge 1} (1 - x^i)^{-1}.

Let p_d(n) be the number of partitions of n into distinct parts. Then

\sum_{n \ge 0} p_d(n) x^n = (1 + x)(1 + x^2)(1 + x^3) \cdots = \prod_{i \ge 1} (1 + x^i).

Let p_o(n) be the number of partitions of n into odd parts. Then

\sum_{n \ge 0} p_o(n) x^n = \prod_{i \ge 1} (1 + x^{2i-1} + x^{2(2i-1)} + x^{3(2i-1)} + \cdots) = \prod_{i \ge 1} \frac{1}{1 - x^{2i-1}}.

But

\prod_{i \ge 1} (1 + x^i) = \prod_{i \ge 1} \frac{1 - x^{2i}}{1 - x^i} = \prod_{i \ge 1} \frac{1}{1 - x^{2i-1}},

showing that p_o(n) = p_d(n)! We will have more to say about partition identities in the lecture on bijections.
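The identity p_o(n) = p_d(n) is easy to confirm numerically. Here is a small brute-force check I have added (not part of the notes):

```python
def count_partitions(n, max_part=None, distinct=False, odd_only=False):
    """Count partitions of n with largest part <= max_part, optionally into
    distinct parts or into odd parts only (simple recursion)."""
    if max_part is None:
        max_part = n
    if n == 0:
        return 1
    total = 0
    for part in range(min(n, max_part), 0, -1):
        if odd_only and part % 2 == 0:
            continue
        next_max = part - 1 if distinct else part
        total += count_partitions(n - part, next_max, distinct, odd_only)
    return total

for n in range(1, 21):
    assert count_partitions(n, distinct=True) == count_partitions(n, odd_only=True)
print("p_d(n) = p_o(n) verified for n <= 20")
```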


Before looking at more applications, we need to write down multiplication formulae for generating functions.

Proposition 1.4. Products of generating functions satisfy the following equations:

\left( \sum_{n=0}^{\infty} a_n x^n \right) \left( \sum_{m=0}^{\infty} b_m x^m \right) = \sum_{n=0}^{\infty} \left( \sum_{i=0}^{n} a_i b_{n-i} \right) x^n,

\left( \sum_{n=0}^{\infty} a_n \frac{x^n}{n!} \right) \left( \sum_{m=0}^{\infty} b_m \frac{x^m}{m!} \right) = \sum_{n=0}^{\infty} \left( \sum_{i=0}^{n} \binom{n}{i} a_i b_{n-i} \right) \frac{x^n}{n!}.

The multiplication formulas suggest that we can use generating functions to prove combinatorial identities. There are hundreds of examples, but here are two:

Example. Find a nice expression for

c_n = \sum_{i=0}^{n} i \binom{n}{i}.

Example. Find a nice expression for

c_n = \sum_{i=0}^{n} \binom{2i}{i} \binom{2(n-i)}{n-i}.
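Both sums are convolutions of the kind produced by Proposition 1.4, so they are easy to explore numerically before hunting for closed forms. The sketch below is my addition; the printed values agree with the classical answers n·2^{n-1} and 4^n, which the reader may wish to re-derive with generating functions.

```python
from math import comb

def c1(n):
    # convolution-style sum from the first example
    return sum(i * comb(n, i) for i in range(n + 1))

def c2(n):
    # convolution of the central binomial coefficients from the second example
    return sum(comb(2 * i, i) * comb(2 * (n - i), n - i) for i in range(n + 1))

for n in range(9):
    print(n, c1(n), c2(n))

assert all(c1(n) == n * 2 ** (n - 1) for n in range(1, 30))
assert all(c2(n) == 4 ** n for n in range(30))
```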

We will see many more combinatorial identities in the lecture on the WZ method.


1.2.1 Recurrences

Generating functions are intimately related to recurrences. For now, we focus on finding generating functions given recurrences.

Example. The Fibonacci numbers are defined recursively by f_{n+2} = f_{n+1} + f_n with f_0 = 0 and f_1 = 1. We can find both the exponential generating function and the ordinary generating function for f_n. Let

F(x) = \sum_{n \ge 0} f_n x^n.

If we multiply both sides of the recurrence by x^n and sum over all n ≥ 0, we find

\frac{F(x) - x}{x^2} = \sum_{n \ge 0} (f_{n+1} + f_n) x^n = \sum_{n \ge 0} f_{n+1} x^n + F(x) = \frac{F(x)}{x} + F(x),

so

F(x) = \frac{x}{1 - x - x^2}
     = \frac{x}{\left(1 - \frac{1+\sqrt{5}}{2} x\right)\left(1 - \frac{1-\sqrt{5}}{2} x\right)}
     = \frac{1}{\sqrt{5}} \left( \frac{1}{1 - \frac{1+\sqrt{5}}{2} x} - \frac{1}{1 - \frac{1-\sqrt{5}}{2} x} \right).

On the other hand, let

G(x) = \sum_{n \ge 0} f_n \frac{x^n}{n!}.

Multiply both sides of the recurrence by x^n/n! and sum over all n ≥ 0. Then

G''(x) = G'(x) + G(x), \qquad G(x) = \frac{1}{\sqrt{5}} e^{\frac{1+\sqrt{5}}{2} x} - \frac{1}{\sqrt{5}} e^{\frac{1-\sqrt{5}}{2} x}.

Either way, we find that

f_n = \frac{1}{\sqrt{5}} \left[ \left( \frac{1+\sqrt{5}}{2} \right)^n - \left( \frac{1-\sqrt{5}}{2} \right)^n \right].
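As a quick check (my addition, not from the notes), the closed form can be compared against the recurrence directly; the Decimal arithmetic below is only there to keep the irrational computation accurate enough for moderate n.

```python
from decimal import Decimal, getcontext

def fib(n):
    """Fibonacci numbers from the recurrence f_0 = 0, f_1 = 1."""
    a, b = 0, 1
    for _ in range(n):
        a, b = b, a + b
    return a

def fib_closed_form(n):
    """The Binet-style formula read off from the partial fraction expansion."""
    getcontext().prec = 50
    sqrt5 = Decimal(5).sqrt()
    phi, psi = (1 + sqrt5) / 2, (1 - sqrt5) / 2
    return int(((phi ** n - psi ** n) / sqrt5).to_integral_value())

assert all(fib(n) == fib_closed_form(n) for n in range(50))
print("closed form matches the recurrence for n < 50")
```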

Example. Recall that the Stirling numbers of the second kind satisfy the recurrence

S(n, k) = kS(n− 1, k) + S(n− 1, k − 1).


Let F_k(x) be the exponential generating function of S(n, k) with respect to n. Then

\sum_{n \ge k} S(n, k) \frac{x^n}{n!} = \sum_{n \ge k} k S(n-1, k) \frac{x^n}{n!} + \sum_{n \ge k} S(n-1, k-1) \frac{x^n}{n!},

so that

F_k'(x) = k F_k(x) + F_{k-1}(x).

Induction shows that F_k(x) = \frac{1}{k!} (e^x - 1)^k. Then

\frac{1}{k!} (e^x - 1)^k = \frac{1}{k!} \sum_{i=0}^{k} (-1)^{k-i} \binom{k}{i} e^{ix},

so S(n, k) = \frac{1}{k!} \sum_{i=0}^{k} (-1)^{k-i} \binom{k}{i} i^n.
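Here is a short check I have added (not part of the notes) that the explicit formula just obtained agrees with the recurrence for S(n, k).

```python
from math import comb, factorial

def stirling2_rec(n, k):
    if n == 0 and k == 0: return 1
    if n == 0 or k == 0 or k > n: return 0
    return k * stirling2_rec(n - 1, k) + stirling2_rec(n - 1, k - 1)

def stirling2_formula(n, k):
    """S(n, k) = (1/k!) sum_i (-1)^(k-i) C(k, i) i^n, read off the EGF (e^x - 1)^k / k!."""
    return sum((-1) ** (k - i) * comb(k, i) * i ** n for i in range(k + 1)) // factorial(k)

assert all(stirling2_rec(n, k) == stirling2_formula(n, k)
           for n in range(1, 10) for k in range(1, n + 1))
print("explicit formula agrees with the recurrence")
```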

Example. Since

\sum_{n \ge k} S(n, k) \frac{x^n}{n!} = \frac{1}{k!} (e^x - 1)^k,

we find that

\sum_{k \ge 0} \sum_{n \ge k} S(n, k) \frac{x^n}{n!} = \sum_{k \ge 0} \frac{1}{k!} (e^x - 1)^k,

that is,

\sum_{n \ge 0} B(n) \frac{x^n}{n!} = e^{e^x - 1}.

If we differentiate and compare coefficients, we reprove the recursion

Proposition 1.5. For n ≥ 0,

B(n+1) = \sum_{i=0}^{n} \binom{n}{i} B(i).

1.2.2 Catalan numbers

There is one other famous combinatorial sequence to consider, namely the Catalan numbers. Richard Stanley has a list of 166 different families of objects enumerated by the Catalan numbers. We will define them as follows:

Definition. Let the n-th Catalan number C_n be the number of lattice paths from (0, 0) to (n, n) with steps (1, 0) and (0, 1) that never go above the line y = x.

When n = 3, there are five such lattice paths.

The sequence of Catalan numbers begins 1, 1, 2, 5, 14, 42, 132, 429, 1430, 4862. It turns out that we can obtain a closed-form formula for C_n using recurrences and generating functions.


Proposition 1.6.

C_n = \frac{1}{n+1} \binom{2n}{n}.

Proof. For n ≥ 0,

C_{n+1} = \sum_{i=0}^{n} C_i C_{n-i}.

Let

F(x) = \sum_{n \ge 0} C_n x^n.

Then

F(x) - 1 = x F(x)^2

and

F(x) = \frac{1 - \sqrt{1 - 4x}}{2x}.

Since

\sqrt{1 + y} = 1 - 2 \sum_{n=1}^{\infty} \binom{2n-2}{n-1} \left( \frac{-1}{4} \right)^n \frac{y^n}{n},

we find that

F(x) = \frac{1}{x} \sum_{n=1}^{\infty} \binom{2n-2}{n-1} \left( \frac{-1}{4} \right)^n \frac{(-4x)^n}{n} = \sum_{n=0}^{\infty} \frac{1}{n+1} \binom{2n}{n} x^n.
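The recurrence, the closed form, and the lattice path definition can all be checked against one another for small n; the sketch below is my own addition (not from the notes).

```python
from math import comb

def catalan_rec(n):
    """C_0 = 1 and C_{n+1} = sum_i C_i C_{n-i}."""
    C = [1]
    for m in range(n):
        C.append(sum(C[i] * C[m - i] for i in range(m + 1)))
    return C[n]

def catalan_paths(n):
    """Brute-force count of lattice paths from (0,0) to (n,n) that never go above y = x."""
    def walk(x, y):
        if (x, y) == (n, n):
            return 1
        total = 0
        if x < n:          # east step is always allowed
            total += walk(x + 1, y)
        if y < x:          # north step only while it keeps y <= x
            total += walk(x, y + 1)
        return total
    return walk(0, 0)

for n in range(1, 9):
    closed = comb(2 * n, n) // (n + 1)
    assert catalan_rec(n) == closed == catalan_paths(n)
print("recurrence, closed form, and path count agree for n <= 8")
```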

We will see a more combinatorial proof in the lecture on bijections. One example of where the Catalan numbers appear in a non-combinatorial context requires an alternative description of them.

Lemma 1.7. The Catalan number C_n is equal to the number of n-element multisets on Z/(n+1)Z whose elements sum to 0. For example, when n = 3, we have 000, 013, 022, 112 and 233.

Proof. The total number of n-element multisets is \binom{2n}{n}. Call two multisets M and N equivalent if they are translates of each other, that is, M = {a_1, . . . , a_n} and N = {a_1 + k, . . . , a_n + k} for some k ∈ Z/(n+1)Z. Each equivalence class contains exactly one multiset whose elements sum to 0. So the total number is \frac{1}{n+1} \binom{2n}{n}.

Example. What does this relate to? Let us count the number of conjugacy classes of A ∈ SL(n,C) with A^{n+1} = 1. Such a matrix is diagonalizable, with eigenvalues that are (n+1)-st roots of unity whose product is 1 (since det A = 1). Each conjugacy class is determined by the multiset of eigenvalues. Writing the eigenvalues as ζ^{a_1}, . . . , ζ^{a_n} for a primitive (n+1)-st root of unity ζ, the product condition says a_1 + · · · + a_n ≡ 0 (mod n+1). So the number of conjugacy classes equals the number of n-element multisets of Z/(n+1)Z that sum to 0.


1.2.3 q-Analogues

There is a special type of generating function that is often referred to as a “q-analogue.” Suppose that c counts a set S of combinatorial objects (c may depend on one or more parameters). Suppose there is a non-negative integer-valued “statistic” on the objects in S. Let c_i be the number of elements in S for which the statistic equals i. Then the q-analogue of c is the polynomial (generating function)

c(q) = \sum_{i=0}^{m} c_i q^i,

and note that \lim_{q \to 1} c(q) = c.

Example. The classic example of a q-analogue is the q-binomial coefficient [n k]_q, which is a q-analogue of the binomial coefficient. The idea is that the sum of the coefficients of [n k]_q is \binom{n}{k} (which counts, say, k-subsets of an n-set), and the coefficients are positive, so the k-subsets of an n-set should have a natural partition into groups parameterized by i = 0, 1, . . . , m, and the coefficient of q^i in [n k]_q should count the k-subsets in the group parameterized by i.

In this case, let the k-subsets of an n-set be represented by an n-bit binary string s with k ones. Let

i(s) = \#\{(a, b) : a < b \text{ and } s_a > s_b\}.

This is the number of inversions of s; we will talk about this more later. Define

\begin{bmatrix} n \\ k \end{bmatrix}_q := \sum_{i=0}^{k(n-k)} \#\{s : i(s) = i\} \, q^i.

For example, [4 2]_q = 1 + q + 2q^2 + q^3 + q^4. The q-binomial coefficient can also be defined as

\begin{bmatrix} n \\ k \end{bmatrix}_q := \frac{[n]!}{[k]! \, [n-k]!},

where [k]! = [1][2] \cdots [k] and [k] = 1 + q + \cdots + q^{k-1}. Then the recurrence

\begin{bmatrix} n \\ k \end{bmatrix}_q = \begin{bmatrix} n-1 \\ k \end{bmatrix}_q + q^{n-k} \begin{bmatrix} n-1 \\ k-1 \end{bmatrix}_q

shows that these two definitions are equivalent.
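The equivalence of the two definitions is easy to test for small n. The check below is my own addition (not from the notes): it builds [n k]_q both as the inversion generating function over binary strings and as [n]!/([k]![n−k]!), representing polynomials as integer coefficient lists.

```python
from itertools import combinations

def poly_mul(p, q):
    out = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out

def q_int(k):
    """[k] = 1 + q + ... + q^(k-1) as a coefficient list."""
    return [1] * k

def q_factorial(k):
    out = [1]
    for i in range(1, k + 1):
        out = poly_mul(out, q_int(i))
    return out

def q_binomial_by_inversions(n, k):
    """Coefficient of q^i counts binary strings with k ones and i inversions."""
    coeffs = [0] * (k * (n - k) + 1)
    for ones in combinations(range(n), k):
        s = [1 if a in ones else 0 for a in range(n)]
        inv = sum(1 for a in range(n) for b in range(a + 1, n) if s[a] > s[b])
        coeffs[inv] += 1
    return coeffs

def poly_div_exact(num, den):
    """Exact polynomial long division (assumes den has leading coefficient 1 and divides num)."""
    num = num[:]
    quot = [0] * (len(num) - len(den) + 1)
    for i in range(len(quot) - 1, -1, -1):
        c = num[i + len(den) - 1] // den[-1]
        quot[i] = c
        for j, d in enumerate(den):
            num[i + j] -= c * d
    return quot

def q_binomial_by_factorials(n, k):
    return poly_div_exact(q_factorial(n), poly_mul(q_factorial(k), q_factorial(n - k)))

for n in range(1, 8):
    for k in range(n + 1):
        assert q_binomial_by_inversions(n, k) == q_binomial_by_factorials(n, k)
print("both definitions of [n k]_q agree for n <= 7")
```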

Proposition 1.8. The number of k-dimensional subspaces of an n-dimensional vector space over a finite field F_q is [n k]_q.

Proof. Let the number in question be G(n, k) and let N(n, k) denote the number of ordered k-tuples (v_1, . . . , v_k) of linearly independent vectors. We may choose v_1 in q^n − 1 ways, v_2 in q^n − q ways, and so on, so

N(n, k) = (q^n - 1)(q^n - q) \cdots (q^n - q^{k-1}).

But we can also choose (v_1, . . . , v_k) by choosing a k-dimensional subspace W in G(n, k) ways and then choosing v_1 ∈ W in q^k − 1 ways, v_2 ∈ W in q^k − q ways, and so on, so

N(n, k) = G(n, k)(q^k - 1)(q^k - q) \cdots (q^k - q^{k-1}).

Thus

G(n, k) = \frac{(q^n - 1)(q^n - q) \cdots (q^n - q^{k-1})}{(q^k - 1)(q^k - q) \cdots (q^k - q^{k-1})} = \frac{[n]!}{[k]! \, [n-k]!} = \begin{bmatrix} n \\ k \end{bmatrix}_q.

We will see other q-analogues in the next lecture on permutation enumeration.

1.3 Lecture 3 (Thursday, September 4): Permutation Enumeration (Stanley [4, Section 1.1], Wilf [7, Chapter 4])

As we will see throughout this course, the symmetric group S_n is rich in combinatorics. Today, we focus on some basic enumerative problems about permutations. There are three ways that we will represent a permutation π ∈ S_n. The first is with two-line notation, in which we write

\pi = \begin{pmatrix} 1 & 2 & \cdots & n \\ \pi(1) & \pi(2) & \cdots & \pi(n) \end{pmatrix}.

The second is one-line notation, or as a word, where we forget about the top row:

\pi = \pi(1) \pi(2) \cdots \pi(n).

The last way is via the standard representation in disjoint cycle notation, where we require each cycle to be written with its largest element first and the cycles to be written in increasing order of their largest elements. For example,

\pi = \begin{pmatrix} 1 & 2 & 3 & 4 & 5 & 6 & 7 \\ 4 & 2 & 7 & 1 & 3 & 6 & 5 \end{pmatrix} = 4271365 = (2)(41)(6)(753).

Define a map π ↦ π̂ by writing π in the standard representation and erasing the parentheses (interpreting the result in one-line notation). In the above example,

π̂ = 2416753.

Proposition 1.9. The map π ↦ π̂ is a bijection, and π has k cycles if and only if π̂ has k left-to-right maxima.

Proof. Since each cycle begins with its largest element and the cycles appear in increasing order of their largest elements, the first entries of the cycles are exactly the left-to-right maxima of π̂. So we can recover π from π̂ by inserting parentheses at the beginning and end and starting a new cycle at each left-to-right maximum. Thus this map is a bijection, and π has k cycles if and only if π̂ has k left-to-right maxima.


1.3.1 Permutation Statistics

A permutation statistic is a map from S_n to N. For this subsection, write π in one-line notation. In lecture, we will skip descents, excedences, and the major index for now and go to the inversion number.

The first permutation statistic to consider is the number of descents in π, denoted des(π).

Definition. The descent set of a permutation π = a_1 a_2 · · · a_n is the set D(π) = {i : a_i > a_{i+1}}, and des(π) = |D(π)|. For example, the permutation π = 4271365 has D(π) = {1, 3, 6} and des(π) = 3.

The first question is how many permutations have a fixed descent set S? We can only give a partial answer right now.

Proposition 1.10. Let S = {s_1, s_2, . . . , s_k} ⊆ [n − 1] and let α(S) be the number of π ∈ S_n with D(π) ⊆ S. Then

\alpha(S) = \binom{n}{s_1, s_2 - s_1, s_3 - s_2, \ldots, n - s_k}.

Proof. To find π with D(π) ⊆ S, we choose a_1 < a_2 < · · · < a_{s_1} in \binom{n}{s_1} ways, then a_{s_1+1} < a_{s_1+2} < · · · < a_{s_2} in \binom{n-s_1}{s_2-s_1} ways, and so on. Thus

\alpha(S) = \binom{n}{s_1} \binom{n-s_1}{s_2-s_1} \cdots \binom{n-s_k}{n-s_k} = \binom{n}{s_1, s_2 - s_1, s_3 - s_2, \ldots, n - s_k}.

Counting the number of permutations with descent set exactly S can be done using this result and the Principle of Inclusion-Exclusion, which we will talk about in a few lectures. The next question to address is how many permutations have a given number of descents. Let A(n, k) = |{π ∈ S_n : des(π) = k − 1}|; this is called an Eulerian number. Then the Eulerian polynomial A_n(x) is the corresponding ordinary generating function:

A_n(x) = \sum_{k=1}^{n} A(n, k) x^k.

The first few Eulerian polynomials are:

A_1(x) = x
A_2(x) = x + x^2
A_3(x) = x + 4x^2 + x^3
A_4(x) = x + 11x^2 + 11x^3 + x^4
A_5(x) = x + 26x^2 + 66x^3 + 26x^4 + x^5

It is clear that the coefficients of these polynomials are symmetric, since if π^r is the reversal of π, the map π ↦ π^r is a bijection on S_n that takes a permutation with k − 1 descents to a permutation with n − k descents. We will return to Eulerian numbers in a little bit.
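The Eulerian polynomials above are easy to regenerate by brute force; the following sketch is my own addition (not part of the notes).

```python
from itertools import permutations

def eulerian_polynomial(n):
    """Coefficient of x^k is A(n, k) = #{pi in S_n : des(pi) = k - 1}."""
    coeffs = [0] * (n + 2)
    for pi in permutations(range(1, n + 1)):
        des = sum(1 for i in range(n - 1) if pi[i] > pi[i + 1])
        coeffs[des + 1] += 1
    return coeffs

for n in range(1, 6):
    print(n, eulerian_polynomial(n)[1:n + 1])
# prints [1], [1, 1], [1, 4, 1], [1, 11, 11, 1], [1, 26, 66, 26, 1],
# matching the list of Eulerian polynomials above
```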

The next statistic is the weak excedence number exc(π).


Definition. The weak excedence set of a permutation π = a_1 a_2 · · · a_n is the set E(π) = {i : a_i ≥ i}, and exc(π) = |E(π)|. For example, the permutation π = 4271365 has E(π) = {1, 2, 3, 6} and exc(π) = 4.

Proposition 1.11. The number of permutations π ∈ S_n with k weak excedences equals A(n, k).

Proof. Write π in standard form with m cycles and let π̂ = a_1 a_2 · · · a_n. If a_j is not the last element of its cycle, then π(a_j) = a_{j+1}, so a_j is a weak excedence exactly when a_j < a_{j+1}; if a_j is the last element of its cycle, then π(a_j) is the largest element of that cycle, so a_j is always a weak excedence. Since the first element of each cycle is a left-to-right maximum of π̂, each of the m − 1 boundaries between consecutive cycles is an ascent of π̂, and these boundary ascents contribute no weak excedences beyond the m cycle-ends. Hence exc(π) = (asc(π̂) − (m − 1)) + m = asc(π̂) + 1 = n − des(π̂). Thus π ↦ π̂ takes a permutation with k weak excedences to a permutation with n − k descents, so by the symmetry of the Eulerian numbers noted above, the number of permutations with k weak excedences is A(n, n − k + 1) = A(n, k).

Any statistic on S_n distributed the same as des(π) or exc(π) is called Eulerian. These are fruitful sources of combinatorial bijections and turn up in poset topology, among other places.

The third statistic we consider is the inversion number inv(π).

Definition. The inversion set of a permutation π = a_1 a_2 · · · a_n is the set {(i, j) : i < j but a_i > a_j}, and inv(π) is the number of inversions of π. For example, the permutation π = 4271365 has inv(π) = 9.

The generating function for the number of permutations in S_n with a given inversion number has a nice factorization and leads us into the topic of q-analogues.

Proposition 1.12.

\sum_{\pi \in S_n} q^{\mathrm{inv}(\pi)} = (1 + q)(1 + q + q^2) \cdots (1 + q + q^2 + \cdots + q^{n-1}).

Proof. Given a permutation a_1 · · · a_{n-1} in S_{n-1}, the letter n can be inserted in one of n spots, and inserting it after a_i increases the number of inversions by n − i − 1 for 0 ≤ i ≤ n − 1. So by induction, if

\sum_{\pi \in S_{n-1}} q^{\mathrm{inv}(\pi)} = (1 + q)(1 + q + q^2) \cdots (1 + q + q^2 + \cdots + q^{n-2}),

then

\sum_{\pi \in S_n} q^{\mathrm{inv}(\pi)} = (1 + q)(1 + q + q^2) \cdots (1 + q + q^2 + \cdots + q^{n-1}).

The fourth statistic is maj(π).

Definition. The major index maj(π) of a permutation π is the sum of all the numbers in the descent set of π.

Proposition 1.13. The number of permutations π ∈ S_n with maj(π) = k equals the number of permutations with k inversions.

Any statistic on Sn distributed the same as inv(π) or maj(π) is called Mahonian.
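Proposition 1.13 (the equidistribution of inv and maj) can be checked directly for small n; the sketch below is my own addition, not from the notes.

```python
from itertools import permutations

def inv(pi):
    return sum(1 for i in range(len(pi)) for j in range(i + 1, len(pi)) if pi[i] > pi[j])

def maj(pi):
    # sum of the (1-indexed) descent positions
    return sum(i + 1 for i in range(len(pi) - 1) if pi[i] > pi[i + 1])

for n in range(1, 7):
    inv_dist, maj_dist = {}, {}
    for pi in permutations(range(1, n + 1)):
        inv_dist[inv(pi)] = inv_dist.get(inv(pi), 0) + 1
        maj_dist[maj(pi)] = maj_dist.get(maj(pi), 0) + 1
    assert inv_dist == maj_dist
print("inv and maj are equidistributed on S_n for n <= 6")
```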


1.3.2 Multiset Permutations and q-Analogues

Many of these statistics can be generalized to permutations of multisets, that is, permutations π = a_1 a_2 · · · a_n where we allow repetitions. We will focus on inversions. As before, π has an inversion (i, j) if i < j but a_i > a_j. Writing down the corresponding generating function requires a so-called q-analogue of the multinomial coefficients. A q-analogue is a generating function in the variable q that reduces to a known quantity when q → 1.

We have already seen the classic example of a q-analogue, the q-binomial coefficient [n k]_q; what we need here is its generalization, the q-multinomial coefficient, which is a q-analogue of the multinomial coefficient.

Definition. Define the q-multinomial coefficient (or Gaussian polynomial) by

\begin{bmatrix} n \\ a_1, \ldots, a_m \end{bmatrix}_q := \frac{[n]_q!}{[a_1]_q! \, [a_2]_q! \cdots [a_m]_q!},

where [k]_q! = [1]_q [2]_q \cdots [k]_q and [k]_q = 1 + q + \cdots + q^{k-1}.

Proposition 1.14. The q-binomial coefficients satisfy the recurrence

\begin{bmatrix} n \\ k \end{bmatrix}_q = \begin{bmatrix} n-1 \\ k \end{bmatrix}_q + q^{n-k} \begin{bmatrix} n-1 \\ k-1 \end{bmatrix}_q.

The q-multinomial coefficients are products of q-binomial coefficients (in direct analogue to the multinomial coefficients), and thus the q-multinomial coefficients are polynomials in q.

The q-multinomial coefficients arise all over the place, but here they are in the context of inversions.

Theorem 1.15. Let M = {1^{a_1}, . . . , m^{a_m}} be a multiset of cardinality a_1 + · · · + a_m = n. Then

\sum_{\pi \in S_M} q^{\mathrm{inv}(\pi)} = \begin{bmatrix} n \\ a_1, \ldots, a_m \end{bmatrix}_q.

Proof. Define a map

\varphi : S_M \times S_{a_1} \times \cdots \times S_{a_m} \to S_n, \qquad (\pi_0, \pi_1, \ldots, \pi_m) \mapsto \pi,

by converting the a_i i's in π_0 to the numbers a_1 + · · · + a_{i-1} + 1, a_1 + · · · + a_{i-1} + 2, . . . , a_1 + · · · + a_{i-1} + a_i in the order specified by π_i. For example,

(21331223, 21, 231, 312) \mapsto 42861537.

We look at the multiset permutation 21331223. Then π_1 tells us to change the 1s to a 2 and a 1 from left to right, π_2 tells us to change the 2s to a 4, a 5, and a 3 from left to right, and π_3 tells us to change the 3s to an 8, a 6, and a 7 from left to right. Then φ is a bijection and

\mathrm{inv}(\pi) = \mathrm{inv}(\pi_0) + \mathrm{inv}(\pi_1) + \cdots + \mathrm{inv}(\pi_m).

Using Proposition 1.12,

[n]_q! = \left( \sum_{\pi_0 \in S_M} q^{\mathrm{inv}(\pi_0)} \right) [a_1]_q! \cdots [a_m]_q!,

which gives the theorem.


Here is another instance of q-binomial coefficients:

Theorem 1.16. The number of k-dimensional subspaces of an n-dimensional vector space over a finite field F_q is [n k]_q.

Proof. Let the number in question be G(n, k) and let N(n, k) denote the number of ordered k-tuples (v_1, . . . , v_k) of linearly independent vectors. We may choose v_1 in q^n − 1 ways, v_2 in q^n − q ways, and so on, so

N(n, k) = (q^n - 1)(q^n - q) \cdots (q^n - q^{k-1}).

But we can also choose (v_1, . . . , v_k) by choosing a k-dimensional subspace W in G(n, k) ways and then choosing v_1 ∈ W in q^k − 1 ways, v_2 ∈ W in q^k − q ways, and so on, so

N(n, k) = G(n, k)(q^k - 1)(q^k - q) \cdots (q^k - q^{k-1}).

Thus

G(n, k) = \frac{(q^n - 1)(q^n - q) \cdots (q^n - q^{k-1})}{(q^k - 1)(q^k - q) \cdots (q^k - q^{k-1})} = \frac{[n]!}{[k]! \, [n-k]!} = \begin{bmatrix} n \\ k \end{bmatrix}_q.

1.3.3 Cycles

We turn to looking at the number of cycles and cycle lengths of a permutation.

Definition. The cycle type of π ∈ S_n, denoted type(π), is the vector (c_1, . . . , c_n), where π has c_i cycles of length i.

Theorem 1.17. The number of π ∈ S_n with cycle type (c_1, . . . , c_n) is

\frac{n!}{1^{c_1} c_1! \, 2^{c_2} c_2! \cdots n^{c_n} c_n!}.

Proof. Let π = a_1 . . . a_n be a word. Parenthesize π so that the first c_1 cycles have length 1, the next c_2 cycles have length 2, and so on. This gives a map from S_n onto the set of permutations with cycle type (c_1, . . . , c_n). Now, if σ has cycle type (c_1, . . . , c_n), the number of ways to write σ as a product of disjoint cycles with the cycle lengths non-decreasing is 1^{c_1} c_1! \cdots n^{c_n} c_n! (each cycle of length i can be rotated in i ways, and the c_i cycles of length i can be ordered in c_i! ways), so each σ arises from exactly that many words. So the number of distinct σ is

\frac{n!}{1^{c_1} c_1! \, 2^{c_2} c_2! \cdots n^{c_n} c_n!}.

Let c(n, k) be the number of π ∈ S_n with exactly k cycles. Then (−1)^{n−k} c(n, k) is the Stirling number of the first kind and c(n, k) is the signless Stirling number of the first kind. They satisfy a nice recurrence, and we can use that recurrence to obtain a generating function.


Theorem 1.18. The signless Stirling numbers of the first kind c(n, k) satisfy the recurrence

c(n, k) = (n - 1) c(n-1, k) + c(n-1, k-1).

Then

\sum_{k=0}^{n} c(n, k) x^k = x(x+1)(x+2) \cdots (x+n-1).

Proof. For the recurrence, choose a permutation in S_{n-1} with k − 1 or k cycles. In the first case, include n in its own cycle. In the second case, n can be inserted into one of the cycles in any one of n − 1 positions. The generating function follows from the recurrence by induction.
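Here is a small verification of the generating polynomial that I have added (not part of the notes): it counts cycles by brute force and compares with the coefficients of x(x+1)···(x+n−1).

```python
from itertools import permutations

def cycle_count(pi):
    """Number of cycles of a permutation given in one-line notation on {1,...,n}."""
    n = len(pi)
    seen, cycles = set(), 0
    for start in range(1, n + 1):
        if start not in seen:
            cycles += 1
            j = start
            while j not in seen:
                seen.add(j)
                j = pi[j - 1]
    return cycles

def rising_factorial_coeffs(n):
    """Coefficients of x(x+1)...(x+n-1), lowest degree first."""
    coeffs = [0, 1]                      # the polynomial x
    for a in range(1, n):
        new = [0] * (len(coeffs) + 1)    # multiply by (x + a)
        for d, c in enumerate(coeffs):
            new[d] += a * c
            new[d + 1] += c
        coeffs = new
    return coeffs

for n in range(1, 7):
    counts = [0] * (n + 1)
    for pi in permutations(range(1, n + 1)):
        counts[cycle_count(pi)] += 1
    assert counts == rising_factorial_coeffs(n)
print("sum_k c(n,k) x^k = x(x+1)...(x+n-1) checked for n <= 6")
```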

1.3.4 Using Generating Functions to Find Expected Values

We use the results of the previous section and generating functions to prove probabilistic statements about permutations in S_n.

Proposition 1.19. Let X : S → N be a random variable on a set S that assumes non-negative integer values, and let p_k be the probability that X = k. The expected value of X is

E[X] = \sum_{n=0}^{\infty} n p_n

and the variance is

\mathrm{Var}(X) = \sum_{n=0}^{\infty} n^2 p_n - \left( \sum_{n=0}^{\infty} n p_n \right)^2.

Let

f(x) = \sum_{n=0}^{\infty} p_n x^n.

Then

E[X] = f'(1) = \sum_{n=0}^{\infty} n p_n

and

\mathrm{Var}(X) = f'(1) + f''(1) - f'(1)^2 = \sum_{n=0}^{\infty} n^2 p_n - \left( \sum_{n=0}^{\infty} n p_n \right)^2.

Example. Note that c(n, k)/n! is the probability that a random permutation in S_n has k cycles. Define the random variable X : S_n → N where X(π) = (number of cycles in π). By Theorem 1.18 we know that

f(x) := \sum_{k \ge 0} \mathrm{Prob}(X = k) x^k = \sum_{k \ge 0} \frac{c(n, k)}{n!} x^k = \frac{1}{n!} x(x+1)(x+2) \cdots (x+n-1).

We can compute the expected number of cycles in a random permutation in S_n:

f'(x) = \frac{1}{n!} \sum_{i=0}^{n-1} \frac{x(x+1)(x+2) \cdots (x+n-1)}{x+i},
\qquad
f'(1) = \frac{1}{n!} \sum_{i=0}^{n-1} \frac{n!}{i+1} = 1 + \frac{1}{2} + \frac{1}{3} + \cdots + \frac{1}{n}.

We can also compute the variance in the number of cycles in a random permutation in S_n. Since

f''(x) = \frac{1}{n!} \sum_{\substack{i,j=0 \\ i \ne j}}^{n-1} \frac{x(x+1)(x+2) \cdots (x+n-1)}{(x+i)(x+j)},

we get

f'(1) + f''(1) = \left(1 + \frac{1}{2} + \frac{1}{3} + \cdots + \frac{1}{n}\right) + \sum_{\substack{i,j=0 \\ i \ne j}}^{n-1} \frac{1}{(i+1)(j+1)}
= \left(1 + \frac{1}{2} + \cdots + \frac{1}{n}\right) + \left(1 + \frac{1}{2} + \cdots + \frac{1}{n}\right)^2 - \left(1 + \frac{1}{4} + \frac{1}{9} + \cdots + \frac{1}{n^2}\right),

so

\mathrm{Var}(X) = 1 + \frac{1}{2} + \frac{1}{3} + \cdots + \frac{1}{n} - 1 - \frac{1}{4} - \frac{1}{9} - \cdots - \frac{1}{n^2} \approx \log n + \gamma - \frac{\pi^2}{6} + o(1).

So the average number of cycles in a random permutation in S_n is ≈ log n with standard deviation ≈ \sqrt{\log n}.
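The exact mean and variance can be confirmed for small n; the sketch below is my own addition (not from the notes) and uses exact rational arithmetic.

```python
from itertools import permutations
from fractions import Fraction

def exact_cycle_stats(n):
    """Exact mean and variance of the number of cycles over S_n."""
    def cycles(pi):
        seen, c = set(), 0
        for s in range(1, n + 1):
            if s not in seen:
                c += 1
                j = s
                while j not in seen:
                    seen.add(j)
                    j = pi[j - 1]
        return c
    vals = [cycles(pi) for pi in permutations(range(1, n + 1))]
    mean = Fraction(sum(vals), len(vals))
    var = Fraction(sum(v * v for v in vals), len(vals)) - mean * mean
    return mean, var

for n in range(1, 7):
    mean, var = exact_cycle_stats(n)
    H = sum(Fraction(1, i) for i in range(1, n + 1))
    H2 = sum(Fraction(1, i * i) for i in range(1, n + 1))
    assert mean == H and var == H - H2
print("E[#cycles] = H_n and Var = H_n - sum 1/i^2 checked for n <= 6")
```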

1.3.5 Unimodality

It is nice to know when finite sequences are unimodal, that is, the entries rise to a maximum and then decrease. The most well-known unimodal sequence is \{\binom{n}{k}\}_{k=0}^{n}. It is generally extremely hard to determine the unimodality of a sequence. In a few cases, however, generating functions can help.

Theorem 1.20 (Newton's Theorem / Darroch's Theorem). Let p(x) = c_0 + c_1 x + c_2 x^2 + \cdots + c_n x^n be a polynomial all of whose roots are real and negative. Then the sequence of coefficients is unimodal. If p(1) > 0, then the value of k for which c_k is maximized is within one of p'(1)/p(1).

This immediately shows that the binomial coefficients and the Stirling numbers of the first kind are unimodal. It takes more work to show that the Stirling numbers of the second kind and the Gaussian coefficients are unimodal.


1.3.6 Cycle Index

This subsection derives some powerful asymptotic results about the cycle type of a random permutation in S_n as n → ∞. Let

\phi_n(x) = \sum_{c = (c_1, \ldots, c_n)} |\{\pi \in S_n : \mathrm{type}(\pi) = c\}| \, x_1^{c_1} x_2^{c_2} \cdots.

In a future lecture, we will call this the cycle index of S_n. Note that

\frac{\phi_n(x)}{n!} = \sum_{c = (c_1, \ldots, c_n)} \mathrm{Prob}(\mathrm{type}(\pi \in S_n) = c) \, x_1^{c_1} x_2^{c_2} \cdots.

Let

C(x, t) = \sum_{n \ge 0} \phi_n(x) \frac{t^n}{n!},

where we set φ_0(x) = 1.

This huge generating function has a very nice form.

Theorem 1.21.

C(x, t) = \exp\left( \sum_{j \ge 1} \frac{x_j t^j}{j} \right).

Proof. Using Theorem 1.17,

C(x, t) = \sum_{n \ge 0} \frac{\phi_n(x)}{n!} t^n
= \sum_{n \ge 0} \frac{t^n}{n!} \sum_{c_1 + 2c_2 + \cdots = n} |\{\pi \in S_n : \mathrm{type}(\pi) = c\}| \, x_1^{c_1} x_2^{c_2} \cdots
= \sum_{n \ge 0} \frac{t^n}{n!} \sum_{c_1 + 2c_2 + \cdots = n} \frac{n!}{1^{c_1} c_1! \, 2^{c_2} c_2! \cdots n^{c_n} c_n!} \, x_1^{c_1} x_2^{c_2} \cdots
= \left( \sum_{c_1 \ge 0} \frac{(t x_1)^{c_1}}{1^{c_1} c_1!} \right) \left( \sum_{c_2 \ge 0} \frac{(t^2 x_2)^{c_2}}{2^{c_2} c_2!} \right) \cdots
= e^{t x_1} e^{t^2 x_2 / 2} e^{t^3 x_3 / 3} \cdots
= \exp\left( \sum_{j \ge 1} \frac{x_j t^j}{j} \right).

So how do we apply this to get probabilistic results about permutations? We need to point out one lemma:

Lemma 1.22. Let \sum_j b_j be a convergent series. Then in the power series expansion

\frac{1}{1-t} \sum_j b_j t^j = \sum_n \alpha_n t^n,

we have

\alpha_n = \sum_{j=1}^{n} b_j,

and so

\lim_{n \to \infty} \alpha_n = \sum_j b_j.

Let us insert a brief example to show how useful this is.

Example. The ordinary generating function for the sequence 0, 1, 1/2, 1/3, 1/4, . . . is

-\log(1 - t) = \sum_{n=1}^{\infty} \frac{t^n}{n}.

By the lemma, the ordinary generating function for the sequence of harmonic numbers 0, 1, 1 + 1/2, 1 + 1/2 + 1/3, . . . is

\frac{-\log(1 - t)}{1 - t}.

Now we have our big theorem.

Theorem 1.23. Let S be a set of positive integers for which

\sum_{n \in S} \frac{1}{n} < \infty.

The probability that the cycle type of a random permutation in S_n agrees with c in all of its components whose subscripts lie in S goes to

e^{-\sum_{s \in S} 1/s} \, [x^c] \exp\left( \sum_{s \in S} \frac{x_s}{s} \right) = \frac{1}{\prod_{s \in S} e^{1/s} \, s^{c_s} \, c_s!}

as n → ∞. Here [x^c] is the operator that extracts the coefficient of x^c = \prod_{s \in S} x_s^{c_s} from the generating function following it.

Proof. In C(x, t), set x_i = 1 if i ∉ S (that is, we have no interest in any cycle lengths other than those in S). Then we find

C(x, t) = \exp\left( \sum_{i \in S} \frac{x_i t^i}{i} + \sum_{i \notin S} \frac{t^i}{i} \right)
= \exp\left( \sum_{i \in S} (x_i - 1) \frac{t^i}{i} + \log \frac{1}{1-t} \right)
= \frac{1}{1-t} \exp\left( \sum_{i \in S} (x_i - 1) \frac{t^i}{i} \right).

The coefficient of t^n as n → ∞ is

\exp\left( \sum_{i \in S} \frac{x_i - 1}{i} \right)

by the lemma (we obtain this by setting t = 1).

Corollary 1.24. Letting S = {1} shows that the probability that a random permutation in S_n has exactly c_1 fixed points goes to 1/(c_1! \, e) as n → ∞.

Corollary 1.25. Letting S = {r} shows that the probability that a random permutation in S_n has exactly c_r r-cycles goes to 1/(e^{1/r} r^{c_r} c_r!) as n → ∞.

Corollary 1.26. Letting S = {1, 4, 9, 16, . . . } shows that the probability that a random permutation in S_n has no cycles whose lengths are squares goes to e^{-\pi^2/6} as n → ∞.

Corollary 1.27. Letting S = {1, 2} shows that the probability that a random permutation in S_n has equal numbers of 1-cycles and 2-cycles goes to

e^{-3/2} \sum_{j=0}^{\infty} \frac{1}{2^j (j!)^2}

as n → ∞.

1.3.7 Square Roots

Finally, we discuss the question of how many permutations have square roots.

Theorem 1.28. A permutation σ has a square root if and only if, for every even length, the number of cycles of that length is even.

Proof. Let τ be a permutation. A cycle of length 2m in τ breaks into two cycles of length m in τ^2. A cycle of odd length in τ stays a cycle of the same odd length in τ^2. In particular, the cycles of any given even length in τ^2 come in pairs, so the condition is necessary for σ to have a square root. Conversely, if σ has an even number of cycles of each even length, it is easy to construct a square root of σ.

So σ has a square root if and only if the cycle type c has even-indexed components that are even. The coefficient of x^c t^n/n! in

e^{x_1 t} e^{x_2 t^2/2} e^{x_3 t^3/3} \cdots

is the number of permutations of n letters whose cycle type is c. To sum over all the cycle types we are considering, set x_1 = x_3 = x_5 = \cdots = 1, and replace each factor e^{x_{2m} t^{2m}/(2m)} by its even part in x_{2m} (keeping only even powers of x_{2m}), which after setting x_{2m} = 1 becomes \cosh(t^{2m}/(2m)). Let f(n, 2) be the number of permutations in S_n with square roots. Then

\sum_{n \ge 0} f(n, 2) \frac{t^n}{n!} = e^{t} \cosh(t^2/2) \, e^{t^3/3} \cosh(t^4/4) \, e^{t^5/5} \cdots
= e^{t + t^3/3 + t^5/5 + \cdots} \prod_{m \ge 1} \cosh\left( \frac{t^{2m}}{2m} \right)
= \sqrt{\frac{1+t}{1-t}} \prod_{m \ge 1} \cosh\left( \frac{t^{2m}}{2m} \right)
= 1 + t + \frac{t^2}{2!} + 3\frac{t^3}{3!} + 12\frac{t^4}{4!} + 60\frac{t^5}{5!} + \cdots.

Thus the sequence f(n, 2) counting permutations in S_n with a square root begins 1, 1, 1, 3, 12, 60, . . . .
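Both the cycle-type criterion and the sequence 1, 1, 1, 3, 12, 60 can be checked by brute force; the sketch below is my own addition (not from the notes).

```python
from itertools import permutations
from collections import Counter

def cycle_type(pi):
    n = len(pi)
    seen, lengths = set(), []
    for s in range(1, n + 1):
        if s not in seen:
            length, j = 0, s
            while j not in seen:
                seen.add(j)
                j = pi[j - 1]
                length += 1
            lengths.append(length)
    return Counter(lengths)

def has_square_root_by_criterion(pi):
    """Criterion of Theorem 1.28: every even cycle length occurs an even number of times."""
    return all(mult % 2 == 0 for length, mult in cycle_type(pi).items() if length % 2 == 0)

def has_square_root_brute(pi, n):
    """Directly search for tau with tau^2 = pi."""
    for tau in permutations(range(1, n + 1)):
        if all(tau[tau[i - 1] - 1] == pi[i - 1] for i in range(1, n + 1)):
            return True
    return False

counts = []
for n in range(1, 6):
    count = 0
    for pi in permutations(range(1, n + 1)):
        crit = has_square_root_by_criterion(pi)
        assert crit == has_square_root_brute(pi, n)
        count += crit
    counts.append(count)
print(counts)   # [1, 1, 3, 12, 60], matching f(n, 2) for n = 1, ..., 5
```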

1.4 Lecture 5 (Thursday, September 11): The Exponential Formula (Stanley [5, Chapter 5])

We have seen many examples of (exponential) generating functions involving the exponential function. This is not a coincidence. These functions arise naturally, as we will see in the Exponential Formula. We will use the Exponential Formula to rederive some of the generating functions we have already found, see new applications to Stirling numbers of the first kind, graph enumeration, and subgroup enumeration, and we will get a chance to revisit the cycle index of the symmetric group from another point of view.

Let

E_f(x) = \sum_{n \ge 0} f(n) \frac{x^n}{n!}.

We will often denote the size of a finite set S as #S := |S|. Let K be any field of characteristic zero (like R or C with some indeterminates attached). We will soon see why we might want K to have indeterminates.

Proposition 1.29. Given functions f, g : N → K, define a new function h : N → K by the rule

h(\#X) = \sum_{(S, T)} f(\#S) g(\#T),

where X is a finite set and where (S, T) ranges over all weak ordered partitions of X into two blocks (weak means that S or T may be empty, and ordered means that (S, T) and (T, S) are distinct partitions). Then

E_h(x) = E_f(x) E_g(x).

Proof. Let #X = n. Then

h(n) = \sum_{k=0}^{n} \binom{n}{k} f(k) g(n-k),

since if #S = k, then S can be chosen in \binom{n}{k} ways and T is the complement of S in X. The desired formula follows from the formula for multiplying two EGFs.

The next proposition is a direct consequence of applying Proposition 1.29 repeatedly.

Proposition 1.30. Let k ∈ P and f_1, . . . , f_k : N → K. Define h : N → K by

h(\#S) = \sum f_1(\#T_1) f_2(\#T_2) \cdots f_k(\#T_k),

where (T_1, . . . , T_k) ranges over all weak ordered partitions of S into k blocks. Then

E_h(x) = E_{f_1}(x) \cdots E_{f_k}(x).

This proposition quickly rederives the EGF of the Stirling numbers S(n, k) for fixed k.

Example. In Proposition 1.30, let f_i(0) = 0 and f_i(n) = 1 for n ≥ 1, for each 1 ≤ i ≤ k. Then h(n) simply counts the number of ordered partitions of [n] into k non-empty blocks, so h(n) = k! S(n, k). On the other hand, E_{f_i}(x) = e^x − 1 for each i, so

E_h(x) = \sum_{n=0}^{\infty} k! \, S(n, k) \frac{x^n}{n!} = (e^x - 1)^k.

Thus the EGF for S(n, k) for fixed k is

\frac{(e^x - 1)^k}{k!}.

At this point, we need to insert a little sidenote about the one issue I brushed over when I said that we could just manipulate generating functions formally without worrying about convergence.

Note. In the ring of formal power series C[[x]] (or C[[x_1, x_2, . . . , x_n]]), we have clearly defined addition and multiplication operations. But what is the situation with regard to inverses and composition? The answers are that f(x) has a multiplicative inverse if and only if f(0) ≠ 0, and the composition f(g(x)) is well-defined if g(0) = 0. In the case of composition, this is because computing the coefficient of x^n in f(g(x)) requires a finite number of steps if g(0) = 0, but requires infinitely many if g(0) ≠ 0 (and, say, f(x) is not a polynomial).

It is time for the fundamental result in this lecture (which specializes to the ExponentialFormula).

Theorem 1.31 (The Compositional Formula). Given f : P → K and g : N → K with g(0) = 1, define h : N → K by h(0) = 1 and

h(\#S) = \sum_{\pi = \{B_1, \ldots, B_k\}} f(\#B_1) \cdots f(\#B_k) \, g(k),

where the sum ranges over all set partitions π of the set S. Then

E_h(x) = E_g(E_f(x)).


Proof. Let #S = n, and let h_k(n) = \sum_{\pi = \{B_1, \ldots, B_k\}} f(|B_1|) \cdots f(|B_k|) \, g(k) for fixed k. The B_1, . . . , B_k are non-empty since f is only defined on P, so there are k! ways to order them. Thus

E_{h_k}(x) = \frac{g(k)}{k!} E_f(x)^k.

Now we sum over all k ≥ 1.

The EGF for the Bell numbers follows almost immediately from Theorem 1.31.

Example. In Theorem 1.31, let f(n) = 1 for all n ∈ P and g(n) = 1 for all n ∈ N. Then h(n) = B(n) (the n-th Bell number). But E_f(x) = e^x − 1 and E_g(x) = e^x, so the EGF for the Bell numbers is

E_h(x) = e^{e^x - 1} = \sum_{n=0}^{\infty} B(n) \frac{x^n}{n!}.

Roughly, the idea of the Compositional Formula is that if we have a set of objects (like set partitions) that are disjoint unions of connected components (like the blocks of a partition), and if there are f(j) possibilities for a component of size j, and there are g(k) ways to stick together k components to form an object, then h(n) is the total number of objects. We so often want g ≡ 1 that we state this case as a corollary.

Corollary 1.32 (The Exponential Formula). Given f : P → K, define h : N → K by h(0) = 1 and

h(\#S) = \sum_{\pi = \{B_1, \ldots, B_k\}} f(\#B_1) \cdots f(\#B_k),

where the sum ranges over all set partitions π of the set S. Then

E_h(x) = \exp(E_f(x)).

The Exponential Formula is often used in the enumeration of graphs. Recall that a graph consists of a vertex set V and a set of edges E, where an edge is an unordered pair of distinct vertices. A connected graph is one in which for any two vertices v and w, there are vertices v_0 = v, v_1, v_2, . . . , v_j = w such that v_i v_{i+1} is an edge for all i.

Example. The number of graphs with vertex set {1, 2, . . . , n} is 2^{\binom{n}{2}} (each unordered pair of vertices may or may not correspond to an edge). Let c(n) be the number of connected graphs on this vertex set. Let f(n) = c(n) and h(n) = 2^{\binom{n}{2}}. The Exponential Formula shows that

E_h(x) = \sum_{n \ge 0} 2^{\binom{n}{2}} \frac{x^n}{n!} = \exp\left( \sum_{n \ge 1} c(n) \frac{x^n}{n!} \right),

so

\sum_{n \ge 1} c(n) \frac{x^n}{n!} = \log\left( \sum_{n \ge 0} 2^{\binom{n}{2}} \frac{x^n}{n!} \right).
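Coefficient extraction from this log/exp relation can be mechanized. Differentiating E_h(x) = exp(Σ c(n)x^n/n!) gives E_h'(x) = (Σ c(n+1)x^n/n!) · E_h(x), which yields the coefficient recurrence used below; this is my own rephrasing of the identity, not a formula from the notes. The code also checks the first values against a brute-force count of connected labeled graphs.

```python
from itertools import combinations
from math import comb

def num_graphs(n):
    return 2 ** comb(n, 2)

def connected_counts(n_max):
    """c(n+1) = f(n+1) - sum_{k<n} C(n,k) c(k+1) f(n-k), with f(n) = 2^C(n,2),
    which is the coefficientwise form of E_h' = (sum c(n+1) x^n/n!) E_h."""
    c = [0] * (n_max + 1)          # c[0] unused
    for n in range(n_max):
        total = num_graphs(n + 1)
        for k in range(n):
            total -= comb(n, k) * c[k + 1] * num_graphs(n - k)
        c[n + 1] = total
    return c

def connected_brute(n):
    """Brute-force count of connected labeled graphs on {0, ..., n-1}."""
    all_edges = list(combinations(range(n), 2))
    count = 0
    for mask in range(2 ** len(all_edges)):
        adj = {v: set() for v in range(n)}
        for idx, (u, v) in enumerate(all_edges):
            if mask >> idx & 1:
                adj[u].add(v); adj[v].add(u)
        stack, seen = [0], {0}          # DFS from vertex 0
        while stack:
            u = stack.pop()
            for w in adj[u]:
                if w not in seen:
                    seen.add(w); stack.append(w)
        count += (len(seen) == n)
    return count

c = connected_counts(6)
assert [c[n] for n in range(1, 6)] == [connected_brute(n) for n in range(1, 6)]
print(c[1:])    # 1, 1, 4, 38, 728, 26704
```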


Now we are going to find a generating function for the number of graphs on n vertices with exactly k connected components. Here is where we begin to use the generality of the field K. If K contains indeterminates, say, we can use f and/or g to assign weights to each choice of components.

Example. For this example, let c_k(n) be the number of graphs with k components on {1, 2, . . . , n} and let

h(n) = \sum_{k \ge 0} c_k(n) t^k.

Write

F(x, t) = \sum_{n \ge 0} \left( \sum_{k \ge 0} c_k(n) t^k \right) \frac{x^n}{n!} = E_h(x).

Let f(n) = t \, c(n), namely t times the number of connected graphs on n vertices (we are weighting each component by t, and the weight of a graph is the product of the weights of its components, so a graph has weight t^k if it has k components), let g(k) = 1, and let h(n) = \sum_{k \ge 0} c_k(n) t^k. Then by the Exponential Formula,

F(x, t) = E_h(x) = \exp\left( \sum_{n \ge 1} t \, c(n) \frac{x^n}{n!} \right) = \left( \sum_{n \ge 0} 2^{\binom{n}{2}} \frac{x^n}{n!} \right)^t.

More generally, if E_h(x) is the generating function for some collection of objects that are disjoint unions of connected components, then E_h(x)^t is a multivariate generating function that keeps track of the number of components. Note that h(n) is the sum of the weights of the graphs on n vertices, so if t = 1, we just get the number of graphs on n vertices. We can go further with this example by writing

F(x, t) = E_h(x) = \sum_{k \ge 0} t^k E_{c_k}(x) = \sum_{k \ge 0} t^k \frac{E_{c_1}(x)^k}{k!},

so

E_{c_k}(x) = \frac{1}{k!} E_{c_1}(x)^k = \frac{1}{k!} \left( \log \sum_{n \ge 0} 2^{\binom{n}{2}} \frac{x^n}{n!} \right)^k.

Suppose that instead of graphs, we were looking at permutations of [n], which are a disjoint union of cycles. Let h(n) = n! and let c_k(n) = c(n, k), the signless Stirling number of the first kind. Then E_h(x) = (1 − x)^{-1}, so the EGF for the Stirling numbers of the first kind is

E_{c(n,k)}(x) = \sum_{n \ge 0} c(n, k) \frac{x^n}{n!} = \frac{1}{k!} \left[ \log (1 - x)^{-1} \right]^k.

We turn to an application of the Exponential Formula in group theory.


Example. Let G be a finitely generated group and let Hom(G, S_n) be the set of homomorphisms from G to S_n. What is

\sum_{n \ge 0} |\mathrm{Hom}(G, S_n)| \frac{x^n}{n!}?

Observe that there is a bijection between such homomorphisms and actions of G on [n]. The orbits of such an action form a set partition π of [n]. Letting f(d) = g_d, the number of transitive actions of G on [d], the Exponential Formula shows that

\sum_{n \ge 0} |\mathrm{Hom}(G, S_n)| \frac{x^n}{n!} = \exp\left( \sum_{d \ge 1} g_d \frac{x^d}{d!} \right).

Since the orbit of 1 in such a transitive action has size d, the stabilizer of 1 is a subgroup H of G of index d. Now the d − 1 coset representatives of the non-trivial cosets of H send 1 to each of 2, . . . , d, and we can determine that assignment in (d − 1)! ways. Thus g_d = (d − 1)! \, s_d(G), where s_d(G) is the number of subgroups of G of index d. Therefore

\sum_{n \ge 0} |\mathrm{Hom}(G, S_n)| \frac{x^n}{n!} = \exp\left( \sum_{d \ge 1} s_d(G) \frac{x^d}{d} \right).

As a final application of the Exponential Formula, we will briefly revisit the cycle index from last lecture.

Example. Recall that the cycle index of S_n is

\phi_n(x) = \sum_{c = (c_1, c_2, \ldots)} \#\{\pi \in S_n : \mathrm{type}(\pi) = c\} \, x_1^{c_1} x_2^{c_2} \cdots.

Letting f(k) = (k − 1)! \, x_k (that is, the weight of a cycle of length k is x_k, and there are (k − 1)! cycles on a given k-element block), we see that

\phi_n(x) = \sum_{\pi = \{B_1, \ldots, B_k\}} f(\#B_1) \cdots f(\#B_k),

where the sum is over set partitions π of [n]. Since

E_f(t) = \sum_{n=1}^{\infty} \frac{x_n t^n}{n},

the Exponential Formula says that

C(t, x) = \sum_{n \ge 0} \phi_n(x) \frac{t^n}{n!} = \exp\left( \sum_{n=1}^{\infty} \frac{x_n t^n}{n} \right).

In particular, if we let x_i = 0, that gives weight 0 to each cycle of length i, and hence weight 0 to each permutation with a cycle of length i. Hence choosing a set S and letting x_i = 0 if i ∈ S and x_i = 1 if i ∉ S lets the coefficient of t^n/n! in C(t, x) count the number of permutations none of whose cycle lengths are in S. Alternatively, letting x_i = 1 if i ∉ S makes the weight of a permutation just \prod_{i \in S} x_i^{c_i}, so extracting the coefficient of \prod_{i \in S} x_i^{c_i} \, t^n/n! from C(t, x) counts the number of permutations with c_i i-cycles for each i ∈ S.

The rest of the lecture yesterday was devoted to showing that, for appropriate choices of S, we can rewrite C(t, x) and use a lemma to find an exact formula for the limit of the coefficient of \prod_{i \in S} x_i^{c_i} \, t^n in C(t, x) as n → ∞, which gives the limiting probability that a random permutation in S_n has c_i i-cycles for each i ∈ S.

1.5 Lecture 6 (Tuesday, September 16): Bijections

So far, we have focused on the algebraic (generating function) approach to proving combinatorial enumeration theorems. One alternative is to prove enumerative theorems bijectively. For example, one might show that the n-th Catalan number C_n satisfies

C_n = \frac{1}{n+1} \binom{2n}{n}

by exhibiting a bijection between {Catalan paths} × [n + 1] and 0-1 sequences of length 2n with n zeroes (here, Catalan paths is shorthand for the lattice paths from (0, 0) to (n, n) with steps (0, 1) and (1, 0) that never go above y = x). I have tried to avoid bijective proofs so far because: (1) bijective proofs tend to arise after an enumerative formula has been found via algebraic means, in order to lend insight into the problem, (2) algebraic methods are more powerful in the sense that few, if any, identities can be proved via bijections that cannot be proved by algebraic means, and (3) bijective proofs tend to be more specialized.

Still, bijections are an important and beautiful part of enumerative combinatorics, so let's dive in. We begin with a varied collection of nice, classic bijections.

Example. One extremely common class of bijections is lattice path bijections. Many combinatorial quantities have lattice path interpretations. The classic example here is the Catalan number. We wish to prove the following proposition bijectively.

Proposition 1.33. The Catalan number Cn is given by

Cn =1

n+ 1

(2n

n

).

Proof. Our bijection is not of the type suggested in the introduction, highlighting the factthat it is not always clear what sets should be involved in the bijection. But recall that Cnwas defined to be the number of lattice paths from (0, 0) to (n, n) with steps (0, 1) and (1, 0)that never go above y = x. Let this set of lattice paths be L1. Let L2 be the set of latticepaths from (0, 0) to (n−1, n+1) with steps (0, 1) and (1, 0). Let L3 be the set of lattice pathsfrom (0, 0) to (n, n) with steps (0, 1) and (1, 0). Obviously |L2| =

(2nn−1

)and |L3| =

(2nn

). We

will exhibit a bijection from L3 to L1 ∪L2, which would show that Cn = |L1| =(2nn

)−(

2nn−1

),

and this equals the desired quantity.Let P be a path in L3. Either it is in L1 or it crosses the line y = x. Find the first edge

in the path that lies above the diagonal, and flip the portion of the path occurring after that

35

Page 36: Lecture Notes Helleloid Algebraic Combinatorics

M390C Algebraic Combinatorics Fall 2008 Instructor: Geir Helleloid

edge along a line parallel to y = x. The resulting path is in L2. Since every path in L2 mustcross y = x at some point, every such path can be obtained in this fashion in precisely oneway, and the flip is clearly reversible. This is the desired bijection.

The Catalan numbers have a great many different interpretations (Richard Stanley has alist of 166 interpretations). In the next example, we give bijective proofs of several of these,leading up to a rather nice result on counting triagulations.

Example.

Proposition 1.34. The following sets of objects are counted by the Catalan number Cn:

1. S1 = {lattice paths from (0, 0) to (n, n) with steps (0, 1) and (1, 0) that do not go above y = x}

2. S2 = {triangulations of a convex (n+ 2)-gon into n triangles by n− 1 non-intersecting diagonals}

3. S3 = {lattice paths from (0, 0) to (2n, 0) with steps (1, 1) and (1,−1) that do not go below y = 0}

4. S4 = {sequences i1, . . . , i2n of 1’s and -1’s with non-negative partial sums and total sum equal to zero}

5. S5 = {plane binary rooted tree with 2n+ 1 vertices}

Here a plane binary rooted tree is a rooted tree in which each interior node has one left childand one right child.

Proof. The bijection from S1 to S3 is achieved by rotating a lattice path 135 degrees andscaling by

√2. The bijection from S3 to S4 is achieved by recording a 1 for each northeast

step and a −1 for each southeast step. The bijection from S4 to S5 is more difficult. Givena plane binary rooted tree with 2n + 1 vertices, traverse it in depth-first search order, andeach time a vertex is visited, except the last, record the number of children minus one. Thisgives a sequence of 1’s and -1’s of length 2n with total sum 0. The fact that the partial sumsare non-negative follows from the fact that a vertex is visited before any of its children, andthe left subtree of a vertex has exactly one more leaf node than interior node. So traversinga vertex and its left subtree gives equal numbers of 1’s and -1’s, and the traversal can bedecomposed into several traversals of left subtrees, so all partial sums must be non-negative.It is clear how to reverse the map. (It is probably easier to go from S1 to S5 directly.)

The bijection from S2 to S5 is neat. Fix an edge e in the (n+2)-gon. Given a triangulationT , form a tree G in which the edges of the triangulation are the vertices of G, the root isthe vertex corresponding to e, the left and right child are the other two edges of the trianglecontaining e, and so on. This gives a plane binary rooted tree with 2n + 1 vertices, andreversing the bijection is easy.

Example. The next common class of bijections involves partitions and Ferrers diagrams.Given an integer partition λ = (λ1, λ2, . . . , λk), where λ1 ≥ λ2 ≥ · · · ≥ λk > 0, its Ferrersdiagram representation consists of k rows of left-justified dots with λi dots in row k. (Usingboxes in places of dots gives us the Young diagram, which will be integral to the second halfof this course.) We call k the length l(λ) of λ. Here are some basic partition bijections.

1. The number of partitions of n with length k equals the number of partitions withlargest part k.

36

Page 37: Lecture Notes Helleloid Algebraic Combinatorics

M390C Algebraic Combinatorics Fall 2008 Instructor: Geir Helleloid

Proof. The transpose λ′ of the partition λ is the partition whose Ferrers diagram isobtained from the Ferrers diagram of λ by reflecting across the diagonal y = −x. If λhas length k, then λ′ has largest part k.

2. The number of partitions of n into odd parts equals the number of partitions of n intodistinct parts.

Proof. Given a partition λ of n into odd parts, merge pairs of parts of equal size untilall parts are distinct. We can recover λ by breaking in half all even parts until thereare only odd parts left.

3. Euler’s Pentagonal Number Theorem says that

∞∏i=1

(1− xi) = 1 +∑j≥1

(−1)j(x

3j2−j2 + x

3j2+j2

).

The bijective proof of this result is our first example of the Involution Principle, whichlets us give “bijective” proofs when there is subtraction involved.

Theorem 1.35 (Involution Principle). Suppose a finite set S is the disjoint union ofa positive part S+ and a negative part S−. Suppose φ : S → S is an involution andsign-reversing (that is, either φ(x) = x or x and φ(x) are in different parts). Then

|S+| − |S−| = |Fixφ(S+)| − |Fixφ(S−)|.

Proof of Euler’s Pentagonal Number Theorem. The coefficient of xn on the left equalsthe number of partitions of n into distinct parts that have an even number of partsminus the number of partitions of n into distinct parts that have an odd number ofparts. Let the first set be S+ and the second set be S−. We must show that

|S+| − |S−| ={

(−1)n : n = j(3j ± 1)/20 : otherwise

We want to find a sign-reversing involution on S = S+∪S− with no fixed points exceptwhen n = j(3j ± 1)/2, in which case there will be one fixed point (in S+ or S− asappropriate.

Given a partition λ of length k in D0 ∪ D1, let p be the maximum index such thatλ1 = λ2 + 1 = λ3 + 2 = · · · = λp + (p− 1). If k = p and λk = p or λk = p+ 1, don’t doanything. Otherwise, if λk ≤ p, move the dots in the last row of λ to the ends of thefirst λk rows of λ. If λk > p, move the last dot in the first p rows to a new last row ofλ.

Clearly the new partition has all distinct parts and except in the last case, the parityof the length changed. The only partitions for which we don’t do anything have theform (λ1, λ1 − 1, · · · , λ1 − (k − 1)) with λ1 − (k − 1) = k or k + 1. There is one suchpartition if and only if n = j(3j ± 1)/2 for some j. This proves the theorem.

37

Page 38: Lecture Notes Helleloid Algebraic Combinatorics

M390C Algebraic Combinatorics Fall 2008 Instructor: Geir Helleloid

1.6 Lecture 7 (Thursday, September 18): Bijections II

Our final type of basic bijection is a tree enumeration bijection.

Example.

Theorem 1.36 (Cayley’s Theorem). The number of trees (that is, connected graphs withn− 1 edges) on the vertex set [n] equals nn−2.

We will see an algebraic proof of this in a few lectures, but today we will see a bijectiveproof. There are several known ones; we will look at the proof using so-called Pr ufer codes.

Proof. We will define a bijection P from trees on [n] to [n]n−2. Let T be a tree on [n]. Weare going to define a sequence of trees T1, T2, . . . , Tn−1. Let T1 = T . Given Ti, which hasn− i+1 vertices, let xi be the vertex of degree one with the smallest label. Let it be adjacentto yi. Delete xi and edge xiyi from Ti to get Ti+1. Let P (T ) = (y1, y2, . . . , yn−2).

As an example, consider the tree T with edges (1, 2), (1, 5), (1, 7), (1, 10), (2, 3), (2, 4),(6, 7), (8, 10) and (9, 10). Then P (T ) = (2, 2, 1, 1, 7, 1, 10, 10, 10).

Note the following facts. First, V (Tk) = {xk, . . . , xn−1, n} and yn−1 = n. The edges ofTk are {xi, yi} for i ≥ k. Now, the number of times that v appears in {y1, . . . , yn−2} is thedegree of v in T minus one. By extension, the number of times that a vertex v in Tk appearsin {yk, . . . , yn−2} is the degree of v in Tk minus one. This implies that the degree one verticesof Tk are precisely those vertices which do not appear in {x1, . . . , xk−1, yk, . . . , yn−1}, andxk is that degree one vertex with the smallest l abel. Thus given P (T ), we can iterativelydetermine x1, x2, . . . , xn}, and this determines T . Finally, given an element of [n]n−2, thisprocedure produces a connected graph with n− 1 edges, which is a tree.

Now that we have seen some basic types of bijections, we further explore the one genericidea we have seen, the Involution Principle, in the context of determinants. Here is a morecomplicated application of the Involution Principle.

Example. In this example, we compute the Vandermonde determinant.

Theorem 1.37. Let

V =

xn−1

1 xn−12 · · · xn−1

n

xn−21 xn−2

2 · · · xn−2n

......

......

x1 x2 · · · xn1 1 · · · 1

.

Thendet(V ) =

∏1≤i<j≤n

(xi − xj) =: A.

Note that this theorem is easy to prove directly. Letting xi = xj, we see that thedeterminant becomes 0, so det(V ) must be divisible by xi − xj. Also, the determinant is apolynomial of total degree at most

(n−1

2

), so we know det(V ) is some constant multiple of∏

1≤i<j≤n (xi − xj). That this constant is 1 is easily determined. But the involution proofleads us into a general way to evaluate determinants bijectively.

38

Page 39: Lecture Notes Helleloid Algebraic Combinatorics

M390C Algebraic Combinatorics Fall 2008 Instructor: Geir Helleloid

Proof. On the one hand,

det(V ) =∑σ∈Sn

(sgnσ)xn−1σ(1) · · ·x

0σ(n).

On the other hand,

A =∑

(−1)mxa11 · · ·xann ,

where∑ai =

(n2

)and m = #{j : xj is taken from xi − xj}. (There are 2(n2) terms in this

sum.)We associate each summand in the expansion of A with a tournament. A tournament is

a directed graph on the vertex set [n] with exactly one of the arcs (i, j) and (j, i) for eachi 6= j. The weight w(a) of an arc (i, j) is xi and the sign sgn(a) of the arc is +1 if i < j and−1 if i > j. The weight of a tournament T is w(T ) =

∏w(a) and the sign of a tournament

T is sgn(T ) =∏

sgn(a). The weight of a tournament is xa11 · · ·xann , where ai is the out-degree

of i. ThenA =

∑T

sgn(T )w(T ).

A tournament is transitive if whenever arcs (i, j) and (j, k) are present, so is (i, k). Thisdefines a unique permutation of [n], so there are n! transitive tournaments. If Tσ is thetransitive tournament corresponding to the permutation σ, then

w(Tσ) = xn−1σ(1) · · ·x

0σ(n)

and sgn(Tσ) = (−1)inv(σ) = sgn(σ). Thus

det(V ) =∑σ∈Sn

(sgnσ)w(Tσ).

Let S+ be the non-transitive tournaments with sign +1 and let S− be the non-transitivetournaments with sign −1. To show that det(V ) = A, it suffices to construct a sign-reversinginvolution on S = S+ ∪ S−.

Note that T is non-transitive if and only if there are vertices i and j with equal out-degrees. Given a non-transitive tournament, choose the least such i and then the least suchj. We may assume that T contains (i, j). For each vertex k 6= i, j, there are four possibilities:

1. T contains (i, k) and (k, j).

2. T contains (k, i) and (j, k).

3. T contains (i, k) and (j, k).

4. T contains (k, i) and (k, j).

Note that the number of k for which case (b) hold is one fewer than the number of k forwhich case (a) holds. Define φ(T ) be reversing the arc (i, j) and the other two arcs in thetriangle ijk for which case (a) or case (b) holds. Then the out-degree of each vertex in φ(T )is the same as in T , φ(T ) is non-transitive and φ is reversible, and φ is an involution. Also,φ reverses an odd number of edges, so the sign of φ(T ) and of T are different. Thus φ issign-reversing.

39

Page 40: Lecture Notes Helleloid Algebraic Combinatorics

M390C Algebraic Combinatorics Fall 2008 Instructor: Geir Helleloid

1.7 Lecture 8 (Tuesday, September 23): Bijections II

(Aigner [1, Section 5.4])

1.7.1 The Gessel-Viennot Lemma

The Gessel-Viennot Lemma (originally proved by Lindstrom) is a powerful result connectingdeterminants of matrices to paths in graphs/lattices. Let G = (V,E) be a finite acyclicdirected graph with weighted arcs. Let P be a directed path from v to w, and let theweight of P be w(P ) =

∏a∈P w(a), with w(P ) = 1 if the length of P is 0. Suppose that

V = {v1, . . . , vn} and W = {w1, . . . , wn} are two sets of vertices (not necessarily disjoint).To V and W we associate the path matrix M = (mij), where

mij =∑

P :vi→wj

w(P ).

A path system S from V to W consists of a permutation σ ∈ Sn and n paths Pi : vi → wσ(i).Let sgn(S) = sgn(σ). The weight of S is w(S) =

∏w(Pi). A path system is vertex-disjoint

if no two paths have a vertex in common. Let V DPS be the set of vertex-disjoint pathsystems from V to W .

Lemma 1.38 (Gessel-Viennot).

det(M) =∑

A∈V DSP

sgn(A)w(A).

Before we prove this, let’s see an example.

Example. We want to compute det(M), where mij =(m+i−1j−1

)for 1 ≤ i, j ≤ n. Consider the

lattice Z2 with arcs directed up and to the right with all weights 1. Let vi = (0,−m− i+ 1)and wj = (j−1,−j+1) for 1 ≤ i, j ≤ n. Then the number of paths from vi to wj is

(m+i−1j−1

).

On the other hand, the number of vertex-disjoint path systems is 1 (see drawing), so thedeterminant is 1.

Proof of the Gessel-Viennot Lemma. Expanding det(M) gives a sum over σ ∈ Sn with sum-mands

sgn(σ)m1σ(1) · · ·mnσ(n) = sgn(σ)

∑P1:A1→Bσ(1)

w(P1)

· · · ∑Pn:An→Bσ(n)

w(Pn)

.

So det(M) is the sum ∑A

sgn(A)w(A)

over all path systems from V to W . We need to show that∑B

sgn(B)w(B) = 0,

40

Page 41: Lecture Notes Helleloid Algebraic Combinatorics

M390C Algebraic Combinatorics Fall 2008 Instructor: Geir Helleloid

where the sum is over all non-vertex disjoint path systems B. We will define a sign-reversinginvolution on the set of non-vertex disjoint path systems B. Given B, find the smallest indexi such that Pi crosses some Pj; the first vertex x on Pi shared with another path; and thesmallest index j > i such that x is on Pj. Now swap the part of Pi and the part of Pj thatcome after x. This gives a new non-vertex disjoint path system and defines a sign-reversinginvolution.

Now we look at a very nice application of Gessel-Viennot. (If you are reading the notes,I know that this will be incomprehensible without the pictures drawn in class. Sorry.)

Example. We begin with a regular hexagon of side length n triangulated into equilateraltriangles of side length 1. A rhombus consists of two triangles with a common side. We wantto count the number of tilings of the hexagon by rhombi.

Associate to the hexagon a directed graph with each edge segment on the lower left edgeof the hexagon labeled by v0, . . . , vn−1 and each edge segment on the upper right edge ofthe hexagon labeled by w0, . . . , wn−1, and edges in the directed graph go at 90 degrees or 30degrees across one edge in the triangulation to the next edge.

The number of paths from Ai to Bj is

mij =

(2n

n+ i− j

).

On the other hand, rhombus tilings correspond bijectively to vertex-disjoint path systems.Just view the tiling in 3-D, shading in rhombi tilted up and to the right, and we get a seriesof white staircases from vi to wi. Thus the number of rhombus tilings is

det

((2n

n+ i− j

))1≤i,j≤n

.

Via row and column operations, one can show that

det

((2n

n+ i− j

))1≤i,j≤n

=n−1∏i=0

(2n+in

)(n+in

) .

41

Page 42: Lecture Notes Helleloid Algebraic Combinatorics

M390C Algebraic Combinatorics Fall 2008 Instructor: Geir Helleloid

42

Page 43: Lecture Notes Helleloid Algebraic Combinatorics

Chapter 2

Special Topics

In this chapter, we move from general methods/topics in enumerative combinatorics to morespecialized topics. These topics include permutation patterns, the Matrix Tree Theorem andthe BEST theorem, chromatic polynomials (including posets and Mobius inversion), the WZmethod, and asymptotic enumeration.

2.1 Lecture 9 (Thursday, September 25): Permutation

Patterns (Bona [2, Chapter 4])

One of the hot new areas in combinatorics, permutation patterns. One way to view permu-tation patterns is as a generalization of the inversion number of a permutation.

Definition. For k ≤ n, let π = π1π2 · · · πn be a permutation in Sn and let σ = σ1σ2 · · ·σk bea permutation in Sk. Then π contains the pattern σ if there is a subsequence πi1πi2 · · · πikof π such that i1 < · · · < ik and πia > πib if and only if σa > σb.

Example. Let σ = 21. Then each occurrence of σ in π corresponds to an inversion in π.

Example. Let σ = 123. Then each occurrence of σ in π corresponds to a subsequence ofthree (not necessarily consecutive) increasing entries.

One way to visualize a pattern in a permutation is to look at the matrix representationA = (aij) of π. We will define A by setting aij = 1 if π(j) = n+ 1− i and 0 otherwise; thisway, increasing subsequences correspond to a set of columns in which the heights of the 1’sare increasing from left-to-right. Each k × k permutation submatrix defines a permutationthat is contained (as a pattern) in π. Types of questions we can ask about permutationpatterns include:

1. How many π ∈ Sn do not contain σ?

2. How many times does π ∈ Sn contain σ?

3. How many distinct patterns does π ∈ Sn contain?

43

Page 44: Lecture Notes Helleloid Algebraic Combinatorics

M390C Algebraic Combinatorics Fall 2008 Instructor: Geir Helleloid

We will only discuss the first question. Several chapters on this subject can be found inBona [2]. Given a pattern σ, let Nn(σ) denote the number of π ∈ Sn that avoid (do notcontain) σ. The first result is easy.

Proposition 2.1. Let σ, σ′ ∈ Sk correspond to permutation matrices that are equivalentunder dihedral symmetries. Then Nn(σ) = Nn(σ′).

Proof. If π contains σ, and σ′ is obtained by (for example) rotating the permutation matrixof σ 180 degrees, then the permutation obtained from π by rotating the permutation ma-trix of π 180 degrees avoids σ′. This gives a bijection between σ-avoiding and σ′-avoidingpermutations.

The six patterns of length 3 are 123, 132, 213, 231, 312, and 321. Of these, 123 and 321have equivalent permutation matrices and 132, 213, 231, and 312 have equivalent permuta-tion matrices.

Theorem 2.2. Nn(123) = Nn(132).

This theorem shows that the number of permutations in Sn that avoid a given patternof length 3 is independent of the pattern!

Proof. We will define a bijection f from the set of 132-avoiding permutations to the set of123-avoiding permutations. Given a permutation π, its left-to-right minima are those entrieswhich are smaller than all of their predecessors. For example, the left-to-right minima inπ = 67341258 are {6, 3, 1}. Given π, let f(π) be the permutation obtained by keeping theleft-to-right minima of π fixed and rearranging the remaining entries in decreasing order.So f(67341258) = 68371542. Then f(π) is 123-avoiding since it is the union of two disjointdecreasing subsequences.

We claim that the left-to-right minima of π and f(π) are the same. Observe that tocompute f(π), we can imagine switching pairs of non-left-to-right minima in π that are notin decreasing order. Since this moves a smaller entry to the right and a larger entry to theleft, f cannot create new left-to-right minima (or destroy existing left-to-right-minima).

Now, to recover π, we keep the left-to-right minima of f(π) fixed and fill in the remainingentries from left to right by placing the smallest unplaced element that is largest than theleft-to-right minimum to the left of the given position. This is forced by the fact that π mustavoid 132.

Now we can compute Nn(132).

Theorem 2.3. Nn(132) = Cn, the n-th Catalan number.

Proof. For π to avoid 132, it must be that each entry to the left of n is larger than each entryto the right of n, and that the entries to the left of n are 132-avoiding as are the entries tothe right of n. This shows that

Nn(132) =n−1∑i=0

Ni(132)Nn−1−i(132).

Since N0(132) = 1 and N1(132) = 1, the two sequences satisfy the same recurrence and areequal.

44

Page 45: Lecture Notes Helleloid Algebraic Combinatorics

M390C Algebraic Combinatorics Fall 2008 Instructor: Geir Helleloid

While investigations into pattern-avoidance often do not have connections with otherareas of math, one of the original motivations for pattern avoidance comes from Knuth, whoshowed that the 231-avoiding permutations are precisely the stack-sortable permutations.

Given a permutation π = π1π2 · · · πn and a first-in, last-out stack, we begin by placingπ1 on the stack. At each step, we can put the next entry in π (reading left-to-right) on thestack or we can take the top element in the stack and add it to our output. Every entrymust be put on the stack at some point. Then π is stack-sortable if it can be passed througha stack with the elements removed in ascending order. If it is stack-sortable, the unique wayto sort it is if the next element to be added to the stack is greater than the top element ofthe stack, remove the top element, and otherwise add the next element.

For example, π = 85214376 is stack-sortable. We push 8,5,2, and 1, pop 1 and 2, push 4and 3, pop 3,4, and 5, push 7 and 6, and pop 6,7, and 8.

Theorem 2.4. The 231-avoiding permutations are the stack-sortable permutations.

Our proof follows the presentation by Julian West in his 1990 Ph.D. thesis.

Proof. If i < j and πi < πj, then πi has to be removed from the stack before πj is added. Ifi < k and πi > πk then πi must stay on the stack until πk has been removed. So if i < j < kand πk < πi < πj, πi must be removed before πj is added but after πk. But this is impossible,so a stack-sortable permutation cannot have a subsequence of type 231.

On the other hand, the algorithm fails to sort π only if we must remove an element fromthe top of the stack which is not the largest element that hasn’t been removed. Then thetop element of the stack is smaller than the next element to be added but larger than somelater element. These three elements form a 231 pattern. So a 231-avoiding permutation isstack-sortable.

For patterns of length greater than 3, questions about Nn(σ) becomes very difficult.However, there was a long-standing conjecture, recently proved, that shows that Nn(σ) issmall relative to n! for all σ.

Theorem 2.5 (Stanley-Wilf conjecture, proved by Marcus-Tardos). Let σ be any pattern.There exists a constant cσ so that for all positive integers n,

Nn(σ) ≤ cnσ.

Moreover,limn→∞

n√Nn(σ)

exists.

Doron Zeilberger has “interpreted” the proof in terms of “love-patterns”. That feels alittle too silly for me to present, so I will interpret the proof in terms of American andEuropean golfers.

Proof. We first prove the Furedi-Hajnal Conjecture, then show that the Furedi-Hajnal Con-jecture implies the Stanley-Wilf Conjecture. We say that a 0-1 matrix M contains a permu-tation σ if there is a k×k submatrix of M that has a 1 in every position that the permutation

45

Page 46: Lecture Notes Helleloid Algebraic Combinatorics

M390C Algebraic Combinatorics Fall 2008 Instructor: Geir Helleloid

matrix of σ has a 1 in (the submatrix may have additional ones as well). The Furedi-HajnalConjecture states: let σ be a permutation matrix and let f(n, σ) be the maximum numberof 1s in a σ-avoiding n× n 0-1 matrix. Then there is a constant dσ so that f(n, σ) ≤ dσn.

For our interpretation of this problem, we have a 0-1 matrix M in which the rows corre-spond to n American golfers ranked by ability and the columns correspond to n Europeangolfers ranked by ability. Each 1 corresponds to a match between an American golfer anda European golfer. For a fixed σ, we want to avoid the match-pattern σ, that is, we forbidthere to be k American golfers and k European golfers such that the i-th best Americangolfer (amongst these k) plays the σ(i)-th best European golfer. What is the maximumnumber of matches f(n, σ)?

Suppose a divides n. Divide the American golfers (respectively, the European golfers)into n/a teams, where the a best American (resp. European) golfers are on the first teamand so on. A team plays a team from the opposite continent if there is at least one matchbetween players of the teams. A team challenges a team from the opposite continent if thereare at least k players from the first team who play against players on the other team.

The total number of pairs of teams that play each other is f(n/a, σ) (else σ would occurin the full matrix). If a team were to challenge k other teams because of the k players onthe challenging team, then M would contain k rows and k blocks containing those rows inwhich each block contains a one in each of the k rows. In this case, any k× k pattern wouldbe contained in M , including σ. Therefore, since M avoids σ, a team can challenge fewerthan k

(ak

)teams from the opposite continent. So there are at most k

(ak

)· 2 · n

achallenges.

The number of matches between teams that play each other but do not challenge eachother is at most (k − 1)2. The number of matches between teams that challenge each otheris at most a2. Thus

f(n) ≤ (k − 1)2f(n/a) + 2ak

(a

k

)n.

Letting a = k2, we find

f(n) ≤ 2k4

(k2

k

)n.

So we can take dσ = 2k4(k2

k

), proving the Furedi-Hajnal Conjecture.

As for the Stanley-Wilf conjecture, let N ′n(σ) be the number of n × n 0-1 matrices thatavoid σ. Clearly Nn(σ) ≤ N ′n(σ). But to construct a n× n 0-1 matrix that avoids σ, we dothe following. Divide the golfers into teams of 2. We can choose the teams that play eachother in at most N ′n/2(σ) ways, and there will be at most dσ(n/2) pairs of teams that playeach other. We can choose how the players within a pair of teams play each other in 15ways. So N ′n(σ) ≤ N ′n/2(σ)15dσ(n/2), which implies that Nn(σ) ≤ N ′n(σ) ≤ 15dσn.

Finally, to show that the limit exists, we claim that Nn(σ)Nm(σ) ≤ Nm+n(σ). Indeed, wemay assume that k precedes 1 in σ. Given π ∈ Sn and τ ∈ Sm that avoid σ, construct a newpermutation in Sm+n that avoid σ by concatenating π and τ and adding n to each elementof τ . This injection shows that Nn(σ)Nm(σ) ≤ Nm+n(σ), so Nn(σ) is monotone increasingand bounded, so the limit exists.

The value oflimn→∞

n√Nn(σ)

46

Page 47: Lecture Notes Helleloid Algebraic Combinatorics

M390C Algebraic Combinatorics Fall 2008 Instructor: Geir Helleloid

for various σ has been investigated. When σ has length 3, the limit is 4. There are casesknown in which the limit is 8 or 9, which prompted a conjecture that the limit is always aninteger, but Bona demonstrated a pattern σ for which this limit is not rational.

2.2 Lecture 10 (Tuesday, September 30): The Matrix

Tree Theorem (Stanley [5, Section 5.6])

Graph theory began when Euler solved the Konigsberg bridge problem in 1735. Seven bridgesconnected parts of the city of Konigsberg (see Figure ???) and the populace wondered ifthey could walk over each bridge once and return to their starting spot. Euler viewed this asa (multi-)graph problem, with parts of the city represented by vertices and bridges by edges.He proved that such a walk was not possible; in modern terminology, there is no Euleriantour. We will prove this on Thursday.

*** Draw bridges and graph analogue ***

The next major development in graph theory came in the mid-1800s. Although theknight’s tour problem (which asks for a sequence of knight’s moves on a chessboard thatvisits each square once and returns to the starting square) has been around since the 1400s,19th-century mathematicians were the first to view it as a problem about visiting eachvertex in a graph once and returning to the starting vertex; in modern terminology, findinga Hamiltonian cycle. Hamilton even marketed the “Icosian” game, which essentially askedfor a Hamiltonian cycle on the vertices and edges of a dodecahedron.

The late 1800s saw increasing interest in (what is now called) graph theory, from the 4-color conjecture in 1852 to the enumeration of graphs by Cayley and Sylvester for applicationsin chemistry. (Sylvester was the first to use the word “graph” in this context.) Graph theoryresearch exploded in the 1900s, particularly with the rise of computer science and electronics.

We begin the next two lectures by stating and proving beautiful formulas for countingthe number of spanning trees in a graph (the Matrix Tree Theorem) and the number ofEulerian tours in a digraph (the BEST Theorem). Then we apply the Matrix Tree Theoremin a couple enumerative contexts. Further applications of the Matrix Tree Theorem and theBEST Theorem appear on the homework.

2.2.1 Spanning Trees and the Matrix Tree Theorem

Let G be a general graph. Recall that a spanning tree of G is a subgraph H that is a treeand V (H) = V (G). In a directed graph, an oriented spanning tree with root v is a spanningtree in which all arcs point towards v. Spanning trees have many applications in computerscience and other fields; often a network is represented by a graph and one wants to connectall nodes on the network as cheaply as possible by finding a spanning tree of the network.One of the homework problems addresses this. More importantly for our purposes, spanningtrees turn up in a number of enumeration problems. In this section, we prove the MatrixTree Theorem, which gives us a determinantal formula for the number of spanning trees ina graph, and then apply it in later sections.

Let D be a directed graph (digraph). The adjacency matrix A(D) of D is the n × n

47

Page 48: Lecture Notes Helleloid Algebraic Combinatorics

M390C Algebraic Combinatorics Fall 2008 Instructor: Geir Helleloid

matrix (aij) whereaij = the number of arcs from vi to vj.

Let T (D) = (tij) be the n × n diagonal matrix with tii = outdeg(vi). The LaplacianL(D) of D is the matrix L(D) = T (D) − A(D). Note that L(D) is independent of theloops in D. For example, let D be the digraph on vertices V = {v1, v2, v3, v4} with arcs{(v1, v2), (v3, v2), (v4, v3), (v4, v1), (v4, v2)}. Then

A(D) =

0 1 0 00 0 0 00 1 0 01 1 1 0

T (D) =

1 0 0 00 0 0 00 0 1 00 0 0 3

L(D) =

1 −1 0 00 0 0 00 −1 1 0−1 −1 −1 3

The adjacency matrix and the Laplacian are ubiquitous in graph theory; we will see just

a few of their uses. Let Li(D) be the matrix obtained by deleting row i and column i fromL(D).

Let G be a graph. The adjacency matrix A(G) of G is the n× n matrix (aij) where

aij =the number of edges between vi and vj : i 6= j

twice the number of loops at vi :

Let T (G) = (tij) be the n × n diagonal matrix with tii = deg(vi). The Laplacian L(G) ofG is the matrix L(G) = T (G) − A(G). Again, L(G) is independent of the loops in G. Forexample, let G be the graph on V = {v1, v2, v3, v4} with edges {v1v2, v1v4, v2v3, v2v4, v3v4}.Then

A(G) =

0 1 0 11 0 1 10 1 0 11 1 1 0

T (G) =

2 0 0 00 3 0 00 0 2 00 0 0 3

L(G) =

2 −1 0 −1−1 3 −1 −10 −1 2 −1−1 −1 −1 3

48

Page 49: Lecture Notes Helleloid Algebraic Combinatorics

M390C Algebraic Combinatorics Fall 2008 Instructor: Geir Helleloid

Let t(G) be the number of spanning trees in G and let t(D, v) be the number of orientedspanning trees of D rooted at v.

Theorem 2.6 (Matrix Tree Theorem). Let G be a general graph. Then t(G) = det(Li(G))spanning trees (this determinant is independent of i).

There is a matrix-theoretic proof of the Matrix Tree Theorem using the Cauchy-BinetTheorem that can be found in Van Lint and Wilson [6], and there is a proof using the Gessel-Viennot Lemma in Aigner [1], but we will prove it as a corollary of a more general theoremabout directed graphs that can be proved by induction.

Theorem 2.7. Let D be a digraph with vertex set V = {v1, . . . , vn}. Then t(D, vi) =det(Li(D)).

Before proving this theorem, we show that the Matrix Tree Theorem is a corollary.

Proof of the Matrix Tree Theorem as a corollary of Theorem 2.7. Given a general graph G,form a digraph D by replacing each edge vivj in G by arcs (vi, vj) and (vj, vi). Then L(G) =L(D). Furthermore there is a bijection between spanning trees of G and oriented spanningtrees of D with root vi (given a spanning tree of G, orient all edges toward vi to obtain anoriented spanning tree of D with root vi, and given an oriented spanning tree of G with rootvi, unorient the edges to obtain a spanning tree of G). So t(G) = t(D, vi). By Theorem 2.7,the number of spanning trees of G equals det(Li(D)) = det(Li(G)).

Now we prove Thereom 2.7.

Proof of Theorem 2.7. The proof is by induction on the number of arcs in D. If D is notconnected, then the number of spanning trees is 0, while if D1 is the component containingvi and D2 is the rest, then det(Li(D)) = det(Li(D1)) det(L(D2)) = 0. (The Laplacian hasdeterminant 0 since 1 is a right eigenvector.) If D has n − 1 edges, then D is a tree as anundirected graph. If D is not an oriented tree with root vi, then some vertex vj 6= vi hasout-degree 0, Li(D) has a row of zeroes, and the determinant is 0. Otherwise, if D is anoriented tree with root vi, then Li(D) is (conjugate to) an upper triangular matrix with 1’son the diagonal and has determinant 1.

If D has m > n − 1 arcs, we may assume that vi has no outgoing arcs (these are notcontained in any spanning tree with root v and do not affect det(Li(D)). Then some vertexvj 6= vi has outdegree at least two; choose one outgoing arc e. Let D1 be D with e removedand let D2 be D with other outgoing arcs from vj removed. By induction, det(Li(D1))and det(Li(D2)) equal the number of oriented spanning trees rooted at vi in the respectivegraphs. The number of such trees in D is the sum of these two numbers. But also

det(Li(D)) = det(Li(D1)) + det(Li(D2))

by the multilinearity of the determinant, since D equals D1 and D2 except in row j, and rowj in D is the sum of row j in D1 and row j in D2.

Lemma 2.8. Let M be an n×n matrix with row and column sums 0. Let Mij be the matrixobtained by removing i-th row and j-th column. Then the coefficient of x in det(M − xI) is(−1)i+j+1p det(Mij).

49

Page 50: Lecture Notes Helleloid Algebraic Combinatorics

M390C Algebraic Combinatorics Fall 2008 Instructor: Geir Helleloid

Proof. We prove this for i = j = n. Add all rows of M − xI except the last row to thelast row. This changes all entries of the last row to −x. Factor that −x out, giving N(x)with det(M − xI) = −x det(N(x)). The coefficient of x in det(M − xI) is − det(N(0)).Add all the columns of N(0) except the last column to the last column. The last columnof N(0) becomes the column vector [0, . . . , 0, p]. Expanding by the last column shows thatdet(N(0)) = p det(Mnn).

Corollary 2.9. If D is a balanced digraph (each vertex has the same in-degree and out-degree), and the eigenvalues of L(D) are µ1, . . . , µn = 0, then the number of oriented spanningtrees rooted at v is µ1 · · ·µn−1/n.

Example. Let G = Kn, the complete graph on n vertices. Then L(G) = nI − 1. Since 1has eigenvalue 0 with multiplicity n− 1 and eigenvalue n with multiplicity 1, then L(G) haseigenvalue n with multiplicity n− 1 and eigenvalue 0 with multiplicity 1. So the number ofspanning trees of G is nn−2 (Cayley’s Theorem).

2.3 Lecture 11 (Thursday, October 2): The BEST The-

orem (Stanley [5, Section 5.6])

2.3.1 The BEST Theorem

An Eulerian tour in a graph (respectively, a digraph) is a closed walk (respectively, directedwalk) which uses every edge (respectively, arc) exactly once. A graph or a digraph with anEulerian tour is called Eulerian. The Konigsberg bridge problem asks for an Eulerian tourin a multigraph with 4 vertices and 7 edges.

*** Insert diagram ***There is a simple necessary and sufficient criterion for the existence of an Eulerian tour

in a digraph.

Theorem 2.10. A digraph D without isolated vertices is Eulerian if and only if it is con-nected and indeg(v) = outdeg(v) for all vertices v ∈ V (D).

Proof.

The analogous theorem for graphs is proved in precisely the same way, so we will statethe theorem and omit the proof.

Theorem 2.11. A graph G without isolated vertices is Eulerian if and only if it is connectedand every vertex has even degree.

50

Page 51: Lecture Notes Helleloid Algebraic Combinatorics

M390C Algebraic Combinatorics Fall 2008 Instructor: Geir Helleloid

Now that we have settled the existence question for Eulerian tours, we turn to enumer-ating them. It turns out that counting Eulerian tours in a graph is hard (more formally,it is #P-complete). Counting Eulerian tours in a digraph is made possible by the BESTTheorem, named after de Bruijn, van Aardenne-Ehrenfest, Smith and Tutte.

Theorem 2.12 (BEST Theorem). Let D be an Eulerian digraph. Let e be an arc with initialvertex v. Let t(D, v) be the number of oriented spanning trees rooted at v and let e(D, e) bethe number of Eulerian tours starting with e. Then

e(D, e) = t(D, v)∏v∈V

outdeg(v)− 1)!.

Proof. Given a tour E = e1, . . . , em, for each u 6= v, let e(u) be the last exit from u in thetour. We claim

1. The vertices of D and the arcs e(u) form an oriented spanning tree T with root v.

2. Given an oriented spanning tree with root v, we can construct∏

v∈V outdeg(v)− 1)!Eulerian tours.

To prove (1), just observe that

1. T has n− 1 edges.

2. T does not have two arcs going out of the same vertex.

3. T does not have an arc going out of v.

4. T does not have a cycle.

To prove (2), given T , we construct an Eulerian tour by starting at e and continue to chooseany edge possible except we don’t choose f ∈ T unless we have to. The set of last exits ofthe tour coincide with the set of edges of T . The only way to get stuck is to end at v withno more exits available, but with some edge unused. If so, some unused edge must be a lastexit edge (that is, an edge in T ). Let u be a vertex closest to v such that f ∈ T outgoingfrom u is not in the tour. Let y be the endpoint of f . If y 6= v, since we enter y as oftenas we leave, we don’t use the last exit from y. Thus y = v. But then we can leave v, acontradiction.

2.3.2 De Bruijn Sequences

One problem solved by the BEST Theorem and the Matrix Tree Theorem is the enu-meration of de Bruijn sequences. A de Bruijn sequence of degree n is a binary sequenceA = a1a2 . . . a2n so that each binary sequence of length n appears exactly once as the subse-quence aiai+1 . . . ai+n−1 (indices taken modulo n) as i ranges from 1 to 2n.

We begin by constructing a digraph Dn and a bijection between de Bruijn sequences ofdegree n and pairs (E, e) where E is an Eulerian tour in Dn and e is an edge in Dn. Thevertices of Dn are the 2n−1 binary sequences of length n− 1. There is an arc from the vertex

51

Page 52: Lecture Notes Helleloid Algebraic Combinatorics

M390C Algebraic Combinatorics Fall 2008 Instructor: Geir Helleloid

a1 · · · an−1 to the vertex b1 · · · bn−1 if a2 · · · an−1 = b1 · · · bn−2. For convenience, label the arcwith bn−1. Each vertex in Dn has indegree 2 and outdegree 2 and Dn is connected, so Dn isEulerian and has 2n edges.

*** Draw de Bruijn graph ***Given an Eulerian tour E in Dn, concatenate the edge labels in Dn in the order that

they appear as E is traversed starting at e. It is easy to see that we obtain a de Bruijnsequence and that each de Bruijn sequence arises in this way. Therefore the number of deBruijn sequences of degree n equals 2n times the number of Eulerian tours in Dn. By theBEST Theorem and the Matrix Tree Theorem, the number of de Bruijn sequences of degreen equals 2n det(L1(Dn)).

It only remains to compute det(L1(Dn)). It is possible to compute this determinant byrow and column operations, but the following argument is more attractive. One reason thatthe adjacency matrix of a digraph is so useful comes from this proposition:

Proposition 2.13. Let A be the adjacency matrix of a digraph D. Then the (i, j)-entry inAk equals the number of directed walks from vi to vj of length k.

The digraph Dn has a particularly nice property.

Proposition 2.14. Let u and v be any two vertices of Dn. Then there is a unique directedwalk from u to v of length n− 1.

Let A be the adjacency matrix of D. These two propositions imply that An−1 is the2n−1 × 2n−1 matrix of ones. But the eigenvalues of that matrix are 2n−1 (with multiplicityone) and 0 (with multiplicity 2n−1 − 1). Thus the eigenvalues of A are 2 (with multiplicity1) and 0 (with multiplicity 2n−1 − 1).

2.4 Lecture 12 (Tuesday, October 7): Abelian Sand-

piles and Chip-Firing Games

Today’s topic is just a fun one that has a connection to the Matrix Tree Theorem. A surveyof chip-firing can be found on the arXiv co-authored by Alexander Holroyd, Lionel Levine,Karola Meszaros, Yuval Peres, James Propp, and David Wilson.

Let G be a finite directed graph on n + 1 vertices. Let s be a vertex with out-degree0 such that from every other vertex there is a directed path leading to s; call s a (global)sink. Let V ′(G) = V (G) \ {s} = {v1, . . . , vn} and let di be the outdegree of vi. Let a chipconfiguration be a map σ : V (G) → Nn; we think of C as representing stacking σ(v) chipson the vertex v for each non-sink vertex v. If σ(v) ≥ outdeg(v), that is, there are at leastas many chips on v as outgoing arcs of v, then v is unstable and can fire, which removesoutdeg(v) chips from v and puts one chip on w for each arc (v, w) in G (if w 6= s). This givesa new chip configuration. Note that the total number of chips decreases by the number ofarcs (v, s) in G. A chip configuration is stable if no vertex can fire and unstable otherwise.

A firing sequence F = (σ = σ0, (σ1, v1), (σ2, v2), . . . , (σm, vm)) is a sequence of firingson σ, where σi is the chip configuration obtained from σi−1 by firing vi. Relaxing a chipconfiguration means that we fire vertices until we have reached a stable configuration. It iseasy to see that a stable configuration is reached in finitely many steps.

52

Page 53: Lecture Notes Helleloid Algebraic Combinatorics

M390C Algebraic Combinatorics Fall 2008 Instructor: Geir Helleloid

Proposition 2.15. Given a chip configuration σ, there is a unique stable configurationachieved from relaxing σ (that is independent of the choice and order of firings in the relax-ation process). Call this stable configuration the relaxation of σ and denote it by R(C).

Proof. Let F = (σ = σ0, (σ1, v1), (σ2, v2), . . . , (σm, vm)) be a firing sequence with σm stable,and let F ′ = (C = C0, (σ

′1, v′1), (σ

′2, v′2), . . . , (σ

′l, v′l)) be some other firing sequence. I claim

that l ≤ m and no vertex fires more times in F ′ than in F .Suppose we have a counterexample with m + l minimal. Since v′1 is unstable in σ, it

must be fired at some point in F ; say vi = v′1. Then firing vi, v1, v2, . . . , vi−1, vi+1, . . . , vm inorder generates a valid firing sequence with final stable configuration σm. Thus the firingsequences generated by v1, v2, . . . , vi−1, vi+1, . . . , vm and v′2, . . . , v

′l form a smaller minimal

counterexample. This is a contradiction.

There is an addition operation on chip configurations. If σ1, σ2 : V (G) → Nn are chipconfigurations, then σ1 + σ2 is the chip configuration for which (σ1 + σ2)(v) = σ1(v) + σ2(v);that is, you just combine the two piles of chips at each vertex. Now, generalize to chipconfigurations that allow negative numbers of chips at vertices, that is, maps σ : V ′(G)→ Zn

and allow any vertex to fire. Generalized chip configurations form a group under + naturallyisomorphic to Zn. Let S be the generalized chip configurations that can be obtained fromthe configuration of all zeroes by a sequence of firings. Note that S is a subgroup under +.

Definition. The chip-firing group CF (G) of G is the quotient Zn/S. We think of the cosetsas being equivalence classes of generalized configurations under firing.

At the moment, it is not clear why CF (G) is interesting, but we can compute |CF (G)|!Let L(G) be the Laplacian of G and let Ls(G) be the n× n matrix obtained from L(G) bydeleting the row and column corresponding to the sink s. Then the rows of Ls(G) generateS as a sublattice of Zn. But the index of S in Zn equals the volume of the parallelopipedwhose edges are the rows of Ls(G), and this volume is equal to the determinant of Ls(G).The Matrix Tree Theorem implies the following theorem.

Proposition 2.16. The order of the chip-firing group CF (G) equals the number of orientedspanning trees of G rooted at s.

So what’s interesting about the chip-firing group? Define a new addition operation ⊕ onchip configurations (with non-negative numbers of chips, where σ1 ⊕ σ2 = R(σ1 + σ2).

Definition. A chip configuration σ is recurrent if, given any other chip configuration σ′, wecan arrive at σ by selectively adding chips at vertices and firing unstable vertices.

Note that adding chips at vertices and firing unstable vertices are commuting operations,so we can do things in any order. Also, any configuration with at least dvi − 1 chips at eachvertex is recurrent.

Theorem 2.17. Each coset of Zn/S (that is, each equivalence class under firing) containsexactly one stable, recurrent configuration. The group Zn/S is (isomorphic to) the group ofstable, recurrent configurations under ⊕.

53

Page 54: Lecture Notes Helleloid Algebraic Combinatorics

M390C Algebraic Combinatorics Fall 2008 Instructor: Geir Helleloid

Before we prove the theorem, note if G has arcs (v, w) and (w, v), no configuration withzero chips at v and at w is recurrent, since if we start with a configuration with a positivenumber of chips at v and w, there is no way to arrive at a configuration in which neither vnor w has any chips.

Let δ be the chip configuration with d(v) = outdeg(v) for all v.

Lemma 2.18. Each coset contains a stable chip configuration.

Proof. Note that δ−R(δ) has at least one chip on each vertex and is in S. Given any α ∈ Zn,let m be the minimum of all coordinates of α and 0, so that m ≤ 0. Then

β = α + (−m)(δ −R(δ))

has nonnegative entries and is equivalent to α. So β is a stable configuration in the coset ofα.

Lemma 2.19. Each coset contains a stable, recurrent configuration.

Proof. Again, let α ∈ Zn with m equal to the minimum of the coordinates of α and 0. Letd be the maximum outdegree in G. Then α is equivalent to

β = α + (d−m)(δ −R(δ)).

Every entry in β is at least d. Then β is recurrent: given any configuration σ, compute R(σ),then add chips at each vertex until we arrive at β. Then R(β) is stable and recurrent.

Lemma 2.20. Let ε = (2δ)−R(2δ). If σ is stable and recurrent, σ = R(σ + ε).

Proof. Choose γ so that σ = R(δ + ζ). Let

γ = (δ + ζ) + ε = (δ + ζ) + (2δ −R(2δ)).

Since ε has non-negative entries, we can fire all the unstable vertices in δ + ζ to get σ,then add ε and relax, so R(γ) = R(σ + ε). On the other hand, since δ − R(2δ) has non-negative entries, we can fire all the unstable vertices in 2δ, to get δ + ζ, which relaxes to σ,so R(γ) = σ.

Lemma 2.21. Each coset contains at most one stable, recurrent configuration.

Proof. Let σ1 and σ2 be stable, recurrent, equivalent configurations. Write

σ1 = σ2 +n∑i=1

ciRi,

where Ri is the i-th row in Ls(G). Let J+ = {i : ci > 0} and J− = {i : ci < 0}. Then

σ := σ1 +∑i∈J−

(−ci)Ri = σ2 +∑i∈J+

(ci)Ri.

Choose k so that σ′ = σ+ kε has at least |ci|dvi chips at vertex vi. From σ′, we can fire eachvi in i ∈ J− a total of −ci times, resulting in σ1 + kε. But R(σ′ + kε) = σ1 by the abovelemma. Similarly, R(σ′ + kε) = σ2. Thus σ1 = σ2.

54

Page 55: Lecture Notes Helleloid Algebraic Combinatorics

M390C Algebraic Combinatorics Fall 2008 Instructor: Geir Helleloid

Now we can ask questions about CF (G). What is the identity element? (This is one ofmy favorite mathematical questions.) What is the inverse of a group element? What is thestructure of CF (G) as a finite abelian group? Is there a bijection between oriented spanningtrees and group elements?

We can compute the identity via Id = R((2δ − 2) − R(2δ − 2)). We can compute theinverse of σ via σ−1 = R((3δ−3)−R(3δ−3)−σ). But this doesn’t tell us what these elementslook like as chip configurations. There are now several known bijections with spanning trees,but I won’t discuss them here.

Finally, it turns out we can compute the structure of CF (G).

Proposition 2.22. Any m×n matrix L can be written L = ADB with A, B invertible withdeterminant ±1 and D a diagonal matrix with diagonal entries 0, . . . , 0, d1, . . . , dk such thatdi|di+1. This is called the Smith normal form of L and the di are the invariant factors.

Theorem 2.23. Let d1, . . . , dk be the invariant factors of Ls(G). Then

CF (G) ∼= Z/d1Z× · · · × Z/dkZ.

Example. Let G = Kn+1, the complete graph on n+ 1 vertices, with one vertex designatedas the sink. Then

Ls(G) = (n+ 1)I − J =

(n −ut−u (n+ 1)I ′ − J ′

),

where J is the n × n matrix of all 1’s, I ′ is the (n − 1) × (n − 1) identity matrix, J ′ is the(n− 1)× (n− 1) matrix of all 1’s, and u is the (n− 1)× 1 matrix of all 1’s. Define

R1 =

(1 ut

u I + J

)and

R2 =

(1 −ut0 I

).

Then

R1Ls(G)R2 =

(1 00 (n+ 1)I

).

Thus CF (G) ∼= (Z/(n+ 1)Z)n+3.

2.5 Lecture 13 (Thursday, October 9): Mobius Inver-

sion and the Chromatic Polynomial (Stanley [4,

Chapter 2])

We are going to start today with another graph theory topic with algebraic connections: thechromatic polynomial. Let G be a graph on n vertices and suppose [k] is a set of colors. Aproper coloring of G is a map κ : V (G)→ [k] such that if v and w are adjacent, κ(v) 6= κ(w).

55

Page 56: Lecture Notes Helleloid Algebraic Combinatorics

M390C Algebraic Combinatorics Fall 2008 Instructor: Geir Helleloid

Definition. Let χG(k) be the number of proper colorings of G using k colors. The smallestk for which χG(k) 6= 0 is called the chromatic number χ(G) of G.

The 4-Color Theorem says that if G is planar (that is, can be drawn in the plane withno two edges crossing), then χ(G) ≤ 4.

Proposition 2.24. The quantity χG(k) is a polynomial in k. Thus it makes sense to call itthe chromatic polynomial of G.

Proof 1. Let ei be the number of proper colorings of G using exactly i colors. Then

χG(k) =n∑i=1

ei

(k

i

)is a polynomial in k.

How can we compute χG(k)? Our first method is to find a recurrence for χG(k). Considertwo ways to reduce a graph to a smaller graph:

1. Deletion: If e is an edge in G, delete it to obtain the graph G \ e.

2. Contraction: If e is an edge in G, remove it and contract (identify) the two endpointsinto one vertex to get G/e.

Theorem 2.25.

χG(k) = χG\e(k)− χG/e(k)

Proof. Let C1 be the set of k-colorings of G, let C2 be the set of k-colorings of G \ e, and letC3 be the set of k-colorings of G/e. We describe a bijection between C2 and C1 ∪ C3. Let ehave endpoints v and w.

Given a coloring κ in C2, if v and w have different colors, then this is a proper coloring ofG. Otherwise, we obtain a proper coloring of G/e by using the same colors, with the vertexformed by contracting v and w colored the same color as v and w were in κ. This is clearlya bijection.

This is called the Deletion-Contraction Recurrence. The base case for this recurrence isthe graph on n vertices with no edges, which has chromatic polynomial kn. This recurrencenot only gives an alternate proof that χG(k) is a polynomial, but also shows the followingfacts.

Proposition 2.26. 1. χG(k) is a monic polynomial of degree n with integer coefficients.

2. The coefficients of χG(k) alternate in sign.

3. The coefficient of xn−1 is −m, where m is the number of edges in G.

There are alternative formulas for χG(k), and this forces us to detour into the world ofposets and Mobius inversion.

56

Page 57: Lecture Notes Helleloid Algebraic Combinatorics

M390C Algebraic Combinatorics Fall 2008 Instructor: Geir Helleloid

2.5.1 Posets and Mobius Inversion

A poset P (a partially ordered set) consists of a set (also denoted by P ) and a partial order≤, where a partial order ≤ on a set is a binary relation that satisfies:

1. (Reflexivity) For all x ∈ P , x ≤ x.

2. (Antisymmetry) If x ≤ y and y ≤ x, then x = y.

3. (Transitivity) If x ≤ y and y ≤ z, then x ≤ z.

If x < y and there is no z for which x < z < y, then we say y covers x. We oftenrepresent posets by their Hasse diagram, which is the graph whose vertices are the elementsof P , whose edges are the cover relations, and if x < y, then y is drawn above x. Here aresome important examples.

Example. The set [n] is a poset under the usual order on integers whose Hasse diagram isjust a path on n vertices.

Example. The set of subsets of [n] is a poset under inclusion whose Hasse diagram is then-dimensional hypercube (projected to two dimensions). This is often called the Booleanalgebra.

Example. The set of positive integral divisors of n forms a poset in which the partial orderis given by divisibility.

Example. The set of set partitions of [n] is a poset under refinement. That is, π ≤ σ if σis obtained from π by merging blocks of π.

Example. There are many posets on the set of permutations on Sn. One such poset is theinversion poset, in which π ≤ σ if σ is obtained from π by a sequence of transpositions thatput a smaller number after a larger number.

Definition. The poset P has a 0 if there is a (unique) minimum element of P . It has a1 if there is a (unique) maximum element of P . A chain in P is a sequence of elementsx0 < x1 < · · · < xk. The poset is graded if every maximal chain has the same length.Then there is a unique rank function ρ : P → {0, . . . , n} such that if y covers x, thenρ(y) = ρ(x) + 1.

Note that all of the above poset examples are graded (respectively, by size, size, numberof prime factors, number of blocks, and inversions).

Definition. The Mobius function of a finite poset P is the map µ : P × P → Z definedinductively by

µ(x, x) = 1

µ(x, y) = −∑x≤z<y

µ(x, z)

57

Page 58: Lecture Notes Helleloid Algebraic Combinatorics

M390C Algebraic Combinatorics Fall 2008 Instructor: Geir Helleloid

Theorem 2.27 (Mobius Inversion Formula). Let P be a finite poset and let f, g : P → C.Then

g(x) =∑y≤x

f(y), for all x ∈ P

if and only if

f(x) =∑y≤x

g(y)µ(y, x).

Corollary 2.28 (Dual Form). Let P be a finite poset and let f, g : P → C. Then

g(x) =∑y≥x

f(y), for all x ∈ P

if and only if

f(x) =∑y≥x

g(y)µ(x, y).

We will defer the proof until next time. Let’s see some applications.

Example. Let P be the natural poset on [n]. Then

µ(x, y) =

1 : x = y−1 : x = y − 1

0 : otherwise

Mobius inversion says that if

g(i) =i−1∑j=1

f(j),

thenf(i) = g(i)− g(i− 1).

Example. Let P be the set of subsets of [n]. Then

µ(T, S) = (−1)|S−T |.

(We can prove this by induction, for example.) Mobius inversion says that if

g(S) =∑T⊆S

f(T ), for all S ⊂ X,

thenf(S) =

∑T⊆S

(−1)|S−T |g(T ).

This is what is generally called the Principle of Inclusion-Exclusion. For example, recallthe problem of counting derangements, that is, how many permutations π ∈ Sn have nofixed points? Let Dn equal the number of derangements in Sn, let f(S) be the number of

58

Page 59: Lecture Notes Helleloid Algebraic Combinatorics

M390C Algebraic Combinatorics Fall 2008 Instructor: Geir Helleloid

permutations in Sn whose fixed points are exactly the elements of S, and let g(S) be thenumber of permutations in Sn that fix the elements of S. Then

g(S) =∑T⊇S

f(T ), for all S ⊂ X,

so

Dn = f(0) =∑T

g(T )µ(0, T ) =n∑i=0

(−1)i(n

i

)(n− i)! = n!

n∑i=0

(−1)i

i!.

2.5.2 Back to the Chromatic Polynomial

Define a poset P called the bond lattice of G. The elements of the poset are partitions ofV (G) into connected components (blocks). The partitions are ordered by refinement.

Example. Consider the graph G with edges ab, ac, bc, bd, cd. The bond lattice is a gradedposet with 1 element of rank 0, five elements of rank 1, six elements of rank 2, and oneelement of rank 3.

Given a partition π of V (G) into connected components, let χπ(k) be the number ofproper k-colorings of G in which

1. all vertices in a block have the same color

2. any two adjacent blocks have different colors

Then χG(k) = χ0(k). Note that

knumber of blocks in π =∑σ≥π

χσ(k).

Let f(σ) = χπ(k) and g(σ) = knumber of blocks in σ. By (dual) Mobius inversion,

χπ(k) =∑σ≥π

µ(π, σ)knumber of blocks in σ

χ0(k) =∑σ

µ(0, σ)knumber of blocks in σ

In the above example, χG(k) = k4 − 5k3 + 8k2 − 4.Next comes a surprising result about χG(−1), which we would not necessarily expect to

be meaningful.

Definition. An acyclic orientation of G is a way to orient each edge of G so that no cyclesare created. Let w(G) be the number of acyclic orientations of G.

Theorem 2.29 (Stanley).

χG(−1) = (−1)nw(G)

59

Page 60: Lecture Notes Helleloid Algebraic Combinatorics

M390C Algebraic Combinatorics Fall 2008 Instructor: Geir Helleloid

Proof. This holds trivially for G = En, the graph with no edges, since χEn(k) = kn andw(En) = 1. If we could show that w(G) = w(G \ e) + w(G/e), then by induction on thenumber of edges in G,

w(G) = w(G \ e) + w(G/e)

= (−1)nχG\e(−1) + (−1)n−1χG/e(−1)

= (−1)n(χG\e(−1)− χG/e(−1)

)= (−1)nχG(−1)

To prove the recursion for w(G), take an acyclic orientation of G. Removing e gives anacyclic orientation of G \ e. But if two acyclic orientations of G are the same except for theorientation of e, we get the same acyclic orientation of G \ e twice. In this case, if e = (x, y),this means there is no directed path between x, y in either direction. So we also get anacyclic orientation of G/e. Thus

w(G) = w(G \ e) + w(G/e).

2.6 Lecture 14 (Tuesday, October 14): The Chromatic

Polynomial and Connections

We begin with a proof of the Mobius Inversion Formula. The techniques used in this proofare not so important for us, but are useful for more detailed studies of posets.

Proof of Mobius Inversion. Define the incidence algebra of P over C by

I(P ) = {ξ : P × P → C : ξ(x, y) = 0 if x 6≤ y}.

This is an algebra under convolution, defined by

(ξ ∗ ν)(x, y) =∑x≤z≤y

ξ(x, z)ν(z, y).

This convolution is associative and

δ(x, y) =

{1 : x = y0 : otherwise

is a 2-sided identity.If ξ ∈ I(P ) has a 2-sided inverse ξ−1, then ξ(x, x)ξ−1(x, x) = (ξ ∗ξ−1)(x, x) = δ(x, x) = 1.

Thus ξ(x, x) 6= 0 for all x ∈ P . On the other hand, suppose ξ(x, x) 6= 0 for all x ∈ P . Thenit is easy to verify that if ξ−1 is defined inductively by

ξ−1(x, y) =

{1

ξ(x,x): x = y

1ξ(y,y)

(−sumx≤z<yξ−1(x, z)ξ(z, y)) : x 6= y,

60

Page 61: Lecture Notes Helleloid Algebraic Combinatorics

M390C Algebraic Combinatorics Fall 2008 Instructor: Geir Helleloid

then ξ−1 is a left inverse for ξ and, by associativity, a right inverse as well.In particular, µ is the inverse of

ζ(x, y) =

{1 : x ≤ y0 : x 6≤ y

and so µζ = δ.The algebra I(P ) acts on the right of CP by

(fξ)(x) =∑y≤x

f(y)ξ(y, x)

for f ∈ CP and ξ ∈ I(P ). Then fζ = g means

g(x) =∑y≤x

f(y),

but multiplying by µ on the right shows that f = gµ, so

f(x) =∑y≤x

g(y)µ(y, x).

2.6.1 The Graph Minor Theorem

One of the two most important results in graph theory (along with the 4-color theorem) overthe past 30 years is the Graph Minor Theorem by Robertson and Seymour. Though notdirectly connected to the chromatic polynomial, it does demonstrate that the operations ofedge deletion and edge contraction have a lot of significance.

Definition. A graph H is a minor of a graph G if H be can obtained from G by a sequenceof deletions and contractions.

Theorem 2.30 (Graph Minor Theorem). Any infinite family of graphs contains one that isa minor of another.

Corollary 2.31. Any minor-closed family F (that is, whenever G ∈ F , so are all the minorsof G) has a characterization of the form “G ∈ F if and only if the following finite list ofgraphs does not include any minors of G:”.

Example. 1. The minor-closed family of forests (graphs with no cycles) is the family ofgraphs that do not have C3 (the cycle on 3 vertices) as a minor.

Wagner’s Theorem The minor-closed family of planar graphs is the family of graphs that do not have K5

(the complete graph on 5 vertices) or K3,3 (the complete bipartite graph on two setsof three vertices) as a minor.

2. The minor-closed family of graphs that can be embedded in a torus has not been socharacterized, though it is known that a minimal list of avoided minors contains atleast 16,000 graphs.

61

Page 62: Lecture Notes Helleloid Algebraic Combinatorics

M390C Algebraic Combinatorics Fall 2008 Instructor: Geir Helleloid

2.6.2 Hyperplane Arrangements

One significant area of study related to combinatorics, reflection groups, Lie theory, etc. isthe study of hyperplane arrangements. A hyperplane arrangments in Rn is just a collectionof hyperplanes. For example, let A = {x = 0, y = 0, z = 0, x+ y = 0}. We can construct itsintersection poset L(A), whose elements are the intersections of 0 or more hyperplanes in Aordered by reverse inclusion. The characteristic polynomial of A is given by

χA(t) =∑

x∈L(A)

µ(0, x)tdim(x)

For the given example, χA(t) = t3 − 4t2 + 5t− 2.

Theorem 2.32. The quantity (−1)nχA(−1) is the number of regions that the hyperplanespartition A into. The quantity (−1)rank(A)χA(1) is the number of bounded regions, where therank of A is the dimension of the space spanned by the normal vectors to the hyperplanes inA.

In fact, the characteristic polynomial is a generalization of the chromatic polynomial.

Theorem 2.33. Let G be a graph with vertices v1, . . . , vn. Define the graphical arrangementA(G) in Rn corresponding to G to have the planes xi − xj = 0 for vivj ∈ E(G). ThenχA(G)(t) = χG(t).

Moreover, if Fq is a finite field and Aq, the arrangment A viewed as a set of hyperplanesover an Fq-vector space, has the same intersection poset as A, then χA(q) is the number ofpoints in Fnq that do not lie on any hyperplanes in Aq(!)

62

Page 63: Lecture Notes Helleloid Algebraic Combinatorics

Chapter 3

The Representation Theory of theSymmetric Group and SymmetricFunctions

Symmetric function theory and its connections to representation theory and algebraic ge-ometry have dominated the world of algebraic combinatorics over the past twenty years orso. There are (at least) two perspectives to take on symmetric function theory. On theone hand, it has long linked (tableaux) combinatorics to the representation theory of thesymmetric group (and the general linear group), and more recent research has developedcombinatorial theories for the representation theory of other groups. There are also deepconnections between symmetric functions and topics in algebraic geometry like the Schubertcalculus of Grassmanians and Hilbert schemes. On the other hand, tableaux combinatoricsand symmetric functions provide a unified setting in which to (for example) study enumer-ative questions about permutations, partitions, and topics like Polya theory.

Our goals in this chapter are four-fold. The first goal is to discuss the representationtheory of the symmetric group and present the RSK algorithm. This is a (reasonably)natural way to introduce tableaux. The second goal is to introduce symmetric functions andtheir algebraic properties and to connect them to the representation theory of the symmetricgroup. The third goal is to link the algebraic and combinatorial properties of symmetricfunctions with the combinatorics of permutations and other objects. The fourth and finalgoal is to use all of this theory to solve a variety of problems.

3.1 An Introduction to the Representation Theory of

Finite Groups (Sagan [3, Chapter 1])

This section presents the most basic facts about the representation theory of finite groups.It does not correspond to a lecture; rather, you should read it to prepare yourself for thelectures in this chapter. It is important to remember that we are only working with finitegroups; much of what is written in this section is not true for infinite groups. Representationtheory turns out to be fundamental in virtually all of mathematics; besides the combinatorialimplications, we will (eventually) discuss an application to probability theory (namely card

63

Page 64: Lecture Notes Helleloid Algebraic Combinatorics

M390C Algebraic Combinatorics Fall 2008 Instructor: Geir Helleloid

shuffling and the convergence rates of random walks).

3.1.1 Representations

Let G be a finite group. Given a (complex finite-dimensional) vector space V of dimensionn, let GL(V ) denote the group of invertible linear transformations of V . Then a (complex)representation of G of dimension n is a group homomorphism ρ : G→ GL(V ) (alternatively,a representation of G is a group action of G on a vector space). In this situation, V is acalled a G-module. We obtain a more concrete description of ρ if we choose a basis of V .Then, if dim(V ) = n, we can view ρ as a homomorphism ρ : G→ GLn(C), where GLn(C) isthe group of invertible n× n complex matrices. The only problem is that a different choiceof basis for V will give us a different homomorphism ρ : G → GLn(C). For example, letV = C3 and let S3 act on V by permuting the standard basis vectors e1, e2, and e3; thisdefines a homomorphism ρ into GL3(C). With respect to the standard basis, we find

ρ((1)(2)(3)) =

1 0 00 1 00 0 1

ρ((123)) =

0 0 11 0 00 1 0

ρ((12)(3)) =

0 1 01 0 00 0 1

ρ((13)(2)) =

0 0 10 1 01 0 0

ρ((132)) =

0 1 00 0 11 0 0

ρ((1)(23)) =

1 0 00 0 10 1 0

But with respect to the basis v1 = (1, 0, 0), v2 = (1, 1, 0), and v3 = (0, 1, 1), we find

ρ((1)(2)(3)) =

1 0 00 1 00 0 1

ρ((123)) =

−1 0 21 0 −10 1 1

ρ((12)(3)) =

−1 0 21 1 −10 0 1

ρ((13)(2)) =

1 0 0−1 0 11 1 0

ρ((132)) =

1 2 0−1 −1 11 1 0

ρ((1)(23)) =

1 2 00 −1 10 1 1

To correct this ambiguity, we say that two representations ρ and τ are isomorphic if thereis a (change-of-basis) matrix T ∈ GLn(C) such that τ(g) = T−1ρ(g)T for all g ∈ G.

For any group G, there is the trivial representation of dimension 1 corresponding to thetrivial action of G on a one-dimensional vector space.

64

Page 65: Lecture Notes Helleloid Algebraic Combinatorics

M390C Algebraic Combinatorics Fall 2008 Instructor: Geir Helleloid

3.1.2 Irreducible Representations

Given two representations ρ and τ , we can form a new representation ρ ⊕ τ by taking adirect sum. Under this new representation, (ρ ⊕ τ)(g) = ρ(g) ⊕ τ(g), where the direct sumof the two matrices ρ(g) and τ(g) is the block matrix with ρ(g) in the upper-left, τ(g) in thelower-right, and zeroes elsewhere. On the other hand, given a representation σ, suppose wecan find a change-of-basis matrix T so that σ is (isomorphic to) a representation in whicheach matrix σ(g) is a block matrix (with the same size blocks for each g). Then we candecompose σ as the direct sum of two smaller-dimensional representations. For example, letρ be the three-dimensional representation where S3 permutes the standard basis vectors inC3 and let τ be the trival representation. Then with respect to the standard basis, ρ⊕ τ isgiven by

(ρ⊕ τ)((1)(2)(3)) =

1 0 0 00 1 0 00 0 1 00 0 0 1

(ρ⊕ τ)((123)) =

0 0 1 01 0 0 00 1 0 00 0 0 1

(ρ⊕ τ)((12)(3)) =

0 1 0 01 0 0 00 0 1 00 0 0 1

(ρ⊕ τ)((13)(2)) =

0 0 1 00 1 0 01 0 0 00 0 0 1

(ρ⊕ τ)((132)) =

0 1 0 00 0 1 01 0 0 00 0 0 1

(ρ⊕ τ)((1)(23)) =

1 0 0 00 0 1 00 1 0 00 0 0 1

With respect to some other basis, these matrices would not be block matrices, but therepresentation would still be (isomorphic to) the direct sum of ρ and τ .

Irreducible representations are the building blocks for all representations. A representa-tion ρ is irreducible if it cannot be written as the direct sum of two representations. If ρ isnot irreducible, it is reducible. Maschke’s Theorem says that every representation of a finitegroup G has a unique decomposition (up to the ordering of the factors) into the direct sumof irreducible representations of G. Thus if we can understand all irreducible representationsof a group, we can understand all representations of a group.

One important representation of a group G is the regular representation. This represen-tation has dimension |G| and corresponds to G acting by left multiplication on C[g1, . . . , gn],the vector space of all complex linear combinations of the elements of G, also known as thegroup algebra of G. It turns out that the regular representation ρ of G decomposes intoirreducible representations as

ρ = ⊕τ an irreducible representation of G(dim τ)τ .

That is, the multiplicity of τ in ρ is dim(τ) for every irreducible representation of G. Inparticular this proves that the number of irreducible representations of G is finite and that

|G| =∑

τ an irreducible representation of G

(dim(τ))2.

65

Page 66: Lecture Notes Helleloid Algebraic Combinatorics

M390C Algebraic Combinatorics Fall 2008 Instructor: Geir Helleloid

In fact, the number of irreducible representations of G equals the number of conjugacy classesin G. Finally, given an irreducible G-module W and a G-module V , the multiplicity of Win V equals the dimension of Hom(W,V ).

3.1.3 Characters

It is well-known that the trace of a matrix is invariant under conjugation. Therefore, if ρ isa representation of G on the G-module V , the trace of ρ(g) does not depend on the basischosen for V . It makes sense, therefore, to define the character of ρ to be the functionχ : G → C, where χ(g) = Trace(ρ(g)). Moreover, χ is constant on the conjugacy classesof G. For example, for the 3-dimension representation τ of S3 given in Subsection 3.1.1,χ((1)(2)(3)) = 3, χ((123)) = χ((132)) = 0, and χ((12)(3)) = χ((13)(2)) = χ((1)(23)) = 1.The characters corresponding to irreducible representations are called irreducible characters.Since the number of irreducible characters equals the number of conjugacy classes of G, thecharacter values for G can be represented in a conjugacy table. Here is the table for S3, withthe rows labeled by irreducible characters and the columns labeled by (representatives of)conjugacy classes:

(1)(2)(3) (123) (12)(3)χ1 1 1 1χ2 1 1 −1χ3 2 −1 0

3.2 Lectures 16 and 17 (Tuesday, October 21 and Thurs-

day, October 23): The Irreducible Representations

of the Symmetric Group (Sagan [3, Chapter 2])

In this section we construct the irreducible representations of the symmetric group andlook at related results. The number of conjugacy classes in Sn is the number of partitionsof n, so we seek irreducible representations indexed by the partitions of n. We begin byconstructing a family of representations Mλ that are indexed by the partitions of n but arenot, in general, irreducible. We will find the Specht module Sλ as a submodule of Mλ andwe will show that Sλ is irreducible. We finish this section by finding a basis for Sλ and byfinding the decomposition of Mλ into Specht modules.

3.2.1 Constructing the Irreducible Representations (Sagan [3, Sec-tion 2.1])

We build up to the construction of Mλ via several definitions and then offer several examples.

Definition. Let λ ` n. A tableau of shape λ (or a λ-tableau) t is obtained by replacing thedots in the Ferrers diagram of λ by positive integers. A tabloid (of shape λ) {t} is an equiv-alence class of tableaux of shape λ, where two tableaux are equivalent if the correspondingrows of the two tableaux contain the same elements. A Young tableau (of shape λ) is a

66

Page 67: Lecture Notes Helleloid Algebraic Combinatorics

M390C Algebraic Combinatorics Fall 2008 Instructor: Geir Helleloid

tableau of shape λ in which each integer from 1 to n appears exactly once, and a Youngtabloid is an equivalence class of Young tableaux.

Definition. Let Vλ be the vector space spanned by the Young tabloids of shape λ. The groupSn acts on the set of Young tabloids by permuting the entries; this defines a representationMλ of Sn on Vλ.

Here are all Young tableaux and all Young tabloids of shape (2, 1):

Young tableaux 1 23

1 32

2 31

2 13

3 12

3 21

Young tabloids1 23

1 32

2 31

If λ = (3), thenM (3) ∼= C{ 1 2 3 },

and this is the trivial representation. If λ = (2, 1), then

M (2,1) ∼= C{

1 23

,1 32

,2 31

}Finally, if λ = (1, 1, 1), then

M (1,1,1) ∼= C

123

,132

,213

,231

,312

,321

,

and this is the regular representation.

3.2.2 The Specht module Sλ (Sagan [3, Section 2.3])

We will identify the representation Sλ by finding a subspace of Vλ on which Sn acts. Givena subset H ⊆ Sn, define the group algebra sums

H+ =∑π∈H

π ∈ C[Sn]

H− =∑π∈H

sgn(π)π ∈ C[Sn]

Given a tableau t with columns C1, C2, . . . , Ck, let Ct = SC1×· · ·×SCk ≤ Sn be the column-stabilizer of t, which acts on t but only permutes entries within columns. Then pt = C−t (t) isthe polytableau associated to t. By replacing each tableau in pt by the corresponding tabloid,we obtain the polytabloid et associated to t.

For example, let

t =4 1 23 5

.

67

Page 68: Lecture Notes Helleloid Algebraic Combinatorics

M390C Algebraic Combinatorics Fall 2008 Instructor: Geir Helleloid

Then

pt =4 1 23 5

− 3 1 24 5

− 4 5 23 1

+3 5 24 1

and

et =4 1 23 5

− 3 1 24 5

− 4 5 23 1

+3 5 24 1

.

Definition. The Specht module Sλ is the subspace of Mλ spanned by the polytabloids et,where t has shape λ.

Now, it is not even clear that this definition makes sense; we need to show that Sn actson this subspace. This follows from the next lemma.

Lemma 3.1. Let t be a tableau and π ∈ Sn. Then

1. Cπt = πCtπ−1

2. C−πt = πC−t π−1

3. eπt = πet.

Proof. For part 1, we have

σ ∈ Cπt ↔ σπt = πt up to column equivalence

↔ π−1σπt = t up to column equivalence

↔ π−1σπ ∈ Ct↔ σ ∈ πCtπ−1.

The proof of parts 2 and 3 are similar.

Part 3 shows that Sn does acts on Sλ and therefore Sλ is a module.

3.2.3 The Specht Modules are the Irreducible Modules (Sagan [3,Section 2.4])

Next we have to order all the partitions of n so that we can deal with the Mλ one by one.One extremely important partial order on partitions is the dominance order. If λ and µ arepartitions of n, then λ dominates µ, written λ� µ, if λ1 + λ2 + · · ·+ λi � µ1 + µ2 + · · ·+ µifor all i ≥ 1. It is convenient to refine or extend this to a linear order, and this is given bythe lexicographic order on partitions. Here, λ ≥ µ in the lex order if, for some i, λj = µj forj < i and λi > µi. We need a criterion for dominance involving tableaux.

Lemma 3.2 (Dominance Lemma). Let t be a λ-tableau and let s be a µ-tableau. If, for eachindex i, the elements of row i in s are in different columns in t, then λ� µ.

Proof. We can sort the columns of t so that the elements in the first i rows of s are in thefirst i rows of t. Thus

λ1 + · · ·+ λi ≥ µ1 + · · ·+ µi.

68

Page 69: Lecture Notes Helleloid Algebraic Combinatorics

M390C Algebraic Combinatorics Fall 2008 Instructor: Geir Helleloid

We need to show that the Specht modules are irreducible, they are pairwise non-isomorphic,and they are a complete set of irreducible Sn-modules. This requires many delicate lemmas.Define the inner product on Mλ where

〈{t}, {s}〉 = δ{t},{s}.

Lemma 3.3 (Sign Lemma). Let H be a subgroup of Sn.

1. If π ∈ H, thenπH− = H−π = (sgnπ)H−.

Otherwise, let π−H− = H−.

2. For any u, v ∈Mλ, ⟨H−u, v

⟩=⟨u,H−v

⟩.

3. If the transposition (b, c) ∈ H, then

H− = k(ε− (b, c))

for some k ∈ C[Sn].

4. If t is a tableau with b, c in the same rom of t and (b, c) ∈ H, then

H−{t} = 0.

Proof. Part 1 follows from from the multiplicativity of the sign function. For part 2, we have⟨H−u, v

⟩=∑π∈H

〈(sgnπ)πu, v〉 =∑π∈H

⟨u, (sgnπ)π−1v

⟩.

Replace π by π−1 in the final sum to get 〈u,H−v〉.In part 3, we can take k = K− for the subgroup K = {e, (b, c)} of H.In part 4, we just apply part 3 (the tabloid {t} is annihilated by (b, c)).

Corollary 3.4. Let t be a λ-tableau and let s be a µ-tableau. If C−t {s} 6= 0, then λ� µ. Ifλ = µ, then C−t {s} = ±et.

Proof. Suppose b, c are in the same row of s. If they are also in the same column of t, thenC−t contains (b, c) and C−t {s} = 0 by part 4 of the previous lemma. By the dominancelemma, λ� µ.

If λ = µ, then {s} = π{t} for some π ∈ Ct by the dominance lemma. Then C−t {s} =(sgnπ)C−t {t} = ±et by part 1 of the previous lemma.

Corollary 3.5. If u ∈Mµ and t is a µ-tableau, then C−t u is a multiple of et.

Theorem 3.6 (Submodule Theorem). Let U be a submodule of Mµ. Then U ⊇ Sµ orU ⊆ Sµ⊥, so Sµ is irreducible.

69

Page 70: Lecture Notes Helleloid Algebraic Combinatorics

M390C Algebraic Combinatorics Fall 2008 Instructor: Geir Helleloid

Proof. Let u ∈ U and t be a µ-tableau. Then C−t u = fet for some f ∈ C. There are twocases.

Suppose that f 6= 0 for some choice of u and t. Then fet = C−t u ∈ U and et ∈ U . Butet generates Sµ, so U ⊇ Sµ.

Now, if C−t u = 0 for all u and t, consider any u ∈ U and µ-tableau t. Then

〈u, et〉 =⟨u,C−t {t}

⟩=⟨C−t u, {t}

⟩= 0.

Thus u ∈ Sµ⊥.

Lemma 3.7. If Hom(Sλ,Mµ) is non-trivial, then λ � µ. If λ = µ, then Hom(Sλ,Mµ) isone-dimensional.

Proof. Let 0 6= θ ∈ Hom(Sλ,Mµ). Extend θ to Hom(Mλ,Mµ) by setting θ(Sλ⊥) = 0. Then,for some et ∈ Sλ,

0 6= θ(et) = θ(C−t {t}) = C−t θ({t}).

By the corollary, λ� µ. If λ = µ, then θ(et) = cet and acting on t by π proves this holds forall polytabloids.

Theorem 3.8. The Sλ for λ ` n form a complete list of irreducible modules for Sn.

Proof. By the Submodule Thm, the Specht modules are irreducible. The number of Spechtmodules matches the number of irreducible Sn-modules. By the above lemma, if Sλ ∼= Sµ,then Hom(Sλ,Mµ) and Hom(Sµ,Mλ) and λ = µ.

Corollary 3.9. The Mµ decompose as

Mµ ⊕λ�µ KλµSλ.

3.2.4 Finding a Basis for Sλ (Sagan [3, Section 2.5])

There are a lot of polytabloids and the linear relations between them are unclear; in partic-ular, it is unclear what the dimension of Sλ is. So we would like to find a basis for Sλ. It iseasy to state the relevant theorem. A standard Young tableau is a Young tableau in whichthe entries in each row and each column are increasing.

Theorem 3.10. The set

{et : t is a standard Young tableau of shape λ}

is a basis for Sλ.

In the interest of time, we will only sketch the proof.

Proof. 1. Define a dominance order on tabloids.

2. Show that t is the maximum tabloid that appears in et (this shows the et are indepen-dent).

70

Page 71: Lecture Notes Helleloid Algebraic Combinatorics

M390C Algebraic Combinatorics Fall 2008 Instructor: Geir Helleloid

3. Show that for any tableau t, the polytabloid et is a linear combination of standardpolytabloids es.

Let fλ be the number of standard Young tableau of shape λ. Then the dimension of Sλ

is fλ and ∑λ`n

(fλ)2 = n!.

3.2.5 Decomposition of Mλ (Sagan [3, Section 2.9])

In this subsection, we want to write Mλ as a sum of Specht modules. Why? Partly it is justbecause the Mλ are nice modules to understand, but mostly it is because the decompositionwill introduce a fundamental generalization of standard Young tableaux.

Definition. The content of a tableau of shape λ is the composition µ = (µ1, . . . , µm), whereµi equals the numbers of i’s in T . A tableau is a semistandard Young tableau if the entriesin each row are weakly increasing and the entries in each column are strictly increasing.

Theorem 3.11.Mµ ∼= ⊕λKλµS

λ,

where Kλµ is the number of semistandard Young tableau of shape λ and content µ.

The method of proof is to show that Hom(Sλ,Mµ) has a basis parameterized by T 0λµ =

{T : T is a SSYT of shape λ and content µ}. We will omit the (lengthy) proof.

Example. The theorem shows that

M (2,2,1) ∼= S(2,2,1) ⊕ S(3,3,1) ⊕ 2S(3,2) ⊕ 2S(4,1) ⊕ S(5).

Example. Note that Kλ(1n) = fλ, the number of standard tableaux of shape λ. So

M (1n) ∼= ⊕λfλSλ.This shows that the regular representation decomposes as a sum of all irreducible modulesappearing with multiplicity equal to their dimension.

3.3 Lecture 18 (Tuesday, October 28): The RSK Al-

gorithm (Stanley [5, Section 7.11])

Now that we have constructed the irreducible representations of the symmetric group, whatnext? The first step is to examine the identity

∑λ`n (fλ)2 = n! more closely (can we find

a bijective proof, for example?). This question leads to a lot of beautiful combinatoricsthat is the subject of the next two subsections (after which we will meander back to therepresentation theory of Sn).

The identity∑

λ`n (fλ)2 = n! begs for a bijective proof; that is, a bijection betweenpairs of SYT of the same shape and permutations in Sn. In fact, we can do better. TheRobinson-Schensted-Knuth (RSK) Algorithm offers a bijection between pairs of SSYT of thesame shape and infinite N-matrices with finite support, and it specializes the bijection wedesire. This bijection is more than a curiosity; it has many consequences, as we shall see.

71

Page 72: Lecture Notes Helleloid Algebraic Combinatorics

M390C Algebraic Combinatorics Fall 2008 Instructor: Geir Helleloid

3.3.1 Row Insertion

The key component to this bijection is row insertion. Let P be a SSYT and let k be apositive integer. The row insertion of k into P , written P ← k, is defined as follows: (1)Find the largest r such that P1,r−1 ≤ k; if P11 > k, let r = 1. (2) If P1,r doesn’t exist (becauseP has r − 1 columns), append k to the end of the first row and stop. The resulting arrayis P ← k. Otherwise, replace P1r by k, and insert P1r into the second row using the samealgorithm. Continue until an element is inserted at the end of a row. The result is P ← k.

For example, let

P =

1 1 2 4 5 5 62 3 3 6 6 84 4 6 86 78 9

.

Inserting 4 results in

P ← 4 =

1 1 2 4 4 5 62 3 3 5 6 84 4 6 66 7 88 9

,

where the elements that were inserted into a row are in positions {(1, 5), (2, 4), (3, 4), (4, 3)}(the inserted elements are 4, 5, 6, and 8 respectively). The set of positions into which anelement was inserted is called the insertion path I(P ← 4).

We need to prove two useful properties of insertion paths, the first application of whichwill be the fact that the new array is also a SSYT.

Lemma 3.12. 1. The insertion path does not move to the right.

2. If j ≤ k, then I(P ← j) lies strictly to the left of I((P ← j)← k), and I((P ← j)← k)does not extend below the bottom of I(P ← j).

Proof. 1. Suppose that (r, s) ∈ I(P ← k). Either Pr+1,s > Pr,s or Pr+1,s does not exist.In the first case, when Pr,s is bumped to row r + 1, it cannot be inserted to the rightof Pr+1,s. In the second case, when Pr,s is bumped to row r+ 1, inserting it in positions+ 1 would leave a gap in the row.

2. A number must bump a strictly larger number, so k is inserted into the first row ofP ← j to the right of j. The element j bumps is at most the element that k bumps,so by induction, for all rows, I(P ← j) is left of I((P ← j) ← k). The last elementinserted in I(P ← j) went at the end of its row. If I((P ← j)← k) inserts an elementinto this row, it goes at the end of the row and the insertion algorithm stops.

Corollary 3.13. If P is an SSYT, then so is P ← k.

Proof. The rows of P ← k are clearly weakly increasing. If a bumps b, then a < b, and b isnot inserted to the right of a in the next row, so b is below a number smaller than b. ThusP ← k is increasing in columns.

72

Page 73: Lecture Notes Helleloid Algebraic Combinatorics

M390C Algebraic Combinatorics Fall 2008 Instructor: Geir Helleloid

3.3.2 The Robinson-Schensted-Knuth (RSK) Algorithm

Let A = (aij)i,j≥1 have non-negative integer entries with finitely many nonzero entries.Associate to A a two-line array

wA =

(i1 i2 · · · imj1 j2 · · · jm

),

where the columns (ir, jr) are listed in lexicographic order with multiplicity aij if (ir, jr) =(i, j). For example, if

A =

1 0 20 2 01 1 0

,then

wA =

(1 1 1 2 2 3 31 3 3 2 2 1 2

).

Note that A is a permutation matrix if and only if wA is a permutation written in two-linenotation. We construct a pair of SSYT (P,Q) from A as follows: Begin with (P0, Q0) = (∅, ∅).Once (Pt, Qt) is defined, construct (Pt+1, Qt+1) by

1. Pt+1 = Pt ← jt+1;

2. Qt+1 is obtained from Qt by adding it+1 in the position that makes Qt+1 have the sameshape as Pt+1.

Then (P,Q) := (Pm, Qm), and we write ARSK−→ (P,Q). This is the RSK Algorithm. The

SSYT P is called the insertion tableau of A and Q is called the recording tableau of A.

Example. Applying RSK to the above A and wA, we have

P (i) Q(i)i = 1 1 1i = 2 1 3 1 1i = 3 1 3 3 1 1 1

i = 4 1 2 33

1 1 12

i = 5 1 2 23 3

1 1 12 2

i = 61 1 22 33

1 1 12 23

i = 71 1 2 22 33

1 1 1 32 23

Theorem 3.14 (RSK Theorem). The RSK algorithm is a bijection between N-matricesA = (aij) of finite support and ordered pairs (P,Q) of SSYT of the same shape. In thiscorrespondence, the content of P is the vector of column sums of A and the content of Q isthe vector of row sums of A.

73

Page 74: Lecture Notes Helleloid Algebraic Combinatorics

M390C Algebraic Combinatorics Fall 2008 Instructor: Geir Helleloid

Proof. By the corollary, P is a SSYT. Also, P and Q have the same shapes and specifiedcontents, and that Q is weakly increasing in rows and columns. It remains to show that Qis strictly increasing in columns and that this is a bijection.

If ik = ik+1 in wA, then jk ≤ jk+1. By the lemma, the insertion path of jk+1 lies strictlyto the right of the insertion path of jk and does not extend below that of jk+1. Thus ik+1

will be inserted to the right of ik in Q and Q must be strictly increasing in columns.It is easy to reverse the RSK procedure by finding the largest element in Q (break ties by

finding the right-most occurence), reverse insertion of the corresponding element in P , anditerating. The only question is if starting with a pair (P,Q) yields a valid two-line array. Itsuffices to show that if ik = ik+1, then jk ≤ jk+1.

Let ik = Qrs and ik+1 = Quv, so r ≥ u and s < v. The element Puv lies at the end ofits row when we begin to apply inverse bumping to it. So the inverse insertion path of Prsintersects row u to the left of column v. That is, at row u the inverse insertion path of Prslie to the left of that of Puv. More generally, the entire inverse insertion path of Prs lies tothe left of that of Puv. Thus before removing ik+1, the two element jk and jk+1 appear inthe first row with jk to the left of jk+1. So jk ≤ jk+1.

Corollary 3.15. The RSK algorithm is a bijection between permutation matrices and SYTtableaux of the same shape. In particular,

∑λ`n (fλ)2 = n!.

3.3.3 Growth Diagrams and Symmetries of RSK

There is an alternate, geometric description of the RSK algorithm involving growth diagramsthat, among other things, leads to a quick proof of one symmetry of the RSK algorithm.

2 preliminary comments:

1. Standardization

2. Containment of partitions

Given w = w1 · · ·wn ∈ Sn, construct an n× n array with an X in the wi-th square fromthe bottom of column i (just as we did when talking about permutation avoidance). We aregoing to label each of the (n+ 1)2 points that are corners of squares in the array by integerpartitions. Label all points in the bottom row and left column with the empty partition ∅.

If three corners of a square s have been labeled

µλ ν

we label the upper right corner by the partition ρ defined by

1. (L1) If s does not contain an X and λ = µ = ν, let ρ = λ.

2. (L2) If s does not contain an X and λ ⊂ µ = ν, then µ was obtained from λ by adding1 to some part λi. Let ρ be obtained from µ by adding 1 to µi+1.

3. (L3) If s does not contain an X and µ 6= ν, then let ρi = max(µi, νi).

74

Page 75: Lecture Notes Helleloid Algebraic Combinatorics

M390C Algebraic Combinatorics Fall 2008 Instructor: Geir Helleloid

4. (L4) If s contains an X, then λ = µ = ν. Let ρ be obtained from λ by adding 1 to λ1.

This generates the growth diagram Gw of w. If a point p is labeled by λ, then |λ| is equalto the number of X’s in the quarter-plane to the left and below p. Let λi be the partitionin row i (with the bottom row being row 0) and column n. Then |λi| = i. Let µi be thepartition in column i and row n. Then λ0 ⊂ λ1 ⊂ · · · ⊂ λn and µ0 ⊂ µ1 ⊂ · · · ⊂ µn

correspond to SYT Pw and Qw.

Theorem 3.16. The SYT Pw and Qw are the same as the SYT obtained from w via RSK.

Proof. Let the partition in row i and column j be ν(i, j). For fixed j,

∅ = ν(0, j) ⊆ ν(1, j) ⊆ · · · ⊆ ν(n, j)

when |ν(i, j)/ν(i − 1, j)| = 0, 1. Let T (i, j) be the tableau of shape ν(i, j) obtained byinserting k into the square ν(k, j)/nu(k− 1, j) when 0 ≤ k < i and |ν(k, j)/ν(k− 1, j)| = 1.We claim that T (i, j) has the following description: Let (i1, j1), . . . , (ik, jk) be the positionsof X’s to the left and below T (i, j), with j1 < j2 < · · · < jk. Then T (i, j) is obtained by rowinserting i1, i2, . . . , ik, beginning with ∅.

The proof of the claim is by induction on i + j. It is true if i = 0 or j = 0. If i > 0 andj > 0, then T (i − 1, j), T (i, j − 1), and T (i − 1, j − 1) satisfy the desired conditions. Now,check that T (i, j) satisfies these conditions using the rules (L1)-(L4).

If i = n, thenT (n, j) = ((∅ ← w1)← w2)← · · · ← wj

So T (n, n) = Pw is the insertion tableau of w, while Qw is the recording tableau.

Corollary 3.17. Let A be an N-matrix of finite support and let ARSK−→ (P,Q). Then A is

symmetric if and only if P = Q.

Corollary 3.18. The number of SYT of size n equals the number of involutions in Sn. Thusthe generating function for tn, the number of SYT of size n, is∑

n≥0

t(n)tn

n!= exp

(x+

x2

2

).

3.4 Lecture 19 (Thursday, October 30): Increasing and

Decreasing Subsequences (Stanley [5, Appendix A])

Schensted’s original motivation for the RSK algorithm came from studying increasing subse-quences in permutations. Schensted’s Theorem has been generalized to Greene’s Theorem,and the techniques used to prove Greene’s Theorem will be useful to us in a couple weekswhen proving the Littlewood-Richardson Rule.

Let π = π1 · · · πn ∈ Sn and k ∈ N. Let Ik(π) denote the maximum number of elements ina union of k increasing subsequences of π. Let Dk(π) be the maximum number of elementsin a union of k decreasing subsequences of π. For example, if π = 236145 ∈ S6, then I0(π) =0, I1(π) = 4, I2(π) = I3(π) = 6; D0(π) = 0, D1(π) = 2, D2(π) = 4, D3(π) = 5, D4(π) = 6. Letsh(π) denote the shape of the insertion and recording tableaux of π under RSK.

75

Page 76: Lecture Notes Helleloid Algebraic Combinatorics

M390C Algebraic Combinatorics Fall 2008 Instructor: Geir Helleloid

Theorem 3.19 (Greene’s Theorem). Let π ∈ Sn and λ = sh(π). Then

Ik(π) = λ1 + · · ·+ λk

andDk(π) = λ′1 + · · ·+ λ′k.

This theorem tells us: (1) a way to find Dk(π) and Ik(π) and (2) when two permutationshave the same shape. The proof of theorem will also discuss when two permutations havethe same insertion tableau.

Definition. A Knuth transformation of a permutation π switches two adjacent entries aand c provided that next to a or c is an entry b with a < b < c. Two permutations π, σ are

Knuth equivalent, written πK∼ σ, if one can be transformed into the other by a sequence of

Knuth transformations.

For example, 54123, 51423, 51243, 15243, 15423, and 12543 are all Knuth equivalent.

Theorem 3.20. Permutations are Knuth equivalent if and only if their insertion tableaucoincide.

Definition. Let T be a SYT. The reading word of T is the sequence of entries of T obtainedby concatenating the rows of T bottom to top. For example, the tableau

468, 579, 1 2 3 468, 579, 468, 579

has reading word 579468123.

Note that a tableau can be reconstructed from its reading word by finding the descentsin the word.

Theorem 3.21. Each Knuth equivalence class contains exactly one reading word of a SYT,and consists of all permutations whose insertion tableau is that SYT.

In the above example, the only reading word in the given Knuth equivalence class is54123, and it corresponds to the tableau

1 2 345

.

Lemma 3.22. For any k, the values Ik(π) and Dk(π) are invariant under Knuth transfor-mations of π.

Proof. Note that π′ = πn · · · π1 satisfies Ik(π) = Dk(π′) and Ik(π

′) = Dk(π), and πK∼ σ if

and only if π′K∼ σ, so it suffices to prove the lemma for Ik(π).

Suppose π contains acb and the Knuth transformation switches a and c to obtain σ (whenb is on the other side, the situation is analogous). Let Ik(π) = m. Clearly Ik(σ) ≤ m. IfIk(σ) < m, it means that every collection {s1, . . . , sk} of disjoint increasing subsequences of

76

Page 77: Lecture Notes Helleloid Algebraic Combinatorics

M390C Algebraic Combinatorics Fall 2008 Instructor: Geir Helleloid

π which cover m elements has an element, say s1, containing a and c. In this case, if b doesnot belong to some si, we replace c by b in s1, obtaining a contradiction. If b does belong insome si, say s2, then

s1 = (πi1 < · · · < πis < a < c < πis+3 < · · · )s2 = (πj1 < · · · < πjt < b < πjt+2 < · · · ),

and the increasing subsequences

s′1 = (πi1 < · · · < πis < a < b < πis+3 < · · · )s′2 = (πj1 < · · · < πjt < c < πjt+2 < · · · ),

cover the same elements of π, which is again a contradiction. So Ik(σ) = m.

Lemma 3.23. Any permutation is Knuth equivalent to the reading word of its insertiontableau.

Proof. It suffices to show that reading(P )kK∼ reading(P ← k) for any k and SYT P .

Consider reading(P )k. We can switch k with its left neighbor until k winds up to the rightof exactly those entries to the right of k in the first row of P ← k. Then the left neighborof k is the entry that is bumped to the second row of P , and it can be switched with its leftneighbor until it winds up in the corresponding spot in the permutation, and so on.

Corollary 3.24. Let P be the insertion tableau for π. Then π and reading(P ) have the samevalues of Ik and Dk.

Lemma 3.25. Let π be the reading word of a SYT T . Then T is the insertion tableau of π.

Proof of Greene’s Theorem. We may assume that π is the reading word of a SYT T . Notethat each row of T is an increasing subsequence in π, so

Ik(π) ≥ λ1 + · · ·+ λk,

and each column of T is a decreasing subsequence in π, so

Dk(π) ≥ λ′1 + · · ·+ λ′k.

For example, consider π = 592671348 with insertion tableau

T =1 3 4 82 6 75 9

.

Now, pick a box in T along the lower-right border (in the above example, 5,9,6,7,4, or 8).Let it be in row k and column l. Then

Ik(π) +Dl(π) ≥ λ1 + · · ·λk + λ′1 + · · ·+ λ′l = n+ kl.

But an increasing subsequence and a decreasing subsequence have at most one element incommon. So

Ik(π) +Dl(π) ≤ n+ kl.

77

Page 78: Lecture Notes Helleloid Algebraic Combinatorics

M390C Algebraic Combinatorics Fall 2008 Instructor: Geir Helleloid

So equality holds throughout, with

Ik(π) = λ1 + · · ·+ λk,

andDk(π) = λ′1 + · · ·+ λ′k.

For any given k or l, we can find a suitable box along the border, so this holds for all k andl.

Corollary 3.26. The shape of π is invariant under Knuth transformations.

But here is a stronger corollary.

Corollary 3.27. The insertion tableau of a π is invariant under Knuth transformations.

Proof. Let π(k) be the permutation in Sk formed by the entries 1, . . . , k in π. Then P(k) is theinsertion tableau formed by the k smallest entries in P . Note that any Knuth transformationof π either does not change π(k) or transforms it into a Knuth-equivalent permutation. Thisdoes not affect the shape of π(k), which is the shape of P(k). But the shapes of P(k) for allk uniquely define P . Since all P(k) are unchanged by Knuth-equivalent permuations, so isP .

This proves the remaining two theorems.

3.5 Lectures 20 and 21 (Tuesday, November 4 and Thurs-

day, November 6): An Introduction to Symmetric

Functions (Stanley [5, Chapter 7])

Unfortunately, it is essentially impossible to introduce symmetric functions in a motivatedway. But as we build up the theory of symmetric functions, we will see more and moreconnections to the representation theory of the symmetric group, the RSK algorithm, enu-meration problems, and more. In this section, we define the ring of symmetric functions,introduce five bases for the ring, prove that they are bases, find transition matrices for pairsof bases, introduce a scalar product and a special ring homomorphism, and prove algebraicequalities involving the bases.

3.5.1 The Ring of Symmetric Functions

The fundamental object of study in this section is the ring of symmetric functions. It ismisleading to refer to the ring of symmetric functions for three reasons. The first reason isthat the functions in question are multi-variate polynomials; there is essentially no theory ofmore general symmetric functions. The second reason is that this ring can be defined overany commutative ring R, so it is not unique. However, we will only be interested in thisring defined over Q or (more rarely) Z. The third reason is that the symmetric functions inquestion can be functions of finitely many variables x = {x1, x2, . . . , xn} or infinitely many

78

Page 79: Lecture Notes Helleloid Algebraic Combinatorics

M390C Algebraic Combinatorics Fall 2008 Instructor: Geir Helleloid

variables x = {x1, x2, . . . }, yet another reason that the ring in question is not unique. Asyou might expect, it is easier to write down examples using finitely many variables. Butmore importantly, some theorems need to be proved for functions of finitely many variables,and then hold for functions of infinitely many variables by letting n go to infinity.

Before we give formal definitions, here are a few examples of symmetric functions (inthree variables):

x1 + x2 + x3

x1x2 + x1x3 + x2x3

x21 + x1x2 + x1x3 + x2

2 + x2x3 + x33

2x1 + 2x2 + 2x3 − 5x1x2 − 5x1x3 − 5x2x3

More formally, the symmetric group Sn acts on the ring Q[x1, x2, . . . , xn] by permuting thevariables, and a polynomial is symmetric if it is invariant under this action. Let Λn bethe subring of symmetric functions. A function f ∈ Λn is homogeneous of degree k ifevery monomial in f has total degree k. Letting Λk

n be the Q-vector space of homogeneoussymmetric functions of degree k, we may write Λn as the vector space direct sum

Λn = ⊕k≥0Λkn,

and observe that Λn is a graded Q-algebra. (Recall from the definition of a vector spacedirect sum that if f = f0 + f1 + f2 + · · · with fk ∈ Λk

n, then all but finitely many fk mustbe 0).

If we repeat the above definitions with an infinite set of indeterminates, we obtain Λ =⊕k≥0Λ

k, the graded Q-algebra of symmetric functions in infinitely many variables. We willbrush over technical details related to having infinitely many variables, but it is true thatΛ is the inverse limit of the Λn in the category of graded rings. If we replace Q by Z inthe above definitions, we obtain ΛZ = ⊕k≥0Λ

kZ, the graded ring and Z-module of symmetric

functions in infinitely many variables. We will rarely refer to ΛZ, but we will see that severalbases of Λ are also Z-bases of ΛZ.

3.5.2 (Proposed) Bases for the Ring of Symmetric Functions

A natural first step in the examination of Λn is to find one (or, in our case, five) bases for Λn

as a Q-vector space. We will define the five bases in this subsection. In the next subsection,we will build up the tools for proving that they are bases and relating them to one another.Then we will prove that they are bases and give transition matrices for pairs of bases.

In each basis, the basis elements are indexed by partitions λ. If α = (α1, α2, . . . , αn) ∈ Nn,let xα denote the monomial xα1

1 xα22 · · ·xαnn .

The simplest basis consists of the monomial symmetric functions. Let λ be a partitionof length at most n. We define the monomial symmetric function mλ(x1, . . . , xn) =

∑α x

α,

79

Page 80: Lecture Notes Helleloid Algebraic Combinatorics

M390C Algebraic Combinatorics Fall 2008 Instructor: Geir Helleloid

where the sum ranges over all distinct permutations of the entries in λ. For example,

m∅ = 1

m1 = x1 + x2 + · · ·+ xn

m2 = x21 + x2

2 + · · ·+ x2n

m11 =∑i<j

xixj

m21 =∑i≤j

x2ixj.

There is no need to wait two subsections to prove that the mλ form a basis; it is obviousthat {mλ(x1, . . . , xn) : `(λ) ≤ n and |λ| = k} forms a Q-basis of Λk

n and {mλ(x1, . . . , xn) :`(λ) ≤ n} forms a Q-basis of Λn.

The second basis consists of the elementary symmetric functions. Let λ be a partition.We define the elementary symmetric function eλ(x1, x2, . . . , xn) by

eλ = eλ1eλ2 · · ·er =

∑i1<···<ir

xi1 · · ·xir .

For example,

e∅ = 1

e1 = x1 + x2 + · · ·+ xn = m1

e2 =∑i<j

xixj = m11

e11 = e1e1 = m2 + 2m11

e21 = e2e1 = m21 + 3m111.

We will show that {eλ(x1, . . . , xn) : λ1 ≤ n and |λ| = k} forms a Q-basis of Λkn and

{eλ(x1, . . . , xn) : λ1 ≤ n} forms a Q-basis of Λn.The third basis consists of the complete homogeneous symmetric functions. Let λ be a

partition. We define the complete homogeneous symmetric function hλ(x1, x2, . . . , xn) by

hλ = hλ1hλ2 · · ·hr =

∑i1≤···≤ir

xi1 · · ·xir .

For example,

h∅ = 1

h1 = x1 + x2 + · · ·+ xn = m1

h2 =∑i≤j

xixj = m2 +m11

h11 = h1h1 = m2 + 2m11

h21 = h2h1 = m3 + 2m21 + 3m111.

80

Page 81: Lecture Notes Helleloid Algebraic Combinatorics

M390C Algebraic Combinatorics Fall 2008 Instructor: Geir Helleloid

We will show that {hλ(x1, . . . , xn) : λ1 ≤ n and |λ| = k} forms a Q-basis of Λkn and

{hλ(x1, . . . , xn) : λ1 ≤ n} forms a Q-basis of Λn.

The fourth basis consists of the power sum symmetric functions. Let λ be a partition.We define the power sum symmetric function pλ(x1, x2, . . . , xn) by

pλ = pλ1pλ2 · · ·pr =

∑i

xri .

For example,

p∅ = 1

p1 = x1 + x2 + · · ·+ xn = m1

p2 = x21 + x2

2 + · · ·+ x2n = m2

p11 = p1p1 = m2 + 2m11

p21 = p2p1 = m3 +m21.

We will show that {pλ(x1, . . . , xn) : λ1 ≤ n and |λ| = k} forms a Q-basis of Λkn and

{pλ(x1, . . . , xn) : λ1 ≤ n} forms a Q-basis of Λn.

The fifth basis consists of the Schur functions. These are more complicated to define.Sometimes they are defined algebraically and sometimes they are defined combinatorially.We will define them algebraically, and the combinatorial definition will become apparentwhen we prove the Schur functions form a basis.

Let ε(σ) denote the sign of the permutation σ. Given a monomial xα and a permutationσ ∈ Sn, let σ(xα) denote the monomial obtained when σ acts on xα by permuting variables.Now let α be a partition of length at most n. Let aα(x1, x2, . . . , xn) be the skew-symmetricpolynomial

aα =∑σ∈Sn

ε(σ)σ(xα).

It is skew-symmetric because σ(aα) = ε(σ)aα. Note that aα vanishes unless α has distinctparts. So let α have distinct parts; we can write α = λ+ δ, where λ is a partition of lengthat most n and δ is the staircase partition (n− 1, n− 2, . . . , 2, 1). Then

aλ+δ =∑σ∈Sn

ε(σ)σ(xλ+δ) = det(xλj+n−ji )1≤i,j,≤n.

This determinant equals 0 if we set xi = xj for some i 6= j. Thus it is divisible by theVandermonde determinant

aδ =∏

1≤i≤j≤n

(xi − xj) = det(xn−ji ).

Therefore aλ+δ/aδ ∈ Λn and we may define the Schur function sλ = aλ+δ/aδ. For example,

81

Page 82: Lecture Notes Helleloid Algebraic Combinatorics

M390C Algebraic Combinatorics Fall 2008 Instructor: Geir Helleloid

when n = 2,

s∅ = 1

s1 =

x21 x0

1

x22 x0

2

x11 x0

1

x12 x0

2

= x1 + x2 = m1

s2 =

x31 x0

1

x32 x0

2

x11 x0

1

x12 x0

2

= x21 + x1x2 + x2

2 = m2 +m11

s11 =

x21 x1

1

x22 x1

2

x11 x0

1

x12 x0

2

= x1x2 = m11

s21 =

x31 x1

1

x32 x1

2

x11 x0

1

x12 x0

2

= x21x2 + x1x

22 = m21

.

We will show that {sλ(x1, . . . , xn) : `(λ) ≤ n and |λ| = k} forms a Q-basis of Λkn and

{pλ(x1, . . . , xn) : `(λ) ≤ n} forms a Q-basis of Λn.

3.5.3 Changes of Basis Involving the mλ

In this subsection, we (combinatorially) express eλ, hλ, pλ, and sλ in terms of the mλ basis.This will prove that eλ, pλ, and sλ are also bases for Λn. Proving that hλ is a basis requires abit more effort, and we will do this in the next subsection. For what follows, remember thatwe previously defined the dominance order ≤ on partitions and the reverse lexicographic

orderR

≤. Given a matrix A, let row(A) denote the vector of row sums of A and let col(A)denote the vector of column sums of A.

Theorem 3.28. Let λ ` k. Then

eλ =∑µ`n

Mλµmµ,

where Mλµ is the number of (0, 1)-matrices A with row(A) = λ and col(A) = µ. Further-more, Mλµ = 0 unless µ ≤ λ′ and Mλλ′ = 1. Therefore {eλ ∈ Λk} is a basis for Λk and{eλ(x1, . . . , xn) : λ1 ≤ n and |λ| = k} is a basis for Λk

n. Also, {e1, e2, . . . } are algebraicallyindependent and generate Λ as a Q-algebra (that is, Λ = Q[e1, e2, . . . ]).

Proof. Consider the matrix x1 x2 x3 · · ·x1 x2 x3 · · ·...

......

82

Page 83: Lecture Notes Helleloid Algebraic Combinatorics

M390C Algebraic Combinatorics Fall 2008 Instructor: Geir Helleloid

A term of eλ = eλ1eλ2 · · · is obtained by choosing λ1 entries from the first row, λ2 entries fromthe second row, and so on, and multiplying those entries together to get some xα. Convertingthe chosen entries to 1 and the rest of 0, we obtain a (0,1)-matrix A with row(A) = λ andcol(A) = α. Conversely, each such matrix corresponds to a term of eλ. Thus

eλ =∑µ`n

Mλµmµ.

If Mλµ > 0, let A be a (0,1)-matrix with row(A) = λ and col(A) = µ. Let A′ be the matrixwith row(A′) = λ and the ones left-justified. Then λ′ = col(A′) ≥ col(A) = µ. Moreover,A′ is the only matrix with row(A′) = λ and col(A′) = λ′, so Mλλ′ = 1. Thus the transitionmatrix (Mλµ) (for |λ| = |µ| = k) is upper triangular and invertible, so {eλ ∈ Λk} is a basisfor Λk.

If we are working in finitely many variables, we find that eλ = 0 unless λ1 ≤ n, in whichcase `(µ) ≤ n. Thus {eλ(x1, . . . , xn) : λ1 ≤ n and |λ| = k} is a basis for Λk

n.Finally, since each f ∈ Λ is uniquely expressed as a linear combination of products of

ei’s, the ei are algebraically independent generators of Λ.

As an example of the previous theorem, we see that

e1111 = m4 + 4m31 + 6m22 + 12m211 + 24m1111

e211 = m31 + 2m22 + 5m211 + 12m1111

e22 = m22 + 2m211 + 6m1111

e31 = m211 + 4m1111

e4 = + m1111

Note that Mλµ can be interpreted as follows: we have k balls with λi balls labeled i foreach i. We have boxes labeled 1, 2, . . . . Them Mλµ is the number of ways to place the ballsinto the boxes so that no box contains more than one ball with the same label and box icontains µi balls.

Theorem 3.29. Let λ ` k. Then

hλ =∑µ`n

Nλµmµ,

where Nλµ is the number of N-matrices A with row(A) = λ and col(A) = µ.

Proof. Consider the matrix x1 x2 x3 · · ·x1 x2 x3 · · ·...

......

A term of hλ = hλ1hλ2 · · · is obtained by choosing λ1 not-necessarily distinct entries from thefirst row, λ2 not-necessarily distinct entries from the second row, and so on, and multiplyingthose entries together to get some xα. Converting the chosen entries to positive integers

83

Page 84: Lecture Notes Helleloid Algebraic Combinatorics

M390C Algebraic Combinatorics Fall 2008 Instructor: Geir Helleloid

giving the number of times the entry was chosen, we obtain an N-matrix A with row(A) = λand col(A) = α. Conversely, each such matrix corresponds to a term of hλ. Thus

hλ =∑µ`n

Nλµmµ.

Unfortunately, the matrix (Nλµ) is not upper triangular, so we cannot, at the moment,prove the hλ form a basis.

Note that Nλµ can be interpreted as follows: we have k balls with λi balls labeled i foreach i. We have boxes labeled 1, 2, . . . . Them Nλµ is the number of ways to place the ballsinto the boxes so that box i contains µi balls.

Theorem 3.30. Let λ ` k. Then

pλ =∑µ`n

Rλµmµ,

where Rλµ is the number of ordered partitions π = (B1, . . . , B`(µ)) of the set [`(λ)] such that

µj =∑i∈Bj

λi, 1 ≤ j ≤ k.

Furthermore, Rλµ = 0 unless λ ≤ µ and Rλλ′ =∏

imi(λ)!, where λ has mi(λ) parts of sizei. Therefore {pλ ∈ Λk} is a basis for Λk and {pλ(x1, . . . , xn) : λ1 ≤ n and |λ| = k} is abasis for Λk

n. Also, {p1, p2, . . . } are algebraically independent and generate Λ as a Q-algebra.

Proof. Rλµ is the coefficient of xµ = xµ1

1 xµ2

2 · · · in pλ =(∑

xλ1i

) (∑xλ2i

)· · · . To obtain the

monomial xµ, we choose xλjij

from each factor so that∏

j xλjij

= xµ. Let Br = {j : ij = r}.Then (B1, . . . , B`(µ)) is an ordered partition of the appropriate type.

If Rλµ > 0, then let π be an appropriate ordered partition. Let Bi1 , . . . , Bis be the distinctblocks of π containing at least one of 1, 2, . . . , r. Then µ1 + · · · + µr ≥ µi1 + · · · + µis ≥λ1 + · · ·+ λr, and µ ≥ λ.

If µ = λ, then Bi contains a single element. The elements in B1, . . . , Bm1 can be anypermutation of 1, . . . ,m1, the elements in Bm1+1, . . . , Bm1+m2 can be any permutation ofm1 + 1, . . . ,m1 +m2, and so on. So Rλλ =

∏imi(λ)!.

Since (Rλµ) is upper triangular, {pλ ∈ Λk} is a basis for Λk and {pλ(x1, . . . , xn) :λ1 ≤ n and |λ| = k} is a basis for Λk

n. As before, this shows {p1, p2, . . . } are algebraicallyindependent and generate Λ as a Q-algebra.

Theorem 3.31. Let λ ` k. Then

sλ =∑µ`n

Kλµmµ,

where Kλµ is the number of semi-standard Young tableaux of shape λ and type µ. Further-more, Kλµ = 0 unless λ ≥ µ and Kλλ = 1. Therefore {sλ(x1, . . . , xn) : λ1 ≤ n and |λ| = k}is a basis for Λk

n.

The proof of this theorem requires results from the next subsection.

84

Page 85: Lecture Notes Helleloid Algebraic Combinatorics

M390C Algebraic Combinatorics Fall 2008 Instructor: Geir Helleloid

3.5.4 Identities and an Involution

The goal of this subsection is to develop several classes of identities relating the variousbases. We begin with the three generating functions

E(t) =∑r≥0

ertr =

n∏i=1

(1 + xit)

H(t) =∑r≥0

hrtr =

n∏i=1

(1− xit)−1

P (t) =∑r≥1

prtr−1

=n∑i=1

xi1− xit

=n∑i=1

d

dtlog (1− xit)−1

=d

dtlogH(t)

=H ′(t)

H(t)

P (−t) =n∑i=1

xi1 + xit

=d

dtlogE(t)

=E ′(t)

E(t).

We can use these generating functions to write down formulas relating er, hr, and pr. Theonly formula we need relates er and hr; since H(t)E(−t) = 1, we find that

n∑r=0

(−1)rerhn−r = 0.

From the above formula it looks like there is some sort of symmetry between eλ and hλ,and indeed there is. Since the er are algebraically independent, we can define an algebraendomorphism ω : Λ→ Λ by ω(er) = hr. Applying ω to the above equation gives

n∑r=0

(−1)rhrω(hn−r) = 0 =n∑r=0

(−1)n−rω(hr)hn−r,

and it must be that ω(hr) = er. Thus ω2 = 1 and ω is an involution of Λ.The definition of ω has many consequences. First, since ω is an isomorphism of Λn and

sends er to hr, it follows that {hλ : λ1 ≤ n} forms a Q-basis of Λn, completing the assertionthat we have defined five bases of Λn.

Next, we can compute the action of ω on each of our bases.

85

Page 86: Lecture Notes Helleloid Algebraic Combinatorics

M390C Algebraic Combinatorics Fall 2008 Instructor: Geir Helleloid

Theorem 3.32. For any partition λ, let ελ = (−1)|λ|−`(λ). Then

ω(eλ) = hλ

ω(hλ) = eλ

ω(pλ) = ελpλ

ω(sλ) = sλ′

ω(mλ) = fλ,

where the fλ are the forgotten symmetric functions.

Proof. The equalities for ω(eλ) and ω(hλ) follow from the above discussion of ω. The equalityfor ω(mλ) is simply the definition of the forgotten symmetric functions; they do not have aparticularly simple description.

As for pλ, note that ω(H(t)) = E(t), so ω(P (t)) = P (−t). It follows that pr = (−1)r−1prand thus ω(pλ) = ελpλ. The result for sλ will follow directly from the Jacobi-Trudi identity,which will be proved in the next subsection.

At the moment, it is not clear why ω is so useful; basically, given some identity involvingone or more of the bases, we can apply ω to obtain a dual identity. This will change in thenext subsection when we prove an important property of ω.

3.5.5 Schur Functions

We need to finish off the proof of several theorems that were stated involving Schur functions.First, we prove that the algebraic and combinatorial definitions of Schur functions are thesame. We need a quick lemma first.

Proposition 3.33. Let α and α be compositions of n that are rearrangements of one another.Then Kλα = Kλα.

Proof. It suffices to prove this when

α = (α1, α2, . . . , αi−1, αi+1, αi, αi+2, . . . ).

Let Tλα be the set of all SSYT of shape λ and type α. Let Tλα be the set of all SSYT ofshape λ and type α. Let T ∈ Tλα.

Consider the parts of T equal to i or i + 1. If a column of T contains no such parts, orone part i and one part i + 1, ignore the column. The other columns contain one part i orone part i + 1. If a row has r i’s followed by s i + 1’s, convert to s i’s and r i + 1’s. Theresult is an element of Tλα, and this is a bijection.

Theorem 3.34. Let λ ` k. Then

sλ =∑µ`n

Kλµmµ,

where Kλµ is the number of semi-standard Young tableaux of shape λ and type µ. Further-more, Kλµ = 0 unless λ ≥ µ and Kλλ = 1. Therefore {sλ(x1, . . . , xn) : λ1 ≤ n and |λ| = k}is a basis for Λk

n.

86

Page 87: Lecture Notes Helleloid Algebraic Combinatorics

M390C Algebraic Combinatorics Fall 2008 Instructor: Geir Helleloid

Proof. Let e(j)n denote the elementary symmetric function in the variables x1, . . . , xj−1, xj+1, . . . , xn.

Let µ = (µ1, . . . , µl) be a composition, and define the l × l matrices

Aµ = (xµij ), Hµ = (hµi−l+j), E = ((−1)l−ie(j)l−i).

We claim Aµ = HµE. To prove this, write

E(j)(t) :=l−1∑n=0

e(j)n tn =∏i 6=j

(1 + xit).

Then

H(t)E(j)(−t) =1

1− xjt,

extracting the coefficient of tµi on each side gives

l∑k=1

hµi−l+l(−1)l−ke(j)l−k = x

µjj ,

and this proves the claim.Taking determinants, we find |Aµ| = |Hµ| · |E|, where |Aµ| = aµ. Letting µ = δ, since

|Hδ| = 1, we get |E| = aµ. Now letting µ = λ+ δ, we get

sλ =aλ+δ

aδ= |Hλ+δ| = |hλi−i+j|.

The Jacobi-Trudi identity, proved below, completes the proof.

Theorem 3.35 (Jacobi-Trudi Identities).

sλ = det(hλi−i+j)1≤i,j≤n (n ≥ `(λ))

sλ = det(eλ′i−i+j)1≤i,j≤m (m ≥ `(λ′))

Proof. For the first identity, we will use the Gessel-Viennot Lemma. Let A = A(α, β, γ, δ)consist of all path systems L, where path Li goes from (βi, γi) to (αi, δi). Let the weight ofa horizontal step in row j be xj. Let mi = αi− βi. Then the weight of a path system L willbe ∏

i

wt(Li) =∏i

∑γi≥k1≥···≥kmi≥δi

xk1 · · ·xkm =∏i

hmi(xδi , . . . , xγi).

Let B be the set of all non-intersecting path systems. Let B be the total weight of all systemsin B. Then

B = det(hαj−βi(xδj , . . . , xγi).

Now let αj = λj + n− j, βi = n− i, γi =∞, and δj = 1. Then

B = det(hλj−j+i).

On the other hand, given a non-intersecting lattice path system $L \in \mathcal{B}$, construct a reverse SSYT $T$ (all rows weakly decreasing and all columns strictly decreasing) of shape $\lambda$ such that the weight of $L$ is $x^{\operatorname{type}(T)}$: if the horizontal steps in path $L_i$ occur at heights $a_1 \ge a_2 \ge \cdots \ge a_{\lambda_i}$, let these be the $i$-th row of $T$. This is a bijection. Note that such reverse SSYT of type $\alpha = (\alpha_1, \alpha_2, \dots)$ are in bijection with SSYT of shape $\lambda$ and the reversed type: if $k$ is the largest entry of $T$, replacing $T_{ij}$ by $k + 1 - T_{ij}$ gives a bijection with the $K_{\lambda\tilde\alpha}$ SSYT of shape $\lambda$ and type $\tilde\alpha = (\alpha_k, \alpha_{k-1}, \dots)$, and $K_{\lambda\tilde\alpha} = K_{\lambda\alpha}$ by the proposition. Hence $B = \sum_\mu K_{\lambda\mu} m_\mu = s_\lambda$, and by the Gessel-Viennot Lemma (which gave $B$ as the determinant above), the first identity is proved.

To prove the second identity, consider the $(N+1) \times (N+1)$ matrices
\[
H = (h_{i-j})_{0 \le i,j \le N} \qquad \text{and} \qquad E = \bigl((-1)^{i-j} e_{i-j}\bigr)_{0 \le i,j \le N}.
\]
Both are lower triangular with $1$'s on the diagonal and, by the identity from the previous subsection, they are inverses of one another. A fact from linear algebra tells us that any minor of $H$ is equal to the complementary cofactor of $E^t$; that is, the determinant of the submatrix of $H$ obtained by selecting rows $i_1, \dots, i_k$ and columns $j_1, \dots, j_k$ is equal to $(-1)^{i_1 + \cdots + i_k + j_1 + \cdots + j_k}$ times the determinant of the submatrix of $E^t$ obtained by selecting the complementary rows and columns.

Let $\lambda$ be a partition of length at most $p$ such that $\lambda'$ has length at most $q$, where $p + q = N + 1$. Consider the minor of $H$ with row indices $\lambda_i + p - i$ ($1 \le i \le p$) and column indices $p - i$ ($1 \le i \le p$). The complementary cofactor of $E^t$ has row indices $p - 1 + j - \lambda'_j$ ($1 \le j \le q$) and column indices $p - 1 + j$ ($1 \le j \le q$). So
\[
\det\bigl(h_{\lambda_i - i + j}\bigr)_{1 \le i,j \le p} = (-1)^{|\lambda|} \det\bigl((-1)^{\lambda'_i - i + j} e_{\lambda'_i - i + j}\bigr)_{1 \le i,j \le q}.
\]
Cancelling the minus signs gives
\[
\det\bigl(h_{\lambda_i - i + j}\bigr) = \det\bigl(e_{\lambda'_i - i + j}\bigr),
\]
proving the result.
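As a quick sanity check of the first identity (not part of the notes; a small sketch using sympy, with ad hoc helper names), one can compare the Jacobi-Trudi determinant with the bialternant definition $s_\lambda = a_{\lambda+\delta}/a_\delta$ for $\lambda = (2,1)$ in three variables:

from itertools import combinations_with_replacement
from sympy import symbols, Matrix, cancel, simplify

x = symbols('x1 x2 x3')
n = len(x)

def h(k):
    # complete homogeneous symmetric polynomial h_k(x1, x2, x3); h_0 = 1, h_k = 0 for k < 0
    if k < 0:
        return 0
    total = 0
    for combo in combinations_with_replacement(x, k):   # k = 0 contributes the empty product 1
        term = 1
        for v in combo:
            term *= v
        total += term
    return total

def alternant(mu):
    # a_mu = det(x_j ** mu_i) for an exponent vector mu of length n
    return Matrix(n, n, lambda i, j: x[j] ** mu[i]).det()

lam = (2, 1, 0)                      # lambda = (2, 1), padded to n parts
delta = (2, 1, 0)                    # delta = (n-1, n-2, ..., 0)

s_bialternant = cancel(alternant(tuple(l + d for l, d in zip(lam, delta))) / alternant(delta))
s_jacobi_trudi = Matrix(2, 2, lambda i, j: h(lam[i] - i + j)).det()   # det(h_{lambda_i - i + j})

print(simplify(s_bialternant - s_jacobi_trudi))          # prints 0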

Corollary 3.36. $\omega(s_\lambda) = s_{\lambda'}$. (Indeed, applying $\omega$ to the first Jacobi-Trudi identity gives $\omega(s_\lambda) = \det(e_{\lambda_i - i + j})$, which equals $s_{\lambda'}$ by the second identity applied to $\lambda'$.)

3.5.6 The Hook Length Formula

One beautiful enumerative formula in this area is the Hook Length Formula. This formula counts the number of SYT of shape $\lambda$; since this is also the dimension $f^\lambda$ of the Specht module $S^\lambda$, this computes the dimensions of the irreducible representations of the symmetric group.

Given a shape $\lambda$ and a square $v \in \lambda$, let $H_v$ be the hook of $v$, namely $v$ together with the squares below $v$ in its column and the squares to the right of $v$ in its row, and let $h_v = |H_v|$ denote the hook length of $v$.

Theorem 3.37 (Hook Length Formula). Let $\lambda$ be a partition of $n$. Then
\[
f^\lambda = \frac{n!}{\prod_{v \in \lambda} h_v}.
\]
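A short Python sketch of the formula (ours, not from the notes; the helper names are ad hoc, and the shape is given as a weakly decreasing tuple of row lengths):

from math import factorial

def hooks(lam):
    # hook lengths of the partition lam = (lam_1 >= lam_2 >= ...), row by row
    conj = [sum(1 for part in lam if part > j) for j in range(lam[0])]   # conjugate partition
    return [[(lam[i] - j) + (conj[j] - i) - 1 for j in range(lam[i])]
            for i in range(len(lam))]

def num_syt(lam):
    # number of SYT of shape lam via the Hook Length Formula
    n = sum(lam)
    denom = 1
    for row in hooks(lam):
        for h in row:
            denom *= h
    return factorial(n) // denom

print(num_syt((3, 2)))   # 5
print(num_syt((4, 4)))   # 14, the Catalan number C_4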


An inner corner of $\lambda$ is a square whose removal leaves another partition; that is, it is a square that is at the end of a row and the end of a column.

Proof by Greene, Nijenhuis, Wilf. Consider the following algorithm:

1. Pick $v \in \lambda$ with probability $1/n$.

2. While $v$ is not an inner corner, pick $w \in H_v \setminus \{v\}$ with probability $1/(h_v - 1)$ and set $v := w$.

3. Label with $n$ the corner $v$ that you have reached.

4. Repeat with $\lambda$ replaced by $\lambda \setminus \{v\}$ and $n$ replaced by $n - 1$, until all the cells of $\lambda$ are labeled.

The sequence of nodes generated by one pass through the first three steps is called a trial. We claim that each SYT $P$ of shape $\lambda$ is produced with probability
\[
\frac{\prod_{v \in \lambda} h_v}{n!}.
\]
Since the algorithm always outputs some SYT of shape $\lambda$, summing this probability over all $f^\lambda$ tableaux gives $1$, so the claim implies $f^\lambda \cdot \prod_{v \in \lambda} h_v / n! = 1$, which is the theorem.
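The algorithm is easy to implement; here is a minimal Python sketch (ours, with ad hoc names), which by the claim samples a uniformly random SYT of the given shape:

import random

def hook_cells(lam, cell):
    # the cells of the hook H_v other than v itself, for v = cell in the shape lam
    r, c = cell
    arm = [(r, j) for j in range(c + 1, lam[r])]
    leg = [(i, c) for i in range(r + 1, len(lam)) if lam[i] > c]
    return arm + leg

def random_syt(shape):
    # sample a standard Young tableau of the given shape by repeated hook walks
    lam = list(shape)
    T = [[0] * part for part in lam]
    for label in range(sum(lam), 0, -1):
        # step 1: start at a uniformly random cell of the current shape
        cells = [(i, j) for i in range(len(lam)) for j in range(lam[i])]
        v = random.choice(cells)
        # step 2: while v is not an inner corner, move to a uniform cell of H_v minus v
        while True:
            others = hook_cells(lam, v)
            if not others:                 # v is an inner corner
                break
            v = random.choice(others)
        # step 3: label the corner and remove it from the shape
        T[v[0]][v[1]] = label
        lam[v[0]] -= 1
    return T

print(random_syt((3, 2)))                  # a random SYT of shape (3, 2)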

First, it is clear that the algorithm terminates at a SYT of shape $\lambda$. Now, let $(\alpha, \beta)$ be the cell of $P$ containing $n$ and let $p(\alpha, \beta)$ be the probability that the first trial ends there. Suppose we knew that
\[
p(\alpha, \beta) = \frac{1}{n} \prod_{i=1}^{\alpha-1} \Bigl(1 + \frac{1}{h_{i,\beta} - 1}\Bigr) \prod_{j=1}^{\beta-1} \Bigl(1 + \frac{1}{h_{\alpha,j} - 1}\Bigr).
\]
Let $P'$ be the SYT obtained by removing the square labeled $n$ from $P$; it has shape $\lambda \setminus \{(\alpha, \beta)\}$. Let $h'_v$ be the hook length of a square $v$ in this smaller shape. By induction,
\[
p(P') = \frac{\prod_{v} h'_v}{(n-1)!}.
\]
Note that $h'_v = h_v$ if $v$ is not in row $\alpha$ or column $\beta$, while if $v = (i, \beta)$ then $h'_v = h_{i,\beta} - 1$, and if $v = (\alpha, j)$ then $h'_v = h_{\alpha,j} - 1$. Since after the first trial the algorithm simply repeats on the smaller shape, we have
\begin{align*}
p(P) = p(\alpha, \beta)\, p(P')
&= \frac{1}{n} \cdot \frac{\prod_{v} h'_v}{(n-1)!} \prod_{i=1}^{\alpha-1} \Bigl(1 + \frac{1}{h_{i,\beta} - 1}\Bigr) \prod_{j=1}^{\beta-1} \Bigl(1 + \frac{1}{h_{\alpha,j} - 1}\Bigr) \\
&= \frac{\prod_{v} h'_v}{n!} \prod_{i=1}^{\alpha-1} \frac{h_{i,\beta}}{h_{i,\beta} - 1} \prod_{j=1}^{\beta-1} \frac{h_{\alpha,j}}{h_{\alpha,j} - 1}
= \frac{\prod_{v \in \lambda} h_v}{n!},
\end{align*}
where the last equality uses $h_{\alpha,\beta} = 1$.

It remains to show the formula for $p(\alpha, \beta)$. If a trial ends at $(\alpha, \beta)$, let the horizontal projection of the trial be
\[
I = \{\, i \ne \alpha : v = (i, j) \text{ for some } v \text{ on the trial} \,\},
\]
and define the vertical projection $J$ similarly. Let $p_{I,J}(\alpha, \beta)$ be the sum of the probabilities of all trials terminating at $(\alpha, \beta)$ with horizontal projection $I$ and vertical projection $J$. We claim that
\[
p_{I,J}(\alpha, \beta) = \frac{1}{n} \prod_{i \in I} \frac{1}{h_{i,\beta} - 1} \prod_{j \in J} \frac{1}{h_{\alpha,j} - 1}.
\]
Summing over all $I \subseteq [\alpha - 1]$ and all $J \subseteq [\beta - 1]$ and factoring gives the formula for $p(\alpha, \beta)$.

We prove the formula for $p_{I,J}(\alpha, \beta)$ by induction on $|I| + |J|$. For $I = J = \emptyset$, the trial starts (and ends) at $(\alpha, \beta)$, so the probability is $1/n$, agreeing with the formula. Otherwise, let $a$ be the smallest element of $I$ and let $b$ be the smallest element of $J$, and set $I' = I \setminus \{a\}$ and $J' = J \setminus \{b\}$. For a trial to have projections $I$ and $J$, it must start at $(a, b)$, and the next step must be either to $(a, b')$ or to $(a', b)$, where $a'$ is the smallest element of $I'$ and $b'$ is the smallest element of $J'$. (If one of $I$, $J$ is empty, interpret the corresponding minimum as $\alpha$ or $\beta$; then only one of the two options occurs, and the same computation applies.) So
\begin{align*}
p_{I,J}(\alpha, \beta) &= \frac{1}{n} \cdot \frac{1}{h_{a,b} - 1} \bigl( n\, p_{I',J}(\alpha, \beta) + n\, p_{I,J'}(\alpha, \beta) \bigr) \\
&= \frac{1}{n} \cdot \frac{1}{h_{a,b} - 1} \Bigl( \prod_{i \in I'} \frac{1}{h_{i,\beta} - 1} \prod_{j \in J} \frac{1}{h_{\alpha,j} - 1} + \prod_{i \in I} \frac{1}{h_{i,\beta} - 1} \prod_{j \in J'} \frac{1}{h_{\alpha,j} - 1} \Bigr) \\
&= \frac{1}{n} \cdot \prod_{i \in I} \frac{1}{h_{i,\beta} - 1} \prod_{j \in J} \frac{1}{h_{\alpha,j} - 1} \cdot \frac{h_{a,\beta} + h_{\alpha,b} - 2}{h_{a,b} - 1} \\
&= \frac{1}{n} \prod_{i \in I} \frac{1}{h_{i,\beta} - 1} \prod_{j \in J} \frac{1}{h_{\alpha,j} - 1},
\end{align*}
since $h_{a,b} - 1 = (h_{a,\beta} - 1) + (h_{\alpha,b} - 1)$. (The factors of $n$ in the first line appear because $p_{I',J}$ and $p_{I,J'}$ each include the probability $1/n$ of starting at the required cell, which is already accounted for here.) This completes the induction and the proof.

3.5.7 Orthogonality

This is the final subsection introducing the basics of symmetric functions. Here we discover the Cauchy identities, which lead us to define a scalar product (a $\mathbb{Z}$-valued bilinear form) on $\Lambda^n$ (in fact, it will be an inner product, but we will need a couple of results before we can prove it is symmetric and positive definite). This scalar product gives us the final tool to understand the transition matrices between bases and how to express elements of $\Lambda^n$ in terms of a basis.

Theorem 3.38 (Cauchy Identities).
\begin{align*}
\prod_{i,j} (1 - x_i y_j)^{-1} &= \sum_\lambda z_\lambda^{-1} p_\lambda(x)\, p_\lambda(y) \\
&= \sum_\lambda h_\lambda(x)\, m_\lambda(y) \\
&= \sum_\lambda m_\lambda(x)\, h_\lambda(y) \\
&= \sum_\lambda s_\lambda(x)\, s_\lambda(y).
\end{align*}


Proof.
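A sketch of the first equality via power sums (this is the standard argument; the proof given in lecture may have differed): taking logarithms,
\[
\log \prod_{i,j} (1 - x_i y_j)^{-1} = \sum_{i,j} \sum_{r \ge 1} \frac{(x_i y_j)^r}{r} = \sum_{r \ge 1} \frac{p_r(x)\, p_r(y)}{r},
\]
and exponentiating,
\[
\prod_{i,j} (1 - x_i y_j)^{-1} = \prod_{r \ge 1} \sum_{m_r \ge 0} \frac{\bigl(p_r(x)\, p_r(y)\bigr)^{m_r}}{r^{m_r}\, m_r!} = \sum_\lambda z_\lambda^{-1} p_\lambda(x)\, p_\lambda(y),
\]
where $\lambda$ runs over the partitions with $m_r$ parts equal to $r$ and $z_\lambda = \prod_r r^{m_r} m_r!$.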

It turns out that the Cauchy identities suggest defining a scalar product on $\Lambda^n$ by $\langle h_\lambda, m_\mu \rangle = \delta_{\lambda\mu}$ for all partitions $\lambda$ and $\mu$.

