Rajnikant Sinha Galois Theory and Advanced Linear Algebra
Rajnikant Sinha
Galois Theory and Advanced Linear Algebra
Rajnikant Sinha
Samne Ghat
Varanasi, Uttar Pradesh, India
ISBN 978-981-13-9848-3
ISBN 978-981-13-9849-0 (eBook)
https://doi.org/10.1007/978-981-13-9849-0
© Springer Nature Singapore Pte Ltd. 2020
This work is subject to copyright. All rights are reserved by the Publisher, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, reuse of illustrations, recitation, broadcasting, reproduction on microfilms or in any other physical way, and transmission or information storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar methodology now known or hereafter developed.
The use of general descriptive names, registered names, trademarks, service marks, etc. in this publication does not imply, even in the absence of a specific statement, that such names are exempt from the relevant protective laws and regulations and therefore free for general use.
The publisher, the authors and the editors are safe to assume that the advice and information in this book are believed to be true and accurate at the date of publication. Neither the publisher nor the authors or the editors give a warranty, expressed or implied, with respect to the material contained herein or for any errors or omissions that may have been made. The publisher remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
This Springer imprint is published by the registered company Springer Nature Singapore Pte Ltd.
The registered company address is: 152 Beach Road, #21-01/04 Gateway East, Singapore 189721, Singapore
Preface
Évariste Galois (25 October 1811–31 May 1832) was a great French mathematician. While still in his teens, he was able to determine a necessary and sufficient condition for a polynomial to be solvable by radicals, thereby solving a problem that had stood for 350 years. The famous ancient problem of “trisecting an angle by using solely straightedge and compass” was later solved by using the fundamental theorem of Galois theory. Many students are overwhelmed to learn this. They take keen interest in learning the theory Galois had discovered. Unfortunately, there is no literature which can lead to the fundamental theorem without going through a painful learning process. Further, there is no right kind of book on linear algebra that can provide the good theoretical foundation needed for later applications in Riemannian geometry, quantum mechanics, etc. These voids prompted me to write this book.
This book is meant to be an introduction to abstract algebra. The reader of this book is assumed to have some prior exposure to elementary properties of groups, rings, fields, and vector spaces. At times, we shall assume familiarity with inner product spaces of finite dimension, and linear transformations. Readers who have learned only a minimum of abstract algebra will also find this book friendly. A nodding acquaintance with elementary properties of positive integers is beneficial.
Most of the material usually taught in an abstract algebra course is presented in this text. However, some results appear for the first time in textbook form. The ordering of the topics as well as the approach we have taken sometimes deviates from the standard path, simply for pedagogical reasons. Aside from the usual approach, we have sometimes also developed a more elementary approach that uses standard calculation techniques. Wherever required, we have also supplied an abundant “second layer of proof” (that is, proof within proof), so that the comprehensibility of the proof is enhanced. In some named theorems, we occasionally need a third layer of proof.
In the first part of Chap. 1, we have developed the necessary field theory. Using these theorems, we have tried to prove the fundamental theorem of Galois theory. This is the theorem for which Galois became immortal. In Chap. 2, some wonderful applications of Galois theory are presented. A solution to the famous ancient problem of “trisection of a given angle by ruler and compass” is given here with a sufficiently detailed proof. In Chap. 3, we have supplied the proofs of many celebrated theorems by using linear transformation tools as well as matrix methods. These are beautiful areas of mathematics in themselves. Their applications to quantum mechanics, manifold theory, etc., are well known. In Chap. 4, we have dealt with some amazing, but forgotten, results of yesteryear. It is known that the “signature” of a quadratic form is invariant, but it is difficult to find an accessible proof of it. We have tried to supply a proof that makes for effortless reading.
Finally, on a personal note, I would like to thank my lovely wife, Bina, for her patient endurance and constant encouragement.
Uttar Pradesh, India Rajnikant Sinha
Contents
1 Galois Theory I
1.1 Euclidean Rings
1.2 Polynomial Rings
1.3 The Eisenstein Criterion
1.4 Roots of Polynomials
1.5 Splitting Fields
Exercises
2 Galois Theory II
2.1 Simple Extensions
2.2 Galois Groups
2.3 Applications of Galois Theory
2.4 Solvability By Radicals
Exercises
3 Linear Transformations
3.1 Eigenvalues
3.2 Canonical Forms
3.3 The Cayley–Hamilton Theorem
Exercises
4 Sylvester’s Law of Inertia
4.1 Positive Definite Matrices
4.2 Sylvester’s Law
4.3 Application to Riemannian Geometry
Exercises
Bibliography
About the Author
Rajnikant Sinha is a former Professor of Mathematics at Magadh University, Bodh Gaya, India. A passionate mathematician, Prof. Sinha has published numerous interesting research findings in international journals, and has authored three textbooks with Springer Nature: Smooth Manifolds, Real and Complex Analysis: Volume 1, and Real and Complex Analysis: Volume 2; and a contributed volume on Solutions to Weatherburn’s Elementary Vector Analysis: With Applications to Geometry and Mechanics with another publisher. His research focuses on topological vector spaces, differential geometry and manifolds.
Chapter 1
Galois Theory I
Roughly, a field is a commutative ring in which division by every nonzero element is allowed. In algebra, fields play a central role. Results about fields find important applications in the theory of numbers. The theory of fields comprises the subject matter of the theory of equations. Here, we shall deal lightly with the field of algebraic numbers. Our main emphasis will be on aspects of field theory that concern the roots of polynomials. The beautiful ideas, due to the brilliant French mathematician Évariste Galois (1811–1832), served as an inspiration for the development of abstract algebra. We shall prove the fundamental theorem of Galois theory.
1.1 Euclidean Rings
1.1.1 Definition Let $R$ be an integral domain. Suppose that for every nonzero member $a$ of $R$, $d(a)$ is a nonnegative integer. If

1. for every nonzero $a, b \in R$, $d(a) \le d(ab)$,
2. for every nonzero $a, b \in R$, there exist $q, r \in R$ such that $a = qb + r$, and either $r = 0$ or $d(r) < d(b)$,

then we say that $R$ is a Euclidean ring.
1.1.2 Theorem Let $R$ be a Euclidean ring. Let $A$ be an ideal of $R$. Then there exists $a_0 \in A$ such that $a_0 R = A$.

Proof If $A = \{0\}$, then $0$ serves the purpose of $a_0$. So we consider the case $A \ne \{0\}$.

It follows that there exists a nonzero member $a$ of $A$, and hence $\{d(x) : x \in A \text{ and } x \ne 0\}$ is a nonempty set of nonnegative integers. Hence, $\min\{d(x) : x \in A \text{ and } x \ne 0\}$ exists. It follows that there exists a nonzero member $a_0$ of $A$ such that
$$d(a_0) = \min\{d(x) : x \in A \text{ and } x \ne 0\}. \qquad (*)$$

It remains to show that $A = a_0 R$.

For this purpose, take an arbitrary $a \in A$. If $a = 0$, then $a = 0 = a_0 0 \in a_0 R$. Thus if $a = 0$, then $a \in a_0 R$.

Now we consider the case $a \ne 0$. Since $a, a_0$ are nonzero members of $R$, and $R$ is a Euclidean ring, there exist $q, r \in R$ such that $a = q a_0 + r$ and either $r = 0$ or $d(r) < d(a_0)$. Since $a, a_0$ are members of the ideal $A$ of $R$, we have $r = (a - q a_0) \in A$, and hence $r \in A$. Now, if $r \ne 0$, then from $(*)$, $d(a_0) \le d(r)$. Further, either $r = 0$ or $d(r) < d(a_0)$. This shows that $a - q a_0 = r = 0$, and hence $a = q a_0 = a_0 q \in a_0 R$.

Thus in all cases, $a \in A \Rightarrow a \in a_0 R$. Hence $A \subseteq a_0 R$. It remains to show that $a_0 R \subseteq A$. For this purpose, let us take an arbitrary $b \in R$. We have to show that $a_0 b \in A$. Since $a_0$ is a member of the ideal $A$ of $R$, and $b \in R$, we have $a_0 b \in A$. ■
Definition Let $R$ be an integral domain. Let $a \in R$. It is clear that $aR$ is an ideal of $R$. The ideal $aR$ of $R$ is denoted by $(a)$.

Definition Let $R$ be an integral domain. If

1. $R$ has a unit element,
2. every ideal of $R$ is of the form $(a)$,

then we say that $R$ is a principal ideal ring.
1.1.3 Theorem Let $R$ be a Euclidean ring. Then $R$ has a unit element.

Proof Since $R$ is an ideal of the Euclidean ring $R$, by 1.1.2, there exists $a_0 \in R$ such that $a_0 R = R$. Since $a_0 R = R$ and $a_0 \in R$, we have $a_0 \in a_0 R$, and hence there exists $e \in R$ such that $a_0 = a_0 e$. It suffices to show that $e$ functions as a unit element in $R$. To this end, let us take an arbitrary $b \in R$. We have to show that $be = b$.

Since $b \in R = a_0 R$, there exists $c \in R$ such that $b = a_0 c$. Now,

$$\text{LHS} = be = (a_0 c)e = (c a_0)e = c(a_0 e) = c a_0 = a_0 c = b = \text{RHS},$$

where LHS and RHS are the left- and right-hand sides of the equality to be proved. ■
1.1.4 Theorem Let $R$ be a Euclidean ring. Then $R$ is a principal ideal ring.

Proof By 1.1.3, $R$ has a unit element. Next, by 1.1.2, every ideal of $R$ is of the form $(a)$. It follows, by the definition of principal ideal ring, that $R$ is a principal ideal ring. ■
1.1.5 Theorem Let $R$ be a Euclidean ring. Let $a, b \in R$. Then the greatest common divisor $(a, b)$ of $a$ and $b$ exists in $R$, in the sense that

1. $(a, b) \in R$,
2. $(a, b) \mid a$,
3. $(a, b) \mid b$,
4. $(c \mid a \text{ and } c \mid b) \Rightarrow c \mid (a, b)$.

Further, there exist $s, t \in R$ such that

$$(a, b) = as + bt.$$

Proof Since $R$ is a Euclidean ring, by 1.1.3, $R$ has a unit element, say $e$. Let

$$A \equiv \{ax + by : x, y \in R\}.$$

Clearly $A$ is an ideal of the Euclidean ring $R$. Now, by 1.1.2, there exists $f \in R$ such that

$$f = fe \in fR = A = \{ax + by : x, y \in R\},$$

and hence $f \in \{ax + by : x, y \in R\}$. It follows that there exist $s, t \in R$ such that

$$f = as + bt.$$

Since $fR = \{ax + by : x, y \in R\} \ni (ae + b0) = a$, we have $a \in fR$, and hence $f \mid a$. Similarly, $f \mid b$. Next suppose that $c \mid a$ and $c \mid b$. It remains to show that $c \mid (as + bt)$. This is clearly true. ■
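As a concrete illustration of 1.1.5, the following Python sketch takes $R = \mathbb{Z}$ with $d(a) = |a|$ and computes a greatest common divisor together with coefficients $s, t$ such that $(a, b) = as + bt$. The proof above is an existence argument through ideals; the sketch instead uses the classical extended Euclidean algorithm, which is a swapped-in technique and not the method of the text. The function name `extended_gcd` is chosen here for illustration only.

```python
def extended_gcd(a, b):
    """Return (g, s, t) with g a greatest common divisor of a and b, g >= 0, and g == a*s + b*t."""
    # Loop invariants: old_r == a*old_s + b*old_t and r == a*s + b*t.
    old_r, r = a, b
    old_s, s = 1, 0
    old_t, t = 0, 1
    while r != 0:
        q = old_r // r                      # quotient of the division step
        old_r, r = r, old_r - q * r         # the remainder replaces the divisor
        old_s, s = s, old_s - q * s
        old_t, t = t, old_t - q * t
    # g and -g are associates in Z; normalise to the nonnegative one.
    if old_r < 0:
        old_r, old_s, old_t = -old_r, -old_s, -old_t
    return old_r, old_s, old_t


g, s, t = extended_gcd(240, 46)
print(g, s, t)  # 2 -9 47, since 240*(-9) + 46*47 = 2
```

Every common divisor $c$ of 240 and 46 divides $240 s + 46 t = 2$, which is item 4 of the theorem in this instance.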
Definition Let $R$ be a commutative ring with unit element 1. Let $a \in R$. If there exists $b \in R$ such that $ab = 1$, then we say that $a$ is a unit in $R$.

1.1.6 Theorem Let $R$ be an integral domain with unit element 1. Let $a, b$ be nonzero members of $R$. Suppose that $a \mid b$ and $b \mid a$. Then there exists $u \in R$ such that $au = b$, and $u$ is a unit in $R$.

Proof Since $a \mid b$, there exists $u \in R$ such that $au = b$. Similarly, there exists $v \in R$ such that $bv = a$. It follows that

$$a(uv) = (au)v = a = a1,$$

and hence $a(uv) = a1$. Now, since $a$ is a nonzero member of the integral domain $R$, we have $uv = 1$, and hence $u$ is a unit in $R$. ■

Definition Let $R$ be a commutative ring with unit element 1. Let $a, b \in R$. If there exists a unit $u$ in $R$ such that $au = b$, then we say that $a$ and $b$ are associates, and we denote this relationship by $a \sim b$.
It is clear that $\sim$ is an equivalence relation over $R$. Hence $R$ is partitioned by $\sim$ into equivalence classes.

1.1.7 Theorem Let $R$ be a Euclidean ring. Let $a, b \in R$. Let $c$ be a greatest common divisor of $a$ and $b$. Let $d$ be a greatest common divisor of $a$ and $b$. Then $c \sim d$.

Proof Since $c$ is a greatest common divisor of $a$ and $b$, we have ($c \mid a$ and $c \mid b$). Now, since $d$ is a greatest common divisor of $a$ and $b$, we have $c \mid d$. Similarly, $d \mid c$. Since $c \mid d$ and $d \mid c$, by 1.1.6 there exists a unit $u \in R$ such that $cu = d$, and hence $c \sim d$. ■
1.1.8 Theorem Let $R$ be a Euclidean ring with unit element 1. Let $a, b$ be nonzero members of $R$. Let $b$ be a nonunit. Then

$$d(a) < d(ab).$$

Proof Since $d(a) \le d(ab)$, it suffices to exclude equality. Suppose to the contrary that $d(a) = d(ab)$. We seek a contradiction.

Every nonzero member of the ideal $aR$ has the form $ax$, where $x$ is a nonzero member of $R$, and $d(a) \le d(ax)$. Thus $d(a)$ is the minimum of $d$ over the nonzero members of $aR$. Since $ab$ is a nonzero member of $aR$ and $d(ab) = d(a)$, the member $ab$ attains this minimum, and hence, by the proof of 1.1.2, $(ab)R = aR$. Now, since $a = a1 \in aR = (ab)R$, we have $a \in (ab)R$, and hence there exists a nonzero member $c$ of $R$ such that $a = (ab)c$. Hence

$$a1 = a = a(bc).$$

Since $a1 = a(bc)$ and $a$ is a nonzero member of the integral domain $R$, we have $1 = bc$, and hence $b$ is a unit. This is a contradiction. ■
Definition Let $R$ be a Euclidean ring with unit element 1. Let $p$ be a nonzero member of $R$ that is not a unit. If

$$(a, b \in R \text{ and } p = ab) \Rightarrow (a \text{ is a unit or } b \text{ is a unit}),$$

then we say that $p$ is a prime element of $R$.

1.1.9 Theorem Let $R$ be a Euclidean ring with unit element 1. Let $a$ be a nonzero member of $R$. Suppose that $d(a) = d(1)$. Then $a$ is a unit.

Proof Suppose to the contrary that $a$ is a nonunit. We seek a contradiction. Here $1, a$ are nonzero members of $R$. Also, $a$ is not a unit. So by 1.1.8,

$$d(1) < d(1a) = d(a),$$

and hence $d(1) < d(a)$. Thus $d(a) \ne d(1)$. This is a contradiction. ■

1.1.10 Problem Let $R$ be a Euclidean ring with unit element 1. Let $a$ be a nonzero member of $R$. Let $a$ be a unit. Then $d(a) = d(1)$.
Proof Suppose to the contrary that $d(a) \ne d(1)$. We seek a contradiction. Since $a$ is a unit, there exists a nonzero member $b$ of $R$ such that $ab = 1$. Now, since $R$ is a Euclidean ring, we have

$$d(a) \le d(ab) = d(1) \le d(1a) = d(a),$$

and hence $d(a) = d(1)$. This is a contradiction. ■
1.1.11 Problem Let $R$ be a Euclidean ring with unit element 1. Then for every nonzero member $a$ of $R$, either $a$ is a unit or $a$ can be expressed as a product of finitely many prime elements of $R$.

Proof (Induction on $d(a)$): Let us first consider the case that $a$ is a nonzero member of $R$ and $d(a) = 0$. Since

$$0 \le d(1) \le d(1a) = d(a) = 0,$$

we have $d(1) = d(a)$. Now by 1.1.9, $a$ is a unit. Thus the statement “either $a$ is a unit or $a$ can be expressed as a product of finitely many prime elements of $R$” holds in this case.

Next suppose that the statement “either $a$ is a unit or $a$ can be expressed as a product of finitely many prime elements of $R$” holds for all nonzero $a$ in $R$ for which $d(a) \le n$.

Next suppose that $b$ is a nonzero member of $R$ for which $d(b) = n + 1$. We have to show that the statement “either $b$ is a unit or $b$ can be expressed as a product of finitely many prime elements of $R$” holds.

Case I: $b$ is a prime element of $R$. In this case, the statement “$b$ can be expressed as a product of finitely many prime elements of $R$” holds, and hence the statement “either $b$ is a unit or $b$ can be expressed as a product of finitely many prime elements of $R$” holds.

Case II: $b$ is not a prime element of $R$.

Subcase I: $b$ is a unit. In this subcase, the statement “either $b$ is a unit or $b$ can be expressed as a product of finitely many prime elements of $R$” holds.

Subcase II: $b$ is not a unit. Here $b$ is not a prime element of $R$, so by the definition of prime element, there exist $c, e \in R$ such that

1. $b = ce$,
2. $c$ is not a unit,
3. $e$ is not a unit.

Since $b$ is a nonzero member of $R$ and $b = ce$, it follows that $c, e$ are nonzero members of $R$. Since $c$ is not a unit, by 1.1.8,
$$d(e) < d(ec) = d(ce) = d(b) = n + 1,$$

and hence $d(e) < n + 1$. Now, since $d(e)$ is an integer, we have $d(e) \le n$. Similarly, $d(c) \le n$. Since $d(c) \le n$ and $c$ is not a unit, by hypothesis, $c$ can be expressed as a product of finitely many prime elements of $R$. Similarly, $e$ can be expressed as a product of finitely many prime elements of $R$. It follows that $ce$ can be expressed as a product of finitely many prime elements of $R$. Now, since $b = ce$, $b$ can be expressed as a product of finitely many prime elements of $R$. ■
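The induction in 1.1.11 can be traced in the special case $R = \mathbb{Z}$, $d(a) = |a|$. The Python sketch below is only an illustration under that assumption: it either finds a splitting $b = ce$ into two nonunits (Case II, Subcase II), each with smaller $d$, and recurses on both factors, or it concludes that $b$ is prime (Case I). The name `factor` is hypothetical.

```python
from math import isqrt


def factor(b):
    """Return a list of positive primes whose product is |b|, for an integer b with |b| >= 2."""
    b = abs(b)  # b and -b are associates in Z
    if b < 2:
        raise ValueError("0 and the units 1, -1 are excluded")
    for c in range(2, isqrt(b) + 1):
        if b % c == 0:
            e = b // c
            # b = c*e with c, e nonunits, so d(c) < d(b) and d(e) < d(b):
            # the induction hypothesis applies to both factors.
            return factor(c) + factor(e)
    return [b]  # no splitting into nonunits exists, so b is prime


print(factor(360))  # [2, 2, 2, 3, 3, 5]
```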
Definition Let $R$ be a Euclidean ring with unit element 1. Let $a, b$ be nonzero members of $R$. (By 1.1.5, a greatest common divisor of $a$ and $b$ exists in $R$.) If there exists a unit $u$ in $R$ such that $u$ is a greatest common divisor of $a$ and $b$, then we say that $a$ and $b$ are relatively prime.

1.1.12 Problem Let $R$ be a Euclidean ring with unit element 1. Let $a, b, c$ be any nonzero elements of $R$. Suppose that $a \mid bc$. Let $a$ and $b$ be relatively prime. Then $a \mid c$.

Proof Since $a$ and $b$ are relatively prime, there exists a unit $u$ in $R$ such that $u$ is a greatest common divisor of $a$ and $b$. Now, by 1.1.5, there exist $s, t \in R$ such that

$$u = as + bt.$$

Since $u$ is a unit in $R$, there exists $v$ in $R$ such that $uv = 1$. It follows that

$$asv + btv = (as + bt)v = 1,$$

and hence

$$acsv + bctv = asvc + btvc = (asv + btv)c = 1c = c.$$

Thus $acsv + (bc)tv = c$. Since $a \mid bc$, there exists a nonzero member $k$ of $R$ such that $ak = bc$. It follows that

$$a(csv + ktv) = acsv + (ak)tv = c,$$

and hence $al = c$, where $l \equiv (csv + ktv) \in R$. Thus $a \mid c$. ■

1.1.13 Problem Let $R$ be a Euclidean ring with unit element 1. Let $a, p$ be any nonzero elements of $R$. Suppose that $p$ is a prime element of $R$. Then either $p \mid a$ or ($p$ and $a$ are relatively prime).

Proof By 1.1.5, there exists $e \in R$ such that $e \mid p$, $e \mid a$, and $((c \mid p \text{ and } c \mid a) \Rightarrow c \mid e)$, that is, $e$ is a greatest common divisor of $p$ and $a$.
Since $e \mid p$, there exists $k \in R$ such that $ek = p$. Now, since $p$ is a prime element of $R$, either $e$ is a unit or $k$ is a unit.

Case I: $k$ is a unit. It follows that there exists $l \in R$ such that $kl = 1$. Now, since $ek = p$, we have

$$e = e1 = e(kl) = (ek)l = pl,$$

and hence $e = pl$. Now, since $e \mid a$, we have $(pl) \mid a$, and hence there exists $m \in R$ such that $p(lm) = (pl)m = a$. Thus $pn = a$, where $n \equiv lm \in R$. Hence $p \mid a$. Thus the statement “either $p \mid a$ or ($p$ and $a$ are relatively prime)” holds.

Case II: $e$ is a unit. Since $e$ is a greatest common divisor of $p$ and $a$, and $e$ is a unit, $p$ and $a$ are relatively prime, and hence the statement “either $p \mid a$ or ($p$ and $a$ are relatively prime)” holds. ■
1.1.14 Problem Let $R$ be a Euclidean ring with unit element 1. Let $p$ be a nonzero element of $R$. Suppose that $p$ is a prime element of $R$. Then

$$(a, b \in R \text{ and } p \mid (ab)) \Rightarrow (p \mid a \text{ or } p \mid b).$$

Proof Let us take arbitrary nonzero members $a$ and $b$ of $R$ such that $p \mid (ab)$. It suffices to show that $p \mid a$ or $p \mid b$. Suppose to the contrary that $p \nmid a$ and $p \nmid b$. We have to arrive at a contradiction.

Since $p$ is a prime element of $R$, by 1.1.13 we have $p \mid a$ or ($p$ and $a$ are relatively prime). Now, since $p \nmid a$, $p$ and $a$ are relatively prime. Next, since $p \mid (ab)$ and $p$ and $a$ are relatively prime, by 1.1.12, we have $p \mid b$. This is a contradiction. ■

1.1.15 Theorem Let $R$ be a Euclidean ring with unit element 1. Let $a$ be a nonzero element of $R$. Suppose that $a$ is not a unit in $R$. (By 1.1.11, $a$ can be expressed as a product of finitely many prime elements of $R$.) Let

$$a = p_1 p_2 \cdots p_m,$$

where each $p_i$ $(i = 1, 2, \ldots, m)$ is a prime element of $R$. Let

$$a = p'_1 p'_2 \cdots p'_n,$$

where each $p'_j$ $(j = 1, 2, \ldots, n)$ is a prime element of $R$. Then

1. each $p_i$ is an associate of some $p'_j$,
2. each $p'_j$ is an associate of some $p_i$,
3. $n = m$.

This theorem is known as the unique factorization theorem.
Proof Since $p_1 \mid (p_1 p_2 \cdots p_m)$ and $p_1 p_2 \cdots p_m = a = p'_1 p'_2 \cdots p'_n$, we have $p_1 \mid (p'_1 p'_2 \cdots p'_n)$. Now, since $p_1$ is a prime element of $R$, by 1.1.14 we have $p_1 \mid p'_j$ for some $j \in \{1, 2, \ldots, n\}$.

Here for some $j \in \{1, 2, \ldots, n\}$, we have $p_1 \mid p'_j$, and hence there exists a nonzero $k \in R$ such that $p_1 k = p'_j$. Now, since $p'_j$ is a prime element of $R$, either $p_1$ is a unit or $k$ is a unit. Since $p_1$ is a prime element of $R$, $p_1$ is not a unit. It follows that $k$ is a unit. Now, since $p_1 k = p'_j$, we have $p_1 \sim p'_j$, where $j \in \{1, 2, \ldots, n\}$. Similarly, $p_2 \sim p'_k$, where $k \in \{1, 2, \ldots, n\}$, etc. Thus, each $p_i$ is an associate of some $p'_j$. Similarly, each $p'_j$ is an associate of some $p_i$. This proves (1) and (2).

For (3): Suppose to the contrary that $m \ne n$. By symmetry, we may assume that $m < n$. We seek a contradiction.

Since $p_1$ is an associate of some $p'_j$ and

$$p_1 p_2 \cdots p_m = p'_1 p'_2 \cdots p'_n,$$

we get an equality of the form

$$p_2 p_3 \cdots p_m = u\, p'_1 p'_2 \cdots p'_{j-1} p'_{j+1} \cdots p'_n,$$

where $u$ is a unit. Next, since $p_2 \mid (p_2 p_3 \cdots p_m)$, and $p_2 p_3 \cdots p_m = u\, p'_1 p'_2 \cdots p'_{j-1} p'_{j+1} \cdots p'_n$, we have $p_2 \mid (u\, p'_1 p'_2 \cdots p'_{j-1} p'_{j+1} \cdots p'_n)$. Now, since $p_2$ is a prime element of $R$, by 1.1.14 we have $p_2 \mid u$ or $p_2 \mid p'_k$ for some $k \in (\{1, 2, \ldots, n\} - \{j\})$.

We claim that $p_2 \nmid u$. Suppose to the contrary that $p_2 \mid u$. We seek a contradiction. Since $u$ is a unit, there exists a nonzero $v \in R$ such that $uv = 1$. Since $p_2 \mid u$, we have $p_2 \mid (uv)$, and hence $p_2 \mid 1$. This shows that $p_2$ is a unit. Since $p_2$ is a prime element of $R$, $p_2$ is not a unit. This is a contradiction. Thus our claim is true, that is, $p_2 \nmid u$.

It follows that $p_2 \mid p'_k$ for some $k \in (\{1, 2, \ldots, n\} - \{j\})$. Similarly, $p_3 \mid p'_l$ for some $l \in (\{1, 2, \ldots, n\} - \{j, k\})$, etc. On repeating this argument $m$ times, we get an equality of the form

$$1 = w'\, p'_{m+1} p'_{m+2} \cdots p'_n,$$

where $w'$ is a unit. It follows that $p'_{m+1}$ is a unit, and hence $p'_{m+1}$ is not a prime element. This is a contradiction. ■
1.1.16 Problem Let $R$ be a Euclidean ring with unit element 1. Let $p$ be a nonzero element of $R$. Suppose that $p$ is not a unit in $R$. Let $p$ be a prime element of $R$. Then the ideal $(p)$ is maximal.

Proof If $(p) = R$, then it is clear that $(p)$ is a maximal ideal. So we consider the case that the ideal $(p)$ is a proper subset of $R$. We have to show that $(p)$ is maximal.

Suppose to the contrary that there exists an ideal $U$ of $R$ such that $(p)$ is a proper subset of $U$, and $U$ is a proper subset of $R$. We seek a contradiction.
By 1.1.4, $R$ is a principal ideal ring. It follows that there exists $a \in U$ such that $(a) = U$. Thus $(p)$ is a proper subset of $(a)$, and $(a)$ is a proper subset of $R$. Since $(a)$ is a proper subset of $R$, $a$ is not a unit. Since $(p)$ is a subset of $(a)$ and $p \in (p)$, we have $p \in (a)$. Hence there exists a nonzero $u \in R$ such that $p = au$. Now, since $p$ is a prime element of $R$, $a$ is a unit or $u$ is a unit. Next, since $a$ is not a unit, $u$ is a unit. Since $u$ is a unit and $p = au$, we have $(p) = (a)$. This contradicts the fact that $(p)$ is a proper subset of $(a)$. ■

1.1.17 Problem Let $R$ be a Euclidean ring with unit element 1. Let $p$ be a nonzero element of $R$. Suppose that $p$ is not a unit in $R$. Let the ideal $(p)$ be maximal. Then $p$ is a prime element of $R$.

Proof Suppose to the contrary that there exist nonzero $a, b$ in $R$ such that $p = ab$, and neither $a$ nor $b$ is a unit. We seek a contradiction.

Since $p = ab$, we have $(p) \subseteq (a)$. Since $(p) \subseteq (a)$ and $(p)$ is maximal, either $(p) = (a)$ or $(a) = R$. Since $a$ is not a unit, we have $(a) \ne R$. It follows that $(p) = (a)$, and hence there exists $u \in R$ such that $a = pu$. Now, since $p = ab$, we have

$$p1 = p = (pu)b = p(ub),$$

and hence $p1 = p(ub)$. Next, since $p$ is a nonzero element of $R$, we have $1 = ub$, and hence $b$ is a unit. This is a contradiction. ■
1.1.18 Notation The collection of all complex numbers $a + ib$, where $a, b$ are integers, is denoted by $J[i]$. Its members are called Gaussian integers.

It is easy to see that $J[i]$ is an integral domain with unit element $1\ (= 1 + i0 \in J[i])$. For every nonzero $a + ib \in J[i]$, by $d(a + ib)$ we shall mean the positive integer $a^2 + b^2$.

1.1.19 Note Observe that for every nonzero $(a + ib), (e + if) \in J[i]$, we have $0 < (a^2 + b^2)$, and $1 \le (e^2 + f^2)$, and hence

$$d(a + ib) = (a^2 + b^2)1 \le (a^2 + b^2)(e^2 + f^2) = |a + ib|^2 |e + if|^2 = (|a + ib|\,|e + if|)^2 = |(a + ib)(e + if)|^2 = d((a + ib)(e + if)).$$

Thus for every nonzero $(a + ib), (e + if) \in J[i]$, we have

$$d(a + ib) \le d((a + ib)(e + if)).$$

1.1.20 Note Let $(a + ib)$ be any nonzero member of $J[i]$, where $a$ and $b$ are integers. Let $x$ be any positive integer. By the divisibility property of integers, there exist two integers $q_1, r_1$ such that $a = q_1 x + r_1$, where $-\frac{x}{2} \le r_1 \le \frac{x}{2}$.
Examples: Let $x = 7$ and $a = 42$. Since $42 = 6 \cdot 7 + 0$, we can take $q_1 = 6$ and $r_1 = 0 \in \left[-\frac{7}{2}, \frac{7}{2}\right]$.

Let $x = 7$ and $a = 47$. Since $47 = 6 \cdot 7 + 5 = (6 + 1) \cdot 7 + (-2)$, we can take $q_1 = 7$ and $r_1 = -2 \in \left[-\frac{7}{2}, \frac{7}{2}\right]$.

Let $x = 6$ and $a = 45$. Since $45 = 7 \cdot 6 + 3$, we can take $q_1 = 7$ and $r_1 = 3 \in \left[-\frac{6}{2}, \frac{6}{2}\right]$.

Similarly, there exist two integers $q_2, r_2$ such that $b = q_2 x + r_2$, where $-\frac{x}{2} \le r_2 \le \frac{x}{2}$.

It follows that

$$(a + ib) = (q_1 x + r_1) + i(q_2 x + r_2) = (q_1 + iq_2)x + (r_1 + ir_2),$$

and hence

$$(a + ib) = qx + r,$$

where $q \equiv (q_1 + iq_2) \in J[i]$, and $r \equiv (r_1 + ir_2) \in J[i]$.

Suppose that $r \ne 0$. Since $|r_1| \le \frac{x}{2}$ and $|r_2| \le \frac{x}{2}$, we have

$$d(r) = d(r_1 + ir_2) = (r_1)^2 + (r_2)^2 \le \frac{x^2}{2} < x^2 = d(x + i0) = d(x).$$
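The division with a “centered” remainder used in 1.1.20 can be sketched in Python as follows. The helper name `balanced_divmod` is an assumption made for this illustration; it starts from the ordinary division with $0 \le r < x$ and, when $r > \frac{x}{2}$, moves one multiple of $x$ from the remainder into the quotient.

```python
def balanced_divmod(a, x):
    """Return (q, r) with a == q*x + r and -x/2 <= r <= x/2, for an integer a and a positive integer x."""
    if x <= 0:
        raise ValueError("x must be a positive integer")
    q, r = divmod(a, x)      # ordinary division: 0 <= r < x
    if 2 * r > x:            # r > x/2, so replace (q, r) by (q + 1, r - x)
        q, r = q + 1, r - x
    return q, r


# The three examples of 1.1.20:
print(balanced_divmod(42, 7))  # (6, 0)
print(balanced_divmod(47, 7))  # (7, -2)
print(balanced_divmod(45, 6))  # (7, 3)
```

Applying this to the real and imaginary parts separately yields $q = q_1 + iq_2$ and $r = r_1 + ir_2$ with $(r_1)^2 + (r_2)^2 \le \frac{x^2}{2}$, as in the estimate above.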
1.1.21 Conclusion Let $(a + ib)$ be any nonzero member of $J[i]$, where $a$ and $b$ are integers. Let $x$ be any positive integer. Then there exist $q, r \in J[i]$ such that $(a + ib) = qx + r$ and (either $r = 0$ or $d(r) < d(x)$).

1.1.22 Note Let $(a + ib)$ be any nonzero member of $J[i]$, where $a$ and $b$ are integers. Let $(e + if)$ be any nonzero member of $J[i]$, where $e$ and $f$ are integers.

Since $(e + if)$ is a nonzero member of $J[i]$, $e^2 + f^2$ is a positive integer. Now, by 1.1.20, there exist $(q_1 + iq_2), (r_1 + ir_2) \in J[i]$ such that

1. $(a + ib)(e - if) = (q_1 + iq_2)(e^2 + f^2) + (r_1 + ir_2)$,
2. either $r_1 + ir_2 = 0$ or $d(r_1 + ir_2) < d(e^2 + f^2)$.

Case I: $r_1 + ir_2 = 0$. From item 1 above,

$$(a + ib)(e - if) = (q_1 + iq_2)(e^2 + f^2).$$

Next, since $e^2 + f^2 = (e + if)(e - if)$ is a positive integer and $e - if \ne 0$, we can cancel $(e - if)$ and obtain $(a + ib) = (q_1 + iq_2)(e + if) + (0 + i0)$. Thus the statement “there exist $q, r \in J[i]$ such that $(a + ib) = q(e + if) + r$ and (either $r = 0$ or $d(r) < d(e + if)$)” holds.

Case II: $r_1 + ir_2 \ne 0$. It follows from (2) that

$$d(r_1 + ir_2) < d(e^2 + f^2),$$
and hence from (1),

$$d\big((a + ib)(e - if) - (q_1 + iq_2)(e^2 + f^2)\big) < d(e^2 + f^2),$$

that is,

$$\big|(a + ib)(e - if) - (q_1 + iq_2)(e^2 + f^2)\big|^2 < (e^2 + f^2)^2,$$

that is,

$$(e^2 + f^2)\,\big|(a + ib) - (q_1 + iq_2)(e + if)\big|^2 < (e^2 + f^2)^2,$$

that is,

$$\big|(a + ib) - (q_1 + iq_2)(e + if)\big|^2 < (e^2 + f^2). \qquad (*)$$

Let us put

$$s_1 + is_2 \equiv \big((a + ib) - (q_1 + iq_2)(e + if)\big) \in J[i],$$

where $s_1, s_2$ are integers. Here

$$(a + ib) = (q_1 + iq_2)(e + if) + (s_1 + is_2),$$

and from $(*)$,

$$|s_1 + is_2|^2 < (e^2 + f^2).$$

It follows that either $s_1 + is_2 = 0$ or $d(s_1 + is_2) < d(e + if)$. Thus the statement “there exist $q, r \in J[i]$ such that $(a + ib) = q(e + if) + r$, and (either $r = 0$ or $d(r) < d(e + if)$)” holds.

1.1.23 Conclusion For every nonzero $a, b \in J[i]$, there exist $q, r \in J[i]$ such that $a = qb + r$ and (either $r = 0$ or $d(r) < d(b)$). Also, we have seen that for every nonzero $a, b \in J[i]$, we have $d(a) \le d(ab)$.

Hence $J[i]$ is a Euclidean ring with unit element 1.
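The construction of 1.1.22 is effective, and a minimal Python sketch of it follows. Gaussian integers are represented here as pairs `(re, im)` of Python integers, which is a choice made for this illustration, and the names `norm`, `mul`, `balanced_divmod`, and `gauss_divmod` are hypothetical. The quotient is obtained exactly as in the text: multiply by the conjugate $e - if$, then divide both components by the positive integer $e^2 + f^2$ with centered remainders.

```python
def balanced_divmod(a, x):
    """Return (q, r) with a == q*x + r and -x/2 <= r <= x/2 (x a positive integer)."""
    q, r = divmod(a, x)
    if 2 * r > x:
        q, r = q + 1, r - x
    return q, r


def norm(z):
    """d(a + ib) = a^2 + b^2 for z = (a, b)."""
    return z[0] ** 2 + z[1] ** 2


def mul(z, w):
    """Product of the Gaussian integers z and w."""
    return (z[0] * w[0] - z[1] * w[1], z[0] * w[1] + z[1] * w[0])


def gauss_divmod(z, w):
    """Return (q, r) with z == q*w + r and (r == (0, 0) or norm(r) < norm(w))."""
    if w == (0, 0):
        raise ZeroDivisionError("division by the zero Gaussian integer")
    n = norm(w)                                # e^2 + f^2, a positive integer
    t = mul(z, (w[0], -w[1]))                  # (a + ib)(e - if)
    q = (balanced_divmod(t[0], n)[0], balanced_divmod(t[1], n)[0])
    qw = mul(q, w)
    r = (z[0] - qw[0], z[1] - qw[1])           # s1 + i*s2 in the notation of 1.1.22
    return q, r


q, r = gauss_divmod((27, 23), (8, 1))
print(q, r, norm(r), norm((8, 1)))  # (4, 2) (-3, 3) 18 65
```

Here $27 + 23i = (4 + 2i)(8 + i) + (-3 + 3i)$ and $d(-3 + 3i) = 18 < 65 = d(8 + i)$.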
1.1.24 Problem It is clear that the collection $\mathbb{Z}$ of all integers is an integral domain with unit element 1. For every nonzero integer $a$, by $d(a)$ we shall mean the absolute value $|a|$ of $a$. Observe that for every nonzero $a, b \in \mathbb{Z}$,

$$d(a) = |a| \le |a|\,|b| = |ab| = d(ab),$$

so for every nonzero $a, b \in \mathbb{Z}$, $d(a) \le d(ab)$.
Next, let us take arbitrary nonzero $a, b \in \mathbb{Z}$. By the divisibility property of integers, there exist $q, r \in \mathbb{Z}$ such that $a = qb + r$ and $0 \le r < |b|$. Since $0 \le r < |b|$, we have (either $r = 0$ or $0 < r < |b|$), and hence (either $r = 0$ or $|r| < |b|$). Thus, for every nonzero $a, b \in \mathbb{Z}$, there exist $q, r \in \mathbb{Z}$ such that $a = qb + r$ and (either $r = 0$ or $d(r) < d(b)$).

This shows that $\mathbb{Z}$ is a Euclidean ring.

Further, $\mathbb{Z}$ is a subring of the Euclidean ring $J[i]$.

Proof Since for every integer $a$, we have $a = (a + i0) \in J[i]$, it follows that $\mathbb{Z}$ is a subset of $J[i]$. Also, $\mathbb{Z}$ is itself an integral domain. Since $\mathbb{Z}$ is a subset of $J[i]$ and $J[i]$ is a Euclidean ring, we have $d(a) \le d(ab)$ for every nonzero $a, b \in \mathbb{Z}$.

Next, let us take arbitrary nonzero $a, b \in \mathbb{Z}$. By the divisibility property of integers, there exist $q, r \in \mathbb{Z}$ such that $a = qb + r$ and $0 \le r < |b|$. Since $0 \le r < |b|$, we have (either $r = 0$ or $0 < r < |b|$), and hence (either $r = 0$ or $|r|^2 < |b|^2$). Thus, for every nonzero $a, b \in \mathbb{Z}$, there exist $q, r \in \mathbb{Z}$ such that $a = qb + r$ and (either $r = 0$ or $d(r) < d(b)$).

This shows that $\mathbb{Z}$ is itself a Euclidean ring. Thus $\mathbb{Z}$ is a subring of the Euclidean ring $J[i]$. ■
1.1.25 Problem Let $p\ (\in \mathbb{Z})$ be a prime number, in the sense that $1 < p$ and $(a \mid p \Rightarrow (a = 1 \text{ or } -1 \text{ or } p \text{ or } -p))$. Let $a, b, c$ be any integers satisfying

1. $c$ is relatively prime to $p$,
2. $cp = a^2 + b^2$.

Then $p$ is not a prime element of the Euclidean ring $J[i]$.

Proof Suppose to the contrary that $p$ is a prime element of the Euclidean ring $J[i]$. We seek a contradiction.

From (2), $p \mid (a + ib)(a - ib)$. It follows, by 1.1.14, that $p \mid (a + ib)$ or $p \mid (a - ib)$.

Case I: $p \mid (a + ib)$. It follows that there exist integers $e, f$ such that $p(e + if) = (a + ib)$, and hence $|a + ib|^2 = |p(e + if)|^2$. Thus $p^2(e^2 + f^2) = (a^2 + b^2)$. Now, from (2), $p^2(e^2 + f^2) = cp$, and hence $p(e^2 + f^2) = c$. This shows that $p \mid c$, and hence $c$ is not relatively prime to $p$. This contradicts (1).

Case II: $p \mid (a - ib)$. This case is similar to Case I. ■
1.1.26 Problem Let $p\ (\in \mathbb{Z})$ be a prime number. Let $a, b, c$ be any integers satisfying

1. $c$ is relatively prime to $p$,
2. $cp = a^2 + b^2$.

Then there exist integers $e$ and $f$ such that $p = e^2 + f^2$.

Proof By 1.1.25, $p$ is not a prime element of the Euclidean ring $J[i]$. Hence there exist integers $e, f, g, h$ such that
$$p = (e + if)(g + ih),$$

and neither $(e + if)$ is a unit in $J[i]$ nor $(g + ih)$ is a unit in $J[i]$. Since $(e + if)$ is not a unit in $J[i]$, we have $e^2 + f^2 \ne 1$.

Proof Suppose to the contrary that $e^2 + f^2 = 1$. We seek a contradiction. Since $e^2 + f^2 = 1$, we have $(e + if)(e - if) = 1$, and hence $(e + if)$ is a unit in $J[i]$. This is a contradiction. ■

Similarly, $g^2 + h^2 \ne 1$. Since $p = (e + if)(g + ih)$, we have

$$p^2 = (e^2 + f^2)(g^2 + h^2), \qquad (*)$$

and hence $(e^2 + f^2) \mid p^2$. Now, since $p$ is a prime number, we have $(e^2 + f^2) = 1$ or $(e^2 + f^2) = p$ or $(e^2 + f^2) = p^2$. And since $e^2 + f^2 \ne 1$, we have $(e^2 + f^2) = p$ or $(e^2 + f^2) = p^2$.

If $(e^2 + f^2) = p^2$, then from $(*)$, we have $p^2 = p^2(g^2 + h^2)$, and hence $g^2 + h^2 = 1$. This is a contradiction. So $(e^2 + f^2) \ne p^2$. Since $(e^2 + f^2) = p$ or $(e^2 + f^2) = p^2$, we have $(e^2 + f^2) = p$. ■
1.2 Polynomial Rings
1.2.1 Note Let us observe that the quadratic congruence $x^2 \equiv 1 \pmod 8$ has $\{1, 3, 5, 7\}$ as a solution set. Thus the quadratic congruence $x^2 \equiv 1 \pmod 8$ has four solutions.
Definition Let $n$ be an integer such that $n \ge 1$. By $\varphi(n)$ we mean the number of positive integers $m$ such that $m \le n$ and $m, n$ are relatively prime. Here $\varphi : \{1, 2, 3, \ldots\} \to \{1, 2, 3, \ldots\}$ is called the Euler totient function.
For example, $\varphi(1) = 1$, $\varphi(2) = 1$, $\varphi(3) = 2$, $\varphi(4) = 2$, $\varphi(5) = 4$, $\varphi(6) = 2$, etc.
Definition Let $m$ be an integer such that $m \ge 1$. By a reduced residue system modulo $m$ we mean a collection $A$ of integers such that
1. the number of elements in $A$ is $\varphi(m)$,
2. no two members of $A$ are congruent modulo $m$,
3. each member of $A$ is relatively prime to $m$.
Example: $\{1, 29\}$ is a reduced residue system modulo 6.
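The two definitions above can be checked by direct counting. The following is a minimal Python sketch; the function names `phi` and `is_reduced_residue_system` are illustrative choices, not notation from the text.

```python
from math import gcd

def phi(n):
    """Euler totient: number of m with 1 <= m <= n and gcd(m, n) == 1."""
    return sum(1 for m in range(1, n + 1) if gcd(m, n) == 1)

def is_reduced_residue_system(A, m):
    """Check the three defining conditions for a collection A modulo m."""
    A = list(A)
    right_size = len(A) == phi(m)                          # condition 1
    pairwise_incongruent = len({a % m for a in A}) == len(A)  # condition 2
    all_coprime = all(gcd(a, m) == 1 for a in A)           # condition 3
    return right_size and pairwise_incongruent and all_coprime

print([phi(n) for n in range(1, 7)])          # values of phi(1), ..., phi(6)
print(is_reduced_residue_system([1, 29], 6))  # the example from the text
```

The first printed list should agree with the values of $\varphi$ listed in the definition, and the second line reproduces the example modulo 6.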
1.2.2 Problem Let $m$ be an integer such that $m \ge 1$. Let $\{a_1, a_2, \ldots, a_{\varphi(m)}\}$ be a reduced residue system modulo $m$. Let $k$ be a positive integer that is relatively prime to $m$. Then $\{ka_1, ka_2, \ldots, ka_{\varphi(m)}\}$ is also a reduced residue system modulo $m$.
Proof The proof is straightforward. ■
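Since the proof of 1.2.2 is left to the reader, a numerical spot check may help. The sketch below (helper names are hypothetical) multiplies the standard reduced residue system modulo $m = 12$ by $k = 7$ and compares the residue classes that occur.

```python
from math import gcd

def reduced_residues(m):
    """The standard reduced residue system: 1 <= a <= m with gcd(a, m) == 1."""
    return [a for a in range(1, m + 1) if gcd(a, m) == 1]

def classes(numbers, m):
    """Sorted list of the residues modulo m of the given integers."""
    return sorted(x % m for x in numbers)

m, k = 12, 7                      # k is relatively prime to m
original = reduced_residues(m)
scaled = [k * a for a in original]
print(original)
print(scaled)
print(classes(scaled, m) == classes(original, m))
```

If $k$ is not relatively prime to $m$ (say $k = 4$), the scaled numbers no longer represent distinct classes relatively prime to $m$, which is why the hypothesis on $k$ is needed.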
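1.2.3 Problem Let $a, m$ be any integers such that $a \ge 1$ and $m \ge 1$. Suppose that $a$ is relatively prime to $m$. Then
$$a^{\varphi(m)} \equiv 1 \pmod m.$$
Proof Let $\{b_1, b_2, \ldots, b_{\varphi(m)}\}$ be a reduced residue system modulo $m$. Since $a$ is relatively prime to $m$, by 1.2.2, $\{ab_1, ab_2, \ldots, ab_{\varphi(m)}\}$ is also a reduced residue system modulo $m$. Each of these two systems contains exactly one representative of every residue class relatively prime to $m$, so each $ab_i$ is congruent modulo $m$ to exactly one $b_j$, and distinct $ab_i$ correspond to distinct $b_j$. In other words, modulo $m$ the numbers $ab_1, ab_2, \ldots, ab_{\varphi(m)}$ are a rearrangement of $b_1, b_2, \ldots, b_{\varphi(m)}$. This shows that
$$(ab_1)(ab_2)\cdots(ab_{\varphi(m)}) \equiv b_1 b_2 \cdots b_{\varphi(m)} \pmod m,$$
and hence
$$a^{\varphi(m)}\,(b_1 b_2 \cdots b_{\varphi(m)}) \equiv b_1 b_2 \cdots b_{\varphi(m)} \pmod m.$$
Thus
$$\frac{a^{\varphi(m)}\,(b_1 b_2 \cdots b_{\varphi(m)}) - b_1 b_2 \cdots b_{\varphi(m)}}{m}$$
is an integer, that is,
$$\frac{(b_1 b_2 \cdots b_{\varphi(m)})\,(a^{\varphi(m)} - 1)}{m}$$
is an integer. Since $\{b_1, b_2, \ldots, b_{\varphi(m)}\}$ is a reduced residue system modulo $m$, each $b_i$ is relatively prime to $m$. Since each $b_i$ is relatively prime to $m$, and
$$\frac{(b_1 b_2 \cdots b_{\varphi(m)})\,(a^{\varphi(m)} - 1)}{m}$$
is an integer,
$$\frac{a^{\varphi(m)} - 1}{m}$$
is an integer, and hence $a^{\varphi(m)} \equiv 1 \pmod m$. ■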
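As a sanity check on 1.2.3, the congruence $a^{\varphi(m)} \equiv 1 \pmod m$ can be tested over a small range. This is only a sketch: `phi` and `euler_holds` are illustrative names, and a finite check is of course no substitute for the proof.

```python
from math import gcd

def phi(m):
    """Euler totient computed by direct counting."""
    return sum(1 for k in range(1, m + 1) if gcd(k, m) == 1)

def euler_holds(a, m):
    """True when a**phi(m) is congruent to 1 modulo m."""
    # 1 % m handles m == 1, where every integer is congruent to 0.
    return pow(a, phi(m), m) == 1 % m

pairs = [(a, m) for m in range(1, 40) for a in range(1, 40) if gcd(a, m) == 1]
print(all(euler_holds(a, m) for a, m in pairs))
print(pow(3, phi(10), 10))   # 3**4 = 81, reduced modulo 10
```

The built-in three-argument `pow` performs modular exponentiation, so no large intermediate powers are formed.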
1.2.4 Problem Let $a$ be any integer such that $a \ge 1$. Let $p$ be a prime. Suppose that $p$ does not divide $a$. Then
$$a^{p-1} \equiv 1 \pmod p.$$
Proof Since $p$ is a prime and $p$ does not divide $a$, $a$ is relatively prime to $p$. Now, by 1.2.3,
$$a^{\varphi(p)} \equiv 1 \pmod p.$$
Since $p$ is a prime, we have $\varphi(p) = p - 1$. Hence
$$a^{p-1} \equiv 1 \pmod p.$$
■
1.2.5 Theorem Let $a$ be any integer such that $a \ge 1$. Let $p$ be a prime. Then
$$a^p \equiv a \pmod p.$$
This result is known as Fermat's little theorem.
Proof Case I: $p$ does not divide $a$. Here, by 1.2.4, $a^{p-1} \equiv 1 \pmod p$, and hence $a^{p-1} a \equiv 1 \cdot a \pmod p$. Thus $a^p \equiv a \pmod p$.
Case II: $p$ divides $a$. It follows that $\frac{a}{p}\,(a^{p-1} - 1)$ is an integer, and hence $\frac{a^p - a}{p}$ is an integer. This shows that $a^p \equiv a \pmod p$.
Hence in all cases, $a^p \equiv a \pmod p$. ■
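Both cases of the proof can be exercised numerically. The sketch below (the name `fermat_holds` is an illustrative choice) checks $a^p \equiv a \pmod p$ for a few small primes and every $a$ from 1 to 50, which includes multiples of $p$.

```python
def fermat_holds(a, p):
    """True when a**p is congruent to a modulo p."""
    return pow(a, p, p) == a % p

small_primes = [2, 3, 5, 7, 11, 13]
print(all(fermat_holds(a, p) for p in small_primes for a in range(1, 51)))

# The hypothesis that p is prime matters: modulo 4 the congruence fails for a = 2.
print(fermat_holds(2, 4))
```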
1.2.6 Problem Let $a, b, m$ be any integers such that $a \ge 1$, $b \ge 1$, and $m \ge 1$. Suppose that $a$ is relatively prime to $m$. Then the polynomial congruence
$$ax \equiv b \pmod m$$
has a solution that is unique modulo $m$, namely, $x \equiv a^{\varphi(m)-1} b \pmod m$.
Proof Existence: We must show that
$$a\,\bigl(a^{\varphi(m)-1} b\bigr) \equiv b \pmod m,$$
that is,
$$a^{\varphi(m)} b \equiv b \pmod m.$$
Since $a$ is relatively prime to $m$, by 1.2.3, $a^{\varphi(m)} \equiv 1 \pmod m$, and hence $a^{\varphi(m)} b \equiv 1 \cdot b \pmod m$. Thus $a^{\varphi(m)} b \equiv b \pmod m$.
Uniqueness: Suppose that the polynomial congruence
$$ax \equiv b \pmod m$$
has two solutions $x_1$ and $x_2$, that is,
$$ax_1 \equiv b \pmod m, \qquad ax_2 \equiv b \pmod m.$$
We have to show that $x_1 \equiv x_2 \pmod m$. From these two congruences, we have
$$ax_1 \equiv ax_2 \pmod m,$$
and hence $\frac{ax_1 - ax_2}{m}$ is an integer, and hence $\frac{a(x_1 - x_2)}{m}$ is an integer. Since $a$ is relatively prime to $m$, $\frac{x_1 - x_2}{m}$ is an integer, and hence
$$x_1 \equiv x_2 \pmod m.$$
■
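The formula $x \equiv a^{\varphi(m)-1} b \pmod m$ translates directly into a short routine. This is a minimal sketch under the hypotheses of 1.2.6; the function name `solve_linear_congruence` is hypothetical, and the totient is computed by naive counting, which is adequate only for small $m$.

```python
from math import gcd

def phi(m):
    """Euler totient computed by direct counting."""
    return sum(1 for k in range(1, m + 1) if gcd(k, m) == 1)

def solve_linear_congruence(a, b, m):
    """Return the solution x in 0..m-1 of a*x = b (mod m), assuming gcd(a, m) == 1."""
    if gcd(a, m) != 1:
        raise ValueError("a must be relatively prime to m")
    return (pow(a, phi(m) - 1, m) * b) % m

x = solve_linear_congruence(7, 3, 10)
print(x, (7 * x - 3) % 10)   # the second number is 0 when x is a solution

# Uniqueness modulo m: exactly one residue in 0..9 solves 7x = 3 (mod 10).
print([y for y in range(10) if (7 * y - 3) % 10 == 0])
```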
1.2.7 Theorem Let $p$ be a prime. Let
$$f(x) \equiv c_0 + c_1 x + \cdots + c_n x^n$$
be any polynomial in $x$ with integer coefficients. Suppose that $c_n$ is a positive integer that is not divisible by $p$. Then the polynomial congruence
$$f(x) \equiv 0 \pmod p$$
has at most $n$ solutions.
This result is due to Lagrange.
Proof (Induction on $n$): Let us take the case $n = 1$. Thus $f(x) \equiv c_0 + c_1 x$. Here we have to solve the congruence
$$c_0 + c_1 x \equiv 0 \pmod p,$$
where $c_1$ is not divisible by $p$. Since $p$ is a prime and $c_1$ is not divisible by $p$, $c_1$ is relatively prime to $p$, and hence by 1.2.6, the polynomial congruence
$$c_1 x \equiv -c_0 \pmod p$$
has a unique solution. Hence the statement "$f(x) \equiv 0 \pmod p$ has at most $n$ solutions" holds for $n = 1$.
Now suppose that the statement "$f(x) \equiv 0 \pmod p$ has at most $(n-1)$ solutions" holds for all polynomials of degree $(n-1)$.
Also, suppose that the polynomial congruence
$$f(x) \equiv 0 \pmod p$$
has $(n+1)$ noncongruent solutions, say $x_0, x_1, \ldots, x_n$. We seek a contradiction.
Observe that
$$f(x) - f(x_0) = c_1(x - x_0) + c_2(x^2 - x_0^2) + \cdots + c_n(x^n - x_0^n) = (x - x_0)\bigl((\cdot) + (\cdot)x + \cdots + c_n x^{n-1}\bigr),$$
where each $(\cdot)$ stands for some integer, so
$$f(x) - f(x_0) = (x - x_0)\,g(x),$$
where $g(x)$ is a polynomial of degree $n - 1$, with leading coefficient $c_n$. Now, since $c_n$ is a positive integer that is not divisible by $p$, the leading coefficient of $g(x)$ is a positive integer that is not divisible by $p$. It follows, by the induction hypothesis, that
$$g(x) \equiv 0 \pmod p$$
has at most $(n-1)$ noncongruent solutions.
Since $x_0, x_1$ are solutions of $f(x) \equiv 0 \pmod p$, we have
$$f(x_0) \equiv 0 \pmod p, \qquad f(x_1) \equiv 0 \pmod p,$$
and hence
$$f(x_1) \equiv f(x_0) \pmod p.$$
This shows that
$$f(x_1) - f(x_0) \equiv 0 \pmod p,$$
that is,
$$(x_1 - x_0)\,g(x_1) \equiv 0 \pmod p.$$
Now, since $x_1, x_0$ are noncongruent modulo $p$ and $p$ is a prime, we have $g(x_1) \equiv 0 \pmod p$, and hence $x_1$ is a solution of $g(x) \equiv 0 \pmod p$. Similarly, $x_2$ is a solution of $g(x) \equiv 0 \pmod p$, ..., $x_n$ is a solution of $g(x) \equiv 0 \pmod p$. Thus $g(x) \equiv 0 \pmod p$ has $n$ noncongruent solutions. This is a contradiction. ■
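The role of the prime modulus in Lagrange's theorem can be seen by counting roots by brute force. In this sketch the helper `count_roots` is a hypothetical name; it evaluates the polynomial at every residue, so it is meant only for small moduli.

```python
def count_roots(coeffs, n):
    """Number of x in 0..n-1 with c0 + c1*x + ... + ck*x**k = 0 (mod n).

    coeffs lists the coefficients from the constant term upward.
    """
    def value(x):
        return sum(c * pow(x, i, n) for i, c in enumerate(coeffs)) % n
    return sum(1 for x in range(n) if value(x) == 0)

# x^2 - 1 modulo the prime 7: at most 2 roots, as the theorem requires.
print(count_roots([-1, 0, 1], 7))
# x^2 - 1 modulo 8 (not prime): the four roots 1, 3, 5, 7 of Note 1.2.1.
print(count_roots([-1, 0, 1], 8))
```

The second count exceeds the degree, which does not contradict the theorem because 8 is not a prime.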
1.2.8 Problem Let $p$ be a prime. Let
$$f(x) \equiv c_0 + c_1 x + \cdots + c_n x^n$$
be any polynomial in $x$. Suppose that the polynomial congruence
$$f(x) \equiv 0 \pmod p$$
has more than $n$ solutions. Then each $c_i$ is divisible by $p$.
Proof Suppose to the contrary that there exists a largest positive integer $k \le n$ such that $c_k$ is not divisible by $p$. We seek a contradiction.
Here
$$f(x) = c_0 + c_1 x + \cdots + c_k x^k + p\bigl((\cdot)x^{k+1} + (\cdot)x^{k+2} + \cdots + (\cdot)x^n\bigr),$$
where each $(\cdot)$ stands for some integer.
Suppose that the polynomial congruence
$$f(x) \equiv 0 \pmod p$$
has $(n+1)$ noncongruent solutions $x_0, x_1, \ldots, x_n$. It follows that
$$f(x_0) \equiv 0 \pmod p,$$
that is,
$$c_0 + c_1 x_0 + \cdots + c_k x_0^k + p\bigl((\cdot)x_0^{k+1} + (\cdot)x_0^{k+2} + \cdots + (\cdot)x_0^n\bigr) \equiv 0 \pmod p.$$
Now, since
$$p\bigl((\cdot)x_0^{k+1} + (\cdot)x_0^{k+2} + \cdots + (\cdot)x_0^n\bigr) \equiv 0 \pmod p,$$
subtracting the second congruence from the first gives
$$c_0 + c_1 x_0 + \cdots + c_k x_0^k \equiv (0 - 0) \pmod p,$$
and hence
$$c_0 + c_1 x_0 + \cdots + c_k x_0^k \equiv 0 \pmod p.$$
This shows that $x_0$ is a solution of the polynomial congruence
$$c_0 + c_1 x + \cdots + c_k x^k \equiv 0 \pmod p.$$
Similarly, each of $x_1, \ldots, x_n$ is a solution of the polynomial congruence
$$c_0 + c_1 x + \cdots + c_k x^k \equiv 0 \pmod p.$$
Thus the number of solutions of this polynomial congruence is strictly greater than $n$. By 1.2.7, the number of solutions of the polynomial congruence
$$c_0 + c_1 x + \cdots + c_k x^k \equiv 0 \pmod p$$
is at most $k$. It follows that $n < k$. This contradicts $k \le n$. ■
1.2.9 Problem Let $p$ be a prime. Let
$$f(x) \equiv \underbrace{(x-1)(x-2)\cdots(x-(p-1))}_{(p-1)\ \text{factors}} - \bigl(x^{p-1} - 1\bigr).$$
Then each coefficient of the $(p-2)$th-degree polynomial $f(x)$ is divisible by $p$.
Proof By 1.2.4, for every $x \in \{1, 2, \ldots, p-1\}$, we have
$$x^{p-1} - 1 \equiv 0 \pmod p.$$
It is clear that for every $x \in \{1, 2, \ldots, p-1\}$, we have
$$(x-1)(x-2)\cdots(x-(p-1)) \equiv 0 \pmod p.$$
Hence for every $x \in \{1, 2, \ldots, p-1\}$, we have
$$(x-1)(x-2)\cdots(x-(p-1)) - \bigl(x^{p-1} - 1\bigr) \equiv (0 - 0) \pmod p.$$
Thus for every $x \in \{1, 2, \ldots, p-1\}$, we have
$$f(x) \equiv 0 \pmod p.$$
Thus the number of solutions of the polynomial congruence $f(x) \equiv 0 \pmod p$ is strictly greater than $(p-2)$. Now, since $f(x)$ is a polynomial of degree $(p-2)$, by 1.2.8, each coefficient of the polynomial $f(x)$ is divisible by $p$. ■
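The statement of 1.2.9 can be observed concretely by expanding the product. The following sketch uses hypothetical helpers `poly_mul` and `f_coeffs`, with coefficient lists ordered from the constant term upward.

```python
def poly_mul(a, b):
    """Multiply two integer polynomials given as coefficient lists (lowest degree first)."""
    out = [0] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] += x * y
    return out

def f_coeffs(p):
    """Coefficients of (x-1)(x-2)...(x-(p-1)) - (x**(p-1) - 1), lowest degree first."""
    prod = [1]
    for k in range(1, p):
        prod = poly_mul(prod, [-k, 1])   # multiply by (x - k)
    prod[p - 1] -= 1                     # subtract x**(p-1)
    prod[0] += 1                         # subtract (-1), i.e. add 1
    return prod

print(f_coeffs(5))
print(all(c % p == 0 for p in [2, 3, 5, 7, 11] for c in f_coeffs(p)))
```

For $p = 5$ the product is $x^4 - 10x^3 + 35x^2 - 50x + 24$, so the difference is $-10x^3 + 35x^2 - 50x + 25$, every coefficient of which is a multiple of 5.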
1.2.10 Theorem Let $p$ be a prime. Then $(p-1)! \equiv -1 \pmod p$.
This result is known as Wilson's theorem.
Proof If $p = 2$, then $(p-1)! \equiv -1 \pmod p$ becomes $1! \equiv -1 \pmod 2$. This is trivially true. So we shall consider only the case in which the prime $p$ is odd.
By 1.2.9, each coefficient of the $(p-2)$th-degree polynomial
$$\underbrace{(x-1)(x-2)\cdots(x-(p-1))}_{(p-1)\ \text{factors}} - \bigl(x^{p-1} - 1\bigr)$$
is divisible by $p$. Now, since the constant term of this polynomial is
$$(-1)^{p-1}(p-1)! + 1 = (-1)^{\text{odd}-1}(p-1)! + 1 = (p-1)! + 1,$$
$(p-1)! + 1$ is divisible by $p$, that is, $(p-1)! \equiv -1 \pmod p$. ■
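Wilson's theorem is easy to test for small primes. The sketch below (the name `wilson_holds` is illustrative) checks whether $(p-1)! + 1$ is divisible by $p$; the composite values are included only to show that the congruence is special to primes in these examples.

```python
from math import factorial

def wilson_holds(n):
    """True when (n-1)! + 1 is divisible by n."""
    return (factorial(n - 1) + 1) % n == 0

print([p for p in [2, 3, 5, 7, 11, 13] if wilson_holds(p)])   # all of them
print([n for n in [4, 6, 8, 9] if wilson_holds(n)])           # none of them
```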
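1.2.11 Note Let $p$ be a prime number of the form $4n + 1$. It follows that $\frac{1}{2}(p-1)$ is an integer $\ge 2$. Put
$$a \equiv 1 \cdot 2 \cdot 3 \cdots \tfrac{1}{2}(p-1) \quad \Bigl(= \bigl(\tfrac{1}{2}(p-1)\bigr)!\Bigr).$$
It follows that
$$a^2 = \Bigl(1 \cdot 2 \cdot 3 \cdots \tfrac{1}{2}(p-1)\Bigr)\Bigl(\tfrac{1}{2}(p-1) \cdot \bigl(\tfrac{1}{2}(p-1) - 1\bigr) \cdots 3 \cdot 2 \cdot 1\Bigr),$$
and hence
$$a^2 \equiv \Bigl(1 \cdot 2 \cdot 3 \cdots \tfrac{1}{2}(p-1)\Bigr)\Bigl(\tfrac{1}{2}(p-1) \cdot \bigl(\tfrac{1}{2}(p-1) - 1\bigr) \cdots 3 \cdot 2 \cdot 1\Bigr) \pmod p.$$
Thus
$$a^2 \equiv \Bigl(1 \cdot 2 \cdot 3 \cdots \tfrac{1}{2}(p-1)\Bigr)\Bigl(\tfrac{1}{2}(p-1) \cdot \tfrac{1}{2}(p-3) \cdots 3 \cdot 2 \cdot 1\Bigr) \pmod p.$$
Since
$$\tfrac{1}{2}(p-1) \equiv -\tfrac{1}{2}(p+1) \pmod p,$$
$$\tfrac{1}{2}(p-3) \equiv -\tfrac{1}{2}(p+3) \pmod p,$$
$$\vdots$$
$$3 \equiv -(p-3) \pmod p, \qquad 2 \equiv -(p-2) \pmod p, \qquad 1 \equiv -(p-1) \pmod p,$$
we have
$$\tfrac{1}{2}(p-1) \cdot \tfrac{1}{2}(p-3) \cdots 3 \cdot 2 \cdot 1 \equiv \bigl(-\tfrac{1}{2}(p+1)\bigr)\bigl(-\tfrac{1}{2}(p+3)\bigr) \cdots \bigl(-(p-3)\bigr)\bigl(-(p-2)\bigr)\bigl(-(p-1)\bigr) \pmod p,$$
that is,
$$\tfrac{1}{2}(p-1) \cdot \tfrac{1}{2}(p-3) \cdots 3 \cdot 2 \cdot 1 \equiv (-1)^{\frac{p-1}{2}}\; \tfrac{1}{2}(p+1) \cdot \tfrac{1}{2}(p+3) \cdots (p-3)(p-2)(p-1) \pmod p,$$
that is,
$$\tfrac{1}{2}(p-1) \cdot \tfrac{1}{2}(p-3) \cdots 3 \cdot 2 \cdot 1 \equiv (-1)^{\text{even}}\; \tfrac{1}{2}(p+1) \cdot \tfrac{1}{2}(p+3) \cdots (p-3)(p-2)(p-1) \pmod p,$$
that is,
$$\tfrac{1}{2}(p-1) \cdot \tfrac{1}{2}(p-3) \cdots 3 \cdot 2 \cdot 1 \equiv \tfrac{1}{2}(p+1) \cdot \tfrac{1}{2}(p+3) \cdots (p-3)(p-2)(p-1) \pmod p.$$
Now, since
$$a^2 \equiv \Bigl(1 \cdot 2 \cdot 3 \cdots \tfrac{1}{2}(p-1)\Bigr)\Bigl(\tfrac{1}{2}(p-1) \cdot \tfrac{1}{2}(p-3) \cdots 3 \cdot 2 \cdot 1\Bigr) \pmod p,$$
we have
$$a^2 \equiv \Bigl(1 \cdot 2 \cdot 3 \cdots \tfrac{1}{2}(p-1)\Bigr)\Bigl(\tfrac{1}{2}(p+1) \cdot \tfrac{1}{2}(p+3) \cdots (p-3)(p-2)(p-1)\Bigr) \pmod p,$$
that is,
$$a^2 \equiv (p-1)! \pmod p.$$
By 1.2.10, $(p-1)! \equiv -1 \pmod p$. Now, since $a^2 \equiv (p-1)! \pmod p$, we have $a^2 \equiv -1 \pmod p$. This shows that $a$ is a solution of the quadratic congruence $x^2 \equiv -1 \pmod p$, that is, $\bigl(\frac{1}{2}(p-1)\bigr)!$ is a solution of the quadratic congruence $x^2 \equiv -1 \pmod p$.
1.2.12 Conclusion Let $p$ be a prime number of the form $4n + 1$. Then there exists a solution of $x^2 \equiv -1 \pmod p$. One such solution is $\bigl(\frac{1}{2}(p-1)\bigr)!$.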
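The explicit solution in 1.2.12 can be computed directly. In this minimal sketch the name `sqrt_minus_one` is hypothetical, and the function assumes, without checking, that its argument is a prime of the form $4n + 1$ (only the residue modulo 4 is verified).

```python
from math import factorial

def sqrt_minus_one(p):
    """Return ((p-1)/2)! reduced modulo p, for a prime p of the form 4n + 1.

    Primality of p is assumed, not checked.
    """
    if p % 4 != 1:
        raise ValueError("p must be of the form 4n + 1")
    return factorial((p - 1) // 2) % p

for p in [5, 13, 17, 29, 37, 41]:
    a = sqrt_minus_one(p)
    print(p, a, (a * a + 1) % p)   # last column is 0 when a*a = -1 (mod p)
```

For $p = 13$, for instance, $6! = 720 \equiv 5 \pmod{13}$ and $5^2 = 25 \equiv -1 \pmod{13}$.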
1.2.13 Theorem Let $p$ be a prime number of the form $4n + 1$. Then there exist integers $a$ and $b$ such that $p = a^2 + b^2$.
This result is due to Fermat.
Proof It is clear that $5 \le p$ and that $\frac{1}{2}(p-1)$ is an even integer. By 1.2.12, there exists an integer $x \in \{0, 1, \ldots, \frac{1}{2}(p-1), \ldots, p-1\}$ such that $x^2 \equiv -1 \pmod p$. It follows that there exists an integer $c$ such that $cp = x^2 + 1^2$.
It follows that there exists an integer $y \in \{-\frac{1}{2}(p-1), -\frac{1}{2}(p-1) + 1, \ldots, 0, \ldots, \frac{1}{2}(p-1)\}$ such that $y^2 \equiv -1 \pmod p$.
Proof Here $x \in \{0, 1, \ldots, \frac{1}{2}(p-1), \ldots, p-1\}$, so
$$\text{either } x \in \bigl\{0, 1, \ldots, \tfrac{1}{2}(p-1)\bigr\} \text{ or } x \in \bigl\{\tfrac{1}{2}(p-1) + 1, \ldots, p-1\bigr\}.$$
Case I: $x \in \{0, 1, \ldots, \frac{1}{2}(p-1)\}$. In this case, let us take $x$ for $y$. Since $x^2 \equiv -1 \pmod p$, we have $y^2 \equiv -1 \pmod p$. Since
$$y = x \in \bigl\{0, 1, \ldots, \tfrac{1}{2}(p-1)\bigr\} \subset \bigl\{-\tfrac{1}{2}(p-1), -\tfrac{1}{2}(p-1) + 1, \ldots, 0, \ldots, \tfrac{1}{2}(p-1)\bigr\},$$
we have $y \in \{-\frac{1}{2}(p-1), -\frac{1}{2}(p-1) + 1, \ldots, 0, \ldots, \frac{1}{2}(p-1)\}$.
Case II: $x \in \{\frac{1}{2}(p-1) + 1, \ldots, p-1\}$. In this case, let us take $(p - x)$ for $y$. Since $x^2 \equiv -1 \pmod p$, $\frac{x^2 + 1}{p}$ is an integer, and hence $p - 2x + \frac{x^2 + 1}{p}$ is an integer. Since
$$\frac{y^2 + 1}{p} = \frac{(p - x)^2 + 1}{p} = p - 2x + \frac{x^2 + 1}{p},$$
$\frac{y^2 + 1}{p}$ is an integer. Thus $y^2 \equiv -1 \pmod p$. It remains to show that
$$y \in \bigl\{-\tfrac{1}{2}(p-1), -\tfrac{1}{2}(p-1) + 1, \ldots, 0, \ldots, \tfrac{1}{2}(p-1)\bigr\}.$$
Since $x \in \{\frac{1}{2}(p-1) + 1, \ldots, p-1\}$, we have
$$y = (p - x) \in \bigl\{p - \bigl(\tfrac{1}{2}(p-1) + 1\bigr), \ldots, p - (p-1)\bigr\} = \bigl\{\tfrac{1}{2}(p-1), \ldots, 1\bigr\} \subset \bigl\{-\tfrac{1}{2}(p-1), -\tfrac{1}{2}(p-1) + 1, \ldots, 0, \ldots, \tfrac{1}{2}(p-1)\bigr\},$$
and hence
$$y \in \bigl\{-\tfrac{1}{2}(p-1), -\tfrac{1}{2}(p-1) + 1, \ldots, 0, \ldots, \tfrac{1}{2}(p-1)\bigr\}.$$
So in all cases, there exists an integer $y \in \{-\frac{1}{2}(p-1), -\frac{1}{2}(p-1) + 1, \ldots, 0, \ldots, \frac{1}{2}(p-1)\}$ such that
$$y^2 \equiv -1 \pmod p.$$
■
Since $y^2 \equiv -1 \pmod p$, there exists a positive integer $e$ such that $ep = y^2 + 1^2$. In view of 1.1.26, it suffices to show that $e$ is relatively prime to $p$. Since $y \in \{-\frac{1}{2}(p-1), -\frac{1}{2}(p-1) + 1, \ldots, 0, \ldots, \frac{1}{2}(p-1)\}$, we have $|y| \le \frac{1}{2}(p-1)$, and hence $y^2 \le \frac{1}{4}(p-1)^2$. It follows that
$$y^2 + 1^2 \le \tfrac{1}{4}(p-1)^2 + 1,$$
and hence
$$e = \frac{1}{p}\bigl(y^2 + 1^2\bigr) \le \frac{1}{p}\Bigl(\tfrac{1}{4}(p-1)^2 + 1\Bigr) < \frac{1}{p-1}\Bigl(\tfrac{1}{4}(p-1)^2 + 1\Bigr) = \tfrac{1}{4}(p-1) + \frac{1}{p-1} \le \tfrac{1}{4}(p-1) + \tfrac{1}{4} = \frac{p}{4} < p.$$
Thus $e$ is a positive integer strictly smaller than $p$, and since $p$ is a prime, $e$ must be relatively prime to $p$. ■
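The theorem asserts existence only; the argument through 1.1.26 does not by itself produce $a$ and $b$. For small primes a representation can simply be found by exhaustive search, which is a different technique from the proof. The sketch below uses a hypothetical helper `two_squares`.

```python
from math import isqrt

def two_squares(p):
    """Search for integers 1 <= a <= b with a*a + b*b == p; return None if none exists."""
    for a in range(1, isqrt(p) + 1):
        b_squared = p - a * a
        b = isqrt(b_squared)
        if b * b == b_squared and b >= a:
            return a, b
    return None

for p in [5, 13, 17, 29, 37, 41]:   # primes of the form 4n + 1
    print(p, two_squares(p))

print(two_squares(7))               # 7 is not of the form 4n + 1; the search finds nothing
```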
1.2.14 Definition Let $F$ be a field. Let $F[x]$ be the collection of all polynomials $f(x)$ in the "indeterminate" $x$ having coefficients in $F$. We know that $F[x]$ is an integral domain with unit element 1. For every nonzero polynomial $f(x)$, put
$$d(f(x)) \equiv \deg(f(x)).$$
It is known that
1. for every nonzero $f(x), g(x) \in F[x]$, $\deg(f(x)) \le \deg(f(x)g(x))$,
2. for every nonzero $f(x), g(x) \in F[x]$, there exist $q(x), r(x) \in F[x]$ such that $f(x) = q(x)g(x) + r(x)$, and either $r(x) = 0$ or $\deg(r(x)) < \deg(g(x))$.
This shows that $F[x]$ is a Euclidean ring. Also, it is clear that
$(*)$ for every nonzero $f(x), g(x) \in F[x]$, $\deg(f(x)g(x)) = \deg(f(x)) + \deg(g(x))$.
1.2.15 Problem $F[x]$ is a principal ideal ring.
Proof Since $F[x]$ is a Euclidean ring, by 1.1.4, $F[x]$ is a principal ideal ring. ■
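Property 2 of 1.2.14 (division with remainder) is the ordinary long division of polynomials. The following is a minimal sketch for the case $F = \mathbb{Q}$, using Python's `fractions.Fraction` for exact arithmetic; the names `trim` and `poly_divmod` and the coefficient-list representation (lowest degree first) are assumptions made for illustration.

```python
from fractions import Fraction

def trim(p):
    """Drop trailing zero coefficients; the zero polynomial becomes []."""
    p = list(p)
    while p and p[-1] == 0:
        p.pop()
    return p

def poly_divmod(f, g):
    """Return (q, r) with f = q*g + r and r == [] or deg r < deg g, over the rationals."""
    f = trim(Fraction(c) for c in f)
    g = trim(Fraction(c) for c in g)
    if not g:
        raise ZeroDivisionError("division by the zero polynomial")
    q = [Fraction(0)] * max(len(f) - len(g) + 1, 0)
    r = f
    while len(r) >= len(g):
        shift = len(r) - len(g)
        factor = r[-1] / g[-1]          # cancels the leading term of r
        q[shift] = factor
        for i, c in enumerate(g):
            r[shift + i] -= factor * c
        r = trim(r)                     # the degree of r strictly drops
    return q, r

# Divide x^3 - 2 by x - 1: quotient x^2 + x + 1, remainder -1.
q, r = poly_divmod([-2, 0, 0, 1], [-1, 1])
print(q, r)
```

Dividing by the leading coefficient of $g$ is exactly where the field hypothesis on $F$ is used.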
Definition Let $f(x)$ be a nonzero member of the Euclidean ring $F[x]$. If $f(x)$ is a unit or $f(x)$ is a prime element of $F[x]$, then we say that $f(x)$ is irreducible over $F$.
1.2.16 Problem Let $f(x)$ be a nonzero member of the Euclidean ring $F[x]$. Then $f(x)$ is a unit if and only if $f(x)$ is a constant.
Proof Let $f(x)$ be a unit in the Euclidean ring $F[x]$. We have to show that $f(x)$ is a constant.
Since $f(x)$ is a unit in the Euclidean ring $F[x]$, there exists $g(x) \in F[x]$ such that $f(x)g(x) = 1$. Now, since $1 \neq 0$, and $F[x]$ is an integral domain, it follows that $f(x) \neq 0$ and $g(x) \neq 0$, and then that
$$0 \le \deg(f(x)) \le \deg(f(x)g(x)) = \deg(1) = 0,$$
and hence $\deg(f(x)) = 0$. Thus $f(x)$ is a constant.
Conversely, let $f(x)$ be a constant. It follows that $\deg(f(x)) = 0$, and hence $f(x) \in F$. Now, since $f(x)$ is a nonzero member of the field $F$, there exists a nonzero member $b$ of $F$ $(\subset F[x])$ such that $f(x)b = 1$. Thus $f(x)$ is a unit. ■
1.2.17 Problem Let $f(x)$ be a nonzero member of the Euclidean ring $F[x]$. Then $f(x)$ is irreducible if and only if
$$\bigl(g(x), h(x) \in F[x] \text{ and } f(x) = g(x)h(x)\bigr) \Rightarrow \bigl(g(x) \text{ is a unit or } h(x) \text{ is a unit}\bigr).$$
Proof Let $f(x)$ be irreducible. We have to show that
$$\bigl(g(x), h(x) \in F[x] \text{ and } f(x) = g(x)h(x)\bigr) \Rightarrow \bigl(g(x) \text{ is a unit or } h(x) \text{ is a unit}\bigr).$$
Since $f(x)$ is irreducible, $f(x)$ is a unit or $f(x)$ is a prime element of $F[x]$.
Case I: $f(x)$ is a unit. Suppose that $g(x), h(x) \in F[x]$ and $f(x) = g(x)h(x)$. We have to show that $g(x)$ is a unit or $h(x)$ is a unit. Suppose to the contrary that $g(x)$ is not a unit and $h(x)$ is not a unit. We seek a contradiction.
Since $f(x)$ is a unit, by 1.2.16, $f(x)$ is a constant, and hence $\deg(f(x)) = 0$. Since $f(x)$ is nonzero and $f(x) = g(x)h(x)$, we have $g(x) \neq 0$. Now, since $g(x)$ is not a unit, by 1.2.16, $g(x)$ is not a constant, and hence $1 \le \deg(g(x))$. Since $f(x) = g(x)h(x)$, we have
$$1 \le \deg(g(x)) \le \deg(g(x)h(x)) = \deg(f(x)) = 0.$$
This is a contradiction.
Case II: $f(x)$ is a prime element of $F[x]$. Suppose that $g(x), h(x) \in F[x]$ and $f(x) = g(x)h(x)$. We have to show that $g(x)$ is a unit or $h(x)$ is a unit.
Since $f(x)$ is a prime element of the Euclidean ring $F[x]$, $g(x), h(x) \in F[x]$, and $f(x) = g(x)h(x)$, we have that either $g(x)$ is a unit or $h(x)$ is a unit.
So in all cases,
$$\bigl(g(x), h(x) \in F[x] \text{ and } f(x) = g(x)h(x)\bigr) \Rightarrow \bigl(g(x) \text{ is a unit or } h(x) \text{ is a unit}\bigr).$$
Conversely, suppose that
$$\bigl(g(x), h(x) \in F[x] \text{ and } f(x) = g(x)h(x)\bigr) \Rightarrow \bigl(g(x) \text{ is a unit or } h(x) \text{ is a unit}\bigr). \qquad (*)$$
We have to show that $f(x)$ is irreducible, that is, $f(x)$ is a unit or $f(x)$ is a prime element of $F[x]$. Suppose to the contrary that $f(x)$ is not a unit and $f(x)$ is not a prime element of $F[x]$. We seek a contradiction.
Since $f(x)$ is not a unit and $f(x)$ is not a prime element of the Euclidean ring $F[x]$, there exist $g(x), h(x) \in F[x]$ such that $f(x) = g(x)h(x)$, $g(x)$ is not a unit and $h(x)$ is not a unit. This contradicts $(*)$. ■
1.2.18 Problem Clearly, the polynomial $1 + x^2$ is irreducible over the field $\mathbb{R}$ of all real numbers.
Proof Suppose to the contrary that it is reducible. Then by 1.2.17, there exist $g(x), h(x) \in \mathbb{R}[x]$ such that $1 + x^2 = g(x)h(x)$, $g(x)$ is not a unit, and $h(x)$ is not a unit. We seek a contradiction.
Since $g(x)$ is not a unit, by 1.2.16, $g(x)$ is not a constant, and hence $\deg(g(x)) \ge 1$. Similarly, $\deg(h(x)) \ge 1$.
Next,
$$\deg(g(x)) + \deg(h(x)) = \deg(g(x)h(x)) = \deg(1 + x^2) = 2,$$
so $\deg(g(x)) + \deg(h(x)) = 2$. Now, since $\deg(g(x)) \ge 1$ and $\deg(h(x)) \ge 1$, we have $\deg(g(x)) = 1$ and $\deg(h(x)) = 1$. So we can suppose that $g(x) \equiv x + a$ and $h(x) \equiv x + b$, where $a, b$ are real numbers. Thus
$$1 + x^2 = (x + a)(x + b) = ab + (a + b)x + x^2,$$
and hence
$$ab = 1, \qquad a + b = 0.$$
This shows that $0 \le a^2 = -1$. This is a contradiction. ■
1.2.19 Problem Clearly, the polynomial $1 + x^2$ is not irreducible over the field $\mathbb{C}$ of all complex numbers.
Proof Observe that
$$1 + x^2 = (x + i)(x - i).$$
Also, $x + i$, $x - i$ are members of $\mathbb{C}[x]$. Clearly, $x + i$ and $x - i$ are not units in $\mathbb{C}[x]$. Thus by 1.2.17, $1 + x^2$ is not irreducible in $\mathbb{C}[x]$. ■
1.2.20 Problem Let $f(x)$ be a nonzero member of the Euclidean ring $F[x]$. Let $f(x)$ be a nonunit. Then $f(x)$ can be expressed as a product of finitely many irreducible polynomials of degree $\ge 1$ in $F[x]$.
Proof Since $f(x)$ is not a unit, by 1.2.16, $f(x)$ is not a constant, and hence $\deg(f(x)) \ge 1$. Further, by 1.1.11, $f(x)$ can be expressed as a product of finitely many prime elements of $F[x]$. Since a prime element of $F[x]$ is not a unit, by 1.2.16, the degree of a prime element of $F[x]$ is $\ge 1$. Now, by the definition of irreducibility over $F$, $f(x)$ can be expressed as a product of finitely many irreducible polynomials of degree $\ge 1$ in $F[x]$. ■
1.2.21 Theorem Let $f(x)$ be a nonzero member of the Euclidean ring $F[x]$. Suppose that $f(x)$ is not a unit in $F[x]$. (By 1.2.20, $f(x)$ can be expressed as a product of finitely many irreducible polynomials of degree $\ge 1$ in $F[x]$.) Let
$$f(x) = p_1(x)p_2(x)\cdots p_m(x),$$
where each $p_i(x)$ $(i = 1, 2, \ldots, m)$ is an irreducible polynomial of degree $\ge 1$ in $F[x]$. Let
$$f(x) = p'_1(x)p'_2(x)\cdots p'_n(x),$$
where each $p'_j(x)$ $(j = 1, 2, \ldots, n)$ is an irreducible polynomial of degree $\ge 1$ in $F[x]$. Then
1. each $p_i(x)$ is an associate of some $p'_j(x)$,
2. each $p'_j(x)$ is an associate of some $p_i(x)$,
3. $n = m$.
This theorem is known as the unique factorization theorem of polynomials over $F$.
Proof By 1.1.15, the proof is immediate. ■
1.2.22 Problem Let $p(x)$ be a nonzero member of the Euclidean ring $F[x]$. Let $p(x)$ be an irreducible polynomial of degree $\ge 1$ in $F[x]$. Then the ideal $(p(x))$ is "maximal" in the sense that
(i) $(p(x))$ is a proper subset of $F[x]$,
(ii) if $M$ is an ideal containing $(p(x))$ and $M$ is a proper subset of $F[x]$, then $M = (p(x))$.
Proof By 1.2.16, $p(x)$ is not a unit. Hence by the definition of irreducible polynomial, $p(x)$ is a prime element of $F[x]$. Now by 1.1.16, the ideal $(p(x))$ is maximal. ■
1.2.23 Problem Let $p(x)$ be a nonzero member of the Euclidean ring $F[x]$. Suppose that $\deg(p(x)) \ge 1$. Let the ideal $(p(x))$ be maximal. Then $p(x)$ is an irreducible polynomial in $F[x]$.
Proof By 1.2.16, $p(x)$ is not a unit. Hence by the definition of irreducible polynomial, it suffices to show that $p(x)$ is a prime element of $F[x]$.
Since the ideal $(p(x))$ is maximal, by 1.1.17, $p(x)$ is a prime element of $F[x]$. ■
1.2.24 Problem Let $p(x)$ be a nonzero member of the Euclidean ring $F[x]$. Suppose that $\deg(p(x)) \ge 1$. Let $p(x)$ be irreducible over the field $F$. Then by 1.2.22, the ideal $(p(x))$ is maximal. Further, the quotient ring $F[x]/(p(x))$ is a field.
Proof Since $F[x]$ is an integral domain with unit element 1, the quotient ring $F[x]/(p(x))$ is a commutative ring with unit element $1 + (p(x))$. Next, let us take arbitrary nonzero elements $f(x) + (p(x))$ and $g(x) + (p(x))$ of $F[x]/(p(x))$, where $f(x), g(x) \in F[x]$. We have to show that $f(x)g(x) + (p(x))$ is nonzero. Suppose to the contrary that $f(x)g(x) \in (p(x))$. We seek a contradiction.
Since $f(x)g(x) \in (p(x))$, there exists $h(x) \in F[x]$ such that $f(x)g(x) = p(x)h(x)$. Now, since $p(x)$ is irreducible, by 1.2.21, $p(x) \mid f(x)$ or $p(x) \mid g(x)$. It follows that either $f(x) \in (p(x))$ or $g(x) \in (p(x))$. In other words, either $f(x) + (p(x))$ is the zero element of $F[x]/(p(x))$ or $g(x) + (p(x))$ is the zero element of $F[x]/(p(x))$. This is a contradiction.
Thus we have shown that the product of nonzero elements of $F[x]/(p(x))$ is nonzero.
Next, let $f(x) + (p(x))$ be a nonzero element of $F[x]/(p(x))$. It follows that $f(x) \notin (p(x))$, and $f(x)$ is a nonzero polynomial. Hence $p(x) \nmid f(x)$. By 1.1.5, there exists a greatest common divisor $h(x)$ of $p(x)$ and $f(x)$ in $F[x]$. Further, there exist $\lambda(x), \mu(x) \in F[x]$ such that
$$h(x) = \lambda(x)p(x) + \mu(x)f(x).$$
We claim that $h(x)$ is a unit. Suppose to the contrary that $h(x)$ is not a unit. We seek a contradiction.
Since $h(x)$ is a greatest common divisor of $p(x)$ and $f(x)$ in $F[x]$, we have $h(x) \mid p(x)$ and $h(x) \mid f(x)$. Since $h(x) \mid p(x)$, there exists $k(x) \in F[x]$ such that $p(x) = h(x)k(x)$. Now, since $p(x)$ is irreducible, by 1.2.17, $h(x)$ is a unit or $k(x)$ is a unit. Since $h(x)$ is not a unit, $k(x)$ is a unit. It follows from $p(x) = h(x)k(x)$ that $p(x)$ and $h(x)$ are associates. And since $p(x) \nmid f(x)$, we have $h(x) \nmid f(x)$. This is a contradiction.
Thus our claim is true, that is, $h(x)$ is a unit. It follows that there exists $l(x) \in F[x]$ such that $1 = h(x)l(x)$, and hence
$$1 = \bigl(\lambda(x)p(x) + \mu(x)f(x)\bigr)\,l(x) = \bigl(\lambda(x)l(x)\bigr)p(x) + \bigl(\mu(x)l(x)\bigr)f(x).$$
Thus
$$1 - g(x)f(x) = \bigl(\lambda(x)l(x)\bigr)p(x) \in (p(x)),$$
where $g(x) \equiv \mu(x)l(x) \in F[x]$. Hence $g(x) + (p(x))$ serves the purpose of the inverse element of $f(x) + (p(x))$ in $F[x]/(p(x))$.
Thus $F[x]/(p(x))$ is a field. ■
1.2.25 Example The field of all rational numbers is denoted by $\mathbb{Q}$. Observe that the polynomial $x^3 - 2$ is a member of the Euclidean ring $\mathbb{Q}[x]$. Also, $x^3 - 2$ is irreducible in $\mathbb{Q}[x]$. And hence by 1.2.24, the quotient ring $\mathbb{Q}[x]/(x^3 - 2)$ is a field.
Proof Suppose to the contrary that
$$x^3 - 2 = g(x)h(x),$$
where $g(x), h(x) \in \mathbb{Q}[x]$, and neither $g(x)$ nor $h(x)$ is a unit. Then $\deg(g(x)) \ge 1$, $\deg(h(x)) \ge 1$, and $\deg(g(x)) + \deg(h(x)) = 3$. It follows that either $\deg(g(x)) = 1$ or $\deg(h(x)) = 1$. For definiteness, suppose that $\deg(g(x)) = 1$. Now we can suppose that
$$g(x) \equiv x - a,$$
where $a \in \mathbb{Q}$. It follows that
$$x^3 - 2 = (x - a)h(x),$$
and hence
$$a^3 - 2 = (a - a)h(a) = 0 \cdot h(a) = 0.$$
Thus $a^3 = 2$. Since $a \in \mathbb{Q}$, we can write $a = r/s$ with integers $r$ and $s$, $s \neq 0$, and then
$$r^3 = 2s^3.$$
Observe that in the prime factorization of $r^3$, $2^{3(\text{integer})}$ will occur, but in the prime factorization of $(r^3 =)\ 2s^3$, $2^{3(\text{integer})+1}$ will occur. But $2^{3(\text{integer})} \neq 2^{3(\text{integer})+1}$, which contradicts the uniqueness property of prime factorization of integers. ■
1.2.26 Note Suppose that $f(x) + (x^3 - 2)$ is a member of the field $\mathbb{Q}[x]/(x^3 - 2)$, where $f(x) \in \mathbb{Q}[x]$. Let us denote the polynomial $x^3 - 2$ by $g(x)$.
It follows that there exist $q(x), r(x) \in \mathbb{Q}[x]$ such that $f(x) = q(x)g(x) + r(x)$, and either $r(x) = 0$ or $\deg(r(x)) < \deg(g(x))\ (= 3)$. Hence we can suppose that
$$r(x) \equiv a_0 + a_1 x + a_2 x^2,$$
where $a_0, a_1, a_2 \in \mathbb{Q}$. Thus
$$f(x) = q(x)g(x) + a_0 + a_1 x + a_2 x^2,$$
and hence
$$f(x) + (x^3 - 2) = \bigl(q(x)g(x) + a_0 + a_1 x + a_2 x^2\bigr) + (x^3 - 2) = q(x)g(x) + a_0 + a_1 x + a_2 x^2 + (g(x)) = a_0 + a_1 x + a_2 x^2 + \bigl(q(x)g(x) + (g(x))\bigr).$$
Thus
$$f(x) + (x^3 - 2) = a_0 + a_1 x + a_2 x^2 + \bigl(q(x)g(x) + (g(x))\bigr).$$
Clearly, $q(x)g(x) \in (g(x))$. Now, since $(g(x))$ is an additive group, $q(x)g(x) + (g(x)) = (g(x))$. Thus
$$f(x) + (x^3 - 2) = a_0 + a_1 x + a_2 x^2 + (g(x)) = \bigl(a_0 + (g(x))\bigr) + \bigl(a_1 + (g(x))\bigr)\bigl(x + (g(x))\bigr) + \bigl(a_2 + (g(x))\bigr)\bigl(x + (g(x))\bigr)^2,$$
or
$$f(x) + (x^3 - 2) = \bigl(a_0 + (g(x))\bigr) + \bigl(a_1 + (g(x))\bigr)t + \bigl(a_2 + (g(x))\bigr)t^2,$$
where $t \equiv x + (g(x))$. Next,
$$t^3 = \bigl(x + (g(x))\bigr)^3 = x^3 + (g(x)) = \bigl(g(x) + (g(x))\bigr) + \bigl(2 + (g(x))\bigr) = \bigl(0 + (g(x))\bigr) + \bigl(2 + (g(x))\bigr) = 2 + (g(x)),$$
so
$$t^3 = 2 + (g(x)).$$
Notation Observe that the field $\mathbb{Q}[x]/(x^3 - 2)$ can be thought of as a vector space over the field $\mathbb{Q}$ under the obvious definition of "scalar multiplication": For every $a \in \mathbb{Q}$ and for every $f(x) \in \mathbb{Q}[x]$,
$$a\bigl(f(x) + (x^3 - 2)\bigr) \equiv \bigl(a + (x^3 - 2)\bigr)\bigl(f(x) + (x^3 - 2)\bigr) = af(x) + (x^3 - 2).$$
That is why for every $a \in \mathbb{Q}$, it is customary to denote $a + (g(x))$ simply by $a$.
1.2.27 Conclusion All the elements of the field $\mathbb{Q}[x]/(x^3 - 2)$ can be expressed "uniquely" as
$$a_0 + a_1 t + a_2 t^2,$$
where $t \equiv x + (x^3 - 2)$, and $a_0, a_1, a_2 \in \mathbb{Q}$. Further, $t^3 - 2 = 0$.
Proof of uniqueness part To this end, suppose that
$$a_0 + a_1 t + a_2 t^2 = b_0 + b_1 t + b_2 t^2,$$
where $a_0, a_1, a_2, b_0, b_1, b_2 \in \mathbb{Q}$. We have to show that $a_i = b_i$ $(i = 0, 1, 2)$.
Since
$$a_0 + a_1 t + a_2 t^2 = b_0 + b_1 t + b_2 t^2,$$
we have
$$(a_0 - b_0) + (a_1 - b_1)t + (a_2 - b_2)t^2 = 0,$$
and hence
$$(a_0 - b_0) + (a_1 - b_1)x + (a_2 - b_2)x^2 \in (x^3 - 2).$$
Now, since the degree of each nonzero member of $(x^3 - 2)$ is $\ge 3$, $(a_0 - b_0) + (a_1 - b_1)x + (a_2 - b_2)x^2$ is the zero polynomial, and hence $a_i = b_i$ $(i = 0, 1, 2)$. ■
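The representation $a_0 + a_1 t + a_2 t^2$ with $t^3 = 2$ makes arithmetic in this field entirely mechanical. The sketch below stores an element as a triple $(a_0, a_1, a_2)$ of rationals; the function name `mul` and the tuple representation are assumptions made for illustration.

```python
from fractions import Fraction

def mul(u, v):
    """Multiply a0 + a1*t + a2*t^2 by b0 + b1*t + b2*t^2, reducing with t^3 = 2."""
    raw = [Fraction(0)] * 5                 # coefficients of t^0 .. t^4 before reduction
    for i, a in enumerate(u):
        for j, b in enumerate(v):
            raw[i + j] += Fraction(a) * Fraction(b)
    # t^3 = 2 and t^4 = 2t, so fold the top two coefficients back down.
    return (raw[0] + 2 * raw[3], raw[1] + 2 * raw[4], raw[2])

t = (0, 1, 0)
print(mul(mul(t, t), t))                    # t^3, which should be the constant 2

# (1 + t)(1 - t + t^2) = 1 + t^3 = 3, so the inverse of 1 + t is (1 - t + t^2)/3.
print(mul((1, 1, 0), (1, -1, 1)))
inverse = (Fraction(1, 3), Fraction(-1, 3), Fraction(1, 3))
print(mul((1, 1, 0), inverse))
```

The last product illustrates, for one element, the existence of inverses guaranteed by 1.2.24.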
1.3 The Eisenstein Criterion
1.3.1 Definition The ring of all integers is denoted by $\mathbb{Z}$. Let $a_0 + a_1 x + \cdots + a_n x^n$ be a member of the ring $\mathbb{Z}[x]$, where each $a_i$ is an integer. If 1 is a greatest common divisor of $a_0, a_1, \ldots, a_n$, then we say that $a_0 + a_1 x + \cdots + a_n x^n$ is a primitive polynomial.
1.3.2 Problem Let $a_0 + a_1 x + \cdots + a_n x^n$ and $b_0 + b_1 x + \cdots + b_m x^m$ be two primitive polynomials. Their product is
$$c_0 + c_1 x + \cdots + c_{n+m} x^{n+m},$$
where
$$c_0 \equiv a_0 b_0, \qquad c_1 \equiv a_1 b_0 + a_0 b_1, \qquad c_2 \equiv a_2 b_0 + a_1 b_1 + a_0 b_2, \qquad \ldots$$
Then $c_0 + c_1 x + \cdots + c_{n+m} x^{n+m}$ is primitive.
Proof Suppose to the contrary that $c_0 + c_1 x + \cdots + c_{n+m} x^{n+m}$ is not primitive. We seek a contradiction.
Since $c_0 + c_1 x + \cdots + c_{n+m} x^{n+m}$ is not primitive, 1 is not a greatest common divisor of $c_0, c_1, \ldots, c_{n+m}$, and hence there exists a prime number $p$ $(> 1)$ such that $p \mid c_i$ $(i = 0, 1, \ldots, n+m)$. Since $p > 1$, and 1 is a greatest common divisor of $a_0, a_1, \ldots, a_n$, there exists $j \in \{0, 1, \ldots, n\}$ such that $p \nmid a_j$ and $p \mid a_l$ $(l = 0, 1, \ldots, j-1)$. Similarly, there exists $k \in \{0, 1, \ldots, m\}$ such that $p \nmid b_k$ and $p \mid b_l$ $(l = 0, 1, \ldots, k-1)$. Hence $a_0 + a_1 x + \cdots + a_n x^n$ is of the form
$$p(\cdot) + p(\cdot)x + p(\cdot)x^2 + \cdots + p(\cdot)x^{j-1} + a_j x^j + a_{j+1} x^{j+1} + \cdots,$$
and $b_0 + b_1 x + \cdots + b_m x^m$ is of the form
$$p(\cdot) + p(\cdot)x + p(\cdot)x^2 + \cdots + p(\cdot)x^{k-1} + b_k x^k + b_{k+1} x^{k+1} + \cdots,$$
where each $(\cdot)$ stands for some integer.
Since $p$ is a prime number, $p \nmid a_j$, and $p \nmid b_k$, we have $p \nmid a_j b_k$. Further,
$$c_{j+k} = a_{j+k} b_0 + a_{j+k-1} b_1 + \cdots + a_0 b_{j+k} = a_{j+k} b_0 + a_{j+k-1} b_1 + \cdots + a_{j+1} b_{k-1} + a_j b_k + a_{j-1} b_{k+1} + \cdots + a_0 b_{j+k},$$
so
$$a_j b_k = c_{j+k} - \bigl(a_{j+k} b_0 + a_{j+k-1} b_1 + \cdots + a_{j+1} b_{k-1}\bigr) - \bigl(a_{j-1} b_{k+1} + \cdots + a_0 b_{j+k}\bigr)$$
$$= c_{j+k} - \bigl(a_{j+k}\,p(\cdot) + a_{j+k-1}\,p(\cdot) + \cdots + a_{j+1}\,p(\cdot)\bigr) - \bigl(p(\cdot)\,b_{k+1} + \cdots + p(\cdot)\,b_{j+k}\bigr)$$
$$= c_{j+k} - p(\cdot) - p(\cdot) = c_{j+k} - p(\cdot),$$
and hence
$$a_j b_k = c_{j+k} - p(\cdot).$$
Since $p \mid c_{j+k}$, we have $p \mid a_j b_k$. This is a contradiction. ■
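A small computation illustrates 1.3.2. In this sketch the "content" of a polynomial means the positive greatest common divisor of its coefficients, so a polynomial is primitive exactly when its content is 1; the helper names `content` and `poly_mul` are illustrative, and the term "content" itself is an added convenience not used in the text above.

```python
from functools import reduce
from math import gcd

def content(coeffs):
    """Positive greatest common divisor of the coefficients."""
    return reduce(gcd, (abs(c) for c in coeffs))

def poly_mul(a, b):
    """Multiply two integer polynomials given as coefficient lists (lowest degree first)."""
    out = [0] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] += x * y
    return out

f = [2, 3]            # 2 + 3x, primitive
g = [3, 2]            # 3 + 2x, primitive
h = poly_mul(f, g)    # 6 + 13x + 6x^2
print(h, content(f), content(g), content(h))
```

Note that the extreme coefficients of the product share the factor 6, yet the middle coefficient 13 keeps the product primitive.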
1.3.3 Theorem Let $a_0 + a_1 x + \cdots + a_n x^n$ $(\in \mathbb{Z}[x])$ be a primitive polynomial. Suppose that
$$a_0 + a_1 x + \cdots + a_n x^n = (r_0 + r_1 x + \cdots + r_m x^m)(s_0 + s_1 x + \cdots + s_{n-m} x^{n-m}),$$
where each $r_i$ is a rational number and each $s_j$ is a rational number. Then there exist two polynomials $\lambda(x), \mu(x) \in \mathbb{Z}[x]$ such that
$$a_0 + a_1 x + \cdots + a_n x^n = \lambda(x)\mu(x).$$
This result is known as Gauss's lemma.
Proof By clearing denominators and taking out common factors, we can write
$$(r_0 + r_1 x + \cdots + r_m x^m)(s_0 + s_1 x + \cdots + s_{n-m} x^{n-m})$$
as $\frac{a}{b}\lambda(x)\mu(x)$, where $a, b$ are positive integers and $\lambda(x), \mu(x)$ are primitive polynomials. Now, since
$$a_0 + a_1 x + \cdots + a_n x^n = (r_0 + r_1 x + \cdots + r_m x^m)(s_0 + s_1 x + \cdots + s_{n-m} x^{n-m}),$$
we have
$$a_0 + a_1 x + \cdots + a_n x^n = \frac{a}{b}\lambda(x)\mu(x),$$
or
$$ba_0 + ba_1 x + \cdots + ba_n x^n = a\lambda(x)\mu(x). \qquad (*)$$
Since $a_0 + a_1 x + \cdots + a_n x^n$ $(\in \mathbb{Z}[x])$ is a primitive polynomial, 1 is a greatest common divisor of $a_0, a_1, \ldots, a_n$, and hence $b$ is a greatest common divisor of $ba_0, ba_1, \ldots, ba_n$. Now, from $(*)$, $b$ is a greatest common divisor of all the coefficients of the various powers of $x$ in $a\lambda(x)\mu(x)$. Since $\lambda(x), \mu(x)$ are primitive, by 1.3.2, $\lambda(x)\mu(x)$ is primitive, and hence $a$ is a greatest common divisor of all the coefficients of the various powers of $x$ in $a\lambda(x)\mu(x)$. Since $a, b$ are positive integers and each of them is a greatest common divisor of all the coefficients of the various powers of $x$ in $a\lambda(x)\mu(x)$, we have $a = b$. Since $a = b$, by $(*)$, we have
$$a_0 + a_1 x + \cdots + a_n x^n = \lambda(x)\mu(x).$$
Also, $\lambda(x), \mu(x)$ are primitive polynomials. ■
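1.3.4 Problem Let $a_0 + a_1 x + \cdots + a_n x^n$ $(\in \mathbb{Z}[x])$ be a primitive polynomial. Let $p$ $(\ge 2)$ be a prime number. Suppose that $p \mid a_i$ $(i = 0, 1, \ldots, n-1)$, $p \nmid a_n$, and $p^2 \nmid a_0$. Then $a_0 + a_1 x + \cdots + a_n x^n$ is irreducible over $\mathbb{Q}$, that is, $a_0 + a_1 x + \cdots + a_n x^n$ cannot be factored into two nontrivial polynomials with rational numbers as coefficients.
Proof Suppose to the contrary that
$$a_0 + a_1 x + \cdots + a_n x^n = (r_0 + r_1 x + \cdots + r_m x^m)(s_0 + s_1 x + \cdots + s_{n-m} x^{n-m}),$$
where each $r_i$ is a rational number and each $s_j$ is a rational number. We seek a contradiction.
By 1.3.3, there exist two polynomials $b_0 + b_1 x + \cdots + b_m x^m$, $c_0 + c_1 x + \cdots + c_{n-m} x^{n-m} \in \mathbb{Z}[x]$ such that
$$a_0 + a_1 x + \cdots + a_n x^n = (b_0 + b_1 x + \cdots + b_m x^m)(c_0 + c_1 x + \cdots + c_{n-m} x^{n-m}).$$
Since $p \mid a_i$ $(i = 0, 1, \ldots, n-1)$, the above equality takes the form
$$p\bigl(a'_0 + a'_1 x + \cdots + a'_{n-1} x^{n-1}\bigr) + a_n x^n = (b_0 + b_1 x + \cdots + b_m x^m)(c_0 + c_1 x + \cdots + c_{n-m} x^{n-m}). \qquad (*)$$
Since $p \nmid a_n$, $p$ does not divide any greatest common divisor of $a_0, a_1, \ldots, a_n$. Here $pa'_0 = b_0 c_0$, and $p$ is a prime, so $p \mid b_0$ or $p \mid c_0$. Since $p^2 \nmid a_0$ and $a_0 = pa'_0 = b_0 c_0$, we have $p^2 \nmid b_0 c_0$, and hence $p \mid b_0$ and $p \mid c_0$ cannot be true simultaneously.
So for the sake of definiteness, suppose that $p \mid b_0$, and $p \nmid c_0$.
Now $(*)$ takes the form
$$p\bigl(a'_0 + a'_1 x + \cdots + a'_{n-1} x^{n-1}\bigr) + a_n x^n = \bigl(p(\cdot) + b_1 x + \cdots + b_m x^m\bigr)(c_0 + c_1 x + \cdots + c_{n-m} x^{n-m}),$$
where each $(\cdot)$ stands for some integer.
Since $p \nmid a_n$ and $a_n = b_m c_{n-m}$, $p$ cannot divide every $b_k$. So, from $(*)$, we find that there exists $j \in \{1, 2, \ldots, m\}$ such that
1. $p \nmid b_j$,
2. $p \mid b_k$ $(k = 0, 1, \ldots, j-1)$.
Now we can write
$$p\bigl(a'_0 + a'_1 x + \cdots + a'_{n-1} x^{n-1}\bigr) + a_n x^n = \bigl(p(\cdot) + p(\cdot)x + \cdots + p(\cdot)x^{j-1} + b_j x^j + \cdots + b_m x^m\bigr)(c_0 + c_1 x + \cdots + c_{n-m} x^{n-m}).$$
Comparing the coefficients of $x^j$ on both sides (note that $j \le m \le n-1$, because the factorization is nontrivial), it follows that
$$pa'_j = b_j c_0 + p(\cdot)c_1 + p(\cdot)c_2 + \cdots + p(\cdot)c_j.$$
This shows that $p \mid b_j c_0$. Now, since $p$ is a prime number, either $p \mid b_j$ or $p \mid c_0$. This is a contradiction, since $p \nmid b_j$ and $p \nmid c_0$. ■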
1.3.5 Theorem Let $a_0 + a_1x + \cdots + a_nx^n\ (\in \mathbb{Z}[x])$ be a polynomial. Let $p$ be a prime number. Suppose that $p \mid a_i\ (i = 0, 1, \ldots, n-1)$, $p \nmid a_n$, and $p^2 \nmid a_0$. Then $a_0 + a_1x + \cdots + a_nx^n$ is irreducible over $\mathbb{Q}$, that is, $a_0 + a_1x + \cdots + a_nx^n$ cannot be factored into two nontrivial polynomials with rational numbers as coefficients.
This result is known as Eisenstein’s criterion.
Proof Let $d$ be the positive greatest common divisor of $a_0, a_1, \ldots, a_n$. We can write

$$a_0 + a_1x + \cdots + a_nx^n = d(a_0' + a_1'x + \cdots + a_n'x^n),$$

where the positive greatest common divisor of $a_0', a_1', \ldots, a_n'$ is 1. It follows that $a_0' + a_1'x + \cdots + a_n'x^n$ is a primitive polynomial. Since $p \nmid a_n$, $p$ does not divide the positive greatest common divisor $d$ of $a_0, a_1, \ldots, a_n$.

Since $p \mid a_0$ and $a_0 = da_0'$, we have $p \mid da_0'$. Now, since $p$ is a prime and $p \nmid d$, we have $p \mid a_0'$.

Since $p \mid a_1$ and $a_1 = da_1'$, we have $p \mid da_1'$. Since $p$ is a prime and $p \nmid d$, we have $p \mid a_1'$. Similarly, $p \mid a_2'$, etc. Thus $p \mid a_i'\ (i = 0, 1, \ldots, n-1)$.

Since $p \nmid a_n$ and $a_n = da_n'$, we have $p \nmid da_n'$. It follows that $p \nmid a_n'$.

Since $p^2 \nmid a_0$ and $a_0 = da_0'$, we have $p^2 \nmid da_0'$. It follows that $p^2 \nmid a_0'$.

Now, by 1.3.4, $a_0' + a_1'x + \cdots + a_n'x^n$ is irreducible over $\mathbb{Q}$, and hence $d(a_0' + a_1'x + \cdots + a_n'x^n)$ is irreducible over $\mathbb{Q}$. Since

$$a_0 + a_1x + \cdots + a_nx^n = d(a_0' + a_1'x + \cdots + a_n'x^n),$$

$a_0 + a_1x + \cdots + a_nx^n$ is irreducible over $\mathbb{Q}$. ■
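The three divisibility hypotheses of Eisenstein's criterion are easy to test mechanically. The following is a minimal sketch, assuming the polynomial is given as an integer coefficient list `[a0, a1, ..., an]` with $n \geq 1$ and that the caller supplies a prime `p`; the function name `eisenstein_applies` is illustrative only.

```python
def eisenstein_applies(coeffs, p):
    """Check the hypotheses of Eisenstein's criterion for the prime p.

    coeffs = [a0, a1, ..., an] with an != 0; p is assumed to be prime.
    """
    *lower, leading = coeffs
    return (
        all(a % p == 0 for a in lower)    # p | a_i for i = 0, ..., n-1
        and leading % p != 0              # p does not divide a_n
        and coeffs[0] % (p * p) != 0      # p^2 does not divide a_0
    )


# x^4 + 10x + 5 with p = 5: all three hypotheses hold.
print(eisenstein_applies([5, 10, 0, 0, 1], 5))
# x^2 - 4 with p = 2: fails because 4 divides the constant term.
print(eisenstein_applies([-4, 0, 1], 2))
```

A result of `False` only means that the criterion gives no information for that prime; it does not by itself show that the polynomial is reducible.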
1.3.6 Definition Let $R$ be a commutative ring with unit element 1. We know that $R[x_1]$ is a commutative ring with unit element 1. Since $R[x_1]$ is a commutative ring with unit element 1, $(R[x_1])[x_2]$ is also a commutative ring with unit element 1. Here $(R[x_1])[x_2]$ is denoted by $R[x_1, x_2]$.

Observe that the elements of $R[x_1, x_2]\ (= (R[x_1])[x_2])$ are of the form

$$(a_{00} + a_{10}x_1 + a_{20}x_1^2 + \cdots) + (a_{01} + a_{11}x_1 + a_{21}x_1^2 + \cdots)x_2 + (a_{02} + a_{12}x_1 + a_{22}x_1^2 + \cdots)x_2^2 + \cdots,$$

that is,

$$a_{00} + (a_{10}x_1 + a_{01}x_2) + (a_{20}x_1^2 + a_{11}x_1x_2 + a_{02}x_2^2) + (a_{30}x_1^3 + a_{21}x_1^2x_2 + a_{12}x_1x_2^2 + a_{03}x_2^3) + \cdots,$$

that is,

$$\sum_{i+j=0} a_{ij}x_1^ix_2^j + \sum_{i+j=1} a_{ij}x_1^ix_2^j + \sum_{i+j=2} a_{ij}x_1^ix_2^j + \cdots.$$

Thus each member of the ring $R[x_1, x_2]$ is of this form.
Definition Let $R$ be a commutative ring with unit element 1. We know that $R[x_1, x_2]$ is a commutative ring with unit element 1. Since $R[x_1, x_2]$ is a commutative ring with unit element 1, $(R[x_1, x_2])[x_3]$ is also a commutative ring with unit element 1. Here $(R[x_1, x_2])[x_3]$ is denoted by $R[x_1, x_2, x_3]$.

Observe that the elements of $R[x_1, x_2, x_3]\ (= (R[x_1, x_2])[x_3])$ are of the form

$$\begin{aligned}
&\bigl(a_{000} + (a_{100}x_1 + a_{010}x_2) + (a_{200}x_1^2 + a_{110}x_1x_2 + a_{020}x_2^2) + \cdots\bigr)\\
&\quad + \bigl(a_{001} + (a_{101}x_1 + a_{011}x_2) + (a_{201}x_1^2 + a_{111}x_1x_2 + a_{021}x_2^2) + \cdots\bigr)x_3\\
&\quad + \bigl(a_{002} + (a_{102}x_1 + a_{012}x_2) + (a_{202}x_1^2 + a_{112}x_1x_2 + a_{022}x_2^2) + \cdots\bigr)x_3^2 + \cdots,
\end{aligned}$$

that is,

$$a_{000} + (a_{100}x_1 + a_{010}x_2 + a_{001}x_3) + (a_{200}x_1^2 + a_{110}x_1x_2 + a_{020}x_2^2 + a_{101}x_1x_3 + a_{011}x_2x_3 + a_{002}x_3^2) + \cdots,$$

that is,

$$\sum_{i+j+k=0} a_{ijk}x_1^ix_2^jx_3^k + \sum_{i+j+k=1} a_{ijk}x_1^ix_2^jx_3^k + \sum_{i+j+k=2} a_{ijk}x_1^ix_2^jx_3^k + \cdots.$$

Similar definitions can be supplied for $R[x_1, x_2, x_3, x_4]$, etc. The commutative ring $R[x_1, \ldots, x_n]$ with unit element 1 is called the ring of polynomials in $n$ variables $x_1, \ldots, x_n$ over $R$.
1.3.7 Problem Let $R$ be an integral domain. Then $R[x]$ is an integral domain, and hence $R[x_1, \ldots, x_n]$ is an integral domain.

Proof Let $a_0 + a_1x + a_2x^2 + \cdots$ and $b_0 + b_1x + b_2x^2 + \cdots$ be any two nonzero members of $R[x]$. It suffices to show that their product

$$(a_0b_0) + (a_1b_0 + a_0b_1)x + (a_2b_0 + a_1b_1 + a_0b_2)x^2 + \cdots$$

is nonzero.

Since $a_0 + a_1x + a_2x^2 + \cdots$ is nonzero, there exists $j \in \{0, 1, 2, \ldots\}$ such that

1. $a_j$ is nonzero,
2. $a_l = 0\ (l = 0, 1, \ldots, j-1)$.

Similarly, there exists $k \in \{0, 1, 2, \ldots\}$ such that

1′. $b_k$ is nonzero,
2′. $b_l = 0\ (l = 0, 1, \ldots, k-1)$.

It follows that

$$a_0 + a_1x + a_2x^2 + \cdots = a_jx^j + a_{j+1}x^{j+1} + \cdots$$

and

$$b_0 + b_1x + b_2x^2 + \cdots = b_kx^k + b_{k+1}x^{k+1} + \cdots,$$

and hence the coefficient of $x^{j+k}$ in

$$(a_0 + a_1x + a_2x^2 + \cdots)(b_0 + b_1x + b_2x^2 + \cdots)$$

is $a_jb_k$. It suffices to show that $a_jb_k$ is nonzero. Since $a_j$ is a nonzero member of $R$, $b_k$ is a nonzero member of $R$, and $R$ is an integral domain, it follows that $a_jb_k$ is nonzero. ■
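The role of the hypothesis that $R$ is an integral domain can be seen in a small computation. The sketch below is an illustration only: it takes $R = \mathbb{Z}/n\mathbb{Z}$ (an example not used in the text), stores polynomials as coefficient lists with the constant term first, and uses a hypothetical helper `poly_mul_mod`.

```python
def poly_mul_mod(p, q, n):
    """Multiply two coefficient lists (constant term first) over Z/nZ."""
    out = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] = (out[i + j] + a * b) % n
    return out


# Z/5Z is a field, hence an integral domain: (2x)(3x) = x^2 is nonzero.
print(poly_mul_mod([0, 2], [0, 3], 5))
# Z/6Z has zero divisors (2 * 3 = 0), so the same product vanishes.
print(poly_mul_mod([0, 2], [0, 3], 6))
```

In both cases $j = k = 1$, and the coefficient of $x^{j+k}$ is $a_jb_k = 2 \cdot 3$, which is nonzero modulo 5 but zero modulo 6.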
Definition Let $F$ be a field. By 1.3.7, $F[x_1, \ldots, x_n]$ is an integral domain. Now we can construct its field of quotients. This field is denoted by $F(x_1, \ldots, x_n)$ and is called the field of rational functions in $x_1, \ldots, x_n$ over $F$.

The field $F(x_1, \ldots, x_n)$ is important in algebraic geometry.

Definition Let $R$ be an integral domain with unit element 1. Let $p$ be a nonzero member of $R$ that is not a unit. If

$$(a, b \in R \text{ and } p = ab) \Rightarrow (a \text{ is a unit or } b \text{ is a unit}),$$

then we say that $p$ is irreducible (or $p$ is a prime element of $R$).

Definition Let $R$ be an integral domain with unit element 1. If

a. every nonzero member of $R$ that is not a unit can be written as a product of finitely many irreducible elements of $R$,

b. the decomposition in part (a) is unique up to the order and associates of the irreducible elements of $R$,

then we say that $R$ is a unique factorization domain.

$(*)$ Since every nonzero member of a field is a unit, every field is an example of a unique factorization domain.
1.3.8 Problem Let $R$ be a unique factorization domain with unit element 1. Let $a, b \in R$. Then clearly, a greatest common divisor of $a$ and $b$ exists in $R$.

Definition Let $R$ be a unique factorization domain with unit element 1. Let $a, b$ be nonzero members of $R$. (By 1.3.8, a greatest common divisor of $a$ and $b$ exists in $R$.) If there exists a unit $u$ in $R$ such that $u$ is a greatest common divisor of $a$ and $b$, then we say that $a$ and $b$ are relatively prime.
1.3.9 Problem Let $R$ be a unique factorization domain with unit element 1. Let $a, b, c$ be any nonzero elements of $R$. Suppose that $a \mid bc$. Let $a$ and $b$ be relatively prime. Then clearly, $a \mid c$.

1.3.10 Problem Let $R$ be a unique factorization domain with unit element 1. Let $a, b, c$ be any nonzero elements of $R$. Suppose that $a$ is an irreducible element. Suppose that $a \mid bc$. Then clearly, either $a \mid b$ or $a \mid c$.

Definition Let $R$ be a unique factorization domain with unit element 1. Let $a_0 + a_1x + \cdots + a_nx^n$ be a member of the ring $R[x]$, where each $a_i$ is in $R$. If 1 is a greatest common divisor of $a_0, a_1, \ldots, a_n$, then we say that $a_0 + a_1x + \cdots + a_nx^n$ is a primitive polynomial in $R[x]$.

1.3.11 Problem Let $R$ be a unique factorization domain with unit element 1. Let $a_0 + a_1x + \cdots + a_nx^n$ and $b_0 + b_1x + \cdots + b_mx^m$ be two primitive polynomials in $R[x]$. Their product is

$$c_0 + c_1x + \cdots + c_{n+m}x^{n+m},$$

where

$$\left.\begin{aligned}
c_0 &\equiv a_0b_0\\
c_1 &\equiv a_1b_0 + a_0b_1\\
c_2 &\equiv a_2b_0 + a_1b_1 + a_0b_2\\
&\ \,\vdots
\end{aligned}\right\}.$$

Then $c_0 + c_1x + \cdots + c_{n+m}x^{n+m}$ is primitive.
Proof Suppose to the contrary that $c_0 + c_1x + \cdots + c_{n+m}x^{n+m}$ is not primitive. We seek a contradiction.

Since $c_0 + c_1x + \cdots + c_{n+m}x^{n+m}$ is not primitive, 1 is not a greatest common divisor of $c_0, c_1, \ldots, c_{n+m}$, and hence there exists an irreducible $p$ such that $p \mid c_i\ (i = 0, 1, \ldots, n+m)$. Since $p$ is irreducible, $p$ is not a unit. Since $p$ is not a unit and 1 is a greatest common divisor of $a_0, a_1, \ldots, a_n$, there exists $j \in \{0, 1, \ldots, n\}$ such that $p \nmid a_j$ and $p \mid a_l\ (l = 0, 1, \ldots, j-1)$. Similarly, there exists $k \in \{0, 1, \ldots, m\}$ such that $p \nmid b_k$ and $p \mid b_l\ (l = 0, 1, \ldots, k-1)$. Hence $a_0 + a_1x + \cdots + a_nx^n$ is of the form

$$p(\cdots) + p(\cdots)x + p(\cdots)x^2 + \cdots + p(\cdots)x^{j-1} + a_jx^j + a_{j+1}x^{j+1} + \cdots,$$

and $b_0 + b_1x + \cdots + b_mx^m$ is of the form

$$p(\cdots) + p(\cdots)x + p(\cdots)x^2 + \cdots + p(\cdots)x^{k-1} + b_kx^k + b_{k+1}x^{k+1} + \cdots.$$

Since $p$ is irreducible, $p \nmid a_j$, $p \nmid b_k$, and $R$ is a unique factorization domain, we have $p \nmid a_jb_k$. Further,

$$\begin{aligned}
c_{j+k} &= a_{j+k}b_0 + a_{j+k-1}b_1 + \cdots + a_0b_{j+k}\\
&= a_{j+k}b_0 + a_{j+k-1}b_1 + \cdots + a_{j+1}b_{k-1} + a_jb_k + a_{j-1}b_{k+1} + \cdots + a_0b_{j+k},
\end{aligned}$$

so

$$\begin{aligned}
a_jb_k &= c_{j+k} - (a_{j+k}b_0 + a_{j+k-1}b_1 + \cdots + a_{j+1}b_{k-1}) - (a_{j-1}b_{k+1} + \cdots + a_0b_{j+k})\\
&= c_{j+k} - (a_{j+k}p(\cdots) + a_{j+k-1}p(\cdots) + \cdots + a_{j+1}p(\cdots)) - (p(\cdots)b_{k+1} + \cdots + p(\cdots)b_{j+k})\\
&= c_{j+k} - p(\cdots) - p(\cdots) = c_{j+k} - p(\cdots),
\end{aligned}$$

and hence

$$a_jb_k = c_{j+k} - p(\cdots).$$

Since $p \mid c_{j+k}$, we have $p \mid a_jb_k$. This is a contradiction. ■
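The indices $j$ and $k$ used in this proof can be traced on a concrete pair of integer polynomials. The sketch below assumes $R = \mathbb{Z}$ and coefficient lists with the constant term first; the helper names are hypothetical and chosen only for this illustration.

```python
from functools import reduce
from math import gcd


def poly_mul(p, q):
    """Product of two integer coefficient lists (constant term first)."""
    out = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out


def first_index_not_divisible(coeffs, p):
    """Smallest index whose coefficient is not divisible by p (the j or k of the proof)."""
    return next(i for i, c in enumerate(coeffs) if c % p != 0)


f = [2, 4, 3]          # 2 + 4x + 3x^2, primitive
g = [6, 5, 2]          # 6 + 5x + 2x^2, primitive
h = poly_mul(f, g)     # their product

p = 2
j = first_index_not_divisible(f, p)
k = first_index_not_divisible(g, p)
print(h, j, k, h[j + k] % p)   # the coefficient c_{j+k} is not divisible by p
print(reduce(gcd, h))          # greatest common divisor of the coefficients of the product
```

For the prime $p = 2$ the coefficient $c_{j+k}$ is odd, so 2 cannot divide every coefficient of the product, exactly as the proof predicts.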
1.3.12 Note Let $R$ be a unique factorization domain with unit element 1. It follows that $R$ is an integral domain with unit element 1, and hence it has a field $F$ of quotients. Since $F$ is a field, by 1.3.7, $F[x]$ is an integral domain with unit element 1. Clearly, $R[x]$ can be considered a subring of $F[x]$. Let

$$\frac{a_0}{b_0} + \frac{a_1}{b_1}x + \frac{a_2}{b_2}x^2 + \cdots + \frac{a_n}{b_n}x^n$$

be any member of $F[x]$, where each $a_i$ is a member of $R$ and each $b_i$ is a nonzero member of $R$. Now we can write

$$\frac{a_0}{b_0} + \frac{a_1}{b_1}x + \frac{a_2}{b_2}x^2 + \cdots + \frac{a_n}{b_n}x^n = \frac{1}{b}(c_0 + c_1x + c_2x^2 + \cdots + c_nx^n),$$

where $b \equiv b_0b_1b_2 \cdots b_n \in R$, $c_0 \equiv a_0b_1b_2 \cdots b_n \in R$, $c_1 \equiv b_0a_1b_2 \cdots b_n \in R$, $\ldots$, $c_n \equiv b_0b_1b_2 \cdots b_{n-1}a_n \in R$. Let $d$ be a greatest common divisor of $c_0, c_1, c_2, \ldots, c_n$. Hence

$$c_0 + c_1x + c_2x^2 + \cdots + c_nx^n$$

can be expressed as

$$d(d_0 + d_1x + d_2x^2 + \cdots + d_nx^n),$$

where each $d_i$ is a member of $R$. Clearly, 1 is a greatest common divisor of $d_0, d_1, d_2, \ldots, d_n$, and hence $d_0 + d_1x + d_2x^2 + \cdots + d_nx^n$ is a primitive polynomial in $R[x]$.

1.3.13 Conclusion Let $R$ be a unique factorization domain with unit element 1 and let $F$ be its field of quotients. Then every member of $F[x]$ can be expressed as $\frac{d}{b}f(x)$, where $f(x) \in R[x]$, $b, d \in R$, and $f(x)$ is primitive in $R[x]$.

1.3.14 Problem Let $R$ be a unique factorization domain with unit element 1, and let $F$ be its field of quotients. Let $f(x) \in R[x]$. Suppose that
a. $f(x)$ is primitive as an element of $R[x]$,
b. $f(x)$ is irreducible as an element of $R[x]$.

Then $f(x)$ is irreducible as an element of $F[x]$.

Proof If not, then by 1.3.13, we can suppose to the contrary that

$$f(x) = \frac{d_1}{b_1}f_1(x)\,\frac{d_2}{b_2}f_2(x),$$

where $f_1(x), f_2(x) \in R[x]$, $b_1, b_2, d_1, d_2 \in R$, $f_1(x), f_2(x)$ are primitive as elements of $R[x]$, $\deg(f_1(x)) \geq 1$, and $\deg(f_2(x)) \geq 1$. We seek a contradiction.

Here

$$b_1b_2f(x) = d_1d_2g(x),$$

where $g(x) \equiv f_1(x)f_2(x)$. Now, since $f_1(x), f_2(x)$ are primitive as elements of $R[x]$, by 1.3.11, $g(x)$ is primitive as an element of $R[x]$. It follows that $d_1d_2$ is a greatest common divisor of the coefficients of $d_1d_2g(x)\ (= b_1b_2f(x))$. Thus $d_1d_2$ is a greatest common divisor of the coefficients of $b_1b_2f(x)$. Since $f(x)$ is primitive as an element of $R[x]$, $b_1b_2$ is a greatest common divisor of the coefficients of $b_1b_2f(x)$. Hence we can suppose that $d_1d_2 = b_1b_2\ (\neq 0)$. Now, since $b_1b_2f(x) = d_1d_2f_1(x)f_2(x)$ and $R[x]$ is an integral domain, we have $f(x) = f_1(x)f_2(x)$. Next, since $\deg(f_1(x)) \geq 1$, $\deg(f_2(x)) \geq 1$, and $f_1(x), f_2(x) \in R[x]$, it follows that $f(x)$ is not irreducible as an element of $R[x]$. This is a contradiction. ■
1.3.15 Problem Let $R$ be a unique factorization domain with unit element 1, and let $F$ be its field of quotients. Let $f(x) \in R[x]$. Suppose that

a. $f(x)$ is primitive as an element of $R[x]$,
b. $f(x)$ is irreducible as an element of $F[x]$.

Then $f(x)$ is irreducible as an element of $R[x]$.

Proof Suppose to the contrary that

$$f(x) = f_1(x)f_2(x),$$

where $f_1(x), f_2(x) \in R[x]\ (\subset F[x])$, $\deg(f_1(x)) \geq 1$, and $\deg(f_2(x)) \geq 1$. We seek a contradiction.

It follows that $f_1(x), f_2(x) \in F[x]$. Since $f(x) \in R[x] \subset F[x]$, we have $f(x) \in F[x]$. Since $f(x)$ is irreducible as an element of $F[x]$, $f(x) = f_1(x)f_2(x)$, and $f_1(x), f_2(x), f(x) \in F[x]$, we have

$$(f_1(x) \text{ is a unit as an element of } F[x] \text{ or } f_2(x) \text{ is a unit as an element of } F[x]),$$

that is, either $\deg(f_1(x)) < 1$ or $\deg(f_2(x)) < 1$. This is a contradiction. ■
1.3.16 Problem Let $R$ be an integral domain with unit element 1. We know, by 1.3.7, that $R[x]$ is also an integral domain with unit element 1. It is clear that

$(*)$ for every nonzero $f(x), g(x) \in R[x]$, $\deg(f(x)g(x)) = \deg(f(x)) + \deg(g(x))$.

Also, if $u(x)$ is a unit in $R[x]$, then $u(x)$ is also a unit in $R$.

Proof Let $u(x)$ be a unit in $R[x]$. We have to show that $u(x)$ is a unit in $R$. Since $u(x)$ is a unit in the integral domain $R[x]$, there exists a nonzero $v(x)$ in $R[x]$ such that

$$u(x)v(x) = 1,$$

and hence

$$0 \leq \deg(u(x)) + \deg(v(x)) = \deg(u(x)v(x)) = \deg(1) = 0.$$

This shows that $\deg(u(x)) = 0$ and $\deg(v(x)) = 0$. So there exist nonzero $a, b$ in $R$ such that $u(x) = a$ and $v(x) = b$. Now, since $u(x)v(x) = 1$, we have $ab = 1$. It follows that $(u(x) =)\ a$ is a unit in $R$, and hence $u(x)$ is a unit in $R$. ■
1.3.17 Problem Let $R$ be a unique factorization domain with unit element 1. Let $f(x)$ be a nonzero member of $R[x]$ having degree $\geq 1$. Suppose that $f(x)$ is primitive as an element of $R[x]$. Then $f(x)$ can be expressed as a product of finitely many irreducible polynomials in $R[x]$.

Proof Since $R[x]$ can be considered a subring of $F[x]$, where $F$ is the field of quotients of members of $R$, and $f(x) \in R[x]$, we can think of $f(x)$ as a nonzero member of $F[x]$.

Since $f(x)$ is a nonzero member of $R[x]$ having degree $\geq 1$, $f(x)$ is not a unit in $F[x]$. By 1.2.20, $f(x)$ can be expressed as a product of finitely many irreducible polynomials $p_1(x), p_2(x), \ldots, p_n(x)$ of degree $\geq 1$ in $F[x]$. By 1.3.13, we can suppose that

$$p_k(x) = \frac{d_k}{b_k}f_k(x) \quad (k = 1, 2, \ldots, n),$$

where $f_k(x) \in R[x]$, $b_k, d_k \in R$, and $f_k(x)$ is primitive in $R[x]$. Hence

$$b_1b_2 \cdots b_nf(x) = d_1d_2 \cdots d_ng(x),$$

where $g(x) \equiv f_1(x)f_2(x) \cdots f_n(x)$. Now, since $f_1(x), f_2(x), \ldots, f_n(x)$ are primitive as elements of $R[x]$, by 1.3.11, $g(x)$ is primitive as an element of $R[x]$. It follows that $d_1d_2 \cdots d_n$ is a greatest common divisor of the coefficients of $d_1d_2 \cdots d_ng(x)\ (= b_1b_2 \cdots b_nf(x))$. Thus $d_1d_2 \cdots d_n$ is a greatest common divisor of the coefficients of $b_1b_2 \cdots b_nf(x)$. Since $f(x)$ is primitive as an element of $R[x]$, $b_1b_2 \cdots b_n$ is a greatest common divisor of the coefficients of $b_1b_2 \cdots b_nf(x)$. Hence we can suppose that

$$d_1d_2 \cdots d_n = b_1b_2 \cdots b_n\ (\neq 0).$$

Now, since $b_1b_2 \cdots b_nf(x) = d_1d_2 \cdots d_ng(x)$ and $R[x]$ is an integral domain, we have $f(x) = g(x)$, and hence $f(x) = f_1(x)f_2(x) \cdots f_n(x)$. Since each $\left(\frac{d_k}{b_k}f_k(x) =\right) p_k(x)$ is irreducible in $F[x]$, $f_k(x)$ is irreducible in $F[x]$. And since $f_k(x)$ is primitive in $R[x]$, by 1.3.15, $f_k(x)$ is irreducible as an element of $R[x]$.

Thus $f(x)$ is expressed as a product of finitely many irreducible polynomials in $R[x]$. ■
1.3.18 Problem Let $R$ be a unique factorization domain with unit element 1. Let $f(x)$ be a nonzero member of $R[x]$ having degree $\geq 1$. Suppose that $f(x)$ is primitive as an element of $R[x]$. Then $f(x)$ can be expressed uniquely as a product of finitely many irreducible polynomials in $R[x]$.

Proof By 1.3.17, it suffices to prove only the uniqueness part of this theorem. To this end, let

$$f(x) = p_1(x)p_2(x) \cdots p_m(x),$$

where each $p_i(x)\ (i = 1, 2, \ldots, m)$ is an irreducible polynomial of degree $\geq 1$ in $R[x]$. Let

$$f(x) = p_1'(x)p_2'(x) \cdots p_n'(x),$$

where each $p_j'(x)\ (j = 1, 2, \ldots, n)$ is an irreducible polynomial of degree $\geq 1$ in $R[x]$. We have to show that

1. each $p_i(x)$ is an associate of some $p_j'(x)$ in $R[x]$,
2. each $p_j'(x)$ is an associate of some $p_i(x)$ in $R[x]$,
3. $n = m$.

Since $f(x)$ is primitive as an element of $R[x]$ and $f(x) = p_1(x)p_2(x) \cdots p_m(x)$, each $p_i(x)$ is primitive as an element of $R[x]$. Similarly, each $p_j'(x)$ is primitive as an element of $R[x]$. Now, since $p_1(x)$ is an irreducible polynomial in $R[x]$, by 1.3.14, $p_1(x)$ is irreducible as an element of $F[x]$, where $F$ is the field of quotients of members of $R$. Similarly, each $p_i(x)$ is irreducible as an element of $F[x]$, and each $p_j'(x)$ is irreducible as an element of $F[x]$.

Since $f(x)$ is a nonzero member of $R[x]\ (\subset F[x])$ having degree $\geq 1$, by 1.2.16, $f(x)$ is not a unit in $F[x]$. By 1.2.21,

1. each $p_i(x)$ is an associate of some $p_j'(x)$ in $F[x]$,
2. each $p_j'(x)$ is an associate of some $p_i(x)$ in $F[x]$,
3. $n = m$.

Suppose that $p_i(x)$ is an associate of some $p_j'(x)$ in $F[x]$. It follows that there exist $a, b \in R$ such that

$$p_i(x) = \frac{a}{b}p_j'(x),$$

and hence

$$bp_i(x) = ap_j'(x).$$

Since $p_i(x)$ is primitive as an element of $R[x]$, $b$ is a greatest common divisor of the coefficients of $bp_i(x)\ (= ap_j'(x))$. Thus $b$ is a greatest common divisor of the coefficients of $ap_j'(x)$. Since $p_j'(x)$ is primitive as an element of $R[x]$, $a$ is a greatest common divisor of the coefficients of $ap_j'(x)$. Hence we can suppose that

$$a = b\ (\neq 0).$$

Now, since $bp_i(x) = ap_j'(x)$ and $R[x]$ is an integral domain, $p_i(x) = p_j'(x)$, and hence $p_i(x)$ is an associate of some $p_j'(x)$ in $R[x]$. Thus each $p_i(x)$ is an associate of some $p_j'(x)$ in $R[x]$. Similarly, each $p_j'(x)$ is an associate of some $p_i(x)$ in $R[x]$. ■
1.3.19 Problem Let $R$ be a unique factorization domain with unit element 1. Then $R[x]$ is also a unique factorization domain with unit element 1.

Proof Since $R$ is a unique factorization domain with unit element 1, $R$ is an integral domain with unit element 1, and hence by 1.3.7, $R[x]$ is an integral domain with unit element 1. It remains to show that

a. every nonzero member of $R[x]$ that is not a unit can be written as a product of finitely many irreducible elements of $R[x]$,

b. the decomposition in part (a) is unique up to the order and associates of the irreducible elements of $R[x]$.

To this end, let us take a nonzero member $f(x)$ of $R[x]$ that is not a unit in $R[x]$. We have to show that $f(x)$ can be expressed uniquely as a product of finitely many irreducible polynomials in $R[x]$.

Let $d$ be a greatest common divisor of the coefficients of $f(x)$. It follows that $f(x)$ takes the form $dg(x)$, where $g(x)$ is primitive as an element of $R[x]$.

Case I: $\deg(f(x)) \geq 1$. Since $f(x) = dg(x)$ and $d \in R$, by 1.3.16, we have $\deg(g(x)) \geq 1$. It follows, by 1.3.18, that $g(x)$ can be expressed uniquely as a product of finitely many irreducible polynomials in $R[x]$. Since $d$ is a nonzero member of the unique factorization domain $R$, $d$ is either a unit of $R$ or a product of finitely many irreducible elements of $R$, unique up to order and associates; by the degree formula in 1.3.16, each irreducible element of $R$ is also irreducible in $R[x]$. Now, since $f(x) = dg(x)$, $f(x)$ can be expressed uniquely as a product of finitely many irreducible polynomials in $R[x]$.

Case II: $\deg(f(x)) = 0$, that is, $f(x)$ is a nonzero member of $R$ that is not a unit. Since $R$ is a unique factorization domain, $f(x)$ is a product of finitely many irreducible elements of $R$, unique up to order and associates. By the degree formula in 1.3.16, these elements are irreducible in $R[x]$, and every factor of $f(x)$ in $R[x]$ has degree 0 and hence lies in $R$.

So in all cases, $f(x)$ can be expressed uniquely as a product of finitely many irreducible polynomials in $R[x]$. ■
1.3.20 Problem Let $R$ be a unique factorization domain with unit element 1. Then $R[x_1, x_2, \ldots, x_n]$ is also a unique factorization domain with unit element 1.

Proof Since $R$ is a unique factorization domain with unit element 1, by 1.3.19, $R[x_1]$ is also a unique factorization domain with unit element 1. Now, again by 1.3.19, $(R[x_1, x_2] =)\ (R[x_1])[x_2]$ is a unique factorization domain with unit element 1. Similarly, $R[x_1, x_2, x_3]$ is a unique factorization domain with unit element 1. Continuing in this way, $R[x_1, x_2, \ldots, x_n]$ is a unique factorization domain with unit element 1. ■

1.3.21 Problem Let $F$ be a field. Then $F[x_1, x_2, \ldots, x_n]$ is a unique factorization domain.

Proof By the remark $(*)$ following the definition of a unique factorization domain, every field is a unique factorization domain. Thus $F$ is a unique factorization domain, and hence by 1.3.20, $F[x_1, x_2, \ldots, x_n]$ is a unique factorization domain. ■
1.4 Roots of Polynomials
1.4.1 Definition Let $F$ be a field. Let $K$ be a field such that $F \subset K$. If $F$ is a subfield of $K$, then we say that $K$ is an extension of $F$.

Examples

1. The field $\mathbb{R}$ of all real numbers is an extension of the field $\mathbb{Q}$ of all rational numbers.

2. The field $\mathbb{C}$ of all complex numbers is an extension of the field $\mathbb{R}$ of all real numbers.

3. The field $\{a + \sqrt{2}\,b : a, b \in \mathbb{Q}\}$ is an extension of $\mathbb{Q}$.
1.4.2 Problem Let $F$ and $K$ be any fields such that $K$ is an extension of $F$. Let us treat every member of $K$ as a vector and every member of $F$ as a scalar. We define the operation of scalar multiplication as follows:

for every $f \in F\ (\subset K)$ and every $v \in K$, we say that the product $fv$ in the field $K$ is the result of scalar multiplication of the scalar $f$ and the vector $v$.

Then $K$ is a vector space over the field $F$.

Proof It suffices to show that

1. for every $f_1, f_2 \in F$ and for every $v \in K$, $(f_1 + f_2)v = f_1v + f_2v$ and $(f_1f_2)v = f_1(f_2v)$,

2. for every $f \in F$ and for every $v, w \in K$, $f(v + w) = fv + fw$,

3. for every $v \in K$, $1v = v$.

For 1: Let us take arbitrary $f_1, f_2 \in F$ and $v \in K$. We have to show that $(f_1 + f_2)v = f_1v + f_2v$ and $(f_1f_2)v = f_1(f_2v)$.

We see that $(f_1 + f_2)v = f_1v + f_2v$ is trivially true, in view of the facts that $F \subset K$ and the right distributive law holds in the field $K$. Similarly, $(f_1f_2)v = f_1(f_2v)$ is trivially true, in view of the facts that $F \subset K$ and the associative law of multiplication holds in the field $K$.

For 2: Let us take arbitrary $f \in F$ and $v, w \in K$. We have to show that $f(v + w) = fv + fw$.

This is trivially true, in view of the facts that $F \subset K$ and the left distributive law holds in the field $K$.

For 3: Let us take an arbitrary $v \in K$. We have to show that $1v = v$.

Since $F$ is a subfield of $K$, the unit element 1 of $F$ is also the unit element of $K$. Now, $1v = v$ is trivially true, in view of the fact that $F \subset K$ and the existence of the unit element 1 in the field $K$. ■
Definition Let $F$ and $K$ be any fields such that $K$ is an extension of $F$. By 1.4.2, $K$ is a vector space over the field $F$. If the dimension of this vector space is finite, then we say that $K$ is a finite extension of $F$. In this case, the dimension of the vector space $K$ over $F$ is denoted by $[K : F]$ and is called the degree of $K$ over $F$.

Example: We have seen that the field $\mathbb{C}$ of all complex numbers is an extension of the field $\mathbb{R}$ of all real numbers. Hence $\mathbb{C}$ is a vector space over the field $\mathbb{R}$. Here $\{1, \sqrt{-1}\} \subset \mathbb{C}$, and every member of $\mathbb{C}$ can be expressed as a linear combination of the vectors $1, \sqrt{-1}$. Further, $\{1, \sqrt{-1}\}$ is a linearly independent set of vectors, in the sense that

$$(a1 + b\sqrt{-1} = 0 \text{ and } a, b \in \mathbb{R}) \Rightarrow (a = 0 \text{ and } b = 0).$$

Thus $\{1, \sqrt{-1}\}$ is a basis of $\mathbb{C}$. Since the number of elements in the basis $\{1, \sqrt{-1}\}$ is 2, the dimension of the vector space $\mathbb{C}$ over $\mathbb{R}$ is 2, which is of course finite. Hence $\mathbb{C}$ is a finite extension of $\mathbb{R}$. Also, $[\mathbb{C} : \mathbb{R}] = 2$.
1.4.3 Problem Let $F, K$, and $L$ be any fields such that $F \subset K \subset L$. Suppose that $K$ is a finite extension of $F$ and $L$ is a finite extension of $K$. Then

a. $L$ is a finite extension of $F$,

b. $[L : F] = [L : K][K : F]$.

Proof Let $[L : K] = m$ and $[K : F] = n$. It suffices to construct a basis of the vector space $L$ over $F$ that has $mn$ elements. For the sake of simplicity, let us take $m = 2$ and $n = 3$.

Since $[L : K] = 2$, there exists a basis $\{v_1, v_2\}\ (\subset L)$ of the vector space $L$ over $K$. Similarly, there exists a basis $\{w_1, w_2, w_3\}\ (\subset K \subset L)$ of the vector space $K$ over $F$. It follows that $\{w_1, w_2, w_3\} \subset L$ and $\{v_1, v_2\} \subset L$. Since $L$ is a field, $\{v_1w_1, v_1w_2, v_1w_3, v_2w_1, v_2w_2, v_2w_3\} \subset L$. It suffices to show that $\{v_1w_1, v_1w_2, v_1w_3, v_2w_1, v_2w_2, v_2w_3\}$ is a basis of the vector space $L$ over $F$. To this end, we must prove the following:

1. every element of $L$ can be expressed as a linear combination of $v_1w_1, v_1w_2, v_1w_3, v_2w_1, v_2w_2, v_2w_3$ with coefficients in $F$,

2. $\{v_1w_1, v_1w_2, v_1w_3, v_2w_1, v_2w_2, v_2w_3\}$ is a linearly independent set of vectors in the vector space $L$ over $F$.

For 1: Let us take an arbitrary $v \in L$. Now, since $\{v_1, v_2\}$ is a basis of the vector space $L$ over $K$, there exist $k_1, k_2 \in K$ such that $v = k_1v_1 + k_2v_2$. Since $k_1 \in K$ and $\{w_1, w_2, w_3\}$ is a basis of the vector space $K$ over $F$, there exist $f_{11}, f_{12}, f_{13} \in F$ such that $k_1 = f_{11}w_1 + f_{12}w_2 + f_{13}w_3$. Similarly, there exist $f_{21}, f_{22}, f_{23} \in F$ such that $k_2 = f_{21}w_1 + f_{22}w_2 + f_{23}w_3$. Hence

$$v = (f_{11}w_1 + f_{12}w_2 + f_{13}w_3)v_1 + (f_{21}w_1 + f_{22}w_2 + f_{23}w_3)v_2,$$

that is,

$$v = (f_{11}w_1v_1 + f_{12}w_2v_1 + f_{13}w_3v_1) + (f_{21}w_1v_2 + f_{22}w_2v_2 + f_{23}w_3v_2),$$

that is,

$$v = f_{11}(v_1w_1) + f_{12}(v_1w_2) + f_{13}(v_1w_3) + f_{21}(v_2w_1) + f_{22}(v_2w_2) + f_{23}(v_2w_3).$$

Thus $v$ is expressed as a linear combination of $v_1w_1, v_1w_2, v_1w_3, v_2w_1, v_2w_2, v_2w_3$ having coefficients in $F$.

For 2: Suppose that

$$f_{11}(v_1w_1) + f_{12}(v_1w_2) + f_{13}(v_1w_3) + f_{21}(v_2w_1) + f_{22}(v_2w_2) + f_{23}(v_2w_3) = 0,$$

where each $f_{ij} \in F$. We have to show that each $f_{ij}$ is zero. We have

$$(f_{11}(v_1w_1) + f_{12}(v_1w_2) + f_{13}(v_1w_3)) + (f_{21}(v_2w_1) + f_{22}(v_2w_2) + f_{23}(v_2w_3)) = 0,$$

that is,

$$(f_{11}w_1 + f_{12}w_2 + f_{13}w_3)v_1 + (f_{21}w_1 + f_{22}w_2 + f_{23}w_3)v_2 = 0. \qquad (*)$$

Since each $f_{ij} \in F\ (\subset K)$, each $w_k \in K$, and $K$ is a field, it follows that $(f_{11}w_1 + f_{12}w_2 + f_{13}w_3) \in K$. Similarly, $(f_{21}w_1 + f_{22}w_2 + f_{23}w_3) \in K$. Since $\{v_1, v_2\}$ is a basis of the vector space $L$ over $K$, $\{v_1, v_2\}$ is linearly independent. Now from $(*)$,

$$\left.\begin{aligned}
f_{11}w_1 + f_{12}w_2 + f_{13}w_3 &= 0\\
f_{21}w_1 + f_{22}w_2 + f_{23}w_3 &= 0
\end{aligned}\right\}.$$

Since $\{w_1, w_2, w_3\}$ is a basis of the vector space $K$ over $F$, $\{w_1, w_2, w_3\}$ is linearly independent. Since

$$f_{11}w_1 + f_{12}w_2 + f_{13}w_3 = 0,$$

we have $f_{11} = f_{12} = f_{13} = 0$. Similarly, $f_{21} = f_{22} = f_{23} = 0$. ■
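The product basis $\{v_iw_j\}$ can be made concrete in a small example that is not taken from the text: $F = \mathbb{Q}$, $K = \mathbb{Q}(\sqrt{2})$ with basis $\{1, \sqrt{2}\}$ over $F$, and $L = \mathbb{Q}(\sqrt{2}, \sqrt{3})$ with basis $\{1, \sqrt{3}\}$ over $K$ (this uses the standard fact that $\sqrt{3} \notin \mathbb{Q}(\sqrt{2})$, so here $m = n = 2$ rather than $m = 2$, $n = 3$). The sketch below stores an element of $L$ by its four rational coordinates over the product basis $(1, \sqrt{2}, \sqrt{3}, \sqrt{6})$; the function names are hypothetical.

```python
from fractions import Fraction


def from_tower(k1, k2):
    """Coordinates of k1*1 + k2*sqrt(3), where each k = (f1, f2) means f1 + f2*sqrt(2).

    This is step 1 of the proof: expand each coefficient in K over the basis of K over F.
    """
    return (k1[0], k1[1], k2[0], k2[1])


def mul(u, v):
    """Multiply two elements given by coordinates over (1, sqrt2, sqrt3, sqrt6)."""
    a1, b1, c1, d1 = u
    a2, b2, c2, d2 = v
    return (
        a1 * a2 + 2 * b1 * b2 + 3 * c1 * c2 + 6 * d1 * d2,   # rational part
        a1 * b2 + b1 * a2 + 3 * (c1 * d2 + d1 * c2),         # sqrt2 part (sqrt3*sqrt6 = 3*sqrt2)
        a1 * c2 + c1 * a2 + 2 * (b1 * d2 + d1 * b2),         # sqrt3 part (sqrt2*sqrt6 = 2*sqrt3)
        a1 * d2 + d1 * a2 + b1 * c2 + c1 * b2,               # sqrt6 part
    )


# sqrt(2) + sqrt(3), written as k1*1 + k2*sqrt(3) with k1 = sqrt(2), k2 = 1.
x = from_tower((Fraction(0), Fraction(1)), (Fraction(1), Fraction(0)))
print(x)           # coordinates over (1, sqrt2, sqrt3, sqrt6)
print(mul(x, x))   # (sqrt2 + sqrt3)^2 = 5 + 2*sqrt6
```

Four coordinates suffice to describe every product, which matches $[L : F] = [L : K][K : F] = 2 \cdot 2 = 4$ in this example.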
1.4.4 Problem Let $F, K$, and $L$ be any fields such that $L$ is a finite extension of $F$, $K$ is an extension of $F$, and $L$ is an extension of $K$. Then

a. $K$ is a finite extension of $F$, and $L$ is a finite extension of $K$,

b. $[K : F] \mid [L : F]$.

Proof Since $L$ is a finite extension of $F$, the dimension of the vector space $L$ over $F$ is finite. So let $\{v_1, v_2, \ldots, v_n\}\ (\subset L)$ be a basis of the vector space $L$ over $F$. Since $K$ is an extension of $F$, it follows by 1.4.2 that $K$ is a vector space over the field $F$. Since $K \subset L$, $K$ is a vector space over the field $F$, and $L$ is a vector space over the field $F$, we have that $K$ is a linear subspace of $L$. Since $\{v_1, v_2, \ldots, v_n\}\ (\subset L)$ is a basis of the vector space $L$ over $F$, the dimension of the vector space $K$ over $F$ is $\leq n$, and hence $K$ is a finite extension of $F$.

Since $\{v_1, v_2, \ldots, v_n\}\ (\subset L)$ is a basis of the vector space $L$ over $F$, each element of $L$ is a linear combination of $v_1, v_2, \ldots, v_n$ with coefficients in $F\ (\subset K)$, and hence each element of $L$ is a linear combination of $v_1, v_2, \ldots, v_n$ with coefficients in $K$. This shows that the dimension of the vector space $L$ over $K$ is $\leq n$, and hence $L$ is a finite extension of $K$. This proves (a). Now, since $K$ is a finite extension of $F$ and $L$ is a finite extension of $K$, by 1.4.3, $[L : F] = [L : K][K : F]$, and hence $[K : F] \mid [L : F]$. This proves (b). ■
Definition Let $F$ and $K$ be any fields such that $K$ is an extension of $F$. Let $a \in K$. If there exists a nonzero polynomial $q(x) \in F[x]$ such that $(K \ni)\ q(a) = 0$, then we say that $a$ is algebraic over $F$. (Caution: Here the polynomial $q(x)$ is a symbol, while $q(a)$ is a member of the field $K$.)

1.4.5 Problem Let $F$ and $K$ be any fields such that $K$ is an extension of $F$. Then every element of $F$ is algebraic over $F$.

Proof Let us take an arbitrary $a \in F$. We have to show that $a$ is algebraic over $F$. Let us take $a + (-1)x$ for $q(x)\ (\in F[x])$. Clearly, $q(a) = 0$. Thus, $a$ is algebraic over $F$. ■
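1.4.6 Problem Let $F$ and $K$ be any fields such that $K$ is an extension of $F$. Let $a \in K$. Let $\mathcal{M}$ be the collection of all fields $L$ satisfying

a. $F \cup \{a\} \subset L \subset K$,
b. $L$ is a subfield of $K$.

Clearly, $K \in \mathcal{M}$, and hence $\mathcal{M}$ is a nonempty collection. Also, $\cap\mathcal{M}$ is a member of $\mathcal{M}$.

Thus, $\cap\mathcal{M}$ is the smallest member of $\mathcal{M}$.

Proof Since each member of $\mathcal{M}$ is a field, $\cap\mathcal{M}$ is also a field. Since each member of $\mathcal{M}$ contains $F \cup \{a\}$, $\cap\mathcal{M}$ also contains $F \cup \{a\}$. Since each member of $\mathcal{M}$ is contained in $K$, $\cap\mathcal{M}$ is also contained in the field $K$. Now, since $\cap\mathcal{M}$ is a field, $\cap\mathcal{M}$ is a subfield of $K$. Thus by the definition of $\mathcal{M}$, $\cap\mathcal{M}$ is a member of $\mathcal{M}$. ■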
1.4.7 Problem Let F and K be any fields such that K is an extension of F. Let a 2 K.Let N be the set of all elements of K of the form g að Þð Þ�1f að Þ, where f xð Þ; g xð Þ aremembers of F x½ � and g að Þ is a nonzero member of K. Then N is a field.
Proof Let g að Þð Þ�1f að Þ 2 N, where f xð Þ; g xð Þ are members of F x½ � and g að Þ is anonzero member of K. Let g1 að Þð Þ�1f1 að Þ 2 N, where f1 xð Þ; g1 xð Þ are members ofF x½ � and g1 að Þ is a nonzero member of K. It suffices to show the following:
1. g að Þð Þ�1f að Þþ g1 að Þð Þ�1f1 að Þ� �
2 N,
2. g að Þð Þ�1f að Þ� �
g1 að Þð Þ�1f1 að Þ� �
2 N,
3. if g að Þð Þ�1f að Þ, g1 að Þð Þ�1f1 að Þ are nonzero elements of N, then their product isnonzero,
4. 1 2 N,
5. if g að Þð Þ�1f að Þ is a nonzero element of N, then there exists b in N such that
g að Þð Þ�1f að Þ� �
b ¼ 1.
48 1 Galois Theory I
For 1: Observe that
g að Þð Þ�1f að Þþ g1 að Þð Þ�1f1 að Þ ¼ g að Þð Þ�1 g1 að Þð Þ�1 f að Þg1 að Þþ g að Þf1 að Þð Þ¼ g að Þg1 að Þð Þ�1 f að Þg1 að Þþ g að Þf1 að Þð Þ ¼ g að Þg1 að Þð Þ�1 k1 að Þþ k2 að Þð Þ;
where k1 xð Þ � f xð Þg1 xð Þ 2 F x½ �ð Þ and k2 xð Þ � g xð Þf1 xð Þ 2 F x½ �ð Þ, and hence
g að Þð Þ�1f að Þþ g1 að Þð Þ�1f1 að Þ ¼ g að Þg1 að Þð Þ�1k að Þ;
where k xð Þ � k1 xð Þþ k2 xð Þð Þ 2 F x½ �ð Þ. Since g xð Þ; g1 xð Þ are members of F x½ �, h xð Þis a member of F x½ �, where h xð Þ � g xð Þg1 xð Þ. It follows that h að Þ ¼ g að Þg1 að Þ.Since g að Þ; g1 að Þ are nonzero members of K and K. is a field, h að Þ ¼ð Þg að Þg1 að Þ isa nonzero member of K, and hence h að Þ is a nonzero member of K. Thus
g að Þð Þ�1f að Þþ g1 að Þð Þ�1f1 að Þ ¼ h að Þð Þ�1k að Þ;
where h xð Þ; k xð Þ are members of F x½ � and h að Þ is a nonzero member of K. It followsthat
g að Þð Þ�1f að Þþ g1 að Þð Þ�1f1 að Þ ¼� �
h að Þð Þ�1k að Þ 2 N;
and hence
g að Þð Þ�1f að Þþ g1 að Þð Þ�1f1 að Þ� �
2 N:
For 2: Observe that
g að Þð Þ�1f að Þ� �
g1 að Þð Þ�1f1 að Þ� �
¼ g að Þð Þ�1 g1 að Þð Þ�1 f að Þf1 að Þð Þ¼ g að Þg1 að Þð Þ�1 f að Þf1 að Þð Þ ¼ g að Þg1 að Þð Þ�1 k að Þð Þ;
where k xð Þ � f xð Þf1 xð Þ 2 F x½ �ð Þ. Since g xð Þ; g1 xð Þ are members of F x½ �, h xð Þ is amember of F x½ �, where h xð Þ � g xð Þg1 xð Þ. It follows that h að Þ ¼ g að Þg1 að Þ. Sinceg að Þ; g1 að Þ are nonzero members of K, and K is a field, h að Þ ¼ð Þg að Þg1 að Þ is anonzero member of K, and hence h að Þ is a nonzero member of K. Thus
g að Þð Þ�1f að Þ� �
g1 að Þð Þ�1f1 að Þ� �
¼ h að Þð Þ�1k að Þ;
where h xð Þ; k xð Þ are members of F x½ �, and h að Þ is a nonzero member of K. Itfollows that
g að Þð Þ�1f að Þ� �
g1 að Þð Þ�1f1 að Þ� �
¼� �
h að Þð Þ�1k að Þ 2 N;
1.4 Roots of Polynomials 49
and hence
g að Þð Þ�1f að Þ� �
g1 að Þð Þ�1f1 að Þ� �� �
2 N:
For 3: Let g að Þð Þ�1f að Þ, g1 að Þð Þ�1f1 að Þ be nonzero elements of N. We have toshow that
g að Þð Þ�1f að Þ� �
g1 að Þð Þ�1f1 að Þ� �
is a nonzero element of N. Suppose to the contrary that
g að Þð Þ�1f að Þ� �
g1 að Þð Þ�1f1 að Þ� �
¼ 0: �ð Þ
We seek a contradiction. We have seen above that
g að Þð Þ�1f að Þ� �
g1 að Þð Þ�1f1 að Þ� �
¼ h að Þð Þ�1k að Þ;
where h xð Þ � g xð Þg1 xð Þ 2 F x½ �ð Þ, k xð Þ � f xð Þf1 xð Þ 2 F x½ �ð Þ, and h að Þ is a nonzeromember of K. Now from �ð Þ, h að Þð Þ�1k að Þ ¼ 0. Since h að Þ is a nonzero member ofthe field K, we have f að Þf1 að Þ ¼ð Þk að Þ ¼ 0, and hence either f að Þ ¼ 0 or f1 að Þ ¼ 0.It follows that either g að Þð Þ�1f að Þ ¼ 0 or g1 að Þð Þ�1f1 að Þ ¼ 0. This is acontradiction.
For 4: Let us take the constant polynomial 1 for $f(x)$, and again the constant polynomial 1 for $g(x)$. Clearly, $(1 = 1^{-1} \cdot 1 =)\ (g(a))^{-1}f(a) \in N$, and hence $1 \in N$.
For 5: Let us take an arbitrary nonzero element $(g(a))^{-1}f(a)$ of $N$ $(\subset K)$, where $f(x), g(x)$ are members of $F[x]$, and $g(a)$ is a nonzero member of $K$. Since $g(a)$ is a nonzero member of the field $K$, $(g(a))^{-1}$ is a nonzero member of $K$. Next, since $(g(a))^{-1}f(a)$ is a nonzero member of the field $K$, $f(a)$ is a nonzero member of the field $K$. This shows that $(f(a))^{-1}g(a) \in N$. Further, it is clear that $\left((g(a))^{-1}f(a)\right)\left((f(a))^{-1}g(a)\right) = 1$. Hence $(f(a))^{-1}g(a)$ serves the purpose of $b$. ■
1.4.8 Problem Let $F$ and $K$ be any fields such that $K$ is an extension of $F$. Let $a \in K$. Let $N$ be the symbol as described in 1.4.7, and $M$ the symbol as described in 1.4.6. Then $N = \bigcap M$.
Proof We must prove:
1. $N \subset \bigcap M$,
2. $\bigcap M \subset N$.
For 1: By 1.4.6, $\bigcap M$ is a member of $M$, so it suffices to show that every member of $M$ contains $N$. To this end, let us take an arbitrary $L \in M$. We have to show that $N \subset L$.
Next let us take an arbitrary $(g(a))^{-1}f(a)$, where $f(x), g(x)$ are members of $F[x]$, and $g(a)$ is a nonzero member of $K$. We have to show that $(g(a))^{-1}f(a) \in L$.
Since $L \in M$, by the definition of $M$, $L$ is a field satisfying
a. $F \cup \{a\} \subset L \subset K$,
b. $L$ is a subfield of $K$.
Since $f(x)$ is a member of $F[x]$ and $L$ is a field containing $F \cup \{a\}$, we have $f(a) \in L$. Similarly, $g(a) \in L$. Now, since $g(a)$ is nonzero, $(g(a))^{-1} \in L$. Next, since $f(a) \in L$ and $L$ is a field, we have $(g(a))^{-1}f(a) \in L$.
For 2: By 1.4.6, $\bigcap M$ is the smallest member of $M$, so it suffices to show that $N$ is a member of $M$. By the definition of $M$, we must prove:
a. $N$ is a field,
b. $F \cup \{a\} \subset N \subset K$,
c. $N$ is a subfield of $K$.
For a: By 1.4.7, $N$ is a field.
For b: Let us take an arbitrary $\alpha \in F$. We want to show that $\alpha \in N$. To this end, let us take the constant polynomial $\alpha$ as $f(x)$ $(\in F[x])$, and the constant polynomial 1 as $g(x)$ $(\in F[x])$. It is clear that $(\alpha = (1)^{-1}\alpha =)\ (g(a))^{-1}f(a) \in N$, and hence $\alpha \in N$. Thus we have shown that $F \subset N$.
Now we want to show that $a \in N$. To this end, let us take the polynomial $0 + 1x$ as $f(x)$ $(\in F[x])$, and the constant polynomial 1 as $g(x)$ $(\in F[x])$. It is clear that $(a = (1)^{-1}(0 + 1a) =)\ (g(a))^{-1}f(a) \in N$, and hence $a \in N$. Thus we have shown that $F \cup \{a\} \subset N$.
By the definition of $N$, $N \subset K$.
For c: Since $K, N$ are fields and $N \subset K$, it follows that $N$ is a subfield of $K$. ■
Definition Let $F$ and $K$ be any fields such that $F \subset K$. Suppose that $K$ is an extension of $F$. Let $a \in K$. The smallest subfield of $K$ that contains both $F$ and $a$ is denoted by $F(a)$, and we say that $F(a)$ is the subfield obtained by adjoining $a$ to $F$.
Thus $F \subset F(a)$, $a \in F(a)$, and $F(a)$ is a field. It follows that $\{f(a) : f(x) \in F[x]\} \subset F(a)$.
$(*)$ By 1.4.8, $F(a)$ is equal to the set of all elements of $K$ of the form $(g(a))^{-1}f(a)$, where $f(x), g(x)$ are members of $F[x]$, and $g(a)$ is a nonzero member of $K$. Further, since $F(a)$ is a field containing the field $F$ as a subfield, $F(a)$ is an extension of $F$. It follows, by 1.4.2, that $F(a)$ is a vector space over the field $F$.
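The description of $F(a)$ as the set of quotients $(g(a))^{-1}f(a)$ can be tried out on a small case. The sketch below assumes, purely for illustration, $F = \mathbb{Q}$, $K = \mathbb{R}$ and $a = \sqrt{2}$; the class name `QSqrt2`, the helper `evaluate`, and the sample polynomials are hypothetical choices, not taken from the text.

```python
from fractions import Fraction


class QSqrt2:
    """Element p + q*sqrt(2) of Q(sqrt(2)), with p, q rational (illustrative only)."""

    def __init__(self, p, q=0):
        self.p = Fraction(p)
        self.q = Fraction(q)

    def __add__(self, other):
        return QSqrt2(self.p + other.p, self.q + other.q)

    def __mul__(self, other):
        # (p1 + q1*s)(p2 + q2*s) with s*s = 2
        return QSqrt2(self.p * other.p + 2 * self.q * other.q,
                      self.p * other.q + self.q * other.p)

    def inverse(self):
        # 1/(p + q*s) = (p - q*s)/(p^2 - 2*q^2); the denominator is nonzero
        # for every nonzero element because sqrt(2) is irrational.
        norm = self.p * self.p - 2 * self.q * self.q
        if norm == 0:
            raise ZeroDivisionError("zero has no inverse")
        return QSqrt2(self.p / norm, -self.q / norm)

    def __eq__(self, other):
        return self.p == other.p and self.q == other.q

    def __repr__(self):
        return f"{self.p} + {self.q}*sqrt(2)"


def evaluate(coeffs, a):
    """Evaluate c0 + c1*x + ... at x = a by Horner's rule (lowest degree first)."""
    result = QSqrt2(0)
    for c in reversed(coeffs):
        result = result * a + QSqrt2(c)
    return result


a = QSqrt2(0, 1)                  # a = sqrt(2)
f_a = evaluate([1, 3], a)         # f(x) = 1 + 3x, an arbitrary sample
g_a = evaluate([1, 1, 1], a)      # g(x) = 1 + x + x^2, so g(a) = 3 + sqrt(2)
quotient = g_a.inverse() * f_a    # the element (g(a))^{-1} f(a) of F(a)
```

In this instance the quotient is again of the form $p + q\sqrt{2}$ with rational $p, q$, which is consistent with $F(a)$ being a vector space over $F$.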
1.4.9 Problem Let $F$ and $K$ be any fields such that $K$ is an extension of $F$. Let $a \in K$. Let $F(a)$ be a finite extension of $F$. Then $a$ is algebraic over $F$.
Proof Case I: $a \in F$. Clearly $f(a) = 0$, where $f(x)$ is the polynomial $a + (-1)x$ $(\in F[x])$, and hence $a$ is algebraic over $F$.
Case II: $a \notin F$. We have to show that $a$ is algebraic over $F$. Suppose to the contrary that $a$ is not algebraic over $F$. We seek a contradiction.
Since $a$ is not algebraic over $F$, $\{1, a, a^2, \ldots\}$ is a collection of distinct elements of $F(a)$. Thus $\{1, a, a^2, \ldots\}$ is an infinite subset of $F(a)$. Since $F(a)$ is a finite extension of $F$, the dimension of the vector space $F(a)$ over $F$ is finite. Now, since $\{1, a, a^2, \ldots\}$ is an infinite subset of $F(a)$, $\{1, a, a^2, \ldots\}$ is linearly dependent over $F$. It follows that there exists a positive integer $n$ such that $\{1, a, a^2, \ldots, a^n\}$ is linearly dependent over $F$. Hence, there exist $\alpha_0, \alpha_1, \alpha_2, \ldots, \alpha_n \in F$ such that not all $\alpha_i$ $(i = 0, 1, \ldots, n)$ are 0, and
\[ \alpha_0 1 + \alpha_1 a + \alpha_2 a^2 + \cdots + \alpha_n a^n = 0. \]
Thus $f(a) = 0$, where $f(x) \equiv \alpha_0 + \alpha_1 x + \alpha_2 x^2 + \cdots + \alpha_n x^n$ $(\in F[x])$ is such that not all $\alpha_i$ $(i = 0, 1, \ldots, n)$ are 0. Hence $f(x)$ is nonzero. Thus $a$ is algebraic over $F$, contrary to the supposition. ■
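The linear-dependence step in this proof can be illustrated on a small case. The sketch below assumes $F = \mathbb{Q}$ and the sample element $a = 1 + \sqrt{2}$ of the two-dimensional space $\mathbb{Q}(\sqrt{2})$; the pair representation and the helper `mul` are hypothetical conveniences. Since the dimension is 2, the three elements $1, a, a^2$ must be dependent, and the dependence yields a nonzero polynomial with root $a$.

```python
from fractions import Fraction


def mul(u, v):
    """Multiply p1 + q1*sqrt(2) and p2 + q2*sqrt(2), stored as pairs (p, q)."""
    return (u[0] * v[0] + 2 * u[1] * v[1], u[0] * v[1] + u[1] * v[0])


one = (Fraction(1), Fraction(0))
a = (Fraction(1), Fraction(1))        # a = 1 + sqrt(2), an arbitrary sample
a2 = mul(a, a)                        # a^2 = 3 + 2*sqrt(2)

# Look for c0, c1 in Q with a^2 = c0*1 + c1*a, i.e. solve
#   c0 + c1*a[0] = a2[0]   and   c1*a[1] = a2[1].
c1 = a2[1] / a[1]
c0 = a2[0] - c1 * a[0]

# The dependence relation -c0*1 - c1*a + 1*a^2 = 0 says that a is a root of
# f(x) = -c0 - c1*x + x^2 (coefficients listed lowest degree first).
f = [-c0, -c1, Fraction(1)]

# Check the relation coordinate by coordinate in the basis (1, sqrt(2)).
value = tuple(f[0] * one[i] + f[1] * a[i] + f[2] * a2[i] for i in range(2))
```

Here the relation found is $a^2 - 2a - 1 = 0$, so $a$ is a root of the nonzero polynomial $x^2 - 2x - 1 \in \mathbb{Q}[x]$.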
1.4.10 Problem Let $F$ be a field. Let $g(x) \in F[x]$ and $g(x) \neq 0$. Let $n$ be the degree of the polynomial $g(x)$. Let us denote the ideal $(g(x))$ $(= \{f(x)g(x) : f(x) \in F[x]\})$ by $V$. Then the quotient ring $F[x]/V$ is a vector space over the field $F$ under the usual vector addition and scalar multiplication. Further, $\{1 + V, x + V, x^2 + V, \ldots, x^{n-1} + V\}$ is a basis of $F[x]/V$. And hence $n$ is the dimension of the vector space $F[x]/V$.
Proof It suffices to show that
1. for every $a, b \in F$, and for every $v(x) \in F[x]$, we have $(a + b)(v(x) + V) = a(v(x) + V) + b(v(x) + V)$ and $(ab)(v(x) + V) = a(b(v(x) + V))$,
2. for every $a \in F$ and for every $v(x), w(x) \in F[x]$, we have $a((v(x) + V) + (w(x) + V)) = a(v(x) + V) + a(w(x) + V)$,
3. for every $v(x) \in F[x]$, $1(v(x) + V) = v(x) + V$.
For 1: Let us take arbitrary $a, b \in F$ and $v(x) \in F[x]$. We have to show that $(a + b)(v(x) + V) = a(v(x) + V) + b(v(x) + V)$ and $(ab)(v(x) + V) = a(b(v(x) + V))$. Here
\[ \begin{aligned} \text{LHS} &= (a + b)(v(x) + V) = (a + b)v(x) + V = (av(x) + bv(x)) + V \\ &= (av(x) + V) + (bv(x) + V) = a(v(x) + V) + b(v(x) + V) = \text{RHS} \end{aligned} \]
and
\[ \begin{aligned} \text{LHS} &= (ab)(v(x) + V) = (ab)v(x) + V = a(bv(x)) + V \\ &= a(bv(x) + V) = a(b(v(x) + V)) = \text{RHS}. \end{aligned} \]
For 2: Let us take arbitrary $a \in F$ and $v(x), w(x) \in F[x]$. We have to show that $a((v(x) + V) + (w(x) + V)) = a(v(x) + V) + a(w(x) + V)$. Here
\[ \begin{aligned} \text{LHS} &= a((v(x) + V) + (w(x) + V)) = a((v(x) + w(x)) + V) \\ &= a(v(x) + w(x)) + V = (av(x) + aw(x)) + V \\ &= (av(x) + V) + (aw(x) + V) = a(v(x) + V) + a(w(x) + V) = \text{RHS}. \end{aligned} \]
For 3: Let us take an arbitrary $v(x) \in F[x]$. We have to show that $1(v(x) + V) = v(x) + V$. Here
\[ \text{LHS} = 1(v(x) + V) = 1v(x) + V = v(x) + V = \text{RHS}. \]
Thus we have shown that $F[x]/V$ is a vector space over the field $F$ $(\subset F[x])$. It is clear that $\{1 + V, x + V, x^2 + V, \ldots, x^{n-1} + V\}$ is a subset of $F[x]/V$. We shall try to show that $\{1 + V, x + V, x^2 + V, \ldots, x^{n-1} + V\}$ is a basis of $F[x]/V$. To this end, we must show that
1. $\{1 + V, x + V, x^2 + V, \ldots, x^{n-1} + V\}$ is linearly independent,
2. $\{1 + V, x + V, x^2 + V, \ldots, x^{n-1} + V\}$ generates every element of $F[x]/V$.
For 1: Suppose that
\[ a_0(1 + V) + a_1(x + V) + \cdots + a_{n-1}(x^{n-1} + V) = 0 + V. \]
We have to show that each $a_k$ is 0. Since
\[ \begin{aligned} (a_0 + a_1x + \cdots + a_{n-1}x^{n-1}) + V &= (a_0 1 + V) + (a_1x + V) + \cdots + (a_{n-1}x^{n-1} + V) \\ &= a_0(1 + V) + a_1(x + V) + \cdots + a_{n-1}(x^{n-1} + V) = 0 + V, \end{aligned} \]
we have
\[ a_0 + a_1x + \cdots + a_{n-1}x^{n-1} = (a_0 + a_1x + \cdots + a_{n-1}x^{n-1}) - 0 \in V = \{f(x)g(x) : f(x) \in F[x]\}, \]
and hence $a_0 + a_1x + \cdots + a_{n-1}x^{n-1}$ is a member of $\{f(x)g(x) : f(x) \in F[x]\}$. By 1.2.14, every nonzero member of $\{f(x)g(x) : f(x) \in F[x]\}$ is of degree $\geq \deg(g(x))$ $(= n)$. It follows that
\[ \text{either } a_0 + a_1x + \cdots + a_{n-1}x^{n-1} = 0 \text{ or } \deg(a_0 + a_1x + \cdots + a_{n-1}x^{n-1}) \geq n\ (> n - 1). \]
It follows that $a_0 + a_1x + \cdots + a_{n-1}x^{n-1} = 0$, that is, each $a_k$ is 0.
For 2: Let us take an arbitrary nonzero member $h(x)$ of $F[x]$, where $h(x) \equiv b_0 + b_1x + b_2x^2 + \cdots$. We have to show that $h(x) + V$ can be expressed as a linear combination of $1 + V, x + V, x^2 + V, \ldots, x^{n-1} + V$.
By 1.2.14, there exist $q(x), r(x) \in F[x]$ such that $h(x) = q(x)g(x) + r(x)$ and (either $r(x) = 0$ or $\deg(r(x)) < \deg(g(x))$).
Case I: $r(x) = 0$. It follows that $h(x) = q(x)g(x) \in \{f(x)g(x) : f(x) \in F[x]\}$ $(= V)$, and hence $h(x) \in V$. We have
\[ h(x) + V = 0(1 + V) + 0(x + V) + 0(x^2 + V) + \cdots + 0(x^{n-1} + V). \]
Case II: $\deg(r(x)) < \deg(g(x))$ $(= n)$. We suppose that $r(x) \equiv c_0 + c_1x + \cdots + c_{n-1}x^{n-1}$, where not all $c_i$ $(i = 0, 1, \ldots, n - 1)$ are 0. It suffices to show that
\[ q(x)g(x) = h(x) - r(x) = h(x) - (c_0 + c_1x + \cdots + c_{n-1}x^{n-1}) \]
is a member of $V$ $(= \{f(x)g(x) : f(x) \in F[x]\})$, for then $h(x) + V = c_0(1 + V) + c_1(x + V) + \cdots + c_{n-1}(x^{n-1} + V)$. That is, it suffices to show that $q(x)g(x)$ is a member of $\{f(x)g(x) : f(x) \in F[x]\}$. This is clearly true.
Thus we have shown that $\{1 + V, x + V, x^2 + V, \ldots, x^{n-1} + V\}$ is a basis of $F[x]/V$. Since $\{1 + V, x + V, x^2 + V, \ldots, x^{n-1} + V\}$ is linearly independent, it is a set of distinct elements, and hence the number of elements in the basis $\{1 + V, x + V, x^2 + V, \ldots, x^{n-1} + V\}$ of $F[x]/V$ is $n$. Thus $n$ is the dimension of the vector space $F[x]/V$. ■
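The generation argument above amounts to division with remainder: the coordinates of $h(x) + V$ in the basis $1 + V, \ldots, x^{n-1} + V$ are the coefficients of the remainder $r(x)$. The following sketch assumes $F = \mathbb{Q}$ and represents polynomials as coefficient lists (lowest degree first); the function names `poly_divmod` and `coordinates` and the sample polynomials are hypothetical.

```python
from fractions import Fraction


def poly_divmod(h, g):
    """Divide h by g over Q (coefficient lists, lowest degree first).

    Returns (q, r) with h = q*g + r and r == [] (zero) or deg r < deg g.
    """
    h = [Fraction(c) for c in h]
    g = [Fraction(c) for c in g]
    while g and g[-1] == 0:          # drop trailing zero coefficients of g
        g.pop()
    if not g:
        raise ZeroDivisionError("g must be nonzero")
    q = [Fraction(0)] * max(len(h) - len(g) + 1, 0)
    r = h[:]
    while r and r[-1] == 0:
        r.pop()
    while len(r) >= len(g):
        shift = len(r) - len(g)
        factor = r[-1] / g[-1]       # cancels the leading term of r
        q[shift] = factor
        for i, c in enumerate(g):
            r[i + shift] -= factor * c
        while r and r[-1] == 0:
            r.pop()
    return q, r


def coordinates(h, g):
    """Coordinates of h + V in the basis 1+V, ..., x^(n-1)+V, where V = (g).

    Assumes the last entry of g is its (nonzero) leading coefficient.
    """
    n = len(g) - 1
    _, r = poly_divmod(h, g)
    return r + [Fraction(0)] * (n - len(r))


g = [-2, 0, 0, 1]                              # g(x) = x^3 - 2, so n = 3
coords = coordinates([1, 1, 0, 0, 0, 1], g)    # h(x) = 1 + x + x^5
zero_coords = coordinates([-2, -2, 0, 1, 1], g)  # h(x) = (x + 1)(x^3 - 2)
```

With these sample inputs, $x^5 + x + 1$ has remainder $1 + x + 2x^2$, while a multiple of $g(x)$ has all coordinates zero, matching Case II and Case I respectively.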
Definition Let $F$ and $K$ be any fields such that $K$ is an extension of $F$. Let $a$ be a member of $K$. Let $a$ be algebraic over $F$. It follows that there exists a nonzero polynomial $q(x) \in F[x]$ $(\supset F)$ such that
1. $(K \ni)\ q(a) = 0$,
2. $\deg(q(x)) \geq 1$,
3. the leading coefficient of $q(x)$ is 1.
If $n$ $(\geq 1)$ is the smallest degree of all such $q(x)$, then we say that $a$ is algebraic of degree $n$ over $F$.
Clearly, every member of F is algebraic of degree 1 over F.
1.4.11 Problem Let $F$ and $K$ be any fields such that $K$ is an extension of $F$. Let $a$ be a member of $K$. Let $a$ be algebraic of degree $n$ over $F$. Then there exists a unique polynomial $q(x) \in F[x]$ $(\supset F)$ such that
1. $(K \ni)\ q(a) = 0$,
2. $n = \deg(q(x)) \geq 1$,
3. the leading coefficient of $q(x)$ is 1.
The unique polynomial $q(x)$ is called the minimal polynomial of $a$ over $F$.
Proof Existence of $q(x)$ is clear from the definition of "algebraic of degree $n$ over $F$."
Uniqueness: Suppose that there exist $q_1(x), q_2(x) \in F[x]$ such that
1. $q_1(a) = 0$, $q_2(a) = 0$,
2. $n = \deg(q_1(x)) = \deg(q_2(x)) \geq 1$,
3. the leading coefficient of $q_1(x)$ is 1, and the leading coefficient of $q_2(x)$ is 1.
We have to show that $q_1(x) = q_2(x)$. Suppose to the contrary that $q_1(x) \neq q_2(x)$, that is, $q_1(x) - q_2(x) \neq 0$. We seek a contradiction.
Let us put $h(x) \equiv q_1(x) - q_2(x)$. Clearly, $h(x) \neq 0$. Since $q_1(x), q_2(x) \in F[x]$ and $F[x]$ is a ring, we have $q_1(x) - q_2(x) \in F[x]$, and hence $h(x) \in F[x]$. Since $n = \deg(q_1(x)) = \deg(q_2(x))$, the leading coefficient of $q_1(x)$ is 1, and the leading coefficient of $q_2(x)$ is 1, we have $\deg(q_1(x) - q_2(x)) < n$, and hence $\deg(h(x)) < n$. Since $q_1(a) = 0$, $q_2(a) = 0$, and $h(x) = q_1(x) - q_2(x)$, we have $h(a) = q_1(a) - q_2(a) = 0 - 0 = 0$, and hence $h(a) = 0$.
Since $a$ is algebraic of degree $n$ over $F$, $h(x) \in F[x]$, $h(x) \neq 0$, $h(a) = 0$, and $\deg(h(x)) < n$, we find that either $\deg(h(x)) < 1$ or the leading coefficient of $h(x)$ is different from 1.
Case I: $\deg(h(x)) < 1$. It follows that $h(x)$ is a constant polynomial, and since $h(a) = 0$, we have $h(x) = 0$. This is a contradiction.
Case II: the leading coefficient of $h(x)$ is different from 1. Here, we can suppose that
\[ h(x) \equiv b_0 + b_1x + \cdots + b_kx^k, \]
where $k$ is a positive integer strictly smaller than $n$, and $b_k$ is a nonzero member of $F$. Put $q(x) \equiv \frac{1}{b_k}h(x)$. Clearly, $q(x) \in F[x]$, $q(x) \neq 0$, $q(a) = 0$, the leading coefficient of $q(x)$ is 1, and $1 \leq k = \deg(q(x))$. Now, since $a$ is algebraic of degree $n$ over $F$, we have $n \leq k$. This is a contradiction. ■
1.4.12 Problem Let $F$ and $K$ be any fields such that $K$ is an extension of $F$. Let $a$ be a member of $K$. Let $a$ be algebraic of degree $n$ over $F$. Let $q(x) \in F[x]$ $(\supset F)$. Let $q(x)$ be the minimal polynomial of $a$ over $F$. Then $q(x)$ is irreducible over $F$.
Proof Suppose to the contrary that $q(x)$ is not irreducible over $F$. We seek a contradiction.
Since $q(x)$ is not irreducible over $F$, there exist $r(x), s(x) \in F[x]$ such that
1. $q(x) = r(x)s(x)$,
2. $1 \leq \deg(r(x)) < \deg(q(x))$ $(= n)$, and $1 \leq \deg(s(x)) < \deg(q(x))$ $(= n)$.
Since $a$ is algebraic of degree $n$ over $F$, $q(x) \in F[x]$ $(\supset F)$, and $q(x)$ is the minimal polynomial of $a$ over $F$, we have $q(a) = 0$, $n = \deg(q(x)) \geq 1$, and the leading coefficient of $q(x)$ is 1. Since $q(x) = r(x)s(x)$, we have $0 = q(a) = r(a)s(a)$, and hence $r(a)s(a) = 0$. Now, since $r(a), s(a) \in K$ and $K$ is a field, either $r(a) = 0$ or $s(a) = 0$.
Case I: $r(a) = 0$. Here $1 \leq \deg(r(x))$, so $\deg(r_1(x)) = \deg(r(x))$ $(< n)$, where
\[ r_1(x) \equiv \frac{1}{\text{leading coefficient of } r(x)}\, r(x). \]
Thus $r_1(x) \in F[x]$, $r_1(a) = 0$, and the leading coefficient of $r_1(x)$ is 1. Further, $1 \leq \deg(r(x)) = \deg(r_1(x))$, so $1 \leq \deg(r_1(x))$. Now, since $a$ is algebraic of degree $n$ over $F$, we have $n \leq \deg(r_1(x))$. This is a contradiction.
Case II: $s(a) = 0$. This case is similar to Case I.
Thus in all cases, we get a contradiction. ■
1.4.13 Problem Let $F$ and $K$ be any fields such that $K$ is an extension of $F$. Let $a$ be a member of $K$. Let $a$ be algebraic of degree $n$ over $F$. Let $p(x) \in F[x]$ $(\supset F)$. Let $p(x)$ be the minimal polynomial of $a$ over $F$. By 1.4.12, $p(x)$ is irreducible over $F$, and hence by 1.2.22, the ideal $(p(x))$ $(\equiv \{p(x)f(x) : f(x) \in F[x]\})$ is a maximal ideal of the ring $F[x]$. Further, by 1.2.24, the quotient ring $F[x]/(p(x))$ is a field. Also, $\{f(a) : f(x) \in F[x]\} \subset F(a)$.
Let $\psi : f(x) \mapsto f(a)$ be a mapping from the ring $F[x]$ to the field $F(a)$. Then
1. $\psi : F[x] \to F(a)$ is a ring homomorphism,
2. $\ker(\psi) = (p(x))$, where $\ker(\psi)$ $(\equiv \{f(x) : f(x) \in F[x] \text{ and } \psi(f(x)) = 0\} = \{f(x) : f(x) \in F[x] \text{ and } f(a) = 0\})$ denotes the kernel of the homomorphism $\psi$, and $(p(x))$ $(= \{p(x)f(x) : f(x) \in F[x]\})$ is the ideal of the ring $F[x]$ generated by $p(x)$.
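Proof 1. Let $\psi$ denote the mapping $f(x) \mapsto f(a)$ from $F[x]$ to $F(a)$. Let us take arbitrary $f(x), g(x) \in F[x]$, where
\[ f(x) \equiv \alpha_0 + \alpha_1x + \alpha_2x^2 + \cdots, \qquad g(x) \equiv \beta_0 + \beta_1x + \beta_2x^2 + \cdots, \]
each $\alpha_i \in F$, and each $\beta_i \in F$. We have to show that
a. $\psi(f(x) + g(x)) = \psi(f(x)) + \psi(g(x))$,
b. $\psi(f(x)g(x)) = \psi(f(x))\psi(g(x))$.
For (a): Here
\[ \begin{aligned} \text{LHS} &= \psi(f(x) + g(x)) = \psi\left((\alpha_0 + \alpha_1x + \alpha_2x^2 + \cdots) + (\beta_0 + \beta_1x + \beta_2x^2 + \cdots)\right) \\ &= \psi\left((\alpha_0 + \beta_0) + (\alpha_1 + \beta_1)x + (\alpha_2 + \beta_2)x^2 + \cdots\right) \\ &= (\alpha_0 + \beta_0) + (\alpha_1 + \beta_1)a + (\alpha_2 + \beta_2)a^2 + \cdots \\ &= (\alpha_0 + \beta_0) + (\alpha_1a + \beta_1a) + (\alpha_2a^2 + \beta_2a^2) + \cdots \\ &= (\alpha_0 + \alpha_1a + \alpha_2a^2 + \cdots) + (\beta_0 + \beta_1a + \beta_2a^2 + \cdots) \\ &= f(a) + g(a) = \psi(f(x)) + \psi(g(x)) = \text{RHS}. \end{aligned} \]
For (b): Here
\[ \begin{aligned} \text{LHS} &= \psi(f(x)g(x)) = \psi\left((\alpha_0 + \alpha_1x + \alpha_2x^2 + \cdots)(\beta_0 + \beta_1x + \beta_2x^2 + \cdots)\right) \\ &= \psi\left(\alpha_0\beta_0 + (\alpha_0\beta_1 + \alpha_1\beta_0)x + (\alpha_0\beta_2 + \alpha_1\beta_1 + \alpha_2\beta_0)x^2 + \cdots\right) \\ &= \alpha_0\beta_0 + (\alpha_0\beta_1 + \alpha_1\beta_0)a + (\alpha_0\beta_2 + \alpha_1\beta_1 + \alpha_2\beta_0)a^2 + \cdots \\ &= \alpha_0\beta_0 + (\alpha_0\beta_1a + \alpha_1\beta_0a) + (\alpha_0\beta_2a^2 + \alpha_1\beta_1a^2 + \alpha_2\beta_0a^2) + \cdots \\ &= (\alpha_0\beta_0 + \alpha_0\beta_1a + \alpha_0\beta_2a^2 + \cdots) + (\alpha_1\beta_0a + \alpha_1\beta_1a^2 + \alpha_1\beta_2a^3 + \cdots) \\ &\quad + (\alpha_2\beta_0a^2 + \alpha_2\beta_1a^3 + \alpha_2\beta_2a^4 + \cdots) + \cdots \\ &= \alpha_0(\beta_0 + \beta_1a + \beta_2a^2 + \cdots) + \alpha_1a(\beta_0 + \beta_1a + \beta_2a^2 + \cdots) + \alpha_2a^2(\beta_0 + \beta_1a + \beta_2a^2 + \cdots) + \cdots \\ &= (\alpha_0 + \alpha_1a + \alpha_2a^2 + \cdots)(\beta_0 + \beta_1a + \beta_2a^2 + \cdots) = f(a)g(a) \\ &= \psi(f(x))\psi(g(x)) = \text{RHS}. \end{aligned} \]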
2. Let $\psi$ denote the mapping $f(x) \mapsto f(a)$ from $F[x]$ to $F(a)$. We have to show that
a. $\{p(x)g(x) : g(x) \in F[x]\} \subset \{f(x) : f(x) \in F[x] \text{ and } f(a) = 0\}$,
b. $\{f(x) : f(x) \in F[x] \text{ and } f(a) = 0\} \subset \{p(x)g(x) : g(x) \in F[x]\}$.
For (a): Let us take an arbitrary $g(x) \in F[x]$. We have to show that $p(x)g(x) \in \{f(x) : f(x) \in F[x] \text{ and } f(a) = 0\}$, that is,
\[ (p(a)g(a) = \psi(p(x))\psi(g(x)) =)\ \psi(p(x)g(x)) = 0, \]
that is, $p(a)g(a) = 0$. It suffices to show that $p(a) = 0$.
Since $a$ is algebraic of degree $n$ over $F$ and $p(x)$ is the minimal polynomial of $a$ over $F$, we have $p(a) = 0$.
For (b): Let us take an arbitrary nonzero $f(x) \in F[x]$ such that $\psi(f(x)) = 0$, that is, $f(a) = 0$. We have to show that $f(x) \in \{p(x)g(x) : g(x) \in F[x]\}$.
Since $p(x)$ is the minimal polynomial of $a$ over $F$, we have $p(a) = 0$, $n = \deg(p(x)) \geq 1$, and the leading coefficient of $p(x)$ is 1. Since $\deg(p(x)) \geq 1$, $p(x)$ is a nonzero member of $F[x]$. Now by 1.2.14, there exist $q(x), r(x) \in F[x]$ such that $f(x) = q(x)p(x) + r(x)$, and (either $r(x) = 0$ or $\deg(r(x)) < \deg(p(x))$). Since $f(x) = q(x)p(x) + r(x)$, we have
\[ \begin{aligned} 0 = f(a) = \psi(f(x)) &= \psi(q(x)p(x) + r(x)) = \psi(q(x)p(x)) + \psi(r(x)) \\ &= \psi(q(x))\psi(p(x)) + \psi(r(x)) = q(a)p(a) + r(a) = q(a)0 + r(a) = r(a), \end{aligned} \]
and hence $r(a) = 0$.
We claim that $r(x) = 0$. Suppose to the contrary that $r(x) \neq 0$. We seek a contradiction.
Since $r(x) \neq 0$ and $r(a) = 0$, $r(x)$ is not a constant polynomial, and hence $\deg(r(x)) \geq 1$. Since $r(x) \neq 0$ and (either $r(x) = 0$ or $\deg(r(x)) < \deg(p(x))$), we have $1 \leq \deg(r(x)) < \deg(p(x)) = n$, and hence $1 \leq \deg(r(x)) < n$. Put
\[ r_1(x) \equiv \frac{1}{\text{leading coefficient of } r(x)}\, r(x). \]
Thus $r_1(x) \in F[x]$, $r_1(a) = 0$, and the leading coefficient of $r_1(x)$ is 1. Further, $1 \leq \deg(r(x)) = \deg(r_1(x))$, so $1 \leq \deg(r_1(x))$. Now, since $a$ is algebraic of degree $n$ over $F$, we have $n \leq \deg(r_1(x))$, and hence $n \leq \deg(r(x))$. This is a contradiction.
Thus our claim is true, that is, $(f(x) - q(x)p(x) =)\ r(x) = 0$. Hence $f(x) = q(x)p(x) \in \{p(x)g(x) : g(x) \in F[x]\}$. ■
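The equality of the kernel of $f(x) \mapsto f(a)$ with the ideal $(p(x))$ can be checked on sample polynomials. The sketch below assumes $F = \mathbb{Q}$, $a = \sqrt{2}$ and $p(x) = x^2 - 2$; the helper names `remainder`, `eval_at_sqrt2`, `in_kernel`, `in_ideal` are hypothetical, and polynomials are coefficient lists with the lowest degree first.

```python
from fractions import Fraction

P = [-2, 0, 1]  # p(x) = x^2 - 2, the minimal polynomial of sqrt(2) over Q


def remainder(f, p):
    """Remainder of f on division by a monic polynomial p."""
    r = [Fraction(c) for c in f]
    while len(r) >= len(p):
        lead = r[-1]
        shift = len(r) - len(p)
        for i, c in enumerate(p):
            r[i + shift] -= lead * c
        r.pop()                      # the leading term has been cancelled
    while r and r[-1] == 0:
        r.pop()
    return r


def eval_at_sqrt2(f):
    """Evaluate f at sqrt(2); the value u + v*sqrt(2) is returned as (u, v)."""
    u, v = Fraction(0), Fraction(0)
    for c in reversed(f):
        # Horner step: (u + v*s)*s + c = (2*v + c) + u*s, using s*s = 2
        u, v = 2 * v + c, u
    return (u, v)


def in_kernel(f):
    """True when f(sqrt(2)) = 0."""
    return eval_at_sqrt2(f) == (0, 0)


def in_ideal(f):
    """True when p(x) divides f(x)."""
    return remainder(f, P) == []


f1 = [-4, 0, 0, 0, 1]    # x^4 - 4 = (x^2 - 2)(x^2 + 2)
f2 = [1, 0, 0, 1]        # x^3 + 1, which takes the value 1 + 2*sqrt(2) at sqrt(2)
```

For both samples the two membership tests agree, as parts (a) and (b) of the proof predict; for `f2` the remainder $1 + 2x$ evaluates to the same value as `f2` itself.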
1.4.14 Problem Let $F$ and $K$ be any fields such that $K$ is an extension of $F$. Let $a$ be a member of $K$. Let $a$ be algebraic of degree $n$ over $F$. Let $p(x) \in F[x]$ $(\supset F)$. Let $p(x)$ be the minimal polynomial of $a$ over $F$. Let $\psi : f(x) \mapsto f(a)$ be a mapping from the ring $F[x]$ to the field $F(a)$. By 1.4.13, $\psi : F[x] \to F(a)$ is a ring homomorphism, and hence by the fundamental theorem of ring homomorphisms, the mapping $\eta : (f(x) + \ker(\psi)) \mapsto \psi(f(x))$ $(= f(a))$ is a ring isomorphism from the quotient ring $F[x]/\ker(\psi)$ to $F(a)$. Thus $\eta$ maps $F[x]/\ker(\psi)$ onto $F(a)$.
In short, the field $F[x]/\ker(\psi)$ is isomorphic to the field $F(a)$.
Proof Here $\psi$ denotes the homomorphism $f(x) \mapsto f(a)$ and $\eta$ the induced mapping $(f(x) + \ker(\psi)) \mapsto f(a)$. Recall that $F(a)$ is equal to the set of all elements of $K$ of the form $(g(a))^{-1}f(a)$, where $f(x), g(x)$ are members of $F[x]$, and $g(a)$ is a nonzero member of $K$. Next, let us take an arbitrary $(g(a))^{-1}f(a) \in F(a)$, where $f(x), g(x)$ are members of $F[x]$, and $g(a)$ is a nonzero member of $K$. From 1.4.13, $F[x]/(p(x))$ is a field, and $\ker(\psi) = (p(x))$, so $F[x]/\ker(\psi)$ is a field. Since $f(x), g(x)$ are members of $F[x]$, $(f(x) + \ker(\psi)), (g(x) + \ker(\psi)) \in F[x]/\ker(\psi)$. Since $(\psi(g(x)) =)\ g(a)$ is nonzero, $g(x) \notin \ker(\psi)$, and hence $g(x) + \ker(\psi)$ is a nonzero member of the field $F[x]/\ker(\psi)$. It follows that
\[ (f(x)h(x) + \ker(\psi)) = (f(x) + \ker(\psi))(h(x) + \ker(\psi)) = (f(x) + \ker(\psi))(g(x) + \ker(\psi))^{-1} \in F[x]/\ker(\psi), \]
where $h(x) \in F[x]$ is such that $(h(x) + \ker(\psi))(g(x) + \ker(\psi)) = (1 + \ker(\psi))$. Thus $(f(x)h(x) + \ker(\psi)) \in F[x]/\ker(\psi)$, and $(h(x)g(x) + \ker(\psi)) = (1 + \ker(\psi))$. It suffices to show that $\eta(f(x)h(x) + \ker(\psi)) = (g(a))^{-1}f(a)$. Since
\[ \eta(f(x)h(x) + \ker(\psi)) = \psi(f(x)h(x)) = \psi(f(x))\psi(h(x)) = f(a)h(a), \]
it suffices to show that $f(a)h(a) = (g(a))^{-1}f(a)$, that is, $g(a)f(a)h(a) = f(a)$, that is, $f(a)g(a)h(a) = f(a)$. Again, it suffices to show that $h(a)g(a) = 1$.
Since $(h(x)g(x) + \ker(\psi)) = (1 + \ker(\psi))$, we have
\[ h(a)g(a) = \psi(h(x))\psi(g(x)) = \psi(h(x)g(x)) = \eta(h(x)g(x) + \ker(\psi)) = \eta(1 + \ker(\psi)) = \psi(1) = 1, \]
and hence $h(a)g(a) = 1$. ■
1.4.15 Note Let $F$ and $K$ be any fields such that $F \subset K$. Suppose that $K$ is an extension of $F$. Let $a$ be a member of $K$. Let $a$ be algebraic of degree $n$ over $F$. Let $p(x) \in F[x]$ $(\supset F)$. Let $p(x)$ be the minimal polynomial of $a$ over $F$. It follows that $n = \deg(p(x)) \geq 1$. Now by 1.4.10, $\dim\left(F[x]/(p(x))\right) = n$.
Let $\psi : f(x) \mapsto f(a)$ be a mapping from the ring $F[x]$ to the field $F(a)$. By 1.4.14, the mapping $\eta : (f(x) + \ker(\psi)) \mapsto \psi(f(x))$ $(= f(a))$ is a ring isomorphism from the quotient ring $F[x]/\ker(\psi)$ onto $F(a)$. Further, by 1.4.13, $\ker(\psi) = (p(x))$. Thus $\dim\left(F[x]/\ker(\psi)\right) = n$.
We can think of $F[x]/\ker(\psi)$, where $\psi$ is the homomorphism $f(x) \mapsto f(a)$, as a vector space over the field $F$ under the usual operations of vector addition and scalar multiplication: For every $f(x), g(x) \in F[x]$ and for every $\alpha \in F$ $(\subset F[x])$,
\[ (f(x) + \ker(\psi)) + (g(x) + \ker(\psi)) \equiv (f(x) + g(x)) + \ker(\psi) \]
and
\[ \alpha(f(x) + \ker(\psi)) \equiv (\alpha + \ker(\psi))(f(x) + \ker(\psi)) = (\alpha f(x) + \ker(\psi)). \]
It suffices to show the following:
1. For every $f(x) \in F[x]$ and for every $\alpha, \beta \in F$,
\[ (\alpha + \beta)(f(x) + \ker(\psi)) = \alpha(f(x) + \ker(\psi)) + \beta(f(x) + \ker(\psi)) \]
and
\[ (\alpha\beta)(f(x) + \ker(\psi)) = \alpha(\beta(f(x) + \ker(\psi))). \]
2. For every $f(x), g(x) \in F[x]$ and for every $\alpha \in F$,
\[ \alpha((f(x) + \ker(\psi)) + (g(x) + \ker(\psi))) = \alpha(f(x) + \ker(\psi)) + \alpha(g(x) + \ker(\psi)). \]
3. For every $f(x) \in F[x]$,
\[ 1(f(x) + \ker(\psi)) = (f(x) + \ker(\psi)). \]
For 1: For scalars $\alpha, \beta \in F$ and $f(x) \in F[x]$,
\[ \begin{aligned} \text{LHS} &= (\alpha + \beta)(f(x) + \ker(\psi)) = (\alpha + \beta)f(x) + \ker(\psi) = (\alpha f(x) + \beta f(x)) + \ker(\psi) \\ &= (\alpha f(x) + \ker(\psi)) + (\beta f(x) + \ker(\psi)) = \alpha(f(x) + \ker(\psi)) + \beta(f(x) + \ker(\psi)) = \text{RHS}. \end{aligned} \]
Next,
\[ \begin{aligned} \text{LHS} &= (\alpha\beta)(f(x) + \ker(\psi)) = (\alpha\beta)f(x) + \ker(\psi) = \alpha(\beta f(x)) + \ker(\psi) \\ &= \alpha(\beta f(x) + \ker(\psi)) = \alpha(\beta(f(x) + \ker(\psi))) = \text{RHS}. \end{aligned} \]
For 2:
\[ \begin{aligned} \text{LHS} &= \alpha((f(x) + \ker(\psi)) + (g(x) + \ker(\psi))) = \alpha((f(x) + g(x)) + \ker(\psi)) \\ &= \alpha(f(x) + g(x)) + \ker(\psi) = (\alpha f(x) + \alpha g(x)) + \ker(\psi) \\ &= (\alpha f(x) + \ker(\psi)) + (\alpha g(x) + \ker(\psi)) = \alpha(f(x) + \ker(\psi)) + \alpha(g(x) + \ker(\psi)) = \text{RHS}. \end{aligned} \]
For 3:
\[ \text{LHS} = 1(f(x) + \ker(\psi)) = 1f(x) + \ker(\psi) = f(x) + \ker(\psi) = \text{RHS}. \]
Thus $F[x]/\ker(\psi)$ is a vector space over the field $F$.
Since $F \subset F(a)$ and $F, F(a)$ are fields, by 1.4.2, $F(a)$ can be thought of as a vector space over the field $F$.
We shall show that the mapping $\eta : (f(x) + \ker(\psi)) \mapsto \psi(f(x))$ $(= f(a))$ is an isomorphism from the vector space $F[x]/\ker(\psi)$ onto the vector space $F(a)$.
Since $\eta$ is a ring isomorphism from $F[x]/\ker(\psi)$ to $F(a)$, the map $\eta$ from $F[x]/\ker(\psi)$ to $F(a)$ is one-to-one and onto. Hence it suffices to show that for every $f(x), g(x) \in F[x]$ and for every $\alpha, \beta \in F$,
\[ \eta(\alpha(f(x) + \ker(\psi)) + \beta(g(x) + \ker(\psi))) = \alpha\,\eta(f(x) + \ker(\psi)) + \beta\,\eta(g(x) + \ker(\psi)). \]
Here
\[ \begin{aligned} \text{LHS} &= \eta(\alpha(f(x) + \ker(\psi)) + \beta(g(x) + \ker(\psi))) = \eta((\alpha f(x) + \beta g(x)) + \ker(\psi)) \\ &= \psi(\alpha f(x) + \beta g(x)) = \psi(\alpha f(x)) + \psi(\beta g(x)) = \psi(\alpha)\psi(f(x)) + \psi(\beta)\psi(g(x)) \\ &= \alpha\psi(f(x)) + \beta\psi(g(x)) = \alpha\,\eta(f(x) + \ker(\psi)) + \beta\,\eta(g(x) + \ker(\psi)) = \text{RHS}. \end{aligned} \]
Thus the vector space $F[x]/\ker(\psi)$ over $F$ is isomorphic to the vector space $F(a)$ over $F$. It follows that
\[ n = \dim\left(F[x]/\ker(\psi)\right) = \dim(F(a)) = [F(a) : F], \]
and hence $[F(a) : F] = n$.
1.4.16 Conclusion Let $F$ and $K$ be any fields such that $K$ is an extension of $F$. Let $a$ be a member of $K$. Let $a$ be algebraic of degree $n$ over $F$. Then $[F(a) : F] = n$. In short, $a$ is algebraic of degree $[F(a) : F]$ over $F$.
1.4.17 Problem Let $F$ and $K$ be any fields such that $K$ is an extension of $F$. Let $a$ be a member of $K$. Let $a$ be algebraic over $F$. Then $F(a)$ is a finite extension of $F$.
Proof Since $a$ is algebraic over $F$, there exists a nonzero polynomial $q(x) \in F[x]$ such that $(K \ni)\ q(a) = 0$. It follows that $\deg(q(x)) \geq 1$. Let $n$ be the smallest degree of all such polynomials $q(x)$. Hence $a$ is algebraic of degree $n$ over $F$. Now by 1.4.16, $[F(a) : F] = n < \infty$. Hence $F(a)$ is a finite extension of $F$. ■
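A small computation may help make the identification of $F(a)$ with $F[x]/(p(x))$ and the equality $[F(a) : F] = n$ concrete. The sketch assumes $F = \mathbb{Q}$ and $a = \sqrt[3]{2}$, whose minimal polynomial over $\mathbb{Q}$ is $x^3 - 2$, so that $n = 3$; elements are stored as triples of coordinates in the basis $1, a, a^2$, and the helper `mul_mod` is a hypothetical name.

```python
from fractions import Fraction


def mul_mod(u, v):
    """Multiply u, v in Q[x]/(x^3 - 2).

    Elements are triples (c0, c1, c2) standing for c0 + c1*a + c2*a^2,
    where a^3 = 2.
    """
    prod = [Fraction(0)] * 5
    for i, ui in enumerate(u):
        for j, vj in enumerate(v):
            prod[i + j] += Fraction(ui) * Fraction(vj)
    # Reduce with a^3 = 2 and a^4 = 2*a.
    return (prod[0] + 2 * prod[3], prod[1] + 2 * prod[4], prod[2])


a = (0, 1, 0)
a_inverse = (0, 0, Fraction(1, 2))        # candidate inverse: a^2 / 2

a_cubed = mul_mod(a, mul_mod(a, a))       # should equal 2
unit_check = mul_mod(a, a_inverse)        # should equal 1
norm_like = mul_mod((1, 1, 0), (1, -1, 1))  # (1 + a)(1 - a + a^2) = 1 + a^3
```

Every product stays inside the three-dimensional space spanned by $1, a, a^2$, and the nonzero elements tried here have inverses in it: $a^{-1} = a^2/2$ and $(1 + a)^{-1} = (1 - a + a^2)/3$.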
1.4.18 Note Let $F$ and $K$ be any fields such that $K$ is an extension of $F$. Let $A$ be the collection of all elements of $K$ that are algebraic over $F$. By 1.4.5, $F \subset A$. Thus $F \subset A \subset K$. We shall show that $A$ is a field.
To this end, let us take arbitrary $a, b \in A$. It suffices to show the following:
1. $(a \pm b) \in A$,
2. $ab \in A$,
3. $ab^{-1} \in A$, provided $a, b$ are nonzero.
Since $a \in A$, $a$ is algebraic over $F$, and hence there exists a positive integer $m$ such that $a$ is algebraic of degree $m$ over $F$. Similarly, there exists a positive integer $n$ such that $b$ is algebraic of degree $n$ over $F$. Since $b$ is algebraic of degree $n$ over $F$, by 1.4.11, there exists a unique polynomial $q(x) \in F[x]$ $(\supset F)$ such that
1. $(K \ni)\ q(b) = 0$,
2. $n = \deg(q(x)) \geq 1$,
3. the leading coefficient of $q(x)$ is 1.
Suppose that $b$ is algebraic of degree $k$ over the field $F(a)$ $(\subset K)$.
Since $F \subset F(a)$, and $F(a)$ is a field, we have $F[x] \subset (F(a))[x]$. Now, since $q(x) \in F[x]$, we have $q(x) \in (F(a))[x]$. Since $q(x) \in (F(a))[x]$, $q(b) = 0$, the leading coefficient of $q(x)$ is 1, and $b$ is algebraic of degree $k$ over the field $F(a)$, we have
\[ k \leq \deg(q(x)) = n. \]
Thus $k \leq n$. Since $b$ is algebraic of degree $k$ over the field $F(a)$ $(\subset K)$, by 1.4.16, $[(F(a))(b) : F(a)] = k$. Since $a$ is algebraic of degree $m$ over the field $F$ $(\subset K)$, by 1.4.16, $[F(a) : F] = m$. Since $F \subset F(a) \subset (F(a))(b) \subset K(b) = K$, by 1.4.3, we have
\[ [(F(a))(b) : F] = [(F(a))(b) : F(a)]\,[F(a) : F] = k\,[F(a) : F] \leq n\,[F(a) : F] = nm. \]
Thus $[(F(a))(b) : F] \leq nm$ $(< \infty)$. Since $(F(a))(b)$ is a field containing $b$ and all the elements of $F(a)$ $(\ni a)$, $(F(a))(b)$ is a field containing $a, b$.
1. Since $(F(a))(b)$ is a field containing $a, b$, $(F(a))(b)$ is a field containing $a \pm b$, and hence $F(a \pm b) \subset (F(a))(b)$. Thus $F(a \pm b)$ is a linear subspace of the vector space $(F(a))(b)$. It follows that
\[ [F(a \pm b) : F] = \dim(F(a \pm b)) \leq \dim((F(a))(b)) = [(F(a))(b) : F] \leq nm\ (< \infty), \]
and hence $[F(a \pm b) : F] \leq nm$. Now, by 1.4.9, $a \pm b$ is algebraic over $F$, and hence $(a \pm b) \in A$.
Next, by 1.4.16, $a \pm b$ is algebraic of degree $[F(a \pm b) : F]$ $(\leq nm)$ over $F$, and hence $a \pm b$ is algebraic of degree $\leq nm$ over $F$.
2. Since $(F(a))(b)$ is a field containing $a, b$, $(F(a))(b)$ is a field containing $ab$. It follows, as above, that $[F(ab) : F] \leq nm$. Now by 1.4.9, $ab$ is algebraic over $F$, and hence $ab \in A$.
Next, by 1.4.16, $ab$ is algebraic of degree $[F(ab) : F]$ $(\leq nm)$ over $F$, and hence $ab$ is algebraic of degree $\leq nm$ over $F$.
3. Suppose that $a, b$ are nonzero. Since $(F(a))(b)$ is a field containing $a, b$, $(F(a))(b)$ is a field containing $ab^{-1}$. It follows, as above, that $[F(ab^{-1}) : F] \leq nm$. Now by 1.4.9, $ab^{-1}$ is algebraic over $F$, and hence $ab^{-1} \in A$.
Next, by 1.4.16, $ab^{-1}$ is algebraic of degree $[F(ab^{-1}) : F]$ $(\leq nm)$ over $F$, and hence $ab^{-1}$ is algebraic of degree $\leq nm$ over $F$.
1.4.19 Conclusion Let $F$ and $K$ be any fields such that $K$ is an extension of $F$. Let $A$ be the collection of all elements of $K$ that are algebraic over $F$. Then
1. $F \subset A \subset K$,
2. $A$ is a subfield of $K$, and $F$ is a subfield of $A$,
3. if $a$ is algebraic of degree $m$ over $F$, and $b$ is algebraic of degree $n$ over $F$, then all of $a \pm b$, $ab$, $ab^{-1}$ (provided $b$ is nonzero) are algebraic of degree $\leq mn$ over $F$.
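The degree bound in item 3 can be tested on a standard example. The sketch below assumes $F = \mathbb{Q}$, $a = \sqrt{2}$ and $b = \sqrt{3}$ (so $m = n = 2$), and writes elements of $(F(a))(b)$ as integer 4-tuples in the basis $1, \sqrt{2}, \sqrt{3}, \sqrt{6}$; the helper `mul` is a hypothetical name. It verifies that $a + b$ is a root of $x^4 - 10x^2 + 1$, a polynomial of degree $4 = mn$.

```python
def mul(u, v):
    """Multiply two elements written in the basis (1, sqrt2, sqrt3, sqrt6)."""
    a0, a1, a2, a3 = u
    b0, b1, b2, b3 = v
    return (
        a0 * b0 + 2 * a1 * b1 + 3 * a2 * b2 + 6 * a3 * b3,   # coefficient of 1
        a0 * b1 + a1 * b0 + 3 * a2 * b3 + 3 * a3 * b2,       # coefficient of sqrt(2)
        a0 * b2 + a2 * b0 + 2 * a1 * b3 + 2 * a3 * b1,       # coefficient of sqrt(3)
        a0 * b3 + a3 * b0 + a1 * b2 + a2 * b1,               # coefficient of sqrt(6)
    )


one = (1, 0, 0, 0)
c = (0, 1, 1, 0)          # c = sqrt(2) + sqrt(3)
c2 = mul(c, c)            # c^2 = 5 + 2*sqrt(6)
c4 = mul(c2, c2)          # c^4 = 49 + 20*sqrt(6)

# Evaluate c^4 - 10*c^2 + 1 coordinate by coordinate.
relation = tuple(x4 - 10 * x2 + x0 for x4, x2, x0 in zip(c4, c2, one))
```

The relation evaluates to zero in every coordinate, so the degree of $\sqrt{2} + \sqrt{3}$ over $\mathbb{Q}$ is at most 4, in agreement with the bound $mn$.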
Definition Let $F$ and $K$ be any fields such that $F \subset K$. Suppose that $K$ is an extension of $F$. Let $a, b \in K$. Since $F \cup \{a\} \subset F(a) \subset (F(a))(b)$ and $b \in (F(a))(b)$, we have $F \cup \{a, b\} \subset (F(a))(b)$. So $(F(a))(b)$ is a field containing $F \cup \{a, b\}$. The smallest field containing $F \cup \{a, b\}$ is denoted by $F(a, b)$.
We have seen that $F(a, b) \subset (F(a))(b)$. It is clear that $F(a) \subset F(a, b)$, and hence $(F(a)) \cup \{b\} \subset F(a, b)$. It follows that $(F(a))(b) \subset F(a, b)$. Thus we have shown that $(F(a))(b) = F(a, b)$. Similarly, $(F(b))(a) = F(a, b)$. Thus
\[ (F(a))(b) = (F(b))(a) = F(a, b). \]
A similar definition can be supplied for $F(a, b, c)$, etc.
Definition Let $F$ and $K$ be any fields such that $K$ is an extension of $F$. If every element of $K$ is algebraic over $F$, then we say that $K$ is an algebraic extension of $F$.
1.4.20 Problem Let $F$, $K$, and $L$ be any fields such that $F \subset K \subset L$. Suppose that $K$ is an algebraic extension of $F$, and $L$ is an algebraic extension of $K$. Then $L$ is an algebraic extension of $F$.
Proof Let us take an arbitrary $l \in L$. We have to show that $l$ is algebraic over $F$.
Since $L$ is an algebraic extension of $K$, and $l \in L$, there exists a nonzero polynomial
\[ k_0 + k_1x + k_2x^2 + \cdots + k_nx^n \]
such that each $k_i \in K$, $n$ is a positive integer, and
\[ k_0 + k_1l + k_2l^2 + \cdots + k_nl^n = 0. \]
Since $k_0 \in K$ and $K$ is an algebraic extension of $F$, $k_0$ is algebraic over $F$, and hence by 1.4.17, $F(k_0)$ $(\subset K)$ is a finite extension of $F$. Since $k_1 \in K$ and $K$ is an algebraic extension of $F$, $k_1$ is algebraic over $F$, and hence over $F(k_0)$ $(\supset F)$. Now by 1.4.17, $(F(k_0, k_1) =)\ (F(k_0))(k_1)$ $(\subset K)$ is a finite extension of $F(k_0)$, and hence, by 1.4.4, a finite extension of $F$. Since $k_2 \in K$ and $K$ is an algebraic extension of $F$, $k_2$ is algebraic over $F$, and hence over $F(k_0, k_1)$. In the same way, $(F(k_0, k_1, k_2) =)\ (F(k_0, k_1))(k_2)$ $(\subset K)$ is a finite extension of $F$, etc.
It follows that $F(k_0, k_1, \ldots, k_n)$ $(\subset K)$ is a finite extension of $F$. Since each $k_i \in F(k_0, k_1, \ldots, k_n)$, the nonzero polynomial
\[ k_0 + k_1x + k_2x^2 + \cdots + k_nx^n \]
is a member of $(F(k_0, k_1, \ldots, k_n))[x]$. Next,
\[ k_0 + k_1l + k_2l^2 + \cdots + k_nl^n = 0, \]
so $l$ is algebraic over $F(k_0, k_1, \ldots, k_n)$ $(\subset K)$. It follows, by 1.4.17, that $(F(k_0, k_1, \ldots, k_n))(l)$ is a finite extension of $F(k_0, k_1, \ldots, k_n)$. Further, $F(k_0, k_1, \ldots, k_n)$ is a finite extension of $F$, so by 1.4.4, $(F(k_0, k_1, \ldots, k_n))(l)$ is a finite extension of $F$, and hence so is its subfield $F(l)$. Now by 1.4.9, $l$ is algebraic over $F$. ■
Definition Let $a \in \mathbb{C}$. Recall that $\mathbb{Q} \subset \mathbb{C}$, and the field $\mathbb{C}$ is an extension of the field $\mathbb{Q}$. If $a$ is algebraic over $\mathbb{Q}$, then we say that $a$ is an algebraic number. A complex number that is not an algebraic number is called a transcendental number.
$(*)$ By 1.4.5, every rational number is an algebraic number. By 1.4.19, the collection of all algebraic numbers is a subfield of $\mathbb{C}$. Thus if $a$ is an algebraic number and $b$ is an algebraic number, then all of $a \pm b$, $ab$, $ab^{-1}$ (provided $b$ is nonzero) are algebraic numbers.
1.4.21 Problem Recall that $\mathbb{Q} \subset \mathbb{C}$, and the field $\mathbb{C}$ is an extension of the field $\mathbb{Q}$. Let $A$ be the collection of all algebraic numbers. We know from 1.4.19 that $\mathbb{Q} \subset A \subset \mathbb{C}$, $\mathbb{C}$ is an extension of $A$, and $A$ is an algebraic extension of $\mathbb{Q}$. Let $f(x)$ be a nonzero member of $A[x]$, and let $a$ $(\in \mathbb{C})$ be a root of the polynomial $f(x)$, in the sense that $f(a) = 0$. Then $a \in A$.
In short, the roots of a polynomial whose coefficients are algebraic numbers are themselves algebraic numbers.
Proof Suppose to the contrary that $a \notin A$. We seek a contradiction.
Since $\mathbb{C}$ is an extension of $A$, $a \in \mathbb{C}$, $f(x)$ is a nonzero member of $A[x]$, and $f(a) = 0$, $a$ is algebraic over $A$, and hence each member of $A \cup \{a\}$ is algebraic over $A$. Observe that the field $A(a)$ is an extension of $A$.
Let $B$ be the collection of all elements of $A(a)$ that are algebraic over $A$. By 1.4.19, $A \subset B \subset A(a)$, $B$ is a subfield of $A(a)$, and $A$ is a subfield of $B$. Since each member of $A \cup \{a\}$ is algebraic over $A$, we have $A \cup \{a\} \subset B$, and hence $A(a) \subset B$. Now, since $B \subset A(a)$, we have $B = A(a)$.
Thus every element of $A(a)$ is algebraic over $A$, and hence $A(a)$ is an algebraic extension of $A$. Now, since $A$ is an algebraic extension of $\mathbb{Q}$, by 1.4.20, $A(a)$ is an algebraic extension of $\mathbb{Q}$. And since $a \in A(a)$, $a$ is algebraic over $\mathbb{Q}$, and hence $a$ is an algebraic number. Thus $a \in A$. This is a contradiction. ■
1.5 Splitting Fields
1.5.1 Theorem The number
\[ e \equiv 1 + \frac{1}{1!} + \frac{1}{2!} + \frac{1}{3!} + \cdots \]
is a transcendental number.
Proof (due to Hermite) Suppose to the contrary that $e$ is not a transcendental number. We seek a contradiction.
Since $e$ is not a transcendental number, $e$ is an algebraic number, and hence there exists a nonzero polynomial
\[ c_0 + c_1 x + c_2 x^2 + \cdots + c_n x^n \]
such that
1. each $c_i$ is an integer,
2. $c_0$ is a positive integer,
3. $n$ is a positive integer,
4. $c_0 + c_1 e + c_2 e^2 + \cdots + c_n e^n = 0$,
5. $c_n$ is a nonzero integer.
Let us take a polynomial $f(x) \in \mathbb{R}[x]$, and let $\deg(f(x)) = r > 1$. It follows that the $(r+1)$th derivative $f^{(r+1)}(x)$ of $f(x)$ is the zero polynomial. Similarly, $f^{(r+2)}(x)$ is the zero polynomial, etc.
Put
\[ F(x) \equiv f(x) + f'(x) + f''(x) + \cdots + f^{(r)}(x). \]
By the mean value theorem, there exists a real number $\theta_1 \in (0, 1)$ such that
\[ e^{-1}F(1) - e^{-0}F(0) = (1 - 0)\left.\frac{d\left(e^{-x}F(x)\right)}{dx}\right|_{x=\theta_1}; \]
that is,
\[ e^{-1}F(1) - F(0) = (1 - 0)\left.\left(-e^{-x}F(x) + e^{-x}F'(x)\right)\right|_{x=\theta_1}; \]
that is,
\[ e^{-1}F(1) - F(0) = (1 - 0)e^{-\theta_1}\left(F'(\theta_1) - F(\theta_1)\right); \]
that is,
\[ e^{-1}F(1) - F(0) = (1 - 0)e^{-\theta_1}\left(\left(f'(\theta_1) + f''(\theta_1) + \cdots + f^{(r)}(\theta_1) + f^{(r+1)}(\theta_1)\right) - \left(f(\theta_1) + f'(\theta_1) + f''(\theta_1) + \cdots + f^{(r)}(\theta_1)\right)\right); \]
that is,
\[ e^{-1}F(1) - F(0) = (1 - 0)e^{-\theta_1}\left(f^{(r+1)}(\theta_1) - f(\theta_1)\right); \]
that is,
\[ e^{-1}F(1) - F(0) = (1 - 0)e^{-\theta_1}\left(0 - f(\theta_1)\right); \]
that is,
\[ e^{-1}F(1) - F(0) = -(1 - 0)e^{-1\theta_1}f(1\theta_1). \]
Similarly, there exists a real number $\theta_2 \in (0, 1)$ such that
\[ e^{-2}F(2) - F(0) = -(2 - 0)e^{-2\theta_2}f(2\theta_2). \]
Also, there exists a real number $\theta_3 \in (0, 1)$ such that
\[ e^{-3}F(3) - F(0) = -(3 - 0)e^{-3\theta_3}f(3\theta_3). \]
\[ \vdots \]
There exists a real number $\theta_n \in (0, 1)$ such that
\[ e^{-n}F(n) - F(0) = -(n - 0)e^{-n\theta_n}f(n\theta_n). \]
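Thus
\[ \begin{aligned}
F(1) - eF(0) &= -e^{1-\theta_1}f(1\theta_1),\\
F(2) - e^2F(0) &= -2e^{2-2\theta_2}f(2\theta_2),\\
F(3) - e^3F(0) &= -3e^{3-3\theta_3}f(3\theta_3),\\
&\;\;\vdots\\
F(n) - e^nF(0) &= -ne^{n-n\theta_n}f(n\theta_n).
\end{aligned} \]
It follows that
\[ \begin{aligned}
& c_0F(0) + c_1F(1) + c_2F(2) + \cdots + c_nF(n)\\
&= c_0F(0) + c_1\left(eF(0) - e^{1-\theta_1}f(1\theta_1)\right) + c_2\left(e^2F(0) - 2e^{2-2\theta_2}f(2\theta_2)\right) + \cdots + c_n\left(e^nF(0) - ne^{n-n\theta_n}f(n\theta_n)\right)\\
&= \left(c_0 + c_1e + c_2e^2 + \cdots + c_ne^n\right)F(0) - \left(c_1e^{1-\theta_1}1f(1\theta_1) + c_22e^{2-2\theta_2}f(2\theta_2) + \cdots + c_nne^{n-n\theta_n}f(n\theta_n)\right)\\
&= 0 \cdot F(0) - \left(c_11e^{1(1-\theta_1)}f(1\theta_1) + c_22e^{2(1-\theta_2)}f(2\theta_2) + \cdots + c_nne^{n(1-\theta_n)}f(n\theta_n)\right),
\end{aligned} \]
and hence
\[ c_0F(0) + c_1F(1) + c_2F(2) + \cdots + c_nF(n) = -\left(c_11e^{1(1-\theta_1)}f(1\theta_1) + c_22e^{2(1-\theta_2)}f(2\theta_2) + \cdots + c_nne^{n(1-\theta_n)}f(n\theta_n)\right). \quad (\ast) \]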
Let us take an arbitrary prime $p$ such that $1 \le n < p$ and $1 \le c_0 < p$. It follows that $p$ divides neither the integer $c_0$ nor $n!$, and hence $p$ does not divide the integer $c_0(n!)^p$.
Next let us take
\[ f(x) \equiv \frac{1}{(p-1)!}x^{p-1}(1 - x)^p(2 - x)^p \cdots (n - x)^p \quad (\in \mathbb{R}[x]). \]
Here
\[ r = \deg(f(x)) = (p - 1) + \underbrace{p + p + \cdots + p}_{n\ \text{terms}} = (n + 1)p - 1. \]
Thus $r = (n + 1)p - 1$. Observe that
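\[ \begin{aligned}
f(x) &= \frac{1}{(p-1)!}x^{p-1}(1 - x)^p(2 - x)^p \cdots (n - x)^p\\
&= \frac{1}{(p-1)!}x^{p-1}\left(1^p - \binom{p}{1}1^{p-1}x + \binom{p}{2}1^{p-2}x^2 - \cdots\right)\left(2^p - \binom{p}{1}2^{p-1}x + \binom{p}{2}2^{p-2}x^2 - \cdots\right)\\
&\qquad \cdot \left(3^p - \binom{p}{1}3^{p-1}x + \binom{p}{2}3^{p-2}x^2 - \cdots\right) \cdots \left(n^p - \binom{p}{1}n^{p-1}x + \binom{p}{2}n^{p-2}x^2 - \cdots\right)\\
&= \frac{1}{(p-1)!}x^{p-1}\left(1^p2^p \cdots n^p + (\text{integer})x + (\text{integer})x^2 + \cdots\right)\\
&= \frac{(n!)^p}{(p-1)!}x^{p-1} + \frac{\text{integer}}{(p-1)!}x^p + \frac{\text{integer}}{(p-1)!}x^{p+1} + \cdots + \frac{\text{integer}}{(p-1)!}x^{(n+1)p-1},
\end{aligned} \]
so
\[ f(x) = \frac{(n!)^p}{(p-1)!}x^{p-1} + \frac{a_0}{(p-1)!}x^p + \frac{a_1}{(p-1)!}x^{p+1} + \cdots + \frac{a_{np-1}}{(p-1)!}x^{(n+1)p-1}, \]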
where each $a_i$ is an integer. Now,
\[ \begin{aligned}
f^{(p)}(x) &= \frac{(n!)^p}{(p-1)!}D^p\left(x^{p-1}\right) + \frac{a_0}{(p-1)!}D^p\left(x^p\right) + \frac{a_1}{(p-1)!}D^p\left(x^{p+1}\right) + \cdots\\
&= \frac{(n!)^p}{(p-1)!} \cdot 0 + \frac{a_0}{(p-1)!}p! + \frac{a_1}{(p-1)!}\frac{(p+1)!}{((p+1)-p)!}x^{(p+1)-p} + \frac{a_2}{(p-1)!}\frac{(p+2)!}{((p+2)-p)!}x^{(p+2)-p} + \cdots\\
&= \frac{a_0}{(p-1)!}p! + \frac{a_1}{(p-1)!}\frac{(p+1)!}{1!}x + \frac{a_2}{(p-1)!}\frac{(p+2)!}{2!}x^2 + \cdots\\
&= a_0p + a_1(p+1)px + a_2\frac{(p+2)(p+1)p}{2!}x^2 + \cdots\\
&= a_0p + a_1\binom{p+1}{1}px + a_2\binom{p+2}{2}px^2 + \cdots,
\end{aligned} \]
so
\[ f^{(p)}(x) = a_0p + a_1\binom{p+1}{1}px + a_2\binom{p+2}{2}px^2 + \cdots. \]
Here we observe that each coefficient of $f^{(p)}(x)$ is an integer that is divisible by $p$.
Further,
\[ f^{(p+1)}(x) = a_1\binom{p+1}{1}p + a_2\binom{p+2}{2}p \cdot 2x + a_3\binom{p+3}{3}p \cdot 3x^2 + \cdots. \]
Again, we observe that each coefficient of $f^{(p+1)}(x)$ is an integer that is divisible by $p$.
Thus for every integer $j$, and for every integer $i \ge p$, $f^{(i)}(j)$ is an integer that is divisible by $p$.
[Before going ahead, let us recall the Leibniz rule of differentiation:
\[ (uv)' = u'v + uv', \]
\[ (uvw)' = u'vw + uv'w + uvw', \]
\[ (uvw)'' = \left(u''vw + u'v'w + u'vw'\right) + \left(u'v'w + uv''w + uv'w'\right) + \left(u'vw' + uv'w' + uvw''\right), \]
\[ (uvw)'' = u''vw + uv''w + uvw'' + 2uv'w' + 2u'vw' + 2u'v'w = u''vw + u'\left(2vw' + 2v'w\right) + u\left(v''w + vw'' + 2v'w'\right). \]
Similarly,
\[ \begin{aligned}
(uvw)^{(n)} &= \sum_{\substack{(i,j,k)\\ i,j,k \text{ are nonnegative integers}\\ i+j+k=n}} (\text{positive integer})\,u^{(i)}v^{(j)}w^{(k)}\\
&= u^{(n)}(\cdots) + u^{(n-1)}(\cdots) + u^{(n-2)}(\cdots) + \cdots\\
&= v^{(n)}(\cdots) + v^{(n-1)}(\cdots) + v^{(n-2)}(\cdots) + \cdots,
\end{aligned} \]
etc. Also, for every integer $i \in \{0, 1, 2, \ldots, p-2\}$, we have $\left.D^i\left(x^{p-1}\right)\right|_{x=0} = 0$. Also $D^{p-1}\left(x^{p-1}\right) = (p-1)!$, and $D^p\left(x^{p-1}\right) = 0$.
Next, for every integer $i \in \{0, 1, 2, \ldots, p-1\}$, $\left.D^i\left((1-x)^p\right)\right|_{x=1} = 0$. Also $D^p\left((1-x)^p\right) = (-1)^p(p!)$.
Similarly, for every integer $i \in \{0, 1, 2, \ldots, p-1\}$, $\left.D^i\left((2-x)^p\right)\right|_{x=2} = 0$. Also, $D^p\left((2-x)^p\right) = (-1)^p(p!)$, etc.]
Now, since
\[ f(x) = \frac{1}{(p-1)!}x^{p-1}(1 - x)^p(2 - x)^p \cdots (n - x)^p, \]
we have
\[ f^{(p-1)}(1) = \frac{1}{(p-1)!}(0 + 0 + \cdots) = 0. \]
Similarly, $f^{(p-1)}(2) = 0$, $f^{(p-1)}(3) = 0$, etc. Also, $f^{(p-2)}(1) = 0$, $f^{(p-2)}(2) = 0$, etc. In short, for every $i \in \{0, 1, 2, \ldots, p-1\}$, and for every $j \in \{1, 2, \ldots, n\}$, $f^{(i)}(j) = 0$.
Since
\[ f(x) = \frac{1}{(p-1)!}x^{p-1}(1 - x)^p(2 - x)^p \cdots (n - x)^p, \]
we have
\[ f^{(p-1)}(0) = \frac{1}{(p-1)!}\left((p-1)!(1 - 0)^p(2 - 0)^p \cdots (n - 0)^p + 0 + 0 + \cdots\right) = (n!)^p. \]
Similarly, $f^{(p-2)}(0) = 0$, $f^{(p-3)}(0) = 0$, etc. In short, for every $i \in \{0, 1, 2, \ldots, p-2\}$, we have $f^{(i)}(0) = 0$, and $f^{(p-1)}(0) = (n!)^p$.
Since
\[ F(x) = f(x) + f'(x) + f''(x) + \cdots + f^{(r)}(x), \]
we have, for every $j \in \{1, 2, \ldots, n\}$,
\[ \begin{aligned}
F(j) &= f(j) + f'(j) + f''(j) + \cdots + f^{(r)}(j)\\
&= f(j) + f'(j) + f''(j) + \cdots + f^{((n+1)p-1)}(j)\\
&= \left(f(j) + f'(j) + f''(j) + \cdots + f^{(p-1)}(j)\right) + \left(f^{(p)}(j) + \cdots + f^{((n+1)p-1)}(j)\right)\\
&= \left(f(j) + f'(j) + f''(j) + \cdots + f^{(p-1)}(j)\right) + p(\text{integer})\\
&= (0 + 0 + 0 + \cdots + 0) + p(\text{integer}) = p(\text{integer}),
\end{aligned} \]
and hence for every $j \in \{1, 2, \ldots, n\}$, $F(j)$ is an integer that is a multiple of $p$.
Since
\[ F(x) = f(x) + f'(x) + f''(x) + \cdots + f^{(r)}(x), \]
we have
\[ \begin{aligned}
F(0) &= f(0) + f'(0) + f''(0) + \cdots + f^{(r)}(0)\\
&= f(0) + f'(0) + f''(0) + \cdots + f^{((n+1)p-1)}(0)\\
&= \left(f(0) + f'(0) + f''(0) + \cdots + f^{(p-2)}(0)\right) + f^{(p-1)}(0) + \left(f^{(p)}(0) + \cdots + f^{((n+1)p-1)}(0)\right)\\
&= \left(f(0) + f'(0) + f''(0) + \cdots + f^{(p-2)}(0)\right) + f^{(p-1)}(0) + p(\text{integer})\\
&= (0 + 0 + 0 + \cdots + 0) + (n!)^p + p(\text{integer}) = (n!)^p + p(\text{integer}),
\end{aligned} \]
and hence $F(0)$ is an integer of the form $(n!)^p + p(\text{integer})$.
Since for every $j \in \{1, 2, \ldots, n\}$, $F(j)$ is a multiple of $p$, $F(0)$ is of the form $(n!)^p + p(\text{integer})$, and each $c_i$ is an integer, it follows that
\[ c_0F(0) + c_1F(1) + c_2F(2) + \cdots + c_nF(n) \left(= -\left(c_11e^{1(1-\theta_1)}f(1\theta_1) + c_22e^{2(1-\theta_2)}f(2\theta_2) + \cdots + c_nne^{n(1-\theta_n)}f(n\theta_n)\right)\right) \]
is an integer of the form $c_0(n!)^p + p(\text{integer})$. Thus
\[ -\left(c_11e^{1(1-\theta_1)}f(1\theta_1) + c_22e^{2(1-\theta_2)}f(2\theta_2) + \cdots + c_nne^{n(1-\theta_n)}f(n\theta_n)\right) \]
is an integer of the form $c_0(n!)^p + p(\text{integer})$.
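Observe that
\[ 1e^{1(1-\theta_1)}f(1\theta_1) = 1e^{1(1-\theta_1)}\frac{1}{(p-1)!}(1\theta_1)^{p-1}(1 - 1\theta_1)^p(2 - 1\theta_1)^p \cdots (n - 1\theta_1)^p, \]
so
\[ \left|1e^{1(1-\theta_1)}f(1\theta_1)\right| = 1e^{1(1-\theta_1)}\frac{1}{(p-1)!}(1\theta_1)^{p-1}\left(|1 - 1\theta_1|\,|2 - 1\theta_1| \cdots |n - 1\theta_1|\right)^p. \]
Now, since $\theta_1 \in (0, 1)$, we have
\[ |1 - 1\theta_1|\,|2 - 1\theta_1| \cdots |n - 1\theta_1| \le 1 \cdot 2 \cdots n = n!, \]
and hence
\[ \begin{aligned}
\left|1e^{1(1-\theta_1)}f(1\theta_1)\right| &\le 1e^{1(1-\theta_1)}\frac{1}{(p-1)!}(1\theta_1)^{p-1}(n!)^p \le 1e^{1(1-\theta_1)}\frac{1}{(p-1)!}n^{p-1}(n!)^p\\
&\le 1e^{1(1-\theta_1)}\frac{1}{(p-1)!}n^p(n!)^p = 1e^{1(1-\theta_1)}(n(n!))\frac{(n(n!))^{p-1}}{(p-1)!} \to 1e^{1(1-\theta_1)}(n(n!)) \cdot 0
\end{aligned} \]
as $p \to \infty$. Thus $1e^{1(1-\theta_1)}f(1\theta_1) \to 0$ as $p \to \infty$.
Since
\[ 2e^{2(1-\theta_2)}f(2\theta_2) = 2e^{2(1-\theta_2)}\frac{1}{(p-1)!}(2\theta_2)^{p-1}(1 - 2\theta_2)^p(2 - 2\theta_2)^p \cdots (n - 2\theta_2)^p, \]
we have
\[ \left|2e^{2(1-\theta_2)}f(2\theta_2)\right| = 2e^{2(1-\theta_2)}\frac{1}{(p-1)!}(2\theta_2)^{p-1}\left(|1 - 2\theta_2|\,|2 - 2\theta_2| \cdots |n - 2\theta_2|\right)^p. \]
Now, since $\theta_2 \in (0, 1)$, we have
\[ |1 - 2\theta_2|\,|2 - 2\theta_2| \cdots |n - 2\theta_2| \le 1 \cdot 2 \cdots n = n!, \]
and hence
\[ \begin{aligned}
\left|2e^{2(1-\theta_2)}f(2\theta_2)\right| &\le 2e^{2(1-\theta_2)}\frac{1}{(p-1)!}(2\theta_2)^{p-1}(n!)^p \le 2e^{2(1-\theta_2)}\frac{1}{(p-1)!}n^{p-1}(n!)^p\\
&\le 2e^{2(1-\theta_2)}\frac{1}{(p-1)!}n^p(n!)^p = 2e^{2(1-\theta_2)}(n(n!))\frac{(n(n!))^{p-1}}{(p-1)!} \to 2e^{2(1-\theta_2)}(n(n!)) \cdot 0
\end{aligned} \]
as $p \to \infty$. Thus $2e^{2(1-\theta_2)}f(2\theta_2) \to 0$ as $p \to \infty$. Similarly, $3e^{3(1-\theta_3)}f(3\theta_3) \to 0$ as $p \to \infty$, etc. It follows that
\[ -\left(c_11e^{1(1-\theta_1)}f(1\theta_1) + c_22e^{2(1-\theta_2)}f(2\theta_2) + \cdots + c_nne^{n(1-\theta_n)}f(n\theta_n)\right) \to 0 \quad \text{as } p \to \infty. \]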
Since
\[ -\left(c_11e^{1(1-\theta_1)}f(1\theta_1) + c_22e^{2(1-\theta_2)}f(2\theta_2) + \cdots + c_nne^{n(1-\theta_n)}f(n\theta_n)\right) \]
is an integer of the form $c_0(n!)^p + p(\text{integer})$ and $p$ does not divide the integer $c_0(n!)^p$, it follows that
\[ -\left(c_11e^{1(1-\theta_1)}f(1\theta_1) + c_22e^{2(1-\theta_2)}f(2\theta_2) + \cdots + c_nne^{n(1-\theta_n)}f(n\theta_n)\right) \]
is a nonzero integer, and hence
\[ -\left(c_11e^{1(1-\theta_1)}f(1\theta_1) + c_22e^{2(1-\theta_2)}f(2\theta_2) + \cdots + c_nne^{n(1-\theta_n)}f(n\theta_n)\right) \not\to 0 \quad \text{as } p \to \infty. \]
This is a contradiction. ■
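The divisibility facts used in the proof can be checked by exact arithmetic for small parameters. The following is only an illustrative sketch: the helper names (`hermite_f`, `F_at`) and the sample values $n = 2$, $p = 3$ are assumptions made here, not part of the text. It builds Hermite's polynomial $f(x)$ with rational coefficients, sums its derivatives to obtain $F$, and evaluates $F$ at $0, 1, \ldots, n$.

```python
from fractions import Fraction
from math import factorial

def poly_mul(a, b):
    # Coefficient lists run from the constant term upward.
    out = [Fraction(0)] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] += ai * bj
    return out

def derivative(a):
    # Formal derivative; the derivative of a constant is the zero polynomial.
    return [k * a[k] for k in range(1, len(a))] or [Fraction(0)]

def evaluate(a, x):
    # Horner's rule.
    result = Fraction(0)
    for coeff in reversed(a):
        result = result * x + coeff
    return result

def hermite_f(n, p):
    # f(x) = x^(p-1) (1-x)^p (2-x)^p ... (n-x)^p / (p-1)!
    f = [Fraction(0)] * (p - 1) + [Fraction(1, factorial(p - 1))]
    for k in range(1, n + 1):
        for _ in range(p):
            f = poly_mul(f, [Fraction(k), Fraction(-1)])
    return f

def F_at(n, p, j):
    # F(j) = f(j) + f'(j) + ... + f^(r)(j), where r = deg f = (n+1)p - 1.
    g = hermite_f(n, p)
    total = Fraction(0)
    for _ in range(len(g)):
        total += evaluate(g, j)
        g = derivative(g)
    return total

n, p = 2, 3  # sample values with n < p, chosen only for illustration
values = [F_at(n, p, j) for j in range(n + 1)]
```

With these sample values, `values[0]` should differ from $(n!)^p$ by a multiple of $p$, and `values[1]`, `values[2]` should be integer multiples of $p$, in agreement with the forms of $F(0)$ and $F(j)$ derived in the proof.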
Definition Let $F$ and $K$ be any fields such that $K$ is an extension of $F$. Let $f(x)$ be a nonzero member of $F[x]$ with $\deg(f(x)) \ge 1$. Let $\alpha \in K$. If $(K \ni)\ f(\alpha) = 0$, then we say that $\alpha$ is a root of $f(x)$.
1.5.2 Theorem Let $F$ and $K$ be any fields such that $K$ is an extension of $F$. Let $f(x)$ be a nonzero member of $F[x]$ with $\deg(f(x)) \ge 1$. Let $\alpha \in K$. Then there exists a nonzero $q(x) \in K[x]$ such that
1. $f(x) = (x - \alpha)q(x) + f(\alpha)$,
2. $\deg(q(x)) = \deg(f(x)) - 1$.
This theorem is known as the remainder theorem.
Proof Since $F \subset K$, we have $f(x) \in F[x] \subset K[x]$, and hence $f(x) \in K[x]$. It is given that $f(x)$ is nonzero. Since $1, -\alpha \in K$, the polynomial $x - \alpha$ is a nonzero member of $K[x]$. Now, by 1.2.14, there exist $q(x), r(x) \in K[x]$ such that
\[ f(x) = q(x)(x - \alpha) + r(x), \]
and either $r(x) = 0$ or $\deg(r(x)) < \deg(x - \alpha)\ (= 1)$. It follows that either $r(x) = 0$ or $\deg(r(x)) = 0$. Since $r(x) \in K[x]$, $r(x)$ is a member of $K$, and hence $r(x) = r(\alpha) \in K$. Since
\[ f(x) = q(x)(x - \alpha) + r(x), \]
we have
\[ f(\alpha) = q(\alpha)(\alpha - \alpha) + r(\alpha) = q(\alpha) \cdot 0 + r(\alpha) = r(\alpha) = r(x), \]
and hence $f(\alpha) = r(x)$. Thus $f(x) = q(x)(x - \alpha) + f(\alpha)$. This proves (1).
Since $f(\alpha) \in K$, we have
\[ \deg(f(x)) = \deg\left(q(x)(x - \alpha) + f(\alpha)\right) = \deg\left(q(x)(x - \alpha)\right) = \deg(q(x)) + \deg(x - \alpha) = \deg(q(x)) + 1, \]
and hence $\deg(f(x)) = \deg(q(x)) + 1$. ■
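The remainder theorem can be illustrated computationally by synthetic division over the rationals. This is a minimal sketch, assuming coefficient lists ordered from the leading coefficient down to the constant term; the function names and the sample polynomial $x^3 - 2$ with $\alpha = 3/2$ are chosen here only for illustration.

```python
from fractions import Fraction

def divide_by_linear(coeffs, a):
    """Divide f(x) by (x - a) using synthetic division.

    coeffs lists the coefficients of f from the leading one down to the
    constant term. Returns (q, r) with f(x) = (x - a) q(x) + r.
    """
    a = Fraction(a)
    carry = Fraction(0)
    partial = []
    for c in coeffs:
        carry = carry * a + Fraction(c)
        partial.append(carry)
    return partial[:-1], partial[-1]

def evaluate(coeffs, x):
    # Horner's rule, same coefficient order as above.
    result = Fraction(0)
    for c in coeffs:
        result = result * x + Fraction(c)
    return result

f = [1, 0, 0, -2]          # x^3 - 2
alpha = Fraction(3, 2)
q, r = divide_by_linear(f, alpha)
```

Here `r` coincides with `evaluate(f, alpha)`, and `q` has one coefficient fewer than `f`, matching conclusions (1) and (2) of the theorem.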
1.5.3 Theorem Let $F$ and $K$ be any fields such that $K$ is an extension of $F$. Let $f(x)$ be a nonzero member of $F[x]$ with $\deg(f(x)) \ge 1$. Let $\alpha \in K$. Let $\alpha$ be a root of $f(x)$. Then $(x - \alpha) \mid f(x)$ in $K[x]$.
Proof By 1.5.2, there exists a nonzero $q(x) \in K[x]$ such that
1. $f(x) = (x - \alpha)q(x) + f(\alpha)$,
2. $\deg(q(x)) = \deg(f(x)) - 1$.
Since $\alpha$ is a root of $f(x)$, we have
\[ f(x) - (x - \alpha)q(x) = f(\alpha) = 0, \]
and hence $f(x) - (x - \alpha)q(x) = 0$, that is, $f(x) = (x - \alpha)q(x)$. Since $F \subset K$, we have $f(x) \in F[x] \subset K[x]$, and hence $f(x) \in K[x]$. Since $\alpha \in K$, we have $(x - \alpha) \in K[x]$. Also, $q(x) \in K[x]$. Next, since $f(x) = (x - \alpha)q(x)$, it follows that $(x - \alpha) \mid f(x)$ in $K[x]$. ■
Definition Let $F$ and $K$ be any fields such that $K$ is an extension of $F$. Let $f(x)$ be a nonzero member of $F[x]$ with $\deg(f(x)) \ge 1$. Let $\alpha \in K$. Let $m$ be a positive integer.
If $(x - \alpha)^m \mid f(x)$ in $K[x]$, then clearly, $f(\alpha) = 0$, and hence $\alpha$ is a root of $f(x)$.
If $(x - \alpha)^m \mid f(x)$ in $K[x]$ and $(x - \alpha)^{m+1} \nmid f(x)$ in $K[x]$, then we say that $\alpha$ is a root of $f(x)$ of multiplicity $m$.
Caution We count $\alpha$ as $m$ roots.
1.5.4 Theorem Let $F$ and $K$ be any fields such that $F \subset K$. Suppose that $K$ is an extension of $F$. Let $f(x)$ be a nonzero member of $F[x]$ with $\deg(f(x)) \ge 1$. Suppose that $\deg(f(x)) = n$. Then the number of roots of $f(x)$ in $K$ is $\le n$.
Proof (Induction on $n$) If $f(x)$ has no root in $K$, then the number of roots of $f(x)$ in $K$ is $0$, and hence the result is trivially true. So we consider the case that there exists a root of $f(x)$ in $K$.
Suppose that $\deg(f(x)) = 1$. We can suppose that $f(x) \equiv a + bx$, where $a, b \in F$ and $b \ne 0$. Next, let $\alpha, \beta \in K$ be such that
\[ \left.\begin{aligned} a + b\alpha &= 0\\ a + b\beta &= 0 \end{aligned}\right\}. \]
We shall show that $\alpha = \beta$. Since $a + b\alpha = 0$ and $b \ne 0$, we have $\alpha = -b^{-1}a$. Similarly, $\beta = -b^{-1}a$. It follows that $\alpha = \beta$. Thus the result is true for $n = 1$.
Now let us suppose that the result is true for all positive integer values $< n$. It suffices to show that the result is true for $n$.
Let $\alpha$ be a root of $f(x)$ in $K$, and let $m\ (\ge 1)$ be its multiplicity. Hence $(x - \alpha)^m \mid f(x)$ in $K[x]$, and $(x - \alpha)^{m+1} \nmid f(x)$ in $K[x]$. It follows that there exists $g(x) \in K[x]$ such that $f(x) = (x - \alpha)^m g(x)$ and $(x - \alpha) \nmid g(x)$. Since $f(x) = (x - \alpha)^m g(x)$, we have
\[ n = \deg(f(x)) = \deg\left((x - \alpha)^m g(x)\right) = \deg\left((x - \alpha)^m\right) + \deg(g(x)) = m + \deg(g(x)) \ge 1 + \deg(g(x)) > \deg(g(x)), \]
and hence $\deg(g(x)) < n$. By the induction hypothesis, the number of roots of $g(x)$ in $K$ is $\le \deg(g(x))$. Since
\[ f(x) = (x - \alpha)^m g(x), \]
the number of roots of $f(x)$ in $K$ is equal to
\[ m + (\text{the number of roots of } g(x) \text{ in } K) \le m + \deg(g(x))\ (= n), \]
and hence the number of roots of $f(x)$ in $K$ is $\le n$. ■
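The counting convention in 1.5.4 (a root of multiplicity $m$ is counted $m$ times) can be illustrated by repeated division by $x - \alpha$. The sketch below is only an example under assumptions made here: the sample polynomial $(x-1)^2(x+2)(x^2+1) = x^5 - 2x^3 + 2x^2 - 3x + 2$ over the rationals, the candidate list, and the helper names are all illustrative.

```python
from fractions import Fraction

def divide_by_linear(coeffs, a):
    # Synthetic division of f(x) by (x - a); coefficients run from the
    # leading one down to the constant term. Returns (quotient, remainder).
    a = Fraction(a)
    carry = Fraction(0)
    partial = []
    for c in coeffs:
        carry = carry * a + Fraction(c)
        partial.append(carry)
    return partial[:-1], partial[-1]

def multiplicity(coeffs, a):
    # Largest m such that (x - a)^m divides f(x); 0 if a is not a root.
    current = [Fraction(c) for c in coeffs]
    m = 0
    while len(current) > 1:
        quotient, remainder = divide_by_linear(current, a)
        if remainder != 0:
            break
        current = quotient
        m += 1
    return m

f = [1, 0, -2, 2, -3, 2]     # x^5 - 2x^3 + 2x^2 - 3x + 2
candidates = [1, -1, 2, -2]  # divisors of the constant term (rational root test)
mults = {a: multiplicity(f, a) for a in candidates}
root_count = sum(mults.values())
```

For this sample, the rational roots are $1$ (counted twice) and $-2$ (counted once), so `root_count` is $3$, which does not exceed the degree $5$; the remaining factor $x^2 + 1$ has no root in the rationals.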
1.5.5 Note Let $F$ and $K$ be any fields such that $K$ is an extension of $F$. Let $p(x)$ be a nonzero member of $F[x]$ with $\deg(p(x)) \ge 1$. Suppose that $\deg(p(x)) = n$. Let $p(x)$ be irreducible over $F$.
By 1.2.24, the quotient ring $\frac{F[x]}{V}$ is a field, where $V$ denotes the ideal $(p(x))\ (= \{f(x)p(x) : f(x) \in F[x]\})$.
Let $\psi : a \mapsto (a + V)$ be a mapping from the field $F$ to the field $\frac{F[x]}{V}$. It is clear that $\psi$ is a ring isomorphism:
1. $\psi : F \to \frac{F[x]}{V}$ is one-to-one: To prove this, let $\psi(a) = \psi(b)$. We have to show that $a = b$.
Since $a + V = \psi(a) = \psi(b) = b + V$, we have $a + V = b + V$, and hence $(a - b) \in V$. Since $(a - b) \in V\ (= \{f(x)p(x) : f(x) \in F[x]\})$, and each nonzero element of $V$ is of degree $\ge \deg(p(x))\ (\ge 1)$, we have $a - b = 0$ or $\deg(a - b) \ge 1$. Since $(a - b) \in F$, either $a - b = 0$ or $\deg(a - b) = 0$. It follows that $a - b = 0$, that is, $a = b$.
2. It is clear that $\psi$ is a ring homomorphism.
Thus we have shown that $\psi : F \to \frac{F[x]}{V}$ is a ring isomorphism from the field $F$ to the field $\frac{F[x]}{V}$. It follows that we can identify each element $a$ of $F$ with $\psi(a)\ (= (a + V))$ of the field $\frac{F[x]}{V}$. It is in this sense that we write $F \subset \frac{F[x]}{V}$ and treat $\frac{F[x]}{V}$ as an extension of $F$.
Since $\deg(p(x)) = n$, by 1.4.10, $n$ is the dimension of the vector space $\frac{F[x]}{V}$, and hence $\left[\frac{F[x]}{V} : F\right] = n$. Also, by 1.4.10, $\{1 + V, x + V, x^2 + V, \ldots, x^{n-1} + V\}$ is a basis of $\frac{F[x]}{V}$. From the definition of addition and scalar multiplication over the quotient ring $\frac{F[x]}{V}$, it is clear that
\[ p(x + V) = p(x) + V = p(x) + (p(x)) = (p(x)) = V = 0 + V \in \frac{F[x]}{V}, \]
and hence $p(x + V) = 0 + V$. Here $p(x + V) = 0 + V$, $x + V$ is a member of $\frac{F[x]}{V}$, and $0 + V$ is the zero element of the field $\frac{F[x]}{V}$, so $x + V$ is a root of the given polynomial. Thus $p(x)$ has a root in $\frac{F[x]}{V}$.
1.5.6 Conclusion Let $F$ and $K$ be any fields such that $K$ is an extension of $F$. Let $p(x)$ be a nonzero member of $F[x]$ with $\deg(p(x)) \ge 1$. Suppose that $\deg(p(x)) = n$. Let $p(x)$ be irreducible over $F$. Then there exists a field $E$ such that
1. $E$ is an extension of $F$,
2. $[E : F] = n$,
3. $p(x)$ has a root in $E$.
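The construction behind 1.5.6 can be made concrete by computing in $\frac{F[x]}{(p(x))}$ with remainders modulo $p(x)$. The following sketch is an illustration under assumptions made here: $F$ is the field of rational numbers, $p(x) = x^3 - 2$ (irreducible over the rationals), each coset is represented by the coefficient list of its remainder (constant term first), and all function names are hypothetical.

```python
from fractions import Fraction

def reduce_mod(poly, p):
    # Remainder of poly on division by p; coefficient lists run from the
    # constant term upward, and p has a nonzero leading coefficient.
    poly = [Fraction(c) for c in poly]
    n = len(p) - 1
    lead = Fraction(p[-1])
    while len(poly) > n:
        factor = poly[-1] / lead
        shift = len(poly) - 1 - n
        for i, c in enumerate(p):
            poly[shift + i] -= factor * c
        poly.pop()  # the leading coefficient has just been cancelled
    return poly + [Fraction(0)] * (n - len(poly))

def mul_mod(a, b, p):
    # Product of two cosets, reduced to a representative of degree < deg p.
    prod = [Fraction(0)] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            prod[i + j] += ai * bj
    return reduce_mod(prod, p)

def evaluate_in_quotient(f, element, p):
    # Evaluate f at a coset by Horner's rule, working modulo p throughout.
    n = len(p) - 1
    result = [Fraction(0)] * n
    for coeff in reversed(f):
        result = mul_mod(result, element, p)
        result[0] += Fraction(coeff)
    return result

p = [-2, 0, 0, 1]            # p(x) = x^3 - 2
x_class = [Fraction(0), Fraction(1), Fraction(0)]   # the coset x + V
value = evaluate_in_quotient(p, x_class, p)          # p(x + V)
inverse_check = mul_mod(x_class, [0, 0, Fraction(1, 2)], p)  # (x + V)(x^2/2 + V)
```

Here `value` is the zero coset, so $x + V$ is a root of $p$ in the quotient, and `inverse_check` is the coset $1 + V$, which shows one nonzero element being inverted, as expected in a field. The three-entry representation reflects the basis $\{1 + V, x + V, x^2 + V\}$, that is, $[E : F] = 3$ in this example.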
1.5.7 Problem Let $F$ and $K$ be any fields such that $K$ is an extension of $F$. Let $f(x)$ be a nonzero member of $F[x]$ with $\deg(f(x)) \ge 1$. Then there exists a field $E$ such that
1. $E$ is a finite extension of $F$,
2. $[E : F] \le \deg(f(x))$,
3. $f(x)$ has a root in $E$.
Proof Since $\deg(f(x)) \ge 1$, $f(x)$ is not a unit in $F[x]$, and hence by 1.2.20, there exists an irreducible $p(x) \in F[x]$ such that $1 \le \deg(p(x)) \le \deg(f(x))$ and $p(x) \mid f(x)$. It follows, by 1.5.6, that there exists a field $E$ such that
1. $E$ is an extension of $F$,
2. $[E : F] = \deg(p(x)) \le \deg(f(x))\ (< \infty)$,
3. $p(x)$ has a root in $E$.
Since $[E : F] < \infty$, $E$ is a finite extension of $F$. Also $[E : F] \le \deg(f(x))$. Since $p(x)$ has a root, say $\alpha$, in $E$, and $p(x) \mid f(x)$, $\alpha$ is also a root of $f(x)$. ■
1.5.8 Note Let $F$ and $K$ be any fields such that $K$ is an extension of $F$. Let $f(x)$ be a nonzero member of $F[x]$ with $\deg(f(x)) \ge 1$. Let $\deg(f(x)) = n$.
By 1.5.7, there exists a field $E_1$ such that
1. $E_1$ is a finite extension of $F$,
2. $[E_1 : F] \le n$,
3. $f(x)$ has a root, say $\alpha_1$, in $E_1$.
It follows, by 1.5.3, that $(x - \alpha_1) \mid f(x)$ in $E_1[x]$, and hence there exists $f_1(x) \in E_1[x]$ such that $f(x) = (x - \alpha_1)f_1(x)$ and $\deg(f_1(x)) = n - 1$.
By 1.5.7, applied to $f_1(x)$ over $E_1$, there exists a field $E_2$ such that
1. $E_2$ is a finite extension of $E_1$,
2. $[E_2 : E_1] \le n - 1$,
3. $f_1(x)$ has a root, say $\alpha_2$, in $E_2$.
It follows, by 1.5.3, that $(x - \alpha_2) \mid f_1(x)$ in $E_2[x]$, and hence there exists $f_2(x) \in E_2[x]$ such that $f_1(x) = (x - \alpha_2)f_2(x)$ and $\deg(f_2(x)) = (n - 1) - 1\ (= n - 2)$. It follows, by 1.4.3, that $E_2$ is a finite extension of $F$, and
\[ [E_2 : F] = [E_2 : E_1][E_1 : F] \le [E_2 : E_1]\,n \le (n - 1)n = n(n - 1). \]
Thus $[E_2 : F] \le n(n - 1)$. Also,
\[ f(x) = (x - \alpha_1)f_1(x) = (x - \alpha_1)(x - \alpha_2)f_2(x), \]
so the field $E_2$ contains two roots, $\alpha_1, \alpha_2$, of $f(x)$.
Similarly, the field $E_3$ contains three roots of $f(x)$, $[E_3 : F] \le n(n - 1)(n - 2)$, and $E_3$ is a finite extension of $F$.
Finally, there exists a field $E$ such that
1. $E$ is a finite extension of $F$,
2. $E$ contains all the roots of $f(x)$ in $K$,
3. $[E : F] \le n(n - 1)(n - 2) \cdots 2 \cdot 1\ (= n!)$.
1.5.9 Conclusion Let $F$ and $K$ be any fields such that $K$ is an extension of $F$. Let $f(x)$ be a nonzero member of $F[x]$ with $\deg(f(x)) \ge 1$. Let $\deg(f(x)) = n$. Suppose that $K$ contains $n$ roots of $f(x)$. Then there exists a field $E$ such that
1. $E$ is a finite extension of $F$,
2. $E$ contains all the roots of $f(x)$ in $K$,
3. if $G$ is a proper subfield of $E$, then $G$ does not contain all the roots of $f(x)$ in $K$,
4. $[E : F] \le n!$.
Definition Let $F$ and $K$ be any fields such that $F \subset K$. Suppose that $K$ is an extension of $F$. Let $f(x)$ be a nonzero member of $F[x]$ with $\deg(f(x)) \ge 1$. Let $\deg(f(x)) = n$. Suppose that $K$ contains $n$ roots of $f(x)$. Let $E$ be a field such that
1. $E$ is a finite extension of $F$,
2. $E$ contains all the roots of $f(x)$ in $K$,
3. if $G$ is a proper subfield of $E$ that contains $F$, then $G$ does not contain all the roots of $f(x)$ in $K$.
Then we say that $E$ is a splitting field over $F$ for $f(x)$.
Thus a field $E$ is a splitting field over $F$ for $f(x)$ if and only if $E$ is a minimal finite extension of $F$ in which $f(x)$ can be factored as a product of linear factors in $E[x]$. From 1.5.9,
\[ [(\text{splitting field over } F \text{ for } f(x)) : F] \le (\deg(f(x)))!. \]
1.5.10 Problem Let $F, F'$ be any fields. Let $\tau : a \mapsto a'$ be a ring isomorphism from $F$ onto $F'$. Then the map
\[ \tau^* : \left(a_0 + a_1x + a_2x^2 + \cdots + a_nx^n\right) \mapsto \left(a_0' + a_1't + a_2't^2 + \cdots + a_n't^n\right) \]
from the polynomial ring $F[x]$ to the polynomial ring $F'[t]$ is a ring isomorphism from $F[x]$ onto $F'[t]$ such that for every $a \in F$, we have $\tau^*(a) = a'$.
Proof $\tau^* : F[x] \to F'[t]$ is one-to-one: To show this, suppose that
\[ a_0' + a_1't + a_2't^2 + \cdots + a_n't^n = b_0' + b_1't + b_2't^2 + \cdots + b_n't^n. \]
We have to show that $a_i = b_i\ (i = 0, 1, \ldots, n)$. Since
\[ a_0' + a_1't + a_2't^2 + \cdots + a_n't^n = b_0' + b_1't + b_2't^2 + \cdots + b_n't^n, \]
we have $a_i' = b_i'\ (i = 0, 1, \ldots, n)$, and hence $\tau(a_i) = \tau(b_i)\ (i = 0, 1, \ldots, n)$. Since $\tau : F \to F'$ is a ring isomorphism, $\tau : F \to F'$ is one-to-one, and since $\tau(a_i) = \tau(b_i)\ (i = 0, 1, \ldots, n)$, we have $a_i = b_i\ (i = 0, 1, \ldots, n)$.
$\tau^* : F[x] \to F'[t]$ is onto: This is clear.
$\tau^* : F[x] \to F'[t]$ is a ring homomorphism: This is clear.
Thus, $\tau^*$ is a ring isomorphism from $F[x]$ onto $F'[t]$. Also, it is clear that for every $a \in F$, $\tau^*(a) = a'$. ■
1.5.11 Problem Let $F, F'$ be any fields. Let $\tau : a \mapsto a'$ be a ring isomorphism from $F$ onto $F'$. For every $f(x) \in F[x]$, we shall denote $\tau^*(f(x))$ by $f'(t)$, where $\tau^*$ is the same as discussed in 1.5.10. Thus $\tau^* : f(x) \mapsto f'(t)$ from the ring $F[x]$ onto the ring $F'[t]$ is an isomorphism. Let $p(x) \in F[x]$. It follows that $p'(t) \in F'[t]$. Put $V \equiv (p(x))$, where $(p(x))$ denotes the ideal generated by $p(x)$ in $F[x]$. Put $V' \equiv (p'(t))$, where $(p'(t))$ denotes the ideal generated by $p'(t)$ in $F'[t]$. Let
\[ \tau^{**} : f(x) + V \mapsto f'(t) + V' \]
be the mapping from the quotient ring $\frac{F[x]}{V}$ to the quotient ring $\frac{F'[t]}{V'}$. Then $\tau^{**}$ is an isomorphism from $\frac{F[x]}{V}$ onto $\frac{F'[t]}{V'}$. Also, for every $a \in F$, we have $\tau^{**}(a + V) = a'$ and
\[ \tau^{**}(x + V) = t + V'. \]
Proof $\tau^{**} : \frac{F[x]}{V} \to \frac{F'[t]}{V'}$ is well defined. To show this, let $f(x), g(x) \in F[x]$ be such that $f(x) - g(x) \in V\ (= (p(x)))$. We have to show that $f'(t) - g'(t) \in V'$. Since $f(x) - g(x) \in (p(x))$, there exists $h(x) \in F[x]$ such that $f(x) - g(x) = p(x)h(x)$, and hence
\[ \begin{aligned}
f'(t) - g'(t) &= \tau^*(f(x)) - \tau^*(g(x)) = \tau^*(f(x) - g(x)) = \tau^*(p(x)h(x)) = \tau^*(p(x))\tau^*(h(x))\\
&= p'(t)\tau^*(h(x)) = p'(t)h'(t) \in (p'(t)) = V'.
\end{aligned} \]
Thus $f'(t) - g'(t) \in V'$.
$\tau^{**} : \frac{F[x]}{V} \to \frac{F'[t]}{V'}$ is one-to-one. To show this, let $f(x), g(x) \in F[x]$ be such that $f'(t) - g'(t) \in V'\ (= (p'(t)))$. We have to show that $f(x) - g(x) \in V$. Since $f'(t) - g'(t) \in (p'(t))$, there exists $h(x) \in F[x]$ such that
\[ \tau^*(f(x) - g(x)) = \tau^*(f(x)) - \tau^*(g(x)) = f'(t) - g'(t) = p'(t)h'(t) = \tau^*(p(x))\tau^*(h(x)) = \tau^*(p(x)h(x)), \]
and hence
\[ \tau^*(f(x) - g(x)) = \tau^*(p(x)h(x)). \]
Since $\tau^*$ is one-to-one, we have $f(x) - g(x) = p(x)h(x) \in (p(x)) = V$, and hence $f(x) - g(x) \in V$.
$\tau^{**} : \frac{F[x]}{V} \to \frac{F'[t]}{V'}$ is onto. This is clear.
$\tau^{**} : \frac{F[x]}{V} \to \frac{F'[t]}{V'}$ is a ring homomorphism. This is clear.
Thus $\tau^{**}$ is an isomorphism from $\frac{F[x]}{V}$ onto $\frac{F'[t]}{V'}$.
By the definition of $\tau^{**}$, for every $a \in F$, $\tau^{**}(a + V) = \tau^*(a) + V' = a' + V'$, so for every $a \in F$, $\tau^{**}(a + V) = a' + V'$. Since for every $a \in F$, we have $a' \in F'$, and we identify $(\tau^{**}(a + V) =)\ a' + V'$ with $a'$, we can write $\tau^{**}(a + V) = a'$.
Since the polynomial $x$ is a member of $F[x]$, we have
\[ \tau^{**}(x + V) = t + V'. \quad ■ \]
1.5.12 Problem Let $F$ and $K$ be any fields such that $F \subset K$. Suppose that $K$ is an extension of $F$. Let $\alpha$ be a member of $K$. Let $p(x) \in (F[x] - F)$. Let $p(x)$ be irreducible over $F$. Let $n$ be a positive integer. Let $n$ be the degree of $p(x)$. Let $\alpha$ be a root of $p(x)$ in $K$. Then
1. $\alpha$ is algebraic of degree $n$ over $F$.
2. $\frac{1}{\text{leading coefficient of } p(x)}\,p(x)$ is the minimal polynomial of $\alpha$ over $F$.
Proof Put
\[ p_1(x) \equiv \frac{1}{\text{leading coefficient of } p(x)}\,p(x). \]
Clearly, $(K \ni)\ p_1(\alpha) = 0$, $1 \le n = \deg(p_1(x))$, and the leading coefficient of $p_1(x)$ is $1$.
Let $\alpha$ be algebraic of degree $m$ over $F$. It follows that $m \le n$. We have to show that $m = n$. Suppose to the contrary that $m < n$. We seek a contradiction.
Since $\alpha$ is algebraic of degree $m$ over $F$, there exists $f(x) \in F[x]$ such that $(K \ni)\ f(\alpha) = 0$,
\[ 1 \le \deg(f(x)) = m < n = \deg(p_1(x)), \]
and the leading coefficient of $f(x)$ is $1$. It follows that there exist $q(x), r(x) \in F[x]$ such that
\[ p_1(x) = f(x)q(x) + r(x) \]
and
\[ \left(r(x) = 0 \text{ or } \deg(r(x)) < \deg(f(x))\right). \]
Since $\deg(f(x)) < \deg(p_1(x))$, $p_1(x) = f(x)q(x) + r(x)$, and $p_1(x)$ is irreducible over $F$, we have $r(x) \ne 0$. Also
\[ 0 = p_1(\alpha) = f(\alpha)q(\alpha) + r(\alpha) = 0 \cdot q(\alpha) + r(\alpha) = r(\alpha), \]
so $r(\alpha) = 0$. Since $\alpha$ is algebraic of degree $m$ over $F$, $r(x) \ne 0$, and $r(\alpha) = 0$, we have $(\deg(f(x)) =)\ m \le \deg(r(x))$. Since $\left(r(x) = 0 \text{ or } \deg(r(x)) < \deg(f(x))\right)$, we have $r(x) = 0$. This is a contradiction.
Thus $\alpha$ is algebraic of degree $n$ over $F$. Also
\[ \frac{1}{\text{leading coefficient of } p(x)}\,p(x) \]
is the minimal polynomial of $\alpha$ over $F$. ■
1.5.13 Note Let $F$ and $K$ be any fields such that $K$ is an extension of $F$. Let $\alpha$ be a member of $K$. Let $p(x) \in (F[x] - F)$. Let $p(x)$ be irreducible over $F$. Let $n$ be a positive integer. Let $n$ be the degree of $p(x)$. Let $\alpha$ be a root of $p(x)$ in $K$. By 1.5.12,
1. $\alpha$ is algebraic of degree $n$ over $F$.
2. $\frac{1}{\text{leading coefficient of } p(x)}\,p(x)$ is the minimal polynomial of $\alpha$ over $F$.
Let $\psi : f(x) \mapsto f(\alpha)$ be the mapping from the ring $F[x]$ to the field $F(\alpha)$. By 1.4.13, $\psi : F[x] \to F(\alpha)$ is a ring homomorphism, and hence by the fundamental theorem of ring homomorphism, the mapping $\psi^* : (f(x) + \ker(\psi)) \mapsto \psi(f(x))\ (= f(\alpha))$ is a ring isomorphism from the quotient ring $\frac{F[x]}{\ker(\psi)}$ to $F(\alpha)$. Also, $\psi^*$ maps $\frac{F[x]}{\ker(\psi)}$ onto $F(\alpha)$. Put
\[ p_1(x) \equiv \frac{1}{\text{leading coefficient of } p(x)}\,p(x). \]
By 1.4.12, $p_1(x)$ is irreducible over $F$, and hence by 1.2.22, the ideal
\[ \begin{aligned}
(p_1(x)) \Big(&\equiv \{p_1(x)f(x) : f(x) \in F[x]\} = \left\{\tfrac{1}{\text{leading coefficient of } p(x)}\,p(x)f(x) : f(x) \in F[x]\right\}\\
&= \left\{p(x)\left(\tfrac{1}{\text{leading coefficient of } p(x)}\,f(x)\right) : f(x) \in F[x]\right\} = \{p(x)f(x) : f(x) \in F[x]\} = (p(x))\Big)
\end{aligned} \]
is a maximal ideal of the ring $F[x]$, and hence $(p(x))$ is a maximal ideal of the ring $F[x]$.
Further, by Note 1.2.24, the quotient ring $\frac{F[x]}{(p_1(x))}\left(= \frac{F[x]}{(p(x))}\right)$ is a field. Also, $\{f(\alpha) : f(x) \in F[x]\} \subset F(\alpha)$. By 1.4.13,
\[ \ker(\psi) = (p_1(x)) = (p(x)). \]
Now, since $\ker(\psi) = (p(x))$, it follows that $\psi^* : (f(x) + \ker(\psi)) \mapsto f(\alpha)$ is a ring isomorphism from the field $\frac{F[x]}{(p(x))}$ onto the field $F(\alpha)$.
Further, for every $b \in F$,
\[ \psi^*(b + (p(x))) = \psi^*(b + \ker(\psi)) = \psi(b) = b \]
and
\[ \psi^*(x + (p(x))) = \psi(x) = \alpha. \]
1.5.14 Conclusion Let $F$ and $K$ be any fields such that $K$ is an extension of $F$. Let $\alpha$ be a member of $K$. Let $p(x) \in (F[x] - F)$. Let $p(x)$ be irreducible over $F$. Let $\alpha$ be a root of $p(x)$ in $K$. Let $\psi : f(x) \mapsto f(\alpha)$ be the mapping from the ring $F[x]$ to the field $F(\alpha)$. Then
1. $\psi^* : (f(x) + \ker(\psi)) \mapsto f(\alpha)$ is a ring isomorphism from the field $\frac{F[x]}{(p(x))}$ onto the field $F(\alpha)$,
2. for every $b \in F$, $\psi^*(b + (p(x))) = b$,
3. $\psi^*(x + (p(x))) = \alpha$.
In short, $\psi^*$ is an isomorphism from $\frac{F[x]}{(p(x))}$ onto $F(\alpha)$ such that every element of $F$ is fixed, and the $\psi^*$-image of $x + (p(x))$ is $\alpha$.
1.5.15 Note Let $F$ and $K$ be any fields such that $K$ is an extension of $F$. Let $\alpha$ be a member of $K$. Let $p(x) \in (F[x] - F)$. Let $p(x)$ be irreducible over $F$. Let $\alpha$ be a root of $p(x)$ in $K$. Let $F'$ be any field. Let $\tau : a \mapsto a'$ be a ring isomorphism from $F$ onto $F'$. For every $f(x) \in F[x]$, we shall denote $\tau^*(f(x))$ by $f'(t)$, where $\tau^*$ is the same as discussed in 1.5.10. We know that $\tau^* : f(x) \mapsto f'(t)$ is an isomorphism from the ring $F[x]$ onto the ring $F'[t]$.
Since $p(x) \in F[x]$, we have $p'(t) \in F'[t]$. Let $\beta$ be a root of $p'(t)$ in some extension $K'$ of $F'$.
Suppose that $(p(x))$ denotes the ideal generated by $p(x)$ in $F[x]$. Suppose that $(p'(t))$ denotes the ideal generated by $p'(t)$ in $F'[t]$. Let
\[ \tau^{**} : f(x) + (p(x)) \mapsto f'(t) + (p'(t)) \]
be a mapping from the quotient ring $\frac{F[x]}{(p(x))}$ to the quotient ring $\frac{F'[t]}{(p'(t))}$. By 1.5.11, $\tau^{**}$ is an isomorphism from $\frac{F[x]}{(p(x))}$ onto $\frac{F'[t]}{(p'(t))}$. Also, for every $a \in F$, we have $\tau^{**}(a + (p(x))) = a' + (p'(t))$ and
\[ \tau^{**}(x + (p(x))) = t + (p'(t)). \]
Let $\psi : f(x) \mapsto f(\alpha)$ be a mapping from the ring $F[x]$ to the field $F(\alpha)$. Then by 1.5.14,
1. $\psi^* : (f(x) + (p(x))) \mapsto f(\alpha)$ is a ring isomorphism from the field $\frac{F[x]}{(p(x))}$ onto the field $F(\alpha)$,
2. for every $a \in F$, $\psi^*(a + (p(x))) = a$,
3. $\psi^*(x + (p(x))) = \alpha$.
Let $\theta : f'(t) \mapsto f'(\beta)$ be a mapping from the ring $F'[t]$ to the field $F'(\beta)$. Then by 1.5.14,
1. $\theta^* : (f'(t) + (p'(t))) \mapsto f'(\beta)$ is a ring isomorphism from the field $\frac{F'[t]}{(p'(t))}$ onto the field $F'(\beta)$,
2. for every $b \in F'$, $\theta^*(b + (p'(t))) = b$,
3. $\theta^*(t + (p'(t))) = \beta$.
Since $\psi^*$ is a ring isomorphism from $\frac{F[x]}{(p(x))}$ onto $F(\alpha)$, $(\psi^*)^{-1}$ is a ring isomorphism from $F(\alpha)$ onto $\frac{F[x]}{(p(x))}$. Now, since $\tau^{**}$ is an isomorphism from $\frac{F[x]}{(p(x))}$ onto $\frac{F'[t]}{(p'(t))}$, and $\theta^*$ is a ring isomorphism from $\frac{F'[t]}{(p'(t))}$ onto $F'(\beta)$, the composite $\left(\theta^* \circ \tau^{**} \circ (\psi^*)^{-1}\right)$ is an isomorphism from $F(\alpha)$ onto $F'(\beta)$.
For every $a \in F$,
\[ \left(\theta^* \circ \tau^{**} \circ (\psi^*)^{-1}\right)(a) = \theta^*\left(\tau^{**}\left((\psi^*)^{-1}(a)\right)\right) = \theta^*\left(\tau^{**}(a + (p(x)))\right) = \theta^*(a' + (p'(t))) = a', \]
so for every $a \in F$, we have $\sigma(a) = a'$, where $\sigma \equiv \left(\theta^* \circ \tau^{**} \circ (\psi^*)^{-1}\right)$.
Next,
\[ \sigma(\alpha) = \left(\theta^* \circ \tau^{**} \circ (\psi^*)^{-1}\right)(\alpha) = \theta^*\left(\tau^{**}\left((\psi^*)^{-1}(\alpha)\right)\right) = \theta^*\left(\tau^{**}(x + (p(x)))\right) = \theta^*(t + (p'(t))) = \beta, \]
so $\sigma(\alpha) = \beta$.
1.5.16 Conclusion Let $F$ and $K$ be any fields such that $K$ is an extension of $F$. Let $\alpha$ be a member of $K$. Let $p(x) \in (F[x] - F)$. Let $p(x)$ be irreducible over $F$. Let $\alpha$ be a root of $p(x)$ in $K$. Let $F'$ be any field. Let $\tau : a \mapsto a'$ be a ring isomorphism from $F$ onto $F'$. For every $f(x) \in F[x]$, we shall denote $\tau^*(f(x))$ by $f'(t)$, where $\tau^*$ is the same as discussed in 1.5.10. Let $\beta$ be a root of $p'(t)$ in some extension $K'$ of $F'$. Then there exists an isomorphism $\sigma$ from the field $F(\alpha)$ onto the field $F'(\beta)$ such that
1. $\sigma(\alpha) = \beta$,
2. for every $a \in F$, $\sigma(a) = a'$.
1.5.17 Note In 1.5.16, let us take $F$ for $F'$ and the identity map $i : F \to F$ for $\tau$. Thus for every $a \in F$, $a'$ means $a$, and for every $f(x) \in F[x]$, $f'(t)$ means $f(t)$. Also, $p'(t)$ means $p(t)$. Thus $\alpha, \beta$ are the roots of the same polynomial $p(x)$. By 1.5.16, there exists an isomorphism $\sigma$ from the field $F(\alpha)$ onto the field $F(\beta)$ such that
1. $\sigma(\alpha) = \beta$,
2. for every $a \in F$, $\sigma(a) = a$.
1.5.18 Conclusion Let $F$ and $K$ be any fields such that $K$ is an extension of $F$. Let $\alpha, \beta$ be members of $K$. Let $p(x) \in (F[x] - F)$. Let $p(x)$ be irreducible over $F$. Let $\alpha, \beta$ be any roots of $p(x)$ in $K$. Then there exists an isomorphism $\sigma$ from the field $F(\alpha)$ onto the field $F(\beta)$ such that
1. $\sigma(\alpha) = \beta$,
2. for every $a \in F$, $\sigma(a) = a$.
1.5.19 Example Let $F$ be the field of all rational numbers, and let $K$ be the field of all complex numbers. Let us take the polynomial $x^4 + x^2 + 1$ for $f(x)$ in $F[x]$. According to 1.5.9,
$$[\text{splitting field over } F \text{ for } f(x) : F] \leq (\deg(f(x)))!,$$
so
$$[\text{splitting field over } F \text{ for } x^4 + x^2 + 1 : F] \leq (\deg(x^4 + x^2 + 1))! = 4! = 24,$$
and hence
$$1 \leq [\text{splitting field over } F \text{ for } x^4 + x^2 + 1 : F] \leq 24.$$
Since
$$x^4 + x^2 + 1 = (x^2 + 1)^2 - x^2 = (x^2 + 1 + x)(x^2 + 1 - x) = (x - \omega)(x - \omega^2)(x + \omega)(x + \omega^2),$$
where $\omega \equiv -\frac{1}{2} + i\frac{\sqrt{3}}{2}$, $F(\omega)$ is a splitting field over $F$ for $x^4 + x^2 + 1$. Since the polynomial $1 + x + x^2$ is a member of $F[x]$, $1 + x + x^2$ is irreducible over $F$, $\deg(1 + x + x^2) = 2$, and $\omega$ is a root of $1 + x + x^2$ in $K$, by 1.5.12, $\omega$ is algebraic of degree 2 over $F$, and hence by 1.4.16, $[F(\omega) : F] = 2$. Thus
$$[\text{splitting field over } F \text{ for } x^4 + x^2 + 1 : F] = 2.$$
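The factorization used in this example can be sanity-checked numerically. The following Python sketch is only a floating-point check, not a proof; the helper name `f` and the variable names are illustrative choices that do not appear in the text.

```python
# omega = -1/2 + i*sqrt(3)/2, as defined in Example 1.5.19
omega = complex(-0.5, 3 ** 0.5 / 2)

def f(z):
    """Evaluate x^4 + x^2 + 1 at the complex number z."""
    return z ** 4 + z ** 2 + 1

# The four claimed roots: omega, omega^2, -omega, -omega^2
roots = [omega, omega ** 2, -omega, -omega ** 2]

# Each residual |f(root)| should vanish up to rounding error
residuals = [abs(f(r)) for r in roots]

# omega itself satisfies the degree-2 polynomial 1 + x + x^2
quadratic_residual = abs(1 + omega + omega ** 2)
```

All four residuals come out at the level of rounding error, which is consistent with every root lying in $F(\omega)$.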
1.5.20 Example Let $F$ be the field of all rational numbers, and let $K$ be the field of all complex numbers. Let us take the polynomial $x^3 - 2$ for $f(x)$ in $F[x]$. According to 1.5.9,
$$[\text{splitting field over } F \text{ for } f(x) : F] \leq (\deg(f(x)))!,$$
so
$$[\text{splitting field over } F \text{ for } x^3 - 2 : F] \leq (\deg(x^3 - 2))! = 3! = 6,$$
and hence
$$1 \leq [\text{splitting field over } F \text{ for } x^3 - 2 : F] \leq 6.$$
Observe that
$$x^3 - 2 = (x - \sqrt[3]{2})(x - \sqrt[3]{2}\,\omega)(x - \sqrt[3]{2}\,\omega^2),$$
where $\omega \equiv -\frac{1}{2} + i\frac{\sqrt{3}}{2}$. Since the polynomial $x^3 - 2$ is a member of $F[x]$, $x^3 - 2$ is irreducible over $F$, $\deg(x^3 - 2) = 3$, and $\sqrt[3]{2}$ is a root of $x^3 - 2$ in $K$, by 1.5.12, $\sqrt[3]{2}$ is algebraic of degree 3 over $F$, and hence by 1.4.16, $[F(\sqrt[3]{2}) : F] = 3$. By 1.4.4,
$$[F(\sqrt[3]{2}) : F] \;\big|\; [\text{splitting field over } F \text{ for } x^3 - 2 : F].$$
Now, since $[F(\sqrt[3]{2}) : F] = 3$, and $1 \leq [\text{splitting field over } F \text{ for } x^3 - 2 : F] \leq 6$, we have
$$[\text{splitting field over } F \text{ for } x^3 - 2 : F] = 3 \text{ or } 6. \qquad (*)$$
Since members of $F(\sqrt[3]{2})$ are real numbers,
$$x^3 - 2 = (x - \sqrt[3]{2})(x - \sqrt[3]{2}\,\omega)(x - \sqrt[3]{2}\,\omega^2),$$
and $\sqrt[3]{2}\,\omega$, $\sqrt[3]{2}\,\omega^2$ are not real numbers, $F(\sqrt[3]{2})$ is not a splitting field over $F$ for $x^3 - 2$, and hence $3 < [\text{splitting field over } F \text{ for } x^3 - 2 : F]$. It follows from $(*)$ that
$$[\text{splitting field over } F \text{ for } x^3 - 2 : F] = 6.$$
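The step that rules out degree 3 rests on two of the three roots being non-real. A minimal floating-point sketch of that observation follows; the variable names and the tolerance `1e-9` are assumptions made for illustration, and the check is numerical rather than a proof.

```python
# Real cube root of 2 and omega = -1/2 + i*sqrt(3)/2
cbrt2 = 2 ** (1 / 3)
omega = complex(-0.5, 3 ** 0.5 / 2)

# The three roots of x^3 - 2 named in Example 1.5.20
roots = [complex(cbrt2), cbrt2 * omega, cbrt2 * omega ** 2]

# Each claimed root should satisfy x^3 - 2 = 0 up to rounding error
residuals = [abs(r ** 3 - 2) for r in roots]

# Roots with a non-negligible imaginary part cannot lie in the real field F(cbrt2)
non_real = [r for r in roots if abs(r.imag) > 1e-9]
```

Two of the three roots land in `non_real`, matching the argument that $F(\sqrt[3]{2})$ cannot be the splitting field.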
1.5.21 Example Let $F$ be the field of all rational numbers, and let $K$ be the field of all complex numbers. Let $a, b \in F$. Let us take the polynomial $x^2 + ax + b$ for $f(x)$ in $F[x]$. Suppose that $\alpha \in (K - F)$ is a root of $x^2 + ax + b$, that is, $\alpha^2 + a\alpha + b = 0$.
According to 1.5.9,
$$[\text{splitting field over } F \text{ for } f(x) : F] \leq (\deg(f(x)))!,$$
so
$$[\text{splitting field over } F \text{ for } x^2 + ax + b : F] \leq (\deg(x^2 + ax + b))! = 2! = 2,$$
and hence
$$[\text{splitting field over } F \text{ for } x^2 + ax + b : F] = 1 \text{ or } 2.$$
Since $\alpha \in (K - F)$ is a root of $x^2 + ax + b$, we have $[\text{splitting field over } F \text{ for } x^2 + ax + b : F] > 1$, and hence
$$[\text{splitting field over } F \text{ for } x^2 + ax + b : F] = 2.$$
1.5.22 Theorem Let $F$ and $K$ be any fields such that $K$ is an extension of $F$. Let $F'$ and $K'$ be any fields such that $K'$ is an extension of $F'$. Let $\tau : a \mapsto a'$ be a ring isomorphism from $F$ onto $F'$. For every $f(x) \in F[x]$, we shall denote $\tau^*(f(x))$ by $f'(t)$, where $\tau^*$ is the same as discussed in 1.5.10. Let $g(x) \in F[x]$. It follows that $g'(t) \in F'[t]$. Let $E$ be a splitting field over $F$ for $g(x)$. Let $E'$ be a splitting field over $F'$ for $g'(t)$. Suppose that $[E : F] = 1$. Then $E = F$.
Proof Suppose to the contrary that $E \neq F$. We seek a contradiction.
Since $E$ is a splitting field over $F$ for $g(x)$, $E$ is a finite extension of $F$, and hence $F \subset E$. Now, since $E \neq F$, there exists a nonzero $\alpha$ in $E$ such that $\alpha \notin F$. Since $1 \in F$, we have $\alpha \neq 1$, and hence $\{1, \alpha\}\ (\subset E)$ is a set of two elements.
Clearly, $\{1, \alpha\}$ is a linearly independent subset of $E$.
Proof Suppose that $\lambda 1 + \mu\alpha = 0$, where $\lambda, \mu \in F$. We have to show that $\lambda = 0$ and $\mu = 0$.
If $\mu \neq 0$, then $\alpha = -\mu^{-1}\lambda \in F$, and hence $\alpha \in F$. This is a contradiction. Hence $\mu = 0$. Since $\lambda 1 + \mu\alpha = 0$, we have $\lambda = 0$. ■
Thus we have shown that $\{1, \alpha\}$ is a linearly independent subset of $E$, and the number of elements in $\{1, \alpha\}$ is 2. It follows that $1 = [E : F] = \dim(E) \geq 2$. Thus we get a contradiction. ■
1.5.23 Note Let $F$ and $K$ be any fields such that $K$ is an extension of $F$. Let $F'$ and $K'$ be any fields such that $K'$ is an extension of $F'$. Let $\tau : a \mapsto a'$ be a ring isomorphism from $F$ onto $F'$. For every $f(x) \in F[x]$, we shall denote $\tau^*(f(x))$ by $f'(t)$, where $\tau^*$ is the same as discussed in 1.5.10. Let $g(x) \in F[x]$. It follows that $g'(t) \in F'[t]$. Let $E$ be a splitting field over $F$ for $g(x)$. Let $E'$ be a splitting field over $F'$ for $g'(t)$. Suppose that $[E : F] = 1$.
By 1.5.22, we have $E = F$. Since $E$ is a splitting field over $F$ for $g(x)$, $F$ is a splitting field over $F$ for $g(x)$, and hence $g(x)$ splits into a product of linear factors over $F$. By 1.5.10, $g'(t)$ splits into a product of linear factors over $F'$. Next, since $E'$ is a splitting field over $F'$ for $g'(t)$, we have $E' = F'$. Since $\tau$ is a ring isomorphism from $F$ onto $F'$, $E = F$, and $E' = F'$, $\tau$ is a ring isomorphism from $E$ onto $E'$.
Let us take an arbitrary $a \in F$. By the definition of $\tau$, we have $\tau(a) = a'$.
1.5.24 Conclusion Let $F$ and $K$ be any fields such that $K$ is an extension of $F$. Let $F'$ and $K'$ be any fields such that $K'$ is an extension of $F'$. Let $\tau : a \mapsto a'$ be a ring isomorphism from $F$ onto $F'$. For every $f(x) \in F[x]$, we shall denote $\tau^*(f(x))$ by $f'(t)$, where $\tau^*$ is the same as discussed in 1.5.10. Let $g(x) \in F[x]$. It follows that $g'(t) \in F'[t]$. Let $E$ be a splitting field over $F$ for $g(x)$. Let $E'$ be a splitting field over $F'$ for $g'(t)$. Suppose that $[E : F] = 1$. Then there exists a ring isomorphism $\varphi$ from $E$ onto $E'$ such that for every $a \in F$, $\varphi(a) = a'$.
1.5.25 Problem Let $F$ and $K$ be any fields such that $K$ is an extension of $F$. Let $F'$ and $K'$ be any fields such that $K'$ is an extension of $F'$. Let $\tau : a \mapsto a'$ be a ring isomorphism from $F$ onto $F'$. For every $f(x) \in F[x]$, we shall denote $\tau^*(f(x))$ by $f'(t)$, where $\tau^*$ is the same as discussed in 1.5.10. Let $g(x) \in F[x]$. It follows that $g'(t) \in F'[t]$. Let $E$ be a splitting field over $F$ for $g(x)$. Let $E'$ be a splitting field over $F'$ for $g'(t)$. Suppose that $[E : F] = 2$. Then there exists a ring isomorphism $\varphi$ from $E$ onto $E'$ such that for every $a \in F$, $\varphi(a) = a'$.
Proof Since $[E : F] = 2$, we have $[E : F] \neq 1$, and hence $E \neq F$. It follows that $F$ is a proper subset of $E$. By 1.3.17, $g(x)$ can be expressed as a product of finitely many irreducible polynomials in $F[x]$. Since $E$ is a splitting field over $F$ for $g(x)$, and $F$ is a proper subset of $E$, there exists $p(x) \in F[x]$ such that
1. $\deg(p(x)) > 1$,
2. $p(x) \mid g(x)$,
3. $p(x)$ is irreducible over $F$.
Since $E$ is a splitting field over $F$ for $g(x)$, $p(x) \mid g(x)$, and $p(x)$ is irreducible over $F$, all roots of $p(x)$ are members of $E$. Since $\deg(p(x)) > 1$, there exists $\alpha \in E$ such that
1. $\alpha \notin F$,
2. $\alpha$ is a root of $p(x)$ in $E$.
It follows, by 1.5.12, that $\alpha$ is algebraic of degree $r$ over $F$, where $r \equiv \deg(p(x))\ (> 1)$. Now by 1.4.16, $[F(\alpha) : F] = r$. By 1.4.3,
$$2 = [E : F] = [E : F(\alpha)][F(\alpha) : F] = [E : F(\alpha)]\, r,$$
and hence $[E : F(\alpha)] = \frac{2}{r} \leq 1$. Now, since $[E : F(\alpha)]$ is a positive integer, we have $[E : F(\alpha)] = 1$.
Since $p(x) \in F[x]$, we have $p'(t) \in F'[t]$. Since $\deg(p(x)) > 1$, we have $\deg(p'(t)) > 1$. Since $p(x) \mid g(x)$, we have $p'(t) \mid g'(t)$. Since $p(x)$ is irreducible over $F$, $p'(t)$ is irreducible over $F'$. It follows that there exists $\beta \in E'$ such that
1. $\beta \notin F'$,
2. $\beta$ is a root of $p'(t)$ in $E'$.
By 1.5.16, there exists an isomorphism $\sigma$ from $F(\alpha)$ onto $F'(\beta)$ such that $\sigma(a) = a'$ for every $a \in F$; for $a \in F(\alpha)$ we continue to write $a'$ for $\sigma(a)$.
Since $F \subset F(\alpha)$, we have $g(x) \in F[x] \subset (F(\alpha))[x]$, and hence $g(x) \in (F(\alpha))[x]$. Since $E\ (\supset F(\alpha) \supset F)$ is a splitting field over $F$ for $g(x)$, $E$ is a splitting field over $F(\alpha)$ for $g(x)$. Since $E'\ (\supset F'(\beta) \supset F')$ is a splitting field over $F'$ for $g'(t)$, $E'$ is a splitting field over $F'(\beta)$ for $g'(t)$. Now, since $[E : F(\alpha)] = 1$, by 1.5.24 (applied with $\sigma$ in place of $\tau$), there exists a ring isomorphism $\varphi$ from $E$ onto $E'$ such that for every $a \in F(\alpha)\ (\supset F)$, $\varphi(a) = a'$. It follows that for every $a \in F$, $\varphi(a) = a'$. ■
1.5.26 Problem Let $F$ and $K$ be any fields such that $K$ is an extension of $F$. Let $F'$ and $K'$ be any fields such that $K'$ is an extension of $F'$. Let $\tau : a \mapsto a'$ be a ring isomorphism from $F$ onto $F'$. For every $f(x) \in F[x]$, we shall denote $\tau^*(f(x))$ by $f'(t)$, where $\tau^*$ is the same as discussed in 1.5.10. Let $g(x) \in F[x]$. It follows that $g'(t) \in F'[t]$. Let $E$ be a splitting field over $F$ for $g(x)$. Let $E'$ be a splitting field over $F'$ for $g'(t)$. Suppose that $[E : F] = 3$. Then there exists a ring isomorphism $\varphi$ from $E$ onto $E'$ such that for every $a \in F$, $\varphi(a) = a'$.
Proof Since $[E : F] = 3$, we have $[E : F] \neq 1$, and hence $E \neq F$. It follows that $F$ is a proper subset of $E$. By 1.3.17, $g(x)$ can be expressed as a product of finitely many irreducible polynomials in $F[x]$. Now, since $E$ is a splitting field over $F$ for $g(x)$ and $F$ is a proper subset of $E$, there exists $p(x) \in F[x]$ such that
1. $\deg(p(x)) > 1$,
2. $p(x) \mid g(x)$,
3. $p(x)$ is irreducible over $F$.
Since $E$ is a splitting field over $F$ for $g(x)$, $p(x) \mid g(x)$, and $p(x)$ is irreducible over $F$, all roots of $p(x)$ are members of $E$. Since $\deg(p(x)) > 1$, there exists $\alpha \in E$ such that
1. $\alpha \notin F$,
2. $\alpha$ is a root of $p(x)$ in $E$.
It follows, by 1.5.12, that $\alpha$ is algebraic of degree $r$ over $F$, where $r \equiv \deg(p(x))\ (> 1)$. Now by 1.4.16, $[F(\alpha) : F] = r$. By 1.4.3,
$$3 = [E : F] = [E : F(\alpha)][F(\alpha) : F] = [E : F(\alpha)]\, r,$$
and hence $[E : F(\alpha)] = \frac{3}{r} \leq 2$. Now, since $[E : F(\alpha)]$ is a positive integer, we have $[E : F(\alpha)] = 1$ or $2$.
Since $p(x) \in F[x]$, we have $p'(t) \in F'[t]$. Since $\deg(p(x)) > 1$, we have $\deg(p'(t)) > 1$. Since $p(x) \mid g(x)$, we have $p'(t) \mid g'(t)$. Since $p(x)$ is irreducible over $F$, $p'(t)$ is irreducible over $F'$. It follows that there exists $\beta \in E'$ such that
1. $\beta \notin F'$,
2. $\beta$ is a root of $p'(t)$ in $E'$.
By 1.5.16, there exists an isomorphism $\sigma$ from $F(\alpha)$ onto $F'(\beta)$ such that $\sigma(a) = a'$ for every $a \in F$; for $a \in F(\alpha)$ we continue to write $a'$ for $\sigma(a)$.
Since $F \subset F(\alpha)$, we have $g(x) \in F[x] \subset (F(\alpha))[x]$, and hence $g(x) \in (F(\alpha))[x]$. Since $E\ (\supset F(\alpha) \supset F)$ is a splitting field over $F$ for $g(x)$, $E$ is a splitting field over $F(\alpha)$ for $g(x)$. Since $E'\ (\supset F'(\beta) \supset F')$ is a splitting field over $F'$ for $g'(t)$, $E'$ is a splitting field over $F'(\beta)$ for $g'(t)$. Now, since $[E : F(\alpha)] = 1$ or $2$, by 1.5.24 and 1.5.25 (applied with $\sigma$ in place of $\tau$), there exists a ring isomorphism $\varphi$ from $E$ onto $E'$ such that for every $a \in F(\alpha)\ (\supset F)$, $\varphi(a) = a'$. It follows that for every $a \in F$, $\varphi(a) = a'$. ■
Similarly, we get the following.
1.5.27 Conclusion Let $F$ and $K$ be any fields such that $K$ is an extension of $F$. Let $F'$ and $K'$ be any fields such that $K'$ is an extension of $F'$. Let $\tau : a \mapsto a'$ be a ring isomorphism from $F$ onto $F'$. For every $f(x) \in F[x]$, we shall denote $\tau^*(f(x))$ by $f'(t)$, where $\tau^*$ is the same as discussed in 1.5.10. Let $g(x) \in F[x]$. It follows that $g'(t) \in F'[t]$. Let $E$ be a splitting field over $F$ for $g(x)$. Let $E'$ be a splitting field over $F'$ for $g'(t)$. Then there exists a ring isomorphism $\varphi$ from $E$ onto $E'$ such that for every $a \in F$, $\varphi(a) = a'$.
1.5.28 Note In 1.5.27, let us take $F$ for $F'$, and the identity map $i : F \to F$ for $\tau$. Thus for every $a \in F$, $a'$ means $a$, and for every $f(x) \in F[x]$, $f'(t)$ means $f(t)$. Let $g(x) \in F[x]$. It follows that $g(t) \in F'[t]$. Let $E$ be a splitting field over $F$ for $g(x)$. Let $E'$ be a splitting field over $F'$ for $g'(t)\ (= g(t))$. Then by 1.5.27, there exists a ring isomorphism $\varphi$ from $E$ onto $E'$ such that for every $a \in F$, $\varphi(a) = a'\ (= a)$.
1.5.29 Conclusion Let $F$ and $K$ be any fields such that $F \subset K$. Suppose that $K$ is an extension of $F$. Let $g(x) \in F[x]$. Let $E$ be a splitting field over $F$ for $g(x)$. Let $E'$ be a splitting field over $F$ for $g(x)$. Then there exists a ring isomorphism $\varphi$ from $E$ onto $E'$ such that for every $a \in F$, $\varphi(a) = a$.
Thus the splitting field over $F$ for a polynomial is essentially unique, and hence we are justified in speaking of "the" splitting field.
Exercises
1. Find the greatest common divisor of $5 + 3\sqrt{-1}$ and $3 - 4\sqrt{-1}$ in $J[\sqrt{-1}]$.
(Hint: Observe that
$$5 + i3 = (3 - i4)i + 1, \qquad 3 - i4 = 1(3 - i4) + 0.$$
So the required gcd is 1.)
2. Suppose that $p$ is a prime number, and $a, b$ are integers such that $p \mid (a^2 + b^2)$, and $p^2 \nmid (a^2 + b^2)$. Show that $p$ can be expressed as a sum of two perfect squares.
3. Show that $x^3 - 2$ is irreducible in the integral domain $Q[x]$.
4. Show that if $4n - 3$ is a prime number, then $4n + 1$ can be expressed as a sum of two perfect squares.
5. Prove that $(72! + 1)$ is divisible by 73.
6. Suppose that $R$ is a unique factorization domain with unit element 1. Show that $R[x, y]$ is also a unique factorization domain.
7. Let $F$ and $K$ be any fields such that $K$ is an extension of $F$. Suppose that $a \in K$. Show that $a$ is algebraic of degree $[F(a) : F]$ over $F$.
8. Prove that $\sqrt{2} + \sqrt{3}$ is algebraic of degree $\leq 4$ over $Q$.
9. Show that
$$[\text{splitting field over } Q \text{ for } x^2 + x + 1 : Q] = 2.$$
10. Show that $\sqrt{e}$ is a transcendental number.
Chapter 2
Galois Theory II
Roughly, a real number a is called a constructible number if by the application ofstraightedge and compass we can construct, given a line segment of unit length, aline segment of length a. In some familiar geometric situations, we shall apply theresults of Galois theory. In our general development, we shall show that the generalpolynomial equation of degree five has no solution in radicals.
2.1 Simple Extensions
2.1.1 Definition Let $D$ be an integral domain, that is, $D$ is a commutative ring such that all products of nonzero members of $D$ are nonzero. If for every positive integer $m$ and every nonzero member $a$ of $D$, $ma \equiv \underbrace{a + a + \cdots + a}_{m \text{ terms}}$ is nonzero, then we say that $D$ is of characteristic 0.
Definition Let $D$ be an integral domain. If there exists a positive integer $m$ such that for every member $a$ of $D$, $ma = 0$, then we say that $D$ is of finite characteristic.
2.1.2 Problem Let $D$ be an integral domain. Let $a$ and $b$ be any nonzero members of $D$. Then
$$\{m : m \text{ is a positive integer and } ma = 0\} = \{m : m \text{ is a positive integer and } mb = 0\}.$$
© Springer Nature Singapore Pte Ltd. 2020R. Sinha, Galois Theory and Advanced Linear Algebra,https://doi.org/10.1007/978-981-13-9849-0_2
Proof Let us take an arbitrary positive integer $m$ satisfying $ma = 0$. We shall show that $mb = 0$. Since $ma = 0$, we have
$$a(mb) = (ma)b = 0b = 0,$$
and hence $a(mb) = 0$. Now, since $a$ and $mb$ are members of the integral domain $D$ and $a$ is nonzero, we have $mb = 0$. Thus
$$\{m : m \text{ is a positive integer and } ma = 0\} \subset \{m : m \text{ is a positive integer and } mb = 0\}.$$
Similarly,
$$\{m : m \text{ is a positive integer and } mb = 0\} \subset \{m : m \text{ is a positive integer and } ma = 0\}.$$
Hence
$$\{m : m \text{ is a positive integer and } ma = 0\} = \{m : m \text{ is a positive integer and } mb = 0\}.$$
■
2.1.3 Note Let $D$ be an integral domain such that $D$ is of finite characteristic. Let $b$ be a nonzero member of $D$. It follows that
$$\{m : m \text{ is a positive integer such that for every member } a \text{ of } D,\ ma = 0\}$$
is a nonempty set of positive integers. Also, by 2.1.2,
$$\{m : m \text{ is a positive integer such that for every member } a \text{ of } D,\ ma = 0\} = \{m : m \text{ is a positive integer and } mb = 0\}.$$
Since every nonempty set of positive integers has a least member, the smallest member $n$ of
$$\{m : m \text{ is a positive integer such that for every member } a \text{ of } D,\ ma = 0\}$$
exists. Clearly, $n$ is a prime number.
Also, for every nonzero member $b$ of $D$,
$$\{m : m \text{ is a positive integer and } mb = 0\} = \{n, 2n, 3n, \ldots\}.$$
(The number $n$ is called the characteristic of $D$.)
Proof Suppose to the contrary that $n$ is not a prime number. We seek a contradiction.
Since $n$ is not a prime number, there exist positive integers $n_1, n_2$ such that $n = n_1 n_2$ and $1 < n_1 \leq n_2 < n$. Here, $n$ is the smallest member of
$$\{m : m \text{ is a positive integer and } mb = 0\},$$
so $n_1 b \neq 0$, $n_2 b \neq 0$, and $nb = 0$. Since $n_1 b, n_2 b$ are nonzero members of the integral domain $D$, we have
$$0 = 0b = (nb)b = ((n_1 n_2)b)b = (n_1 b)(n_2 b) \neq 0,$$
and hence we get a contradiction. ■
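To make 2.1.2 and 2.1.3 concrete, the following sketch works in the ring of integers modulo $n$, which is an integral domain exactly when $n$ is prime. The function name `additive_order` is a hypothetical helper introduced only for this illustration.

```python
def additive_order(b, n):
    """Smallest positive m with m*b == 0 in the integers modulo n."""
    m, total = 1, b % n
    while total != 0:
        total = (total + b) % n
        m += 1
    return m

# In Z/7Z (an integral domain), every nonzero element has additive order 7,
# so the characteristic is the prime 7
orders_mod_7 = {additive_order(b, 7) for b in range(1, 7)}

# In Z/6Z (not an integral domain), the orders differ from element to element
orders_mod_6 = {additive_order(b, 6) for b in range(1, 6)}
```

The contrast between the two sets suggests why the integral domain hypothesis in 2.1.2 cannot be dropped: modulo 6 the element 2 has order 3 while the element 3 has order 2.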
Definition Let $F$ be a field. Let $f(x) \in F[x]$, where
$$f(x) \equiv a_0 x^n + a_1 x^{n-1} + \cdots + a_{n-1}x + a_n$$
and $a_i \in F\ (i = 0, 1, \ldots, n)$. It follows that $na_0, (n-1)a_1, \ldots, 2a_{n-2}\ (= a_{n-2} + a_{n-2}), 1a_{n-1}\ (= a_{n-1})$ are members of $F$, and hence
$$(na_0)x^{n-1} + ((n-1)a_1)x^{n-2} + \cdots + 1a_{n-1}$$
is a member of $F[x]$. The polynomial
$$(na_0)x^{n-1} + ((n-1)a_1)x^{n-2} + \cdots + 1a_{n-1}$$
is denoted by $f'(x)$ and is called the derivative of $f(x)$.
2.1.4 Problem Let $F$ be a field. Let $f(x), g(x) \in F[x]$. Let $a \in F$. Suppose that $h(x) = f(x) + ag(x)\ (\in F[x])$. Then
$$h'(x) = f'(x) + ag'(x).$$
Proof Let
$$f(x) \equiv a_0 x^n + a_1 x^{n-1} + \cdots + a_{n-1}x + a_n,$$
where $a_i \in F\ (i = 0, 1, \ldots, n)$. Next, let
$$g(x) \equiv b_0 x^n + b_1 x^{n-1} + \cdots + b_{n-1}x + b_n,$$
where $b_i \in F\ (i = 0, 1, \ldots, n)$. It follows that
$$h(x) = c_0 x^n + c_1 x^{n-1} + \cdots + c_{n-1}x + c_n,$$
where $c_i \equiv a_i + ab_i\ (i = 0, 1, \ldots, n)$. It follows that
$$f'(x) = (na_0)x^{n-1} + ((n-1)a_1)x^{n-2} + \cdots + 1a_{n-1},$$
$$g'(x) = (nb_0)x^{n-1} + ((n-1)b_1)x^{n-2} + \cdots + 1b_{n-1},$$
and
$$h'(x) = (nc_0)x^{n-1} + ((n-1)c_1)x^{n-2} + \cdots + 1c_{n-1}.$$
Here
$$
\begin{aligned}
f'(x) + ag'(x) &= (na_0 + a(nb_0))x^{n-1} + ((n-1)a_1 + a((n-1)b_1))x^{n-2} + \cdots + (1a_{n-1} + a(1b_{n-1}))\\
&= (na_0 + n(ab_0))x^{n-1} + ((n-1)a_1 + (n-1)(ab_1))x^{n-2} + \cdots + (1a_{n-1} + 1(ab_{n-1}))\\
&= (n(a_0 + ab_0))x^{n-1} + ((n-1)(a_1 + ab_1))x^{n-2} + \cdots + 1(a_{n-1} + ab_{n-1})\\
&= (nc_0)x^{n-1} + ((n-1)c_1)x^{n-2} + \cdots + 1c_{n-1} = h'(x).
\end{aligned}
$$
■
2.1.5 Problem Let $F$ be a field. Let $f(x), g(x) \in F[x]$. Suppose that $h(x) = f(x)g(x)\ (\in F[x])$. Then
$$h'(x) = f'(x)g(x) + f(x)g'(x).$$
Proof Let
$$f(x) \equiv a_0 x^n + a_1 x^{n-1} + \cdots + a_{n-1}x + a_n,$$
where $a_i \in F\ (i = 0, 1, \ldots, n)$. Next, let
$$g(x) \equiv b_0 x^n + b_1 x^{n-1} + \cdots + b_{n-1}x + b_n,$$
where $b_i \in F\ (i = 0, 1, \ldots, n)$. It follows that
$$h(x) = c_0 x^{2n} + c_1 x^{2n-1} + \cdots + c_{2n-1}x + c_{2n},$$
where $c_0 \equiv a_0 b_0$, $c_1 \equiv a_0 b_1 + a_1 b_0$, etc. Here
$$
\begin{aligned}
f'(x)g(x) + f(x)g'(x) &= \bigl((na_0)x^{n-1} + ((n-1)a_1)x^{n-2} + \cdots\bigr)\bigl(b_0 x^n + b_1 x^{n-1} + \cdots + b_{n-1}x + b_n\bigr)\\
&\quad + \bigl(a_0 x^n + a_1 x^{n-1} + \cdots + a_{n-1}x + a_n\bigr)\bigl((nb_0)x^{n-1} + ((n-1)b_1)x^{n-2} + \cdots + 1b_{n-1}\bigr)\\
&= \bigl(((na_0)b_0)x^{2n-1} + ((na_0)b_1 + ((n-1)a_1)b_0)x^{2n-2} + \cdots\bigr)\\
&\quad + \bigl((a_0(nb_0))x^{2n-1} + (a_0((n-1)b_1) + a_1(nb_0))x^{2n-2} + \cdots\bigr)\\
&= ((na_0)b_0 + a_0(nb_0))x^{2n-1}\\
&\quad + ((na_0)b_1 + ((n-1)a_1)b_0 + a_0((n-1)b_1) + a_1(nb_0))x^{2n-2} + \cdots\\
&= (n(a_0b_0) + n(a_0b_0))x^{2n-1}\\
&\quad + (n(a_0b_1) + (n-1)(a_1b_0) + (n-1)(a_0b_1) + n(a_1b_0))x^{2n-2} + \cdots\\
&= ((2n)(a_0b_0))x^{2n-1} + (n(a_0b_1 + a_1b_0) + (n-1)(a_0b_1 + a_1b_0))x^{2n-2} + \cdots\\
&= ((2n)(a_0b_0))x^{2n-1} + ((2n-1)(a_0b_1 + a_1b_0))x^{2n-2} + \cdots\\
&= ((2n)c_0)x^{2n-1} + ((2n-1)c_1)x^{2n-2} + \cdots = h'(x).
\end{aligned}
$$
■
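The formal derivative and the product rule of 2.1.5 can be tried out over a small field. The sketch below assumes coefficients in the integers modulo the prime 5 and stores a polynomial as a Python list with the lowest-degree coefficient first (the reverse of the order used in the text); the names `P`, `trim`, `add`, `mul`, and `derivative` are illustrative choices, not notation from the book.

```python
P = 5  # coefficients are taken in the integers modulo the prime 5

def trim(poly):
    """Drop trailing zero coefficients (list is lowest degree first)."""
    poly = list(poly)
    while poly and poly[-1] == 0:
        poly.pop()
    return poly

def add(f, g):
    """Sum of two polynomials modulo P."""
    n = max(len(f), len(g))
    f = f + [0] * (n - len(f))
    g = g + [0] * (n - len(g))
    return trim([(a + b) % P for a, b in zip(f, g)])

def mul(f, g):
    """Product of two polynomials modulo P."""
    if not f or not g:
        return []
    out = [0] * (len(f) + len(g) - 1)
    for i, a in enumerate(f):
        for j, b in enumerate(g):
            out[i + j] = (out[i + j] + a * b) % P
    return trim(out)

def derivative(f):
    """Formal derivative: the term a_k x^k becomes (k a_k) x^(k-1)."""
    return trim([(k * a) % P for k, a in enumerate(f)][1:])

f = [1, 2, 0, 3]   # 1 + 2x + 3x^3
g = [4, 0, 1]      # 4 + x^2

lhs = derivative(mul(f, g))                              # h'(x)
rhs = add(mul(derivative(f), g), mul(f, derivative(g)))  # f'(x)g(x) + f(x)g'(x)
```

Both sides reduce to $3 + 2x + 2x^2$ modulo 5. Note also that `derivative([0, 0, 0, 0, 0, 1])` returns the empty list, that is, the derivative of $x^5$ vanishes in characteristic 5 even though $x^5$ is not constant.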
2.1.6 Note Let $F$ and $K$ be any fields such that $K$ is an extension of $F$. Let $\alpha$ be a member of $K$. Suppose that $f(x) = (x - \alpha)^m g(x)$, where $m \in \{2, 3, \ldots\}$, $f(x) \in F[x]$, and $g(x) \in K[x]$.
It follows that $(x - \alpha)^m, f(x), g(x) \in K[x]$. Now by 2.1.5,
$$
\begin{aligned}
f'(x) &= \underbrace{(x-\alpha)'\underbrace{(x-\alpha)\cdots(x-\alpha)}_{(m-1)\text{ factors}}\,g(x) + (x-\alpha)(x-\alpha)'\underbrace{(x-\alpha)\cdots(x-\alpha)}_{(m-2)\text{ factors}}\,g(x) + \cdots}_{m\text{ terms}} + \underbrace{(x-\alpha)\cdots(x-\alpha)}_{m\text{ factors}}\,g'(x)\\
&= \underbrace{1\underbrace{(x-\alpha)\cdots(x-\alpha)}_{(m-1)\text{ factors}}\,g(x) + (x-\alpha)1\underbrace{(x-\alpha)\cdots(x-\alpha)}_{(m-2)\text{ factors}}\,g(x) + \cdots}_{m\text{ terms}} + \underbrace{(x-\alpha)\cdots(x-\alpha)}_{m\text{ factors}}\,g'(x)\\
&= \underbrace{(x-\alpha)^{m-1}g(x) + (x-\alpha)^{m-1}g(x) + \cdots}_{m\text{ terms}} + (x-\alpha)^m g'(x)\\
&= m(x-\alpha)^{m-1}g(x) + (x-\alpha)^m g'(x)\\
&= (x-\alpha)\bigl(m(x-\alpha)^{m-2}g(x) + (x-\alpha)^{m-1}g'(x)\bigr) = (x-\alpha)r(x),
\end{aligned}
$$
where $r(x) \equiv m(x-\alpha)^{m-2}g(x) + (x-\alpha)^{m-1}g'(x) \in K[x]$. Since $f(x) = (x-\alpha)\bigl((x-\alpha)^{m-1}g(x)\bigr)$ and $f'(x) = (x-\alpha)r(x)$, $(x-\alpha)$ is a common factor of $f(x)$ and $f'(x)$.
2.1.7 Conclusion Let $F$ and $K$ be any fields such that $K$ is an extension of $F$. Let $\alpha$ be a member of $K$. Let $f(x) \in F[x]$. Suppose that $\alpha$ is a multiple root of $f(x)$. Then $f(x)$ and $f'(x)$ have a nontrivial common factor in $K[x]$.
2.1.8 Note Let $F$ and $K$ be any fields such that $K$ is an extension of $F$. Suppose that $f(x) \in F[x]$. It follows that $f'(x) \in F[x]$. Suppose that $f(x)$ and $f'(x)$ have a nontrivial common factor in $K[x]$, that is, $f(x)$ and $f'(x)$ have a common factor of degree $\geq 1$ in $K[x]$.
It follows that there exists $\alpha$ such that $(x-\alpha) \mid f(x)$ and $(x-\alpha) \mid f'(x)$. We shall show that $(x-\alpha)^2 \mid f(x)$.
Since $(x-\alpha) \mid f(x)$, there exist a positive integer $m$ and a polynomial $r(x)$ such that
1. $f(x) = (x-\alpha)^m r(x)$,
2. $(x-\alpha) \nmid r(x)$.
It suffices to show that $m \geq 2$. Suppose to the contrary that $m = 1$. We seek a contradiction. Here $f(x) = (x-\alpha)r(x)$, so by 2.1.5,
$$f'(x) = (x-\alpha)'r(x) + (x-\alpha)r'(x) = 1r(x) + (x-\alpha)r'(x) = r(x) + (x-\alpha)r'(x),$$
and hence
$$f'(x) = r(x) + (x-\alpha)r'(x).$$
It follows that
$$f'(\alpha) = r(\alpha) + (\alpha-\alpha)r'(\alpha) = r(\alpha) + 0r'(\alpha) = r(\alpha).$$
Thus $f'(\alpha) = r(\alpha)$. Since $(x-\alpha) \nmid r(x)$, we have $r(\alpha) \neq 0$. Since $(x-\alpha) \mid f'(x)$, we have $f'(\alpha) = 0$. Since $f'(\alpha) = 0$ and $r(\alpha) \neq 0$, we have $f'(\alpha) \neq r(\alpha)$. This is a contradiction.
2.1.9 Conclusion Let $F$ and $K$ be any fields such that $K$ is an extension of $F$. Suppose that $f(x) \in F[x]$. Suppose that $f(x)$ and $f'(x)$ have a nontrivial common factor in $K[x]$. Then $f(x)$ has a multiple root.
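Conclusions 2.1.7 and 2.1.9 together give a practical test: a multiple root shows up as a nontrivial common factor of $f(x)$ and $f'(x)$. The following sketch runs the Euclidean algorithm over the rationals using Python's `fractions.Fraction`. Polynomials are lists with the lowest-degree coefficient first, and the helper names `trim`, `derivative`, `remainder`, and `gcd` are assumptions made for this illustration, not the book's notation.

```python
from fractions import Fraction

def trim(f):
    """Drop trailing zero coefficients (list is lowest degree first)."""
    f = list(f)
    while f and f[-1] == 0:
        f.pop()
    return f

def derivative(f):
    """Formal derivative of a coefficient list."""
    return trim([k * a for k, a in enumerate(f)][1:])

def remainder(f, g):
    """Remainder of f on division by the nonzero polynomial g."""
    f = trim(f)
    while len(f) >= len(g):
        factor = f[-1] / g[-1]
        shift = len(f) - len(g)
        for i, b in enumerate(g):
            f[i + shift] -= factor * b
        f = trim(f)
    return f

def gcd(f, g):
    """Monic greatest common divisor by the Euclidean algorithm."""
    f, g = trim(f), trim(g)
    while g:
        f, g = g, remainder(f, g)
    return [a / f[-1] for a in f]

# f(x) = (x - 1)^2 (x + 2) = x^3 - 3x + 2 has the double root 1
f = [Fraction(2), Fraction(-3), Fraction(0), Fraction(1)]
common = gcd(f, derivative(f))       # x - 1, stored as [-1, 1]

# s(x) = x^3 - 2 has no multiple root, so the gcd with its derivative is 1
s = [Fraction(-2), Fraction(0), Fraction(0), Fraction(1)]
coprime = gcd(s, derivative(s))      # the constant polynomial 1
```

The first gcd recovers the factor $x - 1$ belonging to the repeated root, while the second is trivial, in line with 2.1.10 for an irreducible polynomial in characteristic 0.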
2.1.10 Problem Let $F$ and $K$ be any fields such that $K$ is an extension of $F$. Let $F$ be of characteristic 0. Suppose that $f(x) \in F[x]$. Let $f(x)$ be irreducible. Then $f(x)$ has no multiple root.
Proof Suppose to the contrary that $f(x)$ has a multiple root. We seek a contradiction.
Since $f(x)$ has a multiple root, by 2.1.6, $f(x)$ and $f'(x)$ have a nontrivial common factor in $K[x]$. Since $f(x)$ is irreducible, $f(x)$ is the only nontrivial factor of $f(x)$. Now, since $f(x)$ and $f'(x)$ have a nontrivial common factor in $K[x]$, $f(x)$ is a factor of $f'(x)$, and hence $\deg(f(x)) \leq \deg(f'(x))$. Suppose that
$$f(x) \equiv a_0 x^n + a_1 x^{n-1} + \cdots + a_{n-1}x + a_n,$$
where $a_i \in F\ (i = 0, 1, \ldots, n)$, $n$ is a positive integer, and $a_0 \neq 0$. Since $F$ is of characteristic 0, $na_0$ is a nonzero member of $F$, and hence
$$\deg(f'(x)) = \deg\bigl((na_0)x^{n-1} + ((n-1)a_1)x^{n-2} + \cdots\bigr) = n - 1 < n = \deg\bigl(a_0 x^n + a_1 x^{n-1} + \cdots + a_{n-1}x + a_n\bigr) = \deg(f(x)).$$
Thus $\deg(f'(x)) < \deg(f(x))$. This is a contradiction. ■
2.1.11 Problem Let $F$ and $K$ be any fields such that $K$ is an extension of $F$. Let $F$ be of characteristic $p$. By 2.1.3, $p$ is a prime number. Suppose that $f(x) \in F[x]$. Let $f(x)$ be irreducible. Suppose that $f(x)$ has a multiple root. Then $f(x)$ is of the form $g(x^p)$, where $g(x) \in F[x]$.
Proof Since $f(x)$ has a multiple root, by 2.1.6, $f(x)$ and $f'(x)$ have a nontrivial common factor in $K[x]$. Since $f(x)$ is irreducible, $f(x)$ is the only nontrivial factor of $f(x)$. Now, since $f(x)$ and $f'(x)$ have a nontrivial common factor in $K[x]$, $f(x)$ is a factor of $f'(x)$, and hence $\deg(f(x)) \leq \deg(f'(x))$. But we know that if $f'(x)$ is nonzero, then $\deg(f'(x)) < \deg(f(x))$, hence $f'(x) = 0$. Suppose that
$$f(x) \equiv a_0 + a_1 x + \cdots + a_{p-1}x^{p-1} + a_p x^p + a_{p+1}x^{p+1} + \cdots + a_{2p-1}x^{2p-1} + a_{2p}x^{2p} + a_{2p+1}x^{2p+1} + \cdots + a_n x^n,$$
where $a_i \in F\ (i = 0, 1, \ldots, n)$, and $n$ is a positive integer. It suffices to show that $a_1 = 0, \ldots, a_{p-1} = 0, a_{p+1} = 0, \ldots, a_{2p-1} = 0$, etc.
Since $f'(x) = 0$, we have
$$a_1 + 2a_2 x + \cdots + (p-1)a_{p-1}x^{p-2} + pa_p x^{p-1} + (p+1)a_{p+1}x^p + \cdots + (2p-1)a_{2p-1}x^{2p-2} + 2pa_{2p}x^{2p-1} + (2p+1)a_{2p+1}x^{2p} + \cdots + na_n x^{n-1} = 0,$$
and hence $0 = a_1$, $0 = 2a_2$, $\ldots$, $0 = (p-1)a_{p-1}$, $0 = (p+1)a_{p+1}$, $\ldots$, $0 = (2p-1)a_{2p-1}$, $0 = (2p+1)a_{2p+1}$, etc.
Since $F$ is of characteristic $p$, $p \nmid (p-1)$, $0 = (p-1)a_{p-1}$, and $a_{p-1} \in F$, we have $a_{p-1} = 0$. Similarly, $a_{p+1} = 0$, $a_{2p-1} = 0$, etc. ■
2.1.12 Note Let $F$ and $K$ be any fields such that $K$ is an extension of $F$. Let $F$ be of characteristic $p$.
Observe that $1x^p + (-1)x$ is a member of $F[x]$. Also
$$(1x^p + (-1)x)' = (p1)x^{p-1} + (-1).$$
Since $1 \in F$, and $F$ is of characteristic $p$, we have $p1 = 0$, and hence
$$(1x^p + (-1)x)' = (p1)x^{p-1} + (-1) = 0x^{p-1} + (-1) = -1.$$
Thus $(1x^p + (-1)x)' = -1$. Now, since $1x^p + (-1)x$ and $-1$ have no nontrivial common factor in $K[x]$, $1x^p + (-1)x$ and $(1x^p + (-1)x)'$ have no nontrivial common factor in $K[x]$, and hence by 2.1.6, $1x^p + (-1)x$ has no multiple root.
2.1.13 Conclusion Let $F$ and $K$ be any fields such that $K$ is an extension of $F$. Let $F$ be of characteristic $p$. Then $x^p - x$ has no multiple root. Similarly, $x^{p^2} - x$ has no multiple root, $x^{p^3} - x$ has no multiple root, etc.
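The computation in 2.1.12 can be replayed over the integers modulo a prime. The sketch below assumes $p = 7$ and stores polynomials as lists with the lowest-degree coefficient first; the names `P`, `derivative`, and `evaluate` are illustrative helpers rather than notation from the text. For contrast it also differentiates $x^p - 1$, a polynomial of the form $g(x^p)$ discussed in 2.1.11.

```python
P = 7  # a prime; all arithmetic is in the integers modulo P

def derivative(f):
    """Formal derivative of a coefficient list (lowest degree first), mod P."""
    return [(k * a) % P for k, a in enumerate(f)][1:]

def evaluate(f, x):
    """Evaluate the polynomial at x modulo P using Horner's rule."""
    value = 0
    for a in reversed(f):
        value = (value * x + a) % P
    return value

# x^P - x, stored as [0, -1, 0, ..., 0, 1] with -1 reduced modulo P
f = [0, (-1) % P] + [0] * (P - 2) + [1]

f_prime = derivative(f)   # the constant -1, since P * 1 = 0 modulo P
roots = [x for x in range(P) if evaluate(f, x) == 0]

# Contrast: x^P - 1 is a polynomial in x^P, and its derivative vanishes
h = [(-1) % P] + [0] * (P - 1) + [1]
h_prime = derivative(h)
```

Here `f_prime` is the constant $-1$ (stored as 6 modulo 7) and `roots` contains all seven residues, so $x^7 - x$ has seven distinct roots, whereas every coefficient of `h_prime` is zero.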
2.1.14 Note Let $F$ and $K$ be any fields such that $K$ is an extension of $F$. Let $F$ be of characteristic 0. Let $\alpha, \beta \in K$. Suppose that $\alpha$, $\beta$ are algebraic over $F$.
Since $\alpha$ is algebraic over $F$, there exists a nonzero polynomial $q(x) \in F[x]$ such that $(K \ni)\ q(\alpha) = 0$. By 1.3.21, there exists an irreducible polynomial $f(x) \in F[x]$ such that $f(x) \mid q(x)$ in $F[x]$, $(K \ni)\ f(\alpha) = 0$, and the leading coefficient of $f(x)$ is 1. Similarly, there exists an irreducible polynomial $g(x) \in F[x]$ such that $(K \ni)\ g(\beta) = 0$, and the leading coefficient of $g(x)$ is 1. Suppose that $\deg(f(x)) = m\ (\geq 1)$ and $\deg(g(x)) = n\ (\geq 1)$. Let $L$ be a field such that
1. $K \subset L$,
2. $L$ is an extension of $K$,
3. $f(x)$ splits completely in $L$,
4. $g(x)$ splits completely in $L$.
Suppose that all the $m$ roots of $f(x)$ in $L$ are $\alpha, \alpha_2, \ldots, \alpha_m$. Next, suppose that all the $n$ roots of $g(x)$ in $L$ are $\beta, \beta_2, \ldots, \beta_n$. Since $\alpha, \beta \in L$ and $F \subset L$, we have $F(\alpha, \beta) \subset L$. Also
$$f(x) = (x - \alpha)(x - \alpha_2)\cdots(x - \alpha_m)$$
and
$$g(x) = (x - \beta)(x - \beta_2)\cdots(x - \beta_n)$$
in $L[x]$. Since $F$ is of characteristic 0, $f(x) \in F[x]$, $f(x)$ is irreducible, and $\alpha, \alpha_2, \ldots, \alpha_m$ are the roots of $f(x)$, by 2.1.10, $\alpha, \alpha_2, \ldots, \alpha_m$ are distinct. Similarly, $\beta, \beta_2, \ldots, \beta_n$ are distinct. Also, $f(\alpha) = 0$, $g(\beta) = 0$, $f(\alpha_i) = 0\ (i = 2, 3, \ldots, m)$, and $g(\beta_j) = 0\ (j = 2, 3, \ldots, n)$. Since $F$ is of characteristic 0 and $1 \in F$,
$$1,\ 1 + 1,\ 1 + 1 + 1,\ \ldots$$
are distinct members of $F$, and hence $F$ is an infinite set.
Let us take arbitrary $i \in \{2, \ldots, m\}$ and $j \in \{2, \ldots, n\}$. Since $\beta, \beta_2, \ldots, \beta_n$ are distinct, $(\beta - \beta_j) \neq 0$, and hence $(\beta - \beta_j)^{-1}$ is a nonzero element of the field $L$.
Observe that there exists a unique $\lambda \in L$ such that
$$\alpha_i + \lambda\beta_j = \alpha + \lambda\beta.$$
Proof Existence: Since
$$\alpha_i + \bigl((\alpha_i - \alpha)(\beta - \beta_j)^{-1}\bigr)\beta_j = \bigl(\alpha_i(\beta - \beta_j) + (\alpha_i - \alpha)\beta_j\bigr)(\beta - \beta_j)^{-1} = (\alpha_i\beta - \alpha\beta_j)(\beta - \beta_j)^{-1}$$
and
$$\alpha + \bigl((\alpha_i - \alpha)(\beta - \beta_j)^{-1}\bigr)\beta = \bigl(\alpha(\beta - \beta_j) + (\alpha_i - \alpha)\beta\bigr)(\beta - \beta_j)^{-1} = (\alpha_i\beta - \alpha\beta_j)(\beta - \beta_j)^{-1},$$
$(\alpha_i - \alpha)(\beta - \beta_j)^{-1}\ (\in L)$ is a solution of the $\lambda$-equation
$$\alpha_i + \lambda\beta_j = \alpha + \lambda\beta$$
in $L$.
Uniqueness: Suppose that
$$\alpha_i + \lambda_1\beta_j = \alpha + \lambda_1\beta, \qquad \alpha_i + \lambda_2\beta_j = \alpha + \lambda_2\beta,$$
where $\lambda_1, \lambda_2 \in L$. We have to show that $\lambda_1 = \lambda_2$. Here,
$$(\lambda_1 - \lambda_2)\beta_j = (\alpha_i + \lambda_1\beta_j) - (\alpha_i + \lambda_2\beta_j) = (\alpha + \lambda_1\beta) - (\alpha + \lambda_2\beta) = (\lambda_1 - \lambda_2)\beta,$$
so $(\lambda_1 - \lambda_2)\beta_j = (\lambda_1 - \lambda_2)\beta$, and hence $(\lambda_1 - \lambda_2)(\beta - \beta_j) = 0$. Since $(\beta - \beta_j) \neq 0$, we have $\lambda_1 - \lambda_2 = 0$, and hence $\lambda_1 = \lambda_2$. ■
Thus we have shown that for every $i \in \{2, \ldots, m\}$ and $j \in \{2, \ldots, n\}$, there exists a unique $\lambda \in L\ (\supset F)$ such that $\alpha_i + \lambda\beta_j = \alpha + \lambda\beta$. It follows that the collection of all such $\lambda$ is a finite set. Now, since $F$ is an infinite set, there exists a nonzero $c \in F\ (\subset F(\alpha, \beta))$ such that for every $i \in \{2, \ldots, m\}$ and $j \in \{2, \ldots, n\}$, $\alpha_i + c\beta_j \neq \alpha + c\beta$, and hence for every $j \in \{2, \ldots, n\}$, $(\alpha + c\beta) - c\beta_j$ is different from $\alpha, \alpha_2, \ldots, \alpha_m$. Since all the $m$ distinct roots of $f(x)$ are $\alpha, \alpha_2, \ldots, \alpha_m$, for every $j \in \{2, \ldots, n\}$, $(\alpha + c\beta) - c\beta_j$ is not a root of $f(x)$, and hence for every $j \in \{2, \ldots, n\}$, we have $f((\alpha + c\beta) - c\beta_j) \neq 0$.
Since $\alpha$, $\beta$, $c$ are elements of the field $F(\alpha, \beta)$, we have $(\alpha + c\beta) \in F(\alpha, \beta)$, and hence $F(\alpha + c\beta) \subset F(\alpha, \beta) \subset L$.
Now we shall show that $F(\alpha, \beta) \subset F(\alpha + c\beta)$. To this end, put
$$h(x) \equiv f((\alpha + c\beta) - cx)\ \bigl(\in (F(\alpha + c\beta))[x]\bigr).$$
Thus $h(x) \in (F(\alpha + c\beta))[x]$. Also
$$h(\beta) = f((\alpha + c\beta) - c\beta) = f(\alpha) = 0,$$
so $h(\beta) = 0$, and hence $(x - \beta)$ is a factor of the polynomial $h(x)$ in $L[x]$. Since $g(\beta) = 0$, $(x - \beta)$ is a factor of the polynomial $g(x)$ in $L[x]$. Thus $(x - \beta)$ is a common factor of the polynomials $g(x)$ and $h(x)$ in $L[x]$. By 2.1.10, we have $(x - \beta)^2 \nmid g(x)$, and hence $(x - \beta)^2$ is not a common factor of the polynomials $g(x)$ and $h(x)$.
Clearly, for every $j \in \{2, \ldots, n\}$, $(x - \beta_j)$ is not a common divisor of the polynomials $g(x)$ and $h(x)$.
Proof Let us fix an arbitrary $j \in \{2, \ldots, n\}$. It follows that
$$h(\beta_j) = f((\alpha + c\beta) - c\beta_j) \neq 0,$$
and hence $h(\beta_j) \neq 0$. Thus $(x - \beta_j) \nmid h(x)$ in $L[x]$. It follows that for every $j \in \{2, \ldots, n\}$, $(x - \beta_j)$ is not a common divisor of the polynomials $g(x)$ and $h(x)$ in $L[x]$. ■
Thus $(x - \beta)$ is a greatest common divisor of the polynomials $g(x)$ and $h(x)$ in $L[x]\ \bigl(\supset (F(\alpha + c\beta))[x]\bigr)$. It follows that $(x - \beta) \in (F(\alpha, \beta))[x]\ (\subset L[x])$ divides each member of the set
$$\{g(x)u(x) + h(x)v(x) : u(x), v(x) \in L[x]\}\ \bigl(\supset \{g(x)u(x) + h(x)v(x) : u(x), v(x) \in (F(\alpha + c\beta))[x]\}\bigr),$$
and hence $(x - \beta)$ divides each member of
$$\{g(x)u(x) + h(x)v(x) : u(x), v(x) \in (F(\alpha + c\beta))[x]\}.$$
Since $g(x) \in F[x]\ \bigl(\subset (F(\alpha + c\beta))[x]\bigr)$ and $h(x) \in (F(\alpha + c\beta))[x]$, a greatest common divisor of the polynomials $g(x)$ and $h(x)$ in $(F(\alpha + c\beta))[x]$ is a member of
$$\{g(x)u(x) + h(x)v(x) : u(x), v(x) \in (F(\alpha + c\beta))[x]\}.$$
2.1 Simple Extensions 101
Since ðx� bÞ divides each member of
gðxÞuðxÞþ hðxÞvðxÞ : uðxÞ; vðxÞ 2 Fðaþ cbÞð Þ½x�f g;
ðx� bÞ divides a greatest common divisor of the polynomials gðxÞ and hðxÞ inFðaþ cbÞð Þ½x�, and hence a greatest common divisor of the polynomials gðxÞ andhðxÞ in Fðaþ cbÞð Þ½x� is nontrivial. Since Fðaþ cbÞð Þ½x� � L½x�, a greatest commondivisor of the polynomials gðxÞ and hðxÞ in Fðaþ cbÞð Þ½x� divides a greatestcommon divisor of the polynomials gðxÞ and hðxÞ in L½x�, and hence a greatestcommon divisor of the polynomials gðxÞ and hðxÞ in Fðaþ cbÞð Þ½x� divides ðx� bÞ.Now, since x� bð Þ divides a greatest common divisor of the polynomials gðxÞ andhðxÞ in Fðaþ cbÞð Þ½x�, ðx� bÞ is a greatest common divisor of the polynomials gðxÞand hðxÞ in Fðaþ cbÞð Þ½x�, and hence x� bð Þ 2 Fðaþ cbÞð Þ½x�. It follows that�bð Þ 2 Fðaþ cbÞ, and hence b 2 Fðaþ cbÞ. Since ðaþ cbÞ 2 Fðaþ cbÞ andc 2 F � Fðaþ cbÞð Þ, we have
a ¼ ðaþ cbÞ � cbð Þ 2 Fðaþ cbÞ|fflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflffl{zfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflffl};and hence a 2 Fðaþ cbÞ. Thus F [ a; bf g � Fðaþ cbÞ, and hence Fða; bÞ� Fðaþ cbÞ. Next, since Fðaþ cbÞ � Fða; bÞ, we have FðcÞ ¼ Fða; bÞ, wherec � ðaþ cbÞ 2 Fða; bÞ.2.1.15 Conclusion I Let F and K be any fields such that K is an extension of F. LetF be of characteristic 0. Let a; b 2 K. Suppose that a, b are algebraic over F. Thenthere exists c 2 Fða; bÞ such that Fða; bÞ ¼ FðcÞ.
Similarly, we get the following.

2.1.16 Conclusion II Let $F$ and $K$ be any fields such that $K$ is an extension of $F$. Let $F$ be of characteristic 0. Let $a_1, a_2, \dots, a_n \in K$. Suppose that $a_1, a_2, \dots, a_n$ are algebraic over $F$. Then there exists $c \in F(a_1, a_2, \dots, a_n)$ such that $F(a_1, a_2, \dots, a_n) = F(c)$.

Definition Let $F$ and $K$ be any fields such that $K$ is an extension of $F$. If there exists $c \in K$ such that $K = F(c)$, then we say that $K$ is a simple extension of $F$.

Now Conclusion II can be stated as follows:

2.1.17 Conclusion III Let $F$ and $K$ be any fields such that $K$ is an extension of $F$. Let $F$ be of characteristic 0. Let $a_1, a_2, \dots, a_n \in K$. Suppose that $a_1, a_2, \dots, a_n$ are algebraic over $F$. Then $F(a_1, a_2, \dots, a_n)$ is a simple extension of $F$.

Using 1.4.9, we get the following.

2.1.18 Conclusion IV Let $F$ and $K$ be any fields such that $K$ is an extension of $F$. Let $F$ be of characteristic 0. Let $a \in K$. Suppose that $F(a)$ is a finite extension of $F$. Then $F(a)$ is a simple extension of $F$. In short, every finite extension of a field of characteristic 0 is a simple extension.
2.2 Galois Groups
Caution: Henceforth, all our fields are of characteristic 0.
2.2.1 Definition Let $K$ be any field. Let $\sigma : K \to K$ be a function. If

1. for every $a, b \in K$, $\sigma(a+b) = \sigma(a) + \sigma(b)$,
2. for every $a, b \in K$, $\sigma(ab) = \sigma(a)\sigma(b)$,
3. $\sigma : K \to K$ is onto,
4. $\sigma : K \to K$ is 1-1,

then we say that $\sigma$ is an automorphism of $K$.

Here condition (4) is superfluous.

Proof Suppose to the contrary that there exist $a, b \in K$ such that $\sigma(a) = \sigma(b)$ and $a \neq b$. We seek a contradiction. Since $a \neq b$, $(a-b)$ is a nonzero member of $K$, and hence $(a-b)^{-1} \in K$. It follows that

$$\begin{aligned}
1 = \sigma(1) &= \sigma\big((a-b)(a-b)^{-1}\big) = \sigma(a-b)\,\sigma\big((a-b)^{-1}\big)\\
&= \sigma\big(a+(-b)\big)\,\sigma\big((a-b)^{-1}\big)\\
&= \big(\sigma(a)+\sigma(-b)\big)\,\sigma\big((a-b)^{-1}\big)\\
&= \big(\sigma(a)+(-\sigma(b))\big)\,\sigma\big((a-b)^{-1}\big)\\
&= \big(\sigma(a)-\sigma(b)\big)\,\sigma\big((a-b)^{-1}\big)\\
&= \big(\sigma(a)-\sigma(a)\big)\,\sigma\big((a-b)^{-1}\big)\\
&= 0 \cdot \sigma\big((a-b)^{-1}\big)\\
&= 0,
\end{aligned}$$

and hence $1 = 0$. This contradicts the fact that $K$ is a field. ■
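As a concrete illustration of the definition, the following Python sketch spot-checks that conjugation $a + b\sqrt{2} \mapsto a - b\sqrt{2}$ on $\mathbb{Q}(\sqrt{2})$ respects addition and multiplication and is one-to-one on a finite sample. The field $\mathbb{Q}(\sqrt{2})$, the pair encoding, and the helper names `add`, `mul`, `sigma` are assumptions made for this example only; a finite sample does not prove the property for the whole field.

```python
from fractions import Fraction
from itertools import product

# An element a + b*sqrt(2) of Q(sqrt(2)) is stored as the pair (a, b).
def add(u, v):
    return (u[0] + v[0], u[1] + v[1])

def mul(u, v):
    # (a + b*sqrt2)(c + d*sqrt2) = (ac + 2bd) + (ad + bc)*sqrt2
    a, b = u
    c, d = v
    return (a * c + 2 * b * d, a * d + b * c)

def sigma(u):
    # Candidate automorphism: a + b*sqrt2 -> a - b*sqrt2
    return (u[0], -u[1])

# A small finite sample of field elements (duplicates are harmless).
values = [Fraction(n, d) for n in (-2, -1, 0, 1, 3) for d in (1, 2)]
sample = list(product(values, repeat=2))

# Conditions (1) and (2) of the definition, checked on the sample.
preserves_add = all(sigma(add(u, v)) == add(sigma(u), sigma(v))
                    for u in sample for v in sample)
preserves_mul = all(sigma(mul(u, v)) == mul(sigma(u), sigma(v))
                    for u in sample for v in sample)

# Condition (4) on the sample: distinct inputs give distinct outputs.
distinct = set(sample)
injective_on_sample = len({sigma(u) for u in distinct}) == len(distinct)

print(preserves_add, preserves_mul, injective_on_sample)
```

Exact rational arithmetic (`Fraction`) is used so that the equality tests are not affected by floating-point rounding.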
2.2.2 Note Let $K$ be any field. Let $\sigma_1, \dots, \sigma_n$ be $n$ distinct automorphisms of $K$. Let $a_1, \dots, a_n \in K$. Suppose that

1. for every $u \in K$, $a_1\sigma_1(u) + \cdots + a_n\sigma_n(u) = 0$,
2. not all $a_i$ $(i = 1, \dots, n)$ are 0.

We claim that this is impossible. We seek a contradiction.

In the case $n = 1$, condition (1) becomes: for every $u \in K$, $a_1\sigma_1(u) = 0$, and hence

$$a_1 = a_1 \cdot 1 = a_1\sigma_1(1) = 0.$$

Thus $a_1 = 0$. In this case, condition (2) becomes $a_1 \neq 0$. This is a contradiction.

Now we consider the case $n = 2$.

Here condition (1) becomes: for every $u \in K$, $a_1\sigma_1(u) + a_2\sigma_2(u) = 0$. Next, condition (2) becomes (either $a_1 \neq 0$ or $a_2 \neq 0$). For definiteness, suppose that $a_1 \neq 0$. We seek a contradiction.
Since $\sigma_1, \sigma_2$ are distinct, there exists a nonzero $c \in K$ such that $\sigma_1(c) \neq \sigma_2(c)$. Thus $\sigma_1(c), \sigma_2(c)$ are nonzero, and

$$a_1\sigma_1(c) + a_2\sigma_2(c) = 0.$$

Since $a_1 \neq 0$ and $\sigma_1(c) \neq 0$, we have

$$a_2\sigma_2(c) = -a_1\sigma_1(c) \neq 0,$$

and hence $a_2\sigma_2(c) \neq 0$. This shows that $a_2 \neq 0$. Observe that for every $u \in K$, we have $cu \in K$, and hence

$$\begin{aligned}
a_2\sigma_2(c)\big(\sigma_2(u) - \sigma_1(u)\big) &= \big(-a_2\sigma_2(c)\big)\sigma_1(u) + a_2\sigma_2(c)\sigma_2(u)\\
&= a_1\sigma_1(c)\sigma_1(u) + a_2\sigma_2(c)\sigma_2(u)\\
&= a_1\sigma_1(cu) + a_2\sigma_2(cu) = 0.
\end{aligned}$$

Thus for every $u \in K$,

$$a_2\sigma_2(c)\big(\sigma_2(u) - \sigma_1(u)\big) = 0.$$

Now, since $a_2, \sigma_2(c)$ are nonzero members of $K$ and $\big(\sigma_2(u) - \sigma_1(u)\big)$ is a member of the field $K$, we have, for every $u \in K$, $\sigma_2(u) - \sigma_1(u) = 0$. Thus for every $u \in K$, $\sigma_1(u) = \sigma_2(u)$. Since $c \in K$, we have $\sigma_1(c) = \sigma_2(c)$. This is a contradiction.
Next we consider the case $n = 3$.

If $a_3 = 0$, then from the cases discussed above, we get a contradiction. Hence we have to deal with only the case $a_3 \neq 0$.

Since $\sigma_1, \sigma_3$ are distinct, there exists a nonzero $c \in K$ such that $\sigma_1(c) \neq \sigma_3(c)$. Thus $\sigma_1(c), \sigma_3(c)$ are nonzero, and

$$a_1\sigma_1(c) + a_2\sigma_2(c) + a_3\sigma_3(c) = 0.$$

Here condition (1) becomes: for every $u \in K$, $a_1\sigma_1(u) + a_2\sigma_2(u) + a_3\sigma_3(u) = 0$. Observe that for every $u \in K$, we have $cu \in K$, and hence

$$\begin{aligned}
&a_2\big(\sigma_2(c) - \sigma_1(c)\big)\sigma_2(u) + a_3\big(\sigma_3(c) - \sigma_1(c)\big)\sigma_3(u)\\
&\quad= \big(-a_2\sigma_2(u) - a_3\sigma_3(u)\big)\sigma_1(c) + a_2\sigma_2(c)\sigma_2(u) + a_3\sigma_3(c)\sigma_3(u)\\
&\quad= \big(a_1\sigma_1(u)\big)\sigma_1(c) + a_2\sigma_2(c)\sigma_2(u) + a_3\sigma_3(c)\sigma_3(u)\\
&\quad= a_1\sigma_1(c)\sigma_1(u) + a_2\sigma_2(c)\sigma_2(u) + a_3\sigma_3(c)\sigma_3(u)\\
&\quad= a_1\sigma_1(cu) + a_2\sigma_2(cu) + a_3\sigma_3(cu) = 0.
\end{aligned}$$

Thus for every $u \in K$,

$$b_2\sigma_2(u) + b_3\sigma_3(u) = 0,$$

where $b_2 \equiv a_2\big(\sigma_2(c) - \sigma_1(c)\big)$ and $b_3 \equiv a_3\big(\sigma_3(c) - \sigma_1(c)\big)$. Since $a_3$ and $\big(\sigma_3(c) - \sigma_1(c)\big)$ are nonzero members of the field $K$, $(b_3 =)\ a_3\big(\sigma_3(c) - \sigma_1(c)\big)$ is a nonzero member of the field $K$, and hence not all $b_i$ $(i = 2, 3)$ are 0. By our earlier case $n = 2$, we get a contradiction, etc.
2.2.3 Conclusion Let $K$ be any field. Let $\sigma_1, \dots, \sigma_n$ be $n$ distinct automorphisms of $K$. There do not exist $a_1, \dots, a_n \in K$ such that

1. for every $u \in K$, $a_1\sigma_1(u) + \cdots + a_n\sigma_n(u) = 0$,
2. not all $a_i$ $(i = 1, \dots, n)$ are 0.

2.2.4 Problem Let $K$ be any field. Let $G$ be a nonempty collection of automorphisms of $K$. Put

$$K_G \equiv \{a : a \in K, \text{ and for every } \sigma \in G,\ \sigma(a) = a\}.$$

Then $K_G$ is a subfield of $K$.

Here we say that $K_G$ is the fixed field of $G$.

Proof Let us take an arbitrary $\sigma \in G$. It follows that $\sigma : K \to K$ is an automorphism of $K$, and hence $\sigma(0) = 0$ and $\sigma(1) = 1$. This shows that $0, 1 \in K_G$. Thus $K_G$ is a subset of $K$, and $K_G$ contains at least two elements.

Let $a, b \in K_G$. Let us take an arbitrary $\sigma \in G$. It follows by the definition of $K_G$ that $\sigma(a) = a$ and $\sigma(b) = b$. Hence $\sigma(a+b) = \sigma(a) + \sigma(b) = a + b$ and $\sigma(ab) = \sigma(a)\sigma(b) = ab$. It follows that $(a+b), ab \in K_G$. Next, since $\sigma(-a) = -(\sigma(a)) = -a$, we have $(-a) \in K_G$. If $a \neq 0$, then $\sigma(a^{-1}) = (\sigma(a))^{-1} = a^{-1}$. Thus if $a$ is a nonzero element of $K_G$, then $a^{-1} \in K_G$. Hence $K_G$ is a subfield of $K$. ■
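To make the notion of a fixed field tangible, here is a small Python sketch, under the assumption that $K = \mathbb{Q}(i)$ (encoded as pairs of rationals) and $G = \{\text{identity}, \text{complex conjugation}\}$. It filters a finite sample for fixed elements and spot-checks the closure properties used in the proof. The names `is_fixed`, `conj`, and the sample grid are hypothetical choices for illustration.

```python
from fractions import Fraction
from itertools import product

# An element a + b*i of Q(i) is stored as the pair (a, b).
def add(u, v):
    return (u[0] + v[0], u[1] + v[1])

def mul(u, v):
    a, b = u
    c, d = v
    return (a * c - b * d, a * d + b * c)

def inv(u):
    # 1/(a + bi) = (a - bi)/(a^2 + b^2), valid for u != 0
    a, b = u
    norm = a * a + b * b
    return (a / norm, -b / norm)

def identity(u):
    return u

def conj(u):
    return (u[0], -u[1])

G = [identity, conj]

def is_fixed(u):
    # u belongs to K_G exactly when every member of G leaves it unchanged.
    return all(s(u) == u for s in G)

values = [Fraction(n, 2) for n in range(-3, 4)]
sample = list(product(values, repeat=2))
fixed = [u for u in sample if is_fixed(u)]

# In this sample the fixed elements are exactly those with zero imaginary part.
only_rationals = all(u[1] == 0 for u in fixed) and len(fixed) == len(values)

# Closure under +, *, and inverses, checked on the sampled fixed elements.
closed_add_mul = all(is_fixed(add(u, v)) and is_fixed(mul(u, v))
                     for u in fixed for v in fixed)
closed_inv = all(is_fixed(inv(u)) for u in fixed if u != (0, 0))

print(only_rationals, closed_add_mul, closed_inv)
```

Within this sample the fixed elements are the rational numbers, which is consistent with $K_G$ being a subfield of $K$.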
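2.2.5 Problem Let $K$ be any field. The collection of all automorphisms of $K$ is denoted by $\mathrm{Aut}(K)$. Clearly, $\mathrm{Aut}(K)$ is a group.

Proof The identity map $\mathrm{Id} : a \mapsto a$ from $K$ onto $K$ is an automorphism of $K$, and hence $\mathrm{Id} \in \mathrm{Aut}(K)$.

a. Let $\sigma, \mu \in \mathrm{Aut}(K)$. We have to show that $(\sigma\mu) \in \mathrm{Aut}(K)$. Since $\sigma \in \mathrm{Aut}(K)$, $\sigma$ is a one-to-one map from $K$ onto $K$. Similarly, $\mu$ is a one-to-one map from $K$ onto $K$. It follows that the composite map $(\sigma\mu)$ is a one-to-one map from $K$ onto $K$. Next, let us take arbitrary $a$, $b$ in $K$. We have

$$(\sigma\mu)(a+b) = \sigma\big(\mu(a+b)\big) = \sigma\big(\mu(a) + \mu(b)\big) = \sigma\big(\mu(a)\big) + \sigma\big(\mu(b)\big) = (\sigma\mu)(a) + (\sigma\mu)(b),$$

and hence

$$(\sigma\mu)(a+b) = (\sigma\mu)(a) + (\sigma\mu)(b).$$

Similarly, $(\sigma\mu)(ab) = (\sigma\mu)(a) \cdot (\sigma\mu)(b)$. Thus $(\sigma\mu) \in \mathrm{Aut}(K)$.

b. Let $\sigma \in \mathrm{Aut}(K)$. We have to show that $\sigma^{-1} \in \mathrm{Aut}(K)$. Since $\sigma \in \mathrm{Aut}(K)$, $\sigma$ is a one-to-one map from $K$ onto $K$, and hence $\sigma^{-1}$ is a one-to-one map from $K$ onto $K$. Next, let us take arbitrary $a$, $b$ in $K$. We have to show that

1. $\sigma^{-1}(a+b) = \sigma^{-1}(a) + \sigma^{-1}(b)$, that is, $a + b = \sigma\big(\sigma^{-1}(a) + \sigma^{-1}(b)\big)$,
2. $\sigma^{-1}(ab) = \sigma^{-1}(a) \cdot \sigma^{-1}(b)$, that is, $ab = \sigma\big(\sigma^{-1}(a) \cdot \sigma^{-1}(b)\big)$.

For 1: $\mathrm{RHS} = \sigma\big(\sigma^{-1}(a) + \sigma^{-1}(b)\big) = \sigma\big(\sigma^{-1}(a)\big) + \sigma\big(\sigma^{-1}(b)\big) = a + b = \mathrm{LHS}$.

For 2: $\mathrm{RHS} = \sigma\big(\sigma^{-1}(a) \cdot \sigma^{-1}(b)\big) = \sigma\big(\sigma^{-1}(a)\big) \cdot \sigma\big(\sigma^{-1}(b)\big) = ab = \mathrm{LHS}$.

■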
2.2.6 Problem Let $K$ be any field. Let $F$ be a subfield of $K$. Put

$$G(K,F) \equiv \{\sigma : \sigma \in \mathrm{Aut}(K), \text{ and for every } a \in F,\ \sigma(a) = a\}.$$

Then $G(K,F)$ is a subgroup of $\mathrm{Aut}(K)$.

Here $G(K,F)$ is called the group of automorphisms of $K$ relative to $F$.

Proof The identity map $\mathrm{Id} : a \mapsto a$ from $K$ onto $K$ is an automorphism of $K$, and hence $\mathrm{Id} \in \mathrm{Aut}(K)$. Also, for every $a \in F$, $\mathrm{Id}(a) = a$. Thus $\mathrm{Id} \in G(K,F)$.

Let $\sigma, \mu \in G(K,F)$. It suffices to show that $(\sigma\mu^{-1}) \in G(K,F)$. To this end, let us take an arbitrary $a \in F$. It suffices to show that $(\sigma\mu^{-1})(a) = a$. Since $\mu \in G(K,F)$ and $a \in F$, we have $\mu(a) = a$, and hence $\mu^{-1}(a) = a$.

$$\mathrm{LHS} = (\sigma\mu^{-1})(a) = \sigma\big(\mu^{-1}(a)\big) = \sigma(a) = a = \mathrm{RHS}.$$

■
2.2.7 Note Let $F$ and $K$ be any fields such that $K$ is a finite extension of $F$.

It follows that $[K:F] < \infty$. Hence there exists a basis $\{u_1, u_2, \dots, u_n\}$ of the vector space $K$ over $F$, where $n = [K:F]$. By 2.2.6, $G(K,F)$ is a group of automorphisms of $K$.

We claim that the number of elements of $G(K,F)$ is $\le n$. Suppose to the contrary that the number of elements of $G(K,F)$ is $> n$. We seek a contradiction.

Since the number of elements of $G(K,F)$ is $> n$, there exist $(n+1)$ distinct automorphisms $\sigma_1, \sigma_2, \dots, \sigma_{n+1}$ of $K$ in $G(K,F)$. It follows that for every $i \in \{1, 2, \dots, n\}$ and for every $j \in \{1, 2, \dots, n+1\}$, we have $\sigma_j(u_i) \in K$. It follows that the following system of $n$ linear equations in $(n+1)$ variables $x_1, x_2, \dots, x_n, x_{n+1}$,

$$\left.\begin{aligned}
\sigma_1(u_1)x_1 + \sigma_2(u_1)x_2 + \cdots + \sigma_n(u_1)x_n + \sigma_{n+1}(u_1)x_{n+1} &= 0\\
\sigma_1(u_2)x_1 + \sigma_2(u_2)x_2 + \cdots + \sigma_n(u_2)x_n + \sigma_{n+1}(u_2)x_{n+1} &= 0\\
&\ \,\vdots\\
\sigma_1(u_n)x_1 + \sigma_2(u_n)x_2 + \cdots + \sigma_n(u_n)x_n + \sigma_{n+1}(u_n)x_{n+1} &= 0
\end{aligned}\right\},$$

has a nontrivial solution $(x_1, x_2, \dots, x_n, x_{n+1}) = (a_1, a_2, \dots, a_n, a_{n+1})\ \big(\neq (0, 0, \dots, 0, 0)\big)$ in $K$. It follows that

$$\left.\begin{aligned}
\sigma_1(u_1)a_1 + \sigma_2(u_1)a_2 + \cdots + \sigma_{n+1}(u_1)a_{n+1} &= 0\\
\sigma_1(u_2)a_1 + \sigma_2(u_2)a_2 + \cdots + \sigma_{n+1}(u_2)a_{n+1} &= 0\\
&\ \,\vdots\\
\sigma_1(u_n)a_1 + \sigma_2(u_n)a_2 + \cdots + \sigma_{n+1}(u_n)a_{n+1} &= 0
\end{aligned}\right\},$$

that is,

$$\left.\begin{aligned}
a_1\sigma_1(u_1) + a_2\sigma_2(u_1) + \cdots + a_{n+1}\sigma_{n+1}(u_1) &= 0\\
a_1\sigma_1(u_2) + a_2\sigma_2(u_2) + \cdots + a_{n+1}\sigma_{n+1}(u_2) &= 0\\
&\ \,\vdots\\
a_1\sigma_1(u_n) + a_2\sigma_2(u_n) + \cdots + a_{n+1}\sigma_{n+1}(u_n) &= 0
\end{aligned}\right\},$$

that is,

$$\sum_{j=1}^{n+1} a_j\sigma_j(u_i) = 0 \quad (i = 1, \dots, n).$$
Since $(a_1, a_2, \dots, a_n, a_{n+1}) \neq (0, 0, \dots, 0, 0)$, not all $a_i$ $(i = 1, \dots, n+1)$ are 0, and hence by 2.2.3, there exists $u \in K$ such that

$$a_1\sigma_1(u) + \cdots + a_{n+1}\sigma_{n+1}(u) \neq 0.$$

Since $u \in K$ and $\{u_1, u_2, \dots, u_n\}$ is a basis of the vector space $K$ over $F$, there exist $b_1, b_2, \dots, b_n$ in $F$ such that $u = \sum_{i=1}^{n} b_iu_i$. Since each $\sigma_j$ leaves every element of $F$ fixed, we have $\sigma_j(b_iu_i) = b_i\sigma_j(u_i)$. It follows that

$$\begin{aligned}
a_1\sigma_1(u) + \cdots + a_{n+1}\sigma_{n+1}(u) &= a_1\sigma_1\Big(\sum_{i=1}^{n} b_iu_i\Big) + \cdots + a_{n+1}\sigma_{n+1}\Big(\sum_{i=1}^{n} b_iu_i\Big)\\
&= a_1\Big(\sum_{i=1}^{n} b_i\sigma_1(u_i)\Big) + \cdots + a_{n+1}\Big(\sum_{i=1}^{n} b_i\sigma_{n+1}(u_i)\Big)\\
&= \sum_{i=1}^{n} a_1b_i\sigma_1(u_i) + \cdots + \sum_{i=1}^{n} a_{n+1}b_i\sigma_{n+1}(u_i)\\
&= \sum_{j=1}^{n+1}\Big(\sum_{i=1}^{n} a_jb_i\sigma_j(u_i)\Big) = \sum_{i=1}^{n}\Big(\sum_{j=1}^{n+1} a_jb_i\sigma_j(u_i)\Big)\\
&= \sum_{i=1}^{n} b_i\Big(\sum_{j=1}^{n+1} a_j\sigma_j(u_i)\Big) = \sum_{i=1}^{n} (b_i \cdot 0) = 0,
\end{aligned}$$

and hence

$$a_1\sigma_1(u) + \cdots + a_{n+1}\sigma_{n+1}(u) = 0.$$

This is a contradiction.
2.2.8 Conclusion Let $F$ and $K$ be any fields such that $K$ is a finite extension of $F$. Then $o(G(K,F)) \le [K:F]$.

Definition Let $F$ be a field. By 1.3.7, $F[x_1, \dots, x_n]$ is an integral domain. Its field of quotients is denoted by $F(x_1, \dots, x_n)$. The members of $F(x_1, \dots, x_n)$ are called rational functions in $x_1, \dots, x_n$ over $F$. By $S_n$ we shall mean the permutation group

$$\{\sigma : \sigma : \{1, 2, \dots, n\} \to \{1, 2, \dots, n\} \text{ is one-to-one and onto}\},$$

which is called the symmetric group of degree $n$.
2.2.9 Problem Observe that for every $\sigma \in S_n$, the mapping $\hat{\sigma} : r(x_1, \dots, x_n) \mapsto r(x_{\sigma(1)}, \dots, x_{\sigma(n)})$ from $F(x_1, \dots, x_n)$ to $F(x_1, \dots, x_n)$ is an automorphism of the field $F(x_1, \dots, x_n)$.

For simplicity, $\hat{\sigma}$ is also denoted by $\sigma$. Thus we can treat $S_n$ as a group of automorphisms of the field $F(x_1, \dots, x_n)$.

Proof Suppose that $\dfrac{p(x_1, \dots, x_n)}{q(x_1, \dots, x_n)}, \dfrac{r(x_1, \dots, x_n)}{s(x_1, \dots, x_n)} \in F(x_1, \dots, x_n)$, where $p(x_1, \dots, x_n), q(x_1, \dots, x_n), r(x_1, \dots, x_n), s(x_1, \dots, x_n) \in F[x_1, \dots, x_n]$. Here
$$\begin{aligned}
\sigma\left(\frac{p(x_1, \dots, x_n)}{q(x_1, \dots, x_n)} + \frac{r(x_1, \dots, x_n)}{s(x_1, \dots, x_n)}\right)
&= \sigma\left(\frac{p(x_1, \dots, x_n)s(x_1, \dots, x_n) + r(x_1, \dots, x_n)q(x_1, \dots, x_n)}{q(x_1, \dots, x_n)s(x_1, \dots, x_n)}\right)\\
&= \frac{p(x_{\sigma(1)}, \dots, x_{\sigma(n)})s(x_{\sigma(1)}, \dots, x_{\sigma(n)}) + r(x_{\sigma(1)}, \dots, x_{\sigma(n)})q(x_{\sigma(1)}, \dots, x_{\sigma(n)})}{q(x_{\sigma(1)}, \dots, x_{\sigma(n)})s(x_{\sigma(1)}, \dots, x_{\sigma(n)})}\\
&= \frac{p(x_{\sigma(1)}, \dots, x_{\sigma(n)})}{q(x_{\sigma(1)}, \dots, x_{\sigma(n)})} + \frac{r(x_{\sigma(1)}, \dots, x_{\sigma(n)})}{s(x_{\sigma(1)}, \dots, x_{\sigma(n)})}\\
&= \sigma\left(\frac{p(x_1, \dots, x_n)}{q(x_1, \dots, x_n)}\right) + \sigma\left(\frac{r(x_1, \dots, x_n)}{s(x_1, \dots, x_n)}\right),
\end{aligned}$$

so

$$\sigma\left(\frac{p(x_1, \dots, x_n)}{q(x_1, \dots, x_n)} + \frac{r(x_1, \dots, x_n)}{s(x_1, \dots, x_n)}\right) = \sigma\left(\frac{p(x_1, \dots, x_n)}{q(x_1, \dots, x_n)}\right) + \sigma\left(\frac{r(x_1, \dots, x_n)}{s(x_1, \dots, x_n)}\right).$$
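Next,

$$\begin{aligned}
\sigma\left(\frac{p(x_1, \dots, x_n)}{q(x_1, \dots, x_n)} \cdot \frac{r(x_1, \dots, x_n)}{s(x_1, \dots, x_n)}\right)
&= \sigma\left(\frac{p(x_1, \dots, x_n)r(x_1, \dots, x_n)}{q(x_1, \dots, x_n)s(x_1, \dots, x_n)}\right)\\
&= \frac{p(x_{\sigma(1)}, \dots, x_{\sigma(n)})r(x_{\sigma(1)}, \dots, x_{\sigma(n)})}{q(x_{\sigma(1)}, \dots, x_{\sigma(n)})s(x_{\sigma(1)}, \dots, x_{\sigma(n)})}\\
&= \frac{p(x_{\sigma(1)}, \dots, x_{\sigma(n)})}{q(x_{\sigma(1)}, \dots, x_{\sigma(n)})} \cdot \frac{r(x_{\sigma(1)}, \dots, x_{\sigma(n)})}{s(x_{\sigma(1)}, \dots, x_{\sigma(n)})}\\
&= \sigma\left(\frac{p(x_1, \dots, x_n)}{q(x_1, \dots, x_n)}\right) \cdot \sigma\left(\frac{r(x_1, \dots, x_n)}{s(x_1, \dots, x_n)}\right),
\end{aligned}$$

so

$$\sigma\left(\frac{p(x_1, \dots, x_n)}{q(x_1, \dots, x_n)} \cdot \frac{r(x_1, \dots, x_n)}{s(x_1, \dots, x_n)}\right) = \sigma\left(\frac{p(x_1, \dots, x_n)}{q(x_1, \dots, x_n)}\right) \cdot \sigma\left(\frac{r(x_1, \dots, x_n)}{s(x_1, \dots, x_n)}\right).$$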
Thus $\sigma$ preserves addition and multiplication.

$\sigma : F(x_1, \dots, x_n) \to F(x_1, \dots, x_n)$ is one-to-one. To show this, let $\sigma\big(r(x_1, \dots, x_n)\big) = \sigma\big(s(x_1, \dots, x_n)\big)$, where $r(x_1, \dots, x_n), s(x_1, \dots, x_n) \in F(x_1, \dots, x_n)$. We have to show that $r(x_1, \dots, x_n) = s(x_1, \dots, x_n)$. Since

$$r(x_{\sigma(1)}, \dots, x_{\sigma(n)}) = \sigma\big(r(x_1, \dots, x_n)\big) = \sigma\big(s(x_1, \dots, x_n)\big) = s(x_{\sigma(1)}, \dots, x_{\sigma(n)}),$$

we have $r(x_{\sigma(1)}, \dots, x_{\sigma(n)}) = s(x_{\sigma(1)}, \dots, x_{\sigma(n)})$, and hence

$$\begin{aligned}
r(x_1, \dots, x_n) &= r\big(x_{\sigma^{-1}(\sigma(1))}, \dots, x_{\sigma^{-1}(\sigma(n))}\big)\\
&= \sigma^{-1}\big(r(x_{\sigma(1)}, \dots, x_{\sigma(n)})\big) = \sigma^{-1}\big(s(x_{\sigma(1)}, \dots, x_{\sigma(n)})\big)\\
&= s\big(x_{\sigma^{-1}(\sigma(1))}, \dots, x_{\sigma^{-1}(\sigma(n))}\big) = s(x_1, \dots, x_n).
\end{aligned}$$
Thus $r(x_1, \dots, x_n) = s(x_1, \dots, x_n)$.

$\sigma : F(x_1, \dots, x_n) \to F(x_1, \dots, x_n)$ is onto. To show this, let us take an arbitrary $r(x_1, \dots, x_n) \in F(x_1, \dots, x_n)$. Put $s(x_1, \dots, x_n) \equiv r\big(x_{\sigma^{-1}(1)}, \dots, x_{\sigma^{-1}(n)}\big)$. Here

$$\sigma\big(s(x_1, \dots, x_n)\big) = \sigma\big(r(x_{\sigma^{-1}(1)}, \dots, x_{\sigma^{-1}(n)})\big) = r\big(x_{\sigma(\sigma^{-1}(1))}, \dots, x_{\sigma(\sigma^{-1}(n))}\big) = r(x_1, \dots, x_n),$$

so

$$\sigma\big(s(x_1, \dots, x_n)\big) = r(x_1, \dots, x_n).$$
Thus $\sigma : F(x_1, \dots, x_n) \to F(x_1, \dots, x_n)$ is an automorphism of $F(x_1, \dots, x_n)$. ■

2.2.10 Definition Let $F$ be a field. We know that $F(x_1, \dots, x_n)$ is a field extension of $F$, and $S_n$ is a group of automorphisms of $F(x_1, \dots, x_n)$. Here the fixed field of $S_n$ is denoted by $S$. Thus

$$S = \{r(x_1, \dots, x_n) : r(x_1, \dots, x_n) \in F(x_1, \dots, x_n), \text{ and for every } \sigma \in S_n,\ \sigma\big(r(x_1, \dots, x_n)\big) = r(x_1, \dots, x_n)\},$$

that is,

$$S = \{r(x_1, \dots, x_n) : r(x_1, \dots, x_n) \in F(x_1, \dots, x_n), \text{ and for every } \sigma \in S_n,\ r(x_{\sigma(1)}, \dots, x_{\sigma(n)}) = r(x_1, \dots, x_n)\}.$$

By 2.2.4, $S$ is a subfield of $F(x_1, \dots, x_n)$. Also $F \subset S$. Thus $F \subset S \subset F(x_1, \dots, x_n)$.

The members of $S$ are called symmetric rational functions. Thus $S$ is the field of symmetric rational functions.
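2.2.11 Example Suppose that $n = 3$. Here

$$S_3 = \{\sigma_1, \sigma_2, \sigma_3, \sigma_4, \sigma_5, \sigma_6\},$$

where

$$\sigma_1 \equiv \begin{pmatrix} 1 & 2 & 3\\ 1 & 2 & 3 \end{pmatrix},\quad
\sigma_2 \equiv \begin{pmatrix} 1 & 2 & 3\\ 2 & 3 & 1 \end{pmatrix},\quad
\sigma_3 \equiv \begin{pmatrix} 1 & 2 & 3\\ 3 & 1 & 2 \end{pmatrix},$$

$$\sigma_4 \equiv \begin{pmatrix} 1 & 2 & 3\\ 1 & 3 & 2 \end{pmatrix},\quad
\sigma_5 \equiv \begin{pmatrix} 1 & 2 & 3\\ 3 & 2 & 1 \end{pmatrix},\quad
\sigma_6 \equiv \begin{pmatrix} 1 & 2 & 3\\ 2 & 1 & 3 \end{pmatrix}.$$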
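Observe that $x_2x_3 + x_3x_1 + x_1x_2$ is a symmetric rational function.

Verification: We must show that

$$x_{\sigma_i(2)}x_{\sigma_i(3)} + x_{\sigma_i(3)}x_{\sigma_i(1)} + x_{\sigma_i(1)}x_{\sigma_i(2)} = x_2x_3 + x_3x_1 + x_1x_2 \quad (i = 1, 2, 3, 4, 5, 6).$$

For $i = 1$: $\mathrm{LHS} = x_{\sigma_1(2)}x_{\sigma_1(3)} + x_{\sigma_1(3)}x_{\sigma_1(1)} + x_{\sigma_1(1)}x_{\sigma_1(2)} = x_2x_3 + x_3x_1 + x_1x_2 = \mathrm{RHS}$.

For $i = 2$: $\mathrm{LHS} = x_{\sigma_2(2)}x_{\sigma_2(3)} + x_{\sigma_2(3)}x_{\sigma_2(1)} + x_{\sigma_2(1)}x_{\sigma_2(2)} = x_3x_1 + x_1x_2 + x_2x_3 = x_2x_3 + x_3x_1 + x_1x_2 = \mathrm{RHS}$.

For $i = 3$: $\mathrm{LHS} = x_{\sigma_3(2)}x_{\sigma_3(3)} + x_{\sigma_3(3)}x_{\sigma_3(1)} + x_{\sigma_3(1)}x_{\sigma_3(2)} = x_1x_2 + x_2x_3 + x_3x_1 = x_2x_3 + x_3x_1 + x_1x_2 = \mathrm{RHS}$.

For $i = 4$: $\mathrm{LHS} = x_{\sigma_4(2)}x_{\sigma_4(3)} + x_{\sigma_4(3)}x_{\sigma_4(1)} + x_{\sigma_4(1)}x_{\sigma_4(2)} = x_3x_2 + x_2x_1 + x_1x_3 = x_2x_3 + x_3x_1 + x_1x_2 = \mathrm{RHS}$.

For $i = 5$: $\mathrm{LHS} = x_{\sigma_5(2)}x_{\sigma_5(3)} + x_{\sigma_5(3)}x_{\sigma_5(1)} + x_{\sigma_5(1)}x_{\sigma_5(2)} = x_2x_1 + x_1x_3 + x_3x_2 = x_2x_3 + x_3x_1 + x_1x_2 = \mathrm{RHS}$.

For $i = 6$: $\mathrm{LHS} = x_{\sigma_6(2)}x_{\sigma_6(3)} + x_{\sigma_6(3)}x_{\sigma_6(1)} + x_{\sigma_6(1)}x_{\sigma_6(2)} = x_1x_3 + x_3x_2 + x_2x_1 = x_2x_3 + x_3x_1 + x_1x_2 = \mathrm{RHS}$.

Verified.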
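The six-case verification above can also be carried out mechanically. The following Python sketch, offered only as an illustration, encodes a polynomial in $x_1, x_2, x_3$ as a dictionary from exponent tuples to coefficients and applies every permutation of the variables. The encoding and the helper names `act` and `is_symmetric` are assumptions of this example, and indices are 0-based in the code.

```python
from itertools import permutations

def act(perm, poly):
    """Apply the substitution x_i -> x_{perm[i]} (0-based indices) to a
    polynomial stored as {exponent_tuple: coefficient}."""
    result = {}
    for exps, coeff in poly.items():
        new_exps = [0] * len(exps)
        for i, e in enumerate(exps):
            new_exps[perm[i]] += e  # exponent of x_i moves to x_{perm[i]}
        key = tuple(new_exps)
        result[key] = result.get(key, 0) + coeff
    return result

def is_symmetric(poly, n=3):
    # Symmetric means unchanged by every permutation in S_n.
    return all(act(p, poly) == poly for p in permutations(range(n)))

# x1 + x2 + x3, x2*x3 + x3*x1 + x1*x2, and x1*x2*x3
e1 = {(1, 0, 0): 1, (0, 1, 0): 1, (0, 0, 1): 1}
e2 = {(0, 1, 1): 1, (1, 0, 1): 1, (1, 1, 0): 1}
e3 = {(1, 1, 1): 1}

# x1*x2 alone is not symmetric: swapping x2 and x3 changes it.
not_symmetric = {(1, 1, 0): 1}

print(is_symmetric(e1), is_symmetric(e2), is_symmetric(e3),
      is_symmetric(not_symmetric))
```

Only polynomials are handled here; a rational function $p/q$ would need a pair of such dictionaries and a test for equality of fractions.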
Definition Similarly, $x_1x_2x_3$ is a symmetric rational function, and $x_1 + x_2 + x_3$ is a symmetric rational function. The symmetric rational functions $x_1 + x_2 + x_3$, $x_2x_3 + x_3x_1 + x_1x_2$, and $x_1x_2x_3$ are called the elementary symmetric functions.

The elementary symmetric function $x_1 + x_2 + x_3$ is denoted by $a_1$, the elementary symmetric function $x_2x_3 + x_3x_1 + x_1x_2$ is denoted by $a_2$, and the elementary symmetric function $x_1x_2x_3$ is denoted by $a_3$. Thus $(\{a_1, a_2, a_3\} \cup F) \subset S$, and hence the smallest field $F(a_1, a_2, a_3)$ containing $\{a_1, a_2, a_3\} \cup F$ is contained in $S$. In short,

$$F \subset F(a_1, a_2, a_3) \subset S \subset F(x_1, x_2, x_3).$$
Since $F(x_1, x_2, x_3)$ is an extension of the field $S$, by 2.2.6,

$$G\big(F(x_1, x_2, x_3), S\big) = \{\sigma : \sigma \in \mathrm{Aut}\big(F(x_1, x_2, x_3)\big), \text{ and for every } r(x_1, x_2, x_3) \in S,\ \sigma\big(r(x_1, x_2, x_3)\big) = r(x_1, x_2, x_3)\}$$

is a group of automorphisms of $F(x_1, x_2, x_3)$.

2.2.12 Problem Clearly, $S_3 \subset G\big(F(x_1, x_2, x_3), S\big)$.

Proof To show this, let us take an arbitrary $\sigma_i \in S_3$, where $i \in \{1, 2, 3, 4, 5, 6\}$. We have to show that

1. $\sigma_i \in \mathrm{Aut}\big(F(x_1, x_2, x_3)\big)$,
2. for every $r(x_1, x_2, x_3) \in S$, $r(x_{\sigma_i(1)}, x_{\sigma_i(2)}, x_{\sigma_i(3)}) = r(x_1, x_2, x_3)$.

For 1: Since we can treat $S_3$ as a group of automorphisms of the field $F(x_1, x_2, x_3)$, and $\sigma_i \in S_3$, we have $\sigma_i \in \mathrm{Aut}\big(F(x_1, x_2, x_3)\big)$.

For 2: Let us take an arbitrary $r(x_1, x_2, x_3) \in S$. Now, since $\sigma_i \in S_3$, by the definition of $S$, $r(x_{\sigma_i(1)}, x_{\sigma_i(2)}, x_{\sigma_i(3)}) = r(x_1, x_2, x_3)$. ■
2.2.13 Note Since $S_3 \subset G\big(F(x_1, x_2, x_3), S\big)$, we have

$$3! = o(S_3) \le o\big(G(F(x_1, x_2, x_3), S)\big),$$

and hence

$$3! \le o\big(G(F(x_1, x_2, x_3), S)\big).$$

Since $F(a_1, a_2, a_3) \subset F(x_1, x_2, x_3)$, the field $F(x_1, x_2, x_3)$ is an extension of $F(a_1, a_2, a_3)$. Observe that

$$\big(t^3 + (-a_1)t^2 + a_2t + (-a_3)\big) \in \big(F(a_1, a_2, a_3)\big)[t]$$

and

$$\begin{aligned}
t^3 + (-a_1)t^2 + a_2t + (-a_3) &= t^3 - (x_1 + x_2 + x_3)t^2 + (x_2x_3 + x_3x_1 + x_1x_2)t - x_1x_2x_3\\
&= (t - x_1)(t - x_2)(t - x_3).
\end{aligned}$$

So

$$t^3 + (-a_1)t^2 + a_2t + (-a_3) = (t - x_1)(t - x_2)(t - x_3).$$

It follows that $F(x_1, x_2, x_3)$ contains all the roots $x_1, x_2, x_3$ of $t^3 + (-a_1)t^2 + a_2t + (-a_3)$ in $F(x_1, x_2, x_3)$.
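The factorization of $t^3 - a_1t^2 + a_2t - a_3$ can be spot-checked numerically. The sketch below, given as an illustration only, substitutes integer values for $x_1, x_2, x_3$ and compares the coefficient list of $(t - x_1)(t - x_2)(t - x_3)$ with the list built from the elementary symmetric functions. The helper names and the grid of test values are assumptions of this example.

```python
from itertools import product

def poly_mul(p, q):
    """Multiply two polynomials in t given as coefficient lists,
    lowest degree first."""
    out = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out

def cubic_from_roots(x1, x2, x3):
    # Expand (t - x1)(t - x2)(t - x3).
    result = [1]
    for x in (x1, x2, x3):
        result = poly_mul(result, [-x, 1])
    return result

def cubic_from_elementary(x1, x2, x3):
    a1 = x1 + x2 + x3
    a2 = x2 * x3 + x3 * x1 + x1 * x2
    a3 = x1 * x2 * x3
    # t^3 - a1*t^2 + a2*t - a3, lowest degree first
    return [-a3, a2, -a1, 1]

# Compare both constructions on a grid of integer triples.
identity_holds = all(
    cubic_from_roots(*xs) == cubic_from_elementary(*xs)
    for xs in product(range(-3, 4), repeat=3)
)

print(identity_holds)
print(cubic_from_roots(1, 2, 3))
```

For the roots 1, 2, 3 the expansion is $t^3 - 6t^2 + 11t - 6$, which matches $a_1 = 6$, $a_2 = 11$, $a_3 = 6$.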
We claim that $F(x_1, x_2, x_3)$ is a splitting field over $F(a_1, a_2, a_3)$ for $t^3 + (-a_1)t^2 + a_2t + (-a_3)$.

Suppose to the contrary that $G$ is a proper subfield of $F(x_1, x_2, x_3)$ that contains $F(a_1, a_2, a_3)$ as well as all the roots $x_1, x_2, x_3$ of $t^3 + (-a_1)t^2 + a_2t + (-a_3)$ in $F(x_1, x_2, x_3)$. We seek a contradiction. Since $G$ contains $F \cup \{x_1, x_2, x_3\}$, and $G$ is a field, $G$ contains $F(x_1, x_2, x_3)$. This contradicts the fact that $G$ is a proper subset of $F(x_1, x_2, x_3)$.

Hence our claim is substantiated, that is, $F(x_1, x_2, x_3)$ is a splitting field over $F(a_1, a_2, a_3)$ for $t^3 + (-a_1)t^2 + a_2t + (-a_3)$. By 1.5.9,

$$\big[\big(\text{splitting field over } F(a_1, a_2, a_3) \text{ for } t^3 + (-a_1)t^2 + a_2t + (-a_3)\big) : F(a_1, a_2, a_3)\big] \le \big(\deg\big(t^3 + (-a_1)t^2 + a_2t + (-a_3)\big)\big)!,$$

so

$$\big[F(x_1, x_2, x_3) : F(a_1, a_2, a_3)\big] \le \big(\deg\big(t^3 + (-a_1)t^2 + a_2t + (-a_3)\big)\big)!,$$

and hence

$$\big[F(x_1, x_2, x_3) : F(a_1, a_2, a_3)\big] \le 3!.$$

Thus $F(x_1, x_2, x_3)$ is a finite extension of $F(a_1, a_2, a_3)$. Now, since $F(a_1, a_2, a_3) \subset S \subset F(x_1, x_2, x_3)$, by 1.4.4, $S$ is a finite extension of $F(a_1, a_2, a_3)$, and $F(x_1, x_2, x_3)$ is a finite extension of $S$. Also, by 1.4.3,

$$\big[F(x_1, x_2, x_3) : F(a_1, a_2, a_3)\big] = \big[F(x_1, x_2, x_3) : S\big]\,\big[S : F(a_1, a_2, a_3)\big].$$

Since $F(x_1, x_2, x_3)$ is a finite extension of $S$, by 2.2.8, we have

$$\begin{aligned}
3! \le o\big(G(F(x_1, x_2, x_3), S)\big) \le \big[F(x_1, x_2, x_3) : S\big] &\le \big[F(x_1, x_2, x_3) : S\big]\,\big[S : F(a_1, a_2, a_3)\big]\\
&= \big[F(x_1, x_2, x_3) : F(a_1, a_2, a_3)\big] \le 3!,
\end{aligned}$$

and hence

$$\big[F(x_1, x_2, x_3) : S\big] = 3!.$$
Also

$$o\big(G(F(x_1, x_2, x_3), S)\big) = 3!\ \big(= o(S_3)\big)$$

and

$$\big[F(x_1, x_2, x_3) : S\big] = \big[F(x_1, x_2, x_3) : S\big]\,\big[S : F(a_1, a_2, a_3)\big].$$

Since $\big[F(x_1, x_2, x_3) : S\big] = 3!$, we have $\big[S : F(a_1, a_2, a_3)\big] = 1$, and hence $S = F(a_1, a_2, a_3)$. Since $S_3 \subset G\big(F(x_1, x_2, x_3), S\big)$ and $o\big(G(F(x_1, x_2, x_3), S)\big) = 3!\ \big(= o(S_3)\big)$, we have $S_3 = G\big(F(x_1, x_2, x_3), S\big)$.

2.2.14 Conclusion Let $F$ be a field. Let $n$ be a positive integer. Then

1. $\big[F(x_1, \dots, x_n) : S\big] = n!$;
2. $G\big(F(x_1, \dots, x_n), S\big) = S_n$;
3. $S = F(a_1, \dots, a_n)$;
4. $F(x_1, \dots, x_n)$ is a splitting field over $S$ for $t^n - a_1t^{n-1} + a_2t^{n-2} - \cdots + (-1)^na_n$,

where the symbols have their usual meanings.
2.2.15 Note Let $F$ and $K$ be any fields such that $K$ is a finite extension of $F$.

It follows that $[K:F] < \infty$. Hence there exists a basis $\{u_1, u_2, \dots, u_n\}$ of the vector space $K$ over $F$, where $n = [K:F]$. By 2.2.6, $G(K,F)$ is a group of automorphisms of $K$, where

$$G(K,F) \equiv \{\sigma : \sigma \in \mathrm{Aut}(K), \text{ and for every } a \in F,\ \sigma(a) = a\}.$$

Further, by 2.2.8, $o(G(K,F)) \le [K:F] = n$. Here the fixed field of $G(K,F)$ is

$$\{a : a \in K, \text{ and for every } \sigma \in G(K,F),\ \sigma(a) = a\}\ (\supset F),$$

so $F \subset \big(\text{fixed field of } G(K,F)\big) \subset K$. It follows (taking $K = F$) that

$$\big(\text{fixed field of } G(F,F)\big) = F.$$

Definition Let $F$ and $K$ be any fields such that $F \subset K$. Let $K$ be a finite extension of $F$. If $F = \big(\text{fixed field of } G(K,F)\big)$, that is, $\big(\text{fixed field of } G(K,F)\big) \subset F$, then we say that $K$ is a normal extension of $F$.

Since $\big(\text{fixed field of } G(F,F)\big) = F$, and $[F:F] = 1 < \infty$, $F$ is a normal extension of $F$.
2.2.16 Note Let $F$ and $K$ be any fields such that $K$ is a normal extension of $F$. Let $H$ be a subgroup of the group $G(K,F)\ (\subset \mathrm{Aut}(K))$. Let $K_H$ be the fixed field of $H$, that is,

$$(K \supset)\ K_H = \{a : a \in K, \text{ and for every } \sigma \in H,\ \sigma(a) = a\}\ (\supset F).$$

Since $K$ is a normal extension of $F$, $K$ is a finite extension of $F$, and hence $[K:F] < \infty$. Next, by 2.2.7, $o(G(K,F)) \le [K:F]$. Since $H$ is a subgroup of the group $G(K,F)$, we have $o(H) \le o(G(K,F)) \le [K:F] < \infty$, and hence $o(H) < \infty$. This shows that $H$ is a finite subgroup of the finite group $G(K,F)$. Further, $F \subset K_H \subset K$.

Observe that

$$G(K, K_H) = \{\sigma : \sigma \in \mathrm{Aut}(K), \text{ and for every } a \in K_H,\ \sigma(a) = a\}\ (\supset H).$$

Since $F \subset K_H \subset K$, and $K$ is a finite extension of $F$, by 1.4.4, $K$ is a finite extension of $K_H$, and hence by 2.2.7, $o(G(K, K_H)) \le [K : K_H] < \infty$. Now, since $H \subset G(K, K_H)$, we have

$$o(H) \le o\big(G(K, K_H)\big) \le [K : K_H] < \infty, \qquad (*)$$

and hence $o(H) \le [K : K_H]$.

Since $1 \le [K : K_H] < \infty$, there exist $\alpha_1, \dots, \alpha_m \in K$ such that $\{\alpha_1, \dots, \alpha_m\}$ is a basis of the vector space $K$ over the field $K_H$. It follows that for every $x \in K$, there exist $a_1, \dots, a_m \in K_H$ such that

$$x = (a_1\alpha_1 + \cdots + a_m\alpha_m) \in K_H(\alpha_1, \dots, \alpha_m).$$

Thus $K \subset K_H(\alpha_1, \dots, \alpha_m)$. Since $K_H \cup \{\alpha_1, \dots, \alpha_m\} \subset K$, we have $K_H(\alpha_1, \dots, \alpha_m) \subset K \subset K_H(\alpha_1, \dots, \alpha_m)$, and hence

$$K = K_H(\alpha_1, \dots, \alpha_m).$$

Since $F \subset K_H \subset K_H(\alpha_1) \subset K_H(\alpha_1, \dots, \alpha_m) = K$, we have $F \subset K_H(\alpha_1) \subset K$. Since $K$ is a normal extension of $F$, $K$ is a finite extension of $F$. Since $F \subset K_H \subset K$, by 1.4.4, $K$ is a finite extension of $K_H$. Next, since $K_H \subset K_H(\alpha_1) \subset K$, by 1.4.4, $K_H(\alpha_1)$ is a finite extension of $K_H$, and hence by 1.4.9, $\alpha_1$ is algebraic over $K_H$. Similarly, $\alpha_2$ is algebraic over $K_H$, etc. By 2.1.16, there exists $a \in K_H(\alpha_1, \dots, \alpha_m)$ such that

$$K_H(\alpha_1, \dots, \alpha_m) = K_H(a).$$
Since $K = K_H(\alpha_1, \dots, \alpha_m)$, we have $a \in K$ and $K = K_H(a)$. Since $K$ is a normal extension of $F$, $(K_H(a) =)\ K$ is a finite extension of $F$, and hence $K_H(a)$ is a finite extension of $F$. Since $F \subset K_H \subset K = K_H(a)$, we have $F \subset K_H \subset K_H(a)$. Now, since $K_H(a)$ is a finite extension of $F$, by 1.4.4, $K_H(a)$ is a finite extension of $K_H$, and hence by 1.4.9, $a$ is algebraic over $K_H$.

Let $a$ be algebraic of degree $n$ over $K_H$.

By 1.4.11, there exists $q(x) \in K_H[x]$ such that $q(x)$ is the minimal polynomial of $a$ over $K_H$, that is,

1. $(K \ni)\ q(a) = 0$;
2. $n = \deg(q(x)) \ge 1$;
3. the leading coefficient of $q(x)$ is 1.

Again by 1.4.12, $q(x)$ is irreducible over $K_H$. Also, since $a$ is algebraic of degree $n$ over $K_H$, by 1.4.16, we have

$$[K : K_H] = [K_H(a) : K_H] = n = \deg(q(x)),$$

and hence $[K : K_H] = \deg(q(x))$.

Since $H$ is a finite subgroup of the finite group $G(K,F)$, we can suppose that $H = \{\sigma_1, \sigma_2, \dots, \sigma_h\}\ (\subset G(K,F))$, $o(H) = h$, and $\sigma_1$ is the identity element of the group $G(K,F)$. It follows that each $\sigma_i(a) \in K$, and $\sigma_1(a) = a$. Put

$$\begin{aligned}
a_1 &\equiv \big(\sigma_1(a) + \sigma_2(a) + \cdots + \sigma_h(a)\big)\ (\in K),\\
a_2 &\equiv \sum_{i<j} \sigma_i(a)\sigma_j(a)\ (\in K),\\
a_3 &\equiv \sum_{i<j<k} \sigma_i(a)\sigma_j(a)\sigma_k(a)\ (\in K),\\
&\ \,\vdots
\end{aligned}$$
We want to show that

$$a_1 \in K_H\ \big(= \{b : b \in K, \text{ and for every } \sigma \in H,\ \sigma(b) = b\}\big).$$

To this end, let us take an arbitrary $\sigma_i \in H$, where $i \in \{1, 2, \dots, h\}$. It suffices to show that $\sigma_i(a_1) = a_1$, that is,

$$\sigma_i\big(\sigma_1(a) + \sigma_2(a) + \cdots + \sigma_h(a)\big) = \sigma_1(a) + \sigma_2(a) + \cdots + \sigma_h(a).$$

Observe that

$$\sigma_i\big(\sigma_1(a) + \sigma_2(a) + \cdots + \sigma_h(a)\big) = \sigma_i\big(\sigma_1(a)\big) + \cdots + \sigma_i\big(\sigma_h(a)\big) = (\sigma_i\sigma_1)(a) + \cdots + (\sigma_i\sigma_h)(a).$$

Also, $\sigma_j \mapsto \sigma_i\sigma_j$ is a one-to-one mapping from $\{\sigma_1, \sigma_2, \dots, \sigma_h\}$ onto $\{\sigma_1, \sigma_2, \dots, \sigma_h\}$, so

$$\mathrm{LHS} = \sigma_i\big(\sigma_1(a) + \sigma_2(a) + \cdots + \sigma_h(a)\big) = (\sigma_i\sigma_1)(a) + \cdots + (\sigma_i\sigma_h)(a) = \sigma_1(a) + \sigma_2(a) + \cdots + \sigma_h(a) = \mathrm{RHS}.$$

Thus $a_1 \in K_H$. Next, we want to show that $a_2 \in K_H\ \big(= \{b : b \in K, \text{ and for every } \sigma \in H,\ \sigma(b) = b\}\big)$.

To this end, let us take an arbitrary $\sigma_i \in H$, where $i \in \{1, 2, \dots, h\}$. It suffices to show that $\sigma_i(a_2) = a_2$, that is,

$$\begin{aligned}
&\sigma_i\Big(\big(\sigma_1(a)\sigma_2(a) + \sigma_1(a)\sigma_3(a) + \cdots\big) + \big(\sigma_2(a)\sigma_3(a) + \sigma_2(a)\sigma_4(a) + \cdots\big) + \cdots\Big)\\
&\quad= \big(\sigma_1(a)\sigma_2(a) + \sigma_1(a)\sigma_3(a) + \cdots\big) + \big(\sigma_2(a)\sigma_3(a) + \sigma_2(a)\sigma_4(a) + \cdots\big) + \cdots.
\end{aligned}$$

Observe that

$$\begin{aligned}
&\sigma_i\Big(\big(\sigma_1(a)\sigma_2(a) + \sigma_1(a)\sigma_3(a) + \cdots\big) + \big(\sigma_2(a)\sigma_3(a) + \sigma_2(a)\sigma_4(a) + \cdots\big) + \cdots\Big)\\
&\quad= \Big(\sigma_i\big(\sigma_1(a)\sigma_2(a)\big) + \sigma_i\big(\sigma_1(a)\sigma_3(a)\big) + \cdots\Big) + \Big(\sigma_i\big(\sigma_2(a)\sigma_3(a)\big) + \sigma_i\big(\sigma_2(a)\sigma_4(a)\big) + \cdots\Big) + \cdots\\
&\quad= \Big(\sigma_i\big(\sigma_1(a)\big)\sigma_i\big(\sigma_2(a)\big) + \sigma_i\big(\sigma_1(a)\big)\sigma_i\big(\sigma_3(a)\big) + \cdots\Big) + \Big(\sigma_i\big(\sigma_2(a)\big)\sigma_i\big(\sigma_3(a)\big) + \sigma_i\big(\sigma_2(a)\big)\sigma_i\big(\sigma_4(a)\big) + \cdots\Big) + \cdots\\
&\quad= \Big((\sigma_i\sigma_1)(a)(\sigma_i\sigma_2)(a) + (\sigma_i\sigma_1)(a)(\sigma_i\sigma_3)(a) + \cdots\Big) + \Big((\sigma_i\sigma_2)(a)(\sigma_i\sigma_3)(a) + (\sigma_i\sigma_2)(a)(\sigma_i\sigma_4)(a) + \cdots\Big) + \cdots\\
&\quad= (\sigma_i\sigma_1)(a)\Big((\sigma_i\sigma_2)(a) + (\sigma_i\sigma_3)(a) + \cdots\Big) + (\sigma_i\sigma_2)(a)\Big((\sigma_i\sigma_3)(a) + (\sigma_i\sigma_4)(a) + \cdots\Big) + \cdots.
\end{aligned}$$

Also, $\sigma_j \mapsto \sigma_i\sigma_j$ is a one-to-one mapping from $\{\sigma_1, \sigma_2, \dots, \sigma_h\}$ onto $\{\sigma_1, \sigma_2, \dots, \sigma_h\}$, so

$$\begin{aligned}
&(\sigma_i\sigma_1)(a)\Big((\sigma_i\sigma_2)(a) + (\sigma_i\sigma_3)(a) + \cdots\Big) + (\sigma_i\sigma_2)(a)\Big((\sigma_i\sigma_3)(a) + (\sigma_i\sigma_4)(a) + \cdots\Big) + \cdots\\
&\quad= \big(\sigma_1(a)\sigma_2(a) + \sigma_1(a)\sigma_3(a) + \cdots\big) + \big(\sigma_2(a)\sigma_3(a) + \sigma_2(a)\sigma_4(a) + \cdots\big) + \cdots,
\end{aligned}$$

and hence LHS = RHS. Thus $a_2 \in K_H$. Similarly, $a_3 \in K_H$, etc. It follows that

$$\big(x^h - a_1x^{h-1} + a_2x^{h-2} - \cdots + (-1)^ha_h\big) \in K_H[x].$$
Since

$$\begin{aligned}
x^h - a_1x^{h-1} + a_2x^{h-2} - \cdots + (-1)^ha_h &= \big(x - \sigma_1(a)\big)\big(x - \sigma_2(a)\big) \cdots \big(x - \sigma_h(a)\big)\\
&= (x - a)\big(x - \sigma_2(a)\big) \cdots \big(x - \sigma_h(a)\big),
\end{aligned}$$

it follows that $a$ is a root of the polynomial $p(x)$ in $K_H[x]$, where

$$p(x) \equiv x^h - a_1x^{h-1} + a_2x^{h-2} - \cdots + (-1)^ha_h.$$

Hence $p(x) \in K_H[x]$, $p(a) = 0$, $h = \deg(p(x)) \ge 1$, and the leading coefficient of $p(x)$ is 1. Now, since $q(x)$ is the minimal polynomial of $a$ over $K_H$, we have

$$[K : K_H] = \deg(q(x)) \le \deg(p(x)) = h = o(H),$$

and hence $[K : K_H] \le o(H)$. Since $o(H) \le [K : K_H]$, we have

$$o(H) = [K : K_H].$$

Next, from (*), $o\big(G(K, K_H)\big) = o(H)$. Now, since $H \subset G(K, K_H)$, we have $H = G(K, K_H)$.
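A small worked instance may help fix the construction of $p(x)$. The Python sketch below assumes $K = \mathbb{Q}(\sqrt{2})$, $H = \{\text{identity}, \text{conjugation}\}$, and $a = 1 + \sqrt{2}$; none of these choices comes from the text. It builds $p(x) = \prod_{\sigma \in H}(x - \sigma(a))$ and checks that every coefficient is left fixed by $H$.

```python
from fractions import Fraction

# An element u + v*sqrt(2) of Q(sqrt(2)) is stored as the pair (u, v).
def add(s, t):
    return (s[0] + t[0], s[1] + t[1])

def neg(s):
    return (-s[0], -s[1])

def mul(s, t):
    # (u + v*sqrt2)(w + z*sqrt2) = (uw + 2vz) + (uz + vw)*sqrt2
    u, v = s
    w, z = t
    return (u * w + 2 * v * z, u * z + v * w)

def identity(s):
    return s

def conj(s):
    return (s[0], -s[1])

H = [identity, conj]
ZERO = (Fraction(0), Fraction(0))
ONE = (Fraction(1), Fraction(0))

def poly_mul(p, q):
    """Multiply polynomials with coefficients in Q(sqrt(2)),
    coefficient lists ordered lowest degree first."""
    out = [ZERO] * (len(p) + len(q) - 1)
    for i, c in enumerate(p):
        for j, d in enumerate(q):
            out[i + j] = add(out[i + j], mul(c, d))
    return out

a = (Fraction(1), Fraction(1))  # the element 1 + sqrt(2)

# p(x) = product over sigma in H of (x - sigma(a))
p = [ONE]
for sigma in H:
    p = poly_mul(p, [neg(sigma(a)), ONE])

# Every coefficient of p should be fixed by every member of H.
coefficients_fixed = all(sigma(c) == c for c in p for sigma in H)

print(p, coefficients_fixed)
```

Here $p(x) = x^2 - 2x - 1$, whose coefficients are rational, and its degree equals $o(H) = 2 = [\mathbb{Q}(\sqrt{2}) : \mathbb{Q}]$, in line with $o(H) = [K : K_H]$ for this choice of $H$.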
We can substitute $G(K,F)$ for $H$ in $o(H) = [K : K_H]$. We get $o(G(K,F)) = [K : K_{G(K,F)}]$. Now observe that

$$K_{G(K,F)} = \{b : b \in K, \text{ and for every } \sigma \in G(K,F),\ \sigma(b) = b\}.$$

Since $K$ is a normal extension of $F$, we have

$$F = \big(\text{fixed field of } G(K,F)\big) = \{b : b \in K, \text{ and for every } \sigma \in G(K,F),\ \sigma(b) = b\} = K_{G(K,F)},$$

and hence $K_{G(K,F)} = F$. Since $o(G(K,F)) = [K : K_{G(K,F)}]$, we have

$$o(G(K,F)) = [K : F].$$

We can substitute $G(K,F)$ for $H$ in $K = K_H(a)$. We get $K = K_{G(K,F)}(a) = F(a)$, and hence

$$K = F(a),$$

where $a \in K$. Further,

$$h = o(H) = o(G(K,F)) = [K : F] = [K : K_{G(K,F)}] = [K : K_H] = \deg(q(x)) = n,$$
so $h = n$. Now

$$p(x) = x^h - a_1x^{h-1} + a_2x^{h-2} - \cdots + (-1)^ha_h$$

becomes

$$p(x) = x^n - a_1x^{n-1} + a_2x^{n-2} - \cdots + (-1)^na_n.$$

Next, $p(x) \in K_H[x]$ becomes $\big(x^n - a_1x^{n-1} + a_2x^{n-2} - \cdots + (-1)^na_n =\big)\ p(x) \in F[x]$, and hence each $a_i$ is in $F$. Also $p(a) = 0$, $h = \deg(p(x)) \ge 1$, and the leading coefficient of $p(x)$ is 1. Further,

$$\begin{aligned}
x^h - a_1x^{h-1} + a_2x^{h-2} - \cdots + (-1)^ha_h &= \big(x - \sigma_1(a)\big)\big(x - \sigma_2(a)\big) \cdots \big(x - \sigma_h(a)\big)\\
&= (x - a)\big(x - \sigma_2(a)\big) \cdots \big(x - \sigma_h(a)\big)
\end{aligned}$$

becomes

$$\begin{aligned}
x^n - a_1x^{n-1} + a_2x^{n-2} - \cdots + (-1)^na_n &= \big(x - \sigma_1(a)\big)\big(x - \sigma_2(a)\big) \cdots \big(x - \sigma_n(a)\big)\\
&= (x - a)\big(x - \sigma_2(a)\big) \cdots \big(x - \sigma_n(a)\big),
\end{aligned}$$

and $H = \{\sigma_1, \sigma_2, \dots, \sigma_h\}$ becomes

$$G(K,F) = \{\sigma_1, \sigma_2, \dots, \sigma_n\}.$$

Since each $\sigma_i(a)$ is in $K$ and $(F[x] \ni)\ p(x) = \big(x - \sigma_1(a)\big)\big(x - \sigma_2(a)\big) \cdots \big(x - \sigma_n(a)\big)$, $K$ splits the polynomial $p(x)$ in $F[x]$ into a product of linear factors in $K[x]$.

We shall show that $K$ is a splitting field over $F$ for $p(x)$.

Assume to the contrary that $G$ is a proper subfield of $K\ (= F(a))$ that contains $F$ as well as all the roots of $p(x)$ in $K$. We seek a contradiction.

Since $\sigma_1(a)$ is a root of $p(x)$ in $K$, and $G$ contains all the roots of $p(x)$ in $K$, we have $\sigma_1(a) \in G$. Now, since $\sigma_1(a) = a$, we have $a \in G$. Thus $F \cup \{a\} \subset G$, and hence $K = F(a) \subset G$. Thus $K \subset G$. This contradicts the fact that $G$ is a proper subset of $K$.

Thus we have shown that $K$ is a splitting field over $F$ for $p(x)$.
2.2.17 Conclusion Let $F$ and $K$ be any fields such that $K$ is a normal extension of $F$. Let $H$ be a subgroup of the group $G(K,F)\ (\subset \mathrm{Aut}(K))$. Let $K_H$ be the fixed field of $H$. Then
1. $o(H) = [K : K_H]$,
2. $H = G(K, K_H)$,
3. $o(G(K,F)) = [K : F]$,
4. there exists $a \in K$ such that $K = F(a)$, and $K$ is a splitting field over $F$ for $(x - \sigma_1(a))(x - \sigma_2(a)) \cdots (x - \sigma_n(a))$ in $F[x]$, where $G(K,F) = \{\sigma_1, \sigma_2, \ldots, \sigma_n\}$ and $\sigma_1(a) = a$.
2.2.18 Note Let $F$ and $K$ be any fields such that $F \subset K$. Let $f(x) \in F[x]$. Let $K$ be a splitting field over $F$ for $f(x)$. Suppose that $\deg(f(x)) \ge 1$. Let $p(x)$ be an irreducible factor of $f(x)$ in $F[x]$. Suppose that all the roots of $p(x)$ are $a_1, a_2, \ldots, a_r$.
Since $p(x)$ is a factor of $f(x)$ in $F[x]$, the roots $a_1, a_2, \ldots, a_r$ of $p(x)$ are roots of $f(x)$. Since $K$ is a splitting field over $F$ for $f(x)$, $K$ contains all the roots of $f(x)$, and hence $K$ contains $a_1, a_2, \ldots, a_r$.
Let us fix an arbitrary $i \in \{2, 3, \ldots, r\}$.
Since $a_1, a_i$ are members of $K$, $p(x) \in F[x]$, $p(x)$ is irreducible over $F$, and $a_1, a_i$ are roots of $p(x)$ in $K$, by 1.5.18, there exists an isomorphism $\tau_i$ from the field $F(a_1)\ (\subset K)$ onto the field $F(a_i)$ such that
1. $\tau_i(a_1) = a_i$,
2. for every $a \in F$, $\tau_i(a) = a$.
Since $K$ is a splitting field over $F$ for $f(x)$, $K$ is a finite extension of $F$. Since $F \cup \{a_1\} \subset K$, we have $F \subset F(a_1) \subset K$. Since $K$ is a finite extension of $F$, by 1.4.4, $K$ is a finite extension of $F(a_1)$. Since $f(x) \in F[x]\ (\subset (F(a_1))[x])$, we have $f(x) \in (F(a_1))[x]$.
We want to show that $K$ is a splitting field over $F(a_1)$ for $f(x)$.
To this end, let us take a proper subfield $G$ of $K$ that contains $F(a_1)\ (\supset F)$. It suffices to show that $G$ does not contain all the roots of $f(x)$. Since $G$ is a proper subfield of $K$ that contains $F$, and $K$ is a splitting field over $F$ for $f(x)$, $G$ does not contain all the roots of $f(x)$.
Thus $K$ is a splitting field for $f(x)$ considered as a polynomial over $F_1$, where $F_1 \equiv F(a_1)$. Similarly, $K$ is a splitting field for $f(x)$ considered as a polynomial over $(F_1)'$, where $(F_1)' \equiv F(a_i)$. Now, by 1.5.27, there exists a ring isomorphism $\sigma_i$ from $K$ onto $K$ such that for every $a \in F_1 = F(a_1)\ (\supset F \cup \{a_1\})$, $\sigma_i(a) = \tau_i(a)$.
Hence for every $a \in F$, $\sigma_i(a) = \tau_i(a) = a$. Also, $\sigma_i(a_1) = \tau_i(a_1) = a_i$. Since $\sigma_i$ is a ring isomorphism from the field $K$ onto $K$, we have $\sigma_i \in \mathrm{Aut}(K)$. Next, since for every $a \in F$, $\sigma_i(a) = a$, we have $\sigma_i \in G(K,F)$.
2.2.19 Conclusion Let $F$ and $K$ be any fields such that $F \subset K$. Let $f(x) \in F[x]$. Let $K$ be a splitting field over $F$ for $f(x)$. Suppose that $\deg(f(x)) \ge 1$. Let $p(x)$ be an irreducible factor of $f(x)$ in $F[x]$. Suppose that all the roots of $p(x)$ are $a_1, a_2, \ldots, a_r$. Then for every $i \in \{1, 2, \ldots, r\}$, there exists $\sigma_i \in G(K,F)$ such that $\sigma_i(a_1) = a_i$.
2.2.20 Problem Let $F$ and $K$ be any fields such that $K$ is an extension of $F$. Let $f(x) \in F[x]$. Let $K$ be a splitting field over $F$ for $f(x)$. Suppose that $\deg(f(x)) \ge 1$. Then $K$ is a normal extension of $F$.
Proof Case I: $f(x)$ splits into linear factors over $F$.
Since $f(x)$ splits into linear factors over $F$, $F$ is a splitting field over $F$ for $f(x)$. Now, since $K$ is a splitting field over $F$ for $f(x)$, by 1.5.29, $K = F$. Since $F$ is a normal extension of $F$, $K$ is a normal extension of $F$.
Case II: $f(x)$ does not split into linear factors over $F$.
For induction on $[K : F]$, let us assume that for every pair of fields $K_1, F_1$ with $[K_1 : F_1] < [K : F]$,
\[ (K_1 \text{ is a splitting field over } F_1 \text{ of some polynomial in } F_1[x]) \Rightarrow (K_1 \text{ is a normal extension of } F_1). \qquad (*) \]
Since $f(x)$ does not split into linear factors over $F$, by 1.2.21, there exists an irreducible factor $p(x)$ of $f(x)$ in $F[x]$ such that $\deg(p(x)) \ge 2$. Suppose that all the roots of $p(x)$ are $a_1, a_2, \ldots, a_r$, where $r \equiv \deg(p(x))\ (\ge 2)$. Since $K$ is a splitting field over $F$ for $f(x)$, $K$ is a finite extension of $F$. Since $a_1$ is a root of $p(x)$, and $p(x)$ is irreducible over $F$, $a_1$ is nonzero.
Since $p(x)$ is a factor of $f(x)$ in $F[x]$, the roots $a_1, a_2, \ldots, a_r$ of $p(x)$ are roots of $f(x)$. Since $K$ is a splitting field over $F$ for $f(x)$, $K$ contains all the roots of $f(x)$, and hence $K$ contains $a_1, a_2, \ldots, a_r$. It follows that $F \cup \{a_1\} \subset K$, and hence $F \subset F(a_1) \subset K$. Now, since $K$ is a finite extension of $F$, by 1.4.4, $F(a_1)$ is a finite extension of $F$, and hence by 1.4.9, $a_1$ is algebraic over $F$.
By 1.4.3,
\[ [K : F] = [K : F(a_1)]\,[F(a_1) : F]. \]
Since $a_1$ is a root of $p(x)$, we have $p(a_1) = 0$. Further, since $p(x)$ is irreducible in $F[x]$ and $r = \deg(p(x)) \ge 2$, $a_1$ is algebraic of degree $r\ (\ge 2)$ over $F$. Now, by 1.4.16, $[F(a_1) : F] = r$, and hence
\[ [K : F] = [K : F(a_1)]\,[F(a_1) : F] = [K : F(a_1)]\, r > [K : F(a_1)] \cdot 1 = [K : F(a_1)]. \]
Thus $[K : F(a_1)] < [K : F]$.
Since $F \subset F(a_1) \subset K$ and $K$ is a finite extension of $F$, by 1.4.4, $K$ is a finite extension of $F(a_1)$. Since $f(x) \in F[x]\ (\subset (F(a_1))[x])$, we have $f(x) \in (F(a_1))[x]$.
We want to show that $K$ is a splitting field over $F(a_1)$ for $f(x)$.
To this end, let us take a proper subfield $G$ of $K$ that contains $F(a_1)\ (\supset F)$. It suffices to show that $G$ does not contain all the roots of $f(x)$. Since $G$ is a proper subfield of $K$ that contains $F$, and $K$ is a splitting field over $F$ for $f(x)$, $G$ does not contain all the roots of $f(x)$.
Thus $K$ is a splitting field over $F(a_1)$ for $f(x)$. Next, since $[K : F(a_1)] < [K : F]$, by the induction hypothesis (*), $K$ is a normal extension of $F(a_1)$.
We claim that $K$ is a normal extension of $F$.
Suppose to the contrary that $K$ is not a normal extension of $F$. We seek a contradiction.
Since $K$ is a finite extension of $F$, and $K$ is not a normal extension of $F$, we have $(\text{fixed field of } G(K,F)) \not\subset F$. It follows that there exists $h \in (\text{fixed field of } G(K,F))$ such that $h \notin F$. Since $K$ is a normal extension of $F(a_1)$, we have
\[ (\text{fixed field of } G(K, F(a_1))) = F(a_1). \]
Since $F \subset F(a_1)$, we have
\[ G(K, F(a_1)) = \{\sigma : \sigma \in \mathrm{Aut}(K), \text{ for every } a \in F(a_1),\ \sigma(a) = a\} \subset \{\sigma : \sigma \in \mathrm{Aut}(K), \text{ for every } a \in F,\ \sigma(a) = a\} = G(K,F), \]
and hence
\[ G(K, F(a_1)) \subset G(K,F). \]
It follows that
\[ h \in (\text{fixed field of } G(K,F)) = \{a : a \in K, \text{ for every } \sigma \in G(K,F),\ \sigma(a) = a\} \subset \{a : a \in K, \text{ for every } \sigma \in G(K, F(a_1)),\ \sigma(a) = a\} = (\text{fixed field of } G(K, F(a_1))) = F(a_1), \]
and hence $h \in F(a_1)$. Also $h \notin F$.
Since $a_1$ is algebraic of degree $r\ (\ge 2)$ over $F$, $\{1, a_1, (a_1)^2, \ldots, (a_1)^{r-1}\}$ is a linearly independent set of vectors in the vector space $F(a_1)$ over the field $F$.
Proof Suppose to the contrary that there exist $c_0, c_1, \ldots, c_{r-1}$ in $F$ such that not all the $c_i$ are zero and
\[ c_0 1 + c_1 a_1 + \cdots + c_{r-1} (a_1)^{r-1} = 0. \]
We seek a contradiction. Here, it follows that $q(x) \equiv c_0 + c_1 x + \cdots + c_{r-1} x^{r-1}$ is a nonzero polynomial in $F[x]$ such that $q(a_1) = 0$ and $\deg(q(x)) \le r - 1 < r$. This contradicts the fact that $a_1$ is algebraic of degree $r$ over $F$. ■
Thus, we have shown that $\{1, a_1, (a_1)^2, \ldots, (a_1)^{r-1}\}$ is a linearly independent set of vectors in the vector space $F(a_1)$ over the field $F$. Since $[F(a_1) : F] = r$, the dimension of the vector space $F(a_1)$ over the field $F$ is $r$, and hence $\{1, a_1, (a_1)^2, \ldots, (a_1)^{r-1}\}$ constitutes a basis for the vector space $F(a_1)$ over the field $F$. Now, since $h \in F(a_1)$, there exist $k_0, k_1, \ldots, k_{r-1}$ in $F$ such that
\[ h = k_0 1 + k_1 a_1 + \cdots + k_{r-1} (a_1)^{r-1}. \]
Since $K$ is a splitting field over $F$ for $f(x)$, $\deg(f(x)) \ge 1$, $p(x)$ is an irreducible factor of $f(x)$ in $F[x]$, and all the roots of $p(x)$ are $a_1, a_2, \ldots, a_r$, by 2.2.19, for every $i \in \{1, 2, \ldots, r\}$, there exists $\sigma_i \in G(K,F)$ such that $\sigma_i(a_1) = a_i$. It follows that for every $i \in \{1, 2, \ldots, r\}$,
\[
\begin{aligned}
\sigma_i(h) &= \sigma_i\bigl(k_0 1 + k_1 a_1 + \cdots + k_{r-1} (a_1)^{r-1}\bigr) \\
&= \sigma_i(k_0 1) + \sigma_i(k_1 a_1) + \cdots + \sigma_i\bigl(k_{r-1} (a_1)^{r-1}\bigr) \\
&= \sigma_i(k_0)\sigma_i(1) + \sigma_i(k_1)\sigma_i(a_1) + \cdots + \sigma_i(k_{r-1}) (\sigma_i(a_1))^{r-1} \\
&= \sigma_i(k_0)\sigma_i(1) + \sigma_i(k_1) a_i + \cdots + \sigma_i(k_{r-1}) (a_i)^{r-1} \\
&= \sigma_i(k_0) 1 + \sigma_i(k_1) a_i + \cdots + \sigma_i(k_{r-1}) (a_i)^{r-1} \\
&= k_0 1 + k_1 a_i + \cdots + k_{r-1} (a_i)^{r-1},
\end{aligned}
\]
and hence
\[ \sigma_i(h) = k_0 1 + k_1 a_i + \cdots + k_{r-1} (a_i)^{r-1} \quad (i = 1, 2, \ldots, r). \]
Since
\[ h \in (\text{fixed field of } G(K,F)) = \{a : a \in K, \text{ and for every } \sigma \in G(K,F),\ \sigma(a) = a\}, \]
and each $\sigma_i$ is in $G(K,F)$, we have $\sigma_i(h) = h$ $(i = 1, 2, \ldots, r)$. It follows, from the preceding formula for $\sigma_i(h)$, that
\[ h = k_0 1 + k_1 a_i + \cdots + k_{r-1} (a_i)^{r-1} \quad (i = 1, 2, \ldots, r). \]
This shows that $a_1, a_2, \ldots, a_r$ ($r$ in number) are roots of the polynomial $(k_0 - h) + k_1 x + \cdots + k_{r-1} x^{r-1}$. Here $(k_0 - h) + k_1 x + \cdots + k_{r-1} x^{r-1}$ is a polynomial of degree at most $r - 1\ (< r)$ with $r$ distinct roots, so it is the zero polynomial of $K[x]$. It follows that $k_0 - h = 0$, and hence $h = k_0$. Now, since $k_0 \in F$, we have $h \in F$. This is a contradiction.
Thus our claim is substantiated, and hence $K$ is a normal extension of $F$.
So in all cases, $K$ is a normal extension of $F$. ■
Definition Let $F$ and $K$ be any fields such that $K$ is an extension of $F$. Let $f(x) \in F[x]$. Let $K$ be a splitting field over $F$ for $f(x)$. The group $G(K,F)\ (= \{\sigma : \sigma \in \mathrm{Aut}(K), \text{ and for every } a \in F,\ \sigma(a) = a\})$ is called the Galois group of $f(x)$.
2.2.21 Note Let $F$ and $K$ be any fields such that $K$ is an extension of $F$. Let $f(x)$ be a nonzero member of $F[x]$. Let $K$ be a splitting field over $F$ for $f(x)$. Suppose that $\deg(f(x)) \ge 1$. Let $T$ be a subfield of $K$ that contains $F$. Put
\[ G(K,T) \equiv \{\sigma : \sigma \in G(K,F), \text{ and for every } t \in T,\ \sigma(t) = t\}. \]
Clearly, $G(K,T)$ is a subgroup of the group $G(K,F)$. For every subgroup $H$ of the group $G(K,F)$, put
\[ K_H \equiv \{a : a \in K, \text{ and for every } \sigma \in H,\ \sigma(a) = a\}. \]
Clearly, $K$ is a splitting field over $T$ for $f(x)$.
Proof Since $K$ is a splitting field over $F$ for $f(x)$, $K$ is a finite extension of $F$. Now since $F \subset T \subset K$, by 1.4.4, $K$ is a finite extension of $T$. Let $G$ be a proper subfield of $K$ which contains $T\ (\supset F)$. We have to show that $G$ does not contain all the roots of $f(x)$. Since $G$ is a proper subfield of $K$ which contains $F$, and $K$ is a splitting field over $F$ for $f(x)$, $G$ does not contain all the roots of $f(x)$. ■
Thus we have shown that $K$ is a splitting field over $T$ for $f(x)$. Now, by 2.2.20, $K$ is a normal extension of $T$, and hence, by definition of normal extension,
\[ T = (\text{fixed field of } G(K,T)) = \{a : a \in K, \text{ and for every } \sigma \in G(K,T),\ \sigma(a) = a\} = K_{G(K,T)}, \]
and hence $T = K_{G(K,T)}$. Since $K$ is a normal extension of $T$, by 2.2.17, $o(G(K,T)) = [K : T]$. Since $K$ is a splitting field over $F$ for $f(x)$, by 2.2.20, $K$ is a normal extension of $F$, and hence, by 2.2.17, $o(G(K,F)) = [K : F]$ and $H = G(K, K_H)$ for every subgroup $H$ of $G(K,F)$. Since $K$ is a finite extension of $F$, and $F \subset T \subset K$, by 1.4.4 and 1.4.3, we have
\[ o(G(K,F)) = [K : F] = [K : T]\,[T : F] = o(G(K,T))\,[T : F], \]
and hence
\[ [T : F] = \frac{o(G(K,F))}{o(G(K,T))} = (\text{index of the subgroup } G(K,T) \text{ of the group } G(K,F)). \]
2.2.22 Conclusion Let $F$ and $K$ be any fields such that $F \subset K$. Let $K$ be an extension of $F$. Let $f(x)$ be a nonzero member of $F[x]$. Let $K$ be a splitting field over $F$ for $f(x)$. Suppose that $\deg(f(x)) \ge 1$. Let $T$ be a subfield of $K$ which contains $F$. Put
\[ G(K,T) \equiv \{\sigma : \sigma \in G(K,F) \text{ and for every } t \in T,\ \sigma(t) = t\}. \]
Clearly, $G(K,T)$ is a subgroup of the group $G(K,F)$. For every subgroup $H$ of the group $G(K,F)$, put
\[ K_H \equiv \{a : a \in K, \text{ and for every } \sigma \in H,\ \sigma(a) = a\}. \]
Then:
1. $T = K_{G(K,T)}$, that is, the mapping $\Phi : H \mapsto K_H$ from the collection of all subgroups of the group $G(K,F)$ to the collection of all subfields of $K$ that contain $F$ is onto.
2. $H = G(K, K_H)$, that is, the mapping $\Psi : T \mapsto G(K,T)$ from the collection of all subfields of $K$ that contain $F$ to the collection of all subgroups of the group $G(K,F)$ is onto. Also $(\Psi \circ \Phi)(H) = \Psi(\Phi(H)) = \Psi(K_H) = G(K, K_H) = H$, so $(\Psi \circ \Phi)(H) = H$. Next, $(\Phi \circ \Psi)(T) = \Phi(\Psi(T)) = \Phi(G(K,T)) = K_{G(K,T)} = T$, so $(\Phi \circ \Psi)(T) = T$. Thus $\Psi^{-1} = \Phi$.
3. It follows that $\Psi : T \mapsto G(K,T)$ is a one-to-one correspondence from the collection of all subfields of $K$ that contain $F$ onto the collection of all subgroups of the group $G(K,F)$.
4. $o(G(K,T)) = [K : T]$.
5. $[T : F]$ is equal to the index of the subgroup $G(K,T)$ in the group $G(K,F)$, that is, $[T : F] = \dfrac{o(G(K,F))}{o(G(K,T))}$.
6. $o(G(K,F)) = [K : F] = [K : T]\,[T : F] = o(G(K,T))\,[T : F]$.
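As a sanity check of items 1, 2 and 4, the correspondence can be tabulated for one small case. The sketch below is only an illustration under stated assumptions, not part of the text's argument: it takes $K = \mathbb{Q}(\sqrt2, \sqrt3)$, $F = \mathbb{Q}$, and assumes (as a known fact, not derived here) that $G(K,F)$ consists of the four maps that change the signs of $\sqrt2$ and $\sqrt3$ independently. The names `GROUP`, `basis_signs`, `fixed_field` and `fixing_group` are hypothetical helpers introduced for this example.

```python
from itertools import combinations

# K = Q(sqrt2, sqrt3) with Q-basis (1, sqrt2, sqrt3, sqrt6).
# Assumption: G(K, Q) consists of the four maps sqrt2 -> s2*sqrt2, sqrt3 -> s3*sqrt3
# with s2, s3 in {1, -1}; each map is stored as the pair (s2, s3).
GROUP = [(1, 1), (-1, 1), (1, -1), (-1, -1)]
DEGREE = 4  # [K : Q]


def basis_signs(aut):
    """Signs by which aut scales the basis vectors 1, sqrt2, sqrt3, sqrt6."""
    s2, s3 = aut
    return (1, s2, s3, s2 * s3)


def compose(a, b):
    """Composition of two sign maps."""
    return (a[0] * b[0], a[1] * b[1])


def subgroups():
    """All subgroups of GROUP, by brute force over subsets containing the identity."""
    found = []
    for size in range(1, len(GROUP) + 1):
        for subset in combinations(GROUP, size):
            h = frozenset(subset)
            if (1, 1) in h and all(compose(a, b) in h for a in h for b in h):
                found.append(h)
    return found


def fixed_field(h):
    """Indices of the basis vectors fixed by every element of h.

    Each automorphism acts diagonally on the basis, so these vectors span K_H.
    """
    return frozenset(i for i in range(DEGREE)
                     if all(basis_signs(a)[i] == 1 for a in h))


def fixing_group(field):
    """G(K, T) for the subfield T spanned by the given basis vectors."""
    return frozenset(a for a in GROUP
                     if all(basis_signs(a)[i] == 1 for i in field))


for h in subgroups():
    t = fixed_field(h)
    # Columns: H, basis of K_H, whether o(H) = [K : K_H], whether G(K, K_H) = H.
    print(sorted(h), sorted(t), len(h) == DEGREE // len(t), fixing_group(t) == h)
```

In this toy case the five subgroups of the Klein four-group pair off with the five intermediate fields, and the order of each subgroup equals the degree of $K$ over its fixed field.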
2.2.23 Problem Let $F$ and $K$ be any fields such that $K$ is an extension of $F$. Let $T$ be a subfield of $K$ that contains $F$. Suppose that $T$ is a normal extension of $F$. Let $\sigma \in G(K,F)$. Then $\sigma(T) \subset T$.
Proof Suppose to the contrary that there exists $h \in T$ such that $\sigma(h) \notin T$. We seek a contradiction.
Since $T$ is a normal extension of $F$, by 2.2.17, there exists $a \in T$ such that $T = F(a)$ and $T$ is a splitting field over $F$ for
\[ (x - \sigma_1(a))(x - \sigma_2(a)) \cdots (x - \sigma_n(a)) \]
in $F[x]$, where $G(T,F) = \{\sigma_1, \sigma_2, \ldots, \sigma_n\}$ and $\sigma_1(a) = a$. Thus $T$ is a splitting field over $F$ for $p(x) \in F[x]$, where
\[ p(x) \equiv (x - \sigma_1(a))(x - \sigma_2(a)) \cdots (x - \sigma_n(a))\ \bigl(= (x - a)(x - \sigma_2(a)) \cdots (x - \sigma_n(a))\bigr). \]
Suppose that
\[ p(x) \equiv x^n + b_1 x^{n-1} + \cdots + b_n, \]
where each $b_i$ is in $F$. Clearly, $p(a) = 0$. Now,
\[
\begin{aligned}
p(\sigma(a)) &= (\sigma(a))^n + b_1 (\sigma(a))^{n-1} + \cdots + b_n \\
&= \sigma(a^n) + b_1 \sigma(a^{n-1}) + \cdots + b_n \\
&= \sigma(a^n) + \sigma(b_1)\sigma(a^{n-1}) + \cdots + \sigma(b_n) \\
&= \sigma(a^n) + \sigma(b_1 a^{n-1}) + \cdots + \sigma(b_n) \\
&= \sigma(a^n + b_1 a^{n-1} + \cdots + b_n) = \sigma(p(a)) = \sigma(0) = 0,
\end{aligned}
\]
so $p(\sigma(a)) = 0$, and hence $\sigma(a)$ is a root of $p(x)$. Now, since $T$ is a splitting field over $F$ for $p(x) \in F[x]$, we have $\sigma(a) \in T\ (= F(a))$, and hence $\sigma(a) \in F(a)$.
Since $T$ is a normal extension of $F$, $T\ (= F(a))$ is a finite extension of $F$, and hence $F(a)$ is a finite extension of $F$. Since $\sigma(h) \notin T = F(a)$, we have $\sigma(h) \notin F(a)$. Since $h \in T = F(a)$, we have $h \in F(a)$. Since $F(a)$ is a finite extension of $F$, by 1.4.9, $a$ is algebraic over $F$. Let $a\ (\in F(a))$ be algebraic of degree $n$ over $F$. It follows, by 1.4.2, that $[F(a) : F] = n$, and hence $n$ is the dimension of the vector space $F(a)$ over the field $F$.
Clearly, $\{1, a, a^2, \ldots, a^{n-1}\}$ is a linearly independent set of vectors for the vector space $F(a)$ over the field $F$.
Proof Suppose to the contrary that there exist $k_0, k_1, \ldots, k_{n-1} \in F$ such that not all the $k_i$ are zero, and
\[ k_0 1 + k_1 a + \cdots + k_{n-1} a^{n-1} = 0. \]
We seek a contradiction. It follows that $a$ is a root of the nonzero polynomial $k_0 + k_1 x + \cdots + k_{n-1} x^{n-1}$ in $F[x]$. Further, the degree of this polynomial is strictly smaller than $n$. This contradicts the fact that $a$ is algebraic of degree $n$ over $F$. ■
Thus we have shown that $\{1, a, a^2, \ldots, a^{n-1}\}$ is a linearly independent set of vectors for the vector space $F(a)$ over the field $F$. Now, since $n$ is the dimension of the vector space $F(a)$ over the field $F$, $\{1, a, a^2, \ldots, a^{n-1}\}$ is a basis of the vector space $F(a)$ over the field $F$. Next, since $h \in F(a)$, there exist $c_0, c_1, \ldots, c_{n-1} \in F$ such that
\[ h = c_0 1 + c_1 a + \cdots + c_{n-1} a^{n-1}. \]
Hence
\[
\begin{aligned}
\sigma(h) &= \sigma\bigl(c_0 1 + c_1 a + \cdots + c_{n-1} a^{n-1}\bigr) \\
&= \sigma(c_0 1) + \sigma(c_1 a) + \cdots + \sigma(c_{n-1} a^{n-1}) \\
&= \sigma(c_0)\sigma(1) + \sigma(c_1)\sigma(a) + \cdots + \sigma(c_{n-1})\sigma(a^{n-1}) \\
&= \sigma(c_0)\sigma(1) + \sigma(c_1)\sigma(a) + \cdots + \sigma(c_{n-1})(\sigma(a))^{n-1} \\
&= c_0 \sigma(1) + c_1 \sigma(a) + \cdots + c_{n-1}(\sigma(a))^{n-1} \\
&= c_0 1 + c_1 \sigma(a) + \cdots + c_{n-1}(\sigma(a))^{n-1},
\end{aligned}
\]
so
\[ F(a) \not\ni \sigma(h) = c_0 1 + c_1 \sigma(a) + \cdots + c_{n-1}(\sigma(a))^{n-1}, \]
and hence $\bigl(c_0 1 + c_1 \sigma(a) + \cdots + c_{n-1}(\sigma(a))^{n-1}\bigr) \notin F(a)$. Since each $c_i$ is in $F \subset F(a)$, it follows that $\sigma(a) \notin F(a)$. This is a contradiction. ■
2.2.24 Problem Let $F$ and $K$ be any fields such that $K$ is an extension of $F$. Let $T$ be a subfield of $K$ that contains $F$. Suppose that $T$ is a normal extension of $F$. Clearly, $G(K,T)$ is a subgroup of the group $G(K,F)$. Then $G(K,T)$ is a normal subgroup of the group $G(K,F)$.
Proof Let us take any $\tau \in G(K,T)$ and $\sigma \in G(K,F)$. We have to show that $\sigma^{-1}\tau\sigma \in G(K,T)$. To this end, let us take any $t \in T$. It suffices to show that $(\sigma^{-1}\tau\sigma)(t) = t$, that is, $\sigma^{-1}(\tau(\sigma(t))) = t$, that is, $\tau(\sigma(t)) = \sigma(t)$. Since $\tau \in G(K,T)$, it is enough to show that $\sigma(t) \in T$. By 2.2.23, $\sigma(T) \subset T$. Now, since $\sigma(t) \in \sigma(T)$, we have $\sigma(t) \in T$. ■
2.2.25 Note Let $F$ and $K$ be any fields such that $K$ is an extension of $F$. Let $T$ be a subfield of $K$ that contains $F$. Suppose that $T$ is a normal extension of $F$. By 2.2.24, $G(K,T)$ is a normal subgroup of the group $G(K,F)$. Hence $G(K,F)/G(K,T)$ is a quotient group. Also, $G(T,F)$ is a group.
Take an arbitrary $\sigma \in G(K,F)$. By 2.2.23, $\sigma(T) \subset T$. Since $\sigma \in G(K,F)$ and $G(K,F)$ is a group, the inverse function $\sigma^{-1}$ is in $G(K,F)$, and hence by 2.2.23, $\sigma^{-1}(T) \subset T$. It follows that $T \subset \sigma(T)$. Thus $\sigma(T) = T$. Now, since $\sigma \in G(K,F)$ and $F \subset T \subset K$, we have $(\sigma|_T) \in G(T,F)$. Thus
\[ \eta : \sigma \mapsto (\sigma|_T) \]
is a mapping from the group $G(K,F)$ to the group $G(T,F)$.
$\eta$ preserves the binary operation: To show this, let us take arbitrary $\sigma, \mu \in G(K,F)$. We have to show that $(\sigma\mu)|_T = (\sigma|_T)(\mu|_T)$.
Let us take an arbitrary $a \in T$. We have to show that $((\sigma\mu)|_T)(a) = ((\sigma|_T)(\mu|_T))(a)$, that is, $(\sigma\mu)(a) = (\sigma|_T)((\mu|_T)(a))$, that is, $(\sigma\mu)(a) = (\sigma|_T)(\mu(a))$.
Since $\mu \in G(K,F)$, as above, we have $\mu(T) = T$. Since $a \in T$, we have $\mu(a) \in \mu(T) = T$, and hence $\mu(a) \in T$. It follows that $(\sigma|_T)(\mu(a)) = \sigma(\mu(a)) = (\sigma\mu)(a)$. Thus $\eta$ preserves the binary operation.
$\ker(\eta) = G(K,T)$: Let us take an arbitrary $\sigma \in \ker(\eta)$, that is, $\sigma \in G(K,F)$ and $(\sigma|_T) = \mathrm{Id}_T$. Since $\sigma \in G(K,F)$, we have $\sigma \in \mathrm{Aut}(K)$. Next, since $(\sigma|_T) = \mathrm{Id}_T$, for every $t \in T$, $\sigma(t) = t$. This shows that $\sigma \in G(K,T)$. Thus $\ker(\eta) \subset G(K,T)$.
Let us take an arbitrary $\sigma \in G(K,T)$, that is, $\sigma \in \mathrm{Aut}(K)$ and $(\sigma|_T) = \mathrm{Id}_T$. We have to show that $\sigma \in \ker(\eta)$, that is, $\sigma \in G(K,F)$ and $(\sigma|_T) = \mathrm{Id}_T$. It remains to show that $(\sigma|_F) = \mathrm{Id}_F$. Since $(\sigma|_T) = \mathrm{Id}_T$ and $F \subset T$, we have $(\sigma|_F) = \mathrm{Id}_F$. Thus $G(K,T) \subset \ker(\eta)$. Hence $\ker(\eta) = G(K,T)$.
Since $\eta : G(K,F) \to G(T,F)$ preserves the group binary operations, $\eta$ is a homomorphism from $G(K,F)$ onto $\eta(G(K,F))$, and hence by the fundamental theorem of group homomorphisms, the quotient group $G(K,F)/\ker(\eta)\ (= G(K,F)/G(K,T))$ is isomorphic to $\eta(G(K,F))$. It follows that $G(K,F)/G(K,T)$ is isomorphic to $\eta(G(K,F))$, and hence
\[ \frac{o(G(K,F))}{o(G(K,T))} = o\!\left(\frac{G(K,F)}{G(K,T)}\right) = o(\eta(G(K,F))). \]
By 2.2.21,
\[ [T : F] = \frac{o(G(K,F))}{o(G(K,T))} = o(\eta(G(K,F))) \le o(G(T,F)). \]
Since $T$ is a normal extension of $F$, by 2.2.17, $o(G(T,F)) = [T : F]$, and hence
\[ [T : F] = \frac{o(G(K,F))}{o(G(K,T))} = o(\eta(G(K,F))) \le o(G(T,F)) = [T : F]. \]
This shows that
\[ o(\eta(G(K,F))) = o(G(T,F)). \]
Since $\eta : G(K,F) \to G(T,F)$, we have $\eta(G(K,F)) \subset G(T,F)$. Since $o(\eta(G(K,F))) = o(G(T,F))$, we have $\eta(G(K,F)) = G(T,F)$, and since $G(K,F)/G(K,T)$ is isomorphic to $\eta(G(K,F))$, $G(K,F)/G(K,T)$ is isomorphic to $G(T,F)$.
2.2.26 Conclusion Let $F$ and $K$ be any fields such that $K$ is an extension of $F$. Let $T$ be a subfield of $K$ that contains $F$. Suppose that $T$ is a normal extension of $F$. Then the quotient group $G(K,F)/G(K,T)$ is isomorphic to the group $G(T,F)$.
This result is known as the fundamental theorem of Galois theory.
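The restriction map $\eta$ of 2.2.25 can be traced in the same small case. The following sketch is a hedged illustration only: it assumes $K = \mathbb{Q}(\sqrt2, \sqrt3)$, $T = \mathbb{Q}(\sqrt2)$, $F = \mathbb{Q}$, and encodes each automorphism by the signs it puts on $\sqrt2$ and $\sqrt3$ (an encoding chosen for this example, not taken from the text). The names `G_K_F`, `G_T_F`, `eta` and `basis_signs` are hypothetical.

```python
# K = Q(sqrt2, sqrt3), T = Q(sqrt2), F = Q.
# Assumption: an element of G(K, F) is stored as (s2, s3), the signs it puts on
# sqrt2 and sqrt3; an element of G(T, F) is stored as the single sign on sqrt2.
G_K_F = [(1, 1), (-1, 1), (1, -1), (-1, -1)]
G_T_F = [1, -1]


def compose(a, b):
    """Composition in G(K, F)."""
    return (a[0] * b[0], a[1] * b[1])


def basis_signs(aut):
    """Signs by which aut scales the basis 1, sqrt2, sqrt3, sqrt6 of K over Q."""
    s2, s3 = aut
    return (1, s2, s3, s2 * s3)


def eta(aut):
    """Restriction of aut to T: only the action on sqrt2 remains."""
    return aut[0]


# Kernel of eta, and G(K, T) computed independently as the maps fixing the
# Q-basis (1, sqrt2) of T, i.e. basis positions 0 and 1.
kernel = {a for a in G_K_F if eta(a) == 1}
G_K_T = {a for a in G_K_F if all(basis_signs(a)[i] == 1 for i in (0, 1))}

# Cosets of G(K, T) in G(K, F), and the set of values eta takes on each coset.
cosets = {frozenset(compose(a, k) for k in G_K_T) for a in G_K_F}
images = {coset: {eta(a) for a in coset} for coset in cosets}

print(kernel == G_K_T, len(cosets), sorted(v for vals in images.values() for v in vals))
```

Here $\eta$ is constant on each coset of $G(K,T)$ and takes a different value on each, which is the isomorphism $G(K,F)/G(K,T) \cong G(T,F)$ in this instance.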
2.3 Applications of Galois Theory
2.3.1 Definition Let $G$ be a group. If there exists a finite collection $\{N_0, N_1, \ldots, N_k\}$ of subgroups of $G$ such that
1. $G = N_0 \supset N_1 \supset \cdots \supset N_k = \{e\}$, where $e$ denotes the identity element of $G$,
2. for every $i = 1, \ldots, k$, $N_i$ is a normal subgroup of $N_{i-1}$,
3. for every $i = 1, \ldots, k$, the quotient group $N_{i-1}/N_i$ is abelian,
then we say that $G$ is solvable.
Definition Let $G$ be a group. Let $a, b \in G$. By the commutator of $a$ and $b$ we mean $a^{-1}b^{-1}ab$.
Let $C$ be the collection of all commutators in $G$. Then the subgroup $G'$ of $G$ generated by all the commutators in $G$ is the smallest subgroup of $G$ containing $C$. Clearly, $G'$ is equal to the collection of all finite products of the members of $C$ or their inverses. Observe that for every $a, b \in G$, $(a^{-1}b^{-1}ab)^{-1} = b^{-1}a^{-1}ba$, so the inverse of a member of $C$ is also a member of $C$. Hence $G'$ is equal to the collection of all finite products of the members of $C$.
Here $G'$ is called the commutator subgroup of $G$.
2.3.2 Problem Let $G$ be a group. Then the commutator subgroup $G'$ of $G$ is a normal subgroup of $G$.
Proof Let $C$ be the collection of all commutators in $G$. Let $u \in G'$ and $g \in G$. We have to show that $g^{-1}ug \in G'$. Observe that
\[ g^{-1}ug = u\,(u^{-1}g^{-1}ug). \]
Since $u, g \in G$, we have $u^{-1}g^{-1}ug \in C$. By the definition of $G'$, we have $C \subset G'$. Now, since $u^{-1}g^{-1}ug \in C$, we have $u^{-1}g^{-1}ug \in G'$. Since $u,\ u^{-1}g^{-1}ug \in G'$ and $G'$ is a group, we have $(g^{-1}ug =)\ u\,(u^{-1}g^{-1}ug) \in G'$, and hence $g^{-1}ug \in G'$. ■
2.3.3 Problem Let $G$ be a group. By 2.3.2, $G'$ is a normal subgroup of $G$. Then the quotient group $G/G'$ is an abelian group.
Proof Let us take any $a, b \in G$. We have to show that $(aG')(bG') = (bG')(aG')$, that is, $(ab)G' = (ba)G'$, that is, $(ab)^{-1}(ba) \in G'$, that is, $b^{-1}a^{-1}ba \in G'$.
Since $b^{-1}a^{-1}ba \in C$, where $C$ is the collection of all commutators in $G$ and $C \subset G'$, we have $b^{-1}a^{-1}ba \in G'$. ■
2.3.4 Problem Let $G$ be a group. Let $M$ be a normal subgroup of $G$. Suppose that the quotient group $G/M$ is an abelian group. Then $G' \subset M$.
Proof It suffices to show that $C \subset M$. To this end, let us take any $a, b \in G$. We have to show that $b^{-1}a^{-1}ba \in M$, that is, $(ab)^{-1}ba \in M$, that is, $(ab)M = (ba)M$, that is, $(aM)(bM) = (bM)(aM)$. This is known to be true, because $G/M$ is an abelian group. ■
Definition Let $G$ be a group. Let $C$ be a subgroup of $G$. If for every automorphism $T$ of $G$, $T(C) \subset C$, then we say that $C$ is a characteristic subgroup of $G$.
2.3.5 Problem Let $G$ be a group. Then the commutator subgroup $G'$ of $G$ is a characteristic subgroup of $G$.
Proof To show this, let us take an arbitrary automorphism $T$ of $G$. We have to show that $T(G') \subset G'$.
To this end, let us take arbitrary $(c_1 \cdots c_n) \in G'$, where each $c_i$ is a commutator in $G$. We have to show that $(T(c_1) \cdots T(c_n) =)\ T(c_1 \cdots c_n) \in G'$. It suffices to show that each $T(c_i)$ is a commutator in $G$.
Since $c_i$ is a commutator in $G$, there exist $a_i, b_i \in G$ such that $c_i = (a_i)^{-1}(b_i)^{-1}a_i b_i$. Now, since $T$ is an automorphism of $G$, we have
\[ T(c_i) = T\bigl((a_i)^{-1}(b_i)^{-1}a_i b_i\bigr) = T\bigl((a_i)^{-1}\bigr)\,T\bigl((b_i)^{-1}\bigr)\,T(a_i)\,T(b_i) = (T(a_i))^{-1}(T(b_i))^{-1}\,T(a_i)\,T(b_i), \]
and hence $T(c_i) = (T(a_i))^{-1}(T(b_i))^{-1}T(a_i)T(b_i)$. Since $(T(a_i))^{-1}(T(b_i))^{-1}T(a_i)T(b_i)$ is a commutator in $G$, $T(c_i)$ is a commutator in $G$. ■
2.3.6 Problem Let $G$ be a group. Then the subgroup $(G')'\ (\equiv G^{(2)})$ of $G$ is normal. Similarly, for every positive integer $n$, $G^{(n)}$ is a normal subgroup of $G$.
Proof To show this, let us take an arbitrary $g \in G$. We have to show that $g^{-1}(G')'g \subset (G')'$, that is, $T((G')') \subset (G')'$, where $T$ is the automorphism $x \mapsto g^{-1}xg$ of $G$. By 2.3.5, $G'$ is a characteristic subgroup of $G$. Again by 2.3.5, $(G')'$ is a characteristic subgroup of $G'$. It follows that $(G')'$ is a subgroup of $G$. Since $G'$ is a characteristic subgroup of $G$, and $T$ is an automorphism of $G$, we have $T(G') \subset G'$. Applying the same to the automorphism $T^{-1}$, we get $T(G') = G'$. Now, since $T : G \to G$ is an automorphism, its restriction $T|_{G'}$ is an automorphism of $G'$. Next, since $(G')'$ is a characteristic subgroup of $G'$, we have
\[ T((G')') = (T|_{G'})((G')') \subset (G')', \]
and hence $T((G')') \subset (G')'$. ■
2.3.7 Note Let $G$ be a group. Let $G$ be solvable.
It follows that there exists a finite collection $\{N_0, N_1, \ldots, N_k\}$ of subgroups of $G$ such that
1. $G = N_0 \supset N_1 \supset \cdots \supset N_k = \{e\}$, where $e$ denotes the identity element of $G$,
2. for every $i = 1, \ldots, k$, $N_i$ is a normal subgroup of $N_{i-1}$,
3. for every $i = 1, \ldots, k$, the quotient group $N_{i-1}/N_i$ is abelian.
Since $N_1$ is a normal subgroup of $N_0$, and the quotient group $N_0/N_1$ is abelian, by 2.3.4, $(N_0)' \subset N_1$. Since $N_2$ is a normal subgroup of $N_1$ and the quotient group $N_1/N_2$ is abelian, by 2.3.4, $(N_1)' \subset N_2$. Since $(N_0)' \subset N_1$, we have
\[ G^{(2)} = (N_0)^{(2)} = ((N_0)')' \subset (N_1)' \subset N_2, \]
and hence $G^{(2)} \subset N_2$.
Since $N_3$ is a normal subgroup of $N_2$, and the quotient group $N_2/N_3$ is abelian, by 2.3.4, $(N_2)' \subset N_3$. Since $(N_1)' \subset N_2$, we have
\[ G^{(3)} = (N_0)^{(3)} = ((N_0)')^{(2)} \subset (N_1)^{(2)} = ((N_1)')' \subset (N_2)' \subset N_3, \]
and hence $G^{(3)} \subset N_3$, etc. Hence $\{e\} \subset G^{(k)} \subset N_k = \{e\}$. Thus $G^{(k)} = \{e\}$.
2.3.8 Conclusion Let $G$ be solvable. Then there exists a positive integer $k$ such that $G^{(k)} = \{e\}$.
2.3.9 Problem Let $G$ be a group. Let $k$ be a positive integer. Suppose that $G^{(k)} = \{e\}$, where $e$ denotes the identity element of $G$. Then $G$ is solvable. It also follows that $G'$ is solvable, etc.
Proof By 2.3.6, for every $i \in \{1, 2, \ldots, k\}$, $G^{(i)}$ is a normal subgroup of $G$. Now, since $G$ is a normal subgroup of $G$, $\{G^{(0)}, G^{(1)}, \ldots, G^{(k)}\}$ is a collection of normal subgroups of $G$, where $G^{(0)} \equiv G$. Further,
\[ G = G^{(0)} \supset G^{(1)} \supset \cdots \supset G^{(k)} = \{e\}. \]
By 2.3.2, for every $i = 1, \ldots, k$, $G^{(i)}$ is a normal subgroup of $G^{(i-1)}$. By 2.3.3, for every $i = 1, \ldots, k$, the quotient group $G^{(i-1)}/G^{(i)}$ is abelian. Thus $G$ is solvable. ■
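The criterion in 2.3.8 and 2.3.9 can be tried out by brute force on small symmetric groups. The sketch below is an illustration under stated assumptions rather than anything from the text: permutations are stored as tuples on $0, \ldots, n-1$, products are taken right to left, and the helper names (`generated_subgroup`, `derived_subgroup`, `derived_series_orders`) are hypothetical.

```python
from itertools import permutations


def compose(p, q):
    """Right-to-left product: (p q)(x) = p(q(x)). Permutations are tuples on 0..n-1."""
    return tuple(p[q[x]] for x in range(len(q)))


def inverse(p):
    """Inverse permutation."""
    inv = [0] * len(p)
    for x, y in enumerate(p):
        inv[y] = x
    return tuple(inv)


def generated_subgroup(gens, n):
    """Closure of gens under products; in a finite group this is the subgroup generated."""
    identity = tuple(range(n))
    elems = {identity}
    frontier = [identity]
    while frontier:
        x = frontier.pop()
        for g in gens:
            y = compose(x, g)
            if y not in elems:
                elems.add(y)
                frontier.append(y)
    return elems


def derived_subgroup(group, n):
    """Subgroup generated by all commutators a^-1 b^-1 a b with a, b in group."""
    commutators = {compose(compose(inverse(a), inverse(b)), compose(a, b))
                   for a in group for b in group}
    return generated_subgroup(commutators, n)


def derived_series_orders(n):
    """Orders of S_n, (S_n)', (S_n)'', ... until the series stops shrinking."""
    group = set(permutations(range(n)))
    orders = [len(group)]
    while True:
        nxt = derived_subgroup(group, n)
        if len(nxt) == len(group):
            return orders
        orders.append(len(nxt))
        group = nxt


print(derived_series_orders(4))  # [24, 12, 4, 1]
print(derived_series_orders(5))  # [120, 60]
```

For $S_4$ the series reaches $\{e\}$, so $S_4$ is solvable by 2.3.9; for $S_5$ the series stalls at a subgroup of order 60 and never reaches $\{e\}$.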
2.3.10 Problem Let $G$, $\bar{G}$ be any groups. Let $f : G \to \bar{G}$ be a homomorphism from $G$ onto $\bar{G}$. Thus $\bar{G}$ is the homomorphic $f$-image of $G$. Let $G$ be solvable. Then $\bar{G}$ is solvable.
Proof Since $G$ is solvable, by 2.3.8, there exists a positive integer $k$ such that $G^{(k)} = \{e\}$, where $e$ denotes the identity element of $G$. By 2.3.9, it suffices to show that $\bar{G}^{(k)} = \{\bar{e}\}$, where $\bar{e}$ denotes the identity element of $\bar{G}$.
Now, since $f : G \to \bar{G}$ is a homomorphism from $G$ onto $\bar{G}$, we have $f(G) = \bar{G}$ and $f(e) = \bar{e}$. Thus it is enough to show that $(f(G))^{(k)} \subset \{f(e)\}$. Since $G^{(k)} = \{e\}$, we have $\{f(e)\} = f(G^{(k)})$, and hence it suffices to show that $(f(G))^{(k)} \subset f(G^{(k)})$.
Clearly, $(f(G))' \subset f(G')$.
Proof By 2.3.2, $G'$ is a normal subgroup of $G$. We first show that $f(G')$ is a normal subgroup of $\bar{G}$.
To this end, let us take arbitrary $\bar{g} \in \bar{G}$ and $f(a) \in f(G')$, where $a \in G'$. We have to show that $(\bar{g})^{-1}(f(a))\,\bar{g} \in f(G')$.
Since $\bar{g} \in \bar{G} = f(G)$, there exists $g \in G$ such that $\bar{g} = f(g)$. Now,
\[ (\bar{g})^{-1}(f(a))\,\bar{g} = (f(g))^{-1} f(a)\, f(g) = f(g^{-1})\, f(a)\, f(g) = f(g^{-1}ag), \]
so $(\bar{g})^{-1}(f(a))\,\bar{g} = f(g^{-1}ag)$. Since $G'$ is a normal subgroup of $G$, $a \in G'$, and $g \in G$, we have $g^{-1}ag \in G'$, and hence
\[ (\bar{g})^{-1}(f(a))\,\bar{g} = f(g^{-1}ag) \in f(G'). \]
Thus $(\bar{g})^{-1}(f(a))\,\bar{g} \in f(G')$.
Next we shall show that the quotient group $\bar{G}/f(G')\ (= f(G)/f(G'))$ is an abelian group.
To this end, let us take any $a, b \in G$. We have to show that $(f(a)f(G'))(f(b)f(G')) = (f(b)f(G'))(f(a)f(G'))$, that is, $(f(a)f(b))f(G') = (f(b)f(a))f(G')$, that is, $(f(ab))f(G') = (f(ba))f(G')$, that is, $(f(ba))^{-1}(f(ab)) \in f(G')$, that is, $f((ba)^{-1})f(ab) \in f(G')$, that is, $f(a^{-1}b^{-1})f(ab) \in f(G')$, that is, $f(a^{-1}b^{-1}ab) \in f(G')$.
It suffices to show that $a^{-1}b^{-1}ab \in G'$. Since $a^{-1}b^{-1}ab$ is a commutator of $a$ and $b$, we have $a^{-1}b^{-1}ab \in G'$.
Since $f(G)/f(G')$ is an abelian group, by 2.3.4, $(f(G))' \subset f(G')$. By 2.3.9, $G'$ is solvable. Also $f|_{G'}$ is a homomorphism. So, as above,
\[ (f(G))'' = ((f(G))')' \subset (f|_{G'}(G'))' \subset f|_{G'}((G')') = f((G')') = f(G''), \]
so $(f(G))'' \subset f(G'')$. Finally, we get $(f(G))^{(k)} \subset f(G^{(k)})$. ■
2.3.11 Problem Let $G$ be a group. Let $N$ be a normal subgroup of $G$. Then $N'$ is also a normal subgroup of $G$.
Proof Since $N'$ is a subgroup of $N$, and $N$ is a subgroup of $G$, $N'$ is a subgroup of $G$. Next, let us take arbitrary $g \in G$ and $(c_1 \cdots c_n) \in N'$, where each $c_i$ is a commutator in $N$. We have to show that
\[ (g^{-1}c_1 g)(g^{-1}c_2 g) \cdots (g^{-1}c_n g) = g^{-1}(c_1 \cdots c_n)g \in N'. \]
It suffices to show that each $g^{-1}c_i g$ is in $N'$, that is, $T(c_i) \in N'$, where $T$ is the automorphism $x \mapsto g^{-1}xg$ of $N$.
By 2.3.5, $N'$ is a characteristic subgroup of $N$, and $T$ is the automorphism $x \mapsto g^{-1}xg$ of $N$, so $T(N') \subset N'$. Now, since $c_i$ is a commutator in $N$, and $N'$ contains all commutators in $N$, we have $c_i \in N'$, and hence $T(c_i) \in T(N') \subset N'$. Thus, $T(c_i) \in N'$. ■
2.3.12 Problem Suppose that $n \in \{5, 6, 7, 8, \ldots\}$. Let $S_n$ be the symmetric group of all permutations of $n$ symbols $1, 2, \ldots, n$. Then
1. $(S_n)'$ contains all 3-cycles,
2. $(S_n)''$ contains all 3-cycles, etc.
In short, for every $n \ge 5$ and for every $k \ge 1$, $(S_n)^{(k)}$ contains all 3-cycles. It follows that for every $k \ge 1$, $(S_n)^{(k)} \ne \{e\}$, and hence by 2.3.8, $S_n$ is not solvable when $n \ge 5$.
Proof 1 Let us take an arbitrary 3-cycle $(i_1\, i_2\, i_3)$ in $S_n$, where $i_1, i_2, i_3$ are three distinct members of $\{1, 2, \ldots, n\}$. We have to show that $(i_1\, i_2\, i_3) \in (S_n)'$.
Since $n \in \{5, 6, 7, 8, \ldots\}$, the 3-cycle $(1\,4\,5) \equiv \begin{pmatrix} 1 & 2 & 3 & 4 & 5 & \cdots \\ 4 & 2 & 3 & 5 & 1 & \cdots \end{pmatrix}$ is in $S_n$.
Observe that the 3-cycle $(1\,3\,5)$ is a commutator in $S_n$.
Proof Since
123ð Þ�1 145ð Þ�1 123ð Þ 145ð Þ
¼ 123456 � � �231456 � � �
� �1 123456 � � �423516 � � �
� �1 123456 � � �231456 � � �
� 123456 � � �423516 � � �
�
¼ 123456 � � �312456 � � ��
123456 � � �423516 � � �
� �1 123456 � � �231456 � � �
� 123456 � � �423516 � � �
�
¼ 123456 � � �312456 � � �
� 123456 � � �523146 � � �
� 123456 � � �231456 � � �
� 123456 � � �423516 � � �
�
¼ 123456 � � �312456 � � �
� 123456 � � �523146 � � ��
123456 � � �431526 � � �
�
¼ 123456 � � �312456 � � �
� 123456 � � �135426 � � �
� ¼ 123456 � � �
325416 � � ��
¼ 135ð Þ
2.3 Applications of Galois Theory 133
we have 1 3 5ð Þ ¼ 1 2 3ð Þ�1 1 4 5ð Þ�1 1 2 3ð Þ 1 4 5ð Þ. Since 1 2 3ð Þ; 1 4 5ð Þ 2 Sn,
1 3 5ð Þ ¼ð Þ 1 2 3ð Þ�1 1 4 5ð Þ�1 1 2 3ð Þ 1 4 5ð Þ
is a commutator in Sn, and hence 1 3 5ð Þ is a commutator in Sn. ■
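The two-row computation above can be replayed mechanically. The following is a minimal sketch, assuming (as the intermediate steps of the computation indicate) that products are composed from right to left; the helper names `cycle`, `compose` and `inverse` are illustrative and not part of the text.

```python
# Permutations of {1, ..., 6} are stored as dicts.
# A product p1 p2 ... pk means: apply pk first and p1 last.

def cycle(*pts, n=6):
    """Return the cycle (pts[0] pts[1] ...) as a dict on {1, ..., n}."""
    perm = {i: i for i in range(1, n + 1)}
    for a, b in zip(pts, pts[1:] + pts[:1]):
        perm[a] = b
    return perm

def compose(*perms):
    """Right-to-left product of the given permutations."""
    result = {i: i for i in perms[0]}
    for p in reversed(perms):
        result = {i: p[result[i]] for i in result}
    return result

def inverse(p):
    """Inverse permutation."""
    return {v: k for k, v in p.items()}

a, b = cycle(1, 2, 3), cycle(1, 4, 5)
commutator = compose(inverse(a), inverse(b), a, b)
print(commutator == cycle(1, 3, 5))  # True
```

The intermediate product `compose(a, b)` has bottom row 4 3 1 5 2 6, which agrees with the corresponding step of the displayed computation.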
It follows that $(1\,3\,5) \in (S_n)'$. By 2.3.11, $(S_n)'$ is a normal subgroup of $S_n$. There exists a permutation $j$ of $1, 2, \ldots, n$ such that $j(1) = i_1$, $j(3) = i_2$, and $j(5) = i_3$. Thus $j \in S_n$. Since $(1\,3\,5) \in (S_n)'$, $j \in S_n$, and $(S_n)'$ is a normal subgroup of $S_n$, we have $j(1\,3\,5)j^{-1} \in (S_n)'$. It suffices to show that
\[
j(1\,3\,5)j^{-1} = (i_1 i_2 i_3).
\]
For this we must prove that
\[
\begin{cases}
(j(1\,3\,5)j^{-1})(i_1) = i_2,\\
(j(1\,3\,5)j^{-1})(i_2) = i_3,\\
(j(1\,3\,5)j^{-1})(i_3) = i_1,\\
(j(1\,3\,5)j^{-1})(l) = l \ \text{ when } l \in \{1, 2, \ldots, n\} - \{i_1, i_2, i_3\}.
\end{cases}
\]
Here
\[
(j(1\,3\,5)j^{-1})(i_1) = j\big((1\,3\,5)(j^{-1}(i_1))\big) = j\big((1\,3\,5)(1)\big) = j(3) = i_2,
\]
\[
(j(1\,3\,5)j^{-1})(i_2) = j\big((1\,3\,5)(j^{-1}(i_2))\big) = j\big((1\,3\,5)(3)\big) = j(5) = i_3,
\]
and
\[
(j(1\,3\,5)j^{-1})(i_3) = j\big((1\,3\,5)(j^{-1}(i_3))\big) = j\big((1\,3\,5)(5)\big) = j(1) = i_1.
\]
Suppose that $l \in \{1, 2, \ldots, n\} - \{i_1, i_2, i_3\}$. It suffices to show that $(j(1\,3\,5)j^{-1})(l) = l$. Here
\[
(j(1\,3\,5)j^{-1})(l) = (j(1\,3\,5))(j^{-1}(l)) = (j(1\,3\,5))(m),
\]
where $m \in \{1, 2, \ldots, n\} - \{1, 3, 5\}$, and $j(m) = l$. It follows that $(1\,3\,5)(m) = m$, and hence
\[
\text{LHS} = (j(1\,3\,5)j^{-1})(l) = (j(1\,3\,5))(m) = j\big((1\,3\,5)(m)\big) = j(m) = l = \text{RHS}.
\]
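The conjugation step can be illustrated on a concrete instance. The sketch below assumes $n = 7$ and the target 3-cycle $(2\,6\,7)$, both chosen only for illustration; the permutation `j` is one hypothetical choice among many with $j(1) = i_1$, $j(3) = i_2$, $j(5) = i_3$.

```python
def cycle(pts, n):
    """Cycle (pts[0] pts[1] ...) as a dict on {1, ..., n}."""
    perm = {i: i for i in range(1, n + 1)}
    for a, b in zip(pts, pts[1:] + pts[:1]):
        perm[a] = b
    return perm

def compose(p, q):
    """Product pq: apply q first, then p."""
    return {i: p[q[i]] for i in q}

def inverse(p):
    return {v: k for k, v in p.items()}

n = 7
i1, i2, i3 = 2, 6, 7          # illustrative target 3-cycle (i1 i2 i3)

# Build some permutation j with j(1) = i1, j(3) = i2, j(5) = i3.
j = {1: i1, 3: i2, 5: i3}
rest_src = [x for x in range(1, n + 1) if x not in j]
rest_dst = [x for x in range(1, n + 1) if x not in j.values()]
j.update(zip(rest_src, rest_dst))

conjugate = compose(j, compose(cycle((1, 3, 5), n), inverse(j)))
print(conjugate == cycle((i1, i2, i3), n))  # True
```

Any other completion of `j` to a permutation gives the same conjugate, since only the values $j(1), j(3), j(5)$ enter the argument.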
2: Let us take an arbitrary 3-cycle $(i_1 i_2 i_3)$ in $(S_n)'$, where $i_1, i_2, i_3$ are three distinct members of $\{1, 2, \ldots, n\}$. We have to show that $(i_1 i_2 i_3) \in ((S_n)')'$.

Since $n \in \{5, 6, 7, 8, \ldots\}$, by part 1 the 3-cycle $(1\,4\,5) \equiv \binom{1\,2\,3\,4\,5\,\cdots}{4\,2\,3\,5\,1\,\cdots}$ is in $(S_n)'$.

Observe that the 3-cycle $(1\,3\,5)$ is a commutator in $(S_n)'$.

Proof Since
\[
\begin{aligned}
(1\,2\,3)^{-1}(1\,4\,5)^{-1}(1\,2\,3)(1\,4\,5)
&= \binom{123456\cdots}{231456\cdots}^{-1}\binom{123456\cdots}{423516\cdots}^{-1}\binom{123456\cdots}{231456\cdots}\binom{123456\cdots}{423516\cdots}\\
&= \binom{123456\cdots}{312456\cdots}\binom{123456\cdots}{423516\cdots}^{-1}\binom{123456\cdots}{231456\cdots}\binom{123456\cdots}{423516\cdots}\\
&= \binom{123456\cdots}{312456\cdots}\binom{123456\cdots}{523146\cdots}\binom{123456\cdots}{231456\cdots}\binom{123456\cdots}{423516\cdots}\\
&= \binom{123456\cdots}{312456\cdots}\binom{123456\cdots}{523146\cdots}\binom{123456\cdots}{431526\cdots}\\
&= \binom{123456\cdots}{312456\cdots}\binom{123456\cdots}{135426\cdots}
= \binom{123456\cdots}{325416\cdots} = (1\,3\,5),
\end{aligned}
\]
we have $(1\,3\,5) = (1\,2\,3)^{-1}(1\,4\,5)^{-1}(1\,2\,3)(1\,4\,5)$. Since $(1\,2\,3), (1\,4\,5) \in (S_n)'$ (again by part 1),
\[
\big((1\,3\,5) =\big)\ (1\,2\,3)^{-1}(1\,4\,5)^{-1}(1\,2\,3)(1\,4\,5)
\]
is a commutator in $(S_n)'$, and hence $(1\,3\,5)$ is a commutator in $(S_n)'$.

It follows that $(1\,3\,5) \in ((S_n)')'$. By two applications of 2.3.11, $((S_n)')'$ is a normal subgroup of $S_n$. There exists a permutation $j$ of $1, 2, \ldots, n$ such that $j(1) = i_1$, $j(3) = i_2$, and $j(5) = i_3$. Thus $j \in S_n$. Since $(1\,3\,5) \in ((S_n)')'$, $j \in S_n$, and $((S_n)')'$ is a normal subgroup of $S_n$, we have $j(1\,3\,5)j^{-1} \in ((S_n)')'$. It suffices to show that
\[
j(1\,3\,5)j^{-1} = (i_1 i_2 i_3).
\]
For this we must prove
\[
\begin{cases}
(j(1\,3\,5)j^{-1})(i_1) = i_2,\\
(j(1\,3\,5)j^{-1})(i_2) = i_3,\\
(j(1\,3\,5)j^{-1})(i_3) = i_1,\\
(j(1\,3\,5)j^{-1})(l) = l \ \text{ when } l \in \{1, 2, \ldots, n\} - \{i_1, i_2, i_3\}.
\end{cases}
\]
Here
\[
(j(1\,3\,5)j^{-1})(i_1) = j\big((1\,3\,5)(j^{-1}(i_1))\big) = j\big((1\,3\,5)(1)\big) = j(3) = i_2,
\]
\[
(j(1\,3\,5)j^{-1})(i_2) = j\big((1\,3\,5)(j^{-1}(i_2))\big) = j\big((1\,3\,5)(3)\big) = j(5) = i_3,
\]
and
\[
(j(1\,3\,5)j^{-1})(i_3) = j\big((1\,3\,5)(j^{-1}(i_3))\big) = j\big((1\,3\,5)(5)\big) = j(1) = i_1.
\]
Suppose that $l \in \{1, 2, \ldots, n\} - \{i_1, i_2, i_3\}$. It suffices to show that $(j(1\,3\,5)j^{-1})(l) = l$. Here
\[
(j(1\,3\,5)j^{-1})(l) = (j(1\,3\,5))(j^{-1}(l)) = (j(1\,3\,5))(m),
\]
where $m \in \{1, 2, \ldots, n\} - \{1, 3, 5\}$ and $j(m) = l$. It follows that $(1\,3\,5)(m) = m$, and hence
\[
\text{LHS} = (j(1\,3\,5)j^{-1})(l) = (j(1\,3\,5))(m) = j\big((1\,3\,5)(m)\big) = j(m) = l = \text{RHS}.
\]
■
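For small $n$ the conclusion can be checked by brute force. The sketch below is not the argument of the proof: it simply enumerates all commutators, closes them under multiplication, and records the orders of the successive derived subgroups. Permutations are 0-based tuples, and all function names are illustrative.

```python
from itertools import permutations

def compose(p, q):
    """Product pq on tuples: apply q first, then p."""
    return tuple(p[q[i]] for i in range(len(q)))

def inverse(p):
    inv = [0] * len(p)
    for i, v in enumerate(p):
        inv[v] = i
    return tuple(inv)

def generated_subgroup(gens, n):
    """All finite products of gens; in a finite group this is the generated subgroup."""
    identity = tuple(range(n))
    elems = {identity}
    frontier = [identity]
    while frontier:
        new = []
        for x in frontier:
            for g in gens:
                y = compose(x, g)
                if y not in elems:
                    elems.add(y)
                    new.append(y)
        frontier = new
    return elems

def derived_subgroup(group, n):
    """Subgroup generated by all commutators a^-1 b^-1 a b of the group."""
    comms = {compose(compose(inverse(a), inverse(b)), compose(a, b))
             for a in group for b in group}
    return generated_subgroup(comms, n)

def derived_series_orders(n, steps=4):
    group = set(permutations(range(n)))
    orders = [len(group)]
    for _ in range(steps):
        group = derived_subgroup(group, n)
        orders.append(len(group))
    return orders

print(derived_series_orders(4))  # [24, 12, 4, 1, 1]
print(derived_series_orders(5))  # [120, 60, 60, 60, 60]
```

For $n = 4$ the series reaches the trivial group, while for $n = 5$ it stabilizes at order 60 and never reaches $\{e\}$, in line with the statement that $S_n$ is not solvable when $n \geq 5$.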
2.3.13 Example Let $F$ be the field of all real numbers, and let $K$ be the field of all complex numbers.

Clearly, $F$ is a subfield of $K$. Next,
\[
G(K, F) = \{\sigma : \sigma \in \mathrm{Aut}(K), \text{ and for every } a \in F,\ \sigma(a) = a\}.
\]
Suppose that $\sigma \in G(K, F)$. Here $1, i \in K$. Since $\sigma \in \mathrm{Aut}(K)$, $\sigma : K \to K$ is an automorphism, and hence
\[
-1 = -\sigma(1) = \sigma(-1) = \sigma(i^2) = \sigma(ii) = \sigma(i)\sigma(i).
\]
Thus $\sigma(i)\sigma(i) = -1$, where $\sigma(i)$ is a complex number. It follows that $\sigma(i) = i$ or $\sigma(i) = -i$.

Case I: $\sigma(i) = i$. For every real $a, b$, we have $\sigma(a) = a$ and $\sigma(b) = b$. It follows that
\[
\sigma(a + ib) = \sigma(a) + \sigma(ib) = a + \sigma(ib) = a + \sigma(i)\sigma(b) = a + \sigma(i)b = a + ib,
\]
and hence $\sigma(a + ib) = a + ib$. This shows that in this case, $\sigma$ is equal to the identity mapping of $K$. Let us denote this $\sigma$ by $\sigma_1$.

Case II: $\sigma(i) = -i$. For every real $a, b$, we have $\sigma(a) = a$ and $\sigma(b) = b$. It follows that
\[
\sigma(a + ib) = \sigma(a) + \sigma(ib) = a + \sigma(ib) = a + \sigma(i)\sigma(b) = a + \sigma(i)b = a + (-i)b = a - ib,
\]
and hence $\sigma(a + ib) = a - ib$. This shows that in this case, $\sigma$ is equal to the complex-conjugation mapping of $K$. Let us denote this $\sigma$ by $\sigma_2$.

Thus $G(K, F) = \{\sigma_1, \sigma_2\}$. Hence $o(G(K, F)) = o(\{\sigma_1, \sigma_2\}) = 2$. Further,
\[
\begin{aligned}
(\text{fixed field of } G(K, F)) &= \{a : a \in K, \text{ and for every } \sigma \in G(K, F),\ \sigma(a) = a\}\\
&= \{a : a \in K, \text{ and for every } \sigma \in \{\sigma_1, \sigma_2\},\ \sigma(a) = a\}\\
&= \{a : a \in K,\ \sigma_1(a) = a \text{ and } \sigma_2(a) = a\} = \{a : a \in K,\ \sigma_2(a) = a\}\\
&= \{a : a \in K,\ \bar{a} = a\} = (\text{the set of all real numbers}) = F,
\end{aligned}
\]
so the fixed field of $G(K, F)$ is $F$.

2.3.14 Example Let $F_0$ be the field of all rational numbers and $K = F_0(\sqrt[3]{2})$, where $\sqrt[3]{2}$ is the real cube root of 2. By 1.5.20, we have $[F_0(\sqrt[3]{2}) : F_0] = 3$. Now, by 1.4.5, $\sqrt[3]{2}$ is algebraic of degree 3 over $F_0$. It follows that $\{1, \sqrt[3]{2}, (\sqrt[3]{2})^2\}$ is a linearly independent set of vectors in the vector space $F_0(\sqrt[3]{2})$ over $F_0$. Since $[F_0(\sqrt[3]{2}) : F_0] = 3$, the dimension of the vector space $F_0(\sqrt[3]{2})$ over $F_0$ is 3. It follows that $\{1, \sqrt[3]{2}, (\sqrt[3]{2})^2\}$ is a basis of the vector space $F_0(\sqrt[3]{2})$ over $F_0$. Hence
\[
F_0(\sqrt[3]{2}) = \{a_0 + a_1\sqrt[3]{2} + a_2(\sqrt[3]{2})^2 : a_0, a_1, a_2 \in F_0\}.
\]
Next,
\[
G(K, F_0) = \{\sigma : \sigma \in \mathrm{Aut}(K), \text{ and for every } a \in F_0,\ \sigma(a) = a\}.
\]
Suppose that $\sigma \in G(K, F_0)$.

Here $1, \sqrt[3]{2}, (\sqrt[3]{2})^2 \in K$. Since $\sigma \in G(K, F_0)$, we have $\sigma \in \mathrm{Aut}(K)$, that is, $\sigma : K \to K$ is an automorphism, and hence
\[
2 = \sigma(2) = \sigma(\sqrt[3]{2}\,\sqrt[3]{2}\,\sqrt[3]{2}) = \sigma(\sqrt[3]{2})\,\sigma(\sqrt[3]{2})\,\sigma(\sqrt[3]{2}).
\]
Thus $\sigma(\sqrt[3]{2})\,\sigma(\sqrt[3]{2})\,\sigma(\sqrt[3]{2}) = 2$. Now, since $\sigma : F_0(\sqrt[3]{2}) \to F_0(\sqrt[3]{2})$ and $F_0(\sqrt[3]{2}) \subset \mathbb{R}$, $\sigma(\sqrt[3]{2})$ is a real cube root of 2; the only real cube root of 2 is $\sqrt[3]{2}$, so we have $\sigma(\sqrt[3]{2}) = \sqrt[3]{2}$. Now, for every $a_0, a_1, a_2 \in F_0$,
\[
\begin{aligned}
\sigma\big(a_0 + a_1\sqrt[3]{2} + a_2(\sqrt[3]{2})^2\big)
&= \sigma(a_0) + \sigma(a_1\sqrt[3]{2}) + \sigma\big(a_2(\sqrt[3]{2})^2\big)\\
&= \sigma(a_0) + \sigma(a_1)\sigma(\sqrt[3]{2}) + \sigma(a_2)\sigma(\sqrt[3]{2})\sigma(\sqrt[3]{2})\\
&= a_0 + a_1\sigma(\sqrt[3]{2}) + a_2\sigma(\sqrt[3]{2})\sigma(\sqrt[3]{2})\\
&= a_0 + a_1\sqrt[3]{2} + a_2\sqrt[3]{2}\,\sqrt[3]{2} = a_0 + a_1\sqrt[3]{2} + a_2(\sqrt[3]{2})^2,
\end{aligned}
\]
that is, for every $a_0, a_1, a_2 \in F_0$, we have
\[
\sigma\big(a_0 + a_1\sqrt[3]{2} + a_2(\sqrt[3]{2})^2\big) = a_0 + a_1\sqrt[3]{2} + a_2(\sqrt[3]{2})^2.
\]
This shows that $\sigma$ is equal to the identity mapping $\mathrm{Id}$ of $K$. Thus $G(K, F_0) = \{\mathrm{Id}\}$. Hence $o(G(K, F_0)) = o(\{\mathrm{Id}\}) = 1$.

Further,
\[
\begin{aligned}
(\text{fixed field of } G(K, F_0)) &= \{a : a \in K, \text{ and for every } \sigma \in G(K, F_0),\ \sigma(a) = a\}\\
&= \{a : a \in K, \text{ and for every } \sigma \in \{\mathrm{Id}\},\ \sigma(a) = a\}\\
&= \{a : a \in K,\ \mathrm{Id}(a) = a\} = K = F_0(\sqrt[3]{2}),
\end{aligned}
\]
so the fixed field of $G(F_0(\sqrt[3]{2}), F_0)$ is $F_0(\sqrt[3]{2})$.
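The key step of this example is that $x^3 - 2$ has only one real root, so an automorphism of a subfield of $\mathbb{R}$ has nowhere else to send $\sqrt[3]{2}$. A small numerical sketch of that fact follows; floating-point arithmetic is used, so this is an illustration rather than a proof.

```python
import cmath

real_root = 2 ** (1 / 3)                      # the real cube root of 2
omega = cmath.exp(2j * cmath.pi / 3)          # a primitive cube root of unity

# The three complex roots of x^3 - 2.
roots = [real_root * omega**k for k in range(3)]
cubes_ok = all(cmath.isclose(z**3, 2) for z in roots)

# Only one of them has vanishing imaginary part.
real_roots = [z for z in roots if cmath.isclose(z.imag, 0, abs_tol=1e-12)]
print(cubes_ok, len(real_roots))  # True 1
```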
2.3.15 Example Let $F_0$ be the field of all rational numbers. Let us denote $e^{2\pi i/5}$ by $\alpha$. It follows that $\alpha^5 = 1$. Next, $\alpha^4 + \alpha^3 + \alpha^2 + \alpha + 1 = \frac{\alpha^5 - 1}{\alpha - 1} = \frac{1 - 1}{\alpha - 1} = 0$, so $\alpha$ is a root of the polynomial $x^4 + x^3 + x^2 + x + 1 \in F_0[x]$.

Here $x^4 + x^3 + x^2 + x + 1$ is an irreducible polynomial over the field of rational numbers.

Proof Put $x \equiv y + 1$. It suffices to show that $(y+1)^4 + (y+1)^3 + (y+1)^2 + (y+1) + 1$ is an irreducible polynomial over the field of rational numbers.

Observe that
\[
\begin{aligned}
(y+1)^4 + (y+1)^3 + (y+1)^2 + (y+1) + 1
&= (y^4 + 4y^3 + 6y^2 + 4y + 1) + (y^3 + 3y^2 + 3y + 1) + (y^2 + 2y + 1) + (y + 1) + 1\\
&= 5 + 10y + 10y^2 + 5y^3 + y^4 = 5(1 + 2y + 2y^2 + y^3) + y^4.
\end{aligned}
\]
By 1.3.5, $5 + 10y + 10y^2 + 5y^3 + y^4$ is irreducible over the field of rational numbers, and
\[
(y+1)^4 + (y+1)^3 + (y+1)^2 + (y+1) + 1 = 5 + 10y + 10y^2 + 5y^3 + y^4,
\]
so
\[
(y+1)^4 + (y+1)^3 + (y+1)^2 + (y+1) + 1
\]
is irreducible over the field of rational numbers. ■
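The shifted coefficients and the divisibility pattern used above can be recomputed directly. The sketch below assumes that the result cited as 1.3.5 is the Eisenstein criterion (the shape $5(\cdots) + y^4$ suggests this, but the statement of 1.3.5 is not reproduced here); the variable names are illustrative.

```python
from math import comb

p = 5

# Coefficient of y^k in (y+1)^4 + (y+1)^3 + (y+1)^2 + (y+1) + 1,
# obtained by summing the binomial coefficients C(d, k) for d = k, ..., 4.
coeffs = [sum(comb(d, k) for d in range(k, 5)) for k in range(5)]
print(coeffs)  # [5, 10, 10, 5, 1]  (constant term first)

# Eisenstein-type conditions for the prime p:
#   p does not divide the leading coefficient,
#   p divides every other coefficient,
#   p^2 does not divide the constant term.
eisenstein = (
    coeffs[-1] % p != 0
    and all(c % p == 0 for c in coeffs[:-1])
    and coeffs[0] % (p * p) != 0
)
print(eisenstein)  # True
```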
Thus we have shown that $x^4 + x^3 + x^2 + x + 1$ is an irreducible polynomial over the field of rational numbers. Now, by 1.5.12, $\alpha$ is algebraic of degree 4 over $F_0$, and hence by 1.4.16, $[F_0(\alpha) : F_0] = 4$. Since $\alpha$ is algebraic of degree 4 over $F_0$, $\{1, \alpha, \alpha^2, \alpha^3\}$ is a linearly independent set of vectors in the vector space $F_0(\alpha)$ over $F_0$. Since $[F_0(\alpha) : F_0] = 4$, the dimension of the vector space $F_0(\alpha)$ over $F_0$ is 4. It follows that $\{1, \alpha, \alpha^2, \alpha^3\}$ is a basis of the vector space $F_0(\alpha)$ over $F_0$. Hence
\[
F_0(\alpha) = \{a_0 + a_1\alpha + a_2\alpha^2 + a_3\alpha^3 : a_0, a_1, a_2, a_3 \in F_0\}.
\]
Next, writing $K \equiv F_0(\alpha)$,
\[
G(K, F_0) = \{\sigma : \sigma \in \mathrm{Aut}(K), \text{ and for every } a \in F_0,\ \sigma(a) = a\}.
\]
Suppose that $\sigma \in G(K, F_0)$.

Here $1, \alpha, \alpha^2, \alpha^3 \in K$. Also $\alpha^5 = 1$. Since $\sigma \in G(K, F_0)$, we have $\sigma \in \mathrm{Aut}(K)$, that is, $\sigma : K \to K$ is an automorphism, and hence
\[
1 = \sigma(1) = \sigma(\alpha^5) = \sigma(\alpha\alpha\alpha\alpha\alpha) = \sigma(\alpha)\sigma(\alpha)\sigma(\alpha)\sigma(\alpha)\sigma(\alpha) = (\sigma(\alpha))^5.
\]
Thus $(\sigma(\alpha))^5 = 1$. Now, since $\sigma : F_0(\alpha) \to F_0(\alpha)$ and $F_0(\alpha) \subset \mathbb{C}$, we have $\sigma(\alpha) = 1$ or $\alpha$ or $\alpha^2$ or $\alpha^3$ or $\alpha^4$. Since $\sigma : K \to K$ is an automorphism, $\sigma$ is one-to-one. Since $\alpha \neq 1$, we have $\sigma(\alpha) \neq \sigma(1) = 1$, and hence $\sigma(\alpha) \neq 1$. Since $\sigma(\alpha) = 1$ or $\alpha$ or $\alpha^2$ or $\alpha^3$ or $\alpha^4$, we have
\[
\sigma(\alpha) = \alpha \text{ or } \alpha^2 \text{ or } \alpha^3 \text{ or } \alpha^4.
\]
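Case I: $\sigma(\alpha) = \alpha$. For every $a_0, a_1, a_2, a_3 \in F_0$,
\[
\begin{aligned}
\sigma(a_0 + a_1\alpha + a_2\alpha^2 + a_3\alpha^3)
&= \sigma(a_0) + \sigma(a_1\alpha) + \sigma(a_2\alpha^2) + \sigma(a_3\alpha^3)\\
&= \sigma(a_0) + \sigma(a_1)\sigma(\alpha) + \sigma(a_2)(\sigma(\alpha))^2 + \sigma(a_3)(\sigma(\alpha))^3\\
&= a_0 + a_1\alpha + a_2(\alpha)^2 + a_3(\alpha)^3 = a_0 + a_1\alpha + a_2\alpha^2 + a_3\alpha^3,
\end{aligned}
\]
that is, for every $a_0, a_1, a_2, a_3 \in F_0$, we have
\[
\sigma(a_0 + a_1\alpha + a_2\alpha^2 + a_3\alpha^3) = a_0 + a_1\alpha + a_2\alpha^2 + a_3\alpha^3.
\]
This shows that $\sigma$ is equal to the identity mapping $\mathrm{Id}$ of $K$. Let us denote this $\sigma$ by $\sigma_1$. Thus $\sigma_1(\alpha) = \alpha$ and $\sigma_1 \in G(K, F_0)$.

Case II: $\sigma(\alpha) = \alpha^2$. For every $a_0, a_1, a_2, a_3 \in F_0$,
\[
\begin{aligned}
\sigma(a_0 + a_1\alpha + a_2\alpha^2 + a_3\alpha^3)
&= \sigma(a_0) + \sigma(a_1\alpha) + \sigma(a_2\alpha^2) + \sigma(a_3\alpha^3)\\
&= \sigma(a_0) + \sigma(a_1)\sigma(\alpha) + \sigma(a_2)(\sigma(\alpha))^2 + \sigma(a_3)(\sigma(\alpha))^3\\
&= a_0 + a_1(\alpha^2) + a_2(\alpha^2)^2 + a_3(\alpha^2)^3\\
&= a_0 + a_1\alpha^2 + a_2\tfrac{1}{\alpha} + a_3\alpha\\
&= a_0 + a_1\alpha^2 + a_2(-1 - \alpha - \alpha^2 - \alpha^3) + a_3\alpha\\
&= (a_0 - a_2) + (a_3 - a_2)\alpha + (a_1 - a_2)\alpha^2 - a_2\alpha^3,
\end{aligned}
\]
that is, for every $a_0, a_1, a_2, a_3 \in F_0$, we have
\[
\sigma(a_0 + a_1\alpha + a_2\alpha^2 + a_3\alpha^3) = (a_0 - a_2) + (a_3 - a_2)\alpha + (a_1 - a_2)\alpha^2 - a_2\alpha^3.
\]
Let us denote this $\sigma$ by $\sigma_2$. Thus $\sigma_2(\alpha) = \alpha^2$, and for every $a_0, a_1, a_2, a_3 \in F_0$, we have
\[
\begin{aligned}
\sigma_2(a_0 + a_1\alpha + a_2\alpha^2 + a_3\alpha^3)
&= (a_0 - a_2) + (a_3 - a_2)\alpha + (a_1 - a_2)\alpha^2 - a_2\alpha^3 \quad (\in F_0(\alpha))\\
&= a_0 + a_1\alpha^2 + a_2\tfrac{1}{\alpha} + a_3\alpha = \tfrac{1}{\alpha}\big(a_2 + a_0\alpha + a_3\alpha^2 + a_1\alpha^3\big)\\
&= a_0 + a_3\alpha + a_1\alpha^2 + a_2\alpha^4.
\end{aligned}
\]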
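Also $\sigma_2 \in G(K, F_0)$.

Proof $\sigma_2 : K \to K$ is one-to-one: Let
\[
\sigma_2(a_0 + a_1\alpha + a_2\alpha^2 + a_3\alpha^3) = \sigma_2(b_0 + b_1\alpha + b_2\alpha^2 + b_3\alpha^3),
\]
where each $a_i, b_i$ is in $F_0$. We have to show that for every $i \in \{0, 1, 2, 3\}$, $a_i = b_i$. Here
\[
(a_0 - a_2) + (a_3 - a_2)\alpha + (a_1 - a_2)\alpha^2 - a_2\alpha^3 = (b_0 - b_2) + (b_3 - b_2)\alpha + (b_1 - b_2)\alpha^2 - b_2\alpha^3.
\]
Now, since $\{1, \alpha, \alpha^2, \alpha^3\}$ is a basis of the vector space $F_0(\alpha)$ over $F_0$, we have
\[
\left.
\begin{aligned}
a_0 - a_2 &= b_0 - b_2\\
a_3 - a_2 &= b_3 - b_2\\
a_1 - a_2 &= b_1 - b_2\\
-a_2 &= -b_2
\end{aligned}
\right\},
\]
that is, for every $i \in \{0, 1, 2, 3\}$, $a_i = b_i$.

$\sigma_2 : K \to K$ is onto: Let us take an arbitrary sum $b_0 + b_1\alpha + b_2\alpha^2 + b_3\alpha^3 \in K$, where each $b_i \in F_0$. Since
\[
\begin{aligned}
&\sigma_2\big((b_0 - b_3) + (b_2 - b_3)\alpha - b_3\alpha^2 + (b_1 - b_3)\alpha^3\big)\\
&\quad= \big((b_0 - b_3) - (-b_3)\big) + \big((b_1 - b_3) - (-b_3)\big)\alpha + \big((b_2 - b_3) - (-b_3)\big)\alpha^2 - (-b_3)\alpha^3\\
&\quad= b_0 + b_1\alpha + b_2\alpha^2 + b_3\alpha^3,
\end{aligned}
\]
it follows that
\[
\sigma_2\big((b_0 - b_3) + (b_2 - b_3)\alpha - b_3\alpha^2 + (b_1 - b_3)\alpha^3\big) = b_0 + b_1\alpha + b_2\alpha^2 + b_3\alpha^3,
\]
where $(b_0 - b_3) + (b_2 - b_3)\alpha - b_3\alpha^2 + (b_1 - b_3)\alpha^3 \in K$.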
It is clear that $\sigma_2 : (a_0 + a_1\alpha + a_2\alpha^2 + a_3\alpha^3) \mapsto (a_0 + a_3\alpha + a_1\alpha^2 + a_2\alpha^4)$ preserves addition. We claim that $\sigma_2$ preserves multiplication. We have to show that
\[
\sigma_2\big((a_0 + a_1\alpha + a_2\alpha^2 + a_3\alpha^3)(b_0 + b_1\alpha + b_2\alpha^2 + b_3\alpha^3)\big)
= \big(a_0 + a_1\alpha^2 + a_2\tfrac{1}{\alpha} + a_3\alpha\big)\big(b_0 + b_1\alpha^2 + b_2\tfrac{1}{\alpha} + b_3\alpha\big).
\]
For brevity, put
\[
\begin{aligned}
c_0 &\equiv a_0b_0 + a_2b_3 + a_3b_2, & c_1 &\equiv a_0b_1 + a_1b_0 + a_3b_3, & c_2 &\equiv a_0b_2 + a_1b_1 + a_2b_0,\\
c_3 &\equiv a_0b_3 + a_1b_2 + a_2b_1 + a_3b_0, & c_4 &\equiv a_1b_3 + a_2b_2 + a_3b_1.
\end{aligned}
\]
Since
\[
\begin{aligned}
(a_0 + a_1\alpha + a_2\alpha^2 + a_3\alpha^3)(b_0 + b_1\alpha + b_2\alpha^2 + b_3\alpha^3)
&= c_0 + c_1\alpha + c_2\alpha^2 + c_3\alpha^3 + c_4(-1 - \alpha - \alpha^2 - \alpha^3)\\
&= (c_0 - c_4) + (c_1 - c_4)\alpha + (c_2 - c_4)\alpha^2 + (c_3 - c_4)\alpha^3,
\end{aligned}
\]
we have
\[
\begin{aligned}
\text{LHS} &= (c_0 - c_4) + (c_3 - c_4)\alpha + (c_1 - c_4)\alpha^2 + (c_2 - c_4)\alpha^4\\
&= c_0 + c_3\alpha + c_1\alpha^2 + c_2\alpha^4 + c_4(-1 - \alpha - \alpha^2 - \alpha^4)\\
&= c_0 + c_3\alpha + c_1\alpha^2 + c_2\alpha^4 + c_4\alpha^3\\
&= c_0 + c_3\alpha + c_1\alpha^2 + c_4\alpha^3 + c_2\alpha^4,
\end{aligned}
\]
and
\[
\text{RHS} = \big(a_0 + a_1\alpha^2 + a_2\tfrac{1}{\alpha} + a_3\alpha\big)\big(b_0 + b_1\alpha^2 + b_2\tfrac{1}{\alpha} + b_3\alpha\big)
= c_0 + c_3\alpha + c_1\alpha^2 + c_4\alpha^3 + c_2\alpha^4,
\]
so LHS = RHS.

Finally, let $a_0 \in F_0$. We have to show that $\sigma_2(a_0) = a_0$. Here,
\[
\text{LHS} = \sigma_2(a_0) = \sigma_2(a_0 + 0\alpha + 0\alpha^2 + 0\alpha^3) = a_0 + 0\alpha + 0\alpha^2 + 0\alpha^4 = a_0 = \text{RHS}.
\]
Thus we have shown that $\sigma_2 \in G(K, F_0)$. ■

Case III: $\sigma(\alpha) = \alpha^3$. For every $a_0, a_1, a_2, a_3 \in F_0$,
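\[
\begin{aligned}
\sigma(a_0 + a_1\alpha + a_2\alpha^2 + a_3\alpha^3)
&= \sigma(a_0) + \sigma(a_1\alpha) + \sigma(a_2\alpha^2) + \sigma(a_3\alpha^3)\\
&= \sigma(a_0) + \sigma(a_1)\sigma(\alpha) + \sigma(a_2)(\sigma(\alpha))^2 + \sigma(a_3)(\sigma(\alpha))^3\\
&= a_0 + a_1(\alpha^3) + a_2(\alpha^3)^2 + a_3(\alpha^3)^3 = a_0 + a_1\alpha^3 + a_2\alpha + a_3\tfrac{1}{\alpha}\\
&= a_0 + a_1\alpha^3 + a_2\alpha + a_3(-1 - \alpha - \alpha^2 - \alpha^3)\\
&= (a_0 - a_3) + (a_2 - a_3)\alpha - a_3\alpha^2 + (a_1 - a_3)\alpha^3,
\end{aligned}
\]
that is, for every $a_0, a_1, a_2, a_3 \in F_0$, we have
\[
\sigma(a_0 + a_1\alpha + a_2\alpha^2 + a_3\alpha^3) = (a_0 - a_3) + (a_2 - a_3)\alpha - a_3\alpha^2 + (a_1 - a_3)\alpha^3.
\]
Let us denote this $\sigma$ by $\sigma_3$. Thus $\sigma_3(\alpha) = \alpha^3$, and for every $a_0, a_1, a_2, a_3 \in F_0$, we have
\[
\begin{aligned}
\sigma_3(a_0 + a_1\alpha + a_2\alpha^2 + a_3\alpha^3)
&= (a_0 - a_3) + (a_2 - a_3)\alpha - a_3\alpha^2 + (a_1 - a_3)\alpha^3 \quad (\in F_0(\alpha))\\
&= a_0 + a_1\alpha^3 + a_2\alpha + a_3\tfrac{1}{\alpha} = \tfrac{1}{\alpha^2}\big(a_1 + a_3\alpha + a_0\alpha^2 + a_2\alpha^3\big)\\
&= a_0 + a_2\alpha + a_1\alpha^3 + a_3\alpha^4.
\end{aligned}
\]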
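Also $\sigma_3 \in G(K, F_0)$.

Proof $\sigma_3 : K \to K$ is one-to-one: Let
\[
\sigma_3(a_0 + a_1\alpha + a_2\alpha^2 + a_3\alpha^3) = \sigma_3(b_0 + b_1\alpha + b_2\alpha^2 + b_3\alpha^3),
\]
where each $a_i, b_i$ is in $F_0$. We have to show that for every $i \in \{0, 1, 2, 3\}$, $a_i = b_i$. Here
\[
(a_0 - a_3) + (a_2 - a_3)\alpha - a_3\alpha^2 + (a_1 - a_3)\alpha^3 = (b_0 - b_3) + (b_2 - b_3)\alpha - b_3\alpha^2 + (b_1 - b_3)\alpha^3.
\]
Now, since $\{1, \alpha, \alpha^2, \alpha^3\}$ is a basis of the vector space $F_0(\alpha)$ over $F_0$, we have
\[
\left.
\begin{aligned}
a_0 - a_3 &= b_0 - b_3\\
a_2 - a_3 &= b_2 - b_3\\
-a_3 &= -b_3\\
a_1 - a_3 &= b_1 - b_3
\end{aligned}
\right\},
\]
that is, for every $i \in \{0, 1, 2, 3\}$, $a_i = b_i$.

$\sigma_3 : K \to K$ is onto: Let us take an arbitrary sum $b_0 + b_1\alpha + b_2\alpha^2 + b_3\alpha^3 \in K$, where each $b_i$ is in $F_0$. Since
\[
\begin{aligned}
&\sigma_3\big((b_0 - b_2) + (b_3 - b_2)\alpha + (b_1 - b_2)\alpha^2 - b_2\alpha^3\big)\\
&\quad= \big((b_0 - b_2) - (-b_2)\big) + \big((b_1 - b_2) - (-b_2)\big)\alpha - (-b_2)\alpha^2 + \big((b_3 - b_2) - (-b_2)\big)\alpha^3\\
&\quad= b_0 + b_1\alpha + b_2\alpha^2 + b_3\alpha^3,
\end{aligned}
\]
it follows that
\[
\sigma_3\big((b_0 - b_2) + (b_3 - b_2)\alpha + (b_1 - b_2)\alpha^2 - b_2\alpha^3\big) = b_0 + b_1\alpha + b_2\alpha^2 + b_3\alpha^3,
\]
where $(b_0 - b_2) + (b_3 - b_2)\alpha + (b_1 - b_2)\alpha^2 - b_2\alpha^3 \in K$.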
It is clear that $\sigma_3 : (a_0 + a_1\alpha + a_2\alpha^2 + a_3\alpha^3) \mapsto (a_0 + a_2\alpha + a_1\alpha^3 + a_3\alpha^4)$ preserves addition. We claim that $\sigma_3$ preserves multiplication. We have to show that
\[
\sigma_3\big((a_0 + a_1\alpha + a_2\alpha^2 + a_3\alpha^3)(b_0 + b_1\alpha + b_2\alpha^2 + b_3\alpha^3)\big)
= \big(a_0 + a_1\alpha^3 + a_2\alpha + a_3\tfrac{1}{\alpha}\big)\big(b_0 + b_1\alpha^3 + b_2\alpha + b_3\tfrac{1}{\alpha}\big).
\]
For brevity, put
\[
\begin{aligned}
c_0 &\equiv a_0b_0 + a_2b_3 + a_3b_2, & c_1 &\equiv a_0b_1 + a_1b_0 + a_3b_3, & c_2 &\equiv a_0b_2 + a_1b_1 + a_2b_0,\\
c_3 &\equiv a_0b_3 + a_1b_2 + a_2b_1 + a_3b_0, & c_4 &\equiv a_1b_3 + a_2b_2 + a_3b_1.
\end{aligned}
\]
Since
\[
\begin{aligned}
(a_0 + a_1\alpha + a_2\alpha^2 + a_3\alpha^3)(b_0 + b_1\alpha + b_2\alpha^2 + b_3\alpha^3)
&= c_0 + c_1\alpha + c_2\alpha^2 + c_3\alpha^3 + c_4(-1 - \alpha - \alpha^2 - \alpha^3)\\
&= (c_0 - c_4) + (c_1 - c_4)\alpha + (c_2 - c_4)\alpha^2 + (c_3 - c_4)\alpha^3,
\end{aligned}
\]
we have
\[
\begin{aligned}
\text{LHS} &= (c_0 - c_4) + (c_2 - c_4)\alpha + (c_1 - c_4)\alpha^3 + (c_3 - c_4)\alpha^4\\
&= c_0 + c_2\alpha + c_1\alpha^3 + c_3\alpha^4 + c_4(-1 - \alpha - \alpha^3 - \alpha^4)\\
&= c_0 + c_2\alpha + c_1\alpha^3 + c_3\alpha^4 + c_4\alpha^2\\
&= c_0 + c_2\alpha + c_4\alpha^2 + c_1\alpha^3 + c_3\alpha^4,
\end{aligned}
\]
and
\[
\text{RHS} = \big(a_0 + a_1\alpha^3 + a_2\alpha + a_3\tfrac{1}{\alpha}\big)\big(b_0 + b_1\alpha^3 + b_2\alpha + b_3\tfrac{1}{\alpha}\big)
= c_0 + c_2\alpha + c_4\alpha^2 + c_1\alpha^3 + c_3\alpha^4,
\]
so LHS = RHS.

Finally, let $a_0 \in F_0$. We have to show that $\sigma_3(a_0) = a_0$. Here
\[
\text{LHS} = \sigma_3(a_0) = \sigma_3(a_0 + 0\alpha + 0\alpha^2 + 0\alpha^3) = a_0 + 0\alpha + 0\alpha^3 + 0\alpha^4 = a_0 = \text{RHS}.
\]
Thus we have shown that $\sigma_3 \in G(K, F_0)$. ■
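Case IV: $\sigma(\alpha) = \alpha^4$. For every $a_0, a_1, a_2, a_3 \in F_0$,
\[
\begin{aligned}
\sigma(a_0 + a_1\alpha + a_2\alpha^2 + a_3\alpha^3)
&= \sigma(a_0) + \sigma(a_1\alpha) + \sigma(a_2\alpha^2) + \sigma(a_3\alpha^3)\\
&= \sigma(a_0) + \sigma(a_1)\sigma(\alpha) + \sigma(a_2)(\sigma(\alpha))^2 + \sigma(a_3)(\sigma(\alpha))^3\\
&= a_0 + a_1(\alpha^4) + a_2(\alpha^4)^2 + a_3(\alpha^4)^3 = a_0 + a_1\tfrac{1}{\alpha} + a_2\alpha^3 + a_3\alpha^2\\
&= a_0 + a_1(-1 - \alpha - \alpha^2 - \alpha^3) + a_2\alpha^3 + a_3\alpha^2\\
&= (a_0 - a_1) - a_1\alpha + (a_3 - a_1)\alpha^2 + (a_2 - a_1)\alpha^3,
\end{aligned}
\]
that is, for every $a_0, a_1, a_2, a_3 \in F_0$, we have
\[
\sigma(a_0 + a_1\alpha + a_2\alpha^2 + a_3\alpha^3) = (a_0 - a_1) - a_1\alpha + (a_3 - a_1)\alpha^2 + (a_2 - a_1)\alpha^3.
\]
Let us denote this $\sigma$ by $\sigma_4$. Thus $\sigma_4(\alpha) = \alpha^4$, and for every $a_0, a_1, a_2, a_3 \in F_0$, we have
\[
\begin{aligned}
\sigma_4(a_0 + a_1\alpha + a_2\alpha^2 + a_3\alpha^3)
&= (a_0 - a_1) - a_1\alpha + (a_3 - a_1)\alpha^2 + (a_2 - a_1)\alpha^3 \quad (\in F_0(\alpha))\\
&= a_0 + a_1\tfrac{1}{\alpha} + a_2\alpha^3 + a_3\alpha^2 = \tfrac{1}{\alpha^3}\big(a_3 + a_2\alpha + a_1\alpha^2 + a_0\alpha^3\big)\\
&= a_0 + a_3\alpha^2 + a_2\alpha^3 + a_1\alpha^4.
\end{aligned}
\]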
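Also $\sigma_4 \in G(K, F_0)$.

Proof $\sigma_4 : K \to K$ is one-to-one: Let
\[
\sigma_4(a_0 + a_1\alpha + a_2\alpha^2 + a_3\alpha^3) = \sigma_4(b_0 + b_1\alpha + b_2\alpha^2 + b_3\alpha^3),
\]
where each $a_i, b_i$ is in $F_0$. We have to show that for every $i \in \{0, 1, 2, 3\}$, $a_i = b_i$. Here
\[
(a_0 - a_1) - a_1\alpha + (a_3 - a_1)\alpha^2 + (a_2 - a_1)\alpha^3 = (b_0 - b_1) - b_1\alpha + (b_3 - b_1)\alpha^2 + (b_2 - b_1)\alpha^3.
\]
Now, since $\{1, \alpha, \alpha^2, \alpha^3\}$ is a basis of the vector space $F_0(\alpha)$ over $F_0$, we have
\[
\left.
\begin{aligned}
a_0 - a_1 &= b_0 - b_1\\
-a_1 &= -b_1\\
a_3 - a_1 &= b_3 - b_1\\
a_2 - a_1 &= b_2 - b_1
\end{aligned}
\right\},
\]
that is, for every $i \in \{0, 1, 2, 3\}$, $a_i = b_i$.

$\sigma_4 : K \to K$ is onto: Let us take an arbitrary sum $b_0 + b_1\alpha + b_2\alpha^2 + b_3\alpha^3 \in K$, where each $b_i$ is in $F_0$. Since
\[
\begin{aligned}
&\sigma_4\big((b_0 - b_1) - b_1\alpha + (b_3 - b_1)\alpha^2 + (b_2 - b_1)\alpha^3\big)\\
&\quad= \big((b_0 - b_1) - (-b_1)\big) - (-b_1)\alpha + \big((b_2 - b_1) - (-b_1)\big)\alpha^2 + \big((b_3 - b_1) - (-b_1)\big)\alpha^3\\
&\quad= b_0 + b_1\alpha + b_2\alpha^2 + b_3\alpha^3,
\end{aligned}
\]
it follows that
\[
\sigma_4\big((b_0 - b_1) - b_1\alpha + (b_3 - b_1)\alpha^2 + (b_2 - b_1)\alpha^3\big) = b_0 + b_1\alpha + b_2\alpha^2 + b_3\alpha^3,
\]
where $(b_0 - b_1) - b_1\alpha + (b_3 - b_1)\alpha^2 + (b_2 - b_1)\alpha^3 \in K$.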
It is clear that $\sigma_4 : (a_0 + a_1\alpha + a_2\alpha^2 + a_3\alpha^3) \mapsto (a_0 + a_3\alpha^2 + a_2\alpha^3 + a_1\alpha^4)$ preserves addition.

$\sigma_4$ preserves multiplication: We have to show that
\[
\sigma_4\big((a_0 + a_1\alpha + a_2\alpha^2 + a_3\alpha^3)(b_0 + b_1\alpha + b_2\alpha^2 + b_3\alpha^3)\big)
= \big(a_0 + a_1\tfrac{1}{\alpha} + a_2\alpha^3 + a_3\alpha^2\big)\big(b_0 + b_1\tfrac{1}{\alpha} + b_2\alpha^3 + b_3\alpha^2\big).
\]
For brevity, put
\[
\begin{aligned}
c_0 &\equiv a_0b_0 + a_2b_3 + a_3b_2, & c_1 &\equiv a_0b_1 + a_1b_0 + a_3b_3, & c_2 &\equiv a_0b_2 + a_1b_1 + a_2b_0,\\
c_3 &\equiv a_0b_3 + a_1b_2 + a_2b_1 + a_3b_0, & c_4 &\equiv a_1b_3 + a_2b_2 + a_3b_1.
\end{aligned}
\]
Since
\[
\begin{aligned}
(a_0 + a_1\alpha + a_2\alpha^2 + a_3\alpha^3)(b_0 + b_1\alpha + b_2\alpha^2 + b_3\alpha^3)
&= c_0 + c_1\alpha + c_2\alpha^2 + c_3\alpha^3 + c_4(-1 - \alpha - \alpha^2 - \alpha^3)\\
&= (c_0 - c_4) + (c_1 - c_4)\alpha + (c_2 - c_4)\alpha^2 + (c_3 - c_4)\alpha^3,
\end{aligned}
\]
we have
\[
\begin{aligned}
\text{LHS} &= (c_0 - c_4) + (c_3 - c_4)\alpha^2 + (c_2 - c_4)\alpha^3 + (c_1 - c_4)\alpha^4\\
&= c_0 + c_3\alpha^2 + c_2\alpha^3 + c_1\alpha^4 + c_4(-1 - \alpha^2 - \alpha^3 - \alpha^4)\\
&= c_0 + c_3\alpha^2 + c_2\alpha^3 + c_1\alpha^4 + c_4\alpha\\
&= c_0 + c_4\alpha + c_3\alpha^2 + c_2\alpha^3 + c_1\alpha^4,
\end{aligned}
\]
and
\[
\text{RHS} = \big(a_0 + a_1\tfrac{1}{\alpha} + a_2\alpha^3 + a_3\alpha^2\big)\big(b_0 + b_1\tfrac{1}{\alpha} + b_2\alpha^3 + b_3\alpha^2\big)
= c_0 + c_4\alpha + c_3\alpha^2 + c_2\alpha^3 + c_1\alpha^4,
\]
so LHS = RHS.

Finally, let $a_0 \in F_0$. We have to show that $\sigma_4(a_0) = a_0$. Here
\[
\text{LHS} = \sigma_4(a_0) = \sigma_4(a_0 + 0\alpha + 0\alpha^2 + 0\alpha^3) = a_0 + 0\alpha^2 + 0\alpha^3 + 0\alpha^4 = a_0 = \text{RHS}.
\]
Thus we have shown that $\sigma_4 \in G(K, F_0)$. ■

Hence
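\[
G(K, F_0) = \{\sigma_1, \sigma_2, \sigma_3, \sigma_4\}.
\]
Next, $o(G(K, F_0)) = o(\{\sigma_1, \sigma_2, \sigma_3, \sigma_4\}) = 4$. Further,
\[
\begin{aligned}
(\text{fixed field of } G(K, F_0))
&= \{a : a \in K, \text{ and for every } \sigma \in G(K, F_0),\ \sigma(a) = a\}\\
&= \{a : a \in K, \text{ and for every } \sigma \in \{\sigma_1, \sigma_2, \sigma_3, \sigma_4\},\ \sigma(a) = a\}\\
&= \{a : a \in K,\ \sigma_1(a) = a,\ \sigma_2(a) = a,\ \sigma_3(a) = a,\ \sigma_4(a) = a\}\\
&= \{a : a \in K,\ \sigma_2(a) = a,\ \sigma_3(a) = a,\ \sigma_4(a) = a\}\\
&= \{a_0 + a_1\alpha + a_2\alpha^2 + a_3\alpha^3 : a_0, a_1, a_2, a_3 \in F_0,\\
&\qquad (a_0 - a_2) + (a_3 - a_2)\alpha + (a_1 - a_2)\alpha^2 - a_2\alpha^3 = a_0 + a_1\alpha + a_2\alpha^2 + a_3\alpha^3,\\
&\qquad (a_0 - a_3) + (a_2 - a_3)\alpha - a_3\alpha^2 + (a_1 - a_3)\alpha^3 = a_0 + a_1\alpha + a_2\alpha^2 + a_3\alpha^3,\\
&\qquad (a_0 - a_1) - a_1\alpha + (a_3 - a_1)\alpha^2 + (a_2 - a_1)\alpha^3 = a_0 + a_1\alpha + a_2\alpha^2 + a_3\alpha^3\}\\
&= \{a_0 + a_1\alpha + a_2\alpha^2 + a_3\alpha^3 : a_0 \in F_0,\ a_1 = a_2 = a_3 = 0\} = F_0,
\end{aligned}
\]
so the fixed field of $G(F_0(\alpha), F_0)$ is $F_0$.

Observe that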
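\[
\begin{aligned}
(\sigma_2)^2(a_0 + a_1\alpha + a_2\alpha^2 + a_3\alpha^3)
&= \sigma_2\big(\sigma_2(a_0 + a_1\alpha + a_2\alpha^2 + a_3\alpha^3)\big)\\
&= \sigma_2\big((a_0 - a_2) + (a_3 - a_2)\alpha + (a_1 - a_2)\alpha^2 - a_2\alpha^3\big)\\
&= \big((a_0 - a_2) - (a_1 - a_2)\big) + \big((-a_2) - (a_1 - a_2)\big)\alpha\\
&\qquad + \big((a_3 - a_2) - (a_1 - a_2)\big)\alpha^2 - (a_1 - a_2)\alpha^3\\
&= (a_0 - a_1) - a_1\alpha + (a_3 - a_1)\alpha^2 + (a_2 - a_1)\alpha^3\\
&= \sigma_4(a_0 + a_1\alpha + a_2\alpha^2 + a_3\alpha^3),
\end{aligned}
\]
so $(\sigma_2)^2 = \sigma_4$. Next,
\[
\begin{aligned}
(\sigma_2)^3(a_0 + a_1\alpha + a_2\alpha^2 + a_3\alpha^3)
&= \sigma_2\big((a_0 - a_1) - a_1\alpha + (a_3 - a_1)\alpha^2 + (a_2 - a_1)\alpha^3\big)\\
&= \big((a_0 - a_1) - (a_3 - a_1)\big) + \big((a_2 - a_1) - (a_3 - a_1)\big)\alpha\\
&\qquad + \big((-a_1) - (a_3 - a_1)\big)\alpha^2 - (a_3 - a_1)\alpha^3\\
&= (a_0 - a_3) + (a_2 - a_3)\alpha - a_3\alpha^2 + (a_1 - a_3)\alpha^3\\
&= \sigma_3(a_0 + a_1\alpha + a_2\alpha^2 + a_3\alpha^3),
\end{aligned}
\]
so $(\sigma_2)^3 = \sigma_3$. Finally,
\[
\begin{aligned}
(\sigma_2)^4(a_0 + a_1\alpha + a_2\alpha^2 + a_3\alpha^3)
&= \sigma_2\big((a_0 - a_3) + (a_2 - a_3)\alpha - a_3\alpha^2 + (a_1 - a_3)\alpha^3\big)\\
&= \big((a_0 - a_3) - (-a_3)\big) + \big((a_1 - a_3) - (-a_3)\big)\alpha\\
&\qquad + \big((a_2 - a_3) - (-a_3)\big)\alpha^2 - (-a_3)\alpha^3\\
&= a_0 + a_1\alpha + a_2\alpha^2 + a_3\alpha^3 = \sigma_1(a_0 + a_1\alpha + a_2\alpha^2 + a_3\alpha^3),
\end{aligned}
\]
so $(\sigma_2)^4 = \sigma_1$. Thus
\[
G(K, F_0) = \{\sigma_1, \sigma_2, \sigma_3, \sigma_4\} = \{\sigma_2, (\sigma_2)^2, (\sigma_2)^3, (\sigma_2)^4\}.
\]
It follows that $G(K, F_0)$ is a cyclic group generated by
\[
\sigma_2 : (a_0 + a_1\alpha + a_2\alpha^2 + a_3\alpha^3) \mapsto (a_0 + a_3\alpha + a_1\alpha^2 + a_2\alpha^4).
\]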
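Since
\[
\begin{aligned}
(\sigma_4)^2(a_0 + a_1\alpha + a_2\alpha^2 + a_3\alpha^3)
&= \sigma_4\big((a_0 - a_1) - a_1\alpha + (a_3 - a_1)\alpha^2 + (a_2 - a_1)\alpha^3\big)\\
&= \big((a_0 - a_1) - (-a_1)\big) - (-a_1)\alpha + \big((a_2 - a_1) - (-a_1)\big)\alpha^2 + \big((a_3 - a_1) - (-a_1)\big)\alpha^3\\
&= a_0 + a_1\alpha + a_2\alpha^2 + a_3\alpha^3\\
&= \sigma_1(a_0 + a_1\alpha + a_2\alpha^2 + a_3\alpha^3),
\end{aligned}
\]
we have $(\sigma_4)^2 = \sigma_1$, and hence $\{\sigma_1, \sigma_4\}$ is a subgroup of $G(K, F_0)$. Here
\[
\begin{aligned}
(\text{fixed field of } \{\sigma_1, \sigma_4\})
&= \{a : a \in K, \text{ and for every } \sigma \in \{\sigma_1, \sigma_4\},\ \sigma(a) = a\}\\
&= \{a : a \in K,\ \sigma_1(a) = a,\ \sigma_4(a) = a\} = \{a : a \in K,\ \sigma_4(a) = a\}\\
&= \{a_0 + a_1\alpha + a_2\alpha^2 + a_3\alpha^3 : a_0, a_1, a_2, a_3 \in F_0,\\
&\qquad (a_0 - a_1) - a_1\alpha + (a_3 - a_1)\alpha^2 + (a_2 - a_1)\alpha^3 = a_0 + a_1\alpha + a_2\alpha^2 + a_3\alpha^3\}\\
&= \{a_0 + a_1\alpha + a_2\alpha^2 + a_3\alpha^3 : a_0 \in F_0,\ a_1 = 0,\ a_2 = a_3\}\\
&= \{a_0 + a_2(\alpha^2 + \alpha^3) : a_0, a_2 \in F_0\},
\end{aligned}
\]
so the fixed field of $\{\sigma_1, \sigma_4\}$ is $\{a_0 + a_2(\alpha^2 + \alpha^3) : a_0, a_2 \in F_0\}$.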
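The long verifications in this example can be spot-checked with exact rational arithmetic. The sketch below represents $a_0 + a_1\alpha + a_2\alpha^2 + a_3\alpha^3$ by the tuple $(a_0, a_1, a_2, a_3)$ and uses only the relations $\alpha^5 = 1$ and $\alpha^4 = -1 - \alpha - \alpha^2 - \alpha^3$; the function names and the sample elements `x`, `y` are illustrative choices, and a check on samples is of course weaker than the proofs above.

```python
from fractions import Fraction
from itertools import product

def reduce5(c):
    """Rewrite c[0] + c[1]a + ... + c[4]a^4 in the basis 1, a, a^2, a^3,
    using a^4 = -1 - a - a^2 - a^3."""
    return tuple(c[i] - c[4] for i in range(4))

def mul(x, y):
    """Product in F0(alpha), using alpha^5 = 1 before reducing."""
    c = [Fraction(0)] * 5
    for i, j in product(range(4), repeat=2):
        c[(i + j) % 5] += x[i] * y[j]
    return reduce5(c)

def sigma(k, x):
    """The map fixing F0 and sending alpha to alpha^k."""
    c = [Fraction(0)] * 5
    for i in range(4):
        c[(i * k) % 5] += x[i]
    return reduce5(c)

x = tuple(Fraction(v) for v in (1, 2, 3, 4))
y = tuple(Fraction(v) for v in (5, -1, 0, 2))

# Each sigma_k preserves multiplication on the sample pair.
multiplicative = all(
    sigma(k, mul(x, y)) == mul(sigma(k, x), sigma(k, y)) for k in (1, 2, 3, 4)
)

# sigma_2 generates the group: its powers are sigma_4, sigma_3, sigma_1.
def s2(z):
    return sigma(2, z)

cyclic = (
    s2(s2(x)) == sigma(4, x)
    and s2(s2(s2(x))) == sigma(3, x)
    and s2(s2(s2(s2(x)))) == x
)

# alpha^2 + alpha^3 is fixed by sigma_4.
z = (Fraction(0), Fraction(0), Fraction(1), Fraction(1))
fixed = sigma(4, z) == z

print(multiplicative, cyclic, fixed)  # True True True
```

With `x` standing for $1 + 2\alpha + 3\alpha^2 + 4\alpha^3$, `sigma(2, x)` returns $(-2, 1, -1, -3)$, which agrees with the closed formula $(a_0 - a_2) + (a_3 - a_2)\alpha + (a_1 - a_2)\alpha^2 - a_2\alpha^3$.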
2.3.16 Problem Let $n$ be a positive integer $\geq 2$. Let $F$ be a field such that $F \subset \mathbb{C}$. Suppose that $F$ contains all the $n$th roots of unity, that is, $F$ contains all the roots of the polynomial $x^n - 1$, that is, $\{1, e^{2\pi i/n}, (e^{2\pi i/n})^2, \ldots, (e^{2\pi i/n})^{n-1}\} \subset F$. Let $a$ be a nonzero member of $F$. Let $u$ be a root of the polynomial $x^n - a$ in $\mathbb{C}$, that is, $u^n = a$. Then $F(u)$ is the splitting field over $F$ for $x^n - a$.

Proof We must prove:

1. $F(u)$ is a finite extension of $F$,
2. $F(u)$ contains all the roots of $x^n - a$ in $\mathbb{C}$,
3. if $G$ is a proper subfield of $F(u)$ that contains $F$, then $G$ does not contain all the roots of $x^n - a$ in $\mathbb{C}$.

For 1: Since $u\ (\in \mathbb{C})$ is a root of the polynomial $(x^n - a) \in F[x]$, $u$ is algebraic over $F$, and hence by 1.4.17, $F(u)$ is a finite extension of $F$.

For 2: Here we have to show that $\{u1, ue^{2\pi i/n}, u(e^{2\pi i/n})^2, \ldots, u(e^{2\pi i/n})^{n-1}\} \subset F(u)$. Since $\{1, e^{2\pi i/n}, (e^{2\pi i/n})^2, \ldots, (e^{2\pi i/n})^{n-1}\} \subset F \subset F(u)$, $u \in F(u)$, and $F(u)$ is a field, we have
\[
\{u1, ue^{2\pi i/n}, u(e^{2\pi i/n})^2, \ldots, u(e^{2\pi i/n})^{n-1}\} \subset F(u).
\]
For 3: Suppose to the contrary that $G$ is a subfield of $F(u)$ such that $G \neq F(u)$, $G$ contains $F$, and $G$ contains
\[
\{u1, ue^{2\pi i/n}, u(e^{2\pi i/n})^2, \ldots, u(e^{2\pi i/n})^{n-1}\}\ (\ni u).
\]
We seek a contradiction. Since $G$ is a subfield of $F(u)$, we have $G \subset F(u)$. Since $G$ is a field that contains $F \cup \{u\}$, we have $F(u) \subset G$, and hence $G = F(u)$. This is a contradiction. ■
2.3.17 Problem Let $n$ be a positive integer $\geq 2$. Let $F$ be a field such that $F \subset \mathbb{C}$. Suppose that $F$ contains all the $n$th roots of unity, that is, $F$ contains all the roots of the polynomial $x^n - 1$, that is, $\{1, e^{2\pi i/n}, (e^{2\pi i/n})^2, \ldots, (e^{2\pi i/n})^{n-1}\} \subset F$. Let $a$ be a nonzero member of $F$. Hence $(x^n - a) \in F[x]$. Then the Galois group of $x^n - a$ over $F$ is abelian.

Proof By 2.3.16, $F(u)$ is the splitting field over $F$ for $x^n - a$, where $u^n = a$ and $u \in \mathbb{C}$, and hence $F(u)$ contains all the roots of $x^n - a$ in $\mathbb{C}$. Further, the set of all the roots of $x^n - a$ is $\{u1, ue^{2\pi i/n}, u(e^{2\pi i/n})^2, \ldots, u(e^{2\pi i/n})^{n-1}\}\ (\subset F(u))$.

The Galois group of $x^n - a$ is the group $G(F(u), F)\ (= \{\sigma : \sigma \in \mathrm{Aut}(F(u)), \text{ and for every } a \in F,\ \sigma(a) = a\})$. We have to show that $G(F(u), F)$ is abelian.

To this end, let us take an automorphism $\sigma : F(u) \to F(u)$ such that for every $a \in F$, $\sigma(a) = a$. Next let us take an automorphism $\tau : F(u) \to F(u)$ such that for every $a \in F$, $\tau(a) = a$. We have to show that for every $b \in F(u)$, $\tau(\sigma(b)) = \sigma(\tau(b))$.

Observe that $\{t : t \in F(u),\ \tau(\sigma(t)) = \sigma(\tau(t))\}$ is a subfield of $F(u)$.

Proof Let $s, t \in F(u)$, where $\tau(\sigma(s)) = \sigma(\tau(s))$, and $\tau(\sigma(t)) = \sigma(\tau(t))$. It suffices to show that

1. $\tau(\sigma(s - t)) = \sigma(\tau(s - t))$,
2. $\tau(\sigma(st)) = \sigma(\tau(st))$,
3. if $s, t$ are nonzero, then $\tau(\sigma(st^{-1})) = \sigma(\tau(st^{-1}))$.

For 1:
\[
\begin{aligned}
\text{LHS} = \tau(\sigma(s - t)) &= \tau(\sigma(s) - \sigma(t)) = \tau(\sigma(s)) - \tau(\sigma(t))\\
&= \sigma(\tau(s)) - \tau(\sigma(t)) = \sigma(\tau(s)) - \sigma(\tau(t)) = \sigma(\tau(s) - \tau(t)) = \sigma(\tau(s - t)) = \text{RHS}.
\end{aligned}
\]
For 2:
\[
\begin{aligned}
\text{LHS} = \tau(\sigma(st)) &= \tau(\sigma(s)\sigma(t)) = \tau(\sigma(s))\tau(\sigma(t))\\
&= \sigma(\tau(s))\tau(\sigma(t)) = \sigma(\tau(s))\sigma(\tau(t)) = \sigma(\tau(s)\tau(t)) = \sigma(\tau(st)) = \text{RHS}.
\end{aligned}
\]
For 3: Let $s, t$ be nonzero. We have
\[
\begin{aligned}
\text{LHS} = \tau(\sigma(st^{-1})) &= \tau\big(\sigma(s)(\sigma(t))^{-1}\big) = \tau(\sigma(s))\big(\tau(\sigma(t))\big)^{-1} = \sigma(\tau(s))\big(\tau(\sigma(t))\big)^{-1}\\
&= \sigma(\tau(s))\big(\sigma(\tau(t))\big)^{-1} = \sigma\big(\tau(s)(\tau(t))^{-1}\big) = \sigma(\tau(st^{-1})) = \text{RHS}.
\end{aligned}
\]
■

We have shown that $\{t : t \in F(u),\ \tau(\sigma(t)) = \sigma(\tau(t))\}$ is a subfield of $F(u)$. It is clear that
\[
F \subset \{t : t \in F(u),\ \tau(\sigma(t)) = \sigma(\tau(t))\}.
\]
Since $u^n = a$, we have
\[
(\sigma(u))^n = \sigma(u^n) = \sigma(a) = a,
\]
and hence $(\sigma(u))^n - a = 0$. Thus $\sigma(u)$ is a root of $x^n - a$ in $\mathbb{C}$. Now, since the set of all the roots of $x^n - a$ is $\{u1, ue^{2\pi i/n}, u(e^{2\pi i/n})^2, \ldots, u(e^{2\pi i/n})^{n-1}\}$, there exists an integer $k \in \{0, 1, \ldots, n-1\}$ such that $\sigma(u) = u(e^{2\pi i/n})^k$. Similarly, there exists an integer $l \in \{0, 1, \ldots, n-1\}$ such that $\tau(u) = u(e^{2\pi i/n})^l$.

Clearly, $\tau(\sigma(u)) = \sigma(\tau(u))$.

Proof
\[
\text{LHS} = \tau(\sigma(u)) = \tau\big(u(e^{2\pi i/n})^k\big) = \tau(u)\,\tau\big((e^{2\pi i/n})^k\big) = \tau(u)(e^{2\pi i/n})^k = \big(u(e^{2\pi i/n})^l\big)(e^{2\pi i/n})^k = u(e^{2\pi i/n})^{k+l},
\]
and
\[
\text{RHS} = \sigma(\tau(u)) = \sigma\big(u(e^{2\pi i/n})^l\big) = \sigma(u)\,\sigma\big((e^{2\pi i/n})^l\big) = \sigma(u)(e^{2\pi i/n})^l = \big(u(e^{2\pi i/n})^k\big)(e^{2\pi i/n})^l = u(e^{2\pi i/n})^{k+l}.
\]
Thus LHS = RHS. ■

We have shown that $\tau(\sigma(u)) = \sigma(\tau(u))$, and hence $u \in \{t : t \in F(u),\ \tau(\sigma(t)) = \sigma(\tau(t))\}$. Thus $\{t : t \in F(u),\ \tau(\sigma(t)) = \sigma(\tau(t))\}$ is a field containing $F \cup \{u\}$, and hence
\[
F(u) \subset \{t : t \in F(u),\ \tau(\sigma(t)) = \sigma(\tau(t))\}\ (\subset F(u)).
\]
Thus $F(u) = \{t : t \in F(u),\ \tau(\sigma(t)) = \sigma(\tau(t))\}$. Thus for every $b \in F(u)$, $\tau(\sigma(b)) = \sigma(\tau(b))$. ■
2.4 Solvability By Radicals
2.4.1 Definition Let $F$, $K$ be any fields such that $K$ is an extension of $F$. Let $p(x) \in F[x]$. Let $\deg(p(x)) = n$. Suppose that $K$ contains all the $n$ roots of $p(x)$. If there exists a finite sequence
\[
(\omega_1, r_1), (\omega_2, r_2), \ldots, (\omega_k, r_k)
\]
such that

1. each $\omega_i$ is a member of $K$,
2. each $r_i$ is an integer $\geq 2$,
3. $(\omega_1)^{r_1} \in F$, $(\omega_2)^{r_2} \in F(\omega_1)$, $(\omega_3)^{r_3} \in F(\omega_1, \omega_2)$, $\ldots$, $(\omega_k)^{r_k} \in F(\omega_1, \omega_2, \ldots, \omega_{k-1})$,
4. $F(\omega_1, \omega_2, \ldots, \omega_{k-1}, \omega_k)$ contains all the $n$ roots of $p(x)$ in $K$,

then we say that $p(x)$ is solvable by radicals over $F$.
2.4.2 Example Let us take $\mathbb{Q}$ for $F$, $\mathbb{C}$ for $K$, and $x^5 - 2x^3 - x^2 + 2$ for $p(x)$. Since
\[
\begin{aligned}
x^5 - 2x^3 - x^2 + 2
&= x^3(x^2 - 2) - (x^2 - 2) = (x^2 - 2)(x^3 - 1) = (x - \sqrt{2})(x + \sqrt{2})(x^3 - 1)\\
&= (x - \sqrt{2})(x + \sqrt{2})(x - 1)(x^2 + x + 1)\\
&= (x - \sqrt{2})(x + \sqrt{2})(x - 1)\left(\left(x + \tfrac{1}{2}\right)^2 - \left(\tfrac{i\sqrt{3}}{2}\right)^2\right)\\
&= (x - 1)(x - \sqrt{2})(x + \sqrt{2})\left(x - \left(-\tfrac{1}{2} + \tfrac{i\sqrt{3}}{2}\right)\right)\left(x - \left(-\tfrac{1}{2} - \tfrac{i\sqrt{3}}{2}\right)\right),
\end{aligned}
\]
we have
\[
x^5 - 2x^3 - x^2 + 2 = (x - 1)(x - \sqrt{2})(x + \sqrt{2})(x - \omega)(x - \omega^2),
\]
where $\omega \equiv -\tfrac{1}{2} + \tfrac{i\sqrt{3}}{2}$. It follows that all the roots of $p(x)$ are $1, \sqrt{2}, -\sqrt{2}, \omega, \omega^2$. They are all members of $\mathbb{C}$. Let us take $\sqrt{2}$ for $\omega_1$ and $\omega$ for $\omega_2$. Let us take 2 for $r_1$ and 3 for $r_2$. Put $F_1 \equiv \mathbb{Q}(\sqrt{2})$ and $F_2 \equiv F_1(\omega)$. Let us take $k = 2$. Then $(\omega_1)^{r_1} = 2 \in \mathbb{Q}$, $(\omega_2)^{r_2} = 1 \in F_1$, and $F_2$ contains all five roots. Now all the conditions of the above definition are satisfied, so $x^5 - 2x^3 - x^2 + 2$ is solvable by radicals over $\mathbb{Q}$.
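The factorization and the two radical steps of this example can be checked numerically. The following is a floating-point sketch, so equalities are tested only up to a tolerance; the variable names are illustrative.

```python
import cmath

def p(x):
    """The polynomial x^5 - 2x^3 - x^2 + 2 of the example."""
    return x**5 - 2 * x**3 - x**2 + 2

sqrt2 = 2 ** 0.5
omega = complex(-0.5, 3 ** 0.5 / 2)

# The five claimed roots 1, sqrt(2), -sqrt(2), omega, omega^2.
roots = [1, sqrt2, -sqrt2, omega, omega**2]
all_roots = all(cmath.isclose(p(z), 0, abs_tol=1e-9) for z in roots)

# The radical steps: omega_1^2 = 2 lies in Q, and omega_2^3 = 1 lies in Q(sqrt 2).
tower_ok = cmath.isclose(sqrt2**2, 2) and cmath.isclose(omega**3, 1)

print(all_roots, tower_ok)  # True True
```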
2.4.3 Example Let us take the general cubic polynomial
\[
x^3 + a_1x^2 + a_2x + a_3
\]
over the field $F_0$ of all rational numbers. By $F_0(a_1, a_2, a_3)$ we shall mean the field of rational functions in $a_1, a_2, a_3$ over $F_0$.

Since
\[
\begin{aligned}
x^3 + a_1x^2 + a_2x + a_3
&= \left(x^3 + 3x^2\tfrac{a_1}{3} + 3x\left(\tfrac{a_1}{3}\right)^2 + \left(\tfrac{a_1}{3}\right)^3\right) - 3x\left(\tfrac{a_1}{3}\right)^2 - \left(\tfrac{a_1}{3}\right)^3 + a_2x + a_3\\
&= \left(x + \tfrac{a_1}{3}\right)^3 + x\left(a_2 - \tfrac{(a_1)^2}{3}\right) + \left(a_3 - \left(\tfrac{a_1}{3}\right)^3\right)\\
&= \left(x + \tfrac{a_1}{3}\right)^3 + \left(a_2 - \tfrac{(a_1)^2}{3}\right)\left(x + \tfrac{a_1}{3}\right) - \left(a_2 - \tfrac{(a_1)^2}{3}\right)\tfrac{a_1}{3} + \left(a_3 - \left(\tfrac{a_1}{3}\right)^3\right)
= y^3 + py + q,
\end{aligned}
\]
where $y \equiv x + \tfrac{a_1}{3}$, $p \equiv a_2 - \tfrac{(a_1)^2}{3}\ (\in F_0(a_1, a_2, a_3))$, and
\[
q \equiv -\left(a_2 - \tfrac{(a_1)^2}{3}\right)\tfrac{a_1}{3} + \left(a_3 - \left(\tfrac{a_1}{3}\right)^3\right) = \tfrac{2(a_1)^3}{27} - \tfrac{a_1a_2}{3} + a_3\ (\in F_0(a_1, a_2, a_3)),
\]
we have
\[
x^3 + a_1x^2 + a_2x + a_3 = y^3 + py + q.
\]
Put $y \equiv u + v$, where $uv = -\tfrac{p}{3}$. It follows that
\[
y^3 + py + q = (u^3 + v^3 + 3uvy) + py + q = u^3 + v^3 + q.
\]
Hence
\[
x^3 + a_1x^2 + a_2x + a_3 = 0
\]
is equivalent to
\[
y^3 + py + q = 0,
\]
which, in turn, is equivalent to the simultaneous equations
\[
\left.
\begin{aligned}
u^3 + v^3 + q &= 0\\
uv &= -\tfrac{p}{3}
\end{aligned}
\right\},
\]
that is,
\[
\left.
\begin{aligned}
u^3 + v^3 &= -q\\
u^3v^3 &= -\tfrac{p^3}{27}
\end{aligned}
\right\},
\]
that is,
\[
\left.
\begin{aligned}
u^3 + v^3 &= -q\\
(u^3 - v^3)^2 &= q^2 - 4\left(-\tfrac{p^3}{27}\right)
\end{aligned}
\right\}.
\]
Now we can take
\[
\left.
\begin{aligned}
u^3 &= \tfrac{1}{2}\left(-q + \sqrt{q^2 + \tfrac{4p^3}{27}}\right)\\
v^3 &= \tfrac{1}{2}\left(-q - \sqrt{q^2 + \tfrac{4p^3}{27}}\right)
\end{aligned}
\right\}.
\]
It follows that we can take
\[
\left.
\begin{aligned}
u &= \sqrt[3]{-\tfrac{1}{2}q + \sqrt{\tfrac{1}{27}p^3 + \tfrac{1}{4}q^2}}\\
v &= \sqrt[3]{-\tfrac{1}{2}q - \sqrt{\tfrac{1}{27}p^3 + \tfrac{1}{4}q^2}}
\end{aligned}
\right\},
\]
with the cube roots chosen so that $uv = -\tfrac{p}{3}$. Hence all the roots of $x^3 + a_1x^2 + a_2x + a_3$ are
\[
-\tfrac{a_1}{3} + (u + v),\quad -\tfrac{a_1}{3} + (u\omega + v\omega^2),\quad -\tfrac{a_1}{3} + (u\omega^2 + v\omega),
\]
where $\omega \equiv -\tfrac{1}{2} + \tfrac{i\sqrt{3}}{2}$. This is known as Cardan's formula.
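The formula can be tried on numerical coefficients. The sketch below follows the substitutions of the example; as an implementation choice not spelled out in the text, $v$ is computed from $u$ through $uv = -p/3$ instead of taking an independent cube root, which guarantees that the two cube roots are compatible. Floating-point complex arithmetic is used, and the function name `cardano_roots` is illustrative.

```python
import cmath

def cardano_roots(a1, a2, a3):
    """Roots of x^3 + a1 x^2 + a2 x + a3 via y = x + a1/3 and y = u + v."""
    p = a2 - a1**2 / 3
    q = 2 * a1**3 / 27 - a1 * a2 / 3 + a3
    root = cmath.sqrt(q**2 / 4 + p**3 / 27)
    u_cubed = -q / 2 + root
    if cmath.isclose(u_cubed, 0, abs_tol=1e-14):
        u_cubed = -q / 2 - root          # use the other sign if this one vanishes
    u = u_cubed ** (1 / 3)               # principal complex cube root
    # Enforce uv = -p/3 (v = 0 only in the degenerate case p = q = 0).
    v = 0 if cmath.isclose(u, 0, abs_tol=1e-14) else -p / (3 * u)
    omega = complex(-0.5, 3 ** 0.5 / 2)
    shift = -a1 / 3
    return [
        shift + u + v,
        shift + u * omega + v * omega**2,
        shift + u * omega**2 + v * omega,
    ]

# x^3 - 6x^2 + 11x - 6 = (x - 1)(x - 2)(x - 3)
roots = cardano_roots(-6, 11, -6)
print(sorted(round(z.real, 6) for z in roots))  # [1.0, 2.0, 3.0]
```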
154 2 Galois Theory II
2.4.4 Example Let us take the general biquadratic polynomial
\[
x^4 + a_1x^3 + a_2x^2 + a_3x + a_4
\]
over the field $F_0$ of all rational numbers. By $F_0(a_1, a_2, a_3, a_4)$ we shall mean the field of rational functions in $a_1, a_2, a_3, a_4$ over $F_0$. Since
\[
x^4 + a_1x^3 + a_2x^2 + a_3x + a_4
\]
\[
= \left(x^4 + 4x^3\frac{a_1}{4} + 6x^2\left(\frac{a_1}{4}\right)^2 + 4x\left(\frac{a_1}{4}\right)^3 + \left(\frac{a_1}{4}\right)^4\right) - 6x^2\left(\frac{a_1}{4}\right)^2 - 4x\left(\frac{a_1}{4}\right)^3 - \left(\frac{a_1}{4}\right)^4 + a_2x^2 + a_3x + a_4
\]
\[
= \left(x + \frac{a_1}{4}\right)^4 + x^2\left(a_2 - 6\left(\frac{a_1}{4}\right)^2\right) + x\left(a_3 - 4\left(\frac{a_1}{4}\right)^3\right) + \left(a_4 - \left(\frac{a_1}{4}\right)^4\right)
\]
\[
= y^4 + \left(y - \frac{a_1}{4}\right)^2(\cdots) + \left(y - \frac{a_1}{4}\right)(\cdots) + (\cdots) = y^4 + py^2 + qy + r,
\]
where $y \equiv x + \frac{a_1}{4}$, and $p, q, r \in F_0(a_1, a_2, a_3, a_4)$. It suffices to solve the equation
\[
y^4 + py^2 + qy + r = 0,
\]
that is,
\[
\left(y^2 + \frac{p}{2}\right)^2 = \frac{p^2}{4} - qy - r,
\]
or
\[
\left(y^2 + \frac{p}{2} + m\right)^2 = m^2 + \left(2y^2 + p\right)m + \frac{p^2}{4} - qy - r,
\]
where $m$ is to be determined. Here
\[
\left(y^2 + \frac{p}{2} + m\right)^2 = 2my^2 - qy + \left(m^2 + pm + \frac{p^2}{4} - r\right).
\]
The equation is solved when the quadratic expression on the right-hand side of the above equation is a perfect square, that is, if its discriminant is zero, that is,
\[
(-q)^2 - 4(2m)\left(m^2 + pm + \frac{p^2}{4} - r\right) = 0,
\]
that is,
\[
m^3 + pm^2 + \left(\frac{p^2}{4} - r\right)m - \frac{q^2}{8} = 0.
\]
This is a cubic equation in $m$, so all its roots can be found as in Example 2.4.3. This is known as Ferrari's method.
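Ferrari's reduction can be checked on a concrete quartic. The sketch below is illustrative only and rests on stated assumptions: $p, q, r$ are real with $q \neq 0$, and, instead of Cardan's formula, a real root $m > 0$ of the resolvent cubic is located by bisection (one exists because the cubic equals $-q^2/8 < 0$ at $m = 0$). The function name `ferrari_roots` is hypothetical.

```python
import cmath

def ferrari_roots(p, q, r):
    """Roots of y^4 + p*y^2 + q*y + r, assuming real p, q, r and q != 0."""
    def resolvent(m):
        # m^3 + p*m^2 + (p^2/4 - r)*m - q^2/8, the cubic derived in the example
        return m ** 3 + p * m ** 2 + (p * p / 4 - r) * m - q * q / 8

    # Bisection for a positive real root: resolvent(0) < 0 < resolvent(hi).
    lo, hi = 0.0, 1.0
    while resolvent(hi) <= 0:
        hi *= 2
    for _ in range(200):
        mid = (lo + hi) / 2
        if resolvent(mid) <= 0:
            lo = mid
        else:
            hi = mid
    m = (lo + hi) / 2

    # With zero discriminant, 2m*y^2 - q*y + (...) = 2m*(y - q/(4m))^2, so
    # y^2 + p/2 + m = s*(y - q/(4m)) with s = +sqrt(2m) or s = -sqrt(2m).
    roots = []
    for s in (cmath.sqrt(2 * m), -cmath.sqrt(2 * m)):
        c = p / 2 + m + s * q / (4 * m)      # quadratic: y^2 - s*y + c = 0
        d = cmath.sqrt(s * s - 4 * c)
        roots += [(s + d) / 2, (s - d) / 2]
    return roots

# y^4 - 25y^2 + 60y - 36 = (y - 1)(y - 2)(y - 3)(y + 6)
print(sorted(round(z.real, 6) for z in ferrari_roots(-25, 60, -36)))
```

Any root $m$ of the resolvent cubic gives the same four roots of the quartic; different choices of $m$ only change how the roots are paired into two quadratics.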
2.4.5 Note Let $F$, $K$ be any fields such that $K$ is an extension of $F$. Suppose that for every positive integer $l$, $F$ contains all the $l$th roots of unity. Let $p(x) \in F[x]$. Let $\deg(p(x)) = n$. Suppose that $K$ contains all the $n$ roots of $p(x)$. Suppose that $p(x)$ is solvable by radicals over $F$.

Hence there exists a finite sequence
\[
(\omega_1, r_1), (\omega_2, r_2), \ldots, (\omega_k, r_k)
\]
such that

1. each $\omega_i$ is a nonzero member of $K$,
2. each $r_i$ is an integer $\geq 2$,
3. $(\omega_1)^{r_1} \in F$, $(\omega_2)^{r_2} \in F(\omega_1)$, $(\omega_3)^{r_3} \in F(\omega_1, \omega_2)$, $\ldots$, $(\omega_k)^{r_k} \in F(\omega_1, \omega_2, \ldots, \omega_{k-1})$,
4. $F(\omega_1, \omega_2, \ldots, \omega_{k-1}, \omega_k)$ contains all the $n$ roots of $p(x)$ in $K$.

Let $L$ be the splitting field over $F$ for $p(x)$. It follows that $L$ is the smallest field containing all the $n$ roots of $p(x)$ in $K$. Now since $F(\omega_1, \omega_2, \ldots, \omega_{k-1}, \omega_k)$ is a field containing all the $n$ roots of $p(x)$ in $K$, we have $L \subseteq F(\omega_1, \omega_2, \ldots, \omega_{k-1}, \omega_k)$.

Since $(\omega_1)^{r_1} \in F$, there exists a nonzero $a \in F$ such that $\omega_1$ is a root of the polynomial $x^{r_1} - a$ $(\in F[x])$. By assumption, $F$ contains all the $r_1$th roots of unity. So by 2.3.16, $F(\omega_1)$ is the splitting field over $F$ for $x^{r_1} - a$, and hence by 2.2.20, $F(\omega_1)$ is a normal extension of $F$.

Since $(\omega_2)^{r_2} \in F(\omega_1)$, there exists $b \in F(\omega_1)$ such that $\omega_2$ is a root of the polynomial $x^{r_2} - b$ $(\in (F(\omega_1))[x])$. By assumption, $F$ $(\subseteq F(\omega_1))$ contains all the $r_2$th roots of unity, so $F(\omega_1)$ contains all the $r_2$th roots of unity. Now by 2.3.16, $(F(\omega_1))(\omega_2)$ $(= F(\omega_1, \omega_2))$ is the splitting field over $F(\omega_1)$ for $x^{r_2} - b$, and hence by 2.2.20, $F(\omega_1, \omega_2)$ is a normal extension of $F(\omega_1)$.

Similarly, $F(\omega_1, \omega_2, \omega_3)$ is a normal extension of $F(\omega_1, \omega_2)$, etc. Since $F(\omega_1)$ is a normal extension of $F$, by 2.2.24, $G(F(\omega_1, \omega_2, \ldots, \omega_k), F(\omega_1))$ is a normal subgroup of the group $G(F(\omega_1, \omega_2, \ldots, \omega_k), F)$.

Similarly, $G(F(\omega_1, \omega_2, \ldots, \omega_k), F(\omega_1, \omega_2))$ is a normal subgroup of the group $G(F(\omega_1, \omega_2, \ldots, \omega_k), F(\omega_1))$, $G(F(\omega_1, \omega_2, \ldots, \omega_k), F(\omega_1, \omega_2, \omega_3))$ is a normal subgroup of the group $G(F(\omega_1, \omega_2, \ldots, \omega_k), F(\omega_1, \omega_2))$, etc. Thus
\[
\left\{ G(F(\omega_1, \ldots, \omega_k), F(\omega_1, \ldots, \omega_{k-1})), \ldots, G(F(\omega_1, \ldots, \omega_k), F(\omega_1)), G(F(\omega_1, \ldots, \omega_k), F) \right\}
\]
is a collection of subgroups of $G(F(\omega_1, \omega_2, \ldots, \omega_k), F)$ such that

1. $G(F(\omega_1, \ldots, \omega_k), F) \supseteq G(F(\omega_1, \ldots, \omega_k), F(\omega_1)) \supseteq \cdots \supseteq G(F(\omega_1, \ldots, \omega_k), F(\omega_1, \ldots, \omega_{k-1})) \supseteq \{\mathrm{Id}\}$, where $\mathrm{Id}$ denotes the identity automorphism of $F(\omega_1, \omega_2, \ldots, \omega_k)$,
2. for every $i = 1, \ldots, k$, $G(F(\omega_1, \ldots, \omega_k), F(\omega_1, \ldots, \omega_i))$ is a normal subgroup of $G(F(\omega_1, \ldots, \omega_k), F(\omega_1, \ldots, \omega_{i-1}))$.

By 2.2.26, for every $i = 1, \ldots, k$, the quotient group
\[
\frac{G(F(\omega_1, \ldots, \omega_k), F(\omega_1, \ldots, \omega_{i-1}))}{G(F(\omega_1, \ldots, \omega_k), F(\omega_1, \ldots, \omega_i))}
\]
is isomorphic to the group
\[
G(F(\omega_1, \ldots, \omega_i), F(\omega_1, \ldots, \omega_{i-1})) = G((F(\omega_1, \ldots, \omega_{i-1}))(\omega_i), F(\omega_1, \ldots, \omega_{i-1})).
\]
We want to show that, for every $i = 1, \ldots, k$, the quotient group
\[
\frac{G(F(\omega_1, \ldots, \omega_k), F(\omega_1, \ldots, \omega_{i-1}))}{G(F(\omega_1, \ldots, \omega_k), F(\omega_1, \ldots, \omega_i))}
\]
is abelian. It suffices to show that for every $i = 1, \ldots, k$, $G((F(\omega_1, \ldots, \omega_{i-1}))(\omega_i), F(\omega_1, \ldots, \omega_{i-1}))$ is abelian, that is, each $G(F(\omega_1, \ldots, \omega_i), F(\omega_1, \ldots, \omega_{i-1}))$ is abelian.

To this end, we shall apply 2.3.17. Since $(\omega_i)^{r_i} \in F(\omega_1, \ldots, \omega_{i-1})$, there exists a nonzero $a \in F(\omega_1, \ldots, \omega_{i-1})$ such that $\omega_i$ is a root of the polynomial $x^{r_i} - a$ $(\in (F(\omega_1, \ldots, \omega_{i-1}))[x])$. By assumption, $F(\omega_1, \ldots, \omega_{i-1})$ contains all the $r_i$th roots of unity. So by 2.3.16, $(F(\omega_1, \ldots, \omega_{i-1}))(\omega_i)$ $(= F(\omega_1, \ldots, \omega_i))$ is the splitting field over $F(\omega_1, \ldots, \omega_{i-1})$ for $x^{r_i} - a$.

By 2.3.17, the Galois group of $x^{r_i} - a$ over $F(\omega_1, \ldots, \omega_{i-1})$ is abelian. Here the Galois group of $x^{r_i} - a$ is
\[
G(F(\omega_1, \ldots, \omega_i), F(\omega_1, \ldots, \omega_{i-1})),
\]
so $G(F(\omega_1, \ldots, \omega_i), F(\omega_1, \ldots, \omega_{i-1}))$ is abelian.

Thus we have shown that $G(F(\omega_1, \omega_2, \ldots, \omega_k), F)$ is a solvable group.

We want to show that the Galois group over $F$ of $p(x)$ is a solvable group, that is, $G(L, F)$ is a solvable group.
Since $L$ is the splitting field over $F$ for $p(x)$ $(\in F[x])$, by 2.2.20, $L$ is a normal extension of $F$. Now, since $L \subseteq F(\omega_1, \omega_2, \ldots, \omega_{k-1}, \omega_k)$, by 2.2.24, $G(F(\omega_1, \ldots, \omega_k), L)$ is a normal subgroup of the group $G(F(\omega_1, \ldots, \omega_k), F)$. Further, by 2.2.25, the quotient group
\[
\frac{G(F(\omega_1, \ldots, \omega_k), F)}{G(F(\omega_1, \ldots, \omega_k), L)}
\]
is isomorphic to the group $G(L, F)$. Since this quotient group is a homomorphic image of $G(F(\omega_1, \ldots, \omega_k), F)$, the group $G(L, F)$ is a homomorphic image of $G(F(\omega_1, \ldots, \omega_k), F)$. Next, since $G(F(\omega_1, \ldots, \omega_k), F)$ is a solvable group, the group $G(L, F)$ is a homomorphic image of a solvable group. It follows, by 2.3.10, that the group $G(L, F)$ is solvable.

2.4.6 Conclusion Let $F$, $K$ be any fields such that $K$ is an extension of $F$. Suppose that for every positive integer $l$, $F$ contains all the $l$th roots of unity. Let $p(x) \in F[x]$. Let $\deg(p(x)) = n$. Suppose that $K$ contains all the $n$ roots of $p(x)$. Suppose that $p(x)$ is solvable by radicals over $F$. Then the Galois group over $F$ of $p(x)$ is a solvable group.
2.4.7 Note Let $F$ be any field. By the general polynomial $x^n + a_1x^{n-1} + \cdots + a_n$ of degree $n$ over $F$, we mean the following:

$F(a_1, \ldots, a_n)$ is the field of all rational functions in $n$ variables $a_1, \ldots, a_n$ over $F$, and $x^n + a_1x^{n-1} + \cdots + a_n$ is a polynomial in $x$ over the field $F(a_1, \ldots, a_n)$ $(\ni a_1, \ldots, a_n)$.

If $x^n + a_1x^{n-1} + \cdots + a_n$ is solvable by radicals over $F(a_1, \ldots, a_n)$, then we say that the general polynomial $x^n + a_1x^{n-1} + \cdots + a_n$ of degree $n$ over $F$ is solvable by radicals.

By 2.2.14, the splitting field over $F(-a_1, \ldots, (-1)^n a_n)$ $(= F(a_1, \ldots, a_n))$ for $t^n + a_1t^{n-1} + a_2t^{n-2} + \cdots + a_n$ is $F(x_1, \ldots, x_n)$, and
\[
S_n = G(F(x_1, \ldots, x_n), F(a_1, \ldots, a_n))
\]
\[
= G\left(\text{splitting field over } F(a_1, \ldots, a_n) \text{ for } t^n + a_1t^{n-1} + a_2t^{n-2} + \cdots + a_n,\ F(a_1, \ldots, a_n)\right)
\]
\[
= \text{Galois group of } t^n + a_1t^{n-1} + a_2t^{n-2} + \cdots + a_n.
\]
It follows that the Galois group of $t^n + a_1t^{n-1} + a_2t^{n-2} + \cdots + a_n$ is $S_n$. Next, by 2.3.12, $S_n$ is not solvable for $n \geq 5$; so the Galois group of $t^n + a_1t^{n-1} + a_2t^{n-2} + \cdots + a_n$ is not solvable for $n \geq 5$, and hence by 2.4.6, $t^n + a_1t^{n-1} + a_2t^{n-2} + \cdots + a_n$ is not solvable by radicals over $F$ for $n \geq 5$.

2.4.8 Conclusion Let $F$ be any field. Suppose that for every positive integer $l$, $F$ contains all the $l$th roots of unity. Let $n \geq 5$. Let $x^n + a_1x^{n-1} + \cdots + a_n$ be the general polynomial of degree $n$ over $F$. Then $t^n + a_1t^{n-1} + a_2t^{n-2} + \cdots + a_n$ is not solvable by radicals over $F$.
Roughly speaking, for $n \geq 5$, there exists no formula for the roots of $t^n + a_1t^{n-1} + a_2t^{n-2} + \cdots + a_n$ involving only a combination of $m$th roots of rational functions of $a_1, \ldots, a_n$, for various values of $m$.
This result is due to Niels Henrik Abel (1802–1829).
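The group-theoretic fact used in 2.4.7, that $S_n$ is not solvable for $n \geq 5$, can be observed by brute force for small $n$. The sketch below is not from the text: it assumes the standard characterization that a finite group is solvable exactly when its derived series (repeatedly taking the subgroup generated by commutators) reaches the trivial group, and the helper names are illustrative.

```python
from itertools import permutations

def compose(p, q):
    """Permutation product: (p o q)(i) = p[q[i]]."""
    return tuple(p[i] for i in q)

def inverse(p):
    inv = [0] * len(p)
    for i, image in enumerate(p):
        inv[image] = i
    return tuple(inv)

def generated_subgroup(gens, n):
    """Closure of gens under multiplication (a subgroup, since the group is finite)."""
    identity = tuple(range(n))
    elems, frontier = {identity}, [identity]
    while frontier:
        g = frontier.pop()
        for s in gens:
            h = compose(g, s)
            if h not in elems:
                elems.add(h)
                frontier.append(h)
    return elems

def derived_series_orders(n):
    """Orders of S_n, its commutator subgroup, and so on, until the series stabilizes."""
    group = set(permutations(range(n)))
    orders = [len(group)]
    while True:
        commutators = {compose(compose(a, b), compose(inverse(a), inverse(b)))
                       for a in group for b in group}
        derived = generated_subgroup(commutators, n)
        if len(derived) == len(group):
            return orders
        orders.append(len(derived))
        group = derived

print(derived_series_orders(4))  # series ends at order 1: S_4 is solvable
print(derived_series_orders(5))  # series stalls at order 60: S_5 is not solvable
```

For $n = 5$ the series stops at a subgroup of order 60 (the alternating group), which equals its own commutator subgroup, so the trivial group is never reached.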
2.4.9 Note Here we shall recapitulate something about high-school "construction geometry." We shall assume that we have a straightedge (that is, an ungraduated scale) and a compass (that is, an instrument with two arms, one with a metallic needle end, and another with a pencil's lead end). We also assume that we are given the measure of a "unit distance."
Some of the well-known constructions are sketched below:
1. In Fig. 2.1, the perpendicular bisector of a given line segment is constructed.
2. In Fig. 2.2, the perpendicular at a given point of a line is constructed.
3. In Fig. 2.3, the perpendicular line from a given point to a given line is constructed.
4. In Fig. 2.4, a line parallel to a given line and passing through a given point is constructed. Here, construction (3) and then construction (2) are made.
5. In Fig. 2.5, a given line segment is divided into three equal parts. Similarly, a given line segment can be divided into any number of equal parts.
It follows that every rational number is a "constructible number." In other words, $\mathbb{Q} \subseteq W$, where $W$ denotes the collection of all constructible numbers. It is easy to observe that if $a$ and $b$ are constructible numbers, then $a + b$ and $a - b$ are constructible numbers.

Fig. 2.1 Perpendicular bisector of a given line segment
Fig. 2.2 Perpendicular at a given point of a line
Fig. 2.3 Perpendicular line drawn from a given point on a given line
Fig. 2.4 Line parallel to a given line and passing through a given point
Fig. 2.5 A given line segment to be divided into three equal parts
Fig. 2.6 Construction for squaring the size of a line segment
Fig. 2.7 Construction for reciprocal of the size of a line segment
Now we shall show that if $a$ is a constructible number, then so is $a^2$ (Fig. 2.6). We first construct a right triangle $ABC$ one of whose legs, $AB$, is of length 1, and the other leg, $AC$, is of length $a$. Now we draw a line perpendicular to $BC$ at $C$. Suppose that this line meets $AB$ at $D$. Thus we get two similar triangles, $ACB$ and $ADC$. It follows that
\[
\frac{a}{1} = \frac{AC}{AB} = \frac{AD}{AC} = \frac{AD}{a},
\]
and hence $AD = a^2$. Thus $a^2$ is a constructible number.

Now, since $ab = \frac{(a + b)^2 - (a - b)^2}{4}$, we get the following result: if $a$ and $b$ are constructible numbers, then so is $ab$. We want to show that $W$ is a field. For this, it suffices to show that if $a$ is a positive constructible number, then so is $\frac{1}{a}$ (Fig. 2.7). We first construct a right triangle $ABC$ one of whose legs, $AB$, is of length $a$, and the other leg, $AC$, is of length 1. Now we draw a line perpendicular to $BC$ at $C$. Suppose that this line meets $AB$ at $D$. Thus we get two similar triangles, $ACB$ and $ADC$. It follows that
\[
\frac{1}{a} = \frac{AC}{AB} = \frac{AD}{AC} = \frac{AD}{1},
\]
and hence $AD = \frac{1}{a}$. Thus $\frac{1}{a}$ is a constructible number.

Thus $W$ is a subfield of $\mathbb{R}$ that contains $\mathbb{Q}$.
2.4.10 Definition Let $F$ be a subfield of $\mathbb{R}$. By the plane of $F$ we mean the Cartesian product $F \times F$ $(\subseteq \mathbb{R}^2)$.

Observe that the straight line joining the point $(x_1, y_1)$ $(\in F \times F)$ and the point $(x_2, y_2)$ $(\in F \times F)$ is
\[
y - y_1 = \frac{y_2 - y_1}{x_2 - x_1}(x - x_1),
\]
which is of the form $ax + by + c = 0$, where $a, b, c \in F$. Similarly, every equation of the form $ax + by + c = 0$ represents a straight line passing through two points of the plane of $F$. Such straight lines are called straight lines in $F$.
It is clear that if two straight lines in $F$ intersect in the real plane $\mathbb{R}^2$, then their point of intersection is a point in the plane of $F$.

Every circle having center at a point of the plane of $F$ and radius an element of $F$ is of the form $x^2 + y^2 + ax + by + c = 0$, where $a, b, c \in F$. Such circles are called circles in $F$.

It is clear that if a circle in $F$ and a straight line in $F$ intersect in the real plane $\mathbb{R}^2$, then their points of intersection are either points in the plane of $F$ or points in the plane of the field extension $F(\sqrt{c})$ of $F$, for some positive $c \in F$.

Similarly, it is clear that if two circles in $F$ intersect in the real plane $\mathbb{R}^2$, then their points of intersection are either points in the plane of $F$ or points in the plane of the field extension $F(\sqrt{c})$ of $F$, for some positive $c \in F$.

Thus, if a straight line or a circle in the field $F$ intersects another straight line or a circle in the field $F$ in the real plane $\mathbb{R}^2$, then there exists a positive real number $c_1$ such that their point(s) of intersection are points in the plane of the field extension $F(c_1)$ of $F$, where $(c_1)^2 \in F$.

As above, if a straight line or a circle in the field $F(c_1)$ intersects another straight line or a circle in the field $F(c_1)$ in the real plane $\mathbb{R}^2$, then there exists a positive real number $c_2$ such that their point(s) of intersection are points in the plane of the field extension $(F(c_1))(c_2)$ $(= F(c_1, c_2))$ of $F(c_1)$, where $(c_2)^2 \in F(c_1)$, etc.

Hence, if a point is constructible from $F$, then there exists a finite sequence $c_1, \ldots, c_n$ of real numbers such that

1. $(c_1)^2 \in F$, $(c_2)^2 \in F(c_1)$, $(c_3)^2 \in F(c_1, c_2)$, $\ldots$, $(c_n)^2 \in F(c_1, \ldots, c_{n-1})$,
2. the point is in the plane of $F(c_1, \ldots, c_n)$.

Since $(c_1)^2 \in F$, there exists $a \in F$ such that $c_1$ is a root of the polynomial $x^2 - a$ $(\in F[x])$. Here $\deg(x^2 - a) = 2$, so $c_1$ is algebraic of degree 1 or 2 over $F$, and hence by 1.4.16, $[F(c_1) : F] = 1$ or $2$. Similarly, $[F(c_1, c_2) : F(c_1)] = 1$ or $2$, and hence by 1.4.3, $[F(c_1, c_2) : F] = [F(c_1, c_2) : F(c_1)][F(c_1) : F] = (1 \text{ or } 2 \text{ or } 2^2)$. Thus $[F(c_1, c_2) : F] = 1$ or $2$ or $2^2$. Similarly, $[F(c_1, c_2, c_3) : F] = 1$ or $2$ or $2^2$ or $2^3$, etc.
2.4.11 Conclusion Suppose that a real number $a$ is constructible. Then there exist an extension $K$ of $\mathbb{Q}$ and a nonnegative integer $k$ such that $a \in K$ and $a$ is algebraic of degree $2^k$.
2.4.12 Theorem It is impossible, by straightedge and a compass alone, to trisect the angle $60^\circ$.

Proof Suppose to the contrary that the angle $20^\circ$ is constructible. We seek a contradiction.

Fig. 2.8 Construction for the size of $\cos 20^\circ$, provided $20^\circ$ angle is constructible
Let us draw a circle of radius 1 with center at the vertex of the $20^\circ$ angle. Now draw the foot of the perpendicular as shown in Fig. 2.8.

Thus $\cos 20^\circ$ is constructible. On using the formula $\cos 3\theta = 4\cos^3\theta - 3\cos\theta$, we get $\frac{1}{2} = 4\cos^3 20^\circ - 3\cos 20^\circ$, and hence $8x^3 - 6x - 1 = 0$, where $x \equiv \cos 20^\circ$. Thus $\cos 20^\circ$ is a root of the polynomial $8x^3 - 6x - 1$ $(\in \mathbb{Q}[x])$.

We claim that $8x^3 - 6x - 1$ is an irreducible polynomial over the field of rational numbers.

Proof Put $x \equiv y - 1$. It suffices to show that $8(y - 1)^3 - 6(y - 1) - 1$ is irreducible over the field of rational numbers.

Observe that
\[
8(y - 1)^3 - 6(y - 1) - 1 = 8\left(y^3 - 3y^2 + 3y - 1\right) - 6y + 6 - 1 = -3 + 18y - 24y^2 + 8y^3 = 3\left(-1 + 6y - 8y^2\right) + 8y^3,
\]
so
\[
8(y - 1)^3 - 6(y - 1) - 1 = 3\left(-1 + 6y - 8y^2\right) + 8y^3.
\]
By 1.3.5, $3\left(-1 + 6y - 8y^2\right) + 8y^3$ is irreducible over the field of rational numbers, and hence $8(y - 1)^3 - 6(y - 1) - 1$ is irreducible over the field of rational numbers. ■

Thus we have shown that $8x^3 - 6x - 1$ is an irreducible polynomial over the field of rational numbers. Then by 1.5.12, $\cos 20^\circ$ is algebraic of degree 3 over $\mathbb{Q}$. Since $\cos 20^\circ$ is constructible, by 2.4.11, $\cos 20^\circ$ is algebraic of a degree of the form $2^k$. This contradicts the fact that $\cos 20^\circ$ is algebraic of degree 3 over $\mathbb{Q}$. ■
2.4.13 Theorem It is impossible, by straightedge and a compass alone, to duplicate the cube in the sense of constructing an edge of a cube whose volume is twice the volume of a given cube.

Proof For simplicity, suppose that the volume of the given cube is 1. We have to construct a length $a$ such that $a^3 = 2$.

Suppose that $a$ is constructible. We seek a contradiction. Here $a$ is a root of the polynomial $x^3 - 2$ $(\in \mathbb{Q}[x])$. We claim that $x^3 - 2$ is an irreducible polynomial over the field of rational numbers.

Proof Put $x \equiv y - 1$. It suffices to show that $(y - 1)^3 - 2$ is an irreducible polynomial over the field of rational numbers.
Observe that
\[
(y - 1)^3 - 2 = \left(y^3 - 3y^2 + 3y - 1\right) - 2 = -3 + 3y - 3y^2 + y^3 = 3\left(-1 + y - y^2\right) + y^3,
\]
so
\[
(y - 1)^3 - 2 = 3\left(-1 + y - y^2\right) + y^3.
\]
By 1.3.5, $3\left(-1 + y - y^2\right) + y^3$ is irreducible over the field of rational numbers, and hence $(y - 1)^3 - 2$ is irreducible over the field of rational numbers. ■

Thus we have shown that $x^3 - 2$ is an irreducible polynomial over the field of rational numbers. So by 1.5.12, $a$ is algebraic of degree 3 over $\mathbb{Q}$. Since $a$ is constructible, by 2.4.11, $a$ is algebraic of a degree of the form $2^k$. This contradicts the fact that $a$ is algebraic of degree 3 over $\mathbb{Q}$. ■
2.4.14 Theorem It is impossible, by straightedge and a compass alone, to construct a regular septagon.

Proof Construction of a regular septagon requires the construction of an angle $\frac{2\pi}{7}$. Suppose that the angle $\frac{2\pi}{7}$ is constructible, and hence $2\cos\frac{2\pi}{7}$ is constructible. We seek a contradiction. Put $\theta \equiv \frac{2\pi}{7}$.

It follows that
\[
4\theta = 2\pi - 3\theta,
\]
and hence
\[
2(2\sin\theta\cos\theta)\cos 2\theta = 2\sin 2\theta\cos 2\theta = \sin 4\theta = \sin(2\pi - 3\theta) = -\sin 3\theta = -3\sin\theta + 4(\sin\theta)^3 = \sin\theta\left(-3 + 4\sin^2\theta\right).
\]
This shows that
\[
2\cos\theta\left((2\cos\theta)^2 - 2\right) = 4\cos\theta\left(2\cos^2\theta - 1\right) = 4\cos\theta\cos 2\theta = -3 + 4\sin^2\theta = -3 + 4\left(1 - \cos^2\theta\right) = -4\cos^2\theta + 1,
\]
that is,
\[
y\left(y^2 - 2\right) = -y^2 + 1,
\]
or
\[
y^3 + y^2 - 2y - 1 = 0,
\]
where $y \equiv 2\cos\theta$. Thus $2\cos\theta$ is a root of the polynomial $x^3 + x^2 - 2x - 1$ $(\in \mathbb{Q}[x])$.

Next, we claim that $x^3 + x^2 - 2x - 1$ is an irreducible polynomial over the field of rational numbers.

Proof Put $x \equiv y + 2$. It suffices to show that $(y + 2)^3 + (y + 2)^2 - 2(y + 2) - 1$ is an irreducible polynomial over the field of rational numbers. Observe that
\[
(y + 2)^3 + (y + 2)^2 - 2(y + 2) - 1 = 7 + 14y + 7y^2 + y^3 = 7\left(1 + 2y + y^2\right) + y^3,
\]
so
\[
(y + 2)^3 + (y + 2)^2 - 2(y + 2) - 1 = 7\left(1 + 2y + y^2\right) + y^3.
\]
By 1.3.5, $7\left(1 + 2y + y^2\right) + y^3$ is irreducible over the field of rational numbers, and hence $(y + 2)^3 + (y + 2)^2 - 2(y + 2) - 1$ is irreducible over the field of rational numbers. ■

Thus we have shown that $x^3 + x^2 - 2x - 1$ is an irreducible polynomial over the field of rational numbers. So by 1.5.12, $2\cos\theta$ is algebraic of degree 3 over $\mathbb{Q}$. Since $2\cos\theta$ is constructible, by 2.4.11, $2\cos\theta$ is algebraic of a degree of the form $2^k$. This contradicts the fact that $2\cos\theta$ is algebraic of degree 3 over $\mathbb{Q}$. ■
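The three irreducibility arguments above share one pattern: substitute $x = y + c$ and then read off a prime that divides every coefficient except the leading one, and whose square does not divide the constant term. The sketch below mechanizes that check with exact integer arithmetic. It assumes that the criterion cited as 1.3.5 is the usual Eisenstein criterion (the text of 1.3.5 is not reproduced here), and the function names are illustrative.

```python
from math import comb

def shift(coeffs, c):
    """Coefficients of f(y + c), where coeffs[k] is the coefficient of x^k in f(x)."""
    out = [0] * len(coeffs)
    for k, a in enumerate(coeffs):
        for j in range(k + 1):
            # binomial expansion of a * (y + c)^k
            out[j] += a * comb(k, j) * c ** (k - j)
    return out

def eisenstein(coeffs, p):
    """True if the prime p satisfies the Eisenstein conditions for the polynomial."""
    return (coeffs[-1] % p != 0
            and all(a % p == 0 for a in coeffs[:-1])
            and coeffs[0] % (p * p) != 0)

# (polynomial as low-to-high coefficients, shift c in x = y + c, prime p)
cases = [
    ([-1, -6, 0, 8], -1, 3),   # 8x^3 - 6x - 1,       x = y - 1
    ([-2, 0, 0, 1], -1, 3),    # x^3 - 2,             x = y - 1
    ([-1, -2, 1, 1], 2, 7),    # x^3 + x^2 - 2x - 1,  x = y + 2
]
for coeffs, c, p in cases:
    shifted = shift(coeffs, c)
    print(shifted, eisenstein(shifted, p))
```

The shifted coefficient lists agree with the expansions computed in the three proofs, for example $-3 + 18y - 24y^2 + 8y^3$ in the first case.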
Exercises
1. Let $F$ be a field. Let $f(x), g(x), k(x) \in F[x]$. Let $a \in F$. Suppose that $h(x) = f(x) + k(a)g(x)$ $(\in F[x])$. Then
\[
h'(x) = f'(x) + k(a)g'(x).
\]
2. Show that every finite extension of a field of characteristic 0 is a simple extension.
3. Let $F$ and $K$ be any fields such that $K$ is an extension of $F$. Let $F$ be of characteristic $p$. Suppose that $a$ is a member of $K$ such that $a^{(p^2 - 1)} = 1$. Show that $p^2 a^{(p^2 - 1)} \neq 1$.
4. Suppose that $K$ is a finite extension of $F$. Show that the order of the group of automorphisms of $K$ relative to $F$ cannot be greater than $[K : F]$.
5. Suppose that $K$ is a finite extension of $F$. Suppose that $T$ is a subfield of $K$ that contains $F$. Suppose that $T$ is a normal extension of $F$. Show that $G(K, T)$ is a normal subgroup of $G(K, F)$.
6. Show that for all positive integers $m$, $n$, the group $(S_m)^{(n)}$ is a normal subgroup of the symmetric group $S_m$.
7. Show that $\mathbb{Q}(\sqrt[3]{2})$ is the fixed field of $G(\mathbb{Q}(\sqrt[3]{2}), \mathbb{Q})$.
8. Let $F$ be a field such that $F \subseteq \mathbb{C}$. Suppose that $F$ contains all the $n$th roots of unity. Show that the Galois group of $x^5 - 5$ over $F$ is abelian.
9. Show that it is impossible, by straightedge and a compass alone, to construct the angle $10^\circ$.
10. Suppose that $a, b$ are nonzero constructible numbers. Show that
\[
\frac{a^2 - b^2}{a^2 + b^2}
\]
is a constructible number.
Chapter 3 Linear Transformations

The subject matter in this chapter is also known as linear algebra. As we shall see, the theory of matrices is intimately related to linear algebra. Its applications to other branches of knowledge are overwhelming. That is why it is considered an independent subfield of mathematics, exciting on its own.
3.1 Eigenvalues
3.1.1 Theorem Let $V$ be an $n$-dimensional inner product space. Let $T : V \to V$ be a linear transformation. Suppose that for every $v \in V$, $\langle T(v), v \rangle = 0$. Then $T = 0$.

Proof We have to show that $T = 0$. To this end, let us fix an arbitrary $w \in V$. We have to show that $T(w) = 0$, that is, $\langle T(w), T(w) \rangle = 0$.

Let us take arbitrary $u, v \in V$. By the given condition, we have
\[
\langle T(u), v \rangle + \langle T(v), u \rangle = 0 + \langle T(u), v \rangle + \langle T(v), u \rangle + 0 = \langle T(u), u \rangle + \langle T(u), v \rangle + \langle T(v), u \rangle + \langle T(v), v \rangle
\]
\[
= \langle T(u) + T(v), u + v \rangle = \langle T(u + v), u + v \rangle = 0.
\]
Thus for every $u, v \in V$,
\[
\langle T(u), v \rangle + \langle T(v), u \rangle = 0.
\]
It follows that for every $u, v \in V$,
© Springer Nature Singapore Pte Ltd. 2020R. Sinha, Galois Theory and Advanced Linear Algebra,https://doi.org/10.1007/978-981-13-9849-0_3
\[
2i\langle T(v), u \rangle = i\langle T(v), u \rangle + i\langle T(v), u \rangle = -i\langle T(u), v \rangle + i\langle T(v), u \rangle = \langle T(u), iv \rangle + \langle iT(v), u \rangle = \langle T(u), iv \rangle + \langle T(iv), u \rangle = 0,
\]
and hence for every $u, v \in V$,
\[
\langle T(v), u \rangle = 0.
\]
It follows (on taking $v = w$ and $u = T(w)$) that
\[
\langle T(w), T(w) \rangle = 0.
\]
■
Definition Let $V$ be an $n$-dimensional inner product space. Let $T : V \to V$ be a linear transformation. If for every $u, v \in V$, $\langle T(u), T(v) \rangle = \langle u, v \rangle$, then we say that $T$ is unitary.

3.1.2 Theorem Let $V$ be an $n$-dimensional inner product space. Let $T : V \to V$ be a linear transformation. Suppose that for every $v \in V$, $\langle T(v), T(v) \rangle = \langle v, v \rangle$. Then $T$ is unitary.

Proof Let us take any $u, v \in V$. By the given condition, we have
\[
\langle u, u \rangle + \langle T(u), T(v) \rangle + \langle T(v), T(u) \rangle + \langle v, v \rangle = \langle T(u), T(u) \rangle + \langle T(u), T(v) \rangle + \langle T(v), T(u) \rangle + \langle T(v), T(v) \rangle
\]
\[
= \langle T(u) + T(v), T(u) + T(v) \rangle = \langle T(u + v), T(u + v) \rangle = \langle u + v, u + v \rangle = \langle u, u \rangle + \langle u, v \rangle + \langle v, u \rangle + \langle v, v \rangle.
\]
Thus for every $u, v \in V$,
\[
\langle T(u), T(v) \rangle + \langle T(v), T(u) \rangle = \langle u, v \rangle + \langle v, u \rangle.
\]
It follows that for every $u, v \in V$,
\[
2i\langle T(v), T(u) \rangle - i\langle u, v \rangle - i\langle v, u \rangle = i\left(\langle T(v), T(u) \rangle - \langle u, v \rangle - \langle v, u \rangle\right) + i\langle T(v), T(u) \rangle
\]
\[
= -i\langle T(u), T(v) \rangle + i\langle T(v), T(u) \rangle = \langle T(u), iT(v) \rangle + \langle iT(v), T(u) \rangle
\]
\[
= \langle T(u), T(iv) \rangle + \langle T(iv), T(u) \rangle = \langle u, iv \rangle + \langle iv, u \rangle = -i\langle u, v \rangle + i\langle v, u \rangle,
\]
and hence for every $u, v \in V$,
\[
\langle T(v), T(u) \rangle = \langle v, u \rangle.
\]
Thus $T$ is unitary. ■
3.1.3 Theorem Let $V$ be an $n$-dimensional inner product space. Let $T : V \to V$ be a unitary linear transformation. Let $\{v_1, \ldots, v_n\}$ be any orthonormal basis of $V$. Then $\{T(v_1), \ldots, T(v_n)\}$ is an orthonormal basis of $V$.

Proof Let
\[
a_1T(v_1) + \cdots + a_nT(v_n) = 0.
\]
It follows that
\[
a_1 = a_1 \cdot 1 + a_2 \cdot 0 + \cdots + a_n \cdot 0 = a_1\langle v_1, v_1 \rangle + a_2\langle v_2, v_1 \rangle + \cdots + a_n\langle v_n, v_1 \rangle
\]
\[
= a_1\langle T(v_1), T(v_1) \rangle + a_2\langle T(v_2), T(v_1) \rangle + \cdots + a_n\langle T(v_n), T(v_1) \rangle = \langle a_1T(v_1) + \cdots + a_nT(v_n), T(v_1) \rangle = \langle 0, T(v_1) \rangle = 0,
\]
and hence $a_1 = 0$. Similarly, $a_2 = 0, \ldots, a_n = 0$. Thus we have shown that $T(v_1), \ldots, T(v_n)$ are linearly independent. Since $T(v_1), \ldots, T(v_n)$ are linearly independent, $T(v_1), \ldots, T(v_n)$ are distinct members of $V$. It follows that $\{T(v_1), \ldots, T(v_n)\}$ is a basis of $V$. Next, for distinct indices $i, j \in \{1, \ldots, n\}$, $\langle T(v_i), T(v_j) \rangle = \langle v_i, v_j \rangle = 0$. Also, for every index $i \in \{1, \ldots, n\}$, $\langle T(v_i), T(v_i) \rangle = \langle v_i, v_i \rangle = 1$. Thus $\{T(v_1), \ldots, T(v_n)\}$ is an orthonormal basis of $V$. ■

3.1.4 Theorem Let $V$ be an $n$-dimensional inner product space. Let $T : V \to V$ be a linear transformation. Suppose that $T$ sends every orthonormal basis of $V$ to an orthonormal basis of $V$. Then $T$ is unitary.
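Proof Since $V$ is an $n$-dimensional inner product space, there exists an orthonormal basis $\{v_1, \ldots, v_n\}$ of $V$. By assumption, $\{T(v_1), \ldots, T(v_n)\}$ is also an orthonormal basis of $V$. Let $u \equiv \sum_{i=1}^n a_iv_i$ and $w \equiv \sum_{j=1}^n b_jv_j$ be any members of $V$. We have to show that $\langle T(u), T(w) \rangle = \langle u, w \rangle$. Here,
\[
\text{LHS} = \langle T(u), T(w) \rangle = \left\langle T\left(\sum_{i=1}^n a_iv_i\right), T\left(\sum_{j=1}^n b_jv_j\right) \right\rangle = \left\langle \sum_{i=1}^n a_iT(v_i), \sum_{j=1}^n b_jT(v_j) \right\rangle
\]
\[
= \sum_{i,j} a_i\overline{b_j}\langle T(v_i), T(v_j) \rangle = \sum_{i,j} a_i\overline{b_j}\delta_{ij} = \sum_{i=1}^n a_i\overline{b_i},
\]
and
\[
\text{RHS} = \langle u, w \rangle = \left\langle \sum_{i=1}^n a_iv_i, \sum_{j=1}^n b_jv_j \right\rangle = \sum_{i,j} a_i\overline{b_j}\langle v_i, v_j \rangle = \sum_{i,j} a_i\overline{b_j}\delta_{ij} = \sum_{i=1}^n a_i\overline{b_i},
\]
so LHS = RHS. ■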
3.1.5 Theorem Let $V$ be an $n$-dimensional inner product space. Let $T : V \to V$ be a linear transformation. Let $v \in V$. Then there exists a unique $w \in V$ such that
\[
u \in V \Rightarrow \langle u, w \rangle = \langle T(u), v \rangle.
\]
We denote $w$ by $T^*(v)$. Thus $T^* : V \to V$, and for every $u, v \in V$, $\langle u, T^*(v) \rangle = \langle T(u), v \rangle$. Also, $T^* : V \to V$ is linear.

Proof Existence: Since $V$ is an $n$-dimensional inner product space, there exists an orthonormal basis $\{u_1, \ldots, u_n\}$ of $V$. Put
\[
w \equiv \overline{\langle T(u_1), v \rangle}\,u_1 + \cdots + \overline{\langle T(u_n), v \rangle}\,u_n.
\]
Let us fix an arbitrary $u \equiv \sum_{i=1}^n a_iu_i$. We have to show that
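\[
\left\langle \sum_{i=1}^n a_iu_i, \sum_{j=1}^n \overline{\langle T(u_j), v \rangle}\,u_j \right\rangle = \left\langle T\left(\sum_{i=1}^n a_iu_i\right), v \right\rangle.
\]
Here
\[
\text{LHS} = \left\langle \sum_{i=1}^n a_iu_i, \sum_{j=1}^n \overline{\langle T(u_j), v \rangle}\,u_j \right\rangle = \sum_{i,j} a_i\langle T(u_j), v \rangle\langle u_i, u_j \rangle = \sum_{i,j} a_i\langle T(u_j), v \rangle\delta_{ij}
\]
\[
= \sum_{i=1}^n a_i\langle T(u_i), v \rangle = \left\langle \sum_{i=1}^n a_iT(u_i), v \right\rangle = \left\langle T\left(\sum_{i=1}^n a_iu_i\right), v \right\rangle = \text{RHS}.
\]
Uniqueness: Suppose that there exist $w_1, w_2 \in V$ such that
\[
u \in V \Rightarrow \langle u, w_1 \rangle = \langle T(u), v \rangle \text{ and } \langle u, w_2 \rangle = \langle T(u), v \rangle.
\]
We have to show that $w_1 = w_2$, that is, $\langle w_1 - w_2, w_1 - w_2 \rangle = 0$. Here
\[
u \in V \Rightarrow \langle u, w_1 \rangle = \langle u, w_2 \rangle,
\]
so for every $u \in V$, $\langle u, w_1 - w_2 \rangle = 0$. It follows that $\langle w_1 - w_2, w_1 - w_2 \rangle = 0$.

Linearity: Let us take arbitrary $v_1, v_2 \in V$. Let $a, b$ be arbitrary complex numbers. We have to show that
\[
T^*(av_1 + bv_2) = aT^*(v_1) + bT^*(v_2).
\]
It suffices to show that for every $u \in V$,
\[
\langle u, T^*(av_1 + bv_2) \rangle = \langle u, aT^*(v_1) + bT^*(v_2) \rangle.
\]
To this end, let us fix an arbitrary $u \in V$. Then
\[
\text{LHS} = \langle u, T^*(av_1 + bv_2) \rangle = \langle T(u), av_1 + bv_2 \rangle = \bar{a}\langle T(u), v_1 \rangle + \bar{b}\langle T(u), v_2 \rangle
\]
\[
= \bar{a}\langle u, T^*(v_1) \rangle + \bar{b}\langle u, T^*(v_2) \rangle = \langle u, aT^*(v_1) + bT^*(v_2) \rangle = \text{RHS}.
\]
■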
Definition Let $V$ be an $n$-dimensional inner product space. Let $T : V \to V$ be a linear transformation. By 3.1.5, $T^* : V \to V$ is a linear transformation such that for every $u, v \in V$, $\langle u, T^*(v) \rangle = \langle T(u), v \rangle$. Here $T^*$ is called the Hermitian adjoint of $T$.
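The defining identity $\langle u, T^*(v) \rangle = \langle T(u), v \rangle$ can be tested on a small example. The sketch below is an illustration under stated assumptions, not part of the text: $V = \mathbb{C}^2$ with the inner product $\langle u, v \rangle = \sum_i u_i\overline{v_i}$ (linear in the first slot, as in the computations above), $T$ given by a matrix acting on coordinate columns, and the candidate for $T^*$ taken to be the conjugate transpose of that matrix. The helper names are hypothetical.

```python
def inner(u, v):
    """<u, v> = sum u_i * conj(v_i): linear in u, conjugate-linear in v."""
    return sum(a * complex(b).conjugate() for a, b in zip(u, v))

def apply(A, v):
    """Matrix-vector product A v."""
    return [sum(A[i][j] * v[j] for j in range(len(v))) for i in range(len(A))]

def adjoint(A):
    """Conjugate transpose of a square matrix."""
    n = len(A)
    return [[complex(A[j][i]).conjugate() for j in range(n)] for i in range(n)]

A = [[1 + 2j, 3j],
     [2, -1j]]
u = [1 + 1j, 2]
v = [3, -1j]

lhs = inner(apply(A, u), v)            # <T(u), v>
rhs = inner(u, apply(adjoint(A), v))   # <u, T*(v)>
print(lhs, rhs)
```

Both sides evaluate to the same complex number for this choice of `A`, `u`, `v`; a single example only illustrates the identity and does not prove it.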
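3.1.6 Problem Let $V$ be an $n$-dimensional inner product space. Let $T : V \to V$ be a linear transformation. Then $(T^*)^* = T$.

Proof Let us take an arbitrary $v \in V$. We have to show that
\[
(T^*)^*(v) = T(v).
\]
To this end, let us take an arbitrary $u \in V$. It suffices to show that $\langle u, (T^*)^*(v) \rangle = \langle u, T(v) \rangle$. Here
\[
\text{LHS} = \langle u, (T^*)^*(v) \rangle = \langle T^*(u), v \rangle = \overline{\langle v, T^*(u) \rangle} = \overline{\langle T(v), u \rangle} = \langle u, T(v) \rangle = \text{RHS}.
\]
■

3.1.7 Problem Let $V$ be an $n$-dimensional inner product space. Let $S : V \to V$ and $T : V \to V$ be linear transformations. Let $\lambda, \mu$ be any complex numbers. Then $(\lambda S + \mu T)^* = \bar{\lambda}S^* + \bar{\mu}T^*$.

Proof Let us take an arbitrary $v \in V$. We have to show that
\[
(\lambda S + \mu T)^*(v) = \left(\bar{\lambda}S^* + \bar{\mu}T^*\right)(v),
\]
that is,
\[
(\lambda S + \mu T)^*(v) = \bar{\lambda}S^*(v) + \bar{\mu}T^*(v).
\]
To this end, let us take an arbitrary $u \in V$. It suffices to show that
\[
\langle u, (\lambda S + \mu T)^*(v) \rangle = \langle u, \bar{\lambda}S^*(v) + \bar{\mu}T^*(v) \rangle.
\]
Here
\[
\text{LHS} = \langle u, (\lambda S + \mu T)^*(v) \rangle = \langle (\lambda S + \mu T)(u), v \rangle = \langle \lambda S(u) + \mu T(u), v \rangle = \lambda\langle S(u), v \rangle + \mu\langle T(u), v \rangle
\]
\[
= \lambda\langle u, S^*(v) \rangle + \mu\langle u, T^*(v) \rangle = \langle u, \bar{\lambda}S^*(v) + \bar{\mu}T^*(v) \rangle = \text{RHS}.
\]
■

3.1.8 Problem Let $V$ be an $n$-dimensional inner product space. Let $S : V \to V$ and $T : V \to V$ be linear transformations. Then $(ST)^* = T^*S^*$.

Proof Let us take an arbitrary $v \in V$. We have to show that
\[
(ST)^*(v) = (T^*S^*)(v),
\]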
that is,
\[
(ST)^*(v) = T^*(S^*(v)).
\]
To this end, let us take an arbitrary $u \in V$. It suffices to show that $\langle u, (ST)^*(v) \rangle = \langle u, T^*(S^*(v)) \rangle$. Here
\[
\text{LHS} = \langle u, (ST)^*(v) \rangle = \langle (ST)(u), v \rangle = \langle S(T(u)), v \rangle = \langle T(u), S^*(v) \rangle = \langle u, T^*(S^*(v)) \rangle = \text{RHS}.
\]
■

3.1.9 Problem Let $V$ be an $n$-dimensional inner product space. Let $T : V \to V$ be a unitary linear transformation. Then $T^*T = I$.

Proof Let us take an arbitrary $v \in V$. We have to show that $T^*(T(v)) = v$. To this end, let us take an arbitrary $u \in V$. It suffices to show that $\langle u, T^*(T(v)) \rangle = \langle u, v \rangle$. Here
\[
\text{RHS} = \langle u, v \rangle = \langle T(u), T(v) \rangle = \langle u, T^*(T(v)) \rangle = \text{LHS}.
\]
■

3.1.10 Problem Let $V$ be an $n$-dimensional inner product space. Let $T : V \to V$ be a linear transformation such that $T^*T = I$. Then $T$ is unitary.

Proof Let us take arbitrary $u, v \in V$. We have to show that $\langle T(u), T(v) \rangle = \langle u, v \rangle$. Here
\[
\text{LHS} = \langle T(u), T(v) \rangle = \langle u, T^*(T(v)) \rangle = \langle u, (T^*T)(v) \rangle = \langle u, I(v) \rangle = \langle u, v \rangle = \text{RHS}.
\]
■

3.1.11 Theorem Let $V$ be an $n$-dimensional inner product space. Let $T : V \to V$ be a linear transformation. Let $\{v_1, \ldots, v_n\}$ be an orthonormal basis of $V$. Let $[a_{ij}]$ be the matrix of $T$ relative to the basis $\{v_1, \ldots, v_n\}$, in the sense that
\[
T(v_1) = a_{11}v_1 + a_{21}v_2 + \cdots + a_{n1}v_n \left(= \sum_{i=1}^n a_{i1}v_i\right),
\]
\[
T(v_2) = a_{12}v_1 + a_{22}v_2 + \cdots + a_{n2}v_n,
\]
\[
\vdots
\]
\[
T(v_n) = a_{1n}v_1 + a_{2n}v_2 + \cdots + a_{nn}v_n.
\]
In short, $T(v_j) = \sum_{i=1}^n a_{ij}v_i$. Then the matrix of $T^*$ relative to the basis $\{v_1, \ldots, v_n\}$ is $[b_{ij}]$, where $b_{ij} = \overline{a_{ji}}$. In short, $T^*(v_j) = \sum_{i=1}^n b_{ij}v_i$.
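Proof By the proof of 3.1.5,
\[
T^*(v_1) = \overline{\langle T(v_1), v_1 \rangle}\,v_1 + \cdots + \overline{\langle T(v_n), v_1 \rangle}\,v_n,
\]
\[
T^*(v_2) = \overline{\langle T(v_1), v_2 \rangle}\,v_1 + \cdots + \overline{\langle T(v_n), v_2 \rangle}\,v_n,
\]
\[
\vdots
\]
\[
T^*(v_n) = \overline{\langle T(v_1), v_n \rangle}\,v_1 + \cdots + \overline{\langle T(v_n), v_n \rangle}\,v_n.
\]
Since
\[
T^*(v_1) = \sum_{i=1}^n \overline{\langle T(v_i), v_1 \rangle}\,v_i = \sum_{i=1}^n \overline{\langle a_{1i}v_1 + a_{2i}v_2 + \cdots + a_{ni}v_n, v_1 \rangle}\,v_i
\]
\[
= \sum_{i=1}^n \overline{\left(a_{1i}\langle v_1, v_1 \rangle + a_{2i}\langle v_2, v_1 \rangle + \cdots + a_{ni}\langle v_n, v_1 \rangle\right)}\,v_i = \sum_{i=1}^n \overline{\left(a_{1i} \cdot 1 + a_{2i} \cdot 0 + \cdots + a_{ni} \cdot 0\right)}\,v_i
\]
\[
= \sum_{i=1}^n \overline{a_{1i}}\,v_i = \overline{a_{11}}\,v_1 + \overline{a_{12}}\,v_2 + \cdots + \overline{a_{1n}}\,v_n,
\]
we have
\[
T^*(v_1) = \overline{a_{11}}\,v_1 + \overline{a_{12}}\,v_2 + \cdots + \overline{a_{1n}}\,v_n \left(= \sum_{i=1}^n \overline{a_{1i}}\,v_i\right).
\]
Similarly,
\[
T^*(v_2) = \overline{a_{21}}\,v_1 + \overline{a_{22}}\,v_2 + \cdots + \overline{a_{2n}}\,v_n,
\]
etc. In short, $T^*(v_j) = \sum_{i=1}^n \overline{a_{ji}}\,v_i$. If the matrix of $T^*$ relative to the basis $\{v_1, \ldots, v_n\}$ is $[b_{ij}]$, then $b_{ij} = \overline{a_{ji}}$. ■

3.1.12 Problem Let $V$ be an $n$-dimensional inner product space. Let $T : V \to V$ be a unitary linear transformation. Let $\{v_1, \ldots, v_n\}$ be an orthonormal basis of $V$. Let $[a_{ij}]$ be the matrix of $T$ relative to the basis $\{v_1, \ldots, v_n\}$. Then $\sum_{j=1}^n a_{ji}\overline{a_{jk}} = \delta_{ik}$.

Proof It is given that
\[
T(v_i) = \sum_{j=1}^n a_{ji}v_j.
\]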
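By 3.1.11,
\[
T^*(v_j) = \sum_{k=1}^n \overline{a_{jk}}\,v_k.
\]
Since $T : V \to V$ is unitary, by 3.1.9, $T^*T = I$. It follows that
\[
\sum_{k=1}^n \delta_{ki}v_k = v_i = I(v_i) = (T^*T)(v_i) = T^*(T(v_i)) = T^*\left(\sum_{j=1}^n a_{ji}v_j\right) = \sum_{j=1}^n a_{ji}T^*(v_j)
\]
\[
= \sum_{j=1}^n a_{ji}\left(\sum_{k=1}^n \overline{a_{jk}}\,v_k\right) = \sum_{j=1}^n\left(\sum_{k=1}^n a_{ji}\overline{a_{jk}}\,v_k\right) = \sum_{k=1}^n\left(\sum_{j=1}^n a_{ji}\overline{a_{jk}}\,v_k\right) = \sum_{k=1}^n\left(\sum_{j=1}^n a_{ji}\overline{a_{jk}}\right)v_k,
\]
and hence
\[
\sum_{j=1}^n a_{ji}\overline{a_{jk}} = \delta_{ki}.
\]
■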
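The relation $\sum_{j} a_{ji}\overline{a_{jk}} = \delta_{ik}$ says that the columns of the matrix of a unitary transformation are orthonormal. The following sketch checks this for one matrix chosen purely for illustration, $\frac{1}{\sqrt{2}}\begin{pmatrix} 1 & i \\ i & 1 \end{pmatrix}$; the matrix and the helper name `column_gram` are assumptions of the example, not taken from the text.

```python
def column_gram(A):
    """Matrix of the sums over j of A[j][i] * conj(A[j][k]), for all i, k."""
    n = len(A)
    return [[sum(A[j][i] * complex(A[j][k]).conjugate() for j in range(n))
             for k in range(n)]
            for i in range(n)]

s = 2 ** -0.5
U = [[s, s * 1j],
     [s * 1j, s]]

G = column_gram(U)
for row in G:
    print([round(abs(entry), 12) for entry in row])
```

Up to rounding error the result is the identity matrix, as 3.1.12 predicts for this particular `U`.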
Definition Let $V$ be an $n$-dimensional inner product space. Let $T : V \to V$ be a linear transformation. If $T^* = T$, then we say that $T$ is Hermitian.

3.1.13 Problem Let $V$ be an $n$-dimensional inner product space. Let $T : V \to V$ be a Hermitian linear transformation. Then all its eigenvalues are real.

Proof Let $\lambda$ be an eigenvalue of $T$. We have to show that $\lambda$ is real, that is, $\bar{\lambda} = \lambda$. Since $\lambda$ is an eigenvalue of $T$, there exists a nonzero $v \in V$ such that $T(v) = \lambda v$. Since $T : V \to V$ is Hermitian, we have $T^* = T$. Now,
\[
\lambda\langle v, v \rangle = \langle \lambda v, v \rangle = \langle T(v), v \rangle = \langle v, T^*(v) \rangle = \langle v, T(v) \rangle = \langle v, \lambda v \rangle = \bar{\lambda}\langle v, v \rangle,
\]
so
\[
\left(\bar{\lambda} - \lambda\right)\langle v, v \rangle = 0,
\]
and hence, $\bar{\lambda} = \lambda$ or $\langle v, v \rangle = 0$. Since $v$ is nonzero, $\langle v, v \rangle \neq 0$, and hence $\bar{\lambda} = \lambda$. ■
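As a small numerical illustration of 3.1.13, the eigenvalues of a $2 \times 2$ matrix can be read off from its characteristic polynomial $\lambda^2 - (\operatorname{tr})\lambda + \det$. The sketch below uses a Hermitian matrix chosen only for illustration and contrasts it with a non-Hermitian one; the function name `eigenvalues_2x2` is hypothetical.

```python
import cmath

def eigenvalues_2x2(A):
    """Roots of lambda^2 - tr(A)*lambda + det(A) for a 2x2 matrix A."""
    (a, b), (c, d) = A
    tr = a + d
    det = a * d - b * c
    disc = cmath.sqrt(tr * tr - 4 * det)
    return [(tr + disc) / 2, (tr - disc) / 2]

H = [[2, 1 - 1j],
     [1 + 1j, 3]]          # equals its own conjugate transpose
R = [[0, -1],
     [1, 0]]               # not Hermitian

print(eigenvalues_2x2(H))  # both eigenvalues have zero imaginary part
print(eigenvalues_2x2(R))  # a conjugate pair with nonzero imaginary part
```

The second matrix shows that the Hermitian hypothesis matters: without it the eigenvalues need not be real.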
3.1.14 Problem Let $V$ be an $n$-dimensional inner product space. Let $T : V \to V$ be a linear transformation. Then
\[ (T^*T)(v) = 0 \;\Rightarrow\; T(v) = 0. \]
Proof Let $v \in V$ be such that $(T^*T)(v) = 0$. We have to show that $T(v) = 0$, that is, $\langle T(v), T(v) \rangle = 0$. Since $(T^*T)(v) = 0$, we have
\[ \langle T(v), T(v) \rangle = \langle v, T^*(T(v)) \rangle = \langle v, (T^*T)(v) \rangle = \langle v, 0 \rangle = 0, \]
and hence $\langle T(v), T(v) \rangle = 0$. ■
3.1.15 Problem Let $V$ be an $n$-dimensional inner product space. Let $T : V \to V$ be a Hermitian linear transformation. Let $k$ be a positive integer. Then
\[ T^k(v) = 0 \;\Rightarrow\; T(v) = 0. \]
Proof For $k = 1$, the theorem is trivial. So we consider the case $k = 2$. Since $T : V \to V$ is Hermitian, we have $T^* = T$. Now,
\[ 0 = T^k(v) = T^2(v) = (TT)(v) = (T^*T)(v), \]
so $(T^*T)(v) = 0$. It follows from 3.1.14 that $T(v) = 0$.
Next, we consider the case $k = 3$. Here
\[ 0 = T^k(v) = T^3(v) = (TTT)(v) = (T^*TT)(v) = (T^*T)(T(v)), \]
so $(T^*T)(T(v)) = 0$. It follows from 3.1.14 that $T(T(v)) = 0$, that is, $T^2(v) = 0$. Since the theorem has been proved for $k = 2$, we have $T(v) = 0$. The general case follows in the same way, by induction on $k$. ■
Definition Let $V$ be an $n$-dimensional inner product space. Let $T : V \to V$ be a linear transformation. If $T^*T = TT^*$, then we say that $T$ is normal.
3.1.16 Theorem Let $V$ be an $n$-dimensional inner product space. Let $T : V \to V$ be a unitary linear transformation. Then $T$ is normal.
Proof We have to show that $T^*T = TT^*$. Since $T : V \to V$ is unitary, by 3.1.9, $T^*T = I$. It follows that $T^{-1} = T^*$:
\[ \text{LHS} = T^*T = T^{-1}T = I = TT^{-1} = TT^* = \text{RHS}. \]
■
3.1.17 Problem Let $V$ be an $n$-dimensional inner product space. Let $T : V \to V$ be a Hermitian linear transformation. Then $T$ is normal.
Proof We have to show that $T^*T = TT^*$. Since $T : V \to V$ is Hermitian, $T^* = T$:
\[ \text{LHS} = T^*T = TT = TT^* = \text{RHS}. \]
■
Definition Let $V$ be an $n$-dimensional inner product space. Let $T : V \to V$ be a linear transformation. If $T^* = -T$, then we say that $T$ is skew-Hermitian.
3.1.18 Problem Let $V$ be an $n$-dimensional inner product space. Let $T : V \to V$ be a skew-Hermitian linear transformation. Then $T$ is normal.
Proof We have to show that $T^*T = TT^*$. Since $T : V \to V$ is skew-Hermitian, $T^* = -T$:
\[ \text{LHS} = T^*T = (-T)T = T(-T) = TT^* = \text{RHS}. \]
■
3.1.19 Problem Let $V$ be an $n$-dimensional inner product space. Let $T : V \to V$ be a normal linear transformation. Then
\[ T(v) = 0 \;\Rightarrow\; T^*(v) = 0. \]
Proof Let $v \in V$ be such that $T(v) = 0$. We have to show that $T^*(v) = 0$, that is, $\langle T^*(v), T^*(v) \rangle = 0$, that is, $\langle T(T^*(v)), v \rangle = 0$, that is, $\langle (TT^*)(v), v \rangle = 0$. Since $T : V \to V$ is normal, we have $T^*T = TT^*$. It suffices to show that $\langle (T^*T)(v), v \rangle = 0$:
\[ \text{LHS} = \langle (T^*T)(v), v \rangle = \langle T^*(T(v)), v \rangle = \langle T^*(0), v \rangle = \langle 0, v \rangle = 0 = \text{RHS}. \]
■
3.1.20 Problem Let $V$ be an $n$-dimensional inner product space. Let $T : V \to V$ be a normal linear transformation. Let $\lambda$ be any complex number. Then $(T - \lambda I) : V \to V$ is a normal linear transformation.
Proof By 3.1.7, $(T - \lambda I)^* = T^* - \bar{\lambda} I^*$. Since for every $u, v \in V$, $\langle u, I^*(v) \rangle = \langle I(u), v \rangle = \langle u, v \rangle = \langle u, I(v) \rangle$, we have, for every $u, v \in V$, $\langle u, I^*(v) \rangle = \langle u, I(v) \rangle$. It follows that $I^* = I$. Thus $(T - \lambda I)^* = T^* - \bar{\lambda} I$. It suffices to show that
\[ (T^* - \bar{\lambda} I)(T - \lambda I) = (T - \lambda I)(T^* - \bar{\lambda} I), \]
that is,
\[ T^*T - \lambda T^* - \bar{\lambda} T + |\lambda|^2 I = TT^* - \bar{\lambda} T - \lambda T^* + |\lambda|^2 I, \]
that is,
\[ T^*T = TT^*. \]
This is known to be true, because $T : V \to V$ is normal. ■
3.1.21 Problem Let $V$ be an $n$-dimensional inner product space. Let $T : V \to V$ be a normal linear transformation. Let $\lambda$ be an eigenvalue of $T$. Let $v$ be an eigenvector belonging to $\lambda$, in the sense that $v$ is nonzero, and $T(v) = \lambda v$. Then $T^*(v) = \bar{\lambda} v$.
Proof It suffices to show that $(T^* - \bar{\lambda} I)(v) = 0$, that is, $(T^* - \bar{\lambda} I^*)(v) = 0$, that is, $(T - \lambda I)^*(v) = 0$.
Since $T(v) = \lambda v$, we have $(T - \lambda I)(v) = 0$. By 3.1.20, $(T - \lambda I) : V \to V$ is a normal linear transformation. Now, since $(T - \lambda I)(v) = 0$, by 3.1.19, $(T - \lambda I)^*(v) = 0$. ■
3.1.22 Problem Let $V$ be an $n$-dimensional inner product space. Let $T : V \to V$ be a unitary linear transformation. Let $\lambda$ be an eigenvalue of $T$. Then $|\lambda| = 1$.
Proof Since $T$ is unitary, by 3.1.16, $T$ is normal. Since $\lambda$ is an eigenvalue of $T$, there exists a nonzero $v \in V$ such that $T(v) = \lambda v$. Now, by 3.1.21, $T^*(v) = \bar{\lambda} v$. Since $T$ is unitary, by 3.1.9, $T^*T = I$, and hence
\[ |\lambda|^2 v = (\lambda\bar{\lambda}) v = \lambda(\bar{\lambda} v) = \lambda(T^*(v)) = T^*(\lambda v) = T^*(T(v)) = (T^*T)(v) = I(v) = v. \]
It follows that $(|\lambda|^2 - 1) v = 0$. Now since $v$ is nonzero, we have $|\lambda|^2 - 1 = 0$, that is, $|\lambda| = 1$. ■
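A familiar instance of 3.1.22 is a plane rotation, regarded as a unitary transformation of $\mathbb{C}^2$ with the standard inner product. The matrix $R_\theta$ below is an illustrative choice made for this example (the display assumes `amsmath`).

```latex
\[
R_\theta = \begin{pmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{pmatrix},
\qquad
R_\theta^* R_\theta = I .
\]
\begin{align*}
\det(R_\theta - \lambda I)
  &= (\cos\theta - \lambda)^2 + \sin^2\theta
   = \lambda^2 - 2\lambda\cos\theta + 1, \\
\lambda
  &= \cos\theta \pm i\sin\theta,
\qquad
|\lambda|^2 = \cos^2\theta + \sin^2\theta = 1 .
\end{align*}
```

Both eigenvalues lie on the unit circle, as 3.1.22 predicts.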
3.1.23 Problem Let $V$ be an $n$-dimensional inner product space. Let $T : V \to V$ be a normal linear transformation. Let $k$ be a positive integer. Then
\[ T^k(v) = 0 \;\Rightarrow\; T(v) = 0. \]
Proof For $k = 1$, the theorem is trivial. So we consider the case $k = 2$. Since $T : V \to V$ is normal, we have $TT^* = T^*T$. Since
\[ (T^*T)^* = T^*(T^*)^* = T^*T, \]
we have $(T^*T)^* = T^*T$, and hence $T^*T$ is a Hermitian linear transformation. Now suppose that $T^k(v) = 0$, that is, $T^2(v) = 0$. We have to show that $T(v) = 0$. Since
\[
\begin{aligned}
(T^*T)^2(v) &= ((T^*T)(T^*T))(v) = (T^*(TT^*)T)(v) = (T^*(T^*T)T)(v) \\
&= (T^*T^*)((TT)(v)) = (T^*T^*)(T^2(v)) = (T^*T^*)(0) = 0,
\end{aligned}
\]
we have $(T^*T)^2(v) = 0$. Now, since $T^*T$ is Hermitian, by 3.1.15, $(T^*T)(v) = 0$, and hence by 3.1.14, $T(v) = 0$.
Next, we consider the case $k = 3$. Suppose that $T^k(v) = 0$, that is, $T^3(v) = 0$. We have to show that $T(v) = 0$. Since $T$ is normal, we have
\[ (T^*T)^3(v) = ((T^*T)(T^*T)(T^*T))(v) = ((T^*)^3 T^3)(v) = (T^*)^3(T^3(v)) = (T^*)^3(0) = 0, \]
and hence $(T^*T)^3(v) = 0$. Now, since $T^*T$ is Hermitian, by 3.1.15, $(T^*T)(v) = 0$, and hence by 3.1.14, $T(v) = 0$, etc. ■
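The normality hypothesis in 3.1.23 cannot simply be dropped. The following worked computation, for an illustrative matrix $N$ on $\mathbb{C}^2$ with the standard inner product (an assumption made for this example; the display assumes `amsmath`), shows a non-normal transformation with $N^2 = 0$ but $N \neq 0$.

```latex
\[
N = \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix},
\qquad
N^* = \begin{pmatrix} 0 & 0 \\ 1 & 0 \end{pmatrix},
\qquad
N^2 = \begin{pmatrix} 0 & 0 \\ 0 & 0 \end{pmatrix}.
\]
\[
N^*N = \begin{pmatrix} 0 & 0 \\ 0 & 1 \end{pmatrix}
\;\neq\;
\begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix} = NN^* .
\]
```

With $v = (0, 1)^T$ we get $N^2(v) = 0$ while $N(v) = (1, 0)^T \neq 0$, so the implication $T^k(v) = 0 \Rightarrow T(v) = 0$ fails for this $N$.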
3.1.24 Problem Let $V$ be an $n$-dimensional inner product space. Let $T : V \to V$ be a normal linear transformation. Let $\lambda$ be any complex number. Let $k$ be a positive integer. Then
\[ (T - \lambda I)^k(v) = 0 \;\Rightarrow\; T(v) = \lambda v. \]
Proof By 3.1.20, $(T - \lambda I) : V \to V$ is a normal linear transformation. Now, by 3.1.23,
\[ (T - \lambda I)^k(v) = 0 \;\Rightarrow\; (T - \lambda I)(v) = 0, \]
and hence
\[ (T - \lambda I)^k(v) = 0 \;\Rightarrow\; T(v) = \lambda v. \]
■
3.1.25 Problem Let $V$ be an $n$-dimensional inner product space. Let $T : V \to V$ be a normal linear transformation. Let $\lambda, \mu$ be two distinct eigenvalues of $T$. Let $v, w \in V$ be such that $T(v) = \lambda v$ and $T(w) = \mu w$. Then $\langle v, w \rangle = 0$.
Proof If $v = 0$ or $w = 0$, then the theorem is trivial. So we consider the case that $v$ and $w$ both are nonzero. Since $\mu$ is an eigenvalue of $T$ and $T(w) = \mu w$, $w$ is an eigenvector belonging to $\mu$. It follows, from 3.1.21, that $T^*(w) = \bar{\mu} w$. Hence
\[ \mu \langle v, w \rangle = \langle v, \bar{\mu} w \rangle = \langle v, T^*(w) \rangle = \langle T(v), w \rangle = \langle \lambda v, w \rangle = \lambda \langle v, w \rangle. \]
Thus
\[ (\mu - \lambda) \langle v, w \rangle = 0. \]
Now, since $\mu \neq \lambda$, we have $\langle v, w \rangle = 0$. ■
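The orthogonality in 3.1.25 can be checked by hand in a small case. The Hermitian (hence normal) matrix $H$ below is an illustrative choice for this example, acting on $\mathbb{C}^2$ with the standard inner product $\langle x, y \rangle = x_1\overline{y_1} + x_2\overline{y_2}$ (the display assumes `amsmath`).

```latex
\[
H = \begin{pmatrix} 2 & i \\ -i & 2 \end{pmatrix},
\qquad
v = \begin{pmatrix} -i \\ 1 \end{pmatrix},
\qquad
w = \begin{pmatrix} i \\ 1 \end{pmatrix}.
\]
\begin{align*}
Hv &= \begin{pmatrix} -2i + i \\ (-i)(-i) + 2 \end{pmatrix}
    = \begin{pmatrix} -i \\ 1 \end{pmatrix} = 1\cdot v, \\
Hw &= \begin{pmatrix} 2i + i \\ (-i)(i) + 2 \end{pmatrix}
    = \begin{pmatrix} 3i \\ 3 \end{pmatrix} = 3\cdot w, \\
\langle v, w \rangle &= (-i)\,\overline{i} + 1\cdot\overline{1}
    = (-i)(-i) + 1 = -1 + 1 = 0 .
\end{align*}
```

Here $\lambda = 1$ and $\mu = 3$ are distinct, and the corresponding eigenvectors are orthogonal.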
Definition Let $A$ be a ring. Let $A$ be a vector space over the field $\mathbb{C}$ of complex numbers. If for every $a, b \in A$, and, for every complex number $\alpha$,
\[ \alpha(ab) = (\alpha a)b = a(\alpha b), \]
then we say that $A$ is an algebra.
Let $V$ be any vector space over the field $\mathbb{C}$ of complex numbers. Let $A(V)$ be the collection of all linear transformations from $V$ to $V$. We know that $A(V)$ is an algebra with unit element $I$. If $\dim V = n$, then $\dim A(V) = n^2$.
Definition Let $A$ be an algebra with unit element $e$. Let
\[ p(x) \equiv \alpha_0 + \alpha_1 x + \cdots + \alpha_n x^n \]
be any polynomial in $x$ with complex coefficients $\alpha_i$. Let $a \in A$. By $a$ satisfies $p(x)$, we mean
\[ \alpha_0 e + \alpha_1 a + \cdots + \alpha_n a^n = 0. \quad (\text{In short, } p(a) = 0.) \]
3.1.26 Problem Let $A$ be an algebra with unit element $e$. Let $m$ be the dimension of $A$. Let $a \in A$. Then there exists a nontrivial polynomial $p(x)$ such that $a$ satisfies $p(x)$. Also, the degree of $p(x)$ is not greater than $m$.
Proof If any two of $e, a, a^2, \ldots, a^m$ are equal, say $a^2 = a^5$, then the polynomial $1x^2 + (-1)x^5$ serves the purpose of $p(x)$.
Finally, we consider the case that $e, a, a^2, \ldots, a^m$ are $(m + 1)$ distinct members of $A$. Since $m$ is the dimension of $A$, the elements $e, a, a^2, \ldots, a^m$ are linearly dependent, and hence there exist complex numbers $\alpha_0, \alpha_1, \ldots, \alpha_m$, not all zero, such that
\[ \alpha_0 e + \alpha_1 a + \cdots + \alpha_m a^m = 0. \]
It follows that the polynomial $\alpha_0 + \alpha_1 x + \cdots + \alpha_m x^m$ serves the purpose of $p(x)$. Here, the degree of $p(x)$ is not greater than $m$. ■
3.1.27 Problem Let $V$ be any $n$-dimensional vector space. Let $T \in A(V)$. Then there exists a nontrivial polynomial $p(x)$ of degree $\leq n^2$ such that $p(T) = 0$.
Proof We know that $\dim A(V) = n^2$, so by 3.1.26, there exists a nontrivial polynomial $p(x)$ of degree $\leq n^2$ such that $p(T) = 0$. ■
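For a small illustration of 3.1.27, take $n = 2$, so that $\dim A(V) = 4$ and the five transformations $I, T, T^2, T^3, T^4$ must be linearly dependent. The matrix $T$ below is an illustrative choice for this example (the display assumes `amsmath`); in this case a dependence already appears among $I, T, T^2$.

```latex
\[
T = \begin{pmatrix} 1 & 1 \\ 0 & 1 \end{pmatrix},
\qquad
T^2 = \begin{pmatrix} 1 & 2 \\ 0 & 1 \end{pmatrix}.
\]
\[
T^2 - 2T + I
= \begin{pmatrix} 1 & 2 \\ 0 & 1 \end{pmatrix}
- \begin{pmatrix} 2 & 2 \\ 0 & 2 \end{pmatrix}
+ \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}
= \begin{pmatrix} 0 & 0 \\ 0 & 0 \end{pmatrix}.
\]
```

So $T$ satisfies $p(x) = x^2 - 2x + 1$, whose degree $2$ is well within the bound $n^2 = 4$.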
Definition Let $V$ be any $n$-dimensional vector space. Let $T \in A(V)$. A nontrivial polynomial $p(x)$ of lowest degree such that $p(T) = 0$ is called a minimal polynomial of $T$.
If $p(x)$ is a minimal polynomial of $T$, and $T$ satisfies another polynomial $h(x)$, then $p(x)$ divides $h(x)$.
3.1.28 Problem Let $V$ be any $n$-dimensional vector space. Let $T \in A(V)$. Let $T$ be invertible. Suppose that $\alpha_0 + \alpha_1 x + \cdots + \alpha_m x^m$ is a minimal polynomial of $T$, where $\alpha_m \neq 0$. Then $\alpha_0 \neq 0$.
Proof Suppose to the contrary that $\alpha_0 = 0$. We seek a contradiction.
Since $\alpha_1 x + \cdots + \alpha_m x^m$ is a minimal polynomial of $T$, we have
\[ \alpha_1 T + \cdots + \alpha_m T^m = 0, \]
and hence
\[ \alpha_1 I + \alpha_2 T + \cdots + \alpha_m T^{m-1} = T^{-1}(\alpha_1 T + \cdots + \alpha_m T^m) = T^{-1}(0) = 0. \]
Thus
\[ \alpha_1 I + \alpha_2 T + \cdots + \alpha_m T^{m-1} = 0. \]
Hence $h(T) = 0$, where $h(x) \equiv \alpha_1 + \alpha_2 x + \cdots + \alpha_m x^{m-1}$. Now, since $\alpha_m \neq 0$,
\[ \deg h(x) = m - 1 < m = \deg(\alpha_1 x + \cdots + \alpha_m x^m). \]
Since $\deg h(x) < \deg(\alpha_1 x + \cdots + \alpha_m x^m)$, the polynomial $h(x)$ is nontrivial, and $\alpha_1 x + \cdots + \alpha_m x^m$ is a minimal polynomial of $T$, we have $h(T) \neq 0$. This is a contradiction. ■
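3.1.29 Problem Let $V$ be any $n$-dimensional vector space. Let $T \in A(V)$. Suppose that $\alpha_0 + \alpha_1 x + \cdots + \alpha_m x^m$ is a minimal polynomial of $T$, where $\alpha_m \neq 0$ and $\alpha_0 \neq 0$. Then $T^{-1}$ exists.
Proof Since $\alpha_0 + \alpha_1 x + \cdots + \alpha_m x^m$ is a minimal polynomial of $T$, we have
\[ \alpha_0 I + \alpha_1 T + \cdots + \alpha_m T^m = 0. \]
It follows that
\[ I = \frac{-\alpha_1}{\alpha_0} T + \frac{-\alpha_2}{\alpha_0} T^2 + \cdots + \frac{-\alpha_m}{\alpha_0} T^m, \]
or
\[ T\left(\frac{-\alpha_1}{\alpha_0} I + \frac{-\alpha_2}{\alpha_0} T + \cdots + \frac{-\alpha_m}{\alpha_0} T^{m-1}\right) = I. \]
Since the operator in parentheses is a polynomial in $T$, it commutes with $T$. This shows that $\frac{-\alpha_1}{\alpha_0} I + \frac{-\alpha_2}{\alpha_0} T + \cdots + \frac{-\alpha_m}{\alpha_0} T^{m-1}$ is the inverse of $T$. ■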
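The formula in the proof of 3.1.29 can be tried on a small example. The matrix $T$ below is an illustrative choice (not from the text; the display assumes `amsmath`). Since $T - I \neq 0$ and $(T - I)^2 = 0$, its minimal polynomial is $1 - 2x + x^2$, so $\alpha_0 = 1$, $\alpha_1 = -2$, $\alpha_2 = 1$.

```latex
\[
T = \begin{pmatrix} 1 & 1 \\ 0 & 1 \end{pmatrix},
\qquad
\frac{-\alpha_1}{\alpha_0}\, I + \frac{-\alpha_2}{\alpha_0}\, T
= 2I - T
= \begin{pmatrix} 1 & -1 \\ 0 & 1 \end{pmatrix}.
\]
\[
T\,(2I - T) = 2T - T^2
= \begin{pmatrix} 2 & 2 \\ 0 & 2 \end{pmatrix}
- \begin{pmatrix} 1 & 2 \\ 0 & 1 \end{pmatrix}
= \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}.
\]
```

So $T^{-1} = 2I - T$, a polynomial in $T$ of degree $m - 1 = 1$, as the proof predicts.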
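3.1.30 Problem Let $V$ be any $n$-dimensional vector space. Let $T \in A(V)$. Suppose that $T^{-1}$ does not exist. Then there exists a nonzero $S \in A(V)$ such that $ST = TS = 0$.
Proof Suppose that $\alpha_0 + \alpha_1 x + \cdots + \alpha_m x^m$ is a minimal polynomial of $T$, where $\alpha_m \neq 0$. Since $T^{-1}$ does not exist, by 3.1.29, $\alpha_0 = 0$. Hence
\[ \alpha_1 T + \cdots + \alpha_m T^m = 0, \]
or
\[ (\alpha_1 I + \alpha_2 T + \cdots + \alpha_m T^{m-1})\,T = T\,(\alpha_1 I + \alpha_2 T + \cdots + \alpha_m T^{m-1}) = 0. \]
Thus $ST = TS = 0$, where $S \equiv \alpha_1 I + \alpha_2 T + \cdots + \alpha_m T^{m-1}$ $(\in A(V))$. Since $\alpha_0 + \alpha_1 x + \cdots + \alpha_m x^m$ is a minimal polynomial of $T$ and $\alpha_m \neq 0$, the nontrivial polynomial $\alpha_1 + \alpha_2 x + \cdots + \alpha_m x^{m-1}$ of lower degree is not satisfied by $T$, so $\alpha_1 I + \alpha_2 T + \cdots + \alpha_m T^{m-1} \neq 0$, and hence $S \neq 0$. ■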
3.1.31 Problem Let $V$ be any $n$-dimensional vector space. Let $T \in A(V)$. Suppose that $T^{-1}$ does not exist. Then there exists a nonzero $v \in V$ such that $T(v) = 0$.
Proof By 3.1.30, there exists a nonzero $S \in A(V)$ such that $TS = 0$. Since $S$ is nonzero, there exists $u \in V$ such that $S(u) \neq 0$. Now, since $TS = 0$, we have
\[ T(S(u)) = (TS)(u) = 0(u) = 0, \]
and hence $T(v) = 0$, where $v \equiv S(u)$ $(\neq 0)$. ■
3.1.32 Problem Let $V$ be any $n$-dimensional vector space. Let $T \in A(V)$. Suppose that there exists a nonzero $v \in V$ such that $T(v) = 0$. Then $T^{-1}$ does not exist.
Proof Suppose to the contrary that $T^{-1}$ exists. We seek a contradiction.
Suppose that $\alpha_0 + \alpha_1 x + \cdots + \alpha_m x^m$ is a minimal polynomial of $T$, where $\alpha_m \neq 0$. Now, by 3.1.28, $\alpha_0 \neq 0$. Since $\alpha_0 + \alpha_1 x + \cdots + \alpha_m x^m$ is a minimal polynomial of $T$, we have
\[ \alpha_0 I + \alpha_1 T + \cdots + \alpha_m T^m = 0. \]
It follows that
\[
\begin{aligned}
\alpha_0 v = \alpha_0 v + \alpha_1 0 + \cdots + \alpha_m 0 &= \alpha_0 I(v) + \alpha_1 T(v) + \cdots + \alpha_m T^m(v) \\
&= (\alpha_0 I + \alpha_1 T + \cdots + \alpha_m T^m)(v) = 0(v) = 0,
\end{aligned}
\]
and hence $\alpha_0 v = 0$. Now, since $v$ is nonzero, $\alpha_0 = 0$. This is a contradiction. ■
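3.1.33 Problem Let $V$ be any $n$-dimensional vector space. Let $S, T \in A(V)$. Let $v_1, \ldots, v_n$ be any basis of $V$. Let $m(S)$ be the matrix of $S$ relative to the basis $v_1, \ldots, v_n$, in the sense that $m(S) = [a_{ij}]_{n \times n}$, where $S(v_j) = \sum_{i=1}^{n} a_{ij} v_i$. Let $m(T)$ be the matrix of $T$ relative to the basis $v_1, \ldots, v_n$. Then
\[ m(ST) = m(S)\,m(T). \]
Proof Let $m(T) = [b_{ij}]_{n \times n}$, where $T(v_j) = \sum_{i=1}^{n} b_{ij} v_i$. It suffices to show that
\[ (ST)(v_j) = \sum_{k=1}^{n} \Big(\sum_{i=1}^{n} a_{ki} b_{ij}\Big) v_k. \]
\[
\begin{aligned}
\text{LHS} = (ST)(v_j) = S(T(v_j)) &= S\Big(\sum_{i=1}^{n} b_{ij} v_i\Big) = \sum_{i=1}^{n} b_{ij}\, S(v_i) = \sum_{i=1}^{n} b_{ij} \Big(\sum_{k=1}^{n} a_{ki} v_k\Big) \\
&= \sum_{i=1}^{n} \sum_{k=1}^{n} b_{ij} a_{ki} v_k = \sum_{k=1}^{n} \sum_{i=1}^{n} b_{ij} a_{ki} v_k = \sum_{k=1}^{n} \Big(\sum_{i=1}^{n} b_{ij} a_{ki}\Big) v_k \\
&= \sum_{k=1}^{n} \Big(\sum_{i=1}^{n} a_{ki} b_{ij}\Big) v_k = \text{RHS}.
\end{aligned}
\]
■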
3.1.34 Problem Let $V$ be any $n$-dimensional vector space. Let $T \in A(V)$. Let $v_1, \ldots, v_n$ be any basis of $V$. Suppose that $T^{-1}$ exists. Let $m(T)$ be the matrix of $T$ relative to the basis $v_1, \ldots, v_n$. Then
\[ m(T^{-1}) = (m(T))^{-1}. \]
Proof Here, it suffices to show that $m(T^{-1})\,m(T) = [\delta_{ij}]_{n \times n}$. Since $T^{-1}T = I$, by Problem 3.1.33, we have
\[ [\delta_{ij}]_{n \times n} = m(I) = m(T^{-1}T) = m(T^{-1})\,m(T), \]
and hence $m(T^{-1})\,m(T) = [\delta_{ij}]_{n \times n}$. ■
3.1.35 Theorem
a. Let $V$ be any $n$-dimensional vector space. Let $T \in A(V)$. Let $v_1, \ldots, v_n$ and $w_1, \ldots, w_n$ be any two bases of $V$. Let $m_1(T)$ be the matrix of $T$ relative to the basis $v_1, \ldots, v_n$. Let $m_2(T)$ be the matrix of $T$ relative to the basis $w_1, \ldots, w_n$. Let $S : V \to V$ be the linear transformation such that for every $i \in \{1, \ldots, n\}$, $S(v_i) = w_i$. Then
\[ m_2(T) = (m_1(S))^{-1}\, m_1(T)\, m_1(S). \]
b. Let $V$ be any $n$-dimensional vector space. Let $T \in A(V)$. Let $v_1, \ldots, v_n$ be any basis of $V$. Let $A \equiv [a_{ij}]_{n \times n}$ be the matrix of $T$ relative to the basis $v_1, \ldots, v_n$. Let $P \equiv [p_{ij}]_{n \times n}$ be any invertible matrix. Then there exists a basis $w_1, \ldots, w_n$ of $V$ such that $P^{-1}AP$ is the matrix of $T$ relative to the basis $w_1, \ldots, w_n$.
Proof (a) Let $m_1(T) = [a_{ij}]_{n \times n}$, where $T(v_j) = \sum_{i=1}^{n} a_{ij} v_i$. Let $m_2(T) = [b_{ij}]_{n \times n}$, where $T(w_j) = \sum_{i=1}^{n} b_{ij} w_i$.
Clearly, $S$ is invertible, that is, $S^{-1}$ exists.
Proof Suppose that $S(v) = 0$. It suffices to show that $v = 0$. There exist scalars $\alpha_1, \ldots, \alpha_n$ such that $v = \alpha_1 v_1 + \cdots + \alpha_n v_n$. It follows that
\[ 0 = S(v) = S(\alpha_1 v_1 + \cdots + \alpha_n v_n) = \alpha_1 S(v_1) + \cdots + \alpha_n S(v_n) = \alpha_1 w_1 + \cdots + \alpha_n w_n, \]
so $\alpha_1 w_1 + \cdots + \alpha_n w_n = 0$. Now, since $w_1, \ldots, w_n$ is a basis of $V$, each $\alpha_i$ is $0$, and hence $v = \alpha_1 v_1 + \cdots + \alpha_n v_n = 0$. Thus $v = 0$. ■
Since for every $j \in \{1, \ldots, n\}$,
\[ T(w_j) = \sum_{i=1}^{n} b_{ij} w_i, \]
we have, for every $j \in \{1, \ldots, n\}$,
\[ (TS)(v_j) = T(S(v_j)) = T(w_j) = \sum_{i=1}^{n} b_{ij} w_i = \sum_{i=1}^{n} b_{ij}\, S(v_i) = S\Big(\sum_{i=1}^{n} b_{ij} v_i\Big), \]
and hence for every $j \in \{1, \ldots, n\}$,
\[ (TS)(v_j) = S\Big(\sum_{i=1}^{n} b_{ij} v_i\Big). \]
It follows that for every $j \in \{1, \ldots, n\}$,
\[ (S^{-1}TS)(v_j) = S^{-1}\big((TS)(v_j)\big) = \sum_{i=1}^{n} b_{ij} v_i. \]
Thus for every $j \in \{1, \ldots, n\}$,
\[ (S^{-1}TS)(v_j) = \sum_{i=1}^{n} b_{ij} v_i. \]
It follows that
\[ m_1(S^{-1}TS) = [b_{ij}] = m_2(T), \]
where $m_1(S^{-1}TS)$ is the matrix of $S^{-1}TS$ relative to the basis $v_1, \ldots, v_n$. Thus
\[ m_2(T) = m_1(S^{-1}TS). \]
By 3.1.33 and 3.1.34,
\[ m_2(T) = m_1(S^{-1}TS) = m_1(S^{-1})\, m_1(T)\, m_1(S) = (m_1(S))^{-1}\, m_1(T)\, m_1(S), \]
and hence
\[ m_2(T) = (m_1(S))^{-1}\, m_1(T)\, m_1(S). \]
Proof (b) Put $w_j \equiv \sum_{i=1}^{n} p_{ij} v_i$ $(j = 1, \ldots, n)$.
Clearly, $\{w_1, \ldots, w_n\}$ is linearly independent.
Proof To show this, suppose that $\alpha_1 w_1 + \cdots + \alpha_n w_n = 0$. We have to show that each $\alpha_i$ equals $0$. Since
\[
\begin{aligned}
\sum_{k=1}^{n} \Big(\sum_{i=1}^{n} p_{ki}\alpha_i\Big) v_k &= \sum_{k=1}^{n} \Big(\sum_{i=1}^{n} \alpha_i p_{ki}\Big) v_k = \sum_{k=1}^{n} \sum_{i=1}^{n} \alpha_i p_{ki} v_k \\
&= \sum_{i=1}^{n} \sum_{k=1}^{n} \alpha_i p_{ki} v_k = \sum_{i=1}^{n} \alpha_i \Big(\sum_{k=1}^{n} p_{ki} v_k\Big) = \sum_{i=1}^{n} \alpha_i w_i = 0,
\end{aligned}
\]
we have
\[ \sum_{k=1}^{n} \Big(\sum_{i=1}^{n} p_{ki}\alpha_i\Big) v_k = 0. \]
Since $v_1, \ldots, v_n$ is a basis of $V$, we have, for every $k \in \{1, \ldots, n\}$,
\[ \sum_{i=1}^{n} p_{ki}\alpha_i = 0, \]
and hence
\[ [p_{ij}]_{n \times n} \begin{bmatrix} \alpha_1 \\ \vdots \\ \alpha_n \end{bmatrix} = \begin{bmatrix} 0 \\ \vdots \\ 0 \end{bmatrix}, \]
that is, $P[\alpha_1, \ldots, \alpha_n]^T = [0, \ldots, 0]^T$. It follows that
\[ [\alpha_1, \ldots, \alpha_n]^T = I[\alpha_1, \ldots, \alpha_n]^T = (P^{-1}P)[\alpha_1, \ldots, \alpha_n]^T = P^{-1}\big(P[\alpha_1, \ldots, \alpha_n]^T\big) = P^{-1}[0, \ldots, 0]^T = [0, \ldots, 0]^T, \]
and hence $[\alpha_1, \ldots, \alpha_n]^T = [0, \ldots, 0]^T$. This shows that each $\alpha_i$ is $0$. ■
Thus, we have shown that $\{w_1, \ldots, w_n\}$ is linearly independent. Now, since $V$ is an $n$-dimensional vector space, $w_1, \ldots, w_n$ is a basis of $V$. It remains to show that $P^{-1}AP$ is the matrix of $T$ relative to the basis $w_1, \ldots, w_n$.
Let $S : V \to V$ be the linear transformation defined as follows: for every $i \in \{1, \ldots, n\}$, $S(v_i) = w_i$. Since
\[ S(v_i) = w_i = \sum_{k=1}^{n} p_{ki} v_k, \]
we have $S(v_i) = \sum_{k=1}^{n} p_{ki} v_k$, and hence the matrix of $S$ relative to the basis $v_1, \ldots, v_n$ is $[p_{ki}]$ $(= P)$. Now, by 3.1.35(a), $P^{-1}AP$ is the matrix of $T$ relative to the basis $w_1, \ldots, w_n$. ■
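A small worked instance of 3.1.35(b) may help fix the conventions (columns of $A$ give the images $T(v_j)$, columns of $P$ give the new basis vectors $w_j$). The matrices $A$ and $P$ below are illustrative choices for this example, with $n = 2$ (the display assumes `amsmath`).

```latex
\[
A = \begin{pmatrix} 1 & 1 \\ 0 & 2 \end{pmatrix},
\qquad
P = \begin{pmatrix} 1 & 1 \\ 0 & 1 \end{pmatrix},
\qquad
P^{-1} = \begin{pmatrix} 1 & -1 \\ 0 & 1 \end{pmatrix}.
\]
\[
P^{-1}AP
= \begin{pmatrix} 1 & -1 \\ 0 & 1 \end{pmatrix}
  \begin{pmatrix} 1 & 2 \\ 0 & 2 \end{pmatrix}
= \begin{pmatrix} 1 & 0 \\ 0 & 2 \end{pmatrix}.
\]
\begin{align*}
w_1 &= v_1, & T(w_1) &= T(v_1) = v_1 = 1\cdot w_1, \\
w_2 &= v_1 + v_2, & T(w_2) &= T(v_1) + T(v_2) = v_1 + (v_1 + 2v_2) = 2\cdot w_2 .
\end{align*}
```

The direct computation of $T(w_1)$ and $T(w_2)$ agrees with the columns of $P^{-1}AP$, as the theorem asserts.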
3.2 Canonical Forms
Definition Let $V$ be any $n$-dimensional vector space. Let $T \in A(V)$. Let $V_1$ be any subspace of $V$. If $T(V_1) \subset V_1$, then we say that $V_1$ is invariant under $T$.
3.2.1 Theorem Let $V$ be any $n$-dimensional vector space. Let $T \in A(V)$. Let $V_1$ and $V_2$ be any subspaces of $V$. Suppose that $V = V_1 \oplus V_2$, in the sense that every $v \in V$ can be expressed uniquely as $v_1 + v_2$, where $v_1 \in V_1$ and $v_2 \in V_2$. Suppose that $V_1$ is invariant under $T$, and $V_2$ is invariant under $T$, in the sense that the restriction $T|_{V_1}$ is in $A(V_1)$ and the restriction $T|_{V_2}$ is in $A(V_2)$. Let $p_1(x)$ be a minimal polynomial of $T|_{V_1}$, and $p_2(x)$ a minimal polynomial of $T|_{V_2}$. Then the least common multiple of $p_1(x)$ and $p_2(x)$ is a minimal polynomial of $T$.
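Proof Let $p(x)$ be a minimal polynomial of $T$. It suffices to show that
1. $p_1(x)$ divides $p(x)$, that is, $p(T|_{V_1}) = 0$,
2. $p_2(x)$ divides $p(x)$, that is, $p(T|_{V_2}) = 0$,
3. if $p_1(x)$ divides $q(x)$, and $p_2(x)$ divides $q(x)$, then $p(x)$ divides $q(x)$, that is,
\[ \big(p_1(x) \mid q(x) \text{ and } p_2(x) \mid q(x)\big) \;\Rightarrow\; q(T) = 0. \]
For 1: Since $p_1(x)$ is a minimal polynomial of $T|_{V_1}$, we have $p_1(T|_{V_1}) = 0$.
Clearly, for every polynomial $q(x)$, $q(T)|_{V_1} = q(T|_{V_1})$.
Proof Suppose that
\[ q(x) \equiv \alpha_0 + \alpha_1 x + \cdots + \alpha_m x^m \]
and $v \in V_1$. We have to show that
\[
\begin{aligned}
&\alpha_0 I(v) + \alpha_1 T(v) + \alpha_2 T(T(v)) + \cdots + \alpha_m T^m(v) \\
&\quad = \alpha_0 (I|_{V_1})(v) + \alpha_1 (T|_{V_1})(v) + \alpha_2 (T|_{V_1})\big((T|_{V_1})(v)\big) + \cdots + \alpha_m (T|_{V_1})^m(v).
\end{aligned}
\]
\[
\begin{aligned}
\text{RHS} &= \alpha_0 (I|_{V_1})(v) + \alpha_1 (T|_{V_1})(v) + \alpha_2 (T|_{V_1})\big((T|_{V_1})(v)\big) + \cdots + \alpha_m (T|_{V_1})^m(v) \\
&= \alpha_0 v + \alpha_1 T(v) + \alpha_2 (T|_{V_1})(T(v)) + \cdots + \alpha_m (T|_{V_1})^m(v) \\
&= \alpha_0 v + \alpha_1 T(v) + \alpha_2 T(T(v)) + \cdots + \alpha_m (T|_{V_1})^m(v) \\
&= \alpha_0 I(v) + \alpha_1 T(v) + \alpha_2 T(T(v)) + \cdots + \alpha_m (T|_{V_1})^m(v) \\
&\;\;\vdots \\
&= \alpha_0 I(v) + \alpha_1 T(v) + \alpha_2 T(T(v)) + \cdots + \alpha_m T^m(v) = \text{LHS}.
\end{aligned}
\]
■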
Similarly, for every polynomial $q(x)$, we have $q(T)|_{V_2} = q(T|_{V_2})$. Since $p(x)$ is a minimal polynomial of $T$, we have $p(T) = 0$, and hence
\[ p(T|_{V_1}) = p(T)|_{V_1} = 0. \]
Thus $p(T|_{V_1}) = 0$.
For 2: The proof is similar to (1).
For 3: Suppose that $p_1(x)$ divides $q(x)$ and $p_2(x)$ divides $q(x)$. We have to show that $q(T) = 0$.
To this end, let us take an arbitrary $v \in V$. We have to show that $(q(T))(v) = 0$. Since $v \in V$ and $V = V_1 \oplus V_2$, there exist $v_1 \in V_1$ and $v_2 \in V_2$ such that $v = v_1 + v_2$. We have to show that
\[ (q(T))(v_1) + (q(T))(v_2) = (q(T))(v_1 + v_2) = 0, \]
that is,
\[ (q(T))(v_1) + (q(T))(v_2) = 0. \]
It suffices to show that
\[ q(T)|_{V_1} = 0 \quad\text{and}\quad q(T)|_{V_2} = 0, \]
that is,
\[ q(T|_{V_1}) = 0 \quad\text{and}\quad q(T|_{V_2}) = 0. \]
Since $p_1(x)$ is a minimal polynomial of $T|_{V_1}$, we have $p_1(T|_{V_1}) = 0$. Now, since $p_1(x)$ divides $q(x)$, we have $q(T|_{V_1}) = 0$. Similarly, $q(T|_{V_2}) = 0$. ■
3.2.2 Note Let $V$ be any $n$-dimensional vector space over the field $F$. Let $T \in A(V)$. Let $p(x) \in F[x]$. Let $p(x)$ be the minimal polynomial of $T$ over $F$.
By 1.3.19, we can write
\[ p(x) = (q_1(x))^{l_1} (q_2(x))^{l_2} \cdots (q_k(x))^{l_k}, \]
where $q_1(x), q_2(x), \ldots, q_k(x)$ are distinct irreducible polynomials (of degree $\geq 1$) in $F[x]$, and $l_1, l_2, \ldots, l_k$ are positive integers.
Put
\[ V_1 \equiv \{ v : v \in V \text{ and } ((q_1(T))^{l_1})(v) = 0 \}, \quad V_2 \equiv \{ v : v \in V \text{ and } ((q_2(T))^{l_2})(v) = 0 \}, \quad \text{etc.} \]
It is clear that each $V_i$ is a linear subspace of $V$. It follows that
\[ V_1 + V_2 + \cdots + V_k \subset V. \qquad (*) \]
Next suppose that $v \in V_1$. It follows that $((q_1(T))^{l_1})(v) = 0$. Now
\[ ((q_1(T))^{l_1})(T(v)) = ((q_1(T))^{l_1} \circ T)(v) = (T \circ (q_1(T))^{l_1})(v) = T\big((q_1(T))^{l_1}(v)\big) = T(0) = 0, \]
so
\[ ((q_1(T))^{l_1})(T(v)) = 0, \]
and hence $T(v) \in V_1$. This shows that $V_1$ is invariant under $T$. Similarly, $V_2$ is invariant under $T$, etc.
We assume that $k > 1$. Put
\[
\begin{aligned}
h_1(x) &\equiv (q_2(x))^{l_2} (q_3(x))^{l_3} \cdots (q_k(x))^{l_k}, \\
h_2(x) &\equiv (q_1(x))^{l_1} (q_3(x))^{l_3} \cdots (q_k(x))^{l_k}, \\
&\;\;\vdots \\
h_k(x) &\equiv (q_1(x))^{l_1} (q_2(x))^{l_2} \cdots (q_{k-1}(x))^{l_{k-1}}.
\end{aligned}
\]
Observe that the greatest common divisor of $h_1(x), h_2(x), \ldots, h_k(x)$ is $1$. So by 1.1.5, there exist $a_1(x), a_2(x), \ldots, a_k(x) \in F[x]$ such that
\[ 1 = h_1(x)a_1(x) + h_2(x)a_2(x) + \cdots + h_k(x)a_k(x), \]
and hence
\[ I = (h_1(T)) \circ (a_1(T)) + (h_2(T)) \circ (a_2(T)) + \cdots + (h_k(T)) \circ (a_k(T)). \]
It follows that for every $v \in V$, we have
\[
\begin{aligned}
v = I(v) &= \big((h_1(T)) \circ (a_1(T)) + (h_2(T)) \circ (a_2(T)) + \cdots + (h_k(T)) \circ (a_k(T))\big)(v) \\
&= (h_1(T))\big((a_1(T))(v)\big) + (h_2(T))\big((a_2(T))(v)\big) + \cdots + (h_k(T))\big((a_k(T))(v)\big),
\end{aligned}
\]
and hence for every $v \in V$, there exist $w_1, w_2, \ldots, w_k \in V$ such that
\[ v = (h_1(T))(w_1) + (h_2(T))(w_2) + \cdots + (h_k(T))(w_k). \qquad (**) \]
Since $q_1(x)$ is a polynomial of degree $\geq 1$ in $F[x]$ and $l_1$ is a positive integer, we have
\[
\begin{aligned}
\deg(h_1(x)) &= \deg\big((q_2(x))^{l_2} (q_3(x))^{l_3} \cdots (q_k(x))^{l_k}\big) = l_2 \deg(q_2(x)) + \cdots + l_k \deg(q_k(x)) \\
&< l_1 \deg(q_1(x)) + l_2 \deg(q_2(x)) + \cdots + l_k \deg(q_k(x)) \\
&= \deg\big((q_1(x))^{l_1} (q_2(x))^{l_2} \cdots (q_k(x))^{l_k}\big) = \deg(p(x)),
\end{aligned}
\]
and hence $\deg(h_1(x)) < \deg(p(x))$. Now since $p(x)$ is the minimal polynomial of $T$ over $F$, we have $p(T) = 0$ and $h_1(T) \neq 0$. Since $h_1(T) \neq 0$ and $h_1(T) : V \to V$, there exists $v_1 \in V$ such that $(h_1(T))(v_1) \neq 0$. Since $p(x) = (q_1(x))^{l_1} h_1(x)$, we have $0 = p(T) = (q_1(T))^{l_1} \circ h_1(T)$, and hence $(q_1(T))^{l_1} \circ h_1(T) = 0$. It follows that
\[ (q_1(T))^{l_1}\big((h_1(T))(v_1)\big) = \big((q_1(T))^{l_1} \circ h_1(T)\big)(v_1) = 0(v_1) = 0, \]
and hence $(q_1(T))^{l_1}(w_1) = 0$, where $w_1 \equiv (h_1(T))(v_1)$ $(\neq 0)$. Hence by the definition of $V_1$, we have $w_1 \in V_1$. Also $w_1 \neq 0$. This shows that $V_1 \neq \{0\}$. Similarly, $V_2 \neq \{0\}$, etc.
Since $h_1(T) \neq 0$ and $h_1(T) : V \to V$, we have $(h_1(T))(V) \neq \{0\}$. Similarly, $(h_2(T))(V) \neq \{0\}$, etc.
Clearly, $(h_1(T))(V) \subset V_1$.
Proof To show this, let us take an arbitrary $u_1 \in V$. We have to show that
\[ (h_1(T))(u_1) \in V_1 \;\big(= \{ v : v \in V \text{ and } ((q_1(T))^{l_1})(v) = 0 \}\big), \]
that is, $((q_1(T))^{l_1})\big((h_1(T))(u_1)\big) = 0$, that is, $\big((q_1(T))^{l_1} \circ (h_1(T))\big)(u_1) = 0$.
Since $p(x) = (q_1(x))^{l_1} h_1(x)$, we have
\[ 0 = p(T) = (q_1(T))^{l_1} \circ h_1(T), \]
and hence $(q_1(T))^{l_1} \circ h_1(T) = 0$. It follows that
\[ \text{LHS} = \big((q_1(T))^{l_1} \circ (h_1(T))\big)(u_1) = (0)(u_1) = 0 = \text{RHS}. \]
■
Similarly, $(h_2(T))(V) \subset V_2$, etc. Now, since $w_1, w_2, \ldots, w_k \in V$, we have $(h_1(T))(w_1) \in V_1$, $(h_2(T))(w_2) \in V_2, \ldots, (h_k(T))(w_k) \in V_k$. Hence from (**), for every $v \in V$, there exist $v_1 \in V_1, v_2 \in V_2, \ldots, v_k \in V_k$ such that
\[ v = v_1 + v_2 + \cdots + v_k. \]
This proves that
\[ V \subset V_1 + V_2 + \cdots + V_k. \]
Now, from (*), we have
\[ V = V_1 + V_2 + \cdots + V_k. \qquad ({*}{*}{*}) \]
Observe that if $v \in V_2$, then $((q_2(T))^{l_2})(v) = 0$, and hence
\[ (h_1(T))(v) = \big((q_3(T))^{l_3} \cdots (q_k(T))^{l_k}\big)\big(((q_2(T))^{l_2})(v)\big) = \big((q_3(T))^{l_3} \cdots (q_k(T))^{l_k}\big)(0) = 0. \]
Thus
\[ v_2 \in V_2 \;\Rightarrow\; (h_1(T))(v_2) = 0. \]
Similarly,
\[ v_3 \in V_3 \;\Rightarrow\; (h_1(T))(v_3) = 0, \]
etc. In short, for any distinct $i, j \in \{1, 2, \ldots, k\}$, we have $(h_i(T))(V_j) = \{0\}$.
We claim that
\[ V = V_1 \oplus V_2 \oplus \cdots \oplus V_k. \]
In view of (***), it suffices to show that if each $v_i$ is in $V_i$, and $v_1 + v_2 + \cdots + v_k = 0$, then each $v_i$ equals $0$.
Suppose to the contrary that there exist $v_i \in V_i$ $(i = 1, 2, \ldots, k)$ such that $v_1 + v_2 + \cdots + v_k = 0$ and $v_1 \neq 0$. We seek a contradiction.
Since $v_1 + v_2 + \cdots + v_k = 0$, we have
\[
\begin{aligned}
\big((q_2(T))^{l_2} (q_3(T))^{l_3} \cdots (q_k(T))^{l_k}\big)(v_1) &= (h_1(T))(v_1) + 0 + \cdots + 0 \\
&= (h_1(T))(v_1) + (h_1(T))(v_2) + \cdots + (h_1(T))(v_k) \\
&= (h_1(T))(v_1 + v_2 + \cdots + v_k) = (h_1(T))(0) = 0,
\end{aligned}
\]
and hence
\[ \big((q_2(T))^{l_2} \circ (q_3(T))^{l_3} \circ \cdots \circ (q_k(T))^{l_k}\big)(v_1) = 0. \]
Since $v_1 \in V_1 = \{ v : v \in V \text{ and } ((q_1(T))^{l_1})(v) = 0 \}$, we have $((q_1(T))^{l_1})(v_1) = 0$.
Observe that the greatest common divisor of $(q_1(x))^{l_1}$ and $(q_2(x))^{l_2} (q_3(x))^{l_3} \cdots (q_k(x))^{l_k}$ is $1$. So by 1.1.5, there exist $b_1(x), b_2(x) \in F[x]$ such that
\[ 1 = b_1(x) \cdot (q_1(x))^{l_1} + b_2(x) \cdot (q_2(x))^{l_2} (q_3(x))^{l_3} \cdots (q_k(x))^{l_k}, \]
and hence
\[ I = (b_1(T)) \circ (q_1(T))^{l_1} + (b_2(T)) \circ \big((q_2(T))^{l_2} \circ (q_3(T))^{l_3} \circ \cdots \circ (q_k(T))^{l_k}\big). \]
It follows that
\[
\begin{aligned}
v_1 = I(v_1) &= \Big((b_1(T)) \circ (q_1(T))^{l_1} + (b_2(T)) \circ \big((q_2(T))^{l_2} \circ (q_3(T))^{l_3} \circ \cdots \circ (q_k(T))^{l_k}\big)\Big)(v_1) \\
&= (b_1(T))\big(((q_1(T))^{l_1})(v_1)\big) + (b_2(T))\Big(\big((q_2(T))^{l_2} \circ (q_3(T))^{l_3} \circ \cdots \circ (q_k(T))^{l_k}\big)(v_1)\Big) \\
&= (b_1(T))\big(((q_1(T))^{l_1})(v_1)\big) + (b_2(T))(0) \\
&= (b_1(T))(0) + (b_2(T))(0) = 0 + 0 = 0,
\end{aligned}
\]
and hence $v_1 = 0$. This is a contradiction.
Thus we have shown that $V = V_1 \oplus V_2 \oplus \cdots \oplus V_k$.
Observe that $(q_1(T|_{V_1}))^{l_1} = 0$.
Proof To show this, let us take an arbitrary $v_1 \in V_1$. We have to show that $(q_1(T|_{V_1}))^{l_1}(v_1) = 0$.
Since
\[ v_1 \in V_1 = \{ v : v \in V \text{ and } ((q_1(T))^{l_1})(v) = 0 \}, \]
we have $((q_1(T))^{l_1})(v_1) = 0$. Now, since $q(T|_{V_1}) = q(T)|_{V_1}$ for every polynomial $q(x)$ (as shown in the proof of 3.2.1),
\[ \text{LHS} = (q_1(T|_{V_1}))^{l_1}(v_1) = ((q_1(T))^{l_1})(v_1) = 0 = \text{RHS}. \]
■
It follows that the minimal polynomial of $T|_{V_1}$ is of the form $(q_1(x))^{m_1}$, where $m_1$ is a positive integer $\leq l_1$. Similarly, the minimal polynomial of $T|_{V_2}$ is of the form $(q_2(x))^{m_2}$, where $m_2$ is a positive integer $\leq l_2$, etc. Now, since $q_1(x), q_2(x), \ldots, q_k(x)$ are distinct irreducible polynomials, the least common multiple of $(q_1(x))^{m_1}, (q_2(x))^{m_2}, \ldots, (q_k(x))^{m_k}$ is $(q_1(x))^{m_1} (q_2(x))^{m_2} \cdots (q_k(x))^{m_k}$.
By 3.2.1, the minimal polynomial of $T$ is the least common multiple of $(q_1(x))^{m_1}, (q_2(x))^{m_2}, \ldots, (q_k(x))^{m_k}$, and hence the minimal polynomial of $T$ is $(q_1(x))^{m_1} (q_2(x))^{m_2} \cdots (q_k(x))^{m_k}$. Since
\[ (q_1(x))^{l_1} (q_2(x))^{l_2} \cdots (q_k(x))^{l_k} \]
is the minimal polynomial of $T$ over $F$, and each $m_i \leq l_i$, we have $m_1 = l_1, \ldots, m_k = l_k$. Since the minimal polynomial of $T|_{V_1}$ is of the form $(q_1(x))^{m_1}$, and $m_1 = l_1$, the minimal polynomial of $T|_{V_1}$ is $(q_1(x))^{l_1}$. Similarly, the minimal polynomial of $T|_{V_2}$ is $(q_2(x))^{l_2}$, etc.
3.2.3 Conclusion Let $V$ be any $n$-dimensional vector space over the field $F$. Let $T \in A(V)$. Let $p(x) \in F[x]$. Let $p(x)$ be the minimal polynomial of $T$ over $F$. Suppose that
\[ p(x) = (q_1(x))^{l_1} (q_2(x))^{l_2} \cdots (q_k(x))^{l_k}, \]
where $q_1(x), q_2(x), \ldots, q_k(x)$ are distinct irreducible polynomials (of degree $\geq 1$) in $F[x]$, and $l_1, l_2, \ldots, l_k$ are positive integers. Put
\[ V_1 \equiv \{ v : v \in V \text{ and } ((q_1(T))^{l_1})(v) = 0 \}, \quad V_2 \equiv \{ v : v \in V \text{ and } ((q_2(T))^{l_2})(v) = 0 \}, \quad \text{etc.} \]
Then,
1. each $V_i$ is a nontrivial linear subspace of $V$,
2. each $V_i$ is invariant under $T$,
3. $V = V_1 \oplus V_2 \oplus \cdots \oplus V_k$,
4. for each $i = 1, 2, \ldots, k$, the minimal polynomial of $T|_{V_i}$ is $(q_i(x))^{l_i}$.
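The decomposition in 3.2.3 can be traced through on a small example. The matrix $T$ below (relative to a basis $e_1, e_2, e_3$ of a 3-dimensional space over, say, $F = \mathbb{R}$) is an illustrative choice for this example; the display assumes `amsmath`. Here $q_1(x) = x - 1$, $l_1 = 2$, $q_2(x) = x - 2$, $l_2 = 1$.

```latex
\[
T = \begin{pmatrix} 1 & 1 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 2 \end{pmatrix},
\qquad
p(x) = (x - 1)^2 (x - 2).
\]
\begin{align*}
(T - I)(T - 2I) &=
\begin{pmatrix} 0 & 1 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & 1 \end{pmatrix}
\begin{pmatrix} -1 & 1 & 0 \\ 0 & -1 & 0 \\ 0 & 0 & 0 \end{pmatrix}
= \begin{pmatrix} 0 & -1 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix} \neq 0, \\
(T - I)^2 (T - 2I) &=
\begin{pmatrix} 0 & 0 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & 1 \end{pmatrix}
\begin{pmatrix} -1 & 1 & 0 \\ 0 & -1 & 0 \\ 0 & 0 & 0 \end{pmatrix}
= 0, \\
V_1 &= \{ v : (T - I)^2 (v) = 0 \} = \operatorname{span}\{e_1, e_2\}, \\
V_2 &= \{ v : (T - 2I)(v) = 0 \} = \operatorname{span}\{e_3\}.
\end{align*}
```

Both subspaces are nontrivial and invariant under $T$, and $V = V_1 \oplus V_2$. The restriction $T|_{V_1}$ has minimal polynomial $(x - 1)^2$ (since $T - I$ is not zero on $V_1$), and $T|_{V_2}$ has minimal polynomial $x - 2$, in agreement with item 4.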
3.2.4 Problem Let $V$ be any $n$-dimensional vector space over the field $F$. Let $T \in A(V)$. Let $\lambda \in F$. Suppose that $\lambda$ is an eigenvalue of $T$. Then $(\lambda I - T) : V \to V$ is not invertible.
Proof Since $\lambda$ is an eigenvalue of $T$, there exists a nonzero $v \in V$ such that $T(v) = \lambda v$, and hence $(\lambda I - T)(v) = 0$ $(= (\lambda I - T)(0))$. Here $v \neq 0$ and $(\lambda I - T)(v) = (\lambda I - T)(0)$, so $(\lambda I - T) : V \to V$ is not one-to-one, and hence $(\lambda I - T)$ is not invertible. ■
3.2.5 Problem Let $V$ be any $n$-dimensional vector space over the field $F$. Let $T \in A(V)$. Let $\lambda \in F$. Suppose that $(\lambda I - T) : V \to V$ is not invertible, that is, $(\lambda I - T)$ is singular. Then $\lambda$ is an eigenvalue of $T$.
Proof Here $(\lambda I - T) : V \to V$ is not invertible, that is, $(\lambda I - T)^{-1}$ does not exist, so by 3.1.31, there exists a nonzero $v \in V$ such that $(\lambda I - T)(v) = 0$. It follows that $T(v) = \lambda v$, where $v \neq 0$. Hence $\lambda$ is an eigenvalue of $T$. ■
3.2.6 Problem Let $V$ be any $n$-dimensional vector space over the field $F$. Let $T \in A(V)$. Let $\lambda \in F$. Suppose that $\lambda$ is an eigenvalue of $T$. Let $q(x) \in F[x]$. (It follows that $q(\lambda) \in F$ and $q(T) \in A(V)$.) Then $q(\lambda)$ is an eigenvalue of $q(T)$.
Proof Suppose that
\[ q(x) = \alpha_0 + \alpha_1 x + \cdots + \alpha_n x^n, \]
where each $\alpha_i$ is in $F$. It follows that
\[ q(\lambda) = \alpha_0 + \alpha_1 \lambda + \cdots + \alpha_n \lambda^n \]
and
\[ q(T) = \alpha_0 I + \alpha_1 T + \cdots + \alpha_n T^n. \]
Since $\lambda$ is an eigenvalue of $T$, there exists a nonzero $v \in V$ such that $T(v) = \lambda v$. It suffices to show that $(q(T))(v) = (q(\lambda))v$, that is,
\[ \alpha_0 I(v) + \alpha_1 T(v) + \alpha_2 T(T(v)) + \alpha_3 T(T(T(v))) + \cdots + \alpha_n T^n(v) = (\alpha_0 + \alpha_1 \lambda + \cdots + \alpha_n \lambda^n) v. \]
\[
\begin{aligned}
\text{LHS} &= \alpha_0 I(v) + \alpha_1 T(v) + \alpha_2 T(T(v)) + \alpha_3 T(T(T(v))) + \cdots + \alpha_n T^n(v) \\
&= \alpha_0 v + \alpha_1 T(v) + \alpha_2 T(T(v)) + \alpha_3 T(T(T(v))) + \cdots + \alpha_n T^n(v) \\
&= \alpha_0 v + \alpha_1 (\lambda v) + \alpha_2 T(T(v)) + \alpha_3 T(T(T(v))) + \cdots + \alpha_n T^{n-1}(T(v)) \\
&= \alpha_0 v + \alpha_1 (\lambda v) + \alpha_2 T(\lambda v) + \alpha_3 T(T(\lambda v)) + \cdots + \alpha_n T^{n-1}(\lambda v) \\
&= \alpha_0 v + \alpha_1 (\lambda v) + \alpha_2 \lambda T(v) + \alpha_3 \lambda T(T(v)) + \cdots + \alpha_n \lambda T^{n-1}(v) \\
&= \alpha_0 v + \alpha_1 (\lambda v) + \alpha_2 \lambda (\lambda v) + \alpha_3 \lambda T(\lambda v) + \cdots + \alpha_n \lambda T^{n-2}(\lambda v) \\
&= \alpha_0 v + \alpha_1 (\lambda v) + \alpha_2 (\lambda^2 v) + \alpha_3 \lambda^2 T(v) + \cdots + \alpha_n \lambda^2 T^{n-2}(v) \\
&\;\;\vdots \\
&= \alpha_0 v + \alpha_1 (\lambda v) + \alpha_2 (\lambda^2 v) + \alpha_3 (\lambda^3 v) + \cdots + \alpha_n (\lambda^n v) \\
&= (\alpha_0 + \alpha_1 \lambda + \alpha_2 \lambda^2 + \alpha_3 \lambda^3 + \cdots + \alpha_n \lambda^n) v = \text{RHS}.
\end{aligned}
\]
■
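As a quick numerical check of 3.2.6, the following worked computation uses an illustrative matrix $T$ and polynomial $q(x)$ (both chosen for this example, not taken from the text; the display assumes `amsmath`).

```latex
\[
T = \begin{pmatrix} 2 & 1 \\ 0 & 3 \end{pmatrix},
\qquad
v = \begin{pmatrix} 1 \\ 0 \end{pmatrix},
\qquad
T(v) = 2v,
\qquad
q(x) = x^2 + 1 .
\]
\begin{align*}
q(T) = T^2 + I
  &= \begin{pmatrix} 4 & 5 \\ 0 & 9 \end{pmatrix}
   + \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}
   = \begin{pmatrix} 5 & 5 \\ 0 & 10 \end{pmatrix}, \\
(q(T))(v) &= \begin{pmatrix} 5 \\ 0 \end{pmatrix} = 5v = q(2)\,v .
\end{align*}
```

So $q(2) = 5$ is an eigenvalue of $q(T)$, with the same eigenvector $v$ that belongs to the eigenvalue $2$ of $T$.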
3.2.7 Problem Let $V$ be any $n$-dimensional vector space over the field $F$. Let $T \in A(V)$. Let $p(x)$ be the minimal polynomial of $T$ over $F$. Let $\lambda \in F$. Suppose that $\lambda$ is an eigenvalue of $T$. Then $\lambda$ is a root of $p(x)$, that is, $p(\lambda) = 0$.
(Since the number of roots of $p(x)$ is finite, the number of eigenvalues of $T$ is finite.)
Proof Since $p(x)$ is the minimal polynomial of $T$ over $F$, we have $p(x) \in F[x]$ and $p(T) = 0$. Suppose that
$$p(x) = a_0 + a_1 x + \cdots + a_n x^n,$$
where each $a_i$ is in $F$. It follows that
$$p(\lambda) = a_0 + a_1\lambda + \cdots + a_n\lambda^n$$
and
$$p(T) = a_0 I + a_1 T + \cdots + a_n T^n.$$
Since $\lambda$ is an eigenvalue of $T$, there exists a nonzero $v \in V$ such that $T(v) = \lambda v$. We claim that $(p(T))(v) = (p(\lambda))v$, that is,
$$a_0 I(v) + a_1 T(v) + a_2 T(T(v)) + a_3 T(T(T(v))) + \cdots + a_n T^n(v) = (a_0 + a_1\lambda + \cdots + a_n\lambda^n)v.$$
$$\begin{aligned}
\text{LHS} &= a_0 I(v) + a_1 T(v) + a_2 T(T(v)) + a_3 T(T(T(v))) + \cdots + a_n T^n(v) \\
&= a_0 v + a_1 T(v) + a_2 T(T(v)) + a_3 T(T(T(v))) + \cdots + a_n T^n(v) \\
&= a_0 v + a_1(\lambda v) + a_2 T(T(v)) + a_3 T(T(T(v))) + \cdots + a_n T^{n-1}(T(v)) \\
&= a_0 v + a_1(\lambda v) + a_2 T(\lambda v) + a_3 T(T(\lambda v)) + \cdots + a_n T^{n-1}(\lambda v) \\
&= a_0 v + a_1(\lambda v) + a_2 \lambda T(v) + a_3 \lambda T(T(v)) + \cdots + a_n \lambda T^{n-1}(v) \\
&= a_0 v + a_1(\lambda v) + a_2 \lambda(\lambda v) + a_3 \lambda T(\lambda v) + \cdots + a_n \lambda T^{n-2}(\lambda v) \\
&= a_0 v + a_1(\lambda v) + a_2(\lambda^2 v) + a_3 \lambda^2 T(v) + \cdots + a_n \lambda^2 T^{n-2}(v) \\
&\;\;\vdots \\
&= a_0 v + a_1(\lambda v) + a_2(\lambda^2 v) + a_3(\lambda^3 v) + \cdots + a_n(\lambda^n v) \\
&= (a_0 + a_1\lambda + a_2\lambda^2 + a_3\lambda^3 + \cdots + a_n\lambda^n)v = \text{RHS}.
\end{aligned}$$
Thus, $0 = 0(v) = (p(T))(v) = (p(\lambda))v$. Now, since $(p(\lambda))v = 0$ and $v$ is nonzero, we have $p(\lambda) = 0$. ■
3.2.8 Problem Let $V$ be any $n$-dimensional vector space over the field $F$. Let $S, T \in A(V)$. Let $S : V \to V$ be invertible. Then $T$ and $S^{-1} \circ T \circ S$ have the same minimal polynomial.
Proof Let $p(x)$ be the minimal polynomial of $T$, and $q(x)$ the minimal polynomial of $S^{-1} \circ T \circ S$. It suffices to show that

1. $q(x) \mid p(x)$, that is, $p(S^{-1} \circ T \circ S) = 0$,
2. $p(x) \mid q(x)$, that is, $q(T) = 0$.

For 1: Since $p(x)$ is the minimal polynomial of $T$, we have $p(T) = 0$. Suppose that
$$p(x) = a_0 + a_1 x + \cdots + a_n x^n,$$
where each $a_i$ is in $F$. It follows that
$$0 = p(T) = a_0 I + a_1 T + \cdots + a_n T^n$$
and
$$p(S^{-1} \circ T \circ S) = a_0 I + a_1 (S^{-1} \circ T \circ S) + a_2 (S^{-1} \circ T \circ S) \circ (S^{-1} \circ T \circ S) + a_3 (S^{-1} \circ T \circ S) \circ (S^{-1} \circ T \circ S) \circ (S^{-1} \circ T \circ S) + \cdots + a_n (S^{-1} \circ T \circ S)^n.$$
Hence
$$p(S^{-1} \circ T \circ S) = a_0 I + a_1 (S^{-1} \circ T \circ S) + a_2 (S^{-1} \circ T^2 \circ S) + a_3 (S^{-1} \circ T^3 \circ S) + \cdots + a_n (S^{-1} \circ T^n \circ S).$$
Now, since
$$\begin{aligned}
&a_0 I + a_1 (S^{-1} \circ T \circ S) + a_2 (S^{-1} \circ T^2 \circ S) + a_3 (S^{-1} \circ T^3 \circ S) + \cdots + a_n (S^{-1} \circ T^n \circ S) \\
&\quad= S^{-1} \circ (a_0 I) \circ S + S^{-1} \circ (a_1 T) \circ S + S^{-1} \circ (a_2 T^2) \circ S + S^{-1} \circ (a_3 T^3) \circ S + \cdots + S^{-1} \circ (a_n T^n) \circ S \\
&\quad= S^{-1} \circ (a_0 I + a_1 T + a_2 T^2 + a_3 T^3 + \cdots + a_n T^n) \circ S \\
&\quad= S^{-1} \circ (p(T)) \circ S = S^{-1} \circ 0 \circ S = 0,
\end{aligned}$$
we have $p(S^{-1} \circ T \circ S) = 0$.

For 2: Since $q(x)$ is the minimal polynomial of $S^{-1} \circ T \circ S$, we have $q(S^{-1} \circ T \circ S) = 0$. Suppose that
$$q(x) = b_0 + b_1 x + \cdots + b_m x^m,$$
where each $b_i$ is in $F$. It follows that
$$q(T) = b_0 I + b_1 T + b_2 T^2 + \cdots + b_m T^m$$
and
$$\begin{aligned}
0 = q(S^{-1} \circ T \circ S) &= b_0 I + b_1 (S^{-1} \circ T \circ S) + \cdots + b_m (S^{-1} \circ T \circ S)^m \\
&= b_0 (S^{-1} \circ I \circ S) + b_1 (S^{-1} \circ T \circ S) + \cdots + b_m (S^{-1} \circ T^m \circ S) \\
&= S^{-1} \circ (b_0 I) \circ S + S^{-1} \circ (b_1 T) \circ S + \cdots + S^{-1} \circ (b_m T^m) \circ S \\
&= S^{-1} \circ (b_0 I + b_1 T + b_2 T^2 + \cdots + b_m T^m) \circ S,
\end{aligned}$$
and hence
$$0 = S^{-1} \circ (b_0 I + b_1 T + b_2 T^2 + \cdots + b_m T^m) \circ S.$$
It follows that
$$q(T) = b_0 I + b_1 T + b_2 T^2 + \cdots + b_m T^m = S \circ 0 \circ S^{-1} = 0,$$
and hence $q(T) = 0$. ■
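The key identity in the proof of 3.2.8, namely $p(S^{-1} \circ T \circ S) = S^{-1} \circ p(T) \circ S$, can be checked on a small example. The sketch below is only an illustration: the matrices `T`, `S`, `S_inv`, the polynomial `p`, and the helper names are hypothetical choices, with `p` picked so that `p(T)` is the zero matrix.

```python
def mat_mul(A, B):
    """Product of two square matrices given as lists of rows."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def identity(n):
    """The n x n identity matrix."""
    return [[int(i == j) for j in range(n)] for i in range(n)]

def poly_of_matrix(coeffs, A):
    """Return a0*I + a1*A + ... + an*A^n for coeffs = [a0, a1, ..., an]."""
    n = len(A)
    result = [[0] * n for _ in range(n)]
    power = identity(n)
    for a in coeffs:
        result = [[result[i][j] + a * power[i][j] for j in range(n)]
                  for i in range(n)]
        power = mat_mul(power, A)
    return result

# Hypothetical data: T is diagonal with entries 2 and 3,
# and p(x) = (x - 2)(x - 3) = 6 - 5x + x^2 annihilates T.
T = [[2, 0], [0, 3]]
S = [[1, 0], [1, 1]]
S_inv = [[1, 0], [-1, 1]]
p = [6, -5, 1]

B = mat_mul(mat_mul(S_inv, T), S)                     # S^{-1} T S
p_of_B = poly_of_matrix(p, B)                         # p(S^{-1} T S)
conjugated = mat_mul(mat_mul(S_inv, poly_of_matrix(p, T)), S)  # S^{-1} p(T) S
print(p_of_B == conjugated)
```

On this example both `p_of_B` and `conjugated` come out as the zero matrix, in line with part 1 of the proof.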
Definition Let $V$ be any $n$-dimensional vector space over the field $F$. Let $T \in A(V)$. Let $\lambda \in F$. Suppose that $\lambda$ is an eigenvalue of $T$. Let $v$ be a nonzero vector in $V$. If $T(v) = \lambda v$, then we say that $v$ is an eigenvector of $T$ belonging to the eigenvalue $\lambda$.
3.2.9 Problem Let $V$ be any $n$-dimensional vector space over the field $F$. Let $T \in A(V)$. Let $\lambda_1, \lambda_2, \ldots, \lambda_k$ be distinct eigenvalues of $T$. Suppose that for every $i \in \{1, 2, \ldots, k\}$, $v_i$ is an eigenvector of $T$ belonging to the eigenvalue $\lambda_i$. Then $\{v_1, v_2, \ldots, v_k\}$ is a linearly independent set of vectors over $F$.
Proof Suppose to the contrary (after suitably rearranging the indices) that
$$v_1 = \underbrace{a_2 v_2 + a_3 v_3 + \cdots + a_l v_l}_{(l-1)\ \text{terms}}$$
is the shortest linear relation, where $a_2, a_3, \ldots, a_l$ are nonzero members of $F$. We seek a contradiction.
Since $v_1 = a_2 v_2 + a_3 v_3 + \cdots + a_l v_l$, we have
$$\begin{aligned}
\lambda_1 a_2 v_2 + \lambda_1 a_3 v_3 + \cdots + \lambda_1 a_l v_l &= \lambda_1 (a_2 v_2 + a_3 v_3 + \cdots + a_l v_l) = \lambda_1 v_1 \\
&= T(v_1) = T(a_2 v_2 + a_3 v_3 + \cdots + a_l v_l) \\
&= a_2 T(v_2) + a_3 T(v_3) + \cdots + a_l T(v_l) = a_2 \lambda_2 v_2 + a_3 \lambda_3 v_3 + \cdots + a_l \lambda_l v_l,
\end{aligned}$$
and hence
$$\lambda_1 a_2 v_2 + \lambda_1 a_3 v_3 + \cdots + \lambda_1 a_l v_l = a_2 \lambda_2 v_2 + a_3 \lambda_3 v_3 + \cdots + a_l \lambda_l v_l.$$
This shows that
$$(\lambda_1 - \lambda_2) a_2 v_2 + (\lambda_1 - \lambda_3) a_3 v_3 + \cdots + (\lambda_1 - \lambda_l) a_l v_l = 0.$$
Since $\lambda_1 \neq \lambda_2$ and $a_2 \neq 0$, we have
$$v_2 = \underbrace{-\frac{(\lambda_1 - \lambda_3) a_3}{(\lambda_1 - \lambda_2) a_2} v_3 + \cdots + \left(-\frac{(\lambda_1 - \lambda_l) a_l}{(\lambda_1 - \lambda_2) a_2}\right) v_l}_{(l-2)\ \text{terms}}.$$
This contradicts the fact that
$$v_1 = \underbrace{a_2 v_2 + a_3 v_3 + \cdots + a_l v_l}_{(l-1)\ \text{terms}}$$
is the shortest linear relation. ■
3.2.10 Problem Let $V$ be any $n$-dimensional vector space over the field $F$. Let $T \in A(V)$. Then the number of distinct eigenvalues of $T$ is $\leq n$.
Proof Let $\lambda_1, \lambda_2, \ldots, \lambda_k$ be distinct eigenvalues of $T$. We have to show that $k \leq n$, that is, $k \leq \dim(V)$.
Suppose that for every $i \in \{1, 2, \ldots, k\}$, $v_i$ is an eigenvector of $T$ belonging to the eigenvalue $\lambda_i$. By 3.2.9, $\{v_1, v_2, \ldots, v_k\}$ is a linearly independent set of vectors over $F$. It follows that the number of elements in $\{v_1, v_2, \ldots, v_k\}$ is $\leq \dim(V)$. Since $\{v_1, v_2, \ldots, v_k\}$ is a linearly independent set of vectors, the number of elements in $\{v_1, v_2, \ldots, v_k\}$ is $k$, and hence $k \leq \dim(V)$. ■
3.2.11 Problem Let $V$ be any $n$-dimensional vector space over the field $F$. Let $T \in A(V)$. Suppose that $T$ has $n$ distinct eigenvalues in $F$. Then there exists a basis of $V$ over $F$ such that each member of the basis is an eigenvector of $T$.
Proof Let $\lambda_1, \lambda_2, \ldots, \lambda_n$ be distinct eigenvalues of $T$. Suppose that for every $i \in \{1, 2, \ldots, n\}$, $v_i$ is an eigenvector of $T$ belonging to the eigenvalue $\lambda_i$. By 3.2.9, $\{v_1, v_2, \ldots, v_n\}$ is a linearly independent set of vectors over $F$. Now, since the number of elements in $\{v_1, v_2, \ldots, v_n\}$ is equal to $\dim(V)$, $\{v_1, v_2, \ldots, v_n\}$ constitutes a basis of $V$. ■
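To make 3.2.9 and 3.2.11 concrete, the following Python sketch takes a 3 x 3 matrix with three distinct eigenvalues, verifies three hand-computed eigenvectors, and checks that they are linearly independent by computing a determinant. The matrix `T`, the eigenpairs, and the helper names are illustrative assumptions, not data from the text.

```python
from fractions import Fraction

def mat_vec(A, v):
    """Apply the matrix A to the column vector v."""
    return [sum(A[i][k] * v[k] for k in range(len(v))) for i in range(len(A))]

def det(M):
    """Determinant by Gaussian elimination over the rationals."""
    A = [[Fraction(x) for x in row] for row in M]
    n = len(A)
    d = Fraction(1)
    for c in range(n):
        pivot = next((r for r in range(c, n) if A[r][c] != 0), None)
        if pivot is None:
            return Fraction(0)          # no pivot in this column: singular
        if pivot != c:
            A[c], A[pivot] = A[pivot], A[c]
            d = -d                      # a row swap flips the sign
        d *= A[c][c]
        for r in range(c + 1, n):
            f = A[r][c] / A[c][c]
            A[r] = [A[r][j] - f * A[c][j] for j in range(n)]
    return d

# Hypothetical example: eigenvalues 1, 2, 3 are distinct.
T = [[1, 1, 0], [0, 2, 1], [0, 0, 3]]
eigenpairs = [(1, [1, 0, 0]), (2, [1, 1, 0]), (3, [1, 2, 2])]

# Each pair really satisfies T v = lam v.
all_eigen = all(mat_vec(T, v) == [lam * c for c in v] for lam, v in eigenpairs)

# The eigenvectors (as rows) have a nonzero determinant, so they are
# linearly independent and form a basis of the 3-dimensional space.
d = det([v for _, v in eigenpairs])
print(all_eigen, d != 0)
```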
Definition Let $V$ be any $n$-dimensional vector space over the field $F$. Let $S, T \in A(V)$. If there exists $C \in A(V)$ such that $C^{-1}$ exists, and $C^{-1} \circ S \circ C = T$, then we say that $S$ is similar to $T$, and we write $S \sim T$.
3.2.12 Problem $\sim$ is an equivalence relation over $A(V)$.
And hence $A(V)$ is partitioned into equivalence classes. Each equivalence class is called a similarity class.
Proof

a. Let us take an arbitrary $T \in A(V)$. Since $I^{-1} \circ T \circ I = T$, we have $T \sim T$. Thus $\sim$ is reflexive over $A(V)$.
b. Let us take arbitrary $S, T \in A(V)$ satisfying $S \sim T$. It follows that there exists $C \in A(V)$ such that $C^{-1}$ exists, and $C^{-1} \circ S \circ C = T$. It follows that $(C^{-1})^{-1}\ (= C \in A(V))$ exists and $C \circ T \circ C^{-1} = S$, that is, $(C^{-1})^{-1} \circ T \circ C^{-1} = S$. Thus $T \sim S$. Hence $\sim$ is symmetric.
c. Let us take arbitrary $R, S, T \in A(V)$ satisfying $R \sim S$ and $S \sim T$. We have to show that $R \sim T$. Since $R \sim S$, there exists $C \in A(V)$ such that $C^{-1}$ exists, and $C^{-1} \circ R \circ C = S$. Again, since $S \sim T$, there exists $D \in A(V)$ such that $D^{-1}$ exists, and $D^{-1} \circ S \circ D = T$. Now
$$(C \circ D)^{-1} \circ R \circ (C \circ D) = (D^{-1} \circ C^{-1}) \circ R \circ (C \circ D) = D^{-1} \circ (C^{-1} \circ R \circ C) \circ D = D^{-1} \circ S \circ D = T.$$
Thus $E^{-1} \circ R \circ E = T$, where $E \equiv (C \circ D) \in A(V)$, and hence $R \sim T$.

This proves that $\sim$ is an equivalence relation over $A(V)$. ■
3.2.13 Problem Let $V$ be any $n$-dimensional vector space over $F$. Let $T \in A(V)$. Let $W$ be any subspace of $V$. Suppose that $W$ is invariant under $T$. Then $\hat{T} : (v + W) \mapsto (T(v) + W)$ from the quotient space $V/W$ to $V/W$ is a linear transformation.
Proof $\hat{T}$ is a well-defined function from the quotient space $V/W$ to $V/W$: To show this, let us take arbitrary $u, v \in V$ such that $(u + W) = (v + W)$, that is, $(u - v) \in W$. We have to show that $(T(u) + W) = (T(v) + W)$, that is, $(T(u) - T(v)) \in W$, that is, $T(u - v) \in W$. Since $(u - v) \in W$ and $W$ is invariant under $T$, we have $T(u - v) \in W$.
$\hat{T} : V/W \to V/W$ is linear: To show this, let us take arbitrary $u, v \in V$ and $a, b \in F$. We have to show that
$$T(au + bv) + W = a(T(u) + W) + b(T(v) + W).$$
$$\text{LHS} = T(au + bv) + W = (aT(u) + bT(v)) + W = a(T(u) + W) + b(T(v) + W) = \text{RHS}.$$
■
3.2.14 Problem Let $V$ be any $n$-dimensional vector space over $F$. Let $T \in A(V)$. Let $W$ be any subspace of $V$. Suppose that $W$ is invariant under $T$. Let $p(x) \in F[x]$. Here $\hat{T}$, as defined in 3.2.13, is a member of $A(V/W)$, and hence $p(\hat{T}) \in A(V/W)$. Also $p(T) \in A(V)$. Suppose that $p(T)$ is the zero element of $A(V)$. Then $p(\hat{T})$ is the zero element of $A(V/W)$.
Proof Suppose that
$$p(x) = a_0 + a_1 x + \cdots + a_n x^n,$$
where each $a_i$ is in $F$. It follows that
$$A(V) \ni 0 = p(T) = a_0 I + a_1 T + \cdots + a_n T^n$$
and
$$p(\hat{T}) = a_0 \hat{I} + a_1 \hat{T} + a_2 \hat{T} \circ \hat{T} + a_3 \hat{T} \circ \hat{T} \circ \hat{T} + \cdots + a_n \hat{T}^n.$$
We have to show that $a_0 \hat{I} + a_1 \hat{T} + a_2 \hat{T} \circ \hat{T} + a_3 \hat{T} \circ \hat{T} \circ \hat{T} + \cdots + a_n \hat{T}^n$ is the zero element of $A(V/W)$. To this end, let us take an arbitrary $v \in V$. We have to show that
$$\left(a_0 \hat{I} + a_1 \hat{T} + a_2 \hat{T} \circ \hat{T} + a_3 \hat{T} \circ \hat{T} \circ \hat{T} + \cdots + a_n \hat{T}^n\right)(v + W) = (0 + W),$$
that is,
$$a_0 \hat{I}(v + W) + a_1 \hat{T}(v + W) + a_2 (\hat{T} \circ \hat{T})(v + W) + a_3 (\hat{T} \circ \hat{T} \circ \hat{T})(v + W) + \cdots + a_n (\hat{T}^n)(v + W) = (0 + W).$$
$$\begin{aligned}
\text{LHS} &= a_0 \hat{I}(v + W) + a_1 \hat{T}(v + W) + a_2 (\hat{T} \circ \hat{T})(v + W) + a_3 (\hat{T} \circ \hat{T} \circ \hat{T})(v + W) + \cdots + a_n (\hat{T}^n)(v + W) \\
&= a_0 (I(v) + W) + a_1 (T(v) + W) + a_2 \hat{T}(\hat{T}(v + W)) + a_3 (\hat{T} \circ \hat{T})(\hat{T}(v + W)) + \cdots + a_n (\hat{T}^{n-1})(\hat{T}(v + W)) \\
&= a_0 (v + W) + a_1 (T(v) + W) + a_2 \hat{T}(T(v) + W) + a_3 (\hat{T} \circ \hat{T})(T(v) + W) + \cdots + a_n (\hat{T}^{n-1})(T(v) + W) \\
&= a_0 (v + W) + a_1 (T(v) + W) + a_2 (T(T(v)) + W) + a_3 \hat{T}(T(T(v)) + W) + \cdots + a_n (\hat{T}^{n-2})(T(T(v)) + W) \\
&= a_0 (v + W) + a_1 (T(v) + W) + a_2 (T^2(v) + W) + a_3 \hat{T}(T^2(v) + W) + \cdots + a_n (\hat{T}^{n-2})(T^2(v) + W) \\
&= a_0 (v + W) + a_1 (T(v) + W) + a_2 (T^2(v) + W) + a_3 (T^3(v) + W) + \cdots + a_n (T^n(v) + W) \\
&= (a_0 v + W) + (a_1 T(v) + W) + (a_2 T^2(v) + W) + (a_3 T^3(v) + W) + \cdots + (a_n T^n(v) + W) \\
&= \left(a_0 v + a_1 T(v) + a_2 T^2(v) + a_3 T^3(v) + \cdots + a_n T^n(v)\right) + W \\
&= (a_0 I + a_1 T + \cdots + a_n T^n)(v) + W = 0(v) + W = 0 + W = \text{RHS}.
\end{aligned}$$
■
3.2.15 Problem Let $V$ be any $n$-dimensional vector space over $F$. Let $T \in A(V)$. Let $W$ be any subspace of $V$. Suppose that $W$ is invariant under $T$. Let $p(x), q(x) \in F[x]$. Here $\hat{T}$, as defined in 3.2.13, is a member of $A(V/W)$, and hence $p(\hat{T}) \in A(V/W)$. Also $p(T) \in A(V)$. Similarly, $q(\hat{T}) \in A(V/W)$ and $q(T) \in A(V)$. Suppose that $p(x)$ is the minimal polynomial of $T : V \to V$ over $F$, and $q(x)$ is the minimal polynomial of $\hat{T} : V/W \to V/W$ over $F$. Then $q(x) \mid p(x)$.
Proof Since $p(x)$ is the minimal polynomial of $T : V \to V$ over $F$, $p(T)$ is the zero element of $A(V)$, and hence by 3.2.14, $p(\hat{T})$ is the zero element of $A(V/W)$. Now, since $q(x)$ is the minimal polynomial of $\hat{T} : V/W \to V/W$ over $F$, we have $q(x) \mid p(x)$. ■
3.2.16 Theorem Let $V$ be any $n$-dimensional vector space over $F$. Let $T \in A(V)$. Suppose that all the roots of the minimal polynomial of $T$ over $F$ are in $F$. Then there exists a basis of $V$ in which the matrix of $T$ is such that all its entries above the diagonal are zero. In short, there exists a basis of $V$ in which the matrix of $T$ is triangular.
Proof (Induction on $n$): The theorem is trivially true when $n = 1$.
Now let us assume that the theorem is true for all vector spaces of dimension $(n - 1)$.
Let $\lambda_1 \in F$. Let $\lambda_1$ be an eigenvalue of $T$. Since $\lambda_1$ is an eigenvalue of $T$, there exists a nonzero vector $v_1 \in V$ such that $T(v_1) = \lambda_1 v_1$. Put
$$W \equiv \{a v_1 : a \in F\}.$$
Clearly, $W$ is a linear subspace of $V$. Also, since $v_1$ is nonzero, $W$ is a one-dimensional space. Next, since $T(v_1) = \lambda_1 v_1$, $W$ is invariant under $T$. It follows that
$$\dim(V/W) = \dim(V) - \dim(W) = \dim(V) - 1 = n - 1.$$
Thus $V/W$ is an $(n - 1)$-dimensional vector space over $F$.
Suppose that $p(x)$ is the minimal polynomial of $T : V \to V$ over $F$, and $q(x)$ is the minimal polynomial of $\hat{T} : V/W \to V/W$ over $F$, where $\hat{T} : (v + W) \mapsto (T(v) + W)$ from the quotient space $V/W$ to $V/W$ is a linear transformation. By 3.2.15, $q(x) \mid p(x)$, and hence every root of $q(x)$ is a root of $p(x)$. Now, since all the roots of the minimal polynomial $p(x)$ of $T$ over $F$ are in $F$, all the roots of $q(x)$ are in $F$.
Since $V/W$ is an $(n - 1)$-dimensional vector space, $\hat{T} \in A(V/W)$, and all the roots of the minimal polynomial $q(x)$ of $\hat{T}$ over $F$ are in $F$, it follows by our induction hypothesis that there exists a basis $\{v_2 + W, v_3 + W, \ldots, v_n + W\}$ of $V/W$ in which the matrix of $\hat{T}$ is such that all its entries above the diagonal are zero. So there exists a matrix
$$\begin{bmatrix}
a_{22} & a_{23} & a_{24} & \cdots & a_{2n} \\
a_{32} & a_{33} & a_{34} & \cdots & a_{3n} \\
\vdots & & & \ddots & \vdots \\
a_{n2} & a_{n3} & a_{n4} & \cdots & a_{nn}
\end{bmatrix}_{(n-1) \times (n-1)}$$
such that all its entries above the diagonal are zero, and
$$\begin{aligned}
\hat{T}(v_2 + W) &= a_{22}(v_2 + W), \\
\hat{T}(v_3 + W) &= a_{32}(v_2 + W) + a_{33}(v_3 + W), \\
&\;\;\vdots \\
\hat{T}(v_n + W) &= a_{n2}(v_2 + W) + a_{n3}(v_3 + W) + \cdots + a_{nn}(v_n + W).
\end{aligned}$$
It follows that
$$\begin{aligned}
T(v_2) + W &= a_{22} v_2 + W, \\
T(v_3) + W &= (a_{32} v_2 + a_{33} v_3) + W, \\
&\;\;\vdots \\
T(v_n) + W &= (a_{n2} v_2 + a_{n3} v_3 + \cdots + a_{nn} v_n) + W,
\end{aligned}$$
and hence
$$\left.\begin{aligned}
T(v_2) - a_{22} v_2 &\in W \\
T(v_3) - (a_{32} v_2 + a_{33} v_3) &\in W \\
&\;\;\vdots \\
T(v_n) - (a_{n2} v_2 + a_{n3} v_3 + \cdots + a_{nn} v_n) &\in W
\end{aligned}\right\}.$$
Thus
$$\left.\begin{aligned}
T(v_2) - a_{22} v_2 &\in \{a v_1 : a \in F\} \\
T(v_3) - (a_{32} v_2 + a_{33} v_3) &\in \{a v_1 : a \in F\} \\
&\;\;\vdots \\
T(v_n) - (a_{n2} v_2 + a_{n3} v_3 + \cdots + a_{nn} v_n) &\in \{a v_1 : a \in F\}
\end{aligned}\right\}.$$
It follows that there exists $a_{21} \in F$ such that $T(v_2) - a_{22} v_2 = a_{21} v_1$, and hence $T(v_2) = a_{21} v_1 + a_{22} v_2$. Similarly, there exists $a_{31} \in F$ such that $T(v_3) = a_{31} v_1 + a_{32} v_2 + a_{33} v_3$, etc. Also $T(v_1) = \lambda_1 v_1$. Thus
$$\left.\begin{aligned}
T(v_1) &= \lambda_1 v_1 \\
T(v_2) &= a_{21} v_1 + a_{22} v_2 \\
T(v_3) &= a_{31} v_1 + a_{32} v_2 + a_{33} v_3 \\
&\;\;\vdots \\
T(v_n) &= a_{n1} v_1 + a_{n2} v_2 + \cdots + a_{nn} v_n
\end{aligned}\right\}. \quad (*)$$
Clearly, $\{v_1, v_2, \ldots, v_n\}$ is a linearly independent set of vectors in $V$.
Proof To show this, suppose that
$$a_1 v_1 + a_2 v_2 + \cdots + a_n v_n = 0. \quad (**)$$
We have to show that each $a_i$ equals 0. Here
$$a_2 v_2 + \cdots + a_n v_n = (-a_1) v_1 \in \{a v_1 : a \in F\} = W,$$
so
$$a_2 (v_2 + W) + \cdots + a_n (v_n + W) = 0 + W.$$
Now, since $\{v_2 + W, v_3 + W, \ldots, v_n + W\}$ is a basis of $V/W$, $\{v_2 + W, v_3 + W, \ldots, v_n + W\}$ is a linearly independent set of vectors in $V/W$, and hence $a_2 = 0, a_3 = 0, \ldots$, and $a_n = 0$. It remains to show that $a_1 = 0$. From (**),
$$a_1 v_1 = a_1 v_1 + 0 v_2 + \cdots + 0 v_n = 0,$$
and hence $a_1 v_1 = 0$. Now, since $v_1$ is nonzero, we have $a_1 = 0$. ■
Thus we have shown that $\{v_1, v_2, \ldots, v_n\}$ is a linearly independent set of vectors in the $n$-dimensional vector space $V$. It follows that $\{v_1, v_2, \ldots, v_n\}$ is a basis of $V$. Next, from (*), the matrix of $T$ relative to the basis $\{v_1, v_2, \ldots, v_n\}$ is triangular.
■
3.2.17 Problem Let $F$ be a field. Let $A$ be an $n \times n$ matrix with entries in $F$. Suppose that all its eigenvalues are in $F$. Then there exists an invertible $n \times n$ matrix $C$ with entries in $F$ such that $C^{-1}AC$ is a triangular matrix.
Such matrices of a particularly nice form are called canonical forms. In short, we say that $A$ can be brought to triangular form over $F$ by similarity.
Proof We know that $F^n$ constitutes a vector space over $F$ under pointwise addition and pointwise scalar multiplication. Put
$$v_1 \equiv (\underbrace{1, 0, 0, \ldots, 0}_{n}), \quad v_2 \equiv (\underbrace{0, 1, 0, \ldots, 0}_{n}), \quad \text{etc.}$$
We know that $\{v_1, v_2, \ldots, v_n\}$ is a basis of $F^n$. Suppose that $A \equiv [a_{ij}]_{n \times n}$, where each $a_{ij}$ is in $F$. Suppose that $T : F^n \to F^n$ is a linear transformation such that
$$T(v_1) = a_{11} v_1 + a_{12} v_2 + \cdots + a_{1n} v_n, \quad T(v_2) = a_{21} v_1 + a_{22} v_2 + \cdots + a_{2n} v_n, \quad \text{etc.}$$
It follows that $m_1(T) = [a_{ij}]_{n \times n}$, where $m_1(T)$ is the matrix of $T$ relative to the basis $v_1, \ldots, v_n$. Since all the eigenvalues of the matrix $[a_{ij}]_{n \times n}$ are in $F$, all the eigenvalues of $T$ are in $F$, and hence by 3.2.7, all the roots of the minimal polynomial of $T$ over $F$ are in $F$. Hence by 3.2.16, there exists a basis $w_1, \ldots, w_n$ of $F^n$ such that the matrix $m_2(T)$ of $T$ relative to the basis $w_1, \ldots, w_n$ is triangular.
Let $S : F^n \to F^n$ be the linear transformation such that for every $i \in \{1, \ldots, n\}$, $S(v_i) = w_i$. By 3.1.35,
$$m_2(T) = (m_1(S))^{-1} m_1(T) m_1(S).$$
Thus
$$m_2(T) = C^{-1} A C,$$
where $C \equiv m_1(S)$. Now, since the matrix $m_2(T)$ of $T$ is triangular, $C^{-1}AC$ is a triangular matrix. ■
3.2.18 Problem Let $A$ be a triangular matrix with entries in the field $F$. Suppose that no entry on the diagonal is 0. Then $A$ is invertible.
Proof Let $A \equiv [a_{ij}]_{n \times n}$, where $i < j \Rightarrow a_{ij} = 0$. Next, suppose that each $a_{ii}$ is nonzero. We have to show that $[a_{ij}]_{n \times n}$ is invertible.
It suffices to show that $\det[a_{ij}]_{n \times n} \neq 0$. Since $[a_{ij}]_{n \times n}$ is a triangular matrix, we have $\det[a_{ij}]_{n \times n} = a_{11} a_{22} \cdots a_{nn}$, and since each $a_{ii}$ is nonzero, we have $\det[a_{ij}]_{n \times n} \neq 0$. ■
3.2.19 Problem Let $A$ be a triangular matrix with entries in the field $F$. Suppose that some entry on the diagonal is 0. Then $A^{-1}$ does not exist.
Proof Let $A \equiv [a_{ij}]_{n \times n}$, where $i < j \Rightarrow a_{ij} = 0$. Next, there exists $i \in \{1, 2, \ldots, n\}$ such that $a_{ii} = 0$. We have to show that $[a_{ij}]_{n \times n}$ is not invertible.
It suffices to show that $\det[a_{ij}]_{n \times n} = 0$. Since $[a_{ij}]_{n \times n}$ is a triangular matrix, we have $\det[a_{ij}]_{n \times n} = a_{11} a_{22} \cdots a_{nn}$, and since $a_{ii} = 0$, we have $\det[a_{ij}]_{n \times n} = 0$. ■
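3.2.20 Problem Let $[a_{ij}]_{n \times n}$ be a triangular matrix with entries in the field $F$. Then the set of all eigenvalues of $[a_{ij}]_{n \times n}$ is $\{a_{11}, a_{22}, \ldots, a_{nn}\}$.
Proof Let $\lambda$ be an eigenvalue of $[a_{ij}]_{n \times n}$. It follows that $\left([a_{ij}]_{n \times n} - \lambda I\right)$ is singular, that is, $\det\left([a_{ij}]_{n \times n} - \lambda I\right) = 0$. Since $[a_{ij}]_{n \times n}$ is a triangular matrix, we have $\det\left([a_{ij}]_{n \times n} - \lambda I\right) = (a_{11} - \lambda)(a_{22} - \lambda) \cdots (a_{nn} - \lambda)$. Since $\det\left([a_{ij}]_{n \times n} - \lambda I\right) = 0$, we have
$$(a_{11} - \lambda)(a_{22} - \lambda) \cdots (a_{nn} - \lambda) = 0.$$
This shows that $\lambda \in \{a_{11}, a_{22}, \ldots, a_{nn}\}$.
Conversely, since $\det\left([a_{ij}]_{n \times n} - a_{11} I\right) = (a_{11} - a_{11})(a_{22} - a_{11}) \cdots (a_{nn} - a_{11}) = 0$, it follows that $\left([a_{ij}]_{n \times n} - a_{11} I\right)$ is singular, and hence $a_{11}$ is an eigenvalue of $[a_{ij}]_{n \times n}$. Similarly, $a_{22}$ is an eigenvalue of $[a_{ij}]_{n \times n}$, etc. ■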
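The statement of 3.2.20 can be spot-checked on a concrete triangular matrix. In the Python sketch below, the matrix `A` and the helper names `det` and `shifted` are hypothetical choices made for illustration; the determinant is computed by plain Gaussian elimination over the rationals.

```python
from fractions import Fraction

def det(M):
    """Determinant by Gaussian elimination over the rationals."""
    A = [[Fraction(x) for x in row] for row in M]
    n = len(A)
    d = Fraction(1)
    for c in range(n):
        pivot = next((r for r in range(c, n) if A[r][c] != 0), None)
        if pivot is None:
            return Fraction(0)          # no pivot in this column: singular
        if pivot != c:
            A[c], A[pivot] = A[pivot], A[c]
            d = -d                      # a row swap flips the sign
        d *= A[c][c]
        for r in range(c + 1, n):
            f = A[r][c] / A[c][c]
            A[r] = [A[r][j] - f * A[c][j] for j in range(n)]
    return d

def shifted(A, lam):
    """Return the matrix A - lam*I."""
    n = len(A)
    return [[A[i][j] - (lam if i == j else 0) for j in range(n)]
            for i in range(n)]

# Hypothetical triangular matrix (zero above the diagonal).
A = [[2, 0, 0], [1, 3, 0], [4, 5, 7]]
diagonal = [A[i][i] for i in range(len(A))]

# A - a_ii*I is singular for every diagonal entry a_ii ...
singular_at_diagonal = [det(shifted(A, lam)) == 0 for lam in diagonal]
# ... while for a scalar off the diagonal, such as 5, it is not.
det_at_5 = det(shifted(A, 5))
print(singular_at_diagonal, det_at_5)
```

For this matrix, `det_at_5` equals $(2-5)(3-5)(7-5)$, the product of the shifted diagonal entries, matching the formula used in the proof.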
3.2.21 Theorem Let $V$ be any $n$-dimensional vector space over $F$. Let $T \in A(V)$. Suppose that all the roots of the minimal polynomial of $T$ over $F$ are in $F$. Then there exists a polynomial $p(x)$ in $F[x]$ such that $p(x)$ is of degree $n$, and $p(T) = 0$.
Proof By 3.2.16, there exists a basis $\{v_1, v_2, \ldots, v_n\}$ of $V$ such that the matrix $[a_{ij}]_{n \times n}$ of $T$ relative to the basis $\{v_1, v_2, \ldots, v_n\}$ is triangular. It follows that
$$i < j \Rightarrow a_{ij} = 0,$$
and
$$\begin{aligned}
T(v_1) &= a_{11} v_1, \\
T(v_2) &= a_{21} v_1 + a_{22} v_2, \\
T(v_3) &= a_{31} v_1 + a_{32} v_2 + a_{33} v_3, \\
&\;\;\vdots
\end{aligned}$$
For every $i \in \{1, 2, \ldots, n\}$, put $\lambda_i \equiv a_{ii}$. By 3.2.20, the set of all eigenvalues of $[a_{ij}]_{n \times n}$ is $\{\lambda_1, \lambda_2, \ldots, \lambda_n\}$. Further,
$$\begin{aligned}
(T - \lambda_1 I)(v_1) &= 0, \\
(T - \lambda_2 I)(v_2) &= a_{21} v_1, \\
(T - \lambda_3 I)(v_3) &= a_{31} v_1 + a_{32} v_2, \\
&\;\;\vdots
\end{aligned}$$
Observe that
$$\begin{aligned}
((T - \lambda_1 I) \circ (T - \lambda_2 I))(v_2) &= (T - \lambda_1 I)((T - \lambda_2 I)(v_2)) = (T - \lambda_1 I)(a_{21} v_1) \\
&= a_{21} (T - \lambda_1 I)(v_1) = a_{21} 0 = 0,
\end{aligned}$$
$$\begin{aligned}
((T - \lambda_1 I) \circ (T - \lambda_2 I) \circ (T - \lambda_3 I))(v_3) &= ((T - \lambda_1 I) \circ (T - \lambda_2 I))((T - \lambda_3 I)(v_3)) \\
&= ((T - \lambda_1 I) \circ (T - \lambda_2 I))(a_{31} v_1 + a_{32} v_2) \\
&= a_{31} ((T - \lambda_1 I) \circ (T - \lambda_2 I))(v_1) + a_{32} ((T - \lambda_1 I) \circ (T - \lambda_2 I))(v_2) \\
&= a_{31} ((T - \lambda_1 I) \circ (T - \lambda_2 I))(v_1) + a_{32} 0 \\
&= a_{31} ((T - \lambda_1 I) \circ (T - \lambda_2 I))(v_1) = a_{31} (T \circ T - \lambda_1 T - \lambda_2 T + \lambda_1 \lambda_2 I)(v_1) \\
&= a_{31} ((T - \lambda_2 I) \circ (T - \lambda_1 I))(v_1) \\
&= a_{31} (T - \lambda_2 I)((T - \lambda_1 I)(v_1)) = a_{31} (T - \lambda_2 I)(0) = 0, \text{ etc.}
\end{aligned}$$
Thus
$$\begin{aligned}
(T - \lambda_1 I)(v_1) &= 0, \\
((T - \lambda_1 I) \circ (T - \lambda_2 I))(v_2) &= 0, \\
((T - \lambda_1 I) \circ (T - \lambda_2 I) \circ (T - \lambda_3 I))(v_3) &= 0, \\
&\;\;\vdots
\end{aligned}$$
Now, since each $(T - \lambda_i I)$ commutes with each $(T - \lambda_j I)$, we have
$$\left.\begin{aligned}
((T - \lambda_1 I) \circ (T - \lambda_2 I) \circ \cdots \circ (T - \lambda_n I))(v_1) &= 0 \\
((T - \lambda_1 I) \circ (T - \lambda_2 I) \circ \cdots \circ (T - \lambda_n I))(v_2) &= 0 \\
&\;\;\vdots \\
((T - \lambda_1 I) \circ (T - \lambda_2 I) \circ \cdots \circ (T - \lambda_n I))(v_n) &= 0
\end{aligned}\right\}.$$
Next, since $\{v_1, v_2, \ldots, v_n\}$ is a basis of $V$, for every $v \in V$, we have
$$((T - \lambda_1 I) \circ (T - \lambda_2 I) \circ \cdots \circ (T - \lambda_n I))(v) = 0.$$
This shows that $(T - \lambda_1 I) \circ (T - \lambda_2 I) \circ \cdots \circ (T - \lambda_n I) = 0$, and hence $p(T) = 0$, where
$$p(x) \equiv (x - \lambda_1)(x - \lambda_2) \cdots (x - \lambda_n).$$
Since each $\lambda_i$ is in $F$, $p(x)\ (\equiv (x - \lambda_1)(x - \lambda_2) \cdots (x - \lambda_n))$ is a polynomial in $F[x]$, and it is of degree $n$. Thus $p(x)$ is a polynomial in $F[x]$ of degree $n$. ■
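The conclusion of 3.2.21 says that, for a triangular matrix with diagonal entries $\lambda_1, \ldots, \lambda_n$, the product of the factors $(T - \lambda_i I)$ is zero. The Python sketch below checks this on one hypothetical 3 x 3 triangular matrix; `A` and the helper names are assumptions for illustration only.

```python
def mat_mul(A, B):
    """Product of two square matrices given as lists of rows."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def shifted(A, lam):
    """Return the matrix A - lam*I."""
    n = len(A)
    return [[A[i][j] - (lam if i == j else 0) for j in range(n)]
            for i in range(n)]

# Hypothetical triangular matrix; its diagonal entries play the role
# of lambda_1, ..., lambda_n in the proof.
A = [[2, 0, 0], [1, 3, 0], [4, 5, 7]]
n = len(A)

# Accumulate (A - lambda_1 I)(A - lambda_2 I)...(A - lambda_n I).
product = [[int(i == j) for j in range(n)] for i in range(n)]
for i in range(n):
    product = mat_mul(product, shifted(A, A[i][i]))

print(product)
```

Leaving out any one factor in this example gives a nonzero matrix, so here the degree-$n$ polynomial cannot be shortened; that need not hold when diagonal entries repeat.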
Definition Let $V$ be any $n$-dimensional vector space over the field $F$. Let $T \in A(V)$. If there exists a positive integer $m$ such that $T^m = 0$, then we say that $T$ is nilpotent.
3.2.22 Problem Let $V$ be any $n$-dimensional vector space over the field $F$. Let $T \in A(V)$. Suppose that $T$ is nilpotent. Let $\lambda$ be an eigenvalue of $T$. Then $\lambda = 0$.
Proof Suppose to the contrary that $\lambda \neq 0$. We seek a contradiction.
Since $\lambda$ is an eigenvalue of $T$, there exists a nonzero $v$ in $V$ such that $T(v) = \lambda v$. Since $T$ is nilpotent, there exists a positive integer $m$ such that $T^m = 0$. It follows that $T^m(v) = 0$. Hence
$$\{k : k \text{ is a positive integer, and } T^k(v) = 0\}$$
is a nonempty set of positive integers. It follows that
$$\min\{k : k \text{ is a positive integer, and } T^k(v) = 0\}$$
exists. Put
$$n \equiv \min\{k : k \text{ is a positive integer, and } T^k(v) = 0\}.$$
It follows that $T^{n-1}(v) \neq 0$ and $T^n(v) = 0$. Now,
$$\lambda T^{n-1}(v) = T^{n-1}(\lambda v) = T^{n-1}(T(v)) = T^n(v) = 0.$$
Thus $\lambda T^{n-1}(v) = 0$. Since $\lambda \neq 0$, we have $T^{n-1}(v) = 0$. This is a contradiction. ■
3.2.23 Problem Let $V$ be any $n$-dimensional vector space over the field $F$. Let $T \in A(V)$. Suppose that $T$ is nilpotent. Let $a_0, a_1, \ldots, a_n \in F$. Let $a_0 \neq 0$. Then
$$a_0 I + a_1 T + \cdots + a_n T^n$$
is invertible, and $(a_0 I + a_1 T + \cdots + a_n T^n)^{-1}$ is a polynomial in the linear transformation $S$, where $S \equiv a_1 T + \cdots + a_n T^n$.
Proof We have to show that $a_0 I + S$ is invertible.
Since $T$ is nilpotent, there exists a positive integer $m$ such that $T^m = 0$. It follows that
$$r \geq m \Rightarrow T^r = 0,$$
and hence
$$S^m = (a_1 T + \cdots + a_n T^n)^m = \underbrace{(a_1 T + \cdots + a_n T^n) \circ (a_1 T + \cdots + a_n T^n) \circ \cdots \circ (a_1 T + \cdots + a_n T^n)}_{m\ \text{factors}} = 0.$$
Thus $S^m = 0$. Observe that
$$\begin{aligned}
&\left(I + \frac{1}{a_0} S\right) \circ \left(\frac{1}{a_0} I - \frac{1}{(a_0)^2} S + \frac{1}{(a_0)^3} S^2 - \cdots + (-1)^{m-1} \frac{1}{(a_0)^m} S^{m-1}\right) \\
&\quad= \left(\frac{1}{a_0} I - \frac{1}{(a_0)^2} S + \frac{1}{(a_0)^3} S^2 - \cdots + (-1)^{m-1} \frac{1}{(a_0)^m} S^{m-1}\right) \\
&\qquad+ \left(\frac{1}{(a_0)^2} S - \frac{1}{(a_0)^3} S^2 + \frac{1}{(a_0)^4} S^3 - \cdots - (-1)^{m-1} \frac{1}{(a_0)^m} S^{m-1} + (-1)^{m-1} \frac{1}{(a_0)^{m+1}} S^m\right) \\
&\quad= \frac{1}{a_0} I + (-1)^{m-1} \frac{1}{(a_0)^{m+1}} S^m = \frac{1}{a_0} I + (-1)^{m-1} \frac{1}{(a_0)^{m+1}} 0 = \frac{1}{a_0} I,
\end{aligned}$$
so
$$\left(I + \frac{1}{a_0} S\right) \circ \left(\frac{1}{a_0} I - \frac{1}{(a_0)^2} S + \frac{1}{(a_0)^3} S^2 - \cdots + (-1)^{m-1} \frac{1}{(a_0)^m} S^{m-1}\right) = \frac{1}{a_0} I.$$
This shows that
$$(a_0 I + S)^{-1} = \left(\frac{1}{a_0} I - \frac{1}{(a_0)^2} S + \frac{1}{(a_0)^3} S^2 - \cdots + (-1)^{m-1} \frac{1}{(a_0)^m} S^{m-1}\right),$$
and hence $a_0 I + S$ is invertible. ■
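The inverse formula obtained in the proof of 3.2.23 is a finite alternating sum, so it can be evaluated directly. The Python sketch below does this for one hypothetical nilpotent matrix `S` with `S^3 = 0` and the scalar `a0 = 2`; the function name `inverse_of_a0_plus_nilpotent` and all data are assumptions introduced for illustration.

```python
from fractions import Fraction

def mat_mul(A, B):
    """Product of two square matrices given as lists of rows."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def mat_add(A, B):
    """Entrywise sum of two matrices of the same shape."""
    return [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

def scale(c, A):
    """Multiply every entry of A by the scalar c."""
    return [[c * a for a in row] for row in A]

def identity(n):
    """The n x n identity matrix with rational entries."""
    return [[Fraction(int(i == j)) for j in range(n)] for i in range(n)]

def inverse_of_a0_plus_nilpotent(a0, S, m):
    """Return sum_{j=0}^{m-1} (-1)^j a0^{-(j+1)} S^j.

    Assumes a0 != 0 and S^m = 0; under these assumptions the sum is
    the inverse of a0*I + S, as in the proof above.
    """
    n = len(S)
    total = [[Fraction(0)] * n for _ in range(n)]
    power = identity(n)                        # S^0
    for j in range(m):
        coeff = Fraction((-1) ** j) / a0 ** (j + 1)
        total = mat_add(total, scale(coeff, power))
        power = mat_mul(power, S)
    return total

# Hypothetical nilpotent S (strictly triangular, so S^3 = 0).
S = [[Fraction(x) for x in row] for row in [[0, 0, 0], [1, 0, 0], [2, 3, 0]]]
a0 = Fraction(2)

X = inverse_of_a0_plus_nilpotent(a0, S, 3)
check = mat_mul(mat_add(scale(a0, identity(3)), S), X)   # (a0*I + S) X
print(check == identity(3))
```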
Definition Let $V$ be any $n$-dimensional vector space over the field $F$. Let $T \in A(V)$. Suppose that $T$ is nilpotent.
Since $T$ is nilpotent, there exists a positive integer $m$ such that $T^m = 0$. Hence
$$\{k : k \text{ is a positive integer, and } T^k = 0\}$$
is a nonempty set of positive integers. It follows that
$$\min\{k : k \text{ is a positive integer, and } T^k = 0\}$$
exists. Put
$$n \equiv \min\{k : k \text{ is a positive integer, and } T^k = 0\}.$$
It follows that $T^{n-1} \neq 0$ and $T^n = 0$. Here $n$ is called the index of nilpotence of $T$.
3.2.24 Note Let $V$ be any $n$-dimensional vector space over the field $F$. Let $T \in A(V)$. Suppose that $T$ is nilpotent. Let $n_1$ be a positive integer. Let $n_1$ be the index of nilpotence of $T$.
It follows that $T^{n_1 - 1} \neq 0$ and $T^{n_1} = 0$. Since $T^{n_1 - 1} : V \to V$ is a nonzero function, there exists a nonzero $v \in V$ such that $T^{n_1 - 1}(v) \neq 0$ and $T^{n_1}(v) = 0$.
Clearly, $v, T(v), T^2(v), \ldots, T^{n_1 - 1}(v)$ are linearly independent over $F$.
Proof Suppose to the contrary that $v, T(v), T^2(v), \ldots, T^{n_1 - 1}(v)$ are linearly dependent over $F$. We seek a contradiction.
It follows that there exist $a_1, a_2, \ldots, a_{n_1} \in F$ such that not all the $a_i$ are zero, and
$$a_1 v + a_2 T(v) + a_3 T^2(v) + \cdots + a_{n_1} T^{n_1 - 1}(v) = 0.$$
Suppose that $a_s$ is the first nonzero $a_i$. It follows that $s \leq n_1$, $a_s \neq 0$, and $(r < s \Rightarrow a_r = 0)$. Hence
$$\left(a_s I + a_{s+1} T + a_{s+2} T^2 + \cdots + a_{n_1} T^{n_1 - s}\right)\left(T^{s-1}(v)\right) = a_s T^{s-1}(v) + a_{s+1} T^s(v) + \cdots + a_{n_1} T^{n_1 - 1}(v) = 0.$$
This shows that
$$\left(a_s I + a_{s+1} T + a_{s+2} T^2 + \cdots + a_{n_1} T^{n_1 - s}\right)\left(T^{s-1}(v)\right) = 0. \quad (*)$$
Since $a_s \neq 0$ and $T$ is nilpotent, by 3.2.23,
$$\left(a_s I + a_{s+1} T + a_{s+2} T^2 + \cdots + a_{n_1} T^{n_1 - s}\right)^{-1}$$
exists. Now, from (*), $T^{s-1}(v) = 0$. Since $s \leq n_1$, we have $(s - 1) \leq (n_1 - 1)$. Since $T^{n_1 - 1}(v) \neq 0$, we have $T^{s-1}(v) \neq 0$. This is a contradiction. ■
Put
$$v_1 \equiv v, \quad v_2 \equiv T(v), \quad v_3 \equiv T^2(v), \quad \ldots, \quad v_{n_1} \equiv T^{n_1 - 1}(v).$$
Since $v, T(v), T^2(v), \ldots, T^{n_1 - 1}(v)$ are linearly independent over $F$, it follows that $\{v_1, v_2, \ldots, v_{n_1}\}$ is a linearly independent set of vectors in $V$.
Suppose that $V_1$ is the linear span of $\{v_1, v_2, \ldots, v_{n_1}\}$.
It follows that $V_1$ is an $n_1$-dimensional linear subspace of $V$, and $\{v_1, v_2, \ldots, v_{n_1}\}$ is a basis of $V_1$. Observe that
$$\begin{aligned}
T(v_1) &= T(v) = v_2 \in V_1, \\
T(v_2) &= T(T(v)) = T^2(v) = v_3 \in V_1, \\
&\;\;\vdots \\
T(v_{n_1}) &= T(T^{n_1 - 1}(v)) = T^{n_1}(v) = 0 \in V_1,
\end{aligned}$$
and hence $\{T(v_1), T(v_2), \ldots, T(v_{n_1})\} \subset V_1$. Now, since $\{v_1, v_2, \ldots, v_{n_1}\}$ is a basis of $V_1$, we have $(w \in V_1 \Rightarrow T(w) \in V_1)$.
Thus we have shown that $V_1$ is invariant under $T$.
3.2.24.1 There exists a linear subspace $W$ of $V$, of largest possible dimension, such that

1. $V_1 \cap W = \{0\}$,
2. $W$ is invariant under $T$.
Proof Since $V_1$ is an $n_1$-dimensional linear subspace of the $n$-dimensional vector space $V$, and $\{v_1, v_2, \ldots, v_{n_1}\}$ is a basis of $V_1$, there exists a basis $\{v_1, v_2, \ldots, v_{n_1}, w_{n_1 + 1}, w_{n_1 + 2}, \ldots, w_n\}$ of $V$.
Let $W_1$ be the linear span of $\{w_{n_1 + 1}, w_{n_1 + 2}, \ldots, w_n\}$.
It is clear that $W_1$ is a linear subspace of $V$, $\dim(W_1) = n - n_1\ (\geq 1)$, and $V = V_1 \oplus W_1$. Thus $V_1 \cap W_1 = \{0\}$. Since $\{v_1, v_2, \ldots, v_{n_1}, w_{n_1 + 1}, w_{n_1 + 2}, \ldots, w_n\}$ is a basis of $V$, we have $w_{n_1 + 1} \neq 0$. Now, since $T^{n_1} = 0$, we have that
$$\{k : k \text{ is a positive integer, and } T^k(w_{n_1 + 1}) = 0\}$$
is a nonempty set of positive integers, and it follows that
$$\min\{k : k \text{ is a positive integer, and } T^k(w_{n_1 + 1}) = 0\}$$
exists. Put
$$n_2 \equiv \min\{k : k \text{ is a positive integer, and } T^k(w_{n_1 + 1}) = 0\}.$$
It follows that $T^{n_2 - 1}(w_{n_1 + 1}) \neq 0$ and $T^{n_2}(w_{n_1 + 1}) = 0$. Put $w \equiv w_{n_1 + 1}$. We have $w \in W_1$, $T^{n_2 - 1}(w) \neq 0$ and $T^{n_2}(w) = 0$.
Clearly, $w, T(w), T^2(w), \ldots, T^{n_2 - 1}(w)$ are linearly independent over $F$.
Proof Suppose to the contrary that $w, T(w), T^2(w), \ldots, T^{n_2 - 1}(w)$ are linearly dependent over $F$. We seek a contradiction.
It follows that there exist $a_1, a_2, \ldots, a_{n_2} \in F$ such that not all the $a_i$ are zero and
$$a_1 w + a_2 T(w) + a_3 T^2(w) + \cdots + a_{n_2} T^{n_2 - 1}(w) = 0.$$
Suppose that $a_s$ is the first nonzero $a_i$. It follows that $s \leq n_2$, $a_s \neq 0$, and $(r < s \Rightarrow a_r = 0)$. Hence
$$\left(a_s I + a_{s+1} T + a_{s+2} T^2 + \cdots + a_{n_2} T^{n_2 - s}\right)\left(T^{s-1}(w)\right) = a_s T^{s-1}(w) + a_{s+1} T^s(w) + \cdots + a_{n_2} T^{n_2 - 1}(w) = 0.$$
This shows that
$$\left(a_s I + a_{s+1} T + a_{s+2} T^2 + \cdots + a_{n_2} T^{n_2 - s}\right)\left(T^{s-1}(w)\right) = 0. \quad (*)$$
Since $a_s \neq 0$ and $T$ is nilpotent, by 3.2.23,
$$\left(a_s I + a_{s+1} T + a_{s+2} T^2 + \cdots + a_{n_2} T^{n_2 - s}\right)^{-1}$$
exists. Now, from (*), $T^{s-1}(w) = 0$. Since $s \leq n_2$, we have $(s - 1) \leq (n_2 - 1)$. Since $T^{n_2 - 1}(w) \neq 0$, we have $T^{s-1}(w) \neq 0$. This is a contradiction. ■
Put
$$w_1 \equiv w, \quad w_2 \equiv T(w), \quad w_3 \equiv T^2(w), \quad \ldots, \quad w_{n_2} \equiv T^{n_2 - 1}(w).$$
Since $w, T(w), T^2(w), \ldots, T^{n_2 - 1}(w)$ are linearly independent over $F$, it follows that $\{w_1, w_2, \ldots, w_{n_2}\}$ is a linearly independent set of vectors in $V$.
Suppose that $W_1$ is the linear span of $\{w_1, w_2, \ldots, w_{n_2}\}$.
It follows that $W_1$ is an $n_2$-dimensional linear subspace of $V$, and $\{w_1, w_2, \ldots, w_{n_2}\}$ is a basis of $W_1$. Observe that
$$\begin{aligned}
T(w_1) &= T(w) = w_2 \in W_1, \\
T(w_2) &= T(T(w)) = T^2(w) = w_3 \in W_1, \\
&\;\;\vdots \\
T(w_{n_2}) &= T(T^{n_2 - 1}(w)) = T^{n_2}(w) = 0 \in W_1,
\end{aligned}$$
and hence $\{T(w_1), T(w_2), \ldots, T(w_{n_2})\} \subset W_1$. Now, since $\{w_1, w_2, \ldots, w_{n_2}\}$ is a basis of $W_1$, we have $(z \in W_1 \Rightarrow T(z) \in W_1)$. Thus we have shown that $W_1$ is invariant under $T$.
Hence there exists a linear subspace $W_1$ of $V$ such that

1. $V_1 \cap W_1 = \{0\}$,
2. $W_1$ is invariant under $T$.

It follows that there exists a linear subspace $W$ of $V$, of largest possible dimension, such that

1. $V_1 \cap W = \{0\}$,
2. $W$ is invariant under $T$. ■
3.2.24.2 Suppose that $u \in V_1$. Let $k$ be a positive integer such that $k \leq n_1$. Suppose that $T^{(n_1 - k)}(u) = 0$. Then there exists $u_0 \in V_1$ such that $T^k(u_0) = u$.
Proof Since $u \in V_1$ and $V_1$ is the linear span of $\{v, T(v), T^2(v), \ldots, T^{n_1 - 1}(v)\}$, there exist $a_1, a_2, \ldots, a_{n_1} \in F$ such that
$$u = a_1 v + a_2 T(v) + a_3 T^2(v) + \cdots + a_{n_1} T^{n_1 - 1}(v).$$
Since
$$\begin{aligned}
0 = T^{(n_1 - k)}(u) &= T^{(n_1 - k)}\left(a_1 v + a_2 T(v) + a_3 T^2(v) + \cdots + a_k T^{k-1}(v) + a_{k+1} T^k(v) + \cdots + a_{n_1} T^{n_1 - 1}(v)\right) \\
&= a_1 T^{(n_1 - k)}(v) + a_2 T^{(n_1 - k) + 1}(v) + a_3 T^{(n_1 - k) + 2}(v) + \cdots + a_k T^{n_1 - 1}(v) \\
&\quad+ a_{k+1} T^{n_1}(v) + a_{k+2} T^{n_1 + 1}(v) + \cdots + a_{n_1} T^{(n_1 - k) + (n_1 - 1)}(v) \\
&= a_1 T^{(n_1 - k)}(v) + a_2 T^{(n_1 - k) + 1}(v) + a_3 T^{(n_1 - k) + 2}(v) + \cdots + a_k T^{n_1 - 1}(v) \\
&\quad+ a_{k+1} 0(v) + a_{k+2} 0(v) + \cdots + a_{n_1} 0(v) \\
&= a_1 T^{(n_1 - k)}(v) + a_2 T^{(n_1 - k) + 1}(v) + a_3 T^{(n_1 - k) + 2}(v) + \cdots + a_k T^{n_1 - 1}(v),
\end{aligned}$$
we have
$$a_1 T^{(n_1 - k)}(v) + a_2 T^{(n_1 - k) + 1}(v) + a_3 T^{(n_1 - k) + 2}(v) + \cdots + a_k T^{n_1 - 1}(v) = 0.$$
Next, since $\{v, T(v), T^2(v), \ldots, T^{n_1 - 1}(v)\}$ is a basis of $V_1$, we have $a_1 = 0, a_2 = 0, \ldots, a_k = 0$. It follows that
$$\begin{aligned}
u &= a_{k+1} T^k(v) + a_{k+2} T^{k+1}(v) + \cdots + a_{n_1} T^{n_1 - 1}(v) \\
&= T^k\left(a_{k+1} v + a_{k+2} T(v) + \cdots + a_{n_1} T^{n_1 - k - 1}(v)\right) \\
&= T^k\left(a_{k+1} v_1 + a_{k+2} v_2 + \cdots + a_{n_1} v_{n_1 - k}\right) = T^k(u_0),
\end{aligned}$$
where $u_0 \equiv (a_{k+1} v_1 + a_{k+2} v_2 + \cdots + a_{n_1} v_{n_1 - k}) \in [v_1, v_2, \ldots, v_{n_1}] = V_1$. Thus $u_0 \in V_1$ and $T^k(u_0) = u$. ■
3.2.24.3 There exists a linear subspace $W$ of $V$, of largest possible dimension, such that

1. $V = V_1 \oplus W$,
2. $W$ is invariant under $T$.

Proof By 3.2.24.1, there exists a linear subspace $W$ of $V$ such that

1. $V_1 \cap W = \{0\}$,
2. $W$ is invariant under $T$.

It suffices to show that $V \subset V_1 + W$.
3.2 Canonical Forms 213
Suppose to the contrary that there exists z 2 V such that z 62 V1 þWð Þ. Weseek a contradiction.Clearly, V1 þWð Þ T�1 V1 þWð Þ.
Proof To show this, let us take arbitrary x 2 V1 and y 2 W . We have toshow that
TðxÞþ TðyÞ ¼ð Þ T xþ yð Þ 2 V1 þWð Þ;
that is, TðxÞþ TðyÞð Þ 2 V1 þWð Þ. It suffices to show that TðxÞ 2 V1 andTðyÞ 2 W . Since V1 is invariant under T, and x 2 V1, we have TðxÞ 2 V1.Since W is invariant under T, and y 2 W , we have TðyÞ 2 W . ■
Clearly, T�1 V1 þWð Þ T2ð Þ�1V1 þWð Þ.
Proof To show this, let us take arbitrary x 2 V1 and y 2 W such thatT xþ yð Þ 2 V1 þWð Þ. We have to show that T2ðxÞþ T2ðyÞ ¼ð ÞT2 xþ yð Þ 2V1 þWð Þ, that is, T2ðxÞþ T2ðyÞð Þ 2 V1 þWð Þ. It suffices to show thatT2ðxÞ 2 V1 and T2ðyÞ 2 W . Since V1 is invariant under T, and x 2 V1, wehave TðxÞ 2 V1, and hence T2ðxÞ ¼ð ÞT TðxÞð Þ 2 V1: Since W is invari-ant under T, and y 2 W , we have TðyÞ 2 W , and henceT2ðyÞ ¼ð ÞT TðYÞð Þ 2 W . ■Clearly, T2ð Þ�1 V1 þWð Þ T3ð Þ�1 V1 þWð Þ.
Proof To show this, let us take arbitrary x 2 V1 and y 2 W such thatT2 xþ yð Þ 2 V1 þWð Þ. We have to show that
T3ðxÞþ T3ðyÞ ¼� �T3 xþ yð Þ 2 V1 þWð Þ;
that is, T3ðxÞþ T3ðyÞð Þ 2 V1 þWð Þ. It suffices to show that T3ðxÞ 2 V1 andT3ðyÞ 2 W . Since V1 is invariant under T, and x 2 V1, we have TðxÞ 2 V1,and hence T2ðxÞ ¼ð ÞT TðxÞð Þ 2 V1. Now, T3ðxÞ ¼ð ÞT T2ðxÞð Þ 2 V1, soT3ðxÞ 2 V1. Similarly, T3ðyÞ 2 W .
Thus we have shown that T2ð Þ�1V1 þWð Þ T3ð Þ�1
V1 þWð Þ, etc. ■
Hence
z 62 V1 þWð Þ T�1 V1 þWð Þ T2� ��1
V1 þWð Þ T3� ��1
V1 þWð Þ � � � Tn1ð Þ�1 V1 þWð Þ ¼ V :
Now, since $z \in V$, there exists a positive integer $k$ such that

1. $k \le n_1$,
2. $z \in (T^k)^{-1}(V_1 + W)$,
3. $r < k \Rightarrow z \notin (T^r)^{-1}(V_1 + W)$.

Since $z \in (T^k)^{-1}(V_1 + W)$, we have $T^k(z) \in V_1 + W$, and hence there exist $u \in V_1$ and $w \in W$ such that $T^k(z) = u + w$. It follows that
\[ 0 = 0(z) = T^{n_1}(z) = T^{n_1-k}\bigl(T^k(z)\bigr) = T^{n_1-k}(u + w) = T^{n_1-k}(u) + T^{n_1-k}(w), \]
and hence $T^{n_1-k}(u) = -T^{n_1-k}(w)$. Since $u \in V_1$, and $V_1$ is invariant under $T$, we have $T^{n_1-k}(u) \in V_1$. Since $w \in W$, and $W$ is invariant under $T$, we have $T^{n_1-k}(w) \in W$, and hence $(T^{n_1-k}(u) =)\ {-T^{n_1-k}(w)} \in W$. Thus
\[ T^{n_1-k}(u) \in V_1 \cap W = \{0\}. \]
It follows that $T^{n_1-k}(u) = 0$. Now by 3.2.24.2, there exists $u_0 \in V_1$ such that $T^k(u_0) = u\ (= T^k(z) - w)$, and hence $T^k(z - u_0) = w\ (\in W)$. Thus $T^k(z - u_0) \in W$. Next, since $W$ is invariant under $T$, we have
\[ m \ge k \Rightarrow T^{m-k}\bigl(T^k(z - u_0)\bigr) \in W, \]
and hence
\[ m \ge k \Rightarrow T^m(z - u_0) \in W. \]
Let us take an arbitrary $r < k$. Clearly, $T^r(z - u_0) \notin V_1 + W$. (*)

Proof Suppose to the contrary that $(T^r(z) - T^r(u_0) =)\ T^r(z - u_0) \in V_1 + W$. We seek a contradiction.
Since $u_0 \in V_1$, and $V_1$ is invariant under $T$, we have $T^r(u_0) \in V_1$. Now, since $T^r(z) - T^r(u_0) \in V_1 + W$, we have
\[ T^r(z) = T^r(u_0) + \bigl(T^r(z) - T^r(u_0)\bigr) \in V_1 + (V_1 + W) = (V_1 + V_1) + W = V_1 + W, \]
and hence $T^r(z) \in V_1 + W$. Since $r < k$, we have by point 3 above that $T^r(z) \notin V_1 + W$. This is a contradiction. ■

Thus we have shown that
\[ r < k \Rightarrow T^r(z - u_0) \notin V_1 + W. \qquad (A) \]
Clearly, $z - u_0 \notin V_1 + W$.

Proof Suppose to the contrary that $z - u_0 \in V_1 + W$. We seek a contradiction. Since $u_0 \in V_1$ and $z - u_0 \in V_1 + W$, we have
\[ z \in u_0 + (V_1 + W) = (u_0 + V_1) + W \subset (V_1 + V_1) + W = V_1 + W, \]
and hence $z \in V_1 + W$. This is a contradiction. ■

Thus we have shown that $z - u_0 \notin V_1 + W\ (\supset \{0\} + W = W)$. Hence $z - u_0 \notin W$. Similarly, from (A),
\[ T(z - u_0),\ T^2(z - u_0),\ \ldots,\ T^{k-1}(z - u_0) \notin V_1 + W\ (\supset \{0\} + W = W). \]
Thus
\[ \{z - u_0,\ T(z - u_0),\ T^2(z - u_0),\ \ldots,\ T^{k-1}(z - u_0)\} \subset W^c. \]
Suppose that $W_2$ is the linear span of
\[ \{z - u_0,\ T(z - u_0),\ T^2(z - u_0),\ \ldots,\ T^{k-1}(z - u_0)\} \cup W. \]
Since $z - u_0 \notin W$, the dimension of the linear span of
\[ \{z - u_0,\ T(z - u_0),\ T^2(z - u_0),\ \ldots,\ T^{k-1}(z - u_0)\} \cup W \]
is strictly greater than $\dim(W)$, and hence
\[ \dim(W_2) > \dim(W). \]
Since $W$ has the largest possible dimension, it follows that
\[ \text{either } V_1 \cap W_2 \ne \{0\} \text{ or } W_2 \text{ is not invariant under } T. \qquad (**) \]
Observe that
\[ T(z - u_0) \in \{z - u_0,\ T(z - u_0),\ T^2(z - u_0),\ \ldots,\ T^{k-1}(z - u_0)\} \cup W \subset W_2, \]
so $T(z - u_0) \in W_2$. Next,
\[ T\bigl(T(z - u_0)\bigr) \in \{z - u_0,\ T(z - u_0),\ T^2(z - u_0),\ \ldots,\ T^{k-1}(z - u_0)\} \cup W \subset W_2, \]
so $T(T(z - u_0)) \in W_2$. Similarly, $T(T^2(z - u_0)) \in W_2, \ldots$. Next, $T(T^{k-1}(z - u_0)) = T^k(z - u_0) \in W \subset W_2$, so $T(T^{k-1}(z - u_0)) \in W_2$. Thus
\[ T\bigl(\{z - u_0,\ T(z - u_0),\ T^2(z - u_0),\ \ldots,\ T^{k-1}(z - u_0)\}\bigr) \subset W_2. \]
Since $W$ is invariant under $T$, we have $T(W) \subset W \subset W_2$. Thus
\[ T\bigl(\{z - u_0,\ T(z - u_0),\ \ldots,\ T^{k-1}(z - u_0)\} \cup W\bigr) = T\bigl(\{z - u_0,\ T(z - u_0),\ \ldots,\ T^{k-1}(z - u_0)\}\bigr) \cup T(W) \subset W_2. \]
Now, since $T$ is a linear transformation and $W_2$ is a linear space, we have
\[ T(W_2) = T\bigl(\text{linear span of } \{z - u_0,\ T(z - u_0),\ T^2(z - u_0),\ \ldots,\ T^{k-1}(z - u_0)\} \cup W\bigr) \subset W_2, \]
and hence $T(W_2) \subset W_2$. This shows that $W_2$ is invariant under $T$. It follows from (**) that $V_1 \cap W_2 \ne \{0\}$.
Since $V_1 \cap W_2 \ne \{0\}$, there exists a nonzero
\[ z_0 \in W_2 = \text{linear span of } \bigl(\{z - u_0,\ T(z - u_0),\ T^2(z - u_0),\ \ldots,\ T^{k-1}(z - u_0)\} \cup W\bigr) \]
such that $z_0 \in V_1$. It follows that there exist $a_1, a_2, \ldots, a_k \in F$ and $w^* \in W$ such that
\[ (0 \ne)\ z_0 = a_1 (z - u_0) + a_2 T(z - u_0) + \cdots + a_k T^{k-1}(z - u_0) + w^*. \]
Clearly, not all of $a_1, a_2, \ldots, a_k$ are zero.

Proof Suppose to the contrary that each $a_i$ is zero. We seek a contradiction. Since each $a_i$ is zero, we have
\[ V_1 \ni z_0 = 0(z - u_0) + 0\,T(z - u_0) + \cdots + 0\,T^{k-1}(z - u_0) + w^* = w^*, \]
and hence $w^* \in V_1$. Also $w^* \in W$. It follows that $(z_0 =)\ w^* \in V_1 \cap W\ (= \{0\})$, and hence $z_0 = 0$. This is a contradiction. ■

Suppose that $a_s$ is the first nonzero $a_i$, where $s \le k$. It follows from (*) that $T^{s-1}(z - u_0) \notin V_1 + W$. Also
\[
\begin{aligned}
(0 \ne)\ z_0 &= a_s T^{s-1}(z - u_0) + a_{s+1} T^{s}(z - u_0) + \cdots + a_k T^{k-1}(z - u_0) + w^* \\
&= \bigl(a_s I + a_{s+1} T + \cdots + a_k T^{k-s}\bigr)\bigl(T^{s-1}(z - u_0)\bigr) + w^*,
\end{aligned}
\]
so
\[ \bigl(a_s I + a_{s+1} T + \cdots + a_k T^{k-s}\bigr)\bigl(T^{s-1}(z - u_0)\bigr) = z_0 - w^*. \]
By 3.2.23, $\bigl(a_s I + a_{s+1} T + \cdots + a_k T^{k-s}\bigr)^{-1}$ exists and is a polynomial $p(S)$ in $S$, where
\[ S \equiv a_{s+1} T + \cdots + a_k T^{k-s}. \]
Thus
\[ V_1 + W \not\ni T^{s-1}(z - u_0) = (p(S))(z_0 - w^*) = (p(S))(z_0) - (p(S))(w^*). \]
Since $V_1$ is invariant under $T$, $V_1$ is invariant under $(a_{s+1} T + \cdots + a_k T^{k-s} =)\ S$. Now, since $p(S)$ is a polynomial in $S$, $V_1$ is invariant under $p(S)$. Next, since $z_0 \in V_1$, we have $(p(S))(z_0) \in V_1$. Since $(p(S))(z_0) - (p(S))(w^*) \notin V_1 + W$, we have $(p(S))(w^*) \notin W$.

Since $W$ is invariant under $T$, $W$ is invariant under $(a_{s+1} T + \cdots + a_k T^{k-s} =)\ S$. Since $p(S)$ is a polynomial in $S$, $W$ is invariant under $p(S)$. Next, since $w^* \in W$, we have
\[ (p(S))(w^*) \in W. \]
This is a contradiction. ■
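3.2.24.4 Note Let $V_1, V_2, \ldots, V_k$ be linear subspaces of $V$ such that

1. $V = V_1 \oplus V_2 \oplus \cdots \oplus V_k$,
2. each $V_i$ is invariant under $T$.

Suppose that $\dim(V_1) = n_1$, and $\{v_1^1, v_1^2, \ldots, v_1^{n_1}\}$ is a basis of $V_1$. Suppose that $\dim(V_2) = n_2$, and $\{v_2^1, v_2^2, \ldots, v_2^{n_2}\}$ is a basis of $V_2$, etc.
Since $V = V_1 \oplus V_2 \oplus \cdots \oplus V_k$, we have
\[ n = \dim(V) = n_1 + n_2 + \cdots + n_k, \]
and
\[ \{v_1^1, v_1^2, \ldots, v_1^{n_1},\ v_2^1, v_2^2, \ldots, v_2^{n_2},\ \ldots,\ v_k^1, v_k^2, \ldots, v_k^{n_k}\} \]
is a basis of $V$. Since $v_1^1 \in V_1$, and $V_1$ is invariant under $T$, we have $T(v_1^1) \in V_1$. Now, since $\{v_1^1, v_1^2, \ldots, v_1^{n_1}\}$ is a basis of $V_1$, there exist $a_1^1, a_1^2, \ldots, a_1^{n_1} \in F$ such that
\[ T(v_1^1) = a_1^1 v_1^1 + a_1^2 v_1^2 + \cdots + a_1^{n_1} v_1^{n_1}. \]
Hence
\[ T(v_1^1) = \bigl(a_1^1 v_1^1 + a_1^2 v_1^2 + \cdots + a_1^{n_1} v_1^{n_1}\bigr) + \bigl(0 v_2^1 + 0 v_2^2 + \cdots + 0 v_2^{n_2}\bigr) + \cdots + \bigl(0 v_k^1 + 0 v_k^2 + \cdots + 0 v_k^{n_k}\bigr). \]
Similarly,
\[
\begin{aligned}
T(v_1^2) &= \bigl(a_2^1 v_1^1 + \cdots + a_2^{n_1} v_1^{n_1}\bigr) + \bigl(0 v_2^1 + \cdots + 0 v_2^{n_2}\bigr) + \cdots + \bigl(0 v_k^1 + \cdots + 0 v_k^{n_k}\bigr), \\
&\ \ \vdots \\
T(v_1^{n_1}) &= \bigl(a_{n_1}^1 v_1^1 + \cdots + a_{n_1}^{n_1} v_1^{n_1}\bigr) + \bigl(0 v_2^1 + \cdots + 0 v_2^{n_2}\bigr) + \cdots + \bigl(0 v_k^1 + \cdots + 0 v_k^{n_k}\bigr), \\
T(v_2^1) &= \bigl(0 v_1^1 + \cdots + 0 v_1^{n_1}\bigr) + \bigl(b_1^1 v_2^1 + \cdots + b_1^{n_2} v_2^{n_2}\bigr) + \cdots + \bigl(0 v_k^1 + \cdots + 0 v_k^{n_k}\bigr), \\
T(v_2^2) &= \bigl(0 v_1^1 + \cdots + 0 v_1^{n_1}\bigr) + \bigl(b_2^1 v_2^1 + \cdots + b_2^{n_2} v_2^{n_2}\bigr) + \cdots + \bigl(0 v_k^1 + \cdots + 0 v_k^{n_k}\bigr), \\
&\ \ \vdots \\
T(v_2^{n_2}) &= \bigl(0 v_1^1 + \cdots + 0 v_1^{n_1}\bigr) + \bigl(b_{n_2}^1 v_2^1 + \cdots + b_{n_2}^{n_2} v_2^{n_2}\bigr) + \cdots + \bigl(0 v_k^1 + \cdots + 0 v_k^{n_k}\bigr), \\
&\ \ \vdots
\end{aligned}
\]
Thus the matrix of $T\ (\in A(V))$ relative to the basis
\[ \{v_1^1, v_1^2, \ldots, v_1^{n_1},\ v_2^1, v_2^2, \ldots, v_2^{n_2},\ \ldots,\ v_k^1, v_k^2, \ldots, v_k^{n_k}\} \]
is the $n \times n$ matrix in the canonical form
\[ \begin{bmatrix} A_1 & 0 & 0 \\ 0 & A_2 & 0 \\ 0 & 0 & \ddots \end{bmatrix}, \]
where
\[ A_1 \equiv \begin{bmatrix} a_1^1 & a_1^2 & \cdots \\ a_2^1 & a_2^2 & \cdots \\ \vdots & \vdots & \ddots \end{bmatrix}_{n_1 \times n_1}, \qquad A_2 \equiv \begin{bmatrix} b_1^1 & b_1^2 & \cdots \\ b_2^1 & b_2^2 & \cdots \\ \vdots & \vdots & \ddots \end{bmatrix}_{n_2 \times n_2}, \quad \text{etc.} \]
Since
\[
\begin{aligned}
(T|_{V_1})(v_1^1) &= T(v_1^1) = a_1^1 v_1^1 + a_1^2 v_1^2 + \cdots + a_1^{n_1} v_1^{n_1}, \\
(T|_{V_1})(v_1^2) &= T(v_1^2) = a_2^1 v_1^1 + a_2^2 v_1^2 + \cdots + a_2^{n_1} v_1^{n_1}, \\
&\ \ \vdots \\
(T|_{V_1})(v_1^{n_1}) &= T(v_1^{n_1}) = a_{n_1}^1 v_1^1 + a_{n_1}^2 v_1^2 + \cdots + a_{n_1}^{n_1} v_1^{n_1},
\end{aligned}
\]
$A_1$ is the $n_1 \times n_1$ matrix of the linear transformation $T|_{V_1}$ induced by $T$ on $V_1$. Similarly, $A_2$ is the $n_2 \times n_2$ matrix of the linear transformation $T|_{V_2}$ induced by $T$ on $V_2$, etc.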
3.2.24.5 Conclusion Let $V_1, V_2, \ldots, V_k$ be linear subspaces of $V$ such that

1. $V = V_1 \oplus V_2 \oplus \cdots \oplus V_k$,
2. each $V_i$ is invariant under $T$.

Then there exist a basis $\{v_1^1, v_1^2, \ldots, v_1^{n_1}\}$ of $V_1$, a basis $\{v_2^1, v_2^2, \ldots, v_2^{n_2}\}$ of $V_2, \ldots$, a basis $\{v_k^1, v_k^2, \ldots, v_k^{n_k}\}$ of $V_k$ such that the matrix of $T\ (\in A(V))$ relative to the basis $\{v_1^1, \ldots, v_1^{n_1},\ v_2^1, \ldots, v_2^{n_2},\ \ldots,\ v_k^1, \ldots, v_k^{n_k}\}$ has the canonical form
\[ \begin{bmatrix} A_1 & 0 & 0 \\ 0 & A_2 & 0 \\ 0 & 0 & \ddots \end{bmatrix}_{n \times n}, \]
where $A_1$ is the matrix of the linear transformation $T|_{V_1}$ relative to $\{v_1^1, v_1^2, \ldots, v_1^{n_1}\}$, $A_2$ is the matrix of the linear transformation $T|_{V_2}$ relative to $\{v_2^1, v_2^2, \ldots, v_2^{n_2}\}$, etc. Also $v_1^2 = T(v_1^1)$, $v_1^3 = T^2(v_1^1), \ldots$; $v_2^2 = T(v_2^1)$, $v_2^3 = T^2(v_2^1), \ldots$; etc. Hence $A_1$ takes the form
\[ \begin{bmatrix} 0 & 1 & 0 & \cdots & 0 \\ 0 & 0 & 1 & \cdots & 0 \\ \vdots & \vdots & \ddots & \ddots & \vdots \\ 0 & 0 & 0 & \cdots & 1 \\ 0 & 0 & 0 & \cdots & 0 \end{bmatrix}_{t \times t}. \]
Notation This matrix is denoted by $M_t$. Thus
\[ M_t \equiv \begin{bmatrix} 0 & 1 & 0 & \cdots & 0 \\ 0 & 0 & 1 & \cdots & 0 \\ \vdots & \vdots & \ddots & \ddots & \vdots \\ 0 & 0 & 0 & \cdots & 1 \\ 0 & 0 & 0 & \cdots & 0 \end{bmatrix}_{t \times t}. \]
Similarly, $A_2$ takes the form $M_s$ for some positive integer $s$, etc.
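The block $M_t$ can be checked numerically. The following is a minimal sketch in plain Python, not part of the original text; the helper names `nilpotent_block`, `matmul`, and `nilpotence_index` are chosen here for illustration. It builds $M_t$ and finds the smallest power that vanishes.

```python
def nilpotent_block(t):
    """Return the t x t matrix M_t: ones on the superdiagonal, zeros elsewhere."""
    return [[1 if j == i + 1 else 0 for j in range(t)] for i in range(t)]


def matmul(a, b):
    """Product of two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def nilpotence_index(m):
    """Smallest p >= 1 with m**p == 0, or None if no power up to len(m) vanishes."""
    power = m
    for p in range(1, len(m) + 1):
        if all(entry == 0 for row in power for entry in row):
            return p
        power = matmul(power, m)
    return None


M4 = nilpotent_block(4)
index_of_M4 = nilpotence_index(M4)
```

Under these assumptions `index_of_M4` equals 4, in line with the statement that a block $M_t$ arises from a cyclic subspace on which the restriction of $T$ has index of nilpotence $t$.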
3.2.24.6 Let $V$ be any $n$-dimensional vector space over the field $F$. Let $T \in A(V)$. Suppose that $T$ is nilpotent. Let the positive integer $n_1$ be the index of nilpotence of $T$.

By 3.2.24.3, there exist linear subspaces $V_1$, $W$ of $V$ such that

1. $V = V_1 \oplus W$,
2. $V_1$, $W$ are invariant under $T$.

Now, by 3.2.24.4, there exists a basis $\{v_1^1, v_1^2, \ldots, v_1^{n_1}\}$ of $V_1$ such that for every basis $B$ of $W$, the matrix of $T\ (\in A(V))$ relative to the basis $\{v_1^1, v_1^2, \ldots, v_1^{n_1}\} \cup B$ has the canonical form
\[ \begin{bmatrix} M_{n_1} & 0 \\ 0 & A_2 \end{bmatrix}_{n \times n}, \]
where $M_{n_1}$ is the $n_1 \times n_1$ matrix of the linear transformation $T|_{V_1}$ relative to the basis $\{v_1^1, v_1^2, \ldots, v_1^{n_1}\}$, and $A_2$ is the $(n - n_1) \times (n - n_1)$ matrix of the linear transformation $T|_W$ relative to the "arbitrary" basis $B$.

Since $T^{n_1} = 0$ and $W$ is invariant under $T$, we have $(T|_W)^{n_1} = 0$, and hence there exists a smallest positive integer $n_2$ such that $n_2 \le n_1$ and $(T|_W)^{n_2} = 0$. Thus $(T_2)^{n_2} = 0$, where $T_2 \equiv T|_W \in A(W)$. Also, $n_2$ is the index of nilpotence of $T_2$. Again by 3.2.24.3, there exist linear subspaces $V_2$, $X$ of $W$ such that

1. $W = V_2 \oplus X$,
2. $V_2$, $X$ are invariant under $T_2$.

Now, by 3.2.24.4, there exists a basis $\{v_2^1, v_2^2, \ldots, v_2^{n_2}\}$ of $V_2$ such that for every basis $C$ of $X$, the matrix of $T_2\ (\in A(W))$ relative to the basis $\{v_2^1, v_2^2, \ldots, v_2^{n_2}\} \cup C$ has the canonical form
\[ \begin{bmatrix} M_{n_2} & 0 \\ 0 & A_3 \end{bmatrix}_{(n - n_1) \times (n - n_1)}, \]
where $M_{n_2}$ is the $n_2 \times n_2$ matrix of the linear transformation $T|_{V_2}$ relative to the basis $\{v_2^1, v_2^2, \ldots, v_2^{n_2}\}$, and $A_3$ is the $((n - n_1) - n_2) \times ((n - n_1) - n_2)$ matrix of the linear transformation $T_2|_X$ relative to the "arbitrary" basis $C$.

Now, since $A_2$ is the $(n - n_1) \times (n - n_1)$ matrix of the linear transformation $T_2$ relative to the "arbitrary" basis $B$, the matrix of $T\ (\in A(V))$ relative to the basis $\{v_1^1, v_1^2, \ldots, v_1^{n_1}\} \cup \{v_2^1, v_2^2, \ldots, v_2^{n_2}\} \cup C$ has the canonical form
\[ \begin{bmatrix} M_{n_1} & 0 & 0 \\ 0 & M_{n_2} & 0 \\ 0 & 0 & A_3 \end{bmatrix}_{n \times n}. \]
We can repeat the above process finitely many times, obtaining finally the following result.
3.2.25 Conclusion Let $V$ be any $n$-dimensional vector space over the field $F$. Let $T \in A(V)$. Suppose that $T$ is nilpotent. Then there exist linear subspaces $V_1, V_2, \ldots, V_k$ of $V$ such that

1. $V = V_1 \oplus V_2 \oplus \cdots \oplus V_k$,
2. each $V_i$ is invariant under $T$.

Also there exist a basis $\{v_1^1, v_1^2, \ldots, v_1^{n_1}\}$ of $V_1$, a basis $\{v_2^1, v_2^2, \ldots, v_2^{n_2}\}$ of $V_2, \ldots$, a basis $\{v_k^1, v_k^2, \ldots, v_k^{n_k}\}$ of $V_k$ such that the matrix of $T\ (\in A(V))$ relative to the basis $\{v_1^1, \ldots, v_1^{n_1},\ v_2^1, \ldots, v_2^{n_2},\ \ldots,\ v_k^1, \ldots, v_k^{n_k}\}$ has the canonical form
\[ \begin{bmatrix} M_{n_1} & 0 & 0 \\ 0 & \ddots & 0 \\ 0 & 0 & M_{n_k} \end{bmatrix}_{n \times n}, \]
where $M_{n_1}$ is the matrix of the linear transformation $T|_{V_1}$ relative to $\{v_1^1, v_1^2, \ldots, v_1^{n_1}\}$, $M_{n_2}$ is the matrix of the linear transformation $T|_{V_2}$ relative to $\{v_2^1, v_2^2, \ldots, v_2^{n_2}\}$, etc. Further,
\[ n = n_1 + \cdots + n_k \]
and
\[ n_1 \ge n_2 \ge \cdots \ge n_k. \]
3.3 The Cayley–Hamilton Theorem
3.3.1 Definition By the transpose of an $m \times n$ matrix $A \equiv [a_{ij}]$, we mean the $n \times m$ matrix whose $(i, j)$-entry is $a_{ji}$. The transpose of $A$ is denoted by $A^T$. By the conjugate of an $m \times n$ matrix $A \equiv [a_{ij}]$ having complex numbers as entries, we mean the $m \times n$ matrix whose $(i, j)$-entry is $\overline{a_{ij}}$, where $\overline{a_{ij}}$ denotes the complex conjugate of the complex number $a_{ij}$. The conjugate of $A$ is denoted by $\overline{A}$. By $A^*$ we mean $(\overline{A})^T$, and this matrix is called the conjugate transpose of $A$. The $n \times n$ matrix whose $(i, j)$-entry is
\[ \begin{cases} 1 & \text{if } i = j \\ 0 & \text{if } i \ne j \end{cases} \]
is denoted by $I_n$, or simply $I$. By a scalar matrix we mean a scalar multiple of $I$. By a zero matrix we mean a matrix each entry of which is $0$.

Definition Let $A \equiv [a_{ij}]$ be a square complex matrix. If $(i \ne j \Rightarrow a_{ij} = 0)$, then we say that $A$ is diagonal. If $(i > j \Rightarrow a_{ij} = 0)$, then we say that $A$ is upper triangular. If $A^T = A$, then we say that $A$ is symmetric. If $A^T A = A A^T = I$, then we say that $A$ is orthogonal. If $A^* A = A A^* = I$, then we say that $A$ is unitary. If $A^* = A$, then we say that $A$ is Hermitian. If $A^* A = A A^*$, then we say that $A$ is normal.

Definition Let $A$, $B$ be square complex matrices of the same size. If there exists an invertible matrix $P$ such that $P^{-1} A P = B$, then we say that $A$ and $B$ are similar. Clearly, the relation of similarity is an equivalence relation.
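The definitions of Hermitian, unitary, and normal matrices can be tried out on small examples. The sketch below is illustrative only: the helper names and the two sample matrices `H` and `U` are assumptions made here, and comparisons use a small tolerance because Python complex numbers are floating point.

```python
def conj_transpose(a):
    """Return A* = (conjugate of A) transposed, for a matrix given as a list of rows."""
    rows, cols = len(a), len(a[0])
    return [[a[i][j].conjugate() for i in range(rows)] for j in range(cols)]


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def identity(n):
    return [[1 if i == j else 0 for j in range(n)] for i in range(n)]


def close(a, b, tol=1e-12):
    """Entrywise comparison of two matrices of the same size."""
    return all(abs(x - y) <= tol for ra, rb in zip(a, b) for x, y in zip(ra, rb))


def is_hermitian(a):
    return close(conj_transpose(a), a)


def is_unitary(a):
    s = conj_transpose(a)
    return close(matmul(s, a), identity(len(a))) and close(matmul(a, s), identity(len(a)))


def is_normal(a):
    s = conj_transpose(a)
    return close(matmul(s, a), matmul(a, s))


H = [[2, 1j], [-1j, 3]]   # sample Hermitian matrix (hypothetical example)
U = [[0, 1j], [1j, 0]]    # sample unitary matrix (hypothetical example)
```

Here `H` is Hermitian and therefore normal but not unitary, while `U` is unitary and normal but not Hermitian, which shows that the classes are genuinely different.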
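Definition A submatrix of a matrix is obtained by suppressing some rows and/or suppressing some columns from the given matrix. Some nomenclatures are self-explanatory. For example, if
\[ A \equiv \begin{bmatrix} 1 & 3 & -1 \\ 4 & 0 & \tfrac{3}{5} \\ 0 & 3 & -2 \end{bmatrix} \]
is the given matrix, then $\begin{bmatrix} 0 & \tfrac{3}{5} \end{bmatrix}$ is a submatrix of $A$. Using submatrices, we can form matrices like
\[ \begin{bmatrix} [1] & \begin{bmatrix} 3 & -1 \end{bmatrix} \\[4pt] \begin{bmatrix} 4 \\ 0 \end{bmatrix} & \begin{bmatrix} 0 & \tfrac{3}{5} \\ 3 & -2 \end{bmatrix} \end{bmatrix}, \]
which is an example of a partitioned form of $A$ into a $2 \times 2$ block matrix. The manipulation of matrices in partitioned form is a basic technique in linear algebra. For example:
\[
\begin{aligned}
&\begin{bmatrix} [1] & \begin{bmatrix} 3 & -1 \end{bmatrix} \\[4pt] \begin{bmatrix} 4 \\ 0 \end{bmatrix} & \begin{bmatrix} 0 & \tfrac{3}{5} \\ 3 & -2 \end{bmatrix} \end{bmatrix}_{2 \times 2}
\begin{bmatrix} [1] & [3] & [-1] \\[4pt] \begin{bmatrix} 4 \\ 0 \end{bmatrix} & \begin{bmatrix} 0 \\ 3 \end{bmatrix} & \begin{bmatrix} \tfrac{3}{5} \\ -2 \end{bmatrix} \end{bmatrix}_{2 \times 3} \\[6pt]
&= \begin{bmatrix}
[1][1] + \begin{bmatrix} 3 & -1 \end{bmatrix}\begin{bmatrix} 4 \\ 0 \end{bmatrix} &
[1][3] + \begin{bmatrix} 3 & -1 \end{bmatrix}\begin{bmatrix} 0 \\ 3 \end{bmatrix} &
[1][-1] + \begin{bmatrix} 3 & -1 \end{bmatrix}\begin{bmatrix} \tfrac{3}{5} \\ -2 \end{bmatrix} \\[8pt]
\begin{bmatrix} 4 \\ 0 \end{bmatrix}[1] + \begin{bmatrix} 0 & \tfrac{3}{5} \\ 3 & -2 \end{bmatrix}\begin{bmatrix} 4 \\ 0 \end{bmatrix} &
\begin{bmatrix} 4 \\ 0 \end{bmatrix}[3] + \begin{bmatrix} 0 & \tfrac{3}{5} \\ 3 & -2 \end{bmatrix}\begin{bmatrix} 0 \\ 3 \end{bmatrix} &
\begin{bmatrix} 4 \\ 0 \end{bmatrix}[-1] + \begin{bmatrix} 0 & \tfrac{3}{5} \\ 3 & -2 \end{bmatrix}\begin{bmatrix} \tfrac{3}{5} \\ -2 \end{bmatrix}
\end{bmatrix}_{2 \times 3} \\[6pt]
&= \begin{bmatrix}
[1] + [12] & [3] + [-3] & [-1] + \bigl[\tfrac{19}{5}\bigr] \\[4pt]
\begin{bmatrix} 4 \\ 0 \end{bmatrix} + \begin{bmatrix} 0 \\ 12 \end{bmatrix} &
\begin{bmatrix} 12 \\ 0 \end{bmatrix} + \begin{bmatrix} \tfrac{9}{5} \\ -6 \end{bmatrix} &
\begin{bmatrix} -4 \\ 0 \end{bmatrix} + \begin{bmatrix} -\tfrac{6}{5} \\ \tfrac{29}{5} \end{bmatrix}
\end{bmatrix}_{2 \times 3} \\[6pt]
&= \begin{bmatrix}
[13] & [0] & \bigl[\tfrac{14}{5}\bigr] \\[4pt]
\begin{bmatrix} 4 \\ 12 \end{bmatrix} & \begin{bmatrix} \tfrac{69}{5} \\ -6 \end{bmatrix} & \begin{bmatrix} -\tfrac{26}{5} \\ \tfrac{29}{5} \end{bmatrix}
\end{bmatrix}_{2 \times 3}.
\end{aligned}
\]
Further,
\[ \begin{bmatrix} 1 & 3 & -1 \\ 4 & 0 & \tfrac{3}{5} \\ 0 & 3 & -2 \end{bmatrix} \begin{bmatrix} 1 & 3 & -1 \\ 4 & 0 & \tfrac{3}{5} \\ 0 & 3 & -2 \end{bmatrix} = \begin{bmatrix} 13 & 0 & \tfrac{14}{5} \\ 4 & \tfrac{69}{5} & -\tfrac{26}{5} \\ 12 & -6 & \tfrac{29}{5} \end{bmatrix}. \]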
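The agreement between blockwise and ordinary multiplication in this example can be confirmed with exact rational arithmetic. This is a minimal sketch using Python's standard `fractions` module; the helper names `matmul`, `matadd`, and `block` are assumptions introduced here, not notation from the text.

```python
from fractions import Fraction as Fr


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def matadd(a, b):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def block(m, rows, cols):
    """Submatrix of m keeping the listed row and column indices."""
    return [[m[i][j] for j in cols] for i in rows]


A = [[Fr(1), Fr(3), Fr(-1)],
     [Fr(4), Fr(0), Fr(3, 5)],
     [Fr(0), Fr(3), Fr(-2)]]

# Partition rows as {0} | {1, 2} and columns as {0} | {1, 2}.
A11, A12 = block(A, [0], [0]), block(A, [0], [1, 2])
A21, A22 = block(A, [1, 2], [0]), block(A, [1, 2], [1, 2])

square = matmul(A, A)                                        # ordinary product A * A
top_left = matadd(matmul(A11, A11), matmul(A12, A21))        # block (1,1) of A * A
bottom_right = matadd(matmul(A21, A12), matmul(A22, A22))    # block (2,2) of A * A
```

The blocks computed from the partition coincide with the corresponding submatrices of `square`, which is the point of the worked example: a conformable partition lets blocks be multiplied as if they were scalars.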
3.3.2 Note By an elementary row operation for matrices, we mean any one of the following:

1. interchange any two rows,
2. multiply a row by a nonzero constant,
3. add a multiple of a row to another row.

Definition An $n \times n$ (or $n$-square for short) matrix $E$ is called an elementary matrix if there exists an elementary row operation $R$ such that $E$ is obtained by a single application of $R$ on the unit matrix $I_n$.

For $n = 3$, examples of elementary matrices are
\[ \begin{bmatrix} 1 & 0 & 0 \\ 0 & 0 & 1 \\ 0 & 1 & 0 \end{bmatrix}, \qquad \begin{bmatrix} 1 & 0 & 0 \\ 0 & -3 & 0 \\ 0 & 0 & 1 \end{bmatrix}, \qquad \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 5 \\ 0 & 0 & 1 \end{bmatrix}. \]
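3.3.3 Example Observe that
\[ \begin{bmatrix} 1 & 2 & 3 \\ 4 & 5 & 6 \end{bmatrix} \xrightarrow{R_2 \to R_2 + (-4)R_1} \begin{bmatrix} 1 & 2 & 3 \\ 0 & -3 & -6 \end{bmatrix}, \qquad \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} \xrightarrow{R_2 \to R_2 + (-4)R_1} \begin{bmatrix} 1 & 0 \\ -4 & 1 \end{bmatrix}, \]
and
\[ \begin{bmatrix} 1 & 0 \\ -4 & 1 \end{bmatrix} \begin{bmatrix} 1 & 2 & 3 \\ 4 & 5 & 6 \end{bmatrix} = \begin{bmatrix} 1 & 2 & 3 \\ 0 & -3 & -6 \end{bmatrix}. \]
Again
\[ \begin{bmatrix} 1 & 2 & 3 \\ 0 & -3 & -6 \end{bmatrix} \xrightarrow{R_2 \to \frac{1}{-3}R_2} \begin{bmatrix} 1 & 2 & 3 \\ 0 & 1 & 2 \end{bmatrix}, \qquad \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} \xrightarrow{R_2 \to \frac{1}{-3}R_2} \begin{bmatrix} 1 & 0 \\ 0 & \frac{1}{-3} \end{bmatrix}, \]
and
\[ \begin{bmatrix} 1 & 0 \\ 0 & \frac{1}{-3} \end{bmatrix} \begin{bmatrix} 1 & 2 & 3 \\ 0 & -3 & -6 \end{bmatrix} = \begin{bmatrix} 1 & 2 & 3 \\ 0 & 1 & 2 \end{bmatrix}. \]
Next,
\[ \begin{bmatrix} 1 & 2 & 3 \\ 0 & 1 & 2 \end{bmatrix} \xrightarrow{R_1 \to R_1 + (-2)R_2} \begin{bmatrix} 1 & 0 & -1 \\ 0 & 1 & 2 \end{bmatrix}, \qquad \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} \xrightarrow{R_1 \to R_1 + (-2)R_2} \begin{bmatrix} 1 & -2 \\ 0 & 1 \end{bmatrix}, \]
and
\[ \begin{bmatrix} 1 & -2 \\ 0 & 1 \end{bmatrix} \begin{bmatrix} 1 & 2 & 3 \\ 0 & 1 & 2 \end{bmatrix} = \begin{bmatrix} 1 & 0 & -1 \\ 0 & 1 & 2 \end{bmatrix}. \]
Now we apply certain elementary column operations on $\begin{bmatrix} 1 & 0 & -1 \\ 0 & 1 & 2 \end{bmatrix}$ to make things simpler. The following are self-explanatory:
\[ \begin{bmatrix} 1 & 0 & -1 \\ 0 & 1 & 2 \end{bmatrix} \xrightarrow{C_3 \to C_3 + (-2)C_2} \begin{bmatrix} 1 & 0 & -1 \\ 0 & 1 & 0 \end{bmatrix}, \qquad \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix} \xrightarrow{C_3 \to C_3 + (-2)C_2} \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & -2 \\ 0 & 0 & 1 \end{bmatrix}, \]
\[ \begin{bmatrix} 1 & 0 & -1 \\ 0 & 1 & 2 \end{bmatrix} \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & -2 \\ 0 & 0 & 1 \end{bmatrix} = \begin{bmatrix} 1 & 0 & -1 \\ 0 & 1 & 0 \end{bmatrix}. \]
Next,
\[ \begin{bmatrix} 1 & 0 & -1 \\ 0 & 1 & 0 \end{bmatrix} \xrightarrow{C_3 \to C_3 + 1C_1} \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \end{bmatrix}, \qquad \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix} \xrightarrow{C_3 \to C_3 + 1C_1} \begin{bmatrix} 1 & 0 & 1 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}, \]
\[ \begin{bmatrix} 1 & 0 & -1 \\ 0 & 1 & 0 \end{bmatrix} \begin{bmatrix} 1 & 0 & 1 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix} = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \end{bmatrix}. \]
If we collect the above information, we get
\[
\begin{aligned}
\begin{bmatrix} 1 & 2 & 3 \\ 4 & 5 & 6 \end{bmatrix}
&\xrightarrow{R_2 \to R_2 + (-4)R_1} \begin{bmatrix} 1 & 2 & 3 \\ 0 & -3 & -6 \end{bmatrix}
\xrightarrow{R_2 \to \frac{1}{-3}R_2} \begin{bmatrix} 1 & 2 & 3 \\ 0 & 1 & 2 \end{bmatrix} \\
&\xrightarrow{R_1 \to R_1 + (-2)R_2} \begin{bmatrix} 1 & 0 & -1 \\ 0 & 1 & 2 \end{bmatrix}
\xrightarrow{C_3 \to C_3 + (-2)C_2} \begin{bmatrix} 1 & 0 & -1 \\ 0 & 1 & 0 \end{bmatrix}
\xrightarrow{C_3 \to C_3 + 1C_1} \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \end{bmatrix}
\end{aligned}
\]
and
\[ \left( \begin{bmatrix} 1 & -2 \\ 0 & 1 \end{bmatrix} \begin{bmatrix} 1 & 0 \\ 0 & \frac{1}{-3} \end{bmatrix} \begin{bmatrix} 1 & 0 \\ -4 & 1 \end{bmatrix} \right) \begin{bmatrix} 1 & 2 & 3 \\ 4 & 5 & 6 \end{bmatrix} \left( \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & -2 \\ 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} 1 & 0 & 1 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix} \right) = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \end{bmatrix}. \]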
3.3.4 Conclusion Let $A$ be a nonzero $m \times n$ matrix with complex numbers as entries. Then there exist an invertible $m \times m$ matrix $P$, an invertible $n \times n$ matrix $Q$, and a positive integer $r$ such that

1. $PAQ = \begin{bmatrix} I_r & 0 \\ 0 & 0 \end{bmatrix}_{m \times n}$,
2. $P$ is a product of elementary matrices of size $m \times m$,
3. $Q$ is a product of elementary matrices of size $n \times n$,
4. $r$ is the rank of $A$.
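For the $2 \times 3$ matrix with rows $(1, 2, 3)$ and $(4, 5, 6)$ treated in 3.3.3, the factors $P$ and $Q$ can be multiplied out exactly. The sketch below uses the standard `fractions` module; the variable names `E1`, `E2`, `E3`, `F1`, `F2` for the individual elementary matrices are labels chosen here for illustration.

```python
from fractions import Fraction as Fr


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


A = [[1, 2, 3], [4, 5, 6]]

# Elementary matrices of the three row operations, in the order applied.
E1 = [[1, 0], [-4, 1]]          # R2 -> R2 + (-4) R1
E2 = [[1, 0], [0, Fr(-1, 3)]]   # R2 -> (1/-3) R2
E3 = [[1, -2], [0, 1]]          # R1 -> R1 + (-2) R2

# Elementary matrices of the two column operations, in the order applied.
F1 = [[1, 0, 0], [0, 1, -2], [0, 0, 1]]   # C3 -> C3 + (-2) C2
F2 = [[1, 0, 1], [0, 1, 0], [0, 0, 1]]    # C3 -> C3 + 1 C1

P = matmul(E3, matmul(E2, E1))   # row operations act on the left, latest first
Q = matmul(F1, F2)               # column operations act on the right, earliest first
canonical = matmul(matmul(P, A), Q)
```

Here `canonical` is the $2 \times 3$ matrix with $I_2$ in its leading block and a zero last column, so $r = 2$ for this $A$.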
3.3.5 Observe that
\[ Q^T (A^T) P^T = (PAQ)^T = \left( \begin{bmatrix} I_r & 0 \\ 0 & 0 \end{bmatrix}_{m \times n} \right)^{T} = \begin{bmatrix} I_r & 0 \\ 0 & 0 \end{bmatrix}_{n \times m}, \]
so
\[ R (A^T) S = \begin{bmatrix} I_r & 0 \\ 0 & 0 \end{bmatrix}_{n \times m}, \]
where $R \equiv Q^T$ and $S \equiv P^T$. Since $Q$ is invertible, $(R =)\ Q^T$ is invertible, and hence $R$ is invertible. Similarly, $S$ is invertible. It follows that $\operatorname{rank}(A^T) = r\ (= \operatorname{rank}(A))$.
Thus $\operatorname{rank}(A^T) = \operatorname{rank}(A)$. Similarly, $\operatorname{rank}(\overline{A}) = \operatorname{rank}(A)$ and $\operatorname{rank}(A^*) = \operatorname{rank}(A)$.
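The equality $\operatorname{rank}(A^T) = \operatorname{rank}(A)$ can be spot-checked by row reduction. The following sketch is restricted to matrices with rational entries so that the arithmetic is exact; the function names `rank` and `transpose` and the sample matrices are assumptions made here.

```python
from fractions import Fraction


def rank(matrix):
    """Rank of a rational matrix, counted as the number of pivots in row reduction."""
    rows = [[Fraction(x) for x in row] for row in matrix]
    ncols = len(rows[0]) if rows else 0
    r = 0  # number of pivots found so far
    for c in range(ncols):
        pivot = next((i for i in range(r, len(rows)) if rows[i][c] != 0), None)
        if pivot is None:
            continue  # no pivot in this column
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(r + 1, len(rows)):
            factor = rows[i][c] / rows[r][c]
            rows[i] = [x - factor * y for x, y in zip(rows[i], rows[r])]
        r += 1
    return r


def transpose(m):
    return [list(col) for col in zip(*m)]


A = [[1, 2, 3], [4, 5, 6]]
B = [[1, 2], [2, 4], [3, 6]]   # second column is twice the first
```

With these inputs, `rank(A)` and `rank(transpose(A))` are both 2, and `rank(B)` and `rank(transpose(B))` are both 1.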
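3.3.6 Note It is easy to see that for a partitioned matrix $\begin{bmatrix} A & B \\ 0 & C \end{bmatrix}$, in which $A$ and $C$ are square blocks, we have
\[ \det \begin{bmatrix} A & B \\ 0 & C \end{bmatrix} = \det(A) \cdot \det(C). \]
It is also written as
\[ \begin{vmatrix} A & B \\ 0 & C \end{vmatrix} = \det(A) \cdot \det(C). \]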
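3.3.7 Example Expanding along the last row, and then along the last row of each resulting $4 \times 4$ determinant, we get
\[
\begin{aligned}
\begin{vmatrix}
a_{11} & a_{12} & a_{13} & b_{11} & b_{12} \\
a_{21} & a_{22} & a_{23} & b_{21} & b_{22} \\
a_{31} & a_{32} & a_{33} & b_{31} & b_{32} \\
0 & 0 & 0 & c_{11} & c_{12} \\
0 & 0 & 0 & c_{21} & c_{22}
\end{vmatrix}
&= -c_{21}
\begin{vmatrix}
a_{11} & a_{12} & a_{13} & b_{12} \\
a_{21} & a_{22} & a_{23} & b_{22} \\
a_{31} & a_{32} & a_{33} & b_{32} \\
0 & 0 & 0 & c_{12}
\end{vmatrix}
+ c_{22}
\begin{vmatrix}
a_{11} & a_{12} & a_{13} & b_{11} \\
a_{21} & a_{22} & a_{23} & b_{21} \\
a_{31} & a_{32} & a_{33} & b_{31} \\
0 & 0 & 0 & c_{11}
\end{vmatrix} \\[6pt]
&= -c_{21} \left( c_{12} \begin{vmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{vmatrix} \right)
+ c_{22} \left( c_{11} \begin{vmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{vmatrix} \right) \\[6pt]
&= \begin{vmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{vmatrix} (c_{11} c_{22} - c_{12} c_{21})
= \begin{vmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{vmatrix} \begin{vmatrix} c_{11} & c_{12} \\ c_{21} & c_{22} \end{vmatrix},
\end{aligned}
\]
and hence
\[
\begin{vmatrix}
a_{11} & a_{12} & a_{13} & b_{11} & b_{12} \\
a_{21} & a_{22} & a_{23} & b_{21} & b_{22} \\
a_{31} & a_{32} & a_{33} & b_{31} & b_{32} \\
0 & 0 & 0 & c_{11} & c_{12} \\
0 & 0 & 0 & c_{21} & c_{22}
\end{vmatrix}
= \begin{vmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{vmatrix} \begin{vmatrix} c_{11} & c_{12} \\ c_{21} & c_{22} \end{vmatrix}.
\]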
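The same last-row expansion can be written as a short recursive routine and applied to a concrete block triangular matrix. This is an illustrative sketch; the function name `det` and the sample integer blocks `A`, `B`, `C` are assumptions introduced here, and the recursion is only practical for small matrices.

```python
def det(m):
    """Determinant by cofactor expansion along the last row (small matrices only)."""
    n = len(m)
    if n == 0:
        return 1
    if n == 1:
        return m[0][0]
    total = 0
    for j in range(n):
        if m[n - 1][j] == 0:
            continue  # zero entries contribute nothing
        minor = [row[:j] + row[j + 1:] for row in m[:n - 1]]
        total += (-1) ** ((n - 1) + j) * m[n - 1][j] * det(minor)
    return total


A = [[2, 1, 0], [1, 3, 1], [0, 1, 4]]   # sample 3 x 3 block
B = [[5, 6], [7, 8], [9, 1]]            # sample 3 x 2 block
C = [[1, 2], [3, 4]]                    # sample 2 x 2 block

# Assemble the 5 x 5 matrix [[A, B], [0, C]].
M = [ra + rb for ra, rb in zip(A, B)] + [[0, 0, 0] + rc for rc in C]
```

For these blocks `det(A)` is 18 and `det(C)` is -2, and `det(M)` agrees with their product regardless of the entries of `B`.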
3.3.8 Problem Let $V$ be an $n$-dimensional vector space over $\mathbb{C}$. Let $A : V \to V$ be a linear transformation. Let $k_1$ and $k_2$ be distinct eigenvalues of $A$. Let $v_1$ be an eigenvector corresponding to the eigenvalue $k_1$. Let $v_2$ be an eigenvector corresponding to the eigenvalue $k_2$. Then $v_1, v_2$ are linearly independent.

Proof Suppose to the contrary that $v_1 = k v_2$, where $k$ is a complex number. We seek a contradiction.
Since $v_2$ is an eigenvector corresponding to the eigenvalue $k_2$, we have $v_2 \ne 0$. Similarly, $v_1 \ne 0$. Now, since $v_1 = k v_2$, we have $k \ne 0$. Also, $A(v_1) = k_1 v_1$ and $A(v_2) = k_2 v_2$. Hence
\[ k A(v_2) = A(k v_2) = A(v_1) = k_1 v_1 = k_1 k v_2. \]
Since $k A(v_2) = k_1 k v_2$ and $k$ is nonzero, we have $k_2 v_2 = A(v_2) = k_1 v_2$, and hence $(k_2 - k_1) v_2 = 0$. Since $k_1$ and $k_2$ are distinct, $k_2 - k_1$ is nonzero. Now, since $(k_2 - k_1) v_2 = 0$, we have $v_2 = 0$. This is a contradiction. ■

3.3.9 Conclusion The eigenvectors corresponding to distinct eigenvalues are linearly independent.
3.3.10 Note By $\mathbb{C}^n$ we shall mean the collection of all column matrices of size $n \times 1$ with complex entries. Such matrices are also called column vectors. We know that $\mathbb{C}^n$ is a vector space over $\mathbb{C}$. For every $x \equiv [x_1, \ldots, x_n]^T$ and $y \equiv [y_1, \ldots, y_n]^T$ in $\mathbb{C}^n$, we define
\[ \langle x, y \rangle \equiv x_1 \overline{y_1} + \cdots + x_n \overline{y_n}\ \bigl(= x^T \overline{y} = y^* x\bigr). \]
Clearly, $\mathbb{C}^n$ is an inner product space. For every $x \equiv [x_1, \ldots, x_n]^T$ in $\mathbb{C}^n$, we define the length $\|x\|$ of $x$ as follows:
\[ \|x\| \equiv \sqrt{|x_1|^2 + \cdots + |x_n|^2}\ \bigl(= \sqrt{\langle x, x \rangle}\bigr). \]
In $\mathbb{C}^n$, an $m$-tuple $(v_1, \ldots, v_m)$ of vectors in $\mathbb{C}^n$ can be thought of as a matrix $[v_{ij}]_{n \times m}$, where $v_1 \equiv [v_{11}, \ldots, v_{n1}]^T$, $v_2 \equiv [v_{12}, \ldots, v_{n2}]^T$, etc.

3.3.11 Note Let $x$ be a nonzero vector of $\mathbb{C}^n$.
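There exist distinct nonzero vectors $v_2, \ldots, v_n$ in $\mathbb{C}^n$ such that $\{x, v_2, \ldots, v_n\}$ is a basis of $\mathbb{C}^n$. Assume that for some complex number $a$, $ax + v_2$ is orthogonal to $x$, that is,
\[ \bigl(a \langle x, x \rangle + \langle v_2, x \rangle =\bigr)\ \langle ax + v_2, x \rangle = 0. \]
Since $x$ is nonzero, $\langle x, x \rangle$ is nonzero. This shows that $-\dfrac{\langle v_2, x \rangle}{\langle x, x \rangle} x + v_2$ is orthogonal to $x$. Let us put
\[ y_2 \equiv -\frac{\langle v_2, x \rangle}{\langle x, x \rangle} x + v_2. \]
Since $\{x, v_2, \ldots, v_n\}$ is a basis of $\mathbb{C}^n$, $\left\{x,\ -\dfrac{\langle v_2, x \rangle}{\langle x, x \rangle} x + v_2,\ v_3, \ldots, v_n\right\}$ is a basis of $\mathbb{C}^n$, and hence $\{x, y_2, v_3, \ldots, v_n\}$ is a basis of $\mathbb{C}^n$. Since $(y_2 =)\ -\dfrac{\langle v_2, x \rangle}{\langle x, x \rangle} x + v_2$ is orthogonal to $x$, it follows that $y_2$ is orthogonal to $x$.

Thus $\{x, y_2, v_3, \ldots, v_n\}$ is a basis of $\mathbb{C}^n$ such that $y_2$ is orthogonal to $x$.

Assume that for some complex numbers $a, b$, $ax + by_2 + v_3$ is orthogonal to $x$ and $y_2$, that is,
\[ \bigl(a \langle x, x \rangle + \langle v_3, x \rangle = a \langle x, x \rangle + b\,0 + \langle v_3, x \rangle = a \langle x, x \rangle + b \langle y_2, x \rangle + \langle v_3, x \rangle =\bigr)\ \langle ax + by_2 + v_3, x \rangle = 0 \]
and
\[ \bigl(b \langle y_2, y_2 \rangle + \langle v_3, y_2 \rangle = a\,0 + b \langle y_2, y_2 \rangle + \langle v_3, y_2 \rangle = a \langle x, y_2 \rangle + b \langle y_2, y_2 \rangle + \langle v_3, y_2 \rangle =\bigr)\ \langle ax + by_2 + v_3, y_2 \rangle = 0. \]
It follows that $-\dfrac{\langle v_3, x \rangle}{\langle x, x \rangle} x - \dfrac{\langle v_3, y_2 \rangle}{\langle y_2, y_2 \rangle} y_2 + v_3$ is orthogonal to $x$ and $y_2$. Let us put
\[ y_3 \equiv -\frac{\langle v_3, x \rangle}{\langle x, x \rangle} x - \frac{\langle v_3, y_2 \rangle}{\langle y_2, y_2 \rangle} y_2 + v_3. \]
Since $\{x, y_2, v_3, \ldots, v_n\}$ is a basis of $\mathbb{C}^n$, $\left\{x,\ y_2,\ -\dfrac{\langle v_3, x \rangle}{\langle x, x \rangle} x - \dfrac{\langle v_3, y_2 \rangle}{\langle y_2, y_2 \rangle} y_2 + v_3,\ v_4, \ldots, v_n\right\}$ is a basis of $\mathbb{C}^n$, and hence $\{x, y_2, y_3, v_4, \ldots, v_n\}$ is a basis of $\mathbb{C}^n$. Since $(y_3 =)\ -\dfrac{\langle v_3, x \rangle}{\langle x, x \rangle} x - \dfrac{\langle v_3, y_2 \rangle}{\langle y_2, y_2 \rangle} y_2 + v_3$ is orthogonal to $x$ and $y_2$, it follows that $y_3$ is orthogonal to $x$ and $y_2$.

Thus $\{x, y_2, y_3, v_4, \ldots, v_n\}$ is a basis of $\mathbb{C}^n$ such that the members of $\{x, y_2, y_3\}$ are mutually orthogonal, etc.

Finally, we get a basis $\{x, y_2, y_3, y_4, \ldots, y_n\}$ of $\mathbb{C}^n$ such that the members of $\{x, y_2, y_3, y_4, \ldots, y_n\}$ are mutually orthogonal.

It follows that $\left\{\dfrac{1}{\|x\|} x,\ \dfrac{1}{\|y_2\|} y_2,\ \dfrac{1}{\|y_3\|} y_3,\ \dfrac{1}{\|y_4\|} y_4,\ \ldots,\ \dfrac{1}{\|y_n\|} y_n\right\}$ is an orthonormal basis of $\mathbb{C}^n$.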
3.3.12 Conclusion Let $u_1, \ldots, u_k$ be orthonormal vectors in $\mathbb{C}^n$. Then there exist $n - k$ vectors $u_{k+1}, \ldots, u_n$ in $\mathbb{C}^n$ such that $\{u_1, \ldots, u_k, u_{k+1}, \ldots, u_n\}$ is an orthonormal basis of $\mathbb{C}^n$. By the definition of unitary matrix $(A^* A = A A^* = I)$, the $n$-square matrix $[u_1, \ldots, u_k, u_{k+1}, \ldots, u_n]$ is unitary.

The construction used here is known as the Gram–Schmidt orthogonalization process.
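The orthogonalization process can be sketched in a few lines of Python over the complex numbers. This is an illustration under stated assumptions rather than a robust numerical routine: the input vectors are assumed to be linearly independent, the function names `inner` and `gram_schmidt` and the sample basis are chosen here, and the inner product is conjugate-linear in the second argument as in 3.3.10.

```python
import math


def inner(x, y):
    """<x, y> = x_1 * conj(y_1) + ... + x_n * conj(y_n)."""
    return sum(a * b.conjugate() for a, b in zip(x, y))


def gram_schmidt(vectors):
    """Orthonormalize a linearly independent list of vectors in C^n."""
    ortho = []
    for v in vectors:
        y = list(v)
        for u in ortho:
            # Subtract the component of v along the earlier orthogonal vector u.
            coeff = inner(v, u) / inner(u, u)
            y = [yi - coeff * ui for yi, ui in zip(y, u)]
        ortho.append(y)
    # Divide each orthogonal vector by its length.
    return [[c / math.sqrt(inner(u, u).real) for c in u] for u in ortho]


basis = [[1, 1j, 0], [1, 0, 1], [0, 1, 1]]   # hypothetical basis of C^3
orthonormal = gram_schmidt(basis)
```

Each new vector is the original one minus its components along the previously constructed vectors, exactly as $y_2$ and $y_3$ are formed above; the final division by the lengths gives the orthonormal basis.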
3.3.13 Note Let $A \equiv [a_{ij}]$ be an $n$-square complex matrix.

For every $x \equiv [x_1, \ldots, x_n]^T$ in $\mathbb{C}^n$, the product $Ax$ of matrices $A$ and $x$ is a matrix of size $n \times 1$, and hence $Ax$ is a member of $\mathbb{C}^n$. Let us define a function $A : \mathbb{C}^n \to \mathbb{C}^n$ as follows: for every $x \equiv [x_1, \ldots, x_n]^T$ in $\mathbb{C}^n$,
\[ A(x) \equiv Ax. \]
Clearly, $A : \mathbb{C}^n \to \mathbb{C}^n$ is a linear transformation.
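3.3.14 Note Let $\mathcal{A} : \mathbb{C}^n \to \mathbb{C}^n$ be a linear transformation. Put
\[ e_1 \equiv [1, 0, \ldots, 0]^T\ (\in \mathbb{C}^n), \quad e_2 \equiv [0, 1, 0, \ldots, 0]^T\ (\in \mathbb{C}^n), \quad \text{etc.} \]
Clearly, $\{e_1, \ldots, e_n\}$ is an orthonormal basis of the inner product space $\mathbb{C}^n$. This basis is called the standard basis of $\mathbb{C}^n$. Let $A \equiv [a_{ij}]$ be the matrix of $\mathcal{A}$ relative to the basis $\{e_1, \ldots, e_n\}$. Hence
\[
\begin{aligned}
\mathcal{A}(e_1) &= a_{11} e_1 + a_{21} e_2 + \cdots + a_{n1} e_n = \sum_{i=1}^{n} a_{i1} e_i = \begin{bmatrix} a_{11} \\ \vdots \\ a_{n1} \end{bmatrix}\ \bigl(= [a_{11}, \ldots, a_{n1}]^T\bigr), \\
\mathcal{A}(e_2) &= a_{12} e_1 + a_{22} e_2 + \cdots + a_{n2} e_n = \sum_{i=1}^{n} a_{i2} e_i = \begin{bmatrix} a_{12} \\ \vdots \\ a_{n2} \end{bmatrix}\ \bigl(= [a_{12}, \ldots, a_{n2}]^T\bigr), \quad \text{etc.}
\end{aligned}
\]
In short, $\mathcal{A}(e_j) = [a_{1j}, \ldots, a_{nj}]^T$. By 3.3.13,
\[ \hat{A} : x \mapsto Ax \]
is a linear transformation from $\mathbb{C}^n$ to $\mathbb{C}^n$. It follows that
\[ \hat{A}(e_1) = A e_1 = \begin{bmatrix} a_{11} \\ \vdots \\ a_{n1} \end{bmatrix} = \mathcal{A}(e_1), \qquad \hat{A}(e_2) = A e_2 = \begin{bmatrix} a_{12} \\ \vdots \\ a_{n2} \end{bmatrix} = \mathcal{A}(e_2), \quad \text{etc.} \]
Since $\hat{A}$ and $\mathcal{A}$ are linear and agree on the basis $\{e_1, \ldots, e_n\}$, it follows that $\hat{A} = \mathcal{A}$.

3.3.15 Conclusion The $n$-square complex matrix $A \equiv [a_{ij}]$ can be represented by the linear transformation $x \mapsto Ax$ from $\mathbb{C}^n$ to $\mathbb{C}^n$.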
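3.3.16 Note Let $a_1, a_2, a_3, a_4$ be any complex numbers. Observe that
\[ \begin{vmatrix} 1 & 1 \\ a_1 & a_2 \end{vmatrix} = (a_2 - a_1). \]
Next,
\[
\begin{aligned}
\begin{vmatrix} 1 & 1 & 1 \\ a_1 & a_2 & a_3 \\ (a_1)^2 & (a_2)^2 & (a_3)^2 \end{vmatrix}
&= \begin{vmatrix} 1 & 1 & 1 \\ a_1 & a_2 & a_3 \\ 0 & a_2(a_2 - a_1) & a_3(a_3 - a_1) \end{vmatrix} && (R_3 \to R_3 - a_1 R_2) \\
&= \begin{vmatrix} 1 & 1 & 1 \\ 0 & a_2 - a_1 & a_3 - a_1 \\ 0 & a_2(a_2 - a_1) & a_3(a_3 - a_1) \end{vmatrix} && (R_2 \to R_2 - a_1 R_1) \\
&= (a_2 - a_1)(a_3 - a_1) \begin{vmatrix} 1 & 1 & 1 \\ 0 & 1 & 1 \\ 0 & a_2 & a_3 \end{vmatrix} \\
&= (a_2 - a_1)(a_3 - a_1) \begin{vmatrix} 1 & 1 \\ a_2 & a_3 \end{vmatrix} = (a_2 - a_1)(a_3 - a_1)(a_3 - a_2),
\end{aligned}
\]
so
\[ \begin{vmatrix} 1 & 1 & 1 \\ a_1 & a_2 & a_3 \\ (a_1)^2 & (a_2)^2 & (a_3)^2 \end{vmatrix} = (a_2 - a_1)(a_3 - a_1)(a_3 - a_2). \]
Again, applying $R_4 \to R_4 - a_1 R_3$, $R_3 \to R_3 - a_1 R_2$, $R_2 \to R_2 - a_1 R_1$ in this order,
\[
\begin{aligned}
\begin{vmatrix} 1 & 1 & 1 & 1 \\ a_1 & a_2 & a_3 & a_4 \\ (a_1)^2 & (a_2)^2 & (a_3)^2 & (a_4)^2 \\ (a_1)^3 & (a_2)^3 & (a_3)^3 & (a_4)^3 \end{vmatrix}
&= \begin{vmatrix} 1 & 1 & 1 & 1 \\ 0 & (a_2 - a_1) & (a_3 - a_1) & (a_4 - a_1) \\ 0 & a_2(a_2 - a_1) & a_3(a_3 - a_1) & a_4(a_4 - a_1) \\ 0 & (a_2)^2(a_2 - a_1) & (a_3)^2(a_3 - a_1) & (a_4)^2(a_4 - a_1) \end{vmatrix} \\
&= (a_2 - a_1)(a_3 - a_1)(a_4 - a_1) \begin{vmatrix} 1 & 1 & 1 & 1 \\ 0 & 1 & 1 & 1 \\ 0 & a_2 & a_3 & a_4 \\ 0 & (a_2)^2 & (a_3)^2 & (a_4)^2 \end{vmatrix} \\
&= (a_2 - a_1)(a_3 - a_1)(a_4 - a_1) \begin{vmatrix} 1 & 1 & 1 \\ a_2 & a_3 & a_4 \\ (a_2)^2 & (a_3)^2 & (a_4)^2 \end{vmatrix} \\
&= (a_2 - a_1)(a_3 - a_1)(a_4 - a_1) \cdot (a_3 - a_2)(a_4 - a_2)(a_4 - a_3),
\end{aligned}
\]
so
\[ \begin{vmatrix} 1 & 1 & 1 & 1 \\ a_1 & a_2 & a_3 & a_4 \\ (a_1)^2 & (a_2)^2 & (a_3)^2 & (a_4)^2 \\ (a_1)^3 & (a_2)^3 & (a_3)^3 & (a_4)^3 \end{vmatrix} = (a_2 - a_1)(a_3 - a_1)(a_3 - a_2)(a_4 - a_1)(a_4 - a_2)(a_4 - a_3), \quad \text{etc.} \]

3.3.17 Conclusion If each $a_i$ is a complex number, then
\[ \begin{vmatrix} 1 & 1 & \cdots & 1 \\ a_1 & a_2 & \cdots & a_n \\ \vdots & \vdots & & \vdots \\ (a_1)^{n-1} & (a_2)^{n-1} & \cdots & (a_n)^{n-1} \end{vmatrix} = \prod_{n \ge j > i \ge 1} (a_j - a_i). \]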
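The product formula can be compared against a directly computed determinant for sample values. The sketch below is illustrative; the function names `det`, `vandermonde`, and `vandermonde_product` and the sample list `a` are assumptions made here, and integer inputs are used so that the comparison is exact.

```python
from itertools import combinations


def det(m):
    """Determinant by cofactor expansion along the last row (small matrices only)."""
    n = len(m)
    if n == 1:
        return m[0][0]
    total = 0
    for j in range(n):
        minor = [row[:j] + row[j + 1:] for row in m[:n - 1]]
        total += (-1) ** ((n - 1) + j) * m[n - 1][j] * det(minor)
    return total


def vandermonde(a):
    """Matrix whose row i (counting from 0) holds the i-th powers of a_1, ..., a_n."""
    return [[x ** i for x in a] for i in range(len(a))]


def vandermonde_product(a):
    """Product of (a_j - a_i) over all index pairs with j > i."""
    prod = 1
    for i, j in combinations(range(len(a)), 2):   # pairs with i < j
        prod *= a[j] - a[i]
    return prod


a = [2, 3, 5, 7]
```

For this choice of `a`, both `det(vandermonde(a))` and `vandermonde_product(a)` equal 240, and the product is zero exactly when two of the $a_i$ coincide.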
3.3.18 Note Let $A \equiv [a_{ij}]$ be an $n$-square complex matrix. Let $k_1, \ldots, k_n$ be the eigenvalues of $A$. Suppose that all the eigenvalues of $A$ are distinct. Let $B \equiv [b_{ij}]$ be an $n$-square complex matrix such that $AB = BA$, that is, $B$ commutes with $A$.

Since $k_1$ is an eigenvalue of $A$, there exists a nonzero $u_1 \in \mathbb{C}^n$ such that $A u_1 = k_1 u_1$. Similarly, there exists a nonzero $u_2 \in \mathbb{C}^n$ such that $A u_2 = k_2 u_2$, etc. Since all the eigenvalues of $A$ are distinct, by 3.3.9, $\{u_1, \ldots, u_n\}$ is linearly independent, and hence $\{u_1, \ldots, u_n\}$ is a basis of $\mathbb{C}^n$. It follows that the $n \times n$ matrix $(u_1, \ldots, u_n)$ is invertible. Next,
\[
\begin{aligned}
(u_1, \ldots, u_n)^{-1} A (u_1, \ldots, u_n) &= (u_1, \ldots, u_n)^{-1} \bigl(A (u_1, \ldots, u_n)\bigr) \\
&= (u_1, \ldots, u_n)^{-1} (A u_1, \ldots, A u_n) = (u_1, \ldots, u_n)^{-1} (k_1 u_1, \ldots, k_n u_n) \\
&= \bigl((u_1, \ldots, u_n)^{-1} (k_1 u_1), \ldots, (u_1, \ldots, u_n)^{-1} (k_n u_n)\bigr) \\
&= \bigl(k_1 (u_1, \ldots, u_n)^{-1} u_1, \ldots, k_n (u_1, \ldots, u_n)^{-1} u_n\bigr) \\
&= (k_1 e_1, \ldots, k_n e_n) = \operatorname{diag}(k_1, \ldots, k_n),
\end{aligned}
\]
so
\[ (u_1, \ldots, u_n)^{-1} A (u_1, \ldots, u_n) = \operatorname{diag}(k_1, \ldots, k_n). \]
Next,
\[
\begin{aligned}
&\operatorname{diag}(k_1, \ldots, k_n) \bigl((u_1, \ldots, u_n)^{-1} B (u_1, \ldots, u_n)\bigr) \\
&\quad= \bigl((u_1, \ldots, u_n)^{-1} A (u_1, \ldots, u_n)\bigr) \bigl((u_1, \ldots, u_n)^{-1} B (u_1, \ldots, u_n)\bigr) \\
&\quad= (u_1, \ldots, u_n)^{-1} A \bigl((u_1, \ldots, u_n)(u_1, \ldots, u_n)^{-1}\bigr) B (u_1, \ldots, u_n) \\
&\quad= (u_1, \ldots, u_n)^{-1} A I B (u_1, \ldots, u_n) = (u_1, \ldots, u_n)^{-1} A B (u_1, \ldots, u_n) \\
&\quad= (u_1, \ldots, u_n)^{-1} B A (u_1, \ldots, u_n) = (u_1, \ldots, u_n)^{-1} B I A (u_1, \ldots, u_n) \\
&\quad= (u_1, \ldots, u_n)^{-1} B \bigl((u_1, \ldots, u_n)(u_1, \ldots, u_n)^{-1}\bigr) A (u_1, \ldots, u_n) \\
&\quad= \bigl((u_1, \ldots, u_n)^{-1} B (u_1, \ldots, u_n)\bigr) \bigl((u_1, \ldots, u_n)^{-1} A (u_1, \ldots, u_n)\bigr) \\
&\quad= \bigl((u_1, \ldots, u_n)^{-1} B (u_1, \ldots, u_n)\bigr) \operatorname{diag}(k_1, \ldots, k_n),
\end{aligned}
\]
so $\operatorname{diag}(k_1, \ldots, k_n)\, D = D \operatorname{diag}(k_1, \ldots, k_n)$, where $D \equiv (u_1, \ldots, u_n)^{-1} B (u_1, \ldots, u_n)$.

Suppose that $D \equiv (v_1, \ldots, v_n)$, where each $v_i \equiv [v_{1i}, \ldots, v_{ni}]^T$ is in $\mathbb{C}^n$. It follows that
\[
\begin{aligned}
\operatorname{diag}(k_1, \ldots, k_n)\, D &= \operatorname{diag}(k_1, \ldots, k_n) (v_1, \ldots, v_n) \\
&= \bigl(\operatorname{diag}(k_1, \ldots, k_n) v_1, \ldots, \operatorname{diag}(k_1, \ldots, k_n) v_n\bigr) \\
&= \bigl(\operatorname{diag}(k_1, \ldots, k_n) [v_{11}, \ldots, v_{n1}]^T, \ldots, \operatorname{diag}(k_1, \ldots, k_n) [v_{1n}, \ldots, v_{nn}]^T\bigr) \\
&= \bigl([k_1 v_{11}, \ldots, k_n v_{n1}]^T, \ldots, [k_1 v_{1n}, \ldots, k_n v_{nn}]^T\bigr) \\
&= \begin{bmatrix} k_1 v_{11} & \cdots & k_1 v_{1n} \\ \vdots & \ddots & \vdots \\ k_n v_{n1} & \cdots & k_n v_{nn} \end{bmatrix},
\end{aligned}
\]
and
\[
\begin{aligned}
D \operatorname{diag}(k_1, \ldots, k_n) &= (v_1, \ldots, v_n) \operatorname{diag}(k_1, \ldots, k_n) \\
&= (v_1, \ldots, v_n) (k_1 e_1, \ldots, k_n e_n) \\
&= \bigl((v_1, \ldots, v_n)(k_1 e_1), \ldots, (v_1, \ldots, v_n)(k_n e_n)\bigr) \\
&= \bigl(k_1 (v_1, \ldots, v_n) e_1, \ldots, k_n (v_1, \ldots, v_n) e_n\bigr) \\
&= (k_1 v_1, \ldots, k_n v_n) \\
&= \begin{bmatrix} k_1 v_{11} & \cdots & k_n v_{1n} \\ \vdots & \ddots & \vdots \\ k_1 v_{n1} & \cdots & k_n v_{nn} \end{bmatrix}.
\end{aligned}
\]
Next, since
\[ \begin{bmatrix} k_1 v_{11} & \cdots & k_1 v_{1n} \\ \vdots & \ddots & \vdots \\ k_n v_{n1} & \cdots & k_n v_{nn} \end{bmatrix} = \operatorname{diag}(k_1, \ldots, k_n)\, D = D \operatorname{diag}(k_1, \ldots, k_n) = \begin{bmatrix} k_1 v_{11} & \cdots & k_n v_{1n} \\ \vdots & \ddots & \vdots \\ k_1 v_{n1} & \cdots & k_n v_{nn} \end{bmatrix}, \]
we have
\[ \begin{bmatrix} k_1 v_{11} & \cdots & k_1 v_{1n} \\ \vdots & \ddots & \vdots \\ k_n v_{n1} & \cdots & k_n v_{nn} \end{bmatrix} = \begin{bmatrix} k_1 v_{11} & \cdots & k_n v_{1n} \\ \vdots & \ddots & \vdots \\ k_1 v_{n1} & \cdots & k_n v_{nn} \end{bmatrix}. \]
Now, since all the eigenvalues k1; . . .; kn of A are distinct, i 6¼ j ) vij ¼ 0.Next, since D ¼ v1; . . .; vnð Þ; and each vi equals v1i; . . .; vni½ �T , it follows that
234 3 Linear Transformations
$\bigl((u_1,\dots,u_n)^{-1}B(u_1,\dots,u_n) =\bigr)\ D$ is a diagonal matrix, and hence $P^{-1}BP$ is a diagonal matrix, where $P \equiv (u_1,\dots,u_n)$.

Since $P^{-1}BP$ is a diagonal matrix, we can suppose that
\[
P^{-1}BP \equiv \operatorname{diag}(\mu_1,\dots,\mu_n),
\]
where each $\mu_i$ is a complex number.

Let us consider the following system of linear equations in $n$ variables $x_0, x_1, \dots, x_{n-1}$:
\[
\left.
\begin{aligned}
1x_0 + \lambda_1x_1 + (\lambda_1)^2x_2 + \cdots + (\lambda_1)^{n-1}x_{n-1} &= \mu_1\\
1x_0 + \lambda_2x_1 + (\lambda_2)^2x_2 + \cdots + (\lambda_2)^{n-1}x_{n-1} &= \mu_2\\
&\ \,\vdots\\
1x_0 + \lambda_nx_1 + (\lambda_n)^2x_2 + \cdots + (\lambda_n)^{n-1}x_{n-1} &= \mu_n
\end{aligned}
\right\}.
\]
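Since $\lambda_1,\dots,\lambda_n$ are distinct, by 3.3.17, we have
\[
\begin{vmatrix}
1 & \lambda_1 & (\lambda_1)^2 & \cdots & (\lambda_1)^{n-1}\\
1 & \lambda_2 & (\lambda_2)^2 & \cdots & (\lambda_2)^{n-1}\\
\vdots & \vdots & \vdots & & \vdots\\
1 & \lambda_n & (\lambda_n)^2 & \cdots & (\lambda_n)^{n-1}
\end{vmatrix}
= \prod_{n \ge j > i \ge 1} (\lambda_j - \lambda_i) \ne 0,
\]
and the above system of linear equations has a unique solution $(x_0, x_1, \dots, x_{n-1}) = (a_0, a_1, \dots, a_{n-1})$.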
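Hence
\[
\left.
\begin{aligned}
1a_0 + \lambda_1a_1 + (\lambda_1)^2a_2 + \cdots + (\lambda_1)^{n-1}a_{n-1} &= \mu_1\\
1a_0 + \lambda_2a_1 + (\lambda_2)^2a_2 + \cdots + (\lambda_2)^{n-1}a_{n-1} &= \mu_2\\
&\ \,\vdots\\
1a_0 + \lambda_na_1 + (\lambda_n)^2a_2 + \cdots + (\lambda_n)^{n-1}a_{n-1} &= \mu_n
\end{aligned}
\right\},
\]
that is,
\[
\left.
\begin{aligned}
a_0 + a_1\lambda_1 + a_2(\lambda_1)^2 + \cdots + a_{n-1}(\lambda_1)^{n-1} &= \mu_1\\
a_0 + a_1\lambda_2 + a_2(\lambda_2)^2 + \cdots + a_{n-1}(\lambda_2)^{n-1} &= \mu_2\\
&\ \,\vdots\\
a_0 + a_1\lambda_n + a_2(\lambda_n)^2 + \cdots + a_{n-1}(\lambda_n)^{n-1} &= \mu_n
\end{aligned}
\right\}.
\]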
Let us denote the polynomial
\[
a_0 + a_1x + a_2x^2 + \cdots + a_{n-1}x^{n-1}
\]
by $p(x)$. Observe that $\deg(p(x)) \le n-1$. Also,
\[
p(\lambda_1) = \mu_1,\quad p(\lambda_2) = \mu_2,\quad \dots,\quad p(\lambda_n) = \mu_n.
\]
Next,
\[
p(A) = a_0I + a_1A + a_2A^2 + \cdots + a_{n-1}A^{n-1}
\]
and
\[
\begin{aligned}
p(P^{-1}AP)
&= a_0I + a_1(P^{-1}AP) + a_2(P^{-1}AP)(P^{-1}AP) + \cdots + a_{n-1}(P^{-1}AP)^{n-1}\\
&= a_0(P^{-1}IP) + a_1(P^{-1}AP) + a_2(P^{-1}A^2P) + \cdots + a_{n-1}(P^{-1}A^{n-1}P)\\
&= P^{-1}\bigl(a_0I + a_1A + a_2A^2 + \cdots + a_{n-1}A^{n-1}\bigr)P = P^{-1}p(A)P,
\end{aligned}
\]
so $p(P^{-1}AP) = P^{-1}p(A)P$. Since
\[
(P^{-1}AP =)\ (u_1,\dots,u_n)^{-1}A(u_1,\dots,u_n) = \operatorname{diag}(\lambda_1,\dots,\lambda_n),
\]
we have
\[
\begin{aligned}
P^{-1}p(A)P &= p(P^{-1}AP) = p\bigl(\operatorname{diag}(\lambda_1,\dots,\lambda_n)\bigr)\\
&= a_0I + a_1\operatorname{diag}(\lambda_1,\dots,\lambda_n) + a_2\bigl(\operatorname{diag}(\lambda_1,\dots,\lambda_n)\bigr)^2 + \cdots + a_{n-1}\bigl(\operatorname{diag}(\lambda_1,\dots,\lambda_n)\bigr)^{n-1}\\
&= a_0\operatorname{diag}(1,\dots,1) + a_1\operatorname{diag}(\lambda_1,\dots,\lambda_n) + a_2\operatorname{diag}\bigl((\lambda_1)^2,\dots,(\lambda_n)^2\bigr) + \cdots + a_{n-1}\operatorname{diag}\bigl((\lambda_1)^{n-1},\dots,(\lambda_n)^{n-1}\bigr)\\
&= \operatorname{diag}\bigl(a_0 + a_1\lambda_1 + \cdots + a_{n-1}(\lambda_1)^{n-1},\ \dots,\ a_0 + a_1\lambda_n + \cdots + a_{n-1}(\lambda_n)^{n-1}\bigr)\\
&= \operatorname{diag}\bigl(p(\lambda_1),\dots,p(\lambda_n)\bigr) = \operatorname{diag}(\mu_1,\dots,\mu_n) = P^{-1}BP,
\end{aligned}
\]
and hence
\[
P^{-1}p(A)P = P^{-1}BP.
\]
It follows that $p(A) = B$.
3.3.19 Conclusion Let $A \equiv [a_{ij}]$ be an n-square complex matrix. Suppose that all the eigenvalues of $A$ are distinct. Let $B \equiv [b_{ij}]$ be an n-square complex matrix such that $B$ commutes with $A$. Then there exists a polynomial $p(x)$ such that

1. $\deg(p(x)) \le n-1$,
2. $p(A) = B$.
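The construction behind 3.3.19 can be tried on a small case. The following is a minimal sketch, not part of the original text: the matrices `A` and `B`, the eigenvalue list `lams`, and the list `mus` (the diagonal of $P^{-1}BP$ for the eigenvectors $[1,0]^T$, $[1,1]^T$ of `A`) are illustrative assumptions worked out by hand, and the helper names `mat_mul`, `solve`, and `poly_at_matrix` are hypothetical. It solves the Vandermonde system for the coefficients of $p$ and then evaluates $p(A)$.

```python
from fractions import Fraction

def mat_mul(X, Y):
    """Product of two square matrices given as lists of rows."""
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def solve(M, rhs):
    """Solve M x = rhs by Gauss-Jordan elimination over the rationals.
    Assumes M is invertible (true for a Vandermonde matrix with distinct nodes)."""
    n = len(M)
    aug = [[Fraction(v) for v in row] + [Fraction(r)] for row, r in zip(M, rhs)]
    for col in range(n):
        pivot = next(r for r in range(col, n) if aug[r][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        aug[col] = [v / aug[col][col] for v in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [a - factor * b for a, b in zip(aug[r], aug[col])]
    return [row[n] for row in aug]

def poly_at_matrix(coeffs, A):
    """Evaluate a_0 I + a_1 A + ... + a_{n-1} A^{n-1}."""
    n = len(A)
    result = [[Fraction(0)] * n for _ in range(n)]
    power = [[Fraction(int(i == j)) for j in range(n)] for i in range(n)]  # A^0 = I
    for a in coeffs:
        result = [[result[i][j] + a * power[i][j] for j in range(n)]
                  for i in range(n)]
        power = mat_mul(power, A)
    return result

# Illustrative data (assumed): A has distinct eigenvalues 2 and 3, and AB = BA.
A = [[2, 1], [0, 3]]
B = [[5, 5], [0, 10]]
lams = [2, 3]    # eigenvalues of A
mus = [5, 10]    # diagonal entries of P^{-1} B P

# Row i of the Vandermonde matrix is [1, lam_i, lam_i^2, ...].
vandermonde = [[lam ** j for j in range(len(lams))] for lam in lams]
coeffs = solve(vandermonde, mus)       # coefficients a_0, a_1 of p(x)
p_of_A = poly_at_matrix(coeffs, A)     # should reproduce B
```

Exact rational arithmetic is used here so that the comparison of `p_of_A` with `B` is not affected by rounding.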
3.3.20 Problem Let $A \equiv [a_{ij}]$ and $B \equiv [b_{ij}]$ be any n-square complex matrices. Suppose that $A$ commutes with $B$, that is, $AB = BA$. Then there exists a unitary matrix $U$ such that $U^*AU$ and $U^*BU$ are both upper triangular matrices.

Proof (Induction on n) The assertion is trivially true for $n = 1$. Next, suppose that the assertion is true for $n-1$. We have to show that the assertion is true for $n$.
Let us take any eigenvalue $\mu$ of $B$. Clearly, $\{v : v \in \mathbb{C}^n \text{ and } Bv = \mu v\}$ is a linear subspace of $\mathbb{C}^n$ such that its dimension is $\ge 1$. Let $A : v \mapsto Av$ be a mapping from $\mathbb{C}^n$ to $\mathbb{C}^n$. Clearly, $A : \mathbb{C}^n \to \mathbb{C}^n$ is a linear transformation. Observe that for every $v \in \mathbb{C}^n$ satisfying $Bv = \mu v$, we have
\[
B(A(v)) = B(Av) = (BA)v = (AB)v = A(Bv) = A(\mu v) = \mu(Av) = \mu(A(v)),
\]
and hence $B(A(v)) = \mu(A(v))$. This shows that the subspace $\{v : v \in \mathbb{C}^n \text{ and } Bv = \mu v\}$ is invariant under $A : \mathbb{C}^n \to \mathbb{C}^n$. Hence the restriction
\[
A|_{V_\mu} : V_\mu \to V_\mu,
\]
where $V_\mu \equiv \{v : v \in \mathbb{C}^n \text{ and } Bv = \mu v\}$, is a linear transformation. Also, $\dim(V_\mu) \ge 1$.

Let $\lambda$ be an eigenvalue of $A|_{V_\mu}$. Then there exists a nonzero vector $v_1$ in $V_\mu$ such that
\[
Av_1 = A(v_1) = \bigl(A|_{V_\mu}\bigr)(v_1) = \lambda v_1,
\]
and hence $Av_1 = \lambda v_1$. Next, since $v_1$ is in $V_\mu$ $(= \{v : v \in \mathbb{C}^n \text{ and } Bv = \mu v\})$, we have $Bv_1 = \mu v_1$. Since $v_1 \ne 0$, $w_1 \equiv \frac{1}{\|v_1\|}v_1$ is a unit vector. Also, $Aw_1 = \lambda w_1$ and $Bw_1 = \mu w_1$. By 3.3.12, there exist $n-1$ vectors $w_2,\dots,w_n$ in $\mathbb{C}^n$ such that $\{w_1,w_2,\dots,w_n\}$ is an orthonormal basis of $\mathbb{C}^n$. By the definition of a unitary matrix (a matrix $W$ with $W^*W = WW^* = I$), the n-square matrix $[w_1,w_2,\dots,w_n]$ is unitary. Observe that
\[
\begin{aligned}
[w_1,w_2,\dots,w_n]^*A[w_1,w_2,\dots,w_n]
&= [w_1,w_2,\dots,w_n]^*\bigl(A[w_1,w_2,\dots,w_n]\bigr)\\
&= [w_1,w_2,\dots,w_n]^*[Aw_1,Aw_2,\dots,Aw_n]\\
&= [w_1,w_2,\dots,w_n]^*[\lambda w_1,Aw_2,\dots,Aw_n]\\
&= \bigl([w_1,\dots,w_n]^*(\lambda w_1),\ [w_1,\dots,w_n]^*(Aw_2),\ \dots,\ [w_1,\dots,w_n]^*(Aw_n)\bigr)\\
&= \bigl(\lambda[\langle w_1,w_1\rangle,\langle w_1,w_2\rangle,\dots,\langle w_1,w_n\rangle]^T,\ [\langle Aw_2,w_1\rangle,\dots,\langle Aw_2,w_n\rangle]^T,\ \dots,\ [\langle Aw_n,w_1\rangle,\dots,\langle Aw_n,w_n\rangle]^T\bigr)\\
&= \bigl(\lambda[1,0,\dots,0]^T,\ [\langle Aw_2,w_1\rangle,\dots,\langle Aw_2,w_n\rangle]^T,\ \dots,\ [\langle Aw_n,w_1\rangle,\dots,\langle Aw_n,w_n\rangle]^T\bigr)\\
&= \bigl([\lambda,0,\dots,0]^T,\ [\langle Aw_2,w_1\rangle,\dots,\langle Aw_2,w_n\rangle]^T,\ \dots,\ [\langle Aw_n,w_1\rangle,\dots,\langle Aw_n,w_n\rangle]^T\bigr),
\end{aligned}
\]
and the last matrix is of the form
\[
\begin{bmatrix}
\lambda & * & \cdots & *\\
0 & * & \cdots & *\\
\vdots & \vdots & & \vdots\\
0 & * & \cdots & *
\end{bmatrix},
\]
so $[w_1,w_2,\dots,w_n]^*A[w_1,w_2,\dots,w_n]$ is of the partitioned form
\[
\begin{bmatrix} \lambda & a\\ 0 & C \end{bmatrix},
\]
where $C$ is a matrix of size $(n-1)\times(n-1)$, and $a$ is a matrix of size $1\times(n-1)$.
Next,
\[
\begin{aligned}
[w_1,w_2,\dots,w_n]^*B[w_1,w_2,\dots,w_n]
&= [w_1,w_2,\dots,w_n]^*\bigl(B[w_1,w_2,\dots,w_n]\bigr)\\
&= [w_1,w_2,\dots,w_n]^*[Bw_1,Bw_2,\dots,Bw_n]\\
&= [w_1,w_2,\dots,w_n]^*[\mu w_1,Bw_2,\dots,Bw_n]\\
&= \bigl([w_1,\dots,w_n]^*(\mu w_1),\ [w_1,\dots,w_n]^*(Bw_2),\ \dots,\ [w_1,\dots,w_n]^*(Bw_n)\bigr)\\
&= \bigl(\mu[\langle w_1,w_1\rangle,\langle w_1,w_2\rangle,\dots,\langle w_1,w_n\rangle]^T,\ [\langle Bw_2,w_1\rangle,\dots,\langle Bw_2,w_n\rangle]^T,\ \dots,\ [\langle Bw_n,w_1\rangle,\dots,\langle Bw_n,w_n\rangle]^T\bigr)\\
&= \bigl(\mu[1,0,\dots,0]^T,\ [\langle Bw_2,w_1\rangle,\dots,\langle Bw_2,w_n\rangle]^T,\ \dots,\ [\langle Bw_n,w_1\rangle,\dots,\langle Bw_n,w_n\rangle]^T\bigr)\\
&= \bigl([\mu,0,\dots,0]^T,\ [\langle Bw_2,w_1\rangle,\dots,\langle Bw_2,w_n\rangle]^T,\ \dots,\ [\langle Bw_n,w_1\rangle,\dots,\langle Bw_n,w_n\rangle]^T\bigr),
\end{aligned}
\]
and the last matrix is of the form
\[
\begin{bmatrix}
\mu & * & \cdots & *\\
0 & * & \cdots & *\\
\vdots & \vdots & & \vdots\\
0 & * & \cdots & *
\end{bmatrix},
\]
so $[w_1,w_2,\dots,w_n]^*B[w_1,w_2,\dots,w_n]$ is of the partitioned form
\[
\begin{bmatrix} \mu & b\\ 0 & D \end{bmatrix},
\]
where $D$ is a matrix of size $(n-1)\times(n-1)$, and $b$ is a matrix of size $1\times(n-1)$. It follows that
\[
\begin{aligned}
\begin{bmatrix} \mu\lambda & \mu a + bC\\ 0 & DC \end{bmatrix}
&= \begin{bmatrix} \mu & b\\ 0 & D \end{bmatrix}\begin{bmatrix} \lambda & a\\ 0 & C \end{bmatrix}\\
&= \bigl([w_1,\dots,w_n]^*B[w_1,\dots,w_n]\bigr)\bigl([w_1,\dots,w_n]^*A[w_1,\dots,w_n]\bigr)\\
&= [w_1,\dots,w_n]^*B\bigl([w_1,\dots,w_n][w_1,\dots,w_n]^*\bigr)A[w_1,\dots,w_n]\\
&= [w_1,\dots,w_n]^*BIA[w_1,\dots,w_n] = [w_1,\dots,w_n]^*BA[w_1,\dots,w_n]\\
&= [w_1,\dots,w_n]^*AB[w_1,\dots,w_n] = [w_1,\dots,w_n]^*AIB[w_1,\dots,w_n]\\
&= [w_1,\dots,w_n]^*A\bigl([w_1,\dots,w_n][w_1,\dots,w_n]^*\bigr)B[w_1,\dots,w_n]\\
&= \bigl([w_1,\dots,w_n]^*A[w_1,\dots,w_n]\bigr)\bigl([w_1,\dots,w_n]^*B[w_1,\dots,w_n]\bigr)\\
&= \begin{bmatrix} \lambda & a\\ 0 & C \end{bmatrix}\begin{bmatrix} \mu & b\\ 0 & D \end{bmatrix}
 = \begin{bmatrix} \lambda\mu & \lambda b + aD\\ 0 & CD \end{bmatrix},
\end{aligned}
\]
and hence
\[
\begin{bmatrix} \mu\lambda & \mu a + bC\\ 0 & DC \end{bmatrix}
= \begin{bmatrix} \lambda\mu & \lambda b + aD\\ 0 & CD \end{bmatrix}.
\]
This shows that $CD = DC$. Further, $C$ and $D$ are $(n-1)$-square complex matrices. By the induction hypothesis, there exists a unitary matrix $V$ of size $(n-1)\times(n-1)$ such that $V^*CV$ and $V^*DV$ are both upper triangular matrices of size $(n-1)\times(n-1)$. It follows that
\[
\begin{bmatrix} 1 & 0\\ 0 & V \end{bmatrix}
\]
is a partitioned form of an $n\times n$ matrix, and hence
\[
[w_1,w_2,\dots,w_n]\begin{bmatrix} 1 & 0\\ 0 & V \end{bmatrix}
\]
is an $n\times n$ matrix. Next,
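\[
\begin{aligned}
\left([w_1,\dots,w_n]\begin{bmatrix} 1 & 0\\ 0 & V \end{bmatrix}\right)^{*} A \left([w_1,\dots,w_n]\begin{bmatrix} 1 & 0\\ 0 & V \end{bmatrix}\right)
&= \begin{bmatrix} 1 & 0\\ 0 & V \end{bmatrix}^{*}[w_1,\dots,w_n]^*A[w_1,\dots,w_n]\begin{bmatrix} 1 & 0\\ 0 & V \end{bmatrix}\\
&= \begin{bmatrix} 1 & 0\\ 0 & V^* \end{bmatrix}\bigl([w_1,\dots,w_n]^*A[w_1,\dots,w_n]\bigr)\begin{bmatrix} 1 & 0\\ 0 & V \end{bmatrix}\\
&= \begin{bmatrix} 1 & 0\\ 0 & V^* \end{bmatrix}\begin{bmatrix} \lambda & a\\ 0 & C \end{bmatrix}\begin{bmatrix} 1 & 0\\ 0 & V \end{bmatrix}\\
&= \begin{bmatrix} 1 & 0\\ 0 & V^* \end{bmatrix}\begin{bmatrix} \lambda & aV\\ 0 & CV \end{bmatrix}
 = \begin{bmatrix} \lambda & aV\\ 0 & V^*CV \end{bmatrix},
\end{aligned}
\]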
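so
\[
\left([w_1,\dots,w_n]\begin{bmatrix} 1 & 0\\ 0 & V \end{bmatrix}\right)^{*} A \left([w_1,\dots,w_n]\begin{bmatrix} 1 & 0\\ 0 & V \end{bmatrix}\right)
= \begin{bmatrix} \lambda & aV\\ 0 & V^*CV \end{bmatrix}.
\]
Now, since $V^*CV$ is an upper triangular matrix of size $(n-1)\times(n-1)$, $\begin{bmatrix} \lambda & aV\\ 0 & V^*CV \end{bmatrix}$ is an upper triangular matrix of size $n\times n$, and hence $U^*AU$ is an upper triangular matrix of size $n\times n$, where
\[
U = [w_1,w_2,\dots,w_n]\begin{bmatrix} 1 & 0\\ 0 & V \end{bmatrix}.
\]
Similarly, $U^*BU$ is an upper triangular matrix of size $n\times n$. It remains to show that $U$ is unitary, that is, $U^*U = UU^* = I$.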
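Observe that
\[
\begin{aligned}
U^*U
&= \left([w_1,\dots,w_n]\begin{bmatrix} 1 & 0\\ 0 & V \end{bmatrix}\right)^{*}[w_1,\dots,w_n]\begin{bmatrix} 1 & 0\\ 0 & V \end{bmatrix}
 = \begin{bmatrix} 1 & 0\\ 0 & V \end{bmatrix}^{*}[w_1,\dots,w_n]^*[w_1,\dots,w_n]\begin{bmatrix} 1 & 0\\ 0 & V \end{bmatrix}\\
&= \begin{bmatrix} 1 & 0\\ 0 & V^* \end{bmatrix}\bigl([w_1,\dots,w_n]^*[w_1,\dots,w_n]\bigr)\begin{bmatrix} 1 & 0\\ 0 & V \end{bmatrix}
 = \begin{bmatrix} 1 & 0\\ 0 & V^* \end{bmatrix} I \begin{bmatrix} 1 & 0\\ 0 & V \end{bmatrix}\\
&= \begin{bmatrix} 1 & 0\\ 0 & V^* \end{bmatrix}\begin{bmatrix} 1 & 0\\ 0 & V \end{bmatrix}
 = \begin{bmatrix} 1 & 0\\ 0 & V^*V \end{bmatrix}
 = \begin{bmatrix} 1 & 0\\ 0 & I_{n-1} \end{bmatrix} = I,
\end{aligned}
\]
so $U^*U = I$. Similarly, $UU^* = I$. ■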
3.3.21 Problem Let $A \equiv [a_{ij}]$ be an n-square complex matrix. Then there exists a unitary matrix $U$ such that $U^*AU$ is an upper triangular matrix.

Proof Since $A$ commutes with $A$, by 3.3.20, there exists a unitary matrix $U$ such that $U^*AU$ is an upper triangular matrix. ■
3.3.22 Theorem Let $A \equiv [a_{ij}]$ be an n-square complex matrix. Then there exists a unitary matrix $U$ such that

1. $U^*AU$ is an upper triangular matrix,
2. the eigenvalues of $A$ are the diagonal entries of $U^*AU$.

This result is due to Issai Schur (1875–1941).

Proof By 3.3.21, there exists a unitary matrix $U$ such that $U^*AU$ is an upper triangular matrix, say $C$. Since $U$ is unitary, $U^*U = UU^* = I$, and hence $U^{-1} = U^*$. Thus $C = U^{-1}AU$, and hence $A = UCU^{-1}$. Now,
\[
\begin{aligned}
A - \lambda I &= UCU^{-1} - \lambda I = UCU^{-1} - \lambda UU^{-1} = (UC - \lambda U)U^{-1}\\
&= \bigl(UC - \lambda(UI)\bigr)U^{-1} = \bigl(UC - U(\lambda I)\bigr)U^{-1} = U(C - \lambda I)U^{-1},
\end{aligned}
\]
so $(A - \lambda I) = U(C - \lambda I)U^{-1}$, and hence
\[
\det(A - \lambda I) = \det\bigl(U(C - \lambda I)U^{-1}\bigr) = \det(U)\cdot\det(C - \lambda I)\cdot\det(U^{-1}) = \det(U)\cdot\det(C - \lambda I)\cdot\frac{1}{\det(U)} = \det(C - \lambda I).
\]
Thus $\det(A - \lambda I) = \det(C - \lambda I)$. Since $C$ is an upper triangular matrix, we have
\[
\det(A - \lambda I) = \det(C - \lambda I) = (c_1 - \lambda)(c_2 - \lambda)\cdots(c_n - \lambda),
\]
where the diagonal entries of $(U^*AU =)\ C$ are $c_1, c_2, \dots, c_n$. Thus
\[
\det(A - \lambda I) = (c_1 - \lambda)(c_2 - \lambda)\cdots(c_n - \lambda).
\]
Hence the roots of the polynomial $\det(A - \lambda I)$ in $\lambda$ are $c_1, c_2, \dots, c_n$. This shows that $c_1, c_2, \dots, c_n$ are the eigenvalues of $A$. ■
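As an illustration of the theorem, here is a small worked case. The matrix $A$ below is an assumed example, not taken from the text; the steps follow the construction of 3.3.20 (pick a unit eigenvector $w_1$, extend it to an orthonormal basis).

```latex
\[
A = \begin{bmatrix} 2 & -1\\ 1 & 0 \end{bmatrix},
\qquad
\det(A - \lambda I) = (2-\lambda)(-\lambda) + 1 = (1-\lambda)^2 ,
\]
so the only eigenvalue of $A$ is $1$ (taken twice). A unit eigenvector is
$w_1 = \tfrac{1}{\sqrt{2}}[1,1]^T$, and $w_2 = \tfrac{1}{\sqrt{2}}[1,-1]^T$
completes it to an orthonormal basis of $\mathbb{C}^2$. With $U = [w_1, w_2]$,
\[
Aw_1 = w_1, \qquad Aw_2 = \tfrac{1}{\sqrt{2}}[3,1]^T = 2w_1 + w_2 ,
\]
\[
U^*AU =
\begin{bmatrix}
w_1^*Aw_1 & w_1^*Aw_2\\
w_2^*Aw_1 & w_2^*Aw_2
\end{bmatrix}
= \begin{bmatrix} 1 & 2\\ 0 & 1 \end{bmatrix}.
\]
```

The result is upper triangular and its diagonal entries $1, 1$ are the eigenvalues of $A$. Note that $A$ here is not diagonalizable, so an upper triangular form is the best a unitary change of basis can achieve.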
3.3.23 Problem If $C$ is a normal upper triangular matrix of size $n\times n$, then $C$ is diagonal.

Proof (Induction on n) The assertion is trivially true for $n = 1$. Next suppose that the assertion is true for $n-1$. We have to show that the assertion is true for $n$.

Let $C$ be any normal upper triangular matrix of size $n\times n$. We have to show that $C$ is diagonal.

Since $C$ is an upper triangular matrix, $C$ is of the form
\[
\begin{bmatrix} \lambda & a\\ 0 & D \end{bmatrix},
\]
where $D$ is an upper triangular matrix of size $(n-1)\times(n-1)$, $a$ is a matrix of size $1\times(n-1)$, and $\lambda$ is a complex number. It suffices to show that $a = 0$, and $D$ is a diagonal matrix of size $(n-1)\times(n-1)$. Here,
\[
C^* = \begin{bmatrix} \lambda & a\\ 0 & D \end{bmatrix}^{*}
= \begin{bmatrix} \bar{\lambda} & \bar{a}\\ 0 & \bar{D} \end{bmatrix}^{T}
= \begin{bmatrix} \bar{\lambda} & 0\\ a^* & D^* \end{bmatrix},
\]
so
\[
C^* = \begin{bmatrix} \bar{\lambda} & 0\\ a^* & D^* \end{bmatrix}.
\]
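Now, since $C$ is normal, we have
\[
\begin{bmatrix} \bar{\lambda}\lambda & \bar{\lambda}a\\ a^*\lambda & a^*a + D^*D \end{bmatrix}
= \begin{bmatrix} \bar{\lambda} & 0\\ a^* & D^* \end{bmatrix}\begin{bmatrix} \lambda & a\\ 0 & D \end{bmatrix}
= C^*C = CC^*
= \begin{bmatrix} \lambda & a\\ 0 & D \end{bmatrix}\begin{bmatrix} \bar{\lambda} & 0\\ a^* & D^* \end{bmatrix}
= \begin{bmatrix} \lambda\bar{\lambda} + aa^* & aD^*\\ Da^* & DD^* \end{bmatrix},
\]
and hence
\[
\begin{bmatrix} \bar{\lambda}\lambda & \bar{\lambda}a\\ a^*\lambda & a^*a + D^*D \end{bmatrix}
= \begin{bmatrix} \lambda\bar{\lambda} + aa^* & aD^*\\ Da^* & DD^* \end{bmatrix}.
\]
It follows that
\[
\bar{\lambda}\lambda = \lambda\bar{\lambda} + aa^*, \qquad a^*a + D^*D = DD^*,
\]
that is,
\[
aa^* = 0, \qquad a^*a + D^*D = DD^*.
\]
Since $aa^*$ is the sum of the squared absolute values of the entries of $a$, the equation $aa^* = 0$ forces every entry of $a$ to be $0$. Hence
\[
a = 0, \qquad a^*a + D^*D = DD^*,
\]
that is,
\[
a = 0, \qquad D^*D = DD^*,
\]
that is, $a = 0$, and $D$ is normal. Since $D$ is a normal upper triangular matrix of size $(n-1)\times(n-1)$, it follows by the induction hypothesis that $D$ is diagonal. ■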
3.3.24 Problem Let $A \equiv [a_{ij}]$ be an n-square complex matrix. Suppose that $A$ is a normal matrix. Then there exists a unitary matrix $U$ such that

1. $U^*AU$ is a diagonal matrix,
2. the eigenvalues of $A$ are the diagonal entries of $U^*AU$.

Proof By 3.3.22, there exists a unitary matrix $U$ such that

1. $U^*AU$ is an upper triangular matrix,
2. the eigenvalues of $A$ are the diagonal entries of $U^*AU$.

It suffices to show that $U^*AU$ is a diagonal matrix.

Let us denote the upper triangular matrix $U^*AU$ by $C$. Since $U$ is unitary, we have $U^*U = UU^* = I$. It follows that $U^{-1} = U^*$. Now, since $C = U^*AU$, we have
\[
UCU^* = U(U^*AU)U^* = (UU^*)A(UU^*) = IAI = A,
\]
and hence $A = UCU^*$. It follows that
\[
A^* = (UCU^*)^* = (U^*)^*C^*U^* = UC^*U^*,
\]
and hence $A^* = UC^*U^*$. Since $A$ is a normal matrix, we have
\[
\begin{aligned}
U(C^*C)U^{-1} &= U(C^*C)U^* = UC^*ICU^* = UC^*(U^*U)CU^*\\
&= (UC^*U^*)(UCU^*) = A^*A = AA^* = A(UC^*U^*)\\
&= (UCU^*)(UC^*U^*) = UC(U^*U)C^*U^*\\
&= UCIC^*U^* = U(CC^*)U^* = U(CC^*)U^{-1},
\end{aligned}
\]
and hence $U(C^*C)U^{-1} = U(CC^*)U^{-1}$. It follows that $C^*C = CC^*$, and hence $C$ is normal. Since $C$ is a normal upper triangular matrix, by 3.3.23, $(U^*AU =)\ C$ is diagonal, and hence $U^*AU$ is diagonal. ■
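A quick numerical check of this statement can be sketched as follows. The normal matrix `A` and the unitary matrix `U` (whose columns are unit eigenvectors of `A`, found by hand) are assumed illustrative data, and the helper names `conj_transpose`, `mat_mul`, and `close` are hypothetical; floating-point arithmetic is used, so equality is tested up to a small tolerance.

```python
import math

def conj_transpose(M):
    """Conjugate transpose M* of a matrix given as a list of rows."""
    return [[M[j][i].conjugate() for j in range(len(M))]
            for i in range(len(M[0]))]

def mat_mul(X, Y):
    """Matrix product of X and Y (lists of rows)."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def close(X, Y, tol=1e-12):
    """Entrywise comparison up to a tolerance."""
    return all(abs(x - y) <= tol for rx, ry in zip(X, Y) for x, y in zip(rx, ry))

# Assumed example: A is normal (A*A = AA* = I) with eigenvalues i and -i.
A = [[0, 1], [-1, 0]]
s = 1 / math.sqrt(2)
U = [[s, s], [s * 1j, -s * 1j]]   # columns: unit eigenvectors of A

A_star = conj_transpose(A)
U_star = conj_transpose(U)

is_normal = mat_mul(A_star, A) == mat_mul(A, A_star)
is_unitary = close(mat_mul(U_star, U), [[1, 0], [0, 1]])
D = mat_mul(U_star, mat_mul(A, U))   # expected: diag(i, -i) up to rounding
```

Here `D` comes out diagonal with the eigenvalues of `A` on its diagonal, as the statement predicts for this particular choice of `U`.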
Note 3.3.25 Problem Let $A \equiv [a_{ij}]$ be an n-square complex matrix. Let $U$ be a unitary matrix such that $U^*AU$ is a diagonal matrix. Then $A$ is a normal matrix.

Proof We have to show that $A^*A = AA^*$.

Let us denote the diagonal matrix $U^*AU$ by $\operatorname{diag}(\lambda_1,\dots,\lambda_n)$. Since $U$ is unitary, we have $U^*U = UU^* = I$. Now, since
\[
U\operatorname{diag}(\lambda_1,\dots,\lambda_n)U^* = U(U^*AU)U^* = (UU^*)A(UU^*) = IAI = A,
\]
we have $A = U\operatorname{diag}(\lambda_1,\dots,\lambda_n)U^*$. It follows that
\[
A^* = \bigl(U\operatorname{diag}(\lambda_1,\dots,\lambda_n)U^*\bigr)^* = (U^*)^*\bigl(\operatorname{diag}(\lambda_1,\dots,\lambda_n)\bigr)^*U^* = U\bigl(\operatorname{diag}(\lambda_1,\dots,\lambda_n)\bigr)^*U^* = U\operatorname{diag}(\bar{\lambda}_1,\dots,\bar{\lambda}_n)U^*,
\]
so $A^* = U\operatorname{diag}(\bar{\lambda}_1,\dots,\bar{\lambda}_n)U^*$. Here
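\[
\begin{aligned}
A^*A &= \bigl(U\operatorname{diag}(\bar{\lambda}_1,\dots,\bar{\lambda}_n)U^*\bigr)A\\
&= \bigl(U\operatorname{diag}(\bar{\lambda}_1,\dots,\bar{\lambda}_n)U^*\bigr)\bigl(U\operatorname{diag}(\lambda_1,\dots,\lambda_n)U^*\bigr)\\
&= U\operatorname{diag}(\bar{\lambda}_1,\dots,\bar{\lambda}_n)(U^*U)\operatorname{diag}(\lambda_1,\dots,\lambda_n)U^*\\
&= U\operatorname{diag}(\bar{\lambda}_1,\dots,\bar{\lambda}_n)\,I\,\operatorname{diag}(\lambda_1,\dots,\lambda_n)U^*\\
&= U\bigl(\operatorname{diag}(\bar{\lambda}_1,\dots,\bar{\lambda}_n)\operatorname{diag}(\lambda_1,\dots,\lambda_n)\bigr)U^*\\
&= U\operatorname{diag}\bigl(|\lambda_1|^2,\dots,|\lambda_n|^2\bigr)U^*,
\end{aligned}
\]
and
\[
\begin{aligned}
AA^* &= A\bigl(U\operatorname{diag}(\bar{\lambda}_1,\dots,\bar{\lambda}_n)U^*\bigr)\\
&= \bigl(U\operatorname{diag}(\lambda_1,\dots,\lambda_n)U^*\bigr)\bigl(U\operatorname{diag}(\bar{\lambda}_1,\dots,\bar{\lambda}_n)U^*\bigr)\\
&= U\operatorname{diag}(\lambda_1,\dots,\lambda_n)(U^*U)\operatorname{diag}(\bar{\lambda}_1,\dots,\bar{\lambda}_n)U^*\\
&= U\operatorname{diag}(\lambda_1,\dots,\lambda_n)\,I\,\operatorname{diag}(\bar{\lambda}_1,\dots,\bar{\lambda}_n)U^*\\
&= U\bigl(\operatorname{diag}(\lambda_1,\dots,\lambda_n)\operatorname{diag}(\bar{\lambda}_1,\dots,\bar{\lambda}_n)\bigr)U^*\\
&= U\operatorname{diag}\bigl(|\lambda_1|^2,\dots,|\lambda_n|^2\bigr)U^*,
\end{aligned}
\]
so $A^*A = AA^*$. ■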
3.3.26 Theorem Let $A \equiv [a_{ij}]$ be an n-square complex matrix. Let $A$ be a Hermitian matrix, that is, $A^* = A$. Then the eigenvalues of $A$ are real numbers.

Proof Let $\lambda_1,\dots,\lambda_n$ be the eigenvalues of $A$. It suffices to show that $\operatorname{diag}(\bar{\lambda}_1,\dots,\bar{\lambda}_n) = \operatorname{diag}(\lambda_1,\dots,\lambda_n)$, that is,
\[
\bigl(\operatorname{diag}(\lambda_1,\dots,\lambda_n)\bigr)^* = \operatorname{diag}(\lambda_1,\dots,\lambda_n).
\]
Since $A^* = A$, we have $A^*A = AA = AA^*$, and hence $A^*A = AA^*$. Thus $A$ is normal. Now, by 3.3.24, there exists a unitary matrix $U$ such that
\[
U^*AU = \operatorname{diag}(\lambda_1,\dots,\lambda_n).
\]
Hence
\[
\bigl(\operatorname{diag}(\lambda_1,\dots,\lambda_n)\bigr)^* = (U^*AU)^* = U^*A^*(U^*)^* = U^*A^*U = U^*AU = \operatorname{diag}(\lambda_1,\dots,\lambda_n).
\]
Thus $\bigl(\operatorname{diag}(\lambda_1,\dots,\lambda_n)\bigr)^* = \operatorname{diag}(\lambda_1,\dots,\lambda_n)$. ■
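For a concrete instance of the theorem, consider the following worked case; the matrix is an assumed example chosen so that it has non-real entries but is Hermitian.

```latex
\[
A = \begin{bmatrix} 2 & i\\ -i & 2 \end{bmatrix},
\qquad
A^* = \overline{A}^{\,T} = \begin{bmatrix} 2 & i\\ -i & 2 \end{bmatrix} = A ,
\]
\[
\det(A - \lambda I) = (2-\lambda)^2 - (i)(-i) = (2-\lambda)^2 - 1 = (1-\lambda)(3-\lambda).
\]
```

The eigenvalues are $1$ and $3$, both real, although $A$ itself is not a real matrix.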
3.3.27 Theorem Let $A \equiv [a_{ij}]$ be an n-square complex matrix. Let $A$ be a normal matrix. Suppose that all the eigenvalues of $A$ are real numbers. Then $A$ is a Hermitian matrix, that is, $A^* = A$.

Proof By 3.3.22, there exists a unitary matrix $U$ such that

1. $U^*AU$ is an upper triangular matrix,
2. the eigenvalues of $A$ are the diagonal entries of $U^*AU$.

Let us denote $U^*AU$ by $C$. Thus $C = U^*AU$. Since $U$ is unitary, we have $U^*U = UU^* = I$. It follows that $U^{-1} = U^*$ and
\[
UCU^* = U(U^*AU)U^* = (UU^*)A(UU^*) = IAI = A,
\]
and hence $A = UCU^*$. Next,
\[
A^* = (UCU^*)^* = (U^*)^*C^*U^* = UC^*U^*,
\]
so $A^* = UC^*U^*$. Thus it suffices to show that $C^* = C$.

Since $A$ is normal, we have
\[
\begin{aligned}
U(C^*C)U^{-1} &= UC^*CU^* = UC^*ICU^* = UC^*(U^*U)CU^*\\
&= (UC^*U^*)(UCU^*) = A^*A = AA^* = (UCU^*)(UC^*U^*)\\
&= UC(U^*U)C^*U^* = UCIC^*U^* = UCC^*U^* = U(CC^*)U^{-1},
\end{aligned}
\]
and hence $U(C^*C)U^{-1} = U(CC^*)U^{-1}$. It follows that $C^*C = CC^*$, and hence $C$ is normal. Now, since $C$ is an upper triangular matrix, by 3.3.23, $C$ is diagonal. Since $C$ is diagonal, and the diagonal entries of $C$ are real numbers, we have $C^* = C$. ■
Definition Let $A \equiv [a_{ij}]$ be an n-square complex matrix. If for every $x \in \mathbb{C}^n$, the $1\times 1$ matrix $x^*Ax$ has a nonnegative real number as its sole entry, then we say that $A$ is a positive semidefinite matrix or $A$ is a nonnegative definite matrix. This is expressed as $A \ge 0$.

In short, $A$ is a positive semidefinite matrix if for every $x \in \mathbb{C}^n$, $x^*Ax \ge 0$.
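The quantity $x^*Ax$ in the definition is easy to compute directly. The sketch below uses assumed sample data (the matrices `A`, `B` and the vectors in `samples` are illustrative, and `quad_form` is a hypothetical helper name). Evaluating finitely many vectors can refute positive semidefiniteness but cannot establish it; the check on `A` is only a sanity test.

```python
def quad_form(A, x):
    """Return the sole entry of x* A x for a square matrix A and a vector x."""
    n = len(x)
    return sum(x[i].conjugate() * A[i][j] * x[j]
               for i in range(n) for j in range(n))

# Assumed example: a Hermitian matrix whose eigenvalues are 1 and 3.
A = [[2, 1j], [-1j, 2]]
samples = [[1, 0], [0, 1], [1, 1], [1, 1j], [1, -1j], [2 - 1j, 3j]]
values = [quad_form(A, x) for x in samples]

# Every sampled value is real and nonnegative (consistent with A >= 0).
all_nonnegative = all(abs(v.imag) < 1e-12 and v.real >= 0 for v in values)

# A matrix that is NOT positive semidefinite: one vector suffices to show it.
B = [[1, 2], [2, 1]]
witness = quad_form(B, [1, -1])   # 1 - 2 - 2 + 1 = -2 < 0
```

A single vector with $x^*Bx < 0$, such as the `witness` computation above, is enough to conclude that `B` fails the definition.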
3.3.28 Problem Let $A \equiv [a_{ij}]$ be an n-square complex matrix. Suppose that $A$ is positive semidefinite. Then $A$ is a Hermitian matrix, that is, $A^* = A$. Also, the diagonal entries of $A$ are nonnegative real numbers.

Proof Since $A$ is a positive semidefinite matrix and $[1,0,\dots,0]^T \in \mathbb{C}^n$, we have
\[
a_{11} = [1,0,\dots,0]\bigl(A[1,0,\dots,0]^T\bigr) = \bigl([1,0,\dots,0]^T\bigr)^*A[1,0,\dots,0]^T \ge 0,
\]
and hence $a_{11}$ is a nonnegative real number. Similarly, $a_{22}$ is a nonnegative real number, $a_{33}$ is a nonnegative real number, etc.

Also,
\[
\begin{aligned}
(a_{11} + a_{22}) + (a_{12} + a_{21}) &= 1(a_{11}1 + a_{12}1) + 1(a_{21}1 + a_{22}1)\\
&= [1,1,0,\dots,0]\bigl(A[1,1,0,\dots,0]^T\bigr) = \bigl([1,1,0,\dots,0]^T\bigr)^*A[1,1,0,\dots,0]^T \ge 0,
\end{aligned}
\]
so, since $a_{11} + a_{22}$ is real,
\[
(a_{11} + a_{22}) + \operatorname{Re}(a_{12}) + \operatorname{Re}(a_{21}) \ge 0, \qquad \operatorname{Im}(a_{12}) = -\operatorname{Im}(a_{21}).
\]
Next,
\[
\begin{aligned}
(a_{11} + a_{22}) + (ia_{12} - ia_{21}) &= 1(a_{11}1 + a_{12}i) + (-i)(a_{21}1 + a_{22}i)\\
&= [1,-i,0,\dots,0]\bigl(A[1,i,0,\dots,0]^T\bigr) = \bigl([1,i,0,\dots,0]^T\bigr)^*A[1,i,0,\dots,0]^T \ge 0,
\end{aligned}
\]
so
\[
(a_{11} + a_{22}) - \operatorname{Im}(a_{12}) + \operatorname{Im}(a_{21}) \ge 0, \qquad \operatorname{Re}(a_{12}) = \operatorname{Re}(a_{21}).
\]
Since
\[
\operatorname{Re}(a_{12}) = \operatorname{Re}(a_{21}), \qquad \operatorname{Im}(a_{12}) = -\operatorname{Im}(a_{21}),
\]
we have $a_{12} = \overline{a_{21}}$. Similarly, $a_{13} = \overline{a_{31}}$, $a_{23} = \overline{a_{32}}$, etc. This shows that $[a_{ij}] = \bigl[\overline{a_{ij}}\bigr]^T$, that is, $A = A^*$. ■
3.3.29 Problem Let $A \equiv [a_{ij}]$ be an n-square complex matrix. Suppose that $A$ is positive semidefinite. Then the eigenvalues of $A$ are nonnegative real numbers.

Proof Let $\lambda$ be an eigenvalue of $A$. We have to show that $\lambda$ is a nonnegative real number.

Since $\lambda$ is an eigenvalue of $A$, there exists a nonzero $x \in \mathbb{C}^n$ such that $Ax = \lambda x$. Since $A$ is positive semidefinite, $\bigl(\lambda\langle x,x\rangle = \lambda(x^*x) = x^*(\lambda x) = x^*(Ax) =\bigr)\ x^*Ax$ is a nonnegative real number, and hence $\lambda\langle x,x\rangle$ is a nonnegative real number. Since $x \ne 0$, $\langle x,x\rangle$ is a positive real number. Since $\lambda\langle x,x\rangle$ is a nonnegative real number, $\lambda$ is a nonnegative real number. ■
3.3.30 Problem Let $A \equiv [a_{ij}]$ be an n-square complex matrix. Let $A$ be normal. Suppose that the eigenvalues of $A$ are nonnegative real numbers. Then $A$ is positive semidefinite.

Proof Let $x$ be a member of $\mathbb{C}^n$. We have to show that $x^*Ax$ is a nonnegative real number.

By 3.3.24, there exists a unitary matrix $U$ such that

1. $U^*AU$ is a diagonal matrix,
2. the eigenvalues of $A$ are the diagonal entries of $U^*AU$.

So we can write $U^*AU = \operatorname{diag}(t_1,\dots,t_n)$, where $t_1,\dots,t_n$ are the eigenvalues of $A$. By assumption, $t_1,\dots,t_n$ are nonnegative real numbers. Since $U$ is unitary, we have $U^*U = UU^* = I$. It follows that $U^{-1} = U^*$ and
\[
U\operatorname{diag}(t_1,\dots,t_n)U^* = U(U^*AU)U^* = (UU^*)A(UU^*) = IAI = A,
\]
and hence $A = U\operatorname{diag}(t_1,\dots,t_n)U^*$. Next,
\[
x^*Ax = x^*\bigl(U\operatorname{diag}(t_1,\dots,t_n)U^*\bigr)x = (U^*x)^*\operatorname{diag}(t_1,\dots,t_n)(U^*x),
\]
so
\[
x^*Ax = \bigl([y_1,\dots,y_n]^T\bigr)^*\operatorname{diag}(t_1,\dots,t_n)[y_1,\dots,y_n]^T,
\]
where $[y_1,\dots,y_n]^T \equiv U^*x$. We now have
\[
\begin{aligned}
x^*Ax &= \bigl([y_1,\dots,y_n]^T\bigr)^*\operatorname{diag}(t_1,\dots,t_n)[y_1,\dots,y_n]^T\\
&= [\bar{y}_1,\dots,\bar{y}_n]\bigl(\operatorname{diag}(t_1,\dots,t_n)[y_1,\dots,y_n]^T\bigr)
 = [\bar{y}_1,\dots,\bar{y}_n][t_1y_1,\dots,t_ny_n]^T\\
&= \bar{y}_1(t_1y_1) + \cdots + \bar{y}_n(t_ny_n) = t_1|y_1|^2 + \cdots + t_n|y_n|^2 \ge 0,
\end{aligned}
\]
and conclude that $x^*Ax \ge 0$. ■
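The change of variables $y = U^*x$ used in this proof can be followed on a $2\times 2$ case. The matrix $A$ and the unitary matrix $U$ below are assumed illustrative data (the columns of $U$ are unit eigenvectors of $A$ computed by hand).

```latex
\[
A = \begin{bmatrix} 2 & i\\ -i & 2 \end{bmatrix},
\qquad
U = \frac{1}{\sqrt{2}}\begin{bmatrix} 1 & 1\\ i & -i \end{bmatrix},
\qquad
U^*AU = \operatorname{diag}(1, 3).
\]
For $x = [x_1, x_2]^T$, put
\[
[y_1, y_2]^T = U^*x
= \frac{1}{\sqrt{2}}\,[\,x_1 - i x_2,\; x_1 + i x_2\,]^T .
\]
Then
\[
x^*Ax = 2|x_1|^2 + 2|x_2|^2 + i\,\bar{x}_1x_2 - i\,\bar{x}_2x_1
      = 1\cdot|y_1|^2 + 3\cdot|y_2|^2 \;\ge\; 0 .
\]
```

The cross terms of $x^*Ax$, whose sign is not obvious, disappear in the $y$ coordinates, which is exactly why the diagonal form makes nonnegativity visible.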
3.3.31 Note Let $A \equiv [a_{ij}]$ be an n-square complex matrix.

Let $M_n$ be the collection of all n-square complex matrices. We know that $M_n$ is a vector space over $\mathbb{C}$. The collection of all $C$ in $M_n$ such that $C$ has exactly one entry 1 and 0 entries elsewhere constitutes a basis of $M_n$. This basis has $n^2$ members, so $\dim(M_n) = n^2$. Since $I, A, A^2, \dots, A^{n^2}$ is a collection of size $(n^2+1)$ $(> \dim(M_n))$ and $I, A, A^2, \dots, A^{n^2}$ are in $M_n$, it follows that $I, A, A^2, \dots, A^{n^2}$ are linearly dependent in $M_n$. Thus there exist complex numbers $a_0, a_1, a_2, \dots, a_{n^2}$, not all zero, such that
\[
a_0I + a_1A + a_2A^2 + \cdots + a_{n^2}A^{n^2} = 0.
\]
Thus $p(A) = 0$, where $p(x)$ denotes the polynomial $a_0 + a_1x + a_2x^2 + \cdots + a_{n^2}x^{n^2}$. Clearly, $\deg(p(x)) \le n^2$.
3.3.32 Conclusion Let $A \equiv [a_{ij}]$ be an n-square complex matrix. Then there exists a nonzero polynomial $p(x)$ with complex coefficients such that $p(A) = 0$. In short, there exists an “annihilating polynomial” for $A$.
3.3.33 Problem Let $A \equiv [a_{ij}]$ be an n-square complex matrix. Let $A$ be an upper triangular matrix. Let $\lambda_1, \lambda_2, \dots, \lambda_n$ be the diagonal entries of $A$. Then $(\lambda_1I - A)(\lambda_2I - A)\cdots(\lambda_nI - A) = 0$.

Proof Observe that the first column of $(\lambda_1I - A)$ is 0. So we can suppose that $(\lambda_1I - A) = [0, a_1, \dots, a_{n-1}]$, where each $a_i$ is in $\mathbb{C}^n$. Next, since $A$ is an upper triangular matrix with diagonal entries $\lambda_1, \lambda_2, \dots, \lambda_n$, the first column of $(\lambda_2I - A)$ is $[\lambda_2 - \lambda_1, 0, \dots, 0]^T$, and its second column is $[-a_{12}, 0, \dots, 0]^T$. So we can suppose that $(\lambda_2I - A) = [b_1, b_2, \dots, b_n]$, where each $b_i$ is in $\mathbb{C}^n$, $b_1 = (\lambda_2 - \lambda_1)e_1$, $b_2 = -a_{12}e_1$, and $e_1 \equiv [1,0,\dots,0]^T$. Hence
\[
(\lambda_1I - A)(\lambda_2I - A) = [0, a_1, \dots, a_{n-1}][b_1, b_2, \dots, b_n]
= \bigl[[0, a_1, \dots, a_{n-1}]b_1,\ [0, a_1, \dots, a_{n-1}]b_2,\ \dots,\ [0, a_1, \dots, a_{n-1}]b_n\bigr]. \qquad (*)
\]
Since $[0, a_1, \dots, a_{n-1}]e_1$ is the first column of $(\lambda_1I - A)$, which is 0, we have
\[
[0, a_1, \dots, a_{n-1}]b_1 = 0 \quad\text{and}\quad [0, a_1, \dots, a_{n-1}]b_2 = 0.
\]
Now from (*), the first and second columns of $(\lambda_1I - A)(\lambda_2I - A)$ are 0. Similarly, the first three columns of $(\lambda_1I - A)(\lambda_2I - A)(\lambda_3I - A)$ are 0, because each of the first three columns of $(\lambda_3I - A)$ has zero entries below its second position, and hence is carried to 0 by a matrix whose first two columns are 0, etc. Finally, all the $n$ columns of $(\lambda_1I - A)(\lambda_2I - A)\cdots(\lambda_nI - A)$ are 0. Thus,
\[
(\lambda_1I - A)(\lambda_2I - A)\cdots(\lambda_nI - A) = 0.
\]
■
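The column-by-column vanishing in this argument can be observed on a small case. The sketch below is illustrative only: the upper triangular matrix `A` is an assumed example, and the helper names `mat_mul`, `shifted`, and `leading_zero_columns` are hypothetical. It multiplies the factors $\lambda_jI - A$ one at a time and records how many initial columns of the partial product are zero.

```python
def mat_mul(X, Y):
    """Product of two square matrices given as lists of rows."""
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def shifted(A, lam):
    """Return the matrix lam*I - A."""
    n = len(A)
    return [[(lam if i == j else 0) - A[i][j] for j in range(n)]
            for i in range(n)]

def leading_zero_columns(M):
    """Count how many initial columns of M are entirely zero."""
    count = 0
    for j in range(len(M[0])):
        if any(row[j] != 0 for row in M):
            break
        count += 1
    return count

# Assumed example: upper triangular with diagonal entries 1, 4, 6.
A = [[1, 2, 3], [0, 4, 5], [0, 0, 6]]
n = len(A)

product = [[int(i == j) for j in range(n)] for i in range(n)]  # start from I
zero_columns = []
for j in range(n):
    product = mat_mul(product, shifted(A, A[j][j]))   # multiply by (lam_j I - A)
    zero_columns.append(leading_zero_columns(product))
```

After the $j$-th factor the first $j$ columns are zero, so after all $n$ factors the whole product is the zero matrix.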
3.3.34 Note Let $A \equiv [a_{ij}]$ be an n-square complex matrix. Let $\lambda_1, \lambda_2, \dots, \lambda_n$ be the eigenvalues of $A$.

By 3.3.22, there exists a unitary matrix $U$ such that

1. $U^*AU$ is an upper triangular matrix,
2. $\lambda_1, \lambda_2, \dots, \lambda_n$ are the diagonal entries of $U^*AU$.

By 3.3.33, applied to the upper triangular matrix $U^*AU$,
\[
\begin{aligned}
U^*\bigl((\lambda_1I - A)(\lambda_2I - A)\cdots(\lambda_nI - A)\bigr)U
&= \bigl(U^*(\lambda_1I - A)U\bigr)\bigl(U^*(\lambda_2I - A)U\bigr)\cdots\bigl(U^*(\lambda_nI - A)U\bigr)\\
&= (\lambda_1I - U^*AU)(\lambda_2I - U^*AU)\cdots(\lambda_nI - U^*AU) = 0,
\end{aligned}
\]
and hence
\[
U^*\bigl((\lambda_1I - A)(\lambda_2I - A)\cdots(\lambda_nI - A)\bigr)U = 0.
\]
Since $U$ is unitary, multiplying on the left by $U$ and on the right by $U^*$ gives
\[
(\lambda_1I - A)(\lambda_2I - A)\cdots(\lambda_nI - A) = 0.
\]
It follows that $p(A) = 0$, where $p(x)$ denotes the polynomial $(x - \lambda_1)(x - \lambda_2)\cdots(x - \lambda_n)$. Observe that $\lambda_1, \lambda_2, \dots, \lambda_n$ are the roots of the monic polynomial $(x - \lambda_1)(x - \lambda_2)\cdots(x - \lambda_n)$ $(= p(x))$. Since $\lambda_1, \lambda_2, \dots, \lambda_n$ are the eigenvalues of $A$, $\lambda_1, \lambda_2, \dots, \lambda_n$ are the roots of the monic polynomial $\det(\lambda I - A)$, and since $\lambda_1, \lambda_2, \dots, \lambda_n$ are the roots of the monic polynomial $p(x)$, we have
\[
p(\lambda) = \det(\lambda I - A).
\]

3.3.35 Conclusion Let $A \equiv [a_{ij}]$ be an n-square complex matrix. Let $\lambda_1, \lambda_2, \dots, \lambda_n$ be the eigenvalues of $A$. Then $p(A) = 0$, where $p(\lambda) = \det(\lambda I - A)$.

This result is known as the Cayley–Hamilton theorem.

Here the polynomial $\det(\lambda I - A)$ is called the characteristic polynomial of $A$. Thus the characteristic polynomial of $A$ is an annihilating polynomial of $A$. It follows that the minimal polynomial of $A$ divides the characteristic polynomial of $A$.
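The statement can be checked by hand or by machine in the $2\times 2$ case, where $\det(xI - A) = x^2 - (\operatorname{tr}A)x + \det A$. The following is a minimal sketch under that assumption; the sample matrix `A` and the helper names `mat_mul`, `char_poly_2x2`, and `poly_at_matrix` are illustrative and not from the text.

```python
def mat_mul(X, Y):
    """Product of two square matrices given as lists of rows."""
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def char_poly_2x2(A):
    """Coefficients [c0, c1, c2] of det(xI - A) = c0 + c1*x + c2*x^2 (2x2 only)."""
    trace = A[0][0] + A[1][1]
    det = A[0][0] * A[1][1] - A[0][1] * A[1][0]
    return [det, -trace, 1]

def poly_at_matrix(coeffs, A):
    """Evaluate c0*I + c1*A + c2*A^2 + ... at the square matrix A."""
    n = len(A)
    result = [[0] * n for _ in range(n)]
    power = [[int(i == j) for j in range(n)] for i in range(n)]  # A^0 = I
    for c in coeffs:
        result = [[result[i][j] + c * power[i][j] for j in range(n)]
                  for i in range(n)]
        power = mat_mul(power, A)
    return result

A = [[1, 2], [3, 4]]                  # assumed sample matrix
coeffs = char_poly_2x2(A)             # characteristic polynomial x^2 - 5x - 2
value = poly_at_matrix(coeffs, A)     # p(A), expected to be the zero matrix
```

Integer arithmetic keeps the check exact: `value` is the zero matrix, in agreement with the theorem.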
Exercises
1. Let $V$ be an n-dimensional inner product space. Let $T : V \to V$ be a linear transformation. Let $v, w_1, w_2 \in V$. Suppose that
\[
u \in V \Rightarrow \langle u, w_1\rangle = \langle T(u), v\rangle = \langle u, w_2\rangle.
\]
Show that $w_1 = w_2$.

2. Let $V$ be an n-dimensional inner product space. Let $S_1 : V \to V$, $S_2 : V \to V$, and $S_3 : V \to V$ be linear transformations. Let $\lambda, \mu$ be any complex numbers. Show that $\bigl((\lambda S_1 + \mu S_2)S_3\bigr)^* = \bar{\lambda}(S_3)^*(S_1)^* + \bar{\mu}(S_3)^*(S_2)^*$.

3. Let $V$ be any n-dimensional vector space. Let $S, T \in A(V)$ be such that $ST = 0$ and $TS \ne 0$. Show that $T$ is not invertible.

4. Let $V$ be an n-dimensional inner product space. Let $T : V \to V$ be a normal linear transformation. Let $v \in V$. Suppose that
\[
T^3(v) = 0.
\]
Show that $v$ is a member of the null space of $T$.
5. Let $T \in A(\mathbb{C}^3)$. Suppose that $\mathbb{C}$ is invariant under $T$, and $\mathbb{C}^2$ is invariant under $T$. Let $p_1(x)$ be a minimal polynomial of $T|_{\mathbb{C}}$, and $p_2(x)$ a minimal polynomial of $T|_{\mathbb{C}^2}$. Show that the least common multiple of $p_1(x)$ and $p_2(x)$ is a minimal polynomial of $T$.

6. Let $T \in A(\mathbb{C}^n)$. Suppose that $T$ is nilpotent. Show that there exist linear subspaces $V_1, V_2, \dots, V_k$ of $\mathbb{C}^n$ such that
   1. $\mathbb{C}^n = V_1 \oplus V_2 \oplus \cdots \oplus V_k$,
   2. each $V_i$ is invariant under $T$.

   Also there exist a basis $\{v_1^1, v_1^2, \dots, v_1^{n_1}\}$ of $V_1$, a basis $\{v_2^1, v_2^2, \dots, v_2^{n_2}\}$ of $V_2$, ..., a basis $\{v_k^1, v_k^2, \dots, v_k^{n_k}\}$ of $V_k$ such that the matrix of $T$ relative to the basis $\{v_1^1, v_1^2, \dots, v_1^{n_1}, v_2^1, v_2^2, \dots, v_2^{n_2}, \dots, v_k^1, v_k^2, \dots, v_k^{n_k}\}$ has the canonical form
\[
\begin{bmatrix}
M_{n_1} & 0 & 0\\
0 & \ddots & 0\\
0 & 0 & M_{n_k}
\end{bmatrix}_{n\times n}.
\]

7. Let $A$ be a nonzero $3\times 5$ matrix with complex numbers as entries. Suppose that $\operatorname{rank}(A) = 2$. Show that there exist an invertible $3\times 3$ matrix $P$ and an invertible $5\times 5$ matrix $Q$ such that
\[
PAQ = \begin{bmatrix} I_2 & 0\\ 0 & 0 \end{bmatrix}_{3\times 5}.
\]
8. Let $A$ and $B$ be any n-square complex matrices. Suppose that $AB = BA$. Show that there exists a unitary matrix $U$ such that $U^*AU$ and $U^*BU$ are both upper triangular matrices.

9. Let $A$ be an n-square complex matrix. Suppose that $A$ is a normal matrix. Show that there exists a unitary matrix $U$ such that the diagonal entries of $U^*AU$ are the eigenvalues of $A$.

10. Find the characteristic polynomial $p(x)$ of the matrix
\[
A \equiv \begin{bmatrix} 1 & 2 & -3\\ 5 & -2 & -3\\ 1 & -1 & 4 \end{bmatrix}.
\]
Verify that $p(A) = 0$.
Chapter 4 Sylvester’s Law of Inertia

Sylvester’s law characterizes an equivalence relation called congruence. This remarkable result introduces a new concept of a matrix, called its signature. It is similar to the rank of a matrix. Finally, a beautiful method of obtaining the signature of a real quadratic form is introduced.
4.1 Positive Definite Matrices
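4.1.1 Theorem Let $V$ be any n-dimensional vector space over the field $F$. Let $T : V \to V$ be a linear transformation. Then there exists a positive integer $k$ such that $N(T^k) = N(T^{k+1}) = N(T^{k+2}) = \cdots$, and $N(T^{k-1})$ is a proper subset of $N(T^k)$.

Proof It is clear that
\[
(\{v : v \in V \text{ and } T(v) = 0\} =)\ N(T) \subseteq N(T \circ T)\ (= N(T^2)),
\]
so $N(T) \subseteq N(T^2)$. Similarly, $N(T^2) \subseteq N(T^3)$, etc. Since
\[
\{0\} \subseteq N(T) \subseteq N(T^2) \subseteq N(T^3) \subseteq \cdots \subseteq V,
\]
each “null space” $N(T^k)$ is a linear subspace of $V$, and $V$ is a finite-dimensional vector space, the chain $N(T) \subseteq N(T^2) \subseteq N(T^3) \subseteq \cdots$ cannot continue to increase indefinitely. Moreover, once $N(T^k) = N(T^{k+1})$, the chain stays constant: if $v \in N(T^{k+2})$, then $T(v) \in N(T^{k+1}) = N(T^k)$, and hence $T^{k+1}(v) = 0$. Hence there exists a positive integer $k$ such that $N(T^k) = N(T^{k+1}) = N(T^{k+2}) = \cdots$, and, taking $k$ to be the least such integer, $N(T^{k-1})$ is a proper subset of $N(T^k)$ (provided that $N(T) \ne \{0\}$; if $N(T) = \{0\}$, then every $N(T^k)$ equals $\{0\}$). ∎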
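The stabilization of the chain of null spaces can be watched numerically by computing $\dim N(T^j) = n - \operatorname{rank}(T^j)$ for successive $j$. The sketch below is illustrative: the matrix `T` is an assumed example over the rationals, and the helper names `mat_mul`, `rank`, and `nullities` are hypothetical.

```python
from fractions import Fraction

def mat_mul(X, Y):
    """Product of two square matrices given as lists of rows."""
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def rank(M):
    """Rank of M by Gaussian elimination with exact rational arithmetic."""
    rows = [[Fraction(v) for v in row] for row in M]
    r = 0
    for col in range(len(rows[0])):
        pivot = next((i for i in range(r, len(rows)) if rows[i][col] != 0), None)
        if pivot is None:
            continue                      # no pivot in this column
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(r + 1, len(rows)):
            factor = rows[i][col] / rows[r][col]
            rows[i] = [a - factor * b for a, b in zip(rows[i], rows[r])]
        r += 1
    return r

def nullities(T, max_power):
    """Return [dim N(T), dim N(T^2), ..., dim N(T^max_power)]."""
    n = len(T)
    power = T
    result = []
    for _ in range(max_power):
        result.append(n - rank(power))
        power = mat_mul(power, T)
    return result

# Assumed example: a nilpotent 2x2 block together with an invertible 1x1 block.
T = [[0, 1, 0], [0, 0, 0], [0, 0, 2]]
dims = nullities(T, 4)
# Least k (1-based) at which two consecutive nullities agree.
k = next(i + 1 for i in range(len(dims) - 1) if dims[i] == dims[i + 1])
```

For this `T` the dimensions grow from 1 to 2 and then stay at 2, so the chain stabilizes at $k = 2$.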
© Springer Nature Singapore Pte Ltd. 2020R. Sinha, Galois Theory and Advanced Linear Algebra,https://doi.org/10.1007/978-981-13-9849-0_4
4.1.2 Theorem $N(T^k)$ is invariant under $T$.

Proof To show this, let us take an arbitrary $v \in N(T^k)$, that is, $T^k(v) = 0$. We have to show that $T(v) \in N(T^k)$, that is,
\[
T(0) = T\bigl(T^k(v)\bigr) = T^{k+1}(v) = T^k\bigl(T(v)\bigr) = 0,
\]
that is, $T(0) = 0$. This is known to be true, because $T : V \to V$ is a linear transformation. ∎
4.1.3 Theorem The restriction $T|_{N(T^k)} : N(T^k) \to N(T^k)$ is a nilpotent transformation.

Proof To show this, let us take an arbitrary $v \in N(T^k)$. It suffices to show that
\[
\bigl(T|_{N(T^k)}\bigr)^k(v) = 0.
\]
Since $v \in N(T^k)$, we have $T^k(v) = 0$. By 4.1.2, $N(T^k)$ is invariant under $T$, so $T(v), T^2(v), \dots$ all lie in $N(T^k)$, and the restriction $T|_{N(T^k)}$ acts on each of them exactly as $T$ does. Now,
\[
\text{LHS} = \bigl(T|_{N(T^k)}\bigr)^k(v) = \bigl(T|_{N(T^k)}\bigr)^{k-1}\bigl(T(v)\bigr) = \cdots = T^k(v) = 0 = \text{RHS}.
\]
∎
4.1.4 Theorem $N(T^k) \cap \operatorname{ran}(T^k) = \{0\}$.

Proof Suppose to the contrary that there exists a nonzero $v$ in $N(T^k) \cap \operatorname{ran}(T^k)$, that is, $v \ne 0$, $T^k(v) = 0$, and for some nonzero $w \in V$, $T^k(w) = v$. We seek a contradiction.

Since
\[
T^{2k}(w) = T^k\bigl(T^k(w)\bigr) = T^k(v) = 0,
\]
we have $T^{2k}(w) = 0$, and hence $w \in N(T^{2k})$. Now, since $N(T^k) = N(T^{k+1}) = N(T^{k+2}) = \cdots$, we have $N(T^k) = N(T^{2k})\ (\ni w)$, and hence $w \in N(T^k)$. It follows that $v = T^k(w) = 0$, and hence $v = 0$. This is a contradiction. ∎
4.1.5 Theorem $V = N(T^k) \oplus \operatorname{ran}(T^k)$.

Proof From 4.1.4, it remains to show that $V = N(T^k) + \operatorname{ran}(T^k)$. Since $N(T^k)$, $\operatorname{ran}(T^k)$ are subspaces of $V$, $N(T^k) + \operatorname{ran}(T^k)$ is a subspace of $V$, and hence it suffices to show that
\[
\begin{aligned}
\dim(\text{domain of } T^k) = \dim(V) &= \dim\bigl(N(T^k) + \operatorname{ran}(T^k)\bigr)\\
&= \dim\bigl(N(T^k)\bigr) + \dim\bigl(\operatorname{ran}(T^k)\bigr) - \dim\bigl(N(T^k) \cap \operatorname{ran}(T^k)\bigr)\\
&= \dim\bigl(N(T^k)\bigr) + \dim\bigl(\operatorname{ran}(T^k)\bigr) - \dim(\{0\})\\
&= \dim\bigl(N(T^k)\bigr) + \dim\bigl(\operatorname{ran}(T^k)\bigr) - 0\\
&= \dim\bigl(N(T^k)\bigr) + \dim\bigl(\operatorname{ran}(T^k)\bigr),
\end{aligned}
\]
that is, $\dim(\text{domain of } T^k) = \dim\bigl(N(T^k)\bigr) + \dim\bigl(\operatorname{ran}(T^k)\bigr)$.

Since $T^k : V \to V$ is a linear transformation, we obtain from the well-known result (nullity + rank = dimension of domain) that
\[
\dim\bigl(N(T^k)\bigr) + \dim\bigl(\operatorname{ran}(T^k)\bigr) = \dim(\text{domain of } T^k).
\]
Thus $V = N(T^k) \oplus \operatorname{ran}(T^k)$. ∎
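4.1.6 Theorem $T|_{\operatorname{ran}(T^k)} : \operatorname{ran}(T^k) \to \operatorname{ran}(T^k)$ is a mapping, that is, $\operatorname{ran}(T^k)$ is invariant under $T$.

Proof To show this, let us take an arbitrary $v \in V$. We have to show that $\bigl(T|_{\operatorname{ran}(T^k)}\bigr)\bigl(T^k(v)\bigr) \in \operatorname{ran}(T^k)$, that is, $T\bigl(T^k(v)\bigr) \in \operatorname{ran}(T^k)$, that is, $T^{k+1}(v) \in \operatorname{ran}(T^k)$, that is, $T^k\bigl(T(v)\bigr) \in \operatorname{ran}(T^k)$. This is clearly true. ∎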
4.1.7 Theorem The restriction $T|_{\operatorname{ran}(T^k)} : \operatorname{ran}(T^k) \to \operatorname{ran}(T^k)$ is invertible.

Proof To show this, let us take an arbitrary $v \in V$ such that $\bigl(T|_{\operatorname{ran}(T^k)}\bigr)\bigl(T^k(v)\bigr) = 0$, that is, $T\bigl(T^k(v)\bigr) = 0$, that is, $T^{k+1}(v) = 0$, that is, $v \in N(T^{k+1})$. It suffices to show that $T^k(v) = 0$, that is, $v \in N(T^k)$.

Since $v \in N(T^{k+1})$, and $N(T^k) = N(T^{k+1})$, we have $v \in N(T^k)$. Thus the linear transformation $T|_{\operatorname{ran}(T^k)} : \operatorname{ran}(T^k) \to \operatorname{ran}(T^k)$ is one-to-one. Since $\operatorname{ran}(T^k)$ is finite-dimensional, it is invertible. ∎
4.1.8 Conclusion Let $V$ be any $n$-dimensional vector space over the field $F$. Let $T : V \to V$ be a linear transformation. Then there exists a positive integer $k$ such that

1. $N(T) \subset N(T^2) \subset \cdots \subset N(T^k) = N(T^{k+1}) = N(T^{k+2}) = \cdots$,
2. $V = N(T^k) \oplus \operatorname{ran}(T^k)$,
3. $T|_{N(T^k)} : N(T^k) \to N(T^k)$ is a nilpotent transformation,
4. $T|_{\operatorname{ran}(T^k)} : \operatorname{ran}(T^k) \to \operatorname{ran}(T^k)$ is invertible.
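As a small illustration of 4.1.8 (an example of our own choosing, not taken from the text), let $V = \mathbb{C}^3$ with standard basis $e_1, e_2, e_3$, and let $T$ be given by the matrix below. In this example one may take $k = 2$.

```latex
\[
T = \begin{bmatrix} 0 & 1 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & 2 \end{bmatrix},
\qquad T(e_1) = 0, \quad T(e_2) = e_1, \quad T(e_3) = 2e_3,
\qquad
T^2 = \begin{bmatrix} 0 & 0 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & 4 \end{bmatrix}.
\]
\[
N(T) = \operatorname{span}\{e_1\} \subset N(T^2) = \operatorname{span}\{e_1, e_2\} = N(T^3) = \cdots,
\qquad
\operatorname{ran}(T^2) = \operatorname{span}\{e_3\}.
\]
\[
V = N(T^2) \oplus \operatorname{ran}(T^2),
\qquad
\left(T|_{N(T^2)}\right)^2 = 0,
\qquad
\left(T|_{\operatorname{ran}(T^2)}\right)(e_3) = 2e_3 .
\]
```

Thus $T|_{N(T^2)}$ is nilpotent, and $T|_{\operatorname{ran}(T^2)}$ is multiplication by 2, which is invertible.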
4.1.9 Theorem Let $V$ be any $n$-dimensional vector space over the field $F$. Let $T : V \to V$ be a linear transformation. Then there exist unique linear subspaces $H$ and $K$ of $V$ such that

1. $V = H \oplus K$,
2. $T|_H : H \to H$ is a nilpotent transformation,
3. $T|_K : K \to K$ is invertible.

Also, there exists a positive integer $k$ such that $H = N(T^k)$, and $K = \operatorname{ran}(T^k)$.
Proof of existence: By 4.1.1, there exists a positive integer $k$ such that

1. $N(T) \subset N(T^2) \subset \cdots \subset N(T^k) = N(T^{k+1}) = N(T^{k+2}) = \cdots$,
2. $V = N(T^k) \oplus \operatorname{ran}(T^k)$,
3. $T|_{N(T^k)} : N(T^k) \to N(T^k)$ is a nilpotent transformation,
4. $T|_{\operatorname{ran}(T^k)} : \operatorname{ran}(T^k) \to \operatorname{ran}(T^k)$ is invertible.

Let us put $H \equiv N(T^k)$, and $K \equiv \operatorname{ran}(T^k)$. We get

1. $V = H \oplus K$,
2. $T|_H : H \to H$ is a nilpotent transformation,
3. $T|_K : K \to K$ is invertible.
Proof of uniqueness: Suppose that $H_1$ and $K_1$ are subspaces of $V$ such that

1. $V = H_1 \oplus K_1$,
2. $T|_{H_1} : H_1 \to H_1$ is a nilpotent transformation,
3. $T|_{K_1} : K_1 \to K_1$ is invertible.

Suppose that $H_2$ and $K_2$ are subspaces of $V$ such that

1. $V = H_2 \oplus K_2$,
2. $T|_{H_2} : H_2 \to H_2$ is a nilpotent transformation,
3. $T|_{K_2} : K_2 \to K_2$ is invertible.

We have to show that $H_1 = H_2$, and $K_1 = K_2$.

By 4.1.1, there exists a positive integer $k$ such that

1. $N(T) \subset N(T^2) \subset \cdots \subset N(T^k) = N(T^{k+1}) = N(T^{k+2}) = \cdots$,
2. $V = N(T^k) \oplus \operatorname{ran}(T^k)$,
3. $T|_{N(T^k)} : N(T^k) \to N(T^k)$ is a nilpotent transformation,
4. $T|_{\operatorname{ran}(T^k)} : \operatorname{ran}(T^k) \to \operatorname{ran}(T^k)$ is invertible.

Since $V = N(T^k) \oplus \operatorname{ran}(T^k)$, we have $\dim(V) = \dim\left(N(T^k)\right) + \dim\left(\operatorname{ran}(T^k)\right)$.

Clearly, $H_1 \subset N(T^k)$.
Proof To show this, let us take an arbitrary $v \in H_1$. We have to show that $v \in N(T^k)$, that is, $T^k(v) = 0$. Since $T|_{H_1} : H_1 \to H_1$ is a nilpotent transformation, there exists a positive integer $l$ such that $\left(T|_{H_1}\right)^l = 0$. It follows that
\[
\left(T^l\right)(v) = \cdots = \left(T|_{H_1}\right)^{l-1}(T(v)) = \left(T|_{H_1}\right)^{l-1}\left(\left(T|_{H_1}\right)(v)\right) = \left(T|_{H_1}\right)^{l}(v) = 0,
\]
and hence $\left(T^l\right)(v) = 0$. Thus $v \in N(T^l)$. Since
\[ N(T) \subset N(T^2) \subset \cdots \subset N(T^k) = N(T^{k+1}) = N(T^{k+2}) = \cdots, \]
we have
\[ v \in N(T^l) \subset N(T^k), \]
and hence $v \in N(T^k)$. ∎
It follows that $\dim(H_1) \le \dim\left(N(T^k)\right)$.

We claim that $\dim(H_1) = \dim\left(N(T^k)\right)$.

Suppose to the contrary that $\dim(H_1) < \dim\left(N(T^k)\right)$. We seek a contradiction.

Clearly, $K_1 \subset \operatorname{ran}(T)$.

Proof To show this, let us take an arbitrary $v \in K_1$. We have to show that $v \in \operatorname{ran}(T)$. Since $v \in K_1$, and $T|_{K_1} : K_1 \to K_1$ is invertible, there exists $w \in K_1$ such that $T(w) = \left(T|_{K_1}\right)(w) = v$, and hence $(\operatorname{ran}(T) \ni)\ T(w) = v$. Thus $v \in \operatorname{ran}(T)$. ∎

Clearly, $K_1 \subset \operatorname{ran}(T^2)$.

Proof To show this, let us take an arbitrary $v \in K_1$. We have to show that $v \in \operatorname{ran}(T^2)$. Since $T|_{K_1} : K_1 \to K_1$ is invertible, $\left(T|_{K_1}\right)^2 : K_1 \to K_1$ is invertible. Now, since $v \in K_1$, there exists $w \in K_1$ such that
\[
\left(T|_{K_1}\right)(T(w)) = \left(T|_{K_1}\right)\left(\left(T|_{K_1}\right)(w)\right) = \left(\left(T|_{K_1}\right)^2\right)(w) = v,
\]
and hence $\left(T|_{K_1}\right)(T(w)) = v$. Since $w \in K_1$ and $T|_{K_1} : K_1 \to K_1$, we have $T(w) = \left(T|_{K_1}\right)(w) \in K_1$, and hence $T(w) \in K_1$. It follows that
\[
v = \left(T|_{K_1}\right)(T(w)) = T(T(w)) = \left(T^2\right)(w) \in \operatorname{ran}(T^2),
\]
and hence $v \in \operatorname{ran}(T^2)$. ∎

Similarly, $K_1 \subset \operatorname{ran}(T^3)$, etc. Thus $K_1 \subset \operatorname{ran}(T^k)$.

It follows that $\dim(K_1) \le \dim\left(\operatorname{ran}(T^k)\right)$. Since $V = H_1 \oplus K_1$, we have $\dim(V) = \dim(H_1) + \dim(K_1)$. Similarly, $\dim(V) = \dim(H_2) + \dim(K_2)$. Since
\[
\dim(V) - \dim(H_1) = \dim(K_1) \le \dim\left(\operatorname{ran}(T^k)\right) = \dim(V) - \dim\left(N(T^k)\right),
\]
we have
\[ \dim(V) - \dim(H_1) \le \dim(V) - \dim\left(N(T^k)\right), \]
and hence $\dim\left(N(T^k)\right) \le \dim(H_1)$. This is a contradiction.

So our claim is substantiated, that is, $\dim(H_1) = \dim\left(N(T^k)\right)$.

Now, since $H_1 \subset N(T^k)$, we have $H_1 = N(T^k)$. Similarly, $H_2 = N(T^k)$. It follows that $H_1 = H_2$. It remains to show that $K_1 = K_2$.

Since $V = H_1 \oplus K_1$, we have $\dim(V) = \dim(H_1) + \dim(K_1)$. Similarly, $\dim(V) = \dim\left(N(T^k)\right) + \dim\left(\operatorname{ran}(T^k)\right)$. Since
\[
\dim(V) - \dim(K_1) = \dim(H_1) = \dim\left(N(T^k)\right) = \dim(V) - \dim\left(\operatorname{ran}(T^k)\right),
\]
we have $\dim(V) - \dim(K_1) = \dim(V) - \dim\left(\operatorname{ran}(T^k)\right)$, and hence $\dim(K_1) = \dim\left(\operatorname{ran}(T^k)\right)$. Now, since $K_1 \subset \operatorname{ran}(T^k)$, we have $K_1 = \operatorname{ran}(T^k)$. Similarly, $K_2 = \operatorname{ran}(T^k)$. Hence $K_1 = K_2$. ∎
4.1.10 Note Let $V$ be any $n$-dimensional vector space. Let $T : V \to V$ be a linear transformation.

Let $v_1, \ldots, v_n$ be any basis of $V$. Let $A \equiv [a_{ij}]_{n \times n}$ be the matrix of $T$ relative to the basis $v_1, \ldots, v_n$. By 3.3.22, there exists a unitary matrix $U$ such that

1. $U^*AU$ is an upper triangular matrix,
2. the eigenvalues of $A$ are the diagonal entries of $U^*AU$.

Since $U$ is a unitary matrix, we have $U^*U = UU^* = I$, and hence $U^{-1} = U^*$. Thus $U$ is invertible. Also

1. $U^{-1}AU$ is an upper triangular matrix,
2. the eigenvalues of $T$ are the diagonal entries of $U^{-1}AU$.
Since $U$ is invertible, by 3.1.35(b), there exists a basis $w_1, \ldots, w_n$ of $V$ such that $U^{-1}AU$ is the matrix of $T$ relative to the basis $w_1, \ldots, w_n$. Thus

1. the matrix of $T$ relative to the basis $w_1, \ldots, w_n$ is upper triangular,
2. the eigenvalues of $T$ are the diagonal entries of the matrix of $T$ relative to the basis $w_1, \ldots, w_n$.
4.1.11 Conclusion Let $V$ be any $n$-dimensional vector space. Let $T : V \to V$ be a linear transformation. Then there exists a basis $w_1, \ldots, w_n$ of $V$ such that if $B$ is the matrix of $T$ relative to the basis $w_1, \ldots, w_n$, then

1. $B$ is an upper triangular matrix,
2. the eigenvalues of $T$ are the diagonal entries of $B$.
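For a concrete instance of 4.1.11 (an illustrative matrix of our own choosing), let $V = \mathbb{C}^2$ and let $T$ have matrix $A$ relative to the standard basis. The basis $w_1, w_2$ below is not orthonormal; the conclusion only asks for some basis.

```latex
\[
A = \begin{bmatrix} 1 & 1 \\ -1 & 3 \end{bmatrix},
\qquad \det(A - \lambda I) = \lambda^2 - 4\lambda + 4 = (\lambda - 2)^2 .
\]
\[
w_1 = \begin{bmatrix} 1 \\ 1 \end{bmatrix}, \quad
w_2 = \begin{bmatrix} 0 \\ 1 \end{bmatrix}, \qquad
A w_1 = 2 w_1, \quad A w_2 = \begin{bmatrix} 1 \\ 3 \end{bmatrix} = 1\,w_1 + 2\,w_2 .
\]
\[
P = \begin{bmatrix} 1 & 0 \\ 1 & 1 \end{bmatrix}, \qquad
P^{-1} A P = \begin{bmatrix} 1 & 0 \\ -1 & 1 \end{bmatrix}
\begin{bmatrix} 2 & 1 \\ 2 & 3 \end{bmatrix}
= \begin{bmatrix} 2 & 1 \\ 0 & 2 \end{bmatrix} .
\]
```

The matrix of $T$ relative to $w_1, w_2$ is upper triangular, and its diagonal entries $2, 2$ are the eigenvalues of $T$.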
4.1.12 Note Let $V$ be any $n$-dimensional vector space over the field $\mathbb{C}$. Let $T : V \to V$ be a linear transformation.

Suppose that $\lambda_1, \lambda_2, \ldots, \lambda_p$ are all the distinct eigenvalues of $T$. Suppose that the eigenvalue $\lambda_1$ has multiplicity $m_1$, the eigenvalue $\lambda_2$ has multiplicity $m_2$, etc. In other words, the list of all eigenvalues of $T$ is
\[
\underbrace{\lambda_1, \lambda_1, \ldots, \lambda_1}_{m_1 \text{ in number}}, \ldots, \underbrace{\lambda_p, \lambda_p, \ldots, \lambda_p}_{m_p \text{ in number}},
\]
where
\[ n = m_1 + \cdots + m_p. \]
Thus the characteristic polynomial of $T$ is $(\lambda - \lambda_1)^{m_1} \cdots (\lambda - \lambda_p)^{m_p}$. Now, since the minimal polynomial of $T$ divides the characteristic polynomial of $T$, we can suppose that the minimal polynomial of $T$ is of the form
\[ (\lambda - \lambda_1)^{l_1} \cdots (\lambda - \lambda_p)^{l_p}, \]
where $l_i \le m_i$ $(i = 1, \ldots, p)$. Put
\[
V_1 \equiv \left\{ v : v \in V \text{ and } \left((T - \lambda_1 I)^{l_1}\right)(v) = 0 \right\},
\]
\[
V_2 \equiv \left\{ v : v \in V \text{ and } \left((T - \lambda_2 I)^{l_2}\right)(v) = 0 \right\}, \text{ etc.}
\]
By 3.2.3,

1. each $V_i$ is a nontrivial linear subspace of $V$,
2. each $V_i$ is invariant under $T$,
3. $V = V_1 \oplus V_2 \oplus \cdots \oplus V_p$,
4. for each $i = 1, 2, \ldots, p$, the minimal polynomial of $T|_{V_i}$ is $(\lambda - \lambda_i)^{l_i}$.

From 4.1.5, $\left((T - \lambda_1 I)|_{V_1}\right)^{l_1} \left(= \left(T|_{V_1} - \lambda_1 I\right)^{l_1}\right) = 0$, so $(T - \lambda_1 I)|_{V_1} : V_1 \to V_1$ is a nilpotent transformation, and hence by 3.2.22, 0 is the only eigenvalue of $(T - \lambda_1 I)|_{V_1}$. It follows that $\lambda_1$ is the only eigenvalue of $T|_{V_1}$. Similarly, $\lambda_2$ is the only eigenvalue of $T|_{V_2}$, etc.
4.1.13 Conclusion Let $V$ be any $n$-dimensional vector space over the field $\mathbb{C}$. Let $T : V \to V$ be a linear transformation. Suppose that $\lambda_1, \lambda_2, \ldots, \lambda_p$ are all the distinct eigenvalues of $T$. Suppose that the minimal polynomial of $T$ is of the form
\[ (\lambda - \lambda_1)^{l_1} \cdots (\lambda - \lambda_p)^{l_p}. \]
Put
\[
V_1 \equiv \left\{ v : v \in V \text{ and } \left((T - \lambda_1 I)^{l_1}\right)(v) = 0 \right\},
\]
\[
V_2 \equiv \left\{ v : v \in V \text{ and } \left((T - \lambda_2 I)^{l_2}\right)(v) = 0 \right\}, \text{ etc.}
\]
Then

1. each $V_i$ is a nontrivial linear subspace of $V$,
2. each $V_i$ is invariant under $T$,
3. $V = V_1 \oplus V_2 \oplus \cdots \oplus V_p$,
4. for each $i = 1, 2, \ldots, p$, $\lambda_i$ is the only eigenvalue of $T|_{V_i}$,
5. there exists a basis $B$ of $V$ such that the matrix of $T$ relative to $B$ is of the block form
\[
\begin{bmatrix}
A_1 & & & \\
& A_2 & & \\
& & \ddots & \\
& & & A_p
\end{bmatrix}_{n \times n},
\]
where $A_1$ is a $(\dim V_1) \times (\dim V_1)$ matrix of $T|_{V_1}$, $A_2$ is a $(\dim V_2) \times (\dim V_2)$ matrix of $T|_{V_2}$, etc., and all other entries are 0.
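To see the subspaces $V_i$ of 4.1.13 in a small case (an example of our own, with $V = \mathbb{C}^3$ and standard basis $e_1, e_2, e_3$), consider the matrix $A$ below, whose minimal polynomial is $(\lambda - 2)^2(\lambda - 3)$, so that $l_1 = 2$ and $l_2 = 1$.

```latex
\[
A = \begin{bmatrix} 2 & 1 & 0 \\ 0 & 2 & 0 \\ 0 & 0 & 3 \end{bmatrix},
\qquad
(A - 2I)(A - 3I) = \begin{bmatrix} 0 & -1 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{bmatrix} \neq 0,
\qquad
(A - 2I)^2 (A - 3I) = 0 .
\]
\[
(A - 2I)^2 = \begin{bmatrix} 0 & 0 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & 1 \end{bmatrix}
\;\Longrightarrow\;
V_1 = \operatorname{span}\{e_1, e_2\},
\qquad
A - 3I = \begin{bmatrix} -1 & 1 & 0 \\ 0 & -1 & 0 \\ 0 & 0 & 0 \end{bmatrix}
\;\Longrightarrow\;
V_2 = \operatorname{span}\{e_3\}.
\]
\[
V = V_1 \oplus V_2, \qquad
A_1 = \begin{bmatrix} 2 & 1 \\ 0 & 2 \end{bmatrix}, \qquad
A_2 = \begin{bmatrix} 3 \end{bmatrix}.
\]
```

Here 2 is the only eigenvalue of $T|_{V_1}$ and 3 is the only eigenvalue of $T|_{V_2}$, in agreement with item 4.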
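Definition A square matrix of the form
\[
\begin{bmatrix}
\lambda & 1 & 0 & \cdots & 0 \\
0 & \lambda & 1 & \ddots & \vdots \\
\vdots & \ddots & \ddots & \ddots & 0 \\
\vdots & & \ddots & \lambda & 1 \\
0 & \cdots & \cdots & 0 & \lambda
\end{bmatrix}
\]
is called a basic Jordan block belonging to $\lambda$. The basic Jordan block belonging to $\lambda$ of size $t \times t$ can also be written as
\[
\lambda I +
\begin{bmatrix}
0 & 1 & 0 & \cdots & 0 \\
0 & 0 & 1 & \ddots & \vdots \\
\vdots & \ddots & \ddots & \ddots & 0 \\
\vdots & & \ddots & 0 & 1 \\
0 & \cdots & \cdots & 0 & 0
\end{bmatrix}_{t \times t}
\]
or $\lambda I + M_t$, where
\[
M_t \equiv
\begin{bmatrix}
0 & 1 & 0 & \cdots & 0 \\
0 & 0 & 1 & \ddots & \vdots \\
\vdots & \ddots & \ddots & \ddots & 0 \\
\vdots & & \ddots & 0 & 1 \\
0 & \cdots & \cdots & 0 & 0
\end{bmatrix}_{t \times t}.
\]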
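For instance, with $t = 3$ and $\lambda = 5$ (values chosen only for illustration), the matrix $M_3$, its powers, and the corresponding basic Jordan block are as follows.

```latex
\[
M_3 = \begin{bmatrix} 0 & 1 & 0 \\ 0 & 0 & 1 \\ 0 & 0 & 0 \end{bmatrix},
\qquad
M_3^2 = \begin{bmatrix} 0 & 0 & 1 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{bmatrix},
\qquad
M_3^3 = 0,
\]
\[
5I + M_3 = \begin{bmatrix} 5 & 1 & 0 \\ 0 & 5 & 1 \\ 0 & 0 & 5 \end{bmatrix}.
\]
```

In particular $M_3$ is nilpotent, and 5 is the only eigenvalue of $5I + M_3$.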
4.1.14 Theorem Let $V$ be any $n$-dimensional vector space over the field $\mathbb{C}$. Let $T : V \to V$ be a linear transformation. Suppose that $\lambda_1, \lambda_2, \ldots, \lambda_p$ are the distinct eigenvalues of $T$. Then there exists a basis $B$ of $V$ such that the matrix of $T$ relative to $B$ is of the form
\[
\begin{bmatrix}
J_1 & & & \\
& J_2 & & \\
& & \ddots & \\
& & & J_p
\end{bmatrix}
\]
such that for every $i \in \{1, 2, \ldots, p\}$, $J_i$ is of the form
\[
\begin{bmatrix}
B_{i1} & & \\
& B_{i2} & \\
& & \ddots
\end{bmatrix},
\]
where each $B_{ij}$ is a basic Jordan block belonging to $\lambda_i$.

Here a matrix of the type
\[
\begin{bmatrix}
\begin{bmatrix} B_{11} & & \\ & B_{12} & \\ & & \ddots \end{bmatrix} & & \\
& \begin{bmatrix} B_{21} & & \\ & B_{22} & \\ & & \ddots \end{bmatrix} & \\
& & \ddots
\end{bmatrix},
\]
where each $B_{ij}$ is a basic Jordan block belonging to $\lambda_i$, is called a Jordan canonical form.

Thus for a given square matrix $A$, there exists an invertible matrix $C$ such that $C^{-1}AC$ is a Jordan canonical form.
Proof By 4.1.13, there exist linear subspaces $V_1, \ldots, V_p$ such that

1. each $V_i$ is invariant under $T$,
2. $V = V_1 \oplus V_2 \oplus \cdots \oplus V_p$,
3. for each $i = 1, 2, \ldots, p$, $\lambda_i$ is the only eigenvalue of $T|_{V_i}$,
4. there exists a basis $B$ of $V$ such that the matrix of $T$ relative to $B$ is of the block form
\[
\begin{bmatrix}
A_1 & & & \\
& A_2 & & \\
& & \ddots & \\
& & & A_p
\end{bmatrix}_{n \times n},
\]
where $A_1$ is a $(\dim V_1) \times (\dim V_1)$ matrix of $T|_{V_1}$, $A_2$ is a $(\dim V_2) \times (\dim V_2)$ matrix of $T|_{V_2}$, etc., and all other entries are 0.

Now, by 3.3.22, there exists a basis $C_1$ of $V_1$ such that the matrix $K_1 \left(= [C_1]^{-1} A_1 [C_1]\right)$ of $T|_{V_1}$ relative to $C_1$ is upper triangular, and each diagonal entry of $K_1$ is $\lambda_1$. Similarly, there exists a basis $C_2$ of $V_2$ such that the matrix $K_2 \left(= [C_2]^{-1} A_2 [C_2]\right)$ of $T|_{V_2}$ relative to $C_2$ is upper triangular, and each diagonal entry of $K_2$ is $\lambda_2$, etc. Now, since $V = V_1 \oplus V_2 \oplus \cdots \oplus V_p$, there exists a basis $D$ of $V$ such that the matrix of $T$ relative to $D$ is of the block form
\[
\begin{bmatrix}
K_1 & & & \\
& K_2 & & \\
& & \ddots & \\
& & & K_p
\end{bmatrix}_{n \times n}.
\]
Since $\lambda_1$ is the only eigenvalue of $T|_{V_1}$, 0 is the only eigenvalue of $T|_{V_1} - \lambda_1 I$, and hence the characteristic polynomial of $T|_{V_1} - \lambda_1 I$ is
\[
\underbrace{(\lambda - 0) \cdots (\lambda - 0)}_{\dim(V_1) \text{ in number}} \left(= \lambda^{\dim(V_1)}\right),
\]
and hence by the Cayley–Hamilton theorem, $\left(T|_{V_1} - \lambda_1 I\right)^{\dim(V_1)} = 0$. This shows that $T|_{V_1} - \lambda_1 I$ is nilpotent. Now, by 3.2.25, there exists a basis $D_1$ of $V_1$ such that the matrix
\[
[D_1]^{-1} (K_1 - \lambda_1 I) [D_1] \left(= [D_1]^{-1} K_1 [D_1] - \lambda_1 I\right)
\]
of $T|_{V_1} - \lambda_1 I$ relative to the basis $D_1$ has the canonical form
\[
\begin{bmatrix}
M_{n_1} & 0 & 0 \\
0 & M_{n_2} & 0 \\
0 & 0 & \ddots
\end{bmatrix}.
\]
It follows that the matrix $[D_1]^{-1} K_1 [D_1] \left(= [D_1]^{-1} \left([C_1]^{-1} A_1 [C_1]\right) [D_1] = ([C_1][D_1])^{-1} A_1 ([C_1][D_1])\right)$ of $T|_{V_1}$ relative to the basis $D_1$ has the canonical form
\[
\lambda_1 I +
\begin{bmatrix}
M_{n_1} & 0 & 0 \\
0 & M_{n_2} & 0 \\
0 & 0 & \ddots
\end{bmatrix}
=
\begin{bmatrix}
\lambda_1 I + M_{n_1} & 0 & 0 \\
0 & \lambda_1 I + M_{n_2} & 0 \\
0 & 0 & \ddots
\end{bmatrix}
=
\begin{bmatrix}
B_{11} & 0 & 0 \\
0 & B_{12} & 0 \\
0 & 0 & \ddots
\end{bmatrix},
\]
where $B_{11} \equiv \lambda_1 I + M_{n_1}$, $B_{12} \equiv \lambda_1 I + M_{n_2}$, etc. Clearly, each of $B_{11}, B_{12}, \ldots$ is a basic Jordan block belonging to $\lambda_1$. Thus the matrix of $T|_{V_1}$ relative to the basis $D_1$ has the canonical form
\[
\begin{bmatrix}
B_{11} & 0 & 0 \\
0 & B_{12} & 0 \\
0 & 0 & \ddots
\end{bmatrix}.
\]
Similarly, there exists a basis $D_2$ of $V_2$ such that the matrix of $T|_{V_2}$ relative to the basis $D_2$ has the canonical form
\[
\begin{bmatrix}
B_{21} & 0 & 0 \\
0 & B_{22} & 0 \\
0 & 0 & \ddots
\end{bmatrix},
\]
where each of $B_{21}, B_{22}, \ldots$ is a basic Jordan block belonging to $\lambda_2$, etc.
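Now, since $V = V_1 \oplus V_2 \oplus \cdots \oplus V_p$, there exists a basis $B$ of $V$ such that the matrix of $T$ relative to $B$ is of the form
\[
\begin{bmatrix}
J_1 & & & \\
& J_2 & & \\
& & \ddots & \\
& & & J_p
\end{bmatrix}
\]
such that for every $i \in \{1, 2, \ldots, p\}$, $J_i$ is of the form
\[
\begin{bmatrix}
B_{i1} & & \\
& B_{i2} & \\
& & \ddots
\end{bmatrix},
\]
where each $B_{ij}$ is a basic Jordan block belonging to $\lambda_i$. ∎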
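The last statement of 4.1.14 can be checked by hand on a $2 \times 2$ matrix (our own illustrative choice). Below, $w_1$ is an eigenvector of $A$ for the eigenvalue 2, and $w_2$ is chosen so that $(A - 2I)w_2 = w_1$; the matrix $C$ has columns $w_1, w_2$.

```latex
\[
A = \begin{bmatrix} 3 & 1 \\ -1 & 1 \end{bmatrix},
\qquad \det(A - \lambda I) = (\lambda - 2)^2,
\qquad A - 2I = \begin{bmatrix} 1 & 1 \\ -1 & -1 \end{bmatrix}.
\]
\[
w_1 = \begin{bmatrix} 1 \\ -1 \end{bmatrix}, \quad
w_2 = \begin{bmatrix} 1 \\ 0 \end{bmatrix}, \qquad
(A - 2I) w_1 = 0, \quad (A - 2I) w_2 = w_1 .
\]
\[
C = \begin{bmatrix} 1 & 1 \\ -1 & 0 \end{bmatrix}, \qquad
C^{-1} = \begin{bmatrix} 0 & -1 \\ 1 & 1 \end{bmatrix}, \qquad
C^{-1} A C = \begin{bmatrix} 2 & 1 \\ 0 & 2 \end{bmatrix} = 2I + M_2 .
\]
```

So the Jordan canonical form of this $A$ consists of a single basic Jordan block belonging to 2.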
4.1.15 Note Let $V$ be any $n$-dimensional inner product space over the field $\mathbb{C}$. Let $T : V \to V$ be a linear transformation. Let $T$ be normal.

Suppose that $\lambda_1, \lambda_2, \ldots, \lambda_p$ are the distinct eigenvalues of $T$. Suppose that the minimal polynomial of $T$ is of the form
\[ (\lambda - \lambda_1)^{l_1} \cdots (\lambda - \lambda_p)^{l_p}. \]
Put
\[
V_1 \equiv \left\{ v : v \in V \text{ and } \left((T - \lambda_1 I)^{l_1}\right)(v) = 0 \right\},
\]
\[
V_2 \equiv \left\{ v : v \in V \text{ and } \left((T - \lambda_2 I)^{l_2}\right)(v) = 0 \right\}, \text{ etc.}
\]
By 4.1.13,

1. each $V_i$ is a nontrivial linear subspace of $V$,
2. each $V_i$ is invariant under $T$,
3. $V = V_1 \oplus V_2 \oplus \cdots \oplus V_p$,
4. for each $i = 1, 2, \ldots, p$, $\lambda_i$ is the only eigenvalue of $T|_{V_i}$.

Observe that
\[ V_1 = E_1 \cup \{0\}, \]
where $E_1$ is the set of all eigenvectors belonging to the eigenvalue $\lambda_1$ of $T$.

Proof To show this, let us take an arbitrary nonzero $v \in V_1 \left(= \left\{ v : v \in V \text{ and } \left((T - \lambda_1 I)^{l_1}\right)(v) = 0 \right\}\right)$. It follows that $\left((T - \lambda_1 I)^{l_1}\right)(v) = 0$. Since $T$ is normal, by 3.1.24, we have $T(v) = \lambda_1 v$. Next, since $v \neq 0$, $v$ is an eigenvector belonging to the eigenvalue $\lambda_1$, and hence $v \in E_1$. Thus $V_1 \subset E_1 \cup \{0\}$. It suffices to show that $E_1 \subset V_1$.

To show this, let us take an arbitrary $v \in E_1$, that is, $v$ is an eigenvector belonging to the eigenvalue $\lambda_1$ of $T$. Hence $T(v) = \lambda_1 v$. It follows that $(T - \lambda_1 I)(v) = 0$, and hence
\[
\left((T - \lambda_1 I)^{l_1}\right)(v) = \left((T - \lambda_1 I)^{l_1 - 1}\right)\left((T - \lambda_1 I)(v)\right) = \left((T - \lambda_1 I)^{l_1 - 1}\right)(0) = 0.
\]
Thus $\left((T - \lambda_1 I)^{l_1}\right)(v) = 0$. It follows that $v \in V_1$.

We have shown that $V_1 = E_1 \cup \{0\}$. ∎

Similarly, $V_2 = E_2 \cup \{0\}$, where $E_2$ is the set of all eigenvectors belonging to the eigenvalue $\lambda_2$ of $T$, etc.

Since $V_1$ is a nontrivial linear subspace of $V$, by 3.3.12, there exists an orthonormal basis $B_1$ of $V_1 \left(= E_1 \cup \{0\}\right)$. Since a basis does not contain the zero vector, we have $B_1 \subset E_1$, and hence each member of $B_1$ is an eigenvector belonging to the eigenvalue $\lambda_1$ of $T$. Thus for every $v \in B_1$, $T(v) = \lambda_1 v$.

Similarly, there exists an orthonormal basis $B_2$ of $V_2$ such that $B_2 \subset E_2$, and for every $w \in B_2$, $T(w) = \lambda_2 w$, etc.

Clearly, $B_1 \cup B_2 \cup \cdots \cup B_p$ is an orthonormal basis of $V$.

Proof Since each $B_i$ is an orthonormal basis of $V_i$, and $V = V_1 \oplus V_2 \oplus \cdots \oplus V_p$, it suffices to show that for distinct $i, j \in \{1, \ldots, p\}$, $\left(v \in B_i,\ w \in B_j \Rightarrow \langle v, w \rangle = 0\right)$.

To show this, let us take arbitrary $i, j \in \{1, \ldots, p\}$ such that $i \neq j$. Next, let us take arbitrary $v \in B_i$, and $w \in B_j$. We have to show that $\langle v, w \rangle = 0$.

Since $v \in B_i$, we have $T(v) = \lambda_i v$. Since $v \in B_i$, and $B_i$ is a basis, $v$ is nonzero. Since $i \neq j$, and $\lambda_1, \lambda_2, \ldots, \lambda_p$ are distinct, we have $\lambda_i \neq \lambda_j$. Since $w \in B_j$, we have $T(w) = \lambda_j w$. Now, since $T$ is normal, by 3.1.25, $\langle v, w \rangle = 0$.

We have shown that $B$ is an orthonormal basis of $V$, where $B \equiv B_1 \cup B_2 \cup \cdots \cup B_p$. ∎

Since for every $i \in \{1, \ldots, p\}$, $B_i \subset E_i$, we have
\[
B \left(= B_1 \cup B_2 \cup \cdots \cup B_p\right) \subset E_1 \cup E_2 \cup \cdots \cup E_p \ (= \text{the collection of all eigenvectors of } T),
\]
and hence each member of $B$ is an eigenvector of $T$. Suppose that
\[ B = \{e_1, e_2, \ldots, e_n\} \ (\subset V). \]
Since $\{e_1, e_2, \ldots, e_n\}$ is an orthonormal basis $B$ of $V$, we have
\[
T(e_1) = \langle T(e_1), e_1 \rangle e_1 = \langle T(e_1), e_1 \rangle e_1 + 0e_2 + \cdots + 0e_n,
\]
and hence
\[ T(e_1) = \langle T(e_1), e_1 \rangle e_1 + 0e_2 + \cdots + 0e_n. \]
Similarly,
\[ T(e_2) = 0e_1 + \langle T(e_2), e_2 \rangle e_2 + 0e_3 + \cdots + 0e_n, \]
etc. Thus the matrix of $T$ relative to the basis $\{e_1, e_2, \ldots, e_n\}$ is the diagonal matrix
\[
\begin{bmatrix}
\langle T(e_1), e_1 \rangle & & \\
& \langle T(e_2), e_2 \rangle & \\
& & \ddots
\end{bmatrix}_{n \times n}.
\]
4.1.16 Conclusion Let $V$ be any $n$-dimensional inner product space over the field $\mathbb{C}$. Let $T : V \to V$ be a linear transformation. Let $T$ be normal. Then there exists an orthonormal basis $B$ of $V$ such that the matrix of $T$ relative to the basis $B$ is a diagonal matrix.

Since every Hermitian linear transformation is normal, and every unitary linear transformation is normal, the above conclusion is also valid when either $T$ is Hermitian or $T$ is unitary.
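As a quick check of 4.1.16 on a concrete normal transformation (an example of our own, on $V = \mathbb{C}^2$ with the standard inner product $\langle x, y \rangle = x_1\overline{y_1} + x_2\overline{y_2}$), consider the rotation matrix below.

```latex
\[
T = \begin{bmatrix} 0 & -1 \\ 1 & 0 \end{bmatrix},
\qquad
T^* T = T T^* = I,
\qquad
\det(T - \lambda I) = \lambda^2 + 1 = (\lambda - i)(\lambda + i).
\]
\[
e_1 = \tfrac{1}{\sqrt{2}} \begin{bmatrix} 1 \\ -i \end{bmatrix}, \quad
e_2 = \tfrac{1}{\sqrt{2}} \begin{bmatrix} 1 \\ i \end{bmatrix}, \qquad
T(e_1) = i\,e_1, \quad T(e_2) = -i\,e_2 .
\]
\[
\langle e_1, e_1 \rangle = \langle e_2, e_2 \rangle = 1,
\qquad
\langle e_1, e_2 \rangle = \tfrac{1}{2}\left(1 \cdot \overline{1} + (-i)\,\overline{i}\right) = \tfrac{1}{2}\left(1 + i^2\right) = 0 .
\]
```

Relative to the orthonormal basis $\{e_1, e_2\}$, the matrix of $T$ is $\operatorname{diag}(i, -i)$.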
4.1.17 Theorem Let $V$ be any $n$-dimensional inner product space over the field $\mathbb{C}$. Let $T : V \to V$ be a normal linear transformation. Then $T$ is Hermitian if and only if all the eigenvalues of $T$ are real.

Proof In view of 3.1.13, it remains to show that if all the eigenvalues of $T$ are real, then $T$ is Hermitian. So we suppose that all the eigenvalues of $T$ are real. We have to show that $T$ is Hermitian, that is, $T^* = T$.

Since $T$ is normal, by 4.1.16, there exists an orthonormal basis $\{e_1, \ldots, e_n\}$ of $V$ such that the matrix of $T$ relative to the basis $\{e_1, \ldots, e_n\}$ is a diagonal matrix, say $\operatorname{diag}(a_1, \ldots, a_n)$. It follows that
\[ T(e_1) = a_1e_1 + 0e_2 + \cdots + 0e_n = a_1e_1, \]
and hence $T(e_1) = a_1e_1$. Since $\{e_1, \ldots, e_n\}$ is a basis, we have $e_1 \neq 0$. Now, since $T(e_1) = a_1e_1$, $a_1$ is an eigenvalue of $T$. Next, by assumption, $a_1$ is a real number. Similarly, $a_2$ is a real number, etc. It follows that
\[
\left(\operatorname{diag}(a_1, \ldots, a_n)\right)^* = \overline{\left(\operatorname{diag}(a_1, \ldots, a_n)\right)^T} = \overline{\operatorname{diag}(a_1, \ldots, a_n)} = \operatorname{diag}(\overline{a_1}, \ldots, \overline{a_n}) = \operatorname{diag}(a_1, \ldots, a_n),
\]
and hence
\[ \left(\operatorname{diag}(a_1, \ldots, a_n)\right)^* = \operatorname{diag}(a_1, \ldots, a_n). \]
Since the matrix of $T$ relative to the basis $\{e_1, \ldots, e_n\}$ is $\operatorname{diag}(a_1, \ldots, a_n)$, by 3.1.11, the matrix of $T^*$ relative to the basis $\{e_1, \ldots, e_n\}$ is $\left(\operatorname{diag}(a_1, \ldots, a_n)\right)^* \left(= \operatorname{diag}(a_1, \ldots, a_n)\right)$, and hence the matrix of $T^*$ relative to the basis $\{e_1, \ldots, e_n\}$ is $\operatorname{diag}(a_1, \ldots, a_n)$. Now, since the matrix of $T$ relative to the basis $\{e_1, \ldots, e_n\}$ is $\operatorname{diag}(a_1, \ldots, a_n)$, the matrices of $T^*$ and $T$ relative to the basis $\{e_1, \ldots, e_n\}$ are equal. It follows that $T^*(e_i) = T(e_i)$ $(i = 1, \ldots, n)$. Now, since $T^*$ and $T$ are linear, for every $v \in V$, $T^*(v) = T(v)$, and hence $T^* = T$. ∎
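Two small normal matrices (chosen by us for illustration) show both directions of 4.1.17: the first is Hermitian and has real eigenvalues, while the second is normal with non-real eigenvalues and is therefore not Hermitian.

```latex
\[
A = \begin{bmatrix} 2 & 1 \\ 1 & 2 \end{bmatrix} = A^*,
\qquad \det(A - \lambda I) = (\lambda - 1)(\lambda - 3),
\qquad \text{eigenvalues } 1,\ 3 \in \mathbb{R}.
\]
\[
B = \begin{bmatrix} 1 & 1 \\ -1 & 1 \end{bmatrix},
\qquad
B^* = \begin{bmatrix} 1 & -1 \\ 1 & 1 \end{bmatrix},
\qquad
B B^* = B^* B = \begin{bmatrix} 2 & 0 \\ 0 & 2 \end{bmatrix},
\]
\[
\det(B - \lambda I) = (1 - \lambda)^2 + 1,
\qquad \text{eigenvalues } 1 + i,\ 1 - i \notin \mathbb{R},
\qquad B^* \neq B .
\]
```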
4.1.18 Theorem Let $V$ be any $n$-dimensional inner product space over the field $\mathbb{C}$. Let $T : V \to V$ be a normal linear transformation. Then $T$ is unitary if and only if the absolute value of each eigenvalue of $T$ is 1.

Proof In view of 3.1.22, it remains to show that if the absolute value of each eigenvalue of $T$ is 1, then $T$ is unitary. So we suppose that the absolute value of each eigenvalue of $T$ is 1 and show that $T$ is unitary. In view of 3.1.10, it suffices to show that $T^*T = I$.

Since $T$ is normal, by 4.1.16, there exists an orthonormal basis $\{e_1, \ldots, e_n\}$ of $V$ such that the matrix of $T$ relative to the basis $\{e_1, \ldots, e_n\}$ is a diagonal matrix, say $\operatorname{diag}(a_1, \ldots, a_n)$.

Since the matrix of $T$ relative to the basis $\{e_1, \ldots, e_n\}$ is $\operatorname{diag}(a_1, \ldots, a_n)$, by 3.1.11, the matrix of $T^*$ relative to the basis $\{e_1, \ldots, e_n\}$ is
\[
\left(\operatorname{diag}(a_1, \ldots, a_n)\right)^* \left(= \overline{\left(\operatorname{diag}(a_1, \ldots, a_n)\right)^T} = \overline{\operatorname{diag}(a_1, \ldots, a_n)} = \operatorname{diag}(\overline{a_1}, \ldots, \overline{a_n})\right),
\]
and hence the matrix of $T^*$ relative to the basis $\{e_1, \ldots, e_n\}$ is $\operatorname{diag}(\overline{a_1}, \ldots, \overline{a_n})$. Now, since the matrix of $T$ relative to the basis $\{e_1, \ldots, e_n\}$ is $\operatorname{diag}(a_1, \ldots, a_n)$, by 3.1.33, the matrix of $T^*T$ relative to the basis $\{e_1, \ldots, e_n\}$ is
\[
\operatorname{diag}(\overline{a_1}, \ldots, \overline{a_n}) \cdot \operatorname{diag}(a_1, \ldots, a_n) \left(= \operatorname{diag}(\overline{a_1}a_1, \ldots, \overline{a_n}a_n) = \operatorname{diag}\left(|a_1|^2, \ldots, |a_n|^2\right)\right),
\]
and hence the matrix of $T^*T$ relative to the basis $\{e_1, \ldots, e_n\}$ is
\[ \operatorname{diag}\left(|a_1|^2, \ldots, |a_n|^2\right). \]
Since the matrix of $T$ relative to the basis $\{e_1, \ldots, e_n\}$ is $\operatorname{diag}(a_1, \ldots, a_n)$, we have
\[ T(e_1) = a_1e_1 + 0e_2 + \cdots + 0e_n = a_1e_1, \]
and hence $T(e_1) = a_1e_1$. Since $\{e_1, \ldots, e_n\}$ is a basis, we have $e_1 \neq 0$. Now, since $T(e_1) = a_1e_1$, $a_1$ is an eigenvalue of $T$. Next, by assumption, $|a_1| = 1$. Similarly, $|a_2| = 1$, etc. Now, since the matrix of $T^*T$ relative to the basis $\{e_1, \ldots, e_n\}$ is
\[
\operatorname{diag}\left(|a_1|^2, \ldots, |a_n|^2\right) = \operatorname{diag}\left(1^2, \ldots, 1^2\right) = \operatorname{diag}(1, \ldots, 1) = [\delta_{ij}],
\]
the matrix of $T^*T$ relative to the basis $\{e_1, \ldots, e_n\}$ is $[\delta_{ij}]$. Also, the matrix of $I$ relative to the basis $\{e_1, \ldots, e_n\}$ is $[\delta_{ij}]$. So, the matrices of $T^*T$ and $I$ relative to the basis $\{e_1, \ldots, e_n\}$ are equal. It follows that $(T^*T)(e_i) = I(e_i)$ $(i = 1, \ldots, n)$. Now, since $T^*T$ and $I$ are linear, for every $v \in V$, $(T^*T)(v) = I(v)$, and hence $T^*T = I$. ∎
4.1.19 Theorem Let $V$ be any $n$-dimensional inner product space over the field $\mathbb{C}$. Let $N : V \to V$ be a normal linear transformation. Let $T : V \to V$ be a linear transformation. Suppose that $TN = NT$. Then $TN^* = N^*T$.

Proof Let us put $X \equiv TN^* - N^*T$. We have to show that $X = 0$.

Since $N$ is normal, we have $N^*N = NN^*$, and hence $N$ commutes with $N^*$. Since $TN = NT$, $N$ commutes with $T$. Since $N$ commutes with $T$ and $N^*$, $N$ commutes with $(TN^* - N^*T)\ (= X)$, and hence $N$ commutes with $X$.

By 3.1.7, 3.1.8, and 3.1.6, we have
\[
X^* = (TN^* - N^*T)^* = (TN^*)^* - (N^*T)^* = (N^*)^*T^* - T^*(N^*)^* = NT^* - T^*N,
\]
and hence
\[
\begin{aligned}
XX^* &= (TN^* - N^*T)(NT^* - T^*N) = (TN^* - N^*T)NT^* - (TN^* - N^*T)T^*N \\
&= \left((TN^* - N^*T)N\right)T^* - (TN^* - N^*T)T^*N \\
&= \left(N(TN^* - N^*T)\right)T^* - (TN^* - N^*T)T^*N \\
&= N\left((TN^* - N^*T)T^*\right) - (TN^* - N^*T)T^*N \\
&= N\left((TN^* - N^*T)T^*\right) - \left((TN^* - N^*T)T^*\right)N = NB - BN,
\end{aligned}
\]
where $B \equiv (TN^* - N^*T)T^*$. Thus
\[ XX^* = NB - BN. \]
Since $N$ is normal, by 4.1.16, there exists an orthonormal basis $\{e_1, \ldots, e_n\}$ of $V$ such that the matrix of $N$ relative to the basis $\{e_1, \ldots, e_n\}$ is a diagonal matrix, say $\operatorname{diag}(a_1, \ldots, a_n)$.

Let $[b_{ij}]$ be the matrix of $B$ relative to the basis $\{e_1, \ldots, e_n\}$.
By 3.1.33, the matrix of $BN$ relative to the basis $\{e_1, \ldots, e_n\}$ is $[b_{ij}]\left(\operatorname{diag}(a_1, \ldots, a_n)\right)$. Clearly, the diagonal entries of $[b_{ij}]\left(\operatorname{diag}(a_1, \ldots, a_n)\right)$ are $b_{11}a_1, b_{22}a_2, \ldots, b_{nn}a_n$. Thus the diagonal entries of the matrix of $BN$ relative to the basis $\{e_1, \ldots, e_n\}$ are $b_{11}a_1, b_{22}a_2, \ldots, b_{nn}a_n$.

By 3.1.33, the matrix of $NB$ relative to the basis $\{e_1, \ldots, e_n\}$ is $\left(\operatorname{diag}(a_1, \ldots, a_n)\right)[b_{ij}]$. Clearly, the diagonal entries of $\left(\operatorname{diag}(a_1, \ldots, a_n)\right)[b_{ij}]$ are $a_1b_{11}, a_2b_{22}, \ldots, a_nb_{nn}$. Thus the diagonal entries of the matrix of $NB$ relative to the basis $\{e_1, \ldots, e_n\}$ are $a_1b_{11}, a_2b_{22}, \ldots, a_nb_{nn}$.

It follows that the matrix of $NB - BN$ relative to the basis $\{e_1, \ldots, e_n\}$ is
\[
\left(\operatorname{diag}(a_1, \ldots, a_n)\right)[b_{ij}] - [b_{ij}]\left(\operatorname{diag}(a_1, \ldots, a_n)\right),
\]
and hence the diagonal entries of the matrix of $NB - BN\ (= XX^*)$ relative to the basis $\{e_1, \ldots, e_n\}$ are all 0.

Thus, the diagonal entries of the matrix of $XX^*$ relative to the basis $\{e_1, \ldots, e_n\}$ are all 0.

Let $[x_{ij}]$ be the matrix of $X$ relative to the basis $\{e_1, \ldots, e_n\}$. By 3.1.11, the matrix of $X^*$ relative to the basis $\{e_1, \ldots, e_n\}$ is
\[
[x_{ij}]^* \left(= \overline{[x_{ij}]^T} = [\overline{x_{ij}}]^T\right),
\]
and hence by 3.1.33, the matrix of $XX^*$ relative to the basis $\{e_1, \ldots, e_n\}$ is $[x_{ij}][\overline{x_{ij}}]^T$. Here the diagonal entries of $[x_{ij}][\overline{x_{ij}}]^T$ are
\[
\sum_{j=1}^{n} x_{1j}\overline{x_{1j}},\ \sum_{j=1}^{n} x_{2j}\overline{x_{2j}},\ \ldots,\ \sum_{j=1}^{n} x_{nj}\overline{x_{nj}},
\]
that is, the diagonal entries of $[x_{ij}][\overline{x_{ij}}]^T$ are
\[
\sum_{j=1}^{n} |x_{1j}|^2,\ \sum_{j=1}^{n} |x_{2j}|^2,\ \ldots,\ \sum_{j=1}^{n} |x_{nj}|^2.
\]
Hence the diagonal entries of the matrix of $XX^*$ relative to the basis $\{e_1, \ldots, e_n\}$ are
\[
\sum_{j=1}^{n} |x_{1j}|^2,\ \sum_{j=1}^{n} |x_{2j}|^2,\ \ldots,\ \sum_{j=1}^{n} |x_{nj}|^2.
\]
Now, since the diagonal entries of the matrix of $XX^*$ relative to the basis $\{e_1, \ldots, e_n\}$ are all 0, we have
\[
\sum_{j=1}^{n} |x_{1j}|^2 = 0, \qquad \sum_{j=1}^{n} |x_{2j}|^2 = 0, \quad \text{etc.}
\]
Since $\sum_{j=1}^{n} |x_{1j}|^2 = 0$, we have $x_{1j} = 0$ $(j = 1, \ldots, n)$. Similarly, $x_{2j} = 0$ $(j = 1, \ldots, n)$, etc. Thus each $x_{ij}$ is 0, and hence the matrix $[x_{ij}]$ of $X$ relative to the basis $\{e_1, \ldots, e_n\}$ is the zero matrix. This shows that $X = 0$. ∎
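The normality of $N$ in 4.1.19 cannot simply be dropped. The following $2 \times 2$ matrices (an example supplied here for illustration, with $T = N$ so that $TN = NT$ holds trivially) give a non-normal $N$ for which the conclusion fails.

```latex
\[
N = T = \begin{bmatrix} 0 & 1 \\ 0 & 0 \end{bmatrix},
\qquad
N^* = \begin{bmatrix} 0 & 0 \\ 1 & 0 \end{bmatrix},
\qquad
N N^* = \begin{bmatrix} 1 & 0 \\ 0 & 0 \end{bmatrix}
\neq
\begin{bmatrix} 0 & 0 \\ 0 & 1 \end{bmatrix} = N^* N .
\]
\[
T N^* = \begin{bmatrix} 1 & 0 \\ 0 & 0 \end{bmatrix},
\qquad
N^* T = \begin{bmatrix} 0 & 0 \\ 0 & 1 \end{bmatrix},
\qquad
T N^* \neq N^* T .
\]
```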
4.1.20 Theorem Let $V$ be any $n$-dimensional inner product space over the field $\mathbb{C}$. Let $T : V \to V$ be a linear transformation. Then $T$ is Hermitian if and only if for every $v \in V$, $\langle T(v), v \rangle$ is a real number.

Proof Let $T$ be Hermitian, that is, $T^* = T$. We have to show that for every $v \in V$, $\langle T(v), v \rangle$ is a real number.

To do so, let us take an arbitrary $v \in V$. We have to show that $\langle T(v), v \rangle$ is a real number, that is, $\overline{\langle T(v), v \rangle} = \langle T(v), v \rangle$.
\[
\text{LHS} = \overline{\langle T(v), v \rangle} = \langle v, T(v) \rangle = \langle v, T^*(v) \rangle = \langle T(v), v \rangle = \text{RHS}.
\]
Conversely, suppose that for every $v \in V$, $\langle T(v), v \rangle$ is a real number. We have to show that $T$ is Hermitian, that is, $T^* = T$, that is, $X = 0$, where $X \equiv T^* - T$. By 3.1.1, it suffices to show that for every $v \in V$, $\langle X(v), v \rangle = 0$.
\[
\begin{aligned}
\text{LHS} &= \langle X(v), v \rangle = \langle (T^* - T)(v), v \rangle = \langle T^*(v) - T(v), v \rangle \\
&= \langle T^*(v), v \rangle - \langle T(v), v \rangle = \langle v, T(v) \rangle - \langle T(v), v \rangle \\
&= \overline{\langle T(v), v \rangle} - \langle T(v), v \rangle = \langle T(v), v \rangle - \langle T(v), v \rangle = 0 = \text{RHS}.
\end{aligned}
\]
∎
Definition Let $V$ be any $n$-dimensional inner product space over the field $\mathbb{C}$. Let $T : V \to V$ be a linear transformation. If for every $v \in V$, $\langle T(v), v \rangle$ is a nonnegative real number, then we write $T \ge 0$, and we say that $T$ is nonnegative (definite).

By 4.1.20, if $T \ge 0$, then $T$ is Hermitian.

4.1.21 Theorem Let $V$ be any $n$-dimensional inner product space over the field $\mathbb{C}$. Let $T : V \to V$ be a linear transformation. Suppose that $T$ is nonnegative. Then all the eigenvalues of $T$ are nonnegative.

Proof To show this, let us take an arbitrary eigenvalue $\lambda$ of $T$. We have to show that $\lambda$ is a nonnegative real number.

Since $\lambda$ is an eigenvalue of $T$, there exists a nonzero $v \in V$ such that $T(v) = \lambda v$. Since $T$ is nonnegative, $\left(\lambda \langle v, v \rangle = \langle \lambda v, v \rangle =\right) \langle T(v), v \rangle$ is a nonnegative real number, and hence $\lambda \langle v, v \rangle$ is a nonnegative real number. Since $v$ is nonzero, $\langle v, v \rangle$ is a positive real number. Now, since $\lambda \langle v, v \rangle$ is a nonnegative real number, $\lambda$ is a nonnegative real number. ∎
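4.1.22 Theorem Let $V$ be any $n$-dimensional inner product space over the field $\mathbb{C}$. Let $T : V \to V$ be a Hermitian linear transformation. Suppose that all the eigenvalues of $T$ are nonnegative. Then $T$ is nonnegative.

Proof To show this, let us take an arbitrary nonzero $v \in V$. We have to show that $\langle T(v), v \rangle$ is a nonnegative real number.

Since $T$ is Hermitian, $T$ is normal, and hence by 4.1.16, there exists an orthonormal basis $\{e_1, \ldots, e_n\}$ of $V$ such that the matrix of $T$ relative to the basis $\{e_1, \ldots, e_n\}$ is a diagonal matrix, say $\operatorname{diag}(t_1, \ldots, t_n)$. It follows that
\[ T(e_1) = t_1e_1 + 0e_2 + \cdots + 0e_n\ (= t_1e_1), \]
and hence $T(e_1) = t_1e_1$. Since $\{e_1, \ldots, e_n\}$ is a basis, $e_1$ is nonzero. Now, since $T(e_1) = t_1e_1$, $t_1$ is an eigenvalue of $T$. Here by assumption, $t_1$ is a nonnegative real number. Similarly, $T(e_2) = t_2e_2$, and $t_2$ is a nonnegative real number, etc.

Since $v \in V$, and $\{e_1, \ldots, e_n\}$ is an orthonormal basis of $V$, we have
\[ v = \langle v, e_1 \rangle e_1 + \cdots + \langle v, e_n \rangle e_n, \]
and hence
\[
\begin{aligned}
\langle T(v), v \rangle
&= \left\langle T\left(\sum_{i=1}^{n} \langle v, e_i \rangle e_i\right), \sum_{i=1}^{n} \langle v, e_i \rangle e_i \right\rangle
= \left\langle \sum_{i=1}^{n} \langle v, e_i \rangle T(e_i), \sum_{i=1}^{n} \langle v, e_i \rangle e_i \right\rangle \\
&= \left\langle \sum_{i=1}^{n} \langle v, e_i \rangle t_i e_i, \sum_{i=1}^{n} \langle v, e_i \rangle e_i \right\rangle
= \sum_{i=1}^{n} \langle v, e_i \rangle t_i \left\langle e_i, \sum_{j=1}^{n} \langle v, e_j \rangle e_j \right\rangle \\
&= \sum_{i=1}^{n} \langle v, e_i \rangle t_i \left( \sum_{j=1}^{n} \overline{\langle v, e_j \rangle} \langle e_i, e_j \rangle \right)
= \sum_{i=1}^{n} \langle v, e_i \rangle t_i \left( \sum_{j=1}^{n} \overline{\langle v, e_j \rangle} \delta_{ij} \right) \\
&= \sum_{i=1}^{n} \langle v, e_i \rangle t_i \overline{\langle v, e_i \rangle}
= \sum_{i=1}^{n} |\langle v, e_i \rangle|^2 t_i.
\end{aligned}
\]
Thus $\langle T(v), v \rangle$ is a nonnegative real number. ∎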
Definition Let $V$ be any $n$-dimensional inner product space over the field $\mathbb{C}$. Let $T : V \to V$ be a linear transformation. If for every nonzero $v \in V$, $\langle T(v), v \rangle$ is a positive real number, then we write $T > 0$, and we say that $T$ is positive (definite).

By 4.1.20, if $T > 0$, then $T$ is Hermitian.

4.1.23 Theorem Let $V$ be any $n$-dimensional inner product space over the field $\mathbb{C}$. Let $T : V \to V$ be a Hermitian linear transformation. Suppose that $T$ is positive. Then all the eigenvalues of $T$ are positive.

Proof To show this, let us take an arbitrary eigenvalue $\lambda$ of $T$. We have to show that $\lambda$ is a positive real number.
Since $\lambda$ is an eigenvalue of $T$, there exists a nonzero $v \in V$ such that $T(v) = \lambda v$. Since $T$ is positive, $\left(\lambda \langle v, v \rangle = \langle \lambda v, v \rangle =\right) \langle T(v), v \rangle$ is a positive real number, and hence $\lambda \langle v, v \rangle$ is a positive real number. Since $v$ is nonzero, $\langle v, v \rangle$ is a positive real number. Now, since $\lambda \langle v, v \rangle$ is a positive real number, $\lambda$ is a positive real number. ∎
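4.1.24 Theorem Let $V$ be any $n$-dimensional inner product space over the field $\mathbb{C}$. Let $T : V \to V$ be a Hermitian linear transformation. Suppose that all the eigenvalues of $T$ are positive. Then $T$ is positive.

Proof To show this, let us take an arbitrary nonzero $v \in V$. We have to show that $\langle T(v), v \rangle$ is a positive real number.

Since $T$ is Hermitian, $T$ is normal, and hence by 4.1.16, there exists an orthonormal basis $\{e_1, \ldots, e_n\}$ of $V$ such that the matrix of $T$ relative to the basis $\{e_1, \ldots, e_n\}$ is a diagonal matrix, say $\operatorname{diag}(t_1, \ldots, t_n)$. It follows that
\[ T(e_1) = t_1e_1 + 0e_2 + \cdots + 0e_n\ (= t_1e_1), \]
and hence $T(e_1) = t_1e_1$. Since $\{e_1, \ldots, e_n\}$ is a basis, $e_1$ is nonzero. Now, since $T(e_1) = t_1e_1$, $t_1$ is an eigenvalue of $T$. Here by assumption, $t_1$ is a positive real number. Similarly, $T(e_2) = t_2e_2$, and $t_2$ is a positive real number, etc.

Since $v \in V$, and $\{e_1, \ldots, e_n\}$ is an orthonormal basis of $V$, we have
\[ v = \langle v, e_1 \rangle e_1 + \cdots + \langle v, e_n \rangle e_n, \]
and hence
\[
\begin{aligned}
\langle T(v), v \rangle
&= \left\langle T\left(\sum_{i=1}^{n} \langle v, e_i \rangle e_i\right), \sum_{i=1}^{n} \langle v, e_i \rangle e_i \right\rangle
= \left\langle \sum_{i=1}^{n} \langle v, e_i \rangle T(e_i), \sum_{i=1}^{n} \langle v, e_i \rangle e_i \right\rangle \\
&= \left\langle \sum_{i=1}^{n} \langle v, e_i \rangle t_i e_i, \sum_{i=1}^{n} \langle v, e_i \rangle e_i \right\rangle
= \sum_{i=1}^{n} \langle v, e_i \rangle t_i \left\langle e_i, \sum_{j=1}^{n} \langle v, e_j \rangle e_j \right\rangle \\
&= \sum_{i=1}^{n} \langle v, e_i \rangle t_i \left( \sum_{j=1}^{n} \overline{\langle v, e_j \rangle} \langle e_i, e_j \rangle \right)
= \sum_{i=1}^{n} \langle v, e_i \rangle t_i \left( \sum_{j=1}^{n} \overline{\langle v, e_j \rangle} \delta_{ij} \right) \\
&= \sum_{i=1}^{n} \langle v, e_i \rangle t_i \overline{\langle v, e_i \rangle}
= \sum_{i=1}^{n} |\langle v, e_i \rangle|^2 t_i.
\end{aligned}
\]
Since $v = \langle v, e_1 \rangle e_1 + \cdots + \langle v, e_n \rangle e_n$, and $v$ is nonzero, there exists $j \in \{1, \ldots, n\}$ such that $\langle v, e_j \rangle \neq 0$, and hence $|\langle v, e_j \rangle|^2 > 0$. Also $t_j > 0$, so
\[
\langle T(v), v \rangle = \sum_{i=1}^{n} |\langle v, e_i \rangle|^2 t_i \ge |\langle v, e_j \rangle|^2 t_j > 0.
\]
Thus $\langle T(v), v \rangle$ is a positive real number. ∎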
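The equivalence in 4.1.23 and 4.1.24 can be verified directly on a small Hermitian matrix. The matrix below is our own illustrative choice, on $V = \mathbb{C}^2$ with $\langle x, y \rangle = x_1\overline{y_1} + x_2\overline{y_2}$ and $v = (x, y)^T$.

```latex
\[
T = \begin{bmatrix} 2 & i \\ -i & 3 \end{bmatrix} = T^*,
\qquad
\det(T - \lambda I) = \lambda^2 - 5\lambda + 5,
\qquad
\lambda = \frac{5 \pm \sqrt{5}}{2} > 0 .
\]
\[
\langle T(v), v \rangle
= (2x + iy)\overline{x} + (-ix + 3y)\overline{y}
= 2|x|^2 + 3|y|^2 - 2\operatorname{Im}(\overline{x}\,y).
\]
\[
\langle T(v), v \rangle \ge 2|x|^2 + 3|y|^2 - 2|x||y|
\ge 2|x|^2 + 3|y|^2 - \left(|x|^2 + |y|^2\right)
= |x|^2 + 2|y|^2 > 0 \quad (v \neq 0).
\]
```

For example, $v = (1, i)^T$ gives $T(v) = (1, 2i)^T$ and $\langle T(v), v \rangle = 1 + 2 = 3$.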
4.1.25 Theorem Let $V$ be any $n$-dimensional inner product space over the field $\mathbb{C}$. Let $T : V \to V$ be a Hermitian linear transformation. Let $\{e_1, \ldots, e_n\}$ be an orthonormal basis of $V$. Let $A \equiv [a_{ij}]$ be the matrix of $T$ relative to $\{e_1, \ldots, e_n\}$. Then $A$ is a Hermitian matrix.

Proof We have to show that $A^* = A$. By 3.1.11, it suffices to show that $\overline{a_{ji}} = a_{ij}$.

Since $A \equiv [a_{ij}]$ is the matrix of $T$ relative to $\{e_1, \ldots, e_n\}$, we have
\[ T(e_1) = a_{11}e_1 + a_{21}e_2 + \cdots + a_{n1}e_n, \]
and hence
\[
\langle T(e_1), e_i \rangle = \langle a_{11}e_1 + a_{21}e_2 + \cdots + a_{n1}e_n, e_i \rangle
= \left\langle \sum_{j=1}^{n} a_{j1}e_j, e_i \right\rangle
= \sum_{j=1}^{n} a_{j1} \langle e_j, e_i \rangle
= \sum_{j=1}^{n} a_{j1}\delta_{ji} = a_{i1},
\]
etc. Thus $a_{ij} = \langle T(e_j), e_i \rangle$. Hence $[\langle T(e_j), e_i \rangle]$ is the matrix of $T$ relative to $\{e_1, \ldots, e_n\}$. Similarly, $[\langle T^*(e_j), e_i \rangle]$ is the matrix of $T^*$ relative to $\{e_1, \ldots, e_n\}$.

Since $T$ is Hermitian, we have $T^* = T$. Since $A = [a_{ij}]$, we have $A^T = [b_{ij}]$, where $b_{ij} \equiv a_{ji} \left(= \langle T(e_i), e_j \rangle\right)$. It follows that $A^* = \overline{\left(A^T\right)} = \overline{[b_{ij}]} = [\overline{b_{ij}}]$. Also,
\[
\overline{a_{ji}} = \overline{b_{ij}} = \overline{\langle T(e_i), e_j \rangle} = \langle e_j, T(e_i) \rangle = \langle T^*(e_j), e_i \rangle = \langle T(e_j), e_i \rangle = a_{ij},
\]
so $\overline{a_{ji}} = a_{ij}$. ∎
4.1.26 Note Let $V$ be any $n$-dimensional inner product space over the field $\mathbb{C}$. Let $T : V \to V$ be a linear transformation. Let $T$ be nonnegative.
Let $\{e_1,\dots,e_n\}$ be an orthonormal basis of $V$. Let $A \equiv [a_{ij}]$ be the matrix of $T$ relative to $\{e_1,\dots,e_n\}$.
It follows that
\[ T(e_1) = a_{11}e_1 + a_{21}e_2 + \cdots + a_{n1}e_n, \]
and hence
\[ \langle T(e_1),e_i\rangle = \langle a_{11}e_1 + a_{21}e_2 + \cdots + a_{n1}e_n, e_i\rangle = a_{i1}, \]
etc. Thus $a_{ij} = \langle T(e_j),e_i\rangle$. Hence $[\langle T(e_j),e_i\rangle]$ is the matrix of $T$ relative to $\{e_1,\dots,e_n\}$.
Since $T \geq 0$, by 4.1.20, $T$ is a Hermitian linear transformation, and hence by 4.1.25, $A$ is a Hermitian matrix. Since $T$ is a Hermitian linear transformation, we have $T^* = T$.
Suppose that $t_1,\dots,t_n$ are the eigenvalues of the linear transformation $T$.
Since $T \geq 0$, by 4.1.21, all the eigenvalues of the linear transformation $T$ are nonnegative, that is, each $t_i$ is a nonnegative real number. Hence each $\sqrt{t_i}$ is a nonnegative real number. Since $A$ is a Hermitian matrix, $A \equiv [a_{ij}]$ is a normal matrix. Now, by 3.3.24, there exists a unitary matrix $U \equiv [u_{ij}]$ such that
1. $U^*AU$ is a diagonal matrix,
2. the eigenvalues of the matrix $A$ (that is, the eigenvalues of the linear transformation $T$) are the diagonal entries of $U^*AU$.
Thus
\[ U^*AU = \operatorname{diag}(t_1,\dots,t_n). \]
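Put
\[ u_j \equiv \sum_{i=1}^n u_{ij}e_i \quad (j = 1,\dots,n). \]
Clearly, $\{u_1,\dots,u_n\}$ is an orthonormal basis of $V$.
Proof It suffices to show that $\langle u_j,u_k\rangle = \delta_{jk}$. Since
\[
\begin{aligned}
\langle u_j,u_k\rangle
&= \left\langle \sum_{i=1}^n u_{ij}e_i, \sum_{l=1}^n u_{lk}e_l \right\rangle
 = \sum_{i=1}^n u_{ij}\left\langle e_i, \sum_{l=1}^n u_{lk}e_l \right\rangle
 = \sum_{i=1}^n u_{ij}\left( \sum_{l=1}^n \overline{u_{lk}}\,\langle e_i,e_l\rangle \right) \\
&= \sum_{i=1}^n u_{ij}\left( \sum_{l=1}^n \overline{u_{lk}}\,\delta_{il} \right)
 = \sum_{i=1}^n u_{ij}\overline{u_{ik}}
 = \sum_{i=1}^n \overline{u_{ik}}\,u_{ij},
\end{aligned}
\]
we have $\langle u_j,u_k\rangle = \sum_{i=1}^n \overline{u_{ik}}\,u_{ij}$. Since $[u_{ij}]$ is a unitary matrix, we have
\[
[\langle u_j,u_i\rangle] = \left[ \sum_{k=1}^n \overline{u_{ki}}\,u_{kj} \right] = [\overline{u_{ij}}]^T[u_{ij}] = [u_{ij}]^*[u_{ij}] = [\delta_{ij}],
\]
and hence $\langle u_j,u_i\rangle = \delta_{ij}$. This shows that $\langle u_i,u_j\rangle = \delta_{ij}$. ∎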
Since $U$ is a unitary matrix, $U$ is invertible, and $U^{-1} = U^*$.
Let us define a linear transformation $W : V \to V$ as follows: for every $i \in \{1,\dots,n\}$, $W(e_i) \equiv u_i \ \left(= \sum_{k=1}^n u_{ki}e_k\right)$. It follows that the matrix of $W$ relative to $\{e_1,\dots,e_n\}$ is $[u_{ij}]$.
Now, since $A \equiv [a_{ij}]$ is the matrix of $T$ relative to $\{e_1,\dots,e_n\}$, by 3.1.35(a), the matrix of $T$ relative to $\{u_1,\dots,u_n\}$ is $[u_{ij}]^{-1}A[u_{ij}] = U^{-1}AU = U^*AU \ (= \operatorname{diag}(t_1,\dots,t_n))$, and hence the matrix of $T$ relative to $\{u_1,\dots,u_n\}$ is $\operatorname{diag}(t_1,\dots,t_n)$.
Let us define a linear transformation $S : V \to V$ as follows:
\[ S(u_i) \equiv \sqrt{t_i}\,u_i \quad (i = 1,\dots,n). \]
This shows that $\sqrt{t_1},\dots,\sqrt{t_n}$ are the eigenvalues of the linear transformation $S$. Also, the eigenvalues of the linear transformation $S$ are nonnegative real numbers. Further, the matrix of the linear transformation $S$ relative to the basis $\{u_1,\dots,u_n\}$ is $\operatorname{diag}(\sqrt{t_1},\dots,\sqrt{t_n})$. Next, by 3.1.11, the matrix of the linear transformation $S^*$ relative to the basis $\{u_1,\dots,u_n\}$ is
\[ \left(\operatorname{diag}(\sqrt{t_1},\dots,\sqrt{t_n})\right)^* = \operatorname{diag}(\sqrt{t_1},\dots,\sqrt{t_n}). \]
It follows, by 3.1.33, that the matrix of the linear transformation $S^*S$ relative to the basis $\{u_1,\dots,u_n\}$ is
\[
\left(\operatorname{diag}(\sqrt{t_1},\dots,\sqrt{t_n})\right)^*\operatorname{diag}(\sqrt{t_1},\dots,\sqrt{t_n})
= \operatorname{diag}(\sqrt{t_1},\dots,\sqrt{t_n})\operatorname{diag}(\sqrt{t_1},\dots,\sqrt{t_n})
= \operatorname{diag}(t_1,\dots,t_n).
\]
Thus the matrix of the linear transformation $S^*S$ relative to the basis $\{u_1,\dots,u_n\}$ is $\operatorname{diag}(t_1,\dots,t_n)$. Now, since the matrix of $T$ relative to $\{u_1,\dots,u_n\}$ is $\operatorname{diag}(t_1,\dots,t_n)$, we have $S^*S = T$.
Clearly, $S$ is a Hermitian linear transformation, that is, $S^* = S$.
Proof By 4.1.20, it suffices to show that for every $v \in V$, $\langle S(v),v\rangle$ is a real number.
To this end, let us take an arbitrary $v \equiv \alpha_1u_1 + \cdots + \alpha_nu_n$ in $V$. Since
\[
\begin{aligned}
\langle S(v),v\rangle
&= \langle S(\alpha_1u_1 + \cdots + \alpha_nu_n), \alpha_1u_1 + \cdots + \alpha_nu_n\rangle \\
&= \langle \alpha_1S(u_1) + \cdots + \alpha_nS(u_n), \alpha_1u_1 + \cdots + \alpha_nu_n\rangle \\
&= \langle \alpha_1\sqrt{t_1}\,u_1 + \cdots + \alpha_n\sqrt{t_n}\,u_n, \alpha_1u_1 + \cdots + \alpha_nu_n\rangle \\
&= \alpha_1\sqrt{t_1}\,\overline{\alpha_1} + \cdots + \alpha_n\sqrt{t_n}\,\overline{\alpha_n}
 = |\alpha_1|^2\sqrt{t_1} + \cdots + |\alpha_n|^2\sqrt{t_n},
\end{aligned}
\]
$\langle S(v),v\rangle$ is a real number. ∎
We have shown that $S$ is a Hermitian linear transformation. Next, since the eigenvalues of the linear transformation $S$ are nonnegative real numbers, by 4.1.22, $S$ is nonnegative.
4.1.27 Conclusion Let $V$ be any $n$-dimensional inner product space over the field $\mathbb{C}$. Let $T : V \to V$ be a linear transformation. Let $T$ be nonnegative. Then there exists a linear transformation $S : V \to V$ such that
1. $S \geq 0$,
2. $S^*S = T$,
3. $S^2 = T$.
Definition Let $A \equiv [a_{ij}]$ be an $n$-square complex matrix. Observe that for every $x \in \mathbb{C}^n$, $x^*Ax$ is a $1 \times 1$ matrix. By $x^*Ax > 0$, we mean that the entry of the $1 \times 1$ matrix $x^*Ax$ is a positive real number.
If for every nonzero $x \in \mathbb{C}^n$, $x^*Ax > 0$, then we say that $A$ is a positive definite matrix, and we write $A > 0$.
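To make the definition concrete, the quadratic form $x^*Ax$ can be evaluated directly. This is a minimal sketch, assuming the sample Hermitian matrix $A$ and the test vectors below (chosen here for illustration, not taken from the text); checking finitely many vectors illustrates the definition but does not prove positive definiteness.

```python
def quad_form(A, x):
    """Return the single entry of x* A x for a square matrix A (list of rows)."""
    n = len(x)
    return sum(x[i].conjugate() * A[i][j] * x[j]
               for i in range(n) for j in range(n))

# Assumed example: a Hermitian matrix with eigenvalues 1 and 3.
A = [[2, 1j], [-1j, 2]]
samples = [[1, 0], [0, 1], [1, 1j], [1j, 1], [2 - 1j, 3]]
values = [quad_form(A, [complex(c) for c in x]) for x in samples]
```

Every entry of `values` has zero imaginary part and positive real part, consistent with $A > 0$ for this particular matrix.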
4.1.28 Problem Let $V$ be any $n$-dimensional inner product space over the field $\mathbb{C}$. Let $T : V \to V$ be a nonnegative linear transformation. Let $\{v_1,\dots,v_n\}$ be an orthonormal basis of $V$. Let $A \equiv [a_{ij}]$ be the matrix of $T$ relative to the basis $\{v_1,\dots,v_n\}$. Then $A$ is a nonnegative definite matrix.
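Proof To show this, let us take an arbitrary $x \equiv [\alpha_1,\dots,\alpha_n]^T \in \mathbb{C}^n$. We have to show that $x^*Ax \geq 0$.
Since $[a_{ij}]$ is the matrix of $T$ relative to the basis $\{v_1,\dots,v_n\}$, it follows that $T(v_j) = \sum_{i=1}^n a_{ij}v_i$. Now,
\[
\begin{aligned}
x^*Ax
&= \left([\alpha_1,\dots,\alpha_n]^T\right)^*A[\alpha_1,\dots,\alpha_n]^T
 = \overline{\left(\left([\alpha_1,\dots,\alpha_n]^T\right)^T\right)}\,A[\alpha_1,\dots,\alpha_n]^T \\
&= [\overline{\alpha_1},\dots,\overline{\alpha_n}]A[\alpha_1,\dots,\alpha_n]^T
 = \left([\overline{\alpha_1},\dots,\overline{\alpha_n}]A\right)[\alpha_1,\dots,\alpha_n]^T \\
&= \left[\sum_{i=1}^n \overline{\alpha_i}a_{i1},\dots,\sum_{i=1}^n \overline{\alpha_i}a_{in}\right][\alpha_1,\dots,\alpha_n]^T \\
&= \left(\sum_{i=1}^n \overline{\alpha_i}a_{i1}\right)\alpha_1 + \cdots + \left(\sum_{i=1}^n \overline{\alpha_i}a_{in}\right)\alpha_n \\
&= \sum_{j=1}^n\left(\sum_{i=1}^n \overline{\alpha_i}a_{ij}\right)\alpha_j
 = \sum_{j=1}^n\left(\sum_{i=1}^n \overline{\alpha_i}a_{ij}\alpha_j\right),
\end{aligned}
\]
so
\[ x^*Ax = \sum_{j=1}^n\left(\sum_{i=1}^n \overline{\alpha_i}a_{ij}\alpha_j\right). \]
We have to show that $\sum_{j=1}^n\left(\sum_{i=1}^n \overline{\alpha_i}a_{ij}\alpha_j\right) \geq 0$.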
Since $T : V \to V$ is a nonnegative linear transformation, we have
\[ \langle T(\alpha_1v_1 + \cdots + \alpha_nv_n), (\alpha_1v_1 + \cdots + \alpha_nv_n)\rangle \geq 0. \]
It suffices to show that
\[ \langle T(\alpha_1v_1 + \cdots + \alpha_nv_n), (\alpha_1v_1 + \cdots + \alpha_nv_n)\rangle = \sum_{j=1}^n\left(\sum_{i=1}^n \overline{\alpha_i}a_{ij}\alpha_j\right). \]
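\[
\begin{aligned}
\text{LHS}
&= \langle T(\alpha_1v_1 + \cdots + \alpha_nv_n), (\alpha_1v_1 + \cdots + \alpha_nv_n)\rangle \\
&= \langle \alpha_1T(v_1) + \cdots + \alpha_nT(v_n), \alpha_1v_1 + \cdots + \alpha_nv_n\rangle
 = \left\langle \sum_{i=1}^n \alpha_iT(v_i), \sum_{j=1}^n \alpha_jv_j \right\rangle \\
&= \sum_{i=1}^n \alpha_i\left\langle T(v_i), \sum_{j=1}^n \alpha_jv_j \right\rangle
 = \sum_{i=1}^n \alpha_i\left\langle \sum_{k=1}^n a_{ki}v_k, \sum_{j=1}^n \alpha_jv_j \right\rangle \\
&= \sum_{i=1}^n \alpha_i\left(\sum_{k=1}^n a_{ki}\left\langle v_k, \sum_{j=1}^n \alpha_jv_j \right\rangle\right)
 = \sum_{i=1}^n\left(\sum_{k=1}^n \alpha_ia_{ki}\left\langle v_k, \sum_{j=1}^n \alpha_jv_j \right\rangle\right) \\
&= \sum_{i=1}^n\left(\sum_{k=1}^n \alpha_ia_{ki}\left(\sum_{j=1}^n \overline{\alpha_j}\,\langle v_k,v_j\rangle\right)\right)
 = \sum_{i=1}^n\left(\sum_{k=1}^n \alpha_ia_{ki}\left(\sum_{j=1}^n \overline{\alpha_j}\,\delta_{kj}\right)\right) \\
&= \sum_{i=1}^n\left(\sum_{k=1}^n \alpha_ia_{ki}\overline{\alpha_k}\right)
 = \sum_{k=1}^n\left(\sum_{i=1}^n \alpha_ka_{ik}\overline{\alpha_i}\right) \\
&= \sum_{j=1}^n\left(\sum_{i=1}^n \alpha_ja_{ij}\overline{\alpha_i}\right)
 = \sum_{j=1}^n\left(\sum_{i=1}^n \overline{\alpha_i}a_{ij}\alpha_j\right) = \text{RHS}.
\end{aligned}
\]
∎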
4.1.29 Problem Let $A \equiv [a_{ij}]$ be an $n$-square complex matrix. Suppose that $A$ is a nonnegative definite matrix. Let $T : x \mapsto Ax$ be the linear transformation from the inner product space $\mathbb{C}^n$ to $\mathbb{C}^n$. Then $T$ is a nonnegative linear transformation.
Proof To show this, let us take an arbitrary $x \equiv [x_1,\dots,x_n]^T \in \mathbb{C}^n$. We have to show that $\langle T(x),x\rangle \geq 0$, that is, $\langle Ax,x\rangle \geq 0$.
Since $A$ is a nonnegative definite matrix, and $x \in \mathbb{C}^n$, we have $x^*Ax \geq 0$. It suffices to show that $\langle Ax,x\rangle = x^*Ax$. By the definition of the inner product of $\mathbb{C}^n$, $\langle Ax,x\rangle = x^*(Ax) = x^*Ax$, so $\langle Ax,x\rangle = x^*Ax$. ∎
4.1.30 Problem Let $A \equiv [a_{ij}]$ be an $n$-square complex matrix. Suppose that $A$ is a nonnegative definite matrix. Let $X \equiv [x_{ij}]_{n \times m} = [x_1,\dots,x_m]$ be any complex matrix of size $n \times m$. (It is clear that $X^*AX$ is an $m \times m$ matrix.) Then $X^*AX \geq 0$.
Proof To show this, let us take an arbitrary $x \in \mathbb{C}^m$. We have to show that $x^*(X^*AX)x \geq 0$. Since
\[ x^*(X^*AX)x = (x^*X^*)A(Xx) = (Xx)^*A(Xx), \]
we have to show that $y^*Ay \geq 0$, where $y \equiv Xx$. Since $X$ is a complex matrix of size $n \times m$, and $x \in \mathbb{C}^m$, we have $(y =)\,Xx \in \mathbb{C}^n$, and hence $y \in \mathbb{C}^n$. Now, since $A$ is a nonnegative definite matrix of size $n \times n$, we have $y^*Ay \geq 0$. ∎
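The key step of this proof, $x^*(X^*AX)x = (Xx)^*A(Xx)$, can be sanity-checked on small matrices. The sketch below assumes a $2 \times 2$ nonnegative definite $A$ (all entries 1, eigenvalues 0 and 2), a $2 \times 3$ matrix $X$, and a vector $x$, all chosen here purely for illustration.

```python
def matmul(P, Q):
    """Product of two matrices given as lists of rows."""
    return [[sum(P[i][k] * Q[k][j] for k in range(len(Q)))
             for j in range(len(Q[0]))] for i in range(len(P))]

def ctranspose(P):
    """Conjugate transpose P* of a matrix given as a list of rows."""
    return [[complex(P[i][j]).conjugate() for i in range(len(P))]
            for j in range(len(P[0]))]

def quad_form(M, x):
    """Entry of the 1x1 matrix x* M x."""
    col = [[complex(c)] for c in x]
    return matmul(matmul(ctranspose(col), M), col)[0][0]

A = [[1, 1], [1, 1]]                       # nonnegative definite (assumed example)
X = [[1, 1j, 2], [0, 1, -1j]]              # size 2 x 3
M = matmul(matmul(ctranspose(X), A), X)    # X* A X, size 3 x 3
x = [1, -1j, 2]                            # a vector in C^3
y = [row[0] for row in matmul(X, [[c] for c in x])]   # y = X x in C^2

left = quad_form(M, x)                     # x* (X* A X) x
right = quad_form(A, y)                    # y* A y
```

Both `left` and `right` come out equal and nonnegative, matching the reduction from $X^*AX$ to $A$ used above.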
4.1.31 Problem Let $A \equiv [a_{ij}]$ be an $n$-square complex matrix. Suppose that for every $n \times m$ complex matrix $X \equiv [x_{ij}]_{n \times m} = [x_1,\dots,x_m]$, $X^*AX \geq 0$. Then $A \geq 0$.
Proof To show this, let us take an arbitrary $x \in \mathbb{C}^n$. We have to show that $x^*Ax \geq 0$. Let us take $x_1 = x, x_2 = 0, \dots, x_m = 0$. By assumption, $X^*AX \geq 0$. Observe that
\[
\begin{aligned}
X^*AX
&= [x_1,\dots,x_m]^*A[x_1,\dots,x_m]
 = [\overline{x_1},\dots,\overline{x_m}]^TA[x_1,\dots,x_m] \\
&= [\overline{x_1},\dots,\overline{x_m}]^T\left(A[x_1,\dots,x_m]\right)
 = [\overline{x_1},\dots,\overline{x_m}]^T[Ax_1,\dots,Ax_m] \\
&= \begin{bmatrix} (x_1)^*Ax_1 & (x_1)^*Ax_2 & \cdots \\ (x_2)^*Ax_1 & (x_2)^*Ax_2 & \cdots \\ \vdots & \vdots & \ddots \end{bmatrix}
 = \begin{bmatrix} (x_1)^*Ax_1 & (x_1)^*A0 & \cdots \\ 0^*Ax_1 & 0^*A0 & \cdots \\ \vdots & \vdots & \ddots \end{bmatrix}
 = \begin{bmatrix} (x_1)^*Ax_1 & 0 & \cdots \\ 0 & 0 & \cdots \\ \vdots & \vdots & \ddots \end{bmatrix},
\end{aligned}
\]
so
\[ X^*AX = \begin{bmatrix} (x_1)^*Ax_1 & 0 & \cdots \\ 0 & 0 & \cdots \\ \vdots & \vdots & \ddots \end{bmatrix}. \]
Now, since $X^*AX \geq 0$, and $[1,0,\dots,0]^T \in \mathbb{C}^m$, we have
\[
\begin{aligned}
0 &\leq \left([1,0,\dots,0]^T\right)^*(X^*AX)\left([1,0,\dots,0]^T\right)
 = \begin{bmatrix} 1 & 0 & \cdots & 0 \end{bmatrix}
   \begin{bmatrix} (x_1)^*Ax_1 & 0 & \cdots \\ 0 & 0 & \cdots \\ \vdots & \vdots & \ddots \end{bmatrix}
   \begin{bmatrix} 1 \\ 0 \\ \vdots \\ 0 \end{bmatrix} \\
&= \begin{bmatrix} 1 & 0 & \cdots & 0 \end{bmatrix}
   \begin{bmatrix} (x_1)^*Ax_1 \\ 0 \\ \vdots \\ 0 \end{bmatrix}
 = (x_1)^*Ax_1 = x^*Ax,
\end{aligned}
\]
and hence, $0 \leq x^*Ax$. ∎
4.1.32 Note Let $A \equiv [a_{ij}]$ be an $n$-square complex matrix. Suppose that $A$ is a nonnegative definite matrix.
By 3.3.28, $A$ is a Hermitian matrix, and hence $A \equiv [a_{ij}]$ is a normal matrix. Now, by 3.3.24, there exists a unitary matrix $U \equiv [u_{ij}]$ such that
1. $U^*AU$ is a diagonal matrix, say $\operatorname{diag}(\lambda_1,\dots,\lambda_n)$,
2. the eigenvalues of the matrix $A$ are $\lambda_1,\dots,\lambda_n$.
Since $U$ is unitary, we have $U^*U = UU^* = I$, and hence $U^{-1} = U^*$. Now, since $U^*AU = \operatorname{diag}(\lambda_1,\dots,\lambda_n)$, we have $A = U(\operatorname{diag}(\lambda_1,\dots,\lambda_n))U^*$.
Let $T : x \mapsto Ax$ be the linear transformation from the inner product space $\mathbb{C}^n$ to $\mathbb{C}^n$. By 4.1.29, $T$ is a nonnegative linear transformation, and by 4.1.21, all the eigenvalues of the linear transformation $T$ are nonnegative. Here $\{e_1,\dots,e_n\}$ is an orthonormal basis of $\mathbb{C}^n$, where $e_1 \equiv [1,0,\dots,0]^T$, $e_2 \equiv [0,1,0,\dots,0]^T$, etc. Since
\[ T(e_1) = Ae_1 = [a_{ij}][1,0,\dots,0]^T = [a_{11},a_{21},\dots,a_{n1}]^T = a_{11}e_1 + a_{21}e_2 + \cdots + a_{n1}e_n, \]
we have
\[ T(e_1) = a_{11}e_1 + a_{21}e_2 + \cdots + a_{n1}e_n. \]
Similarly,
\[ T(e_2) = a_{12}e_1 + a_{22}e_2 + \cdots + a_{n2}e_n, \]
etc. Thus $T(e_j) = \sum_{i=1}^n a_{ij}e_i$. It follows that the matrix of $T$ relative to $\{e_1,\dots,e_n\}$ is $[a_{ij}] \ (= A)$. Now, since the eigenvalues of the matrix $A$ are $\lambda_1,\dots,\lambda_n$, the eigenvalues of the linear transformation $T$ are also $\lambda_1,\dots,\lambda_n$. Next, since the eigenvalues of the linear transformation $T$ are nonnegative real numbers, each $\lambda_i$ is a nonnegative real number.
Since
\[ A = U(\operatorname{diag}(\lambda_1,\dots,\lambda_n))U^* = U(\operatorname{diag}(\lambda_1,\dots,\lambda_n))U^{-1}, \]
we have
\[
\begin{aligned}
\det(A) &= \det\left(U(\operatorname{diag}(\lambda_1,\dots,\lambda_n))U^{-1}\right)
 = \det(U)\cdot\det(\operatorname{diag}(\lambda_1,\dots,\lambda_n))\cdot\det(U^{-1}) \\
&= \det(U)\cdot\det(\operatorname{diag}(\lambda_1,\dots,\lambda_n))\cdot\frac{1}{\det(U)}
 = \det(\operatorname{diag}(\lambda_1,\dots,\lambda_n)) = \lambda_1\lambda_2\cdots\lambda_n,
\end{aligned}
\]
and hence
\[ \det(A) = \lambda_1\lambda_2\cdots\lambda_n. \]
Since each $\lambda_i$ is a nonnegative real number, $\det(A)$ is a nonnegative real number.
4.1.33 Conclusion Let $A \equiv [a_{ij}]$ be an $n$-square complex matrix. Suppose that $A$ is a nonnegative definite matrix. Then there exists a unitary matrix $U$ such that
1. $A = U(\operatorname{diag}(\lambda_1,\dots,\lambda_n))U^*$,
2. $\lambda_1,\dots,\lambda_n$ are the eigenvalues of the matrix $A$,
3. each $\lambda_i$ is a nonnegative real number,
4. $\det(A)$ is a nonnegative real number.
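A small numerical instance may help fix the four statements above. This is a minimal sketch, assuming the Hermitian matrix $A = \begin{bmatrix} 2 & i \\ -i & 2 \end{bmatrix}$ with eigenvalues 3 and 1 and a hand-computed unitary $U$ whose columns are unit eigenvectors; these choices are illustrative and do not come from the text.

```python
import math

def matmul(P, Q):
    """Product of two matrices given as lists of rows."""
    return [[sum(P[i][k] * Q[k][j] for k in range(len(Q)))
             for j in range(len(Q[0]))] for i in range(len(P))]

def ctranspose(P):
    """Conjugate transpose P* of a matrix given as a list of rows."""
    return [[complex(P[i][j]).conjugate() for i in range(len(P))]
            for j in range(len(P[0]))]

s = 1 / math.sqrt(2)
lam = [3.0, 1.0]                              # assumed eigenvalues of A
U = [[1j * s, -1j * s], [s, s]]               # columns: unit eigenvectors of A
D = [[lam[0], 0], [0, lam[1]]]

UstarU = matmul(ctranspose(U), U)             # should be the identity
A = matmul(matmul(U, D), ctranspose(U))       # U diag(3, 1) U*
det_A = A[0][0] * A[1][1] - A[0][1] * A[1][0] # should equal 3 * 1
```

The reconstructed `A` matches the assumed matrix up to rounding, and `det_A` equals the product of the eigenvalues.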
4.1.34 Note Let $A \equiv [a_{ij}]$ be an $n$-square complex matrix. Suppose that $A$ is a nonnegative definite matrix.
By 4.1.33, there exists a unitary matrix $U$ such that
1. $A = U(\operatorname{diag}(\lambda_1,\dots,\lambda_n))U^*$,
2. $\lambda_1,\dots,\lambda_n$ are the eigenvalues of the matrix $A$,
3. each $\lambda_i$ is a nonnegative real number,
4. $\det(A)$ is a nonnegative real number.
Since each $\lambda_i$ is a nonnegative real number, each $\sqrt{\lambda_i}$ is a nonnegative real number. Observe that
\[
\begin{aligned}
\left(U\left(\operatorname{diag}(\sqrt{\lambda_1},\dots,\sqrt{\lambda_n})\right)U^*\right)^2
&= \left(U\left(\operatorname{diag}(\sqrt{\lambda_1},\dots,\sqrt{\lambda_n})\right)U^*\right)\left(U\left(\operatorname{diag}(\sqrt{\lambda_1},\dots,\sqrt{\lambda_n})\right)U^*\right) \\
&= U\left(\operatorname{diag}(\sqrt{\lambda_1},\dots,\sqrt{\lambda_n})\right)(U^*U)\left(\operatorname{diag}(\sqrt{\lambda_1},\dots,\sqrt{\lambda_n})\right)U^* \\
&= U\left(\operatorname{diag}(\sqrt{\lambda_1},\dots,\sqrt{\lambda_n})\right)I\left(\operatorname{diag}(\sqrt{\lambda_1},\dots,\sqrt{\lambda_n})\right)U^* \\
&= U\left(\operatorname{diag}(\sqrt{\lambda_1},\dots,\sqrt{\lambda_n})\operatorname{diag}(\sqrt{\lambda_1},\dots,\sqrt{\lambda_n})\right)U^* \\
&= U(\operatorname{diag}(\lambda_1,\dots,\lambda_n))U^* = A,
\end{aligned}
\]
so
\[ \left(U\left(\operatorname{diag}(\sqrt{\lambda_1},\dots,\sqrt{\lambda_n})\right)U^*\right)^2 = A. \]
Thus
\[ B^2 = A, \]
where $B \equiv U\left(\operatorname{diag}(\sqrt{\lambda_1},\dots,\sqrt{\lambda_n})\right)U^*$. Since for every $x \equiv [x_1,\dots,x_n]^T \in \mathbb{C}^n$,
\[
\begin{aligned}
x^*\left(\operatorname{diag}(\sqrt{\lambda_1},\dots,\sqrt{\lambda_n})\right)x
&= \left([x_1,\dots,x_n]^T\right)^*\left(\operatorname{diag}(\sqrt{\lambda_1},\dots,\sqrt{\lambda_n})\right)[x_1,\dots,x_n]^T \\
&= [\overline{x_1},\dots,\overline{x_n}]\left(\operatorname{diag}(\sqrt{\lambda_1},\dots,\sqrt{\lambda_n})\right)[x_1,\dots,x_n]^T \\
&= \left([\overline{x_1},\dots,\overline{x_n}]\operatorname{diag}(\sqrt{\lambda_1},\dots,\sqrt{\lambda_n})\right)[x_1,\dots,x_n]^T \\
&= [\overline{x_1}\sqrt{\lambda_1},\dots,\overline{x_n}\sqrt{\lambda_n}][x_1,\dots,x_n]^T
 = \overline{x_1}\sqrt{\lambda_1}\,x_1 + \cdots + \overline{x_n}\sqrt{\lambda_n}\,x_n \\
&= \sqrt{\lambda_1}\,|x_1|^2 + \cdots + \sqrt{\lambda_n}\,|x_n|^2 \geq 0,
\end{aligned}
\]
we have
\[ x^*\left(\operatorname{diag}(\sqrt{\lambda_1},\dots,\sqrt{\lambda_n})\right)x \geq 0. \]
This shows that $\operatorname{diag}(\sqrt{\lambda_1},\dots,\sqrt{\lambda_n})$ is a nonnegative definite matrix, and hence by 4.1.30, $(U^*)^*\left(\operatorname{diag}(\sqrt{\lambda_1},\dots,\sqrt{\lambda_n})\right)(U^*) \ \left(= U\left(\operatorname{diag}(\sqrt{\lambda_1},\dots,\sqrt{\lambda_n})\right)U^* = B\right)$ is a nonnegative definite matrix. Thus $B$ is a nonnegative definite matrix.
Since
\[
\begin{aligned}
\lambda I - B
&= \lambda I - U\left(\operatorname{diag}(\sqrt{\lambda_1},\dots,\sqrt{\lambda_n})\right)U^*
 = \lambda UU^* - U\left(\operatorname{diag}(\sqrt{\lambda_1},\dots,\sqrt{\lambda_n})\right)U^* \\
&= U(\lambda I)U^* - U\left(\operatorname{diag}(\sqrt{\lambda_1},\dots,\sqrt{\lambda_n})\right)U^*
 = U\left(\lambda I - \operatorname{diag}(\sqrt{\lambda_1},\dots,\sqrt{\lambda_n})\right)U^* \\
&= U\left(\operatorname{diag}(\lambda,\dots,\lambda) - \operatorname{diag}(\sqrt{\lambda_1},\dots,\sqrt{\lambda_n})\right)U^*
 = U\left(\operatorname{diag}(\lambda - \sqrt{\lambda_1},\dots,\lambda - \sqrt{\lambda_n})\right)U^*,
\end{aligned}
\]
we have
\[ \lambda I - B = U\left(\operatorname{diag}(\lambda - \sqrt{\lambda_1},\dots,\lambda - \sqrt{\lambda_n})\right)U^*, \]
and hence
\[
\begin{aligned}
\det(\lambda I - B)
&= \det\left(U\left(\operatorname{diag}(\lambda - \sqrt{\lambda_1},\dots,\lambda - \sqrt{\lambda_n})\right)U^*\right) \\
&= \det(U)\cdot\det\left(\operatorname{diag}(\lambda - \sqrt{\lambda_1},\dots,\lambda - \sqrt{\lambda_n})\right)\cdot\det(U^*) \\
&= \det(U)\cdot\det\left(\operatorname{diag}(\lambda - \sqrt{\lambda_1},\dots,\lambda - \sqrt{\lambda_n})\right)\cdot\det(U^{-1}) \\
&= \det(U)\cdot\det\left(\operatorname{diag}(\lambda - \sqrt{\lambda_1},\dots,\lambda - \sqrt{\lambda_n})\right)\cdot\frac{1}{\det(U)} \\
&= \det\left(\operatorname{diag}(\lambda - \sqrt{\lambda_1},\dots,\lambda - \sqrt{\lambda_n})\right)
 = (\lambda - \sqrt{\lambda_1})(\lambda - \sqrt{\lambda_2})\cdots(\lambda - \sqrt{\lambda_n}).
\end{aligned}
\]
Thus
\[ \det(\lambda I - B) = (\lambda - \sqrt{\lambda_1})(\lambda - \sqrt{\lambda_2})\cdots(\lambda - \sqrt{\lambda_n}). \]
Hence the characteristic polynomial of the matrix $B$ is $(\lambda - \sqrt{\lambda_1})(\lambda - \sqrt{\lambda_2})\cdots(\lambda - \sqrt{\lambda_n})$. Its roots are $\sqrt{\lambda_1},\dots,\sqrt{\lambda_n}$.
So the eigenvalues of the matrix $B$ are $\sqrt{\lambda_1},\dots,\sqrt{\lambda_n}$. Also, $\lambda_1,\dots,\lambda_n$ are the eigenvalues of the matrix $A$. Thus the eigenvalues of the matrix $B$ are the square roots of the eigenvalues of the matrix $A$.
4.1.35 Conclusion Let $A \equiv [a_{ij}]$ be an $n$-square complex matrix. Suppose that $A$ is a nonnegative definite matrix. Let $\lambda_1,\dots,\lambda_n$ be the eigenvalues of the matrix $A$. Then there exists a matrix $B$ such that
1. $B^2 = A$,
2. $B$ is a nonnegative definite matrix,
3. $\sqrt{\lambda_1},\dots,\sqrt{\lambda_n}$ are the eigenvalues of the matrix $B$,
4. there exists a unitary matrix $U$ such that $U\left(\operatorname{diag}(\sqrt{\lambda_1},\dots,\sqrt{\lambda_n})\right)U^* = B$.
4.1.36 Problem Let $A \equiv [a_{ij}]$ be an $n$-square complex matrix. Suppose that $A$ is a nonnegative definite matrix. Let $\lambda_1,\dots,\lambda_n$ be the eigenvalues of the matrix $A$. Then there exists a unique matrix $B$ such that
1. $B^2 = A$,
2. there exists a unitary matrix $U$ such that $U\left(\operatorname{diag}(\sqrt{\lambda_1},\dots,\sqrt{\lambda_n})\right)U^* = B$.
Proof In view of 4.1.35, it remains to prove the uniqueness part.
Uniqueness: Suppose that $B$ is a matrix such that
1. $B^2 = A$,
2. there exists a unitary matrix $U$ such that $U\left(\operatorname{diag}(\sqrt{\lambda_1},\dots,\sqrt{\lambda_n})\right)U^* = B$.
Suppose that $C$ is a matrix such that
1. $C^2 = A$,
2. there exists a unitary matrix $V$ such that $V\left(\operatorname{diag}(\sqrt{\lambda_1},\dots,\sqrt{\lambda_n})\right)V^* = C$.
We have to show that $B = C$, that is,
\[ U\left(\operatorname{diag}(\sqrt{\lambda_1},\dots,\sqrt{\lambda_n})\right)U^* = V\left(\operatorname{diag}(\sqrt{\lambda_1},\dots,\sqrt{\lambda_n})\right)V^*, \]
that is,
\[ U\left(\operatorname{diag}(\sqrt{\lambda_1},\dots,\sqrt{\lambda_n})\right) = V\left(\operatorname{diag}(\sqrt{\lambda_1},\dots,\sqrt{\lambda_n})\right)V^*U, \]
that is,
\[ V^*U\left(\operatorname{diag}(\sqrt{\lambda_1},\dots,\sqrt{\lambda_n})\right) = \left(\operatorname{diag}(\sqrt{\lambda_1},\dots,\sqrt{\lambda_n})\right)V^*U, \]
that is,
\[ W\left(\operatorname{diag}(\sqrt{\lambda_1},\dots,\sqrt{\lambda_n})\right) = \left(\operatorname{diag}(\sqrt{\lambda_1},\dots,\sqrt{\lambda_n})\right)W, \]
where $W \equiv V^*U$.
Suppose that $W \equiv [w_{ij}]$. Observe that $\operatorname{diag}(\sqrt{\lambda_1},\dots,\sqrt{\lambda_n}) = [s_{ij}]$, where $s_{ij} \equiv \sqrt{\lambda_i}\,\delta_{ij}$. So we have to show that
\[ [w_{ij}][s_{ij}] = [s_{ij}][w_{ij}], \]
that is,
\[ \left(\sqrt{\lambda_j}\,w_{ij} = w_{ij}\sqrt{\lambda_j} = \sum_{k=1}^n w_{ik}\left(\sqrt{\lambda_k}\,\delta_{kj}\right) = \right)\ \sum_{k=1}^n w_{ik}s_{kj} = \sum_{k=1}^n s_{ik}w_{kj}, \]
that is,
\[ \sqrt{\lambda_j}\,w_{ij} = \sqrt{\lambda_i}\,w_{ij}, \]
that is,
\[ \left(\sqrt{\lambda_i} - \sqrt{\lambda_j}\right)w_{ij} = 0. \]
Thus it suffices to show that
\[ \left(\sqrt{\lambda_i} - \sqrt{\lambda_j}\right)w_{ij} = 0. \]
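Since
\[
\begin{aligned}
A = B^2 = BB
&= \left(U\left(\operatorname{diag}(\sqrt{\lambda_1},\dots,\sqrt{\lambda_n})\right)U^*\right)\left(U\left(\operatorname{diag}(\sqrt{\lambda_1},\dots,\sqrt{\lambda_n})\right)U^*\right) \\
&= U\left(\operatorname{diag}(\sqrt{\lambda_1},\dots,\sqrt{\lambda_n})\right)(U^*U)\left(\operatorname{diag}(\sqrt{\lambda_1},\dots,\sqrt{\lambda_n})\right)U^* \\
&= U\left(\operatorname{diag}(\sqrt{\lambda_1},\dots,\sqrt{\lambda_n})\right)I\left(\operatorname{diag}(\sqrt{\lambda_1},\dots,\sqrt{\lambda_n})\right)U^* \\
&= U\left(\operatorname{diag}(\sqrt{\lambda_1},\dots,\sqrt{\lambda_n})\operatorname{diag}(\sqrt{\lambda_1},\dots,\sqrt{\lambda_n})\right)U^* \\
&= U(\operatorname{diag}(\lambda_1,\dots,\lambda_n))U^*,
\end{aligned}
\]
we have
\[ A = U(\operatorname{diag}(\lambda_1,\dots,\lambda_n))U^*. \]
Similarly,
\[ A = V(\operatorname{diag}(\lambda_1,\dots,\lambda_n))V^*. \]
It follows that
\[ U(\operatorname{diag}(\lambda_1,\dots,\lambda_n))U^* = V(\operatorname{diag}(\lambda_1,\dots,\lambda_n))V^*, \]
that is,
\[ V^*U(\operatorname{diag}(\lambda_1,\dots,\lambda_n))U^* = (\operatorname{diag}(\lambda_1,\dots,\lambda_n))V^*, \]
that is,
\[ (V^*U)(\operatorname{diag}(\lambda_1,\dots,\lambda_n)) = (\operatorname{diag}(\lambda_1,\dots,\lambda_n))(V^*U), \]
that is,
\[ W(\operatorname{diag}(\lambda_1,\dots,\lambda_n)) = (\operatorname{diag}(\lambda_1,\dots,\lambda_n))W, \]
that is,
\[ [w_{ij}](\operatorname{diag}(\lambda_1,\dots,\lambda_n)) = (\operatorname{diag}(\lambda_1,\dots,\lambda_n))[w_{ij}]. \]
Observe that $\operatorname{diag}(\lambda_1,\dots,\lambda_n) = [\kappa_{ij}]$, where $\kappa_{ij} \equiv \lambda_i\delta_{ij}$. It follows that
\[ [w_{ij}][\kappa_{ij}] = [\kappa_{ij}][w_{ij}], \]
and hence
\[ \sum_{k=1}^n w_{ik}\kappa_{kj} = \sum_{k=1}^n \kappa_{ik}w_{kj}. \]
Now, since
\[ \sum_{k=1}^n w_{ik}\kappa_{kj} = \sum_{k=1}^n w_{ik}(\lambda_k\delta_{kj}) = w_{ij}\lambda_j, \]
and
\[ \sum_{k=1}^n \kappa_{ik}w_{kj} = \sum_{k=1}^n (\lambda_i\delta_{ik})w_{kj} = \lambda_iw_{ij} = w_{ij}\lambda_i, \]
we have $w_{ij}\lambda_j = w_{ij}\lambda_i$, and hence $(\lambda_i - \lambda_j)w_{ij} = 0$. It follows that for distinct $\lambda_i$ and $\lambda_j$, $w_{ij} = 0$, while for $\lambda_i = \lambda_j$ we have $\sqrt{\lambda_i} - \sqrt{\lambda_j} = 0$. Hence $\left(\sqrt{\lambda_i} - \sqrt{\lambda_j}\right)w_{ij} = 0$. ∎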
4.1.37 Theorem Let $A \equiv [a_{ij}]$ be an $n$-square complex matrix. Suppose that $A$ is a nonnegative definite matrix. Then there exists a unique matrix $B$ such that
1. $B^2 = A$,
2. $B$ is a nonnegative definite matrix.
Here the unique matrix $B$ is denoted by $\sqrt{A}$ and is called the square root of the nonnegative definite matrix $A$.
Proof In view of 4.1.35, it remains to prove the uniqueness part.
Uniqueness: Let $\lambda_1,\dots,\lambda_n$ be the eigenvalues of the matrix $A$.
Suppose that $B$ is a matrix such that
1. $B^2 = A$,
2. $B$ is a nonnegative definite matrix.
Suppose that $C$ is a matrix such that
1. $C^2 = A$,
2. $C$ is a nonnegative definite matrix.
We have to show that $B = C$.
Since $B$ is a nonnegative definite matrix, by 3.3.28, $B$ is a Hermitian matrix, and hence by 3.3.24, there exists a unitary matrix $U$ such that
1. $U^*BU$ is a diagonal matrix,
2. the eigenvalues of $B$ are the diagonal entries of $U^*BU$.
Hence $U^*BU = \operatorname{diag}(\mu_1,\dots,\mu_n)$, where $\mu_1,\dots,\mu_n$ are the eigenvalues of $B$. It follows that
\[ B = U(\operatorname{diag}(\mu_1,\dots,\mu_n))U^*. \]
Now, since
\[
\begin{aligned}
A = B^2 &= \left(U(\operatorname{diag}(\mu_1,\dots,\mu_n))U^*\right)^2
 = \left(U(\operatorname{diag}(\mu_1,\dots,\mu_n))U^*\right)\left(U(\operatorname{diag}(\mu_1,\dots,\mu_n))U^*\right) \\
&= U\left(\operatorname{diag}(\mu_1,\dots,\mu_n)\operatorname{diag}(\mu_1,\dots,\mu_n)\right)U^*
 = U\left(\operatorname{diag}\left((\mu_1)^2,\dots,(\mu_n)^2\right)\right)U^*,
\end{aligned}
\]
we have
\[ A = U\left(\operatorname{diag}\left((\mu_1)^2,\dots,(\mu_n)^2\right)\right)U^*. \]
Since
\[
\begin{aligned}
\lambda I - A
&= \lambda I - U\left(\operatorname{diag}\left((\mu_1)^2,\dots,(\mu_n)^2\right)\right)U^*
 = \lambda UU^* - U\left(\operatorname{diag}\left((\mu_1)^2,\dots,(\mu_n)^2\right)\right)U^* \\
&= U(\lambda I)U^* - U\left(\operatorname{diag}\left((\mu_1)^2,\dots,(\mu_n)^2\right)\right)U^*
 = U\left(\lambda I - \operatorname{diag}\left((\mu_1)^2,\dots,(\mu_n)^2\right)\right)U^* \\
&= U\left(\operatorname{diag}(\lambda,\dots,\lambda) - \operatorname{diag}\left((\mu_1)^2,\dots,(\mu_n)^2\right)\right)U^*
 = U\left(\operatorname{diag}\left(\lambda - (\mu_1)^2,\dots,\lambda - (\mu_n)^2\right)\right)U^*,
\end{aligned}
\]
we have
\[ \lambda I - A = U\left(\operatorname{diag}\left(\lambda - (\mu_1)^2,\dots,\lambda - (\mu_n)^2\right)\right)U^*, \]
and hence
\[
\begin{aligned}
\det(\lambda I - A)
&= \det\left(U\left(\operatorname{diag}\left(\lambda - (\mu_1)^2,\dots,\lambda - (\mu_n)^2\right)\right)U^*\right) \\
&= \det(U)\cdot\det\left(\operatorname{diag}\left(\lambda - (\mu_1)^2,\dots,\lambda - (\mu_n)^2\right)\right)\cdot\det(U^*) \\
&= \det(U)\cdot\det\left(\operatorname{diag}\left(\lambda - (\mu_1)^2,\dots,\lambda - (\mu_n)^2\right)\right)\cdot\det(U^{-1}) \\
&= \det(U)\cdot\det\left(\operatorname{diag}\left(\lambda - (\mu_1)^2,\dots,\lambda - (\mu_n)^2\right)\right)\cdot\frac{1}{\det(U)} \\
&= \det\left(\operatorname{diag}\left(\lambda - (\mu_1)^2,\dots,\lambda - (\mu_n)^2\right)\right)
 = \left(\lambda - (\mu_1)^2\right)\left(\lambda - (\mu_2)^2\right)\cdots\left(\lambda - (\mu_n)^2\right).
\end{aligned}
\]
Thus
\[ \det(\lambda I - A) = \left(\lambda - (\mu_1)^2\right)\left(\lambda - (\mu_2)^2\right)\cdots\left(\lambda - (\mu_n)^2\right). \]
Hence the characteristic polynomial of the matrix $A$ is $\left(\lambda - (\mu_1)^2\right)\left(\lambda - (\mu_2)^2\right)\cdots\left(\lambda - (\mu_n)^2\right)$. Its roots are $(\mu_1)^2,\dots,(\mu_n)^2$. So the eigenvalues of the matrix $A$ are $(\mu_1)^2,\dots,(\mu_n)^2$.
Now, since the eigenvalues of the matrix $A$ are $\lambda_1,\dots,\lambda_n$, we can suppose that $(\mu_1)^2 = \lambda_1$. Since $\mu_1$ is an eigenvalue of $B$, and $B$ is a nonnegative definite matrix, by 3.3.30, $\mu_1$ is a nonnegative real number. It follows that $\sqrt{\lambda_1} = \sqrt{(\mu_1)^2} = \mu_1$, and hence $\mu_1 = \sqrt{\lambda_1}$. Similarly, $\mu_2 = \sqrt{\lambda_2}$, etc.
Next, since
\[ U^*BU = \operatorname{diag}(\mu_1,\dots,\mu_n) = \operatorname{diag}(\sqrt{\lambda_1},\dots,\sqrt{\lambda_n}), \]
we have
\[ U^*BU = \operatorname{diag}(\sqrt{\lambda_1},\dots,\sqrt{\lambda_n}), \]
and hence
\[ B = U\left(\operatorname{diag}(\sqrt{\lambda_1},\dots,\sqrt{\lambda_n})\right)U^*. \]
Similarly,
\[ C = V\left(\operatorname{diag}(\sqrt{\lambda_1},\dots,\sqrt{\lambda_n})\right)V^*. \]
Thus
1. $B^2 = A$,
2. $U\left(\operatorname{diag}(\sqrt{\lambda_1},\dots,\sqrt{\lambda_n})\right)U^* = B$, where $U$ is a unitary matrix,
3. $C^2 = A$,
4. $V\left(\operatorname{diag}(\sqrt{\lambda_1},\dots,\sqrt{\lambda_n})\right)V^* = C$, where $V$ is a unitary matrix.
Now, by 4.1.36, $B = C$. ∎
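The nonnegativity condition is what makes the square root unique: a matrix usually has other square roots that are not nonnegative definite. The following sketch, assuming again the example $A = \begin{bmatrix} 2 & 1 \\ 1 & 2 \end{bmatrix}$ (not from the text), builds one such root $C$ by flipping the sign of one diagonal entry and exhibits a vector on which the quadratic form of $C$ is negative.

```python
import math

def matmul(P, Q):
    """Product of two matrices given as lists of rows."""
    return [[sum(P[i][k] * Q[k][j] for k in range(len(Q)))
             for j in range(len(Q[0]))] for i in range(len(P))]

def transpose(P):
    return [list(col) for col in zip(*P)]

s = 1 / math.sqrt(2)
A = [[2.0, 1.0], [1.0, 2.0]]                  # assumed example, eigenvalues 3 and 1
U = [[s, s], [s, -s]]                         # orthonormal eigenvectors as columns

# A competing square root: use -sqrt(3) instead of +sqrt(3) on the diagonal.
signed = [[-math.sqrt(3.0), 0.0], [0.0, 1.0]]
C = matmul(matmul(U, signed), transpose(U))
C_squared = matmul(C, C)                      # still equals A

x = [1.0, 1.0]                                # eigenvector direction for -sqrt(3)
form = sum(x[i] * C[i][j] * x[j] for i in range(2) for j in range(2))
```

Here `C_squared` reproduces `A`, yet `form` is negative (about `-3.46`), so `C` is excluded by condition 2 of the theorem.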
4.1.38 Theorem Let $V$ be any $n$-dimensional inner product space over the field $\mathbb{C}$. Let $T : V \to V$ be a nonnegative linear transformation. Then there exists a unique nonnegative linear transformation $S : V \to V$ such that $S^2 = T$.
Here the unique linear transformation $S$ is denoted by $\sqrt{T}$, and is called the square root of the nonnegative linear transformation $T$.
Proof In view of 4.1.27, it remains to prove the uniqueness part.
Uniqueness: Let $R : V \to V$ be a nonnegative linear transformation such that $R^2 = T$. Let $S : V \to V$ be a nonnegative linear transformation such that $S^2 = T$. We have to show that $R = S$.
Let us take an orthonormal basis $\{e_1,\dots,e_n\}$ of $V$. Let $A \equiv [a_{ij}]$ be the matrix of $T$ relative to the basis $\{e_1,\dots,e_n\}$. Let $B \equiv [b_{ij}]$ be the matrix of $R$ relative to the basis $\{e_1,\dots,e_n\}$. Let $C \equiv [c_{ij}]$ be the matrix of $S$ relative to the basis $\{e_1,\dots,e_n\}$. Thus
\[
T(e_j) = \sum_{i=1}^n a_{ij}e_i, \qquad
R(e_j) = \sum_{i=1}^n b_{ij}e_i, \qquad
S(e_j) = \sum_{i=1}^n c_{ij}e_i.
\]
It suffices to show that $b_{ij} = c_{ij}$ $(i,j \in \{1,\dots,n\})$, that is, $B = C$.
Since $T$ is a nonnegative linear transformation, by 4.1.28, $A$ is a nonnegative definite matrix. Similarly, $B$ is a nonnegative definite matrix, and $C$ is a nonnegative definite matrix. By 3.1.33, the matrix of $R^2 \ (= R \circ R = T)$ relative to the basis $\{e_1,\dots,e_n\}$ is $BB \ (= B^2)$. Now, since the matrix of $T$ relative to the basis $\{e_1,\dots,e_n\}$ is $A$, we have $B^2 = A$. Similarly, $C^2 = A$. Now, by 4.1.37, $B = C$. ∎
4.2 Sylvester’s Law
4.2.1 Note Let $A \equiv [a_{ij}]$ be an $n$-square real matrix.
Since $\overline{a_{ij}} = a_{ij}$, $A$ is Hermitian if and only if $A$ is symmetric (that is, $A^T = A$). Here $A$ is unitary if and only if $A$ is orthogonal (that is, $AA^T = A^TA = I$).
Let $A \equiv [a_{ij}]$ be a real symmetric matrix.
It follows that $A$ is a Hermitian matrix, and hence $A$ is a normal matrix. Now, by 3.3.24, there exists a unitary matrix $U$ such that
1. $U^*AU$ is a diagonal matrix,
2. the eigenvalues of $A$ are the diagonal entries of $U^*AU$.
Hence $U^*AU = \operatorname{diag}(\lambda_1,\dots,\lambda_n)$, where $\lambda_1,\dots,\lambda_n$ are the eigenvalues of the matrix $A$. It follows that
\[
\operatorname{diag}(\lambda_1,\dots,\lambda_n) = U^*AU = U^*A^*U = U^*A^*(U^*)^* = (U^*AU)^* = (\operatorname{diag}(\lambda_1,\dots,\lambda_n))^* = \operatorname{diag}\left(\overline{\lambda_1},\dots,\overline{\lambda_n}\right),
\]
and hence $\operatorname{diag}(\lambda_1,\dots,\lambda_n) = \operatorname{diag}\left(\overline{\lambda_1},\dots,\overline{\lambda_n}\right)$. It follows that $\lambda_i = \overline{\lambda_i}$ $(i = 1,\dots,n)$, and hence each $\lambda_i$ is a real number.
4.2.2 Conclusion Let $A \equiv [a_{ij}]$ be a real symmetric matrix. Then there exists a unitary matrix $U$ such that $U^*AU = \operatorname{diag}(\lambda_1,\dots,\lambda_n)$, where $\lambda_1,\dots,\lambda_n$ are real numbers.
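For a $2 \times 2$ real symmetric matrix the reality of the eigenvalues can be seen directly from the characteristic polynomial, whose discriminant is a sum of squares. The function below is a minimal sketch of that special case only (the name `eigenvalues_sym2` is hypothetical); it is not the general argument used in the text.

```python
import math

def eigenvalues_sym2(a, b, d):
    """Eigenvalues of the real symmetric matrix [[a, b], [b, d]].

    The characteristic polynomial is x^2 - (a + d) x + (a d - b^2);
    its discriminant (a - d)^2 + 4 b^2 is never negative for real a, b, d.
    """
    disc = (a - d) ** 2 + 4 * b ** 2
    r = math.sqrt(disc)
    return ((a + d - r) / 2, (a + d + r) / 2)

low, high = eigenvalues_sym2(2.0, 1.0, 2.0)     # the matrix [[2, 1], [1, 2]]
swap_low, swap_high = eigenvalues_sym2(0.0, 1.0, 0.0)   # the matrix [[0, 1], [1, 0]]
```

Because `math.sqrt` is applied to a nonnegative number, both returned values are real for every real input.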
4.2.3 Theorem Let $A \equiv [a_{ij}]$ be an $n$-square real matrix. Let $B \equiv [b_{ij}]$ be an $n$-square real matrix. Let $P$ be an invertible $n$-square complex matrix such that $B = P^{-1}AP$. Then there exists an invertible $n$-square real matrix $Q$ such that $B = Q^{-1}AQ$.
Proof Since $P$ is an $n$-square complex matrix, we can write $P = P_1 + iP_2$, where $P_1, P_2$ are $n$-square real matrices.
Case I: $P_2 = 0$. In this case, $P = P_1$. Now, since $B = P^{-1}AP$, we have $B = (P_1)^{-1}AP_1$, where $P_1$ is an $n$-square real matrix.
Case II: $P_2 \neq 0$. Since $B = P^{-1}AP$, we have
\[ P_1B + iP_2B = (P_1 + iP_2)B = PB = AP = A(P_1 + iP_2) = AP_1 + iAP_2, \]
and hence
\[ (P_1B) + i(P_2B) = (AP_1) + i(AP_2). \qquad (*) \]
Since $P_1, B$ are real matrices, $P_1B$ is a real matrix. Similarly, $P_2B$, $AP_1$, $AP_2$ are real matrices. Now, from (*),
\[ P_1B = AP_1, \qquad P_2B = AP_2. \qquad (**) \]
Since $P_2 \neq 0$, $\det(P_1 + xP_2)$ is a polynomial in $x$. This polynomial is not identically zero, because its value at $x = i$ is $\det(P) \neq 0$; hence it has only finitely many roots. Suppose that $\{\alpha_1,\dots,\alpha_k\}$ is the collection of all the roots of the polynomial $\det(P_1 + xP_2)$. We can find a real number $t_0 \notin \{\alpha_1,\dots,\alpha_k\}$. It follows that
\[ \det(P_1 + t_0P_2) \neq 0. \]
Hence $P_1 + t_0P_2$ is an invertible $n$-square matrix. Since $P_1, P_2$ are real matrices and $t_0$ is a real number, $P_1 + t_0P_2$ is an $n$-square real matrix. Thus $Q$ is an invertible $n$-square real matrix, where $Q \equiv P_1 + t_0P_2$. It remains to show that $Q^{-1}AQ = B$, that is, $AQ = QB$, that is, $A(P_1 + t_0P_2) = (P_1 + t_0P_2)B$, that is, $AP_1 + t_0(AP_2) = P_1B + t_0(P_2B)$. This is clearly true from (**). ∎
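The recipe in Case II (split $P$ into real and imaginary parts, then pick a real $t_0$ that avoids the roots of $\det(P_1 + xP_2)$) can be traced on a $2 \times 2$ example. The sketch below assumes particular integer matrices $A$, $B$, $P_1$, $P_2$ chosen so that $P_1$ alone is singular; these values are illustrative and are not taken from the text.

```python
def matmul(P, Q):
    """Product of two matrices given as lists of rows."""
    return [[sum(P[i][k] * Q[k][j] for k in range(len(Q)))
             for j in range(len(Q[0]))] for i in range(len(P))]

def det2(M):
    return M[0][0] * M[1][1] - M[0][1] * M[1][0]

A = [[1, 1], [0, 2]]
B = [[2, 0], [1, 1]]
P1 = [[1, 0], [1, 0]]        # real part of P (singular on its own)
P2 = [[0, 1], [1, 0]]        # imaginary part of P

# P = P1 + i P2 is invertible and satisfies P B = A P.
P = [[P1[i][j] + 1j * P2[i][j] for j in range(2)] for i in range(2)]
det_P = det2(P)

def combo(t):
    """Return the real matrix P1 + t * P2."""
    return [[P1[i][j] + t * P2[i][j] for j in range(2)] for i in range(2)]

# Search small nonnegative integers for a t0 that is not a root of det(P1 + x P2).
t0 = next(t for t in range(10) if det2(combo(t)) != 0)
Q = combo(t0)
AQ, QB = matmul(A, Q), matmul(Q, B)
```

In this example $\det(P_1 + xP_2) = -x(1 + x)$, so `t0 = 0` is rejected and the search settles on `t0 = 1`, giving a real invertible `Q` with `AQ == QB`.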
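4.2.4 Note Let $\lambda$ and $\mu$ be distinct complex numbers. Observe that
\[
\begin{aligned}
\begin{bmatrix} 1 & 0 & 0 \\ 0 & 0 & 1 \\ 0 & 1 & 0 \end{bmatrix}(\operatorname{diag}(\lambda,\mu,\lambda))\begin{bmatrix} 1 & 0 & 0 \\ 0 & 0 & 1 \\ 0 & 1 & 0 \end{bmatrix}
&= \left(\begin{bmatrix} 1 & 0 & 0 \\ 0 & 0 & 1 \\ 0 & 1 & 0 \end{bmatrix}(\operatorname{diag}(\lambda,\mu,\lambda))\right)\begin{bmatrix} 1 & 0 & 0 \\ 0 & 0 & 1 \\ 0 & 1 & 0 \end{bmatrix} \\
&= \left(\begin{bmatrix} 1 & 0 & 0 \\ 0 & 0 & 1 \\ 0 & 1 & 0 \end{bmatrix}\begin{bmatrix} \lambda & 0 & 0 \\ 0 & \mu & 0 \\ 0 & 0 & \lambda \end{bmatrix}\right)\begin{bmatrix} 1 & 0 & 0 \\ 0 & 0 & 1 \\ 0 & 1 & 0 \end{bmatrix} \\
&= \begin{bmatrix} \lambda & 0 & 0 \\ 0 & 0 & \lambda \\ 0 & \mu & 0 \end{bmatrix}\begin{bmatrix} 1 & 0 & 0 \\ 0 & 0 & 1 \\ 0 & 1 & 0 \end{bmatrix}
 = \begin{bmatrix} \lambda & 0 & 0 \\ 0 & \lambda & 0 \\ 0 & 0 & \mu \end{bmatrix}
 = \operatorname{diag}(\underbrace{\lambda,\lambda}_{2},\underbrace{\mu}_{1}),
\end{aligned}
\]
and hence
\[
\begin{bmatrix} 1 & 0 & 0 \\ 0 & 0 & 1 \\ 0 & 1 & 0 \end{bmatrix}(\operatorname{diag}(\lambda,\mu,\lambda))\begin{bmatrix} 1 & 0 & 0 \\ 0 & 0 & 1 \\ 0 & 1 & 0 \end{bmatrix} = \operatorname{diag}(\underbrace{\lambda,\lambda}_{2},\underbrace{\mu}_{1}).
\]
Notation $\operatorname{diag}(\underbrace{\lambda,\lambda}_{2},\underbrace{\mu}_{1})$ is denoted by $\lambda I_2 \oplus \mu I_1$.
Thus
\[
\begin{bmatrix} 1 & 0 & 0 \\ 0 & 0 & 1 \\ 0 & 1 & 0 \end{bmatrix}(\operatorname{diag}(\lambda,\mu,\lambda))\begin{bmatrix} 1 & 0 & 0 \\ 0 & 0 & 1 \\ 0 & 1 & 0 \end{bmatrix} = \lambda I_2 \oplus \mu I_1. \qquad (*)
\]
Observe that
\[
\begin{bmatrix} 1 & 0 & 0 \\ 0 & 0 & 1 \\ 0 & 1 & 0 \end{bmatrix}\begin{bmatrix} 1 & 0 & 0 \\ 0 & 0 & 1 \\ 0 & 1 & 0 \end{bmatrix} = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix} = I_3,
\]
so
\[
\begin{bmatrix} 1 & 0 & 0 \\ 0 & 0 & 1 \\ 0 & 1 & 0 \end{bmatrix}^{-1} = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 0 & 1 \\ 0 & 1 & 0 \end{bmatrix}.
\]
Thus $\begin{bmatrix} 1 & 0 & 0 \\ 0 & 0 & 1 \\ 0 & 1 & 0 \end{bmatrix}$ is invertible. It is clear that $\begin{bmatrix} 1 & 0 & 0 \\ 0 & 0 & 1 \\ 0 & 1 & 0 \end{bmatrix}$ is symmetric and Hermitian. Thus $\begin{bmatrix} 1 & 0 & 0 \\ 0 & 0 & 1 \\ 0 & 1 & 0 \end{bmatrix}$ is unitary.
Also, from (*),
\[
\operatorname{diag}(\lambda,\mu,\lambda) = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 0 & 1 \\ 0 & 1 & 0 \end{bmatrix}(\lambda I_2 \oplus \mu I_1)\begin{bmatrix} 1 & 0 & 0 \\ 0 & 0 & 1 \\ 0 & 1 & 0 \end{bmatrix}.
\]
Hence
\[
\operatorname{diag}(\lambda,\mu,\lambda) = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 0 & 1 \\ 0 & 1 & 0 \end{bmatrix}^*(\lambda I_2 \oplus \mu I_1)\begin{bmatrix} 1 & 0 & 0 \\ 0 & 0 & 1 \\ 0 & 1 & 0 \end{bmatrix}.
\]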
4.2.5 Conclusion Let $D$ be a diagonal matrix. Then $D$ can be expressed as $Q^*(\lambda I_r \oplus \cdots \oplus \mu I_s)Q$, where $\lambda,\dots,\mu$ are the distinct members of the diagonal entries of $D$, and $Q$ is a unitary matrix.
4.2.6 Note Let $A \equiv [a_{ij}]$ be an $n$-square complex matrix. Let $A$ be unitary (that is, $A^*A = AA^* = I$, that is, $A^* = A^{-1}$). Let $\lambda$ be an eigenvalue of the matrix $A$. Then $|\lambda| = 1$.
Proof Let $T : x \mapsto Ax$ be the linear transformation from the inner product space $\mathbb{C}^n$ to $\mathbb{C}^n$. Since $\lambda$ is an eigenvalue of the matrix $A$, we have $\det(\lambda I - A) = 0$, and hence there exists a nonzero $x \in \mathbb{C}^n$ such that
\[ \lambda x - T(x) = \lambda x - Ax = \lambda Ix - Ax = (\lambda I - A)x = 0. \]
Thus $\lambda x - T(x) = 0$, and hence $T(x) = \lambda x$, where $x \neq 0$. This shows that $\lambda$ is an eigenvalue of the linear transformation $T$.
By 3.1.22, it suffices to show that $T$ is a unitary transformation. To this end, let us take an arbitrary $x \in \mathbb{C}^n$. By 3.1.2, it suffices to show that $\langle T(x),T(x)\rangle = \langle x,x\rangle$.
\[
\text{LHS} = \langle T(x),T(x)\rangle = \langle Ax,Ax\rangle = (Ax)^*(Ax) = (x^*A^*)(Ax) = x^*(A^*A)x = x^*Ix = x^*x = \langle x,x\rangle = \text{RHS}. \ ∎
\]
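A plane rotation gives a quick instance of a unitary matrix whose eigenvalues lie on the unit circle. The sketch below assumes the rotation angle $t = 0.7$ and uses the known eigenpair $\lambda = e^{it}$, $x = [1, -i]^T$ of the rotation matrix; the angle is an arbitrary illustrative choice.

```python
import cmath
import math

t = 0.7                                          # assumed rotation angle
A = [[math.cos(t), -math.sin(t)],
     [math.sin(t), math.cos(t)]]                 # real orthogonal, hence unitary

lam = cmath.exp(1j * t)                          # candidate eigenvalue e^{it}
x = [1, -1j]                                     # candidate eigenvector

Ax = [A[i][0] * x[0] + A[i][1] * x[1] for i in range(2)]
residual = max(abs(Ax[i] - lam * x[i]) for i in range(2))   # size of A x - lam x
modulus = abs(lam)
```

The residual is zero up to rounding, so $\lambda$ is indeed an eigenvalue, and `modulus` equals 1 as the note asserts.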
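4.2.7 Note Let $A \equiv [a_{ij}]$ be an $n$-square complex matrix. Let $A$ be symmetric (that is, $A^T = A$). Let $A$ be unitary (that is, $A^*A = AA^* = I$, that is, $A^* = A^{-1}$).
Since $A$ is unitary, $A$ is a normal matrix, and hence by 3.3.24, there exists a unitary matrix $U$ such that
1. $U^*AU$ is a diagonal matrix,
2. the eigenvalues of $A$ are the diagonal entries of $U^*AU$.
Hence $U^*AU = \operatorname{diag}(\lambda_1,\dots,\lambda_n)$, where $\lambda_1,\dots,\lambda_n$ are the eigenvalues of the matrix $A$.
Now, since $A$ is a unitary matrix, by 4.2.6, $|\lambda_1| = 1$, $|\lambda_2| = 1$, etc. Since $U^*AU = \operatorname{diag}(\lambda_1,\dots,\lambda_n)$, and $U$ is a unitary matrix, we have $A = U(\operatorname{diag}(\lambda_1,\dots,\lambda_n))U^*$.
Suppose that $\mu_1,\dots,\mu_k$ are the distinct members of $\lambda_1,\dots,\lambda_n$.
By 4.2.5, $\operatorname{diag}(\lambda_1,\dots,\lambda_n)$ can be expressed as $Q^*(\mu_1I_{r_1} \oplus \cdots \oplus \mu_kI_{r_k})Q$, where $Q$ is a unitary matrix. Since
\[ A = U(\operatorname{diag}(\lambda_1,\dots,\lambda_n))U^*, \]
we have
\[
A = U\left(Q^*(\mu_1I_{r_1} \oplus \cdots \oplus \mu_kI_{r_k})Q\right)U^* = (UQ^*)(\mu_1I_{r_1} \oplus \cdots \oplus \mu_kI_{r_k})(UQ^*)^* = V(\mu_1I_{r_1} \oplus \cdots \oplus \mu_kI_{r_k})V^*,
\]
where $V \equiv UQ^*$. Thus
\[ A = V(\mu_1I_{r_1} \oplus \cdots \oplus \mu_kI_{r_k})V^*. \]
Since $\mu_1,\dots,\mu_k$ are the distinct members of $\lambda_1,\dots,\lambda_n$, and each $|\lambda_i| = 1$, we have $|\mu_1| = 1$, $|\mu_2| = 1$, etc. Hence we can suppose that $\mu_1 \equiv e^{i\theta_1}$, $\mu_2 \equiv e^{i\theta_2}$, etc., where $\theta_1, \theta_2, \dots$ are real numbers. Thus
\[ A = V\left(e^{i\theta_1}I_{r_1} \oplus \cdots \oplus e^{i\theta_k}I_{r_k}\right)V^*. \]
Since $Q$ is unitary, $Q^*$ is unitary. Since $U$ is unitary, $(V =)\,UQ^*$ is unitary, and hence $V$ is unitary. Next, since
\[ A = V\left(e^{i\theta_1}I_{r_1} \oplus \cdots \oplus e^{i\theta_k}I_{r_k}\right)V^*, \]
we have
\[ \left(e^{i\theta_1}I_{r_1} \oplus \cdots \oplus e^{i\theta_k}I_{r_k}\right) = V^*AV. \]
Put
\[ S \equiv V\left(e^{i\frac{\theta_1}{2}}I_{r_1} \oplus \cdots \oplus e^{i\frac{\theta_k}{2}}I_{r_k}\right)V^*. \]
It follows that
\[ \left(e^{i\frac{\theta_1}{2}}I_{r_1} \oplus \cdots \oplus e^{i\frac{\theta_k}{2}}I_{r_k}\right) = V^*SV. \]
Here
\[
\begin{aligned}
S^2 &= \left(V\left(e^{i\frac{\theta_1}{2}}I_{r_1} \oplus \cdots \oplus e^{i\frac{\theta_k}{2}}I_{r_k}\right)V^*\right)\left(V\left(e^{i\frac{\theta_1}{2}}I_{r_1} \oplus \cdots \oplus e^{i\frac{\theta_k}{2}}I_{r_k}\right)V^*\right) \\
&= V\left(e^{i\frac{\theta_1}{2}}I_{r_1} \oplus \cdots \oplus e^{i\frac{\theta_k}{2}}I_{r_k}\right)(V^*V)\left(e^{i\frac{\theta_1}{2}}I_{r_1} \oplus \cdots \oplus e^{i\frac{\theta_k}{2}}I_{r_k}\right)V^* \\
&= V\left(\left(e^{i\frac{\theta_1}{2}}I_{r_1} \oplus \cdots \oplus e^{i\frac{\theta_k}{2}}I_{r_k}\right)\left(e^{i\frac{\theta_1}{2}}I_{r_1} \oplus \cdots \oplus e^{i\frac{\theta_k}{2}}I_{r_k}\right)\right)V^* \\
&= V\left(\left(e^{i\frac{\theta_1}{2}}\right)^2I_{r_1} \oplus \cdots \oplus \left(e^{i\frac{\theta_k}{2}}\right)^2I_{r_k}\right)V^*
 = V\left(e^{i\theta_1}I_{r_1} \oplus \cdots \oplus e^{i\theta_k}I_{r_k}\right)V^* = A,
\end{aligned}
\]
so $S^2 = A$.
Clearly, $S$ is unitary, that is, $S^*S = SS^* = I$.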
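Proof Here,
$$S^* = \left(V\left(e^{i\theta_1/2} I_{r_1} \oplus \cdots \oplus e^{i\theta_k/2} I_{r_k}\right)V^*\right)^* = (V^*)^*\left(e^{i\theta_1/2} I_{r_1} \oplus \cdots \oplus e^{i\theta_k/2} I_{r_k}\right)^*V^*$$
$$= V\left(e^{i\theta_1/2} I_{r_1} \oplus \cdots \oplus e^{i\theta_k/2} I_{r_k}\right)^*V^* = V\left(\overline{e^{i\theta_1/2} I_{r_1} \oplus \cdots \oplus e^{i\theta_k/2} I_{r_k}}\right)^TV^*$$
$$= V\left(\overline{e^{i\theta_1/2}}\, I_{r_1} \oplus \cdots \oplus \overline{e^{i\theta_k/2}}\, I_{r_k}\right)^TV^*$$
$$= V\left(e^{-i\theta_1/2} I_{r_1} \oplus \cdots \oplus e^{-i\theta_k/2} I_{r_k}\right)^TV^* = V\left(e^{-i\theta_1/2} I_{r_1} \oplus \cdots \oplus e^{-i\theta_k/2} I_{r_k}\right)V^*,$$
so
$$S^* = V\left(e^{-i\theta_1/2} I_{r_1} \oplus \cdots \oplus e^{-i\theta_k/2} I_{r_k}\right)V^*.$$
Now,
$$S^*S = \left(V\left(e^{-i\theta_1/2} I_{r_1} \oplus \cdots \oplus e^{-i\theta_k/2} I_{r_k}\right)V^*\right)\left(V\left(e^{i\theta_1/2} I_{r_1} \oplus \cdots \oplus e^{i\theta_k/2} I_{r_k}\right)V^*\right)$$
$$= V\left(\left(e^{-i\theta_1/2} I_{r_1} \oplus \cdots \oplus e^{-i\theta_k/2} I_{r_k}\right)\left(e^{i\theta_1/2} I_{r_1} \oplus \cdots \oplus e^{i\theta_k/2} I_{r_k}\right)\right)V^*$$
$$= V\left(e^{-i\theta_1/2}e^{i\theta_1/2} I_{r_1} \oplus \cdots \oplus e^{-i\theta_k/2}e^{i\theta_k/2} I_{r_k}\right)V^* = V(1I_{r_1} \oplus \cdots \oplus 1I_{r_k})V^*$$
$$= VIV^* = VV^* = I,$$
so $S^*S = I$. Similarly, $SS^* = I$. Thus we have shown that S is unitary. ∎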
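Suppose that B is any n-square complex matrix. Suppose that B commutes with A, that is, $AB = BA$.
Now clearly, $V^*BV$ commutes with $e^{i\theta_1} I_{r_1} \oplus \cdots \oplus e^{i\theta_k} I_{r_k}$, that is,
$$\left(e^{i\theta_1} I_{r_1} \oplus \cdots \oplus e^{i\theta_k} I_{r_k}\right)(V^*BV) = (V^*BV)\left(e^{i\theta_1} I_{r_1} \oplus \cdots \oplus e^{i\theta_k} I_{r_k}\right).$$
Proof Here,
$$\mathrm{LHS} = \left(e^{i\theta_1} I_{r_1} \oplus \cdots \oplus e^{i\theta_k} I_{r_k}\right)(V^*BV) = (V^*AV)(V^*BV) = V^*A(BV) = V^*(AB)V,$$
and
$$\mathrm{RHS} = (V^*BV)\left(e^{i\theta_1} I_{r_1} \oplus \cdots \oplus e^{i\theta_k} I_{r_k}\right) = (V^*BV)(V^*AV) = V^*B(AV) = V^*(BA)V = V^*(AB)V,$$
so LHS = RHS. ∎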
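Again, it is clear that $S\ \left(= V\left(e^{i\theta_1/2} I_{r_1} \oplus \cdots \oplus e^{i\theta_k/2} I_{r_k}\right)V^*\right)$ commutes with B, that is, $SB = BS$, that is, $B = S^*BS$.
Proof Since $e^{i\theta_1}, \ldots, e^{i\theta_k}$ are distinct and $V^*BV$ commutes with $e^{i\theta_1} I_{r_1} \oplus \cdots \oplus e^{i\theta_k} I_{r_k}$, the matrix $V^*BV$ is block diagonal with blocks of sizes $r_1, \ldots, r_k$, and hence $V^*BV$ also commutes with $e^{i\theta_1/2} I_{r_1} \oplus \cdots \oplus e^{i\theta_k/2} I_{r_k}$. Here,
$$\mathrm{RHS} = S^*BS = \left(V\left(e^{-i\theta_1/2} I_{r_1} \oplus \cdots \oplus e^{-i\theta_k/2} I_{r_k}\right)V^*\right)B\left(V\left(e^{i\theta_1/2} I_{r_1} \oplus \cdots \oplus e^{i\theta_k/2} I_{r_k}\right)V^*\right)$$
$$= V\left(\left(e^{-i\theta_1/2} I_{r_1} \oplus \cdots \oplus e^{-i\theta_k/2} I_{r_k}\right)(V^*BV)\left(e^{i\theta_1/2} I_{r_1} \oplus \cdots \oplus e^{i\theta_k/2} I_{r_k}\right)\right)V^*$$
$$= V\left(\left(e^{-i\theta_1/2} I_{r_1} \oplus \cdots \oplus e^{-i\theta_k/2} I_{r_k}\right)\left(e^{i\theta_1/2} I_{r_1} \oplus \cdots \oplus e^{i\theta_k/2} I_{r_k}\right)(V^*BV)\right)V^*$$
$$= V\left(\left(e^{-i\theta_1/2} I_{r_1} \oplus \cdots \oplus e^{-i\theta_k/2} I_{r_k}\right)\left(e^{i\theta_1/2} I_{r_1} \oplus \cdots \oplus e^{i\theta_k/2} I_{r_k}\right)\right)V^*B$$
$$= V\left(e^{-i\theta_1/2}e^{i\theta_1/2} I_{r_1} \oplus \cdots \oplus e^{-i\theta_k/2}e^{i\theta_k/2} I_{r_k}\right)V^*B = V(1I_{r_1} \oplus \cdots \oplus 1I_{r_k})V^*B$$
$$= VIV^*B = B = \mathrm{LHS}.$$
Thus we have shown that if B commutes with A, then B commutes with S. $(*)$
Since A is unitary, we have $A^*A = AA^* = I$, and hence the inverse of the matrix A is $A^*$. Since A is unitary, we have $A^*A = I$, and hence
$$A^T\bar{A} = A^T(A^*)^T = (A^*A)^T = I^T = I.$$
Thus $A^T\bar{A} = I$. Similarly, $\bar{A}A^T = I$. Thus the inverse of the matrix $A^T$ is $\bar{A}$. Similarly, since V is unitary, the inverse of the matrix $V^T$ is $\bar{V}$, and the inverse of the matrix V is $V^*$.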
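Since
$$\bar{V}\left(e^{i\theta_1} I_{r_1} \oplus \cdots \oplus e^{i\theta_k} I_{r_k}\right)V^T = \bar{V}\left(e^{i\theta_1} I_{r_1} \oplus \cdots \oplus e^{i\theta_k} I_{r_k}\right)^TV^T = (V^*)^T\left(e^{i\theta_1} I_{r_1} \oplus \cdots \oplus e^{i\theta_k} I_{r_k}\right)^TV^T$$
$$= \left(V\left(e^{i\theta_1} I_{r_1} \oplus \cdots \oplus e^{i\theta_k} I_{r_k}\right)V^*\right)^T = \underbrace{A^T = A} = V\left(e^{i\theta_1} I_{r_1} \oplus \cdots \oplus e^{i\theta_k} I_{r_k}\right)V^*,$$
we have
$$\bar{V}\left(e^{i\theta_1} I_{r_1} \oplus \cdots \oplus e^{i\theta_k} I_{r_k}\right)V^T = V\left(e^{i\theta_1} I_{r_1} \oplus \cdots \oplus e^{i\theta_k} I_{r_k}\right)V^*.$$
Hence
$$\bar{V}\left(e^{i\theta_1} I_{r_1} \oplus \cdots \oplus e^{i\theta_k} I_{r_k}\right)V^TV = V\left(e^{i\theta_1} I_{r_1} \oplus \cdots \oplus e^{i\theta_k} I_{r_k}\right).$$
It follows that
$$\left(e^{i\theta_1} I_{r_1} \oplus \cdots \oplus e^{i\theta_k} I_{r_k}\right)V^TV = V^TV\left(e^{i\theta_1} I_{r_1} \oplus \cdots \oplus e^{i\theta_k} I_{r_k}\right).$$
Thus we have shown that $V^TV$ commutes with $e^{i\theta_1} I_{r_1} \oplus \cdots \oplus e^{i\theta_k} I_{r_k}$.
Clearly, $VV^T$ commutes with A.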
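Proof We have to show that
$$(VV^T)\left(V\left(e^{i\theta_1} I_{r_1} \oplus \cdots \oplus e^{i\theta_k} I_{r_k}\right)V^*\right) = \underbrace{(VV^T)A = A(VV^T)} = \left(V\left(e^{i\theta_1} I_{r_1} \oplus \cdots \oplus e^{i\theta_k} I_{r_k}\right)V^*\right)(VV^T),$$
that is,
$$V\left((V^TV)\left(e^{i\theta_1} I_{r_1} \oplus \cdots \oplus e^{i\theta_k} I_{r_k}\right)V^*\right) = V\left(\left(e^{i\theta_1} I_{r_1} \oplus \cdots \oplus e^{i\theta_k} I_{r_k}\right)(V^*V)V^T\right),$$
that is,
$$(V^TV)\left(e^{i\theta_1} I_{r_1} \oplus \cdots \oplus e^{i\theta_k} I_{r_k}\right)V^* = \left(e^{i\theta_1} I_{r_1} \oplus \cdots \oplus e^{i\theta_k} I_{r_k}\right)(V^*V)V^T,$$
that is,
$$(V^TV)\left(e^{i\theta_1} I_{r_1} \oplus \cdots \oplus e^{i\theta_k} I_{r_k}\right)V^* = \left(e^{i\theta_1} I_{r_1} \oplus \cdots \oplus e^{i\theta_k} I_{r_k}\right)IV^T,$$
that is,
$$(V^TV)\left(e^{i\theta_1} I_{r_1} \oplus \cdots \oplus e^{i\theta_k} I_{r_k}\right) = \left(e^{i\theta_1} I_{r_1} \oplus \cdots \oplus e^{i\theta_k} I_{r_k}\right)(V^TV).$$
This is known to be true. ∎
It follows, from $(*)$, that $VV^T$ commutes with S, that is,
$$(VV^T)\left(V\left(e^{i\theta_1/2} I_{r_1} \oplus \cdots \oplus e^{i\theta_k/2} I_{r_k}\right)V^*\right) = \underbrace{(VV^T)S = S(VV^T)}$$
$$= \left(V\left(e^{i\theta_1/2} I_{r_1} \oplus \cdots \oplus e^{i\theta_k/2} I_{r_k}\right)V^*\right)(VV^T) = V\left(e^{i\theta_1/2} I_{r_1} \oplus \cdots \oplus e^{i\theta_k/2} I_{r_k}\right)V^T,$$
that is,
$$V\left(V^TV\left(e^{i\theta_1/2} I_{r_1} \oplus \cdots \oplus e^{i\theta_k/2} I_{r_k}\right)V^*\right) = V\left(\left(e^{i\theta_1/2} I_{r_1} \oplus \cdots \oplus e^{i\theta_k/2} I_{r_k}\right)V^T\right),$$
that is,
$$V^TV\left(e^{i\theta_1/2} I_{r_1} \oplus \cdots \oplus e^{i\theta_k/2} I_{r_k}\right)V^* = \left(e^{i\theta_1/2} I_{r_1} \oplus \cdots \oplus e^{i\theta_k/2} I_{r_k}\right)V^T,$$
that is,
$$(V^TV)\left(e^{i\theta_1/2} I_{r_1} \oplus \cdots \oplus e^{i\theta_k/2} I_{r_k}\right) = \left(e^{i\theta_1/2} I_{r_1} \oplus \cdots \oplus e^{i\theta_k/2} I_{r_k}\right)(V^TV).$$
Thus $V^TV$ commutes with $e^{i\theta_1/2} I_{r_1} \oplus \cdots \oplus e^{i\theta_k/2} I_{r_k}$.
Clearly, S is symmetric.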
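Proof We have to show that
$$\bar{V}\left(e^{i\theta_1/2} I_{r_1} \oplus \cdots \oplus e^{i\theta_k/2} I_{r_k}\right)V^T = (V^*)^T\left(e^{i\theta_1/2} I_{r_1} \oplus \cdots \oplus e^{i\theta_k/2} I_{r_k}\right)^TV^T$$
$$= \left(V\left(e^{i\theta_1/2} I_{r_1} \oplus \cdots \oplus e^{i\theta_k/2} I_{r_k}\right)V^*\right)^T = \underbrace{S^T = S} = V\left(e^{i\theta_1/2} I_{r_1} \oplus \cdots \oplus e^{i\theta_k/2} I_{r_k}\right)V^*,$$
that is,
$$\bar{V}\left(e^{i\theta_1/2} I_{r_1} \oplus \cdots \oplus e^{i\theta_k/2} I_{r_k}\right)V^T = V\left(e^{i\theta_1/2} I_{r_1} \oplus \cdots \oplus e^{i\theta_k/2} I_{r_k}\right)V^*,$$
that is,
$$\bar{V}\left(e^{i\theta_1/2} I_{r_1} \oplus \cdots \oplus e^{i\theta_k/2} I_{r_k}\right)V^TV = V\left(e^{i\theta_1/2} I_{r_1} \oplus \cdots \oplus e^{i\theta_k/2} I_{r_k}\right),$$
that is,
$$\left(e^{i\theta_1/2} I_{r_1} \oplus \cdots \oplus e^{i\theta_k/2} I_{r_k}\right)(V^TV) = (V^TV)\left(e^{i\theta_1/2} I_{r_1} \oplus \cdots \oplus e^{i\theta_k/2} I_{r_k}\right).$$
This is known to be true. Thus S is symmetric. ∎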
4.2.8 Conclusion Let $A \equiv [a_{ij}]$ be an n-square complex matrix. Let A be symmetric. Let A be unitary. Then there exists a complex matrix S such that
1. $S^2 = A$,
2. S is unitary,
3. if B commutes with A, then B commutes with S,
4. S is symmetric.
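The four properties in 4.2.8 can be checked numerically on a small case. The following is only a sketch under stated assumptions: the matrix A, the matrix B, and the square root S are hand-picked for illustration (S was worked out by hand as $V\,\mathrm{diag}(1, i)\,V^*$ with $V = \frac{1}{\sqrt{2}}\begin{bmatrix}1 & 1\\ 1 & -1\end{bmatrix}$), and the helper names are hypothetical.

```python
# Sketch: check the properties of 4.2.8 for one hand-picked 2x2 matrix.
# A is symmetric and unitary; S is a candidate square root computed by hand.

def matmul(X, Y):
    """Product of two square matrices given as lists of rows."""
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def transpose(X):
    return [list(row) for row in zip(*X)]

def adjoint(X):
    """Conjugate transpose X*."""
    return [[z.conjugate() for z in row] for row in transpose(X)]

def close(X, Y, tol=1e-12):
    """Entrywise comparison up to a small tolerance."""
    return all(abs(x - y) <= tol
               for rx, ry in zip(X, Y) for x, y in zip(rx, ry))

I2 = [[1, 0], [0, 1]]
A = [[0, 1], [1, 0]]                      # symmetric and unitary
S = [[(1 + 1j) / 2, (1 - 1j) / 2],
     [(1 - 1j) / 2, (1 + 1j) / 2]]        # assumed square root of A
B = [[2, 3], [3, 2]]                      # an illustrative matrix commuting with A

checks = {
    "S^2 = A": close(matmul(S, S), A),
    "S is unitary": close(matmul(adjoint(S), S), I2),
    "S is symmetric": close(transpose(S), S),
    "B commutes with A": close(matmul(A, B), matmul(B, A)),
    "B commutes with S": close(matmul(S, B), matmul(B, S)),
}
print(checks)
```

A single example of this kind does not prove anything; it only makes the statement of 4.2.8 concrete.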
4.2.9 Note Let $A \equiv [a_{ij}]$ be an n-square real matrix. Let $B \equiv [b_{ij}]$ be an n-square real matrix. Let U be a unitary complex matrix such that $A = UBU^*$ (that is, $U^*AU = B$).
Since U is unitary, we have $U^*U = UU^* = I$, and hence the inverse of the matrix U is $U^*$. Since U is unitary, we have $U^*U = I$, and hence
$$U^T\bar{U} = U^T(U^*)^T = (U^*U)^T = I^T = I.$$
Thus $U^T\bar{U} = I$. Similarly, $\bar{U}U^T = I$. Thus the inverse of the matrix $U^T$ is $\bar{U}$.
Clearly, $U^TU$ is symmetric.
Proof Since
$$(U^TU)^T = U^T(U^T)^T = U^TU,$$
we have $(U^TU)^T = U^TU$, so $U^TU$ is symmetric. ∎
Clearly, $U^TU$ is unitary.
Proof Since U is unitary, we have $U^*U = I$, and hence
$$(U^T)(U^T)^* = U^T\bar{U} = U^T(U^*)^T = (U^*U)^T = I^T = I.$$
Thus $(U^T)(U^T)^* = I$. Similarly, $(U^T)^*(U^T) = I$. It follows that $U^T$ is unitary, and hence $(U^T)^{-1} = (U^T)^*$. Since U is unitary, we have $U^{-1} = U^*$. We have to show that $(U^TU)^{-1} = (U^TU)^*$.
$$\mathrm{RHS} = (U^TU)^* = U^*(U^T)^* = U^*(U^T)^{-1} = U^{-1}(U^T)^{-1} = (U^TU)^{-1} = \mathrm{LHS}.$$
∎
Since $U^TU$ is unitary and symmetric, by 4.2.8, there exists a complex matrix S such that
1. $S^2 = U^TU$,
2. S is unitary,
3. if C commutes with $U^TU$, then C commutes with S,
4. S is symmetric.
Clearly, $U^TU$ commutes with B, that is, $(U^TU)B = B(U^TU)$.
Proof Since A and B are real matrices, we have
$$\bar{U}BU^T = \bar{U}\bar{B}U^T = \bar{U}\bar{B}\,\overline{U^*} = \overline{(UBU^*)} = \underbrace{\bar{A} = A} = UBU^*,$$
and hence $\bar{U}BU^T = UBU^*$. This shows that $BU^T = U^T(UBU^*)$, and hence $(BU^T)U = U^T(UB)$. Thus
$$B(U^TU) = (U^TU)B.$$
∎
Thus we have shown that $U^TU$ commutes with B. Now, by (3), B commutes with S. $(*)$
Let us put $Q \equiv US^{-1}$. Clearly, Q is unitary.
Proof We have to show that $(US^{-1})^{-1} = (US^{-1})^*$. Since S is unitary, we have $S^{-1} = S^*$. Now we have to show that
$$\underbrace{(US^*)^{-1} = (US^{-1})^*} = (US^*)^* = (S^*)^*U^* = SU^* = SU^{-1},$$
that is, $(US^*)^{-1} = SU^{-1}$.
$$\mathrm{LHS} = (US^*)^{-1} = (US^{-1})^{-1} = SU^{-1} = \mathrm{RHS}.$$
∎
Clearly, Q is orthogonal.
Proof We have to show that $Q^TQ = I$.
$$\mathrm{LHS} = (US^{-1})^T(US^{-1}) = (US^*)^T(US^{-1}) = \left((S^*)^TU^T\right)(US^{-1}) = \left(\bar{S}U^T\right)(US^{-1})$$
$$= \left(\overline{(S^T)}\,U^T\right)(US^{-1}) = \left(S^*U^T\right)(US^{-1}) = S^*(U^TU)S^{-1}$$
$$= S^{-1}(U^TU)S^{-1} = S^{-1}(S^2)S^{-1} = I = \mathrm{RHS}.$$
∎
Since Q is orthogonal, we have $Q^TQ = I$, and hence $\underbrace{(\bar{Q})^T = Q^* = Q^{-1} = Q^T}$. Thus $(\bar{Q})^T = Q^T$, and therefore $\bar{Q} = Q$. Hence Q is a real matrix.
Clearly, $A = QBQ^T$.
Proof Here
$$QBQ^T = QB\bar{Q}^T = QBQ^* = QB(US^{-1})^* = QB(US^*)^* = QB(SU^*) = (QBS)U^* = \left((US^{-1})BS\right)U^* = U(S^{-1}BS)U^*,$$
so
$$QBQ^T = U(S^{-1}BS)U^*.$$
Now, since $A = UBU^*$, it suffices to show that $B = S^{-1}BS$, that is, $SB = BS$. From $(*)$, this is true. ∎
4.2.10 Conclusion Let A and B be n-square real matrices. Let U be a unitary complex matrix such that $A = UBU^*$. Then there exists a real orthogonal matrix Q such that $A = QBQ^T$.
4.2.11 Note Let $A \equiv [a_{ij}]$ be an n-square real matrix (that is, $\bar{A} = A$). Let A be symmetric (that is, $A^T = A$).
It follows that $A^* = (\bar{A})^T = A^T = A$, and hence $A^* = A$. This shows that A is Hermitian, and hence A is a normal matrix. Now, by 3.3.24, there exists a unitary matrix U such that
1. $U^*AU$ is a diagonal matrix,
2. the eigenvalues of A are the diagonal entries of $U^*AU$.
Hence $U^*AU = \mathrm{diag}(\lambda_1, \ldots, \lambda_n)$, where $\lambda_1, \ldots, \lambda_n$ are the eigenvalues of the matrix A. Since A is Hermitian, by 3.3.26, $\lambda_1, \ldots, \lambda_n$ are real numbers. Since $U^*AU = \mathrm{diag}(\lambda_1, \ldots, \lambda_n)$, and U is a unitary matrix, we have $A = U(\mathrm{diag}(\lambda_1, \ldots, \lambda_n))U^*$. Also A and $\mathrm{diag}(\lambda_1, \ldots, \lambda_n)$ are real matrices of the same size. Now, by 4.2.10, there exists a real orthogonal matrix Q such that
$$A = Q(\mathrm{diag}(\lambda_1, \ldots, \lambda_n))Q^T. \quad (*)$$
Since Q is orthogonal, we have $Q^{-1} = Q^T$. Now from $(*)$,
$$A = Q(\mathrm{diag}(\lambda_1, \ldots, \lambda_n))Q^{-1}.$$
It follows that
$$\underbrace{\mathrm{diag}(\lambda_1, \ldots, \lambda_n) = Q^{-1}AQ} = Q^TAQ = (Q^T)A(Q^T)^T = PAP^T,$$
where $P \equiv Q^T$. Since Q is a real matrix, $P\ (= Q^T)$ is a real matrix, and hence P is a real matrix. Since Q is orthogonal, we have $QQ^T = I$, and hence $\underbrace{P^{-1} = (Q^T)^{-1} = Q} = P^T$. Thus $P^{-1} = P^T$, and hence P is orthogonal. Also, $PAP^T = \mathrm{diag}(\lambda_1, \ldots, \lambda_n)$.
4.2.12 Conclusion Let A be an n-square real symmetric matrix. Then there exists a real orthogonal matrix P such that $PAP^T = \mathrm{diag}(\lambda_1, \ldots, \lambda_n)$, where $\lambda_1, \ldots, \lambda_n$ are the eigenvalues of A.
In short, a real symmetric matrix can be brought to diagonal form by a real orthogonal matrix.
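As a small numerical illustration of 4.2.12, the sketch below diagonalizes one real symmetric 2-square matrix by a real orthogonal matrix. The matrix A and the matrix P (whose rows are unit eigenvectors found by hand) are assumptions made for this example, and floating-point arithmetic is used, so equalities hold only up to rounding.

```python
import math

def matmul(X, Y):
    """Product of two square matrices given as lists of rows."""
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def transpose(X):
    return [list(row) for row in zip(*X)]

A = [[2.0, 1.0], [1.0, 2.0]]        # real symmetric, eigenvalues 3 and 1
c = 1.0 / math.sqrt(2.0)
P = [[c, c], [c, -c]]               # rows: hand-computed unit eigenvectors of A

PPt = matmul(P, transpose(P))               # close to the identity matrix
D = matmul(matmul(P, A), transpose(P))      # close to diag(3, 1)
print(PPt)
print(D)
```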
Definition Let A be an n-square real symmetric matrix. Let B be an n-square real symmetric matrix. If there exists a real invertible matrix S such that $A = SBS^T$, then we say that A and B are congruent.
4.2.13 Note Suppose that S is an invertible matrix. It follows that $S^{-1}$ exists, and $SS^{-1} = I$. Hence
$$\underbrace{(S^{-1})^TS^T = (SS^{-1})^T = I^T} = I.$$
Thus $(S^{-1})^TS^T = I$. It follows that $(S^T)^{-1}$ exists, and $(S^T)^{-1} = (S^{-1})^T$.
4.2.14 Conclusion If S is a real invertible matrix, then $S^T$ is a real invertible matrix, and $(S^T)^{-1} = (S^{-1})^T$.
4.2.15 Problem Congruence is an equivalence relation.
Proof
(i) Let us take an arbitrary real symmetric n-square matrix A. Since $A = IAI^T$, and I is a real invertible matrix, A and A are congruent.
(ii) Let us take arbitrary real symmetric n-square matrices A and B that are congruent. We have to show that B and A are congruent.
Since A and B are congruent, there exists a real invertible matrix S such that $A = SBS^T$. It follows that $(S^T)^{-1} = (S^{-1})^T$, and $B = (S^{-1})A(S^{-1})^T$, and hence $B = RAR^T$, where $R \equiv S^{-1}$. Since S is a real invertible matrix, $R\ (= S^{-1})$ is a real invertible matrix, and hence R is a real invertible matrix. Thus B and A are congruent.
(iii) Let us take any real symmetric n-square matrices A, B, C. Suppose that A and B are congruent. Suppose that B and C are congruent. We have to show that A and C are congruent.
Since A and B are congruent, there exists a real invertible matrix S such that $A = SBS^T$. Since B and C are congruent, there exists a real invertible matrix R such that $B = RCR^T$.
It follows that $A = S(RCR^T)S^T = (SR)C(SR)^T$, and hence $A = (SR)C(SR)^T$. Since S, R are real invertible matrices, SR is also a real invertible matrix. Thus A and C are congruent.
Hence, congruence is an equivalence relation. ∎
4.2.16 Note Let A be an n-square real symmetric matrix.
Since $A^* = (\bar{A})^T = A^T = A$, we have $A^* = A$. This shows that A is Hermitian, and hence by 3.3.26, its eigenvalues are real numbers. Let $\underbrace{\mu_1, \ldots, \mu_r}_{r}$ be the positive distinct eigenvalues of A, and let $\underbrace{-\mu_{r+1}, \ldots, -\mu_{r+s}}_{s}$ be the negative distinct eigenvalues of A.
By 4.2.12, there exists a real orthogonal matrix P such that
$$PAP^T = \mu_1 I_{k_1} \oplus \cdots \oplus \mu_r I_{k_r} \oplus (-\mu_{r+1}) I_{k_{r+1}} \oplus \cdots \oplus (-\mu_{r+s}) I_{k_{r+s}} \oplus 0I_{n-(k_1+\cdots+k_{r+s})}.$$
Put
$$D \equiv \frac{1}{\sqrt{\mu_1}} I_{k_1} \oplus \cdots \oplus \frac{1}{\sqrt{\mu_r}} I_{k_r} \oplus \frac{1}{\sqrt{\mu_{r+1}}} I_{k_{r+1}} \oplus \cdots \oplus \frac{1}{\sqrt{\mu_{r+s}}} I_{k_{r+s}} \oplus 1I_{n-(k_1+\cdots+k_{r+s})}.$$
It follows that
$$D^T = \left(\frac{1}{\sqrt{\mu_1}} I_{k_1} \oplus \cdots \oplus \frac{1}{\sqrt{\mu_r}} I_{k_r} \oplus \frac{1}{\sqrt{\mu_{r+1}}} I_{k_{r+1}} \oplus \cdots \oplus \frac{1}{\sqrt{\mu_{r+s}}} I_{k_{r+s}} \oplus 1I_{n-(k_1+\cdots+k_{r+s})}\right)^T$$
$$= \frac{1}{\sqrt{\mu_1}} I_{k_1} \oplus \cdots \oplus \frac{1}{\sqrt{\mu_r}} I_{k_r} \oplus \frac{1}{\sqrt{\mu_{r+1}}} I_{k_{r+1}} \oplus \cdots \oplus \frac{1}{\sqrt{\mu_{r+s}}} I_{k_{r+s}} \oplus 1I_{n-(k_1+\cdots+k_{r+s})},$$
and hence
$$(DP)A(DP)^T = D(PAP^T)D^T$$
$$= \underbrace{\frac{1}{\sqrt{\mu_1}}\,\mu_1\,\frac{1}{\sqrt{\mu_1}} I_{k_1} \oplus \cdots}_{r} \oplus \underbrace{\frac{1}{\sqrt{\mu_{r+1}}}\,(-\mu_{r+1})\,\frac{1}{\sqrt{\mu_{r+1}}} I_{k_{r+1}} \oplus \cdots}_{s} \oplus (1 \cdot 0 \cdot 1) I_{n-(k_1+\cdots+k_{r+s})}$$
$$= 1I_{k_1} \oplus \cdots \oplus 1I_{k_r} \oplus (-1)I_{k_{r+1}} \oplus \cdots \oplus (-1)I_{k_{r+s}} \oplus 0I_{n-(k_1+\cdots+k_{r+s})} = 1I_l \oplus (-1)I_m \oplus 0I_{n-(l+m)},$$
where $l \equiv k_1 + \cdots + k_r$, $m \equiv k_{r+1} + \cdots + k_{r+s}$.
Thus
$$RAR^T = 1I_l \oplus (-1)I_m \oplus 0I_{n-(l+m)},$$
where $R \equiv DP$. Since
$$D = \frac{1}{\sqrt{\mu_1}} I_{k_1} \oplus \cdots \oplus \frac{1}{\sqrt{\mu_r}} I_{k_r} \oplus \frac{1}{\sqrt{\mu_{r+1}}} I_{k_{r+1}} \oplus \cdots \oplus \frac{1}{\sqrt{\mu_{r+s}}} I_{k_{r+s}} \oplus 1I_{n-(k_1+\cdots+k_{r+s})},$$
D is a real invertible matrix. Since P is a real orthogonal matrix, P is a real invertible matrix, and $P^{-1} = P^T$. Since D, P are real invertible matrices, $R\ (= DP)$ is a real invertible matrix, and hence R is a real invertible matrix. Since $1I_l \oplus (-1)I_m \oplus 0I_{n-(l+m)}\ (= RAR^T)$ is a real symmetric matrix, $RAR^T$ is a real symmetric matrix. Now, since R is a real invertible matrix, A and $RAR^T\ (= 1I_l \oplus (-1)I_m \oplus 0I_{n-(l+m)})$ are congruent, and hence A and $1I_l \oplus (-1)I_m \oplus 0I_{n-(l+m)}$ are congruent. Next, by 4.2.15, $1I_l \oplus (-1)I_m \oplus 0I_{n-(l+m)}$ is a member of the congruence class of A. It is clear that $l + m$ is the rank of A.
Now we want to show that l and m are unique.
To this end, suppose that
$$1I_r \oplus (-1)I_s \oplus 0I_{n-(r+s)} \quad\text{and}\quad 1I_{r'} \oplus (-1)I_{s'} \oplus 0I_{n-(r'+s')}$$
are congruent. We have to show that $r = r'$.
Suppose to the contrary that $r < r'$. We seek a contradiction.
Since
$$1I_{r'} \oplus (-1)I_{s'} \oplus 0I_{n-(r'+s')} \quad\text{and}\quad 1I_r \oplus (-1)I_s \oplus 0I_{n-(r+s)}$$
are congruent, there exists a real invertible matrix S such that
$$1I_{r'} \oplus (-1)I_{s'} \oplus 0I_{n-(r'+s')} = S\left(1I_r \oplus (-1)I_s \oplus 0I_{n-(r+s)}\right)S^T.$$
Since S is invertible, by 4.2.14, $S^T$ is invertible, and hence
$$r' + s' = \mathrm{rank}\left(1I_{r'} \oplus (-1)I_{s'} \oplus 0I_{n-(r'+s')}\right)$$
$$= \underbrace{\mathrm{rank}\left(S\left(1I_r \oplus (-1)I_s \oplus 0I_{n-(r+s)}\right)S^T\right) = \mathrm{rank}\left(1I_r \oplus (-1)I_s \oplus 0I_{n-(r+s)}\right)} = r + s.$$
Thus $r' + s' = r + s$. Now, since $r < r'$, we have $s' < s$. $(*)$
Observe that the set
$$U \equiv \left\{ [\underbrace{0, \ldots, 0}_{r}, \underbrace{y_1, \ldots, y_s}_{s}, \underbrace{0, \ldots, 0}_{n-(r+s)}]^T : y_1, \ldots, y_s \in \mathbb{R} \right\}$$
is an s-dimensional subspace of the real inner product space $\mathbb{R}^n$, where $\mathbb{R}^n$ denotes the collection of all $n \times 1$ column matrices with real entries. Also
$$W \equiv \left\{ [\underbrace{x_1, \ldots, x_{r'}}_{r'}, \underbrace{0, \ldots, 0}_{s'}, \underbrace{z_1, \ldots, z_{n-(r'+s')}}_{n-(r'+s')}]^T : x_1, \ldots, x_{r'}, z_1, \ldots, z_{n-(r'+s')} \in \mathbb{R} \right\}$$
is an $(n - s')$-dimensional subspace of the real inner product space $\mathbb{R}^n$.
Observe that, for every nonzero $u \equiv [\underbrace{0, \ldots, 0}_{r}, \underbrace{y_1, \ldots, y_s}_{s}, \underbrace{0, \ldots, 0}_{n-(r+s)}]^T \in U$,
$$\left\langle \left(1I_r \oplus (-1)I_s \oplus 0I_{n-(r+s)}\right)u,\ u \right\rangle = u^T\left(\left(1I_r \oplus (-1)I_s \oplus 0I_{n-(r+s)}\right)u\right)$$
$$= [\underbrace{0, \ldots, 0}_{r}, \underbrace{y_1, \ldots, y_s}_{s}, \underbrace{0, \ldots, 0}_{n-(r+s)}]\ [\underbrace{0, \ldots, 0}_{r}, \underbrace{(-1)y_1, \ldots, (-1)y_s}_{s}, \underbrace{0, \ldots, 0}_{n-(r+s)}]^T$$
$$= \underbrace{0 + \cdots + 0}_{r} + \underbrace{\left(-(y_1)^2\right) + \cdots + \left(-(y_s)^2\right)}_{s} + \underbrace{0 + \cdots + 0}_{n-(r+s)} = -\left((y_1)^2 + \cdots + (y_s)^2\right) < 0,$$
so for every nonzero $u \in U$, $\left\langle \left(1I_r \oplus (-1)I_s \oplus 0I_{n-(r+s)}\right)u,\ u \right\rangle$ is negative.
Observe that, for every $w \equiv [\underbrace{x_1, \ldots, x_{r'}}_{r'}, \underbrace{0, \ldots, 0}_{s'}, \underbrace{z_1, \ldots, z_{n-(r'+s')}}_{n-(r'+s')}]^T \in W$,
$$\left\langle \left(1I_{r'} \oplus (-1)I_{s'} \oplus 0I_{n-(r'+s')}\right)w,\ w \right\rangle = w^T\left(\left(1I_{r'} \oplus (-1)I_{s'} \oplus 0I_{n-(r'+s')}\right)w\right)$$
$$= [\underbrace{x_1, \ldots, x_{r'}}_{r'}, \underbrace{0, \ldots, 0}_{s'}, \underbrace{z_1, \ldots, z_{n-(r'+s')}}_{n-(r'+s')}]\ [\underbrace{x_1, \ldots, x_{r'}}_{r'}, \underbrace{0, \ldots, 0}_{s'}, \underbrace{0, \ldots, 0}_{n-(r'+s')}]^T$$
$$= \underbrace{(x_1)^2 + \cdots + (x_{r'})^2}_{r'} + \underbrace{0 + \cdots + 0}_{s'} + \underbrace{0 + \cdots + 0}_{n-(r'+s')} = (x_1)^2 + \cdots + (x_{r'})^2 \geq 0,$$
so for every $w \in W$,
$$\left\langle \left(1I_r \oplus (-1)I_s \oplus 0I_{n-(r+s)}\right)(S^Tw),\ S^Tw \right\rangle = (S^Tw)^T\left(\left(1I_r \oplus (-1)I_s \oplus 0I_{n-(r+s)}\right)(S^Tw)\right)$$
$$= w^T\left(\left(S\left(1I_r \oplus (-1)I_s \oplus 0I_{n-(r+s)}\right)S^T\right)w\right) = \left\langle \left(S\left(1I_r \oplus (-1)I_s \oplus 0I_{n-(r+s)}\right)S^T\right)w,\ w \right\rangle$$
$$= \underbrace{\left\langle \left(1I_{r'} \oplus (-1)I_{s'} \oplus 0I_{n-(r'+s')}\right)w,\ w \right\rangle \geq 0}.$$
Thus for every $v \in \{S^Tw : w \in W\}$,
$$\left\langle \left(1I_r \oplus (-1)I_s \oplus 0I_{n-(r+s)}\right)v,\ v \right\rangle \geq 0.$$
Further, we have seen that for every nonzero $u \in U$,
$$\left\langle \left(1I_r \oplus (-1)I_s \oplus 0I_{n-(r+s)}\right)u,\ u \right\rangle < 0.$$
It follows that $\{S^Tw : w \in W\} \cap U = \{0\}$.
Clearly, $\{S^Tw : w \in W\}$ is an $(n - s')$-dimensional real vector space.
Proof Since S is invertible, $S^T$ is invertible. The map $T : x \mapsto S^Tx$ from the real vector space $\mathbb{R}^n$ to $\mathbb{R}^n$ is a linear transformation. Since $S^T$ is invertible, T is one-to-one, and hence
$$\underbrace{\dim\left(\{S^Tw : w \in W\}\right) = \dim(W)} = n - s'.$$
Thus we have shown that $\{S^Tw : w \in W\}$ is an $(n - s')$-dimensional real vector space. ∎
Next, since $\dim(U) = s$, we have
$$n = \dim(\mathbb{R}^n) \geq \dim\left(\{S^Tw : w \in W\} + U\right)$$
$$= \dim\left(\{S^Tw : w \in W\}\right) + \dim(U) - \dim\left(\{S^Tw : w \in W\} \cap U\right)$$
$$= \dim\left(\{S^Tw : w \in W\}\right) + \dim(U) - \dim(\{0\}) = \dim\left(\{S^Tw : w \in W\}\right) + \dim(U) - 0$$
$$= \dim\left(\{S^Tw : w \in W\}\right) + \dim(U) = (n - s') + \dim(U) = (n - s') + s = n + (s - s'),$$
and hence $n \geq n + (s - s')$. It follows that $s \leq s'$. This contradicts $(*)$. By the same argument with the roles of the two matrices interchanged, $r' < r$ is also impossible.
Thus we have shown that $r = r'$.
Finally, we have to show that $s = s'$.
Since
$$1I_r \oplus (-1)I_s \oplus 0I_{n-(r+s)} \quad\text{and}\quad 1I_{r'} \oplus (-1)I_{s'} \oplus 0I_{n-(r'+s')}$$
are congruent, there exists a real invertible matrix S such that
$$1I_{r'} \oplus (-1)I_{s'} \oplus 0I_{n-(r'+s')} = S\left(1I_r \oplus (-1)I_s \oplus 0I_{n-(r+s)}\right)S^T.$$
Since S is invertible, by 4.2.14, $S^T$ is invertible, and hence
$$r' + s' = \mathrm{rank}\left(1I_{r'} \oplus (-1)I_{s'} \oplus 0I_{n-(r'+s')}\right)$$
$$= \underbrace{\mathrm{rank}\left(S\left(1I_r \oplus (-1)I_s \oplus 0I_{n-(r+s)}\right)S^T\right) = \mathrm{rank}\left(1I_r \oplus (-1)I_s \oplus 0I_{n-(r+s)}\right)} = r + s.$$
Thus $r' + s' = r + s$. Now, since $r = r'$, we have $s = s'$. ∎
4.2.17 Conclusion Let A be an n-square real symmetric matrix. There exist a real invertible matrix R, and two nonnegative integers r and s such that
$$RAR^T = 1I_r \oplus (-1)I_s \oplus 0I_{n-(r+s)}.$$
Also, r and s are unique. Further, $\mathrm{rank}(A) = r + s$.
The integer $(r - s)$ is called the signature of A, and is denoted by $\mathrm{sg}(A)$.
Thus there exists a real invertible matrix R such that
$$RAR^T = 1I_{\frac{\mathrm{rank}(A) + \mathrm{sg}(A)}{2}} \oplus (-1)I_{\frac{\mathrm{rank}(A) - \mathrm{sg}(A)}{2}} \oplus 0I_{n-\mathrm{rank}(A)}.$$
This result is known as Sylvester's law.
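The integers r and s of 4.2.17 can be computed without finding eigenvalues, by diagonalizing A through paired row and column operations (each pair is a congruence $M \mapsto EME^T$ with E invertible). The following is a minimal sketch of that idea, not the method of the text: the function name `inertia` and its pivoting strategy are assumptions made for illustration, and exact rational arithmetic is used so that the signs of the diagonal entries are reliable.

```python
from fractions import Fraction

def inertia(A):
    """Return (r, s, z): numbers of +1, -1 and 0 in the canonical form of A.

    A is a real symmetric matrix with rational entries, given as a list of
    rows. It is reduced to diagonal form by congruences M -> E M E^T.
    """
    n = len(A)
    M = [[Fraction(x) for x in row] for row in A]

    def add_multiple(dst, src, c):
        # Row operation R_dst += c * R_src, then the matching column operation.
        for j in range(n):
            M[dst][j] += c * M[src][j]
        for i in range(n):
            M[i][dst] += c * M[i][src]

    def swap(p, q):
        # Swap rows p, q and then columns p, q.
        M[p], M[q] = M[q], M[p]
        for row in M:
            row[p], row[q] = row[q], row[p]

    for k in range(n):
        if M[k][k] == 0:
            # Prefer a nonzero diagonal entry further down as the pivot.
            j = next((j for j in range(k + 1, n) if M[j][j] != 0), None)
            if j is not None:
                swap(k, j)
            else:
                # Remaining diagonal is zero: use an off-diagonal entry.
                j = next((j for j in range(k + 1, n) if M[k][j] != 0), None)
                if j is None:
                    continue              # row and column k are already zero
                add_multiple(k, j, 1)     # pivot becomes 2 * M[k][j]
        for i in range(k + 1, n):
            if M[i][k] != 0:
                add_multiple(i, k, -M[i][k] / M[k][k])

    diag = [M[i][i] for i in range(n)]
    r = sum(d > 0 for d in diag)
    s = sum(d < 0 for d in diag)
    return r, s, n - r - s

example = [[1, Fraction(1, 2), 1],
           [Fraction(1, 2), 2, 2],
           [1, 2, 2]]
r, s, z = inertia(example)
print("rank =", r + s, "signature =", r - s)
```

By the uniqueness part of 4.2.17, the result does not depend on which sequence of congruences is used, which is why the choice of pivots here is free.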
4.2.18 Theorem Let V be an n-dimensional real inner product space. Let $S : V \to V$ be a linear transformation. Let $v \in V$. Then there exists a unique $w \in V$ such that
$$u \in V \Rightarrow \langle u, w\rangle = \langle S(u), v\rangle.$$
We denote w by $S^T(v)$. Thus $S^T : V \to V$, and for every $u, v \in V$, $\langle u, S^T(v)\rangle = \langle S(u), v\rangle$. Also, $S^T : V \to V$ is linear.
Proof Existence: Since V is an n-dimensional real inner product space, there exists an orthonormal basis $\{u_1, \ldots, u_n\}$ of V. Put
$$w \equiv \langle S(u_1), v\rangle u_1 + \cdots + \langle S(u_n), v\rangle u_n.$$
Let us fix an arbitrary $u \equiv \sum_{i=1}^n a_iu_i$. We have to show that
$$\left\langle \sum_{i=1}^n a_iu_i,\ \sum_{j=1}^n \langle S(u_j), v\rangle u_j \right\rangle = \left\langle S\left(\sum_{i=1}^n a_iu_i\right), v \right\rangle.$$
$$\mathrm{LHS} = \left\langle \sum_{i=1}^n a_iu_i,\ \sum_{j=1}^n \langle S(u_j), v\rangle u_j \right\rangle = \sum_{i,j} a_i \langle S(u_j), v\rangle \langle u_i, u_j\rangle = \sum_{i,j} a_i \langle S(u_j), v\rangle \delta_{ij}$$
$$= \sum_{i=1}^n a_i \langle S(u_i), v\rangle = \left\langle \sum_{i=1}^n a_iS(u_i), v \right\rangle = \left\langle S\left(\sum_{i=1}^n a_iu_i\right), v \right\rangle = \mathrm{RHS}.$$
Uniqueness: Suppose that there exist $w_1, w_2 \in V$ such that
$$u \in V \Rightarrow \langle u, w_1\rangle = \langle S(u), v\rangle, \text{ and } \langle u, w_2\rangle = \langle S(u), v\rangle.$$
We have to show that $w_1 = w_2$, that is, $\langle w_1 - w_2, w_1 - w_2\rangle = 0$. Here
$$u \in V \Rightarrow \langle u, w_1\rangle = \langle u, w_2\rangle,$$
so for every $u \in V$, $\langle u, w_1 - w_2\rangle = 0$. It follows that $\langle w_1 - w_2, w_1 - w_2\rangle = 0$.
Linearity: Let us take arbitrary $v_1, v_2 \in V$. Let a, b be arbitrary real numbers. We have to show that
$$S^T(av_1 + bv_2) = aS^T(v_1) + bS^T(v_2).$$
It suffices to show that for every $u \in V$,
$$\langle u, S^T(av_1 + bv_2)\rangle = \langle u, aS^T(v_1) + bS^T(v_2)\rangle.$$
To this end, let us fix an arbitrary $u \in V$. We have to show that
$$\langle u, S^T(av_1 + bv_2)\rangle = \langle u, aS^T(v_1) + bS^T(v_2)\rangle.$$
$$\mathrm{LHS} = \langle u, S^T(av_1 + bv_2)\rangle = \langle S(u), av_1 + bv_2\rangle = a\langle S(u), v_1\rangle + b\langle S(u), v_2\rangle$$
$$= a\langle u, S^T(v_1)\rangle + b\langle u, S^T(v_2)\rangle = \langle u, aS^T(v_1) + bS^T(v_2)\rangle = \mathrm{RHS}.$$
∎
Definition Let V be an n-dimensional real inner product space. Let $S : V \to V$ be a linear transformation. By 4.2.18, $S^T : V \to V$ is a linear transformation such that for all $u, v \in V$, $\langle u, S^T(v)\rangle = \langle S(u), v\rangle$. Here $S^T$ is called the transpose of S.
4.2.19 Theorem Let V be an n-dimensional real inner product space. Let $S : V \to V$ be a linear transformation. Then $(S^T)^T = S$.
Proof Let us take an arbitrary $v \in V$. We have to show that
$$(S^T)^T(v) = S(v).$$
To this end, let us take an arbitrary $u \in V$. It suffices to show that
$$\langle u, (S^T)^T(v)\rangle = \langle u, S(v)\rangle.$$
$$\mathrm{LHS} = \langle u, (S^T)^T(v)\rangle = \langle S^T(u), v\rangle = \langle v, S^T(u)\rangle = \langle S(v), u\rangle = \langle u, S(v)\rangle = \mathrm{RHS}.$$
∎
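4.2.20 Theorem Let V be an n-dimensional real inner product space. Let $R : V \to V$ and $S : V \to V$ be linear transformations. Let $\lambda, \mu$ be any real numbers. Then $(\lambda R + \mu S)^T = \lambda R^T + \mu S^T$.
Proof Let us take an arbitrary $v \in V$. We have to show that
$$(\lambda R + \mu S)^T(v) = (\lambda R^T + \mu S^T)(v),$$
that is,
$$(\lambda R + \mu S)^T(v) = \lambda R^T(v) + \mu S^T(v).$$
To this end, let us take an arbitrary $u \in V$. It suffices to show that
$$\langle u, (\lambda R + \mu S)^T(v)\rangle = \langle u, \lambda R^T(v) + \mu S^T(v)\rangle.$$
$$\mathrm{LHS} = \langle u, (\lambda R + \mu S)^T(v)\rangle = \langle (\lambda R + \mu S)(u), v\rangle = \langle \lambda R(u) + \mu S(u), v\rangle = \lambda\langle R(u), v\rangle + \mu\langle S(u), v\rangle$$
$$= \lambda\langle u, R^T(v)\rangle + \mu\langle u, S^T(v)\rangle = \langle u, \lambda R^T(v) + \mu S^T(v)\rangle = \mathrm{RHS}.$$
∎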
4.2.21 Theorem Let V be an n-dimensional real inner product space. Let $R : V \to V$ and $S : V \to V$ be linear transformations. Then $(RS)^T = S^TR^T$.
Proof Let us take an arbitrary $v \in V$. We have to show that
$$(RS)^T(v) = (S^TR^T)(v),$$
that is,
$$(RS)^T(v) = S^T(R^T(v)).$$
To this end, let us take an arbitrary $u \in V$. It suffices to show that
$$\langle u, (RS)^T(v)\rangle = \langle u, S^T(R^T(v))\rangle.$$
$$\mathrm{LHS} = \langle u, (RS)^T(v)\rangle = \langle (RS)(u), v\rangle = \langle R(S(u)), v\rangle = \langle S(u), R^T(v)\rangle = \langle u, S^T(R^T(v))\rangle = \mathrm{RHS}.$$
∎
4.2.22 Theorem Let V be an n-dimensional real inner product space. Let $S : V \to V$ be a linear transformation. Let $\{v_1, \ldots, v_n\}$ be an orthonormal basis of V. Let $[a_{ij}]$ be the matrix of S relative to the basis $\{v_1, \ldots, v_n\}$, in the sense that
$$S(v_1) = a_{11}v_1 + a_{21}v_2 + \cdots + a_{n1}v_n \left(= \sum_{i=1}^n a_{i1}v_i\right),$$
$$S(v_2) = a_{12}v_1 + a_{22}v_2 + \cdots + a_{n2}v_n,$$
$$\vdots$$
$$S(v_n) = a_{1n}v_1 + a_{2n}v_2 + \cdots + a_{nn}v_n.$$
In short, $S(v_j) = \sum_{i=1}^n a_{ij}v_i$. Then the matrix of $S^T$ relative to the basis $\{v_1, \ldots, v_n\}$ is $[b_{ij}]$, where $b_{ij} = a_{ji}$. In short, $S^T(v_j) = \sum_{i=1}^n b_{ij}v_i$.
Proof By the proof of 4.2.18,
$$S^T(v_1) = \langle S(v_1), v_1\rangle v_1 + \cdots + \langle S(v_n), v_1\rangle v_n,$$
$$S^T(v_2) = \langle S(v_1), v_2\rangle v_1 + \cdots + \langle S(v_n), v_2\rangle v_n,$$
$$\vdots$$
$$S^T(v_n) = \langle S(v_1), v_n\rangle v_1 + \cdots + \langle S(v_n), v_n\rangle v_n.$$
Since
$$S^T(v_1) = \sum_{i=1}^n \langle S(v_i), v_1\rangle v_i = \sum_{i=1}^n \langle a_{1i}v_1 + a_{2i}v_2 + \cdots + a_{ni}v_n, v_1\rangle v_i$$
$$= \sum_{i=1}^n \left(a_{1i}\langle v_1, v_1\rangle + a_{2i}\langle v_2, v_1\rangle + \cdots + a_{ni}\langle v_n, v_1\rangle\right)v_i$$
$$= \sum_{i=1}^n \left(a_{1i}1 + a_{2i}0 + \cdots + a_{ni}0\right)v_i = \sum_{i=1}^n a_{1i}v_i = a_{11}v_1 + a_{12}v_2 + \cdots + a_{1n}v_n,$$
we have
$$S^T(v_1) = a_{11}v_1 + a_{12}v_2 + \cdots + a_{1n}v_n \left(= \sum_{i=1}^n a_{1i}v_i\right).$$
Similarly,
$$S^T(v_2) = a_{21}v_1 + a_{22}v_2 + \cdots + a_{2n}v_n,$$
etc. In short, $S^T(v_j) = \sum_{i=1}^n a_{ji}v_i$. If the matrix of $S^T$ relative to the basis $\{v_1, \ldots, v_n\}$ is $[b_{ij}]$, then $b_{ij} = a_{ji}$. ∎
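The content of 4.2.22 can be checked concretely in $\mathbb{R}^3$ with the standard orthonormal basis, where the operator is given by its matrix and the transpose operator by the transposed matrix. This is only an illustrative sketch: the matrix S, the vectors u and v, and the helper names `apply` and `inner` are arbitrary choices, not taken from the text.

```python
def apply(M, x):
    """Matrix-vector product; M is a list of rows."""
    return [sum(m * xj for m, xj in zip(row, x)) for row in M]

def inner(x, y):
    """Standard inner product on R^n."""
    return sum(a * b for a, b in zip(x, y))

S = [[1, 2, 0],
     [0, 3, 4],
     [5, 0, 6]]                       # entries a_ij, chosen arbitrarily
ST = [list(col) for col in zip(*S)]   # entries b_ij = a_ji

u = [1, -2, 3]
v = [4, 0, -1]
lhs = inner(u, apply(ST, v))          # <u, S^T(v)>
rhs = inner(apply(S, u), v)           # <S(u), v>
print(lhs, rhs)
```

The two printed numbers agree, as the defining identity $\langle u, S^T(v)\rangle = \langle S(u), v\rangle$ requires.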
Definition Let V be an n-dimensional real inner product space. Let $S : V \to V$ be a linear transformation. If $S^T = S$, then we say that S is symmetric.
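4.2.23 Theorem Let V be an n-dimensional real inner product space. Let $S : V \to V$ be a linear transformation. Let S be symmetric. Let $\{e_1, \ldots, e_n\}$ be any orthonormal basis of V. Let $[a_{ij}]$ be the matrix of S relative to $\{e_1, \ldots, e_n\}$, that is, $S(e_j) = \sum_{i=1}^n a_{ij}e_i$. Let $Q : v \mapsto \langle S(v), v\rangle$ be a function from V to $\mathbb{R}$. Let $v \equiv \sum_{i=1}^n x_ie_i$. Then
$$Q\left(\sum_{i=1}^n x_ie_i\right) = a_{11}(x_1)^2 + \cdots + a_{nn}(x_n)^2 + 2\sum_{i<j} a_{ij}x_ix_j.$$
Here $a_{11}(x_1)^2 + \cdots + a_{nn}(x_n)^2 + 2\sum_{i<j} a_{ij}x_ix_j$ is called the real quadratic form of S.
Proof Since S is symmetric and $[a_{ij}]$ is the matrix of S relative to $\{e_1, \ldots, e_n\}$, by 4.2.22, $a_{ji} = a_{ij}$. It follows that
$$Q\left(\sum_{i=1}^n x_ie_i\right) = Q(v) = \langle S(v), v\rangle = \left\langle S\left(\sum_{i=1}^n x_ie_i\right), \sum_{i=1}^n x_ie_i \right\rangle$$
$$= \left\langle \sum_{i=1}^n x_iS(e_i), \sum_{j=1}^n x_je_j \right\rangle = \sum_{i=1}^n x_i\left\langle S(e_i), \sum_{j=1}^n x_je_j \right\rangle$$
$$= \sum_{i=1}^n x_i\left(\sum_{j=1}^n x_j\langle S(e_i), e_j\rangle\right) = \sum_{i=1}^n\left(\sum_{j=1}^n x_ix_j\langle S(e_i), e_j\rangle\right)$$
$$= \sum_{i=1}^n\left(\sum_{j=1}^n x_ix_j\left\langle \sum_{k=1}^n a_{ki}e_k, e_j \right\rangle\right) = \sum_{i=1}^n\left(\sum_{j=1}^n x_ix_j\left(\sum_{k=1}^n a_{ki}\langle e_k, e_j\rangle\right)\right)$$
$$= \sum_{i=1}^n\left(\sum_{j=1}^n x_ix_j\left(\sum_{k=1}^n a_{ki}\delta_{kj}\right)\right) = \sum_{i=1}^n\left(\sum_{j=1}^n x_ix_ja_{ji}\right)$$
$$= \sum_{i=1}^n\left(\sum_{j=1}^n a_{ji}x_ix_j\right) = \sum_{i=1}^n\left(\sum_{j=1}^n a_{ij}x_ix_j\right)$$
$$= a_{11}(x_1)^2 + \cdots + a_{nn}(x_n)^2 + 2\sum_{i<j} a_{ij}x_ix_j,$$
and hence
$$Q\left(\sum_{i=1}^n x_ie_i\right) = a_{11}(x_1)^2 + \cdots + a_{nn}(x_n)^2 + 2\sum_{i<j} a_{ij}x_ix_j.$$
∎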
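The identity of 4.2.23 says that the double sum $\sum_{i,j} a_{ij}x_ix_j$ and the expanded expression with diagonal terms plus twice the terms with $i < j$ give the same number when $[a_{ij}]$ is symmetric. The sketch below compares the two on one symmetric matrix and one vector; both are illustrative assumptions, as are the function names.

```python
from fractions import Fraction

def quad_form_double_sum(A, x):
    """sum over all i, j of a_ij * x_i * x_j, i.e. <S(v), v> in coordinates."""
    n = len(A)
    return sum(A[i][j] * x[i] * x[j] for i in range(n) for j in range(n))

def quad_form_expanded(A, x):
    """Diagonal terms plus twice the terms with i < j (A assumed symmetric)."""
    n = len(A)
    diagonal = sum(A[i][i] * x[i] ** 2 for i in range(n))
    off_diagonal = sum(A[i][j] * x[i] * x[j]
                       for i in range(n) for j in range(i + 1, n))
    return diagonal + 2 * off_diagonal

A = [[1, Fraction(1, 2), 1],
     [Fraction(1, 2), 2, 2],
     [1, 2, 2]]                  # an illustrative symmetric matrix
x = [1, 2, 3]                    # an illustrative coordinate vector

q1 = quad_form_double_sum(A, x)
q2 = quad_form_expanded(A, x)
print(q1, q2)
```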
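4.2.24 Example Let us consider the following real quadratic form:
$$(x_1)^2 + 2(x_2)^2 + 2(x_3)^2 + x_1x_2 + 2x_1x_3 + 4x_2x_3.$$
Here
$$(x_1)^2 + 2(x_2)^2 + 2(x_3)^2 + x_1x_2 + 2x_1x_3 + 4x_2x_3$$
$$= \left((x_1)^2 + \tfrac{1}{2}x_1x_2 + x_1x_3\right) + \left(\tfrac{1}{2}x_1x_2 + 2(x_2)^2 + 2x_2x_3\right) + \left(x_1x_3 + 2x_2x_3 + 2(x_3)^2\right)$$
$$= \left(x_1 + \tfrac{1}{2}x_2 + x_3\right)x_1 + \left(\tfrac{1}{2}x_1 + 2x_2 + 2x_3\right)x_2 + (x_1 + 2x_2 + 2x_3)x_3$$
$$= \begin{bmatrix} x_1 & x_2 & x_3 \end{bmatrix}\begin{bmatrix} 1 & \frac{1}{2} & 1 \\ \frac{1}{2} & 2 & 2 \\ 1 & 2 & 2 \end{bmatrix}\begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix} = \begin{bmatrix} x_1 & x_2 & x_3 \end{bmatrix} A \begin{bmatrix} x_1 & x_2 & x_3 \end{bmatrix}^T,$$
where
$$A \equiv \begin{bmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{bmatrix} = \begin{bmatrix} 1 & \frac{1}{2} & 1 \\ \frac{1}{2} & 2 & 2 \\ 1 & 2 & 2 \end{bmatrix}.$$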
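Observe that
$$\begin{bmatrix} 1 & \frac{1}{2} & 1 \\ \frac{1}{2} & 2 & 2 \\ 1 & 2 & 2 \end{bmatrix} \xrightarrow{R_2 \to R_2 - \frac{1}{2}R_1} \begin{bmatrix} 1 & \frac{1}{2} & 1 \\ 0 & \frac{7}{4} & \frac{3}{2} \\ 1 & 2 & 2 \end{bmatrix} \xrightarrow{C_2 \to C_2 - \frac{1}{2}C_1} \begin{bmatrix} 1 & 0 & 1 \\ 0 & \frac{7}{4} & \frac{3}{2} \\ 1 & \frac{3}{2} & 2 \end{bmatrix}.$$
Thus
$$\begin{bmatrix} 1 & 0 & 0 \\ -\frac{1}{2} & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}\begin{bmatrix} 1 & \frac{1}{2} & 1 \\ \frac{1}{2} & 2 & 2 \\ 1 & 2 & 2 \end{bmatrix}\begin{bmatrix} 1 & -\frac{1}{2} & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix} = \begin{bmatrix} 1 & 0 & 1 \\ 0 & \frac{7}{4} & \frac{3}{2} \\ 1 & \frac{3}{2} & 2 \end{bmatrix},$$
or
$$\begin{bmatrix} 1 & 0 & 0 \\ -\frac{1}{2} & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}\begin{bmatrix} 1 & \frac{1}{2} & 1 \\ \frac{1}{2} & 2 & 2 \\ 1 & 2 & 2 \end{bmatrix}\begin{bmatrix} 1 & 0 & 0 \\ -\frac{1}{2} & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}^T = \begin{bmatrix} 1 & 0 & 1 \\ 0 & \frac{7}{4} & \frac{3}{2} \\ 1 & \frac{3}{2} & 2 \end{bmatrix}.$$
Since
$$\begin{vmatrix} 1 & 0 & 0 \\ -\frac{1}{2} & 1 & 0 \\ 0 & 0 & 1 \end{vmatrix} = \begin{vmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{vmatrix} = 1 \neq 0,$$
it follows that
$$\begin{bmatrix} 1 & 0 & 0 \\ -\frac{1}{2} & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}$$
is invertible.
Observe that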
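$$\begin{bmatrix} 1 & 0 & 1 \\ 0 & \frac{7}{4} & \frac{3}{2} \\ 1 & \frac{3}{2} & 2 \end{bmatrix} \xrightarrow{R_3 \to R_3 - R_1} \begin{bmatrix} 1 & 0 & 1 \\ 0 & \frac{7}{4} & \frac{3}{2} \\ 0 & \frac{3}{2} & 1 \end{bmatrix} \xrightarrow{C_3 \to C_3 - C_1} \begin{bmatrix} 1 & 0 & 0 \\ 0 & \frac{7}{4} & \frac{3}{2} \\ 0 & \frac{3}{2} & 1 \end{bmatrix}.$$
Thus
$$\begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ -1 & 0 & 1 \end{bmatrix}\begin{bmatrix} 1 & 0 & 1 \\ 0 & \frac{7}{4} & \frac{3}{2} \\ 1 & \frac{3}{2} & 2 \end{bmatrix}\begin{bmatrix} 1 & 0 & -1 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix} = \begin{bmatrix} 1 & 0 & 0 \\ 0 & \frac{7}{4} & \frac{3}{2} \\ 0 & \frac{3}{2} & 1 \end{bmatrix},$$
or
$$\begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ -1 & 0 & 1 \end{bmatrix}\begin{bmatrix} 1 & 0 & 1 \\ 0 & \frac{7}{4} & \frac{3}{2} \\ 1 & \frac{3}{2} & 2 \end{bmatrix}\begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ -1 & 0 & 1 \end{bmatrix}^T = \begin{bmatrix} 1 & 0 & 0 \\ 0 & \frac{7}{4} & \frac{3}{2} \\ 0 & \frac{3}{2} & 1 \end{bmatrix}.$$
Since
$$\begin{vmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ -1 & 0 & 1 \end{vmatrix} = \begin{vmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{vmatrix} = 1 \neq 0,$$
it follows that
$$\begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ -1 & 0 & 1 \end{bmatrix}$$
is invertible.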
Observe that
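$$\begin{bmatrix} 1 & 0 & 0 \\ 0 & \frac{7}{4} & \frac{3}{2} \\ 0 & \frac{3}{2} & 1 \end{bmatrix} \xrightarrow{R_2 \to 2R_2} \begin{bmatrix} 1 & 0 & 0 \\ 0 & \frac{7}{2} & 3 \\ 0 & \frac{3}{2} & 1 \end{bmatrix} \xrightarrow{C_2 \to 2C_2} \begin{bmatrix} 1 & 0 & 0 \\ 0 & 7 & 3 \\ 0 & 3 & 1 \end{bmatrix}.$$
Thus
$$\begin{bmatrix} 1 & 0 & 0 \\ 0 & 2 & 0 \\ 0 & 0 & 1 \end{bmatrix}\begin{bmatrix} 1 & 0 & 0 \\ 0 & \frac{7}{4} & \frac{3}{2} \\ 0 & \frac{3}{2} & 1 \end{bmatrix}\begin{bmatrix} 1 & 0 & 0 \\ 0 & 2 & 0 \\ 0 & 0 & 1 \end{bmatrix} = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 7 & 3 \\ 0 & 3 & 1 \end{bmatrix},$$
or
$$\begin{bmatrix} 1 & 0 & 0 \\ 0 & 2 & 0 \\ 0 & 0 & 1 \end{bmatrix}\begin{bmatrix} 1 & 0 & 0 \\ 0 & \frac{7}{4} & \frac{3}{2} \\ 0 & \frac{3}{2} & 1 \end{bmatrix}\begin{bmatrix} 1 & 0 & 0 \\ 0 & 2 & 0 \\ 0 & 0 & 1 \end{bmatrix}^T = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 7 & 3 \\ 0 & 3 & 1 \end{bmatrix}.$$
Since
$$\begin{vmatrix} 1 & 0 & 0 \\ 0 & 2 & 0 \\ 0 & 0 & 1 \end{vmatrix} = 2\begin{vmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{vmatrix} = 2 \neq 0,$$
it follows that
$$\begin{bmatrix} 1 & 0 & 0 \\ 0 & 2 & 0 \\ 0 & 0 & 1 \end{bmatrix}$$
is invertible.
Observe that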
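$$\begin{bmatrix} 1 & 0 & 0 \\ 0 & 7 & 3 \\ 0 & 3 & 1 \end{bmatrix} \xrightarrow{R_2 \to R_2 - 2R_3} \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 1 \\ 0 & 3 & 1 \end{bmatrix} \xrightarrow{C_2 \to C_2 - 2C_3} \begin{bmatrix} 1 & 0 & 0 \\ 0 & -1 & 1 \\ 0 & 1 & 1 \end{bmatrix}.$$
Thus
$$\begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & -2 \\ 0 & 0 & 1 \end{bmatrix}\begin{bmatrix} 1 & 0 & 0 \\ 0 & 7 & 3 \\ 0 & 3 & 1 \end{bmatrix}\begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & -2 & 1 \end{bmatrix} = \begin{bmatrix} 1 & 0 & 0 \\ 0 & -1 & 1 \\ 0 & 1 & 1 \end{bmatrix},$$
or
$$\begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & -2 \\ 0 & 0 & 1 \end{bmatrix}\begin{bmatrix} 1 & 0 & 0 \\ 0 & 7 & 3 \\ 0 & 3 & 1 \end{bmatrix}\begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & -2 \\ 0 & 0 & 1 \end{bmatrix}^T = \begin{bmatrix} 1 & 0 & 0 \\ 0 & -1 & 1 \\ 0 & 1 & 1 \end{bmatrix}.$$
Since
$$\begin{vmatrix} 1 & 0 & 0 \\ 0 & 1 & -2 \\ 0 & 0 & 1 \end{vmatrix} = \begin{vmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{vmatrix} = 1 \neq 0,$$
it follows that
$$\begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & -2 \\ 0 & 0 & 1 \end{bmatrix}$$
is invertible.
Observe that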
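$$\begin{bmatrix} 1 & 0 & 0 \\ 0 & -1 & 1 \\ 0 & 1 & 1 \end{bmatrix} \xrightarrow{R_3 \to R_3 + R_2} \begin{bmatrix} 1 & 0 & 0 \\ 0 & -1 & 1 \\ 0 & 0 & 2 \end{bmatrix} \xrightarrow{C_3 \to C_3 + C_2} \begin{bmatrix} 1 & 0 & 0 \\ 0 & -1 & 0 \\ 0 & 0 & 2 \end{bmatrix}.$$
Thus
$$\begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 1 & 1 \end{bmatrix}\begin{bmatrix} 1 & 0 & 0 \\ 0 & -1 & 1 \\ 0 & 1 & 1 \end{bmatrix}\begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 1 \\ 0 & 0 & 1 \end{bmatrix} = \begin{bmatrix} 1 & 0 & 0 \\ 0 & -1 & 0 \\ 0 & 0 & 2 \end{bmatrix},$$
or
$$\begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 1 & 1 \end{bmatrix}\begin{bmatrix} 1 & 0 & 0 \\ 0 & -1 & 1 \\ 0 & 1 & 1 \end{bmatrix}\begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 1 & 1 \end{bmatrix}^T = \begin{bmatrix} 1 & 0 & 0 \\ 0 & -1 & 0 \\ 0 & 0 & 2 \end{bmatrix}.$$
Since
$$\begin{vmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 1 & 1 \end{vmatrix} = \begin{vmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{vmatrix} = 1 \neq 0,$$
it follows that
$$\begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 1 & 1 \end{bmatrix}$$
is invertible.
Observe that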
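\[
\begin{bmatrix} 1 & 0 & 0 \\ 0 & -1 & 0 \\ 0 & 0 & 2 \end{bmatrix}
\xrightarrow{R_3 \to \frac{1}{\sqrt{2}} R_3}
\begin{bmatrix} 1 & 0 & 0 \\ 0 & -1 & 0 \\ 0 & 0 & \sqrt{2} \end{bmatrix}
\xrightarrow{C_3 \to \frac{1}{\sqrt{2}} C_3}
\begin{bmatrix} 1 & 0 & 0 \\ 0 & -1 & 0 \\ 0 & 0 & 1 \end{bmatrix}.
\]
Thus
\[
\begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & \tfrac{1}{\sqrt{2}} \end{bmatrix}
\begin{bmatrix} 1 & 0 & 0 \\ 0 & -1 & 0 \\ 0 & 0 & 2 \end{bmatrix}
\begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & \tfrac{1}{\sqrt{2}} \end{bmatrix}
= \begin{bmatrix} 1 & 0 & 0 \\ 0 & -1 & 0 \\ 0 & 0 & 1 \end{bmatrix},
\]
or
\[
\begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & \tfrac{1}{\sqrt{2}} \end{bmatrix}
\begin{bmatrix} 1 & 0 & 0 \\ 0 & -1 & 0 \\ 0 & 0 & 2 \end{bmatrix}
\begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & \tfrac{1}{\sqrt{2}} \end{bmatrix}^{T}
= \begin{bmatrix} 1 & 0 & 0 \\ 0 & -1 & 0 \\ 0 & 0 & 1 \end{bmatrix}.
\]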
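Since
\[
\begin{vmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & \tfrac{1}{\sqrt{2}} \end{vmatrix}
= \frac{1}{\sqrt{2}} \begin{vmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{vmatrix} = \frac{1}{\sqrt{2}} \neq 0,
\]
it follows that
\[
\begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & \tfrac{1}{\sqrt{2}} \end{bmatrix}
\]
is invertible.
Observe that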
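\[
\begin{bmatrix} 1 & 0 & 0 \\ 0 & -1 & 0 \\ 0 & 0 & 1 \end{bmatrix}
\xrightarrow{R_{23}}
\begin{bmatrix} 1 & 0 & 0 \\ 0 & 0 & 1 \\ 0 & -1 & 0 \end{bmatrix}
\xrightarrow{C_{23}}
\begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & -1 \end{bmatrix}.
\]
Thus
\[
\begin{bmatrix} 1 & 0 & 0 \\ 0 & 0 & 1 \\ 0 & 1 & 0 \end{bmatrix}
\begin{bmatrix} 1 & 0 & 0 \\ 0 & -1 & 0 \\ 0 & 0 & 1 \end{bmatrix}
\begin{bmatrix} 1 & 0 & 0 \\ 0 & 0 & 1 \\ 0 & 1 & 0 \end{bmatrix}
= \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & -1 \end{bmatrix},
\]
or
\[
\begin{bmatrix} 1 & 0 & 0 \\ 0 & 0 & 1 \\ 0 & 1 & 0 \end{bmatrix}
\begin{bmatrix} 1 & 0 & 0 \\ 0 & -1 & 0 \\ 0 & 0 & 1 \end{bmatrix}
\begin{bmatrix} 1 & 0 & 0 \\ 0 & 0 & 1 \\ 0 & 1 & 0 \end{bmatrix}^{T}
= \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & -1 \end{bmatrix}.
\]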
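Since
\[
\begin{vmatrix} 1 & 0 & 0 \\ 0 & 0 & 1 \\ 0 & 1 & 0 \end{vmatrix}
= - \begin{vmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{vmatrix} = -1 \neq 0,
\]
it follows that
\[
\begin{bmatrix} 1 & 0 & 0 \\ 0 & 0 & 1 \\ 0 & 1 & 0 \end{bmatrix}
\]
is invertible.
From
\[
\begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & -1 \end{bmatrix},
\]
we find that $r = 2$ and $s = 1$. Hence the signature of the real quadratic form is $r - s\ (= 2 - 1 = 1)$.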
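If we collect the above results, we get
\[
R A R^{T} = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & -1 \end{bmatrix},
\]
where $R$ stands for
\[
\begin{aligned}
&\begin{bmatrix} 1 & 0 & 0 \\ 0 & 0 & 1 \\ 0 & 1 & 0 \end{bmatrix}
\begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & \tfrac{1}{\sqrt{2}} \end{bmatrix}
\begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 1 & 1 \end{bmatrix}
\begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & -2 \\ 0 & 0 & 1 \end{bmatrix}
\begin{bmatrix} 1 & 0 & 0 \\ 0 & 2 & 0 \\ 0 & 0 & 1 \end{bmatrix}
\begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ -1 & 0 & 1 \end{bmatrix}
\begin{bmatrix} 1 & 0 & 0 \\ -\tfrac{1}{2} & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix} \\
&= \begin{bmatrix} 1 & 0 & 0 \\ 0 & 0 & \tfrac{1}{\sqrt{2}} \\ 0 & 1 & 0 \end{bmatrix}
\begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 1 & 1 \end{bmatrix}
\begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & -2 \\ 0 & 0 & 1 \end{bmatrix}
\begin{bmatrix} 1 & 0 & 0 \\ 0 & 2 & 0 \\ 0 & 0 & 1 \end{bmatrix}
\begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ -1 & 0 & 1 \end{bmatrix}
\begin{bmatrix} 1 & 0 & 0 \\ -\tfrac{1}{2} & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix} \\
&= \begin{bmatrix} 1 & 0 & 0 \\ 0 & \tfrac{1}{\sqrt{2}} & \tfrac{1}{\sqrt{2}} \\ 0 & 1 & 0 \end{bmatrix}
\begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & -2 \\ 0 & 0 & 1 \end{bmatrix}
\begin{bmatrix} 1 & 0 & 0 \\ 0 & 2 & 0 \\ 0 & 0 & 1 \end{bmatrix}
\begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ -1 & 0 & 1 \end{bmatrix}
\begin{bmatrix} 1 & 0 & 0 \\ -\tfrac{1}{2} & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix} \\
&= \begin{bmatrix} 1 & 0 & 0 \\ 0 & \tfrac{1}{\sqrt{2}} & -\tfrac{1}{\sqrt{2}} \\ 0 & 1 & -2 \end{bmatrix}
\begin{bmatrix} 1 & 0 & 0 \\ 0 & 2 & 0 \\ 0 & 0 & 1 \end{bmatrix}
\begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ -1 & 0 & 1 \end{bmatrix}
\begin{bmatrix} 1 & 0 & 0 \\ -\tfrac{1}{2} & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix} \\
&= \begin{bmatrix} 1 & 0 & 0 \\ 0 & \sqrt{2} & -\tfrac{1}{\sqrt{2}} \\ 0 & 2 & -2 \end{bmatrix}
\begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ -1 & 0 & 1 \end{bmatrix}
\begin{bmatrix} 1 & 0 & 0 \\ -\tfrac{1}{2} & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix} \\
&= \begin{bmatrix} 1 & 0 & 0 \\ \tfrac{1}{\sqrt{2}} & \sqrt{2} & -\tfrac{1}{\sqrt{2}} \\ 2 & 2 & -2 \end{bmatrix}
\begin{bmatrix} 1 & 0 & 0 \\ -\tfrac{1}{2} & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}
= \begin{bmatrix} 1 & 0 & 0 \\ 0 & \sqrt{2} & -\tfrac{1}{\sqrt{2}} \\ 1 & 2 & -2 \end{bmatrix}.
\end{aligned}
\]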
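Thus
\[
\begin{bmatrix} 1 & 0 & 0 \\ 0 & \sqrt{2} & -\tfrac{1}{\sqrt{2}} \\ 1 & 2 & -2 \end{bmatrix}
\begin{bmatrix} 1 & \tfrac{1}{2} & 1 \\ \tfrac{1}{2} & 2 & 2 \\ 1 & 2 & 2 \end{bmatrix}
\begin{bmatrix} 1 & 0 & 0 \\ 0 & \sqrt{2} & -\tfrac{1}{\sqrt{2}} \\ 1 & 2 & -2 \end{bmatrix}^{T}
= \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & -1 \end{bmatrix}.
\]
Clearly,
\[
\begin{bmatrix} 1 & 0 & 0 \\ 0 & \sqrt{2} & -\tfrac{1}{\sqrt{2}} \\ 1 & 2 & -2 \end{bmatrix} \ (= R)
\]
is invertible. It follows that
\[
\begin{bmatrix} 1 & \tfrac{1}{2} & 1 \\ \tfrac{1}{2} & 2 & 2 \\ 1 & 2 & 2 \end{bmatrix}
= R^{-1} \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & -1 \end{bmatrix} \left( R^{T} \right)^{-1}
= R^{-1} \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & -1 \end{bmatrix} \left( R^{-1} \right)^{T},
\qquad
R = \begin{bmatrix} 1 & 0 & 0 \\ 0 & \sqrt{2} & -\tfrac{1}{\sqrt{2}} \\ 1 & 2 & -2 \end{bmatrix},
\]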
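and hence, with $R = \begin{bmatrix} 1 & 0 & 0 \\ 0 & \sqrt{2} & -\tfrac{1}{\sqrt{2}} \\ 1 & 2 & -2 \end{bmatrix}$,
\[
\begin{aligned}
(x_1)^2 + 2(x_2)^2 + 2(x_3)^2 + x_1x_2 + 2x_1x_3 + 4x_2x_3
&= [x_1\ x_2\ x_3] \begin{bmatrix} 1 & \tfrac{1}{2} & 1 \\ \tfrac{1}{2} & 2 & 2 \\ 1 & 2 & 2 \end{bmatrix} [x_1\ x_2\ x_3]^{T} \\
&= [x_1\ x_2\ x_3]\, R^{-1} \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & -1 \end{bmatrix} \left( R^{-1} \right)^{T} [x_1\ x_2\ x_3]^{T} \\
&= \left( [x_1\ x_2\ x_3]\, R^{-1} \right) \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & -1 \end{bmatrix} \left( [x_1\ x_2\ x_3]\, R^{-1} \right)^{T} \\
&= [y_1\ y_2\ y_3] \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & -1 \end{bmatrix} [y_1\ y_2\ y_3]^{T},
\end{aligned}
\]
where
\[
[y_1\ y_2\ y_3] \equiv [x_1\ x_2\ x_3]\, R^{-1}.
\]
It follows that
\[
[x_1\ x_2\ x_3] = [y_1\ y_2\ y_3] \begin{bmatrix} 1 & 0 & 0 \\ 0 & \sqrt{2} & -\tfrac{1}{\sqrt{2}} \\ 1 & 2 & -2 \end{bmatrix}
= \left[\, y_1 + y_3 \quad \sqrt{2}\,y_2 + 2y_3 \quad -\tfrac{1}{\sqrt{2}}\,y_2 - 2y_3 \,\right],
\]
or
\[
\left.
\begin{aligned}
x_1 &= y_1 + y_3 \\
x_2 &= \sqrt{2}\,y_2 + 2y_3 \\
x_3 &= -\tfrac{1}{\sqrt{2}}\,y_2 - 2y_3
\end{aligned}
\right\}.
\]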
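Also,
\[
\begin{aligned}
(x_1)^2 + 2(x_2)^2 + 2(x_3)^2 + x_1x_2 + 2x_1x_3 + 4x_2x_3
&= [y_1\ y_2\ y_3] \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & -1 \end{bmatrix} [y_1\ y_2\ y_3]^{T} \\
&= (y_1)^2 + (y_2)^2 - (y_3)^2,
\end{aligned}
\]
that is,
\[
(x_1)^2 + 2(x_2)^2 + 2(x_3)^2 + x_1x_2 + 2x_1x_3 + 4x_2x_3 = (y_1)^2 + (y_2)^2 - (y_3)^2,
\]
where
\[
\left.
\begin{aligned}
x_1 &= y_1 + y_3 \\
x_2 &= \sqrt{2}\,y_2 + 2y_3 \\
x_3 &= -\tfrac{1}{\sqrt{2}}\,y_2 - 2y_3
\end{aligned}
\right\}.
\]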
Verification: Here
\[
\begin{aligned}
\text{LHS} &= (x_1)^2 + 2(x_2)^2 + 2(x_3)^2 + x_1x_2 + 2x_1x_3 + 4x_2x_3 \\
&= (y_1 + y_3)^2 + 2\left( \sqrt{2}\,y_2 + 2y_3 \right)^2 + 2\left( -\tfrac{1}{\sqrt{2}}\,y_2 - 2y_3 \right)^2 \\
&\quad + (y_1 + y_3)\left( \sqrt{2}\,y_2 + 2y_3 \right) + 2(y_1 + y_3)\left( -\tfrac{1}{\sqrt{2}}\,y_2 - 2y_3 \right) + 4\left( \sqrt{2}\,y_2 + 2y_3 \right)\left( -\tfrac{1}{\sqrt{2}}\,y_2 - 2y_3 \right) \\
&= (y_1)^2 + (y_2)^2 (4 + 1 - 4) + (y_3)^2 (1 + 8 + 8 + 2 - 4 - 16) + y_1y_2 \left( \sqrt{2} - \sqrt{2} \right) \\
&\quad + y_1y_3 (2 + 2 - 4) + y_2y_3 \left( 8\sqrt{2} + 4\sqrt{2} + \sqrt{2} - \sqrt{2} - 12\sqrt{2} \right) \\
&= (y_1)^2 + (y_2)^2 - (y_3)^2 = \text{RHS}.
\end{aligned}
\]
Verified.
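The same verification can be repeated numerically. The following is only a sketch in Python (standard library, floating-point arithmetic); the helper names `matmul`, `transpose`, `phi`, and `x_from_y` are illustrative and not part of the text. It checks that $RAR^T$ is $\mathrm{diag}(1, 1, -1)$ and that the substitution $[x_1\ x_2\ x_3] = [y_1\ y_2\ y_3]R$ turns the form into $(y_1)^2 + (y_2)^2 - (y_3)^2$ at an arbitrarily chosen sample point.

```python
import math

# Matrix of the quadratic form x1^2 + 2x2^2 + 2x3^2 + x1x2 + 2x1x3 + 4x2x3.
A = [[1.0, 0.5, 1.0],
     [0.5, 2.0, 2.0],
     [1.0, 2.0, 2.0]]

s = math.sqrt(2.0)
# R as obtained above by multiplying the elementary matrices.
R = [[1.0, 0.0, 0.0],
     [0.0, s, -1.0 / s],
     [1.0, 2.0, -2.0]]

def matmul(X, Y):
    """Plain matrix product of two lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def transpose(X):
    return [list(row) for row in zip(*X)]

def phi(x):
    """Value of the quadratic form x A x^T."""
    return sum(A[i][j] * x[i] * x[j] for i in range(3) for j in range(3))

def x_from_y(y):
    """Row-vector substitution [x1 x2 x3] = [y1 y2 y3] R."""
    return [sum(y[k] * R[k][j] for k in range(3)) for j in range(3)]

D = matmul(matmul(R, A), transpose(R))   # should be diag(1, 1, -1)

y = [0.3, -1.2, 2.5]                     # arbitrary sample point
lhs = phi(x_from_y(y))
rhs = y[0] ** 2 + y[1] ** 2 - y[2] ** 2
```

A numerical check at one point is of course no substitute for the algebraic verification; it only guards against arithmetic slips in $R$.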
4.2.25 Conclusion Let $A \equiv [a_{ij}]$ be an n-square real symmetric matrix. Let
\[
\varphi(x_1, \ldots, x_n) \equiv \sum_{i,j} a_{ij} x_i x_j \ \left( = [x_1, \ldots, x_n]\, A\, [x_1, \ldots, x_n]^{T} \right)
\]
be a real quadratic form. Then there exists a real invertible matrix $C \equiv [c_{ij}]$ such that the transformation
\[
\left.
\begin{aligned}
x_1 &= c_{11} y_1 + \cdots + c_{1n} y_n \\
&\ \ \vdots \\
x_n &= c_{n1} y_1 + \cdots + c_{nn} y_n
\end{aligned}
\right\}
\]
reduces the form $\sum_{i,j} a_{ij} x_i x_j$ to the “normal form”
\[
(y_1)^2 + \cdots + (y_r)^2 - (y_{r+1})^2 - \cdots - (y_{r+s})^2.
\]
Definition If $r = 0$, then the normal form becomes
\[
-(y_1)^2 - \cdots - (y_s)^2,
\]
and $\varphi(x_1, \ldots, x_n) \le 0$ for every real $x_i$ $(i = 1, \ldots, n)$. In this case, we say that $\varphi$ is negative definite.
If $s = 0$, then the normal form becomes
\[
(y_1)^2 + \cdots + (y_r)^2,
\]
and $\varphi(x_1, \ldots, x_n) \ge 0$ for every real $x_i$ $(i = 1, \ldots, n)$. In this case, we say that $\varphi$ is positive definite.
By $\varphi$ is definite we mean that either $\varphi$ is negative definite or $\varphi$ is positive definite. If $\varphi$ is not definite, then we say that $\varphi$ is indefinite.
In the above example, the quadratic form is indefinite.
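The numbers $r$ and $s$ can be computed mechanically by the same device used in the worked example: each row operation is followed by the matching column operation, so the matrix stays congruent to the original. The sketch below is one possible implementation in Python with exact rational arithmetic; the function name `inertia` and its pivoting strategy are assumptions of this sketch, not the author's procedure. It diagonalizes without normalizing the diagonal to $\pm 1$, which does not affect the signs.

```python
from fractions import Fraction

def inertia(matrix):
    """Return (r, s): the numbers of positive and negative diagonal entries
    after diagonalizing a real symmetric matrix by congruence."""
    n = len(matrix)
    a = [[Fraction(v) for v in row] for row in matrix]

    def add_multiple(dst, src, c):
        # Row operation R_dst -> R_dst + c*R_src, then the same column operation.
        for j in range(n):
            a[dst][j] += c * a[src][j]
        for i in range(n):
            a[i][dst] += c * a[i][src]

    def swap(p, q):
        # Interchange rows p, q and then columns p, q.
        a[p], a[q] = a[q], a[p]
        for row in a:
            row[p], row[q] = row[q], row[p]

    for k in range(n):
        if a[k][k] == 0:
            diag = next((j for j in range(k + 1, n) if a[j][j] != 0), None)
            off = next((j for j in range(k + 1, n) if a[k][j] != 0), None)
            if diag is not None:
                swap(k, diag)                      # bring a nonzero diagonal entry up
            elif off is not None:
                add_multiple(k, off, Fraction(1))  # new pivot equals 2*a[k][off]
            else:
                continue                           # row and column k are already zero
        for i in range(k + 1, n):
            if a[i][k] != 0:
                add_multiple(i, k, -a[i][k] / a[k][k])

    r = sum(1 for k in range(n) if a[k][k] > 0)
    s = sum(1 for k in range(n) if a[k][k] < 0)
    return r, s

half = Fraction(1, 2)
example = inertia([[1, half, 1], [half, 2, 2], [1, 2, 2]])
```

For the matrix of the worked example this gives $r = 2$ and $s = 1$, so the form is indefinite in the sense of the definition above; $r = 0$ or $s = 0$ would correspond to the negative definite and positive definite cases.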
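Definition Let
\[
\left.
\begin{aligned}
\varphi &\equiv \sum a_{ij} x_i x_j \\
\psi &\equiv \sum b_{ij} x_i x_j
\end{aligned}
\right\}
\]
be a pair of real quadratic forms. For a parameter $\lambda \in \mathbb{C}$, the quadratic form $\sum (a_{ij} - \lambda b_{ij}) x_i x_j$ is denoted by $\varphi - \lambda\psi$. By the discriminant of $\varphi$ we mean $\det[a_{ij}]$. Similarly, the discriminant of $\psi$ is $\det[b_{ij}]$, and the discriminant of $\varphi - \lambda\psi$ is
\[
\det[a_{ij} - \lambda b_{ij}].
\]
Clearly, $\det[a_{ij} - \lambda b_{ij}]$ is a polynomial in $\lambda$. The polynomial equation
\[
\det[a_{ij} - \lambda b_{ij}] = 0
\]
is called the λ-equation of the pair of quadratic forms $\sum a_{ij} x_i x_j$ and $\sum b_{ij} x_i x_j$.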
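4.2.26 Theorem Let
\[
\left.
\begin{aligned}
\varphi &\equiv \sum a_{ij} x_i x_j \\
\psi &\equiv \sum b_{ij} x_i x_j
\end{aligned}
\right\}
\]
be a pair of real quadratic forms. Let $[b_{ij}]$ be invertible. Then all the roots of the λ-equation of $\varphi$ and $\psi$ are real.
Proof Let us denote $[a_{ij}]$ by $A$, and $[b_{ij}]$ by $B$. Now, $[a_{ij} - \lambda b_{ij}] = A - \lambda B$, and the λ-equation of $\varphi$ and $\psi$ becomes
\[
\det(A - \lambda B) = 0.
\]
Since $[b_{ij}]$ is invertible, $B^{-1}$ exists, and $\det(B) \neq 0$. Now,
\[
A - \lambda B = \left( AB^{-1} - \lambda I \right) B,
\]
and hence
\[
\det(A - \lambda B) = \det\left( \left( AB^{-1} - \lambda I \right) B \right) = \det\left( AB^{-1} - \lambda I \right) \det(B).
\]
Thus
\[
\det(A - \lambda B) = \det\left( AB^{-1} - \lambda I \right) \det(B).
\]
Since $\det(B) \neq 0$, every root of the λ-equation of $\varphi$ and $\psi$ is an eigenvalue of $AB^{-1}$. Since $B$ is real and symmetric, $B^{-1}$ is real and symmetric. Since $A$ is real and symmetric, the product $AB^{-1}$ is real, and $(AB^{-1})^{T} = B^{-1}A$; so $AB^{-1}$ is symmetric when $A$ and $B^{-1}$ commute, and then $(AB^{-1})^{*} = AB^{-1}$. In that case $AB^{-1}$ is Hermitian, and hence by 3.3.26, all the eigenvalues of $AB^{-1}$ are real. When $A$ and $B^{-1}$ do not commute, the same conclusion still holds if $\psi$ is positive definite, which is the case used from 4.3.8 onward: by 4.2.17 there is a real invertible matrix $C$ with $CBC^{T} = I_n$, so $B^{-1} = C^{T}C$ and $AB^{-1} = C^{-1}\left( CAC^{T} \right) C$ is similar to the real symmetric matrix $CAC^{T}$, whose eigenvalues are real. Now, since every root of the λ-equation of $\varphi$ and $\psi$ is an eigenvalue of $AB^{-1}$, every root of the λ-equation of $\varphi$ and $\psi$ is real. ∎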
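For a pair of binary forms the λ-equation is a quadratic and can be written out by hand. The sketch below (Python; the function name `lambda_equation_2x2` and the sample matrices are illustrative choices, with $B$ taken positive definite as in 4.3.8) expands $\det(A - \lambda B)$ for symmetric $2 \times 2$ matrices and checks that its discriminant is nonnegative, so both roots are real.

```python
import math

def lambda_equation_2x2(A, B):
    """Coefficients (p, q, c) with det(A - t*B) = p*t^2 + q*t + c
    for symmetric 2x2 matrices A and B."""
    (a11, a12), (_, a22) = A
    (b11, b12), (_, b22) = B
    p = b11 * b22 - b12 * b12                     # det(B)
    q = -(a11 * b22 + a22 * b11 - 2 * a12 * b12)
    c = a11 * a22 - a12 * a12                     # det(A)
    return p, q, c

def det_pencil(A, B, t):
    """det(A - t*B), computed directly for comparison."""
    m11 = A[0][0] - t * B[0][0]
    m12 = A[0][1] - t * B[0][1]
    m22 = A[1][1] - t * B[1][1]
    return m11 * m22 - m12 * m12

A = [[2.0, 1.0], [1.0, 3.0]]
B = [[2.0, 0.0], [0.0, 1.0]]      # positive definite (assumption of this example)

p, q, c = lambda_equation_2x2(A, B)
disc = q * q - 4 * p * c
roots = [(-q - math.sqrt(disc)) / (2 * p), (-q + math.sqrt(disc)) / (2 * p)]
```

Here the λ-equation is $2\lambda^2 - 8\lambda + 5 = 0$, with discriminant $24 > 0$.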
4.3 Application to Riemannian Geometry
4.3.1 Note Let $A \equiv [a_{ij}]$ be a symmetric n-square real matrix. Let
\[
\varphi(x_1, \ldots, x_n) \equiv \sum a_{ij} x_i x_j = [x_1, \ldots, x_n]\, A\, [x_1, \ldots, x_n]^{T}
\]
be a real quadratic form. Suppose that $a_{11} \neq 0$.
Clearly, $\varphi(x_1, \ldots, x_n) - \dfrac{1}{a_{11}} \left( \sum_{i=1}^{n} a_{1i} x_i \right)^2$ is a real quadratic form independent of $x_1$.
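Proof Observe that
\[
\begin{aligned}
\varphi(x_1, \ldots, x_n) - \frac{1}{a_{11}} \left( \sum_{i=1}^{n} a_{1i} x_i \right)^2
&= \varphi(x_1, \ldots, x_n) - \frac{1}{a_{11}} \left( \sum_{i=1}^{n} a_{1i} x_i \right) \left( \sum_{j=1}^{n} a_{1j} x_j \right) \\
&= \sum_{i=1}^{n} \sum_{j=1}^{n} a_{ij} x_i x_j - \frac{1}{a_{11}} \sum_{i=1}^{n} \sum_{j=1}^{n} \left( a_{1i} x_i \cdot a_{1j} x_j \right) \\
&= \sum_{i=1}^{n} \sum_{j=1}^{n} \left( \frac{a_{11} a_{ij} - a_{1i} a_{1j}}{a_{11}} \right) x_i x_j,
\end{aligned}
\]
so
\[
\varphi(x_1, \ldots, x_n) - \frac{1}{a_{11}} \left( \sum_{i=1}^{n} a_{1i} x_i \right)^2 = \sum c_{ij} x_i x_j,
\]
where $c_{ij} \equiv \dfrac{a_{11} a_{ij} - a_{1i} a_{1j}}{a_{11}}$. Here
\[
c_{ji} = \frac{a_{11} a_{ji} - a_{1j} a_{1i}}{a_{11}} = \frac{a_{11} a_{ji} - a_{1i} a_{1j}}{a_{11}} = \frac{a_{11} a_{ij} - a_{1i} a_{1j}}{a_{11}} = c_{ij},
\]
so $c_{ji} = c_{ij}$. Thus $[c_{ij}]$ is a symmetric n-square real matrix, and hence
\[
\varphi(x_1, \ldots, x_n) - \frac{1}{a_{11}} \left( \sum_{i=1}^{n} a_{1i} x_i \right)^2
\]
is a real quadratic form. It suffices to show that $c_{1j} = 0$. Here
\[
\text{LHS} = c_{1j} = \frac{a_{11} a_{1j} - a_{11} a_{1j}}{a_{11}} = 0 = \text{RHS}.
\]
Thus we have shown that $\varphi(x_1, \ldots, x_n) - \dfrac{1}{a_{11}} \left( \sum_{i=1}^{n} a_{1i} x_i \right)^2$ is a real quadratic form independent of $x_1$. We can denote it by $\varphi_1(x_2, \ldots, x_n)$. Thus
\[
\varphi(x_1, \ldots, x_n) \equiv \frac{1}{a_{11}} \left( \sum_{i=1}^{n} a_{1i} x_i \right)^2 + \varphi_1(x_2, \ldots, x_n).
\]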
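Put
\[
[y_1, \ldots, y_n]^{T} \equiv
\begin{bmatrix}
a_{11} & a_{12} & \cdots & a_{1n} \\
0 & 1 & \cdots & 0 \\
\vdots & \vdots & \ddots & \vdots \\
0 & 0 & \cdots & 1
\end{bmatrix}
[x_1, \ldots, x_n]^{T}.
\]
Since
\[
\det
\begin{bmatrix}
a_{11} & a_{12} & \cdots & a_{1n} \\
0 & 1 & \cdots & 0 \\
\vdots & \vdots & \ddots & \vdots \\
0 & 0 & \cdots & 1
\end{bmatrix}
= a_{11} \neq 0,
\]
\[
Q \equiv
\begin{bmatrix}
a_{11} & a_{12} & \cdots & a_{1n} \\
0 & 1 & \cdots & 0 \\
\vdots & \vdots & \ddots & \vdots \\
0 & 0 & \cdots & 1
\end{bmatrix}
\]
is invertible, and hence
\[
[x_1, \ldots, x_n]^{T} \mapsto Q\, [x_1, \ldots, x_n]^{T}
\]
from $\mathbb{R}^n$ to $\mathbb{R}^n$ is a one-to-one linear transformation. Since
\[
\varphi(x_1, \ldots, x_n) \equiv \frac{1}{a_{11}} \left( \sum_{i=1}^{n} a_{1i} x_i \right)^2 + \varphi_1(x_2, \ldots, x_n),
\]
and $y_1 = \sum_{i=1}^{n} a_{1i} x_i$, $y_k = x_k$ $(k = 2, \ldots, n)$, we have
\[
\varphi(x_1, \ldots, x_n) \equiv \frac{1}{a_{11}} (y_1)^2 + \varphi_1(y_2, \ldots, y_n).
\]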
4.3.2 Conclusion Let $A \equiv [a_{ij}]$ be a symmetric n-square real matrix. Let
\[
\varphi(x_1, \ldots, x_n) \equiv \sum a_{ij} x_i x_j = [x_1, \ldots, x_n]\, A\, [x_1, \ldots, x_n]^{T}
\]
be a real quadratic form. Suppose that $a_{11} \neq 0$. Then the one-to-one linear transformation
\[
[y_1, \ldots, y_n]^{T} =
\begin{bmatrix}
a_{11} & a_{12} & \cdots & a_{1n} \\
0 & 1 & \cdots & 0 \\
\vdots & \vdots & \ddots & \vdots \\
0 & 0 & \cdots & 1
\end{bmatrix}
[x_1, \ldots, x_n]^{T}
\]
reduces $\varphi(x_1, \ldots, x_n)$ to $\dfrac{1}{a_{11}} (y_1)^2 + \varphi_1(y_2, \ldots, y_n)$, where $\varphi_1(y_2, \ldots, y_n)$ is a quadratic form.
This result is known as Lagrangian reduction.
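One step of this reduction is easy to carry out by machine. The sketch below (Python with exact fractions; the function name `lagrange_step` is hypothetical) computes the matrix $[c_{ij}]$ with $c_{ij} = (a_{11} a_{ij} - a_{1i} a_{1j})/a_{11}$, that is, the matrix of the residual form $\varphi_1$ padded with a zero first row and column. It is applied to the matrix of the example of Sect. 4.2.

```python
from fractions import Fraction

def lagrange_step(A):
    """For a symmetric matrix A with A[0][0] != 0, return [c_ij] where
    c_ij = (a11*a_ij - a_1i*a_1j) / a11, the matrix of
    phi(x) - (1/a11) * (sum_i a_1i x_i)^2."""
    n = len(A)
    a11 = Fraction(A[0][0])
    if a11 == 0:
        raise ValueError("this step needs a11 != 0")
    return [[(a11 * A[i][j] - Fraction(A[0][i]) * A[0][j]) / a11
             for j in range(n)] for i in range(n)]

half = Fraction(1, 2)
A = [[1, half, 1],
     [half, 2, 2],
     [1, 2, 2]]
C = lagrange_step(A)
```

The first row and column of `C` vanish, and the remaining block has entries $7/4$, $3/2$, $3/2$, $1$, the same block reached in the worked example after the first two pairs of row and column operations.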
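4.3.3 Note Let $A \equiv [a_{ij}]$ be a symmetric n-square real matrix. Let
\[
\varphi(x_1, \ldots, x_n) \equiv \sum a_{ij} x_i x_j = [x_1, \ldots, x_n]\, A\, [x_1, \ldots, x_n]^{T}
\]
be a real quadratic form. Suppose that $a_{11} = 0$, $a_{22} = 0$, and $a_{12} \neq 0$.
Observe that
\[
\begin{aligned}
\varphi(x_1, \ldots, x_n) &\equiv \sum a_{ij} x_i x_j \\
&= \left( a_{11} (x_1)^2 + a_{22} (x_2)^2 + a_{12} x_1 x_2 + a_{21} x_2 x_1 \right) + \left( 2a_{13} x_1 x_3 + 2a_{14} x_1 x_4 + \cdots + 2a_{1n} x_1 x_n \right) \\
&\quad + \left( 2a_{23} x_2 x_3 + 2a_{24} x_2 x_4 + \cdots + 2a_{2n} x_2 x_n \right) + \sum_{i=3}^{n} \sum_{j=3}^{n} a_{ij} x_i x_j \\
&= 2a_{12} x_1 x_2 + 2x_1 \left( a_{13} x_3 + \cdots + a_{1n} x_n \right) + 2x_2 \left( a_{23} x_3 + \cdots + a_{2n} x_n \right) + \sum_{i=3}^{n} \sum_{j=3}^{n} a_{ij} x_i x_j \\
&= 2 \left( a_{12} x_1 x_2 + x_1 \sum_{i=3}^{n} a_{1i} x_i + x_2 \sum_{i=3}^{n} a_{2i} x_i \right) + \sum_{i=3}^{n} \sum_{j=3}^{n} a_{ij} x_i x_j \\
&= \frac{2}{a_{12}} \left( (a_{12} x_1)(a_{12} x_2) + (a_{12} x_1) \sum_{i=3}^{n} a_{1i} x_i + (a_{12} x_2) \sum_{i=3}^{n} a_{2i} x_i \right) + \sum_{i=3}^{n} \sum_{j=3}^{n} a_{ij} x_i x_j \\
&= \frac{2}{a_{12}} \left( \left( a_{12} x_2 + \sum_{i=3}^{n} a_{1i} x_i \right) \left( a_{12} x_1 + \sum_{i=3}^{n} a_{2i} x_i \right) - \left( \sum_{i=3}^{n} a_{1i} x_i \right) \left( \sum_{i=3}^{n} a_{2i} x_i \right) \right) + \sum_{i=3}^{n} \sum_{j=3}^{n} a_{ij} x_i x_j \\
&= \frac{2}{a_{12}} \left( a_{12} x_2 + \sum_{i=3}^{n} a_{1i} x_i \right) \left( a_{21} x_1 + \sum_{i=3}^{n} a_{2i} x_i \right) + \frac{-2}{a_{12}} \left( \sum_{i=3}^{n} a_{1i} x_i \right) \left( \sum_{j=3}^{n} a_{2j} x_j \right) + \sum_{i=3}^{n} \sum_{j=3}^{n} a_{ij} x_i x_j \\
&= \frac{2}{a_{12}} \left( a_{12} x_2 + \sum_{i=3}^{n} a_{1i} x_i \right) \left( a_{21} x_1 + \sum_{i=3}^{n} a_{2i} x_i \right) + \frac{-2}{a_{12}} \sum_{i=3}^{n} \sum_{j=3}^{n} \left( a_{1i} x_i \cdot a_{2j} x_j \right) + \sum_{i=3}^{n} \sum_{j=3}^{n} a_{ij} x_i x_j \\
&= \frac{2}{a_{12}} \left( a_{12} x_2 + \sum_{i=3}^{n} a_{1i} x_i \right) \left( a_{21} x_1 + \sum_{i=3}^{n} a_{2i} x_i \right) + \sum_{i=3}^{n} \sum_{j=3}^{n} \left( \frac{-2 a_{1i} a_{2j}}{a_{12}} + a_{ij} \right) x_i x_j.
\end{aligned}
\]
Since $\sum_{i=3}^{n} \sum_{j=3}^{n} a_{1i} a_{2j} x_i x_j = \sum_{i=3}^{n} \sum_{j=3}^{n} a_{1j} a_{2i} x_i x_j$, the last double sum can be written with symmetric coefficients, so
\[
\varphi(x_1, \ldots, x_n) \equiv \frac{2}{a_{12}} \left( a_{12} x_2 + \sum_{i=3}^{n} a_{1i} x_i \right) \left( a_{21} x_1 + \sum_{i=3}^{n} a_{2i} x_i \right) + \sum_{i=3}^{n} \sum_{j=3}^{n} c_{ij} x_i x_j,
\]
where $c_{ij} \equiv -\dfrac{a_{1i} a_{2j} + a_{1j} a_{2i}}{a_{12}} + a_{ij}$. Now, since
\[
c_{ji} = -\frac{a_{1j} a_{2i} + a_{1i} a_{2j}}{a_{12}} + a_{ji} = -\frac{a_{1i} a_{2j} + a_{1j} a_{2i}}{a_{12}} + a_{ij} = c_{ij},
\]
we have $c_{ji} = c_{ij}$, and hence $\sum_{i=3}^{n} \sum_{j=3}^{n} c_{ij} x_i x_j$ is a real quadratic form independent of $x_1$ and $x_2$. So we can denote the real quadratic form $\sum_{i=3}^{n} \sum_{j=3}^{n} c_{ij} x_i x_j$ by $\varphi_1(x_3, \ldots, x_n)$. Thus
\[
\varphi(x_1, \ldots, x_n) \equiv \frac{2}{a_{12}} \left( a_{12} x_2 + \sum_{i=3}^{n} a_{1i} x_i \right) \left( a_{21} x_1 + \sum_{i=3}^{n} a_{2i} x_i \right) + \varphi_1(x_3, \ldots, x_n).
\]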
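Put
\[
[y_1, \ldots, y_n]^{T} \equiv
\begin{bmatrix}
0 & a_{12} & a_{13} & \cdots & a_{1n} \\
a_{21} & 0 & a_{23} & \cdots & a_{2n} \\
0 & 0 & 1 & \cdots & 0 \\
\vdots & \vdots & \vdots & \ddots & \vdots \\
0 & 0 & 0 & \cdots & 1
\end{bmatrix}
[x_1, \ldots, x_n]^{T}.
\]
Since
\[
\det
\begin{bmatrix}
0 & a_{12} & a_{13} & \cdots & a_{1n} \\
a_{21} & 0 & a_{23} & \cdots & a_{2n} \\
0 & 0 & 1 & \cdots & 0 \\
\vdots & \vdots & \vdots & \ddots & \vdots \\
0 & 0 & 0 & \cdots & 1
\end{bmatrix}
= -a_{21} \det
\begin{bmatrix}
a_{12} & a_{13} & \cdots & a_{1n} \\
0 & 1 & \cdots & 0 \\
\vdots & \vdots & \ddots & \vdots \\
0 & 0 & \cdots & 1
\end{bmatrix}
= -(a_{12})^2 \neq 0,
\]
\[
Q \equiv
\begin{bmatrix}
0 & a_{12} & a_{13} & \cdots & a_{1n} \\
a_{21} & 0 & a_{23} & \cdots & a_{2n} \\
0 & 0 & 1 & \cdots & 0 \\
\vdots & \vdots & \vdots & \ddots & \vdots \\
0 & 0 & 0 & \cdots & 1
\end{bmatrix}
\]
is invertible, and hence
\[
[x_1, \ldots, x_n]^{T} \mapsto Q\, [x_1, \ldots, x_n]^{T}
\]
from $\mathbb{R}^n$ to $\mathbb{R}^n$ is a one-to-one linear transformation. Since
\[
\varphi(x_1, \ldots, x_n) \equiv \frac{2}{a_{12}} \left( a_{12} x_2 + \sum_{i=3}^{n} a_{1i} x_i \right) \left( a_{21} x_1 + \sum_{i=3}^{n} a_{2i} x_i \right) + \varphi_1(x_3, \ldots, x_n),
\]
we have
\[
\varphi(x_1, \ldots, x_n) \equiv \frac{2}{a_{12}}\, y_1 y_2 + \varphi_1(y_3, \ldots, y_n).
\]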
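Put
\[
[z_1, \ldots, z_n]^{T} \equiv
\begin{bmatrix}
1 & 1 & 0 & \cdots & 0 \\
1 & -1 & 0 & \cdots & 0 \\
0 & 0 & 1 & \cdots & 0 \\
\vdots & \vdots & \vdots & \ddots & \vdots \\
0 & 0 & 0 & \cdots & 1
\end{bmatrix}
[y_1, \ldots, y_n]^{T}.
\]
Since
\[
\det
\begin{bmatrix}
1 & 1 & 0 & \cdots & 0 \\
1 & -1 & 0 & \cdots & 0 \\
0 & 0 & 1 & \cdots & 0 \\
\vdots & \vdots & \vdots & \ddots & \vdots \\
0 & 0 & 0 & \cdots & 1
\end{bmatrix}
= -2 \neq 0,
\]
\[
R \equiv
\begin{bmatrix}
1 & 1 & 0 & \cdots & 0 \\
1 & -1 & 0 & \cdots & 0 \\
0 & 0 & 1 & \cdots & 0 \\
\vdots & \vdots & \vdots & \ddots & \vdots \\
0 & 0 & 0 & \cdots & 1
\end{bmatrix}
\]
is invertible, and hence
\[
[y_1, \ldots, y_n]^{T} \mapsto R\, [y_1, \ldots, y_n]^{T}
\]
from $\mathbb{R}^n$ to $\mathbb{R}^n$ is a one-to-one linear transformation. Since
\[
\varphi(x_1, \ldots, x_n) \equiv \frac{2}{a_{12}}\, y_1 y_2 + \varphi_1(y_3, \ldots, y_n),
\]
and $y_1 = \tfrac{1}{2}(z_1 + z_2)$, $y_2 = \tfrac{1}{2}(z_1 - z_2)$, $y_k = z_k$ $(k = 3, \ldots, n)$, we have
\[
\begin{aligned}
\varphi(x_1, \ldots, x_n) &\equiv \frac{2}{a_{12}} \left( \frac{1}{2}(z_1 + z_2) \right) \left( \frac{1}{2}(z_1 - z_2) \right) + \varphi_1(z_3, \ldots, z_n) \\
&\equiv \frac{1}{2a_{12}} (z_1)^2 + \frac{-1}{2a_{12}} (z_2)^2 + \varphi_1(z_3, \ldots, z_n),
\end{aligned}
\]
and hence
\[
\varphi(x_1, \ldots, x_n) = \frac{1}{2a_{12}} (z_1)^2 + \frac{-1}{2a_{12}} (z_2)^2 + \varphi_1(z_3, \ldots, z_n).
\]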
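4.3.4 Conclusion (I) Let $A \equiv [a_{ij}]$ be a symmetric n-square real matrix. Let
\[
\varphi(x_1, \ldots, x_n) \equiv \sum a_{ij} x_i x_j = [x_1, \ldots, x_n]\, A\, [x_1, \ldots, x_n]^{T}
\]
be a real quadratic form. Suppose that $a_{11} = 0$, $a_{22} = 0$, and $a_{12} \neq 0$. Then the one-to-one linear transformation
\[
[z_1, \ldots, z_n]^{T} \equiv
\left(
\begin{bmatrix}
1 & 1 & 0 & \cdots & 0 \\
1 & -1 & 0 & \cdots & 0 \\
0 & 0 & 1 & \cdots & 0 \\
\vdots & \vdots & \vdots & \ddots & \vdots \\
0 & 0 & 0 & \cdots & 1
\end{bmatrix}
\begin{bmatrix}
0 & a_{12} & a_{13} & \cdots & a_{1n} \\
a_{21} & 0 & a_{23} & \cdots & a_{2n} \\
0 & 0 & 1 & \cdots & 0 \\
\vdots & \vdots & \vdots & \ddots & \vdots \\
0 & 0 & 0 & \cdots & 1
\end{bmatrix}
\right)
[x_1, \ldots, x_n]^{T}
\]
reduces $\varphi(x_1, \ldots, x_n)$ to $\dfrac{1}{2a_{12}} (z_1)^2 + \dfrac{-1}{2a_{12}} (z_2)^2 + \varphi_1(z_3, \ldots, z_n)$, where $\varphi_1(z_3, \ldots, z_n)$ is a quadratic form.
This result is also known as Lagrangian reduction.
By repeated application of Lagrangian reduction, we get the following result.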
4.3.5 Conclusion (II) Let $A \equiv [a_{ij}]$ be a symmetric n-square real matrix. Let
\[
\varphi(x_1, \ldots, x_n) \equiv \sum a_{ij} x_i x_j = [x_1, \ldots, x_n]\, A\, [x_1, \ldots, x_n]^{T}
\]
be a real quadratic form. Then there exists a one-to-one linear transformation
\[
[y_1, \ldots, y_n]^{T} \equiv Q\, [x_1, \ldots, x_n]^{T}
\]
such that $\varphi(x_1, \ldots, x_n)$ reduces to a form $[y_1, \ldots, y_n] \left( \operatorname{diag}(c_1, \ldots, c_n) \right) [y_1, \ldots, y_n]^{T}$.
4.3.6 Theorem Let $A \equiv [a_{ij}]$ be a symmetric n-square real matrix. Let
\[
\varphi(x_1, \ldots, x_n) \equiv \sum a_{ij} x_i x_j = [x_1, \ldots, x_n]\, A\, [x_1, \ldots, x_n]^{T}
\]
be a real quadratic form. Let $A$ be invertible. Let $\varphi$ be a definite form. Then each $a_{ii}$ $(i = 1, \ldots, n)$ is nonzero.
Proof Suppose to the contrary that there exists a diagonal entry of $A$ that is 0. We seek a contradiction. For simplicity, suppose that $a_{11} = 0$.
Case I: $\varphi$ is a positive definite form. Since $A$ is an invertible symmetric n-square real matrix and $\varphi \left( = [x_1, \ldots, x_n]\, A\, [x_1, \ldots, x_n]^{T} \right)$ is a positive definite form, there exists, by 4.2.17, a real invertible matrix $R$ such that
\[
R A R^{T} = I_n.
\]
It follows that
\[
A = R^{-1} \left( R^{T} \right)^{-1} = R^{-1} \left( R^{-1} \right)^{T}.
\]
Since
\[
\begin{aligned}
\varphi(x_1, \ldots, x_n) \equiv \sum a_{ij} x_i x_j &= [x_1, \ldots, x_n]\, A\, [x_1, \ldots, x_n]^{T} \\
&= [x_1, \ldots, x_n]\, R^{-1} \left( R^{-1} \right)^{T} [x_1, \ldots, x_n]^{T} \\
&= \left( [x_1, \ldots, x_n]\, R^{-1} \right) \left( [x_1, \ldots, x_n]\, R^{-1} \right)^{T} \\
&= [y_1, \ldots, y_n]\, [y_1, \ldots, y_n]^{T},
\end{aligned}
\]
where
\[
[y_1, \ldots, y_n] \equiv [x_1, \ldots, x_n]\, R^{-1},
\]
it follows that
\[
\varphi(x_1, \ldots, x_n) = [y_1, \ldots, y_n]\, [y_1, \ldots, y_n]^{T} = (y_1)^2 + \cdots + (y_n)^2.
\]
Since $R$ is invertible, we have that
\[
[x_1, \ldots, x_n] \mapsto [x_1, \ldots, x_n]\, R^{-1}
\]
from $\mathbb{R}^n$ to $\mathbb{R}^n$ is a one-to-one linear transformation, and hence
\[
[1, 0, \ldots, 0]\, R^{-1} \neq [0, 0, \ldots, 0].
\]
It follows that $\left( [1, 0, \ldots, 0]\, R^{-1} \right) \left( [1, 0, \ldots, 0]\, R^{-1} \right)^{T} > 0$. Since $\varphi(x_1, \ldots, x_n) \equiv \sum a_{ij} x_i x_j$, we have
\[
0 < \left( [1, 0, \ldots, 0]\, R^{-1} \right) \left( [1, 0, \ldots, 0]\, R^{-1} \right)^{T} = \varphi(1, 0, \ldots, 0) = a_{11} \cdot 1 \cdot 1 + 0 + \cdots + 0 = a_{11} = 0.
\]
Thus we have obtained a contradiction.
Case II: $\varphi$ is a negative definite form. This case is similar to Case I. ∎
4.3.7 Theorem Let $B \equiv [b_{ij}]$ be a symmetric n-square real matrix. Let $B$ be invertible. Let
\[
\psi(x_1, \ldots, x_n) \equiv \sum b_{ij} x_i x_j = [x_1, \ldots, x_n]\, B\, [x_1, \ldots, x_n]^{T}
\]
be a real quadratic form. Let $\psi$ be positive definite. Let $P$ be a real orthogonal n-square matrix. Then the (1,1)-entry in $PBP^{T}$ is nonzero. Similarly, the (2,2)-entry in $PBP^{T}$ is nonzero, etc.
Proof Suppose to the contrary that the (1,1)-entry in $PBP^{T}$ is 0. We seek a contradiction.
Since $B \equiv [b_{ij}]$ is a symmetric n-square real matrix and
\[
\psi(x_1, \ldots, x_n) \equiv \sum b_{ij} x_i x_j = [x_1, \ldots, x_n]\, B\, [x_1, \ldots, x_n]^{T}
\]
is a positive definite form, by 4.2.17, there exists a real invertible matrix $C$ such that $CBC^{T} = I_n$. It follows that $B = C^{-1} \left( C^{T} \right)^{-1} \left( = C^{-1} \left( C^{-1} \right)^{T} \right)$. Now,
\[
PBP^{T} = P \left( C^{-1} \left( C^{-1} \right)^{T} \right) P^{T} = \left( PC^{-1} \right) \left( PC^{-1} \right)^{T},
\]
so $PBP^{T} = \left( PC^{-1} \right) \left( PC^{-1} \right)^{T}$. Since $P$ and $C$ are invertible, $PC^{-1}$ is invertible.
Suppose that
\[
PC^{-1} \equiv
\begin{bmatrix}
c_{11} & \cdots & c_{1n} \\
\vdots & \ddots & \vdots \\
c_{n1} & \cdots & c_{nn}
\end{bmatrix},
\]
where each $c_{ij}$ is a real number. Now,
\[
PBP^{T} =
\begin{bmatrix}
c_{11} & \cdots & c_{1n} \\
\vdots & \ddots & \vdots \\
c_{n1} & \cdots & c_{nn}
\end{bmatrix}
\begin{bmatrix}
c_{11} & \cdots & c_{1n} \\
\vdots & \ddots & \vdots \\
c_{n1} & \cdots & c_{nn}
\end{bmatrix}^{T}
=
\begin{bmatrix}
c_{11} & \cdots & c_{1n} \\
\vdots & \ddots & \vdots \\
c_{n1} & \cdots & c_{nn}
\end{bmatrix}
\begin{bmatrix}
c_{11} & \cdots & c_{n1} \\
\vdots & \ddots & \vdots \\
c_{1n} & \cdots & c_{nn}
\end{bmatrix}.
\]
Since the (1,1)-entry in this product $\left( = PBP^{T} \right)$ is $(c_{11})^2 + \cdots + (c_{1n})^2$, the (1,1)-entry in $PBP^{T}$ is $(c_{11})^2 + \cdots + (c_{1n})^2$. By assumption, the (1,1)-entry in $PBP^{T}$ is 0, so $(c_{11})^2 + \cdots + (c_{1n})^2 = 0$. Since each $c_{ij}$ is a real number, we have $c_{1i} = 0$ $(i = 1, \ldots, n)$. Thus the first row of $PC^{-1}$ is zero. It follows that $\det\left( PC^{-1} \right) = 0$, and hence $PC^{-1}$ is not invertible. This is a contradiction. ∎
4.3.8 Note Let $A \equiv [a_{ij}]$ be a symmetric n-square real matrix. Let $B \equiv [b_{ij}]$ be a symmetric n-square real matrix. Let $B$ be invertible. Let
\[
\varphi(x_1, \ldots, x_n) \equiv \sum a_{ij} x_i x_j = [x_1, \ldots, x_n]\, A\, [x_1, \ldots, x_n]^{T}
\]
be a real quadratic form. Let
\[
\psi(x_1, \ldots, x_n) \equiv \sum b_{ij} x_i x_j = [x_1, \ldots, x_n]\, B\, [x_1, \ldots, x_n]^{T}
\]
be a real quadratic form. Let $\psi$ be a positive definite form. Let $\lambda_1$ be a root of the λ-equation of $\varphi$ and $\psi$, that is, $\det[A - \lambda_1 B] = 0$.
Since $\det[A - \lambda_1 B] = 0$, the characteristic equation $\det[(A - \lambda_1 B) - \lambda I_n] = 0$ of $(A - \lambda_1 B)$ is satisfied by $\lambda = 0$, and hence 0 is an eigenvalue of $(A - \lambda_1 B)$. Since $A, B$ are symmetric n-square real matrices and $\lambda_1$ is real (4.2.26), $(A - \lambda_1 B)$ is also a symmetric n-square real matrix. It follows, by 4.2.12, that there exists a real orthogonal matrix $P$ such that
\[
P (A - \lambda_1 B) P^{T} = \operatorname{diag}(\mu_1, \ldots, \mu_n),
\]
where $\mu_1, \ldots, \mu_n$ are the eigenvalues of $(A - \lambda_1 B)$. Since 0 is an eigenvalue of $(A - \lambda_1 B)$, one of the $\mu_i$s is 0. For simplicity, suppose that $\mu_1 = 0$. Thus
\[
(A - \lambda_1 B) = P^{T} \left( \operatorname{diag}(0, \mu_2, \ldots, \mu_n) \right) P.
\]
It follows that
\[
\begin{aligned}
[x_1, \ldots, x_n] (A - \lambda_1 B) [x_1, \ldots, x_n]^{T}
&\equiv [x_1, \ldots, x_n] \left( P^{T} \left( \operatorname{diag}(0, \mu_2, \ldots, \mu_n) \right) P \right) [x_1, \ldots, x_n]^{T} \\
&\equiv \left( [x_1, \ldots, x_n] P^{T} \right) \left( \operatorname{diag}(0, \mu_2, \ldots, \mu_n) \right) \left( [x_1, \ldots, x_n] P^{T} \right)^{T} \\
&\equiv [y_1, \ldots, y_n] \left( \operatorname{diag}(0, \mu_2, \ldots, \mu_n) \right) [y_1, \ldots, y_n]^{T},
\end{aligned}
\]
where $[y_1, \ldots, y_n] \equiv [x_1, \ldots, x_n] P^{T}$. Since $P$ is invertible,
\[
[x_1, \ldots, x_n]^{T} \mapsto P\, [x_1, \ldots, x_n]^{T}
\]
from $\mathbb{R}^n$ to $\mathbb{R}^n$ is a one-to-one linear transformation. Clearly, $[y_1, \ldots, y_n] \left( \operatorname{diag}(0, \mu_2, \ldots, \mu_n) \right) [y_1, \ldots, y_n]^{T}$ is a real quadratic form independent of $y_1$, so we can denote it by $\varphi_1(y_2, \ldots, y_n)$. Thus
\[
[x_1, \ldots, x_n] (A - \lambda_1 B) [x_1, \ldots, x_n]^{T} \equiv \varphi_1(y_2, \ldots, y_n).
\]
Hence in the reduced form $\varphi_1(y_2, \ldots, y_n)$ of $\varphi(x_1, \ldots, x_n) - \lambda_1 \psi(x_1, \ldots, x_n)$, the coefficient of $(y_1)^2$ is zero.
By 4.3.7, the (1,1)-entry in $PBP^{T}$ is nonzero. It follows that the coefficient of $(y_1)^2$ in the real quadratic form
\[
\psi(x_1, \ldots, x_n) = [x_1, \ldots, x_n]\, B\, [x_1, \ldots, x_n]^{T} \equiv \left( [y_1, \ldots, y_n] P \right) B \left( [y_1, \ldots, y_n] P \right)^{T} \equiv [y_1, \ldots, y_n] \left( PBP^{T} \right) [y_1, \ldots, y_n]^{T}
\]
is nonzero. Thus in the reduced form, say $\psi_1(y_1, \ldots, y_n)$, of $\psi(x_1, \ldots, x_n)$, the coefficient of $(y_1)^2$ is nonzero.
Now we can suppose that
\[
\psi_1(y_1, \ldots, y_n) = [y_1, \ldots, y_n]\, [d_{ij}]\, [y_1, \ldots, y_n]^{T},
\]
where $d_{11} \neq 0$. Next, by 4.3.2 the one-to-one linear transformation
\[
[z_1, \ldots, z_n]^{T} =
\begin{bmatrix}
d_{11} & d_{12} & \cdots & d_{1n} \\
0 & 1 & \cdots & 0 \\
\vdots & \vdots & \ddots & \vdots \\
0 & 0 & \cdots & 1
\end{bmatrix}
[y_1, \ldots, y_n]^{T}
\]
reduces $\psi_1(y_1, \ldots, y_n)$ to $\dfrac{1}{d_{11}} (z_1)^2 + \varphi_2(z_2, \ldots, z_n)$, where $\varphi_2(z_2, \ldots, z_n)$ is a quadratic form. Here we can write
\[
\left.
\begin{aligned}
z_1 &= d_{11} y_1 + d_{12} y_2 + \cdots + d_{1n} y_n \\
z_2 &= y_2 \\
z_3 &= y_3 \\
&\ \ \vdots \\
z_n &= y_n
\end{aligned}
\right\},
\]
or
\[
\left.
\begin{aligned}
y_1 &= \frac{1}{d_{11}} z_1 + \frac{-d_{12}}{d_{11}} z_2 + \cdots + \frac{-d_{1n}}{d_{11}} z_n \\
y_2 &= z_2 \\
y_3 &= z_3 \\
&\ \ \vdots \\
y_n &= z_n
\end{aligned}
\right\}.
\]
Hence
\[
\varphi(x_1, \ldots, x_n) - \lambda_1 \psi(x_1, \ldots, x_n) \equiv [x_1, \ldots, x_n] (A - \lambda_1 B) [x_1, \ldots, x_n]^{T} \equiv \varphi_1(y_2, \ldots, y_n) \equiv \varphi_1(z_2, \ldots, z_n).
\]
Thus
\[
\varphi(x_1, \ldots, x_n) - \lambda_1 \psi(x_1, \ldots, x_n) \equiv \varphi_1(z_2, \ldots, z_n).
\]
Since $\psi(x_1, \ldots, x_n)$ reduces to $\psi_1(y_1, \ldots, y_n)$, and $\psi_1(y_1, \ldots, y_n)$ reduces to $\dfrac{1}{d_{11}} (z_1)^2 + \varphi_2(z_2, \ldots, z_n)$, it follows that $\psi(x_1, \ldots, x_n)$ reduces to $\dfrac{1}{d_{11}} (z_1)^2 + \varphi_2(z_2, \ldots, z_n)$. Thus
\[
\psi(x_1, \ldots, x_n) \equiv \frac{1}{d_{11}} (z_1)^2 + \varphi_2(z_2, \ldots, z_n).
\]
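It follows that
\[
\begin{aligned}
\varphi(x_1, \ldots, x_n) &\equiv \lambda_1 \left( \frac{1}{d_{11}} (z_1)^2 + \varphi_2(z_2, \ldots, z_n) \right) + \varphi_1(z_2, \ldots, z_n) \\
&\equiv \lambda_1 \frac{1}{d_{11}} (z_1)^2 + \varphi_3(z_2, \ldots, z_n),
\end{aligned}
\]
where $\varphi_3(z_2, \ldots, z_n) \equiv \lambda_1 \varphi_2(z_2, \ldots, z_n) + \varphi_1(z_2, \ldots, z_n)$.
4.3.9 Conclusion Let $A \equiv [a_{ij}]$ and $B \equiv [b_{ij}]$ be symmetric n-square real matrices. Let $B$ be invertible. Let
\[
\left.
\begin{aligned}
\varphi(x_1, \ldots, x_n) &\equiv \sum a_{ij} x_i x_j = [x_1, \ldots, x_n]\, A\, [x_1, \ldots, x_n]^{T} \\
\psi(x_1, \ldots, x_n) &\equiv \sum b_{ij} x_i x_j = [x_1, \ldots, x_n]\, B\, [x_1, \ldots, x_n]^{T}
\end{aligned}
\right\}
\]
be a pair of real quadratic forms. Let $\psi$ be positive definite. Let $\lambda_1$ be a root of the λ-equation of $\varphi$ and $\psi$. Then there exists a one-to-one linear transformation $[x_1, \ldots, x_n] \mapsto [z_1, \ldots, z_n]$ such that the pair’s reduced forms are
\[
\left.
\begin{aligned}
\varphi(x_1, \ldots, x_n) &\equiv \lambda_1 c_1 (z_1)^2 + \varphi_1(z_2, \ldots, z_n) \\
\psi(x_1, \ldots, x_n) &\equiv c_1 (z_1)^2 + \psi_1(z_2, \ldots, z_n)
\end{aligned}
\right\},
\]
where $c_1$ is a nonzero real number.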
Definition Let $B \equiv [b_{ij}]$ be a symmetric n-square real matrix. Let
\[
\psi(x_1, \ldots, x_n) \equiv \sum_{i=1}^{n} \sum_{j=1}^{n} b_{ij} x_i x_j \equiv \sum_{i=1}^{n} x_i \left( \sum_{j=1}^{n} b_{ij} x_j \right) \ \left( \equiv [x_1, \ldots, x_n]\, [b_{ij}]\, [x_1, \ldots, x_n]^{T} \right)
\]
be a real quadratic form. Suppose that $(c_1, \ldots, c_n) \neq (0, \ldots, 0)$, where each $c_i$ is real. If $[b_{ij}]\, [c_1, \ldots, c_n]^{T} = [0, \ldots, 0]^{T}$, then we say that $(c_1, \ldots, c_n)$ is a vertex of $\psi(x_1, \ldots, x_n)$.
4.3.10 Let $B \equiv [b_{ij}]$
be a symmetric n-square real matrix. Let $B$ be invertible. Let
\[
\psi(x_1, \ldots, x_n) \equiv \sum_{i=1}^{n} \sum_{j=1}^{n} b_{ij} x_i x_j \equiv \sum_{i=1}^{n} x_i \left( \sum_{j=1}^{n} b_{ij} x_j \right) \ \left( \equiv [x_1, \ldots, x_n]\, [b_{ij}]\, [x_1, \ldots, x_n]^{T} \right)
\]
be a real quadratic form. Let $\psi(x_1, \ldots, x_n)$ be an indefinite form.
Since $\psi(x_1, \ldots, x_n)$ is an indefinite form, $\psi(x_1, \ldots, x_n)$ is neither positive definite nor negative definite. It follows, by 4.2.25, that there exists a real invertible matrix $C \equiv [c_{ij}]$ such that the one-to-one transformation
\[
[x_1, \ldots, x_n]^{T} = C\, [y_1, \ldots, y_n]^{T}
\]
reduces the form $\psi(x_1, \ldots, x_n)$ to the normal form
\[
(y_1)^2 + \cdots + (y_r)^2 - (y_{r+1})^2 - \cdots - (y_n)^2,
\]
where $1 \le r < n$. Put
\[
[a_1, \ldots, a_n]^{T} = C\, [1, 0, \ldots, 0]^{T}.
\]
Now, since $C$ is invertible, $(a_1, \ldots, a_n) \neq (0, \ldots, 0)$. Also,
\[
\psi(a_1, \ldots, a_n) = 1^2 + 0^2 + \cdots + 0^2 - 0^2 - \cdots - 0^2 = 1 > 0,
\]
so $\psi(a_1, \ldots, a_n)$ is positive. Similarly, there exists a nonzero $(b_1, \ldots, b_n)$ such that $\psi(b_1, \ldots, b_n)$ is negative.
4.3.11 Conclusion Let $B \equiv [b_{ij}]$
be a symmetric n-square real matrix. Let $B$ be invertible. Let
\[
\psi(x_1, \ldots, x_n) \equiv \sum_{i=1}^{n} \sum_{j=1}^{n} b_{ij} x_i x_j \equiv \sum_{i=1}^{n} x_i \left( \sum_{j=1}^{n} b_{ij} x_j \right) \ \left( \equiv [x_1, \ldots, x_n]\, [b_{ij}]\, [x_1, \ldots, x_n]^{T} \right)
\]
be a real quadratic form. Let $\psi(x_1, \ldots, x_n)$ be an indefinite form. Then there exist nonzero real points $(a_1, \ldots, a_n)$ and $(b_1, \ldots, b_n)$ such that $\psi(a_1, \ldots, a_n)$ is positive and $\psi(b_1, \ldots, b_n)$ is negative.
4.3.12 Let $B \equiv [b_{ij}]$
be a symmetric n-square real matrix. Let $B$ be invertible. Let
\[
\psi(x_1, \ldots, x_n) \equiv \sum_{i=1}^{n} \sum_{j=1}^{n} b_{ij} x_i x_j \equiv \sum_{i=1}^{n} x_i \left( \sum_{j=1}^{n} b_{ij} x_j \right) \ \left( \equiv [x_1, \ldots, x_n]\, [b_{ij}]\, [x_1, \ldots, x_n]^{T} \right)
\]
be a real quadratic form. Let $\psi(x_1, \ldots, x_n)$ be an indefinite form. Let $(a_1, \ldots, a_n)$ and $(b_1, \ldots, b_n)$ be nonzero real points such that $\psi(a_1, \ldots, a_n)$ is positive and $\psi(b_1, \ldots, b_n)$ is negative.
Observe that
\[
\begin{aligned}
\psi(a_1 + kb_1, \ldots, a_n + kb_n)
&= [a_1 + kb_1, \ldots, a_n + kb_n]\, [b_{ij}]\, [a_1 + kb_1, \ldots, a_n + kb_n]^{T} \\
&= \left( [a_1, \ldots, a_n] + k [b_1, \ldots, b_n] \right) [b_{ij}] \left( [a_1, \ldots, a_n] + k [b_1, \ldots, b_n] \right)^{T} \\
&= [a_1, \ldots, a_n] [b_{ij}] [a_1, \ldots, a_n]^{T} + \left( [a_1, \ldots, a_n] [b_{ij}] [b_1, \ldots, b_n]^{T} + [b_1, \ldots, b_n] [b_{ij}] [a_1, \ldots, a_n]^{T} \right) k \\
&\quad + [b_1, \ldots, b_n] [b_{ij}] [b_1, \ldots, b_n]^{T} k^2 \\
&= \psi(b_1, \ldots, b_n)\, k^2 + \left( \sum_{i=1}^{n} \sum_{j=1}^{n} b_{ij} a_i b_j + \sum_{i=1}^{n} \sum_{j=1}^{n} b_{ij} b_i a_j \right) k + \psi(a_1, \ldots, a_n) \\
&= \psi(b_1, \ldots, b_n)\, k^2 + \left( \sum_{i=1}^{n} \sum_{j=1}^{n} b_{ij} a_i b_j + \sum_{i=1}^{n} \sum_{j=1}^{n} b_{ji} a_j b_i \right) k + \psi(a_1, \ldots, a_n) \\
&= \psi(b_1, \ldots, b_n)\, k^2 + \left( \sum_{i=1}^{n} \sum_{j=1}^{n} b_{ij} a_i b_j + \sum_{j=1}^{n} \sum_{i=1}^{n} b_{ij} a_i b_j \right) k + \psi(a_1, \ldots, a_n) \\
&= \psi(b_1, \ldots, b_n)\, k^2 + 2 \left( \sum_{i=1}^{n} \sum_{j=1}^{n} b_{ij} a_i b_j \right) k + \psi(a_1, \ldots, a_n),
\end{aligned}
\]
so
\[
\psi(a_1 + kb_1, \ldots, a_n + kb_n) \equiv \psi(b_1, \ldots, b_n)\, k^2 + 2 \left( \sum_{i=1}^{n} \sum_{j=1}^{n} b_{ij} a_i b_j \right) k + \psi(a_1, \ldots, a_n).
\]
Since $\psi(a_1, \ldots, a_n)$ is positive and $\psi(b_1, \ldots, b_n)$ is negative, the discriminant
\[
4 \left( \sum_{i=1}^{n} \sum_{j=1}^{n} b_{ij} a_i b_j \right)^2 - 4\, \psi(a_1, \ldots, a_n)\, \psi(b_1, \ldots, b_n)
\]
of this quadratic in $k$ is positive, and hence there exist two distinct real numbers $k_1$ and $k_2$ such that
\[
\left.
\begin{aligned}
\psi(a_1 + k_1 b_1, \ldots, a_n + k_1 b_n) &= 0 \\
\psi(a_1 + k_2 b_1, \ldots, a_n + k_2 b_n) &= 0
\end{aligned}
\right\}.
\]
Clearly $k_1 k_2 \left( = \psi(a_1, \ldots, a_n) / \psi(b_1, \ldots, b_n) \right)$ is negative, and hence $k_1, k_2$ are of opposite signs.
Also, $\psi(x_1, \ldots, x_n)$ vanishes at the two real points $(a_1 + k_1 b_1, \ldots, a_n + k_1 b_n) \left( = (a_1, \ldots, a_n) + k_1 (b_1, \ldots, b_n) \right)$ and $(a_1 + k_2 b_1, \ldots, a_n + k_2 b_n) \left( = (a_1, \ldots, a_n) + k_2 (b_1, \ldots, b_n) \right)$. Since $k_1$ and $k_2$ are distinct real numbers and $(b_1, \ldots, b_n)$ is nonzero, $(a_1, \ldots, a_n) + k_1 (b_1, \ldots, b_n)$ and $(a_1, \ldots, a_n) + k_2 (b_1, \ldots, b_n)$ are distinct points.
Clearly, $[b_{ij}]\, [a_1 + k_1 b_1, \ldots, a_n + k_1 b_n]^{T} \neq [0, \ldots, 0]^{T}$.
Proof Suppose to the contrary that $[b_{ij}]\, [a_1 + k_1 b_1, \ldots, a_n + k_1 b_n]^{T} = [0, \ldots, 0]^{T}$. We seek a contradiction.
Since $[b_{ij}]\, [a_1 + k_1 b_1, \ldots, a_n + k_1 b_n]^{T} = [0, \ldots, 0]^{T}$, and $[b_{ij}]$ is invertible, we have $[a_1 + k_1 b_1, \ldots, a_n + k_1 b_n] = [0, \ldots, 0]$, and hence $(a_1, \ldots, a_n) + k_1 (b_1, \ldots, b_n) = (0, \ldots, 0)$. It follows that $(a_1, \ldots, a_n) = (-k_1 b_1, \ldots, -k_1 b_n)$. Since $\psi(a_1, \ldots, a_n)$ is positive,
\[
\psi(-k_1 b_1, \ldots, -k_1 b_n) \left( = [-k_1 b_1, \ldots, -k_1 b_n]\, [b_{ij}]\, [-k_1 b_1, \ldots, -k_1 b_n]^{T} = (k_1)^2\, \psi(b_1, \ldots, b_n) \right)
\]
is positive, and hence $(k_1)^2\, \psi(b_1, \ldots, b_n)$ is positive. Since $k_1$ is real and $\psi(b_1, \ldots, b_n)$ is negative, $(k_1)^2\, \psi(b_1, \ldots, b_n) \le 0$. This is a contradiction. ∎
Thus we have shown that $[b_{ij}]\, [a_1 + k_1 b_1, \ldots, a_n + k_1 b_n]^{T} \neq [0, \ldots, 0]^{T}$. Hence $(a_1, \ldots, a_n) + k_1 (b_1, \ldots, b_n)$ is different from the origin and is not a vertex of $\psi(x_1, \ldots, x_n)$. Similarly, $(a_1, \ldots, a_n) + k_2 (b_1, \ldots, b_n)$ is different from the origin and is not a vertex of $\psi(x_1, \ldots, x_n)$.
4.3.13 Conclusion Let $B \equiv [b_{ij}]$
be a symmetric n-square real matrix. Let $B$ be invertible. Let
\[
\psi(x_1, \ldots, x_n) \equiv \sum_{i=1}^{n} \sum_{j=1}^{n} b_{ij} x_i x_j \equiv \sum_{i=1}^{n} x_i \left( \sum_{j=1}^{n} b_{ij} x_j \right) \ \left( \equiv [x_1, \ldots, x_n]\, [b_{ij}]\, [x_1, \ldots, x_n]^{T} \right)
\]
be a real quadratic form. Let $\psi(x_1, \ldots, x_n)$ be an indefinite form. Let $(a_1, \ldots, a_n)$ and $(b_1, \ldots, b_n)$ be nonzero real points such that $\psi(a_1, \ldots, a_n)$ is positive and $\psi(b_1, \ldots, b_n)$ is negative. Then there exist two distinct real numbers $k_1$ and $k_2$ such that
1. $k_1, k_2$ are of opposite signs,
2. $\psi(a_1 + k_1 b_1, \ldots, a_n + k_1 b_n) = 0$,
3. $\psi(a_1 + k_2 b_1, \ldots, a_n + k_2 b_n) = 0$,
4. $(a_1 + k_1 b_1, \ldots, a_n + k_1 b_n)$ and $(a_1 + k_2 b_1, \ldots, a_n + k_2 b_n)$ are points different from the origin and the vertices of $\psi(x_1, \ldots, x_n)$.
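A small numerical instance may make the construction concrete. In the sketch below (Python, floating point) the matrix $B$ and the points $a$, $b$ are hypothetical choices satisfying the hypotheses: $B$ is symmetric, invertible and indefinite, $\psi(a) > 0$ and $\psi(b) < 0$. The code forms the quadratic $\psi(b)k^2 + 2\left(\sum b_{ij} a_i b_j\right)k + \psi(a)$ in $k$ and solves it.

```python
import math

# Hypothetical example: symmetric, invertible (det = -3), indefinite.
B = [[1.0, 2.0],
     [2.0, 1.0]]
a = [1.0, 0.0]     # psi(a) = 1 > 0
b = [1.0, -1.0]    # psi(b) = -2 < 0

def bilinear(u, v):
    """u B v^T."""
    return sum(B[i][j] * u[i] * v[j] for i in range(2) for j in range(2))

def psi(u):
    return bilinear(u, u)

# psi(a + k b) = psi(b) k^2 + 2 (a B b^T) k + psi(a)
p, q, c = psi(b), 2 * bilinear(a, b), psi(a)
disc = q * q - 4 * p * c          # positive, because p < 0 < c
k1 = (-q + math.sqrt(disc)) / (2 * p)
k2 = (-q - math.sqrt(disc)) / (2 * p)

# The two points a + k1 b and a + k2 b at which psi vanishes.
zeros = [[a[i] + k * b[i] for i in range(2)] for k in (k1, k2)]
```

In this instance the quadratic is $-2k^2 - 2k + 1$, so $k_1 k_2 = -\tfrac{1}{2} < 0$ and the two roots have opposite signs, as item 1 asserts.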
4.3.14 Theorem Let $B \equiv [b_{ij}]$ be a symmetric n-square real matrix. Let $B$ be invertible. Let
\[
\psi(x_1, \ldots, x_n) \equiv \sum_{i=1}^{n} \sum_{j=1}^{n} b_{ij} x_i x_j \equiv \sum_{i=1}^{n} x_i \left( \sum_{j=1}^{n} b_{ij} x_j \right) \ \left( \equiv [x_1, \ldots, x_n]\, [b_{ij}]\, [x_1, \ldots, x_n]^{T} \right)
\]
be a real quadratic form. Suppose that for every real point $(c_1, \ldots, c_n)$ that is different from the origin and the vertices of $\psi(x_1, \ldots, x_n)$, $\psi(c_1, \ldots, c_n) \neq 0$. Then $\psi(x_1, \ldots, x_n)$ is definite.
Proof Suppose to the contrary that $\psi(x_1, \ldots, x_n)$ is indefinite. We seek a contradiction.
By 4.3.11, there exist nonzero real points $(a_1, \ldots, a_n)$ and $(b_1, \ldots, b_n)$ such that $\psi(a_1, \ldots, a_n)$ is positive and $\psi(b_1, \ldots, b_n)$ is negative. By 4.3.13, there exist two distinct real numbers $k_1$ and $k_2$ such that
1. $k_1, k_2$ are of opposite signs,
2. $\psi(a_1 + k_1 b_1, \ldots, a_n + k_1 b_n) = 0$,
3. $\psi(a_1 + k_2 b_1, \ldots, a_n + k_2 b_n) = 0$,
4. $(a_1 + k_1 b_1, \ldots, a_n + k_1 b_n)$ and $(a_1 + k_2 b_1, \ldots, a_n + k_2 b_n)$ are points different from the origin and the vertices of $\psi(x_1, \ldots, x_n)$.
Since $(a_1 + k_1 b_1, \ldots, a_n + k_1 b_n)$ is a point different from the origin and the vertices of $\psi(x_1, \ldots, x_n)$, by assumption, $\psi(a_1 + k_1 b_1, \ldots, a_n + k_1 b_n) \neq 0$. This is a contradiction. ∎
4.3.15 Theorem Let $B \equiv [b_{ij}]$ be a symmetric n-square real matrix. Let $B$ be not invertible. Let
\[
\psi(x_1, \ldots, x_n) \equiv \sum_{i=1}^{n} \sum_{j=1}^{n} b_{ij} x_i x_j \equiv \sum_{i=1}^{n} x_i \left( \sum_{j=1}^{n} b_{ij} x_j \right) \ \left( \equiv [x_1, \ldots, x_n]\, [b_{ij}]\, [x_1, \ldots, x_n]^{T} \right)
\]
be a real quadratic form. Then there exists a real point $(a_1, \ldots, a_n)$ different from the origin such that $\psi(a_1, \ldots, a_n) = 0$.
Proof Since $B$ is not invertible, $\operatorname{rank}(B) < n$. By 4.2.17, there exists a real invertible matrix $R$ such that
\[
R B R^{T} = D, \qquad D \equiv \operatorname{diag}\big( \underbrace{1, \ldots, 1, -1, \ldots, -1}_{< n}, 0, \ldots, 0 \big).
\]
It follows that
\[
B = R^{-1} D \left( R^{T} \right)^{-1} = R^{-1} D \left( R^{-1} \right)^{T},
\]
and hence
\[
\begin{aligned}
\psi(x_1, \ldots, x_n) &\equiv [x_1, \ldots, x_n]\, B\, [x_1, \ldots, x_n]^{T} \\
&\equiv [x_1, \ldots, x_n] \left( R^{-1} D \left( R^{-1} \right)^{T} \right) [x_1, \ldots, x_n]^{T} \\
&\equiv [y_1, \ldots, y_n]\, D\, [y_1, \ldots, y_n]^{T},
\end{aligned}
\]
where $[y_1, \ldots, y_n] \equiv [x_1, \ldots, x_n]\, R^{-1}$. Since $R$ is invertible, $[0, \ldots, 0, 1]\, R$ is nonzero. Put $[a_1, \ldots, a_n] \equiv [0, \ldots, 0, 1]\, R$. Thus $(a_1, \ldots, a_n) \neq (0, \ldots, 0)$. Also,
\[
\begin{aligned}
\psi(a_1, \ldots, a_n) &= [a_1, \ldots, a_n] \left( R^{-1} D \left( R^{-1} \right)^{T} \right) [a_1, \ldots, a_n]^{T} \\
&= [0, \ldots, 0, 1]\, D\, [0, \ldots, 0, 1]^{T} = 0,
\end{aligned}
\]
so $\psi(a_1, \ldots, a_n) = 0$. ∎
4.3.16 Theorem Let B � bij
be a symmetric n-square real matrix. Let B beinvertible. Let
w x1; . . .; xnð Þ �Xni¼1
Xnj¼1
bijxixj
!�Xni¼1
xiXnj¼1
bijxj
! !� x1; . . .; xn½ � bij
x1; . . .; xn½ �T� �
be a real quadratic form. Let w x1; . . .; xnð Þ be an indefinite form. Then there exists areal point c1; . . .; cnð Þ different from the origin such that w c1; . . .; cnð Þ ¼ 0:
340 4 Sylvester’s Law of Inertia
Proof By 4.3.11, there exist nonzero real points $(a_1,\dots,a_n)$ and $(b_1,\dots,b_n)$ such that $\psi(a_1,\dots,a_n)$ is positive and $\psi(b_1,\dots,b_n)$ is negative. By 4.3.13, there exist two distinct real numbers $\lambda_1$ and $\lambda_2$ such that

1. $\lambda_1, \lambda_2$ are of opposite signs,
2. $\psi(a_1 + \lambda_1 b_1,\dots,a_n + \lambda_1 b_n) = 0$,
3. $\psi(a_1 + \lambda_2 b_1,\dots,a_n + \lambda_2 b_n) = 0$,
4. $(a_1 + \lambda_1 b_1,\dots,a_n + \lambda_1 b_n)$ and $(a_1 + \lambda_2 b_1,\dots,a_n + \lambda_2 b_n)$ are points different from the origin and are vertices of $\psi(x_1,\dots,x_n)$.

Let us take $(c_1,\dots,c_n) \equiv (a_1 + \lambda_1 b_1,\dots,a_n + \lambda_1 b_n)$. Now, $\psi(c_1,\dots,c_n) = 0$, and $(c_1,\dots,c_n)$ is a real point different from the origin. ∎
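The two values $\lambda_1, \lambda_2$ can be found explicitly, since $\psi(a + \lambda b) = \psi(b)\lambda^2 + 2(aBb^T)\lambda + \psi(a)$ is a quadratic in $\lambda$ whose constant and leading coefficients have opposite signs. The sketch below solves that quadratic; it is one way to see the claim and is not necessarily how 4.3.13 argues. The function names and the sample matrix `B` are assumptions made for illustration.

```python
from math import sqrt

def bilinear(B, x, y):
    """Value of x B y^T for a symmetric matrix B given as nested lists."""
    n = len(x)
    return sum(B[i][j] * x[i] * y[j] for i in range(n) for j in range(n))

def quad_form(B, x):
    """Value of the quadratic form x B x^T."""
    return bilinear(B, x, x)

def zeros_on_line(B, a, b):
    """Both real lam with psi(a + lam*b) = 0, assuming psi(a) > 0 > psi(b)."""
    qa, qb, qab = quad_form(B, a), quad_form(B, b), bilinear(B, a, b)
    disc = qab * qab - qa * qb      # positive because qa * qb < 0
    root = sqrt(disc)
    return (-qab + root) / qb, (-qab - root) / qb

# Illustrative indefinite form psi(x1, x2) = x1^2 - x2^2 (an assumption).
B = [[1.0, 0.0], [0.0, -1.0]]
a = [1.0, 0.0]                      # psi(a) = 1 > 0
b = [0.0, 1.0]                      # psi(b) = -1 < 0

lam1, lam2 = zeros_on_line(B, a, b)
c = [a[i] + lam1 * b[i] for i in range(2)]
print(lam1, lam2, c, quad_form(B, c))
```

The product of the two roots equals $\psi(a)/\psi(b) < 0$, which is why they have opposite signs.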
4.3.17 Theorem Let $B \equiv [b_{ij}]$ be a symmetric n-square real matrix. Let $B$ be invertible. Let

\[
\psi(x_1,\dots,x_n) \equiv \sum_{i=1}^{n}\Bigl(\sum_{j=1}^{n} b_{ij}x_ix_j\Bigr) = \sum_{i=1}^{n} x_i\Bigl(\sum_{j=1}^{n} b_{ij}x_j\Bigr) = [x_1,\dots,x_n]\,[b_{ij}]\,[x_1,\dots,x_n]^T
\]

be a real quadratic form. Let $\psi(x_1,\dots,x_n)$ be a definite form. Let $(a_1,\dots,a_n)$ be a real point that is different from the origin. Then $\psi(a_1,\dots,a_n) \neq 0$.
Proof Case I: $\psi(x_1,\dots,x_n)$ is a positive definite form. Since $\psi(x_1,\dots,x_n)$ is a definite form, $\operatorname{rank}(B) = n$. By 4.2.17, there exists a real invertible matrix $R$ such that

\[
RBR^T = \operatorname{diag}(\underbrace{1,\dots,1}_{n}).
\]

It follows that

\[
B = R^{-1}\bigl(\operatorname{diag}(\underbrace{1,\dots,1}_{n})\bigr)(R^T)^{-1} = R^{-1}\bigl(\operatorname{diag}(\underbrace{1,\dots,1}_{n})\bigr)(R^{-1})^T,
\]

and hence

\[
\begin{aligned}
\psi(x_1,\dots,x_n) &\equiv [x_1,\dots,x_n]\,B\,[x_1,\dots,x_n]^T \\
&\equiv [x_1,\dots,x_n]\left(R^{-1}\bigl(\operatorname{diag}(\underbrace{1,\dots,1}_{n})\bigr)(R^{-1})^T\right)[x_1,\dots,x_n]^T \\
&\equiv [y_1,\dots,y_n]\,[y_1,\dots,y_n]^T,
\end{aligned}
\]

where $[y_1,\dots,y_n] \equiv [x_1,\dots,x_n]R^{-1}$. Since $R$ is invertible and $(a_1,\dots,a_n) \neq (0,\dots,0)$, $[a_1,\dots,a_n]R^{-1}$ is nonzero. Put $[b_1,\dots,b_n] \equiv [a_1,\dots,a_n]R^{-1}$. Thus $(b_1,\dots,b_n) \neq (0,\dots,0)$. It follows that $(b_1)^2 + \cdots + (b_n)^2 \neq 0$. Also,

\[
\begin{aligned}
\psi(a_1,\dots,a_n) &= [a_1,\dots,a_n]\left(R^{-1}\bigl(\operatorname{diag}(\underbrace{1,\dots,1}_{n})\bigr)(R^{-1})^T\right)[a_1,\dots,a_n]^T \\
&= [b_1,\dots,b_n]\bigl(\operatorname{diag}(\underbrace{1,\dots,1}_{n})\bigr)[b_1,\dots,b_n]^T = (b_1)^2 + \cdots + (b_n)^2 \neq 0,
\end{aligned}
\]

so $\psi(a_1,\dots,a_n) \neq 0$.

Case II: $\psi(x_1,\dots,x_n)$ is a negative definite form. This case is similar to Case I. ∎
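Case I can be checked numerically. The sketch below uses a Cholesky factorization $B = LL^T$ as a stand-in for the matrix $R$ of 4.2.17 (this substitution is an assumption made for illustration: with $R = L^{-1}$ one has $RBR^T = I$ and $R^{-1} = L$), so that $[b_1,\dots,b_n] = [a_1,\dots,a_n]L$ and $\psi(a)$ becomes a sum of squares. The sample matrix and the function names are hypothetical.

```python
from math import sqrt

def cholesky(B):
    """Lower-triangular L with B = L L^T, for symmetric positive definite B."""
    n = len(B)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(L[i][k] * L[j][k] for k in range(j))
            if i == j:
                L[i][j] = sqrt(B[i][i] - s)
            else:
                L[i][j] = (B[i][j] - s) / L[j][j]
    return L

def psi(B, x):
    """Value of the quadratic form x B x^T."""
    n = len(x)
    return sum(B[i][j] * x[i] * x[j] for i in range(n) for j in range(n))

# Illustrative positive definite matrix (an assumption, not from the text).
B = [[2.0, 1.0], [1.0, 2.0]]
L = cholesky(B)                     # plays the role of R^{-1}

a = [1.0, -1.0]                     # any nonzero real point
b = [sum(a[i] * L[i][j] for i in range(2)) for j in range(2)]   # b = a L
sum_sq = sum(t * t for t in b)

print(psi(B, a), sum_sq)            # the two values agree up to rounding
```

Because $L$ is invertible, `b` is nonzero whenever `a` is, so the sum of squares cannot vanish.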
4.3.18 Note Let $A \equiv [a_{ij}]$ and $B \equiv [b_{ij}]$ be symmetric n-square real matrices. Let $B$ be invertible. Let

\[
\left.
\begin{aligned}
\phi(x_1,\dots,x_n) &\equiv \sum a_{ij}x_ix_j = [x_1,\dots,x_n]\,A\,[x_1,\dots,x_n]^T \\
\psi(x_1,\dots,x_n) &\equiv \sum b_{ij}x_ix_j = [x_1,\dots,x_n]\,B\,[x_1,\dots,x_n]^T
\end{aligned}
\right\}
\]

be a pair of real quadratic forms. Let $\psi$ be positive definite. Let $\lambda_1$ be a root of the $\lambda$-equation of $\phi$ and $\psi$.

Then by 4.3.9, there exists a one-to-one linear transformation $[x_1,\dots,x_n] \mapsto [z_1,\dots,z_n]$ such that the pair's reduced forms are

\[
\left.
\begin{aligned}
\phi(x_1,\dots,x_n) &\equiv \lambda_1 c_1 (z_1)^2 + \phi_1(z_2,\dots,z_n) \\
\psi(x_1,\dots,x_n) &\equiv c_1 (z_1)^2 + \psi_1(z_2,\dots,z_n)
\end{aligned}
\right\},
\]

where $c_1$ is a nonzero real number.

Clearly, the matrix associated with $\psi_1$ is invertible.

Proof Suppose to the contrary that the matrix associated with $\psi_1$ is not invertible. We seek a contradiction.

By 4.3.15, there exists a real point $(d_2,\dots,d_n) \neq (0,\dots,0)$ such that $\psi_1(d_2,\dots,d_n) = 0$. Suppose that $([x_1,\dots,x_n] =)\,[a_1,\dots,a_n] \mapsto [0,d_2,\dots,d_n]\,(= [z_1,\dots,z_n])$. Since $(0,d_2,\dots,d_n) \neq (0,\dots,0)$, and the linear transformation $[x_1,\dots,x_n] \mapsto [z_1,\dots,z_n]$ is one-to-one, $(a_1,\dots,a_n)$ is nonzero. Also,

\[
\psi(a_1,\dots,a_n) = c_1 (0)^2 + \psi_1(d_2,\dots,d_n) = 0,
\]

so $\psi(a_1,\dots,a_n) = 0$. Since $(a_1,\dots,a_n)$ is nonzero and $\psi$ is positive definite, by 4.3.17, $\psi(a_1,\dots,a_n) \neq 0$. This is a contradiction. ∎
Clearly, $\psi_1$ is definite.

Proof Suppose to the contrary that $\psi_1$ is indefinite. We seek a contradiction.

By 4.3.16, there exists a real point $(d_2,\dots,d_n) \neq (0,\dots,0)$ such that $\psi_1(d_2,\dots,d_n) = 0$. Suppose that $([x_1,\dots,x_n] =)\,[a_1,\dots,a_n] \mapsto [0,d_2,\dots,d_n]\,(= [z_1,\dots,z_n])$. Since $(0,d_2,\dots,d_n) \neq (0,\dots,0)$, and the linear transformation $[x_1,\dots,x_n] \mapsto [z_1,\dots,z_n]$ is one-to-one, $(a_1,\dots,a_n)$ is nonzero. Also,

\[
\psi(a_1,\dots,a_n) = c_1 (0)^2 + \psi_1(d_2,\dots,d_n) = 0,
\]

so $\psi(a_1,\dots,a_n) = 0$. Since $(a_1,\dots,a_n)$ is nonzero and $\psi$ is positive definite, by 4.3.17, $\psi(a_1,\dots,a_n) \neq 0$. This is a contradiction. ∎
4.3.19 Conclusion Let $A \equiv [a_{ij}]$ and $B \equiv [b_{ij}]$ be symmetric n-square real matrices. Let $B$ be invertible. Let

\[
\left.
\begin{aligned}
\phi(x_1,\dots,x_n) &\equiv \sum a_{ij}x_ix_j = [x_1,\dots,x_n]\,A\,[x_1,\dots,x_n]^T \\
\psi(x_1,\dots,x_n) &\equiv \sum b_{ij}x_ix_j = [x_1,\dots,x_n]\,B\,[x_1,\dots,x_n]^T
\end{aligned}
\right\}
\]

be a pair of real quadratic forms. Let $\psi$ be positive definite. Let $\lambda_1$ be a root of the $\lambda$-equation of $\phi$ and $\psi$. Let $[x_1,\dots,x_n] \mapsto [z_1,\dots,z_n]$ be a one-to-one linear transformation such that the pair's reduced forms are

\[
\left.
\begin{aligned}
\phi(x_1,\dots,x_n) &\equiv \lambda_1 c_1 (z_1)^2 + \phi_1(z_2,\dots,z_n) \\
\psi(x_1,\dots,x_n) &\equiv c_1 (z_1)^2 + \psi_1(z_2,\dots,z_n)
\end{aligned}
\right\},
\]

where $c_1$ is a nonzero real number. Then

1. the matrix associated with $\psi_1$ is invertible,
2. $\psi_1$ is definite.
4.3.20 Theorem Let $A \equiv [a_{ij}]$ and $B \equiv [b_{ij}]$ be symmetric n-square real matrices. Let $B$ be invertible. Let

\[
\left.
\begin{aligned}
\phi(x_1,\dots,x_n) &\equiv \sum a_{ij}x_ix_j = [x_1,\dots,x_n]\,A\,[x_1,\dots,x_n]^T \\
\psi(x_1,\dots,x_n) &\equiv \sum b_{ij}x_ix_j = [x_1,\dots,x_n]\,B\,[x_1,\dots,x_n]^T
\end{aligned}
\right\}
\]

be a pair of real quadratic forms. Let $\psi$ be positive definite. Let $\lambda_1,\dots,\lambda_n$ be the roots of the $\lambda$-equation of $\phi$ and $\psi$. Let $[z_1,\dots,z_n] = [x_1,\dots,x_n]Q$ be a one-to-one linear transformation such that the pair's reduced forms are

\[
\left.
\begin{aligned}
\phi(x_1,\dots,x_n) &\equiv \lambda_1 c_1 (z_1)^2 + \phi_1(z_2,\dots,z_n) \\
\psi(x_1,\dots,x_n) &\equiv c_1 (z_1)^2 + \psi_1(z_2,\dots,z_n)
\end{aligned}
\right\},
\]

where $c_1$ is a nonzero real number and $Q$ is an invertible n-square real matrix. Then $\lambda_2,\dots,\lambda_n$ are the roots of the $\lambda$-equation of $\phi_1$ and $\psi_1$.
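Proof Let $A_1$ be the $(n-1)$-square real symmetric matrix associated with the quadratic form $\phi_1(z_2,\dots,z_n)$. Let $B_1$ be the $(n-1)$-square real symmetric matrix associated with the quadratic form $\psi_1(z_2,\dots,z_n)$. Clearly,

\[
\begin{aligned}
[x_1,\dots,x_n]\,A\,[x_1,\dots,x_n]^T
&= [z_1,\dots,z_n]\begin{bmatrix}\lambda_1 c_1 & 0\\ 0 & A_1\end{bmatrix}[z_1,\dots,z_n]^T \\
&= \bigl([x_1,\dots,x_n]Q\bigr)\begin{bmatrix}\lambda_1 c_1 & 0\\ 0 & A_1\end{bmatrix}\bigl([x_1,\dots,x_n]Q\bigr)^T \\
&= [x_1,\dots,x_n]\left(Q\begin{bmatrix}\lambda_1 c_1 & 0\\ 0 & A_1\end{bmatrix}Q^T\right)[x_1,\dots,x_n]^T,
\end{aligned}
\]

so

\[
A = Q\begin{bmatrix}\lambda_1 c_1 & 0\\ 0 & A_1\end{bmatrix}Q^T.
\]

Similarly,

\[
B = Q\begin{bmatrix}c_1 & 0\\ 0 & B_1\end{bmatrix}Q^T.
\]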
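Since $Q$ is invertible, $\det(Q)$ is a nonzero real number. Since $\lambda_1,\dots,\lambda_n$ are the roots of the $\lambda$-equation of $\phi$ and $\psi$, and the coefficient of $\lambda^n$ in $\det(A - \lambda B)$ is $(-1)^n\det(B)$, which is nonzero because $B$ is invertible, we have

\[
\det(A - \lambda B) = \det(B)\,(\lambda_1 - \lambda)(\lambda_2 - \lambda)\cdots(\lambda_n - \lambda).
\]

It suffices to show that

\[
\det(A_1 - \lambda B_1) = (\text{nonzero constant})\,(\lambda_2 - \lambda)\cdots(\lambda_n - \lambda).
\]

Since

\[
\begin{aligned}
\det(B)\,(\lambda_1 - \lambda)(\lambda_2 - \lambda)\cdots(\lambda_n - \lambda) &= \det(A - \lambda B) \\
&= \det\left(Q\begin{bmatrix}\lambda_1 c_1 & 0\\ 0 & A_1\end{bmatrix}Q^T - \lambda Q\begin{bmatrix}c_1 & 0\\ 0 & B_1\end{bmatrix}Q^T\right) \\
&= \det\left(Q\begin{bmatrix}\lambda_1 c_1 - \lambda c_1 & 0\\ 0 & A_1 - \lambda B_1\end{bmatrix}Q^T\right) \\
&= \det(Q)\cdot\det\begin{bmatrix}\lambda_1 c_1 - \lambda c_1 & 0\\ 0 & A_1 - \lambda B_1\end{bmatrix}\cdot\det(Q^T) \\
&= \det(Q)\cdot\det\begin{bmatrix}\lambda_1 c_1 - \lambda c_1 & 0\\ 0 & A_1 - \lambda B_1\end{bmatrix}\cdot\det(Q) \\
&= (\det(Q))^2\cdot\det\begin{bmatrix}\lambda_1 c_1 - \lambda c_1 & 0\\ 0 & A_1 - \lambda B_1\end{bmatrix} \\
&= (\det(Q))^2\cdot(\lambda_1 c_1 - \lambda c_1)\cdot\det(A_1 - \lambda B_1) \\
&= (\det(Q))^2\cdot(\lambda_1 - \lambda)\,c_1\cdot\det(A_1 - \lambda B_1),
\end{aligned}
\]

cancelling the common factor $(\lambda_1 - \lambda)$ gives

\[
\det(A_1 - \lambda B_1) = \frac{\det(B)}{c_1(\det(Q))^2}\,(\lambda_2 - \lambda)\cdots(\lambda_n - \lambda),
\]

and the constant $\det(B)/\bigl(c_1(\det(Q))^2\bigr)$ is nonzero. ∎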
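The determinant chain in this proof can be spot-checked with exact integer arithmetic. The sketch below builds $A$ and $B$ from hand-picked data (the matrix `Q`, the numbers `c1` and `lam1`, and the blocks `A1`, `B1` are all illustrative assumptions) and verifies that $\det(A - \lambda B) = (\det Q)^2(\lambda_1 - \lambda)\,c_1\det(A_1 - \lambda B_1)$ at several integer values of $\lambda$. The helper names are hypothetical.

```python
def matmul(X, Y):
    """Product of two matrices stored as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def transpose(X):
    """Transpose of a matrix stored as a list of rows."""
    return [list(col) for col in zip(*X)]

def det(X):
    """Determinant by cofactor expansion along the first row."""
    if len(X) == 1:
        return X[0][0]
    return sum((-1) ** j * X[0][j] * det([row[:j] + row[j + 1:] for row in X[1:]])
               for j in range(len(X)))

def block(top, M):
    """Block-diagonal matrix with the scalar `top` followed by the block M."""
    n = len(M)
    return [[top] + [0] * n] + [[0] + list(row) for row in M]

def pencil(A, B, lam):
    """The matrix A - lam * B."""
    n = len(A)
    return [[A[i][j] - lam * B[i][j] for j in range(n)] for i in range(n)]

# Illustrative data (assumptions, not from the text).
Q = [[2, 1, 0], [0, 1, 1], [0, 0, 1]]      # det(Q) = 2
c1, lam1 = 3, 5
A1 = [[1, 2], [2, 0]]
B1 = [[2, 1], [1, 1]]                      # positive definite

A = matmul(matmul(Q, block(lam1 * c1, A1)), transpose(Q))
B = matmul(matmul(Q, block(c1, B1)), transpose(Q))

identity_holds = all(
    det(pencil(A, B, lam)) == det(Q) ** 2 * (lam1 - lam) * c1 * det(pencil(A1, B1, lam))
    for lam in range(-3, 4)
)
print(identity_holds)
```

Since the identity holds for every $\lambda$, any root of $\det(A_1 - \lambda B_1) = 0$ is also a root of $\det(A - \lambda B) = 0$, and $\lambda_1$ accounts for the remaining linear factor.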
4.3.21 Note Let $A \equiv [a_{ij}]$ and $B \equiv [b_{ij}]$ be symmetric n-square real matrices. Let $B$ be invertible. Let

\[
\left.
\begin{aligned}
\phi(x_1,\dots,x_n) &\equiv \sum a_{ij}x_ix_j = [x_1,\dots,x_n]\,A\,[x_1,\dots,x_n]^T \\
\psi(x_1,\dots,x_n) &\equiv \sum b_{ij}x_ix_j = [x_1,\dots,x_n]\,B\,[x_1,\dots,x_n]^T
\end{aligned}
\right\}
\]

be a pair of real quadratic forms. Let $\psi$ be positive definite. Let $\lambda_1,\dots,\lambda_n$ be the roots of the $\lambda$-equation of $\phi$ and $\psi$. Let $[z_1,\dots,z_n] = [x_1,\dots,x_n]Q$ be a one-to-one linear transformation such that the pair's reduced forms are

\[
\left.
\begin{aligned}
\phi(x_1,\dots,x_n) &\equiv \lambda_1 c_1 (z_1)^2 + \phi_1(z_2,\dots,z_n) \\
\psi(x_1,\dots,x_n) &\equiv c_1 (z_1)^2 + \psi_1(z_2,\dots,z_n)
\end{aligned}
\right\},
\]

where $c_1$ is a nonzero real number and $Q$ is an invertible n-square real matrix. By 4.3.19,

1. the matrix associated with $\psi_1$ is invertible,
2. $\psi_1$ is definite.

Next, by 4.3.20,

3. $\lambda_2,\dots,\lambda_n$ are the roots of the $\lambda$-equation of $\phi_1$ and $\psi_1$.

Again, by repeating the same procedure, there exists a one-to-one linear transformation $[z_1,z_2,\dots,z_n] \mapsto [w_1,w_2,\dots,w_n]$ such that $z_1 = w_1$, and the pair's reduced forms are

\[
\left.
\begin{aligned}
\phi_1(z_2,\dots,z_n) &\equiv \lambda_2 c_2 (w_2)^2 + \phi_2(w_3,\dots,w_n) \\
\psi_1(z_2,\dots,z_n) &\equiv c_2 (w_2)^2 + \psi_2(w_3,\dots,w_n)
\end{aligned}
\right\},
\]

where $c_2$ is a nonzero real number. Also,

1. the matrix associated with $\psi_2$ is invertible,
2. $\psi_2$ is definite,
3. $\lambda_3,\dots,\lambda_n$ are the roots of the $\lambda$-equation of $\phi_2$ and $\psi_2$.

It follows that

\[
\left.
\begin{aligned}
\phi(x_1,\dots,x_n) &\equiv \lambda_1 c_1 (w_1)^2 + \lambda_2 c_2 (w_2)^2 + \phi_2(w_3,\dots,w_n) \\
\psi(x_1,\dots,x_n) &\equiv c_1 (w_1)^2 + c_2 (w_2)^2 + \psi_2(w_3,\dots,w_n)
\end{aligned}
\right\}.
\]

On repeating the above procedure, we get a one-to-one linear transformation $[x_1,\dots,x_n] \mapsto [v_1,\dots,v_n]$ such that the pair's reduced forms are

\[
\left.
\begin{aligned}
\phi(x_1,\dots,x_n) &\equiv \lambda_1 c_1 (v_1)^2 + \cdots + \lambda_n c_n (v_n)^2 \\
\psi(x_1,\dots,x_n) &\equiv c_1 (v_1)^2 + \cdots + c_n (v_n)^2
\end{aligned}
\right\},
\]

where each $c_i$ is a nonzero real number.
4.3.22 Conclusion Let $A \equiv [a_{ij}]$ and $B \equiv [b_{ij}]$ be symmetric n-square real matrices. Let $B$ be invertible. Let

\[
\left.
\begin{aligned}
\phi(x_1,\dots,x_n) &\equiv \sum a_{ij}x_ix_j = [x_1,\dots,x_n]\,A\,[x_1,\dots,x_n]^T \\
\psi(x_1,\dots,x_n) &\equiv \sum b_{ij}x_ix_j = [x_1,\dots,x_n]\,B\,[x_1,\dots,x_n]^T
\end{aligned}
\right\}
\]

be a pair of real quadratic forms. Let $\psi$ be positive definite. Let $\lambda_1,\dots,\lambda_n$ be the roots of the $\lambda$-equation of $\phi$ and $\psi$. Then there exists a one-to-one linear transformation $[x_1,\dots,x_n] \mapsto [v_1,\dots,v_n]$ such that the pair's reduced forms are

\[
\left.
\begin{aligned}
\phi(x_1,\dots,x_n) &\equiv \lambda_1 c_1 (v_1)^2 + \cdots + \lambda_n c_n (v_n)^2 \\
\psi(x_1,\dots,x_n) &\equiv c_1 (v_1)^2 + \cdots + c_n (v_n)^2
\end{aligned}
\right\},
\]

where each $c_i$ is a nonzero real number.
4.3.23 Theorem Let $A \equiv [a_{ij}]$ and $B \equiv [b_{ij}]$ be symmetric n-square real matrices. Let $B$ be invertible. Let

\[
\left.
\begin{aligned}
\phi(x_1,\dots,x_n) &\equiv \sum a_{ij}x_ix_j = [x_1,\dots,x_n]\,A\,[x_1,\dots,x_n]^T \\
\psi(x_1,\dots,x_n) &\equiv \sum b_{ij}x_ix_j = [x_1,\dots,x_n]\,B\,[x_1,\dots,x_n]^T
\end{aligned}
\right\}
\]

be a pair of real quadratic forms. Let $\psi$ be positive definite. Let $\lambda_1,\dots,\lambda_n$ be the roots of the $\lambda$-equation of $\phi$ and $\psi$. Then there exists a one-to-one linear transformation $[x_1,\dots,x_n] \mapsto [y_1,\dots,y_n]$ such that the pair's reduced forms are

\[
\left.
\begin{aligned}
\phi(x_1,\dots,x_n) &\equiv \lambda_1 (y_1)^2 + \cdots + \lambda_n (y_n)^2 \\
\psi(x_1,\dots,x_n) &\equiv (y_1)^2 + \cdots + (y_n)^2
\end{aligned}
\right\}.
\]
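Proof By 4.3.22, there exists a one-to-one linear transformation $[x_1,\dots,x_n] \mapsto [v_1,\dots,v_n]$ such that the pair's reduced forms are

\[
\left.
\begin{aligned}
\phi(x_1,\dots,x_n) &\equiv \lambda_1 c_1 (v_1)^2 + \cdots + \lambda_n c_n (v_n)^2 \\
\psi(x_1,\dots,x_n) &\equiv c_1 (v_1)^2 + \cdots + c_n (v_n)^2
\end{aligned}
\right\},
\]

where each $c_i$ is a nonzero real number. On applying the one-to-one linear transformation

\[
\left.
\begin{aligned}
v_1 &= \frac{1}{\sqrt{|c_1|}}\,y_1 \\
&\ \,\vdots \\
v_n &= \frac{1}{\sqrt{|c_n|}}\,y_n
\end{aligned}
\right\},
\]

we get the following reduced forms:

\[
\left.
\begin{aligned}
\phi(x_1,\dots,x_n) &\equiv \lambda_1\frac{c_1}{|c_1|}(y_1)^2 + \cdots + \lambda_n\frac{c_n}{|c_n|}(y_n)^2 \\
\psi(x_1,\dots,x_n) &\equiv \frac{c_1}{|c_1|}(y_1)^2 + \cdots + \frac{c_n}{|c_n|}(y_n)^2
\end{aligned}
\right\}.
\]

Here each $\frac{c_i}{|c_i|}$ is equal to 1 or −1. So if $\psi(x_1,\dots,x_n)$ is positive definite, then each $\frac{c_i}{|c_i|}$ is equal to 1, and hence

\[
\left.
\begin{aligned}
\phi(x_1,\dots,x_n) &\equiv \lambda_1 (y_1)^2 + \cdots + \lambda_n (y_n)^2 \\
\psi(x_1,\dots,x_n) &\equiv (y_1)^2 + \cdots + (y_n)^2
\end{aligned}
\right\}.
\]

Similarly, if $\psi(x_1,\dots,x_n)$ is negative definite, then

\[
\left.
\begin{aligned}
\phi(x_1,\dots,x_n) &\equiv -\bigl(\lambda_1 (y_1)^2 + \cdots + \lambda_n (y_n)^2\bigr) \\
\psi(x_1,\dots,x_n) &\equiv -\bigl((y_1)^2 + \cdots + (y_n)^2\bigr)
\end{aligned}
\right\}.
\]

∎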
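The conclusion of 4.3.23 can be observed numerically in the case n = 2. The sketch below does not follow the step-by-step reduction of 4.3.21; it swaps in a different, standard route: solve the $\lambda$-equation $\det(A - \lambda B) = 0$ directly, take for each root a vector $p$ with $p(A - \lambda B) = 0$, and scale it so that $pBp^T = 1$. It assumes that $B$ is positive definite and that the two roots are distinct. The sample matrices and the names `form` and `simultaneous_reduce` are illustrative assumptions.

```python
from math import sqrt

def form(M, x, y):
    """Value of x M y^T for a 2x2 symmetric matrix M."""
    return sum(M[i][j] * x[i] * y[j] for i in range(2) for j in range(2))

def simultaneous_reduce(A, B):
    """Roots of det(A - lam*B) = 0 and B-normalized row vectors p with p(A - lam*B) = 0.

    Assumes B is positive definite and the two roots are distinct.
    """
    # det(A - lam*B) = p2*lam^2 - p1*lam + p0
    p2 = B[0][0] * B[1][1] - B[0][1] ** 2
    p1 = A[0][0] * B[1][1] + A[1][1] * B[0][0] - 2 * A[0][1] * B[0][1]
    p0 = A[0][0] * A[1][1] - A[0][1] ** 2
    d = sqrt(p1 * p1 - 4 * p2 * p0)
    lams = [(p1 + d) / (2 * p2), (p1 - d) / (2 * p2)]
    rows = []
    for lam in lams:
        m11 = A[0][0] - lam * B[0][0]
        m12 = A[0][1] - lam * B[0][1]
        m22 = A[1][1] - lam * B[1][1]
        # Two candidate null vectors of the singular matrix; keep the larger one.
        v1, v2 = (m12, -m11), (m22, -m12)
        v = v1 if abs(v1[0]) + abs(v1[1]) >= abs(v2[0]) + abs(v2[1]) else v2
        s = sqrt(form(B, v, v))
        rows.append((v[0] / s, v[1] / s))
    return lams, rows

# Illustrative pair of forms (an assumption, not from the text).
A = [[2.0, 1.0], [1.0, 2.0]]
B = [[2.0, 0.0], [0.0, 1.0]]
lams, P = simultaneous_reduce(A, B)

# With x = y1*P[0] + y2*P[1], psi becomes y1^2 + y2^2 and phi becomes
# lams[0]*y1^2 + lams[1]*y2^2.
y = (0.5, -2.0)
x = [y[0] * P[0][k] + y[1] * P[1][k] for k in range(2)]
print(form(B, x, x), y[0] ** 2 + y[1] ** 2)
print(form(A, x, x), lams[0] * y[0] ** 2 + lams[1] * y[1] ** 2)
```

In matrix terms, the rows of `P` satisfy $PBP^T = I$ and $PAP^T = \operatorname{diag}(\lambda_1, \lambda_2)$, which is the reduced pair stated in the theorem.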
Exercises
1. Let $V$ be any n-dimensional vector space over the field $F$. Let $T : V \to V$ be a linear transformation. Show that there exists a positive integer $k$ such that

\[
i \geq k \implies N(T^k) = N(T^i),
\]

and $N(T^{k-1})$ is a proper subset of $N(T^k)$.

2. Let $V$ be any n-dimensional vector space over the field $F$. Let $T : V \to V$ be a linear transformation. Show that $\operatorname{ran}(T^3)$ is invariant under $T$.
3. Let $V$ be any n-dimensional inner product space over the field $\mathbb{C}$. Let $T : V \to V$ be a normal linear transformation. Suppose that all the eigenvalues of $T$ are real. Show that $T$ is Hermitian.

4. Let $A$ be a 6-square complex matrix. Suppose that $A$ is a nonnegative definite matrix. Show that there exists a unitary $6 \times 6$ matrix $U$ such that

a. $A = U\bigl(\operatorname{diag}(\lambda_1,\dots,\lambda_6)\bigr)U^{*}$,
b. $\lambda_1,\dots,\lambda_6$ are the eigenvalues of the matrix $A$,
c. each $\lambda_i$ is a nonnegative real number,
d. $\det(A)$ is a nonnegative real number.

5. Let $A$ be a 6-square complex matrix. Let $A$ be symmetric and unitary. Show that there exists a symmetric unitary complex matrix $S$ such that $S^2 = A$.

6. Let $[a_{ij}]$ be an n-square real symmetric matrix. Let

\[
\phi(x_1,\dots,x_n) \equiv \sum_{i,j} a_{ij}x_ix_j
\]

be a real quadratic form. Show that there exists a real invertible matrix $[c_{ij}]$ such that the transformation

\[
\left.
\begin{aligned}
x_1 &= c_{11}y_1 + \cdots + c_{1n}y_n \\
&\ \,\vdots \\
x_n &= c_{n1}y_1 + \cdots + c_{nn}y_n
\end{aligned}
\right\}
\]

reduces the form $\sum_{i,j} a_{ij}x_ix_j$ to the form

\[
(y_1)^2 + \cdots + (y_r)^2 - (y_{r+1})^2 - \cdots - (y_{r+s})^2.
\]

7. Let $A \equiv [a_{ij}]$ and $B \equiv [b_{ij}]$ be symmetric 5-square real matrices. Let $B$ be invertible. Let

\[
\left.
\begin{aligned}
\phi(x_1,\dots,x_5) &\equiv [x_1,\dots,x_5]\,A\,[x_1,\dots,x_5]^T \\
\psi(x_1,\dots,x_5) &\equiv [x_1,\dots,x_5]\,B\,[x_1,\dots,x_5]^T
\end{aligned}
\right\}
\]

be a pair of real quadratic forms. Let $\psi$ be positive definite. Let $\lambda_1,\dots,\lambda_5$ be the roots of the $\lambda$-equation of $\phi$ and $\psi$. Show that there exists a one-to-one linear transformation $[x_1,\dots,x_5] \mapsto [v_1,\dots,v_5]$ such that the pair's reduced forms are

\[
\left.
\begin{aligned}
\phi(x_1,\dots,x_5) &\equiv \lambda_1 c_1 (v_1)^2 + \cdots + \lambda_5 c_5 (v_5)^2 \\
\psi(x_1,\dots,x_5) &\equiv c_1 (v_1)^2 + \cdots + c_5 (v_5)^2
\end{aligned}
\right\},
\]

where each $c_i$ is a nonzero real number.
8. Let $A$ be an n-square complex matrix. Suppose that $A$ is a nonnegative definite matrix. Show that the square root of $A$ exists.

9. Let $V$ be any n-dimensional vector space over the field $F$. Let $T : V \to V$ be a linear transformation. Show that there exists a positive integer $k$ such that

\[
V = N(T^k) \oplus \operatorname{ran}(T^k).
\]

10. Suppose that $A \equiv [a_{ij}]$ is a symmetric n-square real matrix. Let

\[
\phi(x_1,\dots,x_n) = [x_1,\dots,x_n]\,A\,[x_1,\dots,x_n]^T
\]

be a real quadratic form. Suppose that $a_{11} \neq 0$. Show that the one-to-one linear transformation

\[
[y_1,\dots,y_n]^T =
\begin{bmatrix}
a_{11} & a_{12} & \cdots & a_{1n} \\
0 & 1 & \cdots & 0 \\
\vdots & \vdots & \ddots & \vdots \\
0 & 0 & \cdots & 1
\end{bmatrix}
[x_1,\dots,x_n]^T
\]

reduces $\phi(x_1,\dots,x_n)$ to

\[
\frac{1}{a_{11}}(y_1)^2 + \phi_1(y_2,\dots,y_n),
\]

where $\phi_1(y_2,\dots,y_n)$ is a quadratic form.