Chapter 8 Linear Functions and Matrices - WordPress.com · 8.2 Linear and Afﬁne Functions. A...

Chapter 8

Linear Functions and Matrices

Functions (also known as ‘mappings,’ or ‘maps’) between Euclidean spaces areubiquitous in physics, engineering, and mathematics. Examples abound. Scalarfunctions of a single variable are the subject of differential and integral calculus,while scalar and vector functions of variables in R3 are studied in vector calculus,with applications in mechanics, electromagnetism, fluid mechanics, and elasticity.

Among the most important functions are linear functions. A precise defini-tion will be given later in this chapter, but, for the moment, we can think of linesand planes that pass through the origin as prototypical linear functions. Thoughless general than their non-linear counterparts, linear functions are no less impor-tant because many non-linear functions benefit from a locally linear analysis. Thederivative (Fig. 3.1) provides an example of a linear approximation near a point ofany function (where the derivative exists). Higher-dimensional analogues of thisconstruction lead to the notion of locally linear functions (planes, hyperplanes)near any point of a smooth surface or hypersurface. Changes of integration vari-ables between multivariable Cartesian coordinates and other coordinate systemsinvolve the Jacobian, which determines how the volume element (an infinitesimaln-dimensional hypercube) changes under the variable transformation.

In this chapter, we formalize the definition of a function between Euclideanspaces. We then specialize our discussion to linear functions and develop theirkey properties. Examples are used illustrate the various concepts associated withfunctions. The main theorem in this chapter establishes the equivalence betweenlinear functions and matrices. Of these, square matrices are the most commonbecause they are used for operations that map Rn onto itself, such as rotationsand other rigid-body transformations used throughout physics, but especially inclassical and quantum mechanics. We conclude with several examples of commonlinear transformations and their matrix realizations.

103

Mark Gill

MPH

Mark Gill

marksphysicshelp

104 Linear Algebra

8.1 Functions between Euclidean SpacesRecall that a function is a relation between two sets: an input set and a set ofallowed outputs with the property that each element of the input set is related toone and only one element of the output set. The following definition provides thestandard nomenclature for the input and output sets:

DEFINITION 8.1. The domain of a function f is the set over which it is defined,The range of f is the set of values taken by f over its domain.

The notation f : Rn → Rn indicates a function f whose domain is a (proper orimproper) subset of Rn and whose range is a (proper or improper) subset of Rm.

EXAMPLE 8.1. A function f : R→ R is represented by a curve in the x-y plane,with the input along the x-axis and the output along the y-axis. Figure 8.1 showsthree such functions that illustrate three domains and ranges that commonly arise.Figure 8.1(a) shows the function y = x3, for which the domain and range areboth the entire real line R. For the function y = tanhx, shown in Fig. 8.1(b), thedomain is the entire real line R, but the range is the subset (−1,1) of R. The rangeis an open interval because the limit points (±1) of the interval are attained onlyasymptotically as x → ±∞. Finally, Fig. 8.1(c) shows the function y =

√1− x2,

for which the domain is the subset [−1,1] and the range the subset [0,1] of R.

EXAMPLE 8.2. A function f : R2 → R is represented as a surface in R3. Fig-ure 8.2(a) shows the function f = x2+y2, whose surface is a paraboloid (a surfacegenerated by rotating a parabola about its axis of symmetry). The domain is all ofR2, which is represented by the x-y plane, and the range, which is plotted alongthe z-axis, is the non-negative real line [0,∞). Functions f : R → R3 have the

�a�

�2 �1 0 1 2�2

�1

0

1

2

x

f

�b�

�2 �1 0 1 2�2

�1

0

1

2

x

f

�c�

�2 �1 0 1 2�2

�1

0

1

2

x

f

1

Figure 8.1: The functions (a) y = x3, (b) y = tanhx, and (c) y =√

1− x2. Thedomain and range in (a) are both R, in (b) the domain is R and the range is (−1,1),and in (c), the domain is [−1,1] and the range [0,1].

Mark Gill

MPH

Mark Gill

marksphysicshelp

Linear Functions and Matrices 105

�2�1

01

2 �2�1012

0

1

2

3

4

(a) (b)x x

y y

f f

1

Figure 8.2: The functions (a) f = x2 + y2 and (b) f = (2cos4t,2sin4t, t). Thedomain of (a) is R2 and the range is the subset [0,∞) of R. For (b), the domain isthe subset [0,∞) of R, and the range is the subset [−2,2]× [−2,2]× [0,∞) of R3.

form (x(t),y(t),z(t)), which is the parametric form of a curve in R3. Such func-tions occur in mechanics as particle trajectories. Figure 8.2(b) shows the functionf = (2cos4t,2sin4t, t) for t ≥ 0, which is the domain of the function. The rangeis the subset [−2,2]× [−2,2]× [0,∞) of R3. The spiral is the type of trajectoryseen for a charged particle in a magnetic field (cf. Example 3.5).

8.2 Linear and Affine FunctionsA central theme of differential calculus is the approximation of nonlinear func-tions by linear functions determined by the derivative of the nonlinear function.Thus, as discussed in the introduction, linear functions have a far broader rangeof applications than their properties might suggest. This section formalizes thedefinition linear and affine functions, which provides the basis for establishing adirect connection to matrices. We begin with the definition of a linear function:

DEFINITION 8.2. A function f : Rn → Rn is linear if, for any vectors xxx,yyy ∈ Rn,

f (xxx+ yyy) = f (xxx)+ f (yyy) , (8.1)

and, for any xxx ∈ Rn and λ ∈ R,

f (λxxx) = λ f (xxx) . (8.2)

These conditions can be subsumed by a single requirement for linearity:

f (λxxx+µyyy) = λ f (xxx)+µ f (yyy) , (8.3)

for any xxx,yyy ∈ Rn and λ , µ ∈ R.

Mark Gill

MPH

Mark Gill

marksphysicshelp

106 Linear Algebra

EXAMPLE 8.3. Consider the function f = ax, which is the equation of a straightline through the origin with slope a. To test for linearity, we write

f (λx+µy) = a(λx+µy) = λ (ax)+µ(ay) = λ f (x)+µ f (y) , (8.4)

which, according (8.3) fulfills the condition for linearity. Now consider the func-tion f = ax+b, which is a straight line with slope a and y-intercept b. To test forlinearity, we again write

f (λx+µy) = a(λx+µy)+b = λ (ax)+µ(ay)+b , (8.5)

which, because of the additive factor of b, does not equal λ f (x)+ µ f (y), so thisfunction is not linear. In fact, such functions are called affine because they are acombination of a linear transformation (in this case ax) followed by a translation(in this case, b). Linear and affine functions share several properties, as we willshow below, but linear transformations have additional special properties.

Finally, consider the equation of a plane (4.20): Ax+By+Cz = D. Rearrang-ing this equation into a form where z is expressed as a function of x and y:

z(x,y) =DC− A

Cx− B

Cy , (8.6)

shows that z: R2 → R. Thus, by defining rrr = (x,y) and rrr� = (x�,y�), we test forlinearity by writing

z(λ rrr+µrrr�) =DC− A

C(λx+µx�)− B

C(λy+µy�)

=DC−λ

�AC

x+BC

y�−µ

�AC

x�+BC

y��

, (8.7)

which shows that z is a linear function only if D = 0; otherwise, it is affine.

There are several useful properties of linear functions, some of which areshared by affine functions. The following theorems prove several such proper-ties.

THEOREM 8.1. A linear function f : Rn →Rm maps the origin of Rn to the originof Rm.Proof. We invoke (8.3) with λxxx+µyyy = 000. Then, by using the properties of linearfunctions, we have, with µyyy =−λxxx,

f (000) = f (λxxx+µyyy) = f (λxxx)+ f (µyyy)

= f (λxxx)+ f (−λxxx) = λ f (xxx)−λ f (xxx) = 000. (8.8)

Mark Gill

MPH

Mark Gill

marksphysicshelp


Affine functions clearly do not share this property because the translational partof the function translates the origin in Rm. �THEOREM 8.2. A linear function f : Rn →Rm maps straight lines in Rn to straightlines Rm.Proof. From (4.1), the equation of a straight line in Rn is

xxx = xxx0 +λddd , (8.9)

in which xxx is any point on the line, xxx0 is a reference point on the line, λ ∈ R, andddd is the direction of the line. Thus, using the fact that f is a linear function,

f (xxx0 +λddd) = f (xxx0)+ f (λddd) = f (xxx0)+λ f (ddd) , (8.10)

which is the equation of a line in Rm, with reference point f (xxx0) and directionvector f (ddd). �

Building on the discussion in Example 8.3, the general form of an affine func-tion g: Rn →Rm is g = f +bbb, where f is a linear function and bbb is a vector in Rm.Thus, with (8.9) as input and using (8.10), the output of an affine function is

g(xxx0 +λddd) = f (xxx0 +λddd)+bbb = f (xxx0)+bbb+λ f (ddd) , (8.11)

which is a straight line with reference point f (xxx0)+bbb and direction vector f (ddd).Using analogous steps, we can also show that linear and affine functions both mapparallel lines in Rn to parallel lines in Rm.

8.3 Matrices Associated with Linear FunctionsSeveral general properties of linear functions were discussed in the preceding sec-tion. There are other properties for particular types of linear functions. A system-atic development of such properties relies on the association of linear transforma-tions with matrices. The basis of this association is the following theorem:

THEOREM 8.3. A function f : Rn → Rm is linear if and only if there exists anm×n matrix A such that f = Axxx, for all xxx ∈ Rn.

Proof. We first suppose that f = Axxx and show that the linearity of f follows. Forany two vectors xxx,yyy ∈ Rn and λ , µ ∈ R, the property (7.28) of matrices yields

f (λxxx+µyyy) = A(λxxx+µyyy) = λAxxx+µAyyy = λ f (xxx)+µ f (yyy) , (8.12)

Thus, according to (8.3), f is a linear function.

Mark Gill

MPH

Mark Gill

marksphysicshelp

108 Linear Algebra

Now suppose that f : Rn → Rm is a linear function. We must show that f ]Axxxfor all xxx ∈ Rn. We begin by expressing all vectors in Rn in terms of the standardbasis. These vectors, denoted as eeei, for i = 1,2, . . .n are column vectors with a 1in the ith row and zero in all other rows:

eee1 =

1

0...0

, eee2 =

0

1...0

, . . . , eeen =

0

0...1

. (8.13)

This basis is the natural extension to Rn of the familiar unit vectors iii ≡ eee1, jjj ≡ eee2,and kkk ≡ eee3 in R3:

iii =

1

0

0

, jjj =

0

1

0

, kkk =

0

0

1

, (8.14)

in terms of which any vectors in R3 can be expressed: xxx = x1iii+ x2 jjj+ x3kkk. Simi-larly, for any xxx ∈ Rn,

xxx =

x1

x2...

xn

= x1eee1 + x2eee2 + · · ·+ xneeen . (8.15)

Hence, by using the two conditions for linearity in (8.1) and (8.2), we have,

f (xxx) = f (x1eee1 + x2eee2 + · · ·+ xneeen)

= f (x1eee1)+ f (x1eee2)+ · · ·+ f (xneeen)

= x1 f (eee1)+ x2 f (eee2)+ · · ·+ xn f (eeen) . (8.16)

As f maps vectors in Rn to vectors in Rm, f (eeei) is an element of Rm, which is anm-dimensional column vector. We denote this vector as aaai and its components as

aaai =

a1i

a2i...

am1

. (8.17)

Mark Gill

MPH

Mark Gill

marksphysicshelp


Combining this result with (8.16) yields

f (xxx) = x1 f (eee1)+ x2 f (eee2)+ · · ·+ xn f (eeen)

= x1aaa1 + x2aaa2 + · · ·+ xnaaan

=

x1a11 + x2a12 + · · ·+ xna1n

x1a21 + x2a22 + · · ·+ xna2n...

......

x1am1 + x2am2 + · · ·+ xnamn

=

a11 a12 · · · a1n

a21 a22 · · · a2n...

... . . . ...am1 am2 · · · amn

x1

x2...

xn

≡ Axxx , (8.18)

which established the equivalence between linear functions f :Rn →Rm and m×nmatrices. �

We will explore some of the consequences of Theorem 8.3 once we examinesome of the implications of this theorem for the algebraic structure of matrices.

8.4 Linear Functions and Matrices1

Suppose we have two linear functions f : Rn → Rn and g: Rn → Rn. The sum,f (xxx)+g(xxx), multiplication by a scalar λ ∈ R, λ f (xxx), and composition ( f ◦g)(xxx)are well-defined operations. According to Theorem 8.3, there are n× n matricesA and B such that f (xxx) = Axxx and g(xxx) = Bxxx for all xxx ∈ Rn. The transcription ofthe foregoing function operations into matrix forms are:

f (xxx)+g(xxx) = Axxx+Bxxx = (A+B)xxx ,

λ f (xxx) = λ (Axxx) = (λA)xxx , (8.19)

( f ◦g)(xxx) = f (g(xxx)) = A(Bxxx) = (AB)xxx ,

which introduces the sum of two matrices, the multiplication of matrices by areal constant, and the multiplication of two matrices. We consider each of theseoperations in the following sections. Although each operation is valid for squarematrices, rectangular matrices (with certain restrictions for multiplication) mayalso be used.

1For information only; not examinable.

Mark Gill

MPH

Mark Gill

marksphysicshelp

110 Linear Algebra

8.4.1 Matrix AdditionAddition is defined for any matrices that have the same numbers of rows andcolumns. For m× n matrices A = (ai j) and B = (bi j), where i = 1,2, . . . ,m andj = 1,2, . . . ,n, the sum A+B is defined as the addition of entries which are in thesame position in the two matrices, that is, elements which have the same valuesof i and j:

A+B=

a11 a12 · · · a1n

a21 a22 · · · a2n...

... . . . ...am1 am2 · · · amn

+

b11 b12 · · · b1n

b21 b22 · · · b2n...

... . . . ...bm1 bm2 · · · bmn

=

a11 +b11 a12 +b12 · · · a1n +b1n

a21 +b21 a22 +b22 · · · a2n +b2n...

... . . . ...am1 +bm1 am2 +bm2 · · · amn +bmn

. (8.20)

This sum can be written in abbreviated form as A+B= (ai j +bi j) once the num-bers of rows and columns are specified. This definition makes apparent that A andB must have the same number of rows and columns, that is, both must be m× nmatrices, for their sum to be well-defined.

EXAMPLE 8.4. Consider the sum of the two matrices

A=

�2 5

−1 6

�, B=

�1 −1

3 9

�. (8.21)

These matrices have the same numbers of rows and columns, so, from (8.20), theirsum is calculated as

A+B=

�2 5

−1 6

�+

�1 −1

3 9

�

=

�2+1 5+(−1)

−1+3 6+9

�(8.22)

=

�3 4

3 15

�. (8.23)

Mark Gill

MPH

Mark Gill

marksphysicshelp


8.4.2 Multiplication of a Matrix by a Real ScalarSuppose A = (ai j) is an m× n matrix. Then the matrix λA, where λ ∈ R is thematrix λA= (λai j), where each element is multiplied by λ . In matrix form,

λA= λ

a11 a12 · · · a1n

a21 a22 · · · a2n...

... . . . ...am1 am2 · · · amn

=

λa11 λa12 · · · λa1n

λa21 λa22 · · · λa2n...

... . . . ...λam1 λam2 · · · λamn

. (8.24)

In the special case that λ = −1, (−1)A = (−ai j), which enables us to define thedifference A−B between matrices A and B as A−B= (ai j −bi j).

EXAMPLE 8.5. Consider the matrices

A=

1 2 3

4 5 6

−1 −2 −3

, B=

2 4 6

−1 3 5

4 −2 −3

. (8.25)

Then,

3A= 3

1 2 3

4 5 6

−1 −2 −3

=

3×1 3×2 3×3

3×4 3×5 3×6

3× (−1) 3× (−2) 3× (−3)

=

3 6 0

12 15 18

−3 −6 −9

, (8.26)

and

A−B=

1 2 3

4 5 6

−1 −2 −3

−

2 4 6

−1 3 5

4 −2 −3

=

−1 −2 −3

5 2 1

−5 0 0

(8.27)

Mark Gill

MPH

Mark Gill

marksphysicshelp

112 Linear Algebra

8.4.3 Linear Algebraic Structure of MatricesWe define the zero matrix 0 as the matrix having all entries equal to zero:

0 =

0 0 · · · 0

0 0 · · · 0...

... . . . ...0 0 · · · 0

. (8.28)

Then, for any m × n matrices A, B, C, and 0 and real numbers λ and µ , thealgebraic rules of the preceding two subsections imply:

1. Addition is commutative:A+B= B+A .

2. Addition is associative:

(A+B)+C= A+(B+C) .3. There exists a unique zero matrix 0 that obeys

A+0 = A .4. Every matrix A has a unique inverse −A under addition: such that

A+(−A) = 0.5. Scalar multiplication is distributive over matrix addition,

λ (A+B) = λA+λB ,

(λ +µ)A = λA+µA ,

and has a unit, signified by 1, such that,

1A= A ,

and successive multiplication of vectors by scalars satisfies

(λ µ)A= λ (µA) .

These laws for the operations of addition and multiplication by a real constantare analogous to the corresponding laws for vectors: m×n matrices form a linearvector space (Definition 2.1). Indeed, we may regard matrices as a generaliza-tion of vectors, namely, vectors of vectors. This can be made more precise bydefining a function that maps matrices to vectors by taking the first column ofthe matrix, to which we successively append each remaining column to obtain anmn-dimensional column vector. As each operation described above is carried outelement by element, m×n matrices form an mn-dimensional vector space.

Mark Gill

MPH

Mark Gill

marksphysicshelp


8.4.4 Matrix MultiplicationThe multiplication of matrices is a somewhat more subtle matter than the op-erations we have considered in the previous subsections. We first consider themultiplications of an m× p matrix A by an p-dimensional column vector bbb:

Abbb =

a11 a12 · · · a1p

a21 a22 · · · a2p...

... . . . ...am1 am2 · · · amp

b1

b2...

bp

=

a11b1 +a12b2 + · · ·+a1pbp

a21b1 +a22b2 + · · ·+a2pbp...

......

am1b1 +am2b2 + · · ·+ampbp

=

∑pk=1 a1kbk

∑pk=1 a2kbk

...∑p

k=1 amkbk

. (8.29)

Suppose we now have a p × n matrix B, which we write as n columns of p-dimensional vectors:

B=

b11 b12 · · · b1n

b21 b22 · · · b2n...

... . . . ...bp1 bp2 · · · bpn

= (bbb1,bbb2, . . . ,bbbn) , (8.30)

in which bbb j forms the jth column of B and has entries bi j, with i = 1,2, . . . , p.Thus, the entries of the product AB are

AB= (Abbb1,Abbb2, . . . ,Abbbn)

=

∑pk=1 a1kbk1 ∑p

k=1 a1kbk2 · · · ∑pk=1 a1kbkn

∑pk=1 a2kbk1 ∑p

k=1 a2kbk2 · · · ∑pk=1 a2kbkn

...... . . . ...

∑pk=1 amkbk1 ∑p

k=1 amkbk2 · · · ∑pk=1 amkbkn

, (8.31)

so the product AB has matrix elements

(AB)i j =p

∑k=1

aikbk j , (8.32)

for i = 1,2, . . . ,m and j = 1,2, . . . ,n, that is, the product is an m×n matrix.

Mark Gill

MPH

Mark Gill

marksphysicshelp

114 Linear Algebra

If m �= n, then A and B can be multiplied only in the order AB. If, however, m=n, the the produce is well-defined and yields a p× p matrix. The more interestingcase is where p = m = n, so that A and B are both n× n matrices, as are theirproducts AB and BA. The two products need not be equal, and this inequality hasprofound ramifications for quantum mechanics, when such matrices are associatedwith operations such as rotations about different axes.

EXAMPLE 8.6. Consider the matrices

A=

�3 −1

5 2

�, B=

�0 2

1 3

�, C=

�2 3 4

1 −1 2

�. (8.33)

The products AB, BA, AC, and BC are well-defined, and we find

AB=

�3 −1

5 2

��0 2

1 3

�=

�−1 3

2 16

�, (8.34)

BA=

�0 2

1 3

��3 −1

5 2

�=

�10 4

18 5

�, (8.35)

AC=

�3 −1

5 2

��2 3 4

1 −1 2

�=

�5 10 10

12 13 24

�, (8.36)

BC=

�0 2

1 3

��2 3 4

1 −1 2

�=

�2 −2 4

5 0 10

�. (8.37)

We see, in particular, that AB �= BA.

8.4.5 Properties of Matrix MultiplicationWe denote the unit matrix 1I as the matrix with its diagonal entries equal to 1 andall other entries equal to zero:

1I=

1 0 · · · 0

0 1 · · · 0...

... . . . ...0 0 · · · 1

. (8.38)

The following properties are valid for matrices A, B, and C for which the indicatedoperations are well-defined, and for any λ ∈ R:

Mark Gill

MPH

Mark Gill

marksphysicshelp


1. Matrix multiplication is distributive over addition:(A+B)C= AB+AC ,

A(B+C) = AB+AC .

2. Scalar multiplication commutes with matrix multiplication:

λ (AB) = (λA)B= A(λB) .

3. Matrix multiplication is associative:

A(BC) = (AB)C .

4. Zero matrix for matrix multiplication:

A0 = 0A= 0.

4. Unit matrix for matrix multiplication:

A1I= 1IA= A .

8.5 RotationsAmong the most widespread applications of Theorem 8.3 is determining the ma-trices of rigid-body transformations, such as rotations, reflections, and inversions.These operations appear in both classical and quantum mechanics, but the alge-braic properties of rotations in particular have important consequences for thequantum mechanical description of three-dimensional rotationally-invariant sys-tems. In this section, we will derive the matrices for rotations in R2 and R3.

8.5.1 Rotations in R2

We consider rotations about the origin in thew x-y plane. The positive direction ofrotation is usually taken in the counter-clockwise direction. For any vector xxx∈R2,

xxx =

�x1

x2

�= x1

�1

0

�+ x2

�0

1

�= x1eee1 + x2eee2 . (8.39)

The vectors xxx, eee1, and eee2, and the components x1 and x2 are shown in Fig. 8.3(a).The matrix associated with the counter-clockwise rotation by an angle θ is ob-tained by following the prescription in Theorem 8.3 and referring to Fig. 8.3(b).Basic trigonometry yields

f (eee1) =

�cosθsinθ

�, f (eee2) =

�−sinθ

cosθ

�. (8.40)

Mark Gill

MPH

Mark Gill

marksphysicshelp

116 Linear Algebra

x1

x2

eee1

eee2

xxx

y1

y2

f (eee1)

f (eee2)

f (xxx)(a) (b)

θ

θ

1

Figure 8.3: The rotation of the x-y plane about the origin in the counter-clockwisedirection by an angle θ . (a) The original orientation of the basis vectors eee1 andeee2, together with a vector xxx. The shaded region is the area bounded by eee1 and eee2.(b) The quantities in (a) transformed by a linear function that represents a counter-clockwise rotation by an angle θ . The shaded region is the transformed shadedarea in (a). These areas are equal because the determinant of the transformation isequal to unity.

Hence,

f (xxx;θ) = ( f (eee1), f (eee2))xxx

=

�cosθ −sinθsinθ cosθ

��x1

x2

�= R(θ)xxx , (8.41)

which defines the matrix for a counterclockwise rotation in the x-y plane about theorigin by an angle θ :

R(θ) =

�cosθ −sinθsinθ cosθ

�. (8.42)

This matrix determines all of the properties of rotations in R2.Consider first the determinant of R(θ):

det�R(θ)

�=

��cosθ −sinθsinθ cosθ

��= cos2 θ + sin2 θ = 1. (8.43)

the interpretation of this result is seen by observing that the two shaded regionsin Fig. 8.3(a,b) have equal areas, that is, the transformation does not alter the area

Mark Gill

MPH

Mark Gill

marksphysicshelp


spanned by the basis vectors. This is what is meant by a ‘rigid-body’ transforma-tion: the body is transformed rigidly, in this case by a rotation, with no change involume.

We now consider a rotation by θ followed by a rotation by φ :

R(φ)R(θ) =

�cosφ −sinφsinφ cosφ

��cosθ −sinθsinθ cosθ

�

=

�cosφ cosθ − sinφ sinθ −cosφ sinθ − sinφ cosθsinφ cosθ + cosφ sinθ −sinφ sinθ + cosφ cosθ

�

=

�cos(φ +θ) −sin(φ +θ)sin(φ +θ) cos(φ +θ)

�= R(φ +θ) , (8.44)

where in writing the last matrix equality we have invoked the trigonometric iden-tities

cos(φ +θ) = cosφ cosθ − sinφ sinθ ,sin(φ +θ) = sinφ cosθ + cosφ sinθ .

(8.45)

Notice that, because φ and θ enter the expression for the product transformationsymmetrically, there roles can be interchanged without affecting the result, so

R(φ)R(θ) = R(θ)R(φ) . (8.46)

In other words, these rotations commute. We will see in the next section thatrotations about different axes do not commute.

The rotation angle θ = 0 yields the 2×2 unit matrix:

R(0) =

�1 0

0 1

�, (8.47)

which is to be expected because θ = 0 corresponds to performing no transforma-tion at all. With this result and that in (8.44) we have that

R(0) = R(θ −θ) = R(θ)R(−θ) = R(−θ)R(θ) . (8.48)

Hence, [R(θ)]−1 = R(−θ), which simply says that reversing a transformationyields the original vectors.

Mark Gill

MPH

Mark Gill

marksphysicshelp

118 Linear Algebra

x1 x2 x1

x2 x3 x3

x3 x1 x2

f (eee1) f (eee2)

f (eee1)

f (eee2) f (eee3) f (eee3)

(a) (b) (c)

θ θθ

θ θ θ

1

Figure 8.4: The rotations in the counter-clockwise direction by an angle θ abouteach of the coordinate axes of a right-handed coordinate system in R3: rotationsabout (a) the x3 axis, (b) the x1 axis, and (c) the x2-axis. Each of the rotation axespoint out of the plane of the page.

8.5.2 Rotations in R3

We will not develop the complete theory of rotations in R3, but consider onlyrotations about the three coordinate axes. This will provide an indication of thefundamental difference between these rotations and those in R2.

The rotations we are considering are shown in Fig. 8.4. For any vector xxx ∈R3,

xxx =

x1

x2

x3

= x1

1

0

0

+ x2

0

1

0

+ x3

0

0

1

= x1eee1 + x2eee2 + x3eee3 . (8.49)

We first consider rotations about the x3 axis, which is depicted in Fig. 8.4(a).The x3 axis is unaffected by the transformation, and the x1- and x2-axes are trans-formed in the same manner as for two-dimensional rotations. Hence, we have

f3(eee1) =

cosθsinθ

0

, f3(eee2) =

−sinθcosθ

0

, f3(eee3) =

0

0

1

. (8.50)

The matrix corresponding to this linear transformation is

f3(xxx;θ) = ( f3(eee1), f3(eee2), f3)eee3))xxx

=

cosθ −sinθ 0

sinθ cosθ 0

0 0 1

x1

x2

x3

= R3(θ)xxx , (8.51)

Mark Gill

MPH

Mark Gill

marksphysicshelp


which defines

R3(θ) =

cosθ −sinθ 0

sinθ cosθ 0

0 0 1

. (8.52)

As Fig. 8.4(b) shows, rotations about the x1-axis have a similar geometry torotations about the x3-axis. A procedure analogous ton that used to determineR3(θ) yields

R1(θ) =

1 0 0

0 cosθ −sinθ0 sinθ cosθ

. (8.53)

Finally, we consider rotations about the x2-axis [Fig., 8.4(c)]. We obtain

f2(eee1) =

cosθ0

−sinθ

, f2(eee2) =

0

1

0

, f2(eee3) =

sinθ0

cosθ

. (8.54)

The matrix form of this linear transformation is

f2(xxx;θ) = ( f2(eee1), f2(eee2), f2(eee3))xxx

=

cosθ 0 sinθ0 1 0

−sinθ 0 cosθ

x1

x2

x3

= R2(θ)xxx , (8.55)

which defines

R2(θ) =

cosθ 0 sinθ0 1 0

−sinθ 0 cosθ

. (8.56)

Each of the rotations about the coordinate aces separately have the same prop-erties as two-dimensional rotations. However, taken together, some different prop-

Mark Gill

MPH

Mark Gill

marksphysicshelp

120 Linear Algebra

erties emerge. Most notably, we calculate, for example, R1(θ)R2(θ):

R1(θ)R2(θ) =

1 0 0


cosθ 0 sinθ0 1 0

−sinθ 0 cosθ

=

cosθ 0 sinθ

−sin2 θ cosθ −sinθ cosθ

−sinθ cosθ sinθ cos2 θ

, (8.57)

which we compare with

R2(θ)R1(θ) =

cosθ 0 sinθ0 1 0

−sinθ 0 cosθ

1 0 0


=

cosθ sin2 θ sinθ cosθ0 cosθ −sinθ

−sinθ sinθ cosθ cos2 θ

, (8.58)

so R1(θ)R2(θ) �= R2(θ)R1(θ). This is a general property of rotations in threedimensions that has important implications for quantum mechanical systems,.

8.6 SummaryThe main theme of this chapter is the relationship between linear functions andmatrices. A linear function f : Rn → Rm can be succinctly defines as having theproperty that

f (λxxx+µyyy) = λ f (xxx)+µ f (yyy) ,

for any xxx,yyy ∈ Rn and λ , µ ∈ R.The equivalence of linear functions and matrices was established by Theorem

8.3 and led to the transcription of operations between functions to those betweenmatrices:

f (xxx)+g(xxx) = Axxx+Bxxx = (A+B)xxx ,

λ f (xxx) = λ (Axxx) = (λA)xxx ,

( f ◦g)(xxx) = f (g(xxx)) = A(Bxxx) = (AB)xxx ,

Mark Gill

MPH

Mark Gill

marksphysicshelp


Addition and multiplication can be defined for any m×n matrices, but the productAB can be defined only for an m× p matrix A and a p×n matrix B, in which casethe product is an m×n matrix.

Rotations are among the most important linear transformations and are ex-pressed as n× n matrices. Rotations R(θ) in two dimensions about the origin inthe counter-clockwise direction by an angle θ were found to have the followingproperties:

1. det�R(θ)

�= 1, which means that areas are preserved under rotations.

2. The composition of rotations is R(θ)R(φ) = R(θ +φ).

3. The composition of rotation is commutative: R(θ)R(φ) = R(φ)R(θ).

4. R(0) = 1I, where 1I is the 2×2 unit, or identity, matrix.

5. The inverse�R(θ)

�−1 of a rotation R(θ) is R(−θ).

Rotations in R3 share all of the properties except 3: rotations in three dimen-sions do not generally commute. The ramifications of this non-commutativity willbe explored in quantum mechanics.

Mark Gill

MPH

Mark Gill

marksphysicshelp

122 Linear Algebra

Mark Gill

MPH

Mark Gill

marksphysicshelp

Chapter 8 Linear Functions and Matrices - WordPress.com · 8.2 Linear and Afﬁne Functions. A...

Documents

Transcript of Chapter 8 Linear Functions and Matrices - WordPress.com · 8.2 Linear and Afﬁne Functions. A...