Electromagnetism & Relativity Books Brian Pendletonbjp/emr/emr-nup.pdf · Electromagnetism &...

52
Electromagnetism & Relativity [PHYS10093] Semester 1, 2017/18 Brian Pendleton Email: [email protected] Office: JCMB 4413 Phone: 0131-650-5241 Web: http://www.ph.ed.ac.uk/bjp/emr/ November 28, 2017 Books The course should be self-contained, but it’s always good to read textbooks to broaden your education. David J Griffiths, Introduction to Electrodynamics, (Prentice Hall) John R Reitz, Frederick J Milford, Robert W Christy, Foundations of Electromagnetic Theory, (Addison-Wesley) JD Jackson, Classical Electrodynamics (Wiley) – advanced, good for next year. KF Riley, MP Hobson and SJ Bence, Mathematical Methods for Physics and Engineering, (CUP 1998). PC Matthews, Vector Calculus, (Springer 1998). ML Boas, Mathematical Methods in the Physical Sciences, (Wiley 2006). GB Arfken and HJ Weber, Mathematical Methods for Physicists, (Academic Press 2001). DE Bourne and PC Kendall, Vector Analysis and Cartesian Tensors, (Chapman and Hall 1993). Griffiths is the main text for electromagnetism; Reitz, Milford & Christy is a stan- dard electromagnetism text; Jackson is pretty advanced, but it will also be good for Classical Electrodynamics next year. The other books are useful for the first part of the course, which will cover vectors and tensors. Acknowledgements: Many thanks to Richard Ball, Martin Evans and Roger Horsley for providing copies of their lecture notes and tutorial sheets, from which I have sampled with (apparent) impunity. i

Transcript of Electromagnetism & Relativity Books Brian Pendletonbjp/emr/emr-nup.pdf · Electromagnetism &...

Electromagnetism & Relativity

[PHYS10093] Semester 1, 2017/18

Brian Pendleton

• Email: [email protected]

• Office: JCMB 4413

• Phone: 0131-650-5241

• Web: http://www.ph.ed.ac.uk/∼bjp/emr/

November 28, 2017

Books

The course should be self-contained, but it’s always good to read textbooks tobroaden your education.

• David J Griffiths,Introduction to Electrodynamics, (Prentice Hall)

• John R Reitz, Frederick J Milford, Robert W Christy,Foundations of Electromagnetic Theory, (Addison-Wesley)

• JD Jackson,Classical Electrodynamics (Wiley) – advanced, good for next year.

• KF Riley, MP Hobson and SJ Bence,Mathematical Methods for Physics and Engineering, (CUP 1998).

• PC Matthews,Vector Calculus, (Springer 1998).

• ML Boas,Mathematical Methods in the Physical Sciences, (Wiley 2006).

• GB Arfken and HJ Weber,Mathematical Methods for Physicists, (Academic Press 2001).

• DE Bourne and PC Kendall,Vector Analysis and Cartesian Tensors, (Chapman and Hall 1993).

Griffiths is the main text for electromagnetism; Reitz, Milford & Christy is a stan-dard electromagnetism text; Jackson is pretty advanced, but it will also be good forClassical Electrodynamics next year.

The other books are useful for the first part of the course, which will cover vectorsand tensors.

Acknowledgements: Many thanks to Richard Ball, Martin Evans and RogerHorsley for providing copies of their lecture notes and tutorial sheets, from which Ihave sampled with (apparent) impunity.

i

Contents

1 Revision of vector algebra 1

1.1 Revision of vectors & index/suffix notation . . . . . . . . . . . . . . . 1

1.1.1 Position vector . . . . . . . . . . . . . . . . . . . . . . . . . . 1

1.1.2 Cartesian coordinates . . . . . . . . . . . . . . . . . . . . . . . 1

1.1.3 The Kronecker delta symbol δij . . . . . . . . . . . . . . . . . 2

1.1.4 The Levi-Civita or epsilon symbol εijk . . . . . . . . . . . . . 2

1.1.5 Scalar product . . . . . . . . . . . . . . . . . . . . . . . . . . 2

1.1.6 Vector product . . . . . . . . . . . . . . . . . . . . . . . . . . 3

1.1.7 Scalar triple product . . . . . . . . . . . . . . . . . . . . . . . 3

1.1.8 Vector triple product . . . . . . . . . . . . . . . . . . . . . . . 3

1.2 Linear transformation of basis . . . . . . . . . . . . . . . . . . . . . . 4

1.3 Transformation properties of vectors and scalars . . . . . . . . . . . . 4

1.3.1 Transformation of vector components . . . . . . . . . . . . . . 4

1.3.2 Scalar and vector products . . . . . . . . . . . . . . . . . . . . 4

2 Cartesian tensors 5

2.1 Definition and transformation properties . . . . . . . . . . . . . . . . 5

2.1.1 Definition of a tensor . . . . . . . . . . . . . . . . . . . . . . . 5

2.1.2 Fields . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6

2.1.3 Dyadic notation . . . . . . . . . . . . . . . . . . . . . . . . . . 6

2.1.4 Internal consistency in the definition of a tensor . . . . . . . . 7

2.1.5 Properties of Cartesian tensors – tensor algebra . . . . . . . . 7

2.1.6 The quotient theorem . . . . . . . . . . . . . . . . . . . . . . . 9

2.2 Revision of matrices and determinants . . . . . . . . . . . . . . . . . 9

2.2.1 Matrices . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9

2.2.2 Determinants . . . . . . . . . . . . . . . . . . . . . . . . . . . 11

2.2.3 Linear equations . . . . . . . . . . . . . . . . . . . . . . . . . 13

ii

2.2.4 Orthogonal matrices . . . . . . . . . . . . . . . . . . . . . . . 15

2.3 Pseudotensors, pseudovectors & pseudoscalars . . . . . . . . . . . . . 16

2.3.1 Vectors . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17

2.3.2 Pseudovectors . . . . . . . . . . . . . . . . . . . . . . . . . . . 17

2.4 Some physical examples . . . . . . . . . . . . . . . . . . . . . . . . . 19

2.5 Invariant/Isotropic Tensors . . . . . . . . . . . . . . . . . . . . . . . . 20

3 Rotation, reflection & inversion tensors 22

3.1 The rotation tensor . . . . . . . . . . . . . . . . . . . . . . . . . . . . 22

3.1.1 Rotation about an arbitrary axis . . . . . . . . . . . . . . . . 22

3.1.2 Some important properties of the rotation tensor . . . . . . . 23

3.2 Reflections and inversions . . . . . . . . . . . . . . . . . . . . . . . . 24

3.3 Projection operators . . . . . . . . . . . . . . . . . . . . . . . . . . . 24

3.4 Active and passive transformations . . . . . . . . . . . . . . . . . . . 25

4 The inertia tensor 26

4.1 Angular momentum and kinetic energy . . . . . . . . . . . . . . . . . 26

4.1.1 Angular momentum . . . . . . . . . . . . . . . . . . . . . . . 27

4.1.2 Kinetic energy . . . . . . . . . . . . . . . . . . . . . . . . . . . 27

4.1.3 The parallel axes theorem . . . . . . . . . . . . . . . . . . . . 28

4.1.4 Diagonalisation of rank-two tensors . . . . . . . . . . . . . . . 30

5 Taylor expansions 34

5.1 The one-dimensional case . . . . . . . . . . . . . . . . . . . . . . . . . 34

5.1.1 Examples . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 36

5.1.2 A precursor to the three-dimensional case . . . . . . . . . . . 37

5.2 The three-dimensional case . . . . . . . . . . . . . . . . . . . . . . . . 37

6 Curvilinear coordinates 39

6.1 Orthogonal curvilinear coordinates . . . . . . . . . . . . . . . . . . . 40

6.1.1 Scale factors and basis vectors . . . . . . . . . . . . . . . . . . 40

6.1.2 Examples of orthogonal curvilinear coordinates (OCCs) . . . . 41

6.2 Elements of length, area and volume . . . . . . . . . . . . . . . . . . 42

6.2.1 Vector length . . . . . . . . . . . . . . . . . . . . . . . . . . . 42

6.2.2 Arc length and metric tensor . . . . . . . . . . . . . . . . . . . 43

6.2.3 Vector area . . . . . . . . . . . . . . . . . . . . . . . . . . . . 44

iii

6.2.4 Volume . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 44

6.3 Components of a vector field in curvilinear coordinates . . . . . . . . 45

6.4 Div, grad, curl and the Laplacian in OCCs . . . . . . . . . . . . . . . 46

6.4.1 Gradient . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 46

6.4.2 Divergence . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 47

6.4.3 Curl . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 48

6.4.4 Laplacian of a scalar field . . . . . . . . . . . . . . . . . . . . 50

6.4.5 Laplacian of a vector field . . . . . . . . . . . . . . . . . . . . 50

7 Electrostatics 51

7.1 The Dirac delta function in three dimensions . . . . . . . . . . . . . . 51

7.2 Coulomb’s law . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 54

7.3 The electric field . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 54

7.3.1 Field lines . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 54

7.3.2 The principle of superposition . . . . . . . . . . . . . . . . . . 55

7.4 The electrostatic potential for a point charge . . . . . . . . . . . . . . 55

7.5 The static Maxwell equations . . . . . . . . . . . . . . . . . . . . . . 56

7.5.1 The curl equation . . . . . . . . . . . . . . . . . . . . . . . . . 56

7.5.2 Conservative fields and potential theory . . . . . . . . . . . . 56

7.5.3 The divergence equation . . . . . . . . . . . . . . . . . . . . . 58

7.6 Electric dipole . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 60

7.6.1 Potential and electric field due to a dipole . . . . . . . . . . . 60

7.6.2 Force, torque and energy . . . . . . . . . . . . . . . . . . . . . 61

7.7 The multipole expansion . . . . . . . . . . . . . . . . . . . . . . . . . 63

7.7.1 Worked example . . . . . . . . . . . . . . . . . . . . . . . . . 65

7.7.2 Interaction energy of a charge distribution . . . . . . . . . . . 66

7.7.3 A brute-force calculation - the circular disc . . . . . . . . . . . 66

7.8 Gauss’ law . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 68

7.9 Boundaries . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 70

7.9.1 Normal component . . . . . . . . . . . . . . . . . . . . . . . . 70

7.9.2 Tangential component . . . . . . . . . . . . . . . . . . . . . . 71

7.9.3 Conductors . . . . . . . . . . . . . . . . . . . . . . . . . . . . 71

7.10 Poisson’s equation . . . . . . . . . . . . . . . . . . . . . . . . . . . . 72

7.10.1 Uniqueness of solution . . . . . . . . . . . . . . . . . . . . . . 72

iv

7.10.2 Methods of solution . . . . . . . . . . . . . . . . . . . . . . . . 73

7.10.3 The method of images . . . . . . . . . . . . . . . . . . . . . . 74

7.11 Electrostatic energy . . . . . . . . . . . . . . . . . . . . . . . . . . . . 77

7.11.1 Electrostatic energy of a general charge distribution . . . . . . 77

7.11.2 Field energy . . . . . . . . . . . . . . . . . . . . . . . . . . . . 78

7.12 Capacitors (condensers) and capacitance . . . . . . . . . . . . . . . . 79

7.12.1 Parallel-plate capacitor . . . . . . . . . . . . . . . . . . . . . . 79

7.12.2 Concentric conducting spheres . . . . . . . . . . . . . . . . . . 80

7.12.3 Energy stored in a capacitor . . . . . . . . . . . . . . . . . . . 80

8 Magnetostatics 81

8.1 Currents . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 81

8.1.1 Charge and current conservation . . . . . . . . . . . . . . . . . 82

8.1.2 Conduction current . . . . . . . . . . . . . . . . . . . . . . . . 83

8.2 Forces between currents (Ampere, 1821) . . . . . . . . . . . . . . . . 84

8.2.1 The Lorentz force . . . . . . . . . . . . . . . . . . . . . . . . . 85

8.3 Biot-Savart law . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 86

8.3.1 Long straight wire . . . . . . . . . . . . . . . . . . . . . . . . 87

8.3.2 Two long parallel wires . . . . . . . . . . . . . . . . . . . . . . 87

8.3.3 Current loop . . . . . . . . . . . . . . . . . . . . . . . . . . . . 88

8.4 Divergence and curl of the magnetic field: Gauss and Ampere laws . . 88

8.4.1 Ampere’s law (1826) . . . . . . . . . . . . . . . . . . . . . . . 90

8.4.2 Conducting surfaces . . . . . . . . . . . . . . . . . . . . . . . 91

8.5 The vector potential . . . . . . . . . . . . . . . . . . . . . . . . . . . 92

8.6 Magnetic dipoles . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 94

8.6.1 Magnetic moment and angular momentum . . . . . . . . . . . 95

8.6.2 Force and torque on a dipole in an external field . . . . . . . . 96

8.7 Summary: electrostatics and magnetostatics . . . . . . . . . . . . . . 98

v

Chapter 1

Revision of vector algebra

There should be nothing new in this (brief) chapter. It’s intended to revise conceptsand notation, and to refresh your memory. The style and content should be similarto Martin Evans’ Fields notes, to which you should refer for detailed expositions.

1.1 Revision of vectors & index/suffix notation

In this chapter we will work entirely in three-dimensional real space, R3.

1.1.1 Position vector

The position vector is bound to some origin O andgives the position of a point relative to that origin; itwill generally be denoted by x or r.

O

1

x

1.1.2 Cartesian coordinates

Let e i, i = 1, 2, 3, be a set of orthonormal Cartesianbasis vectors in R3. By definition, these satisfy

e 1 · e 1 = e 2 · e 2 = e 3 · e 3 = 1

e 1 · e 2 = e 2 · e 3 = e 3 · e 1 = 0-e 1

6e 3

3e 2

The position vector xmay be expressed in terms of Cartesian coordinates (x1, x2, x3):

x = x1 e 1 + x2 e 2 + x3 e 3

Similarly, for an arbitrary vector a in R3, we write

a = a1 e 1 + a2 e 2 + a3 e 3 =3∑

i=1

ai e i ≡ ai e i

In the last expression, we’ve used the Einstein summation convention to denote animplicit sum over the repeated or dummy index i = 1, 2, 3

1

1.1.3 The Kronecker delta symbol δij

Define δij, where i and j can take on the values 1, 2, 3, such that

δij ≡

1 when i = j0 when i 6= j

so that δ11 = δ22 = δ33 = 1 and δ12 = δ13 = δ23 = · · · = 0.

The equations satisfied by the orthonormal basis vectors e i may then be writtensuccinctly as

e i · e j = δij

The components ai of a in an orthonormal basis may be obtained using orthonor-mality

a · e i = aj e j · e i = aj δij = ai[or occasionally (a)i

]

In this equation i is a ‘free’ index and it takes on all three values i = 1, 2, 3.

1.1.4 The Levi-Civita or epsilon symbol εijk

Define εijk , where i, j and k can each take on the values 1, 2, 3, such that

εijk = +1 if (i, j, k) is an even permutation of (123)

= −1 if (i, j, k) is an odd permutation of (123)

= 0 otherwise (i.e. when 2 or more indices are the same)

In other words:

ε123 = ε231 = ε312 = +1; ε213 = ε321 = ε132 = −1; all others = 0.

Note the cyclic symmetry

εijk = εkij = εjki = −εjik = −εikj = −εkjiThe Kronecker delta and epsilon symbols generalise straightforwardly to any numberof dimensions.

1.1.5 Scalar product

The scalar product, also known as the inner product, or just the “dot” product, oftwo vectors a and b is defined as

a · b = a b cos θab

where a ≡ |a| and b ≡ |b| are the lengths of the two vectors, and θab is the anglebetween them. In Cartesian coordinates

a · b = a1 b1 + a2 b2 + a3 b3 ≡ ai bi

where we used the summation convention in the last expression.

2

1.1.6 Vector product

The vector product (or cross product) is defined as

a× b ≡ a b sin θab n

where n is a unit vector orthogonal to both a and b, and the vectors a, b, n forma right-handed set.

In Cartesian coordinates, the ith component of the vector product of a and b maybe written

(a× b)i = εijk aj bk

where there is an implicit sum over both of the repeated indices j & k:∑3

j=1

∑3k=1 .

Similarly, we can write the vector product as

a× b = e i (a× b)i = εijk e i aj bk

The orthonormality relations satisfied by the vector products of the (right-handed)orthonormal basis vectors e i can be written as:

e i × e j = εijk e k ∀ i, j = 1, 2, 3

where we have again used the summation convention: there is an implicit sum overthe repeated index k:

∑3k=1.

1.1.7 Scalar triple product

From the results above, we can easily deduce an algebraic definition of the scalartriple product in Cartesian coordinates

(a, b, c) ≡ a · (b× c) = ai (b× c)i = εijk ai bjck

1.1.8 Vector triple product

The vector triple product a×(b× c

)satisfies

a×(b× c

)=(a · c

)b−

(a · b

)c

which may be proved by considering explicit components of each side of the equation.

It may also be obtained from the following expression for the product of two epsilonsymbols with one common (summed) index:

εijk εklm = δil δjm − δim δjl

You must know this result!

When the epsilon symbols have two common indices, we get another useful result(exercise)

εijk εpjk = 2 δip

3

1.2 Linear transformation of basis

Let e i and e ′i be two orthonormal bases with a common origin related by the

linear transformatione ′i

= `ij e j

where we have again used the summation convention, so the repeated index j issummed over.

We sometimes refer to the bases e i and e ′i as frames S and S ′, respectively.

Using orthonormality of both bases, it is straightforward to show that the nineconstants `ij satisfy (exercise)

`ik `jk = `ki `kj = δij ⇔ LLT = LT L = I

where we have used matrix notation in the 2nd version.

The 3 × 3 transformation matrix L is an orthogonal matrix, whose elements aregiven by (L)ij = `ij = e ′

i· e j (exercise), and I is the unit matrix.

Every orthogonal matrix L has determinant detL = ±1.

If L represents a rotation of the basis e i, then detL = 1. This is called a propertransformation.

If L represents an inversion of the basis, i.e. e ′i

= −e i , or a reflection of the basisvectors in some plane, then detL = −1. These are called improper transformations.

1.3 Transformation properties of vectors and scalars

1.3.1 Transformation of vector components

Let a be any vector, with components ai in the e i basis and components a′i in thee ′i

basis, so that

a = ai e i = a′i e′i

The vector a is the same in both bases, but its components in the two bases arerelated by

a′i = `ij aj

1.3.2 Scalar and vector products

The scalar product of two vectors a and b is the same in both bases

a · b = ai bi = a′i b′i

Hence the name scalar product.

The vector product a× b is a pseudovector.

The scalar triple product is a pseudoscalar, and the vector triple product is a vector.

We shall revisit all these properties in detail after we introduce tensors in the nextchapter. . .

4

Chapter 2

Cartesian tensors

2.1 Definition and transformation properties

Consider a rotation of the e i basis (frame S) to the e′i basis (frame S ′). This

is called a passive rotation.

The rotation matrix L, with components `ij , satisfies LLT = I = LTL, and it hasunit determinant detL = +1.

The components of two arbitrary vectors a and b in the two frames are related by

a′i = `ij aj

b′i = `ij bj

Let us now define a vector a as an entity whose 3 components ai in S are related toits 3 components a′i in S ′ by a′i = `ij aj .

Note that the vector a is not rotated – it remains fixed in space – only the basisvectors are rotated. Hence the term passive.

Now consider the 9 quantities ai bj . Under the change of basis, these transform to

a′i b′j = `ir ar `js bs

= (`ir `js) (arbs)

The 9 quantities arbs obey a particular transformation law under the change of basise i → e′i (change of frame S → S ′). This motivates our definition of a tensor.

2.1.1 Definition of a tensor

Following on from our new definition of a vector, we define a tensor of rank 2, T , asan entity whose 32 = 9 cartesian components Tpq in S are related to its 9 componentsT ′ij in S ′ by

T ′ij = `ip `jq Tpq

where L is the rotation matrix with components `ij which takes S → S ′. Since thereare 2 free indices (i & j) in the above expression, it represents 9 equations.

5

Similarly a tensor of rank n, T , is defined to be an entity whose 3n componentsT ′ijk···op (n-indices) in S ′ are related to its 3n components Trst···vw (n-indices) in S by

T ′ijk···op = `ir `js `kt · · · `ov `pw Trst···vw

where there are n factors of ` on the RHS.

In this new language

• A scalar is a tensor of rank 0 (i.e. a′ = a).

• A vector is a tensor of rank 1 (as a′i = `ij aj).

We shall often be sloppy and say Tijk···rs is a tensor, when what we really mean isthat T is a tensor with components Tijk···rs in a particular frame S.

The expressions tensor of rank 2 and second-rank tensor are used interchangeably.Similarly for tensor of rank n and nth-rank tensor.

Important: Note that a rank-n tensor is more general than the ‘direct product’ ofcomponents of n vectors, i.e., not every tensor has components that can be writtenas a direct product of vectors aibjck . . . pr qs. For example, aibj + ajbi is a rank-2tensor. Another explicit counterexample for n = 2 will be given in section 2.1.5.

2.1.2 Fields

A scalar or vector or tensor quantity is called a field when it is a function of position:

• Temperature T (r) is a scalar field

• The electric field Ei(r) is a vector field

• The stress-tensor field Pij(r) is a (rank 2) tensor field

In the latter case the transformation law is

P ′ij(r) = `ip `jq Ppq(r) or P ′ij(x′k) = `ip `jq Ppq(xk) with x′k = `kl xl

These two expressions mean the same thing, but the latter form is perhaps better.

2.1.3 Dyadic notation

In some (mostly older) books you will see dyadic notation. This is rather clumsy fortensors – although it works well for vectors of course!

dyadic notation index notation

a ai

a · b ai bi

A Aij or aij

aA b or a · A · b aiAij bj

· · · · · ·We will not use dyadic notation for tensors in this course.

6

2.1.4 Internal consistency in the definition of a tensor

Let Tij, T ′ij, T ′′ij be the components of a tensor in frames S, S ′, S ′′ respectively

And let

L be the rotation matrix which takes S → S ′, L = `ijM be the rotation matrix which takes S ′ → S ′′, M = mij

Then

T ′′ij = mipmjq T′pq

= mipmjq (`pk `ql Tkl)

= (ML)ik (ML)jl Tkl

= nik njl Tkl

where N = ML is the rotation matrix which takes S → S ′′, so the definition of atensor is self-consistent. A similar result can be derived for vectors (exercise).

2.1.5 Properties of Cartesian tensors – tensor algebra

• Addition: if Tij···p and Uij···p are two tensors with the same rank n, i.e bothhave n indices, then

Vij···p = Tij···p + Uij···p

is also a tensor of rank n. The proof is straightforward.

• Multiplication or tensor product: if Tij···s and Ulm···r are the components oftensors T and U of rank n and m respectively, then

Vij···slm···r = Tij···s Ulm···r

are the components of a tensor TU of rank n+m, which has 3n × 3m = 3m+n

components:

V ′i···r = T ′i···s U′l···r

= `iα · · · `sδ Tα···δ `lε · · · `rρ Uε···ρ= `iα · · · `rρ Vα···ρ

where there are n+m factors of ` in the last expression.

Note that we sometimes use Greek letters α, β, etc for dummy indices whenwe have a large number of them.

Example: If U = λ is a scalar, and T is a tensor of rank n, then λT is atensor of rank 0 + n = n.

• Contraction: if Tijk···s is a tensor of rank n, then Tiik···s (i.e. n−2 free indices) isa tensor of rank n−2. The process of setting indices to be equal and summingis called contraction.

Example: If Tij = ai bj is a tensor of rank 2, then Tii = ai bi is a tensor ofrank 0 (scalar).

The process of multiplying two tensors and contracting over a pair (or pairs)of indices on different tensors is (sometimes) called taking the scalar product.

7

• If Tij is a tensor then so is Tji (where Tji are the components of the transposeT T of T ).

• If Tij = Tji in S, then T ′ij = T ′ji in S ′:

T ′ij = `ip `jq Tpqp↔q= `iq `jp Tqp = `jp `iq Tpq = T ′ji

Tij is a symmetric tensor; the symmetry is preserved under a change of basis.

[The notation p↔ q refers to relabelling indices.]

Similarly if Tij = −Tji, then T ′ij = −T ′ji. Tij is an anti-symmetric tensor.

Given any second rank tensor T , we can always decompose it into symmetricand anti-symmetric parts

Tij = 12

(Tij + Tji) + 12

(Tij − Tji)

• We can re-write the tensor transformation law for rank 2 tensors (only) usingmatrix notation:

T ′ij = `ip `jq Tpq = (LTLT )ij

so

T ′ = LTLT ≡ LTL−1 (for rank 2 only)

• Kronecker delta, δij , is a second rank tensor

To show this, first note that the definition

δij =

+1 i = j

0 i 6= j

holds in all frames, so that, for example,

a · b = δij ai bj = δij a′i b′j

so δ′ij = δij.

Starting withδ′ij = δij

and using the general result `ip `jp = δij, we obtain

δ′ijgiven= δij = `ip `jq δpq

which we recognise as the definition of a second rank tensor.

A tensor which has the same components in all frames is called an invariantor isotropic tensor.

δij is an example of a tensor that can not be written in the form Tij = ai bj.

[To show this, assume we can write δij = ai bj. The diagonal terms in thisexpression require non-zero ai and bi for all i, while the off-diagonal termsrequire at least 3 of ai and bj to be zero, which is a contradiction.]

There are no invariant first rank tensors, apart from the zero vector 0 .

8

2.1.6 The quotient theorem

Let T be an entity with 9 components in any frame, say Tij in S, and T ′ij in S ′.

Let a be an arbitrary vector and let bi = Tij aj. If b always transforms as a vector,then T is a second rank tensor.

To prove this, we determine the transformation properties of T . In S ′ we have

b′i = T ′ij a′j = T ′ij `jk ak

≡ `ij bj = `ij Tjk ak

Equate the last expression on each line and rearrange(T ′ij `jk − `ij Tjk

)ak = 0

This expression holds for all vectors a, for example a = (1, 0, 0) etc, therefore

T ′ij `jk = `ij Tjk

⇒ T ′ij `jk `mk = `ij `mk Tjk [ multiplied both sides by `mk ]

⇒ T ′im = `ij `mk Tjk

where we used `jk `mk = δjm in the last step. Thus T transforms as a tensor.

Important example: If there is a linear relationship between two vectors a andb, so that ai = Tij bj , it follows from the quotient theorem that T is a tensor.

This is an alternative (mathematical) definition of a second-rank tensor, and it’salso the way that tensors arise in nature.

Quotient theorem: Generalising, if Rij···r is an arbitrary tensor of rank-m, andTij···s is a set of 3n numbers (with n > m) such that Tij···sRij···r is a tensor of rankn−m , then Tij···s is a tensor of rank-n.

Proof is similar to the rank-2 case, but it isn’t very illuminating.

2.2 Revision of matrices and determinants

Before extending the discussion to pseudotensors, we revise and extend some familiarproperties of matrices and determinants.

2.2.1 Matrices

An M ×N matrix is a rectangular array of numbers with M rows and N columns,

A =

a11 a12 · · · a1,N−1 a1Na21 a22 · · · a2,N−1 a2N· ·· ·· ·

aM−1,1 aM−1,2 aM−1,N−1 aM−1,NaM,1 aM,2 aM,N−1 aM,N

≡ aij

9

The quantities aij, with aij ≡ (A)ij for all 1 ≤ i ≤ M , 1 ≤ j ≤ N , are theelements of the matrix.

A square matrix has N = M . We’ll work mostly with 3 × 3 matrices, but themajority of what we’ll do generalises to N ×N matrices rather easily.

• We can add & subtract same-dimensional matrices and multiply a matrix bya scalar. Component forms are obvious, e.g. A = B + λC becomes aij =bij +λcij in index notation. Since both i and j are free indices, this represents9 equations.

• The unit matrix, I, defined by

I =

1 0 00 1 00 0 1

,

has components δij, i.e. Iij = δij

• The trace of a square matrix is the sum of its diagonal elements

TrA = aii

Note the implicit sum over i due to our use of the summation convention.

• The transpose of a square matrix A with components aij is defined by swappingits rows with its columns, so

(AT)ij

= ATij = aji

• IfA = AT then aji = aij A is symmetric

IfA = −AT then aji = −aij A is antisymmetric

Product of matrices

We can very easily implement the usual ‘row into column’ matrix multiplication rulein index notation.

If A (with elements aij) is an M ×N matrix and B (with elements bij) is an N ×Pmatrix then C = AB is an M × P matrix with elements cij = aik bkj. Sincewe’re using the summation convention, there is an implicit sum k = 1, . . . N in thisexpression.

Matrix multiplication is associative and distributive

A(BC) = (AB)C ≡ ABC

A(B + C) = AB + AC

10

respectively, but it’s not commutative: AB 6= BA in general.

An important result is(AB)T = BTAT

which follows because

(AB)Tij = (AB)ji = ajk bki = (BT )ik (AT )kj = (BTAT )ij

2.2.2 Determinants

The determinant detA (or |A| or ||A||) of a 3× 3 matrix A may be defined by

detA = εlmn a1l a2m a3n

This is equivalent to the ‘usual’ recursive definition,

detA =

∣∣∣∣∣∣

a11 a12 a13a21 a22 a23a31 a32 a33

∣∣∣∣∣∣,

where the first index labels rows and the second index columns. Expanding thedeterminant gives

detA = a11

∣∣∣∣a22 a23a32 a33

∣∣∣∣− a12∣∣∣∣a21 a23a31 a33

∣∣∣∣+ a13

∣∣∣∣a21 a22a31 a32

∣∣∣∣= a11(a22 a33 − a23 a32)− a12(a21 a33 − a23 a31) + a13(a21 a32 − a22 a31)= (ε123 a11 a22 a33 + ε132 a11 a23 a32) + . . .

= ε1mn a11 a2m a3n + . . .

= εlmn a1l a2m a3n

Thus the two forms are equivalent. The ε form is convenient for derivation of variousproperties of determinants

Note that only one term from each row and column appears in the determinant sum,which is why the determinant can be expressed in terms of the ε symbol.

The determinant is only defined for a square matrix, but the definition can begeneralised to N ×N matrices,

detA = εi1...iNa1i1 . . . aNiN ,

where the epsilon symbol with N indices is defined by

εi1...iN =

+1 if i1, . . . , iN is an even permutation of 1, . . . , N−1 if i1, . . . , iN is an odd permutation of 1, . . . , N

0 otherwise.

We shall usually consider N = 3, but most results generalise to arbitrary N .

11

We may use these results to derive several alternative (and equivalent) expressionsfor the determinant.

First define the quantityXijk = εlmn ail ajm akn

It follows that

Xjik = εlmn ajl aim akn

= εmln ajm ail akn (where we relabelled l↔ m)

= −εlmn ail ajm akn = −Xijk

Thus the symmetry of Xijk is dictated by the symmetry of εlmn, and we must have

Xijk = c εijk

where c is some constant. To determine c, set i = 1, j = 2, k = 3, which givesε123 c = X123, so c = εlmn a1l a2m a3n = detA and hence

εijk detA = εlmn ail ajm akn

Multiplying by εijk and using εijkεijk = 6 (exercise) gives the symmetrical form fordetA

detA = 13!εijk εlmn ail ajm akn

This elegant expression isn’t of practical use because the number of terms in thesum increases from 33 to 36 (overcounting).

We can obtain a result similar to the boxed expression above by defining

Ylmn = εijk ail ajm akn

Using the same argument as before [tutorial] gives

Ylmn = εlmn [εijk ai1 aj2 ak3]

Since detA = 1/3! εlmn Ylmn this means that

detA = εijk ai1aj2ak3

and

εlmn detA = εijk ail ajm akn

[tutorial]

12

Properties of Determinants

We can easily derive familiar properties of determinants from the definitions above:

• Adding a multiple of one row to another does not change the value of thedeterminant.

Example: Adding a multiple of the second row to the first row

εlmn a1l a2m a3n → εlmn (a1l + λ a2l) a2m a3n = εlmn a1l a2m a3n + 0

and detA is unaltered. The last term is zero because εlmn a2l a2m = 0.

• Adding a multiple of one column to another does not change the value of thedeterminant (use the other form for the determinant, detA = εijkai1aj2ak3).

• Interchanging any two rows of a matrix changes the sign of the determinant.

Example: Interchanging the first and second rows gives

εlmn a2l a1m a3n = εmln a2m a1l a3n = −εlmn a1l a2m a3n = − detA

In the first step we simply relabelled l↔ m.

When A has two identical rows, detA = 0.

• Interchanging any two columns of a matrix also changes the sign of the deter-minant (use the other form for the determinant, detA = εijk ai1 aj2 ak3).

Finally,

εijk εlmn detA =

∣∣∣∣∣∣

ail aim ainajl ajm ajnakl akm akn

∣∣∣∣∣∣(2.1)

To derive this, start with the original definition of detA as | · · · | and permute rowsand columns. This produces ± signs equivalent to ε permutations.

2.2.3 Linear equations

A standard use of matrices & determinants is to solve (for x) the matrix-vectorequation

Ax = y

where A is a square 3 × 3 matrix. Representing x and y by column matrices, andwriting out the components, this becomes

a11 x1 + a12 x2 + a13 x3 = y1

a21 x1 + a22 x2 + a23 x3 = y2

a31 x1 + a32 x2 + a33 x3 = y3

In index notationaij xj = yi

13

With a suitable definition of the inverse A−1 of the matrix A, we can write thesolution as

x = A−1 y or xi = A−1ij yj

where A−1ij ≡ (A−1)ij is the ijth element of the inverse of A

A−1ij =1

2!

1

detAεimn εjpq apm aqn

By explicit multiplication [tutorial] we can show AA−1 = I = A−1A as required.

Alternatively [tutorial],

A−1 =CT

detA

where C = cij is the co-factor matrix of A, and cij = (−1)i+j × the determinantformed by omitting the row and column containing aij.

Note that a solution exists if and only if detA 6= 0.

These results generalise to N ×N matrices.

Determinant of the transpose

detAT = εlmnATl1A

Tm2A

Tn3

= εlmn a1l a2m a3n

and hence

detAT = detA

Product of determinants

If C = AB so that cij = aik bkj then

detC = εijk c1i c2j c3k

= εijka1l bli a2m bmj a3n bnk

= [εijk bli bmj bnk] a1l a2m a3n

= εlmn detB a1l a2m a3n

and hence

detAB = detA detB

Product of two epsilons

The product of two epsilon symbols with no identical indices may be written as

εijk εlmn =

∣∣∣∣∣∣

δil δim δinδjl δjm δjnδkl δkm δkn

∣∣∣∣∣∣

14

This equation has 6 free indices, so it represents 36 = 729 identities: 18 say ‘1 = 1’,18 say ‘−1 = −1′, so 693 say ‘0 = 0’.

The proof follows almost trivially by setting A = I in equation (2.1)

εijk εlmn detA =

∣∣∣∣∣∣

ail aim ainajl ajm ajnakl akm akn

∣∣∣∣∣∣,

whence det I = 1 and aij = δij . Unfortunately, this result isn’t as useful as onemight expect. . .

If we set l = k and sum over k

εijk εklm =

∣∣∣∣∣∣

δik δil δimδjk δjl δjmδkk δkl δkm

∣∣∣∣∣∣

= δik (δjl δkm − δjm δkl)− δil (δjk δkm − 3δjm) + δim (δjk δkl − 3δjl))

= δim δjl − δil δjm + 2δil δjm − 2δim δjl

we obtain an old friend, which we now note can be written as the determinant of a2× 2 matrix,

εijk εklm = δil δjm − δim δjl =

∣∣∣∣δil δimδjl δjm

∣∣∣∣

2.2.4 Orthogonal matrices

Let A be a 3× 3 matrix with elements aij, and define the row vectors

a(1) = (a11, a12, a13) , a(2) = (a21, a22, a23) , a(3) = (a31, a32, a33) ,

so that(a(i))j

= aij . If we choose the vectors a(i) to be orthonormal:

a(1) · a(2) = a(2) · a(3) = a(3) · a(1) = 0 ,

and ∣∣a(1)∣∣2 =

∣∣a(2)∣∣2 =

∣∣a(3)∣∣2 = 1 ,

i.e.a(i) · a(j) = δij ,

then A is an orthogonal matrix. The rows of A form an orthonormal triad.

Properties of orthogonal matrices

• Consider

AAT =

a11 a12 a13a21 a22 a23a31 a32 a33

a11 a21 a31a12 a22 a32a13 a23 a33

=

1 0 00 1 00 0 1

,

Therefore

AAT = I

15

• Taking the determinant of both sides of AAT = I gives detA detAT = 1.

Since detAT = detA, then (detA)2 = 1, and therefore

detA = ±1

• Since detA 6= 0, the inverse matrix A−1 always exists, and therefore

A−1 = AT

Multiplying this equation on the right by A gives

ATA = I

so the columns of A also form an orthonormal triad.

2.3 Pseudotensors, pseudovectors & pseudoscalars

Armed with properties of determinants, let’s now return to basis transformations.

Suppose that we now allow reflection and inversion (as well as rotation) of the basisvectors, and represent them all by a transformation matrix L with

detL = +1 for rotations

detL = −1 for reflections and inversions

In the second case, the handedness of the basis is changed: if S is right-handed(RH), then S ′ is left-handed (LH), and vice versa.

Before introducing pseudotensors, we note that the basis vectors always transformas

e′i

= `ij ej

A second-rank tensor or true tensor T obeys the transformation law

T ′ij = `ip `jq Tpq

for all transformations, i.e. rotations, reflections and inversions.

A second-rank pseudotensor T obeys the transformation law

T ′ij = (detL) `ip `jq Tpq

There is a change of sign for components of a pseudotensor, relative to a true tensor,when we make a reflection or inversion of the reference axes.

We can similarly define pseudovectors, pseudoscalars and rank-n pseudotensors.

Note: detL = −1 for a basis transformation that consists of any combination ofan odd number of reflections or inversions, plus any number of rotations.

16

2.3.1 Vectors

Inversion of the basis vectors is defined by

(e1, e

2, e

3) → (e′

1, e′

2, e′

3) = (−e

1,−e

2,−e

3)

so that

L =

−1 0 0

0 −1 00 0 −1

which has components `ij = −δij.If e

i is a standard right-handed basis, then e′

i is left-handed.

Let a be a vector (also called a tensor of rank 1 or a polar vector or a true vector).We showed previously that the components ai transform as a′i = `ij ai, so we have

a′1 = −a1 a′2 = −a2 a′3 = −a3The vector itself therefore transforms as

a′ ≡ a′i e′i

= (−ai) (−ei) = ai ei = a

as expected. Thus a (true/polar) vector remains unchanged by inversion of thebasis.

S

~e1

~e2

~e3

S′

~e′2

reflection

~a

~e′1

~a

~e′3

Inversion is sometimes called reflection in the origin - hence the label in the figureabove.

2.3.2 Pseudovectors

If a and b are (true) vectors then c = a × b is a pseudovector (also known as apseudotensor of rank 1 or an axial vector).

Let’s illustrate this by considering an inversion.

c′1 = a′2 b′3 − a′3 b′2 = (−a2) (−b3)− (−a3) (−b2) = +c1 etc

Thereforec′ ≡ c′i e

′i

= (+ci) (−ei) = −ci ei = −c

Thus the direction of c = a× b is reversed in the new LH basis.

17

S

~e1

~e2

~e3

S′

~e′2

reflection

~b

~e′1

~b

~e′3

~a

~c

~a

~c

The vectors a, b and a× b form a LH triad in the S ′ basis, i.e. they always have thesame ‘handedness’ as the underlying basis.

Now consider the general case. We define the epsilon symbol in all bases, regardlessof their handedness, in the same way:

εijk =

+1 if ijk is an even permutation of 123−1 if ijk is an odd permutation of 123

0 otherwise

The components of c = a× b are ci = εijk aj bk in S and c′i = εijk a′j b′k in S ′ .

We can now determine the components of c = a× b in S ′

c′i = εijk a′j b′k

= εijk `jp ap `kq bq

= εmjk `ir `mr `jp `kq ap bq

= detL `ir εrpq ap bq

= detL `ir (a× b)rwhere we used εijk = εmjk δim = εmjk `ir `mr to get to the third line, and the deter-minant result εrpq detL = εmjk `mr `jp `kq to get the fourth line. So, finally

c′i = detL `ir cr

This is our definition of a pseudovector. Equivalently, since e′i

= `ij ej then

c′ ≡ c′i e′i

= (detL `ir cr)(`ij e j

)= detL cj e j = detL c

Therefore a pseudovector changes sign under any improper transformation such asinversion of the basis or reflection of the basis in a plane.

Pseudotensors: ε is a pseudotensor of rank 3. The proof is simple and uses thefact that ε is the same in all bases, together with (detL)2 = 1

ε′ijk ≡ εijk = detL detL εijk = detL `ip `jq `kr εpqr

where we used the definition of the determinant in the last step.

18

Furthermore since ε is the same in all bases, it’s an isotropic or invariant rank-3tensor.

We can build pseudotensors of higher rank using a combination of vectors, tensors,δ and ε. For example:

• If ai and bi are rank 1 tensors (i.e. vectors), then the 35 quantities εijk al bmare the components of a pseudotensor of rank 5.

• εijk δpq is a pseudotensor of rank 5.

In general:

• The product of two tensors is a tensor

• The product of two pseudotensors is a tensor

• The product of a tensor and a pseudotensor is a pseudotensor

2.4 Some physical examples

• Velocity v =dr(t)

dtvector

• Acceleration a = v vector

• Force F = ma vector

• Electric field E = F /q vector

(F is the force on a test charge q)

• Torque G = r × F pseudovector

• Angular velocity (ω) v = ω × r pseudovector

• Angular momentum L = r × p pseudovector

• Magnetic field (B) F = qv ×B pseudovector

B(r) =µ0

C

I dr′ × (r − r′)∣∣r − r′

∣∣3

(Biot–Savart Law – more later)

O

P

C~r′

~rd~r′

i

• E ·B pseudoscalar

A pseudoscalar changes sign under an improper transformation: a′ = (detL) a

In this case, one can easily show that(E ·B

)′= detL

(E ·B

)

19

• Helicity h = p · s pseudoscalar

The pseudovector s is the angular momentum,or spin, of a particle/ball. As the ball spins,a point on it traces out a RH helix. In thefigure, p is parallel to s

~s

~p

• Inertia tensor (I) Li = Iij ωj rank 2 tensor

• Conductivity tensor (σ) Ji = σij Ej rank 2 tensor

• Stress tensor (P ) dFi = Pij dSj rank 2 tensor

2.5 Invariant/Isotropic Tensors

A tensor T is invariant or isotropic1 if it has the same components, Tijk··· in anyCartesian basis (or frame of reference), so that

Tijk··· = `iα `jβ `kγ · · · Tαβγ···

for every (orthogonal) transformation matrix L = `ij.Similarly, T is an invariant pseudotensor if

Tijk··· = detL `iα `jβ `kγ · · · Tαβγ···

Theorem: If aij is a second-rank invariant tensor, then

aijinvariant

= λ δij

Proof: For a rotation of π/2 about the z-axis

L =

0 1 0−1 0 0

0 0 1

Since the only non-zero elements are `12 = 1, `21 = −1, `33 = 1, using aij =`iα `jβ aαβ, we find

a11 = `1α `1β aαβ = `12 `12 a22 = a22

a13 = `1α `3β aαβ = `12 `33 a23 = a23

a23 = `2α `3β aαβ = `21 `33 a13 = −a131Isotropic means “the same in all directions”.

20

Thereforea11 = a22 , a13 = 0 = a23

Similarly, for a rotation of π/2 about the y-axis, we find (exercise)

a11 = a33 , a12 = 0 = a32

Finally, for a rotation of π/2 about the x-axis

a22 = a33 , a21 = 0 = a31

The only solution to these equations is aij = λ δij. We’ve already shown that δij isan invariant second-rank tensor, therefore λ δij is the most general invariant second-rank tensor.

One can use a similar argument to show that the only invariant vector is the zerovector. It’s obvious that no non-zero vector has the same components in all bases!

Theorem: There is no invariant tensor of rank 3. The most general invariantpseudotensor of rank 3 has components

aijkinvariant

= λ εijk

Proof: This is similar to the rank 2 case [tutorial].

Theorem: The most general 4th rank invariant tensor has components

aijklinvariant

= λ δij δkl + µ δik δjl + ν δil δjk

The proof is long and not very illuminating. See, for example, Matthews, VectorCalculus, (Springer) or Jeffreys, Cartesian Tensors, (CUP). However, it’s easy toshow that the expression above is indeed an invariant tensor.

Invariant tensors of higher rank

• The most general rank-5 invariant pseudotensor has components

aijklminvariant

= λ δij εklm + µ δik εjlm . . .

where the dots indicate all terms consisting of a constant × further permuta-tions of the 5 indices.

• The most general rank-6 invariant tensor has components

aijklmninvariant

= λ δij δkl δmn + . . .

Note that invariant tensors involving products of two epsilons can always berewritten as sums of products of δs using the expression for the product of twoepsilons.

• The most general invariant tensor of rank 2n is a sum of products of constantstimes n Kronecker deltas. There is no invariant pseudotensor of rank 2n.

• Similarly, the most general invariant pseudotensor of rank 2n + 1 is a sum ofproducts of constants times one ε and n − 1 Kronecker deltas. There is noinvariant tensor of rank 2n+ 1.

21

Chapter 3

Rotation, reflection & inversiontensors

The transformations of the previous chapters in which we rotate (or reflect or invert)the basis vectors keeping the vector fixed are called passive transformations.

Alternatively we can keep the basis fixed and rotate (or reflect or invert) the vector.These are called active transformations and are represented by second-rank tensors.

3.1 The rotation tensor

3.1.1 Rotation about an arbitrary axis

Consider a rotation of a rigid body, through an-gle θ, about an axis which points in the directionof the unit vector n. The axis passes through afixed origin O.

The point P is rotated toQ. The position vectorx is rotated to y .

In the first diagram,−→OS is the projection of x

onto the n direction, i.e.(x · n

)n.

In the second diagram, n× x is parallel to−→TQ.

P

~n

Q

~x

~y

S

O

θ

θ

~n× ~x

SP

Q

T

y =−→OS +

−→ST +

−→TQ

=(x · n

)n + SQ cos θ︸ ︷︷ ︸

|−→ST |

(x− (x · n)n

)

SP︸ ︷︷ ︸SP

+ SQ sin θ︸ ︷︷ ︸|−→TQ|

n× x|n× x|︸ ︷︷ ︸TQ

=(x · n

)n + cos θ

(x−

(x · n

)n)

+ sin θ n× x

(as SP = SQ and |n× x| = SP = SQ).

22

This gives the important result

y = x cos θ + (1− cos θ)(n · x

)n+ (n× x) sin θ

In index notation, this is

yi = xi cos θ + (1− cos θ)nj xj ni + εikj nk xj sin θ

which we write asyi = Rij

(θ, n)xj

This is a linear relation between the vectors x and y, so R(θ, n)

is a rank 2 tensor,the rotation tensor, with components

Rij

(θ, n)

= δij cos θ + (1− cos θ)ni nj − εijknk sin θ

3.1.2 Some important properties of the rotation tensor

(i) In any basis R is represented by an orthogonal matrix, with components Rij,and detR = 1. Proofs [tutorial]

(ii) It is straightforward to show that [tutorial]

TrR = Rii = 1 + 2 cos θ [independent of n]

−12εkijRij = nk sin θ

If R is known, then the angle θ and the axis of rotation n can be determined.

Note: R has 1 + (3− 1) = 3 independent parameters.

(iii) The product of two rotations xR→ y

S→ z is given by z = SRx .

(iv) Consider a small (infinitesimal) rotation δθ, for whichcos δθ = 1 +O(δθ2) and sin δθ = δθ +O(δθ3) , then

Rij = δij − εijk nk δθ

A quicker (and sufficient) graphical proof follows di-rectly from the diagram on the right, which gives

y − x = n× x δθ

from which the result above follows directly. ~n ~x

~y

δθ

|~n× ~x|

|~n× ~x|δθ

(v) For θ 6= 0, π, R has only one real eigenvalue +1 , with one real eigenvector n .[tutorial]

23

3.2 Reflections and inversions

Consider reflection of a vector x → y in aplane with unit normal n.

From the figure y = x− 2(x · n

)n

In index notation, this becomes

yi = σij xj where σij = δij − 2ni nj

The quantities σij are the components of thereflection tensor σ.

~n

~x

~y

Inversion of a vector in the origin is given by y = −x . This defines the parityoperator P :

yi = Pij xj where Pij = −δij

The parity operator is an invariant tensor with components −δij .

For reflections and inversions, respectively, detσ = detP = −1.

Note that for reflections and inversions, performing the operation twice yields theoriginal vector, i.e. σ2 = I , P 2 = I.

3.3 Projection operators

P is a parallel projection operator onto a vector u if

P u = u and P v = 0

where v is any vector orthogonal to u , i.e. v · u = 0 . Similarly Q is an orthogonalprojection to u if

Qu = 0 and Qv = v

so that Q = I − P . Suitable operators are (exercise: check this!)

Pij =ui uju2

and Qij = δij −ui uju2

These have the properties

P 2 = P , Q2 = Q , PQ = QP = 0

They’re also unique. For example, if there exists another operator T with the sameproperties as P , i.e. Tu = u and T v = 0, then for any vector w ≡ µu+ν v+λu×vwe have

(P − T )w =(µu+ 0 + 0

)−(µu+ 0 + 0

)= 0

because (P(u× v

))i

= Pij(u× v

)j

=(ui uj/u

2)εjkl uk vl = 0

This holds for all vectors w , so T = P .

24

3.4 Active and passive transformations

• Rotation of a vector in a fixed basis is called an active transformation

x → y with yi = Rij xj in the ei basis

• Rotation of the basis whilst keeping the vector fixed is called a passive trans-formation

ei → e′

i and xi → x′i = `ij xj

If we set Rij = `ij, then numerically yi = x′i.

Consider a simple example of both types of rotation:

Rotation of a vector about the z-axis

Rij(θ, e3) = δij cos θ + (1− cos θ) δi3 δj3 − εijk δk3 sin θ

=

cos θ − sin θ 0sin θ cos θ 0

0 0 1

where we used ni = δi3 .

This is an active rotation through angle θ .

~e1

~e2

~e3

~y

θ~x

Rotation of the basis about the z-axis

e′i

= `ij ej ≡ Rij ej

In components

e′1

= cos θ e1− sin θ e

2

e′2

= sin θ e1

+ cos θ e2

e′3

= e3

This is a passive rotation through angle −θ.

~e1

~e2

~e3, ~e′3

~x

~e′2

~e′1

θ

We conclude that

An active rotation of the vector x through angle θ is equivalent to apassive rotation of the basis vectors by an equal and opposite amount.

Colloquially, rotating a vector in one direction is equivalent to rotatingthe basis in the opposite direction.

The general case can be built from three rotations (Euler angles).

25

Chapter 4

The inertia tensor

4.1 Angular momentum and kinetic energy

Suppose a rigid body of arbitrary shape rotates withangular velocity ω = ω n about a fixed axis, parallel tothe unit vector n, which passes through the origin.

Consider a small element of mass dm in volume dV atthe point P , with position vector r relative to O.

If the rigid body has density (mass per unit volume)ρ(r), then dm = ρ dV .

The velocity of the element is

v = ω × r

ω

dm

O

P

~r

We can see this geometrically from the figure.

The distance |δr| moved in time δt, is

|δr| = r sinφ δθ = δθ∣∣n× r

∣∣

So its velocity is

v =δr

δt= ω × r where ω =

δθ

δtn

~r

φ

δθ

Alternatively, we can use the rotation tensor R(θ, n)

δxi = Rij

(δθ, n

)xj − xi =

[−εijk nk δθ + O

((δθ)2

)]xj

⇒ δxiδt

=(n× r

)i

δθ

δt

which again gives v = ω × r.

26

4.1.1 Angular momentum

The angular momentum L of a point particle of mass m at position r moving withvelocity v = r is L = r × p, where the momentum p = mv.

The angular momentum dL of an element of mass dm = ρ dV at r is

dL = ρ(r) dV r × vThe angular momentum of the entire rotating body is then

L =

body

ρ r ×(ω × r

)dV

In components

Li =

body

ρ εijk xj (εklm ωl xm) dV

=

body

ρ (δil δjm − δim δjl)xj ωl xm dV

=

body

ρ(r2 ωi − xi xj ωj

)dV

Thus

Li = Iij ωj with Iij =

body

ρ(r2 δij − xi xj

)dV

The geometric quantity I(O) (where O refers to the origin) is called the inertiatensor.1 It is a tensor because L is pseudovector, ω is a pseudovector, and hencefrom the quotient theorem I is a tensor.

Iij is symmetric and independent of the rotation axis n , it’s a property of the body.

4.1.2 Kinetic energy

The kinetic energy, dT , of an element of mass dm is dT = 12 (ρ dV )

∣∣ω × r∣∣2. The

kinetic energy of the body is then

T =1

2

body

ρ (εijk ωj xk) (εilm ωl xm) dV

=1

2

body

ρ (δjlδkm − δjmδkl)ωj xk ωl xm dV

=1

2

body

ρ(ω2r2 − (r · ω)2

)dV

=1

2

body

ρ(r2 δij − xi xj

)dV ωi ωj

which gives

T =1

2Iij ωi ωj =

1

2L · ω

1Some of my colleagues insist that I(O) should be called the moment of inertia tensor. Youwill see both names in books.

27

Alternative (more familiar) forms

Recalling that the angular velocity may be written as ω = ω n, consider

n · L = ni Iij ωj = Iij ni nj ω ≡ I(n) ω

L · n is the component of angular momentum parallel to the axis n, and

I(n) = Iij ni nj =

body

ρ(r2 −

(r · n

)2)dV ≡

body

ρ r2⊥ dV

is the moment of inertia about n, with r⊥ the perpendicular distance from the n-axis.

Similarly for the kinetic energy, so that

L(n) = I(n) ω

T = 12 I

(n) ω2

with L(n) = L · n , I(n) = Iij ni nj

Example: Consider a cube of side a of constant density ρ and mass M = ρa3

Iij(O) =

∫ρ(r2δij − xixj

)dV

In this case

I11 = ρ

∫ a

0

dx dy dz(x2 + y2 + z2

)− x2

= ρ[13y

3xz + 13z

3xy]a0

= 23ρa

5 = 23Ma2

I12 = ρ

∫ a

0

dx dy dz (−xy)

= −ρ[12x

2 12y

2z]a0

= −14ρa

5 = −14Ma2 O

y

a

z

x

Similarly for the other components. By symmetry, I(O) = Ma2

23 −1

4 −14

−14

23 −1

4

−14 −1

423

4.1.3 The parallel axes theorem

It’s often more useful, and also simpler, to find theinertia tensor about the centre of mass G, ratherthan about an arbitrary point O. There is, how-ever, a simple relationship between them.

Taking O to be the origin, and−→OG = R , we have

r = R + r′ , giving

Iij(O) =

∫ρ(r)

(r2δij − xi xj

)dV

=

∫ρ′(r′)

∣∣R + r′∣∣2 δij − (Xi + x′i)

(Xj + x′j

)dV ′

ω

dm

O

P

~r

G

~r′

~R

28

In the above, ρ(r) = ρ(R + r′) ≡ ρ′(r′), and we changed integration variables to r′ .Expanding the integrand and using the definition of G, namely

∫ρ′(r′) r′ dV ′ = 0 ,

we get

Iij(O) =

∫ρ′(r′)

(R2 + r′2

)δij −XiXj − x′i x′j

dV ′

Hence

Iij(O) = Iij(G) +M (R2 δij −XiXj)

where M =∫ρ′(r′) dV ′ is the total mass of the body. This is a general result; given

I(G) we can easily find the moment of inertia tensor about any other point.

The general result above is often called the parallel axes theorem. However, theparallel axes theorem technically refers to the inertia tensor about the same axis nas the original axis

I(n)(O) = I(n)(G) +MR2⊥

where R⊥ ≡√R2 − (R · n)2 is the perpendicular distance of G from the n axis.

Example (revisited): In our previous example, the centre of mass G is at thecentre of the cube with position vector R =

(12a,

12a,

12a). Using Cartesian coordi-

nates with their origin at G

I11(G) = ρ

∫ a/2

−a/2dx dy dz

(x2 + y2 + z2

)− x2

= ρ

13

[y3]a/2−a/2 [x]

a/2−a/2 [z]

a/2−a/2 + 1

3

[z3]a/2−a/2 [x]

a/2−a/2 [y]

a/2−a/2

= ρ

13 · 2(a/2)3 2(a/2) 2(a/2) · 2

= 1

6ρa5 = 1

6Ma2

and

I12(G) = ρ

∫ a/2

−a/2dx dy dz (−xy)

= 0 because[12x

2]a/2−a/2 = 0

Similarly for the other components.

O

y

a

z

x

G

The inertia tensor about the centre of mass is then

Iij(G) = Ma2

16 0 0

0 16 0

0 0 16

ij

=Ma2

6δij

and

M(R2 δij −XiXj

)= Ma2

12 −1

4 −14

−14

12 −1

4

−14 −1

412

ij

Since 12 + 1

6 = 23 , this reproduces our previous result for Iij(O).

29

4.1.4 Diagonalisation of rank-two tensors

Question: are there any directions for ω such that L is parallel to ω?

If so, then L = λω, and hence

(Iij − λδij)ωj = 0

For a non-trivial solution of these three simultaneous linear equations (i.e. ω 6= 0),we must have det (Iij − λδij) = 0. Expanding the determinant, or writing it as

det (Iij − λδij) = 16εijk εlmn (Iil − λδil ) (Ijm − λδjm ) (Ikn − λδkn ) = 0

and then expanding gives

P −Qλ+Rλ2 − λ3 = 0

where

P = 16 εijk εlmn Iil Ijm Ikn

= det I

Q = 16 εijk εlmn (δil Ijm Ikn + Iil δjm Ikn + IilIjm δkn)

= 16 (δjmδkn − δjnδkm) Ijm Ikn × 3

= 12

[(TrI)2 − Tr

(I2)]

R = 16εijk εlmn (δil δjm Ikn + δil Ijm δkn + Iil δjm δkn)

= 16 2 δkn Ikn × 3

= Tr I

For any tensor A, we know that detA and TrA are invariant (the same in any basis),hence the quantities P , Q, R are invariants of the tensor I (i.e. their values are alsothe same in any basis).

The three values of λ (i.e. the solutions of the cubic equation) are the eigenvaluesof the rank-two tensor, and the vectors ω are its eigenvectors.2 We will generallycall the eigenvectors e.

Eigenvectors and eigenvalues: If we take I ω = λω (or Iij ωj = λωi), andmultiply on the left by L, we obtain (in matrix notation)

(LI LTL︸︷︷︸

=1

)ijωj = λ

(Lω

)i

⇒ I ′ij(Lω)j

= λ(Lω)i

In the primed basis, we have by definition I ′ij ω′j = λ′ ω′i . Comparing with the second

equation above, we see that eigenvectors ω are indeed vectors, i.e. they transformas vectors: ω′i = `ij ωj.

Similarly, eigenvalues are scalars, i.e. they transform as scalars: λ′ = λ .

Note that only the direction (up to a ± sign) of the eigenvectors is determined bythe eigenvalue equation, the magnitude is arbitrary.

The answer to our question is that we must find the eigenvalues λ(i), i = 1, 2, 3 andthe corresponding eigenvectors ω(i) , whence L(i) = λ(i) ω(i) (no sum on i).

2Yes, the language is indeed the same as for matrices.

30

Eigenvalues and eigenvectors of a real symmetric tensor

Theorems

(i) All of the eigenvalues of a real symmetric matrix are real.

(ii) The eigenvectors corresponding to distinct eigenvalues are orthogonal.

If a subset of the eigenvalues is degenerate (eigenvalues are equal), the corre-sponding eigenvectors can be chosen to be orthogonal because:

(a) the eigenvector subspace corresponding to the degenerate eigenvalues isorthogonal to the other eigenvectors;

(b) within this subspace, the eigenvectors can be chosen to be orthogonal bythe Gram-Schmidt procedure.

Proofs will not be given here – see books or lecture notes from mathematics courses.

Diagonalisation of a real symmetric tensor

Let T be a real second-rank symmetric tensor with real eigenvalues λ(1), λ(2), λ(3)

and orthonormal eigenvectors ` (1), `( 2), ` (3) , so that T ` (i) = λ(i) ` (i) (no summa-

tion) and ` (i) · ` (j) = δij. Let the matrix L have elements

`ij = `(i)j ≡

`(1)1 `

(1)2 `

(1)3

`(2)1 `

(2)2 `

(2)3

`(3)1 `

(3)2 `

(3)3

ij

= ` (i) · ej

I.e the ith row of L is the ith eigenvector of T . L is an orthogonal matrix

(LLT )ij = `im `jm = ` (i)m ` (j)m = δij

We can always choose the normalised eigenvectors ` (i) to form a right-handed basis:

detL = εijk `1i `2j `3k = ` (3) ·(` (1) × ` (2)

)= +1

With this choice, L is a rotation matrix which transforms S to S ′.

The tensor T transforms as (summing over the indices p, q only)

T ′ij = `ip `jq Tpq

= ` (i)p Tpq `(j)q

= ` (i)p λ(j) ` (j)p

or

T ′ij = λ(j) δij =

λ(1) 0 0

0 λ(2) 0

0 0 λ(3)

ij

Thus we have found a basis or frame, S ′, in which the tensor T takes a diagonalform; the diagonal elements are the eigenvalues of T .3

3Thus tensors may be diagonalised in the same way as matrices.

31

The inertia tensor

When studying rigid body dynamics, it’s (usually) best to work in a basis in whichthe inertia tensor is diagonal. The eigenvectors of I define the principal axes of thetensor. In this (primed) basis

I ′ =

A 0 00 B 00 0 C

where the (positive) quantities A, B, C are called the principal moments of inertia.

In this basis, the angular momentum and kinetic energy take the form

L = Aω′1 e′1

+B ω′2 e′2

+ C ω′3 e′3

T = 12

(Aω′ 21 +B ω′ 22 + C ω′ 23

)

For a free body (i.e. no external forces), L and T are conserved (time-independent),but ω will in general be time dependent.

Note that the angular momentum L is parallel to the angular velocity ω whenboth are parallel to one of the principal axes of the inertia tensor. For example, ifω = ω′1 e

′1

thenL = Aω′1 e

′1

In this case, the body is rotating about a principal axis which passes through itscentre of mass.

Notes:

• The principal axes basis is used in the Lagrangian Dynamics course to studythe rotational motion of a free rigid body in the Newtonian approach to dy-namics, and the motion of a spinning top with principal moments (A,A,C) inthe Lagrangian approach.

• The principal axes basis/frame is ‘fixed to the body’, i.e. it moves with therotating body, and is therefore a non-inertial frame.

The inertia ellipsoid

The remainder of this chapter is non-examinable, but it may be of interest to thosetaking the course Lagrangian Dynamics.

A geometrical picture is provided by the inertia ellipsoid, which is defined by

Iij ωi ωj = 1

A factor of√

2T is absorbed into ω by convention: ωi → ωi/√

2T .

32

In the principal axes basis, where ω′i = `ij ωj,we have

Aω′ 21 +B ω′ 22 + C ω′ 23 = 1

This is called the normal form. It describesan ellipsoid because A, B, C are all positive.(This follows from the definition, for exampleA =

∫ρ (y2 + z2) dV .)

[The angular momentum L is labelled h in thefigure on the right. To be fixed . . . ]

ω2

ω′1

O

~h~n

ω1

ω′3

ω′2

ω3

P

In any basis, a small displacement ω → ω+ dω onthe ellipsoidal surface at the point P , with normaln, obeys

Iij ωi dωj = Lj dωj = 0

for all dω.

Therefore L is orthogonal to dω and parallel to n,i.e. L is always orthogonal to the surface of theellipsoid at P .

~n

P

d~ω

In the principal axes basis

L = Aω′1 e′1

+B ω′2 e′2

+ C ω′3 e′3

The directions for which L is parallel to ω are obviouslythe directions of the principal axes of the ellipsoid. Forexample, if ω = ω′1 e

′1

then

L = Aω′1 e′1

In this case, the body is rotating about a principal axiswhich passes through its center of mass.

This gives a ‘geometrical’ answer to our original ques-tion.

ω′1

O

ω′3

ω′2

~n

P

Notes:

• If two principal moments are identical (A,A,C), the ellipsoid becomes aspheroid.

• If all three principal moments are identical the ellipsoid becomes a sphere, andL is always parallel to ω.

33

Chapter 5

Taylor expansions

Taylor expansion is one of the most important and fundamental techniques in thephysicist’s toolkit. It allows a differentiable function to be expressed as a powerseries in its argument(s). This is useful when approximating a function, it oftenallows the problem to be ‘solved’ in some range of interest, and it’s used in derivingfundamental differential equations. We shall use the expression ‘Taylor’s Theorem’interchangeably with ‘Taylor expansion’.

We shall assume familiarity with Taylor expansions of functions of one variable, sowe won’t go through this material on the blackboard in the lectures. However, forcompleteness, we include some comprehensive notes here.

You may have seen the multivariate Taylor expansion beyond leading order in pre-vious courses, but probably not in the way it’s done here. . .

5.1 The one-dimensional case

Let f(x) have a continuous mth-order derivative f (m)(x) in a ≤ x ≤ b, so that

∫ x1

a

f (m) (x0) dx0 = f (m−1)(x1)− f (m−1)(a)

Integrating a total of m times gives

∫ xm

a

· · ·∫ x1

a

f (m)(x0) dx0 · · · dxm−1

=

∫ xm

a

· · ·∫ x2

a

[f (m−1)(x1)− f (m−1)(a)

]dx1 · · · dxm−1

=

∫ xm

a

· · ·∫ x3

a

[f (m−2)(x2)− f (m−2)(a)− (x2 − a)f (m−1)(a)

]dx2 · · · dxm−1

= f(xm)− f(a)− (xm − a)f ′(a)− 1

2!(xm − a)2f ′′(a)− · · ·

− 1

(m− 1)!(xm − a)m−1f (m−1)(a)

34

where we used the basic integral∫ x

a

(y − a)n−1 dy =1

n(x− a)n

Now let xm → x, which gives

f(x) = f(a) + (x− a)f ′(a) +1

2!(x− a)2f ′′ + . . .+

1

(m− 1)!(x− a)m−1f (m−1)(a) +Rm ,

where is n! = n(n − 1) · · · 1 is the usual factorial function, with 0! = 1, and theremainder Rm is

Rm =

∫ x

a

· · ·∫ x1

a

f (m)(x0) dx0 · · · dxm−1

But from the mean value theorem applied to f (m), we have∫ x

a

f (m)(x0) dx0 = (x− a)f (m)(ξ) , a ≤ ξ ≤ x

which gives the “Lagrange form” for the remainder

Rm(x) =1

m!(x− a)mf (m)(ξ) , a ≤ ξ ≤ x

Notes

• We can repeat the proof above for∫ ax· · · where x ∈ [c, a] with c ≤ a ≤ b.

Since nothing changes, we can talk about expansion in a region about x = a.

• If (limm→∞Rm)→ 0 (as usually assumed here), we have an infinite series

f(x) =n=∞∑

n=0

1

n!f (n)(a) (x− a)n

This is the Taylor expansion of f(x) about x = a.

The set of x values for which the series converges is called the region of con-vergence of the Taylor expansion.

• If a = 0 , then

f(x) =∞∑

n=0

1

n!f (n)(0)xn

The Taylor expansion about x = 0 is called the Maclaurin expansion.

Physicist’s “proof” We can bypass the formal proof above by assumingthat a power series expansion of f(x) exists (i.e. the polynomials xn form acomplete basis) so that

f(x) =∞∑

n=0

anxn .

Now differentiate n times, and equate coefficients for each n, to obtain

f (n)(0) = 0 + · · ·+ p! ap + · · ·+ 0 which gives ap =1

p!f (p)(0) (as before)

35

5.1.1 Examples

Example 1: Expand the function

f(x) = sinx

about x = 0. We need

f (2n)(0) = (−1)n sin 0 = 0

f (2n+1)(0) = (−1)n cos 0 = (−1)n

Now, since |f (m)(ξ)| ≤ 1, then, for fixed x,

|Rm| =1

m!xm∣∣f (m)(ξ)

∣∣ ≤ 1

m!xm → 0

x− 13!x3

x

Therefore

sinx =∞∑

n=0

(−1)nx2n+1

(2n+ 1)!= x− x3

3!+x5

5!+ . . .

This ‘small x’ expansion is shown in the figure.

Example 2: Expand the function

f(x) = (1 + x)α

about x = 0. In this case

f (n)(0) = α(α− 1) · · · (α− n+ 1) ≡ α!

(α− n)!

giving

f(x) =∞∑

n=0

α!

n!(α− n)!xn ≡

∞∑

n=0

(αn

)xn

The Taylor expansion includes the binomial expansion, α need not be a +ve integer.

Example 3: a ‘problem’ case Consider, for example, the well-behaved function

f(x) = exp

(− 1

x2

)

Now f(0) = 0, and f (n)(0) = 0 ∀n, so

f(x) = 0 + 0 + 0 + . . . = 0 ∀x

Beware of essential singularities – not all functionswith an infinite number of derivatives can be ex-pressed as a Taylor series. See Laurent Series inHonours Complex Variables next semester.

36

5.1.2 A precursor to the three-dimensional case

If we regard f(x+ a) ≡ g(a) temporarily as a function of a only, we can write g(a)as a Maclaurin series in powers of a

f(x+ a) ≡ g(a) =∞∑

n=0

1

n!g(n)(0) an , g(n)(0) = f (n)(x)

which we can rewrite as

f(x+ a) =∞∑

n=0

1

n!f (n)(x) an ≡ exp

(a

d

dx

)f(x)

The differential operator exp

(a

d

dx

)is defined by its power-series expansion. This

is the form that we shall generalise to three dimensions in an elegant way.

It can also be obtained by first defining

F (t) ≡ f(x+ at)

which we regard as a function of t. We need the expansion in powers of t aboutt = 0, namely

F (t) =∞∑

n=0

tn

n!F (n)(0) (5.1)

Noting that F (n)(0) = an f (n)(x), and setting t = 1, we find

f(x+ a) = F (1) =∞∑

n=0

1

n!an f (n)(x)

as before.

5.2 The three-dimensional case

With this trick, we can use the one-dimensional result to find the Taylor expansionof φ(r + a) in powers of a about the point r . Let

F (t) ≡ φ(r + ta) ≡ φ(u) (where we defined u = r + ta)

=∞∑

n=0

tn

n!F (n)(0)

where we used equation (5.1) above. We want φ(r + a) which is F (1).

Using the chain rule, the first derivative of F (t) with respect to t is

F (1)(t) =∂φ(u)

∂ui

∂ui∂t

=∂φ

∂uiai =

(a · ∇

u

)φ(u)

37

where we used

∂ui∂t

=∂

∂t(xi + tai) = ai and defined a · ∇

u≡ ai

∂φ

∂ui

The nth derivative of F (t) is

F (n)(t) =(a · ∇

u

)nφ(u) and hence F (n)(0) =

(a · ∇

r

)nφ(r) (5.2)

For F (1) we have

φ(r + a) =∞∑

n=0

1

n!

(a · ∇

r

)nφ(r) ≡ exp

(a · ∇

r

)φ(r)

This is the Taylor expansion of a scalar field φ(r) in three dimensions, in a ratherelegant form.

Generalisation to an arbitrary tensor field is easy. Simply replace φ(r) by Tij···(r)in the above expression.

Example: Find the Taylor expansion of φ(r + a) =1

|r + a| for r a.

Since φ(r) = 1/|r| = 1/r , we have

1

|r + a| =∞∑

n=0

1

n!

(a · ∇

r

)n 1

r

=1

r+

1

1!(ai∂i)

1

r+

1

2!(ai ∂i) (aj ∂j)

1

r+ · · ·

=1

r− a · r

r3+

3(a · r)2 − a2 r22r5

+ O

(1

r4

)

Exercise: check this explicitly.

This result is used in the multipole expansion in electrostatics and magnetostatics.

38

Chapter 6

Curvilinear coordinates

As you have seen in previous courses, it is often convenient to work with coordinatesystems other than Cartesian coordinates xi, i.e. (x1, x2, x3) or (x, y, z).

For example, spherical polar coordinates (r, θ, φ)are defined by:

x = r sin θ cosφ

y = r sin θ sinφ

z = r cos θ

- y

6z

x

r

P

θ

φ

We shall set up a formalism to deal with rather general coordinate systems, of whichspherical polars are a very important example.

Suppose we make a transformation from the Cartesian coordinates (x1, x2, x3) tothe variables (u1, u2, u3), which are functions of the xi

u1 = u1(x1, x2, x3)

u2 = u2(x1, x2, x3)

u3 = u3(x1, x2, x3)

If the variables ui are single-valued functions of the variables xi, then we canmake the inverse transformations,

xi = xi(u1, u2, u3) for i = 1, 2, 3,

except possibly at certain points.

A point may be specified by its Cartesian coordinates xi, or its curvilinear coor-dinates ui. We may define the curvilinear coordinates by equations giving xias functions of ui, or vice-versa.1

• For Cartesian coordinates, the surfaces ‘xi = constant’ (i = 1, 2, 3) are planes,with (constant) normal vectors e i (the Cartesian basis vectors) intersecting atright angles.

1Sometimes the curvilinear coordinates are called (u, v, w), just as Cartesians are called (x, y, z).

39

• For general curvilinear coordinates,the surfaces ‘ui = constant’ do nothave constant normal vectors, nor dothey intersect at right angles.

For example, in 2-D, we might havethe situtation illustrated in the dia-gram on the right.

x

x = c

x = cu = constant

u = constant

x

2

2

1 1

1

2

From the definition of spherical polar coordinates (r, θ, φ), we have

r =√x2 + y2 + z2 θ = cos−1

z√

x2 + y2 + z2

φ = tan−1

(yx

).

The surfaces of constant r, θ, and φ are

r = constant ⇒ spheres centred at the origin

θ = constant ⇒ cones of semi-angle θ and axis along the z-axis

φ = constant ⇒ planes passing through the z-axis

Not all of these surfaces are planes, but they do intersect at right angles. The angleθ is undefined at the origin; φ is undefined on the z-axis.

6.1 Orthogonal curvilinear coordinates

If the coordinate surfaces (surfaces of constant ui), intersect at right angles, as in theexample of spherical polars, the curvilinear coordinates are said to be orthogonal.

6.1.1 Scale factors and basis vectors

Let point P have position vector r = r(u1, u2, u3). If we change u1 by du1 (with u2and u3 fixed), then r → r + dr , with

dr =∂r

∂u1du1 ≡ h1 e 1 du1

where we have defined the scale factor h1 and the unit vector e 1 by

h1 =

∣∣∣∣∂r

∂u1

∣∣∣∣ and e 1 =1

h1

∂r

∂u1

• The scale factor h1 gives the length h1 du1 of dr when we change u1 → u1+du1.

• e 1 is a unit vector in the direction of increasing u1 (with fixed u2 and u3.)

Similarly, we can define hi and e i for i = 2 and 3.

40

In general, if we change a single ui, keeping the other two fixed, we have

∂r

∂ui= hi e i i = 1, 2, 3 [no sum on i]

• The unit vectors e i are in general not constant vectors – their directionsdepend on the position vector r, and hence on the curvilinear coordinatesui. [They should perhaps be called e

ui or e

u, e

v, e

w to avoid confusion

with Cartesian basis vectors.]

• If the curvilinear unit vectors satisfy e i · e j = δij, the ui are said to beorthogonal curvilinear coordinates, and the three unit vectors e i form anorthonormal basis. We will always choose the ui to give a right-handed basis.

e

e

e

u

u

u

1

1

33

2

2

6.1.2 Examples of orthogonal curvilinear coordinates (OCCs)

Cartesian coordinates:

r = x ex + y e y + z e z ⇒ hx ex =∂r

∂x= ex , etc.

The scale factors are all unity, and each individual unit vector points in the samedirection everywhere.

Spherical polar coordinates: u1 = r , u2 = θ , u3 = φ (in that order)

r = r sin θ cosφ ex + r sin θ sinφ e y + r cos θ e z

∂r

∂r= sin θ cosφ ex + sin θ sinφ e y + cos θ e z ⇒ hr =

∣∣∣∣∂r

∂r

∣∣∣∣ = 1

∂r

∂θ= r cos θ cosφ ex + r cos θ sinφ e y − r sin θ e z ⇒ hθ =

∣∣∣∣∂r

∂θ

∣∣∣∣ = r

∂r

∂φ= −r sin θ sinφ ex + r sin θ cosφ e y ⇒ hφ =

∣∣∣∣∂r

∂φ

∣∣∣∣ = r sin θ

Hence the unit vectors for spherical polars are

e r = sin θ cosφ ex + sin θ sinφ e y + cos θ e z = r / r

e θ = cos θ cosφ ex + cos θ sinφ e y − sin θ e z

eφ = − sinφ ex + cosφ e y

41

These unit vectors are normal to the surfaces described above (spheres, cones andplanes).

They are orthogonal:

e r · e θ = e r · eφ = e θ · eφ = 0

And they form a right-handed orthonormalbasis:

e r×e θ = eφ , e θ×eφ = e r , eφ×e r = e θ .

See also tutorial sheet.

- y

6z

x

r

P

θ

φ

3e r

AAUe θ

: eφ

Cylindrical coordinates: u1 = ρ , u2 = φ , u3 = z (in that order)

r = ρ cosφ ex + ρ sinφ e y + z e z

⇒ ∂r

∂ρ= cosφ ex + sinφ e y

∂r

∂φ= −ρ sinφ ex + ρ cosφ e y

∂r

∂z= e z

The scale factors are then (tutorial) hρ = 1 , hφ = ρ , hz = 1 , and the basis vectorsare

e ρ = cosφ ex + sinφ e y eφ = − sinφ ex + cosφ e y e z = e z

These unit vectors are normal to surfaces which are (respectively): cylinders centredon the z-axis (ρ = constant), planes through the z-axis (φ = constant), planesperpendicular to the z axis (z = constant), and they are clearly orthonormal.

6.2 Elements of length, area and volume

6.2.1 Vector length

If we change u1 → u1 + du1, keeping u2 and u3 fixed, then r → r + dr1

withdr

1= h1 e 1 du1. The infinitesimal element of length along e 1 is h1 du1.

Similarly, the infinitesimal lengths along the curvilinear basis vectors e 2, e 3, areh2 du2, h3 du3 respectively.

If we change ui → ui + dui, for all i = 1, 2, 3, then

dr = h1 du1 e 1 + h2 du2 e 2 + h3 du3 e 3 =3∑

i=1

hi dui e i

Note: We are not using the summation convention here – because the expressionfor dr above contains 3 identical indices! Great care is needing when using thesummation convention with OCCs.

42

6.2.2 Arc length and metric tensor

Defining ds as the length of the infinitesimal vector dr , we have (ds)2 = dr · drIn Cartesian coordinates: (ds)2 = (dx)2 + (dy)2 + (dz)2

(i) For an arbitrary set of curvilinear coordinates (not necessarily orthogonal),letting uk → uk + duk , ∀ k = 1, 2, 3, gives

(dr)k = dxk =∂xk∂u1

du1 +∂xk∂u2

du2 +∂xk∂u3

du3 =∂xk∂ui

dui

The square (ds)2 of the length ds is then

(ds)2 = dxk dxk =∂xk∂ui

∂xk∂uj

dui duj ≡ gij dui duj (6.1)

where we defined the 32 = 9 components gij of the metric tensor by

gij = gji =∂xk∂ui

∂xk∂uj

=∂r

∂ui· ∂r∂uj

The 9 quantities gij are the components of a symmetric second-rank tensor.For an arbitrary set of curvilinear coordinates, gij is not diagonal in general.

(ii) In terms of scale factors2

(ds)2 =

(∑

i

hi e i dui

)·(∑

j

hj e j duj

)=∑

ij

hi hj(e i · e j

)dui duj (6.2)

For orthogonal curvilinear coordinates, we have e i · e j = δij, and hence

(ds)2 =∑

i

h2i (dui)2 = h21 du21 + h22 du22 + h23 du23 (6.3)

Comparing equations (6.1) and (6.2), the metric tensor can be written as

gij = hi hj(e i · e j

)

For orthogonal curvilinear coordinates, the metric tensor is diagonal

gij = h2i δij (no sum on i)

and it’s simplest to use equation (6.3).

Example: For spherical polars, we showed that

hr = 1 hθ = r hφ = r sin θ

In this case(ds)2 = (dr)2 + r2(dθ)2 + r2 sin2 θ (dφ)2

2The shorthand∑

i

means

3∑

i=1

etc.

43

6.2.3 Vector area

If we let u1 → u1 + du1 , with u2, u3 fixed, then

r → r + dr1

with dr1

= h1 e 1 du1

Similarly, if we let u2 → u2+du2, with u1, u3 fixed,

r → r + dr2

with dr2

= h2 e 2 du2

dS

dS

e1r(u1, u2, u3)

e2

h2 du2

h1 du1

The vector area of the infinitesimal parallelogram whose sides are the vectors dr1

and dr2

isdS = (dr

1)× (dr

2) = (h1 du1 e 1)× (h2 du2 e 2)

For OCCs, this simplifies to

dS3

= h1 h2 du1 du2 e 3 ,

since e 1 × e 2 = e 3 for orthogonal systems. dS3

points in the direction of e 3, whichis normal to the surfaces u3 =constant, and the infinitesimal area is a rectangle.

The vector areas dS1

and dS2

are defined similarly.

Example: For spherical polars, if we vary θ and φ, keeping r fixed, we easily obtaina familiar result

dSr

= (hθ dθ e θ)×(hφ dφ eφ

)= hθ hφ dθ dφ e r = r2 sin θ dθ dφ e r

Similarly, if we vary φ and r, keeping θ fixed, we obtain the vector element of areaon a cone of semi-angle θ, with its axis along the z axis

dSθ

=(hφ dφ eφ

)× (hr dr e r) = hφ hr dφ dr e θ = r sin θ dr dφ e θ

Similarly for dSφ

[tutorial].

6.2.4 Volume

The volume of the infinitesimal parallelepiped with edges dr1, dr

2and dr

3is

dV =∣∣(dr

1× dr

2

)· dr

3

∣∣ = |((h1 du1 e 1)× (h2 du2 e 2)) · (h3 du3 e 3)|

For OCCs, we have (e 1 × e 2) · e 3 = 1 (in a RH basis), hence

dV = h1 h2 h3 du1 du2 du3

In this case, the infinitesimal volume is a cuboid.

For spherical polars, we have dV = hr hθ hφ dr dθ dφ = r2 sin θ dr dθ dφ

For a general set of curvilinear coordinates

dV =√g du1 du2 du3

where g is the determinant of the metric tensor [hand-in].

44

6.3 Components of a vector field in curvilinear

coordinates

A vector field a(r) can be expressed in terms of curvilinear components ai, definedby

a(r) =3∑

i=1

ai(u1, u2, u3) e i

where e i is the ith curvilinear basis vector (which again should really be called eui

to avoid confusion with the Cartesian basis vectors.)

For orthogonal curvilinear coordinates, the component ai can be obtained by takingthe scalar product of a with the ith curvilinear basis vector e i

ai = a(r) · e i

NB ai must be expressed in terms of u1, u2, u3 (not x, y, z) when working in theui basis.

Example: If a = a ex in Cartesians, then in spherical polars

ar = a · e r = (a ex) ·(sin θ cosφ ex + sin θ sinφ e y + cos θ e z

)= a sin θ cosφ

Similarly, aθ = a · e θ and aφ = a · eφ , and we obtain a in the spherical-polar basis(exercise)

a(r, θ, φ) = a(sin θ cosφ e r + cos θ cosφ e θ − sinφ eφ

)

• You can often spot the curvilinear components “by inspection”.

• In general, one chooses the set of coordinates which matches most closely thesymmetry of the problem.

45

6.4 Div, grad, curl and the Laplacian in OCCs

6.4.1 Gradient

The gradient of a scalar field3 f(r) is defined in terms of the change in the fielddf(r) when r → r + dr

df(r) = ∇ f(r) · dr (6.4)

Now write f(r) in terms of orthogonal curvilinear coordinates: f(r) = f(u1, u2, u3).

As usual, we denote the curvilinear basis vectors by e 1, e 2, e 3 .

Let u1 → u1 + du1, u2 → u2 + du2, and u3 → u3 + du3.

Using the chain rule, we have

df =∂f

∂u1du1 +

∂f

∂u2du2 +

∂f

∂u3du3 (6.5)

We need to rewrite the RHS of this equation in the form of equation (6.4). Startwith

dr = h1 du1 e 1 + h2 du2 e 2 + h3 du3 e 3

and use orthogonality of the curvilinear basis vectors, e i · e j = δij, to rewriteequation (6.5) as

df =∂f

∂u1du1 +

∂f

∂u2du2 +

∂f

∂u3du3

=

(∂f

∂u1e 1 +

∂f

∂u2e 2 +

∂f

∂u3e 3

)· (e 1 du1 + e 2 du2 + e 3 du3)

=

(1

h1

∂f

∂u1e 1 +

1

h2

∂f

∂u2e 2 +

1

h3

∂f

∂u3e 3

)· (h1 e 1 du1 + h2 e 2 du2 + h3 e 3 du3)

=

(1

h1

∂f

∂u1e 1 +

1

h2

∂f

∂u2e 2 +

1

h3

∂f

∂u3e 3

)· dr

Comparing this result with equation (6.4), which holds for all dr, gives us ∇ f inorthogonal curvilinear coordinates

∇ f =1

h1

∂f

∂u1e 1 +

1

h2

∂f

∂u2e 2 +

1

h3

∂f

∂u3e 3 =

3∑

i=1

1

hi

∂f

∂uie i

For spherical polars, hr = 1, hθ = r, hφ = r sin θ, and we have

∇ f(r, θ, φ) =∂f

∂re r +

1

r

∂f

∂θe θ +

1

r sin θ

∂f

∂φeφ

3We use f(r) rather than φ(r) in order to avoid confusion with the angle φ in spherical polars.

46

6.4.2 Divergence

Let a(r) be a vector field, which we write in orthogonal curvilinear coordinates as

a(r) =3∑

i=1

ai(u1, u2, u3) e i

where ai are the components of a in the curvilinear basis, e i is the ith curvilinearbasis vector, and a is continuously differentiable.

We obtain ∇·a in orthogonal curvilinears using the integral definition of divergence

∇ · a = limδV→0

1

δV

δS

a · dS ,

where δS is the closed surface bounding δV .

Let the point P have curvilinear coordinates (u1, u2, u3).

Choose δV to be a small “cuboid” with its threeedges δr

i along the basis vectors e i at P :

δr1

= h1 δu1 e 1

δr2

= h2 δu2 e 2

δr3

= h3 δu3 e 3

P APPq u1

Su3

D

B

CR

1u2

Q

The outward element of area on the face ABCD is dS = +h2 h3 du2 du3 e 1

The outward element of area on the face PQRS is dS = −h2 h3 du2 du3 e 1

The contributions to the surface integral from the faces ABCD and PQRS are then

∫ u3+δu3

u3

∫ u2+δu2

u2

[a1 h2 h3]ABCD − [a1 h2 h3]PQRS

du′2 du′3

=

∫ u3+δu3

u3

∫ u2+δu2

u2

[a1 h2 h3](u1+δu1,u′2,u′3)

− [a1 h2 h3](u1,u′2,u′3)

du′2 du′3

=

∫ u3+δu3

u3

∫ u2+δu2

u2

δu1

[∂

∂u1(a1 h2 h3)

]

(u1,u′2,u′3)

du′2 du′3 (by Taylor’s theorem)

= δu1 δu2 δu3

[∂

∂u1(a1 h2 h3)

]

(u1,u2,u3)

(6.6)

In the last step, we assumed that δV is small enough that the integrand is ap-proximately constant over the range of integration. We can then approximate theintegrals over u′2 and u′3 by the integrand evaluated at the point P ,

δu1

[∂

∂u1(a1 h2 h3)

]

(u1,u2,u3)

47

multiplied by the range of integration δu2 δu3. [This is a rather crude use of themean value theorem!]

The contributions of the other four faces to the integral over δS can be obtainedsimilarly, or by cyclic permutations of the indices 1, 2, 3 in equation (6.6).

Finally, divide by the volume of the cuboid δV = h1 h2 h3 δu1 δu2 δu3, whereupon allthe factors of δui cancel, and we obtain our final expression for ∇ · a in orthogonalcurvilinear coordinates

∇ · a =1

h1h2h3

∂u1(a1h2h3) +

∂u2(a2h3h1) +

∂u3(a3h1h2)

For Cartesian coordinates, the scale factors are all unity, and we recover the usualexpression for ∇ · a in Cartesians.

For spherical polars we have

∇ · a(r, θ, φ) =1

r2 sin θ

∂r

(r2 sin θ ar

)+

∂θ(r sin θ aθ) +

∂φ(r aφ)

=1

r2∂

∂r

(r2 ar

)+

1

r sin θ

∂θ(sin θ aθ) +

∂φ(aφ)

where ar, aθ, and aφ are the components of the vector field a in the basis e r, e θ, eφ.

6.4.3 Curl

We obtain ∇× a in orthogonal curvilinear coordinates using the line integral defi-nition of curl.

The component of ∇× a in the direction of theunit vector n is

n ·(∇× a

)= lim

δS→0

1

δS

δC

a · dr

where δS is a small planar surface, with unit nor-mal n, bounded by the closed curve δC.

Let δS be a small rectangular surface parallel tothe e 2−e 3 plane with one corner at r(u1, u2, u3),and with edges

δr2

= h2 δu2 e 2 and δr3

= h3 δu3 e 3

which lie along the basis vectors, so that n = e 1.

δS1 2

4 3

r(u1, u2, u3)

e1

e3

e2

δC

The line integral around the curve δC is the sum of the line integrals along the lines

48

1→ 4 respectively,

δC

a · dr =

∫ u2+δu2

u2

[a2 h2](u1,u′2,u3)du′2 +

∫ u3+δu3

u3

[a3 h3](u1, u2+δu2,u′3)du′3

−∫ u2+δu2

u2

[a2 h2](u1,u′2,u3+δu3)du′2 −

∫ u3+δu3

u3

[a3 h3](u1, u2,u′3)du′3

Using Taylor’s theorem, we can write this as

δC

a · dr =

∫ u3+δu3

u3

δu2

[∂

∂u2(a3 h3)

]

(u1, u2,u′3)

du′3

−∫ u2+δu2

u2

δu3

[∂

∂u3(a2 h2)

]

(u1, u′2,u3)

du′2

In each case, we approximate the integrals over u′3 and u′2 by the product of theintegrand and the integration ranges δu3 and δu2, respectively. Hence

δC

a · dr =∂

∂u2(a3 h3) δu2 δu3 −

∂u3(a2 h2) δu3 δu2

where all the ai and hi are evaluated at r(u1, u2, u3).

Finally, we divide by the area of the rectangle δS = h2h3 δu2δu3, whereupon all thefactors of δui cancel, and we obtain

e 1 ·(∇× a

)=(∇× a

)1

=1

h2 h3

∂u2(a3h3) −

∂u3(a2h2)

The components of ∇ × a in the directions of the curvilinear basis vectors e 2 ande 3 may be obtained similarly, or by cyclic permutations of the indices.

It is convenient to write the final result in the form

∇× a =1

h1 h2 h3

∣∣∣∣∣∣∣∣∣∣

h1 e 1 h2 e 2 h3 e 3

∂u1

∂u2

∂u3

h1 a1 h2 a2 h3 a3

∣∣∣∣∣∣∣∣∣∣

For spherical polars we have

∇× a =1

r2 sin θ

∣∣∣∣∣∣∣∣∣∣

e r r e θ r sin θ eφ

∂r

∂θ

∂φ

ar raθ r sin θaφ

∣∣∣∣∣∣∣∣∣∣

49

6.4.4 Laplacian of a scalar field

The action of the Laplacian operator on a scalar field f(r) is defined by ∇2f =∇ · (∇ f) .

Using the expression for ∇ · a, with a = ∇ f , derived above, we find

∇2f =1

h1 h2 h3

∂u1

(h2 h3h1

∂f

∂u1

)+

∂u2

(h3 h1h2

∂f

∂u2

)+

∂u3

(h1 h2h3

∂f

∂u3

)

In spherical polars, we have

∇2f(r, θ, φ) =1

r2 sin θ

∂r

(r2 sin θ

∂f

∂r

)+

∂θ

(sin θ

∂f

∂θ

)+

∂φ

(1

sin θ

∂f

∂φ

)

=1

r2∂

∂r

(r2∂f

∂r

)+

1

r2 sin2 θ

sin θ

∂θ

(sin θ

∂f

∂θ

)+∂2f

∂φ2

The expression for the Laplacian of a scalar field in spherical polars is one of themost useful results in the course, with applications in electromagnetism, quantummechanics, optics, elasticity, fluid mechanics, meteorology, general relativity, cos-mology, . . .

6.4.5 Laplacian of a vector field

The Laplacian of a vector field a(r) in curvilinear coordinates is defined by meansof the identity

∇×(∇× a

)= ∇

(∇ · a

)− ∇2 a

in the form4

∇2 a = ∇(∇ · a

)− ∇×

(∇× a

)

The quantities on the right hand side are evaluated using the expressions for grad,div and curl derived above.

4Remember the mnemonic ‘grad-div-minus-curl-curl’ or GDMCC, pronounced ‘guddumk’.Thanks to Peter Boyle for teaching me this!

50

Chapter 7

Electrostatics

7.1 The Dirac delta function in three dimensions

Consider the mass of a body with density ρ(r). The massof the body is

M =

V

ρ(r) dV

How can we use this general expression for the case of asingle particle? What is the ‘density’ of a single ‘point’particle with mass M at r

0?

~r0

V

O

We need a ‘function’ ρ(r) with the properties

ρ(r) = 0 ∀ r 6= r0

M =∫Vρ(r) dV r

0∈ V

ρ(r) = M δ(r − r

0)︸ ︷︷ ︸

notation

Generalising slightly, we define the delta function to pick out the value of the functionf(r

0)1 at one point r

0in the range of integration, so that

V

dV f(r) δ(r − r0) =

f(r

0) r

0∈ V

0 otherwise

Similarly, the total charge on a body with charge density (charge per unit volume)ρ(r) is

Q =

V

ρ(r) dV

The one dimensional delta function

The delta function may be defined by a sequence of functions δε(x − a), each of‘area’ unity, which have the desired limit when integrated over. We give a numberof examples of how this may be done.

1f(r0) = M in the example above

51

• Top hat

δε(x− a) =

1

2εa− ε < x < a+ ε

0 otherwise

a− ǫ

a

12ǫ

a + ǫ

• Witch’s hat

δε(x−a) =

1

ε2[ε− |x− a|] a− ε < x < a+ ε

0 otherwise

a− ǫ

a

a + ǫ

• Gaussian

δε(x− a) =1

ε√π

exp

−(x− a)2

ε2

a

1ǫ√π

In each case∫ +∞

−∞dx f(x) δε(x− a) =

∫ +∞

−∞dx f(x+ a) δε(x)

=

∫ +∞

−∞dx[f(a) + x f ′(a) + x2/2 f ′′(a) + . . .

]δε(x)

where we shifted the integration variable in the first line, and Taylor-expanded theintegrand in the second. The function f(x) is a ‘good’ test function, i.e. one forwhich the integral is convergent for all ε.

52

For the top hat, we need to evaluate

∫ +∞

−∞dx xn δε(x) =

∫ +ε

−εdx xn

1

=

1

n+1εn n = 0, 2, 4, . . .

0 n = 1, 3, 5, . . .→︸︷︷︸ε→0

=

1 n = 0

0 otherwise

Hence∫ +∞

−∞dx f(x) δε(x− a) →︸︷︷︸

ε→0

f(a) i.e. δε(x− a)→ δ(x− a)

Similarly for the other representations. The Gaussian representation is the cleanest,because it’s a smooth function.

Notes:

(i) The Dirac delta ‘function’ isn’t a function, it’s a distribution or generalisedfunction.

(ii) Colloquially, it’s an infinitely-tall infinitely-thin spike of unit area.

(iii) The delta function is the continuous-variable analogue of the Kronecker deltasymbol. If we let i→ x

ui δij = uj →∫

dx x δ(x− x0) = x0

(iv) An important identity is

∫ +∞

−∞dx f(x) δ (g(x)) =

i

f(xi)

|g′(xi)|

where g(xi) = 0, i.e. xi are the simple zeroes of g(x) [tutorial].

The three dimensional delta function

In Cartesian coordinates (x, y, z),

δ(3)(r − r0) ≡ δ(r − r

0) = δ(x− x0) δ(y − y0) δ(z − z0)

In orthogonal curvilinear co-ordinates (u1, u2, u3),

δ(r − a) =1

h1h2h3δ(u1 − a1) δ(u2 − a2) δ(u3 − a3)

where h1, h2, h3 are the usual scale factors [tutorial].

(In the last equation, we set r0

= a to avoid double subscripts on the RHS.)

53

7.2 Coulomb’s law

Experimentally, the force between two point chargesq and q1 at positions r and r

1, respectively, is given

by Coulomb’s law

F1

=1

4πε0

q q1(r − r

1

)∣∣r − r

1

∣∣3

F1

is the force on the charge q at r, produced by thecharge q1 at r

1.

O

1

~r1

P

~r

Charges can be positive or negative. For qq1 > 0 we have repulsion, and for qq1 < 0we have attraction: like charges repel and opposite charges attract.

In SI units, charge is measured in Coulombs and ε0 is defined to be ε0 = 107/(4πc2)C2N−1m−2.

Aside: Similarly for Newton’s law of gravitation,

F1

= −G mm1

(r − r

1

)∣∣r − r

1

∣∣3

which is always attractive (hence the negative sign, so that G, m, m1 are all positive).

In SI units: G = 6.672× 10−11Nm2kg2.

7.3 The electric field

The electric field E is ‘produced’ by a charge configuration, and is defined in termsof the force on a small positive test charge,

E(r) = limq→0

1

qF

Clearly, E is a vector field.

7.3.1 Field lines

Field lines are the ‘lines of force’ on the test charge.

Newton’s equations imply that the motion of a (test) parti-cle is unique, which implies that the field lines do not cross,and thus that they are well-defined and can be measured.

Thus for our two charges q and q1 we have

F1

= q E(r)

i.e. particle 1 ‘produces’ an electrostatic field E(r) . Thediagram shows the field lines produced by a negative charge.

54

The particle at P ‘feels’ the electrostatic field as a force q E(r) with

E(r) =1

4πε0

q1(r − r

1

)∣∣r − r

1

∣∣3 (7.1)

7.3.2 The principle of superposition

Consider a set of charges qi situated at ri

The principle of superposition is motivated by ex-periment; it states that the total electric field at ris the vector sum of the fields due to the individualcharges at r

i

E(r) =1

4πε0

i

qi(r − r

i

)∣∣r − r

i

∣∣3O

~ri

i

P

~r

In the limit of (infinitely) many charges, we intro-duce a continuous charge density (charge/volume)ρ(r′), so that the charge in dV ′ at position r′ isρ(r′) dV ′. The electric field is then

E(r) =1

4πε0

V

dV ′ ρ(r′)

(r − r′

)∣∣r − r′

∣∣3

O

P

~r

~r′

~r − ~r′

To return to our original example of a single charge q1 at position r1, we simply set

the charge density ρ(r′) = q1 δ(r′ − r

1), which recovers the result in equation (7.1).

7.4 The electrostatic potential for a point charge

Since

∇ 1∣∣r − r1

∣∣ = − (r − r1)

∣∣r − r1

∣∣3 (7.2)

where ∇ operates on r (not r1), then for a point charge q1 at r

1

E(r) =q1

4πε0

(r − r

1

)∣∣r − r

1

∣∣3 = − q14πε0

∇(

1∣∣r − r1

∣∣

)

i.e. we may write

E(r) = −∇φ(r) with φ(r) =1

4πε0

q1∣∣r − r1

∣∣ (7.3)

φ(r) is the electrostatic potential for the electric field E(r).

55

7.5 The static Maxwell equations

7.5.1 The curl equation

For a continuous charge distribution, we again use equation (7.2) to write the electricfield as a gradient

E(r) = − 1

4πε0

V

dV ′ ρ(r′) ∇ 1∣∣r − r′∣∣ = −∇

(1

4πε0

V

dV ′ρ(r′)∣∣r − r′

∣∣

)(7.4)

Note that ∇ operates on r (not r′) so we can take it out of the volume integralover r′ . Therefore

∇× E = −∇×∇(

1

4πε0

V

dV ′ρ(r′)∣∣r − r′

∣∣

)

But the curl of the gradient of a scalar field is always zero, which implies

∇× E = 0

for all static electric fields. This is (the static version of) Maxwell’s third equation.

7.5.2 Conservative fields and potential theory

A vector field that satisfies ∇× E = 0 is said to be conservative (or irrotational).

Consider the integral of ∇×E over an open surface Sbounded by the closed curve C1 − C2. Using Stokes’theorem

0 =

S

(∇× E

)· dS =

C1−C2

E · dr

Therefore ∫

C1

E · dr =

C2

E · dr A

S

C1

C2B

~a

~b

Since the line integral is independent of the path from a to b, it can only depend onthe end points. So, for some scalar field φ, we must have

−∫ b

aE · dr = φ(b)− φ(a)

Now let a = r and b = r+ δr, where δr is small, so we can approximate the integral

−E(r) · δr + . . . = φ(r + δr)− φ(r) = ∇φ · δr + . . .

where we used the definition of the gradient in the last step. Since this holds ∀ δr ,we can always write

E(r) = −∇φ(r)

The scalar field φ(r) is called the potential for the vector field E(r).

56

An explicit expression for φ(r) can be obtained from (7.4). We have E = −∇φ with

φ(r) =1

4πε0

V

dV ′ρ(r′)∣∣r − r′

∣∣

This is linear superposition for potentials.

As in the case of the electric field, if we setρ(r′) = q1 δ(r

′ − r1), we recover the potential

for a single charge (equation (7.3))

φ(r) =1

4πε0

q1∣∣r − r1

∣∣O

P

~r~r1

q1

Notes:

• For a surface charge distribution, with charge/unit-area σ(r), the electric fieldproduced is

E(r) =1

4πε0

S

dS ′ σ(r′)

(r − r′

)∣∣r − r′

∣∣3 and φ(r) =1

4πε0

S

dS ′σ(r′)

|r − r′|

where dS ′ is the infinitesimal (scalar) element of area at r′ on the surface S.

• For a line distribution of charge, with charge/unit-length λ(r)

E(r) =1

4πε0

C

dl′ λ(r′)

(r − r′

)∣∣r − r′

∣∣3 and φ(r) =1

4πε0

C

dl′λ(r′)

|r − r′|

where dl′ is the infinitesimal element of length along the line (or curve) C.

• In SI units, the potential is measured in Volts V . In terms of other unitsV = C/(C2N−1m−1) = NmC−1 = JC−1.

• Field lines are perpendicular to surfaces of con-stant potential φ, called equipotentials or equipo-tential surfaces.

Let dr be a small displacement of the positionvector r of a point in the equipotential surfaceφ = constant.

Therefore0 = dφ = ∇φ · dr

so E = −∇φ is perpendicular to dr.

Thus electric field lines E are everywhere perpen-dicular to the surfaces φ = constant.

φ = const

~n

d~r

− φ = const

57

• The potential φ is only defined up to an overall constant. If we let φ→ φ+ c,the electric field E = −∇φ (and hence the force) is unchanged. So onlypotential differences have physical significance.

In most physical situations, φ → constant as r → ∞, and we usually choosethe constant to be zero.

• So far we’ve defined the potential in purely mathematical terms.

Physically, the potential difference, VAB, between two points A and B is definedas the energy per unit charge required to move a small test charge q from Ato B:

VAB ≡ limq→0

1

qWAB

= −1

q

C

F · dr = −∫

C

E · dr

=

C

∇φ · dr =

∫ B

A

= φB − φA

The −ve sign is because this is the work done against the force F . Since thefield is conservative, the integral is independent of the path – it depends onlyon the end points.

• The potential energy of a point charge q at position r in an external electro-static field E

ext(r) = −∇φext(r) is therefore given by q φext(r).

We may generalise this to a charge distribution in an external electric fieldE

ext(r) = −∇φext . In this case, the (interaction) energy is

W =

V

dV ρ(r)φext(r)

Note that this does not include the the self-energy of the charge distribution.To emphasize this we write φext. [More on this later]

7.5.3 The divergence equation

Let’s return to the potential for an arbitrary charge distribution

φ(r) =1

4πε0

V

dV ′ρ(r′)∣∣r − r′

∣∣

Since E = −∇φ, we have ∇ · E = −∇2φ, and hence

∇ · E(r) = − 1

4πε0

V

dV ′ ρ(r′)∇2 1∣∣r − r′∣∣ (7.5)

Note that ∇2 acts only on r (not on r′), so we can take it inside the integral over r′.

58

Theorem ∇2

(1

r

)= −4π δ(r) ∀ r

Proof: We first prove it for r 6= 0

∇2

(1

r

)r 6=0= −∂i

(xir3

)= −

(3

r3− xi

3

2r−5 2xi

)= 0

To prove the result for r = 0, we integrate ∇2(1/r) over an arbitrary volume Vcontaining the origin r = 0.

V

∇2

(1

r

)dV =

∇2

(1

r

)dV = −

∇ ·( rr3

)dV

= −∫

r

r3· dS = − ε

ε34πε2 = −4π

In the first line, we used our previous result that ∇2(1/r) = 0 everywhere awayfrom the origin to write the original integral as an integral over a sphere of radius εcentred on the origin, with volume Vε and area Sε respectively.

We then used the divergence theorem to obtain the first result on the second line.On the surface Sε, we have r = ε e

rand dS = e

rdS, where e

ris a unit vector in

the direction of r, so the integral over the surface of the sphere is straightforward –check it!

[Alternatively, we may write the surface integral as an integral over solid angle∫Sr · dS/r3 =

∫S

dΩ = 4π.]

We can now take the limit ε → 0, which simply shrinks the sphere down to theorigin, leaving the integral unchanged.

Since our result for the integral holds for an arbitrary volume V centred on theorigin, and

∫Vδ(r) dV = 1, we deduce that ∇2(1/r) = −4π δ(r). Similarly

∇2

(1∣∣r − r′∣∣

)= −4π δ(r − r′)

Substituting this result into equation (7.5) gives

∇ · E(r) = − 1

4πε0

V

dV ′ ρ(r′)(−4πδ

(r − r′

))

Using the delta function to perform the integral on the right hand side, we getMaxwell’s first equation

∇ · E(r) =ρ(r)

ε0

We now have the two electrostatic Maxwell equations

∇ · E =ρ

ε0∇× E = 0

In terms of the potential

E = −∇φ ∇2φ = − ρε0

The second equation is called Poisson’s equation.

59

7.6 Electric dipole

Physically, an electric dipole consists of two nearbyequal and opposite (point) charges, with charge−qsituated at r

0and charge +q at r

0+ d .

Define the dipole moment p = qd.

It will turn out to be useful to consider the dipolelimit, in which

p = limq→∞d→0

qd

with p finite (and constant). This is sometimescalled a point dipole or an ideal dipole.

O

~r

P

~r0

~r0 + ~d

~d

+q

−q

7.6.1 Potential and electric field due to a dipole

Dipole potential The electrostatic potential φ(r) produced by the dipole is

φ(r) =q

4πε0

[1∣∣r − r0− d∣∣ −

1∣∣r − r0

∣∣

]

=q

4πε0

[1∣∣r − r

0

∣∣ +d · (r − r

0)

∣∣r − r0

∣∣3 + O(d2) − 1∣∣r − r0

∣∣

]

where we Taylor (or binomial) expanded the first term about r − r0

[tutorial].

In the dipole limit, the terms of O(qd2) vanish, and the potential is simply

φ(r) =1

4πε0

p · (r − r0)

|r − r0|3

For a dipole at the origin we have

φ(r) =1

4πε0

p · rr3

Note that φ(r) falls off as 1/r2.

Electric field The ith component of the electric field produced by (or due to) adipole of moment p situated at the origin is

Ei(r) = −∂i φ = − 1

4πε0∂i

(pj xjr3

)

= − 1

4πε0pj

[δijr3

+ xj

(−3

2r−5)

2xi

]

Therefore

E(r) =1

4πε0

[3 p · rr5

r −p

r3

]

which falls off as 1/r3.

60

Spherical polar coordinates

Consider spherical polar coordinates (r, θ, χ), withthe z-axis chosen parallel to the dipole moment,i.e. p = p e

z, so that p · r = pr cos θ .

[We use χ instead of φ for the azimuthal angle inorder to avoid confusion with the potential φ.]

Then φ(r) =p

4πε0

1

r2cos θ

E(r) =p

4πε0

1

r3[3 cos θ e

r− e

z

~ez

~er

~eθ

θ

We can also obtain this result using the expression for ∇φ in polar co-ordinates

E = −∇φ

= −(∂φ

∂rer

+1

r

∂φ

∂θeθ

+1

r sin θ

∂φ

∂χeχ

)

= − p

4πε0

(− 2

r3cos θ e

r− sin θ

r3eθ

)

The second form can be obtained from the first bysubstituting e

z= e

rcos θ− e

θsin θ into the latter.

[Exercise: show this.]

The sketch shows the electric field (full lines) andthe potential (dashed lines) for the dipole.

This picture holds in the dipole limit, but it’s alsovalid when r d, the ‘far zone’.

φ = const

~E

7.6.2 Force, torque and energy

Force on a dipole

The force on a dipole at position r due to an externalelectric field E

extis

F (r) = −q Eext

(r) + q Eext

(r + d)

= −q Eext

(r) + q[E

ext(r) + (d · ∇)E

ext(r) + · · ·

]

In the point dipole limit

F (r) =(p · ∇

)E

ext(r)

O

~d

+q

−q

~r

~r + ~d

61

Torque on a dipole

The torque (or moment of the force, or couple) on a dipole, about the point r wherethe dipole is located, due to the external electric field is

G(r) = −q 0× Eext

(r) + q(0 + d

)× E

ext(r + d)

= q d×[E

ext(r) +

(d · ∇

)E

ext(r) + · · ·

]

Taking the dipole limit (i.e. ignoring terms of order O(qd2)), we find

G(r) = p× Eext

(r)

Energy of a dipole

The energy of a dipole in an external electric field Eext

is

W = −q φext(r) + q φext(r + d)

= −q φext(r) + q[φext(r) +

(d · ∇

)φext(r) + · · ·

]

In the dipole limit, using Eext = −∇φext, we find

W = −p · Eext

How is this expression for the energy of the dipole related to the force on the dipole,namely F =

(p · ∇

)E

ext?

Recall the following identity for vector fields a and b

∇(a · b

)=(a · ∇

)b+

(b · ∇

)a+ a×

(∇× b

)+ b×

(∇× a

)

If we set a = p = constant, and b = Eext

then

∇(p · E

ext

)=(p · ∇

)E

ext+ p×

(∇× E

ext

)

Now ∇ · p = 0, and ∇× Eext

= 0, so p× (∇× Eext

) = 0, and hence

F (r) =(p · ∇

)E

ext= ∇

(p · E

ext

)= −∇W

The force on the dipole is the gradient of the potential energy, as one would expect.

Examples

• For the case of a homogeneous (i.e. constant, independent of r) external field,E

ext(r) = E

0, we have

F = 0 and G(r) = p× E0

Since W = −p ·E0, a stable or equilibrium position (i.e. position of minimum

energy) occurs when p is parallel to E0

. Colloqually, dipoles “like to alignwith the field”.

62

• If the dipole at r has dipole moment p1, and the electric field E

ext(r) is due

to a second dipole of moment p2

at the origin, then

W = −p1· E

extwith E

ext=

1

4πε0

3(p2· r)

r5r −

p2

r3

Therefore

W =1

4πε0

[p1· p

2

r3−

3(r · p1)(r · p

2)

r5

]

The interaction energy is not only depen-dent on the distance between the dipoles,but also on their relative orientations. O

~r

~p1

~p2

7.7 The multipole expansion

Consider the case of a charge distribution, ρ(r), localised in a volume V . For con-venience we will take the origin inside V .

The potential at the point P is

φ(r) =1

4πε0

V

dV ′ρ(r′)

|r − r′|~r′

O

q′

P

For |r| much larger than the extent of V , i.e |r| |r′| for all |r′| such that ρ(r′) 6= 0,we can expand the denominator using the binomial theorem

(1 + x)n = 1 + nx+ (n(n− 1)/2) x2 +O(x3)

⇒∣∣r − r′

∣∣−1 =∣∣r∣∣2 − 2r · r′ +

∣∣r′∣∣2−1/2

=∣∣r∣∣−1

1− 2r · r′∣∣r∣∣2 +

∣∣r′∣∣2

∣∣r∣∣2

−1/2

=1

r

1 +

r · r′r2− 1

2

r′2

r2+

3

8

(−2r · r′r2

)2

+O

(r′

r

)3

This can also be obtained by Taylor expansion [exercise]. Then

φ(r) =1

4πε0

V

dV ′ ρ(r′)

[1

r+r′ · rr3

+3(r′ · r

)2 − r2 r′22r5

+ . . .

]

This gives the multipole expansion for the potential

φ(r) =1

4πε0

Q

r+

1

4πε0

p · rr3

+1

4πε0

Qij xi xj2r5

+ . . .

63

where

Q =

V

dV ′ ρ(r′) is the total charge within V

p =

V

dV ′ r′ ρ(r′) is the dipole moment about the origin

Qij =

V

dV ′(3x′i x

′j − r′2 δij

)ρ(r′) is the quadrupole tensor

The multipole expansion is valid in the far zone, i.e. when r r0, with r0 the sizeof the charge distribution.

• If Q 6= 0, the monopole term dominates

φ(r) =1

4πε0

Q

r

and in the far zone, r r0, the E field is that of a point charge at the origin.

• When the total charge Q = 0 the dipole term dominates

φ(r) =1

4πε0

p · rr3

If the charge density is given by two equal but oppositely-charged particles

close together, i.e. ρ(r′) = q[δ(r′ − d)− δ(r′)

], then

p =

∫dV ′ r′ q

[δ(r′ − d)− δ(r′)

]= q d

which is the dipole moment as defined previously, and hence justifies the name.

• If Q = 0 and p = 0, the quadrupole term dominates

φ(r) =1

4πε0

Qij xi xj2r5

The quadrupole tensor Qij is symmetric, Qij = Qji , and traceless, Qii = 0.

Why quadrupole?

A simple linear quadrupole is defined by plac-ing two dipoles (so four charges) ‘back to back’with equal and opposite dipole moments, asshown.

From the figure

d

Or

r’

d

q−2q

q

φ(r) =1

4πε0

(q

| r − r′ − d | +q

| r − r′ + d | +−2q

| r − r′ |

)

64

Expanding the denominators in the usual way, and defining ρ = r − r′, the leading(1/ρ) and dipole (1/ρ2) terms cancel, so for large ρ

φ(r) =1

4πε0q

3(ρ · d)2 − d2ρ2ρ5

=1

4πε0

Qij ρi ρjρ5

where Qij = q (3di dj − δij d2) is the (traceless, symmetric) quadrupole tensor – asabove.

The quadrupole moment is sometimes defined to be Q = 2qd2, where the ‘2’ isconventional.

7.7.1 Worked example

The region inside the sphere r < a, contains a charge density

ρ(x, y, z) = f z (a2−r2)where f is a constant. Show that at large distances from the origin the potentialdue to the charge distribution is given approximately by

φ(r) =2f a7

105ε0

z

r3

The multipole expansion gives

φ(r) =1

4πε0

(Q

r+P · rr3

+ O

(1

r3

))

In spherical polars (r, θ, χ),

x = r sin θ cosχ , y = r sin θ sinχ , z = r cos θ

The total charge Q is (we drop the primes in this calculation for brevity):

Q =

V

ρ(r) dV =

∫ 2π

0

∫ π

0

∫ a

0

(fr cos θ (a2−r2)

)r2 sin θ dr dθ dχ = 0.

This integral vanishes because

∫ π

0

cos θ sin θ dθ =

∫ π

0

1

2sin(2θ) dθ = 0.

The total dipole moment P about the origin is

P =

V

r ρ(r) dV =

V

r e r ρ(r) dV

=

∫ 2π

0

∫ π

0

∫ a

0

r (sin θ cosχ e 1+ sin θ sinχ e 2+ cos θ e 3)

(fr cos θ (a2−r2)

)r2 sin θ dr dθ dχ

The x and y components of the χ integral vanish. The z component factorises:

Pz = f

∫ 2π

0

∫ π

0

sin θ cos2 θ dθ

∫ a

0

r4 (a2−r2) dr = f 2π2

3

2a7

35

Putting it all together, we obtain

φ(r) =1

4πε0

8πa7f

105

e 3 · rr3

=2f a7

105ε0

z

r3

65

7.7.2 Interaction energy of a charge distribution

Let’s consider the interaction energy W of an arbitrary (but bounded) charge dis-tribution in an external electric field E

ext= −∇φext ,

W =

∫dV ρ(r)φext(r)

For a charge distribution localised around the origin, we Taylor-expand φext(r) aboutr = 0

φext(r) = φext(0) +(r · ∇

)φext(0) +

1

2!

(r · ∇

)2φext(0) + . . .

= φext(0)− r · Eext

(0)− 1

2xi xj ∂jEext i(0) + . . .

The last term may be re-written as−16 (3xi xj − r2 δij) ∂jEext i(0) because an external

field satisfies ∇ · Eext

= 0 in the region around the origin. Therefore

W = Qφext(0)− p · Eext

(0)− 16 Qij ∂j Eext i(0) + . . .

The physical picture is that the (total) charge interacts with the external potentialφext, the dipole moment with the external field E

ext, and the quadrupole moment

with the (spatial) derivative of the external field.

7.7.3 A brute-force calculation - the circular disc

The electric field E and potential φ can be evaluated exactly for a number of in-teresting symmetric charge distibutions. We give one example using cylindricalcoordinates before moving on to more powerful techniques.

A circular disc of radius a carries uniform surfacecharge density σ. Find the electric field and thepotential due to the disc on the axis of symmetry.

Electric field: Start with the general expression

E(r) =1

4πε0

S

dS ′ σ(r′)

(r − r′

)∣∣r − r′

∣∣3

Choose the z axis parallel to the axis of symmetrywith the origin at the centre of the disc, so that rlies on the z axis and r′ lies in the x− y plane.

In cylindrical coordinates (ρ, χ, z), we have

r = z e z and r′ = ρ e ρ

Therefore

r−r′ = z e z − ρ e ρ and∣∣r − r′

∣∣ =(ρ2 + z2

)1/2

r′

z

ρ

a

r − r′

r

O

χ

66

Using the expression for e ρ in terms of ex and e y, we have

E(z) =σ

4πε0

∫ a

0

∫ 2π

0

ρ dρ dχz e z − ρ

(cosχ ex + sinχ e y

)

(ρ2 + z2)3/2

4πε02π

∫ a

0

z ρ dρ

(ρ2 + z2)3/2e z + 0 =

σz

2ε0

[−1

(ρ2 + z2)1/2

]ρ=a

ρ=0

e z

=σz

2ε0

[1

|z| −1

(a2 + z2)1/2

]e z =

σ

2ε0

[z

|z| −z

(a2 + z2)1/2

]e z

The electric field on the z axis is parallel to the z axis because of symmetry. Thesum of all the contributions to Ex and Ey cancel, i.e. they integrate to zero. So wereally only needed to calculate Ez!

Consider two limits:

(i) z a Expand the second term in square brackets

1

(a2 + z2)1/2=

1

|z|(1 + a2/z2

)−1/2=

1

|z|

[1− 1

2

a2

z2+O

(a4

z4

)]

Keeping only the leading term, we have

E(z) = sgn(z)σa2

4ε0z2e z = sgn(z)

Q

4πε0

e zz2

where the signum (or sign) function sgn(z) ≡ z/|z| is +1 for z > 0, and −1for z < 0; and Q = σπa2 is the total charge on the disc. In the far zone, werecover the field for a point charge, as expected.

(ii) a z In this case, the leading behaviour is obtained by dropping the secondterm in square brackets:

E(z) = sgn(z)σ

2ε0e z

This is the electric field due to an infinite charged surface – see later.

Potential: Start with the general expression

φ(r) =1

4πε0

S

dS ′σ(r′)

|r − r′|

For the disc, we have

φ(z) =σ

4πε0

∫ a

0

∫ 2π

0

ρ dρ dχ

(ρ2 + z2)1/2=

σ

2ε0

[(ρ2 + z2

)1/2]ρ=aρ=0

2ε0

[(a2 + z2

)1/2 − |z|]

Note: it’s often much easier to find φ than E !

67

(i) z a Expanding as before, we find

φ(z) =σ

4ε0

a2

|z| =Q

4πε0 |z|as expected.

(ii) a z In this case we find a linear potential

φ(z) =σ

2ε0

[a − |z|

]= − σ

2ε0|z|+ constant

Exercise: Check that E(z) = −∂φ∂z

e z in each case.

7.8 Gauss’ law

We showed previously that the electric field, E(r), due to a charge distribution, ρ(r),satisfies ∇ · E = ρ/ε0. This is the differential form of Maxwell’s first equation.

Integrating ∇·E = ρ/ε0 over a volume V , bounded by a closed surface S, and usingthe divergence theorem ∫

V

∇ · E dV =

S

E · dS

gives Gauss’ law ∫

S

E · dS =1

ε0

V

ρ dV =Qenc

ε0

where Qenc is the total charge enclosed by the volume V . [We will often drop thesubscript ‘enc’.]

Gauss’ Law is extremely useful, particularly for problems with a symmetry2, forsolving problems in potential theory, and for determining the behaviour of fields atboundaries.

[Gauss’s law is also known as Gauss’ theorem.]

Examples

• Consider a sphere of radius a, carrying charge Q,centred on the origin, with uniform charge densityρ0 = Q/(43πa

3).

By symmetry, the electric field will point radiallyoutwards, so that E(r) = Er(r) er .

Integrating over a sphere of radius r, we find

S

E · dS = Er(r) 4πr2 =

43πr

3ρ0/ε0 r ≤ a

43πa

3ρ0/ε0 r ≥ a

ρ

a

2This goes against the general rule that it is easier to compute the potential (a scalar) firstrather than the electric field (a vector)!

68

Therefore

E(r) =

Q

4πε0

r

a3r ≤ a

Q

4πε0

r

r3r ≥ a

Outside the sphere, the electric field ap-pears to come from a point source, andinside it increases linearly with r.

| ~E|

r = a

r1/r2

r

We can obtain the electrostatic potential from

E(r) = Er(r) er = −∇φ(r) = −∂φ(r)

∂rer

Integrating with respect to r gives

φ(r) =

− Q

4πε0

(r2

2a3+ C1

)=

Q

4πε0

1

2a3(3a2 − r2

)r ≤ a

Q

4πε0

(1

r+ C2

)=

Q

4πε0

1

rr ≥ a

For r > a we chose the constant of integration C2 to be zero so that φ→ 0 asr →∞. The potential outside the sphere is again that of a point charge.

For r < a, we chose the constant of integration C1 = −3/(2a) so that φ iscontinuous across the boundary at r = a. [Exercise: check this calculation.]

Note: E has a cusp, so the derivative ∂Er/∂r is discontinuous at the boundary.

• Consider a long (infinite) straight wire with constant charge/unit length λ.

Using cylindrical coordinates with the z axis par-allel to the wire, we integrate over a cylinder oflength L and radius ρ with its axis along the wire.By symmetry we must have E = Eρ(ρ) e

ρ.

Using Gauss’ Law, we get

S

E · dS = Eρ(ρ) 2πρL + 0︸︷︷︸ends

=1

ε0λL

This gives

E(r) =λ

2πε0

1

ρeρ

L

The potential can be found by integrating Eρ = −∂φ/∂ρ, which gives

φ(r) = − λ

2πε0ln

ρ0

)

where ρ0 is a constant of integration. In this case, φ(r) = 0 when ρ = ρ0.

69

• Infinite flat sheet of charge with constant charge density σ per unit area.Integrate over a cylindrical ‘Gaussian pill box’ with axis perpendicular to thesheet. See tutorial, and below.

7.9 Boundaries

Useful results for the changes in the normal and tangential components of the electricfield across a boundary may be obtained using Gauss’ Law and Stokes’ theorem.

Consider a surface carrying surface charge density σ. The electric field on one sideof the boundary is E

1, and on the other E

2. The unit normal to the surface is n.

S

1

~E1

~E2

~n

7.9.1 Normal component

Apply Gauss’ law to the small cylindrical ‘Gaussian pillbox’ of area A, and infinites-imal height δl, so that (δl)2 A, as shown in the figure. Start with

S

E · dS =1

ε0

∫ρ dV

Recalling dS = n dS, then for small A

(E

2− E

1

)·nA = (E2⊥ − E1⊥)A =

σA

ε0

where E1⊥ and E2⊥ are the componentsof the electric field perpendicular to thesurface. The factors of A cancel, hence

n ·(E

2− E

1

)=

σ

ε0

S

1

~n

A δl

Hence the normal component of the electric field E, is discontinuous across theboundary when σ 6= 0.

In this case the discontinuity is proportional to the surface charge density.

70

7.9.2 Tangential component

Let us apply Stokes’ theorem to the small rectangle of length l, and infinitesimalwidth δl, with δl l, in the figure

S

∇× E · dS =

C

E · dr

which gives

0 =

C

E · dr =(E2‖ − E1‖

)l

where E1‖ and E2‖ are the compo-nents of the electric field parallel tothe boundary.

This can be written as

n× E1

= n× E2

S

1

~nl

δl

S

where we used the fact that the cross product of the electric field E with n picksout the tangential component E‖ of the electric field, since

E = E⊥n + E‖

Thus the tangential component of E is continuous across a charged boundary.

7.9.3 Conductors

Physically, a conductor is a material in which ‘free’ or ‘surplus’ electrons can move(or flow) freely when an electric field is applied.

In Electrostatics

• For a conductor in equilibrium, where all charges are at rest, all the chargeresides on the surface of the conductor, i.e. ρ = 0 inside a conductor.

This holds because if ρ 6= 0, then due to Maxwell’s first equation ∇ ·E = ρ/ε0(or Gauss’ law), so we must have E 6= 0, and hence the charge would moveand we wouldn’t have equilibrium – a contradiction. So E = 0 and henceφ = constant everywhere inside a conductor.

Physically, the charges repel and move to the surface.

• The electric field on the surface of a conductor is normal to the surface,i.e. E ‖ n, otherwise charge would move along the surface

Thus if dr is a displacement on the surface of a conductor, E · dr = −dφ = 0,so φ = constant on the surface of a conductor, ie. an equipotential.

71

Therefore, on the surface of a conductor,

Et = 0 , En =σ

ε0

The external electric field induces a charge onthe surface of the conductor, which in turn de-forms the external field so that it is perpendic-ular to the conductor surface. In the case of aconductor, the surface charge is calculated fromthe electric field (and not vice-versa as is theusual case).

φ = const

vacuum

conductor

~E = 0

~E

For insulators we have the opposite situation – the charges are fixed and we mustcalculate the electric field and the potential from the charge density.

7.10 Poisson’s equation

In section (7.5.3), we showed that ∇× E = 0, and hence there exists a potential φsuch that E = −∇φ. Substituting this into Maxwell’s first equation ∇ · E = ρ/ε0gives Poisson’s equation

∇2 φ = − ρε0

If we know ρ, this may be solved for φ (or vice versa), given appropriate boundaryconditions (bcs). In a charge-free region (i.e. ρ = 0) Poisson’s equation becomesLaplace’s equation

∇2 φ = 0

7.10.1 Uniqueness of solution

Partial differential equations have many linearly-independent solutions. How do weknow which is the correct one for a given physical system?

Theorem: For a volume V bounded by a surface (orset of surfaces) S, if we are given a set of boundaryconditions :

Either

φ(r) for r ∈ S Dirichlet bcs

Or

∂φ

∂n≡ n · ∇φ(r) for r ∈ S Neumann bcs

where n is the unit normal to S, then the solution ofPoisson’s equation in V is unique.

∂φ∂n = −σ

ǫ0

φ

conductor surface

φ = const

conductor surface

ρ 6= 0

V

dipole sheet

σ

72

Proof: Let φ1(r) and φ2(r) be two solutions of Poisson’s equation, and let ψ =φ1 − φ2, so that ∇2ψ = 0 for both sets of boundary conditions.

Applying the divergence theorem to the LHS of the vector calculus identity

∇ ·(ψ∇ψ

)= ψ∇2ψ +∇ψ · ∇ψ

gives ∫

S

ψ(n · ∇ψ

)dS =

V

(ψ∇2ψ +

∣∣∇ψ∣∣2)

dV

Since either ψ = 0 (Dirichlet) or ∂ψ/∂n = 0 (Neumann) on S, then

V

∣∣∇ψ∣∣2 dV = 0

This implies that ∇ψ = 0 everywhere in V , which integrates to ψ = constant.

Consider the Dirichlet and Neumann boundary condition cases separately:

Dirichlet: In this case ψ = 0 on S, so the constant is zero and the two solutionsare equal

φ1 = φ2

Neumann: In this caseφ1 = φ2 + constant

Since the potential is only defined up to a constant, the constant may be disregarded.

Thus both types of boundary conditions give a unique solution of Poisson’s equation.

Notes:

• We can specify either Dirichlet or Neumann boundary conditions at each pointon the boundary, but not both. To specify both is generally inconsistent, sincethe solution is then overdetermined.

• However, we can specify either Dirichlet or Neumann boundary conditions ondifferent parts of the surface.

• The uniqueness theorem means we can use any method we wish to obtain thesolution - if it satisfies the correct boundary conditions, and is a solution ofthe equation, then it is the correct solution.

7.10.2 Methods of solution

The theorem is useful: if you find a solution somehow, it is the solution. For example:

(i) Guesswork

(ii) Numerical methods

(iii) Direct integration of Poisson’s equation

73

(iv) Gauss’ law plus symmetry

(v) Method of images

(vi) Separation of variables

(vii) Green-function method

Method (iii) uses ‘direct’ integration to find the (unique) solution of Poisson’s equa-tion, i.e.

φ(r) =1

4πε0

V

ρ(r′)∣∣r − r′∣∣ dV

which is okay for simple situations with a bounded potential.

We have studied method (iv) already.

7.10.3 The method of images

In the method of images, we add fictitious charges outside the volume under con-sideration in such a way that the system including the fictitious charges satisfiesPoisson’s equation in the region of interest, plus the correct boundary conditions.

ρ + bcs ⇔ ρ + mirror charges in unphysical region to mimic bcs

Example: Point charge and a planar conducting surface

In the figure, the point charge q is at position (0, 0, z0) above an infinite conductingsurface in the x− y plane (on the left in the figure). The region of interest V is thehalf-space z > 0. The image charge −q is at position (0, 0,−z0) below the x − yplane.

≡q

(0, 0, z0)

V : halfspace z > 0

−q q

The potential due to the pair of charges is

φ(r) =1

4πε0

[q

(x2 + y2 + (z − z0)2)12

− q

(x2 + y2 + (z + z0)2)12

]

74

Since φ satisfies Poisson’s equation in the half-space (z > 0), and

φ(r)|r=(x,y,0) = 0

then the surface of the conductor (at z = 0) is an equipotential (i.e. φ satisfies theboundary conditions), then φ is the unique solution.

Electric field: We may calculate the electric field from the potential

E(r) = −∇φ =q

4πε0

[(x, y, z − z0)

(x2 + y2 + (z − z0)2)32

− (x, y, z + z0)

(x2 + y2 + (z + z0)2)32

]

On the surface of the conductor (z = 0) this becomes

E(r)∣∣r=(x,y,0)

= − q

2πε0

z0

(x2 + y2 + z20)32

ez

i.e. the field at the surface of the conductor is perpen-dicular to the surface, as it must be.

The surface charge density on the conductor is

σ(r) = ε0E(r)∣∣r=(x,y,0)

= − q

z0

(x2 + y2 + z20)32

The total charge on the conducting surface is, usingpolar coordinates (ρ, χ),

Q =

z=0

σ dS

= − q

2πz0

∫ 2π

0

∫ ∞

0

ρ dρ1

(ρ2 + z20)32

= − q

which is just the mirror charge, as one would expect.

E

q

Force The force on the positive charge due to the conductor is that between twopoint charges separated by distance 2z0

F = − q2

4πε0

1

(2z0)2 e z

75

Example: point charge and a metal sphere

Consider an earthed sphere of radius a centredon the origin, together with a point charge q atpoint P which lies a distance b from the centreof the sphere. The volume V is the space r ≥ aand we choose the mirror charge q′ = −q a/b,at P ′ where OP ′ = a2/b, in order to satisfy theboundary conditions at r = a.

[The potential on an earthed conductor is de-fined to be φ = 0 everywhere on and inside theconductor – the potential of the ‘earth’.]

q′ = −qa/bP

r

r2

r1

q

b

a2/b

θ

O P ′

The potential of the two-charge system is

φ(r) =1

4πε0

[q

r1+

q′

r2

]

=q

4πε0

[1

(r2 + b2 − 2br cos θ)12

− a/b

(r2 + (a4/b2)− 2(a2/b) r cos θ)12

]

On the boundary (r = a)

φ(r)|r=a =q

4πε0

[1

(a2 + b2 − 2ab cos θ)12

− a/b

(a2 + (a4/b2)− 2(a3/b) cos θ)12

]= 0

and hence by the uniqueness theorem, this is the correct solution.

Example: Earthed conducting sphere in a uniform electric field

Consider an earthed conducting sphere of radius a centred on the origin, and subjectto a uniform external electric field E.

ez

r

P

φ = 0

For a constant electric field E = (0, 0, E) we have φ = −Ez = −Er cos θ. Weanticipate that the field will induce a dipole moment parallel to e z on the sphere,so we use a linear superposition as an ansatz (guess) for the potential:

φ(r) = −Ez +(B e z) · r

r3= −Er cos θ +

B

r2cos θ

76

This is a solution of Laplace’s equation outside the sphere, where ρ = 0. As r →∞,we have φ→ −Er cos θ. The boundary condition on the surface of the sphere is

φ|r=a = −Ea cos θ +B

a2cos θ = 0 which gives B = a3E

Therefore the (unique) solution for this problem is

φ(r, θ) = −Er cos θ +a3E

r2cos θ

Alternatively, we can use the method of images, as illustrated below. The solution tothe constant-external-field problem is obtained in the limit that the charges outsidethe sphere move to infinity, and the ones inside move inwards to form a dipole atthe centre.

−e′

φ = 0

ez

−ee′

R R

+e

7.11 Electrostatic energy

7.11.1 Electrostatic energy of a general charge distribution

Recall that the work done in moving a point charge in a field E from ra

to rb

is

We = −q∫ rb

ra

E · dr = q

∫ rb

ra

∇φ · dr =

∫ rb

ra

dφ = q

(φ(r

b)− φ(r

a)

)

Thus the work done to bring in a point charge q from infinity (where φ = 0) toposition r is W = q φ(r).

Let’s assemble a system of n charges qi from ∞ to positions ri:

W1 = 0 [nothing else is there]

W2 = q21

4πε0

q1|r

1− r

2| [the work done in the field of q1]

W3 = q31

4πε0

[q1

|r1− r

3| +

q2|r

2− r

3|

][the work done in the field of q1 and q2]

and so on. Therefore the work done to bring in the ith charge qi to position ri

is

Wi =qi

4πε0

i−1∑

j=1

qj|rj− r

i|

77

The total work done is

We =n∑

i

Wi =n∑

i

i−1∑

j=1

1

4πε0

qi qj|rj− r

i| =

1

2

n∑

i,j, (i 6=j)

1

4πε0

qi qj|rj− r

i| ≡

1

2

i

qi φi

which we may write as

We =1

2

i

qi φi with φi = φ(ri) =

j, (j 6=i)

qj4πε0

1

|rj− r

i|

where φi is the potential felt by qi due to all the other charges.3

In the limit of a continuous charge distribution ρ(r), we have

We =1

2

V

dV ρ(r)φ(r)

where the integral is over all space.

7.11.2 Field energy

We can write We in terms of the (total) electric field E using Maxwell’s first equationin the form ρ = ε0∇ · E . Then We becomes

We =ε02

∫dV φ

(∇ · E

)

Now use the product rule to rewrite

φ(∇ · E

)= ∇ ·

(φE)−(∇φ)· E = ∇ ·

(φE)

+∣∣E∣∣2

so that

We =ε02

∫dV

(∇ ·(φE)

+∣∣E∣∣2)

Applying the divergence theorem to the integral of the first term, taking V to be allspace, and S to be a large sphere of radius R→∞, this integral becomes

V

∇ ·(φE)

dV =

S

(φE)· dS ≈ O

(1

R

1

R24πR2

)→ 0 as R→∞

The total energy stored in the electric field is then

We =ε02

∫dV

∣∣E(r)∣∣2

3The factor of 1/2 ensures that we don’t count the energy required to bring in charge j from∞ in the presence of charge i at r

iand the energy required to bring in charge i from ∞ in the

presence of charge j at rj. Note there is no factor of 1/2 in our previous expression for the energy

of a charge in an external electrostatic field.

78

and the energy density (energy/unit volume) at r is

we(r) =ε02

∣∣E(r)∣∣2

Notes: The general result for We was derived for a continuous charge distribution.When there are point charges we have to be careful with self-energy contributionswhich should be excluded from the integral because they lead to divergences.

The two boxed expressions for We are complementary. We can think of the electro-static energy as lying in the charge distribution or as being stored in the E field.

Finally, note that since the energy density we is quadratic in the electric field, wedon’t have superposition of energy density.

7.12 Capacitors (condensers) and capacitance

A capacitor is formed from a pair of conductors 1 and 2 carrying equal and oppositecharges, Q and −Q. The potentials on the conductors are φ1 and φ2, so the potentialdifference is V = φ1 − φ2.

Clearly, φ1 and φ2 are proportional to Q (up to a common constant), so V ∝ Q,and we define capacitance

C =Q

V

which depends only the geometry of the capacitor.

7.12.1 Parallel-plate capacitor

The simplest example is the parallel plate capacitor

d

A

E = 0

a

E = 0

E

φ1

φ2

+Q

−Q

Two parallel plates of area A have a separation a (with a√A), and carry surface

charge densities +σ and −σ on their inner surfaces (because of the attractive forcebetween the charges on the two plates).

We can obtain the electric field using Gauss’ Law. First take a pillbox that straddlesthe inner surface of the upper plate (say). E = 0 inside the plate and between thetwo plates the field is normal to the inner surface of the upper plate, which gives

Einside

=

(+σ

ε0

)(−e z) = − σ

ε0e z

79

Now take a pillbox that straddles the outer surface of the upper plate. Since thereis no charge on the upper surface and E = 0 inside the plate, we have E = 0 at theouter surface. Similarly for the lower plate. Therefore

Eoutside

= 0 Einside

= − σε0e z

We can obtain the potential between the plates in the usual way

Ez = − σε0

= −∂φ∂z

⇒ φ(z) =σz

ε0+ constant

so the potential difference between the plates is

V =σ a

ε0=

Qa

A ε0and the capacitance is

C =Q

V=

Aε0a

which is a purely geometrical property of the plates, as expected.

7.12.2 Concentric conducting spheres

Consider a capacitor consisting of two concentric spheres,radii a and b with b > a, centred on the origin. The outersphere carries charge +Q and the inner one −Q.

By Gauss’ law, the electric field is zero inside the innersphere, and outside the outer one [exercise].

By Gauss’ law, between the spheres, we have

φ(r) = − Q

4πε0

1

r+ constant for a < r < b

−Q

E

+Q

E

a

b

The potential difference is

V = φb − φa = − Q

4πε0

(1

b− 1

a

)=

Q

4πε0

b− aab

so the capacitance is

C = 4πε0ab

b− a

7.12.3 Energy stored in a capacitor

Using our first expression for the energy of a charge distribution, we have

W =Q

2(φ1 − φ2) =

QV

2=

CV 2

2=

Q2

2Cwhich holds for a capacitor of any shape. For parallel plates

We =ε02

∫|E|2 dV =

ε02

(Aa)

ε0

)2

=a

2ε0

Q2

A=

Q2

2C

as before.

80

Chapter 8

Magnetostatics

In the previous chapter, we studied static charge distributions, which lead to anelectric field. In this chapter we study the case of steady currents (also known astime-independent currents), which lead to a magnetic field.

8.1 Currents

A current is created by moving charges: a charge q moving at velocity v gives an‘elementary’ current j = q v , which is a vector quantity.

Consider a charge density ρ(r) moving with velocity v(r). This defines a (bulk)current density,

J(r) = ρ(r) v(r)

We can also have a surface current density,

K(r) = σ(r) v(r)

where σ(r) is the surface charge density (charge/area), and a line current

I(r) = λ(r) v(r)

where λ(r) is the line charge density (charge/length).

We define a current element dI for each of these three cases by

dI(r) =

J(r) dV current element in the bulk

K(r) dS current element in a surface

I(r) dr current element along a line (or wire)

Note: Care is needed with current elements. For example, K dS 6= K dS since theleft hand side points in the direction of the current vector on the surface, but theright hand side points normal to the surface. On the other hand I dr = I dr sincea line current element always points along the wire in the direction dr.

Units: From the definition of line current, the dimension of current I is

[I] =(C m−1

) (m s−1

)=(C s−1

)≡ A (Amperes)

81

The unit of current I is called the Ampere (A), which is a base SI unit, 1A = 1C s−1

(Coulombs/second). Similarly

[K] = A m−1 [J ] = A m−2 [dI] = A m

Note that current I and current element dI have different units,1 and that none ofthese current ‘densities’ has units of current/volume! This is a little confusing, butit’s standard . . .

The total (scalar) current I passing through a surface S is the flux of J through S

I =

S

J · dS

where dS is normal to the surface. Similarly, the flux of K across the curve C is

I =

C

K · n′ dr

where the unit vector n′ is normal to C, in the plane of K.

8.1.1 Charge and current conservation

Experiment: total (net) charge is conserved; charge can’t be created or destroyed.

Consider a volume V bounded by a closed surface S. The charge Q in the volumeV changes due to current flowing over the surface S:

∂Q(t)

∂t=

∂t

V

ρ(r, t) dV =

V

∂ρ(r, t)

∂tdV = −

S

J(r, t)·dS = −∫

V

∇·J(r, t) dV

where∫SJ(r, t)·dS is the total current flow across S (the minus sign is due to current

flowing out of the surface reducing the charge in V ) and we used the divergencetheorem in the last step. This equation holds for all volumes V , so

∂ρ(r, t)

∂t+ ∇ · J(r, t) = 0

which describes charge conservation locally at the point r .

In static situations, ∂ρ/∂t = 0 (by definition), so ρ(r, t) = ρ(r) is independent of t,and ∇ · J = 0, which is called current conservation.

Example: Consider a disc, carrying uniform chargedensity σ, rotating about its axis at angular velocity ω.

The current density on the disc is

K = σ v = σ ω × r

Hence∇ ·K = σ∇ ·

(ω × r

)= 0

z

R

1Some authors use different fonts I and I for these quantities, but this is hard to do on theblackboard!

82

8.1.2 Conduction current

A common situation is where we have have no net electric charge, but a currentexists because positive and negative charges move with different velocities

J = ρ+ v+ + ρ− v− with ρ = ρ+ + ρ− = 0 , but |v+| 6= |v−|

Examples

Metal: nuclei are fixed, but electrons move: v+ = 0, so J = ρ− v−

Electrolyte: positive and negative ions move with different velocities:(v+ − v−

)

Current is often created by an electric field, which forces charge carriers to move. Asimple linear model, which agrees with experiment in common situations is Ohm’slaw

J = σ E

where σ is called the conductivity2.

In some materials J is not in general parallel to E, and the linear model must begeneralised to

Ji = σij Ej

where σij are the components of a second-rank tensor, the conductivity tensor.

Notes

(i) When ρ = 0, we have ∇·E = 0 (using Maxwell’s first equation) and ∇·J = 0(current conservation).

(ii) We cannot have a static closed current loop. Using Stokes’ theorem∮

C

E · dr =

S

(∇× E

)· dS = 0

because ∇× E = 0 for static electric fields. Ohm’s law then implies that∮

C

J · dr = 0

so I = 0 for a closed loop.

So we must have a battery to keep current flowing.[But see next semester for non-static situations.]

In static situations, we can write E = −∇φ, so

∫ r2

r1

E · dr = φ(r1)− φ(r

2) = V12

i.e. the potential supplied by the battery. This iscalled the electromotive force or emf E .

This is not a sensible name because emf is not a force,but we’re stuck with it.

E

I

2Not to be confused with the surface charge density!

83

(iii) In a perfect conductor (for example a superconductor), σ → ∞, so to keep Jfinite, we must have E = 0.

(iv) For an insulator, σ = 0.

8.2 Forces between currents (Ampere, 1821)

Consider two parallel wires, of length L, a distance d apart, carrying currents I1and I2.

r1 − r2

Parallel currents attract Anti-parallel currents repel

I1I1 I2 I2

From experiment, we find:

• For perpendicular currents, there is no force.

• Otherwise, the force between current elements dI1

and dI2

is a central forcewith an inverse square law. The force on infinitesimal current element 1 dueto infinitesimal current element 2 in SI units is

dF12

= − µ0

4πdI

1· dI

2

(r1− r

2

)∣∣r

1− r

2

∣∣3 = − µ0

4πdI

1· dI

2

r12

r212

where r12

= r1− r

2and r12 =

∣∣r12

∣∣. The constant µ0 = 4π × 10−7 N A−2 (or

N C−2 S2) is called the permeability of the vacuum (or of free space).

For two current loops C1 and C2 carrying currents I1 and I2

r2I1 I2

dr2

C1

E Er1 − r2

C2

dr1

O

r1

84

Linear superposition (which comes from experiment) gives

F12

= −µ0

C1

C2

(I1 dr

1

)·(I2 dr

2

) r12

r312

where we used the expression for current elements dI1· dI

2= I1 dr

1· I2 dr

2

This is the analog for currents of Coulomb’s law. Note that F21

= −F12

as expected.

We now separate this into two parts. The loop carrying current I2 produces amagnetic field B, which in turn produces a force on the loop carrying I1. Start byrewriting

dr1×(dr

2× r

12

)= dr

2

(dr

1· r

12

)−(dr

1· dr

2

)r12

Using Stokes’ theorem,∮

C1

dr1· r

12

r312= −

C1

dr1· ∇

1

(1

r12

)= −

S1

dS1·(∇

1×∇

1

(1

r12

))= 0

where ∇1

is the gradient with respect to r1

, and curl-grad is always zero. Therefore

F12

=µ0

4πI1 I2

C1

C2

dr1×(dr

2× r

12

)

r312−∮

C2

dr2

C1

dr1· r

12

r312

=

C1

I1 dr1×(µ0

C2

(I2 dr

2× r

12

)

r312

)

since the second term in the first line is zero (as we’ve just shown). Therefore wecan write

F12

=

C1

I1 dr1×B(r

1)

which is the force on C1 due to B(r1), where

B(r1) =

µ0

C2

I2 dr2× r

12

r312=

µ0

C2

I2 dr2×(r1− r

2

)∣∣r

1− r

2

∣∣3

is the magnetic field at r1

due to the current in C2. This is the Bio-Savart law,(approx 1820).

8.2.1 The Lorentz force

Generalising, the force on a current element dI due to a magnetic field B is (locally)

dF = dI ×B

so

F =

C

I dr ×B for a line current

S

K ×B dS for a surface current

V

J ×B dV for a bulk current

85

The force on a charge distribution ρ due to an electric field E is∫VρE dV , so the

force density (force/unit volume) at r is

f(r) = ρ(r)E(r)

Similarly, for a bulk current density in a magnetic field, we often write f = J × BTherefore, the combined force density is

f = ρE + J ×B

which is called the Lorentz force (strictly, the force density)

For a moving point charge q at r′, we have ρ(r) = q δ(r−r′) and J(r) = q v δ(r−r′),therefore

F = q(E + v ×B

)

We may regard these as the definitions of the fields E and B.

Note that magnetic fields do no work. If a charge q in a magnetic field B moves adistance dr = v dt, the work done by the field is

dW = q(v ×B

)· dr = q

(v ×B

)· v dt = q

(v × v

)·B dt = 0

So magnetic fields change the direction of charged particles, but do not acceleratethem to higher (or lower) speeds.

Units: [B] = NC−1m−1s = NA−1m−1 = T (Tesla).

8.3 Biot-Savart law

The magnetic field dB(r) at r due to a current element Idr′

at r′ is

dB(r) =µ0

Idr′ ×(r − r′

)∣∣r − r′

∣∣3

dB(r) is orthogonal to dr′ and(r − r′

), and tangential to a

circle centred on an axis through dr′. The field lines of dB‘wrap’ around this axis. This is the ‘right-hand grip rule’.

dB ( r )

r’

r

r − r’

C

I dr’

P

OThe magnetic field at r due to a current loop carrying current I is

B(r) =µ0

C

Idr′ ×(r − r′

)∣∣r − r′

∣∣3

For a bulk current density J , with dI = J dV , we have

B(r) =µ0

V

dV ′J(r′)×

(r − r′

)∣∣r − r′

∣∣3

Similarly for a surface current density

We can use the Bio-Savart law to compute the magnetic field directly for somesimple current distributions.

86

8.3.1 Long straight wire

Consider a long (infinite) straight wire alongthe z axis. Take the position vector r of thepoint P to be perpendicular to the z axis.From the diagram, using cylindrical coordi-nates, we have r = ρ e

ρand r′ = z′ e

z. Hence

r − r′ = ρ e ρ − z′e z dr′ = dz′ ez

Therefore

dr′ ×(r − r′

)= ρ dz′

(ez× e

ρ

)= ρ dz′ e

φ

θ

z

dr′

r − r′

φ

Or

r′ = z′ez

P

ez

The magnetic field due to the wire is then

B(r) =µ0

∫ ∞

−∞

I dr′ ×(r − r′

)∣∣r − r′

∣∣3 =µ0

4πIρ

∫ ∞

−∞

dz′(ρ2 + z′2

)3/2 eφ

To evaluate this integral, we use the substitution z′ = ρ tan θ, so dz′ = ρ sec2 θ dθand ρ2 + z′2 = ρ2 (1 + tan2 θ) = ρ2 sec2 θ, which gives

B(r) =µ0

I

ρ

∫ π/2

−π/2cos θ dθ e

φ=

µ0 I

2πρeφ

So B is inversely proportional to the distance of the point P from the wire, and itpoints in the direction of increasing φ, i.e. its field lines ‘wrap around’ the wire in acircle. This is the right-hand grip rule again.

8.3.2 Two long parallel wires

Now consider the force on a current element of a second parallelwire (at distance d from the first) coming from the magneticfield B

1(r) due to the first wire. Again we choose coordinates

so that this current element lies at r = ρ eρ

in cylindrical coor-dinates. The force dF on the current element I2 dr

2at r

2due

to the magnetic field B1

produced by wire 1 is:

II1 2

d

dF = I2 dr2×B

1(r

2) = I2 dz e

z× µ0

2πdI1 eφ = −µ0 I1 I2

2πddz e

ρ

Thus there is an attractive force per unit length between two parallel infinitely longstraight wires:

f =µ0 I1 I2

2πd

This is the basis of the definition of the Ampere, which is defined so that if the wiresare 1m apart and the current in each wire is 1A, the force per unit length betweenthe two wires is 2× 10−7Nm−1.

87

8.3.3 Current loop

Consider a current loop, radius a, carrying current I.Find the magnetic field on the axis through the centreof the loop. From the figure:

r = z e z , r′ = a e ρ ,∣∣r − r′

∣∣ = (a2 + z2)1/2

Hence

dr′ ×(r − r′

)= a eφ dφ×

(z e z − a e ρ

)

=(az e ρ − a2 (−e z)

)dφ

So

B(r) =µ0 I

∫ 2π

0

(az e ρ + a2 e z

)dφ

(a2 + z2)3/2

=µ0 I

2

a2

(a2 + z2)3/2e z

dr′

a

r

O

φ

zdr′ × (r − r′)

r′

r − r′

where we used∫ 2π

0e ρ dφ =

∫ 2π

0

(cosφ ex + sinφ e y

)dφ = 0. The components of the

magnetic field perpendicular to ez

cancel due to symmetry as we integrate aroundthe loop

8.4 Divergence and curl of the magnetic field: Gauss

and Ampere laws

The magnetic field produced by a bulk current density J is

B(r) =µ0

V

dV ′J(r′)×

(r − r′

)∣∣r − r′

∣∣3 = −µ0

V

dV ′ J(r′)×∇(

1∣∣r − r′∣∣

)

where ∇ acts only on r, (not r′). Now

∇ ·J ×∇

(1∣∣r − r′∣∣

)= −J ·

∇ ×∇

(1∣∣r − r′∣∣

)= 0

because ‘curl grad’ is always zero. Note that J = J(r′), so it’s treated as a constantvector when calculating the gradient with respect to r. Therefore

∇ ·B(r) =µ0

V

dV ′ J(r′) ·∇ ×∇

(1∣∣r − r′∣∣

)= 0

and we obtain Maxwell’s second equation

∇ ·B = 0

88

Since the second equation always holds, it tells us that there are no magnetic chargesor ‘monopoles’.

Compare this with Maxwell’s first equation, ∇ · E = ρ/ε0, which states that elec-tric charge density ρ is the source of electric field [strictly, electric flux – see nextsemester].

Similarly

∇×J(r′)×∇

(1∣∣r − r′∣∣

)= ∇2

(1∣∣r − r′∣∣

)J −

(J · ∇

)∇(

1∣∣r − r′∣∣

)

= −4πδ(r − r′

)J −

(J · ∇

)∇(

1∣∣r − r′∣∣

)

so

∇×B(r) = −µ0

V

dV ′−4πδ

(r − r′

)J(r′) −

(J(r′) · ∇

)∇(

1∣∣r − r′∣∣

)

The integral over the volume in the first term can be performed (trivially) with thedelta function, whilst the second term vanishes (see below), and we obtain.

∇×B = µ0 J

This is the differential form of Ampere’s law for steady currents, also known as thestatic form of Maxwell’s fourth equation.3

To show that the second term vanishes, we take the second ∇ out of the integral togive (up to constants)

∇∫

V

dV ′ J(r′) · ∇(

1∣∣r − r′∣∣

)

The integral may be written in terms of the gradient ∇ ′, wrt r′

V

dV ′ J(r′) · ∇(

1∣∣r − r′∣∣

)= −

V

dV ′ J(r′) · ∇ ′(

1∣∣r − r′∣∣

)

= −∫

V

dV ′∇ ′ ·

(J(r′)

1∣∣r − r′∣∣

)−(∇ ′ · J(r′)

) 1∣∣r − r′∣∣

= −∫

S

J(r′)∣∣r − r′∣∣ · dS

′ + 0

where we used the divergence theorem on the first term, which is zero providedthat the current density J(r′) → 0 at infinity, and the second term is zero because∇ · J = 0 in magnetostatics, where we have steady currents.

In fact, ∇×B = µ0 J implies ∇ · J = 0, since ∇ ·(∇×B

)≡ 0.

3In electrodynamics, there is an additional term on the RHS, due to time-dependent chargedistributions and therefore non-steady currents. See next semester.

89

8.4.1 Ampere’s law (1826)

The fundamental laws of magnetostatics are

∇ ·B = 0 and ∇×B = µ0 J

Applying the divergence theorem to a closed surface S, which bounds the volumeV , in the first equation gives

S

B · dS =

V

∇ ·B dV = 0

i.e. the magnetic flux (flux of B) through any closed surface is zero – Gauss’ law ofmagnetostatics.

Applying Stokes’ theorem for a closed curve C bounding an open surface S to thesecond equation gives

C

B · dr =

S

∇×B · dS = µ0

S

J · dS = µ0 I

In words: the circulation of the magnetic field Baround a closed loop C is equal to µ0 × the totalcurrent I flowing through the loop.

This is the integral form of Ampere’s law, which isextremely useful in finding B in symmetric situa-tions, just like Gauss’ law in electrostatics.

CS

B

J

Example: long straight wire Consider a long straight wire lying along the zaxis and use cylindrical coordinates. By symmetry, B(r) will be independent of z,its magnitude will be independent of φ, and it must point in the eφ direction. Hence

B(r) = Bφ(ρ) eφ

Consider a circle of radius ρ with its centre on the z axis,and apply Ampere’s law

C

B · dr = µ0 I ⇒ Bφ(ρ) 2πρ = µ0 I

which reproduces our previous result with very little effort,

B(r) =µ0 I

2πρeφ

I

φ

90

Example: coil/solenoid Consider a long coil4 of radius a, with N tightly-woundturns/unit length, centred on the z axis. Again, we use cylindrical coordinates.

By symmetry B(r) must be in the z direction, and independentof φ and z, so

B(r) = Bz(ρ) ez

Away from the wire, we have ∇×B = 0, so

∂Bz

∂ρ= 0 ⇒ Bz = constant

Since the constant is zero as ρ → ∞, then B must be zeroeverywhere outside the coil, which is a remarkable result!

z

L

φ

I

ρ

So we have

Bz =

B : ρ < a

0 : ρ > a

Now apply Ampere’s law to the rectangular ‘loop’ of length L shown in the figure,which has one long side inside the coil, and one outside. Therefore

B .L− 0 . L = µ0NLI

The magnetic field is constant inside the coil, with

B = µ0NI ez

8.4.2 Conducting surfaces

Consider a conducting surface carrying current density K. What are the boundaryconditions on the magnetic field at the surface?

Normal component: Use a small Gaussian pill-box of negligible height with a circular surface ofarea A, which straddles the surface as shown.

0 =

V

∇ ·B dV =

S

B · dS =(B

2−B

1

)· nA

So

B2· n = B

1· n

B1

n

B2

The normal component of the magnetic field is continuous across the boundary.

4A ‘long’ coil means its length is much greater than its radius, so we can treat it as beingeffectively infinite.

91

Tangential component: Use a small rectangu-lar Ampere loop of length L and negligible widthstraddling the surface, which has normal n, asshown in the figure.

Let the surface spanning the loop have normal n′,so the long side of the loop has direction n′ × n.

K

n

n′

LB1

B2

C

B · dr =(B

2−B

1

)·(n′ × n

)L =

S

µ0 J · dS = µ0

C

K · n′ dr = µ0K · n′ L

where the line C is along that part of the surface contained within the loop.

So [n×

(B

2−B

1

) ]· n′ = µ0K · n′

This holds for all n′ tangential to the surface, therefore

n×(B

2−B

1

)= µ0K

There is a discontinuity in the tangential components of the magnetic field due tothe surface current density.

8.5 The vector potential

A vector field B(r) that satisfies ∇ · B = 0 can be written as the curl of a vectorpotential5

∇ ·B = 0 ⇔ ∃ A such that B = ∇× A

The vector potential is not unique, it’s defined up to the gradient of an arbitraryscalar field χ(r). If we make the gauge transformation

A → A′ = A+∇χthen

∇× A → ∇× A′ = ∇× A + ∇×∇χ = ∇× A+ 0

The magnetic field B (and hence the physics) is unchanged by the transformation.

To fix the gauge uniquely we may choose to add an additional constraint on A, sothat

∇ · A = 0

which is called Coulomb gauge. If ∇ · A = ψ 6= 0, we can always find a gaugetransformation χ such that ∇ · A′ = 0 , i.e.

∇ · A′ = ∇ · A + ∇2χ = 0 ⇒ ∇2χ = −ψThis is just Poisson’s equation, which can be solved for χ in the standard way.

5Compare this with the case of electrostatics, where ∇ × E = 0 ⇒ the electric field canbe expressed as the gradient of a scalar potential φ , and vice versa, i.e. ∇ × E = 0 ⇔∃ φ such that E = −∇φ

92

Now if B = ∇×A, with ∇·A = 0, then from the differential form of Ampere’s law,

µ0 J = ∇×B = ∇×(∇× A

)= ∇

(∇ · A

)−∇2A = −∇2A

So ∇2A = −µ0 J

This has the form of Poisson’s equation for A (rather 3 equations for Ai), with so-lution6

A(r) =µ0

V

dV ′J(r′)∣∣r − r′

∣∣which satisfies the boundary condition A(r)→ 0 as r →∞If we take the curl of this equation, we recover the Biot-Savart law [exercise].

We can write similar expressions for line and surface currents [exercise].

Example: For a magnetic field B, which is constant everywhere,

A = 12

(B × r

)

because

∇× 12

(B × r

)= 1

2B(∇ · r

)− 1

2

(B · ∇

)r = 3

2B − 12B = B

If we know B, we can find A using Stokes’ theorem in problems with a lot ofsymmetry. ∮

C

A · dr =

S

(∇× A

)· dS =

S

B · dS ≡ Φ

Φ is the magnetic flux crossing the surface S bounded by the closed curve C.

Example: For a solenoid centred on the z axis, weshowed previously that

B =

B e

z: ρ < a

0 : ρ > a

By symmetry, A(r) = Aφ(ρ) eφ.

Choosing S to be a horizontal disc of radius ρ, as shown

ρ > a : 2πρAφ = Bπa2 ⇒ Aφ(ρ) = 12B

a2

ρ

ρ < a : 2πρAφ = Bπρ2 ⇒ Aφ(ρ) = 12Bρ

[The case ρ < a is as in the first example above.]

B

I

z

ρ

φ

Note that A 6= 0 everywhere outside the solenoid, even though B = 0 everywhereoutside it! This is important in quantum mechanics – see the Quantum Theorycourse.

6Compare this with ∇2φ = −ρ/ε0 in electrostatics, with solution φ(r) =1

4πε0

V

dV ′ ρ(r′)∣∣r − r′∣∣

93

8.6 Magnetic dipoles

Consider a circular wire, radius a, in the x−y plane, car-rying current I. What is the form of the vector potentialat points with r a?

A(r) =µ0

C

I dr′∣∣r − r′∣∣

Wlog take r in the x− z plane, so r = x ex

+ z ez

and

r′ = a(

cosφ ex

+ sinφ ey

)

dr′ = a(− sinφ e

x+ cosφ e

y

)dφ I

z

r′

x

φ

r

O

For r a1∣∣r − r′∣∣ =

1

r

(1 +

r · r′r2

+ . . .

)with r · r′ = ax cosφ

Then

A(r) =µ0 I

∫ 2π

0

dφ a(− sinφ e

x+ cosφ e

y

) 1

r

(1 +

ax cosφ

r2+ . . .

)

≈ µ0 I

π a2 x

r3ey

=µ0

m× rr3

where we used∫ 2π

0cos2 θ = π, with all the other integrals being zero, and defined

m = π a2 I ez

= I S

where S is the vector area of the loop. Now

∇×(m× rr3

)=

1

r3∇×

(m× r

)+ ∇

(1

r3

)×(m× r

)=

3(m · r

)r − r2mr5

So that

B(r) = ∇× A(r) =µ0

3(m · r

)r − r2mr5

which is the field of a magnetic dipole with dipole moment m.

As we shall show, we get the same result far away from any current loop, and thefield lines for electric dipoles, magnetic dipoles, and bar magnets are the same inthe far zone (far-field limit).

I

+

electric dipole magnetic dipole bar magnet

N

S

E B B

94

Multipole expansion Expand the magnetic vector potential A as above,

A(r) =µ0 I

C

dr′1

r

(1 +

r · r′r2

+ . . .

)

Applying Stokes’ theorem to an arbitrary constant vector c gives

c ·∮

C

dr′ =

S

(∇ ′ × c

)· dS ′ = 0

Hence∮C

dr′ = 0 for any closed curve C, so the first term in the expansion of Aalways vanishes. Hence there are no magnetic monopoles, as expected.

To evaluate the second term, first apply Stokes’ theorem to c(r · r′

).

c ·∮

C

dr′(r · r′

)=

S

(∇ ′ × c

(r · r′

))· dS ′ =

S

(∇ ′(r · r′

)× c)· dS ′

=

S

(r × c

)· dS ′ = c ·

S

dS ′ × r = c ·(S × r

)

where S is the vector area of the current loop. Since this holds for all constantvectors c , we find ∮

C

dr′(r · r′

)= S × r

Therefore the first non-zero term in the multipole expansion of A(r) is

A(r) =µ0 I

S × rr3

=µ0

m× rr3

as in our explicit example of the circular wire above. If we go further in the expan-sion, we get a magnetic quadrupole term, etc.

8.6.1 Magnetic moment and angular momentum

For a constant vector c, consider

c ·∮

C

r × dr =

C

(c× r

)· dr =

S

∇×(c× r

)· dS

=

S

c(∇ · r

)−(c · ∇

)r

dS = 2 c ·∫

S

dS

Since this holds for all constant vectors c, we have

S

dS = 12

C

r × dr

We can then rewrite the dipole moment as

m = I

S

dS = 12I

C

r × dr = 12

C

r × dI

95

where dI is the current element.

For a bulk current dI = J dV , this becomes m = 12

V

(r × J

)dV , and since

J = ρ v, we have

m = 12

V

(r × v

)ρ dV

Now suppose the charge density ρ is made up of particles each with charge q andmass m, then ρ = (q/m) ρm where ρ is the charge density, and ρm is the massdensity (mass/unit volume). Then

m =q

2m

V

r × v ρm dV =q

2m

V

r × p dV =q

2mL = g L

where p is the momentum density (momentum/unit volume), so L is the angularmomentum, and g is called the gyromagnetic ratio of the particles.

It turns out that this derivation is wrong by a factor of two for an isolated spin-12particle such as an electron (with charge −e), for which g = −e/m. The extra factorof two requires relativity.

8.6.2 Force and torque on a dipole in an external field

Force on the dipole

Consider a current loop in an external magnetic fieldB(r). The force on the loop is

F = I

C

dr′ ×B(r′)

= I

C

dr′ ×B(r) +

((r′ − r

)· ∇)B(r) + . . .

C

B

where we Taylor-expanded B(r′) about a point r inside (or close to) the loop. We’vealready shown that

∮C

dr′ = 0, so the terms in the integrand that don’t depend onr′ integrate to zero, and hence

F = I

C

dr′ ×(r′ · ∇

)B(r) + . . .

We showed above that ∮

C

dr′(r′ · r

)= S × r

The derivation above works for any vector field a(r) [check it!], so∮

C

dr′(r′ · a(r)

)= S × a(r) (8.1)

Then

Fi = I εijk

C

dx′j(r′ · ∇

)Bk = I εijk

(S ×∇

)jBk = I

((S ×∇

)×B

)i

96

so thatF =

(m×∇

)×B = −m

(∇ ·B

)+∇

(m ·B

)

But ∇ ·B = 0, therefore

F = ∇(m ·B

)≡ −∇Wdip

So the energy of the dipole in the external field is

Wdip = −(m ·B

)

Note: this is not the total energy of the dipole. It also takes energy to make thecurrent flow in the loop. [This is covered next semester.]

Note that these results mirror those for the electric dipole where F = ∇(p · E

)and

W = −p · E .

Torque on the dipole

G =

C

r′ ×[I dr′ ×B(r′)

]= I

C

[dr′(r′ ·B(r)

)−B(r)

(r′ · dr′

) ]

where we approximated B(r′) ≈ B(r) on the RHS. The second term vanishes onapplying Stokes’ theorem (because ∇ ′ × r′ = 0), and we again apply the result ofequation (8.1) to the first term, which gives

G = I

C

dr′(r′ ·B

)= IS ×B = m×B

which again mirrors the result G = p× E for an electric dipole.

The results are the same as for the electric dipole because one can think of a magneticdipole as made up of two magnetic monopoles.

mm

I

+

This is a useful way to remember the formulae, but magnetic monopoles (probably!)do not exist.

Example: compass needle We can measure the magnetic field B via oscillationsof a compass needle (which can be regarded as a magnetic dipole).

Since torque is the rate of change of angular momentum,G = dL

dt, we have

dL

dt= m×B φ

m

B

97

For a needle with moment of inertia I, this becomes

d

dt

(I φ)

= −mB sinφ

where φ is the angle between m and B (see diagram).

For small oscillations, where φ 1, we have

φ+mB

Iφ = 0

which gives an oscillation frequency of

ω =

√mB

I

8.7 Summary: electrostatics and magnetostatics

Electrostatics: Stationary charges∂ρ

∂t= 0 are the source of electric fields

∇ · E =ρ

ε0[ME1]

Coulomb’s law (field due to point charge) leads to

∇× E = 0 [ME3 (static)] ⇒ E = −∇φIn turn, the above lead to Poisson’s equation for the scalar potential φ

∇2φ = − ρε0

Magnetostatics: Steady current loops ∇ · J = 0 produce magnetic fields [nomagnetic monopoles].

∇ ·B = 0 [ME2] ⇒ B = ∇× A

The Biot-Savart law (field due to current element) leads to

∇×B = µ0 J [ME4 (static)]

In turn, in Coulomb gauge, ∇ · A = 0, the above lead to a vector Poisson equationfor A

∇2A = −µ0 J

Lorentz force: On a point particle

F = q(E + v ×B

)

or the Lorentz force densityf = ρE + J ×B

Next semester, you’ll see how the Maxwell equations involving curls, (ME3 andME4), need to be modified when time-varying currents and fields are present.

[End of Semester I]

98