EECS 16A Designing Information Devices and Systems I Fall 2015 Official Lecture Notes Note 13

Wireless communication and modeling signals

Wireless communication through radio waves is one of the foundational technologies for the contemporary age, directly responsible for hundreds of billions of dollars worth of economic activity in the world and indirectly supporting trillions more. It is a critical enabler for other key technologies like satellites, deep-space exploration, drones, and robotics. Wireless is the key to the Internet of Things as well as many advances yet to come in personal health.

Wireless communication also happens to be a wonderful vehicle for learning how to take what we know linear-algebraically about inner products, orthogonality, eigenspaces, and diagonalization to understand powerful techniques for understanding signals in general as well as a very important category of linear systems. So although this note is largely written with a wireless motivation, the ideas here are far more general.

Signals and Systems

We're going to be talking about signals and systems that process those signals. At its heart, a signal is just a function and therefore has a domain and a range. The voltage on a wire varying over time or the brightness of an image varying as we scan across it are both examples of a signal. There are three broad classes of signals:

Continuous-time signals Signals in the real world are analog in nature: both the domain and the range of the signal are continuous. A continuous-time signal is defined as a function x : R → R (or C). The following graph depicts a speech waveform represented as an analog signal over time:

Discrete-time signals When we store and process signals with computers, it is infeasible to read in the entire continuous signal. Hence we usually take discrete samples of the signal to form a sequence. A discrete-time signal is defined as a function x : Z → R (or C). Because these signals are only defined on discrete values in the domain (they are undefined at all other points), they, strictly speaking, cannot be drawn as a continuous graph. The following graph shows the speech waveform from before, except sampled into discrete time.

Finite-time signals Especially when dealing with computers, it is often impossible to deal with a truly infinite sequence. A finite-time signal is defined as a function x : Z_n → R (or C), where Z_n is the finite set {0, 1, ..., n−1}. The graph above, and all graphs we can realistically plot on a piece of paper, is actually of a finite-time signal since it does not extend forever.

A system is a function that takes in a signal and outputs another signal. Continuous-time real-world systems such as analog electronics input and output continuous-time signals, whereas finite-time systems are things like computer programs that manipulate stored values in arrays; these input and output finite-time signals. Computer systems or digital electronics that keep on measuring the world in some manner and making outputs can be thought of as naturally dealing with discrete-time signals. In this class we will be mainly dealing with finite-time systems and signals because these can be modeled as matrices and vectors.

Linear Time-Invariant Systems and Echoes in Wireless Channels

It turns out that a very important category of systems are linear time-invariant (LTI) systems. Perhaps the simplest physical example of an LTI system is provided by looking at what nature does to wireless transmissions.

In the real world, the radio waves emitted by a wireless transmitter don't only go directly to the wireless receiver that wants to listen to them. Instead, they propagate physically, much the same way that sound propagates. If there is a clear line-of-sight path between the transmitter and the receiver, then certainly the radio waves do reach the destination directly. But they also find other paths, bouncing off of obstacles like walls, the ground, the ceiling, buildings, hills, etc. Because the speed of light is finite1, all of these echoes of the transmitted signal reach the destination with different delays2.

The figure below3 shows an example of what real-world radio echoes look like:

1 The speed of light is easily remembered as 3×10^8 m/sec, which is 300 meters per microsecond, or 15 meters per 0.05 microseconds; 0.05 microseconds is the sampling period of many WiFi implementations, which take 20 samples per microsecond.

2 Paths that are longer by a full 15 meters (say from a wall across the room) show up as a full sample later, while paths longer by less than that show up, for practical reasons, as a pair of echoes: one at zero samples and another at 1 sample. More details about this are discussed in 16B.

3 Taken/Photoshopped from: Hashemi, Homayoun. "The indoor radio propagation channel." Proceedings of the IEEE 81.7 (1993): 943-968.


The vertical scale in this plot is logarithmic (the dB scale is 10 log10(·)) because the echoes tend to decay exponentially in amplitude as they encounter more and more bounces. The scales are normalized so that the peak is always at 0 in log scale, which is 1 in linear scale. The no-line-of-sight plot on the right shows that the echoes can be quite significant for over 200 nsec, which corresponds to about 4 samples in a system that uses 20 MHz wide channels like 802.11. The term "impulse response" is just used to mean what the response of the channel would look like to a hypothetical transmission of a single pulse. If there were no obstacles and we were in free space, the response would just look like a single peak going up to 0 dB about 17 nsec later, since it takes light 17 nsec to travel 5 meters.

It is useful to caricature the response of a wireless channel as just having a few paths. For example, if there were three paths, one with delay τ_1, another with delay τ_2 > τ_1, and a third with delay τ_3 > τ_2 > τ_1, then the received signal is y(t) = α_1 x(t−τ_1) + α_2 x(t−τ_2) + α_3 x(t−τ_3) in continuous time. No matter what is sent, the receiver would hear the superposition of three different copies of the transmitted signal, each shifted by delay τ_i and weighted by gain α_i.
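To make this concrete, here is a small discrete-time sketch of such a multipath channel. The gains and delays are made-up illustrative values, and cyclic shifts (np.roll) stand in for the time delays, anticipating the periodic-signal convention adopted later in this note.

```python
import numpy as np

# A discrete-time caricature of a three-path echo channel:
# y[t] = a1*x[t-d1] + a2*x[t-d2] + a3*x[t-d3] (indices taken cyclically).
# The gains and delays below are made-up illustrative values.
def echo_channel(x, gains=(1.0, 0.5, 0.25), delays=(0, 2, 3)):
    y = np.zeros_like(x, dtype=float)
    for a, d in zip(gains, delays):
        y += a * np.roll(x, d)  # np.roll implements a cyclic time delay
    return y

x = np.array([1.0, 0.0, 0.0, 0.0, 0.0, 0.0])  # a single transmitted pulse
y = echo_channel(x)
# y is a superposition of shifted, scaled copies of the pulse:
# [1.0, 0.0, 0.5, 0.25, 0.0, 0.0]
```

Feeding in a single pulse makes the three paths directly visible as three separate peaks in the output.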

This kind of response is called linear and time-invariant for very natural reasons. It is a common feature in natural and engineered systems, not just the wireless communication channel.

Linearity

Given a potentially complicated signal, how do we determine how a system acts on it? As we have seen in circuits, we can break large networks into simpler parts, compute the results for each of the parts, and sum the results. This general principle (breaking a complex system into smaller parts, performing an operation on each part, and combining the results to get the result of performing the operation on the entire system) is called superposition, and we would like to apply it to signals and systems too. Firstly, what kind of systems allow us to apply the principle of superposition? Suppose we have a system F, a function taking in signals. Now if we have a signal z = x + y, we want the output of F on z to equal the sum of the outputs of F on x and y individually. More precisely:

F(z) = F(x+ y) = F(x)+F(y) (1)

Also if we scale the input signal, we want the output signal to scale by the same amount:

F(cx) = cF(x) (2)


These are exactly the same properties that linear transforms in linear algebra have; in fact, linearity is a very important concept that arises in many different areas of mathematics and the sciences.

Notice that it holds for the wireless channel with echoes. If we increase the amplitude of the transmitted signal, the received signal increases by the same factor. If we add together two signals and send them, the response is the same as if we had sent them individually and then added the result. Mathematically, if the transmitted waveform is x(t) = β x_a(t) + γ x_b(t), the received waveform is:

y(t) = α_1 x(t−τ_1) + α_2 x(t−τ_2) + α_3 x(t−τ_3)    (3)

= α_1 (β x_a(t−τ_1) + γ x_b(t−τ_1)) + α_2 (β x_a(t−τ_2) + γ x_b(t−τ_2)) + α_3 (β x_a(t−τ_3) + γ x_b(t−τ_3))    (4)

= β (α_1 x_a(t−τ_1) + α_2 x_a(t−τ_2) + α_3 x_a(t−τ_3)) + γ (α_1 x_b(t−τ_1) + α_2 x_b(t−τ_2) + α_3 x_b(t−τ_3))    (5)

= β y_a(t) + γ y_b(t)    (6)

where y_a is the response to just x_a and y_b is similarly the response to just x_b.
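This superposition property is easy to spot-check numerically. The sketch below uses made-up gains, with cyclic shifts standing in for the channel delays:

```python
import numpy as np

# Numeric spot-check of linearity for an echo channel with made-up gains;
# cyclic shifts (np.roll) stand in for the path delays.
def channel(x):
    return 1.0 * x + 0.5 * np.roll(x, 1) + 0.25 * np.roll(x, 2)

rng = np.random.default_rng(0)
xa = rng.standard_normal(8)
xb = rng.standard_normal(8)
beta, gamma = 2.0, -3.0

lhs = channel(beta * xa + gamma * xb)           # response to the combined input
rhs = beta * channel(xa) + gamma * channel(xb)  # combination of the responses
```

The two computations agree, which is exactly equations (3) through (6) in numeric form.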

Time-Invariance

Many physical systems don't care about what the absolute time is. For such systems, if an input signal is shifted in time, the output signal should also be the same, except also shifted in time by the same amount:

F(x(n+ s)) = F(x)(n+ s) (7)

Notice that this also holds4 for the wireless echo channel. It only cares about the relative delays, not absolute time. To verify, just notice:

y(t+s) = α_1 x(t+s−τ_1) + α_2 x(t+s−τ_2) + α_3 x(t+s−τ_3)    (8)

and so it is time-invariant by inspection. A shift of the input shifts the output.

4 Approximately. In real life, things move around, and so this sort of thing holds for short enough delays but not long ones where the physical paths themselves have changed. Fortunately for us, in the real world, reflecting objects and transmitters and receivers all move very slowly relative to the speed of light and the sampling periods of modern wireless systems. So the channel is approximately LTI for at least a few microseconds or so.


Linear Algebraic Representation

By now, we have seen that a linear system shares a lot in common with linear transforms on vector spaces.

Let us view signals and systems in the language of linear algebra. Suppose we have a signal ~x = [x_0, x_1, ..., x_{n−1}]^T represented by a vector, and a system S. The system matrix acts on ~x to give the output5 ~y = [y_0, y_1, ..., y_{n−1}]^T:

S [x_0, ..., x_{n−1}]^T = [y_0, ..., y_{n−1}]^T    (9)

We already know that multiplication of a vector by a matrix is a linear transform, so S is a linear system. How about time-invariance? First we need to figure out what happens to the signal when we apply a time-shift to it. Suppose we want to shift our signal ~x by 3 steps. Now we have ~x→3 = [_, _, _, x_0, x_1, ..., x_{n−4}]^T. Notice that we have 3 "spaces" at the beginning of the signal that need to be filled in. Also, x_{n−3}, x_{n−2} and x_{n−1} are missing.

To preserve symmetry and to avoid losing information, it is best to assume that the original signal is periodic, and that ~x and ~x→3 are both one period of the original signal, except starting at different times. Therefore ~x→3 = [x_{n−3}, x_{n−2}, x_{n−1}, x_0, x_1, ..., x_{n−4}]^T. More formally, we have:

~x→k = [x_{0−k}, x_{1−k}, ..., x_{n−1−k}]^T    (10)

where all index additions are done modulo6 n.

This twist is required to define time-invariant systems for finite time. It views times as being arranged in a circle like the face of a clock, so shifting by 1 is just rotating the circle by 1 position.
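A minimal sketch of this cyclic shift in code, matching equation (10) (numpy's np.roll does the same thing):

```python
import numpy as np

# Cyclic time-shift from equation (10): sample t of the shifted signal is
# x[(t - k) mod n], so the last k samples wrap around to the front.
def cyclic_shift(x, k):
    n = len(x)
    return np.array([x[(t - k) % n] for t in range(n)])

x = np.array([0, 1, 2, 3, 4])
shifted = cyclic_shift(x, 3)  # [2, 3, 4, 0, 1], same as np.roll(x, 3)
```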

Exercise 13.1 (Time-Shift Systems): Imagine we have a system S→3 that takes any length-5 input signal and shifts it by 3 steps. For example, S→3([3,1,4,1,5]) = [4,1,5,3,1].

1. Is this system linear? That is, for any signals ~x and ~y, does S→3 fulfill properties (1) and (2)?

2. Is this system time-invariant? Does it fulfill (7)?

3. What does S→3 look like when written as a matrix?

5 Note on notation: notice that writing vectors vertically can be quite space-consuming and cumbersome. So from now on, instead of writing a column vector vertically, we will use the equivalent, less vertical-space-consuming [a, b]^T.

6 a modulo b (or a (mod b)) is the same as adding or subtracting multiples of b to a until it is between 0 and b−1. For example, 12 (mod 5) = 2 and −2 (mod 3) = 1.


Do this exercise before reading on.

Now that we have defined a time-shift, let's look at the structure of a system that is time-invariant. This means that S(~x→k) = (S~x)→k. More concretely,

S [..., x_{n−1}, x_0, x_1, ...]^T = [..., y_{n−1}, y_0, y_1, ...]^T    (11)

This still doesn't seem to tell us too much about the structure of S. To see that structure, remember that linearity allows us to break a signal into smaller constituent pieces. First, just look at the first component of ~x, [x_0, 0, ..., 0]^T, and how the system transforms it:

S [x_0, 0, ..., 0]^T = x_0 [s_0, s_1, ..., s_{n−1}]^T    (12)

where [s_0, s_1, ..., s_{n−1}]^T is the first column of S.

We can see that this input signal "picks out" the first column of the system matrix. What if we shift the signal by one time-step? Since this system is time-invariant, we must have:

S [0, x_0, 0, ..., 0]^T = x_0 [s_{n−1}, s_0, ..., s_{n−2}]^T    (13)

Hence we can deduce that the second column of S is [s_{n−1}, s_0, ..., s_{n−2}]^T! In fact, the entire matrix can be deduced by shifting the input signal n times:

S = [ s_0      s_{n−1}  s_{n−2}  ...  s_1
      s_1      s_0      s_{n−1}  ...  s_2
      s_2      s_1      s_0      ...  s_3
      ...      ...      ...           ...
      s_{n−1}  s_{n−2}  s_{n−3}  ...  s_0 ]    (14)

Therefore, all finite-time linear time-invariant (LTI) systems can be described by matrices of the form above. We have seen such matrices before, when we were defining correlations between signals in the context of estimating time-delays for locationing. They are called circulant matrices. We used the notation C_~s to describe making such a matrix from its first column ~s.
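A circulant matrix can be built directly from its first column, since column j is just ~s cyclically shifted down by j. A minimal sketch (scipy.linalg.circulant provides the same construction):

```python
import numpy as np

# Build the circulant matrix C_s of equation (14) from its first column s:
# column j is s cyclically shifted down by j positions.
def circulant(s):
    return np.column_stack([np.roll(s, j) for j in range(len(s))])

s = np.array([1.0, 2.0, 3.0, 4.0])
S = circulant(s)
# The first row is [s_0, s_3, s_2, s_1] = [1, 4, 3, 2], as in equation (14).
```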

So how many parameters do we need to uniquely determine the system? The distinct parameters making up the matrix S are s_0, ..., s_{n−1}, so we only need n parameters. Suppose we have a black-box system and we want to find out these parameters; we know that it is LTI, and we can input any signal and observe the output. What signal should we input? You might recall from above that if we just input ~δ = [1, 0, ..., 0]^T, we will get ~s = [s_0, s_1, ..., s_{n−1}]^T, all the parameters we need! The signal ~δ is called an impulse, and the output ~s of the system is called the impulse response. Hence we see that an LTI system is completely determined by its impulse response.
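This identification procedure is easy to simulate. The "black box" below is a made-up circulant system; a single impulse measurement recovers it completely:

```python
import numpy as np

# Identify an unknown LTI (circulant) system from one impulse measurement.
# The hidden impulse response here is a made-up example.
n = 5
s_hidden = np.array([0.9, 0.4, 0.1, 0.0, 0.0])
S_hidden = np.column_stack([np.roll(s_hidden, j) for j in range(n)])

delta = np.zeros(n)
delta[0] = 1.0
measured = S_hidden @ delta  # one experiment: input an impulse

# The measurement is exactly the impulse response (first column of S),
# which determines the entire system matrix:
S_reconstructed = np.column_stack([np.roll(measured, j) for j in range(n)])
```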

All this can be generalized to the case of continuous-time as well as discrete-time that is infinite in extent, but the mathematics becomes trickier. We will deal with some of this in 16B and you will see more in 120, but for now in 16A, we will stick largely to finite-time.

A Natural Basis For LTI Systems: the DFT Basis

We have seen that an LTI system that acts on a signal of length n can be completely determined by n pieces of information, s_0, s_1, ..., s_{n−1}. However, it is cumbersome to do an entire matrix multiplication to compute the output of a system on a signal ~x = [x[0], ..., x[n−1]]^T. Can we transform ~x and S to a different basis, so that we get ~x′ = [x′[0], ..., x′[n−1]]^T and S′ is a diagonal matrix? If this is possible, then:

S′~x′ = diag(s′_0, s′_1, ..., s′_{n−1}) [x′[0], x′[1], ..., x′[n−1]]^T = [s′_0 x′[0], s′_1 x′[1], ..., s′_{n−1} x′[n−1]]^T    (15)

And we have transformed a matrix multiplication into a simple component-wise multiplication of one vector with another. This is possible only if we can diagonalize S. If we can write S = U S′ U^{−1} and ~x = U ~x′, where U is the change-of-basis matrix:

S~x = (U S′ U^{−1})(U ~x′) = U (S′ ~x′)    (16)

Therefore if we work in the basis defined by the column vectors of U, S will become a diagonal matrix, which simplifies a lot of operations.

How do we find this special basis? Recall that to diagonalize a dimension-n matrix, we need to find n linearly independent eigenvectors. Then U is the matrix with the eigenvectors in its columns and S′ is a matrix with the corresponding eigenvalues on its diagonal. Therefore we need to find the eigenvectors and eigenvalues of S.

In general, this matrix can be very large, and so the usual method for finding eigenvalues by finding the roots of its characteristic polynomial (setting det(S−λI) = 0 and solving for the roots λ) would give us no insight. However, we will exploit the special symmetry inherent in S. Let's start by examining n = 3 so we don't get bogged down with generality:

S = [ s_0  s_2  s_1
      s_1  s_0  s_2
      s_2  s_1  s_0 ]    (17)

Let ~v = [w_0, w_1, w_2]^T. Then

S~v = [ s_0 w_0 + s_1 w_2 + s_2 w_1
        s_0 w_1 + s_1 w_0 + s_2 w_2
        s_0 w_2 + s_1 w_1 + s_2 w_0 ]    (18)

For ~v to be an eigenvector, S~v = λ~v. What properties should w_0, w_1 and w_2 have so that ~v is an eigenvector? We see an intriguing pattern in S~v. If the w_i have the following two properties:

1. w_i w_j = w_{(i+j) mod n}

2. w_0 = 1

Then we can factor out the eigenvalue:

S~v = [ (s_0 w_0 + s_1 w_2 + s_2 w_1) w_0
        (s_0 w_0 + s_1 w_2 + s_2 w_1) w_1
        (s_0 w_0 + s_1 w_2 + s_2 w_1) w_2 ]  =  (s_0 w_0 + s_1 w_2 + s_2 w_1) [w_0, w_1, w_2]^T    (19)

(For example, the second entry s_0 w_1 + s_1 w_0 + s_2 w_2 equals (s_0 w_0 + s_1 w_2 + s_2 w_1) w_1, since w_0 w_1 = w_1, w_2 w_1 = w_0, and w_1 w_1 = w_2 by the two properties.) Therefore ~v is an eigenvector with eigenvalue s_0 w_0 + s_1 w_2 + s_2 w_1.

This guessing-and-checking method we used might seem quite magical, but it is often a great way to find eigenvectors of matrices with a lot of symmetry.

To really find the eigenvectors, we need to find a set of numbers w_0, w_1, w_2 with the above-mentioned properties. Real numbers wouldn't do the trick, as they don't have the "wrap-around" property needed. For each m = 0, 1, ..., n−1, complex exponentials of the form w_k = e^{i(2πm/n)k} have all the required properties.

Why? Because e^{i(2π/n)k} steps along the n-th roots of unity as we let k vary from 0, 1, ..., n−1. This is a simple consequence of Euler's identity e^{iθ} = cos θ + i sin θ geometrically, or of the fact that (e^{i 2π/n})^n = e^{i (2π/n)·n} = e^{i2π} = 1.

How could we have come up with this guess? The 3×3 case is not impossible to solve. We could've just written out the matrix for s_0 = s_2 = 0 and s_1 = 1. (This is the shift-by-one matrix that simply cyclically delays the signal it acts on by 1 time step.) At this point, you could explicitly evaluate det(S−λI) to find the characteristic polynomial and see that we need to find the roots of λ^3 − 1. This implies that the eigenvalues are the 3rd roots of unity. From there, you can find the eigenvectors by direct computation and you would find:

The 3 eigenvectors are:

~v_0 = [1, 1, 1]^T,  ~v_1 = [1, e^{i2π/3}, e^{i4π/3}]^T,  ~v_2 = [1, e^{i4π/3}, e^{i2π/3}]^T    (20)

You would then notice that the same eigenvectors work for the case s_0 = s_1 = 0 and s_2 = 1. This is the delay-by-2 matrix which, by inspection, can be seen to be D^2 if D is the delay-by-one matrix. At this point, you know that since every 3×3 circulant matrix is a linear combination of the identity matrix (no delay) and these two pure delay matrices, and all three of these matrices share the same eigenvectors, these must in fact be the eigenvectors for all 3×3 circulant matrices.

Generalizing, if we have n-dimensional signals, all LTI systems that act on such signals have the same n eigenvectors:

~v_l = [ω^{l·0}, ω^{l·1}, ..., ω^{l·(n−1)}]^T    (21)

where ω = e^{i2π/n}. To normalize this, we take ~u_l = (1/√n) ~v_l. As we will see, the extra 1/√n term makes the norm be 1.


Therefore we can write U as:

U = [~u_0 ~u_1 ... ~u_{n−1}] = (1/√n) [ 1  1        ...  1               1
                                        1  ω        ...  ω^{n−2}         ω^{n−1}
                                        ...              ...             ...
                                        1  ω^{n−2}  ...  ω^{(n−2)(n−2)}  ω^{(n−1)(n−2)}
                                        1  ω^{n−1}  ...  ω^{(n−2)(n−1)}  ω^{(n−1)(n−1)} ]    (22)

The basis in which S is diagonalized is called the Fourier basis or DFT basis (also called the frequency domain), and U as a linear transformation is called the inverse discrete Fourier transform; it brings signals from the frequency domain into the time domain. The conjugate transpose of U, namely U∗, maps signals from the time domain to the frequency domain, and as a transformation is called the discrete Fourier transform or DFT.

We have seen from above that the Fourier basis is a natural basis when we are dealing with LTI systems, and the Fourier transform is an important bridge between the time and frequency domains.

For the wireless communication context, it means that no matter what pattern of echoes we encounter, the DFT basis vectors ~u_k are eigenvectors. This means that we can safely encode messages to different receivers onto the (complex) amplitudes of different DFT basis vectors, and we are guaranteed that they will not interfere with each other. Each receiver can get their message back without experiencing any interference from the messages being communicated to other receivers. All they have to deal with is the eigenvalue of the matrix that hits their particular ~u_k. In the wireless communication context, the fact that the DFT basis is the universal basis for all LTI systems means that the transmitter and receiver can agree on the basis without having learned the details of the specific wireless channel (pattern of echoes) that they are experiencing. This is very different from what would happen if we attempted to separate messages using the standard basis: the wireless channel would induce interference between the messages.
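A sketch of this multiplexing idea, with made-up message amplitudes and a random circulant matrix standing in for the unknown pattern of echoes (np.fft.fft is used to obtain the channel eigenvalues, a fact discussed at the end of this note):

```python
import numpy as np

# Encode two made-up messages on two different DFT basis vectors, pass the sum
# through an unknown echo channel (a random circulant matrix), and recover
# each message with no interference from the other.
n = 8
t = np.arange(n)
u = lambda k: np.exp(2j * np.pi * k * t / n) / np.sqrt(n)  # basis vector u_k

m1, m2 = 3.0 + 1.0j, -2.0 + 0.5j   # message amplitudes (made up)
x = m1 * u(2) + m2 * u(5)          # transmit on frequencies 2 and 5

rng = np.random.default_rng(2)
h = rng.standard_normal(n)         # unknown pattern of echoes
H = np.column_stack([np.roll(h, j) for j in range(n)])
y = H @ x                          # what is received

lam = np.fft.fft(h)                # eigenvalues of H
# Each receiver projects onto its own basis vector and undoes the eigenvalue:
r1 = np.vdot(u(2), y) / lam[2]
r2 = np.vdot(u(5), y) / lam[5]
```

Each recovered amplitude matches the transmitted message exactly, despite neither end knowing the echo pattern in advance.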

Some Properties of the DFT Basis

Illustrating the DFT Basis Vectors

It is worth actually looking at an example of what the DFT vectors look like. These are complex exponentials, and as the frequency k increases, they "wiggle" more and more. They always have an integer number of periods within the total duration n.

[Figure: the DFT basis vectors for n = 10, u_k[t] ∝ e^{i2πkt/10} for k = 0, 1, ..., 9. Each vector is illustrated three ways: as the points u[0], ..., u[9] on the unit circle in the complex plane, as Cartesian-coordinate plots, and as polar-coordinate plots. The k = 0 vector is constant.]

Orthonormality of the DFT Basis

The fact that the DFT basis U is orthonormal lets us use the conjugate transpose U∗ as its inverse, and this makes it easy to go back and forth between the time domain (standard basis) and the frequency domain (DFT basis). The following diagram illustrates how one goes back and forth.

Signals:

    Time-Domain (Standard Basis)        Frequency-Domain (DFT Basis)

                        U∗(·)
         ~s : s[t]   ---------->   ~S : S[f]
                     <----------
                         U(·)

To prove orthonormality, we have to show two things:

• The DFT basis vectors each have norm one.

‖~u_k‖^2 = ⟨~u_k, ~u_k⟩
         = Σ_{t=0}^{n−1} ((1/√n) e^{i(2π/n)kt})∗ (1/√n) e^{i(2π/n)kt}
         = Σ_{t=0}^{n−1} (1/n) e^{−i(2π/n)kt} e^{i(2π/n)kt}
         = Σ_{t=0}^{n−1} 1/n
         = 1.

• The DFT basis vectors are orthogonal to each other. So if k ≠ ℓ,

⟨~u_k, ~u_ℓ⟩ = Σ_{t=0}^{n−1} (1/n) e^{−i(2π/n)kt} e^{i(2π/n)ℓt}
            = Σ_{t=0}^{n−1} (1/n) e^{i(2π/n)(ℓ−k)t}
            = (1/n) Σ_{t=0}^{n−1} (e^{i(2π/n)(ℓ−k)})^t
            = (1/n) · ((e^{i(2π/n)(ℓ−k)})^n − 1) / (e^{i(2π/n)(ℓ−k)} − 1)
            = (1/n) · (e^{i2π(ℓ−k)} − 1) / (e^{i(2π/n)(ℓ−k)} − 1)
            = (1/n) · (1 − 1) / (e^{i(2π/n)(ℓ−k)} − 1)
            = 0

The above is an argument by direct computation using the formula for the sum of a geometric series. It can also be seen more intuitively by plotting the points in the sum on the complex plane and seeing how they are balanced around 0.
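Both claims are easy to spot-check numerically, for example for n = 10:

```python
import numpy as np

# Numeric spot-check of orthonormality of the DFT basis for n = 10.
n = 10
t = np.arange(n)
U = np.exp(2j * np.pi * np.outer(t, t) / n) / np.sqrt(n)

norms = np.linalg.norm(U, axis=0)  # norm of each basis vector u_k
gram = U.conj().T @ U              # all pairwise inner products <u_k, u_l>
```

The norms all come out as 1 and the Gram matrix comes out as the identity, up to round-off.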

It is useful to see some examples of what different signals look like in both the time domain and the frequency domain:

[Figure: four example signals, each plotted in the time domain and in the frequency domain (real and imaginary parts): a constant signal, a spike signal, a shifted spike signal, and a complex exponential.]
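These examples can be reproduced numerically (using numpy's unnormalized DFT, which differs from the orthonormal U∗ by a factor of √n):

```python
import numpy as np

# The four example signals and their (unnormalized) DFTs, for n = 8.
n = 8
t = np.arange(n)

constant = np.ones(n)
spike = np.zeros(n)
spike[0] = 1.0
shifted_spike = np.roll(spike, 2)
complex_exp = np.exp(2j * np.pi * 3 * t / n)

F_constant = np.fft.fft(constant)      # all energy at frequency 0
F_spike = np.fft.fft(spike)            # flat across all frequencies
F_shifted = np.fft.fft(shifted_spike)  # flat magnitude, linearly varying phase
F_cexp = np.fft.fft(complex_exp)       # a spike at frequency 3
```

A spike in one domain is spread out flat in the other, and a shift in time shows up only in the phase of the frequency components.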

The DFT Basis diagonalizes all LTI Systems

This is perhaps the most important property of the DFT basis: the DFT vectors are eigenvectors for all finite-time LTI systems, in other words, for all circulant matrices.

This result, which we have already argued is true informally, is something that we will establish more formally below. But first, it is useful to see a diagram that points out how LTI systems look in the time and frequency domains and how these two are connected.


Systems:

    Time-Domain (Standard Basis)                Frequency-Domain (DFT Basis)

    H = [ h[0]    h[n−1]  ...  h[1]       Λ_H = [ λ_{H,0}  0        ...  0
          h[1]    h[0]    ...  h[2]               0        λ_{H,1}  ...  0
          ...                  ...                ...                    ...
          h[n−1]  h[n−2]  ...  h[0] ]             0        0        ...  λ_{H,n−1} ]

                      U∗(·)U
           H       ---------->     Λ_H
                   <----------
                      U(·)U∗

To justify the above diagram, we just need to see why the DFT vectors are eigenvectors of all circulant matrices. We have already established (by showing orthonormality) that the DFT basis has dimension n and that U^{−1} = U∗. Once we know this, the above diagram follows from what we already know about diagonalization of matrices.

We will proceed by first considering special cases of circulant matrices. Clearly this is true for the identity matrix, since everything is an eigenvector of the identity. Now consider the delay-by-1 matrix D. This has impulse response ~d = [0, 1, 0, ..., 0]^T, and D = C_~d is the circulant matrix that has ~d as its first column.

Let’s see what it does to the DFT basis elements.

(D~u_k)[t] = (1/√n) v_k[(t−1) mod n]    (23)
           = (1/√n) e^{i(2π/n)k((t−1) mod n)}    (24)
           = (1/√n) e^{i(2π/n)k((t−1)−qn)}    (25)
           = (1/√n) e^{i(2π/n)k(t−1)} e^{−i(2π/n)kqn}    (26)
           = (1/√n) e^{i(2π/n)k(t−1)} e^{−i2πkq}    (27)
           = (1/√n) e^{i(2π/n)k(t−1)} (e^{−i2π})^{kq}    (28)
           = (1/√n) e^{i(2π/n)k(t−1)}    (29)
           = e^{−ik(2π/n)} (1/√n) e^{i(2π/n)kt}    (30)
           = e^{−ik(2π/n)} u_k[t]    (31)

The third line in the derivation above is justified since we know that (t−1) mod n = (t−1) − qn for some integer q, since mod means the remainder after dividing by n.

So, indeed the matrix D has ~u_k as an eigenvector, with eigenvalue λ_{D,k} = e^{−ik(2π/n)}.

Next, observe that a cyclic delay by two can be obtained by first delaying by 1 and then doing it again. In other words, D^2 is a delay by two. By this exact reasoning, notice that a cyclic delay by τ ∈ {1, ..., n−1} is just D^τ, and by diagonalization it is clear that D^τ has ~u_k as an eigenvector too, with the eigenvalue also just raised to the power τ. So λ_{D^τ,k} = λ_{D,k}^τ = e^{−ikτ(2π/n)}.
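A quick numerical check of these delay-matrix eigenvalues, for a small made-up n:

```python
import numpy as np

# Check that D^tau has u_k as an eigenvector with eigenvalue
# e^{-i k tau 2 pi / n}, for every k and tau.
n = 8
d = np.zeros(n)
d[1] = 1.0
D = np.column_stack([np.roll(d, j) for j in range(n)])  # delay-by-1 matrix

t = np.arange(n)
ok = True
for k in range(n):
    u_k = np.exp(2j * np.pi * k * t / n) / np.sqrt(n)
    for tau in range(n):
        Dtau = np.linalg.matrix_power(D, tau)
        lam = np.exp(-1j * k * tau * 2 * np.pi / n)
        ok = ok and np.allclose(Dtau @ u_k, lam * u_k)
```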

Now, we know that the DFT basis vectors are eigenvectors of D^τ for τ ∈ {0, 1, ..., n−1}. This means that the DFT basis vectors must also be eigenvectors of all linear combinations of powers of D. This can be seen formally. Consider the matrix H = Σ_{τ=0}^{n−1} h[τ] D^τ acting on a basis vector ~u_ℓ.

H~u_ℓ = (Σ_{τ=0}^{n−1} h[τ] D^τ) ~u_ℓ    (32)
      = Σ_{τ=0}^{n−1} h[τ] D^τ ~u_ℓ    (33)
      = Σ_{τ=0}^{n−1} h[τ] e^{−iℓτ(2π/n)} ~u_ℓ    (34)
      = ~u_ℓ Σ_{τ=0}^{n−1} h[τ] e^{−iℓτ(2π/n)}    (35)
      = (Σ_{τ=0}^{n−1} h[τ] e^{−iℓτ(2π/n)}) ~u_ℓ    (36)

This shows that indeed ~u_ℓ is an eigenvector and has

λ_{H,ℓ} = Σ_{τ=0}^{n−1} h[τ] e^{−iℓτ(2π/n)}    (37)

as the corresponding eigenvalue.

Finally, notice that by choosing the vector ~h = [h[0], h[1], ..., h[n−1]]^T arbitrarily, the above matrix H is just a general circulant matrix! This can be seen most clearly by just looking at the first column: the first column of D^τ has a 1 in the τ position and zeros everywhere else.

We have now established that the DFT basis is an eigenvector basis for all the circulant matrices.
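Equation (37) can be verified directly for an arbitrary (here random) impulse response:

```python
import numpy as np

# Verify the eigenvalue formula (37) for a circulant matrix H built from a
# random impulse response h: H u_l = lambda_{H,l} u_l with
# lambda_{H,l} = sum_tau h[tau] e^{-i l tau 2 pi / n}.
n = 7
rng = np.random.default_rng(3)
h = rng.standard_normal(n)
H = np.column_stack([np.roll(h, j) for j in range(n)])

t = np.arange(n)
ok = True
for l in range(n):
    u_l = np.exp(2j * np.pi * l * t / n) / np.sqrt(n)
    lam = np.sum(h * np.exp(-1j * l * t * 2 * np.pi / n))  # formula (37)
    ok = ok and np.allclose(H @ u_l, lam * u_l)
```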

In the DFT Basis, an LTI system just scales coefficients

The fact that the DFT basis diagonalizes all LTI systems means that we can very easily compute the coordinates in the DFT basis for what happens when a general finite-time signal7 passes through an LTI system.

Given an input signal ~s, suppose that its representation in the DFT basis is ~S, so that ~s = Σ_{f=0}^{n−1} S[f] ~u_f. Then, if the LTI system (circulant matrix) H acts on ~s, we know that H~s = Σ_{f=0}^{n−1} S[f] H~u_f = Σ_{f=0}^{n−1} S[f] λ_{H,f} ~u_f. So the coordinates in the frequency domain are simply multiplied by the eigenvalues λ_{H,f} to compute the coordinates of the resulting signal in the DFT basis.

This is illustrated in the example below.
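In code, this frequency-domain shortcut looks as follows (numpy's unnormalized fft/ifft pair differs from the orthonormal U by factors of √n that cancel out here):

```python
import numpy as np

# Acting with a circulant matrix in the time domain equals component-wise
# scaling by the eigenvalues in the frequency domain.
n = 8
rng = np.random.default_rng(4)
s = rng.standard_normal(n)  # input signal
h = rng.standard_normal(n)  # impulse response
H = np.column_stack([np.roll(h, j) for j in range(n)])

y_time = H @ s                             # time-domain computation

lam = np.fft.fft(h)                        # eigenvalues lambda_{H,f}
y_freq = np.fft.ifft(lam * np.fft.fft(s))  # scale in frequency, return to time
```

The two computations agree; the frequency-domain route replaces an n×n matrix multiplication with component-wise products.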

7 Recall that physically, such finite-time signals are understood to be periodic in the real world. This is what makes the shifts cyclic.


[Figure: an input signal, an impulse response, and the resulting output signal, each plotted in the time domain and in the frequency domain (magnitude and phase).]

Notice here how the polar-coordinate representation of the frequency domain makes it easier to see how the output is just the input multiplied by the eigenvalues of the system matrix. This is because the magnitudes multiply and the phases just add when multiplying two vectors by each other component-wise.

We notice another very important fact in this example plot when we consider the relationship between the frequency components of the impulse response and the eigenvalues of the circulant matrix that has this impulse response. This is the topic of the next subsection of this note.

The eigenvalues of an LTI System are scaled versions of the frequency components of the impulseresponse when viewed in the DFT Basis

There is a close relationship between the representation of a finite signal ~h in the frequency domain and the eigenvalues of the circulant matrix C_~h representing the system whose impulse response is ~h.

Because the DFT basis U is orthonormal, we know that U^{−1} = U∗. So ~H = U∗~h is the representation of ~h in the frequency domain. Consider the f-th component of ~H:

H[f] = (U∗~h)[f]    (38)
     = Σ_{τ=0}^{n−1} h[τ] (1/√n) (e^{i(2π/n)fτ})∗    (39)
     = (1/√n) Σ_{τ=0}^{n−1} h[τ] e^{−ifτ(2π/n)}    (40)
     = (1/√n) λ_{C_~h, f}    (41)

where the last line was obtained by comparing this to (37).

This correspondence justifies the notation where ~H refers to ~h viewed in the DFT basis, while H = C_~h is the circulant matrix made from the impulse response ~h.

This can also be seen from a diagonalization perspective:

C_~h = U Λ U∗
U∗ C_~h = U∗ U Λ U∗ = Λ U∗

Now just look at the first column of both sides; in other words, look at the standard basis vector ~e_0 postmultiplying both sides.

U∗ C_~h ~e_0 = Λ U∗ ~e_0
U∗ ~h = (1/√n) Λ [1, 1, ..., 1]^T
U∗ ~h = (1/√n) [λ_{C_~h,0}, λ_{C_~h,1}, ..., λ_{C_~h,n−1}]^T

Here, what turned out to be most important is that the first column of U∗ is just 1/√n repeated in every position, and that the U matrix is orthonormal and diagonalizes H = C_~h.

The following diagram shows abstractly what is going on. On the bottom, we have the LTI systems that correspond to the signals on top. On the left, things are expressed in the time domain (with coordinates in the standard basis). On the right, things are expressed in the frequency domain (using the DFT basis). Moving right to left and vice versa is a change of coordinates. Moving top to bottom takes a signal and views it as the impulse response of the desired LTI system. Moving bottom to top takes an LTI system and gives you the impulse response.


                 Time-Domain (Standard Basis)     Frequency-Domain (DFT Basis)

                                  U∗(·)
Signals:          ~h : h[t]    ---------->    ~H : H[f]
                               <----------
                                   U(·)

              (·)~e_0  up | down  C(·)      (1/√n)(·)~1  up | down  √n Diagonal(·)

                                 U∗(·)U
Systems:             H        ---------->       Λ_H
                              <----------
                                 U(·)U∗

where H is the circulant matrix with first column ~h (as written out in the earlier diagram) and Λ_H is the diagonal matrix of its eigenvalues λ_{H,0}, ..., λ_{H,n−1}.

In the above diagram, ~e_0 = [1, 0, ..., 0]^T is the standard basis vector with a 1 in the zeroth position, ~1 = [1, 1, ..., 1]^T is the vector with all 1s in it, Diagonal(·) takes a vector and returns a matrix with the elements of that vector along the diagonal and zeros everywhere else, and C(·) takes a vector and returns the circulant matrix with the first column being that vector. U is the orthonormal DFT basis and U∗ is its conjugate transpose.

The numpy.fft.fft function defaults to computing the eigenvalues λ given an impulse response. So that corresponds to a move from the upper left to the bottom right.
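A quick check of this correspondence for a random impulse response:

```python
import numpy as np

# numpy.fft.fft(h) returns the eigenvalues of the circulant matrix C_h.
n = 6
rng = np.random.default_rng(5)
h = rng.standard_normal(n)
C_h = np.column_stack([np.roll(h, j) for j in range(n)])

lam_fft = np.fft.fft(h)           # eigenvalues via the FFT
lam_eig = np.linalg.eigvals(C_h)  # eigenvalues via direct computation

# Compare as multisets: every FFT eigenvalue appears among the computed ones
# (eigvals returns the eigenvalues in arbitrary order).
same = all(np.min(np.abs(lam_eig - x)) < 1e-8 for x in lam_fft)
```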

There are a great many other useful properties of the DFT basis that we will see in 16B and beyond.
