
Macroeconomics III

Vahagn Jerbashian

Lecture notes∗

This version: February 11, 2017

Contents

Expectations
    Introduction to the concept of expectations
    Types of expectations

Traditional models with expectations
    The Cobweb Model
    The Cagan Model
    Appendix
    The Lucas Imperfect Information Model with AD
    The Sticky Wage Model

The Effectiveness of monetary policy (with fixed rules)
    A model with stabilizing monetary policy
    A model with constant growth of money
    Discretionary monetary policy
    Political cycles and discretionary monetary policy
    Monetary policy under commitment and discretion
    Monetary policy under commitment
    Monetary policy under discretion (without commitment)

∗These notes may contain typos/mistakes and are subject to changes/updates during our course. Please keep track if there are any.


Business cycles
    Business cycles - The Carlin and Soskice (2005) model
    Endogenous business cycles - The Goodwin (1967) model
    A Real Business Cycles Model
    The Basic RBC Model
    Price Rigidities - The Calvo (1983) model

Expectations and financial markets
    Bonds
    Stocks, stock prices, and stock markets
    Measures of returns and risk
    Measuring portfolio return and risk to assemble a portfolio
    Market price of risk and the CAPM
    Black-Scholes model

Appendix
    Appendix - Reminder of Statistics 0
    Mean-variance trade-off


Expectations

Introduction to the concept of expectations

Why did you start reading these notes? Would you start and/or continue reading if you expected these notes to be useless and/or your test(s) to be very easy? I suppose the answer is no, at least for some students.² This is an example of how expectations matter for strategies, actions, and, later, for performance at the individual level.

Consider another example to see how expectations matter for economic performance at the individual level, as well as at the aggregate level. Suppose we have an economy of 1000 firms and consumers. In this economy, firms hire consumers’ labor to produce and consumers use the wages they receive to buy firms’ products. Suppose that one of the firms expects low demand for its products prior to deciding how much to produce. In order to produce according to its expectation, it would hire a small amount of labor and pay a low wage bill. Further, imagine that instead of one firm, all firms expect low demand for their products. In such a case, all firms will hire a small amount of labor and pay low wage bills. Therefore, consumers will have low income and consume a small amount of products, which will reinforce firms’ expectations. As you continue reading this chapter, I will bring more such examples. The main focus of our examples, and classes in general, is on how expectations can matter for macroeconomic performance and aggregate economic fluctuations due to supply or demand shocks.

Prior to proceeding to these examples, let’s digress (which we will do often) and discuss formally

what expectations are.

Expectations - intuitive and formal discussion

We know for sure that if we throw an apple (perhaps, not an iPhone) it will eventually hit the ground because of gravity. Since we know this for sure, we expect the event "thrown apple hits the ground" to happen. Can you claim with the same certainty that tomorrow it won’t rain around our university?

In these examples we have two different (random) events. One of the events is "thrown apple hits the ground," whereas the other event is "no rain tomorrow around our university." These events happen with some probabilities. The first event happens with probability 1, i.e., it happens for sure/with certainty. (Since it happens with certainty we don’t call it a random event.) The second event, however, happens with probability less than 1. For example, let it happen with probability 0.5. Then you would say that with probability 0.5 you expect to have no rain around our university tomorrow.

² These notes are not useless and I promise your tests will not be easy.

Let’s now consider a bit more sophisticated examples. Suppose we have a lottery which pays 100 EUR with probability 0.01 and 0 EUR otherwise, i.e., with probability 0.99. What is the expected pay-off of this lottery? It is 1 EUR:

$$0.01 \times 100 + 0.99 \times 0 = 1.$$

Therefore, if this lottery costs 5 EUR, one might think at least twice prior to buying it.

We have two possible realizations of random variable "pay-off" in this example: "100 EUR" and

"0 EUR." These realizations have associated probabilities 0.01 and 0.99. When we are calculating

expected value of the random variable "pay-off" we weight each of these possible realizations with

the corresponding probability. Intuitively, we do so in order to give larger weight to realizations

which are more likely to happen (here: 0 EUR is more likely to happen and gets 0.99 weight which

is larger than 0.01 weight of 100 EUR).
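The weighting described above can be sketched in a few lines of Python. This is only an illustration: the function name `expected_value` and the input checks are our own additions, not part of the notes.

```python
from typing import Sequence

def expected_value(values: Sequence[float], probs: Sequence[float]) -> float:
    """Probability-weighted sum of the possible realizations."""
    if len(values) != len(probs):
        raise ValueError("values and probs must have the same length")
    if abs(sum(probs) - 1.0) > 1e-12:
        raise ValueError("probabilities must sum to 1")
    # Each realization is weighted by how likely it is to happen.
    return sum(p * x for x, p in zip(values, probs))

# The lottery from the text: 100 EUR with probability 0.01, 0 EUR otherwise.
lottery_payoffs = [100.0, 0.0]
lottery_probs = [0.01, 0.99]
print(expected_value(lottery_payoffs, lottery_probs))
```

Comparing the printed number with the 5 EUR ticket price reproduces the "think twice" argument.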

Suppose now we have a random variable $x$ which takes (independently distributed) values from a finite and ordered set of real numbers $\{x_j\}_{j=1}^{N}$ with associated probabilities $\{p_{x_j}\}_{j=1}^{N}$. The real numbers $\{x_j\}_{j=1}^{N}$ are the possible realizations of $x$. In turn, the set of associated probabilities $\{p_{x_j}\}_{j=1}^{N}$ is the probability distribution of $x$. To relate to the previous example, let $x$ be the pay-off of a lottery which gives $x_j$ Euros with probability $p_{x_j}$, where $j = 1, ..., N$. For instance, if $x_1$ is the event when the pay-off is 0 EUR and $x_2$ is the event when the pay-off is 10 EUR, then $p_{x_1}$ and $p_{x_2}$ are the probabilities of those events.

What is then the expected value of $x$? Use $E[x]$ to denote it. $E[x]$ is given by

$$E[x] = p_{x_1} x_1 + p_{x_2} x_2 + ... + p_{x_N} x_N = \sum_{j=1}^{N} p_{x_j} x_j,$$

which clearly generalizes our previous example in a straightforward manner. If there were (countably) infinite possible realizations of $x$ so that we had $\{x_j\}_{j=1}^{+\infty}$ and $\{p_{x_j}\}_{j=1}^{+\infty}$, then we would simply write

$$E[x] = \sum_{j=1}^{+\infty} p_{x_j} x_j.$$

In any case $E[x]$ is just a real number, of course assuming that $E[x] < +\infty$.

Let’s continue this generalization process. Suppose now that random variable $x$ takes values from a continuous set of real numbers $[A, B]$, where $A < B$ (e.g., $A = 0$ and $B = 100$). Denote the probability that the realization of $x$ happens to be less than $X$ by

$$F(X) = P(x < X),$$

where $F$ is the probability distribution function of $x$. The function $F$ maps the set of possible realizations $[A, B]$ to $[0, 1]$.

To derive the expected value of $x$ we need to know the probability of each of the possible realizations of $x$. For a second suppose that $x$, as in the previous example, took (independently distributed) values from a discrete and ordered space $\{x_j\}_{j=1}^{+\infty}$. In such a case the probability of observing exactly realization $x_2$ is the difference between the probability that the realization of $x$ is less than $x_3$ and the probability that $x$ is less than $x_2$. In other words,

$$p_{x_2} = (p_{x_1} + p_{x_2}) - p_{x_1}.$$

According to our new notation this can then be written as

$$p_{x_2} = F(x_3) - F(x_2).$$

Denote this difference by $\Delta F$,

$$\Delta F = F(x_3) - F(x_2).$$
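The step from the distribution function to the probability of a single realization can be illustrated with a small Python sketch. The three realizations and their probabilities are made-up numbers, and `make_cdf` is a hypothetical helper introduced only for this illustration.

```python
from bisect import bisect_left
from itertools import accumulate

def make_cdf(values, probs):
    """Return F with F(X) = P(x < X) for a discrete, ordered set of realizations."""
    cumulative = [0.0] + list(accumulate(probs))  # cumulative[k] = p_1 + ... + p_k
    def F(X):
        k = bisect_left(values, X)  # number of realizations strictly below X
        return cumulative[k]
    return F

values = [1.0, 2.0, 3.0]   # x_1, x_2, x_3 (illustrative)
probs = [0.2, 0.5, 0.3]    # p_{x_1}, p_{x_2}, p_{x_3} (illustrative)
F = make_cdf(values, probs)

# Probability of observing exactly x_2, recovered as F(x_3) - F(x_2).
p_x2 = F(values[2]) - F(values[1])
print(p_x2)
```

Up to floating-point rounding, the difference returns the probability that was assigned to $x_2$.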

Let’s go back to the case when $x$ takes values from a continuous set $[A, B]$. Suppose, as in the previous example, $x_1$, $x_2$, and $x_3$ are consecutive possible realizations of $x$. Since the set of possible realizations of $x$ is continuous, the distance between these realizations is infinitesimally small. In such a circumstance we consider an infinitesimally small change in $F$ for obtaining the probability that $x$ takes a value of $x_2$. This infinitesimally small change we denote by $dF$ instead of $\Delta F$. The expected value of $x$ in this case is

$$E[x] = \int_{A}^{B} x \, dF(x).$$³

In this expression, $x$ are the possible realizations of $x$ and $dF(x)$ are their associated probabilities. We use an integral instead of a sum since we are summing/integrating over a continuum of infinitesimally small points.

In case when $F$ is a differentiable function on $[A, B]$ (i.e., $\frac{dF(x)}{dx}$ exists on that interval) we can rewrite $E[x]$ as

$$E[x] = \int_{A}^{B} x f(x) \, dx,$$

where $f(x) = \frac{dF(x)}{dx}$ is the probability density function of $x$.

³ Again, we need to have $E[x] < +\infty$ so that $E[x]$ is something well defined, i.e., it is just a real number.

In economics and in many other disciplines we call the possible realizations of a random variable "possible states" and the space of possible realizations "space of possible states." You can easily notice that the expected value of $x$ depends on both the space of possible states and the probability distribution of $x$. For example, suppose for simplicity that

$$F(x) = \frac{x - A}{B - A}$$

(i.e., we have a uniform distribution); then

$$E[x] = \frac{1}{B - A} \int_{A}^{B} x \, dx = \frac{1}{2}(B + A).$$

Therefore, changing $B$ and/or $A$, which corresponds to changing the space of possible states of $x$, changes its expected value. Consider a change in $F$ now. Suppose

$$F(x) = \frac{1}{\Phi\left(\frac{B-\mu}{\sigma}\right) - \Phi\left(\frac{A-\mu}{\sigma}\right)} \int_{A}^{x} \frac{1}{\sigma} \phi\left(\frac{s-\mu}{\sigma}\right) ds,$$

where

$$\phi(z) = \frac{1}{\sqrt{2\pi}} e^{-\frac{1}{2} z^{2}}, \qquad \Phi(z) = \int_{-\infty}^{z} \phi(Z) \, dZ$$

(i.e., we have a truncated normally distributed random variable with non-truncated mean $\mu$ and variance $\sigma^2$; $\Phi$ and $\phi$ are the standard normal distribution and density functions, respectively). In such a case, it can be shown that

$$E[x] = \mu + \sigma \frac{\phi\left(\frac{A-\mu}{\sigma}\right) - \phi\left(\frac{B-\mu}{\sigma}\right)}{\Phi\left(\frac{B-\mu}{\sigma}\right) - \Phi\left(\frac{A-\mu}{\sigma}\right)}.$$

Cooking up examples for random variables with discrete distributions is much easier. Suppose we have (1) $\{x_j\}_{j=1}^{N} = \{1, 2\}$ and $\{p_{x_j}\}_{j=1}^{N} = \{0.5, 0.5\}$, (2) $\{x_j\}_{j=1}^{N} = \{2, 2\}$ and $\{p_{x_j}\}_{j=1}^{N} = \{0.5, 0.5\}$, and (3) $\{x_j\}_{j=1}^{N} = \{1, 2\}$ and $\{p_{x_j}\}_{j=1}^{N} = \{0.8, 0.2\}$. Clearly, the difference between (1) and (2) is in the space of possible states. In turn, the difference between (1) and (3) is in the associated probabilities/distribution of states. Expected values in each of these cases are 1.5, 2, and 1.2, respectively. For more discussion see Appendix - Statistics 0.
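The expression for the mean of the truncated normal can be checked numerically. The sketch below assumes that $\phi$ and $\Phi$ denote the standard normal density and distribution functions, and it compares the closed form with a crude midpoint-rule integral of $x f(x)$ over $[A, B]$. All function names and parameter values are illustrative choices, not part of the notes.

```python
import math

def phi(z):
    """Standard normal density."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)

def Phi(z):
    """Standard normal distribution function, via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def truncated_normal_mean(mu, sigma, A, B):
    """Closed-form mean of a normal(mu, sigma^2) truncated to [A, B]."""
    a, b = (A - mu) / sigma, (B - mu) / sigma
    return mu + sigma * (phi(a) - phi(b)) / (Phi(b) - Phi(a))

def numerical_mean(mu, sigma, A, B, n=100_000):
    """Midpoint-rule approximation of the integral of x * f(x) over [A, B]."""
    mass = Phi((B - mu) / sigma) - Phi((A - mu) / sigma)
    h = (B - A) / n
    total = 0.0
    for i in range(n):
        x = A + (i + 0.5) * h
        density = phi((x - mu) / sigma) / (sigma * mass)  # truncated density f(x)
        total += x * density * h
    return total

mu, sigma, A, B = 1.0, 2.0, 0.0, 4.0  # illustrative values
closed_form = truncated_normal_mean(mu, sigma, A, B)
approximation = numerical_mean(mu, sigma, A, B)
print(closed_form, approximation)
```

The two printed numbers should agree to several decimal places. Changing `A`, `B`, `mu`, or `sigma` shows how both the space of states and the distribution move the expected value.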

Economic agents act according to their expectations. Often in real life, as well as in economics, we might not know exactly the entire space of possible states of a random variable, nor might we know exactly its probability distribution function. In our examples we will see that economic performance depends on economic agents’ beliefs about what the possible states are and what the distribution function is.

Further examples and Keynesian-beauty contest

In order to further assert that expectations and the way they are formed matter in economics, consider a game called "p-beauty contest," first run in Nagel (1995). The rules of the game are as follows:

• Each of $N$ players is asked to choose a number from the interval $[0, 100]$.

• The winner is the player whose choice is closest to $p$ times the mean of the choices of all players, where $p < 1$ (e.g., $p = 0.5$).

In this game the random variable for a player is the mean of the choices of all players. Meanwhile, the probability distribution of this variable depends on the types and beliefs/knowledge of all players. For example, it turns out that if all players are rational in the sense that everyone performs iterated elimination of (weakly) dominated strategies and all players know about that (i.e., it is common knowledge that everyone is rational), then it is straightforward to guess what the mean of the choices of all players would be. Under these assumptions everyone simply chooses 0. To see this, consider a player. This player will never choose a number above $100p$ since any such number is dominated by $100p$: the mean of all choices cannot exceed 100, so the winning target cannot exceed $100p$. Moreover, given that the player believes that others are rational too, s/he will not pick a number above $100p^2$ since s/he will know that no one will pick above $100p$. Similarly, believing that everyone is rational, s/he will not pick a number above $100p^3$ and so on, until all numbers but zero are eliminated.

If $p > 1$ then 100 can also be an equilibrium (and 0 is not a "stable" equilibrium). For $p = 1$ any number chosen by all players can be an equilibrium.
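The iterated elimination argument for $p < 1$ can be traced numerically. The following minimal sketch computes the largest number that survives each round; `elimination_bounds` is a hypothetical helper name.

```python
def elimination_bounds(p, rounds, upper=100.0):
    """Upper bound on the choices that survive each round of elimination."""
    bounds = []
    for _ in range(rounds):
        upper *= p  # nobody rational picks above p times the previous bound
        bounds.append(upper)
    return bounds

# With p = 0.5 the bounds are 50, 25, 12.5, ...
print(elimination_bounds(p=0.5, rounds=3))

# With p = 2/3 the bound also shrinks towards 0, only more slowly.
bounds = elimination_bounds(p=2 / 3, rounds=20)
print(bounds[0], bounds[-1])
```

The bound after $k$ rounds is $100p^k$, which goes to 0 as $k$ grows whenever $p < 1$.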

This game mimics the problem of a seller in the stock market, for example. The seller wants to sell his shares when the share price is at its peak, just before anyone else wants to sell. In order to do that (i.e., to design his actions) the seller needs to know the types and beliefs/knowledge of other sellers. This example motivated John Maynard Keynes to propose the original setup of this game in Chapter 12 of his work The General Theory of Employment, Interest and Money (1936). In that work, he proposed an explanation of fluctuations in prices in equity markets in terms of changes in the beliefs of sellers and buyers. Instead of picking numbers, Keynes used an analogy based on a newspaper contest, in which players are asked to choose the "most beautiful" faces from a set of photographs of women. Those who picked the most popular face are then the winners. (This is the reason why the name of the game is Beauty Contest.)

Imagine that there are three players. A naive strategy in this game would be to pick the face one perceives as the most beautiful. If everyone does so, one could deviate with a more sophisticated strategy and pick a face which s/he thinks is most likely to be chosen by the other two. If everyone does so, one could deviate with an even more sophisticated strategy and pick a face which s/he thinks is most likely to be expected to be chosen by the other two, etc.

Keynes wrote: "It is not a case of choosing those [faces] that, to the best of one’s judgment, are

really the prettiest, nor even those that average opinion genuinely thinks the prettiest. We have

reached the third degree where we devote our intelligences to anticipating what average opinion

expects the average opinion to be. And there are some, I believe, who practice the fourth, fifth and

higher degrees."

It turns out, however, that in reality the rationality assumptions are often violated in p-beauty contest games and, therefore, the outcome is not at 0. Researchers have shown this, for example, using experiments with high school students. The following comment summarizes the thought processes of a high school student participating in a newspaper contest (submitted to one of the newspaper studies, the Spektrum der Wissenschaft). The game is a p-beauty contest with $p = 2/3$.

"I would like to submit the proposal of a class grade 8e of the Felix-Klein-Gymnasium Goettingen

for your game: 0.0228623. How did this value come up? Johanna . . . asked in the math-class

whether we should participate in this contest. The idea was accepted with great enthusiasm and

lots of suggestions were made immediately. About half of the class wanted to submit their favorite

numbers. To send one number for all, maybe one could take the average of all these numbers.

A first concern came from Ulfert, who stated that numbers greater than 66 2/3 had no chance

to win. Sonja suggested to take 2/3 of the average. At that point it got too complicated to some

students and the finding of the decision was postponed. In the next class Helena proposed to

multiply 33 1/3 with 2/3 and again with 2/3. However, Ulfert disagreed, because starting like

that one could multiply it again with 2/3. Others agreed with him that this process then could be

continued. They tried and realized that the numbers became smaller and smaller. A lot of students

gave up at that point, thinking that this way a solution could not be found. Other believed to have

found the path of the solution: one just has to submit a very small number.

However, one could not agree how many of the people who participated realized this process.

Johanna supposed that the people who read this newspaper are quite sophisticated. At the end of

the class 7 to 8 students heatedly continued to discuss this problem. The next day the math teacher

received the following message: We think it best to submit number 0.0228623."

Consider another game called "ultimatum game." There are two players in this game. Player 1 is given 10 EUR and the players decide how to divide it. The rules of the game are as follows:

• Player 1 proposes a division of the sum

• Player 2 can either accept or reject this proposal

— If the player 2 rejects, neither player receives anything

— If the player 2 accepts, 10 EUR is split according to the proposal

The extensive form representation of the game is:


In this figure, it is assumed that player 1 offers either 2 EUR or 5 EUR to player 2. Player 2 then decides to accept or reject. If player 2 rejects in either of these cases, both get 0 EUR. In turn, if player 2 accepts the offer, the money is split according to the proposal.

The strategy of player 1 is the proposal coupled with the expected strategy/response of player 2, which depends on the type of player 2. Imagine that players 1 and 2 care only about money in this game (i.e., players are expected pay-off maximizers). If player 1 knows the type of player 2, then s/he knows that player 2 will accept whatever positive amount s/he proposes. Therefore, s/he can propose something very close to 0 (in fact, perhaps, exactly 0) and get as much of the pay-off as possible (something around 10 EUR). If, however, player 1 expects that player 2 has a strict preference for equality of the split (i.e., player 2 will reject a proposal if it splits the reward very unequally), then s/he might propose something higher than 0 EUR and close to 5 EUR. This is because otherwise s/he gets nothing.
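How the proposal of player 1 depends on what s/he expects from player 2 can be sketched as follows. We assume, purely for illustration, that offers are whole Euros and that player 2 accepts any offer at or above some threshold `min_acceptable` and rejects everything else. The threshold is a hypothetical way of encoding the type of player 2.

```python
def best_offer(total, min_acceptable):
    """Offer that maximizes player 1's pay-off, given the expected response.

    Player 2 is assumed to accept any offer >= min_acceptable and to reject
    all lower offers (in which case both players get nothing).
    """
    best, best_payoff = 0, 0
    for offer in range(0, total + 1):
        accepted = offer >= min_acceptable
        payoff = total - offer if accepted else 0
        if payoff > best_payoff:
            best, best_payoff = offer, payoff
    return best, best_payoff

# Player 2 cares only about money and accepts any positive amount.
print(best_offer(total=10, min_acceptable=1))

# Player 2 is expected to reject anything below an equal split.
print(best_offer(total=10, min_acceptable=5))
```

The same player 1 makes very different proposals in the two cases, and the only thing that differs is the expectation about player 2.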


Types of expectations

In the previous (sub-)section we defined and discussed expectation operators for discrete and continuous random variables. We saw that we can compute the expected value of a random variable in cases when we know exactly the space of its possible states and the probability of those states (distribution function). We saw as well that if we have random variables which have different spaces of possible states and/or probabilities of those states then their expected values can be different (i.e., the expected value depends on the set of possible states and on the distribution function). Further, we saw examples where economic agents act according to their expectations and might not know exactly the entire space of possible states of a random variable or its probability distribution function.

What to do if we don’t know exactly the distribution function of a random variable? To alleviate such a problem we use statistics. In other words, we observe realizations of the random variable and use them in order to infer certain moments of its distribution. The expected value is the first moment. In particular, suppose that we have realizations $\{x_j\}_{j=1}^{N}$ of random variable $x$. Further, we have no priors about the probability of each of the observations. In such a case, the mean of observations

$$x_N^e = \frac{1}{N} \sum_{j=1}^{N} x_j$$

is the sample (statistical) analogue of the expected value of $x$. Like $E[x]$, it is just a number.

Here we have $\frac{1}{N}$ in front of each realization since we treat each of the observations as equally likely (i.e., each observation has probability $\frac{1}{N}$ to occur). A central theorem in probability theory called the Law of Large Numbers tells us that as the number of (independent, identically distributed) realizations of the random variable grows to infinity, the mean of the sample converges to the expected value of the random variable,

$$\Pr\left(\lim_{N \to +\infty} x_N^e = E[x]\right) = 1.$$
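The Law of Large Numbers can be seen at work in a small simulation. The sketch below draws repeatedly from the lottery discussed earlier (100 EUR with probability 0.01, 0 EUR otherwise) and reports the sample mean for growing sample sizes. The seed and the sample sizes are arbitrary choices.

```python
import random

def sample_mean(draws):
    """Mean of observations: each observation gets weight 1/N."""
    return sum(draws) / len(draws)

def lottery_draw(rng):
    """One realization of the lottery: 100 EUR w.p. 0.01, 0 EUR otherwise."""
    return 100.0 if rng.random() < 0.01 else 0.0

rng = random.Random(0)  # fixed seed so that the run is reproducible
for n in (100, 10_000, 200_000):
    draws = [lottery_draw(rng) for _ in range(n)]
    print(n, sample_mean(draws))
```

For small samples the mean can be far from the expected value of 1 EUR. As the sample grows it should settle close to 1.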

Usually in economics, and in particular in macroeconomics, we observe realizations of random variables over time. Therefore, instead of indices $j$ and $N$ we often use $t$ and $T$, where $t$ indexes time and $T$ refers to its most recent value. Then we write, for example,

$$x_T^e = \frac{1}{T} \sum_{t=1}^{T} x_t,$$

where $x_1$ is the value that random variable $x$ has obtained at time 1, $x_2$ is the value that random variable $x$ has obtained at time 2, etc. Again, this is the best (unbiased and consistent) guess of the expected value of $x$ when we have no priors (extra information) on the likelihood of its observations. It is, therefore, usually the default option, and we will treat this $x_T^e$ as the first type of expectations. It incorporates all observations and treats them equally.

In certain circumstances we might have priors over the likelihood of observations to reoccur. For example, it might be that we know that older observations are less likely to happen again. To put more meat into the discussion, suppose that we observe realizations $\{x_t\}_{t=1}^{T}$ of random variable $x$ and would like to compute the mean of $x$, perhaps because we will apply it for our actions in time $T + 1$. For that purpose we will write $x_{T+1}^e$ instead of $x_T^e$.

Suppose further that we know that the likelihood of $x_T$ to reoccur is $\lambda$, where $0 < \lambda < 1$, and that for any $t$ from 2 to $T$ the likelihood of $x_{t-1}$ to reoccur is $(1 - \lambda)$ times the likelihood of $x_t$, so that $x_{T-1}$ reoccurs with likelihood $\lambda(1 - \lambda)$. (Notice that $0 < \lambda < 1$ implies that $\lambda(1 - \lambda) < \lambda$. Therefore $x_{t-1}$ is less likely to happen than $x_t$.) In such a case, we would write the sample mean as

$$x_{T+1}^e(\lambda) = \lambda x_T + \lambda(1 - \lambda) x_{T-1} + ... + \lambda(1 - \lambda)^{T-1} x_1 = \lambda \sum_{t=1}^{T} (1 - \lambda)^{T-t} x_t,$$

where $\lambda$ is the likelihood that we think $x_T$ will happen and $\lambda(1 - \lambda)$ is the likelihood of $x_{T-1}$.

$x_{T+1}^e(\lambda)$ has a special name. It is called an exponentially weighted average with time decay. Notice that $x_{T+1}^e(\lambda)$ can be easily written in a recursive form in the following way:

$$x_{T+1}^e(\lambda) = \lambda x_T + (1 - \lambda) x_T^e(\lambda) = x_T^e(\lambda) + \lambda\left[x_T - x_T^e(\lambda)\right].$$

According to the second expression, $x_{T+1}^e(\lambda)$ is the sum of the old expectation $x_T^e(\lambda)$ and a weighted difference between the realization of $x$ at time $T$, $x_T$, and its expectation $x_T^e(\lambda)$. This last term adapts/corrects the current expectation to the error in forecast/expectation: $[x_T - x_T^e(\lambda)]$. In this context, $\lambda$ is called a correction parameter. Economists call these types of expectations Adaptive Expectations. These are the second type of expectations. Hereafter, we will use the notation $x_{T+1|T}^e$ to denote the expectation of random variable $x_{T+1}$ conditional on information available at time $T$.
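The equivalence between the weighted sum and the recursive form can be verified on a few numbers. In this sketch the observations are made up, and the recursion is started from an initial expectation of 0, which is what the weighted sum implies because it puts no weight on anything before $x_1$. The function names are ours.

```python
def adaptive_closed_form(xs, lam):
    """lam * sum_{t=1}^{T} (1 - lam)^(T - t) * x_t for observations x_1, ..., x_T."""
    T = len(xs)
    return lam * sum((1.0 - lam) ** (T - t) * x for t, x in enumerate(xs, start=1))

def adaptive_recursive(xs, lam, initial=0.0):
    """Iterate x^e_{t+1} = x^e_t + lam * (x_t - x^e_t), starting from `initial`."""
    expectation = initial
    for x in xs:
        expectation += lam * (x - expectation)  # correct by a share lam of the error
    return expectation

observations = [2.0, 3.0, 5.0, 4.0]  # made-up data
lam = 0.5
print(adaptive_closed_form(observations, lam))
print(adaptive_recursive(observations, lam))
```

Both lines print the same number. The recursive form needs only the last expectation and the last observation, which is why it is convenient in models.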

Macroeconomists used (and abused) adaptive expectations extensively in their models before the 1970s. The assumption of adaptive expectations was imposed in these models without much justification. For example, this assumption usually generates persistent errors in these models (i.e., $\lim_{T \to +\infty} \frac{1}{T} \sum_{t=1}^{T} [x_t - x_T^e(\lambda)] \neq 0$), which seems odd. It implies that economic agents make persistent errors (i.e., have no intention to correct their errors).
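As one illustration of how persistent errors can arise, consider a variable that rises by 1 every period and agents who forecast it adaptively. The sketch below is our own example, not taken from the notes. It assumes that the first expectation equals the first observation, since some initial value has to be chosen.

```python
def adaptive_forecast_errors(xs, lam):
    """Errors x_t - x^e_t when x^e_{t+1} = x^e_t + lam * (x_t - x^e_t).

    The first expectation is set equal to the first observation, which is an
    assumption of this sketch.
    """
    expectation = xs[0]
    errors = []
    for x in xs:
        error = x - expectation
        errors.append(error)
        expectation += lam * error  # adaptive correction
    return errors

trend = [float(t) for t in range(1, 201)]  # x_t = t, rising by 1 each period
errors = adaptive_forecast_errors(trend, lam=1 / 3)
print(errors[:5])
print(errors[-1])
```

In this example the error obeys $e_{t+1} = (1 - \lambda) e_t + 1$, so it never dies out and instead settles at $1/\lambda$ (here 3): the agents underestimate the variable in every period.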

An assumption regarding expectations which is consistent with economic models is the "rational expectations" assumption. This assumption states that economic agents use the model to form expectations. A model is a description of an economy. Assuming that it is the right one, the rational expectations assumption states that the agents know the economy entirely and they use that information to form their expectations. "The agents know the economy" means that they know the structure behind demand and supply and that there are random variables, with given distribution functions, which affect supply and demand. In such a circumstance, in terms of the model, agents’ expectations are not systematically wrong in that all errors are random/not persistent. Therefore, under this assumption deviations from perfect foresight in the model are only random.

Denote these expectations as

$$x_{T+1|T}^e = E_T[x_{T+1}],$$

where $E_T[x_{T+1}] \equiv E[x_{T+1} | \Omega_T]$ is the expected value of $x$ at time $T + 1$ given all information available up to time $T$. Information is summarized by $\Omega_T$ and includes all the possible structures in the economy and values of fundamentals. These are the third type of expectations.

Let’s see why this can work better than adaptive expectations, which miss some of the structure of the model. Suppose that agents are assumed to have adaptive expectations, and the model economy features a constantly rising inflation rate. In such a circumstance agents would be assumed to always underestimate inflation since they are assumed to predict inflation by looking at inflation in previous years. Under the rational expectations assumption, since the trend in inflation is part of the model, agents would take it into account in forming their expectations and there won’t be such a bias.

Formally, this can be represented in the following manner. Consider an economy which starts at time 1 and where inflation $\pi_T$ at any time $T$ is given by

$$\pi_T = \pi + T + \eta_T,$$

where $\pi$ is a constant, $T$ indexes time, and $\eta_T$ is a random variable with mean 0 and variance $\sigma^2$. Further, suppose we are at the end of $T = 2$, and the agents in this economy need to form an expectation of inflation for time $T + 1 = 3$. If agents have adaptive expectations with correction parameter $\lambda = \frac{1}{3}$, then

$$\pi_3^e\left(\frac{1}{3}\right) = \frac{1}{3} \sum_{t=1}^{2} \left(\frac{2}{3}\right)^{2-t} \pi_t.$$

At the end of $T = 2$ the realizations of $\eta$ are known. Suppose $\eta_1 = 0.01$ and $\eta_2 = 0.02$. Expected inflation can then be rewritten as

$$\pi_3^e\left(\frac{1}{3}\right) = \frac{1}{3}\left[(\pi + 2 + 0.02) + \left(\frac{2}{3}\right)(\pi + 1 + 0.01)\right] = \frac{5}{9}\pi + \frac{202}{225}.$$

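In case of rational expectations,

$$\begin{aligned}
\pi_{3|2}^e &= E[\pi_3 | \Omega_2] \\
&= E[\pi + 3 + \eta_3 | \Omega_2] \\
&= \pi + 3 + E[\eta_3 | \Omega_2] \\
&= \pi + 3.
\end{aligned}$$

Notice that

$$\pi_{3|2}^e > \pi_3^e\left(\frac{1}{3}\right).$$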
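The comparison of the two forecasts can be reproduced with exact fractions. The sketch below keeps the unknown constant $\pi$ symbolic by tracking its coefficient separately from the numerical part. The helper name `adaptive_inflation_forecast` is hypothetical, and the inputs are the values used in the text ($\lambda = 1/3$, $\eta_1 = 0.01$, $\eta_2 = 0.02$).

```python
from fractions import Fraction

def adaptive_inflation_forecast(lam, trend_terms, shocks):
    """Return (coefficient on pi, constant) of lam * sum_t (1 - lam)^(T - t) * pi_t,
    where pi_t = pi + trend_terms[t - 1] + shocks[t - 1]."""
    T = len(trend_terms)
    coefficient = Fraction(0)
    constant = Fraction(0)
    for t in range(1, T + 1):
        weight = lam * (1 - lam) ** (T - t)
        coefficient += weight
        constant += weight * (trend_terms[t - 1] + shocks[t - 1])
    return coefficient, constant

lam = Fraction(1, 3)
coefficient, constant = adaptive_inflation_forecast(
    lam, trend_terms=[1, 2], shocks=[Fraction(1, 100), Fraction(2, 100)]
)
print(coefficient, constant)

# Rational forecast is pi + 3, i.e. coefficient 1 and constant 3.
gap_coefficient = 1 - coefficient
gap_constant = 3 - constant
print(gap_coefficient, gap_constant)
```

The gap between the rational and the adaptive forecast is `gap_coefficient` times $\pi$ plus `gap_constant`. Both are positive, so the adaptive forecast is the lower of the two for any non-negative $\pi$.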

In this simple economy, trend in inflation can be thought to represent policy changes of a central

bank. In other words, suppose there is a central bank in this economy which constantly increases

inflation.

One of the influential drivers behind the widespread adoption of rational expectations in macroeconomics has been the famous Lucas Critique. In terms of our discussion, the Lucas Critique is that in models which do not feature agents with rational expectations, agents might not react to policy changes. This does not reconcile well with reality since in such a case agents can be tricked almost always with some policy changes. There is a famous quote attributed to Lincoln on this issue: "You can fool some of the people all of the time, and all of the people some of the time, but you can not fool all of the people all of the time."

Despite these seemingly plausible properties and widespread use, the rational expectations assumption has received criticism on the grounds that it assumes that agents know everything. The criticism goes on to say that in the real world agents (consumers, firms, etc.) do not know the economy exactly.⁴ (If this were to happen, would Economics be a science?)

⁴ There is an emerging field in macroeconomics which deals with this issue. The models in this field feature agents which continuously learn the economy and know the economy fully at the end of the time horizon (i.e., they are asymptotically rational). For details see Evans, G., and Honkapohja, S. (2001). Learning and Expectations in Macroeconomics. Princeton University Press.


Traditional models with expectations

This section highlights how important the assumption on expectations is in two traditional models. The models are the Cobweb Model and the Cagan Model.

The Cobweb Model

The Cobweb Model (Kaldor, 1934) proposes an explanation of why prices might be subject to periodic fluctuations in certain markets. It assumes that firms must choose output before prices are observed, that demand and supply (prices) are uncertain, and that firms have adaptive expectations with $\lambda = 1$. These types of expectations are called Static Expectations: firms expect the most recently observed price to prevail. Firms’ expectations about prices at time $T + 1$ are based on the prices that prevailed in the previous period, at time $T$. In other words, using $p$ to denote prices,

$$p_{T+1}^e = p_T^e(\lambda) + [p_T - p_T^e(\lambda)] = p_T.$$

Hereafter, we will replace $T + 1$ with $t$.

Such a model seems well suited to agricultural markets. In such markets, producers invest in production (start the production process) long before they sell their output. Periodic fluctuations can then happen, for example, because of supply shocks such as bad weather. For example, according to this model, if producers of corn experience very bad weather and have reduced output, they would get higher prices. In the next period, they expect high prices and therefore will produce a lot. This will depress prices and lead to an expectation of low prices. Expecting low prices, the producers of corn will produce little corn and in equilibrium the price will rise, etc.

In graphical terms, the process described above can be represented in the following manner.


The difference between these two figures is the relation between the slopes of the demand and supply curves. In the figure to the left, the slope of the inverse supply curve is higher than the absolute value of the slope of the inverse demand curve. In the figure to the right, the slope of the inverse supply curve is lower than the absolute value of the slope of the inverse demand curve.

Suppose an economy represented by these figures starts at time $t = 0$ at the intersection of the S (supply) and D (demand) curves. In that period the economy receives a negative shock to supply so that prices in the next period are $p_1$. In period $t = 1$ the economy receives a counterbalancing positive supply shock which brings the supply curve back to its original position. In this period, however, expected prices are still $p_1$. Producers produce according to $p_1$. Demand, however, falls short of supply. Therefore, the maximum that producers are able to charge is $p_2$, which is lower than $p_1$ ($p_2 < p_1$). For period 2, then, producers expect price $p_2$ and produce accordingly. In such a case, in period 2 it turns out that they have produced less than is demanded. They then sell at a higher price $p_3$ ($p_2 < p_3$). This process continues and generates periodic fluctuations in prices.

In the case of the figure to the left, prices tend to stabilize after the shock. This happens because inverse supply (or supply prices) reacts more than does inverse demand (or demand prices). Faced with higher demand in period $t = 2$, for example, firms raise their prices. They raise them so much that this dampens demand in the next period. However, in the case of the figure to the right, prices do not stabilize. This happens because inverse supply reacts less than does inverse demand.⁵

Let’s make this model more formal. Suppose at any time $t$ demand and supply functions are given by

$$D_t = m_I - m_p p_t + \eta_{1,t},$$

$$S_t = r_I + r_p p_t^e + \eta_{2,t},$$

where $m_I$, $m_p$, $r_I$, and $r_p$ are positive parameters. $\frac{1}{m_p}$ and $\frac{1}{r_p}$ are the absolute values of the slopes of the inverse demand and inverse supply curves. $\eta_{1,t}$ and $\eta_{2,t}$ are shocks/disturbances. Let $\eta_{1,t}$ and $\eta_{2,t}$ be independently and identically distributed (i.i.d.) with mean 0 and variance $\sigma^2$. The values of these shocks are not known at the time when price expectations are formed. At time $t$, supply is chosen according to the expected value of the price. To stay in line with our story, assume that the economy starts at $p_0$ where the D and S curves intersect. Moreover, $\eta_{1,t} \equiv 0$ and $\eta_{2,0} < 0$ so that the price shifts from $p_0$ to $p_1$ and makes $p_1^e = p_1$. Further, $\eta_{2,1} = -\eta_{2,0}$ and $\eta_{2,t} \equiv 0$ $\forall t > 1$.

The market clearing condition requires that at every point in time the quantity demanded is
equal to the quantity supplied,
\[ D_t = S_t. \]
Therefore,
\[ m_I - m_p p_t + \eta_{1,t} = r_I + r_p p^e_t + \eta_{2,t} \]

5 If in this economy the absolute values of the slopes of demand and supply were equal, then there would be permanent price fluctuations with constant magnitude.


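and
\[ p_t = \frac{m_I - r_I}{m_p} - \frac{r_p}{m_p} p^e_t + \frac{\eta_{1,t} - \eta_{2,t}}{m_p}. \]
Denote
\[ \alpha_1 = \frac{m_I - r_I}{m_p}, \qquad \alpha_2 = \frac{r_p}{m_p}, \qquad \eta_t = \frac{\eta_{1,t} - \eta_{2,t}}{m_p}, \]
and rewrite $p_t$ as
\[ p_t = \alpha_1 - \alpha_2 p^e_t + \eta_t. \]
Assuming that $p^e_t = p_{t-1}$, we have
\[ p_t = \alpha_1 - \alpha_2 p_{t-1} + \eta_t. \]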

This is a very basic stochastic difference equation. Its solution is the sum of the general solution of the
homogeneous equation
\[ p_t = -\alpha_2 p_{t-1} \]
and a particular solution of the entire equation.
The solution of the homogeneous equation is
\[ p^h_t = (-\alpha_2)^{t-1} p_1, \]
where $p_1$ is the price at which the economy started. Now we need to guess a particular solution of
the full equation. It turns out that for equations of this form the particular solution is
\[ p^p_t = \sum_{\tau=1}^{t-1} (-\alpha_2)^{t-1-\tau} \left(\alpha_1 + \eta_{\tau+1}\right). \]
Therefore,$^{6}$
\[ p_t = (-\alpha_2)^{t-1} p_1 + \sum_{\tau=1}^{t-1} (-\alpha_2)^{t-1-\tau} \left(\alpha_1 + \eta_{\tau+1}\right). \]

Ignore the second term in this expression and notice that if $\alpha_2 > 1$ then the absolute value of
$(-\alpha_2)^{t-1}$ increases over time. Therefore, $p_t$ diverges, oscillating with ever larger amplitude. The parameter
$\alpha_2 = r_p/m_p$ is the inverse of the ratio of the slopes of the inverse supply and inverse demand curves, $m_p/r_p$.
It is greater than 1 when $r_p > |-m_p|$,
which is equivalent to saying that the slope of the inverse supply curve, $1/r_p$, is lower than the absolute
value of the slope of the inverse demand curve, $1/m_p$ (so that inverse demand reacts more than inverse
supply). However, if $\alpha_2 < 1$ then the absolute value of $(-\alpha_2)^{t-1}$ declines over time. Therefore, $p_t$
converges to a number. In this case $r_p < |-m_p|$, which means that the slope of the inverse supply
curve is higher than the absolute value of the slope of the inverse demand curve (so that inverse
demand reacts less than inverse supply). What is the number that the price converges to? The answer
to this question is quite simple. Convergence here means that the price stabilizes over time (i.e., we
have a steady state). We have assumed that $\eta_t = 0$ for any $t > 1$. Therefore, a stable price satisfies
\[ p = \alpha_1 - \alpha_2 p, \]
and the price converges to
\[ p = \frac{\alpha_1}{1 + \alpha_2}. \]
Notice that $p$ is the price level at which the D and S curves of our example intersect. In other words, $p = p_0$.

6 To see that this is the solution, plug this expression into the equation above.

To make this analysis clearer, let's consider several numerical examples. Suppose that
\[ p_1 = 1, \qquad m_I = m_p = 1, \qquad r_I = 0, \]
and consider two values of $r_p$:
\[ r^1_p = 0.5, \qquad r^2_p = 1.5. \]
This implies that
\[ \alpha_1 = 1, \qquad \alpha^1_2 = 0.5, \qquad \alpha^2_2 = 1.5, \qquad \eta_t = -\eta_{2,t}. \]
The assumed value of $p_1$, together with $\eta_{2,1} = -\eta_{2,0}$, determines the values of the shocks (given, of course, that
$\eta_{1,t} \equiv 0$ and $\eta_{2,t} \equiv 0$ for $t > 1$).


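Let's determine the paths of $p$ for these parameter values. First, consider the case when $r^1_p = 0.5$:
\[ p_2 = 1 - 0.5\,p_1 + \eta_2 = 0.5, \]
\[ p_3 = 1 - 0.5\,p_2 + \eta_3 = 0.75, \]
\[ p_4 = 1 - 0.5\,p_3 + \eta_4 = 0.625, \]
\[ \vdots \]
\[ p_{15} = 0.666687012, \]
\[ \vdots \]
\[ p_{+\infty} = \frac{2}{3}. \]
The price converges to $\frac{2}{3}$, which is equal to $\frac{\alpha_1}{1+\alpha^1_2}$.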

Now consider the case when $r^2_p = 1.5$:
\[ p_2 = 1 - 1.5\,p_1 + \eta_2 = -0.5, \]
\[ p_3 = 1 - 1.5\,p_2 + \eta_3 = 1.75, \]
\[ p_4 = 1 - 1.5\,p_3 + \eta_4 = -1.625, \]
\[ \vdots \]
\[ p_{15} = 175.5575562, \]
\[ \vdots \]
\[ p_{+\infty} = \pm\infty. \]
The price diverges. Of course, a negative price does not make sense. Therefore, one would have
stopped at $p_2$, concluding that there is something fundamentally wrong in this economy.
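The two numerical paths can be reproduced with a few lines of Python. This is only an illustrative sketch: the function name `cobweb_path` and the convention that all shocks after period 1 are zero are assumptions made here to match the example, not part of the model itself.

```python
def cobweb_path(alpha1, alpha2, p1, periods):
    """Iterate p_t = alpha1 - alpha2 * p_{t-1} from p_1, with all later shocks set to zero.

    Returns [p_1, p_2, ..., p_periods]; entry k of the list holds p_{k+1}.
    """
    path = [p1]
    for _ in range(periods - 1):
        path.append(alpha1 - alpha2 * path[-1])
    return path


# alpha1 = 1 in both cases; alpha2 = r_p / m_p equals 0.5 or 1.5.
stable = cobweb_path(alpha1=1.0, alpha2=0.5, p1=1.0, periods=15)
unstable = cobweb_path(alpha1=1.0, alpha2=1.5, p1=1.0, periods=15)

print("alpha2 = 0.5:", stable[1:4], "...", stable[-1])
print("alpha2 = 1.5:", unstable[1:4], "...", unstable[-1])
print("steady state for alpha2 = 0.5:", 1.0 / (1.0 + 0.5))
```

The first path oscillates around 2/3 with shrinking amplitude, while the second oscillates with growing amplitude, which is the distinction between the two figures.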

What happens if we have adaptive expectations with $\lambda < 1$? In that case
\[ p^e_t = p^e_{t-1} + \lambda\left(p_{t-1} - p^e_{t-1}\right), \]
where
\[ p_{t-1} = \alpha_1 - \alpha_2 p^e_{t-1} + \eta_{t-1}. \]
Expectations can be written as
\[ p^e_t = \left[1 - \lambda(1+\alpha_2)\right] p^e_{t-1} + \lambda\left(\alpha_1 + \eta_{t-1}\right). \]
In this case we have a non-homogeneous difference equation in expectations. Similar to the previous
discussion, expectations converge to a number if $|1 - \lambda(1+\alpha_2)| < 1$ and diverge
otherwise. Steady-state expectations are given by
\[ p^e = \left[1 - \lambda(1+\alpha_2)\right] p^e + \lambda\alpha_1, \]
\[ p^e = \frac{\alpha_1}{1+\alpha_2}. \]
Since we have adaptive expectations, $p^e$ is steady when prices are steady,$^{7}$
\[ p = \frac{\alpha_1}{1+\alpha_2}. \]

Interestingly, there can be situations in which prices (and their expected values) are stable under
adaptive expectations with $\lambda < 1$ but unstable under static expectations. For instance, suppose
that $1 - \lambda(1+\alpha_2) < 0$. Then, since $\alpha_2 > 0$ by definition, the absolute value of $1 - \lambda(1+\alpha_2)$ is always
less than $\alpha_2$:
\[ |1 - \lambda(1+\alpha_2)| = \lambda(1+\alpha_2) - 1, \]
\[ \lambda(1+\alpha_2) - 1 < \alpha_2 \;\Leftrightarrow\; \lambda < 1. \]
This implies that if $1 - \lambda(1+\alpha_2) < 0$, then prices (and their expected values) are more likely
to be stable under adaptive expectations with $\lambda < 1$ than under static expectations. This happens
because $\lambda$ dampens the reaction of supply to the observed price. For instance,
suppose that $\alpha_2 = 1.5$, so that we are back to our example. Further, suppose $\lambda = 0.5$. In such a case
$|1 - \lambda(1+\alpha_2)| = 0.25$ and prices (together with their expected values) are stable under adaptive
expectations. We have seen, however, that for this example prices (and their expected values) are
not stable under static expectations.
If instead $1 - \lambda(1+\alpha_2) > 0$, then
\[ |1 - \lambda(1+\alpha_2)| = 1 - \lambda(1+\alpha_2), \]
\[ 1 - \lambda(1+\alpha_2) < \alpha_2 \;\Leftrightarrow\; \frac{1-\alpha_2}{1+\alpha_2} < \lambda. \]
If $\alpha_2 > 1$ (so that prices are not stable under static expectations), it is always the case
that $\frac{1-\alpha_2}{1+\alpha_2} < \lambda$, which means that adaptive expectations are more likely to generate stable prices
(and expectations). For instance, again suppose that $\alpha_2 = 1.5$ but now $\lambda = 0.1$. In such a case
$|1 - \lambda(1+\alpha_2)| = 0.75$ and prices (together with their expected values) are stable under adaptive
expectations.
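The stability condition can be checked numerically. The sketch below is illustrative only: the function names are hypothetical, shocks are set to zero, and the starting expectation of 1 is an assumption made here. Setting $\lambda = 1$ recovers static expectations, for which the root equals $-\alpha_2$.

```python
def adaptive_root(alpha2, lam):
    """Coefficient on the lagged expectation: 1 - lam * (1 + alpha2)."""
    return 1.0 - lam * (1.0 + alpha2)


def expectation_path(alpha1, alpha2, lam, pe_start, periods):
    """Iterate pe_t = [1 - lam(1 + alpha2)] * pe_{t-1} + lam * alpha1 with shocks set to zero."""
    root = adaptive_root(alpha2, lam)
    path = [pe_start]
    for _ in range(periods - 1):
        path.append(root * path[-1] + lam * alpha1)
    return path


alpha1, alpha2 = 1.0, 1.5          # the unstable example under static expectations
for lam in (1.0, 0.5, 0.1):        # lam = 1.0 is the static case
    root = adaptive_root(alpha2, lam)
    verdict = "stable" if abs(root) < 1 else "unstable"
    print(f"lambda = {lam}: root = {root:+.2f} ({verdict})")

# With lam = 0.5 the expectation settles at alpha1 / (1 + alpha2) = 0.4.
settled = expectation_path(alpha1, alpha2, lam=0.5, pe_start=1.0, periods=60)[-1]
print("limit of expectations for lambda = 0.5:", settled)
```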

What would happen if we had rational expectations in this model? Under rational expectations,
expectations are implied by the model. To answer the question we therefore need to solve
for the expected price level from the demand and supply equations. Luckily, we have done most of the job.

7 To verify this, plug $p^e = \frac{\alpha_1}{1+\alpha_2}$ into $p = \alpha_1 - \alpha_2 p^e$.


Write $E_{t-1}[p_t]$ instead of $p^e_t$ in $p_t = \alpha_1 - \alpha_2 p^e_t + \eta_t$:
\[ p_t = \alpha_1 - \alpha_2 E_{t-1}[p_t] + \eta_t. \]
Take the expected value of $p_t$ conditional on the information available at time $t-1$:
\[ E_{t-1}[p_t] = E_{t-1}[\alpha_1] - \alpha_2 E_{t-1}\left[E_{t-1}[p_t]\right] + E_{t-1}[\eta_t] = \alpha_1 - \alpha_2 E_{t-1}[p_t]. \]
Therefore,
\[ E_{t-1}[p_t] = \frac{\alpha_1}{1+\alpha_2}, \]
and this relation holds for any $t$.$^{8}$ This implies that in the short term
\[ p_t = \alpha_1 - \alpha_2\frac{\alpha_1}{1+\alpha_2} + \eta_t = \frac{\alpha_1}{1+\alpha_2} + \eta_t, \]
and shocks are the only source of fluctuations in the economy. This inference holds because
the realizations of $\eta$ do not affect expectations, given that $\eta$ has zero mean and is i.i.d. (unpredictable).
In the long term, however, we assume that there are no shocks. In some sense this corresponds to
assuming that everything is perfect. In such a case, $E_{t-1}[p_t] = p_t$ and
\[ p = \frac{\alpha_1}{1+\alpha_2}. \]
This analysis shows that the type of expectations matters for the level of aggregate economic activity
and prices, and for fluctuations. With static or adaptive expectations, there can
be prolonged fluctuations in this model long after the shock. With rational
expectations, however, there are no fluctuations after the shock.

The Cagan Model

The Cagan Model has been an influential contribution to policy making and academia. In his
1956 article (Cagan, 1956), Cagan proposed a model which delivered a novel explanation for extraordinarily
high inflation (hyperinflation). This model seems to do well in explaining the behavior of inflation
and the demand for money even in the midst of such distress.

The common line of thought is that hyperinflation is caused by the continued supply of very large
quantities of money by the central bank. Famous examples are the hyperinflations in Germany during
1921-1923 and in Argentina in 1989. Hyperinflation in both countries happened because
their governments decided to fund their debt by printing money.

The Cagan Model stresses the destabilizing effects that expectations might have on inflation. In
the Cagan Model, inflation can destabilize and become very large even when only a very small amount of money
is injected into the economy. Therefore, an implication of the model is that policy makers should be
wary of financing a deficit by printing money, even in small quantities.

8 We have just derived expectations from the model. In this sense we have assumed that the agents who live in the economy
described by the model know the economy and use the model to form expectations. Moreover, in this sense adaptive
expectations are a notion imposed on the model (an additional condition or assumption, so to say).

The Cagan Model consists of several blocks. The first block of this model is the money demand equation
\[ \frac{M_t}{P_t} = L(Y_t, i_t), \]
where $M_t$ is the amount of desired money holdings, $P_t$ is the aggregate price level, and money is the medium
of exchange. Therefore, $\frac{M_t}{P_t}$ is the real amount of desired money holdings. $L(Y_t, i_t)$ is the money demand
function. It is assumed to depend on current output (income) $Y_t$ and the nominal interest rate $i_t$. More
precisely, it is assumed to increase with current output (income) since increasing output implies
that the same amount of money buys more goods. In turn, money demand is assumed to depend
negatively on the nominal interest rate. A justification for this assumption is that the nominal interest rate
is paid on bonds (nominal savings) but it is not paid on money holdings. Money holdings can be
freely converted into savings. Therefore, the nominal interest rate is the opportunity cost of holding
money. A higher nominal interest rate increases this opportunity cost and reduces the desire to hold
money. (This equation can be thought of as a combination of the Quantity Equation of Money
and Liquidity Preference Theory.)

Suppose that $L(\cdot,\cdot)$ is differentiable in both arguments. Formally, our assumptions
about $L(\cdot,\cdot)$ can be summarized as
\[ \frac{\partial L(Y,i)}{\partial Y} > 0, \qquad \frac{\partial L(Y,i)}{\partial i} < 0, \]
where the time index $t$ has been dropped because these inequalities are assumed to hold for any $t$.

The second block is the Fisher Equation. In a deterministic setup the Fisher Equation
is
\[ 1 + i_t = (1 + \pi_{t+1})(1 + r_t), \]
where $i_t$ is the nominal rate agreed for bonds at time $t$ and paid at time $t+1$ on bond holdings
from time $t$. $\pi_{t+1}$ is inflation in the period from $t$ to $t+1$,
\[ \pi_{t+1} = \frac{P_{t+1} - P_t}{P_t}. \]
In turn, $r_t$ is the real interest rate.

It is straightforward to obtain this equation. Suppose that at time $t$ one has acquired an asset $A_t$
at value $P_tA_t$. At time $t+1$ the value of the asset has changed to $P_{t+1}A_{t+1}$. The percentage
change in the value of the asset is the nominal interest earned on $P_tA_t$, and the percentage change in
the volume (quantity) of the asset is the real interest rate,


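\[
\begin{aligned}
i_t &= \frac{P_{t+1}A_{t+1} - P_tA_t}{P_tA_t} = \frac{P_{t+1}A_{t+1}}{P_tA_t} - 1 \\
    &= \left(\frac{P_{t+1}}{P_t}\right)\left(\frac{A_{t+1}}{A_t}\right) - 1 \\
    &= \left(\frac{P_{t+1} - P_t}{P_t} + 1\right)\left(\frac{A_{t+1} - A_t}{A_t} + 1\right) - 1 \\
    &= (1 + \pi_{t+1})(1 + r_t) - 1.
\end{aligned}
\]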

In the Cagan Model, however, future prices are not known. They are assumed to follow a random
(stochastic) process. In such a case, we replace $P_{t+1}$ with its expected value $P^e_{t+1}$. This also
implies that we need to replace the inflation rate with its expected value, and the Fisher Equation becomes
\[ 1 + i_t = \left(1 + \pi^e_{t+1}\right)(1 + r_t). \]
There is no expectation on $i_t$ since, although it has to be paid at time $t+1$, it is determined (agreed)
at time $t$.$^{9}$
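A small numerical comparison shows why the multiplicative form of the Fisher Equation matters in a hyperinflation setting. The sketch below compares it with the familiar additive form $i_t = \pi^e_{t+1} + r_t$; the function names and the inflation and interest rates used are illustrative assumptions, not data.

```python
def nominal_rate_exact(expected_inflation, real_rate):
    """Multiplicative Fisher Equation: 1 + i = (1 + pi_e)(1 + r)."""
    return (1.0 + expected_inflation) * (1.0 + real_rate) - 1.0


def nominal_rate_additive(expected_inflation, real_rate):
    """Additive form i = pi_e + r, accurate only when all rates are close to zero."""
    return expected_inflation + real_rate


real_rate = 0.03
gaps = {}
for label, pi_e in (("low inflation", 0.02), ("hyperinflation", 5.0)):
    exact = nominal_rate_exact(pi_e, real_rate)
    additive = nominal_rate_additive(pi_e, real_rate)
    gaps[label] = exact - additive   # the gap equals the cross term pi_e * r
    print(f"{label}: exact = {exact:.4f}, additive = {additive:.4f}, gap = {gaps[label]:.4f}")
```

The gap between the two forms is the cross term $\pi^e_{t+1} r_t$, which is negligible at low inflation but not when expected inflation is several hundred percent.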

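The analysis in the Cagan Model focuses particularly on periods of hyperinflation. In these periods
nominal values change much more rapidly than real values. Therefore, let's assume that real variables
are constant. Moreover, assume that the logarithm of the money demand function is linear in $\ln Y_t$
and $\ln(1 + i_t)$. In particular,
\[
\begin{aligned}
\ln L(Y_t, i_t) &= \alpha_0 + \alpha_1 \ln Y - \alpha_\pi \ln\left[\left(1 + \pi^e_{t+1}\right)(1 + r)\right] \\
                &=: \alpha - \alpha_\pi \ln\left(1 + \pi^e_{t+1}\right),
\end{aligned}
\]
where $\alpha_0$, $\alpha_1$, and $\alpha_\pi$ are positive constants and $\alpha = \alpha_0 + \alpha_1 \ln Y - \alpha_\pi \ln(1 + r)$.

In such a case, combining the money demand equation and the Fisher Equation, we must have
\[ \ln M_t - \ln P_t = \alpha - \alpha_\pi\left(\ln P^e_{t+1} - \ln P_t\right). \]
Denote logarithms by lowercase letters ($z = \ln Z$) and rewrite the equation above as
\[ m_t - p_t = \alpha - \alpha_\pi\left(p^e_{t+1} - p_t\right), \]
or
\[ p_t = -\frac{\alpha}{1 + \alpha_\pi} + \frac{1}{1 + \alpha_\pi} m_t + \frac{\alpha_\pi}{1 + \alpha_\pi} p^e_{t+1}. \]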

This is a fairly interesting equation. It suggests that the current price is a function of the current money
supply and the expected future price level. Moreover, the current price increases with money supply. This
happens because, given constant output, increasing money supply simply increases prices. (Prices are
the rates at which money is converted to goods.) The current price also increases with the expected future
price. This happens because a higher expected price increases expected inflation. Given a constant
real interest rate, this implies that the nominal interest rate increases, which reduces the desire to hold
money. However, the supply of money is unchanged, i.e., $M_t$ is fixed. Therefore, the rates at which
money is exchanged for goods increase, which is the same as saying that prices, $p_t$, increase. In terms
of equations:
\[ p^e_{t+1} \uparrow \;\Rightarrow\; \pi^e_{t+1} \uparrow \text{ and } r_t = \text{const} \;\Rightarrow\; i_t \uparrow \;\Rightarrow\; L(Y_t, i_t) \downarrow \text{ and } M_t = \text{const} \;\Rightarrow\; p_t \uparrow. \]

9 You might be used to the following form of the Fisher Equation: $i_t = \pi^e_{t+1} + r_t$. This is an approximation of the one stated
above. To see this, assume that $i_t$, $\pi^e_{t+1}$, and $r_t$ are close to 0 and apply the approximation $x \approx \ln(1 + x)$, which holds
when $x$ is close to zero.

If we assume that $\alpha = 0$, then the price (well... its logarithm) becomes a weighted average of
the current money supply and the expected future price,
\[ p_t = \frac{1}{1 + \alpha_\pi} m_t + \frac{\alpha_\pi}{1 + \alpha_\pi} p^e_{t+1}. \]
For simplicity we will maintain this assumption and, to keep the notation clean, denote $\psi = \frac{1}{1+\alpha_\pi} \in (0, 1)$, so that
\[ p_t = \psi m_t + (1 - \psi) p^e_{t+1}. \tag{1} \]
This equation, together with an assumption on expectations, is the Cagan Model.

We will now analyze how different types of expectations matter for the dynamics in this economy.
Moreover, we will check how changes in money supply matter for changes in prices and inflation,
which is the percentage change in prices.

Suppose that agents in this economy have adaptive expectations. Further, for simplicity suppose
that the expected value of the price for time $t-1$ coincided with its realization, $p^e_{t-1} = p_{t-1}$ (i.e.,
$p_{t-1} = p_{t-2}$ and $\pi_{t-1} = 0$). In such a case
\[
\begin{aligned}
p^e_{t+1} &= (1-\lambda) p^e_t + \lambda p_t \\
          &= (1-\lambda)\left[(1-\lambda) p^e_{t-1} + \lambda p_{t-1}\right] + \lambda p_t \\
          &= (1-\lambda)\left[(1-\lambda) p_{t-1} + \lambda p_{t-1}\right] + \lambda p_t \\
          &= (1-\lambda) p_{t-1} + \lambda p_t.
\end{aligned}
\]
Therefore, the Cagan Model can be expressed as
\[ p_t = \frac{\psi}{1 - (1-\psi)\lambda} m_t + \frac{(1-\psi)(1-\lambda)}{1 - (1-\psi)\lambda} p_{t-1}. \]
The general solution of this difference equation is the sum of the general solution of the homogeneous
equation and a particular solution of this equation. It is "easy" to guess and verify that it is given by
\[ p_t = \left[\frac{(1-\psi)(1-\lambda)}{1 - (1-\psi)\lambda}\right]^{t} p_0 + \frac{\psi}{1 - (1-\psi)\lambda} \sum_{\tau=0}^{t} \left[\frac{(1-\psi)(1-\lambda)}{1 - (1-\psi)\lambda}\right]^{t-\tau} m_\tau, \]


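where $p_0$ is the initial level of prices. It is a given number.

Suppose that only a few $m_t \neq 0$ (or that $m_t$ is a stationary function). The logarithm of the price level in
such a case is stable if
\[ \frac{(1-\psi)(1-\lambda)}{1 - (1-\psi)\lambda} < 1. \]
With this inequality, the logarithm of the price level converges over time to 0. Therefore, the price level
converges to 1. Inflation, in turn, can be written as
\[
\begin{aligned}
\pi_t &= p_t - p_{t-1} \\
      &= -\left[\frac{(1-\psi)(1-\lambda)}{1 - (1-\psi)\lambda}\right]^{t-1} \frac{\psi}{1 - (1-\psi)\lambda}\, p_0 \\
      &\quad + \frac{\psi}{1 - (1-\psi)\lambda} \left( m_t - \frac{\psi}{1 - (1-\psi)\lambda} \sum_{\tau=0}^{t-1} \left[\frac{(1-\psi)(1-\lambda)}{1 - (1-\psi)\lambda}\right]^{t-1-\tau} m_\tau \right).
\end{aligned}
\]
It is straightforward to notice that $\frac{(1-\psi)(1-\lambda)}{1-(1-\psi)\lambda} < 1$, because the denominator exceeds the numerator by $\psi > 0$. Therefore, prices
and inflation are stable in this economy if adaptive expectations are formed over price levels. This situation is not so interesting for
our current purposes.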
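The claim that the coefficient on the lagged price is always below one can be spot-checked on a grid of parameter values. This is a sketch under the assumption that both $\psi$ and $\lambda$ lie strictly between 0 and 1; the function name and the grid are illustrative choices made here.

```python
def price_level_root(psi, lam):
    """Coefficient on p_{t-1} when adaptive expectations are formed over price levels."""
    return (1 - psi) * (1 - lam) / (1 - (1 - psi) * lam)


# Assumed grid: psi and lam each take the values 0.1, 0.2, ..., 0.9.
grid = [k / 10 for k in range(1, 10)]
roots = [price_level_root(psi, lam) for psi in grid for lam in grid]

print("smallest root on the grid:", min(roots))
print("largest root on the grid: ", max(roots))
print("all roots strictly between 0 and 1:", all(0 < r < 1 for r in roots))
```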

Consider now how things change if we have adaptive expectations not for price levels but for
inflation:
\[ \pi^e_{t+1} = (1-\lambda)\pi^e_t + \lambda\pi_t. \]
Suppose again that at time $t-1$ prices were stable, so that inflation was equal to 0. This implies
that
\[ \pi^e_{t+1} = (1-\lambda)\left[(1-\lambda)\pi^e_{t-1} + \lambda\pi_{t-1}\right] + \lambda\pi_t = \lambda\pi_t, \]
or equivalently,
\[ p^e_{t+1} - p_t = \lambda\left(p_t - p_{t-1}\right). \]
Therefore,
\[ p^e_{t+1} = (1+\lambda) p_t - \lambda p_{t-1}. \]
This can drastically change the solution of the model. To see that, plug this expression back into
(1) and obtain
\[ p_t = \frac{\psi}{1 - (1-\psi)(1+\lambda)} m_t - \frac{(1-\psi)\lambda}{1 - (1-\psi)(1+\lambda)} p_{t-1}. \]
The general solution of this difference equation is
\[ p_t = \left[-\frac{(1-\psi)\lambda}{1 - (1-\psi)(1+\lambda)}\right]^{t} p_0 + \frac{\psi}{1 - (1-\psi)(1+\lambda)} \sum_{\tau=0}^{t} \left[-\frac{(1-\psi)\lambda}{1 - (1-\psi)(1+\lambda)}\right]^{t-\tau} m_\tau. \]

For the current purposes we know the model sufficiently well. Let's focus on the main novelty
in terms of inference which Cagan introduced. Suppose that $p_0 = 0$, that the economy is at $p_0$ at time $t$,
and that $m_\tau \equiv 0$. Therefore, $p_t = p_0$ for any $t$. In this situation the money supply is constant and equal
to 1 in levels (since $m = \ln M = 0$). The economy, price levels, and inflation are stable:
\[ p_t = 0, \qquad \pi_t = 0. \]

Consider a deviation from this situation. Suppose the government in this economy
raises the money supply at the beginning of time $t = 0$, so that $m_0 > 0$, and keeps $m \equiv 0$ for the rest of
the periods. Perhaps it does so in order to finance its deficit. In such a case
\[ p_t = \frac{\psi}{1 - (1-\psi)(1+\lambda)} \left[-\frac{(1-\psi)\lambda}{1 - (1-\psi)(1+\lambda)}\right]^{t} m_0. \]
Therefore, if $\left|-\frac{(1-\psi)\lambda}{1-(1-\psi)(1+\lambda)}\right| < 1$, prices converge back to 0 over time. However, if
$\left|-\frac{(1-\psi)\lambda}{1-(1-\psi)(1+\lambda)}\right| > 1$, then prices diverge to infinity. This happens even though only $m_0 > 0$, so that there is no
continual expansion of the money supply. This is the "possible" destabilizing effect of (forward looking)
expectations. Inflation in this case is given by
\[ \pi_t = -\frac{\psi}{1 - (1-\psi)(1+\lambda)} \left[\frac{(1-\psi)\lambda}{1 - (1-\psi)(1+\lambda)} + 1\right] \left[-\frac{(1-\psi)\lambda}{1 - (1-\psi)(1+\lambda)}\right]^{t-1} m_0. \]
Clearly, when $\left|-\frac{(1-\psi)\lambda}{1-(1-\psi)(1+\lambda)}\right| > 1$, inflation is also ever growing in absolute terms, i.e., spiralling
out of control.
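To get a feel for which parameter values put the economy on the explosive side, the root and the price path after a one-off injection $m_0$ can be computed directly. The sketch below is illustrative: the function names and the two parameter pairs are assumptions chosen here, and the formula is undefined when $1 - (1-\psi)(1+\lambda) = 0$.

```python
def inflation_root(psi, lam):
    """Root of the price equation when adaptive expectations are formed over inflation."""
    denom = 1 - (1 - psi) * (1 + lam)   # must be nonzero
    return -(1 - psi) * lam / denom


def price_path_after_injection(psi, lam, m0, periods):
    """Log price p_t for t = 0..periods-1 after a one-off money injection m0 at t = 0."""
    denom = 1 - (1 - psi) * (1 + lam)
    root = inflation_root(psi, lam)
    return [psi / denom * root ** t * m0 for t in range(periods)]


# Two assumed parameter pairs: one damped, one explosive.
damped = price_path_after_injection(psi=0.8, lam=0.5, m0=0.01, periods=10)
explosive = price_path_after_injection(psi=0.5, lam=0.9, m0=0.01, periods=10)

print("root (psi = 0.8, lam = 0.5):", inflation_root(0.8, 0.5))
print("root (psi = 0.5, lam = 0.9):", inflation_root(0.5, 0.9))
print("damped path, first and last:   ", damped[0], damped[-1])
print("explosive path, first and last:", explosive[0], explosive[-1])
```

In the second case a tiny injection of 0.01 is enough to send the log price level on an ever-widening oscillation, even though the money supply returns to its original level immediately.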

The main novelty here can be summarized as follows. If expectations are formed over inflation, then for certain
parameter values it is enough to slightly increase the money supply for a very short period to have ever
increasing prices and inflation. This is in contrast with the
common line of thought that money supply has to increase permanently in order to have ever
increasing inflation and prices. Moreover, it suggests that the fiscal authority (central government) and
the monetary authority (central bank) should be independent, so that the government is not able to
finance its deficit by printing money. This independence is implemented in many countries as of now
and is even part of the constitutions of some of those countries.

Once again, what would happen if we had rational expectations in this model? The rational expectations
assumption means that the agents use the model to form their expectations. Therefore they'll
use the following equation for forming expectations:
\[ p_t = \psi m_t + (1-\psi) p^e_{t+1}. \]
Replace $p^e_{t+1}$ with $E_t[p_{t+1}]$ and rewrite this equation as
\[ p_t = \psi m_t + (1-\psi) E_t[p_{t+1}]. \]


Take expectations conditional on the information available at time $t-1$:
\[ E_{t-1}[p_t] = \psi E_{t-1}[m_t] + (1-\psi) E_{t-1}\left[E_t[p_{t+1}]\right]. \]
According to the Law of Iterated Expectations,
\[ E_{t-1}\left[E_t[p_{t+1}]\right] = E_{t-1}[p_{t+1}]. \]
Let's assume that $m_t$ is deterministic. Therefore, we have a difference equation in terms of expectations,
\[ E_{t-1}[p_t] = \psi m_t + (1-\psi) E_{t-1}[p_{t+1}]. \]
Since the coefficient in front of the lead variable, $1-\psi$, is less than one, solving for the lead gives
$E_{t-1}[p_{t+1}] = \frac{1}{1-\psi}\left(E_{t-1}[p_t] - \psi m_t\right)$ with $\frac{1}{1-\psi} > 1$, so that, absent further restrictions, the expected price
tends to infinity in this model. Therefore, under the rational expectations assumption alone we don't have
convergence (stabilization) of the economy. Prices and their expected values diverge to infinity.
This situation is called self-fulfilling inflation. Agents expect to have high inflation. Prices rise
accordingly, and the agents' expectations are fulfilled.

If, however, we impose the additional condition that
\[ \lim_{\tau \to +\infty} (1-\psi)^{\tau-1} E_t[p_{t+\tau}] = 0, \]
then we will have stability in this model. This condition is sometimes called the "no bubble" condition,
in the sense that it makes sure that expected prices do not become infinitely large.

Our main equation implies that
\[ p_t = \psi m_t + (1-\psi) E_t[p_{t+1}], \]
\[ E_t[p_{t+1}] = \psi m_{t+1} + (1-\psi) E_t[p_{t+2}], \]
\[ E_t[p_{t+2}] = \psi m_{t+2} + (1-\psi) E_t[p_{t+3}], \]
\[ \vdots \]
Plugging back $E_t[p_{t+1}]$, $E_t[p_{t+2}]$, etc., gives
\[ p_t = \psi m_t + (1-\psi)\psi m_{t+1} + (1-\psi)^2 \psi m_{t+2} + (1-\psi)^3 E_t[p_{t+3}]. \]
Iterating to infinity, we have
\[ p_t = \psi \sum_{\tau=0}^{+\infty} (1-\psi)^{\tau} m_{t+\tau}. \]

Therefore, with the "no bubble" condition prices are forward looking and can be stationary. That prices
are forward looking means that they take into account future changes in policy (money supply).$^{10}$
Basically, this is the Lucas Critique of models which do not feature rational expectations. In those
models changes in policy parameters do not affect expectations and therefore do not affect current actions and
economic outcomes. However, if expectations are rational, then policy changes can affect actions
and economic outcomes immediately and therefore change the environment in ways that
policy-makers might not have anticipated.

10 This is one of the reasons why it might be important to announce policies well before their implementation.

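Suppose that $m \equiv 0$; then $p_t = 0$. If $m$ is constant and equal to $\bar{m} > 0$, then by the formula for
a geometric progression we get
\[ p_t = \bar{m}\,\psi \sum_{\tau=0}^{+\infty} (1-\psi)^{\tau} = \bar{m}. \]
This holds for any $t$. Notice that in this case $p_t$ is constant, which is in sharp contrast to the case
of adaptive expectations.

If instead, for example, $m_{t+1} = m_{t+2} = \tilde{m}$ and $m_{t+k} = \bar{m}$ for any $k > 2$, then by the same
logic
\[ p_{t+3} = \bar{m}, \]
\[ p_{t+2} = \psi\left[\tilde{m} + \bar{m}\sum_{\tau=1}^{+\infty} (1-\psi)^{\tau}\right] = \psi\tilde{m} + (1-\psi)\bar{m}, \]
\[ p_{t+1} = \psi\left[\tilde{m} + (1-\psi)\tilde{m} + \bar{m}\sum_{\tau=2}^{+\infty} (1-\psi)^{\tau}\right] = \left[\psi + \psi(1-\psi)\right]\tilde{m} + (1-\psi)^2\bar{m}. \]
This implies that if $m_0 > 0$ and $m \equiv 0$ for the rest of the periods, then $p_0 = \psi m_0$ and $p_t = 0$ for
any $t > 0$.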
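The forward-looking price formula is easy to evaluate for any announced money path. The sketch below is illustrative: the function `re_price` and its arguments are hypothetical names, the path is given as a finite list followed by a constant level forever, and $\psi = 0.4$ is an arbitrary assumed value. The infinite tail is summed in closed form using the geometric series.

```python
def re_price(psi, announced_money, tail_money=0.0):
    """Rational-expectations price p_t = psi * sum_{tau >= 0} (1 - psi)^tau * m_{t+tau}.

    announced_money lists m_t, m_{t+1}, ...; after the list ends, money stays at tail_money.
    """
    head = sum(psi * (1 - psi) ** tau * m for tau, m in enumerate(announced_money))
    # psi * sum_{tau >= n} (1 - psi)^tau * tail_money = (1 - psi)^n * tail_money
    tail = (1 - psi) ** len(announced_money) * tail_money
    return head + tail


psi = 0.4  # assumed value in (0, 1)

constant_case = re_price(psi, [], tail_money=2.0)      # money constant at 2 forever
one_off_case = re_price(psi, [1.0], tail_money=0.0)    # m_0 = 1, zero afterwards
two_period_case = re_price(psi, [1.0, 1.0], tail_money=2.0)  # two periods at 1, then 2 forever

print("constant money:", constant_case)
print("one-off injection:", one_off_case)
print("two periods at one level, then another:", two_period_case)
```

With constant money the price equals the money level, and with a one-off injection the price equals $\psi m_0$ in the injection period, in line with the expressions above.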

Let's extend the model slightly and assume that money supply is not deterministic but follows
a random process
\[ m_t = \bar{m} + \eta_t, \]
where $\bar{m}$ is constant and $\eta$ is an i.i.d. random variable with mean 0 and constant variance $\sigma^2$. $\eta_t$
is the realization of $\eta$ at time $t$ and is known at time $t$. Therefore, $m_t$ is not a random
variable at time $t$. However, $m_{t+1}$ is a random variable at time $t$ since its realization is not known. The
assumption that $m_t = \bar{m} + \eta_t$ corresponds to assuming that money supply is subject to unanticipated
shocks. Where could such shocks come from? A source of such shocks in the model could be that
the government (central bank) does not fully announce its money supply policy and allows some
random (unpredictable) changes.

Maintaining the "no bubble" condition, the price can be rewritten as
\[ p_t = \psi \sum_{\tau=0}^{+\infty} (1-\psi)^{\tau} E_t[m_{t+\tau}], \]
where of course $E_t[m_t] = E_t[\bar{m} + \eta_t] = \bar{m} + \eta_t$, since the value of $\eta_t$ is known, and $E_t[m_{t+\tau}] = \bar{m}$ for $\tau \geq 1$, so that
\[ p_t = \psi\left(\bar{m} + \eta_t\right) + \psi\bar{m}\sum_{\tau=1}^{+\infty} (1-\psi)^{\tau} = \psi\left(\bar{m} + \eta_t\right) + (1-\psi)\bar{m}. \]
Notice that this is similar to what we previously obtained: since future realizations of $\eta$ are not predictable, they
don't matter for expectations and prices, and only the current shock $\eta_t$ enters $p_t$.

Appendix

If expectations are formed over prices, then prices are stationary. What is the value they converge to when
money supply is constant? Here is the answer to that question.

If money supply is constant at $m > 0$, $p_0 = 0$, and $\frac{(1-\psi)(1-\lambda)}{1-(1-\psi)\lambda} < 1$, then
\[ \lim_{t \to +\infty} p_t = \frac{\psi}{1 - (1-\psi)\lambda}\, m \lim_{t \to +\infty} \sum_{\tau=0}^{t} \left[\frac{(1-\psi)(1-\lambda)}{1 - (1-\psi)\lambda}\right]^{t-\tau}. \]
The sum is a geometric series in $\frac{(1-\psi)(1-\lambda)}{1-(1-\psi)\lambda}$, so
\[ \lim_{t \to +\infty} \sum_{\tau=0}^{t} \left[\frac{(1-\psi)(1-\lambda)}{1 - (1-\psi)\lambda}\right]^{t-\tau} = \frac{1}{1 - \frac{(1-\psi)(1-\lambda)}{1 - (1-\psi)\lambda}} = \frac{1 - (1-\psi)\lambda}{\psi}. \]
At the end of the time horizon, the price therefore converges to
\[ \lim_{t \to +\infty} p_t = \frac{\psi}{1 - (1-\psi)\lambda} \cdot \frac{1 - (1-\psi)\lambda}{\psi}\, m = m, \]
which is also the steady state of (1) when $p^e_{t+1} = p_t = p$.


The Lucas Imperfect Information Model with AD

The Lucas Imperfect Information Model provides us with an explanation of why the short-run aggregate
supply curve might be upward sloping rather than vertical.$^{11}$ Expectations matter for the dynamics
of prices and quantities in this model too. Prior to proceeding to the model, let's digress briefly and
review the AD-AS Model.

The AD-AS Model consists of two equations (curves): aggregate demand (AD) and aggregate
supply (AS). The aggregate demand curve is determined from the IS-LM Model. The IS-LM Model has two
equations: investment-savings (IS) and liquidity-money (LM, money demand). These equations are
\[ [IS]: \quad Y = C(Y, T) + I(r) + G, \]
\[ [LM]: \quad \frac{M}{P} = L(Y, r + \pi^e), \]
where $C$ is consumption, $I$ is investment, $G$ and $T$ are government spending and taxes (fiscal policy
parameters), and $M$ is money supply/demand (monetary policy parameter). $C$ increases with $Y$ and
declines with $T$. $I$ declines with $r$.

In a period, prices are assumed to be given. Money supply is also given. The central bank

controls it. Therefore, unknowns are Y and r in the IS-LM Model. Graphically this model can be

represented in the following manner.

The solutions of Y and r depend on price level P , fiscal and monetary policy parameters G,T ,

and M , and on expected inflation level, πe. The solution of Y is the aggregate demand curve,

AD = Y (P,G, T,M, πe) .

It declines with P . Negative relation stems from LM curve. This curve shifts up when prices

increase implying higher equilibrium interest rate and lower income.

The graphical representation of AD-AS model is

11 The Macro II course covers this model briefly. You can find a slightly different version of this model in Part A
of Chapter 6 of the Romer (2006) textbook.


In this figure, LRAS is the long-run aggregate supply (AS) curve and SRAS is the short-run aggregate
supply curve. The former is vertical since we assume that in the long run factor inputs are given and prices
are determined in the model. The latter, in turn, is upward sloping, and the lower its slope, the greater the
effect of shifts in aggregate demand on income. Shifts of aggregate demand $AD = Y(P, G, T, M, \pi^e)$
in this model can stem from changes in the policy parameters $G$, $T$, and $M$, and in the expected inflation level
$\pi^e$. The Lucas Imperfect Information Model provides us with an SRAS curve.

In the Lucas Imperfect Information Model, there are $N$ firms. Each firm $i$ ($i = 1, \dots, N$) is assumed to
face demand
\[ \chi^d_i = \frac{y}{N} + \zeta_i - \eta\left(p_i - p\right), \]
where $\frac{y}{N}$ is income spent on each good ($y = \ln Y$). $\zeta_i$ is a taste parameter which has a normal
distribution with mean 0 and is i.i.d. across firms. $p_i$ is the price of the product of firm $i$. $p$ is the
aggregate price level which consumers face. The aggregate price level is the average price in this model,
\[ p = \frac{1}{N}\sum_{i=1}^{N} p_i. \]
Consumers know this price level since they consume all types of goods produced in the
economy.

Further, each firm $i$ is assumed to have the following supply function
\[ \chi^s_i = \alpha + \beta\left(p_i - p^e\right), \]
where $\alpha$ and $\beta$ are positive parameters, $p_i$ is the price of the good of firm $i$, and $p^e$ is the firms' expected
aggregate price level. There is intra-temporal uncertainty here, which stems from the assumption
that firms cannot observe the prices (shocks) of their rivals. This is especially meaningful for
firms from different industries. Supply depends positively on the gap $p_i - p^e$, which is the
(expected) relative price of the firm. In this regard, for example, $p_i > p^e$ can happen when firm $i$
has received a positive shock to demand relative to others.

Market clearing requires that in equilibrium $\chi^s_i = \chi^d_i$,
\[ \alpha + \beta\left(p_i - p^e\right) = \frac{y}{N} + \zeta_i - \eta\left(p_i - p\right). \]


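Therefore, solving for the price of firm $i$ we get
\[ p_i = \frac{1}{\beta + \eta}\left(\frac{y}{N} + \zeta_i - \alpha\right) + \frac{1}{\beta + \eta}\left(\beta p^e + \eta p\right). \tag{2} \]
The aggregate supply is then
\[ y = N\left[\alpha + (\beta + \eta) p_i - \left(\beta p^e + \eta p\right) - \zeta_i\right]. \]
Take the average of this expression over firms and assume that $N$ is large. According to the Law of Large
Numbers, then, $\frac{1}{N}\sum_{i=1}^{N}\zeta_i = 0$ and
\[ y = N\left[\alpha + \beta\left(p - p^e\right)\right]. \]
If there is no uncertainty, which means that $p^e = p$, then aggregate supply is vertical,
\[ y = \alpha N. \]
This is the long-run aggregate supply curve (LRAS).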

The expected price is conditional on the observation of $p_i$,
\[ p^e = E[p \mid p_i], \tag{3} \]
so that firms can update their beliefs within a period. Since $\zeta_i$ drives the uncertainty in this
model and it has a normal distribution, $p$ has a normal distribution too. It can be shown that in
such a case the conditional expectation of $p$ is its expected value plus an error term which has zero
expected value,
\[ E[p \mid p_i] = E[p] + \theta\left(p_i - E[p]\right) = (1-\theta) E[p] + \theta p_i, \tag{4} \]
where $\theta > 0$, and the error of prediction is $p_i - E[p]$.

Take the expected value of (2) conditional on $p_i$:
\[ p_i = \frac{1}{\beta + \eta}\left(\frac{y}{N} + \zeta_i - \alpha\right) + E[p \mid p_i]. \tag{5} \]
Plug (4) into (5) and take the average of $p_i$,
\[ p = \frac{1}{(\beta + \eta)(1-\theta)}\left(\frac{y}{N} - \alpha\right) + E[p]. \]
Compute aggregate output, denoting $(\beta + \eta)(1-\theta)N = b$ and the long-run level of output by $\bar{y} = \alpha N$,
\[ y = \bar{y} + b\left(p - E[p]\right). \tag{6} \]


This is the short-run aggregate supply curve. Clearly, it is upward sloping: if $p > E[p]$, then
in the short run $y > \bar{y}$. Such a relation holds since if $p > E[p]$ then on average firms have received a
positive shock (their prices have turned out to be higher than the expected aggregate price) and have
increased output.

Combining the aggregate supply curve with the aggregate demand curve gives the equilibrium levels of price
and quantity. To keep the discussion as simple as possible, suppose that aggregate demand is given
by the Quantity Equation of Money
\[ PY = VM. \]
Suppose further that the velocity of money is equal to 1. The logarithm of aggregate demand
therefore is
\[ y = m - p. \]
Then prices can be determined from
\[ \bar{y} + b\left(p - E[p]\right) = m - p, \tag{7} \]
which is the usual market clearing condition.

Under rational expectations the agents use (7) to derive their expectations. To find them, take the
expected value of (7),
\[ E[p] = E[m] - \bar{y}. \]
Just as in the Cagan Model, therefore, agents' expectations of prices depend on the expectation of the nominal
money stock. Plugging this back into (7) and solving for $p$ gives the equilibrium price level
\[ p = \frac{b}{1+b} E[m] + \frac{1}{1+b} m - \bar{y}. \]
The aggregate price level in equilibrium is a weighted average of expected money supply $E[m]$ and
actual money supply $m$, net of $\bar{y}$. Plugging $p$ from this expression into the aggregate demand equation gives
the equilibrium level of output
\[ y = \bar{y} + \frac{b}{1+b}\left(m - E[m]\right). \]
This implies that under the rational expectations assumption short-run deviations of output from
its long-run level are possible only through unanticipated changes in money supply. If the money supply
rule is known, then simply $m - E[m] = 0$ and $y = \bar{y}$. Therefore, output fluctuations in this model
with rational expectations are driven purely by unanticipated changes in money supply.
Prices, though, respond to both anticipated and unanticipated changes in money supply.
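The equilibrium expressions can be evaluated for an anticipated and an unanticipated change in money supply. This is an illustrative sketch: the function name `lucas_equilibrium` and all numerical values (including $b = 2$ and long-run log output of 0.5) are assumptions chosen here for the example.

```python
def lucas_equilibrium(m, expected_m, b, y_bar):
    """Equilibrium log price and log output under rational expectations.

    p = b/(1+b) * E[m] + 1/(1+b) * m - y_bar
    y = y_bar + b/(1+b) * (m - E[m])
    """
    p = b / (1 + b) * expected_m + m / (1 + b) - y_bar
    y = y_bar + b / (1 + b) * (m - expected_m)
    return p, y


b, y_bar = 2.0, 0.5  # assumed slope parameter and long-run log output

baseline = lucas_equilibrium(m=1.0, expected_m=1.0, b=b, y_bar=y_bar)
anticipated = lucas_equilibrium(m=1.3, expected_m=1.3, b=b, y_bar=y_bar)
surprise = lucas_equilibrium(m=1.3, expected_m=1.0, b=b, y_bar=y_bar)

for label, (p, y) in (("baseline", baseline), ("anticipated", anticipated), ("surprise", surprise)):
    print(f"{label}: p = {p:.3f}, y = {y:.3f}, m - p = {1.0 - p if label == 'baseline' else 1.3 - p:.3f}")
```

An anticipated increase in money moves only the price, while the same increase arriving as a surprise is split between a higher price and higher output. In each case output equals $m - p$, so the aggregate demand equation holds.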

We have a version of the AD-AS Model where the aggregate demand curve is
\[ y = m - p, \]
which means that aggregate demand is assumed not to depend on fiscal policy. This is the reason
why the inference does not include policy parameters other than money supply.

The Lucas Imperfect Information Model with AD implies a very well known and much debated relationship
in macroeconomics which is called the Phillips Curve. The Phillips Curve illustrates the short-run
trade-off between inflation and output. To derive it, assume that long-run output is fixed and put
a time index in (6) to obtain
\[
\begin{aligned}
y_t &= \bar{y} + b\left(p_t - E_{t-1}[p_t]\right) \\
    &= \bar{y} + b\left[\left(p_t - p_{t-1}\right) - \left(E_{t-1}[p_t] - E_{t-1}[p_{t-1}]\right)\right] \\
    &= \bar{y} + b\left(\pi_t - E_{t-1}[\pi_t]\right).
\end{aligned}
\]
In short, we have an expectations-augmented Phillips Curve
\[ y_t = \bar{y} + b\left(\pi_t - E_{t-1}[\pi_t]\right). \]
When inflation is higher than expected, short-run output is higher than long-run output.
In this sense, if policy makers manage to fool the agents, they can increase short-run output.

Usually, however, the Phillips Curve is written as a trade-off between inflation and unemployment.
Assuming that output is inversely proportional to unemployment and using $\bar{u}$ to denote the long-run
level of unemployment, the Phillips Curve can be rewritten as
\[ u_t = \bar{u} - b\left(\pi_t - E_{t-1}[\pi_t]\right), \]
where $b > 0$ is a constant. Increasing unanticipated inflation then reduces unemployment.

The Sticky Wage Model

In Macroeconomics II you have also seen a version of the Sticky Wage Model. We will now consider

a version of that model adding microstructure which hasn’t been discussed in detail in the previous

course. We will show again that expectations matter for macroeconomic outcomes.

Suppose that there is a continuum of mass one of identical and infinitely lived consumers. Each

consumer is endowed with L units of time. The consumers can supply their time to firms at

wage rate w or use it for leisure. The consumers derive utility from consumption of N different
types of goods $\{x_j\}_{j=1}^{N}$ and have disutility from supplying labor. They have a constant elasticity of
substitution utility function of the form

\[
U(x_1, \dots, x_N, L - l) = \sum_{j=1}^{N} x_j^{\alpha} + \chi \ln(L - l),
\]

where α ∈ (0, 1) and $\frac{1}{1-\alpha}$ is the elasticity of substitution between different pairs of goods x.12 l is
the amount of labor that the representative consumer supplies to firms and L − l is its leisure
time. χ is a positive parameter.

12The elasticity of substitution between any pair $x_i$ and $x_k$ ($i \neq k$) of x is
\[
\frac{d \ln\left(\frac{x_i}{x_k}\right)}{d \ln\left(\frac{du(x_k)/dx_k}{du(x_i)/dx_i}\right)} = \frac{1}{1-\alpha}.
\]

The consumers maximize their utility subject to a budget constraint, taking the prices $\{p_{x_j}\}_{j=1}^{N}$ of goods
x and the wage rate as given. Therefore, the representative consumer solves the following problem in
each period of time

\[
\max_{\{x_j\}_{j=1}^{N},\, l} \; \sum_{j=1}^{N} x_j^{\alpha} + \chi \ln(L - l)
\]

s.t.

\[
0 = wl - \sum_{j=1}^{N} p_{x_j} x_j .
\]

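To solve this problem we use the Lagrangian

\[
\mathcal{L} = \max_{\{x_j\}_{j=1}^{N},\, l} \left\{ \sum_{j=1}^{N} x_j^{\alpha} + \chi \ln(L - l) + \lambda \left( wl - \sum_{j=1}^{N} p_{x_j} x_j \right) \right\}.
\]

Normalize λ = 1. The optimal rules are given by the first order conditions

\[
[x_j] : \; p_{x_j} = \alpha x_j^{\alpha-1},
\]
\[
[l] : \; w = \chi \frac{1}{L - l},
\]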

where the first one is the inverse demand function of good xj and the second one is the inverse

supply function of labor. Notice that there are actually N inverse demand functions because there

are N goods x. This means that we have N + 1 optimal rules.

Assume that effective wage rate is given by

\[
w = p^{e} f(u, z),
\]

where $p^e$ is the expected aggregate price level ($p^e = E\left[\frac{1}{N}\sum_{j=1}^{N} p_{x_j}\right]$), f is a function which decreases

in the first argument and increases in the second argument. u is the level of unemployment in the

economy, and z are institutional features in the economy such as unemployment benefits, minimum

wages, etc.

Suppose that there are N firms. Each firm produces a type of x good. Firms’ input for

production is labor and a unit of labor produces a unit of output in all firms. Firms are price

setters. Therefore, the problem of the firm which produces good xj is

\[
\max_{x_j} \; p_{x_j} x_j - w x_j
\]

s.t.

\[
p_{x_j} = \alpha x_j^{\alpha-1}.
\]

This implies that the inverse supply function is

\[
p_{x_j} = \frac{1}{\alpha} w.
\]

The inverse supply function implies that $p_{x_j} \equiv p$, and from the inverse demand function it follows that
in equilibrium all x goods are produced in the same quantities, $x_j \equiv x$.

Notice that in this framework, since firms are price setters, the price is higher than the marginal cost w.
As α tends to 1 goods become more similar/substitutable since $\frac{1}{1-\alpha}$ tends to infinity and $p_x$ tends
to the marginal cost.

Given that in equilibrium all firms produce the same amount of output they hire the same

amount of labor. Since there is a continuum of mass one of consumers, the total amount of labor

that firms hire is l. This implies that unemployment rate is

\[
u = \frac{L - l}{L} = 1 - \frac{l}{L}.\;{}^{13}
\]

Moreover, the production of a unit of good requires a unit of labor. The unemployment rate can then be
rewritten as

\[
u = 1 - \frac{Y}{L},
\]

where Y is total output, $Y = \sum_{j=1}^{N} x_j$.

The intersection of

\[
\frac{w}{p} = \frac{p^{e}}{p} f\left(1 - \frac{Y}{L}, z\right), \tag{8}
\]
\[
\frac{w}{p} = \alpha,
\]

gives the equilibrium level of output and prices. At the same time it gives the equilibrium level of unemployment:

\[
\alpha = \frac{p^{e}}{p} f\left(1 - \frac{Y}{L}, z\right). \tag{9}
\]

Suppose that the expected price relative to the actual price, $p^e/p$, increases. In such a case, in order for this
equation to continue to hold, $f\left(1 - \frac{Y}{L}, z\right)$ should decline. That will happen if unemployment increases/output declines.

13Although we treat this as unemployment rate, this is actually the non-participation rate. Unemployment rate would be the difference between participation rate in a non-distorted economy and the participation rate in this distorted economy.

Equation (9) holds in each period of time. Add time subscripts to it and write

\[
p_t = \frac{1}{\alpha} f\left(1 - \frac{Y_t}{L}, z\right) p_t^{e}.
\]

This is the aggregate supply curve. In the long run $p_t = p_t^e$, so that long-run output and unemployment
are given by

\[
\alpha = f\left(1 - \frac{Y}{L}, z\right).
\]

In the short run the price level $p_t$ is an increasing function of $Y_t$ and the expected price level $p_t^e$. Higher $Y_t$ in

this setup increases the wages paid according to (8). This increases the marginal costs of the firms

and therefore the prices that they charge. Higher expected price also increases wages according to

(8) and therefore results in higher actual prices.

If we assume for example that

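\[
f(u_t, z) = 1 - u_t + z = \frac{Y_t}{L} + z,
\]

then the short-run aggregate supply curve is given by

\[
p_t = \frac{1}{\alpha}\left(\frac{Y_t}{L} + z\right) p_t^{e},
\]

or

\[
Y_t = \left(\alpha \frac{p_t}{p_t^{e}} - z\right) L.
\]

The long-run aggregate supply curve will then be

\[
Y = (\alpha - z) L.
\]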
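The example aggregate supply curve can be evaluated numerically. The following is a minimal sketch, assuming the linear form f(u, z) = 1 − u + z used above; the function names and the parameter values are hypothetical and chosen only for illustration.

```python
def short_run_output(p, p_expected, alpha, z, L):
    """Short-run aggregate supply: Y_t = (alpha * p_t / p_t^e - z) * L."""
    return (alpha * p / p_expected - z) * L


def long_run_output(alpha, z, L):
    """Long-run aggregate supply, obtained by setting p_t = p_t^e."""
    return (alpha - z) * L


# Hypothetical parameter values (not taken from the notes).
alpha, z, L = 0.8, 0.1, 100.0

no_surprise = short_run_output(1.0, 1.0, alpha, z, L)
positive_surprise = short_run_output(1.05, 1.0, alpha, z, L)

print("long-run output:         ", long_run_output(alpha, z, L))
print("output, p equal to p^e:  ", no_surprise)
print("output, p above p^e:     ", positive_surprise)
```

When the actual price equals the expected price the two functions coincide; a price above its expected level raises short-run output above the long-run level.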


The Effectiveness of monetary policy (with fixed rules)

We have seen that expectations and their types can matter for aggregate economic performance.

They can matter for both the levels of nominal and real variables and for their dynamics. We will

now turn to more rigorous discussion of how monetary policies can matter for economic performance

given the type of expectations.

A model with stabilizing monetary policy

Often policy makers have an agenda of stabilizing the economy. They set fiscal and monetary policy

rules in order to achieve that. The reason they do so is that consumers/households are believed

to be risk averse. Therefore, they might be better-off if the economy is more stable. The intuition

behind such a result is that risk averse consumers prefer steady income over volatile income.

This section discusses properties and effects of a monetary policy rule which attempts to stabilize

the economy. The discussion follows a simplified version of the AD-AS Model.

Since the focus of the section is on monetary policy, aggregate demand (in logarithms) is sim-

plified to

yt = mt − pt + νt.

νt can be thought to be the velocity of money. In such a case, this equation is the Quantity Equation

of Money.

In the remainder, we will assume that νt is a random variable which follows a simple autoregressive
- AR(1) - process of the form

\[
\nu_t = \alpha \nu_{t-1} + \eta_t, \tag{10}
\]

where α ∈ (0, 1) is a parameter and ηt is an i.i.d. random variable with 0 mean and variance $\sigma^2$.

This process is called autoregressive because it has autocorrelation: its current value is correlated

with its previous value.

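Before we proceed with the model, let's discuss the properties of the random variable νt. Denote
with $\mu_\nu$ the unconditional expected value of νt, E [νt]. It is straightforward to see that $\mu_\nu$ is equal
to 0,

\[
E[\nu_t] = \alpha E[\nu_{t-1}] + E[\eta_t],
\]
\[
\mu_\nu = \alpha \mu_\nu + 0.
\]

Denote with $\sigma_\nu^2$ the unconditional variance of νt, V [νt]. The unconditional variance of νt is

\[
\sigma_\nu^2 = E\left[(\nu_t - E[\nu_t])^2\right] = E\left[\nu_t^2\right].
\]

Plug $\alpha\nu_{t-1} + \eta_t$ for νt,

\[
E\left[\nu_t^2\right] = E\left[(\alpha\nu_{t-1})^2\right] + 2E\left[(\alpha\nu_{t-1})(\eta_t)\right] + E\left[\eta_t^2\right].
\]

Given that ηt is i.i.d., $E[(\alpha\nu_{t-1})(\eta_t)] = 0$, so that $\sigma_\nu^2 = \alpha^2\sigma_\nu^2 + \sigma^2$. Therefore,

\[
\sigma_\nu^2 = \frac{\sigma^2}{1 - \alpha^2}.
\]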
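The variance formula can be checked by iterating the variance recursion implied by the AR(1) process. This is a small sketch under the assumptions stated above (|α| < 1, i.i.d. shocks); the function names and the numbers are illustrative only.

```python
def ar1_variance_path(alpha, sigma2, v0=0.0, periods=200):
    """Iterate V[nu_t] = alpha^2 * V[nu_{t-1}] + sigma^2 starting from v0."""
    path = [v0]
    for _ in range(periods):
        path.append(alpha**2 * path[-1] + sigma2)
    return path


def ar1_stationary_variance(alpha, sigma2):
    """Fixed point of the recursion: sigma^2 / (1 - alpha^2)."""
    return sigma2 / (1.0 - alpha**2)


# Hypothetical values: alpha = 0.5, sigma^2 = 1.
path = ar1_variance_path(0.5, 1.0)
print("variance after 200 periods:", path[-1])
print("closed-form variance:      ", ar1_stationary_variance(0.5, 1.0))
```

Starting from any initial variance, the recursion converges to the closed-form value because α² < 1.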

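Conditional on information available at time t − 1, the mean and variance of νt are

\[
E_{t-1}[\nu_t] = \alpha E_{t-1}[\nu_{t-1}] + E_{t-1}[\eta_t] = \alpha\nu_{t-1}, \tag{11}
\]

and

\[
V_{t-1}[\nu_t] = E_{t-1}\left[(\nu_t - E_{t-1}[\nu_t])^2\right] = E_{t-1}\left[(\alpha\nu_{t-1} + \eta_t - \alpha\nu_{t-1})^2\right] = \sigma^2.
\]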

The conditional expectation formula for νt shows that a very important property of νt, as compared

to ηt, is that νt can be predicted using its previous values. The previous values of νt, in turn, depend

on the previous values of η, which are known numbers at time t.

Aggregate supply in this model is given by

yt = pt − wt,

where wt are wages. Aggregate supply increases with prices and declines with wages, which sum-

marize/represent the costs of the firms.

In this model, the long-run level of output, prices, and money supply are normalized to 1 so

that in the long-run their logarithms are

y = m = p = w = 0.

Agents are assumed to live for 2 periods. Their wage contracts fix wages for 2 periods. In

this sense contracts are for a long term and wages are sticky. Wages are fixed according to the

agents’ expectation of prices for 2 periods. At birth (denote by t − 2) they inherit information

about previous periods of time from their parents and require wage at time t according to

\[
w_t = \frac{1}{2}\left(p^{e}_{t|t-1} + p^{e}_{t|t-2}\right). \tag{12}
\]

The monetary authority pursues an agenda of stabilizing the economy. It tries to set money supply
so as to reduce the effect of shocks ν on output conditional on the information it has. The shock
arrives after monetary policy has been implemented. In other words, the monetary policy authority sets
money supply at the beginning of time t not knowing the realization of the shock η at time t, but
knowing its realizations in earlier periods. In this sense the monetary policy authority does not have
superior information as compared to the other members of the economy (the agents, firms, etc).

Suppose that monetary policy rule/money supply is

\[
m_t = \beta h_t + (1 - \beta) h_{t-1}, \tag{13}
\]

where β ∈ [0, 1] and

\[
h_t = \gamma \nu_{t-1},
\]

and γ < 0. With this policy rule the monetary policy authority can reduce the expected shock
$E_{t-1}[\nu_t] = \alpha\nu_{t-1}$ since when $\nu_{t-1}$ increases money supply declines.

In equilibrium aggregate demand is equal to aggregate supply,

pt − wt = mt − pt + νt.

Therefore,

\[
p_t = \frac{1}{2}(m_t + w_t + \nu_t), \tag{14}
\]
\[
y_t = \frac{1}{2}(m_t - w_t + \nu_t). \tag{15}
\]

Plugging monetary policy rule (13) and wages (12) into output equation gives

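\[
y_t = \frac{1}{2}\left[\beta\gamma\nu_{t-1} + (1-\beta)\gamma\nu_{t-2} + \nu_t - \frac{1}{2}\left(p^{e}_{t|t-1} + p^{e}_{t|t-2}\right)\right].
\]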

In this expression, νt−1 and νt−2 are known at time t.

The dynamic path of the model is given by

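\[
p_t = \frac{1}{2}\left[\beta\gamma\nu_{t-1} + (1-\beta)\gamma\nu_{t-2} + \nu_t + \frac{1}{2}\left(p^{e}_{t|t-1} + p^{e}_{t|t-2}\right)\right],
\]
\[
\nu_t = \alpha\nu_{t-1} + \eta_t,
\]
\[
y_t = \frac{1}{2}\left[\beta\gamma\nu_{t-1} + (1-\beta)\gamma\nu_{t-2} + \nu_t - \frac{1}{2}\left(p^{e}_{t|t-1} + p^{e}_{t|t-2}\right)\right].
\]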

We have, therefore, unknowns yt, νt, pt, and pet|t−1 and pet|t−2 and need equations for the latter two.

Suppose that agents have rational expectations in this model. In such a case, expected prices

need to be figured out from the model and

\[
p^{e}_{t|t-1} = E_{t-1}[p_t], \qquad p^{e}_{t|t-2} = E_{t-2}[p_t].
\]

Take the expectation of price pt (14) conditional on information at time t− 1,

\[
E_{t-1}[p_t] = \frac{1}{2}\left(E_{t-1}[m_t] + E_{t-1}[w_t] + E_{t-1}[\nu_t]\right).
\]

According to (13) and (11) the conditional expectation of mt is

\[
E_{t-1}[m_t] = \beta\gamma E_{t-1}[\nu_{t-1}] + (1-\beta)\gamma E_{t-1}[\nu_{t-2}] = \beta\gamma\nu_{t-1} + (1-\beta)\gamma\nu_{t-2}.
\]

Et−2 [pt] is a given number at time t−1, which means that Et−1 [Et−2 [pt]] = Et−2 [pt]. According

to (12), then, the conditional expectation of the wage rate is given by

\[
E_{t-1}[w_t] = \frac{1}{2}E_{t-1}\left[E_{t-1}[p_t] + E_{t-2}[p_t]\right] = \frac{1}{2}\left(E_{t-1}[p_t] + E_{t-2}[p_t]\right).
\]

Therefore, the expected value of pt conditional on information from time t− 1 is

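\[
E_{t-1}[p_t] = \frac{1}{2}\left[(\alpha + \beta\gamma)\nu_{t-1} + (1-\beta)\gamma\nu_{t-2} + \frac{1}{2}\left(E_{t-1}[p_t] + E_{t-2}[p_t]\right)\right],
\]

or equivalently,

\[
E_{t-1}[p_t] = \frac{2}{3}\left[(\alpha + \beta\gamma)\nu_{t-1} + (1-\beta)\gamma\nu_{t-2} + \frac{1}{2}E_{t-2}[p_t]\right]. \tag{16}
\]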

This means that expected price is a function of previous realizations of shock and expectations.

We still need an equation for Et−2 [pt]. Take the expectation of price pt (14) conditional on

information at time t− 2,

\[
E_{t-2}[p_t] = \frac{1}{2}\left(E_{t-2}[m_t] + E_{t-2}[w_t] + E_{t-2}[\nu_t]\right).
\]

According to (10), the expected value of νt conditional on information at time t− 2 is

\[
E_{t-2}[\nu_t] = E_{t-2}\left[\alpha^2\nu_{t-2} + \alpha\eta_{t-1} + \eta_t\right] = \alpha^2\nu_{t-2}.
\]

Conditional expectation of monetary policy rules is

\[
E_{t-2}[m_t] = \beta\gamma E_{t-2}[\nu_{t-1}] + (1-\beta)\gamma E_{t-2}[\nu_{t-2}] = [1 - (1-\alpha)\beta]\gamma\nu_{t-2}.
\]

In turn, according to (12) and the Law of Iterated Expectations, the conditional expectation of the wage
rate is

\[
E_{t-2}[w_t] = \frac{1}{2}E_{t-2}\left[E_{t-1}[p_t] + E_{t-2}[p_t]\right] = E_{t-2}[p_t].
\]

Therefore,

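\[
E_{t-2}[p_t] = \frac{1}{2}\left\{[1 - (1-\alpha)\beta]\gamma\nu_{t-2} + E_{t-2}[p_t] + \alpha^2\nu_{t-2}\right\},
\]

or equivalently,

\[
E_{t-2}[p_t] = \left[(1-\beta)\gamma + \alpha\beta\gamma + \alpha^2\right]\nu_{t-2}.
\]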

The expected value of price conditional on time t− 1 (16) can be rewritten as

\[
E_{t-1}[p_t] = \frac{2}{3}(\alpha + \beta\gamma)\nu_{t-1} + \frac{1}{3}\left[3(1-\beta)\gamma + \alpha\beta\gamma + \alpha^2\right]\nu_{t-2}.
\]

In turn, output (15) can be rewritten as

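\[
\begin{aligned}
y_t ={}& \frac{1}{2}\left[(\alpha + \beta\gamma)\nu_{t-1} + (1-\beta)\gamma\nu_{t-2}\right] + \frac{1}{2}\eta_t \\
&- \frac{1}{4}\left\{\frac{2}{3}(\alpha + \beta\gamma)\nu_{t-1} + \frac{1}{3}\left[3(1-\beta)\gamma + \alpha\beta\gamma + \alpha^2\right]\nu_{t-2}\right\} \\
&- \frac{1}{4}\left[\gamma(1-\beta) + \alpha\beta\gamma + \alpha^2\right]\nu_{t-2}.
\end{aligned}
\]

Group together the coefficients of $\nu_{t-1}$ and $\nu_{t-2}$ and use the condition that $\nu_{t-1} = \alpha\nu_{t-2} + \eta_{t-1}$ to
rewrite output as

\[
y_t = \frac{1}{3}(\alpha + \beta\gamma)\eta_{t-1} + \frac{1}{2}\eta_t.
\]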

The first term in this expression is the effect of shocks in previous periods on output. Monetary
policy, for example, can eliminate their effect by setting $\gamma = -\frac{\alpha}{\beta}$ (for β > 0), so that α + βγ = 0, and thereby reduce the effect of shocks on
output. In such a case

\[
y_t = \frac{1}{2}\eta_t,
\]

where ηt is shock in period t which is not observed by the monetary policy authority before it sets

and runs the policy rule. Policy rule, therefore, cannot affect ηt.

Such a policy rule implies that the variance of output is

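\[
V[y_t] = \frac{1}{4}\sigma^2.
\]

Notice that if γ is not selected to eliminate $\eta_{t-1}$ then

\[
V[y_t] = \frac{1}{4}\sigma^2 + \left[\frac{1}{3}(\alpha + \beta\gamma)\right]^2\sigma^2,
\]

which is larger than $\frac{1}{4}\sigma^2$. In this sense this policy rule stabilizes the economy by eliminating some of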

the influence of shocks. In particular, and importantly, it eliminates the effect of the previous

realizations of η, ηt−1. It is able to do so because the value of ηt−1 is known when the policy is

implemented. In turn, the value of ηt−1 matters for wages, prices, and output because wages are

sticky. They are indexed also to what has happened at time t− 1. Wages become indexed because

they depend on Et−2 [pt], which in turn depends on an autocorrelated random process νt. The

values of νt are autocorrelated and can be predicted using previous information (νt−1, ηt−1, etc).
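The reduced form for output can be checked numerically against the structural equations. The sketch below is an illustration rather than part of the derivation: it assumes a zero pre-sample value of ν, draws hypothetical shocks with a fixed seed, and the function name and parameter values are made up for the example.

```python
import random


def simulate_output(alpha, beta, gamma, eta):
    """Return structural and reduced-form output for t >= 2.

    eta is a list of shock realizations. nu starts at nu[0] = eta[0],
    i.e. the pre-sample value of nu is assumed to be zero.
    """
    nu = [eta[0]]
    for t in range(1, len(eta)):
        nu.append(alpha * nu[t - 1] + eta[t])  # AR(1) process (10)

    structural, reduced = [], []
    for t in range(2, len(eta)):
        # Money supply rule (13) with h_t = gamma * nu_{t-1}.
        m = beta * gamma * nu[t - 1] + (1 - beta) * gamma * nu[t - 2]
        # E_{t-2}[p_t] and E_{t-1}[p_t] (equation (16)).
        e2 = ((1 - beta) * gamma + alpha * beta * gamma + alpha**2) * nu[t - 2]
        e1 = (2 / 3) * ((alpha + beta * gamma) * nu[t - 1]
                        + (1 - beta) * gamma * nu[t - 2] + 0.5 * e2)
        w = 0.5 * (e1 + e2)                       # wage contract (12)
        structural.append(0.5 * (m - w + nu[t]))  # output (15)
        reduced.append((alpha + beta * gamma) * eta[t - 1] / 3 + 0.5 * eta[t])
    return structural, reduced


random.seed(0)
shocks = [random.gauss(0.0, 1.0) for _ in range(50)]
ys, yr = simulate_output(alpha=0.6, beta=0.5, gamma=-0.4, eta=shocks)
print("largest gap between the two expressions:",
      max(abs(a - b) for a, b in zip(ys, yr)))
```

With γ = −α/β the reduced form collapses to one half of the current shock, which is the stabilization result discussed above.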

From the aggregate supply equation it follows that

\[
p_t - w_t = \frac{1}{2}\eta_t.
\]

This implies that positive shocks ηt reduce the real wage rate $w_t - p_t$.

What would happen if the monetary policy authority knew the realization of ηt before setting
and running its policy and had an agenda to stabilize the economy? The answer turns out to be
straightforward. If it sets

\[
m_t = -\nu_t,
\]

then there are no shocks in this economy and no uncertainty. Aggregate output does not fluctuate
and is at its long-run level, 1 (so that its logarithm is 0):

\[
w = p, \qquad y = p - w, \qquad p = -y.
\]

It seems natural to derive also monetary policy rule in case when wages are not sticky and are

set for each period. In such a case, the model can be summarized as

[AD] : yt = mt − pt + νt,

[AS] : yt = pt − wt,

[Wages] : wt = Et−1 [pt] , (17)

[Shock] : νt = ανt−1 + ηt.

Therefore, in equilibrium

\[
p_t = \frac{1}{2}(m_t + w_t + \nu_t),
\]
\[
y_t = \frac{1}{2}(m_t - w_t + \nu_t).
\]

To find the equilibrium level of output, derive the expected level of prices and wages using (17),

\[
w_t = E_{t-1}[p_t] = (\alpha + \beta\gamma)\nu_{t-1} + (1-\beta)\gamma\nu_{t-2}.
\]

Therefore,

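\[
\begin{aligned}
y_t &= \frac{1}{2}\left\{\beta\gamma\nu_{t-1} + (1-\beta)\gamma\nu_{t-2} - \left[(\alpha + \beta\gamma)\nu_{t-1} + (1-\beta)\gamma\nu_{t-2}\right] + \alpha\nu_{t-1} + \eta_t\right\} \\
&= \frac{1}{2}\eta_t.
\end{aligned}
\]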

Clearly, yt in this case does not depend on money supply, which implies that money supply

rule selected above cannot reduce variability of output. Is this a general inference for any money

supply rule? Yes. To see that take again expected price level now assuming that we have a general

(deterministic) money supply rule mt

\[
w_t = E_{t-1}[p_t] = m_t + \alpha\nu_{t-1}.
\]

This implies that aggregate output is

\[
y_t = \frac{1}{2}\eta_t.
\]

Monetary policy has no effect in this general case too. Therefore, there is nothing to derive/any

rule suits. The reason why this happens is that there are no rigidities since wages adjust in a period

and the values of the previous shocks (which could have been eliminated) don’t matter. Higher

money supply simply alters prices.

Another possible extension of the model is to consider static expectations wt = pt−1. In such a

case in equilibrium it has to be that

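\[
\begin{aligned}
p_t &= \frac{1}{2}(m_t + w_t + \nu_t), \\
y_t &= \frac{1}{2}(m_t - w_t + \nu_t), \\
w_t &= p_{t-1}, \\
\nu_t &= \alpha\nu_{t-1} + \eta_t, \\
m_t &= \beta\gamma\nu_{t-1} + (1-\beta)\gamma\nu_{t-2}.
\end{aligned}
\]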

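From the equations for $p_t$ and $w_t$ it follows that

\[
p_t = \left(\frac{1}{2}\right)^{t} p_0 + \frac{1}{2}\sum_{\tau=0}^{t}\left(\frac{1}{2}\right)^{t-\tau}(m_\tau + \nu_\tau).
\]

In turn, from $\nu_t = \alpha\nu_{t-1} + \eta_t$ it follows that

\[
\nu_t = \alpha^{t}\nu_0 + \sum_{\tau=0}^{t}\alpha^{t-\tau}\eta_\tau.
\]

Assume that $\nu_0 = \eta_0 = 0$.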

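Aggregate output, therefore, is

\[
\begin{aligned}
y_t ={}& \frac{1}{2}\left\{\left[(\alpha + \beta\gamma)\alpha + (1-\beta)\gamma\right]\sum_{\tau=0}^{t-2}\alpha^{t-2-\tau}\eta_\tau + (\alpha + \beta\gamma)\eta_{t-1}\right\} \\
&- \frac{1}{2}\cdot\frac{1}{2}\sum_{\tau=2}^{t-1}\left(\frac{1}{2}\right)^{t-1-\tau}\left\{\left[(\alpha + \beta\gamma)\alpha + (1-\beta)\gamma\right]\left(\sum_{k=0}^{\tau-2}\alpha^{\tau-2-k}\eta_k\right) + (\alpha + \beta\gamma)\eta_{\tau-1} + \eta_\tau\right\} \\
&- \frac{1}{2}\left(\frac{1}{2}\right)^{t-1}p_0 + \frac{1}{2}\eta_t.
\end{aligned}
\]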

The exercise which computes the variance of aggregate output requires tedious algebra. Instead

of that notice that in this case too monetary policy can eliminate some of the effect of previous

shocks on output setting α + βγ = 0 or (α+ βγ)α + (1− β) γ = 0. In this version of the setup

monetary policy matters for aggregate output since, again, wages are rigid. They are indexed to

previous levels of prices. Monetary policy alters the prices within period which alters the output of

firms but keeps their costs (wages) fixed.

A model with constant growth of money

In the previous (sub-)section we considered a policy which attempted to stabilize the economy. In

this (sub-)section we consider a different policy and keep the setup of the economy unchanged. The

policy that we consider bears the name of its main proponent: Milton Friedman. Friedman Rule

(or Policy) is monetary policy which makes nominal interest rate zero. Consider Fisher Equation,

i = r + π.

Friedman Rule then sets

π = −r.

A motivation behind such a policy can be derived from cash-in-advance models, which are too
advanced to be covered in this course. In short, in these models money is an inferior asset
in the sense that it does not earn interest, whereas bonds earn nominal interest i. The agents are
forced to keep money since they use it for their purchases. Intuitively, though, such a policy works
since it manages to set the nominal interest rate to 0 and eliminates the difference between keeping
money and bonds. Therefore, it makes money less inferior (or not inferior at all).

Assume that output and velocity of money are constant. In such a case, from the Quantity

Equation,

v +m = p+ y,

it follows that inflation is equal to the growth rate of money

πt = mt −mt−1.


Suppose that real interest rate is constant and denote −r = g. Therefore, in terms of this exposition

Friedman Rule is

mt = mt−1 + g.

This is the monetary policy rule which we consider in this section. The remainder of the model is

[AD] : yt = mt − pt + νt,

[AS] : yt = pt − wt,

[Wages] : $w_t = \frac{1}{2}\left(E_{t-1}[p_t] + E_{t-2}[p_t]\right)$,

[Shock] : νt = ανt−1 + ηt.

From [AD] and [AS] it follows that

\[
p_t = \frac{1}{2}(m_t + w_t + \nu_t), \tag{18}
\]
\[
y_t = \frac{1}{2}(m_t - w_t + \nu_t). \tag{19}
\]

Assuming rational expectations and using [Wages] gives

\[
\begin{aligned}
E_{t-1}[p_t] &= \frac{1}{2}\left(E_{t-1}[m_t] + E_{t-1}[w_t] + E_{t-1}[\nu_t]\right) \\
&= \frac{1}{2}\left[m_{t-1} + g + \frac{1}{2}\left(E_{t-1}[p_t] + E_{t-2}[p_t]\right) + \alpha\nu_{t-1}\right] \\
&= \frac{1}{2}\left[m_t + \frac{1}{2}\left(E_{t-1}[p_t] + E_{t-2}[p_t]\right) + \alpha\nu_{t-1}\right].
\end{aligned} \tag{20}
\]

Therefore,

\[
E_{t-1}[p_t] = \frac{2}{3}\left(m_t + \frac{1}{2}E_{t-2}[p_t] + \alpha\nu_{t-1}\right). \tag{21}
\]

Notice that Et−1 [mt] = mt−1 + g since money supply follows a deterministic/fixed rule.

To find the expected price level conditional on information available at time t− 2, consider

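\[
\begin{aligned}
E_{t-2}[p_t] &= \frac{1}{2}\left(E_{t-2}[m_t] + E_{t-2}[w_t] + E_{t-2}[\nu_t]\right) \\
&= \frac{1}{2}\left(m_t + E_{t-2}[p_t] + \alpha^2\nu_{t-2}\right).
\end{aligned}
\]

Therefore,

\[
E_{t-2}[p_t] = m_t + \alpha^2\nu_{t-2}.
\]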

Plugging back into the expression for expected price (20) gives

\[
E_{t-1}[p_t] = \frac{2}{3}\left(\frac{3}{2}m_t + \alpha\nu_{t-1} + \frac{1}{2}\alpha^2\nu_{t-2}\right),
\]

and from [Wages] equation it follows that

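\[
\begin{aligned}
w_t &= m_t + \frac{1}{3}\alpha\nu_{t-1} + \frac{2}{3}\alpha^2\nu_{t-2} \\
&= m_t + \alpha\nu_{t-1} - \frac{2}{3}\alpha\eta_{t-1}.
\end{aligned}
\]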

According to (19), income level then is

\[
y_t = \frac{1}{2}\eta_t + \frac{1}{3}\alpha\eta_{t-1}.
\]

The second term in this expression highlights the persistence of shocks. This policy does not

eliminate ηt−1. We have relatively volatile output and expected inflation at rate g,

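\[
\begin{aligned}
E[\pi_t] = E[p_t - p_{t-1}] &= m_t - m_{t-1} + E\left(\frac{2}{3}\alpha\nu_{t-1} + \frac{1}{3}\alpha^2\nu_{t-2}\right) - E\left(\frac{2}{3}\alpha\nu_{t-2} + \frac{1}{3}\alpha^2\nu_{t-3}\right) \\
&= m_t - m_{t-1} = g.
\end{aligned}
\]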
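As a worked step (a sketch that uses only the expression for output under constant money growth and the assumption that the shocks η are i.i.d. with variance σ²), the variance of output under this rule is

```latex
\[
V[y_t] = V\left[\frac{1}{2}\eta_t + \frac{1}{3}\alpha\eta_{t-1}\right]
       = \frac{1}{4}\sigma^2 + \frac{\alpha^2}{9}\sigma^2
       > \frac{1}{4}\sigma^2 .
\]
```

The extra term $\frac{\alpha^2}{9}\sigma^2$ is the contribution of the previous shock, which the stabilizing rule of the previous section removes when it sets α + βγ = 0.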

Consider a slight extension of the discussion. Suppose that the monetary policy authority/central
bank can cheat the agents and unexpectedly increase the growth rate of money supply from g to g′ (g′ > g). Suppose
for simplicity that there are no shocks in the economy, so that ηt = νt ≡ 0. This implies that
the prices that agents expect, and the prices that prevail absent a surprise, are

\[
E_{t-1}[p_t] = E_{t-2}[p_t] = p_t = m_{t-1} + g
\]

and wages are given by

wt = mt−1 + g.

Therefore, aggregate supply is at its long-run level yt = 0. This happens because there is no

uncertainty.

Aggregate demand is given by

\[
y_t = \frac{1}{2}\left[m_t - (m_{t-1} + g)\right].
\]

Therefore, if the monetary policy authority increases the growth of money supply to g′ then it
increases output at least for a short period,

\[
y_t = \frac{1}{2}(g' - g).
\]

Suppose policy makers’ agenda is to boost output. In such a case it would be tempting for
the policy makers to surprise the public with sudden increases of inflation. In other words, although the policy
makers might have announced a fixed monetary policy rule, they might want to deviate
from that commitment. In such a case, would the public believe that policy makers will stick to the
announced policy? The answer to the question is "No" (most probably). We will continue exploring

this commitment problem more formally in the next sections.

Discretionary monetary policy

Suppose now that the central bank’s agenda is to maximize output and minimize inflation. To

formalize that assume that the central bank has a loss function,

\[
L_t = -y_t + \frac{\lambda}{2}(p_t - p_{t-1})^2,
\]

which it attempts to minimize by choosing the monetary policy rule. Clearly this is equivalent to
maximizing $-L_t$. Therefore, the central bank’s optimal problem is

\[
-L_t = \max_{m_t}\left\{y_t - \frac{\lambda}{2}(p_t - p_{t-1})^2\right\}.
\]

Aggregate demand and supply are the same as in previous section. In equilibrium we have then

that

pt =1

2(mt + wt + νt) ,

yt =1

2(mt − wt + νt) .

In turn, assume that wages and shocks are given by

wt = pet|t−1,

νt = ανt−1 + ηt.

Suppose that the central bank sets its policy prior to observing the shock. However, it has an

important advantage in the sense that it can set its monetary policy rule conditional on the wages
that prevail in the economy. Therefore, the central bank minimizes the expected loss and solves

\[
-E_{t-1}[L_t] = \max_{m_t} E_{t-1}\left[\frac{1}{2}(m_t - w_t + \alpha\nu_{t-1} + \eta_t) - \frac{\lambda}{2}\left[\frac{1}{2}(m_t + w_t + \alpha\nu_{t-1} + \eta_t) - p_{t-1}\right]^2\right].
\]

The solution of this problem is given by

\[
\frac{\partial E_{t-1}[L_t]}{\partial m_t} = 0,
\]

which (in this setting) is equivalent to

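\[
E_{t-1}\left[\frac{\partial L_t}{\partial m_t}\right] = 0,
\]

or simply

\[
E_{t-1}\left[\frac{1}{2} - \frac{\lambda}{2}\left[\frac{1}{2}(m_t + w_t + \alpha\nu_{t-1} + \eta_t) - p_{t-1}\right]\right] = 0.
\]

Therefore,

\[
m_t = \frac{2}{\lambda} + 2p_{t-1} - w_t - \alpha\nu_{t-1}.
\]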

Plugging this monetary policy into the expressions for output and prices gives

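\[
p_t = \frac{1}{\lambda} + p_{t-1} + \frac{1}{2}\eta_t,
\]
\[
y_t = \frac{1}{\lambda} + p_{t-1} - w_t + \frac{1}{2}\eta_t.
\]

The expression for prices implies that expected inflation is

\[
E[\pi_t] = \frac{1}{\lambda}.
\]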

Therefore, increasing the weight on inflation in the loss function λ reduces inflation. In other words,

if central bank values more lower inflation then it sets monetary policy rule to make sure inflation

is lower.

In turn, in order to find output assume that the agents have rational expectations and derive

wage rate from the equation for prices.

\[
w_t = \frac{1}{\lambda} + p_{t-1}.
\]

Wages depend on parameter λ. This is because in this setting the monetary policy rule is contingent
on prevailing wages.

Therefore, output is given by

\[
y_t = \frac{1}{2}\eta_t.
\]

This monetary policy therefore does not limit the effect of shocks on output but affects inflation.
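The first order condition of the central bank can be illustrated numerically. The following is a minimal sketch, assuming the loss function and equilibrium conditions of this section; the function names and all numerical values are hypothetical.

```python
def discretionary_money_supply(lam, p_prev, w, alpha, nu_prev):
    """Money supply that solves the first order condition:
    m_t = 2/lambda + 2*p_{t-1} - w_t - alpha*nu_{t-1}."""
    return 2.0 / lam + 2.0 * p_prev - w - alpha * nu_prev


def expected_loss(m, lam, p_prev, w, alpha, nu_prev, sigma2):
    """E_{t-1}[L_t] with L_t = -y_t + (lambda/2) * (p_t - p_{t-1})^2.

    The shock eta_t has mean 0 and variance sigma2, so the expected squared
    price change is (expected change)^2 + sigma2 / 4.
    """
    expected_output = 0.5 * (m - w + alpha * nu_prev)
    expected_inflation = 0.5 * (m + w + alpha * nu_prev) - p_prev
    return -expected_output + 0.5 * lam * (expected_inflation**2 + 0.25 * sigma2)


# Hypothetical state of the economy and preferences.
lam, p_prev, w, alpha, nu_prev, sigma2 = 2.0, 1.0, 1.2, 0.5, 0.3, 1.0

m_star = discretionary_money_supply(lam, p_prev, w, alpha, nu_prev)
print("optimal money supply:", m_star)
print("expected inflation:  ", 0.5 * (m_star + w + alpha * nu_prev) - p_prev)
for m in (m_star - 0.1, m_star, m_star + 0.1):
    print("expected loss at m =", round(m, 3), ":",
          expected_loss(m, lam, p_prev, w, alpha, nu_prev, sigma2))
```

At the optimal money supply expected inflation equals 1/λ, and moving the money supply in either direction raises the expected loss.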

Political cycles and discretionary monetary policy

Consider a modification of the previous setup. Suppose that there are two political parties A and
B in the economy and it is an election year. Currently it is the middle of period t − 1 and elections are at the
end of period t − 1. Suppose these parties have different tastes for (weights on) inflation, $\lambda_A$ and $\lambda_B$,
and appoint the monetary policy authority accordingly. Party A has lower tolerance for inflation than
party B, $\lambda_A > \lambda_B$. The wages for period t are set before the election contingent on the prices that are expected to
prevail in the next period. Therefore, wages are set without knowing which party wins the election
and which tastes for inflation prevail in the economy in period t. The probability that party A wins
the elections is $\chi_A$. The remainder of the model is the same as the model in the previous section.

After the elections there is one party in the economy. Let it be party i where i is either A or B.

Therefore, the economy is described by

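\[
\begin{aligned}
p_t^{i} &= \frac{1}{2}\left(m_t^{i} + w_t + \nu_t\right), \\
y_t^{i} &= \frac{1}{2}\left(m_t^{i} - w_t + \nu_t\right), \\
m_t^{i} &= \frac{2}{\lambda_i} + 2p_{t-1} - w_t - \alpha\nu_{t-1}, \\
E\left[\pi_t^{i}\right] &= \frac{1}{\lambda_i}, \\
w_t &= \frac{1}{\lambda_i} + p_{t-1}.
\end{aligned}
\]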

Clearly, expected inflation is lower in case party A won the election. Combining expressions for

price level and monetary policy gives

\[
p_t^{i} = \frac{1}{\lambda_i} + p_{t-1} + \frac{1}{2}\eta_t.
\]

This is price level conditional that party i has won the elections. With probability χA price level

at time t is pAt and with probability 1− χA it is pBt , therefore, unconditional price level is

\[
p_t = \left[\chi_A\frac{1}{\lambda_A} + (1 - \chi_A)\frac{1}{\lambda_B}\right] + p_{t-1} + \frac{1}{2}\eta_t.
\]

This implies that wages are given by

\[
w_t = \left[\chi_A\frac{1}{\lambda_A} + (1 - \chi_A)\frac{1}{\lambda_B}\right] + p_{t-1}.
\]

Wages are a weighted average of tolerances for inflation and the previous level of prices. Therefore,

even though only one party is elected at time t the preferences of both parties matter.

In turn, expected inflation rate is given by

\[
E[\pi_t] = \chi_A\frac{1}{\lambda_A} + (1 - \chi_A)\frac{1}{\lambda_B}.
\]

To determine the level of output, use the expressions for money supply and wages

\[
y_t^{i} = \frac{1}{\lambda_i} - \left[\chi_A\frac{1}{\lambda_A} + (1 - \chi_A)\frac{1}{\lambda_B}\right] + \frac{1}{2}\eta_t.
\]

Therefore, if A is elected the expected output is

\[
E[y_t] = (1 - \chi_A)\left(\frac{1}{\lambda_A} - \frac{1}{\lambda_B}\right),
\]

which is a negative number. This number increases in absolute value with the probability that party B wins the
elections, $1 - \chi_A$. The expected output level is negative since party A tolerates inflation less than
party B. The existence of party B pushes expected inflation up. To keep inflation low party A reduces money
supply and output. Output declines to a negative number.

If B is elected then the expected output is positive and given by

\[
E[y_t] = \chi_A\left(\frac{1}{\lambda_B} - \frac{1}{\lambda_A}\right) > 0.
\]
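The election results can be illustrated with a few lines of code. This is a sketch under the assumptions of this section (wages set before the election, rational expectations); the function names and the values of $\lambda_A$, $\lambda_B$ and $\chi_A$ are hypothetical.

```python
def expected_inflation(lam_a, lam_b, chi_a):
    """Pre-election expected inflation: chi_A / lambda_A + (1 - chi_A) / lambda_B."""
    return chi_a / lam_a + (1.0 - chi_a) / lam_b


def expected_output_given_winner(lam_winner, lam_a, lam_b, chi_a):
    """Expected output once the winner is known: 1/lambda_i minus expected inflation."""
    return 1.0 / lam_winner - expected_inflation(lam_a, lam_b, chi_a)


# Hypothetical preferences: party A dislikes inflation more (lambda_A > lambda_B).
lam_a, lam_b, chi_a = 4.0, 2.0, 0.6

y_if_a = expected_output_given_winner(lam_a, lam_a, lam_b, chi_a)
y_if_b = expected_output_given_winner(lam_b, lam_a, lam_b, chi_a)

print("expected inflation before the election:", expected_inflation(lam_a, lam_b, chi_a))
print("expected output if A wins:", y_if_a)
print("expected output if B wins:", y_if_b)
print("probability-weighted average:", chi_a * y_if_a + (1.0 - chi_a) * y_if_b)
```

The probability-weighted average of the two outcomes is zero: before the election, expected output is at its long-run level, and the deviation appears only once the winner is known.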

Monetary policy under commitment and discretion

Previous sections discussed the properties of a monetary policy rule which reduces the variance of

output. The discussion in these sections suggested that in certain cases there might be a commitment

problem with an announced policy in the sense that policy makers might want public to believe

in the implementation of a policy but they might deviate later from it. The latter situation arises

especially when the policy makers pursue two opposing policy goals at the same time. The monetary

policy authorities/central banks usually have exactly two policy goals: stable inflation and output.

Consider an economy where the central bank has these two policy goals. To formalize this and

keep the matter relatively simple, assume that the central bank selects monetary policy rule to

minimize

\[
L_t = \pi_t^2 + \lambda(y_t - y)^2.
\]

Monetary policy tries to minimize the deviation of output from its long-run level y and the deviation
of inflation $\pi_t$ from its long-run level 0. λ is a positive parameter. It measures the importance of
the deviation of output from its long-run level for the central bank and therefore for monetary policy.
In this context L is the central bank’s loss function.

There is no trend in the model which we consider here, deviations of output from the long-run
level stem from exogenous shocks, and ideally long-run inflation is 0. Therefore, minimizing $\pi_t^2$
and $(y_t - y)^2$ amounts to minimizing their variance.

Suppose that the remainder of the economy is given by versions of AD and AS curves with
expected wages

\[
\begin{aligned}
[AD] &: \; y_t = a(m_t - p_t) + \eta_t^{d}, \\
[AS] &: \; y_t = b(p_t - w_t) + \eta_t^{s}, \\
[Wages] &: \; w_t = p^{e}_{t|t-1},
\end{aligned}
\]

where $\eta_t^d$ and $\eta_t^s$ are uncorrelated i.i.d. random variables with 0 mean and variances $\sigma_d^2$ and $\sigma_s^2$,
respectively. a and b are positive parameters.

Further, suppose that agents have rational expectations, so that

\[
p^{e}_{t|t-1} = E_{t-1}[p_t].
\]

For the purposes of the current discussion it is more convenient to write the [AS] curve with inflation
rates

\[
\begin{aligned}
[AS] : \; y_t &= b\left[(p_t - p_{t-1}) - (E_{t-1}[p_t] - E_{t-1}[p_{t-1}])\right] + \eta_t^{s} \\
&= b(\pi_t - E_{t-1}[\pi_t]) + \eta_t^{s}.
\end{aligned}
\]

In this model we will give an important advantage to the central bank assuming that it sets
monetary policy after observing the shocks $\eta_t^s$ and $\eta_t^d$. Usually, monetary policy can react quickly,

although hardly without a lag. However, what we actually need here is that it reacts faster to

shocks than the private sector/prices and wages. This seems to be a realistic assumption and opens

a door for monetary policy to have real effects. Monetary policy can have real effects in this case

since it can mitigate at least within-period shocks.

We will further assume that monetary policy is given by

[Policy] : $\pi_t = \alpha + \beta\eta_t^{d} + \delta\eta_t^{s}$,

where α, β and δ are policy parameters which the central bank chooses. Clearly this is an indirect

formulation of monetary policy rule. One way to think about it is that the central bank sets money

supply rule so that πt is given by equation [Policy]. Once πt is determined this money supply can

be determined from [AD] equation.

With this monetary policy rule, agents’ expectations about inflation are

\[
\pi^{e}_{t|t-1} = \alpha.
\]

Therefore, whatever value of α the central bank would choose, the agents know it and will adjust
their expectations accordingly. In this sense, the central bank can influence the agents’ expectations.

It will make use of this in its optimization problem.

Monetary policy under commitment

In this case the central bank announces its policy a period ahead (at t− 1 period for period t) and

commits to it. This means that the central bank selects at time t − 1 parameters α, β and δ to

minimize its expected loss function Et−1 [Lt]. At time t− 1, however, similar to agents it does not

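know the possible realization of $\eta_t^s$. Therefore,

\[
E_{t-1}[\pi_t] = \alpha.
\]

From equations [AS], [Wages], and [Policy] and the expression for the loss function it follows that
the central bank solves:

\[
E_{t-1}[L_t] = \min_{\alpha,\beta,\delta} \; E_{t-1}\left[\left(\alpha + \beta\eta_t^{d} + \delta\eta_t^{s}\right)^2\right] + \lambda E_{t-1}\left[\left(b\left(\alpha + \beta\eta_t^{d} + \delta\eta_t^{s} - \alpha\right) + \eta_t^{s} - y\right)^2\right].
\]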

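In order to solve this problem, open up the brackets and use the definition of variance for random
variables with 0 mean, e.g., $E_{t-1}\left[(\eta_t^d)^2\right] = \sigma_d^2$:

\[
\begin{aligned}
E_{t-1}\left[\left(\alpha + \beta\eta_t^{d} + \delta\eta_t^{s}\right)^2\right]
&= E_{t-1}\left[\alpha^2 + 2\alpha\beta\eta_t^{d} + \left(\beta\eta_t^{d}\right)^2 + 2\left(\alpha + \beta\eta_t^{d}\right)\left(\delta\eta_t^{s}\right) + \left(\delta\eta_t^{s}\right)^2\right] \\
&= \alpha^2 + \beta^2\sigma_d^2 + \delta^2\sigma_s^2.
\end{aligned}
\]

Similarly,

\[
E_{t-1}\left[\left(b\left(\alpha + \beta\eta_t^{d} + \delta\eta_t^{s} - \alpha\right) + \eta_t^{s} - y\right)^2\right] = (b\beta)^2\sigma_d^2 + (b\delta + 1)^2\sigma_s^2 + y^2.
\]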

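Therefore, the optimal problem of the central bank is

\[
\min_{\alpha,\beta,\delta} \; \alpha^2 + \beta^2\sigma_d^2 + \delta^2\sigma_s^2 + \lambda\left[(b\beta)^2\sigma_d^2 + (b\delta + 1)^2\sigma_s^2 + y^2\right].
\]

The first order conditions of this optimal problem are

\[
\begin{aligned}
[\alpha] &: \; 2\alpha = 0, \\
[\beta] &: \; 2\beta\sigma_d^2 + 2\lambda b^2\sigma_d^2\beta = 0, \\
[\delta] &: \; 2\delta\sigma_s^2 + 2\lambda(b\delta + 1)b\sigma_s^2 = 0.
\end{aligned}
\]

The optimal conditions for α and β imply that α = β = 0. However, $\delta = -\frac{\lambda b}{1 + \lambda b^2}$. The policy rule,
therefore, is

\[
\pi_t = -\frac{\lambda b}{1 + \lambda b^2}\eta_t^{s}.
\]

From the modified [AS] it then follows that output is

\[
y_t = \frac{1}{1 + \lambda b^2}\eta_t^{s}.
\]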

The variance of inflation and output under this rule are

\[
V[\pi_t] = \left(\frac{\lambda b}{1 + \lambda b^2}\right)^2\sigma_s^2,
\]
\[
V[y_t] = \left(\frac{1}{1 + \lambda b^2}\right)^2\sigma_s^2.
\]

If the central bank does not value stability of output, λ = 0, then it sets inflation to 0 and allows
output to vary with $\eta_t^s$. If, however, it places very large weight on the stability of output, λ = +∞,
then it allows inflation to vary with $\eta_t^s$ but keeps output constant so that it has 0 variance. In
general, it is easy to show that the variance of inflation increases with λ and the variance of output
declines with it.

Monetary policy does not respond to $\eta_t^d$. The intuition behind such a result
is that shock $\eta_t^d$ pushes prices and output proportionately and in the same direction. Therefore,
it does not create a trade-off between offsetting inflation and output volatilities. In turn, shock $\eta_t^s$
pushes prices and output in different directions. For example, a positive shock $\eta_t^s$ increases output,
but reduces prices. Therefore, it creates a trade-off.

Although the central bank reacts to shocks $\eta_t^s$, it sets α = 0. Therefore, the unconditional
expectation of inflation and the agents’ expected inflation are 0. Under this policy the expected
value of the loss function of the central bank is

\[
E_{t-1}[L_t] = \lambda\left(\frac{1}{1 + \lambda b^2}\sigma_s^2 + y^2\right).
\]
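The commitment solution can be computed directly from the formulas above. The sketch below is illustrative: the function name, the dictionary keys and the parameter values are hypothetical, and `y_target` stands for the level y that appears in the loss function.

```python
def commitment_policy(lam, b, sigma_s2, y_target):
    """Optimal policy under commitment (alpha = beta = 0) and its implications."""
    delta = -lam * b / (1.0 + lam * b**2)                # response to supply shock
    var_inflation = delta**2 * sigma_s2                  # V[pi_t]
    var_output = (1.0 / (1.0 + lam * b**2))**2 * sigma_s2  # V[y_t]
    # Expected loss: E[pi_t^2] + lambda * E[(y_t - y)^2] with E[y_t] = 0.
    expected_loss = var_inflation + lam * (var_output + y_target**2)
    return {
        "delta": delta,
        "var_inflation": var_inflation,
        "var_output": var_output,
        "expected_loss": expected_loss,
    }


# Hypothetical parameter values.
for lam in (0.0, 1.0, 10.0):
    result = commitment_policy(lam, b=1.0, sigma_s2=1.0, y_target=0.5)
    print("lambda =", lam, "->", result)
```

Raising λ in this example increases the variance of inflation and lowers the variance of output, in line with the trade-off described above.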

Monetary policy under discretion (without commitment)

A problem with the commitment equilibrium is that at time t the policy announced at t − 1 may
no longer be the optimal rule. In other words, if the policy maker is able to change the announced

policy at time t then it might achieve lower loss. This is because at that time, inflation expectations

have been formed and can be treated as given. The central bank then could exploit this.

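Formally, assume that agents believe that the central bank will deliver $E_{t-1}[\pi_t] = 0$. In such a case there is no $E_{t-1}[\pi_t]$ in [AS]. Therefore, from [AS], [Wages], and [Policy] and the expression for the loss function it follows that the central bank solves:

$$\min_{\alpha,\beta,\delta}\; E_{t-1}\left[\left(\alpha+\beta\eta^d_t+\delta\eta^s_t\right)^2\right] + \lambda E_{t-1}\left[\left(b\left(\alpha+\beta\eta^d_t+\delta\eta^s_t\right)+\eta^s_t-y\right)^2\right].$$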

This can be equivalently written as


$$\min_{\alpha,\beta,\delta}\; \alpha^2+\beta^2\sigma_d^2+\delta^2\sigma_s^2+\lambda\left[(b\beta)^2\sigma_d^2+(b\delta+1)^2\sigma_s^2+(\alpha b-y)^2\right].$$

The first order conditions in this case are

$$[\alpha]:\quad 2\alpha+2\lambda b(\alpha b-y)=0,$$

$$[\beta]:\quad 2\beta\sigma_d^2+2\lambda b^2\sigma_d^2\beta=0,$$

$$[\delta]:\quad 2\delta\sigma_s^2+2\lambda(b\delta+1)b\sigma_s^2=0.$$

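Since the central bank sets the same $\beta$ and $\delta$ as under commitment, it must attain a lower loss by setting

$$\alpha = \frac{\lambda b}{1+\lambda b^2}y$$

instead of $\alpha = 0$. Use [AS], where expected inflation is zero, and the policy rule to compute the expected value of the loss function under this policy:

$$\pi_t = \frac{\lambda b}{1+\lambda b^2}y - \frac{\lambda b}{1+\lambda b^2}\eta^s_t,$$

$$E_{t-1}[\pi_t] = \frac{\lambda b}{1+\lambda b^2}y,$$

$$y_t = \frac{\lambda b^2}{1+\lambda b^2}y + \frac{1}{1+\lambda b^2}\eta^s_t,$$

$$E_{t-1}[L_t] = \lambda\left(\frac{1}{1+\lambda b^2}\right)\left(y^2+\sigma_s^2\right).$$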

Clearly, the expected loss with $\alpha = \frac{\lambda b}{1+\lambda b^2}y$ is lower than with $\alpha = 0$. Therefore, the central bank has an incentive to cheat the agents: announce that $\alpha = 0$ so that agents perceive $E_{t-1}[\pi_t] = 0$, but later set $\alpha = \frac{\lambda b}{1+\lambda b^2}y$ so that $E_{t-1}[\pi_t] > 0$.

If the central bank announces a policy but alters it after the announcement, then the policy rule is not “time consistent” and is not credible. The agents will not believe that the central bank is committed and the commitment equilibrium falls apart.

Assume now that the central bank cannot commit to a rule and look for a policy which is optimal at period $t$ when the expectation of inflation is taken as given by the central bank. Call this a discretionary policy. Clearly, if this policy does not turn out to be the same as the one above, then there is a time inconsistency problem.

With discretionary monetary policy, the central bank solves

$$L_t = \min_{\pi_t}\; \pi_t^2 + \lambda\left[b\left(\pi_t - E_{t-1}[\pi_t]\right) + \eta^s_t - y\right]^2.$$

The central bank minimizes the loss function itself rather than its expected value, since it is not committed to any rule and makes the decision after the realization of shocks.

The first order condition then is

$$\pi_t + b\lambda\left[b\left(\pi_t - E_{t-1}[\pi_t]\right) + \eta^s_t - y\right] = 0.$$

To find $E_{t-1}[\pi_t]$, take the expected value of this expression:

$$E_{t-1}[\pi_t] = \lambda b y,$$

which means that at period $t-1$ the agents expect positive inflation at period $t$. In turn, inflation and output are

$$\pi_t = -\frac{b\lambda}{1+\lambda b^2}\eta^s_t + \lambda b y,$$

$$y_t = \frac{1}{1+\lambda b^2}\eta^s_t.$$

Therefore, the central bank runs higher inflation than it would if it could commit to a rule.

In terms of the previous choice parameters, this situation clearly corresponds to $\delta = -\frac{b\lambda}{1+\lambda b^2}$ and $\alpha = \lambda b y$.
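The algebra behind the discretionary solution can be sketched as follows. The sketch assumes that [AS] takes the form $y_t = b(\pi_t - E_{t-1}[\pi_t]) + \eta^s_t$ with $E_{t-1}[\eta^s_t] = 0$, which is what the discretionary loss function above implies.

```latex
\begin{align*}
% Collect the terms in \pi_t in the first order condition
(1+\lambda b^2)\,\pi_t &= \lambda b^2 E_{t-1}[\pi_t] - \lambda b\,\eta^s_t + \lambda b\,y \\
% Use E_{t-1}[\pi_t] = \lambda b y
 &= \lambda b\,y\,(1+\lambda b^2) - \lambda b\,\eta^s_t, \\
\pi_t &= \lambda b\,y - \frac{\lambda b}{1+\lambda b^2}\,\eta^s_t, \\
% Substitute into [AS]
y_t &= b\left(\pi_t - E_{t-1}[\pi_t]\right) + \eta^s_t
     = \left(1 - \frac{\lambda b^2}{1+\lambda b^2}\right)\eta^s_t
     = \frac{1}{1+\lambda b^2}\,\eta^s_t .
\end{align*}
```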

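The central bank’s expected value of the loss function at time $t-1$ then is (use $E_{t-1}[\pi_t] = \lambda b y$ in [AS])

$$E_{t-1}[L_t] = E_{t-1}\left[\left(-\frac{b\lambda}{1+\lambda b^2}\eta^s_t + \lambda b y\right)^2\right] + \lambda E_{t-1}\left[\left(\frac{1}{1+\lambda b^2}\eta^s_t - y\right)^2\right] = \lambda\left(\frac{1}{1+\lambda b^2}\right)\sigma_s^2 + \lambda\left(1+\lambda b^2\right)y^2.$$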

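Denote the expected values of the loss in the cases of credible commitment, cheating, and discretion by $E^{CM}_{t-1}[L_t]$, $E^{CH}_{t-1}[L_t]$, and $E^{D}_{t-1}[L_t]$:

$$E^{CM}_{t-1}[L_t] = \lambda\left(\frac{1}{1+\lambda b^2}\sigma_s^2 + y^2\right),$$

$$E^{CH}_{t-1}[L_t] = \lambda\left(\frac{1}{1+\lambda b^2}\right)\left(\sigma_s^2 + y^2\right),$$

$$E^{D}_{t-1}[L_t] = \lambda\left(\frac{1}{1+\lambda b^2}\sigma_s^2 + y^2\right) + (\lambda b)^2y^2.$$

It is clear that

$$E^{D}_{t-1}[L_t] > E^{CM}_{t-1}[L_t] > E^{CH}_{t-1}[L_t].$$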
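The ranking of the three expected losses can be verified numerically. The sketch below recomputes each loss from its components, $E[\pi_t^2] + \lambda E[(y_t - y)^2]$, and compares it with the closed forms above. The function names and the parameter values are illustrative assumptions, and the strict ranking presumes $\lambda > 0$, $b \neq 0$ and $y \neq 0$.

```python
def losses_from_components(lam, b, sigma_s2, y):
    """Expected losses built up as E[pi^2] + lam * E[(y_t - y)^2]."""
    d = 1.0 + lam * b ** 2
    # Commitment: pi = -(lam*b/d)*eta, y_t = eta/d
    commit = (lam * b / d) ** 2 * sigma_s2 + lam * (sigma_s2 / d ** 2 + y ** 2)
    # Cheating: pi = (lam*b/d)*(y - eta), y_t - y = (eta - y)/d
    cheat = (lam * b / d) ** 2 * (y ** 2 + sigma_s2) + lam * (y ** 2 + sigma_s2) / d ** 2
    # Discretion: pi = lam*b*y - (lam*b/d)*eta, y_t = eta/d
    discretion = ((lam * b / d) ** 2 * sigma_s2 + (lam * b * y) ** 2
                  + lam * (sigma_s2 / d ** 2 + y ** 2))
    return commit, cheat, discretion


def losses_closed_form(lam, b, sigma_s2, y):
    """Closed-form expressions for the three expected losses."""
    d = 1.0 + lam * b ** 2
    commit = lam * (sigma_s2 / d + y ** 2)
    cheat = lam * (sigma_s2 + y ** 2) / d
    discretion = commit + (lam * b) ** 2 * y ** 2
    return commit, cheat, discretion


if __name__ == "__main__":
    args = (1.0, 0.5, 1.0, 1.0)  # lam, b, sigma_s^2, y (illustrative)
    print("components :", losses_from_components(*args))
    print("closed form:", losses_closed_form(*args))
```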

Since the loss of the central bank is higher under discretion, and the discretionary equilibrium prevails if the central bank fails to commit, the central bank might want to use some mechanisms to show the public/agents that it is committed. Such mechanisms are readily available in the case of repeated interactions, when the public can punish the central bank (for example, by changing their beliefs) if it deviates from its announced policy. Another way to align incentives is to increase the independence of the central bank, which often means a reduction of $\lambda$. In such a case, however, although the central bank would commit to a rule, it would allow for uncontrolled changes in output.

Business cycles

The term business cycles denotes the fluctuations in aggregate output and other activity (e.g., unemployment, trade) over the medium term. In an economy, the medium term is usually associated with a period of several months or years. Fluctuations can be both upward and downward. Upward changes in output are called economic booms, while downward changes are called economic busts or recessions. Relatively long lasting recessions are called depressions. These fluctuations are measured relative to the long-term growth trend of output and are largely unpredictable.

The explanation of business cycles is one of the primary issues in macroeconomics. Business cycles have been studied since the time of Adam Smith and David Ricardo. Economists tend to differ a lot, however, in their explanations of the causes of business cycles and in their proposed remedies. This is such a central topic that economists even form schools of thought around the explanation of business cycles.

Some of the most highlighted shocks to aggregate output and other aggregate variables in an

economy are

• Technology shocks: 100 years ago travelling from Barcelona to New York would take much more time than now. This is a drastic example of how production functions can change over time. New technologies like PCs and robots alter the production process and usually raise productivity. Sometimes, production facilities break down or employees spend too much time on Facebook, so productivity falls. Such technological change is not always smooth; it often comes in the form of shocks.

• Weather shocks: Agriculture and tourism industries are very weather-dependent. Other industries could also depend on weather if their employees' work effort depends on weather. Fluctuations in weather then affect output in these industries.

• Monetary shocks: We have seen that in certain cases money supply and inflation affect output. This implies that random changes to monetary policy or liquidity in the economy can lead to output fluctuations as well.

• Political shocks: The government can influence the real economy through public expenditure and regulations. If it changes expenditures, tax laws, antitrust regulation, and expectations then that can cause fluctuations in aggregate output.

• Taste shocks: All the examples above are rather about the supply of goods. There could also be shocks to the demand for goods and services in terms of shifts in preferences. Such shifts can cause fluctuations and can come, for example, from the introduction of new products that make others obsolete in terms of preferences.

Usually none of these shocks can explain large shifts in output (and other aggregate variables)

such as observed in real economies. However, there are mechanisms in the economy that can

amplify these shocks. These shocks can be amplified because of, for example, intra- and inter-

temporal substitution. If a negative shock hits the economy and output declines then consumers

might wish to work less and enjoy more leisure (intra-temporal substitution). This would reduce

output further. Moreover, if consumers like smoothing their consumption then they would save less

(inter-temporal substitution) so that capital would decline and output would be lower in future.

Price stickiness could be another amplifying mechanism. If wages are sticky, for example, then after a negative shock to productivity firms would like to pay lower wages but they cannot. Instead of lowering wages and keeping a rather steady level of output they would lay off workers and reduce output. Financial frictions, in terms of inability to lend and borrow freely, can amplify the effects of shocks too. If there are financial frictions then even small shocks can force firms into bankruptcy. This will affect the financial sector that lent money to the bankrupt firms and reduce credit. Often

then additional firms have to declare bankruptcy, and sometimes even banks fail. Bank failures

reduce liquidity and credit. They can affect all creditors and debtors and therefore can have large

economic consequences.

The classical and neo-classical (freshwater) schools of thought tend to believe that the origins of business cycle fluctuations are completely exogenous processes which affect aggregate output through changes in technological efficiency (e.g., introduction of computers and Facebook) and other real variables. They hypothesize that the economy is frictionless so that there are no price rigidities and/or financial frictions. In this respect, they hypothesize that the cycles which follow these shocks are the optimal/best response of the economy. Therefore, these schools of thought believe that policies might not be effective in tackling business cycles. Neo-classical theories include the Real Business Cycle theory by Kydland and Prescott (1982), which is built around, in particular, the rational expectations assumption. These theories tend to have solid micro-foundations. However, they tend to underemphasize the importance of frictions and distortions in the real economy.

The Keynesian and neo-Keynesian (saltwater) schools of thought tend to believe that the origins of business cycle fluctuations also include shocks to nominal variables. They hypothesize that the economy involves frictions so that there are price rigidities, financial frictions, and other failures in the economy. In such frameworks nominal variables affect real variables. Moreover, the cycles which follow these shocks are not the best response of the economy. Therefore, these schools of thought tend to believe that there is scope for policy intervention (fiscal and/or monetary). A standard example of a Keynesian model is the AD-AS model (IS-LM-Phillips curve model). We will see two "extensions" of it in the next two sections. These models, however, lack solid micro-foundations. Neo-Keynesian theories tend to alleviate that problem. Prominent economists such as Michael Woodford and Gregory Mankiw have contributed to the development of neo-Keynesian theories. These theories are usually presented in the form of Dynamic Stochastic General Equilibrium (DSGE) models, which are well beyond the scope of this course because of their complexity. Such models, however, are commonly used in Central Banks and other (financial) institutions.

There are other schools of thought too, for example the Monetarist school of thought (largely due to Milton Friedman) and the Austrian school of thought (largely due to Friedrich Hayek).

Business cycles - The Carlin and Soskice (2005) model

This section presents a classical Keynesian model from business cycles research by Carlin and Soskice (2005). This model is an extension/analogue of the standard AD-AS model. The standard AD-AS model consists of 3 equations. These are the IS and LM equations, which describe aggregate demand (AD),

$$\text{[IS]}:\quad Y = C + I(r) + G,$$

$$\text{[LM]}:\quad \frac{M}{P} = L\left(r+\pi^e, Y\right),$$

and the aggregate supply equation (AS). In Macroeconomics 1 and 2, the AD-AS model is presented in a static form. It is presented using levels of prices, but not changes of prices (inflation).

This model also consists of 3 equations. Its first equation is the IS curve. The IS curve in this model, however, is written in a somewhat reduced form and in terms of logarithms of variables

$$\text{[IS]}:\quad y = A - ar,$$

where $A$ and $a$ are positive parameters. For example, $A$ depends on the level of consumption thriftiness. In turn, $a$ measures the magnitude of the effect of the interest rate on income in the goods market. This model is dynamic and incorporates a time dimension. In particular, it is assumed here that income and the interest rate are time dependent. Moreover, this model assumes that the interest rate (because of some imperfect adjustment mechanisms) affects the level of income with a time lag so that

$$y_t = A - ar_{t-1}.$$

There exists an interest rate level $r_s$ that leads to the equilibrium level of output $y_e$ (the stabilizing level of the interest rate).

Graphically, the IS curve looks like this

The second equation in this model is the aggregate supply curve, which is written in the form of the Phillips curve (where agents have static expectations)

$$\pi_t = \pi_{t-1} + \alpha(y_t - y_e).$$

Graphically, this Phillips curve (PC) in the short run and in the long run is presented below

Here, the upward sloping curve (line) is the short run Phillips curve and the vertical curve (line) is the long run Phillips curve. $\pi^T$ is the Central Bank’s targeted level of inflation.

The targeted level of inflation is one of the parameters of the Central Bank’s monetary policy. In this model the Central Bank designs its monetary policy so as to minimize its loss function

$$L = (y_t - y_e)^2 + \beta\left(\pi_t - \pi^T\right)^2,$$

where $\beta$ is a positive parameter which captures the importance of inflation stabilization for the Central Bank.

The Central Bank sets its monetary policy so as to solve

$$L = \min_{\pi_t}\; (y_t - y_e)^2 + \beta\left(\pi_t - \pi^T\right)^2,$$

s.t.

$$\pi_t = \pi_{t-1} + \alpha(y_t - y_e).$$

Substituting the Phillips curve for $\pi_t$ into $L$ and differentiating with respect to $y_t$ gives the following first order condition

$$\frac{\partial}{\partial y_t}L = 0 \Leftrightarrow (y_t - y_e) + \alpha\beta\left(\pi_{t-1} + \alpha(y_t - y_e) - \pi^T\right) = 0.$$

Substituting the Phillips curve back into this equation gives

$$(y_t - y_e) = -\alpha\beta\left(\pi_t - \pi^T\right).$$

This equation summarizes the monetary policy rule. Carlin and Soskice call it the MR-AD equation.

Graphically the solution of the problem can be presented in the following manner.

The Central Bank’s indifference curves are ellipses (circles if $\beta = 1$) with a bliss point at $\left(y_e, \pi^T\right)$.

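In sum, the model is given by the following 3 equations

$$\text{[IS]}:\quad y_t = A - ar_{t-1}, \tag{22}$$

$$\text{[PC]}:\quad \pi_t = \pi_{t-1} + \alpha(y_t - y_e), \tag{23}$$

$$\text{[MR-AD]}:\quad (y_t - y_e) = -\alpha\beta\left(\pi_t - \pi^T\right). \tag{24}$$

Plugging [MR-AD] into [PC] gives the level of inflation

$$\pi_t = \frac{1}{1+\alpha^2\beta}\left(\pi_{t-1} + \alpha^2\beta\pi^T\right).^{14} \tag{25}$$

Moreover, expressing $\pi_t$ from [MR-AD] and plugging it into [PC] gives

$$\text{[YR]}:\quad y_t = y_e - \left(\frac{1}{\alpha\beta} + \alpha\right)^{-1}\left(\pi_{t-1} - \pi^T\right). \tag{26}$$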

14 Denote $\omega = \frac{1}{1+\alpha^2\beta}$ and rewrite this equation as $\pi_t = \omega\pi_{t-1} + (1-\omega)\pi^T$. The solution of this equation is $\pi_t = \omega^t\pi_0 + (1-\omega)\pi^T\sum_{\tau=1}^{t}\omega^{t-\tau}$. As $t$ increases the first term tends to 0. In turn, the second term tends to $\pi^T$. Substituting this into [MR-AD] gives the dynamics of output. From the dynamics of output and [IS] the dynamics of the interest rate can be found.

Call this equation YR: output rule. In turn, from the IS equation it follows that

$$y_t - y_e = -a(r_{t-1} - r_s).$$

Therefore,

$$\text{[TR]}:\quad r_{t-1} - r_s = \frac{1}{a}\left(\frac{1}{\alpha\beta} + \alpha\right)^{-1}\left(\pi_{t-1} - \pi^T\right). \tag{27}$$

This is an analogue of Taylor rule (it does not include output gap) which we call TR. This rule

directly follows from the Central Bank’s optimization rule. Therefore, it is the policy rule of the

Central Bank in terms of adjusting the interest rate. In this respect, TR suggests by how much the

interest rate should be adjusted if inflation deviates from its target.
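To see the size of the adjustment that TR prescribes, the rules (26) and (27) can be evaluated for given parameters. The following is a minimal sketch; the function names and all numerical values are illustrative assumptions rather than part of the model's specification.

```python
def response_coefficient(alpha, beta):
    """Common factor (1/(alpha*beta) + alpha)^(-1) in [YR] and [TR]."""
    return 1.0 / (1.0 / (alpha * beta) + alpha)


def output_rule(pi_prev, pi_target, y_e, alpha, beta):
    """[YR]: output y_t the Central Bank aims for, given inflation at t-1."""
    return y_e - response_coefficient(alpha, beta) * (pi_prev - pi_target)


def taylor_rule(pi_prev, pi_target, r_s, a, alpha, beta):
    """[TR]: interest rate r_{t-1} that delivers the output implied by [YR]."""
    return r_s + response_coefficient(alpha, beta) * (pi_prev - pi_target) / a


if __name__ == "__main__":
    alpha, beta, a = 1.0, 1.0, 0.5          # illustrative parameters
    y_e, r_s, pi_target = 0.0, 3.0, 2.0     # illustrative equilibrium values
    pi_prev = 4.0                           # inflation two points above target
    r = taylor_rule(pi_prev, pi_target, r_s, a, alpha, beta)
    y = output_rule(pi_prev, pi_target, y_e, alpha, beta)
    print(f"interest rate set: {r:.2f}, implied output next period: {y:.2f}")
```

With these values the interest rate is set two points above $r_s$ and output falls one unit below $y_e$, which is consistent with the IS relation $y_t - y_e = -a(r_{t-1} - r_s)$.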

Let us now turn to the analysis of fluctuations and policy responses. Without loss of generality, in the example economy which we discuss below we will assume that the inflation target is set to 2% (i.e., $\pi^T = 2$). Moreover, time starts at $t = 0$, and at $t = 0$ the economy is at its long run (equilibrium) level of output and (targeted) inflation.

Shocks causing fluctuations

Shock to the IS curve: Consider a positive shock to the IS curve which raises $A$ to $A'$ at $t = 0$ (and $A$ remains there forever). The Central Bank can do nothing to affect the increase in output at $t = 0$ since its monetary policy rule affects the interest rate, which has a time lag in affecting output. However, it can affect changes in the economy in the second period. In order to do so it needs to make a forecast of the Phillips curve in $t = 1$. With the forecast of the Phillips curve it will identify its constraint for designing its optimal problem, and find the solution of the optimal problem.

Suppose further that the shift of $A$ to $A'$ has triggered (through the Phillips curve) an increase in the current level of inflation to $\pi_0 = 4\%$. From the Phillips and IS curves, therefore, we have that the change in $A$ is given by

$$4 = 2 + \alpha\left(A' - A\right),$$

i.e., $A' = \frac{2}{\alpha} + A$.

Given that expectations are static, the forecast of next period ($t = 1$) inflation is 4. For any level of output, this implies that inflation is going to be higher in the next period, so that next period’s Phillips curve has shifted up. The graphical representation of this process is as follows

Notice that if the Central Bank decides to lower inflation in $t = 1$ then it will cause a recession. This is because any level of inflation below 4 corresponds to a lower output level (lower than $y_e$). Moreover, through the interest rate it sets at $t = 0$ it can affect inflation and move along the new Phillips curve.

According to the Taylor rule the Central Bank will take into account inflation at $t = 0$ and increase the interest rate at $t = 0$. In the figure offered above this corresponds to choosing points B and B′. Clearly, therefore, the Central Bank causes a recession in $t = 1$. What happens next? At $t = 1$, since output is lower than $y_e$, according to the Phillips curve inflation at $t = 2$ will be lower than at $t = 1$. This implies that the forecast of the Phillips curve for $t = 2$ at $t = 1$ is to the right of the dashed PC in the figure above. The Central Bank then will set a slightly lower interest rate and inflation. This will increase output. This adjustment process will continue until inflation is back to its targeted level of 2. Of course, it will imply that the new stabilizing level of the interest rate is higher than the old stabilizing level of the interest rate, $r'_s > r_s$.
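The adjustment path described above can be traced numerically by iterating equation (25) and reading off output from [MR-AD] and the interest rate from [IS]. The following is a minimal sketch under illustrative parameter values (an assumption); the interest rate is reported as a gap relative to the new stabilizing rate $r'_s$, and the function name is hypothetical.

```python
def simulate_disinflation(pi0, pi_target, alpha, beta, a, periods):
    """Paths of inflation, output gap and interest rate gap after the shock.

    Returns a list of tuples (t, pi_t, y_t - y_e, r_{t-1} - r'_s).
    """
    omega = 1.0 / (1.0 + alpha ** 2 * beta)
    path = []
    pi_prev = pi0
    for t in range(1, periods + 1):
        pi_t = omega * pi_prev + (1.0 - omega) * pi_target  # equation (25)
        gap_t = -alpha * beta * (pi_t - pi_target)          # [MR-AD]
        rate_gap = -gap_t / a                               # [IS], rate set at t-1
        path.append((t, pi_t, gap_t, rate_gap))
        pi_prev = pi_t
    return path


if __name__ == "__main__":
    for t, pi_t, gap_t, rate_gap in simulate_disinflation(4.0, 2.0, 1.0, 1.0, 0.5, 8):
        print(f"t={t}: inflation={pi_t:.3f}, output gap={gap_t:.3f}, "
              f"rate gap={rate_gap:.3f}")
```

Along the path inflation falls monotonically from 4 toward the target of 2, the output gap is negative but shrinking, and the interest rate gap declines toward zero, so the recession is gradually unwound.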

Supply Shock: Consider a permanent shift of the level of output at $t = 0$. This corresponds to a shift of the long run level of output $y_e$ to some $y'_e$ such that $y'_e > y_e$ and can happen, for example, because of technical progress.

Suppose that the size of the shock is such that it triggers inflation to decline to 0. From the Phillips curve, therefore, the magnitude of the shock has to be

$$0 = 2 + \alpha\left(y_e - y'_e\right),$$

i.e., $y'_e = \frac{2}{\alpha} + y_e$.

Clearly, in this case, given that the long run Phillips curve shifts to the right (see the figure above for a graphical representation), the short run Phillips curve has to shift to the right in the long run too. To have such a shift there needs to be a new MR-AD curve, since it has to go through the intersection of the long run and short run Phillips curves.

Using the forecast of the Phillips curve the Central Bank knows that this shock will imply lower inflation at $t = 1$. It will design its optimal problem accordingly. The solution of its optimal problem implies the YR equation. According to YR, the Central Bank would like to have higher output in period $t = 1$ since $y_t$ increases with $y_e$. From the TR equation and the IS curve this implies that the Central Bank will reduce inflation at $t = 0$ from 2% and the interest rate from $r_s$. In the figure offered above it will move to points C and C′. Later, it will gradually increase inflation and the interest rate so as to reach $y'_e$ and $\pi^T$.

Endogenous business cycles - The Goodwin (1967) model

The Goodwin (1967) model combines not so orthodox growth models (Harrod, 1939; Domar, 1946) with the Phillips curve in order to generate endogenous business cycles. Fluctuations are due to a cyclical relationship between employment and wages. This model is not well micro-founded; however, it can serve as a nice illustration of the endogenous emergence of cycles.

This model has origins in Biology. In particular, it is related to a large class of Predator-Prey models which describe dynamic biological systems. A common outcome of these systems is a recurring cycle in the sizes of the predator and prey populations. Imagine increasing the number of predators. That would reduce the population of prey and therefore reduce the number of predators. These dynamics give rise to the Lotka-Volterra differential equations. Similar equations are derived from this model.

At any time t, the aggregate output is assumed to be given by

$$Y_t = \min\left\{\alpha_tL_t, \frac{K_t}{\sigma}\right\},$$

where $\alpha_t$ is the productivity of labor $L_t$, and $\sigma$ pins down the capital-output ratio $\frac{K_t}{Y_t}$ when $\frac{K_t}{\sigma} \le \alpha_tL_t$. This production function implies that labor and capital are complementary. Moreover, it is clearly not a neo-classical production function. For example, it violates the diminishing returns property of neo-classical production functions.

In equilibrium firms will be reluctant to hire more than $\frac{K_t}{\sigma}$ of efficiency adjusted labor $\alpha_tL_t$ since hiring more would not increase their production. Moreover, they would be reluctant to hire more than $\alpha_tL_t$ of $\frac{K_t}{\sigma}$. We will focus on the case when they hire so that

$$\alpha_tL_t = \frac{K_t}{\sigma}.$$

This implies that

$$Y_t = \alpha_tL_t = \frac{K_t}{\sigma}.$$

Further, assume that time is continuous and labor productivity grows at a rate of $a$. The growth rate of labor productivity in discrete time is given by $\frac{\alpha_{t+1}-\alpha_t}{\alpha_t}$. The continuous time analogue of this expression is $\frac{1}{\alpha_t}\frac{d\alpha_t}{dt}$.

Let total population be $N_t$ and grow at a rate of $\beta$, i.e., $\frac{1}{N_t}\frac{dN_t}{dt} = \beta$. The employment rate is then given by

$$\varepsilon_t = \frac{L_t}{N_t}.$$

Denoting growth rates by the letter $g$, this implies that the employment rate grows at a rate of

$$g_\varepsilon = g_L - \beta, \tag{28}$$

where $g_L$ is the growth rate of labor. Notice that when the (equilibrium) amount of labor and population grow at the same rate then the rate of growth of employment $g_\varepsilon$ is zero. When, however, population grows at a higher (lower) rate than labor then $g_\varepsilon$ is negative (positive).

The rate of growth of labor can be found from the production function. Under the assumption that $\alpha_tL_t = \frac{K_t}{\sigma}$ the rate of growth of labor is

$$g_L = g_Y - a. \tag{29}$$

Clearly, when $K$ is fixed the amount of labor should decline with labor productivity since the amount of labor in efficiency units should be constant.

The Phillips curve in this model depicts the relationship between percentage changes of wages $w$ (price of labor) and employment (output). The Phillips curve is assumed to be given by

$$g_w = \rho\varepsilon_t - \gamma, \tag{30}$$

where ρ and γ are positive parameters that characterize labor market conditions (e.g., existence of

minimum wage, labor unions).

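Denote the share of worker compensation by $\theta_t$

$$\theta_t = \frac{w_tL_t}{Y_t}.$$

Apparently, $1 - \theta_t$ would be the share of compensation of capital. Now, since we focus on the case when in equilibrium $\alpha_tL_t = \frac{K_t}{\sigma}$, the share of worker compensation is

$$\theta_t = \frac{w_t}{\alpha_t},$$

which has a growth rate of

$$g_\theta = \rho\varepsilon_t - \gamma - a. \tag{31}$$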

The last thing which needs to be determined to close the model is the growth rate of output. Clearly, since $Y_t = \frac{K_t}{\sigma}$, output and capital grow at the same rate.

Assume that there are two types of agents in this economy: workers and agents who own capital and supply no labor. Call the latter type of agents "capitalists." Workers consume their income immediately and do not save. In turn, capitalists save a constant fraction $s$ of their income. This means that total savings are given by

$$S_t = s(1-\theta_t)Y_t.^{15}$$

The capital accumulation rule, therefore, is given by

$$\frac{dK_t}{dt} = S_t - \delta K_t = s(1-\theta_t)Y_t - \delta K_t = \left[\frac{s(1-\theta_t)}{\sigma} - \delta\right]K_t.$$

The growth rate of capital and output then is

$$g_K = g_Y = \frac{s(1-\theta_t)}{\sigma} - \delta. \tag{32}$$

Notice that for a sharp increase of $\theta_t$, which corresponds to sharply increasing the share of workers' income, the growth rate of $Y$ can become negative according to (32). This would imply that the growth rates of labor and employment become negative according to (29) and (28). A negative growth rate of employment would reduce employment. Therefore, it would reduce wages according to the Phillips curve (30) and $\theta$ according to (31). This is the predator-prey mechanism in this model.

15 Recall that in the Solow (1956) model there is no distinction between workers and "capitalists." In this sense, Solow (1956) assumes that both agents save the same percentage of their income.

This model can be summarized by two differential equations

$$g_\varepsilon = \frac{s(1-\theta_t)}{\sigma} - \delta - a - \beta,$$

$$g_\theta = \rho\varepsilon_t - \gamma - a.$$

These equations are known as Lotka-Volterra differential equations. Clearly, $g_\varepsilon$ declines as $\theta_t$ increases. This is because higher $\theta_t$ reduces investments and growth of output. Therefore, it reduces growth of employment. Moreover, higher employment increases $g_\theta$. This is because increasing employment increases output and wages. Higher wages imply a higher share of worker compensation.

In the steady-state equilibrium there are no dynamics in the system. Therefore, in the steady-state equilibrium

$$0 = \frac{s(1-\theta)}{\sigma} - \delta - a - \beta,$$

$$0 = \rho\varepsilon - \gamma - a.$$

Solving for the share of worker compensation and employment gives

$$\theta_{SS} = 1 - \frac{\sigma}{s}(\delta + a + \beta),$$

$$\varepsilon_{SS} = \frac{\gamma + a}{\rho}.$$
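The steady-state expressions can be checked by plugging them back into the two growth rate equations. The following is a minimal sketch; the function names and the parameter values are illustrative assumptions, chosen only so that both steady-state values lie between 0 and 1.

```python
def growth_rates(eps, theta, s, sigma, delta, a, beta, rho, gamma):
    """Growth rates of the employment rate and of the worker compensation share."""
    g_eps = s * (1.0 - theta) / sigma - delta - a - beta
    g_theta = rho * eps - gamma - a
    return g_eps, g_theta


def steady_state(s, sigma, delta, a, beta, rho, gamma):
    """Steady-state employment rate and worker compensation share."""
    theta_ss = 1.0 - sigma / s * (delta + a + beta)
    eps_ss = (gamma + a) / rho
    return eps_ss, theta_ss


if __name__ == "__main__":
    # Illustrative parameter values (assumption, not from the notes).
    params = dict(s=0.6, sigma=3.0, delta=0.05, a=0.02,
                  beta=0.01, rho=1.0, gamma=0.88)
    eps_ss, theta_ss = steady_state(**params)
    print("steady state (employment rate, wage share):", eps_ss, theta_ss)
    print("growth rates at the steady state:",
          growth_rates(eps_ss, theta_ss, **params))
```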

Consider the differential equations in order to see the dynamic adjustment in this model. The following figure offers the phase diagram (time evolution of the system).

Suppose that $\varepsilon_t > \varepsilon_{SS}$ and $\theta_t < \theta_{SS}$. This implies that the point $(\varepsilon_t, \theta_t)$ is in quadrant I. Since at $\varepsilon_t = \varepsilon_{SS}$ we have that $g_\theta = 0$ and $g_\theta$ increases with $\varepsilon_t$, it has to be that $g_\theta > 0$ for the points $(\varepsilon_t, \theta_t)$ in quadrant I. This implies that over time $\theta_t$ increases in quadrant I, which is depicted by the arrow pointing to the right (i.e., increasing $\theta_t$). Moreover, since at $\theta_t = \theta_{SS}$ we have that $g_\varepsilon = 0$ and $g_\varepsilon$ declines with $\theta_t$, it has to be that $g_\varepsilon > 0$ for the points $(\varepsilon_t, \theta_t)$ in quadrant I. This implies that over time $\varepsilon_t$ increases in quadrant I, which is depicted by the arrow pointing up (i.e., increasing $\varepsilon_t$). The intuition behind the arrows in the remaining quadrants follows a similar logic.

Imagine the economy starts at some point in quadrant I. In such a case, over time it will gravitate to quadrant II, then to quadrant III, quadrant IV, and return back to quadrant I. This is illustrated by the circle of arrows around the steady-state. It turns out that the dynamics in this model are periodic fluctuations, in the sense that this process neither converges to the steady state nor diverges to $\pm\infty$.

An economy which is at the steady-state can appear in quadrant I, for example, because of a positive shock to employment (e.g., population declines temporarily) and/or a negative shock to labor income share (e.g., labor productivity increases temporarily). After such a shock it will experience fluctuations along the cycle and never converge back to its steady-state. These fluctuations are endogenous business cycles. We have encountered such fluctuations for prices in the Cobweb Model. In the Cobweb Model there would be never ending fluctuations if α2 = 1.
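The closed orbits described above can be reproduced numerically. The following is a minimal sketch that integrates the two differential equations with a classical fourth-order Runge-Kutta scheme, starting in quadrant I. The parameter values, step size and function names are illustrative assumptions. The function `invariant` evaluates the standard conserved quantity of a Lotka-Volterra system (not derived in these notes); its near-constancy along the simulated path indicates that the trajectory neither spirals in nor out.

```python
import math

# Illustrative parameter values (assumption, not from the notes).
PARAMS = {"s": 0.6, "sigma": 3.0, "delta": 0.05, "a": 0.02,
          "beta": 0.01, "rho": 1.0, "gamma": 0.88}


def derivatives(eps, theta, p):
    """Time derivatives of the employment rate and wage share: x' = x * g_x."""
    g_eps = p["s"] * (1.0 - theta) / p["sigma"] - p["delta"] - p["a"] - p["beta"]
    g_theta = p["rho"] * eps - p["gamma"] - p["a"]
    return eps * g_eps, theta * g_theta


def rk4_step(eps, theta, dt, p):
    """One classical fourth-order Runge-Kutta step."""
    k1e, k1t = derivatives(eps, theta, p)
    k2e, k2t = derivatives(eps + 0.5 * dt * k1e, theta + 0.5 * dt * k1t, p)
    k3e, k3t = derivatives(eps + 0.5 * dt * k2e, theta + 0.5 * dt * k2t, p)
    k4e, k4t = derivatives(eps + dt * k3e, theta + dt * k3t, p)
    return (eps + dt * (k1e + 2 * k2e + 2 * k3e + k4e) / 6.0,
            theta + dt * (k1t + 2 * k2t + 2 * k3t + k4t) / 6.0)


def invariant(eps, theta, p):
    """Quantity that stays constant along exact Lotka-Volterra trajectories."""
    c1 = p["s"] / p["sigma"] - p["delta"] - p["a"] - p["beta"]
    c2 = p["s"] / p["sigma"]
    c3 = p["gamma"] + p["a"]
    return p["rho"] * eps - c3 * math.log(eps) + c2 * theta - c1 * math.log(theta)


def simulate(eps0, theta0, p, dt=0.01, steps=10000):
    """Simulated path of (employment rate, wage share)."""
    path = [(eps0, theta0)]
    for _ in range(steps):
        path.append(rk4_step(*path[-1], dt, p))
    return path


if __name__ == "__main__":
    # Start in quadrant I: employment above, wage share below the steady state.
    path = simulate(0.95, 0.55, PARAMS)
    drift = invariant(*path[-1], PARAMS) - invariant(*path[0], PARAMS)
    print("change in the invariant over the simulation:", drift)
    print("employment rate range:", min(e for e, _ in path), max(e for e, _ in path))
```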

A Real Business Cycles Model

This section mostly follows (1) Chapter 9 in Doepke et al. (1999) and (2) King and Rebelo (1999).

Before Kydland and Prescott (1982) economists thought that classical models can be used for studies of long-term phenomena such as economic growth. However, short- and medium-term fluctuations are not well explained with classical models. Kydland and Prescott (1982) revolutionized business cycles theory and economics in two ways. First, they showed that realistic fluctuations can emerge in classical (well micro-founded) macroeconomic models as an optimal response to exogenous shocks. By doing so they basically originated the real business cycles (RBC) theory. Second, they offered ways for evaluating the predictions of the models so as to gauge their relevance and fit to the real world.

Usually it is impossible to analytically solve the models used in real business cycles theory. Instead numerical/simulation methods are used. In order to resort to numerical methods the values of model parameters should be known. Kydland and Prescott (1982) advocated the use of real world data for calibrating (estimating) the parameters of the models.

In these models, it is also very hard to derive analytical comparative statics for understanding the

qualitative predictions of the models. Numerical comparative statics are used instead of analytical

comparative statics. The numerical comparative statics are also useful for evaluation of quantitative

predictions. In particular, researchers evaluate the response of the model variables over time to

exogenous shocks (impulse-response functions) and compare them to the patterns in real world

data. In line with Kydland and Prescott (1982), the comparison is in terms of

• the direction and the shape of the response of model variables;

• the magnitude of response in terms of mean and standard deviation; and

• the signs and magnitude of correlations between model variables.

The table below offers business cycle statistics (moments) for the main macroeconomic variables from US data. This table is taken from King and Rebelo (1999). It is used for the latter two points. All data are quarterly and are for the period of 1947 Q1-1992 Q4. Variables representing quantities such as output (Y), consumption (C), investment (I), and hours worked are in per capita terms. Consumption includes consumption of non-durables. Investments include private fixed capital formation and expenditures on durables. The wage rate is the real compensation per hour. The interest rate is basically the interest rate paid on treasury bills minus inflation. A is Hicks-neutral productivity in aggregate output. It is defined as the Solow residual ($\ln A = \ln Y - \alpha\ln K - (1-\alpha)\ln L$).

The first column offers the sample standard deviation of the relevant variables. The standard deviation of a variable $X$ with observations $X_1, X_2, ..., X_T$ is defined as

$$STD(X)_T = \sqrt{\frac{1}{T-1}\sum_{t=1}^{T}\left(X_t - \frac{1}{T}\sum_{t=1}^{T}X_t\right)^2}.$$

It is the statistical/empirical analogue of the standard deviation (i.e., of the square root of the variance). The second column offers the standard deviation of the relevant variables relative to the standard deviation of output (Y). Clearly consumption of non-durables is much less volatile than output. Investments, which include expenditures on durables, are around three times more volatile than output. The number of hours worked has around the same volatility as output. However, wages (and the interest rate) vary much less than output and output per hour worked.

The third column of the table shows the first order autocorrelations of the variables. The first order autocorrelation of the variable $X$ can be found using the following regression

$$X_t = \beta + \rho_1X_{t-1} + \eta_t,$$

where $\beta$ is a constant and $\eta_t$ is by construction orthogonal to $X_{t-1}$. Coefficient $\rho_1$ is the first order autocorrelation of the variable $X$. It is called first order autocorrelation because we have the first order time lag (i.e., $X_{t-1}$) and the correlation of $X$ with itself (i.e., auto-correlation). Alternatively, $\rho_1$ can be computed as

$$\rho_1 = \frac{\frac{1}{T-2}\sum_{t=2}^{T}\left(X_t - \bar{X}_T\right)\left(X_{t-1} - \bar{X}_{T-1}\right)}{STD(X)_T\,STD(X)_{T-1}},$$

where $\bar{X}_T$ is the sample mean of $X_1, ..., X_T$.

This coefficient shows how much the observations of a variable are interrelated (linearly). In other words, it shows how good the current value of the variable is for predicting its future value. Now imagine that there is a shock to $X$ because of some exogenous reasons ($\eta$). In such a case, if $\rho_1$ is close to 1 from below then this shock will stay in $X$ for a long time. If $\rho_1$ is equal to 1 then it will stay in $X$ forever. If $\rho_1$ is higher than 1 then it will stay in $X$ forever; moreover, it will propagate and get larger over time.

From the third column it is clear that all quantity variables are very highly autocorrelated. Perhaps one of the most important autocorrelations for RBC theory is that of A. It is fairly large, which means that previous values of A predict current values with high precision and a shock to A will persist in A for a long time.
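The two ways of obtaining the first order autocorrelation can be compared on simulated data. The following is a minimal sketch: it generates a hypothetical AR(1) series with persistence 0.9 (an illustrative assumption, not an estimate from the table) and computes $\rho_1$ both from the moment formula above and as the slope of the regression of $X_t$ on $X_{t-1}$. All function names are hypothetical.

```python
import math
import random


def sample_std(x):
    """Sample standard deviation with divisor len(x) - 1."""
    n = len(x)
    mean = sum(x) / n
    return math.sqrt(sum((v - mean) ** 2 for v in x) / (n - 1))


def first_order_autocorr(x):
    """rho_1 computed from the moment formula in the text."""
    T = len(x)
    mean_all = sum(x) / T              # mean of X_1, ..., X_T
    mean_lag = sum(x[:-1]) / (T - 1)   # mean of X_1, ..., X_{T-1}
    cov = sum((x[t] - mean_all) * (x[t - 1] - mean_lag)
              for t in range(1, T)) / (T - 2)
    return cov / (sample_std(x) * sample_std(x[:-1]))


def ols_slope(y, z):
    """Slope coefficient from regressing y on a constant and z."""
    mean_y = sum(y) / len(y)
    mean_z = sum(z) / len(z)
    num = sum((zi - mean_z) * (yi - mean_y) for yi, zi in zip(y, z))
    den = sum((zi - mean_z) ** 2 for zi in z)
    return num / den


def simulate_ar1(rho, T, seed=0):
    """X_t = rho * X_{t-1} + eta_t with standard normal shocks."""
    rng = random.Random(seed)
    x = [0.0]
    for _ in range(T - 1):
        x.append(rho * x[-1] + rng.gauss(0.0, 1.0))
    return x


if __name__ == "__main__":
    x = simulate_ar1(0.9, 5000)
    print("moment formula  :", first_order_autocorr(x))
    print("regression slope:", ols_slope(x[1:], x[:-1]))
```

In a long sample the two numbers are close to each other and to the persistence used to generate the data.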

The last column of this table offers contemporaneous correlations between variables and output. The correlation between variables $X$ and $Z$ can be found using the following regression

$$X_t = \beta + \rho Z_t + \eta_t,$$

where $\beta$ is a constant and $\eta_t$ is by construction orthogonal to $Z_t$. Coefficient $\rho$ coincides with the correlation between variables $X$ and $Z$ when the two variables have the same standard deviation (for example, after standardizing them). Alternatively, $\rho$ can be computed as

$$\rho = \frac{\frac{1}{T-1}\sum_{t=1}^{T}\left(X_t - \bar{X}_T\right)\left(Z_t - \bar{Z}_T\right)}{STD(X)_T\,STD(Z)_T}.$$

This coefficient measures the degree to which two variables are related (linearly). This is the reason it is called "co"-rrelation. More precisely, it corresponds to contemporaneous correlation because it is for the observations from the same period of time.

The last column suggests that output and other quantities are very highly correlated. This implies that the driving processes behind them might be quite the same. However, output and the real wage rate are not well correlated.
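The link between the regression coefficient and the correlation coefficient can be illustrated on a small made-up data set. The following is a minimal sketch (function names and data are hypothetical): the slope from regressing $X$ on $Z$ equals the correlation multiplied by the ratio of standard deviations, so the two coincide once both series are standardized.

```python
import math


def mean(x):
    return sum(x) / len(x)


def sample_std(x):
    """Sample standard deviation with divisor len(x) - 1."""
    m = mean(x)
    return math.sqrt(sum((v - m) ** 2 for v in x) / (len(x) - 1))


def correlation(x, z):
    """Contemporaneous correlation computed from the moment formula."""
    mx, mz = mean(x), mean(z)
    cov = sum((xi - mx) * (zi - mz) for xi, zi in zip(x, z)) / (len(x) - 1)
    return cov / (sample_std(x) * sample_std(z))


def ols_slope(x, z):
    """Slope from regressing x on a constant and z."""
    mx, mz = mean(x), mean(z)
    num = sum((xi - mx) * (zi - mz) for xi, zi in zip(x, z))
    den = sum((zi - mz) ** 2 for zi in z)
    return num / den


def standardize(x):
    """Rescale a series to zero mean and unit sample standard deviation."""
    m, s = mean(x), sample_std(x)
    return [(v - m) / s for v in x]


if __name__ == "__main__":
    z = [1.0, 2.0, 3.0, 4.0, 5.0]   # made-up data
    x = [2.0, 1.0, 4.0, 3.0, 6.0]   # made-up data
    print("correlation             :", correlation(x, z))
    print("slope (raw data)        :", ols_slope(x, z))
    print("slope (standardized)    :", ols_slope(standardize(x), standardize(z)))
```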

The Basic RBC Model

Consider a closed economy which is populated by a very large number of identical and infinitely-

lived households of mass one. In a period, the representative household is endowed with 1 unit

of time, which it can use for work/labor l and leisure 1 − l. It derives instantaneous utility from consumption c and leisure:

$$
u(c_t, 1-l_t) = \ln c_t + \gamma \ln(1-l_t).
$$

The lifetime utility of the household at time zero is the expected discounted sum of the instantaneous

utilities

$$
U(c, 1-l) = E_0\left[\sum_{t=0}^{+\infty} \beta^t u(c_t, 1-l_t)\right],
$$

where $\beta = \frac{1}{1+\rho}$ is the discount factor and ρ > 0 is the discount rate. The representative household

has rational expectations and E0 is the expectation operator given all the available information at

the beginning of the economy, time 0.

The assumption that households live forever can be justified by thinking of families of altruistic households who care about the utility of their offspring as much as they care about their own. It turns out that this assumption makes the calculus easier. In turn, the assumption that there is a very large number of households implies that each household is atomistic and its individual effect on aggregates can be ignored in the analysis. Therefore, none of the households can dictate prices and quantities, and each takes them as given.

The representative household earns market wage w for each labor unit. It owns the capital

stocks in the economy which at time t are at the level of kt. The household earns market interest

rate of r for each unit of supplied capital. At time t, the proceeds from labor and capital are equal

to wtlt + rtkt. The household uses these proceeds to fund its consumption and savings. Therefore,

its budget constraint is given by

$$
w_t l_t + r_t k_t = c_t + s_t.
$$

Savings are translated into investments:

$$
s_t = i_t.
$$

Investments create new capital according to the law of motion of capital:

$$
k_{t+1} = i_t + (1-\delta) k_t,
$$

where δ ∈ (0, 1) is the rate of depreciation of capital.

This implies that the budget constraint of the household can be rewritten as

$$
w_t l_t + r_t k_t + (1-\delta) k_t - c_t - k_{t+1} = 0.
$$

The household maximizes its lifetime utility subject to the budget constraint. Formally, it solves the following problem:

$$
\max_{\{c_t,\, l_t,\, k_{t+1}\}_{t=0}^{+\infty}} E_0\left[\sum_{t=0}^{+\infty} \beta^t u(c_t, 1-l_t)\right]
$$

s.t.

$$
w_t l_t + r_t k_t + (1-\delta) k_t - c_t - k_{t+1} = 0,
$$

$$
k_0 > 0 \text{ given}.
$$

We will use a Lagrangian to solve this problem. We define the Lagrangian as

$$
\mathcal{L} = E_0\left[\sum_{t=0}^{+\infty} \beta^t \Big\{ u(c_t, 1-l_t) + q_t \left[w_t l_t + r_t k_t + (1-\delta) k_t - c_t - k_{t+1}\right] \Big\}\right].
$$

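Imagine that it is now some time t, so that the household knows the values of all variables (including random variables) for time t but does not know their t + 1 values. In such a case, take the derivatives of the Lagrangian with respect to $c_t$ and $l_t$ and set them to zero:

$$
\left[\frac{\partial \mathcal{L}}{\partial c_t} = 0\right]: \quad \frac{1}{c_t} = q_t,
$$

$$
\left[\frac{\partial \mathcal{L}}{\partial l_t} = 0\right]: \quad \frac{\gamma}{1-l_t} = q_t w_t.
$$

Take also the derivative of the Lagrangian with respect to $k_{t+1}$ and set it to zero. Notice that $k_{t+1}$ shows up in the Lagrangian at time t as $k_{t+1}$ and at time t + 1 in $r_{t+1} k_{t+1} + (1-\delta) k_{t+1}$:

$$
\left[\frac{\partial \mathcal{L}}{\partial k_{t+1}} = 0\right]: \quad q_t = \beta E_t\, q_{t+1}\left[r_{t+1} + (1-\delta)\right].
$$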

We have $E_t$ in front of $q_{t+1}\left[r_{t+1} + (1-\delta)\right]$ because t + 1 variables are subject to random shocks and the household uses rational expectations to predict them. Notice that given the values of $k_t$ and $i_t$ the value of $k_{t+1}$ is uniquely determined.

The first equation says that the marginal utility of consumption, $1/c_t$, is equal to $q_t$, which is called the shadow value of the marginal unit of capital. The second equation is the supply of labor. The benefit of supplying a marginal unit of labor is $w_t$, the real wage rate, which is measured in units of consumption goods. In the second equation, $w_t$ is scaled by $q_t$, so $q_t w_t$ is the benefit of supplying a marginal unit of labor in terms of utility. In equilibrium, the household should be indifferent between marginally increasing its labor supply and marginally increasing its leisure time. Therefore, the benefit in terms of more consumption because of labor should be equal to the utility loss because of less time in leisure. The marginal disutility of labor is given by $\frac{\gamma}{1-l_t}$.

Plug the first equation into the third equation to find

$$
\frac{1}{c_t} = \beta E_t\left[\frac{1}{c_{t+1}}\left[r_{t+1} + (1-\delta)\right]\right].
$$

This equation is called the Euler equation. It equates the marginal utility of consumption today to the discounted expected value of next period's marginal utility of consumption times the gross return on capital. It states that in equilibrium the household should be indifferent between consuming a marginal unit of the good right now and delaying its consumption to the next period, saving it to earn a gross return of $r_{t+1} + (1-\delta)$. In short, it says that in equilibrium the household should be indifferent between consumption and saving.

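For any given level of prices, the household has three variables to solve for ($c_t$, $l_t$, $k_{t+1}$) and three equations:

$$
\text{[labor supply]}: \quad l_t = 1 - \gamma \frac{c_t}{w_t},
$$

$$
\text{[consumption]}: \quad \frac{1}{c_t} = \beta E_t\left[\frac{1}{c_{t+1}}\left[r_{t+1} + (1-\delta)\right]\right],
$$

$$
\text{[investment]}: \quad w_t l_t + r_t k_t + (1-\delta) k_t - c_t - k_{t+1} = 0.
$$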

Notice that labor supply increases with real wage and declines with consumption. This is because

household’s consumption reflects how well off it is. If the household anticipates higher wealth then

it would accordingly increase consumption and reduce its current labor supply (have you ever seen

lazy millionaires?).

In the standard neoclassical model, labor supply is inelastic. Therefore, changes for example

in government expenditures have no real effects on output. In this framework, however, they can. Imagine an increase in government expenditures. The household knows that this is going

to be followed by a tax hike to cover the expenses. Therefore, it expects to have lower wealth.

The anticipation of lower wealth would lead to lower consumption and would increase current labor

supply. Higher labor supply would increase output, which we discuss below.

A very large number of identical firms produce consumption goods. The representative firm has

a Cobb-Douglas production technology:

$$
y_t = A_t k_t^{1-\alpha} l_t^{\alpha},
$$

where α ∈ (0, 1) and At is the technology level. Higher/lower At implies higher/lower level of output

for given levels of capital and labor.

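At each period of time, the representative firm solves the following problem:

$$
\pi_t = \max_{k_t,\, l_t} \; y_t - w_t l_t - r_t k_t
$$

s.t.

$$
y_t = A_t k_t^{1-\alpha} l_t^{\alpha}.
$$

Therefore, its demands for capital and labor are given by

$$
\left[\frac{\partial \pi_t}{\partial k_t} = 0\right]: \quad r_t = (1-\alpha)\frac{y_t}{k_t},
$$

$$
\left[\frac{\partial \pi_t}{\partial l_t} = 0\right]: \quad w_t = \alpha\frac{y_t}{l_t}.
$$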

These equations characterize the production/supply side. In equilibrium, supply and demand

are equal. This corresponds to plugging these two equations (supply prices) into the three equations


above (equations for demand side). If we do so then we obtain the following three equations:

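$$
l_t = 1 - \gamma c_t \frac{1}{\alpha} \frac{l_t}{A_t k_t^{1-\alpha} l_t^{\alpha}}, \tag{33}
$$

$$
\frac{1}{c_t} = \beta E_t\left[\frac{1}{c_{t+1}}\left[(1-\alpha)\frac{A_{t+1} k_{t+1}^{1-\alpha} l_{t+1}^{\alpha}}{k_{t+1}} + (1-\delta)\right]\right], \tag{34}
$$

$$
A_t k_t^{1-\alpha} l_t^{\alpha} + (1-\delta) k_t - c_t - k_{t+1} = 0. \tag{35}
$$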

In these equations, the variables are c, k, l, and A. We need an additional equation for A to have 4 equations and 4 variables. We will assume that

$$
\ln A_t = \rho \ln A_{t-1} + \eta_t, \tag{36}
$$

where ηt is an i.i.d. random variable with mean 0 and variance σ².

These 4 equations constitute the basic real business cycle model. The shocks originate in equation (36) and are ηt. As discussed before, ln At is computed from the output, capital, and labor series. With the production function used in this section, the formula is

$$
\ln A_t = \ln Y_t - (1-\alpha)\ln K_t - \alpha \ln L_t.
$$

The parameter ρ and the mean and variance of the shocks η are obtained by running a regression that has the form of (36). One of the major points of Kydland and Prescott (1982) is that many of the real-world business cycle regularities summarized in Table 1 above can be matched using this model and the estimated shocks.
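The two measurement steps described above can be sketched as follows: first back out ln A as a residual from output, capital, and labor, then fit (36) by least squares. Everything in the example is synthetic: the series `K`, `L`, the "true" values `rho_true` and `sigma_true`, and the function names are assumptions for illustration, and the residual is computed under the convention y = A k^(1-α) l^α.

```python
import math
import random

def solow_residual(Y, K, L, alpha):
    # ln A = ln Y - (1 - alpha) ln K - alpha ln L, matching y = A k^(1-alpha) l^alpha.
    return math.log(Y) - (1.0 - alpha) * math.log(K) - alpha * math.log(L)

def fit_ar1(a):
    # OLS without a constant, as in (36): a_t = rho * a_{t-1} + eta_t.
    num = sum(a[t] * a[t - 1] for t in range(1, len(a)))
    den = sum(a[t - 1] ** 2 for t in range(1, len(a)))
    rho_hat = num / den
    resid = [a[t] - rho_hat * a[t - 1] for t in range(1, len(a))]
    sigma_hat = math.sqrt(sum(e * e for e in resid) / (len(resid) - 1))
    return rho_hat, sigma_hat

# Synthetic data (hypothetical values, not estimates from actual series).
random.seed(1)
alpha, rho_true, sigma_true = 0.667, 0.95, 0.01
ln_A = [0.0]
for _ in range(20000):
    ln_A.append(rho_true * ln_A[-1] + random.gauss(0.0, sigma_true))

# Build output from made-up input series, then recover ln A as a residual.
K = [10.0 + 0.001 * t for t in range(len(ln_A))]
L = [0.2 for _ in ln_A]
Y = [math.exp(a) * k ** (1.0 - alpha) * l ** alpha for a, k, l in zip(ln_A, K, L)]
ln_A_hat = [solow_residual(y, k, l, alpha) for y, k, l in zip(Y, K, L)]

rho_hat, sigma_hat = fit_ar1(ln_A_hat)
print(rho_hat, sigma_hat)
```

With actual data the residual also absorbs measurement error in capital and labor, which is one reason the estimated shocks should be treated with caution.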

It is important to stress that in this model shocks propagate to other variables and persist over time. A shock that arrives at time t persists in the economy for several periods for two reasons. Suppose that a positive shock has arrived, so that the values of η and A are higher than expected. The first reason is that a positive shock increases the marginal product of labor for a given value of k and increases output. Some of that additional output will be consumed and the remainder will be invested. Higher investment creates more capital in the next period. Therefore, in the next period the amount of capital will be higher than it would have been without the positive shock. This implies a higher marginal product of labor and higher output, leading again to higher investment than there would have been without the positive shock. The second reason is that A is autocorrelated, as measured by ρ. A higher than expected shock η will therefore persist in A. It turns out that this channel is usually quantitatively the most important one. Therefore, the magnitude of the persistence ρ is very important for RBC models.
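To get a feel for the second channel, the sketch below traces the effect of a single shock η on ln A under (36), assuming no further shocks arrive. The values of `rho` and `shock` are the calibration values quoted in these notes for the US; the function names and the 40-period horizon are assumptions for illustration.

```python
import math

def technology_irf(rho, shock, horizon):
    # Response of ln A to a one-time shock eta at k = 0, with no later shocks:
    # the deviation from the no-shock path after k periods is rho**k * shock.
    return [shock * rho ** k for k in range(horizon + 1)]

def half_life(rho):
    # Number of periods until half of the initial effect has died out.
    return math.log(0.5) / math.log(rho)

rho = 0.979      # persistence value from the calibration quoted in these notes
shock = 0.0072   # one standard deviation of eta from the same calibration
irf = technology_irf(rho, shock, 40)
print(half_life(rho))
```

With quarterly data, a half-life of this size means that a technology shock is still clearly visible in A several years after it arrives.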

How do we solve equations (33)-(36) for c, k, l, and A? For a fairly general set of parameters we cannot derive an analytic solution of this model. Instead, we find the solution by running computer simulations.16 There are certain important insights, however, that we can draw from this system of equations without explicitly solving it.

16 Usually this is done for the log-linearized versions of these equations around the steady state of the system. For a function f (x) this involves writing a first order logarithmic approximation: $f(x) \approx f(x^*) + f'(x^*)\, x^* (\ln x - \ln x^*)$.

Steady state and technology shocks: We will say that our model economy is in a "steady

state" when all variables in equations (33)-(36) are time invariant. We will say that the economy is

in a "deterministic" steady state if there are no shocks, η, in the economy.

Suppose that in the long run there are no shocks. It is then possible to show that in the long run all variables in equations (33)-(36) are time invariant if |ρ| < 1. Let's assume that |ρ| < 1. According to the table from King and Rebelo (1999), this is a reasonable assumption at least for the US.17

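Suppose that in the steady state we have a given value for A (e.g., A = 1). Using (33)-(35) it is easy to show that in the steady state we have

$$
l = \frac{\alpha\left[\frac{1}{\beta} - (1-\delta)\right]}{\alpha\left[\frac{1}{\beta} - (1-\delta)\right] + \gamma\left[\frac{1}{\beta} - (1-\delta) - (1-\alpha)\delta\right]},
$$

$$
k = \left[\frac{1-\alpha}{\frac{1}{\beta} - (1-\delta)} A\right]^{\frac{1}{\alpha}} l,
$$

$$
c = \frac{\frac{1}{\beta} - (1-\delta) - (1-\alpha)\delta}{1-\alpha}\left[\frac{1-\alpha}{\frac{1}{\beta} - (1-\delta)} A\right]^{\frac{1}{\alpha}} l.
$$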

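Let's consider now a permanent increase in the value of A to A′. Such a permanent increase would imply a new steady state where

$$
l = \frac{\alpha\left[\frac{1}{\beta} - (1-\delta)\right]}{\alpha\left[\frac{1}{\beta} - (1-\delta)\right] + \gamma\left[\frac{1}{\beta} - (1-\delta) - (1-\alpha)\delta\right]},
$$

$$
k' = \left[\frac{1-\alpha}{\frac{1}{\beta} - (1-\delta)} A'\right]^{\frac{1}{\alpha}} l,
$$

$$
c' = \frac{\frac{1}{\beta} - (1-\delta) - (1-\alpha)\delta}{1-\alpha}\left[\frac{1-\alpha}{\frac{1}{\beta} - (1-\delta)} A'\right]^{\frac{1}{\alpha}} l.
$$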

Clearly, in this new steady state capital stock and consumption are higher than in the old steady

state: k′ > k and c′ > c. Therefore, positive (negative) technology shocks increase (reduce) the

steady state levels of capital, output, and consumption.
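The steady-state expressions can be checked numerically. The sketch below evaluates them for the parameter values quoted in these notes for the US with A = 1, plugs the result back into (33)-(35) with all variables held constant, and repeats the exercise for a higher technology level. The value A′ = 1.1 and the function names are assumptions for illustration.

```python
def steady_state(A, alpha, beta, delta, gamma):
    x = 1.0 / beta - (1.0 - delta)   # steady-state rental rate of capital
    l = alpha * x / (alpha * x + gamma * (x - (1.0 - alpha) * delta))
    k = ((1.0 - alpha) * A / x) ** (1.0 / alpha) * l
    c = (x - (1.0 - alpha) * delta) / (1.0 - alpha) * k
    return l, k, c

def residuals(A, l, k, c, alpha, beta, delta, gamma):
    # Equations (33)-(35) with time-invariant variables; all three should be ~0.
    y = A * k ** (1.0 - alpha) * l ** alpha
    r33 = l - (1.0 - gamma * c * l / (alpha * y))
    r34 = 1.0 / c - beta * (1.0 / c) * ((1.0 - alpha) * y / k + 1.0 - delta)
    r35 = y + (1.0 - delta) * k - c - k
    return r33, r34, r35

alpha, beta, delta, gamma = 0.667, 0.984, 0.025, 3.48
l, k, c = steady_state(1.0, alpha, beta, delta, gamma)
l_new, k_new, c_new = steady_state(1.1, alpha, beta, delta, gamma)  # hypothetical A'
print(l, k, c)
print(l_new, k_new, c_new)
print(residuals(1.0, l, k, c, alpha, beta, delta, gamma))
```

Labor does not depend on A in these expressions, so only capital, output, and consumption move between the two steady states.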

Transition and technology shocks: Consider a model economy which starts at a deterministic steady state and receives a positive technology shock so that A increases to A′. We know that the new steady state of this economy features higher levels of capital, output, and consumption. How does the economy get to the new steady state? If there are no further shocks, then it can be shown from (33)-(35) that the economy will gradually transit toward the new steady state. During this transition, capital, output, and consumption will increase.

17 If |ρ| > 1 then we will simply need to consider a "detrended" version of the model where we subtract the growth of A from A, Y, K, and C.

For a more general discussion, we need to consider an economy that starts at a deterministic

steady state and periodically receives technology shocks which imply different values of K, Y , and

C in future steady state. After each shock the economy starts adjusting/transiting toward the new

steady state.

Since we are not able to solve equations (33)-(36) analytically, we need to simulate the model economy on a computer in order to assess its performance in terms of generated cycles in output, investments, consumption, and other variables. In order to simulate it we need values for the parameters. The usual values of the parameters (for the US) are:18

$$
\alpha = 0.667, \quad \beta = 0.984, \quad \delta = 0.025, \quad \gamma = 3.48, \quad \rho = 0.979, \quad \sigma_{\eta} = 0.0072.
$$

The simulations provide the simulated values of model variables. We use then the definitions

of the standard deviation and correlations to check the model predictions. The following table,

borrowed from King and Rebelo (1999), summarizes the results from this exercise.

The simulation results indicate that the model closely matches the volatility of output, investments, consumption, and wages. It also matches the autocorrelations and the observed procyclicality (positive correlations with output) of almost all variables.

Although this "simple" model performs astonishingly well in matching these data moments, it fails to match the cyclicality of labor hours and the interest rate. Hansen (1985) has suggested a way to circumvent the problem with labor supply by making it an indivisible choice at the individual level, so that at the macro level the elasticity of labor supply to shocks is very high. However, that might create excess volatility in wages. To match the volatility of the interest rate and wage rates, this simple real business cycle model is therefore complemented with price rigidities. Adding price rigidities, however, makes the model much less tractable and brings back Keynesian arguments. We discuss one of the ways price rigidities are usually modelled in these frameworks in the next section.19

18 See Table 2 in King and Rebelo (1999).

Price Rigidities - The Calvo (1983) model

Classical and Keynesian schools differ the most in their view of how markets work. The classical (and neo-classical) school of thought conjectures that markets are perfect (i.e., frictions and distortions are insignificant). In this respect, it conjectures that prices freely adjust to equilibrate demand and supply in all markets. The classical dichotomy holds under this conjecture, i.e., money supply does not matter for real variables. The Keynesian (and new-Keynesian) school of thought conjectures that there are significant frictions and distortions. In particular, these imperfections create price rigidities. Therefore, prices do not adjust (fully) to equilibrate demand and supply in all markets. This breaks the classical dichotomy. Moreover, in such a setup shifts in demand and supply can affect output through prices too.20

There are two common ways of modelling price rigidity. The easy way assumes that changes in prices depend on time (time-dependent models). For example, Taylor (1980) assumes that firms change their prices every n-th period and that in each period a fraction 1/n of firms change their prices. Calvo (1983) assumes that in each period each firm can change its price with some probability. More precisely, Calvo (1983) assumes that at the beginning of each period a random event decides which of the firms can change their prices and which cannot. A more cumbersome, but more appealing, way of modelling price rigidity assumes that price changes depend on the state of the economy (menu cost models). In these models firms change prices when the expected benefit of changing their prices is higher than the cost of changing prices (the menu cost). The complications arise because the expected benefit of changing prices depends on the current and future states of the economy.

DSGE models with Calvo (1983) style price rigidities tend to be the main workhorses in the

central banks. We will now cover a version of the Calvo (1983) model. It will provide us with an

upward sloping supply curve/Phillips curve.

Time is discrete. The economy is populated by a continuum of mass one of identical and infinitely

lived households. The representative household derives instantaneous utility from consumption of

a basket of goods. The lifetime utility of the household is given by

$$
U = \sum_{t=0}^{+\infty} \beta^t C_t,
$$

where C is a constant elasticity of substitution basket of goods i,

$$
C = \left(\int_0^1 C_i^{\sigma}\, di\right)^{\frac{1}{\sigma}}, \tag{37}
$$

where σ ∈ (0, 1). The elasticity of substitution between any pair $C_i$ and $C_j$ (i ≠ j) is $\frac{1}{1-\sigma}$.21 Denote it by θ.

19 Another criticism that applies to the RBC models is their reliance on the Solow residual, which is usually not very well estimated.

20 The assumption that prices are very rigid finds limited support in microeconomic data.

The households spend their entire income on purchases of C. Therefore, the representative household's budget constraint is

$$
P_C C = \int_0^1 P_i C_i\, di, \tag{38}
$$

where $P_C$ is the price of C and $P_i$ is the price of good i.

The household chooses its demand for different goods to maximize its lifetime utility. Since it has no dynamic decisions (and therefore no inter-temporal trade-offs), the maximization of lifetime utility is equivalent to the maximization of instantaneous utility period by period. The problem of the representative household can be divided into two steps. In the first step the representative household chooses C to maximize its utility. In the second step it chooses $\{C_i\}_{i\in[0,1]}$ to maximize C. Therefore, in the first step it solves

$$
\max_{C} \; C - \lambda\left(P_C C - \int_0^1 P_i C_i\, di\right),
$$

for any time t. The solution of this problem gives the shadow value of marginally relaxing the budget constraint:

$$
1 = \lambda P_C.
$$

In the second step the household solves the following problem:

$$
\max_{\{C_i\}_{i\in[0,1]}} \; P_C\left(\int_0^1 C_i^{\sigma}\, di\right)^{\frac{1}{\sigma}} - \int_0^1 P_i C_i\, di.
$$

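The solution of this problem is given by the first order conditions for all goods i. Treating the integrals as sums, these first order conditions are

$$
[C_i]: \quad P_i = P_C C \frac{C_i^{\sigma-1}}{\tilde{C}}, \tag{39}
$$

where $\tilde{C} = \int_0^1 C_i^{\sigma}\, di$. This expression together with (37) and budget constraint (38) implies that

$$
P_i^{\frac{\sigma}{\sigma-1}} = \left(\frac{P_C C}{\tilde{C}}\right)^{\frac{\sigma}{\sigma-1}} C_i^{\sigma}.
$$

Therefore, it implies that

$$
\left(\int_0^1 P_i^{\frac{\sigma}{\sigma-1}}\, di\right)^{\frac{1}{\sigma}} = \left(\frac{P_C C}{\tilde{C}}\right)^{\frac{1}{\sigma-1}} \left(\int_0^1 C_i^{\sigma}\, di\right)^{\frac{1}{\sigma}} = P_C^{\frac{1}{\sigma-1}}\, C^{\frac{\sigma}{\sigma-1}}\, \tilde{C}^{-\frac{1}{\sigma-1}} = P_C^{\frac{1}{\sigma-1}}.
$$

Finally,

$$
P_C = \left(\int_0^1 P_i^{\frac{\sigma}{\sigma-1}}\, di\right)^{\frac{\sigma-1}{\sigma}}, \tag{40}
$$

which means that the aggregate price level is a basket of the prices of goods i.

21 $\varepsilon_{c_i,c_j} = \dfrac{d\ln(C_i/C_j)}{d\ln\left(\frac{\partial C}{\partial C_j} \big/ \frac{\partial C}{\partial C_i}\right)} = \dfrac{d\ln(C_i/C_j)}{d\ln\left(C_j^{\sigma-1}/C_i^{\sigma-1}\right)} = \dfrac{1}{1-\sigma}$.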

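Moreover, the (inverse) demand function implies that

$$
P_i = P_C \left(\int_0^1 C_i^{\sigma}\, di\right)^{\frac{1-\sigma}{\sigma}} C_i^{\sigma-1}.
$$

Therefore,

$$
C_i = \left(\frac{P_C}{P_i}\right)^{\frac{1}{1-\sigma}} C = \left(\frac{P_C}{P_i}\right)^{\theta} C.
$$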
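The price index (40) and the demand function can be checked against each other numerically. The sketch below replaces the continuum of goods with a handful of equally weighted goods, so that integrals become averages, and verifies that the implied demands reproduce the basket (37) and exhaust the budget (38). The prices, the values of `sigma` and `C`, and the function names are hypothetical.

```python
def mean(xs):
    return sum(xs) / len(xs)

def price_index(prices, sigma):
    # Equation (40), with the integral over i replaced by an average over goods.
    e = sigma / (sigma - 1.0)
    return mean([p ** e for p in prices]) ** (1.0 / e)

def demands(prices, C, sigma):
    # C_i = (P_C / P_i)^theta * C with theta = 1 / (1 - sigma).
    theta = 1.0 / (1.0 - sigma)
    P_C = price_index(prices, sigma)
    return [(P_C / p) ** theta * C for p in prices]

sigma, C = 0.8, 2.0                       # hypothetical values
prices = [0.9, 1.0, 1.1, 1.25, 1.5]       # hypothetical goods prices
P_C = price_index(prices, sigma)
C_i = demands(prices, C, sigma)
basket = mean([c ** sigma for c in C_i]) ** (1.0 / sigma)   # aggregator (37)
spending = mean([p * c for p, c in zip(prices, C_i)])       # right side of (38)
print(P_C, basket, spending)
```

Cheaper goods receive a larger share of demand, and the sensitivity of that share to relative prices is governed by θ.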

Each firm produces one good i. Firms have a monopoly over their product and set its price. Firms choose prices to maximize their real profits.

When there are no price rigidities, firms solve

$$
\max_{P_i/P_C} \; \frac{P_i}{P_C} C_i - \varphi_i C_i,
$$

s.t.

$$
C_i = \left(\frac{P_C}{P_i}\right)^{\frac{1}{1-\sigma}} C,
$$

where $\varphi_i$ is the marginal cost of producing good i. Plugging the demand function into profit and taking the first order condition with respect to $P_i/P_C$ gives

$$
\left[\frac{P_i}{P_C}\right]: \quad \frac{P_i}{P_C} = \frac{\varphi_i}{\sigma}.
$$

This expression is the (inverse) supply of good i. Under perfect competition the relative price is equal to marginal cost. Here, because firms are monopolists, the relative price is equal to $\varphi_i/\sigma$, which is greater than the marginal cost $\varphi_i$ since σ ∈ (0, 1).

This relation should hold for any time t. Assuming that firms are symmetric, in logarithms, the

expression offered above can be written as

$$
p_t = \ln \varphi_t - \ln \sigma,
$$

where lnσ < 0 since σ ∈ (0, 1).

Suppose now that there are price rigidities. At any time t any firm i has, with probability α, a sticky price and cannot change it. With probability 1 − α it does not have a sticky price and can change it. In this case firms set their prices not knowing when they will be able to reset them. Therefore, expectations of future prices matter for their current decisions. A firm which can set its price today knows that with probability α the price will still be in place in the next period. The price lasts exactly 1 period with probability 1 − α, exactly 2 periods with probability α(1 − α), and exactly 3 periods with probability α²(1 − α). The expected length of time until the price is reset is given by

$$
\sum_{t=0}^{+\infty} \alpha^t = \frac{1}{1-\alpha}.
$$
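The expected duration of a price can be verified in three ways: by summing the survival probabilities, by weighting each possible duration with its probability, and by simulating the random reset event directly. The value `alpha = 0.75`, the truncation length, the seed, and the function names below are assumptions for illustration.

```python
import random

def expected_duration_series(alpha, terms=2000):
    # Sum of survival probabilities: alpha**t is the chance that a price set
    # today is still in place t periods later.
    return sum(alpha ** t for t in range(terms))

def expected_duration_weighted(alpha, terms=2000):
    # Same object via the distribution of durations:
    # a price lasts exactly k periods with probability alpha**(k-1) * (1 - alpha).
    return sum(k * alpha ** (k - 1) * (1.0 - alpha) for k in range(1, terms + 1))

def simulate_duration(alpha, rng):
    # Count the periods a newly set price stays in place.
    periods = 1
    while rng.random() < alpha:
        periods += 1
    return periods

alpha = 0.75   # hypothetical probability of being stuck with the old price
rng = random.Random(0)
draws = [simulate_duration(alpha, rng) for _ in range(100000)]
simulated_mean = sum(draws) / len(draws)
print(expected_duration_series(alpha), expected_duration_weighted(alpha), simulated_mean)
```

If a period is a quarter, this value of α corresponds to prices being reset about once a year on average.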

In this case firms maximize the present discounted value of their profit streams whenever they can adjust their prices. Suppose that at time t firm i is able to adjust its price. It then solves its problem taking into account that for a certain period of time (indexed by k below) it will not be able to reset the price. Therefore, at time t it solves

$$
V_t = \max_{P_{i,t}} E_t\left[\sum_{k=0}^{+\infty} (\alpha\beta)^k \left[\frac{P_{i,t}}{P_{C,t+k}} C_{i,t+k} - \varphi_{i,t+k} C_{i,t+k}\right]\right], \tag{41}
$$

s.t.

$$
C_{i,t+k} = \left(\frac{P_{C,t+k}}{P_{i,t}}\right)^{\theta} C_{t+k}.
$$

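To solve this problem, plug $C_{i,t+k}$ into $V_t$ and take the first order condition with respect to $P_{i,t}$. This exercise gives

$$
[P_{i,t}]: \quad 0 = E_t\left[\sum_{k=0}^{+\infty} (\alpha\beta)^k \left[(1-\theta)\frac{P_{i,t}}{P_{C,t+k}} + \theta\varphi_{i,t+k}\right]\left(\frac{P_{i,t}}{P_{C,t+k}}\right)^{-\theta-1} \frac{1}{P_{C,t+k}} C_{t+k}\right].
$$

Given that $P_{i,t}$ does not depend on k, it can be taken out of the sum and this expression can be rewritten as

$$
0 = E_t\left[\sum_{k=0}^{+\infty} (\alpha\beta)^k \left[(1-\theta)\frac{P_{i,t}}{P_{C,t+k}} + \theta\varphi_{i,t+k}\right] P_{C,t+k}^{\theta} C_{t+k}\right].
$$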

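Therefore, the price of firm i at time t relative to the aggregate price $P_{C,t}$ is given by

$$
\frac{P_{i,t}}{P_{C,t}} = \frac{\theta}{\theta-1} \frac{\sum_{k=0}^{+\infty} (\alpha\beta)^k E_t\left[\varphi_{i,t+k}\left(\frac{P_{C,t+k}}{P_{C,t}}\right)^{\theta} C_{t+k}\right]}{\sum_{k=0}^{+\infty} (\alpha\beta)^k E_t\left[\left(\frac{P_{C,t+k}}{P_{C,t}}\right)^{\theta-1} C_{t+k}\right]}.
$$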

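Let all the firms that are able to set their price at time t be symmetric, i.e., $\varphi_{i,t+k} \equiv \varphi_{t+k}$. Moreover, let all the firms which are not able to set prices at time t also be symmetric. Therefore, denoting the price of firms which are able to adjust it at time t by $P^a_t$ gives

$$
\frac{P^a_t}{P_{C,t}} = \frac{\theta}{\theta-1} \frac{\sum_{k=0}^{+\infty} (\alpha\beta)^k E_t\left[\varphi_{t+k}\left(\frac{P_{C,t+k}}{P_{C,t}}\right)^{\theta} C_{t+k}\right]}{\sum_{k=0}^{+\infty} (\alpha\beta)^k E_t\left[\left(\frac{P_{C,t+k}}{P_{C,t}}\right)^{\theta-1} C_{t+k}\right]}. \tag{42}
$$

The aggregate price which prevails is the weighted average of the prices of firms which adjust their price and firms which do not. It is given by (40):

$$
P_t^{\frac{\sigma}{\sigma-1}} = \alpha P_{t-1}^{\frac{\sigma}{\sigma-1}} + (1-\alpha)\left(P^a_t\right)^{\frac{\sigma}{\sigma-1}}, \tag{43}
$$

where we have dropped subscript C.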

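Equations (42) and (43) constitute the new-Keynesian Phillips curve. To see this, first divide (43) by $P_t^{\frac{\sigma}{\sigma-1}}$, so that it is expressed in terms of the relative prices $P_{t-1}/P_t$ and $Q_t = P^a_t/P_t$. With slight abuse of notation, $P_{t-1}$ below stands for $P_{t-1}/P_t$. Next, consider the first order log-linear approximations of these equations around the steady state, where all firms are able to set prices. To do so denote

$$
f(P_{t-1}, Q_t) = \alpha P_{t-1}^{\frac{\sigma}{\sigma-1}} + (1-\alpha) Q_t^{\frac{\sigma}{\sigma-1}} - 1,
$$

$$
g(\ln P_{t-1}, \ln Q_t) = f\left(e^{\ln P_{t-1}}, e^{\ln Q_t}\right) = \alpha e^{\frac{\sigma}{\sigma-1}\ln P_{t-1}} + (1-\alpha) e^{\frac{\sigma}{\sigma-1}\ln Q_t} - 1.
$$

The first order approximation of g around the point $\left(\ln \bar{P}_{t-1}, \ln \bar{Q}_t\right)$ is

$$
\begin{aligned}
g(\ln P_{t-1}, \ln Q_t) \approx{} & \left[\alpha \bar{P}_{t-1}^{\frac{\sigma}{\sigma-1}} + (1-\alpha) \bar{Q}_t^{\frac{\sigma}{\sigma-1}} - 1\right] \\
& + \alpha \frac{\sigma}{\sigma-1} \bar{P}_{t-1}^{\frac{\sigma}{\sigma-1}} \left(\ln P_{t-1} - \ln \bar{P}_{t-1}\right) \\
& + (1-\alpha) \frac{\sigma}{\sigma-1} \bar{Q}_t^{\frac{\sigma}{\sigma-1}} \left(\ln Q_t - \ln \bar{Q}_t\right).
\end{aligned}
$$

At the steady state $\bar{P}_{t-1} = \bar{Q}_t = 1$. Therefore, around the steady state it has to be that

$$
g(\ln P_{t-1}, \ln Q_t) \approx \alpha \frac{\sigma}{\sigma-1} \ln P_{t-1} + (1-\alpha) \frac{\sigma}{\sigma-1} \ln Q_t.
$$

Since $f(P_{t-1}, Q_t) = g(\ln P_{t-1}, \ln Q_t)$ and $f(P_{t-1}, Q_t) \equiv 0$, we have that around the steady state

$$
0 = \alpha p_{t-1} + (1-\alpha) q_t,
$$

where lowercase letters denote logarithms. Notice that $p_{t-1} = \ln \frac{P_{t-1}}{P_t}$, which is approximately minus 1 times the inflation rate $\pi_t$. Therefore,

$$
\pi_t = \frac{1-\alpha}{\alpha} q_t.
$$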
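The quality of this first order approximation can be checked by solving (43), divided by the current price level, exactly for a given relative reset price and comparing the implied inflation with the linear formula. The parameter values and function names in the sketch are hypothetical; `q` is the log of the reset price relative to the current price level.

```python
import math

def exact_inflation(q, alpha, sigma):
    # Solve 1 = alpha * x**s + (1 - alpha) * Q**s for x = P_{t-1} / P_t,
    # where s = sigma / (sigma - 1) and Q = exp(q); inflation is -ln x.
    s = sigma / (sigma - 1.0)
    Q = math.exp(q)
    x = ((1.0 - (1.0 - alpha) * Q ** s) / alpha) ** (1.0 / s)
    return -math.log(x)

def approx_inflation(q, alpha):
    # First order result: pi_t = (1 - alpha) / alpha * q_t.
    return (1.0 - alpha) / alpha * q

alpha, sigma = 0.75, 0.8   # hypothetical values
for q in (0.001, 0.01, 0.05):
    print(q, exact_inflation(q, alpha, sigma), approx_inflation(q, alpha))
```

The gap between the two columns shrinks roughly with the square of q, as expected from a first order approximation.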

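Now consider the log-linear approximation of (42) around the steady state, assuming for simplicity that in the steady state $\bar{C} = 1$. To do so, rewrite it as

$$
Q_t \sum_{k=0}^{+\infty} (\alpha\beta)^k E_t\left[\left(\frac{P_{C,t+k}}{P_{C,t}}\right)^{\theta-1} C_{t+k}\right] = \frac{\theta}{\theta-1} \sum_{k=0}^{+\infty} (\alpha\beta)^k E_t\left[\varphi_{t+k}\left(\frac{P_{C,t+k}}{P_{C,t}}\right)^{\theta} C_{t+k}\right].
$$

The approximation of the left-hand side around the steady-state point is as follows. Write each variable as the exponential of its logarithm, with $q_t = \ln Q_t$, $c_{t+k} = \ln C_{t+k}$, and $p_{t+k} - p_t = \ln \frac{P_{t+k}}{P_t}$ (dropping subscript C), and denote

$$
g = e^{q_t} \sum_{k=0}^{+\infty} (\alpha\beta)^k E_t\left[e^{(\theta-1)(p_{t+k} - p_t)}\, e^{c_{t+k}}\right].
$$

The first order approximation of g around the point $\left(\ln \bar{Q}_t, \ln \bar{C}_{t+k}, \ln \frac{\bar{P}_{t+k}}{\bar{P}_t}\right)$ is

$$
\begin{aligned}
g \approx{} & \bar{Q}_t \sum_{k=0}^{+\infty} (\alpha\beta)^k \left(\frac{\bar{P}_{t+k}}{\bar{P}_t}\right)^{\theta-1} \bar{C}_{t+k} \\
& + \left(q_t - \ln \bar{Q}_t\right) \bar{Q}_t \sum_{k=0}^{+\infty} (\alpha\beta)^k \left(\frac{\bar{P}_{t+k}}{\bar{P}_t}\right)^{\theta-1} \bar{C}_{t+k} \\
& + \bar{Q}_t \sum_{k=0}^{+\infty} (\alpha\beta)^k \left(\frac{\bar{P}_{t+k}}{\bar{P}_t}\right)^{\theta-1} \bar{C}_{t+k} \left(E_t\left[c_{t+k}\right] - \ln \bar{C}_{t+k}\right) \\
& + \bar{Q}_t \sum_{k=0}^{+\infty} (\alpha\beta)^k (\theta-1)\, \bar{C}_{t+k} \left(\frac{\bar{P}_{t+k}}{\bar{P}_t}\right)^{\theta-1} \left(E_t\left[p_{t+k}\right] - p_t - \ln \frac{\bar{P}_{t+k}}{\bar{P}_t}\right).
\end{aligned}
$$

Around the steady-state point $\left(0, \ln \bar{C}, 0\right)$ this is given by

$$
g \approx \bar{C}\frac{1}{1-\alpha\beta} + q_t \bar{C} \frac{1}{1-\alpha\beta} + \bar{C} \sum_{k=0}^{+\infty} (\alpha\beta)^k \left(E_t\left[c_{t+k}\right] - \ln \bar{C}\right) + (\theta-1)\, \bar{C} \sum_{k=0}^{+\infty} (\alpha\beta)^k \left(E_t\left[p_{t+k}\right] - p_t\right).
$$

Therefore, with $\bar{C} = 1$, the approximation of the left-hand side is

$$
Q_t \sum_{k=0}^{+\infty} (\alpha\beta)^k E_t\left[\left(\frac{P_{t+k}}{P_t}\right)^{\theta-1} C_{t+k}\right] \approx \frac{1}{1-\alpha\beta} + \frac{1}{1-\alpha\beta} q_t + \sum_{k=0}^{+\infty} (\alpha\beta)^k E_t\left[c_{t+k}\right] + (\theta-1) \sum_{k=0}^{+\infty} (\alpha\beta)^k \left(E_t\left[p_{t+k}\right] - p_t\right).
$$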

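In turn, the approximation of the right-hand side is

$$
\begin{aligned}
\frac{\theta}{\theta-1} \sum_{k=0}^{+\infty} (\alpha\beta)^k E_t\left[\varphi_{t+k}\left(\frac{P_{t+k}}{P_t}\right)^{\theta} C_{t+k}\right] \approx{} & \frac{1}{1-\alpha\beta} + \frac{\theta}{\theta-1} \bar{\varphi} \sum_{k=0}^{+\infty} (\alpha\beta)^k E_t\left[\ln \frac{\varphi_{t+k}}{\bar{\varphi}}\right] \\
& + \frac{\theta}{\theta-1} \bar{\varphi} \sum_{k=0}^{+\infty} (\alpha\beta)^k E_t\left[c_{t+k}\right] \\
& + \frac{\theta}{\theta-1} \theta \bar{\varphi} \sum_{k=0}^{+\infty} (\alpha\beta)^k \left(E_t\left[p_{t+k}\right] - p_t\right),
\end{aligned}
$$

where $\bar{\varphi}$ is the steady-state value of marginal cost.

Equate these expressions and notice that $\frac{\theta}{\theta-1}\bar{\varphi} = 1$ to get

$$
q_t + p_t = (1-\alpha\beta) \sum_{k=0}^{+\infty} (\alpha\beta)^k E_t\left[p_{t+k} + \ln \frac{\varphi_{t+k}}{\bar{\varphi}}\right].
$$

Denote $E_{t-1} z_t = q_t + p_t$. The above equation is the solution of the following difference equation, where $z_0 = 0$:

$$
\alpha\beta E_t z_{t+1} - E_{t-1} z_t = -(1-\alpha\beta)\left[p_t + \ln \frac{\varphi_t}{\bar{\varphi}}\right].
$$

Solving for $q_t$ from this equation gives

$$
q_t = \alpha\beta E_t\left[q_{t+1} + \pi_{t+1}\right] + (1-\alpha\beta) \ln \frac{\varphi_t}{\bar{\varphi}}.
$$

Using the relation $\frac{\alpha}{1-\alpha}\pi_t = q_t$ gives

$$
\pi_t = \beta E_t\left[\pi_{t+1}\right] + \frac{1-\alpha}{\alpha}(1-\alpha\beta) \ln \frac{\varphi_t}{\bar{\varphi}}.
$$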

This is the new-Keynesian Phillips curve. The difference between it and the Keynesian Phillips curve is that it contains no backward-looking terms; instead, forward-looking (expected) inflation matters.
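The forward-looking nature of this relation is easiest to see by iterating it forward. The sketch below does so under an additional assumption that is not part of the derivation above: the log deviation of marginal cost from its steady state, `x_t`, is expected to decay geometrically at rate `rho_x`, and inflation stays bounded. Under that assumption inflation is proportional to current marginal cost. The values of `alpha`, `rho_x`, and `x_t`, and all function names, are hypothetical.

```python
def nkpc_slope(alpha, beta):
    # Coefficient on ln(phi_t / phi_bar) in the new-Keynesian Phillips curve.
    return (1.0 - alpha) / alpha * (1.0 - alpha * beta)

def inflation_closed_form(x_t, alpha, beta, rho_x):
    # Forward solution when E_t[x_{t+k}] = rho_x**k * x_t.
    return nkpc_slope(alpha, beta) * x_t / (1.0 - beta * rho_x)

def inflation_forward_sum(x_t, alpha, beta, rho_x, terms=2000):
    # pi_t = kappa * sum_k beta**k * E_t[x_{t+k}], truncated after `terms` terms.
    kappa = nkpc_slope(alpha, beta)
    return kappa * sum((beta * rho_x) ** k * x_t for k in range(terms))

alpha, beta, rho_x = 0.75, 0.984, 0.9   # alpha and rho_x are hypothetical
x_t = 0.01                               # 1 percent marginal-cost gap
pi_t = inflation_closed_form(x_t, alpha, beta, rho_x)
pi_next = inflation_closed_form(rho_x * x_t, alpha, beta, rho_x)  # E_t[pi_{t+1}]
print(pi_t, inflation_forward_sum(x_t, alpha, beta, rho_x))
```

A higher α (stickier prices) lowers the slope coefficient, so the same marginal-cost gap translates into less inflation.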


Expectations and financial markets

Bonds

Bonds are (usually) very primitive financial instruments. They are forms of loans (or I owe you/IOU). More precisely, a bond is a certificate which states that the issuer is indebted to the owner of the certificate. It states the amount of the debt to be paid back (principal) and the time when the debt has to be paid (maturity date). Depending on the type of the bond, the issuer might also be

obliged to pay the owner interest (coupon). Interest is usually payable at given and fixed intervals

(e.g., monthly or annual). Quite often bonds are negotiable in the sense that the ownership can be

transferred. This makes certain types of bonds highly liquid.

Firms tend to issue bonds in order to meet their financial requirements. According to the Pecking Order Theory (of a firm's capital structure), issuing debt/bonds is firms' second most preferred way to raise finance for investments. The most preferred way is to use internal finance, and the least preferred way is to raise finance by issuing equity. In this respect, equity (stocks) and bonds are quite alike: they are both financing instruments for firms. However, there is a big difference between bonds and equity/stocks. The stockholders of a firm are investors since they have equity stakes (ownership rights) in the firm. In turn, the bondholders have creditor stakes in the firm since they have lent money to the firm. Because bondholders are creditors, they usually have (absolute) priority over stockholders and will be repaid first in the event of bankruptcy. Another, less important, difference is that bonds usually have a defined maturity (consols or perpetual annuities do not have a maturity but are very rarely issued). In contrast, stocks are typically outstanding for an indefinite period. Not many firms, however, issue bonds. Moreover, firms which issue bonds are typically quite large.

In almost all countries one of the most significant issuers of bonds is the government. The government uses bonds in order to take loans. It uses these loans, together with taxes, to cover its expenditures. For example, tax proceeds usually decline during recessions. In turn, governments implement counter-cyclical policies and increase their expenditures. They then finance these expenditures using loans/bonds.

Government bonds tend to be sold at auctions, where participants are, for example, banks, investment and hedge funds, various speculators, the central bank, etc. (You can imagine something like a stock/derivative exchange. In fact, bond futures are massively traded on derivative exchanges.) These markets are usually very liquid.22 One of the most liquid markets is the market for US government bonds (US Treasury Bonds). Currently (18.03.2014), US debt issued in various forms of bonds and IOUs amounts to $17,546,814,482,078.90 (yes, more than $17 trillion).

22 Very recently, because of the financial crisis (and, perhaps, reckless lending/borrowing), bond markets dried up in several EU countries. The governments of these countries then faced tough financial constraints and asked for loans (i.e., for bond purchases) from the ECB, etc.

There are many (and very complicated) types of bonds. For our purposes it is sufficient to consider the following simplistic examples. Hereafter, in our discussion in this chapter, we will consider perfect competition in all (bond) markets and no lending constraints (i.e., think about a very liquid bond market where some of the participants can print money). Bond 1 matures in 1 year and pays principal P. What is the fair market value/price of such a bond? Let the yearly nominal interest rate be $i^1_1$, where the subscript stands for the compounding time (1 year) and the superscript stands for the time when compounding will happen (in 1 year). Then the price of this bond is the discounted value of the principal:

$$
P_{B1} = \frac{P}{1 + i^1_1}.
$$

Bond 2 matures in 3 years and pays principal P. What is the fair market value/price of Bond 2? Let the yearly interest rate be the same across years and, as in the previous example, let it be equal to $i^1_1$. The price of Bond 2 is the discounted value of the principal. It is

$$
P_{B2} = \frac{P}{\left(1 + i^1_1\right)^3}.
$$

Clearly, as long as $i^1_1 > 0$, $P_{B1} > P_{B2}$ since Bond 1 matures earlier than Bond 2.

Notice that bank loans seem to have a somewhat different structure. Usually, when an individual takes a loan of L (EUR) for a year from a bank, the loan specifies the yearly nominal interest rate i. Therefore, the individual has to repay (1 + i)L. Denote the latter by $\tilde{L}$. In such a case taking a loan is equivalent to issuing a bond with principal $\tilde{L}$. Therefore, there is almost no difference.

Bond 3 matures in T years and pays principal P. Let the yearly interest rate be $i^1_1$. The price of Bond 3 is

$$
P_{B3} = \frac{P}{\left(1 + i^1_1\right)^T}.
$$

Bond 4 matures in T years and pays principal P. Moreover, it pays yearly coupons C. Let the yearly interest rate be $i^1_1$. The price of Bond 4 is

$$
P_{B4} = \frac{P}{\left(1 + i^1_1\right)^T} + \sum_{t=1}^{T} \frac{C}{\left(1 + i^1_1\right)^t}.
$$

Bond 5 is exactly the same as Bond 4 but its coupon payments vary over time, so that we have $C_t$ instead of C. The price of Bond 5 is

$$
P_{B5} = \frac{P}{\left(1 + i^1_1\right)^T} + \sum_{t=1}^{T} \frac{C_t}{\left(1 + i^1_1\right)^t}.
$$

Consider again Bond 2. Assume now, however, that the interest rate during the first year is $i^1_1$, in the second year it is $i^2_1$, and in the third year it is $i^3_1$. In such a case, the price of Bond 2 is

$$
P_{B2} = \frac{P}{\left(1 + i^1_1\right)\left(1 + i^2_1\right)\left(1 + i^3_1\right)}.
$$

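In such a case the prices of Bond 3, Bond 4, and Bond 5 would be

$$
P_{B3} = P \prod_{t=1}^{T} \frac{1}{1 + i^t_1},
$$

$$
P_{B4} = P \prod_{t=1}^{T} \frac{1}{1 + i^t_1} + \sum_{t=1}^{T} C \prod_{\tau=1}^{t} \frac{1}{1 + i^{\tau}_1},
$$

$$
P_{B5} = P \prod_{t=1}^{T} \frac{1}{1 + i^t_1} + \sum_{t=1}^{T} C_t \prod_{\tau=1}^{t} \frac{1}{1 + i^{\tau}_1}.
$$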
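The pricing formulas for Bonds 1 to 5 share one structure: each payment is divided by the compounded one-year rates up to its payment date. A minimal sketch of that structure is given below; the function names, the principal of 100, and the interest rates and coupons are assumptions for illustration.

```python
def discount_factors(rates):
    # rates[t-1] is the one-year rate i^t_1 applying in year t.
    factors, running = [], 1.0
    for i in rates:
        running /= (1.0 + i)
        factors.append(running)
    return factors

def bond_price(principal, rates, coupons=None):
    # Principal discounted over all T years plus each coupon C_t discounted
    # over its own t years; coupons=None gives a zero coupon bond.
    dfs = discount_factors(rates)
    price = principal * dfs[-1]
    if coupons is not None:
        price += sum(c * d for c, d in zip(coupons, dfs))
    return price

P = 100.0
bond1 = bond_price(P, [0.05])                          # Bond 1: one year
bond2 = bond_price(P, [0.05] * 3)                      # Bond 2: three years, flat rate
bond4 = bond_price(P, [0.05] * 3, coupons=[5.0] * 3)   # Bond 4 with T = 3
bond2_rising = bond_price(P, [0.05, 0.06, 0.07])       # Bond 2 with rising rates
print(bond1, bond2, bond4, bond2_rising)
```

In the Bond 4 example the coupon equals the interest rate times the principal, so the bond is priced at par. Rising rates in later years lower the price of the three-year zero coupon bond.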

Bonds 1, 2, and 3 are called zero coupon bonds since they pay no coupons. In finance the interest rate is usually called the yield, since it indicates what a loan (an investment) yields. The time sequence of (similar) interest rates is termed the yield curve or the term structure of interest rates. In our examples, the yield curve is $i^1_1, i^2_1, i^3_1, ..., i^T_1$.

Imagine a market where zero coupon bonds which promise the same principal P are traded. Let there be bonds of all possible maturities: 1 year, 2 years, ..., and T years. It is straightforward to determine the yield curve from the prices of these bonds. With slight abuse of previous notation, let the prices of bonds with maturities 1 year, 2 years, ..., and T years be $P_{B1}$, $P_{B2}$, ..., and $P_{BT}$, respectively. Then interest rates are given by

$$
i^1_1 = \frac{P}{P_{B1}} - 1,
$$

$$
i^2_1 = \frac{P}{\left(1 + i^1_1\right) P_{B2}} - 1,
$$

$$
i^3_1 = \frac{P}{\left(1 + i^1_1\right)\left(1 + i^2_1\right) P_{B3}} - 1,
$$

$$
\vdots
$$

$$
i^T_1 = \left(\prod_{t=1}^{T-1} \frac{1}{1 + i^t_1}\right) \frac{P}{P_{BT}} - 1.
$$
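The recursion above can be sketched in a few lines: each new rate is recovered from the price of the next maturity and the rates already found. The helper that generates prices, the principal of 100, and the rates in `true_rates` are assumptions used only to produce a round-trip check.

```python
def zero_coupon_prices(principal, rates):
    # Price of a zero coupon bond for each maturity 1..T given one-year rates.
    prices, running = [], 1.0
    for i in rates:
        running *= (1.0 + i)
        prices.append(principal / running)
    return prices

def rates_from_prices(principal, prices):
    # prices[T-1] is the price of the zero coupon bond maturing in T years.
    rates, compounded = [], 1.0   # compounded = product of (1 + i^t_1) found so far
    for price in prices:
        i = principal / (compounded * price) - 1.0
        rates.append(i)
        compounded *= (1.0 + i)
    return rates

true_rates = [0.02, 0.03, 0.04]                 # hypothetical yield curve
prices = zero_coupon_prices(100.0, true_rates)
recovered = rates_from_prices(100.0, prices)
print(recovered)
```

The procedure needs a bond price for every maturity up to T; if a maturity is missing, the rates beyond it cannot be pinned down without further assumptions.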

Call this algorithm (∗).

Usually yield curves are upward sloping. The following figure illustrates the yield curve of US Treasury bonds as of 9th of February 2005. It has interest rate/yield on the Y-axis and time to maturity on the X-axis.

Clearly, in this example the yield curve is upward sloping with a diminishing marginal rate of increase. One of the widely accepted explanations for upward sloping yield curves is that longer maturities entail greater risks for creditors/lenders. Lenders then demand a risk premium, and a larger one for longer maturities. This explanation depends on the notion that creditors (as well as we) currently know less about the distant future than about the very near term.

Another commonly accepted explanation is related to the IS-LM Model. Imagine, for example, that investors are expecting either a gradual shift of IS to the right or a gradual shift of LM to the left (or both). In such a case, they would expect higher interest rates in the future and trade accordingly.

This explanation points to a deficiency in our pricing formulas, since it shows that interest rates might be highly variable/random. In such a case we need to replace the interest rates in all our formulas with their expected values. For example, in the most general case the price of Bond 5 is

$$P_{B5} = P\prod_{t=1}^{T}\frac{1}{1+i_1^{t,e}} + \sum_{t=1}^{T}C_t\prod_{\tau=1}^{t}\frac{1}{1+i_1^{\tau,e}}.$$

To make things simple consider the case when $C_t \equiv 0$ and $T = 3$. Clearly, this corresponds to Bond 2. The price of the bond is then

$$P_{B2} = \frac{P}{(1+i_1^{1,e})(1+i_1^{2,e})(1+i_1^{3,e})}.$$

Notice that the price of the bond depends negatively on expected interest rates. Therefore, if creditors are expecting higher rates in the future, the price of the bond declines. This is because investors who expect higher rates in the future would rather wait and invest later. Borrowers in such a case receive less for the same principal. Therefore, again, expectations matter.


Money supply and risk-free bonds

Central banks usually announce their monetary policy in terms of a level of nominal interest rate.

They adjust their money supply so that they meet the targeted rate. How do they do that? They

buy or sell bonds in the market. To see how this works, consider, again, a deterministic world and

zero coupon bonds. Let's take Bond 1 as an example,

$$P_{B1} = \frac{P}{1+i_1^1}.$$

If the central bank buys many such bonds, $P_{B1}$ would increase. A higher $P_{B1}$ implies a lower $i_1^1$. Conversely, if the central bank sells many such bonds, $P_{B1}$ would decline, which would imply a higher $i_1^1$. When the central bank buys bonds it pays for them with money. Therefore, it increases the money supply. In contrast, when it sells bonds it receives money. Therefore, it reduces the money supply. This implies that if the central bank announces that it plans to reduce the interest rate then it plans to increase the money supply. Of course, if it announces that it plans to increase the interest rate then it plans to reduce the money supply.

Central banks are (usually) public institutions. Therefore, they try to avoid excess risk. Central banks then usually buy and sell government bonds, which tend to be thought of as relatively safe assets. (After all, the government can almost always avoid defaulting on bonds denominated in national currency by printing money.)

It is commonly thought that short maturity US Treasury Bonds are among the safest government bonds. Economists usually take them as the risk-free assets/bonds. The difference between the interest paid on otherwise equivalent bonds (e.g., in terms of maturity and coupons) and the interest paid on US Treasury Bonds represents, to a certain extent, the risk premium (i.e., the premium that investors demand to hold a riskier asset).

Yield curve and zero coupon bonds in reality

In reality there are almost no zero coupon bonds. Usually, the only zero coupon bonds are those that have paid their last coupon and are very close to maturity. This renders our algorithm (∗) almost useless. For figuring out the yield curve a special algorithm is used. This algorithm is called bootstrap.

The following example illustrates this algorithm. Imagine we have a bond very close to its

maturity, which has paid all its coupons. Basically we have a bond like B1. We observe its price

and know its principal. Therefore, interest rate (in a perfect market) is

$$i_1^1 = \frac{P}{P_{B1}} - 1.$$

Further, imagine we have a bond which matures in two years and pays a coupon in a year. We know its price ($P'_{B2}$), principal ($P$), and the value of the coupon that it pays ($C$). Therefore, in a perfect market it has to be that

$$P'_{B2} = \frac{P}{(1+i_1^1)(1+i_1^2)} + \frac{C}{1+i_1^1}.$$

It is easy to figure out $i_1^2$ from this expression,

$$i_1^2 = \frac{P}{(1+i_1^1)P'_{B2} - C} - 1.$$
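The two bootstrap steps of this example can be checked numerically. The sketch below is illustrative only: the function name and the sample prices are assumptions, and the two-year bond is taken to pay a single coupon after one year, as in the text.

```python
def bootstrap_two_years(principal, price_one_year, price_two_year, coupon):
    """Recover the first- and second-year rates from a one-year zero coupon
    bond and a two-year bond that pays one coupon after the first year."""
    i1 = principal / price_one_year - 1.0
    i2 = principal / ((1.0 + i1) * price_two_year - coupon) - 1.0
    return i1, i2


# Hypothetical prices built from rates of 2% and 3% and a coupon of 5.
p1 = 100.0 / 1.02
p2 = 100.0 / (1.02 * 1.03) + 5.0 / 1.02
i1, i2 = bootstrap_two_years(100.0, p1, p2, 5.0)
print(i1, i2)
```

Longer maturities follow the same pattern: each new bond adds one unknown rate, which is solved for using the rates already recovered.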

Stocks, stock prices, and stock markets

The stock of a firm is the equity stake of its owners. It represents the residual assets of the

firm that would be paid to stockholders after discharge of all senior claims (e.g., bonds/debt). A

stockholder/shareholder is an individual who legally owns one or more shares of stock in a firm.

Firms (usually) maximize their value by appropriately designing production and marketing. It turns out that this is equivalent to maximizing their shareholder value.

To see this, suppose that we have a firm which lives $T$ periods and has a real profit stream $\pi_t$ in each period $t$. Denote the real interest rate by $r$ and, for simplicity, assume that it is constant over time. At time $t = 1$, the present value of this firm is given by

$$\begin{aligned}
V_1 &= \pi_1 + \frac{1}{1+r}\pi_2 + \frac{1}{(1+r)^2}\pi_3 + \ldots + \frac{1}{(1+r)^{t-1}}\pi_t + \ldots + \frac{1}{(1+r)^{T-1}}\pi_T \\
&= \sum_{\tau=1}^{T}\frac{1}{(1+r)^{\tau-1}}\pi_\tau.
\end{aligned}$$

The present value of the firm V1 is the total present value of its stocks.

Clearly, $V_1$ can be rewritten in the following manner

$$V_1 = \pi_1 + \frac{1}{1+r}V_2,$$

where $\pi_1$ are the profits in the current period ($t = 1$) and $\frac{1}{1+r}V_2$ is the discounted next-period value of the firm (the discounted future capital gains).
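The equivalence of the discounted sum and the recursive form can be verified on a small example. This is a minimal sketch with made-up profits; the function name `firm_value` is an assumption for the illustration.

```python
def firm_value(profits, r):
    """Present value, as of the first period in `profits`, of the profit
    stream profits[0], profits[1], ... discounted at the constant rate r."""
    return sum(p / (1.0 + r) ** k for k, p in enumerate(profits))


# Hypothetical profit stream pi_1, ..., pi_4 and a real rate of 5%.
profits = [10.0, 12.0, 9.0, 11.0]
r = 0.05
v1 = firm_value(profits, r)      # value at t = 1
v2 = firm_value(profits[1:], r)  # value at t = 2
print(v1, profits[0] + v2 / (1.0 + r))  # the two numbers coincide
```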

For simplicity, assume that this firm has zero net debt. A stockholder/shareholder owns a share/fraction of $V_1$. Let this fraction be some $\omega$ from $(0, 1)$. In other words, a shareholder is entitled to an amount $\omega V_1$ of value (real money). Since $\omega$ is a positive constant, the maximization of $V_1$ is equivalent to the maximization of $\omega V_1$.

In practice, shareholders usually call profits $\pi_t$ dividends. In such a case it is practical to write $d_t$ instead of $\pi_t$. In this respect, the value of the firm is simply the discounted sum of dividends.

If this firm is privately or publicly traded in a perfect market then its market value is equal to its value. Therefore, the price of a 1 percent ownership of the stock of the firm (or simply the price of a 1% share) is $0.01V_1$. In practice, $S$ is used to denote the stock price, instead of $V$.


Stock markets and speculators

Firms are publicly traded (usually) in stock markets/exchanges. Examples of stock markets are the New York Stock Exchange and the London Stock Exchange. In a stock market, speculators trade the stocks of many firms simultaneously.

Hereafter, we will assume that the values of firms are exogenously given keeping in mind that

they depend on the profits of the firms. For publicly traded firms, this is equivalent to assuming

that the stock prices of the firms are given.

In a deterministic world, speculators invest in the stocks of those firms which will have the highest returns. For example, if there are two firms, Firm 1 and Firm 2, which currently have the same stock price $S$ and will have next-period stock prices $S_1$ and $S_2$ with $S_2 < S_1$, then the speculators would buy stocks of Firm 1. This is because the return on buying the stock of Firm 1 ($R_1$) is higher than the return on buying the stock of Firm 2 ($R_2$),

$$R_1 = \frac{S_1 - S}{S} > \frac{S_2 - S}{S} = R_2.$$

Of course, in complete markets, arbitrage would push up the current price of the stock of Firm 1 until $R_1 = R_2$. We have assumed, however, that the stock prices are given. Therefore, $S$ does not change in our example.

Stochastic returns and portfolios of stocks: The world is not deterministic in the sense that

future profits of firms are usually not very predictable. Therefore, the future prices of stocks are not

very predictable. Speculators/investors then decide upon expected returns. Moreover, they usually

hold portfolios of stocks.

For example, if $N$ firms are traded in the stock market and there are $M$ speculators/investors, then speculator $j$ holds a portfolio

$$P_j = \sum_{i=1}^{N}\mu_{i,j}S_i,$$

where $\mu_{i,j}$ is the amount of stock $S_i$ in the portfolio of speculator $j$. If $\mu_{i,j} = 0$ then speculator $j$ does not have stocks of Firm $i$. If $\mu_{i,j} = 1$ then speculator $j$ holds real money equivalent to all the stocks of Firm $i$. If $\mu_{i,j} = -1$ then speculator $j$ does not have stocks of Firm $i$ and, moreover, is in debt to deliver real money equivalent to all the stocks of Firm $i$. In other words, speculator $j$ has short-sold the stocks of Firm $i$.

Can µi,j be higher than 1 or lower than −1? Yes, it can. This is because Si is the real value

of the stock of Firm i. In such a case contracts can be written promising multiples of the value of

that stock. Consider, for example, a situation when speculator j has all the stocks of Firm i as well

as a certificate promising to pay exactly twice the value of the stock whenever exercised. In such a

case the speculator has real value of 3Si. In other words µi,j = 3.


Clearly, for any firm $i$ the holdings of its stock have to sum to 1 across all speculators,

$$\sum_{j=1}^{M}\mu_{i,j} = 1.$$

In terms of portfolios, speculators/investors select the amounts of stock ownership $\mu$ and focus on the returns of their portfolios. If the current price of a selected portfolio is $P$ then the return on it is the percentage change of its price over time. Using the notation of our previous example, and letting $P_j$ stand for the future value of the portfolio, the return is

$$R_j = \frac{P_j - P}{P}.$$

These returns are stochastic (they are random variables) because the future value of the portfolio $P_j$ is a random variable.

Utility maximization and mean-variance trade-off

Speculators buy stocks to maximize utility. It turns out that generally the problem of speculators

to design an appropriate portfolio can be summarized in terms of expected value and risk trade-off

given that the future value of portfolio is a random variable.

The following example illustrates this point. Consider a speculator who has utility function $u(\cdot)$. Let $u(\cdot)$ be strictly increasing and concave in portfolio returns. Suppose the returns on portfolios are normally distributed with expected value $E$ and variance $\sigma^2$. In such a case, the expected utility of the speculator is

$$E[u(R)] = \int_{-\infty}^{+\infty}u(R)f(R;E,\sigma)\,dR,$$

where

$$f(R;E,\sigma) = \frac{1}{\sqrt{2\pi}\,\sigma}e^{-\frac{1}{2}\left(\frac{R-E}{\sigma}\right)^2}.$$

Clearly, the expected utility depends on $E$ and $\sigma^2$, where $\sigma^2$ is a measure of risk.

Use $Z$ to denote

$$Z = \frac{R-E}{\sigma}.$$

Given the properties of expectation and variance operators (see Appendix), $Z$ is a normal random variable with expected value 0 and variance 1. $Z$ is usually called the standardized return.

Replace $R$ with $Z$ in the expression for expected utility (notice that the limits of integration don't change):

$$E[u(E+\sigma Z)] = \int_{-\infty}^{+\infty}u(E+\sigma Z)\varphi(Z)\,dZ,$$

where $\varphi(Z)$ is the density function of the standard normal distribution,

$$\varphi(Z) = \frac{1}{\sqrt{2\pi}}e^{-\frac{1}{2}Z^2}.$$

Next, we take the derivative of the expected utility $E[u]$ with respect to the standard deviation of return $\sigma$. Assuming that $E[u]$ is finite, the derivative is

$$\begin{aligned}
\frac{dE[u]}{d\sigma} &= \int_{-\infty}^{+\infty}\frac{d}{d\sigma}u(E+\sigma Z)\varphi(Z)\,dZ \\
&= \int_{-\infty}^{+\infty}u'(E+\sigma Z)\left(\frac{dE}{d\sigma}+Z\right)\varphi(Z)\,dZ.
\end{aligned}$$

An indifference curve is defined as the (locus of) points where $\frac{dE[u]}{d\sigma} = 0$:

$$0 = \int_{-\infty}^{+\infty}u'(E+\sigma Z)\left(\frac{dE}{d\sigma}+Z\right)\varphi(Z)\,dZ.$$

This expression can be further rewritten as

$$\frac{dE}{d\sigma} = -\frac{\int_{-\infty}^{+\infty}u'(E+\sigma Z)Z\varphi(Z)\,dZ}{\int_{-\infty}^{+\infty}u'(E+\sigma Z)\varphi(Z)\,dZ},$$

where $E$ is the expected return of the portfolio and $\sigma$ is its standard deviation. $E(\sigma)$ represents an indifference curve in the space of expected value $E$ and standard deviation $\sigma$. It is increasing and convex in $\sigma$. In other words, $\frac{dE}{d\sigma} > 0$ and $\frac{d^2E}{d\sigma^2} > 0$ (see Appendix).
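The slope formula can be evaluated numerically for a concrete utility function. The sketch below assumes, purely for illustration, the exponential utility $u(R) = -e^{-aR}$ (so $u'(R) = ae^{-aR}$), for which the slope works out to $a\sigma$ in closed form. The function names, the midpoint integration rule, and the truncation of the integral to $[-10, 10]$ are choices made for this example only.

```python
import math


def phi(z):
    """Standard normal density."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def indifference_slope(u_prime, mean, sigma, lo=-10.0, hi=10.0, steps=20000):
    """dE/dsigma along an indifference curve: minus the ratio of the two
    integrals in the text, approximated with the midpoint rule."""
    h = (hi - lo) / steps
    num = 0.0
    den = 0.0
    for k in range(steps):
        z = lo + (k + 0.5) * h
        w = u_prime(mean + sigma * z) * phi(z) * h
        num += w * z
        den += w
    return -num / den


a = 2.0  # hypothetical risk aversion parameter


def u_prime(R):
    return a * math.exp(-a * R)


sigmas = [0.1, 0.2, 0.3]
slopes = [indifference_slope(u_prime, 0.05, s) for s in sigmas]
print(slopes)  # positive and rising with sigma
```

For this utility function the slopes are positive and grow with $\sigma$, which is the increasing and convex shape described above.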

Utility increases as we move from one indifference curve to another from right to left in standard deviation and mean space (i.e., towards lower risk for the same expected return). The following figure shows these indifference curves.


Measures of returns and risk

The expected value of a random variable is the center of mass of its distribution. If the random variable has a continuous distribution function then it is equal to its expected value with probability zero. However, for unimodal symmetric distributions such as the normal, realizations tend to cluster around the expected value. In this sense, the expected value of a random variable tends to be the best guess available (it minimizes the mean squared forecast error).

In our previous example, we used variance as a measure of risk. The variance of a random variable shows how dispersed its outcomes may be. Therefore, it is one of the most commonly applied risk measures. It is especially relevant for normally distributed random variables.

Digression (Central Limit Theorem): Consider a portfolio consisting of $N$ stocks with i.i.d. returns and weights $\frac{1}{N}$. According to the Central Limit Theorem, for large $N$ the returns on this portfolio will be approximately normally distributed (provided the returns have finite variance). The Central Limit Theorem is a very powerful result. This result is one of the reasons why the normal distribution is used so often.

There are many other measures of risk. Examples are the range, the semi-inter-quartile range, the semi-variance, and the mean absolute deviation. Each of these measures can have slightly different implications (scale) for risk.

The range (RANGE) is defined as the difference between the highest and lowest outcomes. Let the returns on portfolio $R_j$ be from $[R_j^{\min}, R_j^{\max}]$. Then the range is

$$RANGE = R_j^{\max} - R_j^{\min}.$$

The semi-inter-quartile range (SRANGE) is usually defined as half the difference between the 75th and 25th percentiles of the random variable. In our example,

$$SRANGE = \frac{R_j^{q75} - R_j^{q25}}{2}.$$


The variance is a central moment in the sense that it considers deviations from the mean/expected value, i.e., $V[R_j] = E\left[(R_j - E[R_j])^2\right]$. It gives equal weight to deviations above and below the mean/expected value. However, risk averse investors might be more concerned about returns below the mean (i.e., downside risk but not upside risk). The semi-variance (SEMIVAR) is a measure of risk that relates just to that risk. It is defined as

$$\tilde{R}_j = \begin{cases} R_j - E[R_j] & \text{if } R_j < E[R_j], \\ 0 & \text{if } R_j \geq E[R_j], \end{cases}$$

$$SEMIVAR = E\left[\tilde{R}_j^2\right].$$

Variance and semi-variance can be sensitive to observations distant from the mean/expected value (i.e., outliers), since deviations enter them squared. The mean absolute deviation (MAD) is less sensitive to this problem. It is defined as

$$MAD = E\left[\,|R_j - E[R_j]|\,\right].$$
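To see how these measures differ on the same data, they can be computed from a sample of returns. The sketch below replaces expectations with sample averages; the function name, the dictionary keys, and the sample are made up for the illustration. The spread between the 75th and 25th percentiles relies on `statistics.quantiles`, whose default interpolation method is one of several conventions.

```python
import statistics


def risk_measures(returns):
    """Sample analogues of the risk measures discussed in the text."""
    mean = statistics.fmean(returns)
    q25, _median, q75 = statistics.quantiles(returns, n=4)
    downside = [min(r - mean, 0.0) for r in returns]  # zero above the mean
    return {
        "range": max(returns) - min(returns),
        "quartile_spread": q75 - q25,
        "variance": statistics.fmean([(r - mean) ** 2 for r in returns]),
        "semi_variance": statistics.fmean([d * d for d in downside]),
        "mad": statistics.fmean([abs(r - mean) for r in returns]),
    }


# Hypothetical sample of five returns with mean zero.
sample = [-0.10, -0.02, 0.01, 0.03, 0.08]
measures = risk_measures(sample)
for name, value in measures.items():
    print(name, value)
```

In this sample the large negative return dominates the semi-variance, which is why the semi-variance is more than half of the variance.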

Hereafter, we will maintain the assumption that returns on stocks/portfolios have a normal distribution. Therefore, we will characterize the risk with variance (or its square root: standard deviation).

Measuring portfolio return and risk to assemble a portfolio

An expected utility maximizing speculator/investor assembles its portfolio of stocks so as to achieve the highest possible utility. It assembles the portfolio by selecting the number of each type of stock ($\mu$ in our previous example). The expected return and risk of the portfolio are directly linked to the expected returns and risks of the underlying assets/stocks.

The following very simplistic example illustrates this point. Let there be only 2 firms, Firm 1 and Firm 2, with stocks $S_1$ and $S_2$. The current value of the portfolio of the speculator is

$$P = \mu_1S_1 + \mu_2S_2,$$

where $\mu_1$ is the number of $S_1$ stocks and $\mu_2$ is the number of $S_2$ stocks.

Slightly abusing the notation, let the next period prices of stocks 1 and 2 be $S_1^1$ and $S_2^1$, correspondingly. Therefore, the next period price of the portfolio is

$$P^1 = \mu_1S_1^1 + \mu_2S_2^1.$$

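The return on the portfolio then is

$$\begin{aligned}
R = \frac{P^1 - P}{P} &= \frac{\mu_1\left(S_1^1 - S_1\right) + \mu_2\left(S_2^1 - S_2\right)}{P} \\
&= \frac{\mu_1S_1}{P}\frac{S_1^1 - S_1}{S_1} + \frac{\mu_2S_2}{P}\frac{S_2^1 - S_2}{S_2} \\
&= \omega_1R_{S_1} + \omega_2R_{S_2},
\end{aligned}$$

where $R_{S_1}$ is the return on stock $S_1$ and $R_{S_2}$ is the return on stock $S_2$. $\omega_1 = \frac{\mu_1S_1}{P}$ is the weight of stock $S_1$ in the portfolio, and $\omega_2 = \frac{\mu_2S_2}{P}$ is the weight of stock $S_2$ in the portfolio. These weights sum up to 1,

$$1 = \omega_1 + \omega_2.$$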

In this respect this formula implies that the return on the portfolio $R$ is the weighted sum of the returns on the underlying assets/stocks $R_{S_1}$ and $R_{S_2}$. Notice that by selecting the number of stocks in the portfolio, $\mu_1$ and/or $\mu_2$, the speculator selects the weights of stocks $\omega_1$ and $\omega_2$ in the portfolio.

The expected value of the return on this portfolio is

$$\begin{aligned}
E[R] &= E[\omega_1R_{S_1} + \omega_2R_{S_2}] \\
&= \omega_1E[R_{S_1}] + \omega_2E[R_{S_2}].
\end{aligned}$$

In turn, the variance is

$$V[R] = \omega_1^2V[R_{S_1}] + \omega_2^2V[R_{S_2}] + 2\omega_1\omega_2COV[R_{S_1}, R_{S_2}],$$

where $COV[R_{S_1}, R_{S_2}]$ is the covariance of $R_{S_1}$ and $R_{S_2}$,

$$COV[R_{S_1}, R_{S_2}] = E\left[(R_{S_1} - E[R_{S_1}])(R_{S_2} - E[R_{S_2}])\right].$$

Intuitively, the portfolio return $R = \omega_1R_{S_1} + \omega_2R_{S_2}$ varies because so do $R_{S_1}$ and $R_{S_2}$. However, we need to take into account that $R_{S_1}$ and $R_{S_2}$ co-vary, or vary together, and that can amplify or reduce the variance of the portfolio. The following two examples illustrate the latter point. Suppose $S_1$ and $S_2$ are almost the same (maybe because they are stocks of vertically or horizontally interrelated firms) and

$$R_{S_1} = R_{S_2}.$$

In such a case

$$V[R_{S_1}] = V[R_{S_2}],$$

$$COV[R_{S_1}, R_{S_2}] = V[R_{S_1}],$$

and

$$\begin{aligned}
V[R] &= \omega_1^2V[R_{S_1}] + \omega_2^2V[R_{S_1}] + 2\omega_1\omega_2V[R_{S_1}] \\
&= \left(\omega_1^2 + 2\omega_1\omega_2 + \omega_2^2\right)V[R_{S_1}] \\
&= (\omega_1 + \omega_2)^2V[R_{S_1}] \\
&= V[R_{S_1}].
\end{aligned}$$

This inference should have followed since if $R_{S_1} = R_{S_2}$ then $R = R_{S_1}$. This example corresponds to the case when $S_1$ and $S_2$ co-vary perfectly (they are almost the same).

Suppose now that $S_1$ and $S_2$ are extremely different (maybe because they are stocks of competing firms): when $R_{S_1} = 1$ then $R_{S_2} = 0$, but when $R_{S_1} = 0$ then $R_{S_2} = 1$. Let further $R_{S_1}$ be equal to 1 with probability $\frac{1}{2}$, which implies that $E[R_{S_1}] = E[R_{S_2}] = \frac{1}{2}$. Moreover,

$$V[R_{S_1}] = E\left[(R_{S_1} - E[R_{S_1}])^2\right] = E\left[\left(R_{S_1} - \frac{1}{2}\right)^2\right] = E\left[R_{S_1}^2 - R_{S_1} + \frac{1}{4}\right] = \frac{1}{4}.$$

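Apparently, the same holds for $R_{S_2}$: $V[R_{S_2}] = \frac{1}{4}$. However, notice that

$$\begin{aligned}
COV[R_{S_1}, R_{S_2}] &= E\left[\left(R_{S_1} - \frac{1}{2}\right)\left(R_{S_2} - \frac{1}{2}\right)\right] \\
&= E\left[R_{S_1}R_{S_2} - \frac{1}{2}R_{S_2} - \frac{1}{2}R_{S_1} + \frac{1}{4}\right] \\
&= E[R_{S_1}R_{S_2}] - \frac{1}{2}E[R_{S_2}] - \frac{1}{2}E[R_{S_1}] + \frac{1}{4} \\
&= 0 - \frac{1}{2}E[R_{S_2}] - \frac{1}{2}E[R_{S_1}] + \frac{1}{4} \\
&= -\frac{1}{4}.
\end{aligned}$$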

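Therefore, since the expected value of the return on the portfolio is simply the weighted sum of the expected returns on stocks, we have that

$$\begin{aligned}
E[R] &= E[\omega_1R_{S_1} + \omega_2R_{S_2}] \\
&= \omega_1E[R_{S_1}] + \omega_2E[R_{S_2}] \\
&= \frac{1}{2}.
\end{aligned}$$

In turn, the variance of the portfolio is

$$\begin{aligned}
V[R] &= \omega_1^2\frac{1}{4} + \omega_2^2\frac{1}{4} - 2\omega_1\omega_2\frac{1}{4} \\
&= (\omega_1 - \omega_2)^2\frac{1}{4},
\end{aligned}$$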

which can be strictly less than the variance of any of the stocks in the portfolio. Let $\omega_1 = \omega_2$; then $V[R] = 0 < \frac{1}{4}$. However, if $\omega_1 = 1$ or $\omega_2 = 1$ then $V[R] = \frac{1}{4}$. This happens because of the negative covariance between $R_{S_1}$ and $R_{S_2}$ and is called diversification of risk.
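The two-state example can be reproduced by enumerating the outcomes directly. The sketch below is a minimal illustration; the function name and the representation of outcomes as (probability, return 1, return 2) triples are assumptions made for the example.

```python
def portfolio_moments(w1, outcomes):
    """Mean and variance of the portfolio return w1*R1 + (1 - w1)*R2, where
    outcomes is a list of (probability, return_1, return_2) triples."""
    w2 = 1.0 - w1
    mean = sum(p * (w1 * r1 + w2 * r2) for p, r1, r2 in outcomes)
    var = sum(p * (w1 * r1 + w2 * r2 - mean) ** 2 for p, r1, r2 in outcomes)
    return mean, var


# The example from the text: exactly one of the two stocks pays 1.
outcomes = [(0.5, 1.0, 0.0), (0.5, 0.0, 1.0)]
for w1 in (0.0, 0.25, 0.5, 0.75, 1.0):
    print(w1, portfolio_moments(w1, outcomes))
```

The expected return stays at one half for every weight, while the variance falls from one quarter at the corners to zero at equal weights.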

This example conveniently illustrates a very important point. In case $R_{S_1}$ and $R_{S_2}$ don't vary together perfectly, speculators can choose the weights of $S_1$ and $S_2$ in their portfolios so as to minimize variance/risk for the same level of expected return.

The magnitude of the covariance is not easy to interpret since it depends on the scale of the possible realizations of the random variables. Usually, therefore, another measure is used to describe the (linear) relation between random variables. It is the normalized version of the covariance and is called the correlation coefficient. It is defined as

$$\rho_{S_1,S_2} = \frac{COV[R_{S_1}, R_{S_2}]}{\sqrt{V[R_{S_1}]V[R_{S_2}]}}.$$

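By definition, $\rho_{S_1,S_2} \in [-1, 1]$. Using the correlation coefficient and denoting $V[R] = \sigma_P^2$, $V[R_{S_1}] = \sigma_{S_1}^2$ and $V[R_{S_2}] = \sigma_{S_2}^2$ we have that the variance of the portfolio returns is

$$\sigma_P^2 = \omega_1^2\sigma_{S_1}^2 + \omega_2^2\sigma_{S_2}^2 + 2\omega_1\omega_2\sigma_{S_1}\sigma_{S_2}\rho_{S_1,S_2}.$$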

A speculator chooses the weights $\omega_1$ and $\omega_2$ to construct its portfolio and bases its choices on the expected value and variance of returns. Therefore, it is interesting to see how the expected value and variance of the portfolio returns depend on $\omega_1$ and $\omega_2$. First of all notice that $\omega_2 = 1 - \omega_1$. Therefore, it is enough to choose only one of the weights, e.g., $\omega_1$. Let for simplicity $E[R_{S_1}] > E[R_{S_2}]$. In such a case the expected value of portfolio returns is a linear and increasing function of $\omega_1$:

$$E[R] = E[R_{S_2}] + \omega_1\left(E[R_{S_1}] - E[R_{S_2}]\right).$$

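In turn,

$$\begin{aligned}
\sigma_P^2 &= \omega_1^2\sigma_{S_1}^2 + (1-\omega_1)^2\sigma_{S_2}^2 + 2\omega_1(1-\omega_1)\sigma_{S_1}\sigma_{S_2}\rho_{S_1,S_2} \\
&= \omega_1^2\left(\sigma_{S_1}^2 + \sigma_{S_2}^2 - 2\sigma_{S_1}\sigma_{S_2}\rho_{S_1,S_2}\right) - 2\omega_1\sigma_{S_2}\left(\sigma_{S_2} - \sigma_{S_1}\rho_{S_1,S_2}\right) + \sigma_{S_2}^2.
\end{aligned}$$

This is an upward opening parabola in $\omega_1$ since

$$\sigma_{S_1}^2 + \sigma_{S_2}^2 - 2\sigma_{S_1}\sigma_{S_2}\rho_{S_1,S_2} = (\sigma_{S_1} - \sigma_{S_2})^2 + 2\sigma_{S_1}\sigma_{S_2}\left(1 - \rho_{S_1,S_2}\right) > 0$$

for any value of $\rho_{S_1,S_2}$ (except in the degenerate case $\sigma_{S_1} = \sigma_{S_2}$ and $\rho_{S_1,S_2} = 1$, when it is zero). The following figures illustrate the relationships between $E[R]$ and $\omega_1$, and between $\sigma_P^2$ and $\omega_1$.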

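In the second figure $\omega_1^{mv}$ is the $\omega_1$ which delivers the lowest variance of returns. It can be found from the first order condition:

$$\frac{\partial\sigma_P^2}{\partial\omega_1} = 0 \Leftrightarrow 2\omega_1\left(\sigma_{S_1}^2 + \sigma_{S_2}^2 - 2\sigma_{S_1}\sigma_{S_2}\rho_{S_1,S_2}\right) - 2\sigma_{S_2}\left(\sigma_{S_2} - \sigma_{S_1}\rho_{S_1,S_2}\right) = 0.$$

Therefore,

$$\omega_1^{mv} = \frac{\sigma_{S_2}\left(\sigma_{S_2} - \sigma_{S_1}\rho_{S_1,S_2}\right)}{\sigma_{S_1}^2 + \sigma_{S_2}^2 - 2\sigma_{S_1}\sigma_{S_2}\rho_{S_1,S_2}}.$$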
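The minimum variance weight implied by the first order condition can be checked against the variance formula directly. The sketch below is illustrative; the function names and the parameter values are assumptions for the example.

```python
def portfolio_variance(w1, sigma1, sigma2, rho):
    """Variance of the two-asset portfolio with weights w1 and 1 - w1."""
    w2 = 1.0 - w1
    return (w1 ** 2 * sigma1 ** 2 + w2 ** 2 * sigma2 ** 2
            + 2.0 * w1 * w2 * sigma1 * sigma2 * rho)


def min_variance_weight(sigma1, sigma2, rho):
    """Weight of asset 1 that solves the first order condition."""
    denom = sigma1 ** 2 + sigma2 ** 2 - 2.0 * sigma1 * sigma2 * rho
    if denom == 0.0:
        raise ValueError("degenerate case: equal volatilities and rho = 1")
    return sigma2 * (sigma2 - sigma1 * rho) / denom


# Hypothetical volatilities of 20% and 10% with correlation 0.3.
w_mv = min_variance_weight(0.2, 0.1, 0.3)
v_mv = portfolio_variance(w_mv, 0.2, 0.1, 0.3)
print(w_mv, v_mv)
```

Moving the weight slightly in either direction from `w_mv` raises the variance, which is what the parabola in the text predicts.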

Given that portfolio returns are linear in $\omega_1$, we can easily replace $\omega_1$ in the last figure with $E[R]$. In such a case we would have

In this figure the curve $\sigma_P^2$ is the minimum variance opportunity set. This is the set of variance and expected return points which offers the minimum variance for a given expected rate of return. In turn, $E[R^{mv}]$ is the expected return of the minimum variance portfolio.

Given that we have for indifference curves expected returns as functions of standard deviation, usually this figure is transposed.

The speculators' optimal choice of portfolio/weights is given by the point of tangency of indifference curves and the minimum variance opportunity set. This is because indifference curves are convex and utility increases as these curves shift to the left. Moreover, exactly because of that, no portfolio will be selected below the horizontal line that corresponds to $E[R^{mv}]$.

This analysis easily extends to the case when the number of stocks/assets is greater than 2. We have not defined precisely what $S_2$ is. We needed simply that it has an expected value and a variance. Therefore, it can be, for example, a linear combination of stocks of many firms. It can even include bonds. Hereafter, we will assume that portfolios can include many types of assets/financial instruments: stocks, bonds, options, etc. Therefore, we will let speculators assemble their portfolios using not only stocks. Moreover, we will assume that all speculators know the correct variances and expected returns of assets.

Let's assume now that $S_2$ is a risk free asset with returns $R_{S_2}$, which for convenience we will denote by $R_f$. For example, it is a US Treasury bond or a combination of perfectly negatively correlated assets. In turn, $S_1$ is itself the portfolio of all risky assets with returns $R_{S_1}$, which have expected value $E[R_{S_1}]$ and variance $\sigma_{S_1}^2$. The portfolio consisting of $S_1$ and $S_2$ has expected return and variance of returns

$$\begin{aligned}
E[R] &= \omega_1E[R_{S_1}] + \omega_2R_f \\
&= R_f + \omega_1\left(E[R_{S_1}] - R_f\right)
\end{aligned}$$

and

$$\sigma_P^2 = \omega_1^2\sigma_{S_1}^2.$$

Since $\sigma_P = |\omega_1|\sigma_{S_1}$, this implies that the expected value of this portfolio is a linear function of its standard deviation:

$$\begin{aligned}
E[R] &= \omega_1E[R_{S_1}] + \omega_2R_f \\
&= R_f \pm \left(E[R_{S_1}] - R_f\right)\frac{\sigma_P}{\sigma_{S_1}}.
\end{aligned}$$

The following figure offers the minimum variance set in this case together with choices of ω1. The

minimum variance set is graphed using solid lines. In turn, possible choices are in dashed lines.


Portfolios along all these dashed lines are possible. However, only one dominates in terms of expected return and variance. It is the portfolio at point $M$. In the presence of a risk free asset, speculators form their portfolios at the tangency point of their indifference curves and the upward sloping solid line called the capital market line, CML. In this respect their portfolios are combinations of the risk free asset and portfolio $M$.

How do the speculators construct the CML? To do so they need to know $R_f$ and portfolio $M$. $R_f$ is the return on the risk free asset, which might safely be thought to be US Treasury bonds. What about $M$? $M$ is the portfolio of all risky assets (yes, it is $S_1$ in this example, if you were wondering). Therefore, it is at the tangency point of the CML and the convex minimum variance set.

The market is in equilibrium when prices are such that markets clear. All assets then are held. In other words, prices adjust so that the excess demand for and supply of assets are zero. This market clearing condition implies that equilibrium is attained at a single tangency portfolio, $M$, which all investors combine with the risk free asset and in which all assets are held according to their market value weights. (At this point the weight of the risk free asset is zero. Moreover, this portfolio is called the market portfolio.) Let the market value of an asset $i$ be $V_i$ and let there be $N$ assets. Then the market value weight of asset $i$ is

$$w_i = \frac{V_i}{\sum_{k=1}^{N}V_k},$$

where $\sum_{k=1}^{N}V_k$ is the total market value of all assets. Market equilibrium is not attained until the tangency portfolio $M$ is the market portfolio. Moreover, the value of the risk free rate of return should be such that aggregate borrowing and lending are equal.

The upward sloping solid line is called the Capital Market Line. In turn, the result that in equilibrium investors hold a weighted average of the risk free asset and the market portfolio $M$ is called Two-fund Separation.


Market price of risk and the CAPM

The Capital Asset Pricing Model, CAPM in short, is a model which allows us to determine the market price of risk and the appropriate measure of risk for a single asset. It rests on several very strong assumptions (which we have actually maintained thus far). These assumptions are that

1. There are no market distortions and frictions (e.g., there are no price setters and information is costlessly available to all).

2. Speculators/investors maximize their expected utility and are risk averse.

3. There exists an unlimited supply of the risk free asset at the risk free rate.

4. The quantities of assets are fixed. Moreover, all assets are tradable and perfectly divisible.

5. All investors have the same information about the markets.

The CAPM also needs the market portfolio to be efficient. This follows from utility maximization and common knowledge: the market is the sum of all individual holdings and all individual holdings are efficient.

In equilibrium, for any asset $i$ it has to be that its weight in the market portfolio $M$ is equal to

$$w_i = \frac{V_i}{\sum_{k=1}^{N}V_k}.$$

Consider a portfolio which consists of a fraction $\omega_i$ of asset $i$ and a fraction $1 - \omega_i$ of the market portfolio $M$. The expected return on such a portfolio is

$$E[R] = \omega_iE[R_i] + (1-\omega_i)E[R_m],$$

where $E[R_i]$ and $E[R_m]$ are the expected returns on asset $i$ and market portfolio $M$. The standard deviation of this portfolio is

$$\sigma_P = \left[\omega_i^2\sigma_i^2 + (1-\omega_i)^2\sigma_m^2 + 2\omega_i(1-\omega_i)\sigma_{i,m}\right]^{\frac{1}{2}},$$

where $\sigma_{i,m}$ is the covariance between asset $i$ and market portfolio $M$.

Clearly, the opportunity set in terms of expected value and variance for various combinations of asset $i$ and portfolio $M$ is given by the convex Minimum Variance Opportunity set. In turn, the changes of the expected return and risk (here standard deviation) of this portfolio with $\omega_i$ are given by

$$\frac{\partial}{\partial\omega_i}E[R] = E[R_i] - E[R_m],$$

$$\frac{\partial}{\partial\omega_i}\sigma_P = \left[\omega_i\sigma_i^2 - (1-\omega_i)\sigma_m^2 + (1-2\omega_i)\sigma_{i,m}\right] \times \left[\omega_i^2\sigma_i^2 + (1-\omega_i)^2\sigma_m^2 + 2\omega_i(1-\omega_i)\sigma_{i,m}\right]^{-\frac{1}{2}}.$$

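The main insight of this model is that the market portfolio already contains the asset $i$. Therefore, $\omega_i$ is the excess demand for it. In equilibrium it should be zero and we would evaluate $\frac{\partial}{\partial\omega_i}E[R]$ and $\frac{\partial}{\partial\omega_i}\sigma_P$ at $\omega_i = 0$. This gives

$$\left.\frac{\partial}{\partial\omega_i}E[R]\right|_{\omega_i=0} = E[R_i] - E[R_m],$$

$$\left.\frac{\partial}{\partial\omega_i}\sigma_P\right|_{\omega_i=0} = \frac{\sigma_{i,m} - \sigma_m^2}{\sigma_m}.$$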

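Therefore, the slope of the expected return and risk trade-off evaluated at the market portfolio is

$$\left.\frac{\frac{\partial}{\partial\omega_i}E[R]}{\frac{\partial}{\partial\omega_i}\sigma_P}\right|_{\omega_i=0} = \frac{E[R_i] - E[R_m]}{\left(\sigma_{i,m} - \sigma_m^2\right)/\sigma_m}.$$

This expression shows how an individual asset affects the return and risk of the market portfolio.

The final insight is that this slope is equal to the slope of the CML, which is

$$\frac{E[R_m] - R_f}{\sigma_m}.$$

Equating these two we have that

$$\frac{E[R_i] - E[R_m]}{\left(\sigma_{i,m} - \sigma_m^2\right)/\sigma_m} = \frac{E[R_m] - R_f}{\sigma_m},$$

or equivalently,

$$E[R_i] = R_f + \left(E[R_m] - R_f\right)\frac{\sigma_{i,m}}{\sigma_m^2}.$$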

This equation is known as the Capital Asset Pricing Model. It states that the required rate of return on any asset $i$ is equal to the risk free rate of return plus a risk premium. The risk premium is the price of risk $\left(E[R_m] - R_f\right)$ times the quantity of risk $\frac{\sigma_{i,m}}{\sigma_m^2}$. The quantity of risk is denoted by

$$\beta = \frac{\sigma_{i,m}}{\sigma_m^2}.$$

This is the contribution of asset $i$ to the portfolio risk. Most importantly, the variance of asset $i$ as a measure of risk does not matter in this context. What matters is how its returns covary with the market portfolio returns.

The risk free asset has $\beta = 0$ since it does not contribute to the risk of the market portfolio (i.e., its covariance with the market portfolio is zero). In turn, the market portfolio has $\beta = 1$ since its covariance with itself is its own variance.

The following figure illustrates the CAPM where it is depicted by the blue line.

This model has several important properties. First, in equilibrium, because of arbitrage, every asset should be priced so that its risk adjusted return is exactly on the blue line. Next, the total risk can be separated into two parts,

Total risk = systematic risk + unsystematic risk,

where systematic risk is how an asset covaries with the market portfolio and unsystematic risk is the risk not dependent on the market/economy. Systematic risk cannot be diversified away, so speculators/investors demand a risk premium for bearing it. Unsystematic risk can be eliminated through diversification and therefore earns no premium.

How can one find the $\beta$ of an asset and the magnitude of its unsystematic risk? The CAPM implies that one can estimate both from the following empirical specification:

$$R_i = a + \beta_iR_m + \varepsilon,$$

where $a$ is a constant and $\varepsilon$ is a random variable which does not correlate with the market. The variance of $R_i$ then is

$$\sigma_i^2 = \beta_i^2\sigma_m^2 + \sigma_\varepsilon^2,$$

where $\beta_i^2\sigma_m^2$ is the systematic risk and $\sigma_\varepsilon^2$ is the unsystematic risk.
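One common way to estimate this specification is ordinary least squares, where the slope is the sample covariance divided by the sample variance of the market return. The sketch below assumes that estimator; the function name and the return series are made up for the illustration.

```python
def estimate_beta(asset_returns, market_returns):
    """OLS estimate of R_i = a + beta * R_m + eps, together with the split
    of the asset's sample variance into systematic and unsystematic parts."""
    n = len(asset_returns)
    mean_i = sum(asset_returns) / n
    mean_m = sum(market_returns) / n
    cov = sum((ri - mean_i) * (rm - mean_m)
              for ri, rm in zip(asset_returns, market_returns)) / n
    var_m = sum((rm - mean_m) ** 2 for rm in market_returns) / n
    var_i = sum((ri - mean_i) ** 2 for ri in asset_returns) / n
    beta = cov / var_m
    a = mean_i - beta * mean_m
    residuals = [ri - a - beta * rm
                 for ri, rm in zip(asset_returns, market_returns)]
    var_eps = sum(e * e for e in residuals) / n
    return {"a": a, "beta": beta, "systematic": beta ** 2 * var_m,
            "unsystematic": var_eps, "total": var_i}


# Hypothetical return series for the market and one asset.
market = [0.01, -0.02, 0.03, 0.00, 0.02]
asset = [0.015, -0.03, 0.05, -0.005, 0.02]
fit = estimate_beta(asset, market)
print(fit)
```

With OLS the residuals are uncorrelated with the market return in the sample by construction, so the two parts add up to the total sample variance.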

A second important property of the CAPM is that the measures of risk of individual assets are linearly additive when the assets are combined in portfolios. For instance, if an asset $i$ has a risk of $\beta_i$ and asset $j$ has a risk of $\beta_j$ and they are combined with proportions $\omega_i$ and $1 - \omega_i$, then the total risk of the portfolio is

$$\beta_p = \omega_i\beta_i + (1-\omega_i)\beta_j.$$

To show this use the properties of covariance:

$$\begin{aligned}
\beta_p &= \frac{E\left[\left(\omega_iR_i + (1-\omega_i)R_j - E[\omega_iR_i + (1-\omega_i)R_j]\right)\left(R_m - E[R_m]\right)\right]}{V[R_m]} \\
&= \omega_i\frac{E\left[(R_i - E[R_i])(R_m - E[R_m])\right]}{V[R_m]} + (1-\omega_i)\frac{E\left[(R_j - E[R_j])(R_m - E[R_m])\right]}{V[R_m]} \\
&= \omega_i\beta_i + (1-\omega_i)\beta_j.
\end{aligned}$$

Black-Scholes model

The Black-Scholes model is a model which allows us to determine the market price of European options. This model maintains the following assumptions:

1. There are no market distortions and frictions

2. Speculators/investors maximize their expected utility

3. There exists an unlimited supply of risk free asset at risk free rate rf

4. All assets are tradable and perfectly divisible

5. Stocks pay no dividends during the life-time of the options

European options are widely used financial instruments.

Definition 1 An option is a contract which gives its owner an option (the right but not the obligation) to buy or sell an underlying asset at a specified price (strike price) on or before a specified date (maturity date).

1. Options giving the right to buy are called Call Options

2. Options giving the right to sell are called Put Options

3. Options which can be exercised only at a specified date (but not before it) are called European Options.

To derive the Black-Scholes model let's consider a stock which has random returns and a European Call option written on it. Denote the price of the stock at time $t$ by $S_t$. Assume that the price of the stock follows a geometric Brownian motion:

$$dS_t = \mu_tS_t\,dt + \sigma S_t\,dW_t,$$

where $d$ stands for the operator of infinitesimally small change over time, $\mu_t$ is called the drift coefficient, and $\sigma$ is a positive constant. Both $\mu_t$ and $\sigma$ will acquire meaning when $W_t$ is presented. This (differential) equation states that the change of the value of the stock happens because of deterministic shifts in the mean and because of a random process $W_t$.

$W_t$ and, therefore, its infinitesimally small change $dW_t$, is a random variable. More precisely, $W_t$ is a Wiener process: it is a random variable which changes over time so that its expected change over any time interval is 0 (e.g., $E[dW_t] = 0$) and the variance of its change over a time interval of length $T$ is equal to $T$ (e.g., $V[dW_t] = dt$). A discrete analogue for $W$ is a simple random walk.

The above equation can be rewritten as

$$\frac{dS_t}{S_t} = \mu_t\,dt + \sigma\,dW_t,$$

where $\frac{dS_t}{S_t}$ is the rate of return on the stock during a very short time $dt$. The properties of the random variable $W_t$ imply that

$$E\left[\frac{dS_t}{S_t}\right] = \mu_t\,dt,$$

$$V\left[\frac{dS_t}{S_t}\right] = \sigma^2\,dt.$$
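These two moments can be checked by simulating a discretized version of the process. The sketch below uses a simple Euler step with a constant drift, which is a simplifying assumption (the text allows $\mu_t$ to vary over time). The function names, the step size of one trading day, and all parameter values are made up for the illustration.

```python
import math
import random


def simulate_gbm(s0, mu, sigma, dt, steps, seed=0):
    """Euler discretization of dS = mu*S*dt + sigma*S*dW, constant mu."""
    rng = random.Random(seed)
    prices = [s0]
    for _ in range(steps):
        dw = rng.gauss(0.0, math.sqrt(dt))  # increment: mean 0, variance dt
        prices.append(prices[-1] * (1.0 + mu * dt + sigma * dw))
    return prices


def return_moments(prices):
    """Sample mean and variance of the one-step rates of return."""
    rets = [(b - a) / a for a, b in zip(prices, prices[1:])]
    mean = sum(rets) / len(rets)
    var = sum((r - mean) ** 2 for r in rets) / len(rets)
    return mean, var


mu, sigma, dt = 0.08, 0.2, 1.0 / 252.0
prices = simulate_gbm(100.0, mu, sigma, dt, steps=100_000, seed=0)
mean, var = return_moments(prices)
print(mean / dt, var / dt)  # should be close to mu and sigma**2
```

The variance estimate is typically far more precise than the drift estimate, because the noise in the sample mean is large relative to $\mu\,dt$ over short steps.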

European option written on this stock specifies the time when it can be exercised. Denote it by

T . It specifies also the strike price. Denote it by K. As we will see, at time t < T the price of this

option V is a (complicated) function of S, µ, σ, t, K, T , and rf :

V = V (S, µ, σ, t,K, T, rf ) .

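From Ito's lemma it follows that

\[ dV = \left(\mu S\frac{\partial V}{\partial S} + \frac{\partial V}{\partial t} + \frac{1}{2}\sigma^2 S^2\frac{\partial^2 V}{\partial S^2}\right)dt + \sigma S\frac{\partial V}{\partial S}\,dW. \]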

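Combining this differential equation with the differential equation for the stock price gives

\[ dV = \frac{\partial V}{\partial t}\,dt + \frac{1}{2}\sigma^2 S^2\frac{\partial^2 V}{\partial S^2}\,dt + \frac{\partial V}{\partial S}\,dS. \]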

To derive the price of this option, the Black-Scholes model uses a no-arbitrage argument. This involves constructing a risk-free portfolio and equating its return to the risk-free return.

Consider a portfolio of one option on this stock and a short position of $\Delta$ in this stock. The value of the portfolio is then

\[ P_t = V_t - \Delta S_t. \]

Choose $\Delta$ so as to hedge this portfolio (reduce its variance).


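The change in the price of the portfolio is given by

\[ dP_t = dV_t - \Delta\,dS_t = \frac{\partial V}{\partial t}\,dt + \frac{1}{2}\sigma^2 S^2\frac{\partial^2 V}{\partial S^2}\,dt + \frac{\partial V}{\partial S}\,dS - \Delta\,dS. \]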

The first two terms are deterministic. Therefore, they do not matter for the hedging strategy. To hedge the portfolio against randomness (volatility), select $\Delta = \frac{\partial V}{\partial S}$, so that the two terms in dS cancel (this is called delta hedging or dynamic hedging). This implies that

\[ dP_t = \left(\frac{\partial V}{\partial t} + \frac{1}{2}\sigma^2 S^2\frac{\partial^2 V}{\partial S^2}\right)dt. \]

Since this portfolio entails no risk, its return should be equal to the return on the risk-free asset $r_f$. In other words (see footnote 23),

\[ \frac{dP_t}{P_t} = r_f\,dt. \]

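Therefore,

\[ dP_t = P_t r_f\,dt \]

and

\[ \left(\frac{\partial V}{\partial t} + \frac{1}{2}\sigma^2 S^2\frac{\partial^2 V}{\partial S^2}\right)dt = (V_t - \Delta S_t)\,r_f\,dt. \]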
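The effect of choosing $\Delta = \partial V/\partial S$ can be seen in a one-step simulation. The sketch below is illustrative only. It assumes the standard closed-form Black-Scholes price of a European call, which these notes do not derive, and uses hypothetical parameter values and a single Euler step for the stock price. It compares the dispersion of the change in the option value with that of the hedged portfolio, and the average change of the hedged portfolio with $(V - \Delta S) r_f\,dt$.

```python
import math
import random

def norm_cdf(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def d1(S, K, r, sigma, tau):
    return (math.log(S / K) + (r + 0.5 * sigma ** 2) * tau) / (sigma * math.sqrt(tau))

def call_price(S, K, r, sigma, tau):
    """Closed-form Black-Scholes call price (assumed, not derived here); tau = T - t."""
    a = d1(S, K, r, sigma, tau)
    b = a - sigma * math.sqrt(tau)
    return S * norm_cdf(a) - K * math.exp(-r * tau) * norm_cdf(b)

def call_delta(S, K, r, sigma, tau):
    """Partial derivative of the closed-form call price with respect to S."""
    return norm_cdf(d1(S, K, r, sigma, tau))

def mean(xs):
    return sum(xs) / len(xs)

def std(xs):
    m = mean(xs)
    return math.sqrt(sum((x - m) ** 2 for x in xs) / len(xs))

# Hypothetical parameters: spot, strike, risk-free rate, volatility, drift.
S0, K, r, sigma, mu = 100.0, 100.0, 0.05, 0.2, 0.08
tau, dt = 1.0, 1.0 / 252.0

V0 = call_price(S0, K, r, sigma, tau)
delta = call_delta(S0, K, r, sigma, tau)

rng = random.Random(1)
unhedged, hedged = [], []
for _ in range(20000):
    dW = rng.gauss(0.0, math.sqrt(dt))
    dS = mu * S0 * dt + sigma * S0 * dW          # one Euler step of the stock price
    dV = call_price(S0 + dS, K, r, sigma, tau - dt) - V0
    unhedged.append(dV)                          # change in the option alone
    hedged.append(dV - delta * dS)               # change in P = V - delta * S

print(std(unhedged), std(hedged))                # hedged dispersion is much smaller
print(mean(hedged), r * (V0 - delta * S0) * dt)  # both sides of the last equation
```

The hedged portfolio is not exactly riskless over a finite step because $\Delta$ is held fixed during the step; the residual dispersion shrinks as dt becomes smaller.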

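The latter, together with $\Delta = \frac{\partial V}{\partial S}$, implies that

\[ r_f V = \frac{\partial V}{\partial t} + \frac{1}{2}\sigma^2 S^2\frac{\partial^2 V}{\partial S^2} + r_f S\frac{\partial V}{\partial S}. \]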

This is a second order partial differential equation.

In order to solve it, one needs boundary conditions. First, let this be a call option. At maturity T, the price of the call option is easy to derive. It is

\[ V_T = \max\{S_T - K, 0\}. \]

This is because time T is the exercise date. The owner of this option would exercise it only if $S_T$ is higher than the strike price and would make a net gain of $S_T - K$. Second, the option is worthless when the stock is worthless:

\[ V(0, \mu, \sigma, t, K, T, r_f) = 0. \]

Third, for very high stock prices the price of the option behaves like the price of the stock:

\[ V(S, \mu, \sigma, t, K, T, r_f) \sim S \quad \text{as } S \to +\infty, \]

in the sense that the ratio V/S tends to 1.

Solving this differential equation is well beyond the scope and focus of this course.
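Although the solution is not derived here, a candidate solution can be checked against the partial differential equation and the boundary conditions numerically. The sketch below assumes the standard closed-form Black-Scholes call price (taken as given, not derived in these notes) and uses central finite differences with hypothetical step sizes and parameter values.

```python
import math

def norm_cdf(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def call_price(S, t, K, T, r, sigma):
    """Closed-form Black-Scholes call price at time t < T (assumed solution)."""
    tau = T - t
    d1 = (math.log(S / K) + (r + 0.5 * sigma ** 2) * tau) / (sigma * math.sqrt(tau))
    d2 = d1 - sigma * math.sqrt(tau)
    return S * norm_cdf(d1) - K * math.exp(-r * tau) * norm_cdf(d2)

def payoff(S_T, K):
    """Value of the call at maturity: max{S_T - K, 0}."""
    return max(S_T - K, 0.0)

def pde_residual(S, t, K, T, r, sigma, hS=1e-2, ht=1e-4):
    """dV/dt + 0.5*sigma^2*S^2*d2V/dS2 + r*S*dV/dS - r*V via central differences."""
    V = call_price(S, t, K, T, r, sigma)
    V_up = call_price(S + hS, t, K, T, r, sigma)
    V_dn = call_price(S - hS, t, K, T, r, sigma)
    V_t = (call_price(S, t + ht, K, T, r, sigma)
           - call_price(S, t - ht, K, T, r, sigma)) / (2.0 * ht)
    V_S = (V_up - V_dn) / (2.0 * hS)
    V_SS = (V_up - 2.0 * V + V_dn) / hS ** 2
    return V_t + 0.5 * sigma ** 2 * S ** 2 * V_SS + r * S * V_S - r * V

K, T, r, sigma = 100.0, 1.0, 0.05, 0.2   # hypothetical contract and market values

residual = pde_residual(110.0, 0.25, K, T, r, sigma)
near_maturity = call_price(120.0, T - 1e-10, K, T, r, sigma)
near_zero_stock = call_price(1e-8, 0.0, K, T, r, sigma)
ratio_large_stock = call_price(1e6, 0.0, K, T, r, sigma) / 1e6

print(residual)                            # approximately 0
print(near_maturity, payoff(120.0, K))     # price just before T next to the payoff
print(near_zero_stock)                     # approximately 0
print(ratio_large_stock)                   # approximately 1
```

A residual near zero at a few points is only a consistency check, not a proof that the formula solves the equation.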

23 This is because the return on the portfolio is given by $\frac{1}{P_t}\frac{dP_t}{dt}$. If time is discrete, then this becomes $\frac{1}{P_t}\frac{P_{t+1} - P_t}{(t+1) - t} = \frac{P_{t+1} - P_t}{P_t}$.


Appendix

Appendix - Reminder of Statistics 0

Probability Theory

There are at least two ways to approach probability:

• Classical (or a priori) Probability

  - Classical Probability can be defined in the following way: If a random experiment can result in n mutually exclusive and equally likely outcomes and if $n_A$ of these outcomes have an attribute A, then the probability of A is the fraction $n_A/n$.

• Axiomatic Probability (or so-called Kolmogorov's axiomatics):

  - A Probability Space is a triple $(\Omega; \mathcal{A}; \Pr(\cdot))$, where

    1. $\Omega$ is the sample space: a collection of all possible outcomes of an experiment.

       ∗ Any given experiment result is an element of $\Omega$.

    2. $\mathcal{A}$ is the event space, or algebra of events.

       ∗ It is a collection of subsets of $\Omega$, including $\Omega$.

    3. $\Pr(\cdot)$ is the probability function defined on $\mathcal{A}$: for any event $A \in \mathcal{A}$, $\Pr(A)$ is a quantitative (numerical) measure of the likelihood that this event A is observed once the experiment is completed.

Basic definitions and theorems

The following definitions are needed to construct probability space:

Definition 2 [Sample space] The sample space, denoted by Ω, is the collection or the totality of all

possible outcomes of a conceptual experiment. Often, Ω is called the sure event since it includes any

outcome that can occur.

Definition 3 [Event and event space] An event is a subset of the sample space. The class of all events associated with a given experiment is defined to be the event space.

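Any event space $\mathcal{A}$ possesses the following properties:

1. $\Omega \in \mathcal{A}$

2. If $A \in \mathcal{A}$, then the complement set $A^c \in \mathcal{A}$

3. If $A_1, A_2, \ldots \in \mathcal{A}$, then $\cup_{i \geq 1} A_i \in \mathcal{A}$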


Definition 4 A collection of subsets of $\Omega$ which satisfies properties 1, 2, and 3 is called a $\sigma$-algebra.

Definition 5 [Probability function] A probability function $\Pr(\cdot)$ is a set function with domain $\mathcal{A}$ and counterdomain the interval [0, 1]. It satisfies the following axioms:

• $\Pr(A) \geq 0$ for $\forall A \in \mathcal{A}$

• $\Pr(\Omega) = 1$

• If $A_1, A_2, \ldots \in \mathcal{A}$ are pairwise disjoint, then $\Pr(\cup_{i \geq 1} A_i) = \sum_{i \geq 1} \Pr(A_i)$

Definition 6 The triple $(\Omega; \mathcal{A}; \Pr(\cdot))$ is called a probability space.

The following definitions are for conditional probability and independence of events:

Definition 7 Let $A, B \in \mathcal{A}$ and $\Pr(A) > 0$. The conditional probability of event B given the occurrence of A is denoted and given by

\[ \Pr(B|A) = \frac{\Pr(B \cap A)}{\Pr(A)}. \]

Theorem 8 (The law of total probability) Let $A_1, A_2, \ldots, A_n \in \mathcal{A}$ form a partition of $\Omega$, with $\Pr(A_i) > 0$ for any i. Let $B \in \mathcal{A}$. Then

\[ \Pr(B) = \sum_{i=1}^{n} \Pr(B|A_i)\Pr(A_i). \]

Theorem 9 (Bayes' formula) Let $A_1, A_2, \ldots, A_n \in \mathcal{A}$ form a partition of $\Omega$, and for any i let $\Pr(A_i) > 0$. Then for any event $B \in \mathcal{A}$ with $\Pr(B) > 0$

\[ \Pr(A_j|B) = \frac{\Pr(B \cap A_j)}{\Pr(B)} = \frac{\Pr(B|A_j)\Pr(A_j)}{\sum_{i=1}^{n} \Pr(B|A_i)\Pr(A_i)}. \]
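The law of total probability and Bayes' formula can be traced through a small numerical case. The following sketch uses a hypothetical partition of three events with made-up probabilities; the function names `total_probability` and `bayes` are illustrative helpers, not notation from these notes.

```python
from fractions import Fraction

def total_probability(priors, likelihoods):
    """Pr(B) = sum_i Pr(B|A_i) Pr(A_i) for a partition A_1, ..., A_n."""
    return sum(p * l for p, l in zip(priors, likelihoods))

def bayes(priors, likelihoods):
    """Posterior probabilities Pr(A_j|B) for every j."""
    pr_b = total_probability(priors, likelihoods)
    return [p * l / pr_b for p, l in zip(priors, likelihoods)]

# Hypothetical numbers: Pr(A_i) for a partition and Pr(B|A_i).
priors = [Fraction(1, 2), Fraction(3, 10), Fraction(1, 5)]
likelihoods = [Fraction(1, 10), Fraction(1, 5), Fraction(1, 2)]

pr_b = total_probability(priors, likelihoods)
posteriors = bayes(priors, likelihoods)

print(pr_b)              # 21/100
print(posteriors)        # [Fraction(5, 21), Fraction(2, 7), Fraction(10, 21)]
print(sum(posteriors))   # 1
```

Exact fractions are used so that the posteriors sum to one without rounding error.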

Definition 10 Events $A_1$ and $A_2$ are independent iff

\[ \Pr(A_1 \cap A_2) = \Pr(A_1)\Pr(A_2). \]

Similarly, events $A_1, A_2, \ldots, A_n \in \mathcal{A}$ are independent iff the probability of the intersection of any sub-collection of $A_1, A_2, \ldots, A_n$ is the product of the probabilities of the events in that intersection.
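The definition of independence can be checked by enumeration in a finite sample space. The sketch below is an illustration under an assumed experiment (one roll of two fair dice, with all 36 ordered pairs equally likely); the events chosen are hypothetical examples.

```python
from fractions import Fraction
from itertools import product

# All 36 equally likely ordered pairs from rolling two fair dice once.
outcomes = list(product(range(1, 7), repeat=2))

def pr(event):
    """Classical probability: favourable outcomes over all outcomes."""
    return Fraction(sum(1 for w in outcomes if event(w)), len(outcomes))

def independent(e1, e2):
    """Check Pr(A1 and A2) == Pr(A1) * Pr(A2)."""
    return pr(lambda w: e1(w) and e2(w)) == pr(e1) * pr(e2)

def first_even(w):
    return w[0] % 2 == 0

def sum_is_7(w):
    return w[0] + w[1] == 7

def sum_is_8(w):
    return w[0] + w[1] == 8

print(independent(first_even, sum_is_7))  # True: 1/12 equals 1/2 * 1/6
print(independent(first_even, sum_is_8))  # False: 1/12 differs from 1/2 * 5/36
```

Knowing that the first die is even does not change the chance that the sum is 7, but it does change the chance that the sum is 8.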

Random variables

Definition 11 (Intuitive definition) A random variable x is a real-valued function of the elements $\omega$ of a sample space $\Omega$.

For example, a random variable x is the sum of the two numbers that occur when we roll a pair of fair dice once. The outcomes are the ordered pairs of numbers shown by the two dice. The sample space is comprised of 36 elements. The value of a random variable depends on the outcome of the experiment being observed. Each possible value X of a random variable x defines an event: the function $x(\omega)$ assigns values $X \in \mathbb{R}$ to the sample space outcomes $\omega$.
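The dice example can be written out explicitly. In this minimal sketch the sample space is taken to be the 36 ordered pairs, each equally likely, and the random variable is the function `x` that maps a pair to its sum; the variable names are illustrative.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# Sample space: ordered pairs (first die, second die).
sample_space = list(product(range(1, 7), repeat=2))

def x(omega):
    """The random variable: sum of the two numbers in the outcome omega."""
    return omega[0] + omega[1]

# Each value X of x defines an event; count the outcomes in each event.
counts = Counter(x(omega) for omega in sample_space)
pmf = {value: Fraction(n, len(sample_space)) for value, n in sorted(counts.items())}

print(len(sample_space))   # 36
print(pmf[7])              # 1/6
print(sum(pmf.values()))   # 1
```

The event "x = 7" collects six of the 36 outcomes, which is why its probability is 1/6.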

Definition 12 (Technical definition) For a given probability space $(\Omega; \mathcal{A}; \Pr(\cdot))$, a function $x(\omega): \Omega \to \mathbb{R}$ is said to be a random variable if for $\forall X \in \mathbb{R}$ the event consisting of the outcomes $\omega$ for which $x(\omega) < X$, $A_X = \{\omega : x(\omega) < X\}$, belongs to the event space $\mathcal{A}$, $A_X \in \mathcal{A}$.

Definition 13 The cumulative distribution function of a random variable x defined on the probability space $(\Omega; \mathcal{A}; \Pr(\cdot))$ is

\[ F_x(X) = \Pr(A_X) = \Pr(x(\omega) < X). \]

Notice that the cumulative distribution function is a non-decreasing function $F_x(X): \mathbb{R} \to [0, 1]$, since a higher X implies a bigger set of outcomes $\omega$ for which $x(\omega) < X$. More precisely, cumulative distribution functions satisfy the following properties.

• $\lim_{X \to -\infty} F_x(X) = 0$

• $\lim_{X \to +\infty} F_x(X) = 1$

• For any $X_1$ and $X_2$ where $X_1 < X_2$, $F_x(X_1) \leq F_x(X_2)$

• $\lim_{h \to 0^-} F_x(X + h) = F_x(X)$ for any $X \in \mathbb{R}$
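These properties can be seen on a discrete example. The sketch below assumes the sum of two fair dice as the random variable (36 equally likely ordered pairs) and evaluates the cumulative distribution function with the strict inequality used in the definition above.

```python
from fractions import Fraction
from itertools import product

sample_space = list(product(range(1, 7), repeat=2))

def cdf(X):
    """F_x(X) = Pr(x(omega) < X) for the sum of two dice (strict inequality)."""
    favourable = sum(1 for a, b in sample_space if a + b < X)
    return Fraction(favourable, len(sample_space))

print(cdf(2))       # 0
print(cdf(6.999))   # 5/12
print(cdf(7))       # 5/12
print(cdf(7.001))   # 7/12
print(cdf(13))      # 1
```

Because the definition uses $x(\omega) < X$ rather than $x(\omega) \leq X$, the function is continuous from the left: approaching 7 from below gives the same value as at 7, while the jump of 1/6 appears just above 7.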

Definition 14 The discrete density function, or probability mass function, of a discrete random variable x is

\[ f_x(X) = \begin{cases} p_{x_i} & \text{if } X = x_i,\ i = 1, 2, \ldots \\ 0 & \text{otherwise.} \end{cases} \]

Definition 15 The continuous density function, or probability density function, of a continuous random variable x is

\[ f_x(X) = \frac{dF_x(X)}{dX}. \]

Density functions satisfy the following properties.

• $f_x(X) \geq 0$

• $\int_{-\infty}^{X} f_x(z)\,dz = F_x(X)$

• $\int_{-\infty}^{+\infty} f_x(z)\,dz = 1$


Basic properties of expectation, variance, and covariance

1. The basic properties of the expectation operator are

• For any random variables $\{X_i\}_{i=1}^{N}$, real functions $\{h_i\}_{i=1}^{N}$, and real numbers $\{\alpha_i\}_{i=1}^{N}$

\[ E\left[\sum_{i=1}^{N} \alpha_i h_i(X_i)\right] = \sum_{i=1}^{N} \alpha_i E[h_i(X_i)]. \]

• If for any $h_i$ and $h_j$ it is the case that $h_i(X) \leq h_j(X)$, then

\[ E[h_i(X)] \leq E[h_j(X)]. \]

2. The basic properties of the variance and covariance operators are

• For any random variables $\{X_i\}_{i=1}^{N}$, real functions $\{h_i\}_{i=1}^{N}$, and real numbers $\{\alpha_i\}_{i=1}^{N}$

\[ V\left[\sum_{i=1}^{N} \alpha_i h_i(X_i)\right] = COV\left[\sum_{i=1}^{N} \alpha_i h_i(X_i), \sum_{i=1}^{N} \alpha_i h_i(X_i)\right] = \sum_{i=1}^{N} \alpha_i^2 V[h_i(X_i)] + \sum_{i \neq j} \alpha_i \alpha_j COV[h_i(X_i), h_j(X_j)], \]

where for any $h_i$ and $h_j$

\[ COV[h_i(X_i), h_j(X_j)] = E\left[(E[h_i(X_i)] - h_i(X_i))(E[h_j(X_j)] - h_j(X_j))\right], \]

and the second term consists of N(N − 1) items. Correlation is defined as

\[ \rho_{h_i(X_i), h_j(X_j)} = \frac{COV[h_i(X_i), h_j(X_j)]}{\sqrt{V[h_i(X_i)]\,V[h_j(X_j)]}}. \]

Correlation measures the strength of the linear relation. It ranges from −1 to 1, $\rho_{h_i(X_i), h_j(X_j)} \in [-1, 1]$.

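• Given its definition, the covariance operator satisfies the following properties:

\[ COV[X_i, X_i] = V[X_i], \]

\[ COV[\alpha_i + \beta_i X_i, \alpha_j + \beta_j X_j] = \beta_i \beta_j COV[X_i, X_j], \]

\[ COV[\alpha_i + \beta_i X_i + \alpha_k + \beta_k X_k, \alpha_j + \beta_j X_j] = \beta_i \beta_j COV[X_i, X_j] + \beta_k \beta_j COV[X_k, X_j]. \]

Therefore, if

\[ X_i = \alpha + \beta X_j, \]

then

\[ COV[X_i, X_j] = COV[\alpha + \beta X_j, X_j] = \beta V[X_j]. \]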
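The variance and covariance rules can be verified on simulated data, because they hold exactly for sample moments as well. The sketch below is an illustration with hypothetical samples and coefficients; `cov`, `var`, and `corr` are small helper functions written for this example, with the functions $h_i$ taken to be the identity.

```python
import math
import random

def mean(xs):
    return sum(xs) / len(xs)

def cov(xs, ys):
    """Sample covariance E[(E[x] - x)(E[y] - y)] with equal weights."""
    mx, my = mean(xs), mean(ys)
    return sum((mx - x) * (my - y) for x, y in zip(xs, ys)) / len(xs)

def var(xs):
    return cov(xs, xs)

def corr(xs, ys):
    return cov(xs, ys) / math.sqrt(var(xs) * var(ys))

rng = random.Random(42)
x_j = [rng.gauss(0.0, 1.0) for _ in range(1000)]     # hypothetical sample
x_k = [rng.uniform(-1.0, 1.0) for _ in range(1000)]  # a second hypothetical sample

# If X_i = alpha + beta * X_j, then COV[X_i, X_j] = beta * V[X_j].
alpha, beta = 2.0, -3.0
x_i = [alpha + beta * v for v in x_j]
print(cov(x_i, x_j), beta * var(x_j))
print(corr(x_i, x_j))   # -1 up to rounding, since the relation is exactly linear

# Variance of a weighted sum with N = 2: the cross term has N(N - 1) = 2 items.
a1, a2 = 0.5, 1.5
total = [a1 * u + a2 * v for u, v in zip(x_j, x_k)]
lhs = var(total)
rhs = a1 ** 2 * var(x_j) + a2 ** 2 * var(x_k) + 2 * a1 * a2 * cov(x_j, x_k)
print(lhs, rhs)
```

The additive constant `alpha` drops out of every covariance, and a negative `beta` produces a correlation of −1.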


Mean-variance trade-off

It can be shown that the indifference curve $E(\sigma)$ is increasing and convex. In other words, $\frac{dE}{d\sigma} > 0$ and $\frac{d^2E}{d\sigma^2} > 0$. This follows from the fact that the utility function u(.) is increasing and concave.

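To show that

\[ \frac{dE}{d\sigma} = -\frac{\int_{-\infty}^{+\infty} u'(E + \sigma Z)\,Z\,\varphi(Z)\,dZ}{\int_{-\infty}^{+\infty} u'(E + \sigma Z)\,\varphi(Z)\,dZ} > 0, \]

consider the numerator

\[ \int_{-\infty}^{+\infty} u'(E + \sigma Z)\,Z\,\varphi(Z)\,dZ = \int_{-\infty}^{0} u'(E + \sigma Z)\,Z\,\varphi(Z)\,dZ + \int_{0}^{+\infty} u'(E + \sigma Z)\,Z\,\varphi(Z)\,dZ. \]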

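Denote $\tilde{Z} = -Z$ in the first integral and rewrite

\[ \begin{aligned} \int_{-\infty}^{+\infty} u'(E + \sigma Z)\,Z\,\varphi(Z)\,dZ &= \int_{Z=-\infty}^{Z=0} u'(E - \sigma\tilde{Z})\,(-\tilde{Z})\,\varphi(-\tilde{Z})\,d(-\tilde{Z}) + \int_{0}^{+\infty} u'(E + \sigma Z)\,Z\,\varphi(Z)\,dZ \\ &= -\int_{0}^{+\infty} u'(E - \sigma\tilde{Z})\,\tilde{Z}\,\varphi(\tilde{Z})\,d\tilde{Z} + \int_{0}^{+\infty} u'(E + \sigma Z)\,Z\,\varphi(Z)\,dZ \\ &= \int_{0}^{+\infty} \left[u'(E + \sigma Z) - u'(E - \sigma Z)\right] Z\,\varphi(Z)\,dZ, \end{aligned} \]

where the second line uses the symmetry of the density, $\varphi(-\tilde{Z}) = \varphi(\tilde{Z})$, and the fact that $\tilde{Z}$ runs from $+\infty$ to 0 as Z runs from $-\infty$ to 0.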

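Since $u'' < 0$, the marginal utility $u'$ is decreasing, so $u'(E + \sigma Z) < u'(E - \sigma Z)$ for any Z > 0 and $\sigma > 0$. Therefore, it has to be that

\[ \int_{-\infty}^{+\infty} u'(E + \sigma Z)\,Z\,\varphi(Z)\,dZ < 0, \]

which, given that the denominator is positive because $u' > 0$, implies that $\frac{dE}{d\sigma} > 0$.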
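The sign of the slope can be illustrated with a specific utility function. The sketch below is not part of the original argument: it assumes exponential utility $u(x) = -e^{-ax}$ (increasing and concave for a > 0), a standard normal density for $\varphi$, and a simple trapezoid rule on a truncated interval. For this utility the slope $-\int u'(E+\sigma Z) Z \varphi(Z)\,dZ \big/ \int u'(E+\sigma Z)\varphi(Z)\,dZ$ works out to $a\sigma$, which gives a value to compare against.

```python
import math

def phi(z):
    """Standard normal density (assumed form of the density of Z)."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)

def integrate(f, lo=-12.0, hi=12.0, n=4000):
    """Composite trapezoid rule on the truncated interval [lo, hi]."""
    h = (hi - lo) / n
    total = 0.5 * (f(lo) + f(hi))
    total += sum(f(lo + k * h) for k in range(1, n))
    return total * h

def slope(E, sigma, u_prime):
    """dE/dsigma = -(integral of u' Z phi) / (integral of u' phi)."""
    num = integrate(lambda z: u_prime(E + sigma * z) * z * phi(z))
    den = integrate(lambda z: u_prime(E + sigma * z) * phi(z))
    return -num / den

a = 2.0                                      # hypothetical risk-aversion parameter

def u_prime(x):
    """Marginal utility of u(x) = -exp(-a x)."""
    return a * math.exp(-a * x)

slope_low = slope(1.0, 0.3, u_prime)
slope_high = slope(1.0, 0.5, u_prime)
print(slope_low)    # close to a * 0.3
print(slope_high)   # close to a * 0.5
```

Both slopes are positive, and for this utility the slope does not depend on E and grows with $\sigma$, which is consistent with an increasing and convex indifference curve.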

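In turn, to show that $\frac{d^2E}{d\sigma^2} > 0$, consider two points on the indifference curve, $(E_1, \sigma_1)$ and $(E_2, \sigma_2)$, and their average $\left(\frac{E_1 + E_2}{2}, \frac{\sigma_1 + \sigma_2}{2}\right)$. Notice that by construction the two points give the same expected utility, $E[u(E_1 + \sigma_1 Z)] = E[u(E_2 + \sigma_2 Z)]$.

The indifference curve would be convex if for any two points $(E_1, \sigma_1)$ and $(E_2, \sigma_2)$ and Z

\[ \frac{1}{2}u(E_1 + \sigma_1 Z) + \frac{1}{2}u(E_2 + \sigma_2 Z) < u\left(\frac{E_1 + E_2}{2} + \frac{\sigma_1 + \sigma_2}{2}Z\right). \]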


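This clearly holds since u is a concave function, which implies that

\[ E\left[u\left(\frac{E_1 + E_2}{2} + \frac{\sigma_1 + \sigma_2}{2}Z\right)\right] > E[u(E_1 + \sigma_1 Z)] = E[u(E_2 + \sigma_2 Z)]. \]

The average point therefore gives higher expected utility than the two points on the indifference curve, so at $\frac{\sigma_1 + \sigma_2}{2}$ the indifference curve lies below the line connecting them.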


References

Cagan, P. (1956). The monetary dynamics of hyperinflation. In M. Friedman (Ed.), Studies in the

Quantity Theory of Money. Chicago: University of Chicago Press.

Calvo, G. A. (1983). Staggered prices in a utility-maximizing framework. Journal of Monetary

Economics 12 (3), 383—398.

Carlin, W. and D. Soskice (2005). The 3-equation New Keynesian Model– a graphical exposition.

Contributions in Macroeconomics 5 (1).

Doepke, M., A. Lehnert, and A. Sellgren (1999). Macroeconomics. Available online (last accessed

29.03.2015). http://faculty.wcas.northwestern.edu/~mdo738/book.htm.

Domar, E. D. (1946). Capital expansion, rate of growth, and employment. Econometrica 14 (2),

137—147.

Goodwin, R. (1967). A growth cycle. In C. H. Feinstein (Ed.), Socialism, Capitalism & Economic

Growth, pp. 54—58. London: Macmillan.

Hansen, G. D. (1985). Indivisible labor and the business cycle. Journal of Monetary Eco-

nomics 16 (3), 309—327.

Harrod, R. F. (1939). An essay in dynamic theory. Economic Journal 49 (193), 14—33.

Kaldor, N. (1934). A classificatory note on the determinateness of equilibrium. Rreview of Economic

Studies 1 (2), 122—136.

King, R. G. and S. T. Rebelo (1999). Resuscitating real business cycles. Volume 1, Part B of

Handbook of Macroeconomics, pp. 927—1007. Elsevier.

Kydland, F. E. and E. C. Prescott (1982). Time to build and aggregate fluctuations. Economet-

rica 50 (6), 1345—1370.

Nagel, R. (1995). Unraveling in guessing games: An experimental study. American Economic

Review 85 (5), 1313—1326.

Solow, R. M. (1956). A contribution to the theory of economic growth. Quarterly Journal of

Economics 70 (1), 65—94.

Taylor, J. B. (1980). Aggregate dynamics and staggered contracts. Journal of Political Econ-

omy 88 (1), 1.
