Macroeconomics III
Vahagn Jerbashian
Lecture notes∗
This version: February 11, 2017
Contents
Expectations
    Introduction to the concept of expectations
    Types of expectations
Traditional models with expectations
    The Cobweb Model
    The Cagan Model
        Appendix
    The Lucas Imperfect Information Model with AD
    The Sticky Wage Model
The Effectiveness of monetary policy (with fixed rules)
    A model with stabilizing monetary policy
    A model with constant growth of money
    Discretionary monetary policy
    Political cycles and discretionary monetary policy
    Monetary policy under commitment and discretion
        Monetary policy under commitment
        Monetary policy under discretion (without commitment)
Business cycles
    Business cycles - The Carlin and Soskice (2005) model
    Endogenous business cycles - The Goodwin (1967) model
    A Real Business Cycles Model
        The Basic RBC Model
    Price Rigidities - The Calvo (1983) model
Expectations and financial markets
    Bonds
    Stocks, stock prices, and stock markets
        Measures of returns and risk
        Measuring portfolio return and risk to assemble a portfolio
    Market price of risk and the CAPM
    Black-Scholes model
Appendix
    Appendix - Reminder of Statistics 0
    Mean-variance trade-off
∗ These notes may contain typos/mistakes and are subject to changes/updates during our course. Please keep track if there are any.
Expectations
Introduction to the concept of expectations
Why did you start reading these notes? Would you start and/or continue reading if you expected
these notes to be useless and/or your test(s) to be very easy? I suppose the answer is no, at least
for some students.2 This is an example of how expectations matter for strategies, actions, and, later,
for performance at the individual level.
Consider another example to see how expectations matter for economic performance at the individual
level, as well as at the aggregate level. Suppose we have an economy of 1000 firms and consumers.
In this economy, firms hire consumers’ labor to produce, and consumers use the wages they receive to
buy firms’ products. Suppose that one of the firms expects low demand for its products prior to
deciding how much to produce. In order to produce according to its expectation, it would hire a small
amount of labor and pay a low wage bill. Further, imagine that instead of one firm, all firms expect
low demand for their products. In such a case, all firms will hire a small amount of labor and pay
low wage bills. Therefore, consumers will have low income and consume a small amount of products,
which will reinforce firms’ expectations. As you continue reading this chapter, I will bring more
such examples. The main focus of our examples, and of our classes in general, is on how expectations
can matter for macroeconomic performance and aggregate economic fluctuations due to supply or
demand shocks.
Prior to proceeding to these examples, let’s digress (which we will do often) and discuss formally
what expectations are.
Expectations - intuitive and formal discussion
We know for sure that if we throw an apple (perhaps not an iPhone) it will eventually hit the
ground because of gravity. Since we know this for sure, we expect the event "thrown apple hits the
ground" to happen. Can you claim with the same certainty that tomorrow it won’t rain around our
university?
In these examples we have two different (random) events. One of the events is "thrown apple hits
the ground," whereas the other event is "no rain tomorrow around our university." These events
happen with some probabilities. The first event happens with probability 1, i.e., it happens for
sure/with certainty. (Since it happens with certainty we don’t call it a random event.) The second
event, however, happens with probability less than 1. For example, let it happen with probability
0.5. Then you would say that with probability 0.5 you expect to have no rain around our university
tomorrow.
2 These notes are not useless and I promise your tests will not be easy.
Let’s now consider a slightly more sophisticated example. Suppose we have a lottery which pays 100
EUR with probability 0.01 and 0 EUR otherwise, i.e., with probability 0.99. What is the expected
pay-off of this lottery? It is 1 EUR:
$$0.01 \times 100 + 0.99 \times 0 = 1.$$
Therefore, if this lottery costs 5 EUR, one might think at least twice prior to buying it.
We have two possible realizations of the random variable "pay-off" in this example: "100 EUR" and
"0 EUR." These realizations have associated probabilities 0.01 and 0.99. When we calculate the
expected value of the random variable "pay-off" we weight each of these possible realizations with
the corresponding probability. Intuitively, we do so in order to give larger weight to realizations
which are more likely to happen (here: 0 EUR is more likely to happen and gets weight 0.99, which
is larger than the weight 0.01 of 100 EUR).
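The weighting described above can be sketched in a few lines of Python. This is only an illustration: the function name `expected_value` and its input checks are assumptions of this sketch, not part of the notes.

```python
def expected_value(outcomes, probabilities):
    """Probability-weighted sum of the possible realizations."""
    if len(outcomes) != len(probabilities):
        raise ValueError("need exactly one probability per outcome")
    if abs(sum(probabilities) - 1.0) > 1e-12:
        raise ValueError("probabilities must sum to 1")
    return sum(p * x for x, p in zip(outcomes, probabilities))


# The lottery from the text: 100 EUR with probability 0.01, 0 EUR otherwise.
lottery_payoff = expected_value([100, 0], [0.01, 0.99])

# Expected net gain if the ticket costs 5 EUR.
net_gain = lottery_payoff - 5
print(lottery_payoff, net_gain)
```

The negative expected net gain is the reason one might think twice before buying the ticket.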
Suppose now we have a random variable x which takes (independently distributed) values from
a finite and ordered set of real numbers $\{x_j\}_{j=1}^{N}$ with associated probabilities $\{p_{x_j}\}_{j=1}^{N}$. The
real numbers $\{x_j\}_{j=1}^{N}$ are the possible realizations of x. In turn, the set of associated probabilities
$\{p_{x_j}\}_{j=1}^{N}$ is the probability distribution of x. To relate to the previous example, let x be the pay-off of
a lottery which gives $x_j$ Euros with probability $p_{x_j}$, where j = 1, ..., N. For instance, if
$x_1$ is the event that the pay-off is 0 EUR and $x_2$ is the event that the pay-off is 10 EUR, then $p_{x_1}$ and
$p_{x_2}$ are the probabilities of those events.
What is then the expected value of x? Use E [x] to denote it. E [x] is given by
$$E[x] = p_{x_1}x_1 + p_{x_2}x_2 + ... + p_{x_N}x_N = \sum_{j=1}^{N} p_{x_j}x_j,$$
which clearly generalizes our previous example in a straightforward manner. If there were (countably)
infinitely many possible realizations of x, so that we had $\{x_j\}_{j=1}^{+\infty}$ and $\{p_{x_j}\}_{j=1}^{+\infty}$, then we would simply
write
$$E[x] = \sum_{j=1}^{+\infty} p_{x_j}x_j.$$
In any case E [x] is just a real number, of course assuming that E [x] < +∞.
Let’s continue this generalization process. Suppose now that random variable x takes values
from a continuous set of real numbers [A,B], where A < B (e.g., A = 0 and B = 100). Denote the
probability that the realization of x happens to be less than X by
F (X) = P (x < X) ,
where F is the probability distribution function of x. F function maps the set of possible realizations
[A,B] to [0, 1].
To derive the expected value of x we need to know the probability of each of the possible realizations
of x. For a second suppose that x, as in the previous example, took (independently distributed)
values from a discrete and ordered set $\{x_j\}_{j=1}^{+\infty}$. In such a case the probability of observing exactly
realization $x_2$ is the difference between the probability that the realization of x is less than $x_3$ and the
probability that x is less than $x_2$. In other words,
$$p_{x_2} = (p_{x_1} + p_{x_2}) - p_{x_1}.$$
According to our new notation this can then be written as
$$p_{x_2} = F(x_3) - F(x_2).$$
Denote this difference by ∆F,
$$\Delta F = F(x_3) - F(x_2).$$
Let’s go back to the case when x takes values from a continuous set [A,B]. Suppose, as in the
previous example, $x_1$, $x_2$, and $x_3$ are consecutive possible realizations of x. Since the set of possible
realizations of x is continuous, the distance between these realizations is infinitesimally small. In
such a circumstance we consider an infinitesimally small change in F to obtain the probability
that x takes a value of $x_2$. This infinitesimally small change we denote by dF instead of ∆F. The
expected value of x in this case is
$$E[x] = \int_{A}^{B} x \, dF(x).$$
In this expression, x are the possible realizations of x and dF (x) are their associated probabilities.3 We
use an integral instead of a sum since we are summing/integrating over a continuum of infinitesimally
small points.
In case F is a differentiable function on [A,B] (i.e., $dF(x)/dx$ exists on that interval) we can
rewrite E [x] as
$$E[x] = \int_{A}^{B} x f(x) \, dx,$$
where $f(x) = dF(x)/dx$ is the probability density function of x.
3 Again, we need to have E [x] < +∞ so that E [x] is something well defined, i.e., it is just a real number.
In economics and in many other disciplines we call the possible realizations of a random variable
"possible states" and the space of possible realizations the "space of possible states." You can easily
notice that the expected value of x depends on both the space of possible states and the probability
distribution of x. For example, suppose for simplicity that
$$F(x) = \frac{x - A}{B - A}$$
(i.e., we have a uniform distribution); then
$$E[x] = \frac{1}{B - A}\int_{A}^{B} x \, dx = \frac{1}{2}(B + A).$$
Therefore, changing B and/or A, which corresponds to changing the space of possible states of x,
changes its expected value. Consider a change in F now. Suppose,
$$F(x) = \frac{1}{\Phi\left(\frac{B-\mu}{\sigma}\right) - \Phi\left(\frac{A-\mu}{\sigma}\right)} \int_{A}^{x} \frac{1}{\sigma}\,\phi\left(\frac{s-\mu}{\sigma}\right) ds,$$
where
$$\phi(z) = \frac{1}{\sqrt{2\pi}}\, e^{-\frac{1}{2}z^{2}}, \qquad \Phi(z) = \int_{-\infty}^{z} \phi(Z)\, dZ$$
(i.e., we have a truncated normally distributed random variable with non-truncated mean µ and
variance σ²; Φ and φ are the standard normal distribution and density functions, correspondingly). In such a case,
it can be shown that
$$E[x] = \mu + \sigma\, \frac{\phi\left(\frac{A-\mu}{\sigma}\right) - \phi\left(\frac{B-\mu}{\sigma}\right)}{\Phi\left(\frac{B-\mu}{\sigma}\right) - \Phi\left(\frac{A-\mu}{\sigma}\right)}.$$
Cooking up examples for random variables with discrete distributions is much easier. Suppose
we have
(1) $\{x_j\}_{j=1}^{N} = \{1, 2\}$ and $\{p_{x_j}\}_{j=1}^{N} = \{0.5, 0.5\}$,
(2) $\{x_j\}_{j=1}^{N} = \{2, 2\}$ and $\{p_{x_j}\}_{j=1}^{N} = \{0.5, 0.5\}$, and
(3) $\{x_j\}_{j=1}^{N} = \{1, 2\}$ and $\{p_{x_j}\}_{j=1}^{N} = \{0.8, 0.2\}$.
Clearly, the difference between (1) and (2) is in the space of possible states. In turn, the difference
between (1) and (3) is in the associated probabilities/distribution of states. Expected values in each
of these cases are 1.5, 2, and 1.2, correspondingly. For more discussion see Appendix - Statistics 0.
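The two continuous examples can be checked numerically. The sketch below compares the closed-form means with a midpoint-rule approximation of the integral of x f(x) over [A, B]. It takes φ and Φ to be the standard normal density and distribution functions, and the values A = 0, B = 100, µ = 30, and σ = 20 are illustrative assumptions, as are all function names.

```python
import math


def phi(z):
    """Standard normal density."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def Phi(z):
    """Standard normal distribution function, via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def truncated_normal_mean(mu, sigma, A, B):
    """Closed-form E[x] for a normal(mu, sigma^2) variable truncated to [A, B]."""
    a, b = (A - mu) / sigma, (B - mu) / sigma
    return mu + sigma * (phi(a) - phi(b)) / (Phi(b) - Phi(a))


def numerical_mean(density, A, B, steps=100_000):
    """Midpoint-rule approximation of the integral of x * f(x) over [A, B]."""
    h = (B - A) / steps
    total = 0.0
    for i in range(steps):
        x = A + (i + 0.5) * h
        total += x * density(x) * h
    return total


A, B = 0.0, 100.0

# Uniform case: f(x) = 1 / (B - A), so E[x] should equal (A + B) / 2.
uniform_mean = numerical_mean(lambda x: 1.0 / (B - A), A, B)

# Truncated normal case with illustrative (assumed) parameters.
mu, sigma = 30.0, 20.0
mass = Phi((B - mu) / sigma) - Phi((A - mu) / sigma)


def truncated_density(x):
    """Density of the truncated normal on [A, B]."""
    return phi((x - mu) / sigma) / (sigma * mass)


closed_form = truncated_normal_mean(mu, sigma, A, B)
by_integration = numerical_mean(truncated_density, A, B)
print(uniform_mean, closed_form, by_integration)
```

Changing A, B, µ, or σ in this sketch shows directly how the expected value moves with the space of possible states and with the distribution.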
Economic agents act according to their expectations. Often in real life, as well as in economics,
we might not know exactly the entire space of possible states of a random variable, nor might we
know exactly its probability distribution function. In our examples we will see that economic
performance depends on economic agents’ beliefs about what the possible states are and what the
distribution function is.
Further examples and Keynesian-beauty contest
In order to further assert that expectations and the way they are formed matter in economics,
consider a game called the "p-beauty contest," first run in Nagel (1995). The rules of the game are
as follows:
• Each of N players is asked to choose a number from the interval [0, 100].
• The winner is the player whose choice is closest to p times the mean of the choices of all players, where p < 1 (e.g., p = 0.5).
In this game the random variable for a player is the mean of the choices of all players. Meanwhile,
the probability distribution of this variable depends on the types and beliefs/knowledge of all players.
For example, it turns out that if all players are rational in the sense that everyone performs iterated
elimination of (weakly) dominated strategies and all players know about that (i.e., it is common
knowledge that everyone is rational) then it is straightforward to guess what the mean of the
choices of all players would be. Under these assumptions everyone simply chooses 0. To see this, consider a
player. This player will never choose a number above 100p since it is dominated by 100p. Moreover,
given that the player believes that others are rational too, s/he will not pick a number above $100p^2$
since s/he will know that no one will pick above 100p. Similarly, believing that everyone is rational,
s/he will not pick a number above $100p^3$ and so on, until all numbers but zero are eliminated.
If p > 1 then 100 can also be an equilibrium (and 0 is not a "stable" equilibrium). For p = 1
any number chosen by all players can be an equilibrium.
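The iterated elimination argument for p < 1 can be traced numerically. The following sketch (the function name and the number of rounds are assumptions made for illustration) lists the largest number that survives each round, starting from 100.

```python
def elimination_bounds(p, rounds, upper=100.0):
    """Largest surviving choice after each round of iterated elimination."""
    bounds = [upper]
    for _ in range(rounds):
        # Nobody picks above the previous bound, so p times that bound
        # dominates every larger number.
        bounds.append(p * bounds[-1])
    return bounds


bounds = elimination_bounds(p=0.5, rounds=5)
print(bounds)  # [100.0, 50.0, 25.0, 12.5, 6.25, 3.125]

# With p = 2/3 the bound is already negligible after 50 rounds.
print(elimination_bounds(p=2 / 3, rounds=50)[-1])
```

The bound after k rounds is 100 times p to the power k, which shrinks to zero for any p < 1.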
This game mimics the problem of a seller in the stock market, for example. The seller wants to sell
his shares when the price of the share is at its peak, just before others start to sell. In
order to do that (i.e., to design his actions) the seller needs to know the types and beliefs/knowledge of
other sellers. This example motivated John Maynard Keynes to propose the original setup of this
game in Chapter 12 of his work The General Theory of Employment, Interest and Money (1936).
In that work, he proposed an explanation of fluctuations in prices in equity markets in terms of
changes in the beliefs of sellers and buyers. Instead of picking numbers, Keynes used an analogy based
on a newspaper contest, in which players are asked to choose, from a set of photographs of women,
those that are the "most beautiful." Those who picked the most popular face are then the winners. (This
is the reason why the name of the game is Beauty Contest.)
Imagine that there are three players. A naive strategy in this game would be to pick the face one
perceives as the most beautiful. If everyone does so, one could deviate with a more sophisticated
strategy and pick a face which s/he thinks is most likely to be chosen by the other two. If
everyone does so, one could deviate with an even more sophisticated strategy and pick a face which
s/he thinks is most likely to be expected to be chosen by the other two, etc.
Keynes wrote: "It is not a case of choosing those [faces] that, to the best of one’s judgment, are
really the prettiest, nor even those that average opinion genuinely thinks the prettiest. We have
reached the third degree where we devote our intelligences to anticipating what average opinion
expects the average opinion to be. And there are some, I believe, who practice the fourth, fifth and
higher degrees."
It turns out, however, that in reality the assumptions of rationality are often violated in p-beauty
contest games and, therefore, the equilibrium is not at 0. Researchers have shown this, for example,
using experiments with high school students. The following comment summarizes the thought
processes of a high school student participating in a newspaper contest (submitted to one of the
newspaper studies, the Spektrum der Wissenschaft). The game is a p-beauty contest with p = 2/3.
"I would like to submit the proposal of a class grade 8e of the Felix-Klein-Gymnasium Goettingen
for your game: 0.0228623. How did this value come up? Johanna . . . asked in the math-class
whether we should participate in this contest. The idea was accepted with great enthusiasm and
lots of suggestions were made immediately. About half of the class wanted to submit their favorite
numbers. To send one number for all, maybe one could take the average of all these numbers.
A first concern came from Ulfert, who stated that numbers greater than 66 2/3 had no chance
to win. Sonja suggested to take 2/3 of the average. At that point it got too complicated to some
students and the finding of the decision was postponed. In the next class Helena proposed to
multiply 33 1/3 with 2/3 and again with 2/3. However, Ulfert disagreed, because starting like
that one could multiply it again with 2/3. Others agreed with him that this process then could be
continued. They tried and realized that the numbers became smaller and smaller. A lot of students
gave up at that point, thinking that this way a solution could not be found. Other believed to have
found the path of the solution: one just has to submit a very small number.
However, one could not agree how many of the people who participated realized this process.
Johanna supposed that the people who read this newspaper are quite sophisticated. At the end of
the class 7 to 8 students heatedly continued to discuss this problem. The next day the math teacher
received the following message: We think it best to submit number 0.0228623."
Consider another game called the "ultimatum game." There are two players in this game. Player 1
is endowed with 10 EUR, and the players decide how to divide it. The rules of the game are as follows:
• Player 1 proposes a division of the sum
• Player 2 can either accept or reject this proposal
— If the player 2 rejects, neither player receives anything
— If the player 2 accepts, 10 EUR is split according to the proposal
The extensive form representation of the game is:
[Figure: extensive-form representation of the ultimatum game]
In this figure, it is assumed that player 1 gives either 2 EUR or 5 EUR to player 2. Player 2
then decides to accept or reject. If player 2 rejects in either of these cases, both get 0 EUR. In turn,
if player 2 accepts the offer, the money is split according to the proposal.
The strategy of player 1 is the proposal coupled with the expected strategy/response of player
2, which depends on the type of player 2. Imagine that players 1 and 2 care only about money in
this game (i.e., the players are expected pay-off maximizers). If player 1 knows the type of player
2 then s/he knows that player 2 will accept whatever positive amount s/he proposes. Therefore,
s/he can propose something very close to 0 (in fact, perhaps, exactly 0) and get as much of the
pay-off as possible (something around 10 EUR). If, however, player 1 expects that player 2 has a
strict preference for equality of the split of the reward (i.e., player 2 will reject a proposal if it splits
the reward very unequally) then s/he might propose something higher than 0 EUR and close to 5 EUR.
This is because otherwise s/he gets nothing.
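How player 1's proposal depends on the expected type of player 2 can be sketched as follows. The sketch makes two simplifying assumptions that are not in the notes: amounts are whole cents, and player 2 follows a threshold rule, accepting any offer at or above a minimum acceptable amount. All names are hypothetical.

```python
TOTAL_CENTS = 1000  # the 10 EUR to be divided, expressed in cents


def responder_accepts(offer, min_acceptable):
    """Assumed rule for player 2: accept any offer at or above the threshold."""
    return offer >= min_acceptable


def best_offer(min_acceptable_believed):
    """Offer (in cents) maximizing player 1's pay-off, given his/her belief."""
    best, best_payoff = 0, -1
    for offer in range(TOTAL_CENTS + 1):
        accepted = responder_accepts(offer, min_acceptable_believed)
        payoff = TOTAL_CENTS - offer if accepted else 0
        if payoff > best_payoff:
            best, best_payoff = offer, payoff
    return best, best_payoff


# Player 2 is believed to accept any positive amount (threshold: 1 cent).
print(best_offer(1))    # (1, 999)

# Player 2 is believed to reject anything below 4 EUR.
print(best_offer(400))  # (400, 600)
```

In both cases player 1 offers the smallest amount s/he expects to be accepted, so the proposal is driven entirely by the belief about player 2.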
Types of expectations
In previous (sub-)section we defined and discussed expectation operators for discrete and continuous
random variables. We saw that we can compute the expected value of a random variable in cases
when we know exactly what is the space of its possible states and the probability of those states
(distribution function). We saw as well that if we have three random variables which have different
spaces of possible states and/or probabilities of those states then their expected values can be
different (i.e., expected value depends on the set of possible states and on distribution function).
Further, we saw examples in which economic agents act according to their expectations although they
might not know exactly the entire space of possible states of a random variable or its probability
distribution function.
What to do if we don’t know exactly the distribution function of a random variable? To alleviate
such a problem we use statistics. In other words, we observe realizations of the random variable and
use them in order to infer certain moments of its distribution. Expected value is the first moment.
In particular, suppose that we have realizations $\{x_j\}_{j=1}^{N}$ of random variable x. Further, we have no
priors about the probability of each of the observations. In such a case, the mean of observations
$$x^e_N = \frac{1}{N}\sum_{j=1}^{N} x_j$$
is the sample (statistical) analogue of the expected value of x. Like E [x], it is just a number.
Here we have 1/N in front of each realization since we need to treat the observations as
equally likely (i.e., each observation has probability 1/N to occur). A central theorem in probability
theory called the Law of Large Numbers states that, as the number of (independent) realizations
of the random variable goes to infinity, the mean of the sample converges to the expected value of the random variable,
$$\Pr\left(\lim_{N \to +\infty} x^e_N = E[x]\right) = 1.$$
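The Law of Large Numbers can be illustrated by simulation. The sketch below draws independent realizations of the lottery pay-off from the previous section (100 EUR with probability 0.01, 0 EUR otherwise, so E [x] = 1) and computes the sample mean for growing N. The seed, the sample sizes, and the function names are assumptions of this sketch.

```python
import random


def sample_mean(draws):
    """Equally weighted mean of the observed realizations."""
    return sum(draws) / len(draws)


def draw_lottery(rng, n):
    """n independent draws: 100 EUR with probability 0.01, otherwise 0 EUR."""
    return [100.0 if rng.random() < 0.01 else 0.0 for _ in range(n)]


rng = random.Random(0)  # fixed seed so that the run is reproducible
draws = draw_lottery(rng, 200_000)

# The sample mean should settle near the expected value of 1 as N grows.
for n in (100, 10_000, 200_000):
    print(n, sample_mean(draws[:n]))
```

With few draws the sample mean can be far from 1 (it may even be exactly 0), which is why the result is a statement about large N.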
Usually in economics, and in particular in macroeconomics, we observe realizations of random
variables over time. Therefore, instead of indices j and N we often use t and T, where t indexes time
and T refers to its most recent value. Then we write, for example,
$$x^e_T = \frac{1}{T}\sum_{t=1}^{T} x_t,$$
where $x_1$ is the value that random variable x has obtained at time 1, $x_2$ is the value that random
variable x has obtained at time 2, etc. Again, this is the best (unbiased and consistent) guess of the
expected value of x when we have no priors (extra information) on the likelihood of its observations.
It is usually the default option, therefore, and we will treat
$$x^e_T = \frac{1}{T}\sum_{t=1}^{T} x_t$$
as the first type of expectations. It incorporates all observations and treats them equally.
In certain circumstances we might have priors over the likelihood of observations to reoccur.
For example, it might be that we know that older observations are less likely to happen again. To put
more meat into the discussion, suppose that we observe realizations $\{x_t\}_{t=1}^{T}$ of random variable x
and would like to compute the mean of x, perhaps because we will apply it for our actions at time
T + 1. For that purpose we will denote this mean by $x^e_{T+1}$ instead of $x^e_T$.
Suppose further that the likelihood of the most recent observation $x_T$ to reoccur is λ, with
0 < λ < 1, and that each step back in time multiplies this likelihood by (1 − λ): the likelihood of
$x_{T-1}$ to reoccur is λ (1 − λ), that of $x_{T-2}$ is λ (1 − λ)², and so on. (Notice that 0 < λ < 1 implies that
λ (1 − λ) < λ. Therefore $x_{T-1}$ is less likely to happen than $x_T$.) In such a case, we would write the
sample mean as
$$x^e_{T+1}(\lambda) = \lambda x_T + \lambda(1-\lambda)x_{T-1} + ... + \lambda(1-\lambda)^{T-1}x_1 = \lambda\sum_{t=1}^{T}(1-\lambda)^{T-t}x_t.$$
$x^e_{T+1}(\lambda)$ has a special name. It is called the exponentially weighted average with time decay. Notice
that $x^e_{T+1}(\lambda)$ can easily be written in a recursive form in the following way:
$$\begin{aligned} x^e_{T+1}(\lambda) &= \lambda x_T + (1-\lambda)x^e_T(\lambda) \\ &= x^e_T(\lambda) + \lambda\left[x_T - x^e_T(\lambda)\right]. \end{aligned}$$
According to the second line, $x^e_{T+1}(\lambda)$ is the sum of the old expectation $x^e_T(\lambda)$ and a weighted difference
between the realization of x at time T, $x_T$, and its expectation $x^e_T(\lambda)$. This last term adapts/corrects
the current expectation to the error in forecast/expectation: $[x_T - x^e_T(\lambda)]$. In this context, λ is
called a correction parameter. Economists call these types of expectations Adaptive Expectations.
These are the second type of expectations. Hereafter, we will use the notation $x^e_{T+1|T}$ to denote the
expectation of random variable $x_{T+1}$ conditional on information available at time T.
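The equivalence between the weighted sum and the recursive (error-correction) form of adaptive expectations can be checked with a short sketch. The observations are made up, and the initial expectation is set to 0 so that the recursion reproduces the finite sum exactly. Both choices are assumptions of this illustration.

```python
def adaptive_direct(observations, lam):
    """lam * sum over t of (1 - lam)^(T - t) * x_t, for t = 1, ..., T."""
    T = len(observations)
    return lam * sum((1 - lam) ** (T - t) * x
                     for t, x in enumerate(observations, start=1))


def adaptive_recursive(observations, lam, initial=0.0):
    """Correct the expectation by a fraction lam of the latest forecast error."""
    expectation = initial  # assumed starting expectation, before any data
    for x in observations:
        expectation = expectation + lam * (x - expectation)
    return expectation


observations = [2.0, 4.0, 3.0, 5.0]  # made-up realizations x_1, ..., x_4
direct = adaptive_direct(observations, lam=0.5)
recursive = adaptive_recursive(observations, lam=0.5)
print(direct, recursive)  # 3.875 3.875

# With lam = 1 the expectation is simply the last observation.
print(adaptive_recursive(observations, lam=1.0))  # 5.0
```

A larger correction parameter puts more weight on recent observations, and the case lam = 1 discards everything but the latest one.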
Macroeconomists used (and abused) adaptive expectations extensively in their models
before the 1970s. The assumption of adaptive expectations was imposed in these models without
much justification. For example, this assumption usually generates persistent errors in these models
(i.e., $\lim_{T \to +\infty} \frac{1}{T}\sum_{t=1}^{T}\left[x_t - x^e_T(\lambda)\right] \neq 0$), which seems odd. It implies that economic agents make
persistent errors (i.e., have no intention to correct their errors).
An assumption regarding expectations which is consistent with economic models is the "rational
expectations" assumption. This assumption states that economic agents use the model to form
expectations. A model is a description of an economy. Assuming that it is the right one, the rational
expectations assumption states that the agents know the economy entirely and they use that
information to form their expectations. "The agents know the economy" means that they know the
structure behind demand and supply and that there are random variables, with given distribution
functions, which affect supply and demand. In such a circumstance, in terms of the model, agents’
expectations are not systematically wrong in that all errors are random/not persistent. Therefore,
under this assumption deviations from perfect foresight in the model are only random.
Denote these expectations as
$$x^e_{T+1|T} = E_T[x_{T+1}],$$
where $E_T[x_{T+1}] \equiv E[x_{T+1}|\Omega_T]$ is the expected value of x at time T + 1 given all information till
time T. Information is summarized by $\Omega_T$ and includes all the possible structures in the
economy and values of fundamentals. These are the third type of expectations.
Let’s see why this can work better than adaptive expectations, which miss some of the structure
of the model. Suppose that agents are assumed to have adaptive expectations, and the model
economy features a constantly rising inflation rate. In such a circumstance agents would be assumed
to always underestimate inflation since they are assumed to predict inflation by looking at inflation
in previous years. Under the rational expectations assumption, since the trend in inflation is part of
the model, agents would take it into account in forming their expectations and there won’t be such
a bias.
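Formally, this can be represented in the following manner. Consider an economy which starts
at time 1 and where inflation π at any time T is given by
$$\pi_T = \bar{\pi} + T + \eta_T,$$
where $\bar{\pi}$ is a constant, T indexes time, and $\eta_T$ is a random variable with 0 mean and variance σ².
Further, suppose we are at the end of T = 2, and the agents in this economy need to form an
expectation of inflation for time T + 1 = 3. If agents have adaptive expectations with correction
parameter λ = 1/3, then
$$\pi^e_3\left(\tfrac{1}{3}\right) = \frac{1}{3}\sum_{t=1}^{2}\left(\frac{2}{3}\right)^{2-t}\pi_t.$$
At the end of T = 2 the realizations of η are known. Suppose $\eta_1 = 0.01$ and $\eta_2 = 0.02$. Expected
inflation can then be rewritten as
$$\pi^e_3\left(\tfrac{1}{3}\right) = \frac{1}{3}\left[(\bar{\pi} + 2 + 0.02) + \frac{2}{3}(\bar{\pi} + 1 + 0.01)\right] = \frac{5}{9}\bar{\pi} + \frac{202}{225} \approx \frac{5}{9}\bar{\pi} + 0.898.$$
In case of rational expectations,
$$\begin{aligned} \pi^e_{3|2} &= E[\pi_3|\Omega_2] \\ &= E[\bar{\pi} + 3 + \eta_3|\Omega_2] \\ &= \bar{\pi} + 3 + E[\eta_3|\Omega_2] \\ &= \bar{\pi} + 3. \end{aligned}$$
Notice that
$$\pi^e_{3|2} > \pi^e_3\left(\tfrac{1}{3}\right)$$
for any $\bar{\pi} > -4.73$, and in particular for any non-negative $\bar{\pi}$.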
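The arithmetic of this example can be reproduced with exact fractions. The sketch below keeps the constant term of the inflation process symbolic by tracking its coefficient separately, so no value for it has to be assumed. The variable names are hypothetical.

```python
from fractions import Fraction

lam = Fraction(1, 3)
shocks = {1: Fraction("0.01"), 2: Fraction("0.02")}  # eta_1, eta_2 from the text

# Inflation at time t is (constant) + t + eta_t. The constant and the rest
# are tracked separately so that the constant can stay symbolic.
T = 2
weights = {t: lam * (1 - lam) ** (T - t) for t in (1, 2)}
adaptive_coefficient = sum(weights.values())  # multiplies the constant
adaptive_constant = sum(w * (t + shocks[t]) for t, w in weights.items())

# Rational expectations: (constant) + 3, since the shock has zero mean.
rational_coefficient, rational_constant = Fraction(1), Fraction(3)

print(adaptive_coefficient, adaptive_constant)  # 5/9 202/225

# Gap between rational and adaptive expectations, as (coefficient, constant).
gap = (rational_coefficient - adaptive_coefficient,
       rational_constant - adaptive_constant)
print(gap)
```

The gap is positive for any non-negative constant, which is the underestimation of inflation under adaptive expectations described above.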
In this simple economy, trend in inflation can be thought to represent policy changes of a central
bank. In other words, suppose there is a central bank in this economy which constantly increases
inflation.
One of the influential drivers behind the widespread adoption of rational expectations in
macroeconomics has been the famous Lucas Critique. In terms of our discussion, the Lucas Critique is that
in models which do not feature agents with rational expectations, agents might not react to policy
changes. This does not seem to reconcile well with reality since in such a case agents can be tricked
almost always with some policy changes. There is a famous quote attributed to Lincoln on this issue: "You can
fool some of the people all of the time, and all of the people some of the time, but you can not fool
all of the people all of the time."
Despite these seemingly plausible properties and widespread use, the rational expectations
assumption has received criticism on the grounds that it assumes that agents know everything. The criticism
goes on to say that in the real world agents (consumers, firms, etc.) do not know the economy
exactly.4 (If this were to happen, would Economics be a science?)
4 There is an emerging field in macroeconomics which deals with this issue. The models in this field feature agents which continuously learn the economy and know the economy fully at the end of the time horizon (i.e., they are asymptotically rational). For details, see Evans, G., and Honkapohja, S. (2001). Learning and expectations in macroeconomics. Princeton University Press.
Traditional models with expectations
This section highlights how important the assumption on expectations is in two traditional models.
The models are the Cobweb Model and the Cagan Model.
The Cobweb Model
The Cobweb Model (Kaldor, 1934) proposes an explanation of why prices might be subject to periodic
fluctuations in certain markets. It assumes that firms must choose output before prices are observed,
that demand and supply (prices) are uncertain, and that firms have adaptive expectations with λ = 1.
This type of expectations is called Static Expectations since there is no correction to error. Firms’
expectations about prices at time T + 1 are based on the prices that prevailed in the previous period,
at time T. In other words, using p to denote prices,
$$p^e_{T+1} = p^e_T(\lambda) + \left[p_T - p^e_T(\lambda)\right] = p_T.$$
Hereafter, we will replace T + 1 with t.
It seems that such a model applies well to agricultural markets. In such markets,
producers invest in production (start the production process) long before they sell their output.
Periodic fluctuations can then happen, for example, because of supply shocks such as bad weather.
For example, according to this model, if producers of corn experience very bad weather and have
reduced output, they would get higher prices. In the next period, they expect high prices and
therefore will produce a lot. This will dampen the prices and lead to an expectation of low prices.
Expecting low prices, the producers of corn will produce little corn and in equilibrium the price will rise,
etc.
In graphical terms, the process described above can be represented in the following manner.
[Figure: two cobweb diagrams with supply (S) and demand (D) curves, one to the left and one to the right]
The difference between these two figures is the relation between the slopes of the demand and
supply curves. In the figure to the left, the slope of the inverse supply curve is higher than the
absolute value of the slope of the inverse demand curve. In the figure to the right, the slope of the
inverse supply curve is lower than the absolute value of the slope of the inverse demand curve.
Suppose an economy represented by these figures starts at time t = 0 at the intersection of the S
(supply) and D (demand) curves. In that period the economy receives a negative shock to supply
so that prices in the next period are p1. In period t = 1 the economy receives a counterbalancing
positive supply shock which brings the supply curve to its original position. In this period, however,
expected prices for period t = 1 are still p1. Producers produce according to p1. Demand,
however, falls short of supply. Therefore, the maximum that producers are able to charge is p2, which
is lower than p1 (p2 < p1). For period 2 producers then expect price p2 and produce accordingly.
In such a case, in period 2 it turns out that they have produced less than is demanded. They then
sell at a higher price p3 (p2 < p3). This process continues and generates periodic fluctuations in
prices.
In case of the figure to the left, prices tend to stabilize after the shock. This happens because
inverse supply (or supply prices) reacts more than does inverse demand (or demand prices). Faced
with higher demand in period t = 2, for example, firms raise their prices. They raise them so much that
demand is dampened in the next period. However, in case of the figure to the right, prices do not
stabilize. This happens because inverse supply reacts less than does the inverse demand.5
Lets make this model more formal. Suppose at any time t demand and supply functions are
given by
Dt = mI −mppt + η1,t,
St = rI + rppet + η2,t,
where mI , mp, rI , and rp are positive parameters. 1mp
and 1rp.are absolute values of the slopes of
demand and supply curves. η1,t and η2,t are shocks/disturbances. Let η1,t and η2,t be identically and
independently distributed (i.i.d.) and have 0 mean and σ2 variance. The values of these shocks are
not known at the time when price expectations are formed. At time t, supply function is designed
according to the expected value of the price. To stay in line with our story, assume that the economy
starts at p0 where D and S curves intersect. Moreover, η1,t ≡ 0 and η2,0 < 0 so that price shifts
from p0 to p1 and makes pe1 = p1. Further, η2,1 = −η2,0.and η2,t ≡ 0 ∀t > 1.
The market clearing condition requires that at each and every point in time the quantity demanded is equal to the quantity supplied,
$$D_t = S_t.$$
Therefore,
$$m_I - m_p p_t + \eta_{1,t} = r_I + r_p p^e_t + \eta_{2,t}$$
and
$$p_t = \frac{m_I - r_I}{m_p} - \frac{r_p}{m_p} p^e_t + \frac{\eta_{1,t} - \eta_{2,t}}{m_p}.$$
$^{5}$ If in this economy the absolute values of the slopes of demand and supply were equal, then there would be permanent price fluctuations of constant magnitude.
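Denote
$$\alpha_1 = \frac{m_I - r_I}{m_p}, \qquad \alpha_2 = \frac{r_p}{m_p}, \qquad \eta_t = \frac{\eta_{1,t} - \eta_{2,t}}{m_p},$$
and rewrite $p_t$ as
$$p_t = \alpha_1 - \alpha_2 p^e_t + \eta_t.$$
Assuming that $p^e_t = p_{t-1}$, we have
$$p_t = \alpha_1 - \alpha_2 p_{t-1} + \eta_t.$$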
This is a very basic stochastic difference equation. Its solution is the sum of the general solution of the homogeneous equation
$$p_t = -\alpha_2 p_{t-1}$$
and a particular solution of the entire equation.
The solution of $p_t = -\alpha_2 p_{t-1}$ is very basic,
$$p^h_t = (-\alpha_2)^{t-1} p_1,$$
where $p_1$ is the price at which the economy started. Now we need to guess a particular solution of the full equation. It turns out that for this form of equation a particular solution is
$$p^p_t = \sum_{\tau=1}^{t-1} (-\alpha_2)^{t-1-\tau} \left(\alpha_1 + \eta_{\tau+1}\right).$$
Therefore,$^{6}$
$$p_t = (-\alpha_2)^{t-1} p_1 + \sum_{\tau=1}^{t-1} (-\alpha_2)^{t-1-\tau} \left(\alpha_1 + \eta_{\tau+1}\right).$$
Ignore the second term in this expression and notice that if $\alpha_2 > 1$ then the absolute value of $(-\alpha_2)^{t-1}$ increases over time. Therefore, $p_t$ diverges (it oscillates with ever growing amplitude). Parameter $\alpha_2 = r_p/m_p$ is the inverse of the ratio $m_p/r_p$ of the slopes of the inverse supply and demand curves. It is greater than 1 when $r_p > |-m_p|$, which is equivalent to saying that the slope of the inverse supply curve, $1/r_p$, is lower than the absolute value of the slope of the inverse demand curve, $1/m_p$ (so that inverse demand reacts more than inverse supply). However, if $\alpha_2 < 1$ then the absolute value of $(-\alpha_2)^{t-1}$ declines over time. Therefore, $p_t$ converges to a number. In this case $r_p < |-m_p|$, which means that the slope of the inverse supply curve is higher than the absolute value of the slope of the inverse demand curve (so that inverse demand reacts less than inverse supply). What is the number that the price converges to? The answer to this question is quite simple. Convergence here means that the price stabilizes over time (i.e., we have a steady state). We have assumed that $\eta_t = 0$ for any $t > 1$. Therefore, a stable price means
$$p = \alpha_1 - \alpha_2 p,$$
and the price converges to
$$p = \frac{\alpha_1}{1 + \alpha_2}.$$
Notice that $p$ is the level of the price where our example D and S curves intersect. In other words, $p = p_0$.
$^{6}$ To see that this is the solution, plug this expression into the equation above.
To make this analysis clearer, let's consider several numerical examples. Suppose that
$$p_1 = 1, \qquad m_I = m_p = 1, \qquad r_I = 0,$$
and consider two values of $r_p$,
$$r_p^{(1)} = 0.5, \qquad r_p^{(2)} = 1.5.$$
This implies that
$$\alpha_1 = 1, \qquad \alpha_2^{(1)} = 0.5, \qquad \alpha_2^{(2)} = 1.5, \qquad \eta_t = -\eta_{2,t}.$$
Fixing the value of $p_1$ together with $\eta_{2,1} = -\eta_{2,0}$ determines the values of the shocks (given, of course, that $\eta_{1,t} \equiv 0$ and $\eta_{2,t} \equiv 0$ for $t > 1$).
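Let's determine the paths of $p$ for these parameter values. First, consider the case when $r_p^{(1)} = 0.5$:
$$\begin{aligned}
p_2 &= 1 - 0.5 p_1 + \eta_2 = 0.5, \\
p_3 &= 1 - 0.5 p_2 + \eta_3 = 0.75, \\
p_4 &= 1 - 0.5 p_3 + \eta_4 = 0.625, \\
&\;\;\vdots \\
p_{15} &= 0.666687012, \\
&\;\;\vdots \\
p_{+\infty} &= \frac{2}{3}.
\end{aligned}$$
The price converges to $\frac{2}{3}$, which is equal to $\frac{\alpha_1}{1+\alpha_2}$.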
Now consider the case when $r_p^{(2)} = 1.5$:
$$\begin{aligned}
p_2 &= 1 - 1.5 p_1 + \eta_2 = -0.5, \\
p_3 &= 1 - 1.5 p_2 + \eta_3 = 1.75, \\
p_4 &= 1 - 1.5 p_3 + \eta_4 = -1.625, \\
&\;\;\vdots \\
p_{15} &= 175.5575562, \\
&\;\;\vdots \\
p_{+\infty} &= \pm\infty.
\end{aligned}$$
The price diverges. Of course, a negative price does not make sense. Therefore, one would have stopped at $p_2$, saying that there is something fundamentally wrong in this economy.
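The two paths above can be reproduced with a few lines of code. The following is a minimal sketch, assuming static expectations and all shocks from period 2 onward set to zero; the function name `cobweb_path` is introduced here only for illustration.

```python
def cobweb_path(alpha1, alpha2, p1, periods):
    """Iterate p_t = alpha1 - alpha2 * p_{t-1} with all shocks set to zero.

    Returns the list [p_1, ..., p_periods].
    """
    path = [p1]
    for _ in range(periods - 1):
        path.append(alpha1 - alpha2 * path[-1])
    return path


# alpha2 = 0.5: damped oscillations around alpha1 / (1 + alpha2) = 2/3
stable = cobweb_path(alpha1=1.0, alpha2=0.5, p1=1.0, periods=15)
# alpha2 = 1.5: oscillations with growing amplitude
unstable = cobweb_path(alpha1=1.0, alpha2=1.5, p1=1.0, periods=15)

print("alpha2 = 0.5:", stable[1:4], "p_15 =", stable[-1])
print("alpha2 = 1.5:", unstable[1:4], "p_15 =", unstable[-1])
```

Changing `alpha2` to exactly 1 in this sketch produces the permanent fluctuations of constant magnitude mentioned in footnote 5.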
What happens if $\lambda < 1$ and we have adaptive expectations? If $\lambda < 1$ then
$$p^e_t = p^e_{t-1} + \lambda \left(p_{t-1} - p^e_{t-1}\right),$$
where
$$p_{t-1} = \alpha_1 - \alpha_2 p^e_{t-1} + \eta_{t-1}.$$
Expectations can be written as
$$p^e_t = \left[1 - \lambda (1 + \alpha_2)\right] p^e_{t-1} + \lambda \left(\alpha_1 + \eta_{t-1}\right).$$
In this case we have a non-homogeneous difference equation in expectations. Similar to the previous discussion, expectations converge to a number if $|1 - \lambda (1 + \alpha_2)| < 1$ and diverge to infinity otherwise. Steady expectations are given by
$$p^e = \left[1 - \lambda (1 + \alpha_2)\right] p^e + \lambda \alpha_1, \qquad p^e = \frac{\alpha_1}{1 + \alpha_2}.$$
Since we have adaptive expectations, $p^e$ is steady when prices are steady,$^{7}$
$$p = \frac{\alpha_1}{1 + \alpha_2}.$$
Interestingly, there can be situations in which prices (and their expected values) are stable under adaptive expectations with $\lambda < 1$ but unstable under static expectations. For instance, suppose that $1 - \lambda (1 + \alpha_2) < 0$. Then, since by definition $\alpha_2 > 0$, the absolute value of $1 - \lambda (1 + \alpha_2)$ is always less than $\alpha_2$:
$$|1 - \lambda (1 + \alpha_2)| = \lambda (1 + \alpha_2) - 1,$$
$$\lambda (1 + \alpha_2) - 1 < \alpha_2 \Leftrightarrow \lambda < 1.$$
This implies that if $1 - \lambda (1 + \alpha_2) < 0$ then it is more likely that prices (and their expected values) are stable under adaptive expectations with $\lambda < 1$ than under static expectations. This happens because $\lambda$ suppresses the reaction of the inverse supply curve to the observed price. For instance, suppose that $\alpha_2 = 1.5$, so that we are back to our example. Further, suppose $\lambda = 0.5$. In such a case $|1 - \lambda (1 + \alpha_2)| = 0.25$ and prices (together with their expected values) are stable under adaptive expectations. We have seen, however, that for this example prices (and their expected values) are not stable under static expectations.
In case, however, $1 - \lambda (1 + \alpha_2) > 0$, then
$$|1 - \lambda (1 + \alpha_2)| = 1 - \lambda (1 + \alpha_2),$$
$$1 - \lambda (1 + \alpha_2) < \alpha_2 \Leftrightarrow \frac{1 - \alpha_2}{1 + \alpha_2} < \lambda.$$
In case $\alpha_2 > 1$ (and therefore prices are not stable under static expectations) it is always the case that $\frac{1 - \alpha_2}{1 + \alpha_2} < \lambda$, which means that adaptive expectations are more likely to generate stable prices (and expectations). For instance, again suppose that $\alpha_2 = 1.5$ but now $\lambda = 0.1$. In such a case $|1 - \lambda (1 + \alpha_2)| = 0.75$ and prices (together with their expected values) are stable under adaptive expectations.
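A small simulation makes the comparison concrete. This is only a sketch under the assumptions of the numerical example ($\alpha_1 = 1$, $\alpha_2 = 1.5$, no shocks); the function name `adaptive_cobweb` and the initial expectation of 1 are choices made here for illustration.

```python
def adaptive_cobweb(alpha1, alpha2, lam, pe1, periods):
    """Cobweb model with adaptive expectations and no shocks.

    Each period the price clears the market given the current expectation,
    p = alpha1 - alpha2 * pe, and the expectation is then updated by
    pe <- pe + lam * (p - pe). Returns the list of realized prices.
    """
    pe = pe1
    prices = []
    for _ in range(periods):
        p = alpha1 - alpha2 * pe      # market-clearing price given expectations
        prices.append(p)
        pe = pe + lam * (p - pe)      # adaptive updating of expectations
    return prices


steady = 1.0 / (1.0 + 1.5)            # alpha1 / (1 + alpha2) = 0.4
fast = adaptive_cobweb(1.0, 1.5, lam=0.5, pe1=1.0, periods=60)
slow = adaptive_cobweb(1.0, 1.5, lam=0.1, pe1=1.0, periods=60)
static = adaptive_cobweb(1.0, 1.5, lam=1.0, pe1=1.0, periods=60)  # lam = 1 is the static case

print("lam = 0.5:", fast[-1], " lam = 0.1:", slow[-1], " steady state:", steady)
print("lam = 1.0 (static):", static[-1])
```

With $\lambda = 0.5$ and $\lambda = 0.1$ the price settles at $\alpha_1/(1+\alpha_2)$, while $\lambda = 1$ reproduces the explosive path of the static case.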
$^{7}$ To verify this, plug $p^e = \frac{\alpha_1}{1+\alpha_2}$ into $p = \alpha_1 - \alpha_2 p^e$.
What would happen if we had rational expectations in this model? In the case of rational expectations, expectations are implied by the model. In order to answer the question, we then need to solve for the expected price level from the demand and supply equations. Luckily, we have done most of the job. Write $E_{t-1}[p_t]$ instead of $p^e_t$ in $\alpha_1 - \alpha_2 p^e_t + \eta_t$:
$$p_t = \alpha_1 - \alpha_2 E_{t-1}[p_t] + \eta_t.$$
Take the expected value of $p_t$ conditional on information at time $t-1$:
$$\begin{aligned}
E_{t-1}[p_t] &= E_{t-1}[\alpha_1] - \alpha_2 E_{t-1}\left[E_{t-1}[p_t]\right] + E_{t-1}[\eta_t] \\
&= \alpha_1 - \alpha_2 E_{t-1}[p_t].
\end{aligned}$$
Therefore,
$$E_{t-1}[p_t] = \frac{\alpha_1}{1 + \alpha_2}$$
and this relation holds for any $t$.$^{8}$ This implies that in the short term
$$p_t = \alpha_1 - \alpha_2 \frac{\alpha_1}{1 + \alpha_2} + \eta_t = \frac{\alpha_1}{1 + \alpha_2} + \eta_t,$$
and current shocks are the only source of fluctuations in the economy. This inference holds because $\eta$ has zero mean and is i.i.d. (unpredictable), so its realizations do not affect expectations. In the long term, however, we assume that there are no shocks. In some sense this corresponds to assuming that everything is perfect. In such a case, $E_{t-1}[p_t] = p_t$ and
$$p = \frac{\alpha_1}{1 + \alpha_2}.$$
This analysis points out that the type of expectations matters for the level of aggregate economic activity and prices, and for fluctuations. With static or adaptive expectations, this model can generate prolonged fluctuations in the economy long after the shock. With rational expectations, however, there are no fluctuations after the shock.
The Cagan Model
The Cagan Model has been an influential contribution to policy making and academia. In his 1956 article, Cagan (1956) proposed a model which delivered a novel explanation for extraordinarily high inflation (hyperinflation). This model seems to do well in explaining the behavior of inflation and the demand for money even in the midst of such distress.
The common line of thought is that hyperinflation is caused by the continued supply of very large quantities of money by the central bank. Famous examples are the hyperinflations in Germany during 1921-1923 and in Argentina in 1989. Hyperinflation in both countries happened because their governments decided to fund their debt by printing money.
The Cagan Model stresses the destabilizing effects that expectations might have on inflation. In the Cagan Model, inflation can destabilize and become very large even with a very small amount of money injected into the economy. Therefore, an implication of the model is that policy makers should be wary of financing a deficit by printing money, even in small quantities.
$^{8}$ We have just derived the expectation from the model. In this sense we have assumed that the agents who live in the economy described by the model know the economy and use the model to form expectations. Moreover, in this sense adaptive expectations are a notion imposed on the model (an additional condition/assumption, so to say).
The Cagan Model consists of several blocks. The first block of this model is the money demand equation
$$\frac{M_t}{P_t} = L(Y_t, i_t),$$
where $M_t$ is the amount of desired money holdings, $P_t$ is the aggregate price level, and money is the medium of exchange. Therefore, $\frac{M_t}{P_t}$ is the real amount of desired money holdings. $L(Y_t, i_t)$ is the money demand function. It is assumed to depend on current output/income $Y_t$ and the nominal interest rate $i_t$. More precisely, it is assumed to increase with current output/income since increasing output implies that the same amount of money buys more goods. In turn, money demand is assumed to depend negatively on the nominal interest rate. A justification for this assumption is that the nominal interest rate is paid on bonds/nominal savings but it is not paid on money holdings. Money holdings can be freely converted into savings. Therefore, the nominal interest rate is the opportunity cost of holding money. A higher nominal interest rate increases this opportunity cost and reduces the desire to hold money. (This equation can be thought of as a combination of the Quantity Equation of Money and Liquidity Preference Theory.)
Suppose that $L(\cdot, \cdot)$ is differentiable in both arguments. Formally, our assumptions about $L(\cdot, \cdot)$ can be summarized as
$$\frac{\partial L(Y, i)}{\partial Y} > 0, \qquad \frac{\partial L(Y, i)}{\partial i} < 0,$$
where the time index $t$ has been dropped because these inequalities are assumed to hold for any $t$.
The second block is the Fisher Equation. In a deterministic setup the Fisher Equation is
$$1 + i_t = (1 + \pi_{t+1})(1 + r_t),$$
where $i_t$ is the nominal rate agreed for bonds at time $t$ and paid at time $t+1$ on bond holdings from time $t$. $\pi_{t+1}$ is inflation in the period from $t$ to $t+1$,
$$\pi_{t+1} = \frac{P_{t+1} - P_t}{P_t}.$$
In turn, $r_t$ is the real interest rate.
It is straightforward to obtain this equation. Suppose at time $t$ one has acquired an asset $A_t$ at value $P_t A_t$. At time $t+1$ the value of the asset has changed to $P_{t+1} A_{t+1}$. The percentage change in the value of the asset is the nominal interest earned on $P_t A_t$, and the percentage change in the volume/quantity of the asset is the real interest rate,
$$\begin{aligned}
i_t &= \frac{P_{t+1} A_{t+1} - P_t A_t}{P_t A_t} = \frac{P_{t+1} A_{t+1}}{P_t A_t} - 1 \\
&= \left(\frac{P_{t+1}}{P_t}\right)\left(\frac{A_{t+1}}{A_t}\right) - 1 \\
&= \left(\frac{P_{t+1} - P_t}{P_t} + 1\right)\left(\frac{A_{t+1} - A_t}{A_t} + 1\right) - 1 \\
&= (1 + \pi_{t+1})(1 + r_t) - 1.
\end{aligned}$$
In the Cagan Model, however, future prices are not known. They are assumed to follow a random/stochastic process. In such a case, we replace $P_{t+1}$ with its expected value $P^e_{t+1}$. This also implies that we need to replace the inflation rate with its expected value, and the Fisher Equation becomes
$$1 + i_t = \left(1 + \pi^e_{t+1}\right)(1 + r_t).$$
There is no expectation on $i_t$ since, although it has to be paid at time $t+1$, it is determined/agreed at time $t$.$^{9}$
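The multiplicative form matters in exactly the situations the Cagan Model is about. The sketch below compares it with the familiar additive form $i_t = \pi^e_{t+1} + r_t$ for a few inflation rates; the function names and the 3% real rate are illustrative assumptions, not values taken from these notes.

```python
def nominal_rate_exact(expected_inflation, real_rate):
    """Exact Fisher relation: 1 + i = (1 + pi_e) * (1 + r)."""
    return (1.0 + expected_inflation) * (1.0 + real_rate) - 1.0


def nominal_rate_approx(expected_inflation, real_rate):
    """Additive approximation i = pi_e + r, accurate only for small rates."""
    return expected_inflation + real_rate


real_rate = 0.03                       # assumed real rate of 3%
for pi_e in (0.02, 0.50, 5.00):        # 2%, 50% and 500% expected inflation
    exact = nominal_rate_exact(pi_e, real_rate)
    approx = nominal_rate_approx(pi_e, real_rate)
    print(f"pi_e={pi_e:.2f}  exact={exact:.4f}  approx={approx:.4f}  gap={exact - approx:.4f}")
```

The gap between the two forms equals the cross term $\pi^e_{t+1} r_t$, which is negligible at low inflation and sizable during hyperinflation.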
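The analysis in the Cagan Model focuses particularly on periods of hyperinflation. In these periods nominal values change much more rapidly than real values. Therefore, let's assume that real variables are constant. Moreover, assume that the logarithm of the money demand function is linear in $\ln Y_t$ and $\ln(1 + i_t)$. In particular,
$$\begin{aligned}
\ln L(Y_t, i_t) &= \alpha_0 + \alpha_1 \ln Y - \alpha_\pi \ln\left[\left(1 + \pi^e_{t+1}\right)(1 + r)\right] \\
&=: \alpha - \alpha_\pi \ln\left(1 + \pi^e_{t+1}\right),
\end{aligned}$$
where $\alpha_0$, $\alpha_1$, and $\alpha_\pi$ are positive constants and $\alpha = \alpha_0 + \alpha_1 \ln Y - \alpha_\pi \ln(1 + r)$.
In such a case, combining the money demand equation and the Fisher Equation, we have
$$\ln M_t - \ln P_t = \alpha - \alpha_\pi \left(\ln P^e_{t+1} - \ln P_t\right).$$
Denote logarithms by lowercase letters, $z = \ln Z$, and rewrite the equation above as
$$m_t - p_t = \alpha - \alpha_\pi \left(p^e_{t+1} - p_t\right),$$
or
$$p_t = -\frac{\alpha}{1 + \alpha_\pi} + \frac{1}{1 + \alpha_\pi} m_t + \frac{\alpha_\pi}{1 + \alpha_\pi} p^e_{t+1}.$$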
This is a fairly interesting equation. It suggests that the current price is a function of the current money supply and the expected future price level. Moreover, the current price increases with money supply. This happens because, given constant output, increasing the money supply simply increases prices. (Prices are the rates at which money is converted to goods.) The current price also increases with the expected future price. This happens because a higher expected price increases expected inflation. Given a constant real interest rate, this implies that the nominal interest rate increases, which reduces the desire to hold money. However, the supply of money is unchanged, i.e., $M_t$ is fixed. Therefore, the rates at which money is exchanged for goods increase, which is the same as saying that prices, $p_t$, increase. In terms of equations:
$$p^e_{t+1} \uparrow \;\Rightarrow\; \pi^e_{t+1} \uparrow \text{ and } r_t = \text{const} \;\Rightarrow\; i_t \uparrow \;\Rightarrow\; L(Y_t, i_t) \downarrow \text{ and } M_t = \text{const} \;\Rightarrow\; p_t \uparrow.$$
$^{9}$ You might be used to the following form of the Fisher Equation: $i_t = \pi^e_{t+1} + r_t$. This is an approximation of the one stated above. To see this, assume that $i_t$, $\pi^e_{t+1}$, and $r_t$ are close to 0 and apply the approximation $x \approx \ln(1 + x)$, which holds when $x$ is close to zero.
If we assume that $\alpha = 0$, then the price (well... the logarithm of it) becomes a weighted average of the current money supply and the expected future price,
$$p_t = \frac{1}{1 + \alpha_\pi} m_t + \frac{\alpha_\pi}{1 + \alpha_\pi} p^e_{t+1}.$$
For simplicity we will maintain this assumption and, to keep the house clean, will denote $\psi = \frac{1}{1 + \alpha_\pi} \in (0, 1)$ so that
$$p_t = \psi m_t + (1 - \psi) p^e_{t+1}. \qquad (1)$$
This equation, together with an assumption on expectations, is the Cagan Model.
We will now analyze how different types of expectations matter for dynamics in this economy. Moreover, we will check how changes in money supply matter for changes in prices and inflation, which is the percentage change in prices.
Suppose that agents in this economy have adaptive expectations. Further, for simplicity suppose that the expected value of the price for time $t-1$ coincided with its realization, $p^e_{t-1} = p_{t-1}$ (i.e., $p_{t-1} = p_{t-2}$ and $\pi_{t-1} = 0$). In such a case
$$\begin{aligned}
p^e_{t+1} &= (1 - \lambda) p^e_t + \lambda p_t \\
&= (1 - \lambda)\left[(1 - \lambda) p^e_{t-1} + \lambda p_{t-1}\right] + \lambda p_t \\
&= (1 - \lambda)\left[(1 - \lambda) p_{t-1} + \lambda p_{t-1}\right] + \lambda p_t \\
&= (1 - \lambda) p_{t-1} + \lambda p_t.
\end{aligned}$$
Therefore, the Cagan Model can be expressed as
$$p_t = \frac{\psi}{1 - (1 - \psi)\lambda} m_t + \frac{(1 - \psi)(1 - \lambda)}{1 - (1 - \psi)\lambda} p_{t-1}.$$
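The general solution of this difference equation is the sum of the general solution of the homogeneous equation and a particular solution of this equation. It is "easy" to guess and verify that it is given by
$$p_t = \left[\frac{(1 - \psi)(1 - \lambda)}{1 - (1 - \psi)\lambda}\right]^{t} p_0 + \frac{\psi}{1 - (1 - \psi)\lambda} \sum_{\tau=0}^{t} \left[\frac{(1 - \psi)(1 - \lambda)}{1 - (1 - \psi)\lambda}\right]^{t-\tau} m_\tau,$$
where $p_0$ is the initial level of prices. It is a given number.
Suppose only a few $m_t \neq 0$ (or $m_t$ is a stationary function). The logarithm of the price level in such a case is stable if
$$\frac{(1 - \psi)(1 - \lambda)}{1 - (1 - \psi)\lambda} < 1.$$
With this inequality, the logarithm of the price level converges over time to 0. Therefore, the price converges to 1. Inflation, in turn, can be written as
$$\begin{aligned}
\pi_t &= p_t - p_{t-1} \\
&= -\left[\frac{(1 - \psi)(1 - \lambda)}{1 - (1 - \psi)\lambda}\right]^{t-1} \frac{\psi}{1 - (1 - \psi)\lambda} p_0 \\
&\quad + \frac{\psi}{1 - (1 - \psi)\lambda} \left\{ m_t - \frac{\psi}{1 - (1 - \psi)\lambda} \sum_{\tau=0}^{t-1} \left[\frac{(1 - \psi)(1 - \lambda)}{1 - (1 - \psi)\lambda}\right]^{t-1-\tau} m_\tau \right\}.
\end{aligned}$$
It is straightforward to notice that $\frac{(1-\psi)(1-\lambda)}{1-(1-\psi)\lambda} < 1$, since the numerator falls short of the (positive) denominator by exactly $\psi$. Therefore, we have stability in terms of prices and inflation in this economy if expectations are for prices. This situation is not so interesting for our current purposes.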
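The stability claim can be illustrated by iterating the difference equation directly. This is a minimal sketch, assuming a one-off money injection in period 0 and a log price of 0 before that period; the function name `cagan_price_expectations` and the values $\psi = 0.2$, $\lambda = 0.5$ are chosen here only for illustration.

```python
def cagan_price_expectations(psi, lam, money, p_prev=0.0):
    """Cagan Model with adaptive expectations on the price level.

    Iterates p_t = b * m_t + a * p_{t-1}, where
        a = (1 - psi) * (1 - lam) / (1 - (1 - psi) * lam),
        b = psi / (1 - (1 - psi) * lam).
    `money` lists the log money supplies m_0, m_1, ...; `p_prev` is the
    assumed log price before the first period.
    """
    denom = 1.0 - (1.0 - psi) * lam
    a = (1.0 - psi) * (1.0 - lam) / denom
    b = psi / denom
    path, p = [], p_prev
    for m in money:
        p = b * m + a * p
        path.append(p)
    return a, b, path


# One-off increase in money supply: m_0 = 1 and m_t = 0 afterwards.
a, b, path = cagan_price_expectations(psi=0.2, lam=0.5, money=[1.0] + [0.0] * 49)
print("a =", a, " b =", b)
print("first prices:", path[:4], " last price:", path[-1])
```

The log price jumps on impact and then decays geometrically at rate `a`, so a temporary injection has only a temporary effect on prices.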
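Consider now how things change if we have adaptive expectations not for price levels but for inflation:
$$\pi^e_{t+1} = (1 - \lambda)\pi^e_t + \lambda \pi_t.$$
Suppose again that at time $t-1$ prices were stable. Therefore, inflation was equal to 0. This implies that
$$\begin{aligned}
\pi^e_{t+1} &= (1 - \lambda)\left[(1 - \lambda)\pi^e_{t-1} + \lambda \pi_{t-1}\right] + \lambda \pi_t \\
&= \lambda \pi_t,
\end{aligned}$$
or equivalently,
$$p^e_{t+1} - p_t = \lambda (p_t - p_{t-1}).$$
Therefore,
$$p^e_{t+1} = (1 + \lambda) p_t - \lambda p_{t-1}.$$
This can drastically change the solution of the model. To see that, plug this expression back into (1) and obtain
$$p_t = \frac{\psi}{1 - (1 - \psi)(1 + \lambda)} m_t - \frac{(1 - \psi)\lambda}{1 - (1 - \psi)(1 + \lambda)} p_{t-1}.$$
The general solution of this difference equation is
$$p_t = \left[-\frac{(1 - \psi)\lambda}{1 - (1 - \psi)(1 + \lambda)}\right]^{t} p_0 + \frac{\psi}{1 - (1 - \psi)(1 + \lambda)} \sum_{\tau=0}^{t} \left[-\frac{(1 - \psi)\lambda}{1 - (1 - \psi)(1 + \lambda)}\right]^{t-\tau} m_\tau.$$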
For the current purposes we know the model sufficiently well. Let's focus on the main novelty in terms of inference which Cagan introduced. Suppose that $p_0 = 0$, the economy is at $p_0$, and $m_\tau \equiv 0$. Therefore, $p_t = p_0$ for any $t$. In this situation the money supply is constant and equal to 1 (since $m = \ln M = 0$). The economy, price levels, and inflation are stable,
$$p_t = 0, \qquad \pi_t = 0.$$
Consider a deviation from this situation. Suppose the government in this economy wants to raise the money supply at the beginning of time $t = 0$, so that $m_0 > 0$, and keep $m \equiv 0$ for the rest of the periods. Perhaps it does so in order to finance its deficit. In such a case
$$p_t = \frac{\psi}{1 - (1 - \psi)(1 + \lambda)} \left[-\frac{(1 - \psi)\lambda}{1 - (1 - \psi)(1 + \lambda)}\right]^{t} m_0.$$
Therefore, if $\left|-\frac{(1-\psi)\lambda}{1-(1-\psi)(1+\lambda)}\right| < 1$, over time prices converge back to 0. However, if $\left|-\frac{(1-\psi)\lambda}{1-(1-\psi)(1+\lambda)}\right| > 1$, then prices diverge to infinity. This happens even though only $m_0 > 0$, so that there is no continual expansion of the money supply. This is the "possible" destabilizing effect of (forward looking) expectations. Inflation in this case is given by
$$\pi_t = -\frac{\psi}{1 - (1 - \psi)(1 + \lambda)} \left[\frac{(1 - \psi)\lambda}{1 - (1 - \psi)(1 + \lambda)} + 1\right] \left[-\frac{(1 - \psi)\lambda}{1 - (1 - \psi)(1 + \lambda)}\right]^{t-1} m_0.$$
Clearly, when $\left|-\frac{(1-\psi)\lambda}{1-(1-\psi)(1+\lambda)}\right| > 1$, inflation is also ever growing in absolute terms, or spiralling out of control.
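The knife-edge nature of this result is easy to see numerically. The sketch below iterates the difference equation after a small one-off injection $m_0 = 0.01$ for two parameter pairs; the function name and both parameter pairs are assumptions picked here to land on either side of the stability condition.

```python
def cagan_inflation_expectations(psi, lam, m0, periods):
    """Price path after a one-off money injection m_0 when expectations are
    adaptive in inflation: p_t = B * m_t + c * p_{t-1}, with
        D = 1 - (1 - psi) * (1 + lam),  B = psi / D,  c = -(1 - psi) * lam / D.
    The economy is assumed to rest at log price 0 before period 0.
    """
    D = 1.0 - (1.0 - psi) * (1.0 + lam)
    B, c = psi / D, -(1.0 - psi) * lam / D
    path, p = [], 0.0
    for t in range(periods):
        m = m0 if t == 0 else 0.0      # money is raised only in period 0
        p = B * m + c * p
        path.append(p)
    return c, B, path


# |c| = 0.5 < 1: prices return to their initial level.
c_stable, B_stable, stable_path = cagan_inflation_expectations(0.6, 0.5, 0.01, 40)
# |c| = 9 > 1: the same tiny injection sets off explosive oscillations.
c_unstable, B_unstable, unstable_path = cagan_inflation_expectations(0.5, 0.9, 0.01, 40)

print("stable:   c =", c_stable, " p_39 =", stable_path[-1])
print("unstable: c =", c_unstable, " p_39 =", unstable_path[-1])
```

In both runs the simulated path coincides with the closed form $p_t = B c^t m_0$, so the only thing separating the two outcomes is whether $|c|$ is below or above one.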
The main novelty here can be summarized as follows: if expectations are for inflation, then for certain parameter values it is enough to slightly increase the money supply for a very short period to have ever increasing prices and inflation, because of expectations over inflation. This is in contrast with the common line of thought that, in order to have ever increasing inflation and prices, the money supply has to increase permanently. Moreover, it suggests that the fiscal authority (central government) and the monetary authority (central bank) should be independent, so that the government would not be able to finance its deficit by printing money. This independence is implemented in many countries as of now and is even part of the constitutions of some of those countries.
Again, what would happen if we had rational expectations in this model? The rational expectations assumption means that the agents use the model to form their expectations. Therefore they'll use the following equation for forming expectations:
$$p_t = \psi m_t + (1 - \psi) p^e_{t+1}.$$
Replace $p^e_{t+1}$ with $E_t[p_{t+1}]$ and rewrite this equation as
$$p_t = \psi m_t + (1 - \psi) E_t[p_{t+1}].$$
Take expectations conditional on the information that is available at time $t-1$:
$$E_{t-1}[p_t] = \psi E_{t-1}[m_t] + (1 - \psi) E_{t-1}\left[E_t[p_{t+1}]\right].$$
According to the Law of Iterated Expectations,
$$E_{t-1}\left[E_t[p_{t+1}]\right] = E_{t-1}[p_{t+1}].$$
Let's assume $m_t$ is deterministic. Therefore, we have a difference equation in terms of expectations,
$$E_{t-1}[p_t] = \psi m_t + (1 - \psi) E_{t-1}[p_{t+1}].$$
Since the coefficient in front of the lead variable is less than one, solving for the lead variable gives a coefficient $\frac{1}{1-\psi} > 1$ on $E_{t-1}[p_t]$, so it has to be that the expected price tends to infinity in this model. Therefore, with the rational expectations assumption we don't have convergence/stabilization of the economy. Prices and their expected values diverge to infinity. This situation is called self-fulfilling inflation. Agents expect to have high inflation. Prices rise accordingly, and agents' expectations are fulfilled.
If we impose, though, the additional condition that
$$\lim_{\tau \to +\infty} (1 - \psi)^{\tau - 1} E_t[p_{t+\tau}] = 0,$$
then we will have stability in this model. This condition is sometimes called the "no bubble" condition in the sense that it makes sure that expected prices do not become infinitely large.
Our main equation implies that
$$\begin{aligned}
p_t &= \psi m_t + (1 - \psi) E_t[p_{t+1}], \\
E_t[p_{t+1}] &= \psi m_{t+1} + (1 - \psi) E_t[p_{t+2}], \\
E_t[p_{t+2}] &= \psi m_{t+2} + (1 - \psi) E_t[p_{t+3}], \\
&\;\;\vdots
\end{aligned}$$
Plugging back $E_t[p_{t+1}]$, $E_t[p_{t+2}]$, etc., gives
$$p_t = \psi m_t + (1 - \psi)\left\{\psi m_{t+1} + (1 - \psi)\left[\psi m_{t+2} + (1 - \psi) E_t[p_{t+3}]\right]\right\}.$$
Iterating till infinity we have
$$p_t = \psi \sum_{\tau=0}^{+\infty} (1 - \psi)^{\tau} m_{t+\tau}.$$
Therefore, with the "no bubble" condition prices are forward looking and can be stationary. That prices are forward looking implies that they take into account future changes in policy/money supply.$^{10}$ Basically, this is the Lucas Critique of models which do not feature rational expectations. In those models changes in policy parameters do not affect expectations and therefore current actions and economic outcomes. However, if expectations are rational then policy changes can affect actions and economic outcomes immediately, and therefore change the environment in ways which the policy makers might not have anticipated.
$^{10}$ This is one of the reasons why it might be important to announce policies well before their implementation.
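Suppose that $m \equiv 0$; then $p_t = 0$. If $m$ is constant and equal to $\bar m > 0$, then by the formula for a geometric progression we get
$$p_t = \bar m \psi \sum_{\tau=0}^{+\infty} (1 - \psi)^{\tau} = \bar m.$$
This holds for any $t$. Notice that in this case $p_t$ is constant, which is in sharp contrast to the case of adaptive expectations.
If instead, for example, $m_{t+1} = m_{t+2} = \bar m$ and $m_{t+k} = \tilde m$ for any $k > 2$, then with the same logic
$$\begin{aligned}
p_{t+3} &= \tilde m, \\
p_{t+2} &= \psi \left[\bar m + \tilde m \sum_{\tau=1}^{+\infty} (1 - \psi)^{\tau}\right] = \psi \bar m + (1 - \psi)\tilde m, \\
p_{t+1} &= \psi \left[\bar m + (1 - \psi)\bar m + \tilde m \sum_{\tau=2}^{+\infty} (1 - \psi)^{\tau}\right] = \left[\psi + \psi (1 - \psi)\right]\bar m + (1 - \psi)^2 \tilde m.
\end{aligned}$$
This implies that if $m_0 > 0$ and $m \equiv 0$ for the rest of the periods then $p_0 = \psi m_0$ and $p_t = 0$ for any $t > 0$.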
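The forward-looking solution is simple to evaluate once the future path of money is specified. The sketch below is one way to do it, assuming that money supply stays at a constant value after a finite list of announced values, so that the infinite tail can be summed in closed form; the function name `rational_price` and the value $\psi = 0.25$ are illustrative assumptions.

```python
def rational_price(psi, future_money, tail_value):
    """Forward solution p_t = psi * sum_{tau >= 0} (1 - psi)**tau * m_{t+tau}.

    `future_money` lists m_t, m_{t+1}, ...; after the list ends, money supply
    is assumed to stay at `tail_value` forever, so the remaining geometric
    tail sums to (1 - psi)**len(future_money) * tail_value.
    """
    head = sum(psi * (1.0 - psi) ** tau * m for tau, m in enumerate(future_money))
    tail = (1.0 - psi) ** len(future_money) * tail_value
    return head + tail


psi = 0.25
constant = rational_price(psi, [2.0, 2.0, 2.0], tail_value=2.0)    # m constant at 2
one_off = rational_price(psi, [1.0], tail_value=0.0)               # m_0 = 1, zero afterwards
announced = rational_price(psi, [0.0, 0.0, 0.0], tail_value=1.0)   # increase known 3 periods ahead

print("constant money:", constant)
print("one-off injection:", one_off)
print("announced future increase:", announced)
```

The last case shows the forward-looking nature of prices: an increase in money supply that starts three periods from now already raises today's price by $(1-\psi)^3$ times the size of the increase.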
Let's extend the model slightly and assume that money supply is not deterministic but follows a random process
$$m_t = \bar m + \eta_t,$$
where $\bar m$ is constant and $\eta$ is an i.i.d. random variable with mean 0 and constant variance $\sigma^2$. $\eta_t$ is the realization of $\eta$ at time $t$ and is known at time $t$. Therefore, $m_t$ at time $t$ is not a random variable. However, $m_{t+1}$ is a random variable at time $t$ since its realization is not known. The assumption that $m_t = \bar m + \eta_t$ corresponds to assuming that money supply is subject to unanticipated shocks. Where could such shocks come from? A source of such shocks in the model could be that the government/central bank does not fully announce its money supply policy and allows some random/unpredictable changes.
Maintaining the "no bubble" condition, the price can be rewritten as
$$p_t = \psi \sum_{\tau=0}^{+\infty} (1 - \psi)^{\tau} E_t[m_{t+\tau}],$$
where of course $E_t[m_t] = E_t[\bar m + \eta_t] = \bar m + \eta_t$ since the value of $\eta_t$ is known, while $E_t[m_{t+\tau}] = \bar m$ for $\tau \geq 1$, and
$$\begin{aligned}
p_t &= \psi (\bar m + \eta_t) + \psi \bar m \sum_{\tau=1}^{+\infty} (1 - \psi)^{\tau} \\
&= \psi (\bar m + \eta_t) + (1 - \psi)\bar m.
\end{aligned}$$
Notice that this is similar to what we previously obtained and, given that future realizations of $\eta$ are not predictable, they don't matter for expectations and prices.
Appendix
If expectations are for prices then prices are stationary. What is the value they converge to when money supply is constant? Here is the answer to that question:
If money supply is constant at $m > 0$, $p_0 = 0$, and $a := \frac{(1-\psi)(1-\lambda)}{1-(1-\psi)\lambda} < 1$, then
$$\lim_{t \to +\infty} p_t = \frac{\psi}{1 - (1 - \psi)\lambda}\, m \lim_{t \to +\infty} \sum_{\tau=0}^{t} a^{t-\tau}.$$
The sum is a finite geometric series,
$$\sum_{\tau=0}^{t} a^{t-\tau} = \frac{1 - a^{t+1}}{1 - a},$$
which converges to $\frac{1}{1-a}$ as $t \to +\infty$ since $0 \leq a < 1$. Moreover,
$$1 - a = \frac{1 - (1-\psi)\lambda - (1-\psi)(1-\lambda)}{1 - (1-\psi)\lambda} = \frac{\psi}{1 - (1 - \psi)\lambda}.$$
At the end of the time horizon, the price therefore converges to
$$\lim_{t \to +\infty} p_t = \frac{\psi}{1 - (1 - \psi)\lambda} \cdot \frac{1 - (1 - \psi)\lambda}{\psi}\, m = m.$$
The Lucas Imperfect Information Model with AD
The Lucas Imperfect Information Model provides us with an explanation of why in the short run the aggregate supply curve might be upward sloping rather than vertical.$^{11}$ Expectations matter for the dynamics of prices and quantities in this model too. Prior to proceeding to the model, let's digress briefly and review the AD-AS Model.
The AD-AS Model consists of two equations/curves: aggregate demand (AD) and aggregate supply (AS). The aggregate demand curve is determined from the IS-LM Model. The IS-LM Model has two equations: investment-savings and liquidity-money (money demand). These equations are
$$[IS]: \quad Y = C(Y, T) + I(r) + G,$$
$$[LM]: \quad \frac{M}{P} = L(Y, r + \pi^e),$$
where $C$ is consumption, $I$ is investment, $G$ and $T$ are government spending and taxes (fiscal policy parameters), and $M$ is money supply/demand (monetary policy parameter). $C$ increases with $Y$ and declines with $T$. $I$ declines with $r$.
In a period, prices are assumed to be given. Money supply is also given. The central bank controls it. Therefore, the unknowns in the IS-LM Model are $Y$ and $r$. Graphically this model can be represented in the following manner.
[Figure: IS-LM diagram]
The solutions for $Y$ and $r$ depend on the price level $P$, the fiscal and monetary policy parameters $G$, $T$, and $M$, and on the expected inflation level, $\pi^e$. The solution for $Y$ is the aggregate demand curve,
$$AD = Y(P, G, T, M, \pi^e).$$
It declines with $P$. The negative relation stems from the LM curve. This curve shifts up when prices increase, implying a higher equilibrium interest rate and lower income.
The graphical representation of the AD-AS model is
[Figure: AD-AS diagram with the LRAS, SRAS, and AD curves]
$^{11}$ The Macro II course briefly covers this model. You can find and read a slightly different version of this model in Part A of Chapter 6 of the Romer (2006) textbook.
In this figure, LRAS is the long-run aggregate supply (AS) curve and SRAS is the short-run aggregate supply curve. The former is vertical since we assume that in the long run factor inputs are given and prices are determined in the model. In turn, the latter is upward sloping, and the lower its slope, the higher the effect of shifts of aggregate demand on income. Shifts of aggregate demand $AD = Y(P, G, T, M, \pi^e)$ in this model can stem from changes in the policy parameters $G$, $T$, and $M$, and in the expected inflation level $\pi^e$. The Lucas Imperfect Information Model provides us with a SRAS curve.
In the Lucas Imperfect Information Model, there are $N$ firms. Each firm $i$ ($i = 1, \dots, N$) is assumed to face demand
$$\chi^d_i = \frac{y}{N} + \zeta_i - \eta (p_i - p),$$
where $\frac{y}{N}$ is income spent on each good ($y = \ln Y$), $\zeta_i$ is a taste parameter which has a normal distribution with mean 0 and is i.i.d. across firms, $p_i$ is the price of the product of firm $i$, and $p$ is the aggregate price level which consumers face. The aggregate price level is the average price in this model,
$$p = \frac{1}{N} \sum_{i=1}^{N} p_i.$$
Consumers know this price level since they consume all types of goods produced in the economy.
Further, each firm $i$ is assumed to have the following supply function
$$\chi^s_i = \alpha + \beta (p_i - p^e),$$
where $\alpha$ and $\beta$ are positive parameters, $p_i$ is the price of the good of firm $i$, and $p^e$ is firms' expected aggregate price level. There is intra-temporal uncertainty here, which stems from the assumption that firms cannot observe the prices (shocks) of their rivals. This is especially meaningful for firms from different industries. Supply depends positively on the gap $p_i - p^e$, which is the (expected) relative price of the firm. In this regard, for example, $p_i > p^e$ can happen when firm $i$ has received a positive demand shock relative to others.
Market clearing requires that in equilibrium $\chi^s_i = \chi^d_i$,
$$\alpha + \beta (p_i - p^e) = \frac{y}{N} + \zeta_i - \eta (p_i - p).$$
Therefore, solving for the price of firm $i$ we get
$$p_i = \frac{1}{\beta + \eta}\left(\frac{y}{N} + \zeta_i - \alpha\right) + \frac{1}{\beta + \eta}(\beta p^e + \eta p). \qquad (2)$$
The aggregate supply is then
$$y = N\left[\alpha + (\beta + \eta) p_i - (\beta p^e + \eta p) - \zeta_i\right].$$
Take the average of this expression over firms and assume that $N$ is large. According to the Law of Large Numbers, then, $\frac{1}{N}\sum_{i=1}^{N} \zeta_i = 0$ and
$$y = N\left[\alpha + \beta (p - p^e)\right].$$
If there is no uncertainty, which means that $p^e = p$, then aggregate supply is vertical,
$$y = \alpha N.$$
This is the long-run aggregate supply curve (LRAS).
The expected price is conditional on the observation of $p_i$,
$$p^e = E[p \mid p_i] \qquad (3)$$
so that firms can update their beliefs within a period. Since $\zeta_i$ is driving the uncertainty in this model and it has a normal distribution, $p$ has a normal distribution too. It can be shown that in such a case the conditional expectation of $p$ is its expected value plus an error term which has zero expected value,
$$\begin{aligned}
E[p \mid p_i] &= E[p] + \theta (p_i - E[p]) \\
&= (1 - \theta) E[p] + \theta p_i, \qquad (4)
\end{aligned}$$
where $\theta > 0$, and the error of prediction is $p_i - E[p]$.
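As a hedged illustration of where $\theta$ comes from: suppose (this decomposition is an assumption made here, it is not spelled out in these notes) that each firm's price equals the aggregate price plus an independent, normally distributed idiosyncratic component, $p_i = p + \varepsilon_i$, with variances $\sigma_p^2$ and $\sigma_\varepsilon^2$. The standard formula for the conditional expectation of jointly normal variables then gives

```latex
\theta
  = \frac{\operatorname{Cov}(p, p_i)}{\operatorname{Var}(p_i)}
  = \frac{\sigma_p^2}{\sigma_p^2 + \sigma_\varepsilon^2}
  \in (0, 1).
```

Under this reading, the noisier a firm's own price is as a signal of the aggregate price, the smaller $\theta$ is and the less the firm revises its expectation. The bound $\theta < 1$ is also what permits dividing by $1 - \theta$ when (4) is combined with (5).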
Take the expected value of (2) conditional on $p_i$:
$$p_i = \frac{1}{\beta + \eta}\left(\frac{y}{N} + \zeta_i - \alpha\right) + E[p \mid p_i]. \qquad (5)$$
Plug (4) into (5) and take the average of $p_i$,
$$p = \frac{1}{(\beta + \eta)(1 - \theta)}\left(\frac{y}{N} - \alpha\right) + E[p].$$
Compute aggregate output, denoting $(\beta + \eta)(1 - \theta) N = b$ and the long-run level of output $\alpha N = \bar y$,
$$y = \bar y + b (p - E[p]). \qquad (6)$$
This is the short-run aggregate supply curve. Clearly it is upward sloping. In case $p > E[p]$, in the short run $y > \bar y$. Such a relation holds since if $p > E[p]$ then on average firms have received a positive shock (their prices have turned out to be higher than the expected aggregate price) and have increased output.
Combining the aggregate supply curve with the aggregate demand curve gives the equilibrium levels of price and quantity. To keep the discussion as simple as possible, suppose that aggregate demand is given by the Quantity Equation of Money
$$PY = VM.$$
Suppose further that the velocity of money is equal to 1. The logarithm of aggregate demand therefore is
$$y = m - p.$$
Then prices can be determined from
$$\bar y + b (p - E[p]) = m - p, \qquad (7)$$
which is the usual market clearing condition, with $\bar y$ denoting the long-run level of output.
Under rational expectations the agents use (7) to derive their expectations. To find them, take the expected value of (7),
$$E[p] = E[m] - \bar y.$$
Just as in the Cagan Model, therefore, agents' expectations of prices depend on the expectation of the nominal money stock. Plugging this back into (7) and solving for $p$ gives the equilibrium price level
$$p = \frac{b}{1 + b} E[m] + \frac{1}{1 + b} m - \bar y.$$
Apart from the constant $\bar y$, the aggregate price level in equilibrium is a weighted average of expected money supply $E[m]$ and actual money supply $m$. Plugging $p$ from this expression into the aggregate demand equation gives the equilibrium level of output
$$y = \bar y + \frac{b}{1 + b}(m - E[m]).$$
This implies that under the rational expectations assumption short-run deviations of output from its long-run level are possible only through unanticipated changes in money supply. If the money supply rule is known then simply $m - E[m] = 0$ and $y = \bar y$. Therefore, output fluctuations in this model with the rational expectations assumption are purely driven by unanticipated changes in money supply. Prices, though, respond to both anticipated and unanticipated changes in money supply.
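The equilibrium formulas can be checked against the two curves they come from. The following is a minimal sketch; the function name `lucas_equilibrium` and the parameter values (long-run output 1, $b = 2$) are illustrative assumptions.

```python
def lucas_equilibrium(m, expected_m, y_bar, b):
    """Equilibrium log price and log output when aggregate supply is
    y = y_bar + b * (p - E[p]) and aggregate demand is y = m - p.
    """
    p = b / (1.0 + b) * expected_m + 1.0 / (1.0 + b) * m - y_bar
    y = y_bar + b / (1.0 + b) * (m - expected_m)
    return p, y


y_bar, b = 1.0, 2.0

# Fully anticipated money supply: output stays at its long-run level.
p_anticipated, y_anticipated = lucas_equilibrium(m=3.0, expected_m=3.0, y_bar=y_bar, b=b)

# Money supply turns out higher than expected: both price and output rise.
p_surprise, y_surprise = lucas_equilibrium(m=3.6, expected_m=3.0, y_bar=y_bar, b=b)

expected_p = 3.0 - y_bar               # E[p] = E[m] - y_bar
print("anticipated:", p_anticipated, y_anticipated)
print("surprise:   ", p_surprise, y_surprise)
print("demand check:", 3.6 - p_surprise, " supply check:", y_bar + b * (p_surprise - expected_p))
```

In the surprise case the 0.6 of unanticipated money is split between a higher price (0.2) and higher output (0.4), in proportions $\frac{1}{1+b}$ and $\frac{b}{1+b}$.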
We have a version of the AD-AS Model where the aggregate demand curve is
$$y = m - p,$$
which means that aggregate demand is assumed not to depend on fiscal policy. This is the reason why the inference does not include policy parameters other than money supply.
The Lucas Imperfect Information Model with AD implies a very well known and much debated relationship in macroeconomics which is called the Phillips Curve. The Phillips Curve illustrates the short-run trade-off between inflation and output. To derive it, assume that long-run output is fixed and put a time index in (6) to obtain
$$\begin{aligned}
y_t &= \bar y + b (p_t - E_{t-1}[p_t]) \\
&= \bar y + b \left[(p_t - p_{t-1}) - (E_{t-1}[p_t] - E_{t-1}[p_{t-1}])\right] \\
&= \bar y + b (\pi_t - E_{t-1}[\pi_t]),
\end{aligned}$$
where the second line uses $E_{t-1}[p_{t-1}] = p_{t-1}$. In short, we have an expectations-augmented Phillips Curve
$$y_t = \bar y + b (\pi_t - E_{t-1}[\pi_t]).$$
When inflation is higher than expected, short-run output is higher than long-run output. In this sense, if policy makers can manage to fool the agents they can increase short-run output.
Usually the Phillips Curve is written, however, as a trade-off between inflation and unemployment. Assuming that output is inversely proportional to unemployment and using $\bar u$ to denote the long-run level of unemployment, the Phillips Curve can be rewritten as
$$u_t = \bar u - b (\pi_t - E_{t-1}[\pi_t]),$$
where $b > 0$ is a constant. An increase in unanticipated inflation then reduces unemployment.
The Sticky Wage Model
In Macroeconomics II you have also seen a version of the Sticky Wage Model. We will now consider a version of that model, adding microstructure which was not discussed in detail in the previous course. We will show again that expectations matter for macroeconomic outcomes.
Suppose that there is a continuum of mass one of identical and infinitely lived consumers. Each consumer is endowed with $L$ units of time. The consumers can supply their time to firms at wage rate $w$ or use it for leisure. The consumers derive utility from the consumption of $N$ different types of goods $\{x_j\}_{j=1}^{N}$ and have disutility from supplying labor. They have a constant elasticity of substitution utility function of the form
$$U(x_1, \dots, x_N, L - l) = \sum_{j=1}^{N} x_j^{\alpha} + \chi \ln(L - l),$$
where $\alpha \in (0, 1)$ and $\frac{1}{1-\alpha}$ is the elasticity of substitution between different pairs of goods $x$.$^{12}$ $l$ is the amount of labor that the representative consumer supplies to firms and $L - l$ is its leisure time. $\chi$ is a positive parameter.
$^{12}$ The elasticity of substitution between any pair $x_i$ and $x_k$ ($i \neq k$) is $\frac{d \ln\left(\frac{x_i}{x_k}\right)}{d \ln\left(\frac{du(x_k)/dx_k}{du(x_i)/dx_i}\right)} = \frac{1}{1-\alpha}$.
The consumers maximize their utility subject to the budget constraint, taking the prices $\{p_{x_j}\}_{j=1}^{N}$ of goods $x$ and the wage rate as given. Therefore, the representative consumer solves the following problem in each period of time:
$$\max_{\{x_j\}_{j=1}^{N},\, l} \; \sum_{j=1}^{N} x_j^{\alpha} + \chi \ln(L - l)$$
$$\text{s.t.} \quad 0 = w l - \sum_{j=1}^{N} p_{x_j} x_j.$$
To solve this problem we use the Lagrangian
$$\mathcal{L} = \max_{\{x_j\}_{j=1}^{N},\, l} \; \sum_{j=1}^{N} x_j^{\alpha} + \chi \ln(L - l) + \lambda \left(w l - \sum_{j=1}^{N} p_{x_j} x_j\right).$$
Normalize $\lambda = 1$. The optimal rules are given by the first order conditions
$$[x_j]: \quad p_{x_j} = \alpha x_j^{\alpha - 1},$$
$$[l]: \quad w = \chi \frac{1}{L - l},$$
where the first one is the inverse demand function of good $x_j$ and the second one is the inverse supply function of labor. Notice that there are actually $N$ inverse demand functions because there are $N$ goods $x$. This means that we have $N + 1$ optimal rules.
Assume that the effective wage rate is given by
$$w = p^e f(u, z),$$
where $p^e$ is the expected aggregate price level ($p^e = E\left[\frac{1}{N}\sum_{j=1}^{N} p_{x_j}\right]$) and $f$ is a function which decreases
in the first argument and increases in the second argument. $u$ is the level of unemployment in the
economy, and $z$ are institutional features of the economy such as unemployment benefits, minimum
wages, etc.
Suppose that there are $N$ firms. Each firm produces one type of good $x$. The firms’ only input for
production is labor, and a unit of labor produces a unit of output in all firms. Firms are price
setters. Therefore, the problem of the firm which produces good $x_j$ is
$$\max_{x_j} \; p_{x_j} x_j - w x_j$$
s.t.
$$p_{x_j} = \alpha x_j^{\alpha-1}.$$
Substituting the constraint into the objective and taking the first order condition gives $\alpha^2 x_j^{\alpha-1} = w$.
This implies that the inverse supply function is
$$p_{x_j} = \frac{1}{\alpha} w.$$
The inverse supply function implies that $p_{x_j} \equiv p$, and from the inverse demand function it follows that
in equilibrium all goods $x$ are produced in the same quantity, $x_j \equiv x$.
Notice that in this framework, since firms are price setters, the price is higher than the marginal cost $w$.
As $\alpha$ tends to 1 the goods become more similar/substitutable, since $\frac{1}{1-\alpha}$ tends to infinity, and $p_x$ tends
to the marginal cost.
Given that in equilibrium all firms produce the same amount of output, they hire the same
amount of labor. Since there is a continuum of mass one of consumers, the total amount of labor
that firms hire is $l$. This implies that the unemployment rate13 is
$$u = \frac{L - l}{L} = 1 - \frac{l}{L}.$$
Moreover, the production of a unit of a good requires a unit of labor. The unemployment rate then can be
rewritten as
$$u = 1 - \frac{Y}{L},$$
where $Y$ is total output, $Y = \sum_{j=1}^{N} x_j$.
The intersection of
$$\frac{w}{p} = \frac{p^e}{p} f\left(1 - \frac{Y}{L}, z\right), \qquad (8)$$
$$\frac{w}{p} = \alpha,$$
gives the equilibrium levels of output and prices. At the same time it gives the equilibrium level of unemployment,
$$\alpha = \frac{p^e}{p} f\left(1 - \frac{Y}{L}, z\right). \qquad (9)$$
Suppose that the expected price relative to the actual price, $\frac{p^e}{p}$, increases. In such a case, in order for this
equation to continue to hold, $f\left(1 - \frac{Y}{L}, z\right)$ should decline. That will happen if unemployment increases/output declines.
13 Although we treat this as the unemployment rate, this is actually the non-participation rate. The unemployment rate would
be the difference between the participation rate in a non-distorted economy and the participation rate in this distorted
economy.
Equation (9) holds in each period of time. Add time subscripts to it and write
$$p_t = \frac{1}{\alpha} f\left(1 - \frac{Y_t}{L}, z\right) p_t^e.$$
This is the aggregate supply curve. In the long run $p_t = p_t^e$, so that long-run output and unemployment
are given by
$$\alpha = f\left(1 - \frac{Y}{L}, z\right).$$
In the short run the price level $p_t$ is an increasing function of $Y_t$ and of the expected price level $p_t^e$. Higher $Y_t$ in
this setup increases the wages paid according to (8). This increases the marginal costs of the firms
and therefore the prices that they charge. A higher expected price also increases wages according to
(8) and therefore results in higher actual prices.
If we assume, for example, that
$$f(u_t, z) = 1 - u_t + z = \frac{Y_t}{L} + z,$$
then the short-run aggregate supply curve is given by
$$p_t = \frac{1}{\alpha}\left(\frac{Y_t}{L} + z\right) p_t^e,$$
or
$$Y_t = \left(\alpha \frac{p_t}{p_t^e} - z\right) L.$$
The long-run aggregate supply curve then will be
$$Y = (\alpha - z) L.$$
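As a quick numerical illustration of the linear example above, the following Python sketch evaluates the short-run and long-run aggregate supply curves. The function names and the parameter values are illustrative assumptions and are not part of the model's calibration.

```python
def short_run_output(p, p_expected, alpha, z, L):
    """Short-run AS for f(u, z) = 1 - u + z: Y_t = (alpha * p_t / p_t^e - z) * L."""
    return (alpha * p / p_expected - z) * L


def long_run_output(alpha, z, L):
    """Long-run AS, obtained by setting p_t = p_t^e: Y = (alpha - z) * L."""
    return (alpha - z) * L


# Illustrative (assumed) parameter values.
alpha, z, L = 0.8, 0.1, 100.0

# Without a price surprise the short-run curve coincides with the long-run one.
y_no_surprise = short_run_output(p=1.0, p_expected=1.0, alpha=alpha, z=z, L=L)
# A price level above the expected one raises short-run output.
y_surprise = short_run_output(p=1.1, p_expected=1.0, alpha=alpha, z=z, L=L)
y_long_run = long_run_output(alpha, z, L)
```

Under these assumed values, output exceeds its long-run level only when the actual price level is above the expected one, which is the sense in which expectations matter here.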
The Effectiveness of monetary policy (with fixed rules)
We have seen that expectations and their types can matter for aggregate economic performance.
They can matter for both the levels of nominal and real variables and for their dynamics. We will
now turn to more rigorous discussion of how monetary policies can matter for economic performance
given the type of expectations.
A model with stabilizing monetary policy
Often policy makers have an agenda of stabilizing the economy. They set fiscal and monetary policy
rules in order to achieve that. The reason they do so is that consumers/households are believed
to be risk averse. Therefore, they might be better-off if the economy is more stable. The intuition
behind such a result is that risk averse consumers prefer steady income over volatile income.
This section discusses properties and effects of a monetary policy rule which attempts to stabilize
the economy. The discussion follows a simplified version of the AD-AS Model.
Since the focus of the section is on monetary policy, aggregate demand (in logarithms) is simplified to
$$y_t = m_t - p_t + \nu_t.$$
$\nu_t$ can be thought of as the velocity of money. In such a case, this equation is the Quantity Equation
of Money.
In the remainder, we will assume that $\nu_t$ is a random variable which follows a simple autoregressive
- AR(1) - process of the form
$$\nu_t = \alpha\nu_{t-1} + \eta_t, \qquad (10)$$
where $\alpha \in (0, 1)$ is a parameter and $\eta_t$ is an i.i.d. random variable with mean 0 and variance $\sigma^2$.
This process is called autoregressive because it has autocorrelation: its current value is correlated
with its previous value.
Before we proceed with the model, let us discuss the properties of the random variable $\nu_t$. Denote
with $\mu_\nu$ the unconditional expected value of $\nu_t$, $E[\nu_t]$. It is straightforward to see that $\mu_\nu$ is equal
to 0,
$$E[\nu_t] = \alpha E[\nu_{t-1}] + E[\eta_t],$$
$$\mu_\nu = \alpha\mu_\nu + 0.$$
Denote with $\sigma_\nu^2$ the unconditional variance of $\nu_t$, $V[\nu_t]$. The unconditional variance of $\nu_t$ is
$$\sigma_\nu^2 = E\left[(\nu_t - E[\nu_t])^2\right] = E\left[\nu_t^2\right].$$
Plug $\alpha\nu_{t-1} + \eta_t$ for $\nu_t$,
$$E\left[\nu_t^2\right] = E\left[(\alpha\nu_{t-1})^2\right] + 2E\left[(\alpha\nu_{t-1})\eta_t\right] + E\left[\eta_t^2\right].$$
Given that $\eta_t$ is i.i.d., $E[(\alpha\nu_{t-1})\eta_t] = 0$, so that $\sigma_\nu^2 = \alpha^2\sigma_\nu^2 + \sigma^2$. Therefore,
$$\sigma_\nu^2 = \frac{\sigma^2}{1 - \alpha^2}.$$
Conditional on information available at time $t-1$, the mean and variance of $\nu_t$ are
$$E_{t-1}[\nu_t] = \alpha E_{t-1}[\nu_{t-1}] + E_{t-1}[\eta_t] = \alpha\nu_{t-1}, \qquad (11)$$
and
$$V_{t-1}[\nu_t] = E_{t-1}\left[(\nu_t - E_{t-1}[\nu_t])^2\right] = E_{t-1}\left[(\alpha\nu_{t-1} + \eta_t - \alpha\nu_{t-1})^2\right] = \sigma^2.$$
The conditional expectation formula for $\nu_t$ shows that a very important property of $\nu_t$, as compared
to $\eta_t$, is that $\nu_t$ can be predicted using its previous values. The previous values of $\nu_t$, in turn, depend
on the previous values of $\eta$, which are known numbers at time $t$.
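The unconditional moments derived above can be checked by simulation. The following is a minimal Python sketch, assuming normally distributed shocks $\eta_t$ (the notes only require them to be i.i.d. with mean 0 and variance $\sigma^2$); the function names, the seed, and the parameter values are illustrative assumptions.

```python
import random


def simulate_ar1(alpha, sigma, n, seed=0):
    """Simulate nu_t = alpha * nu_{t-1} + eta_t with eta_t ~ N(0, sigma^2), nu_0 = 0."""
    rng = random.Random(seed)
    nu = 0.0
    path = []
    for _ in range(n):
        nu = alpha * nu + rng.gauss(0.0, sigma)
        path.append(nu)
    return path


def sample_mean(xs):
    return sum(xs) / len(xs)


def sample_variance(xs):
    mean = sample_mean(xs)
    return sum((x - mean) ** 2 for x in xs) / len(xs)


# Illustrative (assumed) parameter values.
alpha, sigma = 0.5, 1.0
path = simulate_ar1(alpha, sigma, n=200_000)

# Theoretical unconditional variance sigma^2 / (1 - alpha^2).
theoretical_variance = sigma**2 / (1 - alpha**2)
```

With a long simulated path, the sample mean should be close to 0 and the sample variance close to `theoretical_variance`, up to sampling error.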
Aggregate supply in this model is given by
$$y_t = p_t - w_t,$$
where $w_t$ are wages. Aggregate supply increases with prices and declines with wages, which summarize/represent
the costs of the firms.
In this model, the long-run levels of output, prices, and money supply are normalized to 1 so
that in the long run their logarithms are
$$y = m = p = w = 0.$$
Agents are assumed to live for 2 periods. Their wage contracts fix wages for 2 periods. In
this sense contracts are long term and wages are sticky. Wages are fixed according to the
agents’ expectations of prices for 2 periods. At birth (denote it by $t-2$) they inherit information
about previous periods of time from their parents and require a wage at time $t$ according to
$$w_t = \frac{1}{2}\left(p^e_{t|t-1} + p^e_{t|t-2}\right). \qquad (12)$$
The monetary authority pursues an agenda of stabilizing the economy. It tries to set money supply
so as to reduce the effect of shocks $\nu$ on output, conditional on the information it has. The shock
arrives after monetary policy has been implemented. In other words, the monetary policy authority sets
money supply at the beginning of time $t$ not knowing the realization of the shock $\eta$ at time $t$, but
knowing its realizations in earlier periods. In this sense the monetary policy authority does not have
superior information as compared to the other members of the economy (the agents, firms, etc.).
Suppose that the monetary policy rule/money supply is
$$m_t = \beta h_t + (1 - \beta) h_{t-1}, \qquad (13)$$
where $\beta \in [0, 1]$,
$$h_t = \gamma\nu_{t-1},$$
and $\gamma < 0$. With this policy rule the monetary policy authority can reduce the expected shock
$E_{t-1}[\nu_t] = \alpha\nu_{t-1}$, since when $\nu_{t-1}$ increases money supply declines.
In equilibrium aggregate demand is equal to aggregate supply,
$$p_t - w_t = m_t - p_t + \nu_t.$$
Therefore,
$$p_t = \frac{1}{2}\left(m_t + w_t + \nu_t\right), \qquad (14)$$
$$y_t = \frac{1}{2}\left(m_t - w_t + \nu_t\right). \qquad (15)$$
Plugging the monetary policy rule (13) and wages (12) into the output equation gives
$$y_t = \frac{1}{2}\left[\beta\gamma\nu_{t-1} + (1-\beta)\gamma\nu_{t-2} + \nu_t - \frac{1}{2}\left(p^e_{t|t-1} + p^e_{t|t-2}\right)\right].$$
In this expression, $\nu_{t-1}$ and $\nu_{t-2}$ are known at time $t$.
The dynamic path of the model is given by
$$p_t = \frac{1}{2}\left[\beta\gamma\nu_{t-1} + (1-\beta)\gamma\nu_{t-2} + \nu_t + \frac{1}{2}\left(p^e_{t|t-1} + p^e_{t|t-2}\right)\right],$$
$$\nu_t = \alpha\nu_{t-1} + \eta_t,$$
$$y_t = \frac{1}{2}\left[\beta\gamma\nu_{t-1} + (1-\beta)\gamma\nu_{t-2} + \nu_t - \frac{1}{2}\left(p^e_{t|t-1} + p^e_{t|t-2}\right)\right].$$
We have, therefore, unknowns $y_t$, $\nu_t$, $p_t$, $p^e_{t|t-1}$, and $p^e_{t|t-2}$, and need equations for the latter two.
Suppose that agents have rational expectations in this model. In such a case, expected prices
need to be figured out from the model and
$$p^e_{t|t-1} = E_{t-1}[p_t], \qquad p^e_{t|t-2} = E_{t-2}[p_t].$$
Take the expectation of price $p_t$ (14) conditional on information at time $t-1$,
$$E_{t-1}[p_t] = \frac{1}{2}\left(E_{t-1}[m_t] + E_{t-1}[w_t] + E_{t-1}[\nu_t]\right).$$
According to (13) and (11) the conditional expectation of $m_t$ is
$$E_{t-1}[m_t] = \beta\gamma E_{t-1}[\nu_{t-1}] + (1-\beta)\gamma E_{t-1}[\nu_{t-2}] = \beta\gamma\nu_{t-1} + (1-\beta)\gamma\nu_{t-2}.$$
$E_{t-2}[p_t]$ is a given number at time $t-1$, which means that $E_{t-1}[E_{t-2}[p_t]] = E_{t-2}[p_t]$. According
to (12), then, the conditional expectation of the wage rate is given by
$$E_{t-1}[w_t] = \frac{1}{2}E_{t-1}\left[E_{t-1}[p_t] + E_{t-2}[p_t]\right] = \frac{1}{2}\left(E_{t-1}[p_t] + E_{t-2}[p_t]\right).$$
Therefore, the expected value of $p_t$ conditional on information from time $t-1$ is
$$E_{t-1}[p_t] = \frac{1}{2}\left[(\alpha + \beta\gamma)\nu_{t-1} + (1-\beta)\gamma\nu_{t-2} + \frac{1}{2}\left(E_{t-1}[p_t] + E_{t-2}[p_t]\right)\right],$$
or equivalently,
$$E_{t-1}[p_t] = \frac{2}{3}\left[(\alpha + \beta\gamma)\nu_{t-1} + (1-\beta)\gamma\nu_{t-2} + \frac{1}{2}E_{t-2}[p_t]\right]. \qquad (16)$$
This means that expected price is a function of previous realizations of shock and expectations.
We still need an equation for $E_{t-2}[p_t]$. Take the expectation of price $p_t$ (14) conditional on
information at time $t-2$,
$$E_{t-2}[p_t] = \frac{1}{2}\left(E_{t-2}[m_t] + E_{t-2}[w_t] + E_{t-2}[\nu_t]\right).$$
According to (10), the expected value of $\nu_t$ conditional on information at time $t-2$ is
$$E_{t-2}[\nu_t] = E_{t-2}\left[\alpha^2\nu_{t-2} + \alpha\eta_{t-1} + \eta_t\right] = \alpha^2\nu_{t-2}.$$
The conditional expectation of the monetary policy rule is
$$E_{t-2}[m_t] = \beta\gamma E_{t-2}[\nu_{t-1}] + (1-\beta)\gamma E_{t-2}[\nu_{t-2}] = [1 - (1-\alpha)\beta]\gamma\nu_{t-2}.$$
In turn, according to (12) and the Law of Iterated Expectations, the conditional expectation of the wage
rate is
$$E_{t-2}[w_t] = \frac{1}{2}E_{t-2}\left[E_{t-1}[p_t] + E_{t-2}[p_t]\right] = E_{t-2}[p_t].$$
Therefore,
$$E_{t-2}[p_t] = \frac{1}{2}\left\{[1 - (1-\alpha)\beta]\gamma\nu_{t-2} + E_{t-2}[p_t] + \alpha^2\nu_{t-2}\right\},$$
or equivalently,
$$E_{t-2}[p_t] = \left[(1-\beta)\gamma + \alpha\beta\gamma + \alpha^2\right]\nu_{t-2}.$$
The expected value of price conditional on time $t-1$ (16) can be rewritten as
$$E_{t-1}[p_t] = \frac{2}{3}(\alpha + \beta\gamma)\nu_{t-1} + \frac{1}{3}\left[3(1-\beta)\gamma + \alpha\beta\gamma + \alpha^2\right]\nu_{t-2}.$$
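In turn, output (15) can be rewritten as
$$
\begin{aligned}
y_t = {} & \frac{1}{2}\left[(\alpha + \beta\gamma)\nu_{t-1} + (1-\beta)\gamma\nu_{t-2}\right] + \frac{1}{2}\eta_t \\
& - \frac{1}{4}\left\{\frac{2}{3}(\alpha + \beta\gamma)\nu_{t-1} + \frac{1}{3}\left[3(1-\beta)\gamma + \alpha\beta\gamma + \alpha^2\right]\nu_{t-2}\right\} \\
& - \frac{1}{4}\left[\gamma(1-\beta) + \alpha\beta\gamma + \alpha^2\right]\nu_{t-2}.
\end{aligned}
$$
Group together the coefficients of $\nu_{t-1}$ and $\nu_{t-2}$ and use the condition that $\nu_{t-1} = \alpha\nu_{t-2} + \eta_{t-1}$ to
rewrite output as
$$y_t = \frac{1}{3}(\alpha + \beta\gamma)\eta_{t-1} + \frac{1}{2}\eta_t.$$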
The first term in this expression is the effect of the previous period's shock on output. Monetary
policy can, for example, eliminate this effect by setting $\gamma = -\frac{\alpha}{\beta}$ and thereby reduce the effect of shocks on
output. In such a case
$$y_t = \frac{1}{2}\eta_t,$$
where $\eta_t$ is the shock in period $t$, which is not observed by the monetary policy authority before it sets
and runs the policy rule. The policy rule, therefore, cannot affect $\eta_t$.
Such a policy rule implies that the variance of output is
$$V[y_t] = \frac{1}{4}\sigma^2.$$
Notice that if $\gamma$ is not selected to eliminate $\eta_{t-1}$ then
$$V[y_t] = \frac{1}{4}\sigma^2 + \left[\frac{1}{3}(\alpha + \beta\gamma)\right]^2\sigma^2,$$
which is larger than $\frac{1}{4}\sigma^2$. In this sense this policy rule stabilizes the economy by eliminating some of
the influence of shocks. In particular, and importantly, it eliminates the effect of the previous
realization of $\eta$, $\eta_{t-1}$. It is able to do so because the value of $\eta_{t-1}$ is known when the policy is
implemented. In turn, the value of $\eta_{t-1}$ matters for wages, prices, and output because wages are
sticky. They are indexed also to what has happened at time $t-1$. Wages become indexed because
they depend on $E_{t-2}[p_t]$, which in turn depends on an autocorrelated random process $\nu_t$. The
values of $\nu_t$ are autocorrelated and can be predicted using previous information ($\nu_{t-1}$, $\eta_{t-1}$, etc.).
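The grouping of coefficients in the output equation can be verified with exact rational arithmetic. The sketch below, written in Python with hypothetical function and variable names, rebuilds the coefficients of $y_t$ on $\nu_{t-1}$, $\nu_{t-2}$, and $\eta_t$ from the expressions for $E_{t-1}[p_t]$ and $E_{t-2}[p_t]$ derived above; the parameter values are illustrative assumptions.

```python
from fractions import Fraction as F


def output_coefficients(alpha, beta, gamma):
    """Coefficients of y_t on (nu_{t-1}, nu_{t-2}, eta_t) under rational expectations."""
    # E_{t-2}[p_t] = e2 * nu_{t-2}
    e2 = (1 - beta) * gamma + alpha * beta * gamma + alpha**2
    # E_{t-1}[p_t] = e1_a * nu_{t-1} + e1_b * nu_{t-2}
    e1_a = F(2, 3) * (alpha + beta * gamma)
    e1_b = F(1, 3) * (3 * (1 - beta) * gamma + alpha * beta * gamma + alpha**2)
    # Wages (12): w_t = (E_{t-1}[p_t] + E_{t-2}[p_t]) / 2
    w_a, w_b = e1_a / 2, (e1_b + e2) / 2
    # Output (15): y_t = (m_t - w_t + nu_t) / 2 with nu_t = alpha * nu_{t-1} + eta_t
    c_nu1 = (beta * gamma + alpha - w_a) / 2
    c_nu2 = ((1 - beta) * gamma - w_b) / 2
    c_eta = F(1, 2)
    return c_nu1, c_nu2, c_eta


# Illustrative (assumed) parameter values.
alpha, beta = F(1, 2), F(1, 2)

# An arbitrary rule with gamma < 0.
c1, c2, c_eta = output_coefficients(alpha, beta, gamma=F(-1, 2))

# The stabilizing rule gamma = -alpha / beta.
c1_stab, c2_stab, _ = output_coefficients(alpha, beta, gamma=-alpha / beta)
```

Since the coefficient on $\nu_{t-2}$ equals $-\alpha$ times the coefficient on $\nu_{t-1}$, the two terms collapse into $\frac{1}{3}(\alpha + \beta\gamma)\eta_{t-1}$, and both coefficients vanish under the stabilizing choice of $\gamma$.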
From the aggregate supply equation it follows that
$$p_t - w_t = \frac{1}{2}\eta_t.$$
This implies that positive shocks $\eta_t$ reduce the real wage rate $w_t - p_t$.
What would happen if the monetary policy authority knew the realization of $\eta_t$ before setting
and running its policy and had an agenda to stabilize the economy? The answer turns out to be
straightforward. If it sets
$$m_t = -\nu_t,$$
then there are no shocks in this economy and no uncertainty. Aggregate output does not fluctuate
and is at its long-run level, 1,
$$w = p,$$
$$y = p - w,$$
$$p = -y.$$
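It seems natural to derive the monetary policy rule also for the case when wages are not sticky and are
set for each period. In such a case, the model can be summarized as
$$[\text{AD}]: \quad y_t = m_t - p_t + \nu_t,$$
$$[\text{AS}]: \quad y_t = p_t - w_t,$$
$$[\text{Wages}]: \quad w_t = E_{t-1}[p_t], \qquad (17)$$
$$[\text{Shock}]: \quad \nu_t = \alpha\nu_{t-1} + \eta_t.$$
Therefore, in equilibrium
$$p_t = \frac{1}{2}\left(m_t + w_t + \nu_t\right),$$
$$y_t = \frac{1}{2}\left(m_t - w_t + \nu_t\right).$$
To find the equilibrium level of output derive the expected level of prices and wages using (17),
$$w_t = E_{t-1}[p_t] = (\alpha + \beta\gamma)\nu_{t-1} + (1-\beta)\gamma\nu_{t-2}.$$
Therefore,
$$y_t = \frac{1}{2}\left\{\beta\gamma\nu_{t-1} + (1-\beta)\gamma\nu_{t-2} - \left[(\alpha + \beta\gamma)\nu_{t-1} + (1-\beta)\gamma\nu_{t-2}\right] + \alpha\nu_{t-1} + \eta_t\right\} = \frac{1}{2}\eta_t.$$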
Clearly, $y_t$ in this case does not depend on money supply, which implies that the money supply
rule selected above cannot reduce the variability of output. Is this a general inference for any money
supply rule? Yes. To see that, take again the expected price level, now assuming that we have a general
(deterministic) money supply rule $m_t$,
$$w_t = E_{t-1}[p_t] = m_t + \alpha\nu_{t-1}.$$
This implies that aggregate output is
$$y_t = \frac{1}{2}\left(m_t - w_t + \nu_t\right) = \frac{1}{2}\eta_t.$$
Monetary policy has no effect in this general case either. Therefore, there is nothing to derive/any
rule suits. The reason why this happens is that there are no rigidities, since wages adjust within a period
and the values of the previous shocks (which could have been eliminated) don’t matter. Higher
money supply simply alters prices.
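Another possible extension of the model is to consider static expectations, $w_t = p_{t-1}$. In such a
case in equilibrium it has to be that
$$p_t = \frac{1}{2}\left(m_t + w_t + \nu_t\right),$$
$$y_t = \frac{1}{2}\left(m_t - w_t + \nu_t\right),$$
$$w_t = p_{t-1},$$
$$\nu_t = \alpha\nu_{t-1} + \eta_t,$$
$$m_t = \beta\gamma\nu_{t-1} + (1-\beta)\gamma\nu_{t-2}.$$
From the equations for $p_t$ and $w_t$ it follows that
$$p_t = \left(\frac{1}{2}\right)^t p_0 + \frac{1}{2}\sum_{\tau=0}^{t}\left(\frac{1}{2}\right)^{t-\tau}\left(m_\tau + \nu_\tau\right).$$
In turn, from $\nu_t = \alpha\nu_{t-1} + \eta_t$ it follows that
$$\nu_t = \alpha^t\nu_0 + \sum_{\tau=0}^{t}\alpha^{t-\tau}\eta_\tau.$$
Assume that $\nu_0 = \eta_0 = 0$.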
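Aggregate output, therefore, is
$$
\begin{aligned}
y_t = {} & \frac{1}{2}\left\{\left[(\alpha + \beta\gamma)\alpha + (1-\beta)\gamma\right]\sum_{\tau=0}^{t-2}\alpha^{t-2-\tau}\eta_\tau + (\alpha + \beta\gamma)\eta_{t-1}\right\} \\
& - \frac{1}{2}\cdot\frac{1}{2}\sum_{\tau=2}^{t-1}\left(\frac{1}{2}\right)^{t-1-\tau}\left\{\left[(\alpha + \beta\gamma)\alpha + (1-\beta)\gamma\right]\left(\sum_{k=0}^{\tau-2}\alpha^{\tau-2-k}\eta_k\right) + (\alpha + \beta\gamma)\eta_{\tau-1} + \eta_\tau\right\} \\
& - \frac{1}{2}\left(\frac{1}{2}\right)^{t-1}p_0 + \frac{1}{2}\eta_t.
\end{aligned}
$$
The exercise which computes the variance of aggregate output requires tedious algebra. Instead
of that, notice that in this case too monetary policy can eliminate some of the effect of previous
shocks on output by setting $\alpha + \beta\gamma = 0$ or $(\alpha + \beta\gamma)\alpha + (1-\beta)\gamma = 0$. In this version of the setup
monetary policy matters for aggregate output since, again, wages are rigid. They are indexed to
previous levels of prices. Monetary policy alters prices within the period, which alters the output of
firms but keeps their costs (wages) fixed.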
A model with constant growth of money
In the previous (sub-)section we considered a policy which attempted to stabilize the economy. In
this (sub-)section we consider a different policy and keep the setup of the economy unchanged. The
policy that we consider bears the name of its main proponent: Milton Friedman. The Friedman Rule
(or Policy) is a monetary policy which makes the nominal interest rate zero. Consider the Fisher Equation,
$$i = r + \pi.$$
The Friedman Rule then sets
$$\pi = -r.$$
A motivation behind such a policy can be derived from cash-in-advance models, which are slightly
too advanced to be covered in this course. In short, in these models money is an inferior asset
in the sense that it does not earn interest, whereas bonds earn nominal interest $i$. The agents are
forced to keep money since they use it for their purchases. Intuitively, such a policy works
since it manages to set the nominal interest rate to 0 and eliminates the difference between holding
money and bonds. Therefore, it makes money less inferior (or not inferior at all).
Assume that output and the velocity of money are constant. In such a case, from the Quantity
Equation,
$$v + m = p + y,$$
it follows that inflation is equal to the growth rate of money,
$$\pi_t = m_t - m_{t-1}.$$
Suppose that the real interest rate is constant and denote $-r = g$. Therefore, in terms of this exposition
the Friedman Rule is
$$m_t = m_{t-1} + g.$$
This is the monetary policy rule which we consider in this section. The remainder of the model is
$$[\text{AD}]: \quad y_t = m_t - p_t + \nu_t,$$
$$[\text{AS}]: \quad y_t = p_t - w_t,$$
$$[\text{Wages}]: \quad w_t = \frac{1}{2}\left(E_{t-1}[p_t] + E_{t-2}[p_t]\right),$$
$$[\text{Shock}]: \quad \nu_t = \alpha\nu_{t-1} + \eta_t.$$
From [AD] and [AS] it follows that
$$p_t = \frac{1}{2}\left(m_t + w_t + \nu_t\right), \qquad (18)$$
$$y_t = \frac{1}{2}\left(m_t - w_t + \nu_t\right). \qquad (19)$$
Assuming rational expectations and using [Wages] gives
$$
\begin{aligned}
E_{t-1}[p_t] &= \frac{1}{2}\left(E_{t-1}[m_t] + E_{t-1}[w_t] + E_{t-1}[\nu_t]\right) \qquad (20) \\
&= \frac{1}{2}\left[m_{t-1} + g + \frac{1}{2}\left(E_{t-1}[p_t] + E_{t-2}[p_t]\right) + \alpha\nu_{t-1}\right] \\
&= \frac{1}{2}\left[m_t + \frac{1}{2}\left(E_{t-1}[p_t] + E_{t-2}[p_t]\right) + \alpha\nu_{t-1}\right].
\end{aligned}
$$
Therefore, solving for $E_{t-1}[p_t]$,
$$E_{t-1}[p_t] = \frac{2}{3}\left(m_t + \frac{1}{2}E_{t-2}[p_t] + \alpha\nu_{t-1}\right). \qquad (21)$$
Notice that $E_{t-1}[m_t] = m_{t-1} + g$ since money supply follows a deterministic/fixed rule.
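To find the expected price level conditional on information available at time $t-2$, consider
$$E_{t-2}[p_t] = \frac{1}{2}\left(E_{t-2}[m_t] + E_{t-2}[w_t] + E_{t-2}[\nu_t]\right) = \frac{1}{2}\left(m_t + E_{t-2}[p_t] + \alpha^2\nu_{t-2}\right).$$
Therefore,
$$E_{t-2}[p_t] = m_t + \alpha^2\nu_{t-2}.$$
Plugging back into the expression for expected price (21) gives
$$E_{t-1}[p_t] = \frac{2}{3}\left(\frac{3}{2}m_t + \alpha\nu_{t-1} + \frac{1}{2}\alpha^2\nu_{t-2}\right),$$
and from the [Wages] equation it follows that
$$w_t = m_t + \frac{1}{3}\alpha\nu_{t-1} + \frac{2}{3}\alpha^2\nu_{t-2} = m_t + \alpha\nu_{t-1} - \frac{2}{3}\alpha\eta_{t-1}.$$
According to (19), the income level then is
$$y_t = \frac{1}{2}\eta_t + \frac{1}{3}\alpha\eta_{t-1}.$$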
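The chain of substitutions above can be traced numerically. The following Python sketch, with hypothetical names and illustrative input values, computes wages and output under the constant money growth rule from given shocks and checks them against the closed-form expression for $y_t$. Exact fractions are used so that the comparison is not affected by rounding.

```python
from fractions import Fraction as F


def constant_growth_outcome(alpha, m_t, nu_lag2, eta_lag1, eta_t):
    """Return (w_t, y_t) under m_t = m_{t-1} + g with two-period wage contracts."""
    # Shock process: nu_{t-1} and nu_t built from nu_{t-2} and the two latest shocks.
    nu_lag1 = alpha * nu_lag2 + eta_lag1
    nu_t = alpha * nu_lag1 + eta_t
    # E_{t-2}[p_t] = m_t + alpha^2 * nu_{t-2}
    exp_p_tm2 = m_t + alpha**2 * nu_lag2
    # E_{t-1}[p_t] = (2/3) * (m_t + E_{t-2}[p_t] / 2 + alpha * nu_{t-1})
    exp_p_tm1 = F(2, 3) * (m_t + exp_p_tm2 / 2 + alpha * nu_lag1)
    # [Wages]: w_t = (E_{t-1}[p_t] + E_{t-2}[p_t]) / 2
    w_t = (exp_p_tm1 + exp_p_tm2) / 2
    # Output (19): y_t = (m_t - w_t + nu_t) / 2
    y_t = (m_t - w_t + nu_t) / 2
    return w_t, y_t


# Illustrative (assumed) inputs.
alpha = F(1, 2)
eta_lag1, eta_t = F(1), F(-1)
w_t, y_t = constant_growth_outcome(alpha, m_t=F(3), nu_lag2=F(2),
                                   eta_lag1=eta_lag1, eta_t=eta_t)

# Closed form derived in the text.
y_closed_form = F(1, 2) * eta_t + F(1, 3) * alpha * eta_lag1
```

The level of money supply and the older shock $\nu_{t-2}$ drop out of output, so only $\eta_t$ and $\eta_{t-1}$ remain.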
The second term in this expression highlights the persistence of shocks. This policy does not
eliminate $\eta_{t-1}$. We have relatively volatile output and expected inflation at rate $g$,
$$E[\pi_t] = E[p_t - p_{t-1}] = m_t - m_{t-1} + E\left(\frac{2}{3}\alpha\nu_{t-1} + \frac{1}{3}\alpha^2\nu_{t-2}\right) - E\left(\frac{2}{3}\alpha\nu_{t-2} + \frac{1}{3}\alpha^2\nu_{t-3}\right) = m_t - m_{t-1} = g.$$
Consider a slight extension of the discussion. Suppose that the monetary policy authority/central
bank can cheat the agents and unexpectedly increase the growth of money supply from $g$ to $g'$ ($g' > g$). Suppose
for simplicity that there are no shocks in the economy, so that $\eta_t = \nu_t \equiv 0$. This implies that
the prices which agents expect, and prices in general, are
$$E_{t-1}[p_t] = E_{t-2}[p_t] = p_t = m_{t-1} + g$$
and wages are given by
$$w_t = m_{t-1} + g.$$
Therefore, aggregate supply is at its long-run level $y_t = 0$. This happens because there is no
uncertainty.
Output, according to (19), is given by
$$y_t = \frac{1}{2}\left[m_t - (m_{t-1} + g)\right].$$
Therefore, if the monetary policy authority increases the growth of money supply to $g'$ then it
increases output at least for a short period,
$$y_t = \frac{1}{2}\left(g' - g\right).$$
Suppose policy makers’ agenda is to boost output. In such a case it would be tempting for
the policy makers to surprise the public with sudden increases of inflation. In other words, although the policy
makers might have announced a fixed monetary policy rule, they might want to deviate
from that commitment. In such a case, would the public believe that policy makers will stick to the
announced policy? The answer to the question is "No" (most probably). We will continue exploring
this commitment problem more formally in the next sections.
Discretionary monetary policy
Suppose now that the central bank’s agenda is to maximize output and minimize inflation. To
formalize that, assume that the central bank has a loss function,
$$L_t = -y_t + \frac{\lambda}{2}\left(p_t - p_{t-1}\right)^2,$$
which it attempts to minimize by choosing the monetary policy rule. Clearly this is equivalent to
maximizing $-L_t$. Therefore, the central bank’s optimal problem is
$$-L_t = \max_{m_t}\left\{y_t - \frac{\lambda}{2}\left(p_t - p_{t-1}\right)^2\right\}.$$
Aggregate demand and supply are the same as in the previous section. In equilibrium we have then
that
$$p_t = \frac{1}{2}\left(m_t + w_t + \nu_t\right),$$
$$y_t = \frac{1}{2}\left(m_t - w_t + \nu_t\right).$$
In turn, assume that wages and shocks are given by
$$w_t = p^e_{t|t-1},$$
$$\nu_t = \alpha\nu_{t-1} + \eta_t.$$
Suppose that the central bank sets its policy prior to observing the shock. However, it has an
important advantage in the sense that it can set its monetary policy rule conditional on the wages
that prevail in the economy. Therefore, the central bank minimizes the expected loss and solves
$$-E_{t-1}[L_t] = \max_{m_t} E_{t-1}\left\{\frac{1}{2}\left(m_t - w_t + \alpha\nu_{t-1} + \eta_t\right) - \frac{\lambda}{2}\left[\frac{1}{2}\left(m_t + w_t + \alpha\nu_{t-1} + \eta_t\right) - p_{t-1}\right]^2\right\}.$$
The solution of this problem is given by
$$\frac{\partial E_{t-1}[L_t]}{\partial m_t} = 0,$$
which (in this setting) is equivalent to
$$E_{t-1}\left[\frac{\partial L_t}{\partial m_t}\right] = 0,$$
or simply
$$E_{t-1}\left\{\frac{1}{2} - \frac{\lambda}{2}\left[\frac{1}{2}\left(m_t + w_t + \alpha\nu_{t-1} + \eta_t\right) - p_{t-1}\right]\right\} = 0.$$
Therefore,
$$m_t = \frac{2}{\lambda} + 2p_{t-1} - w_t - \alpha\nu_{t-1}.$$
Plugging this monetary policy into the expressions for output and prices gives
$$p_t = \frac{1}{\lambda} + p_{t-1} + \frac{1}{2}\eta_t,$$
$$y_t = \frac{1}{\lambda} + p_{t-1} - w_t + \frac{1}{2}\eta_t.$$
The expression for prices implies that expected inflation is
$$E[\pi_t] = \frac{1}{\lambda}.$$
Therefore, increasing the weight on inflation in the loss function, $\lambda$, reduces inflation. In other words,
if the central bank places more value on low inflation then it sets the monetary policy rule to make sure inflation
is lower.
In turn, in order to find output assume that the agents have rational expectations and derive
the wage rate from the equation for prices,
$$w_t = \frac{1}{\lambda} + p_{t-1}.$$
Wages depend on the parameter $\lambda$. This is because in this setting the monetary policy rule is contingent
on prevailing wages.
Therefore, output is given by
$$y_t = \frac{1}{2}\eta_t.$$
This monetary policy therefore does not limit the effect of shocks on output but affects inflation.
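The discretionary solution can be traced step by step with exact arithmetic. The Python sketch below uses hypothetical names and illustrative values; it plugs the rational expectations wage and the optimal money supply into the equilibrium equations for prices and output.

```python
from fractions import Fraction as F


def discretionary_outcome(lam, alpha, p_prev, nu_prev, eta_t):
    """Return (inflation, output) under the optimal discretionary money supply."""
    # Rational expectations wage: w_t = E_{t-1}[p_t] = 1/lambda + p_{t-1}.
    w_t = 1 / lam + p_prev
    # Optimal money supply: m_t = 2/lambda + 2 p_{t-1} - w_t - alpha * nu_{t-1}.
    m_t = 2 / lam + 2 * p_prev - w_t - alpha * nu_prev
    # Realized shock and equilibrium prices and output.
    nu_t = alpha * nu_prev + eta_t
    p_t = (m_t + w_t + nu_t) / 2
    y_t = (m_t - w_t + nu_t) / 2
    return p_t - p_prev, y_t


# Illustrative (assumed) values.
lam, alpha = F(4), F(1, 2)
inflation_no_shock, output_no_shock = discretionary_outcome(
    lam, alpha, p_prev=F(1), nu_prev=F(2), eta_t=F(0))
inflation_shock, output_shock = discretionary_outcome(
    lam, alpha, p_prev=F(1), nu_prev=F(2), eta_t=F(1))
```

Without a shock, inflation equals $\frac{1}{\lambda}$ and output is at its long-run level; a shock $\eta_t$ moves both inflation and output by $\frac{1}{2}\eta_t$, whatever the value of $\lambda$.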
Political cycles and discretionary monetary policy
Consider a modification of the previous setup. Suppose that there are two political parties, A and
B, in the economy and it is an election year. Currently it is mid-period $t-1$ and elections are at the
end of period $t-1$. Suppose these parties have different tastes for (weights on) inflation, $\lambda_A$ and $\lambda_B$,
and appoint the monetary policy authority accordingly. Party A has lower tolerance for inflation than
party B, $\lambda_A > \lambda_B$. The wages for period $t$ are set before the election, contingent on the prices that are
expected to prevail in the next period. Therefore, wages are set without knowing which party wins the election
and which tastes for inflation prevail in the economy in period $t$. The probability that party A wins
the elections is $\chi_A$. The remainder of the model is the same as the model in the previous section.
After the elections there is one party in power. Let it be party $i$, where $i$ is either A or B.
Therefore, the economy is described by
$$p^i_t = \frac{1}{2}\left(m^i_t + w_t + \nu_t\right),$$
$$y^i_t = \frac{1}{2}\left(m^i_t - w_t + \nu_t\right),$$
$$m^i_t = \frac{2}{\lambda_i} + 2p_{t-1} - w_t - \alpha\nu_{t-1},$$
$$E\left[\pi^i_t\right] = \frac{1}{\lambda_i},$$
$$w_t = \frac{1}{\lambda_i} + p_{t-1}.$$
Clearly, expected inflation is lower in case party A won the election. Combining the expressions for
the price level and monetary policy gives
$$p^i_t = \frac{1}{\lambda_i} + p_{t-1} + \frac{1}{2}\eta_t.$$
This is the price level conditional on party $i$ having won the elections. With probability $\chi_A$ the price level
at time $t$ is $p^A_t$ and with probability $1 - \chi_A$ it is $p^B_t$; therefore, the unconditional price level is
$$p_t = \left[\chi_A\frac{1}{\lambda_A} + (1 - \chi_A)\frac{1}{\lambda_B}\right] + p_{t-1} + \frac{1}{2}\eta_t.$$
This implies that wages are given by
$$w_t = \left[\chi_A\frac{1}{\lambda_A} + (1 - \chi_A)\frac{1}{\lambda_B}\right] + p_{t-1}.$$
Wages depend on a weighted average of the parties’ tolerances for inflation and on the previous level of prices. Therefore,
even though only one party is elected at time $t$, the preferences of both parties matter.
In turn, the expected inflation rate is given by
$$E[\pi_t] = \chi_A\frac{1}{\lambda_A} + (1 - \chi_A)\frac{1}{\lambda_B}.$$
To determine the level of output, use the expressions for money supply and wages,
$$y^i_t = \frac{1}{\lambda_i} - \left[\chi_A\frac{1}{\lambda_A} + (1 - \chi_A)\frac{1}{\lambda_B}\right] + \frac{1}{2}\eta_t.$$
Therefore, if A is elected the expected output is
$$E[y_t] = (1 - \chi_A)\left(\frac{1}{\lambda_A} - \frac{1}{\lambda_B}\right),$$
which is a negative number. Its magnitude increases with the probability that party B wins the
elections, $1 - \chi_A$. The expected output level is negative since party A tolerates inflation less than
party B. The possibility that party B wins pushes expected inflation up. To keep inflation low party A reduces money
supply and output. Output declines to a negative number.
If B is elected then the expected output is positive and given by
$$E[y_t] = \chi_A\left(\frac{1}{\lambda_B} - \frac{1}{\lambda_A}\right) > 0.$$
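The election outcomes can be illustrated with a small numerical example. The Python sketch below uses hypothetical names and assumed values for $\lambda_A$, $\lambda_B$, and $\chi_A$; it computes the inflation rate built into wages before the election and the expected output conditional on each party winning.

```python
from fractions import Fraction as F


def pre_election_expected_inflation(lam_a, lam_b, chi_a):
    """Expected inflation built into wages: chi_A / lambda_A + (1 - chi_A) / lambda_B."""
    return chi_a / lam_a + (1 - chi_a) / lam_b


def expected_output(lam_a, lam_b, chi_a, winner):
    """Expected output E[y_t] if `winner` ("A" or "B") wins the election."""
    lam_i = lam_a if winner == "A" else lam_b
    return 1 / lam_i - pre_election_expected_inflation(lam_a, lam_b, chi_a)


# Illustrative (assumed) values: party A dislikes inflation more, lambda_A > lambda_B.
lam_a, lam_b, chi_a = F(4), F(2), F(1, 2)

inflation_in_wages = pre_election_expected_inflation(lam_a, lam_b, chi_a)
output_if_a = expected_output(lam_a, lam_b, chi_a, winner="A")
output_if_b = expected_output(lam_a, lam_b, chi_a, winner="B")
```

With these values the economy contracts in expectation if party A wins and expands if party B wins, and the two outcomes average to zero when weighted by the election probabilities.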
Monetary policy under commitment and discretion
Previous sections discussed the properties of a monetary policy rule which reduces the variance of
output. The discussion in these sections suggested that in certain cases there might be a commitment
problem with an announced policy, in the sense that policy makers might want the public to believe
in the implementation of a policy but might later deviate from it. The latter situation arises
especially when the policy makers pursue two opposing policy goals at the same time. The monetary
policy authorities/central banks usually have exactly two policy goals: stable inflation and stable output.
Consider an economy where the central bank has these two policy goals. To formalize this and
keep the matter relatively simple, assume that the central bank selects the monetary policy rule to
minimize
$$L_t = \pi_t^2 + \lambda\left(y_t - y\right)^2.$$
Monetary policy tries to minimize the deviation of output from its long-run level $y$ and the deviation
of inflation $\pi_t$ from its long-run level 0. $\lambda$ is a positive parameter. It measures the importance of
the deviation of output from its long-run level for the central bank and therefore for monetary policy.
In this context $L$ is the central bank’s loss function.
There is no trend in the model which we consider here, deviations of output from the long-run
level stem from exogenous shocks, and ideally long-run inflation is 0. Therefore, minimizing $\pi_t^2$
and $(y_t - y)^2$ amounts to minimizing their variance.
Suppose that the remainder of the economy is given by versions of the AD and AS curves with
expected wages
$$[\text{AD}]: \quad y_t = a\left(m_t - p_t\right) + \eta^d_t,$$
$$[\text{AS}]: \quad y_t = b\left(p_t - w_t\right) + \eta^s_t,$$
$$[\text{Wages}]: \quad w_t = p^e_{t|t-1},$$
where $\eta^d_t$ and $\eta^s_t$ are uncorrelated i.i.d. random variables with mean 0 and variances $\sigma_d^2$ and $\sigma_s^2$,
respectively. $a$ and $b$ are positive parameters.
Further, suppose that agents have rational expectations, so that
$$p^e_{t|t-1} = E_{t-1}[p_t].$$
For the purposes of the current discussion it is more convenient to write the [AS] curve with inflation
rates, using $E_{t-1}[p_{t-1}] = p_{t-1}$,
$$[\text{AS}]: \quad y_t = b\left(p_t - p_{t-1} - E_{t-1}[p_t] + E_{t-1}[p_{t-1}]\right) + \eta^s_t = b\left(\pi_t - E_{t-1}[\pi_t]\right) + \eta^s_t.$$
In this model we will give an important advantage to the central bank, assuming that it sets
monetary policy after observing the shocks $\eta^s_t$ and $\eta^d_t$. Usually, monetary policy can react quickly,
although hardly without a lag. However, what we actually need here is that it reacts faster to
shocks than the private sector/prices and wages. This seems to be a realistic assumption and opens
a door for monetary policy to have real effects. Monetary policy can have real effects in this case
since it can mitigate at least within-period shocks.
We will further assume that monetary policy is given by
$$[\text{Policy}]: \quad \pi_t = \alpha + \beta\eta^d_t + \delta\eta^s_t,$$
where $\alpha$, $\beta$ and $\delta$ are policy parameters which the central bank chooses. Clearly this is an indirect
formulation of the monetary policy rule. One way to think about it is that the central bank sets the money
supply rule so that $\pi_t$ is given by equation [Policy]. Once $\pi_t$ is determined, this money supply can
be determined from the [AD] equation.
With this monetary policy rule, agents’ expectations about inflation are
$$\pi^e_{t|t-1} = \alpha.$$
Therefore, whatever value of $\alpha$ the central bank would choose, the agents know it and will adjust
their expectations accordingly. In this sense, the central bank can influence the agents’ expectations.
It will make use of this in its optimization problem.
Monetary policy under commitment
In this case the central bank announces its policy a period ahead (at period $t-1$ for period $t$) and
commits to it. This means that the central bank selects at time $t-1$ the parameters $\alpha$, $\beta$ and $\delta$ to
minimize its expected loss function $E_{t-1}[L_t]$. At time $t-1$, however, similar to the agents it does not
know the possible realization of $\eta^s_t$. Therefore,
$$E_{t-1}[\pi_t] = \alpha.$$
From equations [AS], [Wages], and [Policy] and the expression for the loss function it follows that
the central bank solves:
$$E_{t-1}[L_t] = \min_{\alpha,\beta,\delta} E_{t-1}\left[\left(\alpha + \beta\eta^d_t + \delta\eta^s_t\right)^2\right] + \lambda E_{t-1}\left[\left(b\left(\alpha + \beta\eta^d_t + \delta\eta^s_t - \alpha\right) + \eta^s_t - y\right)^2\right].$$
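In order to solve this problem, expand the brackets and use the definition of variance for random
variables with 0 mean, e.g., $E_{t-1}\left[\left(\eta^d_t\right)^2\right] = \sigma_d^2$:
$$E_{t-1}\left[\left(\alpha + \beta\eta^d_t + \delta\eta^s_t\right)^2\right] = E_{t-1}\left[\alpha^2 + 2\alpha\beta\eta^d_t + \left(\beta\eta^d_t\right)^2 + 2\left(\alpha + \beta\eta^d_t\right)\left(\delta\eta^s_t\right) + \left(\delta\eta^s_t\right)^2\right] = \alpha^2 + \beta^2\sigma_d^2 + \delta^2\sigma_s^2.$$
Similarly,
$$E_{t-1}\left[\left(b\left(\alpha + \beta\eta^d_t + \delta\eta^s_t - \alpha\right) + \eta^s_t - y\right)^2\right] = (b\beta)^2\sigma_d^2 + (b\delta + 1)^2\sigma_s^2 + y^2.$$
Therefore, the optimal problem of the central bank is
$$E_{t-1}[L_t] = \min_{\alpha,\beta,\delta}\; \alpha^2 + \beta^2\sigma_d^2 + \delta^2\sigma_s^2 + \lambda\left[(b\beta)^2\sigma_d^2 + (b\delta + 1)^2\sigma_s^2 + y^2\right].$$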
The first order conditions of this optimal problem are
$$[\alpha]: \quad 2\alpha = 0,$$
$$[\beta]: \quad 2\beta\sigma_d^2 + 2\lambda b^2\sigma_d^2\beta = 0,$$
$$[\delta]: \quad 2\delta\sigma_s^2 + 2\lambda(b\delta + 1)b\sigma_s^2 = 0.$$
The optimal conditions for $\alpha$ and $\beta$ imply that $\alpha = \beta = 0$. However, $\delta = -\frac{\lambda b}{1 + \lambda b^2}$. The policy rule,
therefore, is
$$\pi_t = -\frac{\lambda b}{1 + \lambda b^2}\eta^s_t.$$
From the modified [AS] it follows then that output is
$$y_t = \frac{1}{1 + \lambda b^2}\eta^s_t.$$
The variances of inflation and output under this rule are
$$V[\pi_t] = \left(\frac{\lambda b}{1 + \lambda b^2}\right)^2\sigma_s^2,$$
$$V[y_t] = \left(\frac{1}{1 + \lambda b^2}\right)^2\sigma_s^2.$$
If the central bank does not value stability of output, $\lambda = 0$, then it sets inflation to 0 and allows
output to vary with $\eta^s_t$. If, however, it places a very large weight on the stability of output, $\lambda \to +\infty$,
then it allows inflation to vary with $\eta^s_t$ but keeps output constant so that it has 0 variance. In
general, it is easy to show that the variance of inflation increases with $\lambda$ and the variance of output
declines with it.
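The trade-off between the two variances can be illustrated numerically. The following Python sketch, with hypothetical names and assumed parameter values, evaluates the optimal response $\delta$ and the implied variances of inflation and output for two values of $\lambda$.

```python
from fractions import Fraction as F


def commitment_rule(lam, b, var_s):
    """Return (delta, V[pi_t], V[y_t]) for the optimal rule under commitment."""
    # Optimal response to the supply shock: delta = -lambda * b / (1 + lambda * b^2).
    delta = -lam * b / (1 + lam * b**2)
    # pi_t = delta * eta^s_t and y_t = (b * delta + 1) * eta^s_t.
    var_inflation = delta**2 * var_s
    var_output = (b * delta + 1) ** 2 * var_s
    return delta, var_inflation, var_output


# Illustrative (assumed) values: b = 1, sigma_s^2 = 1, and two weights on output.
delta_low, var_pi_low, var_y_low = commitment_rule(F(1), F(1), F(1))
delta_high, var_pi_high, var_y_high = commitment_rule(F(3), F(1), F(1))
```

Moving from $\lambda = 1$ to $\lambda = 3$ raises the variance of inflation and lowers the variance of output, in line with the comparative statics stated above.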
The shock $\eta^d_t$ does not enter the monetary policy rule ($\beta = 0$). The intuition behind such a result
is that the shock $\eta^d_t$ pushes prices and output proportionately and in the same direction. Therefore,
it does not create a trade-off between offsetting inflation and output volatilities. In turn, the shock $\eta^s_t$
pushes prices and output in different directions. For example, a positive shock $\eta^s_t$ increases output,
but reduces prices. Therefore, it creates a trade-off.
Although the central bank reacts to the shocks $\eta^s_t$, it sets $\alpha = 0$. Therefore, the unconditional
expectation of inflation and the agents’ expected inflation are 0. Under this policy the expected
value of the loss function of the central bank is
$$E_{t-1}[L_t] = \lambda\left(\frac{1}{1 + \lambda b^2}\sigma_s^2 + y^2\right).$$
Monetary policy under discretion (without commitment)
A problem with the commitment equilibrium is that at time $t$ the policy announced at $t-1$ may
no longer be the optimal rule. In other words, if the policy maker is able to change the announced
policy at time $t$ then it might achieve a lower loss. This is because at that time inflation expectations
have been formed and can be treated as given. The central bank then could exploit this.
Formally, assume that agents believe that the central bank will deliver $E_{t-1}[\pi_t] = 0$. In such a
case there is no $E_{t-1}[\pi_t]$ in [AS]. Therefore, from [AS], [Wages], and [Policy] and the expression
for the loss function it follows that the central bank solves:
$$E_{t-1}[L_t] = \min_{\alpha,\beta,\delta} E_{t-1}\left[\left(\alpha + \beta\eta^d_t + \delta\eta^s_t\right)^2\right] + \lambda E_{t-1}\left[\left(b\left(\alpha + \beta\eta^d_t + \delta\eta^s_t\right) + \eta^s_t - y\right)^2\right].$$
This can be equivalently written as
$$E_{t-1}[L_t] = \min_{\alpha,\beta,\delta}\; \alpha^2 + \beta^2\sigma_d^2 + \delta^2\sigma_s^2 + \lambda\left[(b\beta)^2\sigma_d^2 + (b\delta + 1)^2\sigma_s^2 + (\alpha b - y)^2\right].$$
The first order conditions in this case are
$$[\alpha]: \quad 2\alpha + 2\lambda b(\alpha b - y) = 0,$$
$$[\beta]: \quad 2\beta\sigma_d^2 + 2\lambda b^2\sigma_d^2\beta = 0,$$
$$[\delta]: \quad 2\delta\sigma_s^2 + 2\lambda(b\delta + 1)b\sigma_s^2 = 0.$$
Since the central bank sets the same $\beta$ and $\delta$, it has to have a lower loss setting
$$\alpha = \frac{\lambda b}{1 + \lambda b^2}y$$
instead of $\alpha = 0$. Use [AS] - where expected inflation is zero - and the policy rule to compute
the expected value of the loss function under this policy,
$$\pi_t = \frac{\lambda b}{1 + \lambda b^2}y - \frac{\lambda b}{1 + \lambda b^2}\eta^s_t,$$
$$E_{t-1}[\pi_t] = \frac{\lambda b}{1 + \lambda b^2}y,$$
$$y_t = \frac{\lambda b^2}{1 + \lambda b^2}y + \frac{1}{1 + \lambda b^2}\eta^s_t,$$
$$E_{t-1}[L_t] = \lambda\left(\frac{1}{1 + \lambda b^2}\right)\left(y^2 + \sigma_s^2\right).$$
Clearly, the expected loss with $\alpha = \frac{\lambda b}{1 + \lambda b^2}y$ is lower than with $\alpha = 0$. Therefore, the central bank has
an incentive to cheat the agents: announce that $\alpha = 0$ so that agents perceive $E_{t-1}[\pi_t] = 0$ but later
set $\alpha = \frac{\lambda b}{1 + \lambda b^2}y$ so that $E_{t-1}[\pi_t] > 0$.
If the central bank announces a policy but alters it after the announcement then the policy rule
is not “time consistent” and is not credible. The agents will not believe that the central bank is
committed and the commitment equilibrium falls apart.
Assume now that the central bank cannot commit to a rule and look for a policy which is
optimal at period $t$ when the expectation of inflation is taken as given by the central bank. Call this a
discretionary policy. Clearly, if this policy does not appear to be the same as the one above, then
there is a time inconsistency problem.
With discretionary monetary policy, the central bank solves
$$L_t = \min_{\pi_t}\left\{\pi_t^2 + \lambda\left[b\left(\pi_t - E_{t-1}[\pi_t]\right) + \eta^s_t - y\right]^2\right\}.$$
The central bank minimizes the loss function but not its expected value, since it is not committed to
any rule and makes the decision after the realization of shocks.
The first order condition then is
$$\pi_t + b\lambda\left[b\left(\pi_t - E_{t-1}[\pi_t]\right) + \eta^s_t - y\right] = 0.$$
To find $E_{t-1}[\pi_t]$ take the expected value of this expression,
$$E_{t-1}[\pi_t] = \lambda b y,$$
which means that at period $t-1$ the agents expect positive inflation at period $t$. In turn, inflation
and output are
$$\pi_t = -\frac{b\lambda}{1 + \lambda b^2}\eta^s_t + \lambda b y,$$
$$y_t = \frac{1}{1 + \lambda b^2}\eta^s_t.$$
Therefore, the central bank runs higher inflation than it would if it could commit to a rule.
In terms of previous choice parameters, clearly, this situation corresponds to δ = − bλ1+λb2
and
α = λby.
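The central bank’s expected value of the loss function at time t − 1 then is (use $E_{t-1}[\pi_t] = \lambda b y$ in [AS])
$$E_{t-1}[L_t] = E_{t-1}\left[\left(-\frac{b\lambda}{1 + \lambda b^2}\eta^s_t + \lambda b y\right)^2\right] + \lambda E_{t-1}\left[\left(\frac{1}{1 + \lambda b^2}\eta^s_t - y\right)^2\right] = \lambda\left(\frac{1}{1 + \lambda b^2}\right)\sigma_s^2 + \lambda\left(1 + \lambda b^2\right)y^2.$$
Denote the expected values of loss in case of credible commitment, cheating, and discretion as $E^{CM}_{t-1}[L_t]$, $E^{CH}_{t-1}[L_t]$, and $E^{D}_{t-1}[L_t]$:
$$E^{CM}_{t-1}[L_t] = \lambda\left(\frac{1}{1 + \lambda b^2}\sigma_s^2 + y^2\right),$$
$$E^{CH}_{t-1}[L_t] = \lambda\left(\frac{1}{1 + \lambda b^2}\right)\left(\sigma_s^2 + y^2\right),$$
$$E^{D}_{t-1}[L_t] = \lambda\left(\frac{1}{1 + \lambda b^2}\sigma_s^2 + y^2\right) + (\lambda b)^2 y^2.$$
It is clear that
$$E^{D}_{t-1}[L_t] > E^{CM}_{t-1}[L_t] > E^{CH}_{t-1}[L_t].$$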
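The ranking of the three expected losses can be checked numerically. The sketch below simply evaluates the three closed-form expressions for commitment, cheating, and discretion; the function name and the parameter values (λ = 0.5, b = 1, y = 1, σ²ₛ = 1) are assumptions made only for illustration and do not come from the notes.

```python
def expected_losses(lam, b, y_target, var_s):
    """Expected losses under commitment, cheating, and discretion.

    lam      : weight on output stabilization (lambda in the text)
    b        : slope of the aggregate supply curve
    y_target : the output target y in the loss function
    var_s    : variance of the supply shock
    """
    k = 1.0 + lam * b ** 2
    commitment = lam * (var_s / k + y_target ** 2)
    cheating = lam * (var_s + y_target ** 2) / k
    discretion = commitment + (lam * b) ** 2 * y_target ** 2
    return commitment, cheating, discretion


# Hypothetical parameter values, chosen only for illustration.
cm, ch, d = expected_losses(lam=0.5, b=1.0, y_target=1.0, var_s=1.0)
print(f"commitment={cm:.4f}, cheating={ch:.4f}, discretion={d:.4f}")
```

For any positive λ, b, and y the gap between discretion and commitment is (λb)²y², so the ranking does not depend on the particular numbers used here.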
Since the loss of the central bank is higher under discretion, and the discretionary equilibrium prevails if it fails to commit, the central bank might want to use some mechanism to show the public/agents that it is committed. Such mechanisms are readily available in case of repeated interactions, when the public can punish the central bank (for example, by changing their beliefs) if it deviates from its announced policy. Another way to align incentives is to increase the independence of the central bank, which often means a reduction of λ. In such a case, however, although the central bank would commit to a rule, it would allow for uncontrolled changes in output.
Business cycles
The term business cycles refers to fluctuations in aggregate output and other activity (e.g., unemployment, trade) over the medium term. In an economy, the medium term is usually associated with a period of several months or years. Fluctuations can be both upward and downward. Upward changes in output are called economic booms, while downward changes are called economic busts or recessions. Relatively long-lasting recessions are called depressions. These fluctuations are measured relative to the long-term growth trend of output and are largely unpredictable.
The explanation of business cycles is one of the primary issues in macroeconomics. Business cycles have been studied since the time of Adam Smith and David Ricardo. Economists tend to differ a lot, however, in their explanations of the causes of business cycles and in their proposed remedies. This is such a central topic that economists even form schools of thought around the explanation of business cycles.
Some of the most highlighted shocks to aggregate output and other aggregate variables in an
economy are
• Technology shocks: 100 years ago travelling from Barcelona to New York would take much more time than now. This is a drastic example of how production functions can change over time. New technologies like PCs and robots alter the production process and usually raise productivity. Sometimes, production facilities break down or employees use too much Facebook, so productivity falls. Such technological change is not always smooth; it often comes in the form of shocks.
• Weather shocks: Agriculture and tourism industries are very weather-dependent. Other industries could also depend on weather if their employees’ work effort depends on weather. Fluctuations in weather then affect output in these industries.
• Monetary shocks: We have seen that in certain cases money supply and inflation affect output. This implies that random changes to monetary policy or liquidity in the economy can lead to output fluctuations as well.
• Political shocks: The government can influence the real economy through public expenditure and regulations. If it changes expenditures, tax laws, antitrust regulation, and expectations then that can cause fluctuations in aggregate output.
• Taste shocks: All the examples above are rather about the supply of goods. There could also be shocks to the demand for goods and services in terms of shifts in preferences. Such shifts can cause fluctuations and can come, for example, from the introduction of new products that make others obsolete in terms of preferences.
Usually none of these shocks can explain large shifts in output (and other aggregate variables)
such as observed in real economies. However, there are mechanisms in the economy that can
amplify these shocks. These shocks can be amplified because of, for example, intra- and inter-
temporal substitution. If a negative shock hits the economy and output declines then consumers
might wish to work less and enjoy more leisure (intra-temporal substitution). This would reduce
output further. Moreover, if consumers like smoothing their consumption then they would save less
(inter-temporal substitution) so that capital would decline and output would be lower in future.
Price stickiness could be another amplifying mechanism. If wages are sticky, for example, then after
a negative shock to productivity firms would like to pay lower wages but they cannot. Instead of
lowering wages and keeping output fairly steady, they would lay off workers and reduce
output. Financial frictions, in terms of inability to lend and borrow freely, can amplify the effects
of shocks too. If there are financial frictions then even small shocks can force firms into bankruptcy.
This will affect the financial sector that lent money to the bankrupt firms and reduce credit. Often
then additional firms have to declare bankruptcy, and sometimes even banks fail. Bank failures
reduce liquidity and credit. They can affect all creditors and debtors and therefore can have large
economic consequences.
The classical and neo-classical (freshwater) schools of thought tend to believe that the origins of business cycle fluctuations are completely exogenous processes which affect aggregate output through changes in technological efficiency (e.g., introduction of computers and Facebook) and other real variables. They hypothesize that the economy is frictionless so that there are no price rigidities and/or financial frictions. In this respect, they hypothesize that the cycles which follow these shocks are the optimal/best response of the economy. Therefore, these schools of thought believe that policies might not be effective in tackling business cycles. Neo-classical theories include the Real Business Cycle theory by Kydland and Prescott (1982), which is built around, in particular, the rational expectations assumption. These theories tend to have solid micro-foundations. However, they tend to underemphasize the importance of frictions and distortions in the real economy.
The Keynesian and neo-Keynesian (saltwater) schools of thought tend to believe that the origins
of business cycle fluctuations also include shocks to nominal variables. They hypothesize that the
economy involves frictions so that there are price rigidities, financial frictions, and other failures
in the economy. In such frameworks nominal variables affect real variables. Moreover, the cycles
which follow these shocks are not the best response of the economy. Therefore, these schools of
thought tend to believe that there is scope for policy intervention (fiscal and/or monetary). A standard example of a Keynesian model is the AD-AS model (IS-LM-Phillips curve model). We will see two "extensions" of it in the next two sections. These models, however, lack solid micro-foundations. Neo-Keynesian theories tend to alleviate that problem. Prominent economists such as Michael Woodford and Gregory Mankiw have contributed to the development of neo-Keynesian theories. These theories are usually presented in the form of Dynamic Stochastic General Equilibrium (DSGE) models, which are far beyond the scope of this course because of their complexity. Such models, however, are commonly used in central banks and other (financial) institutions.
There are other schools of thought too, for example the Monetarist school of thought (largely due to Milton Friedman) and the Austrian school of thought (largely due to Friedrich Hayek).
Business cycles - The Carlin and Soskice (2005) model
This section presents a very classical Keynesian model of business cycles by Carlin and Soskice (2005). This model is an extension/analogue of the standard AD-AS model. The standard AD-AS model consists of 3 equations. These are the IS and LM equations, which describe aggregate demand (AD),
$$[IS]: \quad Y = C + I(r) + G,$$
$$[LM]: \quad \frac{M}{P} = L(r + \pi^e, Y),$$
and the aggregate supply equation (AS). In Macroeconomics 1 and 2, the AD-AS model is presented in a static form. It is presented using levels of prices rather than changes of prices (inflation).
This model also consists of 3 equations. Its first equation is the IS curve. The IS curve in this model, however, is written in a somewhat reduced form and in terms of logarithms of variables
$$[IS]: \quad y = A - ar,$$
where A and a are positive parameters. For example, A depends on the level of consumption thriftiness. In turn, a measures the magnitude of the effect of the interest rate on income in the goods market. This model is dynamic and incorporates a time dimension. In particular, it is assumed here that income and the interest rate are time dependent. Moreover, this model assumes that the interest rate (because of some imperfect adjustment mechanisms) affects the level of income with a time lag so that
$$y_t = A - a r_{t-1}.$$
There exists an interest rate level $r_s$ that leads to the equilibrium level of output $y_e$ (the stabilizing level of the interest rate).
Graphically, the IS curve looks like this
The second equation in this model is the aggregate supply curve, which is written in the form
of the Phillips curve (where agents have static expectations)
πt = πt−1 + α (yt − ye) .
Graphically, this Phillips curve (PC) in short run and in long run is presented below
Here, the upward sloping curve (line) is the short run Phillips curve and the vertical curve (line)
is the long run Phillips curve. πT is the Central Bank’s targeted level of inflation.
The targeted level of inflation is one of the parameters of the Central Bank’s monetary policy. In this model the Central Bank designs its monetary policy so as to minimize its loss function
$$L = (y_t - y_e)^2 + \beta\left(\pi_t - \pi^T\right)^2,$$
where β is a positive parameter which highlights the importance of inflation stabilization for the Central Bank.
The Central Bank sets its monetary policy so as to solve
$$L = \min_{\pi_t} (y_t - y_e)^2 + \beta\left(\pi_t - \pi^T\right)^2,$$
s.t.
$$\pi_t = \pi_{t-1} + \alpha(y_t - y_e).$$
Substituting the Phillips curve for $\pi_t$ into L and differentiating with respect to $y_t$ gives the following first order condition
$$\frac{\partial L}{\partial y_t} = 0 \Leftrightarrow (y_t - y_e) + \alpha\beta\left(\pi_{t-1} + \alpha(y_t - y_e) - \pi^T\right) = 0.$$
Substituting the Phillips curve back into this equation gives
$$(y_t - y_e) = -\alpha\beta\left(\pi_t - \pi^T\right).$$
This equation summarizes the monetary policy rule. Carlin and Soskice call it the MR-AD equation.
Graphically the solution of the problem can be presented in the following manner.
The Central Bank’s indifference curves are ellipses (circles if β = 1) with a bliss point at $(y_e, \pi^T)$.
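In sum, the model is given by the following 3 equations
$$[IS]: \quad y_t = A - a r_{t-1}, \qquad (22)$$
$$[PC]: \quad \pi_t = \pi_{t-1} + \alpha(y_t - y_e), \qquad (23)$$
$$[MR\text{-}AD]: \quad (y_t - y_e) = -\alpha\beta\left(\pi_t - \pi^T\right). \qquad (24)$$
Plugging [MR−AD] into [PC] gives the level of inflation
$$\pi_t = \frac{1}{1 + \alpha^2\beta}\left(\pi_{t-1} + \alpha^2\beta\pi^T\right). \qquad (25)$$
Footnote 14: Denote $\omega = \frac{1}{1 + \alpha^2\beta}$ and rewrite this equation as $\pi_t = \omega\pi_{t-1} + (1 - \omega)\pi^T$. The solution of this equation is $\pi_t = \omega^t\pi_0 + (1 - \omega)\pi^T\sum_{\tau=1}^{t}\omega^{t-\tau} = \omega^t\pi_0 + (1 - \omega^t)\pi^T$. As t increases the first term tends to 0. In turn, the second term tends to $\pi^T$. Substituting this into [MR−AD] gives the dynamics of output. From the dynamics of output and [IS] the dynamics of the interest rate can be found.
Moreover, expressing $\pi_t$ from [MR−AD] and plugging it into [PC] gives
$$[YR]: \quad y_t = y_e - \left(\frac{1}{\alpha\beta} + \alpha\right)^{-1}\left(\pi_{t-1} - \pi^T\right). \qquad (26)$$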
Call this equation YR, the output rule. In turn, from the IS equation it follows that
$$y_t - y_e = -a(r_{t-1} - r_s).$$
Therefore,
$$[TR]: \quad r_{t-1} - r_s = \frac{1}{a}\left(\frac{1}{\alpha\beta} + \alpha\right)^{-1}\left(\pi_{t-1} - \pi^T\right). \qquad (27)$$
This is an analogue of Taylor rule (it does not include output gap) which we call TR. This rule
directly follows from the Central Bank’s optimization rule. Therefore, it is the policy rule of the
Central Bank in terms of adjusting the interest rate. In this respect, TR suggests by how much the
interest rate should be adjusted if inflation deviates from its target.
Let us now turn to the analysis of fluctuations and policy responses. Without loss of generality, in the example economy discussed below we will assume that the inflation target is set to 2% (i.e., $\pi^T = 2$). Moreover, time starts at t = 0 and at t = 0 the economy is at its long run (equilibrium) level of output and (targeted) inflation.
Shocks causing fluctuations
Shock to the IS curve: Consider a positive shock to the IS curve which raises A to A′ at t = 0 (and A remains there forever). The Central Bank can do nothing to affect the increase in output at t = 0 since its monetary policy rule works through the interest rate, which affects output with a time lag. However, it can affect changes in the economy in the next period. In order to do so it needs to make a forecast of the Phillips curve for t = 1. With the forecast of the Phillips curve it will identify the constraint of its optimization problem and find the solution of that problem.
Suppose further that the shift of A to A′ has triggered (through the Phillips curve) an increase in the current level of inflation to $\pi_0 = 4\%$. From the Phillips and IS curves, therefore, we have that the change in A is given by
$$4 = 2 + \alpha\left(A' - A\right),$$
i.e., $A' = \frac{2}{\alpha} + A$.
Given that expectations are static the forecast of next period (t = 1) inflation is 4. For any level of output, this implies that inflation is going to be higher in the next period, implying that next period’s Phillips curve has shifted up. The graphical representation of this process is as follows
Notice that if the Central Bank decides to lower inflation in t = 1 then it will cause a recession.
This is because any level of inflation below 4 corresponds to lower output level (lower than ye).
Moreover, it can affect inflation at t = 0 and move along the new Phillips curve.
According to the Taylor rule the Central Bank will calculate inflation at t = 0 so that it increases
interest rate at t = 0. In the figure offered above this corresponds to choosing points B and B′.
Clearly, therefore, the Central Bank causes a recession in t = 1. What happens next? At t = 1 since
output is lower than ye according to the Phillips curve inflation at t = 2 will be lower than at t = 1.
This implies that the forecast of Phillips curve for t = 2 at t = 1 is to the right of the dashed PC
in the figure above. The Central Bank then will set slightly lower interest rate and inflation. This
will increase output. This adjustment process will continue till the point in time when inflation is
back to its targeted level of 2. Of course, it will imply that the new stabilizing level of interest rate
is higher than the old stabilizing level of interest rate, r′s > rs.
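The adjustment path after the IS shock can be traced numerically from equations (25), (24), and the IS curve in deviations from the new stabilizing interest rate. The following is a minimal sketch; the function name and the parameter values α = β = a = 1 are assumptions chosen only for illustration, while π₀ = 4 and πᵀ = 2 are taken from the example in the text.

```python
def simulate_is_shock(alpha, beta, a, pi_target, pi0, periods):
    """Paths after the IS shock, starting from inflation pi0 at t = 0.

    Returns a list of tuples (t, pi_t, y_t - y_e, r_{t-1} - r_s_new)
    for t = 1, ..., periods, where r_s_new is the new stabilizing rate.
    """
    omega = 1.0 / (1.0 + alpha ** 2 * beta)
    path = []
    pi_prev = pi0
    for t in range(1, periods + 1):
        # Equation (25): inflation chosen by the Central Bank.
        pi_t = omega * pi_prev + (1.0 - omega) * pi_target
        # Equation (24), MR-AD: the output gap that delivers pi_t.
        gap = -alpha * beta * (pi_t - pi_target)
        # IS curve in deviations: the rate set at t - 1 to obtain this gap.
        r_gap = -gap / a
        path.append((t, pi_t, gap, r_gap))
        pi_prev = pi_t
    return path


# Hypothetical parameter values; pi0 = 4 and pi_target = 2 follow the text.
path = simulate_is_shock(alpha=1.0, beta=1.0, a=1.0,
                         pi_target=2.0, pi0=4.0, periods=20)
for t, pi_t, gap, r_gap in path[:5]:
    print(f"t={t}: inflation={pi_t:.3f}, output gap={gap:.3f}, rate gap={r_gap:.3f}")
```

Under these assumed values the Central Bank sets the interest rate above its stabilizing level, output falls below yₑ at t = 1, and inflation then declines geometrically toward the 2% target while the output gap closes.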
Supply Shock: Consider a permanent shift of the level of output at t = 0. This corresponds to a shift of the long run level of output $y_e$ to some $y'_e$ such that $y'_e > y_e$ and can happen, for example, because of technical progress.
Suppose that the size of the shock is such that it triggers inflation to decline to 0. From the Phillips curve, therefore, the magnitude of the shock has to be
$$0 = 2 + \alpha\left(y_e - y'_e\right),$$
i.e., $y'_e = \frac{2}{\alpha} + y_e$.
Clearly, in this case, given that the long run Phillips curve shifts to the right (see the figure above for graphical representation), the short run Phillips curve has to shift to the right too in the long run. To have such a shift there needs to be a new MR−AD curve since it has to go through the intersection of the long run and short run Phillips curves.
Using the forecast of Phillips curve the Central Bank knows that this shock will imply lower
inflation at t = 1. It will design its optimal problem accordingly. The solution of its optimal problem
implies Y R equation. According to Y R, the Central Bank would like to have higher output in period
t = 1 since yt increases with ye. From TR equation and IS curve this implies that the Central Bank
will reduce inflation at t = 0 from 2% and the interest rate from rs. In the figure offered above it
will move to points C and C′. Later, it will gradually increase inflation and the interest rate so as to
reach $y'_e$ and $\pi^T$.
Endogenous business cycles - The Goodwin (1967) model
The Goodwin (1967) model combines not so orthodox growth models (Harrod, 1939; Domar, 1946)
with Phillips curve in order to generate endogenous business cycles. Fluctuations are due to cyclical
relationship between employment and wages. This model is not well micro-founded, however, it can
serve for a nice illustration of endogenous emergence of cycles.
This model has origins in biology. In particular, it is related to a large class of predator-prey models which describe dynamic biological systems. A common outcome of these systems is a recurring cycle in the predator and prey populations. Imagine increasing the number of predators. That would reduce the population of prey and therefore, eventually, reduce the number of predators. These dynamics give rise to the Lotka-Volterra differential equations. Similar equations are derived from this model.
At any time t, the aggregate output is assumed to be given by
$$Y_t = \min\left\{\alpha_t L_t, \frac{K_t}{\sigma}\right\},$$
where $\alpha_t$ is the productivity of labor $L_t$, and σ pins down the capital-output ratio $\frac{K_t}{Y_t}$ when $\frac{K_t}{\sigma} \le \alpha_t L_t$.
This production function implies that labor and capital are complementary. Moreover, it is clearly not a neo-classical production function. For example, it violates the diminishing returns property of neo-classical production functions.
In equilibrium firms will be reluctant to hire more than $\frac{K_t}{\sigma}$ of efficiency adjusted labor $\alpha_t L_t$ since hiring more would not increase their production. Likewise, they would be reluctant to employ capital such that $\frac{K_t}{\sigma}$ exceeds $\alpha_t L_t$. We will focus on the case when they hire so that
$$\alpha_t L_t = \frac{K_t}{\sigma}.$$
This implies that
$$Y_t = \alpha_t L_t = \frac{K_t}{\sigma}.$$
Further, assume that time is continuous and labor productivity grows at a rate of a. The growth rate of labor productivity in discrete time is given by $\frac{\alpha_{t+1} - \alpha_t}{\alpha_t}$. The continuous time analogue of this expression is $\frac{1}{\alpha_t}\frac{d\alpha_t}{dt}$.
Let total population be $N_t$ and grow at a rate of β, i.e., $\frac{1}{N_t}\frac{dN_t}{dt} = \beta$. The employment rate is given by
$$\varepsilon_t = \frac{L_t}{N_t}.$$
Denoting growth rates by letter g, this implies that employment rate grows at a rate of
gε = gL − β, (28)
where gL is the growth rate of labor. Notice that when the (equilibrium) amount of labor and
population grow at the same rate then the rate of growth of employment gε is zero. In case,
however, when population grows at a higher (lower) rate than labor then gε is negative (positive).
The rate of growth of labor can be found from the production function. Under the assumption that $\alpha_t L_t = \frac{K_t}{\sigma}$ the rate of growth of labor is
$$g_L = g_Y - a. \qquad (29)$$
Clearly, when K is fixed the amount of labor should decline with labor productivity since the amount of labor in efficiency units should be constant.
The Phillips curve in this model depicts the relationship between percentage changes of wages w (price of labor) and employment (output). The Phillips curve is assumed to be given by
$$g_w = \rho\varepsilon_t - \gamma, \qquad (30)$$
where ρ and γ are positive parameters that characterize labor market conditions (e.g., existence of
minimum wage, labor unions).
Denote the share of worker compensation by $\theta_t$,
$$\theta_t = \frac{w_t L_t}{Y_t}.$$
Accordingly, $1 - \theta_t$ is the share of capital compensation. Now, since we focus on the case when in equilibrium $\alpha_t L_t = \frac{K_t}{\sigma}$, the share of worker compensation is
$$\theta_t = \frac{w_t}{\alpha_t},$$
which has a growth rate of $g_\theta = g_w - a$, i.e.,
$$g_\theta = \rho\varepsilon_t - \gamma - a. \qquad (31)$$
The last thing which needs to be determined to close the model is the growth rate of output. Clearly, since $Y_t = \frac{K_t}{\sigma}$, output and capital grow at the same rate.
Assume that there are two types of agents in this economy: workers and agents who own capital
and supply no labor. Call the latter type of agents "capitalists." Workers consume their income
immediately and do not save. In turn, capitalists save a constant fraction s of their income. This means that total savings are given by
$$S_t = s(1 - \theta_t)Y_t.$$
Footnote 15: Recall that in the Solow (1956) model there is no distinction between workers and "capitalists." In this sense, Solow (1956) assumes that both types of agents save the same percentage of their income.
The capital accumulation rule, therefore, is given by
$$\frac{dK_t}{dt} = S_t - \delta K_t = s(1 - \theta_t)Y_t - \delta K_t = \left[\frac{s(1 - \theta_t)}{\sigma} - \delta\right]K_t.$$
The growth rate of capital and output then is
$$g_K = g_Y = \frac{s(1 - \theta_t)}{\sigma} - \delta. \qquad (32)$$
Notice that for a sharp increase of $\theta_t$, which corresponds to a sharp increase in the workers’ share of income, the growth rate of Y can become negative according to (32). This would imply that the growth rates of labor and employment become negative according to (29) and (28). A negative growth rate of employment would reduce employment. Therefore, it would reduce wage growth according to the Phillips curve (30) and the growth of θ according to (31). This is the predator-prey mechanism in this model.
This model can be summarized by two differential equations
$$g_\varepsilon = \frac{s(1 - \theta_t)}{\sigma} - \delta - a - \beta,$$
$$g_\theta = \rho\varepsilon_t - \gamma - a.$$
These equations are known as Lotka-Volterra differential equations. Clearly, $g_\varepsilon$ declines as $\theta_t$ increases. This is because higher $\theta_t$ reduces investments and growth of output. Therefore, it reduces growth of employment. Moreover, higher employment increases $g_\theta$. This is because increasing employment increases output and wages. Higher wages imply a higher share of worker compensation.
In the steady-state equilibrium there are no dynamics in the system. Therefore, in the steady-state equilibrium
$$0 = \frac{s(1 - \theta)}{\sigma} - \delta - a - \beta,$$
$$0 = \rho\varepsilon - \gamma - a.$$
Solving for the share of worker compensation and employment gives
$$\theta_{SS} = 1 - \frac{\sigma}{s}(\delta + a + \beta),$$
$$\varepsilon_{SS} = \frac{\gamma + a}{\rho}.$$
Consider the differential equations in order to see the dynamic adjustment in this model. The following figure offers the phase diagram (time evolution of the system).
Suppose that εt > εSS and θt < θSS . This implies that the point (εt, θt) is in quadrant I. Since
at εt = εSS we have that gθ = 0 and gθ increases with εt, it has to be that gθ > 0 for the points
(εt, θt) in quadrant I. This implies that over time θt increases in quadrant I, which is depicted
by the arrow pointing to the right (i.e., increasing θt). Moreover, since at θt = θSS we have that
gε = 0 and gε declines with θt, it has to be that gε > 0 for the points (εt, θt) in quadrant I. This
implies that over time εt increases in quadrant I, which is depicted by the arrow pointing up (i.e., increasing εt). The intuition behind the arrows in the remaining quadrants follows a similar logic.
Imagine the economy starts at some point in quadrant I. In such a case, over time it will gravitate to quadrant II, then to quadrant III, quadrant IV, and return back to quadrant I. This is illustrated by the circle of arrows around the steady state. It turns out that the dynamics in this model are periodic fluctuations, in the sense that this process never converges to the steady state nor diverges to ±∞.
An economy which is at the steady state can appear in quadrant I, for example, because of a positive shock to employment (e.g., population declines temporarily) and/or a negative shock to the labor income share (e.g., labor productivity increases temporarily). After such a shock it will experience fluctuations along the cycle and never converge back to its steady state. These fluctuations are endogenous business cycles. We have encountered such fluctuations for prices in the Cobweb Model. In the Cobweb Model there would be never ending fluctuations if α2 = 1.
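The closed orbits around the steady state can be illustrated by integrating the two differential equations numerically. The sketch below uses a classical fourth-order Runge-Kutta scheme and checks a quantity that is known to stay constant along Lotka-Volterra orbits. All parameter values, the starting point, and the function names are assumptions chosen only for illustration; they are not calibrated and do not come from the notes.

```python
import math


def goodwin_rhs(eps, theta, p):
    """Time derivatives of the employment rate eps and the labor share theta."""
    g_eps = p["s"] * (1.0 - theta) / p["sigma"] - p["delta"] - p["a"] - p["beta"]
    g_theta = p["rho"] * eps - p["gamma"] - p["a"]
    return eps * g_eps, theta * g_theta


def steady_state(p):
    """Steady-state employment rate and labor share."""
    theta_ss = 1.0 - p["sigma"] * (p["delta"] + p["a"] + p["beta"]) / p["s"]
    eps_ss = (p["gamma"] + p["a"]) / p["rho"]
    return eps_ss, theta_ss


def first_integral(eps, theta, p):
    """Quantity that is constant along an exact Lotka-Volterra orbit."""
    c1 = p["s"] / p["sigma"] - p["delta"] - p["a"] - p["beta"]
    c2 = p["s"] / p["sigma"]
    c3 = p["gamma"] + p["a"]
    return p["rho"] * eps - c3 * math.log(eps) + c2 * theta - c1 * math.log(theta)


def simulate(eps, theta, p, dt=0.01, steps=20000):
    """Integrate the system with the classical fourth-order Runge-Kutta scheme."""
    path = [(eps, theta)]
    for _ in range(steps):
        k1 = goodwin_rhs(eps, theta, p)
        k2 = goodwin_rhs(eps + 0.5 * dt * k1[0], theta + 0.5 * dt * k1[1], p)
        k3 = goodwin_rhs(eps + 0.5 * dt * k2[0], theta + 0.5 * dt * k2[1], p)
        k4 = goodwin_rhs(eps + dt * k3[0], theta + dt * k3[1], p)
        eps += dt * (k1[0] + 2.0 * k2[0] + 2.0 * k3[0] + k4[0]) / 6.0
        theta += dt * (k1[1] + 2.0 * k2[1] + 2.0 * k3[1] + k4[1]) / 6.0
        path.append((eps, theta))
    return path


# Hypothetical parameter values, chosen only for illustration.
params = {"s": 0.9, "sigma": 3.0, "delta": 0.05, "a": 0.02,
          "beta": 0.01, "rho": 0.5, "gamma": 0.43}
eps_ss, theta_ss = steady_state(params)
# Start in quadrant I: employment above and labor share below the steady state.
path = simulate(0.92, 0.70, params)
print("steady state:", eps_ss, theta_ss)
print("employment range:", min(e for e, _ in path), max(e for e, _ in path))
print("labor share range:", min(th for _, th in path), max(th for _, th in path))
```

Because the first integral is conserved, the simulated economy keeps circling the steady state at a roughly constant distance instead of converging to it, which is the sense in which the cycles are endogenous.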
A Real Business Cycles Model
This section mostly follows Chapter 9 in Doepke et al. (1999) and King and Rebelo (1999).
Before Kydland and Prescott (1982) economists thought that classical models can be used for
studies of long-term phenomena such as economic growth. However, short- and medium-term fluc-
tuations are not well explained with classical models. Kydland and Prescott (1982) revolutionized
business cycles theory and economics in two ways. First, they showed that realistic fluctuations can
emerge in classical (well micro-founded) macroeconomic models as an optimal response to exoge-
nous shocks. By doing so they basically originated the real business cycles (RBC) theory. Second,
they offered ways for evaluating the predictions of the models so that to gauge their relevance and
fit to the real world.
Usually it is impossible to analytically solve the models used in real business cycles theory.
Instead numerical/simulation methods are used. In order to resort to numerical methods the values
of model parameters should be known. Kydland and Prescott (1982) advocated the use of real
world data to calibrate (estimate) the parameters of the models.
In these models, it is also very hard to derive analytical comparative statics for understanding the
qualitative predictions of the models. Numerical comparative statics are used instead of analytical
comparative statics. The numerical comparative statics are also useful for evaluation of quantitative
predictions. In particular, researchers evaluate the response of the model variables over time to
exogenous shocks (impulse-response functions) and compare them to the patterns in real world
data. In line with Kydland and Prescott (1982), the comparison is in terms of
• the direction and the shape of the response of model variables;
• the magnitude of response in terms of mean and standard deviation; and
• the signs and magnitude of correlations between model variables.
The table below offers business cycle statistics (moments) for main macroeconomic variables
from US data. This table is taken from King and Rebelo (1999). It is used for the latter two
points. All data are quarterly and are for the period of 1947 Q1-1992 Q4. Variables representing
quantities such as output (Y), consumption (C), investment (I), and hours worked are in per capita
terms. Consumption includes consumption of non-durables. Investments include private fixed
capital formation and expenditures on durables. Wage rate is the real compensation per hour.
The interest rate is basically the interest rate paid on treasury bills minus inflation. A is Hicks-neutral
productivity in aggregate output. It is defined as the Solow residual ($\ln A = \ln Y - \alpha\ln K - (1 - \alpha)\ln L$).
The first column offers the sample standard deviation of the relevant variables. The standard deviation of a variable X with observations $X_1, X_2, ..., X_T$ is defined as
$$STD(X)_T = \sqrt{\frac{1}{T - 1}\sum_{t=1}^{T}\left(X_t - \frac{1}{T}\sum_{t=1}^{T}X_t\right)^2}.$$
It is the statistical/empirical analogue of the standard deviation, i.e., of the square root of the variance. The second column offers the standard deviation of the relevant variables relative to the standard deviation of output (Y). Clearly consumption of non-durables is much less volatile than output. Investments, which include expenditures on durables, are around three times more volatile than output. The number of hours worked has around the same volatility as output. However, wages (and the interest rate) vary much less than output and output per hour worked.
The third column of the table shows the first order autocorrelations of the variables. The first order autocorrelation of the variable X can be found using the following regression
$$X_t = \beta + \rho_1 X_{t-1} + \eta_t,$$
where β is a constant and $\eta_t$ is by construction orthogonal to $X_{t-1}$. Coefficient $\rho_1$ is the first order autocorrelation of the variable X. It is called first order autocorrelation because we use the first lag (i.e., $X_{t-1}$) and the correlation of X with itself (i.e., auto-correlation). Alternatively, $\rho_1$ can be computed as
$$\rho_1 = \frac{\frac{1}{T - 2}\sum_{t=2}^{T}\left(X_t - \bar{X}_T\right)\left(X_{t-1} - \bar{X}_{T-1}\right)}{STD(X)_T\, STD(X)_{T-1}}.$$
This coefficient shows how much the observations of the variable are interrelated (linearly). In other words it shows how good the current value of the variable is for predicting its future value. Now imagine that there is a shock to X because of some exogenous reasons (η). In such a case, if $\rho_1$ is close to 1 from below then this shock will stay in X for a long time. If $\rho_1$ is equal to 1 then it will stay in X forever. If $\rho_1$ is higher than 1 then it will stay in X forever; moreover, it will propagate and get larger over time.
From the third column it is clear that all quantity variables are very highly autocorrelated.
Perhaps, one of the most important autocorrelations for RBC theory is that of A. It is fairly large,
which means that previous values of A predict current values with high precision and a shock to A
will persist in A for a long time.
The last column of this table offers contemporaneous correlations between variables and output. The correlation between variables X and Z can be found using the following regression
$$X_t = \beta + \rho Z_t + \eta_t,$$
where β is a constant and $\eta_t$ is by construction orthogonal to $Z_t$. Coefficient ρ is the correlation between variables X and Z (strictly, this holds when X and Z have the same standard deviation, for example after standardizing both). Alternatively, ρ can be computed as
$$\rho = \frac{\frac{1}{T - 1}\sum_{t=1}^{T}\left(X_t - \bar{X}_T\right)\left(Z_t - \bar{Z}_T\right)}{STD(X)_T\, STD(Z)_T}.$$
This coefficient measures the degree to which two variables are related (linearly). This is the reason it is called "co"-rrelation. More precisely, it corresponds to contemporaneous correlation because it is for observations from the same period of time.
The last column suggests that output and other quantities are very highly correlated. This implies that the driving processes behind them might be quite the same. However, output and the real wage rate are not well correlated.
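The three statistics described above (standard deviation, first order autocorrelation, and contemporaneous correlation) can be computed directly from their definitions. The sketch below does this with the Python standard library on an artificial series; the AR(1) process with persistence 0.9, the sample size, and the function names are assumptions used only for illustration and are not the US data from the table. The autocorrelation is implemented as the sample correlation between the series and its first lag, which is one reading of the formula above.

```python
import math
import random


def mean(x):
    return sum(x) / len(x)


def std(x):
    """Sample standard deviation with the T - 1 divisor."""
    m = mean(x)
    return math.sqrt(sum((v - m) ** 2 for v in x) / (len(x) - 1))


def corr(x, z):
    """Contemporaneous sample correlation between two equally long series."""
    mx, mz = mean(x), mean(z)
    cov = sum((u - mx) * (v - mz) for u, v in zip(x, z)) / (len(x) - 1)
    return cov / (std(x) * std(z))


def autocorr1(x):
    """First order autocorrelation: correlation of X_t with X_{t-1}."""
    return corr(x[1:], x[:-1])


# Artificial AR(1) series, used only as an illustration.
random.seed(0)
x = [0.0]
for _ in range(4999):
    x.append(0.9 * x[-1] + random.gauss(0.0, 1.0))

print("standard deviation:", std(x))
print("first order autocorrelation:", autocorr1(x))
```

With a persistence parameter of 0.9 the estimated autocorrelation should be close to 0.9, which mirrors the statement that a shock to a highly autocorrelated variable stays in it for a long time.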
The Basic RBC Model
Consider a closed economy which is populated by a very large number of identical and infinitely-lived households of mass one. In each period, the representative household is endowed with 1 unit of time which it can use for work/labor l and leisure 1 − l. It derives instantaneous utility from consumption c and leisure
$$u(c_t, 1 - l_t) = \ln c_t + \gamma\ln(1 - l_t).$$
The lifetime utility of the household at time zero is the expected discounted sum of the instantaneous utilities
$$U(c, 1 - l) = E_0\left[\sum_{t=0}^{+\infty}\beta^t u(c_t, 1 - l_t)\right],$$
where $\beta = \frac{1}{1 + \rho}$ is the discount factor and ρ > 0 is the discount rate. The representative household has rational expectations and $E_0$ is the expectation operator given all the available information at the beginning of the economy, time 0.
The assumption that households live forever can be justified by thinking of families of altruistic households who care about the utilities of their offspring as they care about their own utilities. It turns out that this assumption makes the calculus easier. In turn, the assumption that there is a very large number of households implies that each household is atomistic and its individual influence can be ignored in the analysis. Therefore, none of the households can dictate prices and quantities, and each takes them as given.
The representative household earns market wage w for each labor unit. It owns the capital
stocks in the economy which at time t are at the level of kt. The household earns market interest
rate of r for each unit of supplied capital. At time t, the proceeds from labor and capital are equal
to wtlt + rtkt. The household uses these proceeds to fund its consumption and savings. Therefore,
its budget constraint is given by
wtlt + rtkt = ct + st.
Savings are translated into investments
st = it.
Investments create new capital according to the law of motion of capital:
kt+1 = it + (1− δ) kt,
where δ ∈ (0, 1) is the rate of depreciation of capital.
This implies that the budget constraint of the household can be rewritten as
wtlt + rtkt + (1− δ) kt − ct − kt+1 = 0.
The household maximizes its lifetime utility subject to the budget constraint. Formally, it solves the following problem:
$$\max_{\{c_t, l_t, k_{t+1}\}_{t=0}^{+\infty}} E_0\left[\sum_{t=0}^{+\infty}\beta^t u(c_t, 1 - l_t)\right]$$
s.t.
$$w_t l_t + r_t k_t + (1 - \delta)k_t - c_t - k_{t+1} = 0,$$
$$k_0 > 0 \text{ given}.$$
We will use a Lagrangian to solve this problem. We define the Lagrangian as
$$L = E_0\left[\sum_{t=0}^{+\infty}\beta^t\left\{u(c_t, 1 - l_t) + q_t\left[w_t l_t + r_t k_t + (1 - \delta)k_t - c_t - k_{t+1}\right]\right\}\right].$$
Imagine that it is now some time t so that the household knows all the values of the variables
(including random variables) for time t but does not know the t+ 1 values. In such a case, take the
derivative of the Lagrangian with respect to ct and lt and set it to zero[∂L
∂ct= 0
]:
1
ct= qt,[
∂L
∂lt= 0
]:
γ
1− lt= qtwt.
Take also the derivative of the Lagrangian with respect to kt+1 and set it to zero. Notice that kt+1shows up in the Lagrangian at time t as kt+1 and at time t+ 1 in rt+1kt+1 + (1− δ) kt+1[
∂L
∂kt+1= 0
]: qt = βEtqt+1 [rt+1 + (1− δ)] .
We have Et in front of qt+1 [rt+1 + (1− δ)] because t+1 variables are subject to random shocks and
the household uses rational expectations to predict them. Notice that given the values of kt and itthe value of kt+1 is uniquely determined.
The first equation tells that the marginal utility of consumption, $1/c_t$, is equal to $q_t$, which is called
the shadow value of the marginal unit of capital. The second equation is the supply of labor. The
benefit of supplying a marginal unit of labor is $w_t$, which is the real wage rate and is measured in
units of consumption goods. In the second equation, $w_t$ is scaled by $q_t$, and $q_t w_t$ is the benefit
of supplying a marginal unit of labor in terms of utility. In equilibrium, the household should be
indifferent between marginally increasing its labor supply and marginally increasing its leisure time.
Therefore, the benefit in terms of more consumption because of labor should be equal to the utility loss
because of less time in leisure. The marginal disutility of labor is given by $\gamma/(1-l_t)$.
Plug the first equation into the third equation to find
$$\frac{1}{c_t} = \beta E_t\left[\frac{1}{c_{t+1}}\left[r_{t+1} + (1-\delta)\right]\right].$$
This equation is called the Euler equation. It equates the marginal utility of consumption to the
discounted value of the marginal utility of consumption times the earned interest. It states that in
equilibrium the household should be indifferent between consuming a marginal unit of the good right
now and delaying its consumption to the next period, saving it to earn a gross return (net of
depreciation) of $r_{t+1} + (1-\delta)$. In short, it tells that in equilibrium the household should be
indifferent between consumption and saving.
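For any given level of prices, the household has three variables to solve for ($c_t$, $l_t$, $k_{t+1}$) and three
equations:
$$\text{[labor supply]}: \quad l_t = 1 - \gamma \frac{c_t}{w_t},$$
$$\text{[consumption]}: \quad \frac{1}{c_t} = \beta E_t\left[\frac{1}{c_{t+1}}\left[r_{t+1} + (1-\delta)\right]\right],$$
$$\text{[investment]}: \quad w_t l_t + r_t k_t + (1-\delta) k_t - c_t - k_{t+1} = 0.$$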
Notice that labor supply increases with the real wage and declines with consumption. This is because
the household's consumption reflects how well off it is. If the household anticipates higher wealth then
it would accordingly increase consumption and reduce its current labor supply (have you ever seen
lazy millionaires?).
In the standard neoclassical model, labor supply is inelastic. Therefore, changes in, for example,
government expenditures have no real effects on output. In this framework, however, they can.
Imagine an increase in government expenditures. The household knows that this is going
to be followed by a tax hike to cover the expenses. Therefore, it expects to have lower wealth.
The anticipation of lower wealth would lead to lower consumption and would increase current labor
supply. Higher labor supply would increase output, which we discuss below.
A very large number of identical firms produce consumption goods. The representative firm has
a Cobb-Douglas production technology:
$$y_t = A_t k_t^{1-\alpha} l_t^{\alpha},$$
where $\alpha \in (0, 1)$ and $A_t$ is the technology level. Higher/lower $A_t$ implies a higher/lower level of output
for given levels of capital and labor.
At each period of time, the representative firm solves the following problem
$$\pi_t = \max_{k_t, l_t}\ y_t - w_t l_t - r_t k_t$$
s.t.
$$y_t = A_t k_t^{1-\alpha} l_t^{\alpha}.$$
Therefore, its demands for capital and labor are given by
$$\left[\frac{\partial \pi_t}{\partial k_t} = 0\right]: \quad r_t = (1-\alpha)\frac{y_t}{k_t},$$
$$\left[\frac{\partial \pi_t}{\partial l_t} = 0\right]: \quad w_t = \alpha\frac{y_t}{l_t}.$$
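These equations characterize the production/supply side. In equilibrium, supply and demand
are equal. This corresponds to plugging these two equations (supply prices) into the three equations
above (equations for the demand side). If we do so then we obtain the following three equations:
$$l_t = 1 - \frac{\gamma}{\alpha} c_t \frac{l_t}{A_t k_t^{1-\alpha} l_t^{\alpha}}, \tag{33}$$
$$\frac{1}{c_t} = \beta E_t\left[\frac{1}{c_{t+1}}\left[(1-\alpha)\frac{A_{t+1} k_{t+1}^{1-\alpha} l_{t+1}^{\alpha}}{k_{t+1}} + (1-\delta)\right]\right], \tag{34}$$
$$A_t k_t^{1-\alpha} l_t^{\alpha} + (1-\delta) k_t - c_t - k_{t+1} = 0. \tag{35}$$
In these equations, the variables are $c$, $k$, $l$, and $A$. We need an additional equation for $A$ to have
4 equations and 4 variables. We will assume that
$$\ln A_t = \rho \ln A_{t-1} + \eta_t, \tag{36}$$
where $\eta_t$ is an i.i.d. random variable with mean 0 and variance $\sigma^2$. Here $\rho$ denotes the
persistence of technology, not the discount rate introduced earlier.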
These 4 equations constitute the basic real business cycles model. The shocks originate in
equation (36) and are $\eta_t$. As discussed before, $\ln A_t$ is computed using capital and labor series.
With the production function used here, the formula is
$$\ln A_t = \ln Y_t - (1-\alpha) \ln K_t - \alpha \ln L_t.$$
The variance of the shocks $\eta$ and the persistence $\rho$ are obtained by running a regression of the form
(36). One of the major points of Kydland and Prescott (1982) is that many of the real-world
business cycle facts summarized in Table 1 above can be matched using this model and the estimated
shocks.
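The estimation step described above can be illustrated with a small sketch. It simulates process (36) and then recovers $\rho$ and the standard deviation of $\eta$ from a regression of $\ln A_t$ on $\ln A_{t-1}$ without an intercept. The parameter values are the ones these notes quote from King and Rebelo (1999); the function names, the sample length, and the random seed are assumptions made only for this illustration.

```python
import random


def simulate_log_tfp(rho, sigma_eta, periods, seed=0):
    """Simulate ln A_t = rho * ln A_{t-1} + eta_t, starting from ln A_0 = 0."""
    rng = random.Random(seed)  # fixed seed so the run is reproducible
    log_a = [0.0]
    for _ in range(periods):
        eta = rng.gauss(0.0, sigma_eta)  # i.i.d. shock with mean zero
        log_a.append(rho * log_a[-1] + eta)
    return log_a


def estimate_ar1(log_a):
    """OLS of ln A_t on ln A_{t-1} without intercept; returns (rho_hat, sigma_hat)."""
    lagged, current = log_a[:-1], log_a[1:]
    rho_hat = sum(x * y for x, y in zip(lagged, current)) / sum(x * x for x in lagged)
    residuals = [y - rho_hat * x for x, y in zip(lagged, current)]
    sigma_hat = (sum(e * e for e in residuals) / len(residuals)) ** 0.5
    return rho_hat, sigma_hat


series = simulate_log_tfp(rho=0.979, sigma_eta=0.0072, periods=20000)
rho_hat, sigma_hat = estimate_ar1(series)
print(rho_hat, sigma_hat)
```

With a long sample the estimates should lie close to the values used in the simulation. With actual data the series for $\ln A_t$ would come from the Solow residual rather than from a simulation.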
It is important to stress that in this model shocks propagate to other variables and persist
over time. A shock that arrives at time $t$ persists in the economy for several periods for
two reasons. Suppose that a positive shock has arrived, so that the values of $\eta$ and $A$ are higher
than expected. The first reason is that the arrival of a positive shock increases the
marginal product of labor for a given value of $k$ and increases output. Some of that additional
output will be consumed and the remainder will be used for investments. Higher investments
will create a higher amount of capital in the next period. Therefore, in the next period the amount of
capital will be higher than if there had been no positive shock. This will imply a higher marginal product
of labor and higher output, leading again to higher investments than there would have been without
the positive shock. The second reason is that $A$ is autocorrelated, as measured by $\rho$. A higher than
expected shock $\eta$ will then persist in $A$. It turns out that usually this channel is quantitatively
the most important one. Therefore, the magnitude of the persistence $\rho$ is very important for RBC
models.
How do we solve equations (33)-(36) for $c$, $k$, $l$, and $A$? For a fairly general set of parameters
we cannot derive the analytic solution of this model. To find the solution of this model we then
run computer simulations.16 There are certain important insights, however, that we can draw
from this system of equations without explicitly solving them.
16 Usually this is done for the log-linearized versions of these equations around the steady state of the system. For a function $f(x)$ this involves writing a first order logarithmic approximation: $f(x) \approx f(x^*) + f'(x^*) x^* (\ln x - \ln x^*)$.
Steady state and technology shocks: We will say that our model economy is in a "steady
state" when all variables in equations (33)-(36) are time invariant. We will say that the economy is
in a "deterministic" steady state if there are no shocks, η, in the economy.
Suppose that in the long-run there are no shocks. It is possible to show that then in the long
term all variables in equations (33)-(36) are time invariant if |ρ| < 1. Let’s assume that |ρ| < 1.
According to the table from King and Rebelo (1999) this is fine at least for the US.17
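Suppose that in the steady state we have a given value for $A$ (e.g., $A = 1$). Using (33)-(35) it is
easy to show that in the steady state we have
$$l = \frac{\alpha\left[\frac{1}{\beta} - (1-\delta)\right]}{\alpha\left[\frac{1}{\beta} - (1-\delta)\right] + \gamma\left[\frac{1}{\beta} - (1-\delta) - (1-\alpha)\delta\right]},$$
$$k = \left[\frac{1-\alpha}{\frac{1}{\beta} - (1-\delta)} A\right]^{\frac{1}{\alpha}} l,$$
$$c = \frac{\frac{1}{\beta} - (1-\delta) - (1-\alpha)\delta}{1-\alpha}\left[\frac{1-\alpha}{\frac{1}{\beta} - (1-\delta)} A\right]^{\frac{1}{\alpha}} l.$$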
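Let's consider now a permanent increase in the value of $A$ to $A'$. Such a permanent increase
would imply a new steady state where
$$l = \frac{\alpha\left[\frac{1}{\beta} - (1-\delta)\right]}{\alpha\left[\frac{1}{\beta} - (1-\delta)\right] + \gamma\left[\frac{1}{\beta} - (1-\delta) - (1-\alpha)\delta\right]},$$
$$k' = \left[\frac{1-\alpha}{\frac{1}{\beta} - (1-\delta)} A'\right]^{\frac{1}{\alpha}} l,$$
$$c' = \frac{\frac{1}{\beta} - (1-\delta) - (1-\alpha)\delta}{1-\alpha}\left[\frac{1-\alpha}{\frac{1}{\beta} - (1-\delta)} A'\right]^{\frac{1}{\alpha}} l.$$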
Clearly, in this new steady state capital stock and consumption are higher than in the old steady
state: k′ > k and c′ > c. Therefore, positive (negative) technology shocks increase (reduce) the
steady state levels of capital, output, and consumption.
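As a numerical check of the steady-state expressions, the sketch below evaluates them and verifies that they set the time-invariant versions of (33)-(35) to zero. The parameter defaults are the values these notes quote from King and Rebelo (1999); the function names and the illustrative value $A' = 1.05$ are assumptions made only for this example.

```python
def steady_state(A, alpha=0.667, beta=0.984, delta=0.025, gamma=3.48):
    """Deterministic steady state (l, k, c) of equations (33)-(35) for a given A."""
    R = 1.0 / beta - (1.0 - delta)  # steady-state rental rate of capital, from (34)
    l = alpha * R / (alpha * R + gamma * (R - (1.0 - alpha) * delta))
    k = ((1.0 - alpha) * A / R) ** (1.0 / alpha) * l
    c = (R - (1.0 - alpha) * delta) / (1.0 - alpha) * k
    return l, k, c


def residuals(A, l, k, c, alpha=0.667, beta=0.984, delta=0.025, gamma=3.48):
    """Residuals of (33), (34), (35) when all variables are time invariant."""
    y = A * k ** (1.0 - alpha) * l ** alpha
    labor = l - (1.0 - gamma * c * l / (alpha * y))
    euler = 1.0 - beta * ((1.0 - alpha) * y / k + 1.0 - delta)  # (34) multiplied by c
    resource = y - delta * k - c
    return labor, euler, resource


l_old, k_old, c_old = steady_state(A=1.0)
l_new, k_new, c_new = steady_state(A=1.05)  # hypothetical permanent increase to A'
print(l_old, k_old, c_old)
print(l_new, k_new, c_new)
```

In this sketch labor is the same in both steady states, while capital and consumption are higher under $A'$, in line with the comparison above.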
Transition and technology shocks: Consider a model economy which starts at a deterministic
steady state and receives a positive technology shock so that $A$ increases to $A'$. We know that the
new steady state of this economy features higher levels of capital, output, and consumption. How
does the economy get to the new steady state? If there are no further shocks then it can be shown
from (33)-(35) that the economy will gradually transit toward the new steady state. During this
transition, capital, output, and consumption will increase.
17 If $|\rho| > 1$ then we will simply need to consider a "detrended" version of the model where we subtract the growth of $A$ from $A$, $Y$, $K$, and $C$.
For a more general discussion, we need to consider an economy that starts at a deterministic
steady state and periodically receives technology shocks which imply different values of $K$, $Y$, and
$C$ in the future steady state. After each shock the economy starts adjusting/transiting toward the new
steady state.
Since we are not able to solve equations (33)-(36) analytically, we need to simulate the model
economy on a computer in order to assess its performance in terms of generated cycles in output,
investments, consumption, and other variables. In order to simulate it we need to use values for the
parameters. The usual values of the parameters (for the US) are
$$\alpha = 0.667, \quad \beta = 0.984, \quad \delta = 0.025, \quad \gamma = 3.48, \quad \rho = 0.979, \quad \sigma_\eta = 0.0072.^{18}$$
The simulations provide the simulated values of the model variables. We then use the definitions
of the standard deviation and correlations to check the model predictions. The following table,
borrowed from King and Rebelo (1999), summarizes the results from this exercise.
The simulation results indicate that the model closely matches the volatility of output, investments,
consumption, and wages. It also matches autocorrelations and the observed procyclicality
(positive correlations with output) of almost all variables.
Although this "simple" model performs astonishingly well in matching these data moments, it
fails in matching the cyclicality of labor hours and the interest rate. Hansen (1985) has suggested a
way to circumvent the problem with labor supply by making it an indivisible choice at the individual
level so that at the macro level the elasticity of labor supply to shocks is very high. However,
that might create excess volatility in wages. To match the volatility of the interest rate and wage
rates, this simple real business cycle model is complemented with price rigidities. Adding price
rigidities, however, makes the model much less tractable and brings back Keynesian arguments.
We discuss one of the ways price rigidities are usually modelled in these frameworks in the next
section.19
18 See Table 2 in King and Rebelo (1999).
Price Rigidities - The Calvo (1983) model
Classical and Keynesian schools differ the most in their view of how markets work. The classical (and
neo-classical) school of thought conjectures that markets are perfect (i.e., frictions and distortions
are insignificant). In this respect, it conjectures that prices freely adjust to equilibrate demand
and supply in all markets. The classical dichotomy holds under this conjecture, i.e., money supply
does not matter for real variables. The Keynesian (and new-Keynesian) school of thought conjectures
that there are significant frictions and distortions. In particular, these imperfections create price
rigidities. Therefore, prices do not adjust (fully) to equilibrate demand and supply in all markets.
This breaks the classical dichotomy. Moreover, in such a setup shifts in demand and supply can affect
output through prices too.20
There are two common ways of modelling price rigidity. The easy way assumes that
changes in prices depend on time (time-dependent models). For example, Taylor (1980) assumes
that firms change their prices every $n$-th period and that in each period a fraction $1/n$ of firms change their
prices. Calvo (1983) assumes that in each period with some probability some of the firms can
change their prices. More precisely, Calvo (1983) assumes that at the beginning of each period a
random event decides which of the firms can change their prices and which of the firms cannot. A
more cumbersome, but more appealing, way of modelling price rigidity assumes that prices depend
on the state of the economy (menu cost models). In these models firms change prices when the
expected benefit of changing their prices is higher than the cost of changing prices (the menu cost). The
complications arise because the expected benefit of changing prices depends on the current and future
states of the economy.
DSGE models with Calvo (1983) style price rigidities tend to be the main workhorses in the
central banks. We will now cover a version of the Calvo (1983) model. It will provide us with an
upward sloping supply curve/Phillips curve.
Time is discrete. The economy is populated by a continuum of mass one of identical and infinitely
lived households. The representative household derives instantaneous utility from consumption of
a basket of goods. The lifetime utility of the household is given by
$$U = \sum_{t=0}^{+\infty} \beta^t C_t,$$
where $C$ is a constant elasticity of substitution basket of goods $i$,
$$C = \left(\int_0^1 C_i^{\sigma}\, di\right)^{\frac{1}{\sigma}}, \tag{37}$$
where $\sigma \in (0, 1)$. The elasticity of substitution between any pair $C_i$ and $C_j$ ($i \neq j$) is $\frac{1}{1-\sigma}$.21 Denote
it by $\theta$.
The households spend their entire income on purchases of $C$. Therefore, the representative
household's budget constraint is
$$P_C C = \int_0^1 P_i C_i\, di, \tag{38}$$
where $P_C$ is the price of $C$ and $P_i$ is the price of good $i$.
19 Another criticism that applies to the RBC models is their reliance on the Solow residual, which is usually not very well estimated.
20 The assumption that prices are very rigid finds limited support in microeconomic data.
The household chooses its demand for different goods to maximize its lifetime utility. Since it
has no dynamic decisions (and therefore no inter-temporal trade-offs), the maximization of lifetime
utility is equivalent to the maximization of instantaneous utility streams. The problem of the
representative household can be divided into two steps. In the first step the representative household
chooses $C$ to maximize its utility. In the second step it chooses $\{C_i\}_{i \in [0,1]}$ to maximize $C$. Therefore,
in the first step it solves
$$\max_{C}\ C - \lambda\left[P_C C - \int_0^1 P_i C_i\, di\right],$$
for any time $t$. The solution of this problem gives the shadow value of marginally relaxing the
budget constraint
$$1 = \lambda P_C.$$
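In the second step the household solves the following problem
$$\max_{\{C_i\}_{i \in [0,1]}}\ P_C\left(\int_0^1 C_i^{\sigma}\, di\right)^{\frac{1}{\sigma}} - \int_0^1 P_i C_i\, di.$$
The solution of this problem is given by first order conditions for all goods $i$. Treating the integrals
as sums, these first order conditions are
$$[C_i]: \quad P_i = P_C C \frac{C_i^{\sigma-1}}{\tilde{C}}, \tag{39}$$
where $\tilde{C} = \int_0^1 C_i^{\sigma}\, di$. This expression together with (37) and budget constraint (38) implies that
$$P_i^{\frac{\sigma}{\sigma-1}} = \left(\frac{P_C C}{\tilde{C}}\right)^{\frac{\sigma}{\sigma-1}} C_i^{\sigma}.$$
Therefore, it implies that
$$\left(\int_0^1 P_i^{\frac{\sigma}{\sigma-1}}\, di\right)^{\frac{1}{\sigma}} = \left(\frac{P_C C}{\tilde{C}}\right)^{\frac{1}{\sigma-1}}\left(\int_0^1 C_i^{\sigma}\, di\right)^{\frac{1}{\sigma}} = P_C^{\frac{1}{\sigma-1}} C^{\frac{\sigma}{\sigma-1}} \tilde{C}^{-\frac{1}{\sigma-1}} = P_C^{\frac{1}{\sigma-1}}.$$
Finally,
$$P_C = \left(\int_0^1 P_i^{\frac{\sigma}{\sigma-1}}\, di\right)^{\frac{\sigma-1}{\sigma}}, \tag{40}$$
which means that the aggregate price level is an index of the prices of goods $i$.
21 $\varepsilon_{c_i,c_j} = \frac{d \ln(C_i/C_j)}{d \ln\left(\frac{\partial C}{\partial C_j} / \frac{\partial C}{\partial C_i}\right)} = \frac{d \ln(C_i/C_j)}{d \ln\left(C_j^{\sigma-1}/C_i^{\sigma-1}\right)} = \frac{1}{1-\sigma}$.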
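Moreover, the (inverse) demand function implies that
$$P_i = P_C\left(\int_0^1 C_i^{\sigma}\, di\right)^{\frac{1-\sigma}{\sigma}} C_i^{\sigma-1}.$$
Therefore,
$$C_i = \left(\frac{P_C}{P_i}\right)^{\frac{1}{1-\sigma}} C = \left(\frac{P_C}{P_i}\right)^{\theta} C.$$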
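The price index (40) and the demand functions can be checked numerically. The sketch below replaces the continuum of goods with $N$ goods of equal measure $1/N$, which is an assumption made only to have something computable; the function names and the example prices are likewise illustrative. It verifies that the demands $C_i = (P_C/P_i)^{\theta} C$ reproduce the basket (37) and exhaust the budget (38).

```python
def price_index(prices, sigma):
    """Discrete analogue of (40) with N goods of equal measure 1/N."""
    exponent = sigma / (sigma - 1.0)
    mean_term = sum(p ** exponent for p in prices) / len(prices)
    return mean_term ** (1.0 / exponent)


def demands(prices, C, sigma):
    """Demand for each good: C_i = (P_C / P_i)**theta * C with theta = 1 / (1 - sigma)."""
    theta = 1.0 / (1.0 - sigma)
    P_C = price_index(prices, sigma)
    return [(P_C / p) ** theta * C for p in prices]


def basket(quantities, sigma):
    """Discrete analogue of the CES basket (37)."""
    mean_term = sum(q ** sigma for q in quantities) / len(quantities)
    return mean_term ** (1.0 / sigma)


prices = [0.8, 1.0, 1.3, 2.0]  # hypothetical prices of four goods
sigma, C = 0.5, 2.0
quantities = demands(prices, C, sigma)
spending = sum(p * q for p, q in zip(prices, quantities)) / len(prices)
print(basket(quantities, sigma), spending, price_index(prices, sigma) * C)
```

The first printed number should equal $C$, and the last two should coincide, which is the discrete counterpart of $P_C C = \int_0^1 P_i C_i\, di$.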
Each firm produces one good $i$. Firms have a monopoly in their product and set prices. Firms choose
prices to maximize their real profits.
When there are no price rigidities, firms solve
$$\max_{P_i/P_C}\ \frac{P_i}{P_C} C_i - \varphi_i C_i,$$
s.t.
$$C_i = \left(\frac{P_C}{P_i}\right)^{\frac{1}{1-\sigma}} C,$$
where $\varphi_i$ is the marginal cost of producing good $i$. Plugging the demand function
into profit and taking the first order condition with respect to $P_i/P_C$ gives
$$\left[\frac{P_i}{P_C}\right]: \quad \frac{P_i}{P_C} = \frac{\varphi_i}{\sigma}.$$
This expression is the (inverse) supply of good $i$. Under perfect competition the relative price is
equal to marginal cost. Here, because we have monopolists, the price is equal to $\varphi_i/\sigma$, which is greater
than the marginal cost $MC(C_i)$ since $\sigma \in (0, 1)$.
This relation should hold for any time $t$. Assuming that firms are symmetric, in logarithms, the
expression offered above can be written as
$$p_t = \ln \varphi_t - \ln \sigma,$$
where $\ln \sigma < 0$ since $\sigma \in (0, 1)$.
Suppose now that there are price rigidities. At any time $t$ any firm $i$ with probability $\alpha$ has
a sticky price and cannot change its price. With probability $1 - \alpha$ it does not have a sticky price
and can change its price. In this case firms set their prices not knowing when they will be able to
reset them. Therefore, expectations of future prices will matter for their current decisions. A firm
which does not have a sticky price knows that if it sets a price then with probability $\alpha$ this price will
still be in place in the next period. The price lasts exactly 1 period with probability $1 - \alpha$, exactly
2 periods with probability $\alpha(1-\alpha)$, and exactly 3 periods with probability $\alpha^2(1-\alpha)$. The expected
length of time until price reset is given by
$$\sum_{t=0}^{+\infty} \alpha^t = \frac{1}{1-\alpha}.$$
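The expected length of a price spell can be computed in two equivalent ways: by summing the survival probabilities $\alpha^t$, as in the formula above, or by weighting each possible length $k$ with its probability $\alpha^{k-1}(1-\alpha)$. The short sketch below confirms numerically that both give $1/(1-\alpha)$. The value $\alpha = 0.75$, the truncation horizon, and the function name are assumptions made only for the illustration.

```python
def expected_price_duration(alpha, horizon=2000):
    """Expected number of periods a newly set price stays in place (truncated sums)."""
    # Sum of survival probabilities: the price is still in place t periods later.
    survival_sum = sum(alpha ** t for t in range(horizon))
    # Probability-weighted lengths: exactly k periods with probability alpha**(k-1) * (1-alpha).
    weighted_sum = sum(k * alpha ** (k - 1) * (1.0 - alpha) for k in range(1, horizon + 1))
    return survival_sum, weighted_sum


alpha = 0.75  # hypothetical probability of being unable to reset the price
survival_sum, weighted_sum = expected_price_duration(alpha)
print(survival_sum, weighted_sum, 1.0 / (1.0 - alpha))
```

If a period is a quarter, $\alpha = 0.75$ in this sketch corresponds to prices that are reset on average once a year.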
Firms in this case maximize the present discounted value of their profit streams whenever they can
adjust their prices. Assume that at time $t$ firm $i$ is able to adjust its price. It then solves its problem
taking into account that for a certain period of time (indexed by $k$ below) it will not be able to reset
the price. Therefore, it solves at time $t$
$$V_t = \max_{P_{i,t}} E_t\left[\sum_{k=0}^{+\infty} (\alpha\beta)^k \left[\frac{P_{i,t}}{P_{C,t+k}} C_{i,t+k} - \varphi_{i,t+k} C_{i,t+k}\right]\right], \tag{41}$$
s.t.
$$C_{i,t+k} = \left(\frac{P_{C,t+k}}{P_{i,t}}\right)^{\theta} C_{t+k}.$$
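To solve this problem plug $C_{i,t+k}$ into $V_t$ and take the first order condition with respect to $P_{i,t}$. This
exercise gives
$$[P_{i,t}]: \quad 0 = E_t\left[\sum_{k=0}^{+\infty} (\alpha\beta)^k \left[(1-\theta)\frac{P_{i,t}}{P_{C,t+k}} + \theta\varphi_{i,t+k}\right]\left(\frac{P_{i,t}}{P_{C,t+k}}\right)^{-\theta-1} \frac{1}{P_{C,t+k}} C_{t+k}\right].$$
Given that $P_{i,t}$ does not depend on $k$ it can be taken out of the sum and this expression can be
rewritten as
$$0 = E_t\left[\sum_{k=0}^{+\infty} (\alpha\beta)^k \left[(1-\theta)\frac{P_{i,t}}{P_{C,t+k}} + \theta\varphi_{i,t+k}\right] P_{C,t+k}^{\theta} C_{t+k}\right].$$
Therefore, the price of firm $i$ at time $t$ relative to the aggregate price $P_{C,t}$ is given by
$$\frac{P_{i,t}}{P_{C,t}} = \frac{\theta}{\theta-1}\, \frac{\sum_{k=0}^{+\infty} (\alpha\beta)^k E_t\left[\varphi_{i,t+k}\left(\frac{P_{C,t+k}}{P_{C,t}}\right)^{\theta} C_{t+k}\right]}{\sum_{k=0}^{+\infty} (\alpha\beta)^k E_t\left[\left(\frac{P_{C,t+k}}{P_{C,t}}\right)^{\theta-1} C_{t+k}\right]}.$$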
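Let all the firms that are able to set their price at time $t$ be symmetric, i.e., $\varphi_{i,t+k} \equiv \varphi_{t+k}$. Moreover,
let also all the firms which are not able to set prices at time $t$ be symmetric. Therefore, denoting
the price of firms which are able to adjust it at time $t$ by $P_t^a$ gives
$$\frac{P_t^a}{P_{C,t}} = \frac{\theta}{\theta-1}\, \frac{\sum_{k=0}^{+\infty} (\alpha\beta)^k E_t\left[\varphi_{t+k}\left(\frac{P_{C,t+k}}{P_{C,t}}\right)^{\theta} C_{t+k}\right]}{\sum_{k=0}^{+\infty} (\alpha\beta)^k E_t\left[\left(\frac{P_{C,t+k}}{P_{C,t}}\right)^{\theta-1} C_{t+k}\right]}. \tag{42}$$
The aggregate price which prevails is the weighted average of the prices of firms which adjust their price
and firms which do not adjust their price. It is given by (40):
$$P_t^{\frac{\sigma}{\sigma-1}} = \alpha P_{t-1}^{\frac{\sigma}{\sigma-1}} + (1-\alpha)\left(P_t^a\right)^{\frac{\sigma}{\sigma-1}}, \tag{43}$$
where we have dropped subscript $C$.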
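Equations (42) and (43) constitute the new-Keynesian Phillips curve. To see this, first divide (43)
by $P_t^{\frac{\sigma}{\sigma-1}}$ so that it is written in terms of the relative prices $P_{t-1}/P_t$ and $Q_t = P_t^a/P_t$. Next,
consider the first order log-linear approximations of these equations around the steady state, where
all firms are able to set prices. To do so, with a slight abuse of notation let $P_{t-1}$ stand for the relative
price $P_{t-1}/P_t$ and denote
$$f(P_{t-1}, Q_t) = \alpha P_{t-1}^{\frac{\sigma}{\sigma-1}} + (1-\alpha) Q_t^{\frac{\sigma}{\sigma-1}} - 1,$$
$$g(p_{t-1}, q_t) = f\left(e^{p_{t-1}}, e^{q_t}\right) = \alpha e^{\frac{\sigma}{\sigma-1} p_{t-1}} + (1-\alpha) e^{\frac{\sigma}{\sigma-1} q_t} - 1,$$
where $p_{t-1} = \ln P_{t-1}$ and $q_t = \ln Q_t$.
The first order approximation of $g$ around the point $\left(\ln \bar{P}_{t-1}, \ln \bar{Q}_t\right)$ is
$$g(p_{t-1}, q_t) \approx \left[\alpha \bar{P}_{t-1}^{\frac{\sigma}{\sigma-1}} + (1-\alpha) \bar{Q}_t^{\frac{\sigma}{\sigma-1}} - 1\right] + \alpha \frac{\sigma}{\sigma-1} \bar{P}_{t-1}^{\frac{\sigma}{\sigma-1}}\left(p_{t-1} - \ln \bar{P}_{t-1}\right) + (1-\alpha) \frac{\sigma}{\sigma-1} \bar{Q}_t^{\frac{\sigma}{\sigma-1}}\left(q_t - \ln \bar{Q}_t\right).$$
At the steady state $\bar{P}_{t-1} = \bar{Q}_t = 1$. Therefore, around the steady state it has to be that
$$g(p_{t-1}, q_t) \approx \alpha \frac{\sigma}{\sigma-1} p_{t-1} + (1-\alpha) \frac{\sigma}{\sigma-1} q_t.$$
Since $f(P_{t-1}, Q_t) = g(\ln P_{t-1}, \ln Q_t)$ and $f(P_{t-1}, Q_t) \equiv 0$ we have that around the steady state
$$0 = \alpha p_{t-1} + (1-\alpha) q_t.$$
Notice that $p_{t-1} = \ln \frac{P_{t-1}}{P_t}$, which is approximately minus 1 times the inflation rate. Therefore,
$$\pi_t = \frac{1-\alpha}{\alpha} q_t.$$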
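Now consider the log-linear approximation of (42) around the steady state assuming for simplicity
that in the steady state $C = 1$. To do so rewrite it, with $Q_t = P_t^a/P_{C,t}$, as
$$Q_t \sum_{k=0}^{+\infty} (\alpha\beta)^k E_t\left[\left(\frac{P_{C,t+k}}{P_{C,t}}\right)^{\theta-1} C_{t+k}\right] = \frac{\theta}{\theta-1} \sum_{k=0}^{+\infty} (\alpha\beta)^k E_t\left[\varphi_{t+k}\left(\frac{P_{C,t+k}}{P_{C,t}}\right)^{\theta} C_{t+k}\right].$$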
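The approximation of the left hand side around the steady-state point is as follows. Write the left
hand side in terms of the logarithms $q_t = \ln Q_t$, $c_{t+k} = \ln C_{t+k}$, and $p_{t+k} - p_t = \ln \frac{P_{t+k}}{P_t}$, and denote
$$g\left(q_t, c_{t+k}, p_{t+k} - p_t\right) = e^{q_t} \sum_{k=0}^{+\infty} (\alpha\beta)^k E_t\left[e^{(\theta-1)(p_{t+k} - p_t)} e^{c_{t+k}}\right].$$
The first order log-linear approximation of $g$ around the point $\left(\ln \bar{Q}_t, \ln \bar{C}_{t+k}, \ln \frac{\bar{P}_{t+k}}{\bar{P}_t}\right)$ is
$$g \approx \bar{Q}_t \sum_{k=0}^{+\infty} (\alpha\beta)^k \left(\frac{\bar{P}_{t+k}}{\bar{P}_t}\right)^{\theta-1} \bar{C}_{t+k} + \left(q_t - \ln \bar{Q}_t\right) \bar{Q}_t \sum_{k=0}^{+\infty} (\alpha\beta)^k \left(\frac{\bar{P}_{t+k}}{\bar{P}_t}\right)^{\theta-1} \bar{C}_{t+k}$$
$$+\ \bar{Q}_t \sum_{k=0}^{+\infty} (\alpha\beta)^k \left(\frac{\bar{P}_{t+k}}{\bar{P}_t}\right)^{\theta-1} \bar{C}_{t+k}\left(E_t[c_{t+k}] - \ln \bar{C}_{t+k}\right)$$
$$+\ (\theta-1)\, \bar{Q}_t \sum_{k=0}^{+\infty} (\alpha\beta)^k \left(\frac{\bar{P}_{t+k}}{\bar{P}_t}\right)^{\theta-1} \bar{C}_{t+k}\left(E_t[p_{t+k} - p_t] - \ln \frac{\bar{P}_{t+k}}{\bar{P}_t}\right).$$
Around the steady-state point $\left(0, \ln \bar{C}, 0\right)$ this is given by
$$g \approx \bar{C} \frac{1}{1-\alpha\beta} + q_t \bar{C} \frac{1}{1-\alpha\beta} + \bar{C} \sum_{k=0}^{+\infty} (\alpha\beta)^k \left(E_t[c_{t+k}] - \ln \bar{C}\right) + (\theta-1)\, \bar{C} \sum_{k=0}^{+\infty} (\alpha\beta)^k E_t[p_{t+k} - p_t].$$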
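Therefore, with $\bar{C} = 1$ the approximation of the left hand side is
$$Q_t \sum_{k=0}^{+\infty} (\alpha\beta)^k E_t\left[\left(\frac{P_{t+k}}{P_t}\right)^{\theta-1} C_{t+k}\right] \approx \frac{1}{1-\alpha\beta} + \frac{1}{1-\alpha\beta} q_t + \sum_{k=0}^{+\infty} (\alpha\beta)^k E_t[c_{t+k}] + (\theta-1) \sum_{k=0}^{+\infty} (\alpha\beta)^k \left(E_t[p_{t+k}] - p_t\right).$$
In turn, the approximation of the right hand side is
$$\frac{\theta}{\theta-1} \sum_{k=0}^{+\infty} (\alpha\beta)^k E_t\left[\varphi_{t+k}\left(\frac{P_{t+k}}{P_t}\right)^{\theta} C_{t+k}\right] \approx \frac{1}{1-\alpha\beta} + \frac{\theta}{\theta-1} \sum_{k=0}^{+\infty} (\alpha\beta)^k E_t\left[\hat{\varphi}_{t+k}\right] + \frac{\theta}{\theta-1} \bar{\varphi} \sum_{k=0}^{+\infty} (\alpha\beta)^k E_t[c_{t+k}] + \frac{\theta}{\theta-1} \theta \bar{\varphi} \sum_{k=0}^{+\infty} (\alpha\beta)^k \left(E_t[p_{t+k}] - p_t\right),$$
where $\bar{\varphi}$ is the steady-state value of marginal cost and $\hat{\varphi}_{t+k} = \bar{\varphi} \ln \frac{\varphi_{t+k}}{\bar{\varphi}}$ is its deviation from the steady state.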
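Equate these expressions and notice that $\frac{\theta}{\theta-1} \bar{\varphi} = 1$ to get
$$q_t + p_t = (1-\alpha\beta) \sum_{k=0}^{+\infty} (\alpha\beta)^k E_t\left[p_{t+k} + \ln \frac{\varphi_{t+k}}{\bar{\varphi}}\right].$$
Denote $E_{t-1} z_t = q_t + p_t$. The above equation is the solution of the following difference equation,
where $z_0 = 0$,
$$\alpha\beta E_t z_{t+1} - E_{t-1} z_t = -(1-\alpha\beta)\left[p_t + \ln \frac{\varphi_t}{\bar{\varphi}}\right].$$
Solving for $q_t$ from this equation, and using $\pi_{t+1} = p_{t+1} - p_t$, gives
$$q_t = \alpha\beta E_t\left[q_{t+1} + \pi_{t+1}\right] + (1-\alpha\beta) \ln \frac{\varphi_t}{\bar{\varphi}}.$$
Using the relation $\frac{\alpha}{1-\alpha} \pi_t = q_t$ gives
$$\pi_t = \beta E_t[\pi_{t+1}] + \frac{1-\alpha}{\alpha} (1-\alpha\beta) \ln \frac{\varphi_t}{\bar{\varphi}}.$$
This is the new-Keynesian Phillips curve. The difference between it and the Keynesian Phillips curve
is that it contains no backward-looking terms. Instead, forward-looking (expected) inflation
matters.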
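To see what the forward-looking term implies, the sketch below iterates the new-Keynesian Phillips curve forward, so that inflation equals the slope times the discounted sum of expected future marginal cost deviations. It assumes, purely for illustration, that the log deviation of marginal cost follows an AR(1) process with persistence `rho_mc`, and that discounted expected inflation vanishes in the limit. Neither assumption is part of the derivation above, and all names and parameter values are hypothetical.

```python
def nkpc_slope(alpha, beta):
    """Coefficient on marginal cost in the Phillips curve: (1 - alpha)(1 - alpha*beta)/alpha."""
    return (1.0 - alpha) * (1.0 - alpha * beta) / alpha


def inflation_forward_sum(mc_hat, alpha, beta, rho_mc, horizon=2000):
    """Inflation as slope * sum_k beta**k * E_t[mc_{t+k}], with E_t[mc_{t+k}] = rho_mc**k * mc_t."""
    kappa = nkpc_slope(alpha, beta)
    return kappa * sum((beta * rho_mc) ** k * mc_hat for k in range(horizon))


def inflation_closed_form(mc_hat, alpha, beta, rho_mc):
    """Closed form of the same sum: slope * mc_t / (1 - beta * rho_mc)."""
    return nkpc_slope(alpha, beta) * mc_hat / (1.0 - beta * rho_mc)


alpha, beta, rho_mc = 0.75, 0.99, 0.9  # hypothetical Calvo probability, discount factor, persistence
mc_hat = 0.02                          # marginal cost 2 percent above its steady-state value
pi_today = inflation_closed_form(mc_hat, alpha, beta, rho_mc)
pi_next_expected = inflation_closed_form(rho_mc * mc_hat, alpha, beta, rho_mc)
print(pi_today, inflation_forward_sum(mc_hat, alpha, beta, rho_mc))
```

In this sketch a higher probability of sticky prices $\alpha$ lowers the slope, so inflation responds less to a given deviation of marginal cost.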
Expectations and financial markets
Bonds
Bonds are (usually) very primitive financial instruments. They are a form of loan (an "I owe you", or IOU).
More precisely, a bond is a certificate which states that the issuer is indebted to the owner of the
certificate. It states the amount of the debt to be paid back (principal) and the time when the debt
has to be paid (maturity date). Depending on the type of the bond, the issuer might also be
obliged to pay the owner interest (coupon). Interest is usually payable at given and fixed intervals
(e.g., monthly or annually). Quite often bonds are negotiable in the sense that the ownership can be
transferred. This makes certain types of bonds highly liquid.
Firms tend to issue bonds in order to meet their financial requirements. According to the Pecking
Order Theory (of a firm's capital structure), issuing debt/bonds is firms' second most preferred way
to raise finance for investments. The most preferred way is to use internal finance, and the least
preferred way is to raise finance through issuing equity. In this respect, equity (stocks) and bonds are
quite similar. They are both financial instruments for firms. However, there is a big difference
between bonds and equity/stocks. The stockholders of a firm are investors since they have equity
stakes (ownership rights) in the firm. In turn, the bondholders have creditor stakes in the firm
since they have lent money to the firm. Bondholders are creditors. Therefore, they usually have
(absolute) priority over stockholders and will be repaid first in the event of bankruptcy. Another
and less important difference is that bonds usually have a defined maturity (consols or annuities
don't have a maturity but are very rarely issued). In contrast, stocks are typically issued for an indefinite
period. Not many firms, however, issue bonds. Moreover, firms which issue bonds typically are
quite large.
In almost all countries one of the most significant issuers of bonds is the government. The
government uses bonds in order to take loans. The government uses these loans, together with taxes,
in order to cover its expenditures. For example, during recessions tax proceeds usually decline.
In turn, governments implement counter-cyclical policies and increase their expenditures. They
then finance these expenditures using loans/bonds.
Government bonds tend to be sold at auctions, where participants are, for example, banks,
investment and hedge funds, various speculators, the central bank, etc. (You can imagine something
like a stock/derivative exchange. In fact, bond futures are massively traded on derivative
exchanges.) These markets are usually very liquid.22 One of the most liquid markets is the market
for US government bonds (US Treasury Bonds). Currently (18.03.2014), US debt issued in various
forms of bonds and IOUs amounts to $17,546,814,482,078.90 (yes! more than $17 trillion.)
There are many (and very complicated) types of bonds. For our purposes it is sufficient to
consider the following simplistic examples. Hereafter, in our discussion in this chapter we will
consider perfect competition in all (bond) markets and no lending constraints (i.e., think about a
very liquid bond market where some of the participants can print money). Bond 1 matures in 1
year and pays principal $P$. What is the fair market value/price of such a bond? Let the yearly nominal
interest rate be $i_1^1$, where the subscript stands for the compounding time (1 year) and the superscript
stands for the time when compounding will happen (in 1 year). Then the price of this bond is the
discounted value of the principal
$$P_{B1} = \frac{P}{1 + i_1^1}.$$
22 Very recently, because of the financial crisis (and, perhaps, reckless lending/borrowing), bond markets dried up in several EU countries. The governments of these countries then faced tough financial constraints and asked for loans (i.e., for bond purchases) from the ECB, etc.
Bond 2 matures in 3 years and pays principal $P$. What is the fair market value/price of Bond 2?
Let the yearly interest rate be the same across years and, as in the previous example, let it be equal to
$i_1^1$. The price of Bond 2 is the discounted value of the principal. It is
$$P_{B2} = \frac{P}{\left(1 + i_1^1\right)^3}.$$
Clearly, as long as $i_1^1 > 0$, $P_{B1} > P_{B2}$ since Bond 1 matures earlier than Bond 2.
Notice that bank loans seem to have a somewhat different structure. Usually, when an individual
takes a loan of $L$ (EUR) for a year from a bank, the loan specifies the yearly nominal interest rate $i$.
Therefore, the individual has to repay $(1 + i) L$. Denote the latter by $\bar{L}$. In such a case taking a
loan is equivalent to issuing a bond with principal $\bar{L}$. Therefore, there is almost no difference.
Bond 3 matures in $T$ years and pays principal $P$. Let the yearly interest rate be $i_1^1$. The price
of Bond 3 is
$$P_{B3} = \frac{P}{\left(1 + i_1^1\right)^T}.$$
Bond 4 matures in $T$ years and pays principal $P$. Moreover, it pays yearly coupons $C$. Let the
yearly interest rate be $i_1^1$. The price of Bond 4 is
$$P_{B4} = \frac{P}{\left(1 + i_1^1\right)^T} + \sum_{t=1}^{T} \frac{C}{\left(1 + i_1^1\right)^t}.$$
Bond 5 is exactly the same as Bond 4 but its coupon payments vary over time so that we have $C_t$
instead of $C$. The price of Bond 5 is
$$P_{B5} = \frac{P}{\left(1 + i_1^1\right)^T} + \sum_{t=1}^{T} \frac{C_t}{\left(1 + i_1^1\right)^t}.$$
Consider again Bond 2. Assume now, however, that the interest rate during the first year is $i_1^1$, in
the second year it is $i_1^2$, and in the third year it is $i_1^3$. In such a case, the price of Bond 2 is
$$P_{B2} = \frac{P}{\left(1 + i_1^1\right)\left(1 + i_1^2\right)\left(1 + i_1^3\right)}.$$
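In such a case the prices of Bond 3, Bond 4, and Bond 5 would be
$$P_{B3} = P \prod_{t=1}^{T} \frac{1}{1 + i_1^t},$$
$$P_{B4} = P \prod_{t=1}^{T} \frac{1}{1 + i_1^t} + \sum_{t=1}^{T} C \prod_{\tau=1}^{t} \frac{1}{1 + i_1^{\tau}},$$
$$P_{B5} = P \prod_{t=1}^{T} \frac{1}{1 + i_1^t} + \sum_{t=1}^{T} C_t \prod_{\tau=1}^{t} \frac{1}{1 + i_1^{\tau}}.$$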
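The five bond examples are special cases of one pricing rule: discount every payment by the product of one plus the yearly rates up to the payment date. A minimal sketch of that rule follows. The function name and the example numbers are assumptions made for illustration, and coupons are taken to be paid at the end of each year, as in the formulas above.

```python
def bond_price(principal, rates, coupons=None):
    """Price of a bond maturing in T = len(rates) years.

    rates[t-1] is the one-year rate for year t and coupons[t-1] is the coupon
    paid at the end of year t (all zero for a zero coupon bond).
    """
    if coupons is None:
        coupons = [0.0] * len(rates)
    price = 0.0
    discount = 1.0  # running product of 1 / (1 + rate) over the years so far
    for rate, coupon in zip(rates, coupons):
        discount /= 1.0 + rate
        price += coupon * discount
    return price + principal * discount


bond_1 = bond_price(100.0, [0.05])                   # one year, no coupons
bond_2 = bond_price(100.0, [0.05, 0.05, 0.05])       # three years, constant rate
bond_2_varying = bond_price(100.0, [0.03, 0.05, 0.07])
bond_4 = bond_price(100.0, [0.05] * 10, [5.0] * 10)  # yearly coupon equal to 5% of principal
print(bond_1, bond_2, bond_2_varying, bond_4)
```

In the last example the coupon rate equals the interest rate, so the bond is priced at its principal.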
Bonds 1, 2, and 3 are called zero coupon bonds since they pay no coupons. In finance the interest
rate is usually called the yield since it indicates what a loan (an investment) yields. The time sequence
of (similar) interest rates is termed the yield curve or the term structure of interest rates. In our examples,
the yield curve is $i_1^1, i_1^2, i_1^3, \ldots, i_1^T$.
Imagine a market where zero coupon bonds are traded which promise the same principal $P$.
Let there be bonds of all possible maturities: 1 year, 2 years, ..., and $T$ years. It is straightforward
to determine the yield curve from the prices of these bonds. With slight abuse of previous notation,
let the prices of bonds with maturities 1 year, 2 years, ..., and $T$ years be $P_{B1}$, $P_{B2}$, ..., and $P_{BT}$,
correspondingly. Therefore, interest rates are given by
$$i_1^1 = \frac{P}{P_{B1}} - 1,$$
$$i_1^2 = \frac{P}{\left(1 + i_1^1\right) P_{B2}} - 1,$$
$$i_1^3 = \frac{P}{\left(1 + i_1^1\right)\left(1 + i_1^2\right) P_{B3}} - 1,$$
$$\ldots$$
$$i_1^T = \left(\prod_{t=1}^{T-1} \frac{1}{1 + i_1^t}\right) \frac{P}{P_{BT}} - 1.$$
Call this algorithm (∗).
Usually yield curves are upward sloping. The following figure illustrates the yield curve of US
Treasury bonds as of 9th of February 2005. It has the interest rate/yield on the Y-axis and time to maturity
on the X-axis.
Clearly, in this example the yield curve is upward sloping and increases at a diminishing rate.
One of the widely accepted explanations for upward sloping yield curves is that longer
maturities entail greater risks for creditors/lenders. Lenders then demand a risk premium, and
a higher premium for longer maturities. This explanation depends on the notion that
creditors (as well as we) currently know less about the distant future than about the very near term.
Another commonly accepted explanation is related to the IS-LM Model. Imagine, for example,
that investors are expecting either a gradual shift of IS to the right or a gradual shift of LM to the
left (or both). In such a case, they would expect to have higher interest rates in the future and trade
accordingly.
This explanation points to a deficiency in our pricing formulas since it points out that interest rates
might be highly variable/random. In such a case, in all our formulas we need to replace interest
rates with their expected values. For example, in the most general case we have for the price of
Bond 5
$$P_{B5} = P \prod_{t=1}^{T} \frac{1}{1 + i_1^{t,e}} + \sum_{t=1}^{T} C_t \prod_{\tau=1}^{t} \frac{1}{1 + i_1^{\tau,e}}.$$
To make things simple consider the case when $C_t \equiv 0$ and $T = 3$. Clearly, this corresponds to Bond
2. The price of the bond is then
$$P_{B2} = \frac{P}{\left(1 + i_1^{1,e}\right)\left(1 + i_1^{2,e}\right)\left(1 + i_1^{3,e}\right)}.$$
Notice that the price of the bond depends negatively on expected interest rates. Therefore, if
creditors are expecting higher rates in the future the price of the bond declines. This is because if
investors are expecting higher rates in the future they would rather wait and invest in the future. Borrowers
in such a case receive less for the same principal. Therefore, again, expectations matter.
Money supply and risk-free bonds
Central banks usually announce their monetary policy in terms of a level of the nominal interest rate.
They adjust their money supply so that they meet the targeted rate. How do they do that? They
buy or sell bonds in the market. To see how this works, consider, again, a deterministic world and
zero coupon bonds. Let's take as an example Bond 1,
$$P_{B1} = \frac{P}{1 + i_1^1}.$$
If the central bank buys many such bonds, $P_{B1}$ would increase. Higher $P_{B1}$ implies lower $i_1^1$.
Conversely, if the central bank sells many such bonds, $P_{B1}$ would decline, which would imply higher
$i_1^1$. When the central bank buys bonds it does so using money. Therefore, it increases money supply.
In contrast, when it sells bonds it receives money. Therefore, it reduces money supply. This implies
that if the central bank announces that it plans to reduce the interest rate then it plans to increase
money supply. Of course, if it announces that it plans to increase the interest rate then it plans to
reduce money supply.
Central banks are (usually) public institutions. Therefore, they try to avoid excess risk.
Central banks then usually buy and sell government bonds, which tend to be regarded as relatively
safe assets. (After all, the government can almost always avoid defaulting on bonds denominated
in national currency by printing money.)
It is commonly thought that short maturity US Treasury Bonds are among the safest government
bonds. Economists usually take them as the risk-free assets/bonds. The difference
between the interest paid on US Treasury Bonds and the interest paid on otherwise equivalent (e.g., in terms
of maturity and coupons) bonds to a certain extent represents the risk premium (i.e., the premium that
investors demand to hold the more risky asset).
Yield curve and zero coupon bonds in reality
In reality there are almost no zero coupon bonds. Usually, zero coupon bonds are only those that
have paid their last coupon and are very close to maturity. This renders our algorithm (∗) almost
useless. For figuring out the yield curve a special algorithm is used. This algorithm is called
bootstrap.
The following example illustrates this algorithm. Imagine we have a bond very close to its
maturity, which has paid all its coupons. Basically we have a bond like B1. We observe its price
and know its principal. Therefore, the interest rate (in a perfect market) is
$$i_{11} = \frac{P}{P_{B1}} - 1.$$
Further, imagine we have a bond which matures in two years and pays a coupon in a year. We know
its price ($P'_{B2}$), principal ($P$), and the value of the coupon that it pays ($C$). Therefore, in a perfect
market it has to be that
$$P'_{B2} = \frac{P}{(1 + i_{11})(1 + i_{21})} + \frac{C}{1 + i_{11}}.$$
It is easy to figure out $i_{21}$ from this expression,
$$i_{21} = \frac{P}{(1 + i_{11}) P'_{B2} - C} - 1.$$
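The two bootstrap steps above can be sketched in a few lines. This is only an illustration under the perfect-market assumption of the text; the function name and the numbers in the usage example are hypothetical.

```python
def bootstrap_two_periods(principal, price_b1, price_b2, coupon):
    """Recover i11 and i21 from the prices of two bonds.

    price_b1: price of a bond with one period left and no coupons left.
    price_b2: price of a two-period bond paying `coupon` after one period.
    """
    # Step 1: i11 = P / P_B1 - 1
    i11 = principal / price_b1 - 1.0
    # Step 2: i21 = P / ((1 + i11) * P'_B2 - C) - 1
    i21 = principal / ((1.0 + i11) * price_b2 - coupon) - 1.0
    return i11, i21


# Usage: build prices from known rates (2% and 3%) and recover them.
p_b1 = 100.0 / 1.02
p_b2 = 100.0 / (1.02 * 1.03) + 5.0 / 1.02
print(bootstrap_two_periods(100.0, p_b1, p_b2, 5.0))
```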
Stocks, stock prices, and stock markets
The stock of a firm is the equity stake of its owners. It represents the residual assets of the
firm that would be paid to stockholders after discharge of all senior claims (e.g., bonds/debt). A
stockholder/shareholder is an individual who legally owns one or more shares of stock in a firm.
Firms (usually) maximize their value by appropriately designing production and marketing. It
turns out that this is equivalent to maximizing their shareholder value.
To see this, suppose that we have a firm which lives T periods and has real profit stream πt
in each period t. Denote real interest rate by r and, for simplicity, assume that it is constant over
time. At time t = 1, the present value of this firm is given by
$$V_1 = \pi_1 + \frac{1}{1+r}\pi_2 + \frac{1}{(1+r)^2}\pi_3 + \dots + \frac{1}{(1+r)^{t-1}}\pi_t + \dots + \frac{1}{(1+r)^{T-1}}\pi_T = \sum_{\tau=1}^{T} \frac{1}{(1+r)^{\tau-1}}\pi_\tau.$$
The present value of the firm V1 is the total present value of its stocks.
Clearly, V1 can be rewritten in the following manner
$$V_1 = \pi_1 + \frac{1}{1+r}V_2,$$
where $\pi_1$ are the profits in the current period ($t = 1$) and $\frac{1}{1+r}V_2$ are the discounted future capital gains.
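The recursive form of the firm's value is easy to verify numerically. The sketch below assumes a hypothetical profit stream and a constant real interest rate of 5%; both are made up for illustration.

```python
def present_value(profits, r):
    """Discounted sum of a profit stream, with the first profit undiscounted."""
    return sum(pi / (1.0 + r) ** k for k, pi in enumerate(profits))


profits = [10.0, 12.0, 8.0, 5.0]  # hypothetical pi_1, ..., pi_T
r = 0.05                          # assumed constant real interest rate

v1 = present_value(profits, r)      # value of the firm at t = 1
v2 = present_value(profits[1:], r)  # value of the firm at t = 2

# V1 = pi_1 + V2 / (1 + r), up to floating point error
print(v1, profits[0] + v2 / (1.0 + r))
```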
For simplicity, assume that this firm has zero net debt. A stockholder/shareholder owns a
share/fraction of V1. Let this fraction be some ω from (0, 1). Therefore, in other words, a shareholder
has an entitlement to ωV1 amount of value (real money). It is clear that the maximization of V1 is
equivalent to maximization of ωV1.
In practice, shareholders usually call profits πt dividends. In such a case it is practical to write
dt instead of πt. In this respect, the value of the firm is simply the discounted sum of dividends.
If this firm is privately or publicly traded in a perfect market then its market value is equal to
its value. Therefore, the price of a 1 percent ownership of the stock of the firm (or the price of 1%
share simply) is 0.01V1. In practice, S is used to denote stock price, instead of V .
Stock markets and speculators
Firms are publicly traded (usually) in stock markets/exchanges. Examples of stock markets are
the New York Stock Exchange and the London Stock Exchange. In a stock market, speculators
trade stocks of many firms simultaneously.
Hereafter, we will assume that the values of firms are exogenously given keeping in mind that
they depend on the profits of the firms. For publicly traded firms, this is equivalent to assuming
that the stock prices of the firms are given.
In a deterministic world, speculators invest in the stocks of those firms which will have the
highest returns. For example, if there are two firms, Firm 1 and Firm 2, which currently have the
same stock price $S$ and are expected to have future stock prices $S_1$ and $S_2$ with $S_2 < S_1$, then the
speculators would buy stocks of Firm 1. This is because the return on buying the stock of Firm 1 ($R_1$)
is higher than the return on buying the stock of Firm 2 ($R_2$),
$$R_1 = \frac{S_1 - S}{S} > \frac{S_2 - S}{S} = R_2.$$
Of course, in complete markets, because of arbitrage the price of the stocks of Firm 1 would increase
so that after all R1 = R2. We have assumed however that the stock prices are given. Therefore, S
does not change in our example.
Stochastic returns and portfolios of stocks: The world is not deterministic in the sense that
future profits of firms are usually not very predictable. Therefore, the future prices of stocks are not
very predictable. Speculators/investors then decide upon expected returns. Moreover, they usually
hold portfolios of stocks.
For example, if N firms are traded in the stock market and there are M speculators/investors,
then speculator j holds a portfolio
$$P_j = \sum_{i=1}^{N} \mu_{i,j} S_i,$$
where $\mu_{i,j}$ is the amount of stock $S_i$ in the portfolio of speculator j. If $\mu_{i,j} = 0$ then speculator j
does not have stocks of Firm i. If $\mu_{i,j} = 1$ then speculator j holds real money equivalent to all the
stocks of Firm i. If $\mu_{i,j} = -1$ then speculator j does not have stocks of Firm i and, moreover,
is in debt to deliver real money equivalent to all stocks of Firm i. In other words, speculator j
has sold short the stocks of Firm i.
Can µi,j be higher than 1 or lower than −1? Yes, it can. This is because Si is the real value
of the stock of Firm i. In such a case contracts can be written promising multiples of the value of
that stock. Consider, for example, a situation when speculator j has all the stocks of Firm i as well
as a certificate promising to pay exactly twice the value of the stock whenever exercised. In such a
case the speculator has real value of 3Si. In other words µi,j = 3.
Clearly, for any firm it has to be that the fractions of its stocks sum to 1 across all speculators,
$$\sum_{j=1}^{M} \mu_{i,j} = 1.$$
In terms of portfolios, speculators/investors select the amounts of stock ownership µ and focus
on the returns of their portfolios. If the current price of a selected portfolio is $P$, then the return on it is
the percentage change of its price over time. Using the notation of our previous example, the return is
$$R_j = \frac{P_j - P}{P}.$$
These returns are stochastic (are random variables) because the future value of the portfolio $P_j$ is a
random variable.
Utility maximization and mean-variance trade-off
Speculators buy stocks to maximize utility. It turns out that generally the problem of speculators
to design an appropriate portfolio can be summarized in terms of expected value and risk trade-off
given that the future value of portfolio is a random variable.
The following example illustrates this point. Consider a speculator who has utility function u (.).
Let u (.) be strictly increasing and concave in portfolio returns. Suppose the returns on portfolios
have a normal distribution with expected value $E$ and variance $\sigma^2$. In such a case, the expected utility
of the speculator is
$$E[u(R)] = \int_{-\infty}^{+\infty} u(R)\, f(R; E, \sigma)\, dR,$$
where
$$f(R; E, \sigma) = \frac{1}{\sqrt{2\pi}\,\sigma}\, e^{-\frac{1}{2}\left(\frac{R-E}{\sigma}\right)^2}.$$
Clearly, the expected utility depends on $E$ and $\sigma^2$, where $\sigma^2$ is a measure of risk.
Use $Z$ to denote
$$Z = \frac{R - E}{\sigma}.$$
Given the properties of expectation and variance operators (see Appendix), Z is a normal random
variable with expected value 0 and variance 1. Z is usually called standardized return.
Replace R with Z in the expression for expected utility (notice that the limits of integration
don’t change):
$$E[u(E + \sigma Z)] = \int_{-\infty}^{+\infty} u(E + \sigma Z)\, \varphi(Z)\, dZ,$$
where $\varphi(Z)$ is the density function of the standard normal distribution,
$$\varphi(Z) = \frac{1}{\sqrt{2\pi}}\, e^{-\frac{1}{2}Z^2}.$$
Next, we take the derivative of the expected utility $E[u]$ with respect to the standard deviation
of return $\sigma$. Assuming that $E[u]$ is finite, the derivative is
$$\frac{dE[u]}{d\sigma} = \int_{-\infty}^{+\infty} \frac{d}{d\sigma} u(E + \sigma Z)\, \varphi(Z)\, dZ = \int_{-\infty}^{+\infty} u'(E + \sigma Z) \left(\frac{dE}{d\sigma} + Z\right) \varphi(Z)\, dZ.$$
An indifference curve is defined as the (locus of) points where $\frac{dE[u]}{d\sigma} = 0$:
$$0 = \int_{-\infty}^{+\infty} u'(E + \sigma Z) \left(\frac{dE}{d\sigma} + Z\right) \varphi(Z)\, dZ.$$
This expression can be further rewritten as
$$\frac{dE}{d\sigma} = -\frac{\int_{-\infty}^{+\infty} u'(E + \sigma Z)\, Z\, \varphi(Z)\, dZ}{\int_{-\infty}^{+\infty} u'(E + \sigma Z)\, \varphi(Z)\, dZ},$$
where $E$ is the expected return of the portfolio and $\sigma$ is its standard deviation. $E(\sigma)$ represents an
indifference curve in the space of expected value $E$ and standard deviation $\sigma$. It is increasing
and convex in $\sigma$. In other words, $\frac{dE}{d\sigma} > 0$ and $\frac{d^2E}{d\sigma^2} > 0$ (see Appendix).
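The slope formula can be evaluated numerically for a concrete utility function. The sketch below assumes an exponential (CARA) utility u(R) = -exp(-aR), which is not used in the notes and is chosen only because the slope then has the simple closed form aσ. The integrals are approximated with a trapezoid rule on [-10, 10]; all names are illustrative.

```python
import math


def phi(z):
    """Density of the standard normal distribution."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def integrate(f, lo=-10.0, hi=10.0, n=4000):
    """Trapezoid rule; the tails beyond +/-10 are negligible here."""
    h = (hi - lo) / n
    total = 0.5 * (f(lo) + f(hi))
    for k in range(1, n):
        total += f(lo + k * h)
    return total * h


def indifference_slope(u_prime, E, sigma):
    """dE/dsigma = -int u'(E+sigma Z) Z phi dZ / int u'(E+sigma Z) phi dZ."""
    num = integrate(lambda z: u_prime(E + sigma * z) * z * phi(z))
    den = integrate(lambda z: u_prime(E + sigma * z) * phi(z))
    return -num / den


a = 2.0  # assumed coefficient of absolute risk aversion
u_prime = lambda R: a * math.exp(-a * R)  # derivative of u(R) = -exp(-a R)

for sigma in (0.1, 0.3, 0.6):
    # For this utility the slope should be close to a * sigma.
    print(sigma, indifference_slope(u_prime, 0.1, sigma), a * sigma)
```

In this special case the slope is positive and grows with σ, in line with the indifference curve being increasing and convex.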
Utility increases as the indifference curve moves from right to left in the standard deviation and mean space.
The following figure shows these indifference curves.
Measures of returns and risk
The expected value of a random variable is its concentration point. If the random variable has a continuous
distribution function, then it will almost never be equal to its expected value. However, its realizations
tend to be closer to its expected value than to other points in the distribution. Therefore,
the expected value of a random variable tends to be the best guess available.
In our previous example, we used variance as a measure of risk. The variance of a random variable
shows how dispersed its outcomes may be. Therefore, it is one of the most commonly applied risk
measures. It is especially relevant for normally distributed random variables.
Digression (Central Limit Theorem): Consider a portfolio consisting of $N$ stocks with i.i.d. returns and
weights $\frac{1}{N}$. According to the Central Limit Theorem, for large $N$ the returns on this portfolio will be
approximately normally distributed. The Central Limit Theorem is a very powerful result. This result is one
of the reasons why the normal distribution is used so often.
There are many other measures of risk. Examples are the range, the semi-inter-quartile range,
the semi-variance, and the mean absolute deviation. Each of these measures can have slightly
different implication (scale) for risk.
The range (RANGE) is defined as the difference between the highest and lowest outcomes. Let
the returns on portfolio $R_j$ be from $[R_j^{\min}, R_j^{\max}]$. Then the range is
$$RANGE = R_j^{\max} - R_j^{\min}.$$
The semi-inter-quartile range (SRANGE) is usually defined as half the difference between the 75th and
25th percentiles of the random variable. In our example,
$$SRANGE = \frac{R_j^{q75} - R_j^{q25}}{2}.$$
The variance is a central moment in the sense that it considers deviations from the mean/expected
value, i.e., $V[R_j] = E\left[(R_j - E[R_j])^2\right]$. In this sense it gives equal weight to deviations above and
below the mean/expected value. However, risk averse investors might be more concerned about returns below
the mean (i.e., downside risk but not upside risk). The semi-variance (SEMIVAR) is a measure of
risk that relates just to that risk. It is defined as
$$\tilde{R}_j = \begin{cases} R_j - E[R_j] & \text{if } R_j < E[R_j] \\ 0 & \text{if } R_j \geq E[R_j] \end{cases}, \qquad SEMIVAR = E\left[\tilde{R}_j^2\right].$$
Variance and semi-variance can be sensitive to observations distant from the mean/expected value
(i.e., outliers). The mean absolute deviation (MAD) is less sensitive to this problem. It is defined as
$$MAD = E\left[\,|R_j - E[R_j]|\,\right].$$
Hereafter, we will maintain the assumption that returns on stocks/portfolios have a normal distribution.
Therefore, we will characterize the risk with variance (or its square root: standard deviation).
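To see how these measures differ in scale, they can be computed for a small sample of returns. The sketch below treats each observation as equally likely, so expectations become simple averages; the sample values and the function name are hypothetical. The quartile-based measure is left out because sample quantile conventions differ.

```python
def risk_measures(returns):
    """Range, variance, semi-variance and MAD of equally likely returns."""
    n = len(returns)
    mean = sum(returns) / n
    variance = sum((r - mean) ** 2 for r in returns) / n
    # Semi-variance: only deviations below the mean contribute.
    semivar = sum(min(r - mean, 0.0) ** 2 for r in returns) / n
    mad = sum(abs(r - mean) for r in returns) / n
    return {
        "range": max(returns) - min(returns),
        "variance": variance,
        "semivariance": semivar,
        "mad": mad,
    }


sample = [-0.10, 0.00, 0.02, 0.08]  # hypothetical returns
print(risk_measures(sample))
```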
Measuring portfolio return and risk to assemble a portfolio
An expected utility maximizing speculator/investor assembles a portfolio of stocks so as to
achieve the highest possible utility. The speculator assembles the portfolio by selecting the number of each type of
stock (µ in our previous example). The expected return and risk of the portfolio are directly linked to
the expected returns and risks of the underlying assets/stocks.
The following very simple example illustrates this point. Let there be only 2 firms, Firm 1 and
Firm 2, with stocks $S_1$ and $S_2$. The current value of the portfolio of the speculator is
$$P = \mu_1 S_1 + \mu_2 S_2,$$
where $\mu_1$ is the number of $S_1$ stocks and $\mu_2$ is the number of $S_2$ stocks.
Slightly abusing the notation, let the next period prices of stocks 1 and 2 be $S_1^1$ and $S_2^1$,
correspondingly. Therefore, the next period price of the portfolio is
$$P^1 = \mu_1 S_1^1 + \mu_2 S_2^1.$$
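The return on the portfolio then is
$$\begin{aligned} R = \frac{P^1 - P}{P} &= \frac{\mu_1 (S_1^1 - S_1) + \mu_2 (S_2^1 - S_2)}{P} \\ &= \frac{\mu_1 S_1}{P} \frac{S_1^1 - S_1}{S_1} + \frac{\mu_2 S_2}{P} \frac{S_2^1 - S_2}{S_2} \\ &= \omega_1 R_{S_1} + \omega_2 R_{S_2}, \end{aligned}$$
where $R_{S_1}$ is the return on stock $S_1$ and $R_{S_2}$ is the return on stock $S_2$. $\omega_1 = \frac{\mu_1 S_1}{P}$ is the weight of stock $S_1$ in
the portfolio, and $\omega_2 = \frac{\mu_2 S_2}{P}$ is the weight of stock $S_2$ in the portfolio. These weights sum up to 1,
$$1 = \omega_1 + \omega_2.$$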
In this respect this formula implies that the return on portfolio $R$ is the weighted sum of returns
on underlying assets/stocks $R_{S_1}$ and $R_{S_2}$. Notice that by selecting the number of stocks in the portfolio
$\mu_1$ and/or $\mu_2$ the speculator selects the weights of stocks $\omega_1$ and $\omega_2$ in the portfolio.
The expected value of the return on this portfolio is
$$E[R] = E[\omega_1 R_{S_1} + \omega_2 R_{S_2}] = \omega_1 E[R_{S_1}] + \omega_2 E[R_{S_2}].$$
In turn, the variance is
$$V[R] = \omega_1^2 V[R_{S_1}] + \omega_2^2 V[R_{S_2}] + 2\omega_1\omega_2\, COV[R_{S_1}, R_{S_2}],$$
where $COV[R_{S_1}, R_{S_2}]$ is the covariance of $R_{S_1}$ and $R_{S_2}$,
$$COV[R_{S_1}, R_{S_2}] = E\left[(R_{S_1} - E[R_{S_1}])(R_{S_2} - E[R_{S_2}])\right].$$
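The two formulas translate directly into code. The following is a minimal sketch for the two-stock case, with ω2 = 1 - ω1; the function names and input numbers are illustrative assumptions.

```python
def portfolio_mean(w1, mean1, mean2):
    """E[R] = w1 E[R_S1] + w2 E[R_S2], with w2 = 1 - w1."""
    return w1 * mean1 + (1.0 - w1) * mean2


def portfolio_variance(w1, var1, var2, cov12):
    """V[R] = w1^2 V[R_S1] + w2^2 V[R_S2] + 2 w1 w2 COV[R_S1, R_S2]."""
    w2 = 1.0 - w1
    return w1 ** 2 * var1 + w2 ** 2 * var2 + 2.0 * w1 * w2 * cov12


# Hypothetical inputs: same variances, three different covariances.
for cov12 in (0.04, 0.0, -0.04):
    print(cov12, portfolio_variance(0.5, 0.04, 0.04, cov12))
```

With equal weights the variance falls from 0.04 to 0.02 to 0 as the covariance moves from 0.04 to 0 to -0.04.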
Intuitively, the portfolio return $R = \omega_1 R_{S_1} + \omega_2 R_{S_2}$ varies because so do $R_{S_1}$ and $R_{S_2}$. However, we need
to take into account that $R_{S_1}$ and $R_{S_2}$ co-vary, or vary together, and that can amplify or reduce the
variance of the portfolio. The following two examples illustrate the latter point. Suppose $S_1$ and
$S_2$ are almost the same (maybe because they are stocks of vertically or horizontally interrelated
firms) and
$$R_{S_1} = R_{S_2}.$$
In such a case
$$V[R_{S_1}] = V[R_{S_2}], \qquad COV[R_{S_1}, R_{S_2}] = V[R_{S_1}],$$
and
$$\begin{aligned} V[R] &= \omega_1^2 V[R_{S_1}] + \omega_2^2 V[R_{S_1}] + 2\omega_1\omega_2 V[R_{S_1}] \\ &= \left(\omega_1^2 + 2\omega_1\omega_2 + \omega_2^2\right) V[R_{S_1}] \\ &= (\omega_1 + \omega_2)^2\, V[R_{S_1}] \\ &= V[R_{S_1}]. \end{aligned}$$
This result is to be expected, since if $R_{S_1} = R_{S_2}$ then $R = R_{S_1}$. This example corresponds to
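the case when $S_1$ and $S_2$ co-vary perfectly (they are almost the same). Suppose now that $S_1$ and
$S_2$ are extremely different (maybe because they are stocks of competing firms): when $R_{S_1} = 1$
then $R_{S_2} = 0$, but when $R_{S_1} = 0$ then $R_{S_2} = 1$. Let further $R_{S_1}$ be equal to 1 with probability $\frac{1}{2}$,
which implies that $E[R_{S_1}] = E[R_{S_2}] = \frac{1}{2}$. Moreover,
$$V[R_{S_1}] = E\left[(R_{S_1} - E[R_{S_1}])^2\right] = E\left[\left(R_{S_1} - \frac{1}{2}\right)^2\right] = E\left[R_{S_1}^2 - R_{S_1} + \frac{1}{4}\right] = \frac{1}{4},$$
where the last equality holds because $R_{S_1}$ takes only the values 0 and 1, so that $R_{S_1}^2 = R_{S_1}$.
Clearly, the same holds for $R_{S_2}$: $V[R_{S_2}] = \frac{1}{4}$. However, notice that
$$\begin{aligned} COV[R_{S_1}, R_{S_2}] &= E\left[\left(R_{S_1} - \frac{1}{2}\right)\left(R_{S_2} - \frac{1}{2}\right)\right] = E\left[R_{S_1} R_{S_2} - \frac{1}{2} R_{S_2} - \frac{1}{2} R_{S_1} + \frac{1}{4}\right] \\ &= E[R_{S_1} R_{S_2}] - \frac{1}{2} E[R_{S_2}] - \frac{1}{2} E[R_{S_1}] + \frac{1}{4} = 0 - \frac{1}{2} E[R_{S_2}] - \frac{1}{2} E[R_{S_1}] + \frac{1}{4} \\ &= -\frac{1}{4}. \end{aligned}$$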
Therefore, since the expected value of the return on the portfolio is simply the weighted sum of expected
returns on stocks, we have that
$$E[R] = E[\omega_1 R_{S_1} + \omega_2 R_{S_2}] = \omega_1 E[R_{S_1}] + \omega_2 E[R_{S_2}] = \frac{1}{2}.$$
In turn, the variance of the portfolio is
$$V[R] = \omega_1^2 \frac{1}{4} + \omega_2^2 \frac{1}{4} - 2\omega_1\omega_2 \frac{1}{4} = (\omega_1 - \omega_2)^2 \frac{1}{4},$$
which can be strictly less than the variance of any of the stocks in the portfolio. Let $\omega_1 = \omega_2$; then
$V[R] = 0 < \frac{1}{4}$. However, if $\omega_1 = 1$ or $\omega_2 = 1$ then $V[R] = \frac{1}{4}$. This happens because of the negative
covariance between $R_{S_1}$ and $R_{S_2}$ and is called diversification of risk.
This example conveniently illustrates a very important point. In case $R_{S_1}$ and $R_{S_2}$ don't vary
together perfectly, speculators can choose the weights of $S_1$ and $S_2$ in their portfolios so as to
minimize variance/risk for the same level of expected return.
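The second example has only two equally likely states, so the portfolio moments can be computed by direct enumeration rather than through the covariance formula. The sketch below does this; the variable and function names are illustrative.

```python
# Each state: (probability, R_S1, R_S2). When one stock pays 1 the other pays 0.
states = [(0.5, 1.0, 0.0), (0.5, 0.0, 1.0)]


def portfolio_moments(w1):
    """Mean and variance of R = w1 R_S1 + (1 - w1) R_S2 by enumeration."""
    w2 = 1.0 - w1
    outcomes = [(p, w1 * r1 + w2 * r2) for p, r1, r2 in states]
    mean = sum(p * r for p, r in outcomes)
    variance = sum(p * (r - mean) ** 2 for p, r in outcomes)
    return mean, variance


for w1 in (1.0, 0.75, 0.5):
    mean, variance = portfolio_moments(w1)
    # The variance should match (w1 - w2)^2 / 4 from the text.
    print(w1, mean, variance, (2.0 * w1 - 1.0) ** 2 / 4.0)
```

The expected return stays at 1/2 for every weight while the variance shrinks to zero at equal weights, which is the diversification effect.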
The magnitude of the covariance is not easy to interpret since it depends on the possible realizations
of random variables. Usually, therefore, another measure is used to describe the (linear) relation
between random variables. It is the normalized version of the covariance and is called the correlation
coefficient. It is defined as
$$\rho_{S_1,S_2} = \frac{COV[R_{S_1}, R_{S_2}]}{\sqrt{V[R_{S_1}]\, V[R_{S_2}]}}.$$
By definition, $\rho_{S_1,S_2} \in [-1, 1]$. Using the correlation coefficient and denoting $V[R] = \sigma_P^2$,
$V[R_{S_1}] = \sigma_{S_1}^2$ and $V[R_{S_2}] = \sigma_{S_2}^2$ we have that the variance of the portfolio returns is
$$\sigma_P^2 = \omega_1^2 \sigma_{S_1}^2 + \omega_2^2 \sigma_{S_2}^2 + 2\omega_1\omega_2\, \sigma_{S_1}\sigma_{S_2}\, \rho_{S_1,S_2}.$$
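A speculator chooses the weights $\omega_1$ and $\omega_2$ to construct a portfolio and bases these choices on the
expected value and variance of returns. Therefore, it is interesting to see how the expected value and
variance of the portfolio returns depend on $\omega_1$ and $\omega_2$. First of all notice that $\omega_2 = 1 - \omega_1$. Therefore,
it is enough to choose only one of the weights, e.g., $\omega_1$. Let for simplicity $E[R_{S_1}] > E[R_{S_2}]$. In
such a case the expected value of portfolio returns is a linear and increasing function of $\omega_1$,
$$E[R] = E[R_{S_2}] + \omega_1 \left(E[R_{S_1}] - E[R_{S_2}]\right).$$
In turn,
$$\begin{aligned} \sigma_P^2 &= \omega_1^2 \sigma_{S_1}^2 + (1 - \omega_1)^2 \sigma_{S_2}^2 + 2\omega_1 (1 - \omega_1)\, \sigma_{S_1}\sigma_{S_2}\, \rho_{S_1,S_2} \\ &= \omega_1^2 \left(\sigma_{S_1}^2 + \sigma_{S_2}^2 - 2\sigma_{S_1}\sigma_{S_2}\rho_{S_1,S_2}\right) - 2\omega_1 \sigma_{S_2} \left(\sigma_{S_2} - \sigma_{S_1}\rho_{S_1,S_2}\right) + \sigma_{S_2}^2. \end{aligned}$$
This is an upward opening parabola since
$$\sigma_{S_1}^2 + \sigma_{S_2}^2 - 2\sigma_{S_1}\sigma_{S_2}\rho_{S_1,S_2} = (\sigma_{S_1} - \sigma_{S_2})^2 + 2\sigma_{S_1}\sigma_{S_2} \left(1 - \rho_{S_1,S_2}\right) \geq 0$$
for any value of $\rho_{S_1,S_2}$, with strict inequality unless $\sigma_{S_1} = \sigma_{S_2}$ and $\rho_{S_1,S_2} = 1$. The following figures
illustrate the relationships between $E[R]$ and $\sigma_P^2$ and $\omega_1$.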
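In the second figure $\omega_1^{mv}$ is the $\omega_1$ which delivers the lowest variance of returns. It can be found
from the first order condition:
$$\frac{\partial \sigma_P^2}{\partial \omega_1} = 0 \Leftrightarrow 2\omega_1 \left(\sigma_{S_1}^2 + \sigma_{S_2}^2 - 2\sigma_{S_1}\sigma_{S_2}\rho_{S_1,S_2}\right) - 2\sigma_{S_2} \left(\sigma_{S_2} - \sigma_{S_1}\rho_{S_1,S_2}\right) = 0.$$
Therefore,
$$\omega_1^{mv} = \frac{\sigma_{S_2} \left(\sigma_{S_2} - \sigma_{S_1}\rho_{S_1,S_2}\right)}{\sigma_{S_1}^2 + \sigma_{S_2}^2 - 2\sigma_{S_1}\sigma_{S_2}\rho_{S_1,S_2}}.$$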
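The minimum variance weight can be computed and checked against nearby weights. The sketch below is illustrative; the function names and the standard deviations 0.2 and 0.1 are assumptions. It requires that the denominator is not zero, i.e., that the two stocks are not identical in risk and perfectly correlated.

```python
def portfolio_variance(w1, s1, s2, rho):
    """sigma_P^2 for weights (w1, 1 - w1), std. deviations s1, s2, correlation rho."""
    w2 = 1.0 - w1
    return w1 ** 2 * s1 ** 2 + w2 ** 2 * s2 ** 2 + 2.0 * w1 * w2 * s1 * s2 * rho


def min_variance_weight(s1, s2, rho):
    """Weight of stock 1 solving the first order condition."""
    denom = s1 ** 2 + s2 ** 2 - 2.0 * s1 * s2 * rho
    return s2 * (s2 - s1 * rho) / denom


for rho in (-1.0, 0.0, 0.5):
    w_mv = min_variance_weight(0.2, 0.1, rho)
    print(rho, w_mv, portfolio_variance(w_mv, 0.2, 0.1, rho))
```

With perfectly negatively correlated returns the minimum variance is zero, as in the diversification example above.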
Given that portfolio returns are linear in ω1, we can easily replace ω1 in the last figure with
E [R]. In such a case we would have
In this figure the curve σ2P is the minimum variance opportunity set. This is the set of variance
and expected return points which offers the minimum variance for a given expected rate of return.
In turn, E [Rmv] is the expected return of minimum variance portfolio.
Given that we have for indifference curves expected returns as functions of standard deviation,
usually this figure is transposed.
The speculators' optimal choice of portfolio/weights is given by the point of tangency of indifference
curves and the minimum variance opportunity set. This is because indifference curves are convex
and utility increases as these curves shift to the left. Moreover, exactly because of that no portfolio
will be selected below the horizontal line that corresponds to E [Rmv].
This analysis extends easily to the case when the number of stocks/assets is greater than
2. We have not defined precisely what S2 is. We simply needed it to have an expected value and a
variance. Therefore, it can be, for example, a linear combination of stocks of many firms. Even
more, it can include bonds. Hereafter, we will assume that portfolios can include many types
of assets/financial instruments: stocks, bonds, options, etc. Therefore, we will let speculators
assemble their portfolios using not only stocks. Moreover, we will assume that all speculators know
the correct variances and expected returns of assets.
Let us assume now that $S_2$ is a risk free asset with returns $R_{S_2}$, which for convenience we will denote
by $R_f$. For example, it is a US Treasury bond or a combination of perfectly negatively correlated assets.
In turn, $S_1$ is itself the portfolio of all risky assets with returns $R_{S_1}$, which have expected value and
variance $E[R_{S_1}]$ and $\sigma_{S_1}^2$. The portfolio consisting of $S_1$ and $S_2$ has expected return and variance
of returns
$$E[R] = \omega_1 E[R_{S_1}] + \omega_2 R_f = R_f + \omega_1 \left(E[R_{S_1}] - R_f\right)$$
and
$$\sigma_P^2 = \omega_1^2 \sigma_{S_1}^2.$$
Since $\omega_1 = \pm \sigma_P / \sigma_{S_1}$, this implies that the expected value of this portfolio is a linear function of its
standard deviation:
$$E[R] = \omega_1 E[R_{S_1}] + \omega_2 R_f = R_f \pm \left(E[R_{S_1}] - R_f\right) \frac{\sigma_P}{\sigma_{S_1}},$$
where the plus sign applies when $\omega_1 \geq 0$.
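The linear relation between expected return and standard deviation can be traced out by varying ω1. The sketch below is illustrative only; the risk free rate of 2% and the moments of the risky portfolio are hypothetical numbers.

```python
def combine_with_risk_free(w1, rf, mean_risky, sigma_risky):
    """Expected return and standard deviation of a mix of S1 and the risk free asset."""
    mean = rf + w1 * (mean_risky - rf)
    sigma = abs(w1) * sigma_risky  # sigma_P^2 = w1^2 sigma_S1^2
    return mean, sigma


rf, mean_risky, sigma_risky = 0.02, 0.08, 0.20  # assumed values

for w1 in (0.5, 1.0, 1.5):  # w1 > 1 means borrowing at the risk free rate
    mean, sigma = combine_with_risk_free(w1, rf, mean_risky, sigma_risky)
    # For w1 > 0 every point has the same slope (E[R] - Rf) / sigma_P.
    print(w1, mean, sigma, (mean - rf) / sigma)
```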
The following figure shows the minimum variance set in this case together with choices of ω1. The
minimum variance set is graphed using solid lines. In turn, possible choices are in dashed lines.
Portfolios along all these dashed lines are possible. However, only one dominates in terms of
expected return and variance. It is the portfolio at point M. In the presence of a risk free asset
the speculators form their portfolios at the tangency point of their indifference curves and the
upward sloping solid line called the capital market line, CML. In this respect their portfolios are
combinations of the risk free asset (with return Rf) and portfolio M.
How do the speculators construct the CML? To do so they need to know Rf and portfolio M.
Rf is the return on the risk free asset, which might be safely thought to be US Treasury bonds. What
about M? M is the portfolio of all risky assets (yes, it is S1 in this example, if you were wondering).
Therefore, it is at the tangency point of the CML and the convex minimum variance set.
The market is in equilibrium when prices are such that markets clear. All assets then are held.
In other words, prices adjust so that excess demand and supply of assets are zero. This market
clearing condition implies that equilibrium is attained at a single tangency portfolio, M, which all
investors combine with the risk free asset and which is a portfolio where all assets are held according to their
market value weights. (At this point the weight of the risk free asset is zero. Moreover, this portfolio is called the
market portfolio.) Let the market value of asset $i$ be $V_i$ and let there be $N$ assets. Then the market
value weight of asset $i$ is
$$w_i = \frac{V_i}{\sum_{k=1}^{N} V_k},$$
where $\sum_{k=1}^{N} V_k$ is the total market value of all assets. Market equilibrium is not attained until the
tangency portfolio M is the market portfolio. Moreover, the value of the risk free rate of return
should be such that aggregate borrowing and lending are equal.
The upward sloping solid line is called the Capital Market Line. In turn, the result that in equilibrium
investors hold a weighted average of the risk free asset and the market portfolio M is called Two-fund
Separation.
Market price of risk and the CAPM
The Capital Asset Pricing Model, CAPM in short, is a model which allows us to determine the
market price of risk and the appropriate measure of risk for a single asset. It rests on several very
strong assumptions (which we have actually maintained thus far). These assumptions are that
1. There are no market distortions and frictions (e.g., there are no price setters and information
is costlessly available to all)
2. Speculators/investors maximize their expected utility and are risk averse.
3. There exists an unlimited supply of the risk free asset at the risk free rate
4. The quantities of assets are fixed. Moreover, all assets are tradable and perfectly divisible
5. All investors have the same information about the markets
The CAPM also needs the market portfolio to be efficient. This follows from utility maximization
and common knowledge: the market is the sum of all individual holdings and all individual
holdings are efficient.
In equilibrium, for any asset $i$ it has to be that its weight in the market portfolio M is equal to
$$w_i = \frac{V_i}{\sum_{k=1}^{N} V_k}.$$
Consider a portfolio which consists of $\omega_i$ percentage of asset $i$ and $1 - \omega_i$ percentage of the market
portfolio M. The expected return on such a portfolio is
$$E[R] = \omega_i E[R_i] + (1 - \omega_i) E[R_m],$$
where $E[R_i]$ and $E[R_m]$ are the expected returns on asset $i$ and market portfolio M. The standard
deviation of this portfolio is
$$\sigma_P = \left[\omega_i^2 \sigma_i^2 + (1 - \omega_i)^2 \sigma_m^2 + 2\omega_i (1 - \omega_i)\, \sigma_{i,m}\right]^{\frac{1}{2}},$$
where $\sigma_{i,m}$ is the covariance between asset $i$ and market portfolio M.
Clearly, the opportunity set in terms of expected value and variance for various combinations
of asset $i$ and portfolio M is given by the convex Minimum Variance Opportunity set. In turn, the
changes of the expected return and risk (here standard deviation) of this portfolio with $\omega_i$ are given
by
$$\frac{\partial}{\partial \omega_i} E[R] = E[R_i] - E[R_m],$$
$$\frac{\partial}{\partial \omega_i} \sigma_P = \left[\omega_i \sigma_i^2 - (1 - \omega_i)\, \sigma_m^2 + (1 - 2\omega_i)\, \sigma_{i,m}\right] \times \left[\omega_i^2 \sigma_i^2 + (1 - \omega_i)^2 \sigma_m^2 + 2\omega_i (1 - \omega_i)\, \sigma_{i,m}\right]^{-\frac{1}{2}}.$$
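The main insight of this model is that the market portfolio already contains asset $i$. Therefore,
$\omega_i$ is the excess demand for it. In equilibrium it should be zero and we would evaluate $\frac{\partial}{\partial \omega_i} E[R]$
and $\frac{\partial}{\partial \omega_i} \sigma_P$ at $\omega_i = 0$. This gives
$$\left.\frac{\partial}{\partial \omega_i} E[R]\right|_{\omega_i = 0} = E[R_i] - E[R_m],$$
$$\left.\frac{\partial}{\partial \omega_i} \sigma_P\right|_{\omega_i = 0} = \frac{\sigma_{i,m} - \sigma_m^2}{\sigma_m}.$$
Therefore, the slope of the expected return and risk trade-off evaluated at the market portfolio is
$$\left.\frac{\frac{\partial}{\partial \omega_i} E[R]}{\frac{\partial}{\partial \omega_i} \sigma_P}\right|_{\omega_i = 0} = \frac{E[R_i] - E[R_m]}{\left(\sigma_{i,m} - \sigma_m^2\right) / \sigma_m}.$$
This expression shows how an individual asset affects the return and risk of the market portfolio.
The final insight is that this slope is equal to the slope of the CML, which is
$$\frac{E[R_m] - R_f}{\sigma_m}.$$
Equating these two we have that
$$\frac{E[R_i] - E[R_m]}{\left(\sigma_{i,m} - \sigma_m^2\right) / \sigma_m} = \frac{E[R_m] - R_f}{\sigma_m},$$
or equivalently,
$$E[R_i] = R_f + \left(E[R_m] - R_f\right) \frac{\sigma_{i,m}}{\sigma_m^2}.$$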
This equation is known as the Capital Asset Pricing Model. It states that the required rate of return
on any asset $i$ is equal to the risk free rate of return plus a risk premium. The risk premium is the
price of risk $\left(E[R_m] - R_f\right)$ times the quantity of risk $\frac{\sigma_{i,m}}{\sigma_m^2}$. The quantity of risk is denoted by
$$\beta = \frac{\sigma_{i,m}}{\sigma_m^2}.$$
This is the contribution of asset $i$ to the portfolio risk. Most importantly, the variance of asset
$i$ as a measure of risk does not matter in this context. What matters is how its returns covary
with the market portfolio returns.
The risk free asset has β = 0 since it does not contribute to the risk of the market portfolio (i.e.,
its covariance with the market portfolio is zero). In turn, the market portfolio has β = 1 since it is
perfectly correlated with itself (its covariance with itself equals its variance).
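The CAPM equation is a one-line computation once β is known. The following is a small sketch with hypothetical inputs (a 2% risk free rate and an 8% expected market return); the function names are illustrative.

```python
def beta(cov_with_market, var_market):
    """Quantity of risk: beta = sigma_{i,m} / sigma_m^2."""
    return cov_with_market / var_market


def capm_required_return(rf, mean_market, beta_i):
    """E[R_i] = R_f + (E[R_m] - R_f) * beta_i."""
    return rf + (mean_market - rf) * beta_i


rf, mean_market, var_market = 0.02, 0.08, 0.04  # assumed values

for cov in (0.0, 0.02, 0.04, 0.06):
    b = beta(cov, var_market)
    print(b, capm_required_return(rf, mean_market, b))
```

A covariance of zero gives the risk free rate and a covariance equal to the market variance gives the expected market return, matching the two special cases above.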
The following figure illustrates the CAPM where it is depicted by the blue line.
This model has several important properties. First, in equilibrium, because of arbitrage, every
asset should be priced so that its risk adjusted return is exactly on the blue line. Next, the total
risk can be separated into two parts,
Total risk = systematic risk + unsystematic risk,
where systematic risk reflects how an asset covaries with the market portfolio and unsystematic risk is
the risk not dependent on the market/economy. Speculators/investors require a risk
premium for bearing systematic risk, since it cannot be diversified away. Unsystematic risk can be
diversified away and therefore earns no risk premium.
How can one find the β of an asset and the magnitude of its unsystematic risk? The CAPM implies that one
can estimate them from the following empirical specification,
$$R_i = a + \beta_i R_m + \varepsilon,$$
where $a$ is a constant and $\varepsilon$ is a random variable which does not correlate with the market. The
variance of $R_i$ then is
$$\sigma_i^2 = \beta_i^2 \sigma_m^2 + \sigma_\varepsilon^2,$$
where $\beta_i^2 \sigma_m^2$ is the systematic risk and $\sigma_\varepsilon^2$ is the unsystematic risk.
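One simple way to estimate this specification is ordinary least squares, where the slope estimate is the sample covariance divided by the sample variance of the market return. The sketch below assumes this estimator and uses made-up return series; it is an illustration, not the procedure prescribed by the notes.

```python
def mean(xs):
    return sum(xs) / len(xs)


def cov(xs, ys):
    """Sample covariance with divisor n."""
    mx, my = mean(xs), mean(ys)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / len(xs)


def estimate_beta(asset, market):
    """OLS estimates of a, beta and the residual (unsystematic) variance."""
    b = cov(asset, market) / cov(market, market)
    a = mean(asset) - b * mean(market)
    resid = [y - a - b * x for x, y in zip(market, asset)]
    return a, b, cov(resid, resid)


rm = [0.01, 0.03, -0.02, 0.04]  # hypothetical market returns
ri = [0.02, 0.05, -0.03, 0.06]  # hypothetical returns on asset i

a_hat, b_hat, var_eps = estimate_beta(ri, rm)
systematic = b_hat ** 2 * cov(rm, rm)
# In sample, total variance = systematic + unsystematic.
print(b_hat, cov(ri, ri), systematic + var_eps)
```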
A second important property of the CAPM is that the measures of risk of individual assets are
linearly additive when the assets are combined in portfolios. For instance, if asset $i$ has a risk
of $\beta_i$ and asset $j$ has a risk of $\beta_j$ and they are combined with proportions $\omega_i$ and $1 - \omega_i$, then the total risk of the
portfolio is
$$\beta_p = \omega_i \beta_i + (1 - \omega_i)\, \beta_j.$$
To show this use the properties of covariance:
$$\begin{aligned} \beta_p &= \frac{E\left[\left(\omega_i R_i + (1 - \omega_i) R_j - E[\omega_i R_i + (1 - \omega_i) R_j]\right) \left(R_m - E[R_m]\right)\right]}{V[R_m]} \\ &= \omega_i \frac{E\left[(R_i - E[R_i]) (R_m - E[R_m])\right]}{V[R_m]} + (1 - \omega_i) \frac{E\left[(R_j - E[R_j]) (R_m - E[R_m])\right]}{V[R_m]} \\ &= \omega_i \beta_i + (1 - \omega_i)\, \beta_j. \end{aligned}$$
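The additivity of β can also be confirmed on sample data, since sample covariances are linear in the same way. The return series and the weight below are hypothetical.

```python
def mean(xs):
    return sum(xs) / len(xs)


def cov(xs, ys):
    mx, my = mean(xs), mean(ys)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / len(xs)


def beta(asset, market):
    """beta = COV[R_asset, R_m] / V[R_m], computed from samples."""
    return cov(asset, market) / cov(market, market)


rm = [0.01, 0.03, -0.02, 0.04]   # hypothetical market returns
ri = [0.02, 0.05, -0.03, 0.06]   # hypothetical returns on asset i
rj = [0.00, 0.01, 0.01, 0.02]    # hypothetical returns on asset j
w = 0.3                          # assumed weight of asset i

rp = [w * x + (1.0 - w) * y for x, y in zip(ri, rj)]  # portfolio returns
print(beta(rp, rm), w * beta(ri, rm) + (1.0 - w) * beta(rj, rm))
```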
Black-Scholes model
The Black-Scholes model is a model which allows us to determine the market price of European
options. This model maintains the following assumptions
1. There are no market distortions and frictions
2. Speculators/investors maximize their expected utility
3. There exists an unlimited supply of risk free asset at risk free rate rf
4. All assets are tradable and perfectly divisible
5. Stocks pay no dividends during the life-time of the options
European options are widely used financial instruments.
Definition 1 An option is a contract which gives its owner an option (the right but not the obligation)
to buy or sell an underlying asset at a specified price (strike price) on or before a specified date (time to
maturity).
1. Options giving the right to buy are called Call Options
2. Options giving the right to sell are called Put Options
3. Options which can be exercised only at a specified date (but not before it) are called European
Options.
To derive the Black-Scholes model, let us consider a stock which has random returns and a European
Call option written on it. Denote the price of the stock at time $t$ by $S_t$. Assume that the price of
the stock follows a geometric Brownian motion:
$$dS_t = \mu_t S_t\, dt + \sigma S_t\, dW_t,$$
where $d$ stands for the operator of infinitesimally small change over time. $\mu_t$ is called the drift coefficient
and $\sigma$ is a positive constant. Both $\mu_t$ and $\sigma$ will acquire meaning when $W_t$ is presented. This
(differential) equation states that the change of the value of the stock happens because of
deterministic shifts in the mean and because of a random process $W_t$.
$W_t$ and, therefore, its infinitesimally small change $dW_t$, is a random variable. More precisely,
$W_t$ is a Wiener process: It is a random variable which changes over time so that the expected
change over any time interval is 0 (e.g., $E[dW_t] = 0$) and the variance of its change over a time interval
of length $T$ is equal to $T$ (e.g., $V[dW_t] = dt$). A discrete analogue for $W$ is a simple random walk.
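The above equation can be rewritten as
$$\frac{dS_t}{S_t} = \mu_t\, dt + \sigma\, dW_t,$$
where $\frac{dS_t}{S_t}$ is the rate of return on the stock during a very short time $dt$. The properties of the
random variable $W_t$ imply that
$$E\left[\frac{dS_t}{S_t}\right] = \mu_t\, dt, \qquad Var\left[\frac{dS_t}{S_t}\right] = \sigma^2\, dt.$$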
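These two moments can be checked by simulation. The sketch below uses a simple Euler discretization with a constant drift and a small time step; the discretization, the parameter values, and the function name are assumptions made for illustration.

```python
import math
import random


def simulate_gbm(s0, mu, sigma, dt, steps, rng):
    """Euler scheme: S_{t+dt} = S_t + mu S_t dt + sigma S_t dW, dW ~ N(0, dt)."""
    path = [s0]
    for _ in range(steps):
        dw = rng.gauss(0.0, math.sqrt(dt))
        path.append(path[-1] * (1.0 + mu * dt + sigma * dw))
    return path


mu, sigma, dt = 0.08, 0.20, 1.0 / 250.0  # assumed annual drift, volatility, daily step
path = simulate_gbm(100.0, mu, sigma, dt, 20000, random.Random(0))

# Per-step rates of return (S_{t+dt} - S_t) / S_t
rets = [(b - a) / a for a, b in zip(path, path[1:])]
m = sum(rets) / len(rets)
v = sum((r - m) ** 2 for r in rets) / len(rets)
print(m, mu * dt)          # sample mean vs. mu dt
print(v, sigma ** 2 * dt)  # sample variance vs. sigma^2 dt
```

The sample mean is a noisy estimate of µ dt, while the sample variance is typically much closer to σ² dt.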
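A European option written on this stock specifies the time when it can be exercised. Denote it by
$T$. It also specifies the strike price. Denote it by $K$. As we will see, at time $t < T$ the price of this
option $V$ is a (complicated) function of $S$, $\mu$, $\sigma$, $t$, $K$, $T$, and $r_f$:
$$V = V(S, \mu, \sigma, t, K, T, r_f).$$
From Ito's lemma it follows that
$$dV = \left(\mu S \frac{\partial V}{\partial S} + \frac{\partial V}{\partial t} + \frac{1}{2} \sigma^2 S^2 \frac{\partial^2 V}{\partial S^2}\right) dt + \sigma S \frac{\partial V}{\partial S}\, dW.$$
Combining this differential equation with the differential equation for the stock price gives
$$dV = \frac{\partial V}{\partial t}\, dt + \frac{1}{2} \sigma^2 S^2 \frac{\partial^2 V}{\partial S^2}\, dt + \frac{\partial V}{\partial S}\, dS.$$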
To derive the price of this option the Black-Scholes model uses a no-arbitrage argument. This
involves constructing a risk free portfolio and equating its returns to the risk free returns.
Consider a portfolio of one option on this stock and a short position of $\Delta$ in this stock. The
value of the portfolio is then
$$P_t = V_t - \Delta S_t.$$
Choose $\Delta$ so as to hedge this portfolio (reduce its variance).
The change in the price of the portfolio is given by
$$dP_t = dV_t - \Delta\, dS_t = \frac{\partial V}{\partial t}\, dt + \frac{1}{2} \sigma^2 S^2 \frac{\partial^2 V}{\partial S^2}\, dt + \frac{\partial V}{\partial S}\, dS - \Delta\, dS.$$
The first two terms are deterministic. Therefore, they don't matter for the hedging strategy, and to
hedge the portfolio against randomness (volatility) select $\Delta = \frac{\partial V}{\partial S}$ (this is called delta/dynamic
hedging). This implies that
$$dP_t = \left(\frac{\partial V}{\partial t} + \frac{1}{2} \sigma^2 S^2 \frac{\partial^2 V}{\partial S^2}\right) dt.$$
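Since this portfolio entails no risk its returns should be equal to the returns on the risk-free asset
$r_f$. In other words (see footnote 23),
$$\frac{dP_t}{P_t} = r_f\, dt.$$
Therefore,
$$dP_t = P_t\, r_f\, dt$$
and
$$\left(\frac{\partial V}{\partial t} + \frac{1}{2} \sigma^2 S^2 \frac{\partial^2 V}{\partial S^2}\right) dt = (V_t - \Delta S_t)\, r_f\, dt.$$
The latter implies that
$$r_f V = \frac{\partial V}{\partial t} + \frac{1}{2} \sigma^2 S^2 \frac{\partial^2 V}{\partial S^2} + r_f S \frac{\partial V}{\partial S}.$$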
This is a second order partial differential equation.
In order to solve it one needs to have boundary conditions. First, let this be a call option. At
maturity $T$, the price of the call option is easy to derive. It is
$$V_T = \max\{S_T - K, 0\}.$$
This is because time $T$ is the exercise date. The owner of this option would exercise it only if $S_T$ is
higher than the strike price and would make a net gain of $S_T - K$. Second,
$$V(0, \mu, \sigma, t, K, T, r_f) = 0.$$
Third,
$$\lim_{S \to +\infty} V(S, \mu, \sigma, t, K, T, r_f) = S.$$
Solving this differential equation is much beyond the scope and focus of this course.
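Although the solution is not derived here, the well-known closed-form Black-Scholes price of a European call can be stated and evaluated. The sketch below implements that standard formula, which is not taken from these notes; the function names and the numerical inputs are illustrative. Here `tau` stands for the remaining time T - t.

```python
import math


def norm_cdf(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def bs_call(S, K, rf, sigma, tau):
    """Standard closed-form price of a European call, tau = T - t."""
    if tau <= 0.0:
        return max(S - K, 0.0)  # boundary condition at maturity
    if S <= 0.0:
        return 0.0              # boundary condition V(0, ...) = 0
    d1 = (math.log(S / K) + (rf + 0.5 * sigma ** 2) * tau) / (sigma * math.sqrt(tau))
    d2 = d1 - sigma * math.sqrt(tau)
    return S * norm_cdf(d1) - K * math.exp(-rf * tau) * norm_cdf(d2)


# Hypothetical inputs: at-the-money call, one year to maturity.
print(bs_call(100.0, 100.0, 0.05, 0.20, 1.0))
```

For very large S relative to K the value returned is close to S minus the discounted strike, so the price grows one for one with S.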
23 This is because the return on the portfolio is given by $\frac{1}{P_t} \frac{dP_t}{dt}$. If time is discrete then this becomes
$\frac{1}{P_t} \frac{P_{t+1} - P_t}{t + 1 - t} = \frac{P_{t+1} - P_t}{P_t}$.
Appendix
Appendix - Reminder of Statistics 0
Probability Theory
There are at least two ways to approach probability:
• Classical (or a priori) Probability
—Classical Probability can be defined in the following way: If a random experiment can
result in n mutually exclusive and equally likely outcomes and if nA of these outcomes
have an attribute A, then the probability of A is the fraction nA/n.
• Axiomatic Probability (or so-called Kolmogorov’s Axiomatics):
—Probability Space is a triple(
Ω; A; Pr (.)), where
1. Ω is the sample space: A collection of all possible outcomes of an experiment
∗ Any given experiment result is an element of Ω.
2. $\mathcal{A}$ is the event space, or algebra of events.
∗ It is a collection of subsets of Ω, including Ω.
3. Pr (.) is the probability function defined on $\mathcal{A}$: For any event $A \in \mathcal{A}$, Pr (A) is a
quantitative (numerical) measure of the likelihood that this event A is observed once
the experiment is completed.
Basic definitions and theorems
The following definitions are needed to construct probability space:
Definition 2 [Sample space] The sample space, denoted by Ω, is the collection or the totality of all
possible outcomes of a conceptual experiment. Often, Ω is called the sure event since it includes any
outcome that can occur.
Definition 3 [Event and event space] An event is a subset of the sample space. The class of all
events associated with a given experiment is defined to be the event space.
Any event space $\mathcal{A}$ possesses the following properties:
1. $\Omega \in \mathcal{A}$
2. If $A \in \mathcal{A}$, then the complement set $\bar{A} \in \mathcal{A}$
3. If $A_1, A_2, \ldots \in \mathcal{A}$, then $\cup_{i \geq 1} A_i \in \mathcal{A}$
Definition 4 An event space that satisfies properties 1, 2, and 3 is called a σ-algebra.
Definition 5 [Probability function] A probability function $\Pr(\cdot)$ is a set function with domain $\mathcal{A}$ and counterdomain the interval $[0, 1]$ that satisfies the following axioms:
• $\Pr(A) \geq 0$ for all $A \in \mathcal{A}$
• $\Pr(\Omega) = 1$
• If $A_1, A_2, \ldots \in \mathcal{A}$ are pairwise disjoint, then $\Pr(\cup_{i \geq 1} A_i) = \sum_{i \geq 1} \Pr(A_i)$
Definition 6 The triple $(\Omega; \mathcal{A}; \Pr(\cdot))$ is called a probability space.
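The definitions above can be checked on a small finite example. The sketch below is only illustrative: it assumes one roll of a fair six-sided die as the experiment, takes the set of all subsets as the event space, and uses the classical probability as Pr(.). The names `omega`, `events`, and `pr` are chosen here and do not come from the notes. On a finite space, closure under pairwise unions is enough for property 3.

```python
from fractions import Fraction
from itertools import chain, combinations

# Sample space of one roll of a fair six-sided die (illustrative assumption).
omega = frozenset(range(1, 7))

def power_set(s):
    """Return all subsets of s; used here as the event space."""
    items = list(s)
    subsets = chain.from_iterable(
        combinations(items, r) for r in range(len(items) + 1)
    )
    return {frozenset(c) for c in subsets}

events = power_set(omega)

def pr(event):
    """Classical probability: favourable outcomes over all outcomes."""
    return Fraction(len(event), len(omega))

# Event-space properties 1-3 (finite unions suffice on a finite space).
has_omega = omega in events
closed_complement = all(omega - a in events for a in events)
closed_union = all(a | b in events for a in events for b in events)

# Probability axioms: non-negativity, Pr(Omega) = 1, additivity on disjoint events.
non_negative = all(pr(a) >= 0 for a in events)
normalised = pr(omega) == 1
additive = all(
    pr(a | b) == pr(a) + pr(b)
    for a in events for b in events if not (a & b)
)

print(has_omega, closed_complement, closed_union)
print(non_negative, normalised, additive)
```

Exact fractions are used so that the additivity check is not affected by floating-point rounding.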
The following definitions are for conditional probability and independence of events.
Definition 7 Let $A, B \in \mathcal{A}$ and $\Pr(A) > 0$. The conditional probability of event $B$ given the occurrence of $A$ is denoted and given by $\Pr(B|A) = \frac{\Pr(B \cap A)}{\Pr(A)}$.
Theorem 8 (The law of total probability) Let $A_1, A_2, \ldots, A_n \in \mathcal{A}$ form a partition of $\Omega$. Let $B \in \mathcal{A}$. Then
$$\Pr(B) = \sum_{i=1}^{n} \Pr(B|A_i)\Pr(A_i).$$
Theorem 9 (Bayes' formula) Let $A_1, A_2, \ldots, A_n \in \mathcal{A}$ form a partition of $\Omega$, and let $\Pr(A_i) > 0$ for every $i$. Then for any event $B \in \mathcal{A}$ with $\Pr(B) > 0$
$$\Pr(A_j|B) = \frac{\Pr(B \cap A_j)}{\Pr(B)} = \frac{\Pr(B|A_j)\Pr(A_j)}{\sum_{i=1}^{n} \Pr(B|A_i)\Pr(A_i)}.$$
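A small numerical sketch may help to see how Theorems 8 and 9 fit together. All numbers below are hypothetical: three events A1, A2, A3 are assumed to partition the sample space, with made-up probabilities Pr(Ai) and conditional probabilities Pr(B|Ai). The function names are chosen for this illustration only.

```python
from fractions import Fraction as F

# Hypothetical numbers: A1, A2, A3 are assumed to partition the sample space.
prior = [F(1, 5), F(1, 2), F(3, 10)]        # Pr(A_i), sums to 1
likelihood = [F(9, 10), F(2, 5), F(1, 10)]  # Pr(B | A_i)

def total_probability(prior, likelihood):
    """Law of total probability: Pr(B) = sum_i Pr(B|A_i) Pr(A_i)."""
    return sum(l * p for l, p in zip(likelihood, prior))

def bayes(prior, likelihood):
    """Bayes' formula: Pr(A_j|B) for every j."""
    pr_b = total_probability(prior, likelihood)
    return [l * p / pr_b for l, p in zip(likelihood, prior)]

pr_b = total_probability(prior, likelihood)
posterior = bayes(prior, likelihood)

print(pr_b)        # 41/100
print(posterior)   # [Fraction(18, 41), Fraction(20, 41), Fraction(3, 41)]
```

The posterior probabilities sum to one because the denominator in Bayes' formula is exactly Pr(B).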
Definition 10 Events $A_1$ and $A_2$ are independent iff
$$\Pr(A_1 \cap A_2) = \Pr(A_1)\Pr(A_2).$$
Similarly, events $A_1, A_2, \ldots, A_n \in \mathcal{A}$ are independent iff the probability of the intersection of any sub-collection of $A_1, A_2, \ldots, A_n$ equals the product of the probabilities of the events in that intersection.
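The requirement that the product rule holds for every sub-collection matters. The sketch below uses a standard textbook example that is not taken from these notes: two tosses of a fair coin, with A1 "first toss is heads", A2 "second toss is heads", and A3 "both tosses show the same face". Each pair of events satisfies the product rule, but the three events together do not.

```python
from fractions import Fraction
from itertools import combinations
from math import prod

# Two tosses of a fair coin (illustrative assumption); outcomes are equally likely.
omega = {"HH", "HT", "TH", "TT"}

def pr(event):
    """Classical probability of an event (a subset of omega)."""
    return Fraction(len(event), len(omega))

def independent(events):
    """True iff every sub-collection of two or more events satisfies the product rule."""
    for size in range(2, len(events) + 1):
        for sub in combinations(events, size):
            if pr(set.intersection(*sub)) != prod(pr(e) for e in sub):
                return False
    return True

a1 = {"HH", "HT"}  # first toss is heads
a2 = {"HH", "TH"}  # second toss is heads
a3 = {"HH", "TT"}  # both tosses show the same face

print(independent([a1, a2]))      # True
print(independent([a1, a2, a3]))  # False
```

The triple fails because the intersection of all three events is {HH}, which has probability 1/4, while the product of the three probabilities is 1/8.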
Random variables
Definition 11 (Intuitive definition) A random variable $x$ is a real-valued function of the elements $\omega$ of a sample space $\Omega$.
For example, a random variable $x$ is the sum of the two numbers that occur when we roll a pair of fair dice once. The outcomes are the ordered pairs of numbers, so the sample space comprises 36 elements. The value of a random variable depends on the outcome of the experiment being observed. Each possible value $X$ of a random variable $x$ defines an event: the function $x(\omega)$ assigns values $X \in \mathbb{R}$ to the sample space outcomes $\omega$.
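The dice example can be enumerated directly. The following sketch lists the 36 equally likely outcomes, applies the random variable (the sum of the two numbers) to each, and counts how many outcomes map to each value. The names `outcomes`, `x`, and `pmf` are chosen for this illustration.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# Sample space: ordered pairs from one roll of two fair dice.
outcomes = list(product(range(1, 7), repeat=2))

def x(outcome):
    """The random variable: the sum of the two numbers."""
    return sum(outcome)

# Each value X defines the event {outcome : x(outcome) = X}; count its elements.
counts = Counter(x(w) for w in outcomes)
pmf = {value: Fraction(n, len(outcomes)) for value, n in sorted(counts.items())}

for value, probability in pmf.items():
    print(value, probability)
```

For instance, six of the 36 outcomes give a sum of 7, so that value has probability 1/6.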
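Definition 12 (Technical definition) For a given probability space $(\Omega; \mathcal{A}; \Pr(\cdot))$, a function $x(\omega): \Omega \to \mathbb{R}$ is said to be a random variable if for every $X \in \mathbb{R}$ the event $A_X = \{\omega : x(\omega) < X\}$, which collects the outcomes $\omega$ on which $x(\omega) < X$, belongs to the event space $\mathcal{A}$, that is, $A_X \in \mathcal{A}$.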
Definition 13 The cumulative distribution function of a random variable $x$ defined on the probability space $(\Omega; \mathcal{A}; \Pr(\cdot))$ is
$$F_x(X) = \Pr(A_X) = \Pr(x(\omega) < X).$$
Notice that the cumulative distribution function is a non-decreasing function $F_x(X): \mathbb{R} \to [0, 1]$, since a higher $X$ implies a larger set of outcomes $\omega$ for which $x(\omega) < X$. More precisely, cumulative distribution functions satisfy the following properties.
• $\lim_{X \to -\infty} F_x(X) = 0$
• $\lim_{X \to +\infty} F_x(X) = 1$
• For any $X_1$ and $X_2$ where $X_1 < X_2$, $F_x(X_1) \leq F_x(X_2)$
• $\lim_{h \to 0^-} F_x(X + h) = F_x(X)$ for any $X \in \mathbb{R}$
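These properties can be seen on a discrete example. The sketch below assumes the sum of two fair dice as the random variable and evaluates the cumulative distribution function with the strict inequality used in the definition above, so the function is continuous from the left and jumps just to the right of each attainable value. The function name `cdf` is chosen here.

```python
from fractions import Fraction
from itertools import product

# Sample space: ordered pairs from one roll of two fair dice (illustrative assumption).
outcomes = list(product(range(1, 7), repeat=2))

def cdf(X):
    """F_x(X) = Pr(x < X) for the sum of two dice, with the strict inequality."""
    favourable = sum(1 for w in outcomes if sum(w) < X)
    return Fraction(favourable, len(outcomes))

print(cdf(2))          # 0: no outcome has a sum strictly below 2
print(cdf(13))         # 1: every outcome has a sum strictly below 13
print(cdf(7))          # 5/12: sums 2, ..., 6
print(cdf(7 - 1e-9))   # 5/12: approaching 7 from the left does not change the value
print(cdf(7 + 1e-9))   # 7/12: the jump of size Pr(x = 7) appears just to the right
```

With the alternative convention Pr(x ≤ X), the same function would instead be continuous from the right.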
Definition 14 The discrete density function, or probability mass function, of a discrete random variable $x$ is
$$f_x(X) = \begin{cases} p_{x_i} & \text{if } X = x_i,\ i = 1, 2, \ldots \\ 0 & \text{otherwise.} \end{cases}$$
Definition 15 The continuous density function, or probability density function, of a continuous random variable $x$ is
$$f_x(X) = \frac{dF_x(X)}{dX}.$$
Density functions satisfy the following properties.
• $f_x(X) \geq 0$
• $\int_{-\infty}^{X} f_x(z)\,dz = F_x(X)$
• $\int_{-\infty}^{+\infty} f_x(z)\,dz = 1$
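The link between density and cumulative distribution function can be checked numerically. The sketch below assumes a standard normal random variable purely as an example; the integrals are truncated to [-10, 10] and approximated with a simple trapezoid rule, so the results are approximate. The helper names `phi`, `Phi`, and `integrate` are chosen for this illustration.

```python
import math

def phi(z):
    """Standard normal density (assumed example of a continuous density)."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)

def Phi(x):
    """Standard normal cumulative distribution function via the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def integrate(f, a, b, n=20000):
    """Composite trapezoid rule on [a, b] with n sub-intervals."""
    h = (b - a) / n
    total = 0.5 * (f(a) + f(b)) + sum(f(a + i * h) for i in range(1, n))
    return total * h

# The density integrates to (approximately) one over a wide interval.
total_mass = integrate(phi, -10.0, 10.0)

# Integrating the density up to X recovers the distribution function at X.
cdf_at_1 = integrate(phi, -10.0, 1.0)

# The derivative of the distribution function recovers the density.
step = 1e-5
slope_at_1 = (Phi(1.0 + step) - Phi(1.0 - step)) / (2.0 * step)

print(total_mass)
print(cdf_at_1, Phi(1.0))
print(slope_at_1, phi(1.0))
```

The lower limit of -10 stands in for minus infinity; the mass left out below that point is negligible for this density.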
Basic properties of expectation, variance, and covariance
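1. The basic properties of the expectation operator are
• For any random variables $\{X_i\}_{i=1}^{N}$, real functions $\{h_i\}_{i=1}^{N}$, and real numbers $\{\alpha_i\}_{i=1}^{N}$
$$E\left[\sum_{i=1}^{N} \alpha_i h_i(X_i)\right] = \sum_{i=1}^{N} \alpha_i E\left[h_i(X_i)\right].$$
• If for any $h_i$ and $h_j$ it is the case that $h_i(X) \leq h_j(X)$ then
$$E\left[h_i(X)\right] \leq E\left[h_j(X)\right].$$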
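2. The basic properties of the variance and covariance operators are
• For any random variables $\{X_i\}_{i=1}^{N}$, real functions $\{h_i\}_{i=1}^{N}$, and real numbers $\{\alpha_i\}_{i=1}^{N}$
$$V\left[\sum_{i=1}^{N} \alpha_i h_i(X_i)\right] = COV\left[\sum_{i=1}^{N} \alpha_i h_i(X_i), \sum_{i=1}^{N} \alpha_i h_i(X_i)\right] = \sum_{i=1}^{N} \alpha_i^2 V\left[h_i(X_i)\right] + \sum_{i \neq j} \alpha_i \alpha_j COV\left[h_i(X_i), h_j(X_j)\right],$$
where for any $h_i$ and $h_j$
$$COV\left[h_i(X_i), h_j(X_j)\right] = E\left[\left(E[h_i(X_i)] - h_i(X_i)\right)\left(E[h_j(X_j)] - h_j(X_j)\right)\right],$$
and the second term consists of $N(N-1)$ items. Correlation is defined as
$$\rho_{h_i(X_i), h_j(X_j)} = \frac{COV\left[h_i(X_i), h_j(X_j)\right]}{\sqrt{V\left[h_i(X_i)\right] V\left[h_j(X_j)\right]}}.$$
Correlation measures the strength of the linear relation between two variables. It lies between $-1$ and $1$, $\rho_{h_i(X_i), h_j(X_j)} \in [-1, 1]$.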
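• Given its definition, the covariance operator satisfies the following properties
$$COV[X_i, X_i] = V[X_i],$$
$$COV[\alpha_i + \beta_i X_i, \alpha_j + \beta_j X_j] = \beta_i \beta_j COV[X_i, X_j],$$
$$COV[\alpha_i + \beta_i X_i + \alpha_k + \beta_k X_k, \alpha_j + \beta_j X_j] = \beta_i \beta_j COV[X_i, X_j] + \beta_k \beta_j COV[X_k, X_j].$$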
Therefore, if
Xi = α+ βXj ,
then
COV [Xi, Xj ] = COV [α+ βXj , Xj ] = βV [Xj ] .
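The variance and covariance identities can be verified exactly on a small example. The joint distribution below is hypothetical (two binary random variables with made-up probabilities), and the helper names `expect`, `cov`, and `var` are chosen for this sketch. The covariance is computed from the definition given above.

```python
from fractions import Fraction as F

# Hypothetical joint distribution of (X1, X2): {(x1, x2): probability}.
joint = {(0, 0): F(3, 10), (0, 1): F(1, 5), (1, 0): F(1, 10), (1, 1): F(2, 5)}

def expect(g):
    """Expectation of g(X1, X2) under the joint distribution."""
    return sum(p * g(x1, x2) for (x1, x2), p in joint.items())

def cov(g, h):
    """COV[g, h] = E[(E[g] - g)(E[h] - h)], as in the definition."""
    mean_g, mean_h = expect(g), expect(h)
    return expect(lambda x1, x2: (mean_g - g(x1, x2)) * (mean_h - h(x1, x2)))

def var(g):
    """V[g] = COV[g, g]."""
    return cov(g, g)

def first(x1, x2):
    return x1

def second(x1, x2):
    return x2

# Variance of a linear combination: with N = 2 the cross term has N(N-1) = 2 items.
a1, a2 = F(2), F(-3)
lhs = var(lambda x1, x2: a1 * x1 + a2 * x2)
rhs = a1**2 * var(first) + a2**2 * var(second) + 2 * a1 * a2 * cov(first, second)
print(lhs == rhs)

# Additive constants drop out and slopes multiply.
print(cov(lambda x1, x2: 1 + 2 * x1, lambda x1, x2: 3 - 5 * x2) == -10 * cov(first, second))

# If Xi = alpha + beta * Xj, then COV[Xi, Xj] = beta * V[Xj].
print(cov(lambda x1, x2: 5 + 4 * x2, second) == 4 * var(second))
```

All three comparisons hold exactly because the computation uses fractions rather than floating-point numbers.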
Mean-variance trade-off
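It can be shown that the indifference curve $E(\sigma)$ is increasing and convex. In other words, $\frac{dE}{d\sigma} > 0$ and $\frac{d^2E}{d\sigma^2} > 0$. This follows from the fact that the utility function $u(\cdot)$ is increasing and concave.
To show that
$$\frac{dE}{d\sigma} = -\frac{\int_{-\infty}^{+\infty} u'(E + \sigma Z) Z \varphi(Z)\,dZ}{\int_{-\infty}^{+\infty} u'(E + \sigma Z) \varphi(Z)\,dZ} > 0$$
consider the numerator
$$\int_{-\infty}^{+\infty} u'(E + \sigma Z) Z \varphi(Z)\,dZ = \int_{-\infty}^{0} u'(E + \sigma Z) Z \varphi(Z)\,dZ + \int_{0}^{+\infty} u'(E + \sigma Z) Z \varphi(Z)\,dZ.$$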
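Denote $\tilde{Z} = -Z$ and rewrite, using the symmetry of the density, $\varphi(-\tilde{Z}) = \varphi(\tilde{Z})$,
$$\begin{aligned}
\int_{-\infty}^{+\infty} u'(E + \sigma Z) Z \varphi(Z)\,dZ &= \int_{-\infty}^{0} u'(E - \sigma \tilde{Z}) (-\tilde{Z}) \varphi(-\tilde{Z})\,d(-\tilde{Z}) + \int_{0}^{+\infty} u'(E + \sigma Z) Z \varphi(Z)\,dZ \\
&= -\int_{0}^{+\infty} u'(E - \sigma \tilde{Z}) \tilde{Z} \varphi(\tilde{Z})\,d\tilde{Z} + \int_{0}^{+\infty} u'(E + \sigma Z) Z \varphi(Z)\,dZ \\
&= \int_{0}^{+\infty} \left[u'(E + \sigma Z) - u'(E - \sigma Z)\right] Z \varphi(Z)\,dZ.
\end{aligned}$$
In the first integral on the right-hand side the variable of integration is $-\tilde{Z}$, which runs from $-\infty$ to $0$.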
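Since $u'' < 0$, marginal utility is decreasing, so $u'(E + \sigma Z) < u'(E - \sigma Z)$ for any $Z > 0$ and $\sigma > 0$. It therefore has to be that
$$\int_{-\infty}^{+\infty} u'(E + \sigma Z) Z \varphi(Z)\,dZ < 0,$$
which, together with $u' > 0$ in the denominator, implies that $\frac{dE}{d\sigma} > 0$.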
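The sign of the slope can be checked numerically for a specific utility function. The sketch below is an illustration under stated assumptions: it takes the exponential (CARA) utility u(c) = -exp(-c), which is increasing and concave, assumes Z is standard normal, and approximates both integrals with a trapezoid rule on a truncated interval. For this particular utility the slope equals σ in closed form, which gives a value to compare against. The function names are chosen here.

```python
import math

def phi(z):
    """Standard normal density, assumed here for Z."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)

def u_prime(c):
    """Marginal utility of the assumed CARA utility u(c) = -exp(-c)."""
    return math.exp(-c)

def integrate(f, a, b, n=20000):
    """Composite trapezoid rule on [a, b] with n sub-intervals."""
    h = (b - a) / n
    return h * (0.5 * (f(a) + f(b)) + sum(f(a + i * h) for i in range(1, n)))

def slope(E, sigma, bound=12.0):
    """dE/dsigma = -(numerator)/(denominator), with integrals truncated to [-bound, bound]."""
    num = integrate(lambda z: u_prime(E + sigma * z) * z * phi(z), -bound, bound)
    den = integrate(lambda z: u_prime(E + sigma * z) * phi(z), -bound, bound)
    return -num / den

for sigma in (0.5, 1.0, 1.5):
    print(sigma, slope(1.0, sigma))
```

The numerator is negative in every case, so the computed slope is positive, and for this utility it matches σ up to numerical error.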
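In turn, to show that $\frac{d^2E}{d\sigma^2} > 0$, consider two points on the indifference curve $(E_1, \sigma_1)$ and $(E_2, \sigma_2)$, and their average $\left(\frac{E_1 + E_2}{2}, \frac{\sigma_1 + \sigma_2}{2}\right)$. Notice that by construction $E[u(E_1 + \sigma_1 Z)] = E[u(E_2 + \sigma_2 Z)]$.
The indifference curve would be convex if for any two points $(E_1, \sigma_1)$ and $(E_2, \sigma_2)$ and $Z$
$$\frac{1}{2} u(E_1 + \sigma_1 Z) + \frac{1}{2} u(E_2 + \sigma_2 Z) < u\left(\frac{E_1 + E_2}{2} + \frac{\sigma_1 + \sigma_2}{2} Z\right).$$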
This clearly holds since $u$ is a concave function, which implies that
$$E\left[u\left(\frac{E_1 + E_2}{2} + \frac{\sigma_1 + \sigma_2}{2} Z\right)\right] > E\left[u(E_1 + \sigma_1 Z)\right] = E\left[u(E_2 + \sigma_2 Z)\right].$$
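As a worked illustration, and only under assumptions that are not part of the argument above, take the exponential (CARA) utility $u(c) = -e^{-c}$ and let $Z$ be standard normal. In this special case the indifference curve has a closed form, and both properties can be read off directly. Below, $\mathbb{E}$ denotes the expectation operator (written $E[\cdot]$ in the text), to keep it apart from the mean $E$, and $k$ is a constant that indexes the utility level.

```latex
% Assumptions for illustration only: u(c) = -e^{-c}, Z ~ N(0,1).
\begin{align*}
\mathbb{E}\left[u(E + \sigma Z)\right]
  &= -e^{-E}\,\mathbb{E}\left[e^{-\sigma Z}\right]
   = -e^{-E + \sigma^{2}/2}. \\
\intertext{Fixing expected utility at the level $-e^{-k}$ gives the indifference curve}
E(\sigma) &= k + \frac{\sigma^{2}}{2}, \\
\frac{dE}{d\sigma} &= \sigma > 0 \quad \text{for } \sigma > 0, \\
\frac{d^{2}E}{d\sigma^{2}} &= 1 > 0 .
\end{align*}
```

The first line uses the moment generating function of the standard normal distribution, $\mathbb{E}[e^{tZ}] = e^{t^{2}/2}$, evaluated at $t = -\sigma$.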
References
Cagan, P. (1956). The monetary dynamics of hyperinflation. In M. Friedman (Ed.), Studies in the Quantity Theory of Money. Chicago: University of Chicago Press.
Calvo, G. A. (1983). Staggered prices in a utility-maximizing framework. Journal of Monetary Economics 12 (3), 383–398.
Carlin, W. and D. Soskice (2005). The 3-equation New Keynesian Model – a graphical exposition. Contributions in Macroeconomics 5 (1).
Doepke, M., A. Lehnert, and A. Sellgren (1999). Macroeconomics. Available online (last accessed 29.03.2015). http://faculty.wcas.northwestern.edu/~mdo738/book.htm.
Domar, E. D. (1946). Capital expansion, rate of growth, and employment. Econometrica 14 (2), 137–147.
Goodwin, R. (1967). A growth cycle. In C. H. Feinstein (Ed.), Socialism, Capitalism & Economic Growth, pp. 54–58. London: Macmillan.
Hansen, G. D. (1985). Indivisible labor and the business cycle. Journal of Monetary Economics 16 (3), 309–327.
Harrod, R. F. (1939). An essay in dynamic theory. Economic Journal 49 (193), 14–33.
Kaldor, N. (1934). A classificatory note on the determinateness of equilibrium. Review of Economic Studies 1 (2), 122–136.
King, R. G. and S. T. Rebelo (1999). Resuscitating real business cycles. Volume 1, Part B of Handbook of Macroeconomics, pp. 927–1007. Elsevier.
Kydland, F. E. and E. C. Prescott (1982). Time to build and aggregate fluctuations. Econometrica 50 (6), 1345–1370.
Nagel, R. (1995). Unraveling in guessing games: An experimental study. American Economic Review 85 (5), 1313–1326.
Solow, R. M. (1956). A contribution to the theory of economic growth. Quarterly Journal of Economics 70 (1), 65–94.
Taylor, J. B. (1980). Aggregate dynamics and staggered contracts. Journal of Political Economy 88 (1), 1.