Conditions for the Existence of Control Functions in Nonseparable Simultaneous Equations Models¹

Richard Blundell, UCL and IFS
and
Rosa L. Matzkin, UCLA

First version: March 2008. This version: October 2010.

Abstract
The control function approach (Heckman and Robb (1985)) in a system of linear simultaneous equations provides a convenient procedure to estimate one of the functions in the system using reduced form residuals from the other functions as additional regressors. The conditions on the structural system under which this procedure can be used in nonlinear and nonparametric simultaneous equations have thus far been unknown. In this note, we define a new property of functions called control function separability and show it provides a complete characterization of the structural systems of simultaneous equations in which the control function procedure is valid.

Key Words: Nonseparable models, simultaneous equations, control functions. JEL Classification: C3.
¹ Acknowledgements: We are grateful to Whitney Newey, Arthur Lewbel, and participants at the Cemmap-Cowles Conference on "Advancing Applied Microeconometrics," in Beijing, August 2010, for helpful comments. The research is part of the program of research of the ESRC Centre for the Microeconomic Analysis of Public Policy at IFS. Funding from the ESRC is gratefully acknowledged. Rosa L. Matzkin gratefully acknowledges as well support from NSF. Yong Hyeon Yang provided excellent research assistance. The usual disclaimer applies.
1 Introduction
A standard situation in applied econometrics is one where the interest lies in estimating a nonseparable model of the form

y1 = m1(y2, ε1)

when it is suspected or known that y2 is itself a function of y1. Additionally, there is an observable variable x, which might be used as an instrument for the estimation of m1. Specifically, one believes that for some function m2 and unobservable ε2,

y2 = m2(y1, x, ε2).
The nonparametric identification and estimation of m1 under different assumptions on this model has been studied in Roehrig (1988), Newey and Powell (1989, 2003), Brown and Matzkin (1998), Darolles, Florens, and Renault (2002), Ai and Chen (2003), Hall and Horowitz (2003), Benkard and Berry (2004, 2006), Chernozhukov and Hansen (2005), and Matzkin (2005, 2008, 2010a), among others (see Blundell and Powell (2003), Matzkin (2007), and many others, for partial surveys).
If the model were linear with additive unobservables, one could estimate m1 by first estimating a reduced form function for y2, which would also turn out to be linear,

y2 = h2(x, η) = πx + η,

and then using η as an additional conditioning variable in the estimation of m1, an idea dating back to Telser (1964).²
² Heckman (1978) references this paper in his comprehensive discussion of estimating simultaneous models with discrete endogenous variables. Blundell and Powell (2003) note that it is difficult to locate a definitive early reference to the control function version of
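To make the linear case concrete, the following simulation (our own illustrative sketch; the coefficient values, the simulated system, and the numpy least-squares routine are not from the paper) checks both claims: the coefficient on y2 in a regression of y1 on (y2, η̂) recovers the structural coefficient, and it coincides with the 2SLS coefficient, as in the Dhrymes result cited in footnote 2.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 20_000
x = rng.normal(size=n)
e1 = rng.normal(size=n)
e2 = 0.5 * e1 + rng.normal(size=n)      # correlated errors create endogeneity

# linear simultaneous system: y1 = a*y2 + e1, y2 = b*y1 + c*x + e2
a, b, c = 0.5, -0.3, 1.0
det = 1.0 - a * b
y1 = (a * c * x + e1 + a * e2) / det    # reduced form for y1
y2 = (c * x + b * e1 + e2) / det        # reduced form for y2

# first stage: regress y2 on x, keep the residuals eta_hat
X1 = np.column_stack([np.ones(n), x])
pi = np.linalg.lstsq(X1, y2, rcond=None)[0]
eta_hat = y2 - X1 @ pi

# control function regression: y1 on (y2, eta_hat)
beta_cf = np.linalg.lstsq(np.column_stack([np.ones(n), y2, eta_hat]), y1,
                          rcond=None)[0]

# 2SLS: regress y1 on the first-stage fitted values of y2
beta_2sls = np.linalg.lstsq(np.column_stack([np.ones(n), X1 @ pi]), y1,
                            rcond=None)[0]

print(beta_cf[1], beta_2sls[1])  # both estimate a = 0.5, and agree exactly
```

The exact agreement of the two slope estimates follows because the first-stage residual is orthogonal to the fitted values, so partialling it out leaves the 2SLS projection unchanged.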
If the structural model were triangular, in the sense that y1 is not an argument in m2, a generalized version of this procedure could be applied to nonparametric, nonadditive versions of the model, as developed in Chesher (2003) and Imbens and Newey (2009). Their control function methods can be used in the triangular structural model

y1 = m1(y2, ε1)
y2 = s(x, η)

when x is independent of (ε1, η), m1 is strictly increasing in ε1, and s is strictly increasing in the unobservable η.
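In this triangular setup the control variable can be taken to be the conditional rank η = F_{Y2|X}(y2 | x), which is identified because s is strictly increasing in η. A minimal numerical sketch (the function s, the discrete support of x, and the uniform normalization of η are our own illustrative choices, not from the paper):

```python
import numpy as np

rng = np.random.default_rng(1)
n = 50_000
x = rng.integers(0, 5, size=n)            # discrete x keeps the example simple
eta = rng.uniform(size=n)                 # scalar unobservable, U(0,1) w.l.o.g.
y2 = x + np.sqrt(x + 1.0) * eta           # some s(x, eta), increasing in eta

# recover eta as the empirical conditional CDF of y2 given x
eta_hat = np.empty(n)
for v in range(5):
    cell = x == v
    ranks = y2[cell].argsort().argsort()  # 0..m-1 ranks within the x-cell
    eta_hat[cell] = (ranks + 1) / (cell.sum() + 1)

print(np.corrcoef(eta, eta_hat)[0, 1])    # close to 1: eta is recovered
```

Conditioning on the estimated η̂ then plays the same role as conditioning on the reduced form residual in the linear case.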
When the simultaneous model cannot be expressed in a triangular form, one can consider alternative restrictions on the joint distribution of (ε1, ε2) and use the estimation approach in Matzkin (2010a), or one can assume that ε1 is independent of x and use the instrumental variable estimator; see Chernozhukov and Hansen (2005) and Chernozhukov, Imbens and Newey (2007).³
The question we aim to answer is the following: Suppose that we were interested in estimating the function m1 when the structural model is of the form

y1 = m1(y2, ε1)
y2 = m2(y1, x, ε2)
2SLS. Dhrymes (1970, equation 4.3.57) shows that the 2SLS coefficients can be obtained by a least squares regression of y1 on y2 and η, while Telser (1964) shows how the seemingly unrelated regressions model can be estimated by using residuals from other equations as regressors in a particular equation of interest.

³ Unlike the control function approach and the Matzkin approach, the instrumental variable estimator requires dealing with the ill-posed inverse problem. (See also Hahn and Ridder (2010) regarding estimation of nonseparable models using conditional moment restrictions.)
and x is independent of (ε1, ε2). Under what conditions on m2 can we do this by first estimating a function for y2 of the type

y2 = s(x, η)

and then using η as an additional conditioning variable in the estimation of m1?
More specifically, we seek an answer to the question: Under what conditions on m2 is it the case that the simultaneous equations Model (S)

y1 = m1(y2, ε1)
y2 = m2(y1, x, ε2)

with x independent of (ε1, ε2), is observationally equivalent to the triangular Model (T)

y1 = m1(y2, ε1)
y2 = s(x, η)

with x independent of (ε1, η)?
In what follows we first define a new property of functions, control function separability. We then show, in Section 3, that this property completely characterizes systems of simultaneous equations where a function of interest can be estimated using a control function. An example of a utility function whose system of demand functions satisfies control function separability is presented in Section 4 and illustrates the restrictiveness of the control function assumptions.
Section 5 describes how to extend our results to Limited Dependent Variable models with simultaneity in latent or observable continuous variables. The Appendix provides conditions in terms of the derivatives of the structural functions in the system and conditions in terms of restrictions on the reduced form system. Section 6 concludes.
2 Assumptions and Definitions
2.1 The structural model and control function separability
We will consider the structural model

Model (S)
y1 = m1(y2, ε1)
y2 = m2(y1, x, ε2)
satisfying the following assumptions.
Assumption S.1 (differentiability): For all values (y1, y2, x, ε1, ε2) of (Y1, Y2, X, ε1, ε2), the functions m1 and m2 are continuously differentiable.

Assumption S.2 (independence): (ε1, ε2) is distributed independently of X.

Assumption S.3 (support): Conditional on any value x of X, the densities of (ε1, ε2) and of (Y1, Y2) are continuous and have convex support.
Assumption S.4 (monotonicity): For all values y2 of Y2, the function m1 is strictly increasing in ε1; and for all values (y1, x) of (Y1, X), the function m2 is strictly increasing in ε2.

Assumption S.5 (crossing): For all values (y1, y2, x, ε1, ε2) of (Y1, Y2, X, ε1, ε2),

(∂m1(y2, ε1)/∂y2) · (∂m2(y1, x, ε2)/∂y1) < 1.
The technical assumptions S.1-S.3 could be partially relaxed at the cost of making the presentation more complex. Assumption S.4 guarantees that the function m1 can be inverted in ε1 and that the function m2 can be inverted in ε2. Hence, this assumption allows us to express the direct system of structural equations (S), defined by (m1, m2), in terms of a structural inverse system (I) of functions (r1, r2), which map any vector of observable variables (y1, y2, x) into the vector of unobservable variables (ε1, ε2),

Model (I)
ε1 = r1(y1, y2)
ε2 = r2(y1, y2, x).
Assumption S.5 is a weakening of the common situation where the value of the endogenous variables is determined by the intersection of a downward-sloping and an upward-sloping function. Together with Assumption S.4, this assumption guarantees the existence of a unique reduced form system (R) of equations, defined by functions (h1, h2), which map the vector of exogenous variables (ε1, ε2, x) into the vector of endogenous variables (y1, y2),

Model (R)
y1 = h1(x, ε1, ε2)
y2 = h2(x, ε1, ε2).
These assumptions also guarantee that the reduced form function h1 is monotone increasing in ε1 and the reduced form function h2 is monotone increasing in ε2. These results are established in the following Lemma.

Lemma 1: Suppose that Model (S) satisfies Assumptions S.1-S.5. Then, there exist unique functions h1 and h2 representing Model (S). Moreover, for all (x, ε1, ε2), h1 and h2 are continuously differentiable, ∂h1(x, ε1, ε2)/∂ε1 > 0 and ∂h2(x, ε1, ε2)/∂ε2 > 0.
Proof of Lemma 1: Assumption S.4 guarantees the existence of the structural inverse system (I) of differentiable functions (r1, r2) satisfying

y1 = m1(y2, r1(y1, y2))
y2 = m2(y1, x, r2(y1, y2, x))
By Assumption S.1, we can differentiate these equations with respect to y1 and y2, to get

( 1  0 )   ( (∂m1/∂ε1)(∂r1/∂y1)              ∂m1/∂y2 + (∂m1/∂ε1)(∂r1/∂y2) )
( 0  1 ) = ( ∂m2/∂y1 + (∂m2/∂ε2)(∂r2/∂y1)    (∂m2/∂ε2)(∂r2/∂y2)          )

Hence, ∂r1/∂y1 = (∂m1/∂ε1)⁻¹, ∂r2/∂y2 = (∂m2/∂ε2)⁻¹, ∂r1/∂y2 = −(∂m1/∂ε1)⁻¹(∂m1/∂y2), and ∂r2/∂y1 = −(∂m2/∂ε2)⁻¹(∂m2/∂y1).
These expressions together with Assumptions S.4 and S.5 imply that ∂r1/∂y1 > 0, ∂r2/∂y2 > 0, and (∂r1/∂y2)(∂r2/∂y1) < (∂r1/∂y1)(∂r2/∂y2). Hence the determinants of all principal submatrices of the Jacobian matrix

( ∂r1(y1, y2)/∂y1      ∂r1(y1, y2)/∂y2    )
( ∂r2(y1, y2, x)/∂y1   ∂r2(y1, y2, x)/∂y2 )

of (r1, r2) with respect to (y1, y2) are positive. It follows by Gale and Nikaido
(1965) that there exist unique functions (h1, h2) such that for all (x, ε1, ε2),

ε1 = r1(h1(x, ε1, ε2), h2(x, ε1, ε2))
ε2 = r2(h1(x, ε1, ε2), h2(x, ε1, ε2), x)
We have then established the existence of the reduced form system (R). The Implicit Function Theorem implies, by Assumption S.1, that h1 and h2 are continuously differentiable. Moreover, the Jacobian matrix of (h1, h2) with respect to (ε1, ε2) is the inverse of the Jacobian matrix of (r1, r2) with respect to (y1, y2). Assumptions S.4 and S.5 then imply that for all (x, ε1, ε2), ∂h1(x, ε1, ε2)/∂ε1 > 0 and ∂h2(x, ε1, ε2)/∂ε2 > 0. This completes the proof of Lemma 1.//
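As an illustrative numerical check of Lemma 1, consider example functions chosen by us (not from the paper) to satisfy S.4 and S.5: here ∂m1/∂y2 and ∂m2/∂y1 are each bounded by 1/2, so their product is below 1, and the system can be solved by fixed-point iteration:

```python
import numpy as np

def m1(y2, e1):    return e1 + 0.5 * np.arctan(y2)
def m2(y1, x, e2): return e2 + x + 0.5 * np.arctan(y1)

def reduced_form(x, e1, e2, iters=200):
    """Solve y1 = m1(y2, e1), y2 = m2(y1, x, e2); with these slopes the
    iteration is a contraction, so the fixed point (h1, h2) is unique."""
    y1 = y2 = 0.0
    for _ in range(iters):
        y1, y2 = m1(y2, e1), m2(y1, x, e2)
    return y1, y2

y1_a, y2_a = reduced_form(x=1.0, e1=0.0, e2=0.0)
y1_b, y2_b = reduced_form(x=1.0, e1=0.5, e2=0.0)   # raise e1 only
y1_c, y2_c = reduced_form(x=1.0, e1=0.0, e2=0.5)   # raise e2 only
print(y1_b > y1_a, y2_c > y2_a)   # h1 increasing in e1, h2 increasing in e2
```

The monotonicity of the computed fixed points in ε1 and ε2 is exactly the conclusion of the Lemma for this example.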
We next define a new property, which we call control function separability.
Definition: A structural inverse system of equations (r1(y1, y2), r2(y1, y2, x)) satisfies control function separability if there exist functions v : R² → R and q : R² → R such that

(a) for all (y1, y2, x),

r2(y1, y2, x) = v(q(y2, x), r1(y1, y2))

(b) for any value of its second argument, v is strictly increasing in its first argument, and

(c) for any value of its second argument, q is strictly increasing in its first argument.
2.2 The triangular model and observational equivalence
We will consider triangular models of the form

Model (T)
y1 = m1(y2, ε1)
y2 = s(x, η)
satisfying the following assumptions.
Assumption T.1 (differentiability): For all values (y1, y2, x, ε1, η) of (Y1, Y2, X, ε1, η), the functions m1 and s are continuously differentiable.

Assumption T.2 (independence): (ε1, η) is distributed independently of X.

Assumption T.3 (support): Conditional on any value x of X, the densities of (ε1, η) and of (Y1, Y2) are continuous and have convex support.

Assumption T.4 (monotonicity): For all values of y2, the function m1 is strictly increasing in ε1; and for all values of x, the function s is strictly increasing in η.
Using the standard definition of observational equivalence, we will say that Model (S) is observationally equivalent to Model (T) if the distributions of the observable variables generated by each of these models are the same:

Definition: Model (S) is observationally equivalent to Model (T) if and only if for all (y1, y2, x) such that fX(x) > 0,

fY1,Y2|X=x(y1, y2; S) = fY1,Y2|X=x(y1, y2; T).
In the next section, we establish that control function separability completely characterizes observational equivalence between Model (S) and Model (T).

3 Characterization of Observational Equivalence and Control Function Separability
Our characterization theorem is the following:
Theorem 1: Suppose that Model (S) satisfies Assumptions S.1-S.5 and Model (T) satisfies Assumptions T.1-T.4. Then, Model (S) is observationally equivalent to Model (T) if and only if the inverse system of equations (r1(y1, y2), r2(y1, y2, x)) derived from (S) satisfies control function separability.
Proof of Theorem 1: Suppose that Model (S) is observationally equivalent to Model (T). Then, for all (y1, y2, x) such that fX(x) > 0,

fY1,Y2|X=x(y1, y2; S) = fY1,Y2|X=x(y1, y2; T).
Consider the transformation

ε1 = r1(y1, y2)
y2 = y2
x = x

The inverse of this transformation is

y1 = m1(y2, ε1)
y2 = y2
x = x
Hence, the conditional densities of (ε1, y2) given X = x, under Model (T) and under Model (S), are, respectively,

fε1,Y2|X=x(ε1, y2; T) = fY1,Y2|X=x(m1(y2, ε1), y2; T) · (∂m1(y2, ε1)/∂ε1)

and

fε1,Y2|X=x(ε1, y2; S) = fY1,Y2|X=x(m1(y2, ε1), y2; S) · (∂m1(y2, ε1)/∂ε1).

In particular, for all y2, all x such that fX(x) > 0, and for ε1 = r1(y1, y2),

(T1.1) fY2|ε1=r1(y1,y2),X=x(y2; T) = fY2|ε1=r1(y1,y2),X=x(y2; S).
That is, the distribution of Y2 conditional on ε1 = r1(y1, y2) and X = x, generated by either Model (S) or Model (T), must be the same. By Model (T), the distribution of Y2 conditional on (ε1, X) = (r1(y1, y2), x) can be expressed as

Pr(Y2 ≤ y2 | ε1 = r1(y1, y2), X = x)
= Pr(s(x, η) ≤ y2 | ε1 = r1(y1, y2), X = x)
= Pr(η ≤ s̃(x, y2) | ε1 = r1(y1, y2), X = x)
= Fη|ε1=r1(y1,y2)(s̃(x, y2)),

where s̃ denotes the inverse of s with respect to η. The existence of s̃ and its strict monotonicity with respect to y2 are guaranteed by Assumption T.4. The last equality follows because Assumption T.2 implies that, conditional on ε1, η is independent of X. On the other side, by Model (S), we have that
Pr(Y2 ≤ y2 | ε1 = r1(y1, y2), X = x)
= Pr(h2(x, ε1, ε2) ≤ y2 | ε1 = r1(y1, y2), X = x)
= Pr(ε2 ≤ h̃2(x, ε1, y2) | ε1 = r1(y1, y2), X = x)
= Pr(ε2 ≤ r2(m1(y2, ε1), y2, x) | ε1 = r1(y1, y2), X = x)
= Fε2|ε1=r1(y1,y2)(r2(m1(y2, ε1), y2, x)),
where h̃2 denotes the inverse of h2 with respect to ε2. The existence of h̃2 and its strict monotonicity with respect to y2 follow by Lemma 1. The third equality follows because when ε1 = r1(y1, y2), the value of ε2 such that

y2 = h2(x, ε1, ε2)

is

ε2 = r2(y1, y2, x).

The last equality follows because Assumption S.2 implies that, conditional on ε1, ε2 is independent of X.
Equating the expressions obtained for Pr(Y2 ≤ y2 | ε1 = r1(y1, y2), X = x) from Model (T) and from Model (S), we can conclude that for all (y2, x, ε1),

(T1.2) Fε2|ε1=r1(y1,y2)(r2(m1(y2, ε1), y2, x)) = Fη|ε1=r1(y1,y2)(s̃(x, y2)).

Substituting y1 for m1(y2, ε1), we get that for all (y1, y2, x),

Fε2|ε1=r1(y1,y2)(r2(y1, y2, x)) = Fη|ε1=r1(y1,y2)(s̃(x, y2)).
Note that the distribution of ε2 conditional on ε1 can be expressed as an unknown function G(ε2, ε1) of two arguments. Analogously, the distribution of η conditional on ε1 can be expressed as an unknown function H(η, ε1). Denote the (possibly infinite) support of ε2 conditional on ε1 = r1(y1, y2) by [ε2L, ε2U], and the (possibly infinite) support of η conditional on ε1 = r1(y1, y2) by [ηL, ηU]. Our Assumptions S.2 and S.3 imply that the distribution Fε2|ε1=r1(y1,y2)(·) is strictly increasing on [ε2L, ε2U] and maps [ε2L, ε2U] onto [0, 1]. Our Assumptions T.2 and T.3 imply that the distribution Fη|ε1=r1(y1,y2)(·) is strictly increasing on [ηL, ηU] and maps [ηL, ηU] onto [0, 1]. Hence, (T1.2) and our assumptions imply that there exists a function s̃, strictly increasing in its second argument, and functions G(ε2, ε1) and H(η, ε1), such that for all (y1, y2, x) with fX(x) > 0 and fY1,Y2|X=x(y1, y2) > 0,

G(r2(y1, y2, x), r1(y1, y2)) = H(s̃(x, y2), r1(y1, y2)),

and G and H are both strictly increasing in their first arguments at, respectively, ε2 = r2(y1, y2, x) and η = s̃(x, y2). Let G̃ denote the inverse of G with respect to its first argument. Then, G̃(·, r1(y1, y2)) : [0, 1] → [ε2L, ε2U] is strictly increasing on (0, 1) and

r2(y1, y2, x) = G̃(H(s̃(x, y2), r1(y1, y2)), r1(y1, y2)).
This implies that r2 is weakly separable into r1(y1, y2) and a function of (y2, x) strictly increasing in y2. Moreover, since H and G̃ are both strictly increasing with respect to their first arguments on their respective relevant domains, r2 must be strictly increasing in the value of s̃. Extending the function s̃ to be strictly increasing at all y2 ∈ R, and extending the function G̃ ∘ H to be strictly increasing at all values s̃ ∈ R, we can conclude that (T1.1), and hence also the observational equivalence between Model (T) and Model (S), implies that (r1(y1, y2), r2(y1, y2, x)) satisfies control function separability.
To show that control function separability implies the observational equivalence between Model (S) and Model (T), suppose that Model (S), satisfying Assumptions S.1-S.5, is such that there exist continuously differentiable functions v : R² → R and q : R² → R such that for all (y1, y2, x),

r2(y1, y2, x) = v(q(y2, x), r1(y1, y2)),

where for any value of r1(y1, y2), q is strictly increasing in y2 and v is strictly increasing in its first argument. Let ε1 = r1(y1, y2) and η = q(y2, x). Then

ε2 = r2(y1, y2, x) = v(η, ε1),

where v is strictly increasing in η. Letting ṽ denote the inverse of v with respect to η, it follows that

q(y2, x) = η = ṽ(ε2, ε1).
Since η is a function of (ε1, ε2), Assumption S.2 implies Assumption T.2. Since ṽ is strictly increasing in ε2, Assumption S.3 implies that (ε1, η) has a continuous density on a convex support. Let q̃ denote the inverse of q with respect to y2. The function q̃ exists because q is strictly increasing in y2. Then,

y2 = q̃(η, x) = q̃(ṽ(ε2, ε1), x).

Since

y2 = h2(x, ε1, ε2),

it follows that
y2 = h2(x, ε1, ε2) = q̃(ṽ(ε2, ε1), x),

where q̃ is strictly increasing with respect to its first argument. Hence,

y2 = h2(x, ε1, ε2) = q̃(η, x),

where q̃ is strictly increasing in η. This implies that, under control function separability, the system composed of the structural form function for y1 and the reduced form function for y2 is of the form

y1 = m1(y2, ε1)
y2 = h2(x, ε1, ε2) = q̃(ṽ(ε2, ε1), x) = q̃(η, x),

where q̃ is strictly increasing in η and (ε1, η) is independent of X. To show that the model generated by (m1, h2) is observationally equivalent to the model generated by (m1, q̃), we note that the model generated by (m1, h2) implies that for all x such that fX(x) > 0,
fY1,Y2|X=x(y1, y2; S)
= fε1,ε2(r1(y1, y2), r2(y1, y2, x)) · |r1y1 · r2y2 − r1y2 · r2y1|,

where r1y1 = ∂r1(y1, y2)/∂y1, r2y2 = ∂r2(y1, y2, x)/∂y2, r1y2 = ∂r1(y1, y2)/∂y2, and r2y1 = ∂r2(y1, y2, x)/∂y1. On the other side, for the model generated by (m1, q̃), we have that
fY1,Y2|X=x(y1, y2; T)
= fε1,η(r1(y1, y2), ṽ(r2(y1, y2, x), r1(y1, y2))) · |r1y1(ṽ1·r2y2 + ṽ2·r1y2) − r1y2(ṽ1·r2y1 + ṽ2·r1y1)|,

where ṽ1 denotes the derivative of ṽ with respect to its first coordinate and ṽ2 the derivative of ṽ with respect to its second coordinate. Since

r1y1(ṽ1·r2y2 + ṽ2·r1y2) − r1y2(ṽ1·r2y1 + ṽ2·r1y1) = ṽ1(r1y1·r2y2 − r1y2·r2y1)
and

fε2|ε1=r1(y1,y2)(r2(y1, y2, x)) = fη|ε1=r1(y1,y2)(ṽ(r2(y1, y2, x), r1(y1, y2))) · ṽ1,

it follows that for all x such that fX(x) > 0,

fY1,Y2|X=x(y1, y2; S) = fY1,Y2|X=x(y1, y2; T).

Hence, control function separability implies that Model (S) is observationally equivalent to Model (T). This completes the proof of Theorem 1.//
Theorem 1 provides a characterization of two-equation systems with simultaneity in which one of the functions can be estimated using the other to derive a control function. One of the main conclusions of the theorem is that, to verify whether one of the equations can be used to derive a control function, one must check that the inverse function of that equation, which maps the observable endogenous and exogenous variables into the value of the unobservable, is separable into the inverse function of the first equation and a function not involving the dependent variable of the first equation. That is, the function

y2 = m2(y1, x, ε2)

can be used to derive a control function to identify the function m1, where

y1 = m1(y2, ε1),

if and only if the inverse function of m2 with respect to ε2 is separable into r1 and a function of y2 and x.
In the Appendix we provide equivalent characterizations of these conditions in terms of the derivatives of the structural functions and of the reduced form system (R).
4 An example
We next provide an example of an optimization problem for which the first order conditions satisfy control function separability. Our results then imply that one can estimate the structural equation using a control function approach. The objective function in our example is specified as

V(y1, y2, x1, x2, x3) = (ε1 + ε2) u(y2) + ε1 log(y1 − u(y2)) − y1x1 − y2x2 + x3.

This can be the objective function of a consumer choosing demands for three products (y1, y2, y3) subject to a linear budget constraint, x1y1 + x2y2 + y3 ≤ x3, with x1 and x2 denoting the prices of, respectively, y1 and y2, and x3 denoting income.
The first order conditions with respect to y1 and y2 are

(5.1) y1: ε1/(y1 − u(y2)) − x1 = 0

(5.2) y2: (ε1 + ε2) u′(y2) − u′(y2) · ε1/(y1 − u(y2)) − x2 = 0

The Hessian of the objective function is

( −ε1/(y1 − u(y2))²          ε1 u′(y2)/(y1 − u(y2))²                                          )
( ε1 u′(y2)/(y1 − u(y2))²    (ε1 + ε2 − ε1/(y1 − u(y2))) u″(y2) − (u′(y2))² ε1/(y1 − u(y2))² )

This Hessian is negative definite when ε1 > 0, u′(y2) > 0, u″(y2) < 0, and

ε1 + ε2 − ε1/(y1 − u(y2)) > 0.

Since at the values of (y1, y2) that satisfy the first order conditions, ε1/(y1 − u(y2)) = x1 and (ε1 + ε2 − ε1/(y1 − u(y2))) u′(y2) = x2, as long as x1 > 0 and x2 > 0 the objective function is strictly concave at the (y1, y2) that solve the optimization problem.
To obtain the system of structural equations, note that from (5.1) we get

(5.3) ε1 = [y1 − u(y2)] x1

and using (5.3) in (5.2), we get

(5.4) [(ε1 + ε2) − x1] u′(y2) = x2.

Hence,

ε2 = x2/u′(y2) − y1x1 + u(y2)x1 + x1 = (x2/u′(y2) + x1) − (y1 − u(y2)) x1.
We can then easily see that the resulting system of structural equations, which is

ε1 = [y1 − u(y2)] x1
ε2 = (x2/u′(y2) + x1) − (y1 − u(y2)) x1,

satisfies control function separability. The triangular system of equations, which can then be estimated using a control function for nonseparable models, is

y1 = u(y2) + ε1/x1
y2 = (u′)⁻¹( x2 / ((ε1 + ε2) − x1) )
The unobservable η = ε1 + ε2 is the control function for y2 in the equation for y1. Conditional on η = ε1 + ε2, y2 is a function of only (x1, x2), which is independent of ε1. Hence, conditional on η = ε1 + ε2, y2 is independent of ε1, exactly the condition one needs to use η as the control function in the estimation of the equation for y1.
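To see the example at work numerically, fix a particular u, say u(y) = √y (our illustrative choice, satisfying u′ > 0 and u″ < 0). Equation (5.4) then gives u′(y2) = x2/((ε1 + ε2) − x1), so y2 depends on the errors only through η = ε1 + ε2; this is control function separability with q(y2, x) = x2/u′(y2) + x1 and v(q, ε1) = q − ε1:

```python
import numpy as np

def u(y):  return np.sqrt(y)          # our illustrative choice of u
def du(y): return 0.5 / np.sqrt(y)    # u'(y) > 0, u''(y) < 0

def solve_demands(x1, x2, e1, e2):
    """Solve the first-order conditions (5.3)-(5.4) for (y1, y2).
    From (5.4): u'(y2) = x2 / (e1 + e2 - x1), so y2 depends on the
    errors only through eta = e1 + e2."""
    eta = e1 + e2
    y2 = ((eta - x1) / (2.0 * x2)) ** 2      # invert u'(y) = 1/(2*sqrt(y))
    y1 = u(y2) + e1 / x1                     # from (5.3)
    return y1, y2

x1, x2 = 1.0, 0.5
y1_a, y2_a = solve_demands(x1, x2, e1=2.0, e2=3.0)
y1_b, y2_b = solve_demands(x1, x2, e1=4.0, e2=1.0)  # same eta = 5

# same eta -> same y2, even though (e1, e2) and hence y1 differ
print(y2_a == y2_b, y1_a != y1_b)

# sanity check: the solution satisfies the structural inverse system
e1_back = (y1_a - u(y2_a)) * x1
e2_back = x2 / du(y2_a) + x1 - e1_back
print(round(e1_back, 10), round(e2_back, 10))   # recovers (2.0, 3.0)
```

Two error pairs with the same sum produce identical y2, which is exactly the statement that η = ε1 + ε2 is a valid control for y2.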
5 Simultaneity in Latent Variables
Our results can be applied to a wide range of Limited Dependent Variable
models with simultaneity in the latent variables, when additional exogenous
variables are observed and some separability conditions are satisfied. In
particular, suppose that we were interested in estimating m1 in the model
y1* = m1(y2*, w1, w2, ε1)
y2* = m2(y1*, w1, w2, x, ε2)

where instead of observing (y1*, y2*), we observed a transformation (y1, y2) of (y1*, y2*) defined by a known vector function (T1, T2),

y1 = T1(y1*, y2*)
y2 = T2(y1*, y2*)
Assume that (w1, w2, x) is independent of (ε1, ε2) and that, for known functions b1 and b2 and unknown functions m1 and m2, the system of simultaneous equations can be written as

b1(y1*, w1) = m1(b2(y2*, w2), ε1)
b2(y2*, w2) = m2(b1(y1*, w1), x, ε2)

Then, under support conditions on (w1, w2, x) and on the range of (T1, T2), and under invertibility conditions on (b1, b2), one can express this system as

b1 = m1(b2, ε1)
b2 = m2(b1, x, ε2)

where the distribution of (b1, b2, x) is known. (See Matzkin (2010b) for formal assumptions and arguments and more general models.) The identification and estimation of m1 can then proceed using a control function approach, as developed in the previous sections, when this system satisfies control function separability.
To provide a simple specific example of the arguments that are involved in the above statements, we consider a special case of a binary threshold crossing model analyzed in Briesch, Chintagunta and Matzkin (1997, 2010),

y1* = m1(y2, ε1) + w1
y1 = 0 if y1* ≤ 0
   = 1 otherwise

Suppose that instead of assuming, as they did, that (y2, w1) is independent of ε1, we assume that

y2 = m2(y1* − w1, x, ε2)

and that (x, w1) is independent of (ε1, ε2). An example of such a model is one where y2 is discretionary expenditure by an individual in a store for which expenditures are observable, w1 is an exogenous observable expenditure, and y1* − w1 is unobserved discretionary expenditure over the fixed amount w1.
Assuming that m1 is invertible in ε1 and m2 is invertible in ε2, we can rewrite the two-equation system as

ε1 = r1(y1* − w1, y2)
ε2 = r2(y1* − w1, y2, x)

If this system can be expressed as

ε1 = r1(y1* − w1, y2)
ε2 = v(r1(y1* − w1, y2), s(y2, x))

for some unknown functions r1, v and s satisfying our regularity conditions, then one can identify and estimate m1 using a control function approach. To shed more light on this result, let b1* = y1* − w1. Then, the model becomes

b1* = m1(y2, ε1)
y2 = m2(b1*, x, ε2)

with a system of reduced form functions

b1* = h1(x, ε1, ε2)
y2 = h2(x, ε1, ε2)
Following Matzkin (2010b), we extend arguments for identification of semiparametric binary threshold crossing models using conditional independence (Lewbel (2000)), and arguments for identification of nonparametric and nonadditive binary threshold crossing models using independence (Matzkin (1992), Briesch, Chintagunta, and Matzkin (1997, 2010)), to models with simultaneity. For this, we assume that (X, W) has an everywhere positive density. Our independence assumption implies that W is independent of (ε1, ε2) conditional on X. Then, since conditional on X, (b1*, y2) is only a function of (ε1, ε2), we have that for all w1, t1,

Pr((B1*, Y2) ≤ (t1, y2) | X = x)
= Pr((B1*, Y2) ≤ (t1, y2) | W1 = w1, X = x)
= Pr((Y1* − W1, Y2) ≤ (t1, y2) | W1 = w1, X = x)
= Pr((Y1*, Y2) ≤ (t1 + w1, y2) | W1 = w1, X = x)
Letting w1 = −t1, we get that

Pr((B1*, Y2) ≤ (t1, y2) | X = x) = Pr((Y1*, Y2) ≤ (0, y2) | W1 = −t1, X = x)

Hence, the distribution of (b1*, y2) conditional on X is identified. The analysis of the system

b1* = m1(y2, ε1)
y2 = m2(b1*, x, ε2)
when this identified distribution is given is analogous to the analysis of the system

y1 = m1(y2, ε1)
y2 = m2(y1, x, ε2)

with the distribution of (y1, y2) given X, considered in our previous sections. In particular, if the system satisfies control function separability, we can first estimate the model y2 = s(x, η), where s is an unknown function increasing in η, and then use the estimated η as a control in the estimation of m1.
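The identification step above, recovering the distribution of (b1*, y2) from the binary y1 by setting w1 = −t1, can be checked by simulation. The functional forms below are our own illustrative choices; the identity being verified is Pr((B1*, Y2) ≤ (t1, y2) | X = x) = Pr(Y1 = 0, Y2 ≤ y2 | W1 = −t1, X = x):

```python
import numpy as np

rng = np.random.default_rng(2)
n = 400_000
x  = rng.integers(0, 2, size=n)               # binary exogenous x
w1 = rng.choice([-1.0, 0.0, 1.0], size=n)     # exogenous shifter, indep. of errors
e1 = rng.normal(size=n)
e2 = 0.5 * e1 + rng.normal(size=n)

# reduced form of the latent system (illustrative functional forms)
b1 = e1 + 0.3 * e2 + 0.5 * x                  # b1* = y1* - w1
y2 = e2 + x
y1_star = b1 + w1
y1 = (y1_star > 0).astype(float)              # only y1, y2, w1, x are observed

t1, y2_bar, x_bar = 1.0, 0.5, 1

# left side: uses the latent b1* (available only because we simulated it)
lhs = np.mean((b1 <= t1) & (y2 <= y2_bar) & (x == x_bar)) / np.mean(x == x_bar)

# right side: uses only observables, conditioning on W1 = -t1
sel = (w1 == -t1) & (x == x_bar)
rhs = np.mean((y1[sel] == 0) & (y2[sel] <= y2_bar))

print(lhs, rhs)   # the two conditional probabilities agree up to sampling error
```

The event {Y1 = 0} given W1 = −t1 is exactly {B1* ≤ t1}, so the observable right-hand side traces out the latent distribution as t1 varies over the support of −W1.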
6 Conclusions
In this note we have provided a conclusive answer to the question of when it is possible to use a control function approach to identify and estimate a function in a simultaneous equations model. We define a new property of functions, called control function separability, which characterizes systems of simultaneous equations where a function of interest can be estimated using a control function derived from the second equation. We show that this condition is equivalent to requiring that the reduced form function for the endogenous regressor in the function of interest be separable into x and a function of all the unobservable variables. We also provide conditions in terms of the derivatives of the two functions in the system.
An example of a system of structural equations, generated by the first order conditions of an optimization problem and satisfying control function separability, was presented. We have also shown how our results can be used to identify and estimate Limited Dependent Variable models with simultaneity in the latent or observable continuous variables.
7 Appendix A
7.1 Characterization in terms of Derivatives
Taking advantage of the assumed differentiability, we can characterize systems where one of the functions can be estimated using a control function approach by a condition in terms of the derivatives of the functions of Models (T) and (S). The following result provides such a condition. Let r2x = ∂r2(y1, y2, x)/∂x, r2y1 = ∂r2(y1, y2, x)/∂y1, and r2y2 = ∂r2(y1, y2, x)/∂y2 denote the derivatives of r2; let s̃x = ∂s̃(x, y2)/∂x and s̃y2 = ∂s̃(x, y2)/∂y2 denote the derivatives of s̃, the inverse of s with respect to η; and let m1y2 = ∂m1(y2, ε1)/∂y2 denote the derivative of the function of interest m1 with respect to the endogenous variable y2.
Theorem 2: Suppose that Model (S) satisfies Assumptions S.1-S.5 and that Model (T) satisfies Assumptions T.1-T.4. Then, Model (S) is observationally equivalent to Model (T) if and only if for all (y1, y2, x),

r2x / (r2y1 · m1y2 + r2y2) = s̃x / s̃y2
Proof of Theorem 2: As in the proof of Theorem 1, observational equivalence between Model (T) and Model (S) implies that for all y2, x, and ε1 = r1(y1, y2),

(T1.2) Fε2|ε1(r2(m1(y2, ε1), y2, x)) = Fη|ε1(s̃(x, y2))

Differentiating both sides of (T1.2) with respect to y2 and x, we get

fε2|ε1(r2(m1(y2, ε1), y2, x)) · (r2y1 · m1y2 + r2y2) = fη|ε1(s̃(x, y2)) · s̃y2
fε2|ε1(r2(m1(y2, ε1), y2, x)) · r2x = fη|ε1(s̃(x, y2)) · s̃x

where, as defined above, r2y1 = ∂r2(m1(y2, ε1), y2, x)/∂y1, r2y2 = ∂r2(m1(y2, ε1), y2, x)/∂y2, r2x = ∂r2(m1(y2, ε1), y2, x)/∂x, m1y2 = ∂m1(y2, ε1)/∂y2, s̃y2 = ∂s̃(x, y2)/∂y2, and s̃x = ∂s̃(x, y2)/∂x. Taking ratios, we get

r2x / (r2y1 · m1y2 + r2y2) = s̃x / s̃y2
Conversely, suppose that for all y2, x, ε1,

(T2.1) r2x / (r2y1 · m1y2 + r2y2) = s̃x / s̃y2.

Define

b(y2, x, ε1) = r2(m1(y2, ε1), y2, x).

(T2.1) implies that, for any fixed value of ε1, the function b(y2, x, ε1) is a transformation of s̃(x, y2): the gradients of b and s̃ with respect to (y2, x) are proportional at every point, so the two functions share the same level sets. Let t(·, ε1) : R → R denote such a transformation. Then, for all y2, x,

b(y2, x, ε1) = r2(m1(y2, ε1), y2, x) = t(s̃(x, y2), ε1).

Substituting y1 for m1(y2, ε1) and r1(y1, y2) for ε1, it follows that

r2(y1, y2, x) = t(s̃(x, y2), r1(y1, y2)).

Hence, (T2.1) implies control function separability. This implies, by Theorem 1, that Model (T) and Model (S) are observationally equivalent, and it completes the proof of Theorem 2.//
Instead of characterizing observational equivalence in terms of the derivatives of the functions m1 and r2, we can express observational equivalence in terms of the derivatives of the inverse structural form functions. Differentiating with respect to y1 and y2 the identity

y1 = m1(y2, r1(y1, y2))

and solving for m1y2, we get that

m1y2 = −r1y2 / r1y1.
Hence, the condition that for all (y1, y2, x)

r2x / (r2y1 · m1y2 + r2y2) = s̃x / s̃y2

is equivalent to the condition that for all (y1, y2, x)

r1y1(y1, y2) · r2x(y1, y2, x) / [r1y1(y1, y2) · r2y2(y1, y2, x) − r1y2(y1, y2) · r2y1(y1, y2, x)] = s̃x(x, y2) / s̃y2(x, y2)

or

r1y1(y1, y2) · r2x(y1, y2, x) / |ry(y1, y2, x)| = s̃x(x, y2) / s̃y2(x, y2),

where |ry(y1, y2, x)| is the Jacobian determinant of the vector function r = (r1, r2) with respect to (y1, y2).
Note that, since the right-hand side does not depend on y1, differentiating both sides of the above equation with respect to y1 yields the following condition, expressed only in terms of the derivatives of the inverse system of structural equations of Model (S):

∂/∂y1 [ r1y1(y1, y2) · r2x(y1, y2, x) / |ry(y1, y2, x)| ] = 0
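The condition above can be verified symbolically for the Section 4 example, with u left as an abstract function and x2 playing the role of the excluded exogenous variable (this sympy check is our own, not from the paper):

```python
import sympy as sp

y1, y2, x1, x2 = sp.symbols('y1 y2 x1 x2', positive=True)
u = sp.Function('u')(y2)

# inverse structural system from the Section 4 example
r1 = (y1 - u) * x1
r2 = x2 / sp.diff(u, y2) + x1 - r1

# Jacobian determinant of (r1, r2) with respect to (y1, y2)
J = sp.Matrix([[sp.diff(r1, y1), sp.diff(r1, y2)],
               [sp.diff(r2, y1), sp.diff(r2, y2)]]).det()

expr = sp.diff(r1, y1) * sp.diff(r2, x2) / J
print(sp.simplify(sp.diff(expr, y1)))   # 0: the separability condition holds
```

The determinant reduces to −x1·x2·u″(y2)/u′(y2)², which contains no y1, so the derivative condition holds identically for this example.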
7.2 Characterization in terms of the Reduced Form Functions

An alternative characterization, which follows from the proof of Theorem 1, is in terms of the reduced form functions. Suppose we ask when the function

y2 = m2(y1, x, ε2)

can be used to derive a control function to identify the function m1, where

y1 = m1(y2, ε1).

Our arguments show that the control function approach can be used if and only if the reduced form function h2(x, ε1, ε2) for y2 can be expressed as a function of x and a function of (ε1, ε2). That is, the control function approach can be used if and only if, for some functions q and v,

h2(x, ε1, ε2) = q(x, v(ε1, ε2))

Note that while the sufficiency of such a condition is obvious, the necessity, which follows from Theorem 1, had not been previously known.⁴

⁴ Kasy (2010) also highlights the one-dimensional distribution conditional on the reduced form h2 but does not relate this to restrictions on the structure of the simultaneous equations system (S), which is our primary objective.
8 References
AI, C. and X. CHEN (2003), "Efficient Estimation of Models with Conditional Moment Restrictions Containing Unknown Functions," Econometrica, 71, 1795-1843.

BENKARD, C.L. and S. BERRY (2004), "On the Nonparametric Identification of Nonlinear Simultaneous Equations Models: Comment on B. Brown (1983) and Roehrig (1988)," Cowles Foundation Discussion Paper #1482.

BENKARD, C.L. and S. BERRY (2006), "On the Nonparametric Identification of Nonlinear Simultaneous Equations Models: Comment on B. Brown (1983) and Roehrig (1988)," Econometrica, 74(5).

BLUNDELL, R. and J. L. POWELL (2003), "Endogeneity in Nonparametric and Semiparametric Regression Models," in Advances in Economics and Econometrics, Theory and Applications, Eighth World Congress, Volume II, edited by M. Dewatripont, L.P. Hansen, and S.J. Turnovsky, Cambridge University Press, Cambridge, U.K.

BRIESCH, R., P. CHINTAGUNTA, and R.L. MATZKIN (1997), "Nonparametric Discrete Choice Models with Unobserved Heterogeneity," mimeo, Northwestern University.

BRIESCH, R., P. CHINTAGUNTA, and R.L. MATZKIN (2010), "Nonparametric Discrete Choice Models with Unobserved Heterogeneity," Journal of Business and Economic Statistics, 28(2).

BROWN, B.W. (1983), "The Identification Problem in Systems Nonlinear in the Variables," Econometrica, 51, 175-196.

BROWN, D.J. and R.L. MATZKIN (1998), "Estimation of Nonparametric Functions in Simultaneous Equations Models, with an Application to Consumer Demand," Cowles Foundation Discussion Paper #1175.

CHERNOZHUKOV, V. and C. HANSEN (2005), "An IV Model of Quantile Treatment Effects," Econometrica, 73, 245-261.

CHERNOZHUKOV, V., G. IMBENS, and W. NEWEY (2007), "Instrumental Variable Identification and Estimation of Nonseparable Models," Journal of Econometrics, 139(1), 4-14.

CHESHER, A. (2003), "Identification in Nonseparable Models," Econometrica, 71(5).

DAROLLES, S., J.P. FLORENS, and E. RENAULT (2002), "Nonparametric Instrumental Regression," mimeo, IDEI, Toulouse.

FLORENS, J.P., J.J. HECKMAN, C. MEGHIR, and E. VYTLACIL (2008), "Identification of Treatment Effects Using Control Functions in Models With Continuous, Endogenous Treatment and Heterogeneous Effects," Econometrica, 76(5), 1191-1206.

GALE, D. and H. NIKAIDO (1965), "The Jacobian Matrix and Global Univalence of Mappings," Math. Annalen, 159, 81-93.

HAHN, J. and G. RIDDER (2010), "Conditional Moment Restrictions and Triangular Simultaneous Equations," forthcoming in the Review of Eco-
nomics an Statistics.
HALL, P. and J.L. HOROWITZ (2003), “Nonparametric Methods for In-
ference in the Presence of Instrumental Variables,” Center for Microdata
Methods and Practice, WP 102/03.
HAUSMAN, J.A. (1983) “Specification and Estimation of Simultaneous Equa-
tion Models,” in Handbook of Econometrics, Vol. I, edited by Z. Griliches
and M.D. Intriligator.
HECKMAN, J.J. and R. ROBB (1985) "Alternative Methods for Evaluating
the Impact of Interventions," in Longitudinal Analysis of Labor Market Data,
edited by J.J. Heckman and B. Singer, Econometric Society Monograph 10,
Cambridge University Press, Cambridge, U.K.
HECKMAN, J.J. (1978) "Dummy Endogenous Variables in a Simultaneous
Equation System," Econometrica, Vol. 46, No. 4, 931-959.
IMBENS, G.W. AND W.K. NEWEY (2009) “Identification and Estimation
30
of Triangular Simultaneous Equations Models Without Additivity,” Econo-
metrica, Vol. 77, No. 5, September.
KASY, M. (2010) “Identification in Triangular Systems using Control Func-
tions,” mimeo, Berkeley, April.
LEWBEL, A. (2000) "Semiparametric Qualitative Response Model Estima-
tion with Unknown Heteroskedasticity and Instrumental Variables," Journal
of Econometrics, 97, 145-177.
MATZKIN, R.L. (1992) “Nonparametric and Distribution-Free Estimation
of the Binary Choice and the Threshold Crossing Models,” Econometrica,
Vol. 60, No. 2, p. 239-270.
MATZKIN, R.L. (2005) "Identification in Nonparametric Simultaneous Equa-
tions Models," mimeo, Northwestern University.
MATZKIN, R.L. (2007) “Nonparametric Identification,” Chapter 73 in Hand-
book of Econometrics, Vol. 6b, edited by J.J. Heckman and E.E. Leamer,
Elsevier B.V., 5307-5368.
MATZKIN, R.L. (2008) “Identification in Nonparametric Simultaneous Equa-
tions,” Econometrica, Vol. 76, No. 5, 945-978.
MATZKIN, R.L. (2010a) “Estimation of Nonparametric Models with Simul-
taneity," mimeo, UCLA.
31
MATZKIN, R.L. (2010b) “Identification in Nonparametric Limited Depen-
dent Variable Models with Simultaneity and Unobserved Heterogeneity,"
mimeo, UCLA.
NEWEY, W. and J. POWELL (1989) “Instrumental Variables Estimation
of Nonparametric Models,” mimeo.
NEWEY, W. and J. POWELL (2003) “Instrumental Variables Estimation
of Nonparametric Models,” Econometrica,.
ROEHRIG, C.S. (1988) "Conditions for Identification in Nonparametric and
Parametric Models," Econometrica, 56:2, 433-447.
TELSER, Lester G. (1964) "Iterative Estimation of a Set of Linear Regression
Equations." Journal of the American Statistical Association, 59(307), 845-62.
32