
Conditions for the Existence of Control Functions in Nonseparable Simultaneous Equations Models¹

Richard Blundell
UCL and IFS

and

Rosa L. Matzkin
UCLA

First version: March 2008
This version: October 2010

Abstract

The control function approach (Heckman and Robb (1985)) in a system of linear simultaneous equations provides a convenient procedure to estimate one of the functions in the system using reduced form residuals from the other functions as additional regressors. The conditions on the structural system under which this procedure can be used in nonlinear and nonparametric simultaneous equations have thus far been unknown. In this note, we define a new property of functions called control function separability and show it provides a complete characterization of the structural systems of simultaneous equations in which the control function procedure is valid.

Key Words: Nonseparable models, simultaneous equations, control functions. JEL Classification: C3.

¹Acknowledgements: We are grateful to Whitney Newey, Arthur Lewbel, and participants at the Cemmap-Cowles Conference on "Advancing Applied Microeconometrics", in Beijing, August 2010, for helpful comments. The research is part of the program of research of the ESRC Centre for the Microeconomic Analysis of Public Policy at IFS. Funding from the ESRC is gratefully acknowledged. Rosa L. Matzkin also gratefully acknowledges support from the NSF. Yong Hyeon Yang provided excellent research assistance. The usual disclaimer applies.


1 Introduction

A standard situation in applied econometrics is where one is interested in estimating a nonseparable model of the form

$$y_1 = m^1(y_2, \varepsilon_1)$$

when it is suspected or known that $y_2$ is itself a function of $y_1$. Additionally, there is an observable variable $x$, which might be used as an instrument for the estimation of $m^1$. Specifically, one believes that for some function $m^2$ and unobservable $\varepsilon_2$,

$$y_2 = m^2(y_1, x, \varepsilon_2).$$

The nonparametric identification and estimation of $m^1$ under different assumptions on this model have been studied in Roehrig (1988), Newey and Powell (1989, 2003), Brown and Matzkin (1998), Darolles, Florens, and Renault (2002), Ai and Chen (2003), Hall and Horowitz (2003), Benkard and Berry (2004, 2006), Chernozhukov and Hansen (2005), and Matzkin (2005, 2008, 2010a), among others (see Blundell and Powell (2003), Matzkin (2007), and many others for partial surveys).

If the model were linear with additive unobservables, one could estimate $m^1$ by first estimating a reduced form function for $y_2$, which would also turn out to be linear,

$$y_2 = h^2(x, \varepsilon) = x\beta + \varepsilon,$$

and then using $\varepsilon$ as an additional conditioning variable in the estimation of $m^1$, an idea dating back to Telser (1964).²

²Heckman (1978) references this paper in his comprehensive discussion of estimating simultaneous models with discrete endogenous variables. Blundell and Powell (2003) note that it is difficult to locate a definitive early reference to the control function version of 2SLS. Dhrymes (1970, equation 4.3.57) shows that the 2SLS coefficients can be obtained by a least squares regression of $y_1$ on $y_2$ and $\varepsilon$, while Telser (1964) shows how the seemingly unrelated regressions model can be estimated by using residuals from other equations as regressors in a particular equation of interest.
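To make this linear-model logic concrete, the following minimal simulation (our own illustration, with hypothetical coefficient values; not part of the original text) checks numerically that regressing $y_1$ on $y_2$ and the first-stage residual reproduces the two stage least squares coefficient on $y_2$, as in Dhrymes (1970):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100_000

# Hypothetical linear simultaneous system: y1 = beta*y2 + e1, first stage y2 = pi*x + d,
# with e1 and d correlated so that y2 is endogenous in the y1 equation.
beta, pi = 0.5, 1.0
x = rng.normal(size=n)
d = rng.normal(size=n)
e1 = 0.8 * d + rng.normal(size=n)      # endogeneity: e1 correlated with d
y2 = pi * x + d
y1 = beta * y2 + e1

# Two stage least squares (just identified): beta_iv = cov(x, y1) / cov(x, y2).
beta_iv = np.cov(x, y1)[0, 1] / np.cov(x, y2)[0, 1]

# Control function: first-stage residual, then OLS of y1 on (1, y2, residual).
ones = np.ones(n)
pi_hat = np.linalg.lstsq(np.column_stack([ones, x]), y2, rcond=None)[0]
d_hat = y2 - np.column_stack([ones, x]) @ pi_hat
beta_cf = np.linalg.lstsq(np.column_stack([ones, y2, d_hat]), y1, rcond=None)[0][1]

print(beta_iv, beta_cf)                # the two estimates of beta coincide
assert np.isclose(beta_iv, beta_cf)
```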


If the structural model were triangular, in the sense that $y_1$ is not an argument in $m^2$, a generalized version of this procedure could be applied to nonparametric, nonadditive versions of the model, as developed in Chesher (2003) and Imbens and Newey (2009). Their control function methods can be used in the triangular structural model

$$y_1 = m^1(y_2, \varepsilon_1)$$
$$y_2 = s(x, \varepsilon)$$

when $x$ is independent of $(\varepsilon_1, \varepsilon)$, $m^1$ is strictly increasing in $\varepsilon_1$, and $s$ is strictly increasing in the unobservable $\varepsilon$.
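As a rough illustration of how such a control variable is used in practice (our own schematic sketch, not from the original text, with a hypothetical data generating process; the conditional rank of $y_2$ given $x$ serves as the estimated control, in the spirit of the methods cited above):

```python
import numpy as np

rng = np.random.default_rng(1)
n = 200_000

# Hypothetical triangular DGP: y2 = s(x, eps) strictly increasing in eps, eps ~ U(0,1),
# and y1 = a*y2 + b*y2**2 + e1 with e1 correlated with eps (so y2 is endogenous).
a, b = 1.0, 0.5
x = rng.integers(0, 5, size=n)               # discrete instrument
eps = rng.uniform(size=n)
e1 = (eps - 0.5) + 0.1 * rng.normal(size=n)  # E[e1 | eps] = eps - 0.5
y2 = (1.0 + x) * eps + x                     # strictly increasing in eps for each x
y1 = a * y2 + b * y2**2 + e1

# Control variable: the conditional rank of y2 given x estimates V = F_{Y2|X}(y2|x) = eps.
V = np.empty(n)
for k in np.unique(x):
    idx = np.flatnonzero(x == k)
    ranks = y2[idx].argsort().argsort() + 1
    V[idx] = ranks / (idx.size + 1)

# Conditional on the control, y2 is "as good as exogenous": regressing y1 on
# (1, y2, y2^2, V) recovers a and b, while omitting V does not.
Z_cf = np.column_stack([np.ones(n), y2, y2**2, V])
Z_ols = np.column_stack([np.ones(n), y2, y2**2])
print(np.linalg.lstsq(Z_cf, y1, rcond=None)[0][1:3])   # approx. [1.0, 0.5]
print(np.linalg.lstsq(Z_ols, y1, rcond=None)[0][1:3])  # biased without the control
```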

When the simultaneous model cannot be expressed in a triangular form, one can consider alternative restrictions on the joint distribution of $(\varepsilon_1, \varepsilon_2)$ and use the estimation approach in Matzkin (2010a), or one can assume that $\varepsilon_1$ is independent of $x$ and use the instrumental variable estimator; see Chernozhukov and Hansen (2005) and Chernozhukov, Imbens and Newey (2007).³

³Unlike the control function approach and the Matzkin approach, the instrumental variable estimator requires dealing with the ill-posed inverse problem. (See also Hahn and Ridder (2010) regarding estimation of nonseparable models using conditional moment restrictions.)

The question we aim to answer is the following: Suppose that we were interested in estimating the function $m^1$ when the structural model is of the form

$$y_1 = m^1(y_2, \varepsilon_1)$$
$$y_2 = m^2(y_1, x, \varepsilon_2)$$


and $x$ is independent of $(\varepsilon_1, \varepsilon_2)$. Under what conditions on $m^2$ can we do this by first estimating a function for $y_2$ of the type

$$y_2 = s(x, \varepsilon)$$

and then using $\varepsilon$ as an additional conditioning variable in the estimation of $m^1$?

More specifically, we seek an answer to the question: Under what conditions on $m^2$ is it the case that the simultaneous equations Model (S)

$$y_1 = m^1(y_2, \varepsilon_1)$$
$$y_2 = m^2(y_1, x, \varepsilon_2)$$

with $x$ independent of $(\varepsilon_1, \varepsilon_2)$, is observationally equivalent to the triangular Model (T)

$$y_1 = m^1(y_2, \varepsilon_1)$$
$$y_2 = s(x, \varepsilon)$$

with $x$ independent of $(\varepsilon_1, \varepsilon)$?

In what follows we first define a new property of functions, control function separability. We then show, in Section 3, that this property completely characterizes systems of simultaneous equations where a function of interest can be estimated using a control function. An example of a utility function whose system of demand functions satisfies control function separability is presented in Section 4 and illustrates the restrictiveness of the CF assumptions. Section 5 describes how to extend our results to Limited Dependent Variable models with simultaneity in latent or observable continuous variables. The Appendix provides conditions in terms of the derivatives of the structural functions in the system and conditions in terms of restrictions on the reduced form system. Section 6 concludes.

2 Assumptions and Definitions

2.1 The structural model and control function separability

We will consider the structural model

Model (S):
$$y_1 = m^1(y_2, \varepsilon_1)$$
$$y_2 = m^2(y_1, x, \varepsilon_2)$$

satisfying the following assumptions.

Assumption S.1 (differentiability): For all values $(y_1, y_2, x, \varepsilon_1, \varepsilon_2)$ of $(Y_1, Y_2, X, \varepsilon_1, \varepsilon_2)$, the functions $m^1$ and $m^2$ are continuously differentiable.

Assumption S.2 (independence): $(\varepsilon_1, \varepsilon_2)$ is distributed independently of $X$.

Assumption S.3 (support): Conditional on any value $x$ of $X$, the densities of $(\varepsilon_1, \varepsilon_2)$ and of $(Y_1, Y_2)$ are continuous and have convex support.

Assumption S.4 (monotonicity): For all values $y_2$ of $Y_2$, the function $m^1$ is strictly increasing in $\varepsilon_1$; and for all values $(y_1, x)$ of $(Y_1, X)$, the function $m^2$ is strictly increasing in $\varepsilon_2$.

Assumption S.5 (crossing): For all values $(y_1, y_2, x, \varepsilon_1, \varepsilon_2)$ of $(Y_1, Y_2, X, \varepsilon_1, \varepsilon_2)$,
$$\left(\frac{\partial m^1(y_2, \varepsilon_1)}{\partial y_2}\right)\left(\frac{\partial m^2(y_1, x, \varepsilon_2)}{\partial y_1}\right) < 1.$$

The technical assumptions S.1-S.3 could be partially relaxed at the cost of making the presentation more complex. Assumption S.4 guarantees that the function $m^1$ can be inverted in $\varepsilon_1$ and that the function $m^2$ can be inverted in $\varepsilon_2$. Hence, this assumption allows us to express the direct system of structural equations (S), defined by $(m^1, m^2)$, in terms of a structural inverse system (I) of functions $(r^1, r^2)$, which map any vector of observable variables $(y_1, y_2, x)$ into the vector of unobservable variables $(\varepsilon_1, \varepsilon_2)$,

Model (I):
$$\varepsilon_1 = r^1(y_1, y_2)$$
$$\varepsilon_2 = r^2(y_1, y_2, x).$$

Assumption S.5 is a weakening of the common situation where the value of the endogenous variables is determined by the intersection of a downward and an upward sloping function. Together with Assumption S.4, this assumption guarantees the existence of a unique reduced form system (R) of equations, defined by functions $(h^1, h^2)$, which map the vector of exogenous variables $(\varepsilon_1, \varepsilon_2, x)$ into the vector of endogenous variables $(y_1, y_2)$,

Model (R):
$$y_1 = h^1(x, \varepsilon_1, \varepsilon_2)$$
$$y_2 = h^2(x, \varepsilon_1, \varepsilon_2).$$

These assumptions also guarantee that the reduced form function $h^1$ is monotone increasing in $\varepsilon_1$ and the reduced form function $h^2$ is monotone increasing in $\varepsilon_2$. These results are established in the following lemma.
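For intuition, a simple linear special case (our own illustration, with hypothetical coefficients $a$, $b$, $c$ satisfying the crossing condition $ab < 1$) makes the three representations concrete:

$$\text{(S)}\quad y_1 = a\,y_2 + \varepsilon_1, \qquad y_2 = b\,y_1 + c\,x + \varepsilon_2,$$
$$\text{(I)}\quad \varepsilon_1 = r^1(y_1, y_2) = y_1 - a\,y_2, \qquad \varepsilon_2 = r^2(y_1, y_2, x) = y_2 - b\,y_1 - c\,x,$$
$$\text{(R)}\quad y_1 = h^1(x, \varepsilon_1, \varepsilon_2) = \frac{a c x + \varepsilon_1 + a\,\varepsilon_2}{1 - ab}, \qquad y_2 = h^2(x, \varepsilon_1, \varepsilon_2) = \frac{c x + b\,\varepsilon_1 + \varepsilon_2}{1 - ab}.$$

Here $\partial h^1/\partial \varepsilon_1 = \partial h^2/\partial \varepsilon_2 = 1/(1 - ab) > 0$, as Lemma 1 below asserts in general.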

Lemma 1: Suppose that Model (S) satisfies Assumptions S.1-S.5. Then, there exist unique functions $h^1$ and $h^2$ representing Model (S). Moreover, for all $(x, \varepsilon_1, \varepsilon_2)$, $h^1$ and $h^2$ are continuously differentiable, $\partial h^1(x, \varepsilon_1, \varepsilon_2)/\partial \varepsilon_1 > 0$, and $\partial h^2(x, \varepsilon_1, \varepsilon_2)/\partial \varepsilon_2 > 0$.

Proof of Lemma 1: Assumption S.4 guarantees the existence of the structural inverse system (I) of differentiable functions $(r^1, r^2)$ satisfying
$$y_1 = m^1\big(y_2,\, r^1(y_1, y_2)\big)$$
$$y_2 = m^2\big(y_1,\, x,\, r^2(y_1, y_2, x)\big).$$
By Assumption S.1, we can differentiate these equations with respect to $y_1$ and $y_2$, to get
$$\begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix} = \begin{pmatrix} \dfrac{\partial m^1}{\partial \varepsilon_1}\,\dfrac{\partial r^1}{\partial y_1} & \dfrac{\partial m^1}{\partial y_2} + \dfrac{\partial m^1}{\partial \varepsilon_1}\,\dfrac{\partial r^1}{\partial y_2} \\[2ex] \dfrac{\partial m^2}{\partial y_1} + \dfrac{\partial m^2}{\partial \varepsilon_2}\,\dfrac{\partial r^2}{\partial y_1} & \dfrac{\partial m^2}{\partial \varepsilon_2}\,\dfrac{\partial r^2}{\partial y_2} \end{pmatrix}.$$
Hence, $\partial r^1/\partial y_1 = (\partial m^1/\partial \varepsilon_1)^{-1}$, $\partial r^2/\partial y_2 = (\partial m^2/\partial \varepsilon_2)^{-1}$, $\partial r^1/\partial y_2 = -(\partial m^1/\partial \varepsilon_1)^{-1}(\partial m^1/\partial y_2)$, and $\partial r^2/\partial y_1 = -(\partial m^2/\partial \varepsilon_2)^{-1}(\partial m^2/\partial y_1)$. These expressions together with Assumptions S.4 and S.5 imply that $\partial r^1/\partial y_1 > 0$, $\partial r^2/\partial y_2 > 0$, and $(\partial r^1/\partial y_2)(\partial r^2/\partial y_1) < (\partial r^1/\partial y_1)(\partial r^2/\partial y_2)$. Hence the determinants of all principal submatrices of the Jacobian matrix
$$\begin{pmatrix} \dfrac{\partial r^1(y_1, y_2)}{\partial y_1} & \dfrac{\partial r^1(y_1, y_2)}{\partial y_2} \\[2ex] \dfrac{\partial r^2(y_1, y_2, x)}{\partial y_1} & \dfrac{\partial r^2(y_1, y_2, x)}{\partial y_2} \end{pmatrix}$$
of $(r^1, r^2)$ with respect to $(y_1, y_2)$ are positive. It follows by Gale and Nikaido (1965) that there exist unique functions $(h^1, h^2)$ such that for all $(x, \varepsilon_1, \varepsilon_2)$
$$\varepsilon_1 = r^1\big(h^1(x, \varepsilon_1, \varepsilon_2),\, h^2(x, \varepsilon_1, \varepsilon_2)\big)$$
$$\varepsilon_2 = r^2\big(h^1(x, \varepsilon_1, \varepsilon_2),\, h^2(x, \varepsilon_1, \varepsilon_2),\, x\big).$$
We have then established the existence of the reduced form system (R). The Implicit Function Theorem implies by Assumption S.1 that $h^1$ and $h^2$ are continuously differentiable. Moreover, the Jacobian matrix of $(h^1, h^2)$ with respect to $(\varepsilon_1, \varepsilon_2)$ is the inverse of the Jacobian matrix of $(r^1, r^2)$ with respect to $(y_1, y_2)$. Assumptions S.4 and S.5 then imply that for all $(x, \varepsilon_1, \varepsilon_2)$, $\partial h^1(x, \varepsilon_1, \varepsilon_2)/\partial \varepsilon_1 > 0$ and $\partial h^2(x, \varepsilon_1, \varepsilon_2)/\partial \varepsilon_2 > 0$. This completes the proof of Lemma 1.//

We next define a new property, which we call control function separability.

We next define a new property, which we call control function separability.

Definition: A structural inverse system of equations $\big(r^1(y_1, y_2),\, r^2(y_1, y_2, x)\big)$ satisfies control function separability if there exist functions $v: \mathbb{R}^2 \to \mathbb{R}$ and $q: \mathbb{R}^2 \to \mathbb{R}$ such that

(a) for all $(y_1, y_2, x)$,
$$r^2(y_1, y_2, x) = v\big(q(y_2, x),\, r^1(y_1, y_2)\big),$$

(b) for any value of its second argument, $v$ is strictly increasing in its first argument, and

(c) for any value of $x$, $q(y_2, x)$ is strictly increasing in $y_2$.
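Continuing the hypothetical linear illustration introduced above (our own example, not in the original text), the inverse system there satisfies this definition: since

$$r^2(y_1, y_2, x) = y_2 - b\,y_1 - c\,x = \underbrace{\big[(1 - ab)\,y_2 - c\,x\big]}_{q(y_2, x)} \;-\; b\,\underbrace{(y_1 - a\,y_2)}_{r^1(y_1, y_2)},$$

we may take $v(t, s) = t - b\,s$ and $q(y_2, x) = (1 - ab)\,y_2 - c\,x$; $v$ is strictly increasing in its first argument, and $q$ is strictly increasing in $y_2$ because the crossing condition gives $1 - ab > 0$.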

2.2 The triangular model and observational equivalence

We will consider triangular models of the form

Model (T):
$$y_1 = m^1(y_2, \varepsilon_1)$$
$$y_2 = s(x, \varepsilon)$$

satisfying the following assumptions.

Assumption T.1 (differentiability): For all values $(y_1, y_2, x, \varepsilon_1, \varepsilon)$ of $(Y_1, Y_2, X, \varepsilon_1, \varepsilon)$, the functions $m^1$ and $s$ are continuously differentiable.

Assumption T.2 (independence): $(\varepsilon_1, \varepsilon)$ is distributed independently of $X$.

Assumption T.3 (support): Conditional on any value $x$ of $X$, the densities of $(\varepsilon_1, \varepsilon)$ and of $(Y_1, Y_2)$ are continuous and have convex support.

Assumption T.4 (monotonicity): For all values of $y_2$, the function $m^1$ is strictly increasing in $\varepsilon_1$; and for all values of $x$, the function $s$ is strictly increasing in $\varepsilon$.

Using the standard definition of observational equivalence, we will say that Model (S) is observationally equivalent to Model (T) if the distributions of the observable variables generated by each of these models are the same:

Definition: Model (S) is observationally equivalent to Model (T) iff for all $(y_1, y_2, x)$ such that $f_X(x) > 0$,
$$f_{Y_1, Y_2 | X = x}(y_1, y_2; S) = f_{Y_1, Y_2 | X = x}(y_1, y_2; T).$$

In the next section, we establish that control function separability com-

pletely characterizes observational equivalence between Model (S) and Model

(T).

3 Characterization of Observational Equivalence and Control Function Separability

Our characterization theorem is the following:

Theorem 1: Suppose that Model (S) satisfies Assumptions S.1-S.5 and Model (T) satisfies Assumptions T.1-T.4. Then, Model (S) is observationally equivalent to Model (T) if and only if the inverse system of equations $\big(r^1(y_1, y_2),\, r^2(y_1, y_2, x)\big)$ derived from (S) satisfies control function separability.

Proof of Theorem 1: Suppose that Model (S) is observationally equivalent to Model (T). Then, for all $(y_1, y_2, x)$ such that $f_X(x) > 0$,
$$f_{Y_1, Y_2 | X = x}(y_1, y_2; S) = f_{Y_1, Y_2 | X = x}(y_1, y_2; T).$$
Consider the transformation
$$\varepsilon_1 = r^1(y_1, y_2), \qquad y_2 = y_2, \qquad x = x.$$
The inverse of this transformation is
$$y_1 = m^1(y_2, \varepsilon_1), \qquad y_2 = y_2, \qquad x = x.$$
Hence, the conditional densities of $(\varepsilon_1, y_2)$ given $X = x$, under Model (T) and under Model (S), are, respectively,
$$f_{\varepsilon_1, Y_2 | X = x}(\varepsilon_1, y_2; T) = f_{Y_1, Y_2 | X = x}\big(m^1(y_2, \varepsilon_1), y_2; T\big)\, \frac{\partial m^1(y_2, \varepsilon_1)}{\partial \varepsilon_1}$$
and
$$f_{\varepsilon_1, Y_2 | X = x}(\varepsilon_1, y_2; S) = f_{Y_1, Y_2 | X = x}\big(m^1(y_2, \varepsilon_1), y_2; S\big)\, \frac{\partial m^1(y_2, \varepsilon_1)}{\partial \varepsilon_1}.$$
In particular, for all $y_2$, all $x$ such that $f_X(x) > 0$, and for $\varepsilon_1 = r^1(y_1, y_2)$,
$$\text{(T1.1)}\qquad f_{Y_2 | \varepsilon_1 = r^1(y_1, y_2), X = x}(y_2; T) = f_{Y_2 | \varepsilon_1 = r^1(y_1, y_2), X = x}(y_2; S).$$
That is, the distribution of $Y_2$ conditional on $\varepsilon_1 = r^1(y_1, y_2)$ and $X = x$, generated by either Model (S) or Model (T), must be the same. By Model (T), the conditional distribution of $Y_2$ conditional on $(\varepsilon_1, X) = (r^1(y_1, y_2), x)$ can be expressed as
$$\Pr\big(Y_2 \leq y_2 \,\big|\, \varepsilon_1 = r^1(y_1, y_2), X = x\big)$$
$$= \Pr\big(s(x, \varepsilon) \leq y_2 \,\big|\, \varepsilon_1 = r^1(y_1, y_2), X = x\big)$$
$$= \Pr\big(\varepsilon \leq \tilde{s}(x, y_2) \,\big|\, \varepsilon_1 = r^1(y_1, y_2), X = x\big)$$
$$= F_{\varepsilon | \varepsilon_1 = r^1(y_1, y_2)}\big(\tilde{s}(x, y_2)\big),$$

where $\tilde{s}$ denotes the inverse of $s$ with respect to $\varepsilon$. The existence of $\tilde{s}$ and its strict monotonicity with respect to $y_2$ are guaranteed by Assumption T.4. The last equality follows because Assumption T.2 implies that, conditional on $\varepsilon_1$, $\varepsilon$ is independent of $X$. On the other side, by Model (S), we have that
$$\Pr\big(Y_2 \leq y_2 \,\big|\, \varepsilon_1 = r^1(y_1, y_2), X = x\big)$$
$$= \Pr\big(h^2(x, \varepsilon_1, \varepsilon_2) \leq y_2 \,\big|\, \varepsilon_1 = r^1(y_1, y_2), X = x\big)$$
$$= \Pr\big(\varepsilon_2 \leq \tilde{h}^2(x, \varepsilon_1, y_2) \,\big|\, \varepsilon_1 = r^1(y_1, y_2), X = x\big)$$
$$= \Pr\big(\varepsilon_2 \leq r^2\big(m^1(y_2, \varepsilon_1), y_2, x\big) \,\big|\, \varepsilon_1 = r^1(y_1, y_2), X = x\big)$$
$$= F_{\varepsilon_2 | \varepsilon_1 = r^1(y_1, y_2)}\Big(r^2\big(m^1(y_2, \varepsilon_1), y_2, x\big)\Big),$$
where $\tilde{h}^2$ denotes the inverse of $h^2$ with respect to $\varepsilon_2$. The existence of $\tilde{h}^2$ and its strict monotonicity with respect to $y_2$ follow by Lemma 1. The third equality follows because when $\varepsilon_1 = r^1(y_1, y_2)$, the value of $\varepsilon_2$ such that
$$y_2 = h^2(x, \varepsilon_1, \varepsilon_2)$$
is
$$\varepsilon_2 = r^2(y_1, y_2, x).$$
The last equality follows because Assumption S.2 implies that, conditional on $\varepsilon_1$, $\varepsilon_2$ is independent of $X$.

Equating the expressions that we got for $\Pr(Y_2 \leq y_2 \,|\, \varepsilon_1 = r^1(y_1, y_2), X = x)$ from Model (T) and from Model (S), we can conclude that for all $(y_2, x, \varepsilon_1)$,
$$\text{(T1.2)}\qquad F_{\varepsilon_2 | \varepsilon_1 = r^1(y_1, y_2)}\Big(r^2\big(m^1(y_2, \varepsilon_1), y_2, x\big)\Big) = F_{\varepsilon | \varepsilon_1 = r^1(y_1, y_2)}\big(\tilde{s}(x, y_2)\big).$$
Substituting $m^1(y_2, \varepsilon_1)$ by $y_1$, we get that for all $(y_1, y_2, x)$,
$$F_{\varepsilon_2 | \varepsilon_1 = r^1(y_1, y_2)}\big(r^2(y_1, y_2, x)\big) = F_{\varepsilon | \varepsilon_1 = r^1(y_1, y_2)}\big(\tilde{s}(x, y_2)\big).$$

Note that the distribution of $\varepsilon_2$ conditional on $\varepsilon_1$ can be expressed as an unknown function $G(\varepsilon_2, \varepsilon_1)$ of two arguments. Analogously, the distribution of $\varepsilon$ conditional on $\varepsilon_1$ can be expressed as an unknown function $H(\varepsilon, \varepsilon_1)$. Denote the (possibly infinite) support of $\varepsilon_2$ conditional on $\varepsilon_1 = r^1(y_1, y_2)$ by $[\varepsilon_{2L}, \varepsilon_{2U}]$, and the (possibly infinite) support of $\varepsilon$ conditional on $\varepsilon_1 = r^1(y_1, y_2)$ by $[\varepsilon_L, \varepsilon_U]$. Our Assumptions S.2 and S.3 imply that the distribution $F_{\varepsilon_2 | \varepsilon_1 = r^1(y_1, y_2)}(\cdot)$ is strictly increasing on $[\varepsilon_{2L}, \varepsilon_{2U}]$ and maps $[\varepsilon_{2L}, \varepsilon_{2U}]$ onto $[0, 1]$. Our Assumptions T.2 and T.3 imply that the distribution $F_{\varepsilon | \varepsilon_1 = r^1(y_1, y_2)}(\cdot)$ is strictly increasing on $[\varepsilon_L, \varepsilon_U]$ and maps $[\varepsilon_L, \varepsilon_U]$ onto $[0, 1]$. Hence, (T1.2) and our assumptions imply that there exists a function $\tilde{s}$, strictly increasing in its second argument, and functions $G(\varepsilon_2, \varepsilon_1)$ and $H(\varepsilon, \varepsilon_1)$, such that for all $(y_1, y_2, x)$ with $f_X(x) > 0$ and $f_{Y_1, Y_2 | X = x}(y_1, y_2) > 0$,
$$G\big(r^2(y_1, y_2, x),\, r^1(y_1, y_2)\big) = H\big(\tilde{s}(x, y_2),\, r^1(y_1, y_2)\big),$$
and $G$ and $H$ are both strictly increasing in their first arguments at, respectively, $\varepsilon_2 = r^2(y_1, y_2, x)$ and $\varepsilon = \tilde{s}(x, y_2)$. Let $\tilde{G}$ denote the inverse of $G$ with respect to its first argument. Then, $\tilde{G}(\cdot, r^1(y_1, y_2)): [0, 1] \to [\varepsilon_{2L}, \varepsilon_{2U}]$ is strictly increasing on $(0, 1)$ and
$$r^2(y_1, y_2, x) = \tilde{G}\Big(H\big(\tilde{s}(x, y_2),\, r^1(y_1, y_2)\big),\, r^1(y_1, y_2)\Big).$$
This implies that $r^2$ is weakly separable into $r^1(y_1, y_2)$ and a function of $(x, y_2)$, strictly increasing in $y_2$. Moreover, since $H$ and $\tilde{G}$ are both strictly increasing with respect to their first argument on their respective relevant domains, $r^2$ must be strictly increasing in the value of $\tilde{s}$. Extending the function $\tilde{s}$ to be strictly increasing at all $y_2 \in \mathbb{R}$, and extending the function $\tilde{G} \circ H$ to be strictly increasing at all values $\tilde{s} \in \mathbb{R}$, we can conclude that (T1.1), and hence also the observational equivalence between Model (T) and Model (S), implies that $\big(r^1(y_1, y_2),\, r^2(y_1, y_2, x)\big)$ satisfies control function separability.

To show that control function separability implies the observational equivalence between Model (S) and Model (T), suppose that Model (S), satisfying Assumptions S.1-S.5, is such that there exist continuously differentiable functions $v: \mathbb{R}^2 \to \mathbb{R}$ and $q: \mathbb{R}^2 \to \mathbb{R}$ such that for all $(y_1, y_2, x)$,
$$r^2(y_1, y_2, x) = v\big(q(y_2, x),\, r^1(y_1, y_2)\big),$$
where for any value of $r^1(y_1, y_2)$, $q$ is strictly increasing in $y_2$ and $v$ is strictly increasing in its first argument. Let $\varepsilon_1 = r^1(y_1, y_2)$ and $\varepsilon = q(y_2, x)$. Then
$$\varepsilon_2 = r^2(y_1, y_2, x) = v(\varepsilon, \varepsilon_1),$$
where $v$ is strictly increasing in $\varepsilon$. Letting $\tilde{v}$ denote the inverse of $v$ with respect to $\varepsilon$, it follows that
$$q(y_2, x) = \varepsilon = \tilde{v}(\varepsilon_2, \varepsilon_1).$$
Since $\varepsilon$ is a function of $(\varepsilon_1, \varepsilon_2)$, Assumption S.2 implies Assumption T.2. Since $\tilde{v}$ is strictly increasing in $\varepsilon_2$, Assumption S.3 implies that $(\varepsilon_1, \varepsilon)$ has a continuous density on a convex support. Let $\tilde{q}$ denote the inverse of $q$ with respect to $y_2$. The function $\tilde{q}$ exists because $q$ is strictly increasing in $y_2$. Then,
$$y_2 = \tilde{q}(\varepsilon, x) = \tilde{q}\big(\tilde{v}(\varepsilon_2, \varepsilon_1),\, x\big).$$
Since
$$y_2 = h^2(x, \varepsilon_1, \varepsilon_2),$$
it follows that
$$y_2 = h^2(x, \varepsilon_1, \varepsilon_2) = \tilde{q}\big(\tilde{v}(\varepsilon_2, \varepsilon_1),\, x\big),$$
where $\tilde{q}$ is strictly increasing with respect to its first argument. Hence,
$$y_2 = h^2(x, \varepsilon_1, \varepsilon_2) = \tilde{q}(\varepsilon, x),$$
where $\tilde{q}$ is strictly increasing in $\varepsilon$. This implies that, under control function separability, the system composed of the structural form function for $y_1$ and the reduced form function for $y_2$ is of the form
$$y_1 = m^1(y_2, \varepsilon_1)$$
$$y_2 = h^2(x, \varepsilon_1, \varepsilon_2) = \tilde{q}\big(\tilde{v}(\varepsilon_2, \varepsilon_1),\, x\big) = \tilde{q}(\varepsilon, x),$$
where $\tilde{q}$ is strictly increasing in $\varepsilon$ and $(\varepsilon_1, \varepsilon)$ is independent of $X$. To show that the model generated by $(m^1, h^2)$ is observationally equivalent to the model generated by $(m^1, \tilde{q})$, we note that the model generated by $(m^1, h^2)$ implies that for all $x$ such that $f_X(x) > 0$,
$$f_{Y_1, Y_2 | X = x}(y_1, y_2; S) = f_{\varepsilon_1, \varepsilon_2}\big(r^1(y_1, y_2),\, r^2(y_1, y_2, x)\big)\, \big| r^1_{y_1} r^2_{y_2} - r^1_{y_2} r^2_{y_1} \big|,$$
where $r^1_{y_1} = \partial r^1(y_1, y_2)/\partial y_1$, $r^2_{y_2} = \partial r^2(y_1, y_2, x)/\partial y_2$, $r^1_{y_2} = \partial r^1(y_1, y_2)/\partial y_2$, and $r^2_{y_1} = \partial r^2(y_1, y_2, x)/\partial y_1$. On the other side, for the model generated by $(m^1, \tilde{q})$, we have that
$$f_{Y_1, Y_2 | X = x}(y_1, y_2; T) = f_{\varepsilon_1, \varepsilon}\Big(r^1(y_1, y_2),\, \tilde{v}\big(r^2(y_1, y_2, x),\, r^1(y_1, y_2)\big)\Big)\, \Big| r^1_{y_1}\big(\tilde{v}_1 r^2_{y_2} + \tilde{v}_2 r^1_{y_2}\big) - r^1_{y_2}\big(\tilde{v}_1 r^2_{y_1} + \tilde{v}_2 r^1_{y_1}\big) \Big|,$$
where $\tilde{v}_1$ denotes the derivative of $\tilde{v}$ with respect to its first coordinate and $\tilde{v}_2$ denotes the derivative of $\tilde{v}$ with respect to its second coordinate. Since
$$r^1_{y_1}\big(\tilde{v}_1 r^2_{y_2} + \tilde{v}_2 r^1_{y_2}\big) - r^1_{y_2}\big(\tilde{v}_1 r^2_{y_1} + \tilde{v}_2 r^1_{y_1}\big) = \tilde{v}_1\big(r^1_{y_1} r^2_{y_2} - r^1_{y_2} r^2_{y_1}\big)$$
and
$$f_{\varepsilon_2 | \varepsilon_1 = r^1(y_1, y_2)}\big(r^2(y_1, y_2, x)\big) = f_{\varepsilon | \varepsilon_1 = r^1(y_1, y_2)}\Big(\tilde{v}\big(r^2(y_1, y_2, x),\, r^1(y_1, y_2)\big)\Big)\, \tilde{v}_1,$$
it follows that for all $x$ such that $f_X(x) > 0$,
$$f_{Y_1, Y_2 | X = x}(y_1, y_2; S) = f_{Y_1, Y_2 | X = x}(y_1, y_2; T).$$
Hence, control function separability implies that Model (S) is observationally equivalent to Model (T). This completes the proof of Theorem 1.//

Theorem 1 provides a characterization of two-equation systems with simultaneity where one of the functions can be estimated using the other to derive a control function. One of the main conclusions of the theorem is that, for one of the equations to be usable for deriving a control function, the inverse function of that equation, which maps the observable endogenous and observable exogenous variables into the value of the unobservable, must be separable into the inverse function of the first equation and a function not involving the dependent variable of the first equation. That is, the function
$$y_2 = m^2(y_1, x, \varepsilon_2)$$
can be used to derive a control function to identify the function $m^1$, where
$$y_1 = m^1(y_2, \varepsilon_1),$$
if and only if the inverse function of $m^2$ with respect to $\varepsilon_2$ is separable into $r^1$ and a function of $y_2$ and $x$.

In the Appendix we provide equivalent characterizations of these conditions in terms of the derivatives of the structural functions and of the reduced form system (R).

4 An example

We next provide an example of an optimization problem for which the first order conditions satisfy control function separability. Our results then imply that one can estimate the structural equation using a control function approach. The objective function in our example is specified as
$$V(y_1, y_2, x_1, x_2, x_3) = (\varepsilon_1 + \varepsilon_2)\, u(y_2) + \varepsilon_1 \log\big(y_1 - u(y_2)\big) - y_1 x_1 - y_2 x_2 + x_3.$$
This can be the objective function of a consumer choosing demand for three products, $(y_1, y_2, y_3)$, subject to a linear budget constraint, $x_1 y_1 + x_2 y_2 + y_3 \leq x_3$, with $x_1$ and $x_2$ denoting the prices of, respectively, $y_1$ and $y_2$, and $x_3$ denoting income.

The first order conditions with respect to $y_1$ and $y_2$ are
$$\text{(5.1)}\qquad y_1:\quad \frac{\varepsilon_1}{y_1 - u(y_2)} - x_1 = 0$$
$$\text{(5.2)}\qquad y_2:\quad (\varepsilon_1 + \varepsilon_2)\, u'(y_2) - u'(y_2)\, \frac{\varepsilon_1}{y_1 - u(y_2)} - x_2 = 0$$

The Hessian of the objective function is
$$\begin{pmatrix} -\dfrac{\varepsilon_1}{(y_1 - u(y_2))^2} & \dfrac{\varepsilon_1\, u'(y_2)}{(y_1 - u(y_2))^2} \\[2ex] \dfrac{\varepsilon_1\, u'(y_2)}{(y_1 - u(y_2))^2} & \left(\varepsilon_1 + \varepsilon_2 - \dfrac{\varepsilon_1}{y_1 - u(y_2)}\right) u''(y_2) - (u'(y_2))^2\, \dfrac{\varepsilon_1}{(y_1 - u(y_2))^2} \end{pmatrix}.$$
This Hessian is negative definite when $\varepsilon_1 > 0$, $u'(y_2) > 0$, $u''(y_2) < 0$, and
$$\varepsilon_1 + \varepsilon_2 - \frac{\varepsilon_1}{y_1 - u(y_2)} > 0.$$
Since at the values of $(y_1, y_2)$ that satisfy the first order conditions, $\varepsilon_1/(y_1 - u(y_2)) = x_1$ and $\big(\varepsilon_1 + \varepsilon_2 - \varepsilon_1/(y_1 - u(y_2))\big)\, u'(y_2) = x_2$, as long as $x_1 > 0$ and $x_2 > 0$ the objective function is strictly concave at the $(y_1, y_2)$ that solve the optimization problem.

To obtain the system of structural equations, note that from (5.1) we get
$$\text{(5.3)}\qquad \varepsilon_1 = \big[y_1 - u(y_2)\big]\, x_1.$$
And using (5.3) in (5.2), we get
$$\text{(5.4)}\qquad \big[(\varepsilon_1 + \varepsilon_2) - x_1\big]\, u'(y_2) = x_2.$$
Hence,
$$\varepsilon_2 = \frac{x_2}{u'(y_2)} - y_1 x_1 + u(y_2)\, x_1 + x_1 = \frac{x_2}{u'(y_2)} + x_1 - \big(y_1 - u(y_2)\big)\, x_1.$$

We can then easily see that the resulting system of structural equations, which is
$$\varepsilon_1 = \big[y_1 - u(y_2)\big]\, x_1$$
$$\varepsilon_2 = \frac{x_2}{u'(y_2)} + x_1 - \big(y_1 - u(y_2)\big)\, x_1,$$
satisfies control function separability. The triangular system of equations, which can then be estimated using a control function for nonseparable models, is
$$y_1 = u(y_2) + \frac{\varepsilon_1}{x_1}$$
$$y_2 = (u')^{-1}\left(\frac{x_2}{\varepsilon_1 + \varepsilon_2 - x_1}\right).$$
The unobservable $\varepsilon = \varepsilon_1 + \varepsilon_2$ is the control function for $y_2$ in the equation for $y_1$. Conditional on $\varepsilon = \varepsilon_1 + \varepsilon_2$, $y_2$ is a function of only $(x_1, x_2)$, which is independent of $\varepsilon_1$. Hence, conditional on $\varepsilon = \varepsilon_1 + \varepsilon_2$, $y_2$ is independent of $\varepsilon_1$, exactly the condition one needs in order to use $\varepsilon$ as the control function in the estimation of the equation for $y_1$.
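As a quick consistency check of this example (our own sketch, not part of the original text), the following symbolic computation verifies, for the hypothetical choice $u(y_2) = \log(y_2)$, that the triangular representation above satisfies the first order conditions (5.1) and (5.2):

```python
import sympy as sp

# Symbols: endogenous quantities, exogenous prices, and the two unobservables.
y1, y2, x1, x2, e1, e2 = sp.symbols('y1 y2 x1 x2 epsilon1 epsilon2', positive=True)

# Hypothetical choice of u (not specified in the text): u(y2) = log(y2),
# which is strictly increasing with u'' < 0, as the example requires.
u = sp.log(y2)
u_prime = sp.diff(u, y2)

# First order conditions (5.1) and (5.2).
foc1 = e1 / (y1 - u) - x1
foc2 = (e1 + e2) * u_prime - u_prime * e1 / (y1 - u) - x2

# Triangular representation from the text, specialized to u = log:
#   y2 = (u')^{-1}( x2 / (e1 + e2 - x1) )  and  y1 = u(y2) + e1/x1.
y2_star = (e1 + e2 - x1) / x2
y1_star = sp.log(y2_star) + e1 / x1

# Both first order conditions vanish at (y1_star, y2_star).
sol = {y1: y1_star, y2: y2_star}
assert sp.simplify(foc1.subs(sol)) == 0
assert sp.simplify(foc2.subs(sol)) == 0
print("FOCs verified; y2_star depends on (e1, e2) only through e1 + e2")
```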

5 Simultaneity in Latent Variables

Our results can be applied to a wide range of Limited Dependent Variable models with simultaneity in the latent variables, when additional exogenous variables are observed and some separability conditions are satisfied. In particular, suppose that we were interested in estimating $m^1$ in the model
$$y_1^* = m^1(y_2^*, w_1, w_2, \varepsilon_1)$$
$$y_2^* = m^2(y_1^*, w_1, w_2, x, \varepsilon_2),$$
where instead of observing $(y_1^*, y_2^*)$, we observed a transformation, $(y_1, y_2)$, of $(y_1^*, y_2^*)$ defined by a known vector function $(T^1, T^2)$,
$$y_1 = T^1(y_1^*, y_2^*)$$
$$y_2 = T^2(y_1^*, y_2^*).$$

Assume that $(w_1, w_2, x)$ is independent of $(\varepsilon_1, \varepsilon_2)$ and that, for known functions $b^1$ and $b^2$ and unknown functions $m^1$ and $m^2$, the system of simultaneous equations can be written as
$$b^1(y_1^*, w_1) = m^1\big(b^2(y_2^*, w_2),\, \varepsilon_1\big)$$
$$b^2(y_2^*, w_2) = m^2\big(b^1(y_1^*, w_1),\, x,\, \varepsilon_2\big).$$
Then, under support conditions on $(w_1, w_2, x)$ and on the range of $(T^1, T^2)$, and under invertibility conditions on $(b^1, b^2)$, one can express this system as
$$b_1 = m^1(b_2, \varepsilon_1)$$
$$b_2 = m^2(b_1, x, \varepsilon_2),$$
where the distribution of $(b_1, b_2, x)$ is known. (See Matzkin (2010b) for formal assumptions and arguments and more general models.) The identification and estimation of $m^1$ can then proceed using a control function approach, as developed in the previous sections, when this system satisfies control function separability.

To provide a simple specific example of the arguments that are involved in the above statements, we consider a special case of a binary threshold crossing model analyzed in Briesch, Chintagunta and Matzkin (1997, 2010),
$$y_1^* = m^1(y_2, \varepsilon_1) + w_1$$
$$y_1 = 0 \;\text{ if } y_1^* \leq 0; \qquad y_1 = 1 \;\text{ otherwise.}$$
Suppose that instead of assuming, as they did, that $(y_2, w_1)$ is independent of $\varepsilon_1$, we assume that
$$y_2 = m^2(y_1^* - w_1, x, \varepsilon_2)$$
and that $(x, w_1)$ is independent of $(\varepsilon_1, \varepsilon_2)$. An example of such a model is one where $y_2$ is discretionary expenditure by an individual in a store for which expenditures are observable, $w_1$ is an exogenous observable expenditure, and $y_1^* - w_1$ is unobserved discretionary expenditure over the fixed amount $w_1$. Assuming that $m^1$ is invertible in $\varepsilon_1$ and $m^2$ is invertible in $\varepsilon_2$, we can rewrite the two equation system as
$$\varepsilon_1 = r^1(y_1^* - w_1, y_2)$$
$$\varepsilon_2 = r^2(y_1^* - w_1, y_2, x).$$

If this system can be expressed as
$$\varepsilon_1 = r^1(y_1^* - w_1, y_2)$$
$$\varepsilon_2 = v\big(r^1(y_1^* - w_1, y_2),\, s(y_2, x)\big)$$
for some unknown functions $r^1$, $v$ and $s$ satisfying our regularity conditions, then one can identify and estimate $m^1$ using a control function approach. To shed more light on this result, let $b_1^* = y_1^* - w_1$. Then, the model becomes
$$b_1^* = m^1(y_2, \varepsilon_1)$$
$$y_2 = m^2(b_1^*, x, \varepsilon_2),$$
with a system of reduced form functions
$$b_1^* = h^1(x, \varepsilon_1, \varepsilon_2)$$
$$y_2 = h^2(x, \varepsilon_1, \varepsilon_2).$$

Following Matzkin (2010b), we extend arguments for identification of semiparametric binary threshold crossing models using conditional independence (Lewbel (2000)), and arguments for identification of nonparametric and nonadditive binary threshold crossing models using independence (Matzkin (1992), Briesch, Chintagunta, and Matzkin (1997, 2010)), to models with simultaneity. For this, we assume that $(X, W_1)$ has an everywhere positive density. Our independence assumption implies that $W_1$ is independent of $(\varepsilon_1, \varepsilon_2)$ conditional on $X$. Then, since conditional on $X$, $(b_1^*, y_2)$ is only a function of $(\varepsilon_1, \varepsilon_2)$, we have that for all $(w_1, t_1)$,
$$\Pr\big((B_1^*, Y_2) \leq (t_1, y_2) \,\big|\, X = x\big)$$
$$= \Pr\big((B_1^*, Y_2) \leq (t_1, y_2) \,\big|\, W_1 = w_1, X = x\big)$$
$$= \Pr\big((Y_1^* - W_1, Y_2) \leq (t_1, y_2) \,\big|\, W_1 = w_1, X = x\big)$$
$$= \Pr\big((Y_1^*, Y_2) \leq (t_1 + w_1, y_2) \,\big|\, W_1 = w_1, X = x\big).$$
Letting $w_1 = -t_1$, we get that
$$\Pr\big((B_1^*, Y_2) \leq (t_1, y_2) \,\big|\, X = x\big) = \Pr\big((Y_1, Y_2) \leq (0, y_2) \,\big|\, W_1 = -t_1, X = x\big),$$
where the right-hand side involves only observable variables, since $Y_1^* \leq 0$ if and only if $Y_1 = 0$. Hence, the distribution of $(b_1^*, y_2)$ conditional on $X$ is identified. The analysis of the system
$$b_1^* = m^1(y_2, \varepsilon_1)$$
$$y_2 = m^2(b_1^*, x, \varepsilon_2)$$

when this identified distribution is given is analogous to the analysis of the system
$$y_1 = m^1(y_2, \varepsilon_1)$$
$$y_2 = m^2(y_1, x, \varepsilon_2)$$
with the distribution of $(y_1, y_2)$ given $X$, considered in our previous sections. In particular, if the system satisfies control function separability, we can first estimate the model $y_2 = s(x, \varepsilon)$, where $s$ is an unknown function increasing in $\varepsilon$, and then use the estimated $\varepsilon$ as a control in the estimation of $m^1$.

6 Conclusions

In this note we have provided a conclusive answer to the question of when

it is possible to use a control function approach to identify and estimate a

function in a simultaneous equations model. We define a new property of

functions, called control function separability, which characterizes systems of

simultaneous equations where a function of interest can be estimated using

a control function derived from the second equation. We show that this

condition is equivalent to requiring that the reduced form function for the

endogenous regressor in the function of interest is separable into the exogenous variable and a function of all the unobservable variables. We also provide conditions in terms of the

derivatives of the two functions in the system.

An example of a system of structural equations, which is generated by the

first order conditions of an optimization problem, and which satisfies control

function separability, was presented. We have also shown how our results

can be used to identify and estimate Limited Dependent Variable models

with simultaneity in the latent or observable continuous variables.


7 Appendix A

7.1 Characterization in terms of Derivatives

Taking advantage of the assumed differentiability, we can characterize systems where one of the functions can be estimated using a control function approach by means of a condition in terms of the derivatives of the functions of Models (T) and (S). The following result provides such a condition. Let $r^2_x = \partial r^2(y_1, y_2, x)/\partial x$, $r^2_{y_1} = \partial r^2(y_1, y_2, x)/\partial y_1$, and $r^2_{y_2} = \partial r^2(y_1, y_2, x)/\partial y_2$ denote the derivatives of $r^2$; let $\tilde{s}_x = \partial \tilde{s}(x, y_2)/\partial x$ and $\tilde{s}_{y_2} = \partial \tilde{s}(x, y_2)/\partial y_2$ denote the derivatives of $\tilde{s}$, the inverse of $s$ with respect to $\varepsilon$; and let $m^1_{y_2} = \partial m^1(y_2, \varepsilon_1)/\partial y_2$ denote the derivative of the function of interest $m^1$ with respect to the endogenous variable $y_2$.

Theorem 2: Suppose that Model (S) satisfies Assumptions S.1-S.5 and that Model (T) satisfies Assumptions T.1-T.4. Then, Model (S) is observationally equivalent to Model (T) if and only if for all $(x, y_1, y_2)$,
$$\frac{r^2_x}{r^2_{y_1}\, m^1_{y_2} + r^2_{y_2}} = \frac{\tilde{s}_x}{\tilde{s}_{y_2}}.$$

Proof of Theorem 2: As in the proof of Theorem 1, observational equivalence between Model (T) and Model (S) implies that for all $(y_2, x)$ and $\varepsilon_1 = r^1(y_1, y_2)$,
$$\text{(T1.2)}\qquad F_{\varepsilon_2 | \varepsilon_1}\Big(r^2\big(m^1(y_2, \varepsilon_1), y_2, x\big)\Big) = F_{\varepsilon | \varepsilon_1}\big(\tilde{s}(x, y_2)\big).$$
Differentiating both sides of (T1.2) with respect to $y_2$ and $x$, we get that
$$f_{\varepsilon_2 | \varepsilon_1}\Big(r^2\big(m^1(y_2, \varepsilon_1), y_2, x\big)\Big)\, \big(r^2_{y_1}\, m^1_{y_2} + r^2_{y_2}\big) = f_{\varepsilon | \varepsilon_1}\big(\tilde{s}(x, y_2)\big)\, \tilde{s}_{y_2}$$
$$f_{\varepsilon_2 | \varepsilon_1}\Big(r^2\big(m^1(y_2, \varepsilon_1), y_2, x\big)\Big)\, r^2_x = f_{\varepsilon | \varepsilon_1}\big(\tilde{s}(x, y_2)\big)\, \tilde{s}_x,$$
where, as defined above, $r^2_{y_1} = \partial r^2(m^1(y_2, \varepsilon_1), y_2, x)/\partial y_1$, $r^2_{y_2} = \partial r^2(m^1(y_2, \varepsilon_1), y_2, x)/\partial y_2$, $r^2_x = \partial r^2(m^1(y_2, \varepsilon_1), y_2, x)/\partial x$, $m^1_{y_2} = \partial m^1(y_2, \varepsilon_1)/\partial y_2$, $\tilde{s}_{y_2} = \partial \tilde{s}(x, y_2)/\partial y_2$, and $\tilde{s}_x = \partial \tilde{s}(x, y_2)/\partial x$. Taking ratios, we get that
$$\frac{r^2_x}{r^2_{y_1}\, m^1_{y_2} + r^2_{y_2}} = \frac{\tilde{s}_x}{\tilde{s}_{y_2}}.$$
Conversely, suppose that for all $(y_2, x, \varepsilon_1)$,
$$\text{(T2.1)}\qquad \frac{r^2_x}{r^2_{y_1}\, m^1_{y_2} + r^2_{y_2}} = \frac{\tilde{s}_x}{\tilde{s}_{y_2}}.$$
Define
$$b(y_2, x, \varepsilon_1) = r^2\big(m^1(y_2, \varepsilon_1), y_2, x\big).$$
(T2.1) implies that, for any fixed value of $\varepsilon_1$, the function $b(y_2, x, \varepsilon_1)$ is a transformation of $\tilde{s}(x, y_2)$. Let $t(\cdot, \varepsilon_1): \mathbb{R} \to \mathbb{R}$ denote such a transformation. Then, for all $(y_2, x)$,
$$b(y_2, x, \varepsilon_1) = r^2\big(m^1(y_2, \varepsilon_1), y_2, x\big) = t\big(\tilde{s}(x, y_2),\, \varepsilon_1\big).$$
Substituting $m^1(y_2, \varepsilon_1)$ with $y_1$ and $\varepsilon_1$ with $r^1(y_1, y_2)$, it follows that
$$r^2(y_1, y_2, x) = t\big(\tilde{s}(x, y_2),\, r^1(y_1, y_2)\big).$$
Hence, (T2.1) implies control function separability. This implies, by Theorem 1, that Model (T) and Model (S) are observationally equivalent, and it completes the proof of Theorem 2.//

Instead of characterizing observational equivalence in terms of the derivatives of the functions $m^1$ and $r^2$, we can express observational equivalence in terms of the derivatives of the inverse structural form functions. Differentiating with respect to $y_1$ and $y_2$ the identity
$$y_1 = m^1\big(y_2,\, r^1(y_1, y_2)\big)$$
and solving for $m^1_{y_2}$, we get that
$$m^1_{y_2} = -\frac{r^1_{y_2}}{r^1_{y_1}}.$$
Hence, the condition that for all $(y_1, y_2, x)$,
$$\frac{r^2_x}{r^2_{y_1}\, m^1_{y_2} + r^2_{y_2}} = \frac{\tilde{s}_x}{\tilde{s}_{y_2}},$$
is equivalent to the condition that for all $(y_1, y_2, x)$,
$$\frac{r^1_{y_1}(y_1, y_2)\, r^2_x(y_1, y_2, x)}{r^1_{y_1}(y_1, y_2)\, r^2_{y_2}(y_1, y_2, x) - r^1_{y_2}(y_1, y_2)\, r^2_{y_1}(y_1, y_2, x)} = \frac{\tilde{s}_x(x, y_2)}{\tilde{s}_{y_2}(x, y_2)},$$
or
$$\frac{r^1_{y_1}(y_1, y_2)\, r^2_x(y_1, y_2, x)}{|r_y(y_1, y_2, x)|} = \frac{\tilde{s}_x(x, y_2)}{\tilde{s}_{y_2}(x, y_2)},$$
where $|r_y(y_1, y_2, x)|$ is the Jacobian determinant of the vector function $r = (r^1, r^2)$ with respect to $(y_1, y_2)$. Note that differentiating both sides of the above equation with respect to $y_1$, we get the following expression, only in terms of the derivatives of the inverse system of structural equations of Model (S):
$$\frac{\partial}{\partial y_1}\left[\frac{r^1_{y_1}(y_1, y_2)\, r^2_x(y_1, y_2, x)}{|r_y(y_1, y_2, x)|}\right] = 0.$$
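As a sanity check of this last condition (our own sketch, not from the original text, using the hypothetical linear inverse system from the earlier illustration), a short symbolic computation confirms that the bracketed ratio does not vary with $y_1$:

```python
import sympy as sp

y1, y2, x, a, b, c = sp.symbols('y1 y2 x a b c')

# Hypothetical linear inverse system (as in the linear example used earlier):
#   r1 = y1 - a*y2,   r2 = y2 - b*y1 - c*x
r1 = y1 - a * y2
r2 = y2 - b * y1 - c * x

# Jacobian determinant of (r1, r2) with respect to (y1, y2).
J = sp.Matrix([[sp.diff(r1, y1), sp.diff(r1, y2)],
               [sp.diff(r2, y1), sp.diff(r2, y2)]]).det()

# The ratio appearing in the Appendix characterization.
ratio = sp.diff(r1, y1) * sp.diff(r2, x) / J

print(sp.simplify(ratio))               # a constant in y1
print(sp.simplify(sp.diff(ratio, y1)))  # -> 0, so the condition holds
```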

7.2 Characterization in terms of the Reduced Form Functions

An alternative characterization, which follows from the proof of Theorem 1, is in terms of the reduced form functions. Suppose we ask when the function
$$y_2 = m^2(y_1, x, \varepsilon_2)$$
can be used to derive a control function to identify the function $m^1$, where
$$y_1 = m^1(y_2, \varepsilon_1).$$
Our arguments show that the control function approach can be used if and only if the reduced form function, $h^2(x, \varepsilon_1, \varepsilon_2)$, for $y_2$ can be expressed as a function of $x$ and a function of $(\varepsilon_1, \varepsilon_2)$. That is, the control function approach can be used if and only if, for some functions $q$ and $v$,
$$h^2(x, \varepsilon_1, \varepsilon_2) = q\big(x,\, v(\varepsilon_1, \varepsilon_2)\big).$$
Note that while the sufficiency of such a condition is obvious, the necessity, which follows from Theorem 1, had not been previously known.⁴

⁴Kasy (2010) also highlights the one-dimensional distribution conditional on the reduced form $h^2$, but does not relate this to restrictions on the structure of the simultaneous equation system (S), which is our primary objective.
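In the hypothetical linear illustration used earlier (our own example, not in the original text), this condition is immediate: the reduced form for $y_2$ is
$$h^2(x, \varepsilon_1, \varepsilon_2) = \frac{c\,x + b\,\varepsilon_1 + \varepsilon_2}{1 - ab} = q\big(x,\, v(\varepsilon_1, \varepsilon_2)\big), \qquad v(\varepsilon_1, \varepsilon_2) = b\,\varepsilon_1 + \varepsilon_2, \quad q(x, t) = \frac{c\,x + t}{1 - ab},$$
so the unobservables enter $h^2$ only through the single index $v(\varepsilon_1, \varepsilon_2)$.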


8 References

AI, C. and X. CHEN (2003), "Efficient Estimation of Models with Conditional Moment Restrictions Containing Unknown Functions," Econometrica, 71, 1795-1843.

BENKARD, C.L. and S. BERRY (2004), "On the Nonparametric Identification of Nonlinear Simultaneous Equations Models: Comment on B. Brown (1983) and Roehrig (1988)," Cowles Foundation Discussion Paper #1482.

BENKARD, C.L. and S. BERRY (2006), "On the Nonparametric Identification of Nonlinear Simultaneous Equations Models: Comment on B. Brown (1983) and Roehrig (1988)," Econometrica, 74(5).

BLUNDELL, R. and J. L. POWELL (2003) "Endogeneity in Nonparamet-

ric and Semiparametric Regression Models," in Advances in Economics and

Econometrics, Theory and Applications, Eighth World Congress, Volume II,

edited by M. Dewatripont, L.P. Hansen, and S.J.Turnovsky, Cambridge Uni-

versity Press, Cambridge, U.K.

BRIESCH, R., P. CHINTAGUNTA, and R.L. MATZKIN (1997) "Nonpara-

metric Discrete Choice Models with Unobserved Heterogeneity," mimeo, North-

western University.

BRIESCH, R., P. CHINTAGUNTA, and R.L. MATZKIN (2010), "Nonparametric Discrete Choice Models with Unobserved Heterogeneity," Journal of Business and Economic Statistics, 28(2).

BROWN B.W. (1983) "The Identification Problem in Systems Nonlinear in

the Variables," Econometrica, 51, 175-196.

BROWN, D.J. and R.L. MATZKIN (1998) "Estimation of Nonparametric

Functions in Simultaneous Equations Models, with an Application to Con-

sumer Demand," Cowles Foundation Discussion Paper #1175.

CHERNOZHUKOV, V. and C. HANSEN (2005), "An IV Model of Quantile Treatment Effects," Econometrica, 73, 245-261.

CHERNOZHUKOV, V., G. IMBENS, AND W. NEWEY (2007): “Instru-

mental Variable Identification and Estimation of Nonseparable Models" Jour-

nal of Econometrics, 139(1), 4-14.

CHESHER, A. (2003): “Identification in Nonseparable Models,” Economet-

rica, Vol. 71, No. 5.

DAROLLES, S., J.P. FLORENS, and E. RENAULT (2002), "Nonparametric Instrumental Regression," mimeo, IDEI, Toulouse.

FLORENS, J.P., J.J. HECKMAN, C. MEGHIR, and E. VYTLACIL (2008), "Identification of Treatment Effects Using Control Functions in Models With Continuous, Endogenous Treatment and Heterogeneous Effects," Econometrica, 76(5), 1191-1206.

GALE, D. and H. NIKAIDO (1965), “The Jacobian Matrix and Global Uni-

valence of Mappings,” Math. Annalen 159, 81-93.

HAHN, J. and G. RIDDER (2010), "Conditional Moment Restrictions and Triangular Simultaneous Equations," forthcoming in the Review of Economics and Statistics.

HALL, P. and J.L. HOROWITZ (2003), “Nonparametric Methods for In-

ference in the Presence of Instrumental Variables,” Center for Microdata

Methods and Practice, WP 102/03.

HAUSMAN, J.A. (1983) “Specification and Estimation of Simultaneous Equa-

tion Models,” in Handbook of Econometrics, Vol. I, edited by Z. Griliches

and M.D. Intriligator.

HECKMAN, J.J. and R. ROBB (1985) "Alternative Methods for Evaluating

the Impact of Interventions," in Longitudinal Analysis of Labor Market Data,

edited by J.J. Heckman and B. Singer, Econometric Society Monograph 10,

Cambridge University Press, Cambridge, U.K.

HECKMAN, J.J. (1978) "Dummy Endogenous Variables in a Simultaneous

Equation System," Econometrica, Vol. 46, No. 4, 931-959.

IMBENS, G.W. and W.K. NEWEY (2009), "Identification and Estimation of Triangular Simultaneous Equations Models Without Additivity," Econometrica, 77(5), September.

KASY, M. (2010) “Identification in Triangular Systems using Control Func-

tions,” mimeo, Berkeley, April.

LEWBEL, A. (2000) "Semiparametric Qualitative Response Model Estima-

tion with Unknown Heteroskedasticity and Instrumental Variables," Journal

of Econometrics, 97, 145-177.

MATZKIN, R.L. (1992) “Nonparametric and Distribution-Free Estimation

of the Binary Choice and the Threshold Crossing Models,” Econometrica,

Vol. 60, No. 2, p. 239-270.

MATZKIN, R.L. (2005) "Identification in Nonparametric Simultaneous Equa-

tions Models," mimeo, Northwestern University.

MATZKIN, R.L. (2007) “Nonparametric Identification,” Chapter 73 in Hand-

book of Econometrics, Vol. 6b, edited by J.J. Heckman and E.E. Leamer,

Elsevier B.V., 5307-5368.

MATZKIN, R.L. (2008) “Identification in Nonparametric Simultaneous Equa-

tions,” Econometrica, Vol. 76, No. 5, 945-978.

MATZKIN, R.L. (2010a) “Estimation of Nonparametric Models with Simul-

taneity," mimeo, UCLA.

31

Page 32: Conditions for the Existence of Control Functions in ... · of functions (r1,r2), which map any vector of observable variables (y 1,y 2,x) into the vector of unobservable variables

MATZKIN, R.L. (2010b) “Identification in Nonparametric Limited Depen-

dent Variable Models with Simultaneity and Unobserved Heterogeneity,"

mimeo, UCLA.

NEWEY, W. and J. POWELL (1989) “Instrumental Variables Estimation

of Nonparametric Models,” mimeo.

NEWEY, W. and J. POWELL (2003), "Instrumental Variables Estimation of Nonparametric Models," Econometrica, 71(5), 1565-1578.

ROEHRIG, C.S. (1988) "Conditions for Identification in Nonparametric and

Parametric Models," Econometrica, 56:2, 433-447.

TELSER, Lester G. (1964) "Iterative Estimation of a Set of Linear Regression

Equations." Journal of the American Statistical Association, 59(307), 845-62.
