Goodness of Fit Measure

McFadden’s pseudo-R2

$$\mathrm{LRI} = 1 - \frac{L}{L_0}$$

L0 denotes the log likelihood function when only a constant is included in the model

L denotes the unconstrained log likelihood function

$$L_0 = n\left[P \ln P + (1 - P)\ln(1 - P)\right]$$

where P denotes the proportion of observations in the sample with $y_i = 1$.

Merits of Pseudo R2

• Lies in the unit interval

• If the model provides no predictive power, then $L = L_0$ and $\mathrm{LRI} = 0$

• As the model’s fit improves, $L \to 0$ and $\mathrm{LRI} \to 1$

• However, values strictly between 0 and 1 have no natural interpretation
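The index can be computed directly from fitted probabilities. The following is a minimal sketch, not a definitive implementation: the function names and the sample data are hypothetical, and the fitted probabilities are assumed to come from some already-estimated binary choice model.

```python
import math

def log_likelihood(y, fitted):
    """Bernoulli log likelihood of 0/1 outcomes y given fitted probabilities."""
    return sum(
        yi * math.log(fi) + (1 - yi) * math.log(1 - fi)
        for yi, fi in zip(y, fitted)
    )

def null_log_likelihood(y):
    """L0 = n[P ln P + (1 - P) ln(1 - P)], the constant-only log likelihood."""
    n = len(y)
    p = sum(y) / n  # sample share with y_i = 1; must be strictly between 0 and 1
    return n * (p * math.log(p) + (1 - p) * math.log(1 - p))

def mcfadden_r2(y, fitted):
    """Likelihood ratio index: LRI = 1 - L / L0."""
    return 1.0 - log_likelihood(y, fitted) / null_log_likelihood(y)

# Hypothetical data: eight observations and two sets of fitted probabilities.
y = [1, 0, 1, 1, 0, 0, 1, 0]
uninformative = [0.5] * len(y)                      # equals the sample share P
informative = [0.9, 0.1, 0.8, 0.7, 0.2, 0.3, 0.9, 0.1]

print(mcfadden_r2(y, uninformative))  # no predictive power: LRI is zero up to rounding
print(mcfadden_r2(y, informative))    # better fit: LRI moves toward 1
```
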

Cross Tabulation of Hits and Misses

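Let

$$\hat{y}_i = \begin{cases} 1 & \text{if } \hat{F}_i > 0.5 \\ 0 & \text{if } \hat{F}_i \le 0.5 \end{cases}$$

and cross-tabulate actual against predicted outcomes:

Actual \ Predicted      $\hat{F}_i > 0.5$      $\hat{F}_i \le 0.5$

$y_i = 1$               hit                    miss

$y_i = 0$               miss                   hit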
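The cross tabulation can be built by counting each (actual, predicted) pair. This is a small sketch under the 0.5 cutoff; the function name and the sample data are hypothetical.

```python
def hit_miss_table(y, fitted, cutoff=0.5):
    """Count observations by (actual, predicted) pair under the cutoff rule."""
    table = {(actual, predicted): 0 for actual in (1, 0) for predicted in (1, 0)}
    for yi, fi in zip(y, fitted):
        predicted = 1 if fi > cutoff else 0
        table[(yi, predicted)] += 1
    return table

# Hypothetical outcomes and fitted probabilities.
y = [1, 1, 0, 0, 1]
fitted = [0.9, 0.4, 0.2, 0.7, 0.6]

table = hit_miss_table(y, fitted)
hits = table[(1, 1)] + table[(0, 0)]   # diagonal cells
print(table)
print("share correctly predicted:", hits / len(y))
```
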

Problems with Success Table

• The prediction rule is arbitrary.

– No link is made to the costs of the individual errors. It may be more costly to misclassify a “yes” as a “no” than to misclassify a “no” as a “yes”.

– Some loss function would be more helpful in this case.

• There is no way to judge departures from a diagonal table.
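One way to bring a loss function into the prediction rule is to choose the cutoff from the two misclassification costs. The sketch below assumes a simple additive loss that the slides do not specify. Predicting “no” has expected cost F·c_FN, predicting “yes” has expected cost (1 − F)·c_FP, so “yes” is preferred when F > c_FP / (c_FP + c_FN). All names, costs, and data are hypothetical.

```python
def cost_minimizing_cutoff(cost_false_negative, cost_false_positive):
    """Cutoff at which the expected costs of predicting 1 and 0 are equal."""
    return cost_false_positive / (cost_false_positive + cost_false_negative)

def total_cost(y, fitted, cutoff, cost_false_negative, cost_false_positive):
    """Sum the misclassification costs incurred under a given cutoff."""
    cost = 0.0
    for yi, fi in zip(y, fitted):
        predicted = 1 if fi > cutoff else 0
        if yi == 1 and predicted == 0:
            cost += cost_false_negative   # a "yes" classified as a "no"
        elif yi == 0 and predicted == 1:
            cost += cost_false_positive   # a "no" classified as a "yes"
    return cost

# Hypothetical data; missing a "yes" is assumed three times as costly.
y = [1, 1, 0, 0, 1]
fitted = [0.9, 0.4, 0.2, 0.7, 0.6]
c_fn, c_fp = 3.0, 1.0

cutoff = cost_minimizing_cutoff(c_fn, c_fp)
print("cutoff:", cutoff)
print("cost at 0.5:", total_cost(y, fitted, 0.5, c_fn, c_fp))
print("cost at cost-based cutoff:", total_cost(y, fitted, cutoff, c_fn, c_fp))
```

With equal costs the rule reduces to the 0.5 cutoff. In a finite sample the cost-based cutoff is not guaranteed to give the lowest realized cost, since it is derived from expected costs.
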

Extensions of Binary Choice Models

• Bivariate and Multivariate Choices

• Expanding the Number of Alternatives

– Ordered Choices

– Unordered Choices

  • Logit (multinomial/conditional)

  • Nested Logit

  • Mixed Logit

  • Multivariate Probit

Bivariate Choice

• An obvious extension of binary choice models arises when two or more such choices are made

• Examples:

– Labor literature: job market participation decisions, which may be combinations of single and multiple decisions

– Technology adoption (e.g., conservation tillage and soil testing)

– Medical decisions: treatment combinations (e.g., psychiatric help and drug treatments)

– Smoking: decision to smoke and to educate oneself regarding risks

– Education: homeownership and the decision of children to stay in school

Bivariate Choice (cont’d)

• One approach to modeling these choices is to assume that a system of latent variables exists; e.g., with two choices:

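$$y^*_{1i} = \beta_1' x_{1i} + \varepsilon_{1i}$$

$$y^*_{2i} = \beta_2' x_{2i} + \varepsilon_{2i}$$

but we observe the dummy variables

$$y_{ki} = \begin{cases} 1 & \text{if } y^*_{ki} > 0 \\ 0 & \text{if } y^*_{ki} \le 0 \end{cases} \qquad k = 1, 2$$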
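The latent-variable system is easy to simulate. The sketch below assumes, for illustration only, that the two errors are jointly normal with unit variances and correlation rho; the function name and the index values passed in are hypothetical.

```python
import math
import random

def simulate_bivariate_choice(index1, index2, rho, rng):
    """Draw (y1, y2) from two latent equations with correlated normal errors.

    index1 and index2 stand for the systematic parts beta_1'x_1i and beta_2'x_2i.
    """
    e1 = rng.gauss(0.0, 1.0)
    z = rng.gauss(0.0, 1.0)
    e2 = rho * e1 + math.sqrt(1.0 - rho ** 2) * z  # corr(e1, e2) = rho, var(e2) = 1
    y1 = 1 if index1 + e1 > 0 else 0               # only the signs are observed
    y2 = 1 if index2 + e2 > 0 else 0
    return y1, y2

rng = random.Random(0)
draws = [simulate_bivariate_choice(0.2, -0.1, 0.6, rng) for _ in range(1000)]
both = sum(1 for y1, y2 in draws if y1 == 1 and y2 == 1)
print("share choosing both:", both / len(draws))
```
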

Bivariate Probit

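• Bivariate probit emerges if the residuals are assumed to be jointly normal with zero means, unit variances, and correlation $\rho$:

$$\begin{pmatrix} \varepsilon_{1i} \\ \varepsilon_{2i} \end{pmatrix} \sim N\left( \begin{pmatrix} 0 \\ 0 \end{pmatrix}, \begin{pmatrix} 1 & \rho \\ \rho & 1 \end{pmatrix} \right)$$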

So the corresponding log-likelihood function is simply

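$$L(\beta_1, \beta_2, \rho;\, y_1, y_2, x_1, x_2) = \sum_{i=1}^{N} \ln \Phi_2\left(q_{1i}\,\beta_1' x_{1i},\; q_{2i}\,\beta_2' x_{2i},\; q_{1i} q_{2i}\,\rho\right)$$

where $q_{ji} = 2y_{ji} - 1$, $j = 1, 2$, and $\Phi_2$ denotes the bivariate standard normal cdf.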

Expanding the Choice Set Beyond Two

• Defining the Choice Set

• Deriving Choice Probabilities

• Identification in Choice Models

Source

• Train, K. (2002), Discrete Choice Methods with Simulation, Cambridge, MA: Cambridge University Press, Ch. 2.

Discrete Choice Models

Discrete choice models characterize decision making in which the available alternatives are

• Mutually exclusive

• Exhaustive

• Finite

Defining the relevant choice set is a crucial step in the analysis

Mutually Exclusive

• Not very restrictive - one can usually redefine the choice set to satisfy this restriction

• Example #1 (Train): Home Heating Fuel Choice

– Electric

– Natural Gas

– Oil

– Wood

– Other

Issue: Some households have dual-fuel or multiple systems

Possible Solutions

Explicitly include combinations

• Electric only

• Natural gas only

• Oil only

• Wood only

• Natural gas with electric room heaters

• Etc.

Focus on primary heating system

• Potential problems:

– It may be difficult to identify primary system from available data

– Secondary heaters may be important users of energy (e.g., electric space heaters)

Example #2: Recreation Demand

Consider the following choice set for a summer vacation:

1. Yellowstone National Park (Wyoming, Montana, Idaho)

2. Badlands National Park (South Dakota)

3. Grand Tetons National Park (Wyoming)

4. Mount Rushmore National Memorial (South Dakota)

5. Devil’s Tower National Monument (Wyoming)

For many vacationers, the alternatives in this choice set are not mutually exclusive

[Figure: the five destinations, numbered 1 to 5 as in the list above, with Devil’s Tower labeled]

Alternative Solutions

Redefine alternatives in terms of portfolios

• Yellowstone only

• Yellowstone and Grand Tetons only

• Grand Tetons and Mount Rushmore only

• Etc.

Sequence may matter as well

Focus on Primary Destination

Potential Problems:

• Identifying primary destination – even in minds of recreationists

• Constructing corresponding explanatory variables (e.g. travel costs)

Exhaustive Criteria

Readily satisfied by including a “none of the above” alternative

• Heating: “no heating”

• Recreation demand: “stay at home” or “all other sites”

Relevance

• Choice set should be defined not only to be mutually exclusive and exhaustive, but also relevant

• Choices must be ones from which agent actually chooses, not just the universe of alternatives

Example: Recreation Demand

• Should the choice set consist of

– All feasible sites?

  • Nothing is revealed about not visiting an unknown site

  • Computationally intractable

– Only sites visited in past five years?

  • Useful information is revealed about sites that are not visited.

– Sites within a given distance?

  • Limits applicability of results to, say, day trips.

• Similar issues emerge in job and career search literature

• Ideally, one would model both choices and accumulation of information about choice set.

Finite or Countable Choice Sets

• This is a restrictive characteristic

– Excludes the “how much” decision

– Focuses on the “which” or “how many” decision

• These decisions may be tied

– Electric power: Choice of rate structure and quantity of electricity consumed

– Labor: Choice of participation decision and reservation wage or how much to work

– Recreation: Choice of where to visit and how many trips to take or trip duration

Ordered Choice Models

Ordered responses arise in many empirical settings:

• Opinion surveys: asking if you strongly agree, slightly agree, slightly disagree, or strongly disagree with a statement.

• Educational data: level of schooling: grade school graduate, high school graduate, some college education, college graduate, some advanced degree, etc.

• Employment data: unemployed, part-time, full time.

• Bond ratings

• Post-release performance of prisoners

• etc.

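Ordered Choice Models (cont’d)

• Typically motivated by assuming an underlying latent variable

$$y^*_i = \beta' x_i + \varepsilon_i$$

• Typically: $\mu_0 = -\infty$ and $\mu_J = \infty$

• Observe:

$$y_i = \begin{cases} 1 & \text{if } \mu_0 < y^*_i \le \mu_1 \\ 2 & \text{if } \mu_1 < y^*_i \le \mu_2 \\ 3 & \text{if } \mu_2 < y^*_i \le \mu_3 \\ \;\vdots \\ J & \text{if } \mu_{J-1} < y^*_i \le \mu_J \end{cases}$$

with $\mu_{j-1} < \mu_j$ for all $j$.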

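Ordered Choice Models (cont’d)

• The resulting choice probabilities become:

$$\begin{aligned} \Pr(y_i = j) &= \Pr(\mu_{j-1} < \beta' x_i + \varepsilon_i \le \mu_j) \\ &= \Pr(\mu_{j-1} - \beta' x_i < \varepsilon_i \le \mu_j - \beta' x_i) \\ &= \Pr(\varepsilon_i \le \mu_j - \beta' x_i) - \Pr(\varepsilon_i \le \mu_{j-1} - \beta' x_i) \\ &= F(\mu_j - \beta' x_i) - F(\mu_{j-1} - \beta' x_i) \end{aligned}$$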

Graphically – Binary Choice

[Figure: distribution of $\varepsilon_i$ split at $-\beta' x_i$ into the regions $y_i = 0$ and $y_i = 1$]

Graphically

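[Figure: distribution of $\varepsilon_i$ split at $\mu_1 - \beta' x_i$, $\mu_2 - \beta' x_i$, and $\mu_3 - \beta' x_i$ into the regions $y_i = 1$, $y_i = 2$, $y_i = 3$, and $y_i = 4$]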

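Ordered Probit

• Most applications assume that the unobserved component of the latent variable is normally distributed; i.e.,

$$\varepsilon_i \sim N(0, 1)$$

so that

$$\Pr(y_i = j) = \Phi(\mu_j - \beta' x_i) - \Phi(\mu_{j-1} - \beta' x_i)$$

where $\mu_0 = -\infty$ and $\mu_J = \infty$

• Equivalently:

$$\Pr(y_i = j) = \Phi(\beta' x_i - \mu_{j-1}) - \Phi(\beta' x_i - \mu_j)$$

• Normalization required: either no constant or $\mu_1 = 0$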
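The ordered probit probabilities can be evaluated with the standard normal cdf alone. The sketch below uses hypothetical function names and threshold values; the interior thresholds are passed in, and the outer thresholds are set to minus and plus infinity as on the slide.

```python
import math

def norm_cdf(z):
    """Standard normal cdf, written in terms of the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def ordered_probit_probs(index, interior_thresholds):
    """Return [Pr(y = 1), ..., Pr(y = J)] for one observation.

    index is beta'x_i; interior_thresholds holds mu_1 < ... < mu_{J-1}.
    """
    mu = [-math.inf] + list(interior_thresholds) + [math.inf]
    return [
        norm_cdf(mu[j] - index) - norm_cdf(mu[j - 1] - index)
        for j in range(1, len(mu))
    ]

# Hypothetical example with J = 3 and the normalization mu_1 = 0.
probs = ordered_probit_probs(0.0, [0.0, 1.0])
print(probs)
print("sum:", sum(probs))
```
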

Multinomial Logit

• Obviously, a second generalization of the binary choice set-up arises when more than two unordered alternatives are presented to decision makers.

– Much of the early research arose from studies of travel mode choice in the transportation literature, with work by McFadden (1974a,b) and Domencich and McFadden (1975).

– More recently applied to

  • Modeling recreation site selection: Hausman, Leonard, and McFadden (1995) and Morey, Rowe, and Watson (1993)

• Telecommunications service selection: Train, McFadden, and Ben-Akiva (1987).

• Occupational choice: Schmidt and Strauss (1975).

ML – RUM Specification

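• The utility that individual i derives from choosing alternative j is given by:

$$U_{ij} = V(x_{ij}) + \varepsilon_{ij} = \beta' x_{ij} + \varepsilon_{ij}, \qquad j = 1, \ldots, J$$

where

$$\varepsilon_{ij} \sim \text{iid extreme value}$$

i.e., $\varepsilon_{ij}$ has pdf and cdf, respectively, of

$$f(\varepsilon_{ij}) = \exp(-\varepsilon_{ij}) \exp\left(-\exp(-\varepsilon_{ij})\right)$$

$$F(\varepsilon_{ij}) = \exp\left(-\exp(-\varepsilon_{ij})\right)$$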

ML Choice Probabilities

• Given the distributional assumptions and representative agent specification, then defining

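$$y_{ij} = \begin{cases} 1 & \text{if } U_{ij} > U_{ik} \;\; \forall k \ne j \\ 0 & \text{otherwise} \end{cases}$$

we have that:

$$\begin{aligned} P_{ij} = \Pr(y_{ij} = 1 \mid x, \beta) &= \Pr(U_{ij} > U_{ik} \;\; \forall k \ne j \mid x, \beta) \\ &= \Pr(\varepsilon_{ik} - \varepsilon_{ij} < V_{ij} - V_{ik} \;\; \forall k \ne j \mid x, \beta) \\ &= \frac{\exp(V_{ij})}{\sum_k \exp(V_{ik})} \end{aligned}$$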
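The logit choice probabilities are straightforward to compute from the systematic utilities. A minimal sketch follows; the function name and the utility values are hypothetical, and the maximum is subtracted before exponentiating only to avoid overflow, which leaves the probabilities unchanged.

```python
import math

def mnl_probs(v):
    """Logit probabilities P_ij = exp(V_ij) / sum_k exp(V_ik) for one individual."""
    v_max = max(v)
    exp_v = [math.exp(vj - v_max) for vj in v]  # shifting by a constant cancels in the ratio
    total = sum(exp_v)
    return [e / total for e in exp_v]

# Hypothetical systematic utilities for J = 3 alternatives.
v = [1.0, 0.5, -0.2]
probs = mnl_probs(v)
print(probs)
print("sum:", sum(probs))
```
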

Merits of ML Specification

• The log-likelihood function is globally concave in its parameters (McFadden, 1973)

• Choice probabilities lie strictly within the unit interval and sum to one

• The log-likelihood function has a relatively simple form

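$$\begin{aligned} L(\beta;\, y, x) &= \sum_{i=1}^{n} \sum_{j=1}^{J} y_{ij} \ln P_{ij} \\ &= \sum_{i=1}^{n} \sum_{j=1}^{J} y_{ij} \left[ V_{ij} - \ln \sum_{k} \exp(V_{ik}) \right] \end{aligned}$$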
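The log-likelihood can be evaluated without forming the probabilities explicitly, using the second form (systematic utility of the chosen alternative minus the log of the sum of exponentials). The sketch below is illustrative only: the function name and data are hypothetical, and each individual's choice is stored as the index of the chosen alternative rather than as a vector of y_ij dummies.

```python
import math

def mnl_log_likelihood(chosen, utilities):
    """Sum over individuals of V_ij(chosen) - ln sum_k exp(V_ik)."""
    total = 0.0
    for j, v in zip(chosen, utilities):
        v_max = max(v)
        # log-sum-exp, shifted by the maximum for numerical stability
        log_sum = v_max + math.log(sum(math.exp(vk - v_max) for vk in v))
        total += v[j] - log_sum
    return total

# Hypothetical data: three individuals, three alternatives each.
utilities = [
    [1.0, 0.5, -0.2],
    [0.0, 0.3, 0.1],
    [-0.5, 0.2, 0.9],
]
chosen = [0, 1, 2]  # zero-based index of the alternative with y_ij = 1
print(mnl_log_likelihood(chosen, utilities))
```
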

The IIA Assumption

• In the logit model, for any two alternatives j and k, the ratio of choice probabilities is

$$\frac{P_{ij}}{P_{ik}} = \frac{\exp(V_{ij}) \big/ \sum_s \exp(V_{is})}{\exp(V_{ik}) \big/ \sum_s \exp(V_{is})} = \frac{\exp(V_{ij})}{\exp(V_{ik})} = \exp(V_{ij} - V_{ik})$$

• So relative choice probabilities are independent of the number and characteristics of all remaining alternatives

Blue Bus/Red Bus Example

• Suppose initially J=2, with individuals choosing between riding a blue bus (B) or a train (T) to work, with

$$P_{iB} = P_{iT} = 0.5 \quad\Rightarrow\quad \frac{P_{iB}}{P_{iT}} = 1$$

• Consider introducing a red bus (R) alternative. One might reasonably assume that

$$P_{iR} = P_{iB}$$

Together with the IIA restrictions, this implies that

$$P_{iR} = P_{iB} = P_{iT} = \tfrac{1}{3}$$

• One would more reasonably expect that

$$P_{iR} = P_{iB} = \tfrac{1}{4} \quad\text{and}\quad P_{iT} = \tfrac{1}{2}$$
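The example can be checked numerically. The sketch below assumes that all alternatives have the same systematic utility (set to zero here), which is one way to obtain equal initial probabilities; the function name is hypothetical.

```python
import math

def logit_probs(v):
    """Logit probabilities for a dict mapping alternative name to systematic utility."""
    denom = sum(math.exp(vk) for vk in v.values())
    return {name: math.exp(vj) / denom for name, vj in v.items()}

before = logit_probs({"blue bus": 0.0, "train": 0.0})
after = logit_probs({"blue bus": 0.0, "train": 0.0, "red bus": 0.0})

# IIA: the blue bus / train ratio is unchanged by adding the red bus,
# so the train's share is forced down rather than staying at one half.
ratio_before = before["blue bus"] / before["train"]
ratio_after = after["blue bus"] / after["train"]
print(before)
print(after)
print(ratio_before, ratio_after)
```
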

Nested Logit

• The standard logit model imposes considerable structure on the distribution of preferences, primarily through the iid assumption

• The Generalized extreme value (GEV) distribution provides a generalization that allows for a richer pattern of correlations

• Nested logit is one member of the class of GEV models and the most commonly used

Nested Logit

• Nested logit is appropriate in applications where the set of alternatives can be segmented into groups (or “nests”) satisfying the conditions:

– Choices among alternatives within a group satisfy the IIA assumption

– Choices among alternatives in two different nests are independent of alternatives in any other nests (IIN: independence from irrelevant nests)
