Goodness of Fit Measure
McFadden’s pseudo-R2
$$LRI = 1 - \frac{L}{L_0}$$
L0 denotes the log likelihood function when only a constant is included in the model
L denotes the unconstrained log likelihood function
$$L_0 = n\left[P \ln P + (1-P)\ln(1-P)\right]$$
where n is the sample size and P denotes the proportion of observations in the sample with yi=1.
Merits of Pseudo R2
• Lies in the unit interval
• If the model provides no predictive power, then $L = L_0$ and $LRI = 0$
• As the model’s fit improves, $L \to 0$ and $LRI \to 1$
• However, values strictly between 0 and 1 have no natural interpretation
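A minimal sketch of the index, assuming fitted probabilities are already available from some binary choice model. The function names and the six data points are hypothetical and only illustrate the arithmetic; the sketch also assumes the sample contains both outcomes, so that the logs in $L_0$ are defined.

```python
import math

def log_likelihood(y, p):
    """Bernoulli log likelihood of 0/1 outcomes y given fitted probabilities p."""
    return sum(math.log(pi) if yi == 1 else math.log(1.0 - pi)
               for yi, pi in zip(y, p))

def null_log_likelihood(y):
    """L0 = n[P ln P + (1 - P) ln(1 - P)], with P the sample share of y = 1."""
    n = len(y)
    share = sum(y) / n
    return n * (share * math.log(share) + (1.0 - share) * math.log(1.0 - share))

def mcfadden_lri(y, p):
    """LRI = 1 - L / L0."""
    return 1.0 - log_likelihood(y, p) / null_log_likelihood(y)

# Hypothetical outcomes and fitted probabilities (not from the source).
y = [1, 0, 1, 1, 0, 0]
p_fitted = [0.8, 0.3, 0.7, 0.6, 0.2, 0.4]
# A constant-only model predicts the sample share for every observation.
p_constant = [sum(y) / len(y)] * len(y)

lri_fitted = mcfadden_lri(y, p_fitted)
lri_constant = mcfadden_lri(y, p_constant)
```

With the constant-only probabilities the index is zero up to rounding, matching the "no predictive power" case; the fitted probabilities give a value strictly between 0 and 1.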
Cross Tabulation of Hits and Misses
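Let
$$\hat{y}_i = \begin{cases} 1 & \text{if } \hat{F}_i > 0.5 \\ 0 & \text{if } \hat{F}_i \le 0.5 \end{cases}$$

                    Predicted
Actual              $\hat{F}_i > 0.5$     $\hat{F}_i \le 0.5$
$y_i = 1$           hit                   miss
$y_i = 0$           miss                  hit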
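The cross tabulation can be sketched as follows. This is an illustration only: the function name, the dictionary layout keyed by (actual, predicted), and the data are assumptions, and the 0.5 cutoff is the prediction rule stated above.

```python
def hits_and_misses(y, f_hat, threshold=0.5):
    """Count outcomes in a 2x2 table keyed by (actual, predicted)."""
    table = {(1, 1): 0, (1, 0): 0, (0, 1): 0, (0, 0): 0}
    for yi, fi in zip(y, f_hat):
        y_pred = 1 if fi > threshold else 0  # predict 1 only above the cutoff
        table[(yi, y_pred)] += 1
    return table

# Hypothetical outcomes and fitted probabilities (not from the source).
y = [1, 0, 1, 1, 0, 0]
f_hat = [0.8, 0.3, 0.45, 0.6, 0.2, 0.55]

table = hits_and_misses(y, f_hat)
# Hits lie on the diagonal: actual and predicted agree.
hit_rate = (table[(1, 1)] + table[(0, 0)]) / len(y)
```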
Problems with Success Table
• The prediction rule is arbitrary.
– No linkage is made to the costs of the individual errors. It may be more costly to misclassify a “yes” as a “no” than to misclassify a “no” as a “yes”.
– Some loss function would be more helpful in this case.
• There is no way to judge departures from a diagonal table.
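One way to make the loss-function point concrete is to let the cutoff depend on the two misclassification costs. If the fitted probability is taken at face value, predicting 1 has lower expected cost whenever $\hat{F}_i$ exceeds cost_fp / (cost_fp + cost_fn). The sketch below assumes that rule; the cost values and data are hypothetical, and a lower realized cost in any particular sample is not guaranteed.

```python
def cost_based_threshold(cost_fn, cost_fp):
    """Cutoff above which predicting 1 has the lower expected cost."""
    return cost_fp / (cost_fp + cost_fn)

def total_cost(y, f_hat, threshold, cost_fn, cost_fp):
    """Sum the costs of false negatives and false positives in the sample."""
    cost = 0.0
    for yi, fi in zip(y, f_hat):
        y_pred = 1 if fi > threshold else 0
        if yi == 1 and y_pred == 0:
            cost += cost_fn  # a "yes" classified as a "no"
        elif yi == 0 and y_pred == 1:
            cost += cost_fp  # a "no" classified as a "yes"
    return cost

# Hypothetical data and costs: missing a "yes" is four times as costly.
y = [1, 0, 1, 1, 0, 0]
f_hat = [0.8, 0.3, 0.45, 0.6, 0.1, 0.55]
cost_fn, cost_fp = 4.0, 1.0

threshold = cost_based_threshold(cost_fn, cost_fp)
cost_at_half = total_cost(y, f_hat, 0.5, cost_fn, cost_fp)
cost_at_rule = total_cost(y, f_hat, threshold, cost_fn, cost_fp)
```

In this example the cost-based cutoff trades one expensive false negative for one extra cheap false positive.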
Extensions of Binary Choice Models
• Bivariate and Multivariate Choices
• Expanding the Number of Alternatives
– Ordered Choices
– Unordered Choice
    • Logit (multinomial/conditional)
    • Nested Logit
    • Mixed Logit
    • Multivariate Probit
Bivariate Choice
• An obvious extension of binary choice models arises when two or more such choices are made
• Examples:
– Labor literature: job market participation decisions, which may be combinations of single and multiple decisions
– Technology adoption (e.g., conservation tillage and soil testing)
– Medical Decisions: Treatment combinations (e.g., psychiatric help and drug treatments)
– Smoking: Decision to smoke and to educate oneself regarding risks
– Education: homeownership and decision of children to stay in school
Bivariate Choice (cont’d)
• One approach to modeling these choices is to assume that a system of latent variables exists; e.g., with two choices:
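$$y_{1i}^* = \beta_1' x_{1i} + \varepsilon_{1i}$$
$$y_{2i}^* = \beta_2' x_{2i} + \varepsilon_{2i}$$
but we observe the dummy variables
$$y_{ki} = \begin{cases} 1 & \text{if } y_{ki}^* > 0 \\ 0 & \text{if } y_{ki}^* \le 0 \end{cases} \qquad k = 1,2$$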
Bivariate Probit
• Bivariate probit emerges if the residuals are assumed to be joint standard normal
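$$\begin{pmatrix} \varepsilon_{1i} \\ \varepsilon_{2i} \end{pmatrix} \sim N\left[ \begin{pmatrix} 0 \\ 0 \end{pmatrix}, \begin{pmatrix} 1 & \rho \\ \rho & 1 \end{pmatrix} \right]$$
So the corresponding log-likelihood function is simply
$$L(\beta_1, \beta_2, \rho \mid y_1, y_2, x_1, x_2) = \sum_{i=1}^{N} \ln \Phi_2\left( q_{1i}\beta_1' x_{1i},\; q_{2i}\beta_2' x_{2i},\; q_{1i} q_{2i} \rho \right)$$
where $q_{ji} = 2y_{ji} - 1;\ j = 1,2$, and $\Phi_2(\cdot, \cdot, r)$ denotes the bivariate standard normal cdf with correlation $r$.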
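The data-generating process behind the bivariate probit can be sketched by simulating the latent system. This is an illustration under simplifying assumptions: scalar regressors and coefficients, hypothetical parameter values, and correlated errors built from two independent standard normal draws. It shows how the observed dummies and the sign variables q = 2y - 1 arise; it does not evaluate the likelihood, which would need a bivariate normal cdf.

```python
import math
import random

def simulate_bivariate_choices(x1, x2, beta1, beta2, rho, seed=0):
    """Draw (y1, y2, q1, q2) for each observation from the latent system."""
    rng = random.Random(seed)
    draws = []
    for x1i, x2i in zip(x1, x2):
        # Standard normal errors with correlation rho from two independent draws.
        z1 = rng.gauss(0.0, 1.0)
        z2 = rng.gauss(0.0, 1.0)
        eps1 = z1
        eps2 = rho * z1 + math.sqrt(1.0 - rho ** 2) * z2
        # Observed dummies: y_k = 1 if the latent variable y_k* is positive.
        y1 = 1 if beta1 * x1i + eps1 > 0 else 0
        y2 = 1 if beta2 * x2i + eps2 > 0 else 0
        # q = 2y - 1 recodes the 0/1 outcomes as -1/+1.
        draws.append((y1, y2, 2 * y1 - 1, 2 * y2 - 1))
    return draws

# Hypothetical regressors and parameters (not from the source).
x1 = [0.5, -1.0, 2.0, 0.0]
x2 = [1.0, 0.3, -0.7, 1.5]
draws = simulate_bivariate_choices(x1, x2, beta1=1.0, beta2=-0.5, rho=0.6)
```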
Expanding the Choice Set Beyond Two
• Defining the Choice Set
• Deriving Choice Probabilities
• Identification in Choice Models
Source
• Train, K., (2002), Discrete Choice Methods with Simulation, Cambridge, MA: Cambridge University Press, Ch. 2.
Discrete Choice Models
Discrete choice models characterize decision making in which the available alternatives are
• Mutually exclusive
• Exhaustive
• Finite
Defining the relevant choice set is a crucial step in the analysis
Mutually Exclusive
• Not very restrictive - one can usually redefine the choice set to satisfy this restriction
• Example #1 (Train): Home Heating Fuel Choice
– Electric
– Natural Gas
– Oil
– Wood
– Other
Issue: Some households have dual-fuel or multiple systems
Possible Solutions
Explicitly include combinations
• Electric only
• Natural Gas only
• Oil only
• Wood only
• Natural gas with electric room heaters
• Etc.
Focus on primary heating system
• Potential problems:
– It may be difficult to identify primary system from available data
– Secondary heaters may be important users of energy (e.g., electric space heaters)
Example #2: Recreation Demand
Consider the following choice set for a summer vacation:
1. Yellowstone National Park (Wyoming, Montana, Idaho)
2. Badlands National Park (South Dakota)
3. Grand Teton National Park (Wyoming)
4. Mount Rushmore National Memorial (South Dakota)
5. Devils Tower National Monument (Wyoming)
For many vacationers, the alternatives in this choice set are not mutually exclusive
[Figure: the five sites, numbered 1 to 5 as in the list above]
Alternative Solutions
Redefine alternatives in terms of portfolios
• Yellowstone only
• Yellowstone and Grand Tetons only
• Grand Tetons and Mount Rushmore only
• Etc.
Sequence may matter as well
Focus on Primary Destination
Potential Problems:
• Identifying primary destination – even in minds of recreationists
• Constructing corresponding explanatory variables (e.g. travel costs)
Exhaustive Criteria
Readily satisfied by including a “none of the above” alternative
• Heating: “no heating”
• Recreation demand: “stay at home” or “all other sites”
Relevance
• Choice set should be defined not only to be mutually exclusive and exhaustive, but also relevant
• Choices must be ones from which agent actually chooses, not just the universe of alternatives
Example: Recreation Demand
• Should the choice set consist of
– All feasible sites?
    • Nothing is revealed about not visiting an unknown site
    • Computationally intractable
– Only sites visited in past five years?
    • Useful information is revealed about sites that are not visited.
– Sites within a given distance?
    • Limits applicability of results to, say, day trips.
• Similar issues emerge in job and career search literature
• Ideally, one would model both choices and accumulation of information about choice set.
Finite or Countable Choice Sets
• This is a restrictive characteristic
– Excludes the “how much” decision
– Focuses on the “which” or “how many” decision
• These decisions may be tied
– Electric power: Choice of rate structure and quantity of electricity consumed
– Labor: Choice of participation decision and reservation wage or how much to work
– Recreation: Choice of where to visit and how many trips to take or trip duration
Ordered Choice Models
Ordered responses arise in many empirical settings:
• Opinion surveys: asking if you strongly agree, slightly agree, slightly disagree, or strongly disagree with a statement.
• Educational data: level of schooling: grade school graduate, high school graduate, some college education, college graduate, some advanced degree, etc.
• Employment data: unemployed, part-time, full time.
• Bond ratings
• Post-release performance of prisoners
• etc.
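Ordered Choice Models (cont’d)
• Typically motivated by assuming an underlying latent variable
$$y_i^* = \beta' x_i + \varepsilon_i$$
• Observe:
$$y_i = \begin{cases} 1 & \text{if } \mu_0 < y_i^* \le \mu_1 \\ 2 & \text{if } \mu_1 < y_i^* \le \mu_2 \\ 3 & \text{if } \mu_2 < y_i^* \le \mu_3 \\ \vdots & \\ J & \text{if } \mu_{J-1} < y_i^* \le \mu_J \end{cases}$$
where $\mu_{j-1} < \mu_j\ \forall j$
• Typically: $\mu_0 = -\infty$ and $\mu_J = +\infty$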
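Ordered Choice Models (cont’d)
• The resulting choice probabilities become:
$$\begin{aligned} \Pr(y_i = j) &= \Pr(\mu_{j-1} < \beta' x_i + \varepsilon_i \le \mu_j) \\ &= \Pr(\mu_{j-1} - \beta' x_i < \varepsilon_i \le \mu_j - \beta' x_i) \\ &= \Pr(\varepsilon_i \le \mu_j - \beta' x_i) - \Pr(\varepsilon_i \le \mu_{j-1} - \beta' x_i) \\ &= F(\mu_j - \beta' x_i) - F(\mu_{j-1} - \beta' x_i) \end{aligned}$$
where $F$ is the cdf of $\varepsilon_i$.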
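Graphically – Binary Choice
[Figure: $\varepsilon_i$ axis with a threshold at $-\beta' x_i$ separating the regions $y_i = 0$ and $y_i = 1$]
Graphically
[Figure: $\varepsilon_i$ axis with thresholds at $\mu_1 - \beta' x_i$, $\mu_2 - \beta' x_i$ and $\mu_3 - \beta' x_i$ separating the regions $y_i = 1$, $y_i = 2$, $y_i = 3$ and $y_i = 4$]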
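Ordered Probit
• Most applications assume that the unobserved component of the latent variable is normally distributed; i.e.,
$$\varepsilon_i \sim N(0,1)$$
so that
$$\Pr(y_i = j) = \Phi(\mu_j - \beta' x_i) - \Phi(\mu_{j-1} - \beta' x_i)$$
where $\mu_0 = -\infty$ and $\mu_J = +\infty$
• Equivalently:
$$\Pr(y_i = j) = \Phi(\beta' x_i - \mu_{j-1}) - \Phi(\beta' x_i - \mu_j)$$
• Normalization required: Either no constant or $\mu_1 = 0$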
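The ordered probit probabilities can be computed directly from the index and the thresholds. The sketch below is illustrative: the function names are hypothetical, the index value and cutpoints are made up, and the standard normal cdf is built from `math.erf` in the standard library. The first cutpoint is set to zero, following the normalization above.

```python
import math

def std_normal_cdf(z):
    """Standard normal cdf, Phi(z), via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def ordered_probit_probs(index, cutpoints):
    """Return [Pr(y = 1), ..., Pr(y = J)] for index = beta'x_i.

    cutpoints holds mu_1 < ... < mu_{J-1}; mu_0 = -inf and mu_J = +inf
    are added here.
    """
    mu = [-math.inf] + list(cutpoints) + [math.inf]
    return [std_normal_cdf(mu[j] - index) - std_normal_cdf(mu[j - 1] - index)
            for j in range(1, len(mu))]

# Hypothetical index and thresholds with mu_1 = 0, giving J = 4 categories.
probs = ordered_probit_probs(index=0.0, cutpoints=[0.0, 1.0, 2.0])
# Raising the index shifts probability mass toward the higher categories.
probs_shifted = ordered_probit_probs(index=1.0, cutpoints=[0.0, 1.0, 2.0])
```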
Multinomial Logit
• Obviously, a second generalization of the binary choice set-up arises when more than two unordered alternatives are presented to decision makers.
– Much of the early research arose from studies of travel mode choice in the transportation literature, with work by McFadden (1974a,b) and Domencich and McFadden (1975).
– More recently applied to
    • Modeling recreation site selection: Hausman, Leonard, and McFadden (1995) and Morey, Rowe, and Watson (1993)
    • Telecommunications service selection: Train, McFadden, and Ben-Akiva (1987).
    • Occupational choice: Schmidt and Strauss (1975).
ML – RUM Specification
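• The utility from individual i choosing alternative j is given by:
$$U_{ij} = V(x_{ij}, \beta) + \varepsilon_{ij} = \beta' x_{ij} + \varepsilon_{ij}, \qquad j = 1, \ldots, J$$
where
$$\varepsilon_{ij} \sim \text{iid extreme value}$$
i.e., $\varepsilon_{ij}$ has pdf and cdf, respectively, of
$$f(\varepsilon_{ij}) = \exp(-\varepsilon_{ij}) \exp\left[-\exp(-\varepsilon_{ij})\right]$$
$$F(\varepsilon_{ij}) = \exp\left[-\exp(-\varepsilon_{ij})\right]$$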
ML Choice Probabilities
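• Given the distributional assumptions and representative agent specification, then defining
$$y_{ij} = \begin{cases} 1 & \text{if } U_{ij} > U_{ik}\ \forall k \ne j \\ 0 & \text{otherwise} \end{cases}$$
we have that:
$$\begin{aligned} P_{ij} &= \Pr(y_{ij} = 1 \mid x, \beta) \\ &= \Pr(U_{ij} > U_{ik}\ \forall k \ne j \mid x, \beta) \\ &= \Pr(\varepsilon_{ik} - \varepsilon_{ij} < V_{ij} - V_{ik}\ \forall k \ne j \mid x, \beta) \\ &= \frac{\exp(V_{ij})}{\sum_k \exp(V_{ik})} \end{aligned}$$
where $V_{ij} = V(x_{ij}, \beta)$.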
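The closed-form logit probabilities are simple to compute. The sketch below takes the representative utilities for one individual as given; the function name and the utility values are hypothetical. Subtracting the largest utility before exponentiating is a numerical safeguard and leaves the probabilities unchanged, since only utility differences matter.

```python
import math

def mnl_probabilities(v):
    """P_ij = exp(V_ij) / sum_k exp(V_ik) for one individual's utilities v."""
    v_max = max(v)  # shift by the max to avoid overflow; ratios are unaffected
    exp_v = [math.exp(vj - v_max) for vj in v]
    total = sum(exp_v)
    return [e / total for e in exp_v]

# Hypothetical representative utilities for J = 3 alternatives.
probs = mnl_probabilities([1.0, 0.5, -0.2])
# Adding the same constant to every utility leaves the probabilities unchanged.
probs_plus_constant = mnl_probabilities([11.0, 10.5, 9.8])
```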
Merits of ML Specification
• The log-likelihood function is globally concave in its parameters (McFadden, 1973)
• Choice probabilities lie strictly within the unit interval and sum to one
• The log-likelihood function has a relatively simple form
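$$\begin{aligned} L(\beta \mid y, x) &= \sum_{i=1}^{n} \sum_{j=1}^{J} y_{ij} \ln P_{ij} \\ &= \sum_{i=1}^{n} \sum_{j=1}^{J} y_{ij} \left[ V_{ij} - \ln \sum_{k=1}^{J} \exp(V_{ik}) \right] \end{aligned}$$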
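The second form of the log-likelihood translates directly into code, since each individual contributes only the term for the chosen alternative. This is a sketch under assumptions: utilities are supplied as precomputed numbers rather than as a function of parameters, and the function name and data layout (index of the chosen alternative per individual) are hypothetical.

```python
import math

def mnl_log_likelihood(chosen, utilities):
    """Sum over individuals of V_ij - ln(sum_k exp(V_ik)) for the chosen j.

    chosen[i] is the index of the alternative with y_ij = 1;
    utilities[i] is the list [V_i1, ..., V_iJ].
    """
    total = 0.0
    for j, v in zip(chosen, utilities):
        v_max = max(v)  # log-sum-exp shift for numerical stability
        log_sum_exp = v_max + math.log(sum(math.exp(vk - v_max) for vk in v))
        total += v[j] - log_sum_exp
    return total

# Hypothetical data: two individuals, three alternatives, equal utilities.
ll_equal = mnl_log_likelihood([0, 2], [[0.0, 0.0, 0.0], [0.0, 0.0, 0.0]])
# Raising the utility of each chosen alternative raises the log likelihood.
ll_better = mnl_log_likelihood([0, 2], [[1.0, 0.0, 0.0], [0.0, 0.0, 1.0]])
```

With equal utilities every choice has probability 1/3, so the log likelihood is 2 ln(1/3).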
The IIA Assumption
• In the logit model for any two alternatives, j and k, the ratio of choice probabilities is
$$\frac{P_{ij}}{P_{ik}} = \frac{\exp(V_{ij}) / \sum_s \exp(V_{is})}{\exp(V_{ik}) / \sum_s \exp(V_{is})} = \frac{\exp(V_{ij})}{\exp(V_{ik})} = \exp(V_{ij} - V_{ik})$$
• So that relative choice probabilities are independent of the number and characteristics of all remaining alternatives
Blue Bus/Red Bus Example
• Suppose initially J=2, with individuals choosing between riding a blue bus (B) or a train (T) to work, with
$$P_{iB} = P_{iT} = 0.5 \;\Rightarrow\; \frac{P_{iB}}{P_{iT}} = 1$$
• Consider introducing a red bus (R) alternative. One might reasonably assume that
$$P_{iR} = P_{iB}$$
Together with the IIA restrictions, this implies that
$$P_{iR} = P_{iB} = P_{iT} = \tfrac{1}{3}$$
• One would more reasonably expect that
$$P_{iR} = P_{iB} = \tfrac{1}{4} \text{ and } P_{iT} = \tfrac{1}{2}$$
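The example can be checked numerically with the logit formula. In this sketch all representative utilities are set to zero so that the initial probabilities are 0.5 each; the function name and dictionary keys are hypothetical.

```python
import math

def logit_probs(v):
    """Logit choice probabilities for a dict of {alternative: utility}."""
    exp_v = {name: math.exp(value) for name, value in v.items()}
    total = sum(exp_v.values())
    return {name: e / total for name, e in exp_v.items()}

# Before: blue bus and train only, with equal utilities.
before = logit_probs({"blue_bus": 0.0, "train": 0.0})
# After: a red bus with the same utility as the blue bus is added.
after = logit_probs({"blue_bus": 0.0, "train": 0.0, "red_bus": 0.0})

# IIA: the blue bus to train ratio does not change when the red bus appears.
ratio_before = before["blue_bus"] / before["train"]
ratio_after = after["blue_bus"] / after["train"]
```

The logit model moves the train share from 1/2 to 1/3, whereas the more intuitive outcome would leave it at 1/2 and split the bus share.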
Nested Logit
• The standard logit model imposes considerable structure on the distribution of preferences, primarily through the iid assumption
• The Generalized extreme value (GEV) distribution provides a generalization that allows for a richer pattern of correlations
• Nested logit is one member of the class of GEV models and the most commonly used
Nested Logit (cont’d)
• Nested Logit is appropriate in applications where the set of alternatives can be segmented into groups (or “nests”) satisfying the conditions:
– Choices from among alternatives within a group satisfy the IIA assumption
– Choices from among alternatives in two different nests are independent of alternatives in any other nests (IIN: independence from irrelevant nests)