Nested Logit
Brad Jones
Department of Political Science, University of California, Davis
April 30, 2008
Jones POL 213: Research Methods
Nested Logit
- Interesting model that does not have the IIA property.
- Possible candidate model for structured choice situations.
- Conceptual example:
  - J political parties a voter i could choose from.
  - Say: Green, Workers, Social Dem., Moderate, CR, Extreme Right
- Models?
  - Conditional logit or MNL?
  - IIA property could be an issue.
Nested Logit
- IIA “says” that the disturbances are independent and homoskedastic.
- Odds are assumed to remain the same if some alternative is removed.
- Problem: one left party is (possibly) a close substitute of another.
- If CD voters split their vote across two leftist parties, elimination of one from the choice set does not imply they will randomly distribute over remaining choices.
- That is, they most likely will gravitate to the remaining leftist party.
- If so, odds ratios will change because of nonrandom redistribution.
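A minimal sketch of what IIA implies, written in Python with made-up utilities for the six parties (the numbers are illustrative assumptions, not estimates). Under MNL, dropping one alternative rescales every remaining probability by the same factor, so the odds between any two remaining parties stay fixed and the dropped party's voters are spread proportionally instead of moving to the close substitute.

```python
import math

def mnl_probs(utilities):
    """Multinomial/conditional logit probabilities from a dict of utilities."""
    denom = sum(math.exp(v) for v in utilities.values())
    return {name: math.exp(v) / denom for name, v in utilities.items()}

# Hypothetical systematic utilities for one voter (illustrative values only).
utilities = {
    "Green": 0.4, "Workers": 0.5,        # left
    "SocialDem": 0.2, "Moderate": 0.1,   # center
    "CR": -0.3, "ExtremeRight": -1.0,    # right
}

full = mnl_probs(utilities)

# Remove one leftist party from the choice set and recompute.
reduced = mnl_probs({k: v for k, v in utilities.items() if k != "Workers"})

# Odds between two remaining parties, before and after the removal.
odds_before = full["Green"] / full["Moderate"]
odds_after = reduced["Green"] / reduced["Moderate"]

# Ratio of new to old probability for every remaining party.
scale = {k: reduced[k] / full[k] for k in reduced}

print(f"odds Green:Moderate before = {odds_before:.4f}, after = {odds_after:.4f}")
print({k: round(s, 4) for k, s in scale.items()})
```

Green, the close substitute of Workers, gains by exactly the same proportion as Extreme Right, which is the behavior the bullets above call implausible.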
Nested Logit
- Under NL (or MNNL), the idea is to group comparable alternatives and then structure the choice setting as a “tree.”
- Voter i decides to vote leftist, centrist, or rightist.
- Call this the “top level” choice.
- Once this choice is made, the voter must decide which outcome to choose:
- Left: Green, Workers; Center: SD, Moderate; Right: CR, Extreme Right
- Basic result from conditional probability: $\Pr_{ij} = \Pr_{j|i} \times \Pr_i$
- Here j indexes the J outcomes (i.e. parties) and i indexes the branches.
Nested Logit
- Conditional probability says the probability of the “bottom level” choice is equal to the conditional probability of selecting j given branch i times the probability that branch i was selected.
- There are two levels of probability because there are two levels of decisions.
- Consider the conditional probability statement, $\Pr_{j|i}$.
- Suppose we specify a utility model:
\[ U_{ij} = \beta' x_{ij} + \alpha' w_i \]
- As in the CL presentation, the $x_{ij}$ are covariates that can change over the choices (bottom level) and the $w_i$ are covariates that are attributes of the choice sets (top level).
Nested Logit
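- The conditional probabilities can only be a function of the $x_{ij}$:
\[
\Pr_{j|i} = \frac{\exp(\beta' x_{ij}) \exp(\alpha' w_i)}{\exp(\alpha' w_i) \sum_{k=1}^{N_i} \exp(\beta' x_{ik})}
          = \frac{\exp(\beta' x_{ij})}{\sum_{k=1}^{N_i} \exp(\beta' x_{ik})}
\]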
- The “top level” probability is defined by first identifying what is sometimes called the “inclusive value”:
\[ I_i = \log\left( \sum_{k=1}^{N_i} \exp(\beta' x_{ik}) \right) \]
- The probability of branch i is then
\[ \Pr_i = \frac{\exp(\alpha' w_i + \tau_i I_i)}{\sum_{m=1}^{C} \exp(\alpha' w_m + \tau_m I_m)} \]
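The two-level formulas can be sketched in a few lines of Python. Everything numeric below is a hypothetical illustration for the party example (the linear predictions and the tau values are assumptions, not estimates), and `nested_logit_probs` is a helper name introduced here. The sketch also drops Workers from the left branch to show how the tree changes the redistribution of choices.

```python
import math

def nested_logit_probs(xb_bottom, xb_top, tau):
    """Branch and joint probabilities for one chooser.

    xb_bottom: {branch: {alternative: beta'x_ij}}  bottom-level predictions
    xb_top:    {branch: alpha'w_i}                 top-level predictions
    tau:       {branch: tau_i}                     inclusive value parameters
    """
    # Inclusive value I_i = log(sum_k exp(beta'x_ik)) for each branch.
    incl = {b: math.log(sum(math.exp(v) for v in alts.values()))
            for b, alts in xb_bottom.items()}
    # Top level: Pr_i is proportional to exp(alpha'w_i + tau_i * I_i).
    top = {b: math.exp(xb_top[b] + tau[b] * incl[b]) for b in xb_bottom}
    denom = sum(top.values())
    p_branch = {b: t / denom for b, t in top.items()}
    # Joint: Pr_ij = Pr_{j|i} * Pr_i, with Pr_{j|i} = exp(beta'x_ij - I_i).
    p_joint = {b: {j: math.exp(v - incl[b]) * p_branch[b] for j, v in alts.items()}
               for b, alts in xb_bottom.items()}
    return p_branch, p_joint

# Hypothetical inputs (illustrative values only).
xb_bottom = {
    "left": {"Green": 0.2, "Workers": 0.5},
    "center": {"SocialDem": 0.1, "Moderate": -0.2},
    "right": {"CR": -0.4, "ExtremeRight": -1.1},
}
xb_top = {"left": 0.3, "center": 0.0, "right": -0.5}
tau = {"left": 0.6, "center": 0.8, "right": 0.7}

full_branch, full_joint = nested_logit_probs(xb_bottom, xb_top, tau)

# Drop Workers from the left branch and recompute.
reduced_bottom = {b: {j: v for j, v in alts.items() if j != "Workers"}
                  for b, alts in xb_bottom.items()}
red_branch, red_joint = nested_logit_probs(reduced_bottom, xb_top, tau)

# Odds across branches (Green vs. Moderate) and within a branch (SD vs. Moderate).
odds_before = full_joint["left"]["Green"] / full_joint["center"]["Moderate"]
odds_after = red_joint["left"]["Green"] / red_joint["center"]["Moderate"]
within_before = full_joint["center"]["SocialDem"] / full_joint["center"]["Moderate"]
within_after = red_joint["center"]["SocialDem"] / red_joint["center"]["Moderate"]

print(f"Green:Moderate odds     before = {odds_before:.4f}, after = {odds_after:.4f}")
print(f"SocialDem:Moderate odds before = {within_before:.4f}, after = {within_after:.4f}")
```

With these assumed values the cross-branch odds move in Green's favor once Workers is removed, while the odds within the untouched center branch do not change.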
Nested Logit
- The “inclusive value” parameter, $\tau$, is the weight accorded each of the branches.
- Under CL (or MNL), we assume this weight is fixed at 1.
- Estimation is done via full information maximum likelihood:
\[ \log L = \sum_{n=1}^{N} \log\left[ \Pr_{j|i} \times \Pr_i \right] \]
  where the sum runs over the N choosers and (i, j) are the branch and alternative each one chose.
- Model has many parameters.
- It requires a lot of work to interpret.
- My job is to show you how ...
- Stata is actually quite good with this model.
Nested Logit: Illustration
- I’m going to continue with the Stata data set provided by their website.
- We used it with conditional logit.
- Let’s consider the data structure.
. list family_id restaurant chosen kids rating distance cost income in 1/21

     +---------------------------------------------------------------------------------+
     | family~d     restaurant   chosen   kids   rating   distance       cost   income |
     |---------------------------------------------------------------------------------|
  1. |        1      Freebirds        1      1        0   1.245553   5.444695       39 |
  2. |        1     MamasPizza        0      1        1    2.82493    6.19446       39 |
  3. |        1     CafeEccell        0      1        2    4.21293   8.182085       39 |
  4. |        1    LosNortenos        0      1        3   4.167634   9.861741       39 |
  5. |        1     WingsNmore        0      1        2   6.330531   9.667909       39 |
     |---------------------------------------------------------------------------------|
  6. |        1   Christophers        0      1        4   10.19829   25.95777       39 |
  7. |        1        MadCows        0      1        5   5.601388   28.99846       39 |
  8. |        2      Freebirds        0      3        0   4.162657    5.26874       58 |
  9. |        2     MamasPizza        0      3        1   2.865081   5.728618       58 |
 10. |        2     CafeEccell        0      3        2   5.337799   7.054855       58 |
     |---------------------------------------------------------------------------------|
 11. |        2    LosNortenos        1      3        3   4.282864   10.78514       58 |
 12. |        2     WingsNmore        0      3        2   8.133914   8.313948       58 |
 13. |        2   Christophers        0      3        4   8.664631    21.2801       58 |
 14. |        2        MadCows        0      3        5   9.119597   25.87567       58 |
 15. |        3      Freebirds        1      3        0   2.112586   4.616315       30 |
     |---------------------------------------------------------------------------------|
 16. |        3     MamasPizza        0      3        1   2.215329   5.992166       30 |
 17. |        3     CafeEccell        0      3        2   6.978715   7.980528       30 |
 18. |        3    LosNortenos        0      3        3   5.117877    10.0605       30 |
 19. |        3     WingsNmore        0      3        2   5.312941    8.76644       30 |
 20. |        3   Christophers        0      3        4   9.551273   23.64499       30 |
     |---------------------------------------------------------------------------------|
 21. |        3        MadCows        0      3        5   5.539806   24.72128       30 |
     +---------------------------------------------------------------------------------+
. nlogitgen type=restaurant(fast: Freebirds | MamasPizza,
family: CafeEccell | LosNortenos | WingsNmore, fancy: Christophers | MadCows)
This returns:
new variable type is generated with 3 groups
label list lb_type
lb_type:
1 fast
2 family
3 fancy
. nlogittree restaurant type <-GIVES US THE TREE STRUCTURE.
Type is the branch; restaurants are the "twigs."
tree structure specified for the nested logit model

top --> bottom

type       restaurant
--------------------------
fast       Freebirds
           MamasPizza
family     CafeEccell
           LosNorte~s
           WingsNmore
fancy      Christop~s
           MadCows
. nlogit chosen (restaurant= cost rating distance)
(type = incFast incFancy kidFast kidFancy), group(family_id) nolog
Nested logit estimates
Levels             =          2                 Number of obs      =      2100
Dependent variable =     chosen                 LR chi2(10)        =  199.6293
Log likelihood     =  -483.9584                 Prob > chi2        =    0.0000

------------------------------------------------------------------------------
             |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
restaurant   |
        cost |  -.0944352     .03402    -2.78   0.006    -.1611131   -.0277572  <-These are the beta parms.
      rating |   .1793759    .126895     1.41   0.157    -.0693338    .4280855
    distance |  -.1745797   .0433352    -4.03   0.000    -.2595152   -.0896443
-------------+----------------------------------------------------------------
type         |
     incFast |  -.0287502   .0116242    -2.47   0.013    -.0515332   -.0059672  <-WHY DO I HAVE THESE?
    incFancy |   .0458373   .0089109     5.14   0.000     .0283722    .0633024  <-These are the alpha parms.
     kidFast |  -.0704164   .1394359    -0.51   0.614    -.3437058    .2028729
    kidFancy |  -.3626381   .1171277    -3.10   0.002    -.5922041   -.1330721
-------------+----------------------------------------------------------------
(incl. value |
 parameters) |
type         |
       /fast |   5.715758   2.332871     2.45   0.014     1.143415     10.2881  <-These are the tau parms.
     /family |   1.721222   1.152002     1.49   0.135    -.5366608    3.979105
      /fancy |   1.466588   .4169075     3.52   0.000     .6494642    2.283711
------------------------------------------------------------------------------
LR test of homoskedasticity (iv = 1): chi2(3)=  9.90      Prob > chi2 = 0.0194
------------------------------------------------------------------------------
For fun.
. nlogit chosen (restaurant= cost rating distance) (type = incFast
incFancy kidFast kidFancy), group(family_id)
nolog ivc(fast=1, family=1, fancy=1) notree <---CONSTRAINING TAU TO 1
User-defined constraints:
IV constraints:
  [fast]_cons = 1
  [family]_cons = 1
  [fancy]_cons = 1

Nested logit regression
Levels             =          2                 Number of obs      =      2100
Dependent variable =     chosen                 LR chi2(7)         =  189.7294
Log likelihood     = -488.90834                 Prob > chi2        =    0.0000

------------------------------------------------------------------------------
             |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
restaurant   |
        cost |  -.1367799   .0358479    -3.82   0.000    -.2070404   -.0665193
      rating |   .3066626   .1418291     2.16   0.031     .0286827    .5846424
    distance |  -.1977508   .0471653    -4.19   0.000    -.2901931   -.1053085
-------------+----------------------------------------------------------------
type         |
     incFast |  -.0390182   .0094018    -4.15   0.000    -.0574454    -.020591
    incFancy |   .0407053   .0080405     5.06   0.000     .0249462    .0564644
     kidFast |  -.2398756   .1063674    -2.26   0.024    -.4483517   -.0313994
    kidFancy |  -.3893868   .1143797    -3.40   0.001    -.6135669   -.1652067
-------------+----------------------------------------------------------------
(incl. value |
 parameters) |
type         |
       /fast |          1          .        .       .            .           .
     /family |          1          .        .       .            .           .
      /fancy |          1          .        .       .            .           .
------------------------------------------------------------------------------
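The LR test of iv = 1 reported with the unconstrained estimates can be reproduced from the two log likelihoods. The Python sketch below is only a cross-check, not part of the Stata session; it uses the closed-form upper tail of a chi-squared distribution with 3 degrees of freedom so that no external library is needed, and `chi2_sf_3df` is a helper name introduced here.

```python
import math

ll_unconstrained = -483.9584    # nested logit with the tau parameters free
ll_constrained = -488.90834     # nested logit with all tau fixed at 1

# LR statistic: twice the gain in log likelihood from freeing the 3 tau parms.
lr_stat = 2.0 * (ll_unconstrained - ll_constrained)
df = 3

def chi2_sf_3df(x):
    """Upper-tail probability of a chi-squared variate with 3 degrees of freedom."""
    return (math.erfc(math.sqrt(x / 2.0))
            + math.sqrt(2.0 * x / math.pi) * math.exp(-x / 2.0))

p_value = chi2_sf_3df(lr_stat)
print(f"LR chi2({df}) = {lr_stat:.2f}, Prob > chi2 = {p_value:.4f}")
```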
Constraining tau=1 should recover conditional logit:
. clogit chosen cost rating dist incFast incFancy kidFast kidFancy, group(family_id)
Conditional (fixed-effects) logistic regression   Number of obs   =       2100
                                                  LR chi2(7)      =     189.73
                                                  Prob > chi2     =     0.0000
Log likelihood = -488.90834                       Pseudo R2       =     0.1625

------------------------------------------------------------------------------
      chosen |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
        cost |  -.1367799   .0358479    -3.82   0.000    -.2070404   -.0665193
      rating |   .3066622   .1418291     2.16   0.031     .0286823     .584642
    distance |  -.1977505   .0471653    -4.19   0.000    -.2901927   -.1053082
     incFast |  -.0390183   .0094018    -4.15   0.000    -.0574455   -.0205911
    incFancy |   .0407053   .0080405     5.06   0.000     .0249462    .0564644
     kidFast |  -.2398757   .1063674    -2.26   0.024     -.448352   -.0313994
    kidFancy |  -.3893862   .1143797    -3.40   0.001    -.6135662   -.1652061
------------------------------------------------------------------------------
(And it does; verify from previous slide)
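Why the two sets of estimates agree can be seen by setting every $\tau_m = 1$ in the two-level formulas. A short derivation in the notation of the earlier slides, using $\exp(I_m) = \sum_{k=1}^{N_m} \exp(\beta' x_{mk})$:

```latex
\begin{align*}
\Pr_{ij} &= \Pr_{j|i} \times \Pr_i
  = \frac{\exp(\beta' x_{ij})}{\exp(I_i)} \times
    \frac{\exp(\alpha' w_i + I_i)}{\sum_{m=1}^{C} \exp(\alpha' w_m + I_m)} \\
 &= \frac{\exp(\beta' x_{ij} + \alpha' w_i)}
         {\sum_{m=1}^{C} \exp(\alpha' w_m) \sum_{k=1}^{N_m} \exp(\beta' x_{mk})}
  = \frac{\exp(\beta' x_{ij} + \alpha' w_i)}
         {\sum_{m=1}^{C} \sum_{k=1}^{N_m} \exp(\beta' x_{mk} + \alpha' w_m)}
\end{align*}
```

The last expression is a conditional logit over all alternatives, with the bottom-level and top-level covariates entering a single index, which is the model the clogit call fits.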
But since the LR test rejects iv = 1 (p = 0.0194), IIA does not appear to hold, so we should continue with the unconstrained
nested logit.
Nested Logit: Illustration
- There are clearly many parameters here.
- Let’s figure out what all of this means.
- I’m going to make use of Stata’s predict options to back out various quantities.
- Note, any of these quantities could be retrieved “by hand” using functions from above.
Nested Logit: Illustration
- predict pb will return the probability of choosing restaurant j.
- predict p1, p1 will return the probability of branch i.
- predict condpb, condpb will return $\Pr_{j|i}$.
- predict xbb, xbb will return the linear prediction for the bottom-level choice.
- predict xb1, xb1 will return the linear prediction for the top-level choice.
- predict ivb, ivb will return the inclusive value of each branch (the $I_i$ defined earlier, not the parameter $\tau$).
. list family_id chosen pb p1 condpb restaurant type in 1/14

     +----------------------------------------------------------------------------+
     | family~d   chosen         pb         p1     condpb     restaurant     type |
     |----------------------------------------------------------------------------|
  1. |        1        1   .0831245   .1534534   .5416919      Freebirds     fast |
  2. |        1        0    .070329   .1534534   .4583081     MamasPizza     fast |
  3. |        1        0   .2763391   .7266538   .3802899     CafeEccell   family |
  4. |        1        0    .284375   .7266538   .3913486    LosNortenos   family |
  5. |        1        0   .1659397   .7266538   .2283615     WingsNmore   family |
     |----------------------------------------------------------------------------|
  6. |        1        0   .0399215   .1198928   .3329766   Christophers    fancy |
  7. |        1        0   .0799713   .1198928   .6670234        MadCows    fancy |
  8. |        2        0     .01176   .0286579   .4103599      Freebirds     fast |
  9. |        2        0   .0168978   .0286579   .5896401     MamasPizza     fast |
 10. |        2        0   .2942401   .7521651   .3911909     CafeEccell   family |
     |----------------------------------------------------------------------------|
 11. |        2        1   .2975767   .7521651   .3956268    LosNortenos   family |
 12. |        2        0   .1603483   .7521651   .2131824     WingsNmore   family |
 13. |        2        0   .1277234    .219177    .582741   Christophers    fancy |
 14. |        2        0   .0914536    .219177    .417259        MadCows    fancy |
     +----------------------------------------------------------------------------+

     +---------------------------------------------------------------------------------+
     | family~d   chosen          xbb          xb1         ivb     restaurant     type |
     |---------------------------------------------------------------------------------|
  1. |        1        1     -.731619    -1.191674   -.1185611      Freebirds     fast |
  2. |        1        0    -.8987747    -1.191674   -.1185611     MamasPizza     fast |
  3. |        1        0    -1.149417            0   -.1825957     CafeEccell   family |
  4. |        1        0    -1.120752            0   -.1825957    LosNortenos   family |
  5. |        1        0    -1.659421            0   -.1825957     WingsNmore   family |
     |---------------------------------------------------------------------------------|
  6. |        1        0    -3.514237     1.425016   -2.414554   Christophers    fancy |
  7. |        1        0    -2.819484     1.425016   -2.414554        MadCows    fancy |
  8. |        2        0     -1.22427    -1.878761   -.3335493      Freebirds     fast |
  9. |        2        0    -.8617923    -1.878761   -.3335493     MamasPizza     fast |
 10. |        2        0    -1.239346            0   -.3007865     CafeEccell   family |
     |---------------------------------------------------------------------------------|
 11. |        2        1     -1.22807            0   -.3007865    LosNortenos   family |
 12. |        2        0    -1.846394            0   -.3007865     WingsNmore   family |
 13. |        2        0    -2.804756     1.570648   -2.264743   Christophers    fancy |
 14. |        2        0    -3.138791     1.570648   -2.264743        MadCows    fancy |
     +---------------------------------------------------------------------------------+
Where do the numbers come from?
xbb: Linear prediction for the bottom level
It’s a function of the covariates cost, rating, and distance.
For the first observation, we see this is:
. display _b[cost]*cost+_b[rating]*rating+_b[distance]*distance
-.73161902
---------------
condpb: Conditional probability of restaurant j given branch i (from the Pr(j|i) equation given earlier):
. display exp(-.731619)/(exp(-.731619)+exp(-.8987747))
.54169189
for "Freebirds" and
. display exp(-.8987747)/(exp(-.731619)+exp(-.8987747))
.45830811
for "MamasPizza."
-----------------
xb1: Linear prediction for branch i
This is the linear prediction for the top-level model (or the branches).
For the first observation (a fast-food row, so incFancy and kidFancy are 0):
. display -.0287502*incFast + .0458373*incFancy + -.0704164*kidFast + -.3626381*kidFancy
-1.1916742
(The parms are the alphas from the model output).
---------------
OK. Now what about the "inclusive value parameters"?
These parameters essentially give us the "weight" the
chooser ascribes to each branch. Under conditional logit, this weight is assumed
to be uniform and therefore equal to 1. The LR test reported with the estimates shows that
these parameters are not jointly 1 (which provides evidence in favor of the nested logit model).
Above, I refer to these parameters as the tau. The question at
hand now is: where do the inclusive values I come from? For the first family in the data set, note the following:
. display log(exp( -.731619)+exp(-.8987747))
-.1185611
. display log(exp( -1.149417)+exp(-1.120752)+exp(-1.659421))
-.18259554
. display log(exp( -3.514237)+exp(-2.819484))
-2.4145539
What do the numbers represent? The numbers in parentheses are
our linear predictions for the "bottom level" choices, that is,
the "xbb." Note, then, what the
inclusive value gives us: the log of the summed exponentiated "xbb" within a branch,
that is, a one-number summary of how attractive each "branch" available to the chooser is.
The tau then weight these summaries in the top-level choice.
Ok, almost done. Now what about the top-level probabilities
(i.e. the probability of choosing fast food, family, or fancy?).
In lecture, I give the function. To compute it directly, we do the following:
. display exp(-1.191674 +_b[/fast]*-.1185611)/
(exp( -1.191674 + _b[/fast]*-.1185611) + exp(1.425016 +_b[/fancy]*-2.414554)
+ exp(0 +_b[/family]*-.1825957))
.15345345
Note where these numbers come from: they are the taus, the "ivb," and the "xb1."
In doing this exercise, we reproduce p1. Interpretation?
The probability of choosing a fast food restaurant is .15 for a person
with this covariate profile.
Finally, we can compute the "bottom-level" probability.
It is the simple conditional probability result. For the first observation, it is:
. display p1*condpb
.08312449
We could then "fill in the tree" for observation 1 (if we wanted to).
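As a cross-check outside Stata, the whole tree for family 1 can be filled in with a few lines of Python. This is a sketch that simply re-applies the formulas to the rounded xbb and xb1 values from the listing and the estimated tau parameters, so the results should agree with the p1, condpb, and pb columns only up to rounding.

```python
import math

# Bottom-level linear predictions (xbb) for family 1, copied from the listing.
xbb = {
    "fast":   {"Freebirds": -0.731619, "MamasPizza": -0.8987747},
    "family": {"CafeEccell": -1.149417, "LosNortenos": -1.120752,
               "WingsNmore": -1.659421},
    "fancy":  {"Christophers": -3.514237, "MadCows": -2.819484},
}
# Top-level linear predictions (xb1) and estimated inclusive value parameters.
xb1 = {"fast": -1.191674, "family": 0.0, "fancy": 1.425016}
tau = {"fast": 5.715758, "family": 1.721222, "fancy": 1.466588}

# Inclusive value of each branch: log of the summed exp(xbb) within the branch.
ivb = {b: math.log(sum(math.exp(v) for v in alts.values()))
       for b, alts in xbb.items()}

# Top-level probabilities p1.
weights = {b: math.exp(xb1[b] + tau[b] * ivb[b]) for b in xbb}
total = sum(weights.values())
p1 = {b: w / total for b, w in weights.items()}

# Unconditional probabilities pb = p1 * condpb for every restaurant.
pb = {}
for b, alts in xbb.items():
    for r, v in alts.items():
        condpb = math.exp(v - ivb[b])
        pb[r] = p1[b] * condpb
        print(f"{b:<7}{r:<13} p1={p1[b]:.4f} condpb={condpb:.4f} pb={pb[r]:.4f}")
```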
Nested Logit: Illustration
- So what would we get from this model if we fully interpreted it?
  - The probability of choice j. That is, the unconditional probability.
  - The conditional probability of choice j given the selection of branch i.
  - The probability of choosing branch i.
  - A direct test of the weight associated with each branch, given chooser attributes.
- Seems a useful empirical model for testing rational choice predictions.
- Data requirements are substantial, as is theory for nesting choices.