The Multigraph for Loglinear Models
![Page 1: The Multigraph for Loglinear Models](https://reader035.fdocuments.net/reader035/viewer/2022062411/56816867550346895ddec93c/html5/thumbnails/1.jpg)
The Multigraph for Loglinear Models
Harry Khamis
Statistical Consulting Center
Wright State University
Dayton, Ohio, USA
![Page 2: The Multigraph for Loglinear Models](https://reader035.fdocuments.net/reader035/viewer/2022062411/56816867550346895ddec93c/html5/thumbnails/2.jpg)
OUTLINE
1. LOGLINEAR MODEL (LLM)
- two-way table
- three-way table
- examples
2. MULTIGRAPH
- construction
- maximum spanning tree
- conditional independencies
- collapsibility
3. EXAMPLES
![Page 3: The Multigraph for Loglinear Models](https://reader035.fdocuments.net/reader035/viewer/2022062411/56816867550346895ddec93c/html5/thumbnails/3.jpg)
Loglinear Model
Goal
Identify the structure of associations among a set of categorical variables.
![Page 4: The Multigraph for Loglinear Models](https://reader035.fdocuments.net/reader035/viewer/2022062411/56816867550346895ddec93c/html5/thumbnails/4.jpg)
LLM: two variables

                            Y
           1      2      3     …     J     Total
-------------------------------------------------
      1    n11    n12    n13   …     n1J   n1+
      2    n21    n22    n23   …     n2J   n2+
X     .    .      .      .           .     .
      .    .      .      .           .     .
      I    nI1    nI2    nI3   …     nIJ   nI+
  Total    n+1    n+2    n+3   …     n+J   n
![Page 5: The Multigraph for Loglinear Models](https://reader035.fdocuments.net/reader035/viewer/2022062411/56816867550346895ddec93c/html5/thumbnails/5.jpg)
LLM: two variables
Example

Survey of High School Seniors in Dayton, Ohio
Collaboration: WSU Boonshoft School of Medicine and United Health Services of Dayton

                          Marijuana Use?
                          Yes     No      Total
------------------------------------------------
Cigarette Use?   Yes      914     581     1495
                 No        46     735      781
                 Total    960    1316     2276
![Page 6: The Multigraph for Loglinear Models](https://reader035.fdocuments.net/reader035/viewer/2022062411/56816867550346895ddec93c/html5/thumbnails/6.jpg)
LLM: two variables
Two discrete variables, X and Y
Model of independence: generating class is [X][Y]
![Page 7: The Multigraph for Loglinear Models](https://reader035.fdocuments.net/reader035/viewer/2022062411/56816867550346895ddec93c/html5/thumbnails/7.jpg)
LLM: two variables
LLM of independence:

$$\log \mu_{ij} = \lambda + \lambda_i^X + \lambda_j^Y$$

where

$$\sum_i \lambda_i^X = \sum_j \lambda_j^Y = 0$$
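As a hedged illustration, the independence model [X][Y] can be fitted to the cigarette-by-marijuana table from the Dayton survey slide. Under [X][Y] the fitted counts have the closed form (row total × column total) / n. The function names `fit_independence` and `g_squared`, and the choice of the likelihood-ratio statistic G² as the measure of fit, are assumptions made for this sketch and do not come from the slides.

```python
import math

def fit_independence(table):
    """Fitted counts under [X][Y]: (row total * column total) / n."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    return [[r * c / n for c in col_totals] for r in row_totals]

def g_squared(observed, fitted):
    """Likelihood-ratio statistic G^2 = 2 * sum(obs * log(obs / fitted))."""
    return 2.0 * sum(
        o * math.log(o / f)
        for obs_row, fit_row in zip(observed, fitted)
        for o, f in zip(obs_row, fit_row)
        if o > 0  # empty cells contribute nothing to G^2
    )

# Rows: cigarette use (yes, no); columns: marijuana use (yes, no).
observed = [[914, 581], [46, 735]]
fitted = fit_independence(observed)
g2 = g_squared(observed, fitted)
print(f"fitted count for (yes, yes): {fitted[0][0]:.1f}")
print(f"G^2 on 1 degree of freedom: {g2:.1f}")
```

A large G² relative to a chi-square distribution with (I − 1)(J − 1) = 1 degree of freedom indicates that the independence model fits these data poorly.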
![Page 8: The Multigraph for Loglinear Models](https://reader035.fdocuments.net/reader035/viewer/2022062411/56816867550346895ddec93c/html5/thumbnails/8.jpg)
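LLM: two variables
Saturated LLM: generating class is [XY]:

$$\log \mu_{ij} = \lambda + \lambda_i^X + \lambda_j^Y + \lambda_{ij}^{XY}$$

where

$$\sum_i \lambda_i^X = \sum_j \lambda_j^Y = \sum_i \lambda_{ij}^{XY} = \sum_j \lambda_{ij}^{XY} = 0$$

Note: the association terms $\lambda_{ij}^{XY}$ determine the odds ratio.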
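To make the odds-ratio note concrete, the sketch below computes the sample odds ratio for the 2 × 2 cigarette-by-marijuana table from the Dayton survey slide. In a saturated model the log odds ratio is the contrast of association terms for cells (1,1), (2,2), (1,2), and (2,1), which reduces to four times the (1,1) term under zero-sum constraints. The helper name `odds_ratio` is hypothetical.

```python
import math

def odds_ratio(table):
    """Sample odds ratio (n11 * n22) / (n12 * n21) for a 2 x 2 table."""
    (n11, n12), (n21, n22) = table
    return (n11 * n22) / (n12 * n21)

# Rows: cigarette use (yes, no); columns: marijuana use (yes, no).
observed = [[914, 581], [46, 735]]
theta = odds_ratio(observed)
log_theta = math.log(theta)

# With zero-sum constraints in a 2 x 2 table, log(theta) = 4 * lambda_11^XY.
lambda_11 = log_theta / 4.0
print(f"odds ratio = {theta:.2f}, log odds ratio = {log_theta:.2f}")
```

An odds ratio far from 1 corresponds to association terms far from 0, that is, to a departure from the independence model [X][Y].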
![Page 9: The Multigraph for Loglinear Models](https://reader035.fdocuments.net/reader035/viewer/2022062411/56816867550346895ddec93c/html5/thumbnails/9.jpg)
LLM: two variables

Interpretation          Generating Class    Probabilistic Model
----------------------------------------------------------------
X and Y independent     [X][Y]              p_ij = p_i+ p_+j
X and Y dependent       [XY]                p_ij
![Page 10: The Multigraph for Loglinear Models](https://reader035.fdocuments.net/reader035/viewer/2022062411/56816867550346895ddec93c/html5/thumbnails/10.jpg)
LLM: three variables
Example: Dayton High School Data

Alcohol    Cigarette    Marijuana Use
Use        Use          Yes     No
---------------------------------------
Yes        Yes          911     538
           No            44     456
No         Yes            3      43
           No             2     279
![Page 11: The Multigraph for Loglinear Models](https://reader035.fdocuments.net/reader035/viewer/2022062411/56816867550346895ddec93c/html5/thumbnails/11.jpg)
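LLM: three variables

Saturated LLM, [XYZ]:

$$\log \mu_{ijk} = \lambda + \lambda_i^X + \lambda_j^Y + \lambda_k^Z + \lambda_{ij}^{XY} + \lambda_{ik}^{XZ} + \lambda_{jk}^{YZ} + \lambda_{ijk}^{XYZ}$$

where

$$\sum_i \lambda_i^X = \sum_j \lambda_j^Y = \sum_i \lambda_{ij}^{XY} = \sum_j \lambda_{ij}^{XY} = \dots = \sum_k \lambda_{ijk}^{XYZ} = 0$$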
![Page 12: The Multigraph for Loglinear Models](https://reader035.fdocuments.net/reader035/viewer/2022062411/56816867550346895ddec93c/html5/thumbnails/12.jpg)
LLM: three variables

Interpretation              Generating Class    Probabilistic Model
---------------------------------------------------------------------------
mutual independence         [X][Y][Z]           p_ijk = p_i++ p_+j+ p_++k
joint independence          [XZ][Y]             p_ijk = p_i+k p_+j+
conditional independence    [XY][XZ]            p_ijk = p_ij+ p_i+k / p_i++
homogeneous association*    [XY][XZ][YZ]        *
saturated model             [XYZ]               p_ijk

*nondecomposable model
![Page 13: The Multigraph for Loglinear Models](https://reader035.fdocuments.net/reader035/viewer/2022062411/56816867550346895ddec93c/html5/thumbnails/13.jpg)
Decomposable LLMs
• closed-form expression for MLEs
• closed-form expression for asymptotic variances (Lee, 1977)
• conditional G² statistic simplifies
• allow for causal interpretations
• easier to interpret the LLM
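The closed-form MLEs of a decomposable model can be sketched with the three-way Dayton table (alcohol, cigarette, marijuana). The example fits the conditional independence model [AC][AM], for which the fitted count in cell (a, c, m) is n_ac+ n_a+m / n_a++. This model is chosen only to illustrate the formula; the slides do not claim it fits these data, and the function name is hypothetical.

```python
import math

# Dayton data indexed as n[a][c][m]; index 0 = yes, index 1 = no.
n = [
    [[911, 538], [44, 456]],  # alcohol = yes
    [[3, 43], [2, 279]],      # alcohol = no
]

def fit_conditional_independence(n):
    """Closed-form MLEs for [AC][AM]: n_ac+ * n_a+m / n_a++."""
    fitted = []
    for a_slice in n:
        n_c = [sum(row) for row in a_slice]        # n_ac+ for this level of A
        n_m = [sum(col) for col in zip(*a_slice)]  # n_a+m for this level of A
        n_a = sum(n_c)                             # n_a++
        fitted.append([[rc * cm / n_a for cm in n_m] for rc in n_c])
    return fitted

fitted = fit_conditional_independence(n)

# Likelihood-ratio statistic; every observed cell here is positive.
g2 = 2.0 * sum(
    o * math.log(o / f)
    for obs_a, fit_a in zip(n, fitted)
    for obs_row, fit_row in zip(obs_a, fit_a)
    for o, f in zip(obs_row, fit_row)
)
print(f"G^2 for [AC][AM]: {g2:.1f}")
```

No iterative fitting is needed here. A nondecomposable model such as [XY][XZ][YZ] has no such closed form and requires an iterative method.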
![Page 14: The Multigraph for Loglinear Models](https://reader035.fdocuments.net/reader035/viewer/2022062411/56816867550346895ddec93c/html5/thumbnails/14.jpg)
![Page 15: The Multigraph for Loglinear Models](https://reader035.fdocuments.net/reader035/viewer/2022062411/56816867550346895ddec93c/html5/thumbnails/15.jpg)
3 Categorical Variables: X, Y, and Z

If [X ⊗ Y] and [Y ⊗ Z], then [X ⊗ Z]

FALSE!
![Page 16: The Multigraph for Loglinear Models](https://reader035.fdocuments.net/reader035/viewer/2022062411/56816867550346895ddec93c/html5/thumbnails/16.jpg)
LLM: three variables

Interpretation              Generating Class    Probabilistic Model
---------------------------------------------------------------------------
mutual independence         [X][Y][Z]           p_ijk = p_i++ p_+j+ p_++k
joint independence          [XZ][Y]             p_ijk = p_i+k p_+j+
conditional independence    [XY][XZ]            p_ijk = p_ij+ p_i+k / p_i++
homogeneous association     [XY][XZ][YZ]        p_ijk = ψ_ij φ_ik ω_jk
saturated model             [XYZ]               p_ijk
![Page 17: The Multigraph for Loglinear Models](https://reader035.fdocuments.net/reader035/viewer/2022062411/56816867550346895ddec93c/html5/thumbnails/17.jpg)
3 Categorical Variables: X, Y, and Z

If [Y ⊗ Z] for all X = 1, 2, …, then [Y ⊗ Z]

FALSE!
![Page 18: The Multigraph for Loglinear Models](https://reader035.fdocuments.net/reader035/viewer/2022062411/56816867550346895ddec93c/html5/thumbnails/18.jpg)
LLM: three variables

Interpretation              Generating Class    Probabilistic Model
---------------------------------------------------------------------------
mutual independence         [X][Y][Z]           p_ijk = p_i++ p_+j+ p_++k
joint independence          [XZ][Y]             p_ijk = p_i+k p_+j+
conditional independence    [XY][XZ]            p_ijk = p_ij+ p_i+k / p_i++
homogeneous association     [XY][XZ][YZ]        p_ijk = ψ_ij φ_ik ω_jk
saturated model             [XYZ]               p_ijk
![Page 19: The Multigraph for Loglinear Models](https://reader035.fdocuments.net/reader035/viewer/2022062411/56816867550346895ddec93c/html5/thumbnails/19.jpg)
3 Categorical Variables: X, Y, and Z

If [Y ⊗ Z], then [Y ⊗ Z] for all X = 1, 2, 3, …

FALSE!
![Page 20: The Multigraph for Loglinear Models](https://reader035.fdocuments.net/reader035/viewer/2022062411/56816867550346895ddec93c/html5/thumbnails/20.jpg)
Which Treatment is Better?

                  TRIAL 1: CURED?                 TRIAL 2: CURED?
                  Yes          No     Total       Yes          No     Total
-----------------------------------------------------------------------------
TREATMENT   A     40 (.20)     160    200         85 (.85)     15     100
            B     30 (.15)     170    200         300 (.75)    100    400

Combine TRIALS 1 and 2:

                  CURED?
                  Yes          No     Total
---------------------------------------------
TREATMENT   A     125 (.42)    175    300
            B     330 (.55)    270    600

“Ask Marilyn”, PARADE section, DDN, pages 6-7, April 28, 1996
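The reversal in the treatment example can be checked directly. This short sketch uses the counts from the slide to compute the cure rates within each trial and after collapsing over trial. The dictionary layout and the names `trials`, `within`, and `combined_rate` are choices made for this illustration.

```python
# (cured, total) for each treatment within each trial, taken from the slide.
trials = {
    "Trial 1": {"A": (40, 200), "B": (30, 200)},
    "Trial 2": {"A": (85, 100), "B": (300, 400)},
}

def cure_rate(cured, total):
    return cured / total

# Cure rate of each treatment within each trial.
within = {
    trial: {t: cure_rate(*counts) for t, counts in arms.items()}
    for trial, arms in trials.items()
}

# Collapse over trial: add the cured counts and the totals per treatment.
combined = {
    t: (
        sum(trials[trial][t][0] for trial in trials),
        sum(trials[trial][t][1] for trial in trials),
    )
    for t in ("A", "B")
}
combined_rate = {t: cure_rate(*counts) for t, counts in combined.items()}

for trial, rates in within.items():
    print(trial, {t: round(r, 2) for t, r in rates.items()})
print("Combined", {t: round(r, 2) for t, r in combined_rate.items()})
```

Treatment A has the higher cure rate in each trial, yet B has the higher rate in the combined table. The allocation to treatments differs sharply between trials, so collapsing over trial distorts the treatment-by-cure association.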
![Page 21: The Multigraph for Loglinear Models](https://reader035.fdocuments.net/reader035/viewer/2022062411/56816867550346895ddec93c/html5/thumbnails/21.jpg)
Florida Homicide Convictions Resulting in Death Penalty
ML Radelet and GL Pierce, Florida Law Review 43: 1-34, 1991

                              Death Penalty
                              Yes          No
------------------------------------------------
Defendant’s Race    White     53 (0.11)    430
                    Black     15 (0.08)    176

                              White Victim            Black Victim
                              Death Penalty           Death Penalty
                              Yes          No         Yes          No
------------------------------------------------------------------------
Defendant’s Race    White     53 (0.11)    414        0 (0.00)     16
                    Black     11 (0.23)    37         4 (0.03)     139
![Page 22: The Multigraph for Loglinear Models](https://reader035.fdocuments.net/reader035/viewer/2022062411/56816867550346895ddec93c/html5/thumbnails/22.jpg)
Multigraph Representation of LLMs
Vertices = generators of the LLM
Multiedges = edges that are equal in number to the number of indices shared by the two vertices being joined
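The construction rule can be sketched in a few lines: each generator is a vertex, and two vertices are joined by as many edges as they share indices. The sketch assumes every factor is labeled by a single character, so a generator such as "ACR" stands for the factor set {A, C, R}. The function name `build_multigraph` is hypothetical.

```python
from itertools import combinations

def build_multigraph(generators):
    """Map each pair of generators that share indices to the shared set.

    The number of edges joining a pair is the size of its shared set.
    Assumes single-character factor labels.
    """
    edges = {}
    for g1, g2 in combinations(generators, 2):
        shared = set(g1) & set(g2)
        if shared:
            edges[(g1, g2)] = shared
    return edges

# Generating class [AS][ACR][MCS][MAC] from the deck's examples.
edges = build_multigraph(["AS", "ACR", "MCS", "MAC"])
for (g1, g2), shared in edges.items():
    print(f"{g1} - {g2}: {len(shared)} edge(s), shared indices {sorted(shared)}")
```

Generators with no index in common, such as E and TB in the class [E][TB], are not joined at all, so that multigraph is disconnected.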
![Page 23: The Multigraph for Loglinear Models](https://reader035.fdocuments.net/reader035/viewer/2022062411/56816867550346895ddec93c/html5/thumbnails/23.jpg)
Multigraph: three variables

[XY][XZ]        XY ----- XZ
![Page 24: The Multigraph for Loglinear Models](https://reader035.fdocuments.net/reader035/viewer/2022062411/56816867550346895ddec93c/html5/thumbnails/24.jpg)
Examples of Multigraphs

[AS][ACR][MCS][MAC]

[Figure: multigraph with vertices AS, ACR, MAC, MCS]
![Page 25: The Multigraph for Loglinear Models](https://reader035.fdocuments.net/reader035/viewer/2022062411/56816867550346895ddec93c/html5/thumbnails/25.jpg)
Examples of Multigraphs

[ABCD][ACE][BCG][CDF]

[Figure: multigraph with vertices ABCD, CDF, ACE, BCG]
![Page 26: The Multigraph for Loglinear Models](https://reader035.fdocuments.net/reader035/viewer/2022062411/56816867550346895ddec93c/html5/thumbnails/26.jpg)
Maximum Spanning Tree

The maximum spanning tree of a multigraph M:
• tree (connected graph with no circuits)
• includes each vertex
• sum of the edges is maximum
![Page 27: The Multigraph for Loglinear Models](https://reader035.fdocuments.net/reader035/viewer/2022062411/56816867550346895ddec93c/html5/thumbnails/27.jpg)
Examples of maximum spanning trees

[XY][XZ]        XY ----- XZ
![Page 28: The Multigraph for Loglinear Models](https://reader035.fdocuments.net/reader035/viewer/2022062411/56816867550346895ddec93c/html5/thumbnails/28.jpg)
Examples of maximum spanning trees

[AS][ACR][MCS][MAC]

[Figure: maximum spanning tree of the multigraph with vertices AS, ACR, MAC, MCS]
![Page 29: The Multigraph for Loglinear Models](https://reader035.fdocuments.net/reader035/viewer/2022062411/56816867550346895ddec93c/html5/thumbnails/29.jpg)
Examples of maximum spanning trees

[ABCD][ACE][BCG][CDF]

[Figure: maximum spanning tree of the multigraph with vertices ABCD, CDF, ACE, BCG]
![Page 30: The Multigraph for Loglinear Models](https://reader035.fdocuments.net/reader035/viewer/2022062411/56816867550346895ddec93c/html5/thumbnails/30.jpg)
Fundamental Conditional Independencies (FCIs) for a Decomposable LLM

1. Let S be the set of indices in a branch of the maximum spanning tree
2. Remove each factor of S from the multigraph, M; the resulting multigraph is M/S
3. An FCI is determined as:

[C1 ⊗ C2 ⊗ … ⊗ Ck | S]

where C1, C2, …, Ck are the sets of factors in the components of M/S
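The three steps can be sketched as follows for a given separator S. The code removes the factors in S from every generator and then groups the remaining factors into the connected components of M/S. It assumes single-character factor labels and takes S as an input rather than reading it off a maximum spanning tree. Both function names are hypothetical.

```python
def fci_components(generators, separator):
    """Return the factor sets C1, ..., Ck of the components of M/S."""
    reduced = [set(g) - set(separator) for g in generators]
    reduced = [r for r in reduced if r]  # drop vertices left with no factors
    components = []
    for factors in reduced:
        merged = set(factors)
        untouched = []
        for comp in components:
            if comp & merged:  # a shared factor means a connecting edge
                merged |= comp
            else:
                untouched.append(comp)
        components = untouched + [merged]
    return components

def format_fci(components, separator):
    """Write the FCI in the bracket notation used on the slides."""
    parts = [",".join(sorted(c)) for c in sorted(components, key=sorted)]
    return "[" + " ⊗ ".join(parts) + " | " + ",".join(sorted(separator)) + "]"

# Generating class [XY][XZ] with S = {X}.
print(format_fci(fci_components(["XY", "XZ"], "X"), "X"))

# Generating class [EG][CP][RC][PG] with S = {C}.
print(format_fci(fci_components(["EG", "CP", "RC", "PG"], "C"), "C"))
```

The order of factors inside a bracket is alphabetical in this sketch, so [E ⊗ G,M | A,C] and [E ⊗ M,G | A,C] denote the same relationship.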
![Page 31: The Multigraph for Loglinear Models](https://reader035.fdocuments.net/reader035/viewer/2022062411/56816867550346895ddec93c/html5/thumbnails/31.jpg)
FCIs

[XY][XZ]        XY --(X)-- XZ

S = {X}

M/S:    Y    Z

[Y ⊗ Z | X]
![Page 32: The Multigraph for Loglinear Models](https://reader035.fdocuments.net/reader035/viewer/2022062411/56816867550346895ddec93c/html5/thumbnails/32.jpg)
Collapsibility Conditions

Consider a conditional independence relationship of the form

[C1 ⊗ C2 | S].

If the levels of all factors in C1 are collapsed, then all relationships among the remaining factors are undistorted EXCEPT for relationships among factors in S.
![Page 33: The Multigraph for Loglinear Models](https://reader035.fdocuments.net/reader035/viewer/2022062411/56816867550346895ddec93c/html5/thumbnails/33.jpg)
FCIs

[XY][XZ]        XY --(X)-- XZ

S = {X}

M/S:    Y    Z

[Y ⊗ Z | X]
![Page 34: The Multigraph for Loglinear Models](https://reader035.fdocuments.net/reader035/viewer/2022062411/56816867550346895ddec93c/html5/thumbnails/34.jpg)
Example: Ob-Gyn Study
(Darrocca, et al., 1996)

n = 201 pregnant mothers

Variables:
E: EGA (Early, Late)
B: Bishop score (High, Low)
T: Treatment (Prostin, Placebo)
![Page 35: The Multigraph for Loglinear Models](https://reader035.fdocuments.net/reader035/viewer/2022062411/56816867550346895ddec93c/html5/thumbnails/35.jpg)
Example: Ob-Gyn Study

                    BISHOP SCORE (B)
                    High                 Low
                    EGA (E)              EGA (E)
TREATMENT (T)       Early     Late       Early     Late
---------------------------------------------------------
Prostin             34        24         27        21
Placebo             22        16         35        22

Best-fitting model: [E][TB]
![Page 36: The Multigraph for Loglinear Models](https://reader035.fdocuments.net/reader035/viewer/2022062411/56816867550346895ddec93c/html5/thumbnails/36.jpg)
Example: Ob-Gyn Study

Generating Class: [E][TB]

Multigraph:

E        TB

FCI: [E ⊗ T,B]
![Page 37: The Multigraph for Loglinear Models](https://reader035.fdocuments.net/reader035/viewer/2022062411/56816867550346895ddec93c/html5/thumbnails/37.jpg)
Example: Ob-Gyn Study
Collapsed Table (collapse over EGA):

                            BISHOP SCORE (B)
                            High          Low    Total
--------------------------------------------------------
TREATMENT (T)   Prostin     58 (0.55)     48     106
                Placebo     38 (0.40)     57     95

P = 0.037
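The collapsed table can be reproduced from the three-way Ob-Gyn counts by summing over EGA. The sketch also computes a P-value for the treatment-by-Bishop-score association. The slides do not say which test produced P = 0.037, so the uncorrected Pearson chi-square test used here is an assumption, and the function name is hypothetical.

```python
import math

# counts[treatment][bishop][ega]
# treatment: Prostin, Placebo; bishop: High, Low; ega: Early, Late.
counts = [
    [[34, 24], [27, 21]],  # Prostin
    [[22, 16], [35, 22]],  # Placebo
]

# Collapse over EGA by summing out the last index.
collapsed = [[sum(cell) for cell in row] for row in counts]

def pearson_chi_square_2x2(table):
    """Pearson X^2 for a 2 x 2 table, without continuity correction."""
    (a, b), (c, d) = table
    n = a + b + c + d
    return n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))

x2 = pearson_chi_square_2x2(collapsed)
# Upper-tail probability of a chi-square variable with 1 degree of freedom.
p_value = math.erfc(math.sqrt(x2 / 2.0))
print(collapsed)
print(f"X^2 = {x2:.2f}, P = {p_value:.3f}")
```

Because the FCI for [E][TB] is [E ⊗ T,B] with an empty conditioning set, collapsing over E leaves the T-B relationship undistorted, which is why the two-way table is adequate here.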
![Page 38: The Multigraph for Loglinear Models](https://reader035.fdocuments.net/reader035/viewer/2022062411/56816867550346895ddec93c/html5/thumbnails/38.jpg)
Example: WSU-United Way Study

M: Marijuana (No, Yes)
A: Alcohol (No, Yes)
C: Cigarettes (No, Yes)
R: Race (Other, White)
S: Sex (Female, Male)

Observed cell frequencies (n = 2,276):

 12    0    19    2     1    0    23    23
117    1   218   13    17    1   268   405
 17    0    18    1     8    1    19    30
133    1   201   28    17    1   228   453
![Page 39: The Multigraph for Loglinear Models](https://reader035.fdocuments.net/reader035/viewer/2022062411/56816867550346895ddec93c/html5/thumbnails/39.jpg)
Example: WSU-United Way Study

Generating class: [ACE][MAC][MCG]

Multigraph, M:

[Figure: multigraph with vertices ACE, MCG, MAC]
![Page 40: The Multigraph for Loglinear Models](https://reader035.fdocuments.net/reader035/viewer/2022062411/56816867550346895ddec93c/html5/thumbnails/40.jpg)
Example: WSU-United Way Study

M: [Figure: multigraph with vertices ACE, MCG, MAC]

S = {A,C}

M/S: [Figure: multigraph with vertices E, MG, M]

[E ⊗ M,G | A,C]

A = Alcohol    C = Cigarette    E = Ethnic    G = Gender    M = Marijuana
![Page 41: The Multigraph for Loglinear Models](https://reader035.fdocuments.net/reader035/viewer/2022062411/56816867550346895ddec93c/html5/thumbnails/41.jpg)
Example: WSU PASS Program

“Preparing for Academic Success”

GPA below 2.0 at the end of first quarter
![Page 42: The Multigraph for Loglinear Models](https://reader035.fdocuments.net/reader035/viewer/2022062411/56816867550346895ddec93c/html5/thumbnails/42.jpg)
Example: WSU PASS Program

Variables (n = 972):

FACTOR                LABEL    LEVELS
-----------------------------------------------------------------------------
Retention             R        1=No, 2=Yes
Cohort                C        1, 2, 3, 4
PASS Participation    P        1=No, 2=Yes
Ethnic Group          E        1=Caucasian, 2=African-American, 3=Other
Gender                G        1=Male, 2=Female
![Page 43: The Multigraph for Loglinear Models](https://reader035.fdocuments.net/reader035/viewer/2022062411/56816867550346895ddec93c/html5/thumbnails/43.jpg)
Example: WSU PASS Program

The best-fitting LLM has generating class [EG][CP][RC][PG]

Multigraph, M:

EG --(G)-- PG --(P)-- CP --(C)-- RC
![Page 44: The Multigraph for Loglinear Models](https://reader035.fdocuments.net/reader035/viewer/2022062411/56816867550346895ddec93c/html5/thumbnails/44.jpg)
Example: WSU PASS Program

M: [Figure: multigraph with vertices EG, PG, RC, CP]

S = {C}

M/S: [Figure: multigraph with vertices EG, PG, R, P]

[E,G,P ⊗ R | C]

C = Cohort    E = Ethnic    G = Gender    P = PASS Participation    R = Retention
![Page 45: The Multigraph for Loglinear Models](https://reader035.fdocuments.net/reader035/viewer/2022062411/56816867550346895ddec93c/html5/thumbnails/45.jpg)
Example: Affinal Relations in Bosnia-Herzegovina
Data courtesy of Dr. Keith Doubt, Department of Sociology, Wittenberg University, Springfield, Ohio

N = 861 couples from Bosnia-Herzegovina are surveyed concerning affinal relations.

M: Marriage Type (traditional, elopement)
L: Location of Man and Wife (same, different)
E: Ethnicity (Bosniak, Serb, Croat)
S: Settlement (rural, urban)

Best-fitting model: [MLES]

Consider structural associations among M, L, and S for each ethnic group (E) separately.
![Page 46: The Multigraph for Loglinear Models](https://reader035.fdocuments.net/reader035/viewer/2022062411/56816867550346895ddec93c/html5/thumbnails/46.jpg)
Example: Affinal Relations in Bosnia-Herzegovina

Bosniaks: [ML][LS]

Serbs: [MS][SL]

Croats: [M][L][S]

M: Marriage Type    L: Location of Man and Wife    S: Settlement
![Page 47: The Multigraph for Loglinear Models](https://reader035.fdocuments.net/reader035/viewer/2022062411/56816867550346895ddec93c/html5/thumbnails/47.jpg)
Conclusions
• The generator multigraph uses mathematical graph theory to analyze and interpret LLMs in a facile manner
• Properties of the multigraph allow one to:
  - Find all conditional independencies
  - Determine all collapsibility conditions

REFERENCE
Khamis, H.J. (2011). The Association Graph and the Multigraph for Loglinear Models, SAGE series Quantitative Applications in the Social Sciences, No. 167.
![Page 48: The Multigraph for Loglinear Models](https://reader035.fdocuments.net/reader035/viewer/2022062411/56816867550346895ddec93c/html5/thumbnails/48.jpg)
Without data, you’re just one more person with an opinion