Advanced Statistical Methods: Beyond Linear Regression
-
John R. Stevens, Utah State University
Notes 2. Statistical Methods I
Mathematics Educators Workshop, 28 March 2009
http://www.stat.usu.edu/~jrstevens/pcmi
-
What would your students know to do with these data?

Obs  Flight  Temp  Damage
  1  STS1     66   NO
  2  STS9     70   NO
  3  STS51B   75   NO
  4  STS2     70   YES
  5  STS41B   57   YES
  6  STS51G   70   NO
  7  STS3     69   NO
  8  STS41C   63   YES
  9  STS51F   81   NO
 10  STS4     80
 11  STS41D   70   YES
 12  STS51I   76   NO
 13  STS5     68   NO
 14  STS41G   78   NO
 15  STS51J   79   NO
 16  STS6     67   NO
 17  STS51A   67   NO
 18  STS61A   75   YES
 19  STS7     72   NO
 20  STS51C   53   YES
 21  STS61B   76   NO
 22  STS8     73   NO
 23  STS51D   67   NO
 24  STS61C   58   YES
-
Two Sample t-test

data: Temp by Damage
t = 3.1032, df = 21, p-value = 0.005383
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
  2.774344 14.047085
sample estimates:
 mean in group NO  mean in group YES
         72.12500           63.71429
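
The test statistic above can be checked by hand. The sketch below assumes the pooled (equal-variance) form of the two-sample t-test, which the integer df = 21 suggests, and uses the 23 flights in the table that have a recorded damage value. The function name `pooled_t` is illustrative only.

```python
import math
from statistics import mean, variance

# Launch temperatures (deg F) from the table above, split by damage status.
# Flight STS4 (obs. 10) has no damage value recorded and is left out.
temp_no = [66, 70, 75, 70, 69, 81, 76, 68, 78, 79, 67, 67, 72, 76, 73, 67]
temp_yes = [70, 57, 63, 70, 75, 53, 58]

def pooled_t(a, b):
    """Equal-variance two-sample t statistic and its degrees of freedom."""
    na, nb = len(a), len(b)
    df = na + nb - 2
    pooled_var = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / df
    se = math.sqrt(pooled_var * (1 / na + 1 / nb))
    return (mean(a) - mean(b)) / se, df

t_stat, df = pooled_t(temp_no, temp_yes)
print(f"mean NO = {mean(temp_no):.5f}, mean YES = {mean(temp_yes):.5f}")
print(f"t = {t_stat:.4f}, df = {df}")
```

The arithmetic reproduces the reported group means and t statistic, but it treats temperature as the response, which is exactly the issue the next slide raises.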
-
Does the t-test make sense here?
Traditional: Treatment Group mean vs. Control Group mean
What is the response variable?
Temperature? [Quantitative, Continuous]
Damage? [Qualitative]
-
Traditional Statistical Model 1
Linear Regression: predict continuous response from [quantitative] predictors
Y=weight, X=height
Y=income, X=education level
Y=first-semester GPA, X=parents' income
Y=temperature, X=damage (0=no, 1=yes)
Can also control for other [possibly categorical] factors (covariates): Sex, Major, State of Origin, Number of Siblings
-
Traditional Statistical Model 2
Logistic Regression: predict binary response from [quantitative] predictors
Y=0 (graduate within 5 years) vs. Y=1 (not); X=first-semester GPA
Y=0 (no damage) vs. Y=1 (damage); X=temperature
Y=0 (survive) vs. Y=1 (death); X=dosage (dose-response model)
Can also control for other factors, or covariates: Race, Sex, Genotype
p = P(Y=1 | relevant factors) = prob. that Y=1, given state of relevant factors
-
Traditional Dose-Response Model
p = probability of death at dose d:
Look at what affects the shape of the curve, the LD50 (the dose that is lethal to 50% of subjects), etc.
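
A minimal sketch of the usual logistic dose-response curve. It assumes an intercept \beta_0 and a dosage effect \beta_1 (symbols assumed here), with dose entered on its original scale rather than a log scale:

```latex
p(d) = \frac{\exp(\beta_0 + \beta_1 d)}{1 + \exp(\beta_0 + \beta_1 d)},
\qquad
\log\frac{p(d)}{1 - p(d)} = \beta_0 + \beta_1 d,
\qquad
\mathrm{LD}_{50} = -\frac{\beta_0}{\beta_1}.
```

Under this form \beta_1 sets how steeply the curve rises, and the LD50 is the dose at which p(d) = 1/2, that is, where \beta_0 + \beta_1 d = 0.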
-
Fitting the Dose-Response Model
Why logistic regression?
β0 = place-holder constant
β1 = effect of dosage d
To estimate parameters: Newton-Raphson iterative process to maximize the likelihood of the model
Compare Y=0 (no damage) with Y=1 (damage) groups
-
Likelihood Function (to be maximized)
likelihood for obs. i
multiply probabilities (independence)
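
As a hedged sketch of what those two labels describe, assume each binary response y_i has success probability p_i given by the logistic model with parameters \beta_0 and \beta_1 (notation assumed here). The standard Bernoulli likelihood is then:

```latex
L_i(\beta_0,\beta_1) = p_i^{\,y_i}\,(1-p_i)^{1-y_i},
\qquad
L(\beta_0,\beta_1) = \prod_{i=1}^{n} L_i(\beta_0,\beta_1),
\qquad
\ell(\beta_0,\beta_1) = \sum_{i=1}^{n}\big[\,y_i\log p_i + (1-y_i)\log(1-p_i)\,\big].
```

The product is justified by independence of the observations. In practice the log-likelihood \ell is the quantity that gets maximized.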
-
Estimation by IRLS
Iteratively Reweighted Least Squares
equivalent: Newton-Raphson algorithm for iteratively solving score equations
-
Coefficients:
            Estimate Std. Error z value Pr(>|z|)
(Intercept)  15.0429     7.3786   2.039   0.0415 *
Temp         -0.2322     0.1082  -2.145   0.0320 *
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
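
The Newton-Raphson (IRLS) fit can be sketched in a few lines of plain Python. This is an illustrative implementation, not the software that produced the output above. It assumes the model logit(p) = b0 + b1 * Temp, fitted to the 23 flights with a recorded damage value, and it centres the temperatures for numerical stability. The function name `fit_logistic` is hypothetical.

```python
import math

# (temperature in deg F, damage: 1 = YES, 0 = NO) for the 23 flights with a
# recorded damage value in the launch table near the top of these notes.
flights = [(66, 0), (70, 0), (75, 0), (70, 1), (57, 1), (70, 0), (69, 0),
           (63, 1), (81, 0), (70, 1), (76, 0), (68, 0), (78, 0), (79, 0),
           (67, 0), (67, 0), (75, 1), (72, 0), (53, 1), (76, 0), (73, 0),
           (67, 0), (58, 1)]

def fit_logistic(data, tol=1e-10, max_iter=50):
    """Newton-Raphson (equivalently IRLS) for logit(p) = b0 + b1 * x."""
    x_bar = sum(x for x, _ in data) / len(data)   # centre x for stability
    a, b = 0.0, 0.0                               # centred intercept, slope
    for _ in range(max_iter):
        # Score vector (g0, g1) and information matrix [[h00, h01], [h01, h11]]
        g0 = g1 = h00 = h01 = h11 = 0.0
        for x, y in data:
            z = x - x_bar
            p = 1.0 / (1.0 + math.exp(-(a + b * z)))
            w = p * (1.0 - p)                     # IRLS weight for this obs.
            g0 += y - p
            g1 += (y - p) * z
            h00 += w
            h01 += w * z
            h11 += w * z * z
        det = h00 * h11 - h01 * h01
        da = (h11 * g0 - h01 * g1) / det          # Newton step = H^{-1} g
        db = (h00 * g1 - h01 * g0) / det
        a, b = a + da, b + db
        if abs(da) + abs(db) < tol:
            break
    se_slope = math.sqrt(h00 / det)               # from the inverse information
    return a - b * x_bar, b, se_slope             # undo the centring

b0, b1, se_b1 = fit_logistic(flights)
print(f"intercept = {b0:.4f}, slope = {b1:.4f}, se(slope) = {se_b1:.4f}")
```

Each pass solves a weighted least-squares-type system with weights p(1 - p), which is why the same iteration goes by both names.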
-
What if the data were even better?
Complete separation of points
What should happen to our slope estimate?
-
Coefficients:
            Estimate Std. Error z value Pr(>|z|)
(Intercept)    928.9   913821.4   0.001        1
Temp           -14.4    14106.7  -0.001        1
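
Why the estimates blow up can be seen numerically. The sketch below uses made-up, perfectly separated data (not the data behind the output above) and evaluates the log-likelihood along ever steeper slopes through a cut point between the two groups. The names `temps`, `damage`, `cut` and `log_lik` are illustrative.

```python
import math

# Hypothetical, perfectly separated data: every damaged flight is colder
# than every undamaged one (values invented for illustration only).
temps = [53, 57, 58, 63, 66, 67, 70, 72, 75, 78]
damage = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
cut = 64.5   # any temperature between 63 and 66 separates the groups

def log_lik(slope):
    """Log-likelihood of logit(p) = slope * (temp - cut)."""
    total = 0.0
    for x, y in zip(temps, damage):
        eta = slope * (x - cut)
        # log p if y == 1, log(1 - p) if y == 0, in an overflow-safe form
        signed = eta if y == 1 else -eta
        total += -math.log1p(math.exp(-signed))
    return total

slopes = [-0.5, -1, -2, -5, -10, -20]
values = [log_lik(s) for s in slopes]
for s, v in zip(slopes, values):
    print(f"slope = {s:>5}: log-likelihood = {v:.3e}")
```

The log-likelihood keeps climbing toward 0 as the slope steepens, so no finite maximizer exists. The fitting routine stops at some large, arbitrary value with an enormous standard error.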
-
Failure?
Shape of likelihood function
Large Standard Errors
Solution only in 2006
Rather than maximizing likelihood, consider a penalty:
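
A minimal sketch of one such penalty, assuming the Firth-type form in which the likelihood is multiplied by the square root of the determinant of the Fisher information I(\beta). The slide's own formula may differ.

```latex
L^{*}(\beta) = L(\beta)\,\lvert I(\beta) \rvert^{1/2},
\qquad
\ell^{*}(\beta) = \ell(\beta) + \tfrac{1}{2}\log\lvert I(\beta) \rvert .
```

Under complete separation the fitted probabilities approach 0 or 1 as the slope grows, so \lvert I(\beta) \rvert shrinks toward 0 and the penalty term pulls the maximizer back to a finite value.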
-
Model fitted by Penalized ML
Confidence intervals and p-values by Profile Likelihood

                  coef   se(coef)    Chisq            p
(Intercept) 30.4129282 16.5145441 11.35235 0.0007535240
Temp        -0.4832632  0.2528934 13.06178 0.0003013835
-
Beetle Data

Phosphine  Total      Total    Total      Survivors Observed at Genotype
Dosage     Receiving  Deaths   Survivors  -/B  -/H  -/A  +/B  +/H  +/A
(mg/L)     Dosage
0               98         0       98      31   27   10    6   20    4
0.003          100        16       84      18   26   10    6   20    4
0.004          100        68       32      10    4    3    5    7    4
0.005          100        78       22       1    4    7    2    6    2
0.01           100        77       23       0    1    9    8    5    0
0.05           300       270       30       0    0    0    5   20    5
0.1            400       383       17       0    0    0    0   10    7
0.2            750       740       10       0    0    0    0    0   10
0.3            500       490       10       0    0    0    0    0   10
0.4            500       492        8       0    0    0    0    0    8
1.0          7,850     7,806       44       0    0    0    0    0   44
Total       10,798    10,420      378
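
Before genotype enters the picture, the marginal dose-response pattern can be read straight from the totals. This sketch computes the observed death proportion at each dosage from the table above and rechecks the column totals. The variable names are illustrative.

```python
# (dose in mg/L, total receiving dose, total deaths) from the beetle table
beetles = [(0, 98, 0), (0.003, 100, 16), (0.004, 100, 68), (0.005, 100, 78),
           (0.01, 100, 77), (0.05, 300, 270), (0.1, 400, 383),
           (0.2, 750, 740), (0.3, 500, 490), (0.4, 500, 492),
           (1.0, 7850, 7806)]

total_n = sum(n for _, n, _ in beetles)
total_deaths = sum(d for _, _, d in beetles)

for dose, n, deaths in beetles:
    print(f"dose {dose:>5} mg/L: observed death proportion = {deaths / n:.3f}")
print(f"totals: {total_n} beetles, {total_deaths} deaths, "
      f"{total_n - total_deaths} survivors")
```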
-
Dose-response model
Recall simple model:
pij = Pr(Y=1 | dosage level j and genotype level i)
But when is genotype (covariate Gi) observed?
-
Coefficients:
                Estimate Std. Error   z value Pr(>|z|)
(Intercept)   -2.657e+01  8.901e+04 -2.98e-04        1
dose          -7.541e-26  1.596e+07 -4.72e-33        1
G1+           -3.386e-28  1.064e+05 -3.18e-33        1
G2B           -1.344e-14  1.092e+05 -1.23e-19        1
G2H           -3.349e-28  1.095e+05 -3.06e-33        1
dose:G1+       7.541e-26  1.596e+07  4.72e-33        1
dose:G2B       3.984e-12  3.075e+07  1.30e-19        1
dose:G2H       7.754e-26  2.760e+07  2.81e-33        1
G1+:G2B        1.344e-14  1.465e+05  9.17e-20        1
G1+:G2H        3.395e-28  1.327e+05  2.56e-33        1
dose:G1+:G2B  -3.984e-12  3.098e+07 -1.29e-19        1
dose:G1+:G2H  -7.756e-26  2.763e+07 -2.81e-33        1

Before we fix this, first a little detour
-
A Multivariate Gaussian Mixture
Component j is MVN(μj, Σj) with proportion πj
-
The Maximum Likelihood Approach
-
A Possible Work-Around
Keys here:
the true group memberships are unknown (latent)
statisticians specialize in unknown quantities
-
A reasonable approach
1. Randomly assign group memberships, and estimate group means μj, covariance matrices Σj, and mixing proportions πj
2. Given those values, calculate (for each obs.) γj = P(obs. in group j), the expected group membership given the data and the current estimates
3. Update estimates for μj, Σj, and πj, weighting each observation by these γj
4. Repeat steps 2 and 3 to convergence
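
The four steps can be sketched for the simplest case: one dimension and two components. Everything here is an illustrative assumption: the simulated data, the function names, the fixed iteration count, and the small variance floor that guards against a component collapsing onto a single point.

```python
import math
import random

def normal_pdf(x, mu, var):
    return math.exp(-(x - mu) ** 2 / (2 * var)) / math.sqrt(2 * math.pi * var)

def em_two_gaussians(xs, n_iter=200, seed=1):
    rng = random.Random(seed)
    # Step 1: random group memberships (resp[i] = weight of obs. i in group 1)
    resp = [float(rng.random() < 0.5) for _ in xs]
    history = []                       # log-likelihood after each update
    for _ in range(n_iter):
        # Step 3 (also gives the initial estimates): weighted updates
        params = []
        for weights in (resp, [1 - r for r in resp]):
            total = sum(weights)
            mu = sum(w * x for w, x in zip(weights, xs)) / total
            var = sum(w * (x - mu) ** 2 for w, x in zip(weights, xs)) / total
            params.append((total / len(xs), mu, max(var, 1e-6)))
        (pi1, mu1, var1), (pi2, mu2, var2) = params
        # Step 2: P(obs. in group 1) given the current estimates
        dens1 = [pi1 * normal_pdf(x, mu1, var1) for x in xs]
        dens2 = [pi2 * normal_pdf(x, mu2, var2) for x in xs]
        resp = [d1 / (d1 + d2) for d1, d2 in zip(dens1, dens2)]
        history.append(sum(math.log(d1 + d2) for d1, d2 in zip(dens1, dens2)))
    return params, history             # Step 4 is the loop itself

# Simulated data: two groups of 100 points centred at 0 and 6 (made up)
rng = random.Random(0)
data = [rng.gauss(0, 1) for _ in range(100)] + [rng.gauss(6, 1) for _ in range(100)]
params, history = em_two_gaussians(data)
for prop, mu, var in params:
    print(f"proportion = {prop:.3f}, mean = {mu:.3f}, variance = {var:.3f}")
```

The recorded log-likelihood never decreases from one iteration to the next, which is the property that makes this "reasonable approach" work. Where it converges can depend on the random start.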
-
Plotting character and color indicate most likely component
-
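The EM (Baum-Welch) Algorithm
Maximization made easier with Zm = latent (unobserved) data; T = (Z, Zm) = complete data
1. Start with initial guesses for parameters
2. Expectation: At the kth iteration, compute the expected complete-data log-likelihood, given the observed data Z and the current parameter estimates
3. Maximization: Obtain the updated estimate by maximizing that expectation over the parameters
4. Iterate steps 2 and 3 to convergence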
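
In symbols, a hedged sketch of the two steps, assuming \theta denotes the parameters, \ell_0 the complete-data log-likelihood, and \hat\theta^{(k)} the estimate at iteration k (notation assumed here):

```latex
Q\big(\theta', \hat\theta^{(k)}\big)
  = \mathrm{E}\big[\,\ell_0(\theta'; T) \,\big|\, Z, \hat\theta^{(k)}\,\big],
\qquad
\hat\theta^{(k+1)} = \arg\max_{\theta'} \, Q\big(\theta', \hat\theta^{(k)}\big).
```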
-
Beetle Data Notation
Observed values
Unobserved (latent) values
If Nij had been observed:
How Nij can be [latently] considered:
-
Likelihood Function
Parameters θ=(p,P) and complete data T=(n,N)
After simplification:
Mechanism of missing data suggests EM algorithm
-
Missing at Random (MAR)
Necessary assumption for usual EM applications
Covariate x is MAR if the probability of observing x does not depend on x or any other unobserved covariate, but may depend on the response and other observed covariates (Ibrahim 1990)
Here genotype is observed only for survivors, and for all subjects at zero dosage
-
Initialization Step
Two classes of marginal information here
For all dosage levels j observe
At zero dosage level observe for genotype i
Allows estimate of Pi
Consider marginal distn. of missing categorical covariate (genotype)
Using zero dosage level:
This is the key: the marginal distribution of the missing categorical covariate
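
A minimal sketch of this initialization, assuming the starting estimate of Pi is simply the observed genotype proportion among the beetles at zero dosage. The slide's exact formula is not shown, and the names `zero_dose_counts` and `P_init` are illustrative.

```python
# Survivors by genotype at zero dosage (first row of the beetle table).
# No beetles died at dose 0, so every genotype in that group was observed.
zero_dose_counts = {"-/B": 31, "-/H": 27, "-/A": 10,
                    "+/B": 6, "+/H": 20, "+/A": 4}

n_zero = sum(zero_dose_counts.values())
P_init = {g: count / n_zero for g, count in zero_dose_counts.items()}

for genotype, prop in P_init.items():
    print(f"{genotype}: initial P = {prop:.3f}")
```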
-
Expectation Step
Dropping constants and :
Need to evaluate:
(*)
-
Expectation Step
Bayes Formula:
Multinomial (*)
-
Expectation Step
For :
Not needed for maximization; only affects EM convergence rate
Direct calculation from multinomial distn. is possible but computationally prohibitive
Need to employ some approximation strategy
Second-order Taylor series about , using Binet's formula
(*)
-
Expectation Step
Consider Binet's formula (like Stirling's):
Have:
Use a second-order Taylor series approximation taken about as a function of :
(*)
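
To see why a Stirling-type formula is attractive here, the sketch below compares the Stirling approximation ln n! ≈ n ln n - n + (1/2) ln(2πn) with the exact value from `math.lgamma`. It is a stand-in for illustration: it uses Stirling's form, not Binet's formula itself, and the function name `stirling_log_factorial` is hypothetical.

```python
import math

def stirling_log_factorial(n):
    """Stirling approximation to ln(n!), for n >= 1."""
    return n * math.log(n) - n + 0.5 * math.log(2 * math.pi * n)

errors = {}
for n in (5, 50, 500, 5000):
    exact = math.lgamma(n + 1)          # ln(n!) via the log-gamma function
    approx = stirling_log_factorial(n)
    errors[n] = exact - approx
    print(f"n = {n:>4}: exact = {exact:.6f}, Stirling = {approx:.6f}, "
          f"error = {errors[n]:.2e}")
```

The error shrinks roughly like 1/(12n), so for the large counts in the beetle data a smooth closed-form approximation replaces factorials that would be costly to evaluate directly.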
-
Maximization Step
Portion of related to :
Portion of related to :
by Lagrange multipliers
by Newton-Raphson iterations, with some parameterization
(*)
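
A hedged illustration of the Lagrange-multiplier step. It assumes the portion of the objective involving the genotype proportions has the multinomial form \sum_i a_i \log P_i, where the a_i are nonnegative expected counts (notation assumed here; the slide's own expression is not shown):

```latex
\mathcal{L}(P, \lambda) = \sum_i a_i \log P_i - \lambda\Big(\sum_i P_i - 1\Big),
\qquad
\frac{\partial \mathcal{L}}{\partial P_i} = \frac{a_i}{P_i} - \lambda = 0
\;\Longrightarrow\;
P_i = \frac{a_i}{\lambda},
\qquad
\sum_i P_i = 1 \;\Longrightarrow\; \lambda = \sum_k a_k,
\qquad
\hat P_i = \frac{a_i}{\sum_k a_k}.
```

The constraint that the proportions sum to 1 is what fixes \lambda, giving a closed-form update with no iteration needed for this part.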
-
Convergence
-
Dose Response Curves (log scale)
-
EM Results

                     Confidence
Genotype  LD50       L95       U95      t
-/B       0.0035    0.0031    0.0039   3.99
-/H       0.0033    0.0028    0.0038   4.98
-/A       0.0290   -7.1862    7.2442   0.13
+/B       0.0484    0.0123    0.0845   0.09
+/H       0.0664    0.0407    0.0921   4.20
+/A       0.7382    0.1428    1.3336   1.36

t: test statistic for H0: no dosage effect
separation of points
-
Topics Used Here
Calculus
  Differentiation & Integration (including vector differentiation)
  Lagrange Multipliers
  Taylor Series Expansions
Linear Algebra
  Determinants & Eigenvalues
  Inverting [computationally/nearly singular] Matrices
  Positive Definiteness
Probability
  Distributions: Multivariate Normal, Binomial, Multinomial
  Bayes Formula
Statistics
  Logistic Regression
  Separation of Points
  [Penalized] Likelihood Maximization
  EM Algorithm
Biology: a little time and communication