Resolving Inference Issues in Mixed Models by Sam Weerahandi
![Page 1: Resolving Inference Issues in Mixed Models by Sam Weerahandi](https://reader035.fdocuments.net/reader035/viewer/2022070409/5681443d550346895db0d74c/html5/thumbnails/1.jpg)
Draft – Work in Progress – Confidential – Do Not Distribute
Resolving Inference Issues in Mixed Models
by
Sam Weerahandi
April 2013
![Page 2: Resolving Inference Issues in Mixed Models by Sam Weerahandi](https://reader035.fdocuments.net/reader035/viewer/2022070409/5681443d550346895db0d74c/html5/thumbnails/2.jpg)
Outline
Why Mixed Models are Important
Mixed Models: An Overview
Issues with MLE based Inference
Introduction to Generalized Inference
Application: BLUP in Mixed Models
Performance Comparison
![Page 3: Resolving Inference Issues in Mixed Models by Sam Weerahandi](https://reader035.fdocuments.net/reader035/viewer/2022070409/5681443d550346895db0d74c/html5/thumbnails/3.jpg)
Why Mixed Models Are Important!
Mixed Models are especially useful in applications involving
– large samples with noisy data
– small samples with low noise
In Clinical Research & Public Health Studies, Mixed Models can yield more accurate estimates of effects by
– treatment levels
– patient groups
In Sales & Marketing, Mixed Models are heavily used to estimate Response due to promotional tactics:
– Advertisements (TV, Magazine, Web) by Market
– Doctors’ Response to Detailing/Starters
In fact, if you don’t use Mixed Models in these types of applications, you may get unreliable or junk estimates, tests, and intervals
So BLUP (and hence SAS PROC MIXED) has replaced LSE as the most widely used statistical technique, in particular among Management Science groups of Pharmaceutical companies
![Page 4: Resolving Inference Issues in Mixed Models by Sam Weerahandi](https://reader035.fdocuments.net/reader035/viewer/2022070409/5681443d550346895db0d74c/html5/thumbnails/4.jpg)
An Example
Suppose you are asked to estimate the effect of a TV/Magazine Ad by every Market/District using a model of longitudinal sales data on ad-stocked exposure
– If you run LSE you may not even get the right sign of estimates for 40% of Markets
– If you formulate the problem in a Mixed Model setting you will get much more reliable estimates
– So, use Mixed Models and BLUP instead of LSE
Mixed Models and the BLUP (Best Linear Unbiased Predictor) are heavily used in high noise & small sample applications
In analysis of promotions, SAS PROC MIXED or R/S+ lme is used more than any other procedure
But REML/ML frequently yield zero/negative variance components
– BLUPs fail or all become equal
– REML/ML could be inaccurate when factor variance is relatively small
![Page 5: Resolving Inference Issues in Mixed Models by Sam Weerahandi](https://reader035.fdocuments.net/reader035/viewer/2022070409/5681443d550346895db0d74c/html5/thumbnails/5.jpg)
Overview of Mixed Models
Suppose certain groups/segments are distributed around their parent
Assumption in Mixed Models: Random effects are Normally distributed around the mean, the parent estimate, say $M$
Suppose Regression By Groups yields estimate $M_i$ for Segment $i$
Let $V_s$ be the between-segment variance and $V_e$ be the error variance, which are known as Variance Components
It can be shown that the Best Linear Unbiased Predictor (BLUP) of the Segment $i$ effect is
$$\frac{V_s M_i + k V_e M}{V_s + k V_e}$$
a weighted average of the two estimates, where $k$ is a known constant that depends on sample size and group data
The above is a shrinkage estimate that moves extreme estimates towards the parent estimate
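The shrinkage behaviour of the weighted average can be sketched in a few lines of Python. This is a minimal illustration, assuming the BLUP has the form (Vs·Mi + k·Ve·M) / (Vs + k·Ve); the function name `blup` and all numeric values are hypothetical and are not taken from the slides.

```python
def blup(m_i, m_parent, v_s, v_e, k):
    """Weighted average of a segment estimate and the parent estimate.

    m_i      : segment-level estimate (Mi)
    m_parent : parent estimate (M)
    v_s, v_e : between-segment and error variance (Vs, Ve)
    k        : known constant depending on sample size (value assumed here)
    """
    return (v_s * m_i + k * v_e * m_parent) / (v_s + k * v_e)


# Hypothetical inputs: parent estimate 2.0 and three segment estimates
parent = 2.0
segment_estimates = [-1.0, 2.5, 6.0]

# With Vs = 1 and k*Ve = 1, each segment moves halfway toward the parent
shrunk = [blup(m, parent, v_s=1.0, v_e=4.0, k=0.25) for m in segment_estimates]
print(shrunk)

# With Vs estimated as zero, every segment collapses onto the parent estimate
collapsed = [blup(m, parent, v_s=0.0, v_e=4.0, k=0.25) for m in segment_estimates]
print(collapsed)
```

The second call shows why a zero variance-component estimate is a problem in practice: all segment-level predictions become equal to the parent estimate.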
![Page 6: Resolving Inference Issues in Mixed Models by Sam Weerahandi](https://reader035.fdocuments.net/reader035/viewer/2022070409/5681443d550346895db0d74c/html5/thumbnails/6.jpg)
Problem
BLUP in Mixed model is a function of Variance Components
Classical estimates of Factor variance can become negative when noise (error variance) is large and/or sample size is small
Then ML and REML fail: PROC MIXED will complain about non-convergence or will yield equal BLUPs for all segments
I tried the Bayesian approach with MCMC, but when I did a sanity check (i) by changing the hyperparameters OR (ii) by using a Gamma-type prior in place of log-normal, I got very different estimates
After both the Classical & Bayesian Approaches failed me, I wrote a paper about “Generalized Point Estimation”, which
– Can assure estimates fall into the parameter space
– Can take advantage of known signs of parameters without any prior
– Can improve MSE of estimates by using such classical methods as the Stein method
![Page 7: Resolving Inference Issues in Mixed Models by Sam Weerahandi](https://reader035.fdocuments.net/reader035/viewer/2022070409/5681443d550346895db0d74c/html5/thumbnails/7.jpg)
Introduction to Generalized Inference
Classical Pivotals for interval estimation are of the form $Q = Q(X, \theta)$
Generalized Inference on a parameter $\theta$ is based on a generalized pivotal of the form $Q = Q(X, x, \theta, \delta)$ that is a function of the Observable $X$, the observed $x$, and nuisance parameters $\delta$, satisfying
– $Q(x, x, \theta, \delta)$ is free of $\delta$
– $Q$ has a distribution free of $(\theta, \delta)$
Classical Extreme Regions are of the form $Q(X, \theta) < Q(x, \theta)$
– cannot produce all extreme regions
Generalized Extreme Regions are of the form $Q(X, x, \theta, \delta) < Q(x, x, \theta, \delta)$
– greater class of extreme regions
Generalized Tests and Intervals are based on exact probability statements on $Q$
Generalized Estimators are based on transformed Generalized Pivotals
If $Q$ or a transformation satisfies $Q(x, x, \theta, \delta) = \theta$, then $\theta$ is estimated using $E(Q)$, the expected value of $Q$, the Median of $Q$, etc.
![Page 8: Resolving Inference Issues in Mixed Models by Sam Weerahandi](https://reader035.fdocuments.net/reader035/viewer/2022070409/5681443d550346895db0d74c/html5/thumbnails/8.jpg)
Generalized Estimation (GE)
The case $Q(x, x, \theta, \delta) = \theta$, with $\theta$ the parameter of interest and $\delta$ the nuisance parameters, is too restrictive except for location parameters
More generally, if $Q(x, x, \theta, \delta) = 0$, then the solution of $E\{Q(X, x, \theta, \delta)\} = 0$ is said to be the Generalized Estimate of $\theta$
Note: As in classical estimation, one will have a choice of estimates and need to find one satisfying such desirable conditions as minimum MSE
A major advantage of GE is that, as in Bayesian Inference, it can assure, via conditional expectation, any known signs of parameters
– Variance components are positive
– Variance ratio in BLUP is between 0 and 1
Can produce inferences based on exact probabilities for Distributions such as Gamma, Weibull, Uniform
To do so you DO NOT need a Prior or to specify values of hyperparameters
Read more about Generalized Inference at www.weerahandi.org and even read my second book FREE!
![Page 9: Resolving Inference Issues in Mixed Models by Sam Weerahandi](https://reader035.fdocuments.net/reader035/viewer/2022070409/5681443d550346895db0d74c/html5/thumbnails/9.jpg)
Estimating Variance Components and BLUP
Generalized approach can produce the above estimate or better estimates
Generalized pivotal quantity
is a Generalized Estimator and E(Q)=0 yields the classical estimate
But the drawback of the classical estimate is that MLE/UE frequently yields negative estimates
The conditional E(Q|C)=0 with known knowledge C yields
BLUPs are then obtained as weighted averages of the Least Squares Estimates of Parent and Child
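The idea of conditioning on a known sign can be illustrated with a small Monte Carlo sketch. It assumes a balanced one-way random effects model and uses the standard generalized pivotal for the between-segment variance built from two independent chi-square variables; taking the conditional median given Q > 0 is a choice made for this illustration and is not necessarily the estimator derived in the slides. The function names and the sums of squares are hypothetical.

```python
import numpy as np


def classical_estimate(s_a, s_e, a, e, n):
    """ANOVA (unbiased) estimate of the between-segment variance; may be negative."""
    return (s_a / a - s_e / e) / n


def generalized_estimate(s_a, s_e, a, e, n, draws=200_000, seed=0):
    """Monte Carlo conditional median of a generalized pivotal, given Q > 0.

    Illustrative only: the conditioning event and the use of the median
    are assumptions made for this sketch.
    """
    rng = np.random.default_rng(seed)
    u_a = rng.chisquare(a, size=draws)  # stands in for S_a / (Ve + n*Vs)
    u_e = rng.chisquare(e, size=draws)  # stands in for S_e / Ve
    q = (s_a / u_a - s_e / u_e) / n     # generalized pivotal for Vs
    positive = q[q > 0]                 # impose the known sign Vs > 0
    if positive.size == 0:
        raise ValueError("no positive draws; increase `draws`")
    return float(np.median(positive))


# Hypothetical sums of squares for 5 segments with 5 observations each
s_a, s_e, a, e, n = 2.0, 60.0, 4, 20, 5
classical = classical_estimate(s_a, s_e, a, e, n)
generalized = generalized_estimate(s_a, s_e, a, e, n)
print(classical)    # negative for these inputs
print(generalized)  # positive by construction
```

The median is used here rather than the mean because the inverse chi-square draws are heavy-tailed when the degrees of freedom are small.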
For simplicity, consider a balanced Mixed Model
The inference problem in canonical form reduces to:
![Page 10: Resolving Inference Issues in Mixed Models by Sam Weerahandi](https://reader035.fdocuments.net/reader035/viewer/2022070409/5681443d550346895db0d74c/html5/thumbnails/10.jpg)
Comparison of Variance Estimation Methods (based on 10,000 simulated samples): Performance of MLE vs. GE
Assume a One-Way Random Effects model with
– k segments
– n data from each segment
– Degrees of freedom a = k − 1 and e = k(n − 1)
The variance component is estimated by the MLE and GE
Note that with small sample sizes MLE/UE yield negative estimates for the Variance Component
In such situations SAS does not provide estimates or BLUPs (it just says “did not converge”)
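How often the classical estimate goes negative can be checked with a short simulation. The sketch below assumes a balanced one-way random effects model in canonical form (two independent scaled chi-square mean squares), takes the error degrees of freedom as k(n − 1), the usual value for a balanced layout, and uses the ANOVA estimator as the classical estimate. The function name and the parameter values are hypothetical; no results from the slides are reproduced here.

```python
import numpy as np


def negative_estimate_rate(k, n, v_s, v_e, reps=10_000, seed=0):
    """Fraction of simulated samples whose ANOVA variance estimate is negative."""
    rng = np.random.default_rng(seed)
    a, e = k - 1, k * (n - 1)  # between- and within-segment degrees of freedom
    # Mean squares drawn directly from their scaled chi-square distributions
    ms_a = (v_e + n * v_s) * rng.chisquare(a, size=reps) / a
    ms_e = v_e * rng.chisquare(e, size=reps) / e
    estimate = (ms_a - ms_e) / n  # classical estimate of Vs
    return float(np.mean(estimate < 0))


# Hypothetical settings: 5 segments, 3 observations each, error variance 1
rate_small_vs = negative_estimate_rate(k=5, n=3, v_s=0.1, v_e=1.0)
rate_large_vs = negative_estimate_rate(k=5, n=3, v_s=5.0, v_e=1.0)
print(rate_small_vs, rate_large_vs)
```

Negative estimates become common when the between-segment variance is small relative to the error variance, which is the situation described above.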
![Page 11: Resolving Inference Issues in Mixed Models by Sam Weerahandi](https://reader035.fdocuments.net/reader035/viewer/2022070409/5681443d550346895db0d74c/html5/thumbnails/11.jpg)
Comparison of Variance Estimation Methods: Performance of ML/REML vs. GE (ctd.)
Table below shows MSE performance of competing estimates
Note that
– Generalized estimate is better than any other estimate
– REML is not as good as ML
Only GE can yield unequal BLUPs with any sample
![Page 12: Resolving Inference Issues in Mixed Models by Sam Weerahandi](https://reader035.fdocuments.net/reader035/viewer/2022070409/5681443d550346895db0d74c/html5/thumbnails/12.jpg)
Further Issues with BLUP
ML and REML Prediction Intervals for BLUP are highly conservative:
– Actual coverage of intervals intended to be 95% is as large as 100%
– This implies serious lack of power in Testing of Hypotheses
– The drawback prevails unless the number of groups tends to infinity
Generalized Intervals proposed by Mathew, Gamage, and Weerahandi (2012) can rectify the drawback
Table below shows Performance of competing estimates