Sufficient statistics. The Poisson and the exponential can be summarized by $(n, \bar{y})$.
So too can the normal with known variance.
Consider a statistic $S(Y)$.
Suppose that the conditional distribution of $Y$ given $S$ does not depend on $\theta$; then $S$ is a sufficient statistic for $\theta$ based on $Y$.
This occurs iff the density of $Y$ factors into a function of $s(y)$ and $\theta$ and a function of $y$ that doesn't depend on $\theta$:
$f(y; \theta) = g\{s(y); \theta\}\, h(y)$
More Chapter 4
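Example. Exponential
$Y \sim$ IExp($\theta$)
$E(Y) = \theta$, $\mathrm{Var}(Y) = \theta^2$
Data $y_1, \ldots, y_n$
$L(\theta) = \prod_j \theta^{-1} \exp(-y_j/\theta)$
$l(\theta) = -n \log\theta - \sum_j y_j/\theta$
$\bar{y} = \sum_j y_j / n$ is sufficient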
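A small numerical check of the sufficiency claim, as a sketch only: the exponential log likelihood l(theta) = -n log(theta) - sum(y_j)/theta depends on the data only through n and the sample mean, so two samples sharing those values give identical log likelihoods at every theta. The two samples below are hypothetical, chosen only to have the same size and mean.

```python
import math

def exp_loglik(theta, ys):
    """Log likelihood of independent exponential data with mean theta."""
    n = len(ys)
    return -n * math.log(theta) - sum(ys) / theta

# Two hypothetical samples with the same n (3) and the same mean (3.0).
sample_a = [1.0, 2.0, 6.0]
sample_b = [3.0, 3.0, 3.0]

for theta in (0.5, 2.0, 5.0):
    # The two columns agree: only (n, ybar) matters.
    print(theta, exp_loglik(theta, sample_a), exp_loglik(theta, sample_b))
```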
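Likelihood equation: $U(\hat\theta) = 0$
$U(\theta) = dl(\theta)/d\theta = -n/\theta + n\bar{y}/\theta^2$
m.l.e. $\hat\theta = \bar{y}$
$d^2 l(\theta)/d\theta^2 = n/\theta^2 - 2n\bar{y}/\theta^3$
$d^2 l(\hat\theta)/d\theta^2 = -n/\bar{y}^2 < 0$, so $\hat\theta$ is a maximum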
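$E\{U(\theta_0)\} = E\{-n/\theta_0 + n\bar{Y}/\theta_0^2\} = 0$
Observed information
$J(\theta) = -d^2 l(\theta)/d\theta^2 = -n/\theta^2 + 2n\bar{y}/\theta^3$
Fisher information
$I(\theta_0) = E\{J(\theta_0)\} = n/\theta_0^2$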
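$\mathrm{Var}\{U(\theta_0)\} = I(\theta_0) = n/\theta_0^2$
$E\{J(\theta_0)\} = n/\theta_0^2$
$J(\theta_0)/I(\theta_0) = -1 + 2\bar{Y}/\theta_0 \to 1$ in prob
$I(\theta_0)^{-1/2} U(\theta_0) = n^{1/2}(\bar{Y} - \theta_0)/\theta_0 \to Z$ in dist, $Z \sim N(0,1)$
$\hat\theta \sim N(\theta_0, I(\theta_0)^{-1}) = N(\theta_0, \theta_0^2/n)$, approximately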
Approximate $100(1-2\alpha)\%$ CI for $\theta_0$:
$\hat\theta \pm z_\alpha J(\hat\theta)^{-1/2}$
Could insert $I(\hat\theta)$
Here $I(\hat\theta) = J(\hat\theta) = n/\bar{y}^2$
Example. spring data
$\bar{y} = 168.3$ kcycles
$\hat\theta = 168.3$
$I(\hat\theta) = J(\hat\theta) = n/\bar{y}^2 = 10/168.3^2 = .000353$
95% CI: $168.3 \pm 1.96\,(.0188)^{-1}$, about (64, 273) kcycles
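The interval arithmetic for the spring data can be sketched as follows, using only the summary values quoted in the example (n = 10, mean 168.3 kcycles) and the exponential-model information n / ybar^2. The function name `wald_interval` is introduced here for illustration.

```python
import math

def wald_interval(theta_hat, info, z=1.96):
    """Normal-approximation interval theta_hat +/- z * info^(-1/2)."""
    half_width = z / math.sqrt(info)
    return theta_hat - half_width, theta_hat + half_width

n, ybar = 10, 168.3          # summary values from the spring data example
info = n / ybar**2           # I(theta_hat) = J(theta_hat) for the exponential model
lower, upper = wald_interval(ybar, info)
print(info, lower, upper)
```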
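Weibull.
$f(y; \theta, \alpha) = \frac{\alpha}{\theta} (y/\theta)^{\alpha-1} \exp\{-(y/\theta)^\alpha\}$, $y > 0$, $\theta, \alpha > 0$
Exponential if $\alpha = 1$
$L(\theta, \alpha) = \prod_j f(y_j; \theta, \alpha) = \prod_j \frac{\alpha}{\theta} (y_j/\theta)^{\alpha-1} \exp\{-(y_j/\theta)^\alpha\}$
$l(\theta, \alpha) = n\log\alpha - n\alpha\log\theta + (\alpha-1)\sum_j \log y_j - \sum_j (y_j/\theta)^\alpha$
$U(\theta, \alpha) = ( -n\alpha/\theta + (\alpha/\theta)\sum_j (y_j/\theta)^\alpha ,\; n/\alpha + \sum_j \log(y_j/\theta) - \sum_j (y_j/\theta)^\alpha \log(y_j/\theta) )^T$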
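Note.
$\max_{\theta,\alpha} l(\theta, \alpha) = \max_\alpha l(\hat\theta_\alpha, \alpha)$
1-D max problem
$\hat\theta_\alpha = (n^{-1} \sum_j y_j^\alpha)^{1/\alpha}$
Profile log likelihood: $l_{\mathrm{profile}}(\alpha) = l(\hat\theta_\alpha, \alpha)$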
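A minimal sketch of the one-dimensional profile maximization for the Weibull model with scale theta and shape alpha: for fixed alpha the scale maximizer is (mean of y_j^alpha)^(1/alpha), so only alpha needs a search. The ten data values are illustrative assumptions, not taken from the transcript (they were chosen to have n = 10 and mean 168.3), and the plain grid search stands in for whatever optimizer one would actually use.

```python
import math

def theta_hat_given_alpha(alpha, ys):
    """Scale m.l.e. for a fixed shape alpha."""
    return (sum(y**alpha for y in ys) / len(ys)) ** (1.0 / alpha)

def weibull_loglik(theta, alpha, ys):
    """Weibull log likelihood with scale theta and shape alpha."""
    n = len(ys)
    return (n * math.log(alpha) - n * alpha * math.log(theta)
            + (alpha - 1.0) * sum(math.log(y) for y in ys)
            - sum((y / theta) ** alpha for y in ys))

def profile_loglik(alpha, ys):
    """Log likelihood maximized over theta for fixed alpha."""
    return weibull_loglik(theta_hat_given_alpha(alpha, ys), alpha, ys)

# Illustrative data (an assumption): n = 10, mean 168.3.
ys = [225.0, 171.0, 198.0, 189.0, 189.0, 135.0, 162.0, 135.0, 117.0, 162.0]

# Crude grid search over alpha in [0.10, 20.00].
grid = [i / 100 for i in range(10, 2001)]
alpha_hat = max(grid, key=lambda a: profile_loglik(a, ys))
theta_hat = theta_hat_given_alpha(alpha_hat, ys)
print(theta_hat, alpha_hat, profile_loglik(alpha_hat, ys))
```

Setting alpha = 1 in `profile_loglik` gives the exponential fit, whose maximized log likelihood depends on the data only through n and the mean.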
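Expected information
$I(\theta, \alpha) = n \begin{pmatrix} \alpha^2/\theta^2 & -\psi(2)/\theta \\ -\psi(2)/\theta & \{1 + \psi'(2) + \psi(2)^2\}/\alpha^2 \end{pmatrix}$
$\psi(z) = d\log\Gamma(z)/dz$
Want $I$ large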
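As a sketch, the Weibull expected information matrix n [[alpha^2/theta^2, -psi(2)/theta], [-psi(2)/theta, (1 + psi'(2) + psi(2)^2)/alpha^2]] can be evaluated and inverted to give approximate standard errors. The code uses the standard values psi(2) = 1 - gamma (Euler's constant) and psi'(2) = pi^2/6 - 1; the parameter values theta = 181, alpha = 6, n = 10 are illustrative inputs, and the helper names are introduced here.

```python
import math

EULER_GAMMA = 0.5772156649015329
PSI_2 = 1.0 - EULER_GAMMA              # digamma function at 2
PSI1_2 = math.pi**2 / 6.0 - 1.0        # trigamma function at 2

def weibull_expected_info(theta, alpha, n):
    """Expected information for (theta, alpha) from n observations."""
    a = n * alpha**2 / theta**2
    b = -n * PSI_2 / theta
    d = n * (1.0 + PSI1_2 + PSI_2**2) / alpha**2
    return [[a, b], [b, d]]

def invert_2x2(m):
    """Inverse of a 2 x 2 matrix given as nested lists."""
    (a, b), (c, d) = m
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

info = weibull_expected_info(theta=181.0, alpha=6.0, n=10)   # illustrative values
cov = invert_2x2(info)
se_theta, se_alpha = math.sqrt(cov[0][0]), math.sqrt(cov[1][1])
print(se_theta, se_alpha)
```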
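Gamma.
$f(y; \lambda, \kappa) = \frac{\lambda^\kappa y^{\kappa-1}}{\Gamma(\kappa)} \exp(-\lambda y)$, $y > 0$, $\lambda, \kappa > 0$
$L(\lambda, \kappa) = \prod_j \frac{\lambda^\kappa y_j^{\kappa-1}}{\Gamma(\kappa)} \exp(-\lambda y_j)$
$l(\lambda, \kappa) = n\kappa\log\lambda - n\log\Gamma(\kappa) + (\kappa-1)\sum_j \log y_j - \lambda\sum_j y_j$
$S = (\sum_j y_j, \sum_j \log y_j)$ sufficient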
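Example. Bernoulli
$\Pr\{Y = 1\} = 1 - \Pr\{Y = 0\} = \pi$, $0 \le \pi \le 1$
$L(\pi) = \prod_j \pi^{y_j} (1 - \pi)^{1-y_j}$
$= \pi^r (1 - \pi)^{n-r}$
$l(\pi) = r\log\pi + (n-r)\log(1-\pi)$
$r = \sum_j y_j$
$R = \sum_j Y_j$ is sufficient for $\pi$, as is $R/n$
$L(\pi)$ factors into a function of $r$ and $\pi$, and a constant
Score vector
$[\sum_j y_j/\pi - (n - \sum_j y_j)/(1-\pi)]$
Observed information
$[\sum_j y_j/\pi^2 + (n - \sum_j y_j)/(1-\pi)^2]$
M.l.e. $\hat\pi = \sum_j y_j / n$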
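A short check of the Bernoulli formulas, as a sketch on hypothetical 0/1 data: the score r/p - (n - r)/(1 - p) vanishes at p = r/n, and the observed information r/p^2 + (n - r)/(1 - p)^2 evaluated there equals n / {p(1 - p)}. The function and variable names are introduced here.

```python
def bernoulli_score(prob, ys):
    """Score for the success probability, given 0/1 data."""
    r, n = sum(ys), len(ys)
    return r / prob - (n - r) / (1.0 - prob)

def bernoulli_obs_info(prob, ys):
    """Observed information for the success probability."""
    r, n = sum(ys), len(ys)
    return r / prob**2 + (n - r) / (1.0 - prob) ** 2

ys = [1, 0, 0, 1, 1, 0, 1, 1]      # hypothetical data, r = 5, n = 8
prob_hat = sum(ys) / len(ys)       # m.l.e. r / n
print(prob_hat, bernoulli_score(prob_hat, ys), bernoulli_obs_info(prob_hat, ys))
```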
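Cauchy.
$Y_j \sim$ ICau($\theta$)
$f(y; \theta) = 1/[\pi\{1 + (y-\theta)^2\}]$
$E|Y| = \mathrm{Var}(Y) = \infty$
$L(\theta) = \prod_j 1/[\pi\{1 + (y_j-\theta)^2\}]$
Many local maxima
$l(\theta) = -\sum_j \log\{1 + (y_j-\theta)^2\}$ (dropping the constant $-n\log\pi$)
$J(\theta) = 2\sum_j \{1 - (y_j-\theta)^2\}/\{1 + (y_j-\theta)^2\}^2$, $I(\theta) = n/2$
$y_{(1)}, \ldots, y_{(n)}$ is sufficient
$Z_I = I(\hat\theta)^{1/2}(\hat\theta - \theta_0)$, $Z_J = J(\hat\theta)^{1/2}(\hat\theta - \theta_0)$
$Z_J$ is closer to $N(0,1)$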
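To illustrate the "many local maxima" remark, here is a sketch that evaluates the Cauchy log likelihood -sum log{1 + (y_j - theta)^2} on a grid and lists the grid points that beat both neighbours. The three widely spaced observations are hypothetical, chosen so that each one produces its own local maximum.

```python
import math

def cauchy_loglik(theta, ys):
    """Cauchy location log likelihood, dropping the constant -n log(pi)."""
    return -sum(math.log(1.0 + (y - theta) ** 2) for y in ys)

def cauchy_obs_info(theta, ys):
    """Observed information J(theta); it can be negative away from a maximum."""
    return 2.0 * sum((1.0 - (y - theta) ** 2) / (1.0 + (y - theta) ** 2) ** 2
                     for y in ys)

ys = [-20.0, 0.0, 20.0]                       # hypothetical, widely spaced data
grid = [i / 10 for i in range(-300, 301)]     # theta from -30 to 30 in steps of 0.1
values = [cauchy_loglik(t, ys) for t in grid]

# Grid points strictly higher than both neighbours.
local_maxima = [grid[i] for i in range(1, len(grid) - 1)
                if values[i] > values[i - 1] and values[i] > values[i + 1]]
print(local_maxima)
print(cauchy_obs_info(10.0, ys))   # negative between two maxima
```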
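Uniform.
$f(u; \theta) = 1/\theta$, $0 < u < \theta$
$= 0$ otherwise
$L(\theta) = 1/\theta^n$, $0 < y_1, \ldots, y_n < \theta$
$= 0$ otherwise
$\hat\theta = \max(y_1, \ldots, y_n) = y_{(n)}$
$dl(\hat\theta)/d\theta = -n/\hat\theta < 0$
$d^2 l(\hat\theta)/d\theta^2 = n/\hat\theta^2 > 0$
$l(\theta)$ becomes increasingly spiky
$E\{u(\theta)\} = -1/\theta$, $i(\theta) = -1/\theta^2$
$\Pr\{\hat\theta \le a\} = (a/\theta)^n$, $0 < a \le \theta$
$n(\theta - \hat\theta)/\theta \to E$ in distribution, $E$ exponential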
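The exponential limit for the uniform maximum can be checked directly, as a sketch: since Pr{theta_hat <= a} = (a/theta)^n, the scaled error n(theta - theta_hat)/theta exceeds x with probability (1 - x/n)^n, which approaches exp(-x) as n grows. The function name `tail_prob` is introduced here.

```python
import math

def tail_prob(x, n):
    """Pr{ n (theta - theta_hat) / theta > x } = (1 - x/n)^n, for 0 <= x <= n."""
    return (1.0 - x / n) ** n

for n in (10, 100, 1000):
    # Exact tail probability at x = 1 next to the exponential limit exp(-1).
    print(n, tail_prob(1.0, n), math.exp(-1.0))
```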
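Logistic regression. Challenger data
Independent binomials $R_j$ with denominators $m_j$ and probabilities $\pi_j$
$\pi_j = \exp\{\beta_0 + \beta_1 x_{1j}\}/(1 + \exp\{\beta_0 + \beta_1 x_{1j}\})$
$L(\beta_0, \beta_1) = \Pr(R = r; \beta_0, \beta_1) = \prod_j \frac{m_j!}{r_j!(m_j - r_j)!} \pi_j^{r_j} (1-\pi_j)^{m_j - r_j}$
$= \prod_j \frac{m_j!}{r_j!(m_j - r_j)!} \frac{\exp\{r_j(\beta_0 + \beta_1 x_{1j})\}}{(1 + \exp\{\beta_0 + \beta_1 x_{1j}\})^{m_j}}$
Sufficient statistic
$S = (\sum_j R_j, \sum_j R_j x_{1j})$
Confidence region
$\{\beta: (\hat\beta - \beta)^T J(\hat\beta_0, \hat\beta_1)(\hat\beta - \beta) \le c_2(1 - 2\alpha)\}$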
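A minimal sketch of the binomial logistic log likelihood with one covariate, log C(m_j, r_j) + r_j eta_j - m_j log(1 + exp(eta_j)) summed over j with eta_j = beta0 + beta1 x_j, together with the two-dimensional sufficient statistic. The (m, r, x) triples are hypothetical, not the Challenger data.

```python
import math

def logistic_loglik(beta0, beta1, data):
    """Binomial logistic log likelihood; data is a list of (m, r, x) triples."""
    total = 0.0
    for m, r, x in data:
        eta = beta0 + beta1 * x                    # linear predictor
        total += (math.log(math.comb(m, r)) + r * eta
                  - m * math.log(1.0 + math.exp(eta)))
    return total

def sufficient_statistic(data):
    """(sum of r_j, sum of r_j * x_j): all the likelihood needs from the r_j."""
    return (sum(r for _, r, _ in data), sum(r * x for _, r, x in data))

# Hypothetical (m_j, r_j, x_j) triples.
data = [(6, 3, 50.0), (6, 1, 60.0), (6, 0, 70.0), (6, 0, 80.0)]
print(sufficient_statistic(data))
print(logistic_loglik(0.0, 0.0, data))
```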
Likelihood ratio.
Model includes $\theta$, dim($\theta$) = $p$
true (unknown) value $\theta_0$
Likelihood ratio statistic
$W(\theta_0) = 2\{l(\hat\theta) - l(\theta_0)\}$
$W(\theta_0) \to \chi^2_p$ in distribution as $I(\theta_0) \to \infty$
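Justification.
Multinormal result
If $Y \sim N_p(\mu, \Sigma)$ then $(Y - \mu)^T \Sigma^{-1} (Y - \mu) \sim \chi^2_p$
$W(\theta_0) = 2\{l(\hat\theta) - l(\theta_0)\}$
$\approx 2\{l(\hat\theta) - l(\hat\theta) - (\theta_0 - \hat\theta)^T l'(\hat\theta) - \frac{1}{2}(\theta_0 - \hat\theta)^T l''(\hat\theta)(\theta_0 - \hat\theta)\}$
$= (\hat\theta - \theta_0)^T J(\hat\theta)(\hat\theta - \theta_0)$, since $l'(\hat\theta) = 0$ and $J = -l''$
$\approx (\hat\theta - \theta_0)^T I(\hat\theta)(\hat\theta - \theta_0)$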
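Uses.
$\Pr[W(\theta_0) \le c_p(1 - 2\alpha)] \approx 1 - 2\alpha$, where $c_p(1 - 2\alpha)$ is the $1 - 2\alpha$ quantile of $\chi^2_p$
$\{\theta: W(\theta) \le c_p(1 - 2\alpha)\} = \{\theta: l(\theta) \ge l(\hat\theta) - \frac{1}{2} c_p(1 - 2\alpha)\}$
Approx $100(1 - 2\alpha)\%$ confidence region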
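Example. exponential
$l(\theta) = -n\log\theta - n\bar{y}/\theta$
$l(\hat\theta) = -n\log\bar{y} - n$
$W(\theta) = 2\{l(\hat\theta) - l(\theta)\} = 2n\{\bar{y}/\theta - 1 - \log(\bar{y}/\theta)\}$
$p = 1$, $c_1(0.95) = 3.84$
$\{\theta: 2n\{\bar{y}/\theta - 1 - \log(\bar{y}/\theta)\} \le 3.84\}$
Spring data: $96 < \theta < 335$
vs. asymp normal approx $64 < \theta < 273$ kcycles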
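The likelihood ratio interval has no closed form, so its endpoints must be found numerically. A sketch using bisection on W(theta) - 3.84 = 0, with W(theta) = 2n{ybar/theta - 1 - log(ybar/theta)} and the spring summary values n = 10, ybar = 168.3; the bracketing points 1 and 2000 and the helper names are choices made here. With these rounded inputs the endpoints come out close to the quoted 96 and 335.

```python
import math

def lr_stat(theta, n, ybar):
    """Likelihood ratio statistic for the exponential mean."""
    ratio = ybar / theta
    return 2.0 * n * (ratio - 1.0 - math.log(ratio))

def bisect_root(f, a, b, tol=1e-8):
    """Root of f in [a, b] by bisection; f(a) and f(b) must differ in sign."""
    fa = f(a)
    for _ in range(200):
        mid = 0.5 * (a + b)
        fm = f(mid)
        if (fm > 0) == (fa > 0):
            a, fa = mid, fm      # root lies in [mid, b]
        else:
            b = mid              # root lies in [a, mid]
        if b - a < tol:
            break
    return 0.5 * (a + b)

n, ybar, cutoff = 10, 168.3, 3.84

def excess(theta):
    return lr_stat(theta, n, ybar) - cutoff

lower = bisect_root(excess, 1.0, ybar)       # W decreases to 0 at theta = ybar
upper = bisect_root(excess, ybar, 2000.0)    # and increases again beyond it
print(lower, upper)
```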
Prob-value/P-value. See (7.28)
Choose T whose large values cast doubt on H0
$\Pr_0(T \ge t_{obs})$
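Example. Spring data
Exponential $E(Y) = \theta$
$H_0: \theta = 100$?
$\bar{y} = 168.3$, $n = 10$
$I(\hat\theta) = J(\hat\theta) = n/\bar{y}^2$
$W(\theta) = 2n\{\bar{y}/\theta - 1 - \log(\bar{y}/\theta)\}$
$W(100) = 3.248$
P-value $= \Pr_0(\chi^2_1 \ge 3.248) = \Pr(|Z| \ge 1.802) = 2(.0358) = .071$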
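The P-value calculation can be sketched with the standard library alone, using the identity Pr(chi-squared with 1 df >= w) = Pr(|Z| >= sqrt(w)) = erfc(sqrt(w/2)). The inputs are the summary values from the example (n = 10, ybar = 168.3, hypothesized mean 100); the function name is introduced here.

```python
import math

def chi2_1_pvalue(w):
    """Upper tail probability of chi-squared with 1 df at w."""
    return math.erfc(math.sqrt(w / 2.0))

n, ybar, theta0 = 10, 168.3, 100.0
ratio = ybar / theta0
w = 2.0 * n * (ratio - 1.0 - math.log(ratio))   # likelihood ratio statistic W(theta0)
p_value = chi2_1_pvalue(w)
print(w, math.sqrt(w), p_value)
```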
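Nesting
$\psi$: p by 1 parameter of interest
$\lambda$: q by 1 nuisance parameter
Model with params $(\psi_0, \lambda)$ nested within $(\psi, \lambda)$
Second model reduces to first when $\psi = \psi_0$
Note.
$l(\hat\psi, \hat\lambda) \ge l(\psi_0, \hat\lambda_0)$, where $\hat\lambda_0$ is the m.l.e. of $\lambda$ when $\psi = \psi_0$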
Example. Weibull
params $(\theta, \alpha)$
exponential when $\alpha = 1$
How to examine $H_0: \alpha = 1$?
$W_p(\psi_0) = 2[l(\hat\psi, \hat\lambda) - l(\psi_0, \hat\lambda_0)] \to \chi^2_p$ in distribution, $p = 1$
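Spring failure times. Weibull
$(\hat\theta, \hat\alpha) = (181, 6)$: $L(181, 6) = 6.227$E-22, $l(181, 6) = -48.75$
$(\hat\theta_0, 1) = (168, 1)$: $L(168, 1) = 2.2749$E-27, $l(168, 1) = -61.26$
$W = 2[l(181, 6) - l(168, 1)] = 25.02$
P-value $= \Pr(\chi^2_1 > 25.02) = \Pr(|Z| > 5.00) = 2(2.867$E-07$) = 5.73$E-07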
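Challenger data. Logistic regression
temperature $x_1$, pressure $x_2$
$\pi(\beta_0, \beta_1, \beta_2) = \exp\{\eta\}/(1 + \exp\{\eta\})$
$\eta = \beta_0 + \beta_1 x_1 + \beta_2 x_2$ linear predictor
loglike $l(\beta_0, \beta_1, \beta_2) = \beta_0 \sum_j r_j + \beta_1 \sum_j r_j x_{1j} + \beta_2 \sum_j r_j x_{2j} - m \sum_j \log(1 + \exp\{\eta_j\})$
Does pressure matter?
$\max_{\beta_0, \beta_1} l(\beta_0, \beta_1, 0) = -15.82$
$\max_{\beta_0, \beta_1, \beta_2} l(\beta_0, \beta_1, \beta_2) = -15.05$
$W = 2 \times .77 = 1.54$
P-value: $\Pr_0(\chi^2_1 > 1.54) = \Pr(|Z| > 1.24) = 2(.107) = .214$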
Model fit.
Are labor times Weibull?
Nest its model in a more general one
Generalized gamma.
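$f(y; \lambda, \alpha, \kappa) = \frac{\alpha \lambda^{\alpha\kappa} y^{\alpha\kappa - 1}}{\Gamma(\kappa)} \exp\{-(\lambda y)^\alpha\}$, $y > 0$, $\lambda, \alpha, \kappa > 0$
Gamma for $\alpha = 1$
Weibull for $\kappa = 1$
Exponential for $\alpha = \kappa = 1$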
Likelihood results.
max log likelihood:
generalized gamma -250.65
gamma -251.12
Weibull -251.17
gamma vs. generalized gamma
- 2 log like diff:
2(-250.65+251.12) = .94
P-value $\Pr_0(\chi^2_1 > .94)$
= Pr(|Z|>.969)
= 2(.166) = .332
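Chi-squared statistics. Pearson's chi-squared
categories $1, \ldots, k$
count of cases in category $i$: $Y_i$
Pr(case in $i$) $= \pi_i$, $0 < \pi_i < 1$, $\sum_{i=1}^k \pi_i = 1$
$E(Y_i) = n\pi_i$
$\mathrm{var}(Y_i) = \pi_i(1 - \pi_i)n$
$\mathrm{cov}(Y_i, Y_j) = -\pi_i \pi_j n$, $i \ne j$
E.g. $k = 2$ case: $\mathrm{cov}(Y, n - Y) = -\mathrm{var}(Y) = -n\pi_1\pi_2$
Parameter space: $\{(\pi_1, \ldots, \pi_k): \sum_{i=1}^k \pi_i = 1,\ 0 < \pi_1, \ldots, \pi_k < 1\}$
dimension $k - 1$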
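Reduced dimension possible?
model $\pi_i(\theta)$, dim($\theta$) = $p$
log like general model:
$\sum_{i=1}^{k-1} y_i \log\pi_i + y_k \log[1 - \pi_1 - \ldots - \pi_{k-1}]$, $\sum_{i=1}^k y_i = n$
$\hat\pi_i = y_i/n$
log like restricted model:
$l(\theta) = \sum_{i=1}^{k-1} y_i \log\pi_i(\theta) + y_k \log[1 - \pi_1(\theta) - \ldots - \pi_{k-1}(\theta)]$
likelihood ratio statistic:
$W = 2\sum_{i=1}^k y_i \log\{\hat\pi_i/\pi_i(\hat\theta)\} \sim \chi^2_{k-1-p}$, approximately,
if restricted model true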
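The statistic is sometimes written
$W = 2\sum_i O_i \log(O_i/E_i)$
$\approx \sum_i (O_i - E_i)^2/E_i$
where $O_i = y_i$, $E_i = n\pi_i(\hat\theta)$
Pearson's chi-squared.
$P = \sum_{i=1}^k \{y_i - n\pi_i(\hat\theta)\}^2 / \{n\pi_i(\hat\theta)\} \sim \chi^2_{k-1-p}$, approximately
recommendation: $n\pi_i(\hat\theta) \ge 5$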
Example. Birth data. Poisson?
Daily arrivals: $n = 92$, $\hat\theta = 12.9$
Split into $k = 13$ categories
[0,7.5), [7.5,8.5), ..., [18.5,24] hours
O(bserved) 6 3 3 8 ...
E(xpected) 5.23 4.37 6.26 8.08 ...
$P = 4.39$
P-value $\Pr(\chi^2_{11} > 4.39) = .96$
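The chi-squared tail probability on 11 degrees of freedom can be computed without a statistics library. This sketch uses the closed-form series for odd degrees of freedom, Pr(chi-squared_df > x) = 2 Pr(Z > c) + 2 phi(c) * sum over r = 1..(df-1)/2 of c^(2r-1) / (1*3*...*(2r-1)), with c = sqrt(x) and phi the standard normal density. The function name `chi2_sf_odd` is introduced here; the inputs P = 4.39 and df = 13 - 1 - 1 = 11 are from the example.

```python
import math

def chi2_sf_odd(x, df):
    """Upper tail probability Pr(chi2_df > x) for odd df, via the finite series."""
    if df < 1 or df % 2 != 1:
        raise ValueError("df must be a positive odd integer")
    c = math.sqrt(x)
    normal_tail = 0.5 * math.erfc(c / math.sqrt(2.0))          # Pr(Z > c)
    normal_pdf = math.exp(-0.5 * x) / math.sqrt(2.0 * math.pi)  # phi(c)
    series, term = 0.0, c            # first term: c / 1
    for r in range(1, (df - 1) // 2 + 1):
        series += term
        term *= x / (2 * r + 1)      # next term: c^(2r+1) / (1*3*...*(2r+1))
    return 2.0 * normal_tail + 2.0 * normal_pdf * series

k, p = 13, 1                         # 13 categories, one fitted parameter
p_value = chi2_sf_odd(4.39, k - 1 - p)
print(p_value)
```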
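Two way contingency table.
r rows and c columns
n individuals
Blood groups A, B, AB, O
A, B antigens - substance causing body to produce antibodies
group   count   model I                  model II
A       179     $\alpha(1-\beta)$        $\lambda_A^2 + 2\lambda_A\lambda_O$
B       35      $(1-\alpha)\beta$        $\lambda_B^2 + 2\lambda_B\lambda_O$
AB      6       $\alpha\beta$            $2\lambda_A\lambda_B$
O       202     $(1-\alpha)(1-\beta)$    $\lambda_O^2$
$\lambda_O = 1 - \lambda_A - \lambda_B$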
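Question. Rows and columns independent?
$W = 2\sum_{i,j} y_{ij} \log\{n y_{ij} / (y_{i\cdot}\, y_{\cdot j})\}$
with $y_{i\cdot} = \sum_j y_{ij}$
$\sim \chi^2_{k-1-p} = \chi^2_{(r-1)(c-1)}$
with $k = rc$, $p = (r-1) + (c-1)$
$P = \sum_{i,j} (y_{ij} - y_{i\cdot}\, y_{\cdot j}/n)^2 / (y_{i\cdot}\, y_{\cdot j}/n)$
$\sim \chi^2_{(r-1)(c-1)}$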
Model 1
W = 17.66
$\Pr(\chi^2_1 > 17.66) = \Pr(|Z| > 4.202) = 2.646$E-05
P = 15.73
$\Pr(\chi^2_1 > 15.73) = \Pr(|Z| > 3.966) = 7.309$E-05
k-1-p = 4-1-2 = 1
Model 2
W = 3.17
Pr(|Z| > 1.780) = .075
P = 2.82
Pr(|Z|>1.679) = .093
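The Model 1 statistics can be reproduced from the four counts, as a sketch under one stated assumption: that Model 1 treats the A and B antigens as independent, with cell probabilities a(1-b), (1-a)b, ab, (1-a)(1-b) for groups A, B, AB, O. Under that assumption the m.l.e.s are the marginal proportions, and the fitted counts give both W = 2 sum O log(O/E) and Pearson's P = sum (O-E)^2 / E. The variable names are introduced here.

```python
import math

counts = {"A": 179, "B": 35, "AB": 6, "O": 202}
n = sum(counts.values())

# Marginal proportions: antigen A present, antigen B present (assumed Model 1 m.l.e.s).
a_hat = (counts["A"] + counts["AB"]) / n
b_hat = (counts["B"] + counts["AB"]) / n

expected = {
    "A": n * a_hat * (1.0 - b_hat),
    "B": n * (1.0 - a_hat) * b_hat,
    "AB": n * a_hat * b_hat,
    "O": n * (1.0 - a_hat) * (1.0 - b_hat),
}

# Likelihood ratio statistic and Pearson's statistic.
W = 2.0 * sum(o * math.log(o / expected[g]) for g, o in counts.items())
P = sum((o - expected[g]) ** 2 / expected[g] for g, o in counts.items())
df = len(counts) - 1 - 2             # k - 1 - p
print(W, P, df)
```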
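Incorrect model.
True model $g(y)$, fit $f(y; \theta)$
$\hat\theta$ maximizes $l(\theta) = \sum_j \log f(y_j; \theta)$
$l(\hat\theta)/n \to \int \log f(y; \theta_g)\, g(y)\, dy$ in probability
$\theta_g$ minimizes $D(f_\theta, g) = \int \log\{g(y)/f(y; \theta)\}\, g(y)\, dy$
Kullback-Leibler discrepancy
$D(f_\theta, g) \ge 0$, with $D = 0$ when $f(y; \theta) = g(y)$
$\theta_g$: "least bad" value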
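Example 1. Quadratic, fit linear
Example 2. True lognormal, but fit exponential
Lognormal: $Y = \exp\{\sigma Z\}$, $Z \sim N(0,1)$
Exponential log like: $-\log\theta - y/\theta$
Minimizing $\int \log\{g(y)/f(y; \theta)\}\, g(y)\, dy$ gives $\theta_g = \exp\{\sigma^2/2\}$
$\hat\theta = \bar{Y}$, $E(\bar{Y}) = \exp\{\sigma^2/2\}$, $\mathrm{var}(\bar{Y}) = e^{\sigma^2}(e^{\sigma^2} - 1)/n$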
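A simulation sketch of the lognormal-versus-exponential example, under the assumption that the true model is Y = exp(sigma Z) with Z standard normal: the exponential m.l.e. is the sample mean, which should settle near the "least bad" value exp(sigma^2 / 2) = E(Y). The seed, sigma = 0.5 and sample size are arbitrary choices made here.

```python
import math
import random

random.seed(1)                       # arbitrary seed, for reproducibility
sigma = 0.5                          # assumed lognormal scale
n = 100_000

# Draw from the true (lognormal) model, then fit the wrong (exponential) model.
ys = [math.exp(sigma * random.gauss(0.0, 1.0)) for _ in range(n)]
theta_hat = sum(ys) / n              # exponential m.l.e. is the sample mean
theta_g = math.exp(sigma**2 / 2.0)   # value minimizing the Kullback-Leibler discrepancy

print(theta_hat, theta_g)
```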
Large sample distribution.
$\hat\theta \sim N_p(\theta_g;\ I_g(\theta_g)^{-1} K(\theta_g) I_g(\theta_g)^{-1})$, approximately
mle result if $g(y) = f(y; \theta)$
Model selection.
Various models:
non-nested
Ockham's razor.
Prefer the simplest model
Formal criteria.
$AIC = 2\{-l(\hat\theta) + p\}$
$BIC = 2\{-l(\hat\theta) + \frac{1}{2} p \log n\}$
Look for minimum
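The two criteria are simple to compute once the maximized log likelihood is known. A sketch with AIC = 2(-l + p) and BIC = 2(-l + (p/2) log n); the log likelihood -360.4 is backed out here from the AIC of 744.8 quoted for model M1 (p = 12) in the spring failure example, and n = 60 is an assumption (it is the sample size at which the quoted BIC of 769.9 is recovered).

```python
import math

def aic(loglik, p):
    """AIC = 2 * (-maximized log likelihood + number of parameters)."""
    return 2.0 * (-loglik + p)

def bic(loglik, p, n):
    """BIC = 2 * (-maximized log likelihood + (p / 2) * log n)."""
    return 2.0 * (-loglik + 0.5 * p * math.log(n))

loglik_m1 = -360.4     # backed out from AIC = 744.8 with p = 12
n_obs = 60             # assumed sample size (not stated in the transcript)

print(aic(loglik_m1, 12), bic(loglik_m1, 12, n_obs))
```

Because log n exceeds 2 once n is 8 or more, BIC penalizes extra parameters more heavily than AIC in all but very small samples.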
Example. Spring failure
Model p AIC BIC
M1 12 744.8* 769.9*
M2 7 771.8 786.5
M3 2 827.8 831.2
M4 2 925.1 929.3
6 stress levels
M1: Weibull - unconnected $\theta$, $\alpha$ at each stress level