Chapt 2. Variation
How to: summarize/display random data
appreciate variation due to randomness
Data summaries.
single observation $y$ (number, curve, image, ...)
sample $y_1, \ldots, y_n$
statistic $s(y_1, \ldots, y_n)$
Features: location, scale (spread)
Sample moments
$\bar{y} = (y_1 + \cdots + y_n)/n$, average
$s^2 = \sum_j (y_j - \bar{y})^2/(n-1)$, sample variance
Order statistics
$y_{(1)} \le y_{(2)} \le \cdots \le y_{(n)}$
minimum, maximum, median, range
quartiles, quantiles
$p \times 100\%$ trimmed average
IQR, MAD = median$\{|y_i - \mathrm{median}(y_i)|\}$
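The summaries listed above can be computed in a few lines. The following is a minimal sketch using only the Python standard library; the data are hypothetical (not the birth data), the helper names `trimmed_mean` and `mad` are illustrative, and the trimming rule (drop `int(n*p)` values from each end) is one common convention among several.

```python
import statistics

def trimmed_mean(values, p):
    """Average after dropping the int(n*p) smallest and int(n*p) largest values."""
    ys = sorted(values)
    k = int(len(ys) * p)
    kept = ys[k:len(ys) - k]
    return sum(kept) / len(kept)

def mad(values):
    """Median absolute deviation: median{|y_i - median(y)|}."""
    med = statistics.median(values)
    return statistics.median([abs(y - med) for y in values])

# Hypothetical sample with one gross outlier (not data from the text).
y = [2.0, 3.0, 3.5, 4.0, 4.5, 5.0, 5.5, 6.0, 7.0, 60.0]

ybar = statistics.mean(y)                  # average, pulled up by the outlier
s2 = statistics.variance(y)                # sample variance, divisor n - 1
med = statistics.median(y)                 # resistant measure of location
q1, _, q3 = statistics.quantiles(y, n=4)   # quartiles (one of several conventions)
iqr = q3 - q1                              # resistant measure of scale

print("average", ybar, "median", med, "10% trimmed", trimmed_mean(y, 0.10))
print("s^2", s2, "IQR", iqr, "MAD", mad(y))
```

The average is dragged toward the outlier, while the median, trimmed average, IQR and MAD barely notice it, which is what resistance means in practice.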
Bad data
Outlier - observation unusual compared to the others
Resistance
Trimmed average
Example (Midwife birth data). Hours in labor by day
$n = 95$
$\bar{y} = 7.57$ hr, $s^2 = 12.97$ hr$^2$
min, med, max = 1.5, 7.5, 19 hr
quartiles 4.95, 9.75 hr
Graphs. Indispensable in data analysis
Histogram
disjoint bins $[L + (k-1)\delta, L + k\delta)$, bin width $\delta$
Plot count, $n_k$, or proportion $n_k/n$
EDF
$\#\{y_j \le y\}/n$
Estimates CDF, $\mathrm{Prob}\{Y \le y\}$
Scatter plot $(u_j, v_j)$
Parallel boxplots - location, scale, shape, outliers, comparative
median, quartiles, 1.5 IQR
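Histogram counts and the EDF are simple to compute directly. This is a minimal sketch with hypothetical data; the function names and the choice of left endpoint, bin width and number of bins are illustrative assumptions.

```python
import math

def histogram_counts(values, left, width, n_bins):
    """Counts n_k for bins [left + (k-1)*width, left + k*width), k = 1..n_bins."""
    counts = [0] * n_bins
    for y in values:
        k = math.floor((y - left) / width)   # zero-based bin index
        if 0 <= k < n_bins:                  # values outside the bins are ignored
            counts[k] += 1
    return counts

def edf(values, y):
    """Empirical distribution function #{y_j <= y}/n."""
    return sum(1 for v in values if v <= y) / len(values)

data = [1.5, 2.0, 2.5, 4.0, 4.0, 5.5, 7.0, 9.5]   # hypothetical sample

counts = histogram_counts(data, left=0.0, width=2.0, n_bins=5)
proportions = [c / len(data) for c in counts]

print("counts", counts)
print("proportions", proportions)
print("EDF at 4.0:", edf(data, 4.0))
```

Because the bins are closed on the left and open on the right, the two observations equal to 4.0 fall in the bin starting at 4, not in the bin ending there.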
Random sample
$Y_1, \ldots, Y_n$ independent, CDF $F$
Mean
$E(Y) = \int y\,dF(y)$ ($= \int y f(y)\,dy$ if density $f$)
p quantile
$y_p = F^{-1}(p)$
Laplace (continuous)
$f(y) = \exp\{-|y - \eta|/\tau\}/(2\tau)$, $-\infty < y < \infty$
Poisson (discrete)
$\mathrm{Prob}(Y = y) = f(y) = \theta^y \exp\{-\theta\}/y!$, $y = 0, 1, 2, \ldots$
Count of daily arrivals: Poisson
Hours of labor: gamma
Gamma
$f(y) = \lambda^\kappa y^{\kappa-1} \exp\{-\lambda y\}/\Gamma(\kappa)$, $y > 0$
Many examples of useful distributions will be provided in these beginning chapters
Some discrete, some continuous
SF Chron 01/26/09
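A quick numerical sanity check of the Laplace, Poisson and gamma forms can be sketched as follows. The parameter names (`eta`, `tau`, `theta`, `lam`, `kappa`) and the parameter values are assumptions chosen for illustration, and the midpoint rule is a deliberately crude integrator.

```python
import math

def laplace_pdf(y, eta, tau):
    """Laplace density with location eta and scale tau."""
    return math.exp(-abs(y - eta) / tau) / (2.0 * tau)

def poisson_pmf(y, theta):
    """Poisson probability of the count y with mean theta."""
    return theta ** y * math.exp(-theta) / math.factorial(y)

def gamma_pdf(y, lam, kappa):
    """Gamma density with rate lam and shape kappa, for y > 0."""
    return lam ** kappa * y ** (kappa - 1.0) * math.exp(-lam * y) / math.gamma(kappa)

def midpoint_integral(g, lower, upper, steps=20000):
    """Midpoint-rule approximation to the integral of g over [lower, upper]."""
    h = (upper - lower) / steps
    return h * sum(g(lower + (i + 0.5) * h) for i in range(steps))

theta = 3.0
poisson_total = sum(poisson_pmf(y, theta) for y in range(60))
poisson_mean = sum(y * poisson_pmf(y, theta) for y in range(60))

lam, kappa = 0.5, 2.0
gamma_total = midpoint_integral(lambda y: gamma_pdf(y, lam, kappa), 0.0, 80.0)
gamma_mean = midpoint_integral(lambda y: y * gamma_pdf(y, lam, kappa), 0.0, 80.0)

laplace_total = midpoint_integral(lambda y: laplace_pdf(y, 1.0, 2.0), -60.0, 62.0)

print("Poisson: total", poisson_total, "mean", poisson_mean)
print("gamma:   total", gamma_total, "mean", gamma_mean, "(kappa/lam =", kappa / lam, ")")
print("Laplace: total", laplace_total)
```

Each total should be close to 1, the Poisson mean close to `theta`, and the gamma mean close to `kappa/lam`.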
Sampling variation.
"the data y1 ,..., yn will be regarded as the observed values of random variables" - probabilities defined
"ask how we would expect s(y1,...,yn) to behave on average, ..., understand the properties of S = S(Y1 ,...,Yn )"
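$Y_1, \ldots, Y_n$ sample from distribution with mean $\mu$, variance $\sigma^2$
Sample moment $\bar{Y}$; $E(\bar{Y}) = nE(Y_j)/n = \mu$, unbiased
$E(X + Y) = E(X) + E(Y)$
$\mathrm{var}(\bar{Y}) = \sigma^2/n$
$\mathrm{var}(X + Y) = \mathrm{var}(X) + \mathrm{var}(Y)$, if uncorrelated
$\mathrm{var}(aX) = a^2\,\mathrm{var}(X)$
$\sum_j (Y_j - \mu)^2 = \sum_j (Y_j - \bar{Y} + \bar{Y} - \mu)^2$
$= \sum_j (Y_j - \bar{Y})^2 + n(\bar{Y} - \mu)^2$ (the cross term vanishes because $\sum_j (Y_j - \bar{Y}) = 0$)
$n\sigma^2 = E\{\sum_j (Y_j - \bar{Y})^2\} + \sigma^2$
so $E\{\sum_j (Y_j - \bar{Y})^2\} = (n-1)\sigma^2$ and $E(S^2) = \sigma^2$, unbiased
Birth data. $n = 95$, $\bar{y} = 7.57$ hr, $s^2/n = 12.97/95 \approx 0.137$ hr$^2$ (standard error $s/\sqrt{n} \approx 0.37$ hr)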
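The unbiasedness of the sample average and sample variance, and the shrinking of the variance of the average with n, can be checked by simulation. This is a rough sketch, assuming normal data with hypothetical parameter values; any distribution with finite variance would serve.

```python
import random

random.seed(1)                      # fixed seed so the run is reproducible
mu, sigma, n, reps = 5.0, 2.0, 10, 20000

averages, variances = [], []
for _ in range(reps):
    sample = [random.gauss(mu, sigma) for _ in range(n)]
    ybar = sum(sample) / n
    s2 = sum((y - ybar) ** 2 for y in sample) / (n - 1)   # divisor n - 1
    averages.append(ybar)
    variances.append(s2)

mean_of_averages = sum(averages) / reps
var_of_averages = sum((a - mean_of_averages) ** 2 for a in averages) / (reps - 1)
mean_of_s2 = sum(variances) / reps

print("mean of averages:", mean_of_averages, "target", mu)
print("variance of averages:", var_of_averages, "target", sigma ** 2 / n)
print("mean of sample variances:", mean_of_s2, "target", sigma ** 2)
```

Replacing the divisor `n - 1` by `n` in the loop makes the last figure settle near `(n - 1) * sigma**2 / n` instead, which is the bias the `n - 1` divisor removes.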
Probability plot. Checking probability model
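plot $y_{(j)}$ versus $F^{-1}(j/(n+1))$
For normal take $F = \Phi$
from table or statistical package
Normal prob plot "works" even if $\mu$, $\sigma$ unknown
For $N(\mu, \sigma^2)$, $E(Y_{(j)}) = \mu + \sigma E(Z_{(j)})$, so the plot is roughly linear with intercept $\mu$ and slope $\sigma$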
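The coordinates of a normal probability plot can be built with `statistics.NormalDist.inv_cdf` from the Python standard library. This is a minimal sketch on simulated data with hypothetical parameters; fitting a least-squares line to the points is an added illustration of why the plot still works when the mean and standard deviation are unknown, not a step taken from the text.

```python
import random
from statistics import NormalDist

def normal_plot_points(values):
    """Pairs (Phi^{-1}(j/(n+1)), y_(j)) for j = 1..n."""
    n = len(values)
    ordered = sorted(values)
    std_normal = NormalDist()
    return [(std_normal.inv_cdf(j / (n + 1)), ordered[j - 1]) for j in range(1, n + 1)]

def least_squares_line(points):
    """Intercept and slope of the least-squares line through the points."""
    n = len(points)
    xbar = sum(x for x, _ in points) / n
    ybar = sum(y for _, y in points) / n
    sxx = sum((x - xbar) ** 2 for x, _ in points)
    sxy = sum((x - xbar) * (y - ybar) for x, y in points)
    slope = sxy / sxx
    return ybar - slope * xbar, slope

random.seed(5)
sample = [random.gauss(10.0, 3.0) for _ in range(500)]   # hypothetical N(10, 3^2) data
intercept, slope = least_squares_line(normal_plot_points(sample))

print("intercept (rough estimate of the mean):", intercept)
print("slope (rough estimate of the standard deviation):", slope)
```

Plotting the pairs returned by `normal_plot_points` with any graphics package gives the probability plot itself; marked curvature would suggest the normal model is inadequate.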
Tools for approximation
Weak law of large numbers.
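$\bar{Y} \to \mu$ in probability as $n \to \infty$
$\bar{Y}$ is a consistent estimate of $\mu$
Definition.
$\{S_n\} \to S$ in probability if for any $\varepsilon > 0$
$\Pr(|S_n - S| > \varepsilon) \to 0$
as $n \to \infty$
If $S = s_0$, constant, and $h(s)$ continuous at $s_0$ then
$h(S_n) \to h(s_0)$ in probability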
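Convergence in probability of the average can be illustrated by estimating the probability of a deviation larger than a fixed tolerance for increasing n. This is a sketch under stated assumptions: exponential data with mean 1, a tolerance of 0.1, and a hypothetical helper `exceed_prob`.

```python
import random

random.seed(2)   # fixed seed so the run is reproducible

def exceed_prob(n, eps, reps=1000):
    """Monte Carlo estimate of Pr(|Ybar_n - 1| > eps) for exponential data with mean 1."""
    count = 0
    for _ in range(reps):
        ybar = sum(random.expovariate(1.0) for _ in range(n)) / n
        if abs(ybar - 1.0) > eps:
            count += 1
    return count / reps

probs = {n: exceed_prob(n, eps=0.1) for n in (10, 100, 1000)}

for n, p in probs.items():
    print("n =", n, " estimated Pr(|Ybar - mu| > 0.1) =", p)
```

The estimated probabilities fall toward zero as n grows, which is the content of the weak law for this tolerance.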
Central limit theorem.
$\sqrt{n}(\bar{Y} - \mu)/\sigma \to Z \sim N(0,1)$ in distribution as $n \to \infty$
Definition.
$\{Z_n\}$ converges in distribution to $Z$ if
$\Pr(Z_n \le z) \to \Pr(Z \le z)$
as $n \to \infty$ at every $z$ for which $\Pr(Z \le z)$ is continuous
The CLT provides an approximation for "large" $n$
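How good the normal approximation is for a given n can be checked by simulation. The sketch below assumes exponential data with mean and standard deviation 1 and n = 50, both hypothetical choices, and compares the simulated distribution of the standardized average with the standard normal CDF at a few points.

```python
import math
import random
from statistics import NormalDist

random.seed(3)   # fixed seed so the run is reproducible
mu = sigma = 1.0           # mean and standard deviation of the unit exponential
n, reps = 50, 10000

z_values = []
for _ in range(reps):
    ybar = sum(random.expovariate(1.0) for _ in range(n)) / n
    z_values.append(math.sqrt(n) * (ybar - mu) / sigma)   # standardized average

phi = NormalDist().cdf
differences = {}
for z in (-1.0, 0.0, 1.0):
    empirical = sum(1 for v in z_values if v <= z) / reps
    differences[z] = empirical - phi(z)
    print("z =", z, " simulated", empirical, " normal", phi(z))
```

With skewed data such as these, a small discrepancy at z = 0 is expected for moderate n and shrinks as n increases.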
Average as an estimate of $\mu$.
If $X$ is $N(\mu, \sigma^2)$ then $(X - \mu)/\sigma$ is $N(0,1)$
Writing $Z_n = \sqrt{n}(\bar{Y} - \mu)/\sigma$
$\bar{Y} = \mu + n^{-1/2}\sigma Z_n$
Indicates how efficiency of $\bar{Y}$ depends on $n$ and $\sigma$
Covariance and correlation.
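$\mathrm{cov}(X,Y) = \gamma_{xy} = E[\{X - E(X)\}\{Y - E(Y)\}]$
sample covariance
$C_{xy} = \sum_{j=1}^{n} (X_j - \bar{X})(Y_j - \bar{Y})/(n-1)$
$C_{xy} \to \gamma_{xy}$ in probability
correlation
$\rho = \mathrm{cov}(X,Y)/\sqrt{\mathrm{var}(X)\,\mathrm{var}(Y)}$, $-1 \le \rho \le 1$
$R = C_{xy}/\sqrt{C_{xx} C_{yy}}$
$R \to \rho$ in probability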
R = -.340
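The sample covariance and correlation follow directly from their definitions. This is a minimal sketch on a small hypothetical data set (not the data behind the value quoted above); the function names are illustrative.

```python
import math

def sample_cov(xs, ys):
    """Sample covariance with divisor n - 1."""
    n = len(xs)
    xbar = sum(xs) / n
    ybar = sum(ys) / n
    return sum((x - xbar) * (y - ybar) for x, y in zip(xs, ys)) / (n - 1)

def sample_corr(xs, ys):
    """Sample correlation R = C_xy / sqrt(C_xx * C_yy)."""
    return sample_cov(xs, ys) / math.sqrt(sample_cov(xs, xs) * sample_cov(ys, ys))

x = [1.0, 2.0, 3.0, 4.0, 5.0]   # hypothetical paired observations
y = [2.0, 1.0, 4.0, 3.0, 5.0]

c_xy = sample_cov(x, y)
r = sample_corr(x, y)

print("C_xy =", c_xy)
print("R =", r)
print("R for an exact decreasing line:", sample_corr(x, [-v for v in x]))
```

Note that `sample_cov(x, x)` is just the sample variance of `x`, so the same function supplies all three terms in the correlation.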
Some more distributions.
Cauchy
$f(y) = 1/[\pi\{1 + (y - \theta)^2\}]$, $-\infty < y < \infty$
distribution of $\bar{Y}$ same as that of $Y_1$
no moments, long tails
Uniform
$F(u) = 0$, $u \le 0$
$= u$, $0 < u \le 1$
$= 1$, $1 < u$
$E(U) = 1/2$, center of gravity
Exponential
$f(y) = 0$, $y < 0$
$= \lambda \exp\{-\lambda y\}$, $y \ge 0$
Pareto
$F(y) = 0$, $y < a$
$= 1 - (y/a)^{-\gamma}$, $y \ge a$; $a, \gamma > 0$
Poisson process
Times of events $y_{(1)}, y_{(2)}, y_{(3)}, \ldots$
$y_{(1)}, y_{(2)} - y_{(1)}, y_{(3)} - y_{(2)}, \ldots$ i.i.d. exponential
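A Poisson process can be simulated by accumulating independent exponential gaps between successive events. The sketch below assumes a hypothetical rate of 3 events per unit time and checks two consequences: the average gap is close to 1/rate, and the counts in unit intervals have mean and variance both close to the rate, as Poisson counts should.

```python
import random

random.seed(4)   # fixed seed so the run is reproducible
rate, horizon = 3.0, 2000.0

# Event times: cumulative sums of i.i.d. exponential gaps.
times = []
t = 0.0
while True:
    t += random.expovariate(rate)
    if t >= horizon:
        break
    times.append(t)

# Recover the gaps: first event time, then successive differences.
gaps = [times[0]] + [b - a for a, b in zip(times, times[1:])]
mean_gap = sum(gaps) / len(gaps)

# Counts of events in the unit intervals [0,1), [1,2), ...
counts = [0] * int(horizon)
for event_time in times:
    counts[int(event_time)] += 1
mean_count = sum(counts) / len(counts)
var_count = sum((c - mean_count) ** 2 for c in counts) / (len(counts) - 1)

print("mean gap", mean_gap, "target", 1.0 / rate)
print("mean count", mean_count, "variance of counts", var_count, "target", rate)
```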
Chi-squared distribution
$Z_1, Z_2, \ldots, Z_\nu$ independent $N(0,1)$
$W = \sum_{j=1}^{\nu} Z_j^2$
$E(W) = \nu$, $\mathrm{var}(W) = 2\nu$
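The mean and variance of a chi-squared variable can be checked by summing squared standard normals. This is a rough simulation sketch, assuming 5 degrees of freedom as a hypothetical choice.

```python
import random

random.seed(6)   # fixed seed so the run is reproducible
nu, reps = 5, 20000

# Each W is a sum of nu squared independent N(0,1) variables.
w = [sum(random.gauss(0.0, 1.0) ** 2 for _ in range(nu)) for _ in range(reps)]

mean_w = sum(w) / reps
var_w = sum((v - mean_w) ** 2 for v in w) / (reps - 1)

print("mean of W", mean_w, "target", nu)
print("variance of W", var_w, "target", 2 * nu)
```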
Multinomial
page 47
$p$ classes with probs $\pi_1, \ldots, \pi_p$ adding to 1
Linear combination
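$L = a + \sum_j b_j Y_j$
$E(L) = a + \sum_j b_j \mu_j$
If independent
$\mathrm{var}(L) = \sum_j b_j^2 \sigma_j^2$
If $\{Y_j\}$ are independent $N(\mu_j, \sigma_j^2)$, then $L$ is
$N(a + \sum_j b_j \mu_j, \sum_j b_j^2 \sigma_j^2)$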
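The mean and variance of a linear combination of independent variables are a short computation. This is a minimal sketch with hypothetical coefficients and moments; the function name is illustrative. The second call treats the sample average as the special case with zero constant and every coefficient equal to 1/n.

```python
def linear_combination_moments(a, b, means, variances):
    """Mean and variance of L = a + sum_j b_j Y_j for independent Y_j."""
    mean = a + sum(bj * mj for bj, mj in zip(b, means))
    variance = sum(bj ** 2 * vj for bj, vj in zip(b, variances))
    return mean, variance

# Hypothetical example: L = 1 + 2*Y1 - Y2, with means 3, 4 and variances 1, 9.
mean_l, var_l = linear_combination_moments(1.0, [2.0, -1.0], [3.0, 4.0], [1.0, 9.0])
print("E(L) =", mean_l, " var(L) =", var_l)

# The average of n = 4 observations, each with mean 7 and variance 2.
n = 4
mean_avg, var_avg = linear_combination_moments(0.0, [1.0 / n] * n, [7.0] * n, [2.0] * n)
print("E(Ybar) =", mean_avg, " var(Ybar) =", var_avg)
```

The negative coefficient still contributes positively to the variance because it enters squared.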
Moment-generating function
$M_Y(t) = E(\exp\{tY\})$, $t$ real
$X$, $Y$ independent
$M_{X+Y}(t) = M_X(t) M_Y(t)$
For $N(\mu, \sigma^2)$
$M(t) = \exp\{\mu t + \sigma^2 t^2/2\}$
The normal is determined by its moments
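The normal moment-generating function can be checked against a direct numerical evaluation of the expectation of exp{tY}. This is a sketch under stated assumptions: hypothetical parameter values, a simple midpoint rule over a range of 12 standard deviations either side of the mean, and illustrative function names.

```python
import math
from statistics import NormalDist

def normal_mgf(t, mu, sigma):
    """Closed form exp{mu*t + sigma^2 * t^2 / 2} for the N(mu, sigma^2) distribution."""
    return math.exp(mu * t + sigma ** 2 * t ** 2 / 2.0)

def numerical_mgf(t, mu, sigma, steps=40000):
    """Midpoint-rule approximation to the integral of exp(t*y) times the normal density."""
    dist = NormalDist(mu, sigma)
    lower, upper = mu - 12.0 * sigma, mu + 12.0 * sigma
    h = (upper - lower) / steps
    total = 0.0
    for i in range(steps):
        y = lower + (i + 0.5) * h
        total += math.exp(t * y) * dist.pdf(y)
    return h * total

mu, sigma, t = 1.0, 2.0, 0.5
closed_form = normal_mgf(t, mu, sigma)
approximation = numerical_mgf(t, mu, sigma)
print("closed form", closed_form, "numerical", approximation)

# Product rule for independent X ~ N(1, 2^2) and Y ~ N(-0.5, 1^2):
# X + Y is N(0.5, 5), so its MGF should equal the product of the two MGFs.
product = normal_mgf(t, 1.0, 2.0) * normal_mgf(t, -0.5, 1.0)
mgf_of_sum = normal_mgf(t, 0.5, math.sqrt(5.0))
print("product of MGFs", product, "MGF of the sum", mgf_of_sum)
```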