lecture 71
Statistics
Statistics:
1. Model
2. Estimation
3. Hypothesis test 20
20
22
2
,
ˆ,ˆ
.,,2,1),,(~
sx
niNX i
iid
lecture 72
parameter estimat estimator
EstimationEstimat
Definition: A (point) estimate of a parameter, , in the model is a “guess” at what can be (based on the sample). The corresponding random variable is called an estimator.
^
. . .
. . .
SamplePopulation
^
^
222ˆ
ˆ
Ss
Xx
lecture 73
EstimationUnbiased estimate
Definition:An estimator is said to be unbiased if
E() =
^
^
unbiased
unbiased
have We
22 )(
)(
SE
XE
lecture 74
Confidence interval for the mean
Example:In a sample of 20 chocolate bars the amount of calories has been measured:
• the sample mean is 224 calories
How certain are we that the population mean is close to 224?
The confidence interval (CI) for helps us here!
lecture 75
Confidence interval for Known variance
Let x be the average of a sample consisting of n observations from a population with mean and variance (known).
A (1-)100% confidence interval for is given by
•We are (1-)100% confident that the unknown parameter lies in the CI.
•We are (1-)100% confident that the error we make by using x as an estimate of does not exceed (from which we can find n for a given error tolerance).
From the standard normal distribution N(0,1)
nzx
nzx
2/2/
nz 2/
lecture 76
Confidence interval for Known variance
Confidence interval for known variance is a results of the Central Limits Theorem. The underlying assumptions:
• sample size n > 30, or• the corresponding random var. is (approx.) normally distributed
(1 - )100% confidence interval :• typical values of : =10% , = 5% , = 1%• what happens to the length of the CI when n increases?
lecture 77
Confidence interval for Known variance
Problem:In a sample of 20 chocolate bars the amount of calories has been measured:
• the sample mean is 224 calories
In addition we assume:• the corresponding random variable is approx. normally distributed • the population standard deviation = 10
Calculate 90% and 95% confidence interval for
Which confidence interval is the longest?
lecture 78
Confidence interval for Unknown variance
Let x be the mean and s the sample standard deviation of a sample of n observations from a normal distributed population with mean and unknown variance.
A (1-)100 % confidence interval for is given by
•Not necessarily normally distributed, just approx. normal distributed.•For n > 30 the standard normal distribution can be used instead of the t
distribution.•We are (1-)100% confident that the unknown lies in the CI.
From the t distribution with n-1 degrees of freedom t(n -1)
nstx
nstx nn 1,2/1,2/
lecture 79
Problem:In a sample of 20 chocolate bars the amount of calories has been measured:
• the sample mean is 224 calories• the sample standard deviation is 10
Calculate 90% and 95% confidence intervals for
How does the lengths of these confidence intervals compare to those we obtained when the variance was known?
Confidence interval for Unknown variance
lecture 710
MATLAB: If x contains the data we can obtain a (1-alpha)100% confidence interval as follow:
mean(x) + [-1 1] * tinv(1-alpha/2,size(x,1)-1) * std(x)/sqrt(size(x,1))where •size(x,1) is the size n of the sample•tinv(1-alpha/2,size(x,1)-1) = t/2(n-1)•std(x) = s (sample standard deviation)
R: mean(x) + c(-1,1) * qt(1-alpha/2,length(x)-1) * sd(x)/sqrt(length(x))
Confidence interval for Using the computer
lecture 711
Confidence interval for
Example:In a sample of 20 chocolate bars the amount of calories has been measured:
• sample standard deviation is10
How certain are we that the population variance 2 is close to 102?
The confidence interval for 2 helps us answer this!
lecture 712
Confidence interval for
Let s be the standard deviation of a sample consisting of n observations from a normal distributed population with variance 2.
A (1-) 100% confidence interval for 2 is given by
We are (1-)100% confident that the unknown parameter 2 lies in the CI.
From 2 distribution with n-1 degrees of freedom
21,2/1
22
21,2/
2 )1()1(
nn
snsn
lecture 713
Confidence interval for
Problem:In a sample of 20 chocolate bars the amount of calories has been measured:
• sample standard deviation is10
Find the 90% and 95% confidence intervals for
lecture 714
MATLAB: If x contains the data we can obtain a (1-alpha)100% confidence interval for as follow:
(size(x,1)-1)*std(x)./ chi2inv([1-alpha/2 alpha/2],size(x,1)-1)
R: (length(x)-1)*sd(x))/
qchisq(c(1-alpha/2,alpha/2), length(x)-1)
Confidence interval for
Using the computer
Maximum Likelihood EstimationThe likelihood function
lecture 715
Assume that X1,...,Xn are random variables with joint density/probability function
where is the parameter (vector) of the distribution.
Considering the above function as a function of given the data x1,...,xn we obtain the likelihood function
);,,,( 21 nxxxf
);,,,(),,,;( 2121 nn xxxfxxxL
Maximum Likelihood EstimationThe likelihood function
lecture 716
Definition: Given independent observations x1,...,xn from the probability / density function f(x;) the maximum likelihood estimate (MLE) is the value of which maximizes the likelihood function
);();();(),,,;( 2121 nn xfxfxfxxxL
Reminder: If X1,...,Xn are independent random variables with identical marginal probability/ density function f(x;), then the joint probability / density function is
);();();();,,,( 2121 nn xfxfxfxxxf
Maximum Likelihood EstimationExample
lecture 717
Assume that X1,...,Xn are a sample from a normal population with mean and variance . Then the marginal density for each random variable is
Accordingly the joint density is
The logarithm of the likelihood function is
2
22 2
1exp
2
1);(
xxf
iinnn xxxxf 2
222221 2
1exp
)()2(
1);,,,(
i
in xnn
xxxL 2
22
212
2
1)ln(
2)2ln(
2),,,;,(ln
Maximum Likelihood EstimationExample
lecture 718
We find the maximum likelihood estimates by maximizing the log-likelihood:
which implies . For we have
which implies
Notice , i.e. the MLE is biased!
02
1
2);,(ln 2
2222
2
iix
nL
x
01
);,(ln2
2
iixL
x
xxn i
i 1̂
22 1ˆ
ii xx
n
22 1]ˆ[
n
nE
Top Related