Maximum Likelihood Estimator - University of Kansas (jhuan/EECS940_S12/slides/MLE.pdf)

Page 1:

Maximum Likelihood Estimator

All of Statistics (Chapter 9)

Page 2:

Outline

• MLE
• Properties of MLE
  – Consistency
  – Asymptotic normality
  – Efficiency
  – Invariance

Page 3:

Definition of MLE

• Likelihood function: $L_n(\theta) = \prod_{i=1}^{n} f(X_i; \theta)$

• Log-likelihood function: $\ell_n(\theta) = \log L_n(\theta) = \sum_{i=1}^{n} \log f(X_i; \theta)$

• The MLE $\hat\theta_n$ is the value of $\theta$ that maximizes $L_n(\theta)$ (equivalently, $\ell_n(\theta)$)

• For i.i.d. observations, the likelihood is the joint density $f(x_1, \dots, x_n; \theta) = \prod_{i=1}^{n} f(x_i; \theta)$ viewed as a function of $\theta$

Page 4:

Definition of MLE

• The MLE $\hat\theta_n$ is the value of $\theta$ that maximizes $\ell_n(\theta)$; since $\log$ is monotone, it also maximizes $L_n(\theta)$
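As a concrete illustration (my addition, not from the slides): for an Exponential($\lambda$) sample, the log-likelihood can be maximized numerically, and the result matches the closed-form MLE $\hat\lambda = 1/\bar{X}$. The grid search below is a sketch, not an efficient optimizer.

```python
import math
import random

# Sketch (my addition): MLE for the rate of an Exponential(lam) sample
# via grid search over the log-likelihood, compared to the closed-form
# maximizer lam_hat = 1 / (sample mean).
random.seed(0)
true_lam = 2.0
data = [random.expovariate(true_lam) for _ in range(2000)]
n, total = len(data), sum(data)

def log_likelihood(lam):
    # l(lam) = sum_i log f(x_i; lam) with f(x; lam) = lam * exp(-lam * x),
    # which simplifies to n*log(lam) - lam * sum_i x_i
    return n * math.log(lam) - lam * total

grid = [0.01 * k for k in range(1, 600)]   # candidate rates 0.01 .. 5.99
mle_grid = max(grid, key=log_likelihood)   # numeric maximizer
mle_closed = n / total                      # analytic MLE: 1 / mean

print(mle_grid, mle_closed)
```

Since the exponential log-likelihood is concave in $\lambda$, the grid maximizer lands within one grid step of the analytic answer.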

Page 5:

Properties of MLE

MLE has the following nice properties:

• Consistency: $\hat\theta_n \xrightarrow{P} \theta_0$, i.e. $P(|\hat\theta_n - \theta_0| \le \epsilon) \to 1$ as $n \to \infty$, for every $\epsilon > 0$

• Asymptotically normal: $\sqrt{n}(\hat\theta_n - \theta_0) \xrightarrow{d} N(0,\ 1/I(\theta_0))$

• Asymptotic optimality: MLE has the smallest asymptotic variance

• Invariance property: the MLE of $g(\theta)$ is $g(\hat\theta_n)$

Page 6:

Consistency: $P(|\hat\theta_n - \theta_0| \le \epsilon) \to 1$ as $n \to \infty$

Outline of Proof

Scaled log-likelihood function: $M_n(\theta) = \frac{1}{n}\,\ell_n(\theta) = \frac{1}{n}\sum_{i=1}^{n} \log f(X_i; \theta)$

Expectation: $M(\theta) = E_{\theta_0}[\log f(X; \theta)]$

1. $\hat\theta_n$ is the maximizer of $M_n(\theta)$.
2. $\theta_0$ is the maximizer of $M(\theta)$.
3. For each $\theta$, we have $M_n(\theta) \to M(\theta)$ in probability by the law of large numbers.
4. Based on 1, 2, 3: $\hat\theta_n \to \theta_0$ in probability.

Page 7:

Step 2: $\theta_0$ is the maximizer of $M(\theta)$.

Proof: For any $\theta$, we have

$M(\theta) - M(\theta_0) = E_{\theta_0}\!\left[\log \dfrac{f(X;\theta)}{f(X;\theta_0)}\right] \le \log E_{\theta_0}\!\left[\dfrac{f(X;\theta)}{f(X;\theta_0)}\right]$ by Jensen's inequality ($\log$ is concave).

Since $E_{\theta_0}\!\left[\dfrac{f(X;\theta)}{f(X;\theta_0)}\right] = \displaystyle\int \dfrac{f(x;\theta)}{f(x;\theta_0)}\, f(x;\theta_0)\,dx = \int f(x;\theta)\,dx = 1$, the bound is $\log 1 = 0$, so $M(\theta) \le M(\theta_0)$.
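The inequality $M(\theta) \le M(\theta_0)$ can be sanity-checked numerically. The snippet below (my addition, not from the slides) evaluates $M$ for a Bernoulli model over a grid and confirms the maximizer sits at $\theta_0$.

```python
import math

# Numeric sanity check (my addition): for X ~ Bernoulli(theta0),
# M(theta) = E_{theta0}[log f(X; theta)] is maximized at theta = theta0.
theta0 = 0.3

def M(theta):
    # Expectation over X ~ Bernoulli(theta0) of log f(X; theta)
    return theta0 * math.log(theta) + (1 - theta0) * math.log(1 - theta)

grid = [0.001 * k for k in range(1, 1000)]  # theta in (0, 1)
best = max(grid, key=M)

print(best)
```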

Page 8:

Step 3: $M_n(\theta) \to M(\theta)$ in probability

• Law of large numbers: the sample average converges to the expectation in probability (provable via Chebyshev's inequality)

• Sample average: $M_n(\theta) = \frac{1}{n}\sum_{i=1}^{n} \log f(X_i;\theta)$

• Expectation: $M(\theta) = E_{\theta_0}[\log f(X;\theta)]$

Page 9:

1. $\hat\theta_n$ is the maximizer of $M_n(\theta)$.
2. $\theta_0$ is the maximizer of $M(\theta)$.
3. $M_n(\theta) \to M(\theta)$ in probability by the LLN.

Target: Based on 1, 2, 3, show $\hat\theta_n \to \theta_0$ in probability.

Page 10:

Consistency:

The distributions of the estimators become more and more concentrated near the true value of the parameter being estimated.
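A small simulation (my addition, not from the slides) makes this concrete: the spread of the Bernoulli MLE around $\theta_0$ shrinks as $n$ grows.

```python
import random
import statistics

# Simulation sketch (my addition): the Bernoulli MLE (the sample
# proportion) concentrates around theta0 = 0.4 as n grows.
random.seed(1)
theta0 = 0.4

def mle(n):
    # MLE of a Bernoulli parameter from n draws: the sample proportion
    return sum(random.random() < theta0 for _ in range(n)) / n

spread = {}
for n in (10, 100, 10000):
    estimates = [mle(n) for _ in range(200)]
    spread[n] = statistics.pstdev(estimates)  # spread of the estimator

print(spread)
```

The standard deviation of the estimates shrinks roughly like $1/\sqrt{n}$, matching the asymptotic theory.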

Page 11:

MLE is Asymptotically Normal

Page 12:

Fisher Information

• Notation: the score is $s(\theta; X) = \dfrac{\partial}{\partial\theta}\log f(X;\theta)$

• Fisher information is defined as $I(\theta) = E_\theta\!\left[s(\theta;X)^2\right]$

• It measures how quickly the pdf changes with the parameter:
  – larger Fisher information means the pdf changes quickly at $\theta_0$
  – so the distribution with parameter $\theta_0$ can be well distinguished from the distribution with other parameters
  – so $\theta_0$ is easier to estimate based on data

Page 13:

Since $f(x;\theta)$ is a pdf: $\displaystyle\int f(x;\theta)\,dx = 1$

Taking the derivative: $\displaystyle\int \frac{\partial f(x;\theta)}{\partial\theta}\,dx = 0$

Equivalently, $\displaystyle\int \frac{\partial \log f(x;\theta)}{\partial\theta}\, f(x;\theta)\,dx = 0 \quad (4)$

Writing (4) as an expectation: $E_\theta[s(\theta;X)] = 0$, where $s(\theta;X) = \partial \log f(X;\theta)/\partial\theta$ is the score; so the score has mean zero and $I(\theta) = E_\theta[s(\theta;X)^2]$ is its variance.

Differentiating (4): $\displaystyle\int \frac{\partial^2 \log f}{\partial\theta^2}\, f\,dx + \int \left(\frac{\partial \log f}{\partial\theta}\right)^2 f\,dx = 0$

First term: $E_\theta\!\left[\dfrac{\partial^2}{\partial\theta^2}\log f(X;\theta)\right]$. Second term: $E_\theta\!\left[s(\theta;X)^2\right] = I(\theta)$.

Hence $I(\theta) = -E_\theta\!\left[\dfrac{\partial^2}{\partial\theta^2}\log f(X;\theta)\right]$.
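The two expressions for $I(\theta)$ can be checked on a concrete model. The snippet below (my addition, not from the slides) computes $E[s^2]$ for a single Bernoulli($\theta$) observation and compares it with the closed form $1/(\theta(1-\theta))$.

```python
# Check (my addition): Fisher information of one Bernoulli(theta)
# observation, computed two equivalent ways.
theta = 0.3

def score(x, t):
    # s(t; x) = d/dt log f(x; t) for x in {0, 1}
    return x / t - (1 - x) / (1 - t)

# I(theta) = E[score^2], expectation over x ~ Bernoulli(theta)
info_from_score = (theta * score(1, theta) ** 2
                   + (1 - theta) * score(0, theta) ** 2)

# Closed form for the Bernoulli model: I(theta) = 1 / (theta * (1 - theta))
info_closed_form = 1 / (theta * (1 - theta))

print(info_from_score, info_closed_form)
```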

Page 14:

Theorem. (Asymptotic normality of the MLE.) $\sqrt{n}(\hat\theta_n - \theta_0) \xrightarrow{d} N(0,\ 1/I(\theta_0))$

Since the MLE $\hat\theta_n$ is the maximizer of $\ell_n(\theta)$, we have $\ell_n'(\hat\theta_n) = 0$.

By the mean value theorem, $0 = \ell_n'(\hat\theta_n) = \ell_n'(\theta_0) + (\hat\theta_n - \theta_0)\,\ell_n''(\bar\theta)$ for some $\bar\theta$ between $\hat\theta_n$ and $\theta_0$, so $\sqrt{n}(\hat\theta_n - \theta_0) = \dfrac{\ell_n'(\theta_0)/\sqrt{n}}{-\ell_n''(\bar\theta)/n}$.

First, consider the numerator: $\dfrac{1}{\sqrt{n}}\,\ell_n'(\theta_0) = \dfrac{1}{\sqrt{n}}\sum_{i=1}^{n} s(\theta_0; X_i) \xrightarrow{d} N(0,\ I(\theta_0))$, convergence in distribution by the central limit theorem, since the score $s(\theta;X) = \partial\log f(X;\theta)/\partial\theta$ has mean $0$ and variance $I(\theta_0)$. (1)

Next, consider the denominator: $-\dfrac{1}{n}\,\ell_n''(\bar\theta) = -\dfrac{1}{n}\sum_{i=1}^{n} \dfrac{\partial^2}{\partial\theta^2}\log f(X_i;\bar\theta) \xrightarrow{P} -E_{\theta_0}\!\left[\dfrac{\partial^2}{\partial\theta^2}\log f(X;\theta_0)\right] = I(\theta_0)$, convergence in probability by the LLN, using $\bar\theta \xrightarrow{P} \theta_0$ from consistency. (2)

Page 15:

Theorem. (Asymptotic normality of the MLE.) Combining (1) and (2), by Slutsky's theorem we get

$\sqrt{n}(\hat\theta_n - \theta_0) = \dfrac{\ell_n'(\theta_0)/\sqrt{n}}{-\ell_n''(\bar\theta)/n} \xrightarrow{d} \dfrac{N(0,\ I(\theta_0))}{I(\theta_0)} = N\!\left(0,\ \dfrac{1}{I(\theta_0)}\right)$

Page 16:

With the normal property, we can generate confidence bounds and hypothesis tests for the parameters: for example, an approximate $1-\alpha$ confidence interval is $\hat\theta_n \pm z_{\alpha/2}\big/\sqrt{n\, I(\hat\theta_n)}$.
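For instance (my addition, not from the slides, assuming a Bernoulli model): an approximate 95% Wald interval $\hat\theta_n \pm 1.96\big/\sqrt{n\,I(\hat\theta_n)}$, using $I(\theta) = 1/(\theta(1-\theta))$.

```python
import math
import random

# Sketch (my addition): an approximate 95% Wald confidence interval for
# a Bernoulli parameter, built from the asymptotic normality of the MLE.
random.seed(2)
theta0, n = 0.6, 2000
x = [random.random() < theta0 for _ in range(n)]

theta_hat = sum(x) / n                     # the MLE (sample proportion)
info = 1 / (theta_hat * (1 - theta_hat))   # plug-in Fisher information
se = math.sqrt(1 / (n * info))             # = sqrt(theta_hat*(1-theta_hat)/n)
lo, hi = theta_hat - 1.96 * se, theta_hat + 1.96 * se

print((round(lo, 3), round(hi, 3)))
```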

Page 17:

Asymptotic optimality (efficiency)

• The Cramér–Rao bound expresses a lower bound on the variance of estimators.

• The variance of an unbiased estimator $\hat\theta$ based on $n$ i.i.d. observations is bounded by $\mathrm{Var}(\hat\theta) \ge \dfrac{1}{n\, I(\theta)}$.

• MLE: its asymptotic variance is $\dfrac{1}{n\, I(\theta_0)}$, which attains this bound.

• The MLE has the smallest asymptotic variance, and we say that the MLE is asymptotically efficient and asymptotically optimal.

Asymptotically, the MLE behaves like an unbiased estimator with the smallest variance.
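A quick simulation (my addition, not from the slides) compares the empirical variance of the Bernoulli MLE with the Cramér–Rao bound $1/(n\,I(\theta)) = \theta(1-\theta)/n$.

```python
import random
import statistics

# Simulation sketch (my addition): the variance of the Bernoulli MLE
# is close to the Cramer-Rao bound theta*(1-theta)/n.
random.seed(3)
theta0, n, reps = 0.25, 500, 2000

estimates = [sum(random.random() < theta0 for _ in range(n)) / n
             for _ in range(reps)]
empirical_var = statistics.pvariance(estimates)
cr_bound = theta0 * (1 - theta0) / n   # 1 / (n * I(theta0))

print(empirical_var, cr_bound)
```

Here the bound is attained exactly because the sample proportion is unbiased for every $n$; in general the MLE attains it only asymptotically.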

Page 18:

Functional invariance

• Suppose $X \sim f(x \mid \theta)$ and $\hat\theta$ is the MLE of $\theta$.

• Let $\eta = g(\theta)$, where $g$ is an invertible mapping. Then the MLE of $\eta$ is $\hat\eta = g(\hat\theta)$.

Outline of proof: define the induced likelihood $L^*(\eta \mid \mathbf{x}) = f(\mathbf{x} \mid g^{-1}(\eta)) = L(g^{-1}(\eta) \mid \mathbf{x})$. For every $\eta$,

$L^*(\eta \mid \mathbf{x}) = L(g^{-1}(\eta) \mid \mathbf{x}) \le L(\hat\theta \mid \mathbf{x}) = L^*(g(\hat\theta) \mid \mathbf{x})$

Thus, the maximum of $L^*(\eta \mid \mathbf{x})$ is attained at $\hat\eta = g(\hat\theta)$.
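The invariance property can be seen numerically (my addition, not from the slides): maximizing the Bernoulli likelihood directly over the odds $\eta = \theta/(1-\theta)$ lands on $g(\hat\theta) = \hat\theta/(1-\hat\theta)$, up to grid resolution.

```python
import math
import random

# Sketch (my addition): invariance in action. Reparameterize the
# Bernoulli model by the odds eta = theta / (1 - theta) and maximize
# the induced likelihood L*(eta) = L(g^{-1}(eta)) by grid search.
random.seed(4)
theta0, n = 0.3, 1000
x = [1 if random.random() < theta0 else 0 for _ in range(n)]
k = sum(x)

def log_lik_odds(eta):
    theta = eta / (1 + eta)     # invert g: theta = eta / (1 + eta)
    return k * math.log(theta) + (n - k) * math.log(1 - theta)

grid = [0.001 * j for j in range(1, 5000)]  # candidate odds values
eta_hat = max(grid, key=log_lik_odds)

theta_hat = k / n                            # usual MLE of theta
print(eta_hat, theta_hat / (1 - theta_hat))  # the two agree up to grid step
```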

Page 19:

Discussion

• Questions?

Page 20:

Asymptotic optimality (efficiency)

The MLE has the smallest asymptotic variance, and we say that the MLE is asymptotically efficient and asymptotically optimal.

A statistic $Y$ is called an efficient estimator of $\theta$ iff the variance of $Y$ attains the Cramér–Rao lower bound.

Let $Y$ be a statistic with mean $E_\theta[Y] = \psi(\theta)$; then we have $\mathrm{Var}(Y) \ge \dfrac{[\psi'(\theta)]^2}{n\, I(\theta)}$.

When $Y$ is an unbiased estimator of $\theta$ (so $\psi(\theta) = \theta$), the Cramér–Rao inequality becomes $\mathrm{Var}(Y) \ge \dfrac{1}{n\, I(\theta)}$.

As $n$ converges to infinity, the MLE behaves like an unbiased estimator with the smallest variance.