Lecture 8: The Principle of Maximum Likelihood
Syllabus
Lecture 01 Describing Inverse Problems
Lecture 02 Probability and Measurement Error, Part 1
Lecture 03 Probability and Measurement Error, Part 2
Lecture 04 The L2 Norm and Simple Least Squares
Lecture 05 A Priori Information and Weighted Least Squares
Lecture 06 Resolution and Generalized Inverses
Lecture 07 Backus-Gilbert Inverse and the Trade-Off of Resolution and Variance
Lecture 08 The Principle of Maximum Likelihood
Lecture 09 Inexact Theories
Lecture 10 Nonuniqueness and Localized Averages
Lecture 11 Vector Spaces and Singular Value Decomposition
Lecture 12 Equality and Inequality Constraints
Lecture 13 L1, L∞ Norm Problems and Linear Programming
Lecture 14 Nonlinear Problems: Grid and Monte Carlo Searches
Lecture 15 Nonlinear Problems: Newton’s Method
Lecture 16 Nonlinear Problems: Simulated Annealing and Bootstrap Confidence Intervals
Lecture 17 Factor Analysis
Lecture 18 Varimax Factors, Empirical Orthogonal Functions
Lecture 19 Backus-Gilbert Theory for Continuous Problems; Radon’s Problem
Lecture 20 Linear Operators and Their Adjoints
Lecture 21 Fréchet Derivatives
Lecture 22 Exemplary Inverse Problems, incl. Filter Design
Lecture 23 Exemplary Inverse Problems, incl. Earthquake Location
Lecture 24 Exemplary Inverse Problems, incl. Vibrational Problems
Purpose of the Lecture

Introduce the spaces of all possible data and all possible models, and the idea of likelihood

Use maximization of likelihood as a guiding principle for solving inverse problems
viewpoint

the observed data are one point in the space of all possible observations, or: d^obs is a point in S(d)
now suppose … the data are independent, each drawn from a Gaussian distribution with the same mean m1 and variance σ² (but with m1 and σ unknown)
now interpret … p(d^obs) as the probability that the observed data were in fact observed

L = log p(d^obs) is called the likelihood

find the parameters in the distribution: maximize p(d^obs) with respect to m1 and σ, that is, maximize the probability that the observed data were in fact observed
the Principle of Maximum Likelihood
solving the two equations ∂L/∂m1 = 0 and ∂L/∂σ = 0 gives

m1^est = (1/N) Σ_i d_i^obs — the usual formula for the sample mean

(σ^est)² = (1/N) Σ_i (d_i^obs − m1^est)² — almost the usual formula for the sample standard deviation (it divides by N rather than N − 1)

these two estimates are linked to the assumption that the data are Gaussian-distributed; a different p.d.f. might give a different formula
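The two formulas above can be checked numerically. This is a minimal sketch with a hypothetical sample (the true values m1 = 4.0 and σ = 0.5 and the sample size are invented for illustration):

```python
import numpy as np

# Hypothetical sample: N observations drawn from a Gaussian
# with true mean m1 = 4.0 and standard deviation sigma = 0.5.
rng = np.random.default_rng(0)
d_obs = rng.normal(4.0, 0.5, size=1000)
N = len(d_obs)

# Maximum-likelihood estimates from solving dL/dm1 = 0 and dL/dsigma = 0:
# the sample mean, and the "almost usual" standard deviation
# (normalized by N, not N - 1).
m1_est = d_obs.sum() / N
sigma_est = np.sqrt(((d_obs - m1_est) ** 2).sum() / N)

print(m1_est, sigma_est)
```

With a large sample the estimates land close to the true m1 and σ, and σ^est matches the population (ddof = 0) standard deviation exactly.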
[Figure: two example joint p.d.f.s in (d1, d2); panel (A) has a single well-defined peak, panel (B) does not]
likelihood maximization process will fail if p.d.f. has no well-defined peak
linear inverse problem with Gaussian-distributed data with known covariance [cov d]

assume the theory Gm = d gives the mean, so

p(d) ∝ exp[ −½ (d − Gm)^T [cov d]^(−1) (d − Gm) ]

principle of maximum likelihood: maximize L = log p(d^obs), that is, minimize

E = (d^obs − Gm)^T [cov d]^(−1) (d^obs − Gm)

This is just weighted least squares
principle of maximum likelihood: when the data are Gaussian-distributed, solve Gm = d with weighted least squares, with weighting [cov d]^(−1)

special case of uncorrelated data, each datum with a different variance, [cov d]_ii = σ_di²

minimize E = Σ_i σ_di^(−2) (d_i^obs − d_i^pre)²

errors weighted by their certainty
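The uncorrelated-data special case can be sketched as follows. The straight-line problem d_i = m1 + m2 z_i, the per-datum σ_di, and all numbers are hypothetical; the solve uses the weighted normal equations G^T W G m = G^T W d with W = [cov d]^(−1):

```python
import numpy as np

# Hypothetical straight-line problem d_i = m1 + m2 * z_i with
# uncorrelated data, each datum with its own variance sigma_di^2.
rng = np.random.default_rng(1)
z = np.linspace(0.0, 1.0, 20)
G = np.column_stack([np.ones_like(z), z])   # data kernel
m_true = np.array([1.0, 2.0])
sigma_d = np.full(20, 0.1)                  # per-datum standard deviations
d_obs = G @ m_true + rng.normal(0.0, sigma_d)

# Minimize E = sum_i sigma_di^{-2} (d_i^obs - d_i^pre)^2 by solving
# the weighted normal equations with W = [cov d]^{-1} (diagonal here).
W = np.diag(1.0 / sigma_d**2)
m_est = np.linalg.solve(G.T @ W @ G, G.T @ W @ d_obs)
print(m_est)
```

Because W is diagonal, each residual enters E weighted by 1/σ_di² — precise data pull the solution harder than noisy data.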
probabilistic representation of a priori information

the probability that the model parameters are near m is given by the p.d.f. pA(m), centered at the a priori value <m>, with a variance that reflects the uncertainty in the a priori information
probabilistic representation of data

the probability that the data are near d is given by the p.d.f. p(d), centered at the observed data d^obs, with a variance that reflects the uncertainty in the measurements
probabilistic representation of both prior information and observed data

assume the observations and the a priori information are uncorrelated, so the joint p.d.f. is the product p(d, m) = p(d) pA(m)
the theory d = g(m) is a surface in the combined space of data and model parameters, on which the estimated model parameters and predicted data must lie

for a linear theory, the surface is planar
[Figures: (A) the combined (model m, datum d) space, showing the curve d^pre = g(m), the joint p.d.f. centered on (<m>, d^obs), and the maximum-likelihood point (m^est, d^pre); (B) the p.d.f. p(s) along the curve, parameterized by position s, with its maximum at s_max. Successive panels show the limiting case where the prior information dominates, m^est ≈ m^ap, and the case where the data dominate, d^pre ≈ d^obs.]
principle of maximum likelihood with Gaussian-distributed data and Gaussian-distributed a priori information

minimize

E = (d^obs − Gm)^T [cov d]^(−1) (d^obs − Gm) + (m − <m>)^T [cov m]^(−1) (m − <m>)

this provides an answer to the question: what should be the value of ε² in damped least squares?

The answer: it should be set to the ratio of the variances of the data and the a priori model parameters, ε² = σ_d² / σ_m²
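The result above can be sketched as code. With uniform data variance σ_d² and prior variance σ_m², minimizing E reduces to damped least squares, (G^T G + ε² I) m = G^T d^obs + ε² <m> with ε² = σ_d²/σ_m². The problem setup (G, m_true, the variances) is hypothetical:

```python
import numpy as np

# Hypothetical linear problem with Gaussian data (variance sigma_d^2)
# and a Gaussian prior <m> (variance sigma_m^2).
rng = np.random.default_rng(2)
G = rng.normal(size=(30, 5))
m_true = rng.normal(size=5)
sigma_d, sigma_m = 0.1, 1.0
d_obs = G @ m_true + rng.normal(0.0, sigma_d, size=30)
m_prior = np.zeros(5)              # a priori value <m>

# Damped least squares: epsilon^2 is the ratio of variances.
eps2 = sigma_d**2 / sigma_m**2
lhs = G.T @ G + eps2 * np.eye(5)
rhs = G.T @ d_obs + eps2 * m_prior
m_est = np.linalg.solve(lhs, rhs)
print(m_est)
```

Note the two limits: precise data (σ_d → 0) drive ε² → 0 and the solution toward ordinary least squares, while a confident prior (σ_m → 0) drives ε² → ∞ and the solution toward <m>.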