GS 540 week 6. HMM basics Given a sequence, and state parameters: – Each possible path through the...
HMM basics
• Given a sequence, and state parameters:
– Each possible path through the states has a certain probability of emitting the sequence
– P(O|M)
Sequence (emissions): A C G T A G C T T T
[Figure: four state paths through the sequence]
State path | P(state path), given t-probs | P(sequence | state path), given e-probs | Joint probability
1          | .04                          | .01                                     | .0004
2          | .10                          | .04                                     | .0040
3          | .02                          | .03                                     | .0006
4          | .06                          | .08                                     | .0048
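The joint probability on this slide is the product of the two factors: the probability of the state path under the transition probabilities, times the probability of the emissions along that path. A minimal sketch follows; the two-state model, the state names `at_rich` and `gc_rich`, and all parameter values are hypothetical and chosen only for illustration.

```python
# Hypothetical two-state HMM; every number below is illustrative, not from the slides.
init = {"at_rich": 0.5, "gc_rich": 0.5}
trans = {
    "at_rich": {"at_rich": 0.9, "gc_rich": 0.1},
    "gc_rich": {"at_rich": 0.1, "gc_rich": 0.9},
}
emit = {
    "at_rich": {"A": 0.4, "C": 0.1, "G": 0.1, "T": 0.4},
    "gc_rich": {"A": 0.1, "C": 0.4, "G": 0.4, "T": 0.1},
}


def path_probability(path, init, trans):
    """Probability of taking this state path, given the transition probabilities."""
    p = init[path[0]]
    for prev, cur in zip(path, path[1:]):
        p *= trans[prev][cur]
    return p


def emission_probability(seq, path, emit):
    """Probability of emitting seq from this state path, given the emission probabilities."""
    p = 1.0
    for state, symbol in zip(path, seq):
        p *= emit[state][symbol]
    return p


def joint_probability(seq, path, init, trans, emit):
    """Joint probability of the sequence and the state path."""
    return path_probability(path, init, trans) * emission_probability(seq, path, emit)


seq = "ACGT"
path = ["at_rich", "gc_rich", "gc_rich", "at_rich"]
print(joint_probability(seq, path, init, trans, emit))
```

Summing `joint_probability` over every possible path would give P(O|M), which is only practical for very short sequences.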
Viterbi Algorithm
Sequence: A C G T A G C T T T
[Figure: states along the sequence; joint probabilities of the state paths: .0004, .0040, .0006, .0048, …; label: Highest weight path]
Viterbi Algorithm
Sequence: A C G T A G C T T T
[Figure: lattice of states … along the sequence]
Store at each node:
• Likelihood of highest-likelihood path ending here
• Traceback to previous node in path
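The two stored quantities map directly onto two tables. The sketch below is one possible implementation, not the course's template: `score` holds the log-likelihood of the best path ending at each node and `back` holds the traceback pointer. The model parameters are hypothetical, and the code assumes every probability is strictly positive so that `math.log` is defined.

```python
import math

# Hypothetical two-state HMM; all values are illustrative, not from the slides.
states = ["at_rich", "gc_rich"]
init = {"at_rich": 0.5, "gc_rich": 0.5}
trans = {
    "at_rich": {"at_rich": 0.9, "gc_rich": 0.1},
    "gc_rich": {"at_rich": 0.1, "gc_rich": 0.9},
}
emit = {
    "at_rich": {"A": 0.4, "C": 0.1, "G": 0.1, "T": 0.4},
    "gc_rich": {"A": 0.1, "C": 0.4, "G": 0.4, "T": 0.1},
}


def viterbi(seq, states, init, trans, emit):
    """Return (log-likelihood of the best path, best path) for seq."""
    # score[t][k]: log-likelihood of the best path that ends in state k at position t
    # back[t][k]: previous state on that path (traceback pointer)
    score = [{k: math.log(init[k]) + math.log(emit[k][seq[0]]) for k in states}]
    back = [{}]
    for t in range(1, len(seq)):
        score.append({})
        back.append({})
        for k in states:
            best_prev = max(states, key=lambda j: score[t - 1][j] + math.log(trans[j][k]))
            score[t][k] = (score[t - 1][best_prev]
                           + math.log(trans[best_prev][k])
                           + math.log(emit[k][seq[t]]))
            back[t][k] = best_prev
    # Follow the traceback pointers from the best final state.
    last = max(states, key=lambda k: score[-1][k])
    path = [last]
    for t in range(len(seq) - 1, 0, -1):
        path.append(back[t][path[-1]])
    path.reverse()
    return score[-1][last], path


best_score, best_path = viterbi("AAAACCCC", states, init, trans, emit)
print(best_score, best_path)
```

Only the previous column of `score` is needed to fill the next one, but keeping every column of `back` is what makes the traceback possible.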
Pseudocode for Viterbi Algorithm
• Iterate over positions in sequence
• At each position, fill in likelihood and traceback
• Compute emission and transition counts:
– number of times state k emits letter s
– number of times state k transitions to state k'
• Compute new parameter values
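The counting and re-estimation steps can be sketched as below, given a sequence and the state path returned by the traceback. The function name `count_and_reestimate` and the `pseudocount` parameter are assumptions made for this example; the slides do not mention pseudocounts, which are added here only to avoid dividing by zero when a state is never visited.

```python
from collections import Counter


def count_and_reestimate(seq, path, states, alphabet, pseudocount=1.0):
    """Count emissions and transitions along one state path, then normalize.

    pseudocount is an assumption of this sketch, not part of the slides.
    """
    # Number of times state k emits letter s.
    emit_counts = Counter(zip(path, seq))
    # Number of times state k transitions to state k'.
    trans_counts = Counter(zip(path, path[1:]))

    new_emit, new_trans = {}, {}
    for k in states:
        e_total = sum(emit_counts[(k, s)] + pseudocount for s in alphabet)
        new_emit[k] = {s: (emit_counts[(k, s)] + pseudocount) / e_total
                       for s in alphabet}
        t_total = sum(trans_counts[(k, j)] + pseudocount for j in states)
        new_trans[k] = {j: (trans_counts[(k, j)] + pseudocount) / t_total
                        for j in states}
    return new_emit, new_trans


states = ["at_rich", "gc_rich"]  # hypothetical state names
new_emit, new_trans = count_and_reestimate(
    "AACC", ["at_rich", "at_rich", "gc_rich", "gc_rich"], states, "ACGT")
print(new_emit["at_rich"]["A"], new_trans["gc_rich"]["gc_rich"])
```

Viterbi training would alternate this step with a fresh Viterbi decoding under the new parameters until the path stops changing.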
Likelihood function for Viterbi training
• Likelihood function for Viterbi:
• Different from the likelihood function for EM:
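The formulas on this slide did not survive in the transcript. As a hedged reconstruction from the standard definitions (not from the slide itself), with O the sequence, M the model parameters, and π a state path (the symbol π and the subscripts are assumptions), the two objectives are usually written as:

```latex
% Viterbi training scores only the single best state path:
L_{\text{Viterbi}}(M \mid O) = \max_{\pi} P(O, \pi \mid M)

% EM (Baum-Welch) sums over all state paths:
L_{\text{EM}}(M \mid O) = P(O \mid M) = \sum_{\pi} P(O, \pi \mid M)
```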
Machine learning debugging tips
• Debugging ML algorithms is much harder than other code.
• Problem: It is hard to verify the output after each step.
Machine learning debugging tips
• Strategies:
– Work out a simple example by hand.
– Implement a simpler, less efficient algorithm and compare.
– Compute your objective function (i.e. likelihood) at each iteration.
– Stare hard at your code.
– Don't use your real data to test.
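As one illustration of the "simpler, less efficient algorithm" strategy, the sketch below checks a forward-algorithm likelihood against brute-force enumeration of every state path on a short test sequence. The model and its parameter values are hypothetical; the brute-force version is exponential in the sequence length, so it is only usable on tiny inputs.

```python
import itertools
import math

# Hypothetical two-state HMM; all values are illustrative, not from the slides.
states = ["at_rich", "gc_rich"]
init = {"at_rich": 0.5, "gc_rich": 0.5}
trans = {
    "at_rich": {"at_rich": 0.9, "gc_rich": 0.1},
    "gc_rich": {"at_rich": 0.1, "gc_rich": 0.9},
}
emit = {
    "at_rich": {"A": 0.4, "C": 0.1, "G": 0.1, "T": 0.4},
    "gc_rich": {"A": 0.1, "C": 0.4, "G": 0.4, "T": 0.1},
}


def joint(seq, path, init, trans, emit):
    """Joint probability of seq and one state path."""
    p = init[path[0]] * emit[path[0]][seq[0]]
    for t in range(1, len(seq)):
        p *= trans[path[t - 1]][path[t]] * emit[path[t]][seq[t]]
    return p


def brute_force_likelihood(seq, states, init, trans, emit):
    """Sum the joint probability over every state path: O(K^T), for testing only."""
    return sum(joint(seq, path, init, trans, emit)
               for path in itertools.product(states, repeat=len(seq)))


def forward_likelihood(seq, states, init, trans, emit):
    """Forward algorithm: the same quantity in O(T * K^2)."""
    f = {k: init[k] * emit[k][seq[0]] for k in states}
    for symbol in seq[1:]:
        f = {k: emit[k][symbol] * sum(f[j] * trans[j][k] for j in states)
             for k in states}
    return sum(f.values())


test_seq = "ACGTAG"  # short synthetic input, not real data
slow = brute_force_likelihood(test_seq, states, init, trans, emit)
fast = forward_likelihood(test_seq, states, init, trans, emit)
assert math.isclose(slow, fast)
```

The single-letter case is easy to work out by hand: for the sequence "A" both functions should return 0.5 × 0.4 + 0.5 × 0.1 = 0.25.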
HW5 Tips
• Template is not a real solution!
• Calculate L(M|O) by hand for the first few observations, compare to your results
– For each site, what’s the likelihood of each state?
[Figure: states along the sequence A C G T]
HW5 Tips
• Not necessary to explicitly create graph structure [although you may find it helpful]
[Figure: states]
HW5 Tips
• Template is not a real solution!
• Calculate L(M|O) by hand for the first few observations, compare to your results
– For each site, what’s the likelihood of each state?
• Make a toy case:
– AAAACCCCCCCCCCCCCCC
– Easy to calculate L(M|O) by hand
• Use log space computations.
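The reason for log space is that a product of many probabilities underflows to zero in floating point, while the sum of their logs stays finite. The sketch below shows the effect and a small `logsumexp` helper (a hypothetical name for this example) for the places where probabilities have to be added rather than multiplied, such as the forward algorithm.

```python
import math


def logsumexp(log_values):
    """Return log(sum(exp(v))) without leaving log space.

    Subtracting the maximum first keeps exp() from underflowing.
    """
    m = max(log_values)
    if m == -math.inf:  # every term is log(0)
        return -math.inf
    return m + math.log(sum(math.exp(v - m) for v in log_values))


# Multiplying 400 probabilities of 0.1 underflows in double precision.
naive = 1.0
for _ in range(400):
    naive *= 0.1

# The same quantity as a sum of logs is an ordinary finite number.
log_space = 400 * math.log(0.1)

print(naive, log_space)

# Adding two tiny probabilities that cannot be represented directly.
print(logsumexp([-1000.0, -1000.0]))
```

In a Viterbi recursion only `max` and `+` are needed in log space; `logsumexp` matters once paths are summed, as in the forward and backward recursions.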
HW6: Baum-Welch
• Goal: learn HMM parameters taking into account all paths:
• Expectation maximization
– Forward-backward algorithm.
– Re-estimate parameter values based on expected counts.
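One iteration of these two steps can be sketched as follows. This is an illustrative implementation under stated assumptions, not the homework solution: the model and its values are hypothetical, the function names are made up for this example, and the recursions use plain probabilities without scaling or log space, so the code is only suitable for short sequences.

```python
# Hypothetical two-state HMM; all values are illustrative, not from the slides.
states = ["at_rich", "gc_rich"]
alphabet = "ACGT"
init = {"at_rich": 0.5, "gc_rich": 0.5}
trans = {
    "at_rich": {"at_rich": 0.9, "gc_rich": 0.1},
    "gc_rich": {"at_rich": 0.1, "gc_rich": 0.9},
}
emit = {
    "at_rich": {"A": 0.4, "C": 0.1, "G": 0.1, "T": 0.4},
    "gc_rich": {"A": 0.1, "C": 0.4, "G": 0.4, "T": 0.1},
}


def forward(seq, states, init, trans, emit):
    """f[t][k] = P(seq[0..t], state at position t is k)."""
    f = [{k: init[k] * emit[k][seq[0]] for k in states}]
    for t in range(1, len(seq)):
        f.append({k: emit[k][seq[t]] * sum(f[t - 1][j] * trans[j][k] for j in states)
                  for k in states})
    return f


def backward(seq, states, trans, emit):
    """b[t][j] = P(seq[t+1..] | state at position t is j)."""
    b = [{k: 1.0 for k in states}]
    for t in range(len(seq) - 2, -1, -1):
        # b[0] currently holds the values for position t + 1.
        b.insert(0, {j: sum(trans[j][k] * emit[k][seq[t + 1]] * b[0][k] for k in states)
                     for j in states})
    return b


def baum_welch_step(seq, states, alphabet, init, trans, emit):
    """One EM iteration: expected counts (E step), then normalization (M step)."""
    f = forward(seq, states, init, trans, emit)
    b = backward(seq, states, trans, emit)
    likelihood = sum(f[-1][k] for k in states)

    # E step: posterior state probabilities and expected counts over all paths.
    gamma = [{k: f[t][k] * b[t][k] / likelihood for k in states}
             for t in range(len(seq))]
    exp_emit = {k: {s: 0.0 for s in alphabet} for k in states}
    for t, symbol in enumerate(seq):
        for k in states:
            exp_emit[k][symbol] += gamma[t][k]
    exp_trans = {j: {k: 0.0 for k in states} for j in states}
    for t in range(len(seq) - 1):
        for j in states:
            for k in states:
                exp_trans[j][k] += (f[t][j] * trans[j][k] * emit[k][seq[t + 1]]
                                    * b[t + 1][k] / likelihood)

    # M step: re-estimate parameters from the expected counts.
    new_emit = {k: {s: exp_emit[k][s] / sum(exp_emit[k].values()) for s in alphabet}
                for k in states}
    new_trans = {j: {k: exp_trans[j][k] / sum(exp_trans[j].values()) for k in states}
                 for j in states}
    return likelihood, gamma[0], new_trans, new_emit


toy = "AAAACCCCCCCCCCCCCCC"
lik1, init1, trans1, emit1 = baum_welch_step(toy, states, alphabet, init, trans, emit)
lik2, _, _, _ = baum_welch_step(toy, states, alphabet, init1, trans1, emit1)
print(lik1, lik2)
```

Printing the likelihood at each iteration doubles as a debugging check, since EM should never decrease it.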
Probabilistic model inference algorithms
Problem: Given a model, what is the probability that a variable X has value x?
• Belief propagation
• HMMs: Forward-backward algorithm
Problem: Given a model, what is the most likely assignment of variables?
• Maximum a posteriori (MAP) inference
• Viterbi algorithm
Probabilistic model learning algorithms
Problem: Learn model parameters.
• EM:
– E step: Use inference to get estimate of hidden variable values.
– M step: Re-estimate parameter values.
• HMMs: Baum-Welch algorithm
• Use belief propagation for inference: "soft EM"
• Use Viterbi inference: "hard EM"