Expectation Maximization Notes
HMM training
Jens Lagergren
But this can be computationally hard. Therefore, we break the problem down into the following sub-problems:
• Finding the right structure, i.e., determining the number of layers. (We will not consider this sub-problem at the moment.)
• Finding optimal parameters for a given structure.
Expectation Maximization
We will use the EM algorithm to optimize the parameters, i.e., to find the best-fitting HMM for a given set of sequences $F$. From the initial HMM $M$ and the initial parameter set $\theta$ we generate a new, improved parameter set $\theta'$; by iterating this procedure the approximation eventually converges to a local maximum. We calculate the new set $\theta'$ as follows:
\[
a'_{\pi\pi'} = \frac{\sum_{x \in F} E[A_{\pi\pi'} \mid x, \theta]}{\sum_{x \in F} E[A_{\pi} \mid x, \theta]}
\]

\[
e'_{\pi}(\sigma) = \frac{\sum_{x \in F} E[G_{\pi,\sigma} \mid x, \theta]}{\sum_{x \in F} E[A_{\pi} \mid x, \theta]}
\]
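To make the re-estimation step concrete, here is a minimal Python/NumPy sketch (not from the notes; the array and function names are mine). It assumes the expected counts have already been summed over all sequences $x \in F$ into arrays `A` and `G`:

```python
import numpy as np

def m_step(A, G):
    """Re-estimate HMM parameters from expected counts.

    A[p, q] : sum over x in F of E[A_{pi pi'} | x, theta]
              (expected number of p -> q transitions)
    G[p, s] : sum over x in F of E[G_{pi,sigma} | x, theta]
              (expected number of emissions of symbol s from state p)
    Returns the new parameters (a', e').
    """
    # Following the notes, both denominators are E[A_pi] = sum_{pi'} E[A_{pi pi'}],
    # i.e. the row sums of A. (States with zero expected count would need care.)
    n_pi = A.sum(axis=1, keepdims=True)
    a_new = A / n_pi    # a'_{pi pi'}
    e_new = G / n_pi    # e'_pi(sigma)
    return a_new, e_new
```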
Note that $E[A_{\pi} \mid X, \theta]$ is easily computed once $E[A_{\pi\pi'} \mid X, \theta]$ has been computed below, since $A_{\pi} = \sum_{\pi'} A_{\pi\pi'}$.
\begin{align*}
E[G_{\pi,\sigma} \mid X, \theta] &= \sum_{i\,:\,x_i = \sigma} \Pr[\pi_i = \pi \mid X, \theta] \\
&= \sum_{i\,:\,x_i = \sigma} \frac{\Pr[\pi_i = \pi, X \mid \theta]}{\Pr[X \mid \theta]} \\
&= \sum_{i\,:\,x_i = \sigma} \sum_{\pi'} \frac{\Pr[\pi_{i-1} = \pi', \pi_i = \pi, X \mid \theta]}{\Pr[X \mid \theta]}
\end{align*}
\begin{align*}
E[A_{\pi\pi'} \mid X, \theta] &= \sum_i \Pr[\pi_i = \pi, \pi_{i+1} = \pi' \mid X, \theta] \\
&= \frac{\sum_i \Pr[\pi_i = \pi, \pi_{i+1} = \pi', X \mid \theta]}{\Pr[X \mid \theta]}
\end{align*}
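As an illustration of how these expectations turn into counts in code, here is a hedged sketch (the names `gamma` and `xi` are mine, not from the notes). It assumes the per-position posteriors $\Pr[\pi_i = \pi \mid X, \theta]$ and the pairwise posteriors $\Pr[\pi_i = \pi, \pi_{i+1} = \pi' \mid X, \theta]$ are already available; they are obtained from the forward and backward variables introduced below:

```python
import numpy as np

def expected_counts(x, gamma, xi, n_states, n_symbols):
    """Accumulate E[A_{pi pi'} | x, theta] and E[G_{pi,sigma} | x, theta].

    x     : one observed sequence, encoded as integers 0..n_symbols-1
    gamma : gamma[i, p]  = Pr[pi_i = p | x, theta]
    xi    : xi[i, p, q]  = Pr[pi_i = p, pi_{i+1} = q | x, theta]
    """
    A = xi.sum(axis=0)                  # sum over positions i
    G = np.zeros((n_states, n_symbols))
    for i, s in enumerate(x):           # only positions with x_i = sigma contribute
        G[:, s] += gamma[i]
    return A, G
```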
so it is enough to be able to compute $\Pr[\pi_i = \pi, \pi_{i+1} = \pi', X \mid \theta]$. This probability factors as
\[
\Pr[\pi_i = \pi, \pi_{i+1} = \pi', X \mid \theta] = f_{\pi}(i)\, a_{\pi\pi'}\, e_{\pi'}(x_{i+1})\, b_{\pi'}(i+1),
\]
where
1. $b_{\pi}(i+1)$ is the "backward" variable, defined as $b_{\pi}(i) = \Pr[x_{i+1} \ldots x_n \mid \pi_i = \pi, \theta]$, and
2. $f_{\pi}(i)$ is the "forward" variable, $f_{\pi}(i) = \Pr[x_1 \ldots x_i, \pi_i = \pi \mid \theta]$.
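To make the factorization concrete, here is a minimal Python/NumPy sketch of the forward and backward recursions and of the joint probability above. It is an assumption-laden illustration, not the notes' own code: it uses 0-based indexing, folds the start state into an initial distribution `start`, and does no scaling, so long sequences would underflow without log-space arithmetic.

```python
import numpy as np

def forward_backward(x, a, e, start):
    """Forward and backward variables for one sequence x (integer symbols).

    a[p, q]  : transition probability p -> q
    e[p, s]  : probability that state p emits symbol s
    start[p] : probability of starting in state p
    Returns (f, b) with (0-based indexing):
      f[i, p] = Pr[x_0 .. x_i, pi_i = p | theta]           (forward)
      b[i, p] = Pr[x_{i+1} .. x_{n-1} | pi_i = p, theta]   (backward)
    """
    n, Q = len(x), a.shape[0]
    f = np.zeros((n, Q))
    b = np.zeros((n, Q))
    f[0] = start * e[:, x[0]]
    for i in range(1, n):
        f[i] = (f[i - 1] @ a) * e[:, x[i]]
    b[n - 1] = 1.0
    for i in range(n - 2, -1, -1):
        b[i] = a @ (e[:, x[i + 1]] * b[i + 1])
    return f, b

def pair_joint(x, a, e, f, b):
    """joint[i, p, q] = Pr[pi_i = p, pi_{i+1} = q, x | theta], i.e. the
    factorization f_p(i) a_{pq} e_q(x_{i+1}) b_q(i+1) from the text.
    Dividing by Pr[x | theta] = f[-1].sum() gives the posterior xi."""
    n, Q = len(x), a.shape[0]
    joint = np.zeros((n - 1, Q, Q))
    for i in range(n - 1):
        joint[i] = f[i][:, None] * a * (e[:, x[i + 1]] * b[i + 1])[None, :]
    return joint
```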
Proof: Let $v_{\pi}(i)$ denote the probability of the most probable state path for the prefix $x_1, \ldots, x_i$ that ends in state $\pi$, and
\[
v_{\pi}(0) = \begin{cases} 1 & \text{when } \pi = q_{\text{start}}, \\ 0 & \text{otherwise.} \end{cases}
\]
Then
\[
v_{\pi}(i) = e_{\pi}(x_i) \max_{\pi'} a_{\pi'\pi}\, v_{\pi'}(i-1).
\]
That is,
\[
v_{\pi}(i) = \max_{\pi'} v_{\pi'}(i-1)\, a_{\pi'\pi}\, e_{\pi}(x_i).
\]
Since there are $|Q| \cdot n$ subproblems $v_{\pi}(i)$ and each can be computed in time $O(|Q|)$, this gives a recursion that can be evaluated in total time $O(|Q|^2 n)$.
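The recursion translates directly into a dynamic program. Below is a minimal Python/NumPy sketch (mine, not the notes'); instead of the $q_{\text{start}}$ convention above, it folds the start state into an initial distribution `start[p]`, and it keeps argmax pointers so that the most probable path itself can be recovered by backtracking:

```python
import numpy as np

def viterbi(x, a, e, start):
    """Most probable state path for a sequence x of integer symbols.

    a[p, q]  : transition probability p -> q
    e[p, s]  : probability that state p emits symbol s
    start[p] : probability of starting in state p
    Fills v[p, i] = max over state paths ending in p of Pr[path, x_0..x_i],
    i.e. the notes' v_pi(i) with 0-based indexing, then backtracks.
    Runs in O(|Q|^2 n), as stated above.
    """
    n, Q = len(x), a.shape[0]
    v = np.zeros((Q, n))
    ptr = np.zeros((Q, n), dtype=int)        # best predecessor of each state
    v[:, 0] = start * e[:, x[0]]             # base case
    for i in range(1, n):
        scores = v[:, i - 1][:, None] * a    # scores[p', p] = v_{p'}(i-1) a_{p'p}
        ptr[:, i] = scores.argmax(axis=0)
        v[:, i] = scores.max(axis=0) * e[:, x[i]]
    path = [int(v[:, n - 1].argmax())]       # best final state
    for i in range(n - 1, 0, -1):            # follow pointers backwards
        path.append(int(ptr[path[-1], i]))
    return path[::-1]
```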