Continuous Emission Hidden Markov Models

AUTHOR:

  • William Stein, 2010-03
class sage.stats.hmm.chmm.GaussianHiddenMarkovModel

Bases: sage.stats.hmm.hmm.HiddenMarkovModel

GaussianHiddenMarkovModel(A, B, pi, normalize=True)

Gaussian emissions Hidden Markov Model.

INPUT:

  • A – matrix; the N x N transition matrix
  • B – list of pairs (mu, sigma) defining the emission distributions; the distribution of state i is normal with mean and standard deviation given by B[i]
  • pi – initial state probabilities
  • normalize – bool (default: True); if True, normalize the input so that it defines valid probability distributions: the entries of A are made nonnegative with rows summing to 1, and the probabilities in pi are normalized (see the sketch after the first example below)

EXAMPLES:

We illustrate the primary functions with an example 2-state Gaussian HMM:

sage: m = hmm.GaussianHiddenMarkovModel([[.1,.9],[.5,.5]], [(1,1), (-1,1)], [.5,.5]); m
Gaussian Hidden Markov Model with 2 States
Transition matrix:
[0.1 0.9]
[0.5 0.5]
Emission parameters:
[(1.0, 1.0), (-1.0, 1.0)]
Initial probabilities: [0.5000, 0.5000]
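
Construction normalizes the input by default; for illustration, a minimal sketch with arbitrary unnormalized input (assignments only, so no output is shown; the stored values are simply the rescaled inputs):

sage: A = [[1, 2], [3, 3]]                 # rows do not sum to 1
sage: m2 = hmm.GaussianHiddenMarkovModel(A, [(1, 1), (-1, 1)], [2, 2])
sage: T = m2.transition_matrix()           # rows rescaled to [1/3, 2/3] and [1/2, 1/2]
sage: p = m2.initial_probabilities()       # rescaled to [0.5, 0.5]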

We query the defining transition matrix, emission parameters, and initial state probabilities:

sage: m.transition_matrix()
[0.1 0.9]
[0.5 0.5]
sage: m.emission_parameters()
[(1.0, 1.0), (-1.0, 1.0)]
sage: m.initial_probabilities()
[0.5000, 0.5000]

We obtain a sample sequence with 10 entries and compute the logarithm of the probability of obtaining this sequence, given the model:

sage: obs = m.sample(10); obs
[-1.6835, 0.0635, -2.1688, 0.3043, -0.3188, -0.7835, 1.0398, -1.3558, 1.0882, 0.4050]
sage: m.log_likelihood(obs)
-15.2262338077988...

We compute the Viterbi path, together with the logarithm of the probability that this sequence of states produced obs:

sage: m.viterbi(obs)
([1, 0, 1, 0, 1, 1, 0, 1, 0, 1], -16.677382701707881)

We use the Baum-Welch iterative algorithm to find another model for which our observation sequence is more likely:

sage: m.baum_welch(obs)
(-10.6103334957397..., 14) 
sage: m.log_likelihood(obs)
-10.6103334957397...

Notice that running Baum-Welch changed our model:

sage: m
Gaussian Hidden Markov Model with 2 States
Transition matrix:
[   0.415498136619    0.584501863381]
[   0.999999317425 6.82574625899e-07]
Emission parameters:
[(0.417888242712, 0.517310966436), (-1.50252086313, 0.508551283606)]
Initial probabilities: [0.0000, 1.0000]
baum_welch(obs, max_iter=500, log_likelihood_cutoff=0.0001, eps=1e-12, fix_emissions=False)

Given an observation sequence obs, improve this HMM using the Baum-Welch algorithm to increase the probability of observing obs.

INPUT:

  • obs – a time series of emissions
  • max_iter – integer (default: 500) maximum number of Baum-Welch steps to take
  • log_likelihood_cutoff – positive float (default: 1e-4); the minimal improvement in likelihood with respect to the last iteration required to continue; this value is interpreted relative to the log likelihood.
  • eps – very small positive float (default: 1e-12); used during the iteration to replace values that must remain positive but have become 0, e.g., standard deviations.
  • fix_emissions – bool (default: False); if True, do not change emissions when updating

OUTPUT:

  • changes the model in place and returns the log likelihood and the number of iterations performed.

EXAMPLES:

sage: m = hmm.GaussianHiddenMarkovModel([[.1,.9],[.5,.5]], [(1,.5), (-1,3)], [.1,.9])
sage: m.log_likelihood([-2,-1,.1,0.1])
-8.8582822159862751
sage: m.baum_welch([-2,-1,.1,0.1])
(22.164539478647512, 8)
sage: m.log_likelihood([-2,-1,.1,0.1])
22.164539478647512
sage: m
Gaussian Hidden Markov Model with 2 States
Transition matrix:
[              1.0 1.99514384486e-21]
[   0.499999997782    0.500000002218]
Emission parameters:
[(0.09999999999..., 1.48492424379e-06), (-1.4999999929, 0.500000010249)]
Initial probabilities: [0.0000, 1.0000]

We watch the log likelihoods of the model converge, step by step:

sage: m = hmm.GaussianHiddenMarkovModel([[.1,.9],[.5,.5]], [(1,.5), (-1,3)], [.1,.9])
sage: v = m.sample(10)
sage: stats.TimeSeries([m.baum_welch(v,max_iter=1)[0] for _ in range(len(v))])
[-20.1167, -17.7611, -16.9814, -16.9364, -16.9314, -16.9309, -16.9309, -16.9309, -16.9309, -16.9309]
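
The max_iter and log_likelihood_cutoff parameters bound how long the iteration runs; a larger cutoff (or a smaller max_iter) stops the updates sooner. A minimal sketch on the same sample v (outputs omitted, since they depend on v):

sage: m = hmm.GaussianHiddenMarkovModel([[.1,.9],[.5,.5]], [(1,.5), (-1,3)], [.1,.9])
sage: ll, steps = m.baum_welch(v, max_iter=5)                 # at most 5 update steps
sage: m = hmm.GaussianHiddenMarkovModel([[.1,.9],[.5,.5]], [(1,.5), (-1,3)], [.1,.9])
sage: ll2, steps2 = m.baum_welch(v, log_likelihood_cutoff=1)  # stop once the relative improvement drops below the cutoff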

We illustrate fixing emissions:

sage: m = hmm.GaussianHiddenMarkovModel([[.1,.9],[.9,.1]], [(1,2),(-1,.5)], [.3,.7])
sage: set_random_seed(0); v = m.sample(100)
sage: m.baum_welch(v,fix_emissions=True)
(-164.72944548204..., 23)
sage: m.emission_parameters()
[(1.0, 2.0), (-1.0, 0.5)]
sage: m = hmm.GaussianHiddenMarkovModel([[.1,.9],[.9,.1]], [(1,2),(-1,.5)], [.3,.7])
sage: m.baum_welch(v)
(-162.85437039799817, 49)
sage: m.emission_parameters()
[(1.27224191726, 2.37136875176), (-0.948617467518, 0.576236038512)]
emission_parameters()

Return the parameters that define the normal distributions associated to all of the states.

OUTPUT:

  • a list B of pairs B[i] = (mu, std), such that the distribution associated to state i is normal with mean mu and standard deviation std.

EXAMPLES:

sage: hmm.GaussianHiddenMarkovModel([[.1,.9],[.5,.5]], [(1,.5), (-1,3)], [.1,.9]).emission_parameters()
[(1.0, 0.5), (-1.0, 3.0)]
generate_sequence(length)

Return a sample sequence of the given length from this HMM, together with the sequence of hidden states the model was in when producing each emission.

INPUT:

  • length – positive integer

OUTPUT:

  • a TimeSeries of emissions
  • an IntList of the corresponding hidden states

EXAMPLES:

sage: m = hmm.GaussianHiddenMarkovModel([[.1,.9],[.5,.5]], [(1,.5), (-1,3)], [.1,.9])
sage: m.generate_sequence(5)
([-3.0505, 0.5317, -4.5065, 0.6521, 1.0435], [1, 0, 1, 0, 1])
sage: m.generate_sequence(0)
([], [])
sage: m.generate_sequence(-1)
Traceback (most recent call last):
...
ValueError: length must be nonnegative
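
The two return values can be unpacked directly; a small usage sketch:

sage: obs, states = m.generate_sequence(5)   # TimeSeries of emissions, IntList of hidden states
sage: len(obs) == len(states) == 5
True
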
log_likelihood(obs)

Return the logarithm of the probability that this model produced the given observation sequence. Since the emissions are continuous, this is the logarithm of a probability density, so it may be positive (compare the Baum-Welch example above).

INPUT:

  • obs – sequence of observations

OUTPUT:

  • float

EXAMPLES:

sage: m = hmm.GaussianHiddenMarkovModel([[.1,.9],[.5,.5]], [(1,.5), (-1,3)], [.1,.9])
sage: m.log_likelihood([1,1,1])
-4.2978807660724856
sage: set_random_seed(0); s = m.sample(20)
sage: m.log_likelihood(s)
-40.115714129484...
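
Log likelihoods give a convenient way to compare how well two models explain the same observations; a minimal sketch using the sample s above (outputs omitted, since they depend on s):

sage: m2 = hmm.GaussianHiddenMarkovModel([[.5,.5],[.5,.5]], [(0,1), (0,1)], [.5,.5])
sage: better = m if m.log_likelihood(s) > m2.log_likelihood(s) else m2
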
viterbi(obs)

Determine “the” hidden sequence of states that is most likely to produce the given sequence obs of observations, along with the logarithm of the probability that this hidden sequence actually produced the observations.

INPUT:

  • obs – sequence of emissions (real numbers)

OUTPUT:

  • list – “the” most probable sequence of hidden states, i.e., the Viterbi path.
  • float – log of probability that the observed sequence was produced by the Viterbi sequence of states.

EXAMPLES:

We find the optimal state sequence for a given model:

sage: m = hmm.GaussianHiddenMarkovModel([[0.5,0.5],[0.5,0.5]], [(0,1),(10,1)], [0.5,0.5])
sage: m.viterbi([0,1,10,10,1])
([0, 0, 1, 1, 0], -9.0604285688230...)

Another example in which the most likely states change based on the last observation:

sage: m = hmm.GaussianHiddenMarkovModel([[.1,.9],[.5,.5]], [(1,.5), (-1,3)], [.1,.9])
sage: m.viterbi([-2,-1,.1,0.1])
([1, 1, 0, 1], -9.61823698847639...)
sage: m.viterbi([-2,-1,.1,0.3])
([1, 1, 1, 0], -9.5660236533785135)
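
Since the Viterbi path is only one of many possible state sequences, its log probability never exceeds the log likelihood of the observations; a quick check with the model above:

sage: obs = [-2, -1, .1, 0.1]
sage: m.viterbi(obs)[1] <= m.log_likelihood(obs)
True
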
class sage.stats.hmm.chmm.GaussianMixtureHiddenMarkovModel

Bases: sage.stats.hmm.chmm.GaussianHiddenMarkovModel

GaussianMixtureHiddenMarkovModel(A, B, pi, normalize=True)

Gaussian mixture Hidden Markov Model.

INPUT:

  • A – matrix; the N x N transition matrix
  • B – list of mixture definitions; B[i] is a list of pairs (w, (mu, sigma)) giving the weight w, mean mu, and standard deviation sigma of each Gaussian component of the emission mixture for state i
  • pi – initial state probabilities
  • normalize – bool (default: True); if True, the input is normalized to define valid probability distributions, e.g., the entries of A are made nonnegative and the rows sum to 1, and the probabilities in pi are normalized.

EXAMPLES:

sage: A  = [[0.5,0.5],[0.5,0.5]]
sage: B  = [[(0.9,(0.0,1.0)), (0.1,(1,10000))],[(1,(1,1)), (0,(0,0.1))]]
sage: hmm.GaussianMixtureHiddenMarkovModel(A, B, [1,0])
Gaussian Mixture Hidden Markov Model with 2 States
Transition matrix:
[0.5 0.5]
[0.5 0.5]
Emission parameters:
[0.9*N(0.0,1.0) + 0.1*N(1.0,10000.0), 1.0*N(1.0,1.0) + 0.0*N(0.0,0.1)]
Initial probabilities: [1.0000, 0.0000]
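
Each state may carry any number of mixture components; a minimal sketch of a one-state model whose emission distribution has three components (assignments only, so no output is shown):

sage: B3 = [[(0.2, (-1, 1)), (0.5, (0, 1)), (0.3, (1, 1))]]
sage: m3 = hmm.GaussianMixtureHiddenMarkovModel([[1]], B3, [1])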

TESTS:

If a standard deviation is 0, it is normalized to be slightly bigger than 0:

sage: hmm.GaussianMixtureHiddenMarkovModel([[1]], [[(1,(0,0))]], [1])
Gaussian Mixture Hidden Markov Model with 1 States
Transition matrix:
[1.0]
Emission parameters:
[1.0*N(0.0,1e-08)]
Initial probabilities: [1.0000]

We test that the number of emission distributions must equal the number of states:

sage: hmm.GaussianMixtureHiddenMarkovModel([[1]], [], [1])
Traceback (most recent call last):
...
ValueError: number of GaussianMixtures must be the same as number of entries of pi

sage: hmm.GaussianMixtureHiddenMarkovModel([[1]], [[]], [1])
Traceback (most recent call last):
...
ValueError: must specify at least one component of the mixture model

We test that the number of initial probabilities must equal the number of states:

sage: hmm.GaussianMixtureHiddenMarkovModel([[1]], [[]], [1,2])
Traceback (most recent call last):
...
ValueError: number of entries of transition matrix A must be the square of the number of entries of pi
baum_welch(obs, max_iter=1000, log_likelihood_cutoff=1e-12, eps=1e-12, fix_emissions=False)

Given an observation sequence obs, improve this HMM using the Baum-Welch algorithm to increase the probability of observing obs.

INPUT:

  • obs – a time series of emissions
  • max_iter – integer (default: 1000) maximum number of Baum-Welch steps to take
  • log_likelihood_cutoff – positive float (default: 1e-12); the minimal improvement in likelihood with respect to the last iteration required to continue; this value is interpreted relative to the log likelihood.
  • eps – very small positive float (default: 1e-12); used during the iteration to replace values that must remain positive but have become 0, e.g., standard deviations.
  • fix_emissions – bool (default: False); if True, do not change emissions when updating

OUTPUT:

  • changes the model in place and returns the log likelihood and the number of iterations performed.

EXAMPLES:

sage: m = hmm.GaussianMixtureHiddenMarkovModel([[.9,.1],[.4,.6]], [[(.4,(0,1)), (.6,(1,0.1))],[(1,(0,1))]], [.7,.3])
sage: set_random_seed(0); v = m.sample(10); v
[0.3576, -0.9365, 0.9449, -0.6957, 1.0217, 0.9644, 0.9987, -0.5950, -1.0219, 0.6477]
sage: m.log_likelihood(v)
-8.31408655939536...
sage: m.baum_welch(v)
(2.1890506868230..., 15)
sage: m.log_likelihood(v)
2.1890506868230...
sage: m
Gaussian Mixture Hidden Markov Model with 2 States
Transition matrix:
[   0.874636333977    0.125363666023]
[              1.0 1.45168520229e-40]
Emission parameters:
[0.500161629343*N(-0.812298726239,0.173329026744) + 0.499838370657*N(0.982433690378,0.029719932009), 1.0*N(0.503260056832,0.145881515324)]
Initial probabilities: [0.0000, 1.0000]

We illustrate fixing all emissions:

sage: m = hmm.GaussianMixtureHiddenMarkovModel([[.9,.1],[.4,.6]], [[(.4,(0,1)), (.6,(1,0.1))],[(1,(0,1))]], [.7,.3])
sage: set_random_seed(0); v = m.sample(10)
sage: m.baum_welch(v, fix_emissions=True)
(-7.58656858997889..., 36)
sage: m.emission_parameters()
[0.4*N(0.0,1.0) + 0.6*N(1.0,0.1), 1.0*N(0.0,1.0)]
emission_parameters()

Return a list of all the emission distributions.

OUTPUT:

  • list of Gaussian mixtures

EXAMPLES:

sage: m = hmm.GaussianMixtureHiddenMarkovModel([[.9,.1],[.4,.6]], [[(.4,(0,1)), (.6,(1,0.1))],[(1,(0,1))]], [.7,.3])
sage: m.emission_parameters()
[0.4*N(0.0,1.0) + 0.6*N(1.0,0.1), 1.0*N(0.0,1.0)]
sage.stats.hmm.chmm.unpickle_gaussian_hmm_v0(A, B, pi, name)
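
Helper used when unpickling a GaussianHiddenMarkovModel stored with the old (version 0) pickle format.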

EXAMPLES:

sage: m = hmm.GaussianHiddenMarkovModel([[1]], [(0,1)], [1])
sage: sage.stats.hmm.chmm.unpickle_gaussian_hmm_v0(m.transition_matrix(), m.emission_parameters(), m.initial_probabilities(), 'test')
Gaussian Hidden Markov Model with 1 States
Transition matrix:
[1.0]
Emission parameters:
[(0.0, 1.0)]
Initial probabilities: [1.0000]
sage.stats.hmm.chmm.unpickle_gaussian_hmm_v1(A, B, pi, prob, n_out)
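
Helper used when unpickling a GaussianHiddenMarkovModel stored with the version 1 pickle format.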

EXAMPLES:

sage: m = hmm.GaussianHiddenMarkovModel([[1]], [(0,1)], [1])
sage: loads(dumps(m)) == m   # indirect test
True
sage.stats.hmm.chmm.unpickle_gaussian_mixture_hmm_v1(A, B, pi, mixture)
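
Helper used when unpickling a GaussianMixtureHiddenMarkovModel stored with the version 1 pickle format.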

EXAMPLES:

sage: m = hmm.GaussianMixtureHiddenMarkovModel([[1]], [[(.4,(0,1)), (.6,(1,0.1))]], [1])
sage: loads(dumps(m)) == m   # indirect test
True
