Implementation of a Viterbi-like algorithm for the top-N probable state sequences

Traditional Viterbi algorithm (say, for hidden Markov models) provides the most probable hidden state sequence given a sequence of observations.

There is probably an algorithm for decoding the top-N most probable hidden state sequences (k-shortest paths or something similar).

But is there a good implementation anywhere?

Thanks.

UPD (copy-pasted from comments): The Viterbi algorithm decodes the MOST probable sequence given the observations and the model, i.e. $\arg\max_{s_1,\dots,s_n} p(s_1,\dots,s_n \mid o_1,\dots,o_n)$. What I am asking for is an implementation that returns the next-best sequences, with possibly slightly smaller probabilities, GIVEN the observed sequence.

Topic graphical-model markov-process sequence graphs algorithms

Category Data Science


Apparently, I misunderstood your question.

There are several methods for finding the k-best paths with extending versions of the Viterbi algorithm.

My first advice would be to look at this question on SO that is similar to yours and has a good illustrated answer.

Then, I would refer you to two publicly available articles/theses from which one can extend their research. (Disclaimer: these references may not be the "best" ones, but I've chosen them because they are publicly available and provide a good number of references to deepen research on the topic.)

  • The k-best paths in Hidden Markov Models. Algorithms and Applications to Transmembrane Protein Topology Recognition. Thesis by Golod, available here. (a thesis like this gives countless references on the topic)

  • Decoding HMMs using the k-best paths: algorithms and applications by Golod and Brown, available here (a partial short version of what you will find in the aforementioned thesis)
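To make the idea concrete, here is a minimal sketch (not taken from the references above) of a "list Viterbi" decoder: instead of keeping one best path per state at each time step, it keeps the k best, which is sufficient to recover the global k best sequences at the end. The HMM parameterization (`pi`, `A`, `B` as NumPy arrays) is an assumption for illustration.

```python
import numpy as np
from heapq import nlargest

def k_best_viterbi(obs, pi, A, B, k):
    """Return the k most probable state sequences of an HMM given observations.

    obs : sequence of observation indices
    pi  : initial state probabilities, shape (S,)
    A   : transition matrix, A[i, j] = P(state j | state i), shape (S, S)
    B   : emission matrix, B[i, o] = P(obs o | state i), shape (S, O)

    Returns a list of up to k (log-probability, state list) pairs, best first.
    """
    S, T = len(pi), len(obs)
    eps = 1e-300  # guard against log(0)
    # paths[s] = up to k (log-prob, path) entries for paths ending in state s
    paths = [[(np.log(pi[s] * B[s, obs[0]] + eps), [s])] for s in range(S)]
    for t in range(1, T):
        new_paths = []
        for s in range(S):
            # extend every kept path by state s, then keep the k best
            candidates = [
                (lp + np.log(A[prev, s] * B[s, obs[t]] + eps), path + [s])
                for prev in range(S)
                for lp, path in paths[prev]
            ]
            new_paths.append(nlargest(k, candidates, key=lambda c: c[0]))
        paths = new_paths
    # merge the per-state lists and keep the global k best
    final = [c for per_state in paths for c in per_state]
    return nlargest(k, final, key=lambda c: c[0])
```

With `k=1` this reduces to the ordinary Viterbi algorithm; the cost grows roughly linearly in k.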


My previous answer:

For anyone coming across this question while looking for a way to compute some typical state sequences of an HMM (as I first thought this question was about): as far as I know, the concept of a "most probable sequence" without conditioning on data is not really used in HMM theory. However, one can follow these steps:

As a first try, I would implement something like this:

  1. Get the state at time $t=0$: draw an initial state $s_0$ from the initial state probability mass function (pmf).

  2. Get the state at time $t+1$: draw the new state $s_{t+1}$ from the pmf defined by the $s_t$-th row of the transition matrix.

  3. Repeat step 2 as many times as needed to draw states up to $s_N$.

Then you can repeat the entire procedure $X$ times in order to get as many sample paths as you wish.

This is very fast and easy to implement and it will give you what you want. Many scientific libraries in many languages have a built-in function for drawing a random sample from a pmf.
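The sampling procedure above can be sketched in a few lines of Python, using NumPy's `Generator.choice` to draw from a pmf (the function name and its `n_steps`/`n_paths` parameters are mine, for illustration):

```python
import numpy as np

def sample_paths(pi, A, n_steps, n_paths, seed=None):
    """Draw sample state sequences from an HMM's underlying Markov chain.

    pi : initial state pmf, shape (S,)
    A  : transition matrix, A[i, j] = P(state j | state i), shape (S, S)

    Returns an (n_paths, n_steps) array of state indices.
    """
    rng = np.random.default_rng(seed)
    S = len(pi)
    paths = np.empty((n_paths, n_steps), dtype=int)
    for p in range(n_paths):
        s = rng.choice(S, p=pi)        # draw s_0 from the initial pmf
        paths[p, 0] = s
        for t in range(1, n_steps):
            s = rng.choice(S, p=A[s])  # draw s_t from the s_{t-1}-th row of A
            paths[p, t] = s
    return paths
```

Note that this samples from the chain's distribution over paths; it does not enumerate the most probable ones, which is why it does not answer the top-N question above.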
