Hidden Markov Models: Linking states to labels after EM training
The tl;dr version first:
I have the following problem: I implemented Baum-Welch for ergodic HMMs. I do it like this:
I pass the model two numbers, C1 and C2; it builds a fully connected state machine with C1 states and C2 emissions. I map all tokens from my training data onto the range [0, C2) and each label the HMM is supposed to assign to a token during inference onto [0, C1). Then the HMM runs Baum-Welch learning. When it is done, it has configured its state machine to (locally) maximize the likelihood of the training data.
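Roughly like this (a simplified NumPy sketch, not my actual code; all names are made up):

```python
import numpy as np

def init_ergodic_hmm(C1, C2, seed=0):
    """Randomly initialize a fully connected HMM with C1 states and C2 emission symbols."""
    rng = np.random.default_rng(seed)
    pi = rng.random(C1)                     # initial state probabilities
    A = rng.random((C1, C1))                # transitions: every state -> every state
    B = rng.random((C1, C2))                # emission probabilities per state
    pi /= pi.sum()
    A /= A.sum(axis=1, keepdims=True)       # each row sums to 1
    B /= B.sum(axis=1, keepdims=True)
    return pi, A, B

# tokens are already mapped onto [0, C2), e.g. via a vocabulary dict
pi, A, B = init_ergodic_hmm(C1=12, C2=5000)
# ... Baum-Welch then re-estimates pi, A and B on the training sequences
```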
Now to my problem:
Assume I had two isomorphic initial state machines (isomorphic taking all the probabilities into account; structurally all the HMMs are isomorphic anyway, because they are ergodic). They differ only in their state IDs, i.e. the IDs have been permuted from one machine to the other. After training on the same data, both HMMs will be isomorphic again. That means there is absolutely no connection between the IDs I map my labels to and the IDs of the HMM's states. So how can I interpret the HMM after training? How do I know which state corresponds to which POS tag? It seems impossible, so I guess I am missing some crucial point.
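To illustrate what I mean by "no connection between the IDs": permuting the state IDs of a model leaves the likelihood of any observation sequence unchanged. A small sketch (hypothetical names, scaled forward algorithm) that demonstrates this:

```python
import numpy as np

def forward_loglik(pi, A, B, obs):
    """Scaled forward algorithm: log P(obs | pi, A, B)."""
    alpha = pi * B[:, obs[0]]
    loglik = np.log(alpha.sum())
    alpha /= alpha.sum()
    for o in obs[1:]:
        alpha = (alpha @ A) * B[:, o]
        loglik += np.log(alpha.sum())
        alpha /= alpha.sum()
    return loglik

rng = np.random.default_rng(1)
C1, C2 = 4, 6
pi = rng.dirichlet(np.ones(C1))
A = rng.dirichlet(np.ones(C1), size=C1)
B = rng.dirichlet(np.ones(C2), size=C1)
obs = rng.integers(0, C2, size=20)

perm = rng.permutation(C1)                          # scramble the state IDs
pi_p, A_p, B_p = pi[perm], A[np.ix_(perm, perm)], B[perm]

# identical likelihoods: the state IDs carry no intrinsic meaning
print(np.isclose(forward_loglik(pi, A, B, obs),
                 forward_loglik(pi_p, A_p, B_p, obs)))  # -> True
```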
Now a little more detail if the above was unclear:
I take my training data (texts, e.g. newspaper articles) and count the number of different words (types).
Then I pass count(types) and count(labels) to the model, the labels being a set of POS tags. It then randomly constructs a probabilistic fully connected state machine with pow(count(labels), order_of_model) different states, order_of_model being the number of hidden variables (POS-tag n-grams) that get combined into an individual state. It also assigns each of these states an initial probability and an emission probability for each of the types.
The model assumes a mapping from [0, pow(count(labels), order_of_model)) as state IDs onto external tuples of labels, and a mapping from [0, count(types)) for the emissions onto words.
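The mapping between state IDs and label tuples I have in mind is just a base-count(labels) encoding, roughly like this (again a simplified sketch, identifiers are hypothetical):

```python
def tuple_to_state_id(label_tuple, num_labels):
    """Encode an n-gram of label indices as a single state ID in [0, num_labels**order)."""
    sid = 0
    for lab in label_tuple:
        sid = sid * num_labels + lab
    return sid

def state_id_to_tuple(sid, num_labels, order):
    """Decode a state ID back into its n-gram of label indices."""
    labels = []
    for _ in range(order):
        labels.append(sid % num_labels)
        sid //= num_labels
    return tuple(reversed(labels))

# e.g. with 12 POS tags and order 2: state 37 <-> the tag bigram (3, 1)
assert tuple_to_state_id((3, 1), 12) == 37
assert state_id_to_tuple(37, 12, 2) == (3, 1)
```

But since training can permute the state IDs arbitrarily, this mapping is exactly the part that seems meaningless to me after Baum-Welch has run.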