Unequal-length observation sequences when training a Hidden Markov Model
I want to train a sequence classifier with a Hidden Markov Model. The length of the observation sequences is not fixed. I tried some HMM packages, such as Matlab's HMM toolbox and Kevin Murphy's library. All of them seem to require the user to specify the sizes of the transition probability matrix and the emission probability matrix.
My understanding is that for a Hidden Markov Model (HMM), the sizes of the transition probability matrix and the emission probability matrix depend on the number of hidden states and the length of the observation sequence.
For example, if:
states = ('Rainy','Sunny')
observations = ('walk', 'shop', 'clean')
Here the number of states is 2 and the length of the observation sequence is 3, so the transition probability matrix would be 2x2 and the emission probability matrix would be 2x3.
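To make these sizes concrete, here is a minimal sketch of the two matrices in Python, assuming NumPy is available. The probability values are made up for illustration and do not come from any trained model. Rows index the hidden states, and the columns of the emission matrix index the three symbols listed in `observations`.

```python
import numpy as np

states = ('Rainy', 'Sunny')
observations = ('walk', 'shop', 'clean')

# Illustrative (made-up) probabilities; each row sums to 1.
# transition[i, j]: probability of moving from state i to state j.
transition = np.array([[0.7, 0.3],
                       [0.4, 0.6]])

# emission[i, k]: probability of emitting symbol k while in state i.
emission = np.array([[0.1, 0.4, 0.5],
                     [0.6, 0.3, 0.1]])

print(transition.shape)  # (2, 2)
print(emission.shape)    # (2, 3)
```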
What if the length of the observation sequence is not fixed?
For example:
observation 1 = ('walk', 'shop', 'clean')
observation 2 = ('walk', 'shop', 'clean','eat pizza')
observation 3 = ('walk', 'shop', 'clean','drink beer','eat pizza')
...and so on
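For reference, the three sequences above could be written down as follows. This is only a sketch in plain Python, not tied to any HMM toolbox; the names `symbol_to_index`, `encoded` and `lengths` are hypothetical and chosen for illustration. It encodes each symbol as an integer and reports both the length of each sequence and the number of distinct symbols across all of them.

```python
sequences = [
    ('walk', 'shop', 'clean'),
    ('walk', 'shop', 'clean', 'eat pizza'),
    ('walk', 'shop', 'clean', 'drink beer', 'eat pizza'),
]

# Assign each distinct symbol an integer index in order of first appearance
# (an assumed encoding; any consistent mapping would do).
symbol_to_index = {}
for seq in sequences:
    for symbol in seq:
        symbol_to_index.setdefault(symbol, len(symbol_to_index))

# Integer-coded sequences; they keep their original, unequal lengths.
encoded = [[symbol_to_index[s] for s in seq] for seq in sequences]
lengths = [len(seq) for seq in encoded]

print(lengths)               # [3, 4, 5]
print(len(symbol_to_index))  # 5
```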
What is the size of the emission probability matrix in this case? Or can I simply make the observation sequences the same length by padding them with zeros?
Topic markov-hidden-model machine-learning
Category Data Science