Memory error: Python N-th order Markovian transition matrix from a given sequence
What is wrong with my code?
I am trying to calculate transition probabilities for each leg.
The code works for a small array, but for the actual dataset I get a memory error. I am running 64-bit Python and have maximized the available memory, so I believe I need help writing the code more efficiently.

    import numpy as np

    # sequence with 3 states - 0, 1, 2
    arr = [0, 1, 0, 0, 0, 2, 2, 1, 1, 1, 0, 0, 0, 0, 0, 1, 2, 2, 2, 0, 0, 2]

    def transition_matrix(arr, n=1):
        """
        Computes the transition matrix from Markov chain sequence of order `n`.
        :param arr: Discrete Markov chain state sequence in discrete time with states in 0, ..., N
        :param n: Transition order
        """
        # dense (N + 1) x (N + 1) matrix of transition counts
        M = np.zeros(shape=(max(arr) + 1, max(arr) + 1))
        for (i, j) in zip(arr, arr[1:]):
            M[i, j] += 1
        # normalize each row so that it sums to 1
        T = (M.T / M.sum(axis=1)).T
        return np.linalg.matrix_power(T, n)

    transition_matrix(arr=arr, n=1)  # n is the transition order
Again, the code works like a charm on small inputs, but when an array with more than 200K elements is given, a memory error occurs.
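For reference, here is a minimal sketch of a more memory-lean variant. It rests on an assumption that I have not verified on the real dataset: that the error comes from the dense `(max(arr) + 1) x (max(arr) + 1)` matrix (its size depends on the largest state label, not on the sequence length). The function name `transition_matrix_compact` is hypothetical. It relabels the states to `0, ..., K-1` with `np.unique`, so the matrix is only `K x K` for `K` distinct states, and it leaves rows without outgoing transitions at zero instead of dividing by zero.

```python
import numpy as np

def transition_matrix_compact(arr, n=1):
    """Sketch: same counting as above, but on compacted state labels.

    Returns (states, T_n), where states[i] is the original label of row/column i.
    """
    arr = np.asarray(arr).ravel()
    # Map the original labels to consecutive indices 0, ..., K-1.
    states, idx = np.unique(arr, return_inverse=True)
    k = len(states)
    counts = np.zeros((k, k), dtype=np.float64)
    # Unbuffered in-place add, so repeated (i, j) pairs are all counted.
    np.add.at(counts, (idx[:-1], idx[1:]), 1)
    row_sums = counts.sum(axis=1, keepdims=True)
    # Rows with no outgoing transitions stay all-zero.
    T = np.divide(counts, row_sums, out=np.zeros_like(counts), where=row_sums > 0)
    return states, np.linalg.matrix_power(T, n)

arr = [0, 1, 0, 0, 0, 2, 2, 1, 1, 1, 0, 0, 0, 0, 0, 1, 2, 2, 2, 0, 0, 2]
states, T = transition_matrix_compact(arr, n=1)
```

If the number of distinct states is itself very large, a `K x K` dense matrix will still not fit, and a sparse representation of the counts would be needed instead.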
Topic matrix probability markov-process python
Category Data Science