ML methods for vector correlation
I am working with a time series of input flow sampled every 5 minutes over 441 days (63 weeks). I want to find out whether there is any correlation between data coming from:
The same day of the week
The same moment in time
I split the data by weekday and by hour. Then I computed a 63x63 correlation matrix for each weekday and a 441x441 matrix for each hour; the latter is too large to be practical. I feel that this way I can't give a clear, interpretable yes-or-no answer to whether such a correlation exists.
So my question is: can I use autocorrelation for this, and would its result translate into the parameters p and q of an ARIMA model? Or would you suggest another, more succinct approach that gives a broader picture of the data?
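import numpy as np
import pandas as pd

# df is assumed to be already loaded, with the flow readings in column 'Debit'
days, weeks = 7, 63
ds = df['Debit'].to_list()
ds = np.reshape(ds, (weeks, days, 288))  # axes: (week, weekday, 5-minute slot)

# For each hour k, collect that hour's 12 samples from each of the 441 days
arr_hours = [[] for _ in range(24)]
for k in range(24):
    for j in range(weeks):
        for i in range(days):
            arr_hours[k].append(ds[j, i, 12*k:12*(k+1)])

# Hour 0: one column per day, one row per sample -> 441x441 correlation matrix
dF = pd.DataFrame(arr_hours[0]).T
dF.corr()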
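The autocorrelation idea raised in the question can be sketched as follows. This is only a minimal illustration, not a tested solution for the real data: it uses a synthetic series as a stand-in for df['Debit'] (an assumption, since the actual readings are not shown), and the helper autocorr_at_lag is a hypothetical name introduced here. It evaluates the sample autocorrelation at just two lags, one day (288 samples) for "the same moment in time" and one week (2016 samples) for "the same day of the week".

```python
import numpy as np

SAMPLES_PER_DAY = 288                   # one sample every 5 minutes
SAMPLES_PER_WEEK = 7 * SAMPLES_PER_DAY  # 2016 samples


def autocorr_at_lag(x, lag):
    """Sample autocorrelation of a 1-D series at a single positive lag."""
    x = np.asarray(x, dtype=float)
    x = x - x.mean()
    return float(np.dot(x[:-lag], x[lag:]) / np.dot(x, x))


# Synthetic stand-in for df['Debit'] (assumption): daily cycle + weekly cycle + noise.
rng = np.random.default_rng(0)
t = np.arange(441 * SAMPLES_PER_DAY)
flow = (np.sin(2 * np.pi * t / SAMPLES_PER_DAY)
        + 0.5 * np.sin(2 * np.pi * t / SAMPLES_PER_WEEK)
        + 0.3 * rng.standard_normal(t.size))

r_day = autocorr_at_lag(flow, SAMPLES_PER_DAY)    # same moment, previous day
r_week = autocorr_at_lag(flow, SAMPLES_PER_WEEK)  # same moment, same weekday
print(f"lag 1 day:  {r_day:.3f}")
print(f"lag 1 week: {r_week:.3f}")
```

With the real series in place of flow, values close to zero at both lags would suggest no daily or weekly correlation, while clearly positive values would suggest that it exists. That reduces each 63x63 or 441x441 matrix to a single number per lag.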
Topic dataframe correlation time-series python
Category Data Science