Binary Classification Comparing two time series of variable length

Is there a machine learning model (something like an LSTM or a 1D-CNN) that takes two time series of variable length as input and outputs a binary classification (True/False: do the two time series have the same label)?

So the data would look something like the following

date        value label
2020-01-01  2     0     # first input time series
2020-01-02  1     0     # first input time series
2020-01-03  1     0     # first input time series
2020-01-01  3     1     # second input time series
2020-01-03  1     1     # second input time series
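
To make the setup concrete, here is a minimal sketch of how such a table could be turned into training pairs, where the target is 1 if both series share a label and 0 otherwise. The `series_id` column and the helper names `group_series` and `make_pairs` are assumptions added for illustration; the table above does not say how the series are told apart.

```python
from itertools import combinations

# Toy rows mirroring the table above: (series_id, date, value, label).
# series_id is a hypothetical column used to tell the series apart.
rows = [
    ("a", "2020-01-01", 2, 0),
    ("a", "2020-01-02", 1, 0),
    ("a", "2020-01-03", 1, 0),
    ("b", "2020-01-01", 3, 1),
    ("b", "2020-01-03", 1, 1),
]

def group_series(rows):
    """Collect the values and the label of each series, in date order."""
    series = {}
    for series_id, date, value, label in sorted(rows, key=lambda r: (r[0], r[1])):
        values, _ = series.setdefault(series_id, ([], label))
        values.append(value)
    return series

def make_pairs(series):
    """Return (values_1, values_2, same_label) for every unordered pair of series."""
    pairs = []
    for (_, (v1, l1)), (_, (v2, l2)) in combinations(series.items(), 2):
        pairs.append((v1, v2, int(l1 == l2)))
    return pairs

series = group_series(rows)
pairs = make_pairs(series)
```

With more than two series, `make_pairs` yields every unordered pair, so the number of training examples grows quadratically with the number of series.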

Is there something like that available out of the box, and if not how would you build a minimal working example model in Keras?

My best guess is to use a shared LSTM layer for both inputs and Concatenate both resulting vectors before feeding to the final Dense layer.

import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers

n_lstm_blocks = 50  # size of the LSTM encoding

# Two inputs: unknown timespan, fixed feature size 1
input_1 = keras.Input(shape=(None, 1))
input_2 = keras.Input(shape=(None, 1))

# The same LSTM instance (shared weights) encodes both series
shared_lstm = layers.LSTM(n_lstm_blocks)
encode_1 = shared_lstm(input_1)
encode_2 = shared_lstm(input_2)

# Join both encodings and predict the probability of "same label"
concat = layers.concatenate([encode_1, encode_2])
output = layers.Dense(1, activation='sigmoid')(concat)
model = keras.Model(inputs=[input_1, input_2], outputs=output)
model.compile(optimizer='adam', loss='binary_crossentropy')
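
The model above accepts any number of time steps, but all series inside one batch still need a common length. A minimal sketch of one way to handle that is to right-pad each batch and let a `Masking` layer tell the LSTM to skip the padded steps. The `pad_batch` helper, the `Masking` layer, and the toy data are assumptions added here, not part of the original model. Note that with `mask_value=0.0` a genuine observation of exactly 0 would also be skipped, so a different pad value may be needed for real data.

```python
import numpy as np
from tensorflow import keras
from tensorflow.keras import layers

def pad_batch(sequences, pad_value=0.0):
    """Right-pad 1-D sequences to a common length; returns shape (batch, time, 1)."""
    max_len = max(len(s) for s in sequences)
    batch = np.full((len(sequences), max_len, 1), pad_value, dtype="float32")
    for i, s in enumerate(sequences):
        batch[i, :len(s), 0] = s
    return batch

# Same architecture as above, plus a shared Masking layer (an addition).
input_1 = keras.Input(shape=(None, 1))
input_2 = keras.Input(shape=(None, 1))
masking = layers.Masking(mask_value=0.0)  # padded steps are ignored by the LSTM
shared_lstm = layers.LSTM(50)
encode_1 = shared_lstm(masking(input_1))
encode_2 = shared_lstm(masking(input_2))
concat = layers.concatenate([encode_1, encode_2])
output = layers.Dense(1, activation="sigmoid")(concat)
model = keras.Model(inputs=[input_1, input_2], outputs=output)
model.compile(optimizer="adam", loss="binary_crossentropy")

# Made-up toy pairs; target 1 means "same label", 0 means "different label".
first = [[2, 1, 1], [3, 1], [1, 2, 3, 4]]
second = [[3, 1], [2, 2, 1], [4, 3]]
same_label = np.array([[0.0], [1.0], [0.0]], dtype="float32")

x1, x2 = pad_batch(first), pad_batch(second)
model.fit([x1, x2], same_label, epochs=1, verbose=0)
probs = model.predict([x1, x2], verbose=0)  # one probability per pair
```

The two inputs are padded independently, so `x1` and `x2` may have different time dimensions within the same batch.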

A comparable task would be Siamese networks / one-shot learning, as used for face recognition. But in this case the task is to compare two time series and detect whether they have the same label; knowing each label is NOT the task of the network!
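
For comparison, a Siamese-style variant could replace the concatenation with an element-wise distance between the two encodings, which makes the prediction independent of the order of the inputs. The following is only a sketch under that assumption: the squared difference and the layer size of 32 are illustrative choices, not something the question prescribes.

```python
import numpy as np
from tensorflow import keras
from tensorflow.keras import layers

input_1 = keras.Input(shape=(None, 1))
input_2 = keras.Input(shape=(None, 1))

# One encoder with shared weights maps each series to a fixed-size vector.
encoder = layers.LSTM(32)
h1 = encoder(input_1)
h2 = encoder(input_2)

# Element-wise squared difference: identical for (h1, h2) and (h2, h1).
diff = layers.Subtract()([h1, h2])
squared_diff = layers.Multiply()([diff, diff])

output = layers.Dense(1, activation="sigmoid")(squared_diff)
siamese = keras.Model(inputs=[input_1, input_2], outputs=output)
siamese.compile(optimizer="adam", loss="binary_crossentropy")

# Random toy batches of different lengths, only to show the symmetry.
a = np.random.rand(2, 5, 1).astype("float32")
b = np.random.rand(2, 3, 1).astype("float32")
p_ab = siamese.predict([a, b], verbose=0)
p_ba = siamese.predict([b, a], verbose=0)
```

Because the distance is symmetric, `p_ab` and `p_ba` agree, whereas the concatenation-based model can give different answers when the two inputs are swapped.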

Topic siamese-networks machine-learning-model keras classification time-series

Category Data Science


I think both an LSTM and a 1D-CNN could work; which one is better depends on your data.

In your code you use a shared LSTM, which implies that you treat the two input sequences as different-length samples of the same variable. If that's the case, why not use a single LSTM to predict the label of each series and then compare the two predicted labels?

If that's not the case, you can use a separate (non-shared) LSTM for each input.

To my knowledge, a 1D-CNN that ends in a Flatten layer needs every sample to have the same sequence length; with a global pooling layer it can also take variable lengths. So if you have two types of sequences with fixed lengths $l_1$ and $l_2$ across samples, you can try the CNN approach.
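
As a caveat to that restriction, here is a minimal sketch of a 1D-CNN encoder that takes sequences of any length by reducing over the time axis with global max pooling. The filter count and kernel size are arbitrary illustrative values.

```python
import numpy as np
from tensorflow import keras
from tensorflow.keras import layers

# Time dimension left as None; one feature per time step.
inp = keras.Input(shape=(None, 1))
x = layers.Conv1D(16, kernel_size=3, padding="same", activation="relu")(inp)
# Global pooling collapses the time axis, so the output size is always 16.
x = layers.GlobalMaxPooling1D()(x)
encoder = keras.Model(inp, x)

short = np.random.rand(1, 5, 1).astype("float32")
long = np.random.rand(1, 9, 1).astype("float32")
emb_short = encoder.predict(short, verbose=0)
emb_long = encoder.predict(long, verbose=0)
```

Both embeddings have the same size, so such an encoder could stand in for the LSTM in a two-input model. Series within one batch still have to be padded to a common length, and unlike a masked LSTM the convolution does see the padded values.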

If you have two types of sequences, both with variable length, then you will have to use one LSTM with a dynamic-length input for each of them and then concatenate the outputs of the two LSTMs.
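
A minimal sketch of that two-encoder variant, assuming one feature per time step for both sequence types; the layer sizes 16 and 8 are arbitrary and differ only to show that the encoders are independent.

```python
import numpy as np
from tensorflow import keras
from tensorflow.keras import layers

input_a = keras.Input(shape=(None, 1))  # first sequence type, any length
input_b = keras.Input(shape=(None, 1))  # second sequence type, any length

# Two separate LSTM instances, so no weights are shared between the inputs.
encode_a = layers.LSTM(16)(input_a)
encode_b = layers.LSTM(8)(input_b)

concat = layers.concatenate([encode_a, encode_b])
output = layers.Dense(1, activation="sigmoid")(concat)
two_tower = keras.Model(inputs=[input_a, input_b], outputs=output)
two_tower.compile(optimizer="adam", loss="binary_crossentropy")

# Random toy batch with different lengths for the two inputs.
xa = np.random.rand(4, 7, 1).astype("float32")
xb = np.random.rand(4, 3, 1).astype("float32")
preds = two_tower.predict([xa, xb], verbose=0)
```

Compared with the shared-LSTM model in the question, this roughly doubles the number of encoder weights, since each LSTM learns its own kernel, recurrent kernel, and bias.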
