1D Sequence Classification using Circular Dilated Convolutional Neural Networks

Question

Kevin

April 16, 2022, 13:51

I am working on a multiclass classification task on long 1D sequences. The sequence length varies between $512$ and $512 \cdot 60$ timesteps; a slice of $100$ timesteps might look like this:

Which model architecture is currently best suited to this task, with training that minimizes the cross-entropy loss? I have read some papers that use a CNN-LSTM for it, but are there better-suited architectures? I have considered dilated causal convolutions, which are used in the WaveNet paper: https://arxiv.org/abs/1609.03499. There is also a recent paper on Circular Dilated Convolutional Neural Networks (CDIL-CNN): https://arxiv.org/abs/2201.02143. How would one implement one of these convolutional methods in PyTorch?

The CDIL-CNN model is available on GitHub: https://github.com/LeiCheng-no/CDIL-CNN/blob/main/cdil.py. The input shape is (Batch Size, Sequence Length, Features), and the output shape should be (Batch Size, Num Classes) before it is passed to the loss function. I ran the following from the root directory of that repository:

import torch
from cdil import *

input_shape = (32, 1024, 1) # batch size, timesteps, features
X = torch.randint(0, 14, input_shape).type(torch.float) # sample input

model = CDIL_ConvPart(num_inputs=1024, num_channels=[20, 20, 20], kernel_size=3)
model(X) # forward pass

But I get the following error:

AssertionError: Padding value causes wrapping around more than once.

I tried changing the num_inputs keyword and swapping the time and feature dimensions, which runs without error:

model = CDIL_ConvPart(num_inputs=1, num_channels=[20, 20, 20], kernel_size=3)
model(X.permute(0,2,1))

Does num_inputs refer to the hidden dimension? If so, is it recommended to use an encoder/embedding layer before feeding in the input? What are the recommended hyper-parameters in this setting?
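One hyper-parameter that can be reasoned about directly is the depth. The sketch below assumes stride-1 convolutions whose dilation doubles at every layer (1, 2, 4, ...), as in WaveNet-style stacks; the helper names `receptive_field` and `layers_needed` are hypothetical and not part of any library. Under that assumption, each layer adds `(kernel_size - 1) * dilation` timesteps to the receptive field, so the number of layers needed to span a sequence grows only logarithmically with its length.

```python
def receptive_field(num_layers: int, kernel_size: int = 3) -> int:
    """Receptive field of stacked stride-1 convolutions with dilations 1, 2, 4, ..."""
    field = 1
    for layer in range(num_layers):
        dilation = 2 ** layer  # assumption: dilation doubles per layer
        field += (kernel_size - 1) * dilation
    return field


def layers_needed(sequence_length: int, kernel_size: int = 3) -> int:
    """Smallest number of layers whose receptive field covers the whole sequence."""
    num_layers = 1
    while receptive_field(num_layers, kernel_size) < sequence_length:
        num_layers += 1
    return num_layers


# Shortest and longest sequence lengths mentioned in the question.
for length in (512, 512 * 60):
    n = layers_needed(length)
    print(f"length={length}: {n} layers, receptive field {receptive_field(n)}")
```

With kernel size 3 this gives 9 layers for 512 timesteps and 14 layers for $512 \cdot 60$ timesteps, so the three layers in `num_channels=[20, 20, 20]` would see only 15 timesteps per output position if the same doubling scheme applies.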

More generally, would this kind of architecture be useful for my task, and does it make recurrent neural networks (such as LSTMs) obsolete? How would CDIL-CNN be implemented for my task?
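For orientation, here is a minimal sketch of a circular dilated convolution classifier built only from standard PyTorch modules. It is not the code from the CDIL-CNN repository: the class name `CircularDilatedClassifier`, the ReLU activations, the mean pooling over time, and `num_classes=5` are all assumptions made for illustration. It relies on the fact that `torch.nn.Conv1d` expects input shaped (batch, channels, length), which would explain why permuting the input and setting the channel count to 1 works, provided `num_inputs` is passed through as the convolution's input-channel count.

```python
import torch
import torch.nn as nn


class CircularDilatedClassifier(nn.Module):
    """Hypothetical sketch of a circular dilated CNN; not the repository's model."""

    def __init__(self, in_channels, hidden_channels, num_classes, kernel_size=3):
        super().__init__()
        layers = []
        channels = in_channels
        for i, out_channels in enumerate(hidden_channels):
            dilation = 2 ** i  # assumption: dilation doubles per layer
            layers.append(
                nn.Conv1d(
                    channels,
                    out_channels,
                    kernel_size,
                    # Symmetric padding keeps the sequence length unchanged
                    # (exact for odd kernel sizes).
                    padding=dilation * (kernel_size - 1) // 2,
                    dilation=dilation,
                    padding_mode="circular",  # wrap around the sequence ends
                )
            )
            layers.append(nn.ReLU())
            channels = out_channels
        self.conv = nn.Sequential(*layers)
        self.head = nn.Linear(channels, num_classes)

    def forward(self, x):
        # x: (batch, timesteps, features); Conv1d wants (batch, features, timesteps).
        h = self.conv(x.permute(0, 2, 1))
        # Average over time so any sequence length maps to a fixed-size vector.
        return self.head(h.mean(dim=-1))


model = CircularDilatedClassifier(in_channels=1, hidden_channels=[20, 20, 20], num_classes=5)
X = torch.randn(32, 1024, 1)         # batch size, timesteps, features
y = torch.randint(0, 5, (32,))       # placeholder class labels
logits = model(X)
loss = nn.CrossEntropyLoss()(logits, y)
print(logits.shape)  # torch.Size([32, 5])
```

Because the pooling averages over the time axis, the same model accepts batches of any sequence length, as long as the length is at least as large as the largest padding (4 in this configuration); circular padding larger than the sequence is what PyTorch rejects with the wrap-around error.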

Topic lstm convolutional-neural-network rnn multiclass-classification sequence

Category Data Science
