Deep Learning Classification Model for data with time dimension

I know this may be a generic question, but I would still appreciate some feedback. I have a dataset of 24,000 records, each with shape (5, 188, 188, 3), i.e. four dimensions (time, x, y, color). The time dimension holds 5 frames taken 15 minutes apart. I need to build a binary classification model from these images. Initially I used ResNet on the three non-time dimensions (x, y, color), which gave pretty decent results. Now I also want to include the time dimension, but I'm not sure ResNet still makes sense once time is included. Is there another network that can make use of the time dimension, or would adding a layer such as an LSTM to the model help? I have attached a sample GIF showing the 5 time steps for 1 color channel. Any suggestions will be appreciated.

Sample data:

[[[0.1798448 , 0.1735874 , 0.1735874 , ..., 0.04843931,
         0.05156801, 0.05469671],
        [0.17671609, 0.1798448 , 0.1829735 , ..., 0.05156801,
         0.05313236, 0.05156801],
        [0.18140915, 0.1861022 , 0.18140915, ..., 0.05156801,
         0.05313236, 0.05156801],
        ...,
       [[0.17515175, 0.17045869, 0.1735874 , ..., 0.05626106,
         0.05000366, 0.05626106],
        [0.17515175, 0.1735874 , 0.17202304, ..., 0.06564717,
         0.06408282, 0.06721152],
        [0.1735874 , 0.1735874 , 0.16889434, ..., 0.06721152,
         0.06251846, 0.06095411],
        ...,
        [0.06877588, 0.07034022, 0.06877588, ..., 0.01715229,
         0.01558794, 0.01715229],
        [0.07346892, 0.07190457, 0.06721152, ..., 0.01871664,
         0.01715229, 0.02028099],
        [0.07503328, 0.06721152, 0.06564717, ..., 0.01558794,
         0.02184534, 0.02184534]],

       [[0.1735874 , 0.17045869, 0.16889434, ..., 0.04843931,
         0.05313236, 0.05938977],
        [0.17515175, 0.17202304, 0.16732998, ..., 0.05938977,
         0.05938977, 0.06408282],
        [0.17045869, 0.16889434, 0.16576564, ..., 0.06564717,
         0.06251846, 0.06564717]]], dtype=float32)

Topic inceptionresnetv2 lstm image-classification deep-learning

Category Data Science


In your case, the time dimension is clearly meaningful, because there is motion from one frame to the next. I would therefore recommend models designed for video-like data, such as MetNet, ConvLSTM or PredRNN:

MetNet: https://ai.googleblog.com/2020/03/a-neural-weather-model-for-eight-hour.html

ConvLSTM: https://proceedings.neurips.cc/paper/2015/hash/07563a3fe3bbe7e3ba84431ad9d055af-Abstract.html

PredRNN: https://proceedings.neurips.cc/paper/2017/file/e5f6ad6ce374177eef023bf5d0c018b6-Paper.pdf
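
To give a rough idea of what the ConvLSTM option could look like for data shaped (5, 188, 188, 3), here is a minimal sketch using the `ConvLSTM2D` layer from Keras. The function name `build_convlstm_classifier`, the layer sizes and the optimizer are all assumptions chosen for illustration; this is not a tuned architecture and it is not taken from the papers above.

```python
import numpy as np
from tensorflow.keras import layers, models


def build_convlstm_classifier(time_steps=5, height=188, width=188, channels=3):
    """Hypothetical small ConvLSTM binary classifier for (time, y, x, color) clips."""
    inputs = layers.Input(shape=(time_steps, height, width, channels))
    # ConvLSTM2D convolves over space while carrying state across the frames.
    x = layers.ConvLSTM2D(16, kernel_size=3, padding="same",
                          return_sequences=True)(inputs)
    x = layers.BatchNormalization()(x)
    # return_sequences=False keeps only the output for the last time step.
    x = layers.ConvLSTM2D(32, kernel_size=3, padding="same",
                          return_sequences=False)(x)
    # Collapse the spatial map to one feature vector per clip.
    x = layers.GlobalAveragePooling2D()(x)
    # Single sigmoid unit for binary classification.
    outputs = layers.Dense(1, activation="sigmoid")(x)
    model = models.Model(inputs, outputs)
    model.compile(optimizer="adam", loss="binary_crossentropy",
                  metrics=["accuracy"])
    return model


if __name__ == "__main__":
    # Random stand-in data (assumption): 4 small clips with binary labels.
    X = np.random.rand(4, 5, 32, 32, 3).astype("float32")
    y = np.array([0, 1, 0, 1], dtype="float32")
    model = build_convlstm_classifier(height=32, width=32)
    model.fit(X, y, epochs=1, batch_size=2, verbose=0)
    print(model.predict(X, verbose=0).shape)
```

With the real dataset, the input array would keep its (24000, 5, 188, 188, 3) shape and the function would be called with its defaults. An alternative that reuses the existing ResNet is to apply it to each frame with `TimeDistributed` and feed the per-frame features to an LSTM; which of the two works better would have to be checked on the validation set.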

Hope this helps,

Nicolas
