CNNs are applicable wherever the input signal contains spatial information, that is, wherever neighbouring entries are related to each other. For example:

  • 1-D signal: Recorded audio, such as a voice recording, is an example of a one-dimensional input in which adjacent entries are related. Neighbouring samples form local patterns that are valuable for tasks such as classification. You can use 1-D convolutional layers for these signals.
  • 2-D signal: Images are an example of this kind, although they may have several channels, such as RGB. Adjacent pixels tend to have similar values, and beyond that they form local patterns that may repeat many times across the image. Consequently, 2-D convolutional layers can be used for these signals.
  • 3-D signal: Videos are an example of this kind. Besides the similarity you can find inside each frame, consecutive frames along the time axis can share meaningful patterns that repeat. You can use 3-D convolutional layers for these signals.
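
To make the idea of a shared local pattern concrete, here is a minimal sketch in plain Python, with no deep learning framework. The functions `conv1d` and `conv2d`, the toy signal, the toy image and the edge kernels are all made up for illustration. They compute a "valid" cross-correlation, which is the operation that deep learning libraries usually call convolution.

```python
def conv1d(signal, kernel):
    """Valid 1-D cross-correlation: slide the kernel over adjacent entries."""
    k = len(kernel)
    return [
        sum(signal[i + j] * kernel[j] for j in range(k))
        for i in range(len(signal) - k + 1)
    ]


def conv2d(image, kernel):
    """Valid 2-D cross-correlation over a single-channel image (list of rows)."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(image[r + i][c + j] * kernel[i][j]
                for i in range(kh) for j in range(kw))
            for c in range(out_w)
        ]
        for r in range(out_h)
    ]


# Hypothetical 1-D signal: the same rising edge appears at two positions.
signal = [0, 0, 1, 1, 0, 0, 1, 1]
edge_kernel = [-1, 1]  # positive where the value jumps from low to high
response_1d = conv1d(signal, edge_kernel)

# Hypothetical 4x4 image with a vertical edge between columns 1 and 2.
image = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
]
vertical_edge_kernel = [[-1, 1], [-1, 1]]
response_2d = conv2d(image, vertical_edge_kernel)

print(response_1d)
print(response_2d)
```

The same small kernel responds at every position where the pattern occurs, which is why one set of weights can be reused across the whole input. A 3-D convolution follows the same scheme with one more nested loop over the time axis.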

For structured data, such as the rows of a relational database in which each column holds a specific feature, it is not meaningful to use convolutional layers. The reason is that such data has no spatial information. Adjacent rows should not share a common concept, otherwise they would be redundant, and they are not spatially related to each other. For structured data, people typically use dense layers.
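
One way to see the lack of spatial structure is to reorder the columns of a record, which does not change what the record means. The sketch below is only an illustration under made-up assumptions: the feature names, the values, the weights and the helpers `dense` and `conv1d` are all hypothetical. A dense unit gives the same output once its weights are reordered to match, whereas a convolution kernel sees entirely different neighbours.

```python
def dense(row, weights, bias=0.0):
    """Single dense unit: one weight per feature, no notion of neighbours."""
    return sum(x * w for x, w in zip(row, weights)) + bias


def conv1d(row, kernel):
    """Valid 1-D cross-correlation over adjacent entries of the row."""
    k = len(kernel)
    return [
        sum(row[i + j] * kernel[j] for j in range(k))
        for i in range(len(row) - k + 1)
    ]


# Hypothetical record: [age, income_k, num_children, tenure_years]
row = [35.0, 60.0, 2.0, 5.0]
weights = [0.1, 0.02, -0.5, 0.3]  # made-up dense weights, one per column
kernel = [1.0, -1.0]              # made-up kernel comparing adjacent columns

# Reorder the columns; the record itself carries the same information.
perm = [2, 0, 3, 1]
row_perm = [row[i] for i in perm]
weights_perm = [weights[i] for i in perm]

# The dense output is unchanged when the weights follow the columns.
dense_same = abs(dense(row, weights) - dense(row_perm, weights_perm)) < 1e-9

# The convolution output depends on which columns happen to be adjacent.
conv_original = conv1d(row, kernel)
conv_permuted = conv1d(row_perm, kernel)

print(dense_same)
print(conv_original, conv_permuted)
```

Because the column order of a table is arbitrary, anything a kernel learns about "adjacent" columns is an accident of that order, which is consistent with using dense layers for this kind of data.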
