Layer notation for convolutional neural networks

When reading about convolutional neural networks (CNNs), I often come across a special notation used in the community and in scientific papers, describing the architecture of the network in terms of layers. However, I was not able to find a paper or resource describing this notation in detail.
Could someone explain to me the details or point to where it is described or "standardized"?

Examples:

  1. input−100C3−MP2−200C2−MP2−300C2−MP2−400C2−MP2−500C2−output
    (source)

  2. input−(300nC2−300nC2−MP2)_5−C2−C1−output
    (source)


A reasonable guess is that xCy denotes a convolutional layer (x is the number of filters? y is the side length of a square kernel?) and that MPz denotes a max-pooling layer (pool size z×z?).

But instead of guessing, I would love to have a reference (which I could possibly also reference in a paper).



One of the papers cited by the first paper you linked explains the notation in section 3 (Experiments):

2x48x48-100C5-MP2-100C5-MP2-100C4-MP2-300N-100N-6N represents a net with:

  • 2 input images of size 48x48
  • a convolutional layer with 100 maps and 5x5 filters
  • a max-pooling layer over non-overlapping regions of size 2x2
  • a convolutional layer with 100 maps and 5x5 filters
  • a max-pooling layer over non-overlapping regions of size 2x2
  • a convolutional layer with 100 maps and 4x4 filters
  • a max-pooling layer over non-overlapping regions of size 2x2
  • a fully connected layer with 300 hidden units
  • a fully connected layer with 100 hidden units
  • a fully connected layer with 6 neurons (one per class)
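To see how the feature-map size shrinks through the convolutional part of that net, here is a minimal sketch of the arithmetic. It assumes stride-1 convolutions without padding and pooling with stride equal to the pool size; the notation itself does not state this, so treat both as assumptions. The function name `trace_sizes` is made up for illustration.

```python
def trace_sizes(size, layers):
    """Return the side length of the feature maps after each layer.

    Assumptions (not part of the notation): convolutions use stride 1
    and no padding; max-pooling regions do not overlap.
    """
    sizes = [size]
    for kind, k in layers:
        if kind == "C":        # k x k convolution, no padding
            size = size - k + 1
        elif kind == "MP":     # k x k non-overlapping max-pooling
            size = size // k
        else:
            raise ValueError(f"unknown layer kind: {kind}")
        sizes.append(size)
    return sizes


# Convolutional part of 2x48x48-100C5-MP2-100C5-MP2-100C4-MP2-300N-100N-6N
layers = [("C", 5), ("MP", 2), ("C", 5), ("MP", 2), ("C", 4), ("MP", 2)]
sizes = trace_sizes(48, layers)
print(sizes)  # [48, 44, 22, 18, 9, 6, 3]
```

Under these assumptions the last pooling layer yields 100 maps of size 3x3, which are then fed to the 300N fully connected layer.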

From this, the answer to your question is:

  • 100C3 means a convolutional layer with 100 maps and 3x3 filters
  • MP2 means a max-pooling layer over non-overlapping regions of size 2x2
  • 200C2 means a convolutional layer with 200 maps and 2x2 filters
  • and so on for the other "C" layers: C means convolutional, the preceding integer is the number of feature maps, and the trailing integer is the filter size
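If it helps, the reading above can be turned into a small parser. This is only a sketch of the convention as described in this answer, not an official grammar; the function name `parse` and the tuple layout are my own choices, and tokens it does not recognise (such as input and output) are passed through unchanged.

```python
import re

# xCy = convolution, MPz = max-pooling, xN = fully connected
TOKEN = re.compile(r"^(?:(\d+)C(\d+)|MP(\d+)|(\d+)N)$")


def parse(spec):
    """Split a layer string on dashes and describe each token."""
    layers = []
    # The examples in the question use the Unicode minus sign (U+2212)
    for tok in spec.replace("\u2212", "-").split("-"):
        m = TOKEN.match(tok)
        if m is None:
            layers.append(("other", tok))
        elif m.group(1):
            layers.append(("conv", int(m.group(1)), int(m.group(2))))
        elif m.group(3):
            layers.append(("maxpool", int(m.group(3))))
        else:
            layers.append(("dense", int(m.group(4))))
    return layers


parsed = parse("input-100C3-MP2-200C2-MP2-300C2-MP2-400C2-MP2-500C2-output")
for layer in parsed:
    print(layer)
```

For the first example in the question, the second entry comes out as `("conv", 100, 3)`, i.e. 100 maps with 3x3 filters, and the third as `("maxpool", 2)`.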

According to the second paper you linked, the subscript _5 indicates that the block 300nC2−300nC2−MP2 is repeated five times (see section 3), and the n indicates that "the number of filters in the nth convolutional layer is [300]n". According to the accompanying model diagram (figure 3 in the linked paper), the C2 and C1 layers produce 1x1 output, meaning a scalar value. This would mean C2 is a convolutional layer with 1 map and a 2x2 filter and C1 is a convolutional layer with 1 map and a 1x1 filter (though I don't fully understand what this adds).
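As an illustration, here is one possible way to unroll the repeated block. It assumes that n counts the repetitions of the block (1 to 5), so both convolutions in the nth repetition get 300·n filters; the quoted phrase could also be read as counting individual convolutional layers, so take this as a guess rather than the paper's definition. The helper `expand` is hypothetical.

```python
def expand(block, repeats):
    """Unroll a repeated block, replacing 'n' with the repetition index.

    Assumption: n indexes the repetition of the whole block, starting at 1.
    """
    tokens = []
    for n in range(1, repeats + 1):
        for tok in block:
            if "n" in tok:
                base, rest = tok.split("n", 1)   # "300nC2" -> "300", "C2"
                tok = f"{int(base) * n}{rest}"
            tokens.append(tok)
    return tokens


expanded = expand(["300nC2", "300nC2", "MP2"], 5)
print("-".join(expanded))
```

Under that reading the first repetition is 300C2-300C2-MP2 and the fifth is 1500C2-1500C2-MP2.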
