Real purpose of pooling

Recently I had a doubt about what the real purpose of pooling layers in neural networks is. The most common answers are:

  • To select the most important feature
  • To increase the receptive field of the network

I feel that these are not real reasons for using a pooling layer because

  • There is no real need to select important features because the fully connected layer at the very end could be used to identify the most important features

  • The receptive field could be increased by increasing the kernel size in the successive layers.

So the only real reason for using pooling is to reduce the size of the feature representation, leading to a smaller memory and computational footprint as the network gets deeper.
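To make the footprint argument concrete, here is a rough sketch (the channel count and spatial size are made-up numbers) of how much a single 2x2 max pool shrinks an activation tensor:

    import torch
    import torch.nn as nn

    x = torch.randn(1, 64, 56, 56)           # hypothetical activation: 64 channels, 56x56

    pooled = nn.MaxPool2d(kernel_size=2)(x)  # 2x2 max pool, stride 2

    print(x.numel())       # 200704 values to keep in memory
    print(pooled.numel())  # 50176 -> 4x fewer activations, and every later
                           # convolution slides over 1/4 as many positions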

Do you agree with this reasoning? Do you feel there is any other reason as well?

Topic pooling cnn image-classification



Both your original intuitions and the other answers contain important and valid points:

  • Reduce the feature map size, hence reducing the overall computational needs.
  • Give flexibility by filtering the important features from the unimportant ones, increasing the receptive field and reducing the risk of overfitting.

However, as you pointed out, the same effects could be achieved by other means.

The key point the other answers miss is this: pooling delivers those benefits while having zero trainable parameters and being cheap to compute.
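As a quick sketch of that point (the 64-channel strided convolution is just a hypothetical alternative downsampler, not anything from the question), compare the parameter counts of two ways of halving resolution:

    import torch.nn as nn

    # A 2x2 max pool halves spatial resolution with zero learnable parameters.
    pool = nn.MaxPool2d(kernel_size=2)

    # A strided convolution can also halve resolution, but it adds weights to train
    # (64*64*3*3 + 64 = 36928 parameters for this hypothetical 64-channel layer).
    strided = nn.Conv2d(64, 64, kernel_size=3, stride=2, padding=1)

    print(sum(p.numel() for p in pool.parameters()))     # 0
    print(sum(p.numel() for p in strided.parameters()))  # 36928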


  • Pooling layers are used to reduce the dimensions of the feature maps. Thus, they reduce the number of parameters to learn and the amount of computation performed in the network.
  • The pooling layer summarises the features present in a region of the feature map generated by a convolution layer. So, further operations are performed on summarised features instead of precisely positioned features generated by the convolution layer. This makes the model more robust to variations in the position of the features in the input image (see the sketch after this list).
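As a rough illustration of that robustness (the 8x8 map and the pixel positions are invented for the example), shifting a single feature by one pixel changes the raw map but can leave the max-pooled summary untouched:

    import torch
    import torch.nn.functional as F

    # Two feature maps with the same activation shifted by one pixel.
    a = torch.zeros(1, 1, 8, 8); a[0, 0, 4, 4] = 1.0
    b = torch.zeros(1, 1, 8, 8); b[0, 0, 5, 5] = 1.0

    print(torch.equal(a, b))  # False: the raw maps differ

    pa = F.max_pool2d(a, kernel_size=2)
    pb = F.max_pool2d(b, kernel_size=2)
    print(torch.equal(pa, pb))  # True: both pixels fall into the same 2x2 window,
                                # so the summarised maps are identical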

So your conclusion can partly be agreed with, but without empirical evidence we should not treat the memory-optimization hypothesis as the whole story.


You wrote that the only real reason for using pooling is to reduce the size of the feature representation, leading to a smaller memory and computational footprint as the network gets deeper. Yes, cost efficiency, but also:

Generalisation: we get rid of small, unimportant details when we combine several values into one representative value. Hence what you really get is a reduced chance of overfitting.
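A toy sketch of that combining effect (the numbers are invented): two activation rows that differ only in small details collapse to the same pooled output, so those details never reach the later layers:

    import numpy as np

    # A clean activation row and the same row with small, unimportant perturbations.
    clean = np.array([0.0, 0.9, 0.0, 0.8, 0.0, 0.7])
    noisy = clean + np.array([0.05, 0.0, 0.03, 0.0, 0.02, 0.0])

    max_pool = lambda v: v.reshape(-1, 2).max(axis=1)  # 1-D max pool, window size 2

    print(max_pool(clean))  # [0.9 0.8 0.7]
    print(max_pool(noisy))  # [0.9 0.8 0.7] -- the small details are absorbed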


I think these common answers are quite right but abstract. In my opinion, pooling "selects the most important feature" or "increases the receptive field" by dropping nearby information, as max pooling does. This makes sense because in some cases, such as image classification, we only need a few important features to classify, so we can drop redundant local features. Alternatively, with average pooling we can make the feature maps more stable to local change.
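A minimal sketch of that difference (the 2x2 patch values are arbitrary): max pooling keeps only the strongest local response, while average pooling blends the whole region:

    import torch
    import torch.nn.functional as F

    patch = torch.tensor([[[[0.1, 0.2],
                            [0.3, 0.9]]]])  # one 2x2 region of a feature map

    print(F.max_pool2d(patch, 2))  # 0.9: keep only the strongest response
    print(F.avg_pool2d(patch, 2))  # 0.375: smooth over the region, less sensitive
                                   # to any single local value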

The last FC layer does select the most important features to use. However, if every layer's output is more useful for the final task, you can get a more accurate result: we want the earlier layers to learn reliable features, and the FC layer just maps them.

As for your second idea, the receptive field could indeed be increased by using a larger kernel size, but with a larger kernel you lose local features. That is why dilated convolution was proposed: it increases the receptive field without losing information the way pooling does.
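Here is a rough sketch of that trade-off (channel count and input size are hypothetical): a dilated 3x3 convolution spreads its nine weights over a 5x5 neighbourhood, growing the receptive field without pooling and without shrinking the feature map:

    import torch
    import torch.nn as nn

    x = torch.randn(1, 8, 32, 32)  # hypothetical feature map: 8 channels, 32x32

    # Standard 3x3 convolution: each output sees a 3x3 neighbourhood.
    conv3 = nn.Conv2d(8, 8, kernel_size=3, padding=1)

    # Dilated 3x3 convolution (dilation=2): still 9 weights per filter, but the taps
    # are spread over a 5x5 neighbourhood, so the receptive field grows.
    dilated = nn.Conv2d(8, 8, kernel_size=3, padding=2, dilation=2)

    print(conv3(x).shape, dilated(x).shape)  # both keep the 32x32 resolution
    print(sum(p.numel() for p in conv3.parameters()),
          sum(p.numel() for p in dilated.parameters()))  # same parameter count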

By the way, in recent years many researchers don't use pooling at all, or only use it at the end of the network, as Hinton said:

The pooling operation used in convolutional neural networks is a big mistake and the fact that it works so well is a disaster.


Both the common answer and your analogy are wrong. Pooling layers are added after convolution and ReLU layers. Those layers produce feature maps that tell you where the important features are in the image. The problem with these feature maps is that they record the exact position of each feature, so a minor change in the image can produce a very different feature map. To solve this problem you can downsample the feature maps, which is what pooling layers do. Try reading about pooling and its types, like max pooling and average pooling: by downsampling, they reduce the feature map's sensitivity to the exact position of the features.
