Actually, the answers above seem to be wrong. Indeed, the naming was a big mess. However, it seems that it was cleared up in the paper that introduces Inception-v4 (see: "Inception-v4, Inception-ResNet and the Impact of Residual Connections on Learning"):

The Inception deep convolutional architecture was introduced as GoogLeNet in (Szegedy et al. 2015a), here named Inception-v1. Later the Inception architecture was refined in various ways, first by the introduction of batch normalization (Ioffe and Szegedy 2015) (Inception-v2). Later by additional factorization ideas in the third iteration (Szegedy et al. 2015b) which will be referred to as Inception-v3 in this report.


In the Batch Normalization paper (Ioffe and Szegedy, 2015), the authors took a variant of the GoogLeNet architecture from Going Deeper with Convolutions (i.e. Inception-v1) and introduced Batch Normalization into it (BN-Inception).

The main difference to the network described in (Szegedy et al., 2014) is that the 5x5 convolutional layers are replaced by two consecutive layers of 3x3 convolutions with up to 128 filters.
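As a minimal PyTorch sketch (not from either paper) of that replacement: a single 5x5 convolution versus two stacked 3x3 convolutions covering the same receptive field. The channel sizes and input shape here are placeholders chosen for illustration.

```python
import torch
import torch.nn as nn

in_ch, out_ch = 64, 128

# Original-style 5x5 convolution (padding keeps the spatial size unchanged).
conv5x5 = nn.Conv2d(in_ch, out_ch, kernel_size=5, padding=2)

# BN-Inception-style replacement: two consecutive 3x3 convolutions.
# Two stacked 3x3 kernels also cover a 5x5 receptive field but use
# fewer weights per channel pair (2 * 3*3 = 18 vs. 5*5 = 25).
conv3x3_stack = nn.Sequential(
    nn.Conv2d(in_ch, out_ch, kernel_size=3, padding=1),
    nn.ReLU(inplace=True),
    nn.Conv2d(out_ch, out_ch, kernel_size=3, padding=1),
    nn.ReLU(inplace=True),
)

x = torch.randn(1, in_ch, 28, 28)
print(conv5x5(x).shape, conv3x3_stack(x).shape)  # both: [1, 128, 28, 28]
```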

Then, in the paper Rethinking the Inception Architecture for Computer Vision, the authors proposed Inception-v2 and Inception-v3.

In Inception-v2, they introduced factorization (factorizing convolutions into smaller convolutions) and made some other minor changes to Inception-v1.

Note that we have factorized the traditional 7x7 convolution into three 3x3 convolutions
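To illustrate that quote, here is a hedged PyTorch sketch of factorizing a 7x7 convolution into three stacked 3x3 convolutions; the channel counts, strides, and the helper name factorized_7x7 are assumptions for illustration, not values taken from the paper.

```python
import torch.nn as nn

def factorized_7x7(in_ch: int, out_ch: int) -> nn.Sequential:
    # Three stacked 3x3 convolutions cover the same 7x7 receptive field as a
    # single 7x7 convolution while using fewer weights per channel pair
    # (3 * 3*3 = 27 vs. 7*7 = 49).
    return nn.Sequential(
        nn.Conv2d(in_ch, out_ch, kernel_size=3, padding=1),
        nn.ReLU(inplace=True),
        nn.Conv2d(out_ch, out_ch, kernel_size=3, padding=1),
        nn.ReLU(inplace=True),
        nn.Conv2d(out_ch, out_ch, kernel_size=3, padding=1),
        nn.ReLU(inplace=True),
    )
```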

As for Inception-v3, it is a variant of Inception-v2 which adds BN-auxiliary.

BN-auxiliary refers to the version in which the fully connected layer of the auxiliary classifier is also batch-normalized, not just the convolutions. We are referring to the model [Inception-v2 + BN-auxiliary] as Inception-v3.
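To make "BN-auxiliary" concrete, below is a hypothetical PyTorch sketch of an auxiliary classifier in which the fully connected layer is batch-normalized as well as the convolution. The layer sizes (128 channels, 1024 units, a 5x5 pooled grid) and the class name are illustrative guesses, not the paper's exact configuration.

```python
import torch.nn as nn

class BNAuxClassifier(nn.Module):
    def __init__(self, in_ch: int, num_classes: int = 1000):
        super().__init__()
        self.pool = nn.AdaptiveAvgPool2d(5)           # pool the side-branch features
        self.conv = nn.Conv2d(in_ch, 128, kernel_size=1)
        self.conv_bn = nn.BatchNorm2d(128)            # BN on the convolution
        self.fc = nn.Linear(128 * 5 * 5, 1024)
        self.fc_bn = nn.BatchNorm1d(1024)             # BN on the FC layer as well
        self.out = nn.Linear(1024, num_classes)

    def forward(self, x):
        x = self.conv_bn(self.conv(self.pool(x))).relu()
        x = x.flatten(1)
        x = self.fc_bn(self.fc(x)).relu()
        return self.out(x)
```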


Besides what was mentioned by daoliker:

Inception-v2 utilized a separable convolution as its first layer, with depth 64.

Quote from the paper:

Our model employed separable convolution with depth multiplier 8 on the first convolutional layer. This reduces the computational cost while increasing the memory consumption at training time.

Why is this important? Because it was dropped in v3, v4, and Inception-ResNet, but was re-introduced and heavily used in MobileNet later (a rough sketch follows below).
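For readers unfamiliar with the term, here is a rough PyTorch sketch of a depthwise-separable first layer with a depth multiplier, in the spirit of the quote above and of MobileNet. The kernel size, stride, and output width are assumptions, and the class name SeparableStem is made up for this example.

```python
import torch
import torch.nn as nn

class SeparableStem(nn.Module):
    def __init__(self, in_ch: int = 3, depth_multiplier: int = 8, out_ch: int = 64):
        super().__init__()
        mid_ch = in_ch * depth_multiplier  # 3 * 8 = 24 depthwise channels
        # Depthwise step: groups=in_ch gives each input channel its own filters.
        self.depthwise = nn.Conv2d(in_ch, mid_ch, kernel_size=3, stride=2,
                                   padding=1, groups=in_ch)
        # Pointwise step: a 1x1 convolution mixes the channels to the final depth.
        self.pointwise = nn.Conv2d(mid_ch, out_ch, kernel_size=1)

    def forward(self, x):
        return self.pointwise(self.depthwise(x))

x = torch.randn(1, 3, 224, 224)
print(SeparableStem()(x).shape)  # [1, 64, 112, 112]
```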


The answer can be found in the Rethinking the Inception Architecture for Computer Vision paper: https://arxiv.org/pdf/1512.00567v3.pdf

Check Table 3. Inception-v2 is the architecture described in that paper; Inception-v3 is the same architecture (with minor changes) trained with a different procedure (RMSProp, a label-smoothing regularizer, an auxiliary head with batch norm to improve training, etc.).
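As one concrete example of those training changes, the snippet below shows label smoothing via PyTorch's built-in cross-entropy option. The smoothing value 0.1 matches the epsilon reported in the Rethinking paper; the batch and class sizes are arbitrary placeholders.

```python
import torch
import torch.nn as nn

# Label-smoothing regularizer: soften the one-hot targets by epsilon = 0.1.
criterion = nn.CrossEntropyLoss(label_smoothing=0.1)

logits = torch.randn(8, 1000)            # a batch of 8 predictions, 1000 classes
targets = torch.randint(0, 1000, (8,))   # integer class labels
loss = criterion(logits, targets)
print(loss.item())
```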
