What does "expansion layer" mean?

Recently, I came across the term expansion layer in the following paper:

Liu, Ze, et al. Swin Transformer: Hierarchical Vision Transformer using Shifted Windows. arXiv preprint arXiv:2103.14030 (2021).

The term is used in the context of the multilayer perceptron (MLP). I have tried to figure out its meaning on my own, but I was not able to find anything in particular.

I also found the term expansion ratio (again in an MLP context) in this paper:

Wu, Haiping, et al. CvT: Introducing Convolutions to Vision Transformers. arXiv preprint arXiv:2103.15808 (2021).

So, what does expansion layer mean? And what is the expansion ratio? Thanks in advance.

Topic transformer mlp

Category Data Science


They are most likely referring to the MLP (feed-forward) block inside each transformer layer. That block consists of two linear layers: the first one, the expansion layer, maps the input from dimension C to a larger hidden dimension r*C, and the second one projects it back down to C. The factor r is the expansion ratio; it is 4 in Swin, and values around 4 are typical for this family of models. Separately, remember that in Swin, patch merging stitches together 2x2 neighbouring patches at the start of each stage, and the channel dimension is doubled (C, 2C, 4C, 8C) so that there are enough channels to describe the larger patches. Combined with the MLP ratio of 4, the hidden width of the expansion layer in the deepest blocks reaches 4 * 8C = 32C.
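To make the terminology concrete, here is a minimal sketch of such an MLP block in PyTorch. The class and argument names are illustrative, not taken from the Swin codebase, and the dimensions in the example assume Swin-T's stage-1 embedding size C = 96.

import torch
import torch.nn as nn

class Mlp(nn.Module):
    """Transformer MLP block: expand by `expansion_ratio`, then project back."""
    def __init__(self, dim, expansion_ratio=4):
        super().__init__()
        hidden_dim = dim * expansion_ratio      # width of the "expansion layer"
        self.fc1 = nn.Linear(dim, hidden_dim)   # expansion layer (C -> r*C)
        self.act = nn.GELU()
        self.fc2 = nn.Linear(hidden_dim, dim)   # projection back down (r*C -> C)

    def forward(self, x):
        return self.fc2(self.act(self.fc1(x)))

# Example: C = 96, expansion ratio 4 -> hidden width 384
mlp = Mlp(dim=96, expansion_ratio=4)
x = torch.randn(1, 56 * 56, 96)   # (batch, tokens, channels)
print(mlp(x).shape)               # torch.Size([1, 3136, 96])

Note that the block's output has the same channel dimension as its input; only the hidden layer in the middle is expanded, which is exactly what the expansion ratio describes.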

In case you need more specific details, the Swin architecture is explained in plain English here: https://towardsdatascience.com/swin-vision-transformers-hacking-the-human-eye-4223ba9764c3
