Variational Autoencoder assumptions

I am currently reading the paper Importance Weighted Autoencoders and am having a hard time understanding something regarding the original Variational Autoencoder (VAE), as described here.

In the first paragraph of the third subsection, the authors write:

The VAE objective of Eqn. 3 heavily penalizes approximate posterior samples which fail to explain the observations. This places a strong constraint on the model, since the variational assumptions must be approximately satisfied in order to achieve a good lower bound. In particular, the posterior distribution must be approximately factorial and predictable with a feed-forward neural network. This VAE criterion may be too strict; a recognition network which places only a small fraction (e.g. 20%) of its samples in the high posterior probability region may still be sufficient for performing accurate inference.
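For context, I believe the objective the authors call Eqn. 3 is the standard variational lower bound (ELBO); writing it out the way I understand it (my own notation, not copied from the paper):

```latex
\log p(x) \;\ge\; \mathcal{L}(x)
  = \mathbb{E}_{q(z \mid x)}\!\left[ \log \frac{p(x, z)}{q(z \mid x)} \right]
  = \log p(x) - D_{\mathrm{KL}}\!\left( q(z \mid x) \,\big\|\, p(z \mid x) \right)
```

The second form is how I read "heavily penalizes": the bound is tight only when the KL divergence between the approximate posterior q(z|x) and the true posterior p(z|x) is zero, so samples from q that fail to explain the observation (i.e., where log p(x, z) is very negative) are paid for directly in the bound.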

What is not clear to me is the claim that the posterior must be approximately factorial: where in the VAE algorithm do we constrain the posterior distribution to be approximately factorial, and why is that constraint bad?
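For what it's worth, my current guess is that the factorization comes from the choice of recognition network: in the usual VAE setup, q(z|x) is a Gaussian with diagonal covariance whose parameters are produced by a feed-forward encoder. A minimal PyTorch-style sketch of what I mean (all names are mine, not from the paper):

```python
import torch
import torch.nn as nn

class Encoder(nn.Module):
    """Recognition network: maps x to the parameters of q(z|x).

    q(z|x) = N(mu(x), diag(sigma(x)^2)) factorizes across the latent
    dimensions, which is (I think) the "approximately factorial"
    assumption the quote refers to.
    """
    def __init__(self, x_dim, h_dim, z_dim):
        super().__init__()
        self.hidden = nn.Sequential(nn.Linear(x_dim, h_dim), nn.Tanh())
        self.mu = nn.Linear(h_dim, z_dim)       # per-dimension mean
        self.log_var = nn.Linear(h_dim, z_dim)  # per-dimension log-variance

    def forward(self, x):
        h = self.hidden(x)
        return self.mu(h), self.log_var(h)

def sample_q(mu, log_var):
    """Reparameterized sample z = mu + sigma * eps, with eps ~ N(0, I).

    Each coordinate of z is drawn independently given x, so the
    covariance of q(z|x) is diagonal by construction.
    """
    eps = torch.randn_like(mu)
    return mu + torch.exp(0.5 * log_var) * eps
```

If that reading is right, the constraint is architectural: the encoder can only represent posteriors that factorize across latent dimensions, so if the true posterior p(z|x) has strongly correlated dimensions, no setting of the weights can drive the KL term to zero. Is that what the authors mean?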
Thanks in advance!

Topic bayesian-networks autoencoder

Category Data Science
