Why does a restricted Boltzmann machine (RBM) tend to learn very similar weights?

These are 4 different weight matrices that I got after training a restricted Boltzmann machine (RBM) with ~4k visible units and only 96 hidden units/weight vectors. As you can see, the weights are extremely similar: even the black pixels on the face are reproduced. The other 92 vectors are very similar too, although none of the weights are exactly the same.
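To quantify "extremely similar" rather than eyeballing the filter images, one can compute pairwise cosine similarity between the hidden units' weight vectors; a minimal sketch (the `weight_similarity` helper and the synthetic `W` are illustrative, not part of my code):

```python
import numpy as np

def weight_similarity(W):
    """Mean pairwise cosine similarity between hidden units' weight vectors.

    W: (n_visible, n_hidden) weight matrix; each column is one hidden
    unit's filter over the visible layer.
    """
    norms = np.linalg.norm(W, axis=0, keepdims=True)
    Wn = W / np.maximum(norms, 1e-12)          # unit-length columns
    S = Wn.T @ Wn                              # (n_hidden, n_hidden) cosines
    off_diag = S[~np.eye(S.shape[0], dtype=bool)]
    return off_diag.mean()                     # near 1.0 => near-duplicate filters

# Synthetic example: four filters that are one base pattern plus small noise
rng = np.random.default_rng(0)
base = rng.normal(size=(100, 1))
W = base + 0.01 * rng.normal(size=(100, 4))
print(weight_similarity(W))                    # close to 1.0
```

A value near 1.0 across many hidden units confirms the filters have collapsed onto essentially one pattern, which is the behavior in question.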

I can overcome this by increasing the number of weight vectors to 512 or more. But I have encountered this problem several times before with different RBM types (binary, Gaussian, even convolutional), different numbers of hidden units (including pretty large ones), different hyperparameters, etc.

My question is: what is the most likely reason for the weights to take on very similar values? Do they all just fall into the same local minimum? Or is it a sign of overfitting?

I currently use a kind of Gaussian-Bernoulli RBM; the code can be found here.
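For reference, the update rule in question looks roughly like the following CD-1 sketch for a Gaussian-Bernoulli RBM (this is a generic textbook version under the usual unit-variance assumption for the visible units, not my actual implementation; all names here are illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def cd1_step(v0, W, b_vis, b_hid, lr=0.01):
    """One CD-1 update for a Gaussian-Bernoulli RBM.

    Assumes real-valued visible units with unit variance (data
    standardized beforehand) and binary hidden units.
    v0: (batch, n_vis) mini-batch; W: (n_vis, n_hid).
    """
    # Positive phase: hidden probabilities given the data, then a sample
    ph0 = sigmoid(v0 @ W + b_hid)
    h0 = (rng.random(ph0.shape) < ph0).astype(float)
    # Negative phase: Gaussian visibles reconstructed as their mean
    v1 = h0 @ W.T + b_vis
    ph1 = sigmoid(v1 @ W + b_hid)
    # Gradient estimate: positive minus negative statistics
    n = v0.shape[0]
    W += lr * (v0.T @ ph0 - v1.T @ ph1) / n
    b_vis += lr * (v0 - v1).mean(axis=0)
    b_hid += lr * (ph0 - ph1).mean(axis=0)
    return ((v0 - v1) ** 2).mean()   # reconstruction error, for monitoring

# Toy run: data with a nonzero mean that the visible biases must learn
data = rng.normal(loc=1.0, size=(64, 20))
W = 0.01 * rng.normal(size=(20, 8))
b_vis, b_hid = np.zeros(20), np.zeros(8)
err_first = cd1_step(data, W, b_vis, b_hid)
for _ in range(500):
    err = cd1_step(data, W, b_vis, b_hid)
```

Monitoring the reconstruction error this way, alongside the filter-similarity check, makes it easier to tell whether the duplicated filters appear early in training or only after convergence.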

UPD. My dataset is based on CK+, which contains >10k images of 327 individuals. However, I do pretty heavy preprocessing. First, I keep only the pixels inside the outer contour of the face. Second, I transform each face (using piecewise affine warping) onto the same grid (i.e. the eyebrows, nose, lips, etc. are at the same (x, y) positions in all images). After preprocessing, the images look like this:

When training the RBM, I take only the non-zero pixels, so the outer black region is ignored.
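Concretely, dropping the shared black background can be done with a boolean mask over pixels that are zero in every image; a sketch with synthetic data standing in for the warped faces (shapes and names are made up for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)
# Synthetic stand-in for the warped faces: (n_images, n_pixels), flattened,
# with a fixed background region that is exactly zero in every image.
faces = rng.random((10, 100))
faces[:, 60:] = 0.0                  # pretend these pixels are outside the face

# Pixels that are zero across *all* images form the background;
# keeping only the rest discards the outer black region before training.
mask = np.any(faces != 0, axis=0)    # True for in-face pixels
visible = faces[:, mask]             # (n_images, n_nonzero_pixels)
print(visible.shape)                 # (10, 60)
```

Because the warping puts every face on the same grid, the same mask applies to all images, so the RBM's visible layer only ever sees in-face pixels.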



A restricted Boltzmann machine (RBM) learns a probability distribution over its inputs; its weights amount to a lossy compression of the original data.

Those 4 weight matrices are all reduced-dimension representations of the original face inputs. If you visualized the weights as probability distributions, their values would differ, but each would incur roughly the same amount of loss when reconstructing the original image.
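To see the "lossy compression" point concretely, a deterministic encode/decode pass through an RBM maps each input down to the hidden activations and back; with far fewer hidden units than pixels, some reconstruction error is unavoidable. A minimal sketch (random weights here, just to show the mechanics):

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def reconstruct(v, W, b_vis, b_hid):
    """Mean-field encode/decode pass through an RBM.

    v: (batch, n_vis); W: (n_vis, n_hid). Using hidden probabilities
    instead of samples gives a deterministic reconstruction.
    """
    h = sigmoid(v @ W + b_hid)       # compress: n_vis numbers -> n_hid numbers
    return h @ W.T + b_vis           # lossy reconstruction in input space

rng = np.random.default_rng(1)
v = rng.normal(size=(5, 40))
W = 0.1 * rng.normal(size=(40, 8))  # 40 -> 8: heavy compression
v_hat = reconstruct(v, W, np.zeros(40), np.zeros(8))
print(((v - v_hat) ** 2).mean())     # nonzero: information is lost
```

This is the sense in which each of the 4 weight matrices is a compressed representation: different weights can achieve comparable reconstruction loss on the same inputs.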
