Latent variable graph in Variational Autoencoder

I followed this Keras documentation guide about autoencoders.

At the end of the guide there is a plot of the latent variable z.

But I cannot understand how to interpret the plot. Also, how should the plot change as the hyperparameters of the model change?



As Nikos said in a comment, this is a graph showing the different classes (digits) in the space of the latent variable.

First, a remark: from the point of view of visualization design, I think the authors made a mistake by using a gradient of colours for the classes; every digit should have had a completely different colour. The digits here are a categorical variable, with no notion of order between them. The gradient is misleading because a reader of the graph would naturally expect some logic in it, but there is none.
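
For what it's worth, fixing that is a small change in matplotlib: use a qualitative colormap such as "tab10" and a discrete legend instead of a colour bar. A minimal sketch, where `z_mean` and `labels` are hypothetical placeholders for the encoder's 2-D latent codes and the corresponding digit labels (not names taken from the guide):

```python
import matplotlib.pyplot as plt

def plot_latent_categorical(z_mean, labels):
    """Scatter the 2-D latent codes with one distinct colour per digit."""
    fig, ax = plt.subplots(figsize=(8, 8))
    # "tab10" is a qualitative colormap: ten visually distinct colours
    # with no implied ordering, unlike a gradient such as "viridis".
    scatter = ax.scatter(z_mean[:, 0], z_mean[:, 1],
                         c=labels, cmap="tab10", s=3)
    # A discrete legend (one entry per digit) instead of a colour bar
    # makes it explicit that the classes are categorical.
    handles, _ = scatter.legend_elements()
    ax.legend(handles, [str(d) for d in range(10)], title="digit")
    ax.set_xlabel("z[0]")
    ax.set_ylabel("z[1]")
    plt.show()
```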

The space itself is not interpretable, in the sense that the values on the X and Y axes have no meaning for a human. However, what this graph can tell you is how close or how distinguishable two classes are: ideally the different classes are fairly distinct from each other, which means that the model is successful at separating them.
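
If you want to put a rough number on "how close", one simple check (my own suggestion, not something from the guide) is to compare the per-class centroids in latent space: a small distance between two centroids flags a pair of digits that the model encodes similarly. A minimal sketch, again assuming hypothetical `z_mean` and `labels` arrays:

```python
import numpy as np

def class_centroid_distances(z_mean, labels):
    """Pairwise distances between the mean latent code of each digit."""
    # Average latent code per class: a rough "centre" for each digit.
    centroids = np.stack([z_mean[labels == d].mean(axis=0)
                          for d in range(10)])
    # Euclidean distance between every pair of centres; note this
    # ignores the spread of each cluster, so it is only a rough guide.
    diff = centroids[:, None, :] - centroids[None, :, :]
    return np.linalg.norm(diff, axis=-1)  # (10, 10) matrix

# Usage: find the pair of digits whose clusters sit closest together.
# dists = class_centroid_distances(z_mean, labels)
# np.fill_diagonal(dists, np.inf)
# i, j = np.unravel_index(np.argmin(dists), dists.shape)
```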

Unfortunately, due to the mistake I mentioned above, it's quite difficult to see things clearly in this graph, but if I read the colours correctly a few things are observable:

  • The 1 is the big purple cluster in the top right; it's mostly distinct from the rest.
  • The 0 is the smaller dark purple group in the top left, also quite distinct.

The other digits are much more mixed together, so it's harder to see anything clearly:

  • Apparently the 9 (yellow) spreads a bit everywhere in the centre, and there are many points for the 7 (I think?) in the middle: this means that the model could potentially confuse 7 and 9 at times (which makes sense: a handwritten 7 and 9 can have a similar shape).
  • The 6 seems to be the rather thin greenish cluster closest to the 0, and I can indeed imagine that these two have a similar shape.

My eyes are not that good, so I may have made some mistakes, but I hope you get the idea.

Basically, the better separated the different classes/colours are on the graph, the better the model, so you should be able to see some difference between very good hyper-parameters and very bad ones. But it would be pointless to rely on the graph for that: a good old evaluation measure (or a confusion matrix) is much more accurate.
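
As a concrete version of that check, you could fit any simple classifier on the 2-D latent codes and inspect its confusion matrix: large off-diagonal cells tell you exactly which digit pairs the latent space fails to separate (the suspected 7/9 overlap would show up there). A minimal sketch with scikit-learn, using the same hypothetical `z_mean` and `labels` arrays as above:

```python
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import confusion_matrix

def latent_confusion(z_mean, labels):
    """Confusion matrix of a simple classifier fitted on the latent codes."""
    n = len(labels) // 2  # crude train/test split, just for illustration
    clf = LogisticRegression(max_iter=1000).fit(z_mean[:n], labels[:n])
    pred = clf.predict(z_mean[n:])
    # Row d, column e counts test images of digit d predicted as e;
    # large off-diagonal entries mark the classes the space mixes up.
    return confusion_matrix(labels[n:], pred)
```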
