DeepMind Conditional Neural Processes: evaluation

Going through DeepMind's Conditional Neural Processes Jupyter notebook, the plots at the bottom show that the ground truth and the predicted distribution only overlap around the "context points", which are already in the training set. This surprised me: I was expecting that, if the model worked, the ground truth curve would lie inside the predicted distribution at non-context points too. So, doesn't this mean that the network failed to model the data? And if that's the case, what value is being shown here?
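For concreteness, what I expected can be phrased as a coverage check. Below is a minimal NumPy sketch of that check; the helper name coverage_outside_context and its inputs are my own hypothetical names, not code from the notebook:

    import numpy as np

    def coverage_outside_context(y_true, mu, sigma, context_idx, num_std=2.0):
        """Fraction of non-context target points where the ground truth
        falls inside mu +/- num_std * sigma. If the model works, this
        should be close to 1.0, not just high at the context points."""
        mask = np.ones(len(y_true), dtype=bool)
        mask[context_idx] = False              # keep only non-context points
        inside = np.abs(y_true[mask] - mu[mask]) <= num_std * sigma[mask]
        return inside.mean()

    # Hypothetical usage with 1-D arrays pulled from the notebook's plots:
    # print(coverage_outside_context(target_y, pred_mu, pred_sigma, context_idx))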

Edit: looking at Fig. 2 in their arXiv publication here and comparing it with the plots in the Jupyter notebook, it seems the publication shows cases where this worked nicely. The plots in the notebook are not as convincing, though.

Edit 2: re-running their notebook doesn't yield the same convergence of the loss:

At Iteration: 56200, Thu Jan 24 14:13:40 2019, loss: 1.22

(whereas in the published notebook the loss is already down to 0.6 by iteration 20,000)

Edit 3: Edit 2 is not valid. I found that I had made an edit to the code while playing around with it, and it turned out to be critical for training: I had short-circuited the if self._testing: condition in generate_curves to if True to see its impact and forgot to undo it. With that reverted, the training converges as published in the notebook. I have yet to play with the trained model to see how good the distributions are at non-context points.
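For anyone who wants to see the pitfall, here is a rough NumPy paraphrase of the branch I broke. This is my own sketch of what GPCurvesReader.generate_curves does, not the notebook's TensorFlow code verbatim:

    import numpy as np

    class GPCurvesReaderSketch:
        """Paraphrase of the notebook's GPCurvesReader data generator."""

        def __init__(self, batch_size, max_num_context, testing=False):
            self._batch_size = batch_size
            self._max_num_context = max_num_context
            self._testing = testing

        def _sample_x_values(self, num_context):
            if self._testing:
                # Test time: a fixed dense grid over the input range,
                # so the plotted curves look smooth.
                x = np.linspace(-2.0, 2.0, 400)[None, :, None]
                return np.repeat(x, self._batch_size, axis=0)
            # Training time: a small random number of targets at random
            # locations. Forcing the branch above with `if True` removes
            # exactly this randomness, which is what stalled my training.
            num_target = np.random.randint(2, self._max_num_context)
            return np.random.uniform(
                -2.0, 2.0, (self._batch_size, num_context + num_target, 1))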

Edit 4: Now that the network converges, generating new data as validation (with the same method as for the test data) shows that the predicted distribution does not capture the context points. It also shows a narrow confidence interval in the center, which completely misses the ground truth.
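The plot behind this observation is essentially the following sketch, modelled loosely on the notebook's plotting code; the variable names are hypothetical 1-D arrays:

    import matplotlib.pyplot as plt

    def plot_prediction(target_x, target_y, context_x, context_y, mu, sigma):
        plt.plot(target_x, target_y, "k:", label="ground truth")
        plt.plot(target_x, mu, "b-", label="predicted mean")
        plt.fill_between(target_x, mu - sigma, mu + sigma, color="b",
                         alpha=0.2, label="predicted mean +/- 1 std")
        plt.plot(context_x, context_y, "ko", label="context points")
        plt.legend()
        plt.show()

With the arrays from my validation run, the shaded band hugs the predicted mean tightly in the center, and the dotted ground-truth curve falls outside it there.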
