VC dimension for Gaussian Process Regression

In neural networks, the VC dimension $d_{VC}$ is roughly equal to the number of parameters (weights) of the network. The rule of thumb for good generalization is then $N \geq 10\, d_{VC} \approx 10 \times (\text{number of weights})$.
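As a worked instance of that arithmetic, here is a minimal sketch for a hypothetical fully connected network (the layer sizes are assumed for illustration, not taken from the question):

```python
# Hypothetical example: applying the N >= 10 * d_VC rule of thumb
# to an assumed fully connected network with layer sizes 25 -> 50 -> 1.
layers = [25, 50, 1]

# Count weights (plus one bias per output unit) between consecutive layers.
d_vc = sum((fan_in + 1) * fan_out for fan_in, fan_out in zip(layers, layers[1:]))

n_required = 10 * d_vc
print(f"d_VC ~ {d_vc} parameters -> N >= {n_required} samples")
# d_VC ~ 1351 parameters -> N >= 13510 samples
```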

What is the VC dimension for Gaussian Process Regression?

My domain is $X = \mathbb{R}^{25}$, meaning I have 25 features, and I want to determine the number of samples $N$ I need to achieve good generalization.

Tags: gaussian-process, vc-theory, machine-learning



The expressiveness of a Gaussian process grows with the number of training points. The Vapnik-Chervonenkis dimension is therefore infinite (much as it is for k-nearest neighbors), so unfortunately your rule of thumb is not applicable here.
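One way to see this intuition concretely: a (near) noise-free GP with an RBF kernel can interpolate any finite set of distinct points, whatever the labels are, so no finite capacity bound applies. A minimal sketch using scikit-learn's `GaussianProcessRegressor` (the random data is purely illustrative):

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF

rng = np.random.default_rng(0)

# Arbitrary targets on random 25-dimensional inputs.
X = rng.normal(size=(100, 25))
y = rng.normal(size=100)

# alpha ~ 0 means (almost) noise-free: the posterior mean interpolates
# the training data exactly, regardless of how the labels were chosen.
gp = GaussianProcessRegressor(kernel=RBF(length_scale=1.0),
                              alpha=1e-10,
                              optimizer=None)  # keep the kernel fixed here
gp.fit(X, y)

print(np.max(np.abs(gp.predict(X) - y)))  # ~0: perfect fit of random labels
```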

You should instead rely on a train/validation split to estimate generalization. In my experience, GPs generalize much better than neural networks, but the required training-set size depends on the complexity of the data distribution itself.
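A minimal sketch of that workflow in scikit-learn, with placeholder synthetic data standing in for your 25-feature set (substitute your own `X` and `y`):

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, WhiteKernel
from sklearn.metrics import r2_score
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)

# Placeholder data: replace with your own (N, 25) feature matrix and targets.
X = rng.normal(size=(500, 25))
y = X @ rng.normal(size=25) + 0.1 * rng.normal(size=500)

X_train, X_val, y_train, y_val = train_test_split(
    X, y, test_size=0.2, random_state=0)

# RBF kernel plus a learned noise term; hyperparameters are fit by
# maximizing the marginal likelihood on the training split.
gp = GaussianProcessRegressor(kernel=RBF(length_scale=1.0) + WhiteKernel(),
                              normalize_y=True)
gp.fit(X_train, y_train)

# The gap between train and validation scores estimates generalization;
# grow the training set until the validation score stops improving.
print("train R^2:", r2_score(y_train, gp.predict(X_train)))
print("val   R^2:", r2_score(y_val, gp.predict(X_val)))
```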
