Gaussian Process for Classification: How to do predictions using MCMC methods

Question

Euler_Salter

January 8, 2020, 06:43

Problem

I was reading about Gaussian Processes for classification in the "Gaussian Processes for Machine Learning" textbook and in a few other online resources. Everywhere I look, people seem to avoid explaining how one would actually make predictions using sampling (MCMC) methods. Can anyone provide a simple answer to this?

Mathematics and Context

$X\in\mathbb{R}^{n\times d}$ is a matrix whose rows ${\bf{x}}_i$ are the $n$ training observations, each living in $d$ dimensions.
${\bf{y}}$ is an $n$-dimensional vector containing the training labels ($0$ or $1$), one for each training input.
${\bf{f}}$ is an $n$-dimensional vector whose elements $f_i={\bf{x}}_i^\top {\bf{w}}$ are the so-called linear predictors.
${\bf{x}}_*$ is a testing input.

Formulation of Gaussian Process for Classification

Inference is performed in two steps:

Compute $$ p(f_*\mid X, {\bf{y}}, {\bf{x}}_*)=\int p(f_*\mid X, {\bf{x}}_*, {\bf{f}})p({\bf{f}}\mid X, {\bf{y}})d {\bf{f}} $$
Squish the value using the sigmoid function to find the class probability. $$ \overline{\pi}_* = p(y_*=1\mid X, {\bf{y}}, {\bf{x}}_*) = \int \sigma(f_*)p(f_*\mid X, {\bf{y}}, {\bf{x}}_*)d f_* $$

What I don't understand

How does one go about solving this using sampling methods? My idea is that the first integral might be similar to an expectation so maybe we can do something like this.

Get samples ${\bf{f}}_1, \ldots, {\bf{f}}_N$ from $p({\bf{f}}\mid X, {\bf{y}})$ and then approximate the first integral like this $$ \mathbb{E}_{p({\bf{f}}\mid X, {\bf{y}})}\left[p(f_*\mid X, {\bf{x}}_*, {\bf{f}})\right] \approx \frac{1}{N}\sum_{i=1}^N p(f_*\mid X, {\bf{x}}_*, {\bf{f}}_i) $$ but then how do I compute $p(f_*\mid X, {\bf{x}}_*, {\bf{f}}_i)$ ?

Topic gaussian-process sampling classification

Category Data Science

chzhrr · Accepted Answer · January 8, 2020, 06:43

Step 1 does not need MCMC. Under some assumptions, the posterior over $f_*$ can be computed in closed form as a Gaussian, $\mathcal{N}(\mathbb{E}[f_*], \operatorname{Var}[f_*])$. In the second step, we approximate the class probability by sampling from this posterior distribution and averaging $\sigma(f_*)$ over the samples.
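As a rough illustration of the second step, here is a minimal sketch in Python with NumPy. It assumes the posterior over $f_*$ has already been approximated by a Gaussian with known mean and variance; the function names `sigmoid` and `class_probability_mc` and the example values (mean 1.0, variance 4.0) are made up for illustration and do not come from any particular library.

```python
import numpy as np


def sigmoid(z):
    """Logistic function mapping a latent value to a probability."""
    return 1.0 / (1.0 + np.exp(-z))


def class_probability_mc(mean_f_star, var_f_star, num_samples=100_000, seed=0):
    """Monte Carlo estimate of E[sigmoid(f_*)] for f_* ~ N(mean, var).

    Assumption: the posterior over f_* is (approximately) Gaussian.
    """
    rng = np.random.default_rng(seed)
    # Draw latent values from the assumed Gaussian posterior over f_*.
    f_star = rng.normal(loc=mean_f_star, scale=np.sqrt(var_f_star), size=num_samples)
    # Average the squashed draws to approximate the integral in step 2.
    return float(np.mean(sigmoid(f_star)))


# Hypothetical posterior moments for a single test input.
pi_bar = class_probability_mc(mean_f_star=1.0, var_f_star=4.0)
print(f"estimated p(y_*=1) = {pi_bar:.3f}")
```

When the variance is zero, the estimate reduces to $\sigma(\mathbb{E}[f_*])$. With a positive mean and nonzero variance, the averaged probability is pulled toward $0.5$ relative to that plug-in value.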
