Gaussian Process for Classification: How to do predictions using MCMC methods

Problem

I was reading about Gaussian Processes for classification in the "Gaussian Processes for Machine Learning" textbook and in a few other online resources. Everywhere I look, people seem to avoid talking about how one would go about doing this. Can anyone provide a simple answer to this?

Mathematics and Context

  • $X\in\mathbb{R}^{n\times d}$ is a matrix whose rows ${\bf{x}}_i$ are the $n$ training observations living in $d$-dimensions.
  • ${\bf{y}}$ is an $n$-dimensional vector containing the training label ($0$ or $1$) for each training input.
  • ${\bf{f}}$ is an $n$-dimensional vector whose elements $f_i={\bf{x}}_i^\top {\bf{w}}$ are the so-called linear predictors.
  • ${\bf{x}}_*$ is a testing input.

Formulation of Gaussian Process for Classification

Inference is performed in two steps:

  1. Compute $$ p(f_*\mid X, {\bf{y}}, {\bf{x}}_*)=\int p(f_*\mid X, {\bf{x}}_*, {\bf{f}})p({\bf{f}}\mid X, {\bf{y}})d {\bf{f}} $$
  2. Squish the value using the sigmoid function to find the class probability. $$ \overline{\pi}_* = p(y_*=1\mid X, {\bf{y}}, {\bf{x}}_*) = \int \sigma(f_*)p(f_*\mid X, {\bf{y}}, {\bf{x}}_*)d f_* $$
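
As far as I can tell, the reason any sampling or approximation is needed at all is that, once the Bernoulli likelihood is used, the posterior over the latent values is no longer Gaussian: $$ p({\bf{f}}\mid X, {\bf{y}}) \propto p({\bf{y}}\mid {\bf{f}})\,p({\bf{f}}\mid X) = \prod_{i=1}^n \sigma(f_i)^{y_i}\left(1-\sigma(f_i)\right)^{1-y_i} \cdot \mathcal{N}({\bf{f}}\mid {\bf{0}}, K), $$ assuming a zero-mean GP prior with covariance matrix $K$ over the training inputs, so neither integral has a closed form in general.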

What I don't understand

How does one go about solving this using sampling methods? My idea is that the first integral looks like an expectation, so maybe we can do something like this:

  1. Get samples ${\bf{f}}_1, \ldots, {\bf{f}}_N$ from $p({\bf{f}}\mid X, {\bf{y}})$ (e.g. with MCMC) and then approximate the first integral like this: $$ \mathbb{E}_{p({\bf{f}}\mid X, {\bf{y}})}\left[p(f_*\mid X, {\bf{x}}_*, {\bf{f}})\right] \approx \frac{1}{N}\sum_{i=1}^N p(f_*\mid X, {\bf{x}}_*, {\bf{f}}_i). $$ But then how do I compute $p(f_*\mid X, {\bf{x}}_*, {\bf{f}}_i)$?
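
My best guess is that this conditional is just the usual noise-free GP conditional, $$ p(f_*\mid X, {\bf{x}}_*, {\bf{f}}_i) = \mathcal{N}\!\left(f_* \mid {\bf{k}}_*^\top K^{-1}{\bf{f}}_i,\; k({\bf{x}}_*,{\bf{x}}_*) - {\bf{k}}_*^\top K^{-1}{\bf{k}}_*\right), $$ where $K$ is the kernel matrix over the training inputs and ${\bf{k}}_*$ holds the covariances between the training inputs and ${\bf{x}}_*$. If that is right, the whole scheme would look roughly like the NumPy sketch below (the RBF kernel, the variable names, and the fake "posterior samples" drawn from the prior are all placeholders; real samples would come from an MCMC run targeting $p({\bf{f}}\mid X,{\bf{y}})$). Is this correct?

```python
import numpy as np


def rbf_kernel(A, B, lengthscale=1.0):
    """Squared-exponential (RBF) kernel matrix between the rows of A and B."""
    sq_dists = (
        np.sum(A**2, axis=1)[:, None]
        + np.sum(B**2, axis=1)[None, :]
        - 2.0 * A @ B.T
    )
    return np.exp(-0.5 * sq_dists / lengthscale**2)


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


rng = np.random.default_rng(0)
n, d, N = 20, 2, 2000

X = rng.normal(size=(n, d))          # training inputs (placeholder data)
x_star = rng.normal(size=(1, d))     # one test input

K = rbf_kernel(X, X) + 1e-6 * np.eye(n)   # prior covariance at the training inputs
k_star = rbf_kernel(X, x_star)[:, 0]      # covariances between training inputs and x_*
k_ss = rbf_kernel(x_star, x_star)[0, 0]   # prior variance at x_*

# Placeholder for MCMC output: real samples f_1, ..., f_N would come from a
# sampler targeting p(f | X, y); here they are drawn from the prior so the
# script runs end to end.
L = np.linalg.cholesky(K)
f_samples = (L @ rng.normal(size=(n, N))).T   # shape (N, n)

K_inv = np.linalg.inv(K)
cond_var = k_ss - k_star @ K_inv @ k_star     # variance of p(f_* | X, x_*, f_i)

pi_star = 0.0
for f_i in f_samples:
    cond_mean = k_star @ K_inv @ f_i                   # mean of p(f_* | X, x_*, f_i)
    f_star = rng.normal(cond_mean, np.sqrt(cond_var))  # one draw of f_*
    pi_star += sigmoid(f_star)                         # step 2: squash and accumulate
pi_star /= N

print("Monte Carlo estimate of p(y_* = 1 | X, y, x_*):", pi_star)
```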

Topic: gaussian-process, sampling, classification

Category: Data Science


In step 1, there is no need to use MCMC: under a Gaussian approximation to the posterior $p({\bf{f}}\mid X, {\bf{y}})$ (for example a Laplace approximation), the predictive distribution $p(f_*\mid X, {\bf{y}}, {\bf{x}}_*)$ is available in closed form as $\mathcal{N}\left(\mathbb{E}[f_*], \operatorname{Var}[f_*]\right)$. In step 2, we then approximate the class probability by sampling $f_*$ from this distribution and averaging $\sigma(f_*)$ over the samples.
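
A minimal sketch of that second step, assuming the mean and variance of $f_*$ from step 1 are already available (the variable names and numbers below are placeholders):

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


rng = np.random.default_rng(0)

# Placeholders: in practice these come from the closed-form result of step 1,
# i.e. the mean and variance of p(f_* | X, y, x_*).
mean_fstar, var_fstar = 0.8, 0.5

# Monte Carlo approximation of pi_* = E[sigmoid(f_*)] with f_* ~ N(mean, var).
f_star_samples = rng.normal(mean_fstar, np.sqrt(var_fstar), size=10_000)
pi_star = sigmoid(f_star_samples).mean()

print("approximate p(y_* = 1 | X, y, x_*):", pi_star)
```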
