Multivariate noise variance in Gaussian process prediction
In GP regression, we predict using $\mu^* = ... (K(X,X)+\sigma^2I)^{-1}...$ This is fine when the noise variance $\sigma^2$ is a scalar, but I am confused about what happens when the noise is multivariate/anisotropic.
We have $K(X,X) \in \mathbb{R}^{m\times m}$, but doesn't the dimension of $\sigma$ depend on the width of our prediction vector $f_\ast$? If so, how does the $(K(X,X)+\sigma^2I)^{-1}$ part of the prediction work?
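To make the shapes concrete, here is a minimal NumPy sketch under one possible reading of the question: a single-output GP with zero prior mean, where "anisotropic" noise means one noise variance per training point, so $\sigma^2 I$ becomes $\mathrm{diag}(\sigma_1^2,\dots,\sigma_m^2)$. The RBF kernel, the helper names `rbf_kernel` and `gp_posterior_mean`, and the toy data are all assumptions made for illustration, not something taken from a particular library. Under these assumptions the noise term is always $m \times m$, matching $K(X,X)$, and does not change with the number of test points.

```python
import numpy as np

def rbf_kernel(A, B, length_scale=1.0):
    # Squared-exponential kernel between the rows of A and the rows of B
    # (an assumed kernel choice, used only for illustration).
    sq_dists = (
        np.sum(A**2, axis=1)[:, None]
        + np.sum(B**2, axis=1)[None, :]
        - 2.0 * A @ B.T
    )
    return np.exp(-0.5 * np.maximum(sq_dists, 0.0) / length_scale**2)

def gp_posterior_mean(X, y, X_star, noise_var):
    # noise_var is either a scalar (isotropic noise) or a length-m vector
    # holding one noise variance per training point (assumed interpretation).
    m = X.shape[0]
    noise_var = np.broadcast_to(np.asarray(noise_var, dtype=float), (m,))
    K = rbf_kernel(X, X)            # shape (m, m)
    K_star = rbf_kernel(X_star, X)  # shape (n_star, m)
    # The noise matrix is m x m regardless of how many test points there are.
    alpha = np.linalg.solve(K + np.diag(noise_var), y)
    return K_star @ alpha           # shape (n_star,), zero prior mean assumed

# Toy data: m = 5 training points in one dimension.
rng = np.random.default_rng(0)
X = rng.uniform(-3, 3, size=(5, 1))
y = np.sin(X[:, 0])
noise_var = np.array([0.01, 0.05, 0.1, 0.02, 0.2])

# Same training set and same noise vector, different numbers of test points.
mu_3 = gp_posterior_mean(X, y, np.linspace(-3, 3, 3)[:, None], noise_var)
mu_7 = gp_posterior_mean(X, y, np.linspace(-3, 3, 7)[:, None], noise_var)
```

Passing a scalar for `noise_var` recovers the usual $\sigma^2 I$ case, since the scalar is broadcast to a constant length-$m$ vector before the diagonal matrix is built. A multi-output GP with correlated noise across outputs would need a different construction and is not covered by this sketch.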
Topic gaussian-process gaussian regression statistics
Category Data Science