How can I plot the covariance matrix of scikit-learn's Gaussian process kernel?

How can I plot the covariance matrix of a Gaussian process kernel built with scikit-learn? This is my code:

```python
X = Buckling_masterset.reshape(-1, 1)
y = E
X_train, y_train = Buckling.reshape(-1, 1), E

kernel = 1 * RBF(length_scale=1e1, length_scale_bounds=(1e-5, 1e5))
gpr = GaussianProcessRegressor(kernel=kernel, alpha=1, n_restarts_optimizer=10)
gpr.fit(X_train, y_train)

y_mean, y_std = gpr.predict(X, return_std=True)
mean_prediction, std_prediction = gpr.predict(X, return_std=True)
```

I want to plot the covariance matrix that corresponds to this kernel. Something along the lines of:
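A minimal sketch of one way to do this, reusing the variable names from the snippet above: after fitting, scikit-learn exposes the optimized kernel as gpr.kernel_, and calling it on the inputs returns the covariance matrix, which can be rendered with matplotlib.

```python
import matplotlib.pyplot as plt

# gpr.kernel_ is the fitted kernel after gpr.fit(); calling it on the
# training inputs evaluates the covariance matrix K(X, X).
K = gpr.kernel_(X_train)

plt.imshow(K, cmap='viridis')
plt.colorbar(label='covariance')
plt.title('Covariance matrix K(X, X)')
plt.show()
```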
Category: Data Science

Need a random process/distribution where I can pass a certain level of bias for producing an outcome

This is my first question here; if I am not clear, please let me know. My objective: a startup sportsbook wants to test its algorithm to see how it manages game lines for incoming bets placed on a particular game. For example, as bets come in for a particular team, the algorithm checks the book to see if it can cover, and when the book is lopsided it will adjust the line/odds, giving the other team more favorable odds to balance the book …
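A minimal sketch of one common approach (the team names and bias values below are illustrative, not from the question): draw outcomes from a categorical distribution whose weights encode the bias, e.g. with numpy's random Generator.

```python
import numpy as np

rng = np.random.default_rng(seed=42)

# Hypothetical bias: 70% of incoming bets favor team A, 30% favor team B.
teams = ['A', 'B']
bias = [0.7, 0.3]

# Simulate a stream of 1000 incoming bets with that bias.
bets = rng.choice(teams, size=1000, p=bias)
print((bets == 'A').mean())  # roughly 0.7
```

Shifting the entries of `bias` over time lets you test how the line-adjustment algorithm reacts to differently skewed bet flows.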
Category: Data Science

Why is the GP posterior variance the worst-case error? (exact proof)

I am reading this paper, which explains the connection between Gaussian processes and kernel methods in detail. I am impressed by the insightful explanation in this paper, but am stuck on one part in Chapter 3, Section 3.4, Error Estimates: Posterior Variance and Worst-Case Error. In this section (p. 24) the authors suggest that Proposition 3.8 can be proved using Lemma 3.9. Proposition 3.8. Let $\bar{k}$ be the posterior covariance function (17) with noise variance $\sigma^2$. Then, for any $x\in\mathcal{X}$ with …
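For what it's worth, the workhorse identity behind proofs of this shape is elementary, and spelling it out may unblock the reading (this is the standard Cauchy–Schwarz step such lemmas encode; I am not quoting the paper's exact statement): for any $g$ in an RKHS $\mathcal{H}$,

\begin{equation}
\sup_{\|f\|_{\mathcal{H}} \leq 1} \left| \langle f, g \rangle_{\mathcal{H}} \right| = \|g\|_{\mathcal{H}},
\end{equation}

with the supremum attained at $f = g/\|g\|_{\mathcal{H}}$. Writing the error of the posterior mean at $x$ as an inner product $\langle f, g_x \rangle_{\mathcal{H}}$ with a fixed representer $g_x$ (via the reproducing property) turns the worst-case error over the unit ball into the norm $\|g_x\|_{\mathcal{H}}$, which can then be computed explicitly and matched to the posterior standard deviation.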
Category: Data Science

Incremental Hyperparameter optimization in GPy classifier

Is there any way to do an epoch-wise incremental gradient descent hyperparameter optimization for the Gaussian Process class GPy.core.gp under the GPy package? I am familiar with the complete optimization function model.optimize(), but unable to find any clue for incremental learning, as is supported by partial_fit() methods in sklearn estimators. Any clue or help in this is highly appreciated. Thanks in advance!
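One workaround I am aware of (a sketch, not a documented GPy feature): run the built-in optimizer for a capped number of iterations each time new data arrives, refreshing the training set with GP.set_XY. The set_XY method and the max_iters argument are real GPy APIs; the batch source and loop structure below are illustrative assumptions.

```python
import numpy as np
import GPy

rng = np.random.default_rng(0)

def make_batch(n):
    # Hypothetical data source standing in for incoming observations.
    X = rng.uniform(-3, 3, (n, 1))
    return X, np.sin(X) + 0.1 * rng.standard_normal((n, 1))

X, Y = make_batch(50)
model = GPy.models.GPRegression(X, Y, GPy.kern.RBF(input_dim=1))

for _ in range(5):                # five "epochs" of incremental refinement
    X_new, Y_new = make_batch(10)
    X, Y = np.vstack([X, X_new]), np.vstack([Y, Y_new])
    model.set_XY(X, Y)            # refresh the training data in place
    model.optimize(max_iters=10)  # a few optimizer iterations per epoch
```

This is not a true partial_fit equivalent, since each optimize call restarts the optimizer from the current hyperparameters rather than carrying optimizer state across calls, but it does give epoch-wise refinement.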
Category: Data Science

VC dimension for Gaussian Process Regression

In neural networks, the VC dimension $d_{VC}$ approximately equals the number of parameters (weights) of the network. The rule of thumb for good generalization is then $N \geq 10 d_{VC} \approx 10 \times (\text{number of weights})$. What is the VC dimension for Gaussian process regression? My domain is $X = \mathbb{R}^{25}$, meaning I have 25 features, and I want to determine the number of samples $N$ I need to achieve good generalization.
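For calibration, here is the heuristic worked out for a parametric baseline (an illustration of the rule of thumb only; GPR is nonparametric, so its capacity is controlled by the kernel and noise level rather than by a finite weight count): a linear model on $X = \mathbb{R}^{25}$ has 25 weights plus a bias, so

\begin{equation}
d_{VC} \approx 26, \qquad N \geq 10\, d_{VC} = 260.
\end{equation}

For a GP, the analogous sample-complexity arguments are usually phrased through the RKHS norm of the target function rather than through a VC dimension.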
Category: Data Science

How do you choose a kernel for a discontinuous function in Gaussian Process Regression?

I'm doing Gaussian process regression and created a series of functions by gluing other functions together at random places. Here's an example: Perhaps this one is too complicated, but all the functions come from the same "family"; they're all variations of Gaussians. Is there anything standard that can be done with this?
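One standard construction worth knowing about (a sketch; the changepoint location x0, steepness, and length-scales are illustrative assumptions): blend two stationary kernels through a sigmoid so the process can switch behavior around a point. This is how changepoint kernels, e.g. GPflow's ChangePoints, are built.

```python
import numpy as np

def rbf(x1, x2, length_scale=1.0):
    # Squared-exponential kernel on 1-D inputs.
    d = x1[:, None] - x2[None, :]
    return np.exp(-0.5 * (d / length_scale) ** 2)

def changepoint_kernel(x1, x2, x0=0.0, steepness=5.0):
    # Sigmoid weights: ~0 left of the changepoint x0, ~1 right of it.
    s1 = 1.0 / (1.0 + np.exp(-steepness * (x1 - x0)))
    s2 = 1.0 / (1.0 + np.exp(-steepness * (x2 - x0)))
    k_left = rbf(x1, x2, length_scale=2.0)   # behavior before x0
    k_right = rbf(x1, x2, length_scale=0.3)  # behavior after x0
    # Weighted sum of PSD kernels scaled by s(x)s(x') stays PSD.
    return (np.outer(1 - s1, 1 - s2) * k_left
            + np.outer(s1, s2) * k_right)

x = np.linspace(-3, 3, 100)
K = changepoint_kernel(x, x)   # covariance with a regime switch at 0
```

Because the sigmoid is smooth, this models a rapid transition rather than a true jump; for genuinely discontinuous targets, another common option is to partition the input space and fit independent GPs per region.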
Category: Data Science

Sequential sampling from Gaussian conditional not working

I'm trying to sequentially sample from a Gaussian process prior. The problem is that the samples eventually converge to zero or diverge to infinity. I'm using the basic conditionals described e.g. here. Note: the kernel(X, X) function returns the squared-exponential kernel with isotropic noise. Here is my code:

```python
n = 32
x_grid = np.linspace(-5, 5, n)
x_all = []
y_all = []
for x in x_grid:
    x_all = [x] + x_all
    X = np.array(x_all).reshape(-1, 1)
    # Mean and covariance of the prior
    …
```
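A common culprit in loops like this (a sketch under that assumption, not a diagnosis of the exact code above): inverting the growing covariance matrix directly accumulates round-off error, which eventually produces negative conditional variances. Conditioning through a Cholesky solve with a small jitter term tends to stay stable; the helper below defines its own noise-free kernel for self-containment.

```python
import numpy as np

def kernel(A, B, length_scale=1.0):
    # Squared-exponential kernel on column vectors (no noise term here).
    d = A - B.T
    return np.exp(-0.5 * (d / length_scale) ** 2)

def conditional_sample(X_old, y_old, x_new, jitter=1e-8,
                       rng=np.random.default_rng()):
    # Draw from p(y_new | y_old) for a zero-mean GP prior via Cholesky solves.
    K = kernel(X_old, X_old) + jitter * np.eye(len(X_old))
    L = np.linalg.cholesky(K)
    k_star = kernel(X_old, x_new)                       # shape (n_old, 1)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y_old))
    v = np.linalg.solve(L, k_star)
    mean = float(k_star.T @ alpha)
    var = float(kernel(x_new, x_new) - v.T @ v)
    return rng.normal(mean, np.sqrt(max(var, 0.0)))     # clip at zero
```

Clipping the variance at zero and adding jitter keeps round-off from driving the sequential samples to collapse or blow up.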
Category: Data Science

Multivariate noise variance in Gaussian process prediction

In GP regression, we predict using $\mu^* = ... (K(X,X)+\sigma^2I)^{-1}...$ This is fine when the noise $\sigma$ is a scalar, but I am confused about what happens when $\sigma$ is multivariate/anisotropic. Given $K(X,X) \in \mathbb{R}^{m\times m}$, does $\sigma$'s dimension not depend on the width of our prediction vector $f_\ast$? If so, how does the above part of the prediction work?
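For what it's worth, the usual resolution (standard GP algebra, not tied to a particular reference): the noise term lives on the training observations, so with per-observation variances it becomes an $m \times m$ diagonal matrix,

\begin{equation}
\mu^* = K(X_*, X)\left(K(X,X) + \Sigma\right)^{-1} \mathbf{y}, \qquad \Sigma = \mathrm{diag}(\sigma_1^2, \dots, \sigma_m^2) \in \mathbb{R}^{m \times m},
\end{equation}

matching $K(X,X) \in \mathbb{R}^{m\times m}$ regardless of how many test points are in $f_*$. Noise on the test outputs, if desired, is added separately to the predictive covariance, not inside the inverse.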
Category: Data Science

Is it possible to train probabilistic model to return several distributions?

I have nonlinear data for a function y(x) which is, let's say, parabolic. At some points of x there are several y's (look at the picture). Is it possible to train a probabilistic model to return several distributions when needed, i.e. several means and variances? For example: when I feed a (x=a) to the model, it returns 2 red distributions (2 means and 2 variances), and when I feed b (x=b) to the model, it returns 1 blue distribution. …
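This is what mixture density models are for. A minimal sketch of one approach (the toy data and component count are assumptions): fit a Gaussian mixture on the joint (x, y) samples, then, for a query x, read off each component's conditional mean and variance in y.

```python
import numpy as np
from sklearn.mixture import GaussianMixture

# Hypothetical parabola-like data where one x can map to two y values.
y = np.random.uniform(-2, 2, 500)
x = y ** 2 + 0.05 * np.random.randn(500)

gmm = GaussianMixture(n_components=2).fit(np.column_stack([x, y]))

def conditional_components(x_query):
    # Condition each 2-D Gaussian component on x = x_query.
    out = []
    for mean, cov in zip(gmm.means_, gmm.covariances_):
        mu = mean[1] + cov[1, 0] / cov[0, 0] * (x_query - mean[0])
        var = cov[1, 1] - cov[1, 0] ** 2 / cov[0, 0]
        out.append((mu, var))
    return out  # one (mean, variance) pair per component

print(conditional_components(1.0))  # two branches: y near +1 and -1
```

A mixture density network does the same thing with the component weights, means, and variances predicted by a neural network, so the number of active modes can vary with x.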
Category: Data Science

Data model with more outputs than inputs?

I am working on parametric studies in physics simulations, i.e. I vary some real input parameters (e.g. x0,x1,x2,x3) and get an output with a larger size (e.g. y0,y1 ... y100). Assuming that I have a database of some thousand different input parameters and corresponding outputs, is there a good way to build a model that can give a prediction for the output at a new position? I have looked into various techniques, but so far I couldn't find a method …
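For what it's worth, many regressors handle vector-valued targets directly; a minimal sketch (the shapes and random data are stand-ins for the described database) with scikit-learn's GaussianProcessRegressor, which accepts a 2-D y:

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF

# Hypothetical database: 1000 samples, 4 inputs, 101 outputs.
X = np.random.rand(1000, 4)
Y = np.random.rand(1000, 101)   # stand-in for the simulation outputs

gpr = GaussianProcessRegressor(kernel=RBF(length_scale=[1.0] * 4),
                               normalize_y=True).fit(X, Y)
Y_new = gpr.predict(np.random.rand(1, 4))   # shape (1, 101)
```

If the 101 outputs are highly correlated (e.g. a discretized curve), reducing them with PCA first and regressing on the few principal components is a common trick.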
Category: Data Science

Derivative of multi-output Gaussian Process

I am working on a project where I estimate transition and measurement models for a Kalman filter using Gaussian processes. In order to linearize the models I require the Jacobian of the estimated Gaussian process. For the single-output case this is no problem, but I am a little confused about how to do this for the multi-output case. The posterior mean of the Gaussian process would be \begin{equation} \begin{aligned} \bar{f}_* &= \mathbf{k}(\mathbf{x}_*, \mathbf{X}) K(\mathbf{X}, \mathbf{X})^{-1} \mathbf{y}\\ &\stackrel{\triangle}{=} …
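Since $\mathbf{x}_*$ enters the posterior mean only through the kernel vector, the Jacobian follows by differentiating $\mathbf{k}$ alone (a standard result; the independent-outputs reading below is an assumption):

\begin{equation}
\frac{\partial \bar{f}_*}{\partial \mathbf{x}_*} = \frac{\partial \mathbf{k}(\mathbf{x}_*, \mathbf{X})}{\partial \mathbf{x}_*} \, K(\mathbf{X}, \mathbf{X})^{-1} \mathbf{y}.
\end{equation}

If the $p$ outputs are modeled as independent GPs sharing the kernel, $\mathbf{y}$ is simply an $n \times p$ matrix and the same expression returns the $d \times p$ Jacobian in one shot; with a coregionalized (correlated-output) model the derivative instead goes through the full stacked covariance.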
Category: Data Science

GP derivative in GPyTorch

I am working on a project using GP regression models to model transition and measurement models in a Kalman filter. This means I need to be able to sample from the derivative of the original GP model. I am aware of how to combine the various kernels offered in the GPyTorch library, but is there any way I can implement my own mean and covariance functions? In the case of an RBF kernel the posterior mean and covariance would be: \begin{equation} \begin{aligned} …
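Yes; GPyTorch lets you subclass its Mean and Kernel base classes. A minimal sketch (the base classes and forward signatures are the real extension points; the kernel body is a placeholder RBF, with the derivative-kernel math omitted):

```python
import torch
import gpytorch

class MyMean(gpytorch.means.Mean):
    # Custom mean: forward(x) returns the mean at each input row.
    def forward(self, x):
        return torch.zeros(x.shape[:-1], dtype=x.dtype, device=x.device)

class MyKernel(gpytorch.kernels.Kernel):
    # Custom covariance: forward(x1, x2) returns the covariance matrix.
    def __init__(self, length_scale=1.0):
        super().__init__()
        self.length_scale = length_scale

    def forward(self, x1, x2, diag=False, **params):
        # Plain RBF as a placeholder; swap in the derivative-GP formula.
        d2 = self.covar_dist(x1, x2, square_dist=True)
        k = torch.exp(-0.5 * d2 / self.length_scale ** 2)
        return k.diagonal(dim1=-2, dim2=-1) if diag else k
```

These drop into an ExactGP model in place of ConstantMean / ScaleKernel(RBFKernel()), so the derivative GP's closed-form mean and covariance can be implemented directly in the forward methods.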
Category: Data Science

Gaussian process regressor returns almost identical std for all datapoints

I am using a Gaussian process regressor as the regressor for active learning, and I use its standard deviation to choose the next training instance (the one with the highest std is chosen). However, the std values returned by the regressor are almost identical, as shown below. That doesn't seem right, especially given that the algorithm's performance doesn't improve after having been taught with 20 new instances that it has queried. I use this data set. The way I go about …
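Two things worth checking (a sketch assuming scikit-learn's GaussianProcessRegressor; the kernel settings and placeholder data are illustrative): unscaled targets and a length-scale stuck at its bound both flatten the predictive std into a near-constant.

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, WhiteKernel

# Placeholder data standing in for the question's data set.
X_train = np.random.rand(100, 5)
y_train = X_train @ np.arange(1, 6) + 0.1 * np.random.randn(100)

kernel = 1.0 * RBF(length_scale=1.0, length_scale_bounds=(1e-3, 1e3)) \
    + WhiteKernel(noise_level=1.0)
gpr = GaussianProcessRegressor(kernel=kernel, normalize_y=True,
                               n_restarts_optimizer=10).fit(X_train, y_train)

# If the learned length-scale sits at a bound, the fit is degenerate and
# the posterior std will barely vary between query points.
print(gpr.kernel_)
```

A degenerate fit like that would also explain why querying 20 new instances does not help: the acquisition scores are essentially uniform.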
Category: Data Science

Gaussian Process for Classification: How to do predictions using MCMC methods

Problem I was reading about Gaussian processes for classification in the "Gaussian Processes for Machine Learning" textbook and in a few other online resources. Everywhere I look, people seem to avoid talking about how one would go about doing this. Can anyone provide a simple answer to this? Mathematics and Context $X\in\mathbb{R}^{n\times d}$ is a matrix whose rows ${\bf{x}}_i$ are the $n$ training observations living in $d$ dimensions. ${\bf{y}}$ is an $n$-dimensional vector containing training labels $0$ and $1$ for each training input. …
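The short version, as I understand it (the standard Monte Carlo treatment, not quoted from a specific source): prediction averages the link function over posterior samples of the latent function. Given MCMC samples $\mathbf{f}^{(s)} \sim p(\mathbf{f} \mid X, \mathbf{y})$, draw the test latent value from the usual GP conditional and average:

\begin{equation}
\begin{aligned}
f_*^{(s)} &\sim p(f_* \mid \mathbf{x}_*, X, \mathbf{f}^{(s)}) \\
p(y_* = 1 \mid \mathbf{x}_*, X, \mathbf{y}) &\approx \frac{1}{S} \sum_{s=1}^{S} \sigma\!\left(f_*^{(s)}\right),
\end{aligned}
\end{equation}

where $\sigma$ is the sigmoid (or probit) link. The conditional $p(f_* \mid \mathbf{x}_*, X, \mathbf{f}^{(s)})$ is Gaussian with the noise-free GP regression mean and variance, because given the latent vector $\mathbf{f}$ the process is just a GP.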
Category: Data Science

Using a trained classifier in an Android app

As the title suggests, I'm attempting to get some trained classifiers into an Android app. The main question I have is how to represent the different models in a neat and effective way when moving from Python to Java (Android Studio). Background: I will connect 3 Bluetooth bio-marker sensors through an app in order to perform a medical classification of heart disease risk groups. I'm fairly experienced with the machine learning packages in Python, mainly scikit-learn. I want to …
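One route I would consider (a sketch; the model, feature count, and tensor name are placeholders): export the scikit-learn model to ONNX with skl2onnx, then run it on Android with ONNX Runtime's Java package, which avoids re-implementing the model in Java.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from skl2onnx import convert_sklearn
from skl2onnx.common.data_types import FloatTensorType

# Placeholder sensor data: 3 bio-marker features, binary risk label.
X = np.random.rand(200, 3).astype(np.float32)
y = (X.sum(axis=1) > 1.5).astype(int)
clf = RandomForestClassifier().fit(X, y)

# Convert to ONNX; 'input' and the feature count are assumptions.
onnx_model = convert_sklearn(
    clf, initial_types=[('input', FloatTensorType([None, 3]))])
with open('classifier.onnx', 'wb') as f:
    f.write(onnx_model.SerializeToString())
```

On the Android side the saved file is loaded with ONNX Runtime's OrtSession and fed a float tensor of the same shape.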
Category: Data Science

Advice on machine learning for small inputs and outputs

I am planning on using a machine learning algorithm to learn the mapping between sets of four coordinates (x,y,z + a distance d from a reference point) to two numbers (an amplitude A and a time t). In other words, a machine learning algorithm should learn, for each sample i, the mapping (x[i], y[i], z[i], d[i]) --> (A[i], t[i]) The coordinates x,y,z are integer numbers (because they are actually grid points on a fixed grid). The distance d is a …
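Given the low dimensionality, here is a sketch of a reasonable baseline (the estimator choice and synthetic arrays are assumptions, just to illustrate the shape of the problem): any multi-output regressor mapping 4 features to 2 targets.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import cross_val_score

# Placeholder arrays standing in for the described samples.
n = 500
x, y, z = (np.random.randint(0, 20, n) for _ in range(3))
d = np.sqrt(x**2 + y**2 + z**2)            # stand-in distance values
A = 0.5 * d + np.random.randn(n)           # stand-in amplitude
t = 2.0 * d + np.random.randn(n)           # stand-in time

inputs = np.column_stack([x, y, z, d])     # shape (n, 4)
targets = np.column_stack([A, t])          # shape (n, 2)

model = RandomForestRegressor(n_estimators=200)
print(cross_val_score(model, inputs, targets, cv=5))  # sanity check
model.fit(inputs, targets)
print(model.predict([[3, 7, 1, 7.7]]))     # one (A, t) prediction
```

Tree ensembles handle the integer grid coordinates naturally; if A and t vary smoothly with position, a GP with an anisotropic kernel is the other obvious candidate.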
Category: Data Science

What is a "variable index" in the Gaussian perspective?

I was going through this article about Gaussian processes, in which the author explains the "variable index" via a plot while writing about the 2D Gaussian. The explanation and plot are as below: I understood the y-axis in this plot, but I'm having trouble understanding the x-axis (variable index). Where did the values 1 and 2 on that axis come from, and how is the y-value 2 for both of them?
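As I read that kind of plot (an interpretation, since the article's figure isn't reproduced here): a single draw from a 2D Gaussian is a vector $(y_1, y_2)$, and the plot shows its two coordinates at x-positions 1 and 2, i.e. the indices of the variables. A draw like $(2, 2)$ therefore appears as two points of height 2.

```python
import numpy as np
import matplotlib.pyplot as plt

# One sample from a 2-D Gaussian is a vector with two coordinates.
mean = [0, 0]
cov = [[1, 0.9], [0.9, 1]]   # strongly correlated components
sample = np.random.multivariate_normal(mean, cov)

# Plot coordinate y_i against its "variable index" i.
plt.plot([1, 2], sample, 'o-')
plt.xticks([1, 2])
plt.xlabel('variable index')
plt.ylabel('value')
plt.show()
```

With many variables instead of two, the same picture becomes the familiar wiggly GP sample path, indexed by input location rather than by 1 and 2.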
Category: Data Science

How exactly do Gaussian Processes (square dist kernel) enforce smoothness? (Aka how are they computed to do so?)

From: http://www.cs.cmu.edu/~16831-f12/notes/F10/16831_lecture22_jlisee/16831_lecture22.jlisee.pdf "Gaussian Processes artificially introduce correlation between close samples in that vector in order to enforce some sort of smoothness on the succession of samples." But how is this computed? Is the function $f(x) \sim GP(\mu, k(x,x'))$ evaluated incrementally? E.g. does the $n$-th calculated value $f(x_n)$ use the values $f(x_1), \dots, f(x_{n-1})$ to compute its mean and variance?
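To see the mechanism concretely (a small numeric illustration, not taken from the linked notes): the squared-exponential kernel assigns high covariance to nearby inputs and near-zero covariance to distant ones, and a joint sample is drawn from a Gaussian with exactly that covariance matrix, which is what forces neighboring values to move together.

```python
import numpy as np

def sq_exp(x1, x2, length_scale=1.0):
    return np.exp(-0.5 * ((x1 - x2) / length_scale) ** 2)

print(sq_exp(0.0, 0.1))   # ~0.995: close points almost perfectly correlated
print(sq_exp(0.0, 3.0))   # ~0.011: distant points nearly independent

# A joint draw uses the full covariance at once, not an incremental scheme
# (small jitter added for numerical stability):
x = np.linspace(0, 5, 50)
K = sq_exp(x[:, None], x[None, :])
sample = np.random.multivariate_normal(np.zeros(50), K + 1e-8 * np.eye(50))
```

Sampling sequentially via conditionals is mathematically equivalent, but nothing incremental is required; the smoothness lives entirely in the covariance matrix $K$.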
Category: Data Science

About

Geeks Mental is a community that publishes articles and tutorials about Web, Android, Data Science, new techniques and Linux security.