How to use an RBF kernel to create a "kernel space" using the similarity of each pair of points?

I am trying to implement semi-unsupervised clustering using reinforcement learning, following this paper.

Assume I have n data points, each with d dimensions. I also have c pairwise constraints indicating whether two elements are supposed to be in the same cluster or not.

The paper states that "the original input dimension of the dataset is appended to a kernel space with a similarity metric to each pairwise point in the set of constraints", creating a (d + 2c)-dimensional space. They also say the kernel they use is an RBF.

Can anybody explain what they mean? How can you use the similarity of a pair of elements to generate a whole new dimension for every data point in the dataset? What exactly is a "kernel space" in this case? If I google the term, I only get results about operating-system kernels.

Topic rbf kernel reinforcement-learning svm clustering

Category Data Science


Let's take a simple example: a binary classification with only 2 dimensions and only 8 observations. Consider the 8 rows of the dataset below. It is impossible to linearly separate this data with a hyperplane.

So, we can use a kernel transform. The books usually drop you off right here (no help). They don't tell us how to transform. Transform using what? How many dimensions? What constraints?

We could consider the simplest transformation, f(x,y) = xy, and treat it as a new dimension. So we started with 2 features plus the class (D2 + Class), and we could add 1 dimension (to get D3 + Class) if we chose.

Q. Does this help? I suggest plotting this dataset out by hand. Using f(x,y) = xy is not very helpful.

So let's try another, f'(x,y) = x^2 + y^2. We started with (D2 + Class) features and added another dimension. We could conceivably plot D4 + Class, but using dimensions 1, 2 & 4 is easier to visualize. I suggest plotting {d1, d2, d4} by hand (for effect) or with 3D plotting software.

Now ask yourself: is this new situation linearly separable?
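If it helps to see this concretely, here is a minimal NumPy sketch of the same toy example (the 8 points and the two feature maps come from the table below in this answer, not from the paper):

```python
import numpy as np

# d1, d2 and the class label for the 8 observations in the table below
X = np.array([[ 1,  0], [ 0,  1], [-1,  0], [ 0, -1],
              [ 2,  0], [ 0,  2], [-2,  0], [ 0, -2]], dtype=float)
y = np.array([0, 0, 0, 0, 1, 1, 1, 1])

d3 = X[:, 0] * X[:, 1]           # f(x, y) = x*y        -> all zeros, no help
d4 = X[:, 0]**2 + X[:, 1]**2     # f'(x, y) = x^2 + y^2 -> 1 for class 0, 4 for class 1

print(d3)                              # [0. 0. 0. 0. 0. 0. 0. 0.]
print(d4)                              # [1. 1. 1. 1. 4. 4. 4. 4.]
# In the (d1, d2, d4) space a plane such as d4 = 2.5 separates the two classes:
print(np.all((d4 > 2.5) == (y == 1)))  # True
```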

As for the constraints, do you remember Lagrange multipliers? Well, if you want a Gaussian-like constraint, we can use a Radial Basis Function (RBF), where:

RBF: K(x,y) = exp(-gamma * ||x − y||^2), gamma > 0

The first part is kernel 101. The Lagrangian is kernel 400. lol
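To tie this back to the question: my reading of the quoted sentence is that, for each of the c constraints, you append to every data point its RBF similarity to the two points involved in that constraint, which adds 2 columns per constraint and gives d + 2c features in total. Here is a hedged sketch of that construction; the function names, the gamma value, and the constraint indices are mine for illustration, not the paper's:

```python
import numpy as np

def rbf(a, b, gamma=1.0):
    """K(a, b) = exp(-gamma * ||a - b||^2), gamma > 0"""
    return np.exp(-gamma * np.sum((a - b) ** 2))

def append_constraint_features(X, constraints, gamma=1.0):
    """Append, for each constrained point, a column of RBF similarities.

    X           : (n, d) data matrix
    constraints : list of (i, j) index pairs (must-link or cannot-link)
    returns     : (n, d + 2c) matrix
    """
    new_cols = []
    for i, j in constraints:
        new_cols.append([rbf(x, X[i], gamma) for x in X])  # similarity to point i
        new_cols.append([rbf(x, X[j], gamma) for x in X])  # similarity to point j
    return np.hstack([X, np.array(new_cols).T])

# Toy usage: 8 points in d = 2 dimensions, c = 2 constraints -> 2 + 2*2 = 6 columns
X = np.array([[1, 0], [0, 1], [-1, 0], [0, -1],
              [2, 0], [0, 2], [-2, 0], [0, -2]], dtype=float)
Z = append_constraint_features(X, constraints=[(0, 4), (1, 2)], gamma=0.5)
print(Z.shape)  # (8, 6)
```

Whether the paper distinguishes must-link from cannot-link columns, or how it chooses gamma, is not something I can tell from the quoted sentence, so treat this as one plausible reading rather than the paper's exact construction.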

Dataset

| row | d1: x | d2: y | C: class | d3: f(x,y)=xy | d4: f'(x,y)=x^2+y^2 |
| ---:| -----:| -----:| --------:| -------------:| -------------------:|
|   1 |  1|  0|      0|         0|                1|
|   2 |  0|  1|      0|         0|                1|
|   3 | -1|  0|      0|         0|                1|
|   4 |  0| -1|      0|         0|                1|
|   5 |  2|  0|      1|         0|                4|
|   6 |  0|  2|      1|         0|                4|
|   7 | -2|  0|      1|         0|                4|
|   8 |  0| -2|      1|         0|                4|
