Decision boundary in a classification task

I have 1000 data points from a bivariate normal distribution $\mathcal{N}$ with mean $(0,0)$, variances $\sigma_1^2=\sigma_2^2=10$, and zero covariance. There are also 20 more points from another bivariate normal distribution with mean $(15,15)$, variances $\sigma_1^2=\sigma_2^2=1$, and again zero covariance. I used the least squares method to calculate the parameters of the decision boundary $\theta_0 + \theta_1 x_1 + \theta_2 x_2=0$, that is $$\theta = (X^T X)^{-1}X^Ty$$ where $y$ is a column matrix with …
Category: Data Science
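
The setup described in this question can be sketched in a few lines. This is only an illustration under stated assumptions: the excerpt is cut off before it says how $y$ is encoded, so the labels $-1$ (large class) and $+1$ (small class) are an assumption here, as are the random seed and the helper names `solve_linear_system` and `fit_least_squares`. It uses only the Python standard library and solves the normal equations $(X^T X)\theta = X^T y$ directly.

```python
import math
import random


def solve_linear_system(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Augmented matrix [a | b]; rows are copied so the inputs stay untouched.
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    # Back substitution on the upper-triangular system.
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x


def fit_least_squares(points, labels):
    """Return theta solving (X^T X) theta = X^T y, with rows of X = [1, x1, x2]."""
    rows = [[1.0, x1, x2] for x1, x2 in points]
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    xty = [sum(r[i] * y for r, y in zip(rows, labels)) for i in range(3)]
    return solve_linear_system(xtx, xty)


rng = random.Random(0)  # fixed seed (an assumption) so the run is repeatable
sd_large = math.sqrt(10)  # variance 10 means standard deviation sqrt(10)

# Zero covariance, so each coordinate can be sampled independently.
large_class = [(rng.gauss(0, sd_large), rng.gauss(0, sd_large)) for _ in range(1000)]
small_class = [(rng.gauss(15, 1), rng.gauss(15, 1)) for _ in range(20)]

points = large_class + small_class
labels = [-1.0] * 1000 + [1.0] * 20  # assumed encoding, not given in the excerpt

theta = fit_least_squares(points, labels)
print("theta_0, theta_1, theta_2 =", theta)
```

The decision boundary is then the set of points where `theta[0] + theta[1] * x1 + theta[2] * x2` equals zero; a different label encoding would shift `theta` and therefore the boundary.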

Visualizing the equation for separating hyperplane

I was wondering if I can visualize with an example the fact that for all points $x$ on the separating hyperplane, the following equation holds: $$w^T x+w_0=0\quad\quad\quad \text{... equation (1)}$$ Here, $w$ is a weight vector and $w_0$ is a bias term (perpendicular distance of the separating hyperplane from the origin) defining the separating hyperplane. I was trying to visualize this in 2D space. In 2D, the separating hyperplane is nothing but the decision boundary. So, I took the following example: $w=[1\quad 2], …
Category: Data Science
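
A quick numerical check of equation (1) in 2D might look like the sketch below. The weight vector $w=[1\quad 2]$ comes from the question, but the excerpt is cut off before giving a bias, so `w0 = -4.0` is a hypothetical value chosen only for illustration, as are the function names. The sketch also computes the perpendicular distance of the line from the origin, which is $|w_0|/\lVert w\rVert$ and coincides with $|w_0|$ only when $w$ has unit norm.

```python
import math

w = (1.0, 2.0)  # weight vector from the question
w0 = -4.0       # hypothetical bias; the excerpt does not give one


def decision_value(x):
    """Evaluate w^T x + w0 for a 2D point x."""
    return w[0] * x[0] + w[1] * x[1] + w0


def point_on_boundary(x1):
    """Solve w1*x1 + w2*x2 + w0 = 0 for x2 (requires w2 != 0)."""
    return (x1, -(w0 + w[0] * x1) / w[1])


# Every point generated this way should satisfy equation (1).
boundary_points = [point_on_boundary(x1) for x1 in (-2.0, 0.0, 1.0, 4.0)]
values = [decision_value(p) for p in boundary_points]

# Perpendicular distance from the origin to the line.
distance_from_origin = abs(w0) / math.hypot(*w)

for p, v in zip(boundary_points, values):
    print(p, "->", v)
print("distance from origin:", distance_from_origin)
```

Points off the line give a nonzero value whose sign indicates the side of the boundary they fall on, for example `decision_value((0.0, 0.0))` versus `decision_value((4.0, 4.0))`.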

Does a linear classifier create a linear decision boundary in the input feature space?

I have read a lot, but I am still not able to grasp the following concepts: (1) If a classifier is given, how do we know whether it is a linear or nonlinear classifier? (I am interested in a step-by-step procedure for judging a classifier.) (2) If a classifier is linear, then its decision boundary is linear (true or false). (3) If a decision boundary is linear, then its classifier is linear (true or false). Now, let's suppose we have to …
Category: Data Science

Kernel selections in SVM

I want to understand the kernel selection rationale in SVM. Some basic things that I understand are that if the data is linear, then we must go for a linear kernel, and if it is non-linear, then others. But the question is how to tell whether the given data is linear or not, especially when it has many features. I know that with cross-validation I can try different kernels and select whichever performs best, but …
Category: Data Science

Questions about logistic regression

In the following Linear Regression discussion there are a few things I didn't understand. My questions are: In the third slide, what does the probability $P\left(y_i|x_i\right)$ mean, and accordingly what does it mean to maximize it? Does it mean maximizing both $P\left(y_i=1|x_i\right)$ and $P\left(y_i=0|x_i\right)$, and that the higher this probability, the more stable and correct the results, and accordingly the more correct the weights $w^*$ we get? In the fourth slide I don't follow the math; could anyone detail …
Category: Data Science

PCA vs. KernelPCA: which one to use for high dimensional data?

I have a dataset which contains a lot of features (>>3). For computational reasons, I would like to apply dimensionality reduction. At this point I could use different techniques: standard PCA, Kernel PCA, LLE, ... My problem is choosing the right approach, since the number of features is so high that I cannot know beforehand what the distribution of points is like. I could do so only if I had 3D data, but in my case I have …
Category: Data Science

Is a data set considered to be linearly separable if it can only be separated by multiple hyperplanes?

For example, on the linear separability Wikipedia article, the following example is given: They say "The following example would need two straight lines and thus is not linearly separable". On the other hand, in Bishop's 'Pattern Recognition and Machine Learning' book, he says "Data sets whose classes can be separated exactly by linear decision surfaces are said to be linearly separable". Under Bishop's definition of linear separability, I think the Wikipedia example would be linearly separable, even though the author …
Category: Data Science
