Non-linear Regression

For example, suppose I have a data set which looks like:

[[x,y,z],
 [1,2,5],
 [2,3,8],
 [4,5,14]]

It's easy to find theta parameters that fit such a tiny data set, for example theta = [1,2,0]:

z = 1*x + 2*y + 0 
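
As a quick check of the fit above, here is a minimal sketch using NumPy least squares (NumPy is an assumption; the question names no library). Note that in this particular data set y = x + 1 in every row, so the columns of the design matrix are linearly dependent and theta = [1,2,0] is one of many exact fits. The least-squares routine returns the minimum-norm one instead.

```python
import numpy as np

# Rows of the small data set from the question: columns are x, y, z.
data = np.array([[1, 2, 5],
                 [2, 3, 8],
                 [4, 5, 14]], dtype=float)

# Design matrix [x, y, 1]; the trailing column of ones is the intercept term.
X = np.column_stack([data[:, 0], data[:, 1], np.ones(len(data))])
z = data[:, 2]

# theta = [1, 2, 0] from the question reproduces every z exactly.
theta_question = np.array([1.0, 2.0, 0.0])
print(X @ theta_question)

# Least-squares fit. rcond=1e-10 treats near-zero singular values as zero,
# so the dependent column (y = x + 1) is detected and lstsq returns the
# minimum-norm theta, which fits the three rows equally well.
theta_fit, _, rank, _ = np.linalg.lstsq(X, z, rcond=1e-10)
print(rank, theta_fit, X @ theta_fit)
```

Both parameter vectors give zero error on these three rows; more varied data would be needed to pin down a single theta.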

But suppose my data set is non-linear:

[[x,y,z],
 [1,2,6],
 [2,3,15]]

If I choose the mapping function to be z = xy + yy,

it would return the theta parameters:

theta = [1,1,0]
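
The same idea can be sketched in code: map each row to the features [x*y, y*y, 1] and then fit an ordinary linear model on those features. The helper name `quadratic_features` is hypothetical and NumPy is an assumption. With only two rows and three parameters the system is underdetermined, so theta = [1,1,0] is one exact fit among many.

```python
import numpy as np

def quadratic_features(x, y):
    """Hypothetical helper: map (x, y) to the features [x*y, y*y, 1]."""
    return np.column_stack([x * y, y * y, np.ones_like(x)])

# The two rows of the non-linear data set from the question.
x = np.array([1.0, 2.0])
y = np.array([2.0, 3.0])
z = np.array([6.0, 15.0])

Phi = quadratic_features(x, y)

# theta = [1, 1, 0] from the question reproduces z exactly.
theta_question = np.array([1.0, 1.0, 0.0])
print(Phi @ theta_question)

# The model is linear in theta, so plain least squares still applies.
# With 2 rows and 3 unknowns, lstsq returns the minimum-norm exact fit.
theta_fit, *_ = np.linalg.lstsq(Phi, z, rcond=None)
print(theta_fit, Phi @ theta_fit)
```

The model is non-linear in x and y but still linear in theta, which is why the linear solver works once the mapping has been chosen.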

So my question is how to choose such a mapping function for data sets which vary over time, as in a recommender system, where the user ratings vary with time. To reduce the cost, I've recently gone through regularization. Are there any other ideas for reducing the cost?

Topic objective-function linear-regression

Category Data Science


To answer your first question about non-linear regression:

I believe your problem of choosing a mapping function for non-linear regression can be solved by using Support Vector Machines.

SVMs can learn non-linear mapping functions in a kernel-induced feature space. The basic idea is to map the input data X into some high-dimensional feature space F using a non-linear mapping, and then do linear regression in that feature space. The kernel computes inner products in F directly, so the mapping itself never has to be written out by hand.
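
To make "linear regression in a kernel-induced feature space" concrete, here is a minimal NumPy sketch. It uses kernel ridge regression, a close relative of support vector regression and not SVR itself, because it fits in a few lines without an SVM library. The degree-2 polynomial kernel, the training grid, and the `ridge` value are all assumptions chosen for illustration.

```python
import numpy as np

def poly_kernel(A, B, degree=2):
    """Polynomial kernel k(a, b) = (a . b + 1) ** degree for all row pairs."""
    return (A @ B.T + 1.0) ** degree

# Training inputs on a small grid; targets follow z = x*y + y*y,
# the mapping function used in the question.
grid = np.arange(0.0, 5.0)
X_train = np.array([[x, y] for x in grid for y in grid])
z_train = X_train[:, 0] * X_train[:, 1] + X_train[:, 1] ** 2

# Kernel ridge regression: solve (K + ridge * I) alpha = z.
# The small ridge term (an assumed value) keeps the system well-posed.
ridge = 1e-6
K = poly_kernel(X_train, X_train)
alpha = np.linalg.solve(K + ridge * np.eye(len(K)), z_train)

# Predict at an unseen point; the true value is 1.5*2.5 + 2.5**2 = 10.0.
X_new = np.array([[1.5, 2.5]])
z_pred = poly_kernel(X_new, X_train) @ alpha
print(z_pred)
```

The x*y and y*y terms are never constructed explicitly; the degree-2 kernel implies them. In scikit-learn, `sklearn.svm.SVR` exposes the same choice through its `kernel` parameter (for example `"poly"` or `"rbf"`).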

To learn more about non-linear regression and kernels, you can read this.

Secondly, regularization is a technique used to address over-fitting. Over-fitting usually happens when the model is too complex for your training set or when you train it for far too many steps. In that case accuracy on the training set is high, but the model performs very poorly on unseen data. Regularization adds a penalty on large weights to the cost function; this typically raises the training cost slightly but helps lower the error on unseen data.

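The two most common types of regularization are L1 and L2. The difference lies in the power applied to the weight coefficients: L1 penalizes the sum of their absolute values, while L2 penalizes the sum of their squares. These should be enough for your SVM-based models.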
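
The two penalties, and the effect of an L2 penalty on a linear fit, can be sketched as follows. This assumes NumPy and reuses the x, y, z rows from the question without an intercept term; the function names and the lambda values are hypothetical.

```python
import numpy as np

def l1_penalty(theta):
    """L1 penalty: sum of absolute values of the weights."""
    return np.sum(np.abs(theta))

def l2_penalty(theta):
    """L2 penalty: sum of squared weights."""
    return np.sum(theta ** 2)

def ridge_fit(X, z, lam):
    """Closed-form minimizer of ||X @ theta - z||^2 + lam * ||theta||^2."""
    n_features = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(n_features), X.T @ z)

# x and y columns from the question's linear data set (no intercept here).
X = np.array([[1.0, 2.0], [2.0, 3.0], [4.0, 5.0]])
z = np.array([5.0, 8.0, 14.0])

# Larger lam shrinks the weights and raises the training error.
for lam in (0.0, 1.0, 10.0):
    theta = ridge_fit(X, z, lam)
    train_mse = np.mean((X @ theta - z) ** 2)
    print(lam, theta, l1_penalty(theta), l2_penalty(theta), train_mse)
```

With `lam = 0.0` the fit recovers theta = [1, 2]. The benefit of a larger `lam` only shows up on held-out data, which this tiny example does not have.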

To reduce the high cost caused by overfitting in neural-network models, you can also use Dropout and, to a lesser extent, Batch Normalization. These techniques do not apply to SVMs.

Hope this helps :)
