Solving a system of equations with sparse data
I am attempting to solve a system of equations with 40 independent variables (x1, ..., x40) and one dependent variable (y). There are ~300 equations (rows) in total, and I want to find the set of 40 coefficients that minimizes the total sum-of-squares error between y and the predicted values.
My problem is that the matrix is very sparse, and I do not know the best way to solve the system with such sparse data. An example of the dataset is shown below:
y        x1  x2  x3  x4  x5  x6  ...  x40
87169    14   0   1   0   0   2  ...    0
46449     0   0   4   0   1   4  ...   12
846449    0   0   0   0   0   3  ...    0
...
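To make the objective concrete, here is a minimal sketch of the sum-of-squares setup in Python, using SciPy's sparse least-squares solver purely as an illustration; the file name observations.csv and the column layout are assumptions, not part of my actual pipeline:

    # Minimal sketch: least-squares fit on a sparse design matrix.
    # Assumes a CSV with a header row and columns y, x1, ..., x40 (hypothetical file).
    import numpy as np
    from scipy import sparse
    from scipy.sparse.linalg import lsqr

    data = np.loadtxt("observations.csv", delimiter=",", skiprows=1)
    y = data[:, 0]                       # dependent variable (~300 rows)
    X = sparse.csr_matrix(data[:, 1:])   # 40 mostly-zero predictor columns

    # Ordinary least squares: minimizes ||X @ coef - y||^2,
    # i.e. the same sum-of-squares objective described above.
    coef = lsqr(X, y)[0]

    residual = y - X @ coef
    print("coefficients:", coef)
    print("sum of squared error:", float(residual @ residual))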
I am currently using a genetic algorithm to solve this, but the predicted values differ from the observed ones by roughly a factor of two.
Can anyone suggest different methods or techniques that are capable of solving a system of equations with sparse data?
Topic: genetic, regression, algorithms, machine-learning
Category: Data Science