Finding a vector that minimizes the MSE of its linear combination

I have been doing a COVID-19 related project. Here is the question:

  • N = vector of daily new infected cases
  • D = vector of daily deaths
  • E[D] = estimate of daily deaths

N is an n-dimensional vector (n is around 60), and E[D] is another n-dimensional vector. Under certain assumptions, each entry of E[D] can be computed as a linear combination of the entries of N.

We want to find the vector N such that the E[D] derived from N has the least mean squared error when compared to the actual D data. I think a gradient descent algorithm is needed here, but I am not very familiar with gradient descent.
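To make the setup concrete: assume the linear combination is given by a known n × n matrix M, so that E[D] = M · N. With random placeholder data standing in for M and D, I imagine the gradient-descent version would look something like this sketch:

```python
import numpy as np

# Placeholder stand-ins: M is the (assumed known) matrix encoding the
# linear combination, so E[D] = M @ N; D is the observed deaths vector.
rng = np.random.default_rng(0)
n = 60
M = rng.normal(size=(n, n))
D = rng.normal(size=n)

N = np.zeros(n)                          # vector to optimize
lr = 0.01                                # step size
for _ in range(20_000):
    grad = 2.0 / n * M.T @ (M @ N - D)   # gradient of mean((M @ N - D)**2)
    N -= lr * grad

print(np.mean((M @ N - D) ** 2))         # MSE after optimization
```

(As I understand it, this objective is also an ordinary linear least-squares problem, so a closed-form solve such as numpy's np.linalg.lstsq should reach the same minimum without any iteration.)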

This seems like a basic data science problem, but I am kind of lost right now. Does anyone have an idea of which algorithm I should dig into?

Topic: mse, gradient-descent, optimization

Category: Data Science


A basic linear or polynomial regressor can be tried; scikit-learn has ready-made APIs for these. Regression can also be implemented with SVMs, decision trees, random forests, etc., or, for more complex use cases, with deep neural networks.

These estimators handle the optimization internally, hiding the mathematical complexity from you (some, like scikit-learn's SGDRegressor, literally run gradient descent; others use closed-form or tree-based fitting).
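For instance, a minimal sketch with scikit-learn's SGDRegressor, which fits a linear model by stochastic gradient descent (the X and y here are made-up placeholders for your own features and targets):

```python
import numpy as np
from sklearn.linear_model import SGDRegressor

# Made-up data standing in for your features (X) and targets (y).
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))
y = X @ rng.normal(size=5) + rng.normal(scale=0.1, size=200)

# SGDRegressor minimizes the squared error via stochastic gradient
# descent; the optimization details stay hidden behind fit/predict.
model = SGDRegressor(loss="squared_error", max_iter=1000, tol=1e-4)
model.fit(X, y)
print(model.predict(X[:3]))
```

SGDRegressor works best with standardized features; the standard-normal placeholder data here is already on that scale.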


If the inputs are the same and you expect multiple outputs, I would recommend looking at multioutput models, which come in two types:

  • native multioutput algorithms
  • multiple single-output regressors wrapped together; scikit-learn's MultiOutputRegressor does exactly that (see the sketch below)
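A minimal sketch of the second option, wrapping a single-output Ridge regressor with scikit-learn's MultiOutputRegressor (X and Y are made-up placeholder data):

```python
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.multioutput import MultiOutputRegressor

# Made-up data: 100 samples, 5 input features, 3 output targets.
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 5))
Y = X @ rng.normal(size=(5, 3)) + rng.normal(scale=0.1, size=(100, 3))

# MultiOutputRegressor fits one copy of the base regressor per target.
model = MultiOutputRegressor(Ridge(alpha=1.0))
model.fit(X, Y)
print(model.predict(X[:2]).shape)   # (2, 3): one prediction per target
```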
