Machine Learning for conditional density estimation

Question

Enk9456

May 17, 2022, 11:48

Suppose I have a set of examples $X = (x_1, x_2, \dots, x_n)$ with continuous numeric targets $Y = (y_1, y_2, \dots, y_n)$. While it is standard to use regression models to make point predictions $\hat{y}_i = f(x_i)$, I am interested in predicting a density function for $y_i$. What I want is analogous to using probabilities in classification instead of hard predictions (e.g. predict vs predict_proba in scikit-learn), but for continuous regression problems. Specifically, a different density function (e.g. in the form of a histogram or a closed-form solution) should be produced for each $y_i$ based on the input $x_i$, and of course the method should be applicable to new examples $x$. I will call this task conditional density estimation; apologies if there is another or better term for it.

Are there any building blocks (e.g. modules) or specific regressors in scikit-learn, Keras or PyTorch that allow you to do conditional density estimation?
If not, what approach would you propose for building a conditional density estimator?

Example: One approach I can think of is to use a decision tree and return a histogram of the training targets that fall into the same leaf node as the input. With a random forest, the histograms from the individual trees could be averaged. While this approach works for one specific type of model, I am also interested in a generic solution that would work for any regressor. Beyond histograms, I am also interested in methods that give closed-form solutions, e.g. a particular normal distribution for each input.
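
To make the decision-tree idea concrete, here is a minimal sketch, assuming scikit-learn's DecisionTreeRegressor and its apply method (which returns the leaf index for each sample). The helper names fit_leaf_histograms and predict_density, the synthetic data, the bin count and the min_samples_leaf value are my own illustrative choices, not part of any library.

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor


def fit_leaf_histograms(X, y, n_bins=20, min_samples_leaf=50, random_state=0):
    """Fit a regression tree and build one normalized histogram of y per leaf.

    Hypothetical helper for illustration only.
    """
    tree = DecisionTreeRegressor(
        min_samples_leaf=min_samples_leaf, random_state=random_state
    )
    tree.fit(X, y)
    # Shared bin edges so that histograms from different leaves are comparable.
    bin_edges = np.linspace(y.min(), y.max(), n_bins + 1)
    # Index of the leaf that each training example falls into.
    leaf_ids = tree.apply(X)
    leaf_hist = {}
    for leaf in np.unique(leaf_ids):
        # density=True rescales counts so the histogram integrates to 1.
        density, _ = np.histogram(
            y[leaf_ids == leaf], bins=bin_edges, density=True
        )
        leaf_hist[leaf] = density
    return tree, bin_edges, leaf_hist


def predict_density(tree, leaf_hist, X_new):
    """Return one histogram density (one row) per new example."""
    return np.vstack([leaf_hist[leaf] for leaf in tree.apply(X_new)])


# Synthetic heteroscedastic data (an assumption made for this demo):
# the noise level of y grows with |x|.
rng = np.random.default_rng(0)
X = rng.uniform(-3, 3, size=(2000, 1))
y = np.sin(X[:, 0]) + rng.normal(scale=0.1 + 0.3 * np.abs(X[:, 0]))

tree, bin_edges, leaf_hist = fit_leaf_histograms(X, y)
X_new = np.array([[-2.5], [0.0], [2.5]])
densities = predict_density(tree, leaf_hist, X_new)
# Each row of `densities` is a piecewise-constant density over `bin_edges`.
```

Each row plays the role of predict_proba for a regression target: it is a density over the shared bins rather than a single point prediction. The histogram is only as fine as the number of training targets in the leaf allows, which is why the sketch forces a fairly large min_samples_leaf.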

Topic density-estimation probability regression

Category Data Science
