Latent variables with thresholds

There are many ML techniques, such as the EM algorithm, for estimating latent variables. Is there a technique that allows a threshold (a bounded range) for each latent variable?

I have a feature space with 10 variables $(X_1,\dots,X_{10})$ and the outcome $Y$. 7 of the $X$ features are known (I have their observations) and 3 are unknown. Each unknown feature lies in a range from 0 up to some known positive constant.

What ML technique would you recommend for estimating these latent variables under this setup?

Tags: estimators, machine-learning

Category: Data Science


Sure. Just treat the range as a prior on the latent variables. Typically we use a boring prior (e.g., a normal distribution, a uniform distribution), but in your case, if $X_7$ is unknown and in the range $[0, 7.3]$, then your prior for $X_7$ could be the uniform distribution on that range. Then apply the machinery of the EM algorithm as usual; the only change is that the E-step posterior over $X_7$ is truncated to that range, so the estimates can never leave it. It should all work.
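To make this concrete, here is a minimal sketch of the idea under some illustrative assumptions: a linear model $Y = X\beta + \varepsilon$, a uniform prior on each of the 3 unknown features over its range, and a MAP-style EM alternation (fit $\beta$ given the current latent values, then update each sample's latent features by box-constrained least squares, which is what the uniform prior reduces the MAP E-step to). All bounds, sizes, and the linear-model choice below are made up for the example:

```python
import numpy as np
from scipy.optimize import lsq_linear

# Illustrative setup: 7 observed features, 3 latent features,
# each latent bounded in [0, c_j]. Bounds are assumptions for the demo.
rng = np.random.default_rng(0)
n = 120
c = np.array([5.0, 3.0, 7.3])                 # assumed upper bounds

X_obs = rng.normal(size=(n, 7))               # the 7 observed features
Z_true = rng.uniform(0.0, 1.0, (n, 3)) * c    # ground-truth latents (unseen)
beta_true = rng.normal(size=10)
y = np.hstack([X_obs, Z_true]) @ beta_true + rng.normal(scale=0.1, size=n)

Z = np.full((n, 3), c / 2.0)                  # start latents mid-range
for _ in range(15):
    # "M-step": refit the regression given the current latent values.
    X_full = np.hstack([X_obs, Z])
    beta, *_ = np.linalg.lstsq(X_full, y, rcond=None)
    # "E-step" (MAP form): per-sample box-constrained least squares;
    # the uniform prior contributes nothing but the bounds.
    A = beta[7:].reshape(1, 3)
    resid = y - X_obs @ beta[:7]
    for i in range(n):
        Z[i] = lsq_linear(A, resid[i:i + 1], bounds=(np.zeros(3), c)).x

# The estimated latents respect the thresholds by construction.
assert np.all(Z >= -1e-9) and np.all(Z <= c + 1e-9)
```

Each alternation is a coordinate-descent step on the same squared-error objective, so the training residual is non-increasing; with a non-uniform prior the E-step would instead maximise (or take the expectation of) a truncated posterior.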


re. "estimate latent variables"

Quantities that are tuned in order to select a "best" model within a family of models are called hyper-parameters. To any single instance of the model they are fixed; to the optimisation routine they are an index into the search space. Adding constraints on the range of a hyper-parameter both shrinks that search space and requires extra "feasibility" checks during typical gradient descent.

A variable is "latent" when it is purely internal to the model, i.e. not an observable. The meaning of its scale would depend on the context and on your interpretation, since it cannot be compared to anything observed. You rarely want to constrain that range inside the model.

I would suggest leaving the hyper-parameters and latent variables unconstrained; if you need a bounded read-out, train a "neuron"-like response (e.g. sigmoid, tanh, softmax) that maps the unconstrained value into the range you want.
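A minimal sketch of that read-out idea: the model optimises an unconstrained value $u$, and a scaled sigmoid maps it into $[0, c]$ only at the output. The bound `c = 7.3` and the function name `read_out` are illustrative, not from the question:

```python
import numpy as np

c = 7.3  # assumed upper bound for one latent feature

def read_out(u, c):
    """Map an unconstrained latent u in (-inf, inf) to the range [0, c]."""
    return c / (1.0 + np.exp(-u))

u = np.array([-10.0, 0.0, 10.0])
print(read_out(u, c))  # approximately [0, c/2, c]
```

Because the sigmoid is smooth and strictly monotone, gradients flow through it without any feasibility checks, which is exactly what the constrained formulation would have forced on the optimiser.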
