Encoding correlation

I have a rather theory-based question, as I'm not that experienced with encoders, embeddings, etc. Scientifically, I'm mostly oriented around novel evolutionary model-based methods.

Let's assume we have a data set with highly correlated attributes. Usually encoders are trained to learn a representation in a smaller number of dimensions. What I'm wondering about is quite the opposite: would it be possible to learn an encoding into a higher number of dimensions that is less correlated (ideally uncorrelated)? The idea is to turn a low-dimensional, very tough problem into a high-dimensional but easier one, i.e. to unwrap those intricate correlations using a NN and decode the solutions later.

Edit 1: Of course, we assume we know the correlation mapping really well. How exactly could I use the correlation mapping to unwrap it? Is it fundamentally possible to unmap attribute dependencies?

Topic: encoder, mathematics, theory, statistics

Category: Data Science


You could do that with a neural autoencoder using a custom loss function. Use a hidden layer, let's call it $l_{encoded}$, with more nodes than the input data has features.

You have to code the custom loss: $loss = corr(o(l_{encoded})) + MSE(o(l_{output}),\ input)$,

where $corr(o(l_{encoded}))$ is a penalty on the pairwise correlations between the activations of the encoding layer, computed across a batch (for example, the sum of the absolute off-diagonal entries of their correlation matrix), and $MSE(o(l_{output}), input)$ is the mean squared error between the last layer's output and the input instance.

Using this loss, your model will try to reduce the correlation between the hidden-layer activations while still making sure that it is able to reconstruct the training instances.
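Here is a minimal PyTorch sketch of that idea, in case it helps. The class name `OvercompleteAutoencoder`, the layer sizes, the `tanh` activation, and the specific choice of the sum of absolute off-diagonal batch correlations as $corr(\cdot)$ are all illustrative assumptions on my part, not fixed by the formula above:

```python
import torch
import torch.nn as nn

class OvercompleteAutoencoder(nn.Module):
    """Autoencoder whose latent layer has MORE units than the input has
    features, trained to decorrelate those units (illustrative sizes)."""

    def __init__(self, n_features=8, n_latent=32):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Linear(n_features, n_latent),
            nn.Tanh(),
        )
        self.decoder = nn.Linear(n_latent, n_features)

    def forward(self, x):
        z = self.encoder(x)           # o(l_encoded)
        return z, self.decoder(z)     # latent code, reconstruction

def correlation_penalty(z, eps=1e-8):
    """One possible corr(o(l_encoded)): sum of absolute off-diagonal
    entries of the batch correlation matrix of the latent activations."""
    z = z - z.mean(dim=0, keepdim=True)          # center per latent unit
    cov = (z.T @ z) / (z.shape[0] - 1)           # batch covariance
    std = z.std(dim=0, unbiased=True) + eps
    corr = cov / (std[:, None] * std[None, :])   # batch correlation matrix
    off_diag = corr - torch.diag(torch.diagonal(corr))
    return off_diag.abs().sum()

# Training loop on random stand-in data (replace with your correlated set).
model = OvercompleteAutoencoder()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
mse = nn.MSELoss()
x = torch.randn(256, 8)

for step in range(1000):
    z, x_hat = model(x)
    # loss = corr(o(l_encoded)) + MSE(o(l_output), input)
    loss = correlation_penalty(z) + mse(x_hat, x)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
```

In practice you would probably weight the penalty, $loss = \lambda \cdot corr(\cdot) + MSE(\cdot)$, and tune $\lambda$, since an unweighted sum lets whichever term happens to be larger dominate training.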

I highly doubt this will be of any use though.
