I was working with a data set on which I wanted to solve non-negative least squares (NNLS) while obtaining a sparse model. After a bit of experimenting, I found that the following loss function worked best for me: $$\min_{x \geq 0} ||Ax-b|| + \lambda_1||x||_2^2+\lambda_2||x||_1^2$$ where the squared L2 penalty was implemented by adding white noise with a standard deviation of $\sqrt{\lambda_1}$ to $A$ (which can be shown to be equivalent to ridge regression …
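For reference, here is a minimal sketch of how both penalties can be folded into a plain NNLS solve by row augmentation instead of the noise trick, assuming the data-fit term is squared; the function name is hypothetical and `scipy.optimize.nnls` does the constrained solve:

```python
import numpy as np
from scipy.optimize import nnls

def sparse_nnls(A, b, lam1, lam2):
    """Sketch: min_{x>=0} ||Ax - b||^2 + lam1*||x||_2^2 + lam2*||x||_1^2.
    - lam1*||x||_2^2  <->  append sqrt(lam1)*I rows to A and zeros to b (exact ridge form).
    - lam2*||x||_1^2  <->  for x >= 0, ||x||_1 = 1^T x, so the penalty is a single
      extra row of sqrt(lam2)*ones with target 0.
    """
    m, n = A.shape
    A_aug = np.vstack([A,
                       np.sqrt(lam1) * np.eye(n),
                       np.sqrt(lam2) * np.ones((1, n))])
    b_aug = np.concatenate([b, np.zeros(n + 1)])
    x, _ = nnls(A_aug, b_aug)
    return x
```

The augmentation reproduces the squared penalties exactly, whereas adding white noise to $A$ only matches the ridge term in expectation.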
I have been reading about weight sparsity and activity sparsity with regard to convolutional neural networks. Weight sparsity I understood as having more trainable weights be exactly zero, which would essentially mean having fewer connections, allowing for a smaller memory footprint and quicker inference on test data. Additionally, it would help against overfitting (which I understand in terms of smaller weights leading to simpler models/Ockham's razor). From what I understand now, activity sparsity is analogous in that it would lead …
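A small sketch of how the two notions differ in practice, assuming Keras-style L1 regularization (the layer sizes and the 1e-4 strength are illustrative only): a penalty on the weights pushes individual connections to zero, while a penalty on the outputs pushes activations to zero.

```python
import tensorflow as tf
from tensorflow.keras import layers, regularizers

# Weight sparsity: L1 on the kernel drives individual weights to exactly zero.
weight_sparse = layers.Dense(64, activation='relu',
                             kernel_regularizer=regularizers.l1(1e-4))

# Activity sparsity: L1 on the layer output drives most activations to zero per input.
activity_sparse = layers.Dense(64, activation='relu',
                               activity_regularizer=regularizers.l1(1e-4))
```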
I am trying to train an autoencoder for dimensionality reduction and, hopefully, for anomaly detection. My data specifications are as follows: unlabeled, 1 million data points, 9 features. I am trying to reduce it to 2 compressed features so I can have better visualization for clustering. My autoencoder is as follows, where latent_dim = 2 and input_dim = 9:

class Autoencoder(tf.keras.Model):
    def __init__(self, latent_dim, input_dim):
        super(Autoencoder, self).__init__()
        self.latent_dim = latent_dim
        self.input_dim = input_dim
        self.dropout_factor = 0.5
        self.encoder = Sequential([
            # Dense(16, activation='relu', …
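Since the model is cut off above, here is a hedged completion for reference: a small symmetric 9 → 2 → 9 autoencoder. The layer widths, the linear output layer, and the omission of dropout are assumptions, not the asker's actual architecture.

```python
import tensorflow as tf
from tensorflow.keras import Sequential
from tensorflow.keras.layers import Dense

class Autoencoder(tf.keras.Model):
    def __init__(self, latent_dim=2, input_dim=9):
        super().__init__()
        self.encoder = Sequential([
            Dense(16, activation='relu', input_shape=(input_dim,)),
            Dense(8, activation='relu'),
            Dense(latent_dim),            # 2-D bottleneck used for visualization
        ])
        self.decoder = Sequential([
            Dense(8, activation='relu'),
            Dense(16, activation='relu'),
            Dense(input_dim),             # linear reconstruction of the 9 features
        ])

    def call(self, x):
        return self.decoder(self.encoder(x))

# model = Autoencoder()
# model.compile(optimizer='adam', loss='mse')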
Sparse methods such as LASSO contain a parameter $\lambda$ which is associated with the minimization of the $l_1$ norm. The higher the value of $\lambda$ ($>0$), the more coefficients are shrunk to zero. What is unclear to me is how this method decides which coefficients to shrink to zero. If $\lambda = 0.5$, does it mean that those coefficients whose values are less than or equal to 0.5 will become zero? So in other words, whatever …
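As a worked special case (not the general algorithm), assume the columns of $X$ are orthonormal and the objective is $\frac{1}{2}\|y - X\beta\|_2^2 + \lambda\|\beta\|_1$; then each LASSO coefficient is the soft-thresholded OLS coefficient: $$\hat\beta_j^{\text{lasso}} = \operatorname{sign}\bigl(\hat\beta_j^{\text{OLS}}\bigr)\,\max\bigl(|\hat\beta_j^{\text{OLS}}| - \lambda,\ 0\bigr).$$ So with $\lambda = 0.5$ it is the coefficients whose OLS estimates have absolute value at most 0.5 that become exactly zero, while the survivors are shrunk toward zero by 0.5; the rule acts on the fitted estimates, not as a fixed cutoff applied to the final coefficients.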
I didn't get the meaning of, or the difference between, sparse and dense corpora in this sentence: "the reason is that Skip-gram works better over sparse corpora like Twitter and NIPS, while CBOW works better over dense corpora".
I have a 2M x 2000 sparse matrix where rows represent items and columns represent dimensions. I want to understand whether there are meaningful clusters in the data, and I started to explore the dimensions to transform and normalise the data. Of the 2000 attributes of an item, many are strongly correlated (rho > .5). Are there clustering techniques that handle correlated attributes well automatically, without having to remove them manually?
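One common way to sidestep manual removal is to project the correlated columns onto a smaller set of orthogonal components before clustering. A minimal sketch with scikit-learn, assuming a scipy sparse input matrix `X`; `n_components` and `n_clusters` are placeholders to tune, not recommendations:

```python
from sklearn.decomposition import TruncatedSVD
from sklearn.preprocessing import Normalizer
from sklearn.cluster import MiniBatchKMeans
from sklearn.pipeline import make_pipeline

# TruncatedSVD works directly on sparse input and collapses redundant,
# correlated columns into orthogonal components; MiniBatchKMeans scales
# to the 2M rows better than plain k-means.
pipeline = make_pipeline(
    TruncatedSVD(n_components=100, random_state=0),
    Normalizer(),                        # optional: unit-length rows for k-means
    MiniBatchKMeans(n_clusters=20, random_state=0),
)
# labels = pipeline.fit_predict(X)       # X: 2M x 2000 scipy.sparse matrix
```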
I have 3D structure data of molecules. I represented the atoms as points in a 100*100*100 grid and applied a Gaussian blur to counter the sparseness (nearly all of the grid cells contain zeros). I am trying to build an autoencoder to get a meaningful "molecule structure to vector" encoder. My current approach is to use convolutional and max-pooling layers, then a flatten layer and a few dense layers to get a vector representation. Then I reshape and increase the dimension again …
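A minimal sketch of that kind of 3D conv autoencoder for a 100x100x100x1 density grid, using strided convolutions in place of max-pooling so the decoder shapes mirror the encoder exactly; the channel counts, the 128-D code size, and the large dense layers are assumptions, not the asker's actual model:

```python
import tensorflow as tf
from tensorflow.keras import layers, Model

latent_dim = 128
inputs = layers.Input(shape=(100, 100, 100, 1))
x = layers.Conv3D(4, 3, strides=2, padding='same', activation='relu')(inputs)    # -> 50^3 x 4
x = layers.Conv3D(8, 3, strides=2, padding='same', activation='relu')(x)         # -> 25^3 x 8
x = layers.Flatten()(x)
z = layers.Dense(latent_dim, name='molecule_vector')(x)                          # structure-to-vector code

x = layers.Dense(25 * 25 * 25 * 8, activation='relu')(z)                         # heavy; fine for a sketch
x = layers.Reshape((25, 25, 25, 8))(x)
x = layers.Conv3DTranspose(4, 3, strides=2, padding='same', activation='relu')(x)  # -> 50^3 x 4
outputs = layers.Conv3DTranspose(1, 3, strides=2, padding='same')(x)               # -> 100^3 x 1

autoencoder = Model(inputs, outputs)
autoencoder.compile(optimizer='adam', loss='mse')
```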
I have just learned about a general framework in constrained optimization called "proximal gradient optimization". It is interesting that the $\ell_0$ "norm" is also associated with a proximal operator. Hence, one can apply the iterative hard-thresholding algorithm to get a sparse solution of $$\min \Vert Y-X\beta\Vert_F + \lambda \Vert \beta \Vert_0$$ If so, why are people still using $\ell_1$? If you can just get the result by non-convex optimization directly, why are people still using LASSO? I …
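For concreteness, a short sketch of iterative hard thresholding, assuming a vector response and a squared data-fit term $\frac{1}{2}\|Y - X\beta\|_2^2 + \lambda\|\beta\|_0$ (not the unsquared Frobenius form written above):

```python
import numpy as np

def iterative_hard_thresholding(X, Y, lam, n_iter=500):
    """Sketch of IHT for min_beta 0.5*||Y - X beta||_2^2 + lam*||beta||_0.
    The prox of t*lam*||.||_0 zeroes every coordinate with |v_i| <= sqrt(2*t*lam)."""
    n, p = X.shape
    t = 1.0 / np.linalg.norm(X, 2) ** 2              # step = 1 / Lipschitz constant of the gradient
    beta = np.zeros(p)
    for _ in range(n_iter):
        v = beta - t * (X.T @ (X @ beta - Y))        # gradient step on the smooth part
        v[np.abs(v) <= np.sqrt(2 * t * lam)] = 0.0   # hard-thresholding prox of the l0 penalty
        beta = v
    return beta
```

Unlike the soft-thresholding prox of the $\ell_1$ penalty, this hard-thresholding step comes from a non-convex objective, so the iterates are only guaranteed to reach a local solution.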