Variational AutoEncoder giving negative loss

I'm learning about variational autoencoders and I've implemented a simple example in Keras; the model summary is below. I've copied the loss function from one of Francois Chollet's blog posts and I'm getting really, really negative losses. What am I missing here?

Model: "model_1"
Layer (type)                  Output Shape     Param #   Connected to
input_1 (InputLayer)          [(None, 224)]    0
encoding_flatten (Flatten)    (None, 224)      0         input_1[0][0]
encoding_layer_2 (Dense)      (None, 256)      57600     encoding_flatten[0][0]
encoding_layer_3 (Dense)      (None, 128)      32896     encoding_layer_2[0][0]
encoding_layer_4 …
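For reference, a minimal sketch of the loss from that blog post (names like z_mean and z_log_var are illustrative, not taken from the question's code). One common cause of strongly negative losses is a reconstruction term that assumes inputs in [0, 1] while the actual data falls outside that range:

import tensorflow.keras.backend as K
from tensorflow import keras

original_dim = 224

def vae_loss(x, x_decoded, z_mean, z_log_var):
    # Reconstruction term, summed over the 224 input dimensions; binary
    # crossentropy assumes x lies in [0, 1] and can go negative otherwise.
    reconstruction = original_dim * keras.losses.binary_crossentropy(x, x_decoded)
    # KL divergence between q(z|x) and the unit Gaussian prior (always >= 0).
    kl = -0.5 * K.sum(1 + z_log_var - K.square(z_mean) - K.exp(z_log_var), axis=-1)
    return K.mean(reconstruction + kl)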
Category: Data Science

What non-linearities are best in denoising RNN autoencoders, and where should they go?

I'm employing a denoising RNN autoencoder for a project relating to motion-capture data. This is my first time using autoencoder architectures and I was just wondering which non-linearities should be placed in these models and where they should go. This is my model as it stands:

class EncoderRNN(nn.Module):
    def __init__(self, input_size, hidden_size, num_layers):
        super(EncoderRNN, self).__init__()
        self.input_size = input_size
        self.hidden_size = hidden_size
        self.num_layers = num_layers
        self.rnn_enc = nn.RNN(input_size=input_size, hidden_size=hidden_size,
                              num_layers=num_layers, batch_first=True)
        self.relu_enc = nn.ReLU()

    def forward(self, x):
        pred, hidden …
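One relevant detail, shown in the hedged sketch below: nn.RNN already applies a tanh at every time step (nonlinearity='tanh' is the default), so an extra activation inside the encoder is optional. For real-valued targets such as motion-capture coordinates, a common choice is to leave the decoder's output layer linear (this DecoderRNN is a hypothetical counterpart, not the question's code):

import torch.nn as nn

class DecoderRNN(nn.Module):
    def __init__(self, hidden_size, output_size, num_layers):
        super().__init__()
        # the tanh non-linearity is applied inside nn.RNN at each step
        self.rnn_dec = nn.RNN(input_size=hidden_size, hidden_size=hidden_size,
                              num_layers=num_layers, batch_first=True)
        # linear output layer: reconstructions can take any real value
        self.out = nn.Linear(hidden_size, output_size)

    def forward(self, x):
        out, _ = self.rnn_dec(x)
        return self.out(out)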
Category: Data Science

How to use Variational Autoencoder's μ and σ with user-generated z?

My understanding of VAEs is that, unlike autoencoders, they do not directly give you a discrete encoding (an n-dim latent code vector); instead, they give you both mu and sigma (an n-dim mean vector and an n-dim standard deviation vector). Then you have epsilon, which you use to sample from a normal distribution with mu and sigma to create z. When combining mu, sigma and epsilon, you get z, which is the one decoded by the VAE's decoder. z is basically the …
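That combination step is the reparameterization trick. A minimal NumPy sketch (the latent dimension and the mu/log-variance values are made up for illustration):

import numpy as np

latent_dim = 2
mu = np.array([[0.5, -1.0]])        # encoder's mean output
log_var = np.array([[0.1, 0.3]])    # encoder's log-variance output

epsilon = np.random.normal(size=mu.shape)
z = mu + np.exp(0.5 * log_var) * epsilon   # z ~ N(mu, diag(sigma^2))

# A user-generated z needs no mu/sigma at all: because the KL term pulls
# q(z|x) toward N(0, I), vectors sampled from the unit Gaussian usually
# decode to plausible outputs, e.g. x_gen = decoder.predict(z_user).
z_user = np.random.normal(size=(1, latent_dim))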
Topic: autoencoder
Category: Data Science

Does an autoencoder not work with only a few images?

I am trying to load 2 images into an autoencoder, but for some reason it does not rebuild the input image. An autoencoder is supposed to compress and decompress an image. However, when passing the two images it trains on, it only shows a black image instead of the corresponding input.

import keras
from keras import layers

# This is the size of our encoded representations
encoding_dim = 32  # 32 floats -> compression of factor 24.5, assuming the input …
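A frequent cause of all-black reconstructions is a scaling mismatch: uint8 pixels in 0-255 fed to a model whose output layer is a sigmoid. A minimal runnable sketch with the inputs scaled to [0, 1] (random arrays stand in for the two images):

import numpy as np
import keras
from keras import layers

x_train = np.random.rand(2, 784).astype("float32")  # 2 flattened images in [0, 1]

encoding_dim = 32
inputs = keras.Input(shape=(784,))
encoded = layers.Dense(encoding_dim, activation="relu")(inputs)
decoded = layers.Dense(784, activation="sigmoid")(encoded)
autoencoder = keras.Model(inputs, decoded)
autoencoder.compile(optimizer="adam", loss="binary_crossentropy")

# With only 2 samples the model simply memorizes them, but it needs many
# epochs before the output stops looking flat.
autoencoder.fit(x_train, x_train, epochs=1000, verbose=0)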
Category: Data Science

Which algorithm can be used to reduce dimension of multiple time series?

In my dataset, a data point is essentially a time series of 6 features sampled monthly over a year, so in all it results in 6*12 = 72 features. I need to find class outliers, so I perform dimensionality reduction, hoping the differences in the data are maintained, and then apply k-means clustering and compute distances. For dimensionality reduction I have tried PCA and a simple autoencoder to reduce the dimension from 72 to 6, but the results are unsatisfactory. Can anyone please suggest any other …
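For concreteness, a sketch of the pipeline as described (random data as a placeholder); standardizing each feature before the reduction step often matters as much as the choice of reducer:

import numpy as np
from sklearn.preprocessing import StandardScaler
from sklearn.decomposition import PCA
from sklearn.cluster import KMeans

X = np.random.rand(500, 72)                      # 500 points, 6 features x 12 months
X = StandardScaler().fit_transform(X)            # zero mean, unit variance per feature

Z = PCA(n_components=6).fit_transform(X)         # 72 -> 6
km = KMeans(n_clusters=5, n_init=10).fit(Z)

# distance of each point to its assigned centroid as an outlier score
dist = np.linalg.norm(Z - km.cluster_centers_[km.labels_], axis=1)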
Category: Data Science

Incremental learning on Autoencoder for anomaly detection

I want to incrementally train my pre-trained autoencoder model on data received every minute. Based on this thread, successive calls to model.fit will incrementally train the model. However, the reconstruction error and overall accuracy of my model seem to be getting worse than they initially were. The code looks something like this:

autoencoder = load_pretrained_model()
try:
    while True:
        data = collect_new_data()
        autoencoder = train_model(data)  # invokes autoencoder.fit()
        time.sleep(60)
except KeyboardInterrupt:
    download_model(autoencoder)
    sys.exit(0)

The mean reconstruction error when my …
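Successive fit() calls do keep training, but with a full-size learning rate each minute of new data can overwrite what the pretrained weights encode (catastrophic forgetting). A hedged sketch of two common mitigations, a much smaller learning rate and replaying a sample of old data (the tiny model and random arrays are placeholders):

import numpy as np
from tensorflow import keras
from tensorflow.keras import layers

old_data = np.random.rand(1000, 5).astype("float32")  # previously seen data
new_data = np.random.rand(60, 5).astype("float32")    # one minute of new data

inputs = keras.Input(shape=(5,))
outputs = layers.Dense(5)(layers.Dense(3, activation="relu")(inputs))
autoencoder = keras.Model(inputs, outputs)            # stands in for the pretrained model

# 1) recompile with a much smaller learning rate for the incremental phase
autoencoder.compile(optimizer=keras.optimizers.Adam(learning_rate=1e-5), loss="mse")

# 2) mix a replay sample of old data into each incremental batch
replay = old_data[np.random.choice(len(old_data), 200, replace=False)]
batch = np.concatenate([new_data, replay])
autoencoder.fit(batch, batch, epochs=1, verbose=0)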
Category: Data Science

TextVectorization and Autoencoder for feature extraction of text

I'm trying to solve a problem which is as follows: I need to train an autoencoder to extract useful features from text; I will then use the trained autoencoder in another model as a feature extractor. The goal is to teach the autoencoder to compress the information and then reconstruct the exact same string. I treat this as a classification problem for each letter. My dataset:

X_train_autoencoder_raw:
15298    some text...
1127     some text...
22270    more text...
...
Name: data, Length: 28235, dtype: object …
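A hedged sketch of one way to wire this up: TextVectorization with character splitting turns each string into a fixed-length sequence of character ids, and the autoencoder is trained to predict those same ids back, i.e. a per-character classification (all sizes are illustrative):

import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers

seq_len, latent_dim = 64, 32
texts = tf.constant(["some text", "more text"])   # stand-in for the real Series

vectorizer = layers.TextVectorization(split="character", output_mode="int",
                                      output_sequence_length=seq_len)
vectorizer.adapt(texts)
vocab_size = vectorizer.vocabulary_size()
ids = vectorizer(texts)                           # (batch, seq_len) integer ids

inputs = keras.Input(shape=(seq_len,), dtype="int64")
x = layers.Embedding(vocab_size, 16)(inputs)
z = layers.LSTM(latent_dim)(x)                    # the compressed feature vector
x = layers.RepeatVector(seq_len)(z)
x = layers.LSTM(latent_dim, return_sequences=True)(x)
outputs = layers.Dense(vocab_size, activation="softmax")(x)  # one softmax per character

autoencoder = keras.Model(inputs, outputs)
autoencoder.compile(optimizer="adam", loss="sparse_categorical_crossentropy")
autoencoder.fit(ids, ids, epochs=1, verbose=0)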
Category: Data Science

Autoencoder not learning walk-forward image transformation

I have a series of 15 frames (60 rows x 50 columns). Over the course of those 15 frames, the moon moves from the top left to the bottom right. Data = https://github.com/aiqc/AIQC/tree/main/remote_datum/image/liberty_moon

I am attempting a walk-forward autoencoder where: the input data is a 60x50 image; the evaluation label is a 60x50 image from 2 frames later; all data is scaled between 0-1.

model = keras.models.Sequential()
model.add(layers.Conv1D(64*hp['multiplier'], 3, activation='relu', padding='same'))
model.add(layers.MaxPool1D(2, padding='same'))
model.add(layers.Conv1D(32*hp['multiplier'], 3, activation='relu', padding='same')) …
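One thing worth noting: Conv1D convolves over rows only, so the network never sees 2-D spatial structure. A hedged Conv2D variant of the same walk-forward setup (random frames stand in for the moon data):

import numpy as np
from tensorflow import keras
from tensorflow.keras import layers

frames = np.random.rand(15, 60, 50, 1).astype("float32")  # 15 frames + channel axis
x, y = frames[:-2], frames[2:]                             # target is 2 frames later

model = keras.Sequential([
    keras.Input(shape=(60, 50, 1)),
    layers.Conv2D(32, 3, activation="relu", padding="same"),
    layers.MaxPool2D(2, padding="same"),                   # 60x50 -> 30x25
    layers.Conv2D(16, 3, activation="relu", padding="same"),
    layers.UpSampling2D(2),                                # 30x25 -> 60x50
    layers.Conv2D(1, 3, activation="sigmoid", padding="same"),  # output in [0, 1]
])
model.compile(optimizer="adam", loss="mse")
model.fit(x, y, epochs=1, verbose=0)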
Category: Data Science

Using large CNNs (e.g., ResNet) in convolutional autoencoders for image representation learning

I am confused about which CNNs are generally used inside autoencoder architectures for learning image representations. Is it more common to use a large existing network like ResNet or VGG, or do most people write their own smaller networks? What are the pros and cons of each? If people are using a large network like ResNet or VGG, does the decoder mirror the steps taken by the encoder, or can a simpler decoding network be used? I am …
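On the second question, the decoder does not have to mirror the encoder block for block. A hedged sketch of the "large pretrained encoder, small custom decoder" option: a frozen ResNet50 maps 224x224 images to a 7x7x2048 feature map, and a handful of transposed convolutions decode it back to the input resolution:

from tensorflow import keras
from tensorflow.keras import layers

encoder = keras.applications.ResNet50(include_top=False, input_shape=(224, 224, 3))
encoder.trainable = False                      # keep the ImageNet features fixed at first

x = encoder.output                             # (7, 7, 2048)
for filters in [512, 128, 32, 16, 8]:          # five x2 steps: 7 -> 224
    x = layers.Conv2DTranspose(filters, 3, strides=2,
                               padding="same", activation="relu")(x)
outputs = layers.Conv2D(3, 3, padding="same", activation="sigmoid")(x)

autoencoder = keras.Model(encoder.input, outputs)
autoencoder.compile(optimizer="adam", loss="mse")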
Category: Data Science

An autoencoder setup for anomaly detection

I am doing anomaly detection using machine learning. I have tried different models such as isolation forest, SVM and KNN. The maximum accuracy that I can get from each of them is $80\%$ on my dataset, which contains $5$ features and $4000$ data samples, $18\%$ of which are anomalous. When I use an autoencoder and adjust a proper reconstruction-loss threshold I can get $92\%$ accuracy, but the hidden-layer setup of the autoencoder does not seem right, despite the …
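For comparison, a hedged sketch of a typical setup for 5 input features: a small symmetric bottleneck trained on normal samples only, with the threshold read off the reconstruction-error distribution (random data is a placeholder for the real 4000 samples):

import numpy as np
from tensorflow import keras
from tensorflow.keras import layers

X_normal = np.random.rand(3280, 5).astype("float32")  # the ~82% normal samples

inputs = keras.Input(shape=(5,))
x = layers.Dense(3, activation="relu")(inputs)
z = layers.Dense(2, activation="relu")(x)             # bottleneck
x = layers.Dense(3, activation="relu")(z)
outputs = layers.Dense(5)(x)

autoencoder = keras.Model(inputs, outputs)
autoencoder.compile(optimizer="adam", loss="mse")
autoencoder.fit(X_normal, X_normal, epochs=20, batch_size=32, verbose=0)

errors = np.mean((autoencoder.predict(X_normal) - X_normal) ** 2, axis=1)
threshold = np.percentile(errors, 95)                 # flag the top 5% as anomalous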
Category: Data Science

TensorFlow.js - CNN or autoencoder denoiser architecture?

I am new to machine learning. I have 10,000 examples of 128x256 arrays of values in 0.0-1.0. Each example consists of a pair: a clean example and the same one with noise added. I am aiming to train a CNN (or an autoencoder?) with these examples. I am currently able to train one dense layer without errors. My first problem is that my prediction returns a 128x256 int array rather than floats. My larger question is about finding a starting …
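As a starting point, a hedged sketch of a small denoising CNN written here in Keras; the same layer stack translates almost one-to-one to the TensorFlow.js layers API (tf.layers.conv2d and friends). Noisy arrays are the input and their clean counterparts the target (random placeholders here):

import numpy as np
from tensorflow import keras
from tensorflow.keras import layers

noisy = np.random.rand(8, 128, 256, 1).astype("float32")
clean = np.random.rand(8, 128, 256, 1).astype("float32")

model = keras.Sequential([
    keras.Input(shape=(128, 256, 1)),
    layers.Conv2D(32, 3, activation="relu", padding="same"),
    layers.MaxPool2D(2, padding="same"),
    layers.Conv2D(16, 3, activation="relu", padding="same"),
    layers.UpSampling2D(2),
    # sigmoid keeps predictions as floats in [0, 1], matching the data range
    layers.Conv2D(1, 3, activation="sigmoid", padding="same"),
])
model.compile(optimizer="adam", loss="mse")
model.fit(noisy, clean, epochs=1, verbose=0)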
Category: Data Science

What model and attributes would be good for this data?

I have the following set of data, as in the picture, with 366 temperature values for one year. The first part of the data would be for training and the second for testing. I would like to detect the anomalies in the test data. What time steps should I choose when making the training sequence data? I have tried using 32. What model should I train? I tried using Keras Conv1D and LSTM, but I can't find the optimal settings. …
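One reasonable starting point, shown as a hedged sketch: a Conv1D autoencoder over sliding windows of the daily series. With TIME_STEPS = 32 (as tried in the question), one year of data yields 366 - 32 + 1 = 335 training windows (random values stand in for the temperatures):

import numpy as np
from tensorflow import keras
from tensorflow.keras import layers

TIME_STEPS = 32
series = np.random.rand(366).astype("float32")
X = np.stack([series[i:i + TIME_STEPS]
              for i in range(len(series) - TIME_STEPS + 1)])[..., None]  # (335, 32, 1)

model = keras.Sequential([
    keras.Input(shape=(TIME_STEPS, 1)),
    layers.Conv1D(32, 7, strides=2, padding="same", activation="relu"),  # 32 -> 16
    layers.Conv1D(16, 7, strides=2, padding="same", activation="relu"),  # 16 -> 8
    layers.Conv1DTranspose(16, 7, strides=2, padding="same", activation="relu"),
    layers.Conv1DTranspose(32, 7, strides=2, padding="same", activation="relu"),
    layers.Conv1DTranspose(1, 7, padding="same"),                        # back to (32, 1)
])
model.compile(optimizer="adam", loss="mse")
model.fit(X, X, epochs=10, verbose=0)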
Category: Data Science

How to detect anomalies?

I have time-series data with one value per day for a year (there is one column with temperature data). I am using autoencoders to train a reconstruction model with MSE loss. First, I normalized the data using the following code:

training_mean = preprocessed_data.mean()
training_std = preprocessed_data.std()
df_training_value = (preprocessed_data - training_mean) / training_std

After this I make sequences from the data. I am not sure if it's OK to choose 32 time steps, but otherwise I can't fit the model. …
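For the remaining steps, a hedged sketch of the usual pattern: build overlapping 32-step windows, then (after training) take the reconstruction errors on the training windows and pick a threshold from their distribution; the Keras timeseries-anomaly example simply uses the maximum training MAE:

import numpy as np

TIME_STEPS = 32

def create_sequences(values, time_steps=TIME_STEPS):
    # overlapping windows, shape (n - time_steps + 1, time_steps, 1)
    return np.stack([values[i:i + time_steps]
                     for i in range(len(values) - time_steps + 1)])[..., None]

x_train = create_sequences(np.random.rand(366).astype("float32"))  # placeholder data

# after model.fit(x_train, x_train, ...):
#   train_pred = model.predict(x_train)
#   train_mae = np.mean(np.abs(train_pred - x_train), axis=(1, 2))
#   threshold = np.max(train_mae)   # anything above this is flagged as an anomaly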
Category: Data Science

Preserve colour in convolutional autoencoder

At the moment I am working with convolutional autoencoders, and I am looking for papers or methods that address colour preservation. Most AE papers use grayscale images, and loss functions such as SSIM that preserve structure very well are also focused on grayscale images. My networks are good at preserving structure (with SSIM as the loss) but have a hard time reproducing the right colours. I use an all-convolutional architecture without any pooling; my downsampling is derived …
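One common workaround, sketched below under stated assumptions: keep SSIM for structure but blend in a plain per-pixel L1 term, which directly penalizes colour shifts. The 0.8/0.2 weighting is only an illustrative starting point:

import tensorflow as tf

def ssim_l1_loss(y_true, y_pred, alpha=0.8):
    # structural term: 1 - mean SSIM (images assumed scaled to [0, 1])
    ssim_term = 1.0 - tf.reduce_mean(tf.image.ssim(y_true, y_pred, max_val=1.0))
    # colour term: per-channel L1 distance
    l1_term = tf.reduce_mean(tf.abs(y_true - y_pred))
    return alpha * ssim_term + (1.0 - alpha) * l1_term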
Category: Data Science

Encoder-Decoder performance time

I have two encoder-decoder models. (The two model summaries were attached as images: first model, second model.) When I check the performance of the models I get approximately the same run time (first model ~42 sec, second model ~40 sec). I train my models on a GPU and measure performance on a CPU. I test only on one large image, of size 12348x12348. I was expecting the larger model, which has more parameters to train (the second model), to give me a longer run time. Anyone can …
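When a single 12348x12348 image dominates the run, one-off costs (graph tracing on the first predict call, tiling, I/O) can swamp the difference in parameter counts. A hedged sketch of a fairer measurement, with a warm-up call and an average over several runs (model and image are placeholders for the question's objects):

import time

def time_inference(model, image, runs=5):
    model.predict(image)                      # warm-up: excludes one-off tracing cost
    start = time.perf_counter()
    for _ in range(runs):
        model.predict(image)
    return (time.perf_counter() - start) / runs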
Category: Data Science

MNIST data shape

In going through the different tutorials on CNNs, autoencoders, and so on, I trained myself on the MNIST problem. The images are stored in a 3D array whose shape is (60000, 28, 28). In some tutorials, for the first layer of the CNN they use the Flatten function, keras.layers.Flatten(input_shape=()), but in other tutorials they transform the 3D array into a 4D array (60000, 28, 28, 1), which I suppose is identical to using the Flatten function? Am I right? Why are there two …
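The two idioms are not actually equivalent, as this small sketch shows (random data stands in for MNIST): adding a channel axis keeps the 2-D structure that Conv2D layers need, while flattening (what keras.layers.Flatten does inside a model) collapses each image into a 784-vector for Dense layers:

import numpy as np

x = np.random.rand(60000, 28, 28).astype("float32")

x_conv = x.reshape(-1, 28, 28, 1)   # (60000, 28, 28, 1): for Conv2D layers
x_flat = x.reshape(len(x), -1)      # (60000, 784): what Flatten produces

assert x_conv.shape == (60000, 28, 28, 1) and x_flat.shape == (60000, 784)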
Category: Data Science

How to improve L2 loss for generative autoencoder

I am working with a modified generative autoencoder and am having issues getting the L2 loss sufficiently low. I think the problem is that, because my data covers a very large range and is standardized to values between zero and one, small discrepancies in the standardized data lead to larger ones in the unstandardized data. Additionally, my other loss terms, despite being averaged by the number of points in the batch, are usually orders of magnitude larger than my L2 loss, which I …
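The usual first fix is an explicit weight on the L2 term so it is not drowned out by the larger terms. A hedged sketch (other_loss stands in for the question's additional terms; the weight would be tuned, e.g. on a log scale):

import tensorflow as tf

L2_WEIGHT = 100.0  # illustrative; sweep over powers of 10

def total_loss(y_true, y_pred, other_loss):
    l2 = tf.reduce_mean(tf.square(y_true - y_pred))
    return L2_WEIGHT * l2 + other_loss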
Category: Data Science

How to set a threshold value by looking at the loss distribution in an anomaly detection task

I am following this tutorial https://towardsdatascience.com/lstm-autoencoder-for-anomaly-detection-e1f4f2ee7ccf to use an LSTM autoencoder to detect anomalies in my unsupervised dataset. They plotted the loss distribution, and I plotted the same loss distribution for my dataset (given in the image below). My question is how they set the threshold value by looking at the loss distribution. I also want to set a threshold by looking at my loss distribution, but it is not clear to me how to select it. They say in the tutorial: "By plotting the loss …
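Two common recipes for turning a loss distribution into a threshold, sketched below (train_losses stands in for the per-sample reconstruction errors behind the plot):

import numpy as np

train_losses = np.random.exponential(0.1, size=5000)  # placeholder errors

# 1) a high quantile: flags a fixed fraction of samples as anomalous
threshold_q = np.quantile(train_losses, 0.99)

# 2) mean + k standard deviations: assumes a roughly unimodal error distribution
threshold_sigma = train_losses.mean() + 3 * train_losses.std()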
Category: Data Science

Training data for anomaly detection using LSTM Autoencoder

I am building a time-series anomaly detection engine using an LSTM autoencoder. I read this article, where the author suggests, in response to a comment, training the model on clean data only. However, in most cases it is not possible to find and exclude anomalies manually. I had always believed that because anomalies are very rare, if we train the model on all the data then the model will learn the normal behavior of the time series and be ready to …
Category: Data Science
