How to add the Luong Attention Mechanism into a CNN?

I wrote the CNN model below for binary image classification, and I'm trying to add an attention layer to it. I read the tf.keras.layers.Attention documentation: https://www.tensorflow.org/api_docs/python/tf/keras/layers/Attention but I still don't know exactly how to use it. Any help is appreciated. model = keras.Sequential() model.add(Conv2D(filters = 64, kernel_size = (3, 3), activation = 'relu', padding='same', input_shape = ((256,256,3)))) model.add(MaxPooling2D(pool_size = (2, 2), strides=(2, 2))) model.add(Conv2D(filters = 128, kernel_size = (3, 3), activation = 'relu', padding='same')) model.add(MaxPooling2D(pool_size = (2, 2), strides=(2, …
Category: Data Science

Why does a convolutional layer learn only biases?

I'm training a siamese CNN to distinguish between pairs of images. My train/val binary cross-entropy loss values show a downward trend, implying that some of the model parameters are being updated, but I noticed that the convolution kernels barely change while their biases change significantly (TensorBoard weights histogram image). Also, while the loss value decreases, accuracy appears to be frozen for some epochs and then instantly shoots up (accuracy and loss plots). Q1: If this is caused by vanishing gradient, why would it …
Category: Data Science

How to determine the number of neurons in each hidden layer and the number of hidden layers for face recognition

I plan to build a CNN for face recognition using this Kaggle dataset. I tried building a model with a single hidden layer with 256 fully connected neurons, and it gave an accuracy of 45% after 55 epochs. Should I just set the no. of hidden layers (and the no. of neurons in the layers) as variables, and repeat the model evaluation process for various values of the variables to determine the optimum values? Or is there any other, more …
Category: Data Science

How does a convolutional layer work?

I have one question regarding CNNs. A single convolutional layer can have multiple filters, right? Are these filters all the same? Is a single layer made to detect only one feature? I'm a bit confused about how a convolutional layer works.
Category: Data Science
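
The question above can be illustrated with a minimal sketch in plain Python. It assumes two hand-picked 3x3 filters (in a trained network the values are learned, not chosen) and shows that one layer can hold several different filters, each producing its own feature map from the same input. The helper name conv2d_valid and the toy image are illustrative assumptions, not taken from the question.

```python
# Minimal sketch: one convolutional layer holding two different 3x3 filters.
# Filter values are hand-picked for illustration; real filters are learned.

def conv2d_valid(image, kernel):
    """Slide one kernel over a 2D image (stride 1, no padding).

    Like most deep learning libraries, this computes a cross-correlation
    (the kernel is not flipped).
    """
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(image[i + a][j + b] * kernel[a][b]
                for a in range(kh) for b in range(kw))
            for j in range(out_w)
        ]
        for i in range(out_h)
    ]

# 4x4 toy image with a vertical edge between columns 1 and 2.
image = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
]

# Two different filters that belong to the same layer.
vertical_edge = [[-1, 0, 1], [-1, 0, 1], [-1, 0, 1]]
horizontal_edge = [[-1, -1, -1], [0, 0, 0], [1, 1, 1]]

layer_filters = [vertical_edge, horizontal_edge]

# One feature map per filter: the layer's output has len(layer_filters) channels.
feature_maps = [conv2d_valid(image, k) for k in layer_filters]

for name, fmap in zip(["vertical", "horizontal"], feature_maps):
    print(name, fmap)
```

In this toy case the vertical-edge filter responds strongly and the horizontal-edge filter gives zeros everywhere, which is one way to see that the filters in a layer are not the same and that the layer as a whole detects more than one feature.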

LeNet-5 - combining feature maps in C3 layer

The famous LeNet-5 architecture looks like this: The output of layer S2 has dimension: 10x10x6 - so basically an image with 6 convolutions applied to it to derive features. If each dimension was again submitted to 6 filters the resulting output would be 10x10x36, however it is 10x10x16. Initially I stumbled on this, but I finally understood that this is done by combining inputs from layer S2 and applying one kernel to them, as explained in the article: Layer …
Category: Data Science

Can I have the input to a neural network be a set of 2D coordinates if I run them through a convolution layer?

I asked this question a few days ago with no response and still don't have an answer, so I will ask again. I am training a reinforcement learning agent on a 2D grid. It is fed its position and the target positions as x,y coordinates. An example input would be like [[1,3],[2,2],[5,1]]. I thought that if I just fed in the input with a flatten layer (would be 1,3,2,2,5,1), there would not be a strong enough association between …
Category: Data Science

Using softmax for multilabel classification (as per Facebook paper)

I came across this paper by some Facebook researchers where they found that using a softmax and CE loss function during training led to improved results over sigmoid + BCE. They do this by changing the one-hot label vector such that each '1' is divided by the number of labels for the given image (e.g. from [0, 1, 1, 0] to [0, 0.5, 0.5, 0]). However, they do not mention how this could then be used in the inference stage, …
Category: Data Science
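
The label transformation described in the question above can be sketched in plain Python. This is a minimal illustration, assuming made-up logits and hypothetical helper names; it covers only the training-side target and loss, not the inference stage the question asks about.

```python
import math

def normalize_multilabel_target(multi_hot):
    """Divide each '1' by the number of active labels, e.g. [0,1,1,0] -> [0,0.5,0.5,0]."""
    n_labels = sum(multi_hot)
    if n_labels == 0:
        raise ValueError("target has no active labels")
    return [v / n_labels for v in multi_hot]

def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def cross_entropy(target, probs):
    """Cross-entropy between a soft target distribution and predicted probabilities."""
    return -sum(t * math.log(p) for t, p in zip(target, probs) if t > 0)

target = normalize_multilabel_target([0, 1, 1, 0])
logits = [0.1, 2.0, 1.5, -1.0]  # made-up model outputs, for illustration only
loss = cross_entropy(target, softmax(logits))

print(target)
print(round(loss, 4))
```

Because the normalized target sums to 1, it is a valid probability distribution, so the usual softmax cross-entropy applies without modification.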

Can I use a 1d convolution on a set of coordinates?

I am training a reinforcement learning agent. It is fed its position and the target positions as x,y coordinates. An example input would be like [[1,3],[2,2],[5,1]]. I thought that if I just fed in the input with a flatten layer (would be 1,3,2,2,5,1), there would not be a strong enough association between each coordinate pair. Because of this, I used a 1D convolution layer with 5 filters, and a step and size of 2, which I hoped …
Category: Data Science
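
The idea in the question above, a 1D convolution with size 2 and step 2 over the flattened coordinates, can be sketched in plain Python. This minimal example assumes a single filter with arbitrary weights (the question uses 5 learned filters) and only shows which inputs each output sees.

```python
def conv1d(sequence, kernel, stride):
    """1D cross-correlation with no padding."""
    k = len(kernel)
    return [
        sum(sequence[i + j] * kernel[j] for j in range(k))
        for i in range(0, len(sequence) - k + 1, stride)
    ]

coords = [[1, 3], [2, 2], [5, 1]]
flat = [v for pair in coords for v in pair]  # [1, 3, 2, 2, 5, 1]

# Arbitrary illustrative weights: w_x = 1, w_y = 10.
kernel = [1.0, 10.0]

# With kernel size 2 and stride 2, each output depends on exactly one (x, y) pair.
per_pair_features = conv1d(flat, kernel, stride=2)
print(per_pair_features)  # [31.0, 22.0, 15.0]
```

Each output value is computed from one coordinate pair only, and the same weights are shared across all pairs, which is the association the question is trying to enforce.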

CNN can't predict images outside the dataset

I am using the CelebA dataset to train my CNN face landmark detection model. Here is my model: class LandmarkModel: def __init__(self,inp_shape): self.model = models.Sequential() self.model.add(layers.Conv2D(16, (3, 3), activation='relu', input_shape=inp_shape))#l1 self.model.add(layers.Conv2D(32,(3, 3), activation='relu')) self.model.add(layers.MaxPooling2D((2, 2))) self.model.add(layers.Conv2D(64,(3, 3), activation='relu')) self.model.add(layers.Flatten()) self.model.add(layers.Dense(512)) self.model.add(layers.Dense(10)) def getModel(self): return self.model I have trained my model on around 5k-6k images, reaching a loss of 0.1. When I use an image from the dataset that is outside the training sample, I get a correct prediction. But when I use my own clicked …
Category: Data Science

Understanding Conv1D Output Shape

I am a little confused by the output shape that Conv1D produces. Consider the following code (a lot has been omitted for clarity): input_shape = x_train_2trans.shape # (7425, 24, 1) model.add(Conv1D(filters=4, input_shape=input_shape[1:], kernel_size=(3), activation=LeakyReLU)) model.add(Dropout(0.2)) model.add(Dense(1)) I have tried 3 different kernel sizes of 3, 2 and 1, where the output sizes produced are: (256, 2500, 12, 1), (256, 2500, 18, 1), (256, 2500, 24, 1), respectively. What I am confused with is the difference …
Category: Data Science
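
As a reference point for the question above, the standard output-length arithmetic of a 1D convolution can be sketched as follows. This assumes the Keras Conv1D defaults (strides=1, padding='valid') and a hypothetical helper name; it does not attempt to explain the four-dimensional shapes quoted in the question, which depend on code the excerpt omits.

```python
def conv1d_output_length(input_length, kernel_size, stride=1, padding="valid"):
    """Number of output steps of a 1D convolution along the time axis."""
    if padding == "valid":
        return (input_length - kernel_size) // stride + 1
    if padding == "same":
        return -(-input_length // stride)  # ceil(input_length / stride)
    raise ValueError(f"unsupported padding: {padding!r}")

# Input of shape (batch, 24, 1) with filters=4, as in the question.
filters = 4
shapes = {
    k: (None, conv1d_output_length(24, k), filters)
    for k in (3, 2, 1)
}
for k, shape in shapes.items():
    print(f"kernel_size={k}: output shape {shape}")
```

The last dimension equals the number of filters, and the middle dimension shrinks by kernel_size - 1 under 'valid' padding.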

What are the possible values of a filter in a CNN?

I am trying to write a CNN from scratch in Python, but I am a bit new to CNNs, specifically the convolution layers, as I am comfortable with the dense layers. I was reading "Do filters have different weights for each input channel" but I didn't completely understand and had a few questions. I wanted to confirm that if the input layer had 3 channels then for there to be 4 output channels you would need a total of 12 …
Category: Data Science
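
The counting in the question above can be checked with a short sketch. It assumes the standard convolution layout, where every output channel has one 2D kernel slice per input channel, and an illustrative 3x3 kernel size that is not given in the question.

```python
def conv2d_param_count(in_channels, out_channels, kernel_h, kernel_w, use_bias=True):
    """Count 2D kernel slices, weights, and biases of a standard conv layer."""
    kernel_slices = in_channels * out_channels      # one 2D slice per (input, output) pair
    weights = kernel_slices * kernel_h * kernel_w   # every slice has kernel_h * kernel_w values
    biases = out_channels if use_bias else 0        # one bias per output channel
    return kernel_slices, weights, biases

# 3 input channels and 4 output channels, with an assumed 3x3 kernel.
kernel_slices, weights, biases = conv2d_param_count(3, 4, 3, 3)
print(kernel_slices, weights, biases)  # 12 108 4
```

The values inside each slice are unconstrained real numbers that training adjusts; the sketch only counts how many of them there are.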

AttributeError: 'Functional' object has no attribute 'predict_classes'

I am trying to run GoogLeNet code using the FERET datasets. When I run the code, I get the following error message: Traceback (most recent call last): File "C:\Users\JoshG\PycharmProjects\GoogLeNet\GoogLeNet5.py", line 221, in <module> y_pred = (model.predict_classes(testX)) AttributeError: 'Functional' object has no attribute 'predict_classes' Can anyone tell me what I am doing wrong? I am new to Python, so please be patient with me. Below is the full code: # Python: 3.9 # keras: 2.2.4 for GoogLeNet on CIFAR-10 # …
Category: Data Science
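
For context on the error above: predict_classes was a method of Sequential models only, and it was removed in TensorFlow 2.6. The usual replacement is to take the argmax over the output of model.predict. The sketch below shows that step in plain Python on made-up probabilities, assuming a softmax output layer; the function name is hypothetical.

```python
def classes_from_probabilities(probabilities):
    """Return the index of the largest probability in each row.

    With a Keras model and NumPy, the equivalent call would be
    np.argmax(model.predict(testX), axis=-1).
    """
    return [max(range(len(row)), key=row.__getitem__) for row in probabilities]

# Made-up softmax outputs for three samples and four classes.
fake_predictions = [
    [0.10, 0.70, 0.15, 0.05],
    [0.60, 0.20, 0.10, 0.10],
    [0.05, 0.05, 0.10, 0.80],
]

y_pred = classes_from_probabilities(fake_predictions)
print(y_pred)  # [1, 0, 3]
```

For a single sigmoid output (binary classification), thresholding the predicted probability at 0.5 plays the same role as the argmax.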

Calculate importance of input data bands for CNN image classification?

I constructed and trained a convolutional neural network using Keras in R with the TensorFlow backend. I feed the network with multispectral images for a simple image classification. Is there some way to calculate which of the input bands were most important for the classification task? Ideally, I would like to have a plot with some measure of importance, grouped by bands and image classes. How can I obtain this information? Would it be necessary / possible to calculate saliency …
Category: Data Science

Backpropagating filter coefficients of Convolutional Neural Networks

I'm starting to learn how convolutional neural networks work, and I have a question regarding the filters. Apparently, these are randomly initialized when the model is created, and then, as the data is fed in, they are corrected in the same way as the weights are in backpropagation. However, how does this work for filters? To my understanding, backpropagation works by calculating how much an actual weight contributed to the total error after an output has been predicted, and then correcting it accordingly. I've …
Category: Data Science

Autoencoder not learning walk forward image transformation

I have a series of 15 frames with (60 rows x 50 columns). Over the course of those 15 frames, the moon moves from the top left to the bottom right. Data = https://github.com/aiqc/AIQC/tree/main/remote_datum/image/liberty_moon I am attempting a walk forward autoencoder where: The input data is a 60x50 image. The evaluation label is a 60x50 image from 2 frames later. All data is scaled between 0-1. model = keras.models.Sequential() model.add(layers.Conv1D(64*hp['multiplier'], 3, activation='relu', padding='same')) model.add(layers.MaxPool1D( 2, padding='same')) model.add(layers.Conv1D(32*hp['multiplier'], 3, activation='relu', padding='same')) …
Category: Data Science

CNN image to image translation: multiple image inputs to one image output

I am interested in training a CNN to take in inputs where each input is a set of low-resolution images and each ground truth is a single high-resolution image. The ground truth high-resolution image was generated by averaging the information from a set of 80 low-level images. What I would like the CNN to do is to generate the same high-resolution image from a smaller set of low-level images, for example 5 low-level images. I was looking into TensorFlow GAN …
Category: Data Science

Triplet loss - what threshold to use to detect similarity between two embeddings?

I have trained my triplet loss model using FaceNet's architecture. I used the 11k Hands dataset. Now I want to see how well my model performed, so I feed it 2 images of the same class and get back their embeddings. I want to compare the distance between these embeddings, and if that distance is not larger than some threshold, I can say that the model correctly classifies these 2 images as being of the same class. How do I select the …
Category: Data Science
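
One common way to approach the question above is to sweep candidate thresholds over a labeled validation set of pairs and keep the one with the best accuracy. The sketch below is a minimal illustration of that idea, not a method taken from the question; the distances and labels are made up, and the function name is hypothetical.

```python
def best_threshold(distances, same_class):
    """Pick the distance threshold that maximizes pair-classification accuracy.

    A pair is predicted 'same class' when its distance is not larger
    than the threshold, matching the rule described in the question.
    """
    best_t, best_acc = None, -1.0
    for t in sorted(set(distances)):  # each observed distance is a candidate
        correct = sum((d <= t) == s for d, s in zip(distances, same_class))
        acc = correct / len(distances)
        if acc > best_acc:
            best_t, best_acc = t, acc
    return best_t, best_acc

# Made-up embedding distances for six validation pairs.
distances = [0.3, 0.5, 0.6, 1.1, 1.4, 0.9]
same_class = [True, True, True, False, False, False]

threshold, accuracy = best_threshold(distances, same_class)
print(threshold, accuracy)  # 0.6 1.0
```

The pairs used for the sweep should be held out from training, and a different criterion (for example a fixed false-accept rate) can replace accuracy if the application calls for it.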

Number and size of dense layers in a CNN

Most networks I've seen have one or two dense layers before the final softmax layer. Is there any principled way of choosing the number and size of the dense layers? Are two dense layers more representative than one, for the same number of parameters? Should dropout be applied before each dense layer, or just once?
Category: Data Science

CNN Eliminate Wrong Results

I extracted images of human faces from videos, but the model also recorded images without faces. I wrote a CNN for emotion classification. For clear pictures, the softmax output in the last layer concentrates on one class; for example, a photo that is certainly happy gets a probability of 0.95 for the happy class. But if there is no face in the picture, the probability is dispersed between classes, such as 0.3 and 0.2. …
Category: Data Science

Tensorflow.js - CNN/or autoencoder denoiser architecture?

I am new to machine learning. I have 10,000 examples of 128x256 array of values 0.0-1.0. Each example consists of a pair of a clean example and the other with noise added. I am aiming to train a CNN / (or an autoencoder?) with these examples. I am currently able to train one dense layer without errors. My first problem is my prediction is returning a 128x256 int array rather than floats. My larger question is about finding a starting …
Category: Data Science
