Why does a convolutional layer learn only biases?

I'm training a Siamese CNN to distinguish between pairs of images. My train/val binary cross-entropy loss values show a downward trend, implying that some of the model parameters are being updated, but I noticed that the convolution kernels barely change while their biases change significantly (TensorBoard weights histogram image). Also, while the loss value decreases, accuracy appears to be frozen for some epochs and then suddenly shoots up (accuracy and loss plots). Q1: If this is caused by vanishing gradient, why would it …
Category: Data Science

Binary classification comparing two time series of variable length

Is there a machine learning model (something like an LSTM or a 1D-CNN) that takes two time series of variable length as input and outputs a binary classification (True/False, whether the two time series have the same label)? So the data would look something like the following:

date        value  label
2020-01-01  2      0      # first input time series
2020-01-02  1      0      # first input time series
2020-01-03  1      0      # first input time series
2020-01-01  3      1      # second input time series
2020-01-03  1 …
Category: Data Science

Siamese network for face comparison won't learn: accuracy stuck at 0.5, and loss stuck too

I'm trying to train a Siamese network that contains a CNN and an embedding layer at the end to yield two similar (close) vectors for two images of the same person. I'm using the LFW_Cropped dataset and some custom-made generators. The generators are tested and return batches of 50% same and 50% different pairs of images with the correct label. The labels for the same and different outcomes are: SAME = 1 -> (named as 'yes' in my code) DIFFERENT …
Category: Data Science

How do gradients flow back through the network in a Siamese architecture? How can the weights of all CNN models be the same even when using different models?

TL;DR: What is the intuition behind the gradient flow in a Siamese network? How can 3 models share the same weights? And if 1 model is used, how are gradients updated from 3 different paths? I am trying to build a Siamese network and, as far as I know, if I have to build a triplet-loss-based Siamese network, I have to use 3 different networks. So for simplicity, let us say that my architecture is something like: Please correct the architecture if …
Category: Data Science

Can a Siamese model trained with Euclidean distance as the distance metric use cosine similarity during inference?

Suppose I have 3 embeddings (Anchor, Positive, Negative) from a Siamese model trained with Euclidean distance as the distance metric for triplet loss. During inference, can cosine similarity be used? I have noticed that if I calculate the Euclidean distance between A, P, and N with the model, the results seem somewhat consistent, with matching images getting a smaller distance and non-matching images getting a bigger distance in most cases. If I use cosine similarity on the above embeddings, I am unable to differentiate, as similarity values …
Category: Data Science
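
For the Euclidean-versus-cosine question above, a minimal sketch of why the two metrics can disagree on unnormalized embeddings. The three vectors are made-up values for illustration, not outputs of any trained model. For L2-normalized vectors the squared Euclidean distance equals 2 - 2 * cosine similarity, so the two rankings agree only after normalization.

```python
import math

def euclidean(a, b):
    # Straight-line distance between two vectors of equal length.
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def cosine_similarity(a, b):
    # Cosine of the angle between the vectors; ignores their lengths.
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def l2_normalize(v):
    # Scale a vector to unit length.
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]

# Hypothetical embeddings (assumed values, for illustration only).
anchor = [1.0, 0.0]
positive = [1.0, 0.5]   # near the anchor in Euclidean terms
negative = [3.0, 0.0]   # far from the anchor, but pointing the same way

# Euclidean distance says the positive is closer ...
print(euclidean(anchor, positive), euclidean(anchor, negative))
# ... while cosine similarity says the negative is more similar.
print(cosine_similarity(anchor, positive), cosine_similarity(anchor, negative))

# After L2 normalization: squared distance == 2 - 2 * cosine similarity.
a_n, p_n = l2_normalize(anchor), l2_normalize(positive)
lhs = euclidean(a_n, p_n) ** 2
rhs = 2.0 - 2.0 * cosine_similarity(a_n, p_n)
print(abs(lhs - rhs) < 1e-12)
```

If the model was trained on raw (unnormalized) embeddings, the vector length can carry information that cosine similarity discards, which is one possible reason for the mismatch described in the question.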

Siamese model accuracy stuck at 0.5

I'm trying to train a Siamese network model for a signatures dataset using the Keras API, and judging by the loss alone, it does not look bad. But ironically enough, the model accuracy is stuck at 0.5. Model Loss: Model Accuracy: My model is fairly deep; here's its architecture:

input = Input((128, 128, 1))
x = BatchNormalization()(input)
x = Conv2D(16, (2, 2), activation="tanh")(x)
x = AveragePooling2D(pool_size=(2, 2))(x)
x = Conv2D(32, (2, 2), activation="tanh")(x)
x = AveragePooling2D(pool_size=(2, 2))(x)
x = Conv2D(64, (2, 2), …
Category: Data Science

Siamese networks vs semantic similarity (maybe Gensim)

I am trying to understand Siamese networks. In these, a vector is calculated for an object (say an image) and a distance metric (say Manhattan) is applied to two vectors produced by the neural network(s). The idea was applied mostly to images in the tutorials available on the internet. If I compare it with Gensim semantic similarity, there we also have vectors of two objects (words or sentences) and then compute a cosine similarity to calculate the difference. (remember example …
Category: Data Science

Re-train the decoder of an autoencoder?

Can the decoder of a pre-trained autoencoder be trained again by taking the feature vector of a Siamese Neural Network (SNN) as input? I have trained the SNN model. Additionally, I trained an autoencoder on the same dataset. The reconstruction of images is not satisfactory because of the small dataset. Now I want to take the feature vector of the SNN model (you can say the encoded part) and re-train the trained decoder of the autoencoder. My question is how can we …
Category: Data Science

Why do Siamese neural networks use tied weights and how do they work?

Reading the paper on one-shot learning "Siamese Neural Networks for One-shot Image Recognition", I was introduced to the idea of Siamese Neural Networks. What I did not fully grasp was what they meant by this line: "This objective is combined with standard backpropagation algorithm, where the gradient is additive across the twin networks due to the tied weights." Firstly, how exactly are they tied? Below, I believe I've provided the formula by which they compute the gradient. T is the …
Category: Data Science
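
The "additive gradient" statement quoted above can be illustrated with a toy sketch. It assumes a one-parameter "twin" `f(x) = w * x` and a squared-difference loss, both chosen for illustration and not taken from the paper. Because both branches read the same `w`, the gradient with respect to `w` is the sum of the contributions flowing back through each branch.

```python
def twin(w, x):
    # Both branches call this with the SAME parameter w (tied weights).
    return w * x

def loss(w, x1, x2):
    # Toy objective: squared difference between the two branch outputs.
    return (twin(w, x1) - twin(w, x2)) ** 2

w, x1, x2 = 0.7, 2.0, -1.0   # assumed values for illustration
diff = twin(w, x1) - twin(w, x2)

# Chain rule through each branch separately:
grad_branch1 = 2.0 * diff * x1      # dL/df1 * df1/dw
grad_branch2 = -2.0 * diff * x2     # dL/df2 * df2/dw
grad_total = grad_branch1 + grad_branch2

# Check against a central finite difference on the shared weight.
eps = 1e-6
grad_numeric = (loss(w + eps, x1, x2) - loss(w - eps, x1, x2)) / (2 * eps)

print(grad_branch1, grad_branch2, grad_total, grad_numeric)
```

In a framework with automatic differentiation, reusing one layer object on both inputs gives this summation without any extra code, so a single update step is applied to the single shared set of weights.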

Siamese Network in Keras

I'm looking for a minimal applied example of the implementation of a (one-shot) Siamese network, preferably in Keras. I'm well aware of the various data science online pages and the respective examples and exercises that can be found there. However, so far I have not found an instructive source there. I would be thankful if someone could point me to some GitHub source or if someone could share some code or other sources which provide a sound example on …
Category: Data Science

Siamese networks - how to choose a loss function?

I have read several articles about Siamese networks, and I understand that there are 3 different types of loss functions: Contrastive Loss - Takes 2 inputs (from the same or different classes). Triplet Loss - Takes 3 inputs (Anchor, Positive, Negative). Quadruplet Loss - Takes 4 inputs (Anchor, Positive, Negative1, Negative3). I didn't find any description of which loss function to use in each scenario. How do I know which loss function (Contrastive/Triplet/Quadruplet) to choose in the case I'm working on …
Category: Data Science
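
To make the difference in inputs concrete, here is a minimal sketch of the first two losses named in the question above, in one common textbook form. Variants exist (squared distances in the triplet loss, a factor of 1/2 in the contrastive loss), and the `margin` default and the label convention (`same=True` for a matching pair) are assumptions.

```python
import math

def euclidean(a, b):
    # Distance between two embedding vectors of equal length.
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def contrastive_loss(emb1, emb2, same, margin=1.0):
    # Pair-based: pull matching pairs together, push non-matching
    # pairs apart until they are at least `margin` away.
    d = euclidean(emb1, emb2)
    if same:
        return d ** 2
    return max(0.0, margin - d) ** 2

def triplet_loss(anchor, positive, negative, margin=1.0):
    # Triplet-based: the positive should be closer to the anchor
    # than the negative by at least `margin`.
    d_pos = euclidean(anchor, positive)
    d_neg = euclidean(anchor, negative)
    return max(0.0, d_pos - d_neg + margin)

# Made-up embeddings for illustration.
print(contrastive_loss([0.0, 0.0], [0.5, 0.0], same=False))   # inside the margin
print(triplet_loss([0.0, 0.0], [0.1, 0.0], [3.0, 4.0]))       # already well separated
```

The contrastive loss only needs pair labels (same or different), while the triplet loss needs, for each anchor, one example known to match and one known not to match, so the available labels often narrow the choice.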

Transfer learning on siamese network with limited data

This may be a silly example, but it should be similar enough to my true research question without giving specifics. Let's say I have a pretrained Siamese neural network that tells you similarity between image pairs of dogs of different breeds. Now I want to use transfer learning on the last layer to ask a slightly different question. Now the dogs may be wearing hats, and I want to consider a regular picture of a Husky and a Husky wearing …
Category: Data Science

Split npz Dataset into Train/Test using Sklearn

I have a dataset of faces stored in an NPZ file that I would like to use to train a Siamese network. To do that, the dataset must be split into train / test sets using Sklearn. However, when I run the code to do the split, I get this error message: ValueError: Found input variables with inconsistent numbers of samples: [2, 199139] How can I solve this issue, given that my dataset consists of 199139 labeled faces, so I have a …
Category: Data Science

Feature extraction from sequence of images with Siamese Neural Network

I am trying to train a neural network to recognize certain actions in short movies. Each movie consists of a fixed number of frames, and every frame is, of course, an image of the same size after preliminary preprocessing. Now I'd like to do some feature extraction on each of these images using a Siamese Neural Network (SNN). I found articles somewhere saying that an SNN might be great for this, but without implementation details. These articles show that they take …
Category: Data Science

How to combine different models in Keras?

I have a pre-trained network consisting of two parts: feature extraction and similarity learning. The network takes two inputs and predicts whether the images are the same or not. The feature extraction part was VGGNet 16 with all layers frozen. I only extracted the feature vectors and trained the similarity network, which consists of two convolutional layers followed by four dense layers. Note: Removed last layers from the image due to large size. Now, I want to fine-tune the …
Category: Data Science

Sneakers representation learning

I am trying to make a model that takes an image of shoes as input and outputs a meaningful N-dimensional embedding of the shoes, so that they can be searched, compared, clustered, and used in a recommender system. My first guess was to employ a Siamese CNN (DenseNet + 1 extra fully connected layer for the 32-dimensional embedding generation) with online hard-mining triplet loss. So the idea was to train the network to predict whether the shoes in the images …
Category: Data Science

How do I implement my loss function in Keras/Tensorflow, when it seems to have different parameters to the default ones?

So, I'm a university student studying Data Science, and after my previous question about TensorFlow got literally zero answers on Stack Overflow, I figured I'd post this one here instead. I need to construct a Siamese network for classifying whether or not two characters are members of the same alphabet. Looking online, I've found this tutorial on Towards Data Science, but it uses a default loss function, while I need to code my own. The problem is, the loss function …
Category: Data Science

Siamese Network - Sigmoid function to compute similarity score

I am referring to siamese neural networks introduced in this paper by G. Koch et al. The siamese net computes 2 embeddings, then calculates the absolute value of the L1 distance, which would be a value in [0, +inf). Then the sigmoid activation function is applied to this non-negative input, so the output afterwards would be in [0.5, 1), right? So, if two images are from the same class, your desired L1 distance should be close to 0, thus the …
Category: Data Science
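
The range argument in the question above can be checked numerically. The sketch below first applies a plain sigmoid to a raw L1 distance, which indeed cannot go below 0.5. It then applies the sigmoid to a weighted sum of the component-wise distances plus a bias; as far as I recall, the Koch et al. model uses learned weights of this kind, but treat that as an assumption, and note that the `alphas` and `bias` values here are invented for illustration.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def l1_distance(a, b):
    # Sum of absolute component-wise differences; always >= 0.
    return sum(abs(x - y) for x, y in zip(a, b))

def weighted_score(a, b, alphas, bias):
    # Sigmoid of a weighted sum of component-wise distances.
    # Negative weights let the score drop below 0.5.
    z = sum(alpha * abs(x - y) for alpha, x, y in zip(alphas, a, b)) + bias
    return sigmoid(z)

emb1 = [0.2, 0.9]   # made-up embeddings
emb2 = [0.8, 0.1]

print(sigmoid(l1_distance(emb1, emb1)))   # identical inputs -> exactly 0.5
print(sigmoid(l1_distance(emb1, emb2)))   # any distance > 0 -> above 0.5
print(weighted_score(emb1, emb2, alphas=[-1.0, -1.0], bias=0.0))  # below 0.5
```

With weights that can be negative, a large distance can map to a low score and a small distance to a high one, so the output is no longer confined to [0.5, 1).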

Chess deep learning Siamese network overfitting when it shouldn't in theory

TLDR: My network is trained on pairs, so instead of 10^6 samples it has 10^12 samples (the number of samples squared). With a data set that large it shouldn't overfit, but it does after very few epochs. I can't find the reason; any help is appreciated. Thanks. I'm trying to implement a chess deep learning model like the one shown in the paper "DeepChess: End-to-End Deep Neural Network for Automatic Learning in Chess" (https://www.cs.tau.ac.il/~wolf/papers/deepchess.pdf). It is using a siamese neural network …
Category: Data Science

About

Geeks Mental is a community that publishes articles and tutorials about Web, Android, Data Science, new techniques and Linux security.