Loading saved model fails

I've trained a model and saved it in .h5 format. When I try loading it, I receive this error:

ValueError                                Traceback (most recent call last)
~\AppData\Local\Temp/ipykernel_588/661726548.py in <module>
      9 # returns a compiled model
     10 # identical to the previous one
---> 11 reconstructed_model = keras.models.load_model("./custom_model.h5")

~\Anaconda3\lib\site-packages\keras\utils\traceback_utils.py in error_handler(*args, **kwargs)
     65     except Exception as e:  # pylint: disable=broad-except
     66       filtered_tb = _process_traceback_frames(e.__traceback__)
---> 67       raise e.with_traceback(filtered_tb) from None
     68     finally:
     69       del filtered_tb

~\Anaconda3\lib\site-packages\keras\utils\generic_utils.py in class_and_config_for_serialized_keras_object(config, module_objects, custom_objects, printable_module_name)
    560 …
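The failing frame (class_and_config_for_serialized_keras_object) is where Keras looks up a class name from the saved config, so this error usually means the .h5 file references a custom object that isn't registered at load time. If that is the case here, the usual fix is a custom_objects mapping; a minimal sketch, with a hypothetical MyCustomLayer standing in for whatever custom class the model actually uses:

from tensorflow import keras

# Hypothetical stand-in: replace with the real custom layer/loss/metric
# classes the saved model was built with.
class MyCustomLayer(keras.layers.Layer):
    def call(self, inputs):
        return inputs

reconstructed_model = keras.models.load_model(
    "./custom_model.h5",
    custom_objects={"MyCustomLayer": MyCustomLayer},
)

The key is that custom_objects maps the class name stored in the saved config to the Python class that implements it.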
Category: Data Science

What enables transformers or very deep models to "plan" ahead for sequential decision making?

I was watching this amazing lecture by Oriol Vinyals. On one slide, there is a question asking whether very deep models plan. Transformer models, or models employed in applications like dialogue generation, do not have an explicit planning component, yet they behave as if they already have the dialogue planned out. Dr. Vinyals mentioned that there are papers on "how transformers are building up knowledge to answer questions or do all sorts of very interesting analyses". Can anyone please refer me to a few …
Category: Data Science

Neural network / machine learning approach to model a specific sequencing-classification problem in industry

I am working on a project that involves developing a machine learning/deep learning model for an application in the roll-to-roll industry. For a long time I have been looking for similar problems as a way to get some guidance, but I was never able to find anything related. Basically, the problem can be seen as follows: an industrial machine is producing a roll of some material, which tends to have visible defects throughout the roll. I have already available a machine …
Category: Data Science

Class token in ViT and BERT

I'm trying to understand the architecture of the ViT paper, and noticed they use a CLASS token like in BERT. To the best of my understanding, this token is used to gather knowledge of the entire image, and is then solely used to predict the class of the image. My question is: why does this token exist as an input in all the transformer blocks, and why is it treated the same as the word/patch tokens? Treating the class token …
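For concreteness, here is a minimal sketch (my own illustration, not the paper's code) of how a learnable class token is typically prepended to the patch embeddings, so that it flows through every transformer block alongside them and the head reads only its final state:

import tensorflow as tf

batch, num_patches, dim = 8, 196, 768

# Patch embeddings for a batch of images: (batch, num_patches, dim)
patches = tf.random.normal((batch, num_patches, dim))

# One learnable class token, shared across the batch: (1, 1, dim)
cls_token = tf.Variable(tf.zeros((1, 1, dim)))

# Broadcast to the batch and prepend, giving (batch, 1 + num_patches, dim).
# Every transformer block attends over the class token and the patches
# together; the classification head reads only position 0 at the end.
tokens = tf.concat([tf.tile(cls_token, [batch, 1, 1]), patches], axis=1)
print(tokens.shape)  # (8, 197, 768)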
Category: Data Science

Intuitively, why do Non-monotonic Activations Work?

The swish/SiLU activation is very popular, and many would argue it has dethroned ReLU. However, it is non-monotonic, which seems to go against popular intuition (at least on this site: example 1, example 2). Reading the swish paper, the justification that the authors give is that non-monotonicity "increases expressivity and improves gradient flow... [and] may also provide some robustness to different initializations and learning rates." The authors provide an image to back up this claim, but at best this argument …
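To make the non-monotonicity concrete, here is a small sketch (my own, not from the paper) showing that swish(x) = x * sigmoid(x) first decreases and then turns back up on the negative axis:

import numpy as np

def swish(x):
    return x / (1.0 + np.exp(-x))  # equivalent to x * sigmoid(x)

x = np.linspace(-5, 0, 6)
print(np.round(swish(x), 3))
# [-0.033 -0.072 -0.142 -0.238 -0.269  0.   ]
# The values fall and then rise again (the minimum is near x ≈ -1.28),
# so swish is non-monotonic, unlike ReLU.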
Category: Data Science

Computing probabilities in Plackett-Luce model

I am trying to implement a Plackett-Luce model for learning to rank from click data. Specifically, I am following the paper: Doubly-Robust Estimation for Correcting Position-Bias in Click Feedback for Unbiased Learning to Rank. The objective function is a reward function similar to the one used in reinforcement learning. Here $R_d$ is the reward for document $d$, $\pi(k \vert d)$ is the probability of document $d$ being placed at position $k$ for a given query $q$, and $w_k$ is the weight of position …
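For reference, a minimal sketch of the probability of a full ranking under a Plackett-Luce model (my own, using the standard PL definition rather than the paper's exact notation):

import numpy as np

def plackett_luce_log_prob(scores, ranking):
    """Log-probability of `ranking` (a permutation of item indices)
    under Plackett-Luce with per-item `scores`:
    P(ranking) = prod_k exp(s[r_k]) / sum_{j >= k} exp(s[r_j])."""
    s = np.asarray(scores, dtype=float)[np.asarray(ranking)]
    log_p = 0.0
    for k in range(len(s)):
        # The item at slot k is chosen among the items not yet placed.
        log_p += s[k] - np.log(np.sum(np.exp(s[k:])))
    return log_p

scores = [2.0, 1.0, 0.5]  # higher score = more likely to be ranked first
print(np.exp(plackett_luce_log_prob(scores, [0, 1, 2])))  # ~0.39

Each factor is a softmax over the items still unplaced, which is also how $\pi(k \vert d)$ decomposes over positions.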
Category: Data Science

Facebook picture labeling

I want to train a neural network and use OpenCV for facial recognition. Nicholas Martin, who's a user here on SE, told me that this is a supervised learning problem (clearly). So I need pictures and corresponding labels. So I thought, hey! Maybe Facebook could be of help. So how can I label potentially millions of Facebook pictures? Would it be by the user's profile name, or is there a way to find out the name of the person …
Category: Data Science

Is a dense layer required for implementing Bahdanau attention?

I saw that everyone adds a Dense() layer in their custom Bahdanau attention layer, which I think isn't needed. This is an image from a tutorial here. Here, we are just multiplying two vectors and then doing several operations on those vectors only. So what is the need for the Dense() layer? Is the tutorial on 'how does attention work' wrong?
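For reference, in Bahdanau (additive) attention the score is e = v^T tanh(W1 h + W2 s), and the Dense layers in tutorials are exactly the learned parameters W1, W2, and v. A minimal Keras-style sketch of that scoring step (my own illustration, not the tutorial's code):

import tensorflow as tf

class BahdanauAttention(tf.keras.layers.Layer):
    def __init__(self, units):
        super().__init__()
        # These Dense layers are the trainable W1, W2 and v in
        # score(s, h) = v^T tanh(W1 h + W2 s); without them the
        # attention mechanism would have nothing to learn.
        self.W1 = tf.keras.layers.Dense(units)
        self.W2 = tf.keras.layers.Dense(units)
        self.v = tf.keras.layers.Dense(1)

    def call(self, query, values):
        # query: (batch, dim), values: (batch, time, dim)
        q = tf.expand_dims(query, 1)  # (batch, 1, dim)
        scores = self.v(tf.nn.tanh(self.W1(values) + self.W2(q)))  # (batch, time, 1)
        weights = tf.nn.softmax(scores, axis=1)
        context = tf.reduce_sum(weights * values, axis=1)  # (batch, dim)
        return context, weights

If a diagram only shows vectors being multiplied and added, the Dense layers are still where the trainable parameters live.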
Category: Data Science

Why does my convolutional layer learn only biases?

I'm training a siamese CNN to distinguish between pairs of images, and though my train/val binary cross-entropy loss values show a negative trend, implying some of the model parameters are being updated, I noticed that the convolution kernels barely change while their biases change significantly (TensorBoard weights histogram image). Also, while the loss value decreases, accuracy appears to be frozen for some epochs and then instantly shoots up (accuracy and loss plots). Q1: If this is caused by vanishing gradients, why would it …
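One way to check whether the kernel gradients really are vanishing relative to the bias gradients is to log per-variable gradient norms for a batch; a minimal Keras sketch (my own, shown with a toy model standing in for the siamese CNN):

import tensorflow as tf

# Toy stand-in for the real model and batch from the question.
model = tf.keras.Sequential([
    tf.keras.layers.Conv2D(8, 3, activation="relu", input_shape=(32, 32, 1)),
    tf.keras.layers.GlobalAveragePooling2D(),
    tf.keras.layers.Dense(1, activation="sigmoid"),
])
x_batch = tf.random.normal((4, 32, 32, 1))
y_batch = tf.ones((4, 1))

with tf.GradientTape() as tape:
    loss = tf.reduce_mean(
        tf.keras.losses.binary_crossentropy(y_batch, model(x_batch)))
grads = tape.gradient(loss, model.trainable_variables)

# If the kernel gradient norms are orders of magnitude below the bias
# norms, the kernels will barely move while the biases keep updating.
for var, g in zip(model.trainable_variables, grads):
    print(var.name, float(tf.norm(g)))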
Category: Data Science

Fast AI Lesson 4 - MNIST. Confused about multiplying weights by pixels?

I’m on lesson 4 of the Fast AI "Deep Learning for Coders" course, and have been back through the same lesson a few times now but I don’t think I’m quite getting a few things. I want to have an understanding of what’s going on before moving on. This lesson is on MNIST - and Jeremy is recognising 3s vs 7s. So he has 12000 images (ignoring mini-batches) of about 800 pixels each, and his tensor has a shape of …
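For intuition, the step in question is just a matrix product: each flattened image has one weight per pixel, and the weighted pixels are summed into a single score per image. A minimal sketch of that shape arithmetic (my own, with PyTorch tensors as in the course; the 12000 images and 784 ≈ 800 pixels come from the question's setup):

import torch

n_images, n_pixels = 12_000, 28 * 28     # 12000 images, 784 (~800) pixels each
images = torch.rand(n_images, n_pixels)  # each row is one flattened image

weights = torch.randn(n_pixels, 1)       # one learnable weight per pixel
bias = torch.zeros(1)

# (12000, 784) @ (784, 1) -> (12000, 1): one score per image; a positive
# score can be read as "3" and a negative one as "7" (or vice versa).
scores = images @ weights + bias
print(scores.shape)  # torch.Size([12000, 1])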
Category: Data Science

How to use text as an input for a neural network - regression problem? How many likes/claps an article will get

I am trying to predict the number of likes an article or a post will get using a NN. I have a dataframe with ~70,000 rows and 2 columns: "text" (predictor - strings of text) and "likes" (target - continuous int variable). I've been reading about the approaches taken in NLP problems, but I feel somewhat lost as to what the input for the NN should look like. Here is what I did so far: Text cleaning: removing …
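One common shape for this kind of input, sketched below under my own assumptions (vocabulary size, sequence length, and layer sizes are placeholders, not from the question): tokenize each text to a fixed-length sequence of integer ids, map them through an Embedding layer, pool to one vector per text, and end with a single linear unit for the like count.

import numpy as np
import tensorflow as tf

vocab_size, max_len = 20_000, 200  # placeholder choices

model = tf.keras.Sequential([
    tf.keras.layers.Embedding(vocab_size, 64),  # token ids -> dense vectors
    tf.keras.layers.GlobalAveragePooling1D(),   # one vector per text
    tf.keras.layers.Dense(64, activation="relu"),
    tf.keras.layers.Dense(1),                   # linear output: predicted likes
])
model.compile(optimizer="adam", loss="mse", metrics=["mae"])

dummy_ids = np.random.randint(0, vocab_size, size=(2, max_len))
print(model(dummy_ids).shape)  # (2, 1)

The last layer is linear (no activation) because this is regression, not classification.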
Category: Data Science

Find highest reward for epsilon-greedy bandit program

I started to learn reinforcement learning. The first example is handling a bandit problem using the epsilon-greedy method. In this example, three bandit machines are used; the output is the mean value for each bandit machine and the cumulative average with respect to the epsilon value. The code:

class Bandit:
    def __init__(self, m):
        self.m = m
        self.mean = 0
        self.N = 0

    def pull(self):
        return np.random.randn() + self.m

    def update(self, x):
        self.N += 1
        self.mean = (1 - 1.0/self.N)*self.mean + 1.0/self.N*x
…
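For context, such programs usually continue with an epsilon-greedy loop over the bandits; a minimal sketch of that loop (my own, not the asker's truncated code):

import numpy as np

class Bandit:
    def __init__(self, m):
        self.m, self.mean, self.N = m, 0.0, 0  # true mean, estimate, pull count
    def pull(self):
        return np.random.randn() + self.m
    def update(self, x):
        self.N += 1
        self.mean = (1 - 1.0 / self.N) * self.mean + (1.0 / self.N) * x

def run(eps, n_steps=10_000, true_means=(1.0, 2.0, 3.0)):
    bandits = [Bandit(m) for m in true_means]
    rewards = np.empty(n_steps)
    for t in range(n_steps):
        if np.random.random() < eps:                        # explore
            j = np.random.randint(len(bandits))
        else:                                               # exploit best estimate
            j = int(np.argmax([b.mean for b in bandits]))
        rewards[t] = bandits[j].pull()
        bandits[j].update(rewards[t])
    return np.cumsum(rewards) / (np.arange(n_steps) + 1)    # cumulative average

print(run(eps=0.1)[-1])  # approaches the best true mean (3.0) as eps shrinks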
Category: Data Science

False positives in multi-class image classification

I am training a neural network with some convolution layers for multi-class image classification. I am using Keras to build and train the model, with 1600 training images for each category. I used softmax as the final layer's activation function. The model predicts well on all true categories, with high softmax probability. But when I test the model on new or unknown data, it still predicts a known class with high softmax probability. How can I reduce that? Should I make …
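One common mitigation, sketched here as my own illustration (the threshold value is an assumption to tune on validation data), is to reject predictions whose maximum softmax probability falls below a confidence threshold:

import numpy as np

def predict_with_rejection(probs, threshold=0.9):
    """probs: (n_samples, n_classes) softmax outputs.
    Returns the argmax class, or -1 ("unknown") when the model
    is not confident enough."""
    top = probs.max(axis=1)
    labels = probs.argmax(axis=1)
    return np.where(top >= threshold, labels, -1)

probs = np.array([[0.97, 0.02, 0.01],   # confident -> class 0
                  [0.40, 0.35, 0.25]])  # uncertain -> rejected
print(predict_with_rejection(probs))    # [ 0 -1]

Note that softmax models are known to stay overconfident on out-of-distribution inputs, so thresholding alone may not fully solve the problem.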
Category: Data Science

Val Loss and manually calculated loss produce different values

I have a CNN classification model that uses the binary cross-entropy loss:

optimizer_instance = Adam(learning_rate=learning_rate, decay=learning_rate / 200)
model.compile(optimizer=optimizer_instance, loss='binary_crossentropy')

We are saving the best model, so the latest saved model is the one that achieved the best val_loss:

es = EarlyStopping(monitor='val_loss', mode='min', verbose=0,
                   patience=Config.LearningParameters.Patience)
modelPath = modelFileFolder + Config.LearningParameters.ModelFileName
checkpoint = keras.callbacks.ModelCheckpoint(modelPath, monitor='val_loss',
                                             save_best_only=True,
                                             save_weights_only=False,
                                             verbose=1)
callbacks = [checkpoint, es]
history = model.fit(x=training_generator,
                    batch_size=Config.LearningParameters.Batch_size,
                    epochs=Config.LearningParameters.Epochs,
                    validation_data=validation_generator,
                    callbacks=callbacks,
                    verbose=1)

Over the course of the training, the logs show that the …
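One subtlety when comparing a manually computed loss against the reported val_loss: any regularization penalties attached to the model are included in the loss Keras reports, but not in a plain binary cross-entropy computation. A minimal sketch of a like-for-like manual check (my own; modelPath and validation_generator are assumed from the question, and the generator is assumed finite):

import numpy as np
from tensorflow import keras

best_model = keras.models.load_model(modelPath)  # best checkpoint

bce = keras.losses.BinaryCrossentropy()
batch_losses = []
for x_val, y_val in validation_generator:
    preds = best_model.predict(x_val, verbose=0)
    batch_losses.append(float(bce(y_val, preds)))

# best_model.evaluate(validation_generator) should match this closely;
# the val_loss printed during fit() may differ if regularization terms
# or different batch weighting are involved.
print(np.mean(batch_losses))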
Category: Data Science

Overfitting in CNN

I am training a VGG net on the STL-10 dataset. I am getting about 98% Top-5 validation accuracy and about 83% Top-1 validation accuracy, but both the Top-1 and Top-5 training accuracy are reaching 100%. Does this mean that the network is over-fitting? Or not? Code:

def conv2d(inp, name, kshape, s):
    with tf.variable_scope(name) as scope:
        kernel = get_weights('weights', shape=kshape)
        conv = tf.nn.conv2d(inp, kernel, [1, s, s, 1], 'SAME')
        bias = get_bias('biases', shape=kshape[3])
        preact = tf.nn.bias_add(conv, bias)
        convlayer = tf.nn.relu(preact, name=scope.name)
    return convlayer

def maxpool(inp, name, k, s):
    return tf.nn.max_pool(inp, ksize=[1, k, k, 1], strides=[1, s, s, 1],
                          padding='SAME', name=name)

def loss(logits, labels):
    labels = tf.reshape(tf.cast(labels, tf.int64), [-1])
    # print labels.get_shape().as_list(), logits.get_shape().as_list()
    cross_entropy …
Category: Data Science

In a Time Series Problem, is it possible to forecast quantities by learning the patterns of other items? What are my options?

Suppose I own a store that sells a variety of apples, and I have the following stats each month:

Report Date
Type of Apple (TA)
Quantity Available (QA)
Quantity Sold in the Past 30 Days (QS30)
Quantity Shipping In (QSI)
Quantity Needed to Order (QN)

Let's make the following assumptions/givens: there are three types of apples (red, green, and yellow); T(1) denotes the first month and T(60) denotes the 60th month; QA @ T(i+1) = QA @ T(i) + QSI @ T(i) …
Category: Data Science

NLP Deep Learning Project (or Paper)

I'm trying to find a good repo/course/paper or something that can get me up to speed on an NLP problem, for example classifying email or something else. It should also be relevant to the latest state of the art (transformers, multi-head attention, etc.). Thank you
Category: Data Science
