Gradient and loss calculation localization in Vision Transformers

Hi all, I'm turning to you to figure out where the gradient and loss computation that updates the q, k, v weights happens in Vision Transformers. I suspect it is the MLP/feed-forward part of the architecture, but I am not entirely sure. I attach some code from lucidrains:

import torch
from torch import nn
from einops import rearrange, repeat
from einops.layers.torch import Rearrange

# helpers

def pair(t):
    return t if isinstance(t, tuple) else (t, t)

# classes

class PreNorm(nn.Module):
    def __init__(self, dim, …
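In case a sketch helps frame the question: in most ViT implementations the q, k, v weights live in a single nn.Linear inside the attention block (to_qkv in the lucidrains code), and their gradients can be inspected directly after a backward pass. A minimal sketch with a toy attention module (not lucidrains' exact class):

import torch
from torch import nn

# Toy attention block with a combined q,k,v projection, mirroring
# the to_qkv layer found in many ViT implementations.
class ToyAttention(nn.Module):
    def __init__(self, dim):
        super().__init__()
        self.to_qkv = nn.Linear(dim, dim * 3, bias=False)
        self.proj = nn.Linear(dim, dim)

    def forward(self, x):
        q, k, v = self.to_qkv(x).chunk(3, dim=-1)
        attn = (q @ k.transpose(-2, -1)).softmax(dim=-1)
        return self.proj(attn @ v)

attn_block = ToyAttention(dim=8)
x = torch.randn(2, 4, 8)
loss = attn_block(x).sum()   # stand-in for the real loss at the classification head
loss.backward()

# A non-zero gradient here shows the q,k,v weights receive updates
# through the attention block itself, not only via the MLP/FF part.
print(attn_block.to_qkv.weight.grad.abs().mean())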
Category: Data Science

Regression problem with Deep Learning

I'm working on the Housing Price dataset, where the target is to predict the housing price. The price of a house is always positive, but as I understand it, the model could still predict a negative value for some samples. If that's correct, is there any way to constrain the training so that the model always predicts a positive value? As in the classification case we use the Sigmoid/Softmax activation function …
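One common way to guarantee positive outputs, sketched here under the assumption of a Keras regression model (the layer sizes and n_features are made up): give the output layer a softplus activation, which maps any real number to (0, inf).

from tensorflow.keras import layers, Sequential

n_features = 13  # hypothetical; set to the actual number of input features

model = Sequential([
    layers.Dense(64, activation='relu', input_shape=(n_features,)),
    layers.Dense(64, activation='relu'),
    # softplus keeps predictions strictly positive; 'relu' also works
    # but can get stuck at exactly 0.
    layers.Dense(1, activation='softplus'),
])
model.compile(optimizer='adam', loss='mse')

An alternative is to train on log-transformed prices and exponentiate the predictions, which also makes the error scale multiplicative.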
Category: Data Science

Keras loss object and shapes

I'm at a loss. I've been staring at this problem for a while and I'm unsure how to proceed. I've been constructing a script to train a model for object detection based on a dataset I've compiled. I've been going along with some example scripts and modifying some code. Here is my code:

import os
from tempfile import gettempdir

import tensorflow as tf
from tensorflow.keras import layers, Model, Sequential
import numpy as np
from clearml import Task, Dataset, TaskTypes

def …
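Since the issue is about the shapes the loss object sees, a small debugging wrapper can make them visible; this is a generic sketch (debug_loss is a hypothetical helper, not part of the script above):

import tensorflow as tf

# Wrap any Keras loss so it prints the shapes it actually receives.
def debug_loss(loss_fn):
    def wrapped(y_true, y_pred):
        tf.print("y_true shape:", tf.shape(y_true),
                 "y_pred shape:", tf.shape(y_pred))
        return loss_fn(y_true, y_pred)
    return wrapped

# Usage: model.compile(optimizer='adam',
#                      loss=debug_loss(tf.keras.losses.Huber()))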
Category: Data Science

HuggingFace Transformers is giving loss: nan - accuracy: 0.0000e+00

I am a HuggingFace newbie and I am fine-tuning a BERT model (distilbert-base-cased) using the Transformers library, but the training loss is not going down; instead I am getting loss: nan - accuracy: 0.0000e+00. My code largely follows the boilerplate from the [HuggingFace course][1]:

model = TFAutoModelForSequenceClassification.from_pretrained(checkpoint, num_labels=3)
opt = Adam(learning_rate=lr_scheduler)
model.compile(optimizer=opt, loss=loss, metrics=['accuracy'])
model.fit(
    encoded_train.data,
    np.array(y_train),
    validation_data=(encoded_val.data, np.array(y_val)),
    batch_size=8,
    epochs=3
)

Where my loss function is:

loss = tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True)

The learning rate is calculated like so:

lr_scheduler …
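For what it's worth, loss: nan with SparseCategoricalCrossentropy(from_logits=True) is very often a label problem: with num_labels=3 every label must be an integer in {0, 1, 2}. A quick sanity check, reusing y_train from the snippet above:

import numpy as np

y = np.array(y_train)
print(np.unique(y))  # should show a subset of [0 1 2]
assert y.min() >= 0 and y.max() <= 2, "labels outside [0, 2] yield nan loss"

An overly aggressive learning-rate schedule can produce the same symptom, so printing the scheduler's first few values is also worth a try.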
Category: Data Science

Why is my validation loss never increasing?

I am currently training different neural networks for binary classification of images. When using logistic regression, my validation loss never increases, not even after 5000 epochs. I thought that at some point overfitting kicks in and the validation loss always starts to increase. Does anybody know why this does not happen?
Category: Data Science

Loss stuck for regression model

I'm training a model that returns 2 parameters. These two parameters are used for classical image processing: a threshold for the Kirsch operator and the number of iterations for a bilateral filter. The model trains on 300 representative images, along with both parameters, which were determined manually. I am currently using ResNet-18, a convolutional model adapted for regression: the fully connected layer is changed to output 2 nodes. As the loss function I've chosen mean squared error. ReduceLROnPlateau is used as a learning rate …
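One thing worth checking in this setup: the two targets (a Kirsch threshold and an iteration count) likely live on very different numeric scales, which can stall an MSE loss. A hedged sketch of standardizing the targets first, assuming they sit in a (300, 2) NumPy array:

import numpy as np
from sklearn.preprocessing import StandardScaler

# Column 0: Kirsch threshold, column 1: bilateral filter iterations
# (stand-in values; use the manually determined labels here).
y = np.column_stack([np.random.uniform(0, 255, 300),
                     np.random.randint(1, 10, 300)])

scaler = StandardScaler()
y_scaled = scaler.fit_transform(y)  # train the network against y_scaled

# At inference time, map predictions back to the original units:
# y_pred = scaler.inverse_transform(model(x).detach().cpu().numpy())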
Category: Data Science

Interpreting Categorical Crossentropy Loss

I would like to ask for clarification about the loss values output during training when using categorical crossentropy as the loss function. If I have 11 categories and my loss is (for the sake of argument) 2, does this mean that my model is on average 2 categories off the correct category, or is the loss purely for comparative purposes and cannot be interpreted the way I am suggesting?
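For what it's worth, the loss is not measured in "categories off": categorical crossentropy is the negative log of the probability the model assigns to the true class, so it can be inverted to recover that probability. In numbers:

import math

loss = 2.0
p_true = math.exp(-loss)  # crossentropy = -log(p_true)
print(p_true)  # ~0.135, versus 1/11 ~ 0.091 for random guessing over 11 classes

So a loss of 2 means the model puts roughly 13.5% probability on the correct category on average; the value says nothing about how "far" the wrong categories are.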
Category: Data Science

High loss but low RMSE, how?

I have trained an LSTM model on a dataset, but its loss during training is ten times the RMSE during testing. How is this possible, and can I use this model if the RMSE is very low but the loss is high? How can I improve the training and test loss?
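One common explanation, sketched with made-up numbers: if the training loss is MSE while the test metric is RMSE, they are on different scales, since RMSE = sqrt(MSE), and a tenfold gap can be purely a question of units.

import numpy as np

mse_loss = 100.0          # hypothetical training loss (MSE)
rmse = np.sqrt(mse_loss)  # 10.0 - exactly a tenfold "gap" from units alone
print(mse_loss / rmse)    # 10.0

# Compare like with like: np.sqrt(train_mse) against test RMSE,
# or train MSE against test RMSE ** 2.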
Category: Data Science

Should the model be defined again before training it to new data?

I wanted to fit an LSTM model on a new data set in a loop, so I have implemented it like this:

#................................define model...........................
model = Sequential()
model.add(LSTM(100, activation='relu', input_shape=(n_input, n_features)))
model.add(Dense(1))
model.compile(optimizer='adam', loss='mse')
model.summary()

for k, v in enumerate(nse.get_fno_lot_sizes()):
    if v not in ('^NSEI', 'NIFTYMIDCAP150.NS', 'NIFTY_FIN_SERVICE.NS', '^NSEBANK'):
        #-----------Create Training--------------------
        train = df[['close']].iloc[:int(len(df)*0.8)]
        scaler = MinMaxScaler()
        scaler.fit(train)
        scaled_train = scaler.transform(train)
        #------------------------------------------------------
        generator = TimeseriesGenerator(scaled_train, scaled_train, length=n_input, batch_size=1)
        #-----------------------------------------------------
        # fit model
        model.fit(generator, epochs=10)

Or should the model definition be inside the for loop? I am asking this because I do …
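If each symbol should get its own independently trained model, the definition and compile step do need to move inside the loop; otherwise every iteration keeps fine-tuning the same weights. A sketch of the loop-internal variant, reusing nse, df, n_input and n_features from the snippet above:

from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import LSTM, Dense
from tensorflow.keras.preprocessing.sequence import TimeseriesGenerator
from sklearn.preprocessing import MinMaxScaler

for k, v in enumerate(nse.get_fno_lot_sizes()):
    if v not in ('^NSEI', 'NIFTYMIDCAP150.NS', 'NIFTY_FIN_SERVICE.NS', '^NSEBANK'):
        # Re-defining the model here re-initializes its weights,
        # so each symbol starts training from scratch.
        model = Sequential()
        model.add(LSTM(100, activation='relu', input_shape=(n_input, n_features)))
        model.add(Dense(1))
        model.compile(optimizer='adam', loss='mse')

        train = df[['close']].iloc[:int(len(df) * 0.8)]
        scaled_train = MinMaxScaler().fit_transform(train)
        generator = TimeseriesGenerator(scaled_train, scaled_train,
                                        length=n_input, batch_size=1)
        model.fit(generator, epochs=10)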
Category: Data Science

Custom loss function for regression

I am trying to write a custom loss function for a machine learning regression task. What I want to accomplish is the following:

- Reward higher preds, higher targets
- Punish higher preds, lower targets
- Ignore lower preds, lower targets
- Ignore lower preds, higher targets

All ideas are welcome; pseudo code or Python code works for me. This is what I tried so far. It does not work so well; I think it is because it does not take high targets into …
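One hedged way to encode those four cases in PyTorch, treating "higher/lower" as being above or below a cutoff (the cutoff and the linear penalty are assumptions to make the sketch concrete):

import torch

def asymmetric_loss(pred, target, cutoff=0.0):
    high_pred = pred > cutoff
    high_target = target > cutoff

    # Reward: higher pred AND higher target -> negative contribution.
    reward = -(pred - cutoff) * (high_pred & high_target).float()
    # Punish: higher pred AND lower target -> positive contribution.
    punish = (pred - cutoff) * (high_pred & ~high_target).float()
    # Lower preds contribute nothing, covering the two "ignore" cases.
    return (reward + punish).mean()

# loss = asymmetric_loss(model(x).squeeze(), y); loss.backward()

Note the reward term is unbounded, so in practice it needs a cap (or a tanh squashing); otherwise the model can drive all high-target predictions toward infinity.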
Category: Data Science

Logarithmic scale for a learning curve

I'm plotting the learning curve with Python using the following code:

import matplotlib.pyplot as plt
import seaborn as sns
import csv
import pandas as pd

sns.set(style='darkgrid')

# Increase the plot size and font size.
sns.set(font_scale=1.5)
plt.rcParams["figure.figsize"] = (12, 6)

plt.plot(lst, 'r')
plt.legend(["Validation Loss"])

# Label the plot.
plt.title("RNN deltat")
plt.xlabel("Epoch")
plt.ylabel("Loss")

The curve looks like this: [learning-curve plot omitted] The lecturer said it would be better to try it on a logarithmic scale. Can you please help me apply the logarithm here?
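Matplotlib can log-scale the y-axis directly, so no manual transformation of lst is needed; a minimal sketch continuing from the code above:

plt.plot(lst, 'r')
plt.legend(["Validation Loss"])
plt.title("RNN deltat")
plt.xlabel("Epoch")
plt.ylabel("Loss")
plt.yscale('log')  # plt.semilogy(lst) is an equivalent shortcut
plt.show()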
Category: Data Science

Decreasing Learning Rate doesn't improve the results

In theory, and in what people actually do (e.g. the linked paper), decreasing the learning rate should help the optimizer go "deeper into the valley" and thus decrease the loss and increase the metric. My plan was therefore to train a neural network with a learning rate of 1 until the loss and my metric stay approximately the same for some epochs, then with 0.1, then 0.01, and so on. However, what I'm observing is that the loss of the model stagnates …
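For reference, this staircase schedule can be automated in Keras; a hedged sketch assuming a compiled model (the patience and floor values are made up):

from tensorflow.keras.callbacks import ReduceLROnPlateau

# Divide the learning rate by 10 whenever validation loss has
# stagnated for 5 epochs - the same 1 -> 0.1 -> 0.01 staircase,
# but triggered automatically.
reduce_lr = ReduceLROnPlateau(monitor='val_loss', factor=0.1,
                              patience=5, min_lr=1e-6)

# model.fit(x_train, y_train, validation_data=(x_val, y_val),
#           epochs=100, callbacks=[reduce_lr])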
Category: Data Science

Choosing the parameters of a neural network for (1, 478)-dimensional input

Colleagues, I am actually kind of new to NNs, but I'm trying hard. I have data:

Index: 40073 entries (excluded from training, UID)
Columns: 484 entries
dtypes: bool(468), float64(2), int64(13), object(1)

I used only 478 features. The y is moneySpend, which can be >= 0. The code is below:

newDropped = df.drop(["moneySpend", "userAgent", "secondsToBuy", "hoursToBuy", "daysToBuy", "platform"], axis=1)
x_train, x_test, y_train, y_test = train_test_split(newDropped, df["moneySpend"], test_size=0.25, random_state=547)

model = Sequential()
model.add(Dense(16, input_dim=478, activation='relu'))
model.add(Dense(8, activation='relu'))
model.add(Dense(1, activation='linear'))
model.compile(loss='mse', optimizer='adam', metrics=['accuracy'])
tb_callback …
Category: Data Science

NeMo Conformer-CTC Predicts Same Word Repeatedly When Fine-Tuning

I'm using the NeMo Conformer-CTC small model on the LibriSpeech dataset (the clean subset, around 29K inputs, using 90% for training and 10% for testing), with PyTorch Lightning. When I train from scratch, the model learns 1 or 2 sentences in 50 epochs and gets stuck at a loss of 60-something (I trained it for 200 epochs too and it didn't budge). But when I try to fine-tune it using a pre-trained model from the toolkit, it predicts correctly …
Category: Data Science

Training Loss increases, but Validation Loss decreases

I am fine-tuning a T5 transformer model on a sequence-to-sequence task. My program outputs the training and validation loss every 500 optimization steps. However, when I first started training the model, the training loss steeply increased while my validation loss decreased (my training dataset has about 85,000 samples and my validation dataset has about 10,000 samples)! Does anyone know why this might be happening? Is this a sign my model is not learning properly? Also, does anyone know …
Category: Data Science

Why are the accuracy and loss results of an ANN model inconsistent?

I trained a model based on an ANN; the accuracy is 94.65% almost every time, while the loss is 12.06%. Now my question is: shouldn't the loss of the model be (100 - 94 = 6%) or near it? Why is it 12% when the accuracy is 94%?

• ANN model specification:
Trained and tested data = 96,465 (training data = 80%, testing data = 20%)
1 input layer = 5 nodes, 2 hidden layers = 24 nodes each, …
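Accuracy and loss are computed from different things: accuracy only checks the argmax, while a crossentropy loss looks at the predicted probabilities, so loss is not 100 minus accuracy. A small illustration with made-up predictions:

import numpy as np

# Two predictions, both correct by argmax (so accuracy is identical),
# but made with different confidence - hence different loss.
print(-np.log(0.97))  # ~0.03
print(-np.log(0.60))  # ~0.51: same accuracy, very different loss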
Category: Data Science

Q values loss per episode and mean absolute error

I am new to deep reinforcement learning! I am following this code for my adaptation problem (choosing actions): https://github.com/jaromiru/AI-blog/blob/master/CartPole-DQN.py I am wondering how I can evaluate the training. I already log the average rewards, but how can I get the average Q values, the loss per episode, and the mean absolute error, so I can evaluate my agent? I will be grateful if you can help me!
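A hedged sketch of the usual bookkeeping, assuming the Keras-based Brain/Agent structure of the linked script (brain.model, and x, y as the replay batch; treat these names as assumptions):

import numpy as np

episode_losses, episode_q = [], []

# Inside the episode loop, after each replay/training step:
history = brain.model.fit(x, y, batch_size=64, verbose=0)
episode_losses.append(history.history['loss'][0])            # loss of this step
episode_q.append(np.max(brain.model.predict(x, verbose=0)))  # greedy Q estimate

# At the end of each episode:
print("mean loss:", np.mean(episode_losses))
print("mean max-Q:", np.mean(episode_q))
print("MAE:", np.mean(np.abs(y - brain.model.predict(x, verbose=0))))

Alternatively, compiling the model with metrics=['mae'] makes Keras report the mean absolute error alongside the loss on every fit call.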
Category: Data Science

Training loss = 0, training accuracy =1, validation and test around 85%

I have created different CNNs for image classification. The dataset is this: https://www.kaggle.com/crowww/a-large-scale-fish-dataset There are 9 classes, and each class contains 1000 images of fish. I split it into training (800 images per class), validation (100) and test (100). I created different CNNs with these layers:

1) 1 convolutional layer (conv, relu, batchnorm) + 2 fully connected layers + output
2) 2 convolutional layers (conv, relu, batchnorm and maxpooling) + 2 fully connected layers + output
3) 4 convolutional layers (conv, relu, batchnorm …
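A training loss of 0 with training accuracy of 1 while validation sits near 85% is classic overfitting, and augmentation plus dropout are the usual first countermeasures. A hedged Keras sketch (input size, filter counts and dropout rate are made-up values):

from tensorflow.keras import layers, Sequential

model = Sequential([
    # Augmentation runs only during training and perturbs each image,
    # so the network cannot simply memorize the 800 images per class.
    layers.RandomFlip("horizontal", input_shape=(224, 224, 3)),
    layers.RandomRotation(0.1),
    layers.RandomZoom(0.1),
    layers.Conv2D(32, 3, activation='relu'),
    layers.BatchNormalization(),
    layers.MaxPooling2D(),
    layers.Flatten(),
    layers.Dropout(0.5),  # discourages memorization in the dense head
    layers.Dense(128, activation='relu'),
    layers.Dense(9, activation='softmax'),
])
model.compile(optimizer='adam',
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])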
Category: Data Science

How to calculate MAE and threshold in a multivariate time series

I'm trying to understand how to calculate the MAE in my time series, and then the threshold to determine which of my data in the test set are anomalies. I'm following this tutorial, which is based on a univariate time series, and they calculate it in the following way:

# Get train MAE loss.
x_train_pred = model.predict(x_train)
train_mae_loss = np.mean(np.abs(x_train_pred - x_train), axis=1)

I have a dataset structured as follows:

device1  device2  device3  ...  device30
0.20     0.35     0.12     ...  0.56
1.20 …
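For the multivariate case, one hedged option is to keep the same formula but average the absolute error over both the time axis and the 30 device columns, then derive the threshold from the training-error distribution (the 99th percentile is an assumption; a maximum over training errors is another common choice):

import numpy as np

# x_train has shape (samples, timesteps, 30); averaging over axes 1 and 2
# yields one reconstruction MAE per sample window.
x_train_pred = model.predict(x_train)
train_mae_loss = np.mean(np.abs(x_train_pred - x_train), axis=(1, 2))

# Threshold from the distribution of training errors.
threshold = np.percentile(train_mae_loss, 99)

x_test_pred = model.predict(x_test)
test_mae_loss = np.mean(np.abs(x_test_pred - x_test), axis=(1, 2))
anomalies = test_mae_loss > threshold  # boolean mask over test windows
print(anomalies.sum(), "anomalous windows")

If a per-device view is needed, averaging over the time axis only (axis=1) keeps a separate MAE, and hence a separate threshold, for each of the 30 devices.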
Category: Data Science
