Is the cross-entropy loss important at all, given that during backpropagation only the softmax probability and the one-hot vector are relevant?

Is the cross-entropy loss (CEL) important at all, given that during backpropagation (BP) only the softmax (SM) probability and the one-hot vector are relevant? When applying BP, the derivative of the CEL with respect to the softmax input is simply the difference between the output probability (SM) and the one-hot encoded vector. To me it seems that the CEL value itself, however sophisticated, plays no role in learning. I'm expecting a fallacy in my reasoning, so could somebody please help me out?
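A minimal numerical sketch (my own illustration, not part of the question) that checks the claim: for softmax combined with cross-entropy, the gradient of the loss with respect to the logits is exactly the softmax output minus the one-hot target.

    import numpy as np

    def softmax(z):
        z = z - z.max()                      # shift for numerical stability
        e = np.exp(z)
        return e / e.sum()

    def cross_entropy(z, y):
        # y is a one-hot vector, z are the logits
        return -np.sum(y * np.log(softmax(z)))

    z = np.array([2.0, -1.0, 0.5])
    y = np.array([0.0, 1.0, 0.0])

    analytic = softmax(z) - y                # the claimed gradient: p - y

    # central-difference check of dL/dz
    eps = 1e-6
    numeric = np.array([
        (cross_entropy(z + eps * np.eye(3)[i], y) -
         cross_entropy(z - eps * np.eye(3)[i], y)) / (2 * eps)
        for i in range(3)
    ])

    print(np.allclose(analytic, numeric))    # True

The simple form p - y is what comes out of differentiating the cross-entropy loss through the softmax, so the loss still defines the objective being minimized even though its value never appears explicitly in the update.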
Category: Data Science

Sample variance matrix normal distribution in R

I'm trying to perform a multinomial logistic regression in R using the Metropolis-Hastings algorithm, with a matrix normal distribution as the proposal. I'm using the function rmatrixnorm() from the package LaplacesDemon to sample from the proposal distribution. I followed this strategy since I need a vector of parameters $\underline{\beta_{k}}$, with $k=1,\dots,K$ (the number of classes involved in the classification). At the end of the Monte Carlo iterations, my procedure retrieves the sample mean and the sample covariance of the posterior …
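A small sketch, in Python rather than R, of drawing a proposal from a matrix normal distribution; scipy.stats.matrix_normal plays the role that rmatrixnorm() from LaplacesDemon plays in the question, and the dimensions and covariances below are made up for illustration.

    import numpy as np
    from scipy.stats import matrix_normal

    K, p = 3, 4                              # hypothetical: classes x predictors
    mean = np.zeros((K, p))                  # current state of the chain
    rowcov = 0.1 * np.eye(K)                 # covariance across classes
    colcov = 0.1 * np.eye(p)                 # covariance across predictors

    # one Metropolis-Hastings proposal: a K x p matrix of coefficients beta_k
    beta_proposal = matrix_normal.rvs(mean=mean, rowcov=rowcov, colcov=colcov)
    print(beta_proposal.shape)               # (3, 4)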
Category: Data Science

CNN Eliminate Wrong Results

I extracted images of human faces from videos, but the pipeline also captured images without faces. I wrote a CNN for emotion classification. For clear pictures, the softmax output in the last layer concentrates on one class; for example, in a photo that is clearly happy, the happy class gets a probability of about 0.95. But if there is no face in the picture, the probability disperses across classes, e.g. 0.3 and 0.2. …
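One common way to handle such inputs, shown here only as a sketch with a made-up threshold, is to reject a prediction when the softmax output is too flat to be trusted:

    import numpy as np

    def predict_with_rejection(probs, threshold=0.6):
        """Return the predicted class, or None if the softmax output is too flat."""
        probs = np.asarray(probs)
        if probs.max() < threshold:
            return None                      # probably no face / unreliable input
        return int(probs.argmax())

    print(predict_with_rejection([0.95, 0.02, 0.01, 0.01, 0.01]))  # 0
    print(predict_with_rejection([0.30, 0.20, 0.20, 0.15, 0.15]))  # None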
Category: Data Science

Backpropagation with log likelihood cost function and softmax activation

In the online book on neural networks by Michael Nielsen, in chapter 3, he introduces a cost function called the log-likelihood cost, defined as $$ C = -\ln(a_y^L) $$ Suppose we have 10 output neurons. When backpropagating the error, only the gradient w.r.t. the $y^{th}$ output neuron is non-zero and all others are zero. Is that right? If so, how is equation (81) below true? $$\frac{\partial C}{\partial b_j^L} = a_j^L - y_j $$ I'm getting the expression …
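A short derivation sketch (assuming the softmax output layer that Nielsen pairs with this cost, with $z_j^L$ the weighted input to output neuron $j$) of why the gradient is non-zero for every $j$, not only $j = y$:
$$ a_j^L = \frac{e^{z_j^L}}{\sum_k e^{z_k^L}}, \qquad C = -\ln a_y^L = -z_y^L + \ln\sum_k e^{z_k^L} $$
$$ \frac{\partial C}{\partial z_j^L} = -\delta_{jy} + \frac{e^{z_j^L}}{\sum_k e^{z_k^L}} = a_j^L - y_j, \qquad \frac{\partial C}{\partial b_j^L} = \frac{\partial C}{\partial z_j^L}\,\frac{\partial z_j^L}{\partial b_j^L} = a_j^L - y_j, $$
since $\partial z_j^L / \partial b_j^L = 1$. Only the first term $-\delta_{jy}$ vanishes for $j \neq y$; the second term, which comes from the softmax normalization, does not.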
Category: Data Science

Multiclass Classification with Decision Trees: Why do we calculate a score and apply softmax?

I'm trying to figure out why, when using decision trees for multi-class classification, it is common to calculate a score and apply softmax, instead of just averaging the terminal nodes' probabilities. Let's say our model is two trees. A terminal node of tree 1 has example 14 in a node with 20% class 1, 60% class 2, and 20% class 3. A terminal node of tree 2 has example 14 in a node with 100% class …
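A small numpy sketch of the two aggregation schemes being contrasted, with made-up leaf values: the first simply averages the leaf class frequencies, the second sums per-class raw scores (as gradient-boosted trees do) and then applies softmax.

    import numpy as np

    # leaf outputs for the same example from two trees (made-up numbers)
    leaf_probs = np.array([[0.2, 0.6, 0.2],     # tree 1: class frequencies in the leaf
                           [1.0, 0.0, 0.0]])    # tree 2
    leaf_scores = np.array([[0.1, 0.8, 0.1],    # per-class raw scores, as in boosting
                            [1.5, -0.2, -0.3]])

    # scheme 1: average the leaf probabilities
    averaged = leaf_probs.mean(axis=0)

    # scheme 2: sum the scores across trees, then softmax
    s = leaf_scores.sum(axis=0)
    softmaxed = np.exp(s - s.max()) / np.exp(s - s.max()).sum()

    print(averaged, softmaxed)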
Category: Data Science

using logsumexp in softmax

I saw the following in somebody's code as an alternative way of implementing the softmax, meant to avoid numerical problems caused by dividing by very large exponentials; the idea is that softmax = exp(matrix - logsumexp(matrix)) = exp(matrix) / sumexp(matrix):

    logsumexp = scipy.special.logsumexp(matrix, axis=-1, keepdims=True)
    softmax = np.exp(matrix - logsumexp)

I understand that when you take the log of an expression involving division you subtract, i.e. log(1/2) = log(1) - log(2). However, in the implementation above, shouldn't they also log the matrix in …
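A runnable sketch of that trick on a toy matrix (assuming numpy and scipy are available); the stabilized version agrees with the naive one wherever the naive exponentials don't overflow.

    import numpy as np
    from scipy.special import logsumexp

    matrix = np.array([[1.0, 2.0, 3.0],
                       [1000.0, 1001.0, 1002.0]])   # second row overflows a naive exp

    # naive softmax: np.exp(1000) overflows to inf, giving nan after the division
    naive = np.exp(matrix) / np.exp(matrix).sum(axis=-1, keepdims=True)

    # log-sum-exp trick: exp(x - logsumexp(x)) == exp(x) / sum(exp(x))
    lse = logsumexp(matrix, axis=-1, keepdims=True)
    stable = np.exp(matrix - lse)

    print(naive[0], stable[0])   # the two agree on the first row
    print(naive[1], stable[1])   # naive row is nan, stable row is fine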
Category: Data Science

Difference in performance Sigmoid vs. Softmax

For the same binary image classification task, if in the final layer I use 1 node with the sigmoid activation function and the binary_crossentropy loss function, then training goes through pretty smoothly (92% accuracy after 3 epochs on validation data). However, if I change the final layer to 2 nodes and use the softmax activation function with the sparse_categorical_crossentropy loss function, then the model doesn't seem to learn at all and gets stuck at 55% accuracy (the ratio of the negative class). …
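A minimal Keras sketch of the two output heads being compared; the backbone, layer sizes, and input shape are placeholders, not the asker's architecture.

    import tensorflow as tf

    def make_model(two_node_head: bool):
        base = [tf.keras.layers.Flatten(input_shape=(64, 64, 3)),   # placeholder backbone
                tf.keras.layers.Dense(128, activation="relu")]
        if two_node_head:
            # 2 nodes + softmax, integer labels 0/1, sparse categorical cross-entropy
            head = tf.keras.layers.Dense(2, activation="softmax")
            loss = "sparse_categorical_crossentropy"
        else:
            # 1 node + sigmoid, labels 0/1, binary cross-entropy
            head = tf.keras.layers.Dense(1, activation="sigmoid")
            loss = "binary_crossentropy"
        model = tf.keras.Sequential(base + [head])
        model.compile(optimizer="adam", loss=loss, metrics=["accuracy"])
        return model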
Category: Data Science

Dot product for similarity in word to vector computation in NLP

In NLP, when computing word vectors (word2vec) we try to maximize $\log P(o \mid c)$, where $P(o \mid c)$ is the probability that $o$ is an outside (context) word given that $c$ is the center word: $$P(o \mid c) = \frac{\exp(u_o^{T} v_c)}{\sum_{w=1}^{T} \exp(u_w^{T} v_c)}$$ Here $u_o$ is the word vector for the outside word, $v_c$ is the word vector for the center word, and $T$ is the number of words in the vocabulary. The equation above is a softmax, and the dot product of $u_o$ and $v_c$ acts as a score, which should be higher the better. If words $o$ and $c$ are closer then their dot product should …
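A tiny numpy sketch of that softmax over dot-product scores, with random vectors standing in for trained embeddings:

    import numpy as np

    rng = np.random.default_rng(0)
    T, d = 5, 8                         # toy vocabulary size and embedding dimension
    U = rng.normal(size=(T, d))         # outside-word vectors u_w
    v_c = rng.normal(size=d)            # center-word vector

    scores = U @ v_c                    # dot products u_w . v_c
    scores -= scores.max()              # shift for numerical stability
    probs = np.exp(scores) / np.exp(scores).sum()

    o = 2                               # index of some outside word
    print(probs[o], np.log(probs[o]))   # P(o|c) and log P(o|c)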
Category: Data Science

Train a model when the input can contain a smaller set of output options along with the correct output

I have service order lines used to charge customers, and each line needs to be assigned to an actual product. If the customer has only one product, all lines are set to that product. But if there are many products, a trained employee currently does the matching of each line to one product, e.g.:

    enforcement fees -----------> backup YYY transmitter
    text message fees ----------> main XXX transmitter installation
    installation fee -----------> main XXX transmitter installation

I can train all the kinds …
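One way to picture this setup (my assumption, since the question is truncated) is to score each order line against every catalog product but normalize with a softmax restricted to the products the customer actually has:

    import numpy as np

    def masked_softmax(scores, candidate_mask):
        """Softmax over only the customer's own products."""
        scores = np.where(candidate_mask, scores, -np.inf)   # rule out other products
        scores = scores - scores[candidate_mask].max()       # numerical stability
        e = np.exp(scores)
        return e / e.sum()

    # hypothetical scores for one order line against 6 catalog products
    scores = np.array([1.2, -0.3, 0.7, 2.1, 0.0, -1.5])
    candidate_mask = np.array([True, False, True, True, False, False])  # customer owns 3

    print(masked_softmax(scores, candidate_mask))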
Category: Data Science

How to calculate the temperature variable in softmax (Boltzmann) exploration

Hi, I am developing a reinforcement learning agent for a continuous state / discrete action space. I am trying to use Boltzmann/softmax exploration as the action selection strategy. My action space is of size 5000. My implementation of Boltzmann exploration:

    def get_action(state, episode, temperature=1):
        state_encod = np.reshape(state, [1, state_size])
        q_values = model.predict(state_encod)
        prob_act = np.empty(len(q_values[0]))
        for i in range(len(prob_act)):
            prob_act[i] = np.exp(q_values[0][i] / temperature)
        # element-wise division by the denominator (the sum of the numerators)
        prob_act = np.true_divide(prob_act, sum(prob_act))
        action_q_value = np.random.choice(q_values[0], p=prob_act)
        action_keys = np.where(q_values[0] == action_q_value)
        action_key …
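A compact, numerically stabilized sketch of the same idea; it samples the action index directly, which also sidesteps the np.where lookup at the end of the question's function (that lookup can match several actions when Q-values are tied). The Q-values here are random stand-ins for model.predict(state_encod)[0].

    import numpy as np

    def boltzmann_action(q_values, temperature=1.0):
        """Sample an action index from a temperature-scaled softmax over Q-values."""
        logits = np.asarray(q_values, dtype=float) / temperature
        logits -= logits.max()                   # avoid overflow in exp
        probs = np.exp(logits)
        probs /= probs.sum()
        return np.random.choice(len(probs), p=probs)

    q = np.random.randn(5000)                    # stand-in for the model's Q-values
    print(boltzmann_action(q, temperature=0.5))  # lower temperature -> greedier choice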
Category: Data Science

neural network binary classification: Softmax, LogSoftmax, and loss function

I am building a binary classifier where the class I want to predict is present only <2% of the time. I am using PyTorch. The last layer could be LogSoftmax or Softmax: self.softmax = nn.Softmax(dim=1) or self.softmax = nn.LogSoftmax(dim=1). My questions: Should I use Softmax, since it provides outputs that sum to 1 and lets me check performance at various probability thresholds? Is that understanding correct? If I use Softmax, can I then use cross_entropy loss? This seems to …
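A small PyTorch sketch of the two standard pairings, shown with toy tensors: nn.CrossEntropyLoss applies LogSoftmax internally and therefore expects raw logits, while nn.NLLLoss expects the output of LogSoftmax.

    import torch
    import torch.nn as nn

    logits = torch.randn(4, 2)            # raw outputs of the last linear layer
    targets = torch.tensor([0, 1, 0, 0])  # class indices

    # option A: feed raw logits to CrossEntropyLoss (it applies LogSoftmax itself)
    loss_a = nn.CrossEntropyLoss()(logits, targets)

    # option B: apply LogSoftmax explicitly and use NLLLoss
    log_probs = nn.LogSoftmax(dim=1)(logits)
    loss_b = nn.NLLLoss()(log_probs, targets)

    print(torch.isclose(loss_a, loss_b))  # True: the two are equivalent

    # probabilities for thresholding / PR curves can be recovered either way
    probs = torch.softmax(logits, dim=1)  # equals log_probs.exp()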
Category: Data Science

classification using LogSoftmax vs Softmax and calculating precision-recall curve?

In binary classification we could get the final output using LogSoftmax or Softmax. With Softmax we get results that add up to 1. I understand that LogSoftmax penalizes wrong classifications more heavily and has a few other mathematical advantages. I have a binary classification problem with class 1 occurring very rarely (<2% of the time). My question: if I am using a probability cutoff of 0.5 (predicting class 1 if the probability is above 0.5) with Softmax, will I get the same …
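A short sketch of the relationship between the two outputs (standard PyTorch behaviour, toy numbers): exponentiating the LogSoftmax output recovers the Softmax probabilities, so a cutoff of 0.5 on the probability corresponds to a cutoff of log(0.5) on the log-probability and yields the same decisions.

    import torch

    logits = torch.tensor([[2.0, -1.0],
                           [0.2, 0.4]])

    probs = torch.softmax(logits, dim=1)
    log_probs = torch.log_softmax(logits, dim=1)

    print(torch.allclose(probs, log_probs.exp()))          # True
    print(probs[:, 1] > 0.5)                               # same decisions ...
    print(log_probs[:, 1] > torch.log(torch.tensor(0.5)))  # ... with this cutoff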
Category: Data Science

Distilling the knowledge of a binary cross entropy with sigmoid function model to a softmax model

I have a complex CNN architecture that uses binary cross-entropy and a sigmoid function for classification. However, due to hardware constraints I would like to compress my model using knowledge distillation, and unfortunately most papers deal with knowledge distillation using two models with softmax and sparse categorical cross-entropy for distilling the knowledge of the larger network. I'd like to know if it is possible to use a complex model that uses binary cross-entropy and a sigmoid function for activation …
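One identity worth sketching here (a standard fact, not something stated in the truncated question): a sigmoid over a single logit z is exactly a softmax over the two logits [0, z], so a sigmoid/BCE teacher's output can be rewritten as two-class soft targets for a softmax student.

    import numpy as np

    def sigmoid(z):
        return 1.0 / (1.0 + np.exp(-z))

    def softmax(z):
        e = np.exp(z - z.max())
        return e / e.sum()

    z = 1.3                                     # hypothetical teacher logit
    p = sigmoid(z)                              # teacher probability of the positive class

    two_class_targets = np.array([1.0 - p, p])  # soft targets for a softmax student
    print(np.allclose(two_class_targets, softmax(np.array([0.0, z]))))  # True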
Category: Data Science

Using SVM as final layer in Convolutional Neural Network

I am working on the implementation of a hybrid CNN-SVM, where I define the use of the SVM in the last layer of the CNN, as shown in this code:

    # Flattening
    cnn.add(tf.keras.layers.Flatten())
    # Full connection
    cnn.add(tf.keras.layers.Dense(units=128, activation='relu'))
    cnn.add(Dense(4, kernel_regularizer=tf.keras.regularizers.l2(0.01), activation='softmax'))
    cnn.compile(optimizer='adam', loss='squared_hinge', metrics=['accuracy'])

In the case of the plain CNN (without adding the SVM), we can define the last part of the CNN as below:

    def calculate_softmax(data):
        result = np.exp(data)
        return result / result.sum()   # normalize so the outputs sum to 1

    softmax = calculate_softmax(temp)
    prediction = softmax.argmax()

where …
Topic: softmax cnn svm
Category: Data Science

Derivative of a custom loss function with the logistic function

I have a custom loss function with $\mu, p, o, u, v$ as variables, where $\sigma$ is the logistic function, and I need to differentiate this loss function. Because there are multiple variables in the loss function, do I need to use the softmax function, which is the generalization of the logistic function? $$L = -\frac{1}{N}\sum_{i,j \in S} a_j \big\{ y_{i,j}\log[\sigma(\mu + p_i + o_j + u^{T}_{i} v_{j})] + (1 - y_{i,j})\log[1 - \sigma(\mu + p_i + o_j + u^{T}_{i} v_{j})] \big\}$$ As far as I understand, it is a multivariate …
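A short worked step (my own sketch, writing $z_{i,j} = \mu + p_i + o_j + u_i^{T} v_j$ for the argument of $\sigma$) for the derivative of one term, using only the chain rule through the logistic function:
$$ \frac{\partial}{\partial z}\Big[ y\log\sigma(z) + (1-y)\log(1-\sigma(z)) \Big] = y\,(1-\sigma(z)) - (1-y)\,\sigma(z) = y - \sigma(z), $$
so each term contributes $\partial L/\partial z_{i,j} = -\tfrac{a_j}{N}\,\big(y_{i,j} - \sigma(z_{i,j})\big)$, and the derivatives with respect to $\mu, p_i, o_j, u_i, v_j$ follow by the chain rule through $z_{i,j}$.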
Category: Data Science

How to prove Softmax Numerical Stability?

I was playing around with the softmax function and experimenting with its numerical stability. If we shift the exponents in the numerator and denominator by the same value, the output of the softmax stays constant (see the picture below, where -Smax is added to every exponent). I cannot figure out how to prove this shift invariance (although I read that it is true). Can anyone help me with the proof?
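A one-line proof sketch of that shift invariance: for any constant $c$ (in particular $c = S_{\max} = \max_k s_k$),
$$ \mathrm{softmax}(s - c)_i = \frac{e^{s_i - c}}{\sum_k e^{s_k - c}} = \frac{e^{-c}\,e^{s_i}}{e^{-c}\sum_k e^{s_k}} = \frac{e^{s_i}}{\sum_k e^{s_k}} = \mathrm{softmax}(s)_i. $$
Subtracting the maximum makes every exponent non-positive, so $e^{s_i - c}$ cannot overflow, which is why this particular shift is used for numerical stability.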
Category: Data Science

Can a single label be a vector/matrix in a neural network and not a scalar?

My training data consists of individual sentences, and each sentence has a few labels (say 10), each of which has a discrete score from 1-10 -- so in essence, a single training example has a label that is not a scalar but rather a matrix/vector of shape (10,10) or (1,10*10). Can a softmax adjust the weights in accordance with a label that is itself a matrix/vector? I'm looking to fine-tune a model that has this capability. Thanks.
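One way to picture this (a sketch under my assumption that each of the 10 labels is an independent 10-way choice) is to have the model output a (10, 10) block of logits and apply softmax plus cross-entropy along the last axis, one row per label:

    import torch
    import torch.nn.functional as F

    n_labels, n_scores = 10, 10
    logits = torch.randn(1, n_labels, n_scores)          # model output for one sentence
    target = torch.randint(0, n_scores, (1, n_labels))   # score index (0-9) per label

    # softmax/cross-entropy applied per label: fold the labels into the batch dimension
    loss = F.cross_entropy(logits.view(-1, n_scores), target.view(-1))
    print(loss)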
Category: Data Science

Is there a Softmax-like transformation with scale-invariance and linearity?

At the moment I'm using XGBoost to generate a prediction of probabilities with a custom objective-function to build something like an expert system. To do so I need to transform the raw XGBoost predictions into a probability distribution, where every value lies in the range from 0 to 1 and they all sum up to 1. Naturally you start out with the Softmax transformation. But as it turns out this function has some significant drawbacks for this kind of application. …
Category: Data Science

Cross-entropy loss explanation

Suppose I build a neural network for classification. The last layer is a dense layer with Softmax activation. I have five different classes to classify. Suppose for a single training example, the true label is [1 0 0 0 0] while the predictions be [0.1 0.5 0.1 0.1 0.2]. How would I calculate the cross entropy loss for this example?
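A worked sketch for exactly those numbers: with a one-hot target, the cross-entropy reduces to minus the log of the probability assigned to the true class.

    import numpy as np

    y_true = np.array([1, 0, 0, 0, 0])
    y_pred = np.array([0.1, 0.5, 0.1, 0.1, 0.2])

    # cross-entropy: -sum_i y_i * log(p_i); only the true class contributes
    loss = -np.sum(y_true * np.log(y_pred))
    print(loss)     # -log(0.1) ≈ 2.3026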
Category: Data Science

About

Geeks Mental is a community that publishes articles and tutorials about Web, Android, Data Science, new techniques and Linux security.