Large jumps in loss in simple transformer model?

As an exercise, I created a very simple transformer model that just sees the same simple batch of dummy data repeatedly and (one would assume) should quickly learn to fit it perfectly. And indeed, training reaches a loss of zero quickly. However, I noticed that the loss does not stay at zero, or even close to it: there are occasional large jumps in the loss. The script below counts every time the loss jumps by 10 or more between …
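The script itself is cut off above, but here is a minimal sketch of that kind of setup, with made-up sizes and hyperparameters: a tiny transformer encoder fit to one fixed random batch, plus a counter for loss jumps of 10 or more between consecutive steps.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
vocab, seq_len, batch, d_model = 32, 16, 8, 64  # assumed sizes, not from the question

class TinyTransformer(nn.Module):
    def __init__(self):
        super().__init__()
        self.embed = nn.Embedding(vocab, d_model)
        layer = nn.TransformerEncoderLayer(d_model, nhead=4, dropout=0.0, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=2)
        self.head = nn.Linear(d_model, vocab)

    def forward(self, x):
        return self.head(self.encoder(self.embed(x)))

model = TinyTransformer()
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
x = torch.randint(0, vocab, (batch, seq_len))  # the same dummy batch every step
y = torch.randint(0, vocab, (batch, seq_len))

prev, jumps = None, 0
for step in range(2000):
    loss = nn.functional.cross_entropy(model(x).reshape(-1, vocab), y.reshape(-1))
    opt.zero_grad()
    loss.backward()
    opt.step()
    # count sudden spikes: the loss rising by 10 or more vs. the previous step
    if prev is not None and loss.item() - prev >= 10:
        jumps += 1
    prev = loss.item()

print(f"loss jumps of 10 or more: {jumps}")
```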
Category: Data Science

Is there a random forest environment (scikit-learn, TFDF, R, etc.) that has an implementation for multi-output regression?

It is easy to adapt the idea of tree-based regression to perform logistic regression: the decision boundaries of the tree divide the space of independent variables into hyper-rectangles, and each hyper-rectangle is assigned a value that serves as the output of the model. Instead of choosing the decision boundaries and values to minimize the sum of squared residuals, they should be chosen to minimize the total binary cross-entropy loss (equivalent to maximizing the likelihood). Taking this a step further, …
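As a partial answer to the title question, scikit-learn's RandomForestRegressor already accepts a 2-D target and fits a single forest whose leaves hold a vector of outputs. A minimal sketch on synthetic data (shapes and sizes are arbitrary):

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor

# Synthetic problem with 3 regression targets, just to illustrate the API.
X, y = make_regression(n_samples=500, n_features=10, n_targets=3, random_state=0)

forest = RandomForestRegressor(n_estimators=100, random_state=0)
forest.fit(X, y)                  # y has shape (500, 3); no wrapper needed

pred = forest.predict(X[:5])
print(pred.shape)                 # (5, 3): one prediction per target
```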
Category: Data Science

Neural network binary classification: Softmax, LogSoftmax, and loss function

I am building a binary classifier where the class I want to predict is present only <2% of the time. I am using PyTorch. The last layer could be LogSoftmax or Softmax: self.softmax = nn.Softmax(dim=1) or self.softmax = nn.LogSoftmax(dim=1). My questions: I should use Softmax, as it will provide outputs that sum to 1, and I can then check performance at various probability thresholds. Is that understanding correct? If I use Softmax, can I then use cross_entropy loss? This seems to …
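For context, PyTorch's nn.CrossEntropyLoss expects raw logits and applies log-softmax internally, so it should not be fed softmax outputs; the equivalent explicit pipeline is nn.LogSoftmax followed by nn.NLLLoss. A small sketch with made-up tensors:

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
logits = torch.randn(4, 2)            # raw scores for 4 samples, 2 classes
targets = torch.tensor([0, 1, 1, 0])

# Option A: feed raw logits to CrossEntropyLoss (log-softmax happens inside).
loss_a = nn.CrossEntropyLoss()(logits, targets)

# Option B: LogSoftmax in the model, then NLLLoss. Mathematically identical.
log_probs = nn.LogSoftmax(dim=1)(logits)
loss_b = nn.NLLLoss()(log_probs, targets)

print(torch.allclose(loss_a, loss_b))  # True

# Probabilities for threshold tuning can be recovered either way:
probs = log_probs.exp()                # rows sum to 1
```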
Category: Data Science

The effects of Double Logarithms (Log Cross Entropy Loss) + Overfitting

My network involves two losses: one is a binary cross entropy, and the other is a multi-label cross entropy. The yellow graphs are the ones with the double logarithm, meaning that we take log(sum(ce_loss)). The red-pink graphs are the ones with just sum(ce_loss). The dashed lines represent the validation step; the solid lines represent the training step. The top yellow and top red-pink figures both represent the count of 1s. Both are supposed to converge to 30. It is clear that the top …
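A sketch of the two variants being compared, with an assumed vector of per-label cross-entropy terms (the real shapes aren't shown in the question):

```python
import torch

ce_loss = torch.rand(30, requires_grad=True)  # hypothetical per-label CE terms

loss_plain = ce_loss.sum()                    # sum(ce_loss), the "red-pink" runs
loss_double_log = ce_loss.sum().log()         # log(sum(ce_loss)), the "yellow" runs

# Since d/ds log(s) = 1/s, the double-log variant rescales every gradient by
# 1/sum(ce_loss): the effective step size shrinks as the summed loss grows
# and blows up as it approaches zero.
```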
Category: Data Science

ignoring instances or masking by zero in a multitask learning model

For a multitask learning model, I've seen that approaches usually mask the output that doesn't have a label with zeros. As an example, have a look here: How to Multi-task learning with missing labels in Keras. I have another idea: instead of masking the missing output with zeros, why don't we ignore it in the loss function? The CrossEntropyLoss implementation in PyTorch allows specifying a value to be ignored: CrossEntropyLoss. Is this going to be OK? A sketch of the idea follows.
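A minimal sketch of that idea, assuming the missing labels are encoded with PyTorch's default sentinel -100: positions carrying the sentinel contribute neither to the loss nor to the gradient.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
logits = torch.randn(4, 3)                 # 4 samples, 3 classes for one task
labels = torch.tensor([2, -100, 0, -100])  # two samples have no label for this task

# ignore_index skips the sentinel entirely; the mean is taken over the
# labelled samples only, so unlabelled ones neither pull the loss down
# nor push gradients through the network.
criterion = nn.CrossEntropyLoss(ignore_index=-100)
loss = criterion(logits, labels)
print(loss.item())
```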
Category: Data Science

Shannon Information Content related to Uncertainty?

I'm a data science student currently writing my master's thesis, which revolves around the cross-entropy (CE) loss function for neural networks. From my understanding, the CE is based on the entropy, which in turn is based on the Shannon information content (SIC); however, I struggle to interpret and explain it in such a way that my fellow students can understand it without using concepts from information theory (which itself is already a completely different and complicated area). In the …
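For reference, the chain of definitions in question, in standard notation. The intuition: a rarer outcome carries more information, entropy is the expected information under p, and cross entropy is the expected surprise when outcomes drawn from p are scored by a model q.

```latex
\begin{align*}
h(x)   &= -\log p(x) && \text{Shannon information content of outcome } x \\
H(p)   &= \mathbb{E}_{x \sim p}\,[h(x)] = -\sum_x p(x)\log p(x) && \text{entropy} \\
H(p,q) &= \mathbb{E}_{x \sim p}\,[-\log q(x)] = -\sum_x p(x)\log q(x) && \text{cross entropy}
\end{align*}
```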
Category: Data Science

Why is cross entropy based on Bernoulli or Multinoulli probability distribution?

When we use logistic regression, we use cross entropy as the loss function. However, based on my understanding and https://machinelearningmastery.com/cross-entropy-for-machine-learning/, cross entropy evaluates whether two or more distributions are similar to each other, and the distributions are assumed to be Bernoulli or Multinoulli. So my question is: why can we always use cross entropy, i.e., Bernoulli, in regression problems? Do the real values and the predicted values always follow such a distribution?
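One step worth making explicit: for a label $y \in \{0,1\}$ and a predicted probability $\hat p$, minimizing binary cross entropy is exactly maximizing the Bernoulli likelihood, which is where the Bernoulli (or Multinoulli, in the multi-class case) assumption enters.

```latex
\begin{align*}
P(y \mid \hat p) &= \hat p^{\,y}\,(1-\hat p)^{\,1-y}
  && \text{Bernoulli likelihood of one label} \\
-\log P(y \mid \hat p) &= -\,y\log\hat p \;-\; (1-y)\log(1-\hat p)
  && \text{binary cross entropy}
\end{align*}
```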
Category: Data Science
