Class weights for imbalanced data in multilabel problems

I am trying to train a CNN for a multi-class, multi-label classification task (20 classes; each sample can belong to one or more labels), and the dataset is highly imbalanced. In single-label cases I would use the compute_class_weight function from sklearn to calculate the class weights in order to help the optimizer account for the minority classes. However, for the multi-label case I feel it is not working as it is supposed to, because it considers as the number of samples the number of …
Category: Data Science
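A minimal sketch of one common workaround for the multilabel case (not necessarily what the asker ended up doing): treat each of the 20 labels as its own binary problem, compute a per-label positive weight from the label matrix, and feed it to a loss that accepts per-label weights such as PyTorch's BCEWithLogitsLoss. The label matrix Y below is synthetic.

```python
import numpy as np
import torch
import torch.nn as nn

# Synthetic multilabel target matrix: n_samples x n_labels (20 labels).
rng = np.random.default_rng(0)
Y = (rng.random((1000, 20)) < 0.1).astype(np.float32)

# Per-label positive weight = (# negatives) / (# positives), the convention
# expected by BCEWithLogitsLoss(pos_weight=...). Rare labels get large weights.
pos = Y.sum(axis=0)
neg = Y.shape[0] - pos
pos_weight = torch.tensor(neg / np.clip(pos, 1, None), dtype=torch.float32)

criterion = nn.BCEWithLogitsLoss(pos_weight=pos_weight)

# Usage with dummy logits and targets of matching shape.
logits = torch.randn(8, 20)
loss = criterion(logits, torch.tensor(Y[:8]))
print(loss.item())
```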

Clustering with custom criterion (minimum cluster weight)

Edit: following a comment from @anony-mousse, I'm changing the question to search for a general clustering approach that matches this criterion (minimum weight per cluster). I am to use a clustering method on a set of $n$ weighted points:

---------------------------------------------
| id | weight | feature_1 | feature_2 | ... |
---------------------------------------------
| 1  | 4      | 0.2345    | -0.2345   | ... |
| 2  | 2      | 0.675     | 0.7433    | ... |
| 3  | 15     | -0.45     | 0.123     | …
Category: Data Science
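One general heuristic for the minimum-weight criterion (a sketch only, with made-up data and threshold): over-cluster with weighted k-means, then greedily merge any cluster whose total weight falls below the threshold into its nearest neighbour.

```python
import numpy as np
from sklearn.cluster import KMeans

# Made-up weighted points: integer weights plus two features.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 2))
w = rng.integers(1, 20, size=200)
MIN_WEIGHT = 100  # assumed minimum total weight per cluster

# Start with more clusters than needed; KMeans accepts sample_weight directly.
km = KMeans(n_clusters=10, n_init=10, random_state=0).fit(X, sample_weight=w)
labels = km.labels_.copy()
centers = {c: km.cluster_centers_[c] for c in range(10)}

def cluster_weight(c):
    return w[labels == c].sum()

# Greedily merge under-weight clusters into the nearest remaining centroid.
merged = True
while merged:
    merged = False
    for c in list(centers):
        if cluster_weight(c) < MIN_WEIGHT and len(centers) > 1:
            nearest = min((o for o in centers if o != c),
                          key=lambda o: np.linalg.norm(centers[o] - centers[c]))
            labels[labels == c] = nearest
            mask = labels == nearest
            centers[nearest] = np.average(X[mask], axis=0, weights=w[mask])
            del centers[c]
            merged = True
            break

print({c: int(cluster_weight(c)) for c in centers})
```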

How to weight loss in regression

I've got a regression problem where a model is required to predict a value in the range [0, 1]. I've looked at the distribution of the data, and it seems that there are more examples with a low-value label ([0, 0.2]) than with higher-value labels ([0.2, 1]). When I try to train the model using the MAE metric, the model converges to a state where it has a very low loss, but it seems that the …
Category: Data Science
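One way to counteract the skewed label distribution (a sketch on synthetic data, mirroring sklearn's "balanced" class-weight idea by binning the continuous target): weight each sample inversely to the frequency of its target bin and pass the weights to the fit.

```python
import numpy as np
from sklearn.ensemble import HistGradientBoostingRegressor

# Synthetic skewed targets in [0, 1]: most labels fall in [0, 0.2].
rng = np.random.default_rng(0)
y = np.clip(rng.beta(1.2, 6.0, size=5000), 0, 1)
X = rng.normal(size=(5000, 5)) + y[:, None]   # toy features correlated with y

# Weight each sample inversely to the frequency of its target bin so that
# the rarer high-value labels contribute more to the (absolute-error) loss.
bins = np.linspace(0, 1, 11)
bin_idx = np.digitize(y, bins[1:-1])
bin_counts = np.bincount(bin_idx, minlength=10)
weights = len(y) / (10 * bin_counts[bin_idx])

model = HistGradientBoostingRegressor(loss="absolute_error", random_state=0)
model.fit(X, y, sample_weight=weights)
```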

Weighted loss functions vs weighted sampling?

For image classification tasks, is there a practical difference between using weighted loss functions vs. using weighted sampling? (I would appreciate theoretical arguments, experience or published papers, anything really.) Some details: By "weighted sampling", I mean attributing different sampling probabilities for each sample in the training set. By "weighted loss functions", I mean weighting error terms differently depending on the sample considered.
Category: Data Science
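For concreteness, here is how the two options typically look side by side in PyTorch (toy data; neither option is endorsed over the other by the question):

```python
import torch
from torch.utils.data import TensorDataset, DataLoader, WeightedRandomSampler

# Toy imbalanced 3-class dataset.
labels = torch.tensor([0] * 900 + [1] * 80 + [2] * 20)
features = torch.randn(1000, 16)
dataset = TensorDataset(features, labels)

class_counts = torch.bincount(labels).float()
class_weights = class_counts.sum() / (len(class_counts) * class_counts)

# Option A: weighted sampling -- minority samples are drawn more often, so each
# batch is roughly balanced and the loss stays unweighted.
sampler = WeightedRandomSampler(class_weights[labels],
                                num_samples=len(dataset), replacement=True)
loader_a = DataLoader(dataset, batch_size=64, sampler=sampler)
criterion_a = torch.nn.CrossEntropyLoss()

# Option B: weighted loss -- batches keep the natural class ratio, but errors
# on minority classes are penalized more heavily.
loader_b = DataLoader(dataset, batch_size=64, shuffle=True)
criterion_b = torch.nn.CrossEntropyLoss(weight=class_weights)
```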

How to apply class weight to a multi-output model?

I have a model with 2 categorical outputs. The first output layer can predict 2 classes: [0, 1] and the second output layer can predict 3 classes: [0, 1, 2]. How can I apply different class weight dictionaries for each of the outputs? For example, how could I apply the dictionary {0: 1, 1: 10} to the first output, and {0: 5, 1: 1, 2: 10} to the second output? I've tried to use the following class weights dictionary weight_class={'output1': …
Category: Data Science
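One commonly suggested workaround (a sketch with a hypothetical two-headed model; the weight values come from the question): since Keras's class_weight argument does not take per-output dictionaries, convert each class-weight dict into a per-sample weight vector and pass those via sample_weight keyed by output name.

```python
import numpy as np
from tensorflow import keras

# Hypothetical two-headed model: output1 has 2 classes, output2 has 3 classes.
inputs = keras.Input(shape=(8,))
x = keras.layers.Dense(16, activation="relu")(inputs)
out1 = keras.layers.Dense(2, activation="softmax", name="output1")(x)
out2 = keras.layers.Dense(3, activation="softmax", name="output2")(x)
model = keras.Model(inputs, [out1, out2])
model.compile(optimizer="adam",
              loss={"output1": "sparse_categorical_crossentropy",
                    "output2": "sparse_categorical_crossentropy"})

# Toy data.
X = np.random.rand(100, 8)
y1 = np.random.randint(0, 2, size=100)
y2 = np.random.randint(0, 3, size=100)

# Translate the per-output class-weight dicts into per-sample weights.
cw1 = {0: 1.0, 1: 10.0}
cw2 = {0: 5.0, 1: 1.0, 2: 10.0}
sw1 = np.vectorize(cw1.get)(y1).astype("float32")
sw2 = np.vectorize(cw2.get)(y2).astype("float32")

model.fit(X, {"output1": y1, "output2": y2},
          sample_weight={"output1": sw1, "output2": sw2},
          epochs=2, verbose=0)
```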

Assign more importance to recent records during training

My goal is to build a classification model in order to predict if a customer will buy a product or not (binary classification). Since I know that the company's advertising has changed a bit in the last few months (let's say 3-4), I want to put more emphasis on the newer records. I know that it is possible to specify the sample_weight parameter in most classification algorithms, but I don't know how to properly build these weights. …
Category: Data Science
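One simple recipe for such recency weights (a sketch on synthetic data; the half-life is an assumption to be tuned): decay each record's weight exponentially with its age and pass the result as sample_weight.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LogisticRegression

# Synthetic data with a timestamp per record.
rng = np.random.default_rng(0)
n = 5000
dates = pd.Timestamp("2024-01-01") + pd.to_timedelta(rng.integers(0, 365, n), unit="D")
X = rng.normal(size=(n, 4))
y = rng.integers(0, 2, n)

# Exponential decay: a record loses half its weight every `half_life` days.
half_life = 90  # assumed; roughly when the advertising changed
age_days = (dates.max() - dates).days.to_numpy()
sample_weight = 0.5 ** (age_days / half_life)

model = LogisticRegression(max_iter=1000)
model.fit(X, y, sample_weight=sample_weight)
```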

Training a model where each response in the observation data has a different known variance

I have a dataset where each response variable is the number of successes out of N Bernoulli trials, with N and p (the probability of success) being different for each observation. The goal is to train a model to predict p given the predictors. However, observations with a small N will have a higher variance than those with a higher N. Consider the following scenario to illustrate this better: assume coins with different pictures on them have a different bias, and that the bias is …
Category: Data Science
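One standard way to handle the unequal variances (a sketch on simulated coin flips): fit a binomial GLM on the (successes, failures) pairs, so the likelihood itself gives observations with larger N more weight instead of weighting residuals by hand.

```python
import numpy as np
import statsmodels.api as sm

# Simulated data: each row is N Bernoulli trials with k observed successes.
rng = np.random.default_rng(0)
n_obs = 500
X = rng.normal(size=(n_obs, 3))
N = rng.integers(1, 100, size=n_obs)                  # trials per observation
true_p = 1 / (1 + np.exp(-(0.5 * X[:, 0] - X[:, 1])))
k = rng.binomial(N, true_p)                           # observed successes

# A binomial GLM on (successes, failures) uses the full likelihood, so rows
# with large N automatically carry more information than rows with small N.
endog = np.column_stack([k, N - k])
exog = sm.add_constant(X)
result = sm.GLM(endog, exog, family=sm.families.Binomial()).fit()
print(result.summary())
```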

Understanding Weighted learning in Ensemble Classifiers

I'm currently studying Boosting techniques in Machine Learning and I happened to understand that in Algorithms like Adaboost, each of the training samples is given a weight depending on whether it was misclassified or not by the previous model in sequential boosting. Although I intuitively understand that by weighting examples, we are letting the model pay more attention to examples that were previously misclassified, I do not understand "how" the weights are taken into account by a machine learning algorithm. …
Category: Data Science
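To make the mechanics concrete, here is a toy run of discrete AdaBoost (a sketch, not a production implementation): the sample weights are passed straight into the base learner's fit(), where splits are chosen by weighted error/impurity, and are then updated from the weighted error.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=500, random_state=0)
y_signed = np.where(y == 1, 1, -1)

n = len(y)
w = np.full(n, 1 / n)                      # start with uniform sample weights

for t in range(5):
    stump = DecisionTreeClassifier(max_depth=1)
    stump.fit(X, y, sample_weight=w)       # the learner minimizes *weighted* error
    pred = np.where(stump.predict(X) == 1, 1, -1)

    err = np.sum(w[pred != y_signed])      # weighted error rate (w sums to 1)
    alpha = 0.5 * np.log((1 - err) / max(err, 1e-10))

    # Misclassified samples get their weight increased, correct ones decreased.
    w *= np.exp(-alpha * y_signed * pred)
    w /= w.sum()
    print(f"round {t}: weighted error = {err:.3f}, alpha = {alpha:.3f}")
```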

Assigning weights based on outcome probability

In a classification problem, is it suitable to assign sample weights based on their positive class probability? For example, if I am building a binary classification problem where one of the independent features has three possible values:
- a – 2% of the samples, probability for positive class = 90%
- b – 8% of the samples, probability for positive class = 40%
- c – 90% of the samples, probability for positive class = 5%
Can I assign the samples weights based …
Category: Data Science
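Purely to illustrate the mechanics being asked about (synthetic data matching the stated proportions; whether target-derived weights are a good idea is exactly the open question): the per-group positive rates can be turned into sample weights like this.

```python
import numpy as np
import pandas as pd

# Synthetic data matching the proportions in the question.
rng = np.random.default_rng(0)
group = rng.choice(["a", "b", "c"], size=10_000, p=[0.02, 0.08, 0.90])
pos_rate = {"a": 0.90, "b": 0.40, "c": 0.05}
y = (rng.random(10_000) < pd.Series(group).map(pos_rate).to_numpy()).astype(int)
df = pd.DataFrame({"group": group, "y": y})

# Empirical positive-class probability per feature value, relative to the
# overall rate, used directly as a sample weight (the scheme under discussion).
p_pos = df.groupby("group")["y"].mean()
overall = df["y"].mean()
df["weight"] = df["group"].map(p_pos / overall)
print(df.groupby("group")["weight"].first())
```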

XGBoost: How to obtain scale_pos_weight for multi classes?

I know there is a similar question at "Unbalanced multiclass data with XGBoost", but I don't understand the reply provided by @Esmailian. What is the actual formula to obtain 1, 0.333 and 0.167? For example, if we have three imbalanced classes with the ratios
- class A = 10%
- class B = 30%
- class C = 60%
their weights would be (dividing the smallest class by the others)
- class A = 1.000
- class B = 0.333
- class C = 0.167
Will I obtain …
Category: Data Science
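The arithmetic behind 1, 0.333 and 0.167 is just "smallest share divided by each class's share"; and since scale_pos_weight only applies to binary problems, one way to use these numbers for three classes is as per-row sample weights (a sketch on synthetic data):

```python
import numpy as np
from xgboost import XGBClassifier

# Divide the smallest class share by each class's own share.
shares = np.array([0.10, 0.30, 0.60])      # classes A, B, C
class_weights = shares.min() / shares
print(class_weights)                        # [1.0, 0.333..., 0.1666...]

# For multiclass, give each row its class's weight through sample_weight.
rng = np.random.default_rng(0)
y = rng.choice(3, size=3000, p=shares)      # 0 = A, 1 = B, 2 = C
X = rng.normal(size=(3000, 5))

model = XGBClassifier(eval_metric="mlogloss")
model.fit(X, y, sample_weight=class_weights[y])
```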

Ensemble/combining models weighted by number of observations?

Across a few different projects, I have hit a problem where I have two (or more) models:
- General-purpose model: a model based on a large amount of data not specifically relevant to my current classifier label/goal, but which predicts other labels using similar features.
- Cold-start model: a model trained on data specifically related to my current label/task, which initially starts with zero observations and goes up from there.
So then, my question: what is an appropriate way …
Category: Data Science
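One simple blending scheme for this situation (a sketch; the shrinkage constant k is a made-up knob): average the two models' predicted probabilities, with the cold-start model's weight growing with the number of task-specific observations.

```python
import numpy as np

def blend_probabilities(p_general, p_specific, n_obs, k=200):
    """Blend two models' predicted probabilities.

    The cold-start model's weight grows with the number of task-specific
    observations it was trained on: w = n / (n + k), where k (hypothetical)
    controls how quickly trust shifts from the general-purpose model.
    """
    w = n_obs / (n_obs + k)
    return (1 - w) * p_general + w * p_specific

# With 50 task observations the general model dominates; with 5000 it doesn't.
p_gen, p_spec = np.array([0.30, 0.70]), np.array([0.60, 0.10])
print(blend_probabilities(p_gen, p_spec, n_obs=50))
print(blend_probabilities(p_gen, p_spec, n_obs=5000))
```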

Logistic Regression: shouldn't weighting by the number of instances give the same result? What could explain the discrepancy?

I am performing a logistic regression in a standard supervised framework (dataset X, target y). The dataset X is composed of a handful of categorical variables (which I one-hot encode), so it contains a lot of redundant rows (thousands of unique rows over millions of initial rows). Having so many redundant rows, I was tempted to aggregate them, weight them by their count in the fit, and get approximately the same result. However, I was surprised to get variation …
Category: Data Science
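A small experiment along the lines of the question (a sketch on synthetic one-hot data): fit on all rows versus on unique rows weighted by their counts. With the same regularization strength and a tight solver tolerance, the two solutions should agree closely; loose tolerances are a common source of the observed discrepancy.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LogisticRegression

# Synthetic categorical data with heavily duplicated rows.
rng = np.random.default_rng(0)
raw = pd.DataFrame({"cat1": rng.choice(["a", "b", "c"], size=100_000),
                    "cat2": rng.choice(["x", "y"], size=100_000)})
X_full = pd.get_dummies(raw).to_numpy(dtype=float)
logit = X_full @ np.array([0.8, -0.2, 0.1, 0.5, -0.5])
y_full = (rng.random(100_000) < 1 / (1 + np.exp(-logit))).astype(int)

# Fit 1: all rows.
m1 = LogisticRegression(C=1.0, tol=1e-10, max_iter=10_000).fit(X_full, y_full)

# Fit 2: unique (row, label) combinations weighted by their counts.
df = pd.DataFrame(X_full)
df["y"] = y_full
grouped = df.groupby(list(df.columns), as_index=False).size()
X_u = grouped.iloc[:, :-2].to_numpy()
y_u = grouped["y"].to_numpy()
m2 = LogisticRegression(C=1.0, tol=1e-10, max_iter=10_000).fit(
    X_u, y_u, sample_weight=grouped["size"].to_numpy())

print(np.abs(m1.coef_ - m2.coef_).max())   # should be very small
```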

Weight for Samples on SVM

There is a sample_weight option in the fit(X[, y, sample_weight]) method (OneClassSVM, sklearn library). If I use the sample_weight option, I can give more weight to some points (those that are likely to be more normal), right? Otherwise, what does sample_weight mean? Link: https://scikit-learn.org/stable/modules/generated/sklearn.svm.OneClassSVM.html#sklearn.svm.OneClassSVM.score_samples
Category: Data Science
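The scikit-learn documentation describes sample_weight as rescaling each point's penalty, so up-weighted points are more expensive to leave outside the learned region and effectively count as "more normal". A small sketch on made-up data:

```python
import numpy as np
from sklearn.svm import OneClassSVM

# Made-up 2-D data: a dense blob plus a few distant points.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0, 1, size=(200, 2)),
               rng.normal(6, 0.5, size=(5, 2))])

# Up-weighting the distant points tells the model to treat them as normal.
weights = np.ones(len(X))
weights[-5:] = 20.0

unweighted = OneClassSVM(nu=0.05, gamma="scale").fit(X)
weighted = OneClassSVM(nu=0.05, gamma="scale").fit(X, sample_weight=weights)

print("unweighted predictions for distant points:", unweighted.predict(X[-5:]))
print("weighted predictions for distant points:  ", weighted.predict(X[-5:]))
```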

Implementing class weighting in Faster RCNN

I have a dataset (around 45,000 screenshots) of UI elements (UI trees containing element types and bounding boxes) and the associated screenshots. The dataset is highly imbalanced, with the button element heavily overrepresented. When training on my local machine on a tiny subset of the data (900 screenshots for training, 100 for testing) for 10 epochs, my results aren't bad. I then trained the model on Azure ML with 25,000 screenshots for 13 epochs (which took about 3 days), and my …
Category: Data Science
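One approach people sometimes use here (a sketch only, assuming a recent torchvision where the ROI-head classification loss is a plain cross-entropy computed in roi_heads.fastrcnn_loss; the weight values and num_classes=5 are hypothetical): replace that loss with a class-weighted version that down-weights the over-represented button class.

```python
import torch
import torch.nn.functional as F
from torchvision.models.detection import fasterrcnn_resnet50_fpn, roi_heads

# Hypothetical per-class weights: index 0 is background; the over-represented
# "button" class (index 1 here) gets a weight below 1.
class_weights = torch.tensor([1.0, 0.2, 1.0, 1.0, 1.0])

def weighted_fastrcnn_loss(class_logits, box_regression, labels, regression_targets):
    """Drop-in replacement for roi_heads.fastrcnn_loss with weighted CE."""
    labels = torch.cat(labels, dim=0)
    regression_targets = torch.cat(regression_targets, dim=0)

    # The only change vs. torchvision: per-class weights in the cross-entropy.
    classification_loss = F.cross_entropy(
        class_logits, labels, weight=class_weights.to(class_logits.device))

    # Box-regression part kept as in torchvision (smooth L1 on positives only).
    pos_inds = torch.where(labels > 0)[0]
    labels_pos = labels[pos_inds]
    N = class_logits.shape[0]
    box_regression = box_regression.reshape(N, box_regression.size(-1) // 4, 4)
    box_loss = F.smooth_l1_loss(
        box_regression[pos_inds, labels_pos],
        regression_targets[pos_inds],
        beta=1 / 9, reduction="sum") / labels.numel()

    return classification_loss, box_loss

roi_heads.fastrcnn_loss = weighted_fastrcnn_loss   # patch before training
model = fasterrcnn_resnet50_fpn(weights=None, num_classes=5)
```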

CNN - imbalanced classes, class weights vs data augmentation

I have a dataset with a few strongly imbalanced classes, e.g. the smallest class is about 54 times smaller than the largest. Therefore, data augmentation in order to equalize the size of the classes seems like a bad idea to me (in the example above, each image would have to be augmented 54 times on average). So I thought that I could do less augmentation of the minority classes and then use class weights in the loss function. Is this approach better …
Category: Data Science
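For the "moderate augmentation plus class weights" combination, a minimal sketch in Keras (synthetic labels reproducing the ~54x imbalance; layer choices are arbitrary):

```python
import numpy as np
from sklearn.utils.class_weight import compute_class_weight
from tensorflow import keras

# Synthetic labels: the minority class is 54x rarer than the majority class.
y_train = np.concatenate([np.zeros(5400), np.ones(100)]).astype(int)

# "Balanced" class weights from sklearn, as the dict Keras expects.
classes = np.unique(y_train)
weights = compute_class_weight(class_weight="balanced", classes=classes, y=y_train)
class_weight = {int(c): w for c, w in zip(classes, weights)}
print(class_weight)   # the minority class gets a much larger weight

# Modest augmentation applied to every class; the residual imbalance is
# handled by the loss via class_weight rather than by 54x oversampling.
augment = keras.Sequential([
    keras.layers.RandomFlip("horizontal"),
    keras.layers.RandomRotation(0.1),
])

# model.fit(train_images, y_train, class_weight=class_weight, ...)
```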

Are there R functions that allow testing for overdispersion when fitting a model with a survey design?

I realized I need to use the package survey to be able to include sample weights in my regression analysis. Initially, I wanted to use a negative binomial regression on each one of my outcomes as count data is more often than not overdispersed, so I tried using svyglm.nb. However, for one of the outcomes which has small values, svyglm.nb makes my program crash, so I think there might be some convergence issue. I thought using a Poisson regression might …
Category: Data Science

Weighting the loss function based on previous seen true positive rates

Similar to class imbalance, there is always something I would call "learnability imbalance" in multi-class classification. What I mean by that: even when the classes are evenly distributed in the dataset, some classes will be classified more easily by the model than others. An example would be a CNN model that classifies dog, cat and car. Dog and cat will most likely have a lower true positive rate than car because cats and dogs look more similar to each other. …
Category: Data Science
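One way to turn that idea into a loss (a sketch; the confusion matrix and the eps/normalization choices are made up): recompute per-class recall after each epoch and weight the cross-entropy inversely to it.

```python
import torch
import torch.nn as nn

def recall_based_weights(confusion, eps=0.05):
    """Per-class weights inversely proportional to last epoch's recall.

    `confusion` is a (num_classes x num_classes) matrix with true classes on
    the rows. Classes the model already gets right (e.g. 'car') are
    down-weighted; confusable ones ('dog', 'cat') are boosted.
    """
    recall = confusion.diag().float() / confusion.sum(dim=1).clamp(min=1).float()
    weights = 1.0 / (recall + eps)                    # eps keeps weights finite
    return weights * len(weights) / weights.sum()     # normalize to mean 1

# Toy confusion matrix for dog / cat / car from a previous epoch.
confusion = torch.tensor([[60, 35,  5],
                          [30, 65,  5],
                          [ 2,  3, 95]])
weights = recall_based_weights(confusion)
criterion = nn.CrossEntropyLoss(weight=weights)       # rebuild each epoch
print(weights)
```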
