labels

How to label legit users when developing a bot-flagging classification model?

Marc

June 2, 2022, 14:07

I’m working on a project where I try to distinguish bots from legit users on social media. The data I collected is not labeled, but I have labeled about 17% of it (22k users) through different techniques. Finding bots was easy as they all have similarities with each other, but it's different for legit users. In my labeled data, I have most if not all bots labeled but still have a ton of legit users to label, which is really …

Topic: labelling labels python machine-learning

Category: Data Science

Are more target labels in a multi-label classification always better?

Spenhouet

June 1, 2022, 10:34

Context: We work on medical image segmentation. There are a lot of potential labels for one and the same region we segment. There can be different medically defined labels like anatomical regions, more biological labels like tissue types, or spatial labels like left/right. And many labels can be further differentiated into (hierarchical) sub-labels. Clarification: The question is with respect to the number of classes / target labels which are used in a multi-label classification/segmentation. It is not about the …

Topic: image-segmentation labels multilabel-classification multitask-learning deep-learning

Category: Data Science

Discriminator of a Conditional GAN with continuous labels

user3023715

May 24, 2022, 13:01

OK, let's say we have well-labeled images with non-discrete labels such as brightness or size or something and we want to generate images based on it. If it were done with a discrete label it could be done like:

    def forward(self, inputs, label):
        self.batch = inputs.size(0)
        h = self.res1(inputs)
        h = self.attn(h)
        ...
        h = self.res5(h)
        h = torch.sum((F.leaky_relu(h,0.2)).view(self.batch,-1,4*4), dim=2)
        outputs = self.fc(h)
        if label is not None:
            embed = self.embedding(label)
            outputs += torch.sum(embed*h,dim=1,keepdim=True)

The embedding can be made to …

Topic: generative-models embeddings labels regression deep-learning

Category: Data Science

How to weigh imbalanced softlabels?

Kay Lamerigts

May 16, 2022, 23:02

The target is a probability distribution over N classes. I don't want the model to predict the class with the highest probability, but the 'actual' probability per class. For example:

|   | Class 1 | Class 2 | Class 3 |
------------------------------------
| 1 | 0.9     | 0.05    | 0.05    |
| 2 | 0.2     | 0.8     | 0       |
| 3 | 0.3     | 0.3     | 0.4     |
| 4 | 0.7     | 0       | 0.3     |
------------------------------------
| + | …

Topic: labels class-imbalance

Category: Data Science
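One possible way to approach the soft-label question above is a cross-entropy against the full target distribution, with a per-class weight derived from how much probability mass each class receives. The sketch below is only an illustration under that assumption: the function names `class_weights` and `weighted_soft_cross_entropy`, the inverse-mass weighting, and the example prediction are all hypothetical and not taken from the question. Only the four fully visible table rows are used.

```python
import math

# The four fully visible rows of the table in the question.
soft_labels = [
    [0.9, 0.05, 0.05],
    [0.2, 0.8, 0.0],
    [0.3, 0.3, 0.4],
    [0.7, 0.0, 0.3],
]


def class_weights(targets):
    """Inverse-mass weights: classes with little total probability get larger weights.

    Assumes every class has non-zero total mass (otherwise this divides by zero).
    """
    n_classes = len(targets[0])
    mass = [sum(row[c] for row in targets) for c in range(n_classes)]
    total = sum(mass)
    return [total / (n_classes * m) for m in mass]


def weighted_soft_cross_entropy(target, predicted, weights, eps=1e-12):
    """Cross-entropy between a soft target and a predicted distribution, weighted per class."""
    return -sum(
        w * p * math.log(q + eps)
        for w, p, q in zip(weights, target, predicted)
    )


weights = class_weights(soft_labels)
# Hypothetical model output for the first sample.
loss = weighted_soft_cross_entropy(soft_labels[0], [0.8, 0.1, 0.1], weights)
print(weights, loss)
```

Whether inverse-mass weighting is appropriate depends on the goal; if calibrated per-class probabilities matter most, reweighting may distort them, so treat the weights as a tunable choice.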

How to use confidence labels?

May 13, 2022, 07:23

I have 2 sets of training data in csv files. The training data have class labels, 1 for memorable, and 0 for not memorable. In addition, there is also a confidence label for each sample. The class labels were assigned based on decisions from 3 people viewing the photos. When they all agreed, the class label could be considered certain, and a confidence of 1 was written down. If they didn't all agree, then the classification decided on by the …

Topic: binary-classification confidence labels dataset python

Category: Data Science

Clustering of multi-label data

Ahron

May 10, 2022, 11:02

The dataset consists of 1) a set of objects and 2) a set of labels, which are used to describe the objects. For the moment, for simplicity's sake, each label can be marked as either true or false (in a more complex setup, each label will have a value of 1-10). But not all the labels are actually applied to all the objects (in principle, all the labels can and should be applied across all the objects, but in practice, …

Topic: labels multilabel-classification classification clustering

Category: Data Science

Using CNNs to detect incorrect label images in dataset

Lema Zaidi

May 3, 2022, 14:41

What I want to do is train a model to identify the images that are incorrectly labeled in my dataset. For example, in a class of dogs, I can find cat images, and I want a model that detects all the images that are in the wrong class. Has anyone tried this and can share more details, or does anyone have any ideas? I'm open to all ideas, and thank you in advance.

Topic: cnn labels multiclass-classification image-classification python

Category: Data Science

Ground truth/label modification during training (with the data obtained from the

Alex

April 29, 2022, 23:04

I'm working on an image segmentation algorithm with FCN (Long et al., 2015) as the backbone network. One idea I have is to use the argmax binary mask obtained from the final score layer (250x250x1) to generate some data (e.g. number of blobs in the mask) to modify the ground truth (e.g. set some pixels in the gt mask to 'ignore' labels) or in some way (partly) extract from the features (similar to RPN layer in FasterRCNN). Does this violate …

Topic: labels deep-learning

Category: Data Science

Are there any tools to make text labeling faster?

Alejandro Rodriguez

April 25, 2022, 13:00

I saw the labelme tool and was wondering if there is a similar tool for labelling short documents? Preferably in Python.

Topic: labels nlp python

Category: Data Science

Given daily sequence of events with only event ID labels (alphanum strings), what algorithms can be used to detect sequences that are outliers?

demoman

April 20, 2022, 15:16

For example, the data might be something like this:

Sequence 1: ["ABC", "AAA", "ZZ123", "RRZZZ45", "AABBCC"]
Sequence 2: ["CBA", "AAA", "YY123", "LMNOP", "AABBCC"]
Sequence 3: ["ABC", "AAA", "ZZ123", "RRZZZ45", "AABBCC"]
...
Sequence N: ["DEF", "AAA", "ZZ123", "YYZZZ45", "AABBCC"]

Sequence 1 and 3 are the same, but sequence 2 and N are different. In the data set, there will be thousands of these sequences every day. Additional questions: How could I calculate similarity (or difference) measure between sequences with sequences of …

Topic: labels distance sequence outlier clustering

Category: Data Science
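For the event-sequence question above, one simple similarity measure treats each sequence as a list of opaque tokens and compares them with Python's standard-library `difflib.SequenceMatcher`. This is a minimal sketch of one possible measure, not a recommendation from the thread; the helper name `sequence_similarity` is hypothetical, and the data is the three fully shown sequences from the question.

```python
from difflib import SequenceMatcher


def sequence_similarity(a, b):
    """Return a similarity in [0, 1] between two lists of event IDs."""
    # autojunk=False stops difflib from discarding very frequent IDs
    # in long sequences, which would skew the score.
    return SequenceMatcher(None, a, b, autojunk=False).ratio()


seq1 = ["ABC", "AAA", "ZZ123", "RRZZZ45", "AABBCC"]
seq2 = ["CBA", "AAA", "YY123", "LMNOP", "AABBCC"]
seq3 = ["ABC", "AAA", "ZZ123", "RRZZZ45", "AABBCC"]

sim_13 = sequence_similarity(seq1, seq3)  # identical sequences
sim_12 = sequence_similarity(seq1, seq2)  # only "AAA" and "AABBCC" line up
print(sim_13, sim_12)
```

A pairwise distance such as `1 - similarity` could then feed a clustering or nearest-neighbour outlier method, although with thousands of sequences per day the quadratic number of pairs may call for deduplicating identical sequences first.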

Labelling for churn measurement

The Great

April 20, 2022, 06:10

I have 3 domains of supplier data (Jan 2017 to Jan 2022) and they are as follows:

a) Purchase data - Contains all the purchase (of product) data made by the suppliers with us. It contains columns such as purchase date, invoice number, product id, supplier id, project name

b) Inventory data - Contains the stock/inventory info of our product with the suppliers (in their warehouse). This is reported every month. It contains columns such as supplier id, product id, inventory_reported_date, qty_in_stock …

Topic: labels classification predictive-modeling data-mining machine-learning

Category: Data Science

Merge one label with one information for classification problem or multi-label classification

Minila S

April 20, 2022, 00:08

I want to build a model to support decision making on whether or not to propose loan insurance to clients, because clients asking for both a loan and loan insurance sometimes have less chance of having their loan accepted by a bank, and sometimes more. There are three actors in the problem: a bank, a loan applicant (someone who asks for a loan) and a counselor. The counselor studies the loan application and if it has a good profile it will propose …

Topic: labels multilabel-classification decision-trees classification

Category: Data Science

How to edit T&C checker text in Woocommerce checkout page? gettext?

Anda

April 19, 2022, 14:00

I'm trying to edit the wording of the terms & conditions checkbox on the checkout page (order review section), but without luck. It was easy to edit other fields like billing and shipping fields. But I'm not sure how to target this specific T&C checkbox. For other input fields the code below works fine:

    // WooCommerce Rename Checkout Fields
    add_filter( 'woocommerce_checkout_fields' , 'custom_rename_wc_checkout_fields' );

    // Change placeholder and label text
    function custom_rename_wc_checkout_fields( $fields ) {
        $fields['billing']['billing_first_name']['placeholder'] = 'Type your first name...';
        $fields['billing']['billing_first_name']['label'] = …

Topic: woocommerce-offtopic labels translation Wordpress

Category: Web

What is the difference between a bounding box and ROI (Region of Interest)

Jitesh Malipeddi

April 13, 2022, 22:03

I was reading about Fast RCNN for object detection. From what I understand, it uses pre-computed ROIs (from selective search) to predict bounding box offsets, and uses a smooth L1 loss to refine these and get closer to the ground truth boxes. The paper states the following about the ROIs: while training, R/N ROIs for each image (N=2, R=128) are taken, where N is the number of images per mini-batch. Among the ROIs chosen, around 25% of them …

Topic: faster-rcnn object-detection labels computer-vision

Category: Data Science

Correct approach to usage of class labels in cell imaging data

OParry

April 2, 2022, 15:20

As part of a group project at university, we are given a series of videos of cell cultures over a 24 hour period. A number of these cells (the "knockout" cells) have had a particular gene removed, which is often absent or mutated in malignancy. We are using a blob detection algorithm to identify the cell centers and radii and further processing to match cells frame-to-frame to build up individual paths, which we then use to calculate various features. We …

Topic: training labels classifier binary

Category: Data Science

Get Label Statistics of Image Dataset

to_the_nth

March 19, 2022, 04:07

I have a labeled image dataset, where the images are in subfolders and there is one Pascal XML per image with the labels. I would like to compute stats like: how many images have exactly two labels? Or: what is the average size of the labeling rectangle? Ideally also statistics on image resolution, file size etc., but mostly labels. This is probably an easy question (many papers include that info), but I did not see that function in labelImg and …

Topic: labels image-classification

Category: Data Science
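For the label-statistics question above, the two statistics mentioned (images with exactly two labels, average rectangle size) can be computed with the standard library alone. The sketch below assumes the usual Pascal VOC layout of `<object>` elements containing a `<bndbox>` with `xmin`, `ymin`, `xmax`, `ymax`; the helper names and the in-memory sample annotations are made up for illustration.

```python
import xml.etree.ElementTree as ET


def box_sizes(xml_text):
    """Return a list of (width, height) for every object in one annotation."""
    root = ET.fromstring(xml_text)
    sizes = []
    for obj in root.iter("object"):
        box = obj.find("bndbox")
        xmin, ymin, xmax, ymax = (
            float(box.findtext(tag)) for tag in ("xmin", "ymin", "xmax", "ymax")
        )
        sizes.append((xmax - xmin, ymax - ymin))
    return sizes


def label_stats(annotations):
    """Aggregate simple label statistics over a list of annotation XML strings."""
    per_image = [box_sizes(text) for text in annotations]
    all_boxes = [size for sizes in per_image for size in sizes]
    n_boxes = max(len(all_boxes), 1)  # avoid dividing by zero on empty input
    return {
        "images": len(per_image),
        "images_with_exactly_two_labels": sum(1 for s in per_image if len(s) == 2),
        "mean_box_width": sum(w for w, _ in all_boxes) / n_boxes,
        "mean_box_height": sum(h for _, h in all_boxes) / n_boxes,
    }


def make_annotation(boxes):
    """Build a tiny made-up annotation string for the demo below."""
    objects = "".join(
        f"<object><name>thing</name><bndbox><xmin>{x0}</xmin><ymin>{y0}</ymin>"
        f"<xmax>{x1}</xmax><ymax>{y1}</ymax></bndbox></object>"
        for x0, y0, x1, y1 in boxes
    )
    return f"<annotation>{objects}</annotation>"


# Made-up data: one image with two boxes, one image with a single box.
samples = [
    make_annotation([(10, 20, 60, 40), (0, 0, 30, 30)]),
    make_annotation([(5, 5, 25, 45)]),
]
stats = label_stats(samples)
print(stats)
```

On a real dataset the annotation strings could come from something like `[p.read_text() for p in pathlib.Path(dataset_root).rglob("*.xml")]`, where `dataset_root` is whatever folder holds the subfolders.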

Is there any tool for data visualization and manipulation?

Victor

March 3, 2022, 15:00

I have a time series data set that I need to manually label for supervised learning. What I am doing now is using Excel to plot the data, and when I see the pattern that I want, I hover over the point on the plot, read its index, then mark the corresponding rows in the data. I think it is not very efficient; for example, I cannot zoom or scroll. I want to ask whether there is any tool that I …

Topic: labels supervised-learning preprocessing visualization

Category: Data Science

How to train a machine learning algorithm with multiple labels

requalys

February 28, 2022, 10:01

I have the following challenge and I very much hope that there is a solution to it. I also suspect that there is a simple approach to it; I just don't see it at the moment. Any help or advice is highly appreciated. So, I have the following situation: I asked people to label about 1000 data points (each twice) on a 5-point scale whose scores are not equidistant. Texts were assessed with regard to several qualitative characteristics (such as …

Topic: labels multiclass-classification supervised-learning machine-learning

Category: Data Science

Ordered categorical xlabel number - what to call xlabel

nammerkage

February 22, 2022, 10:02

Say I have 105 brand names from a store, and I know the average return percentage for the products of the different brands. For example: Brand = Nike, return_rate = 30%. Then I order all these brands and simply put in an integer instead of the name (since I can't put all brands on the xlabel). So now Nike is simply number 50: Brand = 50, return_rate = 30%. The graph looks like this: [figure]. I have no clue what …

Topic: plotting labels

Category: Data Science

How is a coincidence matrix constructed for computing Krippendorff's alpha?

quanty

February 21, 2022, 11:01

I am looking at two documents to help me learn about constructing coincidence matrices in order to gain a better understanding of Krippendorff's alpha. I am using these two: https://repository.upenn.edu/cgi/viewcontent.cgi?article=1043&context=asc_papers https://en.wikipedia.org/wiki/Krippendorff%27s_alpha There seems to me to be a discrepancy between the two. There probably isn't, but I'm looking for some help in figuring out whether my understanding is wrong, or if there is indeed a discrepancy. In link 1, I am looking at section B ("Nominal data, 2 observers, no …

Topic: labels statistics

Category: Data Science
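For the Krippendorff's alpha question above, the standard construction of a coincidence matrix can be sketched in a few lines: within each unit that received at least two values, every ordered pair of values from different observers adds 1/(m_u - 1) to the corresponding cell, where m_u is the number of values in that unit. The function name and the toy ratings below are hypothetical and are not taken from either linked document.

```python
from collections import defaultdict
from itertools import permutations


def coincidence_matrix(units):
    """Build the coincidence matrix as a dict mapping (c, k) to o_ck.

    units: list of lists; each inner list holds the values the observers
    assigned to one unit, with missing values simply left out.
    """
    o = defaultdict(float)
    for values in units:
        m = len(values)
        if m < 2:
            continue  # a unit with a single value cannot be paired
        # permutations pairs distinct positions, so equal values from two
        # different observers still count (they land on the diagonal).
        for c, k in permutations(values, 2):
            o[(c, k)] += 1.0 / (m - 1)
    return dict(o)


# Made-up ratings from two observers; the last unit has one missing value.
units = [["a", "a"], ["a", "b"], ["b", "b"], ["b", "b"], ["a"]]
matrix = coincidence_matrix(units)
total = sum(matrix.values())
print(matrix, total)
```

With two observers and no missing data, each unit contributes exactly two entries of weight 1, so an agreeing unit adds 2 to a diagonal cell and a disagreeing unit adds 1 to each of the two mirrored off-diagonal cells. The matrix is therefore symmetric, and its total equals the number of pairable values.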
