Are more target labels in a multi-label classification always better?

Context: We work on medical image segmentation. There are a lot of potential labels for one and the same region we segment. There can be different medically defined labels like anatomical regions, more biological labels like tissue types, or spatial labels like left/right. And many labels can be further differentiated into (hierarchical) sub-labels. Clarification: The question is with respect to the number of classes / target labels which are used in a multi-label classification/segmentation. It is not about the …
Category: Data Science

Binary classification for weather data: is an alert class 1 or class 0?

I am working on weather data that has a few features that are independent variables, such as severity, severity_id, urgency_id etc. Based on these values, I would like to classify alerts into class 0 or 1. For example, below are rows from the data source:

Alert | Severity | Sev_Id | Urg_Id | Event   | Sys_Rec (Target Variable)
--------------------------------------------------------------------------
dummy | Extreme  | 1      | 1      | STORM   | 1
dummy | Minor    | 3      | 5      | RIPTIDE | …
Category: Data Science

Using softmax for multilabel classification (as per Facebook paper)

I came across this paper by some Facebook researchers where they found that using a softmax and CE loss function during training led to improved results over sigmoid + BCE. They do this by changing the one-hot label vector such that each '1' is divided by the number of labels for the given image (e.g. from [0, 1, 1, 0] to [0, 0.5, 0.5, 0]). However, they do not mention how this could then be used in the inference stage, …
Category: Data Science
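
The label transformation described in the excerpt above can be sketched as follows. This is a minimal illustration in plain Python, not the paper's implementation: the function names and the example logits are made up for the sketch, and it only covers the training loss, not the inference step the question asks about.

```python
import math

def normalized_targets(multi_hot):
    """Divide each 1 by the number of positive labels,
    e.g. [0, 1, 1, 0] -> [0, 0.5, 0.5, 0]."""
    k = sum(multi_hot)
    if k == 0:
        raise ValueError("at least one positive label is required")
    return [y / k for y in multi_hot]

def softmax(logits):
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def soft_cross_entropy(logits, multi_hot):
    """Cross-entropy between the softmax output and the normalized targets."""
    probs = softmax(logits)
    targets = normalized_targets(multi_hot)
    return -sum(t * math.log(p) for t, p in zip(targets, probs) if t > 0)

# Hypothetical logits for one image with labels 1 and 2 present.
targets = normalized_targets([0, 1, 1, 0])
loss = soft_cross_entropy([0.2, 1.5, 1.1, -0.3], [0, 1, 1, 0])
```

Because the normalized targets sum to 1, they form a valid distribution for the cross-entropy against the softmax output.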

AUC-ROC for Multi-Label Classification

Hey guys, I'm currently reading about AUC-ROC. I have understood the binary case, and I think that I understand the multi-class case. Now I'm a bit confused about how to generalize it to the multi-label case, and I can't find any intuitive explanatory texts on the matter. I want to clarify if my intuition is correct with an example. Let's assume that we have some scenario with three classes (c1, c2, c3). Let's start with multi-class classification: When we're considering …
Category: Data Science

Understanding SGD for Binary Cross-Entropy loss

I'm trying to describe mathematically how stochastic gradient descent could be used to minimize the binary cross-entropy loss. The typical description of SGD that I can find online is: $\theta = \theta - \eta \nabla_{\theta}J(\theta,x^{(i)},y^{(i)})$ where $\theta$ is the parameter over which the objective function $J$ is optimized, and $x$ and $y$ come from the training set. Specifically, the $(i)$ indicates that it is the i-th observation from the training set. For binary cross-entropy loss, I am using …
Category: Data Science
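
As a rough illustration of the update rule quoted above, here is one SGD step on the binary cross-entropy loss. The excerpt is cut off before the model is stated, so this sketch assumes a logistic model $p = \sigma(\theta \cdot x)$, for which the gradient of the loss with respect to $\theta$ is $(p - y)\,x$. The function names, learning rate and data point are illustrative only.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def predict(theta, x):
    # Assumed model: logistic regression without a separate bias term.
    return sigmoid(sum(t * xi for t, xi in zip(theta, x)))

def bce(theta, x, y):
    """Binary cross-entropy J(theta, x, y) for a single observation."""
    p = predict(theta, x)
    return -(y * math.log(p) + (1 - y) * math.log(1 - p))

def sgd_step(theta, x, y, eta):
    """theta <- theta - eta * grad J, with grad J = (p - y) * x."""
    p = predict(theta, x)
    grad = [(p - y) * xi for xi in x]
    return [t - eta * g for t, g in zip(theta, grad)]

theta = [0.0, 0.0]
x_i, y_i = [1.0, 2.0], 1          # one made-up training observation
loss_before = bce(theta, x_i, y_i)
theta = sgd_step(theta, x_i, y_i, eta=0.1)
loss_after = bce(theta, x_i, y_i)
```

With a small enough step size, the loss on the sampled observation decreases after the update, which is what the rule is meant to achieve on average over the training set.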

Text similarity for badly written text

Consider the following scenario: Suppose two lists of words $L_{1}$ and $L_{2}$ are given. $L_{1}$ contains only badly written phrases (like '4ge' instead of 'age' or 'blwe' instead of 'blue' etc.). On the other hand, each element of $L_{2}$ is a well-written version of an element of $L_{1}$. Here is an example: $$L_{1}=[...,dqta \ 5ciencc,...,s7ack \ exch9nge,...],$$ $$L_{2}=[...,stack \ exchange,...,data \ science,...].$$ Problem: Is there any strategy to try to predict which element $w^{\prime}$ in $L_{2}$ is the syntactically correct counterpart …
Category: Data Science
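
One possible baseline for the matching problem above is character-level string similarity. The sketch below uses `difflib.get_close_matches` from the Python standard library; the helper name `best_match` and the cutoff of 0.6 are choices made for this illustration, and the approach is only a starting point, not necessarily the strategy the asker ended up with.

```python
import difflib

def best_match(noisy, candidates, cutoff=0.6):
    """Return the candidate most similar to `noisy`, or None if nothing
    reaches the similarity cutoff."""
    matches = difflib.get_close_matches(noisy, candidates, n=1, cutoff=cutoff)
    return matches[0] if matches else None

# The two example phrases from the question.
L1 = ["dqta 5ciencc", "s7ack exch9nge"]
L2 = ["stack exchange", "data science"]

pairs = {w: best_match(w, L2) for w in L1}
```

For longer lists, an edit-distance or character n-gram measure could replace the similarity ratio without changing the overall structure.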

Multi-Label time-series classification with LSTM: large performance decrease for longer periods

I have daily data on event occurrences, so for each day I have a vector like [1, 0, 1] indicating that on this day events one and three occurred, but event two did not occur. I want to train a model to take data from the past number of days (n_days) and to then predict the event occurrences for the next day. I believe this problem falls into the category of multi-label classification. Moreover, the data that I use has …
Category: Data Science
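
The setup described above (use the past n_days of event vectors to predict the next day's vector) can be sketched as a sliding-window transformation. This is a plain-Python illustration with made-up data; the function name `make_windows` is hypothetical, and the LSTM itself is left out.

```python
def make_windows(daily_events, n_days):
    """Build (input window, next-day target) pairs from a list of
    per-day multi-label vectors."""
    X, y = [], []
    for t in range(n_days, len(daily_events)):
        X.append(daily_events[t - n_days:t])  # the previous n_days days
        y.append(daily_events[t])             # the day to predict
    return X, y

# Five made-up days with three possible events each.
events = [[1, 0, 1], [0, 0, 1], [1, 1, 0], [0, 1, 0], [1, 0, 0]]
X, y = make_windows(events, n_days=3)
```

Each element of `X` then has shape (n_days, number of events), which is the usual (timesteps, features) layout a recurrent model expects per sample.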

Trained CNN individually on multiple images to classify them, how can I now classify a related "set" of these images that correspond to one object?

I have N object classification examples, each example consisting of a set of M individual images of the object at different angles. I've trained M CNNs, each on the dataset of one particular image angle and the corresponding labels. (Thus I have M sets of model parameters, one for each angle.) Now given this information, what is a good approach to classifying a new single object based on its set of M individual image angles? (i.e. I can classify …
Category: Data Science

How to train on extended data set correctly

I have trained my classifier on pictures with a mixture of several classes on each picture, e.g. A-F. The classifier is able to (nearly) correctly segment those classes on the images. Now I got more data with pictures showing class G. To minimize my work, I only labeled class G on the images and left the rest out (Invalid). Two questions for my training arise: If there are no examples of class G in my first dataset (because it could …
Category: Data Science

Clustering of multi-label data

The dataset consists of 1) a set of objects and 2) a set of labels, which are used to describe the objects. For the moment, for simplicity's sake, each label can be marked as either true or false (in a more complex setup, each label will have a value of 1-10). But not all the labels are actually applied to all the objects (in principle, all the labels can and should be applied across all the objects, but in practice, …
Category: Data Science

Why is Word2vec regarded as a neural embedding?

In the skip-gram model, the probability that a word $w$ is part of the set of context words $\{w_o^{(i)}\}$ $(i= 1:m)$, where $m$ is the context window around the central word, is given by: $$p(w_o | w_c) = \frac{\exp(\vec{u_o}\cdot \vec{v_c})}{\sum_{i\in V}\exp(\vec{u_i}\cdot \vec{v_c})} $$ where $V$ is the vocabulary (the set of distinct words in the training set), $\vec{u_i}$ is the word embedding for the context word and $\vec{v_i}$ is the word embedding for the central word. But this type of model is defining …
Category: Data Science
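
To make the skip-gram formula above concrete, the following sketch evaluates $p(w_o | w_c)$ as a softmax over dot products between the context vectors $\vec{u_i}$ and one central vector $\vec{v_c}$. The three-word vocabulary and the embedding values are invented for illustration; real embeddings would be learned during training.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def skipgram_prob(o, c, U, V_center):
    """p(w_o | w_c): softmax over u_i . v_c for every word i in the vocabulary."""
    scores = [dot(u, V_center[c]) for u in U]
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    return exps[o] / sum(exps)

# Toy embeddings: row i holds the vector of word i.
U = [[0.1, 0.4], [0.7, -0.2], [-0.3, 0.5]]         # context vectors u_i
V_center = [[0.2, 0.1], [-0.5, 0.3], [0.6, 0.6]]   # central vectors v_i

probs = [skipgram_prob(o, 0, U, V_center) for o in range(len(U))]
```

The probabilities over all candidate context words sum to 1 for a fixed central word, which is the normalization the denominator of the formula provides.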

XGBoost for multi-label image classification

I am trying to use the xgboost classifier for a multi-label and multi-class image classification task. I have a list of images that can have up to 5 different labels in each of them. Before I use the classifier I want to also apply image augmentation.

    import keras
    from sklearn.model_selection import train_test_split
    from keras.preprocessing.image import ImageDataGenerator
    from xgboost.sklearn import XGBClassifier

    train_idx, val_idx = train_test_split(mask_df.index, test_size=0.2, random_state=28)

    train_datagen = ImageDataGenerator(zoom_range=0.1, fill_mode='constant', rotation_range=10,
                                       height_shift_range=0.1, width_shift_range=0.1,
                                       horizontal_flip=True, vertical_flip=True, rescale=1/255.)

    train_generator = train_datagen.flow_from_dataframe(
        dataframe=mask_df.loc[train_idx],
        directory="home/DATA/train_images/",
        x_col="ImageId",
        y_col=columns,
        color_mode='grayscale',
        batch_size=32, …
Category: Data Science

Multi-label classification with nested features

I need to perform a multi-label classification. I have three features and they are nested. I am unsure how to combine this or what kind of classification algorithm would be best. Some multi level neural network as shown here seems good, but the nested features don't seem to be taken into account there. I present the nested features (X) and labels (Y) in the two datasets below: one subject ID can have one or more features and one or more …
Category: Data Science

Transform multi-class problem to multi-label problem

I found this question but I need an answer for the other direction. Example: Let's say we want to predict if a person with a certain profile wants to buy product A and/or B. So we have 2 binary classes A and B that don't exclude each other:

A B
0 1
1 1
0 0
1 0
...

(We don't want to predict how likely it is for a person to buy B if the person has already bought …
Category: Data Science

Positive/negative training sample imbalance in multi-label image classifiers

I'm trying to train VGG-16 on the Pascal VOC 2012 dataset, which has images with 20 labels (and a given image can have multiple classes present). The examples are highly imbalanced, so I've "balanced" them such that each label is represented roughly equally in the training set. But this means that for each label, 5% of the total images are positive examples and 95% are negative examples. There is no way to achieve a 50/50 split for all classes. I'm …
Category: Data Science

Custom multi-label cross-entropy loss that boosts weight of particular errors

I am using XGBoost for a multi-label classification problem (objective is 'multi:softmax' in XGBoost). In my case there are 16 discrete output labels where only one is correct. However, depending on the example, there are particular predicted labels that are "close to correct" that deserve some sort of partial credit/boost for being closer to the answer than other labels. I want to see if I can modify the objective/loss function in XGBoost to account for this. The user gets more …
Category: Data Science

Multilabel Classification - Overfitting?

My task is the following: to input drug combinations and output the renal failure-related symptoms resulting from the drug combinations. Both the drug combinations and the renal failure-related symptoms are represented as one-hot encoded vectors (for example, someone getting symptom 1 and symptom 3 out of a total of 4 symptoms is represented as [1,0,1,0]). So far, I have run the data through the following models and they have produced this interesting graph. The left-hand graph depicts the training and validation loss of the …
Category: Data Science

Text to Text classification

I am a newcomer to the field of data science and have been struggling with a simple classification problem. It seems to be generic enough, and I suspect that there must be a better way to frame/model this problem. I would appreciate any help. Background: In our system, we have millions of tickets (similar to JIRA tickets) where each ticket has attributes like title, description, tags etc. A user can create a dashboard and add any number of …
Category: Data Science
