Are more target labels in a multi-label classification always better?

Context: We work on medical image segmentation. There are many potential labels for one and the same region we segment. There can be different medically defined labels like anatomical regions, more biological labels like tissue types, or spatial labels like left/right. And many labels can be further differentiated into (hierarchical) sub-labels. Clarification: The question is with respect to the number of classes / target labels which are used in a multi-label classification/segmentation. It is not about the …
Category: Data Science

ImageDataGenerator for multi task output in Keras using flow_from_directory

I am creating a multi-task CNN model with two different classification properties (one with 10 classes, the second with 5 classes), and my directory structure looks like this:

-Train
  - image1.jpg
  ...
  - imageN.jpg
-Test
  - image1.jpg
  ...
  - imageN.jpg
-Vald
  - image1.jpg
  ...
  - imageN.jpg

The labels are in a CSV file as propA, propB, so a single image has two classes, one from property A and one from property B. The model uses VGG16: baseModel …
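A minimal sketch of one way to wire this up, assuming the CSV has columns named filename, propA and propB (these names are hypothetical) holding integer class ids: recent keras_preprocessing versions let flow_from_dataframe, rather than flow_from_directory, return several label arrays at once via class_mode="multi_output", which matches a two-head model whose heads use sparse categorical losses.

```python
import pandas as pd
from tensorflow.keras.preprocessing.image import ImageDataGenerator

# hypothetical CSV: one row per image with columns filename, propA, propB
df = pd.read_csv("train_labels.csv")

datagen = ImageDataGenerator(rescale=1.0 / 255)

# class_mode="multi_output" makes the generator yield
# (image_batch, [propA_batch, propB_batch]), which fits a two-head model
# whose heads are compiled with sparse_categorical_crossentropy.
train_gen = datagen.flow_from_dataframe(
    dataframe=df,
    directory="Train",
    x_col="filename",
    y_col=["propA", "propB"],
    class_mode="multi_output",
    target_size=(224, 224),
    batch_size=32,
)
```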
Category: Data Science

Multi-Source Time Series Data Prediction

I was wondering if anyone has experience with time series prediction for data from multiple sources. So for instance, time series $a, b, \dots, z$ each have their own shape, and some may be correlated with others. The ultimate goal is to have a model trained such that the value at time $t+1$ for any given data source can be predicted. I personally have two solutions that could work in theory, but I was wondering if anyone knew of other frequently used methods. Multi-task learning …
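A minimal sketch of one frequently used option, assuming training windows of a fixed length T and a one-hot id of the source series as a second input, so a single shared model can predict the value at $t+1$ for any source (layer sizes and the window length are illustrative):

```python
import tensorflow as tf
from tensorflow.keras import layers, Model

T, N_SERIES = 24, 26          # window length and number of sources (a..z)

window = layers.Input(shape=(T, 1), name="window")
series_id = layers.Input(shape=(N_SERIES,), name="series_id")

h = layers.LSTM(64)(window)                  # representation shared by all sources
h = layers.Concatenate()([h, series_id])     # let the head condition on the source
out = layers.Dense(1, name="next_value")(h)

model = Model([window, series_id], out)
model.compile(optimizer="adam", loss="mse")
```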
Category: Data Science

Multi-task reinforcement learning with different action spaces

I'm currently working on a project in which I need to apply multi-task reinforcement learning. Over the same state space, each agent aims to do a separate task, but the agents' action spaces are different from each other. I thought IMPALA would be a good choice at first glance, but it requires actions to be shared somehow, which is not applicable in my case. Can someone please give me an idea if there is an appropriate multi-task reinforcement learning …
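A minimal sketch of the usual way to accommodate differing action spaces, assuming discrete actions: a torso shared over the common state space with one policy head per task, each head with its own action dimensionality. All sizes here are illustrative, and this is only the network, not a full training algorithm like IMPALA.

```python
import torch
import torch.nn as nn

class MultiTaskPolicy(nn.Module):
    def __init__(self, state_dim, action_dims):
        super().__init__()
        self.torso = nn.Sequential(
            nn.Linear(state_dim, 128), nn.ReLU(),
            nn.Linear(128, 128), nn.ReLU(),
        )
        # one logits head per task, each with its own number of actions
        self.heads = nn.ModuleList([nn.Linear(128, a) for a in action_dims])

    def forward(self, state, task_id):
        return self.heads[task_id](self.torso(state))

policy = MultiTaskPolicy(state_dim=16, action_dims=[4, 7, 3])
logits = policy(torch.randn(1, 16), task_id=1)   # logits over task 1's 7 actions
```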
Category: Data Science

Which is better: multi-output model or separate models for similar tasks?

I am working on two problems:
1. classification of images into high-level classes (e.g. shoe, dress, jacket),
2. classification of the attributes of the same images on a lower level (e.g. shoe style, color of the dress), assuming that the high-level class is known.
Currently, I have designed an architecture for the 2nd problem as a multi-class multi-output network with ResNet50 as the backbone. Now I am dealing with the 1st problem and I have two paths to follow: …
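A minimal sketch of the single multi-output alternative, with hypothetical head names and class counts: one ResNet50 backbone with a head for the high-level class alongside the attribute heads, so both problems share the same features.

```python
from tensorflow.keras import layers, Model
from tensorflow.keras.applications import ResNet50

backbone = ResNet50(include_top=False, pooling="avg", input_shape=(224, 224, 3))
x = backbone.output

# head names and class counts are hypothetical
category = layers.Dense(10, activation="softmax", name="category")(x)   # shoe/dress/...
style = layers.Dense(8, activation="softmax", name="style")(x)
color = layers.Dense(12, activation="softmax", name="color")(x)

model = Model(backbone.input, [category, style, color])
model.compile(
    optimizer="adam",
    loss={"category": "categorical_crossentropy",
          "style": "categorical_crossentropy",
          "color": "categorical_crossentropy"},
)
```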
Category: Data Science

Tips for improving multitask learning based on multiple outputs

I'm currently trying to use multi-task learning based on a multi-output model that produces both a classification output and a regression output. However, at the moment it's staying at around 20% accuracy. I tried out multiple things, including choosing multiple loss functions and weighting the losses with loss_weights in Keras. Further, I tried adapting my Adam optimizer with different beta_1 and beta_2 values. Since I read that it's better to share more in case of overfitting, I tried out …
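A minimal sketch of the loss_weights mechanism being described, assuming a toy shared trunk with a classification head named "cls" and a regression head named "reg"; names, sizes, weights and optimizer settings are illustrative, not a recommendation.

```python
import tensorflow as tf
from tensorflow.keras import layers, Model

inp = layers.Input(shape=(32,))
h = layers.Dense(64, activation="relu")(inp)          # shared layers
cls = layers.Dense(5, activation="softmax", name="cls")(h)
reg = layers.Dense(1, name="reg")(h)
model = Model(inp, [cls, reg])

model.compile(
    optimizer=tf.keras.optimizers.Adam(learning_rate=1e-4, beta_1=0.9, beta_2=0.999),
    loss={"cls": "sparse_categorical_crossentropy", "reg": "mse"},
    loss_weights={"cls": 1.0, "reg": 0.5},   # down-weight the regression term
    metrics={"cls": "accuracy", "reg": "mae"},
)
```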
Category: Data Science

How to combine the loss in a multitask neural network?

When we train a model to learn two tasks at the same time (multitask learning), we get losses from both tasks in the neural network and then we combine them. I've seen several works where this is done, but I don't see a consistent way to achieve it among them. If I have tasks A and B, I have seen the following ways to form the total loss:
total_loss = A_loss + B_loss
# or
total_loss = (A_loss + …
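A minimal sketch of the weighted form, assuming task A is classification and task B is regression; the weights w_a and w_b are hyperparameters, and the plain sum is just the special case w_a = w_b = 1.

```python
import torch
import torch.nn as nn

ce = nn.CrossEntropyLoss()
mse = nn.MSELoss()

def total_loss(a_logits, a_target, b_pred, b_target, w_a=1.0, w_b=1.0):
    a_loss = ce(a_logits, a_target)
    b_loss = mse(b_pred, b_target)
    return w_a * a_loss + w_b * b_loss   # plain sum when w_a = w_b = 1

a_logits = torch.randn(8, 4)                              # task A: 4-class classification
a_target = torch.randint(0, 4, (8,))
b_pred, b_target = torch.randn(8, 1), torch.randn(8, 1)   # task B: regression
loss = total_loss(a_logits, a_target, b_pred, b_target, w_a=1.0, w_b=0.5)
```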
Category: Data Science

ignoring instances or masking by zero in a multitask learning model

For a multi-task learning model, I've seen that approaches usually mask the output that doesn't have a label with zeros. As an example, have a look here: How to Multi-task learning with missing labels in Keras. I have another idea: instead of masking the missing output with zeros, why don't we ignore it in the loss function? The CrossEntropyLoss implementation in PyTorch allows specifying a value to be ignored: CrossEntropyLoss. Is this going to be OK?
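A minimal sketch of the ignore_index idea from the question: rows whose label is missing for this head get a sentinel value (PyTorch's default is -100), and CrossEntropyLoss averages only over the remaining rows.

```python
import torch
import torch.nn as nn

MISSING = -100                                   # PyTorch's default ignore_index
criterion = nn.CrossEntropyLoss(ignore_index=MISSING)

logits = torch.randn(4, 3)                       # batch of 4 examples, 3 classes
labels = torch.tensor([2, MISSING, 0, MISSING])  # two examples have no label here
loss = criterion(logits, labels)                 # averaged over the 2 labelled rows only
```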
Category: Data Science

Multi-task learning for improving segmentation

I am building a multi-task model, where my main task is segmentation and my auxiliary task is either denoising or image inpainting. The goal is to improve the quality of the segmentation with the multi-task learning approach, where certain representations are shared across the tasks, thus reducing their noise and increasing the segmentation accuracy. Currently, my approach is as follows:
1. Manually add noise to the image (or cut out a hole if the auxiliary task is inpainting).
2. Fit a model …
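A minimal sketch of the shared-representation setup, assuming a toy encoder with one segmentation head and one reconstruction (denoising) head trained from the same noisy input; the architecture is purely illustrative, not a tuned U-Net.

```python
from tensorflow.keras import layers, Model

noisy = layers.Input(shape=(128, 128, 1), name="noisy_image")
e = layers.Conv2D(32, 3, padding="same", activation="relu")(noisy)
e = layers.MaxPooling2D()(e)
e = layers.Conv2D(64, 3, padding="same", activation="relu")(e)   # shared features

def decoder(x, channels, activation, name):
    x = layers.UpSampling2D()(x)
    return layers.Conv2D(channels, 3, padding="same", activation=activation, name=name)(x)

seg = decoder(e, 1, "sigmoid", "segmentation")   # main task: mask
rec = decoder(e, 1, "linear", "denoised")        # auxiliary task: clean image

model = Model(noisy, [seg, rec])
model.compile(optimizer="adam",
              loss={"segmentation": "binary_crossentropy", "denoised": "mse"},
              loss_weights={"segmentation": 1.0, "denoised": 0.3})
```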
Category: Data Science

How do we bring Pareto optimality into the realm of Machine Learning?

I have a multi-objective optimisation problem with a large number of objectives (more than 10), which is generally the case in most real-life problems. The use of traditional GAs such as NSGA-II or SPEA-II fails in this scenario because of 'the curse of dimensionality', as discussed in Deb et al. (2005). The same paper suggests the use of PCA for dimensionality reduction within evolutionary computing algorithms. I want to know if there is another way to obtain 'Pareto optimality'? Can …
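For concreteness, a minimal sketch of what Pareto optimality means computationally, assuming a finite set of candidate solutions and that every objective is minimized; this only extracts the non-dominated set, it is not a substitute for a many-objective algorithm.

```python
import numpy as np

def pareto_front(costs):
    """costs: (n_points, n_objectives) array; returns boolean mask of non-dominated rows."""
    n = costs.shape[0]
    keep = np.ones(n, dtype=bool)
    for i in range(n):
        if not keep[i]:
            continue
        # point i is dominated if some other point is <= on all objectives and < on one
        dominated_by = np.all(costs <= costs[i], axis=1) & np.any(costs < costs[i], axis=1)
        if dominated_by.any():
            keep[i] = False
    return keep

costs = np.random.rand(200, 12)        # 12 objectives, as in the >10-objective setting
front = costs[pareto_front(costs)]     # the Pareto-optimal subset of the candidates
```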
Category: Data Science

Architecture for multivariate multi-time-series model where some features are TS specific and some features are global

I'm looking to build a time series model (using a TCN or an LSTM) with $N$ different series, each of which has $P$ series-specific features $\mathbf{X}$. My input array is of dimension $N \times t \times P$, where $t$ is the number of time steps. I've also got features $G$, which are constant across all time series. For concreteness, imagine I'm predicting city-level ice cream sales with weather data, and I also want to use GDP growth as a predictor. …
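A minimal sketch of one common choice, assuming the shared features are broadcast to every series and concatenated with the series-specific features along the feature axis before the recurrent layer; shapes are illustrative, and a TCN could replace the LSTM.

```python
from tensorflow.keras import layers, Model

T, P, G_DIM = 52, 6, 3      # time steps, per-series features, shared features

series_x = layers.Input(shape=(T, P), name="series_features")      # one series per sample
global_x = layers.Input(shape=(T, G_DIM), name="shared_features")  # e.g. GDP growth

x = layers.Concatenate(axis=-1)([series_x, global_x])   # (T, P + G_DIM) per sample
h = layers.LSTM(64)(x)
out = layers.Dense(1, name="sales")(h)

model = Model([series_x, global_x], out)
model.compile(optimizer="adam", loss="mse")
# Training data: stack the N series along the batch axis and feed the same
# shared-feature window to every series in that batch.
```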
Category: Data Science

Training a Keras multi-task model with subsets of labels in each training example

Keras allows you to freeze layers, which helps with fine-tuning models. However, if you have a multi-task problem where each training example only contains a subset of the labels (e.g. only one label), you need to unfreeze only a single path through the model, to avoid updating weights for heads that have no label for that example. How can this be achieved using Keras?
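This is not the freezing approach asked about, but a commonly suggested alternative worth noting: keep every head trainable and give a per-example, per-output sample weight of 0 wherever a label is missing, so those heads receive no gradient from that example. A minimal sketch with hypothetical head names:

```python
import numpy as np
from tensorflow.keras import layers, Model

inp = layers.Input(shape=(32,))
h = layers.Dense(64, activation="relu")(inp)          # shared trunk
head_a = layers.Dense(3, activation="softmax", name="task_a")(h)
head_b = layers.Dense(5, activation="softmax", name="task_b")(h)
model = Model(inp, [head_a, head_b])
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy")

x = np.random.rand(4, 32).astype("float32")
y_a = np.array([0, 1, 0, 2])          # dummy value wherever the label is missing
y_b = np.array([4, 0, 0, 3])
w_a = np.array([1.0, 1.0, 0.0, 1.0])  # 0 -> this example does not update task_a's head
w_b = np.array([0.0, 1.0, 1.0, 0.0])  # 0 -> this example does not update task_b's head

model.fit(x, {"task_a": y_a, "task_b": y_b},
          sample_weight={"task_a": w_a, "task_b": w_b}, epochs=1)
```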
Category: Data Science

How to implement predict_proba() with spark.ml logistic regression

In sklearn it is possible to use the predict_proba() function to obtain the probability matrix. Since that function is not implemented in spark.ml, how can I implement it? The model.transform(dataset) function does generate a field called probability. Apparently, it is not the same probability matrix, since it differs from the one produced by sklearn. Thanks for your help.
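A minimal sketch of reading the per-class probabilities that spark.ml already writes into the probability column; vector_to_array has been available since Spark 3.0, and the path and data format here are hypothetical. (A mismatch with sklearn's numbers is usually down to different default regularization between the two libraries rather than a missing predict_proba.)

```python
from pyspark.sql import SparkSession
from pyspark.ml.classification import LogisticRegression
from pyspark.ml.functions import vector_to_array

spark = SparkSession.builder.getOrCreate()
train = spark.read.format("libsvm").load("data.txt")   # hypothetical file with label/features

model = LogisticRegression(maxIter=10).fit(train)
pred = model.transform(train)

# "probability" is a vector column with one entry per class, ordered by label index;
# flattening it gives the same kind of matrix as sklearn's predict_proba.
pred.withColumn("proba", vector_to_array("probability")) \
    .select("proba") \
    .show(truncate=False)
```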
Category: Data Science

How to incorporate multi-task in CTR/recommendation model (deep & wide/ xDeepFM etc)?

I am building a ranking algorithm for an e-commerce website that ranks products based on the likelihood of purchase, and I have formulated this as a binary classification problem: given each visitor's information, predict the purchase propensity score of each product, then sort. But because a purchase is a lower-funnel event, I would also like to optimize for other metrics, for example click-through rate, add-to-cart, purchase, repurchase, etc. Currently, the model is …
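A minimal sketch of a shared-bottom multi-task head that could sit on top of whatever interaction layers the base model (wide & deep, xDeepFM, ...) produces, with one sigmoid head per funnel event; the input here stands in for the base model's output, and all names and weights are hypothetical.

```python
from tensorflow.keras import layers, Model

features = layers.Input(shape=(128,), name="interaction_output")  # from the base model
shared = layers.Dense(64, activation="relu")(features)

click = layers.Dense(1, activation="sigmoid", name="click")(shared)
add_to_cart = layers.Dense(1, activation="sigmoid", name="add_to_cart")(shared)
purchase = layers.Dense(1, activation="sigmoid", name="purchase")(shared)

model = Model(features, [click, add_to_cart, purchase])
model.compile(optimizer="adam",
              loss="binary_crossentropy",
              loss_weights={"click": 0.2, "add_to_cart": 0.3, "purchase": 1.0})
```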
Category: Data Science

Emotion Recognition with Multi-task Learning

Introduction: I am a beginner in Data Science and currently working on a learning project aimed at emotion recognition from a bio-medical sensor dataset. The dataset consists of data from 8 sensors for 20 subjects; I have attached a screenshot of a very small part of the dataset to give you better insight into it. So, as you can see, the dataset contains multiple subjects, and for each subject there are 11 columns of data. Columns 1-8 are raw …
Category: Data Science

How to feed data to multi-output Keras model from a single TFRecords file

I know how to feed data to a multi-output Keras model using numpy arrays for the training data. However, I have all my data in a single TFRecords file comprising several feature columns: an image, which is used as input to the Keras model, plus a sequence of outputs corresponding to different classification tasks: e.g. one output encodes the age of the person in the image, another output encodes the gender, and so on. From what I have seen in …
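A minimal sketch of a tf.data pipeline that parses one TFRecords file and yields (image, {"age": ..., "gender": ...}) pairs, which Keras multi-output models accept directly from fit(); the feature names, shapes and file name are hypothetical and must match how the records were written and how the output layers are named.

```python
import tensorflow as tf

FEATURES = {
    "image": tf.io.FixedLenFeature([], tf.string),
    "age": tf.io.FixedLenFeature([], tf.int64),
    "gender": tf.io.FixedLenFeature([], tf.int64),
}

def parse(example_proto):
    ex = tf.io.parse_single_example(example_proto, FEATURES)
    image = tf.io.decode_jpeg(ex["image"], channels=3)
    image = tf.image.resize(image, (128, 128)) / 255.0
    # the dict keys must match the names of the model's output layers
    return image, {"age": ex["age"], "gender": ex["gender"]}

dataset = (tf.data.TFRecordDataset("train.tfrecords")   # hypothetical file name
           .map(parse, num_parallel_calls=tf.data.AUTOTUNE)
           .batch(32)
           .prefetch(tf.data.AUTOTUNE))
# model.fit(dataset, epochs=...) then trains all heads from the single file.
```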
Category: Data Science

What kind of learning problem is this?

Say I have $n$ multi-class classification problems $p_1$, ..., $p_n$. Each of these has its own training data. While they are all distinct problems, there may be similarities in their data (which are in my case images), e.g. the data for class $p_1^B$ of problem $p_1$ may be similar in some way to the data for class $p_5^F$ of problem $p_5$. Classically, each classifier tries to separate each of its training classes from each of its other training classes, but …
Category: Data Science

Should I rescale losses before combining them for multitask learning?

I have a multitask network taking one input and trying to achieve two tasks (with several shared layers, and then separate layers). One task is multiclass classification using the CrossEntropy loss, the other is sequence recognition using the CTC loss. I want to use a combination of the two losses as the criterion, something like Loss = λCE + (1-λ)CTC. The thing is that my CE loss starts around 2 while the CTC loss is in the 400s. Should I rescale …
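A minimal sketch of the simplest rescaling option: divide each loss by its typical starting magnitude before mixing, so that λ weighs quantities of comparable size. The scale constants below just mirror the reported starting values of roughly 2 and 400; an alternative is to normalize each term by a detached running mean of that loss.

```python
def combined_loss(ce_loss, ctc_loss, lam=0.5, ce_scale=2.0, ctc_scale=400.0):
    # rescale each term to roughly unit size, then mix with lambda
    return lam * (ce_loss / ce_scale) + (1 - lam) * (ctc_loss / ctc_scale)

# usage with the two loss values already computed for a batch:
loss = combined_loss(ce_loss=2.1, ctc_loss=415.0, lam=0.7)
```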
Category: Data Science
