How to manage sampling bias between training data and real-world data?

I'm currently working on a binary classification problem. My training dataset is rather small, with only 1000 elements. (I don't know if it is relevant: my problem is similar to the "spam filtering" problem, where an item can also be "likely" to be categorized as spam, but I simplified it into a black-or-white issue and use the probability given by the models to assign a likelihood score.) Among those 1000 elements, 70% are from class 1 …
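If the mismatch is mainly in the class priors (label shift), a common correction is to rescale the model's predicted probabilities by the ratio of deployment to training priors. A minimal sketch, where `train_prior` and `real_prior` are hypothetical estimates you would substitute with your own:

```python
import numpy as np

# Hypothetical priors: 70% class 1 in training, but an assumed
# 20% class 1 in the real-world data (replace with your estimates).
train_prior = np.array([0.30, 0.70])  # P_train(class 0), P_train(class 1)
real_prior = np.array([0.80, 0.20])   # P_real(class 0),  P_real(class 1)

def adjust_for_prior_shift(proba, train_prior, real_prior):
    """Rescale predicted probabilities when only the class priors
    differ between training and deployment (label shift)."""
    adjusted = proba * (real_prior / train_prior)
    return adjusted / adjusted.sum(axis=1, keepdims=True)

# Usage with any fitted classifier exposing predict_proba:
# proba = clf.predict_proba(X_test)
# likelihood_scores = adjust_for_prior_shift(proba, train_prior, real_prior)[:, 1]
```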
Category: Data Science

Train consistent embeddings using text from different domains

I would like to train text embeddings using texts from two different domains (podcast summaries and movie summaries). The embeddings should capture similarities in the topics the texts talk about, but ignore as much as possible the style the texts were written in. The embeddings I currently train using the universal multilingual sentence encoder clearly separate the two domains, which puts considerable distance between two documents that have strong topic similarity but were written in different styles. I tried …
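One minimal baseline for reducing this split is to subtract each domain's mean embedding, which removes the constant "style offset" that separates the two clusters. A sketch under that assumption (`podcast_emb` and `movie_emb` are placeholders for the encoder output):

```python
import numpy as np

def center_by_domain(emb, domain_ids):
    """Subtract each domain's mean vector so that shared 'topic'
    directions dominate cross-domain cosine similarity."""
    emb = emb.copy()
    for d in np.unique(domain_ids):
        mask = domain_ids == d
        emb[mask] -= emb[mask].mean(axis=0)
    # Re-normalize for cosine similarity.
    return emb / np.linalg.norm(emb, axis=1, keepdims=True)

# emb = np.vstack([podcast_emb, movie_emb])
# domains = np.array([0] * len(podcast_emb) + [1] * len(movie_emb))
# emb_centered = center_by_domain(emb, domains)
```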
Category: Data Science

Train on multi-domains, then fine-tune on specific domain

Would it make sense to first train a model on images from multiple domains, and then do "fine-tuning" on one specific domain to improve its performance on it? For instance, one could train an object detector on car-camera footage recorded in NYC, Paris, and Beijing, then continue training on Paris only. For a model that will be deployed in Paris only, should we favor diversity or specificity? And does this training method have a name?
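This is usually just called pre-training followed by fine-tuning (or domain-specific adaptation). A minimal PyTorch-style sketch of the two phases, assuming hypothetical loaders `multi_city_loader` and `paris_loader` and a `model` already defined:

```python
import torch

def train(model, loader, epochs, lr):
    """One generic supervised training loop used for both phases."""
    opt = torch.optim.SGD(model.parameters(), lr=lr, momentum=0.9)
    loss_fn = torch.nn.CrossEntropyLoss()
    model.train()
    for _ in range(epochs):
        for x, y in loader:
            opt.zero_grad()
            loss_fn(model(x), y).backward()
            opt.step()

# Phase 1: pre-train on the diverse multi-city data.
# train(model, multi_city_loader, epochs=20, lr=1e-2)
# Phase 2: fine-tune on Paris only, with a smaller learning rate
# so the model specializes without forgetting everything it learned.
# train(model, paris_loader, epochs=5, lr=1e-3)
```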
Category: Data Science

Computing symmetric difference hypothesis divergence $H \Delta H$ for two domains using a segmentation network

Given two domains $D_1$ and $D_2$, the symmetric difference hypothesis divergence ($H \Delta H$) is used as a measure of how much the two domains differ from each other. Let the hypotheses (segmentation networks, in my case) trained on the two domains be $h_1$ and $h_2$, respectively. Then (according to this work by Loog et al.): $d_{H\Delta H} = 2 \sup_{h_1,h_2 \in H} \left|\mathrm{Pr}_s[h_1\neq h_2] - \mathrm{Pr}_t[h_1\neq h_2]\right|$ where $\mathrm{Pr}_s[h_1\neq h_2] = \int_{X}[h_1\neq h_2]\,p_s(x)\,dx$. Since we do not have access to the …
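Since the supremum over all hypothesis pairs is intractable, a common empirical stand-in is the proxy A-distance: train a classifier to distinguish the two domains and convert its test error into a divergence estimate, $\hat{d}_A = 2(1 - 2\epsilon)$. A sketch, assuming feature arrays `feats_s` and `feats_t` extracted from the segmentation network:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

def proxy_a_distance(feats_s, feats_t):
    """Estimate domain divergence as 2 * (1 - 2 * err), where err is
    the test error of a classifier separating source from target."""
    X = np.vstack([feats_s, feats_t])
    y = np.concatenate([np.zeros(len(feats_s)), np.ones(len(feats_t))])
    X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.5, stratify=y)
    clf = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)
    err = 1.0 - clf.score(X_te, y_te)
    return 2.0 * (1.0 - 2.0 * err)

# feats_s, feats_t: (n, d) feature vectors from domains D_1 and D_2.
# print(proxy_a_distance(feats_s, feats_t))
```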
Category: Data Science

How can I use transfer learning to predict height given age in Japan, using a model developed with USA data?

Suppose I have a (training) set of $n$ observations $\{(Y_i^{(U)},X_i^{(U)})\}_{i=1}^n$ of age $X_i^{(U)}$ and height $Y_i^{(U)}$ from people in the USA. Now suppose I also have a (test) set of $m$ observations $\{X_i^{(J)}\}_{i=1}^m$ of age $X_i^{(J)}$ only, from people in Japan, where people are shorter on average. I want to predict the heights of people in Japan in the test set using transfer learning from the USA dataset. Suppose for simplicity the USA data is well-fit by the standard simple …
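One simple illustration of the idea: fit the regression on the USA data, then correct for the location shift between populations. The mean offset below is a made-up placeholder, not a real statistic, and assumes you have some external estimate of the average height difference:

```python
import numpy as np
from sklearn.linear_model import LinearRegression

def predict_with_shift(X_us, y_us, X_japan, mean_offset_cm=-8.0):
    """Fit on the source (USA) domain, then apply an assumed intercept
    shift for the target (Japan) population. The offset here is a
    hypothetical value for illustration only."""
    model = LinearRegression().fit(np.asarray(X_us).reshape(-1, 1), y_us)
    return model.predict(np.asarray(X_japan).reshape(-1, 1)) + mean_offset_cm
```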
Category: Data Science

Latent space for cross domain numerical features

I would like to find the shared latent space between two sets of features. I have source- and target-domain features already extracted from images: four sets of feature vectors, for normal and abnormal samples in the source and target domains. I would like to train on the normal source and target features and predict on the abnormal sets. How do I do that? I have this idea that if I create a shared space between the two domains and give it to a classifier, …
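One straightforward way to build such a shared space is to train a single autoencoder on the pooled normal features from both domains and use the bottleneck as the latent representation. A PyTorch sketch under that assumption (dimensions are placeholders):

```python
import torch
import torch.nn as nn

feat_dim, latent_dim = 512, 64  # placeholder dimensions

# A single encoder/decoder shared by both domains forces source and
# target features into one latent space.
encoder = nn.Sequential(nn.Linear(feat_dim, 128), nn.ReLU(),
                        nn.Linear(128, latent_dim))
decoder = nn.Sequential(nn.Linear(latent_dim, 128), nn.ReLU(),
                        nn.Linear(128, feat_dim))

opt = torch.optim.Adam(
    list(encoder.parameters()) + list(decoder.parameters()), lr=1e-3)

def train_step(x):
    """x: a batch mixing normal source and normal target features."""
    opt.zero_grad()
    loss = nn.functional.mse_loss(decoder(encoder(x)), x)
    loss.backward()
    opt.step()
    return loss.item()

# At test time, encode the abnormal sets with `encoder` and feed the
# latent codes to the downstream classifier.
```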
Category: Data Science

Close set and open set classification at the same time

Is it possible to use a neural network (or another approach) to classify images based on the trained classes and, at the same time, when new image classes appear in the test set, assign those unseen images (open-set data) to new classes on which no training was done (i.e., tell me which new class this unseen data belongs to)?
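A common baseline combines both behaviours: accept the closed-set prediction when the classifier is confident, reject otherwise, and cluster the rejected samples to discover candidate new classes. A sketch assuming a model exposing `predict_proba`; the threshold and cluster count are illustrative assumptions, not tuned values:

```python
import numpy as np
from sklearn.cluster import KMeans

def open_set_predict(proba, features, threshold=0.9, n_new_classes=3):
    """Closed-set prediction when the max softmax probability is high;
    otherwise mark as unknown and cluster unknowns into candidate
    new classes (an assumed, untuned threshold and cluster count)."""
    conf = proba.max(axis=1)
    labels = proba.argmax(axis=1).astype(object)
    unknown = conf < threshold
    if unknown.sum() >= n_new_classes:
        clusters = KMeans(n_clusters=n_new_classes, n_init=10).fit_predict(
            features[unknown])
        labels[unknown] = [f"new_class_{c}" for c in clusters]
    return labels
```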
Category: Data Science

Why does increasing the training set size not improve the results?

I have trained a model on a training set which is not that big (overall around 120 positive examples and, of course, lots of negative examples). What I am trying to do is improve the results by increasing the data size. I tried two approaches. I added data from a different domain and concatenated it with the existing data; this increased the F-score from 0.13 to 0.14. I added the same extra data instances, but this time with …
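When extra data comes from a different domain, one common option is to keep it but down-weight it so that in-domain examples dominate the loss. A sketch using scikit-learn's `sample_weight` (the 0.2 weight is an arbitrary assumption to be tuned on a validation set):

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

def fit_with_downweighted_extra(X_in, y_in, X_out, y_out, out_weight=0.2):
    """Train on in-domain plus out-of-domain data, giving the
    out-of-domain instances a smaller sample weight (0.2 is an
    illustrative value; tune it on held-out in-domain data)."""
    X = np.vstack([X_in, X_out])
    y = np.concatenate([y_in, y_out])
    w = np.concatenate([np.ones(len(y_in)), np.full(len(y_out), out_weight)])
    clf = LogisticRegression(class_weight="balanced", max_iter=1000)
    return clf.fit(X, y, sample_weight=w)
```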
Category: Data Science

Training data from different sources

I am working on a binary classification problem. My data contains 100K samples from two different sources. When I perform training and testing on data from the first source I can achieve classification accuracy of up to 98%, and when I perform training and testing on data from the second source I can achieve up to 99%. The problem is that when I mix both of them, the classification accuracy drops to 89%. Any idea how to perform the training to …
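One quick baseline is to add the source as an explicit feature so the model can learn source-specific decision rules instead of averaging them away (the alternative is one model per source, routed at prediction time). A sketch of the indicator-feature variant, assuming 2-D feature arrays `X1`, `X2` and labels `y1`, `y2` for the two sources:

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

def fit_with_source_indicator(X1, y1, X2, y2):
    """Append a binary source-id column before training so the model
    can condition its splits on the data source."""
    X = np.vstack([
        np.hstack([X1, np.zeros((len(X1), 1))]),  # source 1 -> 0
        np.hstack([X2, np.ones((len(X2), 1))]),   # source 2 -> 1
    ])
    y = np.concatenate([y1, y2])
    return RandomForestClassifier(n_estimators=200).fit(X, y)

# At prediction time, append the matching source id to each sample.
```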
Category: Data Science

Discrepancy between training set and real-world data set: domain adaptation?

I have read in the literature that in some cases the training set is not representative of a real-world dataset. However, I cannot seem to find a proper term describing this phenomenon; what is the proper term for this problem? Edit: So far I have settled on the term domain adaptation, briefly described as a field of machine learning that aims to learn from a certain data distribution in order to predict data coming from a different (but related) target …
Category: Data Science

What is the difference between BatchNorm and Adaptive BatchNorm (AdaBN)?

I understand that BatchNorm (Batch Normalization) normalizes the data input to the layer to (mean, std) = (0, 1) and potentially scales it (with $\gamma$) and offsets it (with $\beta$). BatchNorm follows this formula (retrieved from arXiv 1502.03167): $\hat{x}_i = \frac{x_i - \mu_B}{\sqrt{\sigma_B^2 + \epsilon}}, \quad y_i = \gamma \hat{x}_i + \beta$. However, when it comes to 'adaptive BatchNorm', I don't understand what the difference is. What is adaptive BatchNorm doing differently? It is described as replacing the BN statistics with ones computed on the target domain (retrieved from arXiv 1603.04779).
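In short, AdaBN keeps the learned $\gamma$ and $\beta$ but re-estimates the BatchNorm mean/variance statistics on target-domain data, with no gradient updates. A PyTorch sketch of that recalibration step, assuming a trained `model` and a hypothetical `target_loader`:

```python
import torch
import torch.nn as nn

@torch.no_grad()
def adapt_batchnorm(model, target_loader):
    """AdaBN-style adaptation: re-estimate BatchNorm running mean/var
    on target-domain data while leaving all learned weights
    (including gamma and beta) untouched."""
    for m in model.modules():
        if isinstance(m, (nn.BatchNorm1d, nn.BatchNorm2d, nn.BatchNorm3d)):
            m.reset_running_stats()  # forget source-domain statistics
            m.momentum = None        # use a cumulative moving average
    model.train()  # train mode so BN layers update running stats
    for x, _ in target_loader:
        model(x)   # forward passes only; no gradient updates
    model.eval()
    return model
```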
Category: Data Science

Dealing with an apparently inseparable dataset

I'm attempting to build a model (or suite of models) to predict a binary target. The exact details of the models aren't important; suffice it to say that I've tried half a dozen different types of models, with comparable results from all of them. Looking at the predictions on various subsets of the training data, it appears that a certain subset of features is important for around 30% of the data, while a different subset is important for the remaining …
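When different feature subsets matter for different regions of the data, one baseline worth trying is to cluster first and fit one model per cluster, a crude mixture-of-experts. A sketch under that assumption, for a binary target:

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.linear_model import LogisticRegression

class ClusteredClassifier:
    """Crude mixture-of-experts: partition samples with k-means,
    then fit an independent binary classifier per partition."""
    def __init__(self, n_clusters=2):
        self.km = KMeans(n_clusters=n_clusters, n_init=10)
        self.models = {}

    def fit(self, X, y):
        clusters = self.km.fit_predict(X)
        for c in np.unique(clusters):
            m = clusters == c
            self.models[c] = LogisticRegression(max_iter=1000).fit(X[m], y[m])
        return self

    def predict_proba(self, X):
        clusters = self.km.predict(X)
        out = np.zeros((len(X), 2))
        for c, model in self.models.items():
            m = clusters == c
            if m.any():
                out[m] = model.predict_proba(X[m])
        return out
```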
Category: Data Science
