What is the difference between BatchNorm and Adaptive BatchNorm (AdaBN)?

I understand that BatchNorm (Batch Normalization) normalizes each layer's input to zero mean and unit variance, and then optionally scales it (with $\gamma$) and shifts it (with $\beta$). BatchNorm follows this formula:

$$\hat{x}_i = \frac{x_i - \mu_{\mathcal{B}}}{\sqrt{\sigma_{\mathcal{B}}^2 + \epsilon}}, \qquad y_i = \gamma \hat{x}_i + \beta$$

where $\mu_{\mathcal{B}}$ and $\sigma_{\mathcal{B}}^2$ are the mean and variance of the current mini-batch (retrieved from arXiv:1502.03167).
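For reference, this is how I picture the per-batch transform in plain NumPy (my own illustrative sketch, not code from the paper; `gamma` and `beta` are the learned parameters):

```python
import numpy as np

def batch_norm(x, gamma, beta, eps=1e-5):
    # x: array of shape (batch, features); statistics are computed per feature over the batch
    mu = x.mean(axis=0)
    var = x.var(axis=0)
    x_hat = (x - mu) / np.sqrt(var + eps)  # normalize to zero mean, unit variance
    return gamma * x_hat + beta            # learned scale and shift
```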

However, when it comes to 'adaptive BatchNorm' (AdaBN), I don't understand what the difference is. What is adaptive BatchNorm doing differently? It is described roughly as follows (Algorithm 1 in arXiv:1603.04779): for each BatchNorm neuron $j$, compute the mean $\mu_j^t$ and variance $(\sigma_j^t)^2$ of that neuron's responses over the target-domain images $t$, and at test time compute the BN output as $y_j(m) = \gamma_j \frac{x_j(m) - \mu_j^t}{\sigma_j^t} + \beta_j$.

Tags: domain-adaptation, normalization, deep-learning, neural-network, machine-learning



I think the original batch normalization paper proposes to use the mean and standard deviation estimated on the training set (at test time, the running statistics accumulated during training). Adaptive batch normalization simply re-estimates these statistics on the target domain (which could be the test set, or some unlabeled data from the target domain), while keeping the learned $\gamma$ and $\beta$, and all other weights, unchanged.

Please correct me if I am wrong.
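To make that concrete, here is a minimal PyTorch sketch of the re-estimation step. It assumes a trained model containing standard `nn.BatchNorm*` layers and an unlabeled `target_loader` that yields input tensors (both names are mine, for illustration); only the running statistics change, the learned $\gamma$ and $\beta$ stay as they are:

```python
import torch
import torch.nn as nn

@torch.no_grad()
def adapt_bn_stats(model, target_loader, device="cpu"):
    """Re-estimate BatchNorm running statistics on unlabeled target-domain data.

    The learned affine parameters (gamma, beta) and all other weights stay fixed;
    only running_mean / running_var are recomputed, which is the core idea of AdaBN.
    """
    model.to(device).train()  # train mode so BN layers update their running stats
    for m in model.modules():
        if isinstance(m, nn.modules.batchnorm._BatchNorm):
            m.reset_running_stats()
            m.momentum = None  # None = cumulative average over all target batches
    for x in target_loader:    # assumes the loader yields inputs only (no labels needed)
        model(x.to(device))
    model.eval()
    return model
```

After this pass, inference runs exactly as before, except that each BN layer normalizes with target-domain statistics instead of the training-set ones.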
