What is num_groups in GroupNorm, and how do I choose it?
I found that batch normalization can cause problems with small batch sizes and that GroupNorm is a good alternative. GroupNorm takes two parameters, num_groups and num_channels. How can I choose a good value for num_groups? What does it depend on? And with GroupNorm, is a large batch size or a small batch size better?
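To make the question concrete: the one hard constraint in PyTorch is that num_channels must be divisible by num_groups. The sketch below is plain Python and only enumerates the values that satisfy that constraint. The helper names `valid_num_groups` and `pick_num_groups` are hypothetical, and the "closest to 32" rule is an assumed heuristic based on the default used in the Group Normalization paper, not a PyTorch recommendation.

```python
def valid_num_groups(num_channels):
    """Return every group count that divides num_channels evenly."""
    return [g for g in range(1, num_channels + 1) if num_channels % g == 0]


def pick_num_groups(num_channels, preferred=32):
    """Pick the valid group count closest to `preferred`.

    Assumption: `preferred=32` is only a starting point; the best value
    is task dependent and should be validated empirically.
    """
    candidates = valid_num_groups(num_channels)
    # Ties are broken in favour of the smaller group count.
    return min(candidates, key=lambda g: (abs(g - preferred), g))


if __name__ == "__main__":
    # Channel widths as found in VGG16 convolutional blocks.
    for channels in (64, 128, 256, 512):
        print(channels, pick_num_groups(channels))
```

The chosen value would then be passed as `torch.nn.GroupNorm(num_groups, num_channels)`. The two extremes are worth knowing: num_groups=1 normalizes over all channels of a sample together (like LayerNorm), and num_groups=num_channels normalizes each channel separately (like InstanceNorm). In both cases, and everywhere in between, the statistics are computed per sample, so they do not depend on the batch size.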
Topic vgg16 pytorch batch-normalization gpu normalization
Category Data Science