Minibatches when training on two datasets of different size

Suppose I have two datasets, $X$ and $Y$, of different sizes. I am training two networks together, one which takes inputs $x\in X$, and the other takes inputs $y\in Y$. The two networks share parameters and therefore are trained together.

Are there some guidelines on how to choose the batch sizes for the samples from $X$ vs. those from $Y$? That is, should the batches from $X$ have the same size as the batches from $Y$?

In general, the two networks can be very different in number of parameters, and the total number of training data points available in $X$ can be very different from the number of points in $Y$.
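To make the setup concrete, here is a minimal sketch of one common heuristic (an assumption on my part, not something established in the question): split a fixed per-step budget between the two datasets in proportion to their sizes, so that one epoch over $X$ takes roughly the same number of joint steps as one epoch over $Y$. The dataset sizes and the `sample_joint_batch` helper are hypothetical, for illustration only.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical dataset sizes (illustrative only).
n_x, n_y = 10_000, 2_500

# Heuristic: keep the per-step batch sizes proportional to the dataset
# sizes, so a pass over X takes about as many steps as a pass over Y.
total_batch = 100
batch_x = round(total_batch * n_x / (n_x + n_y))
batch_y = total_batch - batch_x

steps_per_epoch_x = n_x // batch_x
steps_per_epoch_y = n_y // batch_y

def sample_joint_batch():
    """Draw one minibatch index set from each dataset for a joint update."""
    idx_x = rng.choice(n_x, size=batch_x, replace=False)
    idx_y = rng.choice(n_y, size=batch_y, replace=False)
    return idx_x, idx_y

idx_x, idx_y = sample_joint_batch()
```

With equal batch sizes instead, the smaller dataset would be recycled several times per epoch of the larger one, which effectively up-weights its gradient contribution on the shared parameters; proportional sizing keeps the two loss terms balanced per pass over the data.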

Topic: mini-batch-gradient-descent

Category: Data Science
