How to prepare masks for multiclass semantic segmentation?

It's very straightforward for binary semantic segmentation: black (0s) marks the background, whereas white (1s) marks the objects of interest.

But what about multiclass semantic segmentation? As far as I understand, these masks must be RGB images since we use more than two colors. Is it correct? Or should I have a separate binary mask for every class?

If I can use RGB images with multiple colors as masks, should I use some specific colors for masking? If not, should I specify colors I chose somewhere in a network as class parameters? Or will any CNN automatically detect any number of different colors in my masks?

These questions may seem naive and primitive, but I was unable to find any clear explanations of this aspect of multiclass semantic segmentation.

Topic image-segmentation computer-vision

Category Data Science


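You should create a separate binary mask for each class (1 for the pixels belonging to that class and 0 for all other pixels). Therefore, your mask array should have a shape of (BATCH_SIZE, HEIGHT, WIDTH, NUM_CHANNELS) in the usual channels-last layout, where NUM_CHANNELS is the number of classes. The network never sees the colors themselves: if your masks are stored as color-coded RGB images, you decide which color corresponds to which class and convert the images to this format before training.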
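To make the per-class binary masks concrete, here is a minimal NumPy sketch that converts one color-coded RGB mask into a stack of binary masks, one channel per class. The `PALETTE` mapping, the chosen colors, and the `rgb_to_one_hot` helper are illustrative assumptions, not part of any particular library; use whatever colors your annotation tool produced.

```python
import numpy as np

# Hypothetical color-to-class mapping (an assumption for this example).
PALETTE = {
    (0, 0, 0): 0,      # background
    (255, 0, 0): 1,    # first object class
    (0, 255, 0): 2,    # second object class
}

def rgb_to_one_hot(rgb_mask, palette):
    """Convert an (H, W, 3) color-coded mask to (H, W, NUM_CLASSES) binary masks."""
    height, width, _ = rgb_mask.shape
    one_hot = np.zeros((height, width, len(palette)), dtype=np.uint8)
    for color, class_index in palette.items():
        # A pixel belongs to a class when all three channels match its color.
        matches = np.all(rgb_mask == np.array(color, dtype=rgb_mask.dtype), axis=-1)
        one_hot[..., class_index] = matches
    return one_hot

# Tiny 2x2 mask: background, class 1 / class 2, class 1.
rgb_mask = np.array(
    [[[0, 0, 0], [255, 0, 0]],
     [[0, 255, 0], [255, 0, 0]]],
    dtype=np.uint8,
)

one_hot = rgb_to_one_hot(rgb_mask, PALETTE)

# Add a leading batch axis so the array has one entry per image.
batch = one_hot[np.newaxis, ...]

# The same information as a single-channel map of integer class indices.
label_map = one_hot.argmax(axis=-1)
```

Some loss functions accept the integer `label_map` form directly instead of the stacked binary masks; which one you need depends on the framework and loss you use.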
