Which is better: a multi-output model or separate models for similar tasks?
I am working on two problems:
- classification of images into high-level classes (e.g. shoe, dress, jacket)
- classification of lower-level attributes of the same images (e.g. shoe style, color of the dress), assuming that the high-level class is known
So far, I have designed an architecture for the second problem: a multi-class, multi-output network with ResNet50 as the backbone. Now I am working on the first problem, and I see two options:
- treat the two problems as different tasks and train a separate model for each
- share the backbone between the first and second problems and train a single multi-output model
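
For illustration, a shared-backbone model is usually trained on a sum of per-head losses, with attribute heads skipped when the attribute does not apply to the image's class. Below is a minimal sketch in plain Python, assuming each head outputs a list of logits for a single image. The names `joint_loss`, `attribute_weight`, and the head names are hypothetical and not part of the setup described above.

```python
import math


def cross_entropy(logits, target):
    """Softmax cross-entropy for one example; `logits` is a list of floats."""
    m = max(logits)  # subtract the max for numerical stability
    log_sum_exp = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum_exp - logits[target]


def joint_loss(category_logits, category_target,
               attribute_logits, attribute_targets,
               attribute_weight=1.0):
    """Category loss plus the mean loss over the attribute heads that apply.

    `attribute_targets` maps a head name to a class index, or to None when
    the attribute does not apply to this image (e.g. shoe style for a dress).
    `attribute_weight` is an assumed hyperparameter balancing the two tasks.
    """
    loss = cross_entropy(category_logits, category_target)
    active = [name for name, t in attribute_targets.items() if t is not None]
    if active:
        attr_loss = sum(
            cross_entropy(attribute_logits[name], attribute_targets[name])
            for name in active
        ) / len(active)
        loss += attribute_weight * attr_loss
    return loss


# Example: an image of a shoe, so only the shoe-style head contributes.
example_loss = joint_loss(
    category_logits=[2.0, 0.5, -1.0],          # shoe, dress, jacket
    category_target=0,
    attribute_logits={"shoe_style": [0.1, 1.2],
                      "dress_color": [0.0, 0.0, 0.0]},
    attribute_targets={"shoe_style": 1, "dress_color": None},
)
```

With separate models (the first option), each of these terms would instead be minimized by its own network, with no shared parameters between them.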
Which option is likely to work better? Are there any good practices for combining similar tasks?