Fashion Compatibility Performance Evaluation: High in AUC but Low in FITB
I am new to deep learning and still trying to understand how it works. Right now I am working on fashion compatibility prediction. The two best-known evaluation tasks for it are fill-in-the-blank (FITB), reported as accuracy, and compatibility prediction, reported as AUC.
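For reference, here is a minimal sketch of how I understand the two metrics are computed. The function names `auc_score` and `fitb_accuracy` and the toy numbers are made up for illustration, not taken from any particular codebase. It assumes that a higher score means "more compatible" and that every FITB question has the same number of candidates.

```python
import numpy as np

def auc_score(labels, scores):
    """AUC as the probability that a positive outfit outscores a negative one.

    Ties between a positive and a negative score count as 0.5.
    """
    labels = np.asarray(labels)
    scores = np.asarray(scores, dtype=float)
    pos = scores[labels == 1]
    neg = scores[labels == 0]
    # Compare every positive score against every negative score.
    diff = pos[:, None] - neg[None, :]
    return float(((diff > 0).sum() + 0.5 * (diff == 0).sum()) / diff.size)

def fitb_accuracy(candidate_scores, answer_idx):
    """Fraction of questions where the highest-scoring candidate is the answer.

    candidate_scores has shape (n_questions, n_candidates); entry [q, c] is
    the compatibility score of outfit q completed with candidate c.
    """
    candidate_scores = np.asarray(candidate_scores, dtype=float)
    picked = candidate_scores.argmax(axis=1)
    return float((picked == np.asarray(answer_idx)).mean())

# Toy data (hypothetical): two compatible and two incompatible outfits.
print(auc_score([1, 1, 0, 0], [0.9, 0.8, 0.2, 0.1]))  # 1.0

# Toy data (hypothetical): two FITB questions, four candidates each,
# with the correct answer at index 0 in both.
toy_scores = [[0.90, 0.95, 0.10, 0.20],
              [0.80, 0.10, 0.20, 0.30]]
print(fitb_accuracy(toy_scores, [0, 0]))  # 0.5
```

With four candidates per question (an assumption; some setups use a different number), random guessing gives an FITB accuracy of 0.25. Note that the two metrics compare different things: AUC ranks whole outfits against each other, while FITB ranks candidates within a single question.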
I am modifying some existing models, and the result performs very well on AUC but very poorly on FITB: AUC is about 0.98, while FITB accuracy is only about 0.30. I know it may be hard to point out what is wrong, because I have not given any details about my model.
Still, I would like some insight into what could cause this. I am stuck and have not been able to figure out the reason myself. Any reply would be much appreciated. Thank you! :)
Topic auc prediction deep-learning neural-network predictive-modeling
Category Data Science