Fashion Compatibility Performance Evaluation: High in AUC but Low in FITB

I am new to deep learning and still learning how things work. I am currently working on fashion compatibility prediction. The two most common evaluations for this task are fill-in-the-blank (FITB), reported as accuracy, and compatibility prediction, reported as AUC.
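For reference, here is a minimal sketch of how these two metrics are typically computed from a model's compatibility scores. The function names, the toy scores, and the choice of four candidates per FITB question are all assumptions for illustration, not taken from any specific model or benchmark code.

```python
import numpy as np

def auc_from_scores(pos_scores, neg_scores):
    """AUC: probability that a random compatible outfit scores higher
    than a random incompatible one (ties count as 0.5)."""
    pos = np.asarray(pos_scores, dtype=float)[:, None]
    neg = np.asarray(neg_scores, dtype=float)[None, :]
    return float(np.mean((pos > neg) + 0.5 * (pos == neg)))

def fitb_accuracy(candidate_scores, answer_idx):
    """FITB accuracy: fraction of questions where the correct candidate
    receives the highest score among all candidates."""
    scores = np.asarray(candidate_scores, dtype=float)
    picked = scores.argmax(axis=1)
    return float(np.mean(picked == np.asarray(answer_idx)))

# Hypothetical toy scores (assumed, not real results).
pos_outfits = [0.90, 0.80, 0.95]   # scores of compatible outfits
neg_outfits = [0.10, 0.20, 0.05]   # scores of incompatible outfits

# One row per FITB question: score of the outfit completed with each candidate.
fitb_scores = [
    [0.90, 0.92, 0.91, 0.93],
    [0.85, 0.80, 0.70, 0.60],
    [0.70, 0.75, 0.72, 0.71],
    [0.88, 0.95, 0.80, 0.90],
]
fitb_answers = [0, 0, 0, 0]        # index of the correct candidate per question

auc = auc_from_scores(pos_outfits, neg_outfits)
fitb = fitb_accuracy(fitb_scores, fitb_answers)
print("AUC:", auc)
print("FITB accuracy:", fitb)
```

Note that the two metrics rank against different negatives: AUC compares whole outfits against the negative outfits in the test set, while FITB compares candidates within a single question. With four candidates per question (an assumption here), random guessing would give a FITB accuracy of 0.25.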

I am modifying some existing models, and the result performs very well on AUC but very poorly on FITB: AUC is about 0.98, while FITB accuracy is only about 0.30. I know it is hard to point out what is wrong without any details about my model.

Here I am just hoping for some insight into possible reasons why this could happen. I am stuck and have not been able to work out the cause myself. Any reply would be much appreciated. Thank you! :)

