Triplet loss - what threshold to use to detect similarity between two embeddings?

I have trained a triplet loss model using FaceNet's architecture on the 11k Hands dataset. Now I want to see how well the model performs, so I feed it 2 images of the same class and get back their embeddings. I want to compare the distance between these embeddings, and if that distance is not larger than some threshold I can say that the model correctly classifies these 2 images as belonging to the same class.
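The pair-comparison step described above can be sketched as follows. This is a minimal illustration, assuming L2-normalised 128-dimensional embeddings (as FaceNet produces), Euclidean distance, and a made-up threshold value; the embeddings here are random toy vectors, not model outputs:

```python
import numpy as np

def is_same_class(emb_a, emb_b, threshold):
    """Return True if the Euclidean distance between two
    L2-normalised embeddings is at most the threshold."""
    dist = np.linalg.norm(emb_a - emb_b)
    return dist <= threshold

# Toy example with random 128-d vectors standing in for real embeddings.
rng = np.random.default_rng(0)
emb_a = rng.normal(size=128)
emb_a /= np.linalg.norm(emb_a)          # L2-normalise, as FaceNet does
emb_b = emb_a + 0.05 * rng.normal(size=128)  # a near-duplicate embedding
emb_b /= np.linalg.norm(emb_b)

print(is_same_class(emb_a, emb_b, threshold=1.1))
```

For unit-norm embeddings the distance is bounded by 2.0, which is why the threshold search below only needs to cover a small fixed range.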

How do I select the threshold value?

Is the threshold value the same as the margin hyperparameter used in the triplet loss function (alpha = 0.2)?

After examining FaceNet's code and OpenFace's code, I saw that they test 400 thresholds ranging from 0.0 to 4.0 ([0.00, 0.01, ..., 3.99, 4.00]) to find the one that gives the most accurate results, and then report the accuracy at that threshold as the overall accuracy.

You can see it implemented in FaceNet here and in OpenFace here.
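The sweep those codebases perform can be sketched like this. It is a simplified sketch with synthetic pair distances (FaceNet additionally runs this inside 10-fold cross-validation, picking the threshold on the training folds and measuring accuracy on the held-out fold):

```python
import numpy as np

def best_threshold(distances, labels):
    """Sweep thresholds 0.00 .. 4.00 in steps of 0.01 and return the
    threshold with the highest pair-classification accuracy.
    `labels` is 1 for same-identity pairs, 0 for different-identity pairs."""
    thresholds = np.arange(0.0, 4.01, 0.01)
    accuracies = [np.mean((distances < t) == labels) for t in thresholds]
    best = int(np.argmax(accuracies))
    return thresholds[best], accuracies[best]

# Synthetic data: same-class pair distances cluster around 0.5,
# different-class pair distances around 1.5 (made-up numbers).
rng = np.random.default_rng(1)
d_same = rng.normal(0.5, 0.2, 100).clip(min=0)
d_diff = rng.normal(1.5, 0.2, 100).clip(min=0)
distances = np.concatenate([d_same, d_diff])
labels = np.concatenate([np.ones(100), np.zeros(100)])

t, acc = best_threshold(distances, labels)
```

The 0.01 step makes the search a fixed grid of a few hundred candidates, which is cheap to evaluate exhaustively.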

Why does it select 0.01 as the step from 0.0 to 4.0? Why not 0.001 or 0.0001?

Moreover, FaceNet uses the LFW dataset. LFW defines its own specific evaluation protocol, which is implemented in FaceNet and OpenFace. So isn't it fair to say that the way FaceNet and OpenFace evaluate their models is specific to the LFW dataset?

In that case, how should we evaluate our models on datasets other than LFW?

Tags: embeddings, convolutional-neural-network, machine-learning



You can use `roc_curve` from scikit-learn to determine your threshold. Here is a link that can help: https://scikit-learn.org/stable/modules/generated/sklearn.metrics.roc_curve.html
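A minimal sketch of that idea, using synthetic pair distances and labels (made-up numbers, not real model output). Note that `roc_curve` expects a score where higher means "more likely positive", so the distances are negated; picking the threshold that maximises Youden's J statistic (TPR minus FPR) is one common choice, not the only one:

```python
import numpy as np
from sklearn.metrics import roc_curve

# Synthetic pair data: same-class pair distances around 0.5,
# different-class pair distances around 1.5 (made-up numbers).
rng = np.random.default_rng(2)
distances = np.concatenate([rng.normal(0.5, 0.2, 200),
                            rng.normal(1.5, 0.2, 200)])
labels = np.concatenate([np.ones(200), np.zeros(200)])

# Negate distances: a SMALL distance should mean "same class" (positive).
fpr, tpr, thresholds = roc_curve(labels, -distances)

# Pick the threshold maximising Youden's J = TPR - FPR.
j = tpr - fpr
best = int(np.argmax(j))
threshold = -thresholds[best]  # undo the negation to get a distance
```

Unlike the fixed-grid sweep in FaceNet, `roc_curve` only evaluates the thresholds that actually occur in your data, and it lets you trade off false accepts against false rejects explicitly instead of optimising raw accuracy.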
