Dummy vectors and performance measurement for vector search in face recognition
I have several thousand person faces (from the LFW celebrity dataset), with each person represented by a 512 x 1 vector. I store them in a vector DB to build a face search system based on embedded features (MTCNN for face detection and ArcFace as the embedding model). Someone suggested that I add many vectors to the database as dummy faces with an unknown class (with the number of these vectors larger than the number of person classes).
It's still unclear to me why I need to add many unknown faces as an UNKNOWN class and store them together with the thousands of vectors from known persons. As far as I know, it's pretty easy to check performance by computing similarity scores against only the known vectors (the vectors from each person), without the unknown ones. For example, if I set k = 3 or k = 5, I take the result with the minimum distance and return the class of that vector (its ID or label).
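To make the lookup described above concrete, here is a minimal sketch of it in Python. It is only an illustration under stated assumptions: random unit vectors stand in for real ArcFace embeddings, cosine distance is assumed as the metric, a brute-force NumPy search replaces the vector DB, and the names `top_k_search` and `predict_label` are hypothetical helpers, not part of any library.

```python
import numpy as np

def l2_normalize(x):
    # Scale vectors to unit length so a dot product equals cosine similarity.
    return x / np.linalg.norm(x, axis=-1, keepdims=True)

def top_k_search(gallery, labels, query, k=3):
    # Brute-force stand-in for a vector DB query (assumption: cosine distance).
    dists = 1.0 - gallery @ query
    idx = np.argsort(dists)[:k]
    return [(labels[i], float(dists[i])) for i in idx]

def predict_label(gallery, labels, query, k=3):
    # Among the k nearest hits, keep the one with the minimum distance.
    hits = top_k_search(gallery, labels, query, k)
    return min(hits, key=lambda hit: hit[1])

rng = np.random.default_rng(0)

# Synthetic gallery: 1000 known persons, one 512-d vector each (placeholder data).
gallery = l2_normalize(rng.normal(size=(1000, 512)))
labels = [f"person_{i}" for i in range(1000)]

# Query from a known person: a slightly perturbed copy of person_42's vector.
known_query = l2_normalize(gallery[42] + 0.01 * rng.normal(size=512))
known_label, known_dist = predict_label(gallery, labels, known_query, k=3)

# Query from a person who is not in the gallery at all.
stranger_query = l2_normalize(rng.normal(size=512))
stranger_label, stranger_dist = predict_label(gallery, labels, stranger_query, k=3)

print(known_label, round(known_dist, 3))
print(stranger_label, round(stranger_dist, 3))
```

With this rule, a query from someone outside the database is still assigned one of the known labels, only at a larger distance. An evaluation that uses known vectors alone never exercises that case.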
Topic vector-space-models embeddings image-recognition
Category Data Science