Face Recognition (Scalability Issue)

Background

I would like to build a face recognition model for registration and login to some kind of service, for example using this approach (CNN + SVM).

When a new user registers for the service, images of his/her face are recorded and the machine learning model is trained on these images. Then, when a person requests the service, the model recognises whether this person is a member or not.


Question

But when a new user comes in for registration, the machine learning model has to be retrained and must scan through all the previous images (or feature vectors). This approach seems to have a scalability issue when the number of users is large.

I have read through this post, but my situation is not quite the same as in the suggested answer, because my machine learning model aims to distinguish members from non-members. Does anyone know how to tackle this scalability issue? Thanks.

Topic training scalability machine-learning

Category Data Science


  • One method is to use Online Learning, as other answers have suggested.
  • You can use transfer learning: take a pre-trained model (a CNN) that has been trained on a large dataset and use it to generate feature vectors for the members of your system.
    • When an existing user comes in, you can use these feature vectors to identify the user (member) with an SVM.
    • When a non-member (new user) comes in, their feature vector will not match any of the members. If you want them to register, you can store their feature vector along with the previous vectors to identify them in the future.

I used a similar technique in one of my previous projects, with the face_recognition library from ageitgey: https://github.com/ageitgey/face_recognition. You can get it from pip as well.
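For example, here is a minimal sketch of that register/login flow with face_recognition; the in-memory store and the 0.6 tolerance are my own illustrative choices, not part of the library:

    import face_recognition

    # In-memory store of known members: name -> 128-d face encoding.
    # A real service would persist these vectors in a database.
    known_members = {}

    def register(name, image_path):
        # Extract the face embedding once at registration; nothing is retrained.
        image = face_recognition.load_image_file(image_path)
        encodings = face_recognition.face_encodings(image)
        if not encodings:
            raise ValueError("no face found in the registration image")
        known_members[name] = encodings[0]

    def login(image_path, tolerance=0.6):
        # Compare the query embedding against stored members; None means non-member.
        image = face_recognition.load_image_file(image_path)
        encodings = face_recognition.face_encodings(image)
        if not encodings or not known_members:
            return None
        names = list(known_members)
        distances = face_recognition.face_distance(
            [known_members[n] for n in names], encodings[0])
        best = int(distances.argmin())
        return names[best] if distances[best] <= tolerance else None

Because registration only stores a new vector, the cost of adding a user stays constant instead of growing with the size of the member base.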

Note: I know it is very late to answer this question now, but I hope my additions to the previous answers can help someone now or in the future as well.


There's a dedicated package for online learning called creme. You can find a quick guide on this blog, and the docs are also nice.

I would stick with CNNs, which are great at extracting features from data. I'd start with a pretrained model, chop off the last layers, and keep the convolutional ones. Then feed those features to creme.
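A rough sketch of that pipeline, assuming a Keras VGG16 as the pretrained feature extractor and creme's fit_one/predict_one API (in creme's successor, River, these became learn_one/predict_one); the file names are placeholders:

    import numpy as np
    from creme import linear_model
    from tensorflow.keras.applications.vgg16 import VGG16, preprocess_input
    from tensorflow.keras.preprocessing import image as keras_image

    # Pretrained CNN with the classification head chopped off; global average
    # pooling turns the remaining conv features into a 512-d vector per image.
    extractor = VGG16(weights="imagenet", include_top=False, pooling="avg")

    def features(path):
        img = keras_image.load_img(path, target_size=(224, 224))
        x = preprocess_input(np.expand_dims(keras_image.img_to_array(img), 0))
        # creme consumes feature dicts rather than arrays.
        return {f"f{i}": float(v) for i, v in enumerate(extractor.predict(x)[0])}

    # Online member-vs-non-member classifier: updated one example at a time,
    # so registering a new user never triggers a full retrain.
    clf = linear_model.LogisticRegression()
    clf.fit_one(features("new_member.jpg"), True)       # update at registration
    is_member = clf.predict_one(features("query.jpg"))  # check at login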


Another approach to achieving scalability is to use FaceNet from Google. I moved the images, the model, and the classifier into a database and used this technique to achieve scalability. In my case, only the classifier gets updated incrementally each time a new image arrives.
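The general pattern is straightforward to sketch: keep the embeddings in a database so that registration is just an insert, and only a lightweight classifier gets refit or updated. The SQLite schema below is hypothetical, for illustration only:

    import sqlite3
    import numpy as np

    # Hypothetical table: one FaceNet-style 128-d embedding per registered image.
    db = sqlite3.connect("faces.db")
    db.execute("CREATE TABLE IF NOT EXISTS embeddings (user_id TEXT, vector BLOB)")

    def register(user_id, embedding):
        # Registration only appends a row; the embedding model is never retrained.
        blob = np.asarray(embedding, dtype=np.float32).tobytes()
        db.execute("INSERT INTO embeddings VALUES (?, ?)", (user_id, blob))
        db.commit()

    def load_embeddings():
        # Pull stored vectors back out, e.g. to update the classifier.
        rows = db.execute("SELECT user_id, vector FROM embeddings").fetchall()
        return [(uid, np.frombuffer(blob, dtype=np.float32)) for uid, blob in rows]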


I am facing a similar issue, but my problem is a little different: whenever I train an SVM classifier, the old classes get disturbed by the new classes. I used a scikit-learn SVM model with a higher C value, which partially solved my problem.


The problem you describe can be tackled with online machine learning, where you continuously update your model as new data arrive, avoiding computationally intensive full retraining.

For deep neural networks, there is some work in this direction.

scikit-learn and Vowpal Wabbit also provide some online algorithms.
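For instance, scikit-learn's SGDClassifier exposes partial_fit, which updates the model from only the new examples instead of the whole history (the 128-d embeddings here are random placeholders):

    import numpy as np
    from sklearn.linear_model import SGDClassifier

    # Incrementally trained member-vs-non-member classifier.
    clf = SGDClassifier()

    # The first call must declare all classes up front.
    X_new = np.random.rand(4, 128)   # embeddings of newly registered faces
    y_new = np.array([1, 1, 1, 0])   # 1 = member, 0 = non-member
    clf.partial_fit(X_new, y_new, classes=[0, 1])

    # Later registrations: update with just the new examples, no full retrain.
    clf.partial_fit(np.random.rand(1, 128), np.array([1]))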
