Algorithms to do a CTRL+F (find object) on an image

Question

Basj

January 2, 2022 02:20

We all know the CTRL+F Find text... feature in text editors / browsers.

I'd like to study the available algorithms to do something similar on an image.

Example of UI/UX: let's say you have an electronic schematic. You draw a rectangle around a diode, and then the algorithm will automatically find all similar diodes on this image.

In order to find a pattern on an image, I know (and have already used some of them) the classical tools:

OpenCV matchTemplate (pros: works with a single training example; cons: doesn't support rotation or scaling)
YOLO (pros: I think it accepts rotation and scaling; cons: it requires 100s or 1000s of training examples)

Which available algorithms work well with only 1 or 2 training examples and tolerate scaling / rotation?

Topic image-segmentation yolo image-recognition image-classification

Category Data Science

Archana David · Accepted Answer · January 2, 2022 02:20

Invariant object recognition (IOR) refers to the rapid and accurate recognition of objects despite variations in size, rotation, and position.

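SIFT and SURF are the most popular algorithms for this. Both were patented: the SIFT patent expired in March 2020 (SIFT has been part of the main OpenCV module since version 4.4), while SURF is still patent-encumbered.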

If you are looking for an open-source algorithm, I would suggest OpenCV's Oriented FAST and Rotated BRIEF (ORB). The OpenCV ORB reference says:

ORB is a good alternative to SIFT and SURF in computation cost, matching performance and mainly the patents.

RESEARCH PAPERS

The first two research papers I found deal with a single training example.

Rotation Invariant Object Recognition from One Training Example

This paper presents a rotation-invariant local descriptor based on Gaussian derivatives. The authors use a “steerable filter” to implement the derivative responses. Rotation invariance is achieved by “steering” the descriptor to the main orientation at a center pixel location. An advantage of this strategy is that the main orientation can be computed directly from the first-order derivative responses. They also consider feature selection in the case where only a single example is available: virtual images are generated by rotating and rescaling the image, and rotationally and scale-unstable features are identified by estimating the main orientation at the center pixel location and counting the number of correct estimates. Unstable features are removed during the learning step. The resulting object recognition system performs robustly under various illumination changes, viewpoint changes, scale changes, and rotation in the image plane, and under partial occlusion.

Go for this approach if you want to start with one training example and add more examples as they become available, as in an on-line learning scheme.
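The "steering" idea from the first paper can be sketched in a few lines: the main orientation comes from the two first-order derivative responses, and the derivative along any direction is a linear combination of those same two responses. This is only an illustration under simplifying assumptions. It uses NumPy finite differences in place of the paper's Gaussian derivative filters, and the function names `main_orientation` and `steered_derivative` are made up for this example.

```python
import numpy as np


def main_orientation(image, row, col):
    """Angle (radians) of the intensity gradient at one pixel."""
    # np.gradient returns derivatives along axis 0 (rows), then axis 1 (columns).
    iy, ix = np.gradient(image.astype(np.float64))
    return float(np.arctan2(iy[row, col], ix[row, col]))


def steered_derivative(image, row, col, theta):
    """First-order derivative along direction theta, built from the x/y responses."""
    iy, ix = np.gradient(image.astype(np.float64))
    return float(np.cos(theta) * ix[row, col] + np.sin(theta) * iy[row, col])


# Synthetic ramp: intensity grows by 3 per column and by 4 per row.
rows, cols = np.mgrid[0:32, 0:32]
ramp = 3.0 * cols + 4.0 * rows

theta = main_orientation(ramp, 16, 16)
along = steered_derivative(ramp, 16, 16, theta)               # gradient magnitude
across = steered_derivative(ramp, 16, 16, theta + np.pi / 2)  # perpendicular direction
```

On this ramp the response along the main orientation equals the gradient magnitude, and the response perpendicular to it vanishes. Measuring a descriptor relative to `theta` is what makes it independent of in-plane rotation.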
Object Recognition from Local Scale-Invariant Features introduces SIFT

This paper presents a new method for image feature generation called the Scale Invariant Feature Transform (SIFT). This approach transforms an image into a large collection of local feature vectors, called SIFT keys. Each image generates on the order of 1000 SIFT keys. These keys are used as input to a nearest-neighbor indexing method that identifies candidate object matches. This system can learn an object model from a single image.
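The nearest-neighbor step can be pictured with a brute-force NumPy sketch. It runs on random stand-in vectors rather than real SIFT keys. The ratio check (accept a match only if the best distance is clearly smaller than the second best) is a commonly used filter added here as an assumption, not something described in the summary above, and `match_descriptors` is a hypothetical helper name.

```python
import numpy as np


def match_descriptors(query, train, ratio=0.75):
    """Brute-force nearest-neighbour matching with a ratio check.

    Returns a list of (query_index, train_index) pairs. `train` needs >= 2 rows.
    """
    # Pairwise Euclidean distances, shape (len(query), len(train)).
    diff = query[:, None, :] - train[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=2))
    order = np.argsort(dist, axis=1)
    matches = []
    for qi in range(len(query)):
        best, second = order[qi, 0], order[qi, 1]
        # Keep the match only if it is clearly better than the runner-up.
        if dist[qi, best] < ratio * dist[qi, second]:
            matches.append((qi, int(best)))
    return matches


rng = np.random.default_rng(0)
train = rng.random((50, 128)).astype(np.float32)  # stand-ins for 128-D SIFT keys
# Queries: slightly perturbed copies of train rows 3, 10 and 42.
noise = rng.normal(0.0, 0.01, size=(3, 128)).astype(np.float32)
query = train[[3, 10, 42]] + noise
pairs = match_descriptors(query, train)
```

With thousands of keys per image, a real system would replace the brute-force distance matrix with an index structure, which is the role of the indexing method the paper mentions.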
The research paper ORB: an efficient alternative to SIFT or SURF compares the performance of ORB with SIFT and SURF on two image sets, an indoor and an outdoor scene. ORB outperformed SIFT and SURF on the outdoor images and performed almost the same on the indoor ones.

The papers below support object recognition with few training examples only (not with rotation/scaling).

Few-Example Object Detection with Model Communication

This paper considers the problem of generic object detection with very few training examples (bounding boxes) per class, named “few-example object detection (FEOD)”. It relates this setting to existing work on supervised, semi-supervised, and weakly supervised object detection, and proposes the Multi-modal Self-Paced Learning for Detection (MSPLD) algorithm, which combines self-paced learning and multi-modal learning.
Unfortunately, there is no existing implementation as a Python package, as mentioned here. Although this paper focuses on object detection from few training samples, it doesn't support rotation and scaling of the objects.
Object Recognition from very few Training Examples for Enhancing Bicycle Maps

This paper introduces a system for object recognition that is trained with only 15 examples per class on average. To achieve this, the authors combine the advantages of convolutional neural networks and random forests to learn a patch-wise classifier. In the next step, they map the random forest to a neural network and transform the classifier to a fully convolutional network. Thereby, the processing of full images is significantly accelerated and bounding boxes can be predicted. However, this paper doesn't support rotation and scaling of the objects.

POPULAR ALGORITHMS

Scale Invariant Feature Transform (SIFT)
Speeded-Up Robust Features (SURF): because SIFT was slow, SURF was introduced as a faster alternative
Features from Accelerated Segment Test (FAST) corner detector: introduced because feature detection in SIFT & SURF wasn't fast enough
BRIEF (Binary Robust Independent Elementary Features)

SIFT uses a feature descriptor with 128 floating-point numbers. With thousands of such features, this takes a lot of memory and more time for matching. The descriptors can be compressed to make matching faster, but they still have to be computed first. This is where BRIEF comes in: it provides a shortcut to compute binary descriptors directly, with less memory, faster matching, and still a high recognition rate.
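To make the memory and matching argument concrete, here is a small NumPy sketch. It assumes SIFT descriptors stored as 128 `float32` values (512 bytes each) and a 256-bit binary descriptor packed into 32 bytes, which is the size ORB uses in OpenCV. Original BRIEF comes in several lengths. The descriptors are random stand-ins and `hamming_distances` is a hypothetical helper.

```python
import numpy as np

N_FEATURES = 1000

# SIFT: 128 float32 values per descriptor.
sift_bytes = N_FEATURES * 128 * np.dtype(np.float32).itemsize
# Binary descriptor: 256 bits packed into 32 uint8 values.
binary_bytes = N_FEATURES * 32 * np.dtype(np.uint8).itemsize


def hamming_distances(query, train):
    """Pairwise Hamming distances between rows of packed uint8 descriptors."""
    xor = np.bitwise_xor(query[:, None, :], train[None, :, :])
    # Count the differing bits of every (query, train) pair.
    return np.unpackbits(xor, axis=2).sum(axis=2)


rng = np.random.default_rng(1)
train = rng.integers(0, 256, size=(N_FEATURES, 32), dtype=np.uint8)
query = train[:5].copy()
query[0, 0] ^= 0b00000111  # flip 3 bits of the first query descriptor

dist = hamming_distances(query, train)
nearest = dist.argmin(axis=1)
```

Under these assumptions the binary descriptors take 16 times less memory, and matching reduces to XOR plus bit counting instead of floating-point distance computations.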
Oriented FAST and Rotated BRIEF (ORB)

OpenCV devs came up with a new “FREE” alternative to SIFT & SURF, and that is ORB. It is a good alternative to SIFT and SURF in computation cost, matching performance, and mainly the patents.
RIFT: a rotation-invariant generalization of SIFT
RootSIFT
G-RIF
PCA-SIFT
Gauss-SIFT
KAZE and A-KAZE (KAZE Features and Accelerated-KAZE Features)
GLOH (Gradient Location and Orientation Histogram)
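Of the SIFT variants in this list, RootSIFT is the easiest to try, because it is just a post-processing step on existing SIFT descriptors: L1-normalize each descriptor, then take the element-wise square root. A minimal NumPy sketch follows. The function name `root_sift`, the `eps` guard, and the random stand-in descriptors are assumptions for illustration.

```python
import numpy as np


def root_sift(descriptors, eps=1e-7):
    """L1-normalise each row, then take the element-wise square root."""
    d = np.asarray(descriptors, dtype=np.float32)
    d = d / (np.abs(d).sum(axis=1, keepdims=True) + eps)  # eps avoids division by zero
    return np.sqrt(d)


rng = np.random.default_rng(2)
# Stand-ins for SIFT descriptors: non-negative 128-D vectors.
fake_sift = rng.integers(0, 256, size=(4, 128)).astype(np.float32)
rooted = root_sift(fake_sift)
row_norms = np.linalg.norm(rooted, axis=1)
```

After the transform every row has (approximately) unit L2 norm, so the usual Euclidean matching code can be reused unchanged on the converted descriptors.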

The algorithms below support object recognition with few training examples only (not with rotation/scaling):

Fast R-CNN
R-FCN

Object Detection via Region-based Fully Convolutional Networks (paper and package). Unfortunately, the package does not support CPU-only mode.
Detectron2

Detectron2 is Facebook AI Research's next-generation library that provides state-of-the-art detection and segmentation algorithms. It is the successor of maskrcnn-benchmark and Detectron. Detectron implements these object detection algorithms: Mask R-CNN, RetinaNet, Faster R-CNN, RPN, Fast R-CNN, R-FCN.

The following algorithms also outperform SIFT and SURF:

KAZE and A-KAZE
FAST corner detector
ORB
