Object Detection without annotations and labels

Problem statement: I am given two sets of images, neither of which has annotations or labels. First set: images of grocery store shelves (captured in the stores). Second set: close-up images of the products kept on those shelves. What I am trying to achieve: I want to first locate a product and then predict a bounding box for it in the set of images of Grocery …
Category: Data Science

How to generate Anchor boxes for SSD?

I am currently trying to understand how anchor boxes are generated for object detection. I am looking at some code in which the author does this in a very flexible way, but I am having trouble understanding part of it. Because every part of the code depends heavily on the others, misunderstanding a small piece leads to confusion, and then I have to start again from the beginning. Please help... This is the function used by the author in his …
Category: Data Science
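
For orientation, one common way to lay out SSD-style anchor boxes is to center a small set of boxes on every cell of a feature map, one per aspect ratio at a given scale. The sketch below is not the function the question refers to (that code is cut off in the excerpt); `generate_anchors`, its parameters, and the example values are assumptions chosen only for illustration.

```python
import math

def generate_anchors(fmap_size, scale, aspect_ratios):
    """Return anchors as (cx, cy, w, h) in normalized [0, 1] image coordinates.

    Hypothetical helper: one anchor per aspect ratio is centered on each
    cell of a square fmap_size x fmap_size feature map.
    """
    anchors = []
    for row in range(fmap_size):
        for col in range(fmap_size):
            # Center of the cell, normalized by the feature map size.
            cx = (col + 0.5) / fmap_size
            cy = (row + 0.5) / fmap_size
            for ratio in aspect_ratios:
                # Width and height keep the area at scale**2 while w / h == ratio.
                w = scale * math.sqrt(ratio)
                h = scale / math.sqrt(ratio)
                anchors.append((cx, cy, w, h))
    return anchors

# Example values are made up for illustration.
anchors = generate_anchors(fmap_size=3, scale=0.2, aspect_ratios=[1.0, 2.0, 0.5])
print(len(anchors))  # 3 * 3 cells * 3 ratios = 27
```

A full detector would typically repeat this for several feature maps, with a larger scale on the coarser maps.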

Is there a contrast-invariant object recognition technique?

I am working on object recognition and understand the technique. I’ve worked with the MNIST dataset and understand how CNNs work. Now it is time for me to apply what I have learned to a real-world problem, and right off the bat I have encountered a major issue. The object I need to locate is a dark-colored object on a white background. Building a filter on a dark object doesn’t make sense to me. Should I color invert the image …
Category: Data Science

How to extract a specific region from a black and white image using OpenCV

I have been trying to learn OpenCV as I have a deep interest in Computer Vision and one of the problems I have been trying to figure out is how to extract a particular region of an image with OpenCV. I do not want to merely crop the image; I want to extract the exact region from it. For example if I have an image of the lower half of the face like this: Is there a way I can …
Category: Data Science

Extract features using Bounding Box

I have a ground-truth bounding box for a 3D object. I would like to extract useful features for the object. My goal is to concatenate these visual object features with language features (from the description of the object) for training. For visual features, I want to use the ground-truth bounding boxes to create an upper baseline. How do I extract features using a 3D bounding box?
Category: Data Science

How to create COCO format data out of list of boxes

I have $N$ images. I have a script that extracts bounding boxes of an object that I am interested in. For each image, I may get $m$ boxes. There is only one class I am interested in, which is cat. So what I have looks like the following: image_1.jpg [[1, 2, 100, 200], [1, 2, 130, 200]], cat, cat I wonder how I can turn this data into COCO-format data in order to build my object detection …
Category: Data Science
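
As a rough illustration of the conversion asked about above, the sketch below builds a COCO-style dictionary with the `images`, `annotations`, and `categories` sections. The question does not say which box layout the script produces, so the code assumes `[x_min, y_min, x_max, y_max]` and converts to COCO's `[x, y, width, height]`. The helper name `to_coco` and the image size are placeholders, not part of the question.

```python
import json

def to_coco(detections, image_sizes, category_name="cat"):
    """Build a COCO-style dict from {file_name: [[x_min, y_min, x_max, y_max], ...]}.

    Assumes corner-format input boxes; COCO stores [x, y, width, height].
    image_sizes maps file_name -> (width, height) and is supplied by the caller.
    """
    coco = {
        "images": [],
        "annotations": [],
        "categories": [{"id": 1, "name": category_name}],
    }
    ann_id = 1
    for image_id, (file_name, boxes) in enumerate(sorted(detections.items()), start=1):
        width, height = image_sizes[file_name]
        coco["images"].append(
            {"id": image_id, "file_name": file_name, "width": width, "height": height}
        )
        for x_min, y_min, x_max, y_max in boxes:
            w, h = x_max - x_min, y_max - y_min
            coco["annotations"].append({
                "id": ann_id,
                "image_id": image_id,
                "category_id": 1,  # single class, matching the one category above
                "bbox": [x_min, y_min, w, h],
                "area": w * h,
                "iscrowd": 0,
            })
            ann_id += 1
    return coco

detections = {"image_1.jpg": [[1, 2, 100, 200], [1, 2, 130, 200]]}
image_sizes = {"image_1.jpg": (640, 480)}  # placeholder size, not from the question
coco = to_coco(detections, image_sizes)
print(json.dumps(coco["annotations"][0]))
```

If the boxes are already in `[x, y, width, height]` form, the subtraction step should be dropped.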

What does it mean if performance of two different iterations of the same network (CNN model) varies a lot?

I trained a CNN model for people detection on the Caltech pedestrian dataset. Out of curiosity, I then evaluated the model at every 1,000th iteration with the evaluation toolbox (I guarantee there is no bug in the evaluation). However, the performance plot does not look good: the miss rate spikes between 20K (20,000) and 30K iterations. I am confused about what that means. Usually we would expect the miss rate to decrease as we train the model more. I am using yolo …
Category: Data Science

Does resizing images during training affect the bounding box annotations?

I am using the TensorFlow object detection API to train on my own custom dataset and am preparing the annotations for it. I see from the config file of my pre-trained SSD Inception net that the image is resized to 300 x 300 during training. My question is whether the resize changes the position of my object relative to the annotation. I mean, the xmin, ymin, width and height of the bounding box would now be different since …
Category: Data Science
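
The arithmetic behind this question can be sketched as follows, assuming the box is stored in pixels as (xmin, ymin, width, height) and the image is resized independently along each axis. The function name and the example sizes are made up for illustration. Boxes stored in normalized [0, 1] coordinates would not change under such a resize.

```python
def resize_box(box, orig_size, new_size):
    """Scale a pixel-space (xmin, ymin, width, height) box to a resized image.

    orig_size and new_size are (image_width, image_height) tuples.
    """
    xmin, ymin, w, h = box
    sx = new_size[0] / orig_size[0]  # horizontal scale factor
    sy = new_size[1] / orig_size[1]  # vertical scale factor
    return (xmin * sx, ymin * sy, w * sx, h * sy)

# Example values are made up: a 600 x 400 image resized to 300 x 300.
scaled = resize_box((120, 80, 60, 40), orig_size=(600, 400), new_size=(300, 300))
print(scaled)  # (60.0, 60.0, 30.0, 30.0)
```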

How to calculate mAP for detection task for the PASCAL VOC Challenge?

How do I calculate the mAP (mean Average Precision) for the detection task for the Pascal VOC leaderboards? The paper says, on page 11: Average Precision (AP). For the VOC2007 challenge, the interpolated average precision (Salton and Mcgill 1986) was used to evaluate both classification and detection. For a given task and class, the precision/recall curve is computed from a method’s ranked output. Recall is defined as the proportion of all positive examples ranked above a given rank. Precision is the …
Category: Data Science
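
To make the quoted definition concrete, here is a minimal sketch of the 11-point interpolated AP used for VOC2007: precision is sampled at recall levels 0.0, 0.1, ..., 1.0, taking at each level the maximum precision reached at that recall or higher. The function names and the toy ranking are assumptions for illustration. Matching detections to ground truth (deciding which ranked outputs count as hits) is not shown. mAP is then the mean of the per-class AP values.

```python
def precision_recall(ranked_hits, num_positives):
    """Precision and recall after each rank, given hits sorted by decreasing confidence."""
    recalls, precisions = [], []
    true_positives = 0
    for rank, hit in enumerate(ranked_hits, start=1):
        true_positives += int(hit)
        recalls.append(true_positives / num_positives)
        precisions.append(true_positives / rank)
    return recalls, precisions

def voc07_average_precision(recalls, precisions):
    """11-point interpolated AP: mean over r in {0.0, 0.1, ..., 1.0} of the
    maximum precision at any recall >= r (0 if no such point exists)."""
    total = 0.0
    for i in range(11):
        r = i / 10
        candidates = [p for rec, p in zip(recalls, precisions) if rec >= r]
        total += max(candidates) if candidates else 0.0
    return total / 11

# Toy ranking for one class: True marks a correct detection; 4 positives exist in total.
ranked_hits = [True, True, False, True, False]
recalls, precisions = precision_recall(ranked_hits, num_positives=4)
ap = voc07_average_precision(recalls, precisions)
print(round(ap, 3))  # 0.682
```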

Uniform detection: a starting point?

Given an image of a worker, I need to verify whether he/she is wearing the company's uniform. I tried Googling, but the search results are either about uniform sampling or general object detection tutorials. Could you point me to some starting points, ideally similar projects published somewhere? Many thanks,
Category: Data Science

How can you include information not present in an image for neural networks?

I am training a CNN to identify objects in images (one label per image). However, I have additional information about these images that cannot be retrieved by looking at the image itself. In more detail, I'm talking about the physical location of the object. This information proved to be important when classifying these objects. However, I can't think of a good way to include this information in an image recognition model, since the CNN is classifying the object based on …
Category: Data Science

Python : Feature Matching + Homography to find Multiple Objects

I'm trying to use OpenCV via Python to find multiple objects in a train image and match them with the key points detected from a query image. In my case, I'm trying to detect the tennis courts in the image provided below. I looked at the online tutorials and could only conclude that the method detects just one object. I thought of adding a loop so that it finds multiple objects, but I failed to do so. Any …
Category: Data Science

Objects Localization Through CNN

I am new to deep learning and TensorFlow, and I am trying to train a CNN to localize digits in the Street View House Numbers data set. To this end I have an input set of 32x32 images and, since I want to recognize up to 5 digits, I am using as labels vectors of 20 elements like this: [top_x_digit1,top_y_digit1,widht_digit1,height_digit1,top_x_digit2, etc..], with 0,0,0,0 when there is no digit. As far as I understand, after (let me say) 3 layers of …
Category: Data Science
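
The 20-element label layout described above can be sketched as follows: up to five (top_x, top_y, width, height) boxes are flattened into one vector, and missing digits are padded with 0, 0, 0, 0. The helper names and the example boxes are hypothetical. Only the encoding is shown, not the network.

```python
MAX_DIGITS = 5

def encode_boxes(boxes, max_digits=MAX_DIGITS):
    """Flatten up to max_digits (top_x, top_y, width, height) boxes into one
    vector, padding missing digits with 0, 0, 0, 0."""
    if len(boxes) > max_digits:
        raise ValueError("too many digits for the label vector")
    label = []
    for box in boxes:
        label.extend(box)
    label.extend([0] * (4 * (max_digits - len(boxes))))
    return label

def decode_boxes(label):
    """Inverse of encode_boxes: split into groups of four and drop all-zero padding."""
    boxes = [tuple(label[i:i + 4]) for i in range(0, len(label), 4)]
    return [box for box in boxes if any(box)]

# Made-up boxes for a two-digit house number in a 32x32 image.
label = encode_boxes([(3, 5, 8, 20), (12, 5, 8, 20)])
print(len(label))  # 20
print(decode_boxes(label))
```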

Train object detection without annotated data/bounding boxes

From what I can see, most object detection NNs (Fast(er) R-CNN, YOLO, etc.) are trained on data that includes bounding boxes indicating where in the picture the objects are located. Are there algorithms that simply take the full picture plus label annotations and then, on top of determining whether an image contains certain object(s), also indirectly learn the appropriate bounding box(es) for those objects?
Category: Data Science

Is there something like class-based object detection? Or class-based selective search?

I've been reading a lot about computer vision lately, and while there is a huge amount of information about object classification, and a lot less on object detection, I have not found anything on class-based object detection, i.e., when I know what I am looking for. For example, looking for cats in a picture. Nowadays you can say whether a picture is of a cat (object classification), or whether the picture has cats among other things. How would I use this …
Category: Data Science

Detecting directions using Convolutional neural networks

I am working on a task where I have to detect damage on vehicles and determine exactly where the damage has occurred. So I have to not only detect damage on a door but also say which door (front left, rear right, etc.). Is it possible to detect not only the objects but also their direction using convolutional neural networks? Are there any implementations of similar tasks?
Category: Data Science

About

Geeks Mental is a community that publishes articles and tutorials about Web, Android, Data Science, new techniques and Linux security.