I am working on an object detection model for microscopic images. Some of my classes are very simple feature-wise; they are practically fancy lines. Others are more complicated objects. My problem is that I am getting a lot of false positives for the simple classes, because anything that looks like a line gets classified as one. I am using the Detectron2 framework, and I wonder if there is a way to set a higher confidence threshold for the simple classes?
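To sketch what I am after (my own post-processing idea, not something built into Detectron2 beyond its single global MODEL.ROI_HEADS.SCORE_THRESH_TEST; the class IDs and thresholds below are made up):

```python
import torch
from detectron2.engine import DefaultPredictor

# Hypothetical per-class thresholds: stricter cut for the "simple" line-like classes.
CLASS_THRESHOLDS = {0: 0.85, 1: 0.85, 2: 0.5}  # class_id -> minimum score
DEFAULT_THRESHOLD = 0.5

predictor = DefaultPredictor(cfg)  # cfg assumed to be configured elsewhere
outputs = predictor(image)         # image: BGR numpy array
instances = outputs["instances"]

# Keep a detection only if its score clears its class-specific threshold.
keep = torch.tensor(
    [float(s) >= CLASS_THRESHOLDS.get(int(c), DEFAULT_THRESHOLD)
     for s, c in zip(instances.scores, instances.pred_classes)],
    dtype=torch.bool,
    device=instances.scores.device,
)
filtered = instances[keep]
```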
I want to train a model for object detection. How do I have to label the training data? Is it enough to label the class/content of each box in the image, or do I also have to add the box position? Thank you.
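For concreteness, this is the kind of annotation I am asking about. In the YOLO text label format, for example, each line stores the class and the box position together (values below are invented):

```
# one line per object: class_id x_center y_center width height (all normalized to [0, 1])
0 0.512 0.430 0.200 0.150
2 0.105 0.780 0.080 0.120
```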
I set up a YOLOv4 PyTorch framework in Google Colab by cloning https://github.com/roboflow-ai/pytorch-YOLOv4.git. I generated checkpoints by training. Since we need a more robust model, I trained again, this time assigning the pretrained checkpoint, but the loss started at a high value, just as in the first training run. The training command is !python train.py -b 2 -s 1 -l 0.001 -g 0 -pretrained ./Yolov4_epoch100_latest.pth -classes 1 -dir ./train -epochs 100. I am not sure my pretrained checkpoint is being used …
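In case it helps, here is the sanity check I ran (my own sketch, not from the repo) to confirm the checkpoint file actually contains trained weights:

```python
import torch

# Load the file I pass via -pretrained and inspect what is inside.
ckpt = torch.load("./Yolov4_epoch100_latest.pth", map_location="cpu")

# Depending on how the repo saves it, this may be a raw state_dict or a
# dict wrapping one (e.g. under a "state_dict" or "model" key).
state = ckpt.get("state_dict", ckpt) if isinstance(ckpt, dict) else ckpt
print(type(ckpt), "with", len(state), "entries")
for name, value in list(state.items())[:5]:
    shape = tuple(value.shape) if hasattr(value, "shape") else type(value)
    print(name, shape)
```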
I have a YOLOv3 model for object detection on 9 classes. What is the difference between computing metrics (such as mAP) on the validation set and on a test set (unseen data)? What is usually done in the literature, and why?
Suppose I have 5 classes, denoted 1, 2, 3, 4, and 5, used in object detection. When evaluating detection performance, suppose classes 1, 2, and 3 are present, but classes 4 and 5 are not present in the target values. Will classes 4 and 5 each have an average precision of 0 (since no true positives can be identified, their precision is zero)? Or perhaps there are other considerations to take …
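To spell out my reading with a toy computation (my own sketch, not any particular evaluator; I believe COCO-style tools instead mark such classes as undefined and exclude them from the mean):

```python
# Per-class AP bookkeeping under my assumption above (all numbers invented).
gt_counts = {1: 10, 2: 4, 3: 7, 4: 0, 5: 0}  # ground-truth instances per class
ap_values = {1: 0.62, 2: 0.55, 3: 0.48}       # APs computed for the present classes

per_class_ap = {}
for cls, n_gt in gt_counts.items():
    # With zero ground truths, every detection is a false positive and
    # recall is undefined, so the AP is arguably undefined rather than 0.
    per_class_ap[cls] = ap_values[cls] if n_gt > 0 else None

# Mean over the classes that actually have ground truth.
valid = [ap for ap in per_class_ap.values() if ap is not None]
print("mAP over present classes:", sum(valid) / len(valid))
```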
I have a trained object detection model in the ONNX format (optimized to run on mobile phones). I have converted that model into the ORT format (which is also optimized for mobile phones). On ONNX Runtime's website (www.onnxruntime.ai), there are links to GitHub repositories containing iOS and Android code for example apps that use ORT models. But code is available only for object detection on iOS and image classification on Android. This is shown in …
I'm a beginner; I've done object detection using Haar cascades on faces as well as with ImageAI, so maybe not a complete beginner. I'm working on a simple regression project for my resume: predicting how long a piece of toast has been in the toaster based on an image of the toast. I need a way to draw a rectangle on an image containing pictures of toast to extract them and use them in my CNN model. That has proven to …
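Concretely, something like this OpenCV sketch is what I have in mind for drawing the rectangle and extracting the crop (paths and the input size are made up):

```python
import cv2

img = cv2.imread("toast.jpg")  # hypothetical path

# selectROI opens a window; drag a rectangle and press ENTER/SPACE to confirm.
x, y, w, h = cv2.selectROI("select the toast", img, showCrosshair=False)
cv2.destroyAllWindows()

crop = img[y:y + h, x:x + w]         # the extracted toast region
crop = cv2.resize(crop, (224, 224))  # resize to my CNN's assumed input size
cv2.imwrite("toast_crop.jpg", crop)
```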
Why do we need to use the LabelImg tool for object detection? After labeling a bunch of training images with LabelImg, we get a CSV file. How does that CSV file work with the TensorFlow Object Detection API and Keras? Can we do localization without an image annotation tool, e.g. with auto-annotation?
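For context, as I understand it LabelImg itself saves Pascal VOC XML, and the TensorFlow tutorials then convert those files to the CSV with a script roughly like this (my paraphrase of the common xml_to_csv pattern, not the official file):

```python
import csv
import glob
import xml.etree.ElementTree as ET

# Convert LabelImg's Pascal VOC XML files into one CSV of labelled boxes.
with open("train_labels.csv", "w", newline="") as f:
    writer = csv.writer(f)
    writer.writerow(["filename", "width", "height", "class",
                     "xmin", "ymin", "xmax", "ymax"])
    for xml_file in glob.glob("annotations/*.xml"):  # hypothetical folder
        root = ET.parse(xml_file).getroot()
        size = root.find("size")
        for obj in root.findall("object"):
            box = obj.find("bndbox")
            writer.writerow([
                root.find("filename").text,
                size.find("width").text, size.find("height").text,
                obj.find("name").text,
                box.find("xmin").text, box.find("ymin").text,
                box.find("xmax").text, box.find("ymax").text,
            ])
```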
I saw the paper and code of a person re-identification library by NVIDIA (GitHub - NVlabs/DG-Net: Joint Discriminative and Generative Learning for Person Re-identification, CVPR'19 Oral). It says there are two different networks that focus on person body shape and clothing separately. Sorry if the question is noobish. Intuitively, I would run the image through an edge detector and then train on that image to make the network learn the structure of the pedestrian - to make it focus …
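By "run the image through an edge detector" I mean something as simple as this (Canny thresholds chosen arbitrarily):

```python
import cv2

img = cv2.imread("pedestrian.jpg", cv2.IMREAD_GRAYSCALE)  # hypothetical path
edges = cv2.Canny(img, 100, 200)  # arbitrary low/high thresholds
cv2.imwrite("pedestrian_edges.jpg", edges)  # the "structure-only" image I describe
```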
I am trying to train an object detection model using Mask R-CNN with a ResNet-50 backbone. I am using the pre-trained models from PyTorch's TorchVision library. I have only 10 images that I can use to train. Of those same 10 images, I am using 3 for validation. For the evaluation, I am using the COCO evaluation method, which is also provided as .py scripts in TorchVision's GitHub repository. To have enough samples for training, I …
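For reference, this is roughly how I load the pre-trained model and swap the heads for my classes (the standard TorchVision fine-tuning pattern; NUM_CLASSES is just my setting and includes the background class):

```python
import torchvision
from torchvision.models.detection.faster_rcnn import FastRCNNPredictor
from torchvision.models.detection.mask_rcnn import MaskRCNNPredictor

NUM_CLASSES = 2  # my foreground classes + background (assumed)

model = torchvision.models.detection.maskrcnn_resnet50_fpn(pretrained=True)

# Replace the box head for my number of classes.
in_features = model.roi_heads.box_predictor.cls_score.in_features
model.roi_heads.box_predictor = FastRCNNPredictor(in_features, NUM_CLASSES)

# Replace the mask head likewise.
in_channels = model.roi_heads.mask_predictor.conv5_mask.in_channels
model.roi_heads.mask_predictor = MaskRCNNPredictor(in_channels, 256, NUM_CLASSES)
```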
I'm studying Andrew Ng's Convolutional Neural Networks course and am in Week 3, which deals with object detection using the YOLO algorithm. I don't understand one section in the programming assignment that uses a function called 'scale_boxes'. This is what the course materials say about the function: "There're a few ways of representing boxes, such as via their corners or via their midpoint and height/width. YOLO converts between a few such formats at different times, …"
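My current reading of what scale_boxes does, written out as a sketch (my understanding, not the course code): boxes predicted in normalized [0, 1] coordinates get rescaled to the actual image size:

```python
import numpy as np

def scale_boxes_sketch(boxes, image_shape):
    """Rescale normalized (y1, x1, y2, x2) boxes to pixel coordinates.

    boxes: array of shape (N, 4) with values in [0, 1]
    image_shape: (height, width) of the original image
    """
    height, width = image_shape
    scale = np.array([height, width, height, width], dtype=np.float32)
    return boxes * scale

# A normalized box covering the central quarter of a 720x1280 frame:
print(scale_boxes_sketch(np.array([[0.25, 0.25, 0.75, 0.75]]), (720, 1280)))
```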
I am training an SSD model for detecting mobile cranes. The training dataset contains 1,000 images and the test set over 400 images. About 200 epochs gave an mAP of 83%, but my target is 90%. So I trained SSD with ResNet-101, and it gave lower accuracy. I assume that is because ResNet-101 is too deep for the size of my dataset. I am considering ResNet-50 and Inception, but I don't have time to experiment with all the models under different parameter settings. Is there …
I'm training an object detection model with TensorFlow and monitoring the training task with TensorBoard. I was expecting that the images displayed in TensorBoard's Images tab would show a bounding box (at a specific point of training). What I see, though, are only images with an orange line drawn above the picture (the same orange I expect for the bounding box). Am I missing something? Am I right that a bounding box should appear, or …
I'm using TensorFlow's SSD MobileNet V2 object detection code and am so far disappointed by the results I've gotten. I'm hoping somebody can take a look at what I've done so far and suggest how I might improve the results. Dataset: I'm training on two classes (from OIV5) containing 2,352 instances of "Lemon" and 2,009 instances of "Cheese". I have read in several places that "state of the art" results can be achieved with a few thousand instances. Train …
Situation: my dataset is 70k images of people wearing clothes. Images are labelled with bbox position and class. There are 10 classes. I did an 80:20 split. Categories are balanced, with the exception of one category, but I can accept poor performance on that one. The goal is clothing recognition in images: when I feed in an image of a person wearing pants and a t-shirt, I want to see two bboxes around those clothes. My problems: I have already trained a few models from …
I am a bit confused after reading "A survey on object detection in remote sensing". The authors state that machine-learning-based object detection consists of three essential parts: feature extraction, feature fusion plus dimension reduction, and classifier training. They then list feature extraction methods: Histogram of Oriented Gradients, Bag of Words, texture features, and more. Later in the section, they list approaches to classifier training, e.g. SVM, AdaBoost, k-nearest-neighbor, and neural networks. This does not align with my understanding …
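To check my understanding of the three-part pipeline the survey describes, here is a minimal sketch of the classic version (HOG features into an SVM; the data arrays are placeholders):

```python
import numpy as np
from skimage.feature import hog
from sklearn.svm import LinearSVC

# 1) Feature extraction: one HOG descriptor per image patch.
def extract_features(patches):
    return np.array([hog(p, pixels_per_cell=(8, 8), cells_per_block=(2, 2))
                     for p in patches])

# Placeholder data: 20 random 64x64 grayscale patches with binary labels.
rng = np.random.default_rng(0)
patches = rng.random((20, 64, 64))
labels = np.array([0, 1] * 10)

X = extract_features(patches)

# 2) Feature fusion / dimension reduction would go here (e.g. PCA).
# 3) Classifier training on the extracted features.
clf = LinearSVC().fit(X, labels)
print(clf.predict(X[:3]))
```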
I understand (I think) why, in object detection, the result is a rectangle: it is a simple shape that can be defined by 4 variables (two pairs of coordinates for opposite corners, or one pair of coordinates plus width and height), so a more complicated shape might require more parameters, which could complicate things. But what if, for example, a circle were used? There would be just 3 parameters: one pair of coordinates for the center plus the radius. Is there …
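For concreteness, the parameterization I am imagining, plus a conversion back to a box so standard IoU tooling could still be reused (pure illustration, made-up values):

```python
# A circle detection: 3 parameters instead of a rectangle's 4.
cx, cy, r = 120.0, 80.0, 25.0  # center x, center y, radius

# Its axis-aligned bounding box, e.g. for comparing against box-based metrics.
xmin, ymin = cx - r, cy - r
xmax, ymax = cx + r, cy + r
print((xmin, ymin, xmax, ymax))
```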
I am experimenting with the TensorFlow Object Detection API on a Windows 7 machine. I am trying to detect US address labels (and similar blocks of text) as they appear on a piece of mail or an envelope. I am not trying to detect individual words or lines, but rather the full rectangular block of text. My address labels are typically isolated on the letter or envelope and surrounded by whitespace. For example: I followed the tutorial to …