In some published works, especially in medical image analysis, instead of writing the FP rate as a percentage, they write it per case, for example FP: 128.52 [/case]. What is the meaning of this? Is it different from a percentage rate? How is it calculated per case?
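A per-case FP rate divides the total number of false positives by the number of cases (scans/patients) rather than by the number of predictions, which is why it can be far greater than 1. A minimal sketch of that arithmetic, with made-up counts:

```python
# False positives reported "per case" are averaged over cases (e.g. scans),
# not divided by the total number of predictions, so values above 1 are normal.
false_positives = [130, 120, 135, 129]  # hypothetical FP counts for 4 scans
fp_per_case = sum(false_positives) / len(false_positives)
print(fp_per_case)  # 128.5
```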
I read that the Euler number is the number of objects in the region minus the number of holes in those objects, so it should return an integer. Why does it return values like 54.25?
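The definition above can be checked on a toy binary image. A sketch using `scipy.ndimage` (with default 4-connectivity, and the assumption that the border-touching background forms a single component); a non-integer result usually means the image was not binarized first:

```python
import numpy as np
from scipy import ndimage

# Binary image: one square object with one hole -> Euler number 1 - 1 = 0
img = np.zeros((10, 10), dtype=bool)
img[2:8, 2:8] = True
img[4:6, 4:6] = False  # punch a hole

_, n_objects = ndimage.label(img)
# Holes are background components not connected to the image border.
_, n_bg = ndimage.label(~img)
n_holes = n_bg - 1  # assume the border-touching background is one component
euler = n_objects - n_holes
print(euler)  # 0
```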
I need to design a marker image that can be detected by a neural network. I am aware that merely detecting an image is not a complex task and can be done with OpenCV alone. However, the detector should also work in poor visibility conditions (low light, out of focus, worn out / partially damaged, the marker making up only a tiny part of the image, etc.). The background around the marker image is just the real world, difficult to …
I want to separate the digits in a number such as 7638 into different images that can be predicted individually using a CNN. By finding contours, how can I split each contour into a separate image in Python? To be more clear: how can I divide https://www.letter2word.com/products/americana-numbers into 0,1,2,3..9 as individual images using the concept of contours in OpenCV in Python, so that the resulting images can be predicted by a CNN classifier and later all the results written together to give the number in the initial image as …
Generally speaking, for training a machine learning model, the size of the training set should be larger than the number of predictors. For a neural network, or even a deep learning model, the number of parameters is usually in the tens of thousands or even millions. Yet in practice, the size of the training set, i.e., the number of images, is usually smaller than the number of parameters. How can this be explained? I know we can claim that the pre-trained …
I extracted images of human faces from videos, but the pipeline also recorded images without faces. I wrote a CNN for emotion classification. In clear pictures, the softmax output in the last layer is concentrated on one class; for example, in a photo that is certainly happy, a probability of 0.95 appears for the happy class. But if there is no face in the picture, the probability disperses between classes, e.g. 0.3 and 0.2. …
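One simple way to exploit the observation above, rejecting inputs whose top softmax probability is low, can be sketched like this (the 0.6 threshold is a hypothetical value to be tuned on held-out data, and max-probability is an imperfect proxy for "no face present"):

```python
import numpy as np

def is_confident(probs, threshold=0.6):
    """Accept a prediction only if the top softmax probability clears a
    threshold; diffuse distributions (likely no face) are rejected."""
    return float(np.max(probs)) >= threshold

print(is_confident([0.95, 0.02, 0.03]))       # True  (clear face)
print(is_confident([0.3, 0.2, 0.25, 0.25]))   # False (diffuse distribution)
```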
I'm a beginner; I've done object detection using Haar cascades on faces, as well as ImageAI, so maybe not a complete beginner. I'm working on a simple regression project for my resume: predicting how long a piece of toast has been in the toaster based on an image of the toast. I need a way to draw a rectangle on an image containing pictures of toast, to extract the toast regions and use them in my CNN model. That has proven to …
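Once a bounding box is known, extracting the region is plain array slicing; a minimal sketch with a hypothetical box (for drawing the box interactively, OpenCV's `cv2.selectROI` is one option):

```python
import numpy as np

# Given a bounding box (x, y, w, h) from any detector or manual annotation,
# cropping the region is array slicing; the box below is hypothetical.
image = np.zeros((480, 640, 3), dtype=np.uint8)
x, y, w, h = 100, 150, 200, 120
toast_crop = image[y:y + h, x:x + w]
print(toast_crop.shape)  # (120, 200, 3)
```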
I'm building a computer vision application in Python (OpenCV, keras-retinanet, TensorFlow) that requires detecting an object and then counting how many objects are behind that front object, so the objects are often overlapping. For example: how many people are in this queue? Here, the aim is to detect the person at the front (foreground) of the queue, and then detect the number of people behind the front person, despite those behind being obscured. I have built an object recognition model using keras-retinanet …
I would be glad if someone could give me some hints and an assessment of the following project. (I'm relatively new to ML and DL and have only a little theoretical knowledge.) My goal is to build a detector for receipt corners in images. I started to create a dataset with images of receipts, the labels being the 4 corner points of the receipt. My plan is to train a CNN with the dataset, and I wonder if you …
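A corner detector of this kind is often framed as coordinate regression. A minimal Keras sketch, assuming 128x128 RGB inputs and labels flattened to 8 normalized (x, y) values; the layer sizes are illustrative only:

```python
import tensorflow as tf

# Regress the 4 receipt corners as 8 values in [0, 1] (normalized x, y).
model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(128, 128, 3)),
    tf.keras.layers.Conv2D(16, 3, activation="relu"),
    tf.keras.layers.MaxPooling2D(),
    tf.keras.layers.Conv2D(32, 3, activation="relu"),
    tf.keras.layers.MaxPooling2D(),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(64, activation="relu"),
    tf.keras.layers.Dense(8, activation="sigmoid"),  # 4 corners * (x, y)
])
model.compile(optimizer="adam", loss="mse")
print(model.output_shape)  # (None, 8)
```

Mean squared error on normalized coordinates is a common starting loss for this setup.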
The ImageNet paper presents the average ROC curve for the 16 classes in the ImageNet data; see Figure 8 in the paper. What is the standard way to compute this ROC plot, given that an ROC plot is for a binary classification problem? Is the average ROC plot in the figure the average of the ROC curves for all 16 categories? Any help would be appreciated.
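One common construction for such a curve is a macro-average over one-vs-rest ROC curves, interpolated onto a shared FPR grid. A sketch with toy scores (whether the paper used exactly this procedure is an assumption):

```python
import numpy as np
from sklearn.metrics import roc_curve
from sklearn.preprocessing import label_binarize

rng = np.random.default_rng(0)
n, k = 200, 3                      # 3 toy classes stand in for the 16
y_true = rng.integers(0, k, size=n)
y_score = rng.random((n, k))
y_score /= y_score.sum(axis=1, keepdims=True)

# One-vs-rest: binarize labels, compute one ROC per class, interpolate each
# onto a common FPR grid, then average the TPRs across classes.
y_bin = label_binarize(y_true, classes=np.arange(k))
grid = np.linspace(0, 1, 101)
mean_tpr = np.zeros_like(grid)
for c in range(k):
    fpr, tpr, _ = roc_curve(y_bin[:, c], y_score[:, c])
    mean_tpr += np.interp(grid, fpr, tpr)
mean_tpr /= k
print(mean_tpr[-1])  # 1.0
```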
In convolutional neural networks, we make convolutions of three channels (red, green, blue) with a filter of dimensions $k\times k\times 3$, like in the picture: Each filter consists of adjustable weights, and can learn to detect primitive features, like edges. The filter can be different for each channel: one for R, another for G, yet another for B. My question is: Why do we need separate filters for each channel? If there's an edge, it will appear on each channel …
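The behaviour being asked about can be seen in a few lines of NumPy: a $k\times k\times 3$ filter applies one 2-D kernel per channel and sums the three responses into a single scalar, so distinct kernels let the network weight (or even cancel) the same edge differently in R, G and B:

```python
import numpy as np

k = 3
patch = np.random.default_rng(1).random((k, k, 3))   # one RGB image patch
kernels = np.stack([np.ones((k, k)),                 # R kernel
                    np.zeros((k, k)),                # G kernel: ignore green
                    -np.ones((k, k))], axis=-1)      # B kernel: negate blue

out = np.sum(patch * kernels)                        # single scalar response
same = np.sum(patch[..., 0]) - np.sum(patch[..., 2]) # R minus B, G ignored
print(np.isclose(out, same))  # True
```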
In order to identify the similarity between images (products), I want to use a neural network approach similar to TiefVision. This pre-trained neural network basically translates the images into feature vectors and then creates a similarity measure between the images using a distance measure between the vectors. To make it more tangible, have a look at the 2D visual representation below. I want to take it one step further: when a single user "likes" multiple images, I want …
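The "likes" step can be sketched by averaging the liked items' embedding vectors into a single taste vector and ranking every product by cosine similarity to it (one simple aggregation choice; the 2-D embeddings below are hypothetical):

```python
import numpy as np

def cosine(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

# Hypothetical product embeddings; a real system would use the CNN features.
products = {"a": np.array([1.0, 0.0]),
            "b": np.array([0.9, 0.1]),
            "c": np.array([0.0, 1.0])}
liked = ["a", "b"]

# Average the liked vectors into one 'taste' vector, then rank by similarity.
taste = np.mean([products[p] for p in liked], axis=0)
ranking = sorted(products, key=lambda p: cosine(products[p], taste), reverse=True)
print(ranking)  # ['a', 'b', 'c']
```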
I am a newbie in face recognition related things... As far as I have observed, dlib's frontal_face_detector is widely used to find the faces in an image and, after that, to extract face_descriptor vectors. Which is better for a real-time face authentication system: FaceNet by Google, or dlib_face_recognition_resnet_model_v1 from face_recognition? Both seem to work fine, but for a real-time implementation, is there something important to understand about the performance? Or have any comparison checks been done on real-time / large datasets? Thanks in …
I want to train a CNN for image recognition. The training images do not have a fixed size. I want the input size for the CNN to be 50x100 (height x width), for example. When I resize some small images (for example 32x32) to the input size, the content of the image is stretched horizontally too much, but for some medium-sized images it looks okay. What is the proper method for resizing images while avoiding destroying the content? (I am …
Problem Statement: I am given 2 sets of images. All the images in both sets are without annotations or labels. First set: a set of images of grocery store shelves (captured in the grocery stores). Second set: a set of close-up images of the products kept on those store shelves. What I am trying to achieve: I want to first locate, and then predict a bounding box for, a product in the set of images of grocery …
I've been trying to make an object recogniser in TensorFlow and have used labelImg to label large electrical transmission towers at varying distances. In order to train on 10-16 MP (~2-7 MB) images with my computer's 16 GB of RAM, I need to crop them down until they're about 200x500 and under 100 kB. The way I see it, this preserves the detail of a far-away tower, and means the image_resizer function, set at 1024x600, won't be smashing a 200x500 tower in a 5000x3000 …
I'm working on a project in which I need to build a form recognizer that, given a form image, returns the key-value pairs. As I just got started, I wanted to hear some opinions about what I should try. Some questions that I have in mind: What models work best for the described input and output? What features should be fed into that model? What should be the ideal size of the training dataset? Please feel free to …
I am trying to design and train a neural network that would be able to give me the coordinates of certain key points in an image.

Dataset

I've got a dataset containing 1800 images similar to these: This dataset is generated by me. Each image contains two circles, one smaller and one bigger, generated at random positions in the image. My goal is to train the neural network to return 2 sets of coordinates, each of them pointing precisely at the center of …
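A generator of this kind can be sketched in NumPy, producing an image plus the two centre coordinates as the regression label (a toy stand-in with fixed radii, not the poster's actual generator):

```python
import numpy as np

def make_sample(size=64, seed=0):
    """One training image with two random circles; the label is the two
    centre coordinates, flattened as (y1, x1, y2, x2)."""
    rng = np.random.default_rng(seed)
    img = np.zeros((size, size), dtype=np.float32)
    centers = []
    yy, xx = np.mgrid[:size, :size]
    for r in (4, 8):                      # one smaller, one bigger circle
        cy, cx = rng.integers(r, size - r, size=2)
        img[(yy - cy) ** 2 + (xx - cx) ** 2 <= r ** 2] = 1.0
        centers.extend([cy, cx])
    return img, np.array(centers, dtype=np.float32)

img, label = make_sample()
print(img.shape, label.shape)  # (64, 64) (4,)
```

A CNN with a 4-unit linear output head can then be trained with a mean-squared-error loss against these labels.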
I'm working with a model that involves 3 stages of 'nesting' of models in Keras. Conceptually, the first is a transfer-learning CNN model, for example MobileNetV2 (model 1). This is then wrapped by a model that consists of a small DNN (model 2). Finally, during training, these are all wrapped by a model that concatenates multiple outputs from model 2, calculates the loss, and then backpropagates into model 2 and, in the future, model 1 (model 3). For inference later …
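The three-level nesting described can be sketched with the Keras functional API as follows (the names, layer sizes, and the Dense stand-in for the CNN backbone are illustrative only, not the poster's actual architecture):

```python
import tensorflow as tf

# Model 1: the backbone (a Dense layer stands in for e.g. MobileNetV2).
inp = tf.keras.Input(shape=(8,))
model1 = tf.keras.Model(inp, tf.keras.layers.Dense(4)(inp), name="backbone")

# Model 2: a small head wrapping model 1.
x = tf.keras.Input(shape=(8,))
model2 = tf.keras.Model(x, tf.keras.layers.Dense(2)(model1(x)), name="head")

# Model 3: runs model 2 on two inputs and concatenates, for training only;
# gradients flow back into model 2 (and model 1, if it is left trainable).
a, b = tf.keras.Input(shape=(8,)), tf.keras.Input(shape=(8,))
out = tf.keras.layers.Concatenate()([model2(a), model2(b)])
model3 = tf.keras.Model([a, b], out, name="trainer")
print(model3.output_shape)  # (None, 4)
```

For inference, `model2` can be used on its own, since the wrapping in `model3` only shares its layers rather than copying them.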
I'm working on a multi-class classification deep learning algorithm and I was getting big over-fitting: my model is supposed to classify sunglasses into 17 different brands, but I only had around 400 images from each brand, so I created a folder with the data augmented 3x, generating images with these parameters:

```python
datagen = ImageDataGenerator(
    rotation_range=30,
    width_shift_range=0.2,
    height_shift_range=0.2,
    shear_range=0.2,
    zoom_range=0.2,
    horizontal_flip=True,
    fill_mode='nearest')
```

After doing so I got these results: I don't know if it's correct to do the validation only using the …