How to get the expected output shape from a unet model?

I have an image segmentation task where my input image shape is (140, 85, 95, 4) and the output label shape is (140, 85, 95). Below is my model:

    import tensorflow as tf
    from tensorflow.keras.layers import Conv2D, Conv2DTranspose, Input, Rescaling

    num_classes = 4
    my_model = tf.keras.Sequential([
        Input(shape = (85, 95, 4), name = 'image'),
        Rescaling(scale = 1./255),
        Conv2D(filters = 64, kernel_size = 3, strides = 1, activation = 'relu', padding = 'same'),
        Conv2D(filters = 64, kernel_size = 3, activation = 'relu', padding = 'same'),
        …
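The part of the model that matters for the output shape is elided above, so the following is only a sketch of the shape arithmetic, not a diagnosis. It assumes the usual Keras conventions: `padding='same'` with stride 1 keeps the spatial size, and a 2x2 max pool with the default `padding='valid'` rounds odd sizes down. The helper names are hypothetical.

```python
import math

def conv_same(size, stride=1):
    """Output length of a convolution or pooling layer with padding='same'."""
    return math.ceil(size / stride)

def pool_valid(size, pool=2, stride=2):
    """Output length of a pooling layer with padding='valid'."""
    return (size - pool) // stride + 1

def upsample(size, factor=2):
    """Output length after upsampling by `factor` (or a stride-`factor`
    transposed convolution with padding='same')."""
    return size * factor

height, width = 85, 95  # spatial shape from the question

# Stride-1 'same' convolutions keep the spatial shape.
assert (conv_same(height), conv_same(width)) == (85, 95)

# Hypothetical encoder/decoder step: one 2x2 max pool, then one 2x upsampling.
down = (pool_valid(height), pool_valid(width))
up = (upsample(down[0]), upsample(down[1]))
print(down, up)  # (42, 47) (84, 94)
```

If the elided layers do downsample, odd sizes such as 85 and 95 will not come back to their original values, so padding or cropping the input to a multiple of the total downsampling factor is one option. A final `Conv2D` with `num_classes` filters would then give a (85, 95, 4) output per image, and an argmax over the last axis yields labels of shape (85, 95).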
Category: Data Science

Are more target labels in a multi-label classification always better?

Context: We work on medical image segmentation. There are a lot of potential labels for one and the same region we segment. There can be different medically defined labels like anatomical regions, more biological labels like tissue types, or spatial labels like left/right. And many labels can be further differentiated into (hierarchical) sub-labels.

Clarification: The question is with respect to the number of classes / target labels which are used in a multi-label classification/segmentation. It is not about the …
Category: Data Science

How to apply segmentation on objects only

I have this image, which is an output from my object detection model. I want to apply segmentation to it so that my mask looks like this. I used the GrabCut algorithm, but the results were poor. Here's my code:

    img=cv2.imread(testpath+imgname)
    mask=np.zeros(img.shape[:2],np.uint8)
    bgModel=np.zeros((1,65),np.float64)
    fgModel=np.zeros((1,65),np.float64)
    tmpimage=image
    masks=[]
    for i in recs:
        cv2.grabCut(img,mask,i,bgModel,fgModel,5,cv2.GC_INIT_WITH_RECT)
        mask2=np.where((mask==2)|(mask==0),0,255).astype('uint8')
        masks.append(mask2)
        #img=image*mask2[:,:,np.newaxis]
    finalmask=np.zeros(img.shape[:2],np.uint8)
    for i in range(len(masks)):
        finalmask=finalmask+masks[i]
    # for i in range(len(finalmask)):
    #     for j in range(len(finalmask[i][:])):
    #         for k in recs:
    #             if i<k[0] …
Category: Data Science

ValueError: Data cardinality is ambiguous: (Jupyter Notebook)

I'm building an OCR to read text off of water meters. I'm running into the error mentioned above when I try to fit the machine learning model. I am using the segmentation_models Python library.

    BACKBONE = 'resnet34'
    preprocess_input = sm.get_preprocessing(BACKBONE)
    x_train, y_train, x_val, y_val = train_test_split(X,y, test_size = 0.2, random_state= 12345)
    x_train = preprocess_input(x_train)
    x_val = preprocess_input(x_val)
    model = sm.Unet(BACKBONE, encoder_weights='imagenet', encoder_freeze=True)
    model.compile('Adam', loss=sm.losses.bce_jaccard_loss, metrics=[sm.metrics.iou_score])
    model.fit( x = x_train, y = y_train, batch_size=16, epochs=10, validation_data=(x_val, y_val))

'X' represents the images …
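One likely cause, offered as a hedged reading of the excerpt: scikit-learn's `train_test_split(X, y)` returns its results in the order `X_train, X_test, y_train, y_test`, while the code in the question unpacks them as `x_train, y_train, x_val, y_val`. The sketch below uses a hypothetical stand-in, `split_like_sklearn`, that only mimics that return order (no shuffling, no scikit-learn dependency) to show how the swap produces inputs and targets of different lengths.

```python
def split_like_sklearn(X, y, test_size=0.2):
    """Hypothetical stand-in that mimics the return order of sklearn's
    train_test_split: X_train, X_test, y_train, y_test (no shuffling)."""
    n_val = int(round(len(X) * test_size))
    n_train = len(X) - n_val
    return X[:n_train], X[n_train:], y[:n_train], y[n_train:]

X = [f"image_{i}" for i in range(10)]
y = [f"mask_{i}" for i in range(10)]

# Unpacking order used in the question: the 2nd and 3rd names are swapped,
# so "y_train" actually receives the held-out images.
x_train, y_train, x_val, y_val = split_like_sklearn(X, y)
print(len(x_train), len(y_train))  # 8 2

# Unpacking order that matches the documented return order.
x_train, x_val, y_train, y_val = split_like_sklearn(X, y)
print(len(x_train), len(y_train))  # 8 8
```

With the first unpacking, `model.fit` would receive 80% of the images as `x` and 20% of the images as `y`, which is the kind of size mismatch the "Data cardinality is ambiguous" error reports.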
Category: Data Science

ResNet50 implementation for semantic segmentation

I am new to ResNet models. I want to implement a ResNet50 model for semantic segmentation. I am following the code from this video, but my numclasses is 21. I have a few questions: If I pass any RGB JPEG image into the model, I get an output of size (1, 21). What does this output represent? Since I am doing semantic segmentation, my images don't have any RGB channels, so what should I put for image_channels in self.conv1? …
Category: Data Science

Segmentation Network produces noisy output

I've implemented SegNet and a SegNet ReLU variant in PyTorch. I'm using it as a proof of concept for now, but what really bothers me is the noise produced by the network. With Adam I seem to get slightly less noise, whereas with SGD the noise increases. I can see the loss going down and the cross-evaluation accuracy rising to 98%-99%, and yet the noise is still there. On the left is the actual image, then you can see the mask, and …
Category: Data Science

What will a segmentation model output in the first epochs?

I am working on a semantic segmentation problem, a 5-class task. When I run my validation function and output the probability maps, I find that for the background class (the extra class for nnUNet, named class-0) the probabilities are always close to 1, even after many epochs. For the other foreground classes (class-1 to class-6), the probabilities never reach 1.0. At least I can recognize the outline of the target, but …
Category: Data Science

Transfer Deep Learning from one aerial imagery dataset to many others

I am new to Deep Learning but have been able to use RasterVision successfully to predict building footprints within a set of aerial imagery. This aerial imagery data set is for a province of New Zealand. Now that I have a model that predicts successfully in this province, I am interested in how I could use this to predict in the many other regions of New Zealand. The problem is these regions are captured with differing sensors, resolution and with …
Category: Data Science

How to prepare masks for multiclass semantic segmentation?

It's very straightforward for binary semantic segmentation: black color (0s) is responsible for background, whereas white color (1s) is responsible for objects of interest. But what about multiclass semantic segmentation? As far as I understand, these masks must be RGB images since we use more than two colors. Is it correct? Or should I have a separate binary mask for every class? If I can use RGB images with multiple colors as masks, should I use some specific colors for …
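As a hedged illustration of the options the question lists: a common convention is to store a multiclass mask as a single-channel image whose pixel values are class indices, and to treat RGB colors only as a visualization of those indices. The sketch below converts a color-coded mask to an index mask and then to one binary mask per class. The palette and helper names are hypothetical, and nested lists are used only to keep the example dependency-free.

```python
# Hypothetical palette: which RGB color encodes which class index.
PALETTE = {
    (0, 0, 0): 0,      # background
    (255, 0, 0): 1,    # class 1
    (0, 255, 0): 2,    # class 2
}

def rgb_to_index(mask_rgb, palette):
    """Convert an H x W grid of RGB tuples into an H x W grid of class indices."""
    return [[palette[tuple(pixel)] for pixel in row] for row in mask_rgb]

def one_hot(index_mask, num_classes):
    """Convert an H x W index mask into num_classes binary H x W masks."""
    return [
        [[1 if value == c else 0 for value in row] for row in index_mask]
        for c in range(num_classes)
    ]

mask_rgb = [
    [(0, 0, 0), (255, 0, 0)],
    [(0, 255, 0), (255, 0, 0)],
]
index_mask = rgb_to_index(mask_rgb, PALETTE)
print(index_mask)              # [[0, 1], [2, 1]]
print(one_hot(index_mask, 3))  # [[[1, 0], [0, 0]], [[0, 1], [0, 1]], [[0, 0], [1, 0]]]
```

Under this convention the specific colors do not matter, as long as the mapping from color to class index is fixed and every mask pixel uses exactly one palette color (lossy formats such as JPEG can break that assumption).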
Category: Data Science

Video segmentation vs image segmentation

I am new to data science and am working on a segmentation model. I need to deploy this model on Android devices using TensorFlow Lite for real-time segmentation of camera frames. I used a U-Net model for this but could not get the accuracy I wanted. After a lot of exploring I found something about video segmentation, but I am a bit confused. How is video segmentation different from normal image segmentation? Can somebody explain the differences between the two?
Category: Data Science

Identify visible stones in the image - Approach in OpenCV & Deep Learning

I have sample images containing stones. I need to identify only the visible stones. The approach I tried is threshold-based filtering followed by detecting contours with cv2. I am also looking into the ENet architecture for a deep learning approach based on semantic segmentation. The sample images are below. Example image1: Example image2: The code I tried for contour-based detection is below:

    image = cv2.imread(os.path.join(img_path, img_name2))
    # threshold based customization
    lower_bound = np.array([0, 0, 0])
    upper_bound = …
Category: Data Science

What is the difference between proposal-based approach and proposal-free approach?

The source linked here says that techniques to solve instance segmentation can be roughly grouped into two categories: proposal-based methods and proposal-free methods. In proposal-based methods, a set of object proposals and their classes are first predicted, then foreground-background segmentation in each bounding box is performed. The proposal-free approaches exclude the step of proposal generation. What is a "proposal" in this context? Also, how does one "first predict their classes"? There is not much explanation about this topic on the internet and I …
Category: Data Science

Proper solution to synthesize nail art on a hand picture

I'm trying to synthesize nail art onto a picture of a hand. These are the 3 steps I'm trying to do:

1. take hand pictures
2. select options like color, cubic, etc.
3. synthesize

And this is how I thought to solve it:

1. get the nail contour with a U-Net model trained on datasets (hand pics, hand pics with nail area painted)
2. make synthetic nail art images with a pix-to-pix model trained on datasets (nail art pics, semantic images including the nail art options)
3. synthesize the nail art image onto the hand picture

I'm wondering whether …
Category: Data Science

multiple images inside one large CSV file

I'm very new to data science, and was admiring how people had made these massive open-source datasets on places like Kaggle. I noticed that all of the datasets were in CSV format. I have lots of images that I'd like to upload to Kaggle for everyone to use, but I don't know how to convert my images to CSV. (I can't upload them as individual images because there is a limit of 1000 files, which is not enough for a …
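A minimal sketch of what "images in a CSV" usually means: each row holds one image, with its label and dimensions followed by the flattened pixel values. The row layout (`label, height, width, pixels...`) and the function names are assumptions made for illustration, and the images are tiny hypothetical grayscale grids rather than real files.

```python
import csv
import io

def images_to_csv(images, labels, handle):
    """Write one row per image: label, height, width, then the flattened pixels."""
    writer = csv.writer(handle)
    for image, label in zip(images, labels):
        height, width = len(image), len(image[0])
        flat = [pixel for row in image for pixel in row]
        writer.writerow([label, height, width, *flat])

def csv_to_images(handle):
    """Read rows written by images_to_csv back into (label, image) pairs."""
    result = []
    for row in csv.reader(handle):
        label, height, width = row[0], int(row[1]), int(row[2])
        pixels = [int(value) for value in row[3:]]
        image = [pixels[r * width:(r + 1) * width] for r in range(height)]
        result.append((label, image))
    return result

# Two tiny hypothetical 2x3 grayscale images.
images = [[[0, 10, 20], [30, 40, 50]], [[255, 0, 255], [0, 255, 0]]]
labels = ["cat", "dog"]

buffer = io.StringIO()  # stands in for a file opened with open(path, "w", newline="")
images_to_csv(images, labels, buffer)
buffer.seek(0)
restored = csv_to_images(buffer)
print(restored[0])  # ('cat', [[0, 10, 20], [30, 40, 50]])
```

This layout grows quickly for large or color images, since every pixel becomes a text field, so packing the original image files into a single archive may be the simpler route when the only obstacle is a file-count limit.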
Category: Data Science

U-Net for Crack Segmentation

I applied a U-Net model that was built for Oxford Pet segmentation to a crack segmentation project. Without transfer learning, the model works fine for pet segmentation but not for crack segmentation. What could be the reason? I know there is code for crack segmentation with U-Net, but I want to learn why the code for pets doesn't work well for cracks. Thanks in advance.

    def double_conv_block(x,n_filters):
        x=layers.Conv2D(n_filters,3,padding="same",activation="relu",kernel_initializer="he_normal")(x)
        x=layers.Conv2D(n_filters,3,padding="same",activation="relu",kernel_initializer="he_normal")(x)
        return x

    def downsample_block(x,n_filters):
        f=double_conv_block(x,n_filters)
        p=layers.MaxPool2D(2)(f)
        p=layers.Dropout(0.3)(p)
        return f,p

    def upsample_block(x,conv_features,n_filters):
        x=layers.Conv2DTranspose(n_filters,3,2,padding="same")(x)
        …
Category: Data Science

Post-processing in medical segmentation with attention U-Net

I am doing lesion segmentation for multiple sclerosis (MS), and at the moment I am using an attention U-Net for my thesis. The best validation Dice score I have received is 0.771, with a training score of 0.84. I am thinking of doing some post-processing to remove some FP and FN in order to enhance the predictions. Any advice? Currently I am using opening and closing, and I am not sure if this is the right approach.
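One post-processing step that is often tried alongside opening and closing is dropping connected components below a size threshold, which targets isolated false positives. The sketch below is a dependency-free 2D illustration, not the asker's pipeline: `min_size` is a hypothetical threshold that would need tuning on validation data, and 4-connectivity is an assumption.

```python
from collections import deque

def remove_small_components(mask, min_size):
    """Return a copy of a binary mask without 4-connected components
    that contain fewer than min_size pixels."""
    height, width = len(mask), len(mask[0])
    cleaned = [row[:] for row in mask]
    seen = [[False] * width for _ in range(height)]
    for r in range(height):
        for c in range(width):
            if mask[r][c] == 0 or seen[r][c]:
                continue
            # Breadth-first search collects one connected component.
            component = [(r, c)]
            seen[r][c] = True
            queue = deque(component)
            while queue:
                y, x = queue.popleft()
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < height and 0 <= nx < width
                            and mask[ny][nx] and not seen[ny][nx]):
                        seen[ny][nx] = True
                        component.append((ny, nx))
                        queue.append((ny, nx))
            if len(component) < min_size:
                for y, x in component:
                    cleaned[y][x] = 0
    return cleaned

def dice(pred, target):
    """Dice coefficient between two binary masks of the same shape."""
    inter = sum(p and t for pr, tr in zip(pred, target) for p, t in zip(pr, tr))
    total = sum(map(sum, pred)) + sum(map(sum, target))
    return 2 * inter / total if total else 1.0

target = [[1, 1, 0, 0], [1, 1, 0, 0], [0, 0, 0, 0]]
pred = [[1, 1, 0, 0], [1, 1, 0, 1], [0, 0, 0, 0]]  # one isolated false-positive pixel
cleaned = remove_small_components(pred, min_size=2)
print(round(dice(pred, target), 3), dice(cleaned, target))  # 0.889 1.0
```

This step can only remove false positives, and it will also delete genuinely small lesions below the threshold, so its effect on the Dice score should be checked on held-out data rather than assumed.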
Category: Data Science

Ways of calculating the area of colored regions in a map

Background: I am a PhD student trying to improve my data science skills. One of my research projects has me tasked with determining the size of the clusters in a colored image of regions. Here is an example image I am using. The coloring is natural, as it represents the orientation of the microscope light. The light hits the surface in different ways, resulting in the different colors. But I'm not trying to sum regions of similar colors, but instead just …
Category: Data Science

How to train a form recognizer

I'm working on a project in which I need to build a form recognizer that, given a form image, returns the key-value pairs. As I just got started, I wanted to hear some opinions about what I should try. Some questions that I have in mind: Which models work best for this input and output? What features should be fed into the model? What would be the ideal size of the training dataset? Please, feel free to …
Category: Data Science

About

Geeks Mental is a community that publishes articles and tutorials about Web, Android, Data Science, new techniques and Linux security.