How do I create a categorical mask for images as a Tensor, or correctly port a NumPy function to Dataset.map?

I'm trying to move from NumPy arrays as my dataset to tf.data.Dataset.

I've built a pipeline to train a model for classification problems. At one point, I normalize all the images using the map function:

dataset['train'] = dataset['train'].map(pre_pr, num_parallel_calls=tf.data.experimental.AUTOTUNE)

The functions are defined like this:

@tf.function
def normalize(input_image: tf.Tensor, input_mask: tf.Tensor) -> tuple:
    input_image = tf.cast(input_image, tf.float32) / 255.0
    input_mask = tf.cast(input_mask, tf.float32) / 255.0
    return input_image, input_mask

@tf.function
def pre_pr(datapoint: dict) -> tuple:
    input_image = tf.image.resize(datapoint['image'], (IMG_SIZE, IMG_SIZE))
    input_mask = tf.image.resize(datapoint['y_mask'], (IMG_SIZE, IMG_SIZE))
    return normalize(input_image, input_mask)

This works fine.

After all the mapping, the dataset changes from

MapDataset shapes: {image: (None, None, 3), y_mask: (None, None, 3)}, types: {image: tf.uint8, y_mask: tf.uint8}

to

PrefetchDataset shapes: ((None, 128, 128, 3), (None, 128, 128, 3)), types: (tf.float32, tf.float32)

The issue arises in segmentation problems: creating the categorical mask causes trouble.

I have a function that creates the segmentation mask from an image, given a color palette:

def flat_labels(label):
    label_seg = np.zeros(label.shape, dtype=np.float32)
    for i in range(len(encodings)):
        label_seg[np.all(label == encodings[i], axis=-1)] = i
    return label_seg[:, :, 0]

Here encodings is a list/array of RGB colors, such as [255, 255, 0], [0, 0, 255], ...
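To make the function's output concrete, here is a minimal NumPy-only sketch that runs flat_labels on a tiny 2x2 mask. The two-color palette is a hypothetical stand-in, since the full encodings list is not shown. Note that the result is a 2-D float32 array, which matters when choosing the output type for tf.numpy_function.

```python
import numpy as np

# Hypothetical palette for illustration; the real `encodings` is not shown in full.
encodings = np.array([[255, 255, 0], [0, 0, 255]], dtype=np.uint8)

def flat_labels(label):
    # Same function as above: one class index per pixel, stored as float32.
    label_seg = np.zeros(label.shape, dtype=np.float32)
    for i in range(len(encodings)):
        label_seg[np.all(label == encodings[i], axis=-1)] = i
    return label_seg[:, :, 0]

# 2x2 RGB mask: the top row uses color 0, the bottom row uses color 1.
mask = np.array([[[255, 255, 0], [255, 255, 0]],
                 [[0, 0, 255], [0, 0, 255]]], dtype=np.uint8)

flat = flat_labels(mask)
print(flat.shape, flat.dtype)  # (2, 2) float32
```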

When I try to call it like:

input_mask = tf.numpy_function(flat_labels, [input_mask], tf.uint8)

The mask then loses its static shape (it is reported as unknown) and its type becomes tf.uint8:

PrefetchDataset shapes: ((None, 128, 128, 3), unknown), types: (tf.float32, tf.uint8)
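The unknown shape is expected: tf.numpy_function cannot infer what the wrapped Python code returns. One possible workaround, sketched below under some assumptions, is to pass an output type that matches what flat_labels really returns (float32, not uint8) and to restore the shape with set_shape. The palette and the to_class_ids helper are hypothetical names used only for illustration. The sketch also assumes the mask still holds exact palette colors when the function runs, which would mean calling it before dividing by 255 and resizing the mask with nearest-neighbor rather than the default bilinear interpolation.

```python
import numpy as np
import tensorflow as tf

IMG_SIZE = 128  # same size as in the shapes printed above

# Hypothetical two-color palette; substitute the real `encodings`.
encodings = np.array([[255, 255, 0], [0, 0, 255]], dtype=np.uint8)

def flat_labels(label):
    # Unchanged NumPy function from the question: returns an (H, W) float32 array.
    label_seg = np.zeros(label.shape, dtype=np.float32)
    for i in range(len(encodings)):
        label_seg[np.all(label == encodings[i], axis=-1)] = i
    return label_seg[:, :, 0]

def to_class_ids(input_mask):
    # Tout must match the dtype the NumPy function actually returns (float32).
    flat = tf.numpy_function(flat_labels, [input_mask], tf.float32)
    # numpy_function drops static shape information, so restore it by hand.
    flat.set_shape((IMG_SIZE, IMG_SIZE))
    return flat

# Toy data: four masks in which every pixel is [0, 0, 255], i.e. class 1.
masks = np.zeros((4, IMG_SIZE, IMG_SIZE, 3), dtype=np.uint8)
masks[..., 2] = 255

ds = tf.data.Dataset.from_tensor_slices(masks).map(to_class_ids).batch(2)
print(ds.element_spec)  # the shape is now known instead of "unknown"
```

A trade-off worth knowing: code wrapped in tf.numpy_function runs as ordinary Python, so it is not parallelized or serialized the way native TensorFlow ops are.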

I really don't want to change the rest of the code; I just want to apply this NumPy-based flat_labels function to a tensor, or find some way to write a function that creates a categorical mask for a Tensor.
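For the second option, a pure-TensorFlow equivalent might look like the sketch below. It is not a drop-in from any library: ENCODINGS and flat_labels_tf are hypothetical names, the palette is a stand-in, and the code assumes the palette colors are unique and the mask is still uint8 with exact palette values. As in the NumPy version, pixels that match no color end up as class 0.

```python
import tensorflow as tf

# Hypothetical palette for illustration; colors are assumed to be unique.
ENCODINGS = tf.constant([[255, 255, 0], [0, 0, 255]], dtype=tf.uint8)

def flat_labels_tf(mask):
    """Map an (H, W, 3) uint8 color mask to an (H, W) int32 class-index mask."""
    # Compare every pixel with every palette color:
    # (H, W, 1, 3) == (K, 3) broadcasts to (H, W, K, 3).
    same_channel = tf.equal(mask[:, :, tf.newaxis, :], ENCODINGS)
    # A pixel matches color k only if all three channels agree: (H, W, K).
    matches = tf.cast(tf.reduce_all(same_channel, axis=-1), tf.int32)
    # At most one color matches per pixel, so a weighted sum yields its index;
    # pixels with no match sum to 0, like the NumPy version's default.
    class_index = tf.range(ENCODINGS.shape[0], dtype=tf.int32)
    return tf.reduce_sum(matches * class_index, axis=-1)

mask = tf.constant([[[255, 255, 0], [0, 0, 255]],
                    [[0, 0, 255], [7, 7, 7]]], dtype=tf.uint8)
print(flat_labels_tf(mask).numpy().tolist())  # [[0, 1], [1, 0]]
```

Because this uses only TensorFlow ops, the output shape is inferred inside Dataset.map without a set_shape call. If the mask is resized first, tf.image.resize with method='nearest' keeps the uint8 dtype and avoids blending palette colors.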

Thanks in advance.

