How to use Autoencoders for outlier detection on images

I have a bunch of images taken from a camera showing a pipe and would like to detect if the pipe is leaking or not. There are very few examples of leaking pipes in the data set. So considering this problem as a supervised learning problem, I think that it may not give us good results due to imbalanced data. I am thinking of using autoencoders and considering it as an outlier detection problem.

I am new to deep learning so I would like to know what the architecture of my neural network should look like. Should I have some convolutional layers first and then an autoencoder or should I only have an autoencoder? What would be the best deep learning library for such a use case? I am also thinking of using only the photos which do not have any leak for the training phase, is that okay?

Topic autoencoder image-classification outlier deep-learning

Category Data Science


Your explanation is not detailed enough. If the images all show the same scene with the same background, you can use a simple image diff to find changes and detect leaks.
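To make the image diff idea concrete, here is a minimal sketch, assuming grayscale images of identical size taken from a fixed camera. The function names, the synthetic images, and the threshold of 5.0 are all made up for illustration; a real threshold would have to be tuned on your own data.

```python
import numpy as np

def diff_score(image, reference):
    """Mean absolute pixel difference between an image and a reference image."""
    # Cast to float first so that uint8 subtraction cannot wrap around.
    image = np.asarray(image, dtype=np.float64)
    reference = np.asarray(reference, dtype=np.float64)
    if image.shape != reference.shape:
        raise ValueError("image and reference must have the same shape")
    return float(np.mean(np.abs(image - reference)))

def is_changed(image, reference, threshold=5.0):
    """Flag the image when its diff score exceeds a hand-picked threshold."""
    return diff_score(image, reference) > threshold

# Synthetic stand-ins for real photos (assumption: 64x64 grayscale).
reference = np.full((64, 64), 100, dtype=np.uint8)
normal = reference.copy()
leaking = reference.copy()
leaking[20:40, 20:40] = 200  # simulated bright patch where the leak would be

normal_flag = is_changed(normal, reference)
leaking_flag = is_changed(leaking, reference)
```

In practice the reference could be the pixel-wise median of several leak-free images, which makes it less sensitive to noise in any single frame.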


One idea is to train your autoencoder only on normal images, which lets you represent the problem in a lower-dimensional space. For $x \in \mathbb{R}^{W \times H \times C}$, the encoder $f$ maps your image to

$$ f(x) = \tilde{x} \in \mathbb{R}^{1 \times d} $$

Then, since you are in a lower-dimensional space, you can use a distance-based rule: the encodings of your normal cases should accumulate in some ball:

$$ \{ \tilde{x} : \|\tilde{x} - u\| < \varepsilon \} $$

where $u$ is the center (the average of the encodings of all your normal training cases). At prediction time, use the trained model to encode each new image. If the new point is inside the ball, it is a normal case; if it is outside, it is an outlier.
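A minimal sketch of this ball rule, assuming the encoder already exists and returns one $d$-dimensional vector per image. Random vectors stand in for the encodings here, and choosing $\varepsilon$ as a quantile of the training distances is my own assumption, not something prescribed above.

```python
import numpy as np

def fit_ball(normal_codes, quantile=0.99):
    """Estimate the center u and radius epsilon from encodings of normal images."""
    codes = np.asarray(normal_codes, dtype=np.float64)
    center = codes.mean(axis=0)  # u: average of the normal encodings
    distances = np.linalg.norm(codes - center, axis=1)
    radius = float(np.quantile(distances, quantile))  # epsilon (assumed rule)
    return center, radius

def is_outlier(code, center, radius):
    """A point is normal if it lies strictly inside the ball, else an outlier."""
    distance = np.linalg.norm(np.asarray(code, dtype=np.float64) - center)
    return bool(distance >= radius)

# Stand-in for encoder outputs: 500 normal images encoded into d = 8 dimensions.
rng = np.random.default_rng(0)
normal_codes = rng.normal(size=(500, 8))
center, radius = fit_ball(normal_codes)

near = is_outlier(center, center, radius)          # the center itself
far = is_outlier(center + 100.0, center, radius)   # a point far from all normals
```

With a 0.99 quantile, roughly 1% of the normal training images fall outside the ball by construction, so the quantile directly controls the false-alarm rate on the training data.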

That's just an idea, I think there are plenty of possible ways to approach the problem.


If you use supervised learning, you need to weight your labels (here).
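One common way to weight the labels is inverse class frequency, so that the rare class counts more in the loss. The sketch below is an illustration under that assumption; the label names and the 95/5 split are invented, and the resulting dictionary would still have to be passed to whatever training routine you use.

```python
from collections import Counter

def balanced_class_weights(labels):
    """Weight each class by n_samples / (n_classes * class_count)."""
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {label: n_samples / (n_classes * count)
            for label, count in counts.items()}

# Hypothetical imbalanced dataset: 95 normal images, 5 leaking ones.
labels = ["normal"] * 95 + ["leak"] * 5
weights = balanced_class_weights(labels)
```

Here each leaking image gets a weight of 10.0 and each normal image a weight of about 0.53, so both classes contribute equally to the total loss.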

Otherwise, I strongly recommend excluding every leaking image from the training dataset for a CAE (convolutional autoencoder). If the anomaly is in your training dataset, your CAE will learn it as normal. This is an active research field and not trivial for real datasets, but if you have a static image of the pipe, it should be easy. I would suggest using traditional CV for it rather than a DL model, because you can interpret and fine-tune the result.
