Neural network architecture to automatically crop a photo of a paper sheet
Given an RGB photo of a paper sheet with text, I want to obtain an output image that is cropped and deskewed. Example of input:
I have tried non-AI tools (such as openCV.findContours) to find the 4 corners of the sheet, but this is not very robust under some lighting conditions, or when there are other elements in the photo.
So I see two options:
1. A NN with input=image, output=image, that does everything (including the deskewing, and even the brightness adjustment). I'll just train it with thousands of images.
2. A NN with input=image, output=coordinates_of_4_corners. Then I'll do the cropping + deskewing with a homographic transform, and the brightness adjustment with standard non-AI tools.
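To make the second option concrete, here is a rough sketch of the post-processing I have in mind once the 4 corners are known. It is only an illustration under assumptions: the corners are ordered top-left, top-right, bottom-right, bottom-left, the function names (homography_from_corners, crop_and_deskew) and the output size are placeholders of mine, and it uses plain NumPy with nearest-neighbour sampling. In practice I would probably use OpenCV's cv2.getPerspectiveTransform and cv2.warpPerspective for this step.

```python
import numpy as np


def homography_from_corners(src, dst):
    """Solve for the 3x3 matrix H that maps 4 src points onto 4 dst points."""
    rows, rhs = [], []
    for (x, y), (u, v) in zip(src, dst):
        # u = (h0*x + h1*y + h2) / (h6*x + h7*y + 1), and likewise for v
        rows.append([x, y, 1, 0, 0, 0, -u * x, -u * y])
        rows.append([0, 0, 0, x, y, 1, -v * x, -v * y])
        rhs.extend([u, v])
    h = np.linalg.solve(np.array(rows, dtype=float), np.array(rhs, dtype=float))
    return np.append(h, 1.0).reshape(3, 3)


def crop_and_deskew(image, corners, out_w, out_h):
    """Warp the quadrilateral `corners` (TL, TR, BR, BL) to an out_h x out_w image."""
    dst = [(0, 0), (out_w - 1, 0), (out_w - 1, out_h - 1), (0, out_h - 1)]
    # Inverse mapping: for every output pixel, find its source location in the input.
    h_inv = homography_from_corners(dst, corners)
    us, vs = np.meshgrid(np.arange(out_w), np.arange(out_h))
    pts = np.stack([us.ravel(), vs.ravel(), np.ones(us.size)])
    x, y, w = h_inv @ pts
    # Nearest-neighbour sampling, clamped to the image borders.
    xs = np.clip(np.rint(x / w).astype(int), 0, image.shape[1] - 1)
    ys = np.clip(np.rint(y / w).astype(int), 0, image.shape[0] - 1)
    return image[ys, xs].reshape(out_h, out_w, *image.shape[2:])
```

With this split, the network would only have to predict 8 numbers (the corner coordinates), and the geometric part stays deterministic.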
Which approach would you use?
For approach #1 (input=image, output=image), what kind of NN/CNN architecture would you use?
Is approach #2 (input=image, output=coordinates) feasible? Or is there another (e.g. segmentation-based) method you would use here?