Orange3 and OCR Help

I would like to use Orange to OCR images with pytesseract.

I have written simple code in the Python Script widget that reads one image at a time. Instead, I want to load the images with the Import Images widget and use the Python Script widget only to read them and produce an output.

For the output, I would like to save the text either as separate .txt files or as a .tab file that I can load as a corpus for text mining.
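To make the two output options concrete, here is a rough sketch that uses only the standard library. The function names write_txt_files and write_tab_file and the column names "name" and "text" are made up for illustration. The three header rows (names, types, roles) reflect my understanding of Orange's tab-delimited format and are an assumption worth checking against the Orange documentation.

```python
import csv
import os


def write_txt_files(rows, out_dir):
    """Write one .txt file per (name, text) pair and return the paths."""
    os.makedirs(out_dir, exist_ok=True)
    paths = []
    for name, text in rows:
        # Reuse the image file name, swapping its extension for .txt.
        stem = os.path.splitext(os.path.basename(name))[0]
        path = os.path.join(out_dir, stem + ".txt")
        with open(path, "w", encoding="utf-8") as f:
            f.write(text)
        paths.append(path)
    return paths


def write_tab_file(rows, path):
    """Write (name, text) pairs as a tab-separated file with a 3-row header."""
    with open(path, "w", encoding="utf-8", newline="") as f:
        writer = csv.writer(f, delimiter="\t")
        writer.writerow(["name", "text"])      # column names
        writer.writerow(["string", "string"])  # column types (assumed)
        writer.writerow(["meta", "meta"])      # column roles (assumed)
        for name, text in rows:
            # Collapse line breaks so each document stays on one row.
            writer.writerow([name, " ".join(text.split())])
```

Something like write_tab_file([("scan1.png", "some OCR text")], "ocr.tab") would then produce a file that can be tried in the Corpus widget. Whether the text column is picked up as a text feature automatically is something I have not verified.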

If someone could please help, I would greatly appreciate it. I am new to coding.

Here is what I have come up with so far (I am currently getting an error on the text line):

import os
import numpy as np
import pytesseract as pt

import Orange
from Orange.data import domain

Orange.data.domain = in_data
for object in in_data:
    object = np.array(object)
    text = pt.image_to_string(object, lang = 'eng')


    out_data = Corpus.from_pt(domain=new_domain)

print (out_data.domain)
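For comparison, here is a minimal sketch of one direction this could take. It is untested against a real workflow and rests on several assumptions. It assumes the Import Images output stores each file path in a meta column named "image", relative to a folder recorded in that column's "origin" attribute. It also assumes Pillow is installed. The helper names image_paths and ocr_table, and the output columns "name" and "text", are invented for this example.

```python
import os


def image_paths(names, origin):
    """Join each relative image name onto the folder the images came from."""
    return [os.path.join(origin, str(name)) for name in names]


def ocr_table(in_data, image_column="image", lang="eng"):
    """OCR every image referenced by in_data and return a two-column table."""
    # Imported lazily so image_paths can be used where these are not installed.
    import numpy as np
    import pytesseract as pt
    from PIL import Image
    from Orange.data import Domain, StringVariable, Table

    # Assumption: paths live in a meta column called "image", relative to
    # the folder stored in that column's "origin" attribute.
    column = in_data.domain[image_column]
    origin = column.attributes.get("origin", "")
    names = [str(row[image_column]) for row in in_data]

    rows = []
    for name, path in zip(names, image_paths(names, origin)):
        # Open the file itself; the table row only holds its path.
        with Image.open(path) as img:
            rows.append([name, pt.image_to_string(img, lang=lang)])

    # Both output columns are string metas; there are no numeric features.
    domain = Domain([], metas=[StringVariable("name"), StringVariable("text")])
    metas = np.array(rows, dtype=object).reshape(len(rows), 2)
    return Table.from_numpy(domain, np.empty((len(rows), 0)), metas=metas)


# Inside the Python Script widget, in_data is provided by Orange:
# out_data = ocr_table(in_data)
```

The main difference from the code above is that each row of in_data is treated as a reference to an image file rather than as pixel data, so the file is opened before being passed to image_to_string. The result is returned as a plain Table; turning it into a Corpus is left to the Corpus widget or the text add-on.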
