Orange3 and OCR Help
I would like to use Orange to run OCR on images with pytesseract.
I have written simple code in the Python Script widget that reads one image at a time. What I want instead is to load the images with the Import Images widget and use the Python Script widget only to read them and produce an output.
For the output, I would like to save the text either as separate .txt files or as a single .tab file that I can load as a corpus for text mining.
If someone could please help, I would greatly appreciate it. I am new to coding.
Here is what I have come up with so far (I am currently getting an error on the text line):
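import os
import numpy as np
import pytesseract as pt
from PIL import Image
from Orange.data import Domain, StringVariable, Table

# in_data comes from the Import Images widget. Assumption: its "image" meta
# column holds file paths relative to the folder stored in attributes["origin"].
image_var = in_data.domain["image"]
origin = image_var.attributes.get("origin", "")
rows = []
for row in in_data:
    # image_to_string needs an image (or a file path), not an Orange data row
    path = os.path.join(origin, str(row[image_var]))
    text = pt.image_to_string(Image.open(path), lang="eng")
    rows.append([os.path.basename(path), text])

# One output row per image, with the file name and OCR text as string metas
new_domain = Domain([], metas=[StringVariable("image name"), StringVariable("text")])
out_data = Table.from_numpy(new_domain, np.empty((len(rows), 0)), metas=np.array(rows, dtype=object).reshape(-1, 2))
print(out_data.domain)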
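For the saving step, here is a minimal sketch of both output options, assuming the OCR results have already been collected as (image name, text) pairs. The helper names `save_txt_files` and `save_tab_file` and the column names are hypothetical, chosen only for illustration. The .tab variant writes a three-row header (names, types, roles), which is my understanding of Orange's native tab format, and collapses line breaks so each document stays on a single row.

```python
import csv
from pathlib import Path


def save_txt_files(results, out_dir):
    """Write one UTF-8 .txt file per image, named after the image file."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    paths = []
    for name, text in results:
        path = out_dir / (Path(name).stem + ".txt")
        path.write_text(text, encoding="utf-8")
        paths.append(path)
    return paths


def save_tab_file(results, tab_path):
    """Write a tab-delimited file with a three-row header (names, types, roles)."""
    with open(tab_path, "w", newline="", encoding="utf-8") as f:
        writer = csv.writer(f, delimiter="\t")
        writer.writerow(["image name", "text"])  # column names
        writer.writerow(["string", "string"])    # column types
        writer.writerow(["meta", "meta"])        # column roles
        for name, text in results:
            # Collapse all whitespace so tabs and newlines cannot break the row.
            writer.writerow([name, " ".join(text.split())])
    return Path(tab_path)
```

Inside the Python Script widget, `results` could be built in the OCR loop, for example by appending `(file_name, text)` for each image, and then passed to either helper with a folder or file path of your choice.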
Topic ocr orange3 orange python
Category Data Science