Keyword extraction with Orange

I’m very new to the Orange Data Mining software and I’m having a hard time finding what I’m looking for. I have 30,000 text files. I want to use Orange to extract keywords and phrases, then show me which documents contain the words and phrases I’m looking for. I also want to scan PDFs and images to obtain words and phrases. Any guidance and/or Orange workflows would be much appreciated.

Topic orange data-mining

Category Data Science


There are several ways to build this workflow. One is to use the Extract Keywords widget from the Text add-on to retrieve relevant keywords (using TF-IDF or YAKE), then use Semantic Viewer to find the documents in which those words appear, and where. Another is to use Preprocess Text to keep only the words you are interested in; you can provide a custom lexicon in its filter section. Then use Word Cloud to display keyword frequencies, click the keyword of interest, and connect Corpus Viewer to Word Cloud to inspect the documents containing the selected keyword.
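If it helps to see what the first option does conceptually, here is a minimal plain-Python sketch of TF-IDF keyword scoring followed by a lookup of the documents that contain a chosen keyword. This is not Orange's API: it uses only the standard library, and the names `tokenize`, `tfidf_keywords`, `documents_containing`, the tiny `STOPWORDS` set, and the three sample documents are all made up for illustration.

```python
import math
import re
from collections import Counter

# Hypothetical, deliberately tiny stopword list (an assumption for this sketch).
STOPWORDS = {"a", "the", "is", "to", "with"}


def tokenize(text):
    """Lowercase the text, split it into alphabetic tokens, drop stopwords."""
    return [w for w in re.findall(r"[a-z]+", text.lower()) if w not in STOPWORDS]


def tfidf_keywords(docs, top_n=3):
    """Return up to top_n highest-scoring TF-IDF words for each document."""
    tokenized = [tokenize(d) for d in docs]
    # Document frequency: the number of documents each word occurs in.
    df = Counter(word for tokens in tokenized for word in set(tokens))
    n_docs = len(docs)
    result = []
    for tokens in tokenized:
        counts = Counter(tokens)
        total = len(tokens) or 1  # avoid division by zero for empty documents
        scores = {
            w: (c / total) * math.log(n_docs / df[w]) for w, c in counts.items()
        }
        # Highest score first; ties are broken alphabetically.
        ranked = sorted(scores, key=lambda w: (-scores[w], w))
        result.append(ranked[:top_n])
    return result


def documents_containing(docs, keyword):
    """Return the indices of documents whose tokens include the keyword."""
    keyword = keyword.lower()
    return [i for i, d in enumerate(docs) if keyword in tokenize(d)]


# Made-up sample corpus standing in for the text files.
docs = [
    "Orange is a data mining tool with a visual workflow",
    "The Text add-on adds keyword extraction to the workflow",
    "A word cloud shows keyword frequency",
]
keywords = tfidf_keywords(docs, top_n=2)
hits = documents_containing(docs, "keyword")
```

Here `keywords` holds one short keyword list per document and `hits` holds the indices of the documents that mention "keyword", which is roughly the information Semantic Viewer or Corpus Viewer presents visually. For 30,000 files you would read the texts from disk rather than hard-code them.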
