Suggestion for a better way to organize data to generate frequent item-sets?
I have bag-of-words data for a set of documents. The data has 3 columns: {document number, word number, count of the word in the document}. I am supposed to generate frequent item-sets of a particular size.
I thought that I would make a list of all the words that appear in each document, create a table from these lists, and then generate frequent item-sets using Mlxtend or Orange. However, this approach does not seem to be efficient.
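To make the idea concrete, here is a minimal sketch of that approach using only the Python standard library, not Mlxtend or Orange. It assumes the rows are available as (document number, word number, count) tuples, that a word counts as present in a document whenever its count is positive, and that support is measured as a number of documents. The function name `frequent_itemsets`, the `min_count` parameter, and the toy data are all made up for illustration. Words that are not frequent on their own are dropped before any combinations are built, since such a word cannot be part of a frequent item-set.

```python
from collections import Counter, defaultdict
from itertools import combinations


def frequent_itemsets(triples, size, min_count):
    """Return {itemset: number of documents containing it} for itemsets of `size`.

    Hypothetical helper: `triples` is an iterable of
    (document number, word number, count) rows.
    """
    # Group word numbers by document; a positive count only marks presence.
    docs = defaultdict(set)
    for doc_id, word_id, count in triples:
        if count > 0:
            docs[doc_id].add(word_id)

    # Pruning step: a word seen in fewer than min_count documents
    # cannot belong to any frequent itemset, so discard it early.
    word_support = Counter(w for words in docs.values() for w in words)
    frequent_words = {w for w, c in word_support.items() if c >= min_count}

    # Count every candidate itemset of the requested size, per document.
    itemset_support = Counter()
    for words in docs.values():
        kept = sorted(words & frequent_words)  # sorted so tuples are canonical
        itemset_support.update(combinations(kept, size))

    return {s: c for s, c in itemset_support.items() if c >= min_count}


# Made-up toy data: (document number, word number, count)
triples = [
    (1, 10, 2), (1, 20, 1), (1, 30, 4),
    (2, 10, 1), (2, 20, 3),
    (3, 10, 5), (3, 30, 1),
    (4, 20, 2), (4, 30, 1),
]
result = frequent_itemsets(triples, size=2, min_count=2)
print(result)
```

The per-document word sets built in the first step are also the list-of-transactions shape that Mlxtend's `TransactionEncoder` takes as input, so the same grouping could feed that library instead. Enumerating combinations this way can still blow up for long documents and larger item-set sizes, which may be the inefficiency I am running into.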
Topic orange3 orange text-mining data-mining
Category Data Science