How to justify logarithmically scaled frequency for tf in tf-idf?
I am studying tf-idf (term frequency - inverse document frequency). The original definition of tf is straightforward: count of term t in the document / total number of terms in the document.
However, I came across the log scaled frequency: log(1 + count of term t in the document). Please refer to Wikipedia.
It does not include the total number of terms in the document. For example, suppose document 1 has 10 words in total and one of them is "happy". Using the original definition, tf(happy)=1/10=0.1. Document 2 also contains "happy" once, but it has 1,000 words in total, so tf(happy)=1/1000=0.001. The tf(happy) of document 1 is 100 times that of document 2.
However, if we use the log scaled frequency, both are log(1+1), regardless of the length of documents (one only has 10 words, while the other has 1,000).
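To make the comparison concrete, here is a minimal Python sketch of the two definitions applied to the example above. It assumes the natural logarithm, since the base is not specified here, and the function names `tf_relative` and `tf_log` are made up for illustration.

```python
import math

def tf_relative(count, total_terms):
    # Original definition: count of the term divided by document length.
    return count / total_terms

def tf_log(count):
    # Log-scaled definition: depends only on the raw count.
    # Natural log is an assumption; another base only rescales the value.
    return math.log(1 + count)

# "happy" appears once in a 10-word document and once in a 1,000-word document.
rel_doc1 = tf_relative(1, 10)
rel_doc2 = tf_relative(1, 1000)
log_doc1 = tf_log(1)
log_doc2 = tf_log(1)

print("relative tf:", rel_doc1, rel_doc2)
print("log-scaled tf:", log_doc1, log_doc2)
```

The relative values differ by a factor of 100, while the log-scaled values are identical because document length never enters the formula.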
How can ignoring document length like this be justified? Thanks.