TF-IDF similarity with Twitter stream
I have collected many tweets using twitter4j and saved them into different text files. Now I want to split them into time windows of 10 days each and, for each window, select its top 100 tokens ranked by TF-IDF. How can I do this in Java? Is it necessary to create a Lucene index?
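For illustration, here is a minimal in-memory sketch of one way to approach this without Lucene, assuming the tweets fit in memory. It treats each 10-day window as a single "document": TF is a token's count divided by the total tokens in that window, and IDF is `ln(N / df)`, where `N` is the number of windows and `df` is the number of windows containing the token. The class name `WindowTfIdf`, its methods, and the crude regex tokenizer are all hypothetical choices made for this example, not part of twitter4j or Lucene.

```java
import java.time.LocalDate;
import java.time.temporal.ChronoUnit;
import java.util.*;
import java.util.stream.Collectors;

// Hypothetical helper class; names and tokenizer are assumptions for this sketch.
class WindowTfIdf {

    /** Index of the window a tweet falls into, counting from the first day collected. */
    static int windowIndex(LocalDate start, LocalDate tweetDate, int windowDays) {
        return (int) (ChronoUnit.DAYS.between(start, tweetDate) / windowDays);
    }

    /** Crude tokenizer: lowercase, split on anything that is not a letter or digit. */
    static List<String> tokenize(String text) {
        return Arrays.stream(text.toLowerCase().split("[^\\p{L}\\p{N}]+"))
                .filter(t -> !t.isEmpty())
                .collect(Collectors.toList());
    }

    /**
     * windows: one entry per time window, each holding the tweet texts of that window.
     * Returns, for each window, its k highest-scoring tokens by TF-IDF.
     */
    static List<List<String>> topTokens(List<List<String>> windows, int k) {
        List<Map<String, Integer>> counts = new ArrayList<>();
        Map<String, Integer> docFreq = new HashMap<>();

        // Pass 1: token counts per window, and number of windows containing each token.
        for (List<String> tweets : windows) {
            Map<String, Integer> tf = new HashMap<>();
            for (String tweet : tweets) {
                for (String tok : tokenize(tweet)) {
                    tf.merge(tok, 1, Integer::sum);
                }
            }
            counts.add(tf);
            for (String tok : tf.keySet()) {
                docFreq.merge(tok, 1, Integer::sum);
            }
        }

        // Pass 2: score every token in every window and keep the top k.
        int n = windows.size();
        List<List<String>> result = new ArrayList<>();
        for (Map<String, Integer> tf : counts) {
            int total = tf.values().stream().mapToInt(Integer::intValue).sum();
            Map<String, Double> score = new HashMap<>();
            for (Map.Entry<String, Integer> e : tf.entrySet()) {
                double idf = Math.log((double) n / docFreq.get(e.getKey()));
                score.put(e.getKey(), (e.getValue() / (double) total) * idf);
            }
            result.add(score.entrySet().stream()
                    // Highest score first; ties broken alphabetically for stable output.
                    .sorted(Map.Entry.<String, Double>comparingByValue().reversed()
                            .thenComparing(Map.Entry.<String, Double>comparingByKey()))
                    .limit(k)
                    .map(Map.Entry::getKey)
                    .collect(Collectors.toList()));
        }
        return result;
    }
}
```

With this sketch, the tweets would first be grouped into lists using `windowIndex(firstDay, tweetDate, 10)`, and `topTokens(windows, 100)` would then return the top 100 tokens per window. A token that appears in every window gets an IDF of zero under this formula, so smoothing the IDF or filtering stop words may be worth considering. A Lucene index is one alternative if the collection is too large for memory, but it is not strictly required for the computation itself.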
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
