Is it better to create just a single Kafka topic for streaming tweets or several depending on the subject of the tweet?

Imagine I want to build a website for monitoring people's opinions on certain subjects; let's call these subjects A, B and C. Some tweets mention both subject A and subject C, so I'd like to duplicate those tweets and classify them under both A and C.

Which would be better when it comes to creating Kafka topics for this: a single topic for all tweets, or separate topics A, B and C, each feeding its own table in S3 or similar, which I would then keep processing separately with Spark Streaming?
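To make the two options concrete, here is a minimal sketch of how a tweet might be routed in each case. It is only an illustration: the keyword lists, the topic names (`tweets`, `tweets-a`, and so on) and the helper functions are all hypothetical, and it only builds the (topic, record) pairs rather than calling a real Kafka producer.

```python
from typing import Dict, List, Tuple

# Hypothetical keyword lists standing in for whatever classifier decides
# that a tweet is about subject A, B or C.
SUBJECT_KEYWORDS: Dict[str, List[str]] = {
    "A": ["subject_a"],
    "B": ["subject_b"],
    "C": ["subject_c"],
}


def subjects_for(text: str) -> List[str]:
    """Return every subject whose keywords appear in the tweet text."""
    lowered = text.lower()
    return [
        subject
        for subject, keywords in SUBJECT_KEYWORDS.items()
        if any(keyword in lowered for keyword in keywords)
    ]


def route_single_topic(text: str) -> List[Tuple[str, dict]]:
    """Option 1: one record in one topic; subjects travel as a field."""
    return [("tweets", {"text": text, "subjects": subjects_for(text)})]


def route_per_subject(text: str) -> List[Tuple[str, dict]]:
    """Option 2: one copy of the tweet per matching subject topic."""
    return [
        (f"tweets-{subject.lower()}", {"text": text, "subject": subject})
        for subject in subjects_for(text)
    ]


if __name__ == "__main__":
    tweet = "I like subject_a more than subject_c"
    print(route_single_topic(tweet))
    print(route_per_subject(tweet))
```

In the first option a tweet about A and C is stored once and split downstream by the `subjects` field; in the second it is duplicated at write time, once into each subject's topic.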

apache-kafka twitter

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow
