Kafka Connect vs Apache NiFi

Good afternoon. My question is pretty simple: I'm new to Apache Kafka, but I'm working with it as part of my internship, which is why I'm asking.

I will provide as much context as I can, in the hope that someone can help me clear up my doubts.

I was first asked to develop a pipeline (or workflow) using Apache NiFi. The pipeline consisted of the following steps.

I fetched data from a local MySQL database using NiFi and sent it to a Kafka topic. The raw data was then cleaned using the Kafka client with Java (KStream, KTable and some regular expressions) and written to a second Kafka topic.
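To make the cleaning step concrete, here is a minimal sketch of the kind of regex-based cleanup I mean. The class name `RecordCleaner` and the specific rules (replacing control characters, collapsing whitespace) are only illustrative assumptions, not my actual code. It uses plain Java only, so it runs without the Kafka libraries.

```java
import java.util.regex.Pattern;

// Hypothetical helper: the name and the cleaning rules are assumptions for illustration.
final class RecordCleaner {
    // ASCII control characters such as tabs, carriage returns and newlines.
    private static final Pattern CONTROL_CHARS = Pattern.compile("\\p{Cntrl}");
    // One or more consecutive whitespace characters.
    private static final Pattern WHITESPACE_RUNS = Pattern.compile("\\s+");

    // Replaces control characters with spaces, collapses whitespace runs
    // into a single space, and trims both ends. Null input becomes "".
    static String clean(String raw) {
        if (raw == null) {
            return "";
        }
        String noControl = CONTROL_CHARS.matcher(raw).replaceAll(" ");
        return WHITESPACE_RUNS.matcher(noControl).replaceAll(" ").trim();
    }

    public static void main(String[] args) {
        System.out.println(clean("  John\t\tDoe \n")); // prints "John Doe"
    }
}
```

In a Kafka Streams topology, a function like this could presumably be applied to each record value with `KStream#mapValues(RecordCleaner::clean)` before writing to the output topic.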

Once the processing was done, the new data was read again using Apache NiFi and then written to a new MySQL table.

I have included a picture for better understanding: [image: General Pipeline]

After that, I was asked to do the same thing using Kafka Connect instead of Apache NiFi. This turned out to be even shorter: I only had to use a source connector to read the data from the MySQL database and send it to a Kafka topic, process it with the Kafka client with Java, and send the result to a new Kafka topic. Finally, a sink connector wrote the processed data from the new topic straight to a new table in the database.

Then someone in charge asked me when I should use Apache NiFi + Kafka instead of Kafka Connect + Kafka, and to be honest I have no idea.

So let's assume that the most important goal here is to apply data enrichment, and let's consider two scenarios:

  • when I have data from different sources, and none of it is streaming data
  • when I have data from different sources, and some of it is streaming data while some is not

In both cases, all of the data needs to be processed, integrated, cleaned and finally unified in order to apply data enrichment.

Given the context above, my questions are:

  • When should I use, or not use, NiFi with Kafka, and why?
  • When should I use, or not use, Kafka Connect with Kafka, and why?

I think I have a basic idea, and I have been reading in order to answer this for myself, but to be honest I haven't come up with an acceptable answer or a clear idea of when to use each one.

So, I would really appreciate your help.



Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow
