'NiFi - data stuck in queues when load balancing is used
In Apache NiFi, dockerized version 1.15, a cluster of 3 NiFi nodes is created. When load balancing is used via default port 6342, flow files get stuck in some of the queues, in the queue in which load balancing is enabled. But, when "List queue" is tried, the message "The queue has no FlowFiles." is issued:
The part of the NiFi processor group where the issue happens:
Configuration of NiFi queue in which flow files seem to be stuck:
Another problem, maybe not related, is that after this happens, some of the flow files reach the subsequent NiFi processors, but get stuck before the MergeContent processors. This time, the queues can be listed:
The part of code when the second issue occurs:
The part of code when the second issue occurs
The configuration of the queue:
The listing of the FlowFiles in the queue:
The MergeContent processor configuration. The parameter "max_num_for_merge_smxs" is set to 100:
Load balancing is used because data are gathered from the SFTP server, and that processor runs only on the Primary node.
If you need more information, please let me know.
Thank you in advance!
Edited:
I put the load-balancing queues between the ConsumeMQTT (working on the Primary node only) and UpdataAttribute processors, but Flow files are seemingly staying in the load-balancing queue, but when the listing is done, the message is "The queue has no FlowFiles.". Please check:
Changed position of the load-balancing queue:
The message that there are no flow files in the queues:
Take notice that the processors before and after the queue are stopped while doing "List queue".
Edit 2:
I changed the configuration in the nifi.properties to the following:
nifi.cluster.load.balance.connections.per.node=20
nifi.cluster.load.balance.max.thread.count=60
nifi.cluster.load.balance.comms.timeout=30 sec
I also restarted the NiFi containers, so I will monitor the behaviour. For now, there are no stuck Flow files in the load-balancing queues, they go to the processor that follows the queue.
Solution 1:[1]
"The queue has no FlowFiles" is normal behaviour of a queue that is feeding into a Merge - the flowfiles are pending to be merged.
The most likely cause of them being "stuck" before a Merge is that you have Round Robin distributed the FlowFiles across many nodes, and then you are setting a Minimum count on the Merge. This minimum is per node and there are not enough FlowFiles on each node to hit the Minimum, so they are stuck waiting for more FlowFiles to trigger the Merge.
-- Edit
"The queue has no FlowFiles" is also expected on a queue that is active - in your flow, the load balancing queue is drained immediately into the output queue of your merge PGs Input port - so there are no FFs sitting around in the load balancing queue. If you were to STOP the Input ports inside the merge PG, you should be able to list them on the LB queue.
It sounds like you are doing GetSFTP (Primary) and then distributing the files. The better approach would be to use ListSFTP (Primary) -> Load Balance -> FetchSFTP - this would avoid shuffling large files, and would instead load balance the file names between all nodes, with each node then fetching a subset of the files.
Secondly, I would review your Merge config - you have a parameter #{max_num_for_merge_xmsx} defined, but this set in the Minimum Number of Entries for the Merge - so you are telling Merge to only ever merge when at least #{max_num_for_merge_xmsx} amount of FlowFiles is reached.
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 |









