OpenCL FPGA: Two copies of the same kernel do not execute in parallel, and there is idle time between them
My goal is to compute two 4K-point FFTs at the same time. I therefore created two kernel objects from the same kernel and enqueued both tasks back to back, i.e. without any buffer reads/writes or callbacks in between. However, the two kernels do not run in parallel, and there is also some idle time between the two executions. Can someone please explain why?

I expected both of them to run in parallel because my FPGA still has area to spare: only about 38 percent of it is used.
Solution 1:[1]
I found this question that kind of answers my doubts. It can be found here.
Solution 2:[2]
An OpenCL command queue is in-order by default, so one kernel is executed after the other. This ensures that if kernel 2 reads memory that kernel 1 has updated, there is no race condition, as there could be if the two ran concurrently. There may also be some latency before a kernel starts executing, which may account for the idle time between them.
To run multiple kernels in parallel, you can try using multiple queues, one per kernel.
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | Raghuttam Hombal |
| Solution 2 | |
