'Checking get_global_id in OpenCL Kernel Necessary?
I have noticed a number of kernel sources that look like this (found randomly by Googling):
__kernel void fill(__global float* array, unsigned int arrayLength, float val)
{
if(get_global_id(0) < arrayLength)
{
array[get_global_id(0)] = val;
}
}
My question is if that if-statement is actually necessary (assuming that "arrayLength" in this example is the same as the global work size).
In some of the more "professional" kernels I have seen, it is not present. It also seems to me that the hardware would do well to not assign kernels to nonsense coordinates.
However, I also know that processors work in groups. Hence, I can imagine that some processors of a group must do nothing (for example if you have 1 group of size 16, and a work size of 41, then the group would process the first 16 work items, then then next 16, then the next 9, with 7 processors not doing anything--do they get dummy kernels?).
I checked the spec., and the only relevant mention of "get_global_id" is the same as the online documentation, which reads:
The global work-item ID specifies the work-item ID based on the number of global work-items specified to execute the kernel.
. . . based how?
So what is it? Is it safe to omit iff the array's size is a multiple of the work group size? What?
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|
