Can CUDA's rounding mode be set globally for a kernel?

CUDA's handling of floating-point rounding modes is discussed here, and various intrinsics such as __fadd_rn are available to perform floating-point operations with an explicit rounding mode (round-to-nearest in this case).

However, if I want to switch rounding modes for a block of code, this becomes unwieldy.

On the host side I can use fesetenv and friends to set the floating-point rounding mode for a thread.

Is there a way to set CUDA's floating-point rounding mode for a stream or a kernel?



Solution 1:[1]

In a word, no.

Different floating point rounding modes in CUDA are implemented as different instructions, rather than different FPU operation modes, as on some other hardware. The rounding mode is statically selected at compile time by either using the desired intrinsic or PTX instruction, or by directing the compiler to apply translation unit scope default rounding behaviour. Once the compiler and assembler are done, the floating point modes the code will use are baked into the code the GPU will run and can't be changed.
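
Because the mode is fixed per instruction at compile time, one way to get something close to "a rounding mode for a block of code" is to make the mode a template parameter and let the compiler instantiate the code once per mode. The following is only a minimal sketch of that idea: the Round enum, the add wrapper, and sum_kernel are hypothetical names invented for this illustration, not part of the CUDA API, and error checking is omitted for brevity.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical tag type for this sketch; not part of the CUDA API.
enum class Round { Nearest, Zero, Up, Down };

// Hypothetical wrapper. R is a compile-time constant, so each instantiation
// reduces to a single intrinsic with a fixed rounding mode.
template <Round R>
__device__ __forceinline__ float add(float a, float b) {
    switch (R) {
        case Round::Zero: return __fadd_rz(a, b);
        case Round::Up:   return __fadd_ru(a, b);
        case Round::Down: return __fadd_rd(a, b);
        default:          return __fadd_rn(a, b);
    }
}

// The whole summation loop uses the rounding mode chosen by the caller.
template <Round R>
__global__ void sum_kernel(const float* x, int n, float* out) {
    float acc = 0.0f;
    for (int i = 0; i < n; ++i) acc = add<R>(acc, x[i]);
    *out = acc;
}

int main() {
    const int n = 3;
    const float h_x[n] = {1.0f, 1e-8f, 1e-8f};
    float *d_x = nullptr, *d_out = nullptr;
    cudaMalloc(&d_x, n * sizeof(float));
    cudaMalloc(&d_out, 2 * sizeof(float));
    cudaMemcpy(d_x, h_x, n * sizeof(float), cudaMemcpyHostToDevice);

    // Same source, two rounding modes: one launch per instantiation.
    sum_kernel<Round::Down><<<1, 1>>>(d_x, n, d_out);
    sum_kernel<Round::Up><<<1, 1>>>(d_x, n, d_out + 1);

    float h_out[2];
    cudaMemcpy(h_out, d_out, sizeof(h_out), cudaMemcpyDeviceToHost);
    printf("rounded down: %.9g\nrounded up:   %.9g\n", h_out[0], h_out[1]);

    cudaFree(d_x);
    cudaFree(d_out);
    return 0;
}
```

The mode is still chosen at compile time, which is consistent with the answer above: each instantiation is a separate kernel with its rounding baked in, and the host picks which one to launch.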

Hypothetically, I suppose it might be possible to use a runtime-triggered JIT pass to have the driver transform code to different rounding modes. But that facility does not exist today.

Solution 2:[2]

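As mentioned by @talonmies, the short answer is no. However, CUDA provides lower-level arithmetic functions, called intrinsics, that let you choose the rounding mode per operation. Unlike the regular operators, the explicitly rounded intrinsics such as __fadd_rn() are never merged by the compiler into a fused multiply-add instruction. They are not less precise: each one returns the correctly rounded result for its rounding mode. (The reduced-accuracy intrinsics are a different group, such as __sinf() and __fdividef().) The drawback is that you need to rewrite your code to call these functions. Here is an example: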

// Regular add; the compiler may fuse it with a neighbouring multiply
float c = a + b;

// Intrinsic add, round-to-nearest; never fused into a multiply-add
float d = __fadd_rn(a, b);

As you can see, __fadd_rn() is an addition with a round-to-nearest rounding mode. The available rounding modes are:

__fadd_rn(a, b); // Round to nearest (ties to even)
__fadd_rz(a, b); // Round toward zero
__fadd_ru(a, b); // Round up (toward positive infinity)
__fadd_rd(a, b); // Round down (toward negative infinity)
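
To see the four modes disagree, it helps to add two values whose exact sum is not representable as a float. The following is a small sketch rather than a definitive test: the kernel name add_all_modes and the input values are chosen for illustration only, and error checking is omitted.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Writes a + b under each of the four rounding modes into out[0..3].
__global__ void add_all_modes(float a, float b, float* out) {
    out[0] = __fadd_rn(a, b);
    out[1] = __fadd_rz(a, b);
    out[2] = __fadd_ru(a, b);
    out[3] = __fadd_rd(a, b);
}

int main() {
    float* d_out = nullptr;
    cudaMalloc(&d_out, 4 * sizeof(float));

    // 1e-10f is far below half the spacing of floats near 1.0f (about 6e-8),
    // so the exact sum lies strictly between 1.0f and the next float above it.
    add_all_modes<<<1, 1>>>(1.0f, 1e-10f, d_out);

    float h[4];
    cudaMemcpy(h, d_out, sizeof(h), cudaMemcpyDeviceToHost);
    printf("rn=%.9g rz=%.9g ru=%.9g rd=%.9g\n", h[0], h[1], h[2], h[3]);
    // Only round-up lands on the next float above 1.0f; the other three
    // modes return 1.0f for these inputs.

    cudaFree(d_out);
    return 0;
}
```

With a negative sum the roles of round-up and round-down swap, which is why interval arithmetic typically pairs __fadd_rd() for lower bounds with __fadd_ru() for upper bounds.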

You can find all of this and much more in CUDA's Math API documentation.

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1
Solution 2