CUDA: "missing return statement at end of non-void function" in a function using if constexpr

When I compile the following test code, I get this warning:

test.cu(49): warning: missing return statement at end of non-void function "AllocateSize<T,D>(size_t) noexcept [with T=int, D=Device::GPU]"
          detected during instantiation of "Pointer<T, D> AllocateSize<T,D>(size_t) noexcept [with T=int, D=Device::GPU]" 
(61): here

Should I be concerned about it, and is this expected? What can I do to make it go away? This seems odd, since CUDA does support C++17. Thanks in advance!

Compiled with: nvcc -std=c++17 test.cu -o test

The test code (test.cu):

enum class Device { CPU, GPU }; // Device

template <typename T, Device D>
class Pointer {
private:
    T* m_raw = nullptr;
    
public:
    __host__ __device__ inline Pointer(T* const p)              noexcept { this->SetPointer(p); }

    __host__ __device__ inline void SetPointer(const Pointer<T, D>& o) noexcept { this->m_raw = o.m_raw; }

    template <typename U>
    __host__ __device__ inline Pointer<U, D> AsPointerTo() const noexcept {
        return Pointer<U, D>(reinterpret_cast<U*>(this->m_raw));
    }

    __host__ __device__ inline operator T*& () noexcept { return this->m_raw; }
}; // Pointer<T, D>

template <typename T>
using CPU_Ptr = Pointer<T, Device::CPU>;

template <typename T>
using GPU_Ptr = Pointer<T, Device::GPU>;

template <typename T, Device D>
__host__ inline Pointer<T, D> AllocateSize(const size_t size) noexcept {
    if constexpr (D == Device::CPU) {
        return CPU_Ptr<T>(reinterpret_cast<T*>(std::malloc(size)));
    } else {
        T* p;
        cudaMalloc(reinterpret_cast<void**>(&p), size);
        return GPU_Ptr<T>(p);
    }
}

template <typename T, Device D>
__host__ inline void Free(const Pointer<T, D>& p) noexcept {
    if constexpr (D == Device::CPU) {
        std::free(p);
    } else {
        cudaFree(p.template AsPointerTo<void>());
    }
}

int main() { Free(AllocateSize<int, Device::GPU>(1024)); }
  • CUDA release 11.1
  • Ubuntu Based Linux Distro


Solution 1:[1]

Our compiler team looked at the issue. The warning is a spurious warning. We hope to address it in a future CUDA release. I won't be able to respond to questions about when that may be. The code is being generated correctly.

If you wish to silence the warning, a possible method is to add an additional return statement at the end of the function in question. This should have no effect in this case because it is unreachable:

template <typename T, Device D>
__host__ inline Pointer<T, D> AllocateSize(const size_t size) noexcept {
    if constexpr (D == Device::CPU) {
        return CPU_Ptr<T>(reinterpret_cast<T*>(std::malloc(size)));
    } else {
        T* p;
        cudaMalloc(reinterpret_cast<void**>(&p), size);
        return GPU_Ptr<T>(p);
    }
    return Pointer<T, D>(NULL);
}


Solution 2:[2]

This appears to still be an issue in CUDA 11.4. I had a similar situation where the return type was auto, so adding an extra return at the end of the function does not work. However, CUDA 11.3 added support for __builtin_unreachable, which does fix the issue without totally squashing an otherwise useful warning.
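A minimal sketch of that approach, written as plain C++17 host code so it can be tried without a GPU. DefaultAlignment and its return values are hypothetical names chosen for illustration only, and __builtin_unreachable is a GCC/Clang-style builtin (available in nvcc from 11.3 according to this answer), so other compilers may need a different spelling:

```cpp
#include <cstddef>

enum class Device { CPU, GPU };

// Hypothetical helper (not from the question): the return type is deduced,
// so a dummy trailing `return` of some other type would not be an option.
template <Device D>
auto DefaultAlignment() noexcept {
    if constexpr (D == Device::CPU) {
        return std::size_t{16};
    } else {
        return std::size_t{256};
    }
    // Never executed: every if constexpr branch above returns.
    // This only tells the compiler that control cannot reach the end.
    __builtin_unreachable();
}
```

Because __builtin_unreachable() is an ordinary statement rather than a return, it does not take part in return type deduction, which is why it can be used where an extra return cannot. Calling DefaultAlignment<Device::GPU>() yields 256 here.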

Solution 3:[3]

Adding a gratuitous return statement at the end of the function (@RobertCrovella's answer) doesn't work if the return type of the function is auto, because it triggers another spurious diagnostic, a "conflicting types" error.

What worked for me is to pass this option to the front end:

-Xcudafe "--diag_suppress=implicit_return_from_non_void_function"

It took me a while to discover the right option name; this message catalog helped: http://www.ssl.berkeley.edu/~jimm/grizzly_docs/SSL/opt/intel/cc/9.0/lib/locale/en_US/mcpcom.msg


CORRECTION: the pragma-based workaround below (struck out in the original answer) doesn't suppress the warning. It seems this is because the function is a template and the pragma is applied before the template is instantiated.

For a localized suppression of the warning, use __builtin_unreachable if available (see the other answer).

UPDATE: A more specific workaround

As @RobertCrovella commented, suppressing the warning globally can make it ignore legitimate cases where the warning is meaningful. So a localized alternative is preferable:

#if defined(__NVCC__)
#pragma push
#pragma diag_suppress 0117 //"implicit_return_from_non_void_function"
#endif
template <typename T, Device D>
__host__ inline Pointer<T, D> AllocateSize(const size_t size) noexcept {
    if constexpr (D == Device::CPU) { ... } else { ... }
}
#if defined(__NVCC__)
#pragma pop
#endif

I found the error code here: http://www.ece.ualberta.ca/~cmpe490/documents/ghs/405/c_error_ref.pdf . Unfortunately, information about these NVCC compiler options is difficult to find.

Solution 4:[4]

My other answer suppressed the warning globally from command line.

The code below makes the suppression local, and it suppresses the warning not only for nvcc but also for nvc++. The nv_-prefixed pragmas (nv_diagnostic, nv_diag_suppress) were introduced in nvcc around 11.4 in place of the unprefixed ones (diagnostic, diag_suppress).

#include <type_traits> // std::make_unsigned_t, std::is_signed_v

#if defined __NVCC__
    #ifdef __NVCC_DIAG_PRAGMA_SUPPORT__
        #pragma nv_diagnostic push
        #pragma nv_diag_suppress = implicit_return_from_non_void_function
    #else
        #pragma    diagnostic push
        #pragma    diag_suppress = implicit_return_from_non_void_function
    #endif
#elif defined __NVCOMPILER
    #pragma    diagnostic push
    #pragma    diag_suppress = implicit_return_from_non_void_function
#endif
template< class T, class U >
constexpr auto comp_equal(T t, U u) noexcept -> bool {
    using UT = std::make_unsigned_t<T>;
    using UU = std::make_unsigned_t<U>;
    if constexpr (std::is_signed_v<T> == std::is_signed_v<U>) {
        return t == u;
    } else if constexpr (std::is_signed_v<T>) {
        return t < 0 ? false : UT(t) == u;
    } else {
        return u < 0 ? false : t == UU(u);
    }
}
#if defined __NVCC__
    #ifdef __NVCC_DIAG_PRAGMA_SUPPORT__
        #pragma nv_diagnostic pop
    #else
        #pragma    diagnostic pop
    #endif
#elif defined __NVCOMPILER
    #pragma    diagnostic pop
#endif

The problem is that GCC (the host compiler behind nvcc) complains that it doesn't recognize the pragmas, which forces me to globally disable the GCC warning -Wunknown-pragmas.

I tried surrounding all these with

#pragma GCC diagnostic push
#pragma GCC diagnostic ignored "-Wunknown-pragmas"
...
#pragma GCC diagnostic pop

but it didn't work; GCC keeps complaining.

If you know of a way to simplify this, please let me know.


I found this library very useful as reference: https://github.com/nemequ/hedley/blob/master/hedley.h
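One way to cut down the repetition, in the spirit of the macro-based approach that libraries such as Hedley take, is to wrap the pragmas in macros using the standard _Pragma operator. The sketch below is untested with nvcc and nvc++. The macro names (MY_DIAG_PUSH, MY_DIAG_POP, MY_DIAG_SUPPRESS_IMPLICIT_RETURN) and the Magnitude function are hypothetical, and the pragma strings are copied from the code above. It only shortens the call sites; it is not claimed to fix the GCC -Wunknown-pragmas complaint:

```cpp
#include <type_traits>

// Hypothetical macro names; they are not part of CUDA, nvc++ or Hedley.
#if defined(__NVCC__) && defined(__NVCC_DIAG_PRAGMA_SUPPORT__)
    // Newer nvcc: nv_-prefixed pragmas.
    #define MY_DIAG_PUSH _Pragma("nv_diagnostic push")
    #define MY_DIAG_POP  _Pragma("nv_diagnostic pop")
    #define MY_DIAG_SUPPRESS_IMPLICIT_RETURN \
        _Pragma("nv_diag_suppress = implicit_return_from_non_void_function")
#elif defined(__NVCC__) || defined(__NVCOMPILER)
    // Older nvcc, or nvc++: unprefixed pragmas.
    #define MY_DIAG_PUSH _Pragma("diagnostic push")
    #define MY_DIAG_POP  _Pragma("diagnostic pop")
    #define MY_DIAG_SUPPRESS_IMPLICIT_RETURN \
        _Pragma("diag_suppress = implicit_return_from_non_void_function")
#else
    // Any other compiler: the macros expand to nothing.
    #define MY_DIAG_PUSH
    #define MY_DIAG_POP
    #define MY_DIAG_SUPPRESS_IMPLICIT_RETURN
#endif

MY_DIAG_PUSH
MY_DIAG_SUPPRESS_IMPLICIT_RETURN
// Hypothetical example function: absolute value as an unsigned quantity.
template <class T>
constexpr auto Magnitude(T v) noexcept {
    if constexpr (std::is_signed_v<T>) {
        using U = std::make_unsigned_t<T>;
        return v < 0 ? U(0) - U(v) : U(v);
    } else {
        return v;
    }
}
MY_DIAG_POP
```

Each function that needs the suppression then carries three short macro lines instead of the full preprocessor ladder, and compilers that know none of these pragmas never see them.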

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1
Solution 2 alfC
Solution 3
Solution 4