Wrong result of multiplication: Undefined behavior or compiler bug?

Background

While debugging a problem in a numerical library, I was able to pinpoint the first place where the numbers started to become incorrect. However, the C++ code itself seemed correct. So I looked at the assembly produced by Visual Studio's C++ compiler and started suspecting a compiler bug.

Code

I was able to reproduce the behavior in a heavily simplified, isolated version of the code:

sourceB.cpp:

double alwaysOneB(double a[3]) {
    return 1.0;
}

main.cpp:

#include <iostream>

__declspec(noinline)
bool alwaysTrue() {
    return true;
}

__declspec(noinline)
double alwaysOneA(const double a[3]) {
    return 1.0;
}

double alwaysOneB(double a[3]); // implemented in sourceB.cpp

int main() {
    double* result = new double[2];

    if (alwaysTrue()) {
        double v[3];
        v[0] = 0.0;
        v[1] = 0.0;
        v[2] = 0.0;

        alwaysOneB(v);

        double d = alwaysOneA(v); // d = 1

        std::cout << "d = " << d << std::endl; // output: "d = 1" (as expected)

        result[0] = d * v[2];
        result[1] = d * d; // should be: 1 * 1 => 1 
    }
    if (alwaysTrue()) {
        std::cout << "result[1] = " << result[1] << std::endl; // observed: a garbage value, e.g. "result[1] = 2.23943e-47" (expected: 1)
    }

    delete[] result;
    return 0;
}

The code contains some bogus calls to other functions that are (unfortunately) necessary to reproduce the problem. However, the expected behavior should still be pretty clear. A value of 1.0 is assigned to the variable d, which is then multiplied by itself. That result should again be 1.0, which is written to an array and printed to the console. So the desired output is:

d = 1
result[1] = 1

However, the obtained output is (the value printed for result[1] is garbage and not stable):

d = 1
result[1] = 3.77013e+214

Test Environment

The code was tested with the C++ compiler that comes with Visual Studio Community 2019 (latest update, VS 16.11.9, VC++ 00435-60000-00000-AA327). The problem only occurs with optimizations activated (/O2). Compiling with /Od produces a binary that prints the correct output.

In the reduced example (but not in the original problem, where the full library is compiled) I also had to deactivate "Whole Program Optimization"; otherwise the compiler gets rid of my bogus function calls.

This reduced example only reproduces the problem when compiled for x86 (other examples reproduce the problem for x64).

The full compilation command line is as follows: /permissive- /ifcOutput "Release\" /GS /analyze- /W3 /Gy /Zc:wchar_t /Zi /Gm- /O2 /sdl /Fd"Release\vc142.pdb" /Zc:inline /fp:precise /D "WIN32" /D "NDEBUG" /D "_CONSOLE" /D "_UNICODE" /D "UNICODE" /errorReport:prompt /WX- /Zc:forScope /Gd /Oy- /Oi /MD /FC /Fa"Release\" /EHsc /nologo /Fo"Release\" /Fp"Release\DecimateBug2.pch" /diagnostics:column

Full Visual Studio solution to download: https://drive.google.com/file/d/1EyoX0uXEkvfJ_Fh649k9XjJQPdDUMik7/view?usp=sharing

Both the GNU compiler and Clang produce binaries that print the desired result.

Question

Is there any undefined behavior in this code that I am unable to see and that justifies an incorrect result? Or should I report this as a compiler bug?

Assembly produced by the compiler

For the two multiplication lines

        result[0] = d * v[2];
        result[1] = d * d;

the compiler produces the following assembly code:

00CF1432  movsd       xmm1,mmword ptr [esp+18h]   // Load d into first part of xmm1
00CF1438  unpcklpd    xmm1,xmm1                   // Load d into second part of xmm1
00CF143C  movups      xmm0,xmmword ptr [esp+30h]  // Load second operands into xmm0
00CF1441  mulpd       xmm0,xmm1                   // 2 multiplications at once
00CF1445  movups      xmmword ptr [esi],xmm0      // store result

Apparently it tries to perform the two multiplications at once using mulpd. In the first two lines it successfully loads the d operand into both halves of the xmm1 register (as the first operands). But when it tries to load both second operands (v[2] and d), it simply loads 128 bits from the address of v[2] (esp+30h). That is fine for the second operand of the first multiplication (v[2]), but not for the second multiplication (with d). Apparently the code assumes that d is located immediately after v in memory. However, it isn't. The variable d is never actually stored in memory; it seems to exist only in registers.

This makes me strongly suspect a compiler bug. However, I wanted to confirm that I am not missing any undefined behavior that justifies the incorrect assembly.

c++ visual-studio visual-studio-2019 undefined-behavior compiler-bug

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow
