Having an empty constructor that leaves arrays uninitialized makes calculation slower
I am very confused by one thing: if I add a constructor to struct A, the calculation in the for loop becomes many times slower. Why? I have no idea.
On my computer, the snippet outputs these times (in milliseconds):
With constructor: 1351
Without constructor: 220
Here is the code:
#include <iostream>
#include <chrono>
#include <cmath>
#include <cstdlib> // for system()
using namespace std;
using namespace std::chrono;
const int SIZE = 1024 * 1024 * 32;
using type = int;
struct A {
    type a1[SIZE];
    type a2[SIZE];
    type a3[SIZE];
    type a4[SIZE];
    type a5[SIZE];
    type a6[SIZE];
    A() {} // comment out this line and the loop runs several times faster
};
int main() {
    A* a = new A();
    int r;
    high_resolution_clock::time_point t1 = high_resolution_clock::now();
    for (int i = 0; i < SIZE; i++) {
        r = sin(a->a1[i] * a->a2[i] * a->a3[i] * a->a4[i] * a->a5[i] * a->a6[i]);
    }
    high_resolution_clock::time_point t2 = high_resolution_clock::now();
    cout << duration_cast<milliseconds>(t2 - t1).count() << ": " << r << endl;
    delete a;
    system("pause");
    return 0;
}
However, if I remove the sin() call from the for loop like this:
for (int i = 0; i < SIZE; i++) {
    r = a->a1[i] * a->a2[i] * a->a3[i] * a->a4[i] * a->a5[i] * a->a6[i];
}
then removing the constructor does not matter: the execution time is the same either way, 78 ms.
Do you see similar behaviour with this code? Do you know the reason for it?
EDIT: I compile it with Visual Studio 2013
Solution 1:[1]
Yes, this behavior is still reproducible in Visual Studio 2019 when compiling in the Release configuration (with optimization).
If struct A has an empty user-provided constructor, its fields remain uninitialized after new A().
On the other hand, if struct A has no user-provided constructor, it is an aggregate, and new A() value-initializes it, which fills its fields with zeros.
Computing the multiplications and then the sine takes the same time regardless of the input values (unless they are denormalized, which is not the case here). After the fields have been filled with zeros, however, their memory has already been touched and is in the CPU cache (with six 128 MiB arrays, it has likely also already been mapped by the operating system), so the computation that follows runs faster. That explains the "benefit" of the no-constructor version, provided the time of object construction is not included in the measurement.
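The difference in initialization described above can be checked on a small scale. The following is a minimal sketch, not the code from the question: WithCtor and NoCtor are hypothetical stand-ins for the two variants of struct A, and the helper value_init_zeroes_no_ctor is an illustrative name. It only reads the zero-initialized variant, because reading the fields left uninitialized by the empty constructor would be undefined behavior.

```cpp
#include <cstring>
#include <new>
#include <type_traits>

// Hypothetical small stand-ins for the two variants of struct A.
struct WithCtor {
    int data[4];
    WithCtor() {}   // user-provided: `WithCtor()` leaves data uninitialized
};

struct NoCtor {
    int data[4];    // no user-provided constructor: `NoCtor()` zero-initializes data
};

// The empty user-provided constructor makes default construction non-trivial.
static_assert(!std::is_trivially_default_constructible<WithCtor>::value,
              "WithCtor has a user-provided constructor");
static_assert(std::is_trivially_default_constructible<NoCtor>::value,
              "NoCtor has only the implicit constructor");

// Fills a buffer with 0xFF bytes, constructs a NoCtor in it with `()`,
// and reports whether every element came out as zero.
bool value_init_zeroes_no_ctor() {
    alignas(NoCtor) unsigned char buf[sizeof(NoCtor)];
    std::memset(buf, 0xFF, sizeof buf);
    NoCtor* p = new (buf) NoCtor();   // value-initialization
    for (int v : p->data) {
        if (v != 0) return false;
    }
    return true;
}
```

Pre-filling the buffer with 0xFF rules out the case where the memory merely happened to be zero already.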
If you keep the empty constructor and then manually fill the object with zeros:
A* a = new A();
for (int i = 0; i < SIZE; i++)
    a->a1[i] = a->a2[i] = a->a3[i] = a->a4[i] = a->a5[i] = a->a6[i] = 0;
then the program is just as fast as in the case of no constructor in A.
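One way to see the cost of touching memory for the first time is to time two identical passes over a freshly allocated buffer. This is a rough sketch under stated assumptions: timed_fill, PassTimes and measure_two_passes are hypothetical names, and the buffer is written rather than read so that no uninitialized values are used. On many systems the first pass over a large buffer is noticeably slower than the second, but the actual numbers depend on the operating system, allocator and hardware.

```cpp
#include <chrono>
#include <cstddef>
#include <memory>

// Writes `value` to every element and returns the elapsed time in milliseconds.
// The volatile pointer keeps the compiler from eliminating the stores.
double timed_fill(int* p, std::size_t n, int value) {
    volatile int* vp = p;
    auto t0 = std::chrono::steady_clock::now();
    for (std::size_t i = 0; i < n; ++i) {
        vp[i] = value;
    }
    auto t1 = std::chrono::steady_clock::now();
    return std::chrono::duration<double, std::milli>(t1 - t0).count();
}

struct PassTimes {
    double first_ms;   // includes the cost of touching the memory for the first time
    double second_ms;  // same work on memory that has already been touched
};

// Allocates n ints without initializing them and times two write passes.
PassTimes measure_two_passes(std::size_t n) {
    std::unique_ptr<int[]> buf(new int[n]);  // default-initialized: not touched yet
    PassTimes t;
    t.first_ms = timed_fill(buf.get(), n, 1);
    t.second_ms = timed_fill(buf.get(), n, 2);
    return t;
}
```

Calling measure_two_passes with a large element count (for example the SIZE from the question) and printing both fields gives a feel for how much of the "with constructor" time may come from the first touch rather than from the arithmetic.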
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 |
