Why does similar code running in multiple threads have different running times?
I ran into a very strange problem with the multithreaded C++ program below.
#include <ctime>
#include <iostream>
#include <thread>
using namespace std;

// Shared array; each thread updates exactly one element of it.
int* counter = new int[1024]();  // () zero-initializes the elements

void updateCounter(int position)
{
    for (int j = 0; j < 100000000; j++)
    {
        counter[position] = counter[position] + 8;
    }
}

int main() {
    clock_t begin, end;

    // First block: four threads write to adjacent elements.
    begin = clock();
    thread t1(updateCounter, 1);
    thread t2(updateCounter, 2);
    thread t3(updateCounter, 3);
    thread t4(updateCounter, 4);
    t1.join();
    t2.join();
    t3.join();
    t4.join();
    end = clock();
    cout << end - begin << endl; // 1833

    // Second block: four threads write to elements 16 ints apart.
    begin = clock();
    thread t5(updateCounter, 16);
    thread t6(updateCounter, 32);
    thread t7(updateCounter, 48);
    thread t8(updateCounter, 64);
    t5.join();
    t6.join();
    t7.join();
    t8.join();
    end = clock();
    cout << end - begin << endl; // 358
}
The first block reports about 1833, but the second, which is almost identical to the first, reports about 358 (both values are clock() ticks, not seconds). Why is there such a difference? Thank you!
Solution 1:[1]
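Writing to nearby variables from multiple threads is slow due to "false sharing", which is described here: What is "false sharing"? How to reproduce / avoid it? Your indices 1/2/3/4 are adjacent ints that most likely sit on the same cache line, so every write by one thread invalidates that line in the other cores' caches and the cores keep passing the line back and forth.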
Your offsets of 16/32/48/64 are 64 bytes apart because the int values are (on most common platforms) 4 bytes each. And 64 bytes is a common cache line size, so this puts each target value on its own cache line.
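To make the arithmetic concrete, here is a small sketch that maps an array index to a cache line number. The names `kAssumedLineBytes`, `cacheLineOf` and `sameLine` are made up for illustration, and the sketch assumes a 64-byte line and an array that starts on a line boundary, neither of which is guaranteed for memory returned by `new`.

```cpp
#include <cstddef>

// Assumed cache line size; 64 bytes is common but not universal.
constexpr std::size_t kAssumedLineBytes = 64;

// Cache line number (relative to the start of the array) that holds
// element `index` of an int array. Assumes the array begins exactly
// on a cache line boundary.
constexpr std::size_t cacheLineOf(std::size_t index) {
    return index * sizeof(int) / kAssumedLineBytes;
}

// True if the two elements fall on the same assumed cache line.
constexpr bool sameLine(std::size_t a, std::size_t b) {
    return cacheLineOf(a) == cacheLineOf(b);
}
```

Under those assumptions, `sameLine(1, 4)` is true (the first experiment's elements share a line), while `sameLine(16, 32)` is false (each element of the second experiment has a line to itself).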
The performance difference is not nearly as large if you compile with optimization, which of course you should always do when measuring performance. But there's still a difference, and it may get worse the more threads you have.
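One common way to avoid the effect regardless of which indices are used is to give each thread's counter its own cache line. The following is only a sketch under the assumption of a 64-byte line; `PaddedCounter`, `kPaddedLineBytes`, `kThreads` and `runPadded` are hypothetical names, not part of the original code. Where your standard library provides it, C++17's `std::hardware_destructive_interference_size` from `<new>` could replace the hard-coded 64.

```cpp
#include <cstddef>
#include <thread>

// Assumed cache line size (not queried from the hardware).
constexpr std::size_t kPaddedLineBytes = 64;
constexpr int kThreads = 4;

// alignas rounds the struct's size and alignment up to one full line,
// so neighbouring array elements never share a cache line.
struct alignas(kPaddedLineBytes) PaddedCounter {
    int value = 0;
};
static_assert(sizeof(PaddedCounter) == kPaddedLineBytes,
              "each counter should occupy exactly one assumed cache line");

// Starts kThreads workers, each adding 8 to its own padded counter
// `iterations` times, and returns the sum of all counters.
long long runPadded(int iterations) {
    PaddedCounter counters[kThreads];  // local array, so alignas is honoured
    std::thread workers[kThreads];
    for (int t = 0; t < kThreads; ++t) {
        workers[t] = std::thread([&counters, t, iterations] {
            for (int j = 0; j < iterations; ++j) {
                counters[t].value = counters[t].value + 8;
            }
        });
    }
    for (auto& w : workers) {
        w.join();
    }
    long long total = 0;
    for (const auto& c : counters) {
        total += c.value;
    }
    return total;
}
```

The trade-off is memory: each 4-byte counter now occupies 64 bytes, which is usually acceptable for a handful of per-thread values.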
Finally, your benchmark is unfair because you always run the "slow" code first. That means the code and data are "cold" for the first experiment and "hot" for the second experiment. This is a common mistake in benchmarking, and may even be the dominant factor in the performance difference you're seeing, depending on your system.
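A fairer comparison could look something like the sketch below: it does one untimed warm-up pass of each layout, alternates which layout runs first, and keeps the best time for each. Everything here (`bump`, `timeFourThreads`, `compareLayouts`, `Timings`) is a hypothetical helper written for illustration. It also differs from the original in three deliberate ways: it uses `std::chrono::steady_clock` for wall-clock time, `unsigned` counters so repeated rounds cannot overflow, and `volatile` so an optimizing compiler cannot collapse the loop into a single addition.

```cpp
#include <algorithm>
#include <chrono>
#include <thread>

// Same loop as the question's updateCounter. volatile is an addition of
// this sketch: it forces every load and store to actually happen.
void bump(volatile unsigned* counters, int position, int iterations) {
    for (int j = 0; j < iterations; ++j) {
        counters[position] = counters[position] + 8;
    }
}

// Runs four threads on the given positions and returns the elapsed
// wall-clock time in milliseconds.
double timeFourThreads(volatile unsigned* counters,
                       const int (&positions)[4], int iterations) {
    const auto start = std::chrono::steady_clock::now();
    std::thread workers[4];
    for (int i = 0; i < 4; ++i) {
        workers[i] = std::thread(bump, counters, positions[i], iterations);
    }
    for (auto& w : workers) {
        w.join();
    }
    const std::chrono::duration<double, std::milli> elapsed =
        std::chrono::steady_clock::now() - start;
    return elapsed.count();
}

struct Timings {
    double adjacentMs;  // best time for indices 1, 2, 3, 4
    double spreadMs;    // best time for indices 16, 32, 48, 64
};

Timings compareLayouts(int iterations, int rounds) {
    static volatile unsigned counters[1024] = {};
    const int adjacent[4] = {1, 2, 3, 4};
    const int spread[4] = {16, 32, 48, 64};

    // Untimed warm-up so neither layout pays the "cold" start-up cost.
    timeFourThreads(counters, adjacent, iterations);
    timeFourThreads(counters, spread, iterations);

    Timings best{1e300, 1e300};
    for (int r = 0; r < rounds; ++r) {
        const bool adjacentFirst = (r % 2 == 0);  // alternate the order
        if (adjacentFirst) {
            best.adjacentMs = std::min(
                best.adjacentMs, timeFourThreads(counters, adjacent, iterations));
        }
        best.spreadMs = std::min(
            best.spreadMs, timeFourThreads(counters, spread, iterations));
        if (!adjacentFirst) {
            best.adjacentMs = std::min(
                best.adjacentMs, timeFourThreads(counters, adjacent, iterations));
        }
    }
    return best;
}
```

Calling, for example, `compareLayouts(100000000, 6)` and printing both fields gives numbers that are less sensitive to run order than a single back-to-back measurement. The actual values depend on the machine and compiler flags.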
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | John Zwinck |
