Why is std::atomic<bool> much slower than volatile bool?
I've been using volatile bool for years to control thread execution, and it has worked fine:
// in my class declaration
volatile bool stop_;
-----------------
// In the thread function
while (!stop_)
{
    do_things();
}
Now that C++11 has added support for atomic operations, I decided to try that instead:
// in my class declaration
std::atomic<bool> stop_;
-----------------
// In the thread function
while (!stop_)
{
    do_things();
}
But it's several orders of magnitude slower than the volatile bool!
A simple test case I wrote takes about 1 second to complete with the volatile bool approach. With std::atomic<bool>, however, I waited about 10 minutes and then gave up!
I tried using memory_order_relaxed with explicit load and store calls, with the same result.
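For reference, explicit relaxed loads and stores on a stop flag might look like the sketch below. The helper name `run_until_stopped` and the iteration counter are assumptions made for illustration and are not part of the original test case; the counter stands in for `do_things()`.

```cpp
#include <atomic>
#include <thread>

// Hypothetical helper (not from the original post): spins a worker on a
// relaxed atomic stop flag and returns how many iterations it completed.
unsigned long run_until_stopped() {
    std::atomic<bool> stop(false);
    std::atomic<unsigned long> iterations(0);

    std::thread worker([&] {
        // Relaxed load: only the flag's own value matters here.
        while (!stop.load(std::memory_order_relaxed)) {
            // Stand-in for do_things().
            iterations.fetch_add(1, std::memory_order_relaxed);
        }
    });

    // Wait until the worker has run at least once, then ask it to stop.
    while (iterations.load(std::memory_order_relaxed) == 0) {
        std::this_thread::yield();
    }
    stop.store(true, std::memory_order_relaxed);
    worker.join();
    return iterations.load(std::memory_order_relaxed);
}
```

Calling `run_until_stopped()` returns once the worker has observed the flag. With GCC this typically needs `-pthread` on the command line.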
My platform:
- Windows 7 64-bit
- MinGW gcc 4.6.x
What am I doing wrong?
NB: I know that volatile does not make a variable thread-safe. My question is not about volatile, it's about why atomic is ridiculously slow.
Solution 1:[1]
Code from "Olaf Dietsche"
USE ATOMIC
real 0m1.958s
user 0m1.957s
sys 0m0.000s
USE VOLATILE
real 0m1.966s
user 0m1.953s
sys 0m0.010s
If you are using a GCC version older than 4.7, see the GCC 4.7 release notes:
http://gcc.gnu.org/gcc-4.7/changes.html
Support for atomic operations specifying the C++11/C11 memory model has been added. These new __atomic routines replace the existing __sync built-in routines.
Atomic support is also available for memory blocks. Lock-free instructions will be used if a memory block is the same size and alignment as a supported integer type. Atomic operations which do not have lock-free support are left as function calls. A set of library functions is available on the GCC atomic wiki in the "External Atomics Library" section.
So yes, the only solution is to upgrade to GCC 4.7.
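The release notes mention that operations without lock-free support are left as function calls. One way to see which case applies on a given toolchain is `is_lock_free()`, as in the sketch below. The helper name `atomic_bool_is_lock_free` is hypothetical, and a `true` result does not by itself explain the timings discussed here; it only rules out the locking fallback.

```cpp
#include <atomic>

// Hypothetical helper: true if operations on std::atomic<bool> are lock-free
// for this compiler and target, i.e. not routed through a locking fallback.
bool atomic_bool_is_lock_free() {
    std::atomic<bool> flag(false);
    return flag.is_lock_free();
}
```

The macro `ATOMIC_BOOL_LOCK_FREE` from `<atomic>` gives the same information at compile time (a value of 2 means always lock-free).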
Solution 2:[2]
Since I'm curious about this, I tested it myself on Ubuntu 12.04, AMD 2.3 GHz, gcc 4.6.3.
#if 1
#include <atomic>
std::atomic<bool> stop_(false);
#else
volatile bool stop_ = false;
#endif
int main(int argc, char **argv)
{
    long n = 1000000000;
    while (!stop_) {
        if (--n < 0)
            stop_ = true;
    }
    return 0;
}
Compiled with g++ -g -std=c++0x -O3 a.cpp
I came to the same conclusion as @aleguna, though:
just bool:
real 0m0.004s
user 0m0.000s
sys 0m0.004s
volatile bool:
$ time ./a.out
real 0m1.413s
user 0m1.368s
sys 0m0.008s
std::atomic<bool>:
$ time ./a.out
real 0m32.550s
user 0m32.466s
sys 0m0.008s
std::atomic<int>:
$ time ./a.out
real 0m32.091s
user 0m31.958s
sys 0m0.012s
Update 2022-04-10, AMD Ryzen 3 3200G, g++ 9.3.0:
It looks like atomic has improved a lot in comparison to volatile.
I increased the loop counter to 10,000,000,000 to get a more precise picture, although this adjustment doesn't change the relative magnitude:
std::atomic<bool>, std::atomic<int>: ~2.9s
volatile bool: ~5.4s
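To compare the variants inside one binary instead of recompiling with `#if`, the same countdown loop could be templated on the flag type and timed with `std::chrono`. This is only a sketch: `time_loop` is a hypothetical name, and in-process timings will not match the `time ./a.out` figures above exactly.

```cpp
#include <atomic>
#include <chrono>

// Hypothetical helper: runs the countdown loop from the test above with the
// given flag type and returns the elapsed wall-clock time in seconds.
template <typename Flag>
double time_loop(long n) {
    Flag stop(false);
    auto start = std::chrono::steady_clock::now();
    while (!stop) {
        if (--n < 0)
            stop = true;
    }
    auto end = std::chrono::steady_clock::now();
    return std::chrono::duration<double>(end - start).count();
}
```

For example, `time_loop<volatile bool>(1000000000)` and `time_loop<std::atomic<bool> >(1000000000)` can be called one after the other and their results printed side by side.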
Solution 3:[3]
My guess is that this is a hardware question. When you write volatile, you tell the compiler not to assume anything about the variable, but as I understand it the hardware still treats it as a normal variable. This means that the variable will stay in the cache the whole time. When you use atomic, you use special hardware instructions, which probably means that the variable is fetched from main memory each time it is used. The difference in timing is consistent with this explanation.
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | |
| Solution 2 | |
| Solution 3 | Olle Lindeberg |
