'Why is there no std:: equivalent to pthread_spinlock_t like there is for pthread_mutex_t & std::mutex?

I've used pthreads a fair bit for concurrent programs, mainly utilising spinlocks, mutexes, and condition variables.

I started looking into multithreading using std::thread and using std::mutex, and I noticed that there doesn't seem to be an equivalent to spinlock in pthreads.

Anyone know why this is?



Solution 1:[1]

there doesn't seem to be an equivalent to spinlock in pthreads.

Spinlocks are often considered a wrong tool in user-space because there is no way to disable thread preemption while the spinlock is held (unlike in kernel). So that a thread can acquire a spinlock and then get preempted, causing all other threads trying to acquire the spinlock to spin unnecessarily (and if those threads are of higher priority that may cause a deadlock (threads waiting for I/O may get a priority boost on wake up)). This reasoning also applies to all lockless data structures, unless the data structure is truly wait-free (there aren't many practically useful ones, apart from boost::spsc_queue).

In kernel, a thread that has locked a spinlock cannot be preempted or interrupted before it releases the spinlock. And that is why spinlocks are appropriate there (when RCU cannot be used).

On Linux, one can prevent preemption (not sure if completely, but there has been recent kernel changes towards such a desirable effect) by using isolated CPU cores and FIFO real-time threads pinned to those isolated cores. But that requires a deliberate kernel/machine configuration and an application designed to take advantage of that configuration. Nevertheless, people do use such a setup for business-critical applications along with lockless (but not wait-free) data structures in user-space.


On Linux, there is adaptive mutex PTHREAD_MUTEX_ADAPTIVE_NP, which spins for a limited number of iterations before blocking in the kernel (similar to InitializeCriticalSectionAndSpinCount). However, that mutex cannot be used through std::mutex interface because there is no option to customise non-portable pthread_mutexattr_t before initialising pthread_mutex_t.

One can neither enable process-sharing, robostness, error-checking or priority-inversion prevention through std::mutex interface. In practice, people write their own wrappers of pthread_mutex_t which allows to set desirable mutex attributes; along with a corresponding wrapper for condition variables. Standard locks like std::unique_lock and std::lock_guard can be reused.

IMO, there could be provisions to set desirable mutex and condition variable properties in std:: APIs, like providing a protected constructor for derived classes that would initialize that native_handle, but there aren't any. That native_handle looks like a good idea to do platform specific stuff, however, there must be a constructor for the derived class to be able to initialize it appropriately. After the mutex or condition variable is initialized that native_handle is pretty much useless. Unless the idea was only to be able to pass that native_handle to (C language) APIs that expect a pointer or reference to an initialized pthread_mutex_t.


There is another example of Boost/C++ standard not accepting semaphores on the basis that they are too much of a rope to hang oneself, and that mutex (a binary semaphore, essentially) and condition variable are more fundamental and more flexible synchronisation primitives, out of which a semaphore can be built.

From the point of view of the C++ standard those are probably right decisions because educating users to use spinlocks and semaphores correctly with all the nuances is a difficult task. Whereas advanced users can whip out a wrapper for pthread_spinlock_t with little effort.

Solution 2:[2]

You are right there's no spin lock implementation in the std namespace. A spin lock is a great concept but in user space is generally quite poor. OS doesn't know your process wants to spin and usually you can have worse results than using a mutex. To be noted that on several platforms there's the optimistic spinning implemented so a mutex can do a really good job. In addition adjusting the time to "pause" between each loop iteration can be not trivial and portable and a fine tuning is required. TL;DR don't use a spinlock in user space unless you are really really sure about what you are doing.

C++ Thread discussion

Article explaining how to write a spin lock with benchmark

Reply by Linus Torvalds about the above article explaining why it's a bad idea

Solution 3:[3]

Spin locks have two advantages:

  1. They require much fewer storage as a std::mutex, because they do not need a queue of threads waiting for the lock. On my system, sizeof(pthread_spinlock_t) is 4, while sizeof(std::mutex) is 40.

  2. They are much more performant than std::mutex, if the protected code region is small and the contention level is low to moderate.

On the downside, a poorly implemented spin lock can hog the CPU. For example, a tight loop with a compare-and-set assembler instructions will spam the cache system with loads and loads of unnecessary writes. But that's what we have libraries for, that they implement best practice and avoid common pitfalls. That most user implementations of spin locks are poor, is not a reason to not put spin locks into the library. Rather, it is a reason to put it there, to stop users from trying it themselves.

There is a second problem, that arises from the scheduler: If thread A acquires the lock and then gets preempted by the scheduler before it finishes executing the critical section, another thread B could spin "forever" (or at least for many milliseconds, before thread A gets scheduled again) on that lock.

Unfortunately, there is no way, how userland code can tell the kernel "please don't preempt me in this critical code section". But if we know, that under normal circumstances, the critical code section executes within 10 ns, we could at least tell thread B: "preempt yourself voluntarily, if you have been spinning for over 30 ns". This is not guaranteed to return control directly back to thread A. But it will stop the waste of CPU cycles, that otherwise would take place. And in most scenarios, where thread A and B run in the same process at the same priority, the scheduler will usually schedule thread A before thread B, if B called std::this_thread::yield().

So, I am thinking about a template spin lock class, that takes a single unsigned integer as a parameter, which is the number of memory reads in the critical section. This parameter is then used in the library to calculate the appropriate number of spins, before a yield() is performed. With a zero count, yield() would never be called.

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1
Solution 2
Solution 3 Kai Petzke