Making the nodes of an unordered_map or map<> cacheline-aligned

How can I make the nodes and the bucket "list" of an unordered map cacheline-aligned, to avoid false sharing with other data structures on the heap?



Solution 1:[1]

The question above wasn't really a question, since I already knew the solution. I'm answering it myself because I think this is a fairly common problem and I have a nice solution for it. Here it is:

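#pragma once
#include <cstddef>      // std::size_t, std::ptrdiff_t
#include <new>          // std::bad_alloc, std::hardware_destructive_interference_size
#if !defined(_MSC_VER)
    #include <cstdlib>  // std::aligned_alloc, std::free
#else
    #include <malloc.h> // _aligned_malloc, _aligned_free
#endif
#include <type_traits>  // std::false_type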
        
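template<typename T>
struct aligned_allocator
{
    using value_type = T;
    using size_type = std::size_t;
    using difference_type = std::ptrdiff_t;
    using propagate_on_container_move_assignment = std::false_type;
    aligned_allocator() noexcept = default;
    aligned_allocator( aligned_allocator const & ) noexcept = default;
    // rebinding constructor: containers use it to build allocators for their
    // internal node and bucket types from the one the user passed in
    template<typename U>
    aligned_allocator( aligned_allocator<U> const & ) noexcept {}
    value_type *allocate( size_type n );
    void deallocate( value_type *p, size_type n );
};

// the allocator is stateless, so any instance can free memory obtained from
// any other instance and all instances compare equal
template<typename T, typename U>
inline bool operator ==( aligned_allocator<T> const &, aligned_allocator<U> const & ) noexcept { return true; }
template<typename T, typename U>
inline bool operator !=( aligned_allocator<T> const &, aligned_allocator<U> const & ) noexcept { return false; }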

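template<typename T>
inline
typename aligned_allocator<T>::value_type *aligned_allocator<T>::allocate( size_type n )
{
#if defined(__cpp_lib_hardware_interference_size)
    constexpr std::size_t CL_SIZE = std::hardware_destructive_interference_size;
#else
    constexpr std::size_t CL_SIZE = 64; // fallback guess when the library gives no value
#endif
    // reject element counts whose byte size (plus padding) would overflow
    if( n > (static_cast<std::size_t>(-1) - (CL_SIZE - 1)) / sizeof(T) )
        throw std::bad_alloc();
    // round up to whole cache lines: std::aligned_alloc expects a multiple of the
    // alignment, and the padding keeps other heap blocks out of the last line
    std::size_t size = (n * sizeof(T) + CL_SIZE - 1) / CL_SIZE * CL_SIZE;
    if( !size )
        size = CL_SIZE; // never ask the C runtime for zero bytes
#if !defined(_MSC_VER)
    void *p = std::aligned_alloc( CL_SIZE, size );
#else
    void *p = _aligned_malloc( size, CL_SIZE );
#endif
    if( !p )
        throw std::bad_alloc(); // allocators report failure by throwing, not by returning null
    return static_cast<value_type *>(p);
}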

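template<typename T>
inline
void aligned_allocator<T>::deallocate( value_type *p, size_type /*n*/ )
{
#if !defined(_MSC_VER)
    std::free( p );     // memory from std::aligned_alloc is released with std::free
#else
    _aligned_free( p ); // memory from _aligned_malloc must not be passed to free()
#endif
}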
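
To show how an allocator like the one above gets plugged into a container, here is a rough, self-contained sketch. It does not reuse the header above; it uses a condensed stand-in that only covers the non-MSVC path and assumes a C++17 toolchain and a 64-byte cache line. The names probe_allocator, aligned_map, g_allocations, g_all_aligned and demo_unordered_map_aligned are made up for this illustration. The stand-in also records whether every block it hands out starts on a cache line boundary, which covers both the nodes and the bucket array, since the container rebinds the allocator for both.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdlib>
#include <functional>
#include <new>
#include <string>
#include <unordered_map>
#include <utility>

constexpr std::size_t kLine = 64; // assumed cache line size

// Bookkeeping for the sketch only: number of allocations seen so far and
// whether all of them started on a cache line boundary.
static std::size_t g_allocations = 0;
static bool g_all_aligned = true;

// Condensed stand-in for the allocator above (non-MSVC path only).
template<typename T>
struct probe_allocator
{
    using value_type = T;
    probe_allocator() noexcept = default;
    template<typename U>
    probe_allocator( probe_allocator<U> const & ) noexcept {}

    T *allocate( std::size_t n )
    {
        // round up to whole cache lines so the block owns its last line too
        std::size_t bytes = (n * sizeof(T) + kLine - 1) / kLine * kLine;
        if( bytes == 0 )
            bytes = kLine;
        void *p = std::aligned_alloc( kLine, bytes );
        if( !p )
            throw std::bad_alloc();
        ++g_allocations;
        if( reinterpret_cast<std::uintptr_t>(p) % kLine != 0 )
            g_all_aligned = false;
        return static_cast<T *>(p);
    }
    void deallocate( T *p, std::size_t ) noexcept { std::free( p ); }
};

template<typename T, typename U>
bool operator ==( probe_allocator<T> const &, probe_allocator<U> const & ) noexcept { return true; }
template<typename T, typename U>
bool operator !=( probe_allocator<T> const &, probe_allocator<U> const & ) noexcept { return false; }

// The allocator is the fifth template argument of std::unordered_map.
using aligned_map = std::unordered_map<int, std::string, std::hash<int>, std::equal_to<int>,
    probe_allocator<std::pair<int const, std::string>>>;

// Fills a map and reports whether every node and bucket block was aligned.
bool demo_unordered_map_aligned()
{
    g_allocations = 0;
    g_all_aligned = true;
    {
        aligned_map m;
        for( int i = 0; i < 1000; ++i )
            m.emplace( i, "value" );
        assert( m.size() == 1000 );
    }
    // at least one allocation per inserted node, plus the bucket arrays
    return g_allocations >= 1000 && g_all_aligned;
}
```

For std::map the allocator goes into the fourth template argument instead. Keep in mind that every node now occupies at least one full cache line, so this trades memory for isolation.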

The first solution I showed, with a proxy allocator, only worked for allocations with an n of 1, because T was wrapped inside a structure with an alignas attribute. But imagine what happens if you allocate several Ts in a row with n >= 2: each item would be aligned, but not the whole allocation itself. So writing a proxy allocator that way isn't possible.
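
One way to see why the wrapper approach falls apart for n >= 2 is to compare element strides. The sketch below is only an illustration under assumptions: padded is a hypothetical stand-in for the alignas wrapper (the original proxy allocator isn't shown here), and 64 bytes is an assumed cache line size. A container that asks for n objects of type T walks the block with a stride of sizeof(T), while a block laid out as n wrappers has a stride of one full cache line, so the two layouts only agree on element 0.

```cpp
#include <cstddef>

constexpr std::size_t kLineSize = 64; // assumed cache line size

// Hypothetical wrapper of the kind a proxy allocator might allocate in place of T.
template<typename T>
struct alignas(kLineSize) padded
{
    T value;
};

// Byte distance between neighbouring elements when the container treats the
// block as T[n] ...
template<typename T>
constexpr std::size_t container_stride() { return sizeof(T); }

// ... and when the block was really laid out as padded<T>[n].
template<typename T>
constexpr std::size_t wrapper_stride() { return sizeof(padded<T>); }

// With n == 1 only element 0 exists and both layouts put it at offset 0.
// With n >= 2 the container expects element 1 at offset sizeof(int), but the
// wrapper layout placed it one cache line further on.
static_assert( wrapper_stride<int>() == kLineSize, "a wrapper fills a whole cache line" );
static_assert( container_stride<int>() != wrapper_stride<int>(), "strides disagree" );
```

That is why the allocator above aligns the block as a whole and leaves the element layout inside the block alone.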

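The allocation only stays free of false sharing if its tail is padded up to the alignment as well, so that no other heap block ends up in its last cache line. I think that's usually true, since the allocation facilities used internally are often used for exactly this purpose, but neither std::aligned_alloc nor _aligned_malloc promises it. Rounding the requested size up to a multiple of the cache line size removes the assumption, and a size that is a multiple of the alignment is also what std::aligned_alloc was originally specified to require.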
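
The rounding itself is a one-liner. Here is a small sketch of the arithmetic; round_up_to and tail_padding are hypothetical helper names, and the 24-byte node size in the checks is just an example figure, not the real node size of any particular standard library.

```cpp
#include <cstddef>

// Rounds bytes up to the next multiple of alignment.
// Assumes alignment is non-zero and that bytes + alignment does not overflow.
constexpr std::size_t round_up_to( std::size_t bytes, std::size_t alignment )
{
    return (bytes + alignment - 1) / alignment * alignment;
}

// Number of padding bytes added at the tail of the block.
constexpr std::size_t tail_padding( std::size_t bytes, std::size_t alignment )
{
    return round_up_to( bytes, alignment ) - bytes;
}

// A 24-byte node is padded to one full 64-byte line (40 bytes of padding),
// a block that already ends on a line boundary is left unchanged,
// and one byte more than a line spills into a second line.
static_assert( round_up_to( 24, 64 ) == 64, "" );
static_assert( tail_padding( 24, 64 ) == 40, "" );
static_assert( round_up_to( 64, 64 ) == 64, "" );
static_assert( round_up_to( 65, 64 ) == 128, "" );
```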

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1