'Rounding up to next power of 2
I want to write a function that returns the nearest next power of 2 number. For example if my input is 789, the output should be 1024. Is there any way of achieving this without using any loops but just using some bitwise operators?
Solution 1:[1]
next = pow(2, ceil(log(x)/log(2)));
This works by finding the number you'd have raise 2 by to get x (take the log of the number, and divide by the log of the desired base, see wikipedia for more). Then round that up with ceil to get the nearest whole number power.
This is a more general purpose (i.e. slower!) method than the bitwise methods linked elsewhere, but good to know the maths, eh?
Solution 2:[2]
I think this works, too:
int power = 1;
while(power < x)
power*=2;
And the answer is power.
Solution 3:[3]
unsigned long upper_power_of_two(unsigned long v)
{
v--;
v |= v >> 1;
v |= v >> 2;
v |= v >> 4;
v |= v >> 8;
v |= v >> 16;
v++;
return v;
}
Solution 4:[4]
If you're using GCC, you might want to have a look at Optimizing the next_pow2() function by Lockless Inc.. This page describes a way to use built-in function builtin_clz() (count leading zero) and later use directly x86 (ia32) assembler instruction bsr (bit scan reverse), just like it's described in another answer's link to gamedev site. This code might be faster than those described in previous answer.
By the way, if you're not going to use assembler instruction and 64bit data type, you can use this
/**
* return the smallest power of two value
* greater than x
*
* Input range: [2..2147483648]
* Output range: [2..2147483648]
*
*/
__attribute__ ((const))
static inline uint32_t p2(uint32_t x)
{
#if 0
assert(x > 1);
assert(x <= ((UINT32_MAX/2) + 1));
#endif
return 1 << (32 - __builtin_clz (x - 1));
}
Solution 5:[5]
One more, although I use cycle, but thi is much faster than math operands
power of two "floor" option:
int power = 1;
while (x >>= 1) power <<= 1;
power of two "ceil" option:
int power = 2;
x--; // <<-- UPDATED
while (x >>= 1) power <<= 1;
UPDATE
As mentioned in comments there was mistake in ceil where its result was wrong.
Here are full functions:
unsigned power_floor(unsigned x) {
int power = 1;
while (x >>= 1) power <<= 1;
return power;
}
unsigned power_ceil(unsigned x) {
if (x <= 1) return 1;
int power = 2;
x--;
while (x >>= 1) power <<= 1;
return power;
}
Solution 6:[6]
For any unsigned type, building on the Bit Twiddling Hacks:
#include <climits>
#include <type_traits>
template <typename UnsignedType>
UnsignedType round_up_to_power_of_2(UnsignedType v) {
static_assert(std::is_unsigned<UnsignedType>::value, "Only works for unsigned types");
v--;
for (size_t i = 1; i < sizeof(v) * CHAR_BIT; i *= 2) //Prefer size_t "Warning comparison between signed and unsigned integer"
{
v |= v >> i;
}
return ++v;
}
There isn't really a loop there as the compiler knows at compile time the number of iterations.
Solution 7:[7]
Despite the question is tagged as c here my five cents. Lucky us, C++ 20 would include std::ceil2 and std::floor2 (see here). It is consexpr template functions, current GCC implementation uses bitshifting and works with any integral unsigned type.
Solution 8:[8]
In standard c++20 this is included in <bit>.
The answer is simply
#include <bit>
unsigned long upper_power_of_two(unsigned long v)
{
return std::bit_ceil(v);
}
NOTE:
The solution I gave is for c++, not c, I would give an answer this question instead, but it was closed as a duplicate of this one!
Solution 9:[9]
For IEEE floats you'd be able to do something like this.
int next_power_of_two(float a_F){
int f = *(int*)&a_F;
int b = f << 9 != 0; // If we're a power of two this is 0, otherwise this is 1
f >>= 23; // remove factional part of floating point number
f -= 127; // subtract 127 (the bias) from the exponent
// adds one to the exponent if were not a power of two,
// then raises our new exponent to the power of two again.
return (1 << (f + b));
}
If you need an integer solution and you're able to use inline assembly, BSR will give you the log2 of an integer on the x86. It counts how many right bits are set, which is exactly equal to the log2 of that number. Other processors have similar instructions (often), such as CLZ and depending on your compiler there might be an intrinsic available to do the work for you.
Solution 10:[10]
In x86 you can use the sse4 bit manipulation instructions to make it fast.
//assume input is in eax
mov ecx,31
popcnt edx,eax //cycle 1
lzcnt eax,eax //cycle 2
sub ecx,eax
mov eax,1
cmp edx,1 //cycle 3
jle @done //cycle 4 - popcnt says its a power of 2, return input unchanged
shl eax,cl //cycle 5
@done: rep ret //cycle 5
In c you can use the matching intrinsics.
Or jumpless, which speeds up things by avoiding a misprediction due to a jump, but slows things down by lengthening the dependency chain. Time the code to see which works best for you.
//assume input is in eax
mov ecx,31
popcnt edx,eax //cycle 1
lzcnt eax,eax
sub ecx,eax
mov eax,1 //cycle 2
cmp edx,1
mov edx,0 //cycle 3
cmovle ecx,edx //cycle 4 - ensure eax does not change
shl eax,cl
@done: rep ret //cycle 5
Solution 11:[11]
/*
** http://graphics.stanford.edu/~seander/bithacks.html#IntegerLog
*/
#define __LOG2A(s) ((s &0xffffffff00000000) ? (32 +__LOG2B(s >>32)): (__LOG2B(s)))
#define __LOG2B(s) ((s &0xffff0000) ? (16 +__LOG2C(s >>16)): (__LOG2C(s)))
#define __LOG2C(s) ((s &0xff00) ? (8 +__LOG2D(s >>8)) : (__LOG2D(s)))
#define __LOG2D(s) ((s &0xf0) ? (4 +__LOG2E(s >>4)) : (__LOG2E(s)))
#define __LOG2E(s) ((s &0xc) ? (2 +__LOG2F(s >>2)) : (__LOG2F(s)))
#define __LOG2F(s) ((s &0x2) ? (1) : (0))
#define LOG2_UINT64 __LOG2A
#define LOG2_UINT32 __LOG2B
#define LOG2_UINT16 __LOG2C
#define LOG2_UINT8 __LOG2D
static inline uint64_t
next_power_of_2(uint64_t i)
{
#if defined(__GNUC__)
return 1UL <<(1 +(63 -__builtin_clzl(i -1)));
#else
i =i -1;
i =LOG2_UINT64(i);
return 1UL <<(1 +i);
#endif
}
If you do not want to venture into the realm of undefined behaviour the input value must be between 1 and 2^63. The macro is also useful to set constant at compile time.
Solution 12:[12]
Here's my solution in C. Hope this helps!
int next_power_of_two(int n) {
int i = 0;
for (--n; n > 0; n >>= 1) {
i++;
}
return 1 << i;
}
Solution 13:[13]
For completeness here is a floating-point implementation in bog-standard C.
double next_power_of_two(double value) {
int exp;
if(frexp(value, &exp) == 0.5) {
// Omit this case to round precise powers of two up to the *next* power
return value;
}
return ldexp(1.0, exp);
}
Solution 14:[14]
An efficient Microsoft (e.g., Visual Studio 2017) specific solution in C / C++ for integer input. Handles the case of the input exactly matching a power of two value by decrementing before checking the location of the most significant 1 bit.
inline unsigned int ExpandToPowerOf2(unsigned int Value)
{
unsigned long Index;
_BitScanReverse(&Index, Value - 1);
return (1U << (Index + 1));
}
// - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
#if defined(WIN64) // The _BitScanReverse64 intrinsic is only available for 64 bit builds because it depends on x64
inline unsigned long long ExpandToPowerOf2(unsigned long long Value)
{
unsigned long Index;
_BitScanReverse64(&Index, Value - 1);
return (1ULL << (Index + 1));
}
#endif
This generates 5 or so inlined instructions for an Intel processor similar to the following:
dec eax
bsr rcx, rax
inc ecx
mov eax, 1
shl rax, cl
Apparently the Visual Studio C++ compiler isn't coded to optimize this for compile-time values, but it's not like there are a whole lot of instructions there.
Edit:
If you want an input value of 1 to yield 1 (2 to the zeroth power), a small modification to the above code still generates straight through instructions with no branch.
inline unsigned int ExpandToPowerOf2(unsigned int Value)
{
unsigned long Index;
_BitScanReverse(&Index, --Value);
if (Value == 0)
Index = (unsigned long) -1;
return (1U << (Index + 1));
}
Generates just a few more instructions. The trick is that Index can be replaced by a test followed by a cmove instruction.
Solution 15:[15]
Solution 16:[16]
Trying to make an "ultimate" solution for this. The following code
is targeted for C language (not C++),
uses compiler built-ins to yield efficient code (CLZ or BSR instruction) if compiler supports any,
is portable (standard C and no assembly) with the exception of built-ins, and
addresses all undefined behaviors.
If you're writing in C++, you may adjust the code appropriately. Note that C++20 introduces std::bit_ceil which does the exact same thing except the behavior may be undefined on certain conditions.
#include <limits.h>
#ifdef _MSC_VER
# if _MSC_VER >= 1400
/* _BitScanReverse is introduced in Visual C++ 2005 and requires
<intrin.h> (also introduced in Visual C++ 2005). */
#include <intrin.h>
#pragma intrinsic(_BitScanReverse)
#pragma intrinsic(_BitScanReverse64)
# define HAVE_BITSCANREVERSE 1
# endif
#endif
/* Macro indicating that the compiler supports __builtin_clz().
The name HAVE_BUILTIN_CLZ seems to be the most common, but in some
projects HAVE__BUILTIN_CLZ is used instead. */
#ifdef __has_builtin
# if __has_builtin(__builtin_clz)
# define HAVE_BUILTIN_CLZ 1
# endif
#elif defined(__GNUC__)
# if (__GNUC__ > 3)
# define HAVE_BUILTIN_CLZ 1
# elif defined(__GNUC_MINOR__)
# if (__GNUC__ == 3 && __GNUC_MINOR__ >= 4)
# define HAVE_BUILTIN_CLZ 1
# endif
# endif
#endif
/**
* Returns the smallest power of two that is not smaller than x.
*/
unsigned long int next_power_of_2_long(unsigned long int x)
{
if (x <= 1) {
return 1;
}
x--;
#ifdef HAVE_BITSCANREVERSE
if (x > (ULONG_MAX >> 1)) {
return 0;
} else {
unsigned long int index;
(void) _BitScanReverse(&index, x);
return (1UL << (index + 1));
}
#elif defined(HAVE_BUILTIN_CLZ)
if (x > (ULONG_MAX >> 1)) {
return 0;
}
return (1UL << (sizeof(x) * CHAR_BIT - __builtin_clzl(x)));
#else
/* Solution from "Bit Twiddling Hacks"
<http://www.graphics.stanford.edu/~seander/bithacks.html#RoundUpPowerOf2>
but converted to a loop for smaller code size.
("gcc -O3" will unroll this.) */
{
unsigned int shift;
for (shift = 1; shift < sizeof(x) * CHAR_BIT; shift <<= 1) {
x |= (x >> shift);
}
}
return (x + 1);
#endif
}
unsigned int next_power_of_2(unsigned int x)
{
if (x <= 1) {
return 1;
}
x--;
#ifdef HAVE_BITSCANREVERSE
if (x > (UINT_MAX >> 1)) {
return 0;
} else {
unsigned long int index;
(void) _BitScanReverse(&index, x);
return (1U << (index + 1));
}
#elif defined(HAVE_BUILTIN_CLZ)
if (x > (UINT_MAX >> 1)) {
return 0;
}
return (1U << (sizeof(x) * CHAR_BIT - __builtin_clz(x)));
#else
{
unsigned int shift;
for (shift = 1; shift < sizeof(x) * CHAR_BIT; shift <<= 1) {
x |= (x >> shift);
}
}
return (x + 1);
#endif
}
unsigned long long next_power_of_2_long_long(unsigned long long x)
{
if (x <= 1) {
return 1;
}
x--;
#if (defined(HAVE_BITSCANREVERSE) && \
ULLONG_MAX == 18446744073709551615ULL)
if (x > (ULLONG_MAX >> 1)) {
return 0;
} else {
/* assert(sizeof(__int64) == sizeof(long long)); */
unsigned long int index;
(void) _BitScanReverse64(&index, x);
return (1ULL << (index + 1));
}
#elif defined(HAVE_BUILTIN_CLZ)
if (x > (ULLONG_MAX >> 1)) {
return 0;
}
return (1ULL << (sizeof(x) * CHAR_BIT - __builtin_clzll(x)));
#else
{
unsigned int shift;
for (shift = 1; shift < sizeof(x) * CHAR_BIT; shift <<= 1) {
x |= (x >> shift);
}
}
return (x + 1);
#endif
}
Solution 17:[17]
Portable solution in C#:
int GetNextPowerOfTwo(int input) {
return 1 << (int)Math.Ceiling(Math.Log2(input));
}
Math.Ceiling(Math.Log2(value)) calculates the exponent of the next power of two, the 1 << calculates the real value through bitshifting.
Faster solution if you have .NET Core 3 or above:
uint GetNextPowerOfTwoFaster(uint input) {
return (uint)1 << (sizeof(uint) * 8 - System.Numerics.BitOperations.LeadingZeroCount(input - 1));
}
This uses System.Numerics.BitOperations.LeadingZeroCount() which uses a hardware instruction if available:
Update:
RoundUpToPowerOf2() is Coming in .NET 6! The internal implementation is mostly the same as the .NET Core 3 solution above.
Here's the community update.
Solution 18:[18]
constexpr version of clp2 for C++14
#include <iostream>
#include <type_traits>
// Closest least power of 2 minus 1. Returns 0 if n = 0.
template <typename UInt, std::enable_if_t<std::is_unsigned<UInt>::value,int> = 0>
constexpr UInt clp2m1(UInt n, unsigned i = 1) noexcept
{ return i < sizeof(UInt) * 8 ? clp2m1(UInt(n | (n >> i)),i << 1) : n; }
/// Closest least power of 2 minus 1. Returns 0 if n <= 0.
template <typename Int, std::enable_if_t<std::is_integral<Int>::value && std::is_signed<Int>::value,int> = 0>
constexpr auto clp2m1(Int n) noexcept
{ return clp2m1(std::make_unsigned_t<Int>(n <= 0 ? 0 : n)); }
/// Closest least power of 2. Returns 2^N: 2^(N-1) < n <= 2^N. Returns 0 if n <= 0.
template <typename Int, std::enable_if_t<std::is_integral<Int>::value,int> = 0>
constexpr auto clp2(Int n) noexcept
{ return clp2m1(std::make_unsigned_t<Int>(n-1)) + 1; }
/// Next power of 2. Returns 2^N: 2^(N-1) <= n < 2^N. Returns 1 if n = 0. Returns 0 if n < 0.
template <typename Int, std::enable_if_t<std::is_integral<Int>::value,int> = 0>
constexpr auto np2(Int n) noexcept
{ return clp2m1(std::make_unsigned_t<Int>(n)) + 1; }
template <typename T>
void test(T v) { std::cout << clp2(v) << std::endl; }
int main()
{
test(-5); // 0
test(0); // 0
test(8); // 8
test(31); // 32
test(33); // 64
test(789); // 1024
test(char(260)); // 4
test(unsigned(-1) - 1); // 0
test<long long>(unsigned(-1) - 1); // 4294967296
return 0;
}
Solution 19:[19]
Many processor architectures support log base 2 or very similar operation – count leading zeros. Many compilers have intrinsics for it. See https://en.wikipedia.org/wiki/Find_first_set
Solution 20:[20]
Assuming you have a good compiler & it can do the bit twiddling before hand thats above me at this point, but anyway this works!!!
// http://graphics.stanford.edu/~seander/bithacks.html#IntegerLogObvious
#define SH1(v) ((v-1) | ((v-1) >> 1)) // accidently came up w/ this...
#define SH2(v) ((v) | ((v) >> 2))
#define SH4(v) ((v) | ((v) >> 4))
#define SH8(v) ((v) | ((v) >> 8))
#define SH16(v) ((v) | ((v) >> 16))
#define OP(v) (SH16(SH8(SH4(SH2(SH1(v))))))
#define CB0(v) ((v) - (((v) >> 1) & 0x55555555))
#define CB1(v) (((v) & 0x33333333) + (((v) >> 2) & 0x33333333))
#define CB2(v) ((((v) + ((v) >> 4) & 0xF0F0F0F) * 0x1010101) >> 24)
#define CBSET(v) (CB2(CB1(CB0((v)))))
#define FLOG2(v) (CBSET(OP(v)))
Test code below:
#include <iostream>
using namespace std;
// http://graphics.stanford.edu/~seander/bithacks.html#IntegerLogObvious
#define SH1(v) ((v-1) | ((v-1) >> 1)) // accidently guess this...
#define SH2(v) ((v) | ((v) >> 2))
#define SH4(v) ((v) | ((v) >> 4))
#define SH8(v) ((v) | ((v) >> 8))
#define SH16(v) ((v) | ((v) >> 16))
#define OP(v) (SH16(SH8(SH4(SH2(SH1(v))))))
#define CB0(v) ((v) - (((v) >> 1) & 0x55555555))
#define CB1(v) (((v) & 0x33333333) + (((v) >> 2) & 0x33333333))
#define CB2(v) ((((v) + ((v) >> 4) & 0xF0F0F0F) * 0x1010101) >> 24)
#define CBSET(v) (CB2(CB1(CB0((v)))))
#define FLOG2(v) (CBSET(OP(v)))
#define SZ4 FLOG2(4)
#define SZ6 FLOG2(6)
#define SZ7 FLOG2(7)
#define SZ8 FLOG2(8)
#define SZ9 FLOG2(9)
#define SZ16 FLOG2(16)
#define SZ17 FLOG2(17)
#define SZ127 FLOG2(127)
#define SZ1023 FLOG2(1023)
#define SZ1024 FLOG2(1024)
#define SZ2_17 FLOG2((1ul << 17)) //
#define SZ_LOG2 FLOG2(SZ)
#define DBG_PRINT(x) do { std::printf("Line:%-4d" " %10s = %-10d\n", __LINE__, #x, x); } while(0);
uint32_t arrTble[FLOG2(63)];
int main(){
int8_t n;
DBG_PRINT(SZ4);
DBG_PRINT(SZ6);
DBG_PRINT(SZ7);
DBG_PRINT(SZ8);
DBG_PRINT(SZ9);
DBG_PRINT(SZ16);
DBG_PRINT(SZ17);
DBG_PRINT(SZ127);
DBG_PRINT(SZ1023);
DBG_PRINT(SZ1024);
DBG_PRINT(SZ2_17);
return(0);
}
Outputs:
Line:39 SZ4 = 2
Line:40 SZ6 = 3
Line:41 SZ7 = 3
Line:42 SZ8 = 3
Line:43 SZ9 = 4
Line:44 SZ16 = 4
Line:45 SZ17 = 5
Line:46 SZ127 = 7
Line:47 SZ1023 = 10
Line:48 SZ1024 = 10
Line:49 SZ2_16 = 17
Solution 21:[21]
I'm trying to get nearest lower power of 2 and made this function. May it help you.Just multiplied nearest lower number times 2 to get nearest upper power of 2
int nearest_upper_power(int number){
int temp=number;
while((number&(number-1))!=0){
temp<<=1;
number&=temp;
}
//Here number is closest lower power
number*=2;
return number;
}
Solution 22:[22]
Adapted Paul Dixon's answer to Excel, this works perfectly.
=POWER(2,CEILING.MATH(LOG(A1)/LOG(2)))
Solution 23:[23]
A variant of @YannDroneaud answer valid for x==1, only for x86 plateforms, compilers, gcc or clang:
__attribute__ ((const))
static inline uint32_t p2(uint32_t x)
{
#if 0
assert(x > 0);
assert(x <= ((UINT32_MAX/2) + 1));
#endif
int clz;
uint32_t xm1 = x-1;
asm(
"lzcnt %1,%0"
:"=r" (clz)
:"rm" (xm1)
:"cc"
);
return 1 << (32 - clz);
}
Solution 24:[24]
Here is what I'm using to have this be a constant expression, if the input is a constant expression.
#define uptopow2_0(v) ((v) - 1)
#define uptopow2_1(v) (uptopow2_0(v) | uptopow2_0(v) >> 1)
#define uptopow2_2(v) (uptopow2_1(v) | uptopow2_1(v) >> 2)
#define uptopow2_3(v) (uptopow2_2(v) | uptopow2_2(v) >> 4)
#define uptopow2_4(v) (uptopow2_3(v) | uptopow2_3(v) >> 8)
#define uptopow2_5(v) (uptopow2_4(v) | uptopow2_4(v) >> 16)
#define uptopow2(v) (uptopow2_5(v) + 1) /* this is the one programmer uses */
So for instance, an expression like:
uptopow2(sizeof (struct foo))
will nicely reduce to a constant.
Solution 25:[25]
The g++ compiler provides a builtin function __builtin_clz that counts leading zeros:
So we could do:
int nextPowerOfTwo(unsigned int x) {
return 1 << sizeof(x)*8 - __builtin_clz(x);
}
int main () {
std::cout << nextPowerOfTwo(7) << std::endl;
std::cout << nextPowerOfTwo(31) << std::endl;
std::cout << nextPowerOfTwo(33) << std::endl;
std::cout << nextPowerOfTwo(8) << std::endl;
std::cout << nextPowerOfTwo(91) << std::endl;
return 0;
}
Results:
8
32
64
16
128
But note that, for x == 0, __builtin_clz return is undefined.
Solution 26:[26]
If you need it for OpenGL related stuff:
/* Compute the nearest power of 2 number that is
* less than or equal to the value passed in.
*/
static GLuint
nearestPower( GLuint value )
{
int i = 1;
if (value == 0) return -1; /* Error! */
for (;;) {
if (value == 1) return i;
else if (value == 3) return i*4;
value >>= 1; i *= 2;
}
}
Solution 27:[27]
Convert it to a float and then use .hex() which shows the normalized IEEE representation.
>>> float(789).hex()
'0x1.8a80000000000p+9'
Then just extract the exponent and add 1.
>>> int(float(789).hex().split('p+')[1]) + 1
10
And raise 2 to this power.
>>> 2 ** (int(float(789).hex().split('p+')[1]) + 1)
1024
Solution 28:[28]
from math import ceil, log2
pot_ceil = lambda N: 0x1 << ceil(log2(N))
Test:
for i in range(10):
print(i, pot_ceil(i))
Output:
1 1
2 2
3 4
4 4
5 8
6 8
7 8
8 8
9 16
10 16
Solution 29:[29]
import sys
def is_power2(x):
return x > 0 and ((x & (x - 1)) == 0)
def find_nearest_power2(x):
if x <= 0:
raise ValueError("invalid input")
if is_power2(x):
return x
else:
bits = get_bits(x)
upper = 1 << (bits)
lower = 1 << (bits - 1)
mid = (upper + lower) // 2
if (x - mid) > 0:
return upper
else:
return lower
def get_bits(x):
"""return number of bits in binary representation"""
if x < 0:
raise ValueError("invalid input: input should be positive integer")
count = 0
while (x != 0):
try:
x = x >> 1
except TypeError as error:
print(error, "input should be of type integer")
sys.exit(1)
count += 1
return count
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow

