So, suppose I have some object of a certain size (947, for example), and I need to find the smallest power-of-2 that's large enough to hold my object (in this case, 210 or 1024). We'll start with a fairly simple algorithm:
int val = 947, powof2 = 1;
// Double powof2 until >= val
while( powof2 < val ) powof2 <<= 1;
This will work, and if you're in a hurry, this is probably what will get used. If you're just calculating the size for a texture, then this is unlikely to be performance-critical code, so it may not be worth your time to optimize this. This might become a bottleneck if you're writing a memory manager, however; how can you do this more quickly?
This code has a few weaknesses. Firstly, it uses a loop, which means that the code has to run a test and then split in two possible directions after every iteration. Conditional branches of this sort aren't a huge deal on a modern processor, but they're still slower than avoiding branches altogether. Secondly, the loop can iterate quite a few times--up to 31, in fact. Thirdly, if the value stored in 'val' is over 231, and if you're using unsigned integers, then this can become an infinite loop. Hardly desirable!
It turns out that there's a fairly slick algorithm to solve this problem. The basic idea is to take your number (947 again: 0000 0011 1011 0011), and use bit operations to convert it to a solid string of 1's with the same number of digits (in this case, 0000 0011 1111 1111). Then, add 1, and you'll have the next power of 2 (0000 0100 0000 0000, or 1024). To convert any number into a solid string of 1's, we'll use right-shift and or repeatedly:
val = (val >> 1) | val;
val = (val >> 2) | val;
val = (val >> 4) | val;
val = (val >> 8) | val;
val = (val >> 16) | val;
Why do we do it five times? Well, that's because val is a 32-bit integer, and every time you iterate, you're doubling the number of digits you fill. After five steps, you've filled 25 digits, and you're done. If you're on a 64-bit architecture (26) you'll need to repeat 6 times. The steps on 947 look like:
0000 0011 1011 0011
0000 0011 1111 1011
0000 0011 1111 1111
... with the remaining steps staying at this value.
There's one small problem with this algorithm: it doesn't work properly for the powers of two themselves. It will give the next highest power of 2. This is pretty easy to fix, though: Just subtract one from the value before you begin, and you'll get the right answer in every case. Here's the final algorithm:
val = ... // Get input
val--;
val = (val >> 1) | val;
val = (val >> 2) | val;
val = (val >> 4) | val;
val = (val >> 8) | val;
val = (val >> 16) | val;
val++; // Val is now the next highest power of 2.
One final thought: What does it do on zeroes? It turns out that if you pass the algorithm a zero or any value greater than 231, it will return zero (try it if you're not sure why). Unless you're certain you will never pass either of these values, you should handle having zero as a return value.
That's all for today. Happy bit-twiddling!