Jaa


Bit Fiddling - 3

Now that we have seen some trivial and  fast approaches to do bit counting. Lets look into one brilliant parallent counting solution.

Parallel Counting

5. MIT HAKMEM Count

HAKMEM (Hacks Memo) is a legendary collection of neat mathematical and programming hacks contributed mostly by people at MIT and some elsewhere. This source is from the MIT AI LABS and this brilliant piece of code orginally in assembly was probably conceived in the late 70's.

 

 int BitCount(unsigned int u)                         
 {
         unsigned int uCount;
    
         uCount = u - ((u >> 1) & 033333333333) - ((u >> 2) & 011111111111);
         return ((uCount + (uCount >> 3)) & 030707070707) % 63;
 }

 

Lets take a look at the theory behind this idea.

Take a 32bit number n;
n =  a31 * 231 + a30 * 230 +.....+ ak * 2 +....+ a * 2 + a0;

Here a0 through a31 are the values of bits (0 or 1) in a 32 bit number. Since the problem at hand is to count the number of set bits in the number, simply summing up these co-efficients would yeild the solution. (a0 + a1 +..+ a31 ).

How do we do this programmatically?

Take the original number n and store in the count variable.
count=n;

Shift the orignal number 1 bit to the right and subtract from the orignal.
count = n - (n >>1);

Now Shift the original number 2 bits to the right and subtract from count;
count = n - (n>>1) - (n>>2);

Keep doing this until you reach the end.
count = n - (n>>1) - (n>>2) - ... -( n>>31);

Let analyze and see what count holds now.
n         = a31 * 231 + a30 * 230 +.....+ ak * 2 +....+ a * 2 + a0;

n >> 1 = a31 * 230 + a30 * 229 +.....+ ak * 2k-1  +....+ a1;
n >> 2 = a31 * 229 + a30 * 228 +.....+ ak* 2k-2 +....+ a2

;
..
n >> k = a31 * 2(31-k) + a30 * 2(30-k) +.....+ ak * 2k;

..
n>>31 = a31;

You can quickly see that: (Hint: 2k - 2k-1  = 2k-1 ;)
count = n - (n>>1) - (n>>2) - ... -( n>>31) =a31+ a30 +..+a0;
which is what we are looking for;

 

int BitCount(unsigned int u)
{
       unsigned int uCount=u;
       do
       {
             u=u>>1;
             uCount -= u;  
       }
       while(u);
}

 

 

This certainaly is an interesting way to solve this problem. But how do you make this brilliant? Run this in constant time with constant memory!!.
 

 int BitCount(unsigned int u)                         
 {
         unsigned int uCount;
    
         uCount = u - ((u >> 1) & 033333333333) - ((u >> 2) & 011111111111);
         return ((uCount + (uCount >> 3)) & 030707070707) % 63;
 }

 

For those of you who are still wondering whats going? Basically use the same idea, but instead of looping over the entire number, sum up the number in blocks of 3 (octal) and count them in parallel.

After this statement uCount = n - ((n >> 1) & 033333333333) - ((n >> 2) & 011111111111);
uCount has the sum of bits in each octal block spread out through itself.

So if you can a block of 3 bits

u = a2*22 + a1*2+ a0;
u>>1 =
a2*2 + a1;
u>>2 =
a2;

u - (u>>1) - (u>>2) is a2+a1+a0 which is the sum of bits in each block of 3 bits.

The nexe step is to grab all these and sum them up:

((uCount + (uCount >> 3)) will re-arrange them in blocks of 6 bits and sum them up in such a way the every other block of 3 has the sum of number of set bits in the original number plus the preceding block of 3. The only expection here is the first block of 3.  The idea is not to spill over the bits to adjacent blocks while summing them up. How is that made possible. Well, the maximum number of set bits in a block of 3 is 3, in a block of 6 is 6. and 3 bits can represent upto 7. This way you make sure you dont spill the bits over. To mask out the junk while doing a uCount>>3. Do and AND with  030707070707. THe only expection is the first block as I just mentioned.

What does ((uCount + (uCount >> 3)) & 030707070707) hold now?
Its 2^0 * (2^6 - 1) * sum0 +  2^1 * (2^6 - 1) * sum1 + 2^2 * (2^6 - 1) * sum2  + 2^3 * (2^6 - 1) * sum3 + 2^4 * (2^6 - 1) * sum4 + 2^5 * (2^3 - 1) * sum5
where sum0 is the sum of number of set bits in every block of 6 bits starting from the 'low' position.
What we need is sum0 + sum1 + sum2 + sum3 + sum4 + sum5;
2^6-1 is 63. Clearly a modulo with 63 will get you what you want.

Remember, that this works only with 32 bits numbers, For 64-bit numbers (well I will wait for comments or do this in the next post).

Comments

  • Anonymous
    June 12, 2005
    I have seen this algorithm before, but this is the best explanation I have seen so far. I totally understand how it works now. Great work!!
  • Anonymous
    August 01, 2005
    I didn't understand the explanation u gave for ((uCount + (uCount >> 3)) & 030707070707). Can you explain how this equates to 2^0 * (2^6 - 1) * sum0 + 2^1 * (2^6 - 1) * sum1 + 2^2 * (2^6 - 1) * sum2 + 2^3 * (2^6 - 1) * sum3 + 2^4 * (2^6 - 1) * sum4 + 2^5 * (2^3 - 1) * sum5 ?

    But u have done a great job explaining the MIT HACKMEM count.. Keep it up...
  • Anonymous
    August 01, 2005
    ((uCount + (uCount >> 3)) will take sets of 3 bits each, right shift it and add it with the original. Since this is done parallely, in every set of 6 bits, there will be a unnecessary set of 3 bits after summation. This needs to go away, and is achieved by masking. with 07 (except for the Most singicant set). This is because there are 32 bits in an int, so there are 5 sets of 6 bits plus 2 bits (MSB set) This can be masked with 03. And thats why the masking is done with 030707070707. Now the number looks base 6 number, and using the general theory to compute the value, this will equate to 2^0 * (2^6 - 1) * sum0 + 2^1 * (2^6 - 1) * sum1 + 2^2 * (2^6 - 1) * sum2 + 2^3 * (2^6 - 1) * sum3 + 2^4 * (2^6 - 1) * sum4 + 2^5 * (2^3 - 1) * sum5


    E.g.

    uCount = 01010101010101001101101010110111
    uCount>>3 = 00001010101010101001101101010110

    ------------------------------
    sum = 01011111111111110111011000001101
    &030707070707= 11000111000111000111000111000111
    -----------------------------------------------
    result = 01000111000111000111000000000101

    If you look at the final result every other 3rd set of bits are masked out which makes it looks like a base 6 number.

    result % 63 will now be the number of set bits in the original number. Why 63? this is because thats the max number that can be stored in 6 bits.


  • Anonymous
    August 01, 2005
    Thanks for the illustration... I had worked out the whole procedure with an example. It works flawlessly. But i'm still in the dark with "Now the number looks base 6 number, and using the general theory to compute the value, this will equate to 2^0 * (2^6 - 1) * sum0 + 2^1 * (2^6 - 1) * sum1 + 2^2 * (2^6 - 1) * sum2 + 2^3 * (2^6 - 1) * sum3 + 2^4 * (2^6 - 1) * sum4 + 2^5 * (2^3 - 1) * sum5". How to actually calculate the value of the result considering it as a base 6 number and arrive at the above expression. Can you please illustrate that ?

    Hope i'm not asking something that is too obvious to everyone...
  • Anonymous
    August 02, 2005
    http://mathworld.wolfram.com/Base.html

    This should give you a basic idea of different base representations. You will see that we are using the senary base. (not a commonly used system) The reason 'result' equates to the above expression is beacuse, you are summing blocks of 6 bits together. So in binary represtation 101 will read as 2^0 * 1 + 2^1 * 0 + 2^2 * 1. Similary a senary base will follow the same theory. Now when you mod the result by 63 you will get sum0 + sum1 +... sum5 which is the number of bits in that number.




  • Anonymous
    October 15, 2005
    The explanation of the final step is incorrect.

    What does ((uCount + (uCount >> 3)) & 030707070707) hold now?

    sum0 lives in the bottom three bits, so the term "2^0 * (2^6 - 1) * sum0" is clearly wrong.

    The sums live in groups of six bits so we can treat this as a base 64 number (not base 6). This gives the answer:

    ((uCount + (uCount >> 3)) & 030707070707)

    = 64^0 * sum0 + 64^1 * sum1 + 64^2 * sum2 + 64^3 * sum3 + 64^4 * sum4 + 64^5 * sum5

    = (sum0 + sum1 + sum2 + sum3 + sum4 + sum5) + (64^1 - 1) * sum1 + (64^2 - 1) * sum2 + (64^3 - 1) * sum3 + (64^4 - 1) * sum4 + (64^5 - 1) * sum5

    = (sum0 + sum1 + sum2 + sum3 + sum4 + sum5) + 63 * (sum1 + (64 + 1) * sum2 + (64^2 + 64 + 1) * sum3 + (64^3 + 64^2 + 64 + 1) * sum4 + (64^4 + 64^3 + 64^2 + 64 + 1) * sum5)

    = (sum0 + sum1 + sum2 + sum3 + sum4 + sum5) + (63 * k)
    for some k.

    Hence the result.

    Factoring out the 63 looked a bit complicated, but it's the same thing as (10^n - 1) being divisible by 9 for all n. e.g. 9, 99, 999, etc. But we're in base 64 so (64^n - 1) is divisible by 63 for all n.

  • Anonymous
    August 08, 2006
    Wikipedia has an excellent explanation of this algorithm:
    http://en.wikipedia.org/wiki/Hamming_weight
  • Anonymous
    November 27, 2007
    Any Ideas about how to code the 64 Bit Version?