Question about kBitMask in 5.1 - Compute Parity



I’ve just started the book and am already stuck on the first question in 5.1. Going through the solution, I understand the general idea of taking a 64-bit word and breaking it into four 16-bit words to keep the lookup table a feasible size. However, in the solution, to compute the parities for the four 16-bit portions I see:

precomputed_parity[x >> (3 * kWordSize)] ^
precomputed_parity[(x >> (2 * kWordSize)) & kBitMask] ...

Why is there an AND operation inside that second lookup of precomputed_parity?



Hey Kavinder,

The AND operation extracts the 16 bits we are interested in. If you don’t do that, the upper bits of x will still be present after x >> (2 * kWordSize), which is not what we want, since precomputed_parity only contains results for 16-bit words. Here is an example: assume you have a 64-bit word x in which all bits are 1. After x >> (2 * kWordSize), the result is 32 0s followed by 32 1s. Since we are only interested in the lowest 16 bits, we use kBitMask to extract them.
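In code, the idea looks something like the following minimal C++ sketch. kWordSize, kBitMask, and precomputed_parity follow the names used in the solution; DirectParity and BuildTable are hypothetical helpers I’m adding just to make the snippet self-contained:

```cpp
#include <array>
#include <cassert>
#include <cstdint>

constexpr int kWordSize = 16;
constexpr uint64_t kBitMask = 0xFFFF;

// Parity of a value computed directly; used only to fill the table.
int DirectParity(uint64_t x) {
  int result = 0;
  while (x) {
    result ^= 1;
    x &= x - 1;  // clears the lowest set bit
  }
  return result;
}

// precomputed_parity[i] holds the parity of the 16-bit value i.
std::array<int, 1 << kWordSize> BuildTable() {
  std::array<int, 1 << kWordSize> table{};
  for (uint64_t i = 0; i < table.size(); ++i) {
    table[i] = DirectParity(i);
  }
  return table;
}

int Parity(uint64_t x) {
  static const auto precomputed_parity = BuildTable();
  // The top chunk needs no mask: the shift already discards all higher bits.
  // Every other chunk must be ANDed with kBitMask to drop the bits above it.
  return precomputed_parity[x >> (3 * kWordSize)] ^
         precomputed_parity[(x >> (2 * kWordSize)) & kBitMask] ^
         precomputed_parity[(x >> kWordSize) & kBitMask] ^
         precomputed_parity[x & kBitMask];
}
```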

Please let me know if you have any questions.



Just to follow up on this with a concrete example, suppose we are working with 8-bit words and a 2-bit lookup table.

The cache is <0, 1, 1, 0>; these are the parities of (00), (01), (10), (11), respectively. To compute the parity of (11101010), we would first compute the parities of the four 2-bit quantities (11), (10), (10), (10), respectively.

By table lookup (explained more after this) we see these are 0, 1, 1, 1, respectively, so the final result is the parity of 0, 1, 1, 1, which is 1.

Now to the details of the table lookup.

To look up the parity of the first two bits in (11101010), we right shift by 6 to get (00000011), and use this as an index into the cache.

However, to look up the parity of the next two bits (i.e., (10)), we right shift by 4 to get (10) in the two least-significant bit places, but the right shift also leaves the leading (11) in front of it (i.e., the right shift yields (00001110)).

We cannot index the cache with this; it would be out of bounds. To get just the last two bits after the right shift by 4, we bitwise-AND (00001110) with (00000011) (this is the “mask”), which returns (00000010). Similar masking is needed for the two other 2-bit lookups.
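The whole 8-bit example can be sketched in a few lines of C++ (EightBitParity is a hypothetical name; the cache and mask are exactly the ones described above):

```cpp
#include <cassert>
#include <cstdint>

// 2-bit lookup table: the parities of 00, 01, 10, 11.
int EightBitParity(uint8_t x) {
  static const int cache[] = {0, 1, 1, 0};
  const uint8_t mask = 0b11;        // keeps only the two lowest bits
  return cache[x >> 6] ^            // top chunk: shift alone suffices
         cache[(x >> 4) & mask] ^   // mask drops the leading bits
         cache[(x >> 2) & mask] ^
         cache[x & mask];
}
```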

Hope that helps, I will put in a line explaining this in the next release. Thanks for taking the trouble to point it out, we appreciate it.




@tsunghsienlee @adnanaziz Thanks for the great and quick response. Definitely did not expect to get a response from the authors.

Ok, that makes sense. I worked out an example using a 16-bit word and I can see why it works. Thanks!

A follow-up question I have relates to the last solution to 5.1, where you determine the parity of the 64-bit word by repeatedly XOR’ing it:

short Parity(unsigned long x) {
  x ^= x >> 32;
  x ^= x >> 16;
  x ^= x >> 8;
  x ^= x >> 4;
  x &= 0xf;
  return (kFourBitParityLookupTable >> x) & 1;
}

I understand that a 64-bit word’s parity is the XOR of the parities of its lower half and its upper half, which is mentioned in the preceding paragraph. I just can’t explain conceptually what the above code is doing in relation to finding the parity.



I also asked this question on Stack Overflow and got some good answers that made a lot of sense:

The confusing thing for me was not seeing that for every ^ operation the only bits that matter are the bottom 32, 16, 8, and 4; the bits above them become irrelevant.



The basic program is

short Parity(unsigned long x) {
  x ^= x >> 32;
  x ^= x >> 16;
  x ^= x >> 8;
  x ^= x >> 4;
  x ^= x >> 2;
  x ^= x >> 1;
  return x & 0x1;
}


There’s the same issue with respect to the meaningful bits being the lower half, quarter, etc., similar to the issue with the table lookup you brought up earlier, but it’s otherwise pretty clear.
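To make the meaningful-bits point concrete, here is the same program with comments marking which low bits carry combined parity information after each step (a sketch with my own annotations, not the book’s commented code):

```cpp
#include <cassert>

int Parity(unsigned long long x) {
  x ^= x >> 32;  // low 32 bits now hold the XOR of the two 32-bit halves
  x ^= x >> 16;  // low 16 bits hold the XOR of the four 16-bit quarters
  x ^= x >> 8;   // only the low 8 bits are meaningful from here on
  x ^= x >> 4;   // only the low 4 bits are meaningful
  x ^= x >> 2;   // only the low 2 bits are meaningful
  x ^= x >> 1;   // the lowest bit is now the parity of all 64 bits
  return x & 0x1;  // the mask discards everything above that one bit
}
```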

We optimized this program a tiny bit, at the cost of clarity, by getting the parity by table lookup once we are down to 4 bits.

You don’t need an array for the table lookup; you can use a single word, as long as it holds 16 bits. The idea is that the parities of 0, 1, 2, 3, 4, …, 15 are 0, 1, 1, 0, 1, 0, 0, 1, 1, 0, 0, 1, 0, 1, 1, 0, respectively. You can pack these bits into the following word:

const int kFourBitParityLookupTable = 0x6996; // = 0b0110100110010110.

Then the parity of x, assumed to be 15 or less, is (kFourBitParityLookupTable >> x) & 1. Essentially, the right shift is the array indexing.
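As a sketch (FourBitParity is a hypothetical wrapper name; the constant is the one above):

```cpp
#include <cassert>

// Bit x of kFourBitParityLookupTable is the parity of x, for 0 <= x <= 15.
const int kFourBitParityLookupTable = 0x6996;  // = 0b0110100110010110

int FourBitParity(int x) {
  // The right shift plays the role of array indexing.
  return (kFourBitParityLookupTable >> x) & 1;
}
```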

The optimized approach does show how you can mix table lookup and bit-wise operations, which are the two hallmarks of optimized bit-fiddling, and it’s marginally faster.

In retrospect, we should have stuck to the more direct approach, since it’s much clearer.

Let me know if you have questions about this or other problems, we always enjoy hearing from readers.




Would love to understand why the time complexity is O(n/L) in this approach. I’ve read it multiple times, but cannot make sense of it :frowning:

Thanks in advance.



If you were processing the entire n bits one bit at a time, the time complexity would be O(n). Since you actually process words of L bits, the time spent is O(n/L): you handle n/L words overall, L bits at a time.
In the grand scheme of things the L factor doesn’t matter very much, as it’s a constant factor and can usually be omitted; if you have trillions of bits and you process 64 bits at once, you are still left with billions of words.
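One way to picture handling L bits at once, with L = 64 (a sketch of the idea, not code from the book; BitStreamParity and WordParity are hypothetical names):

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// XOR-fold of a single 64-bit word, as in the basic program above.
int WordParity(uint64_t x) {
  x ^= x >> 32;
  x ^= x >> 16;
  x ^= x >> 8;
  x ^= x >> 4;
  x ^= x >> 2;
  x ^= x >> 1;
  return x & 0x1;
}

// Parity of n bits stored in n/64 words: one O(1) XOR per word, so the
// loop runs n/L times rather than n times.
int BitStreamParity(const std::vector<uint64_t>& words) {
  uint64_t folded = 0;
  for (uint64_t w : words) folded ^= w;  // processes 64 bits per iteration
  return WordParity(folded);
}
```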



Where do you generate precomputed_parity in Java? I tried doing it in a static block but it didn’t run, nor does it work with the constructor.

I suppose computing it for each test case is not good practice.