niedziela, 9 marca 2014

Mask for zero/non-zero bytes

The description of Determine if a word has a zero byte from "Bit Twiddling Hacks" says about haszero(v): "the result is the high bits set where the bytes in v were zero".

Unfortunately this is not true. High bits are also set for ones followed zeros, i.e. haszero(0xff010100) = 0x00808080. Of course the result is still valid (non-zero if there were any zero byte), but if we want to iterate over all zeros or find last zero index, this could be a problem.

It's possible to create exact mask:
uint32_t nonzeromask(const uint32_t v) {
    // MSB are set if any of 7 lowest bits are set
    const uint32_t nonzero_7bit = (v & 0x7f7f7f7f) + 0x7f7f7f7f;

    return (v | nonzero_7bit) & 0x80808080;
}

uint32_t zeromask(const uint32_t v) {
     // negate MSBs
     return (nonzeromask(v)) ^ 0x80808080;
}

Function nonzeromask requires 4 simple instructions, and zeromask one additional xor.

Brak komentarzy: