Numeric buggery

5

Quick, how is this broken:

// slot an identifier into a bucket
int getBucketIndex(long id, int buckets) {
return ((int) id) % buckets;
}

This kind of code exists to pick a bucket to slot an identifier into. The problem here is the downcast from the long to an int. Downcasting a long with the 31st bit set to an int will return a negative number and modding a negative number will return a negative number, which is never a good index into an array. :)

One solution is to unset the bit:

int getBucketIndex(long id, int buckets) {
return (((int) id) & 0x7fffffff) % buckets;
}

Comments

5 Responses to “Numeric buggery”
  1. Daniel Yokomizo says:

    Why not just: (int) (id % buckets)? It’s numerically guaranteed that the result will fit an int. The only problem is if the id can be negative, then we need either to abs() the id or the result.

  2. Alex says:

    The problem is exactly that the result can’t be negative. You might note that abs() takes and returns a double, so you would actually need something like:

    (int) (Math.abs(((int)id) % buckets))

    This came up in a fix at another project and I wasn’t involved so I can’t really comment on why flipping the bit was chosen, but I suspect it has a lot to do with performance and avoiding converting from integral to floating point and back just for Math.abs(). Any code that is finding a bucket per index is probably something getting called a whole lot and is thus very performance sensitive.

  3. Daniel Yokomizo says:

    I said abs() instead of Math.abs() because I was talking about the concept. Let me be more clear: as buckets is an int it’s guaranteed that (id % buckets) fits an int, so if both id and buckets are guaranteed to be positive (e.g. we can check with an assert or a if test) the result is guaranteed to fit an int and be positive. If either may be negative we need to abs() the result, the implementation may be calling Math.abs() or doing some numeric trick (e.g. bit munging).

  4. Ulf says:

    Remember that Math.abs can return a negative value. The following code is wrong and will fail at the worst possible moment:

    return Math.abs((int) id) % buckets;

    If you don’t believe me, read the javadoc for Math.abs. :D

  5. Daniel Yokomizo says:

    Interesting. According to the JLS:

    The Java programming language uses two’s-complement representation for integers, and the range of two’s-complement values is not symmetric, so negation of the maximum negative int or long results in that same maximum negative number. Overflow occurs in this case, but no exception is thrown.

    So it makes sense that -(-2147483648) == (-2147483648) but one wishes that it threw an overflow exception.

    IMO the Math.abs is broken as it should never return a negative value.

Speak Your Mind

Tell us what you're thinking...
and oh, if you want a pic to show with your comment, go get a gravatar!