Skip to content
Reini Urban edited this page Sep 17, 2020 · 4 revisions

abbrev. for Undefined behaviour, as signalled with -fsanitize=undefined.

Typical undefined behaviour (UB) problems

Misaligned

Many word-wise hashes (in opposite to safe byte-wise processing) don't check the input buffer for proper word alignment, which will fail with ubsan or Sparc or old ARM CPU's. word being int32_t or int64_t or even more. On some old RISC hardware this will be a BUS error, you can even let Intel HW generate such a bus error by setting some CPU flag. But generally using misaligned accesses is fine.

These are: mx3, Spooky, mirhash (but not strict), MUM, fasthash, Murmur3*, Murmur2*, metrohash* (all but cmetro*), Crap8, discohash, beamsplitter, lookup3, fletcher4, fletcher2, all sanmayce FNV1a_ variants (FNV1a_YT, FNV1A_Pippip_Yurii, FNV1A_Totenschiff, ...), fibonacci.

The usual mitigation is to check the buffer alignment either in the caller, provide a pre-processing loop for the misaligned prefix, or copy the whole buffer into a fresh aligned area. Put that extra code inside #ifdef HAVE_ALIGNED_ACCESS_REQUIRED.

oob - Out of bounds

Some hash function assume a padded input buffer which can be accessed past its length up to the word size. This allows for faster loop processing, as no 2nd loop or switch table for the rest is needed, but it requires a cooperative calling enviroment and is as such considered cheating. This is tested in the Sanity Tests

Signed integer overflow

A simple type error, this hash needs to use unsigned integer types internally, to avoid undefined and inconsistent behaviour. i.e. SuperFastHash: signed integer overflow: -2147483641 + -113 cannot be represented in type 'int'

shift exponent overflow

With: FNV1A_Pippip_Yurii, FNV1A_Totenschiff, pair_multiply_shift, sumhash32 shift exponent 64 is too large for 64-bit type 'long unsigned int'

Clone this wiki locally