Skip to content

Instantly share code, notes, and snippets.

@mdeland
Created November 8, 2014 23:56
Show Gist options
  • Save mdeland/44de8978d4cb6ee7deed to your computer and use it in GitHub Desktop.
Save mdeland/44de8978d4cb6ee7deed to your computer and use it in GitHub Desktop.
w32 + haskell
import Data.Bits.Extras
import Data.Hashable
{-
:info w32
w32 :: Integral a => a -> GHC.Word.Word32
-- Defined in ‘Data.Bits.Extras’
-}
filter (/= 0) (map (w32 . hash) ([1.0 .. 10000.0]::[Double]))
filter (== 0) (map (w32 . hash) ([1 .. 10000]::[Int]))
@mdeland
Copy link
Author

mdeland commented Nov 8, 2014

this is relevant because of how hashes are computed for use in hyperloglog

https://github.com/ekmett/hyperloglog/blob/master/src/Data/HyperLogLog/Type.hs#L137

@mkscrg
Copy link

mkscrg commented Nov 10, 2014

I get why this is happening, not sure who to complain to.

To hash a Double (64-bit float), hashable casts it to a Word64 (64-bit unsigned int) with the same bits. That's very fast (roughly free) but you get hashes with the same bit layout as IEEE doubles:

ghci> import Data.Hashable
ghci> hash (8 :: Double) == 0x4020000000000000
True

That's relevant because w32 converts Int to Word32 by throwing out the most significant 32 bits. So

w32 (hash d) == 0

for a lot of "simple" Double values.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment