
For my application, I have to store billions of MD5 values, and before storing a new MD5 value I want to make sure that it is not already present. I tried a hash table; it worked well, but the number of lookups grows with the number of table entries. I also tried a trie in which each node stores one byte, but the memory consumption is too high, because each node needs 256 pointers to its successors on the next level. Can anyone suggest a better data structure that uses less space and also keeps the number of lookups low?

Raphael
Velmurugan

1 Answer


Maybe you could use a binary trie (a crit-bit tree, for example) that splits two ways at each node instead of 256 ways. You get many more nodes, of course, but each one is a lot smaller. An example implementation in Java is here: critbit
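To make the idea concrete, here is a minimal sketch of a crit-bit set for 16-byte keys such as MD5 digests. It is not the implementation linked above; the class name `CritBitSet` and its methods are made up for illustration, and the sketch assumes every key is exactly 16 bytes long. Each internal node stores only the index of the first bit at which its two subtrees differ, plus two child references, instead of 256 pointers.

```java
import java.util.Arrays;

// Hypothetical illustration of a crit-bit (binary PATRICIA) set for fixed-length keys.
final class CritBitSet {
    private static final int KEY_BYTES = 16; // assumption: MD5 digests, 128 bits

    private static final class Node {
        int bit = -1;      // index of the distinguishing bit; -1 marks a leaf
        byte[] key;        // set only on leaves
        Node left, right;  // set only on internal nodes (left = bit 0, right = bit 1)
    }

    private Node root;
    private int size;

    // Bit i of the key, counting from the most significant bit of byte 0.
    private static int bitAt(byte[] key, int i) {
        return (key[i >>> 3] >>> (7 - (i & 7))) & 1;
    }

    // Index of the first bit where a and b differ, or -1 if they are equal.
    private static int firstDiffBit(byte[] a, byte[] b) {
        for (int i = 0; i < a.length; i++) {
            int x = (a[i] ^ b[i]) & 0xFF;
            if (x != 0) return i * 8 + Integer.numberOfLeadingZeros(x) - 24;
        }
        return -1;
    }

    private static void check(byte[] key) {
        if (key.length != KEY_BYTES) throw new IllegalArgumentException("expected 16-byte key");
    }

    // Follow the key's bits down to the single leaf it could match.
    private Node closestLeaf(byte[] key) {
        Node n = root;
        while (n.bit >= 0) n = bitAt(key, n.bit) == 0 ? n.left : n.right;
        return n;
    }

    public boolean contains(byte[] key) {
        check(key);
        return root != null && Arrays.equals(closestLeaf(key).key, key);
    }

    // Returns true if the key was added, false if it was already present.
    public boolean add(byte[] key) {
        check(key);
        Node leaf = new Node();
        leaf.key = key.clone();
        if (root == null) { root = leaf; size++; return true; }

        int diff = firstDiffBit(key, closestLeaf(key).key);
        if (diff < 0) return false; // duplicate

        // Walk again to the first node that tests a bit after 'diff'.
        Node parent = null, cur = root;
        while (cur.bit >= 0 && cur.bit < diff) {
            parent = cur;
            cur = bitAt(key, cur.bit) == 0 ? cur.left : cur.right;
        }

        // Splice a new internal node in above 'cur'.
        Node fresh = new Node();
        fresh.bit = diff;
        if (bitAt(key, diff) == 0) { fresh.left = leaf; fresh.right = cur; }
        else { fresh.left = cur; fresh.right = leaf; }

        if (parent == null) root = fresh;
        else if (parent.left == cur) parent.left = fresh;
        else parent.right = fresh;
        size++;
        return true;
    }

    public int size() { return size; }
}
```

A lookup visits at most 128 internal nodes for a 128-bit key and does one full key comparison at the leaf, and `add` doubles as the "does it exist already?" check because it returns `false` for a duplicate. Note that an object-per-node layout like this one still carries per-object overhead in Java, and the raw keys alone take 16 bytes each (about 16 GB per billion), so a real deployment at that scale would likely need a more compact or disk-backed layout.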

TilmannZ