Questions tagged [hash-tables]

A finite map data structure that addresses stored values using a function that maps many values to few addresses.

A hash table is a finite map (i.e. a dictionary) from keys to associated values. It relies on a hash function to map each key to a small integer, which is typically used as an index into an array.

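For many kinds of data, hash tables are more efficient than other finite map data structures such as binary search trees: when the hash function spreads keys evenly across the array, lookups, insertions, and deletions take expected constant time.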
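
As a rough illustration of the description above, the following Python sketch implements a toy hash table that resolves collisions by separate chaining. The class name `ChainedHashTable`, the default of 8 buckets, and the choice of chaining are assumptions made for this example; the tag description does not prescribe any particular design.

```python
class ChainedHashTable:
    """Toy finite map: each key is hashed to an index into a fixed-size array."""

    def __init__(self, num_buckets=8):
        # One list ("chain") per array slot; colliding keys share a chain.
        self.buckets = [[] for _ in range(num_buckets)]

    def _index(self, key):
        # hash() maps the key to an integer; the modulus reduces it
        # to a valid array index in the range [0, num_buckets).
        return hash(key) % len(self.buckets)

    def put(self, key, value):
        bucket = self.buckets[self._index(key)]
        for i, (existing_key, _) in enumerate(bucket):
            if existing_key == key:
                bucket[i] = (key, value)  # overwrite an existing entry
                return
        bucket.append((key, value))

    def get(self, key, default=None):
        # Only the one chain selected by the hash is scanned.
        for existing_key, value in self.buckets[self._index(key)]:
            if existing_key == key:
                return value
        return default


table = ChainedHashTable(num_buckets=4)
for k in range(10):
    table.put(k, k * k)
print(table.get(3))
```

With 10 keys in 4 buckets, several keys necessarily share a chain, so a lookup costs time proportional to the chain length rather than a strict constant. Real implementations typically grow the array to keep chains short.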

270 questions
112
votes
7 answers

Why is it best to use a prime number as a mod in a hashing function?

If I have a list of key values from 1 to 100 and I want to organize them in an array of 11 buckets, I've been taught to form a mod function $$H = k \bmod 11.$$ Now all the values will be placed one after another in 9 rows. For example, in the…
CodyBugstein
75
votes
4 answers

(When) is hash table lookup O(1)?

It is often said that hash table lookup operates in constant time: you compute the hash value, which gives you an index for an array lookup. Yet this ignores collisions; in the worst case, every item happens to land in the same bucket and the lookup…
39
votes
2 answers

Hash tables versus binary trees

When implementing a dictionary ('I want to look up customer data by their customer IDs'), the typical data structures used are hash tables and binary search trees. I know for instance that the C++ STL library implements dictionaries (they call them…
Alex ten Brink
24
votes
1 answer

How Does Populating Pastry's Routing Table Work?

I'm trying to implement the Pastry Distributed Hash Table, but some things are escaping my understanding. I was hoping someone could clarify. Disclaimer: I'm not a computer science student. I've taken precisely two computer science courses in my…
20
votes
5 answers

For what kind of data are hash table operations O(1)?

From the answers to "(When) is hash table lookup O(1)?", I gather that hash tables have $O(1)$ worst-case behavior, at least amortized, when the data satisfies certain statistical conditions, and there are techniques to help make these conditions…
16
votes
4 answers

Why are graphs represented as adjacency lists instead of adjacency sets?

In answering this question, I was looking for references (textbooks, papers, or implementations) which represent a graph using a set (e.g. hashtable) for the adjacent vertices, rather than a list. That is, the graph is a map from vertex labels to…
Caleb Stanford
14
votes
4 answers

What are the advantages of cuckoo hashing over dynamic perfect hashing?

Dynamic perfect hash tables and cuckoo hash tables are two different data structures that support worst-case O(1) lookups and expected O(1)-time insertions and deletions. Both require O(n) auxiliary space and access to families of hash functions for…
templatetypedef
14
votes
3 answers

What does "non-pathological data" mean?

I took an algorithms class on Coursera. The professor in the video about hash tables said: "What's true is that for non-pathological data, you will get constant time operations in a properly implemented hash table." What does "non-pathological…
14
votes
1 answer

Universal Hashing in Practice

A family $H$ of hash functions $h: U \rightarrow \{0,\ldots,M-1\}$ is universal if $$\forall x,y \in U, x \neq y \Rightarrow \Pr_{h \in H}[h(x) = h(y)] \leq \frac{1}{M}$$ You can find more about universal hashing in this Wikipedia article. The concept…
Dai
12
votes
3 answers

How are hash table's values stored physically in memory?

Question: How are a hash table's values stored in memory such that space is efficiently used and values don't have to be relocated often? My current understanding (could be wrong): Let's say I have 3 objects stored in a hash table. Their hash…
Pwner
11
votes
2 answers

Why should one not use a 2^p size hash table when using the division method as a hash function?

I don't understand what is meant by: "m should not be a power of 2, since if m = 2^p, then h(k) is just the p lowest-order bits of k." (pg. 231 of CLRS) Terms defined: m: size of hash table; h(k): hash function = k mod m; k: key. I don't understand…
zallarak
11
votes
2 answers

Hashing using search trees instead of lists

I am struggling with the material on hashing and binary search trees. I read that instead of using lists to store entries with the same hash value, it is also possible to use binary search trees, and I am trying to understand what the worst-case and…
11
votes
2 answers

Why is the Java HashMap load factor 0.75?

I can't understand why the Java HashMap load factor is 0.75. If I understand correctly, the formula for the load factor is n/m, where n is the number of keys and m is the number of positions in the hash table. Since HashMap uses buckets (i.e., a linked…
Bender
10
votes
3 answers

Why is a (collision-less) hashtable lookup really O(1)?

Disclaimer: I know there are similar-sounding questions already here and on Stack Overflow. But they are all about collisions, which is not what I am asking about. My question is: why is collision-less lookup O(1) in the first place? Let's assume I…
Foo Bar
9
votes
2 answers

How to avoid cascading resizes when resizing hash tables?

With conventional collision resolution methods like separate chaining and linear/quadratic probing, the probe sequence for a key can be arbitrarily long - it is simply kept short with high probability by keeping the load factor of the table low.…
Anonymous