How Hash Tables Actually Work

Google’s email says: "Understand how they work and be prepared to explain or even implement one."

A hash table (Python’s dict, C#'s Dictionary, TS’s Map) stores key-value pairs using a hash function to map keys to array indices. This is how O(1) average-case lookups work.

The Process

  1. Hash the key: Convert the key to an integer using a hash function. Python uses hash().

  2. Map to index: index = hash(key) % array_size — maps the hash to a valid array slot.

  3. Store/Retrieve: Put the value at that index. On retrieval, hash the key again to find the index.

Collision Handling

Two different keys can hash to the same index — this is a collision. Two main strategies:

1. Chaining (Python’s approach)

Each array slot holds a linked list (or list) of (key, value) pairs. When a collision occurs, append to the list at that slot.

  • Lookup: Hash → find slot → scan the list for the matching key

  • Worst case: O(n) if all keys hash to the same slot (degenerate)

  • Average case: O(1) when load factor is low

2. Open Addressing

When a collision occurs, probe for the next empty slot (linear probing, quadratic probing, or double hashing).

  • Requires the load factor to stay below ~0.7 (resize when exceeded)

  • Better cache performance (contiguous memory)

  • More complex deletion (requires "tombstone" markers)

Load Factor and Resizing

load_factor = number_of_entries / array_size

When the load factor exceeds a threshold (Python uses ~0.67), the table resizes (typically doubles). This requires rehashing ALL existing keys — O(n) for that one operation, but amortized O(1) per insert.

What Google Wants to Hear

"A hash table provides O(1) average-case lookup, insert, and delete by using a hash function to map keys to array indices. Collisions are handled by chaining or open addressing. The worst case is O(n) if many keys hash to the same index, but with a good hash function and low load factor, this is extremely rare. Resizing is amortized O(1)."

Exercises