
A hash table is a data structure that stores key–value pairs and allows extremely fast lookups, insertions, and deletions, typically in O(1) average time.
In other words, if you know the key, you can retrieve the associated value almost instantly, regardless of how much data is stored.
You can think of a hash table as a more powerful version of an array.
While arrays allow fast access by index (arr[10]), hash tables generalize the idea so that any data can be used as a key: numbers, strings, objects, etc.
This is why most programming languages expose hash tables as built-in dictionary types:
- JavaScript/TypeScript: Map and object literals ({})
- Python: dict
- Java: HashMap
- Go: map

In practice, hash tables are everywhere: caches, database indexes, compiler symbol tables, frequency counters, and set-membership checks.
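For instance, TypeScript's built-in Map exposes insert, lookup, and delete directly (the names below are illustrative):

```typescript
const ages = new Map<string, number>();
ages.set("alice", 30); // insert
ages.get("alice");     // lookup: 30
ages.has("bob");       // membership test: false
ages.delete("alice");  // delete: true
```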
Hash tables work by converting a key into an index of an internal array using a function called a hash function.
Instead of searching linearly through elements, the hash function jumps directly to the memory location where the value should be stored.
If the hash function distributes keys well, most operations have O(1) amortized complexity, meaning they take constant time on average, even if occasionally a slower operation occurs (we’ll see why later).
Because of this combination of flexibility and performance, hash tables are one of the most important data structures in algorithms and system design.
If you know how to use them effectively, many seemingly complex problems become simple.
A hash function is the core component of a hash table.
Its job is simple: take a key of any type (number, string, object) and convert it into an integer index into an internal array, as we saw earlier.
This allows a hash table to perform lookups without searching through all elements: it knows exactly where to store and retrieve a value.
A practical hash function must satisfy three properties:
- Deterministic: the same key must always produce the same index. If "abc" sometimes mapped to index 3 and other times to 9, retrieval would be impossible.
- Fast to compute: the function runs on every insert, lookup, and delete, so it must be cheap.
- Uniform: keys should spread evenly across the table to keep collisions rare.

For example, integers can be hashed using a simple function like the one below. This returns a number in the range [0, size - 1], which is a valid index in the hash table.
```typescript
function hashInt(x: number, size: number): number {
  // The extra "+ size" step keeps the result non-negative even for
  // negative x (in JavaScript, -3 % 10 === -3, not 7).
  return ((x % size) + size) % size;
}
```
Strings require combining characters into a single numeric value. A common method is to treat the string as a polynomial in a small prime base, such as 31.
```typescript
function hashString(s: string, size: number): number {
  let hash = 0;
  for (let i = 0; i < s.length; i++) {
    // Treat the string as a base-31 polynomial over its character codes;
    // taking the modulus at every step keeps the intermediate value small.
    hash = (hash * 31 + s.charCodeAt(i)) % size;
  }
  return hash;
}
```
Even with a great hash function, multiple keys may map to the same index: there are far more possible keys than slots, so by the pigeonhole principle, collisions must exist.
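For instance, with a table of size 10 and the hashInt function from above:

```typescript
hashInt(12, 10); // 2
hashInt(22, 10); // 2: both keys map to index 2, a collision
```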
What matters is not avoiding collisions completely (that’s impossible), but handling them efficiently. That leads to the next topic: collision resolution.
Since different keys may produce the same hash value, a hash table must handle collisions.
Two major strategies exist: chaining and open addressing.
With chaining, each position in the table stores a bucket (a linked list or dynamic array) of the key–value pairs that hashed to that index.
Example:
| Index | Values |
|---|---|
| 0 | — |
| 1 | ("cat" → 7), ("car" → 5) |
| 2 | — |
| 3 | ("dog" → 2), ("door" → 9), ... |
If two keys collide (same index), they are simply appended to the bucket list.
Operations:
- Insert: hash the key and append the pair to the bucket at that index.
- Lookup: hash the key and scan only that bucket for a match.
- Delete: hash the key and remove the matching pair from its bucket.
Chaining is simple and efficient as long as the load factor stays small.
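The essence of chaining fits in a few lines. This sketch reuses the hashString function from earlier (names like chainPut are illustrative); a full class-based implementation follows later in this article:

```typescript
type Pair = { key: string; value: number };

// One bucket per index; colliding keys simply share a bucket.
const buckets: Pair[][] = Array.from({ length: 8 }, () => []);

function chainPut(key: string, value: number): void {
  buckets[hashString(key, buckets.length)].push({ key, value });
}

function chainGet(key: string): number | undefined {
  const bucket = buckets[hashString(key, buckets.length)];
  return bucket.find((p) => p.key === key)?.value; // scan only this bucket
}
```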
In open addressing, the table stores key–value pairs directly inside the array. When a collision happens, instead of building a list, the algorithm probes other positions in the array until it finds an empty slot.
Insertion and lookup follow the same procedure:
1. Compute the index from the key's hash.
2. If the slot is empty, or already holds the same key, use it.
3. Otherwise, probe the next position according to the probing strategy and repeat.

This is why every slot in open addressing must store both the key and the value: during lookup, the algorithm needs to check whether the key found at a probed position is the one we're searching for.
There are different probing strategies:
- Linear probing: try i, i+1, i+2, …
- Quadratic probing: try i+1, i+4, i+9, …
- Double hashing: try i + j * hash2(key), where j is the attempt number and hash2 is a second hash function

A common challenge in open addressing is clustering, where large consecutive blocks of full cells form over time, slowing down insertion and lookup. Double hashing usually mitigates clustering better than linear or quadratic probing, but it's slightly more expensive to compute.
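As a rough illustration, here is a minimal sketch of insertion with linear probing. It assumes non-negative integer keys and a table that never completely fills; a real implementation would also resize and handle deletions:

```typescript
type Slot = { key: number; value: string } | null;

function probeInsert(table: Slot[], key: number, value: string): void {
  let index = key % table.length;
  // Probe i, i+1, i+2, ... until we find an empty slot or the same key.
  while (true) {
    const slot = table[index];
    if (slot === null || slot.key === key) break;
    index = (index + 1) % table.length; // wrap around the end of the array
  }
  table[index] = { key, value }; // insert the pair, or overwrite the same key
}

const table: Slot[] = new Array(8).fill(null);
probeInsert(table, 3, "dog");
probeInsert(table, 11, "cat"); // 11 % 8 === 3 is taken, so "cat" lands at index 4
```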
Both chaining and open addressing work well in practice, and most languages and runtimes choose one based on memory layout and performance requirements.
| Feature | Chaining | Open Addressing |
|---|---|---|
| Storage | Buckets with lists | All keys stored in table |
| Memory usage | Higher (extra pointers/arrays) | Lower |
| Cache friendliness | Lower | Higher |
| Performance under high load | Degrades gracefully | Can degrade sharply |
| Deletion | Simple | More complex |
Most real hash tables (like HashMap in Java or Dictionary in C#) use chaining,
while performance-critical structures (databases, embedded systems) often use open addressing.
The next key concept answers another important question: how do we keep collisions low in the first place? The answer lies in the load factor and resizing.
A hash table performs best when it is not too full.
The load factor measures how full the table is:

load factor = number of stored entries / table capacity
When the load factor grows too high, collisions become more frequent and operations slow down. To maintain fast lookups, most hash tables automatically resize themselves, typically by creating a new array that is 2× larger and re-inserting every key–value pair into it.
Even though rehashing takes time, it happens infrequently.
If each resize doubles the capacity, then the total cost of all future resizes is spread across many insertions.
This is why hash table operations (insert, lookup, delete) are considered amortized O(1).
Conceptually:
- Most insertions cost O(1).
- Occasionally, one insertion triggers a resize and costs O(n).
- Averaged over the whole sequence of operations, each one still costs O(1).
This trade-off is what makes hash tables extremely efficient in practice, even when storing millions of elements. For additional details about amortized analysis, refer to the time and space complexity article.
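A quick way to convince yourself: with doubling, n insertions trigger resizes that copy 1 + 2 + 4 + … elements, which always sums to less than 2n in total, i.e. O(1) extra work per insertion on average. The counting sketch below is illustrative, not part of a real hash table:

```typescript
let capacity = 1;   // illustrative starting capacity
let elements = 0;
let copies = 0;     // total elements moved across all resizes

for (let i = 0; i < 1_000_000; i++) {
  if (elements === capacity) {
    copies += elements; // a resize re-inserts every existing element
    capacity *= 2;
  }
  elements++;
}

console.log(copies / elements); // ≈ 1.05: a small constant, not O(n)
```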
A hash table supports three main operations:
- Insert a key–value pair
- Search for the value associated with a key
- Delete a key and its value
In a well-designed hash table, all three run in O(1) average time.
This is possible because the key is hashed to an index, and the element is accessed directly—no scanning or shifting required.
However, as the previous paragraphs showed, collisions make the worst case O(n):
if every element hashes to the same location, the table degenerates into a list.
In practice this almost never happens because:
- good hash functions spread keys nearly uniformly across the table, and
- automatic resizing keeps the load factor low.
This is why hash tables are considered amortized O(1) and are widely used for frequency counting, caching, fast lookup tables, and set membership.
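For instance, counting word frequencies takes a single pass with O(1) work per element using the built-in Map:

```typescript
function countFrequencies(words: string[]): Map<string, number> {
  const counts = new Map<string, number>();
  for (const word of words) {
    counts.set(word, (counts.get(word) ?? 0) + 1);
  }
  return counts;
}

countFrequencies(["cat", "dog", "cat"]); // Map { "cat" => 2, "dog" => 1 }
```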
| Operation | Average Case | Worst Case |
|---|---|---|
| Insert | O(1) | O(n) |
| Search | O(1) | O(n) |
| Delete | O(1) | O(n) |
Even though worst-case performance exists in theory, real hash tables in languages like JavaScript, Python, Java, and C++ are engineered so that degeneracy is extremely unlikely. This is why they remain the default choice for key–value storage in programming and algorithmic problems.
To truly understand how hash tables work, it helps to build one.
Below is a simplified but realistic implementation of a HashMap in TypeScript using chaining.
Each bucket in the array stores a small list of key–value pairs. Collisions are handled by appending to the list.
We store an array of buckets, where each bucket is a dynamic array of { key, value } pairs:

```typescript
type Entry<K, V> = { key: K; value: V };

class HashMap<K extends string | number, V> {
  private buckets: Entry<K, V>[][] = [];
  private size = 0;
  private readonly loadFactor = 0.75;

  constructor(private capacity = 16) {
    this.buckets = Array.from({ length: capacity }, () => []);
  }

  private hash(key: K): number {
    if (typeof key === "number") {
      // Normalize so that negative numbers also map into [0, capacity - 1].
      return ((key % this.capacity) + this.capacity) % this.capacity;
    }
    let hash = 0;
    for (const ch of key) {
      hash = (hash * 31 + ch.charCodeAt(0)) >>> 0; // keep it an unsigned 32-bit integer
    }
    return hash % this.capacity;
  }

  // .... other stuff
```
To implement insert (put) we need to:
1. Hash the key to find its bucket.
2. Scan the bucket: if the key already exists, update its value.
3. Otherwise, append a new { key, value } pair and increment the size.
4. If the load factor exceeds the threshold, resize.

In particular, if we need to resize, we need to:
1. Double the capacity.
2. Allocate a fresh array of empty buckets.
3. Reinsert every existing entry, so each key is rehashed with the new capacity.
```typescript
  // .... other stuff

  put(key: K, value: V): void {
    const index = this.hash(key);
    const bucket = this.buckets[index];

    for (const entry of bucket) {
      if (entry.key === key) {
        entry.value = value; // key already present: update in place
        return;
      }
    }

    bucket.push({ key, value }); // new key: append to this bucket
    this.size++;

    if (this.size / this.capacity > this.loadFactor) {
      this.resize();
    }
  }

  private resize(): void {
    const oldBuckets = this.buckets;
    this.capacity *= 2;
    this.buckets = Array.from({ length: this.capacity }, () => []);
    this.size = 0; // put() recounts entries as they are reinserted

    for (const bucket of oldBuckets) {
      for (const entry of bucket) {
        this.put(entry.key, entry.value); // reinsert: hash() now uses the new capacity
      }
    }
  }

  // .... other stuff
```
For the get operation, we simply hash the key to find its bucket and search for the key inside that bucket.
```typescript
  // .... other stuff

  get(key: K): V | undefined {
    const index = this.hash(key);
    const bucket = this.buckets[index];

    for (const entry of bucket) {
      if (entry.key === key) {
        return entry.value;
      }
    }
    return undefined;
  }

  // .... other stuff
```
Last but not least, we implement remove, which also hashes the key to find its bucket and searches for the key to delete it.
```typescript
  // .... other stuff

  remove(key: K): boolean {
    const index = this.hash(key);
    const bucket = this.buckets[index];

    for (let i = 0; i < bucket.length; i++) {
      if (bucket[i].key === key) {
        bucket.splice(i, 1);
        this.size--;
        return true;
      }
    }
    return false;
  }

  // .... other stuff
```
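Assembling the snippets above into one class, usage looks like this:

```typescript
const map = new HashMap<string, number>();
map.put("cat", 7);
map.put("cat", 9);   // updates the existing entry
map.get("cat");      // 9
map.remove("cat");   // true
map.get("cat");      // undefined
```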
| Exercise | Technique | Solution |
|---|---|---|
| Design HashMap | HashMap Design | Solution |
| Maximum Number of Balloons | Frequency Counting | Solution |
| Number of Good Pairs | Hash Map Counting | Solution |
| Isomorphic Strings | Character Mapping | Solution |
| Ransom Note | Hash Map Counting | Solution |
| Contains Duplicate II | Sliding Window + Hash Set | Solution |
| Group Anagrams | Hash Map with Sorted Key | Solution |
| Encode and Decode TinyURL | Hash Map / Bi-Directional Map | Solution |
| Reorganize String | Hash Map + Max Heap | Solution |
| Longest Consecutive Sequence | Hash Set + Sequence Counting | Solution |
| Split Array into Consecutive Subsequences | Hash Map + Greedy | Solution |
| Number of Matching Subsequences | Hash Map + Buckets | Solution |
| Number of Good Ways to Split a String | Hash Map + Prefix Counting | Solution |