How HashTables Work

How HashTables Work

September 13, 2018 Uncategorized 0

How do Hash Tables Work?

To Begin with, what is a HashTable?

Simply, it is a data structure that can map keys to values.

Why is it important?

You can look up the data in constant time i.e. O(1).

How does it work?

Internally, it’s an array of buckets or slots. There is something called Hash Function which takes the key as the input and the output of the hash function is the index in the array of buckets or slots.

 

What is a Hash Function?

A Hash Function’s job is to distribute the entries across an array of buckets. The algorithm’s job is to return an index based on the key.

Index = f(key, array_size)

This could be in 2 steps:

Hash = Hashfunc(key)

Index = hash % array_size

In an ideal world, your hash function is perfect and there are no collisions! No two keys return the same hash output ever.

However, in real world, there might be some collisions. So how do we resolve those?

There are 2 primary techniques for resolving collisions.

  1. Separate Chaining
  2. Open Addressing

 

SEPARATE CHAINING

In this technique, each bucket is independent and has some sort of collection of entries at each index.

Let’s see an example to better understand this:

Separate Chaining with Linked Lists

 

As you can see in the diagram, each bucket is just a reference to the head of the linked list. So the first time an entry is mapped at a bucket, we add the element as the first element of the linked list. If second time, there is a hash collision, an element is added at the same slot, we simply add the element as the next entry in the linked list.

Now, imagine you have to look up the second element by the key, we reach the same bucket or slot. We compare the key to the fist element’s key in the linked list. We do not find a match. Next, we look at the next element in the linked list and again compare the key. Bingo it’s a match and we can read the value associated.

That’s the most common example of separate chaining with LinkedList

You can also store the first element of the linked list in the bucket itself. This is how it might look.

SEPARATE CHAINING WITH LIST HEAD CELLS

 

This helps you look up a little faster than previous technique as there is one less jump. Also, this might help with better caching efficiency.

 

OPEN ADDRESSING

In this technique, all the records are stored in the bucket array itself.

When adding an entry, if the slot is empty then add it at the index. If not, then check out the next slot based on some probing sequence. For e.g. in linear probing u will go linearly to the next slot and see if it’s empty or not. You keep checking until u find the next empty slot and then add it there.

When looking up for an entry, we follow the same process. We hash the key, get the index, look up the entry in the array at that index. If the key matches, we return the value. If the key doesn’t match, we keep probing linearly until the key matches and then return the value.

If you are able to describe this much in an interview about HashMaps, I would say you have successfully answered the interview question.

To have more such interview questions and answers please subscribe to my youtube channel coach4dev. Until next time, Happy Coding.