A hash table is a data structure that is used to store and retrieve data in constant time. It is also known as a hash map or a dictionary. It works by using a hash function to compute an index into an array of buckets or slots, from which the desired value can be found.
The key idea behind a hash table is to store values in an array using a key-value pair. The key is used to access the value, which is stored in the corresponding slot in the array. When we want to store a new value, we first apply a hash function to the key to determine the index of the array where the value should be stored. If there are no collisions (i.e. two keys that map to the same index), the value can be retrieved in constant time. However, collisions are inevitable in practice, and hash tables use different methods to handle them.
How Hash Tables Work
Hash tables use a hash function to compute an index into an array of buckets or slots, from which the desired value can be found. The hash function takes the key as input and returns a number between 0 and the number of slots in the array. The value is then stored in the corresponding bucket.
In order to retrieve a value from the hash table, the same hash function is applied to the key to find the index of the bucket where the value is stored. If there are no collisions (i.e. two keys that map to the same index), the value can be retrieved in constant time. However, collisions are inevitable in practice, and hash tables use different methods to handle them.
Collision Handling Methods
One method of handling collisions is called separate chaining. In this method, each bucket in the array is actually a pointer to a linked list. When a collision occurs, the new value is added to the linked list at the corresponding bucket. Retrieving a value from the hash table involves finding the linked list at the appropriate bucket and traversing it to find the desired value.
Here’s an example of a hash table using separate chaining:
class HashTable: def __init__(self, size): self.size = size self.table = [ for _ in range(size)] def _hash_function(self, key): return hash(key) % self.size def insert(self, key, value): index = self._hash_function(key) self.table[index].append((key, value)) def get(self, key): index = self._hash_function(key) for k, v in self.table[index]: if k == key: return v raise KeyError(key)
In this example, the
HashTable class uses separate chaining to handle collisions. The
__init__ method creates an array of
size empty lists, and the
_hash_function method computes the hash value of the given key. The
insert method adds a key-value pair to the appropriate list, and the
get method searches for a given key in the list and returns the corresponding value.
Another method of handling collisions is called open addressing. In this method, if a collision occurs, we simply look for the next empty slot in the array and store the value there. Retrieving a value from the hash table involves searching for the key in the array, and if it is not in its expected slot, we continue searching until we find it or determine that it is not in the array.
Now that we have a basic understanding of what a hash table is and how it works, let’s dive into the code and see how it can be implemented in practice. We will be using Python to implement a simple hash table.
Implementation of a Hash Table in Python
Creating the Hash Table Class
First, let’s create a class for our hash table. The class will have three methods:
__init__: This method will initialize the hash table with a fixed size and create an empty list to store our key-value pairs.
__getitem__: This method will get the value associated with a given key.
__setitem__: This method will set the value for a given key.
def __init__(self, size): self.size = size self.table = [ for _ in range(self.size)] def __getitem__(self, key): index = hash(key) % self.size for k, v in self.table[index]: if k == key: return v raise KeyError(key) def __setitem__(self, key, value): index = hash(key) % self.size for i, (k, v) in enumerate(self.table[index]): if k == key: self.table[index][i] = (key, value) return self.table[index].append((key, value))
Using the Hash Table Class
Let’s create an instance of our
HashTable class and add some key-value pairs to it.
hash_table = HashTable(5) hash_table['apple'] = 2 hash_table['banana'] = 4 hash_table['orange'] = 1
Now we can get the values for the keys we just added:
print(hash_table['apple']) # Output: 2 print(hash_table['banana']) # Output: 4 print(hash_table['orange']) # Output: 1
One of the main challenges in implementing a hash table is dealing with collisions. Collisions occur when multiple keys hash to the same index in the table. In our implementation, we are using chaining to resolve collisions. Chaining involves storing a linked list of key-value pairs at each index in the table. When a collision occurs, the new key-value pair is simply added to the end of the linked list.
Let’s add some more key-value pairs to our
hash_table instance to see how chaining works:
hash_table['pear'] = 3 hash_table['kiwi'] = 5 hash_table['grape'] = 2
Now let’s print out the table to see how the key-value pairs are stored:
[ , [('kiwi', 5)], [('grape', 2)], [('apple', 2), ('pear', 3)], [('banana', 4), ('orange', 1)] ]
As you can see, the key-value pairs are stored in linked lists at each index in the table. When we try to access a key that hashes to the same index as another key, we simply traverse the linked list at that index until we find the key we’re looking for.
In conclusion, a hash table is a powerful data structure that allows for efficient key-value lookups. It works by hashing keys to generate an index into an array, where the associated value is stored. In this article, we’ve covered the basics of how hash tables work, including collision resolution, and we’ve implemented a simple