Swift Algorithm Club: Hash Tables

Learn how to implement a hash table data structure in Swift, in this step-by-step tutorial from the Swift Algorithm Club. By Kelvin Lau.

Leave a rating/review
Save for later
Share
You are currently viewing page 2 of 3 of this article. Click here to view the first page.

Retrieving Items

Here's an example of retrieving an item from the hash table:

let x = hashTable["lastName"]

The hash table first hashes the key "lastName" to calculate the array index, which is 2. Since bucket 2 has a chain, you step through the list to find the value with the key "lastName". This is done by comparing the keys using a string comparison. The hash table checks that the key belongs to the last item in the chain and returns the corresponding value, "Jobs".

Common ways to implement this chaining mechanism are to use a linked list or another array. Chains should not become long because looking up items in the hash table would become a slow process. Ideally, you would have no chains at all, but in practice it is impossible to avoid collisions. You can improve the odds by giving the hash table enough buckets and usin high-quality hash functions.

Note: An alternative to chaining is "open addressing". The idea is this: if an array index is already taken, we put the element in the next unused bucket. This approach has its own upsides and downsides.

Implementation

In the Sources directory, create a new Swift file and name it HashTable.swift. Delete any text in the file and write the following:

public struct HashTable<Key: Hashable, Value> {
  private typealias Element = (key: Key, value: Value)
  private typealias Bucket = [Element]
  private var buckets: [Bucket]

  private(set) public var count = 0
  public var isEmpty: Bool { 
    return count == 0
  }

  public init(capacity: Int) {
    assert(capacity > 0) 
    buckets = Array<Bucket>(repeating: [], count: capacity)
  }
}

Although you created a hash function based on the djb2 hash algorithm, it's better to leverage Apple's version. Through the constraining the Key as Hashable, your dictionary enforces that all keys have a hash value, so your dictionary doesn't need to worry about calculating the actual hash.

The main array is named buckets It has a fixed size provided by the init(capacity) method. You also keep track of how many items have been added to the hash table using the count variable.

Operations

Now that the scaffolding for your hash table is complete, you'll want to define the mutating operations for this structure. There are four common things you will do with a hash table:

  • insert a new elements
  • look up an element
  • update an existing element
  • remove an element

You'll want the syntax to look like this:

hashTable["firstName"] = "Steve" // insert
let x = hashTable["firstName"] // lookup
hashTable["firstName"] = "Tim" // update
hashTable["firstName"] = nil // delete

Start by defining the following helper method in your HashTable structure:

private func index(for key: Key) -> Int {
  return abs(key.hashValue) % buckets.count
}

This method will help ensure the key maps to an index within the bounds of the storage array. Next, add the following just below index(for:):

Value Retrieval

Write the following inside the HashTable structure:

// 1
public subscript(key: Key) -> Value? {
  get {
    return value(for: key)
  }
}

// 2
public func value(for key: Key) -> Value? {
  let index = self.index(for: key)
  return buckets[index].first { $0.key == key }?.value
}

If a match is found, you use optional chaining to extract the value. Otherwise, first will return nil, signifying that the key doesn't map to a value in the hash table.

  1. The subscript method takes a key and returns a value. The actual logic will reside in the value(for:) method.
  2. value(for:) first calls index(for:) to convert the key into an array index. That returns the bucket number, but this bucket may be used by more than one key if there were collisions. Thus, you use the first method that takes a closure, where you compare the key of each element with the key you want to match it with.

    If a match is found, you use optional chaining to extract the value. Otherwise, first will return nil, signifying that the key doesn't map to a value in the hash table.

Inserting Values

The subscript method can also work as a setter. Add the following code at the bottom of subscript:

set {
  if let value = newValue {
    update(value: value, for: key)
  }
}

This adds a setter to the subscript operation. newValue is a invisible reference to the value that is being assigned to the subscript. Once again, you'll be defining actual logic in a helper method.

Add the following below value(for:):

@discardableResult
public mutating func update(value: Value, for key: Key) -> Value? {
  let index = self.index(for: key)
  
  // 1
  if let (i, element) = buckets[index].enumerated().first(where: { $0.1.key == key }) {
    let oldValue = element.value
    buckets[index][i].value = value
    return oldValue
  }

  // 2
  buckets[index].append((key: key, value: value))
  count += 1
  return nil
}

Here's the play-by-play:

  1. You first check to see if the value is already inside a bucket. If it is, you update the value at the index i.
  2. If execution comes to this point, it means the key doesn't map to a particular value inside the hash table. You then add the new key-value pair at the end of the bucket.

With that, you're finally able to try your hash table out. Navigate back to the playground page and write the following at the bottom of the playground:

var hashTable = HashTable<String, String>(capacity: 5)
hashTable["firstName"] = "Steve"

if let firstName = hashTable["firstName"] {
  print(firstName)
}

if let lastName = hashTable["lastName"] {
  print(lastName)
} else {
  print("lastName key not in hash table")
}

You should see the following output in the console:

Steve
lastName key not in hash table

Removing Items

The final operation you'll implement is the one that removes the key. Update the subscript method to the following:

public subscript(key: Key) -> Value? {
  get {
    return value(for: key)
  }

  set {
    if let value = newValue {
      update(value: value, for: key)
    } else {
      removeValue(for: key)
    }
  }
}

Next, add the remove(value:for:) method at the bottom of the HashTable structure:

@discardableResult
public mutating func removeValue(for key: Key) -> Value? {
  let index = self.index(for: key)
  
  // 1
  if let (i, element) = buckets[index].enumerated().first(where: { $0.1.key == key }) {
    buckets[index].remove(at: i)
    count -= 1
    return element.value
  }

  // 2
  return nil
}

This is fairly similar to the update method. You first check to see if the value is in the bucket. If it is, you remove the key in the chain, decrement the count, and return the value. Otherwise, you return nil, since you couldn't find the key-value pair to remove.

Navigate back to the playground page and write the following:

hashTable["firstName"] = nil

if let firstName = hashTable["firstName"] {
  print(firstName)
} else {
  print("firstName key is not in the hash table")
}

You should see the following in the console:

Steve
lastName key not in hash table
firstName key is not in the hash table