Equality, Identity, and Hashing with Swift Types

Write efficient and easy-to-test code that’s less error-prone

Let’s start by answering the following question:

What is a type?

The answer is fairly simple: a type is a collection of related data that encapsulates certain concepts and principles. A type can be abstract or concrete, it defines the set of valid operations that we can perform on it, and it clearly separates its internal details from its external interface.

Types are a very good form of documentation, much better than an instructional comment. Types can be designed to ensure safety and correct behavior, and because the compiler enforces them, you might end up with a lot of compiler errors while working with them.

But this is a good thing, since this enables us to fix most issues before they make it to the production app. I would encourage you to think of these as a to-do list that the compiler generates when we go wrong.

Types also enable us to write easy-to-test code. For instance, let’s take the example of the following two functions:

func getAddress() -> String

func getAddress() -> Any

Which of these functions is easier to test? The second provides more flexibility, since it can return a value of any type, but that same flexibility makes it harder to verify the method’s correctness. Just by looking at the signature of the first function, we can be sure that it returns a String every time, so a test can compare the result directly. The same can’t be said for the second function.
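To make the difference concrete, here is a minimal sketch. The names typedAddress and untypedAddress and the sample address are hypothetical, chosen only so that both variants can live in the same file.

```swift
// Hypothetical stand-ins for the two getAddress() variants above.
func typedAddress() -> String {
    return "221B Baker Street"
}

func untypedAddress() -> Any {
    return "221B Baker Street"
}

// The String version can be compared directly.
let typed = typedAddress()
print(typed == "221B Baker Street")   // true

// The Any version has to be cast first, and the test must also
// handle the case where the cast fails.
let untyped = untypedAddress()
if let address = untyped as? String {
    print(address == "221B Baker Street")
} else {
    print("Not a String")
}
```

The extra cast and the extra failure branch are the testing cost that the Any return type introduces.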

Equality and Identity

We’ll start by exploring Equality with the help of the following example:

struct EmployeeID {
    private(set) var id: Int

    init?(_ raw: Int) {
        guard raw > 1000 else {
            return nil
        }
        id = raw
    }
}

We have an EmployeeID type with a failable initializer that returns nil for any value of 1000 or below, so every valid EmployeeID has at least four digits.
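As a quick sketch of how callers experience the failable initializer, the snippet below repeats the type and tries one invalid and one valid value. The sample values 42 and 1001 are arbitrary.

```swift
struct EmployeeID {
    private(set) var id: Int

    init?(_ raw: Int) {
        guard raw > 1000 else { return nil }
        id = raw
    }
}

// The initializer returns an Optional, so callers must handle failure.
let rejected = EmployeeID(42)
print(rejected == nil)   // true

if let accepted = EmployeeID(1001) {
    print(accepted.id)   // 1001
}
```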

In order to make it easier to test EmployeeID types, let’s add conformance to the Equatable protocol and see how the compiler reacts to the existing code:

struct EmployeeID : Equatable {
  // Same code as above
}

let employeeIDOne = EmployeeID(1001)
let employeeIDTwo = EmployeeID(1001)
print("Both EmployeeID's are \(employeeIDOne == employeeIDTwo ? "equal" : "unequal")")

The compiler doesn’t complain. EmployeeID is a struct whose only stored property, id, is an Int, which is already Equatable, so the compiler synthesizes the == implementation for us. Executing the code above prints the following to the console:

Both EmployeeID's are equal
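Roughly speaking, the synthesized conformance behaves like a handwritten == that compares each stored property. The sketch below spells that out on a hypothetical copy of the type named ManualEmployeeID. It illustrates the behavior and is not the compiler’s actual output.

```swift
struct ManualEmployeeID: Equatable {
    private(set) var id: Int

    init?(_ raw: Int) {
        guard raw > 1000 else { return nil }
        id = raw
    }

    // Roughly what the compiler synthesizes: compare each stored property.
    static func == (lhs: ManualEmployeeID, rhs: ManualEmployeeID) -> Bool {
        return lhs.id == rhs.id
    }
}

let a = ManualEmployeeID(1001)
let b = ManualEmployeeID(1001)
let c = ManualEmployeeID(2002)
print(a == b)   // true
print(a == c)   // false
```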

Let’s examine Equality behavior with reference types with the help of the following example:

class Employee {
  var id: EmployeeID
  var name: String
  
  init(id: EmployeeID, name: String) {
    self.id = id
    self.name = name
  }
}

We have an Employee class, and notice that it requires an EmployeeID. Employee depends on EmployeeID, so it’s impossible to create an Employee without a valid EmployeeID, that is, one whose value is greater than 1000. These are strong guarantees to have.

Now, if we declare Employee as Equatable the way we did with EmployeeID, the compiler immediately issues an error with the following message: “Type ‘Employee’ does not conform to protocol ‘Equatable’”. This happens because Swift synthesizes Equatable conformance only for structs and enums, not for classes.

The solution to this problem is very simple: implement == ourselves, as follows:

class Employee : Equatable {
    static func == (lhs: Employee, rhs: Employee) -> Bool {
        return lhs.id == rhs.id &&
            lhs.name == rhs.name
    }
    // Same as above.
}

The standard implementation compares every property of the left-hand side with the corresponding property of the right-hand side.

Reference types also have the concept of Identity in addition to Equality, so let’s see how these two concepts differ with the help of the following code example:

guard let employeeId = EmployeeID(1001) else {
    fatalError("Not a valid Employee ID")
}
let firstEmployee = Employee(id: employeeId, name: "EmployeeOne")
let copyOfFirstEmployee = Employee(id: employeeId, name: "EmployeeOne")
print("Both Employees are \(firstEmployee == copyOfFirstEmployee ? "Equal" : "Unequal")")
print("Both Employees are \(firstEmployee === copyOfFirstEmployee ? "Identical" : "Unidentical")")

Here, we’ve used the same employeeId to create two separate Employee objects with identical contents. We then checked whether they’re equal with the == operator and whether they’re identical with the === operator. Executing the code above prints the following to the console:

Both Employees are Equal
Both Employees are Unidentical

=== checks whether both references point to the same object. Here, the two references point to objects that are equal but live at different locations in memory, so the identity comparison returns false. In other words, firstEmployee and copyOfFirstEmployee are equal, but they don’t refer to the same object in memory.

What happens if we assign firstEmployee to a new variable and then check for identity of those two variables?

let referenceToFirstEmployee = firstEmployee
print("Both References are \(referenceToFirstEmployee === firstEmployee ? "Identical" : "Unidentical")")

Executing this code prints the following to the console:

Both References are Identical

This confirms that, this time, both variables refer to the same object in memory.

There’s flexibility in how we define ==, but it’s always best to keep it simple and compare all the properties, as long as we don’t create an infinite loop.
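One way such an infinite loop could arise is when two objects reference each other and == compares those references with == again. The sketch below is an assumption about that scenario, not something taken from the example above: the Person class and its partner property are hypothetical, and the cycle is broken by comparing the back-reference by identity.

```swift
final class Person: Equatable {
    let name: String
    // Hypothetical back-reference that can form a cycle between two objects.
    weak var partner: Person?

    init(name: String) {
        self.name = name
    }

    static func == (lhs: Person, rhs: Person) -> Bool {
        // Comparing the partners with == would call this function again on
        // the partners, whose own partners point back here, so equal cyclic
        // structures could recurse forever. Comparing the references with
        // === stops the recursion.
        return lhs.name == rhs.name && lhs.partner === rhs.partner
    }
}

let alice = Person(name: "Alice")
let bob = Person(name: "Bob")
alice.partner = bob
bob.partner = alice

print(alice == alice)   // true
print(alice == bob)     // false
```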

Hashable

Hashable lets a type produce a hash value for each instance. That hash value is what makes quick lookups inside a Dictionary or a Set possible.

Hashable also requires Equatable conformance. Let’s see how it works. Suppose we have a Set of 7 hashable objects, as shown in the image below:

Each Object in the Set returns a random-looking but not actually random number called a Hash value. This Hash value is then used to generate a slot number for the underlying Set storage.

For instance, suppose the hash value returned by Element 0 is 1234567. The slot number for this Element would be 1234567 modulo 7 = 5.
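The arithmetic can be checked with a couple of lines. This is a simplified model of slot selection that assumes a non-negative hash value. Swift’s real Set storage is more involved, and actual hash values can be negative.

```swift
// Simplified model: slot = hash value modulo the number of slots.
let hashValue = 1234567
let slotCount = 7
let slot = hashValue % slotCount
print(slot)   // 5
```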

Different instances usually produce different hash values, and hence different slot numbers. Sometimes, however, even different hash values map to the same slot number, resulting in collisions, as shown by Elements 3–5 in the image.

Collisions can slow down lookups, which is why custom types need an efficient hashing mechanism. In our example above, looking up Element 0 takes constant time, i.e. O(1), as long as its slot isn’t shared with other elements. Let’s see how Hashable affects the existing code.
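The effect of collisions can be seen in a toy model. The ToyBuckets type below is hypothetical and is not how Swift’s Set is implemented. It assumes non-negative Int elements, treats each element as its own hash value, and stores colliding elements in a small array per slot.

```swift
// Toy hash table with one array per slot; for illustration only.
struct ToyBuckets {
    private var slots: [[Int]]

    init(slotCount: Int) {
        slots = Array(repeating: [], count: slotCount)
    }

    // Assumes non-negative elements; the element acts as its own hash value.
    private func slot(for element: Int) -> Int {
        return element % slots.count
    }

    mutating func insert(_ element: Int) {
        let index = slot(for: element)
        if !slots[index].contains(element) {
            slots[index].append(element)
        }
    }

    func contains(_ element: Int) -> Bool {
        // Only one slot is scanned, so lookup stays fast while slots are short.
        return slots[slot(for: element)].contains(element)
    }

    // Number of elements sharing a slot with the given element.
    func chainLength(for element: Int) -> Int {
        return slots[slot(for: element)].count
    }
}

var table = ToyBuckets(slotCount: 7)
table.insert(5)
table.insert(12)   // 12 % 7 == 5, so it collides with 5
table.insert(3)

print(table.contains(12))          // true
print(table.chainLength(for: 5))   // 2
print(table.chainLength(for: 3))   // 1
```

The more elements end up in one slot, the more of them a lookup has to scan, which is the slowdown described above.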

In the code example above, in order to make EmployeeID hashable, all we need to do is make it conform to Hashable instead of Equatable, and Swift will make the necessary adjustments under the hood for us, since the stored value inside EmployeeID is an Int and Ints are intrinsically hashable.
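A minimal sketch of that change, with the type repeated so the snippet stands alone. The sample values are arbitrary.

```swift
struct EmployeeID: Hashable {
    private(set) var id: Int

    init?(_ raw: Int) {
        guard raw > 1000 else { return nil }
        id = raw
    }
}

// compactMap drops the nil produced by the invalid value 7.
let ids = [1001, 1002, 1001, 7].compactMap { EmployeeID($0) }
let uniqueIDs = Set(ids)
print(uniqueIDs.count)   // 2

if let probe = EmployeeID(1002) {
    print(uniqueIDs.contains(probe))   // true
}
```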

But the same thing doesn’t apply to the Employee type, since it’s a class and Swift doesn’t synthesize Hashable conformance for classes. Because of this, the compiler starts complaining as soon as we try to make it Hashable, as shown in the image below:

Making Employee Hashable

The latest iterations of Swift have made it very intuitive and clear to implement Hashing correctly. Let’s have a look at the code:

class Employee : Hashable {
  
    // Same as above

    func hash(into hasher: inout Hasher) {
        hasher.combine(id)
        hasher.combine(name)
    }
}

We start by implementing hash(into hasher: inout Hasher) and use the passed-in hasher to hash all our properties. As shown in the code, we call the hasher’s combine method with each hashable property, and the hasher mixes them into the final hash value.

If two objects are equal, then they must return the same hash value. Otherwise, lookup in a Dictionary or a Set will not work. The reverse isn’t true: two different objects can return the same hash value, in which case we end up with a collision. Swift’s Hasher is seeded randomly on each program execution, which makes collisions hard to engineer on purpose. It also means that hash values shouldn’t be stored or compared across runs.
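Putting the pieces together, the sketch below checks that two equal Employee objects hash alike within one run and that one can stand in for the other as a Dictionary key. The desks dictionary and its value are hypothetical, added only to demonstrate the lookup.

```swift
struct EmployeeID: Hashable {
    private(set) var id: Int

    init?(_ raw: Int) {
        guard raw > 1000 else { return nil }
        id = raw
    }
}

final class Employee: Hashable {
    var id: EmployeeID
    var name: String

    init(id: EmployeeID, name: String) {
        self.id = id
        self.name = name
    }

    static func == (lhs: Employee, rhs: Employee) -> Bool {
        return lhs.id == rhs.id && lhs.name == rhs.name
    }

    // Hash exactly the properties that == compares.
    func hash(into hasher: inout Hasher) {
        hasher.combine(id)
        hasher.combine(name)
    }
}

guard let employeeId = EmployeeID(1001) else {
    fatalError("Not a valid Employee ID")
}
let first = Employee(id: employeeId, name: "EmployeeOne")
let copy = Employee(id: employeeId, name: "EmployeeOne")

print(first == copy)                       // true
print(first.hashValue == copy.hashValue)   // true within a single run

// Hypothetical data: an equal object finds the entry stored under `first`.
let desks: [Employee: String] = [first: "Desk 12"]
print(desks[copy] ?? "not found")          // Desk 12
```

Because id and name are declared with var, changing either one while an Employee is stored in a Set or used as a Dictionary key would change its hash value and break later lookups.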

Good hash values are a must. Otherwise, imagine what might happen if a malicious hacker is able to cause collisions by injecting the same hash values for keys inside a dictionary? Hint: ☠️☠️☠️DENIAL OF SERVICE!!!☠️☠️☠️

For other updates, you can follow me on Twitter at @NavRudraSambyal.

Thanks for reading, please share it if you found it useful 🙂
