How to make your code faster using JavaScript Sets
If you only use arrays, you’re missing a trick
Published on
Mar 30, 2019
Read time
6 min read
Introduction
I’m sure there are plenty of developers who stick to the basic global objects: numbers, strings, objects, arrays and booleans.
For many use-cases, these are all you need. But if you want to make your code as fast and scalable as possible, these basic types aren’t always good enough.
In this article, we’ll talk about how JavaScript’s Sets can make your code faster — especially as it scales. There is a significant amount of crossover between what an array can do and what a Set can do. But using Sets will often bring runtime benefits that are impossible to achieve with arrays. In this article, we’ll explore how.
How are Sets different?
The most fundamental difference is that arrays are an indexed collection. That means the value of data in an array is ordered by the index.
const arr = [A, B, C, D];
console.log(arr.indexOf(A)); // Result: 0
console.log(arr.indexOf(C)); // Result: 2
By contrast, Sets are a keyed collection. Instead of using indices, Sets order their data using keys. A Set’s elements are iterable in the order of insertion, and it cannot contain any duplicate data. In other words, every item in a Set must be unique.
What are the main benefits?
In a direct comparison, Sets have several advantages over arrays, especially when it comes to a faster run-time:
- Search for an Item: Using
indexOf()
orincludes()
to check whether an item exists in an array is slow. - Deleting an Item: In a Set, you can delete an item by its value. In an array, the equivalent is using
splice()
based on an element’s index. As in the previous point, depending on indices is slow. - Insert an Item: It is faster to add an item to a Set than to add an item to an array using
push()
,unshift()
or an equivalent method. - Storing NaN: You cannot use
indexOf()
orincludes()
to find the valueNaN
, while a Set is able to store this value. - Removing Duplicates: Set objects only store unique values. If you want to avoid storing duplicates, this is a significant advantage over arrays, where additional code would be required to deal with duplicates.
Note: For a full list of the in-built Set methods, the best place to go is MDN Web Docs.
What is the time complexity?
The methods an array uses to search for items has a linear time complexity of O(N). In other words, the run-time increases at the same rate as the size of the data increases.
By contrast, the methods used by Sets to search for, delete and insert items all have a time complexity of just O(1) — that means the size of the data has virtually no bearing on the run-time of these methods!
Note: If you’d like to learn more about time complexity, check out my article on Big O Notation.
Exactly how much faster are Sets?
Though run-time can vary significantly depending on the system being used, the size of the data provided and other variables, I hope my test results will give you a practical sense of how much faster Sets can be. I’ll share three simple tests I did and the results I got.
Preparing the tests
Before running any tests, let’s create an array and a Set with one million entries each. For the sake of simplicity, I’ll start at 0 and count up to 999,999.
let arr = [], set = new Set(), n = 1000000;for (let i = 0; i n; i++) {
arr.push(i);
set.add(i);
}
Test 1: searching for an item
First, let’s search for the number 123123
, which we know will return true.
let result;
console.time("Array");
result = arr.indexOf(123123) !== -1;
console.timeEnd("Array");
console.time("Set");
result = set.has(123123);
console.timeEnd("Set");
- Array: 0.173ms
Set: 0.023ms - The Set was 7.54 times faster
Test 2: adding an item
Now, let’s add a new item to each collection.
console.time("Array");
arr.push(n);
console.timeEnd("Array");
console.time("Set");
set.add(n);
console.timeEnd("Set");
- Array: 0.018ms
Set: 0.003ms - The Set was 6.73 times faster
Test 3: deleting an item
Lastly, let’s remove an item from each collection (we can use the one we just added). There’s no in-built array method for this, so we’ll create a helper function to keep everything clean:
const deleteFromArr = (arr, item) => {
let index = arr.indexOf(item);
return index !== -1 && arr.splice(index, 1);
};
And here’s the code for the test:
console.time("Array");
deleteFromArr(arr, n);
console.timeEnd("Array");
console.time("Set");
set.delete(n);
console.timeEnd("Set");
- Array: 1.122ms
Set: 0.015ms - In this instance, the Set was a blistering 74.13 times faster!
Overall, we can see that highly significant improvements in run-time can be made by using a Set instead of an array. Now let’s see some practical examples where sets can be useful.
Case 1: removing duplicate values from an array
If you want to remove duplicate values from an array quickly, you can convert it to a Set. This is by far the most concise way of filtering out unique values:
const duplicateCollection = ["A", "B", "B", "C", "D", "B", "C"]; // If you want to turn the array into a Set
let uniqueCollection = new Set(duplicateCollection);
console.log(uniqueCollection); // Result: Set(4) {"A", "B", "C", "D"}// If you want to keep your values in an array
let uniqueCollection = [...new Set(duplicateCollection)];
console.log(uniqueCollection); // Result: ["A", "B", "C", "D"]
Case 2: a Google interview question
In another article, I discussed four solutions to a question asked by an interviewer at Google. The interview was conducted using C++, but if it were in JavaScript, a Set would be a necessary part of the final solution.
If you’d like to look at the solution in greater depth, I recommend reading the article, but here’s a quick summary of the final solution.
Question
Given an unordered array of integers and a value sum
, return true
if any two items may be added such that they equal the value of sum
. Otherwise, return false
.
So, if we were given the array [3, 5, 1, 4]
and the value 9
, our function should return true
, because 4 + 5 = 9
.
Solution
A great approach to this question is to iterate through the array, creating a Set of compliments as we go.
Let’s apply this thinking to the example above. When we encounter 3
we can add 6
to our Set of compliments because we know we need to find a sum
of 9
. Then, each time we come into contact with a new value in the array, we can check to see if it is in our Set of compliments. When we get to 5
, we’ll add 4
to our Set of compliments. Then, when we finally encounter 4
, we will also find it in our Set and so we can return true
.
Here’s what that solution might look like:
const findSum = (arr, val) => {
let searchValues = new Set();
searchValues.add(val - arr[0]);
for (let i = 1, length = arr.length; i length; i++) {
let searchVal = val - arr[i];
if (searchValues.has(arr[i])) {
return true;
} else {
searchValues.add(searchVal);
}
};
return false;
};
And here’s a more concise version:
const findSum = (arr, sum) =>
arr.some(
(
(set) => (n) =>
set.has(n) || !set.add(sum - n)
)(new Set())
);
Because Set.prototype.has()
has a time complexity of just O(1), using a Set to store compliments rather than an array helps give our overall solution a linear run-time of O(N).
If we were instead dependent on Array.prototype.indexOf()
or Array.prototype.includes()
, both of which have a time complexity of O(N), overall run-time would be O(N²). Much slower!
If you haven’t dived into JavaScript Sets before, I hope I’ve demonstrated how useful they can be!
Related articles
You might also enjoy...
I Fixed Error Handling in JavaScript
How to steal better strategies from Rust and Go—and enforce them with ESLint
14 min read
How to Easily Support ESM and CJS in Your TypeScript Library
A simple example that works for standalone npm libraries and monorepos
5 min read
Bad Abstractions Could Be Ruining Your Code
Why the ‘Don’t Repeat Yourself’ principle might be doing more harm than good
6 min read