Big O notation · part 2 of 8
How to work out the big O of your code
Looking at a lump of code, yours or someone else's, and naming its shape is what we're aiming for. We'll do one problem three ways first, then pull out a recipe you can run on anything.
Three shapes, one job
Our loyalty scheme gives every customer a card with a number on it. One morning someone reports seeing the same number on two different cards. Uh oh. Did we mess up?! Maybe... so we want to answer: is there a duplicate in this list of numbers?
That's our question, let's answer it in three different ways.
Firstly, we'll try comparing every card against every other card:
// Attempt 1
function hasDuplicate(cards) {
for (let i = 0; i < cards.length; i++) {
for (let j = i + 1; j < cards.length; j++) {
if (cards[i] === cards[j]) return true;
}
}
return false;
}
For each of the n cards, we compare against (roughly) all the others, so that's n lots of n comparisons. A loop inside a loop over the same list: O(n²).
Secondly, we can do better by sorting first, because duplicates end up next to each other:
// Attempt 2
function hasDuplicate(cards) {
const sorted = [...cards].sort();
for (let i = 1; i < sorted.length; i++) {
if (sorted[i] === sorted[i - 1]) return true;
}
return false;
}
The sort costs O(n log n), and the single scan afterwards is O(n). Add those and the bigger one wins, so the whole thing is O(n log n).
Lastly, and best of all, remember every number we've already seen:
// Attempt 3
function hasDuplicate(cards) {
const seen = new Set();
for (const card of cards) {
if (seen.has(card)) return true;
seen.add(card);
}
return false;
}
One pass through the list, and each Set lookup is O(1), so this is O(n). We paid for it with memory: the Set can grow to hold every card. Remember this for later as it comes back when we talk about space.
That's it. For ten customers none of this matters, they all finish instantly. But run them across every card the loyalty scheme has ever issued and our first O(n²) attempt is still processing whilst our third O(n) attempt has long since answered.
The recipe
You can run this on any function, yours or someone else's.
- Find what repeats with the input. Loops over the data are the obvious ones. Watch for the disguised methods too: an
.includes(), an.indexOf(), a nested.find()are all just loops. - Nested loops multiply. Steps in sequence add.
- Keep only the fastest-growing term and bin the constants. Whatever's left is your big O.
Why you're allowed to bin the constants
Big O throws away constant factors and smaller terms. O(2n) becomes O(n). O(n + 100) becomes O(n). O(n²/2 + 3n) becomes O(n²). The first time you see this it looks sloppy, but stick with me here.
The reason is that for large n, the biggest term flattens everything else. Picture n at a million. The n² term is a trillion. The + n next to it is a million, which next to a trillion is a rounding error you'd never notice. So we don't bother noting it.
Big O is deliberately coarse. It answers "what shape does this grow into", not "exactly how many operations". A function that does two passes, and one that does a single pass count as the same shape, and if that factor of two matters to you, you'd be reaching for a benchmark and a stopwatch, not for big O.
Add for sequential, multiply for nested
Two loops one after the other, each over the list, is O(n) + O(n), which is O(2n), which is just O(n). Doing two passes doesn't change the shape.
A loop inside a loop is O(n) × O(n), which is O(n²). That's our every-pair duplicate check from earlier.
And when there are two different inputs, keep them separate. Prepping the menu of m flavours and then checking each of n orders is O(m) + O(n), written O(n + m). But checking each order against the whole menu, a loop inside a loop over two different things, is O(n × m). Don't collapse n and m into one letter unless you know they grow together.
The sneaky quadratic
Quadratic is the name for O(n²). The work grows with the square of the input. Double the data and you don't double the work, you quadruple it (2² = 4); ten times the data is a hundred times the work (10² = 100).
This is the most common complexity mistake I see (mainly because I've often written it):
function unknownFlavours(orders, menu) {
return orders.filter((order) => !menu.includes(order.flavour));
}
It looks like one loop: filter walks the n orders once. But menu.includes() is a second loop hiding inside a method call, scanning all m menu items for every single order. So it's O(n × m), and on a busy day when the order list is as long as the menu, that's O(n²).
The fix is the same trick as the duplicate finder: build a Set once, then each check is O(1).
function unknownFlavours(orders, menu) {
const known = new Set(menu);
return orders.filter((order) => !known.has(order.flavour));
}
Now it's O(n + m): one pass to build the set, one pass to filter. The previous version doesn't feel any slower when testing with a few orders, but it'll topple over in production during our busy period.