notes to self

Big O notation · part 2 of 8

How to work out the big O of your code

Looking at a lump of code, yours or someone else's, and naming its shape is what we're aiming for. We'll do one problem three ways first, then pull out a recipe you can run on anything.

Three shapes, one job

Our loyalty scheme gives every customer a card with a number on it. One morning someone reports seeing the same number on two different cards. Uh oh. Did we mess up?! Maybe... so we want to answer: is there a duplicate in this list of numbers?

That's our question, let's answer it in three different ways.

Firstly, we'll try comparing every card against every other card:

// Attempt 1
function hasDuplicate(cards) {
  for (let i = 0; i < cards.length; i++) {
    for (let j = i + 1; j < cards.length; j++) {
      if (cards[i] === cards[j]) return true;
    }
  }
  return false;
}

For each of the n cards, we compare against (roughly) all the others, so that's n lots of n comparisons. A loop inside a loop over the same list: O(n²).

Secondly, we can do better by sorting first, because duplicates end up next to each other:

// Attempt 2
function hasDuplicate(cards) {
  const sorted = [...cards].sort();
  for (let i = 1; i < sorted.length; i++) {
    if (sorted[i] === sorted[i - 1]) return true;
  }
  return false;
}

The sort costs O(n log n), and the single scan afterwards is O(n). Add those and the bigger one wins, so the whole thing is O(n log n).

Lastly, and best of all, remember every number we've already seen:

// Attempt 3
function hasDuplicate(cards) {
  const seen = new Set();
  for (const card of cards) {
    if (seen.has(card)) return true;
    seen.add(card);
  }
  return false;
}

One pass through the list, and each Set lookup is O(1), so this is O(n). We paid for it with memory: the Set can grow to hold every card. Remember this for later as it comes back when we talk about space.

That's it. For ten customers none of this matters, they all finish instantly. But run them across every card the loyalty scheme has ever issued and our first O(n²) attempt is still processing whilst our third O(n) attempt has long since answered.

The recipe

You can run this on any function, yours or someone else's.

  1. Find what repeats with the input. Loops over the data are the obvious ones. Watch for the disguised methods too: an .includes(), an .indexOf(), a nested .find() are all just loops.
  2. Nested loops multiply. Steps in sequence add.
  3. Keep only the fastest-growing term and bin the constants. Whatever's left is your big O.

Why you're allowed to bin the constants

Big O throws away constant factors and smaller terms. O(2n) becomes O(n). O(n + 100) becomes O(n). O(n²/2 + 3n) becomes O(n²). The first time you see this it looks sloppy, but stick with me here.

The reason is that for large n, the biggest term flattens everything else. Picture n at a million. The term is a trillion. The + n next to it is a million, which next to a trillion is a rounding error you'd never notice. So we don't bother noting it.

Big O is deliberately coarse. It answers "what shape does this grow into", not "exactly how many operations". A function that does two passes, and one that does a single pass count as the same shape, and if that factor of two matters to you, you'd be reaching for a benchmark and a stopwatch, not for big O.

Add for sequential, multiply for nested

Two loops one after the other, each over the list, is O(n) + O(n), which is O(2n), which is just O(n). Doing two passes doesn't change the shape.

A loop inside a loop is O(n) × O(n), which is O(n²). That's our every-pair duplicate check from earlier.

And when there are two different inputs, keep them separate. Prepping the menu of m flavours and then checking each of n orders is O(m) + O(n), written O(n + m). But checking each order against the whole menu, a loop inside a loop over two different things, is O(n × m). Don't collapse n and m into one letter unless you know they grow together.

The sneaky quadratic

Quadratic is the name for O(n²). The work grows with the square of the input. Double the data and you don't double the work, you quadruple it (2² = 4); ten times the data is a hundred times the work (10² = 100).

This is the most common complexity mistake I see (mainly because I've often written it):

function unknownFlavours(orders, menu) {
  return orders.filter((order) => !menu.includes(order.flavour));
}

It looks like one loop: filter walks the n orders once. But menu.includes() is a second loop hiding inside a method call, scanning all m menu items for every single order. So it's O(n × m), and on a busy day when the order list is as long as the menu, that's O(n²).

The fix is the same trick as the duplicate finder: build a Set once, then each check is O(1).

function unknownFlavours(orders, menu) {
  const known = new Set(menu);
  return orders.filter((order) => !known.has(order.flavour));
}

Now it's O(n + m): one pass to build the set, one pass to filter. The previous version doesn't feel any slower when testing with a few orders, but it'll topple over in production during our busy period.

What's the time complexity of the first unknownFlavours, the one calling menu.includes() inside filter, with n orders and m menu items?
Show answer

O(n × m), because includes is a hidden loop is the one.

filter walks the n orders once, but for each one includes scans up to m menu items. A loop inside a loop, even when one of them is hiding inside a method, so you multiply: O(n × m).

A function loops over the whole order list to total the takings, then loops over it again to count the cones. Two separate loops, one after the other. What's the complexity?
Show answer

O(n) is the one.

Sequential steps add: O(n) + O(n) = O(2n). Then you bin the constant, so it's just O(n). Two passes is the same shape as one. Nested loops would have multiplied to O(n²), but these aren't nested.