Insurance works because expected value is linear, but standard deviation isn’t.
Our house, in the middle of our street
Let’s say someone has a home worth \$100,000. It has a 1 in 1,000 chance of burning down this year, in which case they’ll need to pay \$100,000 to get it rebuilt; otherwise, they pay \$0. We can represent this as a Bernoulli random variable, which we’ll call X. X has a 1/1,000 chance of being 100,000, and a 999/1,000 chance of being 0.
The expected value of X, $E[X]$, captures the average value of the variable. It’s the sum of each outcome, weighted by its likelihood. X’s expected value is $100{,}000\times 1/1{,}000 + 0\times 999/1{,}000$ = 100. So averaged over many years, or over many independent houses, each homeowner should expect to pay \$100 per year due to fires.
The standard deviation of X, $\sigma(X)$, represents how far a typical value is from the mean.^{1} If we think of the home fire as an outlier event, the standard deviation will be larger if the outlier is larger or if it’s more likely. The formula is not quite as neat, but we can still plug the numbers in, and we get $\sigma(X)$ = \$3,161.
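To make those two numbers concrete, here’s a minimal Python sketch that computes the expected value and standard deviation of X straight from its two outcomes. The variable names are mine, chosen for illustration; the figures are the ones from the example.

```python
import math

# Outcomes of X (the homeowner's yearly fire cost) and their probabilities,
# using the figures from the example above.
outcomes = [100_000, 0]
probabilities = [1 / 1_000, 999 / 1_000]

# Expected value: each outcome weighted by its probability.
expected = sum(x * p for x, p in zip(outcomes, probabilities))

# Variance: probability-weighted squared distance from the expected value.
variance = sum(p * (x - expected) ** 2 for x, p in zip(outcomes, probabilities))
std_dev = math.sqrt(variance)

print(f"E[X]     = ${expected:,.2f}")
print(f"sigma(X) = ${std_dev:,.2f}")
```

The standard deviation comes out at roughly \$3,160.70, which rounds to the \$3,161 quoted above.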
Here, the standard deviation is much larger than the expected value. This is bad for the homeowner, because it means even though their expected cost is pretty low, they could relatively easily be hit with a low-probability, high-cost event.
It’s hard to trick expected values into doing what you want, so this is not what insurance does.^{2} Insurance aims to reduce the standard deviation. For instance, imagine an insurance company offers the homeowner a deal: pay \$105 per year, and the insurer covers all repair costs if a fire happens. The homeowner’s expected cost goes up from \$100 to \$105, but their standard deviation goes down from about \$3k to \$0, a much more comfortable situation to be in.
From the insurance side, this may seem like a bad deal. Most likely, this \$105 will be pure profit, as the house won’t burn. But the insurer is now in the position the homeowner was in before they took the deal: they’re exposed to a catastrophic event where they take in \$105 but have to pay \$100,000—for a nice loss of \$99,895. So how does insurance work?
Insurance
The insurance trick is simply to insure many homes whose fires are independent of one another.
Let’s look at a set of n similar homes, each with their random variable telling us whether it burns, which we’ll label $X_1$ through $X_n$ (which are all independent Bernoulli variables). Say an insurer insured them all. The insurer’s cost is represented by the random variable $X_1 + \cdots + X_n$. Let’s look at what this variable looks like.
Expectation is linear, which means we have

$$E[X_1 + \cdots + X_n] = E[X_1] + \cdots + E[X_n] = n\,E[X_1],$$

so the expected cost due to home fires increases linearly for the insurer the more homes are insured.
But the standard deviation of a sum of independent random variables is a type of L² norm:

$$\sigma(X_1 + \cdots + X_n) = \sqrt{\sigma(X_1)^2 + \cdots + \sigma(X_n)^2} = \sqrt{n}\,\sigma(X_1),$$

so the standard deviation increases as the square root of the number of insured homes, far slower than the expected value.
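Here’s a small Python sketch of how the two quantities scale with the number of homes. It assumes every home has the same \$100,000 loss and 1-in-1,000 fire chance as in the example, and that fires are independent; `portfolio_stats` is a name I made up for illustration.

```python
import math

LOSS = 100_000      # rebuild cost per home, from the example
P_FIRE = 1 / 1_000  # yearly fire probability per home, from the example


def portfolio_stats(n):
    """Return (expected cost, standard deviation, their ratio) for n independent homes."""
    mean_one = LOSS * P_FIRE
    sd_one = LOSS * math.sqrt(P_FIRE * (1 - P_FIRE))
    mean_total = n * mean_one           # expectation is linear in n
    sd_total = math.sqrt(n) * sd_one    # independent variances add, so SD grows like sqrt(n)
    return mean_total, sd_total, sd_total / mean_total


for n in (1, 100, 10_000, 1_000_000):
    mean_total, sd_total, ratio = portfolio_stats(n)
    print(f"n={n:>9,}  expected=${mean_total:>13,.0f}  "
          f"sd=${sd_total:>11,.0f}  sd/expected={ratio:.3f}")
```

Under these assumptions, the ratio of standard deviation to expected cost falls from about 31.6 for a single home to about 0.03 for a million homes.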
Here’s one way to understand this more intuitively. If everything goes wrong for the homeowner, they’ll have to pay 1,000× their expected cost (\$100,000 against an expected \$100). But for the insurer to have to pay 1,000× its expected cost, all homes would have to burn at the same time! This is far less likely than any one home burning.
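To put a number on “far less likely”: assuming the fires are independent, and writing $p = 1/1{,}000$ for each home’s yearly fire probability (a symbol I’m introducing here), the chance that all n homes burn in the same year is

$$P(\text{all } n \text{ homes burn}) = p^n = 10^{-3n}.$$

With just $n = 2$ homes that’s already one in a million, and with $n = 10$ it’s $10^{-30}$.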
So the play for insurers is this: ask each homeowner for their yearly expected cost as the insurance premium. This brings the insurer’s expected net cost back down to 0 (more realistically, ask for a little more and pocket the difference as profit). This money goes into the float, which is a big pool of money the insurer will use to pay out claims.
If any house burns down, the insurer can take money from the float and pay out the claims. But what happens if there’s not enough money?
First, in the long run, premiums are larger than the expected cost, so things will even out for the insurer. But this was also true for the homeowner, and it still exposed them to bad outcomes.
Second, as we just saw, a much lower relative standard deviation (a.k.a. coefficient of variation) means catastrophic events are far less likely for the insurer than for individual homeowners. This is the key effect that makes insurance work: it reduces standard deviation.
Conclusion
Since insurance only really reduces the range of outcomes, purchasing insurance on some event will never reduce your expected cost from that event.^{3} It merely aggregates the possible outcomes into a single guaranteed outcome.
The expected cost not changing also means that you can’t meaningfully insure against events that are very likely or certain to happen, because in these cases the problem comes from the cost itself, not from a low chance of a terrible outcome.
Finally, as a recommendation for further reading, the arguments I’ve made here about standard deviation can be made more precise with something like the central limit theorem, which lets us translate standard deviations back into outlier probabilities, assuming we are insuring a large number of independent events.
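As a rough illustration of that translation, here’s a Python sketch that uses the normal approximation to estimate the chance that one year’s claims exceed one year’s premiums. It reuses the example’s figures (\$100,000 loss, 1-in-1,000 fire chance, \$105 premium), assumes independent homes, and ignores any money already sitting in the float; `shortfall_probability` is a hypothetical helper, not a standard actuarial formula.

```python
import math

LOSS = 100_000      # rebuild cost per home, from the example
P_FIRE = 1 / 1_000  # yearly fire probability per home, from the example
PREMIUM = 105       # yearly premium per home, from the example


def shortfall_probability(n):
    """Normal approximation to P(total claims > total premiums) for n independent homes."""
    mean_total = n * LOSS * P_FIRE
    sd_total = math.sqrt(n) * LOSS * math.sqrt(P_FIRE * (1 - P_FIRE))
    # How many standard deviations separate expected claims from premiums collected.
    z = (n * PREMIUM - mean_total) / sd_total
    # Upper tail of the standard normal distribution, P(Z > z).
    return 0.5 * math.erfc(z / math.sqrt(2))


for n in (1_000_000, 4_000_000, 10_000_000):
    print(f"n={n:>10,}  P(claims > premiums) ~ {shortfall_probability(n):.2e}")
```

With these numbers the estimate is just under 6% at a million homes and keeps shrinking as more homes are added. The approximation is only reasonable when n is large enough that many fires are expected, so I wouldn’t trust it for small portfolios.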

An alternative to the standard deviation (SD) is the mean absolute deviation (MAD), which does have a straightforward interpretation: it’s how far, on average, a given observation will be from the expected value. In reality it seems just as good as, or better than, standard deviation for practical purposes. However, it’s annoying to compute (it rarely has a closed form). You can also argue that because standard deviation weights large deviations more heavily, it captures extreme variation better, but that seems a bit post hoc. In any case, the argument I’m making here works for both the SD and the MAD. ↩

In the real world, insurance does try to reduce expected loss, by encouraging practices that lower it. For instance, an insurer can encourage you to install a fire alarm or sprinklers, which reduce your expected loss from a fire. But you can do this without using insurance, and it’s not fundamentally what insurance is about. ↩

This is true in an idealized model—regulations or incentives might actually reduce expected cost a little, but they’re unlikely to really change the calculus. ↩