The vast majority of people have more than the average number of legs.
The average mother has 2.36 children even though no mother has that number.
We have an intuitive grasp of what it means when something is average, typical, or representative. Math and statistics formalize “average” in terms of mean, median, and mode. Each of these quantities condenses the complicated details of a distribution into a single number.
That number oversimplifies things, though. Bizarre, counterintuitive, and seemingly paradoxical things happen with unusual (not-so-representative) distributions.
The “mean” is what most people mean when they say “average”. You add up all the values and divide by how many there are.
The “median” is the middle one. Sort them from lowest to highest and pick the one in the middle, if there’s an odd number; if even, use the mean of the two in the middle.
The “mode” is the one with the most occurrences (the highest frequency).
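Here is a minimal sketch of the three definitions, using Python's standard `statistics` module. The list of family sizes is invented purely for illustration.

```python
from statistics import mean, median, mode

# Hypothetical data: number of children for nine mothers (made up).
children = [0, 1, 1, 2, 2, 2, 3, 4, 6]

avg = mean(children)       # add them up, divide by how many there are
mid = median(children)     # the middle value once sorted
common = mode(children)    # the value that occurs most often

# With an even count, the median is the mean of the two middle values.
even_mid = median([1, 2, 3, 4])

print(f"mean={avg:.2f} median={mid} mode={common} even-count median={even_mid}")
```

Note that the mean here is a fraction even though every value in the list is a whole number.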
What’s a distribution?
For our purposes here, let’s call it a graph, with the horizontal axis showing the quantity of interest (like height, wealth, wind direction), and the vertical axis showing count or frequency or population. The graph shows how the quantity is distributed across the population of interest.
The familiar “bell-shaped curve” is called the normal distribution or Gaussian.
It’s a pretty good model for a lot of things we find around us, like people’s heights.
That’s “normal”, but what about abnormal?
Well, it’s really called non-normal, not abnormal!
For starters, what if the distribution has outliers? Suppose you were at a gathering of a hundred Microsoft software engineers. Bill Gates entered the room and bang, the average wealth spiked — even though nobody’s wealth changed. Actually, the mean spiked, but the median and the mode probably didn’t (except maybe if Paul Allen happened to be there too).
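To see the outlier effect in numbers, here is a rough sketch. Every dollar figure below is hypothetical, chosen only to make the arithmetic easy to follow.

```python
from statistics import mean, median

# Hypothetical net worths in dollars for 100 engineers (made-up figures).
engineers = [500_000 + 10_000 * i for i in range(100)]
billionaire = 100_000_000_000  # one invented, very large outlier

before_mean = mean(engineers)
before_median = median(engineers)

# The outlier walks in; nobody else's wealth changes.
room = engineers + [billionaire]
after_mean = mean(room)
after_median = median(room)

print(f"mean:   {before_mean:,.0f} -> {after_mean:,.0f}")
print(f"median: {before_median:,.0f} -> {after_median:,.0f}")
```

The mean jumps by orders of magnitude, while the median moves by only one small step along the sorted list.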
Quantized distributions are when the variable of interest can take on only discrete, or quantized, values. Mothers can have zero children, one child, two, three, and so on, but no fractional children, let alone 2.36 of them.
Similarly, most people have two legs. A small number of people have only one leg, and some have none at all. The mean winds up slightly less than 2, so the vast majority of people (who have exactly 2 legs) have more than the mean. The median and the mode are both exactly 2, though.
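The leg count claim is easy to check with a toy population. The counts below are assumptions made up for the example, not real statistics.

```python
from statistics import mean, median, mode

# Hypothetical population of 1,000 people (invented counts).
legs = [2] * 990 + [1] * 8 + [0] * 2

avg = mean(legs)
mid = median(legs)
common = mode(legs)

# Share of people whose leg count is above the mean.
share_above_mean = sum(1 for n in legs if n > avg) / len(legs)

print(f"mean={avg} median={mid} mode={common}")
print(f"share with more legs than the mean: {share_above_mean:.0%}")
```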
The mean can also be misleading when trying to compare groups. In 1950 a researcher examined literacy rates in all the U.S. states, and how they correlated with the share of immigrants in each state. He found that states with more immigrants had higher literacy, even though immigrants themselves had lower literacy. (I’m pretty sloppy with the technicalities here, but the paradox remains.) It turned out that states with higher native literacy tended to attract the immigrants.
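A toy version of the literacy paradox can be sketched with three imaginary states. None of these numbers come from the 1950 study; they are invented so that immigrants settle mostly where native literacy is already high.

```python
# Hypothetical states: (immigrant share, native literacy, immigrant literacy).
# All values are made up for illustration.
states = {
    "A": (0.05, 0.90, 0.85),
    "B": (0.15, 0.95, 0.90),
    "C": (0.25, 0.99, 0.93),
}

def overall_literacy(share, native, immigrant):
    """State-wide literacy as a population-weighted mean of the two groups."""
    return (1 - share) * native + share * immigrant

overall = {name: overall_literacy(*vals) for name, vals in states.items()}

for name, (share, native, immigrant) in states.items():
    print(f"state {name}: immigrants {share:.0%}, "
          f"native {native:.0%}, immigrant {immigrant:.0%}, "
          f"overall {overall[name]:.1%}")
```

Within every state the immigrant rate is below the native rate, yet the state-wide rate rises with the immigrant share.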
A skewed distribution has one tail fatter than the other …
…while a bimodal distribution (or, more generally, multimodal) could have its mean out in the middle of nowhere. You could look at a quantized distribution as an extreme case of multimodal, where the blobs all narrow into spikes.
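As a small sketch of a mean stranded between two humps, take a made-up data set with one cluster near 2 and another near 10.

```python
from statistics import mean, multimode

# Hypothetical bimodal data: two clusters (values invented).
low_cluster = [1, 2, 2, 2, 3]
high_cluster = [9, 10, 10, 10, 11]
data = low_cluster + high_cluster

avg = mean(data)
modes = multimode(data)  # every value tied for most frequent

# Distance from the mean to the closest actual observation.
gap = min(abs(x - avg) for x in data)

print(f"mean={avg} modes={modes} nearest observation is {gap} away")
```

The mean lands at 6, a value that nothing in the data comes close to, while `multimode` reports both peaks.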
When normal is abnormal
Even for approximately normal distributions, the “tails” can have unexpected behavior. When he was president of Harvard, Larry Summers got himself into deep doo-doo with an observation to the effect that women on average may be smarter than men, but there are more men geniuses. Really? Here’s that normal distribution graph again.
Imagine that the horizontal axis plots smarts. Suppose the red curve represented women, and the orange curve, more spread out, represented men. Now look over to the right end of the curves, the genius end. The orange curve is higher — there are more men geniuses out there! Now imagine sliding just the red curve to the right, making all women smarter, on average. You’d have to push it pretty far until there were as many genius women. (Poor Prof. Summers wanted to say it needed more study — turns out the normal distribution is not that great a model out in the tails — but he got slammed for sexism. Meanwhile nobody noticed how over in the other tail all the numskulls are men, too.)
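The two-curve picture can be sketched with `statistics.NormalDist`. The centers, spreads, and cutoff below are arbitrary assumptions on an IQ-like scale, not measurements of anyone. They only show how a wider curve dominates both tails, and how far the narrow curve would have to slide to catch up at the right end.

```python
from statistics import NormalDist

# Hypothetical curves: same center, different spreads (numbers are made up).
red = NormalDist(mu=100, sigma=10)      # the narrower curve
orange = NormalDist(mu=100, sigma=15)   # the more spread-out curve
cutoff = 140                            # arbitrary threshold for the right tail

def share_above(dist, x):
    """Fraction of the distribution lying above x."""
    return 1 - dist.cdf(x)

red_tail = share_above(red, cutoff)
orange_tail = share_above(orange, cutoff)

# Slide the red curve right until its tail above the cutoff matches orange's.
z = (cutoff - orange.mean) / orange.stdev   # cutoff in orange standard deviations
shifted_red = NormalDist(mu=cutoff - z * red.stdev, sigma=red.stdev)
shift = shifted_red.mean - red.mean

print(f"red tail:    {red_tail:.6f}")
print(f"orange tail: {orange_tail:.6f}")
print(f"shift needed for red to match: {shift:.2f} points")
print(f"left tail below 60: red {red.cdf(60):.6f}, orange {orange.cdf(60):.6f}")
```

Under these assumed numbers the red curve has to move more than one of its own standard deviations to the right, and the orange curve is also the heavier one in the left tail.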
Regression to the mean
On average, someone’s performance tends to be average. Well, duh. But how about this:
An air force sergeant ran a pilot training program. Some days a pilot would get lucky and do well; other days he would do worse. The sergeant would always berate his men no matter how they did. How come? He just looked at the data. If he were to praise a pilot who had a better than average day, the next day the pilot would tend to do worse (closer to average). If a pilot happened to do worse than average and then the sergeant yelled at him, on the following day the pilot would seem to have improved (closer to average). So if a praise day is followed by worse performance and yelling is followed by better performance, of course he’s going to yell!
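A quick simulation shows the sergeant's trap. It assumes, purely for illustration, that every pilot has the same fixed skill and that each day's score is that skill plus random luck. Praise and yelling have no effect at all in this model.

```python
import random
from statistics import mean

random.seed(0)  # fixed seed so the run is repeatable

SKILL = 50.0    # hypothetical true skill, identical every day
LUCK = 10.0     # hypothetical day-to-day spread

def flight_score():
    """One day's score: constant skill plus random luck."""
    return SKILL + random.gauss(0, LUCK)

# Simulate many (day 1, day 2) pairs of scores.
pairs = [(flight_score(), flight_score()) for _ in range(10_000)]

# Change in score after an unusually good or unusually bad first day.
change_after_good = mean(d2 - d1 for d1, d2 in pairs if d1 > SKILL + LUCK)
change_after_bad = mean(d2 - d1 for d1, d2 in pairs if d1 < SKILL - LUCK)

print(f"average change after a good day: {change_after_good:+.1f}")
print(f"average change after a bad day:  {change_after_bad:+.1f}")
```

Scores drop after good days and rise after bad days even though nothing but luck is at work, which is exactly the pattern the sergeant mistook for the effect of his yelling.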