July 27, 2005
The Law of Large Numbers
I continue to shake my head in amazement as I consider the most bats* ignorant thing I have read all summer: the claim in National Review that in order to get a picture of income distribution and mobility in America:
Intellectual Garbage Pickup: you'd have to track hundreds of millions of individuals.... [N]one of this is reliable... the Panel Study of Income Dynamics... tracks only 8,000 families out of a U.S. population of 295 million individuals...
The whole purpose of the science of statistics is to tell us that this is simply not true. As long as you can take a random sample of your population, you can find out an enormous amount about the population from a relatively small number of observations. You can find out what proportion of rich people had poor parents, or what proportion of twenty-year-olds think they will graduate from college, or pretty much any other average proportion that you want.
Now the "random sample" part of this is very important. But if your sample is random--if the yes-no pattern of observations so far makes it no more (or less) likely that your next observation will be a "yes"--then the law of large numbers tells us that the sample average you compute will converge to the true population average at a frighteningly rapid speed.
The standard demonstration of this is to repeatedly flip a coin and count the excess proportion of heads over tails. We know that--with a coin flipped and caught in the air by a human being, at least--the population average of the excess proportion of heads, taken over all coins that have ever been flipped, is zero. How many observations do we have to take--how many coin flips--before the sample average converges to this population average of 0% excess heads?
Let's see. Here's one run of 1,000 "flips" from Excel's internal random number generator:
Here are ten more:
Try some yourself.
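If you don't have Excel handy, here is a minimal sketch of the same experiment in Python. It assumes a fair coin simulated with the standard library's `random` module; the function name `excess_heads` and the seed value are illustrative choices of mine, not anything taken from the spreadsheet runs above.

```python
import random

def excess_heads(n_flips, rng):
    """Return the running excess proportion of heads over tails after each flip."""
    heads = 0
    running = []
    for i in range(1, n_flips + 1):
        if rng.random() < 0.5:  # treat this draw as "heads" (fair coin assumed)
            heads += 1
        tails = i - heads
        running.append((heads - tails) / i)
    return running

rng = random.Random(2005)  # arbitrary seed so the run is repeatable
run = excess_heads(1000, rng)

# Report the excess proportion of heads at a few checkpoints.
for n in (10, 100, 1000):
    print(f"after {n:>4} flips: excess heads = {run[n - 1]:+.3f}")
```

Change the seed, or drop it, to get a fresh run each time. The early checkpoints typically bounce around a good deal, and the later ones typically sit close to zero.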
You could have a population of 295 million flipped coins. Yet you don't need to look at "hundreds of millions" of them to determine what is going on. Looking at 1,000 will do.
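The arithmetic behind that claim can be sketched in a few lines of Python. This assumes a fair coin and a simple random sample, and uses the textbook standard error for a proportion along with the usual finite-population correction for sampling without replacement; the function name `se_excess_heads` is hypothetical, chosen just for this illustration.

```python
import math

def se_excess_heads(n, population=None):
    """Standard error of (heads - tails) / n for a fair coin.

    The excess proportion equals 2*p_hat - 1, so its standard error is
    2 * sqrt(0.25 / n) = 1 / sqrt(n). If a population size is given, apply
    the finite-population correction for sampling without replacement.
    """
    se = 1.0 / math.sqrt(n)
    if population is not None:
        se *= math.sqrt((population - n) / (population - 1))
    return se

# Compare an effectively infinite population with one of 295 million.
for n in (1000, 8000):
    print(n, se_excess_heads(n), se_excess_heads(n, population=295_000_000))
```

The two columns come out nearly identical: the precision of the estimate depends on how many observations you take, not on how large the population is.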
This is the principal insight of the science of statistics. It is an important insight. It is a powerful insight. It is also not an obvious insight--that's what makes it powerful and important.
Yet because statistical studies sometimes produce results ideologically inconvenient for the Republican Party, National Review feels it has to pretend that this insight doesn't exist.
That's really sad.
Posted by DeLong at July 27, 2005 11:10 AM