January 08, 2004

Statistical Innumeracy

Mark Kleiman bangs his head against the wall in frustration at reporters who have no understanding of statistics:

Mark A. R. Kleiman: Wrong!:

WASHINGTON -- Wesley Clark has closed the gap with Howard Dean among Democratic voters, according to a national poll taken at a time when Dean had been under intense criticism from rivals. Dean had the support of 24 percent and Clark had the backing of 20 percent in the CNN-USA Today-Gallup poll out today. The poll of 465 Democrats and those who lean Democratic had a margin of sampling error of plus or minus 5 percentage points, meaning Dean and Clark are essentially tied for the lead nationally.

No, dammit, no! [Pounds the lectern in sheer frustration.] Being 4 points behind with a 5-point margin of error isn't being "essentially tied." It's being 4 points behind, plus or minus 5 points. That's a lot better than being 21 points behind, plus or minus 5 points, but I'd rather be ahead, thanks.

Of course, the reported margin of error reflects sampling error only, and ignores all the sources of systematic error. All it means is that, if I'd called another 465 people at the same time, using the same algorithm to select them, using the same weighting formula to adjust the sample to the assumed population of actual voters, and having the same interviewers ask the same questions, there's a 95% chance that the results of the second sample would have been within 5 points of the results of the first sample.

But the interviewers might not be perfectly impartial, and the questions, the sampling algorithm, and the weighting formulas might all embody imperfect models of actual voting behavior. The extent of those "systematic" errors cannot be estimated by simply taking the reciprocal of the square root of the sample size. So the reported margin of error is an underestimate of the actual uncertainties involved.
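The "reciprocal of the square root of the sample size" shortcut can be checked with a minimal Python sketch. The function names are illustrative, and the comparison formula assumes a simple random sample and a normal approximation, neither of which the poll's publishers are quoted as confirming.

```python
import math

def rule_of_thumb_moe(n):
    """Shortcut 95% sampling margin of error: 1 / sqrt(n)."""
    return 1.0 / math.sqrt(n)

def worst_case_moe(n, z=1.96):
    """z * sqrt(p * (1 - p) / n) evaluated at p = 0.5, where it is largest."""
    return z * math.sqrt(0.25 / n)

n = 465  # sample size reported for the poll
print(f"1 / sqrt(n):          {rule_of_thumb_moe(n):.4f}")
print(f"1.96 * 0.5 / sqrt(n): {worst_case_moe(n):.4f}")
```

Both figures come out a little above 4.5 percentage points, consistent with a reported margin of "plus or minus 5" after rounding. Neither says anything about the systematic errors described above.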

But there's no such thing as a "statistical tie," and it's better to be ahead than behind.

[Yes, this will be on the exam.]

All I can say is that, when the Day of Wrath comes and you look up to see Mark Kleiman standing next to the Pantokrator whispering in his ear, those of us who paid attention in our statistics classes will be very glad that we did so.

Posted by DeLong at January 8, 2004 11:55 AM

Comments

Math is hard. Statistics is very hard. I'm trying, right now, to imagine an infinite ensemble of sets of Democratic voters-- OK.. now, I'm computing moments of their probability distributions... and now, (understandably) I feel that I'm becoming a Bayesian... ooooh...

But seriously. Is this stuff explained competently and clearly anywhere?

Posted by: Matt on January 8, 2004 12:30 PM

____

Kleiman shouldn't be giving exams if he's going to be this emphatic in denying that there is such a thing as a "statistical tie".

Yes, one takes his point that the uncertainty of measurement means that the probability that Clark is ahead of Dean is quite small (not all *that* small, when one considers that there is uncertainty around the measurement of Dean's support that is not necessarily directly dependent on the uncertainty around Clark's support). And one can dismiss the term "statistical tie" as one that doesn't occur in textbooks and therefore has no official meaning.

But it's a basic tenet of the sciences that when measurements are indistinguishable within the error of measurement, the scientist should avoid labeling them as being different to a statistically significant degree, the conventional threshold being 95% confidence. There is a meaning to the term, even if the meaning is not yet formally acknowledged. It means that unless the precision of measurement can be improved, the results cannot be said to be different with 95% confidence.

By this standard, Clark and Dean are essentially tied. Obviously, a betting man would go with Dean, were the election held nationally and today.

BTW, some textbooks do use the term. See http://216.239.57.104/search?q=cache:VX10MiY9t_gJ:www.introductorystatistics.com/escout/chap_10/chap10_sec4.pdf+%22statistical+tie%22+definition&hl=en&ie=UTF-8

The World Bank also uses the term in at least one instance: http://www.worldbank.org/wbi/governance/pdf/hague/surveyanalysis_gf2.pdf

We now return you to your regularly scheduled economics discussion.

Posted by: Charles on January 8, 2004 12:43 PM

____

Of course Mark gets it wrong about the meaning of the Iowa Markets vote share price quotes. I've emailed him and hope he corrects his post. Also, I thought that MOE was expressed for the 50/50 point and would therefore be smaller for results in the 20-25% range, but I admit that I'd have to review things I learned years ago before I had any confidence in anything I said about statistics.
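The recollection that the margin of error shrinks away from the 50/50 point can be checked with a short Python sketch. It assumes the textbook normal-approximation formula for a simple random sample; `sampling_moe` is an illustrative name, not something the pollsters published.

```python
import math

def sampling_moe(p, n, z=1.96):
    """Normal-approximation margin of error for a single proportion p."""
    return z * math.sqrt(p * (1.0 - p) / n)

n = 465  # sample size reported for the poll
for p in (0.50, 0.24, 0.20):
    print(f"p = {p:.2f}: +/- {100 * sampling_moe(p, n):.1f} points")
```

Under these assumptions the margin is a bit under 4 points for shares in the 20-25% range, against about 4.5 points at 50%.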

Posted by: elliottg on January 8, 2004 01:27 PM

____

The opportunistic use of the MOE is just another form of argumentative opportunism in competitive discussions (e.g. law) where winning is the only thing. It doesn't have a lot to do with difficult concepts in statistics.

A +4-point lead with a 5-point MOE can be expanded to a range of +9 to -1. So Clark could be up by one, or down by nine, or somewhere in between.
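The range above applies the single-candidate margin directly to the lead. As a rough alternative, the lead has its own standard error when both shares come from the same sample. The sketch below assumes a simple random multinomial sample with no weighting, which is an assumption and not something the poll is quoted as stating. The function names are illustrative.

```python
import math

def lead_standard_error(p1, p2, n):
    """Standard error of (p1 - p2) for two shares from one multinomial sample.

    Var(p1 - p2) = [p1 + p2 - (p1 - p2)^2] / n, because the two shares
    are negatively correlated (covariance -p1 * p2 / n).
    """
    return math.sqrt((p1 + p2 - (p1 - p2) ** 2) / n)

dean, clark, n = 0.24, 0.20, 465
lead = dean - clark
se = lead_standard_error(dean, clark, n)
low, high = lead - 1.96 * se, lead + 1.96 * se
z = lead / se

print(f"lead = {100 * lead:.1f} points, standard error = {100 * se:.1f} points")
print(f"approximate 95% interval: {100 * low:.1f} to {100 * high:.1f} points")
print(f"z-score of the lead: {z:.2f}")
```

On these assumptions the interval for Dean's lead runs from about -2 to about +10 points. That is wider than the range above, but it is still centered on a 4-point lead and is not symmetric around zero.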

No real conceptual difficulty caused this; the author just assumed, for reasons known only to himself/herself, that the MOE favored Clark. There might be a slight tendency to honestly think that MOE favored the underdog, but that's stupid. Probably the author doesn't like Dean, or maybe they want the race to be close.

With global warming, anti-warming spokesmen play the same game -- they assume that natural fluctuations are amplifying the effects of human activity on the results we see. But it's equally possible that natural fluctuations are damping the effects of human activity, and that global warming is really much worse than it seems. Initially there's no way to know.

In many political arguments, the facts and reasoning are often less important than silent attempts to assume some sort of default position by fiat (here, "Natural cycles might be amplifying global warming, but aren't damping it").

For example, two conservative defaults are "Don't interfere with business unless you can prove that it's necessary beyond a shadow of a doubt" and "Any scientific evidence brought forward in favor of business regulation must have passed the most rigorous tests and be accepted by the entire scientific community, even the contrarians and industry scientists".

Likewise, political scandals are quickly interpreted as legal cases so that the default "innocent until proven guilty" can be applied. There's no real problem with saying "X is probably innocent of any crime, but based on what we know about him, he doesn't belong in government". But by now anyone but a convicted felon is OK.

Leibniz said something like "People would argue about the multiplication table, if there were enough money in it". (If you have a source for that, BTW, send it to me. But don't try Google -- the Google source is me.)


Posted by: Zizka on January 8, 2004 02:19 PM

____

Mark Kleiman writes:
>
> All it means is that, if I'd called another 465 people at the
> same time, using the same algorithm to select them,
> using the same weighting formula to adjust the sample
> to the assumed population of actual voters, and having
> the same interviewers ask the same questions, there's a
> 95% chance that the results of the second sample would
> have been within 5 points of the results of the first
> sample.

I'm sorry, but I don't think that such an explanation deserves (as Brad DeLong suggests) a place of honor on the Day of Wrath.

Because it is wrong.

Unless polling people are doing something very different from what they should be doing, the margin of error is just half the width of the confidence interval for the proportion in question. I calculate an MOE (at 50%) of 4.6%, which should mean that a 95% confidence interval for the "true" (population parameter) level of support for (say) Howard Dean would be 19.4% - 28.6%. What this means, according to a classical statistician, is that if I polled and polled and polled again a large number of times under the same conditions, and generated confidence intervals for each poll, that 95% of these confidence intervals that I compute would cover (include) the true population parameter. It does *NOT* mean that 95% of the sample mean percentages would be within MOE % of this ONE sample. It really, really, really does not mean that.
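The two readings can be compared in a small simulation. The sketch below is only an illustration under stated assumptions: a hypothetical true support of 24%, simple random sampling, and a margin of error computed from each poll's own result. The true value is of course unknown in practice, and `simulate` is an illustrative name.

```python
import math
import random

def simulate(true_p=0.24, n=465, trials=2000, z=1.96, seed=0):
    """Return (coverage, replication) rates over repeated pairs of polls.

    coverage:    share of first polls whose interval contains true_p
    replication: share of second polls landing within the first poll's MOE
    """
    rng = random.Random(seed)

    def poll():
        # One simulated poll: n independent respondents.
        hits = sum(1 for _ in range(n) if rng.random() < true_p)
        return hits / n

    covered = 0
    replicated = 0
    for _ in range(trials):
        first, second = poll(), poll()
        moe = z * math.sqrt(first * (1.0 - first) / n)
        if abs(first - true_p) <= moe:
            covered += 1
        if abs(second - first) <= moe:
            replicated += 1
    return covered / trials, replicated / trials

coverage, replication = simulate()
print(f"intervals covering the true value:  {coverage:.3f}")
print(f"second poll within MOE of the first: {replication:.3f}")
```

The coverage rate should land near 95%. The replication rate should come out noticeably lower, because the gap between two polls carries the sampling noise of both.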

Now, if you are a Bayesian, you are entitled to a different perspective on these kinds of confidence intervals, and you could use this perspective to calculate that the probability that the true proportion of Dean supporters (given a bunch of assumptions) is greater than the true proportion of Clark supporters is pretty high. Which is Kleiman's point, and sounds okay by me.

Posted by: Jonathan King on January 8, 2004 08:48 PM

____

On how formulas can be manipulated:
http://www.crichton-official.com/speeches/speeches_quote04.html

Posted by: Jeff on January 8, 2004 08:58 PM

____

Charles, overlapping confidence intervals don't imply nonsignificance. In general if the confidence intervals overlap by 25% or less, the difference between the means will be significant. There are such things as "statistical ties" (Kleiman is being far too precious here; would he still be making this argument if the difference was 0.0001%?), but this appears not to be one.
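A toy example shows how two 95% intervals can overlap while the difference is still significant at the 5% level. The numbers below are made up for illustration and assume two independent estimates with equal standard errors. Two shares from the same poll are not independent, so this shows the general point only and is not a calculation for Dean and Clark.

```python
import math

def confidence_interval(mean, se, z=1.96):
    """Normal-approximation 95% interval for one estimate."""
    return mean - z * se, mean + z * se

def z_for_difference(mean_a, se_a, mean_b, se_b):
    """z-score for the difference of two independent estimates."""
    return (mean_a - mean_b) / math.sqrt(se_a ** 2 + se_b ** 2)

# Hypothetical estimates: means 3.0 and 0.0, each with standard error 1.0.
ci_a = confidence_interval(3.0, 1.0)
ci_b = confidence_interval(0.0, 1.0)
overlap = ci_a[0] < ci_b[1]          # lower end of A is below upper end of B
z = z_for_difference(3.0, 1.0, 0.0, 1.0)

print(f"interval A: {ci_a[0]:.2f} to {ci_a[1]:.2f}")
print(f"interval B: {ci_b[0]:.2f} to {ci_b[1]:.2f}")
print(f"intervals overlap: {overlap}, z for the difference: {z:.2f}")
```

Here the intervals overlap between 1.04 and 1.96, yet the z-score for the difference is about 2.12, which exceeds 1.96.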

Posted by: dsquared on January 9, 2004 12:43 AM

____

Thanks, dsquared, I am aware of the statistical issues.

My point is that, while manipulating poll figures is deplorable, so is using professorial power for imposing dogma.


Mr. King, I believe you have it wrong. See http://www.robertniles.com/stats/margin.shtml

Posted by: Charles on January 9, 2004 09:57 AM

____

Mr. King is right (and Charles is wrong). You cannot turn a confidence interval into a probability that a particular estimate is close to the true value. (The person that Charles linked to is also wrong on this point.)

Everyone is talking about the probability that another survey would be close to a survey result in hand. Suppose that the survey result in hand is one of the 1/1000 surveys that is wildly wrong (more than two or three times the margin of error away from the true value). Then surely the probability that another survey would produce a similar result is actually very low. And we have no way of knowing the accuracy of the survey in hand. Both Kleiman and the person Charles links to make this mistake.

Mr. King did say:

"Now, if you are a Bayesian, you are entitled to a different perspective on these kinds of confidence intervals,"

If you are a Bayesian, the words "confidence interval" would only pass your lips when you are deriding non-Bayesians.

Posted by: David Margolies on January 9, 2004 03:13 PM

____

Deep waters here. And stats is supposed to be my business, so I am hesitant to expose any professional shortcomings.

Prof. DeLong is correct that the point estimates are the "best estimates" if calculated properly.

Now, what about who's *really* ahead, taking into account sampling error? Now confidence intervals and hypothesis tests come into play. But which confidence interval, and what significance level to choose? Neyman said you have to calculate the relative costs of type I and type II errors in order to choose. What are the appropriate costs to use? Of course, you might want to just go with the arbitrary 5% significance level, but why? I think Fisher changed his mind several times on the one that was appropriate for "moral certainty" at different times: 5%, 2.5%, 1%.

And what is the relevance of the repeated experiments approach, when you know you are working in a dynamic environment where things other than sampling error are changing from week to week? In other words, you can never know to what extent you are repeating the same experiment. For example, your ideas about determining the appropriate sample frame, the likelihood of different respondents voting on election day, etc. change.

I heard a famous statistician once say that he thought like a Bayesian, and then would try to find a classical procedure that would act Bayesian and would be simple enough to implement. So here is where I start thinking Bayesian.

But, until I have a good probability model, I would agree with Prof. DeLong. In the real world, as opposed to the world of classical statistical theory, I'd rather have a few favorable point estimates, and say gosh-darn to intervals.

Posted by: jml on January 9, 2004 08:02 PM

____

David Margolies writes:
>
> Mr. King did say:
>
> "Now, if you are a Bayesian, you are entitled to a different
> perspective on these kinds of confidence intervals,"
>
> If you are a Bayesian, the words "confidence interval"
> would only pass your lips when you are deriding non-
> Bayesians.

And who says I wasn't being derisive. :-)

More seriously, I was thinking of the nice, convenient settings where a classical confidence interval and a Bayesian HDR coincide, etc.

You could do a fully Bayesian analysis of this but I have to confess that I'm not exactly sure what the best approach to this particular problem is. If you're interested in posterior distributions for true popularity of multiple candidates, what you've really got is a multinomial problem, so I guess you could use a conjugate prior distribution (and I think that's Dirichlet, but I'd have to check) and Do the Right Thing.
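The Dirichlet distribution is indeed the conjugate prior for multinomial counts, so the approach can be sketched in a few lines of Python. Everything specific below is an assumption made for illustration: the raw counts are reconstructed from the reported percentages (the poll's actual weighted counts are not given), all other answers are lumped into a third category, and the prior is a flat Dirichlet(1, 1, 1). The function names are illustrative.

```python
import random

def dirichlet_sample(alphas, rng):
    """Draw one vector of shares from a Dirichlet via normalized gamma draws."""
    draws = [rng.gammavariate(a, 1.0) for a in alphas]
    total = sum(draws)
    return [d / total for d in draws]

def prob_first_beats_second(counts, prior=1.0, draws=20000, seed=0):
    """Monte Carlo estimate of P(share[0] > share[1]) under the posterior."""
    rng = random.Random(seed)
    alphas = [c + prior for c in counts]  # conjugate update: prior + counts
    wins = 0
    for _ in range(draws):
        shares = dirichlet_sample(alphas, rng)
        if shares[0] > shares[1]:
            wins += 1
    return wins / draws

n = 465
dean = round(0.24 * n)    # reconstructed count, an assumption
clark = round(0.20 * n)   # reconstructed count, an assumption
counts = [dean, clark, n - dean - clark]

p_dean_ahead = prob_first_beats_second(counts)
print(f"posterior P(Dean > Clark) is roughly {p_dean_ahead:.2f}")
```

Under these assumptions the posterior probability that Dean's true share exceeds Clark's comes out at roughly nine in ten. It accounts for sampling error only, not for any of the systematic errors discussed in the post.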

Posted by: Jonathan King on January 9, 2004 09:11 PM

____

I've only taken one class in stats so far, but what Mr. King said was drilled into us:

..."if I polled and polled and polled again a large number of times under the same conditions, and generated confidence intervals for each poll, 95% of these confidence intervals would cover (include) the true population parameter. It does *NOT* mean that 95% of the sample mean percentages would be within MOE % of this ONE sample"...

But perhaps it's not as simple as what they feed our minds full of mush?

Posted by: andrew on January 9, 2004 11:11 PM

____

>>If you are a Bayesian, the words "confidence interval" would only pass your lips when you are deriding non-Bayesians.

I think I could make a decent argument that Bayesians are the *only* people who have any business talking about confidence intervals ...

Posted by: dsquared on January 12, 2004 07:00 AM

____

dsquared writes (quoting me):

">>If you are a Bayesian, the words "confidence interval" would only pass your lips when you are deriding non-Bayesians.

I think I could make a decent argument that Bayesians are the *only* people who have any business talking about confidence intervals ..."

Absolutely, but Bayesians do not call them "confidence intervals"!

Posted by: David Margolies on January 12, 2004 11:36 AM

____
