November 12, 2004

98000 ± 90000 Excess Deaths in Iraq

Daniel Davies puts on the hockey mask and uses the chainsaw on critics of the Lancet epidemiological study of excess deaths since Bush's invasion of Iraq:

Crooked Timber: Lancet roundup and literature review : Well, the Lancet study has been out for a while now, and it seems as good a time as any to take stock of the state of the debate.... Lots of heavy lifting here has been done by Tim Lambert and Chris Lightfoot.... I suggested to Tim that it was more like “litmus paper for hacks”; it’s up to each individual to decide for themselves whether they think a particular argument is an innocent mistake or not).... Starting with what I will describe as “Hack critiques”... arguments which are purely and simply wrong.... I’ll start with the most widespread one.

The Kaplan “dartboard” confidence interval critique.... Fred Kaplan of Slate suggested that because the confidence interval was very wide, the Lancet paper was worthless and we should believe something else like the IBC total. This argument is wrong... 1) The confidence interval describes a range of values which are “consistent” with the model. But... the most likely values are the ones in the centre.... The single most likely value is, in fact, the central estimate of 98,000 excess deaths. Furthermore... the truly shocking thing is that, wide as the confidence interval is, it does not include zero. You would expect to get a sample like this fewer than 2.5 times out of a hundred if... the war had made things better rather than worse.... 2) As the authors themselves pointed out... “Research is more than summarizing data, it is also interpretation.... [W]e had two other pieces of information. First, violence accounted for only 2% of deaths before the war and was the main cause of death after the invasion. That is something new, consistent with the dramatic rise in mortality and reduces the likelihood that the true number was at the lower end of the confidence range. Secondly, there is the Falluja data, which imply that there are pockets... that have far more deaths than the rest of the country. We set that aside these data in statistical analysis because the result in this cluster was such an outlier, but it tells us that the true death toll is far more likely to be on the high-side of our point estimate than on the low side.”... 3... “Kaplan’s Fallacy”. If your critique of an estimate is that the range is too wide... intellectual honesty demands that you refer to the whole range when using this critique, not just the half of it that you want to think about. In other words, it is dishonest to title your essay “100,000 dead – or 8,000?” when all you actually have arguments to support is “100,000 dead – or 8,000 – or 194,000?”.... 
The Kaplan piece was really very bad; as well as the confidence interval fallacy, there are the germs of several of the other fallacious arguments discussed below. It really looks to me as if Kaplan had decided he didn’t want to believe the Lancet number and so started looking around for ways to rubbish it, in the erroneous belief that this would make him look hard-headed and scientific and would add credibility to his endorsement of the IBC number. I would hazard a guess that anyone looking for more Real Problems For The Left would do well to lift their head up from the Bible for a few seconds and ponder what strange misplaced and hypertrophied sense of intellectual charity it was that made Kaplan, an antiwar Democrat, decide to engage in hackish critiques of a piece of good science that supported his point of view.

The cluster sampling critique... reached its fullest and most widely-cited form in a version by Shannon Love on the Chicago Boyz website. The idea here is that the cluster sampling methodology... reduces the power of the statistical tests and makes the results harder to interpret.... There are two big problems with the cluster sampling critique, and I think that they are both so serious that this argument is now a true litmus test for hacks; anyone repeating it either does not understand... or... knows that the argument is fallacious. The problems are: 1) Although sampling textbooks warn against the cluster methodology... the reason why it is risky is that it carries a very significant danger of underestimating the rare effects.... And 2).... Cluster sampling ain’t ideal, but needs must and it is frequently used in bog-standard epidemiological surveys outside war zones. The effects of clustering on standard results of sampling theory are known, and there are standard pieces of software that can be used to adjust (widen) one’s confidence interval to take account of these design effects. The Lancet team used one of these procedures, which is why their confidence intervals are so wide

There is a variant of this critique which is darkly hinted at by both Kaplan and Love... that it is not a random sample, but is cherry-picked in some way. In order to believe this, if you have read the paper, you have to be prepared to accuse the authors of telling a disgusting barefaced lie....

The argument from the UNICEF infant mortality figures.... The idea here is that the Lancet study finds a prewar infant mortality rate of 29 per 1000 live births and a postwar infant mortality rate of 54 per 1000 live births. Since the prewar infant mortality rate was estimated by UNICEF to be over 100, this (it is argued) suggests that the study is giving junk.... The authors discuss a few reasons why the movement in infant mortality might be exaggerated... it is good form to look very closely at any anomalies in data. Which is what Chris Lightfoot did.

Basically, the UNICEF estimate is quoted as a 2002 number, but it is actually based on detailed, comprehensive, on-the-ground work carried out between 1995 and 1999 and extrapolated forward. The method of extrapolation is not one which would take into account the fact that 1999 was the year in which the oil-for-food program began to have significant effects on child malnutrition in Iraq. No detailed on-the-ground survey has been carried out since 1999, and there is certainly no systematic data-gathering apparatus in Iraq which could give any more solid number....

We now move into the area of what might be called “not intrinsically hack” critiques. These are issues which one could raise with respect to the study... not themselves based on evidence strong enough to make anyone believe that the study’s estimates were wrong unless they thought so anyway.... The first might be called the “Lying Iraqis” theory. This would be the theory that the interview subjects systematically lied to the survey team. In fact, the team did attempt to check against death certificates in a subsample of the interviews and found that in 81% of cases, subjects could produce them.... Finally, we come onto two critiques of the study which I would say are valid. The first is... that the extrapolated number of 98,000 is a poor way to summarise the results of the analysis. I think that the simple fact that we can say with 97.5% confidence that the war has made things worse rather than better is just as powerful.... The second one is... the Lancet’s editorial comment... contained the phrase “100,000 civilian deaths”. The study itself counts excess deaths and does not attempt to classify them as combatants or civilians....

Finally, beyond the ultra-violet spectrum of critiques are those which I would classify as “beyond hackish”.... The attempt to use the IBC numbers as a stick to beat the Lancet study. The two studies are simply not comparable.... Arguments which... imply that there could be no valid form of epidemiology, econometrics, opinion polling, or indeed pulling up a few spuds to see if your allotment has blight.... flypaper for innumerates....

Finally, there is the strange world of Michael Fumento, a man who is such a grandiose and unselfconscious hack that he brings a kind of grandeur to the role. I can no more summarise what a class A fool he’s made of himself in these short paragraphs than I could summarise King Lear. Read the posts on Tim’s site and marvel....

The bottom line is that the Lancet study was a good piece of science, and anyone who says otherwise is lying. Its results (and in particular, its central 98,000 estimate) are not the last word on the subject, but then nothing is in statistics. There is a very real issue here, and any pro-war person who thinks that we went to war to save the Iraqis ought to be thinking very hard about whether we made things worse rather than better (see this from Marc Mulholland, and a very honourable mention for the Economist). It is notable how very few people who have rubbished the Lancet study have shown the slightest interest in getting any more accurate estimates; often you learn a lot about people from observing the way that they protect themselves from news they suspect will disconcert them.

As far as I can see, Daniel gets only one thing wrong. There are two problems with cluster sampling when the number of clusters is roughly the inverse of the probability that any one cluster has been hit by the disaster that you are trying to assess. First--as Daniel says--there is a large probability that you will underestimate the extent of the disaster. But there is also a small probability that you will significantly overestimate the extent of the disaster. The confidence intervals aren't symmetric, and that asymmetry has consequences.

For example, the figure below shows the probabilities when you sample 25 clusters each of which has a 1/25 chance of being hit by the disaster. There is a 36% chance you will get no hits at all--that's the underestimate. There's a 37.5% chance you will get one hit--that's the accurate measure. But there is an 8% chance that you will get three hits or more, and thus overestimate the impact of the disaster by a factor of at least three.
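Those percentages can be checked in a few lines. This is a minimal sketch, assuming the 25 clusters are hit independently of one another, so that the number of hits follows a binomial distribution; the function name `hit_probability` is just an illustrative choice.

```python
from math import comb

def hit_probability(k, n=25, p=1 / 25):
    """Binomial probability that exactly k of n independent clusters are hit."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

p_none = hit_probability(0)                   # no hits: the underestimate
p_one = hit_probability(1)                    # one hit: the accurate measure
p_two = hit_probability(2)                    # two hits: overestimate by a factor of two
p_three_plus = 1 - p_none - p_one - p_two     # three or more hits

for label, prob in [("0 hits", p_none), ("1 hit", p_one),
                    ("2 hits", p_two), ("3+ hits", p_three_plus)]:
    print(f"{label:8s} {prob:.1%}")
```

Under that assumption the four cases come out at roughly 36%, 37.5%, 19% and 8%, so the chance of any overestimate at all (two hits or more) is a bit over one in four.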

Posted by DeLong at November 12, 2004 09:51 AM
Comments

Davies doesn't directly address the asymmetrical risk of oversampling, true, but I assume that it is part of the reason that the confidence interval is so large -- i.e. that when he says that the problems with cluster sampling are well understood and have been dealt with in the Lancet study through utterly conventional means, he includes that risk. Admittedly, I assume that on the basis of finding him generally reliable, rather than on sufficient knowledge of stats to check it myself.

Posted by: LizardBreath at November 12, 2004 10:11 AM

I discuss the confidence interval issue in my ZNet piece: 100,000 Iraqis Dead: Should We Believe It?
http://www.zmag.org/content/showarticle.cfm?SectionID=15&ItemID=6565
What I conclude is that a major reason the CIs are so large is that the authors engaged in a conservative (in a statistical sense) analysis of their data and included the Kurdish region. In this region, there was a reversal of the pattern in the rest of the country: more deaths prior to invasion than after. If this region had been excluded, the CIs would have been narrower, and the point estimate (98,000) higher.

The fact that the authors did not exclude the Kurdish region is one piece of evidence that they were NOT seeking to maximize the estimate of casualties. This type of thing is one of the things we researchers look for when judging the quality of another's study. This study was one of the best I've seen. Still, given its discrepancy with the estimates of others (Iraq Body Count, Project on Defense Alternatives), I tend to believe the "true" estimate is somewhat below 98,000, but still in the many tens of thousands.

Posted by: Stephen Soldz at November 12, 2004 10:31 AM

re: Brad's concern for cluster sampling, Davies points out (as did the Economist) that one of the randomly selected clusters was in Fallujah. The researchers collected data on Fallujah, but omitted that data from their study because Fallujah was substantially more violent than the rest of Iraq.

Clustering is subject to the risk Brad cites, but the full risk would only apply to statistical results that included Fallujah.

If you sample 33 clusters, each of which has a 1/33 chance of being flattened by insurgency and the US military, but then you throw out the worst cluster because its results are substantially higher than all the rest, just how large is the risk that you've underestimated mortality?

Posted by: Silent E at November 12, 2004 10:43 AM

So Bush has killed more people than Osama. They are both terrorists. It's doubtful that either will ever see the inside of a court room.

Posted by: feeling_cynical at November 12, 2004 11:40 AM

Stephen, the Iraq Body Count project was covered by D^2 in an earlier post. It was trying to establish a lower bound. IIRC, it required each death to have multiple confirming sources.

Posted by: Barry at November 12, 2004 11:59 AM

If you expect to find one "hit" in your sample, there is roughly a one in four chance of an overestimate and a three in eight chance of an underestimate. But if you exclude your largest outlier, the chances of including even the hit you expect are diminished: you have a three quarters chance of an underestimate, roughly a one in five chance of getting it right, and an 8 per cent chance of an overestimate, which is most likely to be only one hit over. This might not apply in this case, but the effect is in any case smaller if the difference between a hit and a miss is less marked than some or none. Excluding the Falluja sample seems to achieve just that.
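A rough sketch of that arithmetic, using Brad's example of 25 independent clusters with a 1/25 chance each (so the hit count is binomial), and assuming that throwing out the largest observation always removes one hit whenever there is at least one. The names are purely illustrative.

```python
from math import comb

N_CLUSTERS, P_HIT = 25, 1 / 25

def p_hits(k):
    """Binomial probability of exactly k hit clusters before any exclusion."""
    return comb(N_CLUSTERS, k) * P_HIT**k * (1 - P_HIT) ** (N_CLUSTERS - k)

# After dropping the largest observation, k hits become max(k - 1, 0) hits.
underestimate = p_hits(0) + p_hits(1)          # no hits left, one expected
about_right = p_hits(2)                        # exactly one hit left
overestimate = 1 - underestimate - about_right # two or more hits left

print(f"underestimate {underestimate:.1%}")
print(f"about right   {about_right:.1%}")
print(f"overestimate  {overestimate:.1%}")
```

On those assumptions the three cases come to roughly 74%, 19% and 8%.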

There is not necessarily any discrepancy between the IBC and the Lancet figures because their definitions are different. In particular IBC do not include combatants.

Sorry to quibble but inches seem to turn into miles very easily on this issue.

Posted by: Jack at November 12, 2004 11:59 AM

Hmmm ... I see what you mean. I thought I implicitly addressed this point in suggesting that the "outlier critique" was potentially valid, although there is, statistically speaking, no reason to believe that's what happened (ex hypothesi, one doesn't assume that an unlikely event has happened rather than a likely one).

Also, there is the internal evidence of the dataset, summarised by Richard Garfield's reply to correspondence in the text. If you just walked around, grabbed 32 samples and collated them, then you would say there was a small but material chance of a big oversample. If, on the other hand, you had also got one big outlier and thrown it out, plus the samples consistently across the country showed that outside the Kurdish region, violence had gone from a minor cause of death to being the biggest single cause, then there is really nothing in the dataset which gives you any reason to believe that the tail event of an oversample has occurred.

Posted by: dsquared at November 12, 2004 12:14 PM

http://www.nytimes.com/2004/11/11/international/middleeast/11snipers.html

Hard Lesson in Battle: 150 Marines Meet 1 Sniper
By DEXTER FILKINS

FALLUJA, Iraq - American marines called in two airstrikes on the pair of dingy three-story buildings squatting along Highway 10 on Wednesday, dropping 500-pound bombs each time. They fired 35 or so 155-millimeter artillery shells, 10 shots from the muzzles of Abrams tanks and perhaps 30,000 rounds from their automatic rifles. The building was a smoking ruin.

But the sniper kept shooting.

He - or they, because no one can count the flitting shadows in this place - kept 150 marines pinned down for the better part of a day. It was a lesson on the nature of the enemy in this hellish warren of rubble-strewn streets. Not all of the insurgents are holy warriors looking for martyrdom. At least a few are highly trained killers who do their job with cold precision and know how to survive.

"The idea is, he just sits up there and eats a sandwich," said Lt. Andy Eckert, "and we go crazy trying to find him."

The contest is a deadly one, and two marines in Company B, First Battalion, Eighth Regiment of the First Marine Expeditionary Force have been killed by snipers in the past two days as the unit advanced just half a mile southward to Highway 10 from a mosque they had taken on Tuesday.

Despite the world-shaking blasts of weaponry as the Americans try to root out the snipers, this is also a contest of wills in which the tension rises to a level that seems unbearable, and then rises again. Marine snipers sit, as motionless as blue herons, for 30 minutes and stare with crazed intensity into the oversized scopes on their guns. If so much as a penumbra brushes across a windowsill, they open up.

With the troops' senses tuned to a high pitch, mundane events become extraordinary. During one bombing, a blue-and-yellow parakeet flew up to a roof of a captured building and fluttered about in tight circles before perching on a slumping power line, to the amazement of the marines assembled there.

On another occasion, the snipers tensed when they heard movement in the direction of a smoldering building. A cat sauntered out, unconcerned with anything but making its rounds in the neighborhood.

"Can I shoot it, sir?" a sniper asked an officer.

"Absolutely not," came the reply.

This day started at about 8 a.m., when the marines left the building where they had been sleeping and headed south toward Highway 10, which runs from east to west and roughly bisects the town. At the corner of Highway 10 and Thurthar, the street they were moving along, was a headquarters building for the Iraqi National Guard that had been taken over by insurgents.

Almost immediately, they came under fire from a sniper in the minaret of a mosque just south of them. Someone in a three-story residential building farther down the street also opened up. The marines made 50-yard dashes and dived for cover, but one of them was cut down, killed on the spot. It was unclear what direction the fatal bullet had come from.

"I don't know who it was," Lt. Steven Berch, leader of the fallen marine's platoon, said of the attacker, "but he was very well trained."

Posted by: lise at November 12, 2004 12:18 PM

To make the above point a bit clearer, one way in which you could attempt to guard against the issue that Brad describes above is to see how sensitive your result is to the deletion of a couple of data points. The kind of oversample you would get in the rare unlucky case in cluster sampling would be a skewed estimate of the mean which was the result of a small number of large observations. This doesn't appear to be the case with the Iraqi data; consistently across the non-Kurdish governorates of Iraq, the death rate was about 50% higher after the invasion, apart from Anbar governorate, which contained Fallujah and saw the death rate go up by multiples. I find it hard to regard this as characteristic of a dataset that has oversampled clusters; I think that a much more promising way out for people who are trying to justify a lower figure would be to look for ways in which there might be reporting bias by interviewees (what I rather pejoratively call the "lying Iraqis" critique).

Posted by: dsquared at November 12, 2004 12:23 PM

The concentration on Confidence Intervals may lack for viability, as this is both asymmetrical Cluster and asymmetrical data contribution. lgl

Posted by: lgl at November 12, 2004 12:34 PM

I have mostly two objections to the way results are presented in the study. The first one is what I call the religious adherence to the 95% confidence interval. This is not a study that lends itself to making a prediction with 95% accuracy, and the confidence interval should reflect that.

The second is the use of the term conservative estimate. There's no formal definition of the term I'm aware of, but it is generally used to denote "in all likelihood, the actual value is higher than the estimate". So rather than calculating a biased estimate by asymmetrically leaving out outliers *to one side*, giving a noncentral estimate that puts the probability of the actual value being higher at, say, 67% rather than 50% would be more in keeping with the common meaning of the term. Note that most commenters who accepted the study as credible did their own "conservative estimate" by taking off some 20-30,000 from the reported number.

Posted by: ogmb at November 12, 2004 01:50 PM

ogmb,

They did remove their outlier on one side so I think that use is fair.

The problem with not using 95% intervals is that these are usual and if they had used, say, a 75% interval they would be receiving complaints about hiding the inaccuracy of their results by artificially reducing the size of the interval.

Posted by: Jack at November 12, 2004 01:58 PM

Btw, both Daniel and Brad are wrong in offering binomial analogies to what is for all intents and purposes a multinomial random variable (excess deaths). A better analogy might be groundwater pollution from well poisoning.

Posted by: ogmb at November 12, 2004 02:11 PM

2 points

1. Isn't the percentage probability of overestimation the sum of the percentages for events greater than 1?

2. Is there a problem with assuming an even distribution of 1/25 for being hit by disaster? If we take a tornado, for example, the chances of being hit are not evenly distributed. In fact they will be close to 1, if you’re in the path, and close to 0, if you're not.

Posted by: neil at November 12, 2004 02:12 PM

Jack: "They did remove their outlier on one side so I think that use is fair."

You can apply gut feeling and say the number is probably closer to 70,000 than 98,000 and you create a "fair" conservative estimate. Their approach to Falluja is to declare it an outlier and to leave it out of the distribution, and then report the unconservative central value of the conservative distribution. My preferred method would be to take the unconservative distribution (incl. Falluja) and to establish a biased estimate that meets the definition for "conservative" I offered: i.e. the area under the distribution to the right of the estimate is bigger than the area to the left.

The reason I ask for this is Falluja cannot simply be labeled an outlier. An outlier is characterized by two things: 1. It's an atom in the tail of a distribution that's known to have thin tails, and 2. It's the result of systemically different conditions from the rest of the observations. Iow it's a sampling accident that tells us nothing about the underlying distribution. Falluja does not meet those criteria. The violence in Falluja differed in degree from other cities, but we have no evidence that it differed in kind, and we don't know very much about the thickness of the right tail (i.e. how many cities have violence levels similar to Falluja). So the Falluja datapoint potentially tells us something about the distribution, and leaving it out is the result of political rather than methodological considerations.

Posted by: ogmb at November 12, 2004 04:52 PM

ogmb,
I'm slightly lost here. There seems quite a change of pace between the rough, ready and unorthodox "gut feeling" and unconventional central estimate combo and the precise definition of outlier.
If the Falluja point is not an outlier then the tail is fat or the Falluja datapoint is not systematically different from the rest of Iraq. If either were the case the figures ought to be much higher than they are and the estimate is conservative. If not, they have removed an outlier and the estimate is at least cautious. Leaving it out is certainly less political than saying that the figure is probably lower when they probably do not believe that to be the case.

Posted by: Jack at November 12, 2004 05:14 PM

Jack:

As I said, any way of moving the estimate towards the "cautious" position makes it a "conservative" estimate, no matter how crude or sophisticated. I just formalized the common definition of the term and pointed out that the way the authors used it is not consistent with that definition (i.e. that the area of the distribution to the right is bigger than to the left).

The point about outliers is one of sound data evaluation. Outliers are not just extreme values, they are extreme values that pollute the sample because they're not driven by the underlying effects we are trying to uncover. Say, a war-unrelated outbreak of a disease in Falluja would create an outlier. Extensive fighting does not create an outlier, it is just an extreme instance of the underlying factors that increased the post-war mortality rate. As such, careful consideration should have concluded that we can't remove it based on the circumstances, but the researchers decided to remove it based on its characteristic as an extreme value alone.

Now I understand the researchers' attempt to appear cautious and to deflect criticism by creating a "conservative" estimate. I proposed a way to do this which might be unusual but is widely applicable, tractable and well-defined. But truncating the sample based only on the authors' wish to create a less alarmist estimate is not proper proceeding.

Posted by: ogmb at November 12, 2004 08:34 PM

I must respectfully disagree with one of Daniel Davies' most significant conclusions, for King Lear can certainly be summarized in a few short paragraphs. While it is true that sampling error renders the procedure somewhat risky and subject to significant inaccuracy, nevertheless, in my view, it turns out to be well worth the effort. See if you agree:

--------------
An arrogant king brings shame and ruin on himself and his kingdom by his foolish, spoiled behavior.

Lear, against advice, decides to force his three daughters to praise him lavishly in return for pecuniary reward. When one daughter refuses, Lear disowns her, starting a long chain of events which finishes with his descent into madness.

The only character who is permitted to criticize the king turns out also to be the only one who actually gives a damn about him. This Fool stays longest by Lear's side, yet even he does not survive to witness his master's demise, which is rich in pathos and irony.
--------------

Let Ozymandians everywhere take heed; and listen, for god's sake, to Jon Stewart when he is talking about you!

Believe it or not, he is trying to save your entirely undeserving skin.

Posted by: Ralph at November 12, 2004 11:12 PM

ogmb,

It seems to me that your alternate estimator (with 67% of the probability mass on one side) is impractical. The Lancet estimator is asymptotically normal (I assume). You're calling for the use of the small sample distribution of their estimator, which is presumably asymmetric, not t or normal. But the small sample distribution is going to be a function of the true parameters of the underlying process, and we don't know the true parameters. Or am I missing something?

Posted by: Ragout at November 13, 2004 12:38 AM

I look on this all with bemusement. One thing we know with precision is that 1100 plus US soldiers are dead and perhaps 8000 seriously wounded. We also know the US enjoys an asymmetry of force: we have tanks and planes and the opposition doesn't. We have body armor, they largely don't. We have the best trained military in the world, they have some teenager with a rocket launcher. Why are we talking outliers? To suggest that our guys on average are not killing 10 for every death and serious casualty we suffer is to suggest that our soldiers and marines are inept.

Whatever else you think of them, they are not inept. They are extraordinarily good at killing. They are shooting and bombing and people on the other end are dying. Which is the point of shooting and bombing. Without going back to the distortions of Vietnam era "body counts", just take ten or fifteen pieces of reporting on US deaths vs reported "insurgents" killed you have seen over the last year and extrapolate.

This whole war was a clusterfuck from day one and now some people are using statistics to argue that 1100 dead troops and 8000 or so seriously wounded are the Gang that Couldn't Shoot Straight. There is a ratio of Iraqi to US dead. We may never know what it is precisely. But if it is not pretty damn high those of you who supported this war in the first place should be staring at the ceiling sleepless nightly.

You can indulge your fantasies that every bombing is "a precision strike", that our human information on the ground in Fallujah was good enough that every "safe house" bombed was solely occupied by insurgents, and that somehow we knew exactly how many we bagged per bomb. But we are killing somebody, a somebody who likely would still be alive if we had not invaded Iraq, and it really does not matter if he was pointing a gun or cowering in a corner, dead is dead.

If we have blown through $150,000,000,000 and ten thousand US dead and seriously wounded and NOT killed 100,000 Iraqis our troops are doing a piss poor job. Because you are talking $1.5 million an Iraqi here.

There is a desperate dance going on to pretend that this war is as cheap and pain free as Wolfowitz and Rumsfeld promised it would be. Sorry, people are dying in very large numbers. That is what war does.

On the very first day of the war we launched a cruise missile strike at a restaurant in the heavily populated al-Mansour residential neighborhood because we had some evidence that Saddam and his sons were there. They weren't. But a thirteen year old girl was there or near. We know that because she was taken out of the rubble in pieces, head last. Her mother was there to see it. This was widely reported at the time but the pro-war folk didn't even blink. I think about that girl often because she was just the first counter in a body count that was bound to rise to horrific totals.

But I guess it is more important to talk about outliers than confront the simple fact that war is not good for little girls and other living things. Bush chose this war, many, many people were enablers, and no amount of inventorying deck chairs on the Titanic is going to change the fundamental realities here.

Posted by: Bruce Webb at November 13, 2004 03:47 AM

Ralph: I'm sure what D^2 had in mind was the poetic character of Lear's madness.

For a parallel challenge, formalize this:

Nobirdy avair soar anywing to eagle this.

Posted by: jlgoldberg@brick.net at November 13, 2004 05:20 AM

Sorry, in the last post "eagle this" should be "eagle it".

Posted by: Jonathan Goldberg at November 13, 2004 05:22 AM

There is one thing more. The responsible party has the moral responsibility of overestimating the damage caused by their actions. To say, well, we only caused 40000 deaths rather than 98000 so we're ok. What kind of thinking is that?

Posted by: Bil at November 13, 2004 06:25 AM

10,000, 98,000, 188,000? It doesn't really matter, because no one in our God-fearing, "moral" nation, aside from some liberal bloggers, really gives a damn. Scott Peterson, yes, but not mass killing of civilians by American troops.

Posted by: Bob H at November 13, 2004 06:56 AM

Hello

The Lancet study type of mortality statistics do not necessarily mean what people intuitively think that they mean. As an example, my understanding is that there is a lot more automotive traffic in Iraq than before the war, and so it would be logical to presume more traffic deaths. These enter into excess mortality statistics, but are they a cost of the war? Closer to home, death rates may go down during power blackouts - are power blackouts therefore a good thing? This is not to put the Lancet study down, just that we need to realize that what it tells us may not be what we want to know.

I still think that the best estimate for casualties can be derived just from considering other similar wars.

For example, in Vietnam, we lost about 58,000, the ARVN ("our" Vietnamese troops) lost 225,000, the NVA lost 1.5 million, and there were 4 million civilian casualties. The ratios are thus (NVA vs US) = 26 to 1, (Civilians vs US) = 69 to 1. My understanding of counter-insurgency warfare is that ratios (insurgents killed vs army killed) and (civilians killed vs army killed) of at least 10 to 1 each are pretty universal.

So, based on a figure of 1100 US dead, we would expect 28,000 insurgents and 75,000 civilian dead (using Vietnam as an analogy). This is not too far off from the mid-point of the Lancet study. The current Iraq body count - http://www.iraqbodycount.net/ - is 14,304, or about 13 to 1 versus US casualties. This is supposed to be civilians, but I would suspect that it includes some insurgents as well. It is clearly a lower bound, not an estimate.

So, the Lancet numbers are plausible, and it is hard to see, based on past experience, how total Iraqi civilian casualties could not be in the range of several to many tens of thousands.

Regards

Posted by: Marshall Eubanks at November 13, 2004 07:22 AM
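
The ratio arithmetic in the comment above can be checked with a minimal Python sketch. The figures are the ones quoted in the comment; the variable names are illustrative only, and the Vietnam analogy itself is the commenter's assumption, not something the code validates.

```python
# Figures as quoted in the comment above (not independently verified here).
us_dead_vietnam = 58_000
nva_dead = 1_500_000
civilian_dead_vietnam = 4_000_000
us_dead_iraq = 1_100
iraq_body_count = 14_304

# Deaths per US death in Vietnam.
nva_ratio = nva_dead / us_dead_vietnam
civilian_ratio = civilian_dead_vietnam / us_dead_vietnam

# Scale the Vietnam ratios by the US death toll in Iraq.
implied_insurgents = nva_ratio * us_dead_iraq
implied_civilians = civilian_ratio * us_dead_iraq

# Iraq Body Count total per US death.
ibc_ratio = iraq_body_count / us_dead_iraq

print(f"NVA vs US: {nva_ratio:.1f} to 1")
print(f"Civilians vs US: {civilian_ratio:.1f} to 1")
print(f"Implied insurgent dead: {implied_insurgents:,.0f}")
print(f"Implied civilian dead: {implied_civilians:,.0f}")
print(f"IBC vs US: {ibc_ratio:.1f} to 1")
```

The implied totals land close to the rounded 28,000 and 75,000 given in the comment; the small differences come from rounding the ratios to 26 and 69 before multiplying.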

From my "semi-monthly blog" :)

In the medical journal Lancet, Les Roberts and colleagues have a paper demonstrating the increase in the number of deaths in Iraq with the inception of the war. They have a cluster sample design, where 30 clusters were selected at random, and then 30 households in each were interviewed. The blogosphere has been extremely active these days with posts discussing the benefits and pitfalls of cluster methodology and what to make of the results:

Evidence suggests that the mortality rate was higher across Iraq after the war than before, even excluding Falluja. We estimate that there were 98 000 extra deaths (95% CI 8000–194 000) during the post-war period in the 97% of Iraq represented by all the clusters except Falluja.

Personally, I do not find the results surprising in the least (it is a war after all), but one problem of the study, caused by the small number of clusters, has been overlooked. They use a generalized linear model procedure (using STATA 8.0) with bootstrapped standard errors.

I did a small Monte Carlo experiment last night, and found that even with bootstrapping, the standard errors are about 50% too small. The experiment design had 30 clusters with 30 observations in each. I performed 1000 simulations. I estimated a linear regression with glm (results are similar for probits or binomial regression). With no correction we reject the null hypothesis when it is true in 50% of the simulations. With clustered standard errors the figure is 7%, and with bootstrapping 8%. Note that we would expect a rejection rate of 5%.

Techniques that work very well when the number of clusters is large and the number of observations in each cluster relatively small do not, in general, work very well when the data structure has a small number of clusters. Even less is known with a "square" (30 by 30) design such as the one used in the Lancet study.

Posted by: Eduardo Leoni at November 13, 2004 07:34 AM
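
A Monte Carlo experiment of the kind described above might look like the following minimal sketch. It is not the commenter's code: the data-generating process (a regressor and an error term that each share a cluster-level component, with a true slope of zero), the function name `rejection_rates`, and the use of a plain sandwich estimator with no small-sample correction are all assumptions made for illustration. The bootstrap case is omitted.

```python
import math
import random


def rejection_rates(n_clusters=30, n_per=30, n_sims=1000, seed=0):
    """Return (naive, clustered) rejection rates of a true null at a nominal 5%."""
    rng = random.Random(seed)
    n = n_clusters * n_per
    naive_hits = 0
    clustered_hits = 0
    for _ in range(n_sims):
        xs, ys = [], []
        for _ in range(n_clusters):
            # Assumed design: x and y each get a shared cluster draw plus
            # individual noise; y does not depend on x, so the null is true.
            cluster_x, cluster_y = rng.gauss(0, 1), rng.gauss(0, 1)
            xs.append([cluster_x + rng.gauss(0, 1) for _ in range(n_per)])
            ys.append([cluster_y + rng.gauss(0, 1) for _ in range(n_per)])
        x_bar = sum(map(sum, xs)) / n
        y_bar = sum(map(sum, ys)) / n
        sxx = sum((x - x_bar) ** 2 for c in xs for x in c)
        sxy = sum((x - x_bar) * (y - y_bar)
                  for cx, cy in zip(xs, ys) for x, y in zip(cx, cy))
        slope = sxy / sxx  # OLS slope with an intercept
        rss = 0.0
        meat = 0.0
        for cx, cy in zip(xs, ys):
            score = 0.0
            for x, y in zip(cx, cy):
                resid = (y - y_bar) - slope * (x - x_bar)
                rss += resid * resid
                score += (x - x_bar) * resid
            meat += score * score  # squared within-cluster score sums
        se_naive = math.sqrt(rss / (n - 2) / sxx)  # assumes independence
        se_cluster = math.sqrt(meat) / sxx         # cluster-robust sandwich
        naive_hits += abs(slope / se_naive) > 1.96
        clustered_hits += abs(slope / se_cluster) > 1.96
    return naive_hits / n_sims, clustered_hits / n_sims


if __name__ == "__main__":
    naive, clustered = rejection_rates()
    print(f"naive: {naive:.3f}  clustered: {clustered:.3f}")
```

Under these assumed settings the naive rate should come out far above the nominal 5%, and the clustered rate much closer to it but not necessarily at it, which is the qualitative pattern the comment reports. The exact numbers depend on the assumed intra-cluster correlation.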

I have a question for the stats experts on this thread. I was debating my brother about the Lancet study, and said (using my memory of confidence intervals, normal distributions and standard deviations): look, there's a 66% chance that the death toll is between 50,000 and 150,000 (one std dev from 100,000), or to put it another way, there's only an 18% chance, AT MOST, that the IBC toll is even in the right order of magnitude (i.e., that the toll is less than 50,000).

It had a nice ring to it at the time, but now as I think through it, a number of possible criticisms jump to mind. Anybody want to give a critique of it?

Posted by: Geoffrey at November 13, 2004 09:19 AM

Whoops, correction - not 18%, 16.6% or so. I'll admit, I'm winging it.

Posted by: Geoffrey at November 13, 2004 09:24 AM
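
The back-of-envelope calculation above can be made explicit with a minimal Python sketch. It assumes a normal sampling distribution centred on the point estimate, with the standard deviation backed out from the width of the published 95% interval. That is a simplification: the published interval is not quite symmetric about 98,000, and this says nothing about the Falluja cluster. The names below are illustrative.

```python
import math


def normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


point_estimate = 98_000
ci_lower, ci_upper = 8_000, 194_000

# Assumption: a 95% interval spans about 1.96 standard deviations each side.
implied_sd = (ci_upper - ci_lower) / (2 * 1.96)

# Probability mass between 50,000 and 150,000 under the normal approximation.
p_between = (normal_cdf((150_000 - point_estimate) / implied_sd)
             - normal_cdf((50_000 - point_estimate) / implied_sd))

# Probability mass below 50,000.
p_below = normal_cdf((50_000 - point_estimate) / implied_sd)

print(f"implied sd: {implied_sd:,.0f}")
print(f"P(50,000 < deaths < 150,000): {p_between:.3f}")
print(f"P(deaths < 50,000): {p_below:.3f}")
```

The implied standard deviation is a little under 50,000, so the "one standard deviation either side of 100,000" framing is roughly right under this approximation.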

"FALLUJA, Iraq - American marines called in two airstrikes on the pair of dingy three-story buildings squatting along Highway 10 on Wednesday, dropping 500-pound bombs each time. They fired 35 or so 155-millimeter artillery shells, 10 shots from the muzzles of Abrams tanks and perhaps 30,000 rounds from their automatic rifles. The building was a smoking ruin."

This called to mind the statistics they used to produce showing how many bullets, bombs and bucks it cost to kill one Viet Cong . . . it was one of those "Gee whiz" deals where it would have been cheaper to move them to California and buy them a Cadillac convertible. I think TIME (among others) published these from time to time but I can't put my hand on an example right now. But they were interesting, like body counts and blood trails.

Posted by: Steve at November 13, 2004 10:49 AM

Eduardo: The EpiInfo program used in the study performs the adjustment to the standard errors that you're talking about, as far as I can tell.

Geoffrey: You really can't compare the Lancet and IBC studies. The IBC study is meant to be counting confirmed civilian deaths by violence. This is bound to be lower than total excess deaths, so it's not surprising that it is in fact in the lowest part of the CI for the Lancet study. Neither study should be taken as confirming or falsifying the other; they're simply not measuring the same things.

Posted by: dsquared at November 13, 2004 11:00 AM

The article does not have sufficient information for one to replicate the study, or to design a Monte Carlo simulation to investigate the properties of the estimator used. What I showed was that even in the best of circumstances, i.e. a regular OLS with no problems whatsoever except heteroskedasticity due to clustering, one can find problems.

1) The "clustered standard errors" do not work.

2) If the bootstrapping is the one built into Stata (not one coded by hand), it samples the clusters (and not the observations within each); that's one reason to see differences in the Monte Carlo results.

However, if EpiInfo or whatever they did in Stata takes care of these problems, one more praise to the study. I do hope it is wrong, though - in general I don't like people dying...

Posted by: Eduardo Leoni at November 13, 2004 01:26 PM

People, this is a MORTALITY study, not a CASUALTY study. There is a difference. Mortality includes ALL causes of death -- including deaths caused by lack of clean water (due to water treatment plants lacking electricity, and water mains being casualties of bombing), deaths caused by lack of medicines in hospitals or lack of access to hospitals (due to road blocks, hospitals being blocked off, etc.), deaths caused by insurgents' car bombs, deaths from crimes that were prevented back when the Iraqi police were adequately staffed, deaths from auto accidents even (not a minor thing, without electricity traffic lights do not work, without policemen to direct traffic at those intersections driving becomes a game of "chicken"!).

So the Iraq Body Count summary of 20,000 CIVILIAN CASUALTIES is quite consistent with the Lancet estimate of 90,000+ excess MORTALITIES, and, indeed, is what we would actually expect -- those directly killed by American action SHOULD be swamped by those killed due to destruction of infrastructure, lack of supplies, and general lawlessness. After all, those latter affect millions of people, while American military actions directly kill only the few people in the immediate vicinity of one of our soldiers or one of our bombs or shells. Add in the general lawlessness of post-war Iraq, where kidnappings, murders, bombings of police stations, and other such actions are common. Add in the fact that major cities with millions of people in them, such as Basra, lacked clean water and fuel for boiling water for many days after we "liberated" them, which undoubtedly killed thousands all by itself. The numbers add up, if you're honest and look at all the repercussions of modern war.

- Badtux the History-studying Penguin

Posted by: Badtux at November 13, 2004 01:39 PM

I am probably wrong ... it seems they actually use only one observation per cluster in Stata (the summary measures from EpiInfo) and my critique wouldn't apply.

Posted by: Eduardo Leoni at November 13, 2004 01:44 PM

A bootstrap estimate of the confidence interval should deal with that problem (one must pick clusters randomly, not, of course, individual observations). The Lancet published a symmetric confidence interval, which implies that the authors calculated a standard error and imposed normality. This is valid only for a large number of clusters.

However, it is clear that their result is not due to a few bad clusters, because they looked for and tossed a bad outlier, Fallujah. Obviously the problem in Fallujah is not measurement error. This rough-and-ready practice of tossing out clusters where disaster has hit implies that the estimates are biased down and do not have a fat upper tail.

Posted by: Robert Waldmann at November 13, 2004 01:50 PM
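
The cluster-level bootstrap mentioned above can be sketched as follows. This is a generic percentile bootstrap over whole clusters, not the procedure used in the Lancet paper. The function name `cluster_bootstrap_ci` is hypothetical and the input values are synthetic, drawn from a skewed distribution purely for illustration.

```python
import random


def cluster_bootstrap_ci(cluster_values, n_boot=2000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for the mean, resampling whole clusters."""
    rng = random.Random(seed)
    k = len(cluster_values)
    means = []
    for _ in range(n_boot):
        # Draw k clusters with replacement; each cluster's summary value
        # enters or leaves the resample as a single unit.
        resample = [cluster_values[rng.randrange(k)] for _ in range(k)]
        means.append(sum(resample) / k)
    means.sort()
    lower = means[int((alpha / 2) * n_boot)]
    upper = means[int((1 - alpha / 2) * n_boot) - 1]
    return lower, upper


# Synthetic, skewed per-cluster values (NOT data from the study).
data_rng = random.Random(42)
synthetic_clusters = [data_rng.expovariate(1 / 10) for _ in range(30)]
sample_mean = sum(synthetic_clusters) / len(synthetic_clusters)

lower, upper = cluster_bootstrap_ci(synthetic_clusters)
print(f"mean: {sample_mean:.2f}  95% interval: ({lower:.2f}, {upper:.2f})")
```

A percentile interval of this kind need not be symmetric about the point estimate, which is the contrast the comment draws with an interval built from a standard error and a normality assumption.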

Marshall Eubanks wrote, "As an example, my understanding is that there is a lot more automotive traffic in Iraq than before the war, and so it would be logical to presume more traffic deaths. These enter into excess mortality statistics, but are they a cost of the war ?"

If you look at the original study, a whole lot of the increase was violent death. And 80% of the violent deaths were attributed to Americans. 95% of the deaths attributed to Americans were from airstrikes. Now to back up a little, we're talking about around 60 deaths due to American violence, and 3 of them were not airstrikes. Of the three, 2 got American apologies. These are not large numbers. Of *course* the error estimates will be high; they lacked the resources to do a bigger study that would show up a whole lot of deaths. The results were reasonably consistent across the country except for Kurdistan and Fallujah, which was representing Anbar province. Throw out Kurdistan and Anbar and you get a rather high, consistent rate all over.

If you care about the numbers, and you feel this study is not precise enough to suit you, then it would make sense to fund a larger one.

Posted by: J Thomas at November 14, 2004 04:20 PM

Bruce Webb:

Fine post, sir. Well done.

Best,

D

Posted by: Dano at November 14, 2004 06:59 PM

"This type of thing is one of the things we researchers look for when judging the quality of another's study. This study was one of the best I've seen."

I guess that's why it was rushed into the Lancet right before the election.

Posted by: Mark Bahner at November 18, 2004 04:02 PM

There is one other problem, which is plainly stated by the authors.

The authors collected all their data, and then, after the event, decided to exclude part of it. This is post-hoc data manipulation, and it is an abuse, because it is very easy to choose the data manipulation that gives you the result you want.

J

Posted by: James Brown at November 29, 2004 03:35 PM