K5 has a [link|http://www.kuro5hin.org/story/2004/10/29/62614/814|story] about a [link|http://image.thelancet.com/extras/04art10342web.pdf|paper] (PDF, free registration required) just published in The Lancet (a UK medical journal). The estimated death toll seems impossibly large. There are several [link|http://www.chicagoboyz.net/archives/002543.html|critiques] (linked from K5) that say the methodology has major problems.
From the paper:
More than a third of reported post-attack deaths (n=53), and two thirds of violent deaths (n=52) happened in the Falluja cluster. This extreme statistical outlier has created a very broad confidence estimate around the mortality measure and is cause for concern about the precision of the overall finding. If the Falluja cluster is excluded, the post-attack mortality is 7.9 per 1000 people per year (95% CI 5.6–10.2; design effect=2.0).
I'm not a statistician, but statements like this raise red flags for me. If one "cluster" (of 33, each consisting of 30 households) can give such anomalous results when intense fighting happened in several cities, doesn't it seem reasonable that a much larger number of clusters is needed to be confident of the extrapolation? For example, if one had a group of 33 Americans and a third of them were Olympic marathon runners, one wouldn't assume that Americans on average had the physiology of a long-distance runner. :-( Instead one would conclude that a larger, more random sample of people was needed. The methodology they used doesn't seem likely to me to give an accurate estimate of the total death count. Extrapolating from small numbers to big numbers only works if the small sample is representative. (This problem comes up frequently in "cancer cluster" statistics as well.) Simply excluding the Falluja numbers doesn't give me any more confidence in the methodology.
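To make the small-sample worry concrete, here is a rough Python sketch of the general idea. Every number in it is made up for illustration (a hypothetical country of 1000 clusters, 30 of them "hotspots" with many deaths) and none of it comes from the paper; it also uses simple random sampling of clusters, which is not necessarily what the authors did. It repeatedly draws a sample of clusters, scales the sample average up to the whole population, and looks at how widely the extrapolated totals vary.

```python
import random
import statistics

def simulate_estimates(n_clusters_total=1000, n_hotspots=30, base_deaths=3,
                       hotspot_deaths=50, sample_size=33, trials=2000, seed=0):
    """Return the true total and a list of extrapolated totals, one per trial.

    All parameters are hypothetical and chosen only for illustration.
    """
    rng = random.Random(seed)
    # Made-up population: most clusters have few deaths, a few have many.
    population = ([hotspot_deaths] * n_hotspots
                  + [base_deaths] * (n_clusters_total - n_hotspots))
    true_total = sum(population)
    estimates = []
    for _ in range(trials):
        sample = rng.sample(population, sample_size)
        # Scale the sample mean up to the whole population.
        estimates.append(statistics.mean(sample) * n_clusters_total)
    return true_total, estimates

def spread(estimates):
    """Approximate range covering the middle 95% of the simulated estimates."""
    ordered = sorted(estimates)
    lo = ordered[int(0.025 * len(ordered))]
    hi = ordered[int(0.975 * len(ordered)) - 1]
    return lo, hi

true_total, small = simulate_estimates(sample_size=33)
_, large = simulate_estimates(sample_size=330)

lo_s, hi_s = spread(small)
lo_l, hi_l = spread(large)
print("true total:", true_total)
print("33 clusters, middle 95% of estimates:", lo_s, "to", hi_s)
print("330 clusters, middle 95% of estimates:", lo_l, "to", hi_l)
```

In this toy setup the estimates from 33 clusters swing widely depending on whether the sample happens to include zero, one, or several hotspots, while ten times as many clusters gives a much tighter range. That is only a sketch of the intuition, not a reanalysis of the study.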
The authors say they did not visit many cities, and did not visit some areas of cities that had heavy fighting. They offer this as part of their case that the sample was random.
They recorded 142 deaths in the post-invasion period. Other groups have statistics with much higher counts (e.g. [link|http://www.iraqbodycount.net|http://www.iraqbodycount.net] puts the number of deaths between 14181 and 16312) that aren't based on extrapolation from a small sample. My gut feeling is that while the authors may have tried to get a random sample (using random GPS coordinates), their sample was too small to extrapolate the death toll to the entire country.
Perhaps this story will result in more work to get better statistics. Until then, I think the IraqBodyCount numbers are likely to be more accurate.
Cheers,
Scott.