On promiscuity, logical impossibilities, statistical ignorance and the blogosphere


This is Gina Kolata, writing in the NYT:

One survey, recently reported by the federal government, concluded that men had a median of seven female sex partners. Women had a median of four male sex partners. Another study, by British researchers, stated that men had 12.7 heterosexual partners in their lifetimes and women had 6.5.

But there is just one problem, mathematicians say. It is logically impossible for heterosexual men to have more partners on average than heterosexual women. Those survey results cannot be correct.


Notice that the federal survey refers to medians, so there really isn't any logical impossibility there. The British study, on the other hand, must be referring to means - partner counts are whole numbers, so their median can only be a whole number or a half, and 12.7 is neither (unless it is an average of medians over a number of years or something equally weird). With means, a logical impossibility does indeed present itself, provided that females define 'sexual partners' the same way as males, that the study is based on a truly random sample, and that males don't travel abroad disproportionately more than females do.
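
The counting argument behind the means is easy to check on a toy example. The sketch below uses an entirely made-up population (five men, five women, nine partnerships, nobody with partners outside the group); none of these numbers come from either survey. Each partnership adds one to some man's count and one to some woman's count, so the two totals must be equal, and with equal group sizes so must the means. Nothing forces the medians to agree.

```python
from statistics import mean, median

# Hypothetical closed population: equal numbers of men and women,
# no partners outside the group. All names and data are invented.
men = ["m1", "m2", "m3", "m4", "m5"]
women = ["w1", "w2", "w3", "w4", "w5"]

# Each (man, woman) pair is one heterosexual partnership.
partnerships = {
    ("m1", "w1"), ("m2", "w1"), ("m3", "w1"), ("m4", "w1"), ("m5", "w1"),
    ("m1", "w2"), ("m2", "w2"), ("m3", "w2"),
    ("m1", "w3"),
}

def partner_counts(people, index):
    """Count partnerships per person; index 0 = men, 1 = women."""
    counts = {p: 0 for p in people}
    for pair in partnerships:
        counts[pair[index]] += 1
    return [counts[p] for p in people]

men_counts = partner_counts(men, 0)      # [3, 2, 2, 1, 1]
women_counts = partner_counts(women, 1)  # [5, 3, 1, 0, 0]

# Every partnership is counted once on each side, so the totals match.
print("totals: ", sum(men_counts), sum(women_counts))
print("means:  ", mean(men_counts), mean(women_counts))
print("medians:", median(men_counts), median(women_counts))
```

In this invented example both means come out at 1.8, while the men's median is 2 and the women's is 1. If the two groups were of different sizes the means would differ too, in proportion to the sizes, which is why the equal-population assumption matters.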

This is not to say that there is any excuse for the author reporting the federal survey and then going on to say that these numbers cannot be an accurate representation of reality. At the same time, the British study makes it clear that men on average over-report (or women under-report) the number of sexual partners they have, so the evidence supports the author's thesis. The federal survey, for its part, is simply not informative - but it is definitely not evidence against. (That is, unless you make some simplifying assumptions regarding the distribution and utilise the information from the British survey as a prior etc., but let's not even go there.)

What I found of real interest in all of this wasn't really Gina Kolata confusing means and medians - I can easily write 10 posts a day documenting abuse of statistics in the press. The really interesting bit is the blogosphere's reaction:

Crooked Timber, my original source, reports both results and goes on to quote Andrew Gelman: 'Jeff's response: MEDIANS??!! Indeed, there's no reason the two distributions should have the same median.' No mention of the fact the second result could not be referring to medians, and that it supports Gina Kolata's thesis.

The very, very, very clever Andrew Gelman discusses the median, and then goes on to say 'Finally, it's amusing that the Brits report more sex partners than Americans, contrary to stereotypes.' As I mentioned above, and as a reader of Andrew's points out in the comments, the difference is between the American median and the British mean - not a terribly meaningful comparison to make. Andrew later acknowledged this in the comments - but I will have to put this down as the first time I have caught him off guard. I would bet good money that he did not spend more than 10 seconds reading the article.

I'll lower that estimated time to 5 seconds for Ezra Klein, who, even after attracting a flurry of comments pointing out that the results refer to the median rather than the mean (including one from Robert Waldman), defended himself by saying this:

My understanding has always been that these are mean numbers, as median numbers tend to specifically be reported that way. But this would be worth finding out.

To find out, of course, one would simply have to read the first three paragraphs of the original NYT article.

And last but by no means least, Brad De Long is taking the piss - but making sure he only quotes the irrelevant American results:

Ouch. Our own David Gale from the tenth floor is made to look ridiculous by Gina Kolata--you see, she didn't tell him that the survey didn't ask about means--about averages--but about medians. Which means that she doesn't know the difference between means and medians. Which is a very bad thing for a science reporter.

Brad spotted the second study, but he is reacting by deliberately posting only the juicy bits. Quoting the British results would not alter the conclusion that the NYT columnist really messed up here, so the issue is discreetly swept under the carpet (not that I find this an entirely wrong thing to do. I may be in a philosophical mood here, but there is an appeal to pithy and to-the-point posts on silly subjects such as this one).

By the way, my own first reaction was to admonish the use of the median in the NYT article - but after seeing the attention this was attracting from fellow bloggers I thought the post was worth expanding.

And as I think I should, here is Wikipedia on the mean and on the median.

1 comment:

  1. Hibernien Says:

    I think the point has been made elsewhere that different adult populations could lead to differences in the means. The smaller adult male population could lead to a higher mean for men, although I doubt to the degree reported.

    Even more pedantically, different rates of partner accumulation could explain mean differences further. If men accumulate partners more slowly (it's a hunch, but I'd say they do) then a random sample of, say, 30-year-old men and women won't give you the same mean.

    However, this is likely to bias the figures in the opposite direction to the one reported - it would mean (no pun intended) that women would report more partners!