June 17th, 2009

Captive Dreams

(no subject)

Wednesday Within the Octave of Bloomsday 

Speaking of the Great Books...


Joe Carter writes:

[Yester]day marks the 105th anniversary of Bloomsday, a commemoration of a day in the life of Leopold Bloom, the hero of James Joyce’s Ulysses. For at least the past fifty years, fans of the notoriously difficult novel have gathered around the world in order to drink, dress up and celebrate their status as the literary equivalent of Trekkies.

Who are these people? And why is such a monstrously bad book still praised so highly? Perhaps it can be attributed to the career inferiority complexes of English majors. They may not make as much money as their friends who got their MBAs, they reason, but at least they can claim to have read The Greatest Novel Ever Written.


No doubt, fans of Joyce will say that I’m wrong. They will say that I am failing to put in the effort required to grasp the beauty of the novel. They will argue that I am discounting the remarkable use of language and linguistic technique. They will say that I am missing the point. These people will say many things. These people are usually English professors. They don’t know any better.

Captive Dreams

(no subject)

Seven Rules for Reporting Polls and Research Results
by Steven S. Ross, February 11, 2008
Posted at: Stats.org


These are points that Ross gave to students in a Journalism school on the reporting of polls and research. 

1. In general, effects are small. So you need a lot of statistical power
That means you need large sample sizes and information on possible confounders – things that can change the results being reported if they are not taken into account. Example: The number of new cancer cases in the US is increasing. But when the aging of our expanding population is taken into account, the chance that any specific individual will get cancer is declining.

To which we might emphasize that a sample size large enough to yield a significant result in the aggregate will not have the same power regarding subsets. A stratified sample of 512 cans taken from a trailer of 32 cells may be sufficient to accept or reject the trailer at a 0.025% AQL, but it cannot be so used to accept or reject each cell: the sample per cell is only 16. In the same way, a sample of 1086 of the general population will not be a sample of 1086 wise Latina women or 1086 young adult professionals; so drawing conclusions about sub-populations is illegitimate.
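The arithmetic is quick to check. Here is a minimal sketch in Python, assuming a simple random sample and the usual normal approximation for a proportion. The function names and the 100-person subgroup are illustrative assumptions of mine, not figures from Ross's article.

```python
import math

def margin_of_error(n, p=0.5, z=1.96):
    """Approximate 95% margin of error for a proportion estimated
    from a simple random sample of size n (normal approximation)."""
    return z * math.sqrt(p * (1 - p) / n)

def per_stratum_sample(total_sample, strata):
    """Sample size per stratum when the total is split evenly."""
    return total_sample // strata

# The trailer example from the text: 512 cans spread over 32 cells.
cans_per_cell = per_stratum_sample(512, 32)

# The poll example: 1086 respondents overall, versus a hypothetical
# subgroup of 100 respondents carved out of that same sample.
full_sample_moe = margin_of_error(1086)
subgroup_moe = margin_of_error(100)

print(f"cans per cell: {cans_per_cell}")
print(f"margin of error, n=1086: +/-{full_sample_moe:.1%}")
print(f"margin of error, n=100:  +/-{subgroup_moe:.1%}")
```

Under these assumptions the full sample is good to about three percentage points, while the hypothetical subgroup is good only to about ten.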

2. You have to watch for spurious clustering
Imagine a chess or checkerboard occupying the bottom of a large cardboard box. Toss in exactly 64 grains of rice. The grains will bounce around and finally come to rest on the 64 squares of the game board. The average incidence is one grain per square. But you’re not likely to ever see that in your lifetime. Some squares will have many grains – all by chance. Likewise, some communities will report much larger-than-average incidences of certain diseases, all by chance.

To put it another way, random distributions will always generate clusters; so finding a cluster does not in itself mean much. 
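The rice-and-checkerboard experiment is easy to run without the rice. Here is a rough simulation in Python, assuming each grain lands on any of the 64 squares with equal probability and independently of the others. The function names, the seed, and the 1000 repetitions are my own choices for illustration.

```python
import random
from collections import Counter

def toss_rice(grains=64, squares=64, rng=None):
    """Drop each grain on a uniformly random square; return grains per square."""
    rng = rng or random.Random()
    counts = Counter(rng.randrange(squares) for _ in range(grains))
    return [counts.get(sq, 0) for sq in range(squares)]

def summarize(board):
    """Count empty squares, find the fullest square, and test for a perfectly even spread."""
    return {
        "empty": sum(1 for c in board if c == 0),
        "max": max(board),
        "perfectly_even": all(c == 1 for c in board),
    }

rng = random.Random(2009)  # fixed seed so the run is repeatable
boards = [toss_rice(rng=rng) for _ in range(1000)]
trials = [summarize(b) for b in boards]

avg_empty = sum(t["empty"] for t in trials) / len(trials)
any_even = any(t["perfectly_even"] for t in trials)

print(f"average number of empty squares: {avg_empty:.1f}")
print(f"fullest square seen in any toss: {max(t['max'] for t in trials)} grains")
print(f"any toss with exactly one grain per square? {any_even}")
```

With uniform tossing, the expected number of empty squares is 64 × (63/64)^64, or roughly 23, so more than a third of the board sits empty while other squares pile up grains, all by chance.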

3. Spurious studies, by definition, create news
Large, well-designed studies are very expensive, so persuasive studies of health issues are rare... and spurious studies, by definition, create “news” because results are unexpected.

Selection bias.  Ten studies are performed by ten different researchers.  Nine of them find no effect.  The tenth researcher finds an effect significant at the 10% level.  Which get published?  Better yet, which make the evening news?  That's right.  "Scientists find no link!" is not the lead story at six o'clock. 
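To put a number on the ten-researcher example: if there is truly no effect, a test at the 10% level still has a 10% chance of reporting one. Here is a small sketch in Python, assuming the ten studies are independent. The function name is mine.

```python
def prob_at_least_one_false_positive(studies, alpha):
    """Chance that at least one of several independent studies of a
    nonexistent effect comes out 'significant' at level alpha."""
    return 1 - (1 - alpha) ** studies

studies, alpha = 10, 0.10
p_any = prob_at_least_one_false_positive(studies, alpha)
expected_false_positives = studies * alpha

print(f"expected false positives among {studies} studies: {expected_false_positives:.1f}")
print(f"chance of at least one false positive: {p_any:.0%}")
```

So about two times in three, somebody among the ten will have a "finding" to send to the newsroom, even though there is nothing there to find.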

4. Be skeptical of meta-analysis
The mathematical definition of a meta-analysis is the combining of raw data from many studies to gain the statistical power of a large sample, which is then analyzed as if all the data came from one place. ...a meta-analysis is often – in fact, almost always – BS-squared.

Take the ten studies above.  Remember the 10% alpha risk?  That is the risk that you would find a significant effect when there was none.  That is, you would hear a signal that wasn't sent.  At the 10% level, you would expect one positive study out of ten, which is what we postulated.  But suppose we could claim a larger sample by lumping all ten studies into one?  Perhaps that one study will be enough to give a positive signal for the aggregate.  But we have only disguised the fact that it was a random positive.  This is how the effects of second-hand tobacco smoke were "discovered."  But the real problem with meta-analysis is that the terms and measurements were often made in different ways, with different definitions, and different methodologies.  Combining them can be like combining apples and oranges. 

5. Look for mechanisms when the results are unexpected
When you have unexpectedly high responses to seemingly low doses, the case is significantly bolstered by identifying a mechanism instead of looking only at statistical correlations or regressions....

The truism is that "correlation is not causation." That doesn't mean that it fails to prove causation; it means that It. Is. Not. Causation. Statistics can never prove a causal relation.

6. With polls, keep an eye on demographics
When it comes to polling, yes, we can take 850 in an imperfectly-drawn New Hampshire sample and split it 6 ways (young-old, rich-poor, male-female, minority-white...) and insinuate the overall statistical power of the overall sample, which isn't that great in the first place, while never once mentioning that New Hampshire's demographics and ground truths have changed a lot since the last hugely contested primary there in 2000, and that younger voters, who have only cell phones, are hard to find – and thus hard to poll. And why? Because bringing up any of this screws up the story!

There are two things wrapped up here. History may indeed make this year's sample literally incomparable to last year's, because of changes in the population, its history, the definition of terms, and so on. Think of various statistical facts about the USA that will change depending on whether or not Alaska and Hawaii are included. The other issue is that just as the operational definition of the thing being measured will affect the measurement obtained, so too will the sampling methodology. Calling people on the phone at random sounds easy; but it means you have to have a list of all phone numbers (including, nowadays, cell phones and throw-away no-name phones) and take account of people having more than one phone number. In the old days of land lines, pollsters would say that households with teenaged daughters were over-sampled because they often had a second phone line installed for the same house.
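The second-phone-line problem can be shown with a toy population. This is a sketch in Python, not any pollster's actual procedure: the household counts and answers below are invented for illustration, and the correction shown (weighting each response by one over the number of lines) is one standard way of handling unequal selection chances.

```python
def dial_every_number(households):
    """One (answer, lines) record per phone number, as if every number
    were dialed once. A two-line household therefore appears twice."""
    return [(h["yes"], h["lines"]) for h in households for _ in range(h["lines"])]

def unweighted_share(records):
    """Naive share of 'yes' answers, treating every call alike."""
    return sum(yes for yes, _ in records) / len(records)

def weighted_share(records):
    """Share of 'yes' answers with each call weighted by 1/lines, which
    undoes the extra chance of reaching multi-line households."""
    total_weight = sum(1 / lines for _, lines in records)
    return sum(yes / lines for yes, lines in records) / total_weight

# Hypothetical population: 80 one-line households (30 say yes) and
# 20 two-line households (16 say yes).
households = (
    [{"lines": 1, "yes": 1}] * 30
    + [{"lines": 1, "yes": 0}] * 50
    + [{"lines": 2, "yes": 1}] * 16
    + [{"lines": 2, "yes": 0}] * 4
)

true_share = sum(h["yes"] for h in households) / len(households)
records = dial_every_number(households)
unweighted = unweighted_share(records)
weighted = weighted_share(records)

print(f"true share of households saying yes: {true_share:.3f}")
print(f"share by phone number, unweighted:   {unweighted:.3f}")
print(f"share by phone number, weighted:     {weighted:.3f}")
```

In this made-up population the two-line households answer differently from the rest, so sampling by phone number overstates the "yes" share until the weights are applied. The correction only works, of course, if you know how many lines each household has.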

7. PR plays on laziness - your laziness
Thinking is such hard work. That's the secret of PR. Odds are, journalists will reprint the press release on the new study or poll results rather than thinking about what could go wrong.

True enough, most papers and magazines simply reprint the press releases of interest groups, like the National Association of Manufacturers, the US Government, scientific researchers, or PETA.  When skepticism is displayed, it is directed at Those We Don't Like.  The only TV news story I ever saw in which the newsreader reported on all the cautions and uncertainties in a scientific study was when the study reported a higher rate of breast cancer among women who had had abortions, thus demonstrating Thucydides' Rule. 

Two books I highly recommend for those interested in sampling are: 

Case Studies in Sample Design, by A.C. Rosander 

A Sampler on Sampling, by Bill Williams.

Captive Dreams

(no subject)

Up Jim River

A bit of a snippet, from the chapter "The Freedom of Choice." The thoughts are the harper's.

It is the nature of man to be selfish, Mother had said. (And Méarana remembered a much younger self, sitting by Bridget ban’s knee before a great fierce fire in Clanthompson Hall, while certain wounds of her mother healed.) It is a weakness passed down from our uttermost ancestors, the original sin from which all others arise. It emanates from the ancient brain stem and spreads by electrical synapses to the cortex, establishing by repetition its debilitating pattern.

The more these patterns of self-indulgence dominate, her mother had cautioned her, the less your capacity for reason. The brainstem is not in the final analysis a thoughtful companion.

But her mother rejected predestination. Whether the curse is carried in the genes, as the Calvinist prophet Dawkins had claimed, or whether it involves apples and serpents, as still older allegories run, a man can school his soul to a “second nature” and so overcome the curse. By diligent exercise, he can establish habits of thought that temper or block these signals with neural patterns of their own. With prudence, justice, moderation. And courage.

And Bridget ban had displayed to her awe-struck daughter images from the emorái machine of her very soul: the sparking footprints of thought running through her mind.