Monday, April 23, 2007

Skewed Surveys: or Lies, Damn Lies, and Statistics

MSNBC has put up a web page entitled "About our Live Votes and surveys: How 1,000 people can be more representative than 200,000"
http://www.msnbc.msn.com/id/3704453/

This is an important addition to the discussion about information literacy. It is a concise and informative article about how polls, surveys, and online votes can differ greatly in their results, even on the same topic and with the same questions. It is also gratifying to see a major media outlet not only being so circumspect about how it presents information, but also being open about it with the public. Some of the more interesting statements include:

One week in the middle of the Clinton-Lewinsky scandal, more than 200,000 people took part in an MSNBC Live Vote that asked whether President Clinton should leave office. Seventy-three percent said yes. That same week, an NBC News-Wall Street Journal poll found that only 34 percent of about 2,000 people who were surveyed thought so.


To explain the vast gap in the numbers in this and other similar cases, it is necessary to look at the difference in the two kinds of surveys.

While a poll of 100 people will be more accurate than a poll of 10, studies have shown that accuracy begins to improve less at about 500 people and increases only a minor amount beyond 1,000 people.
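
To put rough numbers on that claim about sample size, here is a minimal Python sketch of the standard margin-of-error formula for a simple random sample at 95% confidence. It is my own illustration, not something from the MSNBC article; the function name margin_of_error and the worst-case assumption of a 50/50 split are mine.

```python
import math

def margin_of_error(n, p=0.5, z=1.96):
    """Approximate 95% margin of error for a simple random sample.

    n: number of respondents
    p: assumed true proportion (0.5 is the worst case, an assumption here)
    z: z-score for the confidence level (1.96 for about 95%)
    """
    return z * math.sqrt(p * (1 - p) / n)

# Compare a few sample sizes, including the ones mentioned in the article.
for n in (10, 100, 500, 1000, 2000, 200000):
    points = margin_of_error(n) * 100
    print(f"{n:>7} respondents: +/- {points:.1f} percentage points")
```

The margin shrinks from roughly 31 points at 10 respondents to about 3 points at 1,000, which is why pollsters rarely go much larger. The formula only applies when respondents are randomly selected, so it says nothing about the accuracy of a 200,000-person online vote.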

Random selection of those polled is necessary to ensure a broad representation of the population at large.

To begin with, the people who respond choose to do so — they are not randomly selected and asked to participate, but instead make the choice to read a story about a certain topic and then vote on a related question. There is thus no guarantee that the votes would reflect anything close to a statistical sample...
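
The self-selection problem described above can be sketched with a small simulation. This is only an illustration under made-up assumptions: I assume 34 percent of a hypothetical population answers "yes", and that "yes" people are five times as likely to bother voting online (10 percent versus 2 percent). Those response rates are invented for the example and are not estimates of what actually happened in the MSNBC Live Vote.

```python
import random

def simulate(population_size=200_000, true_yes=0.34,
             respond_if_yes=0.10, respond_if_no=0.02,
             sample_size=1_000, seed=0):
    """Compare a random sample with a self-selected online vote.

    All rates are hypothetical values chosen for illustration.
    """
    rng = random.Random(seed)

    # True opinions: each person is True ("yes") with probability true_yes.
    population = [rng.random() < true_yes for _ in range(population_size)]

    # Scientific poll: every person has the same chance of being asked.
    random_sample = rng.sample(population, sample_size)
    random_share = sum(random_sample) / sample_size

    # Online vote: people decide for themselves whether to take part,
    # and "yes" people are assumed to be more motivated to respond.
    volunteers = [
        yes for yes in population
        if rng.random() < (respond_if_yes if yes else respond_if_no)
    ]
    volunteer_share = sum(volunteers) / len(volunteers)

    return random_share, volunteer_share, len(volunteers)

random_share, volunteer_share, n_volunteers = simulate()
print(f"Random sample of 1,000: {random_share:.0%} yes")
print(f"Self-selected vote ({n_volunteers:,} voters): {volunteer_share:.0%} yes")
```

Under these assumptions the random sample lands near the true 34 percent, while the much larger self-selected vote lands near 72 percent. The extra voters do not correct the skew, because the bias comes from who chooses to respond and not from how many respond.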

This is a good, brief explanation of statistical sampling and reliability that I think would be useful for everybody to review. It is a good reminder of things we tend to forget. To many of us these points may seem obvious; however, when reading an article, book, or web site, it is easy to just accept the statistics offered without considering the way they were collected and the context in which they are delivered. With a plethora of information providers, both ethical and less ethical, it is now more important than ever to check sources and verify information against separate and independent resources.