The Numbers
A Run at the Latest Data from ABC's Poobah of Polling, Gary Langer
Gary Langer is director of polling at ABC News, where he's covered the beat of public opinion for more than 15 years – conducting and analyzing ABC News polls, evaluating data from other sources and setting the news division's standards for poll reporting. He's the first and only pollster to win a News Emmy, for his second national survey of public opinion in Iraq.
So Who the Heck's Ahead?
December 30, 2007 11:37 AM
Romney's ahead. Huckabee's ahead. Romney… Huckabee… Clinton, no, it's Edwards. Wait – Obama. No, it's…
This is where fascination with the horse race, particularly in a low-turnout caucus, will get you: tied up in knots. We want a single number and simple characterization. It doesn’t exist. What we have are different polls done different ways, many of them overanalyzed to make something out of very little, in fluid and close races.
Six Iowa polls of widely varying quality have been completed and released in the last four days. One has Romney ahead of Huckabee; two have Huckabee ahead or slightly ahead of Romney. Then there's a Romney +4, a Huckabee +2 and a Huckabee +1, all within sampling error. These polls put Huckabee's support anywhere from 23 to 36 percent; Romney's, in a closer band, 27 to 32 percent.
In the Democratic race, three of these polls have Clinton ahead or slightly ahead; the others show various flavors of dead heats. Clinton's support ranges from 23 to 31 percent, depending on the poll; Obama's, 22 to 30; Edwards', 24 to 29.
Part of all this can be people moving as they make final decisions. But it's at least as likely to reflect the vagaries of pollsters' efforts to identify and interview likely voters, and quantify their preferences, during a holiday week, in such a low-participation contest. It's wise to take into account the quality of the effort (we've checked them all, and internally we advise which of these polls are best done, which so-so and which don't pass muster methodologically). But given all the variables, even good (and so-so) ones differ.
Good polls have set the table: We know the leading candidates, and we know the issues, concerns and characteristics that animate the contest. Late-stage polling can be a warning sign of anything transformational. But beyond that, for the sake of your own sanity, try to avoid getting sucked into the horse-race clutter. Given available data, these contests are best simply characterized as close.
(For more, see also my Outlook piece with Jon Cohen in today's Washington Post.)
Teens and Steroids: Keep an Eye on the Stats
December 18, 2007 11:00 AM
Data are handy for making a point – so much so that it’s awfully easy to push them harder than justified. That just might have been the case in a couple of paragraphs in last week’s Mitchell Report on steroids in baseball, focusing on the drug’s use by teenagers.
“Some estimates appear to show a recent decline in steroid use by high school students; they range from 3 to 6 percent,” the report said. “But even the lower figure means that hundreds of thousands of high-school aged young people are still illegally using steroids.”
The prevalence data we see are somewhat different; the report looks to have misconstrued what's out there, and missed some updates from 2005 and 2007 alike. Actual current-use incidence is lower - and the "hundreds of thousands" may not be quite that.
In the more current of the two sources the Mitchell Report footnoted, use of steroids in the past year was reported at 2.7 percent among 12th-grade boys in 2006; that fits the lower range it gave. But the report says “students,” not just boys, and not just 12th graders. Annual use of steroids was much lower among girls (0.7 percent in the 12th grade); the total for all 12th-graders was 1.8 percent. It was lower as well among younger, 10th-grade high-schoolers, 1.2 percent. And recent use – in the past month, rather than in the past year – was lower still.
That’s from the 2006 “Monitoring the Future” survey by the University of Michigan's Institute for Social Research, for the National Institute on Drug Abuse. As it happens, just last week (apparently after the Mitchell Report was completed, though before its release), MTF released its 2007 data. Continuing a trend over the past several years, it found 2007 annual steroid use down to 1.4 percent among 12th graders, and 1.1 percent among 10th graders.
Since its peak years (from 1999 to 2002, depending on age), “the annual prevalence rate has dropped by more than half among the 8th- and 10th-grade males… and by 40 percent among the 12th-grader males,” Monitoring the Future reported. It also reported an increase in the number of 12th graders who see “great risk” in trying anabolic steroids and “a sharp drop in 2005 in the perceived availability of these drugs, very likely due to the Anabolic Control Act of 2004.”
Putting a sharper point on these findings, the MTF summary quoted Lloyd Johnston, its principal investigator: “While a number of states are considering implementing expensive programs to test student athletes for anabolic steroid use, the problem has been diminishing sharply,” he said. “It appears that supply control efforts, in combination with educational efforts, are having the intended effects.”
That’s a somewhat different tone from the Mitchell Report’s, which repeated that “hundreds of thousands of our children are using” steroids, adding, “every American, not just baseball fans, ought to be shocked into action by that disturbing truth.” If action means what Johnston calls “implementing expensive testing programs,” there seems some room for debate.
Indeed, comparing his steroid-use data with the characterizations in Mitchell’s report, “These are pretty good declines, and I’m not sure he’s taking them into account,” Johnston told me. “The problem has gotten substantially better.”
The Mitchell Report cited another source for steroid use data among teens, the Centers for Disease Control’s “National Youth Risk Behavior Survey.” Mitchell’s report footnotes the CDC data from 2003, which put lifetime prevalence at 6.1 percent. But, inexplicably, that seems to miss more recent CDC data, from 2005, which put it at a lower 4 percent. And the CDC measures lifetime use – ever in your lifetime – as opposed to MTF's measurements of annual use and 30-day use (which, as noted, is lowest of all). The Mitchell Report employed the active verb “using.” That best describes 30-day use; at a slight reach it’s OK for annual use, but it’s not an accurate depiction of lifetime use.
As an aside, there also are differences in how the CDC and MTF studies ask about steroids that could contribute to different results. The MTF question looks preferable to us, since it defines steroids much more clearly.
So how about those “hundreds of thousands” of high-school-aged steroid users? The Census Bureau estimates there are 16,564,000 high school students in the United States (a generous estimate, since it includes all 9th graders). If we average the MTF 2007 “annual use” figures for 10th- and 12th-graders, we get an annual prevalence estimate of 1.25 percent, or 207,050 – barely there. Using 30-day use, it’s 0.75 percent, or 124,230 kids - plenty too many, but not hundreds of thousands.
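The arithmetic behind those head counts is easy to check. Here is a minimal sketch in Python that uses only the figures quoted above; the variable and function names are illustrative, and the simple average of the two grades follows what the post does rather than any weighting by grade size.

```python
# Census Bureau estimate of U.S. high school students, as cited in the post.
HIGH_SCHOOL_STUDENTS = 16_564_000

# MTF 2007 annual-use rates (percent) for 10th and 12th graders, from the post.
annual_10th, annual_12th = 1.1, 1.4
annual_rate = (annual_10th + annual_12th) / 2  # simple average: 1.25 percent

# 30-day rate (percent) as quoted in the post; its grade-level inputs are not given.
thirty_day_rate = 0.75


def estimated_users(rate_percent: float, population: int = HIGH_SCHOOL_STUDENTS) -> int:
    """Convert a prevalence rate in percent into a head count."""
    return round(population * rate_percent / 100)


annual_users = estimated_users(annual_rate)
thirty_day_users = estimated_users(thirty_day_rate)
print(annual_users, thirty_day_users)
```

Swapping in a different population base, for instance one that excludes 9th graders, only requires passing another `population` value.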
Clearly any illegal, unprescribed use of steroids is wrong, and – particularly in the case of teenagers – cause for alarm. But the subject surely is worthy of sticking with a careful reading of the best and most recent data. In this, the Mitchell Report might keep in mind the truism of baseball itself: There’s always someone watching your stats.
Mind the Gap
December 12, 2007 1:48 PM
Three good national polls this week told similar stories about the presidential race – but with some different numbers. We’re sure to see it again, so it’s worth sorting out.
The most notable differences were in the numbers that get the most attention – the horse race. Part of the reason is in the way we evaluate these numbers; another part, in how they’re produced. And a third thought is about the way we tend to (over-) focus on them.
The Republican numbers are easier to evaluate. A CNN poll had a 2-point race between Rudy Giuliani and Mike Huckabee; a New York Times/CBS poll, a 1-point race; the ABC News/Washington Post poll, a 6-point Giuliani lead (a “lead,” that is, at an 88 percent confidence level – see previous item on that subject).
But focusing on the gap between candidates exaggerates small differences in polls, and misses the real aim of the horse race question – to measure each candidate’s level of support, not the space between them. (A poll with Giuliani at 5 percent and Huckabee at 4 percent is not the same as a poll with them at 22 and 21 percent respectively, even though the gap matches.)
Consider instead the candidates’ actual support levels in these polls: Giuliani, 24, 22 and 25 percent, respectively; Huckabee, 22, 21 and 19. All are well within tolerances for their sample sizes (377, 266 and 292) – and indeed pretty similar, especially given their other differences. These polls were done on the same days, but CNN’s was among registered voters, while the CBS/NYT and ABC/Post polls were among “likely voters,” which each polling outfit defines differently. And the number of “undecideds” – in our view a function of polling technique rather than actual indecision – was 4 percent in the ABC/Post poll, 6 percent in CNN’s, but 17 percent in CBS/NYT’s.
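One rough way to test the "within tolerances" claim is to compare each candidate's share across every pair of polls and ask whether the difference exceeds what sampling error alone could produce. The sketch below does that in Python with the shares and sample sizes quoted above. It assumes simple random sampling and independent polls, and it ignores the differences in populations (registered versus likely voters) just mentioned, so treat it as an illustration rather than the analysis behind the post.

```python
import math
from itertools import combinations

Z95 = 1.96  # normal critical value for 95 percent confidence

# (poll, sample size, Giuliani share, Huckabee share), as quoted in the post.
POLLS = [
    ("CNN", 377, 0.24, 0.22),
    ("CBS/NYT", 266, 0.22, 0.21),
    ("ABC/Post", 292, 0.25, 0.19),
]


def share_variance(p: float, n: int) -> float:
    """Sampling variance of one candidate's share under simple random sampling."""
    return p * (1 - p) / n


def differs(p_a: float, n_a: int, p_b: float, n_b: int) -> bool:
    """True if two independent polls' estimates differ at 95 percent confidence."""
    se = math.sqrt(share_variance(p_a, n_a) + share_variance(p_b, n_b))
    return abs(p_a - p_b) > Z95 * se


significant_pairs = []
for (name_a, n_a, g_a, h_a), (name_b, n_b, g_b, h_b) in combinations(POLLS, 2):
    if differs(g_a, n_a, g_b, n_b):
        significant_pairs.append(("Giuliani", name_a, name_b))
    if differs(h_a, n_a, h_b, n_b):
        significant_pairs.append(("Huckabee", name_a, name_b))

print(significant_pairs)
```

With these inputs the list comes back empty: no pair of polls differs significantly on either candidate's support.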
Most important, moreover, is their fundamental message – lower support for Giuliani, higher for Huckabee, with a wealth of data in each of these surveys to help us understand why that is. On that central point, these surveys are in accord.
Numbers on the Democratic race are tougher to parse out. Looking at the gaps makes the differences look garish – a 10-point lead for Hillary Clinton over Barack Obama in the CNN poll, 17 points in CBS/NYT’s, 30 points in ABC/Post’s. Better again to look instead at the candidates’ support levels. John Edwards was at 14, 11 and 10 percent respectively – similar. Obama was at 30, 27 and 23 percent – a significant difference between the high and low estimates. And in the biggest difference, Clinton was at 40, 44 and 53 percent respectively.
There again are differences in populations and in levels of undecideds (in this race, 9 percent undecided in the CBS/NYT poll, vs. 3 percent in ABC/Post’s). Sample management (e.g., number of interviews per night) also can, at times, create differences in results. And differences among groups are another possible cause. With an African-American prominent in the race, ABC/Post polls consistently have been oversampling black respondents all year, in order to increase our confidence in this particular estimate (especially because blacks account for nearly a fifth of all likely Democratic primary voters). We find Clinton ahead of Obama by 52-39 percent among African-Americans in our poll. (At one point last summer, when our data differed from a Gallup poll that had Obama much closer, it looked to us like estimates of preferences among black voters were a likely cause. And we liked the higher confidence we got from our oversample.)
The Democratic race was almost precisely the same in this ABC/Post poll as in our two previous national polls, in early November and late September. CNN’s was 40-30-14 percent, compared to a 44-25-14 result in early November – that 5-point rise for Obama reaches significance. The CBS/NYT results (44-27-11) compared with 51-23-13 in a poll CBS did in mid-October; that earlier poll, however, left out the five second-tier candidates as choices, making comparisons a bit dicier (those five get a net of 7 points in the latest CBS/NYT poll, 8 points in ABC/Post data, and 12 in the CNN poll).
What to conclude? One healthy approach would be to cut back on fixation with the horse race and look at the underlying evaluations. These polls all have Clinton ahead, albeit by different margins. What they do best, in exploring the candidates’ strengths and weaknesses, and the issues motivating voters, is to help us see why.
MOE and Mojo
December 03, 2007 10:58 AM
The Des Moines Register’s new poll, released Sunday, has Barack Obama 3 points ahead of Hillary Clinton in Iowa, which it characterizes as a “lead” for Obama. Our own ABC/Post poll two weeks ago had Obama 4 points ahead – and we called it “close,” not a lead.
What gives?
The answer is that it’s all about how far you’re willing to push the envelope. To the Register, we’re probably being too conservative. To us, the Register is going overboard. But there’ll be more of this to come in the weeks ahead, so it’s worth understanding how we get here.
A poll is not laser surgery; it’s an estimate. The reliability of the estimate is (in part) a function of its sample size. This is expressed as the margin of sampling error, and it’s customarily given at the 95 percent confidence level. For a poll of 500 likely voters, which was the sample size in both the ABC/Post and the DMR polls, that’s plus or minus 4.5 points. This means a candidate would need a lead of 9 points or more for us to say with 95 percent confidence that it’s statistically significant. Neither poll comes close.
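For readers who want to see where a figure like that comes from, here is a minimal Python sketch of the textbook margin-of-error formula. It assumes simple random sampling and the most cautious case of a 50 percent share; the function name is illustrative. For 500 interviews it yields about 4.4 points, a bit under the 4.5 quoted above. The post doesn't say how its figure was derived; rounding or an allowance for survey design are possible explanations.

```python
import math


def margin_of_error(n: int, p: float = 0.5, z: float = 1.96) -> float:
    """Margin of sampling error, in percentage points, for a share p from n interviews.

    Assumes simple random sampling; p = 0.5 gives the widest (most cautious)
    margin, and z = 1.96 corresponds to the customary 95 percent confidence level.
    """
    return 100 * z * math.sqrt(p * (1 - p) / n)


moe_500 = margin_of_error(500)
print(f"n=500: +/- {moe_500:.1f} points")
```

Because the margin shrinks with the square root of the sample size, quadrupling the number of interviews only halves it.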
However, the estimate a poll gets is in fact the likeliest true value, and the likelihood decreases as we move toward the extreme ends of sampling error. In fact we can calculate the level of confidence we can have that Obama really leads in either of these polls.
The answer: With Obama +4 vs. Clinton in the ABC/Post poll, we could have said with 77 percent confidence that – all else equal – he really had a lead. In the DMR poll, with Obama +3, the confidence level is 64 percent. Apparently 64 percent confidence is good enough for the Register to call it a “lead.” It’s not for us; nor was the 77 percent probability in our last poll.
Why not? One reason is that the customary confidence level in survey research is 95 percent, not 77, or 64. Another is that these probabilities only hold if all else is equal – and it isn’t. These estimates also are subject to non-sampling error, the likeliest cause of which is their estimate of who in fact qualifies as a likely voter. In our tighter likely voter model, more closely approximating caucus turnout in 2004, we didn’t have Obama +4, we had him +2, with 28 percent to Clinton’s 26 percent. The probability of that being a statistically significant lead was only 37 percent. Thus the prudent course was to call it close, which is what we did, and what we think it is.
Now on the Republican side, with 400 interviews, we had Mitt Romney with 28 percent support, Mike Huckabee with 24 percent. Huckabee had all the mojo – he was the guy making the move, as our analysis two weeks ago made clear. The probabilities still had us call the race close.
That Huckabee mojo looks to be continuing; the Register now has him at 29 percent support, to Romney’s 24 percent. The Register, again, calls Huckabee the leader. The confidence level is 88 percent. Is that enough to call it a “lead”? It’s tempting. But before going there we’d want to see what turnout their likely voter model anticipates and what their other models (if any) show. Meanwhile, in our book, this one, too, is close.
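The post doesn't spell out how these confidence levels are computed. One common approach, sketched below in Python, treats the two candidates' shares as coming from the same sample and asks how likely a gap that size is to be real, using a two-sided normal approximation under simple random sampling. The function name is illustrative and the method is an assumption, not necessarily the one used here. Applied to the national ABC/Post numbers quoted elsewhere on this page (Giuliani 25, Huckabee 19, 292 interviews), it comes out at roughly the 88 percent cited for that poll.

```python
import math


def lead_confidence(p_lead: float, p_trail: float, n: int) -> float:
    """Two-sided confidence that the gap between two candidates in one poll is real.

    Assumes simple random sampling. Because both shares come from the same
    sample, the variance of their difference is (p1 + p2 - (p1 - p2)**2) / n.
    """
    gap = p_lead - p_trail
    se = math.sqrt((p_lead + p_trail - gap ** 2) / n)
    z = gap / se
    # For a standard normal, P(|Z| < z) = erf(z / sqrt(2)).
    return math.erf(z / math.sqrt(2))


# ABC/Post Iowa Republicans: Romney 28, Huckabee 24, 400 interviews (from the post).
iowa_gop = lead_confidence(0.28, 0.24, 400)

# ABC/Post national Republicans: Giuliani 25, Huckabee 19, 292 interviews.
national_gop = lead_confidence(0.25, 0.19, 292)

print(f"{iowa_gop:.0%} {national_gop:.0%}")
```

For the Iowa Republican numbers the same calculation gives a figure in the low 70s, which the post does not state. Like the probabilities discussed above, it holds only if all else is equal and says nothing about non-sampling error.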
As you can see, there's a bit of judgment in all this. The shorthand approach used by the AP is to say that when a candidate's numerical advantage doesn’t exceed sampling error, but is at least half of what sampling error demands, it can be called a "slight" lead. That's of course not what it is; it's really a possible lead in which we cannot be wholly confident.
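That shorthand can be written as a small decision rule. The Python sketch below is one reading of it, assuming that "what sampling error demands" means a gap of about twice the margin of error, as in the 9-point example for a 4.5-point margin. The function name and the "close" label for smaller gaps are our assumptions, as is the choice to require a gap strictly greater than the threshold before calling it a lead.

```python
def describe_gap(gap: float, moe: float) -> str:
    """Label a candidate's advantage (in points) given the poll's margin of error.

    Assumes, as the post does, that a statistically significant lead requires
    a gap of about twice the margin of sampling error.
    """
    required = 2 * moe
    if gap > required:
        return "lead"
    if gap >= required / 2:
        return "slight lead"
    return "close"  # label for smaller gaps is an assumption, not from the post


labels = [describe_gap(g, 4.5) for g in (3, 4, 5, 10)]
print(labels)
```

Under this reading, with a 4.5-point margin, gaps of 3 and 4 points both fall short of even a "slight" lead.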
All this underscores one of the fundamental points about pre-election polls: They are estimates. Even with good-quality methodology, the notion of pinpoint accuracy is a myth. And the reason we do them is not simply to try to puzzle out who's ahead – but to understand how and why the voters are coming to their choices.