Polls and Other Lies

Democrats have been absolutely soiling themselves over a New York Times / Siena College poll released this past weekend.

Granted, a front-page headline reading “Trump Leads Biden in 5 Key Battleground States” – in a newspaper that Democrats intrinsically trust – certainly provides sufficient stimulus for an abrupt outbreak of explosive diarrhea.

So okay.  That happened.  But while Democrats are cleaning themselves up, perhaps they should consider some background before they overdose on Imodium.  In particular, I suggest that folks consider both the methodology of the poll and the timing of the poll.

Methodology

I’ll start by giving some credit to these pollsters for at least recognizing that national polls are useless.  Swing state polls can provide the only interesting data since we can be assured that the Electoral College votes in something north of 80% of the states are preordained and victory margins in those states are less than meaningless.  The states that the NYT polled – Arizona, Nevada, Georgia, Michigan, Wisconsin, and Pennsylvania – are a decent set.  At this point, I’d have added Virginia, New Hampshire, and North Carolina to the potential swing-state list, but that’s a nit.  The poll claims to have had 600 respondents in each state, attempting to cover various demographics (age, race, income, education, & party).  That’s all fine.

Unfortunately, that’s all the credit I’m going to give.

First, it took some digging to derive their methodology.  I’ll begrudgingly admit that they did at least publish it (unlike many other pollsters), but they certainly didn’t highlight the potential issues.  Raise your hand if you even thought to look beyond the headline into the methodology details.  Yeah, thought so.  Only data geeks would do that.  And one of them happens to have a blog.

The most glaring problem is that the poll intentionally over-sampled Republicans!

In an apparent attempt to avoid underestimating Republican support (as this poll did in 2016), the pollsters decided to over-sample Republican voters and then statistically adjust the results.  That approach “could” work with a large dataset and a low oversampling rate.  This sample, however, isn’t nearly large enough to accurately reflect reality with statistical weights.  And the over-sampling rate was way too high.

Only 20% of respondents self-identified as liberal or somewhat liberal while 36% self-identified as conservative or somewhat conservative.  No amount of mathematical magic in a 600-person poll can properly adjust the demographic coverage when one ideology has almost double the representation.
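
For the data geeks still reading, here's a back-of-the-envelope sketch of why weighting a small sample has a cost.  It's in Python and is emphatically not the pollsters' actual method: it uses the textbook margin-of-error formula for a simple random sample plus Kish's approximation for effective sample size.  The 600 respondents and the 20% / 36% ideology split come from the poll as described above; the "other" bucket is simply the remainder, and the target shares are numbers I made up purely for illustration.

```python
import math


def margin_of_error(n, p=0.5, z=1.96):
    """Approximate 95% margin of error for a proportion p, simple random sample of size n."""
    return z * math.sqrt(p * (1 - p) / n)


def kish_effective_n(weights):
    """Kish's approximation of effective sample size: (sum of w)^2 / (sum of w^2)."""
    total = sum(weights)
    return total * total / sum(w * w for w in weights)


n = 600  # respondents per state, as stated in the post

# Raw sample mix: 20% and 36% are from the post; "other" is just the remainder.
sample_share = {"liberal": 0.20, "conservative": 0.36, "other": 0.44}

# HYPOTHETICAL target mix, invented for illustration only.
# The pollsters' real weighting targets are not given in the post.
target_share = {"liberal": 0.28, "conservative": 0.30, "other": 0.42}

# Give every respondent in a group the weight target / sample for that group.
weights = []
for group, share in sample_share.items():
    count = round(n * share)
    weights.extend([target_share[group] / share] * count)

n_eff = kish_effective_n(weights)

print(f"Unweighted: n = {n}, margin of error = {margin_of_error(n):.1%}")
print(f"Weighted:   effective n = {n_eff:.0f}, margin of error = {margin_of_error(n_eff):.1%}")
```

Swap in whatever target mix you think reflects a swing state's electorate.  The further the targets sit from the raw sample, the smaller the effective sample gets and the wider the margin of error becomes.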

Swing states are called swing states for a reason:  The electorate in each is divided between Democrats and Republicans with a significant number of independents. In any given election, either party has a chance to prevail.

Even though this same poll over-corrected in both 2020 and 2022, underestimating Democratic support, the pollsters made a conscious decision to keep making the same mistake for 2024.  Go figure.

Furthermore, the poll’s definition of “likely voter” seems rather suspect.  The model uses a proprietary turnout-probability formula modified by a weighted version of self-reported voter intentions.  Wow.  Methinks the complexity of that math far exceeds what the minimal, statistically-adjusted data can support.

A valid poll today shouldn’t even try to weight the sample.  It should focus instead on identifying the most representative voter sample possible and stay far, far away from math tricks.

At the VERY least, it is inexcusable that these facts were not noted up front in the New York Times’ coverage of the poll.

Timing

Even if we make the massively questionable assumption that the NYT poll provides an accurate snapshot, we still have another huge issue:

The poll was taken over a year out from the 2024 elections!

That’s an eternity in today’s political environment.  Here are just a few related observations:

  • The general election campaigns haven’t really started, and Biden’s re-election messaging is still TBD.  He’ll hopefully focus again on sanity and an improved economy.  He also needs to remind the electorate that his opponent is only four years younger… and is certifiable.
  • As the campaigns progress, Biden will be Biden and will continue to be gaffe-prone.  He’s been that way for decades.  He’ll warm up when necessary but, more importantly, his Democratic surrogates will debate circles around what’s left of media-savvy Republicans with a multi-digit IQ.
  • Biden’s opponent might be in jail.  Or at least under house arrest.  Seriously, I have to believe that any felony conviction will have an impact on the polls.  Also, a continued emphasis on revenge as a campaign message should wear very thin for those not already in the cult.
  • We don’t yet know who will mount serious third-party campaigns or what their impact will be in the swing states.
  • Younger voters simply don’t pay attention this early in election cycles.  Their impact will show up much later in the polls.
  • Shit happens.  We have a couple of ongoing wars, an unpredictable economy, and a Congress that can’t handle the basics of governance.  There’s even a non-zero probability that the party nominees won’t be who we think they’ll be.  Oh, and locusts.  Swarms of locusts.

Bottom Line

The NYT poll mostly tells us that the 2024 election will likely be tight.  We already knew that.

Should Democrats be concerned?  Absolutely.  It would be much better to have such a commanding lead in the polls that timing & methodology problems are irrelevant.

Should Democrats panic?  Absolutely not.  This election has the highest stakes of any in my lifetime and some moderate levels of anxiety are certainly healthy.  However, there’s no current need for either Imodium OR Prozac.  While Chicken Little’s genes are in the Democratic Party’s DNA, we all need to just chill.  If necessary, we can always panic later.