PA & Early Voting

I’ve previously voiced my concerns that questionable polls were being treated as election results and that no one in the media would be slicing and dicing early voting data to produce a dynamic predictive model of the 2024 elections.  I was at least partially wrong.

Polls are still being treated as gospel despite some laughable methodologies.  Some pollsters have even corrected for past underestimations of Republican turnout by simply sampling more Republicans in their 2024 polls.  (And, yes, that’s as idiotic as you think it is.)  While I’ll at least give them some credit for admitting to a bogus shortcut, I’ll also note that few people actually read the fine print.  Also, with margins of error of 3.5% and higher on each candidate’s share, and a larger margin still on the gap between the candidates, most of these polls could be technically accurate even if there’s a blowout.
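
To put rough numbers on the margin-of-error point, here is a minimal sketch in Python.  It assumes a simple random sample, a 95% confidence level, and a hypothetical poll showing a 48%-48% tie; none of those assumptions come from any specific pollster, and real polls use weighting that makes the true uncertainty larger.

```python
import math

Z_95 = 1.96  # z-score for a 95% confidence level (assumption)

def implied_sample_size(moe, p=0.5, z=Z_95):
    """Sample size implied by a reported margin of error on a single share."""
    return z ** 2 * p * (1 - p) / moe ** 2

def lead_margin_of_error(p1, p2, n, z=Z_95):
    """Margin of error on the lead (p1 - p2) when both shares come from one sample."""
    variance = (p1 + p2 - (p1 - p2) ** 2) / n
    return z * math.sqrt(variance)

# A reported 3.5% margin of error implies a sample of this size.
n = implied_sample_size(0.035)

# Hypothetical poll: both candidates at 48%.
lead_moe = lead_margin_of_error(0.48, 0.48, n)

print(f"implied sample size: {round(n)}")
print(f"margin of error on the lead: {lead_moe * 100:.1f} points")
```

Under those assumptions the implied sample is under 800 respondents, and the margin of error on the lead comes out just under 7 points, roughly double the headline figure.  A poll showing a tie would not be contradicted by a comfortable win for either side.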

As for early voting data, however, it seems that everyone with an internet connection is looking at it.  Unfortunately, almost everyone is looking at it wrong. Some people are pushing partisan agendas. Some people are only interested in creating click-bait. Some people are just idiots. And some people are making rookie data analysis mistakes.

There are still news organizations that are reporting aggregated early voting data on a national scale as if that’s indicative of anything.  It’s not.  The Presidential election isn’t national.  It’s 51 separate elections.  Worse, a surprising number of usually sane media outlets are instead aggregating early voting data from the seven swing states. While that may seem valid on the surface, it’s just the same mistake as a national aggregation on a smaller scale.  These analyses are lazy and equally useless.
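
A toy illustration of why aggregation misleads, sketched in Python.  The three states and every number below are made up purely for illustration; they are not real early voting data, and party registration is not the same thing as vote choice.

```python
# Hypothetical early-vote tallies by party registration.
# These are made-up numbers for illustration only, not real data.
early_votes = {
    "State A": {"D": 900_000, "R": 500_000},
    "State B": {"D": 480_000, "R": 520_000},
    "State C": {"D": 290_000, "R": 310_000},
}

def aggregate_leader(tallies):
    """Party with the most early votes once all states are lumped together."""
    totals = {}
    for state_counts in tallies.values():
        for party, count in state_counts.items():
            totals[party] = totals.get(party, 0) + count
    return max(totals, key=totals.get)

def state_leaders(tallies):
    """Party with the most early votes in each state, taken separately."""
    return {state: max(counts, key=counts.get) for state, counts in tallies.items()}

print("aggregate leader:", aggregate_leader(early_votes))
print("state-by-state leaders:", state_leaders(early_votes))
```

In this made-up case the aggregate shows a Democratic lead because one state runs up a big margin, while Republicans lead in two of the three states.  Since each state awards its electors separately, the lumped-together total says nothing about who is ahead where it counts.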

There are analysts who are separately looking at early voting data for each swing state, but many reuse the same analytical model when each state absolutely requires its own distinctive and complex model.  Each state differs in the timeline and rules for early voting, how early voting is conducted (in-person vs. mail-in vs. drop box), what else is on the state’s ballot this cycle, what early voter data is available, etc.

Some states report more granular information than others.  For example, only 9 states report age data for early voters and only 7 report gender.  A few data sources try to model early voters to derive any missing demographic data and, although such models might be accurate, they are obviously not as conclusive as reported data.

I did attempt to analyze the early voting data myself.  I started with Pennsylvania, since that’s the closest thing to a must-win state for either party.  Here are just a few data points that must be considered in any PA model:

  • In PA, party registration and age are reported data points; gender, education, and race are modeled data points.
  • PA allows any registered voter to request a mail-in ballot online, via mail, or in-person.  While the mail-in / drop-box rules are quite complex, it’s a similar process to many other states.
  • PA also has something they call “on-demand mail ballot voting”.  Before 10/29, PA voters had the option to apply for a “mail” ballot in-person at their county office and then immediately obtain, complete, and submit that ballot, all in the same visit.  This clumsy approach to early voting created lines of up to three hours in several PA counties. Were some voters discouraged by the lines?  Probably.  Could even a small number of discouraged voters make a difference?  Probably.  Were such voters equally split between Democratic and Republican voters?  Unlikely.  Were they mostly Democratic or Republican voters?  No one knows.
  • While there are no statewide ballot measures on the PA ballot, voters will also choose between a sitting Democratic U.S. Senator and a well-funded Republican challenger.  The PA ballot also includes all of PA’s U.S. House seats, the PA Attorney General, and state legislative races.  Many of these contests have their own local drama and even a small number of reverse-coattail voters could well make a difference.
  • While Jill Stein’s Green Party candidacy won’t garner many votes in PA, she could impact a close election.
  • Prior early voting data is required to draw comparisons, but every available baseline has a problem.  The data available for elections prior to 2020 is too limited to be useful.  The 2020 Presidential election was conducted in the midst of the pandemic, when early voting was broadly encouraged by Democrats and discouraged by Republicans.  The 2022 election cycle included no national candidate, limiting the usefulness of that data.  In any case, it’s unclear whether past early voting habits will be repeated in 2024.
  • PA state legislative efforts and court rulings at both federal and state levels have substantially changed the voting process since 2020.
  • Statewide population and demographic changes in PA are significant.

That’s a whole lot to consider.  Much of my predictive model quickly became assumptions based on other assumptions based on incomplete data, forcing me to abandon my amateur attempt.  In short, the only valid conclusion I could draw from PA early voting data was that there weren’t any valid conclusions to be drawn.

The other swing states pose different sets of issues, but the theme is the same.  No one knows shit and no serious data analyst would claim otherwise.

However, speaking only from my gut and not as a data analyst, I will make a few observations and a personal prediction:

  • Over 2 million mail-in ballots were requested in PA.  Of those, 1.8 million have been returned, with party affiliations of 56% Democratic, 33% Republican, and 11% Other.  While not really indicative of much, those numbers still feel pretty good for Democrats.
  • There was a late surge in early turnout by women in PA, particularly among young women and new women voters.  That should be good for Democrats.
  • Conversely, the early turnout of older registered Republicans in PA lagged predictions.
  • Trump’s rally diss of Puerto Rico in the closing days of the campaign could end up being a significant unforced error.  There are over 300K PA voters of Puerto Rican descent and about 5% of eligible voters in PA are Latino.
  • There is little argument that PA Democrats have a better get-out-the-vote ground game than Republicans.
  • I suspect that most PA polls have overestimated Trump’s support.
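
For what it’s worth, the mail-in figures in the first observation translate into rough ballot counts as sketched below in Python.  The inputs are the rounded numbers quoted above, “over 2 million” is treated as exactly 2 million (so the return rate is an upper bound), and party affiliation is registration, not vote choice.

```python
# Rounded figures quoted in the post; treat all results as approximate.
requested_floor = 2_000_000  # "over 2 million" requested, so this is a lower bound
returned = 1_800_000
shares = {"Democratic": 0.56, "Republican": 0.33, "Other": 0.11}

def ballots_by_party(total, party_shares):
    """Approximate returned-ballot counts implied by each party's share."""
    return {party: round(total * share) for party, share in party_shares.items()}

counts = ballots_by_party(returned, shares)
registration_gap = counts["Democratic"] - counts["Republican"]
max_return_rate = returned / requested_floor  # upper bound on the return rate

for party, count in counts.items():
    print(f"{party}: about {count:,} returned ballots")
print(f"Democratic minus Republican: about {registration_gap:,}")
print(f"return rate: at most {max_return_rate:.0%}")
```

That works out to a registration gap of roughly 414,000 returned mail-in ballots.  It says nothing about how those ballots were marked or how Election Day turnout will split, which is why it only feels good rather than proving anything.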

The race in Pennsylvania will be tight and the results will not be immediately known.  Trump will undoubtedly claim victory early with no supporting data and there will be legal actions regardless of who is eventually declared the winner.

However:  My money is on Harris winning Pennsylvania.