Now that Trump’s last challenge to the election results has been summarily dismissed by the Supreme Court and Biden has been inaugurated, I feel that it’s a good time to provide my perspective on the controversy that surrounded my colleague, Steven J. Miller, and that exploded across both right-wing media and the internet in November.
To recap, in mid-November, Steve (a professor of mathematics at Williams) was contacted by several people who had questions about mail-in ballot data from registered Republicans in Pennsylvania. The data were provided by Matt Braynard, a former Trump campaign staffer, who claimed that the data he obtained showed rampant fraud. Steve was asked to verify this by providing a statistical analysis of the data. After his analysis, Steve signed an affidavit alleging that “almost surely (based on the data I received) that [sic] the number of ballots requested by someone other than the registered Republican is between 37,001 and 58,914, and almost surely the number of ballots requested by registered Republicans and returned but not counted is in the range from 38,910 to 56,483.” He subsequently signed a second affidavit with a slightly lower estimate, but the damage had already been done.
Not surprisingly, this statement made headlines in far-right media. Trump retweeted the story to about 70,000,000 followers. One headline read “In sworn statement, prominent mathematician flags up to 100,000 Pennsylvania ballots”, and Steve was referred to as a data scientist and respected mathematician. The headline was followed by, “Federal Elections Commission Chairman Trey Trainor says new analysis by professor Steven Miller ‘adds to the conclusions that some level of voter fraud took place in this year’s election.’” The affidavit became part of the materials used in the case alleging fraud in the Pennsylvania election.
The day after the story broke, I was asked by The Williams Record and the Berkshire Eagle to respond. At first, I declined, citing my closeness to the story as Steve is both a colleague and a friend of mine. But after reading his affidavit, seeing the impact it was having, and seeing the problem with his analysis, I felt obliged to get involved. One interviewer asked me whether a senior stat major should be able to see what Steve did wrong. I replied that if a student in an intro stats course didn’t see what was wrong, they shouldn’t pass the course.
Steve and I have had many discussions in the past. We don’t always agree, but they are always interesting and productive. Out of courtesy, I immediately let Steve know what I thought of the analysis and what I intended to write. I told him that, as they say in New Jersey, this is business, not personal.
Here is the story in the Williams Record where I was quoted as saying that Steve’s analysis was “completely without merit” and “both irresponsible and unethical.” I also wrote a longer rebuttal the following day, where I pointed out that Steve had violated at least seven out of the 10 guidelines for ethical statisticians laid out by the American Statistical Association (ASA).
For most of that week I attempted to remind Steve why his analysis was so unethical, and we discussed ways that Steve could correct his mistakes. At first, he insisted on simply repeating that he assumed that the data were valid. I informed him that, as pointed out in the ASA guidelines on statistical ethics, that is not enough. The onus is on the statistician to either validate the data or to state the limits of their analysis. Next, he wanted to go back to Matt Braynard to find out more details about the data collection. But after he and I had spent many hours in discussion and exchanging emails, he decided not to pursue this ill-advised path further and issued this apology:
“One of the lessons I try to teach my students, and I think many have learned it better than I, is to critically examine the data before doing any analysis. I did not do that when asked to make mathematical calculations based on data related to perhaps the most contentious election of our lifetime. Nor did I fully consider how my calculations, made in isolation based on numbers provided to me, would be used. Several of my colleagues have pointed out concerns both in the data and how it was used. They were right — I made a mistake by not discussing these issues. I hope others will learn from my error and learn from my example that if you make a mistake you admit it and take steps to fix it.”
Here is the story as reported in the Berkshire Eagle on November 24. Several other people also criticized Steve’s analysis, including computational biologist Lior Pachter. Pachter provided a fairly thorough technical response, concluding, “In summary, Steven Miller’s declaration provides no evidence whatsoever of voter fraud in Pennsylvania.”
At the same time that his analysis received legitimate criticism, I noticed a “piling on” of personal attacks against Steve on social media. Not only did Pachter dissect his analysis, but he attacked Steve for allegedly inflating his CV by including papers mistakenly attributed to him on Google Scholar and thereby racking up over 8,000 citations. Pachter is an impressive scholar in his own right (to say the least) with 196 curated papers and over 70,000 citations listed on Google Scholar. But his attacks on Steve are unfair. Steve does not actively curate his page (would he seriously try to pass off a paper in chemistry from 1955 as his own?). Nor does he need to. On his academic CV he has over 140 papers in several areas of mathematics with over 20 papers written with undergraduates and over 2,000 legitimate citations, a number that would be the envy of many scholars. And I can attest that even with a name as uncommon as De Veaux, I have had papers and citations mistakenly attributed to me on Google Scholar and ResearchGate. I can only imagine the problem with a name like Steven Miller.
But the “cancel culture” attacks didn’t stop with Google Scholar. On Facebook, people started questioning his behavior as chairman of our local Phi Beta Kappa Chapter and as a member of our school board. I tried to separate his (albeit serious) misjudgment on this issue from wholesale character defamation and the accompanying moral signaling that has become so popular these days. I have remained active on social media to try to stop the cancel-culture pile-on against Steve. I have known Steve for years, and although his politics and mine don’t align, I have never known him to be dishonest. In fact, he has gone out of his way to try to fix this situation and has learned a lot about the difference between statistical thinking and the blind application of math or data science formulas. In particular, Steve has told me that “seeing how my report was spread without my caveats really drove home the points you made.” Unfortunately, these attacks continue. Recently, the organizers of a data science course taught remotely by the Liberal Arts Collaborative for Digital Innovation (LACOL), in which Steve has participated previously, wondered if he should be replaced.
Many researchers with degrees in other fields, including but not limited to data science, computer science, and mathematics, certainly have the ability to “do statistics” and apply formulas. But too often these analyses are done without a central tenet of a statistics education: an appreciation of the necessity to question the quality of the data on which the conclusions are based. I hope that this incident can teach the general public at least two important lessons. First, as Cathy O’Neil so eloquently reminded us in Weapons of Math Destruction, blind application of formulas and algorithms to bad data is not only untrustworthy and wrong, but potentially dangerous. And second, we should always be skeptical of data sources and insist on asking questions about the pedigree of the data and the motivations of those disseminating it. Let’s hope that this episode will help the public both to value statistical thinking and to appreciate those who have been educated in it.
Richard D. De Veaux, C. Carlisle and M. Tippit professor of statistics and associate chair of statistics, has been at the College since 1994. He is the 2018-2021 Vice President of the American Statistical Association.