EA shows a pattern of prioritising non-peer-reviewed publications – often shallow-dive blogposts – by prominent EAs with little to no relevant expertise … This is a worrying tendency, given that these works commonly do not engage with major areas of scholarship on the topics that they focus on, ignore work attempting to answer similar questions, nor consult with relevant experts, and in many instances use methods and/or come to conclusions that would be considered fringe within the relevant fields. … The fact remains that these posts are simply no substitute for rigorous studies subject to peer review (or genuinely equivalent processes) by domain-experts external to the EA community.

ConcernedEAs, “Doing EA Better”
This is Part 3 in my series on epistemics: practices that shape knowledge, belief and opinion within a community. In this series, I focus on areas where community epistemics could be productively improved.
Part 1 introduced the series and briefly discussed the role of funding, publication practices, expertise and deference within the effective altruist ecosystem.
Part 2 discussed the role of examples within discourse by effective altruists, focusing on the cases of Aum Shinrikyo and the Biological Weapons Convention.
Today, I want to ask how research carried out by effective altruists is evaluated. The gold standard of research evaluation is often considered to be the process of anonymous, pre-publication peer review by expert scholars. While some pieces written by effective altruists are peer-reviewed, a good number are not.
Today’s post will discuss what peer review is, and why it is important. Then I will discuss the limited role of peer review within effective altruism and the epistemic costs of failing to expand the role of peer review.
2. What is peer review?
Most serious academic articles and books are published in the following way. First, manuscripts are submitted to a reputable independent scholarly publisher. Articles are sent to journals, whose editors and editorial boards are composed of leading scholars in the field. Journals are typically run either by one of a handful of experienced publishing companies, or else are published `in-house’ by universities or academic departments. Similarly, book manuscripts are sent to academic presses, typically run by leading universities, overseen by specialist editors, and advised by committees of scholars.
Second, manuscripts are subject to a process of pre-publication peer review. Nonspecialist readers need to be confident that published academic work reflects rigorous scholarship building on field-specific expertise. This is done by reviewing articles prior to publication for the quality of their scholarship.
Third, reviewers are invited from among leading specialists with significant scholarly track records in the subject covered by the manuscript. Typically, reviewers hold a PhD in their field and have a number of relevant publications in high-quality venues. This ensures that reviewers are qualified to assess the quality of submissions.
Fourth, review is anonymous. The minimum standard is `single-anonymous’ peer review: reviewers’ identities are shielded from authors. This allows reviewers to speak their minds without fear of retribution. Increasingly, many disciplines have moved to `double-anonymous’ peer review, in which authors’ identities are also shielded from reviewers. This prevents evaluations from being biased by the identity of the author. Some fields practice `triple-anonymous’ review, in which authors’ identities are also shielded from editors, ensuring that editorial decisions are likewise protected from biases related to the identity of the author.
Fifth, typically at least two reviewers submit reports on the manuscript. These reports advise the editors of the venue on the interest and scholarly merit of the manuscript. Reviewers may make three recommendations: that the manuscript be accepted, that it be rejected, or that it be revised and resubmitted. Increasingly, requests for revision are the norm. This allows scholarly publications to be shaped by the knowledge and views of independent experts, rather than reflecting idiosyncratic views of the author or accidents of the paper’s construction.
Sixth, review continues over a number of rounds until editors make a final decision to publish or reject the manuscript. At this point, the review process ends.
3. Benefits of peer review
To outsiders, the process of scholarly peer review can seem impossibly baroque. Yet most scholars are firmly attached to traditional practices of peer review. Why do we academics put our work through such a demanding, cruel, expensive, slow and downright inhumane process of vetting?
The answer is simple: the system works. Fellow academics, as well as members of the public, need to trust that the articles they read are high-quality contributions to cutting-edge scholarly discussions. Articles should build on relevant literature to make a novel contribution to existing discussions. Only a very few readers are in a position to verify that any given article meets this standard, and the rest are loath to take it on trust. As a result, those readers in a position to verify that articles are up to scratch are recruited as reviewers through a system designed to allow them to express their honest and unfiltered judgments, and to use those judgments to either improve the manuscript or decide its fate.
This much is recognized by nearly all academics. A large survey of researchers by Mark Ware found:
- Peer review is widely supported: The overwhelming majority (93%) disagree that peer review is unnecessary. The large majority (85%) agreed that peer review greatly helps scientific communication and most (83%) believe that without peer review there would be no control.
- Peer review improves the quality of the published paper. Researchers overwhelmingly (90%) said the main area of effectiveness of peer review was in improving the quality of the published paper. In their own experience as authors, 89% said that peer review had improved their last published paper, both in terms of language and presentation and in terms of correcting scientific errors.
What evidence do we have that peer review works? First, let’s consider peer-reviewed grant proposals. This is, in many ways, a difficult form of peer review, since reviewers can see only a proposal, not the completed project, and since review is not double-anonymous. Here, nonetheless, the evidence for peer review’s effectiveness is pretty good.
For example, Li and Agha (2015) consider projects funded by the National Institutes of Health between 1980 and 2005. They find a consistent positive correlation between peer-review scores and the number of subsequent publications and citations from a research project. Keep in mind that this difference is only among funded proposals: unfunded proposals would likely have produced fewer publications and citations than these, had they been funded.
In the case of journal publication, referees have more information to go on. Review is still challenging: in many cases, journal acceptance rates are simply so low that most meritorious submissions will be rejected, and many disciplines have yet to adopt double-anonymous reviewing. Nevertheless, results are often quite good.
For example, Card and DellaVigna (2020) look at the fates of papers submitted to four top economics journals, as a function of the recommendations made by their original referees. They find a consistently increasing relationship between the strength of referee recommendations and the number of subsequent citations that papers receive.
Again, referees are far from perfect, but they often do a decent job of spotting high-quality papers.
The system works. Almost everyone who uses the system likes the system, submits to the system and, most tellingly, chooses to read papers that have made it through the system. Selected papers and funded projects have a higher probability of being cited, and this effect tracks not only the eventual verdict but also the strength of referees’ endorsement. That is exactly what we would like to see.
4. Criticism of peer review
There are, from time to time, murmurs of protest against the current system of pre-publication peer review. The system is slow, cruel, conservative, and at times arbitrary. However, these murmurs of protest need to be taken in context.
The vast majority of scholars, including many of those raising critiques, publish almost exclusively in standard scholarly venues through processes of pre-publication peer review. Indeed, many of the best critiques of peer review underwent such a process. Only much more rarely do scholars read or write in nontraditional formats, and then they do so with the understanding that their writing is not likely to be, and probably should not be, treated as a high-quality contribution to a scholarly literature.
For example, I think that the best way for readers to appreciate the importance of peer review would be for them to stop reading this blog and to instead read through the many excellent academic papers written about the system of peer review. My average blog post takes perhaps four hours to write, is written about a subject in which I am not among the foremost academic experts, and goes through no vetting process of any kind. My average paper takes perhaps four months to write, is written about a subject in which I am among the foremost experts, and goes through intensive vetting. That’s a big difference.
This difference is recognized even by the majority of peer review’s critics. For example, a recent study of bias in peer review writes:
Despite concerns about bias, researchers still believe peer review is necessary for the vetting of knowledge claims. One of the most comprehensive surveys of perception of peer review to date found that 93% disagree with the claim that peer review is unnecessary; 85% believe peer review benefits scientific communication; and 83% believe that “without peer review there would be no control” (Ware & Monkman, 2008, p. 1). This suggests that, for researchers, “the most important question with peer review is not whether to abandon it, but how to improve it” (Smith, 2006, p. 180).
Note here that the study centrally draws on the same data from Mark Ware that I used to illustrate the benefits of peer review. This is one of many places where critics and supporters of peer review find substantial common ground.
Almost all scholars recognize that the vast majority of serious scholarship is conducted in peer-reviewed journals and books. Of course, peer review can and should be improved. But most scholars view works published through nontraditional processes with deep suspicion.
5. Peer review in effective altruism
Some effective altruists and aligned academics publish their work through traditional practices of pre-publication peer review. That is a good development, and something I hope to see more of.
Many of the most central writings by effective altruists are produced through a rather different process. Let’s contrast the process by which these writings are produced with the six things said about peer review in Section 2.
First, writings are rarely submitted to reputable, independent scholarly publishers. Most commonly, they are published on blogs, podcasts, and internet fora. These venues have no serious scholarly reputation, giving readers little guarantee that the content they carry is high-quality scholarship. Nor are these venues independent: they are typically dedicated to a single cause or issue, such as effective altruism or AI alignment, and often have a strong ideological bias towards mainstream EA views. This makes it harder for readers to be confident that content is being produced and evaluated on the basis of its independent argumentative merits, rather than its agreement with established views and practices among effective altruists.
Second, review is typically conducted after publication. While comments on forum posts, blogs and podcasts are a useful form of feedback, they nonetheless come after the content has been written and published. This means that review is unable to play the traditional vetting function of publishing high-quality work while excluding low-quality work. This also means that manuscripts cannot be revised on the basis of reviewer feedback. It is true that effective altruists sometimes seek feedback through informal channels before publishing their work. That is a good thing to do. However, academic authors also seek informal feedback before submitting their papers for review, as well as more formal feedback at academic conferences. This feedback is rightly viewed as a necessary precursor to the review process, rather than a substitute for peer review.
Third, reviewers are not selected from among leading specialists. In many online venues, reviewers are not selected at all: anyone can comment. Those who do comment may have some amount of exposure to the topic, but rarely have spent years studying it. They typically lack terminal degrees in the field, or indeed in any field. They also often lack a track record of similar publications. To a lesser extent, the reviewers selected by leading foundations often lack these qualifications as well. This means that it is much harder for reviews to be taken as evidence of the scholarly merit and importance of work published in a nontraditional way.
Fourth, review is not anonymous. Even foundations such as Open Philanthropy, which commission pre-publication reviews, tend to actively share the identities of reviewers with the authors of reports. This makes it difficult for reviewers to do their job. If reviewers want to preserve their relationships, not only with sensitive authors but also with the foundations which commissioned and generously paid for the review, they know that they had best moderate their criticism and heap on an extra helping of praise.
This is doubly true when reviews will be posted publicly. If anyone at all can view the content of a review, then the reviewer is assured little protection against retaliation. Because reviewers credibly expect that most people reading their review will be committed effective altruists, this gives reviewers an especially strong incentive to toe the party line.
Fifth, reviewers rarely have the authority to deny publication or request revisions. This strips the review practice of its teeth. The purpose of peer review is not to express opinion, but to evaluate and improve manuscripts. If reviewers can neither reject nor require revisions to a manuscript, then they cannot exercise effective pre-publication control or vetting, no matter the timing of their review.
Finally, in most cases publication decisions have effectively already been made prior to review. As we saw, in most cases no form of pre-publication peer review is practiced at all. But even in the rare case, such as an Open Philanthropy report, where reviewers are commissioned, this is done with the understanding that the report is virtually certain to be published. I have never heard of an Open Philanthropy report being scrapped after negative reviews (perhaps this has been done very occasionally?). If the decision to publish is already, in effect, made, then the review process is again deprived of its teeth.
6. What scholars think
What does the average scholar think about this situation? Here is what they see. They see a group of talented, intelligent and motivated individuals producing work that they judge to be significantly sub-par. They think that the work falls beneath standards because it has not been through the system of review which nearly all scholars think is required for work to reliably meet minimal scholarly standards. They think that while some of the work produced by effective altruists is quite good, much of it would never have passed through traditional processes of peer review. As a result, they think that nontraditional forms of publication are being used to pass off work that would not survive rigorous scholarly evaluation.
Scholars typically become more concerned when they see the results of this work. They see an increasingly high degree of confidence in idiosyncratic views, such as the claim that artificial agents have a sizable chance of soon murdering us all. They see that many, though not all, of the people advancing these claims lack traditional scholarly credentials and publication records. They see claims being made about their own areas of expertise which they can quickly judge to be false or based on conceptual misunderstandings.
As a result, most scholars judge that the majority of work written by effective altruists is unlikely to be worth reading or engaging with. Therefore, they neither read nor engage with it.
You don’t have to take my word for these concerns. Almost any faculty member at a leading university will tell you the same. Or you can read recent pushback against the writings produced by EA-sponsored AI-safety organizations such as Conjecture and Redwood Research. For example, here is a recent critique of Conjecture:
We believe most of Conjecture’s publicly available research to date is low-quality … We think the bar of a workshop research paper is appropriate because it has a lower bar for novelty while still having its own technical research. We don’t think Conjecture’s research (combined) would meet this bar.
The critique suggests that Conjecture focus on improving the quality of their research outputs by adhering to standard processes, including a limited form of peer review.
We recommend Conjecture focus more on developing empirically testable theories, and also suggest they introduce an internal peer-review process to evaluate the rigor of work prior to publicly disseminating their results.
This is, to be clear, a very minimal demand – much less than the standard form of pre-publication, anonymous and independent review at a leading journal needed to secure significant external credibility. When even such minimal demands are not met, it is hard to place much confidence in the quality of research outputs.
I am not sure if such demands will be met. Conjecture, for its part, complained that “the document is a hit piece” and refused to engage in detail with the substantive concerns raised by this critique. In particular, they said nothing to suggest that the characterization of their publication process is inaccurate or that they plan to implement robust practices of peer review. That does not look good.
7. Why care?
Why should effective altruists care about scholarly standards or scholarly reactions? One reason to care is that scholarly standards work. A great deal of what we know as a species has been produced through standard systems of peer review, and academics and members of the public alike rightly place high confidence in the rigor, quality and informativeness of research published in leading journals. Standard practices of pre-publication peer review are good ways of sorting and improving manuscripts, and they should be adopted because they work.
You would not read a physics paper published by someone with no serious training in physics, and no intention of ever publishing a paper in an academic journal. Why would you subject serious moral issues about altruism to a lesser standard of scrutiny?
A second reason to care is that work published through nontraditional processes may not be taken seriously. Because effective altruists often make strong and controversial claims, it is especially important for them to secure traditional marks of legitimacy in order to secure a fair hearing. Regardless of what effective altruists think about the value of peer review, they should seriously consider the benefits of peer review as a means of convincing serious but skeptical audiences to read and engage with their work.
I hope that effective altruists will come increasingly to care about peer review. Improving the prevalence and nature of peer review within effective altruism will improve the quality of work, as well as the public perception of that work. Some scholars within the movement have written a number of serious and high-quality peer-reviewed articles on topics related to effective altruism. That is a good sign, and something we should hope to see more of in years to come.