Exaggerating the risks (Part 12: Millett and Snyder-Beattie on biorisk)

Unlike standard biothreats, there is no historical record on which to draw when considering global catastrophic or existential risks. Alternative approaches are required to estimate the likelihood of such an event. Given the high degree of uncertainty, we adopt 3 different approaches to approximate the risk of extinction from bioweapons: utilizing surveys of experts, previous major risk assessments, and simple toy models. These should be taken as initial guesses or rough order-of-magnitude approximations, and not a reliable or precise measure.

Millett and Snyder-Beattie, “Existential risk and cost-effective biosecurity”

1. Introduction

This is Part 12 of my series Exaggerating the risks. In this series, I look at some places where leading estimates of existential risk look to have been exaggerated.

Part 1 introduced the series. Parts 2-5 (sub-series: “Climate risk”) looked at climate risk. Parts 6-8 (sub-series: “AI risk”) looked at the Carlsmith report on power-seeking AI.

Parts 9, 10 and 11 began a new sub-series on biorisk. In Part 9, we saw that many leading effective altruists give estimates between 1.0% and 3.3% for the risk of existential catastrophe from biological causes by 2100. I think these estimates are a bit too high.

Because I have had a hard time getting effective altruists to tell me directly what the threat is supposed to be, my approach was to first survey the reasons why many biosecurity experts, public health experts, and policymakers are skeptical of high levels of near-term existential biorisk. Parts 9, 10 and 11 gave a dozen preliminary reasons for doubt, surveyed at the end of Part 11.

The second half of my approach is to show that initial arguments by effective altruists do not overcome the case for skepticism. Today’s post focuses on estimates provided by Piers Millett and Andrew Snyder-Beattie.

2. The MSB estimates

In Part 2 of my series “Mistakes in the moral mathematics of existential risk,” I discussed a paper by Piers Millett and Andrew Snyder-Beattie, “Existential risk and cost-effective biosecurity”. The paper uses three models to estimate levels of per-century existential biorisk, then argues that efforts to mitigate existential biorisk are cost-effective across all three models.

In my previous post, I took for granted the biorisk estimates provided by Millett and Snyder-Beattie (henceforth, MSB). I argued that even if these risk estimates are taken at face value, MSB commit two mistakes in estimating the cost-effectiveness of existential risk mitigation. I argued that once those mistakes are corrected, the MSB models no longer treat efforts to mitigate existential biorisk as cost-effective.

Today, I want to focus on the risk models themselves. MSB give three models of existential biorisk, outlined below.

Model   | Basis                                               | Per-century biorisk
Model 1 | Survey of 2008 Global Catastrophic Risk Conference  | 0.05% to 2%
Model 2 | Potentially pandemic pathogens (Gryphon report)     | 0.00016% to 0.008%
Model 3 | Naive power law extrapolation                       | 0.005% to 0.014%

First, a preliminary observation that will be repeated throughout this post. Few of these models support the high estimates of existential risk that we saw in Part 9 of this series. Even if these estimates are right, many leading effective altruists will have exaggerated existential biorisk by several orders of magnitude, except perhaps on the first model. And that model is entirely dependent on an opinion survey of effective altruists and their guests, so it hardly clears effective altruists of the charge of exaggerating the risks.

Second, I will argue that there is no good reason to give significant credence to the predictions of any of these models. In this sense, even readers willing to accept lower estimates of existential biorisk will still need to provide alternative foundations beyond the MSB arguments.

I make this case below by reviewing each model in turn.

3. Model 1: Survey of 2008 Global Catastrophic Risk Conference

In 2008, the Future of Humanity Institute hosted a conference on global catastrophic risks. At the conference, they informally asked attendees to rate the likelihoods of various disasters occurring before the year 2100. The median attendee estimated a 2% chance of human extinction from an engineered pandemic by 2100, and a 0.05% chance of extinction from a naturally occurring pandemic. These estimates are taken as MSB’s first model of existential biorisk.

I don’t think this is a particularly reliable way to estimate biorisk. To their credit, it’s not clear that MSB do either. They write: “The disadvantage [of this survey] is that the estimates were likely highly subjective and unreliable, especially as the survey did not account for response bias, and the respondents were not calibrated beforehand”.

Let me put the point a bit more sharply. Suppose I were to walk into a conference about the Loch Ness Monster. I might poll attendees about the probability that the Loch Ness Monster will be found by 2100. I suspect many would return estimates of at least 2%. But this wouldn’t provide a credible basis for taking the Loch Ness Monster to be 2% likely to be found by 2100. Why not? Because conference attendees are not a representative sample. Attending a conference about the Loch Ness Monster is highly correlated with belief in the Loch Ness Monster. In just the same way, attending a conference about existential risk is highly correlated with belief in existential risk.

You might object that this is unfair. The attendees at the 2008 Oxford Conference on Global Catastrophic Risks were, in many cases, highly educated people. Surely their estimates could not be strongly skewed.

However, we saw in Part 11 of this series that a group of superforecasters participating in the Existential Risk Persuasion Tournament gave a median estimate of existential biorisk in this century of just 0.01%, a far cry from 2%. And we will see that MSB’s remaining two risk estimates often fall shy of 0.01%.

This is good evidence that the opinions of the conference attendees may have been unrepresentative, and excellent evidence that their opinions should not be taken at face value. That does not mean that the conference attendees are wrong. But it does mean that we cannot simply take the estimates of conference attendees for granted. We are looking for an argument in support of high levels of existential biorisk. Polling those sympathetic to high risk estimates is not an argument, but merely a way of repeating what needs to be proved.

4. Model 2: Potentially pandemic pathogens

About a decade ago, the National Science Advisory Board for Biosecurity (NSABB) was tasked with advising the US Government on the risks and benefits of gain of function research. The NSABB contracted Gryphon Scientific to produce a report analyzing the risks and benefits of gain of function research. A team led by the biochemist Dr. Rocco Casagrande produced a lengthy final report, Risk and benefit analysis of gain of function research.

Casagrande’s team worked with leading experts to produce a risk model and estimate the parameters within the model. Focusing by way of example on two types of influenza research, the team estimated the annual probability of a global pandemic caused by an accident due to gain of function research of this type in the United States at between 0.002% and 0.1%.

The report suggests that risks of pandemics from deliberate misuse are comparable, so MSB suggest doubling these probabilities to account for the overall annual risk of pandemic due to gain of function research in the United States. That’s fair enough.

MSB also suggest that about 25% of relevant research will be conducted in the United States, so we should quadruple the modeled probabilities to account for the overall annual risk of pandemic due to gain of function research. That’s also fair enough. Together with the doubling suggested in the previous paragraph, this produces an estimated 0.016% to 0.8% annual risk of pandemic due to gain of function research. So far so good.

The problem is that this is an estimate of catastrophic risk, not existential risk. As MSB note, Casagrande and colleagues directly estimate fatalities resulting from an outbreak of this kind, putting them in the range of 4 million to 80 million deaths. That’s not the number that MSB wanted.

Up to this point, the MSB estimate has been grounded in highly authoritative research produced by a reputable research team at the direct request of the US Government. That is admirable. But from this point on, the MSB estimate becomes rather more speculative. They write:

The analysis in Risk and Benefit Analysis of Gain of Function Research suggested that lab outbreaks from wild-type influenza viruses could result in between 4 million and 80 million deaths, but others have suggested that if some of the modified pathogens were to escape from a laboratory, they could cause up to 1 billion fatalities. For the purposes of this model, we assume that for any global pandemic arising from this kind of research, each has only a 1 in 10,000 chance of causing an existential risk. This figure is somewhat arbitrary but serves as an excessively conservative guess that would include worst-case situations in which scientists intentionally cause harm, where civilization permanently collapses following a particularly bad outbreak, or other worst-case scenarios that would result in existential risk. Multiplying the probability of an outbreak with the probability of an existential risk gives us an annual risk probability between 1.6 * 10^-8 and 8 * 10^-7.

There are a few things to say here. First, let us not forget the obvious. We saw in Part 9 of this series that many leading effective altruists give estimates in the range of 1-3% for existential biorisk in this century. The MSB estimate above is 3-4 orders of magnitude lower than the estimates surveyed in Part 9 of this series. This is not a model on which leading effective altruists were right. This is a model on which they were greatly exaggerating the risks.
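The arithmetic that links the Gryphon figures to the per-century range in the table above can be sketched in a few lines. This is an illustration only: the inputs are the numbers quoted in the text, while the variable names and the conversion from annual to per-century risk (treating each year as independent) are assumptions of the sketch rather than anything MSB specify.

```python
# Sketch of the Model 2 arithmetic. Inputs are the figures quoted in the text;
# the names and the per-century conversion are illustrative assumptions.

US_ACCIDENT_ANNUAL = (0.002 / 100, 0.1 / 100)  # Gryphon range, US accidents per year
MISUSE_FACTOR = 2      # MSB: deliberate misuse taken to be comparable to accidents
WORLD_FACTOR = 4       # MSB: about 25% of relevant research is in the US
P_EXISTENTIAL = 1e-4   # MSB's "somewhat arbitrary" 1-in-10,000 figure


def per_century(p_annual: float, years: int = 100) -> float:
    """Chance of at least one event across `years` independent years."""
    return 1 - (1 - p_annual) ** years


def model2_annual_existential() -> tuple:
    """Low and high annual existential risk under the Model 2 assumptions."""
    low, high = (
        p * MISUSE_FACTOR * WORLD_FACTOR * P_EXISTENTIAL for p in US_ACCIDENT_ANNUAL
    )
    return low, high


for label, p in zip(("low", "high"), model2_annual_existential()):
    print(f"{label}: {p:.1e} per year, {per_century(p):.5%} per century")
```

Under these assumptions the annual figures come out at about 1.6 * 10^-8 and 8 * 10^-7, and the per-century figures at roughly 0.00016% and 0.008%. Note that the only step separating a pandemic estimate from an existential estimate is the single constant `P_EXISTENTIAL`.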

Second, let’s look in detail at the MSB estimate. On top of over a thousand pages of rigorous scientific analysis by Casagrande and colleagues, MSB propose a small tweak: just multiply the final risk probability by 1 in 10,000 to get the risk of existential catastrophe. Where does this number come from? MSB tell us directly: they made it up. MSB offer no analysis of any kind to support this number, asserting (without argument) that it is an “excessively conservative guess”. One cannot help being struck by the contrast between MSB’s methodology and the rigorous, detailed, empirically-driven analysis standardly used by authors such as Casagrande and colleagues to advise policymakers.

I am not quite sure what to make of this situation. One might have imagined that MSB were in the business of arguing for estimates of existential risk. However, it is precisely at the point of introducing genuinely existential risks on top of an original analysis concerned with catastrophic risks that MSB introduce an invented figure with no support of any kind. I don’t mean that I disbelieve the figure. I mean that there is literally no argument of any kind given for the figure.

If the game being played here is simply one of stating the views held by effective altruists, then such a move is understandable. But if MSB are in the business of arguing for estimates of the level of existential biorisk, then it is hard to see how anything MSB have written here advances their project.

5. Model 3: Naive power law extrapolation

Some recent models suggest that casualties from terrorism and warfare may follow a power law distribution. That is, where X represents the casualties from a randomly selected war or terrorist attack, P(X > x) = x^-α for constant α.

MSB suggest that recent research supports an estimate of approximately α = 0.5 for bioterrorist attacks. This means that the probability of an arbitrary bioterrorist attack killing at least 5 billion people is approximately (5 billion)^-0.5, or about 1.4 * 10^-5. If we assume one bioterrorist attack per year, then the annual probability of a bioterrorist attack killing at least 5 billion people is also about 1.4 * 10^-5. MSB suggest that about 10% of such attacks might result in existential catastrophe, yielding an annual estimate of existential biorisk of about 1.4 * 10^-6.
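As a rough check on these numbers, the bioterrorism calculation can be reproduced directly. The sketch below assumes the simple tail form P(X > x) = x^-α used above; the function and constant names are made up for illustration, and the per-century conversion assumes independent years.

```python
# Sketch of the bioterrorism power law estimate, using the figures in the text.

def tail_probability(deaths: float, alpha: float) -> float:
    """P(X > deaths) under the power law P(X > x) = x ** -alpha."""
    return deaths ** -alpha


ALPHA_BIOTERROR = 0.5          # MSB's approximate exponent for bioterrorist attacks
ATTACKS_PER_YEAR = 1           # assumption stated in the text
P_EXISTENTIAL_GIVEN_5BN = 0.1  # MSB: about 10% of such attacks

p_attack = tail_probability(5e9, ALPHA_BIOTERROR)
annual_risk = ATTACKS_PER_YEAR * p_attack * P_EXISTENTIAL_GIVEN_5BN
century_risk = 1 - (1 - annual_risk) ** 100  # assumes independent years

print(f"P(attack kills > 5 billion): {p_attack:.2e}")
print(f"annual existential risk:     {annual_risk:.2e}")
print(f"per-century risk:            {century_risk:.4%}")
```

The per-century value lands at about 0.014%, which matches the upper end of the Model 3 range in the table.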

At the risk of beating a dead horse, this number is still orders of magnitude lower than the estimates surveyed in Part 9 of this series, and we will shortly see that another power law estimate given by MSB is lower still.

More to the point, this kind of extreme power law extrapolation has little empirical support and would almost certainly not be endorsed even by the authors propounding power law models of bioterrorist attacks. The easiest way to see this is to consider the data on which power law models of terrorist fatalities are based.

For example, one of the leading models, due to Aaron Clauset and colleagues, draws heavily on data regarding the frequency and fatality of terrorist attacks since 1968. This data is represented below.

Frequency and severity of terror attacks, from Clauset et al. (2007)

Note the range of the data: none of the attacks on which this model is based caused more than 10,000 fatalities, and few caused significantly more than 1,000 fatalities.

On the basis of this data, we might predict with some confidence the future frequency of terrorist events with fewer than 1,000 fatalities. We might even, pushing our luck, try to predict the frequency of events with fewer than 10,000 or 100,000 fatalities, though we should not put much confidence in such predictions. But to predict the frequency of events with five billion fatalities goes entirely beyond the scope of the data used to construct power law models. From the fact that attacks obey a power law distribution with a certain slope in the range of perhaps 1-5,000 fatalities, we simply cannot conclude that they have the same slope, or even that they obey a power law distribution in the range of 1-5,000,000,000 fatalities.

Moreover, this kind of power law projection requires highly selective literalness in how model predictions are interpreted. We are meant to take the probability of 5 billion fatalities at face value, using the model projection of (5 billion)^-0.5. But the same method would seem to suggest a probability of 20 billion fatalities at (20 billion)^-0.5, or about 7 * 10^-6. We are, presumably, not meant to take seriously the idea that there is a more than one-in-a-million chance that twenty billion people will die from a single bioterrorist attack this year. But if we are already disposed to ignore model predictions this far out in the tail, is there any principled reason why MSB ask us to take model predictions literally when they are only slightly less far out?
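To make the point concrete, here is a small sketch that reads the same α = 0.5 model at several death tolls, including ones that exceed the number of people alive. The 8 billion population figure is the one used in this post; everything else (names, the choice of thresholds) is illustrative.

```python
# Sketch: the alpha = 0.5 power law keeps assigning probability to death tolls
# larger than the world population. Thresholds are chosen for illustration.

WORLD_POPULATION = 8e9  # population figure used in the text
ALPHA = 0.5             # MSB's approximate exponent for bioterrorist attacks


def tail_probability(deaths: float, alpha: float = ALPHA) -> float:
    """P(X > deaths) under the power law P(X > x) = x ** -alpha."""
    return deaths ** -alpha


for deaths in (5e9, 8e9, 20e9):
    note = " (exceeds world population)" if deaths > WORLD_POPULATION else ""
    print(f"{deaths:.0e} deaths: {tail_probability(deaths):.2e} per attack{note}")
```

Nothing in the model itself marks 5 billion as a safe place to read off a probability and 20 billion as an unsafe one; the cutoff has to come from outside the model.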

Even MSB seem to accept that reading the extreme tails of power law models is not a good idea. Instead of reading the probability of an event which kills the entire world population off of the power law model, they read the probability of an event which kills 5 billion people off of the power law model, then separately estimate the likelihood that such an event kills the rest of us. But once we acknowledge that it would be inappropriate to read the probability of an event which killed the current population of 8 billion people off of a power law model, we should rightly demand an argument for the appropriateness of asking the model how likely it is that 5 billion people will be killed.

Finally, historical reflection suggests that this type of power law extrapolation cannot be taken literally. After all, we might have fitted a similar power law for bioterrorism a thousand years ago, finding fatalities in the range of perhaps 1-1,000 people from events such as poisoned wells and festering bodies catapulted over walls. Only, nobody would ever suggest that medieval bioterrorists had a one-in-a-million annual chance of causing an existential catastrophe. We would not suggest this because we know full well that the data grounding medieval casualty figures does not reflect events which could cause an existential catastrophe.

Only, the very same thing could be said about the data cited by authors such as Clauset and colleagues. No historical bioterrorist attack remotely resembled the type of event that could cause an existential catastrophe. Hence, if the fact that medieval bioterrorists were not engaged in activities that could rise to the level of existential threats is taken as a reason to disqualify power law extrapolations of medieval bioterrorism risk, then by the same token it should also be taken as a reason to disqualify power law extrapolations of contemporary bioterrorism risk.

Perhaps cognizant of the limits to bioterrorism data, MSB offer a different power law extrapolation based on wartime fatalities. They write:

We can also use similar reasoning for warfare, where we have more reliable data (97 wars between 1820 and 1997, although the data are less specific to biological warfare). The parameter for warfare is 0.41, suggesting that wars that result in more than 5 billion casualties will comprise (5 billion)^-0.41 = 0.0001 of all wars. Our estimate assumes that wars will occur with the same frequency as in 1820 to 1997, with 1 new war arising roughly every 2 years. It also assumes that in these extreme outlier scenarios, nuclear or contagious biological weapons would be the cause of such high casualty numbers, and that bioweapons specifically would be responsible for those enormous casualties about 10% of the time (historically bioweapons were deployed in WWI, WWII, and developed but not deployed in the Cold War – constituting a bioweapons threat in every great power war since 1900). Assuming that 10% of escalations resulting in more than 5 billion deaths eventually lead to extinction, we get an annual existential risk from biowarfare of 0.0000005 (or 5 * 10^-7).

There are a few things to be said about this discussion. The first is that it implies at least two things which most effective altruists believe to be false. To begin with, the discussion holds that (a) 10% of escalations resulting in more than 5 billion deaths lead to extinction, and (b) bioweapons would be responsible for 10% of such escalations, with (c) nuclear weapons responsible for the other 90%. Without further (unmodeled, and somewhat implausible) argument, this implies that nuclear weapons pose a nine-times greater existential threat than biological causes do. In addition, by the same reasoning, the estimate implies that great power war poses a ten-times greater existential risk than biological causes do. I suspect most readers think that biorisk is far greater than the existential risk posed by nuclear weapons or great power war, so we seem to have gone wrong somewhere.
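The warfare calculation, and the ratios it implies, can be laid out as follows. This is a sketch under the reading given above: the 90% nuclear share is the remainder after MSB's 10% bioweapons share, and the same 10% extinction probability is applied to every cause. The variable names are illustrative.

```python
# Sketch of the warfare power law estimate and the ratios it implies.
# Figures are those quoted from MSB; the 90% nuclear share is the remainder
# after their 10% bioweapons share, as read in the text.

ALPHA_WAR = 0.41
WARS_PER_YEAR = 0.5          # one new war roughly every 2 years
SHARE_BIO = 0.10             # bioweapons responsible about 10% of the time
SHARE_NUCLEAR = 0.90         # assumed remainder
P_EXTINCTION_GIVEN_5BN = 0.10

p_war_5bn = 5e9 ** -ALPHA_WAR  # MSB round this to 0.0001
extreme_wars_per_year = WARS_PER_YEAR * p_war_5bn

bio_risk = extreme_wars_per_year * SHARE_BIO * P_EXTINCTION_GIVEN_5BN
nuclear_risk = extreme_wars_per_year * SHARE_NUCLEAR * P_EXTINCTION_GIVEN_5BN
war_risk = extreme_wars_per_year * P_EXTINCTION_GIVEN_5BN

print(f"annual biowarfare risk: {bio_risk:.2e}")
print(f"nuclear / bio ratio:    {nuclear_risk / bio_risk:.1f}")
print(f"all war / bio ratio:    {war_risk / bio_risk:.1f}")
```

Without rounding, the biowarfare figure comes out slightly above MSB's 5 * 10^-7, since (5 billion)^-0.41 is a little more than 0.0001. The ratios of nine and ten follow from the shares alone and do not depend on α or on the frequency of wars.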

Second, the use of wartime casualties does not solve the problem of inappropriate data, but merely relocates it. Although historical wars do involve high casualty numbers, the majority of those casualties were not caused by bioweapons, and all of the bioweapons involved bear little relationship to anything that might cause an existential catastrophe. This much is admitted by MSB in a parenthetical remark (“the data are less specific to biological warfare”), but that remark is quite important: we cannot learn much about existential biorisk by projecting casualty figures from wars in which the majority of casualties came from incomparable causes.

The lesson from this discussion is simple. Fitting models to data is all well and good, but those models should be projected to make predictions about scenarios that are appropriately related to the data which generated the models. If we dramatically increase the number of casualties, as in MSB’s first power law model, or change the type of threat, as in MSB’s second power law model, the resulting predictions will have little if any empirical support.

6. Conclusion

Today’s post looked at three models of existential biorisk by MSB. We saw that none of these models provide a credible empirical basis for existential biorisk estimates.

Model 1, based on a survey of attendees at a conference about existential risk, does little more than repeat the opinions of effective altruists.

Model 2, based on a report about the risks of gain-of-function research, takes a credible estimate of catastrophic risks and then transforms this estimate on the basis of an arbitrary translation to provide a rather less credible estimate of existential risk.

Model 3, based on extrapolating power law models of fatalities from warfare and bioterrorism, pushes power law models far beyond the inferences that can be supported by the motivating data.

To say that MSB’s models of existential biorisk do not provide a credible empirical basis for existential biorisk estimates is not to say whether MSB’s estimates are too high or too low. But it is to say that we do not yet have an especially convincing reason to resist the initial case for skepticism about existential biorisk outlined in Parts 9, 10 and 11 of this sub-series. If we are going to find a credible answer to existential biorisk skeptics, that answer must lie elsewhere.


3 responses to “Exaggerating the risks (Part 12: Millett and Snyder-Beattie on biorisk)”

  1. Vasco Grilo

    Nice analysis, David!

    “Finally, historical reflection suggests that this type of power law extrapolation cannot be taken literally. After all, we might have fitted a similar power law for bioterrorism a thousand years ago, finding fatalities in the range of perhaps 1-1,000 people from events such as poisoned wells and festering bodies catapulted over walls.”

    I think this is a good point. One reason power law extrapolation by many many OOMs does not work well is that the tail often starts to decay faster at some point. David Roodman found this (https://www.openphilanthropy.org/wp-content/uploads/Complementary-cumulative-CDF-of-geomagnetic-storm-events-as-function-of-Dst-1957-2014-power-law-and-GP-fits.png) analysing the risk of severe solar storms at Open Phil (https://www.openphilanthropy.org/research/geomagnetic-storms-using-extreme-value-theory-to-gauge-the-risk/):
    – “The Riley-style power law extrapolation, in purple, hits the red Carrington line at .00299559, which is to say, 0.3% of storms are Carrington-strength or worse. Since there were 373 storm events over some six decades (the data cover 1957–2014), this works out to a 17.6% chance per decade, which is in the same ballpark as Riley’s 12%. The 95% confidence interval is 9.4–31.8%/decade”.
    – “In contrast, the best fit rooted in Extreme Value Theory, in orange, crosses the Carrington line at just .0000516, meaning each storm event has just a 0.005% chance of reaching that level. In per-decade terms, that is just 0.33%, with a confidence interval of 0.0–4.0%”.

    So using the power-law extrapolation would lead to a risk of storms of Carrington-strength or worse 53 (= 0.176/0.0033) times as large as the risk rooted in Extreme Value Theory (essentially, fitting to the data a generalised Pareto distribution, which encompasses the power law, but also distributions with thinner tails). As illustrated in this figure (https://www.openphilanthropy.org/wp-content/uploads/Complementary-cumulative-CDF-of-geomagnetic-storm-events-as-function-of-Dst-1957-2014-power-law-and-GP-fits.png) of Open Phil’s report, the higher the risk, the greater the divergence between the power law extrapolation and generalised Pareto extrapolation. So, if one was predicting extinction probabilities (or population losses like 90 % or 99 %), I would not be surprised if results differed by 5 or 10 OOMs.

    Another detail to have in mind is that, because the slope of the tail distribution usually bends downwards (this is true for the deaths plotted in the figure in section 5 of your post), it matters whether we are fitting the power law to all the data points, or just to the right tail. The right tail will tend to have a more negative slope, so I think using all points will lead to overestimating the risk.

    I have recently tried to fit a bunch of distributions to the top 1 % most deadly terrorist attacks in the global terrorism database (https://docs.google.com/document/d/1HPEvjjupH63uIx1OkjXVvJxm_jJKPbPIgSCikys0x7Q/edit?usp=sharing). I am still reviewing the results, and they might currently be wrong, but it currently looks like the choice of distribution is crucial. For the best fit distributions (R^2 = 0.98), I am getting extinction probabilities of the order of 10^-15 per year (https://docs.google.com/spreadsheets/d/1pUo8tWRPYaIq19J3wwpBwHODk8AgcL-_ALLavmFO1Os/edit#gid=917758315&range=AI5) for the median of the 10 best fit distributions. However, there are distributions with decent fit (R^2 > 0.6) resulting in implausibly high annual extinction risk (> 10 %). For the generalised Pareto rooted in extreme value theory, I get essentially no risk of extinction (so low it is rounded to 0). I will post the analysis on EA Forum at some point, but thought I would share.

    1. David Thorstad

      Thanks Vasco!

      The Roodman link does a better job than I did emphasizing what can go wrong with naive power law extrapolation over many orders of magnitude.

      Your work fitting distributions to the most deadly 1% of terrorist attacks seems well worth doing. It’s interesting to see the wide range of predictions made by different distributions. Please do let me know when you post the final analysis – I’d like to read it.
