On [one] view, in the coming decades or centuries, we will invent an artificially intelligent agent that has the power to improve its own intelligence, which then will give it greater powers to improve its own intelligence further. Through this process of recursive self-improvement, that agent might rapidly — perhaps over the course of days or weeks — develop intelligence greater than that of all of the rest of humanity combined. At that point, it will have the power to do what it wants with the human species, and will be able to spread to the stars and use the resources in the accessible universe in whatever way it wants. If, however, we are able to control this superintelligence, and align it with human values, then our preferences would determine how all of the resources in the accessible universe would be used.
Will MacAskill, “Are we living at the hinge of history?”
(Existential Risk Pessimism) Per-century existential risk is very high.
(Astronomical Value Thesis) Efforts to mitigate existential risk have astronomically high expected value.
Part 3 introduced a potential solution: the Time of Perils Hypothesis on which risk is high now, but will soon fall to a permanently low level.
Today, I want to look at some common objections to the argument in this paper and say a few words about how best to respond to them.
2. Limitations of the model
Many readers have spoken or written to me about modeling limitations. For example:
- Joint modeling: The paper explores changes to the Simple Model, such as value growth and the Time of Perils Hypothesis, individually, but not jointly. Wouldn’t it be better to model all of these interventions together?
- Black boxes of value: The model exogenizes its axiological component, the value v of a century. Even models which change the value of a century, for example the model of value growth, don’t tell us what generates value. Wouldn’t it be better to couple the model with (a) a model of the well-being of individuals across time, and (b) a population axiology telling us how valuable it is for collections of individuals to occupy given welfare levels? Wouldn’t it be even better if the model tracked features such as (c) population growth or (d) rates of interstellar expansion that figure as key determinants of the values of future centuries?
- Continuous modeling: The model in this paper is a discrete model. Wouldn’t it be better to use a continuous model? That would avoid uncomfortable “gappiness”, such as the sudden jump from a very high to a very low level of risk in the Time of Perils model.
These limitations are real. They are important. And I would be genuinely thrilled to see them removed. I know of at least one graduate student who is working to extend the model in some ways that will address the “black boxes of value” objection. I’d be very happy to see other work in this, or another direction.
If, at any point, you find yourself inclined to extend the model in some ways, please write to me at firstname.lastname@example.org. I’d be thrilled to hear about it.
3. Exponential growth
One type of modeling limitation deserves special emphasis. The section on value growth considers more ambitious forms of value growth than those considered by Ord and Adamczewski: while Ord and Adamczewski only considered linear growth, I also consider quadratic growth, and elsewhere I have sketched some results for cubic, quartic and quintic growth (see also Tarsney 2022 on cubic growth).
However, I have not considered the possibility of sustained exponential growth in the value of future centuries. Would that be enough to reconcile Existential Risk Pessimism with the Astronomical Value Thesis?
On the possibility of modeling sustained exponential value growth, I have three remarks to make. First, don’t do it. It is one thing to model a few centuries of exponential value growth. But projecting exponential value growth into the indefinite future is likely to vastly overestimate the extent of future value growth.
Even when we look at more familiar quantities such as GDP growth and population growth, it looks implausible to posit hundreds of centuries of sustained growth. For example, the UN projects that population growth will be subexponential, and quite probably near-zero by the end of the century.
And although it is true that GDP growth has been exponential for the past few centuries, this is a historical anomaly that cannot continue. Until very recently, world GDP growth was highly subexponential. Here are the figures for the previous twenty centuries:
And in fact, this graph is highly truncated. Humanity has been around for millions of years, but experienced sustained exponential economic growth only during the last few centuries.
Could an optimist think that exponential GDP growth will continue? Well, I hope that exponential GDP growth will continue for a while yet. But nobody should think that exponential GDP growth will continue for hundreds of centuries. Take today’s per-capita world GDP of $12,000 and project it forward at 2.1% per year, the rate of world GDP growth estimated for 2100 in a recent expert elicitation. This would yield per-capita GDP above $10^80 within 85 centuries. Unless we think that within 85 centuries, the average person will be earning more dollars than there are atoms in the known universe, we should probably think that exponential GDP growth has to stop sometime.
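The compounding arithmetic in this paragraph is easy to check. Here is a minimal sketch, assuming a constant 2.1% annual growth rate applied to a $12,000 starting figure for 8,500 years; the function name is my own illustration, and the comparison point of 10^80 is the usual order-of-magnitude estimate for the number of atoms in the known universe.

```python
import math

def log10_projected_gdp(start_dollars, annual_growth, years):
    """Return log10 of start_dollars * (1 + annual_growth) ** years.

    Working in log space avoids building an 80-digit number directly.
    """
    return math.log10(start_dollars) + years * math.log10(1.0 + annual_growth)

# Assumed inputs, taken from the paragraph above.
START = 12_000          # per-capita world GDP in dollars
RATE = 0.021            # 2.1% annual growth
YEARS = 85 * 100        # 85 centuries

exponent = log10_projected_gdp(START, RATE, YEARS)
print(f"Projected per-capita GDP is about 10^{exponent:.1f} dollars")
```

Under these assumptions the exponent comes out a little above 80, which is the sense in which the projected figure exceeds the number of atoms in the known universe.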
Also, it is important to remember that GDP growth does not translate linearly into welfare growth. From the fact that Elon Musk is at least a million times richer than I am, we cannot conclude that Musk’s level of well-being is a million times greater than my own. It isn’t. So even if we are excited about the prospects for future exponential GDP growth, we should be less excited about the prospect for future exponential value growth.
Second, if you must model exponential growth, look at Aschenbrenner (2020). Aschenbrenner combines an exponential model of economic growth with a quantity (safety outputs) that exponentially decreases existential risk, but also another quantity (consumption) that exponentially increases existential risk. This is a vast improvement on a modeling situation in which we write down a single exponential equation (value growth), note the trivial mathematical fact that exponentials tend to win in the long term, and take a victory lap. If you must write down a term that increases exponentially, pay attention to exponential terms that push in the other direction.
Third, if you’d like an exponential growth model in the spirit of this paper, here it is. I hope I have convinced you not to traffic in such models. But if you would like to see what happens in such a model, it is only fair for me to tell you what happens.
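Exponential growth values the world at:

$$V[W] = \sum_{i=1}^{\infty} v\,a^{i}(1-r)^{i}$$

for some rate a > 1 of per-century value growth and per-century risk r of existential catastrophe. This diverges for a >= 1/(1-r) and otherwise converges to $\frac{v\,a(1-r)}{1-a(1-r)}$.

Suppose X provides a relative reduction of risk in our own century by fraction f. This creates a new world with value:

$$V[W_X] = \sum_{i=1}^{\infty} v\,a^{i}\,\bigl(1-(1-f)r\bigr)(1-r)^{i-1}$$

which simplifies to:

$$V[W_X] = \frac{1-(1-f)r}{1-r}\,V[W]$$

Hence the value of X is V[W_X] – V[W], which works out to:

$$V[X] = \frac{f\,r}{1-r}\,V[W] = \frac{f\,r\,a\,v}{1-a(1-r)}$$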
To get a rough intuition, write a = 1 + w, so that, for example, w = 0.2 represents 20% per-century value growth. To a rough approximation (this is a bit too simple), if w < r, the ‘risk’ exponential beats the ‘growth’ exponential and the model behaves somewhat similarly to the models in the paper. When w > r, the ‘growth’ exponential beats the ‘risk’ exponential and the sum diverges. (The exact convergence condition is a < 1/(1-r), that is, w < r/(1-r), which is why the approximation is a bit too simple.) When w ≈ r, the ‘growth’ and ‘risk’ exponentials both exert substantial force, and model behavior is sensitive to small changes in w and r. (Something like this tripartite dynamic is familiar from the Aschenbrenner model.)
Here are some representative values for a 10% relative risk reduction on the exponential growth model, with ‘DIV’ representing divergence.

| | r = 0.001 | r = 0.01 | r = 0.1 | r = 0.2 |
|---|---|---|---|---|
| w = 0.001 | 100.1v | 0.1v | 0.1v | 0.1v |
| w = 0.01 | DIV | 10.1v | 0.1v | 0.1v |
| w = 0.1 | DIV | DIV | 1.1v | 0.2v |
| w = 0.2 | DIV | DIV | DIV | 0.6v |
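For readers who would like to check these figures or try other parameters, here is a minimal sketch. It assumes that century i has value v·a^i with a = 1 + w, that the sum diverges when a(1 − r) >= 1 (the condition stated above), and that otherwise the value of a relative risk reduction f in our own century has the closed form f·r·a·v / (1 − a(1 − r)). That closed form is my reconstruction from the model as described here, and the function name is hypothetical.

```python
import math

def value_of_risk_reduction(f, r, w, v=1.0):
    """Value of cutting this century's risk r by relative fraction f.

    Assumes per-century value growth a = 1 + w and constant risk r.
    Returns math.inf when the underlying sum diverges.
    """
    a = 1.0 + w
    ratio = a * (1.0 - r)      # common ratio of the geometric series
    if ratio >= 1.0:
        return math.inf        # growth outpaces risk: the sum diverges
    return v * f * r * a / (1.0 - ratio)

RATES = [0.001, 0.01, 0.1, 0.2]

# Print one row per growth rate w, one cell per risk level r, in units of v.
for w in RATES:
    cells = []
    for r in RATES:
        value = value_of_risk_reduction(0.1, r, w)
        cells.append("DIV" if math.isinf(value) else f"{value:.1f}v")
    print(f"w = {w}: " + " | ".join(cells))
```

Nudging w or r slightly in the cells where w = r shows how sensitive those knife-edge values are to the choice of parameters.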
Summing up: I don’t think it is appropriate to model indefinite exponential growth in quantities such as GDP and population, let alone the value of a century. If you must do this, take a lesson from Aschenbrenner and let exponential growth race against a parameter which decays exponentially, so that we’re not just rigging the game in favor of growth. A simplistic way to do that, in the spirit of my paper, is sketched above: low rates of value growth lead to performance qualitatively reminiscent of the models we’ve already seen; high rates of value growth lead to implausibly high values for existential risk mitigation; and ‘knife-edge’ cases in between are both rare and highly sensitive to parameter choices.
4. Small probabilities are enough
Many readers have reminded me that even a small probability of the Time of Perils Hypothesis may be enough to save the Pessimist. For example, if the right version of the Time of Perils Hypothesis implies that reducing existential risk is 2,000x better than anything else we can do, then a 1/2,000 chance of the Time of Perils Hypothesis being true would still be enough to salvage the case for existential risk mitigation, and more optimistic assumptions might even save the Astronomical Value Thesis.
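The break-even arithmetic behind this objection can be spelled out in a few lines. This is a minimal sketch under a simplifying assumption of my own: if the Time of Perils Hypothesis is false, risk mitigation is treated as contributing nothing relative to the best alternative. The function names are hypothetical.

```python
from fractions import Fraction

def expected_multiplier(p_true, multiplier_if_true, multiplier_if_false=0):
    """Expected value of mitigation, in units of the best alternative."""
    return p_true * multiplier_if_true + (1 - p_true) * multiplier_if_false

def break_even_probability(multiplier_if_true):
    """Smallest probability at which mitigation matches the alternative,
    assuming mitigation is worth nothing if the hypothesis is false."""
    return Fraction(1, multiplier_if_true)

p = break_even_probability(2000)
print(p)                              # 1/2000
print(expected_multiplier(p, 2000))   # 1
```

On these assumptions a 1/2,000 credence puts mitigation exactly level with the best alternative, and any higher credence tips the balance toward mitigation.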
Fair enough, I respond, but the Time of Perils Hypothesis makes a very strong claim: despite technology getting ever more destructive with each passing year, levels of existential risk will (a) soon take a (b) dramatic drop by many orders of magnitude, and (c) stay low for the rest of human history. These are strong claims, and if we didn’t have an argument for the Time of Perils Hypothesis in hand, we’d probably take them to be highly implausible. Why should we expect many orders of magnitude to be shaved off of future risk? Why should we expect that future centuries will never see a reversion to today’s high-risk regime, not even for a few decades? And why should we expect that all of this will happen soon?
What this discussion reminds us is that the Time of Perils Hypothesis needs an argument. While it is quite right to suggest that we don’t need a knockdown argument for the Time of Perils Hypothesis in order to help the Pessimist, we still do need a decent argument, otherwise it really would be appropriate to assign very low probability to the Time of Perils Hypothesis.
The paper considers three leading arguments for the Time of Perils Hypothesis and argues that they don’t work: appeals to space (Part 4), the existential risk Kuznets curve (Part 5), and wisdom growth (Part 6). If these arguments don’t work, then we need a new argument for the Time of Perils Hypothesis. What could that argument be?
5. Superintelligence and beyond
One suggestion that is often made to me is that the Time of Perils will end once superhuman AI systems are developed. These systems may well kill us, it is suggested, but if they do not kill us, they will have the wisdom, foresight and power to predict and stamp out nearly all future risks. In this way, the Time of Perils will come to an end with the development of superhuman AI systems.
Let me answer this objection with a story. I was once in a seminar room with two professors from Princeton. They spoke about a topic connected to the Time of Perils Hypothesis. Someone in the room raised their hand and asked what they thought of the objection raised above. They were perhaps a bit taken aback by the suggestion.
After some thought, one of the professors asked whether there was perhaps an article to read which would explain why effective altruists had such high hopes for future AI systems. Alas, the reply came, there is no article.
Fair enough, the professor asked. But is there a blog post explaining why effective altruists think that future AI systems can stamp out nearly all existential risks? Alas, the reply came, there is no blog post.
At this point, the matter was dropped and discussion continued. Why was the matter dropped? Let me be blunt.
The claim that humanity will soon develop superhuman artificial agents is controversial enough. The follow-up claim that superintelligent artificial systems will be so insightful that they can foresee and prevent nearly every future risk is, to most outside observers, gag-inducingly counterintuitive.
To say that a claim is counterintuitive is not to say that it cannot be true. But it is to say that counterintuitive claims require extremely strong and detailed justification.
Let’s set aside the original claim that humanity will soon develop highly powerful superhuman artificial systems. Focus on the follow-up claim that these agents will be insightful and powerful enough to prevent nearly all future risks. This claim needs an argument. That argument needs to be detailed and fully developed. And that argument needs to be a good one.
Until that argument is made, I am really not sure what to say about this objection except that I am disappointed at the combination of a high level of belief in this view throughout the effective altruist community together with a low level of serious effort to make a rigorous case for the view. That is not what reason and evidence demand, and I hope that effective altruists will do better in the future.
6. Value and astronomical value
A final objection that is often raised is that even if existential risk mitigation is not astronomically valuable, it may still be well worth doing. Indeed, many models in this paper concede that reducing risk by 10% in our own century may have value comparable to, or even exceeding, the value of a century of human life.
To a large extent, I agree. The purpose of this paper is not to argue that existential risk mitigation is never worth doing. Extinction is bad, and we should take threats of human extinction seriously. The purpose of this paper is to take the case for existential risk mitigation out of the stratosphere, bringing us away from a realm of astronomical value promotion and down to a point where numbers matter.
If the value of existential risk mitigation is not astronomical, then we need to pay detailed attention to figures such as the cost and probable impact of long-termist interventions, and stack these up in detailed analysis against the most cost-effective short-termist interventions.
I suspect that the results of this analysis will show that some existential risks are worth mitigating, and that others are best left alone. But in any case, my purpose in writing this paper is not to tell you what you should think about the cost and likely success of risk mitigation measures. My main purpose is to urge effective altruists to pay detailed attention to numbers in an arena where numbers have too often been neglected or miscalculated.
7. Taking stock
This post has considered five objections to the argument of the paper:
- That the model has important limitations, and could be helpfully enriched.
- That exponential value growth is not modeled.
- That even small probabilities of the Time of Perils Hypothesis may be enough to reconcile Existential Risk Pessimism with the Astronomical Value Thesis.
- That superintelligent artificial agents will soon bring the Time of Perils to an end.
- That existential risk mitigation may still be worthwhile without the Astronomical Value Thesis.
Let me know if there are other objections you would like to hear discussed.