New paper: Long-Term Trajectories of Human Civilization

Long-Term Trajectories of Human Civilization (free PDF). Foresight, forthcoming, DOI 10.1108/FS-04-2018-0037.

Authors: Seth D. Baum, Stuart Armstrong, Timoteus Ekenstedt, Olle Häggström, Robin Hanson, Karin Kuhlemann, Matthijs M. Maas, James D. Miller, Markus Salmela, Anders Sandberg, Kaj Sotala, Phil Torres, Alexey Turchin, and Roman V. Yampolskiy.

Abstract
Purpose: This paper formalizes long-term trajectories of human civilization as a scientific and ethical field of study. The long-term trajectory of human civilization can be defined as the path that human civilization takes during the entire future time period in which human civilization could continue to exist.
Approach: We focus on four types of trajectories: status quo trajectories, in which human civilization persists in a state broadly similar to its current state into the distant future; catastrophe trajectories, in which one or more events cause significant harm to human civilization; technological transformation trajectories, in which radical technological breakthroughs put human civilization on a fundamentally different course; and astronomical trajectories, in which human civilization expands beyond its home planet and into the accessible portions of the cosmos.
Findings: Status quo trajectories appear unlikely to persist into the distant future, especially in light of long-term astronomical processes. Several catastrophe, technological transformation, and astronomical trajectories appear possible.
Value: Some current actions may be able to affect the long-term trajectory. Whether these actions should be pursued depends on a mix of empirical and ethical factors. For some ethical frameworks, these actions may be especially important to pursue.

An excerpt from the press release over at the Global Catastrophic Risk Institute:

Society today needs greater attention to the long-term fate of human civilization. Important present-day decisions can affect what happens millions, billions, or trillions of years into the future. The long-term effects may be the most important factor for present-day decisions and must be taken into account. An international group of 14 scholars calls for the dedicated study of “long-term trajectories of human civilization” in order to understand long-term outcomes and inform decision-making. This new approach is presented in the academic journal Foresight, where the scholars have made an initial evaluation of potential long-term trajectories and their present-day societal importance.

“Human civilization could end up going in radically different directions, for better or for worse. What we do today could affect the outcome. It is vital that we understand possible long-term trajectories and set policy accordingly. The stakes are quite literally astronomical,” says lead author Dr. Seth Baum, Executive Director of the Global Catastrophic Risk Institute, a non-profit think tank in the US.

The group of scholars including Olle Häggström, Robin Hanson, Karin Kuhlemann, Anders Sandberg, and Roman Yampolskiy have identified four types of long-term trajectories: status quo trajectories, in which civilization stays about the same, catastrophe trajectories, in which civilization collapses, technological transformation trajectories, in which radical technology fundamentally changes civilization, and astronomical trajectories, in which civilization expands beyond our home planet.

Available here: https://kajsotala.fi/assets/2018/08/trajectories.pdf

Finland Museum Tour 1/??: Tampere Art Museum

I haven’t really been to museums as an adult; not because I’d have been particularly Anti-Museum, but just because museums never happened to become a Thing That I Do. I vaguely recall having been to a few museums with my parents when I was little, an occasional Japan exhibition as a teen when Japan was a Thing, and a few visits to various museums with school. I think my overall recollection of those visits could be summarized as being around 5.5 on the BoardGameGeek rating scale, somewhere between “5/10: Slightly boring, take it or leave it” and “6/10: Ok – will play if in the mood”. (The BGG rating scale is my favorite of the ones that I’ve seen, but I digress.)

So I’m not sure, but it’s at least possible that between becoming an adult and yesterday, I didn’t visit a single museum.

For the last year or so, however, I’ve had a definite feeling of being stuck in a rut, life-wise. Up until summer last year, I used to have a lot of anxiety; I’m still not totally free of it, but I’ve reduced the amount of it enough that escaping from it is no longer my main driving motivation, the way that it used to be. Meaning that I’m more free to focus on things that I actually enjoy.

But once you have spent most of your adult life feeling a desperate need to escape from a constant level of background anxiety, anxiety which was preventing you from doing anything slow-paced as that would have been insufficient to drown out the suffering… then it’s hard to know *what* you really enjoy anymore. Because you haven’t really been looking for enjoyable things, you have been looking for things that would make the pain go away.

What I was left with, even after getting rid of most of the anxiety, was some level of anhedonia – a difficulty deriving any pleasure from something. And most of my old routines were built around doing things that were mainly palliatives for anxiety, rather than being particularly enjoyable.

Then one day, I happened to see a news article saying something about how the national Museum Card – a single card that you can buy for a year, that gives you free access to various museums around the country – had brought more visitors to museums. And it crossed my mind that visiting museums is a Thing That People Do, which I wasn’t doing, and that I probably hadn’t been able to properly appreciate museums as a kid.

Also, that it would be a fun adventure to go around the country and visit all the various museums that the card gives you access to (currently 278 of them). That kind of an adventure was also Something That People Did, but which hadn’t really been the Kind Of Thing That I Would Do.

So I happened to mention this idea in a conversation with my good friend Tiina, who then mentioned that she also had a museum card, and that company would be welcome. That wasn’t quite an agreement to visit *all* the museums in the country, but still good enough to get started!

Our first visit, today, was to Tampere Art Museum. I had no particular reason for picking this place: Tiina and I would be traveling elsewhere later in the day, so I let her, as a Tampere local, choose a place from which we could easily get to the bus station afterwards. I also found the notion of starting out with an art museum intriguing. Some of the other museums in Tampere, such as the Espionage Museum, Game Museum or Lenin Museum, had topics that were intrinsically interesting. But just a generic “art museum” was probably closest to my inner stereotype of the kind of dull museum that I wasn’t really able to appreciate as a kid. This time, I was determined to have an open mind and enjoy my experience, whatever it might be.

And I did.

The museum had two exhibitions. The first, Tuomo Rosenlund and Mika Hannu’s “Paikan muisti” (“The Memory of Place”) consisted of black-and-white drawings of different places in the city of Tampere, together with poems and writing about the history of those places.

I feel like I have recently done a number of things to help me connect with more intuitive parts of my mind, ranging from various psychotherapy-style practices like Focusing, to shaman drumming and mild vision trances. Looking at the various drawings, the thought came to me to let my subconscious complete them. I asked it to do so and it obliged, and in my mind’s eye, the black-and-white drawings were flooded with color and texture and depth, and I could experience myself standing in the depicted places, imagining myself into locations that I had never seen before. Not as vividly as actually being there, but as strongly as in a particularly vivid memory.

That felt enjoyable, like a pleasant meditation exercise.

And then there was the exhibition by J.A. Juvani, a visual artist who – as I learned – is the national Young Artist of the Year.

The essence of his pieces, as I experienced them, was that of pure, raw, unashamed sexuality; lust and desire shaded in colors of queer and kink. Furthermore, it was an obviously personal essence; an expression of the artist’s own sexuality, an expression of intimacy that drew you in, erotic even to someone like me who doesn’t ordinarily think of himself as being physically attracted to men.

One of his displays was a looping video, on a large screen right in front of the stairs leading to the second floor, placed so that you couldn’t avoid seeing it when you came up. At the end of the very suggestive video, he would be staring you right in the eyes; and as I stared back, I felt my breathing growing deeper and faster, as if he was propositioning me in the flesh right there, with me finding him too appealing to say no to.

What probably got me was the rawness – this was not the nice, unreal kind of erotica where everything is pink and nobody’s position ever feels uncomfortable, but the kind of primal lust that felt real and totally uninterested in Photoshopping any aspect of the experience. It was much more interested in just having a good, visceral, _physical_ fuck right then and there.

And it brought up partially forgotten memories. Memories of the same kind of pure, two-sided lust between people, with a force that I realized I probably hadn’t experienced since my teen years, in that roller-coaster relationship with my first girlfriend that was all shades of dysfunctional but never lacking in intensity.

There was a book on Juvani’s work on sale; I skimmed it, and his descriptions of his life, the casual sex, the friend who had visited “for a tea and a blowjob” further brought back memories of those first teenage sex experiences, when everything was novel and exciting and innocent and powerful; before I had accumulated the sexual traumas and problems that sometimes make me almost averse to the thought of having sex, even with a willing and desirable partner.

And there was a sense of catharsis, feeling somehow more *pure* as a result; the same kind of feeling that I may get from writing something that I’ve managed to really immerse myself in, or from really intensive fiction. A feeling of having connected with deeper, more emotional parts of my mind, and becoming more whole as a result.

It was a good experience. I’m glad that I went.

Is the Star Trek Federation really incapable of building AI?

In the Star Trek universe, we are told that it’s really hard to make genuine artificial intelligence, and that Data is so special because he’s a rare example of someone having managed to create one.

But this doesn’t seem to be the best hypothesis for explaining the evidence that we’ve actually seen. Consider:

  • In the TOS episode “The Ultimate Computer”, the Federation has managed to build a computer intelligent enough to run the Enterprise on its own, but it goes crazy and Kirk has to talk it into self-destructing.
  • In TNG, we find out that before Data, Doctor Noonian Soong had built Lore, an android with sophisticated emotional processing. However, Lore became essentially evil and had no problems killing people for his own benefit. Data worked better, but in order to get his behavior right, Soong had to initially build him with no emotions at all. (TNG: “Datalore”, “Brothers”)
  • In the TNG episode “Evolution”, Wesley is doing a science project with nanotechnology, accidentally enabling the nanites to become a collective intelligence which almost takes over the ship before the crew manages to negotiate a peaceful solution with them.
  • The holodeck seems entirely capable of running generally intelligent characters, though their behavior is usually restricted to specific roles. However, on occasion they have started straying outside their normal parameters, to the point of attempting to take over the ship. (TNG: “Elementary, Dear Data”) It is also suggested that the computer is capable of running an indefinitely long simulation which is good enough to make an intelligent being believe that it is the real universe. (TNG: “Ship in a Bottle”)
  • The ship’s computer in most of the series seems like it’s potentially quite intelligent, but most of the intelligence isn’t used for anything other than running holographic characters.
  • In the TNG episode “Booby Trap”, a potential way of saving the Enterprise from the Disaster Of The Week would involve turning over control of the ship to the computer: however, the characters are inexplicably super-reluctant to do this.
  • In Voyager, the Emergency Medical Hologram clearly has general intelligence: however, it is only supposed to be used in emergency situations rather than running long-term, its memory starting to degrade after a sufficiently long time of continuous use. The recommended solution is to reset it, removing all of the accumulated memories since its first activation. (VOY: “The Swarm”)

There seems to be a pattern here: if an AI is built to carry out a relatively restricted role, then things work fine. However, once it is given broad autonomy and it gets to do open-ended learning, there’s a very high chance that it gets out of control. The Federation witnessed this for the first time with the Ultimate Computer. Since then, they have been ensuring that all of their AI systems are restricted to narrow tasks or that they’ll only run for a short time in an emergency, to avoid things getting out of hand. Of course, this doesn’t change the fact that your AI having more intelligence is generally useful, so e.g. starship computers are equipped with powerful general intelligence capabilities, which sometimes do get out of hand.

Dr. Soong’s achievement with Data was not in building a general intelligence, but in building a general intelligence which didn’t go crazy. (And before Data, he failed at that task once, with Lore.)

The Federation’s issue with AI is not that they haven’t solved artificial general intelligence. The Federation’s issue is that they haven’t reliably solved the AI alignment problem.

Some conceptual highlights from “Disjunctive Scenarios of Catastrophic AI Risk”

My forthcoming paper, “Disjunctive Scenarios of Catastrophic AI Risk”, attempts to introduce a number of considerations to the analysis of potential risks from Artificial General Intelligence (AGI). As the paper is long and occasionally makes for somewhat dry reading, I thought that I would briefly highlight a few of the key points raised in the paper.

The main idea here is that most of the discussion about risks of AGI has been framed in terms of a scenario that goes something along the lines of “a research group develops AGI, that AGI develops to become superintelligent, escapes from its creators, and takes over the world”. While that is one scenario that could happen, focusing too much on any single scenario makes us more likely to miss alternative scenarios. It also makes the scenarios susceptible to criticism from people who (correctly!) point out that we are postulating very specific scenarios that have lots of burdensome details.

To address that, I discuss here a number of considerations that suggest disjunctive paths to catastrophic outcomes: paths that are of the form “A or B or C could happen, and any one of them happening could have bad consequences”.

Superintelligence versus Crucial Capabilities

Bostrom’s Superintelligence, as well as a number of other sources, basically makes the following argument:

  1. An AGI could become superintelligent
  2. Superintelligence would enable the AGI to take over the world

This is an important argument to make and analyze, since superintelligence basically represents an extreme case: if an individual AGI may become as powerful as it gets, how do we prepare for that eventuality? As long as there is a plausible chance for such an extreme case to be realized, it must be taken into account.

However, it is probably a mistake to focus only on the case of superintelligence. Basically, the reason why we are interested in a superintelligence is that, by assumption, it has the cognitive capabilities necessary for a world takeover. But what about an AGI which also had the cognitive capabilities necessary for taking over the world, and only those?

Such an AGI might not count as a superintelligence in the traditional sense, since it would not be superhumanly capable in every domain. Yet, it would still be one that we should be concerned about. If we focus too much on just the superintelligence case, we might miss the emergence of a “dumb” AGI which nevertheless had the crucial capabilities necessary for a world takeover.

That raises the question of what might be such crucial capabilities. I don’t have a comprehensive answer; in my paper, I focus mostly on the kinds of capabilities that could be used to inflict major damage: social manipulation, cyberwarfare, biological warfare. Others no doubt exist.

A possibly useful framing for future investigations might be, “what level of capability would an AGI need to achieve in a crucial capability in order to be dangerous”, where the definition of “dangerous” is free to vary based on how serious of a risk we are concerned about. One complication here is that this is a highly contextual question – with a superintelligence we can assume that the AGI may get basically omnipotent, but such a simplifying assumption won’t help us here. For example, the level of offensive biowarfare capability that would pose a major risk depends on the level of the world’s defensive biowarfare capabilities. Also, we know that it’s possible to inflict enormous damage on humanity even with just human-level intelligence: whoever is authorized to control the arsenal of a nuclear power could trigger World War III, no superhuman smarts needed.

Crucial capabilities are a disjunctive consideration because they show that superintelligence isn’t the only level of capability that would pose a major risk: there are many different combinations of various capabilities – including ones that we don’t even know about yet – that could pose the same level of danger as superintelligence.

Incidentally, this shows one reason why the common criticism of “superintelligence isn’t something that we need to worry about because intelligence isn’t unidimensional” is ill-founded – the AGI doesn’t need to be superintelligent in every dimension of intelligence, just the ones we care about.

How would the AGI get free and powerful?

In the prototypical AGI risk scenario, we are assuming that the developers of the AGI want to keep it strictly under control, whereas the AGI itself has a motive to break free. This has led to various discussions about the feasibility of “oracle AI” or “AI confinement” – ways to restrict the AGI’s ability to act freely in the world, while still making use of it. This also means that the AGI might have a hard time acquiring the resources that it needs for a world takeover, since it either has to do so while it is under constant supervision by its creators, or while on the run from them.

However, there are also alternative scenarios where the AGI’s creators voluntarily let it free – or even place it in control of e.g. a major corporation, free to use that corporation’s resources as it desires! My chapter discusses several ways by which this could happen: i) economic benefit or competitive pressure, ii) criminal or terrorist reasons, iii) ethical or philosophical reasons, iv) confidence in the AI’s safety, as well as v) desperate circumstances such as being otherwise close to death. See the chapter for more details on each of these. Furthermore, the AGI could remain theoretically confined but be practically in control anyway – such as in a situation where it was officially only giving a corporation advice, but its advice had never been wrong before and nobody wanted to risk their jobs by going against the advice.

Would the Treacherous Turn involve a Decisive Strategic Advantage?

Looking at crucial capabilities in a more fine-grained manner also raises the question of when an AGI would start acting against humanity’s interests. In the typical superintelligence scenario, we assume that it will do so once it is in a position to achieve what Bostrom calls a Decisive Strategic Advantage (DSA): “a level of technological and other advantages sufficient to enable [an AI] to achieve complete world domination”. After all, if you are capable of achieving superintelligence and a DSA, why act any earlier than that?

Even when dealing with superintelligences, however, the case isn’t quite as clear-cut. Suppose that there are two AGI systems, each potentially capable of achieving a DSA if they prepare for long enough. But the longer that they prepare, the more likely it becomes that the other AGI sets its plans in motion first, and achieves an advantage over the other. Thus, if several AGI projects exist, each AGI is incentivized to take action at the point that maximizes its overall probability of success. That could mean acting even when it has only rather slim chances of succeeding in the takeover, if it thinks that waiting for longer would make its chances even worse.
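To make the timing argument concrete, here is a toy sketch in Python. It is not from the paper, and every number in it is a made-up assumption: the AGI's own chance of success is assumed to grow linearly the longer it prepares, while a rival is assumed to have a fixed chance per time step of acting first. The function names and parameters are purely illustrative.

```python
def own_success_probability(t, start=0.05, growth=0.03):
    """Hypothetical chance that a takeover attempted at step t succeeds,
    ignoring rivals. Assumed to grow linearly with preparation time."""
    return min(1.0, start + growth * t)


def overall_success_probability(t, rival_hazard):
    """Chance of success for the plan "wait t steps, then act".

    The plan only works if no rival acts during the t steps of waiting,
    which is assumed to have probability (1 - rival_hazard) ** t.
    """
    return (1.0 - rival_hazard) ** t * own_success_probability(t)


def best_time_to_act(rival_hazard, horizon=40):
    """Return the step in 0..horizon that maximizes overall success."""
    return max(
        range(horizon + 1),
        key=lambda t: overall_success_probability(t, rival_hazard),
    )


if __name__ == "__main__":
    # Compare a world with no rivals to worlds with increasingly pushy rivals.
    for hazard in (0.0, 0.05, 0.3):
        t = best_time_to_act(hazard)
        print(
            f"rival hazard {hazard}: act at step {t}, "
            f"own success chance {own_success_probability(t):.2f}"
        )
```

Under these assumptions, an AGI with no rivals waits until its own success is certain, whereas one facing a high chance of being preempted acts almost immediately, even though its own odds are still poor. Different assumed growth curves would move the numbers but not the qualitative pattern.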

Indeed, an AGI which defects on its creators may not be going for a world takeover in the first place: it might, for instance, simply be trying to maneuver itself into a position where it can act more autonomously and defeat takeover attempts by other, more powerful AGIs. The threshold for the first treacherous turn could vary quite a bit, depending on the goals and assets of the different AGIs; various considerations are discussed in the paper.

A large reason for analyzing these kinds of scenarios is that, besides caring about existential risks, we also care about catastrophic risks – such as an AGI acting too early and launching a plan which resulted in “merely” hundreds of millions of deaths. My paper introduces the term Major Strategic Advantage, defined as “a level of technological and other advantages sufficient to pose a catastrophic risk to human society”. A catastrophic risk is one that might inflict serious damage to human well-being on a global scale and cause ten million or more fatalities.

“Mere” catastrophic risks could also turn into existential ones, if they contribute to global turbulence (Bostrom et al. 2017), a situation in which existing institutions are challenged, and coordination and long-term planning become more difficult. Global turbulence could then contribute to another out-of-control AI project failing even more catastrophically and causing even more damage.

Summary table and example scenarios

The table below summarizes the various alternatives explored in the paper.

AI’s level of strategic advantage
  • Decisive
  • Major
AI’s capability threshold for non-cooperation
  • Very low to very high, depending on various factors
Sources of AI capability
  • Individual takeoff
    • Hardware overhang
    • Speed explosion
    • Intelligence explosion
  • Collective takeoff
  • Crucial capabilities
    • Biowarfare
    • Cyberwarfare
    • Social manipulation
    • Something else
  • Gradual shift in power
Ways for the AI to achieve autonomy
  • Escape
    • Social manipulation
    • Technical weakness
  • Voluntarily released
    • Economic or competitive reasons
    • Criminal or terrorist reasons
    • Ethical or philosophical reasons
    • Desperation
    • Confidence
      • in lack of capability
      • in values
  • Confined but effectively in control
Number of AIs
  • Single
  • Multiple

And here are some example scenarios formed by different combinations of them:

The classic takeover

(Decisive strategic advantage, high capability threshold, intelligence explosion, escaped AI, single AI)

The “classic” AI takeover scenario: an AI is developed, which eventually becomes better at AI design than its programmers. The AI uses this ability to undergo an intelligence explosion, and eventually escapes to the Internet from its confinement. After acquiring sufficient influence and resources in secret, it carries out a strike against humanity, eliminating humanity as a dominant player on Earth so that it can proceed with its own plans unhindered.

The gradual takeover

(Major strategic advantage, high capability threshold, gradual shift in power, released for economic reasons, multiple AIs)

Many corporations, governments, and individuals voluntarily turn over functions to AIs, until we are dependent on AI systems. These are initially narrow-AI systems, but continued upgrades push some of them to the level of having general intelligence. Gradually, they start making all the decisions. We know that letting them run things is risky, but now a lot of stuff is built around them, it brings a profit and they’re really good at giving us nice stuff – for the time being.

The wars of the desperate AIs

(Major strategic advantage, low capability threshold, crucial capabilities, escaped AIs, multiple AIs)

Many different actors develop AI systems. Most of these prototypes are unaligned with human values and not yet enormously capable, but many of these AIs reason that some other prototype might be more capable. As a result, they attempt to defect on humanity despite knowing their chances of success to be low, reasoning that they would have an even lower chance of achieving their goals if they did not defect. Society is hit by various out-of-control systems with crucial capabilities that manage to do catastrophic damage before being contained.

Is humanity feeling lucky?

(Decisive strategic advantage, high capability threshold, crucial capabilities, confined but effectively in control, single AI)

Google begins to make decisions about product launches and strategies as guided by their strategic advisor AI. This allows them to become even more powerful and influential than they already are. Nudged by the strategy AI, they start taking increasingly questionable actions that increase their power; they are too powerful for society to put a stop to them. Hard-to-understand code written by the strategy AI detects and subtly sabotages other people’s AI projects, until Google establishes itself as the dominant world power.
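For readers who like to see the structure spelled out, the summary table and the four example scenarios above can be encoded as plain data. This is only an illustrative sketch: the dictionary keys and helper names are my own, the sub-options of each table entry are collapsed into their parent, and the capability threshold (which the table describes as ranging from very low to very high) is coarsened into just “low” and “high”.

```python
# Dimensions from the summary table. Field names are illustrative labels,
# and the threshold is coarsened to two levels as a simplifying assumption.
DIMENSIONS = {
    "strategic_advantage": {"decisive", "major"},
    "capability_threshold": {"low", "high"},
    "capability_source": {
        "hardware overhang", "speed explosion", "intelligence explosion",
        "collective takeoff", "crucial capabilities", "gradual shift in power",
    },
    "autonomy": {
        "escape", "voluntarily released", "confined but effectively in control",
    },
    "number_of_ais": {"single", "multiple"},
}

# The four example scenarios, tagged as in their parenthetical summaries.
SCENARIOS = {
    "The classic takeover": {
        "strategic_advantage": "decisive", "capability_threshold": "high",
        "capability_source": "intelligence explosion",
        "autonomy": "escape", "number_of_ais": "single",
    },
    "The gradual takeover": {
        "strategic_advantage": "major", "capability_threshold": "high",
        "capability_source": "gradual shift in power",
        "autonomy": "voluntarily released", "number_of_ais": "multiple",
    },
    "The wars of the desperate AIs": {
        "strategic_advantage": "major", "capability_threshold": "low",
        "capability_source": "crucial capabilities",
        "autonomy": "escape", "number_of_ais": "multiple",
    },
    "Is humanity feeling lucky?": {
        "strategic_advantage": "decisive", "capability_threshold": "high",
        "capability_source": "crucial capabilities",
        "autonomy": "confined but effectively in control",
        "number_of_ais": "single",
    },
}


def invalid_fields(scenario):
    """Return the dimensions whose value is missing or not a listed option."""
    return [
        dim for dim, options in DIMENSIONS.items()
        if scenario.get(dim) not in options
    ]


def matching(**criteria):
    """Return the names of scenarios whose fields equal all given criteria."""
    return [
        name for name, scenario in SCENARIOS.items()
        if all(scenario.get(key) == value for key, value in criteria.items())
    ]


if __name__ == "__main__":
    for name, scenario in SCENARIOS.items():
        assert not invalid_fields(scenario), name
    print(matching(number_of_ais="multiple"))
    print(matching(capability_source="crucial capabilities"))
```

The four scenarios cover only a handful of the possible combinations, which is the disjunctive point: many other mixes of the same ingredients could be written down in the same way.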

This blog post was written as part of work for the Foundational Research Institute.