AI risk model: single or multiple AIs?

EDIT April 20th: Replaced original graph with a clearer one.

My previous posts have basically been discussing a scenario where a single AI becomes powerful enough to threaten humanity. However, there is no reason to only focus on the scenario with a single AI. Depending on our assumptions, a number of AIs could also emerge at the same time. Here are some considerations.

A single AI

The classic AI risk scenario. Some research group achieves major headway in developing AI, and no others seem to be within reach. For an extended while, it is the success or failure of this AI group that matters.

This would seem relatively unlikely to persist, given the current fierce competition in the AI scene. Whereas a single company could conceivably achieve a major lead in a rare niche with little competition, this seems unlikely to be the case for AI.

A possible exception might be if a company managed to monopolize the domain entirely, or if it had development resources that few others did. For example, companies such as Google and Facebook are currently the only ones with access to large datasets used for machine learning. On the other hand, dependence on such huge datasets is a quirk of current machine learning techniques – an AGI would need the ability to learn from much smaller sets of data. A more plausible crucial asset might be something like supercomputing resources – possibly the first AGIs will need massive amounts of computing power.

Bostrom (2016) discusses the impact of openness on AI development. Bostrom notes that if there is a large degree of openness, and everyone has access to the same algorithms, then hardware may become the primary limiting factor. If the hardware requirements for AI were relatively low, then high openness could lead to the creation of multiple AIs. On the other hand, if hardware was the primary limiting factor and large amounts of hardware were needed, then a few wealthy organizations might be able to monopolize AI for a while.

Branwen (2015) has suggested that hardware production is reliant on a small number of centralized factories that would make easy targets for regulation. This would suggest a possible route by which AI might become amenable to government regulation, limiting the number of AIs deployed.

Similarly, there have been various proposals of government and international regulation of AI development. If successfully enacted, such regulation might limit the number of AIs that were deployed.

Another possible crucial asset would be the possession of a non-obvious breakthrough insight, one which would be hard for other researchers to come up with. If this were kept secret, then a single company might plausibly gain a major lead over others. [how often has something like this actually happened in a non-niche field?]

The plausibility of the single-AI scenario is also affected by the length of the takeoff. If one presumes a takeoff lasting only a few months, then a single-AI scenario seems more likely. Successful AI containment procedures may also increase the chances of there being multiple AIs, as the first AIs remain contained, allowing other projects to catch up.

Multiple collaborating AIs

A different scenario is one where a number of AIs exist, all pursuing shared goals. This seems most likely to come about if all the AIs are created by the same actor. This scenario is noteworthy because the AIs do not necessarily need to be superintelligent individually, but they may have a superhuman ability to coordinate and put the interest of the group above individual interests (if they even have anything that could be called an individual interest).

This possibility raises the question – if multiple AIs collaborate and share information between each other, to such an extent that the same data can be processed by multiple AIs at a time, how does one distinguish between multiple collaborating AIs and one AI composed of many subunits? This is arguably not a distinction that would “cut reality at the joints”, and the difference may be more a question of degree.

The distinction likely makes more sense if the AIs cannot completely share information between each other, such as because each of them has developed a unique conceptual network, and cannot directly integrate information from the others but has to process it in its own idiosyncratic way.

Multiple AIs with differing goals

A situation with multiple AIs that did not share the same goals could occur if several actors reached the capability to build AIs at around the same time. Alternatively, a single organization might deploy multiple AIs intended to achieve different purposes, which might come into conflict if measures to enforce cooperativeness between them failed or were never deployed in the first place (perhaps because of an assumption that the AIs would have non-overlapping domains).

One effect of having multiple groups developing AIs is that this scenario may remove the possibility of pausing to pursue further safety measures before deploying the AI, or of deploying an AI with safeguards that reduce its performance (Bostrom 2016). If the actor that deploys the most effective AI earliest can dominate others who take more time, then the more safety-conscious actors may never have the time to deploy their AIs.

Even if none of the AI projects chose to deploy their AIs carelessly, the more AI projects there are, the more likely it becomes that at least one of them will have its containment procedures fail.

The possibility has been raised that having multiple AIs with conflicting goals would be a good thing, in that it would allow humanity to play the AIs against each other. This seems far from obvious, for it is not clear why humans wouldn’t simply be caught in the crossfire. In a situation with superintelligent agents around, it seems more likely that humans would be the ones being played.

Bostrom (2016) also notes that unanticipated interactions between AIs already happen even with very simple systems, such as in the interactions that led to the Flash Crash, and that AIs reasoning in non-human ways, in particular, could be very difficult for humans to anticipate once they started basing their behavior on what the other AIs did.

A model with assumptions

[Figure: the graphical model of the scenario, rendered with GraphViz]

Here’s a new graphical model of an AI scenario, embodying a specific set of assumptions. It takes a look at some of the factors that influence whether there might be a single AI or several.

This model both makes a great number of assumptions AND leaves out many important considerations! For example, although I discussed openness above, openness is not explicitly included in this model. By sharing this, I’m hoping to draw commentary on 1) which assumptions people feel are the most shaky and 2) which additional ones are valid and should be explicitly included. I’ll focus on those in future posts.

Written explanations of the model (a rough GraphViz sketch of the same structure follows after these notes):

We may end up in a scenario where there is (for a while) only a single or a small number of AIs if at least one of the following is true:

  • The breakthrough needed for creating AI is highly non-obvious, so that it takes a long time for competitors to figure it out
  • AI requires a great amount of hardware and only a few of the relevant players can afford to run it
  • There is effective regulation, only allowing some authorized groups to develop AI

We may end up with effective regulation at least if:

  • AI requires a great amount of hardware, and hardware is effectively regulated

(this is not meant to be the only way by which effective regulation can occur, just the only one that was included in this flowchart)

We may end up in a scenario where there are a large number of AIs if:

  • There is a long takeoff and competition to build them (i.e. ineffective regulation)

If there are few AIs, and the people building them take the time to invest in value alignment and/or are prepared to build AIs that are value-aligned even if that makes them less effective, then there may be a positive outcome.

If the people building AIs do not do these things, then the AIs are not value-aligned and there may be a negative outcome.

If there are many AIs, then even if some of the people building them are ready to invest time and efficiency into value-aligned AI, those AIs may be outcompeted by AIs whose creators did not invest in those things, and there may be a negative outcome.

Not displayed in the diagram because it would have looked messy:

  • If there’s a very short takeoff, this can also lead to there being only a single AI, since the first AI to cross a critical threshold may achieve dominance over all the others. However, if there is fierce competition, this still doesn’t necessarily leave time for safeguards or for careful safety work – other teams may also be near the critical threshold.
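
Finally, for those who want to tinker with the model, here is a rough GraphViz (DOT) sketch of the flowchart’s structure, as promised above. The node names and edge labels below are shorthand for the written explanations rather than the exact wording of the diagram, so treat it as an approximate reconstruction; the file name in the render command (model.dot) is just an example. It can be rendered with e.g. dot -Tpng model.dot -o model.png.

    // Rough DOT sketch of the model described above.
    // Node names and labels are shorthand, not the diagram's exact wording.
    digraph ai_scenario_model {
        rankdir=TB;
        node [shape=box, style=rounded];

        breakthrough  [label="Breakthrough needed for AI is highly non-obvious"];
        hardware      [label="AI requires a great amount of hardware"];
        hw_regulation [label="Hardware is effectively regulated"];
        regulation    [label="Effective regulation of AI development"];
        competition   [label="Long takeoff and competition to build AIs\n(ineffective regulation)"];
        few_ais       [label="Single or small number of AIs"];
        many_ais      [label="Large number of AIs"];
        alignment     [label="Builders invest time / efficiency\nin value alignment?", shape=diamond];
        positive      [label="Positive outcome"];
        negative      [label="Negative outcome"];

        // Routes to a single or small number of AIs
        breakthrough -> few_ais;
        hardware     -> few_ais;
        regulation   -> few_ais;

        // The only route to effective regulation included in the flowchart
        hardware      -> regulation;
        hw_regulation -> regulation;

        // Route to a large number of AIs
        competition -> many_ais;

        // Outcomes
        few_ais   -> alignment;
        alignment -> positive [label="yes"];
        alignment -> negative [label="no"];
        many_ais  -> negative [label="value-aligned AIs\nget outcompeted"];
    }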

This blog post was written as part of research funded by the Foundational Research Institute.

6 comments

  1. “This possibility raises the question – if multiple AIs collaborate and share information between each other, to such an extent that the same data can be processed by multiple AIs at a time, how does one distinguish between multiple collaborating AIs and one AI composed of many subunits?”

    1) Keep humans involved in the junctures between specialised AIs.

    2) Keep intelligence and agency separate. Agency is optional.

    • It sounds like this would be hard to accomplish if there’s competition where others are deploying increasingly autonomous AI, and can thus outperform your system that’s bottlenecked by the slow humans. It could work if there were fewer AIs, though.

      • 1. Something like keeping humans in the loop seems to happen naturally, as a result of the need for organisations to keep their systems in line with their aims.

        2. There could be a legal requirement to keep humans in the loop, cf. autopilots.

        3. The most powerful AIs, or AI complexes in this case, will be few.

  2. > if the AIs cannot completely share information between each other, such as because each of them has developed a unique conceptual network, and cannot directly integrate information from the others but has to process it in its own idiosyncratic way.

    That’s true of many present-day AIs too in trivial sorts of ways. For instance, in 2009, my class built a music search engine, which took outputs from several different feature-processing steps and combined them into a final score.

    I agree that distinguishing one vs. multiple AIs that have the same goals is fuzzy and maybe not as relevant as distinguishing agents with different, competing goals.

    > it is not clear why humans wouldn’t simply be caught in the crossfire

    Yeah. Also, the extra AIs in that scenario might have worse values, not better ones, than the original, single AI.

    People have an intuition that checks and balances are good to prevent corruption, but that observation mainly applies in the human realm, where absolute power corrupts absolutely. AIs with absolute power needn’t become tyrannical (in principle).

    > a single or a small number of AIs

    I think having two AIs could be quite different from having one AI. Having two global superpowers is like the US vs. the USSR, with a nuclear arms race. Having just one superpower would have been very different.

    Another factor that could matter a lot is whether AI development is nationalized (by the military probably) and how many countries have top-notch AI talent.

    It’s worth noting that your “positive” outcome means “positive outcome for certain human values”. It’s not obvious that an uncontrolled AI is worse in terms of some human values, such as suffering reduction or death minimization.

    • > Another factor that could matter a lot is whether AI development is nationalized (by the military probably) and how many countries have top-notch AI talent.

      Thanks, this is a good factor to consider.

      > It’s worth noting that your “positive” outcome means “positive outcome for certain human values”.

      Good point. I was actually going to change the terms “positive outcome” and “negative outcome” to “explicitly value-aligned outcome” and “not explicitly value-aligned outcome”. This would be to indicate that even if efforts to make the AI’s values explicitly aligned with those of its builders failed, its behavior might still end up being positive in terms of some other values.

  3. > People have an intuition that checks and balances are good to prevent corruption, but that observation applies in the human realm, where absolute power corrupts absolutely. AIs with absolute power needn’t become tyrannical (in principle).

    AIs with absolute power are difficult to get right. You essentially have to get a very difficult problem exactly right the first time. That’s not a good position to be in.

    Powerful AIs may not be subject to the corrupting effects of power, but that is not the only reason to avoid powerful singletons, or to promote checks and balances.
