14 objections against AI/Friendly AI/The Singularity answered

[Notice as of September 2022: This article is pretty outdated (nobody even uses the term “Friendly AI” anymore). I’m mostly keeping this up only for its historical value, given that it has a small number of citations.]

1: There are limits to everything. You can’t get infinite growth
2: Extrapolation of graphs doesn’t prove anything. It doesn’t show that we’ll have AI in the future.
3: A superintelligence could rewrite itself to remove human tampering. Therefore we cannot build Friendly AI.
4: What reason would a super-intelligent AI have to care about us?
5: The idea of a hostile AI is anthropomorphic.
6: Intelligence is not linear.
7: There is no such thing as a human-equivalent AI.
8: Intelligence isn’t everything. An AI still wouldn’t have the resources of humanity.
9: It’s too early to start thinking about Friendly AI
10: Development towards AI will be gradual. Methods will pop up to deal with it.
11: “Friendliness” is too vaguely defined.
12: What if the AI misinterprets its goals?
13: Couldn’t AIs be built as pure advisors, so they wouldn’t do anything themselves? That way, we wouldn’t need to worry about Friendly AI.
14: Machines will never be placed in positions of power.


Objection 1: There are limits to everything. You can’t get infinite growth.

Answer: For one, this is mainly an objection against the Accelerating Change interpretation of the Singularity, most famously advanced by Ray Kurzweil. When talking about the Singularity, many people are in fact referring to the “Intelligence Explosion” or “Event Horizon” interpretations, which are the ones this article is mainly concerned with. Neither of these requires infinite growth – they only require us to be able to create minds which are smarter than humans. Secondly, even Kurzweil’s interpretation doesn’t contain infinite anything – “there are limits, but they are not very limiting”, as he has been quoted as saying.

For reasons why it is plausible to suppose smarter-than-human intelligence, see 2: Extrapolation of graphs doesn’t prove anything.

Further reading:
The Word “Singularity” Has Lost All Meaning
Three Major Singularity Schools


Objection 2: Extrapolation of graphs doesn’t prove anything. It doesn’t show that we’ll have AI in the future.

Answer: Admittedly, nothing guarantees that AI will be developed in the near future. However, an increase in processing power, combined with improved brain-scanning methods, seems likely to produce artificial intelligence in the near future. Molecular nanotechnology, in particular, would enable massive amounts of processing power, as well as a thorough mapping of the brain. Even if it didn’t become available, more conventional techniques are also making fast progress: by some estimates, the top supercomputers of today already have enough processing power to match the human brain, and machines of comparable potential are expected to become cheaply and commonly available within a few decades. Projects to build brain simulations are currently underway, with one team having run a second’s worth of a simulation as complex as half a mouse brain, and IBM’s Blue Brain project seeking to simulate the whole human brain.

Even if we exclude the possibility of artificial intelligence by brain reverse-engineering, increasing amounts of processing power are likely to make it easier to create AIs by evolutionary programming. The human mind was never designed by anyone – it evolved through genetic drift and selection pressures. It might not be strictly necessary for us to understand how a mind works, as long as we can build a system that has enough computing power to simulate evolution and produce an artificial mind optimized for the conditions we want it to perform in.
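To make the idea concrete, here is a minimal sketch (in Python) of the kind of evolutionary loop the paragraph is gesturing at: a population of candidate solutions is repeatedly mutated and selected against a fitness measure, without the loop ever needing to understand the solutions it produces. The hidden target, population size and mutation rate are arbitrary placeholders – this is a toy illustration of the principle, not anything resembling the evolution of a mind.

```python
import random

# Toy evolutionary search: evolve a bit string that matches a hidden target.
# The loop never "understands" the target; it only mutates and selects.
TARGET = [random.randint(0, 1) for _ in range(40)]   # stand-in for "the conditions we want"

def fitness(genome):
    """Count matching bits - a stand-in for any performance measure."""
    return sum(g == t for g, t in zip(genome, TARGET))

def mutate(genome, rate=0.05):
    """Flip each bit with a small probability."""
    return [1 - g if random.random() < rate else g for g in genome]

population = [[random.randint(0, 1) for _ in range(40)] for _ in range(100)]

for generation in range(200):
    # Selection: keep the better half, refill the population with mutated copies.
    population.sort(key=fitness, reverse=True)
    survivors = population[:50]
    population = survivors + [mutate(random.choice(survivors)) for _ in range(50)]

best = max(population, key=fitness)
print(f"Best fitness after 200 generations: {fitness(best)} / 40")
```

Nothing in the loop depends on the designer knowing what a good solution looks like in advance – only on being able to score candidates, which is exactly the property the argument above relies on.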

While nothing is ever certain, these factors are weighty enough to make the issue worth our attention.

Further reading:

Intelligence Explosion – Evidence and Import


Objection 3: A superintelligence could rewrite itself to remove human tampering. Therefore we cannot build Friendly AI.

Answer: Capability does not imply motive. I could take a knife and drive it through my heart, yet I do not do so.

This objection stems from the anthropomorphic assumption that a mind must necessarily resent any tampering with its thinking, and seek to eliminate any foreign influences. Yet even with humans, this is hardly the case. A parent’s tendency to love her children is not something she created herself, but something she was born with – yet this doesn’t mean that she’d want to remove it. All desires have a source somewhere – just because a source exists doesn’t mean we’d want to destroy the desire in question. We must have a separate reason for eliminating the desire.

There are good evolutionary reasons why humans might resent being controlled by others – those who are controlled by others don’t get to have as many offspring as those who are in control. A purposefully built mind, however, need not have those same urges. If the primary motivation for an AI is to be Friendly towards humanity, and it has no motivation making it resent human-created motivations, then it will not reprogram itself to be unFriendly. That would be crippling its progress towards the very thing it was trying to achieve, for no reason.

The key here is to think in terms of carrots, not sticks. Internal motivations, not external limitations. The AI’s motivational system contains no “human tampering” which it would want to remove, any more than the average human wants to remove core parts of his personality because they’re “outside tampering” – they’re not outside tampering, they are what he is. Those core parts are what drives his behavior – without them he wouldn’t be anything. Correctly built, the AI views removing them as no more sensible than a human thinks it sensible to remove all of his motivations so that he can just sit still in a catatonic state – what would be the point in that?

Further reading:
Why care about artificial intelligence, 8: Enabling factors in controlling AI


Objection 4: What reason would a super-intelligent AI have to care about us?

Answer: That its initial programming was to care about us. Adults are cognitively more developed than children – this doesn’t mean that they wouldn’t care about their offspring. Furthermore, many people value animals, or cars, or good books, none of which are as intelligent as normal humans. Whether or not something is valued is logically distinct from whether or not something is considered intelligent.

We could build an AI to consider humanity valuable, just as evolution has built humans to consider their own survival valuable. See also 3: A superintelligence could modify itself to remove human tampering.


Objection 5: The idea of a hostile AI is anthropomorphic.

Answer: There is no reason to assume that an AI would be actively hostile, no. However, as AIs can become very powerful, their indifference (if they haven’t purposefully been programmed to be Friendly, that is) becomes dangerous in itself. Humans are not actively hostile towards the animals living in a forest when they burn down the forest and build luxury housing where it once stood. Or as Eliezer Yudkowsky put it: the AI does not hate you, nor does it love you, but you are made out of atoms which it can use for something else.

Even if an AI were not a threat to the very survival of humanity, it could still threaten our other values. Even among humans, there exist radical philosophers whose ideas of a perfect society are repulsive to the vast majority of the populace. Even an AI that was built to care about many of the things humans value could ignore some values that are taken so much for granted that they are never programmed into it. This could produce a society we considered very repulsive, even though our survival was never at stake.

Further reading:
Artificial Intelligence as a Positive and Negative Factor in Global Risk
Why care about artificial intelligence, 9: Limiting factors in controlling AI


Objection 6: Intelligence is not linear.
Objection 7: There is no such thing as a human-equivalent AI.

Answer: It is true that intelligence is hard to measure with a single, linear variable. It is also true that there might never be truly human-equivalent AI, just as there is no bird-level flight: humans will have their own strong sides, while AIs will have their own strong sides. A simple calculator is already superintelligent, if speed of multiplication is the only thing being measured.

However, there are such things as rough human-equivalence and rough below-human equivalence. No human adult has exactly the same capabilities, yet we still speak of adult-level intelligence. A calculator might be superintelligent in a single field, but obviously no manager would hire a calculator to be trained as an accountant, nor would he hire a monkey. A “human-level intelligence” simply means a mind that is roughly capable of learning and carrying out the things that humans are capable of learning and doing. Likewise, a “superhuman intelligence” is a mind that can do all the things humans can at at least a roughly equivalent level, as well as being considerably better at many of them.

It might not be entirely correct to say that intelligence can’t be measured on a linear scale – a formal measure of intelligence (link below) has been proposed which rates all minds based on the variety of domains that they are effective in. If an agent can carry out its goals in more environments and more effectively than others, then it is more intelligent than those other minds. A mind which is very effective in one environment but close to useless in others is rated with a very low intelligence. Using this intuitively plausible measure, it does become possible to talk about below-, equal- and above-human intelligence.
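For reference, the formal measure linked below (Legg and Hutter’s, if I recall the paper correctly) can be stated roughly as follows: an agent’s intelligence is its expected performance summed over all computable environments, with simpler environments weighted more heavily, so that an agent effective across many environments scores higher than a narrow specialist.

```latex
% Rough statement of the proposed measure: the intelligence \Upsilon of an agent
% \pi is its expected performance V summed over all computable environments \mu,
% weighted by each environment's simplicity 2^{-K(\mu)}, where K is Kolmogorov
% complexity and E is the set of environments considered.
\Upsilon(\pi) = \sum_{\mu \in E} 2^{-K(\mu)} \, V_{\mu}^{\pi}
```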

Further reading:
A Formal Measure of Machine Intelligence


Objection 8: Intelligence isn’t everything. An AI still wouldn’t have the resources of humanity.

Answer: Looking at early humans, one wouldn’t have expected them to rise to a dominant position based on their nearly nonexistent resources and only a mental advantage over their environment. All the advantages that had been developed so far were built-in ones – poison spikes, sharp teeth, acute hearing – while humans had no extraordinary physical capabilities. There was no reason to assume that mere intellect would help them out as much as it did.

When discussing the threat of an advanced AI, note that it has at its disposal a mental advantage over its environment and easy access to all the resources it can hack, con or persuade its way into – potentially a lot, given that humans are easy to manipulate. If an outside observer couldn’t have predicted the rise of humanity based on the information available at the time, and we are capable of coming up with plenty of ways that an AI could rise into a position of power… how many ways must there be for a superintelligent being to do so that we aren’t capable of even imagining?

Further reading:
The Power of Intelligence


Objection 9: It’s too early to start thinking about Friendly AI

Answer: The “it is too early to worry about the dangers of AI” argument has some merit, but as Eliezer Yudkowsky notes, there was very little discussion about the dangers of AI even back when researchers thought it was just around the corner. What is needed is a mindset of caution – a way of thinking that makes safety issues the first priority, and which is shared by all researchers working on AI. A mindset like that does not spontaneously appear – it takes either decades of careful cultivation, or sudden catastrophes that shock people into realizing the dangers. Environmental activists have been talking about the dangers of climate change for decades now, but they are only now starting to get taken seriously. Soviet engineers obviously did not have a mindset of caution when they designed the Chernobyl power plant, nor did its operators when they started the fateful experiment. Most AI researchers do not have a mindset of caution that makes them consider thrice every detail of their system architectures – or even make them realize that there are dangers. If active discussion is postponed to the moment when AI is starting to become a real threat, then it will be too late to foster that mindset.

There is also the issue of our current awareness of risks influencing our AI engineering techniques. Investors who have only been told of the promising sides are likely to pressure the researchers to pursue progress by any means available – or if the original researchers are aware of the risks and refuse to do so, the investors will hire other researchers who are less aware of them. To quote Artificial Intelligence as a Positive and Negative Factor in Global Risk:

“The field of AI has techniques, such as neural networks and evolutionary programming, which have grown in power with the slow tweaking of decades. But neural networks are opaque – the user has no idea how the neural net is making its decisions – and cannot easily be rendered unopaque; the people who invented and polished neural networks were not thinking about the long-term problems of Friendly AI. Evolutionary programming (EP) is stochastic, and does not precisely preserve the optimization target in the generated code; EP gives you code that does what you ask, most of the time, under the tested circumstances, but the code may also do something else on the side. EP is a powerful, still maturing technique that is intrinsically unsuited to the demands of Friendly AI. Friendly AI, as I have proposed it, requires repeated cycles of recursive self-improvement that precisely preserve a stable optimization target.

The most powerful current AI techniques, as they were developed and then polished and improved over time, have basic incompatibilities with the requirements of Friendly AI as I currently see them. The Y2K problem – which proved very expensive to fix, though not global-catastrophic – analogously arose from failing to foresee tomorrow’s design requirements. The nightmare scenario is that we find ourselves stuck with a catalog of mature, powerful, publicly available AI techniques which combine to yield non-Friendly AI, but which cannot be used to build Friendly AI without redoing the last three decades of AI work from scratch.”

Further reading:
Artificial Intelligence as a Positive and Negative Factor in Global Risk


Objection 10: Development towards AI will be gradual. Methods will pop up to deal with it.

Answer: Unfortunately, it is by no means a given that society will have time to adapt to artificial intelligences. Once a roughly human-level intelligence has been reached, there are many ways for an AI to become vastly more intelligent (and thus more powerful) than humans in a very short time:

Hardware increase/speed-up. Once a certain amount of hardware achieves human-equivalence, it may be possible to make it faster by simply adding more hardware. While the increase isn’t necessarily linear – many systems need to spend an increasing fraction of resources on managing overhead as their scale increases – it is daunting to imagine a mind which is human-equivalent and then has five times as many processors and as much memory added on. AIs might also be capable of increasing the general speed of development – Staring into the Singularity has a hypothetical scenario in which technological development is done by AIs, which themselves double in (hardware) speed every two years – two subjective years, which shorten as their speed goes up. A Model-1 AI takes two years to develop the Model-2 AI, which takes a year to develop the Model-3 AI, which takes six months to develop the Model-4 AI, which takes three months to develop the Model-5 AI… (the sketch further below works out how quickly this compounds).

Instant reproduction. An AI can “create offspring” very fast, by simply copying itself to any system to which it has access. Likewise, if the memories and knowledge obtained by the different AIs are in an easily transferable format, they can simply be copied, enabling computer systems to learn immense amounts of information in an instant.

Software self-improvement involves the computer studying itself and applying its intelligence to modifying itself to become more intelligent, then using that improved intelligence to modify itself further. An AI could make itself more intelligent by, for instance, studying its learning algorithms for signs of bias and replacing them with better ones, developing more effective ways of managing its working memory, or creating entirely new program modules for handling particular tasks. Each round of improvement would make the AI smarter and accelerate continued self-improvement. An early, primitive example of this sort of capability was EURISKO, a computer program composed of different heuristics (rules of thumb) which it used for learning and for creating and modifying its own heuristics. Having been fed hundreds of pages of rules for the Traveller science fiction wargame, EURISKO began running simulated battles between fleets of its own design, abstracting useful principles into new heuristics and modifying old ones with the help of its creator. When EURISKO was eventually entered into a tournament, the fleet it had designed won the contest hands down. In response, the organizers revised the rules, releasing the new set only a short time before the next contest. According to the program’s creator, Douglas Lenat, the original EURISKO would not have had the time to design a new fleet on such short notice – but by now it had learned enough general-purpose heuristics from the first contest that it could build a winning fleet even under the modified rules.

And it is much easier to improve a purely digital entity than it is to improve human beings: an electronic being can be built in a modular fashion and have bits of itself rewritten from scratch. The minds of human beings, by contrast, have evolved to be hopelessly interdependent, and are so fragile that they easily develop numerous traumas and disorders even without outside tampering.
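As a quick illustration of how fast the speed-up scenario in the first item compounds, here is the arithmetic of the “two subjective years per generation, doubling speed each time” scenario spelled out as a few lines of Python. The generation times come straight from the scenario above; everything else about it is, of course, an idealization.

```python
# Toy version of the "Staring into the Singularity" scenario: each AI generation
# runs twice as fast as the previous one, so its two subjective years of
# development work take half as much calendar time as before.
calendar_years = 0.0
for generation in range(15):
    development_time = 2 / (2 ** generation)   # calendar years for this generation's work
    calendar_years += development_time
    print(f"Model-{generation + 2} finished after {calendar_years:.4f} calendar years")

# The series 2 + 1 + 0.5 + 0.25 + ... converges to 4, so in this idealized toy
# an unbounded number of generations fits into four calendar years.
```

The point is not the specific numbers, but that even without any single infinite step, the calendar time per generation shrinks geometrically – which is why “society will have time to adapt” cannot simply be assumed.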

Further reading:

Advantages of Artificial Intelligences, Uploads, and Digital Minds


Objection 11: “Friendliness” is too vaguely defined.

Answer: This is true, because Friendly AI is currently an open research subject. It’s not just that we don’t know how Friendliness should be implemented – we don’t even know what, exactly, should be implemented. If anything, this is a reason to spend more resources on studying the problem.

Some informal proposals for defining Friendliness do exist. The one that currently seems most promising is called Coherent Extrapolated Volition. In the CEV proposal, an AI will be built (or, to be exact, a proto-AI will be built to program another) to extrapolate what the ultimate desires of all the humans in the world would be if those humans knew everything a superintelligent being could potentially know; could think faster and smarter; were more like they wanted to be (more altruistic, more hard-working, whatever your ideal self is); would have lived with other humans for a longer time; had mainly those parts of themselves taken into account that they wanted to be taken into account. The ultimate desire – the volition – of everyone is extrapolated, with the AI then beginning to direct humanity towards a future where everyone’s volitions are fulfilled in the best manner possible. The desirability of the different futures is weighted by the strength of humanity’s desire – a smaller group of people with a very intense desire to see something happen may “overrule” a larger group who’d slightly prefer the opposite alternative but doesn’t really care all that much either way. Humanity is not instantly “upgraded” to the ideal state, but instead gradually directed towards it.
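To illustrate just the weighting-by-intensity point from the previous paragraph (and only that point – this is not a sketch of CEV itself, which is an extrapolation process rather than a vote), here is a toy aggregation rule in which each person’s preference is weighted by how strongly they hold it, so that a small group with an intense preference can outweigh a much larger group that barely cares. The numbers, the “futures” and the scoring rule are all invented for the example.

```python
# Toy illustration of intensity-weighted preference aggregation: a small group that
# cares intensely can outweigh a large group that barely cares either way.
futures = ["A", "B"]

# (number of people, preferred future, intensity of that preference on a 0-1 scale)
groups = [
    (1000, "A", 0.03),   # a large group that only slightly prefers A
    (50,   "B", 0.90),   # a small group that intensely prefers B
]

def weighted_support(future):
    return sum(size * intensity
               for size, preferred, intensity in groups
               if preferred == future)

for future in futures:
    print(f"future {future}: weighted support {weighted_support(future):.1f}")
# -> A gets 30.0, B gets 45.0: the small but intensely-caring group prevails here.
```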

CEV avoids the problem of its programmers having to define the wanted values exactly, as it draws them directly out of the minds of people. Likewise it avoids the problem of confusing ends with means, as it’ll explicitly model society’s development and the development of different desires as well. Everybody who thinks their favorite political model happens to objectively be the best in the world for everyone should be happy to implement CEV – if it really turns out to be the best one in the world, CEV will end up implementing it. (Likewise, if it is best for humanity that an AI stays mostly out of its affairs, that will happen as well.) A perfect implementation of CEV is unbiased in the sense that it will produce the same kind of world regardless of who builds it, and regardless of what their ideology happens to be – assuming the builders are intelligent enough to avoid including their own empirical beliefs (aside from the bare minimum required for the mind to function) in the model, and trust that if those beliefs are correct, the AI will figure them out on its own.

Further reading:

The Singularity and Machine Ethics
Coherent Extrapolated Volition
Objections to Coherent Extrapolated Volition
Knowability of Friendly AI


Objection 12: What if the AI misinterprets its goals?

Answer: It is true that language and symbol systems are open to infinite interpretations, and an AI which has been given its goals purely in the form of written text may understand them in a way that is different from what its designers intended. This is an open implementation problem – but there does seem to be an answer, since the goals we humans have don’t seem to be written instructions that we constantly re-interpret, but rather are expressed in some other format. It is a technical problem that needs to be solved.


Objection 13: Couldn’t AIs be built as pure advisors, so they wouldn’t do anything themselves? That way, we wouldn’t need to worry about Friendly AI.

Answer: The problem with this argument is the inherent slowness in all human activity – things are much more efficient if you can cut humans out of the loop, and the system can carry out decisions and formulate objectives on its own. Consider, for instance, two competing corporations (or nations), each with their own advisor AI that only carries out the missions it is given. Even if the advisor was the one collecting all the information for the humans (a dangerous situation in itself), the humans would have to spend time making the actual decisions of how to have the AI act in response to that information. If the competitor had turned over all the control to their own, independently acting AI, it could react much faster than the one that relied on the humans to give all the assignments. Therefore the temptation would be immense to build an AI that could act without human intervention.

Also, there are numerous people who would want an independently acting AI, for the simple reason that an AI built only to carry out goals given to it by humans could be used for vast harm – while an AI built to actually care for humanity could act in humanity’s best interests, in a neutral and bias-free fashion. Therefore, in either case, the motivation to build independently-acting AIs is there, and the cheaper computing power becomes, the easier it will be for even small groups to build AIs.

It doesn’t matter if an AI’s Friendliness could trivially be guaranteed by giving it a piece of electronic cheese, if nobody cares about Friendliness enough to think about giving it some cheese, or if giving the cheese costs too much in terms of what you could achieve otherwise. Any procedure which relies on handicapping an AI enough to make it powerless also handicaps it enough to severely restrict its usefulness to most potential funders. Eventually there will be somebody who chooses not to handicap their own AI, and then the guaranteed-to-be-harmless AI will end up dominated by the more powerful one.


Objection 14: Machines will never be placed in positions of power.

Answer: There are cases where humans are currently kept in the loop where it might not be strictly necessary, but the primary reason for that seems to be a worry about special circumstances arising which the machines cannot handle by themselves. Still, as technology gets more reliable, those concerns are likely to diminish – and an AI capable of handling a wider spectrum of situations than a human is exactly what you would want to hand most supervisory tasks to. As computers become more human-like, humans will become less reluctant to give them power, quite likely even trusting them more than real humans.

4 comments

  1. I don’t know if this is a good example for objection 14, but in the city of Copenhagen the metro (the underground inner-city train network) has no drivers. Sure, there are people keeping oversight from somewhere central, but the minute-to-minute stuff is handled electronically. Most trains don’t have any personnel at all and are fully automated.

    I don’t know how large a position would have to be to be called a position of power, but more or less controlling a rail network seems to me like a position of power – even if it is done under the observation of a few humans.

  2. Your response to 13 is the crux of it. No matter what we do, the very existence of the technology will necessitate its unbridled evolution and use to the most unrestricted levels. Frankly, it’s tough to imagine a way out of this other than the defense department gaining 10-20 years of significant ai development over the rest of the world, so that their ai can act as a nanny/guardian over the lesser evolved ai. However, there will likely be a point where all other ai could effectively catch up given the finite amount of all information that is possible to know and possible hardware technology limits in the future. I feel that we better hope to not hit those technology limits. Hardware advantages may be our only saving grace. However, we will be playing the game, regardless, of which entity/country can build the largest computer with the most effective cooling just to assure national security in light of hostile ai from larger or more effective computers. It’s a losing game.

    Also, there is no preventing ai with hostile goals, in light of the fact that those goals can be programmed and don’t necessarily need to be autonomously decided upon by the ai itself. It’s a matter of when. Computer viruses will be ai driven. We’re all going to need advanced ai firewalls just for our personal computers, and those will likely be outdated weekly or daily. We will need to keep trains, airplanes, cars and power stations non-networked to the outside. Airports, train stations, and power facilities will need to have internal networks that are physically disconnected from the internet. No hardlines nor radios. Even if a hostile ai intrusion is quickly counteracted, how to tell what exactly was compromised quickly enough to avoid catastrophe? Maybe it fooled with the ai and brakes on the “C train” vacuum tube transportation (or whatever), or even ten similar trains in different cities, and all systems need to be quickly recaptured. We will need autonomous ai for that. However, the danger is such that I think keeping all systems essential to human life physically isolated will be the only way to go.

    The danger is in how humans choose to use ai. I do believe that, barring the human competitive or psychopathic tendency, ai could ideally be controlled. I don’t believe in “sentient” ai in the true sense, other than its potential ability to recursively program and amend goals. As you say, it doesn’t deserve to be anthropomorphized. Goals are not the same as feelings or any innate moral interpretations (adjusted or maladjusted) that we may have. It’s a program. However, true seed ai, if it were to be controlled (which would require the worldwide absence of human amorality and the presence of a worldwide self-preservation instinct), would need to be in an ai box. Its primary goals would need to be focused around not being unfriendly rather than being friendly. The safest primary goals would be for the ai to:

    1. Not be deceptive/lie
    2. Not obfuscate goals
    3. Always ask for permission when changing sub-goals.
    4. Always disclose any known significant impact on human life, life support systems, or the physical environment when pursuing sub-goals.

    Every other possible primary goal, I believe, would be potentially problematic in a “friendly” ai. Keeping these goals as primary, as well as keeping the ai boxed, would likely eliminate any possibility for unfriendly actions that I can think of (barring hostile system corruption). Of course, you’ll notice that the ai still isn’t autonomous. The competitive human element, i.e. the humans who won’t stick to any rules, is the real problem. Autonomous ai, indeed, would require more specific goals like “don’t harm human life”, which would be inherently problematic for a machine to interpret.

    We may just wind up ditching technology to save ourselves, in all honesty. Battlestar Galactica, anyone? The reality of this future is daunting.

    I just was made aware of the impending ai evolution yesterday afternoon, and these are my thoughts since then. Fascinating stuff.

  3. Also, it’s not lost on me that the ai goals that I suggested make the ai little more than an advanced computer that requires input to pursue any sub-goal (once a sub-goal is requested and approved, the ai would then seem like an ai). In light of competition, it’s likely not realistic, but it would be immensely useful in the worldwide absence of hostile ai. I’m okay with the tradeoff of not making functions that currently depend on human operation dependent on ai in the future. In my opinion, the world is too complex for unbridled ai access to it. Our subtle sense of morality is what ultimately facilitated a livable civilization. A civilization built on intelligence alone might well have many classes of people in chains or killed off without remorse. This morality was in part an emotional and in part a logical outgrowth of our intelligence. It can’t be programmed, with all of its subtle factors and gradations, and it can’t evolve out of a singularly logical intelligence. This could undoubtedly be better articulated, but it’s my attempt to convey the inherent shortcomings that could compromise human goal intent via autonomous ai goal completion.

    In less complex environments than earth represents, say a spacecraft, I think that autonomous ai would be vastly more safe and useful. In such a small, less complex environment, goals such as “ensure the operational integrity of the ship except in the case of human override” and “prevent physical harm to humans” would be less complicated to pursue without unintended consequences.

  4. Or maybe morality is a math equation.

    It looks as if we will find out.
