Software for Moral Enhancement

We all have our weak moments. Moments when we know the right thing to do, but are too tired, too afraid, or too frustrated to do it. So we slip up, and do something that we’ll regret.

An algorithm will never slip up in a weak moment. What if we could identify when we are likely to make mistakes, figure out what we’d want to do instead, and then outsource our decisions to a reliable algorithm? In what ways could we use software to make ourselves into better people?

Passive moral enhancement

One way of doing this might be called passive moral enhancement, because it happens even without anyone thinking about it. For example, if you own a self-driving car, you will never feel the temptation to drink and drive. You can drink as much as you want, but your car will always do the driving, so your drinking will never endanger others.

In a sense this is an uninteresting kind of moral enhancement, since there is nothing novel about it. Technological advancement has always changed the options that we have available to us, and made some vices less tempting while making others more tempting.

In another sense, this is a very interesting kind of change, because simply removing the temptation to do bad is a very powerful way to make progress. If you like drinking, it’s a pure win for you to get to drink rather than having to stay sober just because you’re driving. If we could systematically engineer forms of passive moral enhancement into society, everyone would be better off.

Of course, technology doesn’t always reduce the temptation to do bad. It can also open up new, tempting options for vice. We also need to find ways for people to more actively reshape their moral landscape.

A screenshot from the GoodGuide application.

Reshaping the moral landscape

Pictured above is a screenshot from GoodGuide, an application which rates the health, environmental, and societal impact of different products on a scale from 1 to 10, making it easier to choose sustainable products. This is an existing application, but similar ideas could be taken much further.

Imagine having an application which allowed you to specify what you considered to be an ethical product and what kinds of things you needed or liked. Then it would go online and do your shopping for you, automatically choosing the products that best fit your needs and which were also the most ethical by your criteria.

Or maybe your criteria would act as a filter on a search engine, filtering out any products you considered unethical – thus completely removing the temptation to ever buy them, because you’d never even see them.
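
To make the idea concrete, here is a minimal sketch of how such a filter might work – the product names, prices, and ethics scores are all made up for illustration, not taken from GoodGuide or any real catalog:

```python
# A minimal sketch of the idea, not GoodGuide's actual API or data:
# the products and their ethics scores below are hypothetical examples.
from dataclasses import dataclass

@dataclass
class Product:
    name: str
    price: float
    ethics_score: float  # 1-10 rating combining health, environmental, and societal impact

def ethical_search(products, query, min_ethics=7.0, budget=None):
    """Return products matching the query, hiding anything below the ethics threshold."""
    matches = [
        p for p in products
        if query.lower() in p.name.lower()
        and p.ethics_score >= min_ethics
        and (budget is None or p.price <= budget)
    ]
    # Rank the survivors: most ethical first, cheapest as a tie-breaker.
    return sorted(matches, key=lambda p: (-p.ethics_score, p.price))

catalog = [
    Product("Fair-trade coffee", 8.50, 9.2),
    Product("Budget coffee", 4.00, 3.1),
    Product("Organic coffee", 7.00, 7.8),
]

for p in ethical_search(catalog, "coffee"):
    print(p.name, p.ethics_score)
# "Budget coffee" is filtered out, so the temptation to buy it never even appears.
```

The key design choice is that the filtering happens before you ever see the options: anything below your own ethics threshold simply never shows up as a temptation in the first place.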

Would this be enough? Would people be sufficiently motivated to set and use such criteria, just out of the goodness of their hearts?

Probably many would. But it would still be good to also create better incentives for moral behavior.

Software to incentivize moral behavior

This six-way kidney exchange was carried out in 2015 at the California Pacific Medical Center. Sutter Health/California Pacific Medical Center.

Pictured above is a chain of kidney donations created by organ-matching software.

Here’s how it works. Suppose that my mother has failing kidneys, and that I would like to help her by giving her one of my kidneys. Unfortunately, despite our being closely related, the compatibility between our kidneys is poor: a direct donation from me to her would be unlikely to succeed.

Fortunately, organ-matching software manages to place us in a chain of exchanges. We are offered a deal. If I donate my kidney to Alice, who’s a complete stranger to me, then another stranger will donate their kidney – which happens to be an excellent match – to my mother. And as a condition for Alice getting a new kidney, Alice’s brother agrees to donate his kidney to another person. That person’s mother agrees to donate her kidney to the next person, and that person’s husband agrees to donate his kidney… and so on. In this way, what was originally a single donation can be transformed into a chain of donations.
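
The real matching software solves a hard optimization problem over thousands of donor–patient pairs; the toy sketch below only illustrates the core idea, with made-up pairs and a drastically simplified blood-type compatibility rule:

```python
# A toy sketch of the chain-finding idea, not the real matching software:
# the pairs and the compatibility rule below are simplified assumptions.

# Each incompatible donor-patient pair: (donor blood type, patient blood type).
pairs = {
    "me_and_mother":     ("A", "B"),
    "alice_and_brother": ("B", "A"),
    "pair_3":            ("A", "O"),
    "pair_4":            ("O", "A"),
}

def donor_can_give(donor_type, patient_type):
    """Simplified compatibility: type O donates to anyone, otherwise types must match."""
    return donor_type == "O" or donor_type == patient_type

def longest_chain(current, remaining, chain):
    """Depth-first search for the longest chain of donations starting from `current`."""
    best = chain
    donor_type = pairs[current][0]
    for nxt in remaining:
        if donor_can_give(donor_type, pairs[nxt][1]):
            candidate = longest_chain(nxt, remaining - {nxt}, chain + [nxt])
            if len(candidate) > len(best):
                best = candidate
    return best

all_pairs = set(pairs)
chains = [longest_chain(p, all_pairs - {p}, [p]) for p in all_pairs]
print(" -> ".join(max(chains, key=len)))
# Each pair's donor gives to the next pair's patient, turning four separate
# incompatible pairs into one chain of donations.
```

Even this naive search shows the principle: every compatible link the software finds turns a willing-but-incompatible donor into one more step in a longer chain.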

As a result of this chain, people who would usually have no interest in helping strangers end up doing so, because they want to help their close ones. By setting up the chain, the software has aligned our concern for our loved ones with helping strangers.

The more we can develop ways of incentivizing altruism, the better off society will become.

Is this moral enhancement?

At this point, someone might object to calling these things moral enhancement. Is it really moral enhancement if we are removing temptations and changing incentives so that people do more good? How is that better morality – wouldn’t better morality mean making the right decisions when faced with hard dilemmas, rather than dodging the dilemmas entirely?

My response would be that much of the progress of civilization consists precisely of making it easier to be moral.

I have had the privilege of growing up in a country that is wealthy and safe enough that I have never needed to steal or kill. I have never been placed in a situation where those would have been sensible options, let alone necessary for my survival. And because I’ve had the luck of never needing to do those things, it has been easy for me to internalize that killing people or stealing from them are things that you simply don’t do.

Obviously it’s also possible for someone to decide that stealing and killing are wrong despite growing up in a society where doing those things is sometimes necessary. Yet, living in a safer society means that people don’t have to decide it – they just take it for granted. And societies where people have seen less conflict tend to be safer and have more trust in general.

If we can make it easier for people to act in the right way, then more people will end up behaving in ways that make both themselves and others better off. I’d be happy to call that moral enhancement.

Whatever we decide to call it, we have an opportunity to use technology to make the world a better place.

Let’s get to it.

An appreciation of the Less Wrong Sequences

Ruby Bloom recently posted about the impact that Eliezer Yudkowsky‘s Less Wrong Sequences have had on his thinking. I felt compelled to do the same.
 
Several people have explicitly told me that I’m one of the most rational people they know. I can also think of at least one case where I was complimented by someone who was politically “my sworn enemy”, who said something along the lines of “I do grant that *your* arguments for your position are good, it’s just everyone *else* on your side…”, which I take as some evidence of me being able to maintain at least some semblance of sanity even when talking about politics.
 
(Seeing what I’ve written above, I cringe a little, since “I’m so rational” sounds so much like an over-the-top, arrogant boast. I certainly have plenty of my own biases, as does everyone who is human. Imagining yourself to be perfectly rational is a pretty good way of ensuring that you won’t be, so I’d never claim to be exceptional based only on my self-judgment. But this is what several people have explicitly told me, independently of each other, sometimes also staking part of their own reputation on it by stating this in public.)
 
However.
 
Before reading the Sequences, I was very definitely *not* that. I was what the Sequences would call “a clever arguer” – someone who was good at coming up with arguments for their own favored position, and didn’t really feel all that compelled to care about the truth.
 
The single biggest impact of the Sequences that I can think of is this: before reading them, as well as Eliezer’s other writings, I didn’t really think that beliefs had to be supported by evidence.
 
Sure, on some level I acknowledged that you can’t just believe *anything* you can find a clever argument for. But I do also remember thinking something like “yeah, I know that everyone thinks that their position is the correct one just because it’s theirs, but at the same time I just *know* that my position is correct just because it’s mine, and everyone else having that certainty for contradictory beliefs doesn’t change that, you know?”.
 
This wasn’t a reductio ad absurdum, it was my genuine position. I had a clear emotional *certainty* of being right about something, a certainty which wasn’t really supported by any evidence and which didn’t need to be. The feeling of certainty was enough by itself; the only thing that mattered was finding the evidence to (selectively) present to others in order to persuade them. Which it likely wouldn’t, since they’d have their own feelings of certainty, similarly blind to most evidence. But they might at least be forced to concede the argument in public.
 
It was the Sequences that first changed that. It was reading them that made me actually realize, on an emotional level, that correct beliefs *actually* required evidence. That this wasn’t just a game of social convention, but a law of the universe as iron-clad as the laws of physics. That if I caught myself arguing for a position where I was making arguments that I knew to be weak, the correct thing to do wasn’t to hope that my opponents wouldn’t spot the weaknesses, but rather to just abandon those weak arguments myself. And then to question whether I even *should* believe that position, having realized that my arguments were weak.
 
I can’t say that the Sequences alone were enough to take me *all* the way to where I am now. But they made me more receptive to other people pointing out when I was biased, or incorrect. More humble, more willing to take differing positions into account. And as people pointed out more problems in my thinking, I gradually learned to correct some of those problems, internalizing the feedback.
 
Again, I don’t want to claim that I’d be entirely rational. That’d just be stupid. But to the extent that I’m more rational than average, it all got started with the Sequences.
 
Ruby wrote:
I was thinking through some challenges and I noticed the sheer density of rationality concepts taught in the Sequences which I was using: “motivated cognition”, “reversed stupidity is not intelligence”, “don’t waste energy on thoughts which won’t have been useful in universes where you win” (possibly not in the Sequences), “condition on all the evidence you have”. These are fundamental concepts, core lessons which shape my thinking constantly. I am a better reasoner, a clearer thinker, and I get closer to the truth because of the Sequences. In my gut, I feel like the version of me who never read the Sequences is epistemically equivalent to a crystal-toting anti-vaxxer (probably not true, but that’s how it feels) who I’d struggle to have a conversation with.

And my mind still boggles that the Sequences were written by a single person. A single person is responsible for so much of how I think, the concepts I employ, how I view the world and try to affect it. If this seems scary, realise that I’d much rather have my thinking shaped by one sane person than a dozen mad ones. In fact, it’s more scary to think that had Eliezer not written the Sequences, I might be that anti-vaxxer equivalent version of me.

I feel very similarly. I have slightly more difficulty pointing to specific concepts from the Sequences that I employ in my daily thinking, because they’ve become so deeply integrated into my thought that I’m no longer explicitly aware of them; but I do remember a period in which they were still in the process of being integrated, and when I explicitly noticed myself using them.
 
Thank you, Eliezer.
 
(There’s a collected and edited version of the Sequences available in ebook form. I would recommend trying to read it one article at a time, one per day: that’s how I originally read the Sequences, one article a day as they were being written. That way, they would gradually seep their way into my thoughts over an extended period of time, letting me apply them in various situations. I wouldn’t expect just binge-reading the book in one go to have the same impact, even though it would likely still be of some use.)

Error in Armstrong and Sotala 2012

Katja Grace has analyzed my and Stuart Armstrong’s 2012 paper “How We’re Predicting AI – or Failing To”. She discovered that one of the conclusions, “predictions made by AI experts were indistinguishable from those of non-experts”, is flawed due to “a spreadsheet construction and interpretation error”. In other words, I coded the data in one way, there was a communication error and a misunderstanding about what the data meant, and as a result of that, a flawed conclusion slipped into the paper.

I’m naturally embarrassed that this happened. But Katja was only able to spot this error because we’d made our data freely available, allowing her to notice the discrepancy. This is why data sharing is something that science needs more of. Mistakes happen to everyone, and transparency is the only way to have a chance of spotting those mistakes.

I regret the fact that we screwed up this bit, but I’m proud of the fact that we shared our data and made it possible for someone to catch the mistake.

EDITED TO ADD: Some people have taken this mistake to suggest that the overall conclusion – that AI experts are not good predictors of AI timelines – is flawed. That would overstate the significance of this mistake. While one of the lines of evidence supporting this overall conclusion was flawed, several others are unaffected by this error: expert predictions disagree widely with each other, many past predictions have turned out to be false, and the psychological literature on what’s required for the development of expertise suggests that it should be very hard to develop expertise in this domain. (See the original paper for details.)

(I’ve added a note of this mistake to my list of papers.)

Smile, You Are On Tumblr.Com

I made a new tumblr blog. It has photos of smiling people! With more to come!

Why? Previously I happened to need pictures of smiles for a personal project. After going through an archive of photos for a while, I realized that looking at all the happy people made me feel really happy and good. So I thought that I might make a habit out of looking at photos of smiling people, and sharing them.

Follow for a regular extra dose of happiness!