How do you teach a bot?

How do you teach a bot right from wrong? 

It’s a question straight out of a movie, but it’s also something we’ll have to grapple with a lot sooner than you might think.

Take a self-driving car that has to choose between hitting a child in its way or slamming its own passenger into a barrier, or imagine a rescue bot that detects two injured people in the rubble of an earthquake but knows it doesn’t have time to save both.

How does that bot decide which person to try to save first? That’s something we, as a community, actually have to figure out. It’s a moral dilemma, and it’s why a team of scientists is attempting to build moral bots.

If autonomous bots are going to hang with us, we’re going to have to teach them how to behave, which means finding a way to make them aware of the values that matter most to us. But with morals, things get messy pretty quickly. Humans don’t really have concrete rules about what’s right and wrong, at least not ones we’ve managed to agree upon. What we have instead are norms: basically, thousands of fuzzy, contradictory guidelines. Norms help us predict how the people around us will behave and how they want us to behave. Right now, the major challenge in even thinking about bots that can understand and use moral norms is that we don’t understand very well how humans use them in the first place.

The big trick, especially if you’re a bot, is that none of these norms are absolute. In one situation a particular norm or value will feel extremely important, but change the scenario and you completely alter the rules of the game. So how can we build a bot that can figure out which norms to follow and when?

How To Build Moral Bots, According to Science

Researchers start by compiling a list of the words, ideas, and rules that people use to talk about morality – a basic moral vocabulary. The next step is figuring out how to quantify that vocabulary: how are these ideas related and organized in our minds? One theory is that the human moral landscape looks a lot like a semantic network, with clusters of closely related concepts that we become more or less aware of depending on the situation. In any particular context, a subset of norms is activated by the objects, the symbols, and the general knowledge we have about that situation. That activated subset guides action, helps us recognize violations, and allows us to make decisions.
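
To picture what that might look like in software, here’s a minimal, purely illustrative sketch in Python. Nothing in it comes from an actual research system: the norms, the link weights, the contexts, and the spreading-activation rule are all invented for the example.

```python
# Toy "norm network": a context lights up a cluster of related norms.
# All norms, links, weights, and contexts below are invented.

from collections import defaultdict

# Norms linked by how closely related they feel (made-up 0..1 weights).
NORM_LINKS = {
    "keep people safe": {"help the injured": 0.9, "don't damage property": 0.4},
    "help the injured": {"keep people safe": 0.9, "be honest": 0.3},
    "don't damage property": {"keep people safe": 0.4},
    "be honest": {"help the injured": 0.3},
}

# Each context directly triggers a few norms (again, invented).
CONTEXT_TRIGGERS = {
    "earthquake rescue": ["help the injured", "keep people safe"],
    "day at the beach": ["don't damage property", "be honest"],
}

def activated_norms(context, spread=0.5):
    """Return the norms 'lit up' by a context, spreading some activation to neighbors."""
    activation = defaultdict(float)
    for norm in CONTEXT_TRIGGERS.get(context, []):
        activation[norm] = 1.0  # directly triggered norms get full activation
        for neighbor, weight in NORM_LINKS.get(norm, {}).items():
            activation[neighbor] = max(activation[neighbor], spread * weight)
    return sorted(activation.items(), key=lambda kv: -kv[1])

if __name__ == "__main__":
    for norm, level in activated_norms("earthquake rescue"):
        print(f"{level:.2f}  {norm}")
```

Swap in a different context and a different cluster of norms lights up, which is the whole point of the activated-subset idea.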

The key here is that the relationships between these sub-networks are actually something you can measure. Pick a scenario, like a day at the beach, and ask a whole bunch of people how they think they’re supposed to behave. What are they supposed to do, and what are they absolutely not supposed to do? The order in which participants mention certain rules, the number of times they mention them, and the time that passes between mentioning one idea and the next are all concrete, measurable values. By collecting data from enough different situations, it’s possible to build a rough map of the human norm network. In the future, a bot might come equipped with a built-in version of that map, so it could call up the right moral framework for whatever situation is at hand.
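
As a rough, hypothetical sketch of how those measurements could become a map, here’s one way to score free-listed survey answers in Python. The scenario, the responses, and the scoring scheme (earlier mentions count more; norms mentioned together become linked) are assumptions for illustration, not the researchers’ actual method.

```python
# Turn free-listed survey answers into a rough "norm map".
# The responses and scoring scheme are made up for the example.

from collections import Counter
from itertools import combinations

# Each participant lists the norms they think apply to a scenario, in order.
beach_responses = [
    ["don't litter", "watch your kids", "keep the noise down"],
    ["watch your kids", "don't litter"],
    ["keep the noise down", "don't litter", "watch your kids"],
]

def norm_map(responses):
    """Score each norm by how often and how early it is mentioned,
    and count how often pairs of norms show up in the same answer."""
    salience = Counter()
    co_mention = Counter()
    for answer in responses:
        for rank, norm in enumerate(answer):
            salience[norm] += 1.0 / (rank + 1)  # earlier mentions weigh more
        for pair in combinations(sorted(set(answer)), 2):
            co_mention[pair] += 1  # co-mentions become edge weights
    return salience, co_mention

salience, co_mention = norm_map(beach_responses)
print("Salience:", salience.most_common())
print("Co-mentions:", co_mention.most_common())
```

Run this over many scenarios and the co-mention counts start to look like the edges of a norm network, one rough map per situation.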

But even if that bot could perfectly imitate a human decision-making process, is that something we’d really want?

We might actually want bots to make different decisions than the ones we’d want other humans to make.

To test this, researchers have asked people in experiments to imagine a classic moral dilemma: picture a runaway trolley in a coal mine that has lost its brakes. The trolley has four people on board and is hurtling toward a massive brick wall. There’s an alternate, safe track, but a repairman is standing on it, oblivious to what’s happening. Another worker nearby sees the situation and can pull a lever that would switch the trolley onto the second track, saving the passengers but killing the repairman. He has to choose.

So the fundamental moral dilemma is this: are you willing to intervene and kill one person to save four others, or do you refuse to intervene and let fate take its course?

Some of the participants watch a human make the decision, some see a humanoid bot, and some see a machine-like bot. Then the participants judge the decision the worker made. Generally, participants blame the human worker more when he flips the switch than when he does nothing. Apparently, watching another person make a cold, calculating decision to sacrifice a human life, even if it’s to save other lives, makes us kind of queasy.

But evidence suggests that we might actually expect a bot to flip the switch. Participants blame the bot more if it doesn’t step in and intervene, and the more machine-like the bot looks, the more they blame it for letting the four people die.

There’s one more interesting twist: if the bot or human in the story made an unpopular decision but then gave a reason for that choice, participants blamed that decision-maker less. And this is really important, because it shows that moral bots are going to need to communicate.

Communication is essential because moral norms aren’t fixed. We argue and reason about morality, we learn from each other, and we update our values as a group, and any moral bot will need to be part of that process. We’re still a long way from building a truly moral bot, but these studies could be the very first step.