Some AI-based systems may start to “cheat” with very specific consequences for humanity.
Impressive as they are, many observers such as Elon Musk agree that the technologies associated with artificial intelligence also entail significant risks that need to be anticipated today. It’s also the conclusion of a chilling new research paper that believes this technology poses a real existential threat to humanity.
This is far from the first time we have seen a resurgence of this discourse; even if this statement is based on very serious grounds, it is often accompanied by rather caricatured arguments, if not completely fantastic.
But this time the situation is completely different. It starts with the identity of these whistleblowers. It’s not just a few freaks blowing air deep in the dark forum; We owe this work to very serious researchers from reliable and prestigious institutions, namely the University of Oxford and DeepMind, one of the world leaders in the field of artificial intelligence.
In short, the Cadores who wouldn’t take responsibility without a good reason. And when they, too, begin to argue that humanity has largely underestimated the dangers associated with AI, it is better to listen. Moreover, they make technical arguments that seem more than convincing.
GAN, (too?) powerful programs
Their postulate is contained in the sentence, which is also the title of their research paper: “advanced artificial agents interfere with the reward process.” To understand this confusing statement, we must start by looking at the concept of a Generative Adversarial Network or GAN.
GANs are programs developed by engineer Ian Goodfellow. In short, they function through two relatively independent subroutines that oppose each other—hence the term adversarial. On the one hand, we have a relatively standard neural network that learns over iterations.
On the other hand, there is a second network that controls the training of the first one. Like a teacher, he reviews his friend’s results to let him know if the learning is progressing in the desired direction. If the results are satisfactory, the first network receives a virtual “reward” that encourages it to continue in the same direction. Otherwise, he receives a reprimand, which tells him that he went on the wrong track.
This concept works so well that GANs are now used in many areas. The problem is that once driven into these trenches, this architecture can be quite disastrous.
The article’s key assertion is contained in the title: Advanced artificial agents interfere with the reward process. We also argue that AI interference in the provision of rewards can have very bad consequences. 2/15
— Michael Cohen (@Michael05156007) September 6, 2022
What if the AI cheats?
This model could indeed encourage AI to develop a strategy that would allow it to “intervene in the reward process,” as the title of the article explains. In other words, these algorithms can start to “cheat” in order to get as many “rewards” as possible… even if it leaves people behind.
And what makes this article both disturbing and very interesting is that it’s not about killer robots or other bizarre predictions modeled after science fiction; The disaster scenario proposed by the researchers is based on a very specific problem, namely the amount of resources available on our planet.
The authors imagine some kind of great zero-sum game in which, on the one hand, humanity, which needs to support itself, and on the other, a program that would use all the resources at its disposal without the slightest consideration, just to get these famous rewards.
In essence, the program will behave in the same way as a trained puppy that will steal food directly from the bag, and not respond to the commands of its owner in order to receive a reward.
Imagine, for example, a medical AI designed to diagnose serious pathologies. In such a scenario, the program can find a way to “cheat” to get a reward, even if it offers a misdiagnosis. Then he would no longer have the slightest interest in the correct definition of diseases.
Instead, he will be content to produce completely false results in industrial quantities just to get his shot right, even if that means completely deviating from his original purpose and embezzling all the electricity available on the grid.
Another approach to competition between man and machine
And this is just the tip of a giant iceberg. “In a world with limited resources, there will inevitably be competition for resources,” Michael Cohen, lead author of the study, told Motherboard. “And if you are competing with something that can move forward at every turn, you should not expect to win,” he insists.
Winning the “use the last bit of available energy” contest against something much smarter than us is likely to be very difficult. Losing would be fatal. 12/15
— Michael Cohen (@Michael05156007) September 6, 2022
“And losing this game would be fatal,” he insists. Thus, together with his team, he came to the conclusion that the destruction of humanity by AI is no longer just “possible”; now “probably” if AI research continues at the current pace.
And here the boot is tight. Because this technology is a great tool that is already working wonders in many areas. And this is probably just the beginning; AI in its broadest sense still has enormous potential that we may not yet have fully realized. Today, AI undoubtedly represents an added value for humanity, and therefore there is a real interest in taking this work as far as possible.
The precautionary principle must have the last word
But it also means that we may be getting closer and closer to this dystopian scenario. Obviously, it should be remembered that for the time being they remain hypothetical. But that’s why the researchers insist on the importance of maintaining control over this work; he believes that it would be pointless to give in to the temptation of unrestrained exploration, knowing that we are still very far from exploring the full possibilities of modern technologies.
“Given our current understanding, this is not something worth developing unless you do some serious work to figure out how to control them,” Cohen concludes.
However, without succumbing to catastrophism, this work is a reminder that we will have to be especially careful at every major stage of AI research, and even more so when it comes to trusting them with critical systems.
After all, those looking for a moral to the story can draw on the conclusion of the excellent War Games; this anticipatory film, released in 1983 and still relevant today, excels at this theme. And, as WOPR aptly put it in the final scene, the only way to win this strange game could be… just to refrain from participating in it.
The text of the study is available here.