How artificial intelligence learned to play Minecraft from 70,000 hours of YouTube videos

Will artificial intelligence ever be able to play Minecraft as well as, or even better than, a human? And more importantly, will it soon be able to learn much faster from ordinary videos posted online? That, in any case, is the goal of OpenAI, which has just presented the results of promising initial research.

Mastering Minecraft is a huge challenge for artificial intelligence, because Mojang's title is much harder for a computer to learn than chess or Go (games at which AI now crushes us). It's a very open game with flexible rules, and that's what makes it so charming! It offers the player almost complete freedom to explore, craft, dig, and build as they wish... In a word, to alternate between extremely diverse and complex actions that are difficult to teach to poor computer programs.

Yet this is exactly what the OpenAI research team has done. They trained a model from scratch... that manages to play Minecraft "correctly". What's more, their model plays strictly like a person, that is, using the standard interface and game controls: keystrokes and mouse movements.

Let's be clear: the OpenAI artificial intelligence playing Minecraft is unable to build a dream home from scratch, much less replicate King's Landing. It is content with much more modest tasks, but its performance is far from ridiculous. It manages to build a simple shelter, craft tools, and explore a village to open the chests there... It even managed, on several occasions, to craft a diamond pickaxe, which, according to OpenAI, is a world first. This is indeed far from a simple tool, requiring a long sequence of complex steps to gather, produce, and combine items.


An AI trained on YouTube

How did the OpenAI researchers arrive at this result? The answer fits almost in one word: YouTube. They took advantage of the incredible wealth and variety of Minecraft videos on the web to "feed" their model, which then drew on what it saw to learn how to play.

Well, it's not quite that simple, of course. Far from it. Their methodology, which they call VPT (for Video PreTraining), first consisted of collecting 70,000 hours (!) of usable footage by relying on the "little hands" of workers hired through Amazon Mechanical Turk. Using a screenshot from each video, these workers checked whether the selected content was actually usable for the project, starting from an initial corpus of 270,000 hours!
For example, videos recorded in creative mode, or containing logos or visual artifacts that could interfere with the machine's understanding, had to be discarded.
Finally, one last subtlety: to make it easier for their "toddler" to start the game, the researchers also extracted a subset of those thousands of hours containing only the beginning of a game.
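To make this curation step concrete, here is a toy sketch of the filtering idea. The field names and the sample corpus are hypothetical; the `is_clean` flag stands in for the human judgments collected on Amazon Mechanical Turk.

```python
# Toy sketch of the data-curation step (hypothetical structure, not OpenAI's code):
# keep only videos flagged as clean survival gameplay, drop creative-mode
# footage and videos with overlaid logos or artifacts.

def filter_corpus(videos):
    """Return only the videos whose screenshot was judged usable."""
    return [v for v in videos if v["is_clean"]]

raw_corpus = [
    {"id": "a", "hours": 3.0, "is_clean": True},   # normal survival gameplay
    {"id": "b", "hours": 2.0, "is_clean": False},  # creative mode -> discard
    {"id": "c", "hours": 1.5, "is_clean": False},  # overlaid logo -> discard
    {"id": "d", "hours": 4.0, "is_clean": True},
]

clean = filter_corpus(raw_corpus)
print(sum(v["hours"] for v in clean))  # 7.0 hours survive out of 10.5
```

In the real project this judgment was made at scale by crowdworkers (and a classifier trained on their labels), but the principle is the same: shrink a huge, noisy corpus down to footage the model can safely imitate.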

But the researchers did not stop there: they also built a second model (an Inverse Dynamics Model, or IDM) from a second set of Minecraft videos. This corpus is much smaller (only 2,000 hours) but has the advantage of being very precisely labeled, because these sessions were recorded specifically for the experiment by a handful of experienced Minecraft players. The researchers were thus able to record every mouse movement and every keyboard input.

The first stages of training // Photo: OpenAI

They then applied this model to the 70,000 hours of videos they had previously gleaned from the internet. Their AI was thus able to "guess" the mouse and keyboard actions performed in those videos and learn from them. Clever.
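The two-stage idea described above can be sketched in a few lines. Everything here is illustrative: the action names, the toy "frames", and the trivial `idm_predict` stand-in are assumptions, not OpenAI's actual model.

```python
# Minimal sketch of the VPT pseudo-labeling idea (hypothetical, not OpenAI's code).
# Stage 1: an Inverse Dynamics Model (IDM) learns, from the small fully-labeled
# 2,000-hour set, to predict which action was taken between two consecutive frames.
# Stage 2: the trained IDM pseudo-labels the 70,000 hours of unlabeled web video,
# producing (frame -> action) pairs for behavioral cloning.

ACTIONS = ["forward", "jump", "attack", "open_inventory"]

def idm_predict(frame_before, frame_after):
    """Stand-in for the IDM: infer the action from how the frame changed.
    Here we just compare toy integer 'frames'; the real IDM is a deep network."""
    delta = frame_after - frame_before
    return ACTIONS[delta % len(ACTIONS)]

def pseudo_label(video):
    """Turn an unlabeled video (a list of frames) into (frame, action) pairs."""
    return [(f0, idm_predict(f0, f1)) for f0, f1 in zip(video, video[1:])]

unlabeled_video = [0, 1, 3, 4, 8]        # toy stand-in for raw YouTube frames
dataset = pseudo_label(unlabeled_video)  # training data for behavioral cloning
print(dataset[0])  # (0, 'jump')
```

The key trick is that the IDM sees both the frame before and the frame after an action, which makes guessing the action far easier than playing the game; those guessed labels then bootstrap a policy trained on the huge unlabeled corpus.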

Once trained in this way, the AI was already capable of performing some basic in-game tasks, such as chopping a tree into logs, turning logs into planks, then crafting a crafting table from four planks. This is obviously a trivial sequence for a human player, but according to OpenAI it is almost impossible to achieve with reinforcement learning alone.

It’s good but not enough

This was already a good result, but not enough for our researchers. So they fine-tuned their AI to give it more skill, successfully using two different techniques. First, they asked their Minecraft experts to record 10-minute sessions in which the goal was to build a small house out of basic materials.
By fine-tuning the model on these sessions, they found that the AI could not only build a modest shelter for itself, but also go much further in crafting complex items (such as a stone pickaxe, for example).

Finally, they also used reinforcement learning, tasking their AI with obtaining a diamond pickaxe within ten minutes of play on a fresh map and rewarding it for its progress. No easy feat, given that this requires collecting, combining, and crafting a long sequence of items that are not easy to find. But the model still managed to do it several times, in 2.5% of the ten-minute games played.
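One common way to reward progress along such a long crafting chain is a milestone-based shaped reward. The sketch below is a hypothetical illustration of that idea; the milestone names follow the game's actual tech tree, but the weights are illustrative and not OpenAI's actual reward schedule.

```python
# Hypothetical sketch of a shaped reward for the diamond-pickaxe task:
# the agent earns a reward for each milestone on the crafting chain it
# reaches at least once during an episode. Values are illustrative only.

MILESTONES = [
    "log", "planks", "crafting_table", "wooden_pickaxe",
    "cobblestone", "stone_pickaxe", "iron_ore", "furnace",
    "iron_ingot", "iron_pickaxe", "diamond", "diamond_pickaxe",
]

def episode_reward(items_obtained):
    """Sum a reward for each distinct milestone reached this episode.
    Later milestones are rarer, so they are weighted more heavily."""
    reached = set(items_obtained)
    return sum(2 ** i for i, item in enumerate(MILESTONES) if item in reached)

# An episode that stalls at the stone pickaxe earns far less than a full run:
partial = episode_reward(["log", "planks", "crafting_table", "stone_pickaxe"])
full = episode_reward(MILESTONES)
print(partial, full)  # 39 4095
```

Shaping the reward this way gives the agent a learning signal long before it ever sees a diamond, which matters when the final goal only succeeds in 2.5% of episodes.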

But why train an AI on Minecraft?

Of course, OpenAI didn't do all this just to create a "superhuman" Minecraft player, although its researchers estimate they could collect a million hours of gameplay to perfect their model. No, above all, VPT could pave the way for a new way of teaching AI to "act" step by step, like a human.

"The results presented in this paper help pave the way toward utilizing the wealth of unlabeled data on the web for sequential decision-making domains," we can read in the conclusion of OpenAI's scientific paper.

To better understand this somewhat dry sentence, let's leave Minecraft and turn to Photoshop. It is quite possible to imagine a similar artificial intelligence, trained via VPT on the thousands of Photoshop tutorials available on the web, that would learn to navigate the application's menus, click, apply filters, retouch photos... This would greatly ease the work of some graphic designers!

Source: OpenAI
