Tens of millions of gamers play Minecraft every month, and many of them are well-versed in how its universe works. Collecting diamonds is one of the most important things you have to do, since they're a key resource that can be used to craft better tools and weapons, upgrade gear, enchant tools, and trade.
Getting those diamonds is no easy feat, at least initially, as you have to learn a specific procedure to mine them. You need the proper tools, which require crafting, and then you need to dig and create your own diamond mine. Getting a diamond can take between 30 minutes and an hour, especially if you're new to the game. The more you play Minecraft, the better and faster you'll get at everything, including collecting diamonds.
Can an AI model find a diamond on its own in Minecraft without any prior training on how to do it? It turns out the answer is yes. Google's DeepMind has an AI called Dreamer, which it tasked with finding diamonds in Minecraft without any human gameplay data or instructions to guide it. The AI learned the game on its own and was eventually able to find diamonds just like a human player.
As you might have already guessed, this experiment is about much more than just Minecraft. But the popular game is a great universe for training AI models because of its nature: each time you start a game, it loads a brand-new world that you'll have to explore.
The purpose of AI systems like Dreamer isn't really to play popular video games, which they obviously can do. Instead, it's to understand their surroundings and determine the actions they must take to achieve a task.
If robotics comes to mind as a use case for Dreamer, that wouldn't be surprising. That's exactly what Google might use such systems for, especially considering what Dreamer did to "beat" the game in Minecraft. To get a diamond, the AI imagined the future.
According to Nature, the DeepMind scientists used several methods to get the model to mine diamonds in Minecraft. First, there's reinforcement learning, a technique that rewards the AI for tasks completed correctly. The AI learns to perform tasks that would lead to rewards.
Then there's Dreamer's ability to build a world model of its surroundings in the game to understand what its actions might lead to.
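To see how those two ideas fit together, here is a minimal sketch in Python of an agent that "imagines" short rollouts with a world model and picks the action with the best predicted reward. It is only an illustration under made-up assumptions: the milestone states, the actions, and the hand-coded model below are stand-ins, not DeepMind's actual Dreamer, which learns its world model as a neural network from experience.

    # Toy illustration of planning by "imagining the future" with a world model.
    # Not DeepMind's Dreamer: the states, actions, and model below are made up.
    import itertools

    # Hypothetical milestone chain; each index is one step closer to a diamond.
    MILESTONES = ["start", "wood", "crafting_table", "wooden_pickaxe",
                  "stone", "stone_pickaxe", "iron_pickaxe", "diamond"]
    ACTIONS = ["chop", "craft", "mine"]

    # The action that advances the chain from each state (the last state is terminal).
    RIGHT_ACTION = ["chop", "craft", "craft", "mine", "craft", "mine", "mine", None]

    def world_model(state, action):
        """Stand-in for a learned model: predicts the next state and reward."""
        if action == RIGHT_ACTION[state]:
            return state + 1, 1.0   # +1 reward for reaching the next milestone
        return state, 0.0           # wrong action: no progress, no reward

    def plan(state, horizon=3):
        """Imagine every action sequence of length `horizon`; return the best first action."""
        best_first, best_return = ACTIONS[0], float("-inf")
        for seq in itertools.product(ACTIONS, repeat=horizon):
            s, ret = state, 0.0
            for a in seq:
                if s == len(MILESTONES) - 1:
                    break                     # already imagined reaching the diamond
                s, r = world_model(s, a)      # imagined step: the real game is never touched
                ret += r
            if ret > best_return:
                best_first, best_return = seq[0], ret
        return best_first

    state = 0
    while state < len(MILESTONES) - 1:
        action = plan(state)
        state, reward = world_model(state, action)  # here the model also stands in for the "real" game
        print(f"took {action!r}, reached {MILESTONES[state]!r} (+{reward})")

The point of the sketch is the planning loop: the agent evaluates actions entirely inside its model before committing to one in the actual game.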
Remember the multi-step procedure to mine diamonds that I mentioned above? You have to cut down trees so you can craft a crafting table and a wooden pickaxe. That pickaxe is only good for getting the stone you need to then craft a stone pickaxe and a furnace. You might see where this is going; that's right, you need to collect iron for an even better pickaxe.
Once that's done, you need to get yourself safely down into a mine, which you're responsible for making. You'll want to avoid the lava, which kills you.
Put that way, it sounds easy enough to find your first diamond, even if it takes time.
Google put Dreamer to work, with the AI learning how to find that diamond over nine days. The Googlers reset the universe every 30 minutes to force Dreamer to adapt to a new universe rather than getting used to the one it was playing. They also gave Dreamer a "plus one" reward every time the AI completed one of the steps required to eventually mine a diamond.
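In code, that kind of milestone reward can be expressed as a thin wrapper around the game environment. The sketch below is only one way such a reward might look; the environment interface and the item names are assumptions for illustration, not the actual environment or reward code DeepMind used.

    # Sketch of a "+1 per milestone" reward, as described above. The env interface
    # (reset/step returning an inventory) and the item names are assumptions.

    MILESTONE_ITEMS = ["log", "planks", "crafting_table", "wooden_pickaxe",
                       "cobblestone", "stone_pickaxe", "furnace", "iron_ore",
                       "iron_ingot", "iron_pickaxe", "diamond"]

    class MilestoneReward:
        """Wraps a game environment and pays +1 the first time each milestone item appears."""

        def __init__(self, env):
            self.env = env
            self.reached = set()

        def reset(self):
            self.reached.clear()
            return self.env.reset()

        def step(self, action):
            obs, inventory = self.env.step(action)  # assumed to return the agent's inventory
            reward = 0.0
            for item in MILESTONE_ITEMS:
                if item not in self.reached and inventory.get(item, 0) > 0:
                    self.reached.add(item)
                    reward += 1.0   # one-time bonus for each new step toward the diamond
            return obs, reward

Because each bonus is paid only once, the agent can't farm easy early steps for reward; it has to keep progressing down the chain toward the diamond.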
The AI continued to play and learned all the actions needed to mine and find diamonds over the course of those nine days. Each run let it learn more about the world around it. Think of it like Tom Cruise in Edge of Tomorrow. He kept dying and returning to the same moment in his past, only to advance further and further with every new life.
Eventually, Tom Cruise and Emily Blunt beat the aliens in the movie. Eventually, Dreamer was able to mine that diamond within 30 minutes, matching an expert human player, who needs 20 to 30 minutes to go through the whole procedure.
"Dreamer marks a significant step towards general AI systems," Google DeepMind scientist Danijar Hafner told Nature. "It allows AI to understand its physical environment and also to self-improve over time, without a human having to tell it exactly what to do."
The AI's ability to imagine the future before taking any actions might turn out to be a key development in building AI models that power advanced robots that need to perform tasks in the real world. The robots will have to imagine the outcome of their actions before interacting with the world around them. In Minecraft, Dreamer could imagine that chopping down a tree would let it collect wood before actually doing it.
The diamond challenge wasn't even a priority for the researchers. They simply used Minecraft because it offered the right testing grounds for the algorithm: they could observe the AI adapting to an ever-changing, automatically generated environment.
You can learn more details about Dreamer V3 at this link. The DeepMind study is available in full in Nature.