AI and the Metaverse
The Metaverse has surpassed AI in terms of hype - how are the two related?
This editorial follows a series on AI and its political and other implications.
Among the more interesting buzzwords of late is the term “metaverse,” which Wikipedia defines as a “network of 3D virtual worlds focused on social connection.” Venture capitalist Matthew Ball finds the term far less simple to pin down:
> we don’t really know how to describe the Metaverse. However, we can identify core attributes.
These attributes include persistence (the world continues to “exist” even when players log off), interoperability (allowing for seamless travel between different virtual spaces with the same virtual assets), and the presence of a fully functioning economy. The metaverse promises to be a complicated organism, requiring advances in hardware, compute, virtual platforms, interchange tools/standards, and more.
But what of AI? The metaverse concept has become more mainstream at a time when we are also witnessing important advances in AI. Facebook, one of the foremost players in the AI space, changed its name to Meta. AI has long played an important role in gaming, allowing non-player characters (NPCs) to exhibit responsive behaviors and provide a more immersive experience. In games like Fortnite, it affects players both directly, through programmed opponents, and indirectly, by letting developers make data-driven decisions that drive the game’s further growth.
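The “responsive behaviors” of game NPCs are often implemented, at their simplest, as state machines that react to the player. The following is a minimal illustrative sketch (all names are hypothetical, not from any particular game engine):

```python
# A minimal responsive NPC: a finite-state machine whose behavior
# depends on how close the player is. This is a toy stand-in for the
# reactive logic game AI provides, not any engine's actual code.

class NPC:
    def __init__(self):
        self.state = "idle"

    def update(self, player_distance: float) -> str:
        """Transition between states based on the player's distance."""
        if player_distance < 5.0:
            self.state = "attack"
        elif player_distance < 20.0:
            self.state = "chase"
        else:
            self.state = "idle"
        return self.state

npc = NPC()
print(npc.update(50.0))  # far away  -> "idle"
print(npc.update(10.0))  # in sight  -> "chase"
print(npc.update(2.0))   # in range  -> "attack"
```

Modern game AI layers learned models on top of logic like this, but the basic loop of sensing the player and updating a behavioral state remains the same.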
And it is precisely games like Fortnite that hope to build towards the still-amorphous metaverse. Since AI shapes the games that are building towards the metaverse, it will shape the eventual metaverse as well. In this article, I will largely offer a speculative account of how AI might influence the metaverse, focusing on its experiential and creative aspects.
AI’s Prospects in the Metaverse
It’s difficult to describe precisely what AI might be doing in the metaverse now because the metaverse doesn’t quite “exist” yet. But we can get some idea about where it is (and will be) by considering the components that will form the metaverse. According to Matthew Ball, these are:
Hardware
Networking
Compute
Virtual platforms
Interchange tools + standards
Payments, payment rails, and blockchains
Content, services, and asset businesses
Evolving user + business behaviors
Coming at the question from a different angle, we reach conclusions similar to Pereira’s about how AI will influence the metaverse. Hardware, for instance, is beginning to be influenced by AI: Google has experimented with using deep reinforcement learning for chip floorplanning.
Virtual platforms are already influenced by AI, as we described above in the case of games. Users interact with AI NPCs, while the content of virtual worlds is developed using data about players’ preferences. Recent advances like GPT-3 and multimodal models that bridge the language/image divide are allowing creators to augment their efforts, and these benefits promise to extend into the virtual worlds where users and creators will bring their creative energies. Content, services, and asset businesses are already leveraging machine learning to similar ends. According to Ball, a number of content businesses are using ML algorithms “that can procedurally generate interesting and coherent digital environments.”
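The procedural-generation idea Ball describes can be illustrated, in a deliberately non-ML form, with a classic algorithm for coherent content: midpoint displacement, which generates terrain-like heightmaps by recursively perturbing midpoints. The ML systems mentioned above learn such structure from data rather than hard-coding it; this sketch only shows what “procedurally generated, coherent” means:

```python
import random

# Midpoint displacement: a classic (non-ML) procedural-generation
# algorithm for coherent 1D terrain heightmaps. Halving the roughness
# at each level keeps large-scale shape while adding fine detail.

def midpoint_displacement(left, right, depth, roughness, rng):
    """Recursively generate a heightmap between two endpoint heights."""
    if depth == 0:
        return [left, right]
    mid = (left + right) / 2 + rng.uniform(-roughness, roughness)
    a = midpoint_displacement(left, mid, depth - 1, roughness / 2, rng)
    b = midpoint_displacement(mid, right, depth - 1, roughness / 2, rng)
    return a + b[1:]  # drop the duplicated shared midpoint

rng = random.Random(0)  # fixed seed for reproducibility
terrain = midpoint_displacement(0.0, 0.0, depth=4, roughness=10.0, rng=rng)
print(len(terrain))  # 17 points: 2**4 + 1
```

A learned generator plays the same role as the recursive rule here, but can produce environments conditioned on style, player data, or designer intent.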
If financial transactions are to be conducted via blockchain, AI promises to help in this area as well. According to IBM, financial services can benefit from the combined application of blockchain technology and AI by using trustworthy data to drive automated decisions.
While content creation in virtual platforms and transactions are one thing, AI promises to shape the experience of being in the metaverse, or in any virtual world, in numerous other ways. AI-based facial reconstruction, for instance, would allow users to control realistic avatars in VR. Facebook has already published work on this front in “Expressive Telepresence via Modular Codec Avatars,” which introduces a method for creating hyper-realistic faces from the cameras in VR headsets. Users could then have strikingly realistic interactions with one another, with avatars whose facial expressions and emotive gestures closely mirror their own.
The ability to act in virtual environments extends beyond the face, and nearly all tracking in VR is also AI-powered. Oculus, for instance, uses computer vision for accurate hand tracking. Its model predicts a hand’s location and landmarks, which are then used to construct a 3D model that developers can use to create new interaction mechanics or drive user interfaces. Oculus Insight fuses sensor inputs from a user’s headset and controllers with computer vision to perform high-quality position tracking. Meta’s new AI Research SuperCluster (RSC) could further improve AR and VR tracking.
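Hand-tracking pipelines of this kind typically predict 2D landmark positions in the camera image plus a depth estimate, then back-project them into 3D using the camera’s intrinsics. The sketch below shows only that back-projection step under a pinhole camera model; the intrinsics and landmark values are made up for illustration and this is not Oculus’s actual code:

```python
# Back-project a predicted 2D landmark into 3D camera coordinates
# using the pinhole camera model: x = (u - cx) * z / fx, etc.

def backproject(u, v, depth, fx, fy, cx, cy):
    """Convert a pixel landmark (u, v) at a given depth (meters)
    into a 3D point in the camera coordinate frame."""
    x = (u - cx) * depth / fx
    y = (v - cy) * depth / fy
    return (x, y, depth)

# Hypothetical intrinsics for a 640x480 tracking camera.
fx = fy = 500.0
cx, cy = 320.0, 240.0

# A predicted fingertip landmark at pixel (400, 300), 0.4 m away.
point = backproject(400, 300, 0.4, fx, fy, cx, cy)
print(point)  # ≈ (0.064, 0.048, 0.4)
```

Doing this for every landmark of the hand yields the 3D skeleton that interaction mechanics and user interfaces are built on.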
The potential for realistic experiences goes beyond just players’ characters. We mentioned that players could interact with NPCs whose actions and dialogue are AI-generated (see video below). But this can go further: those NPCs may also be visually realistic and emotive. Companies like DeepMotion are developing AI to create life-like characters while researchers at the University of Edinburgh have also introduced algorithms for creating realistic character movements in games.
Developing AI-based interactive games has great promise, but user buy-in can be difficult to maintain. Latitude’s AI Dungeon, a text adventure game built on GPT-3, gained traction quickly and seemed a promising step forward for AI-generated content. When OpenAI expressed concerns about some of the game’s NSFW content, Latitude took action, alienating many users in the process. Its attempts at filtering were apparently too indiscriminate: intended to catch anything suggestive involving children, the system would remove content that merely mentioned an eight-year-old laptop, for instance.
Latitude hasn’t completely given up yet, and their ideas may play an important role in the development of AI-generated content. After the AI Dungeon fiasco, Latitude launched an AI-powered game platform called Voyage. But their vision extends further than just hosting games: Latitude wants Voyage to serve as a game engine, allowing players to create their own games with trained AI models.
What of the quality of rendered environments? If we are to spend time in virtual environments and interact with others, we presumably want high-quality graphics in order to see the intricacies of expressions and other features that make a virtual world seem realistic. AI could have a hand in this too: Nvidia’s DLSS technology uses neural networks to boost frame rates and generate sharp images. Intel has also developed image-enhancing AI, while new research is enabling the rendering of 3D scenes from 2D images. Such progress could make it easier for both users and creators to contribute to virtual environments.
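The core idea behind DLSS-style upscaling is to render each frame at low resolution and let a neural network reconstruct a sharp high-resolution image. The learned reconstruction is far beyond a short sketch, so the toy below shows only the baseline upsampling step such a network refines:

```python
# A DLSS-style pipeline renders at low resolution, then upscales.
# This toy shows the naive baseline: nearest-neighbor 2x upsampling,
# which a trained network would replace with learned reconstruction.

def upsample_2x(image):
    """Nearest-neighbor 2x upsample of a 2D list of pixel values."""
    out = []
    for row in image:
        wide = [p for p in row for _ in range(2)]  # duplicate columns
        out.append(wide)
        out.append(list(wide))                     # duplicate rows
    return out

low_res = [[1, 2],
           [3, 4]]
print(upsample_2x(low_res))
# [[1, 1, 2, 2], [1, 1, 2, 2], [3, 3, 4, 4], [3, 3, 4, 4]]
```

Rendering a quarter of the pixels and reconstructing the rest is where the frame-rate gains come from; the network’s job is to make the reconstruction look native-resolution rather than blocky like this baseline.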
When players interact with one another, they need to do more than just see each other; they also need to communicate. AI is already used for translation and voice transcription, and Meta’s RSC was built with real-time voice translation in mind. With these advances, players from around the world who share no common language could interact simply by speaking their native languages.
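A real-time translation flow like the one described chains three models: speech recognition, machine translation, and speech synthesis. The sketch below wires those stages together with trivial stand-ins (a lookup table instead of a translation model, strings instead of audio) just to show the pipeline’s shape; none of it reflects Meta’s actual system:

```python
# Pipeline shape for speech-to-speech translation:
# audio -> transcribe -> translate -> synthesize -> audio.
# Each function is a trivial stand-in for a learned model.

def transcribe(audio: str) -> str:
    # Stand-in for a speech-recognition model.
    return audio

def translate(text: str, table: dict) -> str:
    # Stand-in for a machine-translation model.
    return " ".join(table.get(word, word) for word in text.split())

def synthesize(text: str) -> str:
    # Stand-in for a text-to-speech model.
    return f"<audio:{text}>"

es_to_en = {"hola": "hello", "mundo": "world"}
utterance = "hola mundo"
print(synthesize(translate(transcribe(utterance), es_to_en)))
# <audio:hello world>
```

The engineering challenge is running all three stages fast enough that the round trip feels conversational, which is exactly the kind of workload RSC is built for.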
AI promises to make the metaverse, or virtual worlds, or whatever all “this” becomes, a more interesting and realistic place. It will allow creators and companies to develop richer and more personalized content. It may enable hardware that offers a smoother experience. It will help make players’ interactions with non-player characters feel more lifelike. In many ways, it could make the metaverse a far more faithful reproduction of the real world. This account of how AI is shaping and could shape the metaverse is far from comprehensive, but it is clear that AI and the technologies that promise to form the metaverse will interact in many ways in the years to come.
About the Author:
Daniel Bashir is a machine learning engineer at an AI startup in Palo Alto, CA. He graduated with his Bachelor’s in Computer Science and Mathematics from Harvey Mudd College in 2020. He is interested in computer vision, ML infrastructure, and information theory.