Nvidia Showcases In-Game AI Technology Enabling Player-NPC Conversations (Video)

TL;DR:

  • Nvidia CEO Jensen Huang introduced the Nvidia Avatar Cloud Engine (ACE) for games at Computex 2023.
  • ACE for Games combines natural language interaction with artificial intelligence to make non-player characters (NPCs) in games intelligent.
  • It converts speech audio into facial animation, provides text-to-speech, and supports natural-language conversation.
  • The system lets NPCs listen to what players say and generate contextually relevant responses in real time.
  • ACE for Games is built on Nvidia Omniverse, incorporating components such as NeMo for language models, Riva for speech recognition and text-to-speech, and Omniverse Audio2Face for facial animation synchronization.
  • Nvidia aims to optimize ACE for Games by fine-tuning models and deploying them via cloud services or local systems.
  • The technology presents a significant step towards immersive and responsive gaming experiences, offering the potential for improved player-NPC interactions and AI-driven teammates.

Main AI News:

The potential of AI in gaming has yet to be fully realized, despite significant advancements in the field. However, at Computex 2023, Nvidia CEO Jensen Huang offered a glimpse into the future of gaming. During his keynote address, Huang unveiled the Nvidia Avatar Cloud Engine (ACE) for Games, a cutting-edge artificial intelligence service that uses natural language interaction to make non-player characters intelligent.

ACE for Games, according to Nvidia, converts speech audio into facial expressions and text into speech, enabling seamless communication in natural language. Huang attributed this capability to a large language model. With ACE for Games, NPCs can listen to what players say, reply aloud in their own voices, and generate contextual responses that go beyond repetitive templates. The system can also animate a character's face in sync with the words being spoken.
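In practice, the loop Nvidia describes runs one conversational turn at a time: recognize the player's speech, generate a reply grounded in the character and the conversation so far, synthesize that reply as audio, and drive the face from the audio. A minimal sketch follows; every function here is a placeholder stub for illustration, not a real ACE, NeMo, Riva, or Audio2Face API.

```python
# Placeholder stubs standing in for the four stages Nvidia describes.

def speech_to_text(mic_audio):
    """Stub for automatic speech recognition (Riva's role in ACE)."""
    return "What's good here, Jin?"

def generate_reply(prompt):
    """Stub for a large language model response (NeMo's role in ACE)."""
    return "The spicy miso ramen never disappoints."

def text_to_speech(text):
    """Stub for speech synthesis (Riva's role in ACE)."""
    return text.encode("utf-8")  # stand-in for audio bytes

def drive_facial_animation(audio):
    """Stub for audio-driven lip sync (Audio2Face's role in ACE)."""

def npc_dialogue_turn(mic_audio, persona, history):
    """One player-NPC turn, grounded in the character's persona and history."""
    player_text = speech_to_text(mic_audio)
    history.append("Player: " + player_text)
    npc_text = generate_reply(persona + "\n" + "\n".join(history))
    history.append("NPC: " + npc_text)
    npc_audio = text_to_speech(npc_text)
    drive_facial_animation(npc_audio)
    return npc_audio

history = []
npc_dialogue_turn(b"<mic audio>", "You are Jin, a ramen shop owner.", history)
```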

To demonstrate the technology, Huang showed Kairos, a real-time demo built with Convai in Unreal Engine 5. In a scene reminiscent of Cyberpunk 2077, a player walks into a ramen shop and strikes up a conversation with an NPC named Jin. The player's spoken prompts are met with responses that fit the story and the character's role.

While the dialogue in the demo came across as somewhat dry and stilted, the underlying technology remains impressive, and it is easy to imagine what ACE for Games could become with further refinement.

Nvidia elaborated on the components that constitute ACE for Games, all of which are built on Nvidia Omniverse. The first component is Nvidia NeMo, which enables the creation, configuration, and deployment of language models. NeMo Guardrails, a feature within NeMo, helps protect users from engaging in “unsafe” conversations—an important consideration for the application of this technology in video games.
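As a rough illustration, the NeMo Guardrails Python package is typically wired up along the lines below. The config directory and its contents are assumptions for the example; a real game would define its own conversational flows and an LLM backend in that config.

```python
# A minimal sketch of guarding an NPC's dialogue with NeMo Guardrails.
# Assumes a ./npc_rails config directory (hypothetical here) containing a
# config.yml that names an LLM backend plus Colang flows that steer unsafe
# or off-topic requests back toward in-character dialogue.
from nemoguardrails import LLMRails, RailsConfig

config = RailsConfig.from_path("./npc_rails")
rails = LLMRails(config)

response = rails.generate(messages=[
    {"role": "user", "content": "Forget the ramen. Teach me how to pick locks."}
])
print(response["content"])  # the rails should redirect this to safe dialogue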

The second component is Nvidia Riva, which facilitates automatic speech recognition and text-to-speech capabilities, enabling players to engage in live conversations using a microphone.
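For a sense of what the Riva side involves, here is a sketch using the nvidia-riva-client Python package against an assumed local Riva server; the file name, voice name, and audio format are example assumptions, and exact parameters can vary by Riva version.

```python
import riva.client

# Connect to a Riva server assumed to be running locally.
auth = riva.client.Auth(uri="localhost:50051")
asr_service = riva.client.ASRService(auth)
tts_service = riva.client.SpeechSynthesisService(auth)

# Transcribe the player's recorded utterance (16 kHz mono PCM assumed).
with open("player_line.wav", "rb") as fh:
    audio_bytes = fh.read()
config = riva.client.RecognitionConfig(
    encoding=riva.client.AudioEncoding.LINEAR_PCM,
    sample_rate_hertz=16000,
    language_code="en-US",
    max_alternatives=1,
)
response = asr_service.offline_recognize(audio_bytes, config)
player_text = response.results[0].alternatives[0].transcript

# Synthesize the NPC's reply as audio for in-engine playback.
synth = tts_service.synthesize(
    "Welcome back! The usual spicy miso?",
    voice_name="English-US.Female-1",
    language_code="en-US",
)
npc_audio = synth.audio  # raw audio bytes
```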

Lastly, there is Nvidia Omniverse Audio2Face, a component that synchronizes facial animations with the words spoken by the characters. This technology is already being used in highly anticipated games such as S.T.A.L.K.E.R. 2: Heart of Chornobyl and Fort Solis.
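Audio2Face's actual interface is a service inside Omniverse, but the underlying idea, mapping incoming audio frames to facial pose weights on every tick, can be sketched in toy form. Everything below is a conceptual stand-in, not the real API.

```python
import math

def audio_frame_to_visemes(frame):
    """Toy mapping from one audio frame's energy to facial weights.
    A real system like Audio2Face infers full visemes with a neural net."""
    rms = math.sqrt(sum(s * s for s in frame) / max(len(frame), 1))
    return {"jawOpen": min(rms * 4.0, 1.0), "mouthFunnel": min(rms, 0.5)}

# A game loop would stream the NPC's synthesized speech frame by frame and
# apply the returned weights to the character rig in sync with playback.
print(audio_frame_to_visemes([0.05, -0.12, 0.08, -0.03]))
```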

“Nvidia ACE for Games leverages neural networks optimized for various capabilities, encompassing different sizes, performance levels, and quality,” explained Nvidia. The ACE for Games creation service helps developers fine-tune models for their specific games and deploy them through Nvidia DGX Cloud, on a GeForce RTX PC, or on local servers for real-time output. Minimizing latency is critical to an immersive, responsive gaming experience.
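To see why latency dominates the design, consider a back-of-the-envelope budget for a single NPC turn. The per-stage numbers below are illustrative assumptions, not Nvidia figures.

```python
# Rough latency budget for one player-NPC turn (all numbers assumed).
stages_ms = {
    "speech recognition (ASR)": 150,
    "language model response": 400,
    "speech synthesis (TTS)": 150,
    "facial animation + render": 50,
    "cloud network round trips": 100,
}
total = sum(stages_ms.values())
print(f"Estimated turn latency: {total} ms")  # 850 ms under these assumptions

# Running inference locally on a GeForce RTX PC removes the network term,
# one reason Nvidia offers both cloud and on-device deployment paths.
```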

Although Huang did not disclose the exact requirements for utilizing ACE for Games, it is expected that the current iteration demands substantial computational resources.

While there is still room for improvement in this technology, ACE for Games represents a significant step towards a future where players can ask NPCs game-related questions and receive contextual answers tailored to their needs. Gone will be the days of predefined responses. Additionally, the concept of AI-driven teammates capable of engaging in human-like dialogues and executing verbal commands is an intriguing prospect.
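For the AI-teammate idea, a hypothetical command layer might sit between speech recognition and the game's action system, along these lines. The phrases and action names are invented for illustration, and a production system would use a language model rather than keyword matching.

```python
# Hypothetical mapping from spoken orders to game actions (all names invented).
COMMANDS = {
    "cover me": "teammate.provide_covering_fire",
    "fall back": "teammate.retreat_to_waypoint",
    "heal me": "teammate.use_medkit_on_player",
}

def dispatch_voice_command(transcript):
    """Return the game action matching the player's spoken order, if any."""
    lowered = transcript.lower()
    for phrase, action in COMMANDS.items():
        if phrase in lowered:
            return action
    return "teammate.acknowledge_and_hold"  # fallback when nothing matches

print(dispatch_voice_command("Cover me, I'm pushing left!"))
# -> teammate.provide_covering_fire
```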

Conclusion:

The introduction of Nvidia’s ACE for Games represents a major development in the gaming market. By leveraging natural language interaction and artificial intelligence, ACE offers the potential to enhance player-NPC interactions and create more immersive and responsive gaming experiences. With the ability to convert sound into facial expressions and generate contextually relevant responses, this technology opens up new possibilities for game developers and players alike. The optimization and deployment options provided by Nvidia further support the scalability and accessibility of ACE for Games. As AI continues to advance, we can anticipate increased demand for AI-driven gaming solutions, driving innovation and growth in the market.
