Apple has been granted a patent for next-gen avatars that use machine learning-based blood flow tracking

TL;DR:

  • Apple obtains patent for realistic avatars with machine learning and blood flow tracking.
  • Apple Vision Pro introduced by Mike Rockwell enables authentic avatar creation.
  • Advanced neural network maps facial expressions for natural avatars.
  • Patent details two-phase process: training and application.
  • Blood flow texture enhances realism for future Apple devices.

Main AI News:

Apple has secured a patent from the U.S. Patent and Trademark Office covering realistic avatars powered by machine learning and blood flow tracking. Realistic avatars of this kind were showcased during the unveiling of Apple Vision Pro by Mike Rockwell, Vice President of the Technology Development Group.

“For digital communications like FaceTime, Vision Pro goes beyond conveying just your eyes and creates an authentic representation of you. This was one of the most difficult challenges we faced in building Vision Pro,” remarked Rockwell. The absence of a video conferencing camera and potential obstructions, such as eyewear, posed significant obstacles. Apple addressed them with a novel solution built on advanced machine learning techniques.

Vision Pro uses its front sensors for a quick enrollment process, in which an advanced encoder-decoder neural network creates your digital Persona. This neural network has been trained on a diverse group of thousands of individuals, enabling it to deliver a natural representation that dynamically mirrors your facial and hand movements. With your Persona, you can communicate with over a billion FaceTime-capable devices, with a sense of volume and depth that traditional video communication lacks.

A Glimpse into Apple’s Patent

Apple’s recently granted patent describes a machine learning-based blood flow tracking technique for generating photorealistic avatars. The crux of the innovation lies in mimicking how blood flow patterns respond to the subject’s facial expressions and movements. As individuals talk, make different facial expressions, or perform any motion that alters the contours of their face, blood flow within the facial tissues changes, resulting in shifts in skin coloration.

The patent describes a two-phase process: training and application. During the training phase, a texture autoencoder is trained using blood flow image data captured through a photogrammetry system. Multiple images of subjects exhibiting various expressions are captured to obtain ground truth data on the correlation between facial expressions and blood flow in the face. The albedo map describes the face’s texture under diffuse light in its static state. To determine blood flow, the lighting component that deviates from the albedo map is extracted for each captured expression. The extracted component therefore indicates how far a specific expression departs from the albedo map, which enables the texture autoencoder to map a subject’s expression to a 2D blood flow texture map.
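
The idea of treating blood flow as a deviation from the albedo map can be illustrated with a minimal sketch. It assumes the simplest possible model, a per-pixel subtraction of the albedo color from the color captured for one expression; the function name `extract_blood_flow`, the flattened RGB layout, and the sample values are all hypothetical and are not taken from the patent, which does not spell out the extraction at this level of detail.

```python
# Illustrative sketch only: the subtraction model, names, and sample values
# are assumptions, not the method claimed in the patent.

def extract_blood_flow(captured, albedo):
    """Return the per-pixel, per-channel deviation of a captured texture
    from the static albedo map.

    Both textures are flattened lists of (R, G, B) tuples in [0, 1].
    """
    if len(captured) != len(albedo):
        raise ValueError("textures must have the same number of pixels")
    return [
        tuple(c - a for c, a in zip(captured_px, albedo_px))
        for captured_px, albedo_px in zip(captured, albedo)
    ]


# Two-pixel toy textures: the neutral face and the same face while smiling.
albedo = [(0.80, 0.60, 0.50), (0.78, 0.58, 0.49)]
smiling = [(0.86, 0.59, 0.50), (0.78, 0.58, 0.49)]

# Positive red values mark pixels that flushed relative to the neutral face;
# an all-zero pixel did not change.
blood_flow = extract_blood_flow(smiling, albedo)
```

In a training setup along these lines, pairs of expression and deviation map would serve as the ground truth the texture autoencoder learns from.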

The second phase uses the 2D blood flow texture map to generate an avatar. This can be achieved through techniques like multipass rendering, where the 2D blood flow texture map is incorporated as an additional pass during the rendering process. Alternatively, the blood flow texture can be overlaid on a 3D mesh, creating a dynamic representation of the subject.
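
As a rough illustration of applying the blood flow texture as an extra pass, the sketch below adds a per-pixel blood flow offset on top of a base albedo texture and clamps the result to the valid color range. The additive blend, the `strength` parameter, and the name `composite_blood_flow` are assumptions made for this example; a real renderer would perform the pass on the GPU, and the patent does not prescribe this particular blend.

```python
# Illustrative sketch only: an additive blend is assumed here; the patent
# does not specify how the extra rendering pass combines the textures.

def composite_blood_flow(albedo, blood_flow, strength=1.0):
    """Blend a blood flow offset texture over a base albedo texture.

    Both inputs are flattened lists of (R, G, B) tuples. Albedo values lie
    in [0, 1]; blood flow values are signed offsets. The result is clamped
    back into [0, 1].
    """
    if len(albedo) != len(blood_flow):
        raise ValueError("textures must have the same number of pixels")

    def clamp(value):
        return max(0.0, min(1.0, value))

    return [
        tuple(clamp(a + strength * b) for a, b in zip(albedo_px, flow_px))
        for albedo_px, flow_px in zip(albedo, blood_flow)
    ]


# Base pass: static skin texture. Extra pass: expression-specific offsets.
albedo = [(0.80, 0.60, 0.50), (0.95, 0.58, 0.49)]
blood_flow = [(0.06, -0.01, 0.0), (0.10, 0.0, 0.0)]

shaded = composite_blood_flow(albedo, blood_flow)
neutral = composite_blood_flow(albedo, blood_flow, strength=0.0)
```

Setting `strength` to 0.0 reproduces the neutral albedo texture, which makes it easy to compare an avatar frame with and without the blood flow pass.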

Conclusion:

Apple’s patent for machine learning-powered avatars, capable of mimicking blood flow during facial expressions, marks a notable advance in digital communication technology. It has the potential to change user experiences on Apple devices by making avatars more realistic and authentic. As consumers increasingly seek immersive and lifelike interactions, Apple’s work on natural avatar creation positions the company among the leaders in this domain and could shape the future of digital communication and entertainment.
