Efficient Emotional Talking Head Generation via Dynamic 3D Gaussian Rendering

Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), ISSN: 1611-3349, Vol: 15036 LNCS, Page: 80-94
2025

Conference Paper Description

The synthesis of talking heads with high fidelity, accurate lip synchronization, emotion control, and high efficiency has attracted considerable research interest in recent years. Although some current NeRF-based methods can produce high-fidelity videos in real time, they remain constrained by computational resources and struggle to achieve accurate emotion control. To tackle these challenges, we propose Emo-Gaussian, a method for generating talking heads based on 3D Gaussian Splatting. In our method, a Gaussian field models a specific character. We condition the opacity and color of the 3D Gaussians on audio and emotion inputs and render and optimize them dynamically, which lets the model capture the dynamic variations of the talking head. For the emotion input, we introduce an emotion control module that uses a pre-trained CLIP model to extract emotional priors from images of individuals. These priors are then integrated through an attention mechanism to provide emotion guidance for talking head generation. Quantitative and qualitative experiments show that our method outperforms previous approaches in image quality, lip synchronization, and emotion control, while remaining highly efficient compared with previous state-of-the-art methods.
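
The conditioning idea described above can be illustrated with a minimal sketch. It is not the Emo-Gaussian implementation: the abstract does not give the network design, so everything here is an assumption made for illustration. The function names (`attend`, `condition_gaussian`), the feature sizes, the single linear output layer, and the use of random vectors in place of real audio features and CLIP emotion priors are all hypothetical. The sketch only shows the general pattern of letting an audio feature attend over emotion tokens and then predicting a per-Gaussian opacity and color from the result.

```python
import math
import random


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def softmax(xs):
    m = max(xs)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def attend(query, keys, values):
    """Scaled dot-product attention for a single query vector."""
    scale = math.sqrt(len(query))
    weights = softmax([dot(query, k) / scale for k in keys])
    dim = len(values[0])
    return [sum(w * v[i] for w, v in zip(weights, values)) for i in range(dim)]


def condition_gaussian(position, audio_feat, emotion_tokens, w_out):
    """Predict (opacity, rgb) for one Gaussian from audio and emotion inputs.

    Assumption: the audio feature acts as the attention query and the emotion
    tokens act as keys and values. The paper may fuse them differently.
    """
    emotion_ctx = attend(audio_feat, emotion_tokens, emotion_tokens)
    hidden = list(position) + list(audio_feat) + emotion_ctx
    # Hypothetical single linear layer followed by a sigmoid, so that
    # opacity and each color channel fall in (0, 1).
    out = [sigmoid(dot(row, hidden)) for row in w_out]
    return out[0], out[1:4]


# Placeholder inputs: random vectors stand in for real audio features
# and for emotion priors that would come from a pre-trained CLIP model.
random.seed(0)
D = 4
audio = [random.gauss(0, 1) for _ in range(D)]
emotion_tokens = [[random.gauss(0, 1) for _ in range(D)] for _ in range(3)]
w_out = [[random.gauss(0, 0.5) for _ in range(3 + 2 * D)] for _ in range(4)]

opacity, rgb = condition_gaussian((0.1, -0.2, 0.3), audio, emotion_tokens, w_out)
print(opacity, rgb)
```

In a real system the linear layer would be a trained network, and the predicted opacity and color would feed a differentiable Gaussian splatting rasterizer. Both are outside the scope of this sketch.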
