Audio-Driven Facial Animation with Deep Learning: A Survey

Jiang, Diqiong, Chang, Jian, You, Lihua, Bian, Shaojun, Kosk, Robert and Maguire, Greg (2024) Audio-Driven Facial Animation with Deep Learning: A Survey. Information, 15 (11). p. 675. ISSN 2078-2489

Full text not available from this repository.

Access this via: https://www.mdpi.com/2078-2489/15/11/675

Abstract

Audio-driven facial animation is a rapidly evolving field that aims to generate realistic facial expressions and lip movements synchronized with a given audio input. This survey provides a comprehensive review of deep learning techniques applied to audio-driven facial animation, with a focus on both audio-driven facial image animation and audio-driven facial mesh animation. These approaches employ deep learning to map audio inputs directly onto 3D facial meshes or 2D images, enabling the creation of highly realistic and synchronized animations. This survey also explores evaluation metrics, available datasets, and the challenges that remain, such as disentangling lip synchronization and emotions, generalization across speakers, and dataset limitations. Lastly, we discuss future directions, including multi-modal integration, personalized models, and facial attribute modification in animations, all of which are critical for the continued development and application of this technology.

Item Type:	Article
Additional Information:	Article version: VoR. From Crossref journal articles via Jisc Publications Router. History: epub 28-10-2024; issued 28-10-2024. Licence for VoR version of this article starting on 28-10-2024: https://creativecommons.org/licenses/by/4.0/
SWORD Depositor:	JISC Router
Depositing User:	JISC Router
Date Deposited:	06 Nov 2024 10:53
Last Modified:	06 Nov 2024 10:53
URI:	https://bnu.repository.guildhe.ac.uk/id/eprint/19378
