Synthesizing photorealistic 4D human head avatars from videos is essential for VR/AR, telepresence, and video game applications. Although existing Neural Radiance Fields (NeRF)-based methods achieve high-fidelity results, their computational cost limits their use in real-time applications. To overcome this limitation, we introduce BakedAvatar, a novel representation for real-time neural head avatar synthesis that is deployable in a standard polygon rasterization pipeline. Our approach extracts deformable multi-layer meshes from learned isosurfaces of the head and computes expression-, pose-, and view-dependent appearance that can be baked into static textures for efficient rasterization. We thus propose a three-stage pipeline for neural head avatar synthesis: learning continuous deformation, manifold, and radiance fields; extracting layered meshes and textures; and fine-tuning texture details with differentiable rasterization. Experimental results demonstrate that our representation produces synthesis quality comparable to other state-of-the-art methods while significantly reducing inference time. We further showcase various head avatar synthesis results from monocular videos, including view synthesis, face reenactment, expression editing, and pose editing, all at interactive frame rates.
Overview of our three-stage pipeline. In the first stage, we learn three implicit fields in canonical space: a FLAME-based deformation field, an isosurface geometry field, and a radiance field that predicts multiple radiance bases together with a position feature, which a lightweight appearance decoder combines into the final color. In the second stage, we bake these neural fields into deformable multi-layer meshes and multiple static textures. In the third stage, we use differentiable rasterization to fine-tune the baked textures.
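To make the first stage concrete, the sketch below shows one plausible way to structure the appearance model described above: a field MLP predicts several radiance bases and a position feature per canonical point, and a lightweight decoder mixes the bases with expression-, pose-, and view-dependent weights. This is a minimal illustration, not the authors' implementation; all module names and dimensions (`n_bases`, `feat_dim`, `cond_dim`) are hypothetical.

```python
# Minimal sketch (not the authors' code) of a stage-1 appearance model:
# a field MLP predicts radiance bases + a position feature per point, and a
# lightweight decoder produces expression-, pose-, and view-dependent weights
# that mix the bases into a final color.
import torch
import torch.nn as nn

class AppearanceModel(nn.Module):
    def __init__(self, n_bases=8, feat_dim=16, cond_dim=64):
        super().__init__()
        self.n_bases = n_bases
        # Field over canonical positions: radiance bases and a position feature.
        self.field = nn.Sequential(
            nn.Linear(3, 128), nn.ReLU(),
            nn.Linear(128, 128), nn.ReLU(),
            nn.Linear(128, n_bases * 3 + feat_dim),
        )
        # Lightweight decoder: (position feature, expression/pose code,
        # view direction) -> per-basis blend weights.
        self.decoder = nn.Sequential(
            nn.Linear(feat_dim + cond_dim + 3, 64), nn.ReLU(),
            nn.Linear(64, n_bases),
        )

    def forward(self, x, cond, view_dir):
        out = self.field(x)
        bases = out[..., : self.n_bases * 3].reshape(*x.shape[:-1], self.n_bases, 3)
        feat = out[..., self.n_bases * 3 :]
        w = torch.softmax(self.decoder(torch.cat([feat, cond, view_dir], dim=-1)), dim=-1)
        # Final color as a convex combination of the radiance bases.
        rgb = torch.sigmoid((w.unsqueeze(-1) * bases).sum(dim=-2))
        return rgb

# Usage: 1024 sample points, a 64-D expression/pose code, unit view directions.
model = AppearanceModel()
x = torch.randn(1024, 3)
cond = torch.randn(1024, 64)
view = torch.nn.functional.normalize(torch.randn(1024, 3), dim=-1)
print(model(x, cond, view).shape)  # torch.Size([1024, 3])
```

The structural point this illustrates: the radiance bases and the position feature depend only on canonical position, so they can be precomputed and stored as static textures in the second stage; only the small decoder needs to run per pixel at render time.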
We test our method on multiple monocular portrait videos of real subjects from PointAvatar [Zheng et al. 2022b], NHA [Grassal et al. 2022], NerFace [Gafni et al. 2021], and HDTF [Zhang et al. 2021].
We compare our method with state-of-the-art neural head avatar reconstruction methods.
Our method supports real-time rendering of controllable head avatars. We test our method on novel-view synthesis, facial reenactment, and expression and pose editing tasks.
Our method renders 4D head avatars at real-time framerates on mobile devices.
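At render time, each baked mesh layer is rasterized to an RGBA image and the layers are blended back-to-front with standard alpha compositing; in deployment this runs in a fragment shader. The NumPy sketch below illustrates only the compositing step, with random data standing in for rasterized layer outputs; the layer count and resolution are illustrative.

```python
# Minimal sketch of back-to-front "over" compositing of rasterized layers.
# Random RGBA images stand in for the rasterized multi-layer mesh output.
import numpy as np

def composite_layers(layers):
    """Blend a back-to-front list of (H, W, 4) RGBA images with the over operator."""
    h, w, _ = layers[0].shape
    out = np.zeros((h, w, 3), dtype=np.float32)
    for rgba in layers:  # back-to-front order
        rgb, a = rgba[..., :3], rgba[..., 3:4]
        out = rgb * a + out * (1.0 - a)
    return out

# Stand-in for three baked mesh layers rasterized at 256x256.
layers = [np.random.rand(256, 256, 4).astype(np.float32) for _ in range(3)]
image = composite_layers(layers)
print(image.shape)  # (256, 256, 3)
```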
If you find this work useful for your research, please cite:
@article{bakedavatar,
author = {Duan, Hao-Bin and Wang, Miao and Shi, Jin-Chuan and Chen, Xu-Chuan and Cao, Yan-Pei},
title = {BakedAvatar: Baking Neural Fields for Real-Time Head Avatar Synthesis},
year = {2023},
issue_date = {December 2023},
publisher = {Association for Computing Machinery},
address = {New York, NY, USA},
url = {https://doi.org/10.1145/3618399},
doi = {10.1145/3618399},
volume = {42},
number = {6},
journal = {ACM Trans. Graph.},
month = {dec},
articleno = {225},
numpages = {14}
}