Visual Dubbing Pipeline (Video & Face Synthesis)

High-fidelity visual dubbing pipeline under few-shot constraints.

A novel visual dubbing pipeline that balances lip-sync accuracy and realistic facial reenactment. Read the paper

Overview of the pipeline at inference time. More details can be found in the paper.
Sample results. More can be found here!

Overview

  • Combines person-generic and person-specific methods for realistic visual dubbing
  • Introduces a virtual dubber to capture expressive lip-sync with limited data
  • Uses a full-head identity-swapping autoencoder to transfer face, hair, ears, and neck
  • Eliminates artifacts such as jitter and double chins that arise in mouth-only approaches
  • Achieves high visual quality and temporal consistency with short video inputs
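
The two-stage structure described in the list above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the class `DubbingPipeline`, its fields `lip_sync` and `swap_identity`, and the ordering (person-generic lip-sync on the virtual dubber first, then person-specific full-head identity swap) are hypothetical names and choices inferred from the bullets. Frames and audio are toy lists of floats standing in for real images and audio features, and the two `toy_*` functions are placeholders for trained models.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence

# Toy stand-ins: a real system would use image tensors and audio features.
Frame = List[float]
Audio = List[float]


@dataclass
class DubbingPipeline:
    """Hypothetical two-stage wrapper (all names are assumptions)."""

    # Person-generic stage: drives the virtual dubber's lips from audio.
    lip_sync: Callable[[Frame, Audio], Frame]
    # Person-specific stage: full-head identity swap onto the target actor.
    swap_identity: Callable[[Frame], Frame]

    def run(
        self,
        dubber_frames: Sequence[Frame],
        audio_chunks: Sequence[Audio],
    ) -> List[Frame]:
        # Expect one audio chunk per video frame.
        if len(dubber_frames) != len(audio_chunks):
            raise ValueError("need exactly one audio chunk per frame")
        output: List[Frame] = []
        for frame, audio in zip(dubber_frames, audio_chunks):
            synced = self.lip_sync(frame, audio)       # stage 1: lip-sync
            output.append(self.swap_identity(synced))  # stage 2: identity swap
        return output


def toy_lip_sync(frame: Frame, audio: Audio) -> Frame:
    # Placeholder: shift every "pixel" by the chunk's summed audio value.
    return [pixel + sum(audio) for pixel in frame]


def toy_swap_identity(frame: Frame) -> Frame:
    # Placeholder: a fixed transform standing in for the autoencoder.
    return [2.0 * pixel for pixel in frame]


pipeline = DubbingPipeline(lip_sync=toy_lip_sync, swap_identity=toy_swap_identity)
result = pipeline.run([[1.0, 2.0]], [[0.5, 0.5]])
print(result)
```

Keeping the two stages as separate callables mirrors the split between person-generic and person-specific components, so either stage could be replaced independently in such a sketch.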