APB2FaceV2

An improved version of APB2Face: Real-Time Audio-Guided Multi-Face Reenactment

Official PyTorch implementation of the paper "APB2FACEV2: REAL-TIME AUDIO-GUIDED MULTI-FACE REENACTMENT".

Using the Code

Requirements

This code was developed with Python 3.7, PyTorch 1.5.1, and CUDA 10.1 on Ubuntu 16.04.
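
To compare a local setup against the versions listed above, a small check like the following may help. It is only a sketch: the helper names describe_environment and report are hypothetical and not part of this repository, and a version mismatch does not necessarily mean the code will fail.

```python
import sys

# Versions reported in the README; used as a reference point, not as hard requirements.
EXPECTED = {"python": (3, 7), "torch": "1.5.1", "cuda": "10.1"}


def describe_environment():
    """Return detected versions; torch and cuda stay None when unavailable."""
    info = {"python": tuple(sys.version_info[:2]), "torch": None, "cuda": None}
    try:
        import torch
    except ImportError:
        return info
    # Drop a local build suffix such as "+cu101" before comparing.
    info["torch"] = torch.__version__.split("+")[0]
    # torch.version.cuda is None for CPU-only builds.
    info["cuda"] = torch.version.cuda
    return info


def report(info):
    """Build one human-readable line per component."""
    lines = []
    for key, expected in EXPECTED.items():
        found = info[key]
        status = "matches" if found == expected else "differs"
        lines.append(f"{key}: found {found}, README used {expected} ({status})")
    return lines


if __name__ == "__main__":
    for line in report(describe_environment()):
        print(line)
```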

Datasets in the paper

Download the AnnVI dataset from Google Drive or Baidu Cloud (key: str3) and place it in /media/datasets/AnnVI.

Train

python3 train.py --name AnnVI --data AnnVI --data_root DATASET_PATH --img_size 256 --mode train --trainer l2face --gan_mode lsgan --gpus 0 --batch_size 16

Results are stored in checkpoints/xxx
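
When launching several runs, it can be convenient to assemble the training command programmatically. The sketch below only rebuilds the command shown above from the same flags; the function build_train_command and its defaults are hypothetical helpers, not part of this repository, and DATASET_PATH remains a placeholder for wherever the dataset was placed.

```python
import shlex


def build_train_command(data_root, name="AnnVI", batch_size=16, gpus="0"):
    """Return the README training command as an argument list."""
    return [
        "python3", "train.py",
        "--name", name,
        "--data", "AnnVI",
        "--data_root", data_root,
        "--img_size", "256",
        "--mode", "train",
        "--trainer", "l2face",
        "--gan_mode", "lsgan",
        "--gpus", str(gpus),
        "--batch_size", str(batch_size),
    ]


if __name__ == "__main__":
    cmd = build_train_command("DATASET_PATH")
    # Print a shell-safe version of the command for inspection.
    print(" ".join(shlex.quote(arg) for arg in cmd))
```

The returned list can be passed to subprocess.run(cmd, check=True) from the repository root to start training.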

Test

python3 test.py

Results are stored in checkpoints/AnnVI-Big/results

Citation

@article{zhang2021real,
  title={Real-Time Audio-Guided Multi-Face Reenactment},
  author={Zhang, Jiangning and Zeng, Xianfang and Xu, Chao and Liu, Yong and Li, Hongliang},
  journal={IEEE Signal Processing Letters},
  year={2021},
  publisher={IEEE}
}
