Listen, Denoise, Action!
This repository provides code and models for our SIGGRAPH 2023 paper Listen, denoise, action! Audio-driven motion synthesis with diffusion models.
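As background for how such a model generates motion, the reverse (sampling) loop of a denoising diffusion model can be sketched as below. This is a generic, illustrative DDPM sampler in NumPy, not the paper's actual implementation: the denoiser, noise schedule, frame count, and pose dimensionality are all placeholders.

```python
import numpy as np

def ddpm_sample(denoise_fn, audio_feats, n_steps=50, pose_dim=63, seed=0):
    """Toy reverse-diffusion loop: start from Gaussian noise and iteratively
    denoise it, conditioned on per-frame audio features.
    `denoise_fn(x, t, cond)` predicts the noise component (a stand-in for a
    learned audio-conditioned denoising network)."""
    rng = np.random.default_rng(seed)
    n_frames = audio_feats.shape[0]
    betas = np.linspace(1e-4, 0.02, n_steps)   # simple linear schedule
    alphas = 1.0 - betas
    alpha_bars = np.cumprod(alphas)
    x = rng.standard_normal((n_frames, pose_dim))  # x_T ~ N(0, I)
    for t in reversed(range(n_steps)):
        eps_hat = denoise_fn(x, t, audio_feats)
        # DDPM posterior mean for x_{t-1}
        x = (x - betas[t] / np.sqrt(1.0 - alpha_bars[t]) * eps_hat) / np.sqrt(alphas[t])
        if t > 0:  # add noise at all but the final step
            x += np.sqrt(betas[t]) * rng.standard_normal(x.shape)
    return x  # [n_frames, pose_dim] pose sequence

# Dummy denoiser and conditioning, purely for illustration:
audio = np.zeros((120, 80))  # 120 frames of 80-dim audio features
motion = ddpm_sample(lambda x, t, c: np.zeros_like(x), audio)
print(motion.shape)  # (120, 63)
```

In the real system the lambda above is replaced by a trained network and the output is rendered as skeletal motion.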
For an introductory video, motion samples, and a general overview, please see our project page. For the new dance dataset with high-quality mocap, please see the Motorica Dance Dataset.
Installation
We provide a Dockerfile and a requirements.txt for installation with Docker or Conda.
Installation using Conda
conda install python=3.9
conda install -c conda-forge mpi4py mpich
pip install -r requirements.txt
Dance synthesis demo
Data and pretrained models
Please download our pretrained dance models here and move them to the pretrained_models folder.
We include processed music inputs from the test dataset in the data folder for generating dances from the model.
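The exact names and format of the processed inputs should be checked in the data folder. As an illustration only, audio conditioning is commonly stored as a per-frame feature matrix in a NumPy array; the file name and dimensions below are invented for the example:

```python
import os
import tempfile

import numpy as np

# Stand-in for a processed music input: 600 frames x 35 feature dimensions.
# The real files in the data folder may use a different name, format, and size.
path = os.path.join(tempfile.mkdtemp(), "music_features.npy")
np.save(path, np.random.randn(600, 35))

feats = np.load(path)
print(feats.shape)  # (600, 35)
```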
Synthesis scripts
You can use the following shell scripts for reproducing the dance user studies in the paper:
./experiments/dance_LDA.sh
./experiments/dance_LDA-U.sh
To try out locomotion synthesis, please go to https://www.motorica.ai/.
Training data
The four main training datasets from our SIGGRAPH 2023 paper are available online:
- The Trinity Speech Gesture Dataset
- The ZEGGS dataset
- The 100STYLE dataset
- The Motorica Dance Dataset, a new dataset with high-quality dance mocap released together with our paper
License and copyright information
The contents of this repository may not be used for any purpose other than academic research. It is free to use for research purposes by academic institutions, companies, and individuals. Use for commercial purposes is not permitted without prior written consent from Motorica AB. If you are interested in using the codebase, pretrained models, or the dataset for commercial or other non-research purposes, please contact us at info@motorica.ai in advance. Unauthorised redistribution is prohibited without written approval.
Attribution
Please include the following citation in any preprints and publications that use this repository.
@article{alexanderson2023listen,
title={Listen, Denoise, Action! Audio-Driven Motion Synthesis with Diffusion Models},
author={Alexanderson, Simon and Nagy, Rajmund and Beskow, Jonas and Henter, Gustav Eje},
year={2023},
issue_date={August 2023},
publisher={ACM},
volume={42},
number={4},
doi={10.1145/3592458},
journal={ACM Trans. Graph.},
articleno={44},
numpages={20},
pages={44:1--44:20}
}
The code for translation-invariant self-attention (TISA) was written by Ulme Wennberg. Please cite the corresponding ACL 2021 article if you use this code.