# Text2Motion

<img src="assets/tyranno_attacks.gif" width="400"/>

*A tyrannosaurus attacks*

Text2Motion is a deep learning system that translates text descriptions into realistic 3D animations for any given mesh. It automates complex animation workflows with an intelligent, three-stage pipeline:
1. **Classification:** The system first analyzes the input 3D mesh to identify its key features and determine the appropriate armature (skeleton).
2. **Skinning:** It then binds the mesh to the armature, creating a "skinned" model ready for realistic deformation.
3. **Generation:** Finally, a powerful diffusion model generates fluid and nuanced motion based on your text prompt, bringing the model to life.
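The three stages above can be sketched end to end. Everything in this snippet is illustrative: the class and function names are assumptions for the sake of the sketch, not the repository's actual API, and the stage bodies are stubs standing in for the real networks.

```python
# Illustrative sketch of the three-stage pipeline. All names and stage
# bodies here are hypothetical stubs, not the repository's real code.
from dataclasses import dataclass, field

@dataclass
class Mesh:
    vertices: list                      # (x, y, z) positions
    faces: list                         # vertex-index triples

@dataclass
class RiggedModel:
    mesh: Mesh
    armature: str                       # predicted skeleton category
    skin_weights: dict = field(default_factory=dict)  # vertex -> {joint: weight}

def classify(mesh: Mesh) -> str:
    """Stage 1 (stub): predict an armature category from mesh geometry."""
    return "biped" if len(mesh.vertices) % 2 == 0 else "quadruped"

def skin(mesh: Mesh, armature: str) -> RiggedModel:
    """Stage 2 (stub): bind each vertex to skeleton joints with blend weights."""
    weights = {i: {"root": 1.0} for i in range(len(mesh.vertices))}
    return RiggedModel(mesh=mesh, armature=armature, skin_weights=weights)

def generate_motion(model: RiggedModel, prompt: str) -> list:
    """Stage 3 (stub): a diffusion model would denoise a motion sequence here."""
    return [f"frame_{t}" for t in range(4)]  # placeholder motion frames

def text2motion(mesh: Mesh, prompt: str) -> list:
    """Chain the three stages: classify -> skin -> generate."""
    rig = skin(mesh, classify(mesh))
    return generate_motion(rig, prompt)
```

The point is the data flow: classification output feeds skinning, and the skinned model plus the prompt feed generation.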
## Getting Started

Follow these instructions to get the project up and running on your local machine.
### 1. Set up the environment

To set up the environment, run the following commands:

```bash
conda env create -f environment.yaml
conda activate text2motion
pip install torch-scatter -f https://data.pyg.org/whl/torch-2.4.1+cu121.html
pip install git+https://github.com/inbar-2344/Motion.git
```
### 2. Download the dataset

This project uses the Truebones Zoo dataset, a collection of animated animal motions. You can get the dataset for free from the Truebones Gumroad page. After downloading, extract the contents into a directory named `data` within the root of the project. The directory structure should look like this:

```
text2motion/
├── data/
│   └── Truebone_Z-OO/
│       └── ... (dataset files)
├── dataset/
├── models/
├── scripts/
├── utils/
├── train_classifier_skinning.py
├── train_diffusion.py
```
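A quick sanity check for the layout above can save a confusing failure later. This helper is an assumption for illustration, not part of the repository:

```python
# Hypothetical helper (not part of the repo): verify the Truebones dataset
# was extracted to the expected data/Truebone_Z-OO location.
from pathlib import Path

def check_dataset(root: str = ".") -> bool:
    dataset_dir = Path(root) / "data" / "Truebone_Z-OO"
    if not dataset_dir.is_dir():
        print(f"Dataset not found at {dataset_dir}; extract the "
              f"Truebones Zoo archive into data/ first.")
        return False
    return True
```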
### 3. Preprocess the dataset

To prepare the dataset for training, you must run two distinct preprocessing steps: one for the classification and skinning models, and a second for the diffusion model.

#### Preprocessing for Classification and Skinning

This first step is required by the classification and skinning models and needs Blender. It runs a script that automates the necessary operations inside Blender. Open a terminal and execute:

```bash
blender --background --python scripts/preprocessing_script.py
```
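If you prefer launching the step from Python, a thin wrapper can fail early when Blender is missing from `PATH`. This wrapper is an assumption for convenience, not something the repository ships:

```python
# Hypothetical wrapper (not part of the repo): run the preprocessing
# script in a headless Blender session, failing early if Blender is absent.
import shutil
import subprocess

def run_blender_preprocessing(script: str = "scripts/preprocessing_script.py",
                              blender_exe: str = "blender") -> int:
    blender = shutil.which(blender_exe)
    if blender is None:
        raise FileNotFoundError(
            f"'{blender_exe}' not found on PATH; install Blender first.")
    # Equivalent to: blender --background --python scripts/preprocessing_script.py
    result = subprocess.run([blender, "--background", "--python", script],
                            check=True)
    return result.returncode
```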
#### Preprocessing for the Diffusion Model

The second step prepares the data specifically for training the diffusion model. To preprocess it correctly, refer to the detailed documentation and scripts provided in the Anytop repository. Then place the resulting `truebones_processed` folder in `data`.
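Moving the Anytop output into place is a one-liner, but a small helper makes the target path explicit. The function below is an assumption for illustration, not a repository script:

```python
# Hypothetical helper (not part of the repo): move the Anytop output
# folder into the project's data/ directory as data/truebones_processed.
import shutil
from pathlib import Path

def install_processed(src: str, project_root: str = ".") -> Path:
    dest = Path(project_root) / "data" / "truebones_processed"
    dest.parent.mkdir(parents=True, exist_ok=True)
    shutil.move(src, dest)
    return dest
```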
## Usage

### Training the Models

The complete training process involves two distinct stages that must be run in order. Each stage trains a separate model that is essential for the final animation pipeline.
#### 1. Train the classifier and skinning model

This model learns to analyze a 3D mesh and predict its skeletal structure (armature) in the correct positions, with a weight for each vertex:

```bash
python -m train_classifier_skinning
```
#### 2. Train the diffusion model

This model learns to generate the actual motion sequence from a text prompt, which is then applied to the skinned model. See `utils/parser_util.py` for the available training arguments:

```bash
python -m train_diffusion --model_prefix NAME_MODEL --objects_subset bipeds --lambda_geo 1.0 --overwrite --balanced
```
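To make the flags in the command above concrete, here is a minimal `argparse` sketch mirroring only those flags. It is an assumption for illustration; the real definitions (and defaults) live in `utils/parser_util.py`:

```python
# Hypothetical argparse sketch covering only the flags shown above;
# the repository's actual parser in utils/parser_util.py differs.
import argparse

def build_parser() -> argparse.ArgumentParser:
    p = argparse.ArgumentParser(description="Train the diffusion model")
    p.add_argument("--model_prefix", required=True,
                   help="name used for checkpoints and logs")
    p.add_argument("--objects_subset", default="all",
                   help="subset of the dataset to train on, e.g. 'bipeds'")
    p.add_argument("--lambda_geo", type=float, default=0.0,
                   help="weight of the geometric loss term")
    p.add_argument("--overwrite", action="store_true",
                   help="replace an existing run with the same prefix")
    p.add_argument("--balanced", action="store_true",
                   help="sample classes in a balanced way")
    return p
```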
## Acknowledgments
- Truebones for providing the excellent and comprehensive Zoo dataset.
- Anytop, whose work provided the basis for our preprocessing scripts, visualization tools, and the core of our diffusion model.
- The authors of MDM for their insightful ideas that significantly influenced the architecture of our diffusion model.
- The RigNet project for developing the SkinNet model, which we have adapted for our skinning process.
- The creators of PointNet++ for the pioneering network architecture that underpins our classification and joint prediction models.
## License

This code is distributed under the MIT License. Note that our code depends on other libraries that have their own respective licenses, which must also be followed.
