[CVPR 2024] AMUSE: Emotional Speech-driven 3D Body Animation via Disentangled Latent Diffusion

Kiran Chhatre · Radek Daněček · Nikos Athanasiou
Giorgio Becherini · Christopher Peters · Michael J. Black · Timo Bolkart


Project Page · Paper PDF · Intro Video · Poster PDF



This is the repository for AMUSE: Emotional Speech-driven 3D Body Animation via Disentangled Latent Diffusion. AMUSE generates realistic emotional 3D body gestures directly from a speech sequence, and it gives the user control over the generated emotion by combining the driving speech with a different emotional audio input.

News 🚩

  • [2024/07/25] Data processing and gesture editing scripts are available.
  • [2024/06/12] Code is available.
  • [2024/02/27] AMUSE has been accepted for CVPR 2024! Working on code release.
  • [2023/12/08] The arXiv preprint is available.

Setup

Main Repo Setup

The project has been tested with the following configuration:

  • Operating System: Linux 5.14.0-1051-oem x86_64
  • GCC Version: gcc (Ubuntu 9.4.0-1ubuntu1~20.04.1) 9.4.0
  • CUDA Version: CUDA 11.3
  • Python Version: Python 3.8.15
  • GPU Configuration:
    • Audio Model: NVIDIA A100-SXM4-80GB
    • Motion Model: NVIDIA A100-SXM4-40GB, Tesla V100-32GB

Note: The audio model requires the larger GPU. Multi-GPU support is implemented for the audio model, but it was not used for the final version.

# Clone the main repository and the two helper repositories
git clone https://github.com/kiranchhatre/amuse.git
cd amuse/dm/utils/
git clone https://github.com/kiranchhatre/sk2torch.git
git clone -b init https://github.com/kiranchhatre/PyMO.git
cd ../..
git submodule update --remote --merge --init --recursive
git submodule sync

# Register the helper repositories as submodules
git submodule add https://github.com/kiranchhatre/sk2torch.git dm/utils/sk2torch
git submodule add -b init https://github.com/kiranchhatre/PyMO.git dm/utils/PyMO

git submodule update --init --recursive

git add .gitmodules dm/utils/sk2torch dm/utils/PyMO
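
As an optional sanity check (my addition, not part of the original setup steps), the following should list both helper repositories once the commands above have completed:

git submodule status    # should list dm/utils/sk2torch and dm/utils/PyMO
ls dm/utils/sk2torch dm/utils/PyMO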

Environment Setup

conda create -n amuse python=3.8
conda activate amuse
export CUDA_HOME=/is/software/nvidia/cuda-11.3    # adjust to your local CUDA 11.3 installation
conda install pytorch==1.12.1 torchvision==0.13.1 torchaudio==0.12.1 cudatoolkit=11.3 -c pytorch
conda env update --file amuse.yml --prune
module load cuda/11.3    # cluster-specific; skip if you are not using environment modules
conda install anaconda::gxx_linux-64    # installs gxx 11.2.0
FORCE_CUDA=1 pip install --no-index --no-cache-dir pytorch3d -f https://dl.fbaipublicfiles.com/pytorch3d/packaging/wheels/py38_cu113_pyt1110/download.html
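
An optional quick check (my addition, not part of the original instructions) to confirm that PyTorch sees the GPU and that pytorch3d imports cleanly:

python -c "import torch; print(torch.__version__, torch.version.cuda, torch.cuda.is_available())"
python -c "import pytorch3d; print(pytorch3d.__version__)"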

Blender Setup

conda deactivate
conda env create -f blender.yaml
AMUSEPATH=$(pwd)
cd ~
wget https://download.blender.org/release/Blender3.4/blender-3.4.1-linux-x64.tar.xz
tar -xvf ./blender-3.4.1-linux-x64.tar.xz
cd ~/blender-3.4.1-linux-x64/3.4
mv python/ _python/
ln -s ~/anaconda3/envs/blender ./python    # adjust to the path of the "blender" conda env created above
cd "$AMUSEPATH"
cd scripts
conda activate amuse
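
As an optional check (my addition), running Blender headless should report the Python interpreter of the linked conda environment rather than Blender's bundled one:

~/blender-3.4.1-linux-x64/blender --background --python-expr "import sys; print(sys.executable, sys.version)"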

Data Setup and Blender Resources

Follow the instructions at https://amuse.is.tue.mpg.de/download.php.


Tasks

Once the above setup is complete, you can run the following tasks:

  • train_audio (training step 1/2)
    Train the speech disentanglement model (AMUSE training step 1).

    cd $AMUSEPATH/scripts
    python main.py --fn train_audio
  • train_gesture (training step 2/2)
    Train the gesture generation model (AMUSE training step 2).

    cd $AMUSEPATH/scripts
    python main.py --fn train_gesture
  • infer_gesture
    Run AMUSE inference on a single 10-second WAV monologue.
    Place the audio in $AMUSEPATH/viz_dump/test/speech; a short audio-preparation sketch is given after this task list.
    The video of the generated gesture will be written to $AMUSEPATH/viz_dump/test/gesture.

    cd $AMUSEPATH/scripts
    python main.py --fn infer_gesture
  • edit_gesture
    Edit generated gestures, e.g., change their emotion or transfer style between sequences.

    cd $AMUSEPATH/scripts
    python main.py --fn edit_gesture

    For extensive editing options, refer to the process_loader function in infer_ldm.py and experiment with the emotion_control, style_transfer, and style_Xemo_transfer configurations; a search command for locating them is sketched after this task list. Editing gestures directly from speech is challenging but offers intriguing possibilities: the task involves many combinations, and not all of them yield optimal results. Figures A.11 and A.12 in the supplementary material illustrate the inherent complexities and variations in this process. [Video: a gesture-editing demonstration is available on YouTube.]

  • bvh2smplx_
    Convert BVH to SMPL-X using the provided BMAP presets from the AMUSE website download page. Place the BVH file inside $AMUSEPATH/data/beat-rawdata-eng/beat_rawdata_english/<<actor_id>>, where actor_id is a number between 1 and 30. The converted file will be located in $AMUSEPATH/viz_dump/smplx_conversions.

    cd $AMUSEPATH/scripts
    python main.py --fn bvh2smplx_

    Once converted, import the file into Blender using the SMPL-X Blender add-on. Remember to set the target FPS (24 FPS for the current file) in the import-animation window when importing the NPZ file.

  • prepare_data
    Prepare the data and create an LMDB file for training AMUSE. We provide the AMUSE-BEAT version on the project webpage. To train AMUSE on a custom dataset, you will need aligned motion and speech files; the motion data should be an animation NPZ file compatible with the SMPL-X format (a quick way to inspect such a file is sketched after this task list).

    cd $AMUSEPATH/scripts
    python main.py --fn prepare_data
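
Audio preparation for infer_gesture (referenced above): if your recording is not already a 10-second WAV, a conversion along these lines can produce one before you place it in $AMUSEPATH/viz_dump/test/speech. This is only a suggested sketch: the input filename is hypothetical, and the 16 kHz mono settings are my assumption since the required format is not stated here.

# hypothetical input file; trim to 10 s and write a mono 16 kHz WAV
# (adjust -ac / -ar if AMUSE expects a different channel count or sample rate)
ffmpeg -i my_monologue.mp3 -t 10 -ac 1 -ar 16000 "$AMUSEPATH/viz_dump/test/speech/speech.wav"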
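
Locating the edit_gesture configurations (referenced above): the relevant options live in infer_ldm.py, and one way to find them is a simple search from the repository root:

# print the lines in infer_ldm.py that mention the editing configurations
find "$AMUSEPATH" -name infer_ldm.py \
  -exec grep -n "emotion_control\|style_transfer\|style_Xemo_transfer" {} +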
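
Inspecting a motion file for prepare_data (referenced above): the snippet below is a minimal sketch assuming a typical SMPL-X animation NPZ; the filename is hypothetical, and the exact keys vary between exporters, so it simply lists whatever is stored.

# list the arrays stored in an SMPL-X animation NPZ
# (keys such as 'poses', 'trans', 'betas', 'gender', 'mocap_frame_rate' are common)
python - <<'EOF'
import numpy as np
data = np.load("motion.npz", allow_pickle=True)   # hypothetical file name
for key in data.files:
    arr = data[key]
    print(key, getattr(arr, "shape", None), getattr(arr, "dtype", type(arr)))
EOF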

Citation

If you find the Model & Software, BVH2SMPLX conversion tool, and SMPLX Blender addon-based visualization software useful in your research, we kindly ask that you cite our work:

@InProceedings{Chhatre_2024_CVPR,
    author    = {Chhatre, Kiran and Daněček, Radek and Athanasiou, Nikos and Becherini, Giorgio and Peters, Christopher and Black, Michael J. and Bolkart, Timo},
    title     = {{AMUSE}: Emotional Speech-driven {3D} Body Animation via Disentangled Latent Diffusion},
    booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
    month     = {June},
    year      = {2024},
    pages     = {1942-1953},
    url = {https://amuse.is.tue.mpg.de},
}

Additionally, if you use the AMUSE-BEAT data in your research, please also consider citing both the AMUSE and EMAGE projects.


License

Software Copyright License for non-commercial scientific research purposes. Please read carefully the following terms and conditions and any accompanying documentation before you download and/or use AMUSE model, AMUSE-BEAT data and software, (the "Data & Software"), including 3D meshes, images, videos, textures, software, scripts, and animations. By downloading and/or using the Data & Software (including downloading, cloning, installing, and any other use of the corresponding github repository), you acknowledge that you have read these terms and conditions, understand them, and agree to be bound by them. If you do not agree with these terms and conditions, you must not download and/or use the Data & Software. Any infringement of the terms of this agreement will automatically terminate your rights under this License.


Acknowledgments

We would like to extend our gratitude to the authors and contributors of the following open-source projects, whose work has significantly influenced and supported our implementation: EVP, Motion Diffusion Model, Motion Latent Diffusion, AST, ACTOR, and SMPL-X. We also wish to thank SlimeVRX for their collaboration on the development of the bvh2smplx_ task. For a more detailed list of acknowledgments, please refer to our paper.


Contact

For any inquiries, please contact amuse@tue.mpg.de. Feel free to use this project and contribute to its improvement. For commercial use of the Data & Software, please email ps-license@tue.mpg.de.