Official PyTorch implementation of Hybrid Video Diffusion Models with 2D Triplane and 3D Wavelet Representation

hxngiee/HVDM

HVDM: Hybrid Video Diffusion Models with 2D Triplane and 3D Wavelet Representation

Official PyTorch implementation of "Hybrid Video Diffusion Models with 2D Triplane and 3D Wavelet Representation".

1. Environment setup

conda create -n hvdm python=3.8 -y
conda activate hvdm
pip install torch==2.0.1 torchvision==0.15.2 torchaudio==2.0.2
pip install natsort tqdm gdown omegaconf einops lpips pyspng tensorboard imageio av moviepy PyWavelets
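
After installation, a quick import check can confirm that the packages listed above are visible to the active interpreter. The sketch below is not part of the repository. The helper name `missing_modules` is hypothetical, and the entries in `REQUIRED` are the import names we assume correspond to the pip packages (for example, PyWavelets is imported as `pywt`).

```python
import importlib.util

# Import names assumed to correspond to the pip packages installed above
# (PyWavelets is imported as "pywt"; the others match their package names).
REQUIRED = [
    "torch", "torchvision", "torchaudio", "natsort", "tqdm", "gdown",
    "omegaconf", "einops", "lpips", "pyspng", "tensorboard", "imageio",
    "av", "moviepy", "pywt",
]


def missing_modules(names):
    """Return the subset of `names` that the import system cannot locate."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules(REQUIRED)
    print("missing:", missing if missing else "none")
```

Run it inside the `hvdm` environment; any name it reports can then be reinstalled with pip.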

2. Dataset

Dataset download

We conduct experiments on three datasets: SkyTimelapse, UCF-101, and TaiChi. Download each dataset and place it in the /data folder, following the directory structure below. To store the data elsewhere, change the data_location variable in tools/dataloader.py.

Directory structure

The datasets and checkpoints should be placed in the following structure:

HVDM
├── configs
├── data
│   ├── SKY
│   │   ├── 001.png
│   │   └── ...
│   ├── TaiChi
│   │   ├── 001.png
│   │   └── ...
│   └── UCF-101
│       └── folder
│           ├── 001.avi
│           └── ...
├── ...
├── results
│   ├── ddpm_final_[DATASET]_42
│   │   ├── model_[EPOCH].pth
│   │   └── ...
│   └── first_stage_ae_final_[DATASET]_42
│       ├── model_[EPOCH].pth
│       └── ...
├── tools
└── main.py
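
Before training, it can help to confirm that the working tree matches the layout above. The following is a minimal sketch, not code from the repository: `missing_paths` is a hypothetical helper, and the directory names are copied from the tree, so adjust them if your layout differs.

```python
from pathlib import Path

# Top-level entries and dataset folders taken from the tree above.
EXPECTED_DIRS = ["configs", "data", "results", "tools"]
DATASET_DIRS = ["SKY", "TaiChi", "UCF-101"]


def missing_paths(root, datasets=DATASET_DIRS):
    """Return the expected paths (relative to `root`) that do not exist."""
    root = Path(root)
    expected = [Path(name) for name in EXPECTED_DIRS]
    expected += [Path("data") / name for name in datasets]
    expected.append(Path("main.py"))
    return [p.as_posix() for p in expected if not (root / p).exists()]


if __name__ == "__main__":
    # Check the current directory, assuming it is the HVDM checkout.
    print(missing_paths("."))
```

Pass only the datasets you have downloaded, for example `missing_paths(".", datasets=["SKY"])`, to avoid reports about datasets you do not plan to use.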

3. Training

For settings related to the experiment name, please refer to PVDM, the repository on which our code is based. Here, [EXP_NAME] is an experiment name you want to specify, [DATASET] is one of SKY, UCF101, or TaiChi, and [AUTOENCODER DIRECTORY] denotes the directory of the autoencoder to be used.

Autoencoder

python main.py \
--exp first_stage \
--id [EXP_NAME] \
--pretrain_config configs/autoencoder/base.yaml \
--data [DATASET] \
--batch_size [BATCH_SIZE]

This script automatically saves logs and checkpoints in the ./results folder.

Diffusion model

python main.py \
--exp ddpm \
--id [EXP_NAME] \
--pretrain_config configs/autoencoder/base.yaml \
--data [DATASET] \
--first_model [AUTOENCODER DIRECTORY] \
--diffusion_config configs/latent-diffusion/base.yaml \
--batch_size [BATCH_SIZE]
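
Because the two training stages share most of their arguments, it may be convenient to build both commands from one place. The sketch below only assembles argument lists from the flags shown above. The function names and the example values are hypothetical, and nothing here is part of the repository.

```python
import shlex


def first_stage_cmd(exp_name, dataset, batch_size):
    """Build the autoencoder training command as an argv list."""
    return [
        "python", "main.py",
        "--exp", "first_stage",
        "--id", exp_name,
        "--pretrain_config", "configs/autoencoder/base.yaml",
        "--data", dataset,
        "--batch_size", str(batch_size),
    ]


def ddpm_cmd(exp_name, dataset, first_model, batch_size):
    """Build the diffusion model training command as an argv list."""
    return [
        "python", "main.py",
        "--exp", "ddpm",
        "--id", exp_name,
        "--pretrain_config", "configs/autoencoder/base.yaml",
        "--data", dataset,
        "--first_model", first_model,
        "--diffusion_config", "configs/latent-diffusion/base.yaml",
        "--batch_size", str(batch_size),
    ]


if __name__ == "__main__":
    # Hypothetical values for illustration only; substitute your own.
    print(shlex.join(first_stage_cmd("my_exp", "SKY", 4)))
    print(shlex.join(ddpm_cmd("my_exp", "SKY", "[AUTOENCODER DIRECTORY]", 4)))
```

The returned lists can be passed directly to `subprocess.run(...)`, which avoids the line-continuation mistakes that are easy to make when typing multi-line shell commands.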

4. Inference

We are currently working on incorporating code for Image2Video and Video Dynamics Control. Model checkpoints will also be released soon.

Short Video Generation

python sample.py \
--exp ddpm \
--first_model './results/model_[EPOCH].pth' \
--second_model 'results/ddpm_main_UCF101_42/ema_model_[EPOCH].pth' \
--mode short

Long Video Generation

python sample.py \
--exp ddpm \
--first_model './results/model_[EPOCH].pth' \
--second_model 'results/ddpm_main_[DATASET]_42/ema_model_[EPOCH].pth' \
--mode long
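
Both sampling commands expect a concrete checkpoint file in place of model_[EPOCH].pth. As a minimal sketch, and assuming [EPOCH] is a plain integer in the file name (the README does not state the exact format), the most recent checkpoint in a directory could be located as follows. `latest_checkpoint` is a hypothetical helper, not part of the repository.

```python
import re
from pathlib import Path


def latest_checkpoint(directory, prefix="model_"):
    """Return the path of `<prefix><int>.pth` with the largest integer, or None."""
    pattern = re.compile(re.escape(prefix) + r"(\d+)\.pth")
    best_epoch, best_path = -1, None
    for path in Path(directory).glob("*.pth"):
        # fullmatch ensures "ema_model_10.pth" is not picked up for "model_".
        match = pattern.fullmatch(path.name)
        if match and int(match.group(1)) > best_epoch:
            best_epoch, best_path = int(match.group(1)), path
    return best_path
```

Calling it with `prefix="ema_model_"` selects among the EMA weights passed to `--second_model`, while the default prefix matches the autoencoder checkpoints passed to `--first_model`.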

Citation

@article{kim2024hybrid,
  title={Hybrid Video Diffusion Models with 2D Triplane and 3D Wavelet Representation},
  author={Kim, Kihong and Lee, Haneol and Park, Jihye and Kim, Seyeon and Lee, Kwanghee and Kim, Seungryong and Yoo, Jaejun},
  journal={arXiv preprint arXiv:2402.13729},
  year={2024}
}

Reference

HVDM draws significant inspiration from the following repositories: pvdm, wavediff, latent-diffusion, and stylegan2-ada-pytorch. We thank all contributors for making their work openly accessible.
