VEnhancer is a generative space-time enhancement framework that can improve existing text-to-video (T2V) results.
(Side-by-side comparison: VideoCrafter2 result vs. the same result with +VEnhancer.)
📖 For more visual results, check out our project page.
- [2024.07.28] Inference code and pretrained video enhancement model are released.
- [2024.07.10] This repo is created.
The architecture of VEnhancer. It follows ControlNet and copies the architecture and weights of the multi-frame encoder and middle block of a pretrained video diffusion model to build a trainable condition network.
This video ControlNet accepts low-resolution key frames as well as full frames of noisy latents as inputs.
In addition, the noise level used for noise augmentation and the downscaling factor are injected as extra conditions through video-aware conditioning.
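To make this conditioning pattern concrete, here is a minimal PyTorch sketch of a ControlNet-style condition branch: a trainable copy of encoder blocks processes the low-resolution frames together with the noisy latents, and zero-initialized convolutions inject its features into the frozen base model. All module and parameter names below are illustrative assumptions, not VEnhancer's actual code.

```python
import torch
import torch.nn as nn

def zero_conv(channels: int) -> nn.Conv2d:
    # 1x1 conv initialized to zero, as in ControlNet: the condition branch
    # contributes nothing at the start of training, so the frozen base
    # model's behavior is initially preserved.
    conv = nn.Conv2d(channels, channels, kernel_size=1)
    nn.init.zeros_(conv.weight)
    nn.init.zeros_(conv.bias)
    return conv

class TinyBlock(nn.Module):
    """Stand-in for one encoder block of the base video diffusion model."""
    def __init__(self, channels: int, emb_dim: int):
        super().__init__()
        self.conv = nn.Conv2d(channels, channels, 3, padding=1)
        self.emb_proj = nn.Linear(emb_dim, channels)

    def forward(self, x, emb):
        # Inject the conditioning embedding (timestep plus video-aware
        # conditions such as augmentation noise level and downscaling factor).
        return torch.relu(self.conv(x) + self.emb_proj(emb)[:, :, None, None])

class ConditionNet(nn.Module):
    """ControlNet-style trainable copy of the encoder (illustrative only)."""
    def __init__(self, channels: int = 64, emb_dim: int = 128, depth: int = 3):
        super().__init__()
        self.blocks = nn.ModuleList(TinyBlock(channels, emb_dim) for _ in range(depth))
        self.zero_convs = nn.ModuleList(zero_conv(channels) for _ in range(depth))

    def forward(self, noisy_latents, lr_key_frames, cond_emb):
        # The condition branch sees both the noisy latents and the
        # low-resolution key frames (summed here for brevity).
        h = noisy_latents + lr_key_frames
        residuals = []
        for block, zconv in zip(self.blocks, self.zero_convs):
            h = block(h, cond_emb)
            residuals.append(zconv(h))  # added to the frozen UNet's skip features
        return residuals

# Smoke test with (batch*frames, channels, H, W) latents.
net = ConditionNet()
x = torch.randn(2, 64, 32, 32)
res = net(x, torch.randn_like(x), torch.randn(2, 128))
print([r.shape for r in res])
```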
```bash
# clone this repo
git clone https://github.com/Vchitect/VEnhancer.git
cd VEnhancer

# create environment
conda create -n venhancer python=3.10
conda activate venhancer
pip install torch==2.0.1 torchvision==0.15.2 torchaudio==2.0.2
pip install -r requirements.txt
```
Note that the `ffmpeg` command must be available. If you have sudo access, you can install it with:

```bash
sudo apt-get update && sudo apt-get install -y ffmpeg libsm6 libxext6
```
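If you want to verify from Python that ffmpeg is visible before running inference, a minimal check (not part of the repo) is:

```python
import shutil
import subprocess

# Fail early if ffmpeg is not on PATH; VEnhancer's video I/O relies on it.
if shutil.which("ffmpeg") is None:
    raise RuntimeError("ffmpeg not found; install it before running VEnhancer")
print(subprocess.run(["ffmpeg", "-version"], capture_output=True, text=True).stdout.splitlines()[0])
```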
| Model Name | Description | HuggingFace | BaiduNetdisk |
| :--- | :--- | :--- | :--- |
| venhancer_paper.pth | video enhancement model, paper version | download | download |
- Download the CLIP model via open clip, Stable Diffusion's VAE via sd2.1, and the VEnhancer model, then put these three checkpoints in the `VEnhancer/ckpts` directory (a scripted alternative is sketched after the command below).
- Run the following command:
```bash
bash run_VEnhancer.sh
```
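If you prefer to script the checkpoint download, here is a sketch using `huggingface_hub.hf_hub_download`. The repo IDs and filenames below are placeholders you should replace with the actual entries linked from the model table above.

```python
from huggingface_hub import hf_hub_download

# Placeholder repo IDs / filenames -- substitute the real ones from the
# download links in the table above before running.
CKPTS = [
    ("laion/CLIP-ViT-H-14-laion2B-s32B-b79K", "open_clip_pytorch_model.bin"),  # CLIP (open clip)
    ("stabilityai/stable-diffusion-2-1", "v2-1_768-ema-pruned.ckpt"),          # SD 2.1 (for its VAE)
    ("some-org/VEnhancer", "venhancer_paper.pth"),                             # VEnhancer (hypothetical repo id)
]

for repo_id, filename in CKPTS:
    # Saves each checkpoint under VEnhancer/ckpts, matching the layout
    # the run script expects.
    path = hf_hub_download(repo_id=repo_id, filename=filename, local_dir="ckpts")
    print("saved", path)
```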
If you use our work in your research, please cite our publication:
```bibtex
@article{he2024venhancer,
  title={VEnhancer: Generative Space-Time Enhancement for Video Generation},
  author={He, Jingwen and Xue, Tianfan and Liu, Dongyang and Lin, Xinqi and Gao, Peng and Lin, Dahua and Qiao, Yu and Ouyang, Wanli and Liu, Ziwei},
  journal={arXiv preprint arXiv:2407.07667},
  year={2024}
}
```
Our codebase builds on modelscope. Thanks to the authors for sharing their awesome codebase!
If you have any questions, please feel free to reach us at hejingwenhejingwen@outlook.com.