This repository stems from the original Colab notebook.
Additional 4x upscaling is done with ESRGAN.
VQ-GAN original paper: https://arxiv.org/abs/2012.09841
CLIP original paper: https://arxiv.org/abs/2103.00020
ESRGAN original paper: https://arxiv.org/abs/1809.00219
A very nice introduction to the technique: Alien Dreams
If you want to run the script on a GPU, first install PyTorch with CUDA support.
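For example, a CUDA-enabled PyTorch build can be installed from the official wheel index (the `cu118` tag below is only an assumption; pick the tag matching your CUDA version from the selector on pytorch.org):

```sh
# Install a CUDA-enabled PyTorch build; replace cu118 with the tag
# matching your local CUDA toolkit (see pytorch.org for the selector).
pip install torch torchvision --index-url https://download.pytorch.org/whl/cu118
```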
```sh
git clone https://github.com/openai/CLIP
git clone https://github.com/CompVis/taming-transformers
git clone https://github.com/xinntao/ESRGAN
```
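Since these repositories are cloned rather than pip-installed, the script needs them on the Python import path. A minimal sketch, assuming the clones sit next to the script (the exact paths used by `CLIP_VQGAN.py` may differ):

```python
import sys

# Make the cloned repositories importable; paths are assumed to be
# relative to the repository root where the script is run from.
sys.path.append("./CLIP")
sys.path.append("./taming-transformers")
sys.path.append("./ESRGAN")
```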
```sh
pip install ftfy
pip install regex
pip install tqdm
pip install omegaconf
pip install pytorch-lightning
pip install kornia
pip install einops
pip install imageio-ffmpeg
pip install opencv-python
```
Copy the pretrained models into `models/`.
Additional links to models - work in progress...
```sh
python CLIP_VQGAN.py -texts your_text_prompt
```
Additional run options:
- `-width` - Image width
- `-height` - Image height
- `-model` - Pretrained VQ-GAN model to use
- `-display_int` - Display interval during image generation
- `-init_image` - Starting image instead of random noise
- `-target_images` - Target images instead of a text prompt
- `-seed` - Random seed
- `-max_iterations` - Maximum number of optimization iterations
- `-make_video` - Make a video from the generated images
- `-upscale` - 4x upscale the generated images with ESRGAN
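The options above could be wired up with `argparse` roughly as follows. This is only a sketch: the flag names mirror the README, but the types and defaults are assumptions, not the script's actual definitions.

```python
import argparse

def build_parser():
    # Hypothetical parser mirroring the README's flags; defaults and
    # types are assumptions, not taken from CLIP_VQGAN.py itself.
    p = argparse.ArgumentParser(
        description="CLIP-guided VQ-GAN image generation")
    p.add_argument("-texts", type=str,
                   help="Text prompt to steer generation")
    p.add_argument("-width", type=int, default=256, help="Image width")
    p.add_argument("-height", type=int, default=256, help="Image height")
    p.add_argument("-model", type=str,
                   help="Pretrained VQ-GAN model to use")
    p.add_argument("-display_int", type=int, default=50,
                   help="Display interval during image generation")
    p.add_argument("-init_image", type=str, default=None,
                   help="Starting image instead of random noise")
    p.add_argument("-target_images", type=str, default=None,
                   help="Target images instead of a text prompt")
    p.add_argument("-seed", type=int, default=None, help="Random seed")
    p.add_argument("-max_iterations", type=int, default=500,
                   help="Maximum number of optimization iterations")
    p.add_argument("-make_video", action="store_true",
                   help="Make a video from the generated images")
    p.add_argument("-upscale", action="store_true",
                   help="4x upscale the generated images with ESRGAN")
    return p

args = build_parser().parse_args(
    ["-texts", "a sunset over the sea", "-seed", "42", "-upscale"])
print(args.texts, args.seed, args.upscale)
```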
Work in progress...