This project is a PyTorch implementation of the paper Generative Adversarial Text-to-Image Synthesis [1]. It trains a conditional generative adversarial network (CGAN) that takes text descriptions as conditioning inputs and generates matching images. The model architecture is based on DCGAN (Deep Convolutional Generative Adversarial Network).
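To make the conditioning idea concrete, here is a minimal sketch of a DCGAN-style generator that concatenates a noise vector with a projected text embedding. It is an illustration only and is not taken from `models/dcgan_model.py`. The class name `Generator`, the parameter names, the 1024-dimensional text embedding, the 128-dimensional projection, and the 64x64 output size are all assumptions chosen for this example.

```python
import torch
import torch.nn as nn


class Generator(nn.Module):
    """Illustrative text-conditioned DCGAN generator (hypothetical, not the repo's model)."""

    def __init__(self, noise_dim=100, embed_dim=1024, proj_dim=128, base=64):
        super().__init__()
        # Assumption: the text embedding is compressed before being joined with the noise.
        self.project = nn.Sequential(
            nn.Linear(embed_dim, proj_dim),
            nn.BatchNorm1d(proj_dim),
            nn.LeakyReLU(0.2, inplace=True),
        )
        self.net = nn.Sequential(
            # (noise_dim + proj_dim) x 1 x 1 -> (base*8) x 4 x 4
            nn.ConvTranspose2d(noise_dim + proj_dim, base * 8, 4, 1, 0, bias=False),
            nn.BatchNorm2d(base * 8),
            nn.ReLU(True),
            # -> (base*4) x 8 x 8
            nn.ConvTranspose2d(base * 8, base * 4, 4, 2, 1, bias=False),
            nn.BatchNorm2d(base * 4),
            nn.ReLU(True),
            # -> (base*2) x 16 x 16
            nn.ConvTranspose2d(base * 4, base * 2, 4, 2, 1, bias=False),
            nn.BatchNorm2d(base * 2),
            nn.ReLU(True),
            # -> base x 32 x 32
            nn.ConvTranspose2d(base * 2, base, 4, 2, 1, bias=False),
            nn.BatchNorm2d(base),
            nn.ReLU(True),
            # -> 3 x 64 x 64, squashed to [-1, 1] by tanh
            nn.ConvTranspose2d(base, 3, 4, 2, 1, bias=False),
            nn.Tanh(),
        )

    def forward(self, noise, text_embedding):
        # Condition the generator by concatenating noise and projected text features.
        cond = self.project(text_embedding)
        x = torch.cat([noise, cond], dim=1)
        # Reshape (N, C) to (N, C, 1, 1) so the transposed convolutions can upsample it.
        return self.net(x.unsqueeze(-1).unsqueeze(-1))
```

With these assumed sizes, `Generator()(torch.randn(4, 100), torch.randn(4, 1024))` returns a tensor of shape `(4, 3, 64, 64)`. A discriminator would typically mirror this layout, combining image features with the same projected text embedding before its final classification layer.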
- h5py==3.6.0
- numpy==1.21.5
- Pillow==10.0.0
- torch==2.0.0
We used the Caltech-UCSD Birds 200 dataset and the text embeddings provided by Reed et al. [1].
├── models
│   └── dcgan_model.py
├── utils.py
├── data_util.py
├── requirements.txt
└── DCGAN_Text2Image.ipynb
[1] Generative Adversarial Text-to-Image Synthesis https://arxiv.org/abs/1605.05396