A web application that generates an image from a textual description. The user enters a text description of a scene, and the application generates an image that best matches that description. The application uses Generative Adversarial Networks (GANs) trained on a large dataset of images spanning multiple everyday-object categories.
Explanation:
- Input is a text description of a scenario/object.
- GANs accept input in the form of vector representations.
- Hence, the text description must first be converted to word embeddings, i.e., vector representations of the text.
- A char-CNN-RNN model is used for this conversion (a minimal sketch of such an encoder is given after this list).
- The resulting vector representations are then passed to the model through AJAX calls (a server-side sketch of this round trip also follows the list).
- We use a stacked Generative Adversarial Network architecture consisting of two stages (see the generator sketch below).
- Stage-I GAN sketches the primitive shape and colors of a scene.
- Stage-II GAN adds finer details to the low-resolution image produced by Stage-I.
- The final image generated by the model is passed back to the chatbot interface, again through AJAX calls.
- The image corresponding to the text description is thus rendered in the chatbot interface itself.
For more detailed explanations of the concepts involved, please read Project Report.pdf.
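The exact encoder used by the project is described in Project Report.pdf. As a rough illustration only, below is a minimal char-CNN-RNN-style text encoder; the class name, layer sizes, and the use of PyTorch are assumptions for illustration, not the project's actual implementation.

```python
import torch
import torch.nn as nn

class CharCNNRNNEncoder(nn.Module):
    """Minimal char-CNN-RNN-style encoder: characters -> conv features -> GRU -> embedding.
    (Illustrative sketch; sizes are arbitrary assumptions.)"""
    def __init__(self, vocab_size=128, char_dim=32, conv_dim=64, embed_dim=256):
        super().__init__()
        self.char_embed = nn.Embedding(vocab_size, char_dim)   # one vector per character
        self.conv = nn.Conv1d(char_dim, conv_dim, kernel_size=3, padding=1)
        self.rnn = nn.GRU(conv_dim, embed_dim, batch_first=True)

    def forward(self, char_ids):                       # char_ids: (batch, seq_len) character codes
        x = self.char_embed(char_ids)                  # (batch, seq_len, char_dim)
        x = torch.relu(self.conv(x.transpose(1, 2)))   # convolve over the character axis
        _, h = self.rnn(x.transpose(1, 2))             # final hidden state summarises the text
        return h.squeeze(0)                            # (batch, embed_dim) text embedding

# Example: embed one description (ASCII codes used as character ids)
encoder = CharCNNRNNEncoder()
text = "a small yellow bird with black wings"
ids = torch.tensor([[ord(c) for c in text]])
embedding = encoder(ids)                               # shape: (1, 256)
```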
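Similarly, the two-stage generation can be pictured with the simplified sketch below. The layer counts, channel sizes, and output resolutions (64x64 for Stage-I, 256x256 for Stage-II) are assumptions chosen for illustration; only the overall coarse-to-fine structure mirrors the description above.

```python
import torch
import torch.nn as nn

def upsample_block(in_ch, out_ch):
    # Doubles the spatial resolution with a transposed convolution.
    return nn.Sequential(
        nn.ConvTranspose2d(in_ch, out_ch, kernel_size=4, stride=2, padding=1),
        nn.BatchNorm2d(out_ch),
        nn.ReLU(inplace=True),
    )

class StageIGenerator(nn.Module):
    """Noise + text embedding -> coarse 64x64 image (primitive shapes and colours)."""
    def __init__(self, noise_dim=100, embed_dim=256):
        super().__init__()
        self.fc = nn.Linear(noise_dim + embed_dim, 512 * 4 * 4)
        self.up = nn.Sequential(
            upsample_block(512, 256),   # 4 -> 8
            upsample_block(256, 128),   # 8 -> 16
            upsample_block(128, 64),    # 16 -> 32
            upsample_block(64, 32),     # 32 -> 64
            nn.Conv2d(32, 3, kernel_size=3, padding=1),
            nn.Tanh(),
        )

    def forward(self, noise, text_embedding):
        x = self.fc(torch.cat([noise, text_embedding], dim=1)).view(-1, 512, 4, 4)
        return self.up(x)               # (batch, 3, 64, 64)

class StageIIGenerator(nn.Module):
    """Low-res image + text embedding -> refined 256x256 image (finer details)."""
    def __init__(self, embed_dim=256):
        super().__init__()
        self.encode = nn.Conv2d(3 + embed_dim, 64, kernel_size=3, padding=1)
        self.up = nn.Sequential(
            upsample_block(64, 64),     # 64 -> 128
            upsample_block(64, 32),     # 128 -> 256
            nn.Conv2d(32, 3, kernel_size=3, padding=1),
            nn.Tanh(),
        )

    def forward(self, low_res, text_embedding):
        b, _, h, w = low_res.shape
        # Tile the text embedding over the image and concatenate channel-wise.
        cond = text_embedding.view(b, -1, 1, 1).expand(b, text_embedding.size(1), h, w)
        x = torch.relu(self.encode(torch.cat([low_res, cond], dim=1)))
        return self.up(x)               # (batch, 3, 256, 256)

# Example: coarse image from Stage-I, refined by Stage-II
embedding = torch.randn(1, 256)                     # stand-in for the char-CNN-RNN output
noise = torch.randn(1, 100)
low_res = StageIGenerator()(noise, embedding)       # coarse 64x64 image
high_res = StageIIGenerator()(low_res, embedding)   # refined 256x256 image
```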
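Finally, the AJAX round trip between the chatbot front end and the model could look roughly like the server-side sketch below. The Flask framework, the /generate route, and the generate_image helper are hypothetical placeholders used only to illustrate the flow; the repository's actual server logic lives in new_main.py.

```python
import base64
import io

from flask import Flask, jsonify, request
from PIL import Image

app = Flask(__name__)

def generate_image(description: str) -> Image.Image:
    # Placeholder for the real pipeline: embed the text with the char-CNN-RNN
    # encoder, run the Stage-I and Stage-II generators, and return the image.
    raise NotImplementedError

@app.route("/generate", methods=["POST"])
def generate():
    # The chatbot front end sends the text description in an AJAX POST request.
    description = request.get_json()["description"]
    image = generate_image(description)

    # Encode the generated image as a base64 PNG so the front end can render it.
    buffer = io.BytesIO()
    image.save(buffer, format="PNG")
    encoded = base64.b64encode(buffer.getvalue()).decode("ascii")
    return jsonify({"image": encoded})

if __name__ == "__main__":
    app.run(debug=True)
```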
How to Run:
- Clone/download the repository.
- Open the folder in a terminal.
- Run the command: `python new_main.py`
- Open the link printed in the terminal in a web browser.
- Use the application.