A web app that generates captions for images using a CNN-RNN architecture

Dev228-afk/Image-Caption-Generator

Image-Caption-Generator

A web app that generates captions for images using a CNN-RNN model.

Application Link: https://dev228-afk-image-caption-generator-app-c6ckdt.streamlitapp.com/

Model

The model is a CNN-RNN network built with the Keras Sequential API. It consists of the following components:

  1. CNN Encoder Model: a pretrained CNN that extracts feature vectors from the input and training images. The encoder is an Xception model used via transfer learning with its pretrained weights.
  2. Word Embedding Layer: converts caption tokens into word embeddings, with input/output dimensions of (32, 256).
  3. LSTM Decoder Model: an LSTM handles text sequence processing in the encoder-decoder architecture. It takes a pair of inputs, the image feature vector and a partial caption, and returns the predicted caption for the input image.
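
The way the decoder consumes an (image features, partial caption) pair can be sketched in plain Python. This is a minimal illustration, not the repository's code: the `start`/`end` marker words, the left-padding to 32 tokens, the 2048-length feature vector, and the names `pad`, `make_training_pairs`, `greedy_caption`, and `toy_predict_next` are all assumptions made for the example. The toy predictor stands in for the trained CNN-RNN model.

```python
from typing import Callable, Dict, List, Sequence, Tuple

MAX_LENGTH = 32  # assumed caption length, taken from the (32, 256) embedding shape


def pad(seq: Sequence[int], max_length: int = MAX_LENGTH) -> List[int]:
    """Left-pad a token-id sequence with zeros (0 is reserved for padding)."""
    seq = list(seq)[-max_length:]
    return [0] * (max_length - len(seq)) + seq


def make_training_pairs(caption_ids: Sequence[int]) -> List[Tuple[List[int], int]]:
    """Split one tokenized caption into (partial caption, next word) pairs."""
    return [(pad(caption_ids[:i]), caption_ids[i]) for i in range(1, len(caption_ids))]


def greedy_caption(
    features: Sequence[float],
    predict_next: Callable[[Sequence[float], List[int]], Sequence[float]],
    word_to_id: Dict[str, int],
    max_length: int = MAX_LENGTH,
) -> str:
    """Grow a caption one word at a time, always taking the top-scoring word."""
    id_to_word = {i: w for w, i in word_to_id.items()}
    words = ["start"]  # hypothetical start-of-caption marker
    for _ in range(max_length):
        partial = pad([word_to_id[w] for w in words], max_length)
        scores = predict_next(features, partial)
        next_id = max(range(len(scores)), key=scores.__getitem__)
        word = id_to_word.get(next_id)
        if word is None or word == "end":  # hypothetical end-of-caption marker
            break
        words.append(word)
    return " ".join(words[1:])


# Toy stand-in for the trained model: it ignores the image features and
# replays a fixed script based on how many words the partial caption holds.
TOY_VOCAB = {"start": 1, "a": 2, "dog": 3, "runs": 4, "end": 5}
TOY_SCRIPT = [2, 3, 4, 5]


def toy_predict_next(features: Sequence[float], partial: List[int]) -> List[float]:
    n_words = sum(1 for token in partial if token != 0)  # includes "start"
    scores = [0.0] * (len(TOY_VOCAB) + 1)
    scores[TOY_SCRIPT[min(n_words - 1, len(TOY_SCRIPT) - 1)]] = 1.0
    return scores


print(greedy_caption([0.0] * 2048, toy_predict_next, TOY_VOCAB))  # a dog runs
```

With a real model, `predict_next` would wrap a call to the trained network, and `make_training_pairs` shows how a single caption yields several training examples for the same image.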
An overview of the complete model, with its dimensions, is shown below:


Dataset used:

Model Results:

  • Some of the captions generated by this model are as follows:


Requirements

  • TensorFlow
  • pandas
  • NumPy
  • Pillow
  • Keras
  • h5py

Usage

  • Use the Training_model.ipynb notebook to train the model
  • Use the Model_Testing.ipynb notebook to test the model

If this repository helped you, please star the repo.
