visnunathan8/Flower-Classification-using-CNN

FlowerSpeciesClassification

Accurate identification and classification of flower species is crucial for understanding and conserving plant species. However, the limited availability of information about different flower species makes this difficult. This report proposes a Flower Identification System that identifies the correct class/species of a given flower. The project uses three datasets, each containing a different number of flower species, to train three Convolutional Neural Network (CNN) architectures: MobileNetV2, ResNet18, and VGG16. Training covered nine instances trained from scratch, three transfer-learning instances, and multiple instances for hyperparameter tuning. The report discusses the outcomes of the from-scratch and transfer-learning training, the hyperparameter tuning, and the Grad-CAM and t-SNE visualizations. The analysis helps determine which model-dataset combination gives the highest accuracy, along with the optimal hyperparameters for training the models. The study aims to improve the classification and identification of flower species, which will benefit the conservation and protection of plant species.

Project aim

• Develop a flower classification system using deep learning techniques to aid botanists, agriculturists, and horticulturists in identifying different species of flowers.

Methodology

• Building deep-learning-based classification models using a combination of state-of-the-art convolutional neural network (CNN) architectures.

Goals

• Explore and provide a detailed analysis of how different CNN architectures and combinations fare against the chosen datasets of flowers.
• Compare all eleven models that were trained.
• Provide a detailed performance analysis.

Requirements to run the code (libraries, etc)

 pip install -r requirements.txt

Instructions to run the code:

• Use Jupyter Notebook or any compatible software that can run .ipynb files.
• Make sure you have access to the datasets; the notebooks assume they are stored in Google Drive.
• Make sure you have access to the pre-trained models, which are stored inside PreTrainedModels.
• Install the required libraries and modules in the environment, including but not limited to PyTorch, scikit-learn, tqdm, and matplotlib.
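
Before opening the notebooks, it can help to confirm that the environment actually has these libraries. The following is a small sketch using only the standard library; the import names are taken from the library list in this README, and the helper `missing_packages` is a hypothetical name.

```python
import importlib.util

# Top-level import names taken from the library list in this README.
REQUIRED = [
    "torch", "torchvision", "sklearn", "tqdm", "matplotlib",
    "pandas", "numpy", "torchsummary", "optuna", "omnixai",
]


def missing_packages(names):
    """Return the names that cannot be found on the current Python path."""
    return [n for n in names if importlib.util.find_spec(n) is None]


if __name__ == "__main__":
    missing = missing_packages(REQUIRED)
    if missing:
        print("Missing:", ", ".join(missing))
    else:
        print("All required packages were found.")
```

The check only looks the packages up and does not import them, so it runs quickly even when heavy libraries such as PyTorch are installed.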

Training and Validating the Models

The models are trained in the Dataset-1, Dataset-2, and Dataset-3 folders. The code is in .ipynb files. Some sample files are listed below:

• MobilenetV2_Dataset1.ipynb
• Dataset2_VGG16_Final.ipynb
• Resnet18_Dataset3.ipynb

To run the pre-trained model on the provided sample test dataset

• Open the "TestingModel" folder, which contains the sample testing code for the pre-trained model.
• Download the corresponding pre-trained model weights from the Drive link given below.
• Download the sample test dataset from the Drive link: https://drive.google.com/drive/folders/1BrCI3fdoxvH840Ii5AD914SgKfVjv8Ci?usp=share_link

Links to the Dataset

• Flowers Dataset 1, URL: https://www.kaggle.com/datasets/nadyana/flowers
• Flowers Dataset 2, URL: https://www.kaggle.com/datasets/utkarshsaxenadn/flower-classification-5-classes-roselilyetc
• Flowers Dataset 3, URL: https://www.kaggle.com/datasets/l3llff/flowers

Dataset-1

• It has 7 evenly balanced classes with 1,600 images per class (11,200 images in total). However, we pruned the number of classes to 5, reducing the total to 8,000 images, so that the number of classes differs across the datasets.
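
The README does not say which five classes were kept or how the pruning was done. One possible approach, assuming the dataset is laid out with one subfolder per class, is sketched below; `select_classes` and the class names in the usage note are hypothetical.

```python
from pathlib import Path

IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png"}


def select_classes(root, keep):
    """Map each kept class name to the sorted image paths in its subfolder.

    Assumes a layout of root/<class_name>/<image files>.
    """
    root = Path(root)
    selected = {}
    for class_name in keep:
        class_dir = root / class_name
        if not class_dir.is_dir():
            raise FileNotFoundError(f"no folder for class {class_name!r}")
        # Ignore non-image files such as notes or hidden metadata.
        selected[class_name] = sorted(
            p for p in class_dir.iterdir()
            if p.suffix.lower() in IMAGE_SUFFIXES
        )
    return selected
```

For example, `select_classes("Dataset-1", ["rose", "tulip"])` would return only the images in those two subfolders, if folders with those names exist.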

Dataset-2

• It has 10 evenly distributed classes with 1500 images per class (15000 total images).

Dataset-3

• It has 16 unevenly distributed classes with a total of 15,740 images. That is about 984 images per class on average, with per-class counts ranging from 737 to 1,054.
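
The dataset sizes quoted above can be checked with a few lines of arithmetic. The sketch below uses only the figures stated in this README; the function names are illustrative.

```python
def total_images(num_classes, images_per_class):
    """Total size of a balanced dataset."""
    return num_classes * images_per_class


def mean_per_class(total, num_classes):
    """Average class size, useful when classes are imbalanced."""
    return total / num_classes


if __name__ == "__main__":
    print("Dataset-1 (7 classes):", total_images(7, 1600))   # 11200
    print("Dataset-1 (5 classes):", total_images(5, 1600))   # 8000
    print("Dataset-2:", total_images(10, 1500))              # 15000
    print("Dataset-3 mean:", mean_per_class(15_740, 16))     # 983.75
```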

Problematic Images in Dataset 2

[Figure: examples of problematic images in Dataset-2]

The imported libraries and modules include:

  • Python
  • torch
  • torch.nn, torch.nn.functional
  • torch.utils.data
  • torchvision.datasets, torchvision.transforms, torchvision.models
  • sklearn
  • tqdm
  • torchsummary
  • pandas
  • numpy
  • matplotlib
  • omnixai
  • optuna

• The notebooks also install the optuna and omnixai libraries with the !pip install command.

Folder Descriptions:

• Data Analysis:

This folder contains the scripts and code used to analyze the datasets in the project. It contains scripts for data cleaning, data exploration, and data visualization. Data analysis is an important step in any machine learning project, because it reveals patterns and insights in the data that can inform model development.

• Dataset-1, Dataset-2, Dataset-3:

These folders contain the code used for training and testing the machine learning models. Each dataset has its own comparison code, which is used to compare the performance of the models on that dataset.

• GradCAM:

This folder contains code for the Grad-CAM algorithm, which generates heatmaps that highlight the image regions a deep learning model relies on most when making a prediction.
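
To make the idea concrete, here is a minimal hand-written Grad-CAM sketch in plain PyTorch. It is not the repository's implementation (the library list suggests omnixai may be used there); the function name `grad_cam` and the tiny stand-in network are assumptions for illustration. The heatmap is the ReLU of the feature maps weighted by the spatially averaged gradient of the class score.

```python
import torch
import torch.nn.functional as F
from torch import nn


def grad_cam(model, target_layer, image, class_idx=None):
    """Return an (H, W) heatmap in [0, 1] for one image of shape (1, C, H, W)."""
    store = {}

    def save_activation(module, inputs, output):
        store["act"] = output  # feature maps of the chosen layer

    handle = target_layer.register_forward_hook(save_activation)
    try:
        model.eval()
        logits = model(image)
    finally:
        handle.remove()

    if class_idx is None:
        class_idx = int(logits.argmax(dim=1))  # explain the predicted class
    # Gradient of the class score with respect to the stored feature maps.
    grads = torch.autograd.grad(logits[0, class_idx], store["act"])[0]

    weights = grads.mean(dim=(2, 3), keepdim=True)  # per-channel importance
    cam = F.relu((weights * store["act"]).sum(dim=1, keepdim=True))
    cam = F.interpolate(cam, size=image.shape[2:], mode="bilinear",
                        align_corners=False)[0, 0]
    cam = cam - cam.min()
    return (cam / (cam.max() + 1e-8)).detach()


if __name__ == "__main__":
    # Tiny stand-in network; the models in this project are much larger.
    conv = nn.Conv2d(3, 8, kernel_size=3, padding=1)
    net = nn.Sequential(conv, nn.ReLU(), nn.AdaptiveAvgPool2d(1),
                        nn.Flatten(), nn.Linear(8, 5))
    heatmap = grad_cam(net, conv, torch.randn(1, 3, 32, 32))
    print(heatmap.shape)
```

For a torchvision ResNet18, a common choice of `target_layer` is the last convolutional stage (`model.layer4`), since it retains spatial layout while carrying high-level features.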

• Model and dataset comparison:

This folder contains code for comparing the performance of different machine learning models on various datasets.

• SampleTesting:

This folder contains code for running sample tests of all models on Dataset-3. These tests are used to evaluate the performance of the models.

• TSNE:

This folder contains code for the t-SNE algorithm, which is used to visualize high-dimensional data in a low-dimensional space.
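
A typical use is to project the feature vectors produced by a trained CNN down to two dimensions and colour the points by class. The sketch below uses scikit-learn's `TSNE`; the helper `embed_2d` and the random stand-in features are assumptions, since the README does not say which layer's features the project visualizes.

```python
import numpy as np
from sklearn.manifold import TSNE


def embed_2d(features, seed=0):
    """Project (n_samples, n_features) vectors to 2-D with t-SNE."""
    features = np.asarray(features, dtype=np.float32)
    # scikit-learn requires perplexity to be smaller than n_samples.
    perplexity = min(30.0, (len(features) - 1) / 3)
    tsne = TSNE(n_components=2, perplexity=perplexity, init="pca",
                random_state=seed)
    return tsne.fit_transform(features)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    fake_features = rng.normal(size=(60, 64))  # stand-in for CNN features
    points = embed_2d(fake_features)
    print(points.shape)  # one (x, y) pair per sample
```

The resulting points can be passed to `matplotlib.pyplot.scatter` with the class labels as colours to see how well the classes separate.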

• Transfer learning:

This folder contains code for comparing the metrics of transfer-learning models against models trained from scratch. Transfer learning uses a pre-trained model as the starting point for a new model instead of initializing it randomly.

• Optimization:

This folder contains code for optimizing a model over different learning rates. Optimization involves finding the set of hyperparameters that gives a model its best performance.

Collaborators to our Project

• GitHub ID: mahdihosseini, email tied to GitHub: (mahdi.hosseini@mail.utoronto.ca).
• GitHub ID: ahmedalagha1418, email tied to GitHub: (ahmedn.alagha@hotmail.com).
• GitHub ID: visnunathan8, email tied to GitHub: (rocketvisnu@gmail.com).
• GitHub ID: ShrawanSai, email tied to GitHub: (msaishrawan@gmail.com).
• GitHub ID: Sharanyu, email tied to GitHub: (sharanyu@hotmail.com).
• GitHub ID: kin-kins, email tied to GitHub: (K.ashu403@gmail.com).
