Prashant-Tiwari26/Multimodal-Sentiment-Analysis-using-Text-and-Images

Multimodal Sentiment Analysis using Text and Image Data

Overview

In this project, we perform multimodal sentiment analysis on Twitter data consisting of tweets that contain both text and images, in order to predict the sentiment behind each tweet. The data comes from: Mohammed, D. J., & Aleqabie, H. J. (2022, September). The Enrichment Of MVSA Twitter Data Via Caption-Generated Label Using Sentiment Analysis. In 2022 Iraqi International Conference on Communication and Information Technologies (IICCIT) (pp. 322-327). IEEE. The sentiment is classified into three categories: Positive, Neutral and Negative.

Table of Contents

Datasets

The following dataset is used for this project: Mohammed, D. J., & Aleqabie, H. J. (2022, September). The Enrichment Of MVSA Twitter Data Via Caption-Generated Label Using Sentiment Analysis. In 2022 Iraqi International Conference on Communication and Information Technologies (IICCIT) (pp. 322-327). IEEE. It can be found here.

Model Architecture

Preprocessing

Text

The captions and their corresponding labels are available in LabeledText.xlsx. Feature engineering adds the following feature columns:

  • Caption Length : The length of each caption
  • Hashtags : All the hashtags extracted from each tweet
  • Total Hashtags : The total number of hashtags in each tweet

The code for this is in Scripts/Text/FeatureEng.py. The engineered data is then saved as a CSV file at Data/Text/Engineered.csv.
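The three feature columns listed above can be sketched for a single caption as follows. This is a minimal illustration, not the code in Scripts/Text/FeatureEng.py: the function name engineer_features, the hashtag pattern, and the choice to measure caption length in characters are all assumptions made for the example.

```python
import re

# Assumption: a hashtag is '#' followed by one or more word characters.
HASHTAG_PATTERN = re.compile(r"#\w+")


def engineer_features(caption: str) -> dict:
    """Build the three engineered columns for one caption (hypothetical helper)."""
    hashtags = HASHTAG_PATTERN.findall(caption)
    return {
        # Assumption: length is counted in characters, not words.
        "Caption Length": len(caption),
        "Hashtags": hashtags,
        "Total Hashtags": len(hashtags),
    }


if __name__ == "__main__":
    print(engineer_features("Sunset at the beach #sunset #nofilter"))
```

Applied to every row of the labeled data, a function like this would yield the columns that are then written to the engineered CSV file.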

Afterwards, embeddings are generated for the captions and hashtags using two approaches: TF-IDF and BERT.

The code for this is in Scripts/Text/Preprocess.py. The function tfidf_preprocessing() and the class BERT_Embeddings are defined in CustomFunctions.py. The embeddings are saved in Data/Text/TF-IDF and Data/Text/BERT, along with the target labels and the number of captions.
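To make the TF-IDF idea concrete, the sketch below computes TF-IDF weights by hand using only the standard library. It does not reproduce tfidf_preprocessing() from CustomFunctions.py, whose implementation is not shown here. The function name tfidf_vectors, the whitespace tokenizer, the smoothed IDF formula, and the absence of vector normalization are assumptions chosen for illustration.

```python
import math
from collections import Counter


def tfidf_vectors(documents):
    """Return (vocabulary, vectors) with one TF-IDF vector per document.

    Hypothetical helper: raw term counts times a smoothed IDF,
    idf(t) = ln((1 + n_docs) / (1 + df(t))) + 1, with no normalization.
    """
    # Assumption: lowercase and split on whitespace as a stand-in tokenizer.
    tokenized = [doc.lower().split() for doc in documents]
    vocabulary = sorted({token for doc in tokenized for token in doc})
    n_docs = len(tokenized)

    # Document frequency: in how many documents each token appears.
    doc_freq = Counter(token for doc in tokenized for token in set(doc))
    idf = {
        token: math.log((1 + n_docs) / (1 + doc_freq[token])) + 1
        for token in vocabulary
    }

    vectors = []
    for doc in tokenized:
        counts = Counter(doc)
        vectors.append([counts[token] * idf[token] for token in vocabulary])
    return vocabulary, vectors


if __name__ == "__main__":
    vocab, vecs = tfidf_vectors(["good day", "bad day"])
    print(vocab)
    print(vecs)
```

Tokens that occur in every caption receive the lowest weight, while tokens specific to a few captions are weighted more heavily. That is the property that makes TF-IDF vectors useful as simple text features alongside BERT embeddings.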

Training

Evaluation

Usage

Dependencies

All the dependencies of the project are listed in the requirements.txt file. To install them, run the following command in your terminal:

pip install -r requirements.txt