Sentiment Analysis of Indian Political Tweets using LSTM

This project was developed for an Advanced Machine Learning course. It analyzes the sentiment of Indian political tweets with a Long Short-Term Memory (LSTM) model. Tweets are collected with Tweepy, a labeled dataset comes from Kaggle, and the text is preprocessed with Natural Language Processing (NLP) techniques and Global Vectors for Word Representation (GloVe) embeddings. The model classifies each tweet as positive, neutral, or negative and reaches an accuracy of 96% (see Models and Accuracy).

Dataset

The data for this project comes from two sources: tweets scraped with Tweepy and a labeled dataset of Indian political tweets from Kaggle. The data is used to train and validate the LSTM model.

Dataset Source: Kaggle: Indian Political Tweets Sentiment Analysis
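
The collection step could look roughly like the sketch below. It is not the project's actual script: it assumes the Twitter API v2 through tweepy.Client and search_recent_tweets, and the helper names, the keywords, and the bearer token placeholder are hypothetical. The guide in the source codes remains the reference for how the tweets were really collected.

```python
from typing import Iterable, List


def build_query(keywords: Iterable[str], lang: str = "en") -> str:
    """Combine keywords into a Twitter API v2 search query that skips retweets."""
    terms = " OR ".join(keywords)
    return f"({terms}) lang:{lang} -is:retweet"


def make_client(bearer_token: str):
    """Create a Tweepy client (tweepy is imported lazily, only when a client is needed)."""
    import tweepy

    return tweepy.Client(bearer_token=bearer_token)


def fetch_tweet_texts(client, query: str, max_results: int = 100) -> List[str]:
    """Return the text of one page of recent tweets matching the query.

    The API accepts max_results between 10 and 100 per request.
    """
    response = client.search_recent_tweets(query=query, max_results=max_results)
    # response.data is None when nothing matches, so fall back to an empty list.
    return [tweet.text for tweet in (response.data or [])]
```

Usage, with a placeholder token: client = make_client("YOUR_BEARER_TOKEN"), then texts = fetch_tweet_texts(client, build_query(["BJP", "INC"])).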

Models and Accuracy

The LSTM model was employed for sentiment classification, with the following performance metrics:

Precision
Recall
F1-Score

Precision, recall, and F1-score lie between 0.94 and 0.97 for every class. The per-class results are summarized below:

Sentiment	Precision	Recall	F1-Score	Support
0	0.95	0.96	0.96	7276
1	0.95	0.97	0.96	6935
2	0.97	0.94	0.95	7089

Accuracy			0.96	21300
Macro Avg	0.96	0.96	0.96	21300
Weighted Avg	0.96	0.96	0.96	21300
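
As a consistency check, the macro and weighted averages can be recomputed from the per-class rows above. This is a minimal sketch with hypothetical function names. It uses the rounded values from the table, so it can only confirm the averages to two decimal places.

```python
from typing import Sequence


def macro_average(values: Sequence[float]) -> float:
    """Unweighted mean over classes."""
    return sum(values) / len(values)


def weighted_average(values: Sequence[float], supports: Sequence[int]) -> float:
    """Mean over classes, weighted by the number of samples in each class."""
    total = sum(supports)
    return sum(v * s for v, s in zip(values, supports)) / total


# Per-class values copied from the table (classes 0, 1, 2).
PRECISION = [0.95, 0.95, 0.97]
RECALL = [0.96, 0.97, 0.94]
F1 = [0.96, 0.96, 0.95]
SUPPORT = [7276, 6935, 7089]

if __name__ == "__main__":
    print("total support:", sum(SUPPORT))
    for name, values in (("precision", PRECISION), ("recall", RECALL), ("f1", F1)):
        macro = round(macro_average(values), 2)
        weighted = round(weighted_average(values, SUPPORT), 2)
        print(f"{name}: macro={macro} weighted={weighted}")
```

The three classes are close to balanced, which is why the macro and weighted averages agree.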

Comparative Sentiment Analysis

In addition to the LSTM model's sentiment analysis, a comparative analysis was conducted with the Valence Aware Dictionary and sEntiment Reasoner (VADER), a lexicon and rule-based sentiment analysis tool that is specifically attuned to sentiments expressed in social media.

Comparative Sentiment Analysis for BJP and INC Tweets

Party	Model	Positive	Neutral	Negative
BJP	LSTM	48.8%	17.4%	33.8%
BJP	VADER	49.2%	15.7%	35.1%
INC	LSTM	49.4%	17.3%	33.3%
INC	VADER	48.5%	16.7%	34.8%
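
Percentages like those in the table can be derived from VADER's compound score. The sketch below is an assumption about how that could be done, not the project's code: it uses the conventional thresholds of +0.05 and -0.05 from the VADER documentation, and the function names are hypothetical.

```python
from collections import Counter
from typing import Dict, Iterable

LABELS = ("positive", "neutral", "negative")


def label_from_compound(compound: float, threshold: float = 0.05) -> str:
    """Map a VADER compound score in [-1, 1] to a three-way sentiment label."""
    if compound >= threshold:
        return "positive"
    if compound <= -threshold:
        return "negative"
    return "neutral"


def sentiment_distribution(labels: Iterable[str]) -> Dict[str, float]:
    """Return the share of each label as a percentage, rounded to one decimal."""
    counts = Counter(labels)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no labels given")
    return {label: round(100.0 * counts[label] / total, 1) for label in LABELS}
```

With the vaderSentiment package installed, the compound score for a tweet is SentimentIntensityAnalyzer().polarity_scores(text)["compound"]. The same sentiment_distribution helper can be applied to the LSTM's predicted labels so that both methods are summarized the same way.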

Visual Comparison

[Figure: pie charts of the LSTM and VADER sentiment distributions for BJP and INC tweets]

Interpretation and Insights

Comparing the two sets of pie charts for each party (BJP and INC) shows where the sentiment distributions agree and where they differ. The comparison indicates how closely our model's predictions align with VADER's for each party, and so how the model performs relative to an established sentiment analysis tool. The small differences between the distributions are also a starting point for examining what our LSTM model captures that VADER's rule-based heuristics do not.

For both BJP and INC tweets, the LSTM and VADER distributions differ by at most 1.7 percentage points in any category, which suggests that the LSTM model aligns well with a conventional lexicon-based method. Agreement in aggregate does not by itself show that the two methods label individual tweets the same way. The remaining differences, such as 1.7 points in the neutral category for BJP and 1.5 points in the negative category for INC, invite further exploration of the linguistic subtleties and context that may influence the results.
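
The size of the gap between the two methods can be computed directly from the comparison table. The sketch below uses only the reported percentages. The function names are hypothetical, and half the sum of absolute differences is one possible summary measure, not one used in the project.

```python
from typing import Dict

# Percentages copied from the comparison table.
REPORTED = {
    ("BJP", "LSTM"): {"positive": 48.8, "neutral": 17.4, "negative": 33.8},
    ("BJP", "VADER"): {"positive": 49.2, "neutral": 15.7, "negative": 35.1},
    ("INC", "LSTM"): {"positive": 49.4, "neutral": 17.3, "negative": 33.3},
    ("INC", "VADER"): {"positive": 48.5, "neutral": 16.7, "negative": 34.8},
}


def category_gaps(a: Dict[str, float], b: Dict[str, float]) -> Dict[str, float]:
    """Absolute difference per category, in percentage points."""
    return {key: round(abs(a[key] - b[key]), 1) for key in a}


def distribution_gap(a: Dict[str, float], b: Dict[str, float]) -> float:
    """Half the sum of absolute differences: the share of tweets that would
    have to change category for the two distributions to match."""
    return round(sum(abs(a[key] - b[key]) for key in a) / 2, 2)


if __name__ == "__main__":
    for party in ("BJP", "INC"):
        lstm, vader = REPORTED[(party, "LSTM")], REPORTED[(party, "VADER")]
        print(party, category_gaps(lstm, vader), distribution_gap(lstm, vader))
```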

Setup and Running the Project

To replicate and run this project, follow these steps:

Collect Tweets: Use Tweepy to collect Indian political tweets. A guide for setting up Tweepy and collecting tweets is provided in the Source codes folder.
Prepare the Dataset: Download the labeled dataset from Kaggle and preprocess it with the provided scripts so that it can be used with GloVe embeddings.
Download GloVe: Download the GloVe word-vector file and place it within the folder. This project used glove.6B.50d.txt.
Train and Evaluate the LSTM Model: Follow the instructions in the LSTM model implementation folder to train and evaluate the sentiment analysis model.
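
The GloVe step above can be sketched as follows. This is not the project's preprocessing script: the function names are hypothetical, and the sketch assumes a word_index that maps each word to an integer starting at 1 (as the Keras Tokenizer produces), with row 0 reserved for padding. The GloVe text format is one word per line followed by its vector components, separated by spaces. glove.6B.50d.txt holds 50 components per word.

```python
from typing import Dict, Iterable, List


def parse_glove(lines: Iterable[str]) -> Dict[str, List[float]]:
    """Parse GloVe text lines of the form 'word v1 v2 ... vN' into a dict."""
    vectors: Dict[str, List[float]] = {}
    for line in lines:
        parts = line.rstrip().split(" ")
        if len(parts) < 2:
            continue  # skip blank or malformed lines
        vectors[parts[0]] = [float(value) for value in parts[1:]]
    return vectors


def build_embedding_matrix(
    word_index: Dict[str, int],
    vectors: Dict[str, List[float]],
    dim: int,
) -> List[List[float]]:
    """Build one row per vocabulary index; unknown words keep an all-zero row."""
    rows = max(word_index.values(), default=0) + 1  # row 0 is the padding row
    matrix = [[0.0] * dim for _ in range(rows)]
    for word, index in word_index.items():
        vector = vectors.get(word)
        if vector is not None and len(vector) == dim:
            matrix[index] = list(vector)
    return matrix
```

Usage: open the file with open("glove.6B.50d.txt", encoding="utf-8"), pass the file object to parse_glove, then call build_embedding_matrix(word_index, vectors, dim=50). The result can be converted with numpy.array before it is handed to an embedding layer.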

Requirements

This project is designed to be run in a Python environment with support for Jupyter Notebooks or Google Colaboratory. Key dependencies include TensorFlow, Keras, Tweepy, Pandas, NumPy, and Matplotlib.
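
With TensorFlow and Keras installed, a three-class LSTM classifier over frozen GloVe embeddings might be defined as in the sketch below. The layer sizes, dropout rate, sequence length, and function name are illustrative assumptions. The project's actual architecture is in the source codes and the saved Default_model.h5, and may differ.

```python
import numpy as np
import tensorflow as tf


def build_lstm_classifier(
    embedding_matrix: np.ndarray,
    max_len: int = 50,
    lstm_units: int = 64,
    num_classes: int = 3,
) -> tf.keras.Model:
    """Embedding (frozen, pretrained) -> LSTM -> softmax over sentiment classes."""
    vocab_size, embedding_dim = embedding_matrix.shape
    model = tf.keras.Sequential([
        tf.keras.Input(shape=(max_len,), dtype="int32"),
        tf.keras.layers.Embedding(
            input_dim=vocab_size,
            output_dim=embedding_dim,
            # Start from the pretrained vectors and keep them fixed.
            embeddings_initializer=tf.keras.initializers.Constant(embedding_matrix),
            trainable=False,
            mask_zero=True,  # index 0 is treated as padding
        ),
        tf.keras.layers.LSTM(lstm_units),
        tf.keras.layers.Dropout(0.3),
        tf.keras.layers.Dense(num_classes, activation="softmax"),
    ])
    # Integer class labels (0, 1, 2) pair with the sparse cross-entropy loss.
    model.compile(
        optimizer="adam",
        loss="sparse_categorical_crossentropy",
        metrics=["accuracy"],
    )
    return model
```

Training would then call model.fit on padded integer sequences and integer labels. Freezing the embeddings keeps the number of trainable parameters small, at the cost of not adapting the vectors to political vocabulary.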

Acknowledgments

We extend our gratitude to the academic staff and our peers at Bengal Institute of Technology for their invaluable feedback and support. Special thanks to the Kaggle community for providing a comprehensive dataset for sentiment analysis of Indian political tweets.
