pragyakatyayan/Tweets_Dataset_for_Sarcasm_detection_in_Hindi

Tweets_Dataset_for_Sarcasm_detection_in_Hindi

last-commit pandas tweepy textblob

A raw dataset of over 16,000 tweets, both sarcastic and non-sarcastic, for researchers working on sarcasm detection in Hindi.

Number of sarcastic tweets: 6051

Number of non-sarcastic tweets: 10128
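
The two classes are not equally sized, which may matter when training a classifier. As a rough sketch (not part of this repository), the snippet below computes each class's share of the dataset and a common "balanced" class weight, `total / (n_classes * class_count)`. The counts come from the figures above; the label names `sarcastic` and `non_sarcastic` and the helper `class_summary` are illustrative assumptions.

```python
# Tweet counts as stated in this README.
N_SARCASTIC = 6051
N_NON_SARCASTIC = 10128


def class_summary(counts):
    """Return each class's share of the total and a 'balanced' weight.

    The weight uses the common heuristic total / (n_classes * class_count),
    so the rarer class receives a weight above 1.
    """
    total = sum(counts.values())
    n_classes = len(counts)
    return {
        label: {"share": n / total, "weight": total / (n_classes * n)}
        for label, n in counts.items()
    }


# Label names are hypothetical; the README does not say how labels are stored.
summary = class_summary(
    {"sarcastic": N_SARCASTIC, "non_sarcastic": N_NON_SARCASTIC}
)

for label, stats in summary.items():
    print(f"{label}: share={stats['share']:.3f}, weight={stats['weight']:.3f}")
```

Whether to apply such weights depends on the model and the evaluation metric; they are only one of several ways to handle class imbalance.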

These tweets were extracted using tweet-scraping code from the GitHub repository of Mr. Griffin Leow.

The code was adapted to extract tweets written in native Hindi that carry specific hashtags. The dataset covers tweets posted from 01-01-2012 to 23-06-2020.
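
The selection criteria described above (native Hindi text, specific hashtags, a fixed date window) can be illustrated with a small filter. This is only a sketch under stated assumptions, not the scraping code used to build the dataset: the record keys `text` and `date`, the example hashtag, the 0.5 script threshold, and all function names are hypothetical. "Native Hindi" is approximated here by the share of characters in the Devanagari Unicode block (U+0900 to U+097F).

```python
from datetime import date, datetime

# Date window stated in the README (DD-MM-YYYY).
START = date(2012, 1, 1)
END = date(2020, 6, 23)


def is_native_hindi(text, threshold=0.5):
    """Heuristic: Devanagari characters outnumber ASCII letters.

    The 0.5 threshold is an assumption chosen for illustration.
    """
    devanagari = sum(1 for ch in text if "\u0900" <= ch <= "\u097F")
    latin = sum(1 for ch in text if ch.isascii() and ch.isalpha())
    total = devanagari + latin
    return total > 0 and devanagari / total >= threshold


def has_wanted_hashtag(text, wanted):
    """True if any whitespace-separated token is one of the wanted hashtags."""
    tags = {tok for tok in text.split() if tok.startswith("#")}
    return bool(tags & set(wanted))


def keep_tweet(tweet, wanted):
    """Apply the script, hashtag, and date-window checks to one record.

    `tweet` is a dict with hypothetical keys "text" and "date".
    """
    posted = datetime.strptime(tweet["date"], "%d-%m-%Y").date()
    return (
        START <= posted <= END
        and is_native_hindi(tweet["text"])
        and has_wanted_hashtag(tweet["text"], wanted)
    )


# Hypothetical hashtag and records, for illustration only.
WANTED = {"#कटाक्ष"}
sample = [
    {"text": "यह तो कमाल है #कटाक्ष", "date": "15-08-2019"},
    {"text": "This is great #sarcasm", "date": "15-08-2019"},
    {"text": "यह तो कमाल है #कटाक्ष", "date": "24-06-2020"},
]
kept = [t for t in sample if keep_tweet(t, WANTED)]
```

Splitting on whitespace is used instead of a `\w`-based regular expression because Devanagari vowel signs are combining marks, which `\w` in Python's `re` module may not match, so a regex could cut a Hindi hashtag short.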

Method #1: Run the scrap_tweets_in_Hindi-v1.py file in IDLE or Jupyter Notebook to re-scrape tweets from Twitter.
Method #2: Download the Jupyter Notebook and run all the cells.

P.S. If you are around, I won't mind if you star the repository! Thanks ;-)
