mplakhtiy/real-or-not

Master Thesis

Abstract

Disaster detection based on Twitter posts has become a challenging and in-demand research topic, and a Kaggle competition has even been dedicated to it. In this work we explore and compare various machine learning methods and techniques, applying them to the disaster tweets data set from that Kaggle competition. We select the 10 most promising data preprocessing algorithms and test our models and classifiers on each of them. We then identify the most successful pairs of model/classifier and preprocessing algorithm for further use. To explore whether combining predictions from different sources produces finer predictions, we train several simple neural network models on data built from combinations of the best pairs' predictions. We compare these results with those obtained by applying simple formulas to the same combinations of predictions from the original classifiers. Finally, we locate the tweets that were not recognized by any model or classifier, remove them from the training data set, and analyse how this affects the performance of the models and classifiers.
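The abstract does not say which "simple formulas" are used to combine predictions, so the following is only a minimal sketch under assumptions: it takes averaging and majority voting as two plausible formulas, and it treats a tweet as "not recognized by any model" when every model's thresholded prediction disagrees with the target. The function names, the 0.5 threshold, and the sample numbers are all hypothetical and are not taken from the repository.

```python
from statistics import mean


def average_vote(probabilities, threshold=0.5):
    """Average the per-model probabilities of each tweet, then threshold."""
    return [int(mean(row) >= threshold) for row in probabilities]


def majority_vote(probabilities, threshold=0.5):
    """Label a tweet as a disaster when more than half of the models do."""
    labels = []
    for row in probabilities:
        votes = sum(p >= threshold for p in row)
        labels.append(int(votes * 2 > len(row)))
    return labels


def missed_by_all(probabilities, targets, threshold=0.5):
    """Return indices of tweets that every model classified incorrectly."""
    return [
        i
        for i, (row, y) in enumerate(zip(probabilities, targets))
        if all(int(p >= threshold) != y for p in row)
    ]


# Hypothetical data: one row per tweet, one column per model/preprocessing pair.
probs = [
    [0.9, 0.8, 0.4],
    [0.2, 0.6, 0.1],
    [0.7, 0.3, 0.2],
    [0.1, 0.2, 0.3],
]
targets = [1, 0, 1, 1]

combined_mean = average_vote(probs)
combined_majority = majority_vote(probs)
hard_tweets = missed_by_all(probs, targets)

# Drop the tweets no model got right before retraining (assumed procedure).
kept_indices = [i for i in range(len(targets)) if i not in hard_tweets]
```

A trained combiner, such as the simple neural networks mentioned in the abstract, would take the same rows of per-model predictions as input features instead of applying a fixed formula.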

Keywords:

Disaster detection, NLP, Twitter

Resources:
