Disaster detection from Twitter posts has become a challenging and actively pursued research problem, and a Kaggle competition has been devoted to it. In this work we explore and compare various machine learning methods and techniques by applying them to the disaster tweets data set from that Kaggle competition. We select the 10 most promising data preprocessing algorithms, on which our models and classifiers are tested, and identify the most successful pairs of model/classifier and preprocessing algorithm for further use. To explore whether combining predictions from different sources yields finer predictions, we train several simple neural network models on data built from combinations of the best pairs' predictions. We compare the results with those obtained by applying simple formulas to the same combinations of predictions from the original classifiers. Finally, we locate tweets that were not recognized by any model or classifier and remove them from the training data set, analysing how this removal influences the performance of the models and classifiers.
Disaster detection, NLP, Twitter