Master Thesis

Abstract

Disaster detection from Twitter posts has become a challenging and actively pursued research problem, and a dedicated competition was created for it on Kaggle. In this work we explore and compare various machine learning methods and techniques by applying them to the disaster tweets data set from that competition. We select the 10 most promising data preprocessing algorithms and test our models and classifiers on each of them. We then identify the most successful pairs of model/classifier and preprocessing algorithm for further use. To explore whether combining predictions from different sources produces finer predictions, we train several simple neural network models on data built from combinations of the best pairs' predictions. We compare these results with those obtained by applying simple formulas to the same combinations of predictions from the original classifiers. Finally, we locate the tweets that were not recognized by any model or classifier, remove them from the training data set, and analyse how this affects the performance of the models and classifiers.

Keywords:

Disaster detection, NLP, Twitter
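
The abstract mentions combining predictions with "simple formulas" and locating tweets that no model recognized. The following is a minimal sketch of what such steps could look like, not the thesis implementation: the function names, the 0.5 threshold, the choice of averaging and majority voting as the formulas, and the toy probabilities are all assumptions made for illustration.

```python
import numpy as np

# Hypothetical toy data: rows are models, columns are tweets.
# Each value is a model's predicted probability that the tweet is about a disaster.
probs = np.array([
    [0.9, 0.2, 0.4, 0.7, 0.2],
    [0.8, 0.4, 0.3, 0.4, 0.3],
    [0.6, 0.1, 0.6, 0.3, 0.1],
])
labels = np.array([1, 0, 1, 0, 1])  # assumed ground-truth labels


def average_vote(probs, threshold=0.5):
    """Average the per-model probabilities, then threshold the mean."""
    probs = np.asarray(probs, dtype=float)
    return (probs.mean(axis=0) >= threshold).astype(int)


def majority_vote(probs, threshold=0.5):
    """Threshold each model separately, then take the strict majority label."""
    votes = (np.asarray(probs, dtype=float) >= threshold).astype(int)
    return (votes.sum(axis=0) * 2 > votes.shape[0]).astype(int)


def missed_by_all(probs, labels, threshold=0.5):
    """Return indices of tweets that every model classified incorrectly."""
    votes = (np.asarray(probs, dtype=float) >= threshold).astype(int)
    wrong = votes != np.asarray(labels)  # broadcasts labels across models
    return np.flatnonzero(wrong.all(axis=0))


avg_pred = average_vote(probs)
maj_pred = majority_vote(probs)
missed = missed_by_all(probs, labels)

# Drop the tweets missed by every model from the (toy) training set.
keep = np.setdiff1d(np.arange(labels.size), missed)
filtered_labels = labels[keep]
```

In this toy example only the last tweet is misclassified by all three models, so it is the only one removed before retraining.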

Repository contents:

configs
data
logs
models
predictions
round-table-history
tds
tests
tweets
.gitignore
Maryan_Plakhtiy_Master_Thesis.pdf
README.md
analyzer.py
app_bert.py
app_han.py
app_keras.py
app_kfold_bert.py
app_kfold_han.py
app_kfold_keras.py
app_kfold_sklearn.py
app_sklearn.py
bert_tokenization.py
csv_from_log.py
data_analysis.py
draw.py
pred.py
prepare_data.py
prepare_predictions.py
requirements-han.txt
requirements.txt
round_table.py
sklearn_log_to_csv.py
utils.py

mplakhtiy/real-or-not
