The Use Of Classical Classification to Distinguish between 16 MBTI given a vectorized text using CBOW, BERT Models vs Classification using The LSTM model

Hedrax/16-MBTI-Prediction-Based-On-Social-Text

16 MBTI Prediction Based On Social Text

Classical classification of the 16 MBTI types from text vectorized with CBOW and BERT models, compared with classification using an LSTM model

Project Key Points

  • LSTM Model
  • BERT Model
  • CBOW Model
  • MBTI Classification

Abstract of the Experiment

Our primary objective was to develop a classification framework for the Myers-Briggs Type Indicator (MBTI) based on social media posts, and we took two approaches. In the first, we trained a Long Short-Term Memory (LSTM) neural network to classify vectorized text directly into one of the 16 MBTI types. In the second, we decomposed the 16 MBTI types into the 4 binary dimensions that define each personality type (I/E, N/S, T/F, J/P) and applied traditional classification techniques to the text vectors produced by two Natural Language Processing (NLP) models, BERT and CBOW. For this second approach we trained six distinct classifiers on each set of vectors, which allowed a direct comparison of the BERT and CBOW representations. Our findings show how effective each NLP model is for MBTI classification and can inform more accurate personality prediction from social media content.
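The split of a 16-way label into four binary targets, as described above, can be sketched as follows. This is a minimal illustration, not the repository's own preprocessing: the names `DIMENSIONS`, `split_mbti`, and `join_mbti` are hypothetical, and encoding the first letter of each pair as 1 is an arbitrary assumption.

```python
# Hypothetical helpers; the project's actual label encoding may differ.
DIMENSIONS = [("I", "E"), ("N", "S"), ("T", "F"), ("J", "P")]


def split_mbti(mbti_type: str) -> dict[str, int]:
    """Map a 4-letter type such as 'INTJ' to four binary labels.

    Assumption: the first letter of each pair (I, N, T, J) is encoded as 1.
    """
    letters = mbti_type.strip().upper()
    if len(letters) != 4:
        raise ValueError(f"expected a 4-letter MBTI type, got {mbti_type!r}")
    labels = {}
    for letter, (first, second) in zip(letters, DIMENSIONS):
        if letter not in (first, second):
            raise ValueError(f"unexpected letter {letter!r} in {mbti_type!r}")
        labels[f"{first}/{second}"] = 1 if letter == first else 0
    return labels


def join_mbti(labels: dict[str, int]) -> str:
    """Rebuild the 4-letter type from four binary predictions."""
    return "".join(
        first if labels[f"{first}/{second}"] == 1 else second
        for first, second in DIMENSIONS
    )
```

With labels in this form, one binary classifier can be trained per dimension, and `join_mbti` recombines the four predictions into a full type.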

Results

CBOW

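| Model               | I/E    | N/S    | T/F    | J/P    | AVG    |
|---------------------|--------|--------|--------|--------|--------|
| Logistic Regression | 0.72   | 0.757  | 0.8315 | 0.6724 | 0.7452 |
| SVC                 | 0.7342 | 0.7834 | 0.8424 | 0.6858 | 0.7614 |
| SGD Classifier      | 0.691  | 0.764  | 0.8199 | 0.6538 | 0.7321 |
| Random Forest       | 0.6819 | 0.7347 | 0.8007 | 0.6512 | 0.7171 |
| XGBoost             | 0.7104 | 0.7672 | 0.8192 | 0.6664 | 0.7408 |
| CatBoost            | 0.7297 | 0.786  | 0.8370 | 0.6852 | 0.7594 |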
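The AVG column appears to be the unweighted mean of the four per-dimension accuracies. The sketch below recomputes it from the CBOW figures above under that assumption; the variable names are illustrative only.

```python
from statistics import mean

# Per-dimension accuracies (I/E, N/S, T/F, J/P) copied from the CBOW table.
cbow_scores = {
    "Logistic Regression": (0.72, 0.757, 0.8315, 0.6724),
    "SVC": (0.7342, 0.7834, 0.8424, 0.6858),
    "SGD Classifier": (0.691, 0.764, 0.8199, 0.6538),
    "Random Forest": (0.6819, 0.7347, 0.8007, 0.6512),
    "XGBoost": (0.7104, 0.7672, 0.8192, 0.6664),
    "CatBoost": (0.7297, 0.786, 0.8370, 0.6852),
}

# Assumption: AVG is the plain arithmetic mean over the four dimensions.
cbow_averages = {model: mean(scores) for model, scores in cbow_scores.items()}

# Classifier with the highest average accuracy on CBOW vectors.
best_cbow_model = max(cbow_averages, key=cbow_averages.get)

for model, avg in cbow_averages.items():
    print(f"{model:<20} {avg:.4f}")
print("Best on average:", best_cbow_model)
```

The recomputed means agree with the reported AVG values to within rounding.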

BERT

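| Model               | I/E    | N/S    | T/F    | J/P    | AVG    |
|---------------------|--------|--------|--------|--------|--------|
| Logistic Regression | 0.8006 | 0.7896 | 0.7311 | 0.7311 | 0.7878 |
| SVC                 | 0.7375 | 0.7682 | 0.8108 | 0.7032 | 0.7549 |
| SGD Classifier      | 0.7801 | 0.7878 | 0.8252 | 0.7196 | 0.7781 |
| Random Forest       | 0.7496 | 0.7124 | 0.769  | 0.6856 | 0.7291 |
| XGBoost             | 0.7767 | 0.7434 | 0.7933 | 0.6995 | 0.7532 |
| CatBoost            | 0.7838 | 0.7635 | 0.8064 | 0.7128 | 0.7666 |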
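Because each dimension is classified independently, a different classifier can be chosen for each one. The sketch below picks the highest-scoring classifier per dimension from the BERT figures above; the values are copied as reported, and the names are illustrative only.

```python
DIMENSION_NAMES = ("I/E", "N/S", "T/F", "J/P")

# Per-dimension accuracies copied as reported in the BERT table.
bert_scores = {
    "Logistic Regression": (0.8006, 0.7896, 0.7311, 0.7311),
    "SVC": (0.7375, 0.7682, 0.8108, 0.7032),
    "SGD Classifier": (0.7801, 0.7878, 0.8252, 0.7196),
    "Random Forest": (0.7496, 0.7124, 0.769, 0.6856),
    "XGBoost": (0.7767, 0.7434, 0.7933, 0.6995),
    "CatBoost": (0.7838, 0.7635, 0.8064, 0.7128),
}

# For each dimension, keep the classifier with the highest accuracy.
best_per_dimension = {}
for index, dimension in enumerate(DIMENSION_NAMES):
    best_model = max(bert_scores, key=lambda model: bert_scores[model][index])
    best_per_dimension[dimension] = (best_model, bert_scores[best_model][index])

for dimension, (model, accuracy) in best_per_dimension.items():
    print(f"{dimension}: {model} ({accuracy:.4f})")
```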

LSTM Model

The trained LSTM reached an accuracy of 73.02% with a loss of 1.1323.

For more detailed results, see the report.

Contributors

  • Alhossien Waly
  • Ali Ibrahim
