markallenthornton/3daffect

3daffect

Creation and validation of a 3-dimensional sentiment dictionary

A weighted 3-dimensional sentiment dictionary for quantifying the affect in text samples. The dictionary covers approximately 2 million tokens in the pre-trained (common crawl) fastText word vector embedding. These word vectors (not included in repository due to size, but available here) are necessary for re-running some elements of the code included in this repository.

A radial basis function support vector regression (SVM-R) was trained to predict ratings of 166 mental state words on 3 principal component dimensions (rationality vs. emotionality, social impact, and valence [+/-]) from the 300d fastText embedding. This regression achieved relatively high accuracy in 5-fold cross-validation: r = .86, .85, and .91, respectively; RMSE = .60, .60, and .51, respectively, vs. chance at SD = 1. An SVM-R trained on all 166 state words was then used to impute 3-d affect scores to all words in the fastText corpus, creating a weighted dictionary.
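The training and imputation steps described above can be sketched roughly as follows. This is a minimal illustration, not the repository's actual code: it assumes scikit-learn and NumPy, uses randomly generated stand-ins for the fastText vectors and the rated dimensions, and leaves the SVR hyperparameters at their defaults. All variable names are hypothetical.

```python
import numpy as np
from sklearn.model_selection import KFold, cross_val_predict
from sklearn.svm import SVR

rng = np.random.default_rng(0)

# Synthetic stand-ins (assumption): 166 "words" with 300-d vectors and
# 3 z-scored target dimensions. Real inputs would be fastText vectors
# and the rated principal component scores.
n_words, n_dims = 166, 300
X = rng.normal(size=(n_words, n_dims))
Y = X @ rng.normal(size=(n_dims, 3)) + rng.normal(size=(n_words, 3))
Y = (Y - Y.mean(axis=0)) / Y.std(axis=0)

dimension_names = ["rationality", "social_impact", "valence"]
cv = KFold(n_splits=5, shuffle=True, random_state=0)

# SVR predicts a single output, so one model is fit per dimension.
cv_predictions = np.zeros_like(Y)
cv_results = {}
for j, name in enumerate(dimension_names):
    model = SVR(kernel="rbf")  # default hyperparameters (assumption)
    cv_predictions[:, j] = cross_val_predict(model, X, Y[:, j], cv=cv)
    r = np.corrcoef(Y[:, j], cv_predictions[:, j])[0, 1]
    rmse = np.sqrt(np.mean((Y[:, j] - cv_predictions[:, j]) ** 2))
    cv_results[name] = (r, rmse)
    print(f"{name}: r = {r:.2f}, RMSE = {rmse:.2f}")

# Refit on all rated words, then impute scores for unrated vectors.
final_models = [SVR(kernel="rbf").fit(X, Y[:, j]) for j in range(3)]
new_vectors = rng.normal(size=(5, n_dims))  # stand-in for unrated words
imputed = np.column_stack([m.predict(new_vectors) for m in final_models])
```

Because the targets are z-scored, always predicting the mean would give an RMSE of about 1, which is one way to read the "chance at SD = 1" baseline. The numbers printed by this sketch come from synthetic data and are unrelated to the reported results.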

The resulting weighted dictionary was validated in two ways. First, dictionary scores of individual words on the 3 dimensions were correlated with approximately matched human ratings of dominance, arousal, and valence across nearly 14k words normed in Warriner, A. B., Kuperman, V., & Brysbaert, M. (2013). Norms of valence, arousal, and dominance for 13,915 English lemmas. Behavior Research Methods, 45, 1191-1207. The resulting correlations were .57 for dominance-rationality, .27 for arousal-social impact, and .67 for valence-valence. This reliable out-of-sample prediction, particularly for the exact dimension match of valence to valence, suggests that the dictionary creation method was largely successful.
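The word-level comparison above amounts to a Pearson correlation computed over the words shared by the two resources. A minimal standard-library sketch follows; the word lists and scores are made up for illustration, and the helper name `pearson_r` is hypothetical.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation between two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

# Toy scores (assumption): imputed dictionary valence vs. human ratings.
dictionary_valence = {"joy": 1.4, "grief": -1.6, "calm": 0.7, "rage": -1.1}
human_valence = {"joy": 8.2, "grief": 1.9, "calm": 6.9, "table": 5.2}

# Only words present in both resources can be compared.
shared_words = sorted(set(dictionary_valence) & set(human_valence))
r_valence = pearson_r(
    [dictionary_valence[w] for w in shared_words],
    [human_valence[w] for w in shared_words],
)
print(f"{len(shared_words)} shared words, r = {r_valence:.2f}")
```

The same computation would be repeated for each approximately matched pair of dimensions (dominance-rationality, arousal-social impact, valence-valence).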

Second, the 3d affect dictionary was compared with the 14k ratings from Warriner et al. in terms of scoring extended pieces of text. These consisted of sentences from Amazon reviews, entire IMDb reviews, and sentences from Yelp reviews, curated as part of Group to Individual Labels using Deep Features, Kotzias et al., KDD 2015. Each sentence/review was labeled with a binary 1/0 for positive vs. negative, which we attempted to predict using the valence dimensions of both dictionaries. Results favored the 3d affect dictionary over the 14k human-rated words for two of the three validation sets, despite the far smaller set of words originally normed in the affect dictionary (3d affect accuracy listed first): Amazon, 60% vs. 69%; IMDb, 73% vs. 70%; and Yelp, 73% vs. 69%. This superior performance was due in part, though not entirely, to the 3d affect dictionary's large number of tokens, which allowed it to score every piece of text.
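The text-level validation can be illustrated with a small sketch. The README does not say how word scores were aggregated or thresholded, so this example assumes the simplest scheme: average the valence of all recognized tokens and predict "positive" when the mean exceeds zero. The tokenizer, the toy valence table, and the function names are all hypothetical.

```python
import re

def score_text(text, valence):
    """Mean valence of the tokens found in the dictionary, or None if none match."""
    tokens = re.findall(r"[a-z']+", text.lower())
    scores = [valence[t] for t in tokens if t in valence]
    return sum(scores) / len(scores) if scores else None

def classify(text, valence, threshold=0.0):
    """Return 1 (positive), 0 (negative), or None when the text cannot be scored."""
    score = score_text(text, valence)
    if score is None:
        return None
    return 1 if score > threshold else 0

# Toy dictionary and labeled reviews (made up for illustration).
toy_valence = {"great": 1.2, "love": 1.5, "terrible": -1.4, "broke": -0.9}
reviews = [
    ("I love this phone, great battery", 1),
    ("Terrible case, it broke in a week", 0),
    ("Arrived on Tuesday", 1),
]

predictions = [classify(text, toy_valence) for text, _ in reviews]
scored = [(p, y) for p, (_, y) in zip(predictions, reviews) if p is not None]
accuracy = sum(p == y for p, y in scored) / len(scored)
coverage = len(scored) / len(reviews)
print(f"accuracy on scored texts = {accuracy:.2f}, coverage = {coverage:.2f}")
```

The third review contains no dictionary words and cannot be scored. A small dictionary leaves such texts unclassified, which is the coverage advantage the paragraph above attributes to a dictionary spanning roughly 2 million tokens.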

Please cite the paper that originally derived the 3-dimensional model of affect from analysis of patterns of brain activity associated with mental state representation: Tamir, D. I., Thornton, M. A., Contreras, J. M., & Mitchell, J. P. (2016). Neural evidence that three dimensions organize mental state representation: Rationality, social impact, and valence. Proceedings of the National Academy of Sciences of the United States of America, 113(1), 194-199.

This dictionary was built as part of Methods in Neuroscience at Dartmouth (MIND), 2018.

About

Construction and validation of the 3-D affect dictionary used in affectr
