A rating-based sentiment dataset of IMDB movie reviews (WASSA 2014)

daiquocnguyen/SAR14

SAR14: A rating-based sentiment dataset of movie reviews

The SAR14 dataset contains 234k IMDB movie reviews along with their associated rating scores on a 1-10 scale. Specifically, the dataset consists of 167k reviews with positive scores (greater than or equal to 7) and 66k reviews with negative scores (less than or equal to 4). Details about the construction of the dataset, as well as results of sentiment polarity classification, are given in our paper:

@InProceedings{NguyenWASSA2014long,
  author    = {Dai Quoc Nguyen and Dat Quoc Nguyen and Thanh Vu and Son Bao Pham},
  title     = {Sentiment Classification on Polarity Reviews: An Empirical Study Using Rating-based Features},
  booktitle = {Proceedings of the 5th Workshop on Computational Approaches to Subjectivity, Sentiment and Social Media Analysis},
  year      = {2014},
  pages     = {128--135}
}

Please cite the paper whenever SAR14 is used to produce published results or incorporated into other software. SAR14 is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
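
The polarity thresholds described above (positive for scores of 7 or more, negative for scores of 4 or less) can be sketched as a small helper. This is only an illustrative sketch, not code shipped with SAR14: the function name `polarity_from_rating` and the constants are hypothetical, and returning `None` for scores of 5 and 6 is an assumption, since the description above assigns no polarity to those scores.

```python
from typing import Optional

# Thresholds taken from the dataset description above.
POSITIVE_MIN = 7  # scores >= 7 count as positive
NEGATIVE_MAX = 4  # scores <= 4 count as negative


def polarity_from_rating(rating: int) -> Optional[str]:
    """Map a 1-10 rating score to a polarity label (hypothetical helper).

    Returns "positive", "negative", or None for scores of 5 and 6,
    which the description above does not assign to either class.
    """
    if not 1 <= rating <= 10:
        raise ValueError(f"rating must be in 1..10, got {rating}")
    if rating >= POSITIVE_MIN:
        return "positive"
    if rating <= NEGATIVE_MAX:
        return "negative"
    return None  # assumption: mid-range scores carry no polarity label


if __name__ == "__main__":
    for score in (1, 4, 5, 7, 10):
        print(score, polarity_from_rating(score))
```

The helper takes only the numeric score, so it makes no assumption about how the review text and score are stored in the distributed files.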