SAR14: A rating-based sentiment dataset of movie reviews

The SAR14 dataset contains 234k IMDb movie reviews, each paired with its rating score on a 1-10 scale. Specifically, the dataset consists of 167k reviews with positive scores (greater than or equal to 7) and 66k reviews with negative scores (less than or equal to 4). Details on the construction of this dataset, along with results of sentiment polarity classification, are given in our paper:

@InProceedings{NguyenWASSA2014long,
  author    = {Dai Quoc Nguyen and Dat Quoc Nguyen and Thanh Vu and Son Bao Pham},
  title     = {Sentiment Classification on Polarity Reviews: An Empirical Study Using Rating-based Features},
  booktitle = {Proceedings of the 5th Workshop on Computational Approaches to Subjectivity, Sentiment and Social Media Analysis},
  year      = {2014},
  pages     = {128--135}
}

Please cite the paper whenever SAR14 is used to produce published results or incorporated into other software. SAR14 is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
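
The rating thresholds described above can be sketched as a small helper. This is only an illustration of the stated rule (7 or higher is positive, 4 or lower is negative); the function name `rating_to_polarity` is hypothetical and not part of the SAR14 distribution, and nothing here assumes a particular file format for the dataset.

```python
from typing import Optional


def rating_to_polarity(rating: int) -> Optional[str]:
    """Map a 1-10 rating score to a polarity label.

    Hypothetical helper: it only encodes the thresholds stated in this README.
    """
    if not 1 <= rating <= 10:
        raise ValueError(f"rating must be between 1 and 10, got {rating}")
    if rating >= 7:
        return "positive"
    if rating <= 4:
        return "negative"
    # Scores of 5 and 6 fall outside both ranges described above.
    return None
```

For example, `rating_to_polarity(8)` returns `"positive"` and `rating_to_polarity(3)` returns `"negative"`.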