Skip to content

Semantic Textual Similarity: task which consists in evaluating the degree of semantic equivalence between pairs of sentences. Also known as paraphrase detection.

Notifications You must be signed in to change notification settings

albertrial/SemEval-2012-task-6

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

10 Commits
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

SemEval-2012 Task 6: Semantic Textual Similarity

Project done for the IHLT (Introduction to Human Language Technology) course (Master in Artificial Intelligence at UPC)

Authors:

  • Albert Rial
  • Utku Ünal

This repository contains two different approaches to address the SemEval 2012 Task 6 (Semantic Textual Similarity) which consists in, given two pair of sentences, provide a similarity value between them. This task is also known as paraphrases detection. A paraphrase between two sentences or texts is when both have the same meaning using different words.

In the repository you can find the following files and folders:

  • sts-model.ipynb: jupyter notebook containing our first approach together with the explanations and results obtained.

  • sts-model-infersent.ipynb: another jupyter notebook where we have another approach that uses a pre-trained sentence embeddings model (InferSent).

  • sts-summary.pdf: PDF file containing a brief presentation about the work done, the approaches taken and all the results obtained.

  • train/test-gold: datasets of the STS competition provided in the subject and used for training and testing.

Important: In order to run the InferSent approach, it is needed the InferSent library, the fastText word embeddings dataset and a pre-trained model of InferSent.

Results

The best result obtained is (pearson correlation between the gold-standard similarity values and the ones from our system): 82.46%. This result overpasses the result obtained by the winner of the SemEval 2012 competition.

About

Semantic Textual Similarity: task which consists in evaluating the degree of semantic equivalence between pairs of sentences. Also known as paraphrase detection.

Topics

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published