Sarcasm-detection

Sarcasm detection is the second shared task of FigLang 2020, co-located with ACL 2020.

For more information about the shared task and to participate, visit the CodaLab site.

Dataset

Training data can be downloaded from GitHub.

  • label : SARCASM or NOT_SARCASM
  • response : the sarcastic response, whether a sarcastic Tweet or a Reddit post
  • context : the conversation context of the response
    • Note, the context is an ordered list of dialogue, i.e., if the context contains three elements, c1, c2, c3, in that order, then c2 is a reply to c1 and c3 is a reply to c2. Further, if the sarcastic response is r, then r is a reply to c3.

For instance, consider the following example:

"label": "SARCASM", "response": "Did Kelly just call someone else messy? Baaaahaaahahahaha", "context": ["X is looking a First Lady should . #classact, "didn't think it was tailored enough it looked messy"]

The response tweet, "Did Kelly..." is a reply to its immediate context "didn't think it was tailored..." which is a reply to "X is looking...". Your goal is to predict the label of the "response" while also using the context (i.e., the immediate or the full context).

  • Twitter : a dataset of 5,000 English Tweets balanced between the SARCASM and NOT_SARCASM classes.
  • Reddit : a dataset of 4,400 Reddit posts balanced between the SARCASM and NOT_SARCASM classes.
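To get a feel for the format, here is a minimal loading sketch using the jsonlines package from the environment list below. The file name train.jsonl is illustrative, not the official one.

import jsonlines

# Sketch: read a local copy of the training file and join the dialogue
# context with the response into a single input string per example.
examples = []
with jsonlines.open("train.jsonl") as reader:
    for obj in reader:
        label = obj["label"]        # SARCASM or NOT_SARCASM
        response = obj["response"]  # the (possibly sarcastic) reply
        context = obj["context"]    # ordered list c1 -> c2 -> ... -> response
        examples.append((" ".join(context + [response]), label))

print(len(examples), examples[0])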

Environment

  • python 3.6
  • torch 1.0+
  • scikit-learn
  • tqdm
  • pandas
  • jsonlines

Experiments

Twitter

Baseline

Model                     dev/test F1-score    input
bert(uncased-large-wwm)   81.243%              response
bert(uncased-large-wwm)   82.200%              context+response
bert(cased-large-wwm)     82.553% / 69.200%    context+response
bert(cased-large)         83.147% / 72.619%    context+response
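The context+response rows correspond to feeding BERT a sentence pair: the joined context as segment A and the response as segment B. A minimal sketch of that encoding with the HuggingFace transformers tokenizer follows; transformers is not listed in the environment above, so treat the tooling and model name as assumptions rather than the repository's exact code.

import torch
from transformers import BertTokenizer, BertForSequenceClassification

# Sketch only: encode context as segment A and response as segment B,
# then score with a binary classification head.
tokenizer = BertTokenizer.from_pretrained("bert-large-cased")
model = BertForSequenceClassification.from_pretrained("bert-large-cased", num_labels=2)

context = ["X is looking a First Lady should . #classact",
           "didn't think it was tailored enough it looked messy"]
response = "Did Kelly just call someone else messy? Baaaahaaahahahaha"

enc = tokenizer(" ".join(context), response,
                truncation=True, max_length=256, return_tensors="pt")
with torch.no_grad():
    logits = model(**enc).logits   # shape (1, 2): NOT_SARCASM / SARCASM
print(logits.softmax(dim=-1))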

Advanced

Model                     dev/test F1-score
bert(last 3 layer)        83.355% / 73.189%
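One way to read the "last 3 layer" variant: concatenate the [CLS] vectors from the final three encoder layers before the classifier, instead of using only the top layer. The sketch below illustrates that idea; the actual implementation behind run_bert.sh may differ in detail.

import torch
from transformers import BertModel, BertTokenizer

# Sketch: pool the [CLS] token from the last three hidden layers and
# concatenate them as the classification feature.
tokenizer = BertTokenizer.from_pretrained("bert-large-cased")
bert = BertModel.from_pretrained("bert-large-cased", output_hidden_states=True)
classifier = torch.nn.Linear(3 * bert.config.hidden_size, 2)

enc = tokenizer("some context", "some response", return_tensors="pt")
out = bert(**enc)
# hidden_states is a tuple of (embeddings + 24 layers) for bert-large
last3_cls = torch.cat([h[:, 0, :] for h in out.hidden_states[-3:]], dim=-1)
logits = classifier(last3_cls)   # shape (1, 2)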

Reddit

Model                     dev/test F1-score    input
bert(cased-large)+gru     71.352% / 63.042%    response
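The Reddit model runs a GRU over BERT's token-level outputs rather than classifying from the [CLS] vector alone. A rough sketch of that architecture is below; the hidden size and pooling are assumptions, so check the training script for the actual model.

import torch
from transformers import BertModel, BertTokenizer

class BertGRUClassifier(torch.nn.Module):
    """Sketch of bert(cased-large)+gru: BERT token states -> bi-GRU -> linear."""
    def __init__(self, name="bert-large-cased", hidden=256, num_labels=2):
        super().__init__()
        self.bert = BertModel.from_pretrained(name)
        self.gru = torch.nn.GRU(self.bert.config.hidden_size, hidden,
                                batch_first=True, bidirectional=True)
        self.out = torch.nn.Linear(2 * hidden, num_labels)

    def forward(self, input_ids, attention_mask=None, token_type_ids=None):
        states = self.bert(input_ids, attention_mask=attention_mask,
                           token_type_ids=token_type_ids).last_hidden_state
        _, h = self.gru(states)                  # h: (2, batch, hidden)
        feat = torch.cat([h[0], h[1]], dim=-1)   # concatenate both directions
        return self.out(feat)

tokenizer = BertTokenizer.from_pretrained("bert-large-cased")
model = BertGRUClassifier()
enc = tokenizer("a reddit response", return_tensors="pt")
print(model(**enc).shape)   # torch.Size([1, 2])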

Run

1. Download the BERT pretrained model to ./bert-large-cased-wwm and rename the files as:

config.json; pytorch_model.bin; vocab.txt

2. Prepare the training and dev data (4:1 split, 5-fold):

python ./data/twitter/preprocess_twitter.py 
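If you want to reproduce a 4:1 / 5-fold split without the provided script, a minimal sketch with scikit-learn is below. The input and output file names are illustrative and may not match what preprocess_twitter.py actually reads and writes.

import jsonlines
from sklearn.model_selection import KFold

# Sketch: 5-fold split of the Twitter training file; each fold keeps
# 4/5 of the data for training and 1/5 for dev.
with jsonlines.open("train.jsonl") as reader:
    data = list(reader)

kf = KFold(n_splits=5, shuffle=True, random_state=42)
for fold, (train_idx, dev_idx) in enumerate(kf.split(data)):
    with jsonlines.open(f"train_fold{fold}.jsonl", mode="w") as w:
        w.write_all(data[i] for i in train_idx)
    with jsonlines.open(f"dev_fold{fold}.jsonl", mode="w") as w:
        w.write_all(data[i] for i in dev_idx)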

3. Train the model:

sh run_bert.sh
