HU at SemEval-2024 Task 8A: Can Contrastive Learning Learn Embeddings to Detect Machine-Generated Text?

This is the official implementation of our final submission to SemEval-2024 Task 8. The paper is available on arXiv.

Run Locally

Clone

  git clone https://github.com/dipta007/SemEval24-task8

Go to the project directory

  cd SemEval24-task8

Install dependencies

  conda env create -f environment.yml
  conda activate sem24_task8

Download Data

  gdown https://drive.google.com/drive/folders/1FrhMQ5QvMgaeSgcBmZbk7l_GbU-ga99P -O ./data --folder

Run trainer

  python src/train.py --exp_name=EXP_NAME

Final Model Hyperparameters

 'accumulate_grad_batches': 16,
 'batch_size': 2,
 'cls_dropout': 0.6,
 'encoder_type': 'sen',
 'loss_weight_con': 0.7,
 'loss_weight_gen_text': 0.1,
 'loss_weight_text': 0.8,
 'lr': 1e-05,
 'max_doc_len': 64,
 'max_epochs': -1,
 'max_sen_len': 4096,
 'model_name': 'jpwahle/longformer-base-plagiarism-detection',
 'seed': 42,
 'validate_every': 0.04,
 'weight_decay': 0.0
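
As a rough illustration of how some of these values interact, the sketch below computes the per-device effective batch size implied by `batch_size` and `accumulate_grad_batches`, and shows one plausible way the three `loss_weight_*` values could combine into a single training objective. The weighted-sum form and the helper names `effective_batch_size` and `combined_loss` are assumptions made for illustration; they are not taken from `src/train.py`, so check the source for the actual loss definition.

```python
# Subset of the final hyperparameters listed above.
hparams = {
    "accumulate_grad_batches": 16,
    "batch_size": 2,
    "loss_weight_con": 0.7,
    "loss_weight_gen_text": 0.1,
    "loss_weight_text": 0.8,
}


def effective_batch_size(hp):
    """Examples contributing to one optimizer step on a single device
    when gradients are accumulated over several mini-batches."""
    return hp["batch_size"] * hp["accumulate_grad_batches"]


def combined_loss(loss_con, loss_gen_text, loss_text, hp):
    """ASSUMPTION: the three loss terms are combined as a weighted sum.
    This helper is hypothetical and only illustrates the weights' role."""
    return (
        hp["loss_weight_con"] * loss_con
        + hp["loss_weight_gen_text"] * loss_gen_text
        + hp["loss_weight_text"] * loss_text
    )


if __name__ == "__main__":
    print("effective batch size:", effective_batch_size(hparams))
    # Placeholder loss values, chosen only to exercise the function.
    print("combined loss:", combined_loss(1.0, 1.0, 1.0, hparams))
```

With a per-step batch of 2 and 16 accumulation steps, each optimizer update sees 32 examples per device, which is how a small batch can be used with long inputs without shrinking the effective batch.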
