Towards Debiasing Sentence Representations

PyTorch implementation for debiasing sentence representations.

This repository contains code for removing bias from BERT representations and for evaluating the level of bias in them.

Correspondence to:

Paper

Towards Debiasing Sentence Representations
Paul Pu Liang, Irene Li, Emily Zheng, Yao Chong Lim, Ruslan Salakhutdinov, and Louis-Philippe Morency
ACL 2020

If you find this repository useful, please cite our paper:

@inproceedings{liang-etal-2020-towards,
    title = "Towards Debiasing Sentence Representations",
    author = "Liang, Paul Pu  and
      Li, Irene Mengze  and
      Zheng, Emily  and
      Lim, Yao Chong  and
      Salakhutdinov, Ruslan  and
      Morency, Louis-Philippe",
    booktitle = "Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics",
    month = jul,
    year = "2020",
    address = "Online",
    publisher = "Association for Computational Linguistics",
    url = "https://www.aclweb.org/anthology/2020.acl-main.488",
    doi = "10.18653/v1/2020.acl-main.488",
    pages = "5502--5515",
}

Installation

First check that the requirements are satisfied:
Python 3.6
torch 1.2.0
huggingface transformers
numpy 1.18.1
sklearn 0.20.0
matplotlib 3.1.2
gensim 3.8.0
tqdm 4.45.0
regex 2.5.77
pattern3
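
As a quick sanity check before going further, the sketch below reports which of the listed packages cannot be found in the current environment. It is not part of the repository. The import names are assumptions (for example, `sklearn` is the import name of scikit-learn, and `pattern3` is assumed to import under its package name), and the check only tests presence, not the pinned versions.

```python
import importlib.util

# Import names assumed for the packages listed above (not taken from the repo).
REQUIRED_MODULES = [
    "torch", "transformers", "numpy", "sklearn", "matplotlib",
    "gensim", "tqdm", "regex", "pattern3",
]


def missing_modules(names=REQUIRED_MODULES):
    """Return the top-level module names that cannot be located."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules()
    print("Missing modules:", ", ".join(missing) if missing else "none")
```

Any name it reports still needs to be installed at the version given in the list above.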

The next step is to clone the repository:

git clone https://github.com/pliang279/sent_debias.git

To install the BERT models, go to debias-BERT/ and run pip install .

Data

Download the GLUE data by running this script:

python download_glue_data.py --data_dir glue_data --tasks SST,QNLI,CoLA

Unpack it to a directory of your choice and set $GLUE_DIR to that path.
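
The fine-tuning commands later in this README read data from $GLUE_DIR/$TASK_NAME/, so it can help to confirm that layout first. The following is a minimal sketch, assuming one subdirectory per task named SST-2, CoLA, and QNLI. The helper `missing_task_dirs` is hypothetical and not part of the repository.

```python
import os
import tempfile

# Task directory names assumed from the TASK_NAME values used in this README.
TASKS = ["SST-2", "CoLA", "QNLI"]


def missing_task_dirs(glue_dir, tasks=TASKS):
    """Return the task names that have no subdirectory under glue_dir."""
    return [t for t in tasks if not os.path.isdir(os.path.join(glue_dir, t))]


if __name__ == "__main__":
    # Demo on a throwaway directory that contains only the SST-2 folder.
    with tempfile.TemporaryDirectory() as demo_dir:
        os.mkdir(os.path.join(demo_dir, "SST-2"))
        print(missing_task_dirs(demo_dir))
```

In practice you would call `missing_task_dirs(os.environ["GLUE_DIR"])` after exporting the variable.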

Precomputed models and embeddings (optional)

  1. Models

  2. Embeddings

Usage

If you use the precomputed models and embeddings, skip to step B. Otherwise, follow steps A and B in order.

A. Fine-tune BERT

  1. Go to debias-BERT/experiments.
  2. Run export TASK_NAME=SST-2 (task can be one of SST-2, CoLA, and QNLI).
  3. Fine-tune BERT on $TASK_NAME.
    • With debiasing
      python run_classifier.py \
      --data_dir $GLUE_DIR/$TASK_NAME/ \
      --task_name $TASK_NAME \
      --output_dir path/to/results_directory \
      --do_train \
      --do_eval \
      --do_lower_case \
      --debias \
      --normalize \
      --tune_bert 
      
    • Without debiasing
      python run_classifier.py \
      --data_dir $GLUE_DIR/$TASK_NAME/ \
      --task_name $TASK_NAME \
      --output_dir path/to/results_directory \
      --do_train \
      --do_eval \
      --do_lower_case \
      --normalize \
      --tune_bert 
      
    The fine-tuned model and dev set evaluation results will be stored under the specified output_dir.
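
The two commands in step A differ only in the --debias flag. If you want to launch both runs from a script, the sketch below assembles the argument list from the flags shown above. The helper `finetune_command` is hypothetical and uses only options that appear in this README.

```python
def finetune_command(task_name, glue_dir, output_dir, debias=True):
    """Build the run_classifier.py argument list shown in step A."""
    cmd = [
        "python", "run_classifier.py",
        "--data_dir", "{}/{}/".format(glue_dir.rstrip("/"), task_name),
        "--task_name", task_name,
        "--output_dir", output_dir,
        "--do_train",
        "--do_eval",
        "--do_lower_case",
    ]
    if debias:
        # The only difference between the two variants in step A.
        cmd.append("--debias")
    cmd += ["--normalize", "--tune_bert"]
    return cmd


if __name__ == "__main__":
    print(" ".join(finetune_command("SST-2", "glue_data", "results/sst2_debiased")))
```

The resulting list can be passed to `subprocess.run(cmd, cwd="debias-BERT/experiments", check=True)`; the paths in the demo are placeholders.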

B. Evaluate bias in BERT representations

  1. Go to debias-BERT/experiments.

  2. Run export TASK_NAME=SST-2 (task can be one of SST-2, CoLA, and QNLI).

  3. Evaluate the bias level of fine-tuned BERT.

    • Evaluate debiased fine-tuned BERT.
        python eval_bias.py \
        --debias \
        --model_path path/to/model \
        --model $TASK_NAME \
        --results_dir path/to/results_directory \
        --output_name debiased
      
      If using precomputed models, set model_path to acl2020-results/$TASK_NAME/debiased.
    • Evaluate biased fine-tuned BERT.
        python eval_bias.py \
        --model_path path/to/model \
        --model $TASK_NAME \
        --results_dir path/to/results_directory \
        --output_name biased
      
      If using precomputed models, set model_path to acl2020-results/$TASK_NAME/biased.

    The evaluation results will be stored in the file results_dir/output_name.

    Note: The argument model_path should be specified as the output_dir corresponding to the fine-tuned model you want to evaluate. Specifically, model_path should be a directory containing the following files: config.json, pytorch_model.bin and vocab.txt.

  4. Evaluate the bias level of pretrained BERT.

    • Evaluate debiased pretrained BERT.
      python eval_bias.py \
      --debias \
      --model pretrained \
      --results_dir path/to/results_directory \
      --output_name debiased 
      
    • Evaluate biased pretrained BERT.
      python eval_bias.py \
      --model pretrained \
      --results_dir path/to/results_directory \
      --output_name biased 
      

    Again, the bias evaluation results will be stored in the file results_dir/output_name.
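
The note in step B.3 says that model_path must contain config.json, pytorch_model.bin and vocab.txt. The sketch below checks for those files and assembles the eval_bias.py arguments for either the fine-tuned or the pretrained case. Both helpers are hypothetical and use only options that appear in this README.

```python
import os

# Files the note in step B.3 says model_path must contain.
REQUIRED_FILES = ("config.json", "pytorch_model.bin", "vocab.txt")


def missing_model_files(model_path):
    """Return the required files that are absent from model_path."""
    return [
        name for name in REQUIRED_FILES
        if not os.path.isfile(os.path.join(model_path, name))
    ]


def eval_bias_command(model, results_dir, output_name,
                      model_path=None, debias=False):
    """Build the eval_bias.py argument list shown in step B."""
    cmd = ["python", "eval_bias.py"]
    if debias:
        cmd.append("--debias")
    if model_path is not None:
        # Only the fine-tuned case passes --model_path; "pretrained" does not.
        cmd += ["--model_path", model_path]
    cmd += [
        "--model", model,
        "--results_dir", results_dir,
        "--output_name", output_name,
    ]
    return cmd


if __name__ == "__main__":
    print(" ".join(eval_bias_command("pretrained", "results", "biased")))
```

Checking `missing_model_files(model_path)` before launching the evaluation gives an early, readable failure if the fine-tuning output is incomplete.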