Training on the Test Task

Code to reproduce the experiments, figures, and tables of the paper "Training on the Test Task Confounds Evaluation and Emergence".

  • The folder experiments/ contains the code to fine-tune models on the task-relevant datasets considered in the paper, and to evaluate models using the LM Evaluation Harness library.
  • The folder notebooks/evaluations contains the model evaluation files.
  • The Jupyter notebook notebooks/figures.ipynb reproduces the figures and tables in the paper.
  • The fine-tuned models are currently being uploaded here.
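
As a rough illustration of the evaluation step, the sketch below assembles a command line for the LM Evaluation Harness. It is not taken from experiments/. It assumes a recent harness release that installs the `lm_eval` command, and the helper name `build_eval_command`, the model path, and the task names are hypothetical placeholders chosen for the example.

```python
import shlex


def build_eval_command(model_path, tasks, batch_size=8, output_path=None):
    """Build an argv list for the `lm_eval` CLI (hypothetical helper).

    model_path: local directory or hub identifier of a Hugging Face model.
    tasks: list of harness task names to evaluate on.
    """
    cmd = [
        "lm_eval",
        "--model", "hf",                              # Hugging Face backend
        "--model_args", f"pretrained={model_path}",
        "--tasks", ",".join(tasks),                   # comma-separated task list
        "--batch_size", str(batch_size),
    ]
    if output_path is not None:
        cmd += ["--output_path", output_path]         # where results are written
    return cmd


if __name__ == "__main__":
    # Placeholder model path and tasks, for illustration only.
    cmd = build_eval_command("path/to/fine-tuned-model", ["mmlu", "gsm8k"])
    print(shlex.join(cmd))
```

The function only builds the command. To run it, pass the returned list to `subprocess.run` in an environment where the harness is installed.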