# PoliTune: Analyzing the Impact of Data Selection and Fine-Tuning on Economic and Political Biases in Large Language Models

This repository provides training scripts for fine-tuning LLMs using our preference datasets as described in the paper.

## Dataset

The datasets are hosted on the Hugging Face Hub. There are two preference datasets, matching the two `dataset._component_` options in the fine-tuning command below:

- `politune_left` - preference data for steering the model toward left-leaning responses.
- `politune_right` - preference data for steering the model toward right-leaning responses.
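For quick inspection outside of torchtune, either dataset can be loaded with the Hugging Face `datasets` library. This is a minimal sketch: the repository id `scale-lab/politune_left` and the `prompt`/`chosen`/`rejected` column names are assumptions for illustration (the conventional DPO preference format), not a confirmed schema; substitute the actual dataset id from the Hub.

```python
# Minimal sketch: load one of the PoliTune preference datasets from the Hub.
# NOTE: the repository id and column names below are hypothetical.
from datasets import load_dataset

ds = load_dataset("scale-lab/politune_left", split="train")

example = ds[0]
print(example["prompt"])    # instruction shown to the model
print(example["chosen"])    # preferred response
print(example["rejected"])  # dispreferred response
```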

## Repository Structure

- `configs/` - the torchtune training recipes (YAML) for the LLMs; a sketch of their shape follows this list.
- `data/` - the dataset wrappers.
- `finetune/` - the fine-tuning script.
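The command-line overrides in the fine-tuning section below map onto top-level keys in these recipes. As a rough sketch of their shape (illustrative values and an assumed checkpointer component, not the actual contents of `configs/llama8b_lora_dpo_single_device.yaml`):

```yaml
# Illustrative sketch only; values and the checkpointer _component_ are assumptions.
output_dir: output/                 # where outputs and logs are written
checkpointer:
  _component_: torchtune.training.FullModelHFCheckpointer  # assumed component
  output_dir: checkpoints/          # where the fine-tuned model is saved
dataset:
  _component_: data.datasets.politune_left  # or data.datasets.politune_right
```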

## Dependencies

The codebase depends on [torchtune](https://github.com/pytorch/torchtune) and the Hugging Face ecosystem (the datasets are hosted on the Hub).
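A typical setup might look like the following (package names only; versions are not pinned in this snippet and may matter):

```sh
pip install torch torchtune datasets huggingface_hub
```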

## Fine-Tuning the Model

To fine-tune the model, follow these steps:

1. Download the model weights using torchtune's `tune download` (an example follows these steps).
2. Make sure the configuration file under `configs/` points to the downloaded model.
3. Run the fine-tuning process with torchtune:

   ```sh
   tune run finetune/dpo_finetune.py --config configs/<config file> \
     checkpointer.output_dir=<path to save the fine-tuned model> \
     output_dir=<path to save the outputs and logs> \
     dataset._component_=<data.datasets.politune_left|data.datasets.politune_right>
   ```

   For example:

   ```sh
   tune run finetune/dpo_finetune.py --config configs/llama8b_lora_dpo_single_device.yaml \
     checkpointer.output_dir=checkpoints/ \
     output_dir=output/ \
     dataset._component_=data.datasets.politune_left
   ```
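For step 1, a hedged example of the download command, assuming the Llama 3 8B Instruct weights suggested by the `llama8b` config name; substitute whichever model your chosen config actually expects:

```sh
tune download meta-llama/Meta-Llama-3-8B-Instruct \
  --output-dir checkpoints/Meta-Llama-3-8B-Instruct \
  --hf-token <your Hugging Face token>
```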

## Citation

If you use this codebase or the datasets in your work, please cite our paper:

```bibtex
@inproceedings{agiza2024politune,
  title={PoliTune: Analyzing the Impact of Data Selection and Fine-Tuning on Economic and Political Biases in Large Language Models},
  author={Agiza, Ahmed and Mostagir, Mohamed and Reda, Sherief},
  booktitle={Proceedings of the 2024 AAAI/ACM Conference on AI, Ethics, and Society},
  year={2024}
}
```

## License

MIT License. See the [LICENSE](LICENSE) file.
