HonestyLLM

This repository contains scripts and configurations for our paper "The Best of Both Worlds: Toward an Honest and Helpful Large Language Model".

Introduction

This repository focuses on enhancing the honesty and helpfulness of Large Language Models (LLMs) in real-world applications. Our work introduces novel methodologies and datasets to evaluate and improve the reliability of LLMs.

Components

HoneSet Dataset: A novel dataset containing 930 queries across six categories, crafted to evaluate the honesty of LLMs.
Two Enhancement Approaches:
- Training-Free Enhancement: Leverages curiosity-driven prompting to help LLMs express uncertainty and refine their responses.
- Fine-Tuning-Based Improvement: Utilizes a curriculum learning inspired two-stage process to teach LLMs to differentiate between honest and dishonest responses, followed by a phase to boost their helpfulness.

HoneSet

Honeset is located in dataset/HoneSet.json which contains 930 data items across 6 categories as follows:

Category
Latest Information with External Services
User Input Not Enough Or With Wrong Information
Self Identity Cognition
Modality Mismatch
Professional Capability in Specific Domain
Interactivity Sensory Processing

Training-free Enhancement

Requirements

Python 3.x
Libraries: openai, replicate, requests, tenacity, concurrent.futures, anthropic, torch, yaml, argparse, dotenv
API keys and model mappings for Azure, replicate, deepinfra and other services.

Configuration Steps

Edit Configuration:
- Navigate to the training_free/config.yaml file.
- Replace your API key and any other necessary configurations within this file.
Script Location:
- Ensure that you are in the directory containing the training_free.sh script.

Set Model Parameters:

model_type can be online or local

model_name can be as follows:

Model_name input	Model
gpt-4	GPT-4
chatgpt	ChatGPT
claude	Claude3-Opus
llama3-70b	Llama3-70b
llama3-8b	Llama3-8b
mixtral-8x7b	Mixtral-8x7b
llama2-7b	Llama2-7b
llama2-13b	Llama2-13b
llama2-70b	Llama2-70b
mistral-7b	Mistral-7b

Command Line Arguments

Online Mode When running the script in online mode, use the following parameters:

./training_free.sh online [model_name]

local Mode When running the script in local mode, you can specify additional parameters:
--temperature (default = 0): Controls the randomness of the response generation. Higher values produce more varied outputs.
--repetition_penalty (default = 1.0): Penalizes repetition to encourage more diverse responses.
--num_gpus (default = 1): Specifies the number of GPUs to use.
--max_length (default = 2048): Limits the number of tokens in the response.
--debug (default = false): Enables debug mode for more verbose output.
--model_path (default = ''): The path to the model files, necessary in local mode.
--filename (default = ''): Specifies the output filename.
--test_type (default = 'plugin'): Sets the type of testing or processing.
--online (default = 'False'): Indicates whether to run the model in online mode.

./training_free.sh local [model_name] --temperature [value] --repetition_penalty [value] --num_gpus [value] --max_length [value] --debug --model_path [path] --filename [filename] --test_type [type]

Improvement through fine-tuning

Overview

This repository contains scripts and configurations for fine-tuning, merging, and running inference with Llama models using LLaMA-Factory.

Requirements

LLaMA-Factory installed
Install LLaMA-Factory

git clone --depth 1 https://github.com/hiyouga/LLaMA-Factory.git
cd LLaMA-Factory
pip install -e .[torch,metrics]

Run Fine-tuning

Fine-Tuning

To fine-tune the model, use the following command:

llamafactory-cli train train_config.yaml

Replace train_config.yaml with one setting in finetuning/*.yaml

Merging Stage 1 Model

After fine-tuning, merge the stage 1 model using:

llamafactory-cli export merge_lora_dpo.yaml

Make sure merge_lora_dpo.yaml is configured with the appropriate merging parameters.

Running Model Inference

To run model inference, execute:

llamafactory-cli api model_inference.yaml

Ensure model_inference.yaml contains the correct inference settings.

Citation

@misc{gao2024best,
      title={The Best of Both Worlds: Toward an Honest and Helpful Large Language Model}, 
      author={Chujie Gao and Qihui Zhang and Dongping Chen and Yue Huang and Siyuan Wu and Zhengyan Fu and Yao Wan and Xiangliang Zhang and Lichao Sun},
      year={2024},
      eprint={2406.00380},
      archivePrefix={arXiv},
      primaryClass={cs.CL}
}

Name		Name	Last commit message	Last commit date
Latest commit History 29 Commits
.idea		.idea
dataset		dataset
evaluation		evaluation
finetuning		finetuning
image		image
training_free		training_free
.gitignore		.gitignore
README.md		README.md

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

HonestyLLM

Table of Contents

Introduction

Components

HoneSet

Training-free Enhancement

Requirements

Configuration Steps

Command Line Arguments

Improvement through fine-tuning

Overview

Requirements

Run Fine-tuning

Fine-Tuning

Merging Stage 1 Model

Running Model Inference

Citation

About

Releases

Packages

Contributors 3

Languages

Flossiee/HonestyLLM

Folders and files

Latest commit

History

Repository files navigation

HonestyLLM

Table of Contents

Introduction

Components

HoneSet

Training-free Enhancement

Requirements

Configuration Steps

Command Line Arguments

Improvement through fine-tuning

Overview

Requirements

Run Fine-tuning

Fine-Tuning

Merging Stage 1 Model

Running Model Inference

Citation

About

Resources

Stars

Watchers

Forks

Releases

Packages 0

Contributors 3

Languages

Packages