2403.10131.md

Background

  • Background: This paper investigates how to enhance the question-answering capabilities of large language models (LLMs) in domain-specific settings by providing retrieved auxiliary documents. When current LLMs face open-book, exam-style questions, they are sometimes distracted by irrelevant documents, which degrades the quality of their answers and reasoning.

  • Existing Work: Existing approaches train models in the standard Retrieval-Augmented Generation (RAG) setup, in which a model can draw on additional retrieved documents to answer a question, on top of the knowledge acquired during pre-training or fine-tuning. However, in this setup the model may look for answers in documents it should not use, meaning it is negatively influenced by "distractor documents" (a minimal sketch of this setup follows below).
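
To make the setup concrete, here is a minimal sketch of how a standard RAG prompt is typically assembled; the helper name and prompt template are illustrative assumptions, not taken from the paper.

```python
# Minimal sketch of the standard RAG setup: retrieved documents (which may
# include distractors) are concatenated into the prompt before the question.
# The function name and prompt template are illustrative, not the paper's code.

def build_rag_prompt(question: str, retrieved_docs: list[str]) -> str:
    """Place all retrieved documents in the context, then ask the question."""
    context = "\n\n".join(
        f"Document {i + 1}:\n{doc}" for i, doc in enumerate(retrieved_docs)
    )
    return f"{context}\n\nQuestion: {question}\nAnswer:"

# Nothing in this prompt tells the model which documents are relevant,
# so distractor documents can pull the answer off course.
```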

Core Contributions

  • Introduced a training recipe called RAFT (Retrieval Augmented Fine-Tuning)
    • Challenge 1: How to make the model ignore distractor documents and focus on the relevant ones. RAFT trains the model to identify and quote the relevant sequence from the oracle document when answering, which improves both its answers and its reasoning. To address the distractor-document challenge, RAFT also removes the oracle document, from which the answer could be deduced, from a fraction of the training examples, forcing the model to memorize those answers rather than always deriving them from the provided context (see the data-construction sketch after this list).

    • Challenge 2: How to enhance the model's reasoning process for answering questions. To raise the quality of the training targets, the authors generate a chain-of-thought explanation for each provided answer. The experiments show that producing a complete reasoning chain that clearly cites its sources significantly improves the model's accuracy in answering questions.
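
As referenced in the list above, here is a minimal sketch of how RAFT-style training examples might be assembled, assuming a keep-oracle probability p and a chain-of-thought target that quotes the supporting span; the field names and answer template are illustrative, not the paper's exact format.

```python
import random

def make_raft_example(question, oracle_doc, distractor_docs, cot_answer, p=0.8, rng=random):
    """Build one RAFT-style training example (illustrative sketch, not the paper's code).

    With probability p the oracle document is kept alongside the distractors;
    otherwise only distractors are included, so the model cannot always rely on
    the context and must also internalize the domain knowledge.
    """
    docs = list(distractor_docs)
    if rng.random() < p:
        docs.append(oracle_doc)  # oracle present: learn to locate and quote it
    rng.shuffle(docs)            # oracle and distractors appear in random order
    return {
        "context": docs,
        "question": question,
        # Target output: a chain-of-thought that quotes the supporting span from
        # the oracle document before stating the final answer, e.g.
        # "Reasoning: ... 'quoted span' ... Answer: <answer>" (template assumed).
        "answer": cot_answer,
    }
```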

Implementation and Deployment

The experimental results show that RAFT outperforms both domain-specific fine-tuned models and general-purpose models with RAG when handling domain-specific documents. Its advantage is especially clear on datasets such as HotpotQA and the HuggingFace dataset. The paper also examines the importance of training models on chain-of-thought (CoT) responses and shows that models trained with CoT significantly outperform the baselines. The experiments further include qualitative analysis and a study of the model's robustness in a top-k RAG setup, i.e. how well accuracy holds up as the number of retrieved documents varies (a sketch of this kind of check follows below).
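
Below is a minimal sketch of the kind of top-k robustness check described above; the model, retriever, and scoring interfaces are hypothetical placeholders, not the paper's evaluation code.

```python
def topk_robustness(generate, retrieve, score, eval_set, ks=(2, 4, 6, 8, 10)):
    """Measure answer quality as the number of retrieved documents grows.

    Assumed interfaces: generate(prompt) -> answer string,
    retrieve(question, k) -> list of documents,
    score(predictions, references) -> float.
    """
    results = {}
    for k in ks:
        predictions = []
        for example in eval_set:
            docs = retrieve(example["question"], k)  # top-k documents, some irrelevant
            context = "\n\n".join(docs)
            prompt = f"{context}\n\nQuestion: {example['question']}\nAnswer:"
            predictions.append(generate(prompt))
        results[k] = score(predictions, [example["answer"] for example in eval_set])
    return results  # maps each top-k setting to its answer-quality score
```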

Summary

The RAFT approach proposed in this paper rethinks how large language models are trained to answer questions over domain-specific documents in an "open-book" manner: it strengthens the model's reasoning, makes it more resistant to distractor documents, and improves answer accuracy through chain-of-thought training.