
2312.17485.md


Background

  • Background: The paper examines how Large Language Models (LLMs) can repair defects that reviewers and automated checkers identify during the code review (CR) process. Review comments guide the repair of most CR defects because they often suggest how to fix the problem, and LLMs, like developers, can understand feedback written in both natural and programming languages and generate repairs from it.

  • Existing Work: Prior research has not fully exploited code review feedback to identify and repair code defects efficiently, even though such feedback can be a potent tool during software development.

Core Contributions

  • Introduced a semi-automated automated program repair (APR) paradigm
    • Challenge 1: Effectively utilizing LLMs to comprehend review comments and generate repair patches. The authors implemented a semi-automated APR paradigm that uses LLMs to comprehend code review comments and generate repair patches, addressing the challenge of guiding defect repair with natural language feedback.

    • Challenge 2: Designing effective prompts to guide LLMs in repairing code. The authors designed 7 prompts based on information from the CR process and found that review comments and fix ranges were the two most effective prompt components. Cross-validation on two distinct datasets established the importance of data diversity.
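As an illustration of how review comments and fix ranges might be combined into a repair prompt, here is a minimal Python sketch. The `ReviewDefect` schema, the `build_prompt` helper, and the prompt wording are all hypothetical; the summary does not give the paper's actual prompt templates.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class ReviewDefect:
    """One code review defect (hypothetical schema, not the paper's)."""
    code: str                                     # code under review
    review_comment: Optional[str] = None          # reviewer or checker feedback
    fix_range: Optional[Tuple[int, int]] = None   # 1-indexed, inclusive lines


def build_prompt(defect: ReviewDefect) -> str:
    """Assemble a repair prompt from whichever CR fields are available."""
    # Number the lines so that a fix range can refer to them unambiguously.
    numbered = "\n".join(
        f"{i}: {line}"
        for i, line in enumerate(defect.code.splitlines(), start=1)
    )
    parts = ["Fix the defect in the following code.", "Code:", numbered]
    if defect.review_comment:
        parts.append(f"Review comment: {defect.review_comment}")
    if defect.fix_range:
        start, end = defect.fix_range
        parts.append(f"Only modify lines {start} to {end}.")
    parts.append("Return the full repaired code.")
    return "\n".join(parts)


if __name__ == "__main__":
    example = ReviewDefect(
        code="total = 0\nratio = total / 0",
        review_comment="Division by zero on line 2.",
        fix_range=(2, 2),
    )
    print(build_prompt(example))
```

Making each field optional lets the same builder produce prompt variants with or without the review comment and fix range, which is one simple way to compare their effect.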

Implementation and Deployment

The research team evaluated 9 LLMs, including ChatGPT-3.5, ChatGPT-4, LLaMA, and CodeLLaMA, on seven different prompts and compared the repair success rate of each model under each prompt. CodeLLaMA was the top performer, with a repair success rate of up to 72.97%. To show the importance of dataset diversity, they compared two datasets consisting of approximately 16K review comments and 15K automatically generated comments. The study also presents an example in which CodeLLaMA repaired a complicated defect that required multiple code modifications.
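A per-model, per-prompt comparison of this kind could be tabulated along the following lines. This is only a sketch: the summary does not state how a repair is judged successful, so the exact-match criterion after whitespace normalization is an assumption, and `normalize` and `success_rates` are hypothetical helper names.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple


def normalize(code: str) -> str:
    """Strip each line and drop blank lines so formatting does not count."""
    return "\n".join(
        line.strip() for line in code.splitlines() if line.strip()
    )


def success_rates(
    results: Iterable[Tuple[str, str, str, str]],
) -> Dict[Tuple[str, str], float]:
    """Return the success rate in percent for each (model, prompt_id).

    Each result is (model, prompt_id, generated_patch, reference_fix).
    A repair counts as successful when the generated patch equals the
    reference fix after normalization (an assumed criterion).
    """
    hits: Dict[Tuple[str, str], int] = defaultdict(int)
    totals: Dict[Tuple[str, str], int] = defaultdict(int)
    for model, prompt_id, generated, reference in results:
        key = (model, prompt_id)
        totals[key] += 1
        if normalize(generated) == normalize(reference):
            hits[key] += 1
    return {key: 100.0 * hits[key] / totals[key] for key in totals}


if __name__ == "__main__":
    # Toy records with placeholder model and prompt names, not paper data.
    toy = [
        ("model-a", "P1", "x = 1", "x = 1"),
        ("model-a", "P1", "x = 2", "x = 1"),
        ("model-b", "P1", "  x = 1\n", "x = 1"),
    ]
    for key, rate in sorted(success_rates(toy).items()):
        print(key, f"{rate:.2f}%")
```

Exact match is a strict criterion; a semantically correct patch that differs textually from the reference would be counted as a failure under this sketch.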

Summary

The research explores the application of LLMs in repairing code review defects, introduces an effective semi-automated APR paradigm, analyzes the performance of 9 popular models, and designs effective prompts to guide the code repair process.