Background

  • Background: The research finds that LLMs struggle to generate correct code for problems involving complex data structures and algorithms, and that existing debugging techniques are insufficient for such problems.

  • Existing Work: Current debugging methods such as rubber duck debugging give the model no access to variable values or execution flow at run time, which limits their effectiveness on complex algorithmic code. They also fail to make full use of test case feedback.

Core Contributions

  • Introducing a "print debugging" method
    • Challenge 1: Real-time tracking of execution flow and variable values. Existing methods cannot monitor execution flow or variable values while the code runs. The paper addresses this by guiding LLMs to use print debugging: the model inserts print statements into the code, runs it, and analyzes the resulting logs to locate and fix bugs.

    • Challenge 2: Effective use of test case feedback. Telling an LLM only that a test case failed is not enough for it to localize the bug. The proposed method has the model add print statements to the code and compare the printed output against the explanation that accompanies the test case; the inconsistencies point to where the bug is.
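
The two challenges above can be illustrated with a small sketch. This is not the paper's implementation: the buggy function, the helper names `run_with_log` and `build_debug_prompt`, and the prompt wording are all hypothetical, chosen only to show how inserted print statements produce a log that can be set next to a failed test case and its explanation.

```python
import contextlib
import io


def max_subarray_buggy(nums):
    """Maximum subarray sum with a deliberate bug, instrumented with a print."""
    best = 0  # bug: should start at nums[0], so all-negative inputs fail
    current = 0
    for i, x in enumerate(nums):
        current = max(x, current + x)
        best = max(best, current)
        # Print statement of the kind a model might insert while debugging.
        print(f"i={i} x={x} current={current} best={best}")
    return best


def run_with_log(func, *args):
    """Run func and return its result plus everything it printed, line by line."""
    buffer = io.StringIO()
    with contextlib.redirect_stdout(buffer):
        result = func(*args)
    return result, buffer.getvalue().splitlines()


def build_debug_prompt(test_input, expected, actual, explanation, log_lines):
    """Assemble a (hypothetical) prompt that pairs the failed test with the log."""
    return "\n".join(
        [
            f"Failed test input: {test_input}",
            f"Expected output: {expected}",
            f"Actual output: {actual}",
            f"Test case explanation: {explanation}",
            "Execution log:",
            *log_lines,
            "Compare the log with the explanation, locate the bug, and fix it.",
        ]
    )


if __name__ == "__main__":
    nums = [-3, -1, -2]
    actual, log = run_with_log(max_subarray_buggy, nums)
    print(
        build_debug_prompt(
            nums, -1, actual, "The subarray [-1] has the largest sum, -1.", log
        )
    )
```

In this sketch the log shows `best=0` on the very first iteration, before any subarray could have produced that value, which contradicts the test case explanation and points at the initialization of `best`. A model given only "expected -1, got 0" would have to infer this without seeing the intermediate state.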

Implementation and Deployment

Experiments with GPT-4 on the Leetcode dataset show that print debugging outperforms rubber duck debugging by 1.5% on easy problems and 17.9% on medium problems. On hard problems, however, no improvement was observed: accuracy remained at only 5%, which suggests that more sophisticated approaches are needed for the most difficult problems.

Summary

The paper proposes using print debugging to guide LLMs in generating and debugging code, and validates the approach on the Leetcode dataset, where it is most effective on easy and medium problems. Although it brings no gain on hard problems, the work is a meaningful step forward for LLM-based code debugging.