CodePlan: Repository-Level Coding Using LLMs and Planning

Overview

Authors: Ramakrishna Bairi, Atharv Sonwane, Aditya Kanade, Vageesh D C, Arun Iyer, Suresh Parthasarathy, Sriram Rajamani, B. Ashok, Shashank Shet
Institution: Microsoft Research, India
Topic: Automates repository-level coding tasks using Large Language Models (LLMs) and planning.
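Challenge: Repository-level tasks require pervasive edits across a codebase. LLMs cannot address them directly because the code is inter-dependent and the entire repository may be too large to fit into a prompt.
Solution: CodePlan, a task-agnostic framework that synthesizes a multi-step chain of edits (a plan). Each step is an LLM call on one code location, with context drawn from the entire repository, previous code changes, and task-specific instructions.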

Key Components:
- Incremental Dependency Analysis: Tracks syntactic and semantic relations between code elements.
- Change May-Impact Analysis: Identifies potential impacts of code changes on the rest of the repository.
- Adaptive Planning Algorithm: Constructs a plan graph to guide edits based on dependency analysis.
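
The first two components above can be illustrated with a minimal sketch. It assumes a graph whose edges mean "src depends on dst" and approximates may-impact analysis by following those edges backwards. The class name, method names, and relation labels are hypothetical and are not taken from the paper, whose analysis is more fine-grained than plain reverse reachability.

```python
from collections import defaultdict, deque


class DependencyGraph:
    """Toy dependency graph: an edge (src, relation, dst) means src depends on dst."""

    def __init__(self):
        # dst -> {(src, relation)}: who depends on dst, and how
        self._dependents = defaultdict(set)

    def add_edge(self, src, relation, dst):
        self._dependents[dst].add((src, relation))

    def remove_outgoing(self, node):
        """Incremental update: forget what `node` depends on before re-analysing it."""
        for dst in list(self._dependents):
            self._dependents[dst] = {
                (s, r) for s, r in self._dependents[dst] if s != node
            }

    def may_impact(self, changed, transitive=False):
        """Return elements that depend on `changed` (directly, or transitively)."""
        impacted, queue = set(), deque([changed])
        while queue:
            node = queue.popleft()
            for src, _relation in self._dependents.get(node, ()):
                if src != changed and src not in impacted:
                    impacted.add(src)
                    if transitive:
                        queue.append(src)
        return impacted


# Hypothetical three-element repository
graph = DependencyGraph()
graph.add_edge("app.main", "calls", "utils.parse")
graph.add_edge("cli.run", "calls", "app.main")
graph.add_edge("utils.parse", "imports", "json")

direct = graph.may_impact("utils.parse")
reachable = graph.may_impact("utils.parse", transitive=True)
```

Here `direct` holds only the immediate caller of `utils.parse`, while `reachable` also includes `cli.run`. A planner would typically start from the direct set and let further edits pull in the rest, which avoids editing code that turns out not to need a change.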

Tasks Evaluated: Package migration (C#) and temporal code edits (Python).
Results: CodePlan matched the ground-truth edits more closely than the baselines and got more repositories to pass validity checks (for example, building without errors).

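The paper frames repository-level coding as a planning problem.
LLM-driven Repository-level Coding Task: Given a repository, a set of seed edit specifications, and an oracle that defines validity, reach a repository state the oracle accepts by carrying out the seed edits and the further edit specifications derived from them.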

CodePlan Framework:
- Constructs a plan graph.
- Uses dependency analysis to identify areas needing changes.
- Adapts the plan as code changes occur.
- Validates repository state with an oracle after each plan execution.
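
The loop described by these four steps might look roughly like the sketch below. It is a simplification under stated assumptions: the plan graph is flattened into a list whose nodes keep a `cause` link, the repository is a dict from block id to code, and `edit_block`, `may_impact`, and `oracle` are hypothetical callables standing in for the LLM call, the dependency analysis, and the validity check. None of these names come from the paper.

```python
from collections import deque
from dataclasses import dataclass
from typing import Optional


@dataclass
class Obligation:
    """One node of the plan: an edit to make at a code block."""
    block_id: str
    instruction: str
    cause: Optional["Obligation"] = None  # the edit that triggered this one
    status: str = "pending"


def codeplan_loop(repo, seeds, edit_block, may_impact, oracle, max_rounds=5):
    plan = []
    frontier = deque(Obligation(b, i) for b, i in seeds)
    for _ in range(max_rounds):
        while frontier:
            node = frontier.popleft()
            plan.append(node)
            # Temporal context: edits already completed in this plan
            history = [p for p in plan if p.status == "completed"]
            new_code = edit_block(repo[node.block_id], node.instruction, history)
            changed = new_code != repo[node.block_id]
            repo[node.block_id] = new_code
            node.status = "completed"
            if changed:
                # Adapt the plan: queue blocks the change may have affected
                queued = {o.block_id for o in frontier}
                for other in may_impact(repo, node.block_id):
                    if other not in queued:
                        instruction = f"Adapt to the change in {node.block_id}"
                        frontier.append(Obligation(other, instruction, cause=node))
                        queued.add(other)
        errors = oracle(repo)  # list of (block_id, message); empty means valid
        if not errors:
            return True, plan
        # Errors found by the oracle seed the next round of planning
        frontier.extend(Obligation(b, msg) for b, msg in errors)
    return False, plan


# Toy stand-ins (assumptions for illustration only)
repo = {"lib": "def f(): old_api()", "app": "f(); old_api()", "doc": "no code"}
deps = {"lib": ["app"]}
edit = lambda code, instruction, history: code.replace("old_api", "new_api")
impact = lambda repo, block_id: deps.get(block_id, [])
check = lambda repo: [(b, "uses old_api") for b, c in repo.items() if "old_api" in c]

ok, plan = codeplan_loop(repo, [("lib", "Migrate old_api to new_api")], edit, impact, check)
```

In this toy run the seed edit to `lib` causes a derived edit to `app` before the oracle is consulted, so the oracle only confirms the result rather than driving the search.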

Datasets Used: Internal and external repositories.
Oracles: C# build tools for migration tasks; Pyright for temporal edits.
Baselines: Oracle-Guided Repair (reactive approach based on errors flagged by oracles).
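
For contrast, a reactive baseline of this kind can be sketched as follows. This is an assumed reading of "oracle-guided repair", not the authors' implementation: the seed edits are applied, and then only the blocks the oracle complains about are revisited. `edit_block` and `oracle` are hypothetical stand-ins for the LLM call and the build or type-check step.

```python
def oracle_guided_repair(repo, seeds, edit_block, oracle, max_rounds=5):
    """Apply seed edits, then repeatedly fix only what the oracle flags.

    Returns (is_valid, number_of_repair_rounds_used).
    """
    for block_id, instruction in seeds:
        repo[block_id] = edit_block(repo[block_id], instruction)
    for rounds_used in range(max_rounds):
        errors = oracle(repo)  # list of (block_id, message); empty means valid
        if not errors:
            return True, rounds_used
        for block_id, message in errors:
            repo[block_id] = edit_block(repo[block_id], f"Fix: {message}")
    return not oracle(repo), max_rounds


# Toy stand-ins (assumptions for illustration only)
repo = {"lib": "def f(): old_api()", "app": "f(); old_api()"}
edit = lambda code, instruction: code.replace("old_api", "new_api")
check = lambda repo: [(b, "uses old_api") for b, c in repo.items() if "old_api" in c]

valid, rounds = oracle_guided_repair(repo, [("lib", "Migrate old_api")], edit, check)
```

The baseline has no notion of why a block needs to change; it learns about `app` only after the oracle reports an error there, and it cannot reach edits that the oracle never flags.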
Metrics:
- Block Metrics: Matched, Missed, and Spurious Blocks.
- Edit Metrics: Levenshtein Distance and Diff BLEU.
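
As a rough illustration of these metrics, the sketch below computes block metrics by set comparison and Levenshtein distance by the standard dynamic program. It assumes that "matched" means a block edited in both the tool output and the ground truth, "missed" means edited only in the ground truth, and "spurious" means edited only by the tool. Comparing character by character is also an assumption. Diff BLEU is not shown.

```python
def block_metrics(predicted_blocks, target_blocks):
    """Count matched, missed, and spurious blocks from two collections of block ids."""
    predicted, target = set(predicted_blocks), set(target_blocks)
    return {
        "matched": len(predicted & target),
        "missed": len(target - predicted),
        "spurious": len(predicted - target),
    }


def levenshtein(a, b):
    """Minimum number of insertions, deletions, and substitutions turning a into b."""
    prev = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, item_a in enumerate(a, 1):
        curr = [i]
        for j, item_b in enumerate(b, 1):
            curr.append(min(
                prev[j] + 1,                        # delete item_a
                curr[j - 1] + 1,                    # insert item_b
                prev[j - 1] + (item_a != item_b),   # substitute or keep
            ))
        prev = curr
    return prev[-1]


metrics = block_metrics({"a", "b", "c"}, {"b", "c", "d"})
distance = levenshtein("kitten", "sitting")
```

Because `levenshtein` works on any two sequences, the same function can compare token lists instead of strings if a token-level distance is preferred.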

RQ1: CodePlan effectively localizes and makes required changes for repository-level tasks, outperforming Oracle-Guided Repair.
RQ2: Temporal and spatial contexts are crucial for CodePlan's performance.
RQ3: Key differentiators for CodePlan include its strategic planning, context awareness, and change may-impact analysis.

Current Limitations:
- Static analysis is less effective for dynamically typed languages.
- Dynamic dependencies, which static analysis does not capture, remain a challenge.
Future Directions:
- Expand to more programming languages and artifacts.
- Improve change may-impact analysis with machine learning.
- Address dynamic dependencies in software systems.

CodePlan is a promising approach to automating complex coding tasks at the repository level, with the potential to improve developer productivity and the correctness of wide-reaching edits.
Future work aims to broaden its applicability and refine its analysis capabilities.