Training Small LLMs to Edit Code Instead of Generating It
The article explores using small language models (LLMs) for code editing instead of full code generation. It explains why models in the ~2B-parameter range struggle to generate complex code from scratch, and how a retrieve-and-edit approach can make these small models effective.
Why it matters
This approach could enable more effective use of small, efficient LLMs for code-related tasks, reducing the reliance on large, resource-intensive models.
Key Points
- Small LLMs struggle with from-scratch code generation because of the many constraints they must satisfy at once
- Small models can succeed at code transformation by editing existing implementations
- The article describes a pipeline that uses sentence embeddings and a Qdrant index to retrieve relevant code snippets
Details
The article argues that although 2B-parameter LLMs have seen a great deal of code during pretraining, they lack the capacity to reliably generate complex, syntactically valid, and idiomatic code from scratch. The model must simultaneously recall APIs, exception handling, and other edge cases, which is too many constraints for 2 billion parameters to satisfy at once. However, the author found that small models can excel at code transformation tasks. By retrieving an existing implementation and asking the model to modify it, the model only needs to insert a specific pattern it has seen before rather than generating everything from scratch. The article describes a pipeline that uses sentence embeddings and a Qdrant index to retrieve relevant code snippets, which the 3.8B-parameter Phi-3-mini model can then edit to add new functionality.
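The retrieve-and-edit idea can be sketched in a few lines. This is a minimal illustration, not the article's actual code: the toy bag-of-words similarity stands in for the real sentence-embedding model, the small in-memory list stands in for the Qdrant index, and the snippet texts and prompt wording are hypothetical placeholders.

```python
import math
from collections import Counter

def embed(text):
    # Toy bag-of-words vector; a real pipeline would use a sentence-
    # embedding model here (the article does not name a specific one).
    return Counter(text.lower().split())

def cosine(a, b):
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

# Tiny in-memory index standing in for the Qdrant collection.
# (description, implementation) pairs -- contents are illustrative.
SNIPPETS = [
    ("read a json file",
     "import json\n\ndef load(path):\n"
     "    with open(path) as f:\n        return json.load(f)"),
    ("make an http get request",
     "import urllib.request\n\ndef fetch(url):\n"
     "    return urllib.request.urlopen(url).read()"),
]

def retrieve(query):
    # Nearest-neighbor lookup by embedding similarity.
    q = embed(query)
    return max(SNIPPETS, key=lambda s: cosine(q, embed(s[0])))[1]

def build_edit_prompt(task, snippet):
    # The key move: the model edits an existing implementation
    # instead of generating everything from scratch.
    return (
        f"Here is an existing implementation:\n\n{snippet}\n\n"
        f"Modify it to: {task}\nReturn only the edited code."
    )

prompt = build_edit_prompt(
    "also accept a default value when the file is missing",
    retrieve("read a json file"),
)
```

In the article's pipeline, the prompt would then be sent to Phi-3-mini, whose output is the edited snippet rather than fully generated code.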