Controlled experiment: Giving an LLM agent access to CS papers improves automated hyperparameter search by 3.2%
A controlled experiment shows that an LLM coding agent with access to a database of 2M+ CS papers outperforms an otherwise identical agent without such access when optimizing a 7M parameter GPT-2 model on a small-scale task.
Why it matters
This experiment demonstrates the potential benefits of giving AI agents access to the latest research literature during automated hyperparameter search and experimentation.
Key Points
- Two identical runs using Karpathy's autoresearch framework, with one agent having access to a paper search database
- The paper-augmented agent found and applied techniques like AdaGC and the sqrt batch scaling rule, leading to a 3.2% improvement in validation loss
- The paper-less agent was limited to the 'standard ML playbook', while the paper-augmented agent accessed more recent and obscure techniques
- The effect may be larger on less-explored problems compared to the well-studied TinyStories dataset used in this experiment
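Of the two techniques named above, the sqrt batch scaling rule is the simpler: when the batch size changes, scale the learning rate by the square root of the batch-size ratio rather than linearly. A minimal sketch (the function name and base values are illustrative, not from the experiment):

```python
import math

def sqrt_scaled_lr(base_lr: float, base_batch: int, new_batch: int) -> float:
    """Sqrt batch scaling rule: scale the learning rate by
    sqrt(new_batch / base_batch) when the batch size changes,
    instead of the linear scaling rule's (new_batch / base_batch)."""
    return base_lr * math.sqrt(new_batch / base_batch)

# Quadrupling the batch size doubles the learning rate:
# sqrt_scaled_lr(3e-4, 64, 256) == 6e-4
```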
Details
The experiment set up two identical runs using Karpathy's autoresearch framework, with a 7M parameter GPT-2 model being optimized on the TinyStories dataset. The only difference was that one agent had access to a database of over 2 million CS papers, which it could search to retrieve relevant techniques. The paper-augmented agent found and applied techniques like AdaGC (adaptive gradient clipping) and the sqrt batch scaling rule, leading to a 3.2% improvement in validation loss over the agent without paper access. The paper-less agent was limited to the 'standard ML playbook', while the paper-augmented agent could draw on more recent and obscure techniques that may not have been encoded in its training data. The experiment was deliberately conducted on the well-explored TinyStories dataset to make the comparison harder, and the authors suggest the effect would likely be larger on less-explored problems.
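The core idea behind adaptive gradient clipping is to clip each gradient against a running estimate of its own recent norms rather than against one fixed global constant. The sketch below is a simplified illustration of that general idea, not the exact AdaGC algorithm from the paper; the class name, the EMA decay `beta`, and the tolerance `ratio` are all assumptions made for the example:

```python
import math

class AdaptiveGradClipper:
    """Illustrative adaptive gradient clipping: each parameter's gradient
    is clipped against an exponential moving average (EMA) of its own
    past norms, so clipping thresholds adapt per parameter over training.
    This is a simplified sketch of the idea, not the AdaGC paper's method."""

    def __init__(self, beta: float = 0.98, ratio: float = 1.1):
        self.beta = beta    # EMA decay for the per-parameter norm estimate
        self.ratio = ratio  # allow norms up to ratio * EMA before clipping
        self.ema = {}       # per-parameter EMA of (clipped) gradient norms

    def clip(self, name: str, grad: list[float]) -> list[float]:
        norm = math.sqrt(sum(g * g for g in grad))
        ema = self.ema.get(name, norm)  # initialize EMA on first sight
        threshold = self.ratio * ema
        if norm > threshold > 0:
            # Rescale the gradient so its norm equals the threshold.
            grad = [g * threshold / norm for g in grad]
            norm = threshold
        self.ema[name] = self.beta * ema + (1 - self.beta) * norm
        return grad
```

A sudden gradient spike (e.g. a norm ten times the running estimate) is scaled back to just above the EMA, while ordinary gradients pass through untouched.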