Reddit Machine Learning | 3h ago | Research & Papers | Products & Services

Controlled experiment: Giving an LLM agent access to CS papers improves automated hyperparameter search by 3.2%

A controlled experiment shows that an LLM coding agent with access to a database of 2M+ CS papers can outperform an otherwise identical agent without that access when optimizing a 7M-parameter GPT-2 model on a small-scale task.

Why it matters

This experiment demonstrates the potential benefits of giving AI agents access to the latest research literature during automated hyperparameter search and experimentation.

Key Points

  • Two otherwise identical runs used Karpathy's autoresearch framework; only one agent had access to a paper search database
  • The paper-augmented agent found and applied techniques such as AdaGC and the sqrt batch scaling rule, leading to a 3.2% improvement in validation loss
  • The paper-less agent was limited to the 'standard ML playbook', while the paper-augmented agent could draw on more recent and obscure techniques
  • The effect may be larger on less-explored problems than on the well-studied TinyStories dataset used in this experiment
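
The sqrt batch scaling rule named in the key points is commonly stated as: when the batch size changes by a factor k, scale the learning rate by sqrt(k). The sketch below illustrates that general rule only. The function name and the example values are hypothetical, and the summary does not say which base learning rate or batch sizes the agent actually used.

```python
import math


def sqrt_scaled_lr(base_lr: float, base_batch_size: int, new_batch_size: int) -> float:
    """Scale a learning rate by sqrt(new_batch_size / base_batch_size).

    Hypothetical helper for illustration; not taken from the experiment's code.
    """
    if base_lr <= 0 or base_batch_size <= 0 or new_batch_size <= 0:
        raise ValueError("learning rate and batch sizes must be positive")
    return base_lr * math.sqrt(new_batch_size / base_batch_size)


if __name__ == "__main__":
    # Illustrative values only: quadrupling the batch size doubles the learning rate.
    print(sqrt_scaled_lr(base_lr=1e-3, base_batch_size=64, new_batch_size=256))
```

Under this rule the learning rate grows more slowly than under linear scaling, which would multiply it by k rather than sqrt(k).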

Details

The experiment set up two otherwise identical runs using Karpathy's autoresearch framework, each optimizing a 7M-parameter GPT-2 model on the TinyStories dataset. The only difference was that one agent had access to a database of over 2 million CS papers, which it could search for relevant techniques. The paper-augmented agent found and applied techniques such as AdaGC (adaptive gradient clipping) and the sqrt batch scaling rule, leading to a 3.2% improvement in validation loss compared to the agent without paper access. The paper-less agent was limited to the 'standard ML playbook', while the paper-augmented agent could draw on more recent and obscure techniques that may not have been captured in its training. The experiment was deliberately conducted on the well-explored TinyStories dataset to make the comparison harder, and the authors suggest the effect would likely be larger on less-explored problems.
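
To make the idea of adaptive gradient clipping more concrete, here is a minimal sketch of one simple variant: the clipping threshold follows an exponential moving average (EMA) of recent gradient norms instead of being a fixed constant. This is an assumption-laden illustration, not the AdaGC paper's exact algorithm and not the agent's code. The class name, the defaults for `rel_threshold` and `beta`, and the plain-list gradient representation are all hypothetical.

```python
import math
from typing import List, Optional


class EmaGradClipper:
    """Clip a gradient vector relative to an EMA of its past norms.

    Hypothetical illustration of adaptive clipping; the update rule and
    defaults are assumptions, not the published AdaGC method.
    """

    def __init__(self, rel_threshold: float = 1.05, beta: float = 0.98) -> None:
        self.rel_threshold = rel_threshold  # allowed ratio of norm to EMA norm
        self.beta = beta                    # EMA decay factor
        self.ema_norm: Optional[float] = None

    def clip(self, grad: List[float]) -> List[float]:
        norm = math.sqrt(sum(g * g for g in grad))
        if self.ema_norm is None:
            # No history yet: accept the first gradient and seed the EMA.
            self.ema_norm = norm
            return list(grad)
        limit = self.rel_threshold * self.ema_norm
        if norm > limit and norm > 0.0:
            # Rescale so the clipped gradient has norm equal to the limit.
            scale = limit / norm
            grad = [g * scale for g in grad]
            norm = limit
        # Update the EMA with the (possibly clipped) norm.
        self.ema_norm = self.beta * self.ema_norm + (1.0 - self.beta) * norm
        return list(grad)


if __name__ == "__main__":
    clipper = EmaGradClipper(rel_threshold=2.0, beta=0.5)
    print(clipper.clip([3.0, 4.0]))    # first step seeds the EMA
    print(clipper.clip([30.0, 40.0]))  # a spike is scaled down toward the limit
```

Updating the EMA with the clipped norm rather than the raw norm is a design choice in this sketch: it keeps a single spike from inflating the threshold for later steps.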
