Reddit Machine Learning | 3h ago | Research & Papers, Products & Services

Controlled experiment: Giving an LLM agent access to CS papers improves automated hyperparameter search by 3.2%

In a controlled experiment, an LLM coding agent with access to a database of 2M+ CS papers outperformed an otherwise identical agent without such access when optimizing a 7M-parameter GPT-2 model on a small-scale task.

Why it matters

This experiment demonstrates the potential benefits of giving AI agents access to the latest research literature during automated hyperparameter search and experimentation.

Key Points

1. Two otherwise identical runs used Karpathy's autoresearch framework; only one agent had access to a paper search database.
2. The paper-augmented agent found and applied techniques such as AdaGC and the sqrt batch scaling rule, leading to a 3.2% improvement in validation loss.
3. The paper-less agent was limited to the 'standard ML playbook', while the paper-augmented agent accessed more recent and obscure techniques.
4. The effect may be larger on less-explored problems than on the well-studied TinyStories dataset used in this experiment.
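The sqrt batch scaling rule named in the key points can be illustrated with a short sketch. The source does not say which batch sizes or learning rates the agent used, so the function name `sqrt_scaled_lr` and all numeric values below are hypothetical. The sketch assumes the common form of the rule, in which the learning rate is multiplied by the square root of the batch-size ratio.

```python
import math


def sqrt_scaled_lr(base_lr: float, base_batch: int, new_batch: int) -> float:
    """Scale a learning rate by sqrt(new_batch / base_batch).

    Hypothetical helper: this is the commonly cited square-root scaling
    heuristic, not code taken from the experiment.
    """
    if base_lr <= 0 or base_batch <= 0 or new_batch <= 0:
        raise ValueError("learning rate and batch sizes must be positive")
    return base_lr * math.sqrt(new_batch / base_batch)


if __name__ == "__main__":
    # Illustrative values only; the source reports no concrete settings.
    base_lr, base_batch = 1e-3, 64
    for new_batch in (16, 64, 256):
        print(new_batch, sqrt_scaled_lr(base_lr, base_batch, new_batch))
```

Under this heuristic, quadrupling the batch size doubles the learning rate rather than quadrupling it, which is the difference from the linear scaling rule.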

Details

The experiment set up two otherwise identical runs using Karpathy's autoresearch framework, each optimizing a 7M-parameter GPT-2 model on the TinyStories dataset. The only difference was that one agent had access to a database of over 2 million CS papers, which it could search for relevant techniques. The paper-augmented agent found and applied techniques such as AdaGC (adaptive gradient clipping) and the sqrt batch scaling rule, leading to a 3.2% improvement in validation loss compared with the agent without paper access. The paper-less agent was limited to the 'standard ML playbook', while the paper-augmented agent could draw on more recent and obscure techniques that may not have been present in its training data. The experiment was deliberately run on the well-explored TinyStories dataset to make the comparison harder, and the authors suggest the effect would likely be larger on less-explored problems.
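The source expands AdaGC only as "adaptive gradient clipping" and gives no algorithmic detail, so the following is a generic illustration of the idea rather than the published AdaGC method. It assumes one plausible design: keep an exponential moving average (EMA) of the gradient norm and rescale any gradient whose norm exceeds a fixed multiple of that average. The class name `EmaGradClipper` and the `beta` and `ratio` defaults are hypothetical.

```python
import math


class EmaGradClipper:
    """Clip a gradient to a multiple of the running average of its norm.

    Generic sketch of adaptive clipping; NOT the AdaGC paper's algorithm.
    """

    def __init__(self, beta: float = 0.98, ratio: float = 1.05):
        self.beta = beta      # EMA decay for the tracked norm (assumed value)
        self.ratio = ratio    # allowed norm relative to the EMA (assumed value)
        self.ema_norm = None  # initialised from the first gradient seen

    def clip(self, grad: list[float]) -> list[float]:
        norm = math.sqrt(sum(g * g for g in grad))
        if self.ema_norm is None:
            # First step: nothing to compare against, so only record the norm.
            self.ema_norm = norm
            return list(grad)
        threshold = self.ratio * self.ema_norm
        if norm > threshold and norm > 0:
            # Rescale so the clipped gradient has norm equal to the threshold.
            scale = threshold / norm
            grad = [g * scale for g in grad]
            norm = threshold
        # Update the EMA with the (possibly clipped) norm.
        self.ema_norm = self.beta * self.ema_norm + (1 - self.beta) * norm
        return list(grad)


if __name__ == "__main__":
    clipper = EmaGradClipper(beta=0.9, ratio=2.0)
    print(clipper.clip([3.0, 4.0]))    # first call passes through unchanged
    print(clipper.clip([30.0, 40.0]))  # a 10x spike gets scaled down
```

Unlike a fixed clipping threshold, the limit here follows the recent gradient scale, so it tightens or loosens as training progresses. In a real training loop the same logic would be applied to the model's gradient tensors instead of a Python list.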

