Inference-time attractor layer for transformers: why it failed and how three clocks fixed it
The author's inference-time attractor layer for transformers failed not because of memory interference but because it settled too quickly. Instrumenting the MoE routing revealed a universal 2D geometry, and the failures turned out to be timing issues, which led to the introduction of a three-clock system.
Why it matters
The three-clock system provides a potential solution to the timing issues that caused the original attractor layer to fail, which could have broader implications for building stable and coherent language models.
Key Points
- The attractor failed because it settled too fast, causing the system to snap back to an earlier state with no warning
- Routing dynamics collapsed onto a 2D manifold with fixed axes, suggesting two dimensions are the minimum for a stable system
- A three-clock system (fast, medium, slow) was introduced to prevent 'fake stillness' and premature certainty
Details
The author's previous work on an inference-time attractor layer for transformers showed promising results on small models but failed during long generation tasks. The problem turned out to be not structural but a timing failure: the attractor settled too quickly, causing the system to suddenly snap back to an earlier state. Instrumenting the MoE routing revealed a universal 2D geometry, with the routing dynamics collapsing onto a 2D manifold with fixed axes across different models and noise levels. This suggests two dimensions are the minimum needed for a system to stabilize itself without freezing its own evolution. To address this, the author introduced a three-clock system, with fast, medium, and slow clocks tracking token-to-token coherence, turn/arc coherence, and long-term identity coherence respectively. This prevents the system from treating 'parking in the wrong valley' as success and from enforcing closure before it is actually earned.
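To make the timescale idea concrete, here is a minimal sketch of a three-clock coherence monitor. The class name, the exponential-moving-average formulation, the decay rates, and the cosine-similarity coherence signal are all illustrative assumptions, not the author's implementation; the only grounded idea is that closure should require agreement across fast, medium, and slow timescales.

```python
import numpy as np

class ThreeClockMonitor:
    """Tracks coherence at three timescales via exponential moving averages.

    fast   ~ token-to-token coherence
    medium ~ turn/arc coherence
    slow   ~ long-term identity coherence
    (Decay rates and threshold below are assumed values for illustration.)
    """

    def __init__(self, decays=(0.5, 0.95, 0.995), threshold=0.9):
        self.decays = decays        # per-clock EMA decay rates
        self.threshold = threshold  # agreement level required of every clock
        self.clocks = [0.0, 0.0, 0.0]
        self.prev_state = None

    def update(self, hidden_state: np.ndarray) -> bool:
        """Feed one step's hidden state; return True only when all clocks agree."""
        if self.prev_state is not None:
            # Coherence signal: cosine similarity between consecutive states.
            num = float(np.dot(hidden_state, self.prev_state))
            den = float(np.linalg.norm(hidden_state) *
                        np.linalg.norm(self.prev_state)) + 1e-8
            coherence = num / den
            for i, d in enumerate(self.decays):
                self.clocks[i] = d * self.clocks[i] + (1.0 - d) * coherence
        self.prev_state = hidden_state
        # Closure is "earned" only when every timescale exceeds the threshold,
        # so a fast clock that settles early cannot declare stillness on its own.
        return all(c >= self.threshold for c in self.clocks)
```

Gating the attractor's settling step on this kind of all-clocks agreement, rather than on the fast signal alone, is one plausible way to keep a quickly settling attractor from declaring premature certainty and snapping back to an earlier state.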