Dev.to Machine Learning | 6h ago | Research & Papers | Products & Services

LLM Hallucinations Are Compression Artifacts

The article argues that language model hallucinations are not bugs but compression artifacts, similar to the artifacts produced by JPEG image compression. It explains how language models are fundamentally compression algorithms and how this insight changes the approach to building trustworthy AI systems.

Why it matters

Treating language models as lossy compressors changes how we approach building trustworthy AI systems.

Key Points

  1. Language models are compression algorithms at their core, trained to predict the next token
  2. Hallucinations are compression artifacts, similar to JPEG image compression artifacts
  3. Rare facts, exact figures, and specific details are the first to be lost in compression
  4. Model size is a bitrate race, with larger models having fewer compression losses
  5. Techniques like RAG, fine-tuning, and prompt engineering can be reframed as ways to inject lossless data
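The link between prediction and compression in points 1 and 3 can be made concrete with a small sketch. It is not taken from the article: it assumes a toy add-one-smoothed unigram model (a real language model conditions on context), and the helper name `make_bit_cost` and the sample corpus are made up for illustration. Under an ideal entropy coder, a token with predicted probability p costs about -log2(p) bits, so tokens the model sees rarely are the expensive ones to store exactly.

```python
import math
from collections import Counter


def make_bit_cost(corpus_tokens):
    """Build a function giving the ideal code length, in bits, of one token.

    Assumption: an add-one-smoothed unigram model with a single extra
    slot reserved for any token never seen in the corpus.
    """
    counts = Counter(corpus_tokens)
    total = len(corpus_tokens)
    vocab_size = len(counts) + 1  # +1 slot for unseen tokens

    def bit_cost(token):
        probability = (counts[token] + 1) / (total + vocab_size)
        return -math.log2(probability)  # Shannon code length

    return bit_cost


# Made-up corpus: one very common sentence and one rare fact.
corpus = ("the cat sat on the mat . " * 50
          + "the treaty was signed in 1648 . ").split()
bit_cost = make_bit_cost(corpus)

for token in ["the", "1648", "1649"]:
    print(f"{token!r}: {bit_cost(token):.2f} bits")
```

In this toy setup the common word is cheap, the exact year seen once is expensive, and a year never seen is more expensive still. A compressor with a fixed bit budget would save the most by giving up on exactly those rare tokens first, which is the pattern the key points describe.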

Details

The article starts from Claude Shannon's insight that data compression and next-symbol prediction are mathematically equivalent: a model that predicts the next symbol well can encode the same data in fewer bits. On this view, language models like GPT, Claude, and Gemini are fundamentally compression algorithms, where the training data is the original file and the model weights are the compressed one. Just as JPEG compression discards fine detail and leaves plausible-looking artifacts at boundaries, a language model loses fine details and reconstructs plausible-looking substitutes in their place.

The article argues that this compression artifact view explains why language models excel at tasks like code generation (highly compressible) and struggle with math (exact numbers are hard to compress). Increasing model size is akin to raising JPEG quality: more bits available means fewer losses. Techniques like Retrieval-Augmented Generation (RAG), fine-tuning, and prompt engineering can be reframed as ways to inject lossless data into the language model's context, bypassing the compression process.
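As an illustration of the "inject lossless data" framing for RAG, here is a minimal sketch under stated assumptions. It is not the article's implementation: the `retrieve` and `build_prompt` helpers are hypothetical, the documents are invented, the ranking is plain word overlap rather than embedding search, and no model is actually called. The point is only that the exact figure reaches the prompt verbatim instead of being reconstructed from the weights.

```python
import re


def tokenize(text):
    """Lowercase a string and split it into alphanumeric word tokens."""
    return set(re.findall(r"\w+", text.lower()))


def retrieve(query, documents, k=1):
    """Return the k documents sharing the most word tokens with the query.

    Assumption: word overlap stands in for a real retriever.
    """
    query_tokens = tokenize(query)
    ranked = sorted(
        documents,
        key=lambda doc: len(query_tokens & tokenize(doc)),
        reverse=True,
    )
    return ranked[:k]


def build_prompt(query, documents, k=1):
    """Place the retrieved text, uncompressed, ahead of the question."""
    context = "\n".join(retrieve(query, documents, k))
    return (
        "Answer using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {query}"
    )


# Invented example documents; the invoice figure is made up.
DOCS = [
    "Invoice 1047 total: 12,381.56 EUR, due 2024-03-15.",
    "Office hours are 9 to 5 on weekdays.",
]

prompt = build_prompt("What is the total of invoice 1047?", DOCS)
print(prompt)
```

The resulting prompt would then be sent to whichever model is in use. Because the figure sits in the context as literal text, the model only has to copy it, not recall it.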
