LLM Hallucinations Are Compression Artifacts
The article argues that language model hallucinations are not bugs but compression artifacts, analogous to the artifacts produced by JPEG image compression. It explains how language models are fundamentally compression algorithms, and how this insight changes the approach to building trustworthy AI systems.
Why it matters
Treating hallucinations as an expected cost of lossy compression, rather than a defect to be patched away, changes how trustworthy AI systems should be designed and evaluated.
Key Points
- Language models are compression algorithms at their core, trained to predict the next token
- Hallucinations are compression artifacts, similar to JPEG image compression artifacts
- Rare facts, exact figures, and specific details are the first to be lost in compression
- Model size is a bitrate race, with larger models having fewer compression losses
- Techniques like RAG, fine-tuning, and prompt engineering can be reframed as ways to inject lossless data
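The first point rests on Claude Shannon's result that good next-symbol prediction and good compression are the same problem: a symbol with predicted probability p costs about -log2(p) bits under an optimal code. The sketch below illustrates this with a toy character-level bigram model (the model, corpus, and smoothing scheme are illustrative choices, not anything from the article):

```python
import math
from collections import Counter, defaultdict

def code_length_bits(text, predict):
    """Total bits to encode `text` when each character costs
    -log2(p) bits under the predictor's distribution (Shannon)."""
    return sum(-math.log2(predict(text[:i], ch)) for i, ch in enumerate(text))

def uniform(context, ch):
    # Baseline: no prediction at all, every byte equally likely (8 bits each).
    return 1 / 256

def make_bigram(corpus):
    # "Train" a tiny next-character model: counts of ch given the previous ch.
    counts = defaultdict(Counter)
    for prev, ch in zip(corpus, corpus[1:]):
        counts[prev][ch] += 1
    def predict(context, ch):
        prev = context[-1] if context else ""
        c = counts[prev]
        # Add-one smoothing over 256 symbols so no probability is zero.
        return (c[ch] + 1) / (sum(c.values()) + 256)
    return predict

text = "the cat sat on the mat the cat sat on the mat"
bigram = make_bigram(text)
# The better the next-symbol predictor, the fewer bits are needed:
assert code_length_bits(text, bigram) < code_length_bits(text, uniform)
```

The same inequality, scaled up, is the sense in which a language model's weights "compress" its training data: lower average prediction loss is literally fewer bits per token.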
Details
The article explains that Claude Shannon's insight about the mathematical equivalence of data compression and next-symbol prediction is the key to understanding language model hallucinations. Language models like GPT, Claude, and Gemini are fundamentally compression algorithms: the model weights are the compressed file, and the training data is the original.

Just like JPEG compression, language models lose fine details and reconstruct plausible-looking artifacts at the boundaries. This compression-artifact view explains why language models excel at tasks like code generation (highly compressible) and struggle with math (exact numbers are hard to compress). Increasing model size is akin to increasing JPEG quality: more bits available means fewer losses.

Techniques like Retrieval-Augmented Generation (RAG), fine-tuning, and prompt engineering can be reframed as ways to inject lossless data into the language model's context, bypassing the compression process.
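The RAG reframing can be sketched concretely: retrieve the relevant source text and paste it verbatim into the context, so exact figures reach the model uncompressed instead of being reconstructed from lossy weights. The retriever below is a deliberately naive word-overlap ranker standing in for a real embedding-based one; the function names and prompt template are illustrative, not from the article:

```python
def retrieve(query, documents, k=1):
    """Toy retriever: rank documents by word overlap with the query.
    (A real system would use embeddings; overlap is a stand-in.)"""
    q = set(query.lower().split())
    scored = sorted(documents,
                    key=lambda d: len(q & set(d.lower().split())),
                    reverse=True)
    return scored[:k]

def build_prompt(query, documents):
    # Inject the retrieved text verbatim (losslessly) into the context,
    # so the model can copy exact figures rather than reconstruct them
    # from its compressed weights.
    context = "\n".join(retrieve(query, documents))
    return (f"Context:\n{context}\n\n"
            f"Question: {query}\nAnswer using only the context.")

docs = [
    "The Eiffel Tower is 330 metres tall.",
    "JPEG uses a discrete cosine transform.",
]
prompt = build_prompt("How tall is the Eiffel Tower?", docs)
assert "330 metres" in prompt  # the exact figure survives, uncompressed
```

Fine-tuning and prompt engineering fit the same frame: both spend extra "bits" (gradient updates or context tokens) to make specific facts cheap for the model to reproduce.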