Gemini Flash Hallucinates 91% of the Time When Unsure

The Gemini 3 Flash model has a 91% hallucination rate on the Artificial Analysis Omniscience Hallucination Rate benchmark, indicating it frequently provides incorrect answers when it should have refused or admitted to not knowing.

Why it matters

Hallucination rate is a critical metric for AI models, especially in applications that require accurate and reliable output.

Key Points

  • Gemini 3 Flash has a 91% hallucination rate on the AA-Omniscience Hallucination Rate benchmark
  • Hallucination rate measures how often a model answers incorrectly when it should have refused or admitted to not knowing
  • Reported hallucination rates for other models, such as Claude and GPT, range from 26% to 93%
  • Hallucination rate may be an important factor for applications requiring precise, reliable output, such as coding

Details

The article compares the performance of various AI models on the Artificial Analysis Omniscience (AA-Omniscience) Hallucination Rate benchmark. Gemini 3 Flash stands out with a 91% hallucination rate, meaning it frequently gives an incorrect answer when it should have refused or admitted to not knowing. Reported rates for other models, such as Claude and GPT, range from 26% to 93%. This metric may matter most for applications that require precise, reliable output, such as coding, where a confidently wrong answer can cause significant problems. The article suggests that the lower hallucination rates of Anthropic models like Claude may be a key factor in their strong performance on coding tasks.
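
To make the metric concrete, here is a minimal sketch of one reading of the definition above: among the questions a model did not answer correctly, the share where it gave a wrong answer instead of abstaining. The function name, the outcome labels, and the sample counts are hypothetical and chosen only for illustration. They are not benchmark data, and the benchmark's exact scoring may differ.

```python
from collections import Counter


def hallucination_rate(outcomes):
    """Share of non-correct responses that were wrong answers rather than abstentions.

    `outcomes` is a list of labels: "correct", "incorrect", or "abstained".
    These labels are an assumption made for this sketch.
    """
    counts = Counter(outcomes)
    incorrect = counts["incorrect"]
    abstained = counts["abstained"]
    not_correct = incorrect + abstained
    if not_correct == 0:
        # The model answered everything correctly, so there was nothing to hallucinate on.
        return 0.0
    return incorrect / not_correct


# Made-up outcomes: 50 correct answers, 91 wrong answers, 9 abstentions.
sample = ["correct"] * 50 + ["incorrect"] * 91 + ["abstained"] * 9
print(f"{hallucination_rate(sample):.0%}")  # prints 91%
```

Under this reading, correct answers do not affect the rate at all. A model can be accurate on most questions and still score poorly if it almost never abstains on the ones it gets wrong.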

AI Curator - Daily AI News Curation