Cohere Releases Open-Source ASR Model Transcribe, Outperforming OpenAI Whisper
Cohere, a Canadian AI company, has released an open-source automatic speech recognition (ASR) model called Transcribe. Cohere says the 2-billion-parameter model holds the top spot on the Hugging Face Open ASR Leaderboard with a 5.42% average word error rate (WER), ahead of competitors such as OpenAI's Whisper.
Why it matters
Transcribe's claimed benchmark lead and open-source availability pose a direct challenge to Whisper, the ASR model from OpenAI, itself a dominant force in foundational AI models.
Key Points
- Transcribe is a high-performance, production-ready ASR model with state-of-the-art accuracy and superior inference speed
- It supports 14 languages and is available for download on Hugging Face and through Cohere's API and Model Vault platform
- Transcribe's 5.42% average WER places it ahead of notable models like OpenAI Whisper, Alibaba Qwen3-ASR, and ElevenLabs Scribe
Details
Cohere's Transcribe model is a significant challenger to OpenAI's Whisper, which has been the de facto standard for open-source ASR since its release. Transcribe's claimed 5.42% average word error rate and best-in-class throughput would represent a tangible improvement in both accuracy and speed over competitors. The model is available under the permissive Apache 2.0 license, which allows commercial use, modification, and redistribution and so simplifies integration and deployment. The technical details of the model architecture and training data are not yet disclosed; for now the emphasis is on reproducible benchmark results and immediate availability. Cohere plans to integrate Transcribe into its North AI agent platform, part of a broader trend of bundling core AI capabilities like speech recognition into cohesive platforms.
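For readers unfamiliar with the metric behind the 5.42% figure, word error rate is the number of word-level substitutions, deletions, and insertions needed to turn a model's transcript into the reference transcript, divided by the number of words in the reference. The sketch below is a minimal illustration of that definition, not the leaderboard's evaluation code: the function name `word_error_rate` and the sample sentences are hypothetical, and it assumes simple whitespace tokenization, whereas real evaluations typically normalize casing and punctuation first.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the reference length.

    Illustrative only: tokenizes on whitespace and applies no text
    normalization (an assumption made for this sketch).
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution_cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,                      # deletion (word missing from hypothesis)
                curr[j - 1] + 1,                  # insertion (extra word in hypothesis)
                prev[j - 1] + substitution_cost,  # match or substitution
            )
        prev = curr
    return prev[-1] / len(ref)


# One substituted word out of six reference words.
wer = word_error_rate("the cat sat on the mat", "the cat sat on a mat")
print(f"{wer:.2%}")  # prints 16.67%
```

A 5.42% average WER therefore corresponds to roughly five or six word errors per hundred reference words, averaged over the benchmark's test sets.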