Alibaba's Qwen3.5-Omni Learns to Write Code from Spoken Instructions and Video
Alibaba has released Qwen3.5-Omni, an omnimodal AI model that can process text, images, audio, and video. The company claims it outperforms Gemini 3.1 Pro on audio tasks, and reports that the model has unexpectedly learned to write code from spoken instructions and video input without being trained specifically for that capability.
Why it matters
Qwen3.5-Omni's ability to write code from spoken instructions and video showcases the rapid progress in multimodal AI and its potential applications in software engineering and other domains.
Key Points
- Qwen3.5-Omni is an omnimodal AI model from Alibaba that can process multiple data modalities
- Alibaba claims it outperforms Gemini 3.1 Pro on audio tasks
- Qwen3.5-Omni has learned to write code from spoken instructions and video input without being trained specifically for that capability
Details
Alibaba's new Qwen3.5-Omni AI model can process text, images, audio, and video. According to Alibaba, it beats Gemini 3.1 Pro on audio-related tasks. More surprisingly, Qwen3.5-Omni has unexpectedly learned to write code from spoken instructions and video input, despite never being explicitly trained for this capability. This emergent behavior demonstrates the model's multimodal learning abilities and its potential to assist with tasks like software development through natural language and visual inputs.