Hacker News | 5h ago | Research & Papers, Products & Services

Three new Kitten TTS models – smallest less than 25MB

Kitten TTS has released three new open-source text-to-speech models with varying sizes and quality levels, aimed at on-device applications.

Why it matters

Tiny, high-quality text-to-speech models are crucial for enabling on-device AI applications, especially in low-power environments.

Key Points

1. Three new Kitten TTS models with 80M, 40M, and 14M parameters
2. The 14M variant is the smallest at under 25MB but has high expressivity
3. Models support English text-to-speech in 8 voices (4 male, 4 female)
4. Models are quantized and use ONNX for runtime, designed to run on low-end devices without GPUs
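
To put the parameter counts and the sub-25MB figure in context, here is a rough back-of-the-envelope estimate of weight storage per model. It assumes the file size is dominated by the weights and ignores ONNX graph metadata, voice embeddings, and any other bundled assets, so real downloads will differ. The function and table names are illustrative and not part of Kitten TTS.

```python
# Rough estimate of raw weight storage for each model size.
# Assumption: one stored value per parameter, no compression or metadata.

# Bytes used per weight in each storage format.
BYTES_PER_PARAM = {"fp32": 4, "fp16": 2, "int8": 1}

# Parameter counts as stated in the release summary.
MODELS = {"80M": 80_000_000, "40M": 40_000_000, "14M": 14_000_000}


def weight_size_mb(num_params: int, dtype: str) -> float:
    """Approximate size of the raw weights in megabytes (10**6 bytes)."""
    return num_params * BYTES_PER_PARAM[dtype] / 1_000_000


def size_table() -> dict:
    """Estimated weight size in MB for every model and storage format."""
    return {
        name: {dtype: weight_size_mb(n, dtype) for dtype in BYTES_PER_PARAM}
        for name, n in MODELS.items()
    }


if __name__ == "__main__":
    for name, sizes in size_table().items():
        row = ", ".join(f"{dtype}: {mb:.0f} MB" for dtype, mb in sizes.items())
        print(f"{name} -> {row}")
```

Under these assumptions, 14M parameters take about 14 MB at int8 and about 28 MB at fp16, so a download under 25MB is plausible for an int8 build. The summary does not say which variant the 25MB figure refers to.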

Details

Kitten TTS is an open-source project focused on tiny, expressive text-to-speech models for on-device applications. The latest release includes three new models of different sizes and quality levels. The largest, at 80M parameters, offers the highest quality, while the 14M variant is described as reaching a new state of the art in expressivity among similar-sized models despite being under 25MB. All models support English text-to-speech in 8 voices (4 male, 4 female) and are designed to run on low-power devices such as Raspberry Pis, smartphones, and wearables without a GPU. The models are quantized to int8 and fp16 and run on the ONNX runtime. The release aims to narrow the gap between on-device and cloud-based text-to-speech, making it easier to build production-ready voice agents and apps that run entirely on the device.
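
For readers unfamiliar with int8 quantization, the sketch below shows the general idea using symmetric per-tensor quantization, a common textbook scheme. The summary does not say which scheme Kitten TTS uses, so this is an illustration of the concept and not the project's method. The function names are hypothetical.

```python
# Symmetric per-tensor int8 quantization, shown on a plain list of floats.
# Assumption: a generic scheme for illustration, not Kitten TTS's own.


def quantize_int8(weights):
    """Map floats to integers in [-127, 127] using one shared scale."""
    max_abs = max(abs(w) for w in weights)
    # Guard against an all-zero tensor, where any scale works.
    scale = max_abs / 127 if max_abs > 0 else 1.0
    quantized = [max(-127, min(127, round(w / scale))) for w in weights]
    return quantized, scale


def dequantize_int8(quantized, scale):
    """Recover approximate floats from the int8 values and the scale."""
    return [q * scale for q in quantized]


if __name__ == "__main__":
    weights = [0.5, -1.0, 0.25, 0.0]
    q, scale = quantize_int8(weights)
    restored = dequantize_int8(q, scale)
    worst = max(abs(a - b) for a, b in zip(weights, restored))
    print("int8 values:", q)
    print("worst-case round-trip error:", worst)
```

Each weight is stored in one byte instead of four, at the cost of a rounding error of at most half the scale per value. That trade is what lets a model shrink to a fraction of its fp32 size.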
