The $12,000 AI Independence Box
George Hotz's startup Tiny Corp is shipping tinybox, a $12,000 computer that runs large language models offline, eliminating costly API calls and keeping data on the machine.
Why it matters
The tinybox offers an alternative to cloud APIs for large language model inference, promising lower costs and more operational control for AI-powered applications.
Key Points
- Tinybox offers 120B-parameter model capabilities with no API costs, rate limits, or data leaving the machine
- The hardware provides high-performance GPUs and compute for running any PyTorch-based model
- Tinybox outperformed much more expensive systems in MLPerf benchmarks, validating its performance claims
- Owning the hardware eliminates ongoing API costs and gives more control over models and data privacy
Details
Tiny Corp's tinybox is a $12,000 computer that can run large language models like GPT-3 offline, without relying on cloud APIs and their ongoing costs and limitations. The hardware includes 4 AMD 9070XT GPUs, providing 778 TFLOPS of FP16 compute and 64GB of GPU RAM. A $65,000 version upgrades to RTX PRO 6000 GPUs with 3,086 TFLOPS and 384GB of RAM.

Both machines run Ubuntu 24.04 and can run any PyTorch-based model, including research models, with no rate limits and no data leaving the machine. Compared with cloud API access, that means lower per-inference costs, unlimited throughput, local latency, and unconstrained model choice.

Tiny Corp has also benchmarked the tinybox against much more expensive systems in MLPerf, backing up its performance claims. For AI builders spending hundreds of dollars per month on API calls, the tinybox can pay for itself in 2-3 years of avoided costs, while also providing more flexibility and control.
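The payback claim above is simple arithmetic. A minimal sketch, with illustrative monthly spend figures that are assumptions rather than numbers from Tiny Corp:

```python
def payback_months(hardware_cost: float, monthly_api_spend: float) -> float:
    """Months of avoided API spend needed to recoup the hardware cost."""
    if monthly_api_spend <= 0:
        raise ValueError("monthly API spend must be positive")
    return hardware_cost / monthly_api_spend

TINYBOX_COST = 12_000.0

# Hypothetical spend levels: at $400/month the box breaks even in 30 months
# (2.5 years); at ~$333/month it takes about 36 months (3 years).
print(payback_months(TINYBOX_COST, 400.0))
print(payback_months(TINYBOX_COST, 333.0))
```

This ignores electricity and depreciation, so the real break-even point will be somewhat longer, but it shows why the 2-3 year figure corresponds to spending a few hundred dollars a month on API calls.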