Benchmarking LLMs for an AI Coaching Feature in the Browser

The author built an AI-powered coaching feature for a combat log analyzer tool, running entirely in the browser using WebLLM, an in-browser LLM inference engine. This article covers the methodology, benchmarks, and implementation decisions.

Why it matters

This article demonstrates how to leverage in-browser LLM inference to build AI-powered features with strict output requirements, without the need for a server-side backend.

Key Points

  • Developed an AI coaching feature for a combat log analyzer tool, running client-side in the browser
  • Evaluated WebLLM, an in-browser LLM inference engine, as a replacement for a local LLM provider
  • Benchmarked 3 LLMs on a strict output schema, measuring quality across 6 signals
  • Leveraged WebLLM's grammar-constrained generation and OPFS caching to optimize performance

Details

The author is building Holocron, a browser-based combat log analyzer for the Star Wars: The Old Republic video game. The core feature is an AI-powered coaching layer that takes structured combat stats as input and generates plain-language guidance as output, all running client-side in the browser. To avoid the friction of a local LLM setup, the author evaluated WebLLM, an in-browser LLM inference engine that compiles models into a WebGPU-accelerated WASM runtime. WebLLM's key advantages are grammar-constrained generation, which enforces the output schema at the token sampling level, and OPFS caching, which reduces load times for repeat users.

The author benchmarked 3 LLMs on a strict 500-token output schema, measuring quality across 6 signals: narrative depth, schema compliance, template parroting, ability name accuracy, finding duplication, and actionability. The results informed implementation decisions to optimize performance and reliability for the production coaching feature.
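
To make two of the quality signals above concrete, here is a minimal TypeScript sketch of how schema compliance and finding duplication could be scored on a raw model response. It does not reproduce the author's benchmark code: the `CoachingReport` shape, its field names, and both helper functions are hypothetical and chosen only for illustration, since the summary does not publish the real schema.

```typescript
// Hypothetical output shape; the real Holocron schema is not given in the source.
interface Finding {
  title: string;
  advice: string;
}

interface CoachingReport {
  summary: string;
  findings: Finding[];
}

// Schema compliance signal (assumed form): returns the parsed report if the
// raw model output is valid JSON in the expected shape, otherwise null.
function parseReport(raw: string): CoachingReport | null {
  let data: unknown;
  try {
    data = JSON.parse(raw);
  } catch {
    return null; // not JSON at all
  }
  if (typeof data !== "object" || data === null) return null;
  const obj = data as Record<string, unknown>;
  if (typeof obj.summary !== "string" || !Array.isArray(obj.findings)) return null;
  const findingsOk = obj.findings.every((f) => {
    if (typeof f !== "object" || f === null) return false;
    const rec = f as Record<string, unknown>;
    return typeof rec.title === "string" && typeof rec.advice === "string";
  });
  return findingsOk ? (data as CoachingReport) : null;
}

// Finding duplication signal (assumed form): fraction of findings whose
// normalized title repeats an earlier finding in the same report.
function duplicationRate(report: CoachingReport): number {
  if (report.findings.length === 0) return 0;
  const seen = new Set<string>();
  let dupes = 0;
  for (const f of report.findings) {
    const key = f.title.trim().toLowerCase();
    if (seen.has(key)) {
      dupes++;
    } else {
      seen.add(key);
    }
  }
  return dupes / report.findings.length;
}
```

With grammar-constrained generation the compliance check should rarely fail, so in a benchmark like the one described it would mostly serve as a sanity check, while the duplication rate captures a quality problem that a schema alone cannot prevent.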
