Running LLM Classification After the Response: Next.js after() + OpenRouter at $0.0002 per Call
The article discusses how the author implemented an asynchronous LLM-based spam classification pipeline for a form submission service, while keeping the LLM call off the critical path and ensuring cost-effectiveness.
Why it matters
This approach shows how to effectively leverage LLMs in a production system while maintaining performance, reliability, and cost-efficiency.
Key Points
- 1Implemented a non-blocking LLM classification pipeline using Next.js's after() API
- 2Ensured form submission latency did not change and LLM failures did not break the submission
- 3Kept the cost per classification under a cent to offer the feature for free
- 4Prevented prompt injection attacks by the respondent to hijack the classifier
Details
The author has been building FORMLOVA, a chat-first form service where users interact with the product using MCP clients like Claude or ChatGPT. They recently shipped a sales-email auto-classification feature, where an LLM classifies every form response into 'legitimate', 'sales', or 'suspicious' labels. The key constraints were: 1) The form submission latency must not change, 2) Any LLM failure must not break the submission, 3) Cost per classification must stay under a cent, and 4) Prompt injection via the respondent's input must not hijack the classifier. The article explains how they solved these challenges by implementing an asynchronous LLM classification pipeline using Next.js's after() API, which defers the LLM call until after the response is flushed to the user. The article includes code snippets from the production codebase to demonstrate the implementation.
No comments yet
Be the first to comment