Running LLM Classification After the Response: Next.js after() + OpenRouter at $0.0002 per Call

The article discusses how the author implemented an asynchronous LLM-based spam classification pipeline for a form submission service, while keeping the LLM call off the critical path and ensuring cost-effectiveness.

đź’ˇ

Why it matters

This approach shows how to effectively leverage LLMs in a production system while maintaining performance, reliability, and cost-efficiency.

Key Points

  • 1Implemented a non-blocking LLM classification pipeline using Next.js's after() API
  • 2Ensured form submission latency did not change and LLM failures did not break the submission
  • 3Kept the cost per classification under a cent to offer the feature for free
  • 4Prevented prompt injection attacks by the respondent to hijack the classifier

Details

The author has been building FORMLOVA, a chat-first form service where users interact with the product using MCP clients like Claude or ChatGPT. They recently shipped a sales-email auto-classification feature, where an LLM classifies every form response into 'legitimate', 'sales', or 'suspicious' labels. The key constraints were: 1) The form submission latency must not change, 2) Any LLM failure must not break the submission, 3) Cost per classification must stay under a cent, and 4) Prompt injection via the respondent's input must not hijack the classifier. The article explains how they solved these challenges by implementing an asynchronous LLM classification pipeline using Next.js's after() API, which defers the LLM call until after the response is flushed to the user. The article includes code snippets from the production codebase to demonstrate the implementation.

Like
Save
Read original
Cached
Comments
?

No comments yet

Be the first to comment

AI Curator - Daily AI News Curation

AI Curator

Your AI news assistant

Ask me anything about AI

I can help you understand AI news, trends, and technologies