Granular Token Cost Attribution Missing in Claude Code: Implementing Per-Tool-Call Tracking for Optimization and Debugging
The article discusses Claude Code's lack of granular per-tool-call token cost attribution, which makes it hard for developers to optimize costs and debug inefficiencies. It introduces CAT, an open-source CLI tool that closes this gap by correlating hook events with statusline snapshots to enable precise token cost attribution.
Why it matters
CAT's granular token cost attribution enables targeted optimization, transparency, and scalability for AI workflows, reducing costs by up to 30% in pilot tests.
Key Points
- Claude Code does not provide granular token cost attribution per tool call
- This makes it difficult to optimize costs, debug inefficiencies, and scale operations effectively
- CAT, an open-source CLI tool, solves this by correlating hook events and statusline snapshots
- CAT uses a FastAPI + Uvicorn async collector, a SQLite + aiosqlite database, and a delta engine to match tool calls with token counts
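To make the storage layer concrete, here is a minimal sketch of how hook events and statusline snapshots might be persisted for later correlation. The table and column names are assumptions for illustration, not CAT's actual schema, and the stdlib `sqlite3` module stands in for the async `aiosqlite` driver so the example is self-contained:

```python
import sqlite3

# Hypothetical schema: table and column names are illustrative assumptions,
# not CAT's actual layout. CAT uses aiosqlite for async access; sqlite3 is
# used here so the sketch runs without extra dependencies.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE hook_events (
    id INTEGER PRIMARY KEY,
    ts REAL NOT NULL,            -- Unix timestamp of the hook event
    tool_name TEXT NOT NULL,     -- e.g. 'Read', 'Bash', 'Edit'
    payload TEXT                 -- raw event JSON
);
CREATE TABLE statusline_snapshots (
    id INTEGER PRIMARY KEY,
    ts REAL NOT NULL,            -- Unix timestamp of the snapshot
    total_tokens INTEGER NOT NULL -- cumulative session token count
);
-- Both streams are queried by time, so index on timestamp.
CREATE INDEX idx_events_ts ON hook_events(ts);
CREATE INDEX idx_snapshots_ts ON statusline_snapshots(ts);
""")
conn.execute(
    "INSERT INTO hook_events (ts, tool_name, payload) VALUES (?, ?, ?)",
    (1700000000.0, "Read", "{}"),
)
conn.commit()
rows = conn.execute("SELECT tool_name FROM hook_events").fetchall()
print(rows)  # [('Read',)]
```

Indexing both tables on the timestamp column is what makes the later timestamp-alignment step cheap.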
Details
The article explains that Claude Code's architecture separates token consumption data from tool call logs, making it challenging to correlate the two data streams. This disconnect leads to cost overruns and debugging bottlenecks. Existing solutions, such as parsing Claude Code logs, are insufficient because they don't bridge the gap between tool calls and token counts.

CAT addresses this with three components: a FastAPI + Uvicorn async collector that receives hook events in real time, a SQLite + aiosqlite database that stores the data efficiently, and a delta engine that matches tool calls with token counts by aligning timestamps.

On top of this, CAT employs Welford's online algorithm to compute rolling baseline statistics per task type and uses Z-score anomaly detection to flag resource-intensive operations. An optional Haiku LLM classifier provides root-cause analysis for anomalies. While CAT is robust, its effectiveness depends on accurate timestamp alignment between hook events and statusline snapshots, and on the assumption that token consumption is roughly normally distributed.
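The timestamp-alignment idea behind the delta engine can be sketched as follows: for each tool call, find the cumulative token counts from the snapshots bracketing the call and attribute the difference to that call. Function and field names here are illustrative assumptions, not CAT's actual API:

```python
import bisect

def attribute_tokens(tool_calls, snapshots):
    """Attribute token deltas to tool calls by timestamp alignment.

    tool_calls: list of (ts, tool_name) tuples, sorted by ts.
    snapshots:  list of (ts, cumulative_tokens) tuples, sorted by ts.
    Returns a list of (tool_name, tokens_attributed); None when no
    bracketing snapshots exist. Illustrative sketch only -- CAT's real
    delta engine may resolve overlaps and gaps differently.
    """
    snap_ts = [ts for ts, _ in snapshots]
    results = []
    for call_ts, name in tool_calls:
        # Index of the first snapshot strictly after the call.
        i = bisect.bisect_right(snap_ts, call_ts)
        if i == 0 or i == len(snapshots):
            results.append((name, None))  # call outside the snapshot window
            continue
        before = snapshots[i - 1][1]  # cumulative count before the call
        after = snapshots[i][1]       # cumulative count after the call
        results.append((name, after - before))
    return results

snapshots = [(0.0, 1000), (10.0, 1600), (20.0, 1900)]
calls = [(5.0, "Read"), (15.0, "Bash")]
print(attribute_tokens(calls, snapshots))
# [('Read', 600), ('Bash', 300)]
```

This also shows why the article's caveat matters: if a tool call's timestamp drifts past a snapshot boundary, its delta is attributed to the wrong interval.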
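Welford's online algorithm and Z-score flagging are both standard techniques; a minimal sketch of how they could combine for per-task-type baselines follows. Class names and the threshold value are assumptions, not CAT's actual interface:

```python
import math

class RollingBaseline:
    """Welford's online algorithm: single-pass, numerically stable
    rolling mean and variance. One instance per task type."""
    def __init__(self):
        self.n = 0
        self.mean = 0.0
        self.m2 = 0.0  # running sum of squared deviations from the mean

    def update(self, x):
        self.n += 1
        delta = x - self.mean
        self.mean += delta / self.n
        self.m2 += delta * (x - self.mean)

    def std(self):
        # Sample standard deviation; undefined for fewer than 2 points.
        return math.sqrt(self.m2 / (self.n - 1)) if self.n > 1 else 0.0

def is_anomaly(baseline, x, z_threshold=3.0):
    """Flag x when its Z-score exceeds the threshold. Assumes roughly
    normal token consumption, the limitation the article notes."""
    sd = baseline.std()
    if sd == 0.0:
        return False  # not enough history to judge
    return abs(x - baseline.mean) / sd > z_threshold

b = RollingBaseline()
for tokens in [500, 520, 480, 510, 495]:
    b.update(tokens)
print(round(b.mean, 1), is_anomaly(b, 505), is_anomaly(b, 2000))
# 501.0 False True
```

Welford's update never stores the raw history, so baselines stay O(1) in memory no matter how many tool calls a session accumulates.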