Granular Token Cost Attribution Missing in Claude Code: Implementing Per-Tool-Call Tracking for Optimization and Debugging
The article discusses Claude Code's lack of granular per-tool-call token cost attribution, which makes it hard for developers to optimize costs and debug inefficiencies. It introduces CAT, an open-source CLI tool that closes this gap by correlating hook events with statusline snapshots to enable precise token cost attribution.
Why it matters
CAT's granular token cost attribution enables targeted optimization, transparency, and scalability for AI workflows, reducing costs by up to 30% in pilot tests.
Key Points
- Claude Code does not provide granular token cost attribution per tool call
- This makes it difficult to optimize costs, debug inefficiencies, and scale operations effectively
- CAT, an open-source CLI tool, solves this by correlating hook events and statusline snapshots
- CAT uses a FastAPI + Uvicorn async collector, a SQLite + aiosqlite database, and a delta engine to match tool calls with token counts
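To make the storage layer concrete, here is a minimal sketch of how hook events and statusline snapshots might be persisted for later correlation. The table and column names are assumptions for illustration, not CAT's actual schema, and the stdlib `sqlite3` module stands in for the async `aiosqlite` driver so the example is self-contained:

```python
import sqlite3

# Hypothetical schema: table and column names are illustrative assumptions,
# not CAT's actual layout. CAT uses aiosqlite for async access; sqlite3 is
# used here so the sketch runs without extra dependencies.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE hook_events (
    id INTEGER PRIMARY KEY,
    ts REAL NOT NULL,            -- Unix timestamp of the hook event
    tool_name TEXT NOT NULL,     -- e.g. 'Read', 'Bash', 'Edit'
    payload TEXT                 -- raw event JSON
);
CREATE TABLE statusline_snapshots (
    id INTEGER PRIMARY KEY,
    ts REAL NOT NULL,            -- Unix timestamp of the snapshot
    total_tokens INTEGER NOT NULL -- cumulative session token count
);
-- Both streams are queried by time, so index on timestamp.
CREATE INDEX idx_events_ts ON hook_events(ts);
CREATE INDEX idx_snapshots_ts ON statusline_snapshots(ts);
""")
conn.execute(
    "INSERT INTO hook_events (ts, tool_name, payload) VALUES (?, ?, ?)",
    (1700000000.0, "Read", "{}"),
)
conn.commit()
rows = conn.execute("SELECT tool_name FROM hook_events").fetchall()
print(rows)  # [('Read',)]
```

Indexing both tables on the timestamp column is what makes the later timestamp-alignment step cheap.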
Details
The article explains that Claude Code's architecture separates token consumption data from tool call logs, making it challenging to correlate the two data streams. This disconnect leads to cost overruns and debugging bottlenecks. Existing solutions, such as parsing Claude Code logs, are insufficient because they don't bridge the gap between tool calls and token counts.

CAT addresses this with three components: a FastAPI + Uvicorn async collector that receives hook events in real time, a SQLite + aiosqlite database that stores the data efficiently, and a delta engine that matches tool calls with token counts by aligning timestamps.

On top of this, CAT employs Welford's online algorithm to compute rolling baseline statistics per task type and uses Z-score anomaly detection to flag resource-intensive operations. An optional Haiku LLM classifier provides root-cause analysis for anomalies. While CAT is robust, its effectiveness depends on accurate timestamp alignment between hook events and statusline snapshots, and on the assumption that token consumption is roughly normally distributed.
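The timestamp-alignment idea behind the delta engine can be sketched as follows: for each tool call, find the cumulative token counts from the snapshots bracketing the call and attribute the difference to that call. Function and field names here are illustrative assumptions, not CAT's actual API:

```python
import bisect

def attribute_tokens(tool_calls, snapshots):
    """Attribute token deltas to tool calls by timestamp alignment.

    tool_calls: list of (ts, tool_name) tuples, sorted by ts.
    snapshots:  list of (ts, cumulative_tokens) tuples, sorted by ts.
    Returns a list of (tool_name, tokens_attributed); None when no
    bracketing snapshots exist. Illustrative sketch only -- CAT's real
    delta engine may resolve overlaps and gaps differently.
    """
    snap_ts = [ts for ts, _ in snapshots]
    results = []
    for call_ts, name in tool_calls:
        # Index of the first snapshot strictly after the call.
        i = bisect.bisect_right(snap_ts, call_ts)
        if i == 0 or i == len(snapshots):
            results.append((name, None))  # call outside the snapshot window
            continue
        before = snapshots[i - 1][1]  # cumulative count before the call
        after = snapshots[i][1]       # cumulative count after the call
        results.append((name, after - before))
    return results

snapshots = [(0.0, 1000), (10.0, 1600), (20.0, 1900)]
calls = [(5.0, "Read"), (15.0, "Bash")]
print(attribute_tokens(calls, snapshots))
# [('Read', 600), ('Bash', 300)]
```

This also shows why the article's caveat matters: if a tool call's timestamp drifts past a snapshot boundary, its delta is attributed to the wrong interval.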
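Welford's online algorithm and Z-score flagging are both standard techniques; a minimal sketch of how they could combine for per-task-type baselines follows. Class names and the threshold value are assumptions, not CAT's actual interface:

```python
import math

class RollingBaseline:
    """Welford's online algorithm: single-pass, numerically stable
    rolling mean and variance. One instance per task type."""
    def __init__(self):
        self.n = 0
        self.mean = 0.0
        self.m2 = 0.0  # running sum of squared deviations from the mean

    def update(self, x):
        self.n += 1
        delta = x - self.mean
        self.mean += delta / self.n
        self.m2 += delta * (x - self.mean)

    def std(self):
        # Sample standard deviation; undefined for fewer than 2 points.
        return math.sqrt(self.m2 / (self.n - 1)) if self.n > 1 else 0.0

def is_anomaly(baseline, x, z_threshold=3.0):
    """Flag x when its Z-score exceeds the threshold. Assumes roughly
    normal token consumption, the limitation the article notes."""
    sd = baseline.std()
    if sd == 0.0:
        return False  # not enough history to judge
    return abs(x - baseline.mean) / sd > z_threshold

b = RollingBaseline()
for tokens in [500, 520, 480, 510, 495]:
    b.update(tokens)
print(round(b.mean, 1), is_anomaly(b, 505), is_anomaly(b, 2000))
# 501.0 False True
```

Welford's update never stores the raw history, so baselines stay O(1) in memory no matter how many tool calls a session accumulates.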