Tokenizer Changes in Claude 4.7 Increase Token Usage
The new tokenizer in Claude 4.7 uses up to 1.47x as many tokens as the previous version, exceeding Anthropic's estimated range of 1.0x to 1.35x. The increase hits technical content and code prompts hardest, translating directly into higher costs for users.
Why it matters
The tokenizer change in Claude 4.7 has a significant impact on token usage and costs for developers using the platform, especially for technical content.
Key Points
- The Claude 4.7 tokenizer uses significantly more tokens than the previous version, up to 1.47x more in some cases
- Anthropic's estimate of a 1.0x to 1.35x increase was on the low end, with most real-world content seeing higher ratios
- The more dense and technical the content, the more the token count increases with the new tokenizer
Details
The article explains that Anthropic shipped a new tokenizer with the Claude 4.7 update, which significantly increased the number of tokens used compared to the previous version, 4.6. While Anthropic estimated the increase at between 1.0x and 1.35x, empirical measurements on real-world technical content showed a higher ratio of 1.47x. The same prompts and content therefore now consume more tokens, even though the price per token is unchanged. The article provides detailed measurements across different types of content, showing that the denser and more technical the text, the larger the token-count increase under the new tokenizer. The change is intentional: a more granular tokenizer can improve performance on character-level tasks and reduce errors, but it comes at a higher operational cost for users.
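The cost impact described above is simple to quantify: with the per-token price unchanged, a 1.47x token ratio means a 1.47x bill for the same prompt. Below is a minimal sketch of that arithmetic. The 1.47x ratio comes from the article; the prompt size and the $3-per-million-token price are hypothetical placeholders, not Anthropic's actual pricing.

```python
def cost_increase(old_tokens: int, ratio: float, price_per_mtok: float) -> tuple[float, float]:
    """Return (old_cost, new_cost) in dollars for the same prompt.

    ratio is new_tokens / old_tokens under the new tokenizer.
    """
    new_tokens = old_tokens * ratio
    old_cost = old_tokens / 1_000_000 * price_per_mtok
    new_cost = new_tokens / 1_000_000 * price_per_mtok
    return old_cost, new_cost

# Example: a 100k-token technical prompt at a hypothetical $3 / Mtok,
# using the 1.47x ratio measured in the article.
old, new = cost_increase(100_000, 1.47, 3.00)
print(f"before: ${old:.2f}, after: ${new:.2f}")  # before: $0.30, after: $0.44
```

Since pricing is per token, the percentage cost increase equals the token ratio minus one: a 1.47x ratio is a 47% increase on identical content.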