Replacing JSON with TOON in LLM Prompts Saves 40% on Tokens

The author, a frontend developer, discovered that using JSON in LLM prompts results in significant token overhead due to repeated keys and syntactic elements. By switching to a more concise format called TOON (Token-Oriented Object Notation), they were able to reduce token usage by 40%.

💡 Why it matters

Reducing token usage in LLM prompts can lead to significant cost savings, especially for companies and developers working extensively with these models.

Key Points

  • JSON has a lot of redundant elements: curly braces, quotes, and repeated keys
  • This redundancy adds significant token overhead when JSON is used in LLM prompts
  • TOON is a more concise format that declares keys once in a header, then includes only the values
  • TOON-to-JSON conversion is lossless, so the same data structure can be used unchanged
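As the article describes it, a uniform list that JSON encodes with a full set of keys per record becomes a single header plus value rows in TOON. The exact syntax below (the `items[2]{id,name}:` header and indented rows) is an assumption based on the format's public description, not copied from the article:

```
JSON: keys repeated for every record
[{"id":1,"name":"Widget"},{"id":2,"name":"Gadget"}]

TOON: keys declared once, then values only
items[2]{id,name}:
  1,Widget
  2,Gadget
```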

Details

The author works extensively with LLM APIs and frequently sends structured data such as product lists, logs, and user records as part of prompts. JSON, while convenient, carries redundant elements: curly braces, quotes, and a full set of keys repeated for every record. Across a set of 50 records, that syntactic overhead can add up to hundreds of characters the model must tokenize, driving up cost.

To address this, the author adopted TOON (Token-Oriented Object Notation), a more concise format that declares the keys once in a header and then lists only the values in subsequent rows. Eliminating the repeated keys, quotes, and braces cut token usage by 40% compared to JSON. Because the TOON-to-JSON conversion is lossless, the same data structures can be used without any changes to the model or the application logic.
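A minimal sketch of the idea for flat, uniform record lists. The `to_toon` helper and its header syntax are assumptions modeled on the format's description (keys once in a header, values in rows), not the official TOON library; character counts stand in as a rough proxy for token counts:

```python
import json

def to_toon(name, records):
    """Declare keys once in a header, then emit one value row per record.

    Assumes every record is a flat dict with the same keys (the tabular
    case the article describes); nested or ragged data is not handled here.
    """
    fields = list(records[0].keys())
    header = f"{name}[{len(records)}]{{{','.join(fields)}}}:"
    rows = ["  " + ",".join(str(r[f]) for f in fields) for r in records]
    return "\n".join([header] + rows)

products = [
    {"id": 1, "name": "Widget", "price": 9.99},
    {"id": 2, "name": "Gadget", "price": 19.99},
]

as_json = json.dumps(products)
as_toon = to_toon("products", products)

print(as_toon)
# The TOON form drops the braces, quotes, and repeated keys, so it is
# shorter than the JSON form for the same data:
print(len(as_json), len(as_toon))
```

The savings grow with the record count, since the per-record key overhead in JSON is paid once per row while TOON pays it once per list.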

AI Curator - Daily AI News Curation
