Dev.to Machine Learning | 4h ago | Research & Papers, Products & Services

Comparing HTML, Markdown, and SOM for AI Agents

This article explores the pros and cons of different formats for representing web pages to AI agents, including raw HTML, Markdown, and the Semantic Object Model (SOM).

Why it matters

The choice of web page representation format is crucial for AI agents that need to understand and interact with web content efficiently.

Key Points

  • Raw HTML provides complete fidelity but is expensive and noisy due to styling and scripts
  • Markdown is more concise but loses interactive elements and makes navigation tasks difficult
  • SOM preserves meaning and interactivity while stripping presentation noise

Details

When an AI agent needs to understand a web page, there are three common ways to represent it: raw HTML, Markdown, and the Semantic Object Model (SOM). Raw HTML gives complete fidelity to the DOM, but 80-95% of its tokens are noise such as styling and scripts, which makes it expensive and slow to process. Markdown reduces the HTML to readable text while preserving structure, so it uses fewer tokens, but it drops interactive elements, which turns navigation tasks into guesswork. SOM is a structured JSON representation that keeps meaning and interactivity while removing presentation noise. Each element carries its semantic role and the actions available on it, giving agents a more compact and more meaningful view of the page.
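
To make the contrast concrete, here is a minimal Python sketch that renders one tiny page in all three forms. It is an illustration only: the sample page, the `ROLES` mapping, and the `role`/`text`/`actions` field names are assumptions made for this example, not the actual SOM schema, and character counts are only a rough stand-in for token counts.

```python
import json
from html.parser import HTMLParser

# Hypothetical page: a heading, a link, and a button, plus style/script noise.
HTML = """<html><head><style>.btn{color:red;padding:4px}</style>
<script>window.track=function(){};</script></head>
<body><h1 class="title">Pricing</h1>
<a href="/signup" class="btn">Sign up</a>
<button id="buy" class="btn">Buy now</button></body></html>"""

# Assumed mapping from tag to (semantic role, available actions).
# This is NOT the real SOM schema, just an illustration of the idea.
ROLES = {
    "h1": ("heading", []),
    "a": ("link", ["click"]),
    "button": ("button", ["click"]),
}
SKIP = {"style", "script"}


class SomSketch(HTMLParser):
    """Collects role/text/actions for known tags and ignores style/script content."""

    def __init__(self):
        super().__init__()
        self.elements = []
        self._current = None
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in SKIP:
            self._skip_depth += 1
        elif tag in ROLES:
            role, actions = ROLES[tag]
            self._current = {"role": role, "text": "", "actions": list(actions)}
            href = dict(attrs).get("href")
            if href:
                self._current["href"] = href

    def handle_data(self, data):
        # Only keep text that belongs to a recognised element outside style/script.
        if self._skip_depth == 0 and self._current is not None:
            self._current["text"] += data.strip()

    def handle_endtag(self, tag):
        if tag in SKIP:
            self._skip_depth -= 1
        elif tag in ROLES and self._current is not None:
            self.elements.append(self._current)
            self._current = None


def to_som(html: str) -> list:
    parser = SomSketch()
    parser.feed(html)
    parser.close()
    return parser.elements


def to_markdown(elements: list) -> str:
    """Flatten to Markdown: the link survives, but the button becomes plain text."""
    lines = []
    for el in elements:
        if el["role"] == "heading":
            lines.append("# " + el["text"])
        elif el["role"] == "link":
            lines.append(f"[{el['text']}]({el.get('href', '')})")
        else:
            lines.append(el["text"])
    return "\n".join(lines)


som = to_som(HTML)
markdown = to_markdown(som)
som_json = json.dumps(som)

# Compare rough sizes, then show the Markdown and SOM-like views.
print(len(HTML), len(markdown), len(som_json))
print(markdown)
print(json.dumps(som, indent=2))
```

In the Markdown output the "Buy now" button is indistinguishable from ordinary text, while the SOM-like JSON still records that it is a button that can be clicked. On a page this small the size difference is modest; the 80-95% noise figure quoted above refers to real pages with far more styling and script content.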
