Semantic Sitemaps & Information Gain
Traditional XML sitemaps were designed for conventional search-engine crawlers and carry little beyond a URL and a last-modified date. CYBERMAPS introduces the Semantic Sitemap Extension, which adds high-density metadata intended for LLM training and RAG ingestion.
AI Namespace Extensions
We extend the standard Sitemap XML with the ai: namespace:
ai:info_gain: A value from 0.0 to 1.0 indicating the “uniqueness” of the content. High info-gain pages are prioritized for training.
ai:visual_weight: Indicates the importance of media on the page for multimodal models (GPT-4o, Gemini 1.5).
ai:intent: Classifies the page as informational, transactional, or navigational.
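The field constraints above can be sketched as a small validation type. This is only an illustration, not the CYBERMAPS implementation: the class name AiMetadata and the ALLOWED_INTENTS constant are hypothetical, and because the text does not state a scale for ai:visual_weight, it is kept as an optional float with no range check.

```python
from dataclasses import dataclass
from typing import Optional

# The three intent labels named in the field list above.
ALLOWED_INTENTS = frozenset({"informational", "transactional", "navigational"})


@dataclass(frozen=True)
class AiMetadata:
    """Hypothetical container for the ai: fields of one sitemap entry."""

    info_gain: float
    intent: str
    # Assumption: the scale of visual_weight is not specified, so it is
    # stored as given and not range-checked.
    visual_weight: Optional[float] = None

    def __post_init__(self) -> None:
        # ai:info_gain is documented as a value from 0.0 to 1.0.
        if not 0.0 <= self.info_gain <= 1.0:
            raise ValueError(f"info_gain out of range: {self.info_gain}")
        # ai:intent must be one of the three documented classes.
        if self.intent not in ALLOWED_INTENTS:
            raise ValueError(f"unknown intent: {self.intent!r}")
```

Rejecting bad values at construction time keeps malformed entries out of the generated sitemap instead of leaving consumers to guess how to interpret them.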
Example Output
<url>
  <loc>https://example.com/deep-dive-post</loc>
  <lastmod>2026-05-27</lastmod>
  <ai:info_gain>0.92</ai:info_gain>
  <ai:intent>informational</ai:intent>
</url>
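An entry like the one above could be produced with Python's standard xml.etree.ElementTree module. The following is a minimal sketch under stated assumptions: the namespace URI bound to the ai: prefix is a placeholder (this page does not give one), and build_urlset and its dictionary keys are hypothetical names chosen for illustration.

```python
import xml.etree.ElementTree as ET

# Standard Sitemap protocol namespace.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
# Assumption: placeholder URI, the real ai: namespace URI is not documented here.
AI_NS = "https://example.com/ns/ai-sitemap"

# Serialize sitemap elements unprefixed and extension elements as ai:*.
ET.register_namespace("", SITEMAP_NS)
ET.register_namespace("ai", AI_NS)


def _child(parent, ns, tag, text):
    """Append a namespaced child element with the given text."""
    el = ET.SubElement(parent, f"{{{ns}}}{tag}")
    el.text = text
    return el


def build_urlset(pages):
    """Return a sitemap XML string for an iterable of page dicts."""
    urlset = ET.Element(f"{{{SITEMAP_NS}}}urlset")
    for page in pages:
        url = ET.SubElement(urlset, f"{{{SITEMAP_NS}}}url")
        _child(url, SITEMAP_NS, "loc", page["loc"])
        _child(url, SITEMAP_NS, "lastmod", page["lastmod"])
        # Two decimal places, matching the example entry.
        _child(url, AI_NS, "info_gain", f"{page['info_gain']:.2f}")
        _child(url, AI_NS, "intent", page["intent"])
    return ET.tostring(urlset, encoding="unicode")


xml_text = build_urlset([
    {
        "loc": "https://example.com/deep-dive-post",
        "lastmod": "2026-05-27",
        "info_gain": 0.92,
        "intent": "informational",
    }
])
```

Declaring the prefix on the root element keeps each url entry compact, and parsers that do not recognize the extension namespace can skip the ai: elements.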
Substance Scoring
Before a page is included in the AI sitemap, it must pass a Substance Audit. Pages with low text-to-code ratios or shortcode-heavy “thin content” are automatically excluded to preserve the quality of the discovery data.
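To make the idea concrete, here is a rough sketch of what such an audit might look like. It is not the actual Substance Audit: the function names, the shortcode pattern (WordPress-style square brackets), and both thresholds are assumptions picked purely for illustration.

```python
import re
from html.parser import HTMLParser

# Assumption: shortcodes look like [gallery id=1] or [/slider].
SHORTCODE_RE = re.compile(r"\[/?[A-Za-z_][\w-]*[^\]]*\]")


class _TextExtractor(HTMLParser):
    """Collects text nodes, skipping <script> and <style> contents."""

    def __init__(self):
        super().__init__()
        self._skip_depth = 0
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth:
            self.chunks.append(data)


def visible_text(html):
    """Return whitespace-normalized visible text from an HTML string."""
    parser = _TextExtractor()
    parser.feed(html)
    parser.close()
    return " ".join(" ".join(parser.chunks).split())


def passes_substance_audit(html, min_text_ratio=0.25, max_shortcode_share=0.3):
    """Hypothetical audit: enough prose relative to markup, few shortcodes."""
    if not html:
        return False
    text = visible_text(html)
    if not text:
        return False
    shortcode_chars = sum(len(m) for m in SHORTCODE_RE.findall(text))
    prose = SHORTCODE_RE.sub(" ", text).strip()
    text_ratio = len(prose) / len(html)          # text-to-code ratio
    shortcode_share = shortcode_chars / len(text)
    return text_ratio >= min_text_ratio and shortcode_share <= max_shortcode_share
```

A page made mostly of prose passes, while a page consisting of scripts and a couple of shortcodes fails both checks. Real thresholds would need tuning against the site's own content.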