
feat(docs): add llms.txt feed and optional description generation for better LLM routing #3629

Open
benjaminach wants to merge 51 commits into master from feat/llms.txt

Contributor

@benjaminach benjaminach commented Mar 26, 2026

Summary

This PR introduces a generated llms.txt feed for doc.scalingo.com and improves LLM routing quality by adding optional, high-signal page descriptions.

Changes

  • Added src/llms.txt (Jekyll-generated) to expose a machine-readable list of documentation pages.
  • Updated _config.yml to include llms.txt in the generated site.
  • Filtered out non-target content from the feed (e.g. changelog entries, directory/index-like pages, explicitly excluded pages).
  • Added a new Codex skill: .codex/skills/scalingo-generate-description/SKILL.md.
  • Added its OpenAI agent config: .codex/skills/scalingo-generate-description/agents/openai.yaml.
  • Added/updated description front matter on selected user-management pages.
  • Standardized description formatting with double-quoted YAML values.
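The filtering described in the list above can be illustrated with a minimal Python sketch. The PR generates the feed with a Jekyll template (src/llms.txt), so this is not the actual implementation. The Page fields, the excluded flag, the /changelog/ prefix, the trailing-slash test for index-like pages, and the "- [title](url): description" entry format are all assumptions made for the example.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class Page:
    title: str
    url: str
    description: str = ""    # optional, like the front matter field
    excluded: bool = False   # hypothetical opt-out flag, not a name from the PR


def is_feed_target(page: Page) -> bool:
    """Return True for pages that should appear in the feed."""
    if page.excluded:
        return False
    if page.url.startswith("/changelog/"):  # assumed changelog URL prefix
        return False
    if page.url.endswith("/"):  # assumed shape of directory/index-like pages
        return False
    return True


def render_llms_txt(pages: list[Page], base_url: str) -> str:
    """Render one entry per target page, appending the description if set."""
    lines = []
    for page in filter(is_feed_target, pages):
        entry = f"- [{page.title}]({base_url}{page.url})"
        if page.description:
            entry += f": {page.description}"
        lines.append(entry)
    return "\n".join(lines) + "\n"
```

Pages without a description still get an entry, which matches the PR's statement that the field is optional.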

Why

description fields are optional, but they significantly help LLMs understand page intent and route queries to the right documentation page when scanning llms.txt.

Impact

  • New machine-readable endpoint: /llms.txt
  • Better semantic retrieval/routing for doc-focused LLM workflows
  • No expected user-facing behavior changes in normal site navigation

Validation

  • Verified Jekyll generation includes llms.txt.
  • Checked feed formatting and URL output.
  • Checked YAML front matter validity on updated pages.
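A check for the double-quoted description convention could look like the following sketch. It is an assumption about how such a check might be scripted, not the validation the author ran. It only inspects the quoting style of a single-line description value; confirming full YAML validity would need a real YAML parser. The function names are hypothetical.

```python
from __future__ import annotations

import re
from typing import Optional

# Matches a single-line "description:" entry inside the front matter.
DESCRIPTION_RE = re.compile(r"^description:[ \t]*(.*)$", re.MULTILINE)


def extract_front_matter(text: str) -> Optional[str]:
    """Return the text between the leading '---' lines, or None if absent."""
    if not text.startswith("---\n"):
        return None
    end = text.find("\n---", 3)
    if end == -1:
        return None
    return text[4:end]


def description_is_double_quoted(text: str) -> Optional[bool]:
    """Return None when there is no description, else whether it is double-quoted."""
    front_matter = extract_front_matter(text)
    if front_matter is None:
        return None
    match = DESCRIPTION_RE.search(front_matter)
    if match is None:
        return None
    value = match.group(1).strip()
    return len(value) >= 2 and value.startswith('"') and value.endswith('"')
```

Returning None for pages without a description keeps the check consistent with the field being optional.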

@benjaminach benjaminach marked this pull request as draft March 27, 2026 11:03
@benjaminach benjaminach changed the title from "add files to generates an llms.txt file" to "feat(docs): add llms.txt feed and optional description generation for better LLM routing" Mar 27, 2026
Comment thread src/llms.txt Outdated
Comment thread src/llms.txt Outdated
@benjaminach benjaminach marked this pull request as ready for review March 30, 2026 15:39
Comment thread src/_includes/organisms/head.html Outdated
<script defer data-domain="scalingo.com" event-app="documentation" data-api="https://scalingo.com/sc-analytics/event" src="{{ 'assets/analytics.js' | esbuild_asset_path }}"></script>
<link rel="stylesheet" href="https://use.typekit.net/ajl3atf.css">
{% if jekyll.environment == "production" %}
<script async src="https://analytics.scalingo.com/script.js"></script>

@semgrep-code-scalingo semgrep-code-scalingo Bot Apr 16, 2026


This tag is missing an 'integrity' subresource integrity attribute. The 'integrity' attribute allows for the browser to verify that externally hosted files (for example from a CDN) are delivered without unexpected manipulation. Without this attribute, if an attacker can modify the externally hosted resource, this could lead to XSS and other types of attacks. To prevent this, include the base64-encoded cryptographic hash of the resource (file) you’re telling the browser to fetch in the 'integrity' attribute for all externally hosted files.

🧁 Removed in commit faa8a24 🧁
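For reference, the integrity value that the Semgrep comment above asks for can be computed as in this minimal sketch. The flagged tag was removed in faa8a24, so this is illustrative only. The helper names are hypothetical, and the caller is assumed to already hold the exact bytes of the hosted script; the hash must be regenerated whenever that file changes.

```python
import base64
import hashlib
from html import escape

# Hash algorithms accepted in an 'integrity' attribute.
SRI_ALGORITHMS = ("sha256", "sha384", "sha512")


def sri_hash(resource: bytes, algorithm: str = "sha384") -> str:
    """Return '<algorithm>-<base64 digest>' for the given resource bytes."""
    if algorithm not in SRI_ALGORITHMS:
        raise ValueError(f"unsupported SRI algorithm: {algorithm}")
    digest = hashlib.new(algorithm, resource).digest()
    return f"{algorithm}-{base64.b64encode(digest).decode('ascii')}"


def script_tag(src: str, resource: bytes) -> str:
    """Build an async script tag carrying the integrity attribute."""
    return (
        f'<script async src="{escape(src, quote=True)}" '
        f'integrity="{sri_hash(resource)}" crossorigin="anonymous"></script>'
    )
```

The crossorigin attribute is included because browsers need a CORS-enabled fetch to verify the integrity of a cross-origin resource.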

@Frzk Frzk force-pushed the feat/llms.txt branch 2 times, most recently from 5213261 to faa8a24 April 16, 2026 13:14


2 participants