AgentLitmus Methodology v0.1 (internal codename: AgentGrade)

How AgentLitmus scores a site

AgentLitmus fetches a small set of pages from your site root (never more than 8 requests per scan) and runs 10 automated checks ("signals") worth 100 points in total. Each signal returns a point score, a pass/partial/fail status, a one-line finding, and a suggested fix. Last updated June 2026.

Guides for agents

44 pts

llms.txt

12 pts

Checks for a /llms.txt file: a plain-text guide that points AI agents to your most important pages. Awards points for the file existing, having a title and markdown links, and those links pointing to your own domain.

id: llms_txt
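
As a rough illustration of the three things this signal looks at, the sketch below inspects the text of an llms.txt file. The function name, the "# " title convention, and the rule that relative links count as own-domain are assumptions made for the example; it does not reproduce AgentLitmus's actual point breakdown.

```python
import re
from urllib.parse import urlparse

# Matches inline markdown links of the form [label](url).
LINK_RE = re.compile(r"\[([^\]]+)\]\(([^)\s]+)\)")


def inspect_llms_txt(text: str, site_host: str) -> dict:
    """Summarize an llms.txt body (hypothetical helper, not the real scorer)."""
    # Assumption: the title is a markdown H1 line starting with "# ".
    has_title = any(line.startswith("# ") for line in text.splitlines())
    urls = [m.group(2) for m in LINK_RE.finditer(text)]
    # Assumption: relative links (empty host) count as pointing to your own domain.
    own = [u for u in urls if urlparse(u).netloc in ("", site_host)]
    return {"has_title": has_title, "links": len(urls), "own_domain_links": len(own)}


sample = "# Example\n\n- [Docs](https://example.com/docs)\n- [Blog](/blog)\n- [Ext](https://other.org/x)\n"
report = inspect_llms_txt(sample, "example.com")
```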

Renders Without JS

12 pts

Checks how much real content is in the raw HTML response, before any JavaScript runs. Awards points for visible text length, a healthy text-to-HTML ratio, and signs of server-side rendering.

id: renders_without_js
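
To make "visible text length" and "text-to-HTML ratio" concrete, here is a minimal sketch using only the Python standard library. The helper names are hypothetical, and the thresholds AgentLitmus applies to these numbers are not stated here, so the sketch only computes the two measurements.

```python
from html.parser import HTMLParser


class _TextExtractor(HTMLParser):
    """Collects text nodes, skipping the contents of <script> and <style>."""

    def __init__(self):
        super().__init__()
        self._skip_depth = 0
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth:
            self.chunks.append(data)


def text_stats(html: str) -> tuple[int, float]:
    """Return (visible text length, text-to-HTML ratio) for a raw HTML response."""
    parser = _TextExtractor()
    parser.feed(html)
    parser.close()
    text = " ".join(" ".join(parser.chunks).split())  # collapse whitespace runs
    ratio = len(text) / len(html) if html else 0.0
    return len(text), ratio


page = "<html><body><script>var x=1;</script><p>Hello   world</p></body></html>"
length, ratio = text_stats(page)
```

A page that ships an empty shell and fills it in with JavaScript would show a very small length and ratio here, which is the situation this signal is meant to catch.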

Semantic HTML

10 pts

Examines your homepage's HTML structure: a single <h1>, headings that don't skip levels, landmark elements like main/nav/header/footer, and semantic lists or tables instead of div-only layouts.

id: semantic_html
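
The heading and landmark checks can be sketched as follows. This is an illustrative approximation with hypothetical names; it covers the <h1> count, skipped heading levels, and landmark detection, and leaves out the list/table check.

```python
from html.parser import HTMLParser

LANDMARKS = {"main", "nav", "header", "footer"}


class _StructureCollector(HTMLParser):
    """Records heading levels in document order and which landmarks appear."""

    def __init__(self):
        super().__init__()
        self.levels = []
        self.landmarks = set()

    def handle_starttag(self, tag, attrs):
        if len(tag) == 2 and tag[0] == "h" and tag[1] in "123456":
            self.levels.append(int(tag[1]))
        elif tag in LANDMARKS:
            self.landmarks.add(tag)


def structure_report(html: str) -> dict:
    collector = _StructureCollector()
    collector.feed(html)
    collector.close()
    # A skip is any step down the outline of more than one level, e.g. h2 -> h4.
    skips = any(b - a > 1 for a, b in zip(collector.levels, collector.levels[1:]))
    return {
        "h1_count": collector.levels.count(1),
        "skips_level": skips,
        "landmarks": sorted(collector.landmarks),
    }


doc = "<main><h1>T</h1><h2>A</h2><h4>B</h4></main><footer></footer>"
result = structure_report(doc)
```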

AGENTS.md

10 pts

Checks for an /AGENTS.md file describing how AI coding agents should work with your project. Awards points for the file existing and being organized with at least two markdown headings.

id: agents_md

Machine-readable content

24 pts

Structured Data (JSON-LD)

14 pts

Looks for JSON-LD <script> tags on your homepage. Awards points for the tags being present, parsing as valid JSON, and declaring a recognized schema.org @type like Organization, Product, Article, or WebSite.

id: structured_data
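
The three stages of this check (tags present, valid JSON, recognized @type) might look roughly like the sketch below. The RECOGNIZED set contains only the four types named above as examples; the real list is probably longer. The helper names are hypothetical, and nested structures such as @graph are not handled.

```python
import json
from html.parser import HTMLParser

# Assumption: only the four example types from the description above.
RECOGNIZED = {"Organization", "Product", "Article", "WebSite"}


class _JsonLdCollector(HTMLParser):
    """Collects the raw bodies of <script type="application/ld+json"> tags."""

    def __init__(self):
        super().__init__()
        self._inside = False
        self._buffer = []
        self.blocks = []

    def handle_starttag(self, tag, attrs):
        script_type = (dict(attrs).get("type") or "").lower()
        if tag == "script" and script_type == "application/ld+json":
            self._inside = True
            self._buffer = []

    def handle_data(self, data):
        if self._inside:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._inside:
            self.blocks.append("".join(self._buffer))
            self._inside = False


def jsonld_report(html: str) -> dict:
    collector = _JsonLdCollector()
    collector.feed(html)
    collector.close()
    valid, types = 0, set()
    for raw in collector.blocks:
        try:
            doc = json.loads(raw)
        except json.JSONDecodeError:
            continue  # tag present, but the body is not valid JSON
        valid += 1
        for item in doc if isinstance(doc, list) else [doc]:
            if isinstance(item, dict):
                declared = item.get("@type")
                for name in declared if isinstance(declared, list) else [declared]:
                    if isinstance(name, str):
                        types.add(name)
    return {"blocks": len(collector.blocks), "valid": valid,
            "recognized": sorted(types & RECOGNIZED)}


page = ('<script type="application/ld+json">{"@context":"https://schema.org",'
        '"@type":"Organization","name":"X"}</script>'
        '<script type="application/ld+json">{broken</script>')
report = jsonld_report(page)
```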

Machine-Readable Endpoint

10 pts

Looks for a machine-readable API surface: a /.well-known/mcp.json manifest (strongest), an /openapi.json spec, or a visible link to API docs on the homepage. The best one found counts, not the sum.

id: machine_endpoint

Access & trust

32 pts

Agent Access Policy

10 pts

Reads your /robots.txt to see whether it exists, whether it blanket-blocks all crawlers, and whether it explicitly addresses known AI agent user-agents like GPTBot, ClaudeBot, or Google-Extended.

id: agent_access_policy
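
A minimal sketch of the robots.txt questions asked above, using Python's standard urllib.robotparser. The AI_AGENTS tuple holds only the three user-agents named as examples, and treating "blanket block" as "the wildcard agent may not fetch /" is an assumption of this sketch.

```python
from urllib.robotparser import RobotFileParser

# Assumption: only the three example user-agents named in the description.
AI_AGENTS = ("GPTBot", "ClaudeBot", "Google-Extended")


def robots_report(robots_txt: str) -> dict:
    """Report whether robots.txt blocks everyone and which AI agents it names."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())

    named = set()
    for line in robots_txt.splitlines():
        key, _, value = line.partition(":")
        if key.strip().lower() == "user-agent":
            agent = value.split("#")[0].strip().lower()  # drop trailing comments
            named.update(a for a in AI_AGENTS if a.lower() == agent)

    return {
        "blanket_block": not parser.can_fetch("*", "/"),
        "ai_agents_named": sorted(named),
    }


blocked = robots_report("User-agent: *\nDisallow: /\n\nUser-agent: GPTBot\nAllow: /\n")
open_site = robots_report("User-agent: *\nAllow: /\n")
```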

Content Freshness

8 pts

Checks whether your sitemap.xml has <lastmod> dates and whether your homepage shows a parseable date (modified time, <time> element, or visible date) within the last 12 months.

id: freshness

Sitemap URL Quality

8 pts

Checks that /sitemap.xml exists and parses, then samples up to 20 URLs to see whether they use descriptive, readable words rather than opaque IDs or hashes.

id: urls_sitemap
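
One way to sample sitemap URLs and separate readable paths from opaque ones is sketched below. The OPAQUE pattern and the "at least three letters in a row" rule are illustrative heuristics chosen for this example, not the rules AgentLitmus applies.

```python
import re
import xml.etree.ElementTree as ET
from urllib.parse import urlparse

# Assumption: a segment is "opaque" if it is all digits or a long hex/UUID-like run.
OPAQUE = re.compile(r"^(\d+|[0-9a-f]{8,}|[0-9a-f-]{32,})$", re.IGNORECASE)


def sitemap_urls(xml_text: str, limit: int = 20) -> list[str]:
    """Return up to `limit` <loc> values, ignoring the XML namespace prefix."""
    root = ET.fromstring(xml_text)
    locs = [el.text.strip() for el in root.iter()
            if el.tag.rsplit("}", 1)[-1] == "loc" and el.text]
    return locs[:limit]


def is_readable(url: str) -> bool:
    """True if every path segment looks like words rather than an ID or hash."""
    segments = [s for s in urlparse(url).path.split("/") if s]
    return all(
        not OPAQUE.match(s) and re.search(r"[a-z]{3,}", s, re.IGNORECASE)
        for s in segments
    )


sitemap = (
    '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">'
    "<url><loc>https://example.com/pricing/team-plans</loc></url>"
    "<url><loc>https://example.com/items/8f3a9c2e41</loc></url>"
    "</urlset>"
)
urls = sitemap_urls(sitemap)
readable = [is_readable(u) for u in urls]
```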

Image Alt Text

6 pts

Samples up to 30 <img> elements on your homepage and checks how many have non-empty alt text, so agents (and screen readers) understand what images convey.

id: alt_text
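
The coverage measurement behind this signal can be approximated as below. The function name is hypothetical, and how the resulting fraction maps onto the 6 available points is not specified here.

```python
from html.parser import HTMLParser
from typing import Optional


class _ImgCollector(HTMLParser):
    """Records the (stripped) alt text of every <img>, empty string if absent."""

    def __init__(self):
        super().__init__()
        self.alts = []

    def handle_starttag(self, tag, attrs):
        if tag == "img":
            self.alts.append((dict(attrs).get("alt") or "").strip())


def alt_coverage(html: str, limit: int = 30) -> Optional[float]:
    """Fraction of the first `limit` images with non-empty alt text, or None if no images."""
    collector = _ImgCollector()
    collector.feed(html)
    collector.close()
    sample = collector.alts[:limit]
    if not sample:
        return None
    return sum(1 for alt in sample if alt) / len(sample)


page = ('<img src="a.png" alt="Logo"><img src="b.png" alt="">'
        '<img src="c.png"><img src="d.png" alt="Chart"/>')
coverage = alt_coverage(page)
```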

Grade bands

Your total score out of 100 maps to a letter grade:

  • A: 90+
  • B: 75–89
  • C: 60–74
  • D: 40–59
  • F: 0–39
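
The bands above translate directly into a small lookup. This is a sketch with a hypothetical function name, assuming the score is an integer from 0 to 100.

```python
def letter_grade(score: int) -> str:
    """Map a 0-100 score to its letter grade using the published band floors."""
    for floor, letter in ((90, "A"), (75, "B"), (60, "C"), (40, "D")):
        if score >= floor:
            return letter
    return "F"


boundaries = {s: letter_grade(s) for s in (100, 90, 89, 75, 74, 60, 59, 40, 39, 0)}
```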

Crawler behavior

  • Identifies as AgentLitmusBot/0.1. See the bot page for details.
  • Respects robots.txt disallow rules for our user agent before fetching any page.
  • Times out each request after 10 seconds and makes at most 8 requests per scan.
  • Fetches pages in a fixed priority order: robots.txt, the homepage, llms.txt, AGENTS.md, sitemap.xml, /.well-known/mcp.json, then openapi.json and an /api-docs fallback. If the request budget runs out, the highest-weight checks have already run.
  • Does not execute JavaScript; everything is scored from the raw HTML response.
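
The request budget and priority order can be sketched as a simple loop. The fetch callable is a hypothetical stand-in for an HTTP client that returns the body, returns None when the resource is missing, and raises OSError on network failure. Counting one request per path (ignoring redirects) and the omission of the robots.txt permission check are simplifications of this sketch.

```python
from typing import Callable, Dict, List, Optional

# Paths in the priority order listed above.
PRIORITY_PATHS: List[str] = [
    "/robots.txt", "/", "/llms.txt", "/AGENTS.md", "/sitemap.xml",
    "/.well-known/mcp.json", "/openapi.json", "/api-docs",
]
MAX_REQUESTS = 8
TIMEOUT_SECONDS = 10.0


def run_scan(
    fetch: Callable[[str, float], Optional[str]],
    max_requests: int = MAX_REQUESTS,
) -> Dict[str, Optional[str]]:
    """Fetch paths in priority order until the budget is spent.

    The result holds only paths that were actually fetched. A value of None
    means "fetched and found missing"; an absent key means "never fetched".
    """
    fetched: Dict[str, Optional[str]] = {}
    requests_made = 0
    for path in PRIORITY_PATHS:
        if requests_made >= max_requests:
            break  # budget exhausted: remaining paths stay unfetched
        requests_made += 1
        try:
            fetched[path] = fetch(path, TIMEOUT_SECONDS)
        except OSError:
            continue  # network error or timeout: path stays absent
    return fetched


calls = []

def fake_fetch(path: str, timeout: float) -> Optional[str]:
    calls.append(path)
    if path == "/":
        raise TimeoutError("simulated slow homepage")
    return None if path == "/llms.txt" else "body"

result = run_scan(fake_fetch, max_requests=3)
```

Keeping "missing" and "never fetched" distinct in the result is what lets a scorer tell a fail apart from an unchecked signal.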

Unchecked signals and score normalization

A signal is marked unchecked instead of fail when the resources it needs were never fetched, either because the scan ran out of request budget or because it hit a network error. A resource that was fetched and found missing still counts as a fail. Unchecked signals are shown with a gray dot and don't count against your score.

Your score is normalized over only the signals that were actually checked: total = round(earned points / checked points × 100), where checked points is the maximum available from the checked signals. If every signal was checked, this equals the raw score out of 100. If some signals are unchecked, the report shows "Grade based on N of 10 signals" to make clear that the grade reflects a partial scan.
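
A worked example of the normalization, as a sketch. Signals are represented as hypothetical (earned, max) pairs, with earned set to None for an unchecked signal. The rounding mode is an assumption: the sketch rounds halves up, whereas Python's built-in round() rounds halves to the nearest even number.

```python
import math
from typing import List, Optional, Tuple


def normalized_score(signals: List[Tuple[Optional[int], int]]) -> Optional[int]:
    """Return round(earned / checked * 100) over checked signals only.

    Each signal is (earned_points, max_points); earned_points is None when
    the signal is unchecked. Returns None if nothing was checked.
    """
    checked = [(earned, maximum) for earned, maximum in signals if earned is not None]
    checked_points = sum(maximum for _, maximum in checked)
    if checked_points == 0:
        return None
    earned_points = sum(earned for earned, _ in checked)
    # Assumption: halves round up (62.5 -> 63).
    return math.floor(earned_points / checked_points * 100 + 0.5)


# Two 12-point signals checked (12 and 6 earned), one 10-point signal unchecked:
# 18 / 24 * 100 = 75, and the 10 unchecked points are left out of the denominator.
partial = normalized_score([(12, 12), (6, 12), (None, 10)])
```

Had the unchecked signal been scored as a fail instead, the same site would get 18 / 34 × 100, which rounds to 53.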