| name | web-archeologist |
|---|---|
| description | Trace any clause, element, algorithm, or GitHub issue/PR in Web Standards (WHATWG, W3C, WICG, IETF) back to its historical origins. |
This skill enables the agent to trace any clause, element, algorithm, or GitHub issue/PR in Web Standards (WHATWG, W3C, WICG, IETF) back to its historical origins.
Before cloning repositories, use these tools to find the canonical definition and its impact across the web platform.
If you only have a term (e.g., "fetch timing info") but no URL, use the ReSpec Xref API:

    # Search for a term across all known specifications
    curl -s -X POST "https://respec.org/xref" \
      -H "Content-Type: application/json" \
      -d '{"keys": [{"term": "fetch timing info"}]}' | jq '.result[][1][0].uri'

To see which other specs depend on a definition, use WebDex (by @dontcallmedom). This helps identify the "Why" by showing who consumes the logic:
- Tool: http://dontcallmedom.github.io/webdex/
- Manual Search: Search for the term in WebDex to find "References" and "Definitions".
For a comprehensive search across all modern and historical web specifications:
- CLI Repository: jnjaeschke/webspec-index
- Usage: Install with `cargo binstall webspec-index`. This tool provides full-text search, cross-reference tracking, and graph traversal across HTML, DOM, URL, CSS, ECMAScript, and 70+ other specifications. Use it to find where terms are defined if ReSpec Xref fails or if you need to build a cross-reference graph.
If a term is ambiguous or the spec definition is too low-level to understand the intent:
- Search API (Recommended): `curl -s "https://developer.mozilla.org/api/v1/search?q=term" | jq '.documents[0].mdn_url'`
- Usage: MDN often bridges the gap between the "What" (spec) and the "How" (usage). Use it to identify which specification is the current "canonical" one for a feature, as it often links directly to the normative spec sections.
- Disambiguation: If Xref returns multiple results (e.g., for "Image"), MDN can help you identify whether the user is likely referring to the `HTMLImageElement`, a CSS `<image>` type, or a Canvas 2D image source.
Before diving into spec prose, always look for the Explainer:
- Source: Search the "Initial Issue" or the "Landing PR" for links to `explainer.md` or a dedicated repository (often in `WICG/`).
- Priority: Treat the Explainer as the highest-signal source for "Why" a feature exists, its constraints, and the alternatives considered.
- Search Tip: `site:github.com "WICG" "term" "explainer"`
Always use `~/.gemini/cache/specs` for local clones.

| Domain | GitHub Repository |
|---|---|
| `html.spec.whatwg.org` | `whatwg/html` (File: `source`) |
| `dom.spec.whatwg.org` | `whatwg/dom` (File: `dom.bs`) |
| `fetch.spec.whatwg.org` | `whatwg/fetch` (File: `fetch.bs`) |
| `*.spec.whatwg.org` | `whatwg/<name>` (File: `<name>.bs`) |
| `drafts.csswg.org` | `w3c/csswg-drafts` (Search `**/*.bs`) |
| `httpwg.org` | `httpwg/http-extensions` |
| `wicg.github.io` | `WICG/<name>` |
| `source.chromium.org` | `chromium/chromium` |
| `webkit.org` | `WebKit/WebKit` |
| `searchfox.org` | `mozilla/gecko-dev` |
| `krijnhoetmer.nl/irc-logs` | `KrijnHoetmer/irc-logs` |
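The domain-to-repository mapping above can be sketched as a small shell lookup. The function name `spec_repo` is hypothetical, and the sketch assumes that the WHATWG repository name is the first label of the host and that the WICG repository name is the first path segment of the URL.

```shell
# Hypothetical helper: print the GitHub repository for a spec or source URL,
# following the domain table. Returns 1 (and prints nothing) for unknown hosts.
spec_repo() {
  sr_url=${1#*://}              # drop the scheme, if any
  sr_host=${sr_url%%/*}         # everything before the first slash
  case "$sr_url" in
    */*) sr_path=${sr_url#*/} ;;
    *)   sr_path= ;;
  esac
  case "$sr_host" in
    *.spec.whatwg.org)   echo "whatwg/${sr_host%%.*}" ;;   # assumption: repo == first host label
    drafts.csswg.org)    echo "w3c/csswg-drafts" ;;
    httpwg.org)          echo "httpwg/http-extensions" ;;
    wicg.github.io)      echo "WICG/${sr_path%%/*}" ;;     # assumption: repo == first path segment
    source.chromium.org) echo "chromium/chromium" ;;
    webkit.org)          echo "WebKit/WebKit" ;;
    searchfox.org)       echo "mozilla/gecko-dev" ;;
    krijnhoetmer.nl)     echo "KrijnHoetmer/irc-logs" ;;
    *) return 1 ;;
  esac
}
```

For example, `spec_repo https://fetch.spec.whatwg.org/` prints `whatwg/fetch`; the result can then be cloned under `~/.gemini/cache/specs`.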
Action: Clone with `--depth 1000`. Use `git fetch --unshallow` if history is cut off.

Warning: A shallow clone (`--depth`) can lead to hallucinations where the oldest commit in the shallow history is incorrectly identified as the origin of a line. Always run `git fetch --unshallow` before performing a deep history trace or running `git log -L`.
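One way to guard against the shallow-history pitfall is to check the clone before tracing. The helper below is a sketch with a hypothetical name (`ensure_full_history`); it assumes Git 2.15 or newer, where `git rev-parse --is-shallow-repository` is available.

```shell
# Hypothetical helper: make sure a clone has full history before a deep trace.
# Assumes Git >= 2.15 for `rev-parse --is-shallow-repository`.
ensure_full_history() {
  efh_repo=${1:-.}
  if [ "$(git -C "$efh_repo" rev-parse --is-shallow-repository)" = "true" ]; then
    echo "shallow clone detected in $efh_repo; fetching full history" >&2
    git -C "$efh_repo" fetch --unshallow
  fi
}
```

Calling `ensure_full_history ~/.gemini/cache/specs/fetch` before `git log -L` does nothing on a full clone and fetches the missing history on a shallow one.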
Given a fragment ID (e.g., `#main-fetch`), find the exact line. The true definition is the `<dfn>` or structural block defining the term. To list candidate files first:

    grep -rlE "<dfn[^>]*.*fragment" . --include=*.bs --include=source --include=*.md

Search for the fragment name using these patterns in order:
- Strict Attribute Match: `grep -nE '<dfn[^>]* (id|data-x)=["'\''](concept-|rel-|attr-|dom-)?fragment["'\'']' <file>`
  - Common prefixes, matched by the optional group: `concept-`, `rel-`, `attr-`, `dom-`.
- Multi-line Handling:
  - If the fragment is inside a nested tag, use `grep -nE 'data-x="fragment"'`, then scan back 5 lines with `sed` to find the opening `<dfn`.
- CSS Property: `grep -nE "Name:\s*fragment" <file>` (inside a `propdef` block).
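The strict attribute match can be wrapped in a small function so the fragment and file are parameters. This is a sketch: `find_dfn` is a hypothetical name, it assumes quoted attribute values as the pattern above does, and it assumes the fragment contains no regex metacharacters.

```shell
# Hypothetical helper: print "line:text" for <dfn> tags whose id or data-x
# equals the fragment, with or without one of the common prefixes.
# Assumes quoted attribute values and a fragment free of regex metacharacters.
find_dfn() {
  fd_fragment=${1#"#"}     # accept "#main-fetch" or "main-fetch"
  fd_file=$2
  grep -nE "<dfn[^>]* (id|data-x)=[\"'](concept-|rel-|attr-|dom-)?${fd_fragment}[\"']" "$fd_file"
}
```

`find_dfn '#main-fetch' fetch.bs` prints matching lines with their line numbers and returns a non-zero status when nothing matches, which is the cue to fall back to the multi-line or CSS property patterns.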
If given a link to source.chromium.org, WebKit's GitHub, or Mozilla Searchfox:
- Chromium: Remove the URL prefix (e.g., `source.chromium.org/chromium/chromium/src/+/main:`) to isolate the file path.
- WebKit: Remove the URL prefix (e.g., `github.com/WebKit/WebKit/blob/main/`) to isolate the file path.
- Mozilla (Gecko): Remove the URL prefix (e.g., `searchfox.org/mozilla-central/source/`) to isolate the file path.
- Function Search: If searching for a symbol name (e.g., `FetchManager::Loader::Start`), use `grep -rn "SymbolName" .` to find the implementation.
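The prefix stripping above can be sketched with plain parameter expansion. The name `code_url_to_path` is hypothetical. Dropping trailing line anchors and query strings (such as `;l=42`, `#L10`, or `?q=...`) is an added assumption about what these URLs typically carry, not something the steps above require.

```shell
# Hypothetical helper: strip the known code-search prefixes to get a repo path.
# Assumption: anything after ';', '?' or '#' is a line anchor or query string.
code_url_to_path() {
  cp_path=$1
  case "$cp_path" in
    *source.chromium.org/chromium/chromium/src/+/main:*)
      cp_path=${cp_path#*/src/+/main:} ;;
    *github.com/WebKit/WebKit/blob/main/*)
      cp_path=${cp_path#*/WebKit/WebKit/blob/main/} ;;
    *searchfox.org/mozilla-central/source/*)
      cp_path=${cp_path#*/mozilla-central/source/} ;;
    *) return 1 ;;
  esac
  cp_path=${cp_path%%[;?#]*}
  printf '%s\n' "$cp_path"
}
```

The printed path is relative to the root of the matching local clone, so it can be passed directly to `git log` or `grep` there.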
... (Strategies omitted for brevity) ...
When GitHub issues or Bugzilla reports reference a "discussion on IRC" or when you need to find the real-time debate behind a 2006-2016 era change:
- Local Search (Recommended): Clone `KrijnHoetmer/irc-logs` and use `grep` to search across channels and dates: `grep -rEi "createContextualFragment" ~/.gemini/cache/specs/irc-logs/whatwg`
- Online Archive: krijnhoetmer.nl/irc-logs/
- Search Tip: If local search is unavailable, use Google with `site:krijnhoetmer.nl/irc-logs/whatwg "term"`.
- Archive: matrixlogs.bakkot.com/irc-whatwg/
- Usage: Use this for more recent discussions (post-2018) that happened in the #whatwg channel, now bridged to Matrix.
Use this protocol to build a tree of callers and callees for a specific algorithm or concept, annotating the relationships with spec links and rationale.
- Locate the Definition Block: Use the heuristics in Section 3 to find the `<div algorithm>` or structural block.
- Scan for References: Identify all `<a>` tags or terms in `[= ... =]` or `{{ ... }}` brackets within the block.
- Resolve Specs: For each reference, determine if it is internal (same file) or external (use ReSpec Xref to find the source spec).
- Describe Relationship: Note how the callee is used (e.g., "Invoked to validate the origin", "Passed as an argument to initialize the fetch params").
- Internal Callers: `grep` the current specification for the term's `id` or `lt` (link text).
- External Callers: Use WebDex (`http://dontcallmedom.github.io/webdex/`) to find which other specifications reference this definition.
- Rationale: Analyze the calling context to describe why this spec is invoking the algorithm.
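The reference scan in the steps above can be approximated with a single `grep`. This is a rough sketch: `list_refs` is a hypothetical name, and it assumes the definition block has been saved to its own file and that each reference sits on one line.

```shell
# Hypothetical helper: list the unique references in a saved definition block:
# <a>...</a> links, [= term =] autolinks, and {{ IDL }} autolinks.
# Assumes every reference fits on a single line.
list_refs() {
  grep -oE '<a( [^>]*)?>[^<]*</a>|\[=[^]]*=\]|[{][{][^}]*[}][}]' "$1" | LC_ALL=C sort -u
}
```

Each printed reference is a candidate callee, to be resolved as internal or external in the next step.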
Present the graph as a nested Markdown list with the following structure:
- `Algorithm Name` [Spec Link] - "Short description of the algorithm's purpose."
  - Callees:
    - `Child Algorithm` [Spec Link] - "Relationship: [How it's used]"
  - Callers:
    - `Parent Algorithm` [Spec Link] - "Relationship: [Why it calls this]"

Example:
- `Main Fetch` [fetch/#main-fetch] - "The entry point for all network requests."
  - Callees:
    - `HTTP-network-or-cache fetch` [fetch/#http-network-or-cache-fetch] - "Relationship: Invoked for HTTP(S) schemes."
  - Callers:
    - `HTML Navigation` [html/#navigate] - "Relationship: Used to fetch the document resource."