Spec-driven SEO and GEO

πŸ—’οΈ Description

A synthesis of SEO and GEO (Generative Engine Optimization) patterns drawn from two case studies: the portfolio (Vite 7 + React 19 SPA) and Qamera AI (Next.js 16 + Turborepo + Vercel + Supabase). The central thesis: full control over SEO and GEO is only possible today on a code-based stack, and it only pays off once you layer a spec-driven AI workflow on top (OpenSpec / OPSX Workflow).

WordPress / Webflow plugins cover roughly the first 80% of needs. The remaining 20% (CSP with reporting to Sentry, xhtml:link in the sitemap, requestIdleCallback in <head>, build-time llms.txt, per-bot rules in robots.txt) is what wins positions today, both in classic SERP and in LLM answers.

🧩 Toolchain: five tools, one loop

| Tool | Role |
| --- | --- |
| claude-seo plugin | Audit as the first command: technical, GEO, schema, performance, hreflang |
| OpenSpec / OPSX Workflow | Spec-driven workflow: proposal.md → design.md → specs/ → tasks.md before code |
| Lighthouse MCP | Lab CWV and LCP opportunities from inside the agent |
| Rich Results Test + securityheaders.com + Sentry CSP Reports | Verification at every step |
| Git worktrees | Parallel work on independent changes (when the project allows it) |

The loop: audit → proposal → design → specs → tasks → implement → verify → archive. Spec-driven is a feedback loop for the AI: reviewing a spec costs minutes, reviewing 200 lines of generated code in the wrong place costs hours. See: Specification-Driven Development, Context Engineering.

🧩 Patterns transferable 1:1 between stacks

A. llms.txt as a custom build-time artifact

The format comes from llmstxt.org (Answer.AI / Jeremy Howard, 2024). In 2026 ChatGPT web search, Perplexity, Claude Search and Gemini Deep Research all respect it. A build-time generator in Node emits two files:

  • llms.txt: shortened index of content (~16 KB)
  • llms-full.txt: full content with a \n\n---\n\n separator for single-token ingest

Not doable in a CMS panel: it requires your own CMS with control over section ordering per language and a fallback for a missing description.
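
A minimal sketch of such a generator, assuming a hypothetical DocEntry content shape (the field names, the 160-character cut-off and the single "Content" section are illustrative, not taken from either project). It shows the two outputs and the description fallback; per-language section ordering and writing the files to disk are left to the build script.

```typescript
// Hypothetical content shape; field names are assumptions for illustration.
interface DocEntry {
  title: string;
  url: string;
  body: string; // full text of the page or post
  description?: string; // may be missing in the content source
}

// Separator between documents in llms-full.txt, as described in the note.
const FULL_SEPARATOR = "\n\n---\n\n";

// Fallback for a missing description: first non-empty paragraph, cut to maxLength.
function descriptionFor(entry: DocEntry, maxLength = 160): string {
  const explicit = entry.description?.trim();
  if (explicit) return explicit;
  const paragraphs = entry.body
    .split(/\n\s*\n/)
    .map((p) => p.trim())
    .filter((p) => p.length > 0);
  const first = paragraphs[0] ?? "";
  return first.length > maxLength ? first.slice(0, maxLength - 1).trim() + "…" : first;
}

// llms.txt: H1 title, blockquote summary, then a link list (llmstxt.org layout).
function buildLlmsTxt(siteTitle: string, summary: string, entries: DocEntry[]): string {
  const links = entries.map((e) => `- [${e.title}](${e.url}): ${descriptionFor(e)}`);
  return [`# ${siteTitle}`, "", `> ${summary}`, "", "## Content", "", ...links, ""].join("\n");
}

// llms-full.txt: every document in full, joined with the separator.
function buildLlmsFullTxt(entries: DocEntry[]): string {
  return entries.map((e) => `# ${e.title}\n\n${e.body.trim()}`).join(FULL_SEPARATOR) + "\n";
}
```

Keeping both builders as pure string functions means the same code can run in a Vite plugin hook or a Next.js prebuild script; only the file-writing wrapper differs.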

B. Schema enrichment

BlogPosting / Article / Service / Product with the full set of fields. Three non-obvious details:

  • articleBody: post.excerpt is semantically wrong (the spec requires full content), so drop the field rather than hack it
  • publisher.logo must be a raster (PNG 600×60), not SVG
  • datePublished / dateModified in ISO 8601 with Z or offset, not just YYYY-MM-DD

Missing fields that add value: mainEntityOfPage, publisher, dateModified, description with a fallback to the first paragraph.
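
The details above can be sketched as a small JSON-LD builder. The PostInput and PublisherInput shapes and the function name are hypothetical; the output uses only standard schema.org properties. Note that articleBody is deliberately absent and that Date.toISOString() always yields a full ISO 8601 timestamp ending in Z.

```typescript
// Hypothetical input shapes; not taken from either codebase.
interface PostInput {
  title: string;
  url: string;
  publishedAt: Date;
  updatedAt?: Date;
  description?: string;
  firstParagraph: string; // used when description is missing
}

interface PublisherInput {
  name: string;
  logoUrl: string; // expected to point at a raster image, e.g. a 600×60 PNG
}

function buildBlogPostingJsonLd(post: PostInput, publisher: PublisherInput): Record<string, unknown> {
  // Guard against the SVG logo mistake described in the note.
  if (/\.svg(\?|#|$)/i.test(publisher.logoUrl)) {
    throw new Error("publisher.logo must be a raster image, not SVG");
  }
  return {
    "@context": "https://schema.org",
    "@type": "BlogPosting",
    headline: post.title,
    // Fallback: first paragraph when no explicit description exists.
    description: post.description?.trim() || post.firstParagraph.trim(),
    // Full ISO 8601 timestamps with a Z suffix, not bare YYYY-MM-DD.
    datePublished: post.publishedAt.toISOString(),
    dateModified: (post.updatedAt ?? post.publishedAt).toISOString(),
    mainEntityOfPage: { "@type": "WebPage", "@id": post.url },
    publisher: {
      "@type": "Organization",
      name: publisher.name,
      logo: { "@type": "ImageObject", url: publisher.logoUrl },
    },
    // articleBody is intentionally omitted: an excerpt would be semantically wrong.
  };
}
```

The result can be serialized with JSON.stringify into a script tag of type application/ld+json and then checked in the Rich Results Test.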

C. Hreflang at the sitemap level, not just <head>

Metadata.alternates.languages is a head-level signal. Google prefers sitemap-level xhtml:link for clustering language variants:

<url>
  <loc>https://qamera.ai/pricing</loc>
  <xhtml:link rel="alternate" hreflang="en" href="https://qamera.ai/pricing"/>
  <xhtml:link rel="alternate" hreflang="pl" href="https://qamera.ai/pl/pricing"/>
  <xhtml:link rel="alternate" hreflang="uk" href="https://qamera.ai/uk/pricing"/>
  <xhtml:link rel="alternate" hreflang="x-default" href="https://qamera.ai/pricing"/>
</url>

A shared helper buildLanguageAlternates(pathname) is used both in sitemap.ts and in every generateMetadata. A drift-guard test in CI fails if someone adds a path to the sitemap but forgets alternates in page.tsx.
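
One possible shape for that helper, as a sketch only: the name buildLanguageAlternates comes from the note, but the body, the locale list, the unprefixed default locale and the helpers urlFor and renderSitemapUrl are assumptions inferred from the XML example above.

```typescript
// Assumed from the sitemap example: English lives at the root, other locales under a prefix.
const SITE_ORIGIN = "https://qamera.ai";
const LOCALES = ["en", "pl", "uk"] as const;
const DEFAULT_LOCALE = "en";

// Absolute URL of a pathname in a given locale. Pathnames are assumed to start with "/".
function urlFor(locale: string, pathname: string): string {
  const prefix = locale === DEFAULT_LOCALE ? "" : `/${locale}`;
  const path = pathname === "/" ? "" : pathname;
  return `${SITE_ORIGIN}${prefix}${path}`;
}

// hreflang code -> absolute URL, including x-default pointing at the default locale.
function buildLanguageAlternates(pathname: string): Record<string, string> {
  const alternates: Record<string, string> = {};
  for (const locale of LOCALES) {
    alternates[locale] = urlFor(locale, pathname);
  }
  alternates["x-default"] = urlFor(DEFAULT_LOCALE, pathname);
  return alternates;
}

// Renders one <url> entry. Paths are assumed to need no XML escaping.
function renderSitemapUrl(pathname: string): string {
  const alternates = buildLanguageAlternates(pathname);
  const links = Object.keys(alternates).map(
    (lang) => `  <xhtml:link rel="alternate" hreflang="${lang}" href="${alternates[lang]}"/>`
  );
  return ["<url>", `  <loc>${urlFor(DEFAULT_LOCALE, pathname)}</loc>`, ...links, "</url>"].join("\n");
}
```

Because the head metadata and the sitemap both read from the same map, a drift guard can be as simple as asserting that every sitemap path produces the same map that the page passes to its metadata.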

D. AI bot allowlist: named rules instead of wildcard

robots.txt with separate blocks for GPTBot, OAI-SearchBot, ClaudeBot, PerplexityBot, Google-Extended, CCBot. Wildcard = "no signal", and the bot interprets that conservatively. Named allow = "explicit yes".
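
A minimal sketch of generating those named blocks at build time. The bot list is the one from the note; the function name, the disallow parameter and the trailing wildcard block are illustrative assumptions.

```typescript
// User agents named in the note; extend the list as new crawlers appear.
const AI_BOTS = ["GPTBot", "OAI-SearchBot", "ClaudeBot", "PerplexityBot", "Google-Extended", "CCBot"];

// One explicit block per bot, then a wildcard block for everything else.
function buildRobotsTxt(sitemapUrl: string, disallow: string[] = []): string {
  const blockFor = (agent: string): string =>
    [`User-agent: ${agent}`, "Allow: /", ...disallow.map((path) => `Disallow: ${path}`)].join("\n");
  const blocks = [...AI_BOTS, "*"].map(blockFor);
  return blocks.join("\n\n") + `\n\nSitemap: ${sitemapUrl}\n`;
}
```

For example, buildRobotsTxt("https://example.com/sitemap.xml", ["/api/"]) yields seven blocks, each with an explicit Allow line and the shared Disallow rule, followed by the Sitemap line.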

E. Security headers as a trust signal

  • X-XSS-Protection is deprecated: drop it rather than update it
  • Content-Security-Policy in Report-Only mode with reporting to Sentry before enforcing
  • HSTS with preload, Permissions-Policy per page (geolocation, camera, microphone, payment)
  • securityheaders.com grade C → A is reachable in a single afternoon
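
The header set above can be sketched as a builder that returns key/value pairs, the shape Next.js accepts in its headers() config. The CSP directives, the max-age value and the function names are assumptions for illustration; the report URI is a parameter because the real Sentry security endpoint is project-specific.

```typescript
interface HeaderEntry {
  key: string;
  value: string;
}

// Features named in the note; each is denied unless the page opts in.
const GATED_FEATURES = ["geolocation", "camera", "microphone", "payment"];

// Per-page Permissions-Policy: "(self)" for allowed features, "()" for the rest.
function buildPermissionsPolicy(allowedOnThisPage: string[] = []): string {
  return GATED_FEATURES.map(
    (feature) => `${feature}=${allowedOnThisPage.indexOf(feature) >= 0 ? "(self)" : "()"}`
  ).join(", ");
}

function buildSecurityHeaders(cspReportUri: string, allowedFeatures: string[] = []): HeaderEntry[] {
  // Illustrative starting policy; a real one is tuned from the violation reports.
  const csp = [
    "default-src 'self'",
    "script-src 'self'",
    "object-src 'none'",
    "base-uri 'self'",
    `report-uri ${cspReportUri}`,
  ].join("; ");
  return [
    // Report-Only first: violations are reported but nothing is blocked yet.
    { key: "Content-Security-Policy-Report-Only", value: csp },
    { key: "Strict-Transport-Security", value: "max-age=63072000; includeSubDomains; preload" },
    { key: "Permissions-Policy", value: buildPermissionsPolicy(allowedFeatures) },
    // X-XSS-Protection is deliberately not emitted.
  ];
}
```

Once the reports stay quiet for a while, switching the first key to Content-Security-Policy turns the same policy into an enforced one.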

🧩 Stack-specific gotchas

SPA / Vite: async=true on an inline <script> is a myth. The attribute only affects how a script is downloaded, while the inline code itself still executes synchronously during HTML parsing. Fix: requestIdleCallback + setTimeout(2000) fallback for Safari ≤ 16.3.
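
A sketch of that fix as a small helper. The IdleHost interface and the name runWhenIdle are hypothetical; the host parameter exists only so the helper can be exercised outside a browser, and in a page it defaults to the global object.

```typescript
// Minimal view of the global object; lets the helper be tested without a browser.
interface IdleHost {
  requestIdleCallback?: (callback: () => void, options?: { timeout: number }) => unknown;
  setTimeout: (callback: () => void, ms: number) => unknown;
}

// Defers non-critical work (analytics, widgets) until the main thread is idle.
// Returns which scheduling path was taken.
function runWhenIdle(
  task: () => void,
  host: IdleHost = globalThis as unknown as IdleHost,
  fallbackMs = 2000
): "idle" | "timeout" {
  if (typeof host.requestIdleCallback === "function") {
    // The timeout makes sure the task still runs if the page never goes idle.
    host.requestIdleCallback(task, { timeout: fallbackMs });
    return "idle";
  }
  // Browsers without requestIdleCallback: plain delayed execution.
  host.setTimeout(task, fallbackMs);
  return "timeout";
}
```

In the inline script in <head> the call is then just runWhenIdle(() => loadAnalytics()), where loadAnalytics stands in for whatever third-party bootstrap was blocking the parser.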

Next.js / SaaS: CLS from client-side fetch without reserved dimensions (Qamera /marketplace/styles: 0.467 → 0.016 via SSR initial grid, with a bonus for GEO because non-JS crawlers see the content).

Field vs lab data: Lab scores (Lighthouse) have high variance (post-deploy runs scored 38 → 61 → 43). Real verification is CrUX field data from Google Search Console after 2-4 weeks. A PSI run against a cold function can show LCP 14.4s while a warm Lighthouse run shows 1.6s. A single PSI result is one sample, so always re-run.
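
One simple way to act on "always re-run" is to collect several lab runs and report the median with the spread instead of any single score. This is a sketch of that idea, not something either case study is stated to have used; the function names are hypothetical.

```typescript
// Median of a non-empty list of scores; the input array is not mutated.
function median(values: number[]): number {
  if (values.length === 0) throw new Error("median of an empty list is undefined");
  const sorted = [...values].sort((a, b) => a - b);
  const mid = Math.floor(sorted.length / 2);
  return sorted.length % 2 === 1 ? sorted[mid] : (sorted[mid - 1] + sorted[mid]) / 2;
}

// Summary of repeated lab runs: the median plus the min-to-max spread.
function summarizeRuns(scores: number[]): { median: number; spread: number } {
  return {
    median: median(scores),
    spread: Math.max(...scores) - Math.min(...scores),
  };
}
```

For the three post-deploy scores quoted above, summarizeRuns([38, 61, 43]) gives a median of 43 with a spread of 23 points, which makes the noise visible next to the headline number.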

🧩 Decision: one big PR vs many small ones

| Factor | One-PR (portfolio) | Multi-PR (Qamera) |
| --- | --- | --- |
| Maintainers | 1 | 2+ |
| Risk of file conflicts | low | high |
| Review cycle | self-review | code review by co-owner |
| Time distribution | one afternoon | 5 working days |
| Rollback granularity | all or nothing | per-feature |
| Dev environment | one | worktree + separate node_modules |

Threshold: thematic cohesion + <500 lines of diff + single maintainer → one PR. Disjoint file sets + multi-dev + monorepo → multi-PR with worktrees.

🧩 Meta-lessons

  1. Audits find bugs outside their scope, e.g. a self-referencing alternateSlug in the blog-article-writer skill. The fix chain: data fix → code defense → process fix (a rule in .claude/rules/)
  2. AI workflow creates bugs, AI workflow fixes them: a self-correcting loop, provided there is a process
  3. The second project takes ~30% of the time of the first, provided the patterns are documented (this note)
  4. Time compression is multiplicative: code-based stack × good AI workflow = hours. Either alone is not enough.

Source: _raw/inbox/2026-04-22-portfolio-seo-improvements-brief.md + _raw/inbox/2026-04-22-qamera-seo-foundation-case-study.md