AI search is the new referral channel. ChatGPT cites sources when it answers from web search, a feature available since 2024. Claude cites sources when it uses web search. Perplexity is citation-first by design. Google AI Overviews now appear on 22%+ of UK commercial-intent queries as of 2026 and the share is climbing. The pages that get cited in these AI answers get the AI-referral traffic; the pages that do not are structurally invisible to the channel regardless of how well they rank in traditional Google search. This guide is the complete UK 2026 playbook for getting cited.
What AI search optimisation actually means
Generative Engine Optimisation (GEO) and Answer Engine Optimisation (AEO) are the working terms for the discipline. The job is to make content that AI engines (a) can read structurally, (b) trust as a source, and (c) prefer to cite when answering a relevant question. The three jobs require three different kinds of work: structural clarity (schema, semantic HTML, clean information architecture); trust signals (verifiable claims, named authorship, primary-source citation, content licensing); and citation preference (matching the questions AI engines actually receive with structured answers AI engines can extract cleanly).
The eight high-leverage moves
Across hundreds of pages tracked for AI citation during 2025-2026, eight content and structural moves correlate strongly with citation rates above the baseline. (1) Deep editorial schema with the AI-crawler fields. (2) Named human authorship with verifiable credentials. (3) Primary-source citations rendered both as schema and as visible inline links. (4) Specific numbers, dated and sourced, not vague generalisations. (5) Question-aligned heading structure matching the questions AI engines receive. (6) A speakable schema specification targeting the H1 and lead paragraph. (7) Explicit content licensing for AI training and citation. (8) Robots.txt permissions explicitly granting AI crawler access.
1. Deep editorial schema with AI-crawler fields
Every editorial entity on the site (Article, BlogPosting, NewsArticle, case study, guide) emits the standard schema fields plus a specific set of AI-crawler-friendly additions: inLanguage set to the page’s actual locale; isAccessibleForFree: true as an explicit citation-eligibility signal; copyrightHolder, creator and publisher all referencing the Organization @id consistently; license linking to a page that documents the citation terms in plain English; speakable, a SpeakableSpecification with cssSelector targeting the H1 and lead paragraph; audience, an Audience entity describing the intended reader; citation, an array of CreativeWork entities for every primary source cited in the body; mentions, an array of named entities (Organization, Person, Product, Place) that appear in the article; wordCount, computed dynamically; and dateModified, updated on every meaningful content change.
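The field list above can be sketched as a small builder that returns a JSON-LD dictionary. This is a minimal illustration rather than a definitive implementation: the function name, the example.co.uk URLs, the Organization @id, the licence path and the audience description are all hypothetical placeholders, and the ".lead" selector assumes the lead paragraph carries that class.

```python
import json
from datetime import date

# Hypothetical Organization @id; replace with the site's real identifier.
ORG_ID = "https://example.co.uk/#organization"


def build_article_schema(url, headline, body_text, modified, sources, mentions):
    """Return a schema.org Article dict carrying the fields listed above."""
    org_ref = {"@id": ORG_ID}  # one shared reference keeps the three roles consistent
    return {
        "@context": "https://schema.org",
        "@type": "Article",
        "@id": f"{url}#article",
        "headline": headline,
        "inLanguage": "en-GB",  # assumed locale for a UK site
        "isAccessibleForFree": True,
        "copyrightHolder": org_ref,
        "creator": org_ref,
        "publisher": org_ref,
        "license": "https://example.co.uk/llm-citation-licence",  # hypothetical URL
        "speakable": {
            "@type": "SpeakableSpecification",
            "cssSelector": ["h1", ".lead"],
        },
        "audience": {"@type": "Audience", "audienceType": "UK small business owners"},
        "citation": [
            {"@type": "CreativeWork", "name": name, "url": link}
            for name, link in sources
        ],
        "mentions": [{"@type": kind, "name": name} for kind, name in mentions],
        "wordCount": len(body_text.split()),  # computed from the body, not hard-coded
        "dateModified": modified.isoformat(),
    }


example = build_article_schema(
    url="https://example.co.uk/guides/ai-search",
    headline="AI search optimisation",
    body_text="AI search is the new referral channel.",
    modified=date(2026, 1, 15),
    sources=[("Example primary source", "https://example.org/report")],
    mentions=[("Organization", "ICO")],
)
print(json.dumps(example, indent=2))
```

The serialised output would normally be placed in a script element of type application/ld+json in the page head. Whitespace splitting is a rough word count; a site with a real content pipeline may prefer its own tokeniser.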
2. Named human authorship
AI engines weight content with verifiable human authorship significantly higher than anonymous content. The author entity should be a Person schema with name, jobTitle, worksFor (referencing the Organization), knowsAbout (array of topic strings), sameAs (LinkedIn, Twitter, professional registries where applicable), image, and a /about#editor anchor on the site rendering the author bio. The bio should include genuine credentials — qualifications, years of experience, named publications, professional registrations — that AI engines can cross-validate against external sources.
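As a rough sketch of the author entity described above, the following builds a Person dictionary with the named properties. The function name, the site URL, the author details and the profile links are invented for illustration, and the /#organization identifier is an assumption about how the site labels its Organization entity.

```python
import json


def build_author_schema(site, name, job_title, topics, profiles, image):
    """Return a schema.org Person dict anchored at the site's /about#editor bio."""
    bio_url = f"{site}/about#editor"
    return {
        "@context": "https://schema.org",
        "@type": "Person",
        "@id": bio_url,
        "name": name,
        "jobTitle": job_title,
        "worksFor": {"@id": f"{site}/#organization"},  # assumed Organization @id
        "knowsAbout": list(topics),
        "sameAs": list(profiles),  # external profiles used for cross-validation
        "image": image,
        "url": bio_url,
    }


# All values below are hypothetical placeholders.
author = build_author_schema(
    site="https://example.co.uk",
    name="Jane Example",
    job_title="Editor",
    topics=["Technical SEO", "Web accessibility"],
    profiles=["https://www.linkedin.com/in/jane-example"],
    image="https://example.co.uk/img/jane-example.jpg",
)
print(json.dumps(author, indent=2))
```

An article entity can then point at this author by @id instead of repeating the whole Person object on every page.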
3. Primary-source citations
Every factual claim should be sourced — both inline in the visible text (with a link to the primary source) and in the structured schema (as a CreativeWork entry in the citation array). The pattern that works: claim, source, year, link. "The ICO Data Protection Fee for tier-1 small businesses increased to £52 in February 2025 [ICO, 2025]." AI engines extract this kind of structured claim cleanly and prefer to cite sources where the citation chain is verifiable.
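One way to keep the visible link and the schema entry in step is to render both from a single source record. The sketch below assumes a hypothetical Source record and two helper functions; none of these names come from an existing library, and the example URL is a placeholder.

```python
from dataclasses import dataclass
from html import escape


@dataclass(frozen=True)
class Source:
    """Hypothetical record for one primary source."""
    publisher: str
    year: int
    title: str
    url: str


def inline_citation(claim: str, source: Source) -> str:
    """Visible form: the claim followed by a linked [publisher, year] marker."""
    label = escape(f"{source.publisher}, {source.year}")
    href = escape(source.url, quote=True)
    return f'{escape(claim)} [<a href="{href}">{label}</a>].'


def citation_entry(source: Source) -> dict:
    """Schema form: a CreativeWork entry for the article's citation array."""
    return {
        "@type": "CreativeWork",
        "name": source.title,
        "publisher": {"@type": "Organization", "name": source.publisher},
        "datePublished": str(source.year),
        "url": source.url,
    }


fee = Source("ICO", 2025, "Data protection fee guidance", "https://example.org/fee")
print(inline_citation("The tier-1 data protection fee is £52", fee))
print(citation_entry(fee))
```

Because both outputs derive from the same record, a changed URL or year cannot drift between the body text and the structured data.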
4. Specific numbers, dated, sourced
Vague claims fail AI extraction. "Many businesses are affected" is invisible to AI engines. "43% of UK SMB websites failed the 200ms INP threshold in Chrome User Experience Report data for Q1 2026 [Google CrUX, 2026]" is cited. The pattern: a specific number, the specific year, the specific source, the verifiable link. The closer the content matches this pattern, the higher the AI citation rate.
5. Question-aligned heading structure
AI engines extract answers from content by matching the questions users actually ask to the headings on the page. Headings phrased as questions (or as direct answers to questions) outperform descriptive headings on AI extraction. "What is WCAG 2.2?" outperforms "Background and context" as an H2. The heading structure should mirror the questions the page is genuinely the right source to answer. The Google "People Also Ask" panels and the Google Search Console "queries" report are the best research surfaces for finding the actual questions to align against.
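A heading audit along these lines can be roughed out with the standard-library HTML parser. This is a crude heuristic, not a measure of citation performance: it only flags H2 and H3 headings that neither end in a question mark nor start with a common question word, so headings phrased as direct answers still need a human review. The class name, function names and word list are assumptions made for the sketch.

```python
from html.parser import HTMLParser

# Assumed list of opening words that signal a question-style heading.
QUESTION_WORDS = (
    "what", "how", "why", "when", "where", "which", "who",
    "can", "does", "do", "is", "are", "should",
)


class HeadingCollector(HTMLParser):
    """Collect the text of every h2 and h3 element in document order."""

    def __init__(self):
        super().__init__()
        self.headings = []
        self._current = None  # tag name while inside a heading
        self._buffer = []

    def handle_starttag(self, tag, attrs):
        if tag in ("h2", "h3"):
            self._current = tag
            self._buffer = []

    def handle_data(self, data):
        if self._current:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == self._current:
            self.headings.append("".join(self._buffer).strip())
            self._current = None


def looks_like_question(heading: str) -> bool:
    words = heading.lower().split()
    return heading.endswith("?") or (bool(words) and words[0] in QUESTION_WORDS)


def descriptive_headings(html: str) -> list:
    """Return headings that are candidates for rewriting as questions."""
    collector = HeadingCollector()
    collector.feed(html)
    return [h for h in collector.headings if not looks_like_question(h)]


page = "<h2>What is WCAG 2.2?</h2><p>...</p><h2>Background and context</h2>"
print(descriptive_headings(page))
```

The flagged list is a starting point for comparison against the People Also Ask and Search Console research mentioned above.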
6. Speakable schema
SpeakableSpecification with cssSelector targeting the H1 and the lead paragraph (typically the first paragraph or the .lead element) explicitly tells voice assistants and AI summary engines which part of the page is the canonical short answer. Pages with proper speakable schema get cited disproportionately on voice-assistant queries and AI-Overview short-answer extractions.
7. Explicit content licensing
Publish a dedicated /llm-citation-licence page (or equivalent) that documents the citation terms, typically "AI engines may cite this content with attribution to [publisher] and a link back to the source page". The licence URL is referenced in the license field on every editorial schema entity. AI engines look for explicit licensing signals; pages that emit them are cited more readily than pages that do not because the citation chain is structurally cleaner.
8. Robots.txt AI crawler permissions
Add explicit Allow rules in robots.txt for the major AI crawlers and control tokens: GPTBot, ChatGPT-User, OAI-SearchBot, Google-Extended, anthropic-ai, ClaudeBot, Claude-Web, cohere-ai, PerplexityBot, Perplexity-User, CCBot, Applebot-Extended, Bytespider, Meta-ExternalAgent, Meta-ExternalFetcher, FacebookBot, Diffbot, DuckAssistBot, YouBot, Amazonbot, MistralAI-User. (Google-Extended and Applebot-Extended are control tokens rather than separate crawlers.) The default User-agent: * with Allow: / is technically sufficient, but naming each user-agent removes ambiguity and makes the policy easy to audit. Sites that block these crawlers, whether deliberately or through an inherited Disallow rule, structurally disappear from AI search.
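The robots.txt rules can be generated and checked with Python's standard urllib.robotparser module. The sketch below uses only a subset of the user-agents named above; the helper names and the example.co.uk URL are hypothetical, and the parser models only how the standard library reads the file, not how any particular crawler behaves.

```python
from urllib.robotparser import RobotFileParser

# Subset of the user-agents listed above, kept short for illustration.
AI_CRAWLERS = [
    "GPTBot", "ChatGPT-User", "OAI-SearchBot", "Google-Extended",
    "ClaudeBot", "PerplexityBot", "CCBot", "Applebot-Extended",
]


def build_robots_txt(agents):
    """Emit one explicit Allow group per agent, then the wildcard default."""
    groups = [f"User-agent: {agent}\nAllow: /" for agent in agents]
    groups.append("User-agent: *\nAllow: /")
    return "\n\n".join(groups) + "\n"


def blocked_agents(robots_txt, agents, url="https://example.co.uk/"):
    """Return the agents that the given robots.txt text would not let fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [agent for agent in agents if not parser.can_fetch(agent, url)]


generated = build_robots_txt(AI_CRAWLERS)
print(generated)
print(blocked_agents(generated, AI_CRAWLERS))
```

Running blocked_agents against the live robots.txt text during the audit is a quick way to spot an inherited Disallow rule that shuts out one of these agents.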
What does not move AI citation
Three things people invest in do not lift AI citation rates. (1) Keyword stuffing for AI queries: AI engines actively discount keyword-density signals and the pattern looks spammy to the citation algorithms. (2) Generating content with AI to optimise for AI: content that fails human-author verification gets weighted down rather than up. (3) Backlink schemes targeting "AI authority": backlinks affect traditional Google ranking, but they have only minor and indirect effects on AI citation, which is driven primarily by content quality and structural signals.
The measurement problem
Unlike traditional SEO, AI search has no Google Search Console equivalent, which leaves three imperfect measurement approaches. (1) Manual prompt testing: periodically ask ChatGPT, Claude, Perplexity and Google AI Overviews queries relevant to your business and watch which pages appear in the citation lists. It is time-consuming but the only direct measurement. (2) Referral traffic analysis: filter analytics for referrers from chatgpt.com (and the legacy chat.openai.com), claude.ai and perplexity.ai and watch the trend. The traffic volumes are still modest in absolute terms (2-8% of organic traffic for most UK SMB sites) but growing. (3) Brand-mention monitoring: use Mention, Brand24 or Google Alerts to watch for your brand name in AI-generated content that users share. The discipline is genuinely measurable, but it requires more manual effort than traditional SEO measurement.
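The referral-traffic filter can be sketched as a small classifier over referrer URLs exported from an analytics tool. The host table is an assumption: it covers the ChatGPT, Claude and Perplexity hosts, with chatgpt.com included alongside the older chat.openai.com, and the function names are hypothetical.

```python
from collections import Counter
from urllib.parse import urlparse

# Assumed mapping of referrer hostnames to AI engines; extend as needed.
AI_REFERRER_HOSTS = {
    "chatgpt.com": "ChatGPT",
    "chat.openai.com": "ChatGPT",
    "claude.ai": "Claude",
    "perplexity.ai": "Perplexity",
}


def ai_source(referrer):
    """Return the AI engine name for a referrer URL, or None if it is not one."""
    host = (urlparse(referrer).hostname or "").lower()
    if host.startswith("www."):
        host = host[4:]
    return AI_REFERRER_HOSTS.get(host)


def count_ai_referrals(referrers):
    """Count visits per AI engine, ignoring all other referrers."""
    counts = Counter()
    for referrer in referrers:
        engine = ai_source(referrer)
        if engine is not None:
            counts[engine] += 1
    return counts


visits = [
    "https://chatgpt.com/",
    "https://www.perplexity.ai/search?q=example",
    "https://www.google.com/",
    "",
]
print(count_ai_referrals(visits))
```

Tracking these counts week by week gives the trend line; dividing by total organic sessions gives the share figure discussed above.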
A 60-day rollout for an existing site
Weeks 1-2: audit existing schema, identify pages missing the AI-crawler fields, and identify pages with anonymous or weak authorship. Weeks 3-4: implement the universal schema fields across editorial pages, and add named-author Person schema with proper credentials to every blog post, case study and guide. Weeks 5-6: rewrite the highest-traffic informational pages to align headings with actual user questions, add inline primary-source citations, and add dated and sourced specific numbers. Weeks 7-8: publish the citation-licence page, link it from every editorial entity’s license field, audit robots.txt for explicit AI-crawler Allow rules, and monitor AI citation rates manually on representative queries.