CI/CD shows deploy succeeded -- why do I still need a VERIFIED Gate?

CI/CD's 'deploy succeeded' only verifies that bytes reached the server -- file transfer completed, service restarted, port listening. It does not verify that content is correct. Typical scenarios: nginx returns 200 but the actual page is the default welcome page; canonical URL incorrectly points to the staging domain; hreflang tags are missing so Google doesn't recognize the bilingual structure; JSON-LD has a formatting error so rich search results don't appear. The VERIFIED Gate runs after CI/CD, verifying deployment results from the 'content consumer's' perspective, filling the verification gap between the transport layer and the application layer.

What dimensions does the VERIFIED Gate check?

The VERIFIED Gate covers 10+ independent verification dimensions: HTTP status code (each page returns 200), Content-Type (text/html; charset=utf-8), page size (greater than 1KB to exclude empty pages), Canonical URL (self-referencing and correct), Hreflang (bidirectional cross-referencing + x-default), JSON-LD (Article + BreadcrumbList + FAQPage all three structured data types), FAQ visibility (details/summary elements exist and are expandable), Security headers (HSTS, X-Content-Type-Options, X-Frame-Options, Referrer-Policy, Permissions-Policy), Sitemap (both zh/en URLs present), and Homepage cards (article links appear in the homepage card list). Each dimension is independently judged PASS/FAIL; the overall Gate only passes when all dimensions pass.

What is the nginx add_header inheritance trap?

nginx's add_header directive has a counterintuitive behavior: when you set security headers in a server block using add_header, and then use add_header in a location block (even for completely different headers), all add_header directives from the server block are overwritten -- not merged. This means pages under certain paths completely lack security headers even though global configuration 'appears correct.' The VERIFIED Gate's security header check independently validates each URL, exposing this inheritance trap.

How do I recover when the VERIFIED Gate fails?

Choose a recovery path based on the failing dimension: HTTP status failure (404/500) -> block the release, fix nginx config and file paths, redeploy; Canonical/Hreflang failure -> medium priority, fix the HTML and redeploy with re-verification; JSON-LD format error -> medium priority, fix structured data and redeploy; Security header missing -> high priority (security impact), fix nginx config then reload and re-verify; Sitemap/Homepage card missing -> low priority, fix and redeploy. All recovery paths require the complete fix -> redeploy -> re-verify cycle. Skipping re-verification and directly marking as PASS is not allowed.

Who executes the VERIFIED Gate in the Agent release pipeline?

In xslyl.com's Agent release pipeline, the VERIFIED Gate is a two-layer verification system. Layer 1 is the live online verifier — a deploy-time script that issues HTTPS requests to the production server, runs the 10+ dimension checks (HTTP status, canonical, hreflang, JSON-LD, security headers, sitemap, homepage cards, etc.), and writes the structured results to verify-report.json. Layer 2 is the check_verified_gate.py script, which is a repository-side evidence validator — it does NOT re-run the online checks itself. Instead, it reads the pre-recorded evidence: it validates that deploy-report.json, verify-report.json, status.json, and final-report.md all exist and are internally consistent; it checks that verify-report.json declares all_online_verified=true with zero failures; it validates three-way URL consistency across status.json, final-report.md, sitemap.xml, and the homepage index files; it confirms the deployment method (standard or emergency) matches the evidence; it invokes the security header checker (check_security_headers.py) as a subprocess; and it runs the policy acknowledgment checker (check_policy_ack.py) to ensure command approval rules were followed. Both layers are orchestrated by Codex, which runs the scripts, collects the deterministic output, and writes the final verdict. Codex cannot self-judge gate results — it must rely on the script's deterministic output. Hermes-Agent reads the final artifacts and reports the status to the user.
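The evidence-validation idea behind Layer 2 can be illustrated with a minimal sketch. This is not the actual check_verified_gate.py: the function name, the directory layout, and the `failures` field are assumptions made for illustration; only the four file names and the `all_online_verified` flag come from the description above.

```python
import json
from pathlib import Path

# The four evidence files named above; where they live on disk is an assumption.
REQUIRED_FILES = ["deploy-report.json", "verify-report.json", "status.json", "final-report.md"]


def validate_evidence(evidence_dir: str) -> list[str]:
    """Return a list of problems; an empty list means the recorded evidence is acceptable."""
    root = Path(evidence_dir)
    problems = [f"missing: {name}" for name in REQUIRED_FILES if not (root / name).is_file()]
    report_path = root / "verify-report.json"
    if not report_path.is_file():
        return problems

    try:
        report = json.loads(report_path.read_text(encoding="utf-8"))
    except json.JSONDecodeError as exc:
        return problems + [f"verify-report.json is not valid JSON: {exc.msg}"]
    if not isinstance(report, dict):
        return problems + ["verify-report.json is not a JSON object"]

    # Require an explicit boolean true; a missing or merely truthy value fails.
    if report.get("all_online_verified") is not True:
        problems.append("all_online_verified is not true")
    # "failures" is a hypothetical field name for the list of failed checks.
    if report.get("failures"):
        problems.append(f"{len(report['failures'])} recorded failure(s)")
    return problems
```

The validator only reads files and never touches the network, which is what keeps its output deterministic for the orchestrating agent.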

What's the difference between VERIFIED Gate and CI/CD smoke tests?

CI/CD smoke tests typically only verify whether the service is alive (port listening, health check endpoint returns 200); they don't verify business content correctness. The VERIFIED Gate differs in: (1) Number of dimensions -- smoke tests usually 1-3 checks, VERIFIED Gate 10+; (2) Check depth -- smoke tests ask 'is the service alive?', VERIFIED Gate asks 'is the content correct?'; (3) Content awareness -- smoke tests don't parse HTML, VERIFIED Gate parses canonical, hreflang, JSON-LD, FAQ structures; (4) Cross-page consistency -- smoke tests check a single endpoint, VERIFIED Gate checks cross-references between pages (bidirectional hreflang, sitemap URLs vs online URLs); (5) Result semantics -- smoke test result is 'deployable', VERIFIED Gate result is 'deployed and correct.'

Agent VERIFIED Deployment Gate Design: Post-Deploy Authenticity Checks

2026-06-07 · Intermediate-Advanced

⚡ 30-Second Takeaway

CI/CD's "deploy succeeded" only verifies bytes arrived, not that content is correct -- these are two fundamentally different classes of checks
The VERIFIED Gate automates online content verification through 10+ independent dimensions -- HTTP status, canonical, hreflang, JSON-LD, security headers, sitemap, homepage cards, commit consistency
Core definition: the VERIFIED Gate is a deterministic, idempotent post-deployment check that verifies deployed content across N independent dimensions, each outputting PASS/FAIL independently, with the Gate passing only when all dimensions pass
Gate failure is not a disaster -- having a clear recovery path (fix → redeploy → re-verify) is far safer than "not knowing whether there's a problem"

1. Introduction: Why CI/CD "Deploy Succeeded" Is Not Enough

The CI/CD pipeline has gone green. GitHub Actions displays ✅ deploy succeeded. rsync returned exit code 0, nginx reloaded successfully, the port is listening. From an operations perspective, the deployment encountered no errors.

But search engines see it differently.

Two days later, you discover the newly published English article is appearing in Chinese search results, while the Chinese version was never indexed at all. Investigation reveals: the canonical URL incorrectly points to the English version -- the Chinese page's <link rel="canonical"> contains /en/posts/.... Meanwhile, hreflang tags only have one-way references: the English page declares a Chinese alternate version, but the Chinese page does not declare English as its alternate. Google therefore cannot recognize the bilingual structure, and the Chinese page becomes an orphan.

Worse: CI/CD is completely unaware of any of this. Its job is to verify "did the bytes reach the server?" -- file transfer completed, service restarted, port listening. It does not parse HTML, does not read canonical tags, does not care about hreflang bidirectionality. These semantic-level errors are entirely invisible to CI/CD.

The gap lies between two layers of verification:

Transport-layer verification (CI/CD's domain): Did the files arrive? Is the service up? Is the port listening? -- Answers "is the deployment complete?"
Content-layer verification (VERIFIED Gate's domain): Is the canonical URL pointing correctly? Are hreflang tags bidirectional? Is JSON-LD parseable? Are security headers active? -- Answers "is the deployment correct?"

For content produced by human authors, this gap is often invisible -- because humans preview and inspect repeatedly during development, catching most semantic errors locally before they reach production. But for Agent-produced content, the situation is entirely different. An Agent can produce structurally complete, syntactically valid HTML that nevertheless contains subtle semantic errors: the canonical URL points to a staging domain instead of production, hreflang tags were only generated for one language, JSON-LD URLs use a localhost prefix. These errors are completely invisible during local preview -- the page renders fine, the HTML structure is valid -- and only surface after deployment to production, when search engines parse them.

This is the crux of the problem: the Agent's responsibility ends at commit; but the verification responsibility begins after deployment. The Agent writes the code, submits the PR, merges the branch -- its work is done. But whether the content deployed to production is correct requires independent verification from the perspective of the "content consumer," running against the real production environment.

The VERIFIED Gate proposed in this article is designed precisely for this purpose: a deterministic post-deployment gating check that verifies the integrity and correctness of deployed content across 10+ independent dimensions. It is not a replacement for CI/CD -- it is a supplementary verification layer that sits after CI/CD, oriented toward content semantics. Only when all dimensions pass is the deployment marked as VERIFIED -- truly "deployed and correct."

This article is part of the Agent Release & Operations Series. The series begins with Agent Release Gate Design (which defines the complete 8-stage release pipeline -- the VERIFIED Gate is Phase 8, the final stage). The VERIFIED Gate's online verification capability depends on the monitoring signal framework established in Agent Observability -- post-deployment checks are essentially one concentrated consumption of observability signals.

To understand why the VERIFIED Gate is specifically needed in Agent-driven pipelines, compare it with a human-driven deployment workflow. When a human deploys an article, they visually verify the result: they open the production URL in a browser, scroll through the content, check that links work, confirm the title displays correctly, and often manually test the canonical URL with browser developer tools. This verification happens organically as part of the human deployment process — it is implicit, unsystematic, but generally effective. An Agent pipeline has no such organic verification: the deploy script returns exit 0, and no human ever opens the production page unless a problem is reported days later by search engines or analytics. The VERIFIED Gate's role is to make this implicit human verification explicit and automated: every dimension a human would intuitively check is encoded as a deterministic pass/fail check that runs on every deployment. It does not replace human oversight — it ensures that the oversight happens systematically rather than relying on the intermittent attention of a busy operator.

Below are the 10 core verification dimensions covered by the VERIFIED Gate. Each dimension is independently judged PASS/FAIL:

Dimension	What It Checks	Failure Mode
HTTP Status	Each page returns 200 OK, no redirect loops	404/500/301 loop -- CI/CD shows deploy succeeded but file path is wrong
Content-Type	text/html; charset=utf-8	Plain text / binary output -- nginx mime.types not loaded or path matching error
Page Size	Greater than 1KB	Empty page / error page -- rsync interrupted or placeholder file overwrote real content
Canonical	Self-referencing, URL complete with https protocol	Cross-language / cross-domain pointing -- Agent generated staging URL or http protocol
Hreflang	Bidirectional cross-reference + x-default pointing to en	One-way / missing / wrong target -- Agent only generated zh version or URL path is incorrect
JSON-LD	Article + BreadcrumbList + FAQPage all three types present and valid	Missing / malformed -- Agent omitted a structured data type or has JSON syntax errors
FAQ Visibility	details/summary elements exist and are interactive	FAQ content exists but is CSS-hidden, or uses non-standard markup making it invisible
Security Headers	HSTS / X-Content-Type-Options / X-Frame-Options / Referrer-Policy / Permissions-Policy all returned	Missing / nginx inheritance trap -- server block headers overwritten by location block
Sitemap	Both zh/en URLs present in sitemap.xml	URL missing / format wrong -- sitemap not updated after deployment or old version overwritten by rsync
Homepage Cards	Article link appears in the card lists on zh/index.html and en/index.html	Link missing / broken -- an older version of index.html was deployed

2. The Verification Dimension Matrix

The 10 dimensions above form the VERIFIED Gate's core check matrix. Each dimension is independently judged PASS/FAIL and contributes to the overall gate verdict. Below are concise descriptions of what each dimension validates and why it catches failures that CI/CD cannot.

2.1 HTTP Status (200)

Each page must return HTTP 200 without redirect chains. CI/CD smoke tests typically follow redirects, masking 301 loops and trailing-slash mismatches that degrade SEO ranking weight. The VERIFIED Gate checks the initial response code directly using curl -fsS -o /dev/null -w '%{http_code}' without following redirects, catching URL structure issues that CI/CD silently passes. A non-200 response — 404, 500, or 301 — is an immediate gate failure regardless of other dimensions.

2.2 Content-Type

The response must serve text/html; charset=utf-8. An nginx mime.types misconfiguration can cause HTML files to be served as application/octet-stream, triggering browser download dialogs instead of rendering — and search engines may treat the page as plain text, losing all SEO signals. The charset=utf-8 parameter is critical for pages with multi-byte UTF-8 characters (CJK, emoji), which render as garbled text under Latin-1 decoding when the charset is absent.
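A minimal sketch of a Content-Type check that also enforces the charset, assuming the header value has already been read from the response. The function names are illustrative; the parsing relies on the standard library's email.message.Message, which lowercases both the media type and the charset.

```python
from email.message import Message
from typing import Optional, Tuple


def parse_content_type(header_value: str) -> Tuple[str, Optional[str]]:
    """Split a Content-Type header into (media type, charset), both lowercased."""
    msg = Message()
    msg["Content-Type"] = header_value
    return msg.get_content_type(), msg.get_content_charset()


def content_type_ok(header_value: str) -> bool:
    """PASS only for text/html with an explicit utf-8 charset."""
    media_type, charset = parse_content_type(header_value)
    return media_type == "text/html" and charset == "utf-8"
```

Under this sketch a bare text/html response fails, which is stricter than a substring match and reflects the point above that a missing charset can garble CJK pages.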

2.3 Page Size

Response body must exceed 1KB to exclude empty or placeholder pages. An interrupted rsync, an nginx default welcome page served with a 200, or an SPA fallback returning a minimal <div id="app"></div> all produce sub-1KB responses. This lightweight threshold detects "empty shell" situations -- the server says 200 but the content never arrived -- without needing to parse the HTML body. For xslyl.com's static article pages with inline CSS and structured data, the 1KB threshold is safe; SPA architectures should lower it to 256 bytes and supplement with additional checks.

2.4 Canonical Self-Reference

The canonical URL must strictly equal the current page URL — self-referencing, not merely present. The most destructive failure mode is cross-language leakage: the zh page's canonical pointing to the en URL, causing Google to treat the zh page as duplicate content and drop it from the index entirely. Other common Agent-induced errors include staging domain residue (http://localhost:3000/...), http:// protocol instead of https://, and path divergence (missing .html extension or trailing-slash mismatch). A canonical mismatch is a silent SEO disaster — the page is fully accessible and renders correctly, but search engines refuse to index it.

2.5 Hreflang Bidirectional + x-default

Bilingual pages must cross-reference each other bidirectionally — the zh page declares an en alternate, and the en page reciprocally declares a zh alternate — with an x-default pointing to English on both pages. Google's documentation explicitly states that one-way hreflang annotations are completely ignored (not downgraded — ignored), making this one of the rare all-or-nothing rules in SEO. Cross-page validation is required: you cannot verify hreflang correctness by inspecting a single page in isolation — you must issue two HTTP requests, extract hreflang tags from both, and cross-validate the references.

2.6 JSON-LD Structured Data

Three independent JSON-LD blocks — Article, BreadcrumbList, and FAQPage — must all be present and parseable by a JSON parser. Missing BreadcrumbList silently drops breadcrumbs from search results (costing ~5–10% CTR), while missing FAQPage forfeits one of the highest-CTR rich result formats available. JSON syntax errors — trailing commas, unescaped quotes — cause the entire block to be silently ignored by Google with zero visible error indication, making parse-validation a mandatory gate check for agent-authored content.

2.7 FAQ Visibility

FAQ content declared in JSON-LD must match user-visible HTML using semantic <details>/<summary> elements. Google's structured data guidelines require declared FAQ content to be visible to users -- if JSON-LD lists 6 questions but HTML shows only 3 or hides them with display:none, rich results may be disabled. The <details>/<summary> pair is the native HTML5 disclosure widget: collapsed answers stay in the page markup as regular content, which distinguishes it from generic <div>-based implementations that hide or inject answers with CSS and script.
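One way to sketch the correspondence check is to compare the question names declared in the FAQPage block against the text of the <summary> elements found inside <details>. The class and function names here are hypothetical, and the sketch assumes each JSON-LD block is a single top-level object with questions listed under mainEntity. It does not detect CSS hiding such as display:none, which would need a rendered check.

```python
import json
from html.parser import HTMLParser


class _FaqCollector(HTMLParser):
    """Collect JSON-LD script bodies and the text of <summary> elements inside <details>."""

    def __init__(self):
        super().__init__()
        self.jsonld_blocks, self.summaries = [], []
        self._in_jsonld = self._in_summary = False
        self._details_depth = 0

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "script" and attrs.get("type") == "application/ld+json":
            self._in_jsonld = True
            self.jsonld_blocks.append("")
        elif tag == "details":
            self._details_depth += 1
        elif tag == "summary" and self._details_depth > 0:
            self._in_summary = True
            self.summaries.append("")

    def handle_endtag(self, tag):
        if tag == "script":
            self._in_jsonld = False
        elif tag == "details":
            self._details_depth = max(0, self._details_depth - 1)
        elif tag == "summary":
            self._in_summary = False

    def handle_data(self, data):
        if self._in_jsonld:
            self.jsonld_blocks[-1] += data
        elif self._in_summary:
            self.summaries[-1] += data


def faq_questions_missing_from_html(html: str) -> list[str]:
    """Return question names declared in FAQPage JSON-LD but absent from <summary> text."""
    collector = _FaqCollector()
    collector.feed(html)
    # Normalize whitespace so line breaks in the markup do not cause false mismatches.
    visible = {" ".join(s.split()) for s in collector.summaries}
    missing = []
    for block in collector.jsonld_blocks:
        try:
            data = json.loads(block)
        except json.JSONDecodeError:
            continue  # malformed blocks belong to the JSON-LD dimension
        if not (isinstance(data, dict) and data.get("@type") == "FAQPage"):
            continue
        for item in data.get("mainEntity", []):
            if isinstance(item, dict):
                name = " ".join(str(item.get("name", "")).split())
                if name not in visible:
                    missing.append(name)
    return missing
```

An empty return value means every declared question has a visible counterpart; any returned name is a candidate FAIL for this dimension.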

2.8 Security Headers

Five security response headers — HSTS, X-Content-Type-Options, X-Frame-Options, Referrer-Policy, and Permissions-Policy — must all be returned with correct values. The nginx add_header inheritance trap is the most insidious failure mode: any add_header directive in a location block silently overwrites all add_header directives from the parent server block (not merges). This means a single Cache-Control header in a location block can strip all 5 security headers from every page under that path — a problem only detectable by per-URL independent verification.
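A per-URL presence check could be sketched as follows. The function names are illustrative, and the input is assumed to be a mapping of response header names to values for each URL (for example, the headers of an httpx response). The sketch verifies presence only, not that each value is correct.

```python
# The five headers listed above, lowercased for case-insensitive comparison.
REQUIRED_SECURITY_HEADERS = (
    "strict-transport-security",
    "x-content-type-options",
    "x-frame-options",
    "referrer-policy",
    "permissions-policy",
)


def missing_security_headers(response_headers: dict) -> list[str]:
    """Return the required headers that are absent from one response."""
    present = {name.lower() for name in response_headers}
    return [h for h in REQUIRED_SECURITY_HEADERS if h not in present]


def check_urls(headers_by_url: dict) -> dict:
    """Evaluate every URL independently so a location-level override cannot hide."""
    return {url: missing_security_headers(h) for url, h in headers_by_url.items()}
```

Because each URL is judged on its own response, a path whose location block adds only Cache-Control shows up with all five headers missing even when the homepage passes.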

2.9 Sitemap URL Presence

Both zh and en article URLs must appear in sitemap.xml, the primary discovery entry point for search engine crawlers. If sitemap regeneration fails silently after deployment, the article exists online but Google may take weeks to discover it through internal links alone. Asymmetric sitemap entries — zh URL present but en URL absent, or vice versa — mean one language version is immediately discoverable while the other waits for natural crawling.
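A minimal sketch of the sitemap presence check, assuming the sitemap body has already been downloaded and uses the standard sitemaps.org namespace. The function names are illustrative.

```python
import xml.etree.ElementTree as ET

# Standard sitemap protocol namespace.
SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def sitemap_urls(sitemap_xml: str) -> set[str]:
    """Return every <loc> value found in the sitemap."""
    root = ET.fromstring(sitemap_xml)
    return {loc.text.strip() for loc in root.findall(".//sm:loc", SITEMAP_NS) if loc.text}


def check_sitemap(sitemap_xml: str, zh_url: str, en_url: str) -> dict:
    """Report each language separately so asymmetric entries are visible."""
    urls = sitemap_urls(sitemap_xml)
    return {"has_zh": zh_url in urls, "has_en": en_url in urls}
```

Reporting the two languages separately makes the asymmetric case described above explicit instead of collapsing it into a single boolean.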

2.10 Homepage Cards

The article link must appear in the card lists on both zh/index.html and en/index.html. The most insidious failure: article HTML deploys successfully but index.html uses a cached version — the page exists on the server but is invisible from the homepage, making it undiscoverable to both users and crawlers. Google's PageRank assigns higher initial weight to pages reachable directly from the homepage, so a missing homepage card is not just a navigation UX issue — it directly impacts search ranking velocity for the new article.
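The homepage check can be sketched by collecting every anchor on the index page and resolving relative links against the index URL. This is a simplification: it looks at all <a> elements rather than only those inside the card list, because the card markup is not specified here. The names are hypothetical.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class _LinkCollector(HTMLParser):
    """Collect absolute URLs of all <a href> links on a page."""

    def __init__(self, base_url: str):
        super().__init__()
        self.base_url = base_url
        self.links = set()

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                # Resolve relative hrefs the way a browser or crawler would.
                self.links.add(urljoin(self.base_url, href))


def homepage_links_to(index_html: str, index_url: str, article_url: str) -> bool:
    """True if the index page contains a link that resolves to article_url."""
    collector = _LinkCollector(index_url)
    collector.feed(index_html)
    return article_url in collector.links
```

Running it once against zh/index.html and once against en/index.html gives the two results this dimension needs.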

2.11 Tier Classification

Not all verification failures are equal. An HTTP 404 means the page is inaccessible — no further checks matter. A missing sitemap entry delays discovery but doesn't prevent crawling. The VERIFIED Gate classifies dimensions into four tiers, each with a distinct gate consequence:

Tier	Severity	Dimensions	Gate Consequence
Tier 1	Blocking	HTTP 200 + Canonical + Hreflang	Gate MUST fail. These three dimensions constitute the minimum viable deployment contract — an inaccessible page, a cross-language canonical leak, or a one-way hreflang makes the deployment effectively worthless. No override permitted.
Tier 2	High Priority	JSON-LD + Security Headers	Gate SHOULD fail. Missing JSON-LD degrades search visibility; missing security headers creates real vulnerability. The release should be blocked unless the failure is a known pre-existing condition (i.e., confirmed to exist before this deployment).
Tier 3	Medium	FAQ Visibility + Homepage Cards	Gate passes with WARNING. FAQ rich results and homepage discovery are important but not deployment-blocking — the article is still accessible and indexable. A warning is raised so the issue is tracked and repaired in the next deployment cycle.
Tier 4	Operational	Sitemap	Gate passes with NOTE. Sitemap absence delays search engine discovery but does not prevent crawling — the page is accessible, correctly canonicalized, and linked from the homepage. A note is logged; the sitemap should be regenerated as a follow-up operational action.

The tier system provides a graduated response: blocking failures halt the release, high-priority failures strongly recommend halting, medium issues generate warnings for imminent repair, and operational notes are logged without blocking. This prevents the gate from being either too lax (letting severe SEO damage through) or too rigid (blocking releases for non-critical operational hygiene items).

Note that Content-Type (Dimension 2) and Page Size (Dimension 3) are supplementary indicators rather than independently tiered checks. They provide diagnostic signal but are not final-gate dimensions: a wrong Content-Type is already caught by the browser/server layer, and an undersized page is caught by the combination of HTTP 200 + other content checks that fail on empty shells.
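The graduated response can be sketched as a small aggregation function. The dimension keys and the `preexisting` parameter are hypothetical names, and treating a known pre-existing Tier 2 failure as a warning is an assumption about how "should fail" is resolved in practice.

```python
# Tier assignment from the table above; the key names are illustrative.
TIERS = {
    "http_200": 1, "canonical": 1, "hreflang": 1,
    "jsonld": 2, "security_headers": 2,
    "faq_visibility": 3, "homepage_cards": 3,
    "sitemap": 4,
}


def gate_verdict(results: dict, preexisting: frozenset = frozenset()) -> dict:
    """Aggregate per-dimension booleans (True = PASS) into a tiered verdict."""
    verdict = {"status": "PASS", "blocking": [], "warnings": [], "notes": []}
    for name, passed in results.items():
        if passed:
            continue
        tier = TIERS[name]  # an unknown dimension raises KeyError instead of passing silently
        if tier == 1 or (tier == 2 and name not in preexisting):
            verdict["blocking"].append(name)
        elif tier in (2, 3):
            verdict["warnings"].append(name)
        else:
            verdict["notes"].append(name)
    if verdict["blocking"]:
        verdict["status"] = "FAIL"
    return verdict
```

Tier 1 failures can never be waived through `preexisting`, which mirrors the "no override permitted" rule in the table.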

3. Implementation: HTTP and Content-Level Checks

The basic transport and content checks -- HTTP status, Content-Type, page size, and slug-in-title verification -- form the foundation of every post-deployment check. If any of these fail, the page is effectively missing from the internet regardless of what other signals say. This section provides the reference implementation that the VERIFIED Gate uses to execute these checks.

The core function bundles all transport-layer and basic content checks into a single, deterministic pass/fail matrix:

import re

import httpx


def verify_http_and_content(url: str, slug: str) -> dict:
    """Verify the page exists, serves correct content type, and isn't an error page."""
    # Inspect the initial response; redirects are deliberately not followed.
    resp = httpx.get(url, follow_redirects=False, timeout=10)
    body = resp.text

    # Extract the <title> text; a missing title yields "" so the slug check fails.
    title_match = re.search(r"<title[^>]*>(.*?)</title>", body, re.IGNORECASE | re.DOTALL)
    title = title_match.group(1) if title_match else ""

    return {
        "status_200": resp.status_code == 200,
        "content_type_html": "text/html" in resp.headers.get("content-type", "").lower(),
        "page_not_empty": len(resp.content) > 1024,  # size in bytes, not characters
        "not_error_page": not any(m in body[:500] for m in ["404", "Not Found", "Internal Server Error"]),
        "slug_in_title": slug in title,
    }

Each check targets a distinct failure mode that CI/CD cannot detect. The follow_redirects=False parameter is deliberately set -- the HTTP status check inspects the initial response code rather than the final destination. This catches 301 redirect chains and trailing-slash mismatches that CI/CD smoke tests silently absorb by following redirects. A page served as 301 -> 200 adds an extra hop for every crawler and visitor and signals that the published URL is not the canonical one; aggregated across dozens of pages, this wastes crawl budget and can weaken site-wide SEO signals. The VERIFIED Gate treats any non-200 initial status as a gate failure, forcing a URL structure repair before proceeding.

The content_type_html check validates that nginx is serving the file with the correct MIME type. An nginx mime.types misconfiguration can cause .html files to be served as application/octet-stream, triggering browser download dialogs instead of rendering. Search engines encountering this MIME type may treat the page as a binary download rather than an indexable HTML document, losing all SEO signals. The substring match on text/html is intentionally loose — it accepts text/html; charset=utf-8, text/html;charset=UTF-8, and bare text/html without getting tripped up by charset casing variations.

The page_not_empty check uses a 1KB threshold to distinguish real content from error shells. Interrupted rsync transfers, nginx default welcome pages returning 200, and SPA fallbacks rendering a minimal <div id="app"></div> all produce sub-1KB responses. This lightweight threshold catches "the server says 200 but the content never arrived" without requiring full HTML parsing. For xslyl.com's static article pages with inline CSS and structured data, 1KB is a safe floor — typical articles weigh 8-15KB. For SPA architectures, lower the threshold to 256 bytes and supplement with element selector checks.

The not_error_page check scans the first 500 bytes of the response body for known error markers. This catches a subtle class of failures: a misconfigured nginx error_page directive that returns HTTP 200 with an error page body. When nginx is configured with error_page 404 =200 /404.html;, the status code is rewritten to 200 but the body still contains "Not Found." A status-only check would report PASS; the body scan catches the hidden failure. The 500-byte window targets the visible page content (title, first heading) rather than deeper HTML structure, balancing detection speed against false-positive risk.

Additional consideration for AI crawler blocking: In 2025-2026, many websites have added explicit Disallow rules targeting AI crawlers (GPTBot, ClaudeBot, CCBot) in their robots.txt. While these rules are valid, they can accidentally block legitimate search crawlers if the User-agent directive is too broad. A common mistake: User-agent: * followed by Disallow: / (intended for AI crawlers only but matches all crawlers including Googlebot). The VERIFIED Gate's robots.txt check includes a verification that Disallow: / is NOT present under User-agent: * — if it is, the entire site is blocked from crawling, and the gate fails regardless of what the article pages look like.

The slug_in_title check confirms that the page served is the intended article -- not a default nginx page, not a wrong-language version, not a duplicate from another article. The check extracts only the content between <title> and </title> tags rather than searching the entire HTML, preventing false matches on slug mentions in body text or navigation links. A slug-in-title failure means either the wrong file was deployed or the Agent generated an incorrect title, both of which demand repair before the gate can pass.

3.1 robots.txt Accessibility Check

While HTTP status and Content-Type verify that the page is served, the robots.txt check verifies that the page can be crawled. Search engine crawlers fetch robots.txt before requesting pages on the domain -- if robots.txt blocks the page path, the crawler will not fetch the content regardless of how perfect the HTML is. The check first verifies that robots.txt returns HTTP 200, then verifies that the article path is not blocked by a Disallow directive. The status matters in both directions: Google treats a 5xx response as if the whole site were disallowed, and treats a 404 as "no crawl restrictions," which silently discards whatever rules the file was supposed to carry.

# Step 1: Verify robots.txt exists and is accessible
curl -fsSI https://xslyl.com/robots.txt 2>&1 | grep -i '^content-type:'
# Expected: content-type: text/plain; charset=utf-8

# Step 2: Check that the article path is not disallowed
curl -fsS https://xslyl.com/robots.txt | grep -i 'disallow.*/posts/'
# Expected: no output (no Disallow rule blocks /posts/)

# Step 3: Check for specific page blocking
curl -fsS https://xslyl.com/robots.txt | grep -i 'disallow.*/en/posts/'
# If output: that language version is blocked from crawling

The most subtle failure mode is selective robots.txt blocking -- the crawler can access zh/posts/ but not en/posts/, or vice versa. This asymmetry means one language version gets indexed while the other is completely invisible to search engines, even though both pages serve HTTP 200. The robots.txt check is orthogonal to the HTTP status check -- a page can return 200, serve the correct Content-Type, and have perfect SEO metadata, yet never appear in search results because robots.txt blocks the path. CI/CD has no awareness of robots.txt at all; the VERIFIED Gate's robots.txt check fills this gap.

An important implementation note: the VERIFIED Gate checks robots.txt at the production URL (e.g., https://xslyl.com/robots.txt), not at a staging or development URL. This is because robots.txt behavior is environment-specific — production may have crawl budget management rules, CDN caching differences, or nginx location block variations that don't exist in staging. Checking a non-production robots.txt can give a false sense of crawlability.
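The grep commands above match Disallow lines textually and do not account for which User-agent group a rule belongs to. A group-aware variant can be sketched with the standard library's urllib.robotparser, assuming the robots.txt body has already been fetched from production. The function name is illustrative, and the parser implements basic prefix matching, so it may not mirror every extension a particular search engine supports.

```python
from urllib.robotparser import RobotFileParser


def crawlable(robots_txt: str, urls: list[str], user_agent: str = "Googlebot") -> dict:
    """Map each URL to whether user_agent may fetch it under the given robots.txt body."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return {url: parser.can_fetch(user_agent, url) for url in urls}
```

Passing both the zh and en article URLs surfaces the asymmetric-blocking case directly, and a blanket Disallow: / under User-agent: * shows up as False for every URL.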


4. Implementation: SEO Metadata Verification — Canonical, Hreflang, JSON-LD

While Section 3 verifies that the page exists, this section verifies that the page declares its identity correctly to search engines. URL metadata errors are the deadliest class of post-deployment failures — the page is online, accessible, and renders perfectly, but search engines refuse to index it. No alert fires, no error appears in the browser, and the failure is only discovered weeks later when organic traffic drops to zero.

4.1 Canonical URL Self-Reference

The canonical URL tag tells search engines which URL is the definitive version of a page. When a page's canonical points to a different URL, search engines treat the current page as a duplicate and discard it from the index. The most destructive failure mode on a bilingual site is cross-language canonical leakage: the zh page's canonical pointing to the en URL, causing Google to index only the English version and drop the Chinese version entirely.

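import re

def check_canonical(html: str, expected_url: str) -> bool:
    """Return True only when the canonical href exactly equals expected_url."""
    # The pattern assumes rel precedes href and both attributes use double quotes.
    match = re.search(r'<link\s+rel="canonical"\s+href="([^"]+)"', html)
    return match is not None and match.group(1) == expected_url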

The check is deliberately strict: the extracted URL must exactly equal the expected URL. A common Agent-induced mistake is generating http:// protocol instead of https:// -- the check treats this as a hard failure. The same applies to missing .html extensions, trailing-slash mismatches, and www. subdomain deviations. All of these cause search engines to canonicalize the wrong URL, silently fragmenting indexing weight across multiple URL variants.

The canonical check is per-page, not cross-page: it only verifies that each page's canonical self-references correctly. Cross-language canonical leakage (zh page pointing to en) is caught naturally because expected_url is set to the zh URL when checking the zh page, and the mismatch triggers a FAIL. No separate cross-page correlation step is needed — the self-reference check inherently catches cross-language errors as mismatches against the expected identity.
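The regex in check_canonical depends on a fixed attribute order and quoting style. If the generated markup can vary, a parser-based variant is one option. The sketch below is an alternative, not the gate's actual implementation; the names are hypothetical, and it additionally fails pages that declare more than one canonical link.

```python
from html.parser import HTMLParser


class _CanonicalCollector(HTMLParser):
    """Collect the href of every <link> whose rel contains 'canonical'."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        attrs = dict(attrs)
        rel_tokens = (attrs.get("rel") or "").lower().split()
        if "canonical" in rel_tokens and attrs.get("href"):
            self.hrefs.append(attrs["href"])


def check_canonical_strict(html: str, expected_url: str) -> bool:
    """PASS only when exactly one canonical link exists and it equals expected_url."""
    collector = _CanonicalCollector()
    collector.feed(html)
    return collector.hrefs == [expected_url]
```

The exact-equality rule is unchanged; only the extraction is more tolerant of attribute order, single quotes, and self-closing tags.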

4.2 Hreflang Bidirectional Cross-Validation

Hreflang tags declare language alternates: an English page declares that a Chinese version exists, and the Chinese page reciprocates. This is one of the few SEO mechanisms where one-way annotations are silently ignored by Google — not downgraded, not warned, but completely discarded. If the en page declares hreflang="zh" but the zh page does not declare hreflang="en", Google treats both pages as having no hreflang annotations at all.

This means hreflang verification cannot be done by inspecting a single page in isolation. The VERIFIED Gate must issue two HTTP requests — one to the en page, one to the zh page — extract hreflang tags from both, and cross-validate the references:

import re

def check_hreflang(html: str, zh_url: str, en_url: str) -> dict:
    """Report which hreflang alternates a single page declares."""
    # The pattern assumes the attribute order rel, hreflang, href with double quotes.
    patterns = re.findall(r'<link\s+rel="alternate"\s+hreflang="([^"]+)"\s+href="([^"]+)"', html)
    result = {"has_zh": False, "has_en": False, "x_default": False}
    for hl, href in patterns:
        if hl == "zh" and href == zh_url:
            result["has_zh"] = True
        if hl == "en" and href == en_url:
            result["has_en"] = True
        if hl == "x-default":
            result["x_default"] = True
    return result

This function runs against both pages separately, producing two result dicts. The gate logic then cross-validates:

The en page must have has_en: True (self-reference), has_zh: True (declares zh alternate), and x_default: True
The zh page must have has_zh: True (self-reference), has_en: True (declares en alternate), and x_default: True
Both pages' x-default must point to the same URL (the en page, by xslyl.com convention)

The x-default tag is the fallback for users whose language is neither zh nor en. It must point to the English page (or whichever language is the site's primary locale), and both pages must agree on the target. Mismatched x-default targets cause Google to arbitrate unpredictably — the gate enforces consistency.
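The cross-validation rules listed above, including agreement on the x-default target, can be sketched as follows. The function names are hypothetical, the regex assumes the same attribute order as check_hreflang, and the expectation that x-default points to the en URL follows the xslyl.com convention stated above.

```python
import re

# Same attribute order as check_hreflang: rel, hreflang, href.
HREFLANG_RE = re.compile(r'<link\s+rel="alternate"\s+hreflang="([^"]+)"\s+href="([^"]+)"')


def hreflang_map(html: str) -> dict:
    """Map each hreflang value declared on a page to its href."""
    return dict(HREFLANG_RE.findall(html))


def cross_validate_hreflang(zh_html: str, en_html: str, zh_url: str, en_url: str) -> dict:
    """Check that both pages declare zh, en, and x-default with the expected targets."""
    expected = {"zh": zh_url, "en": en_url, "x-default": en_url}
    zh_map, en_map = hreflang_map(zh_html), hreflang_map(en_html)
    return {
        "zh_page_ok": all(zh_map.get(k) == v for k, v in expected.items()),
        "en_page_ok": all(en_map.get(k) == v for k, v in expected.items()),
        # Both pages must declare x-default and agree on where it points.
        "x_default_agrees": zh_map.get("x-default") is not None
        and zh_map.get("x-default") == en_map.get("x-default"),
    }
```

The dimension passes only when all three values are True; a one-way declaration fails the page that omits the reciprocal link.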

4.3 JSON-LD Structured Data Completeness

JSON-LD blocks provide structured data that Google uses for rich search results: breadcrumb trails, FAQ accordions, article metadata cards. Three independent JSON-LD blocks are required on every xslyl.com article page: Article (core article metadata), BreadcrumbList (navigation trail for search result display), and FAQPage (rich result accordion for the FAQ section).

The gate does not fully parse JSON — it performs a pragmatic structural check that is fast, robust to minor formatting variations, and catches the most common Agent errors:

def check_jsonld(html: str) -> dict:
    types = set(re.findall(r'"@type":\s*"([^"]+)"', html))
    return {
        "has_article": "Article" in types,
        "has_breadcrumb": "BreadcrumbList" in types,
        "has_faqpage": "FAQPage" in types,
    }

This regex-based extraction is intentionally shallow: it only checks type presence, not structural validity. The rationale is pragmatic: Google's Rich Results Test (the successor to the retired Structured Data Testing Tool) provides thorough JSON-LD validation, and running a full JSON parser inside the gate adds complexity without proportional benefit. Type presence catches the dominant Agent failure modes (missing entire blocks, misnamed types such as "FaqPage" instead of "FAQPage", or partial generation), which account for the vast majority of production JSON-LD issues. Full parse validation is deferred to a separate lint step in CI.
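For comparison, the deferred lint step could look roughly like the sketch below. It is an assumption about what such a step might do, not the project's actual linter: `lint_jsonld` and `collect_types` are hypothetical names. It parses each `application/ld+json` script block with the standard `json` module and counts blocks that fail to parse, which the shallow type-presence check cannot detect.

```python
import json
import re

# Matches the body of each <script type="application/ld+json"> block.
JSONLD_BLOCK = re.compile(
    r'<script[^>]*type=["\']application/ld\+json["\'][^>]*>(.*?)</script>',
    re.DOTALL | re.IGNORECASE,
)


def collect_types(node, found: set) -> None:
    """Walk parsed JSON and record every @type value, including nested ones."""
    if isinstance(node, dict):
        declared = node.get("@type")
        if isinstance(declared, str):
            found.add(declared)
        elif isinstance(declared, list):
            found.update(t for t in declared if isinstance(t, str))
        for value in node.values():
            collect_types(value, found)
    elif isinstance(node, list):
        for item in node:
            collect_types(item, found)


def lint_jsonld(html: str) -> dict:
    types: set = set()
    parse_errors = 0
    for raw in JSONLD_BLOCK.findall(html):
        try:
            collect_types(json.loads(raw), types)
        except json.JSONDecodeError:
            parse_errors += 1  # block is present but not valid JSON
    return {"types": types, "parse_errors": parse_errors}
```

A block with a trailing comma illustrates the difference: the regex check still sees its "@type" string and reports the type as present, while this lint reports a parse error and omits the type.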

Important policy note on FAQPage: Google's structured data guidelines require that FAQ content declared in JSON-LD must be user-visible on the page. Declaring a FAQPage type without corresponding visible FAQ markup (HTML <details>/<summary> elements) violates Google's policy and may result in rich result removal. The VERIFIED Gate does not enforce this correspondence check at the JSON-LD level — it is handled separately by the FAQ Visibility dimension (Dimension 7, Section 2.7), which verifies that visible FAQ elements exist and match the declared structured data.

Together, these three checks — canonical self-reference, hreflang bidirectionality, and JSON-LD type completeness — form the SEO metadata verification layer. They catch the class of errors that CI/CD is structurally blind to: the page exists and serves 200, but its identity declarations are wrong, causing search engines to silently ignore, deduplicate, or de-index the content. These checks are the VERIFIED Gate's highest-value contribution beyond transport-layer verification.

5. Implementation: Security Header Verification and the nginx add_header Trap

Security response headers are the page's invisible armor — they are never seen by users, never parsed by search engines for ranking, and never checked by CI/CD pipeline tools. Yet their absence creates real vulnerabilities: clickjacking, MIME-type sniffing attacks, insecure cross-origin data leakage, and browser downgrade attacks on HTTPS. Five essential security headers form the baseline defense for any production website:

Header	Recommended Value	Consequence If Missing
Strict-Transport-Security	max-age=31536000; includeSubDomains	Browser may connect via plain HTTP, enabling SSL stripping and man-in-the-middle downgrade attacks. The `max-age` instructs the browser to enforce HTTPS for the specified duration (1 year). Without HSTS, every visit that starts over plain HTTP is exposed; with it, only the first visit before the policy is cached remains unprotected.
X-Content-Type-Options	nosniff	Browsers perform MIME-type sniffing: they inspect file content and may override the declared `Content-Type`. An attacker can upload a file that appears to be an image but contains executable JavaScript — if the browser sniffs and executes it, cross-site scripting (XSS) becomes possible even on sites with strict input sanitization.
X-Frame-Options	SAMEORIGIN	Any third-party site can embed your page in an invisible iframe, overlay UI elements on top (clickjacking), and trick users into performing actions they did not intend — clicking "Delete Account," transferring funds, or granting permissions. `SAMEORIGIN` allows framing only by pages from the same domain.
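Referrer-Policy	strict-origin-when-cross-origin	Without an explicit policy, behavior depends on the browser's default. Current browsers default to strict-origin-when-cross-origin, but older ones used no-referrer-when-downgrade, which sends the full URL (including path and query string) in the `Referer` header to every HTTPS destination. That leaks internal URL structures, session tokens embedded in URLs, and private resource paths to third-party analytics services and external links. Setting the header makes the behavior explicit instead of browser-dependent.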
Permissions-Policy	camera=(), microphone=(), geolocation=()	The page and all embedded iframes can request access to camera, microphone, and geolocation without restriction. An XSS payload injected into a blog comment can request camera access, and without this header, the browser shows the permission prompt — users who habitually click "Allow" grant the attacker a live video feed. The empty allowlist `()` means "denied for all origins, including self."

These five headers are the minimum security baseline — not advanced hardening, not defense-in-depth, but the equivalent of locking your front door. A production site serving pages without them is operating with fundamental security controls missing.

5.1 The nginx add_header Inheritance Trap

nginx has a counterintuitive behavior that has caused countless production security regressions: the add_header directive does not merge between configuration levels. When a location block contains any add_header directive — even for a completely unrelated header — all add_header directives inherited from the parent server block are silently dropped. Not overridden on a per-header basis, not combined — dropped entirely.

This means a single innocent-looking line like add_header Cache-Control "public, max-age=3600"; inside a location /stats/ block will strip all five security headers from every page under that path. The global nginx configuration appears correct — nginx -t passes, the server block declares all five headers — but /stats/ pages serve with zero security headers, and nobody notices until a penetration test or the VERIFIED Gate catches it.

The root cause is nginx's design: add_header in a child context replaces the entire header list from the parent, rather than merging individual headers. This is documented behavior but deeply non-obvious — it means every location block that uses add_header must independently redeclare all five security headers. The fix is mechanically simple but easy to forget when adding a new location block, a caching header, or a CORS header months after the initial server configuration:

# Correct — redeclare all headers in every block that has add_header
server {
    listen 443 ssl;
    add_header Strict-Transport-Security "max-age=31536000; includeSubDomains" always;
    add_header X-Content-Type-Options "nosniff" always;
    add_header X-Frame-Options "SAMEORIGIN" always;
    add_header Referrer-Policy "strict-origin-when-cross-origin" always;
    add_header Permissions-Policy "camera=(), microphone=(), geolocation=()" always;

    location / {
        # Inherits from server block — OK
        try_files $uri $uri/ =404;
    }

    location /stats/ {
        # TRAP! This add_header drops all inherited headers!
        add_header Cache-Control "public, max-age=3600";
        # Must also redeclare:
        add_header Strict-Transport-Security "max-age=31536000; includeSubDomains" always;
        add_header X-Content-Type-Options "nosniff" always;
        add_header X-Frame-Options "SAMEORIGIN" always;
        add_header Referrer-Policy "strict-origin-when-cross-origin" always;
        add_header Permissions-Policy "camera=(), microphone=(), geolocation=()" always;
        proxy_pass http://localhost:8080;
    }
}

Note the always suffix on every add_header directive. Without always, nginx adds the header only for a fixed set of response codes (200, 201, 204, 206, 301, 302, 303, 304, 307, and 308). Error responses such as 404, 500, and 502 are served without security headers, so an attacker who can trigger a 500 error on a login page gets that page served without clickjacking protection. The always flag ensures headers are present on every response code, including error pages. This is not an edge case: error pages are precisely where security headers protect against phishing and framing attacks that exploit user confusion during error states.

The VERIFIED Gate catches this trap by checking every URL independently. A security header check against the homepage alone would see all five headers present (from the server block) and report PASS — but the hidden /stats/ path, or any path under a location block with add_header, would be silently missing them. Per-URL independent verification is the only reliable detection method.

5.2 The `always` Flag and Error Page Surface Area

In the nginx configuration examples throughout this article, every add_header directive includes the always suffix. This parameter is easy to overlook but critical for security: without always, nginx sends the declared headers only for a fixed set of response codes (200, 201, 204, 206, 301, 302, 303, 304, 307, and 308). Error responses such as 401, 403, 404, 500, 502, and 503 are served without any security headers, even though the headers are declared at the server block level.

The practical vulnerability: an attacker probing for security weaknesses can deliberately trigger a 404 on a login page or an administrative endpoint. If the 404 response lacks X-Frame-Options, the attacker can frame the error page on a malicious site, and users who see a legitimate-looking error page framed within an attacker's domain may be tricked into entering credentials, believing the error page is from the legitimate site. With X-Frame-Options: SAMEORIGIN even on the 404, this attack is blocked because the browser refuses to render the framed page.

The VERIFIED Gate detects missing always flags indirectly. When it executes check_security_headers.py, it fetches the production URL without special error-page handling, so a page accessed via its correct URL (which returns 200) carries the headers regardless of the always flag. The always-flag gap only appears on error pages. To detect it, the gate suite also includes a separate check that requests a known-non-existent URL on the same domain (e.g., https://xslyl.com/nonexistent-test-page) and verifies that the 404 response carries all five security headers. This error-page probe is classified as an optional supplementary check (Tier 4, operational note) because its failure does not affect the deployment's primary content, but it does indicate a hardening gap that should be addressed.

This distinction between checking the article's headers and checking the domain's error-page headers illustrates an important design principle of the VERIFIED Gate: dimensional coverage is not symmetric. The article-content checks are comprehensive (10+ dimensions for the deployed article), while the infrastructure checks (robots.txt, error-page headers, sitemap structural validity) are supplementary probes that protect against system-level misconfigurations. A deployment must pass all article-content checks; infrastructure checks generate warnings and notes that guide the operator toward operational improvements without blocking the release.

5.3 Python Verification Implementation

The VERIFIED Gate's security header check is a single-purpose verification function that issues an HTTPS request and validates that all five expected header values are present:

import httpx

SECURITY_HEADERS = {
    "strict-transport-security": "max-age=31536000",
    "x-content-type-options": "nosniff",
    "x-frame-options": "SAMEORIGIN",
    "referrer-policy": "strict-origin-when-cross-origin",
    "permissions-policy": "camera",
}

def check_security_headers(url: str) -> dict:
    resp = httpx.get(url, timeout=10)
    return {
        hdr: val in resp.headers.get(hdr, "")
        for hdr, val in SECURITY_HEADERS.items()
    }

The check uses substring matching rather than exact equality because header values often carry parameters beyond the core directive: Strict-Transport-Security may include includeSubDomains and preload flags, and Permissions-Policy may list additional features beyond camera. The substring check validates that the essential directive is present without being tripped up by benign parameter variations. Header names need no special handling: they are case-insensitive in HTTP, and resp.headers in httpx performs case-insensitive lookups, so the lowercase keys in SECURITY_HEADERS match whatever casing nginx emits. The value comparison itself is case-sensitive, so a server that sends sameorigin in lowercase would fail the X-Frame-Options check as written.

Each header serves a distinct defensive purpose, and the failure of any single one leaves a specific vulnerability open. The gate's per-header PASS/FAIL output enables precise diagnosis — a Permissions-Policy failure points to a specific header omission, not a vague "security headers incomplete" message. This granularity is essential for automated recovery: the repair workflow can target the exact missing header rather than regenerating the entire nginx configuration.
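The supplementary error-page probe described in this section can be sketched in the same style. This is a hedged illustration, not the gate's actual script: `probe_error_page` and the injected `fetch` callable are hypothetical, and the expected values simply mirror SECURITY_HEADERS in lowercase. Injecting `fetch` keeps the logic testable without network access; in practice it could wrap httpx.get and return the status code and headers.

```python
from typing import Callable, Mapping

# Lowercased needles mirroring the article's SECURITY_HEADERS table.
REQUIRED = {
    "strict-transport-security": "max-age=31536000",
    "x-content-type-options": "nosniff",
    "x-frame-options": "sameorigin",
    "referrer-policy": "strict-origin-when-cross-origin",
    "permissions-policy": "camera",
}

# fetch(url) -> (status_code, headers); supplied by the caller (assumption).
Fetch = Callable[[str], tuple[int, Mapping[str, str]]]


def probe_error_page(base_url: str, fetch: Fetch,
                     path: str = "/nonexistent-test-page") -> dict:
    """Request a URL expected to 404 and list security headers missing from the response."""
    status, headers = fetch(base_url.rstrip("/") + path)
    # Lowercase names and values so the comparison is case-insensitive.
    lowered = {name.lower(): value.lower() for name, value in headers.items()}
    missing = [name for name, needle in REQUIRED.items()
               if needle not in lowered.get(name, "")]
    return {
        "status": status,
        "is_error_response": status >= 400,
        "missing_headers": missing,
        # Supplementary check: a gap is reported as a warning, not a gate failure.
        "severity": "warning" if missing else "ok",
    }
```

If `is_error_response` is False, the probe URL unexpectedly resolved to a real page and the result says nothing about the always flag.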

6. Implementation: Sitemap and Homepage Card Consistency

Deploying an article to production is a two-sided operation: the article HTML file must reach the server, and the site's navigation infrastructure must be updated to include it. A deployment that succeeds at the first half but fails at the second produces a ghost page — the article exists on the filesystem, serves HTTP 200, renders correctly, but is invisible to both search engines and human visitors navigating from the homepage.

Two infrastructure components must be verified independently: sitemap.xml for search engine discovery, and the language-specific homepage index files (zh/index.html and en/index.html) for human navigation. Deploying one without the other creates an inconsistent experience — search engines find the page but users can't navigate to it, or users find it via the homepage but search engines never crawl it. Both must be checked.

6.1 Sitemap: The Search Engine Entry Point

sitemap.xml is the primary discovery mechanism for search engine crawlers. When a new article is deployed, the sitemap must contain both the Chinese and English URLs — if either is missing, that language version relies entirely on organic discovery through internal links, which can take days or weeks. Google's crawl scheduling prioritizes URLs listed in the sitemap, assigning them higher crawl frequency and faster indexing compared to URLs discovered only through link traversal.

The sitemap check is threefold:

Both zh and en URLs must appear in sitemap.xml. Asymmetric entries — zh present but en absent — mean one language version is immediately discoverable while the other enters a crawl queue behind thousands of other pages. For a bilingual site, this asymmetry is worse than both being absent: it creates an indexing disparity between the two language versions, causing one to rank while the other remains invisible.
The sitemap must be structurally complete with a closing </urlset> tag. An interrupted sitemap generation (rsync mid-transfer, script crash, disk full) produces a truncated XML file that search engines reject entirely, not just the missing entries. The truncated file is no longer well-formed XML, so Google's parser discards it, and the failure only surfaces in Google Search Console days later. The closing-tag check catches truncation before it becomes a silent crawl failure.
No cross-contamination from staging URLs. The sitemap must contain only production URLs (https://xslyl.com/...). A common Agent error is generating sitemap entries with staging domains or localhost prefixes — search engines encountering these treat the entire sitemap as unreliable, potentially ignoring all entries.
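The three sitemap rules above can be sketched with a real XML parse instead of substring tests. This is an illustrative variant under stated assumptions, not the gate's implementation: `check_sitemap` is a hypothetical name, the sitemap is assumed to use the standard sitemaps.org namespace, and article URLs are assumed to contain `/zh/posts/<slug>` and `/en/posts/<slug>` as in the rest of this article.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "{http://www.sitemaps.org/schemas/sitemap/0.9}"
PRODUCTION_PREFIX = "https://xslyl.com/"


def check_sitemap(xml_text: str, slug: str) -> dict:
    """Check well-formedness, presence of both language URLs, and non-production entries."""
    try:
        root = ET.fromstring(xml_text)
    except ET.ParseError:
        # A truncated file (missing </urlset>) fails here.
        return {"well_formed": False, "zh_present": False,
                "en_present": False, "foreign_urls": []}

    locs = [(el.text or "").strip() for el in root.iter(f"{SITEMAP_NS}loc")]

    def present(lang: str) -> bool:
        # Only production URLs count as present.
        return any(loc.startswith(PRODUCTION_PREFIX) and f"/{lang}/posts/{slug}" in loc
                   for loc in locs)

    return {
        "well_formed": True,
        "zh_present": present("zh"),
        "en_present": present("en"),
        # Staging or localhost entries end up here.
        "foreign_urls": [loc for loc in locs if not loc.startswith(PRODUCTION_PREFIX)],
    }
```

Parsing subsumes the closing-tag test, since any truncation makes the document ill-formed, and the `foreign_urls` list names the exact staging entries instead of only signalling that one exists.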

6.2 Homepage Cards: The Human Navigation Entry Point

The homepage index files (zh/index.html and en/index.html) contain article cards: linked titles that users click to reach the article. These cards are the single most important internal link for a new article. Pages linked directly from the homepage generally receive more internal link weight and are reached sooner by crawlers, and users discovering the site navigate through the homepage card list to find content.

The check is straightforward but the failure mode is subtle: the article HTML deploys successfully, but index.html uses a cached or stale version that does not include the new article card. CI/CD reports deploy succeeded because all files transferred and nginx reloaded — but the old index.html overwrote the updated one, or the article's card slug was never added to the homepage template. The page exists on the server but is unreachable from the primary navigation path, defeating the purpose of publishing.

The worst case is a card present on one language homepage but absent on the other — the English homepage links to the article while the Chinese homepage shows an older list without it. This asymmetric visibility means users in one locale see the article while users in the other locale cannot find it, creating a split experience that is undetectable by checking only a single homepage.

6.3 Combined Verification Implementation

The VERIFIED Gate performs sitemap and homepage checks in a single combined function, issuing three HTTP requests (one for the sitemap, one for each language homepage) and returning a flat dict of boolean presence checks:

def check_sitemap_and_homepage(slug: str) -> dict:
    sitemap = httpx.get("https://xslyl.com/sitemap.xml").text
    zh_home = httpx.get("https://xslyl.com/zh/index.html").text
    en_home = httpx.get("https://xslyl.com/en/index.html").text

    return {
        "zh_in_sitemap": f"/zh/posts/{slug}" in sitemap,
        "en_in_sitemap": f"/en/posts/{slug}" in sitemap,
        "sitemap_closed": "</urlset>" in sitemap,
        "slug_in_zh_home": slug in zh_home,
        "slug_in_en_home": slug in en_home,
    }

The function returns five independent boolean checks rather than a single composite result. This granularity is essential for diagnosis: a slug_in_zh_home FAIL but slug_in_en_home PASS tells the operator exactly which file needs repair — the Chinese homepage, not the English one, and not the sitemap. Without per-component granularity, the operator sees "homepage check failed" and must manually inspect both homepage files to find the issue.

The sitemap_closed check within this function verifies that the sitemap isn't truncated. A truncation failure is silent at the HTTP level — the server returns 200, the Content-Type is correct, and the partial XML is well-formed right up to the truncation point. Only the missing </urlset> closing tag reveals the truncation. Google's XML parser will reject a sitemap that doesn't close properly, but the gate's explicit sitemap_closed check catches the problem at deployment time, before Google ever encounters the broken file.

6.4 Cross-Language Homepage Consistency

A bilingual site has two homepages — one for each language. Each must contain a card or link to the new article. An asymmetric failure — the article appears on en/index.html but not zh/index.html — creates a split user experience: English-speaking visitors see the article while Chinese-speaking visitors cannot find it. The VERIFIED Gate checks both homepages independently and reports which language version is missing, enabling precise repair.

The check logic is:

def check_homepage_consistency(slug: str) -> dict:
    zh_home = httpx.get("https://xslyl.com/zh/index.html").text
    en_home = httpx.get("https://xslyl.com/en/index.html").text
    
    return {
        "slug_in_zh_home": slug in zh_home,
        "slug_in_en_home": slug in en_home,
        "zh_article_links": len(re.findall(
            r'<a\s+href="/zh/posts/', zh_home)),
        "en_article_links": len(re.findall(
            r'<a\s+href="/en/posts/', en_home)),
        "zh_home_valid_html": "</html>" in zh_home,
        "en_home_valid_html": "</html>" in en_home,
    }

The additional checks, zh_article_links and en_article_links, detect a more insidious failure: the slug string appears somewhere in the file, but the homepage contains zero article links (indicating the index file was replaced by a placeholder or landing page template). The _home_valid_html checks look for the closing </html> tag to confirm the homepage isn't a truncated file. These supplementary checks transform a brittle single-boolean check into a resilient diagnostic signal that can distinguish between "card missing" and "homepage broken."

Together, the sitemap and homepage checks form the discoverability verification layer. While Sections 3-5 verify that the article exists and is correct, this section verifies that it can be found — through both machine (sitemap → crawler) and human (homepage → click) discovery paths. A deployment that passes content checks but fails discoverability checks is not truly deployed — it's merely stored on a server.

6.5 Homepage Update Architecture Considerations

The homepage update pattern depends on how the site generates its index pages. Three common architectures have different implications for VERIFIED Gate verification:

Static card template (xslyl.com's pattern): The homepage index is a hand-updated HTML file where new article cards are manually added. The VERIFIED Gate checks that the slug appears in the file — a simple substring presence test. This pattern is brittle but transparent: the card content is directly visible in the repo, and a missing card is immediately obvious from the gate failure output. The risk is human error — forgetting to add the card to one language's index.
Automated index generator: A CI/CD step reads the zh/posts/ and en/posts/ directories and regenerates zh/index.html and en/index.html on each deployment. The VERIFIED Gate checks that the regenerated file contains the new slug and validates structural integrity (proper <html> tags, balanced card counts). The risk is generator bugs, such as a template change that drops the trailing card or a parsing error that excludes one language's directory.
Dynamic CMS-backed homepage: The homepage is rendered from a database at request time. If the card list is part of the server-rendered HTML, the same lightweight HTTP check still works. If the post list is loaded by client-side JavaScript, the gate must instead fetch the homepage, render it in a headless browser, and verify the card presence in the rendered DOM. That case requires a different set of verification tools (Puppeteer, Playwright), but the principle is the same: verify that the new article is discoverable from the primary navigation entry point.

Regardless of architecture, the gate's discoverability check follows the same principle: verify that the deployment's output (the new content) is reachable from the site's primary navigation entry point. The implementation method varies — substring match, XML parse, or DOM extraction — but the verification question is identical.

7. Deployment Integrity: Git SHA Consistency and Emergency Paths

The previous six sections verify that deployed content looks correct from the outside — HTTP headers, SEO metadata, structured data, discoverability. But there is a deeper question that none of those checks answer: is the content on the VPS actually the same content that was committed to the repository? A page can pass every external check — correct canonical, valid JSON-LD, all security headers present — yet be serving a stale version from two weeks ago because the deploy script silently failed and rsync never overwrote the old files.

This is the deployment integrity problem. It requires a different kind of verification: not checking what the page looks like, but checking that the bytes on the server match the bytes in the repository. The VERIFIED Gate enforces this through three-way Git SHA consistency.

7.1 Three-Way SHA Consistency

The deployment pipeline produces three independent commit identifiers that must all agree:

SHA Source	Where It Comes From	What It Proves
github_sha	The merge commit on GitHub's `main` branch — extracted from the GitHub Actions event payload or `gh run view`	The content was committed and merged into the canonical repository
vps_sha	The HEAD commit on the VPS's local git repository — obtained via `git rev-parse HEAD` on the VPS after `git pull`	The VPS successfully pulled the latest commit from GitHub
deploy_sha	The SHA recorded in the deploy report after the deploy script completes — typically the same as `vps_sha` but independently recorded	The deploy script ran against the correct commit and completed successfully

If all three SHAs match, the chain of custody is intact: GitHub accepted the commit → VPS pulled it → the deploy script processed it. If any pair disagrees, the deployment is in an inconsistent state and the gate must fail. The most common mismatch scenario: github_sha and vps_sha agree, but deploy_sha is older — meaning the deploy script ran against a stale checkout and never deployed the latest files. The external content checks (Sections 2-6) might still pass because the old content was also correct — but the new content never reached production.
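The three-way comparison can be sketched as a small pure function over the deploy report. This is an assumption-laden illustration: `check_sha_consistency` is a hypothetical name, and it assumes the report stores the three identifiers under the keys github_sha, vps_sha, and deploy_sha as full-length SHAs.

```python
SHA_KEYS = ("github_sha", "vps_sha", "deploy_sha")


def check_sha_consistency(deploy_report: dict) -> dict:
    """Compare the three commit identifiers; a missing SHA counts as a mismatch."""
    shas = {key: (deploy_report.get(key) or "").strip().lower() for key in SHA_KEYS}
    all_present = all(shas.values())
    return {
        "all_present": all_present,
        # Pairwise results show where the chain of custody broke.
        "github_vs_vps": all_present and shas["github_sha"] == shas["vps_sha"],
        "vps_vs_deploy": all_present and shas["vps_sha"] == shas["deploy_sha"],
        "all_match": all_present and len(set(shas.values())) == 1,
    }
```

In the stale-checkout scenario described above, `github_vs_vps` is True while `vps_vs_deploy` is False, which points at the deploy script and not at the pull step.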

7.2 Standard Deploy Path

The standard deployment follows a deterministic sequence that the gate can validate at each step:

GitHub Actions triggers on merge to main, capturing merge_commit (the github_sha) in the workflow run metadata.
VPS executes git pull origin main — the gate records git_pull_result (success/failure) and the resulting vps_sha.
Deploy script runs (deploy-xslyl.sh) — rsyncs files to /var/www/html/, reloads nginx. The gate records deploy_script_result and confirms dry_run=false (actual deployment, not a preview).
Gate captures the deploy report — method must be vps_git_pull_main_plus_deploy_script. The report contains all three SHAs plus step-level success/failure flags.

A critical detail: the gate checks that dry_run is explicitly false. A deploy script invoked with --dry-run produces identical-looking output — rsync lists files it would transfer, the script returns exit 0 — but no files are actually moved. Without the dry-run check, a dry-run deployment could produce a VERIFIED status while zero content reached the server.

7.3 Emergency Deploy Path

Standard deployment can fail for reasons unrelated to the content: SSH connectivity issues between the VPS and GitHub, git protocol timeouts, disk-full conditions that prevent git pull from completing. When the standard path is unavailable, an emergency deployment method — scp of individual files or rsync --no-delete — may be the only way to get content to production.

Emergency deployments are never automatic. They require explicit user approval documented in the deploy report. The gate classifies the result as VERIFIED_WITH_WARNING rather than VERIFIED because:

No git SHA consistency guarantee. Without git pull, there is no vps_sha to compare against github_sha. The content was transferred, but the gate cannot cryptographically prove it matches the repository.
Partial file deployment risk. scp and rsync transfer individual files — if the operator accidentally omits a file (e.g., the zh version but not the en version), the deployed state is incomplete. The standard deploy path transfers the entire repository tree atomically.
No rollback safety. Standard deployment leaves the VPS git repository at a known commit, enabling git revert rollback. Emergency deployment bypasses the git layer — there is no commit record on the VPS to roll back to.

The VERIFIED Gate's emergency path validation logic:

def check_deploy_integrity(deploy_report: dict) -> dict:
    method = deploy_report.get("deploy_method", "")
    is_standard = method == "vps_git_pull_main_plus_deploy_script"
    is_emergency = "emergency" in method

    checks = {"method_known": is_standard or is_emergency}
    if is_standard:
        checks["git_pull_ok"] = deploy_report.get("git_pull_result") == "success"
        checks["script_ok"] = deploy_report.get("deploy_script_result") == "success"
        # Passes only when the report explicitly records dry_run=false (a real deploy).
        checks["dry_run"] = deploy_report.get("dry_run") is False
    if is_emergency:
        approval = deploy_report.get("emergency_approval", {})
        checks["user_approved"] = approval.get("approved_by_user") is True
    return checks

The method_known check is the first line of defense: if the deploy method is neither standard nor emergency, the gate cannot determine how content reached the server, and the deployment is unverifiable. This catches a class of errors where the deploy report is missing, corrupted, or generated by an unauthorized process — the gate refuses to certify a deployment it cannot trace.
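How the integrity checks map to a final gate status can be sketched as follows. The status names VERIFIED and VERIFIED_WITH_WARNING come from this article; the name FAILED, the function `classify_gate_status`, and the `warnings` parameter are assumptions made for illustration.

```python
from typing import Mapping, Sequence


def classify_gate_status(deploy_method: str,
                         checks: Mapping[str, bool],
                         warnings: Sequence[str] = ()) -> str:
    """Collapse per-check booleans and the deploy method into one gate status."""
    # Any failed check, or no checks at all, means the deployment is unverifiable.
    if not checks or not all(checks.values()):
        return "FAILED"
    # Emergency deploys and non-blocking warnings downgrade the result.
    if "emergency" in deploy_method or warnings:
        return "VERIFIED_WITH_WARNING"
    return "VERIFIED"
```

Passing the dict returned by check_deploy_integrity as `checks` means an unknown deploy method (method_known False) yields FAILED before any warning logic runs.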

7.4 Rollback Readiness Verification

A deployment that passes the VERIFIED Gate is certified as correct, but what happens when the next deployment breaks the site? The VERIFIED Gate includes a rollback readiness check: before certifying a new deployment, it verifies via git log --oneline -2 on the VPS that the previous deployment's commit is still visible behind the current one. This ensures that if the new deployment introduces a regression, the operator can roll back by checking out the previous commit and re-running the deploy script. Without this check, a deployment that overwrites or replaces the git history on the VPS (e.g., a force-push that rewrites history, or a manual file deletion) leaves the operator without a rollback target, even if it passes all verification dimensions.

The rollback readiness check is a lightweight meta-assertion: it does not test the actual rollback process (which would require deploying to a separate environment and verifying the previous version's correctness), but it confirms that the technical precondition for rollback exists. The check is:

# Verify previous commit is still reachable
cd /opt/xslyl-repo && git log --oneline -2
# Expected: at least 2 commits visible (current + previous)
# If only 1 commit: rollback target lost — gate passes with warning

# Verify no uncommitted changes that would break rollback
cd /opt/xslyl-repo && git status --porcelain
# Expected: empty output
# If output: uncommitted changes are present and may be lost on rollback

A rollback readiness failure does not block the VERIFIED Gate — the deployment may still be correct. But it does reduce the gate's output from VERIFIED to VERIFIED_WITH_WARNING, documenting that while the current deployment is correct, the operator lacks a quick recovery path if a future deployment fails. The warning is recorded in verify-report.json, creating an audit trail that triggers a review of the deployment workflow.
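The interpretation of the two command outputs can be sketched as a pure function. This assumes the gate has already captured the text of `git log --oneline -2` and `git status --porcelain` from the VPS; `assess_rollback_readiness` and its result keys are hypothetical names.

```python
def assess_rollback_readiness(log_output: str, status_output: str) -> dict:
    """Interpret captured git output; never blocks the gate, only downgrades it."""
    commits = [line for line in log_output.splitlines() if line.strip()]
    dirty = [line for line in status_output.splitlines() if line.strip()]
    previous_commit_visible = len(commits) >= 2  # current + previous
    worktree_clean = not dirty                   # porcelain output is empty when clean
    return {
        "previous_commit_visible": previous_commit_visible,
        "worktree_clean": worktree_clean,
        "downgrade_to_warning": not (previous_commit_visible and worktree_clean),
    }
```

When `downgrade_to_warning` is True, the gate would report VERIFIED_WITH_WARNING and record the reason in verify-report.json, as described above.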

7.5 Coordinated Multi-File Verification

A deployment is rarely a single file — it is a coordinated set of files that must be updated together. For a bilingual article on xslyl.com, a correct deployment involves at minimum five files being updated in the same deploy cycle: the zh article HTML, the en article HTML, the zh homepage index, the en homepage index, and the sitemap XML. If any of these files is from a different commit — the zh article was deployed at commit A but the sitemap was last updated at commit B — the deployed state is inconsistent even if each individual file is correct.

The VERIFIED Gate detects this through cross-file consistency checks. It does not just verify each file independently; it verifies that the set of deployed files are internally consistent. The deploy-report.json records which files were deployed (from the rsync log), and the gate cross-references this file list against the expected deployment manifest. If the deploy report shows 4 files transferred but the expected manifest lists 5, the gate fails — even if the 4 files that were deployed are individually correct. The missing file (typically the sitemap or one language's homepage index) will be detected as absent from the deployment even before the content checks run.

This multi-file coordination check is separate from the per-file content checks (Sections 2-6). A deployment that transfers 4 out of 5 required files will fail the coordination check instantly, before any HTTP request is made to verify content correctness. The design principle: verifying the deployment set is cheaper and more reliable than verifying each file in the set individually. A five-file check with one missing file produces 4 PASS + 1 FAIL on individual checks, which is noisy and confusing (which file is missing?). A single coordination check that asks "is the file count correct?" returns a clean FAIL with a clear diagnosis: "5 files expected, 4 deployed."
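The manifest comparison amounts to a set difference. The sketch below is illustrative only: `expected_manifest` and `check_deploy_manifest` are hypothetical names, and the file paths (in particular the `.html` suffix on article files) are assumptions about the repository layout.

```python
def expected_manifest(slug: str) -> set:
    """Five files a bilingual article deploy is expected to touch (assumed layout)."""
    return {
        f"zh/posts/{slug}.html",
        f"en/posts/{slug}.html",
        "zh/index.html",
        "en/index.html",
        "sitemap.xml",
    }


def check_deploy_manifest(expected: set, deployed: set) -> dict:
    """Compare the expected manifest with the file list from the deploy report."""
    missing = sorted(expected - deployed)
    deployed_count = len(expected & deployed)
    return {
        "complete": not missing,
        "missing": missing,
        "summary": f"{len(expected)} files expected, {deployed_count} deployed",
    }
```

Returning the `missing` list alongside the count keeps the clean single FAIL while still naming the absent file.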

8. Meta-Checks: Script Versioning and Policy Acknowledgment

All the checks described so far verify the deployed content. But there is a recursive problem: what verifies the verification scripts themselves? The gate suite — check_ready_gate.py, check_verified_gate.py, check_vm_evidence.py, check_policy_ack.py — is software, subject to the same drift, inconsistency, and staleness as any other software. An outdated verification script is worse than no script: it produces false confidence by reporting PASS on checks it never actually performed.

The VERIFIED Gate includes two meta-checks — checks that validate the verification infrastructure itself rather than the deployed content.

8.1 Gate Script Version Consistency

Every gate script declares a GATE_SCRIPT_VERSION constant. This version is not decorative — it is the contract between scripts that they understand the same data formats, check the same dimensions, and produce compatible output structures. If check_ready_gate.py is version 3 (which expects 14 required sections) but check_verified_gate.py is still version 2 (which only checks 10 dimensions), the gate suite is in an inconsistent state: the ready gate might pass an article that the verified gate cannot fully validate.

The version consistency check is simple but non-negotiable:

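import re

EXPECTED_VERSION = "4"
# Matches a top-level assignment such as: GATE_SCRIPT_VERSION = "4"
VERSION_RE = re.compile(r"""^GATE_SCRIPT_VERSION\s*=\s*["']?([^"'\s#]+)""")

def check_gate_script_versions(scripts: list[str]) -> dict:
    versions = {}
    for script_path in scripts:
        versions[script_path] = None  # stays None if no declaration is found
        with open(script_path, encoding="utf-8") as f:
            for line in f:
                match = VERSION_RE.match(line)
                if match:
                    versions[script_path] = match.group(1)
                    break
    # A missing declaration counts as a mismatch, and an empty script list never passes.
    all_match = bool(versions) and all(v == EXPECTED_VERSION for v in versions.values())
    return {"versions": versions, "all_match": all_match}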

If any script's version differs from EXPECTED_VERSION, the meta-check fails. This is not a warning — it is a hard gate failure, because running an inconsistent verification suite is equivalent to running no verification at all. The fix is mechanical: update the out-of-date scripts to the current version, which also forces a review of what changed between versions.

8.2 Policy Acknowledgment Verification

The second meta-check validates that the verification process itself follows the rules. The check_policy_ack.py script verifies that task reports include the required acknowledgment fields:

command-approval-policy.md has been read — the agent must declare it has read and understood the policy before generating commands.
BLOCKED patterns declaration — the agent must state that no blocked patterns (P1-P6) were used in the task.
Mandatory Approval declaration — any operations from Groups A-D that were performed must be explicitly listed with their approval status.
Execution boundary declaration — for sub-agent tasks, which operations the sub-agent performed autonomously vs. which were user-approved must be documented.

This is a meta-check because it validates that the verification process — the agent's own task execution — complies with the command approval policy. A verification report that itself violated the policy (e.g., the agent used a blocked command pattern to run the verification scripts) is not trustworthy, regardless of what the content checks report.
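
The four acknowledgment requirements can be pictured as a required-fields check on a task report. The sketch below is an assumption-laden illustration: the real check_policy_ack.py and its report schema are not shown in this article, so the field names and the function `check_policy_ack_fields` are hypothetical.

```python
# Hypothetical field names, one per required acknowledgment listed above.
REQUIRED_ACK_FIELDS = (
    "policy_read",
    "blocked_patterns_declaration",
    "mandatory_approval_declaration",
    "execution_boundary_declaration",
)


def check_policy_ack_fields(report: dict) -> dict:
    """Report which required acknowledgment fields are absent or empty."""
    # A field that is missing, None, False, or an empty string counts as not acknowledged.
    missing = [name for name in REQUIRED_ACK_FIELDS if not report.get(name)]
    return {"passed": not missing, "missing": missing}


# Example report that omits the execution boundary declaration.
report = {
    "policy_read": True,
    "blocked_patterns_declaration": "No P1-P6 patterns used",
    "mandatory_approval_declaration": "No Group A-D operations performed",
}
ack = check_policy_ack_fields(report)
print(ack)
```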

8.3 Why Meta-Checks Matter

Gate scripts evolve independently of the articles they verify. A new security header is added to the nginx configuration; check_verified_gate.py is updated to check for it; but check_ready_gate.py — which runs on a different schedule, triggered by different events — is not updated. Six months later, the ready gate passes an article that is missing the new security header in its pre-deployment checks, and the verified gate catches it post-deployment. The article author sees "ready gate: PASS, verified gate: FAIL" and cannot understand the contradiction.

The meta-checks prevent the gate suite from entering this inconsistent state. They are the equivalent of calibration checks on measurement instruments — you don't trust a thermometer that hasn't been calibrated, and you shouldn't trust a verification suite whose scripts don't agree on what they're verifying.

Without version enforcement, the gate suite decays silently: scripts diverge, check coverage drifts, and the "VERIFIED" status gradually loses meaning. The meta-checks are the mechanism that forces the gate suite to evolve as a coherent unit, where every script validates against the same set of expectations.

8.4 Gate Result Storage and Archival

The VERIFIED Gate produces three categories of output that must be stored, each serving a different purpose and retention requirement:

Category	Format	Contents	Retention
Machine evidence	verify-report.json deploy-report.json	Per-dimension PASS/FAIL verdicts, HTTP status codes, canonical URLs found vs. expected, JSON-LD types detected, security headers present vs. missing, SHA comparison results. Machine-readable for downstream automation.	Permanent (committed to repo)
Human summary	final-report.md live-review.md	Overall verdict, pass/fail breakdown by dimension with tier-level severity classification, recovery recommendations, and a timestamped audit log. Human-readable for operators and postmortems.	Permanent (committed to repo)
State metadata	status.json	Task state: pipeline stage completions, deployment method, live URLs, verification status. Updated incrementally as the task progresses through the pipeline.	Permanent (committed to repo)

All three categories are committed to the repository under .agent-workspace/tasks/<task_id>/. This co-location is deliberate: for any given deployment, the task directory contains the complete chain of custody — from deployment evidence (how content reached the server) through verification evidence (whether the deployed content is correct) to the final human-readable summary (what operators need to know). Six months later, when investigating an SEO regression or verifying that a fix was properly deployed, the entire audit trail is in one place, under version control, with git history showing exactly what changed and when.
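
One way to confirm that a task directory holds the complete audit package is a simple presence check. This is a sketch under assumptions: it treats all five files named in the table as required, and the helper `missing_artifacts` and the task id `demo-task` are invented for illustration.

```python
import tempfile
from pathlib import Path

# File names taken from the table above; treating all five as required is an assumption.
REQUIRED_ARTIFACTS = (
    "verify-report.json",
    "deploy-report.json",
    "final-report.md",
    "live-review.md",
    "status.json",
)


def missing_artifacts(repo_root: Path, task_id: str) -> list[str]:
    """List required artifacts that are absent from the task directory."""
    task_dir = repo_root / ".agent-workspace" / "tasks" / task_id
    return [name for name in REQUIRED_ARTIFACTS if not (task_dir / name).is_file()]


# Demonstration against a throwaway directory rather than a real repository.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    task_dir = root / ".agent-workspace" / "tasks" / "demo-task"
    task_dir.mkdir(parents=True)
    for name in ("verify-report.json", "deploy-report.json", "status.json"):
        (task_dir / name).write_text("{}", encoding="utf-8")
    absent = missing_artifacts(root, "demo-task")

print(absent)
```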

The repository-as-database choice has practical implications. Unlike CI/CD artifact stores with 90-day retention or monitoring dashboards with 30-day metric windows, git commits persist indefinitely. A deployment from 2024 is still verifiable in 2027 — check out the commit, run the gate scripts (which are also in the repo at that version), and reproduce the exact verification result. This reproducibility guarantee is the meta-check's most important property: it makes the verification process itself auditable, not just the verification output.

9. Recovery Paths for Failed Dimensions

The VERIFIED Gate's primary output is a PASS/FAIL matrix — 10+ dimensions, each independently judged. But a raw FAIL verdict without a recovery path is operationally useless: it tells you there's a problem without telling you what to do about it. This section defines the recovery protocol for each failing dimension — the specific sequence of actions that transforms a FAIL into a PASS, and the re-verification step that confirms the fix worked.

Every recovery path follows a universal three-phase pattern: Diagnose → Repair → Re-verify. The Diagnose phase identifies the root cause from the gate's per-dimension evidence. The Repair phase applies the specific fix — which varies significantly by dimension, from a one-line HTML edit to an nginx configuration change. The Re-verify phase re-runs the full gate suite (not just the failing dimension) to confirm the fix didn't introduce regressions. Skipping re-verification and directly marking PASS is never permitted. The gate's output is deterministic; if the underlying problem still exists, re-running the gate will produce the same FAIL.

9.1 Dimension-by-Dimension Recovery Paths

Dimension	Repair Action	Re-verify Trigger
HTTP Status	Fix server configuration: wrong file path in nginx `root`/`alias`, missing `try_files`, or incorrect `location` block. If the path is correct but 404 persists, the file was never deployed; check the rsync target and file permissions.	nginx reload (config fix) or Redeploy (missing file) → Re-verify
Canonical	Fix the article HTML — locate the `<link rel="canonical">` tag and correct the URL to match the production URL exactly. Common fixes: change `http://` to `https://`, replace staging domain with production domain, add missing `.html` extension, or fix cross-language leakage (zh page pointing to en URL).	Redeploy → Re-verify
Hreflang	Fix the article HTML on both language versions — ensure each page declares the other as an alternate using `<link rel="alternate" hreflang="...">` tags with correct full URLs. Both pages must include `x-default` pointing to the same target. One-way hreflang requires fixing the missing side, not the declaring side.	Redeploy → Re-verify
JSON-LD	Fix the article HTML — ensure all three JSON-LD blocks (`Article`, `BreadcrumbList`, `FAQPage`) are present and contain valid JSON. Common fixes: add missing `@type` block, fix trailing commas or unescaped quotes, correct URLs from `localhost` to production domain.	Redeploy → Re-verify
FAQ Visibility	Fix the article HTML — ensure FAQ questions use semantic `<details>/<summary>` markup and are not hidden with `display:none` or `visibility:hidden` CSS. Verify that the number of visible FAQ elements matches the number declared in the `FAQPage` JSON-LD block.	Redeploy → Re-verify
Security Headers	Fix the nginx configuration: identify which `location` block is missing headers (likely due to the `add_header` inheritance trap). Add the five security header directives, each with the `always` parameter, to every `location` block that uses `add_header`. Alternatively, declare all `add_header` directives only in the `server` block and remove them from `location` blocks entirely.	nginx reload → Re-verify
Sitemap	Fix the sitemap generation script — ensure it includes both zh and en URLs for the article, produces a structurally complete XML document with closing `</urlset>` tag, and uses production domain URLs (no staging or localhost). Re-run the generation script and redeploy the sitemap.	Redeploy → Re-verify
Homepage Cards	Fix the homepage index file (`zh/index.html` or `en/index.html`) — add the missing article card with the correct link URL and title. If the card exists but the link is broken, fix the `href` attribute. Ensure both language homepages are updated — asymmetric card presence is a common failure.	Redeploy → Re-verify

The key pattern: dimensions that live in article HTML (canonical, hreflang, JSON-LD, FAQ visibility) follow the same recovery path — fix HTML → redeploy → re-verify. Dimensions that live in server configuration (HTTP status, security headers) are fixed in nginx config and require only nginx reload, not a full redeploy. Dimensions that live in infrastructure files (sitemap, homepage) require fixing the generation script or template, then redeploying the specific file.
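
The grouping described above can be expressed as two small lookup tables: one assigning each dimension to the layer it lives in, and one giving the recovery steps for that layer. The dictionary keys and the function name `recovery_steps` are illustrative choices, not identifiers from the gate scripts.

```python
# Layer assignment follows the grouping in the paragraph above.
LAYER_OF = {
    "canonical": "article_html",
    "hreflang": "article_html",
    "json_ld": "article_html",
    "faq_visibility": "article_html",
    "http_status": "server_config",
    "security_headers": "server_config",
    "sitemap": "infrastructure_file",
    "homepage_cards": "infrastructure_file",
}

# Every path ends with a full-suite re-verification, never a single-dimension re-check.
STEPS_FOR_LAYER = {
    "article_html": ["fix article HTML", "redeploy", "re-verify full suite"],
    "server_config": ["fix nginx config", "reload nginx", "re-verify full suite"],
    "infrastructure_file": [
        "fix generation script or template",
        "redeploy the file",
        "re-verify full suite",
    ],
}


def recovery_steps(dimension: str) -> list[str]:
    """Look up the recovery sequence for a failing dimension."""
    return STEPS_FOR_LAYER[LAYER_OF[dimension]]


print(recovery_steps("security_headers"))
print(recovery_steps("hreflang"))
```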

9.2 The Recovery Decision Tree

When the gate reports a FAIL, the first question is not "which dimension failed?" — it's "is the deployment structurally sound?" A dimension failure on a fundamentally broken deployment is noise; a dimension failure on a sound deployment is a targeted fix. The following decision tree guides the operator to the correct recovery branch:

Is the deploy structurally sound (files exist, server responds)?
├── YES → Fix content → Redeploy → Re-verify
│         (Canonical, hreflang, JSON-LD, FAQ, sitemap, homepage cards)
│
├── NO → Can it be fixed in-place?
│   ├── YES → Fix deploy → Re-verify
│   │         (nginx config: fix path → reload → re-verify)
│   │         (Security headers: fix nginx location block → reload → re-verify)
│   │
│   └── NO → Rollback → Fix offline → Redeploy
│             (HTTP 404/500 with unknown root cause)
│             (Corrupted deploy that damaged existing pages)
│             (Multiple dimensions failing simultaneously — likely systematic)

The YES branch (the deployment is structurally sound) covers the majority of gate failures: the page is online and accessible, but its metadata declarations are wrong. These are content fixes: edit the HTML file, redeploy, re-verify. The fix is localized to the affected source file and carries little risk of breaking other pages.

The NO branch (the deployment is not structurally sound) is rarer but more urgent. When the page is inaccessible (HTTP 404/500), the decision tree first asks whether the fix can be applied in place, meaning an nginx configuration change that takes effect on reload without disturbing running services. If so, the recovery is fast: edit the nginx config, reload, re-verify. If an in-place fix is impossible (the file system is corrupted, the deploy script produced a destructive outcome, or the root cause is unknown), the recovery path is the heavy option: roll back to the last known-good deployment, fix the root cause offline in a staging environment, then redeploy. This path is rarely taken but must exist, because a deployment that breaks existing pages cannot be left in production while the fix is developed.

A special case: multiple dimensions failing simultaneously strongly suggests a systematic deployment failure rather than individual content defects. If HTTP status, canonical, and hreflang all fail together, the problem is not three separate bugs — it's more likely that the wrong file was deployed, the nginx configuration is missing an entire location block, or the VPS filesystem is serving stale content. In this scenario, skip per-dimension diagnosis and proceed directly to the rollback → fix offline → redeploy path. Attempting to fix individual dimensions when the deployment itself is broken wastes time and risks introducing additional inconsistencies.
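
The decision tree, including the special case for simultaneous failures, can be written as a small function. This is a sketch under stated assumptions: the branch labels are invented, and the threshold of three failing dimensions mirrors the example above rather than any rule the gate is known to enforce.

```python
def choose_recovery_branch(
    structurally_sound: bool,
    fixable_in_place: bool,
    failed_dimensions: list[str],
    systematic_threshold: int = 3,  # assumption: tune to your own failure history
) -> str:
    """Map the decision tree's questions onto one of three recovery branches."""
    # Special case: many dimensions failing together suggests a systematic fault,
    # so skip per-dimension diagnosis and roll back.
    if len(failed_dimensions) >= systematic_threshold:
        return "rollback_fix_offline_redeploy"
    if structurally_sound:
        return "fix_content_redeploy"
    if fixable_in_place:
        return "fix_in_place_reload"
    return "rollback_fix_offline_redeploy"


print(choose_recovery_branch(True, False, ["canonical"]))
print(choose_recovery_branch(False, True, ["security_headers"]))
print(choose_recovery_branch(True, True, ["http_status", "canonical", "hreflang"]))
```

Every branch still ends with a full re-verification; the function only selects how the repair is carried out.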

9.3 Recovery Walkthrough: Solving a Canonical Leakage Failure

To make the recovery paths concrete, consider a real scenario. The VERIFIED Gate reports a Tier 1 failure on the canonical dimension for the zh version of an article: the Chinese page's <link rel="canonical"> points to https://xslyl.com/en/posts/slug.html instead of the self-referencing https://xslyl.com/zh/posts/slug.html. This is a cross-language canonical leakage: the zh page tells Google it is a duplicate of the en page, which can cause the Chinese content to drop out of the index.

The Diagnose phase: the gate's canonical check function (Section 4.1) already identified the exact mismatch — expected URL vs. found URL — and included both values in the verify-report.json evidence. The operator does not need to manually fetch and inspect the HTML; the gate's per-dimension output pinpoints the failure. The evidence shows expected: https://xslyl.com/zh/posts/slug.html vs. found: https://xslyl.com/en/posts/slug.html. The diagnosis is immediate: the Agent that generated the zh HTML used the en canonical template without substituting the language path.

The Repair phase: the fix is a one-line change in the zh article's HTML — replace "https://xslyl.com/en/posts/slug.html" with "https://xslyl.com/zh/posts/slug.html" in the <link rel="canonical"> tag. This fix is localized to the article source file and carries zero risk to other pages. Importantly, the fix must happen in the repository, not directly on the VPS — editing the file on the production server bypasses git history and creates a divergence that the next deployment will overwrite.

The Re-verify phase: after the fix is committed and redeployed, the gate re-runs the full suite — not just the canonical check. The re-verification catches a regressive interaction that the operator did not anticipate: fixing the canonical tag corrected the zh page, but the redeploy process unintentionally reset the sitemap to a version that excludes the zh URL (a stale sitemap template was included in the deploy package). The gate now reports a sitemap failure. Without a full re-verify, this regression would have gone unnoticed until Google Search Console reported a crawl error days later. The full re-verify catches it immediately, and the operator fixes the sitemap before the deployment is certified as VERIFIED.

This walkthrough illustrates the most important operational principle: never re-verify a single dimension in isolation. The repair may fix the dimension that failed, but the redeploy process may introduce a regression in a dimension that previously passed. Only a full suite re-run can certify that the deployment is correct across all dimensions.
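
The value of a full re-run can be shown by comparing per-dimension verdicts before and after a repair. The sketch assumes each run is summarized as a flat mapping from dimension name to verdict; the actual structure of verify-report.json is not shown here, and `find_regressions` is a hypothetical helper.

```python
def find_regressions(before: dict[str, str], after: dict[str, str]) -> list[str]:
    """Return dimensions that passed in the earlier run but no longer pass."""
    return sorted(
        dimension
        for dimension, verdict in after.items()
        if verdict != "PASS" and before.get(dimension) == "PASS"
    )


# Verdicts modeled on the walkthrough: the canonical fix lands, the sitemap regresses.
before = {"canonical": "FAIL", "hreflang": "PASS", "sitemap": "PASS"}
after_full_run = {"canonical": "PASS", "hreflang": "PASS", "sitemap": "FAIL"}
after_single_check = {"canonical": "PASS"}  # only the repaired dimension re-checked

regressed = find_regressions(before, after_full_run)
unnoticed = find_regressions(before, after_single_check)
print(regressed, unnoticed)
```

The single-dimension re-check reports nothing because it never looked at the sitemap, which is exactly the blind spot the full-suite rule closes.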

10. Integrating the VERIFIED Gate into an Agent Release Pipeline

The previous nine sections defined what the VERIFIED Gate checks and how to recover from failures. This final section addresses the operational question: who runs the gate, when does it run, and how does its output integrate into the broader release pipeline?

On xslyl.com, the VERIFIED Gate is not a CI/CD plugin, not a GitHub Action, and not a cron job. It is executed by Codex, the Engineering QA & Gate Worker agent, as a deliberate post-deployment step. This architectural choice — running the gate from the repo root rather than from CI/CD — is intentional and carries specific trade-offs that this section explains.

10.1 Why Codex Executes the Gate

CI/CD platforms (GitHub Actions, Jenkins, GitLab CI) typically run jobs in a constrained environment: hosted runners start from an ephemeral file system, network access is limited to whatever the runner's VM is allowed to reach, and output is primarily structured as log streams. This environment is poorly suited to the VERIFIED Gate's requirements:

Repo access: The gate reads verification scripts (check_verified_gate.py, check_policy_ack.py), deploy reports (deploy-report.json), and task metadata (status.json) from the repository. Codex has native file system access to the repo root.
Online verification capability: The gate issues HTTPS requests to the production server, verifies response headers and bodies, and cross-validates URLs across multiple pages. CI/CD runners may be subject to network egress restrictions that block or rate-limit these requests.
Audit trail in GitHub: The gate's output — verify-report.json (structured evidence) and final-report.md (human-readable summary) — is committed to the repository, creating a permanent, version-controlled record of every verification event. CI/CD artifacts expire after a retention period; committed files persist with the repository history.
Stateful decision context: Codex reads the full deployment state — deploy method, SHAs, task metadata — and can make tier-based gate decisions (PASS vs. WARN vs. FAIL) that CI/CD's binary pass/fail model cannot express.

Codex is not a replacement for CI/CD — it is a post-CI/CD verification agent. The pipeline flow is: CI/CD deploys → CI/CD confirms deployment completed → Codex executes the VERIFIED Gate → Codex writes results → Hermes-Agent reports the outcome. Codex's role is specifically the verification step; it does not deploy, does not trigger rollbacks, and does not make human-judgment calls on tier-2 warnings. It runs the scripts, collects the output, and reports deterministically.

10.2 The Gate Execution Function

The core gate logic shown below is a conceptual architecture — it represents the design intent of what a VERIFIED Gate should do at the architectural level. The actual check_verified_gate.py script in the repository implements a two-layer split: the online verification functions (the online_ok block below) are executed by a separate deploy-time verifier that writes results to verify-report.json. The repository-side script then validates this pre-recorded evidence, ensuring consistency across deploy-report.json, status.json, and final-report.md. This split was an operational evolution: the live checks run in the deploy context where network access to the production server is guaranteed, while the evidence validation runs from the repo root where git history and script versions are available.

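def run_verified_gate(task_id: str) -> str:
    # Conceptual sketch: deploy_report, zh_url, en_url, zh_html, and en_html are
    # assumed to be loaded from the task context for task_id (loading omitted).

    # Layer 1: deployment integrity (Section 7).
    deploy_ok = check_deploy_integrity(deploy_report)

    # Layer 2: online content verification (Sections 2-6). The list is built before
    # all() runs, so every check executes and records evidence even after a failure.
    online_ok = all([
        verify_http_and_content(zh_url, task_id),
        verify_http_and_content(en_url, task_id),
        check_seo_metadata(zh_html, zh_url, en_url),
        check_seo_metadata(en_html, en_url, zh_url),
        check_security_headers(zh_url),
        check_security_headers(en_url),
        check_sitemap_and_homepage(task_id),
    ])

    # Layer 3: meta-checks on the verification infrastructure (Section 8).
    meta_ok = check_policy_ack(task_id) and check_script_versions()

    return "VERIFIED" if all([deploy_ok, online_ok, meta_ok]) else "VERIFICATION_FAILED"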

This function encodes the three-layer verification architecture described throughout this article:

deploy_ok (Deployment Integrity, Section 7): Validates that the deployment pipeline itself functioned correctly: three-way SHA consistency, standard vs. emergency path, dry-run detection. If the deploy is structurally unsound, no amount of online content verification can produce a trustworthy result. This check is evaluated first, and a False result here fails the gate regardless of the online results.
online_ok (Online Content Verification, Sections 2-6): The 10+ dimension matrix executed against the live production server. Each of the seven function calls inside all() encapsulates multiple dimension checks: verify_http_and_content covers HTTP status, Content-Type, page size, and slug-in-title; check_seo_metadata covers canonical, hreflang, JSON-LD types, and FAQ visibility; check_security_headers checks all five headers; check_sitemap_and_homepage covers sitemap presence and homepage card links. The all() wrapper enforces the gate's core rule: every dimension must pass.
meta_ok (Meta-Checks, Section 8): Validates the verification infrastructure itself, namely gate script version consistency and policy acknowledgment compliance. These checks are independent of the deployed content; they verify that the verification process is trustworthy.

The final line — "VERIFIED" if all([deploy_ok, online_ok, meta_ok]) else "VERIFICATION_FAILED" — is the gate's binary output. Note that the tier system (Section 2.11) is implemented inside the individual check functions, not at this top level. check_seo_metadata, for example, returns True for Tier 3/4 failures with a WARNING note, but returns False for Tier 1/2 failures. The top-level all() sees only the boolean outcome, while the granular tier information is preserved in the dimension-level evidence inside verify-report.json.

10.3 Gate Output Artifacts

The gate produces two output files, each serving a distinct consumer:

Artifact	Format	Contents	Consumer
verify-report.json	JSON	Per-dimension PASS/FAIL/WARN/NOTE verdicts with detailed evidence — the exact HTTP status code, the canonical URL found vs. expected, the JSON-LD types detected, the security headers present vs. missing, SHA comparison results. Machine-readable for downstream automation.	Monitoring systems, alerting pipelines, state machine transitions, dashboards
final-report.md	Markdown	Human-readable summary: overall verdict, pass/fail breakdown by dimension, tier-level severity classification, recovery recommendations for failed dimensions, and a timestamped audit log of the verification run.	Human operators, Hermes-Agent reporting, pull request comments, incident postmortems

Both files are committed to the repository at .agent-workspace/tasks/<task_id>/, alongside the deploy report and task metadata. This co-location creates a self-contained audit package: for any given task, the directory contains the deployment evidence (how the content reached the server) and the verification evidence (whether the deployed content is correct). Six months later, when investigating an SEO regression, the entire chain of custody is in one place.

10.4 Gate Position in the Release Pipeline

The VERIFIED Gate is Phase 8 — the final stage — in the 8-gate Agent release pipeline defined by Agent Release Gate Design. Its position is deliberate: it runs after deployment is complete (Phase 7) and before the release is marked as done. The pipeline sequence is:

Phases 1-5: Content creation, structural validation, ready-gate checks (pre-deployment quality assurance).
Phase 6: Human review and approval (optional, policy-dependent).
Phase 7: Deployment — CI/CD transfers files to the VPS, executes the deploy script, reloads nginx.
Phase 8 (VERIFIED Gate): Codex executes post-deployment verification. This is the only phase that runs against the live production environment. All pre-deployment checks (Phases 1-5) run against local files and build artifacts; the VERIFIED Gate is the first and only check that verifies the actual deployed state.
Completion: Hermes-Agent reads verify-report.json, reports the final status to the user, and updates the task state machine.

A critical architectural constraint: the VERIFIED Gate must run from the repo root, not from CI/CD. This is not an implementation convenience — it is a design requirement. Running from the repo root means:

Git history is available for SHA comparison and commit verification.
Gate scripts are version-controlled — the scripts that ran for this verification are the exact scripts committed at this point in history. Reproducibility is guaranteed: checking out the same commit and running the same scripts will produce the same result.
Audit trail is committed directly — the gate's output files are written to the repo and committed as part of the verification workflow. There is no separate artifact store, no external database, no dependency on CI/CD retention policies. The repository is the single source of truth for both the content and the evidence that the content is correct.

If the VERIFIED Gate were a CI/CD step, its output would live in GitHub Actions artifacts with a 90-day retention window, its scripts would be whatever version was checked out at build time (potentially different from the current repo state), and the SHA comparison would be comparing CI/CD environment variables rather than on-disk repository history. Running from the repo root eliminates all of these indirections and makes the verification process auditable, reproducible, and self-contained.

10.5 Monitoring Integration and Alerting

The VERIFIED Gate's output does not end with the gate verdict. The evidence collected during verification — HTTP status codes, header values, SEO metadata correctness, security header presence — is valuable for ongoing monitoring beyond the deployment moment. The verify-report.json structure is designed to be consumed by monitoring systems: its per-dimension verdicts can be ingested into dashboards, its SHA comparison results can trigger alerts, and its tier classification informs incident severity.

Three monitoring integration patterns are worth considering:

Pattern	How It Works	Best For
Dashboard enrichment	Parse verify-report.json after each deployment and publish dimension PASS/FAIL rates, SHA consistency, and security header completeness to a real-time dashboard (Grafana, Datadog). Track dimension failure rates over time to identify systemic issues.	Teams that want to see how deployment health trends over weeks and months
Scheduled re-verification	Run the VERIFIED Gate's online verification dimensions (Sections 2-6) as a cron job every 15-30 minutes against production URLs. Detect silent failures that occur between deployments: a misconfigured CDN that starts stripping headers, a robots.txt that gets overwritten, or an expired TLS certificate that breaks HTTPS.	Teams with multiple deployers or shared infrastructure where config drift between deployments is a concern
Alert routing by tier	Route Tier 1 failures (HTTP, canonical, hreflang) to on-call engineers via PagerDuty or OpsGenie with high urgency. Route Tier 2 failures (JSON-LD, security headers) to Slack with medium priority during business hours. Route Tier 3-4 failures (FAQ, sitemap, homepage) to a weekly review board.	Teams that need to calibrate alert fatigue against the tier severity system

The key architectural decision in monitoring integration: the VERIFIED Gate is a deployment-time check, not a continuous monitoring system. Its primary output is the per-deployment verification result. Extending it to continuous monitoring is optional and should be implemented as a separate cron job or scheduler that reuses the gate's check functions rather than re-executing the full gate suite. The verify_http_and_content(), check_seo_metadata(), and check_security_headers() functions described in Sections 3-6 are designed to be independently importable — they have no dependency on the deploy report, the task directory structure, or the meta-check infrastructure. This makes them natural building blocks for a continuous monitoring system.
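
The tier-based alert routing pattern from the table can be sketched as a lookup from dimension to tier to destination. The tier assignments follow the table above; the route labels (`pager`, `chat`, `weekly_review`) and the flat verdict mapping are assumptions for illustration, and no real paging or chat API is called.

```python
# Tier assignments as listed in the alert-routing row of the table above.
TIER_OF = {
    "http_status": 1, "canonical": 1, "hreflang": 1,
    "json_ld": 2, "security_headers": 2,
    "faq_visibility": 3, "homepage_cards": 3,
    "sitemap": 4,
}

# Hypothetical destination labels; wire these to real integrations as needed.
ROUTE_FOR_TIER = {1: "pager", 2: "chat", 3: "weekly_review", 4: "weekly_review"}


def route_failures(verdicts: dict[str, str]) -> dict[str, list[str]]:
    """Group every non-passing dimension under the destination for its tier."""
    routes: dict[str, list[str]] = {}
    for dimension, verdict in verdicts.items():
        if verdict == "PASS":
            continue
        destination = ROUTE_FOR_TIER[TIER_OF[dimension]]
        routes.setdefault(destination, []).append(dimension)
    return routes


routes = route_failures(
    {"canonical": "FAIL", "json_ld": "PASS", "sitemap": "NOTE", "faq_visibility": "WARN"}
)
print(routes)
```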

10.6 Design Principles Summary

Throughout this article, five recurring design principles emerge that distinguish the VERIFIED Gate from conventional deployment verification approaches. These principles are worth stating explicitly as a reference for teams implementing their own gates:

Independent dimensions, composite verdict. Each of the 10+ verification dimensions is independently judged PASS/FAIL/WARN/NOTE. The composite verdict is the most severe outcome across all dimensions, never an average or weighted score. This ensures that a single critical failure — a cross-language canonical leak — is not masked by nine passing dimensions.
Deterministic execution, auditable evidence. The gate does not use probabilistic models, threshold-based scoring, or human judgment. Every check has a deterministic pass/fail criterion encoded in code. Every run produces structured evidence (verify-report.json, deploy-report.json, status.json) that is committed to the repository for permanent auditability.
Multi-layer verification with tiered consequences. Verification is organized into three layers — deployment integrity (Section 7), online content verification (Sections 2-6), and meta-checks (Section 8). Each layer has a distinct failure consequence: deployment integrity failures block the gate, content verification failures produce tier-graded outcomes (hard fail, warning, note), and meta-check failures invalidate the entire verification process.
Repository-as-database for audit trail. All verification evidence is committed to the git repository alongside the deployed content. This ensures indefinite retention, version-controlled history, and reproducibility — any deployment can be re-verified by checking out the commit and re-running the gate at that version.
Recovery paths are part of the gate design. The gate does not just report failure — it is designed with explicit recovery paths for each dimension (Section 9), a decision tree for systematic vs. individual failures, and the principle that re-verification must be a full suite run, never a single-dimension re-check.

These principles are not specific to xslyl.com. They apply to any deployment pipeline where the gap between “deploy succeeded” and “content is correct” creates risk. The number of dimensions, the specific checks, and the tier assignments will vary by site architecture — but the pattern of independent, deterministic, tiered, auditable verification is universal.

FAQ — Frequently Asked Questions

Q1: What's the difference between a VERIFIED gate and a smoke test?

A smoke test answers one question: "Is the service alive?" It checks that a port is listening, a health endpoint returns 200, and the application hasn't crashed. It is a binary liveness probe — the service is either up or down.

The VERIFIED Gate answers a fundamentally different question: "Is the deployed content correct and complete across all dimensions that matter to search engines and users?" It checks 10+ independent dimensions — canonical self-reference, hreflang bidirectionality, JSON-LD structural validity, security header presence, sitemap completeness, homepage card consistency — each of which can fail while the service is "alive" and returning 200.

Concrete difference: a smoke test sees a page returning HTTP 200 with valid HTML and reports PASS. The VERIFIED Gate sees the same page, detects that its canonical URL points to the staging domain, its hreflang tags are one-way only, and its JSON-LD has a trailing comma, and reports FAIL on three dimensions. The page is "alive," but search engines may index the wrong URL, miss the language alternates, and skip the rich results. Smoke tests verify liveness; VERIFIED gates verify correctness.
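As a rough illustration of two checks a smoke test never performs, the sketch below tests canonical self-reference and hreflang bidirectionality on metadata that has already been extracted. The function names, the input shape, and the example.com URLs are hypothetical; they are not taken from the gate's actual scripts.

```python
from __future__ import annotations


def check_canonical(page_url: str, canonical: str | None) -> bool:
    """PASS only when a canonical URL is present and points at the page itself."""
    if canonical is None:
        return False
    return canonical.rstrip("/") == page_url.rstrip("/")


def check_hreflang_bidirectional(pages: dict[str, dict[str, str]]) -> bool:
    """pages maps each page URL to its {lang: href} hreflang annotations.

    PASS only when every alternate a page links to also links back to it.
    The x-default entry is skipped because it is not a language pair.
    """
    for url, alternates in pages.items():
        for lang, href in alternates.items():
            if lang == "x-default" or href == url:
                continue
            target = pages.get(href)
            if target is None or url not in target.values():
                return False
    return True
```

A page whose canonical points at a staging host, or whose Chinese version does not link back to the English one, returns False here even though both pages answer with HTTP 200.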

Q2: If one dimension fails but all others pass, what verdict does the gate produce?

The gate verdict depends on which dimension fails, not just that one failed. The tier system (Section 2.11) encodes this:

Tier 1 failure (HTTP status, canonical, hreflang) → Gate MUST fail. Hard block. No override. These three dimensions are the minimum viable deployment contract.
Tier 2 failure (JSON-LD, security headers) → Gate SHOULD fail. The release should be blocked unless the failure is a known pre-existing condition confirmed to exist before this deployment.
Tier 3 failure (FAQ visibility, homepage cards) → Gate passes with WARNING. The article is accessible and indexable; the issue is tracked for repair in the next deployment cycle.
Tier 4 failure (sitemap) → Gate passes with NOTE. Delayed discovery is logged but does not prevent crawling.

The overall gate verdict is the most severe tier-level outcome across all dimensions. One Tier 1 failure with every other dimension passing = gate FAIL. One Tier 3 failure with every other dimension passing = gate PASS_WITH_WARNING. The gate never averages or "scores" dimensions; it applies the tier system deterministically.
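The "most severe outcome wins" rule can be sketched as a small lookup plus a maximum. The dimension keys and the PASS_WITH_NOTE name are assumptions made for illustration, and the Tier 2 exception for known pre-existing conditions is left out to keep the sketch short.

```python
from __future__ import annotations

from enum import IntEnum


class Outcome(IntEnum):
    """Ordered so that a larger value is a more severe verdict."""
    PASS = 0
    PASS_WITH_NOTE = 1      # assumed name for the Tier 4 outcome
    PASS_WITH_WARNING = 2
    FAIL = 3


# Hypothetical dimension keys, tiered as described in the text above.
TIER_OF = {
    "http_status": 1, "canonical": 1, "hreflang": 1,
    "json_ld": 2, "security_headers": 2,
    "faq_visibility": 3, "homepage_cards": 3,
    "sitemap": 4,
}

# What a failure in each tier does to the gate.
FAILURE_OUTCOME = {
    1: Outcome.FAIL,
    2: Outcome.FAIL,
    3: Outcome.PASS_WITH_WARNING,
    4: Outcome.PASS_WITH_NOTE,
}


def gate_verdict(results: dict[str, bool]) -> Outcome:
    """Return the most severe outcome across all failed dimensions."""
    worst = Outcome.PASS
    for dimension, passed in results.items():
        if not passed:
            worst = max(worst, FAILURE_OUTCOME[TIER_OF[dimension]])
    return worst
```

Because the verdict is a maximum rather than a sum or average, adding more passing dimensions can never dilute a single Tier 1 failure.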

Q3: When is emergency deploy (scp/rsync) allowed, and why does it get VERIFIED_WITH_WARNING?

Emergency deployment is allowed when the standard deploy path is unavailable: SSH connectivity issues between the VPS and GitHub, git protocol timeouts, disk-full conditions preventing git pull, or a GitHub Actions outage. It is never the default and always requires explicit user approval documented in the deploy report.

The result is VERIFIED_WITH_WARNING rather than VERIFIED because three guarantees are absent:

Git SHA consistency cannot be proven. Without git pull on the VPS, there is no vps_sha to cryptographically compare against github_sha. The gate can verify the content looks correct (external checks still run), but it cannot prove the deployed bytes match the committed bytes.
Partial deployment risk. scp and rsync copy only the paths the operator names, and a human operating under pressure may omit a file. Standard deployment moves the VPS checkout to a single git commit, so every tracked file is updated together and the omission risk disappears.
Rollback path is lost. Standard deployment leaves the VPS at a known git commit, enabling git revert. Emergency deployment bypasses git — there is no commit record on the VPS to roll back to. Recovery requires a subsequent full standard deployment.

The WARNING status signals: "The content checks pass, but the deployment integrity cannot be cryptographically verified. Review the emergency approval record and schedule a standard deployment to restore the integrity chain."

Q4: Do I need all 10 checks for my site? How do I adapt the dimension matrix?

The 10-dimension matrix is designed for xslyl.com's specific architecture — a bilingual static site with structured data, security headers, and sitemap-based discovery. Your site may need fewer dimensions, different dimensions, or additional dimensions. The adaptation principle is:

Keep Tier 1 (HTTP status, canonical, hreflang if multilingual). These are universal — every site needs to verify that pages are accessible and declare their identity correctly. For a monolingual site, drop hreflang and keep HTTP + canonical.
Adapt Tier 2 to your site's SEO and security surface. If you use JSON-LD, keep the structured data check. If you don't, drop it. Keep security headers if you serve over HTTPS (you should). Add dimensions specific to your stack: OpenGraph tags, Twitter Cards, CSP headers, cache headers.
Tier 3-4 are site-specific infrastructure checks. Sitemap presence matters if you use sitemaps. Homepage cards matter if you have a card-based index. Replace these with your site's equivalent: RSS feed presence, search index update, CDN purge confirmation.
Always include the meta-checks (Section 8). Script versioning and policy acknowledgment are independent of your site's architecture — they verify the verification infrastructure itself.

The matrix is not a compliance checklist. It's a design template: identify the dimensions that matter for your site's correctness, tier them by failure consequence, and implement deterministic PASS/FAIL checks for each. The number 10 is specific to xslyl.com; the pattern — tiered, independent, deterministic — is universal.
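One way to picture the adaptation is to derive the matrix from a few facts about the site. The sketch below is only an illustration: the Dimension type, the build_matrix function, its flags, and the dimension names are hypothetical, and the tier numbers simply mirror the tiers described in this article.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Dimension:
    name: str
    tier: int


def build_matrix(multilingual: bool, uses_json_ld: bool,
                 uses_sitemap: bool, has_card_index: bool) -> list[Dimension]:
    """Assemble a tiered check matrix from site characteristics."""
    # Tier 1 is kept for every site: reachability and declared identity.
    matrix = [Dimension("http_status", 1), Dimension("canonical", 1)]
    if multilingual:
        matrix.append(Dimension("hreflang", 1))
    # Tier 2 follows the site's SEO and security surface.
    if uses_json_ld:
        matrix.append(Dimension("json_ld", 2))
    matrix.append(Dimension("security_headers", 2))
    # Tiers 3 and 4 are site-specific infrastructure checks.
    if has_card_index:
        matrix.append(Dimension("homepage_cards", 3))
    if uses_sitemap:
        matrix.append(Dimension("sitemap", 4))
    return matrix
```

For a monolingual blog with a sitemap but no structured data or card index, build_matrix(False, False, True, False) yields four dimensions: http_status, canonical, security_headers, and sitemap.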

Q5: Do you run all 10 dimensions on every deploy? What about the time cost?

Yes, all dimensions run on every deployment to the production environment. The total execution time is under 2 seconds for a typical article page. Here's the breakdown:

HTTP requests: 5-7 HTTPS calls (2 article pages × 1-2 checks each, plus sitemap, plus 2 homepages). Each call takes 50-200ms depending on network latency. Total: ~1 second.
HTML parsing: regex-based extraction (canonical, hreflang, JSON-LD types, FAQ elements, title). No DOM construction, no browser rendering. Total: ~100ms.
Deploy integrity: Reading the deploy report JSON and comparing SHAs. No network calls. Total: ~10ms.
Meta-checks: Reading script files and extracting version constants. Local file I/O only. Total: ~50ms.

The key design decision that keeps this fast: the gate does not render pages in a browser. It performs lightweight HTTP + regex checks that validate structural and metadata correctness without the overhead of a headless browser, DOM construction, or JavaScript execution. For a static site like xslyl.com, this is sufficient because the content is in the HTML source. For a JavaScript-heavy SPA, you would add a browser-based rendering dimension (which does increase time cost), but the core matrix remains lightweight.
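A minimal sketch of what regex-based extraction can look like follows. The patterns and the extract_metadata function are illustrative, not the gate's actual code, and they assume that rel and hreflang appear before href inside each link tag; a template with a different attribute order would need adjusted patterns.

```python
import json
import re

# Assumption: rel/hreflang precede href inside each <link> tag.
CANONICAL_RE = re.compile(
    r'<link[^>]*rel=["\']canonical["\'][^>]*href=["\']([^"\']+)["\']', re.I)
HREFLANG_RE = re.compile(
    r'<link[^>]*hreflang=["\']([^"\']+)["\'][^>]*href=["\']([^"\']+)["\']', re.I)
JSONLD_RE = re.compile(
    r'<script[^>]*type=["\']application/ld\+json["\'][^>]*>(.*?)</script>',
    re.I | re.S)
FAQ_RE = re.compile(r"<details[\s>].*?<summary[\s>]", re.I | re.S)


def extract_metadata(html: str) -> dict:
    """Pull gate-relevant metadata out of raw HTML without building a DOM."""
    canonical = CANONICAL_RE.search(html)
    jsonld_types, jsonld_valid = [], True
    for block in JSONLD_RE.findall(html):
        try:
            data = json.loads(block)
        except json.JSONDecodeError:
            # A trailing comma or similar syntax error lands here.
            jsonld_valid = False
            continue
        items = data if isinstance(data, list) else [data]
        jsonld_types += [i.get("@type") for i in items if isinstance(i, dict)]
    return {
        "canonical": canonical.group(1) if canonical else None,
        "hreflang": dict(HREFLANG_RE.findall(html)),
        "jsonld_types": jsonld_types,
        "jsonld_valid": jsonld_valid,
        "has_faq": bool(FAQ_RE.search(html)),
    }
```

Each pattern is a single pass over the source text, which is why this step stays in the range of milliseconds. The trade-off is brittleness: regexes tolerate less markup variation than a real HTML parser would.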

The 2-second cost per deployment is negligible compared to the deployment process itself (git pull + rsync + nginx reload, typically 5-15 seconds). The gate is not a bottleneck — it's the fastest step in the pipeline.

Q6: If GitHub and VPS commit SHAs don't match but the live content is correct, where's the bug?

This scenario — mismatched SHAs but passing content checks — points to one of four bugs, listed from most to least common:

Stale VPS checkout (most common). The VPS's git pull failed silently (network timeout, permission error, detached HEAD state) but the deploy script ran anyway against the old checkout. The old content was also correct — same canonical, same JSON-LD — so external checks pass. The new content never deployed. Fix: diagnose why git pull failed and redeploy.
Deploy script ran before git pull completed. A race condition in the deployment orchestration: the deploy script fired while git pull was still fetching. The script deployed a mix of old and new files — enough new files to pass content checks, but not the complete set. Fix: add explicit sequencing (wait for git pull exit code before invoking deploy script).
GitHub SHA recorded incorrectly. The workflow captured the wrong commit SHA — e.g., the PR branch HEAD instead of the merge commit, or the workflow run's own commit instead of the triggering commit. The actual deployment used the correct commit, but the recorded github_sha is wrong, creating a spurious mismatch. Fix: audit the SHA extraction logic in the GitHub Actions workflow.
Direct file modification on the VPS (least common, most dangerous). Someone modified files directly on the VPS (vim /var/www/html/en/posts/...html) without going through git. The content is "correct" but the git repository is now dirty — the VPS has uncommitted changes that will be overwritten by the next git pull. Fix: identify the direct edit, commit it properly, and lock down VPS file permissions to prevent recurrence.

The diagnostic approach: if SHAs mismatch, do not trust the content checks. The mismatch means the deployment pipeline itself failed at some step. Investigate the pipeline failure first; the content checks may be passing against stale or partial content. A SHA mismatch is always a deployment integrity failure, regardless of what the external checks report.
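The rule that a SHA mismatch overrides passing content checks can be expressed in a few lines. In this sketch the github_sha and vps_sha field names come from the article, while the function names, the UNVERIFIABLE and FAILED labels, and the flat shape of the deploy report are assumptions.

```python
def deploy_integrity(report: dict) -> str:
    """Classify deployment integrity from a parsed deploy report."""
    github_sha = (report.get("github_sha") or "").strip().lower()
    vps_sha = (report.get("vps_sha") or "").strip().lower()
    if not github_sha or not vps_sha:
        # No SHA pair to compare, as after an emergency scp/rsync deploy.
        return "UNVERIFIABLE"
    return "PASS" if github_sha == vps_sha else "FAIL"


def final_status(integrity: str, content_checks_pass: bool) -> str:
    """Combine integrity with content results; integrity is evaluated first."""
    if integrity == "FAIL" or not content_checks_pass:
        # A mismatch fails the gate even when every content check passes.
        return "FAILED"
    if integrity == "UNVERIFIABLE":
        return "VERIFIED_WITH_WARNING"
    return "VERIFIED"
```

Checking integrity before content makes the ordering explicit: passing content checks are never allowed to upgrade a deployment whose SHAs disagree.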

Q7: How do I add the VERIFIED Gate to my existing CI/CD pipeline without an Agent orchestrator?

The VERIFIED Gate does not require Codex or any Agent orchestrator. You can implement it as a standalone CI/CD stage. The key architectural decision is where the gate runs, not who runs it. Two patterns work for non-Agent pipelines:

CI/CD job after deploy (recommended for most teams): Add a deployment-verification job to your CI/CD pipeline that runs after the deploy step. The job calls a script that issues HTTPS requests against the production URLs and validates all 10 dimensions. If any dimension fails, the job exits non-zero, causing the CI/CD pipeline to fail. The advantage: the gate runs automatically with every deployment. The disadvantage: CI/CD artifacts expire — verification history outside the repo must be managed separately.
Scheduled cron job with alerting (recommended for teams with existing monitoring): Run the gate as a cron job (e.g., every 15 minutes) that checks all production pages. When a check fails, it sends an alert to your monitoring system (PagerDuty, Slack, email). This pattern catches silent failures that happen between deployments — a server misconfiguration that strips security headers, a stale robots.txt that blocks new content, or a CDN that serves cached versions. The trade-off: response time is bounded by the check interval, not immediate.

In both patterns, the gate script is the same — it's the checking logic that matters, not the execution context. The key adaptation for CI/CD integration: ensure the script has production URL access (some CI/CD runners have restricted network egress) and that the script version is pinned to the repository version (not a CI/CD plugin version that may diverge).
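A standalone gate script can be very small at its core. The sketch below implements only the HTTP status dimension and returns a process exit code; fetch_status and run_gate are hypothetical names, and the fetch function is injectable so the logic can be exercised without network access.

```python
import sys
import urllib.error
import urllib.request
from typing import Callable, Iterable


def fetch_status(url: str) -> int:
    """Return the HTTP status for url, or 0 when the request cannot complete."""
    try:
        with urllib.request.urlopen(url, timeout=10) as response:
            return response.status
    except urllib.error.HTTPError as error:
        return error.code          # 4xx/5xx responses arrive as exceptions
    except OSError:
        return 0                   # DNS failure, refused connection, timeout


def run_gate(urls: Iterable[str],
             fetch: Callable[[str], int] = fetch_status) -> int:
    """Check every URL and return 0 on success, 1 if any check failed."""
    failures = [url for url in urls if fetch(url) != 200]
    for url in failures:
        print(f"FAIL http_status {url}", file=sys.stderr)
    return 1 if failures else 0

# In a CI job or cron wrapper, the script would end with something like:
#   sys.exit(run_gate(sys.argv[1:]))
```

The non-zero exit code is what lets a CI/CD job fail the pipeline, while a cron wrapper can use the same return value to decide whether to send an alert. The remaining dimensions would plug in as further checks inside run_gate.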

Next Steps

The VERIFIED Gate is Phase 8, the final stage, of the 8-stage Agent release pipeline. Each article in this series covers a different phase or supporting infrastructure. Use these articles to understand the complete system:

Agent Release Gate Design

The parent article defining the complete 8-stage release pipeline. The VERIFIED Gate is Phase 8 — the final post-deployment verification stage. This article is the deep-dive into Phase 8's design, implementation, and operational semantics.

Agent Observability: Monitoring Signals for Autonomous Deployments

Online verification depends on observability signals — HTTP status, page content, response headers, structured data extraction. This article establishes the monitoring framework that the VERIFIED Gate consumes: what signals to collect, how to store them, and how to query them for gate checks.

Agent State Machine Design: Deployment State Transitions

Gate pass/fail verdicts trigger state transitions in the deployment state machine. The deployed → verified transition is critical: it is the moment when a deployment moves from "bytes on server" to "content confirmed correct." This article defines the state machine that the VERIFIED Gate feeds into.

Agent Context Protocol Design: Structured Evidence Between Gates

Structured verification evidence — verify-report.json, deploy-report.json, status.json — passes between gates in the release pipeline. This article defines the data contracts that the VERIFIED Gate produces and the downstream systems (monitoring, alerting, rollback) consume.

Agent Security Evaluation: Threat Modeling for Autonomous Deployments

The VERIFIED Gate's security header checks (Section 5) are one component of a broader security evaluation framework. This article covers the full threat model for Agent-driven release pipelines, including supply chain attacks, credential management, and incident response automation.

Agent Audit Log Design: Immutable Records for Autonomous Actions

The VERIFIED Gate's structured evidence — verify-report.json and deploy-report.json — is a specialized form of audit record. This article defines the general audit log framework for Agent operations: how to design immutable, tamper-evident logs that record every Agent action across the deployment pipeline.

If you are designing an Agent release pipeline from scratch, read the series in order starting with Agent Release Gate Design, which defines the overall pipeline architecture of which the VERIFIED Gate is the final phase. If you are adding verification to an existing deployment pipeline, this article can be read independently: implement the 10-dimension check matrix (Section 2), add the meta-checks (Section 8), and integrate the check into your post-deploy workflow (Section 10). The core insight is always the same: CI/CD proves bytes arrived; the VERIFIED Gate proves content is correct.

The VERIFIED Gate described in this article is a living system, not a static specification. As your site's architecture evolves — adding new content types, security requirements, or SEO dimensions — the gate's check matrix should evolve with it. The meta-checks (Section 8) ensure that the gate suite evolves as a coherent unit: when you add a new dimension, update all gate scripts simultaneously, commit them together, and let the version consistency check enforce the update discipline. A gate that never changes is a gate that has fallen behind the deployment pipeline it was designed to protect. Build the gate, run it on every deployment, audit its output, and evolve its dimensions as your understanding of what "correct deployment" means deepens.
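The version consistency check mentioned above could look roughly like the following sketch. It assumes each gate script declares a module-level GATE_VERSION string; that constant name, the function name, and the script filenames in the usage note are hypothetical.

```python
from __future__ import annotations

import re

# Assumed convention: each script contains a line like GATE_VERSION = "1.4.0".
VERSION_RE = re.compile(r'^GATE_VERSION\s*=\s*["\']([^"\']+)["\']', re.M)


def check_version_consistency(
        scripts: dict[str, str]) -> tuple[bool, dict[str, str | None]]:
    """scripts maps a script name to its source text.

    PASS only when every script declares a version and all versions agree.
    """
    versions: dict[str, str | None] = {}
    for name, source in scripts.items():
        match = VERSION_RE.search(source)
        versions[name] = match.group(1) if match else None
    distinct = set(versions.values())
    ok = None not in distinct and len(distinct) == 1
    return ok, versions
```

If verify.py is bumped to a new version while deploy_check.py is left behind, the check returns False along with the per-script versions, which points directly at the script that missed the update.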