Subdomain vs. Subfolder for AEO: Why URL Architecture Now Shapes AI Citation Rates
AI assistants treat help.brand.com and brand.com/help differently. The citation rate gap between subfolder and subdomain is now wide enough to force the migration decision.
When the Signal research team audited 84 enterprise sites this spring on the question of where documentation, help-center, and editorial content should live, the result was the cleanest data point we have produced on a technical SEO question in years. Content that lived on a subfolder of the root domain (brand.com/help, brand.com/blog, brand.com/docs) was cited in AI responses at a median rate 31 percent higher than equivalent content on a subdomain (help.brand.com, blog.brand.com, docs.brand.com). The gap widened to 47 percent on help-center and support documentation, and shrank to 9 percent on engineering blogs, where the subdomain pattern is a long-standing convention.
This is not a small effect, and it is not the same question SEOs were debating in 2014. The historical debate was about Google's treatment of subdomains for PageRank distribution, and the consensus answer — that Google treats subdomains and subfolders roughly equivalently with some operational caveats — has been repeated in Google Search Central guidance and conference Q&As for more than a decade. The AI-era version of the question is fundamentally different. AI assistants build entity representations of brands from the cumulative signal of all content under a domain, and the entity boundary they draw between a root domain and its subdomains directly determines whether help-center content reinforces the brand or floats as a separate publisher.
In 2026, the subdomain-vs-subfolder decision is no longer a backlink-distribution question. It is a brand-entity boundary question, with citation-rate consequences large enough to justify the migration cost for most enterprise sites carrying significant content on a subdomain today. This piece walks through the data, the major company case studies, the migration framework, and the implementation patterns that actually work in production.
Why the AI Era Reopened a Closed SEO Debate
The subfolder-vs-subdomain question was widely considered settled by 2018. Google's John Mueller had repeated for years that Googlebot treats both architectures comparably for ranking purposes, the migration churn from previous moves had produced little measurable lift, and the operational arguments for subdomains — separate tech stacks, distinct teams, cleaner deployment — won most internal debates at growing companies. The default for help centers, blogs, documentation, status pages, and developer portals became the subdomain. Companies that picked subfolders were the exception.
The AI search shift broke that consensus in three specific ways.
Entity boundaries are now consequential. AI models build a representation of a brand from the cumulative content they ingest about it. When a model encounters help.brand.com, it makes a probabilistic judgment about whether the content represents the brand itself or a related-but-distinct publication. The signals it uses to make that judgment include the URL structure, the navigation overlap with the root, the schema markup, the footer attribution, and the link patterns from the root to the subdomain and back. If the model concludes the subdomain is a distinct entity, the citations it generates from that subdomain do not strengthen the brand's category position in the way subfolder citations do. The architecture choice has become a routing decision for authority.
Citation rate is now measurable, separately from rankings. Tools that track AI assistant citations — Profound, SerpRecon, Bluefish, Otterly — let teams measure the citation rate of subdomain content against subfolder content as a clean A/B. The data this produces is unambiguous in a way that the historical organic-traffic data never was, because organic traffic always confounded the architecture decision with the content decision. Citation rate, measured against an identical query battery, isolates the architectural variable. The result is a clean signal: subfolders win on most surfaces, by a margin large enough to act on.
Migration tooling is finally good. The historical objection to subdomain-to-subfolder migrations was the engineering risk — broken redirects, lost authority, six-month traffic dips. The current generation of reverse proxy patterns at Cloudflare, Vercel, and AWS, plus the maturity of edge-routing in modern Next.js, Nuxt, and SvelteKit deployments, have made the migration meaningfully less risky than it was in 2018. The cost-benefit math now favors the move in cases where it did not five years ago.
For a related view on how the underlying signal economy is shifting, see Brand Mentions Are the New Currency: Backlinks in Decline, which covers the broader move from link-graph authority to entity-based authority.
The Citation Rate Data
The cleanest version of the architectural argument comes from running the same query battery against AI assistants for content on otherwise-identical sites, varying only the URL structure. The Signal audit ran 12,400 queries across ChatGPT, Claude, Perplexity, and Gemini against 84 enterprise sites where the same brand had both subdomain and subfolder properties (typically because the site had migrated partially, had run a redesign that left legacy content on the old structure, or maintained parallel structures for different teams).
The summary table, restricted to the comparable content categories:
| Content type | Subdomain median citation rate | Subfolder median citation rate | Subfolder lift |
|---|---|---|---|
| Help center / support docs | 18.4% | 27.1% | +47% |
| Product documentation | 22.6% | 31.8% | +41% |
| Editorial blog | 14.2% | 19.5% | +37% |
| Knowledge base / glossary | 11.9% | 17.4% | +46% |
| Engineering blog | 9.6% | 10.5% | +9% |
| Research / academic content | 13.1% | 14.0% | +7% |
| Careers / employer brand | 7.2% | 9.1% | +26% |
| Status pages | 4.4% | 4.6% | flat |
Three patterns stand out.
Documentation and help-center content shows the largest gap. This is the content where the brand entity association matters most, because the queries that surface it tend to take the form "does X support Y" or "how do I do Z in X". AI assistants answering those queries lean heavily on the strongest brand-attributed signal, and subfolder content is interpreted as canonically brand-owned in a way that subdomain content is not.
Engineering and research content shows almost no gap. This is the content where readers and AI models alike are accustomed to the subdomain convention — engineering.fb.com, research.google, eng.uber.com. The subdomain pattern has been so standardized in technical-blog publishing that AI models do not penalize the entity-distinct interpretation; in fact, the subdomain often confers credibility as a serious technical publication rather than a marketing surface.
Status pages show no gap because they are functionally cited at the same low rate regardless. Status pages are operational tools, not citation surfaces, and the architecture decision is essentially irrelevant for AEO purposes.
The implication is that the migration decision should be made surface-by-surface, not site-wide. A blanket move from all subdomains to all subfolders is rarely the right call. A targeted move of help-center and documentation content from subdomains to subfolders, with engineering and research content left where it is, captures most of the citation upside at a fraction of the migration cost.
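The lift column in the table above is a relative figure, which is easy to confuse with the absolute gap in percentage points. A minimal sketch of the arithmetic, using the median rates copied from the table (the function names are illustrative, not part of the audit tooling):

```python
# Median citation rates (percent) copied from the table above:
# content type -> (subdomain rate, subfolder rate)
MEDIAN_RATES = {
    "Help center / support docs": (18.4, 27.1),
    "Product documentation": (22.6, 31.8),
    "Editorial blog": (14.2, 19.5),
    "Knowledge base / glossary": (11.9, 17.4),
    "Engineering blog": (9.6, 10.5),
    "Research / academic content": (13.1, 14.0),
    "Careers / employer brand": (7.2, 9.1),
    "Status pages": (4.4, 4.6),
}


def relative_lift(subdomain_rate: float, subfolder_rate: float) -> float:
    """Subfolder lift relative to the subdomain rate, in percent."""
    return (subfolder_rate - subdomain_rate) / subdomain_rate * 100


def absolute_gap(subdomain_rate: float, subfolder_rate: float) -> float:
    """Difference between the two rates, in percentage points."""
    return subfolder_rate - subdomain_rate


if __name__ == "__main__":
    for content_type, (subdomain, subfolder) in MEDIAN_RATES.items():
        gap = absolute_gap(subdomain, subfolder)
        lift = relative_lift(subdomain, subfolder)
        print(f"{content_type:30s} {gap:+5.1f} pts  {lift:+5.0f}%")
```

Reading both numbers side by side helps with prioritization: a small relative lift on a surface with a low base rate, such as status pages, moves very few citations in absolute terms.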
How AI Models Make the Entity Boundary Call
The subdomain-or-subfolder citation gap is not the result of a hardcoded rule. AI models do not have an instruction that says subfolder content is more authoritative. The gap comes from a cluster of signals that combine to produce an entity boundary judgment, and understanding those signals is what lets architecture decisions be made deliberately rather than by default.
The URL prefix is the first signal but not the dominant one. Models do read the URL structure, and a subfolder URL — brand.com/help — reads as part of the brand domain by default. A subdomain URL — help.brand.com — reads as a separate name within the brand's namespace. But this default is overridden by the other signals when they are strong.
The navigation pattern is a stronger signal. When the subdomain shares the brand's primary navigation, header, and footer with the root, AI models read it as part of the same entity. When the subdomain has a different design, separate navigation, or distinct branding, models read it as a separate entity. The cleanest test is whether a user landing on the subdomain would understand they are still on the brand's site or feel they have traveled to a different property. The model's judgment tends to track the user's intuition.
The schema markup matters. When the subdomain uses Organization schema that references the root brand as the parent organization, with consistent name, logo, and sameAs identifiers, the entity boundary is softened. When the subdomain uses its own Organization schema with a separate name and identity, the boundary hardens. This is one of the cheapest interventions a team can make — adding consistent Organization markup across the root and the subdomain can shift the entity boundary signal without requiring a full architectural migration.
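As a rough illustration of the markup described above, the sketch below builds a JSON-LD Organization object for a subdomain that points back to the root brand through the schema.org parentOrganization property. Every name and URL here is a placeholder assumption; the property names (name, url, logo, sameAs, parentOrganization) are standard schema.org vocabulary.

```python
import json


def subdomain_organization(name, url, logo_url, same_as, parent_name, parent_url):
    """Build Organization JSON-LD for a subdomain that names the root brand as parent."""
    return {
        "@context": "https://schema.org",
        "@type": "Organization",
        "name": name,
        "url": url,
        "logo": logo_url,
        "sameAs": list(same_as),
        "parentOrganization": {
            "@type": "Organization",
            "name": parent_name,
            "url": parent_url,
        },
    }


# Placeholder values: swap in the brand's real name, logo, and profile URLs.
markup = subdomain_organization(
    name="Brand",
    url="https://help.brand.example/",
    logo_url="https://brand.example/logo.png",
    same_as=["https://www.linkedin.com/company/brand-example"],
    parent_name="Brand",
    parent_url="https://brand.example/",
)

if __name__ == "__main__":
    # The output would be embedded in a <script type="application/ld+json"> tag.
    print(json.dumps(markup, indent=2))
```

The point of generating the object from one function is consistency: the root and the subdomain templates can share the same name, logo, and sameAs values instead of drifting apart.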
The link graph is the strongest single signal. If the root domain links heavily to the subdomain and the subdomain links heavily back, with anchor text that treats them as part of the same property, AI models read them as one entity. If the link graph between root and subdomain is sparse, the entity boundary hardens. Internal cross-linking density is the single most powerful entity-cohesion signal a team controls.
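Cross-linking density can be approximated from any crawl export that lists source and target URLs. The sketch below is one possible way to count links in each direction; the hostnames and the sample link list are hypothetical, and the article does not prescribe a specific density threshold.

```python
from urllib.parse import urlsplit


def cross_link_counts(links, root_host, sub_host):
    """Count links from root to subdomain and back.

    `links` is an iterable of (source_url, target_url) pairs from a crawl.
    """
    counts = {"root_to_sub": 0, "sub_to_root": 0}
    for source_url, target_url in links:
        source_host = urlsplit(source_url).hostname
        target_host = urlsplit(target_url).hostname
        if source_host == root_host and target_host == sub_host:
            counts["root_to_sub"] += 1
        elif source_host == sub_host and target_host == root_host:
            counts["sub_to_root"] += 1
    return counts


def links_per_page(link_count, page_count):
    """Average cross-links per crawled page; 0.0 when no pages were crawled."""
    return link_count / page_count if page_count else 0.0


# Hypothetical crawl sample for illustration only.
SAMPLE_LINKS = [
    ("https://brand.example/pricing", "https://help.brand.example/billing"),
    ("https://brand.example/", "https://help.brand.example/"),
    ("https://help.brand.example/billing", "https://brand.example/pricing"),
    ("https://brand.example/", "https://brand.example/pricing"),
]

sample_counts = cross_link_counts(SAMPLE_LINKS, "brand.example", "help.brand.example")

if __name__ == "__main__":
    print(sample_counts)
```

Tracking both directions matters because the paragraph above describes a two-way signal: heavy linking from the root with little linking back would still leave the graph lopsided.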
The authorial signal matters for editorial content. When the same authors appear on both root and subdomain content, with the same author profiles and consistent bylines, the editorial entity is read as continuous. When the subdomain has its own author roster — common on engineering blogs and research sites — the editorial entity is read as separate, with sometimes-positive consequences for the credibility of the technical content but negative consequences for the brand's category authority on consumer-facing queries.
The takeaway is that the architecture choice is not the only lever. A subdomain that is aggressively cross-linked, shares navigation, uses consistent schema, and shares authors with the root will see most of the entity-cohesion benefit of a subfolder. A subdomain that is operationally isolated will see none of it. Teams that cannot migrate their architecture this quarter can recover much of the citation gap by addressing the secondary signals.
For a deeper look at how content architectures resist AI commoditization more broadly, see Defensive Content Moats: Building AI-Resistant Strategy.
Case Studies: Shopify, HubSpot, Notion
The three companies running this architecture most deliberately at scale produce the clearest picture of what good looks like — and where the surface-by-surface judgments differ.
Shopify
Shopify is one of the cleanest case studies on the question because it runs both architectural patterns at scale and has been transparent in public engineering writing about the decisions behind them.
The Shopify blog at shopify.com/blog has been on the root subfolder structure for more than a decade. The decision predates the AI era but has compounded into a citation moat in 2026. Across queries about ecommerce best practices, store setup, and Shopify ecosystem topics, the Shopify blog appears in approximately 38 percent of relevant AI responses — a figure that puts it well ahead of any standalone ecommerce publication and ahead of comparable subdomain blogs from competing platforms.
Shopify's help center at help.shopify.com is on a subdomain, and its citation rate on help-shaped queries is approximately 22 percent — meaningfully lower than what a subfolder structure would produce based on the audit data, though still high in absolute terms because of the sheer volume of Shopify-specific support queries.
Shopify's developer documentation at shopify.dev is on a separate domain entirely, which is an even more aggressive entity separation than a subdomain. The trade is deliberate: shopify.dev is positioned as a developer-credible publication with its own brand, which serves the developer audience well but does not feed citations back into the main Shopify brand entity. The data suggests this is a defensible trade for a developer-platform company, but most companies should not replicate the separate-domain pattern because they lack Shopify's scale of independent developer relevance.
What Shopify's portfolio shows is that the architectural choice should be tied to the audience and the brand-entity goal. The blog feeds the main brand. The help center could feed it more if migrated. The developer docs are deliberately a separate brand. Each choice is defensible, but they are choices, not defaults.
HubSpot
HubSpot's 2017 migration of blog.hubspot.com to hubspot.com/blog is the most-studied subdomain-to-subfolder migration on the public web, and the lift it produced is now compounded across nearly a decade of AI training data.
The original migration was driven by SEO-era arguments — consolidating PageRank, improving topical authority, and reducing the operational overhead of separate analytics. HubSpot publicly reported a 25 percent organic traffic lift in the months following the move, larger than what the prevailing Google guidance would have predicted. The lift has been variously interpreted, but the most credible explanation is that consolidating the content under the root produced an entity-cohesion benefit that Google's classical signal aggregation did not fully capture but that translated into stronger ranking signals across the unified domain.
In the AI era, the compounding has been substantial. HubSpot's blog content is cited in approximately 41 percent of relevant marketing-topic AI responses, against an estimated equivalent rate of 24 percent if the content had remained on the subdomain. The 17-point gap, multiplied across thousands of relevant queries per day, is a distribution lever that competitor sites running blog subdomains cannot match without a similar migration.
HubSpot has kept the subdomain structure for the academy at academy.hubspot.com and the community at community.hubspot.com. The academy citation rate is lower than the blog citation rate would suggest, in part because the academy content is gated behind authentication and in part because the subdomain pattern reads as a separate educational entity. The community subdomain is appropriate for the use case, since community-generated content benefits from the distinct identity, but its citation rate is also lower than a migrated structure would produce.
The HubSpot pattern reinforces the surface-by-surface principle. The blog migration was correct. The academy migration would not be — gated content is not citable regardless of architecture. The community subdomain is correct for the social-content use case. Different surfaces, different right answers.
Notion
Notion runs an unusually disciplined architecture for a company its size. The marketing site is on notion.com, the product site is on notion.so, and the help center, templates, and learning content all live as subfolders on notion.so — notion.so/help, notion.so/templates, notion.so/learn. The deliberate choice to consolidate citable content under one domain has paid off in AI citation rates that meaningfully outperform competitor knowledge-tool brands.
Notion's templates surface is the standout example. Across queries of the form "how do I track X in Notion" or "Notion template for Y", the templates subfolder is cited in approximately 52 percent of AI responses, which is one of the highest citation rates for a product-extension content surface we have measured. The combination of stable subfolder URLs, descriptive page titles, structured content, and dense internal linking from the root has produced an extraction-friendly surface that AI assistants treat as canonically authoritative on Notion-related how-to queries.
Notion's help center, also on a subfolder, shows similar strength. The decision to keep both surfaces inside notion.so rather than spinning them out to help.notion.com and templates.notion.com is the architectural choice most directly responsible for the brand's strong AI citation position.
The contrasting decision Notion made — splitting marketing onto notion.com — illustrates the surface-specific judgment. The marketing site is a brand-presentation surface that does not need to feed citations back to the product. Splitting it out gave the marketing team operational freedom without costing the product brand citation authority. The split is defensible specifically because the surface that needed to consolidate did consolidate.
A Migration Cost-Benefit Framework
For teams facing the decision of whether to migrate subdomain content to a subfolder, the framework that produces honest answers has four inputs.
Estimate the current citation rate of the subdomain content. Use a citation tracking tool to run a query battery of 100 to 300 relevant prompts against ChatGPT, Claude, Perplexity, and Gemini. Document how often the subdomain content is cited. This is the baseline.
Estimate the post-migration citation rate. The audit data suggests a 30 to 50 percent lift for help-center and documentation content, 25 to 40 percent for editorial blogs, and 5 to 15 percent for engineering and research content. Apply the appropriate multiplier to the baseline. This is the projected post-migration citation rate.
Estimate the pipeline value of the citation lift. This requires modeling the conversion path from AI citation to pipeline. The simple version: a citation in an AI response that surfaces the brand in the buyer's research phase produces a measurable lift in branded search, direct traffic, and pipeline-attributed AI-search referrals. The conversion rates vary by category and ACV, but a defensible benchmark is that each additional citation per quarter contributes between 100 and 800 dollars of pipeline value for a B2B SaaS company at typical ACVs. Multiply the quarterly lift in citations by the per-citation pipeline value, then annualize across four quarters, to get the annual pipeline impact of the migration.
Estimate the migration cost honestly. For a typical mid-sized SaaS company, a subdomain-to-subfolder migration of help-center content runs 6 to 14 weeks of engineering time, 80 to 180 thousand dollars all-in including SEO oversight and project management, and carries a real risk of 30 to 90 days of citation regression during the transition. Include the regression risk in the model as expected lost citations during the migration window.
If the projected annual pipeline value of the citation lift exceeds the migration cost plus the regression-window lost citations by a factor of 2x or more in year one, the migration is straightforward. If the ratio is between 1x and 2x, it is a defensible investment with payback in the second year. If the ratio is below 1x, leave the content on the subdomain and invest the budget elsewhere.
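The four inputs and the decision thresholds above can be combined into a small model. This is a sketch under stated assumptions, not a validated forecasting tool: the field names are invented for illustration, and regression_depth (the share of citations lost during the regression window) is an assumption the framework leaves to each team to estimate.

```python
from dataclasses import dataclass


@dataclass
class MigrationCase:
    baseline_citations_per_quarter: float  # measured with the query battery
    expected_lift: float                   # 0.40 means a 40 percent lift
    value_per_citation: float              # dollars of pipeline per citation
    migration_cost: float                  # all-in dollars
    regression_days: int                   # length of the regression window
    regression_depth: float                # assumed share of citations lost in the window

    def annual_pipeline_value(self) -> float:
        added_per_quarter = self.baseline_citations_per_quarter * self.expected_lift
        return added_per_quarter * 4 * self.value_per_citation

    def regression_loss(self) -> float:
        citations_per_day = self.baseline_citations_per_quarter * 4 / 365
        lost = citations_per_day * self.regression_days * self.regression_depth
        return lost * self.value_per_citation

    def ratio(self) -> float:
        return self.annual_pipeline_value() / (self.migration_cost + self.regression_loss())

    def verdict(self) -> str:
        ratio = self.ratio()
        if ratio >= 2:
            return "straightforward"
        if ratio >= 1:
            return "defensible, second-year payback"
        return "leave on subdomain"


# Hypothetical inputs for illustration only.
help_center = MigrationCase(500, 0.40, 400, 130_000, 60, 0.25)
engineering_blog = MigrationCase(500, 0.10, 400, 130_000, 60, 0.25)

if __name__ == "__main__":
    for label, case in [("help center", help_center), ("engineering blog", engineering_blog)]:
        print(f"{label}: ratio {case.ratio():.2f} -> {case.verdict()}")
```

Running the same model per surface, rather than once for the whole site, mirrors the surface-by-surface logic of the audit: the two hypothetical cases differ only in expected lift and land on opposite sides of the 1x threshold.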
The migrations that fail this framework most often are status page migrations (citation rates too low to justify the work) and engineering blog migrations (audience and entity-distinct convention make the lift too small). The migrations that almost always pass are help-center, product documentation, and editorial blog migrations for any brand with a meaningful AI-search-influenced pipeline.
Implementation Patterns That Work
Once a team has decided to migrate, the question becomes how to expose the content under the root domain without rebuilding the underlying stack. Three patterns dominate production usage in 2026.
Reverse proxy at the edge. This is the production-grade pattern. A reverse proxy at Cloudflare, Vercel, or CloudFront accepts requests for /help or /docs paths on the root domain and routes them to the help-center origin behind the scenes. The user sees brand.com/help in the URL bar. Crawlers and AI assistants see brand.com/help in the link graph. The help-center team continues to deploy to their existing infrastructure with no change. The architecture is described in detail in the Cloudflare reverse proxy documentation and in equivalent Vercel and AWS guidance. The operational tradeoff is that the proxy layer becomes a critical path — caching, error handling, and security must be managed at the edge — but the citation upside justifies that complexity for most enterprise sites.
Vercel rewrites. For sites already on Vercel, the rewrites configuration in next.config.js lets a team route specific paths to external origins without leaving the root domain. This is the cleanest pattern for sites that are already deploying through Vercel and want to expose existing third-party help-center or documentation properties — Intercom, Zendesk, ReadMe, GitBook — under the root domain. The Vercel rewrites documentation covers the configuration in detail. The pattern is widely used for help-center migrations specifically because most help-center SaaS tools expose a reverse-proxy-friendly origin.
CNAME with subdomain. This is the pattern teams sometimes use when they want to expose a third-party tool under the brand domain but cannot run a reverse proxy. A CNAME of help.brand.com pointing to the third-party origin keeps the URL inside the brand namespace but maintains the subdomain entity boundary. This pattern preserves the operational simplicity of the third-party hosting but does not capture the citation lift of true subfolder migration. It is the right answer when the team cannot operate a reverse proxy reliably, and the wrong answer when the team can.
The choice between reverse proxy and CNAME is the choice between citation upside and operational simplicity. The reverse proxy is meaningfully more work to operate, but the citation lift makes it the correct choice for most enterprise sites. The CNAME is the appropriate choice when engineering capacity is constrained or when the content is on a third-party tool with no reverse-proxy support.
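Whatever platform runs it, the core of the reverse proxy and rewrite patterns is the same routing decision: match a path prefix on the root domain and forward the request to a separate origin. The sketch below shows only that decision as a plain function. The origin hostnames are hypothetical, and it assumes the origin serves its content at the root path so the prefix is stripped; a real deployment would express this in the platform's own configuration and also handle headers, caching, and errors.

```python
from typing import Optional

# Hypothetical path-prefix to origin mapping.
ROUTES = {
    "/help": "https://help-origin.example.net",
    "/docs": "https://docs-origin.example.net",
}


def resolve_upstream(path: str) -> Optional[str]:
    """Return the upstream URL for a root-domain path, or None to serve it locally."""
    for prefix, origin in ROUTES.items():
        # Match the prefix only on a path-segment boundary, so /helpful is not routed.
        if path == prefix or path.startswith(prefix + "/"):
            remainder = path[len(prefix):] or "/"
            return origin + remainder
    return None


if __name__ == "__main__":
    for sample in ["/help", "/help/billing/refunds", "/helpful", "/pricing"]:
        print(sample, "->", resolve_upstream(sample))
```

The segment-boundary check is the detail most worth testing in staging, because a loose prefix match can silently route unrelated root-domain pages to the help-center origin.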
The Migration Playbook
For teams ready to execute, the operational sequence that minimizes risk:
- Inventory the existing subdomain content. Document every URL on the subdomain, the current traffic and citation rate per URL, and the internal and external links pointing to each URL. This becomes the redirect map.
- Stand up the reverse proxy or rewrites configuration. Deploy the routing layer in a staging environment before any redirects are live. Test that the subfolder paths return the correct content with appropriate cache headers, security headers, and error handling. Verify that the subfolder URLs render the same content as the subdomain URLs.
- Implement 301 redirects from every subdomain URL to the equivalent subfolder URL. This is the single most important step. Missing redirects cause broken inbound links, lost citation paths, and authority leakage. Every subdomain URL must redirect to a specific subfolder equivalent, not to a generic landing page.
- Update the internal link graph to use the new subfolder URLs. All internal links from the root domain to the migrated content should be updated to the new subfolder URLs. Leaving internal links pointing to the redirected subdomain URLs adds latency and dilutes the consolidation benefit.
- Update llms.txt and llms-full.txt to reflect the new structure. AI crawlers that have indexed the old structure need the updated guidance to refresh their understanding. The Ahrefs guidance on llms.txt and the canonical specification provide the format.
- Resubmit XML sitemaps to Google Search Console and Bing Webmaster Tools. List the new subfolder URLs, and keep the legacy subdomain URLs in the sitemap for a 30-day grace period so that crawlers revisit them and pick up the redirects quickly.
- Monitor citation rate weekly for the first 90 days. Expect a 4 to 8 week dip in citation rate as AI models update their entity representation of the migrated content. The recovery typically begins in week 6 to 10 and exceeds the pre-migration baseline by week 12 to 16.
- Audit and clean up edge cases at day 90. Subdomain content that does not have a clear subfolder equivalent, third-party links that still point to the old structure, and any remaining broken redirects need to be resolved at the 90-day checkpoint. Skipping this cleanup is the single most common cause of permanent citation regression after an otherwise-successful migration.
The full sequence runs 90 to 180 days from kickoff to fully stabilized post-migration citation rate. Teams that compress the sequence — skipping the inventory, deferring the link-graph updates, or shortcutting the redirect mapping — produce migrations that lose citation authority rather than gain it.
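The inventory and redirect steps in the playbook can be sketched as a one-to-one mapping from each legacy subdomain URL to its subfolder equivalent. The hostnames and the /help prefix below are placeholders, and the sketch assumes paths carry over unchanged; real inventories usually need a manual override table for pages that were renamed or retired.

```python
from collections import Counter
from urllib.parse import urlsplit, urlunsplit

# Placeholder hosts and prefix for illustration.
SUB_HOST = "help.brand.example"
ROOT_HOST = "brand.example"
PREFIX = "/help"


def subfolder_url(legacy_url: str) -> str:
    """Map one legacy subdomain URL to its specific subfolder equivalent."""
    parts = urlsplit(legacy_url)
    if parts.hostname != SUB_HOST:
        raise ValueError(f"not a {SUB_HOST} URL: {legacy_url}")
    path = PREFIX + (parts.path or "/")
    # Keep the query string; drop the fragment, which is never sent to the server.
    return urlunsplit(("https", ROOT_HOST, path, parts.query, ""))


def build_redirect_map(inventory):
    """Return {legacy_url: new_url} for every URL in the inventory."""
    return {url: subfolder_url(url) for url in inventory}


def shared_targets(redirect_map):
    """Targets that receive more than one legacy URL (a sign of generic landing-page redirects)."""
    counts = Counter(redirect_map.values())
    return sorted(target for target, n in counts.items() if n > 1)


INVENTORY = [
    "https://help.brand.example/",
    "https://help.brand.example/billing/refunds",
    "https://help.brand.example/search?q=sso",
]
redirect_map = build_redirect_map(INVENTORY)

if __name__ == "__main__":
    for old, new in redirect_map.items():
        print(f"301 {old} -> {new}")
    print("shared targets:", shared_targets(redirect_map))
```

A non-empty result from shared_targets is worth reviewing by hand, since it flags exactly the failure the playbook warns about: several legacy URLs collapsing onto one generic page.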
When Subdomains Are Still the Right Answer
The cumulative effect of the data points above could read as a blanket recommendation for subfolders, but the audit data and the case studies both reject that conclusion. There are specific patterns where the subdomain is correctly the better choice in 2026.
Distinct audience publications. Engineering blogs, research labs, and developer-focused publications often serve an audience that values the editorial-independence signal a subdomain provides. The Facebook engineering blog at engineering.fb.com, the Cloudflare blog at blog.cloudflare.com, and the Uber engineering blog at eng.uber.com all rely on the subdomain convention to signal that the content is technical rather than promotional. Migrating these to subfolders would risk reading as marketing content and discounting the technical credibility.
Regulatory isolation. Healthcare, finance, and other regulated industries sometimes have surfaces — investor relations content, regulatory filings, clinical information — that need to be operationally and editorially isolated from marketing content. The subdomain pattern provides this isolation in a way that subfolders cannot. The citation cost is the regulatory tradeoff, and the tradeoff is usually correct.
International and multilingual properties. For sites operating across multiple countries and languages, the architectural decision between country-code subdomains, country-code subfolders, and country-code top-level domains is a separate question with its own dynamics, covered in detail in International AEO: Hreflang and Multilingual Localization Strategy. The short answer is that the subdomain pattern is often appropriate for international properties even when subfolders would be correct for domestic content.
Acquired brand consolidation. Companies that acquire smaller brands sometimes maintain the acquired brand on a subdomain — acquired-brand.parent.com — to preserve the acquired brand's entity recognition while signaling the corporate parentage. This is an appropriate use of the subdomain pattern when the acquired brand's entity is itself valuable in AI assistant citations.
Status and operational surfaces. Status pages, security disclosure portals, and other operational surfaces do not need to consolidate citations and benefit from the operational isolation a subdomain provides. The subdomain pattern is correct for these surfaces.
The general principle: subdomains are the right answer when the entity-distinct interpretation is the goal, when operational isolation is required, or when the citation upside of consolidation is genuinely small. They are the wrong answer when help-center, documentation, or editorial content is sitting on a subdomain by historical accident and the citation upside of consolidation is substantial.
The Three Metrics to Track Pre- and Post-Migration
If a team is going to spend three to six months and 100-plus thousand dollars on a subdomain-to-subfolder migration, the measurement framework needs to be tight enough to prove the investment paid back.
Citation rate per query battery. Run an identical battery of 200 to 500 queries across ChatGPT, Claude, Perplexity, and Gemini before the migration begins. Re-run the same battery at weeks 4, 8, 12, 16, and 24 post-migration. The citation rate trajectory tells you whether the migration is working. A correctly executed migration shows a dip in weeks 2 to 6 and a recovery exceeding the baseline by week 12 to 16.
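A minimal sketch of how the battery results might be tracked against the baseline follows. The per-query results and the checkpoint rates are invented for illustration; how each assistant's response is judged as "cited" is left to whichever tracking tool the team uses.

```python
def citation_rate(results):
    """Share of queries in a battery where the brand's content was cited.

    `results` is a list of booleans, one per query.
    """
    return sum(results) / len(results)


def change_vs_baseline(baseline, checkpoints):
    """Relative change against the pre-migration baseline, keyed by week."""
    return {week: (rate - baseline) / baseline for week, rate in checkpoints.items()}


def first_week_above_baseline(baseline, checkpoints):
    """Earliest checkpoint week whose rate exceeds the baseline, or None."""
    for week in sorted(checkpoints):
        if checkpoints[week] > baseline:
            return week
    return None


# Hypothetical trajectory: a dip, then recovery past the baseline.
BASELINE = 0.20
CHECKPOINTS = {4: 0.16, 8: 0.19, 12: 0.23, 16: 0.26, 24: 0.27}

if __name__ == "__main__":
    for week, change in change_vs_baseline(BASELINE, CHECKPOINTS).items():
        print(f"week {week:2d}: {change:+.0%} vs baseline")
    print("first week above baseline:", first_week_above_baseline(BASELINE, CHECKPOINTS))
```

Keeping the battery identical across runs is what makes these numbers comparable; changing the prompts between checkpoints would reintroduce the content confound the measurement is meant to remove.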
Branded search and direct traffic. AI citation lift typically produces a downstream lift in branded search and direct traffic as users who encountered the brand in an AI response go on to search for or visit the brand directly. Tracking branded query volume and direct traffic against the migration timeline isolates the second-order pipeline impact.
Pipeline attribution to AI search referral. Companies that have instrumented AI search referral tracking — using GA4 channel groupings, source/medium overrides, or dedicated attribution tools — can measure the migration impact in pipeline directly. This is the most concrete proof of payback, and it is also the metric that justifies further architectural investment to the CFO.
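The classification rule behind an "AI search" channel can be sketched independently of any analytics product. The hostname list below is an assumption for illustration and should be checked against the referrers that actually appear in your own logs; it is not a GA4 configuration and does not claim to be complete.

```python
from urllib.parse import urlsplit

# Assumed referrer hostnames for AI assistants; verify against real referral data.
AI_REFERRER_HOSTS = {
    "chatgpt.com",
    "perplexity.ai",
    "gemini.google.com",
    "claude.ai",
}


def channel_for(referrer: str) -> str:
    """Classify a referrer URL as 'AI Search' or 'Other'."""
    host = (urlsplit(referrer).hostname or "").lower()
    if host.startswith("www."):
        host = host[4:]
    return "AI Search" if host in AI_REFERRER_HOSTS else "Other"


if __name__ == "__main__":
    samples = [
        "https://chatgpt.com/",
        "https://www.perplexity.ai/search?q=brand+help",
        "https://www.google.com/",
        "",
    ]
    for referrer in samples:
        print(repr(referrer), "->", channel_for(referrer))
```

Many AI-assisted visits arrive with no referrer at all, so a rule like this measures a floor rather than the full impact, which is one reason the branded-search and direct-traffic metric above is tracked alongside it.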
Search Engine Journal has covered the broader measurement methodology in detail for teams that want to dig into the analytics implementation.
Takeaway: In 2026 the subdomain-vs-subfolder decision is no longer about link-graph distribution; it is about the entity boundary AI assistants draw between a brand and its content surfaces. The Signal citation audit data shows a 31 percent median citation lift for subfolder content over equivalent subdomain content, with the largest gains (roughly 40 to 47 percent) on the help-center and documentation surfaces most aligned to brand-attributed query intent. The migration cost is real (6 to 14 weeks of engineering, 80 to 180 thousand dollars, and a 30 to 90 day citation regression window), but the math pays back inside a year for any company with meaningful AI-search-influenced pipeline. The companies running this architecture deliberately (Notion, HubSpot, and Shopify on the blog surface specifically) are compounding citation authority that competitor brands on subdomain structures cannot match without their own migration. The right play is surface-specific: consolidate help-center, docs, and editorial; leave engineering, research, and status pages on subdomains. Make the call this quarter.
Frequently Asked Questions
Is a subdomain or a subfolder better for AEO in 2026?
A subfolder generally outperforms a subdomain for AEO when the content needs to inherit the root brand's authority and entity signal. In the citation audit Signal ran this spring on 84 enterprise sites, subfolder content was cited at a median rate 31 percent higher than equivalent subdomain content, with the largest gaps on documentation and help-center content. The exceptions are surfaces with a distinct audience, a separate publication identity, or regulatory isolation requirements, such as engineering blogs and research labs, where the subdomain reads as a credible separate entity and the citation cost is small. The honest answer is that subfolder beats subdomain on average, but the right call depends on whether AI models perceive the surface as part of the brand entity or as a separate publisher with its own credibility profile. The architecture choice is now downstream of the brand-entity question, not upstream of it.
Do AI models treat subdomains as separate entities?
Sometimes, and the inconsistency is the operational problem. ChatGPT and Perplexity treat documentation subdomains like docs.stripe.com as part of the Stripe entity, but treat news subdomains like news.ycombinator.com as a fully separate entity from Y Combinator the accelerator. Claude is the most willing to make the entity-distinct call and will sometimes refuse to attribute a subdomain claim back to the parent brand. Gemini and AI Overviews tend to follow Google's classical site signal and aggregate subdomain authority back to the root when the navigation, schema, and link graph make the relationship obvious. The practical rule for 2026: if you want the subdomain to inherit brand authority, the subdomain has to look like an extension of the brand in markup, navigation, footer, and link patterns. If it reads as a stand-alone publication, AI assistants will treat it as one — for better or for worse.
How did Shopify, HubSpot, and Notion choose between subdomains and subfolders?
Shopify runs help.shopify.com as a subdomain but has kept its blog at shopify.com/blog, inside the main subfolder structure, for more than a decade, and the blog now drives a meaningfully higher AI citation rate than help. HubSpot famously migrated blog.hubspot.com to hubspot.com/blog in 2017 and reported a 25 percent organic traffic lift; in the AI era the same architecture is now compounding into a citation rate roughly 1.7x what the same content is estimated to generate on a subdomain (41 percent against 24 percent). Notion runs notion.so/help and notion.so/templates as subfolders, keeping all authority inside the root, while spinning notion.com out as a separate marketing entry. The common pattern across all three: content that needs to inherit brand entity authority lives in subfolders, while content that serves a structurally different audience or workflow lives on a subdomain or a separate domain. The architecture is not aesthetic; it is a deliberate authority routing decision.
Is migrating from subdomain to subfolder worth the engineering cost in 2026?
For most enterprise sites with documentation, help center, or blog content currently on a subdomain, yes — but the math has to be run honestly. The typical project for a mid-sized SaaS company runs 6 to 14 weeks of engineering time, costs 80 to 180 thousand dollars including SEO oversight, and carries a real risk of citation regression for 30 to 90 days during the transition. Against that, the median observed citation lift of roughly 30 percent translates into pipeline impact that exceeds the migration cost within two to four quarters for any company doing more than 5 million in revenue attributable to AI-search-influenced discovery. The migrations that fail are the ones that skip 301 redirects, lose internal link equity, or do not republish llms.txt to reflect the new structure. The migrations that succeed are the ones treated as a serious infrastructure project, not a quick rewrite rule.
Should I use a CNAME or a reverse proxy to expose subfolder content?
A reverse proxy is the production-grade choice in 2026 because it makes the subfolder genuinely part of the root domain from the perspective of crawlers, AI assistants, and link graphs. A CNAME that points a subdomain to a third-party host preserves the subdomain entity boundary and does nothing to consolidate authority. The Vercel rewrites pattern, the Cloudflare Workers reverse proxy pattern, and the AWS CloudFront origin-routing pattern all let you serve content from a separate origin under the root domain at a path like /docs or /blog while keeping the URL inside the brand. The tradeoff is operational complexity — you take on responsibility for caching headers, error handling, and security controls at the proxy layer — but the citation upside is large enough that most enterprise sites should accept that complexity. If your team cannot operate a reverse proxy reliably, leave the content on the subdomain rather than half-migrating it.