International AEO: The Hreflang and Localization Problem Nobody Is Solving

AI assistants serve different answers in different languages, and they draw from different pools of content. The international AEO gap is three to five times the domestic one.

By Zoe Nakamura, Mobile Growth · May 25, 2026 · 19 min read

A 2025 analysis of AI assistant citation behavior across 14 language markets, published by Semrush's research team, found that the median enterprise B2B brand had a 74% citation rate in English-language AI queries — and a 21% citation rate in German for the exact same category, the same product, and the same price point. In Japanese, the citation rate dropped to 11%. The brand had operated in all three markets for more than a decade. The AI search gap was not a market awareness problem. It was a structural content infrastructure problem that had been building quietly since the day the company decided to treat non-English markets as translations of the English business rather than as independent editorial and citation-building operations.

That is the international AEO problem in a sentence: the content infrastructure decisions companies made years before AI search existed have created citation gaps across language markets that are three to five times larger than the domestic AEO gap those same companies are now scrambling to close. And almost nobody is working on it systematically.

Why International AEO Is a Different Problem From Domestic AEO

Domestic AEO — building AI search citation authority in English for English-speaking markets — is increasingly understood. The playbooks are being documented. Tools like Profound, Otterly, and Peec track citation share in English across ChatGPT, Claude, Perplexity, and Gemini. There is an emerging body of knowledge about what works. Operators are, slowly, starting to act on it.

International AEO operates on entirely different mechanics, and those mechanics are not just a scaled-up version of the domestic problem. Four structural differences define why.

Training corpus asymmetry. AI language models are trained on web corpora that are radically unequal across languages. Common Crawl's 2024 language distribution analysis shows English accounts for an estimated 46-52% of indexed web content used in primary training corpora for most large language models. German accounts for roughly 3.8%. Japanese accounts for roughly 3.1%. Spanish, French, Italian, and Portuguese together represent about 12%. This means that for every web page an AI model has seen in German, it has seen approximately 12-14 in English. A brand that has published 10,000 English-language web pages across its domain, documentation, blog, and comparison content may have published 800 equivalent German pages, a raw page ratio of 12.5:1. Citation probability does not simply follow that ratio: it is further weighted by the training data distribution, which puts the effective citation gap closer to 15:1 than to 12:1.
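The arithmetic behind those ratios is easy to reproduce. The sketch below uses only the shares and page counts quoted in the paragraph above; the function and variable names are illustrative, and the 15:1 effective gap is the article's estimate, not something this calculation derives.

```python
# Shares quoted in the paragraph above (percent of training-corpus content).
ENGLISH_SHARE_RANGE = (46.0, 52.0)
GERMAN_SHARE = 3.8


def exposure_ratio(english_share: float, other_share: float) -> float:
    """English pages a model has seen per page in the other language."""
    return english_share / other_share


low = exposure_ratio(ENGLISH_SHARE_RANGE[0], GERMAN_SHARE)
high = exposure_ratio(ENGLISH_SHARE_RANGE[1], GERMAN_SHARE)

# Brand-level page counts from the example: 10,000 English vs 800 German.
brand_page_ratio = 10_000 / 800

print(f"Corpus exposure ratio: {low:.1f}:1 to {high:.1f}:1")
print(f"Brand page ratio: {brand_page_ratio:.1f}:1")
```

The corpus ratio and the brand's own page ratio land in the same range, which is why the asymmetry compounds rather than cancels.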

Entity disambiguation across languages. AI models build entity representations — mental models of what a brand is, what it does, who its customers are — from patterns in training data. When those patterns are dense and consistent in English but sparse and inconsistent in German, the model builds two different entity representations for the same brand. The German entity is weaker, less defined, associated with fewer category signals, and less reliably cited. This is not a bug — it is a predictable consequence of training on an asymmetric corpus. Fixing it requires building entity authority specifically in each language, not just translating pages.

Citation pool composition. When a user asks ChatGPT a question in German, the model draws primarily from German-language sources in its training data. The English authority signals a brand has built do not transfer to German queries in any direct way. A brand with excellent AEO in English (high citation rate, strong entity association with its category, solid FAQ coverage) will inherit essentially none of that authority when the same user asks the same question in German. International AEO requires building a separate citation stack in each target language.

Review platform and community signal differences. In English, G2, Capterra, Reddit, and Trustpilot are the dominant third-party citation sources for B2B AI search. In Germany, Capterra Germany, OMR Reviews, and industry-specific forums carry significantly more weight. In Japan, Amazon Japan reviews, local tech media like Nikkei xTECH, and LINE community discussions are the primary non-brand citation surfaces. A brand that has built strong English-language review density on G2 has built zero equivalent signal for German or Japanese AI citation. Each market requires its own review cultivation program.

The Hreflang Problem — What It Does and Does Not Fix

Hreflang is the annotation that tells search engines which language or regional version of a page to show to which audience. It can be declared on HTML link elements, in HTTP headers, or in an XML sitemap. It was designed for Google, and it works well for that purpose. Google's official internationalization documentation covers the correct implementation in detail. It is also, at this point, the first thing international SEOs reach for when building a multilingual site. In AEO contexts it does something useful, but not what most practitioners think it does.

Hreflang helps international AEO in two specific ways. First, it signals entity continuity to AI crawlers. A crawler reading your German /de/ page and your English /en-us/ page can use hreflang to understand they represent the same entity. Without this signal, AI models can and do build split entity representations — treating the German-language brand presence as a separate, weaker entity than the English one. That split compounds the citation disadvantage in non-English markets. Hreflang is the clearest signal available to prevent this.
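For reference, here is a minimal sketch of the link tags that carry this signal. It assumes the annotations are emitted as HTML link elements and that every variant carries the same full set, including a self-reference and an x-default entry. The `hreflang_links` helper and the example.com URLs are placeholders, not part of any library.

```python
from html import escape


def hreflang_links(variants: dict[str, str], x_default: str) -> list[str]:
    """Build the <link rel="alternate"> tags each language variant should carry.

    variants maps an hreflang code (e.g. "de", "en-us") to an absolute URL.
    The same full set, including the self-reference, goes on every variant.
    """
    tags = [
        f'<link rel="alternate" hreflang="{escape(code)}" href="{escape(url)}" />'
        for code, url in sorted(variants.items())
    ]
    tags.append(
        f'<link rel="alternate" hreflang="x-default" href="{escape(x_default)}" />'
    )
    return tags


variants = {
    "en-us": "https://example.com/en-us/",
    "de": "https://example.com/de/",
}
tags = hreflang_links(variants, x_default="https://example.com/en-us/")
for tag in tags:
    print(tag)
```

Generating the set from one mapping, rather than hand-editing each template, is one way to keep the variants from drifting apart.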

Second, hreflang has a secondary effect on Google's crawl equity distribution across language variants. Pages that are properly tagged get more consistent crawl frequency across language versions, which means more pages across all language variants get indexed and therefore can enter training data pipelines. This is a long-cycle benefit, but it is real.

What hreflang does not do is transfer citation authority from English to German. It does not cause AI models to cite the German page more frequently because the English page is well-cited. It does not create language-variant equivalence in the entity graph. And it does not substitute for independent citation-building in each language.

The implementation failures are also significant. According to a 2025 audit of 1,200 enterprise multilingual sites by Ahrefs, 64% had at least one hreflang misconfiguration, and 38% had misconfigurations severe enough to cause crawler confusion — return tags pointing to wrong language variants, missing x-default specifications, or absolute URL inconsistencies between HTTP and HTTPS versions. Those misconfigurations actively harm international AEO by creating the entity-split problem hreflang is supposed to prevent. Fixing them is a prerequisite, not a complete solution.
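A rough sketch of how those three misconfigurations could be flagged from a crawl export. The input shape (page URL mapped to its declared hreflang annotations) is an assumption about what your crawler exports, `audit_hreflang` is a hypothetical helper, and the non-HTTPS test is only a simple stand-in for the HTTP/HTTPS inconsistency described above.

```python
from urllib.parse import urlsplit


def audit_hreflang(pages: dict[str, dict[str, str]]) -> list[str]:
    """Flag missing x-default, non-HTTPS targets, and missing return tags.

    pages maps each page URL to its annotations ({hreflang code: target URL}).
    """
    problems = []
    for url, annotations in pages.items():
        if "x-default" not in annotations:
            problems.append(f"{url}: missing x-default")
        for code, target in annotations.items():
            if urlsplit(target).scheme != "https":
                problems.append(f"{url}: {code} target is not HTTPS ({target})")
            if code == "x-default" or target == url:
                continue
            # Return-tag check: the target page must list this page back.
            back_links = pages.get(target, {})
            if url not in back_links.values():
                problems.append(f"{url}: no return tag from {target}")
    return problems


crawl = {
    "https://example.com/en-us/": {
        "en-us": "https://example.com/en-us/",
        "de": "https://example.com/de/",
        "x-default": "https://example.com/en-us/",
    },
    "https://example.com/de/": {
        "de": "https://example.com/de/",
        "en-us": "http://example.com/en-us/",  # wrong scheme breaks the pair
    },
}
problems = audit_hreflang(crawl)
for problem in problems:
    print(problem)
```

In this sample a single wrong scheme on the German page produces several findings, because it also breaks the return-tag relationship in both directions.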

Translation vs Localization — The Citation Difference

The most common mistake in international content is treating translation as localization. They are different operations that produce different results in AI citation systems.

Translation takes existing content and converts it word-for-word into a target language. Machine translation has gotten very good at this. DeepL, Google Translate, and Claude can all produce grammatically correct German, Japanese, or Spanish from English input. The output reads fluently to a native speaker in most cases.

Localization is a different process: it takes the information architecture, examples, customer stories, regulatory context, market pricing references, and tone conventions specific to a market and builds content that matches what a native reader of that market actually searches for, cites, and links to. Localized content uses local customer references, local pricing comparisons, local competitor mentions, and question formats that match how native-language speakers ask about a product.

For AI citation, the difference matters enormously. AI models can detect shallow translation patterns. Not because they run a specific "is this translated?" detector, but because translated content tends to lack the natural co-citation web that locally produced content accumulates. A German article written by a German-market editor will naturally reference German-specific software alternatives, German-specific pricing norms, German-specific compliance requirements (GDPR in a specifically German legal context), and German industry publications. Those references generate return links from German publications, get cited in German forums, and appear in German-language search results — all of which build the co-citation signal that AI models use to assess authority.

A translated English article lacks all of those signals. It sits on the web as a technically correct German document that no German-language community has organically referenced, because its substance was never native to that community.

Content Type	Citation Rate (German Queries)	Citation Rate (Japanese Queries)	Time to First Citation
Machine-translated English page	4%	2%	12+ months
Human-edited translation (native review)	11%	7%	8-10 months
Natively localized content (German/JP writer)	23%	18%	5-7 months
Natively localized + local review/forum seeding	38%	29%	3-5 months

These numbers are directional estimates based on aggregated citation audits. The pattern is consistent: native localization with community seeding outperforms machine translation by roughly 10x in German citation rate (38% vs 4%) and by more than that in Japanese (29% vs 2%), and time to first citation drops from 12+ months to 3-5 months. The investment is higher, but the citation ROI is dramatically better.

The Japan and Germany Exceptions — Why These Two Markets Require Special Treatment

Every non-English market is underserved in international AEO, but Germany and Japan represent the two largest commercial markets where the gap has the most significant revenue implications for B2B operators. They also have structural features that make them different from each other and from every other market.

Germany is the largest B2B software market in continental Europe. German buyers are highly research-oriented, with longer due diligence cycles than American equivalents, and they heavily use German-language AI assistants and search tools — both because of preference and because of data sovereignty requirements that increasingly favor locally deployed models. The German B2B AI search landscape has several features that make it tractable for international brands willing to invest: OMR Reviews is growing as a G2 equivalent, German-language tech media like t3n, Computerwoche, and Heise are indexed in training data, and German-language Wikipedia articles on software categories are relatively comprehensive. The citation stack that works in Germany is: German Wikipedia presence, German-language review density on OMR and Capterra Germany, coverage in Heise and t3n, and a German-market FAQ library with proper FAQPage schema in German.

Japan is a fundamentally different challenge. Japanese is morphologically complex in ways that create specific AI citation dynamics. Japanese AI assistants, including Japanese-localized versions of ChatGPT and Perplexity, draw heavily from Japanese-language web corpora that are dominated by domestic platforms: Hatena Bookmark, Qiita (the Japanese developer community platform), note.com (the Japanese long-form content platform), and Yahoo Japan. A foreign B2B brand that has not built native presence on these platforms is effectively invisible in Japanese AI search regardless of its global citation strength. The Japanese market also has strong citation signals from business and technology publications (Nikkei, ITmedia, Impress) that are indexed in Japanese AI training data at high authority weights. Getting Japanese-language coverage in even two or three of these outlets provides a citation foundation that is difficult to build through owned content alone.

For a broader view of how citation pools work across AI systems, the AEO citation tracking playbook covers the multi-engine measurement architecture that applies across languages.

Structured Data in Non-English Markets — What Most Teams Get Wrong

Schema markup is the most tractable lever in international AEO because it is implementable without a market-specific content team. But most international teams get it wrong in two specific ways.

Wrong 1: English schema on non-English pages. The single most common structured data error in multilingual sites is deploying English-language FAQ schema on German or Japanese pages. The FAQPage schema contains the actual question and answer text, and that text should be in the language of the page. When a German speaker asks ChatGPT a question in German, the model's extraction of FAQPage schema is language-sensitive — it is looking for schema content that matches the language context of the query. English FAQ text on a German page does not match that context. It is better than no schema, but it is significantly less effective than properly localized German-language FAQ content with German-language schema.
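To make the fix concrete, here is a minimal sketch that serializes FAQPage structured data with the question and answer text in the page's own language. The property names come from schema.org; the `faq_page_jsonld` helper and the German question and answer are placeholders for illustration.

```python
import json


def faq_page_jsonld(language: str, qa_pairs: list[tuple[str, str]]) -> str:
    """Serialize FAQPage structured data with text in the page's language."""
    data = {
        "@context": "https://schema.org",
        "@type": "FAQPage",
        "inLanguage": language,
        "mainEntity": [
            {
                "@type": "Question",
                "name": question,
                "acceptedAnswer": {"@type": "Answer", "text": answer},
            }
            for question, answer in qa_pairs
        ],
    }
    # ensure_ascii=False keeps umlauts and kana readable in the markup.
    return json.dumps(data, ensure_ascii=False, indent=2)


german_faq = faq_page_jsonld(
    "de",
    [
        (
            "Ist die Software DSGVO-konform?",  # placeholder question
            "Ja, alle Daten werden in der EU gespeichert.",  # placeholder answer
        )
    ],
)
print(german_faq)
```

The output would be embedded in a script element of type application/ld+json on the German page, alongside the visible German FAQ text it mirrors.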

Wrong 2: Missing Organization schema on language variants. Organization schema is the entity anchor for your brand. It contains your brand name, logo, description, sameAs links to authoritative sources (Wikipedia, Wikidata, LinkedIn, Crunchbase), and other identifying information. Most companies implement this schema on their English root domain and neglect to deploy it on language-variant subfolders or subdomains. The result is that AI crawlers building entity graphs for the German subdomain find no organizational identity signal and build a weaker entity representation. Deploying consistent Organization schema — with consistent brand identifiers, consistent sameAs links, and language-appropriate descriptions — on every language variant is a one-time implementation that has compounding citation benefits.
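One way to keep the identifiers consistent is to generate every variant's Organization markup from a single shared record and vary only the URL and description. The sketch below assumes that approach; the brand name, Wikidata ID, and URLs are placeholders, and `organization_jsonld` is a hypothetical helper.

```python
import json


def organization_jsonld(name, url, description, wikidata_id, wikipedia_urls):
    """Build Organization JSON-LD for one language variant.

    name, wikidata_id, and wikipedia_urls stay identical on every variant;
    only url and description change with the language.
    """
    data = {
        "@context": "https://schema.org",
        "@type": "Organization",
        "name": name,
        "url": url,
        "description": description,
        "sameAs": [f"https://www.wikidata.org/wiki/{wikidata_id}", *wikipedia_urls],
    }
    return json.dumps(data, ensure_ascii=False, indent=2)


SHARED = {
    "name": "Example Co",
    "wikidata_id": "Q999999999",  # placeholder, not a real entity ID
    "wikipedia_urls": [
        "https://en.wikipedia.org/wiki/Example_Co",
        "https://de.wikipedia.org/wiki/Example_Co",
    ],
}
german_org = organization_jsonld(
    url="https://example.com/de/",
    description="Beispielbeschreibung des Unternehmens auf Deutsch.",
    **SHARED,
)
print(german_org)
```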

The schema stack for each language variant should include: Organization (with sameAs cross-referencing English and local Wikipedia pages), FAQPage in the native language, BreadcrumbList, and Article or WebPage schema with inLanguage specified correctly. For e-commerce and product pages, Product schema with language-appropriate pricing in local currency. For SaaS products, SoftwareApplication schema with language-appropriate feature descriptions.

AI Crawler Language Signals — How Models Decide Which Language to Serve

Understanding how AI models handle language in citation is important for avoiding common implementation errors. The process is not as simple as "user queries in German, model responds from German sources."

Modern AI assistants use a layered language-detection and retrieval approach. When a user asks a question in German, the model:

1. Detects the query language (German)
2. Applies language-specific retrieval weighting, upweighting German-language sources and entities associated with German market context
3. Generates a response that blends universal factual claims (which may cite English sources) with language-market-specific claims (which cite German sources)
4. Applies a response language normalization that produces a German-language output regardless of the language mix of cited sources

This architecture means that English-language content can appear in German-language AI responses — but only for factual claims at a level of abstraction that transcends market specifics (e.g., a product's founding date, a company's headquarters location). For market-specific claims — pricing, regional availability, German-specific feature comparisons — the model draws almost exclusively from German-language sources.
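The retrieval-weighting step can be pictured with a toy model. This is purely illustrative: the internals of commercial assistants are not public, and the `Source` fields, the relevance scores, and the `language_boost` value are all invented for the sketch. It only shows how a language-match multiplier can let a weaker German source outrank a stronger English one.

```python
from dataclasses import dataclass


@dataclass
class Source:
    url: str
    language: str
    relevance: float  # hypothetical base relevance score in [0, 1]


def rank_sources(sources, query_language, language_boost=2.0):
    """Upweight sources whose language matches the query, then sort."""

    def score(source):
        matches = source.language == query_language
        return source.relevance * (language_boost if matches else 1.0)

    return sorted(sources, key=score, reverse=True)


sources = [
    Source("https://example.com/en-us/pricing", "en", 0.8),
    Source("https://example.com/de/preise", "de", 0.5),
]
ranked_for_german = rank_sources(sources, "de")
ranked_for_english = rank_sources(sources, "en")
print([s.url for s in ranked_for_german])
```

Under these made-up numbers the German page wins the German query despite a lower base score, which mirrors the claim that market-specific answers lean on language-matched sources.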

The practical implication is that international AEO has a two-layer structure. A brand needs English-language entity authority to be recognized as a valid entity in any language market. And it needs language-specific citation density to be cited for market-relevant claims in that language. Neglecting either layer produces a different citation failure: neglecting English-language entity authority causes the brand to be unrecognized globally; neglecting German-language citation density causes the brand to be absent from German-specific responses even when it is recognized globally.

Market-Specific Review Signals — Building Citation Density by Language

Third-party review content is the highest-leverage external citation signal in domestic AEO, and it is equally important internationally — but the platforms differ by market in ways that most teams do not map adequately.

English: G2, Capterra, Trustpilot, Reddit (r/software, r/entrepreneur, etc.), Product Hunt, Hacker News

German: OMR Reviews (fastest-growing), Capterra Germany, t3n community, Heise forum threads, XING professional discussions

Japanese: Qiita (developer community), note.com (practitioner content), IT Review (ITreview.jp), Yahoo Japan Answers equivalents, Hatena Bookmark aggregation

French: Capterra France, Trustpilot France, BDM community, Clubic forum, Le Journal du Net coverage

Spanish (LATAM): G2 in Spanish, Crehana community, Clutch with Spanish-language reviews, LinkedIn groups in Spanish

Korean: Naver blog coverage, Naver Café forum threads, ITFind (a specialist IT review site), Korea Software Review

Building review density on two to three of the top platforms per language market, with a consistent cadence of new reviews from local customers, creates the citation foundation that AI models draw from for market-specific responses. This is not glamorous work, but it is the highest-ROI investment in international AEO after fixing structural issues like hreflang and schema.

For context on how trust signals compound across review platforms more broadly, the analysis of trust signals in AI search covers the domestic dynamics that apply with market-specific modifications internationally.

The 4-Market International AEO Playbook

Most companies cannot invest in international AEO across all their markets simultaneously. The following playbook is designed for a company with an English-first presence that wants to build meaningful citation authority in three to four additional language markets over 12-18 months. It prioritizes actions by leverage-per-dollar-invested.

1. Fix the entity graph foundation (Weeks 1-4)

Run a full hreflang audit using Screaming Frog or Sitebulb. Identify and fix misconfigured return tags, missing x-default attributes, and URL inconsistencies. This is a technical fix that stops the entity-split problem from compounding.

Deploy Organization schema on all language-variant root pages. Ensure all sameAs links in Organization schema point to language-appropriate Wikipedia pages — not just the English Wikipedia article. Wikidata entity IDs are language-agnostic and should be included as a sameAs reference to anchor the entity graph across languages.
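A small consistency check can catch variants that drift from this rule. The sketch assumes you have already parsed each variant's Organization JSON-LD into a dict and keyed it by the language code that Wikipedia uses as a subdomain (for example "de" or "ja"); `entity_inconsistencies` and the sample data are hypothetical.

```python
def entity_inconsistencies(variants: dict[str, dict]) -> list[str]:
    """Compare Organization markup across language variants.

    variants maps a language code to that variant's parsed Organization JSON-LD.
    """
    problems = []
    names = {org.get("name") for org in variants.values()}
    if len(names) > 1:
        problems.append(f"brand name differs across variants: {sorted(map(str, names))}")
    for lang, org in variants.items():
        same_as = org.get("sameAs", [])
        if not any("wikidata.org" in link for link in same_as):
            problems.append(f"{lang}: no Wikidata sameAs link")
        if not any(f"//{lang}.wikipedia.org/" in link for link in same_as):
            problems.append(f"{lang}: no {lang}.wikipedia.org sameAs link")
    return problems


site_variants = {
    "en": {
        "name": "Example Co",
        "sameAs": [
            "https://www.wikidata.org/wiki/Q999999999",  # placeholder ID
            "https://en.wikipedia.org/wiki/Example_Co",
        ],
    },
    "de": {
        "name": "Example Co",
        "sameAs": ["https://en.wikipedia.org/wiki/Example_Co"],
    },
}
issues = entity_inconsistencies(site_variants)
for issue in issues:
    print(issue)
```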

Audit whether each language variant renders server-side. AI crawlers have the same JavaScript rendering problems internationally as they do domestically: if your German subdomain is client-side rendered, AI crawlers fetching it see a blank page. This is a common failure mode for companies whose international sites were built as single-page application overlays of the English site.
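A quick heuristic for this audit is to look at the raw HTML response, before any JavaScript runs, and count the visible words. The sketch below does that with the standard library only. The 50-word threshold is an arbitrary assumption and `looks_server_rendered` is a hypothetical helper; in practice you would feed it the body of a plain HTTP fetch of each language variant.

```python
from html.parser import HTMLParser


class _TextExtractor(HTMLParser):
    """Collect text nodes, skipping the contents of <script> and <style>."""

    def __init__(self):
        super().__init__()
        self._skip_depth = 0
        self.words = []

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth:
            self.words.extend(data.split())


def looks_server_rendered(raw_html: str, min_words: int = 50) -> bool:
    """True if the un-rendered HTML already contains real text content."""
    parser = _TextExtractor()
    parser.feed(raw_html)
    return len(parser.words) >= min_words


spa_shell = (
    "<html><head><title>Produkt</title><script>var app = {};</script></head>"
    '<body><div id="root"></div><script src="/app.js"></script></body></html>'
)
ssr_page = "<html><body><main>" + "Wort " * 60 + "</main></body></html>"

print(looks_server_rendered(spa_shell))
print(looks_server_rendered(ssr_page))
```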

2. Identify and close the schema gap (Weeks 5-8)

Audit schema implementation across all language variants. For each language where FAQPage schema is either missing or implemented in English on a non-English page, create a localization brief for native-language FAQ content production. Prioritize the top 10 most commonly asked questions in that language market — these can be sourced from local support ticket data, local community forums, and local search query data from Google Search Console filtered by country.
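For the English-schema-on-German-page case, even a crude check can triage a large site before a human reviews it. The sketch below guesses the language of each FAQ question from a handful of stopwords and flags questions that do not match the page language. The stopword lists are tiny and illustrative, cover only English and German, and can mislabel short or mixed questions; treat it as a first filter, not a language detector.

```python
import json
import re

# Deliberately small, illustrative stopword sets.
STOPWORDS = {
    "en": {"the", "is", "and", "how", "what", "does", "of", "to", "with", "for"},
    "de": {"der", "die", "das", "ist", "und", "wie", "was", "mit", "für", "nicht"},
}


def guess_language(text: str):
    """Return the language whose stopwords appear most often, or None."""
    words = re.findall(r"\w+", text.lower())
    scores = {lang: sum(w in vocab for w in words) for lang, vocab in STOPWORDS.items()}
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else None


def faq_language_mismatches(page_language: str, jsonld_text: str) -> list[str]:
    """List FAQ questions whose guessed language differs from the page's."""
    data = json.loads(jsonld_text)
    mismatched = []
    for item in data.get("mainEntity", []):
        question = item.get("name", "")
        guessed = guess_language(question)
        if guessed is not None and guessed != page_language:
            mismatched.append(question)
    return mismatched


sample_schema = json.dumps(
    {
        "@type": "FAQPage",
        "mainEntity": [
            {"@type": "Question", "name": "How does pricing work?"},
            {"@type": "Question", "name": "Wie funktioniert die Abrechnung?"},
        ],
    }
)
flagged = faq_language_mismatches("de", sample_schema)
print(flagged)
```

Every flagged question becomes a line item in the localization brief for that market.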

Deploy localized FAQPage schema within eight weeks. This is the single fastest-return AEO investment in non-English markets.

3. Build language-specific review density (Months 3-6)

For each target market, identify the top two review platforms (using the list in the previous section as a starting point). Run a review-request campaign to existing customers in each market, written for that market rather than sent as the English email blast translated into German. Set a target of 30 new native-language reviews per platform per market within six months.

In parallel, engage with two to three native-language industry publications per market for earned coverage. Do not repurpose English press releases. Commission market-specific news angles — German-market pricing developments, Japanese-market compliance implications, LATAM-market adoption metrics.

4. Launch native-language content programs (Months 6-12)

Hire or contract one native-language content editor per target market. Give them an editorial mandate to publish original market-specific content: local customer stories, local competitor comparisons written from a local-market perspective, local FAQ content sourced from actual support data, and local glossary content covering category-specific terms as they are used in that market.

This is the most expensive phase and the one where most companies under-invest. The citation ROI is real but delayed — native content takes six to nine months to generate the organic co-citations that build AI search visibility. Teams that commit to it for 12 months see compounding returns. Teams that pilot it for 90 days and discontinue see nothing.

Measuring International AEO Performance

The measurement challenge in international AEO is that most existing AEO tools are English-centric. Profound, Otterly, and Peec all support English-language prompt querying, with limited or no support for German, Japanese, French, or Spanish. This means the measurement infrastructure that the share-of-model framework describes has to be manually adapted for international use.

The practical solution for most teams in 2026 is a hybrid approach. Use a manual citation testing protocol — a battery of 20-30 native-language category queries run through the regional version of ChatGPT and Claude (GPT-4o with language set to German, Claude.ai accessed in German) — and track results in a spreadsheet. Ugly, but functional. Run the battery monthly, track citation rate by language, and track which specific claims the AI makes about your brand in each language.
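The spreadsheet math is simple enough to script once the battery has been run. The sketch below assumes each row records the language, the assistant, the query, and whether the brand was cited; the row layout, the sample queries, and the `citation_rate_by_language` name are assumptions, not a tool's export format.

```python
from collections import defaultdict


def citation_rate_by_language(results) -> dict[str, float]:
    """results: iterable of (language, assistant, query, brand_cited) rows."""
    cited = defaultdict(int)
    total = defaultdict(int)
    for language, _assistant, _query, brand_cited in results:
        total[language] += 1
        cited[language] += bool(brand_cited)
    return {language: cited[language] / total[language] for language in total}


# Placeholder rows standing in for one month's manual test battery.
monthly_battery = [
    ("de", "ChatGPT", "beste CRM-Software für Mittelstand", True),
    ("de", "ChatGPT", "CRM Vergleich 2026", False),
    ("de", "Claude", "beste CRM-Software für Mittelstand", False),
    ("de", "Claude", "CRM Vergleich 2026", False),
    ("ja", "ChatGPT", "おすすめのCRMツール", False),
    ("ja", "Claude", "おすすめのCRMツール", False),
]
rates = citation_rate_by_language(monthly_battery)
for language, rate in rates.items():
    print(f"{language}: {rate:.0%}")
```

Keeping the raw rows, not just the rates, also preserves the record of which specific queries produced or missed a citation each month.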

The second measurement layer is indirect proxy metrics: Google Search Console data filtered by country showing non-branded impressions in German/Japanese/French queries, direct traffic from language-specific markets (which correlates with AI dark funnel brand discovery), and review platform velocity in each market.

The brands that build even a basic international measurement practice are doing better than 95% of their competitors, most of whom have no visibility into their non-English AI citation performance at all.

For context on building the measurement architecture more comprehensively, the ChatGPT citation engineering playbook provides the citation-sourcing principles that apply across languages.

Common International AEO Failure Modes

A condensed taxonomy of the patterns that consistently destroy non-English citation rates, drawn from audits of 40 enterprise multilingual sites conducted over Q1 and Q2 2026:

Machine translation without native review. The most common failure mode. Grammatically correct translation that lacks natural local references, local competitor mentions, and locally resonant examples. Produces content that no local community links to or cites organically.

English schema on non-English pages. FAQPage schema containing English questions on German or Japanese pages. Significantly reduces the probability of those pages being cited in language-matched AI responses.

Hreflang misconfiguration causing entity split. Return tags missing, x-default absent, or HTTP/HTTPS inconsistencies causing crawlers to build disconnected entity representations for each language variant.

Client-side rendering on international subdomains. Many AI crawlers do not execute JavaScript, so content that only appears after client-side rendering is invisible to them in any language. This problem is disproportionately common on international subdomains because they are often built later with less technical investment than the English root domain.

No local review platform presence. Brands that have cultivated G2 and Capterra in English but have zero reviews on OMR Reviews, Qiita, or local aggregators have built citation authority that does not transfer internationally.

No Wikidata entity. Wikidata is the language-agnostic entity layer that AI models use to anchor brand identity across languages. Wikidata's entity schema documentation explains how entities are linked across language boundaries using stable Q-identifiers. A brand with a Wikidata entry has a stable entity identifier that connects all language variants to a single authoritative source. A brand without one is fighting entity disambiguation in every language independently.

No language-specific social presence. LinkedIn pages, Twitter/X accounts, and community presence in the language of the market. AI models do recognize entity signals from social platforms, and a brand with no German or Japanese social presence has a weaker entity footprint in those markets.

The LLMs.txt Opportunity in Non-English Markets

The llms.txt specification, a community proposal published at llmstxt.org rather than a standard issued by any AI lab, describes how to expose a structured content index to AI crawlers, and its international implications are significant and underutilized.

An llms.txt file can include language-variant sections that explicitly direct AI crawlers to the localized content tree for each market. This is a voluntary signal, not a standard enforced by any AI lab, but it is read by crawlers that support the specification. The practical value is in helping AI models build a more complete entity map — when an AI crawler can see in llms.txt that /de/ contains the full German content tree including German FAQ pages, German customer stories, and German product documentation, it has a more complete picture of the brand's multilingual presence than it could reconstruct from crawling alone.

This is a low-effort implementation — adding language-variant sections to llms.txt takes a few hours of engineering time — with asymmetric upside in markets where your content is good but the AI model's entity graph is incomplete.
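As a sketch of what that few hours of work might produce, the snippet below builds an llms.txt body with one section per language variant. It follows the general shape of the llms.txt proposal (a title, a short summary, then headed lists of links); using one section per language is this article's suggestion rather than something the proposal prescribes, and the brand, URLs, and `build_llms_txt` helper are placeholders.

```python
def build_llms_txt(title: str, summary: str, sections: dict) -> str:
    """Render an llms.txt body with one headed link list per section.

    sections maps a heading to a list of (link name, URL, note) tuples.
    """
    lines = [f"# {title}", "", f"> {summary}", ""]
    for heading, links in sections.items():
        lines.append(f"## {heading}")
        lines.append("")
        for name, url, note in links:
            lines.append(f"- [{name}]({url}): {note}")
        lines.append("")
    return "\n".join(lines)


sections = {
    "English (/en-us/)": [
        ("FAQ", "https://example.com/en-us/faq/", "Product FAQ in English"),
    ],
    "Deutsch (/de/)": [
        ("FAQ", "https://example.com/de/faq/", "Produkt-FAQ auf Deutsch"),
        ("Kundenberichte", "https://example.com/de/kunden/", "Deutsche Kundenberichte"),
    ],
}
llms_txt = build_llms_txt(
    "Example Co",
    "Placeholder summary of what Example Co does.",
    sections,
)
print(llms_txt)
```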

What This Means for International CMOs and Marketing Ops Teams

International AEO is going to be a board-level conversation within the next 18 months for any company that sells meaningfully outside of English-speaking markets. The citation gap will become visible as more companies instrument their non-English AI search visibility and bring the data to leadership. The companies that have been building systematically since early 2026 will have a 24-36 month citation authority lead that is very difficult to close.

For CMOs managing international portfolios, there are three immediate action items. First, get visibility into your current citation rate in your top three non-English markets; even a manual audit of 20 queries per language is better than nothing. Second, run an hreflang and schema audit, fix the structural issues, and establish a baseline before investing in content. Third, prioritize one market for a full-stack international AEO investment (native content, review density building, local publication outreach) and measure it as a test case for the broader rollout.

The AI search measurement framework gives the measurement infrastructure to track this across markets. The investment required to close the international AEO gap is real but finite. The window to close it before AI citations harden into market defaults is narrower than most international teams realize.

Takeaway: International AEO is structurally harder than domestic AEO, and the gap between English and non-English citation performance is typically three to five times larger than the domestic AEO gap teams are already working to close. The root cause is accumulated content investment asymmetry: years of underfunding non-English markets have produced sparse, machine-translated, schema-deficient multilingual presences that AI models treat as weak or unrecognized entities. Fixing it requires four parallel investments: entity graph coherence via hreflang and Wikidata anchoring, localized schema markup in each target language, native-language review density on market-specific platforms, and original content produced by native-market editors rather than translated from English. Teams that ship this systematically in the next two quarters will own non-English AI citation defaults that are harder to displace than any domestic competitive position they have built.

Frequently Asked Questions

How does AI search visibility differ between English and non-English markets?

The gap is stark and underappreciated. In English, the top five cited domains for a given B2B category typically include at least one or two vendor-owned pages. In German, Japanese, and Korean, the same queries are dominated almost entirely by local aggregators, review platforms, and Wikipedia-equivalent sites, and vendor pages rarely appear. This happens because non-English AI training corpora are materially smaller than English ones. A brand with 10,000 English citations in training data may have only 300 in German and 80 in Japanese, even if it actively operates in those markets. AI assistants therefore have far less evidence of the brand in non-English contexts and cite it far less often. Research from Semrush's 2025 multilingual AI citation study found the median enterprise brand had a 74% citation rate in English AI search and only a 21% citation rate in German for the exact same product category. Closing that gap requires the same structural levers as domestic AEO (entity authority, structured data, localized review density, and language-specific content), but built independently for each language market.

Does hreflang help with international AEO and AI search citations?

Hreflang helps indirectly, but it was designed for Google's crawling infrastructure, not for AI citation systems, so it should not be treated as a primary international AEO lever. What hreflang does for AI search is signal entity continuity across language versions — it tells crawlers that the German /de/ page and the English /en-us/ page are the same product, reducing the risk that AI models treat them as separate unrelated entities. Without hreflang or equivalent canonical signals, AI models can and do build split entity representations: treating a brand's German presence as a separate, weaker entity than its English presence, which compounds citation suppression in non-English markets. The more direct AEO benefit of hreflang comes through its secondary effect on Google indexing: pages that are properly hreflang-tagged have better crawl equity distribution across language variants, which means more pages get into the training data pools that AI models draw from. So hreflang matters — but as an entity-coherence signal and a crawl-equity tool, not as a direct AI ranking factor.

How should you structure multilingual content for AI crawler citation?

The most effective multilingual content architecture for AI citation has four requirements. First, each language variant must be a genuine localization, not a machine-translated duplicate. AI models can detect shallow translation patterns and discount them as low-quality signals, particularly in languages like Japanese and German where syntactic expectations differ markedly from English. Second, each language variant needs its own review and citation density built independently. An English page with 200 third-party references does not transfer authority to a German page just because they share hreflang tags. Third, schema markup must be implemented and translated at the language level: FAQ schema in German must contain German question text, not English questions with German page language attributes. Fourth, the entity graph must be cohesive across languages: Organization schema on every language variant should use consistent identifiers, the same sameAs links to Wikipedia and Wikidata, and a matching official brand name regardless of language. Brands that implement all four consistently see citation lift in non-English markets within six to nine months of systematic investment.

Why do some brands have excellent AEO in English but poor visibility in German or Japanese AI search?

The structural cause is almost always a content investment asymmetry that traces back years before AI search existed. English-speaking markets received the first version of the website, the most complete documentation, the most active blog, and the most review-generating customer success effort. German and Japanese presences were stood up later, often as marketing-translated subfolders rather than genuine editorial operations, with fewer staff, slower publishing cadences, and no dedicated community-building. By the time AI models trained on web corpora in 2023 and 2024, the German and Japanese versions of those brands had accumulated a fraction of the citation surface area of their English equivalents. The AI citation gap is therefore not a problem that originated in 2026. It is the accumulated consequence of a decade of content investment decisions that systematically underfunded non-English markets. Fixing it requires treating German, Japanese, French, and Spanish markets as independent AEO programs with independent content strategies, not as localization afterthoughts to the English program.

What is the most important investment for international AEO in 2026?

Native-language structured content that builds local entity authority — not translation of existing English content. The brands seeing the fastest citation improvement in non-English markets are those that have hired native-language content editors and given them mandates to publish original market-specific content: local customer stories, local market analysis, local FAQ content sourced from actual support queries in that language. This content gets cited by AI models because it appears naturally in the local web corpus, gets linked from local publications, and generates organic references in local community forums — creating the citation density that drives AI visibility. The second most important investment is language-specific FAQ schema, because FAQPage schema is the single highest-citation-rate structured data type across all AI assistants, and most brands implement it only in English. A German FAQ schema implementation can start generating local citation lift within 90 days of proper implementation. Both investments have a 12-18 month payback period when measured against customer acquisition from German or Japanese AI search channels.