AEO Managed Services Pricing: What 15 Providers Actually Charge in 2026

Robyn, LightweightMMM, Recast, Mass Analytics, and Pecan AI have made marketing mix modeling cheap enough for mid-market operators. The next discipline is treating AI search as a first-class channel input alongside paid, organic, and affiliate — and validating the coefficients with geo holdouts before the CFO does it for you.

By Grace Mwangi, Impact & ESG · May 26, 2026 · 17 min read

When the head of growth at a publicly traded DTC retailer asked us in February 2026 why their multi-touch attribution stack showed ChatGPT and Perplexity contributing 0.7 percent of revenue while their CFO's MMM showed 9.4 percent, the answer was a methodological gap, not a measurement error. Both numbers were correct under their respective assumptions. The MTA stack only counted sessions with a clean ai.com or perplexity.ai referrer string, and most AI-search-initiated conversions came in via direct, branded organic, or branded paid as the user re-discovered the brand through a familiar surface. The MMM, fitted on weekly aggregate revenue and weekly AI citation counts from Profound, was picking up the dark-funnel exposure that the MTA was structurally blind to.

The CFO was right. The marketing team had been underbudgeting AEO by an order of magnitude.

Marketing mix modeling has been the quiet winner of the post-cookie measurement era. Meta's Robyn project, Google's LightweightMMM, and a generation of hosted Bayesian platforms — Recast, Mass Analytics, Pecan AI — have made the technique cheap enough that mid-market companies now run MMMs that were the exclusive province of CPG and pharma five years ago. Forrester's 2024 MMM Wave flagged the category as one of the fastest-growing measurement disciplines in B2C marketing technology, and the 2026 follow-up extended that observation into B2B as ChatGPT and Perplexity displaced last-touch tracking in software buying journeys.

The discipline that the 2026 cycle is still learning is how to treat AI search as a first-class channel input. AEO does not show up cleanly in the standard channel taxonomy. It does not have impressions in the AdWords sense. Its referrals are unreliable. Its lag-to-conversion is longer than paid social and shorter than brand TV. And its share of measurable budget — at most companies, still under 5 percent of marketing spend — is small enough that naive MMMs report it as noise. This is a working operator's guide to fitting AI search into a marketing mix model, picking the right tool, validating the coefficients with geo experiments, and surviving the inevitable CFO challenge that follows.

Why MMM Beats MTA for Measuring AI Search

The case for marketing mix modeling over multi-touch attribution in the AEO era starts with a simple observation: AI search does not pass referrer data the way Google organic does. When a user reads a Perplexity answer that cites your brand and then converts an hour later by typing your URL directly, the multi-touch stack sees a direct-to-site session and credits the conversion to direct. The AI exposure that drove the visit is invisible. We unpacked this pattern in Dark funnel attribution, where the structural gap between answer engine exposure and downstream conversion makes user-level attribution unreliable as the primary measurement layer.

MMM does not depend on user-level tracking. It works on aggregated weekly or daily time-series — total revenue, total impressions, total spend, total exposures — and fits a regression that decomposes the revenue series into channel contributions. If AI search exposure rises in a given week and revenue rises that week or in the following two weeks, the model will attribute some share of revenue to the AI search channel even if not a single user clicked a tracked AI search link. That is the core power of the technique for AEO: it measures what cookies cannot see.
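A minimal sketch of that idea, using invented weekly numbers and a single lagged regressor (a real MMM adds more channels, adstock, saturation, and controls). The one-week lag, the series values, and the helper name `ols_slope_intercept` are illustrative assumptions, not anything from a specific tool:

```python
# Toy aggregate series: no user-level data, only weekly totals.
# Assumption: revenue responds to AI search exposure with a one-week lag.
exposure = [10, 12, 15, 11, 20, 18, 25, 22, 30, 28]  # e.g. weekly citation counts
BASE, TRUE_BETA = 1000.0, 40.0                        # invented ground truth
revenue = [BASE] + [BASE + TRUE_BETA * x for x in exposure[:-1]]


def ols_slope_intercept(x, y):
    """Ordinary least squares for one regressor plus an intercept."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Align last week's exposure with this week's revenue.
lagged_exposure = exposure[:-1]
current_revenue = revenue[1:]

beta, intercept = ols_slope_intercept(lagged_exposure, current_revenue)
print(f"revenue per unit of lagged exposure: {beta:.1f}, baseline: {intercept:.1f}")
```

The regression recovers the exposure effect from totals alone, without a single tracked click. That is the property that makes the technique useful for AI search.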

The trade-off is that MMM is coarser than MTA. It reports channel-level lift, not user-level paths. It needs at least 18 months of weekly data to fit reliably, and 24 to 36 months if multiple channels share temporal patterns. It is sensitive to specification choices — which channels you include, how you bucket spend, what adstock decay you assume. And it does not replace MTA so much as complement it. The right 2026 measurement stack has both, with MMM as the strategic budget allocator and MTA as the tactical campaign optimizer. The framing for that complementarity is laid out in Multi-touch attribution, which covers how MTA needs to evolve to coexist with MMM rather than be replaced by it.

For AEO specifically, MMM is the technique that lets you defend AI search budget to a CFO who wants a number. A coefficient of 9.4 percent on a $50 million revenue line is $4.7 million. That defends a meaningful budget allocation. A 0.7 percent MTA number defends nothing.
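The arithmetic behind that comparison, as a short sketch. The $50 million revenue line is the illustrative figure from the paragraph above; applying the 0.7 percent MTA share to the same revenue line is an assumption made here for comparison:

```python
def attributed_revenue(total_revenue, share_pct):
    """Revenue attributed to a channel, given its percentage share."""
    return total_revenue * share_pct / 100.0


REVENUE = 50_000_000  # illustrative revenue line from the text

mmm_dollars = attributed_revenue(REVENUE, 9.4)  # MMM share of revenue
mta_dollars = attributed_revenue(REVENUE, 0.7)  # MTA share, same revenue line (assumed)

print(f"MMM: ${mmm_dollars:,.0f}  MTA: ${mta_dollars:,.0f}  "
      f"gap: ${mmm_dollars - mta_dollars:,.0f}")
```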

The 2026 MMM Tool Landscape

The MMM tooling landscape in 2026 splits into three tiers: open-source frameworks for teams with analyst capacity, hosted Bayesian platforms for teams that want continuous re-fitting without the engineering overhead, and enterprise consulting plus platform combinations for regulated industries that need audit trails and statistical defense. The table below compares the six tools our team has built or audited implementations on since 2024.

Tool	Type	Statistical Approach	AI Search Channel Handling	Geo-Experiment Calibration	Best For
Meta Robyn	Open-source R	Ridge regression with adstock and saturation, evolutionary hyperparameter search	Manual channel construction; flexible	Native calibration workflow	Teams with R analysts, batch quarterly cadence
Google LightweightMMM	Open-source Python/JAX	Bayesian, NumPyro priors, hierarchical option	Manual channel construction; flexible	Manual integration	Python-native teams, notebook workflows
Google Meridian	Open-source Python/TensorFlow Probability	Bayesian, hierarchical geo-level	Manual channel construction; geo-native	Native, geo-hierarchical	Teams ready to migrate off LightweightMMM
Recast (Aurelius)	Hosted Bayesian SaaS	Bayesian, continuous re-fitting	Native channel templates	Native, automated	Mid-market teams wanting continuous models
Mass Analytics	Enterprise consulting + MassTer platform	Bayesian and frequentist hybrid	Custom per engagement	Native, custom design	Regulated industries, audit-heavy
Pecan AI	Predictive analytics SaaS	Machine-learning hybrid	Native, low-code	Limited	Teams optimizing for time-to-first-model

The category has been moving Bayesian since 2023. Frequentist regressions — including Robyn's ridge-regression-with-hyperparameter-search approach — remain widely used because they are computationally cheap and well understood, but Bayesian methods better quantify uncertainty, which matters when you are reporting a channel coefficient to a CFO who wants confidence intervals. Google's strategic bet on Bayesian is visible in the deprecation arc from LightweightMMM toward Meridian, which ships with hierarchical priors that handle geo-level data more cleanly. Robyn has added Bayesian options through its plugin ecosystem but remains primarily ridge-regression at its core.

Meta Robyn: The Open-Source Default

Robyn is the most-installed open-source MMM in 2026 by a wide margin, with the GitHub repository past 1,800 stars and an active community on the Facebook Open Source Discord. It is written in R, uses Nevergrad for hyperparameter optimization, supports adstock with Weibull or geometric decay, supports saturation curves with Hill or root functions, and ships a calibration workflow that lets you constrain the model's channel coefficients against geo-experiment ground truth.

For AEO specifically, Robyn's flexibility on channel construction is the key feature. The standard tutorial assumes channels are paid media with impression and spend variables. AI search needs a different input — typically share-of-voice in answer engines or aggregated citation counts — and Robyn lets you add arbitrary variables without breaking the optimization pipeline. The cost is the R requirement: most marketing analytics teams in 2026 are Python-native, and Robyn requires either an R-capable analyst or a willingness to run it through a wrapper. The Robyn team has discussed Python ports, but the canonical implementation remains R as of mid-2026.

Google LightweightMMM and Meridian: The Python Path

LightweightMMM was Google's earlier open-source contribution to the category, built on JAX and NumPyro and released through the Google Research GitHub organization. It supports Bayesian estimation with prior specification, hierarchical models for geo data, and standard adstock and saturation transforms. The library is mature and stable, but Google has been signaling that its strategic investment is shifting to Meridian, which is designed to handle geo-hierarchical structure as a first-class capability and supports cleaner integration with Google's broader measurement stack including Ads Data Hub and Google Analytics 4.

For teams already on LightweightMMM, the migration path to Meridian is non-trivial — the prior specification syntax and the channel construction patterns differ — and many teams will keep LightweightMMM in production through 2027. For teams starting fresh in 2026, Meridian is the better forward-compatible choice, particularly if geo-experiment calibration is on the roadmap.

Recast: The Hosted Continuous-Fitting Choice

Recast, built by Aurelius Marketing Sciences, is the leading hosted Bayesian MMM platform. The pitch is that the model re-fits continuously as new data arrives rather than running as a quarterly batch, which matters in the AEO era because answer engine ranking changes faster than quarterly. Recast handles channel construction, adstock specification, and geo-experiment calibration in a structured UI, which lowers the analyst-skill bar to operate the model.

For mid-market companies whose marketing analytics team is one or two analysts rather than a dedicated MMM specialist, Recast is the path of least operational resistance. The trade-off is that you do not own the model — the methodology lives inside Recast's platform, the priors are partly set by their data science team, and the configuration flexibility is narrower than Robyn or LightweightMMM. For teams that want to defend the model's specification choices in detail to a skeptical CFO, the open-source path retains an advantage.

Mass Analytics: The Enterprise Consulting Hybrid

Mass Analytics is the enterprise-tier player in the category, combining consulting engagement with the MassTer platform. The methodology is Bayesian and frequentist hybrid, the audit trail is deeper than open-source or hosted-only platforms, and the consulting layer means the model specification is defended by named statisticians rather than your in-house analyst. The cost is six-figure annual engagement, which prices out smaller operators.

For regulated industries — pharma, insurance, financial services — where the MMM output feeds compliance disclosures or board-level capital allocation decisions, the consulting layer matters. The audit-trail requirement is real, and the open-source tools, while methodologically sound, do not ship with the documentation that regulators ask for.

Pecan AI: The Time-to-First-Model Pitch

Pecan AI is a predictive-analytics-first vendor whose MMM module sits inside a broader churn-prediction and LTV-modeling platform. The methodological depth is shallower than Robyn or Recast — the platform leans on machine learning hybrids rather than transparent Bayesian or frequentist regressions — and the auditability is correspondingly weaker. The advantage is time-to-first-model, which can be days rather than weeks. For teams running a measurement experiment to decide whether to invest in MMM at all, Pecan AI is the fastest path to a first directional read. For teams committed to MMM as a strategic measurement discipline, Robyn or Recast is the better long-term home.

How to Treat AI Search as an MMM Channel Input

The hardest part of fitting AI search into an MMM is constructing the channel input correctly. Paid media inputs are straightforward — impressions, spend, sometimes reach — and the data flows from the ad platforms directly. AI search has no comparable native input. The options, in order of measurement quality:

Share-of-voice in answer engines. Tools like Profound, Otterly, Peec, and Ahrefs Brand Radar measure how often a brand surfaces in answer engine responses for a defined query set. Weekly share-of-voice, weighted by query volume, is the strongest single proxy for AI search exposure that we have found. The query set needs to be stable across measurement periods, ideally locked in advance and reviewed quarterly. The weighting by query volume comes from Google Search Console or third-party search-volume estimators.

Referral sessions tagged as AI-source. Server logs and analytics tools can tag sessions whose referrer matches known AI surfaces (chatgpt.com, chat.openai.com, perplexity.ai, gemini.google.com, claude.ai, you.com) and produce a weekly count. This is the most directly observable AI input, but it undercounts dark-funnel exposure. It works as a secondary input in the MMM but should not be the sole channel construct.

Synthetic impression count from citation tracking. For teams with citation-tracking infrastructure, the count of times a brand was cited in a sampled set of answer engine responses, multiplied by an estimated query frequency, produces a synthetic impression count comparable to paid media impressions. The estimation introduces noise but enables apples-to-apples comparison across channels in the MMM.

Branded search lift. The branded search volume lift correlated with AEO activity is an indirect input that captures downstream demand creation. It does not work as the primary channel input but can be used as a downstream KPI for model validation. The framework for measuring this is in Branded search lift.
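The first three inputs can be sketched in a few lines. This is a rough illustration under stated assumptions: the row layout `(query_volume, responses_citing_brand, responses_sampled)`, the function names, and the sample numbers are all hypothetical, and real tracking tools export their own formats:

```python
from urllib.parse import urlparse

# Referrer hosts from the list above, plus chatgpt.com (ChatGPT's current
# domain). Extend this set as new surfaces appear.
AI_REFERRER_HOSTS = {
    "chatgpt.com", "chat.openai.com", "perplexity.ai",
    "gemini.google.com", "claude.ai", "you.com",
}


def is_ai_referrer(referrer_url):
    """True if the referrer hostname is a known AI surface.

    Expects a full URL with a scheme; bare hostnames parse to no host.
    """
    host = (urlparse(referrer_url).hostname or "").lower()
    if host.startswith("www."):
        host = host[4:]
    return host in AI_REFERRER_HOSTS


def weighted_share_of_voice(rows):
    """Volume-weighted share of sampled answers that cite the brand."""
    total_volume = sum(volume for volume, _, _ in rows)
    weighted = sum(volume * cited / sampled for volume, cited, sampled in rows)
    return weighted / total_volume


def synthetic_impressions(rows):
    """Citation rate multiplied by estimated query frequency, summed."""
    return sum(volume * cited / sampled for volume, cited, sampled in rows)


# One hypothetical week: (query_volume, responses_citing_brand, responses_sampled)
week = [(1000, 3, 10), (500, 5, 10), (500, 0, 10)]
sov = weighted_share_of_voice(week)
impressions = synthetic_impressions(week)
print(f"share of voice: {sov:.3f}, synthetic impressions: {impressions:.0f}")
print(is_ai_referrer("https://www.perplexity.ai/search?q=best+running+shoes"))
```

Keeping the query set fixed across weeks matters more than the exact weighting scheme. If the set drifts, the series measures the query mix rather than exposure.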

The right operational pattern is to use share-of-voice as the primary channel input, validated against referral sessions and citation counts, with branded search lift as a downstream check. Apply geometric adstock with a half-life of seven to fourteen days, because the lag between AI search exposure and conversion is longer than paid social but shorter than display brand campaigns. The saturation curve should be close to linear at current exposure levels, since AEO investment is still in the linear part of the response curve for most operators, with diminishing returns kicking in only at much higher levels of exposure.
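To make the adstock and saturation choices concrete, here is a small sketch of the two transforms. It assumes weekly data, so a seven to fourteen day half-life becomes one to two periods. The function names are illustrative and do not correspond to any particular library's API:

```python
def decay_from_half_life(half_life_periods):
    """Per-period retention rate such that carryover halves after the half-life."""
    return 0.5 ** (1.0 / half_life_periods)


def geometric_adstock(series, decay):
    """Carry a fraction `decay` of accumulated exposure into the next period."""
    adstocked, carry = [], 0.0
    for value in series:
        carry = value + decay * carry
        adstocked.append(carry)
    return adstocked


def hill_saturation(x, half_saturation, shape=1.0):
    """Hill curve in [0, 1): roughly linear while x is far below half_saturation."""
    if x <= 0:
        return 0.0
    return x ** shape / (x ** shape + half_saturation ** shape)


weekly_decay_7d = decay_from_half_life(1.0)   # 7-day half-life on weekly data
weekly_decay_14d = decay_from_half_life(2.0)  # 14-day half-life on weekly data

# A single burst of exposure decays over the following weeks.
burst = geometric_adstock([100.0, 0.0, 0.0, 0.0], weekly_decay_7d)
print(burst)
print(round(weekly_decay_14d, 4))
```

With `half_saturation` set well above current exposure, the Hill curve stays near-linear across the observed range, which matches the assumption that most operators have not yet reached diminishing returns.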

Playbook: Fitting Your First AEO-Inclusive MMM in 90 Days

The 90-day window assumes you have at least 18 months of historical weekly revenue data and roughly 12 months of AI search exposure data. If you have less than 12 months of AEO data, the model will struggle to separate AI search contribution from other channels.

1. Lock the channel taxonomy and data sources. Define every channel that will enter the model — typically paid search, paid social, display, organic search, organic social, email, affiliate, AI search, and brand TV or OOH if applicable. For AI search, decide which surfaces are in scope (ChatGPT, Perplexity, Gemini, Claude, you.com) and which tool sources the share-of-voice number. Lock the query set used to compute share-of-voice. Document the channel taxonomy in a shared spec that the analyst, marketing lead, and finance partner all sign off on. Schedule a weekly data pull that lands cleanly tagged data into a warehouse table.

2. Pick the tool and build the model spec. Choose Robyn, LightweightMMM, Meridian, Recast, or another platform based on analyst skill and cadence requirements. Build the model spec — channels, adstock priors, saturation priors, control variables for seasonality and holidays, geo-level disaggregation if applicable. For Bayesian tools, set informative priors based on prior MMM work or analogous benchmarks. For Robyn, configure the Nevergrad budget and the calibration constraints. Document the spec.

3. Fit the model and audit the coefficients. Run the fit, inspect the channel coefficients, check the fit quality (R-squared, NRMSE, MAPE), and look for nonsensical results — negative coefficients on paid media, AI search coefficient at zero when AEO activity was clearly visible, seasonal effects swallowed by spend. The first fit always has issues. Iterate on the spec, not the data.

4. Validate with a geo experiment. Design a geo-experiment holdout, typically 10 to 20 percent of designated market areas held out from AI search optimization for eight to twelve weeks. Measure the observed lift in test vs control geos. Compare against the MMM's prediction for the same geos. If they match within the credible interval, the AI search coefficient is trustworthy. If they diverge, recalibrate.

5. Publish to stakeholders and lock the methodology. Produce a one-page summary of the model — channel contributions, credible intervals, validation results, methodology notes. Walk the marketing lead, finance partner, and CMO through the model. Lock the methodology for the next fit cycle. Schedule the re-fit cadence — monthly for Recast, quarterly for Robyn or LightweightMMM unless you have engineering capacity to schedule automated re-fits.

6. Build the feedback loop into budget allocation. The MMM is not a measurement dashboard. It is a decision tool. The next budget cycle should use the AI search coefficient and credible interval to set AEO investment. If the coefficient is high with tight credible intervals, lean in. If it is high but with wide intervals, run more experiments before committing budget. If it is low, dig into why before defunding the channel.
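The fit-quality checks in step 3 can be computed directly from actual and predicted revenue. A minimal sketch with invented numbers follows. NRMSE is normalized by the range of actuals here, which is one common convention (others divide by the mean), so check which one your tool reports:

```python
import math


def fit_metrics(actual, predicted):
    """Return R-squared, range-normalized RMSE, and MAPE for a fitted series."""
    n = len(actual)
    mean_actual = sum(actual) / n
    sse = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    sst = sum((a - mean_actual) ** 2 for a in actual)
    rmse = math.sqrt(sse / n)
    return {
        "r_squared": 1.0 - sse / sst,
        "nrmse": rmse / (max(actual) - min(actual)),
        # MAPE is undefined if any actual value is zero.
        "mape": sum(abs(a - p) / abs(a) for a, p in zip(actual, predicted)) / n,
    }


# Hypothetical weekly revenue (in thousands) and model predictions.
actual = [100.0, 120.0, 140.0, 160.0]
predicted = [110.0, 115.0, 145.0, 150.0]

metrics = fit_metrics(actual, predicted)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

Good summary metrics do not rule out nonsensical coefficients, so they are a complement to the coefficient audit in step 3 and not a substitute for it.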

Geo Experiments: The Validation Layer

A marketing mix model on its own is a correlational artifact. The coefficient on the AI search channel reflects how revenue co-varied with AI search exposure in the historical data, but co-variation has alternative explanations: seasonal trends, parallel campaign launches, broader category dynamics, even reverse causation if your brand spending drove the AI exposure rather than the other way around. The way to convert correlation into causation is a geo experiment.

The standard design is a difference-in-differences setup. Select a set of designated market areas — typically 10 to 20 percent of total markets, chosen to be representative on demographics, baseline revenue, and prior marketing exposure. Hold out AI search optimization in those markets for an 8 to 12 week test window. Continue normal optimization in the control markets. Measure the difference in revenue trajectory between test and control during the window, controlling for pre-period trends. The result is a clean causal estimate of AI search contribution.
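The core difference-in-differences calculation is simple enough to sketch by hand. The revenue figures below are invented, and the sketch assumes holdout and control markets are on a comparable scale with parallel pre-period trends. A production analysis would add significance testing and covariate adjustment:

```python
def mean(values):
    return sum(values) / len(values)


def diff_in_diff(test_pre, test_post, control_pre, control_post):
    """Change in the test group minus change in the control group."""
    return (mean(test_post) - mean(test_pre)) - (mean(control_post) - mean(control_pre))


# Hypothetical average weekly revenue per market (in thousands).
# "Test" markets are the holdout: AI search optimization is paused there.
holdout_pre = [100.0, 102.0, 98.0, 100.0]
holdout_post = [97.0, 99.0, 98.0, 98.0]
control_pre = [101.0, 99.0, 100.0, 100.0]
control_post = [105.0, 104.0, 106.0, 105.0]

did = diff_in_diff(holdout_pre, holdout_post, control_pre, control_post)

# Because the treatment is withholding optimization, a negative DiD is the
# revenue given up, so the estimated contribution is its negation.
ai_search_contribution = -did
print(f"DiD: {did:.1f}, estimated AI search contribution: {ai_search_contribution:.1f}")
```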

A widely used implementation is Google's CausalImpact R package, which goes a step beyond simple difference-in-differences. It fits a Bayesian structural time-series model that uses the control markets to project what the test markets' revenue would have been without the holdout, and compares actual to projected. The output is a posterior distribution of the causal effect with credible intervals.

The point estimate and credible interval from the geo experiment then become the calibration anchor for the MMM. If the geo experiment says the causal AI search contribution is $8 million annually with a 95 percent credible interval of $5M to $12M, and the MMM says the contribution is $14M, the MMM is overestimating and the priors or the channel construction need adjustment. If the MMM says $7M, the estimate falls inside the interval and the model is well-calibrated. The reconciliation discipline of making the MMM agree with the geo experiment within a stated tolerance is what turns the MMM from a directional report into a defensible board-level number.
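The reconciliation rule in that example reduces to an interval check. A minimal sketch, using the dollar figures from the paragraph above. The function name and the three status labels are hypothetical, and a stricter tolerance than the full credible interval could be substituted:

```python
def calibration_status(mmm_estimate, ci_low, ci_high):
    """Compare an MMM contribution estimate with a geo-experiment interval."""
    if mmm_estimate > ci_high:
        return "overestimating"
    if mmm_estimate < ci_low:
        return "underestimating"
    return "calibrated"


# Geo experiment: $8M point estimate, 95 percent credible interval $5M to $12M.
GEO_LOW, GEO_HIGH = 5_000_000, 12_000_000

status_14m = calibration_status(14_000_000, GEO_LOW, GEO_HIGH)
status_7m = calibration_status(7_000_000, GEO_LOW, GEO_HIGH)
print(status_14m, status_7m)
```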

Robyn ships geo-experiment calibration as a first-class workflow: you can pass the experiment result into the optimizer and constrain the channel coefficient to lie within the experiment's confidence interval. LightweightMMM and Meridian require more manual integration. Hosted platforms like Recast automate the calibration loop once the experiment data is uploaded. Mass Analytics designs the geo experiment as part of the consulting engagement.

Common Mistakes That Tank the AI Search Coefficient

Three patterns recur in AEO-inclusive MMMs that produce nonsense AI search coefficients. Each is fixable.

Treating AI search as a single channel rather than a multi-surface channel. ChatGPT and Perplexity have different user demographics, different ranking dynamics, different referral characteristics, and different lag structures. Lumping them into a single AI Search channel forces the model to average across heterogeneous behavior. The fix is to split the channel into at least two — typically ChatGPT and a combined Other AI category — and let each have its own coefficient.

Underweighting share-of-voice and overweighting referrals. Referral sessions are observable but undercount the channel by a large factor. If referrals are the primary input, the model will report a small coefficient that reflects observable referral volume rather than total exposure. The fix is to anchor the channel on share-of-voice, not referrals.

Skipping the geo experiment calibration step. The MMM run alone produces a coefficient. Without a geo experiment, you cannot tell whether the coefficient is causal or correlational. Teams that publish MMM AI search numbers without geo validation routinely have to walk them back six months later when the CFO commissions an external audit. Build the geo experiment into the workflow from the first fit, not as an afterthought.

The corollary mistake — running a geo experiment without ever fitting the MMM — produces a clean causal number for the test period but cannot generalize to forward-looking budget allocation. The MMM and the geo experiment work together. Neither replaces the other.

What the Vendor Landscape Will Look Like in 2027

The MMM tool landscape is consolidating. Forrester's 2025 update flagged five-to-seven vendors that will likely survive as standalone businesses through 2028, with the rest either acquired by adjacent measurement platforms or absorbed into broader marketing analytics suites. Recast has raised follow-on capital and is the most likely to remain independent. Mass Analytics' enterprise consulting moat is durable but the platform play will face pressure from open-source. Pecan AI's predictive analytics positioning may pull it out of the MMM lane entirely.

On the open-source side, Google's bet on Meridian and Meta's continued investment in Robyn keep the open-source tier viable. The discipline these projects compete on is not just methodology but tooling around methodology — calibration workflows, geo-experiment integration, documentation, prior specification, channel construction patterns. Both projects ship to enterprise users who maintain in-house MMM teams, and the user base will grow as the AEO measurement discipline pulls more mid-market operators into the category.

For operators making a 2026 tool choice, the right framing is: pick the tool that matches the analyst skill on the team, the cadence the business needs, and the audit requirements of the industry. Tool migration is expensive; pick once, well.

Takeaway: Marketing mix modeling is the measurement discipline that lets you defend AEO budget to a CFO who wants a number. The 2026 tooling — Robyn, LightweightMMM, Meridian, Recast, Mass Analytics, Pecan AI — has made the technique cheap and fast enough to run quarterly or monthly rather than annually. The discipline that separates operators who get the number right from those who get it wrong is treating AI search as a first-class channel input with share-of-voice as the primary signal, applying realistic adstock decay, and validating the model's coefficients against geo-experiment causal estimates. Skip the geo validation step and you publish a number that does not survive the first finance audit. Build it in from the first fit and the AI search coefficient becomes the budget defense that turns AEO from a discretionary line item into a strategic channel allocation.

Frequently Asked Questions

What is a marketing mix model and why does it matter for AEO attribution?

A marketing mix model, or MMM, is a top-down statistical regression that decomposes revenue into contributions from each marketing input — paid media, organic search, affiliate, email, brand TV, and now AI search — using aggregated weekly or daily time-series data rather than user-level tracking. It matters for AEO because answer engines like ChatGPT, Perplexity, and Claude do not pass clean referrer data, set cookies, or surface UTM parameters reliably, which means cookie-based multi-touch attribution undercounts AI search contribution by a factor that ranges from two to ten across the campaigns we have measured. MMM sidesteps that limitation by working with aggregated outputs against aggregated inputs. The 2026 generation of open-source tooling has made MMM cheap and fast enough that mid-market operators can run it in-house quarterly rather than paying agencies six figures annually.

Which MMM tool is best for measuring AI search contribution in 2026?

No single tool dominates. Meta's Robyn is the strongest open-source default for teams with an R-capable analyst — it offers adstock, saturation, and ridge-regression options with built-in hyperparameter tuning. Google's LightweightMMM, built on JAX and NumPyro, suits Python-native teams and integrates cleanly with notebook workflows but has been in slower-paced development since Google released Meridian as its strategic successor. Recast, from Aurelius Marketing Sciences, is the leading hosted Bayesian MMM platform and the strongest choice for teams that want continuous re-fitting rather than quarterly batch runs. Mass Analytics offers enterprise consulting plus the MassTer platform and tends to win regulated-industry RFPs. Pecan AI is a predictive-analytics-first vendor whose MMM module trades methodological depth for time-to-first-model. Match the tool to the analyst skill and the cadence required, not to the marketing on the homepage.

How do you add AI search as a channel input to a marketing mix model?

You add AI search as a channel input by constructing a daily or weekly time-series that captures aggregate AI search exposure, treating it the same way you treat impressions for paid media. The most common inputs are share-of-voice in answer engines from tools like Profound, Otterly, or Peec, weighted by query volume; referral sessions tagged as AI-source in server logs; and where measurable, a synthetic impression count derived from query-level citation tracking. Apply adstock — a decay function — to model the lagged effect of AI citations on conversions, since a user who first sees your brand in a ChatGPT answer often converts days later via direct, organic, or paid. Then let the model estimate the channel coefficient and validate the result with a geo-experiment holdout before publishing the number to the board.

Why do geo experiments matter for validating MMM coefficients?

Geo experiments matter because MMM is a correlational technique, not a causal one. The model estimates which inputs co-vary with revenue, but co-variance is not causation, and a coefficient that looks reasonable can still be wrong. A geo experiment, which holds out a designated market area or set of zip codes from AI search optimization while continuing it elsewhere, produces a clean causal estimate of incrementality. You then compare the observed lift in the test geos against the MMM's predicted lift for those geos. If they match within the model's credible interval, the MMM coefficient is trustworthy. If they diverge meaningfully, the model has correlation confounds and the coefficient needs adjustment. Meta's Robyn ships geo-experiment calibration as a first-class workflow, and Google's CausalImpact package provides a Bayesian structural time-series implementation of the test-versus-control comparison.

How often should a marketing mix model be re-run to track AEO contribution?

Traditional MMMs were re-run annually or quarterly because the consulting engagement cost made faster cadences impractical, but the answer engine channel changes faster than that. ChatGPT model releases, Google AI Overview expansions, and Perplexity ranking shifts can move citation rates by 30 to 50 percent in a single week. Operators measuring AEO contribution should re-fit at least monthly, and the hosted Bayesian tools — Recast in particular — are designed for continuous re-fitting as new data arrives. Open-source Robyn and LightweightMMM workflows can be scheduled as monthly Airflow or GitHub Actions jobs. The key discipline is to lock the model specification, the channel definitions, and the validation geos in advance, so a re-fit produces comparable coefficients rather than a freshly tuned model each cycle.
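One lightweight way to enforce that lock is to fingerprint the specification and have the scheduled job refuse to run if the fingerprint changes without sign-off. The sketch below is an assumption about how a team might do this, not a feature of any MMM tool. The spec fields and values are hypothetical:

```python
import hashlib
import json


def spec_fingerprint(spec):
    """Stable SHA-256 hash of a model spec, independent of key order."""
    canonical = json.dumps(spec, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


# Hypothetical locked specification for the monthly re-fit.
locked_spec = {
    "channels": ["paid_search", "paid_social", "ai_search_chatgpt", "ai_search_other"],
    "adstock_half_life_weeks": {"ai_search_chatgpt": 2, "ai_search_other": 2},
    "validation_geos": ["geo_a", "geo_b"],
}
LOCKED_FINGERPRINT = spec_fingerprint(locked_spec)

# The same spec with keys in a different order hashes identically.
reordered = {key: locked_spec[key] for key in reversed(list(locked_spec))}

# A quietly re-tuned adstock setting does not.
retuned = dict(locked_spec, adstock_half_life_weeks={"ai_search_chatgpt": 1,
                                                     "ai_search_other": 2})

print(spec_fingerprint(reordered) == LOCKED_FINGERPRINT)
print(spec_fingerprint(retuned) == LOCKED_FINGERPRINT)
```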