The pattern
The MIT Sloan / BCG 2024 longitudinal study reports that between 67% and 78% of mid-market manufacturers in Europe have at least one AI pilot in flight that has not, and likely will not, ship to production. The same pattern shows up in the inaugural Arqmetrica AI Maturity Index Q2 2026 cohort: across the manufacturing segment, the median dimension score on ROI & measurement is 41 out of 100 — eleven points below the threshold our methodology associates with the "shipped at least one production AI use case" cohort. The Capgemini Research Institute Q4 2024 survey on EU industrial AI deployment reaches the same conclusion using a different definition. The picture is consistent enough that we treat it as a structural feature of the segment, not a measurement artefact.
This matters because the cost is rarely the visible budget line. Cumulative spend on stalled pilots is typically the smallest component. The larger costs are the deterioration of board confidence over two or three review cycles, the attrition of the senior operational talent who championed the pilot in the first place, and — the most consequential one — the slow disinvestment from AI as a strategic category, often disguised as "we'll revisit when the technology matures". Once an organisation has been through that cycle twice, the third attempt almost always requires an external intervention.
Why it happens
Two structural causes recur in almost every stalled mid-market manufacturing pilot we have audited. They are not technical.
Cause 1 — Fragmented data ownership
Manufacturing IT estates accumulate over decades. The data needed for almost any meaningful AI use case crosses MES, ERP, quality systems and PLC historians, often with multiple incompatible time-series schemas inside a single plant. Without a clear data product owner — a named person accountable for the cleanliness, schema and access path of a defined data domain — every pilot restarts the same data acquisition project from scratch. Six to nine months disappear before model work even begins. By the time the model is trained, the original sponsor has moved on or the budget cycle has closed.
The EU AI Act Article 10 makes this harder, not easier. From August 2026 it requires demonstrable data governance for any high-risk AI system, including documented provenance and quality criteria. Pilots that cannot evidence that governance upstream will be unable to progress to production, whatever their technical merit.
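The ownership register described above need not be sophisticated. A minimal sketch of what "one page the board can read in three minutes" might look like as a typed list — every domain name, owner and access path below is invented for illustration, not drawn from any real estate:

```python
from dataclasses import dataclass

@dataclass
class DataDomain:
    """One row of a data product ownership register (illustrative fields)."""
    domain: str       # e.g. "MES", "quality", "energy"
    owner: str        # a named accountable person, never a committee
    access_path: str  # how a pilot team actually reaches the data
    schema_doc: str   # where the schema and quality criteria are documented

# Hypothetical register entries; names and paths are invented.
register = [
    DataDomain("MES", "A. Weber", "hist-db.plant4/mes (read replica)", "wiki/mes-schema"),
    DataDomain("quality", "M. Rossi", "qms-export (nightly CSV)", "wiki/qms-schema"),
    DataDomain("energy", "J. Kovacs", "energy-historian API", "wiki/energy-schema"),
]

# The register stays readable in minutes: one line per domain.
for d in register:
    print(f"{d.domain:8} owner={d.owner:10} access={d.access_path}")
```

The point of the structure is the `owner` field: a single name per domain, which is exactly what the committee-ownership pattern in stalled pilots lacks.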
Cause 2 — No pilot kill criteria
The second cause is methodological. Most pilots launch with a vague hypothesis ("we want to try predictive maintenance on line 4") rather than a falsifiable one with explicit kill criteria — a defined metric, a defined threshold, and a defined date by which the metric must clear the threshold. ISO/IEC 42001:2023 Clause 9 (Performance evaluation) expects exactly this discipline as part of an AI management system. In practice it is almost never in place at pilot kick-off in the mid-market.
Without kill criteria, three things happen. First, pilots drift past their natural decision point because no one is contractually obliged to terminate them. Second, the team conflates "interesting model output" with "operational value", though the two are not the same thing. Third, the organisation never builds the muscle of killing pilots cleanly — and so the next pilot is launched with the same defect.
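The discipline is concrete enough to write down. A minimal sketch of a pre-committed kill criterion — metric, threshold, date, and the decision rule at review time. The metric name, threshold and dates are hypothetical, and nothing here is part of any Arqmetrica tooling:

```python
from dataclasses import dataclass
from datetime import date

@dataclass(frozen=True)
class KillCriteria:
    """A falsifiable pilot hypothesis: metric, threshold, deadline."""
    metric: str      # operational metric, e.g. unplanned downtime hours/month
    threshold: float # value the metric must reach (here: at most)
    deadline: date   # date by which the threshold must be cleared

    def decide(self, observed: float, today: date) -> str:
        """Return the contractual decision at review time."""
        if observed <= self.threshold:
            return "progress"   # hypothesis confirmed, pilot may ship
        if today >= self.deadline:
            return "kill"       # deadline passed, threshold missed
        return "continue"       # still inside the agreed pilot window

# Hypothetical example: predictive maintenance pilot on line 4.
criteria = KillCriteria(
    metric="unplanned downtime (hours/month, line 4)",
    threshold=12.0,
    deadline=date(2026, 9, 30),
)
print(criteria.decide(observed=15.5, today=date(2026, 10, 1)))  # -> kill
```

The value is not the code; it is that `decide` has no branch for "let it drift". Signing off the three fields in writing before kick-off is what removes the drift option.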
The Arqmetrica moves catalogue weights "Pre-commit kill criteria before every AI pilot starts" at 5 out of 5 on evidence — the highest tier. That is not because the move is novel; it is because the published evidence base supporting it is unusually clear.
What the Index tells us
The Q2 2026 Index cohort lets us go beyond the published literature and read the patterns directly.
For European mid-market manufacturers in the top quartile (overall scores ≥ 60), three patterns recur with high frequency:
- A named data product owner per major data domain — typically MES, quality, energy, supply chain. Identifiable in their answer to STR-04 ("How is AI work organised in your company?") and corroborated by their DAT-02 score.
- A pilot kill rate of 30% or higher — counter-intuitively a marker of health, not failure. It says the organisation has the discipline to terminate experiments cleanly. Identified through ROI-02.
- An AI scorecard reviewed at quarterly board level. 5–7 metrics, monthly cadence, presented unedited. ROI-01 captures this directly.
For manufacturers in the bottom quartile, the inverse is true: data ownership is shared by committee, the kill rate is below 10%, and AI is reported on by exception only when something goes wrong. They also typically score below 40 on People & capability, meaning the operations team can plausibly identify use cases but lacks the senior delivery muscle to ship them.
Three interventions that work
The Arqmetrica moves catalogue contains 24 interventions, each weighted on evidence strength. Three of them recur as the highest-leverage moves for mid-market manufacturers stuck in pilot purgatory. Ordered by evidence weight:
- Pre-commit kill criteria before every pilot starts. Define metric, threshold, date. Sign off in writing by the executive sponsor and the technical lead before any code is written. (ISO/IEC 42001 Clause 9; Arqmetrica catalogue evidence weight 5/5.)
- Build a one-page AI ROI scorecard for the executive team. Five to seven metrics, tracked monthly, presented unedited at the next board meeting. Pick metrics tied to operational outcomes (units, hours, defects, energy) rather than to model performance (accuracy, F1). (MIT Sloan / BCG 2024 — Value capture cluster; evidence weight 5/5.)
- Run a 60-day data product ownership pass. Inventory the major data domains feeding AI use cases. Name an accountable owner per domain. Document the access path. The output is not a slide deck; it is a one-page register that the board can read in three minutes. (NIST AI RMF 1.0 — Map function; evidence weight 4/5.)
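For the second move, the one-page scorecard is small enough to sketch in full. Every metric name and figure below is invented for illustration; the only constraint carried over from the move itself is that the metrics are operational (units, hours, defects, energy), not model-performance metrics:

```python
# Illustrative one-page AI ROI scorecard: 5-7 operational metrics,
# monthly values, no model-performance metrics (accuracy, F1) allowed.
# All metric names and figures are invented for illustration.
scorecard = {
    "scrap rate, line 4 (%)":             {"baseline": 3.1, "this_month": 2.6},
    "unplanned downtime (hours/month)":   {"baseline": 18,  "this_month": 14},
    "energy per unit (kWh)":              {"baseline": 4.2, "this_month": 4.0},
    "manual inspection hours/week":       {"baseline": 120, "this_month": 95},
    "pilots killed this quarter (count)": {"baseline": 0,   "this_month": 1},
}

def render(scorecard: dict) -> str:
    """Render the scorecard as the one page the board actually reads."""
    lines = [f"{'metric':38} {'baseline':>9} {'now':>7}  delta"]
    for name, v in scorecard.items():
        delta = round(v["this_month"] - v["baseline"], 2)
        lines.append(f"{name:38} {v['baseline']:>9} {v['this_month']:>7}  {delta:+}")
    return "\n".join(lines)

print(render(scorecard))
```

Note that "pilots killed this quarter" appears as a positive metric, consistent with the kill-rate finding from the Index cohort above.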
These three moves do not require new tooling, new vendors, or external hiring. They require the leadership decision to install them and the discipline to defend them through the first review cycle. That decision is the actual bottleneck.
What to do today
If your organisation is European mid-market manufacturing and the pattern in this article applies, the next step is to find out where your company sits on that spread. The Arqmetrica AI Maturity Index returns your overall score, your dimension breakdown, and your three highest-leverage moves drawn from the catalogue described above. It takes about ten minutes, the result is private, and the methodology is fully published.
The pattern in this article is structural, not personal. Companies that act on it move out of pilot purgatory in one to two budget cycles. Companies that do not act on it tend to repeat the cycle.