How the arqmetrica AI Maturity Index actually works.
Most published "AI maturity scores" fall apart under scrutiny. They are unfalsifiable surveys, with no scoring rubric and no benchmarking — and they are almost always written by the vendor that benefits from the score. The arqmetrica Index is built differently. This page documents how: what we measure, why those things, how scoring works, and how we benchmark you against your peers. Audit-friendly by design.
Why this Index exists
The five framework anchors
The six dimensions
| Dimension | Weight | Primary framework |
|---|---|---|
| Strategy & vision | 18% | ISO/IEC 42001:2023 §5 — Leadership |
| Data foundations | 17% | NIST AI RMF 1.0 — Map function (Data) |
| People & capability | 17% | OECD AI Principle 2.4 — Building human capacity |
| Governance & ethics | 17% | EU AI Act (Regulation (EU) 2024/1689) |
| Tooling & infrastructure | 14% | NIST AI RMF 1.0 — Map function (Infrastructure) |
| ROI & measurement | 17% | ISO/IEC 42001:2023 §9 — Performance evaluation |
How scoring works
Each question offers a fixed set of answer options, and each option maps to a numeric score defined in OPTION_SCORES in the code.
A dimension score is the unweighted arithmetic mean of the four question scores in that dimension. The four questions in each dimension are calibrated during pilot testing to carry approximately equal diagnostic weight; weighting them differently would introduce a layer of judgement we cannot defend.
The overall Index score is the weighted mean of the six dimension scores, using the weights in the table above. Both dimension and overall scores are integers in the range 0–100.
Rounding is performed once at the end of each step, to the nearest integer; we do not round inside the sums or carry decimals through the formula. This preserves arithmetic accuracy without inflating apparent precision. The scoring is implemented in src/index/scoring.ts and is pinned by twelve unit tests.
How a single response becomes a score.
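To make the arithmetic concrete, here is a minimal sketch of the two-step calculation in TypeScript. It is not the production code (the canonical implementation is src/index/scoring.ts), and the function names, the example answer values, and the input shapes are ours for illustration; only the weights and the round-once-per-step rule are taken from this page.

```ts
// Illustrative sketch only; the canonical arithmetic lives in src/index/scoring.ts.
// Weights restate the published dimension weights from the table above.
const DIMENSION_WEIGHTS: Record<string, number> = {
  strategy: 0.18,
  data: 0.17,
  people: 0.17,
  governance: 0.17,
  tooling: 0.14,
  roi: 0.17,
};

// Dimension score: unweighted mean of the four question scores (0–100 each),
// rounded once to the nearest integer.
function scoreDimension(questionScores: number[]): number {
  const sum = questionScores.reduce((acc, s) => acc + s, 0);
  return Math.round(sum / questionScores.length);
}

// Overall score: weighted mean of the six dimension scores, rounded once at the end.
function scoreOverall(dimensionScores: Record<string, number>): number {
  let weighted = 0;
  for (const [dimension, weight] of Object.entries(DIMENSION_WEIGHTS)) {
    weighted += weight * dimensionScores[dimension];
  }
  return Math.round(weighted);
}

// Hypothetical respondent: strong on strategy, weak on measurement.
const dimensionScores = {
  strategy: scoreDimension([75, 100, 50, 75]),  // 75
  data: scoreDimension([50, 50, 75, 50]),       // 56
  people: scoreDimension([25, 50, 50, 75]),     // 50
  governance: scoreDimension([50, 25, 50, 50]), // 44
  tooling: scoreDimension([75, 50, 50, 50]),    // 56
  roi: scoreDimension([25, 25, 50, 25]),        // 31
};

console.log(scoreOverall(dimensionScores)); // 52
```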
How each dimension's weight is justified.
Weights are not arbitrary; each is anchored to a published source plus a one-line rationale. The six weights sum to 100 by construction; a small check appears after the table.
| Dimension | Weight | Rationale | Source |
|---|---|---|---|
| Strategy & vision | 18% | Strongest single predictor of value capture in MIT Sloan/BCG longitudinal data. | MIT Sloan/BCG 2024 §4 |
| Data foundations | 17% | Foundational dependency: no AI value without data discipline. | NIST AI RMF GOVERN-1 + EU AI Act Art. 10 |
| People & capability | 17% | Strongest determinant of pilot-to-production rate. | MIT Sloan/BCG 2024 §6 |
| Governance & ethics | 17% | Direct EU AI Act enforcement weight (Articles 9, 10, 14). | EU Reg 2024/1689 |
| Tooling & infrastructure | 14% | Necessary but not sufficient — capped at 14 to prevent vendor-stack overweighting. | Stanford AI Index 2024 |
| ROI & measurement | 17% | Outcome dimension — closes the value loop. | ISO/IEC 42001:2023 §9 |
| Total | 100% | | |
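As a small illustration of what "by construction" can mean in practice, the check below restates the published percentages and asserts the invariant. The real weight definitions live in src/index/dimensions.ts; the array and the error message here are ours.

```ts
// Illustrative invariant check; the canonical weights live in src/index/dimensions.ts.
// Working in integer percentage points keeps the check free of floating-point drift.
const PUBLISHED_WEIGHTS_PCT: number[] = [18, 17, 17, 17, 14, 17]; // table order above

const total = PUBLISHED_WEIGHTS_PCT.reduce((acc, w) => acc + w, 0);
if (total !== 100) {
  throw new Error(`Dimension weights sum to ${total}, expected 100`);
}
```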
Each of the 24 questions traces to a published framework.
Every item in the assessment is mapped to a specific clause or chapter in one of the five anchor sources, so each respondent's score can be traced back to where the construct came from.
Question-to-source map (excerpt).
| Dimension | Item summary | Anchor source / clause |
|---|---|---|
| Strategy & vision | Has your board or executive team formally endorsed an AI strategy? | ISO/IEC 42001:2023 §5.1 (Leadership and commitment) |
| Strategy & vision | Is there a single accountable executive for AI outcomes across the organisation? | ISO/IEC 42001:2023 §5.3 (Roles and responsibilities) |
| Data foundations | How well-documented is the lineage of data feeding production AI systems? | EU AI Act Art. 10 (Data and data governance) |
| People & capability | How structured is your AI literacy and upskilling programme? | OECD AI Principle 2.4 (Building human capacity) |
| Governance & ethics | Have you classified your AI use cases against the EU AI Act risk tiers? | EU AI Act Art. 6 + Annex III (Risk classification) |
| Governance & ethics | Do you have a documented incident response process for AI failures? | NIST AI RMF MANAGE-4 (Incident response) |
| Tooling & infrastructure | How mature is your model deployment and monitoring stack? | ISO/IEC 42001:2023 §8 (Operation) |
| ROI & measurement | Do you track value attribution back to specific AI initiatives? | ISO/IEC 42001:2023 §9 (Performance evaluation) |
Excerpt of 8 representative items; the full 24-question set is published at /the-index/start.
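For readers who want the traceability claim in data form, here is a minimal sketch of how a single traced item could be represented. The type name and fields are ours for illustration, not the repository's actual shapes; the values are copied from the excerpt above.

```ts
// Illustrative shape only; field names are ours, not the public repository's.
interface IndexQuestion {
  dimension: string; // one of the six dimensions
  summary: string;   // the item as summarised for respondents
  anchor: string;    // the published framework clause the item traces to
}

const example: IndexQuestion = {
  dimension: "Governance & ethics",
  summary: "Have you classified your AI use cases against the EU AI Act risk tiers?",
  anchor: "EU AI Act Art. 6 + Annex III (Risk classification)",
};
```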
Who responded — and over what window.
The Q2 2026 published edition is built on responses collected from 1 January to 31 March 2026: 437 valid completions out of 612 starts (a 71.4% completion rate), with a median completion time of 11m 23s.
By industry
| Industry | Respondents | Share |
|---|---|---|
| Manufacturing | 89 | 20.4% |
| Financial services | 67 | 15.3% |
| Professional services | 58 | 13.3% |
| Tech & software | 53 | 12.1% |
| Retail & e-commerce | 47 | 10.8% |
| Logistics | 39 | 8.9% |
| Healthcare | 31 | 7.1% |
| Education | 22 | 5.0% |
| Energy & utilities | 18 | 4.1% |
| Public sector | 13 | 3.0% |
| Total | 437 | 100% |
By employee band
| Employee band | Respondents | Share |
|---|---|---|
| 50–99 | 142 | 32.5% |
| 100–249 | 184 | 42.1% |
| 250–499 | 111 | 25.4% |
| Total | 437 | 100% |
Off-band respondents (excluded from the published mid-market medians)
The Index form accepts companies of any size. The published cohort focuses on the 50–499 mid-market core (N=437 — the breakdowns above). Respondents from outside that range completed the assessment and received their personal report, but their scores are not aggregated into the published mid-market figures. We track them here for full transparency.
- 1–49 (small business): 24
- 500+ (large enterprise): 16
By country
| Country | Respondents | Share |
|---|---|---|
| Portugal | 156 | 35.7% |
| Spain | 98 | 22.4% |
| France | 64 | 14.6% |
| Germany | 49 | 11.2% |
| Italy | 28 | 6.4% |
| Netherlands | 18 | 4.1% |
| Belgium / Luxembourg | 11 | 2.5% |
| Ireland | 7 | 1.6% |
| Other EU | 6 | 1.4% |
| Total | 437 | 100% |
By respondent role
| Role | Respondents | Share |
|---|---|---|
| C-suite | 87 | 19.9% |
| VP / Director | 156 | 35.7% |
| Senior Manager | 142 | 32.5% |
| Other | 52 | 11.9% |
| Total | 437 | 100% |
What we measure and how confident we are.
Confidence intervals on each median are reported using the Bonett-Price distribution-free approximation. With the cohort interquartile range of 27 points (p25=33, p75=60), the 95% CI on the overall N=437 median is approximately ±2.0 points; sectoral medians with N>50 carry a CI of roughly ±4–6 points depending on N; sectoral medians with N<30 widen to ±8–10 points and should be treated as directional rather than precise.
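The exact Bonett-Price computation is not reproduced on this page. As a rough plausibility check only (our approximation, not the published method), a common IQR-based rule of thumb for a median's 95% interval lands on the same headline figure for the overall cohort:

$$
\mathrm{CI}_{95\%} \;\approx\; \pm\, 1.57 \cdot \frac{\mathrm{IQR}}{\sqrt{N}} \;=\; \pm\, 1.57 \cdot \frac{27}{\sqrt{437}} \;\approx\; \pm 2.0 \ \text{points.}
$$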
What this Index does not yet do.
- Self-report bias: respondents may overstate maturity on dimensions prone to social-desirability effects, particularly Governance and ROI.
- Selection bias: Index-takers self-select; the cohort is not a probability sample of the European mid-market universe.
- Small-N sectoral cuts: N<30 in Energy & Utilities and Public Sector, with correspondingly wide confidence intervals.
- No external validation cohort yet: planned for the Q3 2026 edition, paired with a structured peer-organisation sample.
What we publish next, when N supports it.
- Cronbach's α at N>200 per dimension (target Q3 2026).
- Test-retest correlation at 90 days, on a rolling cohort (target Q4 2026).
- Convergent validity against externally-published AI investment ratios (target Q4 2026).
- Inter-rater reliability for AI Act risk classification (planned Q1 2027).
How peer benchmarks work
Our transparency commitment
Deletion on request. Respondents can have their individual responses deleted via /api/data/delete; the full data-handling rules live on the Data Ethics page.
Aggregate-only public reporting. The quarterly State of European Mid-Market AI report is built solely from anonymised aggregate cohort statistics. No individual response, and no company-identifying field, ever appears in published outputs.
Open methodology. The dimension definitions, weights and scoring formulas live as ordinary TypeScript code in src/index/dimensions.ts and src/index/scoring.ts, in the public arqmetrica repository. Anyone — auditor, regulator, competitor, sceptical client — can read the exact arithmetic that produced any given score. There are no hidden adjustments and no proprietary multipliers.
And one line we will not cross: the Index scores companies, never individuals. We do not deploy AI to classify or rate the people who complete the assessment. This is not a policy we expect to revise.
Everything you need to verify the numbers.
The methodology, scoring formula, and weight derivation are open under CC BY 4.0. Cohort-level data is anonymised, but the underlying counts — the cohort tables on this page — are the verification artefact: anyone can reproduce the medians by re-running the formula against the same distribution. The 24 questions are listed in full at /the-index/start.