The Conditions

121,000 developers across 450 companies. 92.6% use an AI coding assistant at least monthly. When AI tools took off in early 2025, developer productivity jumped approximately 10%. Then it stopped climbing. As of Q1 2026, it has been flat for a year.

That's from DX's developer productivity study, the largest published survey on the subject. Laura Tacho, DX's CTO: “In well-structured organizations, AI acts as a force multiplier. In struggling organizations, AI tends to highlight existing flaws rather than fix them.”

She's describing conditions. This piece is about what those conditions are, who has met them, and why almost no one has.

Why It Stops

The standard explanation is that the tools haven't gotten good enough yet. The data says something different.

Anthropic's 2026 Agentic Coding Trends Report contains a figure that reframes the entire debate: 27% of AI-assisted work consists of tasks that would not have been attempted without AI — scaling projects, nice-to-have tooling, exploratory work. Engineers aren't doing the same work faster. They're doing more work.

Five peer-reviewed studies from MSR 2026 confirm the mechanism. AI coding agents increase velocity: commits are up 36%, and lines added are up 77–281% depending on tool and timeframe. But quality per line of code stays roughly constant, so more code means more issues in absolute terms. Static analysis warnings are up 18–30%. Complexity is up 39–42%. The issue isn't that AI writes bad code. It's that AI writes more code, and verification doesn't scale with volume.

27% of AI-assisted work is net-new tasks (Anthropic 2026) → 81% of devs spend more time in code review (Harness 2026) → ~10% productivity gain, then flat for a year (DX, 121K developers)

Harness reports 81% of developers spend more time in code review after AI adoption. Faros AI's telemetry across 22,000 developers shows PR review time up 441%. The extra volume created by AI generates a verification burden that absorbs the speed gains. The plateau is the predictable result.
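A back-of-envelope model shows why a plateau like this is unsurprising. The sketch below is illustrative only: it assumes delivery is two sequential stages (authoring, then review), and the function name `net_speedup` and every input value are hypothetical, not figures from DX, Harness, or Faros AI.

```python
def net_speedup(author_share: float, author_speedup: float, review_growth: float) -> float:
    """Overall speedup of a two-stage pipeline: authoring, then review.

    author_share   -- fraction of baseline cycle time spent authoring (0..1)
    author_speedup -- factor by which authoring gets faster (2.0 = twice as fast)
    review_growth  -- factor by which review time grows (1.5 = 50% more review)

    All three inputs are hypothetical knobs for illustration.
    """
    if not 0.0 <= author_share <= 1.0:
        raise ValueError("author_share must be between 0 and 1")
    if author_speedup <= 0 or review_growth <= 0:
        raise ValueError("author_speedup and review_growth must be positive")

    review_share = 1.0 - author_share
    # New cycle time relative to a baseline of 1.0.
    new_time = author_share / author_speedup + review_share * review_growth
    return 1.0 / new_time


if __name__ == "__main__":
    # Hypothetical team: 60% of cycle time is authoring, and AI doubles authoring speed.
    print(f"review unchanged:  {net_speedup(0.6, 2.0, 1.0):.2f}x")
    print(f"review grows 50%:  {net_speedup(0.6, 2.0, 1.5):.2f}x")
```

With these invented inputs, doubling authoring speed yields roughly a 43% gain if review time stays flat, but only about 11% if review time grows by half. The numbers are made up; the shape is the point. Gains in one stage are capped by whatever happens in the stage that didn't speed up.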

What the Conditions Are

The gains are real at the task level. A 2026 peer-reviewed study in Management Science — field experiments at Microsoft, Accenture, and a Fortune 100 company with 4,867 developers — found a 26% increase in completed tasks. Less experienced developers gained more. This is gold-standard evidence: randomized controlled trials, peer-reviewed, replicated across three firms.

But task-level gains don't aggregate to organizational gains by default. They aggregate under specific conditions.

DORA published an AI capabilities model in April 2026 identifying seven capabilities that determine whether AI tools amplify or destabilize an engineering organization: clear AI policy, healthy data ecosystems, AI-accessible internal data, strong version control, small batches, user-centric focus, and a quality internal platform. Organizations with these capabilities see AI as an accelerant. Organizations without them see instability and compounding tech debt.

McKinsey arrived at the same finding from a different direction: companies in the top quintile for AI productivity were 2.8 times more likely to have redesigned workflows before deploying AI tools. Not after. The redesign is the precondition, not the follow-up.

The pattern holds in the specific cases. Stripe runs 1,300 AI-assisted PRs per week on CI infrastructure and task decomposition pipelines that were built for humans. Intercom's Apex agent delivers because they invested in domain-specific post-training. The tools amplify what was already there.

The Honest Numbers

How many organizations have actually met the conditions?

5.5% of companies achieve measurable ROI (McKinsey 2026). A nearly identical share of leaders trust their measurement frameworks (Harness 2026).

Two independent surveys, different constructs, nearly identical answers. One measures outcomes. The other measures the ability to measure outcomes. The convergence is the point: almost nobody is achieving real returns, and almost nobody can tell whether they are.

Meanwhile, 89% of engineering leaders report improved productivity (Harness). 92.6% of developers use an AI coding assistant at least monthly (DX). 95% of Pragmatic Engineer respondents use AI weekly. Adoption is universal. Measurement is nearly nonexistent. The gap between those two facts is where the plateau lives.

METR's survey adds the individual dimension: developers report feeling 3x faster but deliver only 1.4–2x more value. Their earlier experimental work found a 40-percentage-point gap between perceived and actual productivity impact. The perception gap isn't just organizational — it's cognitive. People genuinely can't tell whether the tools help as much as they feel like they do.

The Macro Question

Erik Brynjolfsson at the Stanford Digital Economy Lab says AI is entering a “harvest phase.” U.S. productivity jumped 2.7% in 2025, nearly double the 1.4% decade average. GDP grew 3.7% in Q4 with only 181,000 jobs added. He sees the J-curve inflecting.

Torsten Slok, Apollo's Chief Economist: “AI is everywhere except in the incoming macroeconomic data.”

They're both working from the same numbers. The disagreement is about lags: whether 2025 was the start of structural uplift or a statistical blip.

DORA's ROI model projects 39% first-year returns for a 500-person organization that meets all seven conditions. The critique of that model is sharp: self-reported data, attribution confusion between AI and the organizational capabilities that make AI work, no net present value calculations. The model tells you what's possible under conditions almost nobody has created.
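To make the net-present-value critique concrete, here is a minimal sketch of how a headline first-year return differs from a discounted one. It assumes "39% first-year returns" means a simple (benefit minus cost) over cost ratio; the cost figure, the 10% discount rate, and the helper names `simple_roi` and `npv` are all hypothetical and do not come from DORA's model.

```python
def simple_roi(cost: float, benefit: float) -> float:
    """Undiscounted return: (benefit - cost) / cost."""
    return (benefit - cost) / cost


def npv(rate: float, cash_flows: list[float]) -> float:
    """Net present value; cash_flows[t] arrives t years from now (t = 0 is today)."""
    return sum(cf / (1.0 + rate) ** t for t, cf in enumerate(cash_flows))


if __name__ == "__main__":
    # Hypothetical: spend 1,000,000 today, receive 1,390,000 in benefits a year later.
    cost, benefit = 1_000_000.0, 1_390_000.0
    rate = 0.10  # assumed discount rate, for illustration only

    print(f"simple ROI: {simple_roi(cost, benefit):.1%}")
    value = npv(rate, [-cost, benefit])
    print(f"NPV at {rate:.0%}: {value:,.0f} ({value / cost:.1%} of cost)")
```

Under these assumptions the same cash flows read as a 39% return undiscounted and about 26% of cost once discounted. The gap widens if benefits arrive later or the discount rate is higher, which is why a model without this step is hard to compare against other investments.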

The positive case for AI-assisted development is real, and it is conditional. The conditions are specific: verification infrastructure that scales with generation volume. Measurement frameworks that capture outcomes, not adoption metrics. Workflow redesign that precedes tool deployment. Organizations that meet them — McKinsey's top quintile, Stripe, a handful of others — see genuine, measurable gains.

5.5% is not a temporary number waiting for adoption to catch up. It reflects the structural difficulty of meeting the conditions themselves. AI coding tools work. That was never the question. The question is whether organizations can rebuild around verification, measurement, and workflow design at the pace the tools demand. A year into universal adoption, the plateau holds.