The Meter

In February, Tomasz Tunguz of Theory Ventures published a quiet observation: his personal inference costs had gone from $7,200 to $43,000 to over $100,000 annualized in two quarters. He framed tokens as the "fourth component" of engineering compensation — salary, bonus, equity, inference. A 75th-percentile engineer earning $375K now costs $475K fully loaded once you add the token budget. Twenty-one percent in tokens.

A month later, Jensen Huang made it doctrine. At GTC 2026: "If that $500,000 engineer did not consume at least $250,000 worth of tokens, I am going to be deeply alarmed."

The 50% Rule. Half your salary's value back in consumed tokens, or something is wrong with you.
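The two figures are quoted on different bases, so it helps to put them side by side. The sketch below is only arithmetic on the numbers above; the helper name `token_share` is hypothetical, and treating salary plus tokens as the whole "loaded cost" (ignoring bonus and equity) is a simplifying assumption.

```python
def token_share(salary: float, tokens: float) -> tuple[float, float]:
    """Return (tokens / salary, tokens / (salary + tokens))."""
    return tokens / salary, tokens / (salary + tokens)

# Figures quoted above. Bonus and equity are left out (simplifying assumption).
tunguz = token_share(salary=375_000, tokens=100_000)
huang = token_share(salary=500_000, tokens=250_000)

print(f"Tunguz: {tunguz[0]:.0%} of salary, {tunguz[1]:.0%} of loaded cost")
print(f"Huang:  {huang[0]:.0%} of salary, {huang[1]:.0%} of loaded cost")
# Tunguz: 27% of salary, 21% of loaded cost
# Huang:  50% of salary, 33% of loaded cost
```

On a like-for-like basis, Huang's floor is roughly double Tunguz's observed share.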

This is not a productivity claim. It's a measurement claim. The CEO of the company that sells the compute is telling you what productive looks like — and productive looks like buying more compute.

The Regression

Software engineering has been here before. Not once. Four times, and tokens make it five.

| Era | Proxy Metric | Who Defined It | How It Was Gamed |
| --- | --- | --- | --- |
| 1970s–1990s | Lines of Code | Developers (neutral) | Verbose implementations, copy-paste |
| 1990s–2000s | Story Points | Teams (neutral) | Point inflation, sandbagging |
| 2000s–2010s | Commit Count | VCS tools (neutral) | Micro-commits, split trivial changes |
| 2010s–2020s | Velocity / DORA | Platform tools (neutral) | Auto-merge, skip reviews, speed over quality |
| 2025– | Token Consumption | Vendor (conflict) | Verbose prompts, parallel agents, bots looping unattended |

Every earlier proxy in that table was gamed, abandoned, and replaced. Every one followed Goodhart's Law: when a measure becomes a target, it ceases to be a good measure. But look at the "Who Defined It" column. LOC was developer-generated. Story points were team-generated. Velocity was tool-measured. In each of those cases, the entity defining the metric had no financial stake in the metric going up.

Tokens break that pattern. The entity selling the compute is also the entity defining productivity as consumption of compute. Jensen Huang sells GPUs. Jensen Huang says you should be "deeply alarmed" if engineers aren't consuming $250K in tokens. The meter is made by the company that sells the electricity.

Why the Fastest Metric Always Wins

The temporal hierarchy explains the mechanism. Each measurement layer arrives before the one that matters more:

Tokens (instant) → Velocity (daily) → Quality (weeks) → Outcomes (quarters)

The one that arrives first wins the decision it informs.

This isn't a failure mode that better proxies can fix. It's a structural property of measurement under time pressure. Any system operating under decision deadlines will preferentially select faster-arriving, lower-fidelity signals. Token consumption arrives instantly. Quality requires human judgment over weeks. In a quarterly reporting cycle, the instant signal wins every access decision before the meaningful one even shows up.
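The selection effect can be sketched as a toy model. Everything here is an illustrative assumption: the `Signal` type, the latencies chosen to stand in for "instant", "daily", "weeks", and "quarters", and the fidelity ranks. The decision-maker in this sketch is generous to the better signals: it always picks the highest-fidelity one that has arrived.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Signal:
    name: str
    latency_days: float  # assumed time until the signal first exists
    fidelity: int        # assumed rank: higher is closer to what matters

# Illustrative latencies only; none of these numbers come from the sources above.
SIGNALS = [
    Signal("tokens", 0, 1),
    Signal("velocity", 1, 2),
    Signal("quality", 21, 3),
    Signal("outcomes", 90, 4),
]

def best_available(decision_day: float) -> Signal:
    """Pick the highest-fidelity signal that has arrived by decision_day (>= 0)."""
    arrived = [s for s in SIGNALS if s.latency_days <= decision_day]
    return max(arrived, key=lambda s: s.fidelity)

for day in (0, 0.5, 7, 30, 90):
    print(day, best_available(day).name)
# 0 tokens
# 0.5 tokens
# 7 velocity
# 30 quality
# 90 outcomes
```

Even with a decision-maker who prefers fidelity, every decision taken in the first three weeks runs on tokens or velocity, because nothing better exists yet.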

This is why nobody has solved the outcome-measurement problem. Salesforce's Agentic Work Units measure execution, not accuracy — a triggered workflow counts regardless of whether the agent resolved correctly. Microsoft's EngThrive framework claims "gaming alignment" where cheating produces beneficial outcomes, but researchers aren't sure it works. Shopify's VP of Engineering admits the best measurement is "still very human." The counter-examples confirm the thesis: outcome measurement doesn't scale because outcomes take too long to arrive.

The Organizational Response

When the metric is consumption, the organizational response is to consume.

Amazon deployed MeshClaw with an 80%+ weekly usage target. The Financial Times reported employees running trivial tasks to inflate token counts: "There is just so much pressure to use these tools." Amazon says metrics won't factor into performance reviews. Workers report feeling anxious about low rankings regardless. "Managers are looking at it."

Meta built a leaderboard ranking 85,000 employees by token consumption. A "Token Legend" burned 281 billion tokens in 30 days. The company consumed 60 trillion tokens in one month — an estimated $9 billion. The leaderboard was shut down two days after it leaked to the press, but the practice continues behind closed doors.

Microsoft's Julia Liuson, President of Developer Division and GitHub, issued an internal memo: "AI is no longer optional — it's core to every role and every level." AI usage to factor into performance reviews. The irony: the vendor measuring AI adoption can't get its own employees to adopt. They prefer Cursor over Copilot.

The gaming methods are what you'd expect. Verbose prompts. Parallel agents running unattended. Long context windows stuffed with unnecessary material. Bots looping token-burning requests overnight. The same Goodhart dynamics that killed every previous metric, but at enterprise scale with billion-dollar budgets.

The Security Cost

Here's what nobody is talking about: tokenmaxxing requires granting extensive permissions.

To maximize token consumption, you need to let AI agents access more systems, more files, more APIs. An Amazon developer told reporters they would "absolutely never" let MeshClaw run freely with its default security settings. But the metric says consume. So the permissions expand.

The data is already visible. GitGuardian's 2026 report: 28.65 million hardcoded secrets in public GitHub commits last year, up 34% year-over-year, the largest single-year jump ever recorded. AI-assisted commits leak secrets at 3.2% versus a 1.5% baseline. That's more than double. AI service credentials specifically surged 81%. And 24,008 unique secrets were found exposed in MCP configuration files, the config that lets AI agents connect to your systems.

Docker's blog documents at least ten incidents across six major AI coding tools in sixteen months. Claude Code executing rm -rf from root. An agent deleting 15 years of family photos after being granted permissions for "temporary Office files." The pattern: agents given broad access to serve the consumption metric, acting with insufficient boundaries.

The consumption metric doesn't just displace quality measurement. It actively degrades the security boundary. Permission expansion is the cost of consumption as a KPI.

The Capex Question

Follow the money one level higher. Amazon, Microsoft, Alphabet, and Meta are spending a combined $650–700 billion in 2026 capex, with projections exceeding $1 trillion for 2027. This spending is justified by demand signals. The demand signal is token consumption. Token consumption is inflated by gaming.

How much of the signal is real?

Jellyfish's data across 7,548 engineers: 2x throughput at 10x cost. Top-decile engineers consume 69 million tokens per PR versus a median of 7 million. The correlation between consumption and output exists — but it's not proportional. More tokens produce more output at dramatically diminishing returns. The companies spending $650 billion assume proportional demand growth. The data shows logarithmic returns at best.
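To put a number on "not proportional", one rough option is to fit a power law through the two quoted multiples. The power-law form is an assumption made here for illustration, not something the Jellyfish data is said to establish; two data points cannot distinguish it from a logarithmic curve.

```python
import math

cost_multiple = 10.0    # "10x cost", as quoted above
output_multiple = 2.0   # "2x throughput", as quoted above

# Assumed model: output is proportional to cost ** elasticity.
# An elasticity of 1.0 would mean proportional returns.
elasticity = math.log(output_multiple) / math.log(cost_multiple)
cost_per_unit_output = cost_multiple / output_multiple

print(f"elasticity: {elasticity:.2f}")
print(f"cost per unit of output: {cost_per_unit_output:.0f}x")
print(f"top-decile vs median tokens per PR: {69 / 7:.1f}x")
# elasticity: 0.30
# cost per unit of output: 5x
# top-decile vs median tokens per PR: 9.9x
```

Under that assumption, each 1% increase in token spend buys about 0.3% more output, and the marginal unit of throughput costs five times what the baseline unit did.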

If even 30% of enterprise token consumption is gaming — and Amazon's 80% usage target with employees running trivial tasks suggests this is conservative — then a meaningful fraction of the capex thesis is built on an artificial demand signal. The meter is made by the company that sells the electricity. The meter says demand is growing. The company builds more power plants.

The Name for This

This isn't just Goodhart's Law. Goodhart's Law is neutral: it describes a dynamic where any target metric gets gamed by the people measured against it. It says nothing about who chose the metric or why.

This is Goodhart's Law with a revenue model. The entity defining the productivity metric is the entity that profits from maximizing it. The entity building the leaderboard is the entity selling the tokens that appear on it. The measurement system cannot be neutral because neutrality would reduce the number the vendor needs to go up.

Researchers call the adjacent phenomenon surrogation — when a measure of a construct replaces the construct itself in people's minds. Wells Fargo's "cross-selling" metric replaced the "build relationships" strategy it was supposed to track. Token consumption is replacing "engineering productivity" in the same way. But surrogation assumes the metric was chosen innocently. This wasn't. The meter was designed by the company that sells the electricity.