AI doesn't have a Moore's Law
April 2026 • 1638 words • 10 min read
AI progress isn't one curve doubling on a clock. It's a bundle of partially correlated curves, each running on its own logic, each hitting its own wall. Four curves drive the compounding. Five walls shape where it runs.
It has something harder to name, and harder to ignore.
In January 2023, two months after ChatGPT launched and right after OpenAI made the GPT APIs publicly available, I started building annexr (site is still up, but product is not). I thought I was late to the party. Three years later, I realize I was early.
That mismatch, feeling late while being early, is the signature experience of building in AI right now. It has nothing to do with individual foresight. It has everything to do with how progress in this field actually compounds.
Three long lines that finally got noticed
What we call “the AI moment” isn’t one thing. It’s the convergence of three independent lines of work, each of which took decades to mature.
1. Information theory. In 1948, Claude Shannon (the original Claude) published A Mathematical Theory of Communication, giving information itself a mathematical language: entropy, channels, compression. It laid the foundation that machine learning, deep learning, and modern AI are built on. Every token prediction in a modern LLM is a descendant of that paper.
2. Hardware. The transistor was invented at Bell Labs in 1947. Seven decades of shrinking, specializing, and parallelizing led to GPUs breaking into deep learning with AlexNet in 2012, and to Google’s TPUs in 2016. The physical substrate that made scale possible.
3. Architecture. In June 2017, a team at Google published Attention Is All You Need, proposing the Transformer. Google understood it well enough to ship BERT in 2018 and T5 in 2019. What it didn’t do was productize at the pace the architecture deserved, likely constrained by search revenue risk.
By 2019 and 2020, with GPT-2 and GPT-3, the three lines had already intersected. The technical trajectory was visible to anyone looking closely. What changed in late 2022 was not the technology but the interface. ChatGPT put the intersection in front of everyone on the internet. The rupture was a moment of attention, not a moment of invention.
One curve vs many
Moore’s Law was one curve: transistor density doubling on a predictable clock. Clean, measurable, predictive.
It’s worth noting that Moore’s Law was never really a law. Externally, it looked like a natural regularity. Internally at Intel, it was a goal, a commitment the company worked hard to keep. When physics pushed back, they adapted: more cores, new process nodes, new architectures. The curve held because a company decided it had to.
AI progress isn’t like that. There’s no single Intel shepherding a single metric. And there’s no single metric.
It’s a bundle of partially correlated curves: compute, algorithms, inference cost, hardware efficiency, and how long AI systems can work without human help. Each moves fast, each on its own logic, none in perfect lockstep.
Moore’s Law and Wright’s Law, running at once
Moore’s Law runs on the calendar. Wright’s Law runs on cumulative work done.
In 1936, Theodore Wright observed that every time production doubles, costs fall by a predictable percentage. The pattern has held across solar panels, batteries, genome sequencing, semiconductors: the more you make something, the cheaper it gets per unit of output.
What’s unusual about AI is that both effects are running simultaneously. Compute-per-dollar improves on a calendar clock, from chip advances and algorithmic efficiency. Cost-per-capability falls with cumulative deployment, as models get compressed and re-served more efficiently at scale. The two reinforce each other. That coupling is what drives rapid progress. The decoupling, as we’ll see, is what produces the walls.
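The distinction is easier to see in code. A minimal sketch, with made-up rates purely for illustration (the constants and function names here are mine, not taken from the sources below): Moore-style improvement compounds with elapsed time, Wright-style improvement compounds with cumulative units produced.

```python
import math

# Illustrative placeholder rates, not measured values.
MOORE_ANNUAL_IMPROVEMENT = 0.35   # cost falls 35% per calendar year
WRIGHT_LEARNING_RATE = 0.20       # cost falls 20% per doubling of cumulative output

def moore_cost(initial_cost, years):
    """Moore-style: improvement compounds with elapsed time."""
    return initial_cost * (1 - MOORE_ANNUAL_IMPROVEMENT) ** years

def wright_cost(initial_cost, cumulative_units):
    """Wright-style: improvement compounds with cumulative production."""
    doublings = math.log2(cumulative_units)  # doublings since the first unit
    return initial_cost * (1 - WRIGHT_LEARNING_RATE) ** doublings

print(round(moore_cost(100.0, years=5), 1))               # time passes -> cheaper
print(round(wright_cost(100.0, cumulative_units=32), 1))  # volume accumulates -> cheaper
```

The difference matters for forecasting: a Wright-style curve only keeps falling if deployment keeps growing, while a Moore-style curve only keeps falling if R&D investment keeps flowing.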
Four curves worth tracking
Training efficiency and inference efficiency are different problems. The first is about how much you extract from a GPU during a training run. The second is about how cheaply you serve the resulting model. They improve through different mechanisms, and both are improving at once. That’s why the gap between a research result and a commodity API keeps compressing faster than expected.
One useful lens before the numbers: each of these curves is driven mostly by one of the two forces, Moore or Wright, sometimes both. Training compute, hardware efficiency, and algorithmic efficiency are mostly Moore-like, pushed by R&D and capital on a calendar clock. Inference cost is mostly Wright-like, pushed by cumulative usage: the more a model is served, the cheaper it gets to serve. Task horizon is the one that benefits from both sides, which is part of why it is moving as fast as it is.
Training compute: ~4.5x per year, though lumpy, not smooth. (Moore-like) Since 2010, compute used to train frontier models has grown 4.5x annually. Algorithmic efficiency improves separately: each year, the same performance can be achieved with roughly 3x less compute. These aren’t fully independent (better algorithms get reinvested into larger runs), and the frontier is increasingly capital-constrained. When three companies control the capital needed to train frontier models, the curve’s future shape depends on their strategic choices, not just on technology.
Inference cost: repeated order-of-magnitude declines. (Wright-like) Across benchmarks, the cost to reach a given level of capability has fallen between 9x and 900x per year. The wide range reflects real variance: vendor pricing strategy, model compression, architecture changes. The point isn’t a clean rate. It’s that order-of-magnitude drops in the cost of a specific capability have occurred repeatedly. Most of this is learning-by-doing at civilizational scale: the more tokens the industry serves, the more it learns about how to serve them cheaply.
Agent task horizon: doubling roughly every four months (in a narrow benchmark). (Both) METR’s January 2026 update estimates the post-2023 doubling time at about 130 days. Caveat: this measures how long an AI can work on coding tasks before needing help. “Task horizon” isn’t a standardized metric, and it measures length, not reliability. But the trend within the benchmark is steep and consistent. This curve benefits from R&D (new training techniques, better scaffolds) and from deployment feedback (what agents actually fail at in production), which is why it is outpacing curves driven by only one force.
Hardware efficiency: 30–40% better per year. (Moore-like) Hardware costs have declined 30% annually, with energy efficiency improving 40% each year. This partially overlaps with the training compute curve; they aren’t fully independent variables.
These curves also interact causally. Falling inference cost enables more compute at inference time, which improves reasoning on hard tasks, which changes the shape of the task-horizon curve, which then runs into the reliability wall described below.
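To make those rates concrete, here is a rough projection using the headline figures quoted above: 4.5x training compute per year, roughly 3x algorithmic efficiency, about a 30% annual hardware cost decline, and a 130-day horizon doubling. It treats the curves as independent, which, as noted, they are not. A sketch, not a forecast.

```python
# Headline rates quoted in this section; treated as independent for illustration only.
TRAIN_COMPUTE_GROWTH = 4.5      # frontier training compute, per year
ALGO_EFFICIENCY_GAIN = 3.0      # same performance for ~3x less compute, per year
HW_COST_DECLINE = 0.30          # hardware cost falls ~30% per year
HORIZON_DOUBLING_DAYS = 130     # METR's post-2023 estimate, coding tasks only

def effective_training_compute(years):
    """Physical compute growth times algorithmic gains; an upper bound, since the two overlap."""
    return (TRAIN_COMPUTE_GROWTH * ALGO_EFFICIENCY_GAIN) ** years

def hardware_cost(years, initial=1.0):
    """Relative hardware cost after `years` of ~30% annual decline."""
    return initial * (1 - HW_COST_DECLINE) ** years

def task_horizon(days, initial_hours=1.0):
    """Task length an agent can handle, doubling every ~130 days within the benchmark."""
    return initial_hours * 2 ** (days / HORIZON_DOUBLING_DAYS)

print(round(effective_training_compute(1), 1))  # ~13.5x "effective compute" in a year, if independent
print(round(hardware_cost(1), 2))               # ~0.70x relative hardware cost after a year
print(round(task_horizon(365), 1))              # a 1-hour horizon grows to ~7 hours in a year
```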
Where the curves break
The walls appear where the curves decouple, each binding at a different layer.
The logarithmic trap. Compute scales exponentially. Benchmark performance scales roughly in proportion to the log of compute: each doubling of performance requires far more than a doubling of the training budget. At sufficient scale, the cost of the next marginal gain exceeds the economic value it unlocks. The current industry response is called “test-time compute”: it shifts the budget from training to reasoning during use. It works for some tasks. Whether it generalizes, and whether the economics hold at scale, is still open.
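A worked version of the trap, under the simplifying assumption that benchmark score grows linearly with the log of training compute (a toy model, not a fitted scaling law): doubling the score means squaring the compute, not doubling it.

```python
import math

def score(compute, k=10.0):
    """Toy model: score grows with the log of compute. k is arbitrary; ratios don't depend on it."""
    return k * math.log10(compute)

c1 = 1e24     # some training budget, in FLOP
c2 = c1 ** 2  # to double k*log10(c), you need log10(c2) = 2*log10(c1), i.e. c2 = c1**2

print(score(c2) / score(c1))   # 2.0: the score doubled...
print(c2 / c1)                 # 1e24: ...for a 10^24-fold increase in compute, not a 2-fold one
```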
The data wall. High-quality human text is largely exhausted at current frontier training scales. Synthetic data was expected to trigger model collapse, but reasoning models have changed the picture: tasks with crisp right/wrong answers (math, code, logic) and model self-play in narrow domains have kept the curve alive. The open question is whether these techniques generalize to everything else.
The reliability gap. Task horizon measures length, not robustness. An agent that can work for 100 hours but has a 1% chance of catastrophic failure each hour is still unusable for anything high-stakes. Public reliability data lags horizon data badly, itself a signal that the industry has optimized for the metrics that are easy to benchmark.
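The arithmetic behind that example, assuming failures are independent hour to hour (an assumption, and a generous one): a 1% hourly failure rate leaves only about a one-in-three chance of getting through a 100-hour task cleanly.

```python
# Independent 1%-per-hour failure chance, compounded over a 100-hour task.
hourly_failure = 0.01
hours = 100

p_clean_run = (1 - hourly_failure) ** hours
print(round(p_clean_run, 3))      # ~0.366: about 1-in-3 odds of no catastrophic failure
print(round(1 - p_clean_run, 3))  # ~0.634: about 2-in-3 odds of at least one
```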
The power grid constraint. Moore’s Law was about density: doing more in the same space and power. AI scaling is increasingly about absolute magnitude: bigger substations, more gigawatts. Hardware efficiency is improving, but cluster size is growing faster. We’re moving from a silicon-constrained world to a copper-and-transformer-constrained one. And by transformer I mean the electrical kind, the unglamorous infrastructure that physically moves gigawatts into a data center. Power grids don’t follow Moore’s Law. They follow the much slower clock of civil engineering and utility regulation.
The institutional curve, which is essentially flat. Even if agent capability doubles every four months, the time for a large enterprise to approve, audit, and deploy an autonomous agent is measured in years. Compliance, procurement, security review, change management: none of these scale exponentially. The AI curves are steep. The institutional curve is close to flat. Most of the interesting work of the next decade sits in that gap: not pushing the capability frontier further, but compressing the institutional one. Packaging, integrating, and derisking systems so organizations can absorb them at anything approaching the speed they’re arriving.
What this means
The broad direction is hard to argue with. For several years, two things have been improving roughly tenfold per year: how much capability you can get for a given price, and how long an AI system can work without a human in the loop. And they run on different clocks: the first falls mostly with cumulative usage, the second is pushed by both R&D and deployment feedback.
That combination is rare in the history of technology.
The reason it feels like a sudden change, rather than a smooth curve, is that several things happened at once: models crossed from interesting demo to actually useful, ChatGPT put them in front of a billion people, APIs pushed them into every tool people already use, and human intuition about how fast software can improve simply lagged the reality.
The practical effect is that the line of what’s worth building keeps moving. Something too expensive or too unreliable last quarter can make sense this quarter. Something that barely works today can be boring infrastructure a year from now.
But the problems that block real adoption have shifted. They’re no longer mostly about capability. They’re about reliability, power and data center capacity, data quality, and how long it takes large organizations to trust and integrate new systems. Those clocks run much slower.
The gap keeps widening. The AI side is steep and will stay steep. The institutional side is close to flat and will stay flat. Most of the interesting work for the rest of this decade sits in closing that gap, not in pushing the frontier further.
Sources and further reading
- Epoch AI: Trends in frontier AI training compute and algorithmic efficiency
- Epoch AI: LLM inference price trends
- METR: Measuring AI ability to complete long tasks (Jan 2026 update)
- Stanford HAI: AI Index Report
- Shannon, A Mathematical Theory of Communication (1948)
- Vaswani et al., Attention Is All You Need (2017)
- AlexNet (2012)
- Moore’s Law
- Wright’s Law / Experience curve effects