POV · AI Industry

The AI coding bill is coming due

For two years the pitch was that AI writes the code so you stop paying for developers. Gartner just put a date on when that trade goes upside down.

Ali Imran Memon, Founder & CEO, Kitsune AI

Jun 28, 2026 · 6 min read

Photo: Pavel Danilyuk (Pexels)

TL;DR

Gartner predicts AI coding costs will pass the average developer's salary by 2028, driven by surging token consumption and consumption-based pricing.
The capability curve and the cost curve are moving together. A better agent reasons more, retries more, and burns more tokens per task.
The real fight was never AI against human developers. It is fixed salary against variable consumption, and variable consumption becomes the bigger number once usage compounds.
The only question left for an engineering org is how much reasoning it can afford per task before the math turns on it.

The savings were always borrowed

In the past two years, more people have built more code from CLIs than in the five years before that combined. Every forecast has engineers out of a job within a very defined period of time. And yet engineering jobs are making a huge comeback, with most of the complaints aimed at the bad code that code generation produces.

The reasoning tax

Here is the part the demos never show you. When you run an agentic coding system, most of what you pay for is not code. Gartner's own breakdown puts 85 to 95% of tokens consumed by AI coding agents into the bucket of overhead: context rediscovery, inefficient file reading, redundant searches, failed attempts. Only 5 to 15% goes to the code generation and editing you actually wanted.

Call it the reasoning tax. Every loop the agent runs, every file it re-reads because it forgot what it already knew, every subtask it spawns, you are billed for. Reasoning models can consume up to 100x more tokens internally than they ever output. Agentic tasks can run up to 1000x the token cost of plain code chat. And output tokens, the expensive kind, run 5 to 8x the price of input on frontier models.
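One way to feel the reasoning tax is to price the useful tokens instead of all of them. The sketch below is an illustration, not Gartner's method: the function name and the $10 per million token price are assumptions, and only the 5 to 15% useful share comes from the breakdown above.

```python
def cost_per_useful_million(price_per_million: float, useful_share: float) -> float:
    """Effective price of one million useful tokens when only
    `useful_share` of billed tokens go to code generation and editing."""
    if not 0 < useful_share <= 1:
        raise ValueError("useful_share must be in (0, 1]")
    return price_per_million / useful_share


# Hypothetical blended price per million billed tokens (an assumption).
PRICE = 10.0

# 15% and 5% are the ends of the useful-share range quoted above.
for share in (0.15, 0.05):
    effective = cost_per_useful_million(PRICE, share)
    print(f"useful share {share:.0%}: ${effective:,.2f} per million useful tokens")
```

At a 5% useful share, every token you actually wanted costs twenty times its sticker price.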

So the smarter the agent gets, the more it thinks. The more it thinks, the bigger the invoice. Capability and cost are not on diverging curves. They are the same curve.

The cheaper-tokens illusion

The obvious rebuttal is that token prices are falling off a cliff. They are. Inference cost per token has dropped as much as 900x a year on some tasks. That sounds like the problem solves itself.

It does not, and the reason is the whole story. Per-token cost is collapsing while per-task token volume is exploding faster. You are paying a tiny fraction of a cent for each token and then asking the agent to burn ten million of them re-reading your repo. The unit got cheaper. The bill got bigger. That is what happens when consumption is uncapped and the meter never stops.
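The arithmetic is just price times volume. Here is a minimal sketch with made-up numbers: both prices and the 50,000-token chat request are assumptions chosen to show the shape, and the ten million tokens is the agent run described above.

```python
def task_bill(price_per_million: float, tokens_per_task: float) -> float:
    """Cost of one task in dollars."""
    return price_per_million * tokens_per_task / 1_000_000


# Hypothetical figures, for illustration only.
OLD_PRICE = 30.0   # dollars per million tokens, before the price drop
NEW_PRICE = 3.0    # dollars per million tokens, after a 10x drop

before = task_bill(OLD_PRICE, tokens_per_task=50_000)       # chat-style request
after = task_bill(NEW_PRICE, tokens_per_task=10_000_000)    # agentic run

print(f"before: ${before:.2f} per task")
print(f"after:  ${after:.2f} per task")
print(f"price fell {OLD_PRICE / NEW_PRICE:.0f}x, bill rose {after / before:.0f}x")
```

Under these assumptions the per-token price drops tenfold and the per-task bill still climbs, because volume grew faster than price fell.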

This is why the listicles aged so badly. The same eighteen months that produced every "16 best coding agents" and "top agents for 2026" roundup are the exact window in which each of those tools quietly got hungrier, because better answers cost more reasoning. The reviews measured output quality. Nobody put the meter on the table.

Look at what teams are actually paying

The numbers in the field are already past theoretical. Gartner found 23% of tech leaders now spend $200 to $500 per developer per month on tokens, and 6% are over $2,000 per developer per month. The most AI-hungry shops are reportedly running around $7,500 per employee per month on coding tools.

The shape of the jump is the alarming part. Bills that started at $20 to $100 a month per developer are leaping to $2,000 to $5,000. Uber reportedly burned through its entire 2026 AI budget of $3.4 billion in four months, with per-person monthly costs landing between $500 and $2,000. One daily Claude Code user racked up more than 10 billion tokens over eight months, worth over $15,000 at API rates. One developer.
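The single-developer figure is worth unpacking. The sketch below simply divides the numbers quoted above; since "more than 10 billion" and "over $15,000" are both floors, treat the results as approximate.

```python
tokens = 10_000_000_000   # "more than 10 billion tokens" (floor from the text)
cost_usd = 15_000         # "over $15,000 at API rates" (floor from the text)
months = 8

# Blended price implied by those two figures.
rate_per_million = cost_usd / (tokens / 1_000_000)

# Average spend per month over the eight-month period.
monthly_spend = cost_usd / months

print(f"implied blended rate: ${rate_per_million:.2f} per million tokens")
print(f"implied monthly spend: ${monthly_spend:,.0f}")
```

That works out to about $1.50 per million tokens and roughly $1,875 a month, which sits right at the edge of the $2,000 per developer band.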

These are not pilot-program rounding errors. This is the real cost of the thing finally showing up on the statement.

The misdirect everyone fell for

The field has spent two years arguing the wrong matchup. AI versus human developers. Who codes faster, who makes fewer bugs, who keeps their job. Engaging fight, wrong fight.

The real comparison is fixed salary against variable consumption. A developer's salary is a known, bounded, forecastable number. You sign it once a year. A consumption contract is open-ended by construction. Gartner is blunt that the shift from seat-based to consumption-based licensing creates highly variable cost structures that make it genuinely hard for enterprises to forecast or control spend. You did not buy a cheaper developer. You signed an open-ended contract and called it a discount.

And here is what tips it from expensive to dangerous: developers optimize for speed and convenience, never for cost. Why would they? Tokens felt free. So without a governed engineering operating model, every engineer is independently pulling on a meter nobody is watching, and the spend compounds quietly until a finance review finds it. Variable cost plus no governance plus an incentive to consume is not a risk. It is a guarantee.
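To see how a compounding meter catches a fixed salary, here is a rough sketch. It assumes constant month-over-month growth, which real spend will not follow exactly, and every number in the usage lines is hypothetical.

```python
def months_until_crossover(start_spend: float, monthly_growth: float,
                           monthly_salary: float) -> int:
    """Whole months until compounding token spend first reaches or exceeds
    a fixed monthly salary cost, assuming constant monthly growth."""
    if start_spend >= monthly_salary:
        return 0
    if monthly_growth <= 0:
        raise ValueError("spend never reaches the salary without positive growth")
    months = 0
    spend = start_spend
    while spend < monthly_salary:
        spend *= 1 + monthly_growth
        months += 1
    return months


# Hypothetical inputs: $200 a month in tokens today, growing 10% a month,
# against a developer who costs $12,000 a month.
crossover = months_until_crossover(200, 0.10, 12_000)
print(f"token spend passes the salary after {crossover} months")
```

The salary line is flat. The token line is not, and the only input that moves the crossover date is the growth rate nobody is watching.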

This is not staying in code

Code is just the first wall. The same metered-intelligence model is spreading into every "automate the tedious part" feature being shipped right now. Figma rolling out AI motion and shader tooling at Config is the same pattern wearing a designer's clothes. Every business-automation feature, every generative design tool, is a new metered workload sitting on top of someone's token budget. The reasoning tax is going enterprise-wide, and most TCO models still treat AI like fixed-cost software. It is not. Gartner's framing is that the entire cost base is shifting from fixed infrastructure to variable, consumption-based intelligence. The old spreadsheet does not have a row for this.

What the math actually demands

So stop asking whether to use AI. That question is settled and it was never interesting. The question with teeth is how much reasoning you can afford per task before the curve turns on you.

That means measuring tokens per task the way you measure latency. It means knowing your overhead ratio, because 85% waste is a number you can attack. It means a governed operating model where consumption is owned by a named person and watched like any other variable cost, because the ones who treated tokens as free are the ones about to read a very surprising invoice.
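A minimal sketch of what that measurement could look like, assuming you can get total and useful token counts per task out of your provider's usage logs. The class, field names, owners, and the two-million-token budget are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class TaskUsage:
    """Token counts for one agent task. Field names are hypothetical;
    map them to whatever your usage logs actually report."""
    task_id: str
    owner: str           # the named person accountable for this spend
    total_tokens: int
    useful_tokens: int   # tokens spent on code generation and editing

    @property
    def overhead_ratio(self) -> float:
        """Share of billed tokens that did not go to generation or editing."""
        if self.total_tokens == 0:
            return 0.0
        return 1 - self.useful_tokens / self.total_tokens


def over_budget(tasks: list[TaskUsage], max_tokens_per_task: int) -> list[TaskUsage]:
    """Return the tasks whose total token use exceeds the per-task budget."""
    return [t for t in tasks if t.total_tokens > max_tokens_per_task]


# Made-up sample data.
tasks = [
    TaskUsage("fix-login-bug", "dana", total_tokens=400_000, useful_tokens=60_000),
    TaskUsage("refactor-billing", "lee", total_tokens=10_000_000, useful_tokens=500_000),
]

for t in tasks:
    print(f"{t.task_id}: {t.total_tokens:,} tokens, overhead {t.overhead_ratio:.0%}")

flagged = over_budget(tasks, max_tokens_per_task=2_000_000)
print("over budget:", [(t.task_id, t.owner) for t in flagged])
```

The point of the owner field is the governance half of the argument: a flagged task should land on someone's desk, not in an aggregate line on the invoice.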

The bill was always coming. Gartner just told you the year.


Ali Imran Memon

Founder & CEO, Kitsune AI

Operator and builder across media, the creator economy and agentic AI. Founder of Kitsune AI, the Agentic AI Foundry.
