Claude Fable 5 is priced at roughly twice the price of Opus, and it burns tokens fast. A single hard task can run for hours and spin up a fleet of sub-agents, every one of them billing.[1]

The cost anxiety is not hypothetical. Microsoft reportedly cancelled internal AI coding licenses over token costs. Uber burned through its 2026 AI budget with half the year left.[2] So the reflex in every finance review right now is predictable: cap usage, downgrade the default model, make people justify their tokens.

The reflex is wrong. Not because cost discipline is wrong, but because it is pointed at the wrong unit.

You Are Not Buying Tokens

Ethan Mollick gave Fable 5 a multi-page specification for a piece of research software. The model worked for nine and a half hours. It launched its own research agents, wrote, checked its own output, and iterated. What came back was finished work.[1]

Ethan Mollick

"I am closer to a patron. I describe what I want, I pay for it, and I judge the result."

One Useful Thing  ·  June 2026

Price that run as tokens and it looks alarming. Price it as labor and it is absurd in the other direction: a week of senior-level work, delivered overnight, for less than the cost of one billable consultant hour.

Every, the AI-native publisher, runs a deck-building automation with 24 skills and 18 scripts behind it. It costs $62 per deck.[3] That number gets quoted as a warning about agent costs. It is the opposite. Ask anyone who has paid salaried hours for a strategy deck what one actually costs.

You are not buying tokens. You are buying finished work. Price it against the salary it replaces, not the model it upgrades.

Cost per token is an input price. Cost per completed outcome is the number that belongs in your finance review. Most companies tracking AI spend today cannot produce the second number, which is exactly why the first one scares them.
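One way to produce that second number is to divide a workflow's total spend by the units of work that were actually accepted, then set the result against what the same unit would cost in salaried time. The sketch below is illustrative only: the `WorkflowMonth` record, its field names, the hours per deck, and the hourly rate are assumptions, not data from this article. Only the $62-per-deck figure comes from the text above.

```python
from dataclasses import dataclass


@dataclass
class WorkflowMonth:
    """One month of spend for one workflow (hypothetical record layout)."""
    name: str
    model_spend_usd: float             # total inference bill for this workflow
    outcomes_accepted: int             # units a human accepted as finished work
    salaried_hours_per_outcome: float  # hours a person would need for one unit
    loaded_hourly_rate_usd: float      # assumed fully loaded cost of that person


def cost_per_outcome(m: WorkflowMonth) -> float:
    """Total spend over accepted outcomes, so failed runs still count as cost."""
    if m.outcomes_accepted == 0:
        return float("inf")
    return m.model_spend_usd / m.outcomes_accepted


def labor_cost_per_outcome(m: WorkflowMonth) -> float:
    """What the same unit would cost if a salaried person produced it."""
    return m.salaried_hours_per_outcome * m.loaded_hourly_rate_usd


decks = WorkflowMonth(
    name="strategy deck",
    model_spend_usd=620.0,             # ten decks at the $62 quoted above
    outcomes_accepted=10,
    salaried_hours_per_outcome=8.0,    # placeholder assumption
    loaded_hourly_rate_usd=100.0,      # placeholder assumption
)

print(f"{decks.name}: ${cost_per_outcome(decks):.2f} per outcome "
      f"vs ${labor_cost_per_outcome(decks):.2f} in salaried time")
```

Dividing by accepted outcomes rather than by runs is the design choice that matters here: a cheap model that fails half the time shows up as expensive on this measure, which a per-token report would hide.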

The Stratification Is Real

Aaron Levie called the structural shift early: enterprise AI has moved from cheap chat tools to expensive frontier agents, and the cost is stratifying. Frontier tasks command premium inference. Commodity tasks route to cheaper models. Enterprises, he argued, will need new finance tooling and new habits to manage AI cost the way they learned to manage cloud cost.[2]

The mistake is treating "AI" as one line item with one price. It is now a labor market with tiers, and the gap between the top and bottom tiers is more than an order of magnitude.

Fig. 1. The routing layer.
All work passes through one question: what is the outcome worth?
Frontier (Fable 5): novel, multi-hour, high-stakes work. 2.0×
Workhorse: drafts, code, analysis. 1.0×
Commodity: classify, extract, summarize. 0.1×
Fable already routes its own grunt work down. Copy your model.
Three labor tiers, one routing decision per workflow. Not one budget cap per person.

Routing, Not Rationing

Here is what rationing does in practice. You cap the frontier model, and people do frontier work with the cheap one. The output quality drops invisibly, decisions get made on weaker analysis, and you have saved half the price of a task that produced nothing useful. Or people stop using AI for the hard work entirely, and you save 100 percent of nothing.

Routing does the opposite. A handful of workflows genuinely need Fable 5: the novel, multi-hour, high-stakes work where self-verification and judgment are the product. Most generation work runs fine one tier down. And the high-volume grunt work (classification, extraction, summarization) belongs on models that cost a fraction of a cent.

The model itself already works this way. Mollick's note from inside the 9.5-hour run: clever delegation to cheaper models lowers the real cost, with Fable handing research and grunt tasks to cheaper sub-agents while it keeps the judgment work.[1] Your cost structure should copy your model's.
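The tiering described above can be sketched as a small routing table. The multipliers (2.0, 1.0, 0.1) are the ones shown in Fig. 1, read here as relative prices. The task labels, the `route` and `blended_price` functions, and the workload mix in the usage lines are hypothetical. They only illustrate how the average price falls once grunt work is routed down.

```python
# Relative price per tier, as shown in Fig. 1 (workhorse = 1.0).
TIER_PRICE = {"frontier": 2.0, "workhorse": 1.0, "commodity": 0.1}

# Hypothetical task labels mapped to tiers, following the figure's groupings.
TASK_TIER = {
    "novel": "frontier",
    "multi_hour": "frontier",
    "high_stakes": "frontier",
    "draft": "workhorse",
    "code": "workhorse",
    "analysis": "workhorse",
    "classify": "commodity",
    "extract": "commodity",
    "summarize": "commodity",
}


def route(task_kind: str) -> str:
    """Return the tier for a task kind, refusing to guess for unknown kinds."""
    if task_kind not in TASK_TIER:
        raise ValueError(f"no routing rule for task kind: {task_kind!r}")
    return TASK_TIER[task_kind]


def blended_price(mix: dict[str, float]) -> float:
    """Average relative price for a workload mix.

    `mix` maps task kind to its share of token volume; shares must sum to 1.
    """
    if abs(sum(mix.values()) - 1.0) > 1e-9:
        raise ValueError("workload shares must sum to 1")
    return sum(share * TIER_PRICE[route(kind)] for kind, share in mix.items())


# Assumed mix: 10% frontier work, 30% generation, 60% high-volume grunt work.
mix = {"high_stakes": 0.10, "draft": 0.30, "extract": 0.60}
routed = blended_price(mix)
print(f"routed: {routed:.2f}x, everything on frontier: {TIER_PRICE['frontier']:.2f}x")
```

Under this assumed mix the routed workload averages well below the workhorse price, while the frontier share still runs on the frontier model. A real mix would come from usage logs, not from a guess.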

The New Finance Muscle

Cloud spend produced FinOps: a person and a practice that own cost per unit of value. AI spend is now a bigger, faster-moving version of the same problem, and most companies have nobody on it.

Read the Microsoft story again through this lens. Cancelling licenses is a governance answer to a routing problem. The companies that win this period will not be the ones that spent the least on tokens. They will be the ones that knew, per workflow, what a token was worth.

The practice is not complicated:

Step 1
Price the outcome

List your top ten AI workflows. For each one, write down what a completed unit is worth in salaried hours. This takes an afternoon and changes every conversation about AI cost that follows.

Step 2
Route by value

Assign each workflow a tier: frontier, workhorse, commodity. Default down, escalate up when quality fails. The frontier model is reserved for work where finished, verified output is the product.

Step 3
Re-route monthly

Put a 30-minute monthly review on the calendar: cost per outcome by workflow, and which workflows can move down a tier. Prices fall fast enough that this meeting pays for itself every single time.
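The three steps above could be wired together roughly as follows. This is a sketch under stated assumptions, not a reference implementation: `run_on_tier`, `passes_quality_check`, the `WorkflowStats` fields, and the 0.9 acceptance threshold are all hypothetical stand-ins for whatever model client, evaluation, and reporting a company already has.

```python
from dataclasses import dataclass
from typing import Callable

TIERS = ["commodity", "workhorse", "frontier"]  # cheapest first


def run_with_escalation(
    task: str,
    start_tier: str,
    run_on_tier: Callable[[str, str], str],
    passes_quality_check: Callable[[str], bool],
) -> tuple[str, str]:
    """Step 2: start at the assigned tier, move up only when quality fails.

    Returns (tier_used, output). Raises if even the top tier fails the check.
    """
    for tier in TIERS[TIERS.index(start_tier):]:
        output = run_on_tier(tier, task)
        if passes_quality_check(output):
            return tier, output
    raise RuntimeError(f"no tier produced acceptable output for: {task}")


@dataclass
class WorkflowStats:
    name: str
    tier: str
    spend_usd: float
    outcomes_attempted: int
    outcomes_accepted: int
    value_per_outcome_usd: float  # Step 1: salaried hours times hourly rate


def monthly_review(stats: list[WorkflowStats], min_acceptance: float = 0.9) -> list[str]:
    """Step 3: cost per outcome by workflow, plus candidates to move down a tier.

    The heuristic is an assumption: a workflow that is accepted almost every
    time at its current tier may have headroom to be trialled one tier lower.
    """
    report = []
    for s in stats:
        cost = s.spend_usd / s.outcomes_accepted if s.outcomes_accepted else float("inf")
        acceptance = s.outcomes_accepted / s.outcomes_attempted if s.outcomes_attempted else 0.0
        can_try_lower = s.tier != TIERS[0] and acceptance >= min_acceptance
        note = "candidate to trial one tier down" if can_try_lower else "keep"
        report.append(
            f"{s.name}: ${cost:.2f}/outcome vs ${s.value_per_outcome_usd:.2f} value, {note}"
        )
    return report


# Usage with stand-in functions; a real run_on_tier would call a model.
tier_used, _ = run_with_escalation(
    "summarize memo",
    "commodity",
    run_on_tier=lambda tier, task: f"{tier}:{task}",
    passes_quality_check=lambda out: not out.startswith("commodity"),
)
print(tier_used)
for line in monthly_review([WorkflowStats("decks", "workhorse", 620.0, 10, 10, 800.0)]):
    print(line)
```

Starting low and escalating keeps the frontier model for the cases that demonstrably need it, at the price of paying for the failed cheaper attempt. Whether that trade is worth it depends on how reliable the quality check is.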

Twice the price for the only model that finishes the job is the best deal on the menu.

Route tokens. Don't ration them.

We'll build your routing layer.

The Diagnostic is a free 30–45 minute conversation. We'll map your AI workflows to tiers, put a cost per outcome on each, and find the ones worth frontier tokens.

Sources
[1] Ethan Mollick, One Useful Thing, June 2026. Early-access review of Claude Fable 5: pricing at roughly twice Opus, a 9.5-hour autonomous run, and delegation to cheaper sub-agents as the real cost lever.
[2] Aaron Levie (@levie), X, May 2026. On enterprise AI cost stratification and the need for new AI finance tooling, in the context of reported Microsoft internal license cancellations and Uber's exhausted 2026 AI budget.
[3] Dan Shipper, "After Automation," Every, May 2026. The $62-per-deck automation: 24 skills, 18 scripts, and a human verifying the output.
John Tan

Founder and CEO of nativefirst.ai. Embeds with scaling founders and CEOs to ship Level-3 agents and AI workflows in production.