Token Economics Will Drive Everything
Brian Armstrong, CEO of Coinbase (COIN), recently published an X post containing a blueprint for the next business operating system.
Coinbase did not announce a new AI model. It did not announce a massive enterprise deal.
It said something much more important:
AI usage can keep growing exponentially while AI spend stays flat, or even falls, if the company gets serious about its token economics. With clever engineering, you can scale intelligence without scaling your spend. How? With defaults, routing, caching, context discipline, and visibility.
That is the future of business.
Not because Coinbase has the perfect AI stack.
It doesn’t.
Not because open models now beat every closed frontier model.
They don’t.
But because the economic question has changed.
The question is no longer: “How do we get employees to use AI?”
The question is now: “How do we turn tokens into labor at the lowest sustainable cost?”
That is the entire ballgame going forward. Across all industries. Even the physical ones. Robots are going to convert tokens into labor just like AI agents do.
In a recent piece, “GLM-5.2 Proves AI Comes for All Moats,” I argued that the real story was not just another strong open model. The real story was the repricing of intelligence. Once capable models become cheap, open, deployable, and “good enough” for enough work, the closed-model scarcity premium comes under pressure.
Coinbase is what that thesis looks like when it leaves the model leaderboard and enters the corporate P&L.
Better defaults.
Better routing.
Better caching.
Leaner context.
Better visibility.
That sounds boring. It is not boring.
That is how companies convert AI from a novelty expense into an actual production function.
If labor is being replaced by tokens, then token efficiency becomes labor efficiency. And if token efficiency becomes labor efficiency, then AI infrastructure becomes margin infrastructure.
That is why this matters.
The winners will not be the companies that simply buy the most AI. The winners will be the companies that metabolize AI the best.
Unpacking Token Economics and Engineering
Coinbase’s first move is deceptively simple: stop treating model choice like a sacred human decision every engineer has to make manually.
Engineers can still choose whatever model they want. That is important. This is not an austerity program wearing an AI costume. But defaults matter because most usage follows the path already placed in front of the user.
So Coinbase is experimenting with defaulting more work to cheaper open-weight models like GLM-5.2 and Kimi through its LLM gateway, while still letting engineers escalate to frontier models when the task deserves it.
Please read that again.
The control point is not the employee.
The control point is the gateway.
That is a huge distinction.
Most companies try to control AI spend by annoying users. Caps. Alerts. Approvals. Dashboards that exist mostly to make everyone feel guilty. This is classic enterprise software thinking: if costs are rising, add friction.
Coinbase is doing the opposite.
It is removing the decision from the place where humans are least equipped to make it repeatedly and putting it into infrastructure.
That is exactly right.
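To make the idea concrete, here is a minimal sketch of a gateway default with an explicit escape hatch. The model names and the `Request` shape are hypothetical placeholders, not Coinbase's actual gateway or configuration; the only point is that the cheap path is the one taken when nobody chooses.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical model identifiers, used only for illustration.
DEFAULT_MODEL = "open-weight-default"
FRONTIER_MODEL = "frontier-premium"


@dataclass
class Request:
    prompt: str
    # Set only when the engineer deliberately overrides the default.
    requested_model: Optional[str] = None


def resolve_model(request: Request) -> str:
    """Honor an explicit choice; otherwise fall back to the gateway default."""
    if request.requested_model:
        return request.requested_model
    return DEFAULT_MODEL


# No choice made: the request takes the cheap default path.
routine = resolve_model(Request("summarize this stack trace"))

# Explicit escalation: the engineer still gets the frontier model.
escalated = resolve_model(
    Request("plan the database migration", requested_model=FRONTIER_MODEL)
)
```

Nothing is forbidden in this sketch. The engineer can always escalate, but doing nothing costs less than it used to.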
A human engineer should not have to perform a mental cost-benefit analysis every time they ask an agent to inspect a file, summarize an error, generate a test, refactor a function, or reason through a deployment issue. That is not leverage. That is cognitive tax.
The system should know.
The system should know which model is good enough for the task. The system should know when a frontier model is worth paying for. The system should know when context is reusable. The system should know whether the prompt is a planning problem, an execution problem, a code review problem, a classification problem, or a cheap summarization problem pretending to be something more important.
This is the first real lesson.
AI cost optimization is not about using worse AI. It’s about not using expensive AI where cheap AI already works.
There is a very large difference.
A frontier model may be worth every penny for architecture planning, ambiguous debugging, security-sensitive reasoning, high-stakes migrations, or multi-step agentic work where failure is expensive.
But using that same model for every repetitive execution step is economic laziness.
It is like hiring a senior partner at a law firm to alphabetize exhibits.
Yes, they can do it.
That does not mean the billing rate makes sense.
Coinbase’s Tokenomic Optimizations
Coinbase’s routing strategy attacks this directly. In their custom harnesses, prompts are preprocessed and routed based on the job, cache availability, and pricing. Planning may go to a frontier model. Execution may go somewhere cheaper. Code review may use multiple models so they can check each other.
That is not just model routing. That is labor routing.
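A rough sketch of what routing on job type, cache availability, and pricing could look like is below. Everything in it is an assumption for illustration: the model names, the prices, the tier numbers, and the job-to-tier table are invented, and Coinbase has not published its harness logic.

```python
from dataclasses import dataclass
from typing import FrozenSet, List


@dataclass(frozen=True)
class ModelOption:
    name: str
    input_price: float         # dollars per million uncached input tokens
    cached_input_price: float  # dollars per million cached input tokens
    tier: int                  # 1 = cheap open-weight, 2 = frontier


# Hypothetical catalog; names and prices are made up.
MODELS: List[ModelOption] = [
    ModelOption("open-weight-a", 0.50, 0.10, tier=1),
    ModelOption("open-weight-b", 0.40, 0.40, tier=1),
    ModelOption("frontier-x", 5.00, 0.50, tier=2),
]

# Hypothetical policy: the minimum tier each kind of job deserves.
MIN_TIER = {"planning": 2, "execution": 1, "summarization": 1}


def effective_price(model: ModelOption, warm: FrozenSet[str]) -> float:
    """Use the cached price when this model already has a warm cache."""
    return model.cached_input_price if model.name in warm else model.input_price


def route(job_type: str, warm: FrozenSet[str] = frozenset()) -> ModelOption:
    """Pick the cheapest model that meets the job's minimum tier."""
    # Unknown job types escalate to the frontier tier rather than risk a cheap mistake.
    min_tier = MIN_TIER.get(job_type, 2)
    candidates = [m for m in MODELS if m.tier >= min_tier]
    return min(candidates, key=lambda m: effective_price(m, warm))
```

With this made-up catalog, `route("planning")` picks the frontier model, `route("execution")` picks the cheapest open-weight option, and a warm cache on `open-weight-a` is enough to flip the execution choice to it.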
And this is where the whole business world is going.
Today companies have org charts.
Tomorrow they will have model charts.
Which model does first-pass support triage?
Which model writes the initial SQL?
Which model reviews compliance language?
Which model reads contracts?
Which model updates the CRM?
Which model drafts code?
Which model approves code?
Which model watches the cheaper models?
Which model gets called only when everything else fails?
This becomes an economic architecture.
The company that routes intelligently will have a lower cost per unit of cognition than the company spraying every task at the most expensive API.
That delta becomes margin.
And margin becomes power.
The second major lever is caching.
Caching sounds technical, so people skip over it.
They shouldn’t.
Caching is where AI spend optimization starts looking less like “save money on software” and more like “build a cognitive supply chain.”
If every AI request rebuilds the same context from scratch, you are paying over and over for the same knowledge to be re-digested. The repo. The docs. The tool schema. The company policies. The architecture. The customer history. The prior decisions.
That is waste.
Coinbase said its LibreChat cache hit rate moved from 5% to 60% once properly implemented. That is not a tiny optimization. That is a different cost structure.
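To see why that shift changes the cost structure, here is a back-of-the-envelope sketch. It rests on two assumptions that are not from Coinbase: that "hit rate" means the share of input tokens served from cache, and that cached tokens are billed at a fraction of the normal input price (the 10% ratio and the $1 per million price below are placeholders, since actual discounts vary by provider).

```python
def blended_input_cost(
    input_tokens: int,
    price_per_million: float,
    hit_rate: float,
    cached_price_ratio: float,
) -> float:
    """Dollar cost of input tokens when a share of them is served from cache."""
    if not 0.0 <= hit_rate <= 1.0:
        raise ValueError("hit_rate must be between 0 and 1")
    uncached_tokens = input_tokens * (1.0 - hit_rate)
    cached_tokens = input_tokens * hit_rate
    dollars_times_million = (
        uncached_tokens * price_per_million
        + cached_tokens * price_per_million * cached_price_ratio
    )
    return dollars_times_million / 1_000_000


# Placeholder inputs: 1M input tokens at $1 per million, cached tokens at 10% price.
cold = blended_input_cost(1_000_000, 1.0, hit_rate=0.05, cached_price_ratio=0.1)
warm = blended_input_cost(1_000_000, 1.0, hit_rate=0.60, cached_price_ratio=0.1)
```

Under those placeholder numbers, the same million input tokens cost about $0.955 at a 5% hit rate and about $0.46 at a 60% hit rate. The exact figures depend entirely on the assumed discount, but the direction does not.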
A warm cache means the company is not buying the same understanding repeatedly.
It is reusing cognition.
That phrase matters.
Reusing cognition.
The entire modern enterprise is full of repeated mental work. Every support rep re-reads the policy. Every engineer re-learns the service boundary. Every analyst rebuilds context from stale dashboards. Every manager asks the same operational questions in slightly different language.
AI makes that problem worse if implemented naively, because every agent starts every task hungry. It wants context. It wants files. It wants tools. It wants history. It wants examples. It wants everything.
And every token has a price.
So the companies that win will build memory systems, caches, retrieval layers, and context packaging that turn repeated work into reusable infrastructure.
This is not optional.
If AI becomes labor, then caching becomes training.
When you train a human employee, you do not want them to forget everything after each task. That would be insane. But a lot of enterprise AI usage today is basically paying a brilliant amnesiac by the token.
Coinbase is trying to stop doing that.
Good.
Context Is Not Enough
The third lever is context discipline.
This is the part a lot of power users already understand intuitively, but enterprises have barely begun to operationalize.
Long context is useful.
Long context is also dangerous.
Not morally dangerous. Economically dangerous.
A giant context window can become a landfill. Old files. Irrelevant tools. Stale assumptions. Prior conversations. Accidental instructions. Bloated MCP tool lists. The model carries all of it forward, and the company pays for the privilege.
That is not intelligence.
That is clutter with an invoice attached.
Coinbase’s advice is simple: start fresh when switching tasks, scope file context narrowly, disconnect unused tools, and don’t rely on compaction as a magic spell.
This is exactly right.
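One way to picture that advice is a context builder that only packs what the current task needs. This is a hypothetical sketch, not any real agent framework's API, and the token estimate is a crude four-characters-per-token heuristic rather than a real tokenizer.

```python
from typing import Dict, Set


def estimate_tokens(text: str) -> int:
    """Crude heuristic: roughly four characters per token. Not a real tokenizer."""
    return max(1, len(text) // 4)


def build_context(
    files: Dict[str, str],
    tools: Dict[str, str],
    needed_files: Set[str],
    needed_tools: Set[str],
) -> dict:
    """Keep only the files and tool schemas the current task actually needs."""
    kept_files = {path: text for path, text in files.items() if path in needed_files}
    kept_tools = {name: spec for name, spec in tools.items() if name in needed_tools}

    def total(*groups: Dict[str, str]) -> int:
        return sum(estimate_tokens(v) for group in groups for v in group.values())

    return {
        "files": kept_files,
        "tools": kept_tools,
        "tokens_before": total(files, tools),          # everything that was attached
        "tokens_after": total(kept_files, kept_tools),  # what the task will pay for
    }
```

The `tokens_before` and `tokens_after` fields are there so the trimming is visible: the gap between them is the clutter that would otherwise ride along on every call.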
The goal is not fewer tokens in some abstract moral sense.
The goal is fewer wasted tokens.
Companies should want AI usage to rise. If tokens are replacing meetings, analysis, manual QA, rote coding, dashboard work, support triage, and administrative labor, then token growth is not the enemy. Token growth may be the sign that the company is becoming more automated.
But token waste is payroll leakage.
A company would never brag that employees are spending 40% of their week looking for the right document, rereading the same policy, or redoing work another team already did yesterday.
Yet companies tolerate the AI equivalent constantly. They brag about it, even.
Bad context is expensive.
Unused tools are expensive.
Overstuffed prompts are expensive.
Lazy agent loops are expensive.
Poor task boundaries are expensive.
This is why AI operations will become a real discipline. Not “prompt engineering” as a cute LinkedIn skill. Real AI operations. Model selection, evals, routing, observability, context architecture, tool governance, cache design, security controls, and cost-per-task accounting.
That is where the money is.
Measure What Matters By Seeing It
The fourth lever is visibility.
This is where Coinbase’s approach is especially smart. They are not saying: “Use fewer tokens.”
They are saying: “Use as many as you want, but make the spend visible, and the more you spend, the more impact we expect.”
That is how adults run a business.
A token budget without outcome measurement is just superstition.
The real metric is not token spend.
The real metric is token ROI.
How many issues closed per dollar?
How many support tickets resolved per dollar?
How many compliance reviews completed per dollar?
How many sales workflows advanced per dollar?
How much engineering throughput per dollar?
How many manual hours removed per dollar?
This is the dashboard every serious company is going to build.
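The core of that dashboard is small. Here is a minimal sketch of the outcomes-per-dollar calculation, assuming each workflow reports its AI spend and a count of whatever outcome it produces; the `WorkflowStats` shape is hypothetical, and in practice the hard part is attributing spend and outcomes to the right workflow.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class WorkflowStats:
    name: str
    spend_dollars: float  # AI spend attributed to this workflow
    outcomes: int         # e.g. issues closed, tickets resolved, reviews completed


def outcomes_per_dollar(stats: WorkflowStats) -> float:
    """Token ROI for one workflow: units of finished work per dollar of AI spend."""
    if stats.spend_dollars <= 0:
        raise ValueError("spend_dollars must be positive")
    return stats.outcomes / stats.spend_dollars


def rank_by_roi(workflows: List[WorkflowStats]) -> List[WorkflowStats]:
    """Sort workflows from highest to lowest outcomes per dollar."""
    return sorted(workflows, key=outcomes_per_dollar, reverse=True)
```

Ranking by this number, rather than by raw spend, is what lets a heavy spender with strong results sit above a light spender with none.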
Not because CFOs hate AI. Because CFOs will eventually understand that AI spend is not like SaaS spend.
SaaS spend was mostly seat-based. You bought software for employees.
AI spend is increasingly work-based. You buy cognitive labor directly.
That means the economic unit changes.
The old question was: “How many employees need access?”
The new question is: “How much work should the machine perform?”
Once you see it that way, the whole conversation changes.
You do not want arbitrary caps. Arbitrary caps suppress automation. They punish productive users and protect inefficient workflows. They turn AI into another rationed corporate perk.
What you want is accountability.
Spend more if you create more value.
Use the expensive model if the expensive model earns its keep.
Run the agent longer if the agent closes the loop.
Burn tokens if the tokens replace something more expensive.
This is the future of making money in business.
Labor becomes tokens.
Tokens become workflows.
Workflows become systems.
Systems become margins.
And the companies with the best AI cost architecture compound faster than everyone else.
This is also why the GLM-5.2 story I wrote at Life in the Singularity matters so much. In that piece, I wrote that closed labs are not necessarily being killed by open models. They are being repriced by them.
Coinbase is the repricing mechanism in action.
When a large public company starts routing real internal workloads toward open-weight models because the price-performance curve works, that is not a philosophical debate anymore. That is demand moving.
The End of Frontier Dominance?
OpenAI, Anthropic, and Google can still win enormous markets. They will still have frontier models, premium products, enterprise trust, multimodal systems, safety infrastructure, and deep integrations.
But the lazy version of the business model gets weaker.
The lazy version says: companies will send everything to the best model because the best model is best.
No. They won’t.
Not once the bill gets large enough. We are already seeing this play out across Corporate America.
Not once GLM-style models become good enough for enough tasks.
Not once routing gets mature.
Not once caching becomes standard.
Not once CFOs realize “AI usage” is not a strategy and “AI productivity per dollar” is.
The frontier labs will still get paid.
But they will increasingly need to deserve the premium.
That is healthy.
It forces the industry away from token maximalism and toward useful intelligence. It forces labs to compete not just on benchmark screenshots, but on cost, latency, reliability, tool use, caching, deployment flexibility, and measurable business outcomes.
Again, this does not mean “always use the cheapest model.”
That is also lazy.
Cheap mistakes can be very expensive.
A bad code change can take down production. A bad compliance summary can create legal risk. A bad support answer can damage a customer relationship. A bad financial analysis can misallocate capital.
So the future is not cheap AI everywhere.
The future is priced intelligence everywhere.
Every task gets the level of cognition it economically deserves.
That is a much more interesting world.
It means business operators need to understand AI infrastructure the way they understand hiring, payroll, cloud spend, and gross margin. You cannot outsource the whole thing to vibes. You need to know where intelligence is being consumed, where it is being wasted, where cheaper substitutes work, where frontier reasoning matters, and where automation is actually producing value.
The companies that figure this out will become strangely powerful.
They will have smaller teams doing more work.
They will have agents operating across every function.
They will have lower marginal costs for analysis, coding, support, operations, finance, compliance, and creative production.
They will turn internal knowledge into reusable context.
They will route tasks through model portfolios the way quant funds route orders through markets.
They will measure cost per outcome instead of celebrating raw usage.
And they will look at companies paying frontier-model rack rates for everything the way cloud-native companies looked at businesses still running servers in closets.
This is the WealthSystems.ai angle.
Wealth in the AI era will not just come from owning models.
It will come from owning systems that convert intelligence into cash flow.
A business is becoming a machine for allocating human and artificial cognition against economic opportunities. The better the allocation, the better the margins.
Coinbase just gave everyone a glimpse of that machine.
Not the whole machine.
A glimpse.
But it is enough.
The strategy is not “cut AI spend.”
The strategy is “make AI spend scale like software instead of payroll.”
That is the important distinction.
Payroll scales linearly. More work usually means more people, more management, more coordination cost, more benefits, more recruiting, more meetings, more everything.
Software scales differently.
And AI, if architected correctly, should scale more like software than like labor.
That is the promise. But only if companies stop wasting tokens like tourists with corporate cards.
Coinbase’s reported result is the point: AI spend cut nearly in half while token usage keeps growing.
That is what every CEO should be staring at.
More work.
More automation.
More intelligence in the system.
Lower unit cost.
That is not cost cutting. That is operating leverage.
The next great companies will not ask, “How do we use AI?”
That question is already stale.
They will ask eight questions, over and over:
What work should become tokens?
Which tokens should be cheap?
Which tokens should be expensive?
Which context should be cached?
Which tools should be exposed?
Which models should supervise other models?
Which workflows produce measurable margin?
Which human decisions are still worth protecting?
That is the new management science.
Coinbase did not solve all of it in one post.
But it named the right problem. And naming the right problem is usually the first sign that the old era is ending.
The tokenmaxxing era was about adoption.
The next era is about efficiency.
Then comes the real era: compounding AI operating leverage.
That is where fortunes get made.
👋 Thank you for reading Wealth Systems. I started Wealth Systems in 2023 to share the systems, technology, and mindsets that I encountered on Wall Street. I am a Wall St banker turned ₿itcoin nerd, data engineer, agentic engineer & family office investor.
…or you can find me on LNKD.
💡The BIG IDEA is to share practical knowledge so we can each build and optimize our own wealth engines and combine them into a wealth system.
To help continue our growth please Like, Comment and Share this.
Disclaimer: For Informational Purposes Only
The content provided on this blog is for informational and educational purposes only and does not constitute financial, accounting, or legal advice. The author is not a licensed financial advisor, broker/dealer, or regulated by any financial authority.
No Warranties: All information is provided “as is” without any representations or warranties, express or implied. While every effort is taken to ensure the accuracy of the information, the author and blog owner cannot guarantee that the information is accurate, complete, or current. The author is not liable for any errors, omissions, or delays in this information or any losses, injuries, or damages arising from its display or use.
Investment Risks: Any investments, trades, or financial decisions made based on information found on this site are done at your own risk. Past performance is not indicative of future results. Investing involves a high level of risk, and you should perform your own due diligence before making any investment decisions.
Consult a Professional: Please consult with a certified financial advisor, accountant, or legal professional before making any financial decisions. By using this website, you agree to hold the author and blog owner harmless from any liability resulting from your use of this information.
Affiliate Disclosure: Some links on this website are affiliate links. This means if you click on the link and purchase the item or sign up for a service, I may receive a small commission at no extra cost to you. I only recommend products or services I personally use or believe will add value to my readers.



