My one-engineer workshop runs on OpenClaw and a Claude subscription. I opened my bill in mid-April and watched a familiar flat line turn into a staircase. Something shifted — and it wasn't just me.

Three things happened between April 4 and April 20. Taken together, they mark the end of the "flat rate, unlimited agents" era that most practical AI-assisted development quietly depended on.

April 4: Anthropic cuts OpenClaw off from subscriptions

Anthropic blocked Claude Pro and Max subscribers from running third-party AI agent frameworks — OpenClaw most prominently — on their flat-rate plans. Going forward, you pay API rates: $3/M input and $15/M output on Sonnet 4.6; $15 and $75 on Opus 4.6.
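Those per-token rates matter most at agent scale. A rough sketch of what they imply for a monthly bill, with hypothetical daily volumes (the usage numbers below are illustrative, not measurements):

```python
# Rough cost sketch at the per-token rates above (USD per million tokens).
# The daily usage volumes are hypothetical, just to show the scale.

RATES = {  # (input $/M, output $/M)
    "sonnet-4.6": (3.0, 15.0),
    "opus-4.6": (15.0, 75.0),
}

def monthly_cost(model, input_tokens_per_day, output_tokens_per_day, days=30):
    """Estimate a monthly API bill from average daily token volume."""
    in_rate, out_rate = RATES[model]
    daily = (input_tokens_per_day * in_rate + output_tokens_per_day * out_rate) / 1e6
    return daily * days

# A chatbot-paced user: ~50k in / 10k out per day on Sonnet.
print(round(monthly_cost("sonnet-4.6", 50_000, 10_000), 2))    # ~$9/mo
# An agent looping all day: ~3M in / 500k out per day on Opus.
print(round(monthly_cost("opus-4.6", 3_000_000, 500_000), 2))  # ~$2475/mo
```

Same rates, wildly different bills: the variable that changed wasn't the price, it was the usage pattern.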

The reports are stark: hobbyist bills jumped from $20 a month to $500+ overnight. Some users saw 50× increases. Anthropic offered a one-time transition credit matching your monthly plan, redeemable until April 17. That window is closed.

Framed in the press as a "cost crackdown," it's better read as what it is: Anthropic pulling users toward their first-party surfaces — Claude Code, their own VS Code extension — where they see the compute and bill it directly, instead of eating the cost of someone else's agent framework burning through Pro-tier quota.

What this looks like from the workshop floor: it's a forcing function, not a wall. The moves I'm thinking about now — budgets per workflow, harder prompt caching, being more deliberate about when Opus is worth it versus when it isn't — are the kind of discipline that was overdue anyway. Writing this up partly to figure it out in public.

April 16: Opus 4.7 launches with a tokenizer asterisk

Opus 4.7 is legitimately a big jump for coding work. SWE-bench Verified goes 80.8% → 87.6%. SWE-bench Pro goes 53.4% → 64.3%. CursorBench goes 58% → 70%. These aren't marketing numbers; they correspond to real "it just solves harder problems on the first try" behavior. I noticed it the day I upgraded.

The price tag looks unchanged — same $/M input, same $/M output as 4.6. But the new tokenizer produces up to 35% more tokens from the same text. Your effective cost per request rose somewhere between 0 and 35%, depending on content type.

Anthropic provided the mitigation: prompt caching gets you up to 90% off cache reads, batch processing another 50%. If you were already caching hard, you're fine. If you weren't, your bill quietly climbed.
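To see how the inflation and the discount interact, here's a sketch of effective input cost per request, assuming cache reads bill at roughly 10% of the input rate (the "up to 90% off" figure above); the token counts and hit rates are illustrative:

```python
# Effective input cost under the 4.7 tokenizer: up to 35% more tokens
# for the same text, cache reads assumed at ~10% of the input rate.
# Token counts and cache-hit ratios below are illustrative.

def effective_input_cost(base_tokens, rate_per_m, inflation=0.35, cache_hit=0.0):
    """Cost of one request's input after tokenizer inflation and caching.

    base_tokens : token count under the old tokenizer
    inflation   : extra tokens from the new tokenizer (0.0 to 0.35)
    cache_hit   : fraction of input served from cache at 10% price
    """
    tokens = base_tokens * (1 + inflation)
    full_price = tokens * (1 - cache_hit) * rate_per_m / 1e6
    cached = tokens * cache_hit * 0.1 * rate_per_m / 1e6
    return full_price + cached

rate = 15.0  # Opus input $/M from the pricing above
no_cache = effective_input_cost(100_000, rate)                 # inflated, uncached
cached = effective_input_cost(100_000, rate, cache_hit=0.8)    # 80% cache hits
print(f"uncached: ${no_cache:.2f}, 80% cached: ${cached:.2f}")
```

At an 80% hit rate the cache discount more than absorbs the tokenizer inflation, which is exactly the "if you were already caching hard, you're fine" dynamic.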

The headline said "unchanged price." The bill didn't.

April 20: GitHub pauses Copilot sign-ups

Five days after Opus 4.7 shipped, Microsoft paused new Copilot Pro, Pro+, and student sign-ups entirely. Existing Pro users lost access to Opus models (Pro+ only now). Copilot's rate limits now include an undisclosed weekly token cap that can halt a session regardless of your quota.

GitHub's official framing: protect the experience for existing customers. The reporting from Where's Your Ed At tells the real story: Copilot is moving to token-based billing. Agentic workflows — Agent Mode, long-running sessions, parallel tool calls — consume more than 10× what old single-line completions did. Plan economics can't absorb that.

Why all three happened in seventeen days

The chatbot-era pricing was built for three assumptions:

  • Human-paced requests — tens of prompts an hour
  • Short sessions — minutes, not hours
  • Review-before-run — human approves each step

Agentic workflows violate all three. They loop for thirty minutes. They burn 100,000+ tokens in an hour on a single cron job. They run concurrently across a fleet of jobs. And the review happens after, not before.

When per-user compute jumps from tens of tokens a minute to tens of thousands, flat-rate pricing doesn't survive contact with reality. Every vendor hit the same wall in the same quarter because every vendor finally shipped agent features.


What this means across the spectrum

For enterprises

Token pricing actually fits the operating model — line-item costs, budget approvals, FinOps discipline. Unit economics work at scale. The real risk is ecosystem lock-in: when Microsoft pushes you to Copilot's token tier and Anthropic pushes you to Claude Code, procurement needs to watch for coupled stacks that are painful to unwind later.

Expect AI-spend attribution tooling to become standard, per-team token budgets to show up in planning by 2027, and a new job title — AI FinOps lead — to appear on LinkedIn before summer.

For small businesses

The hard middle. Too small to negotiate enterprise contracts, too active to absorb a surprise $500 bill. The play: tier usage by task value. Opus only for the highest-leverage moments — architecture, gnarly debugging, novel problem-solving. Sonnet for bulk. Open models (Qwen, DeepSeek, locally-hosted) for routine summarization and search.
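That tiering can be as boring as a lookup table. A sketch, where the task kinds and model names are placeholders for whatever your stack actually uses:

```python
# Route tasks to model tiers by value, per the tiering above.
# Task kinds and model names are illustrative, not a vendor catalog.

TIERS = {
    "architecture": "opus",        # highest leverage: frontier model
    "debugging-hard": "opus",
    "codegen-bulk": "sonnet",      # everyday work: mid-tier
    "refactor": "sonnet",
    "summarize": "open-local",     # routine: open/locally-hosted model
    "search": "open-local",
}

def pick_model(task_kind, default="sonnet"):
    """Choose a model tier for a task; unknown tasks get the default."""
    return TIERS.get(task_kind, default)

print(pick_model("architecture"))   # opus
print(pick_model("summarize"))      # open-local
print(pick_model("something-new"))  # sonnet (the default)
```

The point isn't the table, it's that the decision is made once, explicitly, instead of every task silently defaulting to the most expensive model.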

Nobody is building opinionated, cost-aware AI tooling for this segment yet — with explicit dashboards, sensible caching defaults, and task-level cost caps. That is a real opportunity for someone. If that's you: please build it.

For hobbyists

Where I sit. The "run experiments for free" era is tightening, but it isn't over. $10–30 a month still buys a lot if you're deliberate. What helps: prompt caching like it's the only lever you have, batch jobs when latency doesn't matter, and being honest about which tasks actually need a frontier model and which don't.

The silver lining is real. Price pressure is doing what thoughtful design should have done already. Agents without budgets or exit conditions were never actually good. The bill is now the forcing function.
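That discipline is simple to sketch: a hard token budget and a step cap as exit conditions around whatever step function your framework exposes. `run_step` here is a stand-in, not any real framework's API:

```python
# A minimal budget guard for an agent loop: a hard token cap plus a step
# cap as exit conditions. run_step() stands in for whatever model call
# your framework makes; it returns (tokens_used, done).

class BudgetExceeded(Exception):
    pass

def run_agent(run_step, token_budget=100_000, max_steps=50):
    """Loop an agent step until done, the step cap, or the token budget."""
    spent = 0
    for step in range(max_steps):
        tokens, done = run_step()
        spent += tokens
        if spent > token_budget:
            raise BudgetExceeded(f"spent {spent} tokens at step {step}")
        if done:
            return spent
    return spent  # hit the step cap; the caller decides what happens next

# Fake step: burns 9k tokens per call and never finishes on its own.
try:
    run_agent(lambda: (9_000, False), token_budget=50_000)
except BudgetExceeded as e:
    print(e)  # trips after a handful of steps instead of looping for an hour
```

Every agent framework can express this somewhere; the change is treating it as mandatory rather than optional.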

The broader arc

This is the normal adoption curve. Hyper-growth flat rates to land users, then usage-based reality, then tiered plans that match how people actually consume. AWS went through it. Stripe went through it. Twilio went through it. AI agent tooling is hitting the curve three years in.

What to expect by summer:

  • OpenAI and Google follow with similar repricing on their agent products
  • "AI cost observability" emerges as a real SaaS category
  • Agentic workflow becomes a first-class FinOps concern — not just a dev-tools question
  • The first vendor to ship honest, transparent per-task cost tracking earns disproportionate trust

The quiet fine print

None of this is a disaster. It's growing up. The month we stopped pretending autonomous agents cost the same as a chatbot is also the month we started pricing them honestly. Painful short-term, healthier long-term.

For a workshop like mine it means looking at my AI bill the way I look at my electric bill — knowing what it should be, flagging when it isn't, staying honest about what I'm running and why.
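That electric-bill check fits in a few lines, assuming you've set your own expected baseline; the dollar figures and tolerance below are arbitrary:

```python
# The "electric bill" check: compare actual spend against what this
# month should cost and flag deviation beyond a tolerance. The expected
# figure and tolerance come from your own baseline, not anyone's API.

def flag_overspend(actual, expected, tolerance=0.25):
    """Return a warning string if spend drifts more than tolerance from expected."""
    if expected <= 0:
        raise ValueError("expected spend must be positive")
    drift = (actual - expected) / expected
    if abs(drift) > tolerance:
        return f"spend drifted {drift:+.0%} from expected ${expected:.2f}"
    return None

print(flag_overspend(62.0, 40.0))  # flags a +55% jump
print(flag_overspend(42.0, 40.0))  # None: within tolerance, no noise
```

Run it monthly against the invoice and the surprise $500 bill stops being a surprise.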

Flat-rate AI isn't dead everywhere. You can still get it inside a vendor's walled surface. What's dead is the illusion that a $20 subscription was ever going to cover an autonomous agent burning 100,000 tokens an hour.

It wasn't. We just hadn't been billed for it yet.


Get the next one in your inbox.

We publish when there's something real to send — essays, teardowns, and field notes from a one-engineer workshop running on practical AI. No spam, unsubscribe anytime.

Subscribe →