Blog

Notes on the cost of running AI.

The llmdeal.me blog covers LLM API pricing, model routing, and the unglamorous economics of not overpaying for inference. No hype.

Founder note · 2026-05-16 · 6 min read

I am not a vibe coder. Here is my Anthropic bill.

A working engineer's documented account: $913.25 charged by Anthropic across 31 days, the rate-limit crisis it ran straight into, and why no plan and no amount of money bought a tool that worked.

Founder note · 2026-05-16 · 8 min read

Locked out of Claude: what I missed in AI, April–May 2026

A first-person account: three weeks rate-limited on Claude, the AI news that piled up while I waited — Anthropic's scramble, OpenAI's counter-push, the open-weight surge — and why losing that time is the reason llmdeal.me exists.

Field report · 2026-05-16 · 5 min read

Grok Build, the first days: a $300/month field report

xAI's new agentic coding CLI, tested by six developers in its first 48 hours. It is slow, and its token burn is baked into the 8-agent architecture rather than being a beta bug: a "hello" cost ~70,000 tokens.

2026-05-16 · 6 min read

Why developers spent spring 2026 trying to cancel Claude

A rate-limit revolt, a $20 plan that vanished for a day, and an officially admitted quality regression. Here's what actually happened, and what an LLM API really costs once you stop paying frontier rates for everything.

2026-05-16 · 7 min read

What every major LLM API costs in 2026

Anthropic, OpenAI, Google, xAI, DeepSeek, Qwen, Mistral and the open-weight hosts — every major LLM API's per-token price in one reference table, plus what changed in 2026 and how to read the numbers.

2026-05-16 · 6 min read

The cheap models caught up: open-weight LLMs in 2026

DeepSeek V4, Qwen3, Kimi K2.6, GLM-5, MiniMax — open-weight models now trade blows with frontier models on coding benchmarks at a fraction of the price. The quality gap has mostly closed.

2026-05-16 · 7 min read

LLM gateways compared: OpenRouter, Requesty & the rest

OpenRouter, Requesty, Portkey, Helicone, Cloudflare AI Gateway and more — what each gateway charges, what markup it takes, and which ones offer EU residency or crypto payment.

2026-05-16 · 6 min read

Does model routing actually cut your LLM bill?

RouteLLM says 85%. RouterArena says 35%. An honest, evidence-based look at what smart model routing actually saves — and what it doesn't.

2026-05-16 · 6 min read

The hidden costs in LLM API pricing

Tokenizer inflation, reasoning-token billing, conversation re-billing, per-request fees, cloud markup — the charges that never appear in the headline per-token rate.

2026-05-16 · 6 min read

Where developers actually went after Claude

Codex, hybrid routing, model-agnostic tools, Chinese coding plans from $3/month, local models — a practical map of where the Claude refugees actually landed.


More posts soon.