Grok Build field report: xAI's $300/mo coding CLI

What Grok Build is

Grok Build is xAI's entry into the agentic-coding-CLI race — the same category as Claude Code and OpenAI's Codex. It launched as an early beta on May 14, 2026, available only to SuperGrok Heavy subscribers. That tier lists at $299/month, currently discounted to $99/month for the first six months for early adopters.
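For anyone budgeting the seat, the pricing above works out as follows. This is a minimal sketch that uses only the two prices and the six-month promotional window quoted above. The function name and the 12-month horizon are our own illustrative choices, and it assumes a subscriber stays on the plan for the whole period with no taxes or plan changes.

```python
def subscription_cost(list_price: int, promo_price: int,
                      promo_months: int, months: int = 12) -> int:
    """Total cost in dollars over `months`, with a promo price for the first `promo_months`."""
    if months < 0 or promo_months < 0:
        raise ValueError("month counts must be non-negative")
    discounted = min(promo_months, months)   # months billed at the promo price
    full_price = months - discounted         # remaining months at list price
    return discounted * promo_price + full_price * list_price


# Figures from this report: $299 list, $99 for the first six months.
first_year = subscription_cost(list_price=299, promo_price=99, promo_months=6)
undiscounted_year = subscription_cost(list_price=299, promo_price=299, promo_months=0)

print(first_year)         # 2388
print(undiscounted_year)  # 3588
```

Under those assumptions an early adopter's first year costs $2,388, against $3,588 at the undiscounted list price.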

Its headline feature is parallelism. Grok Build runs on Grok 4.3 beta with a "Heavy" multi-agent architecture: it can spawn up to 8 concurrent sub-agents that plan, search documentation, and write code at the same time, against a 2-million-token context window. The underlying coding model, grok-code-fast-1, scores 70.8% on SWE-Bench Verified — a respectable number. On paper, it's a serious product.

What we found

We're not running a benchmark here, and we won't pretend otherwise. Six developers in our circle (not me personally) picked up Grok Build in its first two days and used it on real tasks. The sample is small and informal, but the verdict was consistent enough to be worth writing down: across the board, they were disappointed.

Two things came up from everyone. The first was speed — time-to-first-useful-output was slow enough that the tool broke the flow it's supposed to protect. The second was cost behaviour, and this is the one worth dwelling on. One tester ran what was effectively a sanity check — a near-empty "hello"-grade prompt — and watched it consume on the order of 70,000 tokens getting there. Not a complex refactor. A hello.

For a tool that lists at $299/month, "it's an early beta" only covers some of that. Beta explains rough edges, missing polish, the occasional crash. It's a fair shield, and we'll extend it. What it doesn't fully explain is a trivial prompt costing 70,000 tokens, because that part isn't a bug.

Why a "hello" costs 70,000 tokens

The token burn isn't a defect Grok Build will quietly patch away. It's a direct consequence of the feature on the box. When the headline capability is "8 sub-agents working in parallel," then every prompt — including a trivial one — can fan out into eight agents that each plan, each search, each reason, and each spend tokens doing it. xAI's own documentation is candid about this: more agents means "deeper, more thorough research at the cost of higher token usage and latency." That's not us speculating. That's the design, described by the people who built it.
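To make the fan-out concrete, here is a back-of-the-envelope model. It is a sketch under our own assumptions, not xAI's actual accounting: it assumes every sub-agent spends the same number of tokens, and the coordinator overhead is a hypothetical parameter that defaults to zero. The only figures taken from this report are the 8-agent cap and the roughly 70,000-token observation, and we do not know how many agents that particular run actually used.

```python
def fan_out_tokens(agents: int, tokens_per_agent: int,
                   coordinator_tokens: int = 0) -> int:
    """Total tokens when `agents` sub-agents each spend `tokens_per_agent`.

    `coordinator_tokens` is a hypothetical allowance for merging the results.
    """
    if agents < 1:
        raise ValueError("need at least one agent")
    return agents * tokens_per_agent + coordinator_tokens


def implied_tokens_per_agent(total_tokens: int, agents: int) -> float:
    """Per-agent share of an observed total, assuming an even split."""
    if agents < 1:
        raise ValueError("need at least one agent")
    return total_tokens / agents


OBSERVED_TOKENS = 70_000   # the "hello"-grade prompt from this report
MAX_AGENTS = 8             # the advertised sub-agent cap

per_agent = implied_tokens_per_agent(OBSERVED_TOKENS, MAX_AGENTS)
print(per_agent)                                    # 8750.0
print(fan_out_tokens(1, int(per_agent)))            # 8750
print(fan_out_tokens(MAX_AGENTS, int(per_agent)))   # 70000
```

If the run did use all eight agents and split the work evenly, each agent accounts for about 8,750 tokens. The total only becomes striking once it is multiplied by eight.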

For a genuinely hard task, that fan-out can be worth it. For a "hello," for a one-line edit, for the majority of real work that is not hard (we'd put it at 60–80%, as a rough estimate rather than a measurement), you are paying a multi-agent tax on every single call. And because the agent decides for itself how much machinery to throw at a request, your cost-per-task is structurally unpredictable: you cannot look at a prompt and know what it will cost. At a flat $299/month that unpredictability is hidden until you hit the ceiling; the moment Grok Build gets metered API pricing, it becomes the whole story.
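The unpredictability can be put in numbers with a small sketch. Everything here is an assumption for illustration: Grok Build has no metered price that we know of, so the $10 per million tokens below is a made-up rate, and the 8,750 tokens per agent is simply 70,000 divided by 8, which presumes an even split across agents.

```python
from typing import Tuple


def cost_range_usd(tokens_per_agent: int, usd_per_million_tokens: float,
                   max_agents: int = 8) -> Tuple[float, float]:
    """Cheapest and priciest outcome for one prompt, depending on fan-out.

    The agent, not the user, picks how many sub-agents run (1..max_agents),
    so the same prompt can land anywhere between the two returned values.
    """
    if max_agents < 1:
        raise ValueError("max_agents must be at least 1")

    def cost(agents: int) -> float:
        return agents * tokens_per_agent * usd_per_million_tokens / 1_000_000

    return cost(1), cost(max_agents)


HYPOTHETICAL_RATE = 10.0   # USD per million tokens; invented for this example
low, high = cost_range_usd(tokens_per_agent=8_750,
                           usd_per_million_tokens=HYPOTHETICAL_RATE)
print(f"${low:.4f} to ${high:.4f} for the same prompt")
```

Whatever the real rate turns out to be, the ratio is what matters: with a cap of eight agents, the top of the range is eight times the bottom, and the caller does not choose where in that range a given prompt falls.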

To be fair to xAI: it is two days old, the model itself benchmarks fine, and xAI iterates quickly — the speed complaints may well be gone in a month. We'd genuinely retest. But the cost shape is an architectural choice, not a launch-week wrinkle, and no amount of beta-polish changes the arithmetic of eight agents on a one-line prompt.

The takeaway

We're not here to dunk on a two-day-old product. The honest summary is narrower and more useful: Grok Build, today, is a slow early beta whose most interesting feature also makes it expensive to run for ordinary work — and it asks $299/month for the privilege of finding that out.

It also makes a point we keep making. Wiring your workflow to one premium single-vendor seat — at $300/month, with a cost-per-task you can't predict and can't route around — is precisely the bet we think developers should stop making. The fix isn't picking the "right" $300 subscription. It's keeping models behind one interface, routing the easy work to something cheap and predictable, and escalating only when the task actually earns it. That's what llmdeal.me is built to do — see the pricing.
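The "cheap first, escalate only when needed" idea can be sketched in a few lines. This is an illustration of the routing pattern, not a description of how llmdeal.me or any vendor implements it. The tier names, the stub model functions, and the acceptance check are all hypothetical stand-ins; in practice each `run` would call a real model API and the check would be something like a test suite or a linter.

```python
from dataclasses import dataclass
from typing import Callable, Sequence, Tuple


@dataclass(frozen=True)
class Tier:
    name: str                   # hypothetical label, e.g. "cheap" or "premium"
    run: Callable[[str], str]   # sends a prompt to a model, returns its answer


def route(prompt: str, tiers: Sequence[Tier],
          is_good_enough: Callable[[str, str], bool]) -> Tuple[str, str]:
    """Try tiers from cheapest to most expensive; stop at the first acceptable answer.

    If no tier passes the check, the last tier's answer is returned as a fallback.
    """
    if not tiers:
        raise ValueError("at least one tier is required")
    answer = ""
    for tier in tiers:
        answer = tier.run(prompt)
        if is_good_enough(prompt, answer):
            return tier.name, answer
    return tiers[-1].name, answer


# Stub models so the sketch runs without network access.
def cheap_stub(prompt: str) -> str:
    return "" if "refactor" in prompt else "ok"   # gives up on hard prompts


def premium_stub(prompt: str) -> str:
    return "done"


def non_empty(prompt: str, answer: str) -> bool:
    return bool(answer)   # placeholder check; real checks would run tests


TIERS = [Tier("cheap", cheap_stub), Tier("premium", premium_stub)]

print(route("hello", TIERS, non_empty))                        # ('cheap', 'ok')
print(route("refactor the billing module", TIERS, non_empty))  # ('premium', 'done')
```

The design choice that matters is the ordering: the expensive tier is only ever reached after a cheaper one has failed the check, so a "hello" never pays for the heavy machinery.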

This is a field report drawn from the first-hand experience of a small, informal group of developers, written in Grok Build's first 48 hours of public beta. It is their experience as relayed to us, not a controlled benchmark; treat it as one data point. We'll happily revisit it as the product matures.

References

xAI — Introducing Grok Build (official announcement; accessed 2026-05-16)
Dataconomy — xAI launches Grok Build coding agent for developers (2026-05-15)
Techloy — Grok Build early beta: how xAI's coding agent takes on Claude Code (May 2026)
xAI Docs — Multi-agent: more agents means higher token usage and latency (accessed 2026-05-16)
Pasquale Pillitteri — Grok Build: xAI's agentic coding CLI takes on Claude Code (May 2026)

Product facts checked against xAI's own announcement and documentation, May 2026. Article published 2026-05-16. The performance and cost observations are a first-hand field report and are described as such.