Preorder bonus through Wed 20 May · Launch Mon 18 May

Stop paying frontier prices for queries a smaller model would have nailed.

llmdeal.me is a smart LLM gateway. One API key routes each request to the cheapest model that can actually answer it — our own Qwen-Coder-32B for trivial work, Groq Llama and DeepSeek-V3 for the medium range, Mistral Codestral for code, Qwen3-235B for reasoning. Frontier API access (Claude / GPT-4o) unlocks once preorders fund the operator credit. Open to developers worldwide; EU-resident infrastructure for the privacy-conscious. Crypto checkout. No KYC.

Preorder now in BTC — +30% bonus credits through Wed 20 May 2026. $20 buys you $26 in credits. $100 buys $130. Use them the moment the gateway opens (Mon 18 May).

curl
OpenAI-compatible drop-in
$ curl https://api.llmdeal.me/v1/chat/completions \
  -H "Authorization: Bearer $LLMDEAL_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "smart-route",
    "messages": [{"role": "user", "content": "refactor this function"}]
  }'
# Routed to qwen-coder-32b — $0.0004 instead of $0.012 on Sonnet
Read the docs →
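For readers who prefer Python, the same call can be sketched with the standard library alone. The endpoint, header names, and the "smart-route" model name come from the curl example above; the helper names build_chat_request and send are illustrative, and the response is assumed (not confirmed by this page) to follow the usual OpenAI chat-completions shape.

```python
import json
import urllib.request

# Endpoint taken from the curl example above.
API_URL = "https://api.llmdeal.me/v1/chat/completions"


def build_chat_request(prompt: str, api_key: str, model: str = "smart-route") -> urllib.request.Request:
    """Build (but do not send) an OpenAI-style chat completion request."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def send(req: urllib.request.Request) -> dict:
    """Send the request and return the parsed JSON body (needs network and a valid key)."""
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)
```

Calling send(build_chat_request("refactor this function", key)) with a real key would perform the request; passing a specific model name instead of the default skips the router.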

Your LLM bill is mostly mis-routed traffic.

Most teams default every query to Sonnet or GPT-4o. But a third of those queries are formatting, syntax, "what does this regex do", or generating boilerplate: work a 32B coding model handles just as well at a fraction of the cost (roughly 27% on input and 10% on output at the per-token prices quoted below).

Today (single-model)

All queries → Claude Sonnet 4.6 at $3 input / $15 output per 1M tokens. You're paying frontier prices for the long tail of trivial requests.

With llmdeal.me

Trivial → our Qwen-Coder-32B in the EU (~$0.80/$1.50). Fast queries → llama-3.3-70b-self-hosted on our EU GPU. Reasoning → DeepSeek-V3 / Qwen3-235B on Cerebras. Coding → Mistral Codestral. Frontier (Claude / GPT-4o) unlocks at the $3,500 stretch-goal threshold.
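To see what the comparison above means for a bill, here is a rough back-of-the-envelope sketch. It uses only the two price pairs quoted on this page (Sonnet at $3/$15 and Starter at $0.80/$1.50 per 1M tokens) and ignores the middle tiers, whose prices are not listed. The monthly token volumes in the usage note are made up for illustration.

```python
# Prices in USD per 1M tokens (input, output), as quoted on this page.
SONNET = (3.00, 15.00)
STARTER = (0.80, 1.50)


def cost_usd(input_tokens: float, output_tokens: float, prices: tuple) -> float:
    """Cost of a token volume at a given (input, output) price pair."""
    in_price, out_price = prices
    return (input_tokens * in_price + output_tokens * out_price) / 1_000_000


def blended_cost(input_tokens: float, output_tokens: float, trivial_share: float) -> float:
    """Cost when `trivial_share` of traffic moves to Starter and the rest stays on Sonnet."""
    if not 0.0 <= trivial_share <= 1.0:
        raise ValueError("trivial_share must be between 0 and 1")
    cheap = cost_usd(input_tokens * trivial_share, output_tokens * trivial_share, STARTER)
    rest = cost_usd(input_tokens * (1 - trivial_share), output_tokens * (1 - trivial_share), SONNET)
    return cheap + rest
```

With a hypothetical 30M input and 6M output tokens per month, cost_usd(30_000_000, 6_000_000, SONNET) gives the all-Sonnet bill, and blended_cost(30_000_000, 6_000_000, 1 / 3) gives the bill when the trivial third is rerouted.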

Six reasons it actually saves you money.

No "AI startup" magic. Boring engineering moves that compound.

01 Smart routing

Open-source classifier (RouteLLM-style) scores each request by difficulty, then routes to the cheapest model that can actually answer. Tune the thresholds per workload.

02 Our own model

Qwen2.5-Coder-32B running on our GPU in the EU for the cheap tier. We pay flat hardware costs, so we can price below every commodity Llama-70B reseller.

03 llama-3.3-70b-self-hosted — LIVE on our EEA GPU

Now live: llama-3.3-70b-self-hosted serving from our own EEA GPU box — no upstream-provider fees, same auth as the rest of the gateway, EU jurisdiction end-to-end. Available in the Pro smart-routing mix today.

04 EU infrastructure, global customers

Inference runs in the EU — GDPR jurisdiction. Open to developers worldwide; we just put the compute in the EU so privacy-conscious workloads don't have to leave it.

05 Crypto, no KYC

Pay in BTC, XMR, or LTC. No identity check, no card decline, no chargeback risk. We don't ask who you are.

06 Minimal data retention

We store only what we need to bill you and reach you: contact handle, order ID, dollar amount, token counts. We do not store prompt content, response content, or IPs past order settlement. GDPR-compliant by default — we apply the same handling worldwide.

Pricing preview

Pay-as-you-go in crypto. No subscription. No "credits" that expire. Final pricing locks in at public launch Mon 18 May 2026 (GMT+2) — preview shown.

Starter

Our own model only
$0.80 / $1.50
per 1M tokens (input / output)
  • Qwen2.5-Coder-32B on our EEA GPU
  • OpenAI-compatible API
  • 12k context window (upgrading to 24-32k post-launch)
  • EU-resident inference
  • BTC / XMR / LTC checkout
Preorder credits

Elite

Coming June 2026 · 70B EU model
$3.00 / $7.00
per 1M tokens (weighted avg)
  • 128k context window
  • EU-only routing by default. US models are not blanket-deployed; they're provisioned on demand, first come, first served, tailored per end user, and only on explicit opt-in.
  • GDPR Article 28 DPA available on request
  • Dedicated rate limit pool
  • Direct DM support (Matrix / Telegram)
  • 🤝 Zylo API passthrough (same rates as Pro) — Opus 4.6 / GPT-5.2 / Gemini 3.1 Pro / Open frontier at −40% Zylo retail
Preorder credits

Preorder credits — +30% bonus through 2026-05-20.

Back the build now in crypto, lock in a 30% credit bonus, get a beta key as soon as the gateway opens. No KYC. No subscription. No expiry.

After Mon 18 May 2026, these prices stop being a standing retail offer — same model mix may then only be reachable via consortium-tier deals with added profit margin. If the preorder window doesn't bring significant volume, the GPU doesn't get funded and the window simply closes.

$20 pack

Try it out
$20 → $26
~13M tokens on Starter · ~5M on Pro
  • +30% preorder bonus baked in
  • BTC self-serve · XMR + LTC operator-confirmed
  • Credits never expire
  • Beta key DMed at launch
Preorder $20 in BTC

$250 pack

Build supporter
$250 → $325
~162M tokens on Starter · ~65M on Pro
  • +30% preorder bonus baked in
  • Founder-batch beta key (first 10 onboarded)
  • Elite EU-only routing unlocked at launch
  • Direct DM line to me (Telegram / Signal)
  • Refundable under the money-back guarantee (see FAQ)
Preorder $250 in BTC

$50 pack also available on the buy page. Want a custom amount? DM me.

Founder & Charter tiers — limited slots

Bigger commit = bigger perks. Funds the next-tier GPU nodes directly.

Founder Member $500 → $700

25 slots · +40% bonus · funds US low-model GPU
  • Permanent Founder badge on the customer portal
  • Direct DM line to the operator (Telegram / Signal)
  • 1-hour router-threshold tuning consult call
  • Early access to Llama-3.3-70B EU when it ships
  • Locked-in launch pricing forever ($0.80/$1.50 Starter never increases)
  • Name credit in router_logic.py when it open-sources
Preorder Founder ($500)

25 slots remaining

Charter Patron $1000 → $1500

10 slots · +50% bonus · funds 1 month of full expansion
  • Everything in Founder Member
  • Custom smart-route classifier thresholds tuned to YOUR workload
  • GitHub "Charter patron" attribution when llmdeal/router opens
  • Direct input on the Q3 roadmap (Llama-405B? VL models? Local TTS?)
  • Quarterly call with the operator for the first year
Preorder Charter ($1000)

10 slots remaining

Stretch goals — what preorders unlock.

Every preorder dollar is earmarked. Hit a threshold, that infra deploys. Public counter, no spin.

Total raised $0


Refresh-rate: each page load fetches the live counter. No client-side polling — we don't want to wake your laptop.

Reach the founder.

Three channels. Matrix is preferred (E2EE, federated, no phone number). Telegram works. Discord works now but the invite is being rotated soon and may not stay reachable past 18 May — preorder backers will get the new one DM'd directly.

Not ready to preorder? Just watch.

If you're not buying yet, drop your contact and I'll ping you once at launch. No newsletter, no drip, one message.

Launch ping (free)

No spam. No newsletter. One DM when the gateway opens.


Privacy & data — what we keep, what we don't.

We operate under GDPR (Norway / EEA) and apply the same handling to every customer regardless of where you live. Plain English, no privacy-policy-lawyer-speak.

What we store

Order record. Order ID, SKU, currency, dollar amount, status, timestamps. Append-only ledger so we can credit your account at launch.

Contact handle. The email / Telegram / Discord / Signal handle you give us at checkout. Used only to deliver your API key and bill-readiness DMs.

Token counts. Once the gateway is live, we log the number of input + output tokens per request — for billing only. Never the prompt content.
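As an illustration of what "token counts only" could look like in practice, here is a minimal sketch of a billing row. The class and field names are hypothetical (the page does not describe the actual schema), and the account_id link to a credit balance is an assumption.

```python
from dataclasses import dataclass, asdict


@dataclass(frozen=True)
class UsageRecord:
    """Billing row: counts only, no prompt or response text."""
    account_id: str      # hypothetical link to the customer's credit balance
    input_tokens: int
    output_tokens: int


def to_usage_record(account_id: str, prompt_tokens: list, completion_tokens: list) -> UsageRecord:
    """Reduce a finished request to the numbers needed for billing."""
    # Only the lengths are kept; the token lists themselves are discarded.
    return UsageRecord(account_id, len(prompt_tokens), len(completion_tokens))
```

The point of the design is that the record type has no field that could hold content, so nothing downstream of billing can retain a prompt by accident.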

What we don't store

Prompt content. Your prompts and responses are not retained after the request completes. No training, no audit log of content.

KYC / identity data. We don't ask for your name, address, ID, or payment-card details. Crypto only.

IP addresses. Captured transiently for fraud/rate-limit, deleted within 24 h of order settlement.

Want your data deleted? DM us from the contact handle you signed up with. We remove the order record within 48 hours and send you the deletion timestamp. This is GDPR Article 17 ("right to be forgotten"), extended to every customer, EU resident or not. Full policy: /privacy.html.

FAQ

Honest answers. We'll add more as the beta progresses.

What does "preorder" actually mean here?

The gateway isn't fully live yet: the GPU node is being provisioned in the EU. Preorders fund the GPU deposit and signal demand. In return: +30% bonus credits baked into every preorder pack, plus priority onboarding when the gateway opens.

You pay $X in BTC today, you get $X × 1.30 in llmdeal credits when the gateway goes live. Credits never expire. No subscription kicks in afterwards.
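The credit arithmetic is simple enough to check yourself. The sketch below applies the bonus percentages stated on this page (30% for packs, 40% for Founder, 50% for Charter); rounding to whole cents is an assumption, and the function name is illustrative.

```python
from decimal import Decimal, ROUND_HALF_UP


def preorder_credits(amount_usd, bonus_percent=30) -> Decimal:
    """Credits received for a preorder of `amount_usd` with the given bonus."""
    amount = Decimal(str(amount_usd))
    if amount <= 0:
        raise ValueError("amount must be positive")
    credits = amount * (Decimal(100) + Decimal(bonus_percent)) / Decimal(100)
    # Assumption: credits are rounded to whole cents.
    return credits.quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)
```

For example, preorder_credits(20) corresponds to the $20 pack, and preorder_credits(500, bonus_percent=40) to the Founder tier.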

When does the gateway actually launch?

Target: Monday 18 May 2026, GMT+2 (Central European Summer Time). GPU is being provisioned this week; smart routing + gateway go live on the date above.

If launch slips past 2026-06-01, every preorder is refundable in BTC to the address you paid from. No questions, just DM me on Telegram / Signal.

Status updates land in your DMs (whatever handle you give us at checkout) — at least one before launch day.

What if I never get my credits?

Refund. Full BTC value back to your payment address, on request, any time before launch.

After launch, credits remain refundable too — gated by recorded usage. See the money-back guarantee below.

What happens to these prices after Mon 18 May 2026?

The tier prices on /pricing.html apply only during the preorder window. After public launch on 18 May, the same model mix may only be reachable through consortium-tier deals with added profit margin, not as standing retail pricing on this page.

And the honest version: if significant preorder volume doesn't arrive, the GPU doesn't get funded, the gateway doesn't open at these prices, and the window just closes. No preorders → no llmdeal at this price level → you're welcome to keep paying frontier providers what they ask, where the money really, really matters. That's the deal.

What's the Zylo API partnership? Who is Zylo?

Zylo API ("Next-Generation AI Inference Platform") is a frontier API gateway with direct access to Claude Opus 4.6, GPT-5.2, Gemini 3.1 Pro, DeepSeek V3.2 Thinking, Qwen 3.5, MiniMax 2.5, Kimi K2.5 — 26+ models total.

Our collaboration: Pro and Elite tier customers get passthrough access to Zylo's frontier lineup at −40% Zylo's public retail rate, billed against your llmdeal credit balance. Same llmdeal API key, opt-in per request.

Headline rates (vs Zylo retail published at zyloai.net):

  • Claude Opus 4.6: $10/M retail → $6/M
  • GPT-5.2: $5/M retail → $3/M
  • Gemini 3.1 Pro: $3.57/M retail → $2.15/M
  • Open frontier (DeepSeek / Qwen / Kimi / MiniMax): $1.00/M retail → $0.60/M

Why this is real: Zylo has direct upstream provider relationships and bulk pricing we don't. The partnership lets us pass their bulk rate through to our Pro+ customers with a meaningful discount on top. They get a distribution channel; our customers get frontier at non-frontier prices.

Crosscheck the retail numbers yourself at zyloai.net — that's part of why this partnership is visible on the page instead of buried.

Money-back guarantee — 3 hours of recorded usage

Full refund available on every order ever placed, gated only by cumulative actual token-usage time recorded on your account. Under 3 hours total of recorded usage across all your orders ever? Refundable on request: DM the founder. Past 3 hours, the service is considered consumed.

Refunds are per-second prorated against the usage actually recorded, minus the on-chain crypto network fee required to send the refund. For BTC refunds you cover the network fee in fiat upfront — we don't deduct it from the refunded principal; the BTC value you paid in comes back to your payment address. XMR + LTC refunds net the fee on-chain.

This isn't a 14-day trial gimmick — the clock is on actual usage, not calendar time. If you preorder, hold credits, and never touch the API, you can refund a year later.
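The eligibility rule above (cumulative recorded usage across all orders, under 3 hours) can be expressed as a short check. This sketch covers eligibility only, not the proration or fee handling; the function names and the per-order dictionary are hypothetical, and treating exactly 3 hours as "consumed" is an assumption, since the page only says "under" and "past".

```python
REFUND_WINDOW_SECONDS = 3 * 60 * 60  # "3 hours of recorded usage"


def total_usage_seconds(usage_by_order: dict) -> float:
    """Sum recorded usage time over every order on the account."""
    return sum(usage_by_order.values())


def refund_eligible(usage_by_order: dict) -> bool:
    """True while cumulative usage is still under the 3-hour window."""
    # Assumption: exactly 3 hours counts as consumed.
    return total_usage_seconds(usage_by_order) < REFUND_WINDOW_SECONDS
```

Note that calendar time does not appear anywhere in the check: an account with no recorded usage stays eligible indefinitely, which matches the "refund a year later" case.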

Is the cost-effectiveness claim independently verifiable?

Yes, and we'd rather you check before you preorder than after. Every model in our Pro mix has a public per-token price and an independent benchmark score on high-trust third-party surfaces that have no financial stake in llmdeal.me.

If our Pro mix isn't on those third-party leaderboards at the prices we quote, the refund clock applies (see above).

When does the A6000 EU node actually deploy after the $1k threshold hits?

Within 7 days of the public counter crossing $1k. We announce the milestone publicly the moment the threshold is crossed; the rented A6000 spins up shortly after.

Honest caveat: we also apply a small internal safety buffer (slightly above the public threshold + a check that real customers, not one whale, drove the number) before we wire up the A6000. This is so a single $500 Founder buying then asking for a refund doesn't trigger a month of GPU rental we can't sustain. The public milestone still flips green the moment the public threshold is met.
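The difference between the public milestone and the internal deploy decision can be sketched as two separate checks. Only the $1k threshold comes from this page; the 10% buffer and the 50% cap on any single backer's share are invented placeholder values, since the actual internal numbers are not published.

```python
PUBLIC_THRESHOLD_USD = 1_000  # the "$1k threshold" from the question above


def milestone_reached(contributions: list) -> bool:
    """Public counter: flips as soon as the raw total meets the threshold."""
    return sum(contributions) >= PUBLIC_THRESHOLD_USD


def should_deploy(contributions: list, buffer_ratio: float = 0.10,
                  max_single_share: float = 0.50) -> bool:
    """Internal gate: total clears the buffer and no single backer dominates.

    buffer_ratio and max_single_share are hypothetical values.
    """
    total = sum(contributions)
    if total < PUBLIC_THRESHOLD_USD * (1 + buffer_ratio):
        return False
    return max(contributions) / total <= max_single_share
```

Under these placeholder values, one large preorder can turn the public milestone green while the deploy gate still waits for broader participation.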

Is this just an OpenAI wrapper?

No. We run our own model (Qwen2.5-Coder-32B) on our own GPU. We also smart-route to OpenAI, Anthropic, Groq, Together, and DeepSeek when your query needs more horsepower than our model can give.

The point isn't to replace frontier models. The point is to not pay frontier prices for queries that don't need them.

How does smart routing work?

Each request hits a small classifier (open-source, RouteLLM-style) that scores difficulty. Easy queries (formatting, simple regex, syntax fixes) → our Qwen-Coder-32B in the EU. Fast workhorse queries → llama-3.3-70b-self-hosted on our EU GPU. Reasoning queries → DeepSeek-V3. Coding-heavy queries → Mistral Codestral. Highest-difficulty queries → Qwen3-235B on Cerebras (frontier OSS-class).
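The difficulty tiers described above can be sketched as a threshold lookup. This is a minimal illustration, not the gateway's actual router: the difficulty score is assumed to be a float between 0 and 1 produced by some classifier, the cut-off values are invented, "qwen3-235b" is a placeholder identifier, and the category-based Codestral route is left out because it depends on query type rather than difficulty.

```python
from bisect import bisect_right

# Hypothetical cut-offs on a 0..1 difficulty score (the page gives no values).
THRESHOLDS = [0.25, 0.50, 0.75]

# Cheapest tier first. "qwen3-235b" is a placeholder identifier.
MODELS = [
    "qwen-coder-32b",
    "llama-3.3-70b-self-hosted",
    "deepseek-chat",
    "qwen3-235b",
]


def route(score: float) -> str:
    """Return the model for the tier that covers the given difficulty score."""
    if not 0.0 <= score <= 1.0:
        raise ValueError("score must be between 0 and 1")
    # bisect_right counts how many cut-offs the score has reached or passed,
    # which is the index of the tier to use.
    return MODELS[bisect_right(THRESHOLDS, score)]
```

Tuning "per workload" would then amount to moving the values in THRESHOLDS: raising a cut-off sends more traffic to the cheaper tier below it.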

Direct frontier API routing (Claude, GPT-4o) unlocks at the $3,500 preorder stretch goal — once preorders fund the operator credit at Anthropic + OpenAI. Until then, you can reach Claude via the openrouter/anthropic/claude-* path (billed against our OpenRouter prepay, not against your llmdeal credit budget directly — pass-through pricing).

You can override per-request by passing any specific model name (e.g. model: "deepseek-chat") directly. The router only kicks in when you ask for model: "smart-route".
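The override rule can be sketched in a few lines. The "smart-route" name comes from this page; the resolve_model helper and the router callable are hypothetical stand-ins for whatever the gateway does internally.

```python
from typing import Callable

SMART_ROUTE = "smart-route"


def resolve_model(requested: str, prompt: str, router: Callable[[str], str]) -> str:
    """Pick the model that will serve a request.

    The router is consulted only when the caller asks for "smart-route";
    any other model name is passed through unchanged.
    """
    if requested == SMART_ROUTE:
        return router(prompt)
    return requested
```

For example, resolve_model("deepseek-chat", prompt, router) never calls the router, so pinning a model is a per-request decision that needs no account-level setting.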

What about my data privacy?

Short version: contact handle + order record stored. Prompts and responses are not retained. Full breakdown in the Privacy & data section above.

Our own model runs in the EU — GDPR jurisdiction. We don't log prompt content. We keep token counts for billing only.

When you route to a third-party provider (OpenAI, Anthropic, Groq), the request goes to them under their data policy, exactly as if you called them directly with your own key. Both Anthropic and OpenAI state that they do not train on API data by default.

The Elite tier pins routing to EU-only by default: your requests never leave the EU even when escalated to frontier models. US-hosted models are not in the default Elite pool; they're provisioned on demand, first come, first served, tailored per end user, and only when a customer explicitly opts in.

Can I use llmdeal.me if I'm not in the EU?

Yes. We serve developers worldwide. The "EU residency" piece is about where our infrastructure runs (EEA GPU + EEA-based operator) — not about who can use the service. No geographic restrictions on signup.

We apply GDPR-level handling to every customer regardless of where you live, because doing it differently per-jurisdiction is operational overhead we don't want.

Why crypto-only?

Three reasons. One: credit cards require KYC and we don't want to ask. Two: chargebacks on usage-based products are a nightmare. Three: we genuinely think devs paying for an API shouldn't have to identify themselves.

We accept BTC (auto-checkout), XMR and LTC (semi-manual — DM us, we send a one-time address, you pay, we credit your account within 1-4 hours).

Is this open source?

The router classifier and gateway code will be open-source once the gateway is stable post-launch (target Mon 18 May 2026). The marketing site and inference stack are private.

Our underlying model (Qwen2.5-Coder-32B) is Apache 2.0 — Alibaba's open release. We didn't train it; we serve it.

Who's behind this?

An EEA-based independent operator on owned bare-metal infrastructure. No VC, no team, no roadmap deck. Pricing is honest because the cost structure is honest.

When does it actually launch?

Public launch Monday 18 May 2026 (GMT+2). Preorders open right now — every BTC backer locks in +30% bonus credits and a beta key delivered on launch day.
