Setup · 2 minutes · 3 lines

13 tools, one llmdeal key.
Pick yours.

IDEs, agent CLIs, chat UIs. Anything that speaks OpenAI's /v1/chat/completions works. Copy a snippet, paste your key, keep the editor you already know.

IDEs

Cursor PearAI Zed

VS Code

Continue Cline Roo Code Codeium Cody

Agent CLIs

Aider opencode OpenHands Open Interpreter

Chat UIs & servers

Open WebUI LibreChat Tabby

Cursor

VS Code fork with built-in AI chat, autocomplete and agent mode. Used by more than 1M developers. Cursor's Settings > Models panel has an Override OpenAI Base URL toggle, and that is the door llmdeal walks through. Note: Ask & Plan modes accept custom keys; Agent mode currently does not.

1 Get your llmdeal key

Cursor's "OpenAI API Key" field expects a key that starts with sk-. llmdeal mints keys starting with lld_ — but our Cursor-mode endpoint accepts the same key as a bearer token. Grab your key from the dashboard or spin up a free demo key with one click.

Buy a key →

2 Configure Cursor

In Cursor open Settings → Cursor Settings → Models. Toggle on Override OpenAI Base URL, paste:

cursor settings · base url

https://api.llmdeal.me/cursor

Paste your llmdeal key into the OpenAI API Key field, then click Verify. Cursor will ping our endpoint and check the key is live.

cursor settings · api key field

lld_YOUR_KEY_HERE

Finally hit + Add custom model and type a model name — we recommend the smart-route alias below. Turn the toggle on so Cursor surfaces it in the model picker.

3 Test it

From a terminal, confirm the endpoint is reachable with your key:

curl · health check

curl https://api.llmdeal.me/v1/chat/completions \
  -H "Authorization: Bearer lld_YOUR_KEY_HERE" \
  -H "Content-Type: application/json" \
  -d '{"model":"smart-route-coder","messages":[{"role":"user","content":"say hi in 3 words"}]}'

You should see a JSON response with a choices[0].message.content field. If you do, Cursor will work too.
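
If you script this check, the field to look for can be pulled out as below. This is a minimal sketch, assuming the response follows the usual OpenAI chat-completions shape; the helper name extract_reply and the sample body are hypothetical, not llmdeal output.

```python
import json


def extract_reply(raw_body: str) -> str:
    """Return choices[0].message.content from a chat-completions JSON body."""
    payload = json.loads(raw_body)
    choices = payload.get("choices") or []
    if not choices:
        # Error responses typically carry an "error" object instead of choices.
        raise ValueError(f"no choices in response: {payload.get('error', payload)}")
    return choices[0]["message"]["content"]


# Hypothetical sample body, shaped like an OpenAI-style chat completion.
sample = json.dumps({
    "choices": [
        {"index": 0, "message": {"role": "assistant", "content": "Hi there, friend"}}
    ]
})
print(extract_reply(sample))  # prints: Hi there, friend
```

Pipe the curl output into a script like this to get a clear error when the body contains no choices.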

Recommended alias: for Cursor's Ask and Plan modes use smart-route-coder, which does code-aware routing into Qwen3-Coder & GPT-OSS-120B. For long free-form prose chats, try smart-route.

Troubleshooting

Getting a 401 Unauthorized?: Cursor sends the bearer token as-is. Your key is the lld_… string — paste it raw, no sk- prefix needed. If you previously had OpenAI configured, remove that key from the field first.
Verify button greyed out?: The Override OpenAI Base URL toggle must be ON before Verify will fire. Also make sure the URL ends with /cursor (not /v1) — Cursor's pre-flight is path-sensitive.
Model not appearing in the picker?: You need to + Add custom model with the exact name (e.g. smart-route-coder) and flip its toggle on. Cursor only shows models you've explicitly enabled.
Working in Ask, broken in Agent mode?: Known Cursor limitation: Agent mode currently routes through Cursor's first-party infra and ignores custom OpenAI keys. Use Ask (Cmd+L) or Plan mode with llmdeal.
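
Two of the checks above are mechanical and can be run before you open Cursor. A minimal sketch, assuming only what this section states (the base URL path ends in /cursor and the key is the raw lld_ string); check_cursor_settings is a hypothetical helper, not part of Cursor or llmdeal.

```python
from urllib.parse import urlparse


def check_cursor_settings(base_url: str, api_key: str) -> list[str]:
    """Return a list of human-readable problems; an empty list means the values look sane."""
    problems = []
    path = urlparse(base_url).path.rstrip("/")
    if not path.endswith("/cursor"):
        problems.append(f"base URL should end with /cursor, got path {path or '/'}")
    if api_key != api_key.strip():
        problems.append("key has leading or trailing whitespace")
    if not api_key.strip().startswith("lld_"):
        problems.append("key should be the raw lld_ string, with no sk- prefix")
    return problems


print(check_cursor_settings("https://api.llmdeal.me/cursor", "lld_YOUR_KEY_HERE"))  # prints: []
```

An empty list only means the values are well-formed; Cursor's Verify button is still the real test.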

Cursor's UI labels change between versions. These steps target Cursor ≥ 0.42. If a label has moved, look for "Override OpenAI Base URL" in Cursor Settings → Models — verify against the latest Cursor docs if anything looks off.

Continue.dev

Open-source AI assistant for VS Code & JetBrains. ~25k GitHub stars, fully model-agnostic, config lives in ~/.continue/config.yaml. Continue speaks the OpenAI-compatible API natively: point apiBase at llmdeal and you're done.

1 Get your llmdeal key

Continue's apiKey field takes any string — your lld_… key works as-is.

Buy a key →

2 Edit your config file

Open ~/.continue/config.yaml (Continue creates this on first launch — if you don't have it, install the extension and open it once). Append a new model block:

~/.continue/config.yaml

# ~/.continue/config.yaml — llmdeal block
name: llmdeal
version: 0.0.1
schema: v1

models:
  - name: llmdeal smart-route
    provider: openai
    model: smart-route
    apiBase: https://api.llmdeal.me/v1
    apiKey: lld_YOUR_KEY_HERE
    roles: [chat, edit, apply]

Save the file. Continue reloads its config on save — no extension restart needed. The new model appears in Continue's bottom-bar model picker.

3 Test it

One-shot smoke test before you wire it into your editor:

curl · smoke test

curl https://api.llmdeal.me/v1/chat/completions \
  -H "Authorization: Bearer lld_YOUR_KEY_HERE" \
  -H "Content-Type: application/json" \
  -d '{"model":"smart-route","messages":[{"role":"user","content":"hello from continue"}]}'

Recommended alias: smart-route for everyday chat & edits, balancing cost against quality. Add a second entry pointing at smart-route-coder for inline edits on large diffs. Continue lets you keep both and switch from the model picker.

Troubleshooting

Continue can't see the new model?: Make sure the YAML is valid (no tabs, 2-space indent). Continue silently drops unparseable blocks — check Output → Continue in VS Code for parse errors.
401 errors in the chat view?: The apiKey value is sent as-is in the Authorization: Bearer … header. Don't quote it twice or wrap it in ${…} unless you've defined that env variable in Continue's secrets block.
"Model not found" from the server?: The model: value must match an llmdeal alias exactly — case-sensitive. Use smart-route, smart-route-coder, or smart-route-fast. See /docs.html for the full list.
Autocomplete isn't using llmdeal?: In config.yaml, autocomplete is served by whichever model lists the autocomplete role (the older config.json format used a separate tabAutocompleteModel block). Either add an llmdeal entry with roles: [autocomplete], or let Continue keep its default for low-latency completions and use llmdeal for chat only.
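
The whitespace rules mentioned above (no tabs, 2-space indent) can be checked before Continue ever loads the file. This is a rough sketch and not a YAML parser: it only inspects leading whitespace, and lint_yaml_whitespace is a hypothetical helper name.

```python
def lint_yaml_whitespace(text: str) -> list[str]:
    """Flag tab indentation and indents that are not a multiple of two spaces."""
    problems = []
    for number, line in enumerate(text.splitlines(), start=1):
        if not line.strip() or line.lstrip().startswith("#"):
            continue  # blank lines and comments cannot break the structure
        indent = line[: len(line) - len(line.lstrip())]
        if "\t" in indent:
            problems.append(f"line {number}: tab in indentation")
        elif len(indent) % 2:
            problems.append(f"line {number}: indent of {len(indent)} spaces is not a multiple of 2")
    return problems


good = "models:\n  - name: llmdeal smart-route\n    provider: openai\n"
bad = "models:\n\t- name: llmdeal smart-route\n   provider: openai\n"
print(lint_yaml_whitespace(good))  # prints: []
print(lint_yaml_whitespace(bad))
```

To check your own file, pass it the contents of ~/.continue/config.yaml. Parse errors that are not about whitespace still have to be read from Output → Continue.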

Aider

CLI-first AI pair-programmer that edits files via git. Beloved by the terminal-purist crowd. Aider's OpenAI-compatible mode is a two-env-var switch — no config file required.

1 Get your llmdeal key

Aider reads OPENAI_API_KEY from your environment. Your lld_… key drops in unchanged.

Buy a key →

2 Export env vars & launch

Mac/Linux — drop this in ~/.zshrc or ~/.bashrc:

shell · env vars

export OPENAI_API_BASE=https://api.llmdeal.me/v1
export OPENAI_API_KEY=lld_YOUR_KEY_HERE

Then launch aider in your project root with the openai/ prefix on the model name:

shell · launch aider

aider --model openai/smart-route-coder

Prefer a config file? Drop this into .aider.conf.yml at your repo root or ~/.aider.conf.yml for a global default:

~/.aider.conf.yml

model: openai/smart-route-coder
openai-api-base: https://api.llmdeal.me/v1
openai-api-key: lld_YOUR_KEY_HERE

3 Test it

Round-trip from the same shell aider will run in:

curl · smoke test

curl "$OPENAI_API_BASE/chat/completions" \
  -H "Authorization: Bearer $OPENAI_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"model":"smart-route-coder","messages":[{"role":"user","content":"hi"}]}'
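
The same round trip can be sketched in Python, reading the two environment variables exactly as exported above. This is illustrative only: build_smoke_request and send_smoke_request are hypothetical helpers, and the response shape is assumed to be the standard OpenAI one.

```python
import json
import os
import urllib.request


def build_smoke_request(model: str = "smart-route-coder") -> urllib.request.Request:
    """Build (but do not send) the chat-completions request from OPENAI_* env vars."""
    base = os.environ["OPENAI_API_BASE"].rstrip("/")
    key = os.environ["OPENAI_API_KEY"]
    body = json.dumps(
        {"model": model, "messages": [{"role": "user", "content": "hi"}]}
    ).encode("utf-8")
    return urllib.request.Request(
        f"{base}/chat/completions",
        data=body,
        headers={"Authorization": f"Bearer {key}", "Content-Type": "application/json"},
        method="POST",
    )


def send_smoke_request() -> str:
    """Send the request; needs network access and a valid key."""
    with urllib.request.urlopen(build_smoke_request(), timeout=30) as response:
        return json.load(response)["choices"][0]["message"]["content"]
```

Calling send_smoke_request() from the shell where the variables are exported confirms that aider will see the same values.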

Recommended alias: smart-route-coder. Aider's diff-edit format leans heavily on instruction-following; the coder route tunes for that and is dramatically cheaper than Claude Sonnet for multi-file refactors.

Troubleshooting

"Unknown model" from aider's launcher?: Aider routes via LiteLLM under the hood. Always prefix with openai/ — e.g. openai/smart-route-coder, not bare smart-route-coder.
Env vars don't seem to take effect?: Open a new shell or run source ~/.zshrc. Aider reads env once at launch — restart it after any export.
Hitting rate limits on a large refactor?: Add --map-tokens 0 to disable the repo map for very large repos, or split work into multiple smaller sessions. Our tiered limits reset on a sliding window — check /dashboard.
Aider can't apply diffs cleanly?: Try --edit-format diff-fenced — the coder route plays better with explicit fenced diffs than with whole-file rewrites.

Zed

Rust-native, GPU-accelerated editor with built-in AI assistant. Zed's openai_compatible language-model schema is a first-class config target — just add a provider entry pointing at api.llmdeal.me.

1 Get your llmdeal key

Zed reads custom-provider API keys from an environment variable named <PROVIDER_NAME>_API_KEY — for the "llmdeal" provider below, that means LLMDEAL_API_KEY.

Buy a key →

2 Edit Zed settings

Open ~/.config/zed/settings.json (or Cmd+, in Zed). Add or merge this language_models block:

~/.config/zed/settings.json

{
  "language_models": {
    "openai_compatible": {
      "llmdeal": {
        "api_url": "https://api.llmdeal.me/v1",
        "available_models": [
          {
            "name": "smart-route-fast",
            "display_name": "llmdeal · smart-route-fast",
            "max_tokens": 32768,
            "capabilities": {
              "tools": true,
              "images": false,
              "parallel_tool_calls": false,
              "prompt_cache_key": false
            }
          },
          {
            "name": "smart-route-coder",
            "display_name": "llmdeal · coder",
            "max_tokens": 32768,
            "capabilities": {
              "tools": true,
              "images": false,
              "parallel_tool_calls": false,
              "prompt_cache_key": false
            }
          }
        ]
      }
    }
  }
}

Then export the key into the environment Zed is launched from:

shell · env var

export LLMDEAL_API_KEY=lld_YOUR_KEY_HERE

On macOS, if you launch Zed from the Dock, set the variable with launchctl setenv LLMDEAL_API_KEY lld_… or relaunch Zed from the terminal so the env propagates.

3 Test it

Same smoke test pattern as the other tools:

curl · smoke test

curl https://api.llmdeal.me/v1/chat/completions \
  -H "Authorization: Bearer $LLMDEAL_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"model":"smart-route-fast","messages":[{"role":"user","content":"zed pings llmdeal"}]}'

Open Zed's Assistant panel (Cmd+?) and pick "llmdeal · smart-route-fast" from the model dropdown.

Recommended alias: smart-route-fast. Zed's UX is latency-sensitive, and the fast alias routes through Groq / Cerebras for sub-second first tokens. Add smart-route-coder as a second entry for heavier inline edits.

Troubleshooting

"Provider not found" when picking a model?: Zed reloads settings on save but sometimes caches the model list. Restart Zed once after editing settings.json.
401 errors despite a valid key?: Confirm the env var name matches the provider key: the provider "llmdeal" requires LLMDEAL_API_KEY (uppercase, underscored). Name mismatches are the most common cause.
Tools / function-calling not firing?: "tools": true must be set in capabilities. The smart-route aliases support tool calls — leave this on.
Settings file won't parse?: Zed's settings.json supports comments and trailing commas, but merging into an existing file is finicky — paste the language_models block at the top level, not nested inside "editor" or another section.
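
The naming rule for the key variable (<PROVIDER_NAME>_API_KEY, uppercase and underscored) can be sketched as below. How Zed treats characters other than letters and digits is an assumption here, and both helper names are hypothetical.

```python
import os
import re


def provider_env_var(provider_name: str) -> str:
    """Derive the env var name for a provider, e.g. llmdeal -> LLMDEAL_API_KEY."""
    # Assumption: anything that is not a letter or digit becomes an underscore.
    normalized = re.sub(r"[^A-Za-z0-9]+", "_", provider_name).strip("_")
    return normalized.upper() + "_API_KEY"


def key_is_visible(provider_name: str, environ=os.environ) -> bool:
    """True if the derived variable is set and non-empty in the given environment."""
    return bool(environ.get(provider_env_var(provider_name), "").strip())


print(provider_env_var("llmdeal"))  # prints: LLMDEAL_API_KEY
```

Run key_is_visible("llmdeal") from the terminal you launch Zed from to see whether the variable reaches that environment.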

Zed's openai_compatible schema is stable as of late 2025 but Zed iterates fast. Verify against the latest Zed docs if a field name has shifted.

PearAI

Open-source, Cursor-style VS Code fork (Apache 2.0) with a forked Continue.dev bundled in. It is configured through Continue's config.json file pattern, so anything Continue's config.json accepts, PearAI accepts.

1 Get your llmdeal key

PearAI passes the API key straight to the upstream OpenAI-compatible endpoint, so any lld_… key works.

Buy a key →

2 Edit PearAI config

Open the command palette (Cmd+Shift+P) and run PearAI: Open config.json. Add llmdeal to the models array:

~/.pearai/config.json

{
  "models": [
    {
      "title": "llmdeal smart-route-coder",
      "provider": "openai",
      "model": "smart-route-coder",
      "apiBase": "https://api.llmdeal.me/v1",
      "apiKey": "lld_YOUR_KEY_HERE"
    }
  ]
}

Save the file. PearAI hot-reloads the config; the new model appears in the model dropdown.

3 Test it

Smoke test before opening the chat panel:

curl · smoke test

curl https://api.llmdeal.me/v1/chat/completions \
  -H "Authorization: Bearer lld_YOUR_KEY_HERE" \
  -H "Content-Type: application/json" \
  -d '{"model":"smart-route-coder","messages":[{"role":"user","content":"ping"}]}'

Recommended alias: smart-route-coder for diff edits and chat; PearAI's inline edit UX assumes a tool-aware model.

Troubleshooting

Model not appearing in the picker?: PearAI silently drops malformed JSON. Use PearAI: View Logs to spot the parse error; trailing commas and unquoted keys are the usual culprits.
"Unauthorized" from the chat?: The apiKey goes verbatim in Authorization: Bearer …. Strip any leading sk- wrapper if you migrated from an OpenAI config.
Autocomplete still hitting PearAI's hosted model?: Tab-autocomplete uses a separate tabAutocompleteModel key. Copy your model block there too if you want llmdeal handling completions.
Slow first response?: PearAI pre-flights with a 1-token request. The first call to a cold route can take 2-3s; subsequent calls are warm.
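
Since malformed JSON is dropped silently, it can help to locate the parse error yourself before reloading. A minimal sketch using Python's strict json module, which rejects trailing commas and unquoted keys; find_json_error is a hypothetical helper, and PearAI's own parser may differ in what it tolerates.

```python
import json


def find_json_error(text: str):
    """Return 'line N, column M: message' for the first parse error, or None if valid."""
    try:
        json.loads(text)
    except json.JSONDecodeError as error:
        return f"line {error.lineno}, column {error.colno}: {error.msg}"
    return None


print(find_json_error('{"models": []}'))   # prints: None
print(find_json_error('{"models": [],}'))  # reports the trailing comma on line 1
```

Feed it the contents of your config.json; the exact wording of the message varies between Python versions.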

PearAI's config schema tracks Continue.dev's; check upstream docs if the field names look stale.

Cline

Agentic VS Code extension that reads/writes files, runs commands, and uses the browser. ~30k GitHub stars. Has a first-class "OpenAI Compatible" provider in its settings UI.

1 Install Cline + get a key

Install the Cline extension from the VS Code marketplace. Then mint an llmdeal key:

Buy a key →

2 Configure provider

Open the Cline sidebar, click the settings gear, set API Provider to OpenAI Compatible, then fill these fields:

cline settings · base url

https://api.llmdeal.me/v1

cline settings · api key

lld_YOUR_KEY_HERE

cline settings · model id

smart-route-coder

Turn Image Support off; leave Computer Use off unless you've enabled the corresponding upstream tool. Save.

3 Test it

Open the Cline panel and type "list files in cwd". If the agent runs and returns, you're wired. Or smoke-test the endpoint directly:

curl · smoke test

curl https://api.llmdeal.me/v1/chat/completions \
  -H "Authorization: Bearer lld_YOUR_KEY_HERE" \
  -H "Content-Type: application/json" \
  -d '{"model":"smart-route-coder","messages":[{"role":"user","content":"hi"}]}'

Recommended alias: smart-route-coder. Cline burns tokens fast on multi-step agents; the coder route's pricing makes Cline economical in a way OpenAI direct isn't.

Troubleshooting

Cline asks for a model context size?: Set Context Window to 128000. The smart-route coder backends accept up to 128k.
Agent loops endlessly on tool calls?: Cline's parallel tool-call setting can confuse stricter routes. Open settings, disable Parallel Tool Use, retry.
Rate-limited mid-task?: Cline retries automatically with backoff. If you see persistent 429s, upgrade tier on /dashboard or split the task into smaller agent runs.
Token cost showing $0 in the sidebar?: That's expected; Cline only knows OpenAI's published pricing. Real spend is tracked on /dashboard.
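
If you drive the endpoint from your own scripts and hit 429s, a capped exponential backoff is the usual pattern. The sketch below is generic and does not reproduce Cline's retry logic; the base delay, cap, and jitter values are assumptions chosen for illustration.

```python
import random


def backoff_delays(attempts: int, base: float = 1.0, cap: float = 30.0,
                   jitter: float = 0.0) -> list[float]:
    """Delays in seconds before each retry: base * 2**n, capped, plus optional jitter."""
    delays = []
    for attempt in range(attempts):
        delay = min(cap, base * 2 ** attempt)
        delays.append(delay + random.uniform(0, jitter))
    return delays


print(backoff_delays(6))  # prints: [1.0, 2.0, 4.0, 8.0, 16.0, 30.0]
```

A small non-zero jitter spreads retries out when several agent runs are rate-limited at the same moment.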

Roo Code

Cline fork with multi-mode agents (Architect, Code, Ask, Debug). Same OpenAI-compatible provider, more granular per-mode model assignments. Settings UI is near-identical to Cline.

1 Install Roo Code + key

Install Roo Code from the VS Code marketplace (formerly Roo Cline).

Buy a key →

2 Configure provider

Open the Roo sidebar, click the settings icon, and pick OpenAI Compatible as the API Provider:

roo · base url

https://api.llmdeal.me/v1

roo · api key

lld_YOUR_KEY_HERE

For per-mode tuning, assign different llmdeal aliases per mode in Modes → Configure:

roo · per-mode model recommendations

# Mode → llmdeal alias
Architect  → smart-route
Code       → smart-route-coder
Ask        → smart-route-fast
Debug      → smart-route-coder

3 Test it

Switch Roo into Code mode, ask it to write a hello-world script. If the file edits land, the provider is wired.

curl · smoke test

curl https://api.llmdeal.me/v1/chat/completions \
  -H "Authorization: Bearer lld_YOUR_KEY_HERE" \
  -H "Content-Type: application/json" \
  -d '{"model":"smart-route-coder","messages":[{"role":"user","content":"hi"}]}'

Recommended aliases: mix per mode, with smart-route-coder in Code/Debug, smart-route in Architect for planning, and smart-route-fast in Ask for snappy lookups.
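
If you script against llmdeal alongside Roo, the same mode-to-alias mapping can be kept as a small lookup table. This is only an illustration of the mapping above; Roo stores these assignments in its own settings UI, and the fallback to smart-route for unknown modes is an assumption.

```python
# Mode -> llmdeal alias, copied from the per-mode recommendations above.
MODE_ALIASES = {
    "Architect": "smart-route",
    "Code": "smart-route-coder",
    "Ask": "smart-route-fast",
    "Debug": "smart-route-coder",
}


def alias_for_mode(mode: str, default: str = "smart-route") -> str:
    """Look up the alias for a mode name, ignoring case and surrounding whitespace."""
    return MODE_ALIASES.get(mode.strip().title(), default)


print(alias_for_mode("code"))  # prints: smart-route-coder
```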

Troubleshooting

Per-mode model not respected?: Click Save after each mode edit; Roo's UI doesn't auto-persist between mode tabs.
"Tools unsupported by model"?: All smart-route aliases support tools. Re-enter the model id (no trailing whitespace) and reload Roo's webview.
Context window warning?: Set Context Window Size to 128000. Roo defaults to the OpenAI gpt-4 value (8k), which truncates aggressively.
Roo and Cline both installed?: Disable one; both register the same VS Code shortcuts and fight over the chat panel.

opencode

Terminal-native AI coding agent from sst (the SST framework folks). Single binary, no editor lock-in. Provider config lives in ~/.config/opencode/config.json.

1 Install + get a key

Install with the one-liner: curl -fsSL https://opencode.ai/install | bash. Then mint a key:

Buy a key →

2 Add a provider

Create or edit ~/.config/opencode/config.json:

~/.config/opencode/config.json

{
  "$schema": "https://opencode.ai/config.json",
  "provider": {
    "llmdeal": {
      "npm": "@ai-sdk/openai-compatible",
      "options": {
        "baseURL": "https://api.llmdeal.me/v1",
        "apiKey": "lld_YOUR_KEY_HERE"
      },
      "models": {
        "opencode-go/deepseek-v4-flash": { "name": "llmdeal · DeepSeek V4 (default)" },
        "opencode-go/deepseek-v4-pro":   { "name": "llmdeal · DeepSeek V4-Pro reasoner" },
        "opencode-go/gemini-3.5-flash": { "name": "llmdeal · Gemini 3.5 Flash" },
        "smart-route-coder":           { "name": "llmdeal coder (free fallback)" },
        "smart-route":                 { "name": "llmdeal default (free fallback)" }
      }
    }
  }
}

Run opencode in your repo. Switch model via / → model → pick llmdeal/smart-route-coder.

3 Test it

Same shape as the other tools; opencode is OpenAI-compatible under the hood:

curl · smoke test

curl https://api.llmdeal.me/v1/chat/completions \
  -H "Authorization: Bearer lld_YOUR_KEY_HERE" \
  -H "Content-Type: application/json" \
  -d '{"model":"smart-route-coder","messages":[{"role":"user","content":"hi"}]}'

Recommended alias: opencode-go/deepseek-v4-flash for general coding (frontier-class DeepSeek V4, passthrough $0.14/$0.28 per 1M input/output tokens). Drop to smart-route-coder for the free-tier fallback. Use opencode-go/deepseek-v4-pro when you need the reasoner.

Opencode Go subscription — $19/mo
Built for the opencode CLI. Includes 8M tokens/mo across the opencode-go/* aliases (DeepSeek V4 Flash & Pro, Gemini 3.5 Flash) plus unmetered free-tier smart-route-* fallbacks. Drop-in config above. Get the Opencode Go plan →

opencode-go pricing

Alias	Input / 1M	Output / 1M	Upstream
`opencode-go/deepseek-v4-flash`	$0.14	$0.28	DeepSeek chat
`opencode-go/deepseek-v4-pro`	$0.435	$0.87	DeepSeek reasoner (75% price reduction, permanent as of 2026-05-22)
`opencode-go/gemini-3.5-flash`	$1.50	$9.00	Google Gemini 3.5 Flash (Day-0 in LiteLLM 1.86.2)
`smart-route-coder` + `smart-route` + `smart-route-fast`	$0	$0	Free-tier fallback pool (Codestral, Qwen3 Coder 480B, Llama, Gemini Flash)

All prices are passthrough: llmdeal adds no markup to the per-token rate. The Opencode Go subscription at $19/mo covers 8M tokens/mo across these aliases; overage is billed at the rates above when you top up with a PAYG pack.
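
To estimate what a session costs at the per-token rates in the table, multiply token counts by the per-1M prices. A minimal sketch with the rates copied from the table above; it ignores the 8M tokens included in the subscription, and cost_usd is a hypothetical helper.

```python
# (input, output) price in USD per 1M tokens, copied from the pricing table.
RATES = {
    "opencode-go/deepseek-v4-flash": (0.14, 0.28),
    "opencode-go/deepseek-v4-pro": (0.435, 0.87),
    "opencode-go/gemini-3.5-flash": (1.50, 9.00),
    "smart-route-coder": (0.0, 0.0),
    "smart-route": (0.0, 0.0),
    "smart-route-fast": (0.0, 0.0),
}


def cost_usd(alias: str, input_tokens: int, output_tokens: int) -> float:
    """Pay-as-you-go cost of one workload at the listed per-1M rates."""
    rate_in, rate_out = RATES[alias]
    return (input_tokens * rate_in + output_tokens * rate_out) / 1_000_000


# 2M input tokens and 500k output tokens on the flash alias.
print(f"${cost_usd('opencode-go/deepseek-v4-flash', 2_000_000, 500_000):.2f}")  # prints: $0.42
```

The same workload on opencode-go/gemini-3.5-flash comes to $7.50 (2 × $1.50 + 0.5 × $9.00), so output-heavy sessions are where the alias choice matters most.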

Troubleshooting

"Provider not found" on launch?: opencode lazy-installs the npm provider on first run; first launch can take 20s while @ai-sdk/openai-compatible installs. Re-run.
Model picker empty?: The models keys must match real llmdeal aliases. Typos here silently drop entries.
Tool calls failing mid-task?: Run with opencode --print-logs in a second terminal to see the actual request/response and find the schema mismatch.
Want to skip the config file?: Set OPENCODE_PROVIDER=llmdeal OPENAI_BASE_URL=https://api.llmdeal.me/v1 OPENAI_API_KEY=lld_… opencode.

opencode's config schema is still iterating; verify the latest provider block on the repo README if the npm provider name has changed.

OpenHands

Autonomous software engineer agent (formerly OpenDevin). Runs in Docker, drives a sandboxed environment, opens its own browser. Uses LiteLLM under the hood, so any OpenAI-compatible endpoint works.

1 Run OpenHands + get a key

Pull and start the runtime container per the official quickstart. Mint a key:

Buy a key →

2 Configure the LLM

In the OpenHands settings panel, set the LLM provider to Custom, then fill:

openhands · ui settings

Custom Model: openai/smart-route-coder
Base URL:     https://api.llmdeal.me/v1
API Key:      lld_YOUR_KEY_HERE

Or, if you launch OpenHands via CLI, drop these into config.toml:

./config.toml

[llm]
model = "openai/smart-route-coder"
base_url = "https://api.llmdeal.me/v1"
api_key = "lld_YOUR_KEY_HERE"

3 Test it

Before kicking off a long agent task, smoke-test the endpoint from inside the OpenHands container shell:

curl · smoke test

curl https://api.llmdeal.me/v1/chat/completions \
  -H "Authorization: Bearer lld_YOUR_KEY_HERE" \
  -H "Content-Type: application/json" \
  -d '{"model":"smart-route-coder","messages":[{"role":"user","content":"ready?"}]}'

Recommended alias: smart-route-coder, prefixed with openai/. OpenHands routes through LiteLLM, which uses the provider prefix to pick the transport.
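
The prefix convention can be pictured as a split on the first slash: the part before it names the provider, the rest is the model. The sketch below only illustrates that naming scheme and is not LiteLLM's code; both helper names are hypothetical.

```python
def with_openai_prefix(alias: str) -> str:
    """Add the openai/ prefix unless it is already there."""
    return alias if alias.startswith("openai/") else f"openai/{alias}"


def split_model_id(model_id: str) -> tuple[str, str]:
    """Split 'provider/model' on the first slash; reject bare model names."""
    provider, separator, model = model_id.partition("/")
    if not separator or not provider or not model:
        raise ValueError(f"expected provider/model, got {model_id!r}")
    return provider, model


print(with_openai_prefix("smart-route-coder"))      # prints: openai/smart-route-coder
print(split_model_id("openai/smart-route-coder"))   # prints: ('openai', 'smart-route-coder')
```

Using partition rather than split keeps aliases that contain a slash of their own intact after the provider prefix.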

Troubleshooting

"LiteLLM completion error"?: The model id must include the openai/ prefix. openai/smart-route-coder, not bare.
Container can't reach api.llmdeal.me?: OpenHands runs the agent in a sandboxed container with restricted egress. Verify the host's outbound to api.llmdeal.me:443 is open and the container inherits DNS.
Agent stops mid-task with "context window exceeded"?: Lower max_input_tokens in config.toml (try 96000). OpenHands buffers history aggressively.
Budget runaway on long sessions?: Set max_iterations and a per-session budget cap in OpenHands settings. The agent will retry tool calls indefinitely otherwise.

Open Interpreter

Local "code interpreter" CLI; executes natural-language tasks by running Python/shell on your machine. Configured via env vars or ~/.config/open-interpreter/profiles/default.yaml.

1 Install + key

Install with pip install open-interpreter. Then mint a key:

Buy a key →

2 Point Open Interpreter at llmdeal

Env-var approach is simplest:

shell · env

export OPENAI_API_BASE=https://api.llmdeal.me/v1
export OPENAI_API_KEY=lld_YOUR_KEY_HERE
interpreter --model openai/smart-route-coder

For a persistent profile, edit ~/.config/open-interpreter/profiles/default.yaml:

default.yaml

llm:
  model: openai/smart-route-coder
  api_base: https://api.llmdeal.me/v1
  api_key: lld_YOUR_KEY_HERE
  context_window: 128000
  max_tokens: 4096

3 Test it

Run a no-op command to confirm wiring:

shell

interpreter --model openai/smart-route-coder \
  --auto-run "print the current python version"

Recommended alias: openai/smart-route-coder for task work; openai/smart-route-fast if you want snappier "what's this command" lookups.

Troubleshooting

"Model not found" from LiteLLM?: Open Interpreter uses LiteLLM. The openai/ prefix is mandatory; without it LiteLLM tries to look up the model in its anthropic/openai registry and fails.
It keeps asking for confirmation?: Pass --auto-run (or -y) to skip per-command confirmation. Use with care; this thing runs anything.
Streaming output stutters?: Try a fast route: --model openai/smart-route-fast. The coder route is throughput-tuned, not latency.
Context overflowing on long sessions?: Lower context_window in default.yaml to 64000; Open Interpreter keeps full history by default.

Open WebUI

Self-hosted ChatGPT-style web UI; ships with first-class OpenAI-compatible support. Add llmdeal as a connection from the admin panel; no config files needed.

1 Run Open WebUI + key

Start the container with docker run -d -p 3000:8080 -v open-webui:/app/backend/data --name open-webui ghcr.io/open-webui/open-webui:main, then mint a key:

Buy a key →

2 Add llmdeal as an OpenAI connection

Browse to http://localhost:3000, sign in as admin, then go to Admin Panel → Settings → Connections → OpenAI API. Add a new connection:

open webui · api base url

https://api.llmdeal.me/v1

open webui · api key

lld_YOUR_KEY_HERE

Click the refresh icon next to the connection; Open WebUI will fetch the model list. Save. Models appear in the chat-screen dropdown.

Headless launch with env vars also works:

docker · env injection

docker run -d -p 3000:8080 \
  -e OPENAI_API_BASE_URL=https://api.llmdeal.me/v1 \
  -e OPENAI_API_KEY=lld_YOUR_KEY_HERE \
  -v open-webui:/app/backend/data \
  --name open-webui ghcr.io/open-webui/open-webui:main

3 Test it

Pick smart-route-fast in the chat dropdown and send a message. Or curl directly:

curl · smoke test

curl https://api.llmdeal.me/v1/chat/completions \
  -H "Authorization: Bearer lld_YOUR_KEY_HERE" \
  -H "Content-Type: application/json" \
  -d '{"model":"smart-route-fast","messages":[{"role":"user","content":"hi"}]}'

Recommended alias: smart-route-fast for chat UX (low first-token latency); smart-route for longer reasoning sessions.

Troubleshooting

Model list empty after adding the connection?: Click the small refresh icon next to the connection. Open WebUI fetches /v1/models on demand, not on save.
Connection test fails with "fetch failed"?: If you're running Open WebUI in Docker against a host service, use host.docker.internal instead of localhost. For api.llmdeal.me (public), this isn't an issue.
Want to hide Ollama models?: Disable the Ollama connection in Admin → Settings → Connections; otherwise the chat dropdown is cluttered.
Files / vision uploads fail?: Most smart-route aliases are text-only. Image inputs require a vision-capable route; /docs.html lists which.
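
When the dropdown stays empty, it helps to look at what /v1/models returns. A minimal sketch, assuming the endpoint answers in the standard OpenAI list shape ({"object": "list", "data": [{"id": ...}]}); the sample body is hypothetical and model_ids is an illustrative helper.

```python
import json


def model_ids(models_body: str) -> list[str]:
    """Extract the sorted model ids from an OpenAI-style /v1/models response body."""
    payload = json.loads(models_body)
    return sorted(entry["id"] for entry in payload.get("data", []) if "id" in entry)


# Hypothetical body, shaped like an OpenAI-style model list.
sample_models = json.dumps({
    "object": "list",
    "data": [{"id": "smart-route-fast"}, {"id": "smart-route"}, {"id": "smart-route-coder"}],
})
print(model_ids(sample_models))  # prints: ['smart-route', 'smart-route-coder', 'smart-route-fast']
```

To run it against the live endpoint, fetch /v1/models with curl using your key and pass the body in. An empty list there points at the key or the endpoint, not at Open WebUI.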

LibreChat

Self-hosted ChatGPT clone with multi-user, plugins, RAG, and a strong custom-endpoint config. llmdeal slots into the custom endpoints array in librechat.yaml.

1 Install + key

Follow the LibreChat quickstart (Docker compose). Then mint a key:

Buy a key →

2 Add llmdeal as a custom endpoint

Add an llmdeal entry to your librechat.yaml under endpoints.custom:

librechat.yaml

version: 1.0.5
cache: true
endpoints:
  custom:
    - name: "llmdeal"
      apiKey: "${LLMDEAL_API_KEY}"
      baseURL: "https://api.llmdeal.me/v1"
      models:
        default: ["smart-route", "smart-route-coder", "smart-route-fast"]
        fetch: true
      titleConvo: true
      titleModel: "smart-route-fast"
      modelDisplayLabel: "llmdeal"

Then add the key to LibreChat's .env:

.env

LLMDEAL_API_KEY=lld_YOUR_KEY_HERE

Run docker compose restart api. The new llmdeal endpoint shows up in the model picker.

3 Test it

curl · smoke test

curl https://api.llmdeal.me/v1/chat/completions \
  -H "Authorization: Bearer lld_YOUR_KEY_HERE" \
  -H "Content-Type: application/json" \
  -d '{"model":"smart-route-fast","messages":[{"role":"user","content":"hi"}]}'

Recommended alias: smart-route-fast as the conversation-title model (cheap, fast); offer smart-route and smart-route-coder in the user-facing picker.

Troubleshooting

Endpoint missing after restart?: LibreChat is strict about YAML versions. Pin version: 1.0.5 (or your installed version's exact match) at the top of librechat.yaml.
"Could not parse" error in logs?: Use spaces, not tabs. The apiKey value supports ${VAR} interpolation only if the env var exists at startup; missing vars fail silently.
RAG / agents not using llmdeal?: Set RAG_OPENAI_BASEURL and RAG_OPENAI_API_KEY in .env for embeddings, separately from the chat endpoint.
Model fetching returns empty?: Set fetch: false and rely on the default array. llmdeal's /v1/models is fine, but some LibreChat versions cache aggressively.
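
Because a missing variable fails silently, it is worth checking every ${VAR} reference before restarting. A minimal sketch that scans the config text for placeholders and reports those unset in the environment; missing_env_vars is a hypothetical helper, and it does not read LibreChat's .env file for you.

```python
import os
import re

# Matches ${NAME} placeholders such as ${LLMDEAL_API_KEY}.
PLACEHOLDER = re.compile(r"\$\{([A-Za-z_][A-Za-z0-9_]*)\}")


def missing_env_vars(config_text: str, environ=os.environ) -> list[str]:
    """Return placeholder names that are unset or empty in the given environment."""
    names = sorted(set(PLACEHOLDER.findall(config_text)))
    return [name for name in names if not environ.get(name)]


snippet = 'apiKey: "${LLMDEAL_API_KEY}"'
print(missing_env_vars(snippet, {}))                               # prints: ['LLMDEAL_API_KEY']
print(missing_env_vars(snippet, {"LLMDEAL_API_KEY": "lld_TEST"}))  # prints: []
```

Pass it the contents of librechat.yaml plus a dict of the values from .env to catch typos in variable names.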

Codeium / Windsurf

Codeium's free tier and Windsurf editor are tightly bound to their hosted backend; there's no first-party support for swapping the OpenAI base URL today. We're tracking this and will document a workaround if one ships.

No official Codeium custom-endpoint setting exists at time of writing. If you need a Codeium-style autocomplete with llmdeal, use Continue.dev's autocomplete role instead.

Sourcegraph Cody

Cody Enterprise supports BYO LLM via Sourcegraph's site-admin LLM config (Sourcegraph instance acts as the gateway). The free Cody plan currently does not expose a custom OpenAI endpoint setting.

BYO LLM in Cody requires a self-hosted Sourcegraph instance and Enterprise plan. Verify against the latest Sourcegraph docs before wiring llmdeal as a backing provider. For everyday VS Code chat + autocomplete with llmdeal, prefer Continue.dev or Cline.

Tabby (self-host)

Tabby is a self-hosted code-completion server. You typically point it at a local Ollama/llama.cpp backend, not a remote API. It does support an OpenAI-compatible chat backend for the side-panel chat feature; the completion model still needs to be local.

1 Use llmdeal for chat only

In Tabby's ~/.tabby/config.toml, set the [model.chat.http] block to the openai/chat kind:

~/.tabby/config.toml

[model.chat.http]
kind = "openai/chat"
model_name = "smart-route-coder"
api_endpoint = "https://api.llmdeal.me/v1"
api_key = "lld_YOUR_KEY_HERE"

Restart Tabby. The chat side-panel will route to llmdeal. Code completion stays on your local model — that's by design; remote completion latency would be too high.

Config schema verified against Tabby 0.20+. Older versions use different section names; check upstream docs for your version.

Hosting a consortium GPU?

Have a GPU? Apply to join the consortium and earn a share of consortium-tier revenue.

Apply to host a GPU →

Already approved with a cstm_ token? The consortium setup script takes you from "machine registered, nothing running" to "serving inference to llmdeal customers" in about 10 minutes. See the companion docs for the full flow.

curl -sSf https://llmdeal.me/setup/consortium.sh -o consortium-setup.sh
chmod +x consortium-setup.sh
sudo ./consortium-setup.sh

Curious how much capacity is live right now? View the public pool status →
