13 tools, one llmdeal key.
Pick yours.
IDEs, agent CLIs, chat UIs. Anything that speaks OpenAI's
/v1/chat/completions works. Copy a snippet, paste your
key, keep the editor you already know.
Cursor
VS Code fork with built-in AI chat, autocomplete and agent mode. Used by an estimated 1M+ developers. Cursor's Settings > Models panel has an Override OpenAI Base URL toggle, and that's the door llmdeal walks through. Note: Ask & Plan modes accept custom keys; Agent mode currently does not.
Cursor's "OpenAI API Key" field expects a key that starts with
sk-. llmdeal mints keys starting with lld_
— but our Cursor-mode endpoint accepts the same key as a bearer
token. Grab your key from the dashboard
or spin up a free demo key with one click.
In Cursor open Settings → Cursor Settings → Models. Toggle on Override OpenAI Base URL, paste:
https://api.llmdeal.me/cursor
Paste your llmdeal key into the OpenAI API Key field, then click Verify. Cursor will ping our endpoint and check the key is live.
lld_YOUR_KEY_HERE
Finally hit + Add custom model and type a model name — we recommend the smart-route alias below. Turn the toggle on so Cursor surfaces it in the model picker.
From a terminal, confirm the endpoint is reachable with your key:
curl https://api.llmdeal.me/v1/chat/completions \
-H "Authorization: Bearer lld_YOUR_KEY_HERE" \
-H "Content-Type: application/json" \
-d '{"model":"smart-route-coder","messages":[{"role":"user","content":"say hi in 3 words"}]}'
You should see a JSON response with a choices[0].message.content field. If you do, Cursor will work too.
smart-route-coder: code-aware routing into Qwen3-Coder & GPT-OSS-120B. For long free-form prose chats, try smart-route.
Troubleshooting
- Getting a 401 Unauthorized?
- Cursor sends the bearer token as-is. Your key is the lld_… string: paste it raw, no sk- prefix needed. If you previously had OpenAI configured, remove that key from the field first.
- Verify button greyed out?
- The Override OpenAI Base URL toggle must be ON before Verify will fire. Also make sure the URL ends with /cursor (not /v1); Cursor's pre-flight is path-sensitive.
- Model not appearing in the picker?
- You need to + Add custom model with the exact name (e.g. smart-route-coder) and flip its toggle on. Cursor only shows models you've explicitly enabled.
- Working in Ask, broken in Agent mode?
- Known Cursor limitation: Agent mode currently routes through Cursor's first-party infra and ignores custom OpenAI keys. Use Ask (Cmd+L) or Plan mode with llmdeal.
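The key-format rules in this section (llmdeal keys start with lld_, OpenAI keys start with sk-) can be checked before you paste anything into Cursor. The following is a minimal sketch: the function name and the messages are hypothetical, and it only inspects the shape of the string, not whether the key is live.

```python
def diagnose_key(key: str) -> str:
    """Return a short hint about a pasted API key (illustrative checks only)."""
    if key != key.strip():
        return "remove leading/trailing whitespace"
    if key.startswith("sk-"):
        return "this looks like an OpenAI-style key; paste the raw lld_ key instead"
    if not key.startswith("lld_"):
        return "llmdeal keys start with lld_"
    if len(key) == len("lld_"):
        return "the key body after lld_ is empty"
    return "ok"


for candidate in ("lld_YOUR_KEY_HERE", "sk-lld_YOUR_KEY_HERE", " lld_YOUR_KEY_HERE"):
    print(repr(candidate), "->", diagnose_key(candidate))
```

A result of "ok" only means the string has the expected prefix; the curl smoke test is still the way to confirm the key is accepted.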
Continue.dev
Open-source AI assistant for VS Code & JetBrains. ~25k GitHub stars, fully model-agnostic, config lives in ~/.continue/config.yaml. Continue natively supports OpenAI-compatible endpoints: point apiBase at llmdeal and you're done.
Continue's apiKey field takes any string — your
lld_… key works as-is.
Open ~/.continue/config.yaml (Continue creates this
on first launch — if you don't have it, install the extension and
open it once). Append a new model block:
# ~/.continue/config.yaml: llmdeal block
name: llmdeal
version: 0.0.1
schema: v1
models:
  - name: llmdeal smart-route
    provider: openai
    model: smart-route
    apiBase: https://api.llmdeal.me/v1
    apiKey: lld_YOUR_KEY_HERE
    roles: [chat, edit, apply]
Save the file. Continue reloads its config on save — no extension restart needed. The new model appears in Continue's bottom-bar model picker.
One-shot smoke test before you wire it into your editor:
curl https://api.llmdeal.me/v1/chat/completions \
-H "Authorization: Bearer lld_YOUR_KEY_HERE" \
-H "Content-Type: application/json" \
-d '{"model":"smart-route","messages":[{"role":"user","content":"hello from continue"}]}'
smart-route for everyday chat & edits — balanced
cost vs quality. Add a second entry pointing at
smart-route-coder for inline edits on large diffs.
Continue lets you keep both and switch from the model picker.
Troubleshooting
- Continue can't see the new model?
- Make sure the YAML is valid (no tabs, 2-space indent). Continue silently drops unparseable blocks; check Output → Continue in VS Code for parse errors.
- 401 errors in the chat view?
- The apiKey value is sent as-is in the Authorization: Bearer … header. Don't quote it twice or wrap it in ${…} unless you've defined that env variable in Continue's secrets block.
- "Model not found" from the server?
- The model: value must match an llmdeal alias exactly (case-sensitive). Use smart-route, smart-route-coder, or smart-route-fast. See /docs.html for the full list.
- Autocomplete isn't using llmdeal?
- Autocomplete uses a separate tabAutocompleteModel block. Either add llmdeal there too (set roles: [autocomplete]), or let Continue keep its default for low-latency completions and use llmdeal for chat only.
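The "no tabs, 2-space indent" advice above can be checked mechanically before Continue ever sees the file. This is a rough sketch using only the standard library; the function name is hypothetical, and it is a heuristic on leading whitespace rather than a YAML parser, so it can flag legitimately indented block scalars.

```python
def lint_yaml_indent(text: str) -> list[str]:
    """Flag tabs in indentation and indents that are not a multiple of two spaces."""
    problems = []
    for number, line in enumerate(text.splitlines(), start=1):
        if not line.strip() or line.lstrip().startswith("#"):
            continue  # blank lines and comment lines are skipped
        indent = line[: len(line) - len(line.lstrip())]
        if "\t" in indent:
            problems.append(f"line {number}: tab in indentation")
        elif len(indent) % 2 != 0:
            problems.append(
                f"line {number}: indent of {len(indent)} spaces is not a multiple of 2"
            )
    return problems


sample = "models:\n  - name: llmdeal smart-route\n\tprovider: openai\n"
for problem in lint_yaml_indent(sample):
    print(problem)
```

To run it against a real config, read the file first, for example with pathlib.Path("~/.continue/config.yaml").expanduser().read_text().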
Aider
CLI-first AI pair-programmer that edits files via git. Beloved by the terminal-purist crowd. Aider's OpenAI-compatible mode is a two-env-var switch — no config file required.
Aider reads OPENAI_API_KEY from your environment.
Your lld_… key drops in unchanged.
Mac/Linux — drop this in ~/.zshrc or ~/.bashrc:
export OPENAI_API_BASE=https://api.llmdeal.me/v1
export OPENAI_API_KEY=lld_YOUR_KEY_HERE
Then launch aider in your project root with the openai/ prefix on the model name:
aider --model openai/smart-route-coder
Prefer a config file? Drop this into .aider.conf.yml
at your repo root or ~/.aider.conf.yml for a global
default:
model: openai/smart-route-coder
openai-api-base: https://api.llmdeal.me/v1
openai-api-key: lld_YOUR_KEY_HERE
Round-trip from the same shell aider will run in:
curl "$OPENAI_API_BASE/chat/completions" \
-H "Authorization: Bearer $OPENAI_API_KEY" \
-H "Content-Type: application/json" \
-d '{"model":"smart-route-coder","messages":[{"role":"user","content":"hi"}]}'
smart-route-coder — Aider's diff-edit format leans
heavily on instruction-following; the coder route tunes for that
and is dramatically cheaper than Claude Sonnet for multi-file
refactors.
Troubleshooting
- "Unknown model" from aider's launcher?
- Aider routes via LiteLLM under the hood. Always prefix with openai/, e.g. openai/smart-route-coder, not bare smart-route-coder.
- Env vars don't seem to take effect?
- Open a new shell or run source ~/.zshrc. Aider reads env once at launch, so restart it after any export.
- Hitting rate limits on a large refactor?
- Add --map-tokens 0 to disable the repo map for very large repos, or split work into multiple smaller sessions. Our tiered limits reset on a sliding window; check /dashboard.
- Aider can't apply diffs cleanly?
- Try --edit-format diff-fenced. The coder route plays better with explicit fenced diffs than with whole-file rewrites.
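Two of the failure modes above (a missing openai/ prefix and env vars that never reached the shell) can be caught before launching aider. A minimal sketch, assuming only the two variable names this section already uses; the function name is hypothetical and it does not contact the API.

```python
import os


def aider_preflight(model: str, env=None) -> list[str]:
    """Return a list of setup problems; an empty list means the basics look right."""
    env = os.environ if env is None else env
    problems = []
    if not model.startswith("openai/"):
        problems.append(f"model {model!r} is missing the openai/ prefix")
    for name in ("OPENAI_API_BASE", "OPENAI_API_KEY"):
        if not env.get(name):
            problems.append(f"{name} is not set in this shell")
    return problems


# Checks the current shell's environment.
for problem in aider_preflight("openai/smart-route-coder"):
    print(problem)
```

Passing a plain dict as env makes the check easy to exercise without touching the real environment.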
Zed
Rust-native, GPU-accelerated editor with built-in AI assistant.
Zed's openai_compatible language-model schema is a
first-class config target — just add a provider entry pointing at
api.llmdeal.me.
Zed reads custom-provider API keys from an environment variable
named <PROVIDER_NAME>_API_KEY — for the
"llmdeal" provider below, that means
LLMDEAL_API_KEY.
Open ~/.config/zed/settings.json (or
Cmd+, in Zed). Add or merge this language_models
block:
{
  "language_models": {
    "openai_compatible": {
      "llmdeal": {
        "api_url": "https://api.llmdeal.me/v1",
        "available_models": [
          {
            "name": "smart-route-fast",
            "display_name": "llmdeal · smart-route-fast",
            "max_tokens": 32768,
            "capabilities": {
              "tools": true,
              "images": false,
              "parallel_tool_calls": false,
              "prompt_cache_key": false
            }
          },
          {
            "name": "smart-route-coder",
            "display_name": "llmdeal · coder",
            "max_tokens": 32768,
            "capabilities": {
              "tools": true,
              "images": false,
              "parallel_tool_calls": false,
              "prompt_cache_key": false
            }
          }
        ]
      }
    }
  }
}
Then export the key into the environment Zed is launched from:
export LLMDEAL_API_KEY=lld_YOUR_KEY_HERE
On macOS, if you launch Zed from the Dock, set the variable with
launchctl setenv LLMDEAL_API_KEY lld_… or relaunch
Zed from the terminal so the env propagates.
Same smoke test pattern as the other tools:
curl https://api.llmdeal.me/v1/chat/completions \
-H "Authorization: Bearer $LLMDEAL_API_KEY" \
-H "Content-Type: application/json" \
-d '{"model":"smart-route-fast","messages":[{"role":"user","content":"zed pings llmdeal"}]}'
Open Zed's Assistant panel (Cmd+?) and pick "llmdeal · smart-route-fast" from the model dropdown.
smart-route-fast: Zed's UX is latency-sensitive, and the fast alias routes through Groq / Cerebras for sub-second first tokens. Add smart-route-coder as a second entry for heavier inline edits.
Troubleshooting
- "Provider not found" when picking a model?
- Zed reloads settings on save but sometimes caches the model list. Restart Zed once after editing settings.json.
- 401 errors despite a valid key?
- Confirm the env var name matches the provider key: the provider "llmdeal" requires LLMDEAL_API_KEY (uppercase, underscored). Name mismatches are the #1 cause.
- Tools / function-calling not firing?
- "tools": true must be set in capabilities. The smart-route aliases support tool calls, so leave this on.
- Settings file won't parse?
- Zed's settings.json supports comments and trailing commas, but merging into an existing file is finicky. Paste the language_models block at the top level, not nested inside "editor" or another section.
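The <PROVIDER_NAME>_API_KEY naming rule described in this section can be written down as a small helper for checking which variable Zed will look for. This is a sketch: the function name is hypothetical, and how Zed treats provider names containing spaces or punctuation is an assumption here (they are collapsed to underscores); only the plain "llmdeal" case comes from the text above.

```python
import os
import re


def provider_env_var(provider_name: str) -> str:
    """Derive the env var name: uppercase, underscored, suffixed with _API_KEY."""
    # Assumption: runs of non-alphanumeric characters become a single underscore.
    normalized = re.sub(r"[^A-Za-z0-9]+", "_", provider_name).strip("_").upper()
    return f"{normalized}_API_KEY"


name = provider_env_var("llmdeal")
status = "set" if os.environ.get(name) else "missing"
print(f"{name} is {status} in this environment")
```

Run it from the same shell that launches Zed; a "missing" result there usually explains a 401 despite a valid key.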
PearAI
Open-source VS Code fork (Apache 2.0) with a forked Continue.dev bundled in. Configured via Continue's config.json file pattern, so anything Continue's JSON config accepts, PearAI accepts.
PearAI passes the API key straight to the upstream OpenAI-compatible endpoint, so any lld_… key works.
Open the command palette (Cmd+Shift+P) and run PearAI: Open config.json. Add llmdeal to the models array:
{
  "models": [
    {
      "title": "llmdeal smart-route-coder",
      "provider": "openai",
      "model": "smart-route-coder",
      "apiBase": "https://api.llmdeal.me/v1",
      "apiKey": "lld_YOUR_KEY_HERE"
    }
  ]
}
Save the file. PearAI hot-reloads the config; the new model appears in the model dropdown.
Smoke test before opening the chat panel:
curl https://api.llmdeal.me/v1/chat/completions \
-H "Authorization: Bearer lld_YOUR_KEY_HERE" \
-H "Content-Type: application/json" \
-d '{"model":"smart-route-coder","messages":[{"role":"user","content":"ping"}]}'
smart-route-coder for diff edits and chat; PearAI's inline edit UX assumes a tool-aware model.
Troubleshooting
- Model not appearing in the picker?
- PearAI silently drops malformed JSON. Use PearAI: View Logs to spot the parse error; trailing commas and unquoted keys are the usual culprits.
- "Unauthorized" from the chat?
- The apiKey goes verbatim in Authorization: Bearer …. Strip any leading sk- wrapper if you migrated from an OpenAI config.
- Autocomplete still hitting PearAI's hosted model?
- Tab-autocomplete uses a separate tabAutocompleteModel key. Copy your model block there too if you want llmdeal handling completions.
- Slow first response?
- PearAI pre-flights with a 1-token request. The first call to a cold route can take 2-3s; subsequent calls are warm.
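Since malformed JSON is dropped silently, it can help to run the config through a strict parser and get a line and column back. A minimal sketch using Python's standard json module; the function name is hypothetical, and it assumes the config is meant to be strict JSON, as the trailing-comma note above implies.

```python
import json


def check_config_json(text: str):
    """Return None if text is strict JSON, else a 'line N, column M: reason' string."""
    try:
        json.loads(text)
    except json.JSONDecodeError as err:
        return f"line {err.lineno}, column {err.colno}: {err.msg}"
    return None


broken = '{"models": [{"title": "llmdeal smart-route-coder"},]}'
print(check_config_json(broken) or "config parses cleanly")
```

The reported position points at the first character the parser could not accept, which for a trailing comma is typically the bracket right after it.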
Cline
Agentic VS Code extension that reads/writes files, runs commands, and uses the browser. ~30k GitHub stars. Has a first-class "OpenAI Compatible" provider in its settings UI.
Install the Cline extension from the VS Code marketplace, then mint an llmdeal key from the dashboard.
Open the Cline sidebar, click the settings gear, set API Provider to OpenAI Compatible, then fill these fields:
Base URL: https://api.llmdeal.me/v1
API Key: lld_YOUR_KEY_HERE
Model ID: smart-route-coder
Turn Image Support off, and leave Computer Use off unless you've enabled the corresponding upstream tool. Save.
Open the Cline panel and type "list files in cwd". If the agent runs and returns, you're wired. Or smoke-test the endpoint directly:
curl https://api.llmdeal.me/v1/chat/completions \
-H "Authorization: Bearer lld_YOUR_KEY_HERE" \
-H "Content-Type: application/json" \
-d '{"model":"smart-route-coder","messages":[{"role":"user","content":"hi"}]}'
smart-route-coder. Cline burns tokens fast on multi-step agents; the coder route's pricing makes Cline economical in a way OpenAI direct isn't.
Troubleshooting
- Cline asks for a model context size?
- Set Context Window to 128000. The smart-route coder backends accept up to 128k.
- Agent loops endlessly on tool calls?
- Cline's parallel tool-call setting can confuse stricter routes. Open settings, disable Parallel Tool Use, retry.
- Rate-limited mid-task?
- Cline retries automatically with backoff. If you see persistent 429s, upgrade tier on /dashboard or split the task into smaller agent runs.
- Token cost showing $0 in the sidebar?
- That's expected; Cline only knows OpenAI's published pricing. Real spend is tracked on /dashboard.
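Cline handles 429 retries on its own, but scripts that share the same key hit the same limits. The following is a generic exponential-backoff sketch for such scripts. It is not Cline's implementation; the RateLimited exception, the function name, and the delay values are all hypothetical choices for illustration.

```python
import random
import time


class RateLimited(Exception):
    """Raised by the caller-supplied function when the server answers 429."""


def call_with_backoff(send, max_attempts=5, base_delay=1.0, sleep=time.sleep):
    """Call send() until it succeeds, waiting longer after each RateLimited."""
    for attempt in range(max_attempts):
        try:
            return send()
        except RateLimited:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the error to the caller
            # Delay doubles each attempt; jitter avoids synchronized retries.
            delay = base_delay * (2 ** attempt) + random.uniform(0, base_delay)
            sleep(delay)
```

The sleep parameter is injectable so the retry logic can be exercised without actually waiting; in normal use the default time.sleep applies.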
Roo Code
Cline fork with multi-mode agents (Architect, Code, Ask, Debug). Same OpenAI-compatible provider, more granular per-mode model assignments. Settings UI is near-identical to Cline.
Install Roo Code (formerly Roo Cline) from the VS Code marketplace.
Open the Roo sidebar, click the settings icon, and pick OpenAI Compatible as the API Provider:
https://api.llmdeal.me/v1
lld_YOUR_KEY_HERE
For per-mode tuning, assign different llmdeal aliases per mode in Modes → Configure:
# Mode → llmdeal alias
Architect → smart-route
Code → smart-route-coder
Ask → smart-route-fast
Debug → smart-route-coder
Switch Roo into Code mode, ask it to write a hello-world script. If the file edits land, the provider is wired.
curl https://api.llmdeal.me/v1/chat/completions \
-H "Authorization: Bearer lld_YOUR_KEY_HERE" \
-H "Content-Type: application/json" \
-d '{"model":"smart-route-coder","messages":[{"role":"user","content":"hi"}]}'
smart-route-coder in Code/Debug, smart-route in Architect for planning, smart-route-fast in Ask for snappy lookups.
Troubleshooting
- Per-mode model not respected?
- Click Save after each mode edit; Roo's UI doesn't auto-persist between mode tabs.
- "Tools unsupported by model"?
- All smart-route aliases support tools. Re-enter the model id (no trailing whitespace) and reload Roo's webview.
- Context window warning?
- Set Context Window Size to 128000. Roo defaults to the OpenAI gpt-4 value (8k), which truncates aggressively.
- Roo and Cline both installed?
- Disable one; both register the same VS Code shortcuts and fight over the chat panel.
opencode
Terminal-native AI coding agent from sst (the SST framework folks). Single binary, no editor lock-in. Provider config lives in ~/.config/opencode/config.json.
Install with the one-liner curl -fsSL https://opencode.ai/install | bash, then mint an llmdeal key from the dashboard.
Create or edit ~/.config/opencode/config.json:
{
  "$schema": "https://opencode.ai/config.json",
  "provider": {
    "llmdeal": {
      "npm": "@ai-sdk/openai-compatible",
      "options": {
        "baseURL": "https://api.llmdeal.me/v1",
        "apiKey": "lld_YOUR_KEY_HERE"
      },
      "models": {
        "opencode-go/deepseek-v4-flash": { "name": "llmdeal · DeepSeek V4 (default)" },
        "opencode-go/deepseek-v4-pro": { "name": "llmdeal · DeepSeek V4-Pro reasoner" },
        "opencode-go/gemini-3.5-flash": { "name": "llmdeal · Gemini 3.5 Flash" },
        "smart-route-coder": { "name": "llmdeal coder (free fallback)" },
        "smart-route": { "name": "llmdeal default (free fallback)" }
      }
    }
  }
}
Run opencode in your repo. Switch model via / → model → pick llmdeal/smart-route-coder.
Same shape as the other tools; opencode is OpenAI-compatible under the hood:
curl https://api.llmdeal.me/v1/chat/completions \
-H "Authorization: Bearer lld_YOUR_KEY_HERE" \
-H "Content-Type: application/json" \
-d '{"model":"smart-route-coder","messages":[{"role":"user","content":"hi"}]}'
opencode-go/deepseek-v4-flash for general coding (frontier-class DeepSeek V4, passthrough $0.14/$0.28 per 1M). Drop to smart-route-coder for free-tier fallback. Use opencode-go/deepseek-v4-pro when you need the reasoner.
The Opencode Go plan is built for the opencode CLI. It includes 8M tokens/mo across the opencode-go/* aliases (DeepSeek V4 Flash & Pro, Gemini 3.5 Flash) plus unmetered free-tier smart-route-* fallbacks, and the config above drops in as-is. Get the Opencode Go plan →
| Alias | Input / 1M | Output / 1M | Upstream |
|---|---|---|---|
| opencode-go/deepseek-v4-flash | $0.14 | $0.28 | DeepSeek chat |
| opencode-go/deepseek-v4-pro | $0.435 | $0.87 | DeepSeek reasoner (75% reduction permanent 2026-05-22) |
| opencode-go/gemini-3.5-flash | $1.50 | $9.00 | Google Gemini 3.5 Flash (Day-0 in LiteLLM 1.86.2) |
| smart-route-coder + smart-route + smart-route-fast | $0 | $0 | Free-tier fallback pool (Codestral, Qwen3 Coder 480B, Llama, Gemini Flash) |
All prices are passthrough, with no llmdeal markup on the per-token rate. The Opencode Go subscription at $19/mo covers 8M tokens/mo across these aliases; overage is billed at the rates above when you top up with a PAYG pack.
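To see what the per-token rates mean for a given workload, the arithmetic is tokens times rate divided by one million, summed over input and output. A small sketch with the rates copied from the pricing table; the function name is hypothetical, and it deliberately ignores the 8M tokens/mo plan allowance, so it shows pure pay-as-you-go cost.

```python
# USD per 1M tokens as (input, output), copied from the pricing table.
RATES = {
    "opencode-go/deepseek-v4-flash": (0.14, 0.28),
    "opencode-go/deepseek-v4-pro": (0.435, 0.87),
    "opencode-go/gemini-3.5-flash": (1.50, 9.00),
    "smart-route-coder": (0.0, 0.0),
    "smart-route": (0.0, 0.0),
    "smart-route-fast": (0.0, 0.0),
}


def request_cost_usd(alias: str, input_tokens: int, output_tokens: int) -> float:
    """Pay-as-you-go cost in USD for one alias, ignoring any plan allowance."""
    input_rate, output_rate = RATES[alias]
    return (input_tokens * input_rate + output_tokens * output_rate) / 1_000_000


# 2M input tokens and 500k output tokens on the flash alias:
# 2 * $0.14 + 0.5 * $0.28 = $0.42
print(f"{request_cost_usd('opencode-go/deepseek-v4-flash', 2_000_000, 500_000):.2f}")
```

The same workload on opencode-go/gemini-3.5-flash works out to 2 * $1.50 + 0.5 * $9.00 = $7.50, which is why the output rate matters most for agent-style sessions that generate a lot of text.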
Troubleshooting
- "Provider not found" on launch?
- opencode lazy-installs the npm provider on first run; first launch can take 20s while @ai-sdk/openai-compatible installs. Re-run.
- Model picker empty?
- The models keys must match real llmdeal aliases. Typos here silently drop entries.
- Tool calls failing mid-task?
- Run with opencode --print-logs in a second terminal to see the actual request/response and find the schema mismatch.
- Want to skip the config file?
- Set OPENCODE_PROVIDER=llmdeal OPENAI_BASE_URL=https://api.llmdeal.me/v1 OPENAI_API_KEY=lld_… opencode.
OpenHands
Autonomous software engineer agent (formerly OpenDevin). Runs in Docker, drives a sandboxed environment, opens its own browser. Uses LiteLLM under the hood, so any OpenAI-compatible endpoint works.
Pull and start the runtime container per the official quickstart, then mint an llmdeal key from the dashboard.
In the OpenHands settings panel, set the LLM provider to Custom, then fill:
Custom Model: openai/smart-route-coder
Base URL: https://api.llmdeal.me/v1
API Key: lld_YOUR_KEY_HERE
Or, if you launch OpenHands via CLI, drop these into config.toml:
[llm]
model = "openai/smart-route-coder"
base_url = "https://api.llmdeal.me/v1"
api_key = "lld_YOUR_KEY_HERE"
Before kicking off a long agent task, smoke-test the endpoint from inside the OpenHands container shell:
curl https://api.llmdeal.me/v1/chat/completions \
-H "Authorization: Bearer lld_YOUR_KEY_HERE" \
-H "Content-Type: application/json" \
-d '{"model":"smart-route-coder","messages":[{"role":"user","content":"ready?"}]}'
smart-route-coder prefixed with openai/. OpenHands routes through LiteLLM, which uses the provider prefix to pick the transport.
Troubleshooting
- "LiteLLM completion error"?
- The model id must include the openai/ prefix: openai/smart-route-coder, not bare.
- Container can't reach api.llmdeal.me?
- OpenHands runs the agent in a sandboxed container with restricted egress. Verify the host's outbound to api.llmdeal.me:443 is open and the container inherits DNS.
- Agent stops mid-task with "context window exceeded"?
- Lower max_input_tokens in config.toml (try 96000). OpenHands buffers history aggressively.
- Budget runaway on long sessions?
- Set max_iterations and a per-session budget cap in OpenHands settings. The agent will retry tool calls indefinitely otherwise.
Open Interpreter
Local "code interpreter" CLI; executes natural-language tasks by running Python/shell on your machine. Configured via env vars or ~/.config/open-interpreter/profiles/default.yaml.
Install with pip install open-interpreter, then mint an llmdeal key from the dashboard.
Env-var approach is simplest:
export OPENAI_API_BASE=https://api.llmdeal.me/v1
export OPENAI_API_KEY=lld_YOUR_KEY_HERE
interpreter --model openai/smart-route-coder
For a persistent profile, edit ~/.config/open-interpreter/profiles/default.yaml:
llm:
  model: openai/smart-route-coder
  api_base: https://api.llmdeal.me/v1
  api_key: lld_YOUR_KEY_HERE
  context_window: 128000
  max_tokens: 4096
Run a no-op command to confirm wiring:
interpreter --model openai/smart-route-coder \
--auto-run "print the current python version"
openai/smart-route-coder for task work; openai/smart-route-fast if you want snappier "what's this command" lookups.
Troubleshooting
- "Model not found" from LiteLLM?
- Open Interpreter uses LiteLLM. The openai/ prefix is mandatory; without it LiteLLM tries to look up the model in its anthropic/openai registry and fails.
- It keeps asking for confirmation?
- Pass --auto-run (or -y) to skip per-command confirmation. Use with care; this thing runs anything.
- Streaming output stutters?
- Try a fast route: --model openai/smart-route-fast. The coder route is throughput-tuned, not latency.
- Context overflowing on long sessions?
- Lower context_window in default.yaml to 64000; Open Interpreter keeps full history by default.
Open WebUI
Self-hosted ChatGPT-style web UI; ships with first-class OpenAI-compatible support. Add llmdeal as a connection from the admin panel; no config files needed.
Start Open WebUI with docker run -d -p 3000:8080 -v open-webui:/app/backend/data --name open-webui ghcr.io/open-webui/open-webui:main, then mint an llmdeal key from the dashboard.
Browse to http://localhost:3000, sign in as admin, then go to Admin Panel → Settings → Connections → OpenAI API. Add a new connection:
https://api.llmdeal.me/v1
lld_YOUR_KEY_HERE
Click the refresh icon next to the connection; Open WebUI will fetch the model list. Save. Models appear in the chat-screen dropdown.
Headless launch with env vars also works:
docker run -d -p 3000:8080 \
-e OPENAI_API_BASE_URL=https://api.llmdeal.me/v1 \
-e OPENAI_API_KEY=lld_YOUR_KEY_HERE \
-v open-webui:/app/backend/data \
--name open-webui ghcr.io/open-webui/open-webui:main
Pick smart-route-fast in the chat dropdown and send a message. Or curl directly:
curl https://api.llmdeal.me/v1/chat/completions \
-H "Authorization: Bearer lld_YOUR_KEY_HERE" \
-H "Content-Type: application/json" \
-d '{"model":"smart-route-fast","messages":[{"role":"user","content":"hi"}]}'
smart-route-fast for chat UX (low first-token latency); smart-route for longer reasoning sessions.
Troubleshooting
- Model list empty after adding the connection?
- Click the small refresh icon next to the connection. Open WebUI fetches /v1/models on demand, not on save.
- Connection test fails with "fetch failed"?
- If you're running Open WebUI in Docker against a host service, use host.docker.internal instead of localhost. For api.llmdeal.me (public), this isn't an issue.
- Want to hide Ollama models?
- Disable the Ollama connection in Admin → Settings → Connections; otherwise the chat dropdown is cluttered.
- Files / vision uploads fail?
- Most smart-route aliases are text-only. Image inputs require a vision-capable route; /docs.html lists which.
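When the model list looks empty, it helps to know what a /v1/models response is expected to contain. The sketch below extracts model ids from a response body. It assumes llmdeal returns the standard OpenAI list shape ({"object": "list", "data": [{"id": ...}]}), which this page does not show, and the function name and sample payload are made up for illustration.

```python
import json


def model_ids(models_response: str) -> list[str]:
    """Return the sorted model ids from an OpenAI-style /v1/models body."""
    payload = json.loads(models_response)
    return sorted(item["id"] for item in payload.get("data", []))


# Hypothetical sample body in the OpenAI list format.
sample = json.dumps({
    "object": "list",
    "data": [
        {"id": "smart-route-fast", "object": "model"},
        {"id": "smart-route", "object": "model"},
    ],
})
print(model_ids(sample))
```

An empty list from a real response would point at the key or the connection rather than at Open WebUI's dropdown.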
LibreChat
Self-hosted ChatGPT clone with multi-user, plugins, RAG, and a strong custom-endpoint config. llmdeal slots into the custom endpoints array in librechat.yaml.
Follow the LibreChat quickstart (Docker compose), then mint an llmdeal key from the dashboard.
Add an llmdeal entry to your librechat.yaml under endpoints.custom:
version: 1.0.5
cache: true
endpoints:
  custom:
    - name: "llmdeal"
      apiKey: "${LLMDEAL_API_KEY}"
      baseURL: "https://api.llmdeal.me/v1"
      models:
        default: ["smart-route", "smart-route-coder", "smart-route-fast"]
        fetch: true
      titleConvo: true
      titleModel: "smart-route-fast"
      modelDisplayLabel: "llmdeal"
Then add the key to LibreChat's .env:
LLMDEAL_API_KEY=lld_YOUR_KEY_HERE
Run docker compose restart api. The new llmdeal endpoint shows up in the model picker.
curl https://api.llmdeal.me/v1/chat/completions \
-H "Authorization: Bearer lld_YOUR_KEY_HERE" \
-H "Content-Type: application/json" \
-d '{"model":"smart-route-fast","messages":[{"role":"user","content":"hi"}]}'
smart-route-fast as the conversation-title model (cheap, fast); offer smart-route and smart-route-coder in the user-facing picker.
Troubleshooting
- Endpoint missing after restart?
- LibreChat is strict about YAML versions. Pin version: 1.0.5 (or your installed version's exact match) at the top of librechat.yaml.
- "Could not parse" error in logs?
- Use spaces, not tabs. The apiKey value supports ${VAR} interpolation only if the env var exists at startup; missing vars fail silently.
- RAG / agents not using llmdeal?
- Set RAG_OPENAI_BASEURL and RAG_OPENAI_API_KEY in .env for embeddings, separately from the chat endpoint.
- Model fetching returns empty?
- Set fetch: false and rely on the default array. llmdeal's /v1/models is fine, but some LibreChat versions cache aggressively.
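Because a missing ${VAR} fails silently, a quick scan of the config for unresolved references can save a restart cycle. A minimal sketch: the function name is hypothetical, it only looks for the ${NAME} form used in this section, and it treats an empty value the same as a missing one.

```python
import re

# Matches ${NAME} where NAME is a conventional environment variable name.
VAR_PATTERN = re.compile(r"\$\{([A-Za-z_][A-Za-z0-9_]*)\}")


def missing_env_vars(config_text: str, env: dict) -> list[str]:
    """Return the sorted names referenced as ${NAME} that are unset or empty in env."""
    names = VAR_PATTERN.findall(config_text)
    return sorted({name for name in names if not env.get(name)})


config = 'apiKey: "${LLMDEAL_API_KEY}"\nbaseURL: "https://api.llmdeal.me/v1"\n'
print(missing_env_vars(config, {}))
```

Pass os.environ (or the parsed contents of LibreChat's .env) as env to check the values the container will actually see.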
Codeium / Windsurf
Codeium's free tier and Windsurf editor are tightly bound to their hosted backend; there's no first-party support for swapping the OpenAI base URL today. We're tracking this and will document a workaround if one ships.
Sourcegraph Cody
Cody Enterprise supports BYO LLM via Sourcegraph's site-admin LLM config (Sourcegraph instance acts as the gateway). The free Cody plan currently does not expose a custom OpenAI endpoint setting.
Tabby (self-host)
Tabby is a self-hosted code-completion server. You typically point it at a local Ollama/llamacpp backend, not a remote API. It does support an OpenAI-compatible chat backend for the side-panel chat feature; the completion model still needs to be local.
In your Tabby ~/.tabby/config.toml, set the [model.chat.http] block to an OpenAI/chat kind:
[model.chat.http]
kind = "openai/chat"
model_name = "smart-route-coder"
api_endpoint = "https://api.llmdeal.me/v1"
api_key = "lld_YOUR_KEY_HERE"
Restart Tabby. The chat side-panel will route to llmdeal. Code completion stays on your local model — that's by design; remote completion latency would be too high.
Need a different tool?
Anything that speaks OpenAI's /v1/chat/completions works
with llmdeal: Cline, Roo Code, Open WebUI, LibreChat, your own
scripts. Base URL: https://api.llmdeal.me/v1. Auth:
Authorization: Bearer lld_….
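For your own scripts, the curl smoke tests on this page translate directly into a few lines of standard-library Python. This is a sketch rather than an official client: the function names are hypothetical, the request body mirrors the curl examples, and the reply is read from choices[0].message.content as described in the Cursor section.

```python
import json
import urllib.request

BASE_URL = "https://api.llmdeal.me/v1"


def build_request(api_key: str, model: str, prompt: str) -> urllib.request.Request:
    """Build the same POST the curl examples send, without sending it."""
    body = json.dumps(
        {"model": model, "messages": [{"role": "user", "content": prompt}]}
    ).encode("utf-8")
    return urllib.request.Request(
        f"{BASE_URL}/chat/completions",
        data=body,
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def extract_reply(response_body) -> str:
    """Pull the assistant text out of a chat-completions response body."""
    return json.loads(response_body)["choices"][0]["message"]["content"]


def chat(api_key: str, model: str, prompt: str, timeout: float = 30.0) -> str:
    """Send one prompt and return the reply text (performs a network call)."""
    request = build_request(api_key, model, prompt)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return extract_reply(response.read())


# Example call (requires a real key):
# print(chat("lld_YOUR_KEY_HERE", "smart-route", "say hi in 3 words"))
```

Splitting request building from sending keeps the part that can go wrong offline (URL, headers, body) separate from the network call, so a 401 or 404 can be narrowed down by printing the request first.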
Hosting a consortium GPU?
Already approved with a cstm_ token? The
consortium setup script
takes you from "machine registered, nothing running" to "serving inference to
llmdeal customers" in about 10 minutes. See the
companion docs for the full flow.
curl -sSf https://llmdeal.me/setup/consortium.sh -o consortium-setup.sh
chmod +x consortium-setup.sh
sudo ./consortium-setup.sh
Curious how much capacity is live right now? View the public pool status →