docs
One OpenAI-compatible endpoint. Authenticate, send a request, stream the reply.
Quickstart
Send a standard OpenAI-style chat completion. Swap the base URL and the model name — that's the whole integration.
# one endpoint, every model
curl https://apiarium-labs.hf.space/v1/chat/completions \
  -H "Authorization: Bearer $DREAMROUTER_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "claude-opus-4.8",
    "messages": [{"role":"user","content":"Hello, whale."}]
  }'
from openai import OpenAI

client = OpenAI(
    base_url="https://apiarium-labs.hf.space/v1",
    api_key="dr_u_...",  # your personal key from the dashboard
)

resp = client.chat.completions.create(
    model="claude-opus-4.8",
    messages=[{"role": "user", "content": "Hello, whale."}],
)
print(resp.choices[0].message.content)
import OpenAI from "openai";

const client = new OpenAI({
  baseURL: "https://apiarium-labs.hf.space/v1",
  apiKey: "dr_u_...", // your personal key from the dashboard
});

const resp = await client.chat.completions.create({
  model: "claude-opus-4.8",
  messages: [{ role: "user", content: "Hello, whale." }],
});
console.log(resp.choices[0].message.content);
Authentication
Proxy endpoints require a Bearer token in the Authorization header.
Authorization: Bearer sk-your-DreamRouter-key
Public endpoints (/, /health, /v1/models) need no auth. Everything else returns 401 without a valid key. An optional IP allowlist can further restrict access.
Base URL
All requests are made against:
https://apiarium-labs.hf.space/v1
Point any OpenAI-compatible client at this base URL and you are done.
List models
GET /v1/models returns every model the router can resolve. Public, no auth required.
curl https://apiarium-labs.hf.space/v1/models
{
  "object": "list",
  "data": [
    { "id": "claude-opus-4.8", "object": "model", "owned_by": "claude-subnet" },
    { "id": "deepseek-v4-pro", "object": "model", "owned_by": "deepseek" }
    // ...374 models
  ]
}
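To choose a model programmatically, the response can be reduced to a plain list of ids. The following is a minimal sketch using only the Python standard library; the helper names `model_ids` and `fetch_models` are illustrative and not part of any SDK. Note that the `// ...` line in the sample above is an elision, so the sample as printed is not valid JSON.

```python
import json
import urllib.request

BASE_URL = "https://apiarium-labs.hf.space/v1"


def model_ids(payload: dict) -> list[str]:
    """Extract the model ids from a /v1/models response body."""
    return [entry["id"] for entry in payload.get("data", [])]


def fetch_models(base_url: str = BASE_URL) -> dict:
    """GET /v1/models. The endpoint is public, so no Authorization header is sent."""
    with urllib.request.urlopen(f"{base_url}/models", timeout=10) as resp:
        return json.load(resp)


# Usage (performs a network call):
#   ids = model_ids(fetch_models())
```

Keeping the parsing separate from the HTTP call makes it easy to test against a saved response.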
Chat completions
POST /v1/chat/completions accepts the standard OpenAI request body. The model field selects the pool; everything else passes through.
| Field | Type | Notes |
|---|---|---|
| model | string | Required. Any name from /v1/models. |
| messages | array | OpenAI message format. |
| stream | boolean | SSE stream when true. |
| temperature | number | Forwarded upstream. |
| max_tokens | number | Forwarded. GPT-5.x maps to max_completion_tokens automatically. |
| tools | array | Tool/function calling supported where the upstream allows it. |
Streaming
Set "stream": true. The router forwards the upstream SSE stream and drops any line that does not start with data:, so strict parsers stay happy. Each chunk is a standard chat.completion.chunk.
data: {"choices":[{"delta":{"content":"Hello"}}]}
data: {"choices":[{"delta":{"content":", whale."}}]}
data: [DONE]
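If you are not using an SDK, the stream can be consumed by hand. The sketch below assumes the lines arrive already decoded as text, one per iteration; `iter_deltas` is a hypothetical helper name. It skips anything that is not a data: line, stops at [DONE], and yields each content fragment.

```python
import json
from typing import Iterable, Iterator


def iter_deltas(lines: Iterable[str]) -> Iterator[str]:
    """Yield content fragments from a stream of chat.completion.chunk events."""
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue  # blank separators or anything else that is not a data line
        data = line[len("data:"):].strip()
        if data == "[DONE]":
            break  # end-of-stream sentinel
        chunk = json.loads(data)
        for choice in chunk.get("choices", []):
            content = (choice.get("delta") or {}).get("content")
            if content:
                yield content


sample = [
    'data: {"choices":[{"delta":{"content":"Hello"}}]}',
    "",
    'data: {"choices":[{"delta":{"content":", whale."}}]}',
    "",
    "data: [DONE]",
]
print("".join(iter_deltas(sample)))  # prints: Hello, whale.
```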
Parameters
Unsupported parameters are silently dropped before forwarding. The router never invents upstream behavior — it only translates model names and injects keys.
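As an illustration of what "silently dropped" means for a request body, here is a rough sketch. It is not the router's actual code: the `SUPPORTED` set simply mirrors the fields table above, the `gpt-5` prefix check is an assumed way to detect GPT-5.x models, and `prepare_body` is a hypothetical name.

```python
# Assumption: the supported set mirrors the documented fields table.
SUPPORTED = {"model", "messages", "stream", "temperature", "max_tokens", "tools"}


def prepare_body(body: dict, supported: frozenset = frozenset(SUPPORTED)) -> dict:
    """Return a copy of the body with unsupported keys removed."""
    out = {key: value for key, value in body.items() if key in supported}
    # Assumption: a model is treated as GPT-5.x when its name starts with "gpt-5".
    if str(out.get("model", "")).startswith("gpt-5") and "max_tokens" in out:
        out["max_completion_tokens"] = out.pop("max_tokens")
    return out


request = {
    "model": "claude-opus-4.8",
    "messages": [{"role": "user", "content": "Hello, whale."}],
    "max_tokens": 64,
    "made_up_option": True,  # not in the supported set, so it is dropped
}
print(sorted(prepare_body(request)))  # prints: ['max_tokens', 'messages', 'model']
```

Because dropping is silent, a misspelled parameter produces no error; it is worth validating request bodies on the client side.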
Errors
| Status | Meaning |
|---|---|
| 401 | Missing or invalid bearer token. |
| 403 | Endpoint blocked, or IP not on the allowlist. |
| 404 | Model not found. Returned before any upstream call. |
| 429 | All targets rate-limited after retries. |
| 502 | Every target in the pool failed its attempts. |
| 503 | Pool exhausted: every target is cooling down. |
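On the client side, the table can be turned into a small lookup that also records whether retrying later is likely to help. The `retryable` flags below are a judgment call rather than documented behavior: 429, 502, and 503 describe transient pool states, while 401, 403, and 404 usually need a configuration fix. The names `RouterError` and `classify` are illustrative.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RouterError:
    status: int
    meaning: str
    retryable: bool  # assumption: not stated by the docs


ERRORS = {
    401: RouterError(401, "Missing or invalid bearer token.", False),
    403: RouterError(403, "Endpoint blocked, or IP not on the allowlist.", False),
    404: RouterError(404, "Model not found.", False),
    429: RouterError(429, "All targets rate-limited after retries.", True),
    502: RouterError(502, "Every target in the pool failed its attempts.", True),
    503: RouterError(503, "Pool exhausted: every target is cooling down.", True),
}


def classify(status: int) -> RouterError:
    """Look up a status; unknown 5xx codes are treated as retryable."""
    fallback = RouterError(status, "Unexpected status.", 500 <= status <= 599)
    return ERRORS.get(status, fallback)


print(classify(503).retryable)  # prints: True
```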
Retries & limits
Each request gets up to five attempts. On 401, 402, 403, 429, or 5xx, the failing target cools for sixty seconds and the router tries the next healthy one. Cooldowns are idempotent — re-failing a target never resets its timer, so flapping endpoints recover cleanly.
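The retry and cooldown rules can be modeled in a few lines. This is a simplified sketch under stated assumptions, not the router's implementation: the `Pool` class, the first-healthy selection order, and the injectable `clock` are all hypothetical, and the sketch does not model which of 429, 502, or 503 is returned when every attempt fails.

```python
import time
from typing import Callable, Optional

COOLDOWN_SECONDS = 60.0
MAX_ATTEMPTS = 5


def should_fail_over(status: int) -> bool:
    """True for the statuses that put a target into cooldown."""
    return status in {401, 402, 403, 429} or 500 <= status <= 599


class Pool:
    def __init__(self, targets, clock: Callable[[], float] = time.monotonic):
        self.targets = list(targets)
        self.clock = clock
        self.cooling_until: dict[str, float] = {}

    def healthy(self) -> list[str]:
        now = self.clock()
        return [t for t in self.targets
                if self.cooling_until.get(t, float("-inf")) <= now]

    def cool(self, target: str) -> None:
        now = self.clock()
        # Idempotent: a target that is already cooling keeps its original deadline.
        if self.cooling_until.get(target, float("-inf")) <= now:
            self.cooling_until[target] = now + COOLDOWN_SECONDS

    def send(self, call: Callable[[str], int]) -> Optional[tuple[str, int]]:
        """Try up to MAX_ATTEMPTS targets; return (target, status) or None."""
        for _ in range(MAX_ATTEMPTS):
            candidates = self.healthy()
            if not candidates:
                return None  # every target is cooling down
            target = candidates[0]  # assumption: simple first-healthy selection
            status = call(target)
            if not should_fail_over(status):
                return target, status
            self.cool(target)
        return None  # all attempts used
```

Passing a fake `clock` makes the idempotence property easy to check: cooling a target at t=0 and again at t=30 leaves its deadline at t=60.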