# Depaza API Reference

> The canonical, machine-readable reference for the Depaza REST API.
> Lives at **https://depaza.com/llms.txt** (and `https://depaza.com/docs/api.md`).
> Point your own agent or LLM at that URL — the Depaza Code CLI fetches it automatically.

Depaza is Europe's sovereign AI platform. Every model is hosted in the EU; data never
leaves European soil. The HTTP API is **OpenAI-compatible** (`/v1/chat/completions`,
`/v1/models`) **and Anthropic-compatible** (`/v1/messages`, Message Batches), so existing
OpenAI or Anthropic SDKs work once you change two settings: the base URL and your key.

- **Base URL:** `https://depaza.com/v1`
- **Auth:** `Authorization: Bearer dpz_live_…` (also accepts `x-api-key:` on Anthropic routes)
- **Content type:** `application/json` for request bodies (multipart for file/audio uploads)
- **Encoding:** UTF-8 everywhere
- **Versioning:** the path prefix `/v1` is the API version

---

## Table of contents

**Getting started** — [Authentication](#authentication) · [SDK compatibility](#sdk-compatibility) · [Models](#models)
**Core API** — [Chat completions](#post-v1chatcompletions) · [Tools & function calling](#tools--function-calling) · [List models](#get-v1models) · [Usage](#get-v1usage) · [Messages (Anthropic)](#post-v1messages) · [Count tokens](#post-v1messagescount_tokens) · [Message Batches](#message-batches)
**Capabilities** — [Vision](#post-v1vision) · [Files / extraction](#post-v1files) · [Download generated file](#get-v1filesid) · [Web search](#post-v1search) · [Transcription](#transcription-audio--text)
**Account** — [CLI session sync](#cli-session-sync) · [API keys](#api-keys) · [Credit & billing](#credit--billing)
**Reference** — [Rate limits](#rate-limits) · [Errors](#errors) · [Build it with the CLI](#build-it-with-the-depaza-code-cli)

---

## Authentication

Create an API key under **Settings → API** in your account (https://depaza.com/settings).
Keys look like `dpz_live_…`. The plaintext is shown **once** at creation and stored only as a
salted hash — copy it immediately. Revoke a key anytime; revocation is effective immediately.

Send the key as a Bearer token:

```
Authorization: Bearer dpz_live_xxxxxxxxxxxxxxxxxxxxxxxx
```

The Anthropic-compatible routes (`/v1/messages*`) **also** accept the header used by the
Anthropic SDK / Claude CLI:

```
x-api-key: dpz_live_xxxxxxxxxxxxxxxxxxxxxxxx
```
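The header choice can be wrapped in a small helper. This is a minimal sketch: `auth_headers` is a hypothetical name (not part of any SDK), and it assumes every Anthropic-compatible route starts with `/v1/messages`.

```python
def auth_headers(path: str, api_key: str) -> dict:
    """Return the auth header for a Depaza route (hypothetical helper)."""
    if not api_key.startswith("dpz_live_"):
        raise ValueError("expected a key that starts with dpz_live_")
    # Anthropic-compatible routes also accept x-api-key (assumed prefix match).
    if path.startswith("/v1/messages"):
        return {"x-api-key": api_key}
    # Bearer auth is the documented default for every other route.
    return {"Authorization": f"Bearer {api_key}"}


print(auth_headers("/v1/chat/completions", "dpz_live_example"))
print(auth_headers("/v1/messages/batches", "dpz_live_example"))
```

Bearer auth is also accepted on the Anthropic-compatible routes, so returning the Bearer header everywhere would work as well.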

There are two **kinds** of key, billed differently but used identically:

| Kind | Created by | Billing |
|------|-----------|---------|
| `api` | Settings → API (manual) | Prepaid **EUR credit** — debited per call. See [Credit & billing](#credit--billing). |
| `cli` | `depaza auth` (browser connect flow) | **Membership-metered** — included in your plan, capped by a rolling weekly token window. |

> **Deployment note (self-hosting / proxies):** PHP-FPM behind Apache/nginx sometimes strips the
> `Authorization` header. On Apache set `CGIPassAuth On`; on nginx add
> `fastcgi_param HTTP_AUTHORIZATION $http_authorization;`. Depaza's own hosting already does this.

---

## SDK compatibility

### OpenAI SDK (Python)

```python
from openai import OpenAI

client = OpenAI(
    api_key="dpz_live_…",
    base_url="https://depaza.com/v1",
)

resp = client.chat.completions.create(
    model="core",
    messages=[{"role": "user", "content": "Summarise today's EU tech news"}],
)
print(resp.choices[0].message.content)
```

### OpenAI SDK (Node.js)

```js
import OpenAI from "openai";

const client = new OpenAI({
  apiKey: "dpz_live_…",
  baseURL: "https://depaza.com/v1",
});

const resp = await client.chat.completions.create({
  model: "core",
  messages: [{ role: "user", content: "Summarise today's EU tech news" }],
});
console.log(resp.choices[0].message.content);
```

### Anthropic SDK (Python)

```python
from anthropic import Anthropic

client = Anthropic(
    api_key="dpz_live_…",
    base_url="https://depaza.com",   # note: no /v1 — the SDK appends it
)

msg = client.messages.create(
    model="core",
    max_tokens=1024,
    messages=[{"role": "user", "content": "Hello from Europe"}],
)
print(msg.content[0].text)
```

### Claude CLI / Anthropic-compatible tools

```sh
export ANTHROPIC_BASE_URL=https://depaza.com
export ANTHROPIC_API_KEY=dpz_live_…
```

---

## Models

Pass the model **id** as the `model` field. The catalog is intentionally neutral — the
underlying base models are EU-hosted and not exposed.

| id | Name | Context | Best for |
|------|------|---------|----------|
| `lite` | Depaza Lite | 128k | Fast, everyday questions, summarisation, high-volume tasks |
| `core` | Depaza Core | 128k | Best balance of quality, speed and tool use — **default / recommended** |
| `max` | Depaza Max | 128k | Deep analysis and the most demanding tasks |

Additional ids selectable via the API/CLI (not shown in the web model picker):

| id | Best for |
|------|----------|
| `coder` | Code generation specialist |
| `reason` | Long-form reasoning specialist |
| `boss` | Unrestricted / raw persona (advanced; higher cost) |

- **Default:** if `model` is omitted on `/v1/messages`, `core` is used. On `/v1/chat/completions`
  the `model` field is required.
- **Anthropic model names** (e.g. `claude-3-5-sonnet…`) are accepted on `/v1/messages` and mapped
  automatically: names containing `haiku` → `lite`, all others → `max`. Native ids (`lite`/`core`/`max`)
  are honoured as-is.
- **Max output:** clamped server-side to **8192 tokens** regardless of the requested value.
- List the catalog programmatically with [`GET /v1/models`](#get-v1models).
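To predict which model a `/v1/messages` request will land on, the mapping rules above can be mirrored client-side. This is a sketch under assumptions: `resolve_model` and `effective_max_tokens` are hypothetical helpers, the substring match is taken literally (case-sensitive), and only the three ids named in the rule are treated as native, because the text does not say how `coder`/`reason`/`boss` behave on this route. The model names in the demo are illustrative strings, not real catalog entries.

```python
from typing import Optional

MAX_OUTPUT_TOKENS = 8192
NATIVE_IDS = {"lite", "core", "max"}


def resolve_model(name: Optional[str]) -> str:
    """Client-side mirror of the documented /v1/messages model mapping."""
    if not name:
        return "core"                 # documented default when model is omitted
    if name in NATIVE_IDS:
        return name                   # native ids are honoured as-is
    return "lite" if "haiku" in name else "max"


def effective_max_tokens(requested: int) -> int:
    """Output is clamped server-side to 8192 whatever the request asks for."""
    return min(requested, MAX_OUTPUT_TOKENS)


print(resolve_model("claude-haiku-example"), resolve_model("claude-sonnet-example"))
print(effective_max_tokens(20000))
```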

---

## `POST /v1/chat/completions`

OpenAI-compatible chat completions. Same request/response shape as OpenAI, plus a
`depaza_billing` object describing what the turn cost.

**Auth:** `Authorization: Bearer` · **Rate limit:** 120/min, 10 000/day per key

### Two modes, chosen automatically

1. **Managed (no `tools` field)** — Depaza runs its own tools (web search + live page reading)
   server-side when the model decides they help, and returns a finished, web-grounded answer.
   You never receive tool calls to execute. Client `system` messages are ignored (Depaza injects
   its own); the `messages` array must end with a `user` message.
2. **Passthrough (you send a `tools` array)** — a transparent OpenAI proxy: only your tools are
   offered, your `system` message is respected, nothing runs server-side, and the model returns
   `tool_calls` for you to execute. Standard function-calling. The full role history is honoured.
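A pre-flight check can tell you which mode a request body will trigger before you send it. The sketch below is an assumption-laden illustration: `detect_mode` is a hypothetical helper, and it treats an empty `tools` list like a missing one, which the text above does not specify.

```python
def detect_mode(body: dict) -> str:
    """Return 'passthrough' if the body carries tools, else 'managed'."""
    if body.get("tools"):             # assumption: an empty list counts as absent
        return "passthrough"
    messages = body.get("messages") or []
    # Managed mode requires the conversation to end with a user turn.
    if not messages or messages[-1].get("role") != "user":
        raise ValueError("managed mode: messages must end with a user message")
    return "managed"


print(detect_mode({"model": "core", "messages": [{"role": "user", "content": "hi"}]}))
```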

### Request body

| Field | Type | Notes |
|-------|------|-------|
| `model` | string | **Required.** `lite` / `core` / `max` (or `coder`/`reason`/`boss`). |
| `messages` | array | **Required.** `{role, content}`. Managed mode must end with a `user` turn; passthrough honours `system`/`assistant`/`tool` roles. `content` may be a string or OpenAI content-part array. |
| `stream` | bool | `true` → SSE `chat.completion.chunk` frames; omit/`false` → one buffered `chat.completion`. |
| `stream_options.include_usage` | bool | When streaming, emit a final chunk carrying `usage`. |
| `tools` | array | Your own OpenAI function definitions. Presence switches to passthrough mode. |
| `tool_choice` | string/object | `auto` (default), `none`, `required`, or `{type:"function",function:{name}}`. Passthrough only. |
| `temperature` | number | 0–2. |
| `top_p` | number | 0–1. |
| `top_k` | int | Top-k sampling. |
| `max_tokens` | int | Output cap (clamped to 8192). |
| `presence_penalty` / `frequency_penalty` | number | −2 to 2. |
| `stop` | string/array | Stop sequence(s). |
| `seed` | int | Best-effort determinism. |
| `response_format` | object | `{"type":"json_object"}` or a `json_schema` (JSON mode, honoured by `lite`/`max`). |
| `mode` | string | `standard` (default), `document` (research→draft→Office file), or `expert` (draft→critique). |
| `locale` | string | `en` or `da` — language for document/expert orchestration. |
| `document_intake` | bool | `mode=document` only — return clarifying questions first instead of one-shot. |
| `attachments` | array | `[{file\|base64, filename?, mime?}]` — up to 5 files run through OCR/extraction and prepended to your message. **Paid plans only.** |
| `depaza_events` | bool | When streaming, also emit custom `depaza.*` frames (see below). |

Vision: if any message contains an OpenAI `image_url` content block, the turn is routed to a
vision-capable model automatically (**paid plans only**, max 4 images, ≤ ~9 MB per inline image).

### Example — managed (curl)

```sh
curl https://depaza.com/v1/chat/completions \
  -H "Authorization: Bearer $DEPAZA_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "core",
    "messages": [{"role": "user", "content": "What changed in EU AI rules this week?"}]
  }'
```

### Response (buffered)

```json
{
  "id": "chatcmpl-…",
  "object": "chat.completion",
  "created": 1733300000,
  "model": "core",
  "choices": [
    { "index": 0,
      "message": { "role": "assistant", "content": "…" },
      "finish_reason": "stop" }
  ],
  "usage": { "prompt_tokens": 812, "completion_tokens": 415, "total_tokens": 1227 },
  "depaza_billing": { "web_searches": 1, "balance_cents_after": 2461 }
}
```

`depaza_billing.web_searches` is the count of external searches actually billed this turn;
`balance_cents_after` is your remaining EUR credit (in cents), or `null` for membership keys.
Document Mode adds `depaza_file: {url}`; document intake adds `depaza_questions`.

### Streaming

Request `"stream": true`. Frames are OpenAI `chat.completion.chunk` objects:

```
data: {"id":"chatcmpl-…","object":"chat.completion.chunk","created":1733300000,"model":"core","choices":[{"index":0,"delta":{"role":"assistant"},"finish_reason":null}]}
data: {"id":"chatcmpl-…","object":"chat.completion.chunk","created":1733300000,"model":"core","choices":[{"index":0,"delta":{"content":"Hello"},"finish_reason":null}]}
data: {"id":"chatcmpl-…","object":"chat.completion.chunk","created":1733300000,"model":"core","choices":[{"index":0,"delta":{},"finish_reason":"stop"}],"depaza_billing":{"web_searches":0,"balance_cents_after":2461}}
data: [DONE]
```

With `stream_options.include_usage: true`, a penultimate frame carries `"usage": {…}` with an
empty `choices` array. With `depaza_events: true`, extra frames are interleaved:

- `{"object":"depaza.tool","phase":"start|end","name":"web_search"}` — server-side tool activity
- `{"object":"depaza.phase","event":"start|end","phase":"…","label":"…"}` — document/expert progress
- `{"object":"depaza.questions", …}` — document-intake clarifying questions
- `{"object":"depaza.file","url":"/v1/files/123"}` — generated Office file (Document Mode)
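The frames above can be folded into a final answer with a few lines of parsing. This is a minimal sketch: `collect_stream` is a hypothetical helper that takes already-decoded SSE lines, and it assumes the custom `depaza.*` frames arrive as ordinary `data:` lines like the chunks do.

```python
import json


def collect_stream(lines):
    """Fold SSE data lines into (text, billing, depaza_events)."""
    text, billing, events = [], None, []
    for line in lines:
        if not line.startswith("data:"):
            continue                          # skip blank lines and comments
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":
            break
        frame = json.loads(payload)
        if frame.get("object", "").startswith("depaza."):
            events.append(frame)              # custom progress/tool frames
            continue
        billing = frame.get("depaza_billing", billing)
        for choice in frame.get("choices", []):
            text.append(choice.get("delta", {}).get("content") or "")
    return "".join(text), billing, events


sample = [
    'data: {"object":"chat.completion.chunk","choices":[{"index":0,"delta":{"role":"assistant"},"finish_reason":null}]}',
    'data: {"object":"depaza.tool","phase":"start","name":"web_search"}',
    'data: {"object":"chat.completion.chunk","choices":[{"index":0,"delta":{"content":"Hello"},"finish_reason":null}]}',
    'data: {"object":"chat.completion.chunk","choices":[{"index":0,"delta":{},"finish_reason":"stop"}],"depaza_billing":{"web_searches":0,"balance_cents_after":2461}}',
    "data: [DONE]",
]
print(collect_stream(sample))
```

A usage-only frame (empty `choices`) passes through the same loop without contributing text.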

---

## Tools & function calling

Depaza picks the tool mode from whether your request includes a `tools` array.

### Built-in tools — omit `tools`

Web search and live page reading run automatically inside the turn when useful. You receive a
finished, web-grounded answer — the same engine that powers the Depaza chat app. Nothing to
execute on your side.

### Your own tools — function calling

Send a `tools` array and Depaza behaves like a standard OpenAI endpoint: only your tools are
offered, your `system` message is respected, nothing runs server-side. The model returns
`tool_calls` with `finish_reason: "tool_calls"`. Execute them, append each result as a
`{role:"tool", tool_call_id, content}` message, and call again so the model can finish.

**Request — offer a tool:**

```sh
curl https://depaza.com/v1/chat/completions \
  -H "Authorization: Bearer $DEPAZA_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "core",
    "messages": [{"role": "user", "content": "What is the weather in Paris?"}],
    "tools": [{
      "type": "function",
      "function": {
        "name": "get_current_weather",
        "description": "Get the current weather for a city",
        "parameters": {
          "type": "object",
          "properties": {"city": {"type": "string"}},
          "required": ["city"]
        }
      }
    }]
  }'
```

**Response — the model calls it:**

```json
{
  "choices": [{
    "index": 0,
    "message": {
      "role": "assistant",
      "content": null,
      "tool_calls": [{
        "id": "call_abc", "type": "function",
        "function": { "name": "get_current_weather", "arguments": "{\"city\": \"Paris\"}" }
      }]
    },
    "finish_reason": "tool_calls"
  }]
}
```

Use `tool_choice` to force (`required` or a named function) or disable (`none`) tool use.
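The execute-and-append step of the loop can be sketched as follows. `run_tool_calls` and the stub `get_current_weather` are hypothetical; the assistant message is copied from the response above, and no network call is made.

```python
import json


def get_current_weather(city: str) -> str:
    return f"Sunny in {city}"                 # stand-in for a real lookup


def run_tool_calls(message: dict, registry: dict) -> list:
    """Execute each tool call and return the `tool` messages to append."""
    results = []
    for call in message.get("tool_calls") or []:
        fn = registry[call["function"]["name"]]
        args = json.loads(call["function"]["arguments"])   # arguments is a JSON string
        results.append({
            "role": "tool",
            "tool_call_id": call["id"],
            "content": str(fn(**args)),
        })
    return results


assistant = {
    "role": "assistant", "content": None,
    "tool_calls": [{"id": "call_abc", "type": "function",
                    "function": {"name": "get_current_weather",
                                 "arguments": "{\"city\": \"Paris\"}"}}],
}
history = [{"role": "user", "content": "What is the weather in Paris?"}, assistant]
history += run_tool_calls(assistant, {"get_current_weather": get_current_weather})
print(history[-1])
```

Send `history` back as `messages` (with the same `tools` array) so the model can produce its final answer.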

---

## `GET /v1/models`

OpenAI-shaped model list. **Auth:** Bearer.

```sh
curl https://depaza.com/v1/models -H "Authorization: Bearer $DEPAZA_API_KEY"
```

```json
{
  "object": "list",
  "data": [
    { "id": "lite", "object": "model", "owned_by": "depaza",
      "context_window": 128000, "max_output_tokens": 8192, "description": "…" },
    { "id": "core", "object": "model", "owned_by": "depaza", "context_window": 128000, "max_output_tokens": 8192, "description": "…" },
    { "id": "max",  "object": "model", "owned_by": "depaza", "context_window": 128000, "max_output_tokens": 8192, "description": "…" }
  ]
}
```

---

## `GET /v1/usage`

Bearer-authenticated balance/usage for the calling key. Shape depends on the key kind.

**Credit (`api`) key:**

```json
{ "email": "you@example.com", "plan": "pro", "mode": "credit", "balance_cents": 2461, "currency": "EUR" }
```

**Membership (`cli`) key:**

```json
{
  "email": "you@example.com", "plan": "max", "mode": "membership", "unit": "tokens",
  "cli_enabled": true, "upgrade_url": "https://depaza.com/pricing",
  "weekly": { "used": 1284233, "limit": 50000000, "resets_in_seconds": 412300 }
}
```
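Because the shape depends on the key kind, a client usually branches on `mode`. A small sketch, assuming only the fields shown in the two examples above (`describe_usage` is a hypothetical helper):

```python
def describe_usage(usage: dict) -> str:
    """Summarise a /v1/usage payload for either key kind."""
    if usage.get("mode") == "credit":
        # balance_cents is an integer number of cents
        return f"{usage['balance_cents'] / 100:.2f} {usage['currency']} credit left"
    weekly = usage["weekly"]
    remaining = weekly["limit"] - weekly["used"]
    hours = weekly["resets_in_seconds"] / 3600
    return f"{remaining} tokens left this week, resets in {hours:.1f} h"


print(describe_usage({"mode": "credit", "balance_cents": 2461, "currency": "EUR"}))
print(describe_usage({"mode": "membership",
                      "weekly": {"used": 1284233, "limit": 50000000, "resets_in_seconds": 412300}}))
```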

---

## `POST /v1/messages`

Anthropic-compatible Messages API. Drop-in for `anthropic.Anthropic(base_url="https://depaza.com")`
and the Claude CLI (`ANTHROPIC_BASE_URL`). The request is translated to Depaza's internal form,
run through the same engine, and translated back to Anthropic shape.

**Auth:** `x-api-key` or `Authorization: Bearer` · **Rate limit:** 120/min per user

### Request body

| Field | Type | Notes |
|-------|------|-------|
| `model` | string | Depaza id, or an Anthropic name (mapped: `haiku`→`lite`, else→`max`). Defaults to `core`. |
| `max_tokens` | int | **Required.** Clamped to 8192. |
| `messages` | array | **Required.** Anthropic message objects. `content` may be a string or a block array (`text`, `image`, `document`, `tool_use`, `tool_result`). |
| `system` | string/array | Optional system prompt (string or array of text blocks). |
| `tools` | array | Anthropic tool defs `{name, description, input_schema}` → mapped to function tools. |
| `tool_choice` | object | `{type:"auto"}` / `{type:"any"}` / `{type:"none"}` / `{type:"tool",name}`. |
| `stream` | bool | `true` → Anthropic SSE event stream. |
| `temperature`, `top_p`, `top_k` | number | Sampling. |
| `stop_sequences` | array | Stop strings. |

**Content blocks supported:** `text`; `image` (`source.type` `base64` with `media_type`+`data`, or
`url`); `document` (a `text`/`content`/`base64` source — PDFs and Office files are OCR/text-extracted
and inlined, since the text models don't read PDFs natively); `tool_use`; `tool_result`.

### Example (curl)

```sh
curl https://depaza.com/v1/messages \
  -H "x-api-key: $DEPAZA_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "core",
    "max_tokens": 1024,
    "system": "You are concise.",
    "messages": [{"role": "user", "content": "Name three EU capitals."}]
  }'
```

### Response (buffered)

```json
{
  "id": "msg_…",
  "type": "message",
  "role": "assistant",
  "model": "core",
  "content": [{ "type": "text", "text": "Paris, Berlin, Madrid." }],
  "stop_reason": "end_turn",
  "stop_sequence": null,
  "usage": { "input_tokens": 24, "output_tokens": 8 }
}
```

`stop_reason` is one of `end_turn`, `max_tokens`, `tool_use`. Tool calls appear as
`{"type":"tool_use","id","name","input"}` content blocks.

### Streaming (SSE)

Request `"stream": true`. Anthropic event sequence:

```
event: message_start
data: {"type":"message_start","message":{"id":"msg_…","type":"message","role":"assistant","model":"core","content":[],"stop_reason":null,"usage":{"input_tokens":0,"output_tokens":0}}}

event: ping
data: {"type":"ping"}

event: content_block_start
data: {"type":"content_block_start","index":0,"content_block":{"type":"text","text":""}}

event: content_block_delta
data: {"type":"content_block_delta","index":0,"delta":{"type":"text_delta","text":"Paris"}}

event: content_block_stop
data: {"type":"content_block_stop","index":0}

event: message_delta
data: {"type":"message_delta","delta":{"stop_reason":"end_turn","stop_sequence":null},"usage":{"output_tokens":8}}

event: message_stop
data: {"type":"message_stop"}
```

Tool calls stream as a `tool_use` content block whose arguments arrive via
`input_json_delta` (`partial_json`) deltas.
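The event sequence can be reassembled into content blocks as sketched below. `fold_events` is a hypothetical helper that works on already-decoded SSE lines; it assumes tool arguments are complete JSON once `content_block_stop` arrives.

```python
import json


def fold_events(lines):
    """Rebuild content blocks and the stop reason from SSE data lines."""
    blocks, stop_reason = {}, None
    for line in lines:
        if not line.startswith("data:"):
            continue                                  # `event:` and blank lines
        ev = json.loads(line[len("data:"):])
        kind = ev["type"]
        if kind == "content_block_start":
            blocks[ev["index"]] = dict(ev["content_block"], _json="")
        elif kind == "content_block_delta":
            block, delta = blocks[ev["index"]], ev["delta"]
            if delta["type"] == "text_delta":
                block["text"] += delta["text"]
            elif delta["type"] == "input_json_delta":
                block["_json"] += delta["partial_json"]
        elif kind == "content_block_stop":
            block = blocks[ev["index"]]
            raw = block.pop("_json")
            if block["type"] == "tool_use":
                block["input"] = json.loads(raw or "{}")
        elif kind == "message_delta":
            stop_reason = ev["delta"].get("stop_reason")
    return [blocks[i] for i in sorted(blocks)], stop_reason


sample = [
    "event: content_block_start",
    'data: {"type":"content_block_start","index":0,"content_block":{"type":"text","text":""}}',
    'data: {"type":"content_block_delta","index":0,"delta":{"type":"text_delta","text":"Paris"}}',
    'data: {"type":"content_block_stop","index":0}',
    'data: {"type":"message_delta","delta":{"stop_reason":"end_turn","stop_sequence":null},"usage":{"output_tokens":8}}',
    'data: {"type":"message_stop"}',
]
print(fold_events(sample))
```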

---

## `POST /v1/messages/count_tokens`

Estimate input tokens for a request (Anthropic-shaped). Approximate — uses a ~3.5 chars/token
heuristic, not an exact tokenizer.

```sh
curl https://depaza.com/v1/messages/count_tokens \
  -H "x-api-key: $DEPAZA_API_KEY" -H "Content-Type: application/json" \
  -d '{"model":"core","messages":[{"role":"user","content":"Hello"}]}'
```

```json
{ "input_tokens": 2 }
```
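For a rough local estimate, the same heuristic can be applied client-side. This sketch assumes the server rounds up and counts only message text; neither detail is stated above, so treat the result as a ballpark figure. `estimate_tokens` is a hypothetical helper.

```python
import math


def estimate_tokens(text: str, chars_per_token: float = 3.5) -> int:
    """Approximate token count: characters divided by ~3.5, rounded up (assumed)."""
    return math.ceil(len(text) / chars_per_token)


print(estimate_tokens("Hello"))   # 5 chars / 3.5, rounded up
```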

---

## Message Batches

Anthropic-compatible asynchronous batch API. Submit up to **10 000** message requests; they are
processed in the background and fetched by id once the batch ends. **Auth:** `x-api-key` or Bearer.

### `POST /v1/messages/batches` — create

Rate limit: 60/min per user. Each request needs a unique `custom_id` and `params` (the same shape
as a `/v1/messages` body).

```sh
curl https://depaza.com/v1/messages/batches \
  -H "x-api-key: $DEPAZA_API_KEY" -H "Content-Type: application/json" \
  -d '{
    "requests": [
      { "custom_id": "q1", "params": { "model": "core", "max_tokens": 256, "messages": [{"role":"user","content":"Capital of France?"}] } },
      { "custom_id": "q2", "params": { "model": "lite", "max_tokens": 256, "messages": [{"role":"user","content":"Capital of Spain?"}] } }
    ]
  }'
```

### Batch object

```json
{
  "id": "msgbatch_…",
  "type": "message_batch",
  "processing_status": "in_progress",
  "request_counts": { "processing": 2, "succeeded": 0, "errored": 0, "canceled": 0, "expired": 0 },
  "created_at": "2026-06-04T10:00:00+00:00",
  "ended_at": null,
  "expires_at": "2026-06-05T10:00:00+00:00",
  "cancel_initiated_at": null,
  "archived_at": null,
  "results_url": null
}
```

`processing_status` is `in_progress` or `ended`. When `ended`, `results_url` is populated.
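Waiting for a batch typically means polling `GET /v1/messages/batches/{id}` until the status flips. A minimal sketch, assuming you supply `fetch` (any callable that returns the batch object for an id); `wait_for_batch`, the interval and the timeout are illustrative choices, not documented values.

```python
import time


def wait_for_batch(fetch, batch_id, interval=5.0, timeout=3600.0, sleep=time.sleep):
    """Poll until processing_status is 'ended'; `fetch` is caller-supplied."""
    deadline = time.monotonic() + timeout
    while True:
        batch = fetch(batch_id)
        if batch["processing_status"] == "ended":
            return batch
        if time.monotonic() >= deadline:
            raise TimeoutError(f"batch {batch_id} still in progress")
        sleep(interval)


# Demo with a fake fetch so the sketch runs without network access.
states = iter(["in_progress", "in_progress", "ended"])
fake_fetch = lambda batch_id: {"id": batch_id, "processing_status": next(states)}
print(wait_for_batch(fake_fetch, "msgbatch_demo", sleep=lambda s: None))
```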

### `GET /v1/messages/batches/{id}` — retrieve (poll)

Returns the batch object for a single batch.

### `GET /v1/messages/batches` — list

```json
{ "data": [ { "id": "msgbatch_…", "...": "…" } ], "has_more": false }
```

### `GET /v1/messages/batches/{id}/results` — JSONL results

Available only when `processing_status` is `ended` (otherwise `400`). Returns
`application/x-jsonl`, one result per line, keyed by `custom_id`:

```jsonl
{"custom_id":"q1","result":{"type":"succeeded","message":{"id":"msg_…","type":"message","role":"assistant","content":[{"type":"text","text":"Paris."}],"stop_reason":"end_turn","usage":{"input_tokens":12,"output_tokens":3}}}}
{"custom_id":"q2","result":{"type":"succeeded","message":{"...":"…"}}}
```
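Since each line is an independent JSON document, the results body can be indexed by `custom_id` in a few lines. This is a sketch: `index_results` is a hypothetical helper, and the non-success result types (`errored` and so on) are assumed to mirror the `request_counts` keys shown earlier.

```python
import json


def index_results(jsonl: str) -> dict:
    """Map custom_id to (result type, joined text or None)."""
    out = {}
    for line in jsonl.splitlines():
        if not line.strip():
            continue                                   # tolerate blank lines
        row = json.loads(line)
        result = row["result"]
        text = None
        if result["type"] == "succeeded":
            blocks = result["message"].get("content", [])
            text = "".join(b["text"] for b in blocks if b.get("type") == "text")
        out[row["custom_id"]] = (result["type"], text)
    return out


sample = "\n".join([
    json.dumps({"custom_id": "q1", "result": {"type": "succeeded",
                "message": {"content": [{"type": "text", "text": "Paris."}]}}}),
    json.dumps({"custom_id": "q2", "result": {"type": "errored"}}),
])
print(index_results(sample))
```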

### `POST /v1/messages/batches/{id}/cancel`

Initiates cancellation and returns the updated batch object.

---

## `POST /v1/vision`

Dedicated EU vision model: turn image(s) into text so text-only models can "see" screenshots.
**Auth:** Bearer · **Paid plans only** · **Rate limit:** 60/min.

| Field | Type | Notes |
|-------|------|-------|
| `images` | array | **Required.** 1–4 base64 strings (or `data:` URLs), ≤ 8 MB each. |
| `prompt` | string | Defaults to "Describe this image in detail." |
| `temperature` | number | 0–2, default 0.2. |
| `max_tokens` | int | 64–4000, default 1800. |

```sh
curl https://depaza.com/v1/vision \
  -H "Authorization: Bearer $DEPAZA_API_KEY" -H "Content-Type: application/json" \
  -d '{"prompt":"What does this screenshot show?","images":["<base64>"]}'
```

```json
{ "text": "A login form with email and password fields…", "model": "mistral-small-3.2-24b-instruct-2506" }
```

---

## `POST /v1/files`

Extract text from a file: image OCR, scanned-PDF OCR, and mechanical text/PDF/Office extraction —
the same pipeline as chat attachments. **Auth:** Bearer · **Paid plans only** · **Rate limit:** 60/min ·
**Max 10 MB**. Supported: PDF, images (PNG/JPG/WebP), Office (DOCX/XLSX/PPTX), CSV, JSON, plain text.

Two input shapes:

- **Multipart:** field `file` (and optional `mime`).
- **JSON:** `{ "file": "<base64>", "mime": "application/pdf" }` (a `data:` URL is also accepted).

```sh
curl https://depaza.com/v1/files \
  -H "Authorization: Bearer $DEPAZA_API_KEY" \
  -F "file=@report.pdf"
```

```json
{ "text": "Q1 revenue grew 18%…", "mime": "application/pdf", "ocr": false }
```

`ocr` is `true` when OCR was used (scanned PDF / image).
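When the bytes are already in memory, the JSON shape avoids writing a temporary file. A minimal sketch of building that body; `file_request_body` is a hypothetical helper, and interpreting the 10 MB cap as 10 × 1024 × 1024 bytes of raw file data is an assumption.

```python
import base64
import json

MAX_FILE_BYTES = 10 * 1024 * 1024   # assumed reading of the documented 10 MB cap


def file_request_body(data: bytes, mime: str) -> str:
    """Build the JSON body for POST /v1/files from raw bytes."""
    if len(data) > MAX_FILE_BYTES:
        raise ValueError("file exceeds the 10 MB limit")
    encoded = base64.b64encode(data).decode("ascii")
    return json.dumps({"file": encoded, "mime": mime})


print(file_request_body(b"hello, depaza", "text/plain"))
```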

---

## `GET /v1/files/{id}`

Download an Office file produced by Document Mode or the `generate_*` tools, authorised by your
bearer key. (The web download at `/api/downloads/{id}` is session-only and unreachable from the
API/CLI; chat responses rewrite generated-file links to this endpoint.) Streams the binary with a
`Content-Disposition` attachment header.

```sh
curl -L https://depaza.com/v1/files/123 \
  -H "Authorization: Bearer $DEPAZA_API_KEY" -o report.docx
```

---

## `POST /v1/search`

EU-first web search (the same engine the chat uses). **Auth:** Bearer · **Paid plans only** ·
**Rate limit:** 60/min. EU search is always on; the US fallback respects your account's
`web_search` preference.

| Field | Type | Notes |
|-------|------|-------|
| `query` | string | **Required.** |
| `max_results` | int | 1–10, default 6. |

```sh
curl https://depaza.com/v1/search \
  -H "Authorization: Bearer $DEPAZA_API_KEY" -H "Content-Type: application/json" \
  -d '{"query":"EU AI Act enforcement 2026","max_results":5}'
```

```json
{
  "query": "EU AI Act enforcement 2026",
  "results": [
    { "title": "…", "url": "https://…", "snippet": "…", "published_at": "2026-05-30T09:00:00Z" }
  ]
}
```

---

## Transcription (audio → text)

Speech-to-text runs on EU-hosted Whisper (large-v3) and is available through three surfaces:

- **Public API** (`POST /v1/transcribe`) — simple, synchronous, great for agents, scripts and the Depaza Code CLI.
- **Web chat voice dictation** — real-time mic recording with client-side segmentation (up to 10 minutes continuous) + screen wake lock.
- **Long audio file attachments** — upload meetings, interviews or lectures of arbitrary length. Automatic background chunking + transcription for paid users.

All transcription runs on European infrastructure. Raw audio is deleted after processing.

### `POST /v1/transcribe` — Public API

**Auth:** Bearer (`dpz_live_…`) · **Paid plans only** · **Rate limit:** 60/min per user · **Max 25 MB per request**

The public endpoint is intentionally kept simple and synchronous for reliability and low latency on short-to-medium clips. It is the direct counterpart to the web mic button.

#### Input formats

**1. Multipart form (recommended for files from disk or SDKs)**

```sh
curl https://depaza.com/v1/transcribe \
  -H "Authorization: Bearer $DEPAZA_API_KEY" \
  -F "audio=@meeting.webm"
```

**2. JSON (base64 or data URL) — perfect for agents and when you already have the bytes**

```json
{
  "audio": "<base64-bytes or data:audio/webm;base64,...>",
  "filename": "meeting.webm"
}
```

`filename` is optional and helps MIME detection. A `data:` URL prefix is stripped automatically before decoding.

Supported audio formats include anything Whisper accepts: webm/opus, mp3, m4a, wav, flac, ogg, etc.

#### Response

```json
{
  "text": "The full transcribed text..."
}
```

Only the concatenated text is returned today.

#### Limitations (current implementation)

- No `language`, `prompt`, `temperature` or other Whisper parameters are forwarded yet.
- No word-level timestamps, segments or `verbose_json` / `srt` / `vtt` output. You always get plain `{ "text": "..." }`.
- Single request is capped at 25 MB (~25–30 minutes of typical speech, depending on format and silence).
- For longer recordings on the public API you must chunk client-side (or use the web chat audio attachment flow below, which handles chunking automatically).

We may expose more Whisper options in the future while keeping the simple shape for the common case.

#### Examples

**Python (using the OpenAI SDK – just change the base URL)**

```python
from openai import OpenAI

client = OpenAI(
    api_key="dpz_live_…",
    base_url="https://depaza.com/v1",
)

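with open("note.webm", "rb") as f:
    # The SDK posts this call to {base_url}/audio/transcriptions; the example
    # assumes Depaza routes that path to the same handler as /v1/transcribe.
    resp = client.audio.transcriptions.create(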
        model="whisper-1",   # ignored – we use our configured large-v3
        file=f,
    )
print(resp.text)
```

**Node.js**

```js
import OpenAI from "openai";
import fs from "fs";

const client = new OpenAI({
  apiKey: "dpz_live_…",
  baseURL: "https://depaza.com/v1",
});

const transcription = await client.audio.transcriptions.create({
  file: fs.createReadStream("meeting.m4a"),
  model: "whisper-1",
});

console.log(transcription.text);
```

**JSON base64 (curl)**

```sh
AUDIO_B64=$(base64 < note.webm | tr -d '\n')   # portable: -w0 is GNU-only
curl https://depaza.com/v1/transcribe \
  -H "Authorization: Bearer $DEPAZA_API_KEY" \
  -H "Content-Type: application/json" \
  -d "{\"audio\":\"$AUDIO_B64\",\"filename\":\"note.webm\"}"
```

#### Errors

| Status | Typical cause |
|--------|---------------|
| 402    | Paid plan required |
| 413    | Audio > 25 MB |
| 429    | Rate limit exceeded (honour `Retry-After`) |
| 502    | Upstream transcription failure (transient – retry with backoff) |

Every response includes an `x-request-id` header for support.

### Transcription in the web chat (recommended for long or interactive use)

**Voice dictation (mic button in the composer)**

- Tap or hold the microphone icon.
- Client-side segmentation (≈ 3-minute chunks) lets you dictate continuously for up to 10 minutes without hitting server timeouts or Cloudflare limits.
- A screen wake lock is acquired while recording and during the final transcription so your computer or phone does not sleep or show the screensaver.
- Result is inserted into the composer (never auto-sent) so you can review and edit before sending.
- Works on free and paid plans (subject to overall usage).

**Audio file attachments (long recordings)**

- Drop or attach an audio file (mp3, m4a, webm, wav, etc.) of any practical length.
- Paid users only (long-form transcription is compute-heavy).
- The file is stored and a background job transcodes it (16 kHz mono) and splits it into ~8-minute chunks that are sent to Whisper.
- You see live progress (“Transcribing…”, page progress for very long files).
- When complete the full transcript is attached to the file and can be referenced in chat or downloaded as .txt.
- Partial failures are handled gracefully (successful segments are kept + a note is appended).

Together, these two flows cover long-form dictation while the public `/v1/transcribe` endpoint stays lightweight and synchronous.

**Best practice**

- Short clips, agents, automation → use `/v1/transcribe`.
- Interviews, meetings, thinking out loud → use the web mic (up to 10 min) or attach the file.
- Need timestamps/segments today → the web attachment flow currently also returns plain text; let us know if you need a richer format.

See also the in-product help (the “?” in chat) and the Voice input section for UI details.

---

## CLI session sync

Opt-in sync of CLI coding transcripts so they show up in the web dashboard (`/sessions`).
**Auth:** Bearer · **Paid plans only**.

### `POST /v1/sessions` — upsert (rate limit 120/min)

Upserts a conversation keyed on `(user, session_id)` and **replaces** its messages (send the full
transcript each turn).

| Field | Type | Notes |
|-------|------|-------|
| `session_id` | string | **Required.** Stable client-generated id. |
| `messages` | array | `{role, content, tool_calls?, tool_call_id?}` (roles: system/user/assistant/tool). |
| `title` | string | Optional (≤ 200 chars). |
| `model` | string | Optional. |
| `status` | string | `active` or `ended`. |

```json
{ "ok": true, "id": 8842 }
```
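Because each upsert replaces the stored messages, the client has to send the whole transcript every time. A sketch of assembling and validating that body against the table above; `session_payload` is a hypothetical helper, and always sending `status` is a choice of this sketch, not a documented requirement.

```python
ROLES = {"system", "user", "assistant", "tool"}


def session_payload(session_id, messages, title=None, model=None, status="active"):
    """Build a POST /v1/sessions body carrying the full transcript."""
    if not session_id:
        raise ValueError("session_id is required")
    if status not in ("active", "ended"):
        raise ValueError("status must be 'active' or 'ended'")
    if title is not None and len(title) > 200:
        raise ValueError("title is limited to 200 characters")
    bad = [m["role"] for m in messages if m["role"] not in ROLES]
    if bad:
        raise ValueError(f"unsupported roles: {bad}")
    body = {"session_id": session_id, "messages": list(messages), "status": status}
    if title is not None:
        body["title"] = title
    if model is not None:
        body["model"] = model
    return body


transcript = [{"role": "user", "content": "Refactor utils.py"}]
print(session_payload("sess-demo-1", transcript, title="Refactor", model="core"))
```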

### `GET /v1/sessions` — list

```json
{ "sessions": [ { "session_id": "…", "title": "…", "model": "core", "updated_at": "…", "status": "ended", "message_count": 42 } ] }
```

### `GET /v1/sessions/{id}` — fetch transcript

```json
{ "session_id": "…", "model": "core", "status": "ended",
  "messages": [ { "role": "user", "content": "…" }, { "role": "assistant", "content": "…", "tool_calls": [ … ] } ] }
```

---

## API keys

Manage keys programmatically from the **logged-in web session** (these routes use session auth, not
bearer — they back the Settings UI).

| Method | Path | Notes |
|--------|------|-------|
| `GET` | `/v1/keys` | List your keys (metadata only — never the plaintext). |
| `POST` | `/v1/keys` | Body `{name?}`. Returns `{id, token, name}` — `token` is shown **once**. |
| `DELETE` | `/v1/keys/{id}` | Revoke a key. |

Example `POST /v1/keys` response:

```json
{ "id": 17, "token": "dpz_live_…", "name": "production" }
```

---

## Credit & billing

`api` keys are billed against a **prepaid EUR balance**; `cli` keys are included in your plan.
Usage is billed at the live per-model rates shown on https://depaza.com/docs#pricing (and reflected
in `GET /v1/models`). External web searches are billed only when a query actually reaches the
external engine; answers from Depaza's own index or cache are free.

- **Minimum top-up:** €25. **Auto-recharge:** optionally top up from a saved card when the balance
  falls below a threshold you set.
- An empty balance (or a failed auto-recharge) returns **HTTP 402** until you top up.

Account/billing management routes (session auth, web UI):

| Method | Path | Notes |
|--------|------|-------|
| `GET` | `/v1/billing/account` | `{balance_cents, currency, auto_recharge, threshold_cents, recharge_cents, recharge_state, has_payment_method}`. |
| `POST` | `/v1/billing/settings` | Update `{auto_recharge, threshold_cents, recharge_cents}`. |
| `POST` | `/v1/billing/topup` | Body `{amount_cents}` (min €25). Returns a Stripe Checkout `{url}`. |

For programmatic balance checks with a bearer key, use [`GET /v1/usage`](#get-v1usage).

---

## Rate limits

Limits are enforced over sliding windows: one minute for every endpoint, plus the hourly, daily or weekly windows listed below. Exceeding one returns **HTTP 429** with a
`Retry-After` header (seconds).

| Endpoint | Limit |
|----------|-------|
| `POST /v1/chat/completions` (`api` key) | 120/min, 10 000/day per key |
| `POST /v1/chat/completions` (`cli` key) | 120/min, 2 000/hour per user + rolling weekly token cap |
| `POST /v1/messages` | 120/min per user |
| `POST /v1/messages/batches` | 60/min per user |
| `POST /v1/sessions` | 120/min per user |
| `POST /v1/vision`, `/v1/files`, `/v1/search`, `/v1/transcribe` | 60/min per user |

Every `/v1/*` response carries an `x-request-id: req_…` header for support/tracing.
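A client can turn these rules into a retry decision. The sketch below honours `Retry-After` on 429 as documented; the exponential backoff with jitter for 5xx responses, the 60-second cap and the helper name `retry_delay` are assumptions of this example, not Depaza policy.

```python
import random


def retry_delay(status: int, headers: dict, attempt: int, cap: float = 60.0):
    """Seconds to wait before retrying, or None if the error is not retryable."""
    lowered = {k.lower(): v for k, v in headers.items()}   # header names are case-insensitive
    if status == 429 and "retry-after" in lowered:
        return min(float(lowered["retry-after"]), cap)
    if status == 429 or status >= 500:
        # Assumed policy: exponential backoff plus a little jitter.
        return min(cap, 2.0 ** attempt) + random.uniform(0, 0.5)
    return None


print(retry_delay(429, {"Retry-After": "7"}, attempt=0))
print(retry_delay(401, {}, attempt=0))
```

Logging the `x-request-id` value alongside each retry makes later support requests easier to trace.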

---

## Errors

Errors use the OpenAI envelope on most routes:

```json
{ "error": { "message": "…", "type": "invalid_request_error", "code": "invalid_api_key" } }
```

The Anthropic-compatible routes (`/v1/messages*`, batches) use the Anthropic envelope:

```json
{ "type": "error", "error": { "type": "authentication_error", "message": "invalid x-api-key" } }
```

| Status | When | Typical `code` |
|--------|------|----------------|
| `400` | Invalid request — bad model id, malformed/empty `messages`, missing `max_tokens`, unsupported file. | `model_not_found`, `empty_extraction`, … |
| `401` | Invalid or missing API key. | `invalid_api_key` |
| `402` | Insufficient credit / failed auto-recharge, or a paid-plan / weekly-cap gate. | `insufficient_credit`, `plan_required` |
| `403` | Account suspended, or Code CLI access requires a paid plan (or explicit grant). | `account_suspended`, `early_access_required` |
| `404` | Batch / session / file not found. | `not_found` |
| `429` | Rate limited — honour `Retry-After`. | `rate_limit_exceeded` |
| `5xx` | Upstream model error / transient failure. | `server_error`, `api_error` |
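Since two envelopes are in play, client code often normalises them before handling. A minimal sketch, assuming only the two shapes shown above; `parse_error` and the output keys are hypothetical.

```python
def parse_error(status: int, body: dict) -> dict:
    """Normalise OpenAI-style and Anthropic-style error bodies."""
    err = body.get("error") or {}
    is_anthropic = body.get("type") == "error"     # Anthropic envelope has a top-level type
    return {
        "status": status,
        "kind": err.get("type"),
        "code": None if is_anthropic else err.get("code"),
        "message": err.get("message"),
    }


print(parse_error(401, {"error": {"message": "bad key", "type": "invalid_request_error",
                                  "code": "invalid_api_key"}}))
print(parse_error(401, {"type": "error",
                        "error": {"type": "authentication_error", "message": "invalid x-api-key"}}))
```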

---

## Build it with the Depaza Code CLI

Don't hand-write the client — the **Depaza Code CLI** already knows this API (it reads this exact
reference) and can scaffold, run and debug an integration in your terminal.

```sh
curl -fsSL https://depaza.com/install.sh | sh   # install
depaza auth                                       # connect this terminal
depaza docs                                       # print this API reference
depaza "build a Python client for the Depaza /v1/messages API with streaming"
```

The CLI is your European dev team — Emma plans, Bob builds, Max reviews the architecture and Anna
catches the bugs — and a council of EU models out-builds a single frontier model. Everything runs
on European soil. See https://depaza.com/cli.

Because this document is served at **https://depaza.com/llms.txt**, you can also point any other
agent or LLM at that URL to give it accurate, current knowledge of the Depaza API.