
Quatarly API

A unified, OpenAI-compatible API that gives you access to Claude, Gemini, and GPT models through a single key. Drop-in replacement for any OpenAI SDK, Cursor, Factory AI, Claude Code, OpenCode, and more.

Overview

Quatarly provides a unified REST API that proxies requests to Anthropic (Claude), Google (Gemini), and OpenAI (GPT) through a single endpoint and a single API key. All responses conform to the OpenAI Chat Completions schema, making it a true drop-in replacement anywhere you currently use api.openai.com.

Fully OpenAI-compatible. Point your existing code at https://api.quatarly.cloud/v1, swap the key, and everything works — streaming, tool calling, system prompts, and all.
All models use the OpenAI Chat Completions request format. Quatarly handles translation to the native provider format internally — you never have to change your payload structure.

Authentication

Every request must include your Quatarly API key as a Bearer token in the Authorization header.

HTTP Header
Authorization: Bearer your-api-key-here

Get your free key at api.quatarly.cloud/api-key.

Keep your key secret. Never commit it to source code, expose it in client-side JavaScript, or share it publicly. Treat it like a password.
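
A common way to keep the key out of source code is to load it from an environment variable at startup. A minimal Python sketch (the QUATARLY_API_KEY variable name is just a convention for illustration, not something the API requires):

```python
import os

def build_auth_headers() -> dict:
    # Read the key from the environment so it never appears in source control.
    # QUATARLY_API_KEY is an illustrative name; any variable name works.
    api_key = os.environ.get("QUATARLY_API_KEY")
    if not api_key:
        raise RuntimeError("QUATARLY_API_KEY is not set")
    return {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json",
    }
```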

Base URL

Base URL https://api.quatarly.cloud
OpenAI-compatible (v1) https://api.quatarly.cloud/v1

All endpoints live under /v1. The difference is just what base URL you give each client — the client appends the rest automatically.

How clients use the base URL:

OpenAI SDK / curl / OpenCode — you set base_url = "https://api.quatarly.cloud/v1" and the SDK appends /chat/completions. Works for all models (Claude, Gemini, GPT).

Claude Code / Anthropic SDK — you set ANTHROPIC_BASE_URL = "https://api.quatarly.cloud/" (root) and the client appends /v1/messages itself, which uses the Anthropic Messages API format. Claude models only.
| Endpoint | Method | Format | Models |
|---|---|---|---|
| /v1/chat/completions | POST | OpenAI | All (Claude, Gemini, GPT) |
| /v1/messages | POST | Anthropic | Claude only — if /v1/chat/completions doesn't work for you with a Claude model, use this |
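
Concretely, the client-side difference is just how the final URL is assembled. A quick sketch of the full URLs each kind of client ends up calling:

```python
# OpenAI-style clients take the /v1 base and append the chat endpoint.
openai_base = "https://api.quatarly.cloud/v1"
chat_url = openai_base + "/chat/completions"

# Claude Code takes the root base and appends the full /v1/messages path itself.
anthropic_base = "https://api.quatarly.cloud/"
messages_url = anthropic_base.rstrip("/") + "/v1/messages"

print(chat_url)      # https://api.quatarly.cloud/v1/chat/completions
print(messages_url)  # https://api.quatarly.cloud/v1/messages
```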

Rate Limits

Rate limits are enforced per API key. Limits vary by key tier and model. A 429 Too Many Requests response is returned when exceeded.

| Limit Type | Trial Key | Full Key |
|---|---|---|
| Requests / minute (RPM) | 70 | Custom (per plan) |
| Monthly credits | Limited | Plan-based |
| Concurrent requests | 5 | Unlimited |
Credits system: Each request deducts credits based on tokens used. Different models have different credit weights — Claude Opus costs more than Haiku, for example. Check your usage in the management portal.
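
A standard way to handle 429 responses is exponential backoff. A minimal sketch, assuming only that your HTTP client exposes the status code on the response object:

```python
import time

def backoff_delays(max_retries: int = 5, base: float = 1.0) -> list:
    # Exponential backoff schedule: 1s, 2s, 4s, 8s, 16s with the defaults.
    return [base * (2 ** attempt) for attempt in range(max_retries)]

def call_with_retry(send_request, max_retries: int = 5, base: float = 1.0):
    # send_request() is any zero-argument callable returning an object
    # with a .status_code attribute (e.g. a requests.Response).
    for delay in backoff_delays(max_retries, base):
        response = send_request()
        if response.status_code != 429:
            return response
        time.sleep(delay)  # back off before retrying
    raise RuntimeError("rate limited after all retries")
```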

All Models

All models below are accessible via the standard OpenAI Chat Completions API format using your Quatarly key.

Claude (Anthropic)

  • claude-sonnet-4-6-thinking (Anthropic · Sonnet, provider: anthropic)
  • claude-opus-4-6-thinking (Anthropic · Opus, provider: anthropic)
  • claude-haiku-4-5-20251001 (Anthropic · Haiku, provider: anthropic)

Gemini (Google)

  • gemini-3.1-pro (Google · Pro, provider: openai compat)
  • gemini-3-flash (Google · Flash, provider: openai compat)

GPT (OpenAI)

  • gpt-5.1 (OpenAI, provider: openai)
  • gpt-5.1-codex (OpenAI · Codex, provider: openai)
  • gpt-5.1-codex-max (OpenAI · Codex Max, provider: openai)
  • gpt-5.2 (OpenAI, provider: openai)
  • gpt-5.2-codex (OpenAI · Codex, provider: openai)
  • gpt-5.3-codex (OpenAI · Codex, provider: openai)
  • gpt-5.4 (OpenAI, provider: openai)

Factory AI Droid

Connect Factory AI Droid to Quatarly to access all models with a single API key. The setup script patches ~/.factory/settings.json automatically with all model entries.

  • Install Factory AI

    powershell
    irm https://app.factory.ai/cli/windows | iex
    bash
    curl -fsSL https://app.factory.ai/cli | sh
  • Create a Factory Account & Login

    Go to app.factory.ai and create a free account. Then run droid in your terminal and log in to create ~/.factory/settings.json.

  • Run the Setup Script

    You need your Quatarly API key (qua_trail_... or qua_...).

    powershell
    irm https://raw.githubusercontent.com/himanshu91081/Quatarly-setup/main/add-quatarly-models.ps1 -OutFile add-quatarly-models.ps1; .\add-quatarly-models.ps1
    bash
    curl -fsSL https://raw.githubusercontent.com/himanshu91081/Quatarly-setup/main/add-quatarly-models.sh -o add-quatarly-models.sh && bash add-quatarly-models.sh

    The script will prompt for your key and add an entry for every Quatarly model to settings.json. Running it again with a new key safely updates existing entries without creating duplicates.

Verify Setup

powershell
Select-String -Pattern "customModels" -Path "$env:USERPROFILE\.factory\settings.json" -Context 0,5
bash
grep -A 5 "customModels" ~/.factory/settings.json

Expected snippet in settings.json:

json
"customModels": [
  {
    "model":       "claude-sonnet-4-6-thinking",
    "id":          "custom:claude-sonnet-4-6-thinking-0",
    "baseUrl":     "https://api.quatarly.cloud/",
    "apiKey":      "your-api-key",
    "provider":    "anthropic",
    "displayName": "claude-sonnet-4-6-thinking"
  },
  {
    "model":       "gpt-5.1",
    "id":          "custom:gpt-5.1-5",
    "baseUrl":     "https://api.quatarly.cloud/v1",
    "apiKey":      "your-api-key",
    "provider":    "openai",
    "displayName": "gpt-5.1"
  }
]
A backup of your original settings.json is saved as settings.json.backup before any changes. The script requires Python 3.

Claude Code

Use Claude Code as a CLI coding agent routed through Quatarly. No Anthropic account needed — just your Quatarly key.

  • Install Claude Code

    bash
    npm install -g @anthropic-ai/claude-code
  • Set Environment Variables

    Option A — Setup script (recommended, persists across restarts):

    powershell
    irm https://raw.githubusercontent.com/himanshu91081/Quatarly-setup/main/set-claude-env.ps1 -OutFile set-claude-env.ps1; .\set-claude-env.ps1 -ApiKey "your-api-key-here"
    bash
    curl -fsSL https://raw.githubusercontent.com/himanshu91081/Quatarly-setup/main/set-claude-env.sh -o set-claude-env.sh && bash set-claude-env.sh your-api-key-here

    Option B — Set manually for current session only:

    bash
    export ANTHROPIC_BASE_URL="https://api.quatarly.cloud/"
    export ANTHROPIC_AUTH_TOKEN="your-api-key-here"
    export ANTHROPIC_DEFAULT_HAIKU_MODEL="claude-haiku-4-5-20251001"
    export ANTHROPIC_DEFAULT_SONNET_MODEL="claude-sonnet-4-6-thinking"
    export ANTHROPIC_DEFAULT_OPUS_MODEL="claude-opus-4-6-thinking"
    powershell
    $env:ANTHROPIC_BASE_URL             = "https://api.quatarly.cloud/"
    $env:ANTHROPIC_AUTH_TOKEN           = "your-api-key-here"
    $env:ANTHROPIC_DEFAULT_HAIKU_MODEL  = "claude-haiku-4-5-20251001"
    $env:ANTHROPIC_DEFAULT_SONNET_MODEL = "claude-sonnet-4-6-thinking"
    $env:ANTHROPIC_DEFAULT_OPUS_MODEL   = "claude-opus-4-6-thinking"
    | Variable | Value |
    |---|---|
    | ANTHROPIC_BASE_URL | https://api.quatarly.cloud/ |
    | ANTHROPIC_AUTH_TOKEN | Your Quatarly API key |
    | ANTHROPIC_DEFAULT_HAIKU_MODEL | claude-haiku-4-5-20251001 |
    | ANTHROPIC_DEFAULT_SONNET_MODEL | claude-sonnet-4-6-thinking |
    | ANTHROPIC_DEFAULT_OPUS_MODEL | claude-opus-4-6-thinking |
    After using the setup script, run source ~/.zshrc (macOS) or source ~/.bashrc (Linux) to pick up the changes in the current terminal. GUI apps require a full logout/restart.
    Watch: Claude Code Setup Guide
  • Launch Claude Code

    bash
    claude

    Claude Code will route all requests through Quatarly using your key and credit balance.
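
If Claude Code doesn't seem to pick up the configuration, a quick sanity check is to confirm all five variables are visible in the shell that launched it. A small sketch:

```python
import os

REQUIRED_VARS = [
    "ANTHROPIC_BASE_URL",
    "ANTHROPIC_AUTH_TOKEN",
    "ANTHROPIC_DEFAULT_HAIKU_MODEL",
    "ANTHROPIC_DEFAULT_SONNET_MODEL",
    "ANTHROPIC_DEFAULT_OPUS_MODEL",
]

def missing_vars(env=os.environ) -> list:
    # Return the names of any required variables that are unset or empty.
    return [name for name in REQUIRED_VARS if not env.get(name)]
```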

OpenCode

Use OpenCode as a terminal AI coding assistant routed through Quatarly. No OpenAI account needed — just your Quatarly key.

  • Install OpenCode

    bash
    npm install -g opencode-ai
  • Create the Config File

    Edit or create ~/.config/opencode/opencode.json:

    json (~/.config/opencode/opencode.json)
    {
        "$schema": "https://opencode.ai/config.json",
        "provider": {
            "openai": {
                "options": {
                    "baseURL": "https://api.quatarly.cloud/v1",
                    "apiKey":  "your-api-key-here"
                }
            }
        },
        "model": "openai/gpt-5.3-codex"
    }

    Create it automatically from the terminal:

    powershell
    $dir = "$env:USERPROFILE\.config\opencode"
    if (!(Test-Path $dir)) { New-Item -ItemType Directory -Force -Path $dir }
    @'
    {
        "$schema": "https://opencode.ai/config.json",
        "provider": {
            "openai": {
                "options": {
                    "baseURL": "https://api.quatarly.cloud/v1",
                    "apiKey": "your-api-key-here"
                }
            }
        },
        "model": "openai/gpt-5.3-codex"
    }
    '@ | Set-Content "$dir\opencode.json"
    bash
    mkdir -p ~/.config/opencode
    cat > ~/.config/opencode/opencode.json << 'EOF'
    {
        "$schema": "https://opencode.ai/config.json",
        "provider": {
            "openai": {
                "options": {
                    "baseURL": "https://api.quatarly.cloud/v1",
                    "apiKey": "qua_trail_your-key-here"
                }
            }
        },
        "model": "openai/gpt-5.3-codex"
    }
    EOF
  • Launch OpenCode

    bash
    opencode

    OpenCode routes all requests through Quatarly. Switch models with /model inside the session — all Quatarly GPT, Gemini, and Claude models appear under the openai provider.

Chat Completions

POST /v1/chat/completions

Send a chat message to any model. The request and response formats are identical to those of the OpenAI Chat Completions API.

Request Body

| Parameter | Type | Required | Description |
|---|---|---|---|
| model | string | required | Model ID from the models list (e.g. gpt-5.3-codex) |
| messages | array | required | Array of message objects with role and content |
| stream | boolean | optional | Stream tokens via SSE. Default: false |
| max_tokens | integer | optional | Max output tokens. Model default if omitted |
| temperature | number | optional | Sampling temperature 0–2. Default: 1 |
| top_p | number | optional | Nucleus sampling. Default: 1 |
| system | string | optional | System prompt (alternative to a system role in messages) |
| tools | array | optional | Tool definitions for function calling |

Example Request

json
{
  "model": "claude-sonnet-4-6-thinking",
  "messages": [
    { "role": "system", "content": "You are a helpful assistant." },
    { "role": "user",   "content": "Explain quantum entanglement simply." }
  ],
  "stream": false,
  "max_tokens": 1024
}

Response Format

json
{
  "id": "chatcmpl-xyz123",
  "object": "chat.completion",
  "created": 1750000000,
  "model": "claude-sonnet-4-6-thinking",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "Quantum entanglement is when two particles..."
      },
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 32,
    "completion_tokens": 118,
    "total_tokens": 150
  }
}
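
Because every model returns this same shape, one extraction helper works for Claude, Gemini, and GPT alike. A minimal sketch operating on the already-parsed JSON body:

```python
def extract_reply(response: dict) -> tuple:
    # Pull the assistant text and total token usage out of a
    # Chat Completions response body (a dict parsed from JSON).
    content = response["choices"][0]["message"]["content"]
    total_tokens = response["usage"]["total_tokens"]
    return content, total_tokens
```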

Parameters Reference

Common parameters across all models. Provider-specific parameters are passed through transparently.

| Parameter | Type | Default | Notes |
|---|---|---|---|
| temperature | float | 1.0 | 0 = deterministic, 2 = very random |
| max_tokens | int | model max | Hard cap on output length |
| stream | bool | false | SSE streaming. Use -N flag with curl |
| top_p | float | 1.0 | Nucleus sampling probability cutoff |
| stop | string[] | none | Stop sequences (up to 4) |
| presence_penalty | float | 0 | GPT models only |
| frequency_penalty | float | 0 | GPT models only |
| tools | array | none | Function calling tool definitions |
| tool_choice | string/object | "auto" | Tool selection mode |
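
As an illustration of the tools and tool_choice parameters, here is a minimal function-calling payload in the OpenAI tool schema. The get_weather function is hypothetical; your application supplies the real definitions:

```python
payload = {
    "model": "claude-sonnet-4-6-thinking",
    "messages": [{"role": "user", "content": "What's the weather in Paris?"}],
    "tools": [
        {
            "type": "function",
            "function": {
                # Hypothetical tool, shown only to illustrate the schema.
                "name": "get_weather",
                "description": "Get current weather for a city",
                "parameters": {
                    "type": "object",
                    "properties": {"city": {"type": "string"}},
                    "required": ["city"],
                },
            },
        }
    ],
    "tool_choice": "auto",
}
```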

cURL Examples

Basic Chat

bash
curl -X POST "https://api.quatarly.cloud/v1/chat/completions" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer your-api-key-here" \
  -d '{
    "model": "gpt-5.3-codex",
    "messages": [{"role": "user", "content": "Hello!"}]
  }'

Streaming Response

bash
curl -X POST "https://api.quatarly.cloud/v1/chat/completions" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer your-api-key-here" \
  -N \
  -d '{
    "model": "claude-sonnet-4-6-thinking",
    "stream": true,
    "messages": [{"role": "user", "content": "Write a poem about the ocean."}]
  }'
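
With stream: true the body arrives as Server-Sent Events: each data: line carries a JSON chunk, and the stream ends with a data: [DONE] sentinel. A minimal parser sketch for lines already split from the response body:

```python
import json

def parse_sse_chunks(lines):
    # Yield the text delta from each `data:` line, stopping at [DONE].
    for line in lines:
        if not line.startswith("data: "):
            continue
        body = line[len("data: "):].strip()
        if body == "[DONE]":
            break
        chunk = json.loads(body)
        delta = chunk["choices"][0].get("delta", {}).get("content")
        if delta:
            yield delta
```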

Claude with System Prompt

bash
curl -X POST "https://api.quatarly.cloud/v1/chat/completions" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer your-api-key-here" \
  -d '{
    "model": "claude-opus-4-6-thinking",
    "messages": [
      {"role": "system", "content": "You are an expert Python developer."},
      {"role": "user",   "content": "Refactor this function for readability."}
    ],
    "max_tokens": 2048
  }'

Gemini with High Temperature

bash
curl -X POST "https://api.quatarly.cloud/v1/chat/completions" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer your-api-key-here" \
  -d '{
    "model": "gemini-3.1-pro",
    "messages": [{"role": "user", "content": "Brainstorm 10 startup ideas."}],
    "temperature": 1.4,
    "max_tokens": 1024
  }'

Claude via /v1/messages (Anthropic format)

If /v1/chat/completions isn't working with a Claude-only tool, try the Anthropic Messages endpoint instead. Note max_tokens is required and the system prompt is a top-level field.

bash
curl -X POST "https://api.quatarly.cloud/v1/messages" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer your-api-key-here" \
  -H "anthropic-version: 2023-06-01" \
  -d '{
    "model": "claude-sonnet-4-6-thinking",
    "max_tokens": 1024,
    "system": "You are a helpful assistant.",
    "messages": [
      {"role": "user", "content": "Explain transformers in ML simply."}
    ]
  }'
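
The same call from Python, shown here only as payload construction (send it with any HTTP client). Note the two differences from the OpenAI shape: system is a top-level field and max_tokens is mandatory.

```python
def build_messages_payload(model: str, user_text: str,
                           system: str, max_tokens: int = 1024) -> dict:
    # Anthropic Messages format: the system prompt lives at the top level,
    # never in the messages array, and max_tokens must always be present.
    return {
        "model": model,
        "max_tokens": max_tokens,
        "system": system,
        "messages": [{"role": "user", "content": user_text}],
    }
```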

Python (OpenAI SDK)

Install the official OpenAI library and point it at Quatarly — no other changes needed.

bash
pip install openai

Basic Usage

python
from openai import OpenAI

client = OpenAI(
    api_key="your-api-key-here",
    base_url="https://api.quatarly.cloud/v1",
)

response = client.chat.completions.create(
    model="claude-sonnet-4-6-thinking",
    messages=[
        {"role": "user", "content": "Summarise the last decade of AI progress."}
    ],
    max_tokens=1024,
)

print(response.choices[0].message.content)

Streaming

python
stream = client.chat.completions.create(
    model="gpt-5.3-codex",
    messages=[{"role": "user", "content": "Write a sorting algorithm."}],
    stream=True,
)

for chunk in stream:
    delta = chunk.choices[0].delta.content
    if delta:
        print(delta, end="", flush=True)

Node.js (OpenAI SDK)

bash
npm install openai

Basic Usage

javascript
import OpenAI from "openai";

const client = new OpenAI({
  apiKey: "your-api-key-here",
  baseURL: "https://api.quatarly.cloud/v1",
});

const response = await client.chat.completions.create({
  model: "gemini-3.1-pro",
  messages: [{ role: "user", content: "What is the capital of France?" }],
});

console.log(response.choices[0].message.content);

Streaming

javascript
const stream = await client.chat.completions.create({
  model: "claude-haiku-4-5-20251001",
  messages: [{ role: "user", content: "Tell me a short story." }],
  stream: true,
});

for await (const chunk of stream) {
  process.stdout.write(chunk.choices[0]?.delta?.content ?? "");
}

Models Table

| Model ID | Family | Provider | Base URL |
|---|---|---|---|
| claude-sonnet-4-6-thinking | Claude | anthropic | ...quatarly.cloud/ |
| claude-opus-4-6-thinking | Claude | anthropic | ...quatarly.cloud/ |
| claude-haiku-4-5-20251001 | Claude | anthropic | ...quatarly.cloud/ |
| gemini-3.1-pro | Gemini | openai compat | ...quatarly.cloud/v1 |
| gemini-3-flash | Gemini | openai compat | ...quatarly.cloud/v1 |
| gpt-5.1 | GPT | openai | ...quatarly.cloud/v1 |
| gpt-5.1-codex | GPT | openai | ...quatarly.cloud/v1 |
| gpt-5.1-codex-max | GPT | openai | ...quatarly.cloud/v1 |
| gpt-5.2 | GPT | openai | ...quatarly.cloud/v1 |
| gpt-5.2-codex | GPT | openai | ...quatarly.cloud/v1 |
| gpt-5.3-codex | GPT | openai | ...quatarly.cloud/v1 |
| gpt-5.4 | GPT | openai | ...quatarly.cloud/v1 |

Error Codes

All errors follow the OpenAI error response format with an additional code field.

| HTTP Status | Meaning | Common Cause |
|---|---|---|
| 400 | Bad Request | Missing required field, invalid model name, malformed JSON |
| 401 | Unauthorized | Missing or invalid API key |
| 402 | Payment Required | Monthly credits exhausted for your key |
| 429 | Too Many Requests | Rate limit exceeded — wait and retry |
| 500 | Internal Server Error | Upstream provider error; retry with backoff |
| 503 | Service Unavailable | Provider temporarily unreachable |
json — error response shape
{
  "error": {
    "message": "Rate limit exceeded. Please retry after 60 seconds.",
    "type":    "rate_limit_error",
    "code":    "rate_limit_exceeded"
  }
}
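
Because the error body follows one shape, you can branch on the HTTP status regardless of which upstream provider failed. A minimal sketch, with the retryable statuses taken from the error table (429, 500, 503):

```python
def classify_error(status: int, body: dict) -> str:
    # Decide how to react to an error response body parsed from JSON.
    message = body.get("error", {}).get("message", "unknown error")
    if status in (429, 500, 503):
        return f"retry: {message}"       # transient: back off and retry
    if status == 402:
        return f"fatal (credits exhausted): {message}"
    return f"fatal: {message}"           # 400/401 etc.: fix the request or key
```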

Script Files

| File | Platform | Purpose |
|---|---|---|
| add-quatarly-models.ps1 | Windows (PowerShell) | Add all models to Factory AI Droid |
| add-quatarly-models.sh | macOS / Linux (Bash) | Add all models to Factory AI Droid |
| set-claude-env.ps1 | Windows (PowerShell) | Set Claude Code environment variables globally |
| set-claude-env.sh | macOS / Linux (Bash) | Set Claude Code environment variables globally |
Safe to re-run. All scripts are idempotent — running again with a new key updates existing entries without duplicates. A .backup copy of your original config is saved automatically. Python 3 is required for the Factory scripts.