For Application Developers — Token-as-a-Service

Everything production AI needs, out of the box

Stop building auth, budget, and failover infrastructure. TaaS provides it all at the API level.

Programmatic key creation

Issue scoped API keys via the Management API. Integrate key provisioning into your onboarding flow, CI/CD pipeline, or infrastructure-as-code — no manual console steps needed.

Per-key cost controls

Set a hard budget limit and period (daily, monthly, rolling) on every key. When the cap is reached, the key stops working, which protects you from runaway costs and compromised credentials.

Per-key model restrictions

Lock each key to an explicit list of models. A key issued for a fast summarisation workload cannot call a more expensive reasoning model — enforced server-side.

Data residency and geo-routing

Restrict keys to EU or other regions. Add "region": "EU" per request for call-level enforcement. TaaS hard-fails requests that cannot be routed compliantly — no silent data leakage.

Automatic multi-supplier failover

When a provider returns a rate-limit response or an error, TaaS retries against a secondary supplier within milliseconds. Your app sees a successful response, with no retry logic required in your code.

GDPR and audit automation

Full structured audit logs per key — model, tokens, cost, region, supplier, latency. Export to your SIEM or pull via API for compliance evidence without building your own logging pipeline.

SDK Compatibility

Works with every OpenAI SDK

TaaS implements the full OpenAI REST API. Every official SDK — Python, Node.js, Go, Ruby, Java — works by changing a single line: the base URL.

Supported endpoints include /chat/completions, /embeddings, /audio/speech, /audio/transcriptions, /rerank, and more. Streaming, function calling, and JSON mode work exactly as expected.


// Node.js — same one-line change
import OpenAI from "openai";

const client = new OpenAI({
  apiKey: process.env.TAAS_KEY,
  baseURL: "https://taas.cloudsigma.com/v1",
});

// Streaming works too
const stream = await client.chat.completions.create({
  model: "claude-sonnet-4-5",
  stream: true,
  messages: [{ role: "user", content: "Tell me a story." }],
});

for await (const chunk of stream) {
  process.stdout.write(
    chunk.choices[0]?.delta?.content ?? ""
  );
}

Management API

Automate key provisioning

Issue scoped keys for users, tenants, or workloads from your own backend. Each key carries its own budget, model list, region, and rate limit — no manual steps in the TaaS console required.

POST /v1/manage/keys

{
  "name": "tenant-acme-prod",
  "allowed_models": [
    "gpt-4o-mini",
    "text-embedding-3-small"
  ],
  "allowed_regions": "EU",
  "budget_limit":  100.00,
  "budget_period": "monthly",
  "rate_limit_rpm": 60,
  "metadata": {
    "tenant_id": "acme",
    "env": "production"
  }
}

// Response
{
  "id":      "key_abc123xyz",
  "key":     "taas-sk-...",
  "name":    "tenant-acme-prod",
  "created": "2024-01-15T10:00:00Z"
}
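
As a rough illustration, the request above could be assembled from a backend service using only the Python standard library. This is a sketch under stated assumptions, not official client code: the Bearer-token header for the admin key, the placeholder key value, and the helper name build_create_key_request are assumptions. The endpoint path and body fields are taken from the example above.

```python
import json
import urllib.request

TAAS_BASE_URL = "https://taas.cloudsigma.com/v1"


def build_create_key_request(admin_key: str, spec: dict) -> urllib.request.Request:
    """Build (but do not send) a POST /v1/manage/keys request.

    Assumption: the Management API accepts the admin key as a Bearer token.
    """
    return urllib.request.Request(
        url=f"{TAAS_BASE_URL}/manage/keys",
        data=json.dumps(spec).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {admin_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


# Same body as the example request above.
spec = {
    "name": "tenant-acme-prod",
    "allowed_models": ["gpt-4o-mini", "text-embedding-3-small"],
    "allowed_regions": "EU",
    "budget_limit": 100.00,
    "budget_period": "monthly",
    "rate_limit_rpm": 60,
    "metadata": {"tenant_id": "acme", "env": "production"},
}

# Placeholder admin key; load the real one from your secrets manager.
request = build_create_key_request("taas-admin-placeholder", spec)
print(request.get_method(), request.full_url)
```

Sending the request would then be a call to urllib.request.urlopen(request) followed by decoding the JSON response. Keeping construction separate from sending makes the payload easy to unit-test in an onboarding flow.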

Budget Tracking

Real-time spend per key

Query current usage and remaining budget for any key at any time. Build spend dashboards, trigger alerts, or raise limits programmatically — all via API.

GET /v1/manage/keys/key_abc123xyz

// Response
{
  "id":               "key_abc123xyz",
  "name":             "tenant-acme-prod",
  "budget_limit":     100.00,
  "budget_used":      43.17,
  "budget_remaining": 56.83,
  "budget_period":    "monthly",
  "resets_at":        "2024-02-01T00:00:00Z",
  "status":           "active"
}

// Update budget limit inline
PATCH /v1/manage/keys/key_abc123xyz
{ "budget_limit": 200.00 }
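
One way a spend dashboard or alerting job might interpret the GET response above is sketched below. The function name budget_status, the 80% warning threshold, and the shape of the returned summary are illustrative assumptions. Only the input field names come from the example response.

```python
def budget_status(key: dict, warn_ratio: float = 0.8) -> dict:
    """Summarise spend for one key from a GET /v1/manage/keys/{id} response.

    warn_ratio is a hypothetical alert threshold (fraction of the budget used).
    """
    limit = key["budget_limit"]
    used = key["budget_used"]
    ratio = used / limit if limit else 0.0
    return {
        "key_id": key["id"],
        "used_ratio": round(ratio, 4),
        "should_warn": ratio >= warn_ratio,
        "resets_at": key["resets_at"],
    }


# Payload copied from the example response above.
sample = {
    "id": "key_abc123xyz",
    "name": "tenant-acme-prod",
    "budget_limit": 100.00,
    "budget_used": 43.17,
    "budget_remaining": 56.83,
    "budget_period": "monthly",
    "resets_at": "2024-02-01T00:00:00Z",
    "status": "active",
}

status = budget_status(sample)
print(status)
```

A job like this could run on a schedule and, when should_warn is true, either notify the tenant or issue the PATCH shown above to raise the limit.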

GDPR and Data Residency

Compliance automation at the call level

For GDPR-sensitive workloads, add "region": "EU" to every call — or lock the key to allowed_regions: "EU" so no request can ever leave European infrastructure, even if the calling code forgets the field.

Full structured audit logs — model, tokens, cost, supplier, region, latency — are available per key via the Management API, giving you the evidence trail your DPO needs without building a separate logging pipeline.


# EU-only embedding for GDPR compliance
curl https://taas.cloudsigma.com/v1/embeddings \
  -H "Authorization: Bearer $TAAS_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "text-embedding-3-large",
    "region": "EU",
    "input": "Patient record summary..."
  }'

# Audit log entry returned via Management API
{
  "timestamp":     "2024-01-15T14:32:01Z",
  "key_id":        "key_abc123xyz",
  "model":         "text-embedding-3-large",
  "supplier":      "openai-eu",
  "region":        "EU",
  "input_tokens":  78,
  "output_tokens": 0,
  "cost_usd":      0.00031,
  "latency_ms":    221
}
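
Audit entries of this shape lend themselves to simple automated checks. The sketch below totals cost per supplier and flags any entry routed outside a required region. The function name summarise_audit and the second sample entry (supplier "example-us") are made up for illustration; the first entry and the field names are taken from the example above.

```python
from collections import defaultdict


def summarise_audit(entries: list, required_region: str = "EU"):
    """Return (cost per supplier, entries routed outside required_region)."""
    cost_by_supplier = defaultdict(float)
    violations = []
    for entry in entries:
        cost_by_supplier[entry["supplier"]] += entry["cost_usd"]
        if entry["region"] != required_region:
            violations.append(entry)
    return dict(cost_by_supplier), violations


entries = [
    {   # entry from the example above
        "timestamp": "2024-01-15T14:32:01Z",
        "key_id": "key_abc123xyz",
        "model": "text-embedding-3-large",
        "supplier": "openai-eu",
        "region": "EU",
        "input_tokens": 78,
        "output_tokens": 0,
        "cost_usd": 0.00031,
        "latency_ms": 221,
    },
    {   # hypothetical entry, included only to exercise the region check
        "timestamp": "2024-01-15T14:35:10Z",
        "key_id": "key_abc123xyz",
        "model": "text-embedding-3-large",
        "supplier": "example-us",
        "region": "US",
        "input_tokens": 40,
        "output_tokens": 0,
        "cost_usd": 0.00016,
        "latency_ms": 180,
    },
]

costs, violations = summarise_audit(entries)
print(costs)
print([v["supplier"] for v in violations])
```

For a key locked to the EU, the violations list should always be empty, so a non-empty result would be worth raising with your DPO.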

Resilience

Failover you never have to write

Implementing retry logic, circuit breakers, and provider fallback can add weeks to an AI integration. TaaS handles it at the infrastructure level, so your application code stays simple.

Auto-retry on 429, 500, or timeout
Secondary supplier selected in under 200 ms
Failover respects your geo and model restrictions
Response headers show which supplier was used
Full OpenAI error codes preserved for your error handlers
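
To give a sense of the logic the list above spares you from writing, here is a deliberately simplified sketch of restriction-aware failover. It is not TaaS's implementation: the supplier records, the send callback, and the function name call_with_failover are all hypothetical, and a production version would also need backoff, circuit breaking, and error-code passthrough.

```python
RETRYABLE_STATUSES = {429, 500}


def call_with_failover(suppliers, send, region, model):
    """Try each eligible supplier in order; return (supplier_name, body).

    A supplier is eligible only if it serves the requested region and model,
    so failover never violates geo or model restrictions.
    """
    eligible = [
        s for s in suppliers
        if region in s["regions"] and model in s["models"]
    ]
    last_error = None
    for supplier in eligible:
        try:
            status, body = send(supplier["name"])
        except TimeoutError as exc:
            last_error = exc
            continue
        if status in RETRYABLE_STATUSES:
            last_error = RuntimeError(f"{supplier['name']} returned {status}")
            continue
        return supplier["name"], body
    raise RuntimeError("no eligible supplier succeeded") from last_error


# Hypothetical suppliers: the US one must never be chosen for an EU request.
suppliers = [
    {"name": "primary-eu", "regions": {"EU"}, "models": {"gpt-4o"}},
    {"name": "backup-us", "regions": {"US"}, "models": {"gpt-4o"}},
    {"name": "secondary-eu", "regions": {"EU"}, "models": {"gpt-4o"}},
]


def fake_send(name):
    """Stand-in for a network call: the primary is rate-limited."""
    return (429, "rate limited") if name == "primary-eu" else (200, "ok")


result = call_with_failover(suppliers, fake_send, region="EU", model="gpt-4o")
print(result)
```

Even this toy version has to filter by region and model before retrying, which is the part that is easy to get wrong when failover is bolted onto application code.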


# TaaS response headers reveal exactly what happened
HTTP/2 200
x-taas-supplier:     openai-eu   # which supplier answered
x-taas-region:       EU          # confirmed routing region
x-taas-failover:     true        # primary was rate-limited
x-taas-latency-ms:   387
x-taas-input-tokens: 512
x-taas-output-tokens: 128
x-taas-cost-usd:     0.0042

# Your application code needs zero changes.
# Failover is invisible to the caller.

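# Python example: no retry code required
import os
from openai import OpenAI

client = OpenAI(api_key=os.environ["TAAS_KEY"], base_url="https://taas.cloudsigma.com/v1")
response = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Analyse this."}],
)
# TaaS already handled the failover for you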
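
If you want to record which supplier answered, the x-taas-* response headers shown above could be turned into a typed record along these lines. The helper name parse_taas_headers and the fallback defaults for missing headers are assumptions. The header names and sample values come from the example above.

```python
def parse_taas_headers(headers) -> dict:
    """Extract routing and usage details from TaaS response headers.

    Header names are matched case-insensitively. Missing headers fall back
    to neutral defaults (an assumption, not documented behaviour).
    """
    h = {k.lower(): str(v).strip() for k, v in headers.items()}
    return {
        "supplier": h.get("x-taas-supplier"),
        "region": h.get("x-taas-region"),
        "failover": h.get("x-taas-failover", "false").lower() == "true",
        "latency_ms": int(h.get("x-taas-latency-ms", 0)),
        "input_tokens": int(h.get("x-taas-input-tokens", 0)),
        "output_tokens": int(h.get("x-taas-output-tokens", 0)),
        "cost_usd": float(h.get("x-taas-cost-usd", 0.0)),
    }


# Header values copied from the example response above.
sample_headers = {
    "x-taas-supplier": "openai-eu",
    "x-taas-region": "EU",
    "x-taas-failover": "true",
    "x-taas-latency-ms": "387",
    "x-taas-input-tokens": "512",
    "x-taas-output-tokens": "128",
    "x-taas-cost-usd": "0.0042",
}

info = parse_taas_headers(sample_headers)
print(info)
```

With the OpenAI Python SDK, response headers are typically reachable through the with_raw_response variant of a call (for example client.chat.completions.with_raw_response.create(...)), whose result exposes headers alongside the parsed completion.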

From zero to production in four steps

No new SDKs to learn. No credential juggling. One API for everything.

01

Sign up and get your admin key

Create a TaaS account and retrieve your root admin key. Store it in your secrets manager — this key can create and revoke all other keys.

02

Issue scoped keys per workload

Use POST /v1/manage/keys to create a key per environment or tenant — with the exact model list, region, budget, and rate limit each workload needs.

03

Point your OpenAI SDK at TaaS

Change base_url (baseURL in the Node.js SDK) to https://taas.cloudsigma.com/v1 and swap in your scoped key. Every existing API call works immediately, with no other changes required.

04

Monitor, audit, and scale

Pull real-time usage and audit logs via API. Adjust budgets or rate limits without reissuing keys. Revoke instantly when a key is no longer needed.

Start building today

Ship production AI faster — with governance built in

One endpoint, every model, full compliance automation. TaaS handles failover, budgets, and audit logs so you can focus on building features.
