One OpenAI-compatible endpoint. Programmatic key management. Per-key budgets, model restrictions, and region controls. All the compliance automation your production app needs — without building it yourself.
Drop-in OpenAI SDK replacement
from openai import OpenAI

# One-line change: point base_url at TaaS and use your TaaS key
client = OpenAI(
    api_key="your-taas-key",
    base_url="https://taas.cloudsigma.com/v1"
)

response = client.chat.completions.create(
    model="gpt-4o",
    messages=[{
        "role": "user",
        "content": "Hello from TaaS!"
    }]
)
print(response.choices[0].message.content)
Compatible with OpenAI Python, Node.js and Go SDKs
Stop building auth, budget, and failover infrastructure. TaaS provides it all at the API level.
Issue scoped API keys via the Management API. Integrate key provisioning into your onboarding flow, CI/CD pipeline, or infrastructure-as-code — no manual console steps needed.
Set a hard budget limit and period (daily, monthly, rolling) on every key. When the cap is hit, the key stops — protecting you from runaway costs or compromised credentials.
Lock each key to an explicit list of models. A key issued for a fast summarisation workload cannot call a more expensive reasoning model — enforced server-side.
Restrict keys to EU or other regions. Add "region": "EU" per request for call-level enforcement. TaaS hard-fails requests that cannot be routed compliantly — no silent data leakage.
When a provider returns a rate-limit response or an error, TaaS retries against a secondary supplier within milliseconds. Your app sees a successful response, so no retry logic is required in your code.
Full structured audit logs per key — model, tokens, cost, region, supplier, latency. Export to your SIEM or pull via API for compliance evidence without building your own logging pipeline.
TaaS implements the full OpenAI REST API. Every official SDK — Python, Node.js, Go, Ruby, Java — works by changing a single line: the base URL.
All endpoints are supported: /chat/completions, /embeddings, /audio/speech, /audio/transcriptions, /rerank and more. Streaming, function calling, and JSON mode work exactly as expected.
Full API reference
// Node.js: same one-line change
import OpenAI from "openai";

const client = new OpenAI({
  apiKey: process.env.TAAS_KEY,
  baseURL: "https://taas.cloudsigma.com/v1",
});

// Streaming works too
const stream = await client.chat.completions.create({
  model: "claude-sonnet-4-5",
  stream: true,
  messages: [{ role: "user", content: "Tell me a story." }],
});

for await (const chunk of stream) {
  process.stdout.write(
    chunk.choices[0]?.delta?.content ?? ""
  );
}
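For Python callers, the same streaming loop can be sketched as a small helper that joins the text deltas. This is a minimal sketch, not TaaS-specific code: `collect_stream` and `fake_chunk` are hypothetical names, and the chunk shape (`choices[0].delta.content`) is assumed to match the one used in the Node.js example above.

```python
from types import SimpleNamespace
from typing import Iterable


def collect_stream(chunks: Iterable) -> str:
    """Join the text deltas of a chat-completion stream into one string."""
    parts = []
    for chunk in chunks:
        # Some chunks may carry no choices at all; skip them.
        if not chunk.choices:
            continue
        # A delta can have content of None (for example role-only chunks).
        text = getattr(chunk.choices[0].delta, "content", None)
        if text:
            parts.append(text)
    return "".join(parts)


def fake_chunk(text):
    """Hypothetical stand-in for an SDK chunk, for offline demonstration only."""
    delta = SimpleNamespace(content=text)
    return SimpleNamespace(choices=[SimpleNamespace(delta=delta)])
```

With a real client, the iterable would be the object returned by `client.chat.completions.create(..., stream=True)`.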
Issue scoped keys for users, tenants, or workloads from your own backend. Each key carries its own budget, model list, region, and rate limit — no manual steps in the TaaS console required.
POST /v1/manage/keys
{
  "name": "tenant-acme-prod",
  "allowed_models": [
    "gpt-4o-mini",
    "text-embedding-3-small"
  ],
  "allowed_regions": "EU",
  "budget_limit": 100.00,
  "budget_period": "monthly",
  "rate_limit_rpm": 60,
  "metadata": {
    "tenant_id": "acme",
    "env": "production"
  }
}

// Response
{
  "id": "key_abc123xyz",
  "key": "taas-sk-...",
  "name": "tenant-acme-prod",
  "created": "2024-01-15T10:00:00Z"
}
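From a Python backend, the request above could be assembled roughly as follows. This is a sketch under stated assumptions: the Management API is assumed to live on the same host as the inference endpoint and to accept the admin key as a Bearer token; `build_create_key_request` is a hypothetical helper, and only the JSON fields shown above are used.

```python
import json
import urllib.request

# Assumption: the Management API shares the host of the inference endpoint.
MANAGE_KEYS_URL = "https://taas.cloudsigma.com/v1/manage/keys"


def build_create_key_request(admin_key: str, name: str, metadata: dict) -> urllib.request.Request:
    """Build (but do not send) a POST request that creates a scoped key."""
    payload = {
        "name": name,
        "allowed_models": ["gpt-4o-mini", "text-embedding-3-small"],
        "allowed_regions": "EU",
        "budget_limit": 100.00,
        "budget_period": "monthly",
        "rate_limit_rpm": 60,
        "metadata": metadata,
    }
    return urllib.request.Request(
        MANAGE_KEYS_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            # Assumption: Bearer authentication with the root admin key.
            "Authorization": f"Bearer {admin_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
```

Passing the result to `urllib.request.urlopen` would send it. Keeping the build step separate from the send step makes the payload easy to unit-test in an onboarding flow.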
Query current usage and remaining budget for any key at any time. Build spend dashboards, trigger alerts, or raise limits programmatically — all via API.
GET /v1/manage/keys/key_abc123xyz

// Response
{
  "id": "key_abc123xyz",
  "name": "tenant-acme-prod",
  "budget_limit": 100.00,
  "budget_used": 43.17,
  "budget_remaining": 56.83,
  "budget_period": "monthly",
  "resets_at": "2024-02-01T00:00:00Z",
  "status": "active"
}

// Update budget limit inline
PATCH /v1/manage/keys/key_abc123xyz
{ "budget_limit": 200.00 }
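A spend alert built on this response might look like the sketch below. The 80% threshold, the doubling factor, and both function names are illustrative assumptions; only the `budget_limit` and `budget_used` fields come from the response shown above.

```python
from typing import Optional


def budget_fraction_used(key_status: dict) -> float:
    """Return the share of the budget already spent, as a value from 0.0 upward."""
    limit = key_status["budget_limit"]
    if limit <= 0:
        # Treat a zero or negative limit as fully exhausted.
        return 1.0
    return key_status["budget_used"] / limit


def suggest_new_limit(key_status: dict, threshold: float = 0.8, factor: float = 2.0) -> Optional[dict]:
    """Return a PATCH body that raises the limit, or None while spend is below the threshold."""
    if budget_fraction_used(key_status) < threshold:
        return None
    return {"budget_limit": round(key_status["budget_limit"] * factor, 2)}
```

The returned dictionary has the same shape as the PATCH body above, so it can be serialised and sent as-is once a key crosses the threshold.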
For GDPR-sensitive workloads, add "region": "EU" to every call — or lock the key to allowed_regions: "EU" so no request can ever leave European infrastructure, even if the calling code forgets the field.
Full structured audit logs — model, tokens, cost, supplier, region, latency — are available per key via the Management API, giving you the evidence trail your DPO needs without building a separate logging pipeline.
# EU-only embedding for GDPR compliance
curl https://taas.cloudsigma.com/v1/embeddings \
  -H "Authorization: Bearer $TAAS_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "text-embedding-3-large",
    "region": "EU",
    "input": "Patient record summary..."
  }'
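The OpenAI Python SDK has no dedicated `region` argument, so one way to send the field from Python is the SDK's `extra_body` option, which merges extra keys into the request JSON. The helper below is a sketch: `with_region` is a hypothetical name, and it assumes TaaS reads `region` from the top level of the body, as in the curl example above.

```python
def with_region(params: dict, region: str = "EU") -> dict:
    """Return SDK keyword arguments with the non-standard "region" field attached."""
    # Copy any existing extra_body so the caller's dictionary is not mutated.
    extra_body = dict(params.get("extra_body") or {})
    extra_body["region"] = region
    return {**params, "extra_body": extra_body}
```

A call would then read `client.embeddings.create(**with_region({"model": "text-embedding-3-large", "input": "..."}))`.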
# Audit log entry returned via Management API
{
  "timestamp": "2024-01-15T14:32:01Z",
  "key_id": "key_abc123xyz",
  "model": "text-embedding-3-large",
  "supplier": "openai-eu",
  "region": "EU",
  "input_tokens": 78,
  "output_tokens": 0,
  "cost_usd": 0.00031,
  "latency_ms": 221
}
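Entries of this shape are easy to turn into compliance evidence. The sketch below totals cost per model and flags any entry routed outside a required region. It assumes the entries have already been fetched as a list of dictionaries (the source does not show the log-listing endpoint), and `summarise_audit` is a hypothetical helper.

```python
from collections import defaultdict


def summarise_audit(entries: list, required_region: str = "EU"):
    """Return (cost per model, entries routed outside the required region)."""
    cost_by_model = defaultdict(float)
    violations = []
    for entry in entries:
        cost_by_model[entry["model"]] += entry["cost_usd"]
        if entry["region"] != required_region:
            violations.append(entry)
    return dict(cost_by_model), violations
```

An empty violations list for a reporting period is the kind of result a DPO can file directly.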
Implementing retry logic, circuit breakers, and provider fallback can add weeks to an AI integration. TaaS handles it at the infrastructure level, so your application code stays simple.
# TaaS response headers reveal exactly what happened
HTTP/2 200
x-taas-supplier: openai-eu # which supplier answered
x-taas-region: EU # confirmed routing region
x-taas-failover: true # primary was rate-limited
x-taas-latency-ms: 387
x-taas-input-tokens: 512
x-taas-output-tokens: 128
x-taas-cost-usd: 0.0042
# Your application code needs zero changes.
# Failover is invisible to the caller.
# Python example — no retry code required
response = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Analyse this."}]
)
# TaaS already handled the failover for you
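If you do want to record what happened on a call, the response headers shown above can be parsed into a typed record. This is a sketch: the header names are taken from the example, while `TaasCallInfo` and `parse_taas_headers` are hypothetical names, and the fallback values for missing headers are an assumption.

```python
from dataclasses import dataclass
from typing import Mapping


@dataclass
class TaasCallInfo:
    supplier: str
    region: str
    failover: bool
    latency_ms: int
    input_tokens: int
    output_tokens: int
    cost_usd: float


def parse_taas_headers(headers: Mapping[str, str]) -> TaasCallInfo:
    """Parse x-taas-* response headers; missing headers fall back to empty or zero values."""
    # HTTP header names are case-insensitive, so normalise them first.
    h = {k.lower(): v.strip() for k, v in headers.items()}
    return TaasCallInfo(
        supplier=h.get("x-taas-supplier", ""),
        region=h.get("x-taas-region", ""),
        failover=h.get("x-taas-failover", "false").lower() == "true",
        latency_ms=int(h.get("x-taas-latency-ms", "0")),
        input_tokens=int(h.get("x-taas-input-tokens", "0")),
        output_tokens=int(h.get("x-taas-output-tokens", "0")),
        cost_usd=float(h.get("x-taas-cost-usd", "0")),
    )
```

With the OpenAI Python SDK, headers are reachable through `client.chat.completions.with_raw_response.create(...)`, whose result exposes `.headers` alongside `.parse()` for the usual response object.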
No new SDKs to learn. No credential juggling. One API for everything.
Create a TaaS account and retrieve your root admin key. Store it in your secrets manager — this key can create and revoke all other keys.
Use POST /v1/manage/keys to create a key per environment or tenant — with the exact model list, region, budget, and rate limit each workload needs.
Change base_url to https://taas.cloudsigma.com/v1 and swap in your scoped key. Every existing API call works immediately — no other changes required.
Pull real-time usage and audit logs via API. Adjust budgets or rate limits without reissuing keys. Revoke instantly when a key is no longer needed.
One endpoint, every model, full compliance automation. TaaS handles failover, budgets, and audit logs so you can focus on building features.