Models — Token-as-a-Service

Model families at a glance

Unified access to every leading model family across multiple providers.

Family	Modalities	Max context	Suppliers
GPT-4o / GPT-4.1 (OpenAI)	Chat, Vision	128 K	3
Claude 3.5 / 4 (Anthropic)	Chat, Vision	200 K	2
Gemini 2.0 / 2.5 (Google)	Chat, Vision	1 M	2
text-embedding-3 / BGE-M3 (OpenAI · HuggingFace)	Embeddings	8 K	3
Whisper / Nova (OpenAI · Groq)	Transcription	~2 h audio	2
TTS-1 / TTS-1-HD (OpenAI)	Speech	4,096 chars	2
BGE-Reranker / Cohere Rerank (BAAI · Cohere)	Reranking	512 tokens	2

Built for reliability and compliance

Every model request benefits from platform-level safeguards.

Automatic failover

When a supplier returns an error or a rate-limit response, TaaS immediately retries the same request against a secondary provider. The switch is transparent to your application.

Geo filtering

Pass "region": "EU" or a specific country in any request to keep traffic within that geography. Filtering can also be configured per API key.

Supplier filtering

Prefer specific providers for cost or compliance reasons? Restrict keys to named suppliers while keeping automatic failover within your approved list.

Multi-modal in one API

Chat, vision, embeddings, speech synthesis, audio transcription, and reranking — one key, one endpoint, one bill. No per-provider credentials to manage.

OpenAI-compatible

A drop-in replacement for the OpenAI API that works with the OpenAI SDK. Point the base URL at TaaS, swap in your TaaS key, and start accessing every model family with no other code changes.

Usage & cost visibility

Per-model, per-key token and cost breakdowns in real time. Set budget caps per key so a single workload can never overspend.
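As a client-side cross-check of those breakdowns, token counts can be tallied per model from the responses themselves. This is a minimal sketch, assuming responses follow the OpenAI chat-completions shape with a `usage` object; the helper name `tally_usage` and the sample numbers are illustrative, not part of TaaS.

```python
from collections import defaultdict


def tally_usage(responses):
    """Sum prompt and completion tokens per model from response dicts."""
    totals = defaultdict(lambda: {"prompt_tokens": 0, "completion_tokens": 0})
    for resp in responses:
        usage = resp.get("usage") or {}
        entry = totals[resp.get("model", "unknown")]
        entry["prompt_tokens"] += usage.get("prompt_tokens", 0)
        entry["completion_tokens"] += usage.get("completion_tokens", 0)
    return dict(totals)


# Illustrative sample data, not real measurements.
sample = [
    {"model": "gpt-4o", "usage": {"prompt_tokens": 12, "completion_tokens": 30}},
    {"model": "gpt-4o", "usage": {"prompt_tokens": 8, "completion_tokens": 5}},
    {"model": "claude-sonnet-4-5", "usage": {"prompt_tokens": 20, "completion_tokens": 40}},
]
print(tally_usage(sample)["gpt-4o"])
# prints: {'prompt_tokens': 20, 'completion_tokens': 35}
```

Token counts only approximate spend; converting them to cost would need the per-model prices from the pricing page, which are not assumed here.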

Data residency

Integrated data compliance

Add "region": "EU" to any API call. TaaS selects only EU-hosted supplier endpoints and rejects the request if no compliant route is available — giving you a hard guarantee, not a best-effort one.

Combine this with per-key allowed_regions to enforce residency at the credential level, so you can tier geographic filtering by use case.
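One way to picture how a request-level region and a key-level allowed_regions list might combine is as a set intersection with a hard rejection when it is empty. The sketch below is a hypothetical model of that rule, not TaaS's actual implementation; the function name and the choice of exception are assumptions.

```python
def effective_regions(request_region, allowed_regions):
    """Return the set of regions a request may be routed to.

    request_region: the request's "region" value, or None if omitted.
    allowed_regions: the key's allowed_regions list (assumed non-empty).
    """
    if request_region is None:
        # No per-request region: the key-level restriction applies on its own.
        return set(allowed_regions)
    if request_region not in allowed_regions:
        # Hard guarantee: reject rather than fall back to another region.
        raise PermissionError(
            f"region {request_region!r} is not permitted for this key"
        )
    return {request_region}


print(effective_regions("EU", ["EU", "CH"]))
# prints: {'EU'}
```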


# EU-only chat request
curl https://taas.cloudsigma.com/v1/chat/completions \
  -H "Authorization: Bearer $TAAS_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-4o",
    "region": "EU",
    "messages": [
      {"role": "user", "content": "Summarise this contract."}
    ]
  }'

# Response headers confirm routing:
# X-TaaS-Supplier: openai-eu
# X-TaaS-Region:   EU
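An application that needs an audit trail can check those headers on every response. The following is a minimal sketch using only the two header names shown above; the helper `check_routing` is hypothetical, and the headers dict stands in for whatever your HTTP client returns.

```python
def check_routing(headers, expected_region):
    """Raise if the response headers do not confirm the expected region.

    Returns the supplier name reported by the X-TaaS-Supplier header.
    """
    # HTTP header names are case-insensitive, so normalise before lookup.
    normalised = {name.lower(): value for name, value in headers.items()}
    region = normalised.get("x-taas-region")
    if region != expected_region:
        raise RuntimeError(
            f"expected region {expected_region!r}, got {region!r}"
        )
    return normalised.get("x-taas-supplier")


supplier = check_routing(
    {"X-TaaS-Supplier": "openai-eu", "X-TaaS-Region": "EU"}, "EU"
)
print(supplier)
# prints: openai-eu
```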

Resilience

Multi-supplier failover — automatic and invisible

TaaS maintains live health scores for every supplier. When an upstream returns a 429 or 500 response, or times out, TaaS retries your request against the next-best option, typically in under 200 ms.

Failover respects your geo and supplier restrictions: if your key is locked to EU suppliers, failover only considers other EU-approved routes.
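The interaction between health scores and restrictions can be sketched as filter first, then rank. This is an illustrative model only: the `Route` type, the 0 to 1 health scale, and the supplier names other than openai-eu are assumptions, and the real routing logic is not documented here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Route:
    supplier: str
    region: str
    health: float  # hypothetical score in [0, 1]; higher is healthier


def failover_order(routes, allowed_regions=None, allowed_suppliers=None):
    """Return compliant routes, healthiest first; raise if none qualify."""
    eligible = [
        r for r in routes
        if (allowed_regions is None or r.region in allowed_regions)
        and (allowed_suppliers is None or r.supplier in allowed_suppliers)
    ]
    if not eligible:
        raise LookupError("no compliant route available")
    return sorted(eligible, key=lambda r: r.health, reverse=True)


routes = [
    Route("openai-eu", "EU", 0.97),
    Route("supplier-c-us", "US", 0.99),  # healthiest, but outside the EU
    Route("supplier-b-eu", "EU", 0.91),
]
order = failover_order(routes, allowed_regions={"EU"})
print([r.supplier for r in order])
# prints: ['openai-eu', 'supplier-b-eu']
```

Filtering before ranking means a healthier but non-compliant route is never considered, which matches the restriction described above.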


# Python: using the openai SDK; only the API key and base URL change
from openai import OpenAI

client = OpenAI(
    api_key="your-taas-key",
    base_url="https://taas.cloudsigma.com/v1"
)

# TaaS handles failover transparently
response = client.chat.completions.create(
    model="claude-sonnet-4-5",
    messages=[{"role": "user",
               "content": "Draft a privacy notice."}]
)
print(response.choices[0].message.content)

Get started today

Access every model through one API

No per-provider accounts. No credential rotation. Full multi-modal coverage with EU-ready geo-filtering from day one.
