Use case
Broker model access
Agents and models are coupled in ways that make change expensive. A broker decouples them and puts routing, cost, and availability in one place. Switch providers without touching agent code. Run evaluations against production traffic. Stay available when an upstream provider is not.
Agentic patterns
Model traffic crosses the wire in both directions
The broker plays the same role whether you're routing your enterprise's agent traffic across providers or brokering the models behind the agent features you ship to customers. Route on policy. Observe on outcomes. Switch without rewriting agent code.
AI In
Your internal estate
The model traffic your enterprise consumes from desktop assistants, internal agents, and SaaS-embedded copilots. The broker pools spend across teams and routes inexpensive calls to inexpensive models.
Scenario
Sales, support, and ops each negotiated their own Anthropic and OpenAI contracts. Spend lives in three places, evals are guesswork, and a frontier provider outage takes down half the org.
Outcomes that matter
- Pooled spend across providers and teams
- Routing on cost, policy, and availability
- Evals against real traffic, not synthetic benchmarks
AI Out
Your customer and partner estate
The models behind the agent features you ship. Customers see your product. You see provider routing, eval data, and cost telemetry. Swap providers without an agent code change.
Scenario
A new frontier model is 30% cheaper at the same quality on your workload. Without a broker, switching is a release. With one, it's a routing rule change validated against last week's traffic.
Outcomes that matter
- Provider swaps without agent code changes
- Per-customer cost telemetry tied to features used
- Production-traffic evals before a model rollout
Model volatility
The model is the most volatile part of an agent
Providers change pricing. New models arrive. Some calls need a frontier model, and others do not. Per-team contracts proliferate. Without a broker, every one of these changes becomes an agent code change, and the economics, evals, and reliability story live in three different places.
Spend is fragmenting
Per-team API keys. Per-agent contracts. Shadow usage on personal accounts. Nobody owns the model spend the way someone owns the cloud bill.
Provider lock-in is real
Agents written against one provider's SDK are coupled to that provider's quirks. Switching to evaluate, or to negotiate, means rewriting client code across teams.
Availability is upstream
When a frontier provider has a bad hour, every agent that depends on it has a bad hour too. Failover needs to live below the agent.
Broker capabilities
One configuration surface for cost, evals, and uptime
Economics
Configure model selection by policy without changing the agent. Route inexpensive calls to inexpensive models. Pin frontier traffic to frontier providers. Pool spend across teams.
Evaluations
Run different models against the same traffic and compare the results. Measure quality, latency, and cost across providers using real workloads, not synthetic benchmarks.
Availability
Model pools route around outages and rate limits. Configure priority-weighted routing with fallback. Agents stay up when an upstream provider doesn't.
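As an illustration of priority-weighted routing with fallback, here is a minimal Python sketch. It is not the broker's implementation. It assumes that "priority-weighted" means members are tried in a weighted random order, and that a failover limit of 3 means at most three attempts per request. `Member`, `ProviderUnavailable`, `order_by_weight`, and `route` are hypothetical names introduced for this example.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class Member:
    provider: str
    model: str
    weight: int


class ProviderUnavailable(Exception):
    """Hypothetical error for an upstream outage or rate limit."""


def order_by_weight(members, rng):
    """Weighted random order without replacement: heavier members tend to come first."""
    remaining = list(members)
    ordered = []
    while remaining:
        pick = rng.choices(remaining, weights=[m.weight for m in remaining], k=1)[0]
        ordered.append(pick)
        remaining.remove(pick)
    return ordered


def route(members, call, max_attempts=3, rng=None):
    """Try up to `max_attempts` members; return the first successful (member, result)."""
    rng = rng or random.Random()
    errors = []
    for member in order_by_weight(members, rng)[:max_attempts]:
        try:
            return member, call(member)
        except ProviderUnavailable as exc:
            # Assumed behavior: record the failure and fall through to the next member.
            errors.append((member.provider, str(exc)))
    raise ProviderUnavailable(f"all attempts failed: {errors}")


# Pool members mirror the sample reasoning-tier weights shown on this page.
POOL = [
    Member("anthropic", "claude-opus-4-7", 70),
    Member("openai", "gpt-5-thinking", 25),
    Member("bedrock", "claude-opus-4-7", 5),
]


def flaky_call(member):
    """Stand-in for a real provider call; simulates an outage at one provider."""
    if member.provider == "anthropic":
        raise ProviderUnavailable("simulated outage")
    return f"response from {member.provider}"


chosen, response = route(POOL, flaky_call)
print(chosen.provider, response)
```

The calling agent sees only the returned response. Which member served it, and how many attempts it took, stays inside the routing layer.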
Configuration
Pools and routing, not glue code
The broker exposes the same abstractions you already use for service-to-service routing (pools, priorities, failover, caching) applied to model traffic. The agent doesn't change when the routing does.
pools:
  - name: reasoning-tier
    routing: priority-weighted
    failover: up to 3
    cache: short-context
    members:
      - provider: anthropic
        model: claude-opus-4-7
        weight: 70
      - provider: openai
        model: gpt-5-thinking
        weight: 25
      - provider: bedrock
        model: claude-opus-4-7
        weight: 5 # regional capacity
rules:
  - match:
      tier: standard
    pool: standard-tier
  - match:
      tier: premium
      context.tokens: ">100k"
    pool: long-context-tier
Before and after
The shape of model operations with a broker in place
Before: Model in the agent
Each agent holds a provider SDK, an API key, and a chosen model. Changing any of those means changing the agent. Evals are synthetic. Spend lives in monthly invoice PDFs.
After: Model behind the broker
Agents request capability. The broker resolves to a provider, applies policy, and routes. Eval and cost data flow from one place. Provider contracts can change without touching agent code.
Before: Outage takes you down
When the upstream provider has a bad hour, every agent that depends on it has a bad hour too. Failover is a code change made under pressure.
After: Pool routes around failure
Priority-weighted routing with fallback. The broker absorbs provider outages and rate-limit storms. Agents stay up when the upstream does not.
Composes with
The enterprise systems you already have
The broker plugs into the systems your team already runs. Provider credentials, telemetry, audit, and configuration all route through their existing home.
Secrets
Provider API keys live in the secrets store you already manage. Keys are rotated centrally and never appear in agent code.
APM
Model latency, cost per call, denial rate, and routing decisions stream to the dashboards your platform team already watches.
SIEM
Every model call is an audit event. Provider, model, pool, agent, user. Queryable in the SIEM your SOC already lives in.
Source control
Pool configuration and routing rules live in a repo. Versioned, simulated against recent traffic, rolled back like any other deploy.
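To make "simulated against recent traffic" concrete, the following Python sketch replays recorded requests through a current and a proposed rule set and counts how many would change pools. It is an illustration under stated assumptions, not the broker's actual simulator: rules are evaluated first-match-wins, a value such as ">100k" is read as "more than 100,000", unmatched requests fall back to a default pool, and the request records are invented. `matches`, `resolve_pool`, and `simulate` are hypothetical helper names.

```python
from collections import Counter


def matches(match, request):
    """True when every key in `match` is satisfied by the request.

    Assumed reading of the sample config: a string starting with ">" is a
    numeric lower bound, with an optional "k" suffix meaning thousands.
    """
    for key, expected in match.items():
        actual = request.get(key)
        if isinstance(expected, str) and expected.startswith(">"):
            bound = expected[1:]
            threshold = int(bound[:-1]) * 1000 if bound.endswith("k") else int(bound)
            if actual is None or actual <= threshold:
                return False
        elif actual != expected:
            return False
    return True


def resolve_pool(rules, request, default="reasoning-tier"):
    """First matching rule wins; `default` is an assumed fallback pool."""
    for rule in rules:
        if matches(rule["match"], request):
            return rule["pool"]
    return default


def simulate(current_rules, proposed_rules, traffic):
    """Count recorded requests that would move from one pool to another."""
    moves = Counter()
    for request in traffic:
        before = resolve_pool(current_rules, request)
        after = resolve_pool(proposed_rules, request)
        if before != after:
            moves[(before, after)] += 1
    return moves


current = [
    {"match": {"tier": "standard"}, "pool": "standard-tier"},
    {"match": {"tier": "premium", "context.tokens": ">100k"}, "pool": "long-context-tier"},
]
# Proposed change: lower the long-context threshold from 100k to 50k tokens.
proposed = [
    {"match": {"tier": "standard"}, "pool": "standard-tier"},
    {"match": {"tier": "premium", "context.tokens": ">50k"}, "pool": "long-context-tier"},
]
# Invented sample of recorded requests, for illustration only.
traffic = [
    {"tier": "standard", "context.tokens": 2_000},
    {"tier": "premium", "context.tokens": 80_000},
    {"tier": "premium", "context.tokens": 150_000},
]

for (before, after), count in simulate(current, proposed, traffic).items():
    print(f"{count} request(s) would move from {before} to {after}")
```

Reviewing a diff of pool assignments like this before merging is what lets a routing change be treated like any other versioned deploy.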
Next steps