# LLM Routing
DgiDgi's LLM Gateway provides flexible routing between multiple LLM providers, supporting platform-managed, tenant-provided, and local CLI configurations.
## LLM Source Types
DgiDgi supports three distinct LLM routing modes:
| Mode | Description | Use Case |
|---|---|---|
| Platform LLM | Platform-provided API keys | Default for all tenants |
| Tenant LLM | Tenant's own API keys | Enterprise, data privacy |
| CLI LLM | Local model via CLI | Self-hosted, offline |
## Architecture Overview

The gateway sits between tenant requests and LLM providers, selecting one of the three routing modes per request based on tenant configuration.

## Platform LLM

Default mode where the platform provides LLM access using shared API keys.
### Characteristics
- Managed Keys: Platform maintains API keys for all providers
- Usage Metering: Tracks token usage per tenant for billing
- Rate Limiting: Enforces per-tenant rate limits
- Cost Control: Budget limits and alerts
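A minimal sketch of how per-tenant usage metering might be tracked for billing (the class and method names are illustrative assumptions, not DgiDgi's actual API):

```python
from collections import defaultdict


class UsageMeter:
    """Hypothetical per-tenant token meter used for billing and budget
    checks; illustrative only, not DgiDgi's actual implementation."""

    def __init__(self, daily_token_limit: int):
        self.daily_token_limit = daily_token_limit
        self.usage = defaultdict(int)  # tenant_id -> tokens used today

    def record(self, tenant_id: str, tokens_in: int, tokens_out: int) -> None:
        # Bill both prompt and completion tokens against the tenant.
        self.usage[tenant_id] += tokens_in + tokens_out

    def within_budget(self, tenant_id: str) -> bool:
        return self.usage[tenant_id] < self.daily_token_limit


meter = UsageMeter(daily_token_limit=50_000)  # Free-tier daily limit
meter.record("tenant-a", tokens_in=1500, tokens_out=800)
```

A real meter would persist counters and reset them daily; this sketch only shows the accounting step.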
### Flow

Requests flow from the tenant through the gateway, which attaches platform-managed keys, forwards to the selected provider, and records usage for metering.

### Configuration

```jsonc
// Tenant settings (default)
{
  "llm": {
    "source": "platform",
    "preferredProvider": "anthropic",
    "preferredModel": "claude-sonnet-4-20250514"
  }
}
```
## Tenant LLM
Enterprise mode where tenants provide their own API keys.
### Characteristics
- Data Privacy: Requests go directly to provider, no platform visibility
- Own Billing: Tenant pays provider directly
- Full Control: Choose any model, no platform limits
- Key Management: Encrypted storage of tenant credentials
### Flow

Requests are authenticated with the tenant's own keys and go directly to the provider; the platform never sees request content.

### Configuration

```jsonc
// Tenant settings (own keys)
{
  "llm": {
    "source": "tenant",
    "providers": {
      "openai": { "configured": true },
      "anthropic": { "configured": true }
    }
  }
}
```
### Key Storage

Tenant API keys are stored encrypted in the database:

| Column | Description |
|---|---|
| id | Primary key |
| tenant_id | Tenant identifier |
| provider | Provider name (e.g., "openai", "anthropic") |
| encrypted_value | Encrypted API key (AES-256-GCM) |
| created_at | Creation timestamp |
| rotated_at | Last rotation timestamp |
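As an illustration of how the encrypted_value column can be produced, here is a sketch of AES-256-GCM encryption using Python's cryptography package. The key-management details (master key handling, nonce layout) are assumptions; DgiDgi's actual scheme may differ:

```python
import os

from cryptography.hazmat.primitives.ciphers.aead import AESGCM


def encrypt_api_key(master_key: bytes, plaintext_key: str) -> bytes:
    """Encrypt a tenant API key with AES-256-GCM. The 12-byte random
    nonce is prepended to the ciphertext so the whole blob fits in a
    single database column."""
    nonce = os.urandom(12)
    ciphertext = AESGCM(master_key).encrypt(nonce, plaintext_key.encode(), None)
    return nonce + ciphertext


def decrypt_api_key(master_key: bytes, blob: bytes) -> str:
    """Split the stored blob back into nonce and ciphertext, then decrypt."""
    nonce, ciphertext = blob[:12], blob[12:]
    return AESGCM(master_key).decrypt(nonce, ciphertext, None).decode()


master = AESGCM.generate_key(bit_length=256)  # in practice, from a KMS/vault
blob = encrypt_api_key(master, "sk-example-123")
```

GCM authenticates the ciphertext, so any tampering with the stored blob causes decryption to fail rather than return garbage.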
## CLI LLM
Local model access via the DgiDgi CLI for self-hosted deployments.
### Characteristics
- Local Execution: Models run on user's machine
- Offline Capable: No internet required
- Privacy: Data never leaves local environment
- Custom Models: Support for any Ollama-compatible model
### Flow

Requests are served by a local model endpoint (e.g., Ollama) and never leave the user's machine.

### Configuration

```bash
# CLI configuration
dgidgi config set llm.source cli
dgidgi config set llm.endpoint http://localhost:11434
dgidgi config set llm.model llama3
```
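With those settings, the gateway talks to an Ollama-compatible server. How the gateway builds the request internally is an assumption, but the payload below matches Ollama's standard /api/generate endpoint:

```python
import json

# Request payload for Ollama's /api/generate endpoint, matching the
# CLI settings above (endpoint http://localhost:11434, model llama3).
endpoint = "http://localhost:11434/api/generate"
payload = {
    "model": "llama3",
    "prompt": "Hello",
    "stream": False,  # return one JSON response instead of a chunk stream
}
body = json.dumps(payload)
```

Setting "stream" to true instead yields newline-delimited JSON chunks, which is what the gateway would consume for streaming responses.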
## Routing Strategy

The gateway determines the LLM source for each request using a priority order.
## Provider Support

### Supported Providers
| Provider | Models | Streaming | Function Calling |
|---|---|---|---|
| OpenAI | GPT-4o, GPT-4, GPT-3.5 | Yes | Yes |
| Anthropic | Claude 4, Claude 3.5 | Yes | Yes |
| Google | Gemini Pro, Gemini Flash | Yes | Yes |
| Groq | Llama 3, Mixtral | Yes | Yes |
| Mistral | Mistral Large, Medium | Yes | Yes |
| Cohere | Command R+ | Yes | Yes |
| Perplexity | Sonar | Yes | No |
| xAI | Grok | Yes | Yes |
| DeepSeek | DeepSeek Coder | Yes | Yes |
| OpenRouter | Multiple | Yes | Varies |
### Model Fallback

If a requested model is unavailable, the gateway attempts a fallback to an alternative model.
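A sketch of what a fallback chain might look like; both the chain contents and the function are illustrative assumptions, since the real mappings are not documented here:

```python
# Hypothetical fallback chains, tried left to right.
FALLBACKS = {
    "claude-sonnet-4-20250514": ["claude-3-5-sonnet", "gpt-4o"],
}


def pick_model(requested: str, available: set[str]) -> str:
    """Return the requested model if available, otherwise the first
    available fallback; raise if the whole chain is exhausted."""
    for candidate in [requested] + FALLBACKS.get(requested, []):
        if candidate in available:
            return candidate
    raise LookupError(f"no available model for {requested!r}")


chosen = pick_model("claude-sonnet-4-20250514", {"gpt-4o"})
```

Raising rather than silently substituting an arbitrary model keeps the caller in control when the entire chain is down.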
## Rate Limiting & Quotas

### Platform LLM Limits
| Tier | Requests/min | Tokens/day | Concurrent |
|---|---|---|---|
| Free | 10 | 50,000 | 2 |
| Pro | 60 | 500,000 | 10 |
| Team | 200 | 2,000,000 | 50 |
| Enterprise | Custom | Custom | Custom |
### Tenant LLM Limits
When using tenant keys, limits are determined by the provider's API limits, not DgiDgi platform limits.
## Monitoring & Analytics

All LLM requests are logged for analytics (excluding actual content for privacy):

```json
{
  "tenant_id": "...",
  "provider": "anthropic",
  "model": "claude-sonnet-4-20250514",
  "source": "platform",
  "tokens_input": 1500,
  "tokens_output": 800,
  "latency_ms": 2340,
  "status": "success",
  "timestamp": "..."
}
```