---
title: "Load Balance"
description: "Intelligent API key management with weighted load balancing, model-specific filtering, and automatic failover. Distribute traffic across multiple keys for optimal performance and reliability."
icon: "scale-balanced"
---

## Smart Key Distribution

Bifrost's key management system goes beyond simple API key storage. It provides intelligent load balancing, model-specific key filtering, and weighted distribution to optimize performance and manage costs across multiple API keys.

When you configure multiple keys for a provider, Bifrost automatically distributes requests using a selection process that considers key weights, model compatibility, and deployment mappings.

## How Key Selection Works

Bifrost runs the following steps for every request:

1. **Context Override Check**: First checks if a key is explicitly provided in context (bypassing management)
2. **Provider Key Lookup**: Retrieves all configured keys for the requested provider
3. **Model Filtering**: Filters keys that support the requested model (respecting `models` allowlists and `blacklisted_models` denylists)
4. **Deployment Validation**: For Azure/Bedrock, validates deployment mappings
5. **Weighted Selection**: Uses weighted random selection among eligible keys

This distributes load according to your weights while respecting your configuration constraints.

## Implementation Examples

**Gateway:**

```bash
# 1. Create or ensure the provider exists
curl -X POST http://localhost:8080/api/providers \
  -H "Content-Type: application/json" \
  -d '{
    "provider": "openai"
  }'

# 2. Add keys individually via the dedicated keys API
curl -X POST http://localhost:8080/api/providers/openai/keys \
  -H "Content-Type: application/json" \
  -d '{
    "name": "openai-key-1",
    "value": "env.OPENAI_API_KEY_1",
    "models": ["gpt-4o", "gpt-4o-mini"],
    "weight": 0.7
  }'

curl -X POST http://localhost:8080/api/providers/openai/keys \
  -H "Content-Type: application/json" \
  -d '{
    "name": "openai-key-2",
    "value": "env.OPENAI_API_KEY_2",
    "models": ["*"],
    "weight": 0.3
  }'

# Regular request (uses weighted key selection)
curl -X POST http://localhost:8080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "openai/gpt-4o-mini",
    "messages": [{"role": "user", "content": "Hello!"}]
  }'

# Request with direct API key (bypasses key management)
curl -X POST http://localhost:8080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer sk-your-direct-api-key" \
  -d '{
    "model": "openai/gpt-4o-mini",
    "messages": [{"role": "user", "content": "Hello!"}]
  }'
```

**Go SDK:**

```go
package main

import (
	"context"
	"fmt"

	"github.com/maximhq/bifrost/core/schemas"
)

// MyAccount is your account implementation (definition omitted here).
func (a *MyAccount) GetKeysForProvider(ctx *context.Context, provider schemas.ModelProvider) ([]schemas.Key, error) {
	switch provider {
	case schemas.OpenAI:
		return []schemas.Key{
			{
				ID:     "primary-key",
				Value:  "env.OPENAI_API_KEY_1",
				Models: []string{"gpt-4o", "gpt-4o-mini"}, // Model whitelist
				Weight: 0.7,                                // 70% of traffic
			},
			{
				ID:     "secondary-key",
				Value:  "env.OPENAI_API_KEY_2",
				Models: []string{"*"}, // ["*"] = supports all models (empty slice denies all in v1.5.0+)
				Weight: 0.3,           // 30% of traffic
			},
		}, nil
	case schemas.Anthropic:
		return []schemas.Key{
			{
				Value:  "env.ANTHROPIC_API_KEY",
				Models: []string{"claude-3-5-sonnet-20241022"},
				Weight: 1.0,
			},
		}, nil
	}
	return nil, fmt.Errorf("provider %s not supported", provider)
}

// Using with explicit context key (bypasses key management).
// client and messages are assumed to be initialized elsewhere.
func makeRequestWithDirectKey() {
	ctx := context.Background()

	// Direct key bypasses all key management
	directKey := schemas.Key{
		Value:  "sk-direct-api-key",
		Weight: 1.0,
	}
	ctx = context.WithValue(ctx, schemas.BifrostContextKeyDirectKey, directKey)

	response, err := client.ChatCompletionRequest(schemas.NewBifrostContext(ctx, schemas.NoDeadline), &schemas.BifrostChatRequest{
		Provider: schemas.OpenAI,
		Model:    "gpt-4o-mini",
		Input:    messages,
	})
	if err != nil {
		fmt.Println("request failed:", err)
		return
	}
	fmt.Println(response)
}
```

## Weighted Load Balancing

Bifrost uses weighted random selection to distribute requests across multiple keys. This allows you to:

**Control Traffic Distribution:**
- Assign higher weights to premium keys with better rate limits
- Balance between production and backup keys
- Gradually migrate traffic during key rotation

**Weight Calculation Example:**

```
Key 1: Weight 0.7 (70% probability)
Key 2: Weight 0.3 (30% probability)
Total Weight: 1.0
Random selection ensures statistical distribution over time
```

**Algorithm Details:**
1. Calculate total weight of all eligible keys
2. Generate random number between 0 and total weight
3. Select key based on cumulative weight ranges
4. If selected key fails, automatic fallback to next available key

## Model Whitelisting and Filtering

Keys can be restricted to specific models for access control and cost management:

**Model Filtering Logic:**
- **Empty `models` array (`[]`)**: Denies ALL models (deny-by-default, v1.5.0+); use `["*"]` to allow all
- **Populated `models` array**: Key only supports listed models
- **`blacklisted_models`**: Optional per-key denylist.
  If non-empty and the requested model appears in it, the key is excluded, even if that model is also in `models` (denylist wins over the allowlist)
- **Model mismatch**: Key is excluded from selection for that request

**Use Cases:**
- **Premium Models**: Dedicated keys for expensive models (GPT-4, Claude-3)
- **Team Separation**: Different keys for different teams or projects
- **Cost Control**: Restrict access to specific model tiers
- **Compliance**: Separate keys for different security requirements
- **Denylist**: Block specific models on a key

**Example Model Restrictions:**

Each key is created individually via `POST /api/providers/{provider}/keys`:

```json
// Premium-only key
{
  "name": "openai-pre-key-1",
  "value": "premium-key",
  "models": ["gpt-4o", "o1-preview"],
  "weight": 1.0
}

// Standard-only key
{
  "name": "openai-std-key-1",
  "value": "standard-key",
  "models": ["gpt-4o-mini", "gpt-3.5-turbo"],
  "weight": 1.0
}

// Shared key with denylist
{
  "name": "openai-shared-key",
  "value": "env.OPENAI_API_KEY",
  "models": ["gpt-4o", "gpt-4o-mini"],
  "blacklisted_models": ["gpt-5"],
  "weight": 1.0
}
```

## Deployment Mapping (Azure & Bedrock)

For cloud providers with deployment-based routing, Bifrost validates deployment availability:

**Azure:**
- Keys must have deployment mappings for specific models
- Deployment name maps to actual Azure deployment identifier
- Missing deployment excludes key from selection

**AWS Bedrock:**
- Supports model profiles and direct model access
- Deployment mappings enable inference profile routing
- ARN configuration determines URL formation

**Deployment Validation Process:**
1. Check if provider uses deployments (Azure/Bedrock)
2. Verify deployment exists for requested model
3. Exclude keys without proper deployment mapping
4. Continue with standard weighted selection

## Custom Key Usage (By Name or ID)

Bifrost supports referencing a stored provider key by name or by ID instead of sending the raw secret.
This is useful when you want callers to reference logical key names or stable IDs and let the gateway resolve the actual secret from configured provider keys.

**When both are provided, ID takes priority over name.**

### By ID

- Header: send the `x-bf-api-key-id` header with the key's ID as its value. The gateway will look up the key with that ID.
- Context (Go SDK):

  ```go
  ctx := context.Background()
  ctx = context.WithValue(ctx, schemas.BifrostContextKeyAPIKeyID, "key-uuid-1234")
  ```

### By Name

- Header: send the `x-bf-api-key` header with the key's name as its value. The gateway will look up the named key and use its secret for the upstream provider call.
- Context (Go SDK):

  ```go
  ctx := context.Background()
  ctx = context.WithValue(ctx, schemas.BifrostContextKeyAPIKeyName, "openai-key-1")
  ```

Note: Both mechanisms reference a stored key (not the raw secret). The gateway resolves the key against configured provider keys and applies model allowlists, denylists, and deployment mapping. When an explicit key ID or name is supplied, weighted selection is bypassed and the referenced key is used directly.
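The resolution order described in this section can be sketched as follows. This is a minimal illustration under stated assumptions, not Bifrost's implementation: `StoredKey`, `resolveKey`, and the exact error strings are hypothetical stand-ins, and deployment mapping is left out.

```go
package main

import (
	"errors"
	"fmt"
)

// StoredKey is a hypothetical, simplified stand-in for a configured provider key.
type StoredKey struct {
	ID          string
	Name        string
	Models      []string // allowlist; "*" allows every model, empty allows none
	Blacklisted []string // denylist; wins over the allowlist
}

// supports reports whether the key may serve the model.
func (k StoredKey) supports(model string) bool {
	for _, m := range k.Blacklisted {
		if m == model {
			return false
		}
	}
	for _, m := range k.Models {
		if m == "*" || m == model {
			return true
		}
	}
	return false
}

// resolveKey looks up a key by explicit reference. No weighted selection
// happens on this path: the referenced key is used or an error is returned.
func resolveKey(keys []StoredKey, id, name, model string) (StoredKey, error) {
	if id == "" && name == "" {
		return StoredKey{}, errors.New("no key reference supplied")
	}
	for _, k := range keys {
		// ID takes priority: the name is consulted only when no ID was sent.
		matched := k.Name == name
		if id != "" {
			matched = k.ID == id
		}
		if !matched {
			continue
		}
		if !k.supports(model) {
			return StoredKey{}, fmt.Errorf("no keys found that support model: %s", model)
		}
		return k, nil
	}
	if id != "" {
		return StoredKey{}, fmt.Errorf("no key found with id %q", id)
	}
	return StoredKey{}, fmt.Errorf("no key found with name %q", name)
}

func main() {
	keys := []StoredKey{
		{ID: "key-uuid-1234", Name: "openai-key-1", Models: []string{"gpt-4o"}},
		{ID: "key-uuid-5678", Name: "openai-key-2", Models: []string{"*"}},
	}
	// Both references supplied: the ID decides which key is used.
	k, err := resolveKey(keys, "key-uuid-5678", "openai-key-1", "gpt-4o")
	fmt.Println(k.Name, err)
	// Name only, but the key's allowlist does not include the model.
	_, err = resolveKey(keys, "", "openai-key-1", "gpt-4o-mini")
	fmt.Println(err)
}
```

Returning an error instead of falling back to another key mirrors the note above: an explicit reference means the caller has already chosen the key.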
```bash
# Example: request referencing a stored key name that doesn't exist
curl -X POST http://localhost:8080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "x-bf-api-key: non_existant_key" \
  -d '{
    "model": "anthropic/claude-haiku-4-5",
    "messages": [{"role": "user", "content": "Hello, Bifrost!"}]
  }'
```

Response (example):

```json
{"is_bifrost_error":false,"error":{"error":"no key found with name \"non_existant_key\" for provider: anthropic","message":"no key found with name \"non_existant_key\" for provider: anthropic"},"extra_fields":{"provider":"anthropic","model_requested":"claude-haiku-4-5","request_type":"chat_completion"}}
```

```bash
# Example: request referencing a stored key name that exists but no configured keys support the requested model
curl -X POST http://localhost:8080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "x-bf-api-key: key_with_model_disabled" \
  -d '{
    "model": "anthropic/claude-sonnet-4-5",
    "messages": [{"role": "user", "content": "Hello, Bifrost!"}]
  }'
```

Response (example):

```json
{"is_bifrost_error":false,"error":{"error":"no keys found that support model: claude-sonnet-4-5","message":"no keys found that support model: claude-sonnet-4-5"},"extra_fields":{"provider":"anthropic","model_requested":"claude-sonnet-4-5","request_type":"chat_completion"}}
```

Note: This is not weighted selection. By providing a specific key name you explicitly tell Bifrost which stored key to use, so weighted distribution is bypassed. The examples above show the errors returned when a referenced key name cannot be resolved and when the referenced key does not support the requested model.

## Direct Key Bypass

For scenarios requiring explicit key control, Bifrost supports bypassing the entire key management system:

**Go SDK Context Override:** Pass a key directly in the request context using `schemas.BifrostContextKeyDirectKey`. This completely bypasses provider key lookup and selection.

**Gateway Header-based Keys:** Send API keys in the `Authorization` (Bearer), `x-api-key`, or `x-goog-api-key` headers. This requires the `allow_direct_keys` setting to be enabled.

**Enable Direct Keys:**

![Web UI](../media/ui-config-direct-keys.png)

1. Navigate to the **Configuration** page
2. Toggle **"Allow Direct Keys"** to enabled
3. Save configuration

Or set it in the configuration:

```json
{
  "client": {
    "allow_direct_keys": true
  }
}
```

If the auth header carries a Bifrost virtual key (`sk-bf-*`), direct key bypass is skipped.

**When to Use Direct Keys:**
- Per-user API key scenarios
- External key management systems
- Testing with specific keys
- Debugging key-related issues
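
For contrast with the bypass, the managed path that a direct key skips (model filtering followed by weighted random selection, as described earlier on this page) can be sketched as below. This is a simplified illustration, not Bifrost's source: the `Key` struct, `eligible`, `pickWeighted`, and `selectKey` are hypothetical names, and deployment validation and failover are omitted.

```go
package main

import (
	"errors"
	"fmt"
	"math/rand"
)

// Key is a hypothetical, trimmed-down view of a configured provider key.
type Key struct {
	Name              string
	Models            []string // allowlist; "*" allows all, empty allows none
	BlacklistedModels []string // denylist; wins over the allowlist
	Weight            float64
}

func contains(list []string, s string) bool {
	for _, v := range list {
		if v == s {
			return true
		}
	}
	return false
}

// eligible applies the denylist first, then the allowlist.
func eligible(k Key, model string) bool {
	if contains(k.BlacklistedModels, model) {
		return false
	}
	return contains(k.Models, "*") || contains(k.Models, model)
}

// pickWeighted walks cumulative weight ranges; r must lie in [0, total weight).
// keys must be non-empty.
func pickWeighted(keys []Key, r float64) Key {
	cumulative := 0.0
	for _, k := range keys {
		cumulative += k.Weight
		if r < cumulative {
			return k
		}
	}
	return keys[len(keys)-1] // guards against floating-point rounding
}

// selectKey filters by model, then draws a key in proportion to its weight.
// draw must return a value in [0, 1), for example rand.Float64.
func selectKey(keys []Key, model string, draw func() float64) (Key, error) {
	var pool []Key
	total := 0.0
	for _, k := range keys {
		if eligible(k, model) {
			pool = append(pool, k)
			total += k.Weight
		}
	}
	if len(pool) == 0 {
		return Key{}, errors.New("no keys found that support model: " + model)
	}
	return pickWeighted(pool, draw()*total), nil
}

func main() {
	keys := []Key{
		{Name: "openai-key-1", Models: []string{"gpt-4o", "gpt-4o-mini"}, Weight: 0.7},
		{Name: "openai-key-2", Models: []string{"*"}, Weight: 0.3},
	}
	counts := map[string]int{}
	for i := 0; i < 10000; i++ {
		k, err := selectKey(keys, "gpt-4o-mini", rand.Float64)
		if err != nil {
			panic(err)
		}
		counts[k.Name]++
	}
	// Counts should land near a 70/30 split; exact values vary per run.
	fmt.Println(counts)
}
```

Passing the random source in as a function keeps `selectKey` deterministic under test while still using `rand.Float64` in normal use.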