first commit

This commit is contained in:
Beyhan Oğur
2026-04-26 21:52:23 +03:00
commit 880f412e2c
2662 changed files with 866266 additions and 0 deletions

@@ -0,0 +1,416 @@
---
title: "Quick Start"
description: "Configure Bifrost using a config.json file — GitOps-friendly, no-UI deployments, and multinode OSS setups"
icon: "file-code"
---
<Note>
**Full schema reference:** [`https://www.getbifrost.ai/schema`](https://www.getbifrost.ai/schema)
</Note>
`config.json` lets you configure every aspect of Bifrost through a single declarative file. It is the right choice for GitOps workflows, CI/CD pipelines, headless deployments, and multinode OSS setups where a central configuration file is shared across all replicas.
---
## Two Configuration Modes
Bifrost runs in one of **two mutually exclusive modes**. Which one applies depends on whether `config.json` is present and whether `config_store` is enabled.
| Mode | When | Behaviour |
|------|------|-----------|
| **Web UI / database** | No `config.json`, or `config.json` with `config_store` enabled | Full UI available, configuration stored in SQLite or PostgreSQL |
| **File-based (`config.json`)** | `config.json` present, `config_store` disabled | UI disabled, all config loaded from file at startup, restart required for changes |
<Note>
See [Setting Up](/quickstart/gateway/setting-up#two-configuration-modes) for a full explanation of both modes and how `config_store` bootstrapping works.
</Note>
---
## Minimal Working Example
```json
{
"$schema": "https://www.getbifrost.ai/schema",
"encryption_key": "env.BIFROST_ENCRYPTION_KEY",
"client": {
"drop_excess_requests": false,
"enable_logging": true
},
"providers": {
"openai": {
"keys": [
{
"name": "openai-primary",
"value": "env.OPENAI_API_KEY",
"models": ["*"],
"weight": 1.0
}
]
}
},
"config_store": {
"enabled": false
}
}
```
Save this as `config.json` in your app directory (`./data` in the commands below) and start Bifrost:
```bash
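# NPX: point Bifrost at the directory that holds config.json
npx -y @maximhq/bifrost -app-dir ./data
# Docker: mount ./data into the container and pass secrets as env vars
docker run -p 8080:8080 \
  -v "$(pwd)/data:/app/data" \
  -e OPENAI_API_KEY=sk-... \
  -e BIFROST_ENCRYPTION_KEY=your-32-byte-key \
  maximhq/bifrost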
```
Make your first call:
```bash
curl http://localhost:8080/v1/chat/completions \
-H "Content-Type: application/json" \
-d '{
"model": "openai/gpt-4o-mini",
"messages": [{"role": "user", "content": "Hello!"}]
}'
```
---
## Environment Variable References
Never put secrets directly in `config.json`. Use the `env.` prefix to reference any environment variable:
```json
{
"encryption_key": "env.BIFROST_ENCRYPTION_KEY",
"providers": {
"openai": {
"keys": [
{
"name": "primary",
"value": "env.OPENAI_API_KEY",
"weight": 1.0
}
]
}
}
}
```
Set the actual values through your deployment platform — shell environment, Docker `-e`, Kubernetes Secrets mounted as env vars, or a `.env` file.
---
## Schema Validation
Add `$schema` to every `config.json` for IDE autocomplete and inline validation:
```json
{
"$schema": "https://www.getbifrost.ai/schema"
}
```
Editors (VS Code, JetBrains, Neovim with LSP) will show completions and flag invalid fields as you type.
---
## Production Example
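A production-oriented file with PostgreSQL storage for both configuration and logs, two providers, and hardened client settings (auth enforced on inference, direct keys disabled, restricted CORS origins):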
```json
{
"$schema": "https://www.getbifrost.ai/schema",
"encryption_key": "env.BIFROST_ENCRYPTION_KEY",
"client": {
"initial_pool_size": 500,
"drop_excess_requests": true,
"enable_logging": true,
"log_retention_days": 90,
"enforce_auth_on_inference": true,
"allow_direct_keys": false,
"allowed_origins": ["https://app.yourcompany.com"]
},
"providers": {
"openai": {
"keys": [
{
"name": "openai-primary",
"value": "env.OPENAI_API_KEY",
"models": ["*"],
"weight": 1.0
}
],
"network_config": {
"default_request_timeout_in_seconds": 120,
"max_retries": 3
}
},
"anthropic": {
"keys": [
{
"name": "anthropic-primary",
"value": "env.ANTHROPIC_API_KEY",
"models": ["*"],
"weight": 1.0
}
]
}
},
"config_store": {
"enabled": true,
"type": "postgres",
"config": {
"host": "env.PG_HOST",
"port": "5432",
"user": "env.PG_USER",
"password": "env.PG_PASSWORD",
"db_name": "bifrost",
"ssl_mode": "require"
}
},
"logs_store": {
"enabled": true,
"type": "postgres",
"config": {
"host": "env.PG_HOST",
"port": "5432",
"user": "env.PG_USER",
"password": "env.PG_PASSWORD",
"db_name": "bifrost",
"ssl_mode": "require"
}
}
}
```
---
## Enterprise Example: Postgres + etcd + Access Profiles
Use this pattern when you want enterprise access-profile configuration to be seeded directly from `config.json`, while running clustered nodes with etcd discovery.
```json
{
"$schema": "https://www.getbifrost.ai/schema",
"cluster_config": {
"enabled": true,
"discovery": {
"enabled": true,
"type": "etcd",
"service_name": "bifrost-cluster",
"etcd_endpoints": ["http://localhost:2379"]
}
},
"config_store": {
"enabled": true,
"type": "postgres",
"config": {
"host": "localhost",
"port": "5432",
"user": "postgres",
"password": "env.PG_PASSWORD",
"db_name": "bifrost-config",
"ssl_mode": "disable"
}
},
"logs_store": {
"enabled": true,
"type": "postgres",
"config": {
"host": "localhost",
"port": "5432",
"user": "postgres",
"password": "env.PG_PASSWORD",
"db_name": "bifrost-config",
"ssl_mode": "disable"
}
},
"mcp": {
"client_configs": [
{
"client_id": "echo_http",
"name": "echo_http",
"connection_type": "http",
"connection_string": "https://mcpplaygroundonline.com/mcp-echo-server",
"auth_type": "none",
"tools_to_execute": ["echo"]
}
]
},
"access_profiles": [
{
"name": "platform-default",
"description": "Default profile for enterprise access-profile testing",
"is_active": true,
"tags": ["platform", "test"],
"provider_configs": [
{
"provider_name": "OpenAi",
"all_models_allowed": false,
"allowed_models": ["gpt-4o-mini"]
}
]
},
{
"name": "platform-readonly-mcp",
"description": "Profile for validating MCP include/exclude behavior",
"is_active": true,
"tags": ["mcp", "test"],
"mcp_servers": [
{
"mcp_server_id": "echo_http"
}
],
"mcp_tool_overrides": [
{
"mcp_client_id": "echo_http",
"tool_name": "echo",
"action": "include"
},
{
"mcp_client_id": "github",
"tool_name": "create_pull_request",
"action": "exclude"
}
]
}
]
}
```
<Note>
`access_profiles` is an enterprise capability. For OSS-only deployments, use `governance.virtual_keys` and related governance resources instead.
</Note>
---
## Example Configs
Ready-to-use reference configurations from the [examples/configs](https://github.com/maximhq/bifrost/tree/main/examples/configs) directory on GitHub:
<AccordionGroup>
<Accordion title="Minimal / File-only">
| Example | Description |
|---------|-------------|
| [noconfigstorenologstore](https://github.com/maximhq/bifrost/blob/main/examples/configs/noconfigstorenologstore/config.json) | Bare-minimum file-only mode — no database, no UI, providers loaded from file |
| [partial](https://github.com/maximhq/bifrost/blob/main/examples/configs/partial/config.json) | SQLite config store with a minimal provider setup |
| [v1compat](https://github.com/maximhq/bifrost/blob/main/examples/configs/v1compat/config.json) | `"version": 1` for v1.4.x array semantics (empty = allow all) |
</Accordion>
<Accordion title="Storage">
| Example | Description |
|---------|-------------|
| [withconfigstore](https://github.com/maximhq/bifrost/blob/main/examples/configs/withconfigstore/config.json) | SQLite config store (Web UI enabled) |
| [withconfigstorelogsstorepostgres](https://github.com/maximhq/bifrost/blob/main/examples/configs/withconfigstorelogsstorepostgres/config.json) | PostgreSQL for both config store and logs store |
| [withlogstore](https://github.com/maximhq/bifrost/blob/main/examples/configs/withlogstore/config.json) | SQLite logs store |
| [withobjectstorages3](https://github.com/maximhq/bifrost/blob/main/examples/configs/withobjectstorages3/config.json) | S3 object storage offload for logs |
| [withobjectstoragegcs](https://github.com/maximhq/bifrost/blob/main/examples/configs/withobjectstoragegcs/config.json) | GCS object storage offload for logs |
| [withvectorstoreweaviate](https://github.com/maximhq/bifrost/blob/main/examples/configs/withvectorstoreweaviate/config.json) | Weaviate vector store (with [docker-compose](https://github.com/maximhq/bifrost/blob/main/examples/configs/withvectorstoreweaviate/docker-compose.yml)) |
</Accordion>
<Accordion title="Semantic Cache">
| Example | Description |
|---------|-------------|
| [withsemanticcache](https://github.com/maximhq/bifrost/blob/main/examples/configs/withsemanticcache/config.json) | Semantic cache backed by Weaviate |
| [withsemanticcachevalkey](https://github.com/maximhq/bifrost/blob/main/examples/configs/withsemanticcachevalkey/config.json) | Semantic cache backed by Valkey / Redis |
</Accordion>
<Accordion title="Governance">
| Example | Description |
|---------|-------------|
| [withauth](https://github.com/maximhq/bifrost/blob/main/examples/configs/withauth/config.json) | Admin username/password auth (`governance.auth_config`) |
| [withvirtualkeys](https://github.com/maximhq/bifrost/blob/main/examples/configs/withvirtualkeys/config.json) | Virtual keys with provider/model allowlists |
| [withteamscustomers](https://github.com/maximhq/bifrost/blob/main/examples/configs/withteamscustomers/config.json) | Teams and customers with budgets and rate limits |
| [withroutingrules](https://github.com/maximhq/bifrost/blob/main/examples/configs/withroutingrules/config.json) | CEL-based routing rules for dynamic provider/model selection |
| [withpricingoverridesnostore](https://github.com/maximhq/bifrost/blob/main/examples/configs/withpricingoverridesnostore/config.json) | Pricing overrides in file-only mode |
| [withpricingoverridessqlite](https://github.com/maximhq/bifrost/blob/main/examples/configs/withpricingoverridessqlite/config.json) | Pricing overrides with SQLite config store |
</Accordion>
<Accordion title="Observability">
| Example | Description |
|---------|-------------|
| [withobservability](https://github.com/maximhq/bifrost/blob/main/examples/configs/withobservability/config.json) | Prometheus metrics (telemetry always active, custom labels via `client.prometheus_labels`) |
| [withprompushgateway](https://github.com/maximhq/bifrost/blob/main/examples/configs/withprompushgateway/config.json) | Prometheus Push Gateway for multi-instance deployments |
| [withotel](https://github.com/maximhq/bifrost/blob/main/examples/configs/withotel/config.json) | OpenTelemetry traces and metrics |
</Accordion>
<Accordion title="Plugins & Advanced">
| Example | Description |
|---------|-------------|
| [withdynamicplugin](https://github.com/maximhq/bifrost/blob/main/examples/configs/withdynamicplugin/config.json) | Loading a custom `.so` plugin at startup |
| [withcompat](https://github.com/maximhq/bifrost/blob/main/examples/configs/withcompat/config.json) | SDK compatibility shims (`should_drop_params`, `convert_text_to_chat`) |
| [withframework](https://github.com/maximhq/bifrost/blob/main/examples/configs/withframework/config.json) | Custom model pricing catalog URL and sync interval |
| [withlargepayload](https://github.com/maximhq/bifrost/blob/main/examples/configs/withlargepayload/config.json) | Large payload optimization (streaming without full materialisation) |
| [withwebsocket](https://github.com/maximhq/bifrost/blob/main/examples/configs/withwebsocket/config.json) | WebSocket / Realtime API connection pool tuning |
| [withnginxreverseproxy](https://github.com/maximhq/bifrost/blob/main/examples/configs/withnginxreverseproxy/config.json) | 3-node Bifrost behind NGINX reverse proxy (includes [docker-compose](https://github.com/maximhq/bifrost/blob/main/examples/configs/withnginxreverseproxy/docker-compose.yml), [nginx.conf](https://github.com/maximhq/bifrost/blob/main/examples/configs/withnginxreverseproxy/nginx.conf), [helm values](https://github.com/maximhq/bifrost/blob/main/examples/configs/withnginxreverseproxy/helm-values.yaml), and [k8s ingress](https://github.com/maximhq/bifrost/blob/main/examples/configs/withnginxreverseproxy/k8s-ingress.yaml)) |
| [withpostgresmcpclientsinconfig](https://github.com/maximhq/bifrost/blob/main/examples/configs/withpostgresmcpclientsinconfig/config.json) | MCP client definitions seeded from config.json with PostgreSQL store |
| [encryptionmigration](https://github.com/maximhq/bifrost/blob/main/examples/configs/encryptionmigration/config.json) | Migrating to a new encryption key |
</Accordion>
</AccordionGroup>
---
## Configuration Guides
<CardGroup cols={2}>
<Card title="Schema Reference" icon="brackets-curly" href="/deployment-guides/config-json/schema-reference">
Every top-level key, its type, default, and where it is documented
</Card>
<Card title="Client Configuration" icon="gear" href="/deployment-guides/config-json/client">
Pool size, logging, CORS, header filtering, compat shims, MCP settings
</Card>
<Card title="Provider Setup" icon="plug" href="/deployment-guides/config-json/providers">
OpenAI, Anthropic, Azure, Bedrock, Vertex, Groq, self-hosted
</Card>
<Card title="Storage" icon="database" href="/deployment-guides/config-json/storage">
config_store, logs_store, vector_store — SQLite, PostgreSQL, object storage
</Card>
<Card title="Plugins" icon="puzzle-piece" href="/deployment-guides/config-json/plugins">
Semantic cache, OTel, Maxim, Datadog, custom plugins
</Card>
<Card title="Cluster" icon="circle-nodes" href="/deployment-guides/config-json/cluster">
Cluster mode with static peers or discovery backends (enterprise)
</Card>
<Card title="Governance" icon="shield-check" href="/deployment-guides/config-json/governance">
Virtual keys, budgets, rate limits, routing rules, admin auth
</Card>
<Card title="Guardrails" icon="shield-halved" href="/deployment-guides/config-json/guardrails">
Content moderation providers and CEL-based rules (enterprise)
</Card>
</CardGroup>
---
## Next Steps
1. Configure [provider keys](/providers/supported-providers/overview)
2. Enable [plugins](/plugins/getting-started)
3. Set up [observability](/features/observability/default)
4. Configure [governance](/features/governance/virtual-keys)
5. Deploy [multiple nodes](/deployment-guides/how-to/multinode) with a shared `config.json`

@@ -0,0 +1,276 @@
---
title: "Client Configuration"
description: "Configure the Bifrost client in config.json — connection pool, logging, CORS, header filtering, compat shims, and MCP settings"
icon: "gear"
---
The `client` block controls how Bifrost manages its internal worker pool, request logging, authentication enforcement, header policies, SDK compatibility shims, and MCP agent behaviour.
---
## Connection Pool
| Field | Type | Default | Description |
|-------|------|---------|-------------|
| `initial_pool_size` | integer | `300` | Pre-allocated worker goroutines per provider queue |
| `drop_excess_requests` | boolean | `false` | Drop requests when queue is full instead of waiting (returns HTTP 429) |
A larger pool reduces latency spikes under burst load at the cost of higher baseline memory. A value between `500` and `1000` is a common starting point for production workloads with multiple providers.
```json
{
"client": {
"initial_pool_size": 1000,
"drop_excess_requests": true
}
}
```
---
## Request & Response Logging
| Field | Type | Default | Description |
|-------|------|---------|-------------|
| `enable_logging` | boolean | — | Log all LLM requests and responses |
| `disable_content_logging` | boolean | `false` | Strip message content from logs (keeps metadata only) |
| `log_retention_days` | integer | `365` | Days to retain log entries in the store |
| `logging_headers` | array of strings | `[]` | HTTP request headers to capture in log metadata |
Set `disable_content_logging: true` for HIPAA / PCI compliance workloads where message content must not be persisted.
```json
{
"client": {
"enable_logging": true,
"disable_content_logging": true,
"log_retention_days": 90,
"logging_headers": ["x-request-id", "x-user-id"]
}
}
```
---
## Security & CORS
| Field | Type | Default | Description |
|-------|------|---------|-------------|
| `allowed_origins` | array | `["*"]` | CORS allowed origins (use URIs or `"*"`) |
| `allow_direct_keys` | boolean | `false` | Allow callers to pass provider keys directly in requests |
| `enforce_auth_on_inference` | boolean | `false` | Require auth (virtual key, API key, or user token) on `/v1/*` inference routes |
| `max_request_body_size_mb` | integer | `100` | Maximum allowed request body size in MB |
| `whitelisted_routes` | array of strings | `[]` | Routes that bypass auth middleware |
| `allowed_headers` | array of strings | `[]` | Additional headers permitted for CORS and WebSocket |
```json
{
"client": {
"allowed_origins": [
"https://app.yourcompany.com",
"https://admin.yourcompany.com"
],
"allow_direct_keys": false,
"enforce_auth_on_inference": true,
"max_request_body_size_mb": 50,
"whitelisted_routes": ["/health", "/metrics"]
}
}
```
---
## Header Filtering
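`header_filter_config` controls which `x-bf-eh-*` extra headers are forwarded to upstream LLM providers. `required_headers` lists the headers that every incoming request must carry.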
| Field | Type | Default | Description |
|-------|------|---------|-------------|
| `header_filter_config.allowlist` | array of strings | `[]` | Only these headers are forwarded (whitelist mode) |
| `header_filter_config.denylist` | array of strings | `[]` | These headers are always blocked |
| `required_headers` | array of strings | `[]` | Headers that must be present on every request (rejected with 400 if missing) |
When both `allowlist` and `denylist` are empty, all `x-bf-eh-*` headers pass through. Specifying an `allowlist` enables strict whitelist mode — only listed headers are forwarded.
```json
{
"client": {
"header_filter_config": {
"allowlist": [
"x-bf-eh-anthropic-version",
"x-bf-eh-openai-beta"
],
"denylist": []
},
"required_headers": ["x-request-id"]
}
}
```
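
The denylist can also be used on its own. The fragment below is a sketch that leaves `allowlist` empty and blocks a single header. The header name `x-bf-eh-internal-debug` is a hypothetical placeholder, and the expectation that every other `x-bf-eh-*` header still passes through is inferred from the table above rather than confirmed behaviour.

```json
{
  "client": {
    "header_filter_config": {
      "allowlist": [],
      "denylist": ["x-bf-eh-internal-debug"]
    }
  }
}
```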
---
## Compat Shims
The `compat` block holds compatibility flags that let Bifrost silently adapt request and response shapes for SDK integrations.
| Field | Type | Default | Description |
|-------|------|---------|-------------|
| `compat.convert_text_to_chat` | boolean | `false` | Wrap legacy `/v1/completions` text requests as chat messages |
| `compat.convert_chat_to_responses` | boolean | `false` | Translate chat completions to Responses API format |
| `compat.should_drop_params` | boolean | `false` | Silently drop unsupported parameters instead of erroring |
| `compat.should_convert_params` | boolean | `false` | Auto-convert parameter values across provider schemas |
```json
{
"client": {
"compat": {
"should_drop_params": true,
"convert_text_to_chat": true
}
}
}
```
---
## MCP Agent Settings
| Field | Type | Default | Description |
|-------|------|---------|-------------|
| `mcp_agent_depth` | integer | `10` | Maximum tool-call recursion depth for MCP agent mode |
| `mcp_tool_execution_timeout` | integer | `30` | Timeout per MCP tool execution in seconds |
| `mcp_code_mode_binding_level` | string | — | Code mode binding level: `"server"` or `"tool"` |
| `mcp_tool_sync_interval` | integer | `10` | Global tool sync interval in minutes (`0` = disabled) |
| `mcp_disable_auto_tool_inject` | boolean | `false` | When `true`, MCP tools are not automatically injected into requests |
```json
{
"client": {
"mcp_agent_depth": 15,
"mcp_tool_execution_timeout": 60,
"mcp_tool_sync_interval": 10
}
}
```
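
The remaining two fields from the table are set the same way. This sketch assumes a deployment that does not want MCP tools injected automatically and that binds code mode at the tool level; the values are illustrative, not recommendations.

```json
{
  "client": {
    "mcp_code_mode_binding_level": "tool",
    "mcp_disable_auto_tool_inject": true
  }
}
```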
---
## Async Jobs
| Field | Type | Default | Description |
|-------|------|---------|-------------|
| `async_job_result_ttl` | integer | `3600` | TTL (seconds) for async job results |
| `disable_db_pings_in_health` | boolean | `false` | Exclude database connectivity from `/health` endpoint checks |
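
Both fields appear to sit in the `client` block alongside the other settings on this page. A minimal sketch with illustrative values: async job results are kept for two hours, and `/health` skips the database connectivity check.

```json
{
  "client": {
    "async_job_result_ttl": 7200,
    "disable_db_pings_in_health": true
  }
}
```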
---
## Prometheus Labels
Add custom labels to every Prometheus metric emitted by Bifrost:
```json
{
"client": {
"prometheus_labels": ["environment=production", "region=us-east-1"]
}
}
```
---
## Authentication
`governance.auth_config` protects the Bifrost dashboard and management API with username/password auth.
| Field | Type | Default | Description |
|-------|------|---------|-------------|
| `is_enabled` | boolean | `false` | Enable username/password auth |
| `admin_username` | string | — | Admin username |
| `admin_password` | string | — | Admin password (use `env.` reference) |
| `disable_auth_on_inference` | boolean | `false` | Skip auth check on `/v1/*` inference routes |
```json
{
"governance": {
"auth_config": {
"is_enabled": true,
"admin_username": "env.BIFROST_ADMIN_USERNAME",
"admin_password": "env.BIFROST_ADMIN_PASSWORD",
"disable_auth_on_inference": false
}
}
}
```
<Note>
A top-level `auth_config` is also accepted for backwards compatibility, but `governance.auth_config` is the preferred location.
</Note>
---
## Encryption Key
```json
{
"encryption_key": "env.BIFROST_ENCRYPTION_KEY"
}
```
| Notes |
|-------|
| Accepts any string; Bifrost derives a 32-byte AES-256 key using Argon2id |
| Can also be set via the `BIFROST_ENCRYPTION_KEY` environment variable |
| Once set and the database is populated, the key cannot be changed without clearing the database |
| Omitting the key stores data in plain text — not recommended for production |
---
## Full Example
```json
{
"$schema": "https://www.getbifrost.ai/schema",
"encryption_key": "env.BIFROST_ENCRYPTION_KEY",
"governance": {
"auth_config": {
"is_enabled": true,
"admin_username": "env.BIFROST_ADMIN_USERNAME",
"admin_password": "env.BIFROST_ADMIN_PASSWORD",
"disable_auth_on_inference": false
}
},
"client": {
"initial_pool_size": 1000,
"drop_excess_requests": true,
"enable_logging": true,
"disable_content_logging": false,
"log_retention_days": 90,
"logging_headers": ["x-request-id", "x-user-id"],
"allowed_origins": ["https://app.yourcompany.com"],
"allow_direct_keys": false,
"enforce_auth_on_inference": true,
"max_request_body_size_mb": 100,
"header_filter_config": {
"allowlist": [],
"denylist": []
},
"required_headers": [],
"compat": {
"should_drop_params": false
},
"prometheus_labels": ["environment=production"],
"mcp_agent_depth": 10,
"mcp_tool_execution_timeout": 30,
"async_job_result_ttl": 3600
}
}
```

@@ -0,0 +1,154 @@
---
title: "Cluster"
description: "Configure enterprise cluster mode in config.json using peers or automatic discovery"
icon: "circle-nodes"
---
<Warning>
`cluster_config` is an enterprise capability. OSS builds ignore this section.
</Warning>
`cluster_config` enables multi-node Bifrost enterprise clustering with gossip-based membership and optional automatic node discovery.
You can form a cluster in two ways:
- Define static `peers` (`host:port`)
- Enable `discovery` with one of: `kubernetes`, `dns`, `udp`, `consul`, `etcd`, `mdns`
<Tip>
At least one of `peers` or `discovery.enabled: true` must be configured when `cluster_config.enabled` is true.
</Tip>
---
## Minimal Runnable Configs
```json
{
"cluster_config": {
"enabled": true,
"discovery": {
"enabled": true,
"type": "mdns",
"service_name": "bifrost-cluster"
}
}
}
```
Use this for local testing. At startup, cluster init requires either:
- non-empty `peers`, or
- `discovery.enabled: true`
If neither is set, cluster initialization fails.
---
## Static Peers
```json
{
"cluster_config": {
"enabled": true,
"region": "us-east-1",
"peers": [
"10.0.1.10:10101",
"10.0.1.11:10101"
],
"gossip": {
"port": 10101,
"config": {
"timeout_seconds": 10,
"success_threshold": 3,
"failure_threshold": 3
}
}
}
}
```
---
## Discovery Example (etcd)
```json
{
"cluster_config": {
"enabled": true,
"region": "us-east-1",
"gossip": {
"port": 10101,
"config": {
"timeout_seconds": 10,
"success_threshold": 3,
"failure_threshold": 3
}
},
"discovery": {
"enabled": true,
"type": "etcd",
"service_name": "bifrost-cluster",
"etcd_endpoints": [
"http://etcd-1:2379",
"http://etcd-2:2379"
],
"dial_timeout": "10s"
}
}
}
```
---
## Field Reference
### `cluster_config`
| Field | Type | Description |
|-------|------|-------------|
| `enabled` | boolean | Enables cluster mode |
| `region` | string | Region label for this node (defaults to `"unknown"` at runtime when omitted) |
| `peers` | array of strings | Static peer addresses in `host:port` format |
| `gossip` | object | Gossip/memberlist settings |
| `discovery` | object | Automatic node discovery settings |
### `cluster_config.gossip`
| Field | Type | Description |
|-------|------|-------------|
| `port` | integer | Gossip port for this node |
| `config.timeout_seconds` | integer | Liveness timeout |
| `config.success_threshold` | integer | Success count before healthy |
| `config.failure_threshold` | integer | Failure count before unhealthy |
### `cluster_config.discovery`
| Field | Type | Description |
|-------|------|-------------|
| `enabled` | boolean | Enables discovery process |
| `type` | string | `kubernetes`, `dns`, `udp`, `consul`, `etcd`, `mdns` |
| `service_name` | string | Service identifier (required for `consul`, `etcd`, and `udp`; typically set for `mdns`; optional for `kubernetes` and `dns`) |
| `bind_port` | integer | Port appended to discovered hosts if missing |
| `dial_timeout` | string | Go duration string (`"5s"`, `"30s"`, `"1m"`) |
| `allowed_address_space` | array of strings | CIDR filters for discovered nodes |
| `k8s_namespace` | string | Kubernetes namespace for pod discovery |
| `k8s_label_selector` | string | Kubernetes label selector |
| `dns_names` | array of strings | DNS names to resolve |
| `udp_broadcast_port` | integer | UDP broadcast port (required for `udp`) |
| `consul_address` | string | Consul address |
| `etcd_endpoints` | array of strings | etcd endpoint URLs |
| `mdns_service` | string | Optional mDNS service type override (e.g. `"_bifrost-cluster._tcp"`) |
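
As an illustration of the Kubernetes-specific fields, the fragment below sketches pod discovery. The namespace `bifrost`, the label selector `app=bifrost`, and the port `10101` are assumed values for a hypothetical deployment; substitute your own.

```json
{
  "cluster_config": {
    "enabled": true,
    "discovery": {
      "enabled": true,
      "type": "kubernetes",
      "k8s_namespace": "bifrost",
      "k8s_label_selector": "app=bifrost",
      "bind_port": 10101
    }
  }
}
```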
<Note>
For `discovery.type: "mdns"`, `service_name` is sufficient for most setups. When `mdns_service` is omitted, Bifrost derives the mDNS service type as `"_<service_name>._tcp"`. If you set `mdns_service`, it **overrides** the derived value and is used for both mDNS registration and browsing.
</Note>
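
To make the override concrete: with the fragment below, Bifrost would register and browse under `_bifrost-staging._tcp` instead of the derived `_bifrost-cluster._tcp`. The service type `_bifrost-staging._tcp` is a made-up value used only for illustration.

```json
{
  "cluster_config": {
    "enabled": true,
    "discovery": {
      "enabled": true,
      "type": "mdns",
      "service_name": "bifrost-cluster",
      "mdns_service": "_bifrost-staging._tcp"
    }
  }
}
```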
<Warning>
For `discovery.type: "udp"`, configure both `udp_broadcast_port` and `allowed_address_space`.
</Warning>
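
A sketch of a UDP discovery block that sets both fields. The broadcast port `9999` and the CIDR `10.0.0.0/16` are placeholder values, not defaults.

```json
{
  "cluster_config": {
    "enabled": true,
    "discovery": {
      "enabled": true,
      "type": "udp",
      "service_name": "bifrost-cluster",
      "udp_broadcast_port": 9999,
      "allowed_address_space": ["10.0.0.0/16"]
    }
  }
}
```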
---
For discovery-method deep dives and deployment patterns, see [Enterprise Clustering](/enterprise/clustering).

@@ -0,0 +1,333 @@
---
title: "Governance"
description: "Seed virtual keys, budgets, rate limits, routing rules, and admin auth in config.json"
icon: "shield-check"
---
The `governance` block lets you seed all governance resources directly in `config.json`. On startup, Bifrost loads these into the configuration store. This is the recommended approach for GitOps workflows where governance state is managed as code.
<Note>
**Governance enforcement is always active** in OSS; no plugin entry is needed to enable it. To require a virtual key on every inference request, set `client.enforce_auth_on_inference: true`. That flag is the global default: a more specific inference-auth flag, such as `governance.auth_config.disable_auth_on_inference`, overrides it when set.
</Note>
---
## Admin Authentication
Protect the Bifrost dashboard and management API with username/password auth:
```json
{
"governance": {
"auth_config": {
"is_enabled": true,
"admin_username": "env.BIFROST_ADMIN_USERNAME",
"admin_password": "env.BIFROST_ADMIN_PASSWORD",
"disable_auth_on_inference": false
}
}
}
```
| Field | Default | Description |
|-------|---------|-------------|
| `is_enabled` | `false` | Enable admin username/password auth |
| `admin_username` | — | Admin username (supports `env.` prefix) |
| `admin_password` | — | Admin password (supports `env.` prefix) |
| `disable_auth_on_inference` | `false` | Skip auth check on `/v1/*` inference routes |
---
## Virtual Keys
Virtual keys are issued to clients and act as scoped API tokens. Each key specifies which providers, models, and API keys the bearer is allowed to use.
```json
{
"governance": {
"virtual_keys": [
{
"id": "vk-team-platform",
"name": "platform-team",
"value": "env.VK_PLATFORM_TEAM",
"is_active": true,
"provider_configs": [
{
"provider": "openai",
"allowed_models": ["gpt-4o", "gpt-4o-mini"],
"key_ids": ["*"],
"weight": 1
},
{
"provider": "anthropic",
"allowed_models": ["*"],
"key_ids": ["*"],
"weight": 1
}
]
}
]
}
}
```
### Virtual Key Fields
| Field | Required | Description |
|-------|----------|-------------|
| `id` | Yes | Unique virtual key ID (referenced by budgets / rate limits) |
| `name` | Yes | Human-readable name |
| `value` | No | The key token sent by clients (use `env.` prefix). Auto-generated if omitted |
| `is_active` | No | Default `true`. Set `false` to disable without deleting |
| `team_id` | No | Associate with a team (mutually exclusive with `customer_id`) |
| `customer_id` | No | Associate with a customer |
| `rate_limit_id` | No | Attach a rate limit |
| `calendar_aligned` | No | Snap budget resets to day/week/month/year boundaries |
| `provider_configs` | No | Allowed provider/model/key combinations (empty = deny all) |
### Provider Config Fields
| Field | Required | Description |
|-------|----------|-------------|
| `provider` | Yes | Provider name (e.g. `"openai"`) |
| `allowed_models` | No | Model allow-list. `["*"]` = all models; `[]` = deny all |
| `key_ids` | No | Provider key names allowed for this VK. `["*"]` = all keys; `[]` = deny all. Use key `name` values (not UUIDs) in `config.json` |
| `weight` | No | Load-balancing weight when multiple provider configs are present |
| `rate_limit_id` | No | Attach a per-provider-config rate limit |
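
Because `key_ids` takes key names, a virtual key can be pinned to a single provider key. The sketch below assumes a provider key named `openai-primary` exists under `providers.openai.keys`, as in the Quick Start examples; the virtual key ID, its name, and the `VK_SUPPORT_BOT` variable are hypothetical.

```json
{
  "governance": {
    "virtual_keys": [
      {
        "id": "vk-support-bot",
        "name": "support-bot",
        "value": "env.VK_SUPPORT_BOT",
        "provider_configs": [
          {
            "provider": "openai",
            "allowed_models": ["gpt-4o-mini"],
            "key_ids": ["openai-primary"],
            "weight": 1
          }
        ]
      }
    ]
  }
}
```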
---
## Budgets
Budgets cap cumulative spend (in USD) for a virtual key or provider config over a rolling window:
```json
{
"governance": {
"budgets": [
{
"id": "budget-platform-monthly",
"max_limit": 500.00,
"reset_duration": "1M",
"virtual_key_id": "vk-team-platform"
}
]
}
}
```
| Field | Required | Description |
|-------|----------|-------------|
| `id` | Yes | Unique budget ID |
| `max_limit` | Yes | Maximum spend in USD |
| `reset_duration` | Yes | Window length: `"30s"`, `"5m"`, `"1h"`, `"1d"`, `"1w"`, `"1M"`, `"1Y"` |
| `virtual_key_id` | No | Attach to a virtual key (mutually exclusive with `provider_config_id`) |
| `provider_config_id` | No | Attach to a provider config ID |
---
## Rate Limits
Rate limits cap requests or tokens over a rolling window:
```json
{
"governance": {
"rate_limits": [
{
"id": "rl-platform-hourly",
"request_max_limit": 1000,
"request_reset_duration": "1h",
"token_max_limit": 1000000,
"token_reset_duration": "1h"
}
]
}
}
```
| Field | Required | Description |
|-------|----------|-------------|
| `id` | Yes | Unique rate limit ID |
| `request_max_limit` | No | Maximum requests in window |
| `request_reset_duration` | No | Window for request counter |
| `token_max_limit` | No | Maximum tokens (input + output) in window |
| `token_reset_duration` | No | Window for token counter |
Attach a rate limit to a virtual key via `virtual_keys[].rate_limit_id`, or to a provider config via `virtual_keys[].provider_configs[].rate_limit_id`.
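
Both attachment points can be sketched in one fragment. All IDs and limits here are illustrative: `rl-vk-hourly` applies to the whole virtual key, while `rl-openai-tokens` applies only to its OpenAI provider config.

```json
{
  "governance": {
    "rate_limits": [
      {
        "id": "rl-vk-hourly",
        "request_max_limit": 1000,
        "request_reset_duration": "1h"
      },
      {
        "id": "rl-openai-tokens",
        "token_max_limit": 200000,
        "token_reset_duration": "1h"
      }
    ],
    "virtual_keys": [
      {
        "id": "vk-team-platform",
        "name": "platform-team",
        "rate_limit_id": "rl-vk-hourly",
        "provider_configs": [
          {
            "provider": "openai",
            "allowed_models": ["*"],
            "key_ids": ["*"],
            "rate_limit_id": "rl-openai-tokens"
          }
        ]
      }
    ]
  }
}
```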
---
## Routing Rules
Routing rules dynamically select the provider and model for each request based on a [CEL](https://cel.dev) expression. They are evaluated in priority order before the request is dispatched.
```json
{
"governance": {
"routing_rules": [
{
"id": "route-gpt4-to-azure",
"name": "Redirect GPT-4o to Azure",
"cel_expression": "request.model == 'gpt-4o'",
"targets": [
{ "provider": "azure", "model": "gpt-4o", "weight": 1.0 }
]
},
{
"id": "route-cost-split",
"name": "Split traffic 70/30 between providers",
"cel_expression": "true",
"targets": [
{ "provider": "openai", "weight": 0.7 },
{ "provider": "anthropic", "weight": 0.3 }
]
}
]
}
}
```
### Rule Fields
| Field | Required | Description |
|-------|----------|-------------|
| `id` | Yes | Unique rule ID |
| `name` | Yes | Human-readable name |
| `cel_expression` | No | CEL expression. `"true"` matches every request |
| `targets` | Yes | Weighted target list (weights must sum to `1.0`) |
| `enabled` | No | Default `true` |
| `priority` | No | Evaluation order within scope — lower numbers run first |
| `scope` | No | `"global"` (default), `"team"`, `"customer"`, `"virtual_key"` |
| `scope_id` | Conditional | Required when `scope` is not `"global"` |
| `chain_rule` | No | If `true`, re-evaluates the chain after this rule matches |
| `fallbacks` | No | Ordered fallback provider list if primary target fails |
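The optional fields in this table can be combined in one rule. The sketch below is an illustration only: the rule ID, the `priority` value, and the team ID `team-ml` are assumptions, and `fallbacks` is written as a list of provider names, the same form used in the Full Governance Example.

```json
{
  "governance": {
    "routing_rules": [
      {
        "id": "route-ml-team-to-anthropic",
        "name": "ML team prefers Anthropic",
        "enabled": true,
        "priority": 10,
        "scope": "team",
        "scope_id": "team-ml",
        "cel_expression": "true",
        "targets": [
          { "provider": "anthropic", "weight": 1.0 }
        ],
        "fallbacks": ["openai"]
      }
    ]
  }
}
```

Since `scope` is not `"global"` here, `scope_id` is required.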
### Target Fields
| Field | Required | Description |
|-------|----------|-------------|
| `weight` | Yes | Fraction of traffic (all weights in a rule must sum to `1.0`) |
| `provider` | No | Target provider. Omit to keep the incoming request's provider |
| `model` | No | Target model. Omit to keep the incoming request's model |
| `key_id` | No | Pin a specific API key by name |
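A target does not need to set every field. The following sketch pins matching requests to one provider key while leaving `model` unset, so the incoming request's model is kept. The rule ID and the key name `openai-secondary` are assumptions; the key would have to exist under `providers.openai.keys`.

```json
{
  "governance": {
    "routing_rules": [
      {
        "id": "route-mini-pinned-key",
        "name": "Pin gpt-4o-mini to a dedicated key",
        "cel_expression": "request.model == 'gpt-4o-mini'",
        "targets": [
          { "provider": "openai", "key_id": "openai-secondary", "weight": 1.0 }
        ]
      }
    ]
  }
}
```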
---
## Customers & Teams
Define organizational entities and attach budgets or rate limits to them:
```json
{
"governance": {
"customers": [
{
"id": "customer-acme",
"name": "Acme Corp",
"budget_id": "budget-acme-monthly",
"rate_limit_id": "rl-acme-hourly"
}
],
"teams": [
{
"id": "team-ml",
"name": "ML Team",
"customer_id": "customer-acme",
"budget_id": "budget-team-ml"
}
]
}
}
```
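The `budget_id` and `rate_limit_id` values above are references, so matching entries presumably need to exist under `governance.budgets` and `governance.rate_limits`. A hedged sketch of those entries follows; the budget fields mirror the entry in the Full Governance Example, and all amounts and limits are made-up values.

```json
{
  "governance": {
    "budgets": [
      { "id": "budget-acme-monthly", "max_limit": 5000.00, "reset_duration": "1M" },
      { "id": "budget-team-ml", "max_limit": 1000.00, "reset_duration": "1M" }
    ],
    "rate_limits": [
      {
        "id": "rl-acme-hourly",
        "request_max_limit": 10000,
        "request_reset_duration": "1h"
      }
    ]
  }
}
```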
---
## Full Governance Example
```json
{
"$schema": "https://www.getbifrost.ai/schema",
"encryption_key": "env.BIFROST_ENCRYPTION_KEY",
"client": {
"enforce_auth_on_inference": true
},
"governance": {
"auth_config": {
"is_enabled": true,
"admin_username": "env.BIFROST_ADMIN_USERNAME",
"admin_password": "env.BIFROST_ADMIN_PASSWORD"
},
"budgets": [
{
"id": "budget-platform",
"max_limit": 1000.00,
"reset_duration": "1M",
"virtual_key_id": "vk-platform"
}
],
"rate_limits": [
{
"id": "rl-platform",
"request_max_limit": 5000,
"request_reset_duration": "1h",
"token_max_limit": 5000000,
"token_reset_duration": "1h"
}
],
"virtual_keys": [
{
"id": "vk-platform",
"name": "platform-key",
"value": "env.VK_PLATFORM",
"is_active": true,
"rate_limit_id": "rl-platform",
"provider_configs": [
{
"provider": "openai",
"allowed_models": ["*"],
"key_ids": ["*"],
"weight": 1
}
]
}
],
"routing_rules": [
{
"id": "fallback-to-anthropic",
"name": "Fallback on error",
"cel_expression": "true",
"targets": [{ "provider": "openai", "weight": 1.0 }],
"fallbacks": ["anthropic"]
}
]
},
"providers": {
"openai": {
"keys": [{ "name": "openai-primary", "value": "env.OPENAI_API_KEY", "models": ["*"], "weight": 1.0 }]
},
"anthropic": {
"keys": [{ "name": "anthropic-primary", "value": "env.ANTHROPIC_API_KEY", "models": ["*"], "weight": 1.0 }]
}
},
"config_store": {
"enabled": true,
"type": "postgres",
"config": {
"host": "env.PG_HOST",
"port": "5432",
"user": "env.PG_USER",
"password": "env.PG_PASSWORD",
"db_name": "bifrost"
}
}
}
```

@@ -0,0 +1,291 @@
---
title: "Guardrails"
description: "Configure content moderation and policy enforcement in config.json using guardrails_config"
icon: "shield-halved"
---
<Note>
Guardrails are an **enterprise-only** feature and require the enterprise Bifrost image.
</Note>
Guardrails are configured under `guardrails_config` in `config.json`. The configuration has two parts:
- **`guardrail_providers`** — the backend that performs the check. Rules link to providers by `id`.
- **`guardrail_rules`** — CEL expressions that control when and where providers are invoked.
---
## Providers
<Tabs>
<Tab title="Regex">
Runs entirely in-process with no external dependency. Patterns use RE2 syntax. Supports optional per-pattern flags: `i` (case-insensitive), `m` (multiline), `s` (dot-all).
```json
{
"guardrails_config": {
"guardrail_providers": [
{
"id": 1,
"provider_name": "regex",
"policy_name": "block-secrets",
"enabled": true,
"timeout": 5,
"config": {
"patterns": [
{ "pattern": "sk-[A-Za-z0-9]{20,}", "description": "OpenAI API key" },
{ "pattern": "AKIA[0-9A-Z]{16}", "description": "AWS access key" },
{ "pattern": "gh[ps]_[A-Za-z0-9]{36}", "description": "GitHub token", "flags": "i" }
],
"mode": "block"
}
}
]
}
}
```
</Tab>
<Tab title="AWS Bedrock">
```json
{
"guardrails_config": {
"guardrail_providers": [
{
"id": 2,
"provider_name": "bedrock",
"policy_name": "content-filter",
"enabled": true,
"timeout": 15,
"config": {
"guardrail_arn": "arn:aws:bedrock:us-east-1::guardrail/abc123",
"guardrail_version": "DRAFT",
"region": "us-east-1",
"access_key": "env.AWS_ACCESS_KEY_ID",
"secret_key": "env.AWS_SECRET_ACCESS_KEY"
}
}
]
}
}
```
</Tab>
<Tab title="Azure Content Safety">
```json
{
"guardrails_config": {
"guardrail_providers": [
{
"id": 3,
"provider_name": "azure",
"policy_name": "azure-content-safety",
"enabled": true,
"timeout": 10,
"config": {
"endpoint": "https://your-resource.cognitiveservices.azure.com",
"api_key": "env.AZURE_CONTENT_SAFETY_KEY",
"analyze_enabled": true,
"analyze_severity_threshold": "medium",
"jailbreak_shield_enabled": true,
"indirect_attack_shield_enabled": true,
"copyright_enabled": false,
"text_blocklist_enabled": false,
"blocklist_names": []
}
}
]
}
}
```
`analyze_severity_threshold` accepts `"low"`, `"medium"`, or `"high"`.
</Tab>
<Tab title="Gray Swan">
```json
{
"guardrails_config": {
"guardrail_providers": [
{
"id": 4,
"provider_name": "grayswan",
"policy_name": "grayswan-jailbreak",
"enabled": true,
"timeout": 15,
"config": {
"api_key": "env.GRAYSWAN_API_KEY",
"violation_threshold": 0.7,
"reasoning_mode": "standard",
"policy_id": "",
"policy_ids": [],
"rules": {}
}
}
]
}
}
```
</Tab>
</Tabs>
### Provider Fields
| Field | Required | Description |
|-------|----------|-------------|
| `id` | Yes | Unique integer ID — referenced by rules via `provider_config_ids` |
| `provider_name` | Yes | Backend: `"regex"`, `"bedrock"`, `"azure"`, `"grayswan"` |
| `policy_name` | Yes | Human-readable policy label |
| `enabled` | Yes | `true` to activate |
| `timeout` | No | Execution timeout in seconds |
| `config` | No | Provider-specific configuration object |
---
## Rules
Each rule carries a CEL expression and fires when that expression evaluates to `true`. The following variables are available inside the expression:
| Variable | Type | Description |
|----------|------|-------------|
| `model` | `string` | Model name from the request |
| `provider` | `string` | Provider name (e.g. `"openai"`) |
| `headers` | `map<string,string>` | HTTP request headers |
| `params` | `map<string,string>` | Query parameters |
| `customer` | `string` | Customer ID |
| `team` | `string` | Team ID |
| `user` | `string` | User ID |
```json
{
"guardrails_config": {
"guardrail_rules": [
{
"id": 101,
"name": "block-secrets-input",
"description": "Block prompts containing credentials",
"enabled": true,
"cel_expression": "true",
"apply_to": "input",
"sampling_rate": 100,
"timeout": 10,
"provider_config_ids": [1]
},
{
"id": 102,
"name": "content-safety-gpt4o-output",
"enabled": true,
"cel_expression": "model == 'gpt-4o'",
"apply_to": "output",
"sampling_rate": 100,
"timeout": 15,
"provider_config_ids": [3]
},
{
"id": 103,
"name": "grayswan-openai-partial",
"enabled": true,
"cel_expression": "provider == 'openai'",
"apply_to": "input",
"sampling_rate": 50,
"timeout": 20,
"provider_config_ids": [4]
}
]
}
}
```
### Rule Fields
| Field | Required | Description |
|-------|----------|-------------|
| `id` | Yes | Unique integer ID |
| `name` | Yes | Human-readable name |
| `description` | No | Optional description |
| `enabled` | Yes | `true` to activate |
| `cel_expression` | Yes | CEL boolean expression. `"true"` matches every request |
| `apply_to` | Yes | `"input"`, `"output"`, or `"both"` |
| `sampling_rate` | No | `0` to `100`; percentage of requests to evaluate (default: `100`) |
| `timeout` | No | Rule timeout in seconds |
| `provider_config_ids` | No | `id` values of providers to invoke when this rule matches. Multiple providers run in parallel |
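The CEL variables listed earlier can be combined in a single expression. The rule below is a sketch under stated assumptions: the team ID `team-ml` and the `x-env` header are hypothetical, the source does not say how header names are cased when exposed to CEL, and provider IDs `1` and `3` refer to the regex and Azure providers from the Providers section. The `in` check guards against the header being absent before it is indexed.

```json
{
  "guardrails_config": {
    "guardrail_rules": [
      {
        "id": 104,
        "name": "prod-ml-team-both",
        "description": "Check ML team traffic flagged as production",
        "enabled": true,
        "cel_expression": "team == 'team-ml' && 'x-env' in headers && headers['x-env'] == 'prod'",
        "apply_to": "both",
        "sampling_rate": 100,
        "timeout": 15,
        "provider_config_ids": [1, 3]
      }
    ]
  }
}
```

With two entries in `provider_config_ids`, both providers run in parallel when the rule matches.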
---
## Full Example
```json
{
"$schema": "https://www.getbifrost.ai/schema",
"encryption_key": "env.BIFROST_ENCRYPTION_KEY",
"providers": {
"openai": {
"keys": [{ "name": "primary", "value": "env.OPENAI_API_KEY", "models": ["*"], "weight": 1.0 }]
}
},
"guardrails_config": {
"guardrail_providers": [
{
"id": 1,
"provider_name": "regex",
"policy_name": "block-secrets",
"enabled": true,
"timeout": 5,
"config": {
"patterns": [
{ "pattern": "sk-[A-Za-z0-9]{20,}", "description": "OpenAI API key" },
{ "pattern": "AKIA[0-9A-Z]{16}", "description": "AWS access key" }
],
"mode": "block"
}
},
{
"id": 2,
"provider_name": "azure",
"policy_name": "content-safety",
"enabled": true,
"timeout": 10,
"config": {
"endpoint": "https://your-resource.cognitiveservices.azure.com",
"api_key": "env.AZURE_CONTENT_SAFETY_KEY",
"analyze_enabled": true,
"analyze_severity_threshold": "medium",
"jailbreak_shield_enabled": true,
"indirect_attack_shield_enabled": false
}
}
],
"guardrail_rules": [
{
"id": 101,
"name": "block-secrets-input",
"description": "Block prompts leaking credentials",
"enabled": true,
"cel_expression": "true",
"apply_to": "input",
"sampling_rate": 100,
"timeout": 10,
"provider_config_ids": [1]
},
{
"id": 102,
"name": "content-safety-both",
"description": "Azure content safety on all traffic",
"enabled": true,
"cel_expression": "true",
"apply_to": "both",
"sampling_rate": 100,
"timeout": 15,
"provider_config_ids": [2]
}
]
}
}
```

@@ -0,0 +1,318 @@
---
title: "Plugins"
description: "Configure Bifrost plugins in config.json — semantic cache, OpenTelemetry, Maxim, Datadog, and custom plugins"
icon: "puzzle-piece"
---
<Note>
**The `plugins` array only controls explicitly opt-in plugins**: `semantic_cache`, `otel`, `maxim`, `datadog` (enterprise), and custom plugins.
**Telemetry, logging, and governance are auto-loaded built-ins** — they are always active and configured via the `client` block and dedicated top-level keys, not the `plugins` array.
</Note>
---
## Auto-Loaded Built-ins
These plugins start automatically. You do **not** add them to the `plugins` array.
| Plugin | Always active? | How to configure |
|--------|---------------|-----------------|
| **Telemetry** (Prometheus `/metrics`) | Yes, always | `client.prometheus_labels` for custom labels; the push gateway is configured via a `plugins` entry once DB-backed mode is running |
| **Logging** | When `client.enable_logging: true` and `logs_store` is configured | `client.enable_logging`, `client.disable_content_logging`, `client.logging_headers` |
| **Governance** | Yes, always (OSS) | `client.enforce_auth_on_inference` for VK enforcement; `governance.*` for virtual keys / budgets / routing rules |
See [Client Configuration](/deployment-guides/config-json/client) and [Governance](/deployment-guides/config-json/governance) for full details.
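As a rough illustration, the built-ins are driven from the `client` block rather than from `plugins`. The fragment below assumes `disable_content_logging` is a boolean like the other two flags; logging also requires a configured `logs_store`, which is omitted here.

```json
{
  "client": {
    "enable_logging": true,
    "disable_content_logging": true,
    "enforce_auth_on_inference": true
  }
}
```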
---
## Plugin Array Structure
Every entry in the `plugins` array supports these common fields:
| Field | Type | Required | Description |
|-------|------|----------|-------------|
| `name` | string | Yes | Plugin name |
| `enabled` | boolean | Yes | Enable or disable this plugin |
| `config` | object | Varies | Plugin-specific configuration |
| `path` | string | No | Path to a custom plugin binary or WASM file |
| `version` | integer | No | 🛑 **DB-Backed Only.** Plugin metadata persisted on `TablePlugin`. In DB-backed sync, higher values trigger replacement/reload. Valid range: `1` to `32767`. |
| `placement` | string | No | 🛑 **DB-Backed Only.** Execution metadata (`"pre_builtin"`, `"builtin"`, `"post_builtin"`) persisted on `TablePlugin` and used for ordering behavior. |
| `order` | integer | No | 🛑 **DB-Backed Only.** Execution metadata persisted on `TablePlugin`; within a placement group, lower values run earlier. |
<Note>
`name`, `enabled`, `path`, and `config` are the core plugin config fields. In DB-backed mode, `version`, `placement`, and `order` are persisted on `TablePlugin` and used during sync/runtime ordering.
</Note>
---
<Tabs>
<Tab title="Semantic Cache">
### Semantic Cache
Caches LLM responses by semantic similarity. Returns a cached response when an incoming request is semantically close enough to a previous one.
Requires a [vector store](/deployment-guides/config-json/storage#vector_store) to be configured.
| Field | Required | Default | Description |
|-------|----------|---------|-------------|
| `config.dimension` | Yes | — | Embedding dimension. Use `1` for hash-based (exact) caching without an embedding provider |
| `config.provider` | No | — | Provider for generating embeddings (required for semantic mode) |
| `config.embedding_model` | No | — | Model for embeddings (required when `provider` is set) |
| `config.threshold` | No | `0.8` | Cosine similarity threshold for a cache hit (0.0 to 1.0) |
| `config.ttl` | No | `300` | Cache entry TTL in seconds (or a duration string like `"1h"`) |
| `config.cache_by_model` | No | `true` | Include model in cache key |
| `config.cache_by_provider` | No | `true` | Include provider in cache key |
| `config.exclude_system_prompt` | No | `false` | Exclude system prompt from cache key |
| `config.conversation_history_threshold` | No | `3` | Skip caching for requests with more messages than this |
| `config.default_cache_key` | No | — | Default cache key when no `x-bf-cache-key` header is sent |
**Semantic mode** (embedding-based similarity search):
```json
{
"plugins": [
{
"name": "semantic_cache",
"enabled": true,
"config": {
"provider": "openai",
"embedding_model": "text-embedding-3-small",
"dimension": 1536,
"threshold": 0.85,
"ttl": 300,
"cache_by_model": true,
"cache_by_provider": true
}
}
]
}
```
**Hash mode** (exact-match caching, no embedding provider needed):
```json
{
"plugins": [
{
"name": "semantic_cache",
"enabled": true,
"config": {
"dimension": 1,
"ttl": 1800
}
}
]
}
```
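**Tuning the cache key** (a sketch combining the optional fields from the table above; the threshold, history limit, and `shared-default` key are illustrative values, not recommendations):

```json
{
  "plugins": [
    {
      "name": "semantic_cache",
      "enabled": true,
      "config": {
        "provider": "openai",
        "embedding_model": "text-embedding-3-small",
        "dimension": 1536,
        "threshold": 0.9,
        "ttl": "1h",
        "exclude_system_prompt": true,
        "conversation_history_threshold": 5,
        "default_cache_key": "shared-default"
      }
    }
  ]
}
```

Here `ttl` uses the duration-string form instead of seconds, and requests with more than five messages skip the cache.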
<Note>
You must also configure a `vector_store` in `config.json`. See [Storage — vector_store](/deployment-guides/config-json/storage#vector_store).
</Note>
</Tab>
<Tab title="OpenTelemetry">
### OpenTelemetry (OTel)
Exports distributed traces to any OTel-compatible collector (Jaeger, Zipkin, Tempo, Datadog via OTLP, etc.).
| Field | Required | Default | Description |
|-------|----------|---------|-------------|
| `config.collector_url` | Yes | — | OTLP collector endpoint |
| `config.trace_type` | Yes | — | Trace format: `"genai_extension"`, `"vercel"`, or `"open_inference"` |
| `config.protocol` | Yes | — | `"http"` or `"grpc"` |
| `config.service_name` | No | `"bifrost"` | Service name reported to the collector |
| `config.metrics_enabled` | No | `false` | Enable push-based OTLP metrics export |
| `config.metrics_endpoint` | No | — | OTLP metrics endpoint URL |
| `config.metrics_push_interval` | No | `15` | Metrics push interval in seconds |
| `config.headers` | No | — | Custom headers for the collector (supports `env.` prefix) |
| `config.insecure` | No | `false` | Skip TLS verification |
| `config.tls_ca_cert` | No | — | Path to TLS CA certificate |
```json
{
"plugins": [
{
"name": "otel",
"enabled": true,
"config": {
"collector_url": "http://otel-collector:4318",
"trace_type": "genai_extension",
"protocol": "http",
"service_name": "bifrost-gateway"
}
}
]
}
```
**With authentication headers:**
```json
{
"plugins": [
{
"name": "otel",
"enabled": true,
"config": {
"collector_url": "https://otel.example.com:4318",
"trace_type": "open_inference",
"protocol": "http",
"service_name": "bifrost",
"headers": {
"Authorization": "env.OTEL_AUTH_HEADER"
}
}
}
]
}
```
**With OTLP metrics export:**
```json
{
"plugins": [
{
"name": "otel",
"enabled": true,
"config": {
"collector_url": "http://otel-collector:4318",
"trace_type": "genai_extension",
"protocol": "http",
"metrics_enabled": true,
"metrics_endpoint": "http://otel-collector:4318/v1/metrics",
"metrics_push_interval": 30
}
}
]
}
```
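**With a private CA** (a hedged sketch for a collector behind an internally signed certificate; the hostname and certificate path are hypothetical):

```json
{
  "plugins": [
    {
      "name": "otel",
      "enabled": true,
      "config": {
        "collector_url": "https://otel.example.com:4318",
        "trace_type": "genai_extension",
        "protocol": "http",
        "insecure": false,
        "tls_ca_cert": "/etc/ssl/certs/otel-ca.pem"
      }
    }
  ]
}
```

Keeping `insecure` at `false` and supplying `tls_ca_cert` preserves certificate verification instead of skipping it.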
</Tab>
<Tab title="Maxim">
### Maxim Observability
Sends request traces to the [Maxim](https://www.getmaxim.ai) observability platform.
| Field | Required | Description |
|-------|----------|-------------|
| `config.api_key` | Yes | Maxim API key (use `env.` prefix) |
| `config.log_repo_id` | No | Default Maxim logger repository ID |
```json
{
"plugins": [
{
"name": "maxim",
"enabled": true,
"config": {
"api_key": "env.MAXIM_API_KEY",
"log_repo_id": "your-log-repo-id"
}
}
]
}
```
</Tab>
<Tab title="Datadog">
### Datadog
<Note>
Datadog is an **enterprise-only** plugin and is silently ignored in OSS builds.
</Note>
Sends APM traces and metrics to a Datadog Agent.
| Field | Default | Description |
|-------|---------|-------------|
| `config.agent_addr` | `"localhost:8126"` | Datadog Agent address for APM traces |
| `config.service_name` | `"bifrost"` | Service name in Datadog |
| `config.env` | — | Environment tag (e.g. `"production"`, `"staging"`) |
| `config.version` | — | Service version tag |
| `config.enable_traces` | `true` | Enable APM trace collection |
| `config.custom_tags` | `{}` | Additional key/value tags for all traces and metrics |
```json
{
"plugins": [
{
"name": "datadog",
"enabled": true,
"config": {
"agent_addr": "datadog-agent:8126",
"service_name": "bifrost",
"env": "production",
"enable_traces": true,
"custom_tags": {
"team": "platform",
"region": "us-east-1"
}
}
}
]
}
```
</Tab>
</Tabs>
---
## Custom / Dynamic Plugins
Load a custom Go plugin binary or WASM plugin at startup using the `path` field. Custom plugins must implement one of the Bifrost plugin interfaces.
```json
{
"plugins": [
{
"name": "my-custom-auth",
"enabled": true,
"path": "/app/plugins/my-custom-auth.so",
"config": {
"auth_endpoint": "env.AUTH_SERVICE_URL"
}
}
]
}
```
**WASM plugin:**
```json
{
"plugins": [
{
"name": "my-wasm-plugin",
"enabled": true,
"path": "/app/plugins/my-plugin.wasm",
"config": {}
}
]
}
```
See [Writing Go Plugins](/plugins/writing-go-plugin) and [Writing WASM Plugins](/plugins/writing-wasm-plugin) for implementation guides.
**Placement and ordering (DB-backed only):**
In DB-backed mode, plugin metadata such as `version` (`1` to `32767`), `placement`, and `order` can be managed via config sync and DB/UI workflows:
| `placement` | When it runs |
|-------------|-------------|
| `pre_builtin` | Before all built-in plugins |
| `builtin` | Alongside built-in plugins (by `order`) |
| `post_builtin` | After all built-in plugins (default) |
Within a placement group, lower `order` values run earlier.
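A sketch of how these fields might look on plugin entries in DB-backed mode, reusing the two custom plugins from the examples above. The `version` and `order` values are arbitrary illustrations.

```json
{
  "plugins": [
    {
      "name": "my-custom-auth",
      "enabled": true,
      "path": "/app/plugins/my-custom-auth.so",
      "version": 2,
      "placement": "pre_builtin",
      "order": 1,
      "config": {
        "auth_endpoint": "env.AUTH_SERVICE_URL"
      }
    },
    {
      "name": "my-wasm-plugin",
      "enabled": true,
      "path": "/app/plugins/my-plugin.wasm",
      "version": 1,
      "placement": "post_builtin",
      "order": 5,
      "config": {}
    }
  ]
}
```

Under these settings the auth plugin would run before the built-ins and the WASM plugin after them. Raising `version` on an entry is what triggers replacement or reload during DB-backed sync.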

@@ -0,0 +1,755 @@
---
title: "Provider Setup"
description: "Configure LLM providers in config.json — API keys, cloud-native auth, per-provider network settings, and self-hosted endpoints"
icon: "plug"
---
All providers are configured under `providers` in `config.json`. Each provider entry contains a `keys` array where every key has a `name`, `value`, `models`, and `weight`, plus optional provider-specific config objects.
**Supplying credentials:**
Use the `env.` prefix to reference environment variables — never put API keys directly in `config.json`:
```json
{
"providers": {
"openai": {
"keys": [
{
"name": "primary",
"value": "env.OPENAI_API_KEY",
"models": ["*"],
"weight": 1.0
}
]
}
}
}
```
---
## Common Provider Fields
Every key object supports these fields:
| Field | Type | Description |
|-------|------|-------------|
| `name` | string | Unique name for this key (used in logs and for pinning a key from a virtual key) |
| `value` | string | API key value or `env.VAR_NAME` reference |
| `models` | array | Models this key serves. `["*"]` = all models |
| `weight` | float | Load balancing weight. Higher = more traffic |
| `aliases` | object | Map logical name → actual model name for this key |
| `use_for_batch_api` | boolean | Mark key as eligible for batch API calls |
Per-provider `network_config` options (applies to all standard providers):
| Field | Type | Description |
|-------|------|-------------|
| `default_request_timeout_in_seconds` | integer | Per-request timeout |
| `max_retries` | integer | Retry attempts on transient errors |
| `retry_backoff_initial` | integer | Initial backoff in milliseconds |
| `retry_backoff_max` | integer | Maximum backoff in milliseconds |
| `max_conns_per_host` | integer | Max TCP connections to the provider endpoint (default: 5000) |
| `extra_headers` | object | Static headers added to every provider request |
| `stream_idle_timeout_in_seconds` | integer | Idle timeout per stream chunk (default: 60) |
| `insecure_skip_verify` | boolean | Disable TLS verification (last resort only) |
| `ca_cert_pem` | string | PEM-encoded CA for self-signed or private CA endpoints |
Concurrency and buffering per provider:
| Field | Type | Description |
|-------|------|-------------|
| `concurrency_and_buffer_size.concurrency` | integer | Max concurrent requests to this provider |
| `concurrency_and_buffer_size.buffer_size` | integer | Request queue depth |
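The fragment below sketches how these blocks could sit together on one provider. It assumes `network_config` and `concurrency_and_buffer_size` are siblings of `keys`, as the per-provider descriptions suggest; all numeric values and the `X-Request-Source` header are illustrative, not recommended defaults.

```json
{
  "providers": {
    "openai": {
      "keys": [
        { "name": "primary", "value": "env.OPENAI_API_KEY", "models": ["*"], "weight": 1.0 }
      ],
      "network_config": {
        "default_request_timeout_in_seconds": 60,
        "max_retries": 2,
        "retry_backoff_initial": 250,
        "retry_backoff_max": 4000,
        "stream_idle_timeout_in_seconds": 60,
        "extra_headers": {
          "X-Request-Source": "bifrost"
        }
      },
      "concurrency_and_buffer_size": {
        "concurrency": 100,
        "buffer_size": 500
      }
    }
  }
}
```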
---
<Tabs>
<Tab title="OpenAI">
### OpenAI
Supports multiple keys with weighted load balancing. Mark one key with `use_for_batch_api: true` to designate it for the Batch API.
```json
{
"providers": {
"openai": {
"keys": [
{
"name": "openai-primary",
"value": "env.OPENAI_KEY_1",
"models": ["*"],
"weight": 2.0
},
{
"name": "openai-secondary",
"value": "env.OPENAI_KEY_2",
"models": ["gpt-4o-mini"],
"weight": 1.0
},
{
"name": "openai-batch",
"value": "env.OPENAI_KEY_BATCH",
"models": ["*"],
"weight": 1.0,
"use_for_batch_api": true
}
],
"network_config": {
"default_request_timeout_in_seconds": 120,
"max_retries": 3,
"retry_backoff_initial": 500,
"retry_backoff_max": 5000
}
}
}
}
```
</Tab>
<Tab title="Anthropic">
### Anthropic
```json
{
"providers": {
"anthropic": {
"keys": [
{
"name": "anthropic-primary",
"value": "env.ANTHROPIC_KEY_1",
"models": ["*"],
"weight": 1.0
},
{
"name": "anthropic-secondary",
"value": "env.ANTHROPIC_KEY_2",
"models": ["*"],
"weight": 1.0
}
],
"network_config": {
"default_request_timeout_in_seconds": 180
}
}
}
}
```
**Override Anthropic beta headers** (optional):
```json
{
"providers": {
"anthropic": {
"keys": [
{
"name": "primary",
"value": "env.ANTHROPIC_API_KEY",
"models": ["*"],
"weight": 1.0
}
],
"network_config": {
"beta_header_overrides": {
"redact-thinking-": true
}
}
}
}
}
```
</Tab>
<Tab title="Azure OpenAI">
### Azure OpenAI
Azure requires `azure_key_config` on every key with `endpoint` and `api_version`. List your Azure deployment names in `models` — Bifrost routes requests using the model name as the deployment name. If your deployment names differ from the model names you use in requests, add an `aliases` map on the key.
<Tabs>
<Tab title="API Key">
```json
{
"providers": {
"azure": {
"keys": [
{
"name": "azure-primary",
"value": "env.AZURE_API_KEY",
"models": ["gpt-4o", "gpt-4o-mini"],
"weight": 1.0,
"azure_key_config": {
"endpoint": "env.AZURE_ENDPOINT",
"api_version": "env.AZURE_API_VERSION"
}
}
]
}
}
}
```
Set environment variables:
```bash
export AZURE_API_KEY="your-azure-api-key"
export AZURE_ENDPOINT="https://your-resource.openai.azure.com"
export AZURE_API_VERSION="2024-10-21"
```
</Tab>
<Tab title="Managed Identity / DefaultAzureCredential">
When `value` is empty or omitted, Bifrost uses `DefaultAzureCredential` — which resolves credentials from Workload Identity, VM managed identity, or `az login`.
```json
{
"providers": {
"azure": {
"keys": [
{
"name": "azure-workload-identity",
"value": "",
"models": ["gpt-4o"],
"weight": 1.0,
"azure_key_config": {
"endpoint": "env.AZURE_ENDPOINT",
"api_version": "env.AZURE_API_VERSION"
}
}
]
}
}
}
```
</Tab>
</Tabs>
**Deployment name aliases** — when your Azure deployment names differ from the model names in requests, use `aliases`:
```json
{
"providers": {
"azure": {
"keys": [
{
"name": "azure-primary",
"value": "env.AZURE_API_KEY",
"models": ["gpt-4o"],
"weight": 1.0,
"aliases": {
"gpt-4o": "gpt-4o-prod-deployment"
},
"azure_key_config": {
"endpoint": "env.AZURE_ENDPOINT",
"api_version": "env.AZURE_API_VERSION"
}
}
]
}
}
}
```
**Multi-region failover** (two keys, different regions):
```json
{
"providers": {
"azure": {
"keys": [
{
"name": "eastus",
"value": "env.AZURE_KEY_EAST",
"models": ["gpt-4o"],
"weight": 1.0,
"azure_key_config": {
"endpoint": "env.AZURE_ENDPOINT_EAST",
"api_version": "env.AZURE_API_VERSION"
}
},
{
"name": "westus",
"value": "env.AZURE_KEY_WEST",
"models": ["gpt-4o"],
"weight": 1.0,
"azure_key_config": {
"endpoint": "env.AZURE_ENDPOINT_WEST",
"api_version": "env.AZURE_API_VERSION"
}
}
]
}
}
}
```
</Tab>
<Tab title="AWS Bedrock">
### AWS Bedrock
Bedrock requires `bedrock_key_config` with at minimum a `region`. Three auth modes:
<Tabs>
<Tab title="Static Credentials">
```json
{
"providers": {
"bedrock": {
"keys": [
{
"name": "bedrock-static",
"value": "",
"models": ["*"],
"weight": 1.0,
"bedrock_key_config": {
"region": "us-east-1",
"access_key": "env.AWS_ACCESS_KEY_ID",
"secret_key": "env.AWS_SECRET_ACCESS_KEY"
}
}
]
}
}
}
```
</Tab>
<Tab title="IAM Role (instance profile / IRSA)">
When only `region` is set, Bifrost inherits credentials from the AWS SDK default chain — IRSA (IAM Roles for Service Accounts), EC2 instance profile, or `AWS_*` env vars.
```json
{
"providers": {
"bedrock": {
"keys": [
{
"name": "bedrock-iam",
"value": "",
"models": ["*"],
"weight": 1.0,
"bedrock_key_config": {
"region": "us-east-1"
}
}
]
}
}
}
```
</Tab>
<Tab title="STS AssumeRole">
```json
{
"providers": {
"bedrock": {
"keys": [
{
"name": "bedrock-assumerole",
"value": "",
"models": ["*"],
"weight": 1.0,
"bedrock_key_config": {
"region": "us-west-2",
"role_arn": "env.AWS_ROLE_ARN",
"external_id": "env.AWS_EXTERNAL_ID",
"session_name": "bifrost-session"
}
}
]
}
}
}
```
</Tab>
</Tabs>
**Model aliases** (map logical names to Bedrock inference profile IDs):
```json
{
"bedrock_key_config": {
"region": "us-east-1"
},
"aliases": {
"claude-sonnet": "us.anthropic.claude-3-5-sonnet-20241022-v2:0",
"claude-haiku": "us.anthropic.claude-3-5-haiku-20241022-v1:0"
}
}
```
**Batch API — S3 configuration:**
```json
{
"bedrock_key_config": {
"region": "us-east-1",
"access_key": "env.AWS_ACCESS_KEY_ID",
"secret_key": "env.AWS_SECRET_ACCESS_KEY",
"batch_s3_config": {
"buckets": [
{
"bucket_name": "my-bedrock-batch-bucket",
"prefix": "batch/",
"is_default": true
}
]
}
}
}
```
</Tab>
<Tab title="Google Vertex AI">
### Google Vertex AI
Vertex requires `vertex_key_config` with `project_id` and `region`. Two auth modes:
<Tabs>
<Tab title="Service Account Key">
```json
{
"providers": {
"vertex": {
"keys": [
{
"name": "vertex-sa",
"value": "",
"models": ["*"],
"weight": 1.0,
"vertex_key_config": {
"project_id": "env.VERTEX_PROJECT_ID",
"region": "us-central1",
"auth_credentials": "env.VERTEX_AUTH_CREDENTIALS"
}
}
]
}
}
}
```
`VERTEX_AUTH_CREDENTIALS` should contain the base64-encoded service account JSON.
</Tab>
<Tab title="GKE Workload Identity / ADC">
When `auth_credentials` is omitted, Bifrost calls `google.FindDefaultCredentials` — which resolves to GKE Workload Identity, GCE metadata server, or `gcloud auth application-default login`.
```json
{
"providers": {
"vertex": {
"keys": [
{
"name": "vertex-workload-identity",
"value": "",
"models": ["*"],
"weight": 1.0,
"vertex_key_config": {
"project_id": "my-gcp-project",
"region": "us-central1"
}
}
]
}
}
}
```
</Tab>
</Tabs>
</Tab>
<Tab title="Groq / Gemini / Mistral / Others">
### Standard API-Key Providers
These providers follow the same simple pattern — one or more keys with weights. Replace the provider name and env var name accordingly.
```json
{
"providers": {
"groq": {
"keys": [
{
"name": "groq-primary",
"value": "env.GROQ_API_KEY",
"models": ["*"],
"weight": 1.0
}
]
},
"gemini": {
"keys": [
{
"name": "gemini-primary",
"value": "env.GEMINI_API_KEY",
"models": ["*"],
"weight": 1.0
}
]
},
"mistral": {
"keys": [
{
"name": "mistral-primary",
"value": "env.MISTRAL_API_KEY",
"models": ["*"],
"weight": 1.0
}
]
},
"cohere": {
"keys": [{ "name": "cohere-main", "value": "env.COHERE_API_KEY", "models": ["*"], "weight": 1.0 }]
},
"perplexity": {
"keys": [{ "name": "perplexity-main", "value": "env.PERPLEXITY_API_KEY", "models": ["*"], "weight": 1.0 }]
},
"xai": {
"keys": [{ "name": "xai-main", "value": "env.XAI_API_KEY", "models": ["*"], "weight": 1.0 }]
},
"cerebras": {
"keys": [{ "name": "cerebras-main", "value": "env.CEREBRAS_API_KEY", "models": ["*"], "weight": 1.0 }]
},
"openrouter": {
"keys": [{ "name": "openrouter-main", "value": "env.OPENROUTER_API_KEY", "models": ["*"], "weight": 1.0 }]
},
"nebius": {
"keys": [{ "name": "nebius-main", "value": "env.NEBIUS_API_KEY", "models": ["*"], "weight": 1.0 }]
}
}
}
```
</Tab>
<Tab title="Self-Hosted">
### Self-Hosted Providers
Self-hosted providers point to a URL you operate. Typically no API key is required (`"value": ""`).
<Tabs>
<Tab title="Ollama">
```json
{
"providers": {
"ollama": {
"keys": [
{
"name": "ollama-local",
"value": "",
"models": ["*"],
"weight": 1.0,
"ollama_key_config": {
"url": "http://localhost:11434"
}
}
]
}
}
}
```
Using an env var for the URL (useful across environments):
```json
{
"ollama_key_config": {
"url": "env.OLLAMA_URL"
}
}
```
</Tab>
<Tab title="vLLM">
vLLM instances are model-specific — one key per served model:
```json
{
"providers": {
"vllm": {
"keys": [
{
"name": "vllm-llama3-70b",
"value": "",
"models": ["llama-3-70b"],
"weight": 1.0,
"vllm_key_config": {
"url": "http://vllm-server:8000",
"model_name": "meta-llama/Meta-Llama-3-70B-Instruct"
}
},
{
"name": "vllm-mistral",
"value": "",
"models": ["mistral-7b"],
"weight": 1.0,
"vllm_key_config": {
"url": "http://vllm-mistral:8000",
"model_name": "mistralai/Mistral-7B-Instruct-v0.3"
}
}
]
}
}
}
```
</Tab>
<Tab title="SGLang">
```json
{
"providers": {
"sgl": {
"keys": [
{
"name": "sgl-main",
"value": "",
"models": ["*"],
"weight": 1.0,
"sgl_key_config": {
"url": "http://sgl-router:30000"
}
}
]
}
}
}
```
</Tab>
<Tab title="HuggingFace / Replicate">
These providers use `aliases` to map logical model names to provider-specific IDs:
```json
{
"providers": {
"huggingface": {
"keys": [
{
"name": "hf-main",
"value": "env.HF_API_KEY",
"models": ["llama-3", "mixtral"],
"weight": 1.0,
"aliases": {
"llama-3": "meta-llama/Meta-Llama-3-8B-Instruct",
"mixtral": "mistralai/Mixtral-8x7B-Instruct-v0.1"
}
}
]
},
"replicate": {
"keys": [
{
"name": "replicate-main",
"value": "env.REPLICATE_API_KEY",
"models": ["llama-3"],
"weight": 1.0,
"aliases": {
"llama-3": "meta/meta-llama-3-70b-instruct"
},
"replicate_key_config": {
"use_deployments_endpoint": false
}
}
]
}
}
}
```
</Tab>
</Tabs>
</Tab>
</Tabs>
---
## Proxy Configuration
Route provider traffic through an HTTP or SOCKS5 proxy:
```json
{
"providers": {
"openai": {
"keys": [
{ "name": "primary", "value": "env.OPENAI_API_KEY", "models": ["*"], "weight": 1.0 }
],
"proxy_config": {
"type": "http",
"url": "http://proxy.corp.example.com:3128",
"username": "env.PROXY_USER",
"password": "env.PROXY_PASS"
}
}
}
}
```
| Field | Type | Options |
|-------|------|---------|
| `proxy_config.type` | string | `"none"`, `"http"`, `"socks5"`, `"environment"` |
| `proxy_config.url` | string | Proxy server URL |
| `proxy_config.username` | string | Proxy auth username |
| `proxy_config.password` | string | Proxy auth password (`env.` supported) |
| `proxy_config.ca_cert_pem` | string | PEM CA for TLS-intercepting proxies |
Use `"type": "environment"` to pick up `HTTP_PROXY` / `HTTPS_PROXY` env vars automatically.
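The other proxy types follow the same shape. In this sketch the first provider inherits the proxy from the process environment and the second uses a SOCKS5 proxy; the `socks5://` URL scheme, host, and port are assumptions for illustration.

```json
{
  "providers": {
    "openai": {
      "keys": [
        { "name": "primary", "value": "env.OPENAI_API_KEY", "models": ["*"], "weight": 1.0 }
      ],
      "proxy_config": {
        "type": "environment"
      }
    },
    "anthropic": {
      "keys": [
        { "name": "primary", "value": "env.ANTHROPIC_API_KEY", "models": ["*"], "weight": 1.0 }
      ],
      "proxy_config": {
        "type": "socks5",
        "url": "socks5://proxy.corp.example.com:1080"
      }
    }
  }
}
```

Because `proxy_config` is set per provider, different providers can use different egress paths.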
---
## Multi-Provider Example
```json
{
"$schema": "https://www.getbifrost.ai/schema",
"providers": {
"openai": {
"keys": [
{ "name": "openai-primary", "value": "env.OPENAI_API_KEY", "models": ["*"], "weight": 2.0 }
]
},
"anthropic": {
"keys": [
{ "name": "anthropic-primary", "value": "env.ANTHROPIC_API_KEY", "models": ["*"], "weight": 1.0 }
]
},
"groq": {
"keys": [
{ "name": "groq-primary", "value": "env.GROQ_API_KEY", "models": ["*"], "weight": 1.0 }
]
}
}
}
```
With the weights above (2:1:1), traffic is split proportionally: 50% OpenAI, 25% Anthropic, and 25% Groq. If a provider returns an error, Bifrost automatically retries on the next key or provider.

@@ -0,0 +1,252 @@
---
title: "Schema Reference"
description: "All top-level keys available in config.json, their types, and where each is documented"
icon: "brackets-curly"
---
<Note>
The live schema is published at [`https://www.getbifrost.ai/schema`](https://www.getbifrost.ai/schema). Add `"$schema": "https://www.getbifrost.ai/schema"` to your `config.json` for IDE autocomplete and inline validation.
</Note>
This page is a concise reference for every top-level key in `config.json`. Click the **Guide** links for full field-by-field documentation.
---
## Top-Level Keys
| Key | Type | Description | Guide |
|-----|------|-------------|-------|
| `$schema` | string | Schema URL for IDE validation. Set to `"https://www.getbifrost.ai/schema"` | — |
| `encryption_key` | string | Optional AES-256 key (derived via Argon2id). Accepts `env.VAR` prefix and is also read from `BIFROST_ENCRYPTION_KEY`. If omitted, data is stored in plaintext. | [Client](/deployment-guides/config-json/client#encryption-key) |
| `client` | object | Worker pool, logging, CORS, auth enforcement, header filtering, MCP, compat shims | [Client](/deployment-guides/config-json/client) |
| `providers` | object | LLM provider API keys, network settings, concurrency | [Providers](/deployment-guides/config-json/providers) |
| `governance` | object | Admin auth, virtual keys, budgets, rate limits, routing rules, customers, teams | [Governance](/deployment-guides/config-json/governance) |
| `guardrails_config` | object | Content moderation providers and CEL-based rules *(enterprise only)* | [Guardrails](/deployment-guides/config-json/guardrails) |
| `access_profiles` | array | Access profile templates for enterprise RBAC/governance controls *(enterprise only)* | [Enterprise Governance](/enterprise/advanced-governance) |
| `cluster_config` | object | Cluster mode settings: gossip, peers, and auto-discovery backends *(enterprise only)* | [Cluster](/deployment-guides/config-json/cluster) |
| `config_store` | object | Configuration database backend — SQLite, PostgreSQL, or disabled (file-only mode) | [Storage](/deployment-guides/config-json/storage#config_store) |
| `logs_store` | object | Request/response log database — SQLite, PostgreSQL + optional S3/GCS offload | [Storage](/deployment-guides/config-json/storage#logs_store) |
| `vector_store` | object | Vector database for semantic cache — Weaviate, Redis, Qdrant, Pinecone, Valkey | [Storage](/deployment-guides/config-json/storage#vector_store) |
| `plugins` | array | Opt-in plugins: `semantic_cache`, `otel`, `maxim`, `datadog`, custom | [Plugins](/deployment-guides/config-json/plugins) |
| `framework` | object | Model pricing catalog URL and sync interval | [Framework](#framework) |
| `mcp` | object | MCP server and tool configuration | — |
| `websocket` | object | WebSocket / Realtime API connection pool tuning | [WebSocket](#websocket) |
| `auth_config` | object | **Deprecated** — use `governance.auth_config` | [Client](/deployment-guides/config-json/client#authentication) |
---
## `version`
Controls how empty arrays in allow-list fields (`models`, `allowed_models`, `key_ids`, `tools_to_execute`) are interpreted:
| Value | Behaviour |
|-------|-----------|
| `2` *(default, v1.5.0+)* | Empty array = **deny all**; `["*"]` = allow all |
| `1` *(v1.4.x compat)* | Empty array = **allow all** |
Omitting `version` uses v2 semantics. Set `"version": 1` only if you are migrating from v1.4.x and need the old behaviour temporarily.
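
The difference between the two versions can be sketched as a small membership check. This is an illustration of the table above, not Bifrost's implementation: `is_allowed` is a hypothetical function, and the sketch assumes the `"*"` wildcard is honoured under both versions.

```python
def is_allowed(item: str, allow_list: list[str], version: int = 2) -> bool:
    """Decide whether `item` passes an allow-list under v1 or v2 semantics."""
    if "*" in allow_list:
        # Wildcard: allow everything (assumed identical in both versions).
        return True
    if not allow_list:
        # Empty list: v1 allows all, v2 denies all.
        return version == 1
    return item in allow_list


print(is_allowed("gpt-4o", [], version=2))
print(is_allowed("gpt-4o", [], version=1))
```

When migrating from v1.4.x, every allow-list that was intentionally left empty needs an explicit `["*"]` before the `version` override is removed.
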
---
## `client`
Controls the worker pool, logging pipeline, security, and SDK shims. All fields are optional.
| Field | Type | Default | Description |
|-------|------|---------|-------------|
| `initial_pool_size` | integer | `300` | Pre-allocated goroutines per provider queue |
| `drop_excess_requests` | boolean | `false` | Return HTTP 429 when queue is full |
| `enable_logging` | boolean | `true`* | Persist request/response logs (`*` auto-enabled when `logs_store` is set) |
| `disable_content_logging` | boolean | `false` | Strip message content from logs |
| `log_retention_days` | integer | `365` | Days to retain log entries |
| `logging_headers` | array | `[]` | HTTP headers to capture in log metadata |
| `enforce_auth_on_inference` | boolean | `false` | Require a virtual key on every `/v1/*` request |
| `allow_direct_keys` | boolean | `false` | Allow callers to pass provider API keys directly |
| `allowed_origins` | array | `["*"]` | CORS allowed origins |
| `max_request_body_size_mb` | integer | `100` | Maximum request body in MB |
| `whitelisted_routes` | array | `[]` | Routes that bypass auth middleware |
| `allowed_headers` | array | `[]` | Additional headers permitted for CORS/WebSocket |
| `required_headers` | array | `[]` | Headers that must be present on every request |
| `header_filter_config` | object | — | `allowlist` / `denylist` for `x-bf-eh-*` forwarded headers |
| `prometheus_labels` | array | `[]` | Custom labels for all Prometheus metrics |
| `compat` | object | — | SDK compatibility shims (`should_drop_params`, `convert_text_to_chat`, etc.) |
| `mcp_agent_depth` | integer | `10` | Max tool-call recursion depth |
| `mcp_tool_execution_timeout` | integer | `30` | Per-tool execution timeout in seconds |
| `mcp_tool_sync_interval` | integer | `10` | Tool sync interval in minutes (`0` = disabled) |
| `mcp_disable_auto_tool_inject` | boolean | `false` | Disable automatic MCP tool injection |
| `async_job_result_ttl` | integer | `3600` | TTL for async job results in seconds |
| `disable_db_pings_in_health` | boolean | `false` | Exclude DB connectivity from `/health` |
| `routing_chain_max_depth` | integer | `10` | Max routing rule chain evaluation depth |
Full documentation: [Client Configuration](/deployment-guides/config-json/client).
---
## `providers`
Keyed by provider name. Each entry contains a `keys` array and optional `network_config`, `concurrency_and_buffer_size`, `proxy_config`.
Supported provider keys: `openai`, `anthropic`, `azure`, `bedrock`, `vertex`, `gemini`, `mistral`, `groq`, `cohere`, `perplexity`, `xai`, `cerebras`, `openrouter`, `nebius`, `fireworks`, `parasail`, `huggingface`, `replicate`, `ollama`, `vllm`, `sgl`, `elevenlabs`, `runway`.
Full documentation: [Provider Setup](/deployment-guides/config-json/providers).
---
## `governance`
Seeds governance resources at startup. All sub-keys are optional arrays.
| Sub-key | Description |
|---------|-------------|
| `auth_config` | Admin username/password auth for the dashboard |
| `virtual_keys` | Scoped API tokens with provider/model allowlists |
| `budgets` | Spend caps in USD over a rolling window |
| `rate_limits` | Request and token rate limits |
| `customers` | Customer entities (attach budgets/rate limits) |
| `teams` | Team entities (attach to customers, budgets, rate limits) |
| `routing_rules` | CEL-based dynamic provider/model routing |
| `pricing_overrides` | Scoped per-model pricing overrides |
| `model_configs` | Per-model rate limit and budget configurations |
Full documentation: [Governance](/deployment-guides/config-json/governance).
---
## `guardrails_config`
Enterprise-only. Two sub-keys: `guardrail_providers` (array) and `guardrail_rules` (array).
Full documentation: [Guardrails](/deployment-guides/config-json/guardrails).
---
## `access_profiles`
Enterprise-only. Defines access profile templates that can later be attached to roles/users.
```json
{
"access_profiles": [
{
"name": "platform-default",
"description": "Default platform profile",
"is_active": true,
"tags": ["platform", "default"],
"provider_configs": [
{
"provider_name": "openai",
"all_models_allowed": false,
"allowed_models": ["gpt-4o", "gpt-4o-mini"]
}
],
"mcp_servers": [
{ "mcp_server_id": "github" }
],
"mcp_tool_overrides": [
{ "mcp_client_id": "github", "tool_name": "create_pull_request", "action": "include" }
]
}
]
}
```
---
## `cluster_config`
Enterprise-only clustering settings for multi-node deployments.
| Sub-key | Description |
|---------|-------------|
| `enabled` | Enables cluster mode |
| `region` | Region label used by enterprise clustering |
| `peers` | Static peer list (`host:port`) |
| `gossip` | Gossip/memberlist port + liveness thresholds |
| `discovery` | Auto-discovery configuration (`kubernetes`, `dns`, `udp`, `consul`, `etcd`, `mdns`) |
Full documentation: [Cluster](/deployment-guides/config-json/cluster).
---
## `config_store`, `logs_store`, `vector_store`
Storage backends. Each has `enabled` (boolean), `type` (string), and `config` (object).
| Store | Types |
|-------|-------|
| `config_store` | `"sqlite"`, `"postgres"` |
| `logs_store` | `"sqlite"`, `"postgres"` (+ optional `object_storage`) |
| `vector_store` | `"weaviate"`, `"redis"`, `"qdrant"`, `"pinecone"` (`"redis"` also covers Valkey-compatible endpoints) |
Full documentation: [Storage](/deployment-guides/config-json/storage).
---
## `framework`
Controls model pricing catalog sync:
```json
{
"framework": {
"pricing": {
"pricing_url": "https://raw.githubusercontent.com/BerriAI/litellm/main/model_prices_and_context_window.json",
"pricing_sync_interval": 86400
}
}
}
```
| Field | Default | Description |
|-------|---------|-------------|
| `pricing.pricing_url` | LiteLLM catalog | URL of a model pricing JSON file |
| `pricing.pricing_sync_interval` | `86400` | Sync interval in seconds (minimum: `3600`) |
---
## `websocket`
Optional tuning for the WebSocket gateway (Responses API WebSocket mode, Realtime API). WebSocket is always enabled.
```json
{
"websocket": {
"max_connections_per_user": 100,
"transcript_buffer_size": 100,
"pool": {
"max_idle_per_key": 50,
"max_total_connections": 1000,
"idle_timeout_seconds": 600,
"max_connection_lifetime_seconds": 7200
}
}
}
```
| Field | Default | Description |
|-------|---------|-------------|
| `max_connections_per_user` | `100` | Max concurrent WebSocket connections per user |
| `transcript_buffer_size` | `100` | Transcript entries buffered for Realtime API mid-session fallback |
| `pool.max_idle_per_key` | `50` | Max idle upstream connections per provider/key |
| `pool.max_total_connections` | `1000` | Max total idle upstream connections |
| `pool.idle_timeout_seconds` | `600` | Evict idle connections after this many seconds |
| `pool.max_connection_lifetime_seconds` | `7200` | Max lifetime of any upstream connection |
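
The two time-based pool settings can be read as a single eviction predicate. The sketch below is an assumption about how they combine (either limit alone is enough to evict, and the comparison is inclusive); `PoolConfig` and `should_evict` are hypothetical names used only for illustration.

```python
from dataclasses import dataclass


@dataclass
class PoolConfig:
    # Defaults copied from the table above.
    idle_timeout_seconds: int = 600
    max_connection_lifetime_seconds: int = 7200


def should_evict(idle_seconds: float, age_seconds: float, pool: PoolConfig) -> bool:
    """Evict when a connection has idled too long or exceeded its maximum lifetime."""
    return (
        idle_seconds >= pool.idle_timeout_seconds
        or age_seconds >= pool.max_connection_lifetime_seconds
    )


pool = PoolConfig()
print(should_evict(idle_seconds=30, age_seconds=7300, pool=pool))
```
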
---
## Minimal Valid Config
```json
{
"$schema": "https://www.getbifrost.ai/schema",
"encryption_key": "env.BIFROST_ENCRYPTION_KEY",
"providers": {
"openai": {
"keys": [
{ "name": "primary", "value": "env.OPENAI_API_KEY", "models": ["*"], "weight": 1.0 }
]
}
},
"config_store": { "enabled": false }
}
```

@@ -0,0 +1,540 @@
---
title: "Storage"
description: "Configure Bifrost storage backends in config.json — config_store, logs_store, vector_store, and object storage for logs"
icon: "database"
---
Bifrost persists two types of data — **config** (providers, virtual keys, governance rules) and **logs** (request/response records). Each has its own store. A **vector store** is required for semantic caching.
| Store | Purpose | Backends |
|-------|---------|---------|
| `config_store` | Provider configs, virtual keys, governance rules | SQLite, PostgreSQL |
| `logs_store` | Request/response logs shown in UI | SQLite, PostgreSQL + optional S3/GCS offload |
| `vector_store` | Semantic response caching | Weaviate, Redis, Valkey, Qdrant, Pinecone |
<Note>
If you use PostgreSQL for any store, the target database must be **UTF8 encoded**. See [PostgreSQL UTF8 Requirement](/quickstart/gateway/setting-up#postgresql-utf8-requirement).
</Note>
---
## config_store
<Note>
When `config_store` is disabled (or absent), all configuration is loaded from `config.json` at startup only — the Web UI is disabled and changes require a restart. See [Two Configuration Modes](/deployment-guides/config-json#two-configuration-modes).
</Note>
<Tabs>
<Tab title="SQLite">
### SQLite (Default)
Simplest setup — no external database required. Bifrost stores configuration in a local SQLite file.
```json
{
"config_store": {
"enabled": true,
"type": "sqlite",
"config": {
"path": "./config.db"
}
}
}
```
| Field | Description |
|-------|-------------|
| `config.path` | Path to the SQLite file (relative to app-dir, or absolute) |
</Tab>
<Tab title="PostgreSQL">
### PostgreSQL
Production-grade storage suitable for high-availability and high-throughput deployments.
```json
{
"config_store": {
"enabled": true,
"type": "postgres",
"config": {
"host": "env.PG_HOST",
"port": "5432",
"user": "env.PG_USER",
"password": "env.PG_PASSWORD",
"db_name": "bifrost",
"ssl_mode": "require",
"max_idle_conns": 5,
"max_open_conns": 50
}
}
}
```
| Field | Default | Description |
|-------|---------|-------------|
| `host` | — | PostgreSQL host (supports `env.` prefix) |
| `port` | — | PostgreSQL port (as string) |
| `user` | — | Database user (supports `env.` prefix) |
| `password` | — | Database password (supports `env.` prefix). Leave empty for IAM role auth. |
| `db_name` | — | Database name |
| `ssl_mode` | — | `"disable"`, `"require"`, `"verify-ca"`, `"verify-full"` |
| `max_idle_conns` | `5` | Maximum idle connections in the pool |
| `max_open_conns` | `50` | Maximum open connections to the database |
</Tab>
<Tab title="Disabled">
### Disabled (file-only mode)
Use this when you want Bifrost to read all configuration from `config.json` only — no database, no Web UI.
```json
{
"config_store": {
"enabled": false
}
}
```
This is the recommended setup for [multinode OSS deployments](/deployment-guides/how-to/multinode) where a shared `config.json` is the single source of truth.
</Tab>
</Tabs>
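
Several fields above accept an `env.` prefix, meaning the value is read from the named environment variable rather than taken literally. A minimal sketch of that convention follows; `resolve` is a hypothetical helper, and raising an error for an unset variable is an assumption, since this page does not describe how Bifrost handles that case.

```python
import os
from collections.abc import Mapping

ENV_PREFIX = "env."


def resolve(value: str, environ: Mapping[str, str] = os.environ) -> str:
    """Return the literal value, or the environment variable named after 'env.'."""
    if not value.startswith(ENV_PREFIX):
        return value
    name = value[len(ENV_PREFIX):]
    if name not in environ:
        # Assumed behaviour: fail loudly instead of connecting with an empty value.
        raise KeyError(f"environment variable {name} is not set")
    return environ[name]


fake_env = {"PG_HOST": "db.internal"}
print(resolve("env.PG_HOST", fake_env))
print(resolve("5432", fake_env))
```

Keeping credentials behind `env.` references lets the same `config.json` be committed to version control and shared across replicas without embedding secrets.
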
---
## logs_store
<Tabs>
<Tab title="SQLite">
### SQLite
```json
{
"logs_store": {
"enabled": true,
"type": "sqlite",
"config": {
"path": "./logs.db"
}
}
}
```
</Tab>
<Tab title="PostgreSQL">
### PostgreSQL
```json
{
"logs_store": {
"enabled": true,
"type": "postgres",
"config": {
"host": "env.PG_HOST",
"port": "5432",
"user": "env.PG_USER",
"password": "env.PG_PASSWORD",
"db_name": "bifrost",
"ssl_mode": "require",
"max_idle_conns": 10,
"max_open_conns": 100
}
}
}
```
For high log volumes, increase `max_open_conns`:
```json
{
"logs_store": {
"enabled": true,
"type": "postgres",
"config": {
"host": "env.PG_HOST",
"port": "5432",
"user": "env.PG_USER",
"password": "env.PG_PASSWORD",
"db_name": "bifrost",
"ssl_mode": "require",
"max_idle_conns": 10,
"max_open_conns": 200
},
"retention_days": 90
}
}
```
</Tab>
<Tab title="Disabled">
```json
{
"logs_store": {
"enabled": false
}
}
```
</Tab>
</Tabs>
### Log Retention
Set `retention_days` to automatically purge old log entries. `0` disables retention-based cleanup.
```json
{
"logs_store": {
"enabled": true,
"type": "postgres",
"config": { "...": "..." },
"retention_days": 90
}
}
```
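
The retention rule amounts to computing a cutoff timestamp and deleting entries older than it. The sketch below shows only that calculation; `retention_cutoff` is a hypothetical function, and how and when Bifrost actually runs the purge is not covered here.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional


def retention_cutoff(now: datetime, retention_days: int) -> Optional[datetime]:
    """Return the timestamp before which logs are purged, or None when cleanup is disabled."""
    if retention_days < 0:
        raise ValueError("retention_days must not be negative")
    if retention_days == 0:
        # 0 disables retention-based cleanup.
        return None
    return now - timedelta(days=retention_days)


now = datetime(2026, 4, 26, tzinfo=timezone.utc)
print(retention_cutoff(now, 90))
```
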
### Object Storage for Logs
Offload large request/response payloads from the database to S3 or GCS. The database retains only lightweight index records; payloads are fetched on demand.
<Tabs>
<Tab title="AWS S3">
```json
{
"logs_store": {
"enabled": true,
"type": "postgres",
"config": { "...": "..." },
"object_storage": {
"type": "s3",
"bucket": "env.S3_BUCKET",
"prefix": "bifrost",
"compress": true,
"region": "us-east-1",
"access_key_id": "env.S3_ACCESS_KEY_ID",
"secret_access_key": "env.S3_SECRET_ACCESS_KEY"
}
}
}
```
**IAM role (instance profile / IRSA)** — omit `access_key_id` and `secret_access_key`:
```json
{
"object_storage": {
"type": "s3",
"bucket": "bifrost-logs",
"region": "us-east-1",
"compress": true,
"role_arn": "arn:aws:iam::123456789012:role/BifrostS3Role"
}
}
```
| Field | Description |
|-------|-------------|
| `bucket` | S3 bucket name (supports `env.` prefix) |
| `prefix` | Key prefix for stored objects (default: `"bifrost"`) |
| `compress` | Enable gzip compression (default: `false`) |
| `region` | AWS region |
| `access_key_id` | AWS access key ID (omit for default credential chain) |
| `secret_access_key` | AWS secret access key |
| `session_token` | STS temporary credentials session token |
| `role_arn` | IAM role ARN for STS AssumeRole |
| `endpoint` | Custom endpoint for MinIO / Cloudflare R2 |
| `force_path_style` | Use path-style URLs (required for MinIO, default: `false`) |
</Tab>
<Tab title="Google Cloud Storage">
```json
{
"logs_store": {
"enabled": true,
"type": "postgres",
"config": { "...": "..." },
"object_storage": {
"type": "gcs",
"bucket": "bifrost-logs",
"prefix": "bifrost",
"compress": true,
"project_id": "env.GCP_PROJECT_ID",
"credentials_json": "env.GCS_CREDENTIALS_JSON"
}
}
}
```
Omit `credentials_json` to use Application Default Credentials (Workload Identity, GCE metadata, `gcloud auth`).
| Field | Description |
|-------|-------------|
| `project_id` | GCP project ID (supports `env.` prefix) |
| `credentials_json` | Service account JSON or path — omit for ADC |
</Tab>
<Tab title="MinIO (Self-Hosted)">
```json
{
"object_storage": {
"type": "s3",
"bucket": "bifrost-logs",
"prefix": "bifrost",
"compress": false,
"region": "us-east-1",
"endpoint": "http://minio.internal:9000",
"access_key_id": "env.MINIO_ACCESS_KEY",
"secret_access_key": "env.MINIO_SECRET_KEY",
"force_path_style": true
}
}
```
</Tab>
</Tabs>
---
## vector_store
A vector store is required for [semantic caching](/features/semantic-caching). Choose from Weaviate, Redis/Valkey, Qdrant, or Pinecone.
<Tabs>
<Tab title="Weaviate">
```json
{
"vector_store": {
"enabled": true,
"type": "weaviate",
"config": {
"scheme": "http",
"host": "localhost:8080",
"api_key": "env.WEAVIATE_API_KEY",
"grpc_config": {
"host": "localhost:50051",
"secured": false
}
}
}
}
```
| Field | Required | Description |
|-------|----------|-------------|
| `scheme` | Yes | `"http"` or `"https"` |
| `host` | Yes | Weaviate server host and port |
| `api_key` | No | Weaviate API key (supports `env.` prefix) |
| `grpc_config.host` | No | gRPC host for faster vector operations |
| `grpc_config.secured` | No | Use TLS for gRPC connection |
</Tab>
<Tab title="Redis / Valkey">
```json
{
"vector_store": {
"enabled": true,
"type": "redis",
"config": {
"addr": "env.REDIS_ADDR",
"password": "env.REDIS_PASSWORD",
"db": 0,
"use_tls": false
}
}
}
```
**AWS MemoryDB (cluster mode):**
```json
{
"vector_store": {
"enabled": true,
"type": "redis",
"config": {
"addr": "env.MEMORYDB_ENDPOINT",
"password": "env.MEMORYDB_PASSWORD",
"use_tls": true,
"cluster_mode": true
}
}
}
```
| Field | Default | Description |
|-------|---------|-------------|
| `addr` | — | Redis/Valkey address `host:port` (supports `env.` prefix) |
| `password` | — | Redis AUTH password (supports `env.` prefix) |
| `db` | `0` | Redis database number |
| `use_tls` | `false` | Enable TLS |
| `cluster_mode` | `false` | Enable cluster mode (required for MemoryDB; `db` must be `0`) |
| `pool_size` | — | Maximum socket connections |
</Tab>
<Tab title="Qdrant">
```json
{
"vector_store": {
"enabled": true,
"type": "qdrant",
"config": {
"host": "env.QDRANT_HOST",
"port": 6334,
"api_key": "env.QDRANT_API_KEY",
"use_tls": false
}
}
}
```
| Field | Default | Description |
|-------|---------|-------------|
| `host` | — | Qdrant server host (supports `env.` prefix) |
| `port` | `6334` | gRPC port |
| `api_key` | — | API key (supports `env.` prefix) |
| `use_tls` | `false` | Enable TLS |
</Tab>
<Tab title="Pinecone">
Pinecone is a hosted service, so Bifrost always connects to an external endpoint.
```json
{
"vector_store": {
"enabled": true,
"type": "pinecone",
"config": {
"api_key": "env.PINECONE_API_KEY",
"index_host": "env.PINECONE_INDEX_HOST"
}
}
}
```
| Field | Description |
|-------|-------------|
| `api_key` | Pinecone API key (supports `env.` prefix) |
| `index_host` | Index host from Pinecone console (e.g. `your-index.svc.us-east1-gcp.pinecone.io`) |
</Tab>
</Tabs>
---
## Mixed Backend Example
Run the config store on PostgreSQL (for UI) while keeping logs on SQLite (simpler, cheaper for append-heavy workloads):
```json
{
"$schema": "https://www.getbifrost.ai/schema",
"encryption_key": "env.BIFROST_ENCRYPTION_KEY",
"config_store": {
"enabled": true,
"type": "postgres",
"config": {
"host": "env.PG_HOST",
"port": "5432",
"user": "env.PG_USER",
"password": "env.PG_PASSWORD",
"db_name": "bifrost",
"ssl_mode": "require"
}
},
"logs_store": {
"enabled": true,
"type": "sqlite",
"config": {
"path": "./logs.db"
}
}
}
```
---
## Full Storage Example
```json
{
"$schema": "https://www.getbifrost.ai/schema",
"encryption_key": "env.BIFROST_ENCRYPTION_KEY",
"config_store": {
"enabled": true,
"type": "postgres",
"config": {
"host": "env.PG_HOST",
"port": "5432",
"user": "env.PG_USER",
"password": "env.PG_PASSWORD",
"db_name": "bifrost",
"ssl_mode": "require",
"max_idle_conns": 5,
"max_open_conns": 50
}
},
"logs_store": {
"enabled": true,
"type": "postgres",
"config": {
"host": "env.PG_HOST",
"port": "5432",
"user": "env.PG_USER",
"password": "env.PG_PASSWORD",
"db_name": "bifrost",
"ssl_mode": "require",
"max_idle_conns": 10,
"max_open_conns": 100
},
"retention_days": 90,
"object_storage": {
"type": "s3",
"bucket": "env.S3_BUCKET",
"region": "us-east-1",
"compress": true,
"access_key_id": "env.S3_ACCESS_KEY_ID",
"secret_access_key": "env.S3_SECRET_ACCESS_KEY"
}
},
"vector_store": {
"enabled": true,
"type": "weaviate",
"config": {
"scheme": "http",
"host": "weaviate:8080"
}
}
}
```

@@ -0,0 +1,440 @@
---
title: "Docker Performance Tuning"
description: "Optimize Bifrost container performance with Go runtime tuning, resource limits, and system configuration"
icon: "docker"
---
This guide covers performance tuning for Bifrost when running in Docker containers. Proper tuning ensures Bifrost can fully utilize container resources and achieve optimal throughput.
<Note>
These optimizations apply to Docker, Docker Compose, Kubernetes, and any container runtime using cgroups for resource management.
</Note>
## Quick Start
For most production deployments, add these settings to your container:
```yaml
services:
  bifrost:
    image: maximhq/bifrost:latest
    environment:
      - GOGC=200
      - GOMEMLIMIT=3600MiB  # 90% of 4GB memory limit
    ulimits:
      nofile:
        soft: 65536
        hard: 65536
    deploy:
      resources:
        limits:
          cpus: '4'
          memory: 4G
```
---
## Go Runtime Tuning
### GOMAXPROCS (Automatic)
Bifrost automatically detects container CPU limits using [automaxprocs](https://github.com/uber-go/automaxprocs). This sets `GOMAXPROCS` to match your container's CPU quota from cgroups (v1 and v2).
**No configuration needed** — this works automatically. You'll see a log line at startup:
```
maxprocs: Updating GOMAXPROCS=4: determined from CPU quota
```
<Warning>
Without automaxprocs, Go would detect all host CPUs (e.g., 64 on an EC2 instance) even when the container is limited to 4 CPUs, causing excessive context switching and degraded performance.
</Warning>
### GOGC (Garbage Collection)
`GOGC` controls garbage collection frequency. The default is `100` (GC triggers when heap grows 100% since last collection).
| Scenario | Recommended GOGC | Trade-off |
|----------|------------------|-----------|
| Memory constrained | 50-100 | More frequent GC, lower memory |
| High throughput, memory available | 200-400 | Less GC overhead, higher memory |
| Latency sensitive | 50-100 | More predictable latency |
```yaml
environment:
- GOGC=200
```
<Tip>
For high-throughput API gateways, `GOGC=200` or `GOGC=400` typically provides the best balance of throughput and memory usage.
</Tip>
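
To see why a higher `GOGC` trades memory for throughput, it helps to estimate the heap size at which the next collection starts. The sketch below uses a simplified form of the Go runtime's target (live heap plus `GOGC` percent of it) and deliberately ignores GC roots and any `GOMEMLIMIT`; `next_gc_target_mib` is a hypothetical helper.

```python
def next_gc_target_mib(live_heap_mib: int, gogc: int = 100) -> int:
    """Approximate heap size (MiB) at which the next GC cycle is triggered."""
    if gogc < 0:
        raise ValueError("gogc must not be negative")
    # The heap may grow by gogc percent over the live heap before GC runs.
    return live_heap_mib * (100 + gogc) // 100


for gogc in (50, 100, 200, 400):
    print(gogc, next_gc_target_mib(500, gogc))
```

With a 500 MiB live heap, moving from `GOGC=100` to `GOGC=200` raises the approximate trigger point from 1000 MiB to 1500 MiB, so collections run less often at the cost of a larger peak heap.
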
### GOMEMLIMIT (Memory Limit)
`GOMEMLIMIT` sets a soft memory limit for the Go runtime. When approaching this limit, Go becomes more aggressive about garbage collection.
**Best practice:** Set to ~90% of your container's memory limit to leave headroom for memory the Go runtime does not account for (cgo allocations, the mapped binary, kernel buffers held on the process's behalf).
| Container Memory | Recommended GOMEMLIMIT |
|------------------|------------------------|
| 512 MB | 450MiB |
| 1 GB | 900MiB |
| 2 GB | 1800MiB |
| 4 GB | 3600MiB |
| 8 GB | 7200MiB |
```yaml
environment:
- GOMEMLIMIT=3600MiB
```
<Note>
When both `GOGC` and `GOMEMLIMIT` are set, the garbage collector runs as soon as either threshold is reached. For high-throughput workloads, set `GOGC=200` or higher and let `GOMEMLIMIT` be the primary constraint.
</Note>
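
The 90% rule can be computed directly from the container limit. The sketch below is a hypothetical helper that works in exact MiB, so its results are slightly higher than the rounded values in the table (which appear to treat 1 GB as 1000 MiB).

```python
def gomemlimit(container_limit_mib: int, percent: int = 90) -> str:
    """Return a GOMEMLIMIT value as a percentage of the container memory limit."""
    if not 0 < percent <= 100:
        raise ValueError("percent must be in the range 1-100")
    # Integer arithmetic avoids floating-point rounding surprises.
    return f"{container_limit_mib * percent // 100}MiB"


print(f"GOMEMLIMIT={gomemlimit(4096)}")
```

Passing `percent=85` gives the more conservative value suggested when a container is being OOM killed.
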
---
## System Limits
### File Descriptor Limits (ulimits)
Each HTTP connection requires a file descriptor. The default container limit (often 1024) is too low for high-concurrency workloads.
```yaml
ulimits:
  nofile:
    soft: 65536
    hard: 65536
```
| Expected Concurrent Connections | Recommended nofile |
|--------------------------------|-------------------|
| < 1000 | 4096 |
| 1000-5000 | 16384 |
| 5000-10000 | 32768 |
| > 10000 | 65536+ |
<Warning>
If you see errors like `too many open files` or connections being refused under load, increase your `nofile` limit.
</Warning>
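
The table above can be expressed as a simple lookup. `recommended_nofile` is a hypothetical helper; because the table's ranges share their boundary values, the sketch assumes the upper bound of each range is inclusive.

```python
def recommended_nofile(expected_connections: int) -> int:
    """Map expected concurrent connections to a nofile limit, following the table above."""
    if expected_connections < 1000:
        return 4096
    if expected_connections <= 5000:
        return 16384
    if expected_connections <= 10000:
        return 32768
    # Above 10000 the table recommends 65536 or more.
    return 65536


for conns in (500, 3000, 8000, 20000):
    print(conns, recommended_nofile(conns))
```
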
### Resource Limits
Set CPU and memory limits to match your expected workload:
```yaml
deploy:
  resources:
    limits:
      cpus: '4'
      memory: 4G
    reservations:
      cpus: '2'
      memory: 2G
```
**Sizing guidance:**
| Expected RPS | Recommended CPUs | Recommended Memory |
|--------------|------------------|-------------------|
| 100-500 | 1-2 | 512MB-1GB |
| 500-2000 | 2-4 | 1-2GB |
| 2000-5000 | 4-8 | 2-4GB |
| 5000+ | 8+ | 4GB+ |
---
## Docker Compose Examples
### Development
```yaml
services:
  bifrost:
    image: maximhq/bifrost:latest
    ports:
      - "8080:8080"
    volumes:
      - ./data:/app/data
    environment:
      - LOG_LEVEL=debug
```
### Production (Single Node)
```yaml
services:
  bifrost:
    image: maximhq/bifrost:latest
    ports:
      - "8080:8080"
    volumes:
      - bifrost-data:/app/data
    environment:
      - LOG_LEVEL=info
      - LOG_STYLE=json
      - GOGC=200
      - GOMEMLIMIT=3600MiB
    ulimits:
      nofile:
        soft: 65536
        hard: 65536
    deploy:
      resources:
        limits:
          cpus: '4'
          memory: 4G
        reservations:
          cpus: '2'
          memory: 2G
    healthcheck:
      test: ["CMD", "wget", "--no-verbose", "--tries=1", "-O", "/dev/null", "http://localhost:8080/health"]
      interval: 30s
      timeout: 10s
      retries: 3
    restart: unless-stopped

volumes:
  bifrost-data:
```
### Production (Multi-Node with PostgreSQL)
<Note>
If you use PostgreSQL for Bifrost storage, ensure the database is UTF8 encoded. See [PostgreSQL UTF8 Requirement](../quickstart/gateway/setting-up#postgresql-utf8-requirement).
</Note>
```yaml
services:
  bifrost-1:
    image: maximhq/bifrost:latest
    ports:
      - "8081:8080"
    environment:
      - LOG_LEVEL=info
      - GOGC=200
      - GOMEMLIMIT=1800MiB
      - BIFROST_DB_TYPE=postgres
      - BIFROST_DB_DSN=postgres://user:pass@postgres:5432/bifrost?sslmode=disable
    ulimits:
      nofile:
        soft: 65536
        hard: 65536
    deploy:
      resources:
        limits:
          cpus: '2'
          memory: 2G
    depends_on:
      - postgres

  bifrost-2:
    image: maximhq/bifrost:latest
    ports:
      - "8082:8080"
    environment:
      - LOG_LEVEL=info
      - GOGC=200
      - GOMEMLIMIT=1800MiB
      - BIFROST_DB_TYPE=postgres
      - BIFROST_DB_DSN=postgres://user:pass@postgres:5432/bifrost?sslmode=disable
    ulimits:
      nofile:
        soft: 65536
        hard: 65536
    deploy:
      resources:
        limits:
          cpus: '2'
          memory: 2G
    depends_on:
      - postgres

  postgres:
    image: postgres:16-alpine
    environment:
      - POSTGRES_USER=user
      - POSTGRES_PASSWORD=pass
      - POSTGRES_DB=bifrost
    volumes:
      - postgres-data:/var/lib/postgresql/data

volumes:
  postgres-data:
```
---
## Kubernetes Configuration
### Basic Deployment
```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: bifrost
spec:
  replicas: 3
  selector:
    matchLabels:
      app: bifrost
  template:
    metadata:
      labels:
        app: bifrost
    spec:
      containers:
        - name: bifrost
          image: maximhq/bifrost:latest
          ports:
            - containerPort: 8080
          env:
            - name: GOGC
              value: "200"
            - name: GOMEMLIMIT
              value: "3600MiB"
          resources:
            limits:
              cpu: "4"
              memory: "4Gi"
            requests:
              cpu: "2"
              memory: "2Gi"
          livenessProbe:
            httpGet:
              path: /health
              port: 8080
            initialDelaySeconds: 5
            periodSeconds: 10
          readinessProbe:
            httpGet:
              path: /health
              port: 8080
            initialDelaySeconds: 5
            periodSeconds: 5
```
### File Descriptor Limits in Kubernetes
File descriptor limits in Kubernetes are typically set at the node level. Options include:
1. **Node-level configuration** (recommended): Set `fs.file-max` and ulimits in your node configuration
2. **Init container**: Use an init container with elevated privileges to set limits
3. **Security context**: Some clusters allow setting capabilities
```yaml
securityContext:
  capabilities:
    add: ["SYS_RESOURCE"]
```
<Note>
Check your current limits inside a container with: `cat /proc/sys/fs/file-max` and `ulimit -n`
</Note>
---
## Bifrost Application Settings
Align Bifrost's internal settings with your container resources:
### Concurrency and Buffer Size
Configure per provider in `config.json`:
```json
{
"providers": {
"openai": {
"concurrency_and_buffer_size": {
"concurrency": 1000,
"buffer_size": 1500
}
}
}
}
```
**Formula:**
- `concurrency` = expected RPS per provider
- `buffer_size` = 1.5 × concurrency
### Initial Pool Size
Configure globally in `config.json`:
```json
{
"client": {
"initial_pool_size": 3000
}
}
```
**Formula:** `initial_pool_size` = 1.5 × total expected RPS across all providers
<Tip>
See the [Performance Tuning](/providers/performance) guide for detailed sizing recommendations.
</Tip>
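
The two formulas above can be combined into one sizing calculation. The sketch below is a hypothetical helper (`size_bifrost`) that takes expected RPS per provider and emits the corresponding `config.json` fragment; the RPS figures are placeholders.

```python
import json


def size_bifrost(rps_by_provider: dict[str, int]) -> dict:
    """Derive concurrency, buffer_size and initial_pool_size from expected RPS."""
    providers = {
        name: {
            "concurrency_and_buffer_size": {
                "concurrency": rps,            # concurrency = expected RPS per provider
                "buffer_size": rps * 3 // 2,   # buffer_size = 1.5 x concurrency
            }
        }
        for name, rps in rps_by_provider.items()
    }
    total_rps = sum(rps_by_provider.values())
    # initial_pool_size = 1.5 x total expected RPS across all providers
    return {"client": {"initial_pool_size": total_rps * 3 // 2}, "providers": providers}


config = size_bifrost({"openai": 1000, "anthropic": 1000})
print(json.dumps(config, indent=2))
```

The printed fragment still needs each provider's `keys` array before it is a usable configuration.
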
---
## Tuning Checklist
<Steps>
<Step title="Set container resource limits">
Define CPU and memory limits based on expected workload. Start with 2 CPUs / 2GB for moderate loads.
</Step>
<Step title="Configure GOMEMLIMIT">
Set to 90% of container memory limit (e.g., `1800MiB` for 2GB container).
</Step>
<Step title="Tune GOGC">
Start with `GOGC=200` for throughput; reduce to 100 if memory pressure is high.
</Step>
<Step title="Set file descriptor limits">
Set `nofile` ulimit to at least 2× your expected concurrent connections.
</Step>
<Step title="Align Bifrost settings">
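Set `concurrency` to the expected RPS per provider, `buffer_size` to 1.5× that value, and `initial_pool_size` to 1.5× total expected RPS, then confirm the container's CPU and memory limits can sustain that load.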
</Step>
<Step title="Monitor and adjust">
Watch memory usage, GC pause times, and request latencies. Adjust settings based on observed behavior.
</Step>
</Steps>
---
## Troubleshooting
### High Memory Usage
- Reduce `GOGC` (e.g., from 200 to 100)
- Ensure `GOMEMLIMIT` is set
- Reduce `buffer_size` and `initial_pool_size`
### High Latency Spikes
- May indicate GC pauses; try reducing `GOGC`
- Check if container is hitting CPU limits
- Verify `GOMAXPROCS` matches container CPU quota (check startup logs)
### Connection Errors Under Load
- Increase `nofile` ulimit
- Ensure `buffer_size` is large enough for traffic spikes
- Check provider rate limits
### Container OOM Killed
- Reduce `GOMEMLIMIT` to 85% of container memory
- Reduce `GOGC` to trigger more frequent GC
- Reduce `buffer_size` and `initial_pool_size`
---
## Related Documentation
- **[Performance Tuning](/providers/performance)** - Bifrost-specific performance configuration
- **[Helm Deployment](/deployment-guides/helm)** - Kubernetes deployment with Helm
- **[Multi-Node Setup](/deployment-guides/how-to/multinode)** - Scaling across multiple instances

@@ -0,0 +1,378 @@
---
title: "AWS Deployment"
description: "Deploy Bifrost Enterprise on AWS using ECR with IRSA or IAM Task Roles"
icon: "aws"
---
Bifrost Enterprise images for AWS customers are distributed through AWS ECR, enabling native IAM integration for secure, credential-less authentication.
## Architecture
```mermaid
flowchart LR
subgraph AWS[AWS Account]
subgraph EKS[EKS Cluster]
Pod[Bifrost Pod]
KSA[K8s ServiceAccount]
end
IAMRole[IAM Role]
ECR[AWS ECR<br/>Bifrost Images]
end
KSA -->|Annotated with| IAMRole
Pod -->|Assumes| IAMRole
IAMRole -->|Pull Permission| ECR
ECR -->|Image| Pod
```
## Prerequisites
- EKS cluster (v1.23+) or ECS cluster
- AWS CLI configured with appropriate permissions
- `kubectl` configured for your EKS cluster
- Your AWS account ID allowlisted by the Bifrost team
<Note>
Contact the Bifrost team to get your AWS account ID and IAM role ARN allowlisted for ECR access.
</Note>
## IRSA (Recommended)
IAM Roles for Service Accounts (IRSA) provides the most secure authentication method for EKS deployments.
### Step 1: Create IAM Policy
Create an IAM policy that grants ECR pull access to the Bifrost repository.
```json
{
"Version": "2012-10-17",
"Statement": [
{
"Sid": "ECRAuth",
"Effect": "Allow",
"Action": [
"ecr:GetAuthorizationToken"
],
"Resource": "*"
},
{
"Sid": "ECRPullFromBifrost",
"Effect": "Allow",
"Action": [
"ecr:BatchGetImage",
"ecr:GetDownloadUrlForLayer",
"ecr:BatchCheckLayerAvailability"
],
"Resource": "arn:aws:ecr:us-east-1:BIFROST_ACCOUNT_ID:repository/YOUR_HUB_SLUG"
}
]
}
```
<Warning>
Replace `BIFROST_ACCOUNT_ID` and `YOUR_HUB_SLUG` with the values provided by the Bifrost team.
</Warning>
Save this policy as `bifrost-ecr-pull-policy.json` and create it:
```bash
aws iam create-policy \
--policy-name BifrostECRPullPolicy \
--policy-document file://bifrost-ecr-pull-policy.json
```
### Step 2: Create IAM Role with OIDC Trust
Create an IAM role that can be assumed by your Kubernetes ServiceAccount.
First, get your OIDC provider URL:
```bash
aws eks describe-cluster \
--name YOUR_CLUSTER_NAME \
--query "cluster.identity.oidc.issuer" \
--output text
```
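The trust policy refers to the OIDC provider by the issuer URL without its `https://` scheme. A minimal sketch of that conversion follows; the helper name `oidc_provider_path` and the sample issuer URL are illustrative assumptions, not part of Bifrost or the AWS CLI.

```bash
#!/usr/bin/env bash
# Hypothetical helper: strip the scheme from an EKS OIDC issuer URL so the
# result can be used in the trust policy's Federated ARN and condition keys.
oidc_provider_path() {
  local issuer_url="$1"
  printf '%s\n' "${issuer_url#https://}"
}

# Illustrative issuer URL, not a real cluster.
oidc_provider_path "https://oidc.eks.us-east-1.amazonaws.com/id/EXAMPLE1234"
```

In practice the argument would be the output of the `aws eks describe-cluster` command above.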
Create the trust policy (`trust-policy.json`):
```json
{
"Version": "2012-10-17",
"Statement": [
{
"Effect": "Allow",
"Principal": {
"Federated": "arn:aws:iam::YOUR_ACCOUNT_ID:oidc-provider/oidc.eks.REGION.amazonaws.com/id/OIDC_ID"
},
"Action": "sts:AssumeRoleWithWebIdentity",
"Condition": {
"StringEquals": {
"oidc.eks.REGION.amazonaws.com/id/OIDC_ID:aud": "sts.amazonaws.com",
"oidc.eks.REGION.amazonaws.com/id/OIDC_ID:sub": "system:serviceaccount:NAMESPACE:bifrost-sa"
}
}
}
]
}
```
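Filling in the four placeholders by hand is error-prone, so the policy can also be rendered from shell variables. This is a sketch under stated assumptions: `render_trust_policy` is a hypothetical helper, and the account ID and provider path shown are made-up sample values.

```bash
#!/usr/bin/env bash
# Hypothetical helper: print a trust policy for one ServiceAccount.
# Arguments: AWS account ID, OIDC provider path (issuer URL without https://),
# Kubernetes namespace, ServiceAccount name.
render_trust_policy() {
  local account_id="$1" provider="$2" namespace="$3" sa_name="$4"
  cat <<EOF
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": {
        "Federated": "arn:aws:iam::${account_id}:oidc-provider/${provider}"
      },
      "Action": "sts:AssumeRoleWithWebIdentity",
      "Condition": {
        "StringEquals": {
          "${provider}:aud": "sts.amazonaws.com",
          "${provider}:sub": "system:serviceaccount:${namespace}:${sa_name}"
        }
      }
    }
  ]
}
EOF
}

# Illustrative values only; redirect the output to trust-policy.json when using real ones.
render_trust_policy "111122223333" "oidc.eks.us-east-1.amazonaws.com/id/EXAMPLE1234" "bifrost" "bifrost-sa"
```

The namespace passed here must match the namespace in which the `bifrost-sa` ServiceAccount is created, otherwise the `sub` condition will not match.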
Create the role and attach the policy:
```bash
# Create the role
aws iam create-role \
--role-name BifrostECRPullRole \
--assume-role-policy-document file://trust-policy.json
# Attach the policy
aws iam attach-role-policy \
--role-name BifrostECRPullRole \
--policy-arn arn:aws:iam::YOUR_ACCOUNT_ID:policy/BifrostECRPullPolicy
```
### Step 3: Provide Role ARN to Bifrost
Send your IAM role ARN to the Bifrost team for allowlisting:
```
arn:aws:iam::YOUR_ACCOUNT_ID:role/BifrostECRPullRole
```
### Step 4: Create Namespace and ServiceAccount
```bash
kubectl create namespace bifrost
```
```yaml
apiVersion: v1
kind: ServiceAccount
metadata:
  name: bifrost-sa
  namespace: bifrost
  annotations:
    eks.amazonaws.com/role-arn: arn:aws:iam::YOUR_ACCOUNT_ID:role/BifrostECRPullRole
```
### Step 5: Deploy Bifrost
```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: bifrost
  namespace: bifrost
spec:
  replicas: 2
  selector:
    matchLabels:
      app: bifrost
  template:
    metadata:
      labels:
        app: bifrost
    spec:
      serviceAccountName: bifrost-sa
      containers:
        - name: bifrost
          image: BIFROST_ACCOUNT_ID.dkr.ecr.us-east-1.amazonaws.com/YOUR_HUB_SLUG:latest
          ports:
            - containerPort: 8080
              name: http
          resources:
            requests:
              cpu: "250m"
              memory: "512Mi"
            limits:
              cpu: "1000m"
              memory: "2Gi"
          livenessProbe:
            httpGet:
              path: /health
              port: 8080
            initialDelaySeconds: 30
            periodSeconds: 10
          readinessProbe:
            httpGet:
              path: /health
              port: 8080
            initialDelaySeconds: 10
            periodSeconds: 5
          volumeMounts:
            - name: config
              mountPath: /app/data/config.json
              subPath: config.json
      volumes:
        - name: config
          secret:
            secretName: bifrost-config
---
apiVersion: v1
kind: Service
metadata:
  name: bifrost
  namespace: bifrost
spec:
  selector:
    app: bifrost
  ports:
    - port: 80
      targetPort: 8080
      protocol: TCP
  type: ClusterIP
```
## ECS Task Roles
For ECS deployments, ECS pulls the image using the task execution role, so that role must carry the ECR permissions.
### Step 1: Create Task Execution Role
The task execution role allows ECS to pull images from ECR.
```json
{
"Version": "2012-10-17",
"Statement": [
{
"Effect": "Allow",
"Action": [
"ecr:GetAuthorizationToken"
],
"Resource": "*"
},
{
"Effect": "Allow",
"Action": [
"ecr:BatchCheckLayerAvailability",
"ecr:GetDownloadUrlForLayer",
"ecr:BatchGetImage"
],
"Resource": "arn:aws:ecr:us-east-1:BIFROST_ACCOUNT_ID:repository/YOUR_HUB_SLUG"
},
{
"Effect": "Allow",
"Action": [
"logs:CreateLogStream",
"logs:PutLogEvents"
],
"Resource": "*"
}
]
}
```
### Step 2: Create ECS Task Definition
```json
{
"family": "bifrost",
"networkMode": "awsvpc",
"requiresCompatibilities": ["FARGATE"],
"cpu": "512",
"memory": "1024",
"executionRoleArn": "arn:aws:iam::YOUR_ACCOUNT_ID:role/BifrostECSExecutionRole",
"containerDefinitions": [
{
"name": "bifrost",
"image": "BIFROST_ACCOUNT_ID.dkr.ecr.us-east-1.amazonaws.com/YOUR_HUB_SLUG:latest",
"portMappings": [
{
"containerPort": 8080,
"protocol": "tcp"
}
],
"healthCheck": {
"command": ["CMD-SHELL", "curl -f http://localhost:8080/health || exit 1"],
"interval": 30,
"timeout": 5,
"retries": 3,
"startPeriod": 60
},
"logConfiguration": {
"logDriver": "awslogs",
"options": {
"awslogs-group": "/ecs/bifrost",
"awslogs-region": "us-east-1",
"awslogs-stream-prefix": "bifrost"
}
}
}
]
}
```
### Step 3: Create ECS Service
```bash
aws ecs create-service \
--cluster your-cluster \
--service-name bifrost \
--task-definition bifrost \
--desired-count 2 \
--launch-type FARGATE \
--network-configuration "awsvpcConfiguration={subnets=[subnet-xxx],securityGroups=[sg-xxx],assignPublicIp=ENABLED}"
```
## Verifying Access
### Test ECR Authentication
```bash
# Get ECR login token
aws ecr get-login-password --region us-east-1 | \
docker login --username AWS --password-stdin \
BIFROST_ACCOUNT_ID.dkr.ecr.us-east-1.amazonaws.com
# Pull test
docker pull BIFROST_ACCOUNT_ID.dkr.ecr.us-east-1.amazonaws.com/YOUR_HUB_SLUG:latest
```
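The registry host and image reference in these commands follow a fixed pattern built from the account ID, region, and repository name. The sketch below composes that reference; `ecr_image_uri` is a hypothetical helper and the arguments are sample values, not real Bifrost identifiers.

```bash
#!/usr/bin/env bash
# Hypothetical helper: build an ECR image reference.
# Arguments: registry account ID, region, repository name, optional tag (default: latest).
ecr_image_uri() {
  local account_id="$1" region="$2" repo="$3" tag="${4:-latest}"
  printf '%s.dkr.ecr.%s.amazonaws.com/%s:%s\n' "$account_id" "$region" "$repo" "$tag"
}

# Sample values only.
ecr_image_uri "111122223333" "us-east-1" "example-hub"
```

Passing an explicit tag as the fourth argument avoids relying on `latest` when a specific release needs to be pinned.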
### Verify IRSA Configuration
```bash
# Check ServiceAccount annotation
kubectl get sa bifrost-sa -n bifrost -o yaml
# Verify pod can assume role
kubectl exec -it deployment/bifrost -n bifrost -- \
aws sts get-caller-identity
```
## Troubleshooting
### ImagePullBackOff Errors
1. **Check IAM Role trust policy**: Ensure the OIDC provider and ServiceAccount match
2. **Verify ECR permissions**: Confirm the role has `ecr:BatchGetImage` permission
3. **Check allowlisting**: Ensure your role ARN is allowlisted by the Bifrost team
```bash
# Check pod events
kubectl describe pod -l app=bifrost -n bifrost
# Check IRSA token
kubectl exec -it deployment/bifrost -n bifrost -- \
cat /var/run/secrets/eks.amazonaws.com/serviceaccount/token
```
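When the trust policy does not match, it helps to look at the `sub` and `aud` claims inside the projected token rather than at the raw token. The following is a minimal sketch of decoding the payload segment of a JWT in plain shell. `jwt_payload` is a hypothetical helper, the sample token is fabricated for illustration, and the function does not verify the signature.

```bash
#!/usr/bin/env bash
# Hypothetical helper: print the decoded payload (second segment) of a JWT.
# This only decodes; it does NOT validate the signature.
jwt_payload() {
  local payload
  # Take the second dot-separated segment and map base64url characters to base64.
  payload=$(printf '%s' "$1" | cut -d. -f2 | tr '_-' '/+')
  # base64url omits padding; restore it so the length is a multiple of 4.
  while [ $(( ${#payload} % 4 )) -ne 0 ]; do
    payload="${payload}="
  done
  printf '%s' "$payload" | base64 -d
}

# Fabricated token whose payload segment encodes {"sub":"a"}.
jwt_payload "header.eyJzdWIiOiJhIn0.signature"
echo
```

For a real pod, the token read from the path above would be passed as the argument, and the `sub` claim compared with the `system:serviceaccount:NAMESPACE:bifrost-sa` value in the trust policy. Treat the token as a secret and avoid pasting it into shared logs.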
### Authentication Errors
```bash
# Verify OIDC provider is configured
aws iam list-open-id-connect-providers
# Check role assumption
aws sts assume-role-with-web-identity \
--role-arn arn:aws:iam::YOUR_ACCOUNT_ID:role/BifrostECRPullRole \
--role-session-name test \
--web-identity-token file:///path/to/token
```
## Next Steps
- Configure [Bifrost settings](/quickstart/gateway/setting-up) for your use case
- Set up [observability](/features/observability/default) for monitoring
- Enable [clustering](/enterprise/clustering) for high availability

@@ -0,0 +1,451 @@
---
title: "Azure Deployment"
description: "Deploy Bifrost Enterprise on Azure AKS using Workload Identity Federation to GCP Artifact Registry"
icon: "microsoft"
---
Bifrost Enterprise images for Azure customers are distributed through GCP Artifact Registry, using Azure Workload Identity Federation for secure, credential-less authentication.
## Architecture
```mermaid
flowchart LR
subgraph Azure[Azure Subscription]
subgraph AKS[AKS Cluster]
Pod[Bifrost Pod]
KSA[K8s ServiceAccount]
end
MI[Managed Identity]
end
subgraph GCP[GCP Project]
WIF[Workload Identity<br/>Federation Pool]
GSA[GCP Service Account]
AR[Artifact Registry<br/>Bifrost Images]
end
KSA -->|Federated| MI
MI -->|OIDC Token| WIF
WIF -->|Exchange| GSA
GSA -->|Pull Permission| AR
AR -->|Image| Pod
```
## How It Works
Azure Workload Identity Federation allows Azure Managed Identities to authenticate to GCP without storing long-lived credentials such as service account keys:
1. **AKS Pod** requests a token using its Kubernetes ServiceAccount
2. **Azure AD** issues an OIDC token for the Managed Identity
3. **GCP Workload Identity Federation** validates the Azure token
4. **GCP STS** exchanges it for a GCP access token
5. **Pod** uses the GCP token to pull images from Artifact Registry
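Step 4 is an OAuth 2.0 token exchange against the Google STS endpoint. `gcloud` performs it automatically from the credential configuration, so the sketch below is only meant to show which fields are involved. `sts_exchange_fields` is a hypothetical helper, and the `cloud-platform` scope is an assumption about what a typical exchange requests.

```bash
#!/usr/bin/env bash
# Hypothetical helper: print the form fields of a Google STS token exchange,
# one key=value pair per line.
# Arguments: workload identity provider audience, subject (OIDC) token.
sts_exchange_fields() {
  local audience="$1" subject_token="$2"
  printf '%s\n' \
    "grant_type=urn:ietf:params:oauth:grant-type:token-exchange" \
    "audience=${audience}" \
    "scope=https://www.googleapis.com/auth/cloud-platform" \
    "requested_token_type=urn:ietf:params:oauth:token-type:access_token" \
    "subject_token_type=urn:ietf:params:oauth:token-type:jwt" \
    "subject_token=${subject_token}"
}

# Sample values only. Each printed line could be passed to curl as a
# --data-urlencode argument in a POST to https://sts.googleapis.com/v1/token.
sts_exchange_fields "//iam.googleapis.com/projects/123/locations/global/workloadIdentityPools/pool/providers/provider" "SAMPLE_TOKEN"
```

The federated token returned by this exchange is then used to impersonate the GCP service account, which yields the access token the pod presents to Artifact Registry.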
## Prerequisites
- AKS cluster (v1.24+) with Workload Identity enabled
- Azure CLI configured with appropriate permissions
- `kubectl` configured for your AKS cluster
- Your Azure Tenant ID and Managed Identity Client ID provided to the Bifrost team
<Note>
Contact the Bifrost team with your Azure Tenant ID and Managed Identity Client IDs to get access configured.
</Note>
## Step 1: Enable Workload Identity on AKS
If not already enabled, enable Workload Identity on your AKS cluster:
```bash
# For existing cluster
az aks update \
--resource-group YOUR_RESOURCE_GROUP \
--name YOUR_CLUSTER_NAME \
--enable-oidc-issuer \
--enable-workload-identity
# Get the OIDC issuer URL
az aks show \
--resource-group YOUR_RESOURCE_GROUP \
--name YOUR_CLUSTER_NAME \
--query "oidcIssuerProfile.issuerUrl" -o tsv
```
## Step 2: Create Azure Managed Identity
```bash
# Create Managed Identity
az identity create \
--name bifrost-pull-identity \
--resource-group YOUR_RESOURCE_GROUP \
--location YOUR_LOCATION
# Get the Client ID
CLIENT_ID=$(az identity show \
--name bifrost-pull-identity \
--resource-group YOUR_RESOURCE_GROUP \
--query clientId -o tsv)
echo "Client ID: $CLIENT_ID"
```
## Step 3: Create Federated Credential
Link the Kubernetes ServiceAccount to the Azure Managed Identity:
```bash
# Get AKS OIDC issuer
AKS_OIDC_ISSUER=$(az aks show \
--resource-group YOUR_RESOURCE_GROUP \
--name YOUR_CLUSTER_NAME \
--query "oidcIssuerProfile.issuerUrl" -o tsv)
# Create federated credential
az identity federated-credential create \
--name bifrost-federated-credential \
--identity-name bifrost-pull-identity \
--resource-group YOUR_RESOURCE_GROUP \
--issuer "$AKS_OIDC_ISSUER" \
--subject "system:serviceaccount:bifrost:bifrost-sa" \
--audience "api://AzureADTokenExchange"
```
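The `--subject` value must match the namespace and name of the ServiceAccount exactly, and a mismatch only surfaces later as a failed token exchange. A small sketch for building and comparing the subject string follows; both function names are hypothetical helpers introduced for illustration.

```bash
#!/usr/bin/env bash
# Hypothetical helper: build the subject claim Kubernetes puts in a ServiceAccount token.
sa_subject() {
  local namespace="$1" sa_name="$2"
  printf 'system:serviceaccount:%s:%s\n' "$namespace" "$sa_name"
}

# Hypothetical helper: succeed only if a configured subject matches the ServiceAccount.
subject_matches() {
  local configured="$1" namespace="$2" sa_name="$3"
  [ "$configured" = "$(sa_subject "$namespace" "$sa_name")" ]
}

sa_subject "bifrost" "bifrost-sa"
if subject_matches "system:serviceaccount:bifrost:bifrost-sa" "bifrost" "bifrost-sa"; then
  echo "subject matches"
fi
```

The configured subject could be read from the output of `az identity federated-credential show` and passed as the first argument to `subject_matches`.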
## Step 4: Provide Details to Bifrost Team
Send the following information to the Bifrost team:
```bash
# Get Tenant ID
az account show --query tenantId -o tsv
# Get Client ID
az identity show \
--name bifrost-pull-identity \
--resource-group YOUR_RESOURCE_GROUP \
--query clientId -o tsv
```
The Bifrost team will configure GCP Workload Identity Federation to trust your Azure Managed Identity.
## Step 5: Store GCP Credential Configuration
After the Bifrost team configures access, they will provide a credential configuration. Store it as a ConfigMap:
```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: gcp-credential-config
  namespace: bifrost
data:
  credential-config.json: |
    {
      "type": "external_account",
      "audience": "//iam.googleapis.com/projects/BIFROST_PROJECT_NUMBER/locations/global/workloadIdentityPools/YOUR_HUB_SLUG-azure-pool/providers/YOUR_HUB_SLUG-azure-provider",
      "subject_token_type": "urn:ietf:params:oauth:token-type:jwt",
      "service_account_impersonation_url": "https://iamcredentials.googleapis.com/v1/projects/-/serviceAccounts/BIFROST_SA@BIFROST_PROJECT.iam.gserviceaccount.com:generateAccessToken",
      "token_url": "https://sts.googleapis.com/v1/token",
      "credential_source": {
        "file": "/var/run/secrets/azure/tokens/azure-identity-token",
        "format": {
          "type": "text"
        }
      }
    }
```
<Warning>
The Bifrost team will provide the exact values for `BIFROST_PROJECT_NUMBER`, `YOUR_HUB_SLUG`, and `BIFROST_SA`.
</Warning>
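Before applying the ConfigMap, it can be worth checking that the credential configuration has the expected fields and that no placeholder was left in. The sketch below does this with `grep` only, so it is a rough text check rather than a JSON validation. `check_credential_config` is a hypothetical helper, and the list of required fields is taken from the example above.

```bash
#!/usr/bin/env bash
# Hypothetical helper: rough sanity check of a credential configuration file.
# Returns 0 if all expected field names are present and no placeholder remains.
check_credential_config() {
  local file="$1" key failed=0
  for key in type audience subject_token_type token_url credential_source; do
    if ! grep -q "\"${key}\"" "$file"; then
      echo "missing field: ${key}" >&2
      failed=1
    fi
  done
  if grep -Eq 'BIFROST_PROJECT_NUMBER|YOUR_HUB_SLUG|BIFROST_SA' "$file"; then
    echo "placeholder values still present" >&2
    failed=1
  fi
  return "$failed"
}

# Demo with a throwaway file containing sample values.
demo_file=$(mktemp)
cat > "$demo_file" <<'EOF'
{
  "type": "external_account",
  "audience": "//iam.googleapis.com/projects/123/locations/global/workloadIdentityPools/p/providers/q",
  "subject_token_type": "urn:ietf:params:oauth:token-type:jwt",
  "token_url": "https://sts.googleapis.com/v1/token",
  "credential_source": { "file": "/var/run/secrets/azure/tokens/azure-identity-token" }
}
EOF
if check_credential_config "$demo_file"; then
  echo "credential config looks complete"
fi
rm -f "$demo_file"
```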
## Step 6: Create Kubernetes ServiceAccount
```yaml
apiVersion: v1
kind: ServiceAccount
metadata:
  name: bifrost-sa
  namespace: bifrost
  annotations:
    azure.workload.identity/client-id: YOUR_MANAGED_IDENTITY_CLIENT_ID
  labels:
    azure.workload.identity/use: "true"
```
## Step 7: Create Image Pull Secret with Token Refresh
Create a CronJob to refresh the imagePullSecret using the federated identity:
```yaml
apiVersion: batch/v1
kind: CronJob
metadata:
  name: refresh-ar-secret
  namespace: bifrost
spec:
  schedule: "*/30 * * * *" # Every 30 minutes
  successfulJobsHistoryLimit: 1
  failedJobsHistoryLimit: 3
  jobTemplate:
    spec:
      template:
        metadata:
          labels:
            azure.workload.identity/use: "true"
        spec:
          serviceAccountName: bifrost-sa
          containers:
            - name: token-refresh
              # The script calls both gcloud and kubectl; confirm the image tag you use ships kubectl
              image: google/cloud-sdk:slim
              command: ["/bin/bash", "-c"]
              args:
                - |
                  set -e
                  # Set GCP credential config
                  export GOOGLE_APPLICATION_CREDENTIALS=/etc/gcp/credential-config.json
                  # gcloud's own commands use gcloud's credential store rather than
                  # GOOGLE_APPLICATION_CREDENTIALS, so activate the federated credential explicitly
                  gcloud auth login --cred-file="$GOOGLE_APPLICATION_CREDENTIALS"
                  # Get GCP access token via federation
                  TOKEN=$(gcloud auth print-access-token)
                  # Delete existing secret if it exists
                  kubectl delete secret ar-pull-secret --ignore-not-found -n bifrost
                  # Create new imagePullSecret
                  kubectl create secret docker-registry ar-pull-secret \
                    --docker-server=REGION-docker.pkg.dev \
                    --docker-username=oauth2accesstoken \
                    --docker-password="$TOKEN" \
                    -n bifrost
                  echo "Secret refreshed at $(date)"
              volumeMounts:
                - name: gcp-credential-config
                  mountPath: /etc/gcp
                  readOnly: true
                - name: azure-identity-token
                  mountPath: /var/run/secrets/azure/tokens
                  readOnly: true
          volumes:
            - name: gcp-credential-config
              configMap:
                name: gcp-credential-config
            - name: azure-identity-token
              projected:
                sources:
                  - serviceAccountToken:
                      path: azure-identity-token
                      expirationSeconds: 3600
                      audience: api://AzureADTokenExchange
          restartPolicy: OnFailure
---
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: secret-manager
  namespace: bifrost
rules:
  - apiGroups: [""]
    resources: ["secrets"]
    verbs: ["get", "create", "delete"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: secret-manager-binding
  namespace: bifrost
subjects:
  - kind: ServiceAccount
    name: bifrost-sa
    namespace: bifrost
roleRef:
  kind: Role
  name: secret-manager
  apiGroup: rbac.authorization.k8s.io
```
## Step 8: Deploy Bifrost
```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: bifrost
  namespace: bifrost
spec:
  replicas: 2
  selector:
    matchLabels:
      app: bifrost
  template:
    metadata:
      labels:
        app: bifrost
        azure.workload.identity/use: "true"
    spec:
      serviceAccountName: bifrost-sa
      imagePullSecrets:
        - name: ar-pull-secret
      containers:
        - name: bifrost
          image: REGION-docker.pkg.dev/BIFROST_PROJECT/YOUR_HUB_SLUG/bifrost:latest
          ports:
            - containerPort: 8080
              name: http
          resources:
            requests:
              cpu: "250m"
              memory: "512Mi"
            limits:
              cpu: "1000m"
              memory: "2Gi"
          livenessProbe:
            httpGet:
              path: /health
              port: 8080
            initialDelaySeconds: 30
            periodSeconds: 10
          readinessProbe:
            httpGet:
              path: /health
              port: 8080
            initialDelaySeconds: 10
            periodSeconds: 5
          volumeMounts:
            - name: config
              mountPath: /app/data/config.json
              subPath: config.json
      volumes:
        - name: config
          secret:
            secretName: bifrost-config
---
apiVersion: v1
kind: Service
metadata:
  name: bifrost
  namespace: bifrost
spec:
  selector:
    app: bifrost
  ports:
    - port: 80
      targetPort: 8080
      protocol: TCP
  type: ClusterIP
```
## Bootstrap: Initial Secret Creation
Before the first deployment, manually trigger the CronJob or create the secret:
```bash
# Create namespace
kubectl create namespace bifrost
# Apply all configurations
kubectl apply -f configmap.yaml
kubectl apply -f serviceaccount.yaml
kubectl apply -f cronjob.yaml
# Manually trigger the CronJob
kubectl create job --from=cronjob/refresh-ar-secret initial-refresh -n bifrost
# Wait for completion
kubectl wait --for=condition=complete job/initial-refresh -n bifrost --timeout=120s
# Verify secret was created
kubectl get secret ar-pull-secret -n bifrost
```
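`kubectl wait` covers job completion, but there is no equivalent one-liner for "wait until a secret exists". A generic retry loop is one way to script that check. This is a sketch: `retry` is a hypothetical helper, and the attempt count and delay in the usage comment are arbitrary.

```bash
#!/usr/bin/env bash
# Hypothetical helper: run a command until it succeeds or attempts run out.
# Arguments: max attempts, delay in seconds between attempts, then the command.
retry() {
  local attempts="$1" delay="$2"
  shift 2
  local i
  for ((i = 1; i <= attempts; i++)); do
    if "$@"; then
      return 0
    fi
    # Sleep only if another attempt will follow.
    if (( i < attempts )); then
      sleep "$delay"
    fi
  done
  return 1
}

# Usage against a cluster (not executed here):
#   retry 12 10 kubectl get secret ar-pull-secret -n bifrost

# Harmless local demonstration.
retry 3 0 true && echo "command succeeded"
```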
## Verifying Access
### Check Workload Identity Configuration
```bash
# Verify the AKS OIDC issuer is enabled (required for Workload Identity)
az aks show \
--resource-group YOUR_RESOURCE_GROUP \
--name YOUR_CLUSTER_NAME \
--query "oidcIssuerProfile.enabled" -o tsv
# Check federated credential
az identity federated-credential show \
--name bifrost-federated-credential \
--identity-name bifrost-pull-identity \
--resource-group YOUR_RESOURCE_GROUP
```
### Verify Token Exchange
```bash
# Check CronJob ran successfully
kubectl get jobs -n bifrost
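# View logs for one refresh job. Jobs created by the CronJob get a generated
# suffix, so take JOB_NAME from the list above (for example, initial-refresh)
kubectl logs job/JOB_NAME -n bifrost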
# Verify imagePullSecret exists
kubectl get secret ar-pull-secret -n bifrost -o yaml
```
## Troubleshooting
### ImagePullBackOff Errors
1. **Check imagePullSecret exists**: `kubectl get secret ar-pull-secret -n bifrost`
2. **Verify CronJob succeeded**: `kubectl get jobs -n bifrost`
3. **Check Azure Workload Identity**: Ensure labels are set correctly
```bash
# Check pod events
kubectl describe pod -l app=bifrost -n bifrost
# Check ServiceAccount has correct annotations
kubectl get sa bifrost-sa -n bifrost -o yaml
```
### Token Exchange Failures
```bash
# Check refresh job logs for errors. Jobs created by the CronJob get a generated
# suffix, so take JOB_NAME from `kubectl get jobs -n bifrost`
kubectl logs job/JOB_NAME -n bifrost
# Common issues:
# - "audience mismatch": Check credential-config.json audience field
# - "subject mismatch": Verify federated credential subject matches SA
# - "permission denied": Contact Bifrost team to verify WIF configuration
```
### Azure Workload Identity Issues
```bash
# Verify Managed Identity exists
az identity show \
--name bifrost-pull-identity \
--resource-group YOUR_RESOURCE_GROUP
# Check federated credentials
az identity federated-credential list \
--identity-name bifrost-pull-identity \
--resource-group YOUR_RESOURCE_GROUP
# Verify pod has identity token mounted
kubectl exec -it deployment/bifrost -n bifrost -- \
ls -la /var/run/secrets/azure/tokens/
```
## Summary
| Component | Value |
|-----------|-------|
| Registry | GCP Artifact Registry |
| Authentication | Azure WIF -> GCP WIF -> GCP SA |
| Token Lifetime | 60 minutes (auto-refreshed every 30 min) |
| Secret Name | `ar-pull-secret` |
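
The 60-minute lifetime and 30-minute refresh interval leave a fixed safety margin, which can be worked out directly. The sketch below does that arithmetic; `min_remaining_validity` is a hypothetical helper, and it ignores job run time and failed runs.

```bash
#!/usr/bin/env bash
# Hypothetical helper: minutes of validity left on the stored token just before
# the next scheduled refresh, ignoring job run time and failed runs.
# Arguments: token lifetime in minutes, refresh interval in minutes.
min_remaining_validity() {
  local lifetime_min="$1" interval_min="$2"
  echo $(( lifetime_min - interval_min ))
}

min_remaining_validity 60 30
```

With these values the margin equals one refresh interval, so a single failed refresh leaves the stored token expiring at about the time the next run is due. A shorter interval widens the margin.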
## Next Steps
- Configure [Bifrost settings](/quickstart/gateway/setting-up) for your use case
- Set up [observability](/features/observability/default) for monitoring
- Enable [clustering](/enterprise/clustering) for high availability

@@ -0,0 +1,386 @@
---
title: "GCP Deployment"
description: "Deploy Bifrost Enterprise on GCP using Artifact Registry with Workload Identity"
icon: "google"
---
Bifrost Enterprise images for GCP customers are distributed through GCP Artifact Registry, enabling native Workload Identity for secure, keyless authentication.
## Architecture
```mermaid
flowchart LR
subgraph GCP[GCP Project]
subgraph GKE[GKE Cluster]
Pod[Bifrost Pod]
KSA[K8s ServiceAccount]
end
GSA[GCP Service Account]
AR[Artifact Registry<br/>Bifrost Images]
end
KSA -->|Workload Identity| GSA
Pod -->|Impersonates| GSA
GSA -->|Pull Permission| AR
AR -->|Image| Pod
```
## Prerequisites
- GKE cluster (v1.24+) with Workload Identity enabled
- `gcloud` CLI configured with appropriate permissions
- `kubectl` configured for your GKE cluster
- Your GCP project allowlisted by the Bifrost team
<Note>
Contact the Bifrost team with your GCP project ID and service account email to get access configured.
</Note>
## Workload Identity (Recommended)
Workload Identity provides the most secure authentication method for GKE deployments by eliminating the need for service account keys.
### Step 1: Enable Workload Identity on GKE
If not already enabled, enable Workload Identity on your cluster:
```bash
# For existing cluster
gcloud container clusters update YOUR_CLUSTER_NAME \
--region=YOUR_REGION \
--workload-pool=YOUR_PROJECT_ID.svc.id.goog
# Verify Workload Identity is enabled
gcloud container clusters describe YOUR_CLUSTER_NAME \
--region=YOUR_REGION \
--format="value(workloadIdentityConfig.workloadPool)"
```
### Step 2: Create GCP Service Account
Create a service account that will be used to pull images:
```bash
# Create service account
gcloud iam service-accounts create bifrost-pull-sa \
--display-name="Bifrost Image Pull SA" \
--project=YOUR_PROJECT_ID
```
### Step 3: Request Access from Bifrost Team
Provide the following to the Bifrost team:
- Your GCP project ID
- Service account email: `bifrost-pull-sa@YOUR_PROJECT_ID.iam.gserviceaccount.com`
The Bifrost team will grant the necessary permissions to pull images from the registry.
### Step 4: Create Namespace and ServiceAccount
```bash
kubectl create namespace bifrost
```
```yaml
apiVersion: v1
kind: ServiceAccount
metadata:
  name: bifrost-sa
  namespace: bifrost
  annotations:
    iam.gke.io/gcp-service-account: bifrost-pull-sa@YOUR_PROJECT_ID.iam.gserviceaccount.com
```
### Step 5: Bind Kubernetes SA to GCP SA
Allow the Kubernetes ServiceAccount to impersonate the GCP Service Account:
```bash
gcloud iam service-accounts add-iam-policy-binding \
bifrost-pull-sa@YOUR_PROJECT_ID.iam.gserviceaccount.com \
--role=roles/iam.workloadIdentityUser \
--member="serviceAccount:YOUR_PROJECT_ID.svc.id.goog[bifrost/bifrost-sa]"
```
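The `--member` value encodes the project, namespace, and Kubernetes ServiceAccount in a fixed format, and a typo in any part silently breaks the binding. The sketch below builds the string from its parts; `workload_identity_member` is a hypothetical helper and the project ID is a sample value.

```bash
#!/usr/bin/env bash
# Hypothetical helper: build the IAM member string for a GKE Workload Identity binding.
# Arguments: GCP project ID, Kubernetes namespace, Kubernetes ServiceAccount name.
workload_identity_member() {
  local project_id="$1" namespace="$2" ksa="$3"
  printf 'serviceAccount:%s.svc.id.goog[%s/%s]\n' "$project_id" "$namespace" "$ksa"
}

# Sample project ID only.
workload_identity_member "example-project" "bifrost" "bifrost-sa"
```

Quote the result when passing it to `gcloud`, since the square brackets are glob characters in most shells.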
### Step 6: Create Image Pull Secret with Token Refresh
Artifact Registry tokens expire after 60 minutes. Use a CronJob to refresh the imagePullSecret:
```yaml
apiVersion: batch/v1
kind: CronJob
metadata:
  name: refresh-ar-secret
  namespace: bifrost
spec:
  schedule: "*/30 * * * *" # Every 30 minutes
  successfulJobsHistoryLimit: 1
  failedJobsHistoryLimit: 3
  jobTemplate:
    spec:
      template:
        spec:
          serviceAccountName: bifrost-sa
          containers:
            - name: token-refresh
              # The script calls both gcloud and kubectl; confirm the image tag you use ships kubectl
              image: google/cloud-sdk:slim
              command: ["/bin/bash", "-c"]
              args:
                - |
                  set -e
                  # Get access token using Workload Identity
                  TOKEN=$(gcloud auth print-access-token)
                  # Delete existing secret if it exists
                  kubectl delete secret ar-pull-secret --ignore-not-found -n bifrost
                  # Create new imagePullSecret
                  kubectl create secret docker-registry ar-pull-secret \
                    --docker-server=REGION-docker.pkg.dev \
                    --docker-username=oauth2accesstoken \
                    --docker-password="$TOKEN" \
                    -n bifrost
                  echo "Secret refreshed at $(date)"
          restartPolicy: OnFailure
---
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: secret-manager
  namespace: bifrost
rules:
  - apiGroups: [""]
    resources: ["secrets"]
    verbs: ["get", "create", "delete"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: secret-manager-binding
  namespace: bifrost
subjects:
  - kind: ServiceAccount
    name: bifrost-sa
    namespace: bifrost
roleRef:
  kind: Role
  name: secret-manager
  apiGroup: rbac.authorization.k8s.io
```
<Warning>
Replace `REGION` with your Artifact Registry region (e.g., `us-central1`).
</Warning>
### Step 7: Deploy Bifrost
```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: bifrost
  namespace: bifrost
spec:
  replicas: 2
  selector:
    matchLabels:
      app: bifrost
  template:
    metadata:
      labels:
        app: bifrost
    spec:
      serviceAccountName: bifrost-sa
      imagePullSecrets:
        - name: ar-pull-secret
      containers:
        - name: bifrost
          image: REGION-docker.pkg.dev/BIFROST_PROJECT/YOUR_HUB_SLUG/bifrost:latest
          ports:
            - containerPort: 8080
              name: http
          resources:
            requests:
              cpu: "250m"
              memory: "512Mi"
            limits:
              cpu: "1000m"
              memory: "2Gi"
          livenessProbe:
            httpGet:
              path: /health
              port: 8080
            initialDelaySeconds: 30
            periodSeconds: 10
          readinessProbe:
            httpGet:
              path: /health
              port: 8080
            initialDelaySeconds: 10
            periodSeconds: 5
          volumeMounts:
            - name: config
              mountPath: /app/data/config.json
              subPath: config.json
      volumes:
        - name: config
          secret:
            secretName: bifrost-config
---
apiVersion: v1
kind: Service
metadata:
  name: bifrost
  namespace: bifrost
spec:
  selector:
    app: bifrost
  ports:
    - port: 80
      targetPort: 8080
      protocol: TCP
  type: ClusterIP
```
### Bootstrap: Initial Secret Creation
Before the first deployment, manually create the initial imagePullSecret:
```bash
# Authenticate gcloud
gcloud auth login
# Create initial secret
kubectl create secret docker-registry ar-pull-secret \
--docker-server=REGION-docker.pkg.dev \
--docker-username=oauth2accesstoken \
--docker-password="$(gcloud auth print-access-token)" \
-n bifrost
```
## Service Account Impersonation
For cross-project deployments or when you need to use an existing service account:
### Configure Impersonation
```bash
# Grant impersonation permission
gcloud iam service-accounts add-iam-policy-binding \
BIFROST_PROVIDED_SA@BIFROST_PROJECT.iam.gserviceaccount.com \
--role=roles/iam.serviceAccountTokenCreator \
--member="serviceAccount:bifrost-pull-sa@YOUR_PROJECT_ID.iam.gserviceaccount.com"
```
### Token Refresh with Impersonation
Update the CronJob to use impersonation:
```yaml
args:
  - |
    set -e
    # Get access token by impersonating the Bifrost SA
    TOKEN=$(gcloud auth print-access-token \
      --impersonate-service-account=BIFROST_PROVIDED_SA@BIFROST_PROJECT.iam.gserviceaccount.com)
    kubectl delete secret ar-pull-secret --ignore-not-found -n bifrost
    kubectl create secret docker-registry ar-pull-secret \
      --docker-server=REGION-docker.pkg.dev \
      --docker-username=oauth2accesstoken \
      --docker-password="$TOKEN" \
      -n bifrost
```
## Service Account Key (Legacy)
<Warning>
Service account keys are not recommended for production. Use Workload Identity instead.
</Warning>
For environments that cannot use Workload Identity:
```bash
# Create key (provided by Bifrost team)
# Store key securely
# Create imagePullSecret
kubectl create secret docker-registry ar-pull-secret \
--docker-server=REGION-docker.pkg.dev \
--docker-username=_json_key \
--docker-password="$(cat sa-key.json)" \
-n bifrost
```
## Verifying Access
### Test Artifact Registry Authentication
```bash
# Configure docker for Artifact Registry
gcloud auth configure-docker REGION-docker.pkg.dev
# Pull test (requires impersonation or direct access)
docker pull REGION-docker.pkg.dev/BIFROST_PROJECT/YOUR_HUB_SLUG/bifrost:latest
```
### Verify Workload Identity Configuration
```bash
# Check ServiceAccount annotation
kubectl get sa bifrost-sa -n bifrost -o yaml
# Verify pod can authenticate
kubectl exec -it deployment/bifrost -n bifrost -- \
gcloud auth print-access-token
# Check token refresh CronJob
kubectl get cronjob refresh-ar-secret -n bifrost
kubectl get jobs -n bifrost
```
## Troubleshooting
### ImagePullBackOff Errors
1. **Check imagePullSecret exists**: `kubectl get secret ar-pull-secret -n bifrost`
2. **Verify token is valid**: Check if CronJob ran successfully
3. **Check Workload Identity binding**: Ensure GCP SA is bound to K8s SA
```bash
# Check pod events
kubectl describe pod -l app=bifrost -n bifrost
# Manually refresh token
kubectl create job --from=cronjob/refresh-ar-secret manual-refresh -n bifrost
```
### Workload Identity Issues
```bash
# Verify Workload Identity pool
gcloud container clusters describe YOUR_CLUSTER_NAME \
--region=YOUR_REGION \
--format="value(workloadIdentityConfig.workloadPool)"
# Check IAM binding
gcloud iam service-accounts get-iam-policy \
bifrost-pull-sa@YOUR_PROJECT_ID.iam.gserviceaccount.com
```
### Token Expiration
If pods fail to pull images after 60 minutes:
1. Verify CronJob is running: `kubectl get cronjob -n bifrost`
2. Check refresh job logs: `kubectl logs job/JOB_NAME -n bifrost` (jobs created by the CronJob get a generated suffix, so take `JOB_NAME` from `kubectl get jobs -n bifrost`)
3. Manually trigger refresh: `kubectl create job --from=cronjob/refresh-ar-secret manual-refresh -n bifrost`
## Next Steps
- Configure [Bifrost settings](/quickstart/gateway/setting-up) for your use case
- Set up [observability](/features/observability/default) for monitoring
- Enable [clustering](/enterprise/clustering) for high availability

@@ -0,0 +1,541 @@
---
title: "On-Premise Deployment"
description: "Deploy Bifrost Enterprise in on-premise or air-gapped environments using Docker credentials"
icon: "server"
---
Bifrost Enterprise supports on-premise deployments for environments that cannot use cloud-native identity federation. Images are pulled from GCP Artifact Registry using username/password authentication.
## Architecture
```mermaid
flowchart LR
subgraph OnPrem[On-Premise Environment]
subgraph K8s[Kubernetes Cluster]
Pod[Bifrost Pod]
Secret[imagePullSecret]
end
Docker[Docker Daemon]
end
subgraph GCP[GCP]
AR[Artifact Registry<br/>Bifrost Images]
end
Secret -->|Credentials| Pod
Pod -->|Pull| AR
Docker -->|Pull| AR
AR -->|Image| Pod
AR -->|Image| Docker
```
## Prerequisites
- Kubernetes cluster (v1.23+) or Docker runtime
- Network access to `us-central1-docker.pkg.dev` (or your designated region)
- Docker credentials provided by the Bifrost team
<Note>
Contact the Bifrost team to receive your Docker username and password credentials.
</Note>
## Credentials
The Bifrost team will provide you with:
| Credential | Description |
|------------|-------------|
| **Username** | `_json_key` (fixed value for GCP Artifact Registry) |
| **Password** | Service account JSON key (base64 encoded or raw JSON) |
| **Registry** | `REGION-docker.pkg.dev` (e.g., `us-central1-docker.pkg.dev`) |
| **Repository** | `REGION-docker.pkg.dev/BIFROST_PROJECT/YOUR_HUB_SLUG` |
<Warning>
Store credentials securely. Never commit them to version control or expose them in logs.
</Warning>
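Because the key may arrive either as raw JSON or base64 encoded, it helps to normalize it to raw JSON before using it in the commands that follow. A minimal sketch, assuming a raw key always starts with `{` (which is not a base64 character); `normalize_key` is a hypothetical helper.

```bash
#!/usr/bin/env bash
# Hypothetical helper: print a service account key as raw JSON, whether the
# input file holds raw JSON or a base64-encoded copy of it.
normalize_key() {
  local file="$1" first_char
  # First non-whitespace character of the file.
  first_char=$(tr -d '[:space:]' < "$file" | head -c 1)
  if [ "$first_char" = "{" ]; then
    cat "$file"
  else
    base64 -d < "$file"
  fi
}

# Demo with a throwaway file holding a dummy value (not a real key).
demo_key=$(mktemp)
printf '%s' '{"type":"service_account"}' | base64 > "$demo_key"
normalize_key "$demo_key"
echo
rm -f "$demo_key"
```

Write the normalized output to a file with restrictive permissions, for example after `umask 077`, rather than echoing a real key to the terminal.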
## Docker Deployment
### Step 1: Login to Registry
```bash
# Using the JSON key file
cat bifrost-credentials.json | docker login -u _json_key --password-stdin https://REGION-docker.pkg.dev
# Or using the password directly
docker login -u _json_key -p "$(cat bifrost-credentials.json)" https://REGION-docker.pkg.dev
```
### Step 2: Pull the Image
```bash
docker pull REGION-docker.pkg.dev/BIFROST_PROJECT/YOUR_HUB_SLUG/bifrost:latest
```
### Step 3: Run Bifrost
```bash
docker run -d \
--name bifrost \
-p 8080:8080 \
-v /path/to/config.json:/app/data/config.json:ro \
-v /path/to/data:/app/data \
REGION-docker.pkg.dev/BIFROST_PROJECT/YOUR_HUB_SLUG/bifrost:latest
```
## Kubernetes Deployment
### Step 1: Create Namespace
```bash
kubectl create namespace bifrost
```
### Step 2: Create imagePullSecret
<Tabs>
<Tab title="From JSON Key File">
```bash
kubectl create secret docker-registry bifrost-pull-secret \
--docker-server=REGION-docker.pkg.dev \
--docker-username=_json_key \
--docker-password="$(cat bifrost-credentials.json)" \
--namespace=bifrost
```
</Tab>
<Tab title="From Base64 Key">
```bash
# If you received a base64-encoded key
kubectl create secret docker-registry bifrost-pull-secret \
--docker-server=REGION-docker.pkg.dev \
--docker-username=_json_key \
--docker-password="$(echo 'BASE64_ENCODED_KEY' | base64 -d)" \
--namespace=bifrost
```
</Tab>
<Tab title="Using YAML">
```yaml
apiVersion: v1
kind: Secret
metadata:
  name: bifrost-pull-secret
  namespace: bifrost
type: kubernetes.io/dockerconfigjson
data:
  .dockerconfigjson: <BASE64_ENCODED_DOCKER_CONFIG>
```
Generate the base64-encoded config:
```bash
# Let kubectl build the Docker config and print its base64-encoded value.
# --dry-run=client renders the secret locally without creating it.
# (A hand-written config would need every quote in the JSON key escaped.)
kubectl create secret docker-registry bifrost-pull-secret \
  --docker-server=REGION-docker.pkg.dev \
  --docker-username=_json_key \
  --docker-password="$(cat bifrost-credentials.json)" \
  --namespace=bifrost \
  --dry-run=client \
  -o jsonpath='{.data.\.dockerconfigjson}'
```
</Tab>
</Tabs>
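A `kubernetes.io/dockerconfigjson` secret wraps a Docker config file, in which each registry entry can carry an `auth` field: the base64 encoding of `username:password`. The sketch below computes that value; `docker_auth_value` is a hypothetical helper, and the credentials shown are dummy values.

```bash
#!/usr/bin/env bash
# Hypothetical helper: compute the "auth" value of a Docker config entry,
# i.e. base64("username:password") on a single line.
docker_auth_value() {
  local username="$1" password="$2"
  # tr strips the line wrapping that some base64 implementations add.
  printf '%s:%s' "$username" "$password" | base64 | tr -d '\n'
}

# Dummy credentials only.
docker_auth_value "user" "pass"
echo
```

For Artifact Registry the username would be `_json_key` and the password the full contents of the key file, passed as one quoted argument so that its whitespace is preserved.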
### Step 3: Create Bifrost Configuration
<Note>
If you use PostgreSQL for `config_store` or `logs_store`, ensure the target database is UTF8 encoded. See [PostgreSQL UTF8 Requirement](../../quickstart/gateway/setting-up#postgresql-utf8-requirement).
</Note>
```yaml
apiVersion: v1
kind: Secret
metadata:
  name: bifrost-config
  namespace: bifrost
type: Opaque
stringData:
  config.json: |
    {
      "config_store": {
        "enabled": true,
        "type": "postgres",
        "config": {
          "host": "postgres.bifrost.svc.cluster.local",
          "port": "5432",
          "user": "bifrost",
          "password": "YOUR_PASSWORD",
          "db_name": "bifrost",
          "ssl_mode": "disable"
        }
      },
      "logs_store": {
        "enabled": true,
        "type": "postgres",
        "config": {
          "host": "postgres.bifrost.svc.cluster.local",
          "port": "5432",
          "user": "bifrost",
          "password": "YOUR_PASSWORD",
          "db_name": "bifrost",
          "ssl_mode": "disable"
        }
      }
    }
```
### Step 4: Deploy Bifrost
```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: bifrost
  namespace: bifrost
spec:
  replicas: 2
  selector:
    matchLabels:
      app: bifrost
  template:
    metadata:
      labels:
        app: bifrost
    spec:
      imagePullSecrets:
        - name: bifrost-pull-secret
      containers:
        - name: bifrost
          image: REGION-docker.pkg.dev/BIFROST_PROJECT/YOUR_HUB_SLUG/bifrost:latest
          ports:
            - containerPort: 8080
              name: http
          resources:
            requests:
              cpu: "250m"
              memory: "512Mi"
            limits:
              cpu: "1000m"
              memory: "2Gi"
          livenessProbe:
            httpGet:
              path: /health
              port: 8080
            initialDelaySeconds: 30
            periodSeconds: 10
          readinessProbe:
            httpGet:
              path: /health
              port: 8080
            initialDelaySeconds: 10
            periodSeconds: 5
          volumeMounts:
            - name: config
              mountPath: /app/data/config.json
              subPath: config.json
            - name: data
              mountPath: /app/data
      volumes:
        - name: config
          secret:
            secretName: bifrost-config
        - name: data
          persistentVolumeClaim:
            claimName: bifrost-data
---
apiVersion: v1
kind: Service
metadata:
  name: bifrost
  namespace: bifrost
spec:
  selector:
    app: bifrost
  ports:
    - port: 80
      targetPort: 8080
      protocol: TCP
  type: ClusterIP
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: bifrost-data
  namespace: bifrost
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 10Gi
```
### Step 5: Expose Bifrost (Optional)
<Tabs>
<Tab title="Ingress">
```yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: bifrost
  namespace: bifrost
  annotations:
    nginx.ingress.kubernetes.io/proxy-body-size: "50m"
spec:
  ingressClassName: nginx
  rules:
    - host: bifrost.your-domain.com
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: bifrost
                port:
                  number: 80
  tls:
    - hosts:
        - bifrost.your-domain.com
      secretName: bifrost-tls
```
</Tab>
<Tab title="LoadBalancer">
```yaml
apiVersion: v1
kind: Service
metadata:
  name: bifrost-lb
  namespace: bifrost
spec:
  selector:
    app: bifrost
  ports:
    - port: 80
      targetPort: 8080
      protocol: TCP
  type: LoadBalancer
```
</Tab>
<Tab title="NodePort">
```yaml
apiVersion: v1
kind: Service
metadata:
  name: bifrost-nodeport
  namespace: bifrost
spec:
  selector:
    app: bifrost
  ports:
    - port: 80
      targetPort: 8080
      nodePort: 30080
      protocol: TCP
  type: NodePort
```
</Tab>
</Tabs>
## Docker Compose Deployment
For simpler deployments without Kubernetes:
```yaml
version: '3.8'
services:
  bifrost:
    image: REGION-docker.pkg.dev/BIFROST_PROJECT/YOUR_HUB_SLUG/bifrost:latest
    container_name: bifrost
    ports:
      - "8080:8080"
    volumes:
      - ./config.json:/app/data/config.json:ro
      - bifrost-data:/app/data
    environment:
      - BIFROST_LOG_LEVEL=info
    healthcheck:
      test: ["CMD", "wget", "--no-verbose", "--tries=1", "--spider", "http://localhost:8080/health"]
      interval: 30s
      timeout: 10s
      retries: 3
      start_period: 40s
    restart: unless-stopped
  postgres:
    image: postgres:15-alpine
    container_name: bifrost-postgres
    environment:
      - POSTGRES_USER=bifrost
      - POSTGRES_PASSWORD=YOUR_PASSWORD
      - POSTGRES_DB=bifrost
    volumes:
      - postgres-data:/var/lib/postgresql/data
    healthcheck:
      test: ["CMD-SHELL", "pg_isready -U bifrost"]
      interval: 10s
      timeout: 5s
      retries: 5
    restart: unless-stopped
volumes:
  bifrost-data:
  postgres-data:
```
Log in to the registry before running:
```bash
cat bifrost-credentials.json | docker login -u _json_key --password-stdin https://REGION-docker.pkg.dev
docker compose up -d
```
## Air-Gapped Environments
For environments without internet access, you can mirror the image to your internal registry.
### Step 1: Pull Image (Internet-Connected Machine)
```bash
# Login and pull
cat bifrost-credentials.json | docker login -u _json_key --password-stdin https://REGION-docker.pkg.dev
docker pull REGION-docker.pkg.dev/BIFROST_PROJECT/YOUR_HUB_SLUG/bifrost:latest
# Save to tar file
docker save REGION-docker.pkg.dev/BIFROST_PROJECT/YOUR_HUB_SLUG/bifrost:latest > bifrost-image.tar
```
### Step 2: Transfer and Load (Air-Gapped Machine)
```bash
# Load image
docker load < bifrost-image.tar
# Tag for internal registry
docker tag REGION-docker.pkg.dev/BIFROST_PROJECT/YOUR_HUB_SLUG/bifrost:latest \
internal-registry.company.com/bifrost:latest
# Push to internal registry
docker push internal-registry.company.com/bifrost:latest
```
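The retagging step keeps the final path component (`bifrost:latest`) and swaps only the registry prefix. A minimal sketch of that mapping follows; the helper name `internal_image_ref` is hypothetical, the hosts are the placeholders used above, and the script only prints the resulting commands rather than running them:

```bash
# Hypothetical helper: derive the internal-registry reference for a source image.
# Keeps only the last path component (name:tag) and prefixes the internal host.
internal_image_ref() {
  src_ref=$1        # e.g. REGION-docker.pkg.dev/BIFROST_PROJECT/YOUR_HUB_SLUG/bifrost:latest
  internal_host=$2  # e.g. internal-registry.company.com
  name_and_tag=${src_ref##*/}  # strip everything up to the last "/"
  printf '%s/%s\n' "$internal_host" "$name_and_tag"
}

SRC="REGION-docker.pkg.dev/BIFROST_PROJECT/YOUR_HUB_SLUG/bifrost:latest"
DST="$(internal_image_ref "$SRC" "internal-registry.company.com")"

# Print the commands for review instead of executing them.
echo "docker tag $SRC $DST"
echo "docker push $DST"
```

Deriving the target from the source reference keeps the tag identical on both sides, which matters if you later pin a specific version instead of `latest`.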
### Step 3: Update Kubernetes Manifests
Update the image reference in your deployment:
```yaml
containers:
  - name: bifrost
    image: internal-registry.company.com/bifrost:latest
```
## Credential Rotation
When the Bifrost team rotates your credentials, update every place the old key is stored:
### Update Docker Login
```bash
cat new-credentials.json | docker login -u _json_key --password-stdin https://REGION-docker.pkg.dev
```
### Update Kubernetes Secret
```bash
# Delete old secret
kubectl delete secret bifrost-pull-secret -n bifrost
# Create new secret
kubectl create secret docker-registry bifrost-pull-secret \
--docker-server=REGION-docker.pkg.dev \
--docker-username=_json_key \
--docker-password="$(cat new-credentials.json)" \
--namespace=bifrost
# Restart deployment to pick up new secret
kubectl rollout restart deployment/bifrost -n bifrost
```
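Deleting the secret before recreating it leaves a short window in which the pull secret does not exist. One possible alternative, sketched here under the assumption that the secret name and namespace stay the same, is to render the new secret client-side and apply it over the existing one. The function name `rotate_pull_secret` is hypothetical; the `kubectl` flags used are standard.

```bash
# Hypothetical helper: replace the pull secret in place (no delete step).
# Assumes kubectl is already configured for the target cluster.
rotate_pull_secret() {
  creds_file=$1  # path to the new JSON key, e.g. new-credentials.json
  kubectl create secret docker-registry bifrost-pull-secret \
    --docker-server=REGION-docker.pkg.dev \
    --docker-username=_json_key \
    --docker-password="$(cat "$creds_file")" \
    --namespace=bifrost \
    --dry-run=client -o yaml |
    kubectl apply -f -
}

# Usage:
#   rotate_pull_secret new-credentials.json
#   kubectl rollout restart deployment/bifrost -n bifrost
```

Because the secret was originally made with `kubectl create`, the first `kubectl apply` may print a warning about a missing last-applied-configuration annotation; the update still goes through.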
## Verifying Access
### Test Docker Authentication
```bash
# Verify login
cat bifrost-credentials.json | docker login -u _json_key --password-stdin https://REGION-docker.pkg.dev
# Test pull
docker pull REGION-docker.pkg.dev/BIFROST_PROJECT/YOUR_HUB_SLUG/bifrost:latest
```
### Verify Kubernetes Secret
```bash
# Check secret exists
kubectl get secret bifrost-pull-secret -n bifrost
# Verify secret content (base64 encoded)
kubectl get secret bifrost-pull-secret -n bifrost -o jsonpath='{.data.\.dockerconfigjson}' | base64 -d
```
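The decoded `.dockerconfigjson` contains one entry per registry with `username`, `password`, and an `auth` field, where `auth` is the base64 encoding of `username:password`. To check that the stored secret matches the credentials you expect, you can compute that value locally and compare it. A small sketch, with a hypothetical helper name and dummy credentials:

```bash
# Hypothetical helper: compute the "auth" value Docker stores for a registry,
# i.e. base64("username:password"), to compare against the decoded secret.
docker_auth_field() {
  user=$1
  pass=$2
  # tr removes the line wrapping some base64 implementations add for long input.
  printf '%s:%s' "$user" "$pass" | base64 | tr -d '\n'
}

# Dummy credentials for illustration.
docker_auth_field "user" "pass"; echo   # prints dXNlcjpwYXNz
```

For the real key, call `docker_auth_field _json_key "$(cat bifrost-credentials.json)"` and compare the result with the `auth` field printed by the command above.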
## Troubleshooting
### ImagePullBackOff Errors
```bash
# Check pod events
kubectl describe pod -l app=bifrost -n bifrost
# Common issues:
# - "unauthorized": Invalid credentials - check username/password
# - "not found": Wrong repository path - verify with Bifrost team
# - "connection refused": Network issue - check firewall rules
```
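When triaging many pods, the three common causes above can be matched mechanically against the event message. The following is only a sketch: the helper name is hypothetical and it matches just the substrings listed in the comments above, so anything else falls through to a generic hint.

```bash
# Hypothetical helper: map an image-pull error message to the likely cause
# from the list above. Only the three substrings named there are matched.
pull_error_hint() {
  case $1 in
    *unauthorized*)         echo "Invalid credentials: check username/password" ;;
    *"not found"*)          echo "Wrong repository path: verify with Bifrost team" ;;
    *"connection refused"*) echo "Network issue: check firewall rules" ;;
    *)                      echo "Unrecognized error: inspect the full pod events" ;;
  esac
}

pull_error_hint "Failed to pull image: unauthorized: authentication failed"
```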
### Network Connectivity
```bash
# Test DNS resolution
nslookup REGION-docker.pkg.dev
# Test HTTPS connectivity
curl -v https://REGION-docker.pkg.dev/v2/
# Required outbound access:
# - REGION-docker.pkg.dev:443
# - oauth2.googleapis.com:443 (for token refresh)
```
### Credential Issues
```bash
# Verify JSON key format
cat bifrost-credentials.json | jq .
# Print the key ID (the file carries no expiry date; ask the Bifrost team whether this key is still active)
cat bifrost-credentials.json | jq '.private_key_id'
# Contact Bifrost team if credentials are invalid
```
## Security Best Practices
1. **Store credentials securely**: Use a secrets manager (Vault, AWS Secrets Manager) for credential storage
2. **Limit access**: Only grant imagePullSecret access to required namespaces
3. **Rotate regularly**: Request credential rotation from Bifrost team periodically
4. **Audit access**: Monitor image pull logs for unauthorized access attempts
5. **Network isolation**: Restrict outbound access to only required registry endpoints
## Next Steps
- Configure [Bifrost settings](/quickstart/gateway/setting-up) for your use case
- Set up [observability](/features/observability/default) for monitoring
- Enable [clustering](/enterprise/clustering) for high availability


@@ -0,0 +1,141 @@
---
title: "Overview"
description: "Deploy Bifrost Enterprise in your cloud environment with secure, private container image distribution"
icon: "info-circle"
---
Bifrost Enterprise provides private container image distribution through dedicated registries, enabling secure deployments in AWS, GCP, Azure, and on-premise environments.
## Architecture
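Bifrost uses a hub-and-spoke model: a central CI/CD pipeline pushes images to two container registries (GCP Artifact Registry and AWS ECR), and each customer environment pulls from the registry best suited to its cloud platform: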
```mermaid
flowchart TB
subgraph BifrostInfra[Bifrost Infrastructure]
CICD[CI/CD Pipeline]
GCR[GCP Artifact Registry]
ECR[AWS ECR]
end
subgraph Customers[Customer Environments]
subgraph AWSCustomer[AWS Customers]
EKS[EKS Cluster]
ECS[ECS Service]
end
subgraph GCPCustomer[GCP Customers]
GKE[GKE Cluster]
end
subgraph AzureCustomer[Azure Customers]
AKS[AKS Cluster]
end
subgraph OnPrem[On-Premise]
K8S[Kubernetes]
Docker[Docker]
end
end
CICD -->|Push| GCR
CICD -->|Push| ECR
ECR -->|IRSA| EKS
ECR -->|Task Role| ECS
GCR -->|Workload Identity| GKE
GCR -->|Azure WIF| AKS
GCR -->|Basic Auth| OnPrem
```
### Registry Distribution
| Customer Cloud | Registry Source | Why |
|----------------|-----------------|-----|
| AWS | AWS ECR | Native IAM integration, lowest latency within AWS |
| GCP | GCP Artifact Registry | Native Workload Identity, lowest latency within GCP |
| Azure | GCP Artifact Registry | Workload Identity Federation from Azure to GCP |
| On-Premise | GCP Artifact Registry | Basic auth with username/password credentials |
## Authentication Methods
Choose the authentication method based on your deployment environment:
| Environment | Method | Security Level | Setup Complexity |
|-------------|--------|----------------|------------------|
| AWS EKS | [IRSA](/deployment-guides/enterprise/aws#irsa-recommended) | High | Medium |
| AWS ECS | [IAM Task Roles](/deployment-guides/enterprise/aws#ecs-task-roles) | High | Low |
| GCP GKE | [Workload Identity](/deployment-guides/enterprise/gcp#workload-identity-recommended) | High | Low |
| Azure AKS | [Azure WIF](/deployment-guides/enterprise/azure) | High | Medium |
| On-Premise | [Basic Auth](/deployment-guides/enterprise/on-premise) | Medium | Low |
<Note>
Cloud-native identity federation (IRSA, Workload Identity, Azure WIF) is recommended over static credentials for production deployments.
</Note>
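For deployment scripts that target several environments, the two tables above can be collapsed into a single lookup. This is an illustrative sketch only: the function name and the environment keys (`aws-eks`, `gcp-gke`, and so on) are made up for the example and are not identifiers used by Bifrost.

```bash
# Hypothetical lookup mirroring the two tables above:
# prints "<registry source> / <authentication method>" for an environment.
bifrost_registry_for() {
  case $1 in
    aws-eks)    echo "AWS ECR / IRSA" ;;
    aws-ecs)    echo "AWS ECR / IAM Task Roles" ;;
    gcp-gke)    echo "GCP Artifact Registry / Workload Identity" ;;
    azure-aks)  echo "GCP Artifact Registry / Azure WIF" ;;
    on-premise) echo "GCP Artifact Registry / Basic Auth" ;;
    *)          echo "unknown environment: $1" >&2; return 1 ;;
  esac
}

bifrost_registry_for azure-aks
```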
## Security Features
### Encryption
- **In-Transit**: All registry communication uses TLS 1.3
- **At-Rest**: Images encrypted using cloud-native encryption (AWS KMS, GCP CMEK)
### Access Control
- **IAM-based**: Fine-grained permissions using cloud IAM policies
- **Audit Logging**: All image pull operations are logged for compliance
- **IP Restrictions**: Optional VPC Service Controls (GCP) or VPC endpoints (AWS)
### Image Security
- **Vulnerability Scanning**: Automatic scanning on push
- **Immutable Tags**: Optional tag immutability to prevent overwrites
- **Signed Images**: Container image signatures for verification
## Prerequisites
Before deploying Bifrost Enterprise, ensure you have:
<Tabs>
<Tab title="AWS">
- AWS account with ECR access
- EKS cluster (v1.23+) or ECS cluster
- IAM permissions to create roles and policies
- `kubectl` and `aws` CLI configured
</Tab>
<Tab title="GCP">
- GCP project with Artifact Registry API enabled
- GKE cluster (v1.24+) with Workload Identity enabled
- IAM permissions for service account management
- `kubectl` and `gcloud` CLI configured
</Tab>
<Tab title="Azure">
- Azure subscription with AKS
- AKS cluster (v1.24+) with Workload Identity enabled
- Permissions to create Managed Identities
- `kubectl` and `az` CLI configured
</Tab>
<Tab title="On-Premise">
- Kubernetes cluster (v1.23+) or Docker runtime
- Network access to `us-central1-docker.pkg.dev`
- Docker credentials provided by Bifrost team
</Tab>
</Tabs>
## Getting Started
<CardGroup cols={2}>
<Card title="AWS Deployment" icon="aws" href="/deployment-guides/enterprise/aws">
Deploy on EKS or ECS with IRSA authentication
</Card>
<Card title="GCP Deployment" icon="google" href="/deployment-guides/enterprise/gcp">
Deploy on GKE with Workload Identity
</Card>
<Card title="Azure Deployment" icon="microsoft" href="/deployment-guides/enterprise/azure">
Deploy on AKS with Azure Workload Identity Federation
</Card>
<Card title="On-Premise" icon="server" href="/deployment-guides/enterprise/on-premise">
Deploy anywhere with Docker credentials
</Card>
</CardGroup>
## Support
For enterprise deployment assistance:
- **Email**: [contact@getmaxim.ai](mailto:contact@getmaxim.ai)
- **Slack**: Connect via Slack Connect for real-time support
- **Documentation**: Platform-specific guides linked above


@@ -0,0 +1,34 @@
---
title: fly.io
description: "This guide explains how to deploy Bifrost on fly.io"
icon: "fly"
---
Because Bifrost is built from multiple sub-modules (`core`, `framework`, etc.) and embeds the front-end into a single binary (via Go's `embed.FS`), a custom Docker build step runs before the deployment is handed over to `flyctl`.
There are two ways to deploy Bifrost on Fly.io:
1. By cloning the repo
2. Using flyctl + Docker Hub image
## By cloning the repo
1. Clone https://github.com/maximhq/bifrost
2. Ensure [Make](/deployment-guides/how-to/install-make) is installed.
3. Run `make deploy-to-fly-io APP_NAME=<your-fly-app-name>`
## Using flyctl + Docker Hub image
1. Update your `fly.toml` to specify the Bifrost Docker Hub image.
```toml
[build]
image = "maximhq/bifrost:latest"
```
2. Alternatively, pass the Docker Hub image path on the command line:
```bash
fly deploy --app <your-app-name> --image docker.io/maximhq/bifrost:latest
```


@@ -0,0 +1,639 @@
---
title: "Quick Start"
description: "Deploy Bifrost on Kubernetes using the official Helm chart — quickstart for OSS and Enterprise"
icon: "server"
---
<Note>
**Latest Chart Version**: [View on Artifact Hub](https://artifacthub.io/packages/helm/bifrost/bifrost)
</Note>
<Tabs>
<Tab title="OSS">
## Prerequisites
- Kubernetes cluster (v1.19+)
- `kubectl` configured
- Helm 3.2.0+ installed
- Persistent Volume provisioner (required for SQLite; optional for Postgres-only)
<Note>
If you use PostgreSQL for Bifrost storage, ensure the database is UTF8 encoded. See [PostgreSQL UTF8 Requirement](../quickstart/gateway/setting-up#postgresql-utf8-requirement).
</Note>
## Step 1 — Add the Helm Repository
```bash
helm repo add bifrost https://maximhq.github.io/bifrost/helm-charts
helm repo update
```
## Step 2 — Install
<Note>
The Helm chart ships ready-made values files under `helm-charts/bifrost/values-examples/`.
For example: `sqlite-only.yaml`, `production-ha.yaml`, `external-postgres.yaml`, and `secrets-from-k8s.yaml`.
See the full list here: https://github.com/maximhq/bifrost/tree/main/helm-charts/bifrost/values-examples
</Note>
<Tabs>
<Tab title="Minimal (SQLite)">
Fastest way to get running. Bifrost deploys as a StatefulSet with a 10Gi PVC for SQLite.
```bash
kubectl create secret generic bifrost-encryption-key \
--from-literal=encryption-key="$(openssl rand -base64 32)"
helm install bifrost bifrost/bifrost \
--set image.tag=v1.4.11 \
--set bifrost.encryptionKeySecret.name="bifrost-encryption-key" \
--set bifrost.encryptionKeySecret.key="encryption-key"
```
</Tab>
<Tab title="With a Provider Key">
Add your first provider key at install time:
```bash
kubectl create secret generic bifrost-encryption-key \
--from-literal=encryption-key="$(openssl rand -base64 32)"
kubectl create secret generic provider-keys \
--from-literal=openai-api-key='sk-your-key'
helm install bifrost bifrost/bifrost \
--set image.tag=v1.4.11 \
--set bifrost.encryptionKeySecret.name="bifrost-encryption-key" \
--set bifrost.encryptionKeySecret.key="encryption-key" \
--set 'bifrost.providers.openai.keys[0].name=primary' \
--set 'bifrost.providers.openai.keys[0].value=env.OPENAI_API_KEY' \
--set 'bifrost.providers.openai.keys[0].weight=1' \
--set bifrost.providerSecrets.openai.existingSecret="provider-keys" \
--set bifrost.providerSecrets.openai.key="openai-api-key" \
--set bifrost.providerSecrets.openai.envVar="OPENAI_API_KEY"
```
</Tab>
<Tab title="Production (PostgreSQL + HA)">
High-availability setup — 3 replicas, PostgreSQL, autoscaling, ingress.
```bash
# 1. Create secrets
kubectl create secret generic bifrost-encryption-key \
--from-literal=encryption-key="$(openssl rand -base64 32)"
kubectl create secret generic postgres-credentials \
--from-literal=password="$(openssl rand -base64 32)"
kubectl create secret generic provider-keys \
--from-literal=openai-api-key='sk-...'
```
```yaml
# production.yaml
image:
  tag: "v1.4.11"
replicaCount: 3
storage:
  mode: postgres
postgresql:
  enabled: true
  auth:
    username: bifrost
    database: bifrost
    existingSecret: "postgres-credentials"
    secretKeys:
      adminPasswordKey: "password"
  primary:
    persistence:
      size: 50Gi
    resources:
      requests:
        cpu: 500m
        memory: 1Gi
      limits:
        cpu: 2000m
        memory: 2Gi
autoscaling:
  enabled: true
  minReplicas: 3
  maxReplicas: 10
  targetCPUUtilizationPercentage: 70
  targetMemoryUtilizationPercentage: 80
ingress:
  enabled: true
  className: nginx
  annotations:
    cert-manager.io/cluster-issuer: letsencrypt-prod
  hosts:
    - host: bifrost.yourdomain.com
      paths:
        - path: /
          pathType: Prefix
  tls:
    - secretName: bifrost-tls
      hosts:
        - bifrost.yourdomain.com
resources:
  requests:
    cpu: 500m
    memory: 1Gi
  limits:
    cpu: 2000m
    memory: 2Gi
bifrost:
  encryptionKeySecret:
    name: "bifrost-encryption-key"
    key: "encryption-key"
  client:
    initialPoolSize: 500
    dropExcessRequests: true
    enableLogging: true
  providers:
    openai:
      keys:
        - name: "openai-primary"
          value: "env.OPENAI_API_KEY"
          weight: 1
  providerSecrets:
    openai:
      existingSecret: "provider-keys"
      key: "openai-api-key"
      envVar: "OPENAI_API_KEY"
  plugins:
    telemetry:
      enabled: true
      version: 1
    logging:
      enabled: true
      version: 1
    governance:
      enabled: true
      version: 1
```
```bash
# 2. Install
helm install bifrost bifrost/bifrost -f production.yaml
```
</Tab>
</Tabs>
<Note>
`image.tag` is required — the chart will not start without it. Check [Docker Hub](https://hub.docker.com/r/maximhq/bifrost/tags) for available versions.
</Note>
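Because a missing `image.tag` is easy to overlook in scripted installs, a small pre-flight guard can fail fast before Helm is invoked. A minimal sketch; the function name and the `IMAGE_TAG` variable are illustrative and not part of the chart:

```bash
# Hypothetical pre-flight check: refuse to continue without an image tag.
require_image_tag() {
  if [ -z "${1:-}" ]; then
    echo "image.tag is required, for example: --set image.tag=v1.4.11" >&2
    return 1
  fi
}

# Usage (IMAGE_TAG is an assumed variable name):
#   require_image_tag "$IMAGE_TAG" &&
#     helm install bifrost bifrost/bifrost --set image.tag="$IMAGE_TAG"
```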
## Step 3 — Verify
```bash
# Check pods are running
kubectl get pods -l app.kubernetes.io/name=bifrost
# Port forward (blocks, so run it in a separate terminal), then hit the health endpoint
kubectl port-forward svc/bifrost 8080:8080
curl http://localhost:8080/health
# Check Prometheus metrics
curl http://localhost:8080/metrics
```
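Pods may need a few seconds before `/health` responds, so a single `curl` can fail on a fresh install. A polling loop is one way to handle that; the sketch below assumes the port-forward above is running, and the helper name and its default retry values are arbitrary choices for illustration.

```bash
# Hypothetical helper: poll the health endpoint until it answers or retries run out.
wait_for_health() {
  url=$1
  retries=${2:-30}  # number of attempts
  delay=${3:-2}     # seconds between attempts
  attempt=1
  while [ "$attempt" -le "$retries" ]; do
    # -f: treat HTTP errors as failures; -s: silent; -o /dev/null: discard the body
    if curl -fs -o /dev/null "$url"; then
      echo "healthy after $attempt attempt(s)"
      return 0
    fi
    attempt=$((attempt + 1))
    sleep "$delay"
  done
  echo "not healthy after $retries attempts" >&2
  return 1
}

# Usage:
#   wait_for_health http://localhost:8080/health
```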
## Step 4 — Configure Providers & Plugins
```bash
# Make your first inference call
curl http://localhost:8080/v1/chat/completions \
-H "Content-Type: application/json" \
-d '{
"model": "gpt-4o-mini",
"messages": [{"role": "user", "content": "Hello from Bifrost!"}]
}'
```
Continue with [Next Steps](#next-steps).
</Tab>
<Tab title="Enterprise">
Enterprise customers receive dedicated container images in a private registry, along with additional features, SLAs, and compliance documentation.
<Note>
[Book a demo](https://calendly.com/maximai/bifrost-demo) to learn more about our enterprise features.
</Note>
## Prerequisites
- Kubernetes cluster (v1.19+)
- `kubectl` configured
- Helm 3.2.0+ installed
- Enterprise registry credentials (provided by Maxim)
## Step 1 — Add the Helm Repository
```bash
helm repo add bifrost https://maximhq.github.io/bifrost/helm-charts
helm repo update
```
## Step 2 — Create Pull Secret
Create a Kubernetes image pull secret for our private enterprise registry:
<Tabs>
<Tab title="Google Artifact Registry">
```bash
kubectl create secret docker-registry enterprise-registry-secret \
--docker-server=us-west1-docker.pkg.dev \
--docker-username=_json_key \
--docker-password="$(cat service-account-key.json)" \
--docker-email=your-email@example.com
```
</Tab>
<Tab title="AWS ECR">
```bash
kubectl create secret docker-registry enterprise-registry-secret \
--docker-server=123456789.dkr.ecr.us-east-1.amazonaws.com \
--docker-username=AWS \
--docker-password="$(aws ecr get-login-password --region us-east-1)"
```
<Note>
ECR tokens expire after 12 hours. Use the [ECR Credential Helper](https://github.com/awslabs/amazon-ecr-credential-helper) or [ECR Registry Creds operator](https://github.com/upmc-enterprises/registry-creds) for automatic refresh.
</Note>
</Tab>
<Tab title="Azure ACR">
```bash
kubectl create secret docker-registry enterprise-registry-secret \
--docker-server=yourregistry.azurecr.io \
--docker-username=<service-principal-id> \
--docker-password=<service-principal-password>
```
</Tab>
<Tab title="Self-Hosted Registry">
```bash
kubectl create secret docker-registry enterprise-registry-secret \
--docker-server=registry.yourcompany.com \
--docker-username=<username> \
--docker-password=<password>
```
</Tab>
</Tabs>
## Step 3 — Create Required Secrets
```bash
# Encryption key
kubectl create secret generic bifrost-encryption \
--from-literal=key="$(openssl rand -base64 32)"
# Provider API keys
kubectl create secret generic provider-keys \
--from-literal=openai-api-key='sk-...' \
--from-literal=anthropic-api-key='sk-ant-...'
# Admin credentials (for dashboard + governance)
kubectl create secret generic bifrost-admin-credentials \
--from-literal=username='admin' \
--from-literal=password='secure-admin-password'
```
## Step 4 — Install
```yaml
# enterprise.yaml
image:
  # Registry URL provided by Maxim
  repository: us-west1-docker.pkg.dev/bifrost-enterprise/your-org/bifrost
  tag: "latest"
imagePullSecrets:
  - name: enterprise-registry-secret
replicaCount: 3
resources:
  requests:
    cpu: 1000m
    memory: 2Gi
  limits:
    cpu: 4000m
    memory: 8Gi
autoscaling:
  enabled: true
  minReplicas: 3
  maxReplicas: 20
  targetCPUUtilizationPercentage: 70
  targetMemoryUtilizationPercentage: 80
storage:
  mode: postgres
postgresql:
  enabled: true
  auth:
    password: "secure-password" # use existingSecret in production
  primary:
    persistence:
      size: 100Gi
    resources:
      requests:
        cpu: 1000m
        memory: 2Gi
      limits:
        cpu: 4000m
        memory: 8Gi
vectorStore:
  enabled: true
  type: weaviate
  weaviate:
    enabled: true
    persistence:
      size: 100Gi
ingress:
  enabled: true
  className: nginx
  annotations:
    cert-manager.io/cluster-issuer: letsencrypt-prod
    nginx.ingress.kubernetes.io/proxy-body-size: "100m"
  hosts:
    - host: bifrost.yourcompany.com
      paths:
        - path: /
          pathType: Prefix
  tls:
    - secretName: bifrost-tls
      hosts:
        - bifrost.yourcompany.com
bifrost:
  encryptionKeySecret:
    name: "bifrost-encryption"
    key: "key"
  client:
    initialPoolSize: 1000
    dropExcessRequests: true
    enableLogging: true
    disableContentLogging: false # set true for HIPAA/compliance
    logRetentionDays: 365
    enforceGovernanceHeader: true
    allowDirectKeys: false
    maxRequestBodySizeMb: 100
    allowedOrigins:
      - "https://yourcompany.com"
      - "https://*.yourcompany.com"
  providers:
    openai:
      keys:
        - name: "openai-primary"
          value: "env.OPENAI_API_KEY"
          weight: 1
    anthropic:
      keys:
        - name: "anthropic-primary"
          value: "env.ANTHROPIC_API_KEY"
          weight: 1
  providerSecrets:
    openai:
      existingSecret: "provider-keys"
      key: "openai-api-key"
      envVar: "OPENAI_API_KEY"
    anthropic:
      existingSecret: "provider-keys"
      key: "anthropic-api-key"
      envVar: "ANTHROPIC_API_KEY"
  governance:
    authConfig:
      isEnabled: true
      disableAuthOnInference: false
      existingSecret: "bifrost-admin-credentials"
      usernameKey: "username"
      passwordKey: "password"
  plugins:
    telemetry:
      enabled: true
      version: 1
    logging:
      enabled: true
      version: 1
    governance:
      enabled: true
      version: 1
      config:
        is_vk_mandatory: true
    semanticCache:
      enabled: true
      version: 1
      config:
        provider: "openai"
        embedding_model: "text-embedding-3-small"
        dimension: 1536
        threshold: 0.85
        ttl: "1h"
affinity:
  podAntiAffinity:
    requiredDuringSchedulingIgnoredDuringExecution:
      - labelSelector:
          matchLabels:
            app.kubernetes.io/name: bifrost
        topologyKey: kubernetes.io/hostname
```
```bash
helm install bifrost bifrost/bifrost -f enterprise.yaml
```
Continue with [Next Steps](#next-steps).
<Note>
For DB-backed deployments, built-in plugins support a top-level `version` field (for example: `telemetry`, `logging`, `governance`, `semanticCache`, `otel`, `maxim`, `datadog`). Increase this number when you want config from Helm to overwrite an older plugin record in the DB.
</Note>
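The `version` field acts as a simple precedence rule between the Helm values and the stored plugin record. The sketch below models that rule under one explicit assumption: the Helm config wins only when its version is strictly greater than the stored one. The function name is hypothetical and this is not Bifrost's actual implementation.

```bash
# Hypothetical model of the overwrite rule. Assumption: the Helm-supplied
# config replaces the DB record only when its version is strictly greater.
should_overwrite_plugin() {
  db_version=$1
  helm_version=$2
  [ "$helm_version" -gt "$db_version" ]
}

if should_overwrite_plugin 1 2; then
  echo "Helm config (version 2) replaces DB record (version 1)"
fi
```

Under this reading, re-running `helm upgrade` with an unchanged `version` leaves any edits made through the UI or API in place.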
## Enterprise Support
Enterprise customers have access to:
- Dedicated Slack channel for support
- Priority bug fixes and feature requests
- Custom feature development
- SLA guarantees
- Compliance documentation (SOC2, HIPAA, etc.)
Contact [support@getmaxim.ai](mailto:support@getmaxim.ai) for support.
</Tab>
</Tabs>
---
## Operations
### Upgrade
```bash
helm repo update
# Upgrade reusing all existing values
helm upgrade bifrost bifrost/bifrost --reuse-values
# Upgrade with new values
helm upgrade bifrost bifrost/bifrost -f your-values.yaml
# Upgrade and override a single field
helm upgrade bifrost bifrost/bifrost \
--reuse-values \
--set image.tag=v1.4.11
```
### Rollback
```bash
helm history bifrost
helm rollback bifrost # to previous revision
helm rollback bifrost 2 # to specific revision
```
### Scale
```bash
kubectl scale deployment bifrost --replicas=5
# Or via Helm
helm upgrade bifrost bifrost/bifrost \
--reuse-values \
--set replicaCount=5
```
### Uninstall
```bash
helm uninstall bifrost
# Also remove PVCs (permanently deletes all data)
kubectl delete pvc -l app.kubernetes.io/instance=bifrost
```
---
## Monitoring
### Prometheus Metrics
Bifrost exposes Prometheus metrics at `/metrics`.
Enable ServiceMonitor for automatic scraping:
```yaml
serviceMonitor:
  enabled: true
  interval: 30s
  scrapeTimeout: 10s
```
### Health Checks
Check pod health:
```bash
# View pod status
kubectl get pods -l app.kubernetes.io/name=bifrost
# Check logs
kubectl logs -l app.kubernetes.io/name=bifrost --tail=100
# Describe pod
kubectl describe pod -l app.kubernetes.io/name=bifrost
```
### Metrics Endpoints
```bash
# Port forward (blocks, so run it in a separate terminal)
kubectl port-forward svc/bifrost 8080:8080
# Check metrics
curl http://localhost:8080/metrics
# Check health
curl http://localhost:8080/health
```
---
## Configuration Guides
<CardGroup cols={3}>
<Card title="Values Reference" icon="sliders" href="/deployment-guides/helm/values">
All parameters, secret references, advanced config, example patterns
</Card>
<Card title="Client Configuration" icon="gear" href="/deployment-guides/helm/client">
Pool size, logging, CORS, header filtering, compat shims, MCP settings
</Card>
<Card title="Provider Setup" icon="plug" href="/deployment-guides/helm/providers">
OpenAI, Anthropic, Azure, Bedrock, Vertex, Groq, self-hosted
</Card>
<Card title="Storage" icon="database" href="/deployment-guides/helm/storage">
SQLite, PostgreSQL, object storage for logs, vector stores
</Card>
<Card title="Plugins" icon="puzzle-piece" href="/deployment-guides/helm/plugins">
Telemetry, logging, semantic cache, OTel, Datadog, governance
</Card>
<Card title="Governance" icon="shield" href="/deployment-guides/helm/governance">
Budgets, rate limits, virtual keys, routing rules
</Card>
<Card title="Cluster Mode" icon="network-wired" href="/deployment-guides/helm/cluster">
Multi-replica HA, gossip, peer discovery
</Card>
<Card title="Troubleshooting" icon="wrench" href="/deployment-guides/helm/troubleshooting">
Pod startup, database, ingress, PVC, secrets, performance
</Card>
</CardGroup>
---
## Resources
- [Helm Chart Repository](https://github.com/maximhq/bifrost/tree/main/helm-charts)
- [Artifact Hub](https://artifacthub.io/packages/helm/bifrost/bifrost)
- [Example Configurations](https://github.com/maximhq/bifrost/tree/main/helm-charts/bifrost/values-examples)
- [GitHub Issues](https://github.com/maximhq/bifrost/issues)
## Next Steps
1. Configure [provider keys](/providers/supported-providers/overview)
2. Enable [plugins](/plugins/getting-started)
3. Set up [observability](/features/observability/default)
4. Configure [governance](/features/governance/virtual-keys)


@@ -0,0 +1,316 @@
---
title: "Client Configuration"
description: "Configure the Bifrost client: connection pool, logging, CORS, header filtering, compat shims, and MCP settings"
icon: "gear"
---
The `bifrost.client` block controls how Bifrost manages its internal worker pool, request logging, authentication enforcement, header policies, SDK compatibility shims, and MCP agent behaviour. All settings map directly to the `client` section of the rendered `config.json`.
---
## Connection Pool
| Parameter | Description | Default |
|-----------|-------------|---------|
| `bifrost.client.initialPoolSize` | Pre-allocated worker goroutines per provider queue | `300` |
| `bifrost.client.dropExcessRequests` | Drop requests when queue is full instead of waiting | `false` |
A larger pool reduces latency spikes under burst load at the cost of higher baseline memory. For production workloads with multiple providers, `1000` is a common starting point.
```yaml
# client-pool.yaml
image:
  tag: "v1.4.11"
bifrost:
  client:
    initialPoolSize: 1000
    dropExcessRequests: true # Return 429 instead of queuing indefinitely
```
```bash
helm install bifrost bifrost/bifrost -f client-pool.yaml
# Or set inline
helm upgrade bifrost bifrost/bifrost \
--reuse-values \
--set bifrost.client.initialPoolSize=1000 \
--set bifrost.client.dropExcessRequests=true
```
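Since `initialPoolSize` is allocated per provider queue, the baseline worker count grows with the number of configured providers. A rough sizing sketch, assuming one queue per provider (the helper name is illustrative):

```bash
# Rough sizing sketch: baseline workers = pool size x number of providers.
# Assumes exactly one queue per configured provider.
baseline_workers() {
  pool_size=$1
  provider_count=$2
  echo $((pool_size * provider_count))
}

baseline_workers 1000 3   # prints 3000
```

This only estimates how many workers are pre-allocated; actual memory use per worker depends on the deployment and is not covered here.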
---
## Request & Response Logging
| Parameter | Description | Default |
|-----------|-------------|---------|
| `bifrost.client.enableLogging` | Log all LLM requests and responses | `true` |
| `bifrost.client.disableContentLogging` | Strip message content from logs (keeps metadata) | `false` |
| `bifrost.client.logRetentionDays` | Days to retain log entries in the store | `365` |
| `bifrost.client.loggingHeaders` | HTTP request headers to capture in log metadata | `[]` |
Set `disableContentLogging: true` for HIPAA / PCI compliance workloads where message content must not be persisted.
```yaml
bifrost:
  client:
    enableLogging: true
    disableContentLogging: true # PII / compliance: store metadata only
    logRetentionDays: 90
    loggingHeaders:
      - "x-request-id"
      - "x-user-id"
```
```bash
helm upgrade bifrost bifrost/bifrost \
--reuse-values \
--set bifrost.client.disableContentLogging=true \
--set bifrost.client.logRetentionDays=90
```
---
## Security & CORS
| Parameter | Description | Default |
|-----------|-------------|---------|
| `bifrost.client.allowedOrigins` | CORS allowed origins | `["*"]` |
| `bifrost.client.allowDirectKeys` | Allow callers to pass provider keys directly in requests | `false` |
| `bifrost.client.enforceGovernanceHeader` | Require `x-bf-vk` virtual-key header on every request | `false` |
| `bifrost.client.maxRequestBodySizeMb` | Maximum allowed request body size | `100` |
| `bifrost.client.whitelistedRoutes` | Routes that bypass auth middleware | `[]` |
```yaml
bifrost:
  client:
    allowedOrigins:
      - "https://app.yourdomain.com"
      - "https://admin.yourdomain.com"
    allowDirectKeys: false # Prevent callers from supplying raw provider keys
    enforceGovernanceHeader: true # Every request must carry a virtual key
    maxRequestBodySizeMb: 50
    whitelistedRoutes:
      - "/health"
      - "/metrics"
```
```bash
helm install bifrost bifrost/bifrost \
--set image.tag=v1.4.11 \
--set bifrost.client.enforceGovernanceHeader=true \
--set bifrost.client.allowDirectKeys=false
```
---
## Header Filtering
Controls which `x-bf-eh-*` headers are forwarded to upstream LLM providers.
| Parameter | Description | Default |
|-----------|-------------|---------|
| `bifrost.client.headerFilterConfig.allowlist` | Only these headers are forwarded (whitelist mode) | `[]` |
| `bifrost.client.headerFilterConfig.denylist` | These headers are always blocked | `[]` |
| `bifrost.client.requiredHeaders` | Headers that must be present on every request | `[]` |
| `bifrost.client.allowedHeaders` | Additional headers permitted for CORS and WebSocket | `[]` |
When both lists are empty, all `x-bf-eh-*` headers pass through. Specifying an `allowlist` enables strict whitelist mode — only listed headers are forwarded.
```yaml
bifrost:
  client:
    headerFilterConfig:
      allowlist:
        - "x-bf-eh-anthropic-version"
        - "x-bf-eh-openai-beta"
      denylist: []
    requiredHeaders:
      - "x-request-id"
```
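The interaction between the two lists can be modelled as a small decision function. This is a sketch of the rules as described in this section, not Bifrost's code: the function name is hypothetical, lists are passed as space-separated strings, and matching is assumed to be exact on lowercase header names.

```bash
# Hypothetical model of the filtering rules described in this section.
# Lists are space-separated header names; matching is exact (an assumption).
is_header_forwarded() {
  header=$1
  allowlist=$2
  denylist=$3

  # Only x-bf-eh-* headers are candidates for forwarding.
  case $header in
    x-bf-eh-*) ;;
    *) return 1 ;;
  esac

  # Denylisted headers are always blocked.
  case " $denylist " in
    *" $header "*) return 1 ;;
  esac

  # Empty allowlist: every remaining x-bf-eh-* header passes through.
  if [ -z "$allowlist" ]; then
    return 0
  fi

  # Non-empty allowlist: only listed headers pass.
  case " $allowlist " in
    *" $header "*) return 0 ;;
    *) return 1 ;;
  esac
}

if is_header_forwarded "x-bf-eh-openai-beta" "x-bf-eh-anthropic-version x-bf-eh-openai-beta" ""; then
  echo "forwarded"
fi
```

With the allowlist from the example above, `x-bf-eh-openai-beta` is forwarded while any other `x-bf-eh-*` header is dropped.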
---
## Authentication
| Parameter | Description | Default |
|-----------|-------------|---------|
| `bifrost.authConfig.isEnabled` | Enable username/password auth for the API and dashboard | `false` |
| `bifrost.authConfig.adminUsername` | Admin username (plain text, prefer secret) | `""` |
| `bifrost.authConfig.adminPassword` | Admin password (plain text, prefer secret) | `""` |
| `bifrost.authConfig.existingSecret` | Kubernetes Secret name for credentials | `""` |
| `bifrost.authConfig.usernameKey` | Key within the secret for username | `"username"` |
| `bifrost.authConfig.passwordKey` | Key within the secret for password | `"password"` |
| `bifrost.authConfig.disableAuthOnInference` | Skip auth check on `/v1/*` inference routes | `false` |
```bash
# Create secret first
kubectl create secret generic bifrost-admin \
--from-literal=username='admin' \
--from-literal=password='your-secure-password'
```
```yaml
# auth-values.yaml
bifrost:
  authConfig:
    isEnabled: true
    disableAuthOnInference: false
    existingSecret: "bifrost-admin"
    usernameKey: "username"
    passwordKey: "password"
```
```bash
helm upgrade bifrost bifrost/bifrost \
--reuse-values \
-f auth-values.yaml
```
---
## Encryption
| Parameter | Description | Default |
|-----------|-------------|---------|
| `bifrost.encryptionKey` | Optional encryption key (plain text — use `encryptionKeySecret` in production). If omitted, data is stored in plaintext. | `""` |
| `bifrost.encryptionKeySecret.name` | Kubernetes Secret name containing the key | `""` |
| `bifrost.encryptionKeySecret.key` | Key within the secret | `"encryption-key"` |
Always use a Kubernetes Secret in production:
```bash
kubectl create secret generic bifrost-encryption \
--from-literal=encryption-key='your-32-byte-encryption-key-here'
```
```yaml
# encryption-values.yaml
bifrost:
  encryptionKeySecret:
    name: "bifrost-encryption"
    key: "encryption-key"
```
```bash
helm install bifrost bifrost/bifrost \
--set image.tag=v1.4.11 \
-f encryption-values.yaml
```
---
## Async Jobs & Database Pings
| Parameter | Description | Default |
|-----------|-------------|---------|
| `bifrost.client.disableDbPingsInHealth` | Exclude DB connectivity from `/health` checks | `false` |
| `bifrost.client.asyncJobResultTTL` | TTL (seconds) for async job results | `3600` |
---
## Compat Shims
Compatibility flags that let Bifrost silently adapt request/response shapes for SDK integrations:
| Parameter | Description | Default |
|-----------|-------------|---------|
| `bifrost.client.compat.convertTextToChat` | Wrap legacy text completions as chat messages | `false` |
| `bifrost.client.compat.convertChatToResponses` | Translate chat completions to Responses API format | `false` |
| `bifrost.client.compat.shouldDropParams` | Silently drop unsupported parameters instead of erroring | `false` |
| `bifrost.client.compat.shouldConvertParams` | Auto-convert parameter names across provider schemas | `false` |
```yaml
bifrost:
  client:
    compat:
      shouldDropParams: true # Useful when proxying mixed SDK traffic
      convertTextToChat: true # For clients using the legacy /v1/completions endpoint
```
---
## Prometheus Labels
Add custom labels to every Prometheus metric emitted by Bifrost:
```yaml
bifrost:
  client:
    prometheusLabels:
      - name: "environment"
        value: "production"
      - name: "region"
        value: "us-east-1"
```
---
## MCP Agent Settings
| Parameter | Description | Default |
|-----------|-------------|---------|
| `bifrost.client.mcpAgentDepth` | Maximum tool-call recursion depth for MCP agent mode | `10` |
| `bifrost.client.mcpToolExecutionTimeout` | Timeout per tool execution in seconds | `30` |
| `bifrost.client.mcpCodeModeBindingLevel` | Code mode binding level (`server` or `tool`) | `""` |
| `bifrost.client.mcpToolSyncInterval` | Global tool sync interval in minutes (`0` = disabled) | `0` |
```yaml
bifrost:
  client:
    mcpAgentDepth: 15
    mcpToolExecutionTimeout: 60
```
---
## Full Example
```yaml
# client-full.yaml
image:
  tag: "v1.4.11"
bifrost:
  encryptionKeySecret:
    name: "bifrost-encryption"
    key: "encryption-key"
  authConfig:
    isEnabled: true
    disableAuthOnInference: false
    existingSecret: "bifrost-admin"
    usernameKey: "username"
    passwordKey: "password"
  client:
    initialPoolSize: 1000
    dropExcessRequests: true
    allowedOrigins:
      - "https://app.yourdomain.com"
    enableLogging: true
    disableContentLogging: false
    logRetentionDays: 90
    enforceGovernanceHeader: true
    allowDirectKeys: false
    maxRequestBodySizeMb: 100
    headerFilterConfig:
      allowlist: []
      denylist: []
    prometheusLabels:
      - name: "environment"
        value: "production"
    mcpAgentDepth: 10
    mcpToolExecutionTimeout: 30
```
```bash
# Create prerequisites
kubectl create secret generic bifrost-encryption \
--from-literal=encryption-key='your-32-byte-encryption-key-here'
kubectl create secret generic bifrost-admin \
--from-literal=username='admin' \
--from-literal=password='your-secure-password'
# Install
helm install bifrost bifrost/bifrost -f client-full.yaml
```

@@ -0,0 +1,523 @@
---
title: "Cluster Mode & HA"
description: "Run Bifrost in a multi-replica cluster with gossip-based peer discovery, distributed state sync, and high-availability configuration"
icon: "network-wired"
---
Cluster mode enables multiple Bifrost replicas to share state — rate limits, budget counters, and governance data — across pods. When `bifrost.cluster.enabled` is `false` (the default), each replica operates independently and state is only shared via the database.
<Note>
Cluster mode requires **PostgreSQL** as the storage backend. SQLite is single-node only.
</Note>
<Warning>
`bifrost.cluster.*` is an enterprise capability. OSS images accept these values but do not run cluster mode at runtime.
</Warning>
## When to Use Cluster Mode
| Scenario | Recommendation |
|----------|---------------|
| Single replica | Not needed |
| Multiple replicas, shared DB only | Optional — DB provides eventual consistency |
| Multiple replicas with strict per-minute rate limiting | **Enable cluster mode** — in-memory counters are synced via gossip |
| Geographic multi-region | Enable cluster mode with DNS or Consul discovery |
---
## Basic Cluster Setup
```yaml
# cluster-values.yaml
image:
  tag: "v1.4.11"
replicaCount: 3
storage:
  mode: postgres
postgresql:
  external:
    enabled: true
    host: "your-postgres-host.example.com"
    port: 5432
    user: bifrost
    database: bifrost
    sslMode: require
    existingSecret: "postgres-credentials"
    passwordKey: "password"
bifrost:
  encryptionKeySecret:
    name: "bifrost-encryption"
    key: "encryption-key"
  cluster:
    enabled: true
    gossip:
      port: 7946
      config:
        timeoutSeconds: 10
        successThreshold: 3
        failureThreshold: 3
# Spread replicas across nodes for true HA
affinity:
  podAntiAffinity:
    requiredDuringSchedulingIgnoredDuringExecution:
      - labelSelector:
          matchLabels:
            app.kubernetes.io/name: bifrost
        topologyKey: kubernetes.io/hostname
# Conservative scale-down: avoid killing pods mid-stream
autoscaling:
  enabled: true
  minReplicas: 3
  maxReplicas: 10
  targetCPUUtilizationPercentage: 70
  behavior:
    scaleDown:
      stabilizationWindowSeconds: 300
      policies:
        - type: Pods
          value: 1
          periodSeconds: 120
# Give in-flight SSE streams time to drain
terminationGracePeriodSeconds: 90
lifecycle:
  preStop:
    exec:
      command: ["sh", "-c", "sleep 20"]
```
```bash
kubectl create secret generic postgres-credentials \
--from-literal=password='your-postgres-password'
kubectl create secret generic bifrost-encryption \
--from-literal=encryption-key='your-32-byte-encryption-key'
helm install bifrost bifrost/bifrost -f cluster-values.yaml
```
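As a rough check on the drain and scale-down settings above, the arithmetic can be sketched in a few lines. It relies on two Kubernetes behaviours: the termination grace period also covers the time spent in the `preStop` hook, and a `Pods` scale-down policy caps removals per period. The variable names are illustrative only.

```bash
# Values copied from cluster-values.yaml
grace=90      # terminationGracePeriodSeconds
prestop=20    # preStop "sleep 20"

# The grace period countdown includes the preStop hook,
# so this is what remains for in-flight streams after SIGTERM.
drain_budget=$((grace - prestop))
echo "seconds left to drain after preStop: $drain_budget"

max_replicas=10
min_replicas=3
period=120    # scaleDown policy: at most 1 pod per 120 s

# Removing N pods one per period spans at least (N - 1) periods
# between the first and the last removal.
pods_to_remove=$((max_replicas - min_replicas))
min_span=$(( (pods_to_remove - 1) * period ))
echo "minimum seconds between first and last scale-down step: $min_span"
```

If typical streaming responses run longer than the drain budget, raise `terminationGracePeriodSeconds` rather than shortening the `preStop` sleep.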
---
## Peer Discovery
Bifrost uses a gossip protocol (memberlist) for peer-to-peer state sync. Configure how peers find each other:
<Note>
For `consul`, `etcd`, and `udp` discovery, set `bifrost.cluster.discovery.serviceName` so nodes register/discover under a stable service identity.
</Note>
<Tabs>
<Tab title="Kubernetes (Recommended)">
Bifrost queries the Kubernetes API to find other Bifrost pods by label selector. No static peer list needed — works with HPA.
```yaml
bifrost:
  cluster:
    enabled: true
    discovery:
      enabled: true
      type: kubernetes
      k8sNamespace: "default"  # namespace where Bifrost runs
      k8sLabelSelector: "app.kubernetes.io/name=bifrost"
    gossip:
      port: 7946
```
The service account needs permission to list pods:
```yaml
serviceAccount:
  create: true
  annotations: {}
```
```bash
# Create a Role and RoleBinding for pod discovery (apply once)
kubectl apply -f - <<'EOF'
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: bifrost-pod-discovery
  namespace: default
rules:
  - apiGroups: [""]
    resources: ["pods"]
    verbs: ["list", "get", "watch"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: bifrost-pod-discovery
  namespace: default
subjects:
  - kind: ServiceAccount
    name: bifrost
    namespace: default
roleRef:
  kind: Role
  name: bifrost-pod-discovery
  apiGroup: rbac.authorization.k8s.io
EOF
```
```bash
helm install bifrost bifrost/bifrost -f cluster-k8s-discovery-values.yaml
```
</Tab>
<Tab title="DNS">
Uses a headless service DNS name to resolve peer IPs. Works well with StatefulSets (predictable pod DNS names).
```yaml
bifrost:
  cluster:
    enabled: true
    discovery:
      enabled: true
      type: dns
      dnsNames:
        - "bifrost-headless.default.svc.cluster.local"
    gossip:
      port: 7946
```
The chart automatically creates a headless service (`bifrost-headless`) when cluster mode is enabled with a StatefulSet. For Deployments, create it manually:
```bash
kubectl apply -f - <<'EOF'
apiVersion: v1
kind: Service
metadata:
  name: bifrost-headless
spec:
  clusterIP: None
  selector:
    app.kubernetes.io/name: bifrost
  ports:
    - name: gossip
      port: 7946
      protocol: TCP
EOF
```
```bash
helm install bifrost bifrost/bifrost -f cluster-dns-discovery-values.yaml
```
</Tab>
<Tab title="Static Peers">
Enumerate peer addresses explicitly. Use when discovery mechanisms are unavailable or you want deterministic membership.
```yaml
bifrost:
  cluster:
    enabled: true
    peers:
      - "bifrost-0.bifrost-headless.default.svc.cluster.local:7946"
      - "bifrost-1.bifrost-headless.default.svc.cluster.local:7946"
      - "bifrost-2.bifrost-headless.default.svc.cluster.local:7946"
    gossip:
      port: 7946
```
<Note>
Static peers require StatefulSet pod names to be stable. This approach doesn't adapt to HPA-driven scaling — use Kubernetes or DNS discovery for dynamic replica counts.
</Note>
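For larger replica counts the peer list can be generated rather than typed. The sketch below assumes the same naming as the example above (StatefulSet `bifrost`, headless service `bifrost-headless`, namespace `default`, gossip port 7946); the `peer_list` helper is hypothetical and not part of the chart.

```bash
# Print one YAML list entry per StatefulSet replica.
# Assumed names: StatefulSet "bifrost", service "bifrost-headless", namespace "default".
peer_list() {
  replicas="$1"
  i=0
  while [ "$i" -lt "$replicas" ]; do
    printf '      - "bifrost-%d.bifrost-headless.default.svc.cluster.local:7946"\n' "$i"
    i=$((i + 1))
  done
}

# Paste the output under bifrost.cluster.peers in your values file.
peer_list 3
```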
</Tab>
<Tab title="Consul">
```yaml
bifrost:
  cluster:
    enabled: true
    discovery:
      enabled: true
      type: consul
      serviceName: "bifrost-cluster"
      consulAddress: "consul.consul.svc.cluster.local:8500"
    gossip:
      port: 7946
```
```bash
helm install bifrost bifrost/bifrost -f cluster-consul-discovery-values.yaml
```
</Tab>
<Tab title="etcd">
```yaml
bifrost:
  cluster:
    enabled: true
    discovery:
      enabled: true
      type: etcd
      serviceName: "bifrost-cluster"
      etcdEndpoints:
        - "http://etcd-0.etcd.default.svc.cluster.local:2379"
        - "http://etcd-1.etcd.default.svc.cluster.local:2379"
        - "http://etcd-2.etcd.default.svc.cluster.local:2379"
    gossip:
      port: 7946
```
</Tab>
<Tab title="mDNS">
Best for local development or bare-metal clusters where multicast is available.
```yaml
bifrost:
  cluster:
    enabled: true
    discovery:
      enabled: true
      type: mdns
      mdnsService: "_bifrost._tcp"
    gossip:
      port: 7946
```
</Tab>
</Tabs>
---
## Allowed Address Space
Restrict gossip to a specific subnet (useful in multi-tenant clusters):
```yaml
bifrost:
  cluster:
    discovery:
      enabled: true
      type: kubernetes
      k8sNamespace: "default"
      k8sLabelSelector: "app.kubernetes.io/name=bifrost"
      allowedAddressSpace:
        - "10.0.0.0/8"
        - "172.16.0.0/12"
```
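To see which pod IPs the two CIDR ranges above would admit, a small membership check can help. This is only a sketch of standard CIDR matching; how Bifrost applies the filter internally is not shown in this document, and the `ip_to_int` and `in_cidr` helpers are hypothetical.

```bash
# Convert a dotted IPv4 address to a 32-bit integer.
ip_to_int() {
  old_ifs="$IFS"
  IFS=.
  set -- $1          # split the address on dots into $1..$4
  IFS="$old_ifs"
  echo $(( ($1 << 24) + ($2 << 16) + ($3 << 8) + $4 ))
}

# Succeeds when the IP ($1) lies inside the CIDR block ($2).
in_cidr() {
  ip_int=$(ip_to_int "$1")
  net_int=$(ip_to_int "${2%/*}")
  bits="${2#*/}"
  mask=$(( (0xFFFFFFFF << (32 - bits)) & 0xFFFFFFFF ))
  [ $(( ip_int & mask )) -eq $(( net_int & mask )) ]
}

for ip in 10.42.3.7 172.20.1.5 192.168.1.10; do
  if in_cidr "$ip" 10.0.0.0/8 || in_cidr "$ip" 172.16.0.0/12; then
    echo "$ip is inside allowedAddressSpace"
  else
    echo "$ip is outside allowedAddressSpace"
  fi
done
```

Run it against your pod CIDR before enabling the restriction, so that legitimate peers are not filtered out.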
---
## Region-Aware Routing
Tag replicas with a region identifier for latency-aware routing:
```yaml
bifrost:
  cluster:
    enabled: true
    region: "us-east-1"
```
---
## Full HA Production Example
```yaml
# ha-production-values.yaml
image:
  tag: "v1.4.11"
replicaCount: 3
resources:
  requests:
    cpu: 1000m
    memory: 1Gi
  limits:
    cpu: 4000m
    memory: 4Gi
autoscaling:
  enabled: true
  minReplicas: 3
  maxReplicas: 15
  targetCPUUtilizationPercentage: 70
  targetMemoryUtilizationPercentage: 75
  behavior:
    scaleDown:
      stabilizationWindowSeconds: 300
      policies:
        - type: Pods
          value: 1
          periodSeconds: 120
    scaleUp:
      stabilizationWindowSeconds: 30
terminationGracePeriodSeconds: 90
lifecycle:
  preStop:
    exec:
      command: ["sh", "-c", "sleep 20"]
ingress:
  enabled: true
  className: nginx
  annotations:
    cert-manager.io/cluster-issuer: letsencrypt-prod
    nginx.ingress.kubernetes.io/proxy-body-size: "100m"
    nginx.ingress.kubernetes.io/proxy-read-timeout: "300"
  hosts:
    - host: bifrost.yourdomain.com
      paths:
        - path: /
          pathType: Prefix
  tls:
    - secretName: bifrost-tls
      hosts:
        - bifrost.yourdomain.com
storage:
  mode: postgres
postgresql:
  external:
    enabled: true
    host: "rds.us-east-1.amazonaws.com"
    port: 5432
    user: bifrost
    database: bifrost
    sslMode: require
    existingSecret: "postgres-credentials"
    passwordKey: "password"
bifrost:
  encryptionKeySecret:
    name: "bifrost-encryption"
    key: "encryption-key"
  client:
    initialPoolSize: 1000
    dropExcessRequests: true
    enableLogging: true
    enforceGovernanceHeader: true
  cluster:
    enabled: true
    region: "us-east-1"
    discovery:
      enabled: true
      type: kubernetes
      k8sNamespace: "default"
      k8sLabelSelector: "app.kubernetes.io/name=bifrost"
    gossip:
      port: 7946
      config:
        timeoutSeconds: 10
        successThreshold: 3
        failureThreshold: 3
  plugins:
    telemetry:
      enabled: true
      config:
        push_gateway:
          enabled: true
          push_gateway_url: "http://prometheus-pushgateway.monitoring.svc.cluster.local:9091"
          push_interval: 15
    logging:
      enabled: true
    governance:
      enabled: true
      config:
        is_vk_mandatory: true
affinity:
  podAntiAffinity:
    requiredDuringSchedulingIgnoredDuringExecution:
      - labelSelector:
          matchLabels:
            app.kubernetes.io/name: bifrost
        topologyKey: kubernetes.io/hostname
serviceAccount:
  create: true
  annotations: {}
```
```bash
# Prerequisites
kubectl create secret generic postgres-credentials \
--from-literal=password='your-secure-postgres-password'
kubectl create secret generic bifrost-encryption \
--from-literal=encryption-key='your-32-byte-encryption-key'
# RBAC for Kubernetes pod discovery
kubectl apply -f - <<'EOF'
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: bifrost-pod-discovery
  namespace: default
rules:
  - apiGroups: [""]
    resources: ["pods"]
    verbs: ["list", "get", "watch"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: bifrost-pod-discovery
  namespace: default
subjects:
  - kind: ServiceAccount
    name: bifrost
    namespace: default
roleRef:
  kind: Role
  name: bifrost-pod-discovery
  apiGroup: rbac.authorization.k8s.io
EOF
# Install
helm install bifrost bifrost/bifrost -f ha-production-values.yaml
# Verify all peers have found each other (check logs)
kubectl logs -l app.kubernetes.io/name=bifrost --tail=50 | grep -i gossip
```
---
## Verifying Cluster Health
```bash
# Check all pods are running
kubectl get pods -l app.kubernetes.io/name=bifrost
# Check gossip port is reachable between pods (pod names assume a StatefulSet; substitute your own)
kubectl exec -it bifrost-0 -- nc -zv bifrost-1.bifrost-headless 7946
# Check health endpoint
kubectl port-forward svc/bifrost 8080:8080 &
curl http://localhost:8080/health
# View HPA status
kubectl get hpa bifrost
# Scale manually during maintenance (an active HPA will readjust the replica count)
kubectl scale deployment bifrost --replicas=5
```

@@ -0,0 +1,446 @@
---
title: "Governance"
description: "Configure Bifrost governance in Helm — budgets, rate limits, virtual keys, routing rules, and admin authentication"
icon: "shield"
---
Governance lets you control who can call which providers, how much they can spend, how fast they can go, and how traffic is routed. Everything is declared under `bifrost.governance` in your values file and seeded into the database at startup.
<Note>
The governance **plugin** must also be enabled for enforcement to take effect:
```yaml
bifrost:
  plugins:
    governance:
      enabled: true
```
See the [Plugins](/deployment-guides/helm/plugins) page for plugin configuration details.
</Note>
---
## Admin Authentication
Protect the Bifrost dashboard and management API with username/password auth.
```bash
kubectl create secret generic bifrost-admin-credentials \
--from-literal=username='admin' \
--from-literal=password='your-secure-admin-password'
```
```yaml
bifrost:
  governance:
    authConfig:
      isEnabled: true
      disableAuthOnInference: false  # keep auth on inference routes
      existingSecret: "bifrost-admin-credentials"
      usernameKey: "username"
      passwordKey: "password"
```
```bash
helm upgrade bifrost bifrost/bifrost --reuse-values -f governance-auth-values.yaml
```
---
## Budgets
Spending caps that reset on a configurable period. Budgets are referenced by ID from virtual keys, teams, customers, or providers.
| Reset duration | Syntax |
|----------------|--------|
| 30 seconds | `"30s"` |
| 5 minutes | `"5m"` |
| 1 hour | `"1h"` |
| 1 day | `"1d"` |
| 1 week | `"1w"` |
| 1 month | `"1M"` |
| 1 year | `"1Y"` |
```yaml
bifrost:
  governance:
    budgets:
      - id: "budget-dev"
        max_limit: 50            # $50 per month
        reset_duration: "1M"
      - id: "budget-production"
        max_limit: 500           # $500 per month
        reset_duration: "1M"
      - id: "budget-testing"
        max_limit: 10            # $10 per day
        reset_duration: "1d"
      - id: "budget-enterprise"
        max_limit: 5000          # $5000 per month
        reset_duration: "1M"
```
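Because `m` (minutes) and `M` (months) differ only by case, it can be worth linting `reset_duration` values before deploying. The check below is a sketch: the pattern is inferred from the table above, and Bifrost may accept forms that it rejects. The `is_reset_duration` helper is hypothetical.

```bash
# Succeeds when the argument is <number><unit>, with unit one of s, m, h, d, w, M, Y.
# Assumption: this mirrors the syntax table above; it is not Bifrost's own parser.
is_reset_duration() {
  printf '%s\n' "$1" | grep -Eq '^[0-9]+[smhdwMY]$'
}

for d in 30s 5m 1h 1d 1w 1M 1Y "1 month" 1mo; do
  if is_reset_duration "$d"; then
    echo "ok:  $d"
  else
    echo "bad: $d"
  fi
done
```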
---
## Rate Limits
Token and request-count caps per time window. Referenced by ID from virtual keys, teams, customers, or providers.
```yaml
bifrost:
  governance:
    rateLimits:
      - id: "rate-limit-standard"
        token_max_limit: 100000        # 100K tokens per hour
        token_reset_duration: "1h"
        request_max_limit: 1000        # 1000 requests per hour
        request_reset_duration: "1h"
      - id: "rate-limit-high"
        token_max_limit: 500000        # 500K tokens per hour
        token_reset_duration: "1h"
        request_max_limit: 5000
        request_reset_duration: "1h"
      - id: "rate-limit-burst"
        token_max_limit: 50000         # 50K tokens per minute (burst)
        token_reset_duration: "1m"
        request_max_limit: 500
        request_reset_duration: "1m"
      - id: "rate-limit-testing"
        token_max_limit: 10000
        token_reset_duration: "1h"
        request_max_limit: 100
        request_reset_duration: "1h"
```
---
## Customers & Teams
Optional organizational hierarchy. Virtual keys can be assigned to customers or teams, inheriting their budgets and rate limits.
```yaml
bifrost:
  governance:
    customers:
      - id: "customer-acme"
        name: "Acme Corp"
        budget_id: "budget-production"
        rate_limit_id: "rate-limit-high"
      - id: "customer-startup"
        name: "Startup Inc"
        budget_id: "budget-dev"
        rate_limit_id: "rate-limit-standard"
    teams:
      - id: "team-platform"
        name: "Platform Team"
        customer_id: "customer-acme"
        budget_id: "budget-enterprise"
        rate_limit_id: "rate-limit-high"
      - id: "team-ml"
        name: "ML Team"
        customer_id: "customer-acme"
        budget_id: "budget-production"
        rate_limit_id: "rate-limit-standard"
```
---
## Virtual Keys
Virtual keys are the primary access tokens issued to callers. They scope which providers, models, and underlying API keys are accessible.
```yaml
bifrost:
  governance:
    virtualKeys:
      # 1. Unrestricted dev key: access to every provider
      - id: "vk-dev-all"
        name: "Dev: all providers"
        value: "vk-dev-all-secret-token"
        is_active: true
        budget_id: "budget-dev"
        rate_limit_id: "rate-limit-standard"
        # No provider_configs → all providers allowed
      # 2. OpenAI only, restricted to two models
      - id: "vk-openai-prod"
        name: "OpenAI Production"
        value: "vk-openai-prod-secret-token"
        is_active: true
        budget_id: "budget-production"
        rate_limit_id: "rate-limit-high"
        provider_configs:
          - provider: "openai"
            weight: 1
            allowed_models: ["gpt-4o", "gpt-4o-mini"]
      # 3. Multi-provider with weighted routing
      - id: "vk-multi"
        name: "Multi-provider weighted"
        value: "vk-multi-secret-token"
        is_active: true
        budget_id: "budget-production"
        rate_limit_id: "rate-limit-high"
        provider_configs:
          - provider: "openai"
            weight: 2   # 50%
            allowed_models: ["*"]
          - provider: "anthropic"
            weight: 1   # 25%
            allowed_models: ["*"]
          - provider: "groq"
            weight: 1   # 25%
            allowed_models: ["*"]
      # 4. Team-scoped key
      - id: "vk-platform-team"
        name: "Platform Team Key"
        value: "vk-platform-team-token"
        is_active: true
        team_id: "team-platform"   # inherits team budget/rate-limit
        provider_configs:
          - provider: "openai"
            weight: 1
            allowed_models: ["*"]
            key_ids: ["openai-primary"]   # pin to specific configured key by name
      # 5. Restricted testing key
      - id: "vk-testing"
        name: "Testing (gpt-4o-mini only)"
        value: "vk-testing-token"
        is_active: true
        budget_id: "budget-testing"
        rate_limit_id: "rate-limit-testing"
        provider_configs:
          - provider: "openai"
            weight: 1
            allowed_models: ["gpt-4o-mini"]
      # 6. Batch API key
      - id: "vk-batch"
        name: "Batch API workloads"
        value: "vk-batch-token"
        is_active: true
        budget_id: "budget-production"
        rate_limit_id: "rate-limit-burst"
        provider_configs:
          - provider: "openai"
            weight: 1
            allowed_models: ["*"]
            key_ids: ["openai-batch"]   # only the batch-flagged key
```
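The percentage comments on `vk-multi` follow from dividing each weight by the sum of all weights. The sketch below reproduces that arithmetic; it assumes weights are normalised this way, as the comments in the example imply, and the `weight_shares` helper is hypothetical.

```bash
# Print each provider's share of traffic from "name=weight" arguments.
weight_shares() {
  for pair in "$@"; do
    echo "$pair"
  done | awk -F= '
    { name[NR] = $1; w[NR] = $2; total += $2 }
    END {
      for (i = 1; i <= NR; i++)
        printf "%s %.0f%%\n", name[i], 100 * w[i] / total
    }'
}

# Weights from the vk-multi example
weight_shares openai=2 anthropic=1 groq=1
```

Changing one weight shifts every share, so recompute the percentages whenever a provider is added or removed.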
Helm values accept both `provider_configs[].key_ids` and `provider_configs[].keys`. Prefer `key_ids` for parity with `config.json`; its entries are the names of configured provider keys.
**Use a virtual key in API calls:**
```bash
curl http://localhost:8080/v1/chat/completions \
-H "x-bf-vk: vk-openai-prod-secret-token" \
-H "Content-Type: application/json" \
-d '{"model":"gpt-4o","messages":[{"role":"user","content":"Hello"}]}'
```
---
## Model Configs
Apply budgets and rate limits at the model level, independent of virtual keys:
```yaml
bifrost:
  governance:
    modelConfigs:
      - id: "model-gpt4o"
        model_name: "gpt-4o"
        provider: "openai"
        budget_id: "budget-production"
        rate_limit_id: "rate-limit-high"
      - id: "model-claude"
        model_name: "claude-3-5-sonnet-20241022"
        provider: "anthropic"
        rate_limit_id: "rate-limit-standard"
```
---
## Provider Governance
Apply budgets and rate limits at the provider level:
```yaml
bifrost:
  governance:
    providers:
      - name: "openai"
        budget_id: "budget-production"
        rate_limit_id: "rate-limit-high"
        send_back_raw_request: false
        send_back_raw_response: false
      - name: "anthropic"
        budget_id: "budget-production"
        rate_limit_id: "rate-limit-standard"
```
---
## Routing Rules
CEL-expression-based routing rules redirect requests to different providers or models based on request attributes.
| Field | Description |
|-------|-------------|
| `cel_expression` | CEL expression evaluated against the request; if `true`, rule fires |
| `targets` | Provider/model targets with weights |
| `fallbacks` | Providers to try if all targets fail |
| `scope` | `global`, `team`, `customer`, or `virtual_key` |
| `scope_id` | Required for non-global scopes |
| `priority` | Lower number = evaluated first |
```yaml
bifrost:
  governance:
    routingRules:
      # Route all GPT requests to Azure
      - id: "route-gpt-to-azure"
        name: "GPT → Azure"
        description: "Route all GPT model requests to Azure OpenAI"
        enabled: true
        cel_expression: "model.startsWith('gpt-')"
        targets:
          - provider: "azure"
            model: ""   # empty = use original model name
            weight: 1.0
        fallbacks: ["openai"]
        scope: "global"
        priority: 0
      # Route gpt-4o requests that ask for more than 4000 output tokens to Groq
      - id: "route-heavy-to-groq"
        name: "Large context → Groq"
        enabled: true
        cel_expression: "model == 'gpt-4o' && request_body.max_tokens > 4000"
        targets:
          - provider: "groq"
            model: "llama-3.3-70b-versatile"
            weight: 1.0
        fallbacks: ["openai"]
        scope: "global"
        priority: 1
      # Team-scoped rule
      - id: "route-ml-team-bedrock"
        name: "ML Team → Bedrock"
        enabled: true
        cel_expression: "true"   # match all requests for this scope
        targets:
          - provider: "bedrock"
            model: ""
            weight: 1.0
        fallbacks: ["openai"]
        scope: "team"
        scope_id: "team-ml"
        priority: 0
```
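Since a lower `priority` is evaluated first, listing rules in ascending priority shows the order in which they would be tried. The sketch below covers only the two global rules from the example; how rules from different scopes interleave is not specified here, so that part is left out.

```bash
# Each line is "priority|rule id"; both rules use the global scope.
rules='1|route-heavy-to-groq
0|route-gpt-to-azure'

# Sort numerically on the priority column: lowest number first.
evaluation_order() {
  printf '%s\n' "$rules" | sort -t '|' -k1,1n
}

evaluation_order
```

Note that with these two rules, every `gpt-4o` request already matches `route-gpt-to-azure` at priority 0, so it is worth checking whether the second rule can ever fire in your setup.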
---
## Full Example
```yaml
# governance-full-values.yaml
image:
  tag: "v1.4.11"
bifrost:
  encryptionKeySecret:
    name: "bifrost-encryption"
    key: "encryption-key"
  plugins:
    governance:
      enabled: true
      config:
        is_vk_mandatory: true
  governance:
    authConfig:
      isEnabled: true
      existingSecret: "bifrost-admin-credentials"
      usernameKey: "username"
      passwordKey: "password"
    budgets:
      - id: "budget-production"
        max_limit: 500
        reset_duration: "1M"
      - id: "budget-dev"
        max_limit: 50
        reset_duration: "1M"
    rateLimits:
      - id: "rate-limit-standard"
        token_max_limit: 100000
        token_reset_duration: "1h"
        request_max_limit: 1000
        request_reset_duration: "1h"
    virtualKeys:
      - id: "vk-production"
        name: "Production"
        value: "vk-prod-secret-token"
        is_active: true
        budget_id: "budget-production"
        rate_limit_id: "rate-limit-standard"
        provider_configs:
          - provider: "openai"
            weight: 1
            allowed_models: ["gpt-4o", "gpt-4o-mini"]
```
```bash
kubectl create secret generic bifrost-encryption \
--from-literal=encryption-key='your-32-byte-key'
kubectl create secret generic bifrost-admin-credentials \
--from-literal=username='admin' \
--from-literal=password='secure-admin-password'
helm install bifrost bifrost/bifrost -f governance-full-values.yaml
```
---
## Access Profiles (Enterprise)
You can seed enterprise `access_profiles` directly from Helm values. The chart renders `bifrost.accessProfiles` into top-level `access_profiles` in `config.json`.
```yaml
bifrost:
  accessProfiles:
    - name: "platform-default"
      description: "Default profile for platform users"
      is_active: true
      tags: ["platform", "default"]
      provider_configs:
        - provider_name: "openai"
          all_models_allowed: false
          allowed_models: ["gpt-4o", "gpt-4o-mini"]
      mcp_servers:
        - mcp_server_id: "github"
      mcp_tool_overrides:
        - mcp_client_id: "github"
          tool_name: "create_pull_request"
          action: "include"
```

@@ -0,0 +1,262 @@
---
title: "Guardrails"
description: "Configure guardrails providers and rules in Bifrost Helm deployments"
icon: "shield-halved"
---
<Note>
Guardrails are an **enterprise-only** feature. They require the enterprise Bifrost image.
</Note>
Guardrails are configured under `bifrost.guardrails` in your values file. The configuration has two parts:
- **`providers`** — the backend that performs the check. Rules link to providers by `id`.
- **`rules`** — CEL expressions that control when and where providers are invoked.
---
## Providers
<Tabs>
<Tab title="Regex">
Runs entirely in-process with no external dependency. Patterns use RE2 syntax. Supports optional per-pattern flags: `i` (case-insensitive), `m` (multiline), `s` (dot-all).
```yaml
bifrost:
  guardrails:
    providers:
      - id: 1
        provider_name: "regex"
        policy_name: "block-secrets"
        enabled: true
        timeout: 5
        config:
          patterns:
            - pattern: "sk-[A-Za-z0-9]{20,}"
              description: "OpenAI API key"
            - pattern: "AKIA[0-9A-Z]{16}"
              description: "AWS access key"
              flags: "i"
            - pattern: "gh[ps]_[A-Za-z0-9]{36}"
              description: "GitHub token"
```
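Before deploying, the three patterns can be tried against sample text locally. The sketch below uses `grep -E` (POSIX extended regular expressions) rather than RE2; these particular patterns use syntax common to both, but that is not true of every RE2 pattern. The sample strings are made up and the `check_secret` helper is hypothetical.

```bash
# Print the description of the first pattern that matches, or nothing.
check_secret() {
  text="$1"
  if printf '%s' "$text" | grep -Eq 'sk-[A-Za-z0-9]{20,}'; then
    echo "OpenAI API key"
  elif printf '%s' "$text" | grep -Eiq 'AKIA[0-9A-Z]{16}'; then   # -i mirrors flags: "i"
    echo "AWS access key"
  elif printf '%s' "$text" | grep -Eq 'gh[ps]_[A-Za-z0-9]{36}'; then
    echo "GitHub token"
  fi
}

check_secret "my key is AKIAABCDEFGHIJKLMNOP"
check_secret "nothing sensitive here"
```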
</Tab>
<Tab title="AWS Bedrock">
```yaml
bifrost:
  guardrails:
    providers:
      - id: 2
        provider_name: "bedrock"
        policy_name: "content-filter"
        enabled: true
        timeout: 15
        config:
          guardrail_arn: "arn:aws:bedrock:us-east-1:123456789012:guardrail/abc123"  # replace 123456789012 with your AWS account ID
          guardrail_version: "DRAFT"             # or a published version number
          region: "us-east-1"
          access_key: "env.AWS_ACCESS_KEY_ID"    # omit to use instance role
          secret_key: "env.AWS_SECRET_ACCESS_KEY"
```
</Tab>
<Tab title="Azure Content Safety">
```yaml
bifrost:
  guardrails:
    providers:
      - id: 3
        provider_name: "azure"
        policy_name: "azure-content-safety"
        enabled: true
        timeout: 10
        config:
          endpoint: "https://your-resource.cognitiveservices.azure.com"
          api_key: "env.AZURE_CONTENT_SAFETY_KEY"
          analyze_enabled: true
          analyze_severity_threshold: "medium"  # low | medium | high
          jailbreak_shield_enabled: true
          indirect_attack_shield_enabled: true
          copyright_enabled: false
          text_blocklist_enabled: false
          blocklist_names: []
```
</Tab>
<Tab title="Gray Swan">
```yaml
bifrost:
  guardrails:
    providers:
      - id: 4
        provider_name: "grayswan"
        policy_name: "grayswan-jailbreak"
        enabled: true
        timeout: 15
        config:
          api_key: "env.GRAYSWAN_API_KEY"
          violation_threshold: 0.7     # 0.0 to 1.0; higher = more permissive
          reasoning_mode: "standard"   # standard | fast
          policy_id: ""                # optional: single policy ID
          policy_ids: []               # optional: multiple policy IDs
          rules: {}                    # optional: inline rule map
```
</Tab>
</Tabs>
---
## Rules
Rules are CEL expressions that fire when their condition is met. Available CEL variables:
| Variable | Type | Description |
|----------|------|-------------|
| `model` | `string` | Model name from the request |
| `provider` | `string` | Provider name (e.g. `"openai"`) |
| `headers` | `map<string,string>` | HTTP request headers |
| `params` | `map<string,string>` | Query parameters |
| `customer` | `string` | Customer ID |
| `team` | `string` | Team ID |
| `user` | `string` | User ID |
Rule fields:
| Field | Required | Description |
|-------|----------|-------------|
| `id` | Yes | Unique integer ID |
| `name` | Yes | Human-readable name |
| `description` | No | Optional description |
| `enabled` | Yes | `true` to activate |
| `cel_expression` | Yes | CEL boolean expression; `"true"` matches all requests |
| `apply_to` | Yes | `"input"`, `"output"`, or `"both"` |
| `sampling_rate` | No | `0` to `100`; percentage of requests to check (default: 100) |
| `timeout` | No | Rule timeout in seconds |
| `provider_config_ids` | No | Provider `id`s to invoke when this rule matches |
```yaml
bifrost:
  guardrails:
    rules:
      - id: 101
        name: "block-secrets-input"
        description: "Block prompts containing API keys"
        enabled: true
        cel_expression: "true"
        apply_to: "input"
        sampling_rate: 100
        timeout: 10
        provider_config_ids: [1]
      - id: 102
        name: "azure-output-gpt4o"
        description: "Scan GPT-4o responses"
        enabled: true
        cel_expression: "model == 'gpt-4o'"
        apply_to: "output"
        sampling_rate: 100
        timeout: 15
        provider_config_ids: [3]
      - id: 103
        name: "grayswan-openai-input"
        enabled: true
        cel_expression: "provider == 'openai'"
        apply_to: "input"
        sampling_rate: 50
        timeout: 20
        provider_config_ids: [4]
      - id: 104
        name: "strict-team-check"
        enabled: true
        cel_expression: "team == 'team-platform'"
        apply_to: "both"
        sampling_rate: 100
        timeout: 30
        provider_config_ids: [1, 3]   # multiple providers run in parallel
```
---
## Full Example
```yaml
# guardrails-values.yaml
image:
  tag: "latest"
bifrost:
  encryptionKeySecret:
    name: "bifrost-encryption"
    key: "encryption-key"
  guardrails:
    providers:
      - id: 1
        provider_name: "regex"
        policy_name: "block-secrets"
        enabled: true
        timeout: 5
        config:
          patterns:
            - pattern: "sk-[A-Za-z0-9]{20,}"
              description: "OpenAI API key"
            - pattern: "AKIA[0-9A-Z]{16}"
              description: "AWS access key"
            - pattern: "gh[ps]_[A-Za-z0-9]{36}"
              description: "GitHub token"
      - id: 2
        provider_name: "azure"
        policy_name: "content-safety"
        enabled: true
        timeout: 10
        config:
          endpoint: "https://your-resource.cognitiveservices.azure.com"
          api_key: "env.AZURE_CONTENT_SAFETY_KEY"
          analyze_enabled: true
          analyze_severity_threshold: "medium"
          jailbreak_shield_enabled: true
          indirect_attack_shield_enabled: false
          copyright_enabled: false
          text_blocklist_enabled: false
    rules:
      - id: 101
        name: "block-secrets-input"
        description: "Block prompts leaking credentials"
        enabled: true
        cel_expression: "true"
        apply_to: "input"
        sampling_rate: 100
        timeout: 10
        provider_config_ids: [1]
      - id: 102
        name: "content-safety-both"
        description: "Azure content safety on input and output"
        enabled: true
        cel_expression: "true"
        apply_to: "both"
        sampling_rate: 100
        timeout: 15
        provider_config_ids: [2]
```
```bash
kubectl create secret generic azure-content-safety \
--from-literal=key='your-azure-content-safety-api-key'
# Quote the --set paths so the shell does not try to glob-expand the [0] index
helm install bifrost bifrost/bifrost \
  -f guardrails-values.yaml \
  --set 'env[0].name=AZURE_CONTENT_SAFETY_KEY' \
  --set 'env[0].valueFrom.secretKeyRef.name=azure-content-safety' \
  --set 'env[0].valueFrom.secretKeyRef.key=key'
```
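The same environment variable wiring can live in a values file instead of `--set` flags, which is usually easier to review in Git. The sketch below writes such a file; the file name `guardrails-env-values.yaml` is hypothetical, and the structure simply mirrors the `env[0]` paths used in the install command.

```bash
# Write a values overlay equivalent to the --set env[0].* flags.
cat > guardrails-env-values.yaml <<'EOF'
env:
  - name: AZURE_CONTENT_SAFETY_KEY
    valueFrom:
      secretKeyRef:
        name: azure-content-safety
        key: key
EOF

# Then pass both files to Helm (not run here):
#   helm install bifrost bifrost/bifrost \
#     -f guardrails-values.yaml -f guardrails-env-values.yaml
```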

@@ -0,0 +1,549 @@
---
title: "Plugins"
description: "Configure Bifrost plugins in Helm — telemetry, logging, semantic cache, OpenTelemetry, Datadog, governance, and custom plugins"
icon: "puzzle-piece"
---
Plugins are configured under `bifrost.plugins`. Each plugin is independently enabled/disabled. Pre-hooks run in registration order; post-hooks run in reverse order.
<Note>
**Telemetry, logging, and governance are auto-loaded built-ins** — they are always active and do not need to be explicitly enabled. Their configuration lives in `bifrost.client.*` and `bifrost.governance.*`, not in the `plugins` block.
The `plugins` block controls the opt-in plugins: `semanticCache`, `otel`, `datadog`, `maxim`, and custom plugins.
</Note>
```yaml
bifrost:
  plugins:
    semanticCache:
      enabled: false
    otel:
      enabled: false
    datadog:
      enabled: false
```
```bash
# Enable an opt-in plugin at install time
helm install bifrost bifrost/bifrost \
--set image.tag=v1.4.11 \
--set bifrost.plugins.otel.enabled=true
# Or upgrade to enable a plugin without touching other values
helm upgrade bifrost bifrost/bifrost \
--reuse-values \
--set bifrost.plugins.semanticCache.enabled=true
```
---
<Tabs>
<Tab title="Telemetry">
### Telemetry (Prometheus)
<Note>
Telemetry is **always active** — it cannot be disabled. You do not need to set `bifrost.plugins.telemetry.enabled`.
</Note>
Exposes Prometheus metrics at `GET /metrics`. Custom labels are set via `bifrost.client.prometheusLabels`:
```yaml
bifrost:
  client:
    prometheusLabels:
      - "environment=production"
      - "region=us-east-1"
```
```bash
# Verify metrics are exposed
kubectl port-forward svc/bifrost 8080:8080 &
curl http://localhost:8080/metrics | head -30
```
**With Prometheus Push Gateway** (recommended for multi-replica / HA setups where pull-based scraping can miss pods):
```yaml
bifrost:
  plugins:
    telemetry:
      enabled: true
      config:
        push_gateway:
          enabled: true
          push_gateway_url: "http://prometheus-pushgateway.monitoring.svc.cluster.local:9091"
          job_name: "bifrost"
          instance_id: ""   # auto-derived from pod name if empty
          push_interval: 15
          basic_auth:
            username: ""
            password: ""
```
**ServiceMonitor for Prometheus Operator:**
```yaml
serviceMonitor:
  enabled: true
  interval: 30s
  scrapeTimeout: 10s
  namespace: monitoring  # namespace where Prometheus is deployed
```
</Tab>
<Tab title="Logging">
### Request/Response Logging
<Note>
Logging is **auto-loaded** when `bifrost.client.enableLogging: true` and a log store is configured. You do not need to set `bifrost.plugins.logging.enabled`.
</Note>
Configure logging via the `client` block:
| Parameter | Description | Default |
|-----------|-------------|---------|
| `bifrost.client.enableLogging` | Enable request/response logging | `true` |
| `bifrost.client.disableContentLogging` | Strip message body from logs (HIPAA/PCI) | `false` |
| `bifrost.client.loggingHeaders` | HTTP headers to capture in log metadata | `[]` |
```yaml
bifrost:
  client:
    enableLogging: true
    disableContentLogging: false  # set true for HIPAA/compliance
    loggingHeaders:
      - "x-request-id"
      - "x-user-id"
      - "x-team-id"
```
```bash
# Verify logs are being written
kubectl port-forward svc/bifrost 8080:8080 &
curl -s "http://localhost:8080/api/logs?limit=5" | jq .
```
See [Client Configuration](/deployment-guides/helm/client) for the full reference.
</Tab>
<Tab title="Governance">
### Governance
<Note>
Governance is **always active** for OSS deployments. You do not need to set `bifrost.plugins.governance.enabled`.
</Note>
Virtual key enforcement is controlled by the `client` block:
| Parameter | Description | Default |
|-----------|-------------|---------|
| `bifrost.client.enforceAuthOnInference` | Require a virtual key (`x-bf-vk`) on every inference request | `false` |
```yaml
bifrost:
  client:
    enforceAuthOnInference: true  # require virtual key on all inference requests
```
Define virtual keys, budgets, rate limits, and routing rules in `bifrost.governance.*`. See the [Governance](/deployment-guides/helm/governance) page.
</Tab>
<Tab title="Semantic Cache">
### Semantic Cache
Caches LLM responses using vector similarity so semantically equivalent prompts return cached answers.
Two modes:
- **Semantic mode** (`dimension > 1`): uses an embedding model + vector store for similarity search
- **Direct / hash mode** (`dimension: 1`): exact-match hash-based caching, no embedding model needed
| Parameter | Description | Default |
|-----------|-------------|---------|
| `bifrost.plugins.semanticCache.enabled` | Enable semantic caching | `false` |
| `bifrost.plugins.semanticCache.version` | Plugin config version for DB-backed update tracking (`1` to `32767`) | `1` |
| `bifrost.plugins.semanticCache.config.provider` | Embedding provider | `"openai"` |
| `bifrost.plugins.semanticCache.config.embedding_model` | Embedding model name | `"text-embedding-3-small"` |
| `bifrost.plugins.semanticCache.config.dimension` | Embedding dimension (`1` = direct/hash mode) | `1536` |
| `bifrost.plugins.semanticCache.config.threshold` | Cosine similarity threshold (0 to 1) | `0.8` |
| `bifrost.plugins.semanticCache.config.ttl` | Cache entry TTL (Go duration) | `"5m"` |
| `bifrost.plugins.semanticCache.config.conversation_history_threshold` | Number of past messages to include in cache key | `3` |
| `bifrost.plugins.semanticCache.config.cache_by_model` | Include model name in cache key | `true` |
| `bifrost.plugins.semanticCache.config.cache_by_provider` | Include provider name in cache key | `true` |
| `bifrost.plugins.semanticCache.config.exclude_system_prompt` | Exclude system prompt from cache key | `false` |
| `bifrost.plugins.semanticCache.config.cleanup_on_shutdown` | Delete cache data on pod shutdown | `false` |
**Semantic mode (with OpenAI embeddings + Weaviate):**
```bash
kubectl create secret generic semantic-cache-secret \
--from-literal=openai-key='sk-your-openai-embedding-key'
```
```yaml
# semantic-cache-values.yaml
image:
  tag: "v1.4.11"
vectorStore:
  enabled: true
  type: weaviate
  weaviate:
    enabled: true
    persistence:
      size: 20Gi
bifrost:
  plugins:
    semanticCache:
      enabled: true
      config:
        provider: "openai"
        keys:
          - value: "env.SEMANTIC_CACHE_OPENAI_KEY"
            weight: 1
        embedding_model: "text-embedding-3-small"
        dimension: 1536
        threshold: 0.85
        ttl: "1h"
        conversation_history_threshold: 5
        cache_by_model: true
        cache_by_provider: true
  providerSecrets:
    semantic-cache-key:
      existingSecret: "semantic-cache-secret"
      key: "openai-key"
      envVar: "SEMANTIC_CACHE_OPENAI_KEY"
```
```bash
helm install bifrost bifrost/bifrost -f semantic-cache-values.yaml
```
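The `threshold` setting is easier to reason about with numbers in hand. The sketch below computes the cosine similarity of two small vectors and compares it with a threshold; `cosine_hit` is a hypothetical helper, the three-element vectors stand in for real 1536-dimension embeddings, and treating `similarity >= threshold` as a hit is an assumption about how the comparison is made.

```bash
#!/usr/bin/env bash
# Illustration only: cosine similarity between two embedding vectors,
# compared against a threshold like the `threshold` setting.
# Zero-length vectors are not handled.
cosine_hit() {
  # $1 and $2 are comma-separated vectors, $3 is the threshold.
  awk -v a="$1" -v b="$2" -v t="$3" 'BEGIN {
    n = split(a, x, ","); split(b, y, ",")
    for (i = 1; i <= n; i++) {
      dot += x[i] * y[i]
      na  += x[i] * x[i]
      nb  += y[i] * y[i]
    }
    sim = dot / (sqrt(na) * sqrt(nb))
    printf "%.3f %s\n", sim, (sim >= t ? "hit" : "miss")
  }'
}

cosine_hit "1,0,1" "1,0.2,0.9" 0.85   # nearly parallel vectors
cosine_hit "1,0,0" "0,1,0" 0.85       # orthogonal vectors
```

Raising the threshold makes hits rarer and safer; lowering it increases the hit rate at the risk of returning an answer to a different question.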
**Direct / hash mode** (no embedding provider needed):
```yaml
bifrost:
  plugins:
    semanticCache:
      enabled: true
      config:
        dimension: 1   # triggers hash-based exact matching
        ttl: "30m"
        cache_by_model: true
        cache_by_provider: true
```
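To make "exact-match" concrete, here is a minimal sketch of a hash-based cache key. The `cache_key` helper, its field order, and its separator are hypothetical and are not Bifrost's internal key format; the point is only that including the provider and model in the hashed input (as `cache_by_provider` and `cache_by_model` do) yields distinct keys per provider and model, and that any change in the prompt text produces a different key.

```bash
#!/usr/bin/env bash
# Illustration only: an exact-match cache key built by hashing the request.
# The field order and separator are hypothetical, not Bifrost's format.
cache_key() {
  local provider="$1" model="$2" prompt="$3"
  printf '%s\n%s\n%s' "$provider" "$model" "$prompt" | sha256sum | cut -d' ' -f1
}

cache_key openai gpt-4o-mini "What is Bifrost?"
cache_key openai gpt-4o-mini "What is Bifrost?"   # same inputs, same key
cache_key openai gpt-4o      "What is Bifrost?"   # different model, new key
cache_key openai gpt-4o-mini "what is bifrost?"   # different text, new key
```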
<Note>
The vector store (`vectorStore.*`) must be configured and enabled for semantic mode. Direct/hash mode works without a vector store but still requires a storage backend.
</Note>
</Tab>
<Tab title="OpenTelemetry">
### OpenTelemetry (OTel)
Sends distributed traces and push-based metrics to any OTLP-compatible collector (Jaeger, Tempo, Honeycomb, etc.).
| Parameter | Description | Default |
|-----------|-------------|---------|
| `bifrost.plugins.otel.enabled` | Enable OTel tracing | `false` |
| `bifrost.plugins.otel.version` | Plugin config version for DB-backed update tracking (`1` to `32767`) | `1` |
| `bifrost.plugins.otel.config.service_name` | Service name in traces | `"bifrost"` |
| `bifrost.plugins.otel.config.collector_url` | OTLP collector endpoint | `""` |
| `bifrost.plugins.otel.config.trace_type` | Trace type (`genai_extension`, `vercel`, or `open_inference`) | `"genai_extension"` |
| `bifrost.plugins.otel.config.protocol` | Transport protocol (`grpc` or `http`) | `"grpc"` |
| `bifrost.plugins.otel.config.metrics_enabled` | Enable OTLP push-based metrics | `false` |
| `bifrost.plugins.otel.config.metrics_endpoint` | OTLP metrics endpoint | `""` |
| `bifrost.plugins.otel.config.metrics_push_interval` | Push interval in seconds | `15` |
| `bifrost.plugins.otel.config.headers` | Custom headers for the collector | `{}` |
| `bifrost.plugins.otel.config.insecure` | Skip TLS verification | `false` |
| `bifrost.plugins.otel.config.tls_ca_cert` | Path to CA cert for TLS | `""` |
```yaml
# otel-values.yaml
image:
  tag: "v1.4.11"
bifrost:
  plugins:
    otel:
      enabled: true
      config:
        service_name: "bifrost-production"
        collector_url: "otel-collector.observability.svc.cluster.local:4317"
        trace_type: "genai_extension"
        protocol: "grpc"
        insecure: true   # set false in production with a proper cert
        metrics_enabled: true
        metrics_endpoint: "otel-collector.observability.svc.cluster.local:4317"
        metrics_push_interval: 15
        headers:
          x-honeycomb-team: "env.HONEYCOMB_API_KEY"
```
```bash
helm upgrade bifrost bifrost/bifrost --reuse-values -f otel-values.yaml
```
**With authentication headers from a Kubernetes Secret:**
```bash
kubectl create secret generic otel-credentials \
--from-literal=api-key='your-honeycomb-or-grafana-key'
```
```yaml
bifrost:
  plugins:
    otel:
      enabled: true
      config:
        collector_url: "api.honeycomb.io:443"
        protocol: "grpc"
        headers:
          x-honeycomb-team: "env.OTEL_API_KEY"
  providerSecrets:
    otel-key:
      existingSecret: "otel-credentials"
      key: "api-key"
      envVar: "OTEL_API_KEY"
```
</Tab>
<Tab title="Datadog">
### Datadog APM
Sends traces to a Datadog Agent running in the cluster.
| Parameter | Description | Default |
|-----------|-------------|---------|
| `bifrost.plugins.datadog.enabled` | Enable Datadog tracing | `false` |
| `bifrost.plugins.datadog.version` | Plugin config version for DB-backed update tracking (`1` to `32767`) | `1` |
| `bifrost.plugins.datadog.config.service_name` | Service name | `"bifrost"` |
| `bifrost.plugins.datadog.config.agent_addr` | Datadog Agent address | `"localhost:8126"` |
| `bifrost.plugins.datadog.config.env` | Deployment environment tag | `""` |
| `bifrost.plugins.datadog.config.version` | Version tag | `""` |
| `bifrost.plugins.datadog.config.enable_traces` | Enable trace collection | `true` |
| `bifrost.plugins.datadog.config.custom_tags` | Extra tags on all spans | `{}` |
The Datadog Agent is typically deployed via the [Datadog Helm chart](https://docs.datadoghq.com/containers/kubernetes/installation/) as a DaemonSet, making it available at the node's hostIP.
```yaml
# datadog-values.yaml
image:
  tag: "v1.4.11"
bifrost:
  plugins:
    datadog:
      enabled: true
      config:
        service_name: "bifrost"
        agent_addr: "$(HOST_IP):8126"   # uses Datadog DaemonSet pattern
        env: "production"
        version: "v1.4.11"
        enable_traces: true
        custom_tags:
          team: "platform"
          region: "us-east-1"
# Inject HOST_IP so Bifrost can reach the DaemonSet agent on the same node
env:
  - name: HOST_IP
    valueFrom:
      fieldRef:
        fieldPath: status.hostIP
```
```bash
helm upgrade bifrost bifrost/bifrost --reuse-values -f datadog-values.yaml
```
</Tab>
<Tab title="Maxim">
### Maxim Observability
Sends LLM request/response data to [Maxim](https://getmaxim.ai) for tracing, evaluation, and observability.
| Parameter | Description | Default |
|-----------|-------------|---------|
| `bifrost.plugins.maxim.enabled` | Enable Maxim plugin | `false` |
| `bifrost.plugins.maxim.version` | Plugin config version for DB-backed update tracking (`1` to `32767`) | `1` |
| `bifrost.plugins.maxim.config.api_key` | Maxim API key (plain text, prefer secret) | `""` |
| `bifrost.plugins.maxim.config.log_repo_id` | Maxim log repository ID | `""` |
| `bifrost.plugins.maxim.secretRef.name` | Kubernetes Secret name for API key | `""` |
| `bifrost.plugins.maxim.secretRef.key` | Key within the secret | `"api-key"` |
```bash
kubectl create secret generic maxim-credentials \
--from-literal=api-key='your-maxim-api-key'
```
```yaml
# maxim-values.yaml
image:
  tag: "v1.4.11"
bifrost:
  plugins:
    maxim:
      enabled: true
      config:
        log_repo_id: "your-log-repo-id"
      secretRef:
        name: "maxim-credentials"
        key: "api-key"
```
```bash
helm upgrade bifrost bifrost/bifrost --reuse-values -f maxim-values.yaml
```
</Tab>
<Tab title="Custom Plugin">
### Custom / Dynamic Plugins
Load a custom Go plugin (compiled `.so` file) at runtime.
| Parameter | Description | Default |
|-----------|-------------|---------|
| `bifrost.plugins.custom[].name` | Unique plugin name | `""` |
| `bifrost.plugins.custom[].enabled` | Enable custom plugin | `false` |
| `bifrost.plugins.custom[].path` | Path to compiled `.so` file in the container | `""` |
| `bifrost.plugins.custom[].version` | Plugin config version (`1` to `32767`) | `1` |
| `bifrost.plugins.custom[].config` | Arbitrary plugin-specific configuration | `{}` |
```yaml
bifrost:
  plugins:
    custom:
      - name: "my-custom-plugin"
        enabled: true
        path: "/plugins/my-plugin.so"
        version: 1
        config:
          api_endpoint: "https://my-service.example.com"
          timeout: 5000
```
Mount the `.so` file via a volume (a ConfigMap holds at most 1 MiB, so this suits only small plugin binaries):
```yaml
volumes:
  - name: custom-plugins
    configMap:
      name: bifrost-custom-plugins
volumeMounts:
  - name: custom-plugins
    mountPath: /plugins
```
Or use an init container to download the plugin binary:
```yaml
initContainers:
  - name: download-plugin
    image: curlimages/curl:8.6.0
    command:
      - sh
      - -c
      - |
        curl -fsSL https://plugins.example.com/my-plugin.so \
          -o /plugins/my-plugin.so
    volumeMounts:
      - name: plugin-dir
        mountPath: /plugins
volumes:
  - name: plugin-dir
    emptyDir: {}
volumeMounts:
  - name: plugin-dir
    mountPath: /plugins
```
```bash
helm upgrade bifrost bifrost/bifrost --reuse-values -f custom-plugin-values.yaml
```
</Tab>
</Tabs>
---
## All Plugins Together
```yaml
# all-plugins-values.yaml
image:
  tag: "v1.4.11"
bifrost:
  encryptionKeySecret:
    name: "bifrost-encryption"
    key: "encryption-key"
  plugins:
    telemetry:
      enabled: true
      config:
        custom_labels:
          - name: "environment"
            value: "production"
    logging:
      enabled: true
      config:
        disable_content_logging: false
        logging_headers:
          - "x-request-id"
    governance:
      enabled: true
      config:
        is_vk_mandatory: true
    semanticCache:
      enabled: true
      config:
        provider: "openai"
        keys:
          - value: "env.CACHE_OPENAI_KEY"
            weight: 1
        embedding_model: "text-embedding-3-small"
        dimension: 1536
        threshold: 0.85
        ttl: "1h"
    otel:
      enabled: true
      config:
        service_name: "bifrost"
        collector_url: "otel-collector.observability.svc.cluster.local:4317"
        protocol: "grpc"
        insecure: true
```
```bash
helm install bifrost bifrost/bifrost -f all-plugins-values.yaml
```

@@ -0,0 +1,941 @@
---
title: "Provider Setup"
description: "Configure LLM providers in the Bifrost Helm chart — API keys, cloud-native auth, and self-hosted endpoints"
icon: "plug"
---
All providers are configured under `bifrost.providers` in your values file. Each provider entry contains a `keys` list where each key has a `name`, `value`, `weight`, and optional provider-specific config.
**Two ways to supply credentials:**
- **Direct value** — `value: "sk-..."` (fine for dev; avoid in production)
- **Kubernetes Secret + env var** — store the key in a Secret, inject as an env var, and reference it with `value: "env.VAR_NAME"`
The `providerSecrets` block handles the Secret → env var injection automatically:
```yaml
bifrost:
  providers:
    openai:
      keys:
        - name: "primary"
          value: "env.OPENAI_API_KEY"   # resolved at runtime
          weight: 1
  providerSecrets:
    openai:
      existingSecret: "my-openai-secret"
      key: "api-key"
      envVar: "OPENAI_API_KEY"          # injected into the pod
```
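The `env.` prefix can be read as a lookup into the pod's environment. The sketch below mimics that lookup in plain bash so the flow (Secret, injected variable, `env.VAR_NAME` reference) is easy to trace. `resolve_value` is a hypothetical helper written for illustration; it is not part of Bifrost or the chart, and Bifrost's own resolver may treat unset variables differently.

```bash
#!/usr/bin/env bash
# Illustration only: mimic how a value such as "env.OPENAI_API_KEY" maps to
# an environment variable. resolve_value is a hypothetical helper.
resolve_value() {
  local raw="$1"
  if [[ "$raw" == env.* ]]; then
    local var_name="${raw#env.}"     # strip the "env." prefix
    printf '%s' "${!var_name:-}"     # indirect lookup; empty if unset
  else
    printf '%s' "$raw"               # direct value, used as-is
  fi
}

# Simulate the variable that providerSecrets would inject into the pod.
export OPENAI_API_KEY="sk-example"

resolve_value "env.OPENAI_API_KEY"; echo
resolve_value "sk-direct-value"; echo
```

Inside a running pod, `kubectl exec` with `printenv OPENAI_API_KEY` is a quick way to confirm that the variable was actually injected.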
---
<Tabs>
<Tab title="OpenAI">
### OpenAI
Supports multiple keys with weighted load balancing. The key with `use_for_batch_api: true` is eligible for the Batch API.
**Step 1 — Create secret**
```bash
kubectl create secret generic openai-credentials \
--from-literal=api-key-1='sk-your-primary-key' \
--from-literal=api-key-2='sk-your-secondary-key' \
--from-literal=api-key-batch='sk-your-batch-key'
```
**Step 2 — Values file**
```yaml
# openai-values.yaml
image:
  tag: "v1.4.11"
bifrost:
  providers:
    openai:
      keys:
        - name: "openai-primary"
          value: "env.OPENAI_KEY_1"
          weight: 2                  # 50% of traffic
          models: ["*"]
        - name: "openai-secondary"
          value: "env.OPENAI_KEY_2"
          weight: 1                  # 25%
          models: ["gpt-4o-mini"]    # restrict to cheaper model
        - name: "openai-batch"
          value: "env.OPENAI_KEY_BATCH"
          weight: 1                  # 25%
          models: ["*"]
          use_for_batch_api: true
  providerSecrets:
    openai-key-1:
      existingSecret: "openai-credentials"
      key: "api-key-1"
      envVar: "OPENAI_KEY_1"
    openai-key-2:
      existingSecret: "openai-credentials"
      key: "api-key-2"
      envVar: "OPENAI_KEY_2"
    openai-key-batch:
      existingSecret: "openai-credentials"
      key: "api-key-batch"
      envVar: "OPENAI_KEY_BATCH"
```
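The percentages in the comments follow from dividing each weight by the sum of the weights. The sketch below reproduces that arithmetic; `key_shares` is a hypothetical helper, it handles integer weights only, and it assumes that a key is picked with probability `weight / sum(weights)` among the keys whose `models` list allows the requested model.

```bash
#!/usr/bin/env bash
# Illustration only: convert key weights into expected traffic shares.
# Assumes selection probability is weight / sum(weights of eligible keys).
key_shares() {
  # Arguments: name=weight pairs, for example openai-primary=2
  local pair total=0
  for pair in "$@"; do
    total=$(( total + ${pair#*=} ))
  done
  for pair in "$@"; do
    # Integer division truncates fractional percentages.
    printf '%s %d%%\n' "${pair%%=*}" $(( ${pair#*=} * 100 / total ))
  done
}

# Requests for gpt-4o-mini: all three keys are eligible.
key_shares openai-primary=2 openai-secondary=1 openai-batch=1

# Requests for any other model: openai-secondary is excluded by its models list.
key_shares openai-primary=2 openai-batch=1
```

Under that assumption the 50/25/25 split applies to `gpt-4o-mini` requests; for other models the secondary key drops out and the remaining two keys share traffic 2:1.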
**Step 3 — Install**
```bash
helm install bifrost bifrost/bifrost -f openai-values.yaml
```
**Optional — per-provider network config**
```yaml
bifrost:
  providers:
    openai:
      keys:
        - name: "primary"
          value: "env.OPENAI_KEY_1"
          weight: 1
      network_config:
        default_request_timeout_in_seconds: 120
        max_retries: 3
        retry_backoff_initial_ms: 500
        retry_backoff_max_ms: 5000
        max_conns_per_host: 5000
```
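To get a feel for how the three retry settings interact, the sketch below prints a delay schedule. It assumes a plain exponential backoff in which the delay doubles after each attempt and is capped at `retry_backoff_max_ms`; whether Bifrost doubles, adds jitter, or uses another growth factor is not stated here, so treat the numbers as illustrative. `backoff_schedule` is a hypothetical helper.

```bash
#!/usr/bin/env bash
# Illustration only: delay before each retry, assuming the delay doubles per
# attempt and is capped at the maximum. Jitter, if any, is not modeled.
backoff_schedule() {
  local initial_ms="$1" max_ms="$2" retries="$3"
  local delay="$initial_ms" attempt
  for (( attempt = 1; attempt <= retries; attempt++ )); do
    printf 'retry %d: %d ms\n' "$attempt" "$delay"
    delay=$(( delay * 2 ))
    if (( delay > max_ms )); then
      delay="$max_ms"
    fi
  done
}

# Values from the network_config example above.
backoff_schedule 500 5000 3
```

With more retries the cap starts to matter: under the same assumption, a fifth retry would wait the full 5000 ms rather than 8000 ms.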
</Tab>
<Tab title="Anthropic">
### Anthropic
```bash
kubectl create secret generic anthropic-credentials \
--from-literal=api-key-1='sk-ant-your-primary-key' \
--from-literal=api-key-2='sk-ant-your-secondary-key'
```
```yaml
# anthropic-values.yaml
image:
  tag: "v1.4.11"
bifrost:
  providers:
    anthropic:
      keys:
        - name: "anthropic-primary"
          value: "env.ANTHROPIC_KEY_1"
          weight: 1
          models: ["*"]
        - name: "anthropic-secondary"
          value: "env.ANTHROPIC_KEY_2"
          weight: 1
          models: ["*"]
  providerSecrets:
    anthropic-key-1:
      existingSecret: "anthropic-credentials"
      key: "api-key-1"
      envVar: "ANTHROPIC_KEY_1"
    anthropic-key-2:
      existingSecret: "anthropic-credentials"
      key: "api-key-2"
      envVar: "ANTHROPIC_KEY_2"
```
```bash
helm install bifrost bifrost/bifrost -f anthropic-values.yaml
```
**Override Anthropic beta headers** (optional):
```yaml
bifrost:
  providers:
    anthropic:
      keys:
        - name: "primary"
          value: "env.ANTHROPIC_KEY_1"
          weight: 1
      network_config:
        beta_header_overrides:
          redact-thinking-: true
```
</Tab>
<Tab title="Azure OpenAI">
### Azure OpenAI
Azure requires `azure_key_config` on every key with `endpoint` and `api_version`. Use top-level `aliases` to map logical model names to Azure deployment names.
Two auth modes are supported:
<Tabs>
<Tab title="API Key">
**Step 1 — Create secret**
```bash
kubectl create secret generic azure-credentials \
--from-literal=api-key='your-azure-openai-api-key' \
--from-literal=endpoint='https://your-resource.openai.azure.com'
```
**Step 2 — Values file**
```yaml
# azure-apikey-values.yaml
image:
  tag: "v1.4.11"
bifrost:
  providers:
    azure:
      keys:
        - name: "azure-primary"
          value: "env.AZURE_API_KEY"
          weight: 1
          models: ["gpt-4o", "gpt-4o-mini", "text-embedding-3-small"]
          azure_key_config:
            endpoint: "env.AZURE_ENDPOINT"
            api_version: "2024-10-21"
          aliases:
            gpt-4o: "gpt-4o-prod"
            gpt-4o-mini: "gpt-4o-mini-prod"
            text-embedding-3-small: "embeddings-prod"
  providerSecrets:
    azure-api-key:
      existingSecret: "azure-credentials"
      key: "api-key"
      envVar: "AZURE_API_KEY"
    azure-endpoint:
      existingSecret: "azure-credentials"
      key: "endpoint"
      envVar: "AZURE_ENDPOINT"
```
**Step 3 — Install**
```bash
helm install bifrost bifrost/bifrost -f azure-apikey-values.yaml
```
</Tab>
<Tab title="Managed Identity / Workload Identity">
When `value` is empty, Bifrost uses `DefaultAzureCredential` — which automatically resolves credentials from:
- AKS Workload Identity (recommended for production)
- Azure VM managed identity
- `az login` (developer machines)
**Step 1 — Annotate the service account** (AKS Workload Identity)
```bash
# Associate the Kubernetes service account with your Azure managed identity
kubectl annotate serviceaccount bifrost \
azure.workload.identity/client-id="<MANAGED_IDENTITY_CLIENT_ID>"
```
```yaml
serviceAccount:
  annotations:
    azure.workload.identity/client-id: "<MANAGED_IDENTITY_CLIENT_ID>"
```
**Step 2 — Values file**
```bash
kubectl create secret generic azure-config \
--from-literal=endpoint='https://your-resource.openai.azure.com'
```
```yaml
# azure-msi-values.yaml
image:
  tag: "v1.4.11"
serviceAccount:
  annotations:
    azure.workload.identity/client-id: "<MANAGED_IDENTITY_CLIENT_ID>"
bifrost:
  providers:
    azure:
      keys:
        - name: "azure-workload-identity"
          value: ""   # empty = DefaultAzureCredential
          weight: 1
          models: ["gpt-4o"]
          azure_key_config:
            endpoint: "env.AZURE_ENDPOINT"
            api_version: "2024-10-21"
          aliases:
            gpt-4o: "gpt-4o-prod"
  providerSecrets:
    azure-endpoint:
      existingSecret: "azure-config"
      key: "endpoint"
      envVar: "AZURE_ENDPOINT"
```
**Step 3 — Install**
```bash
helm install bifrost bifrost/bifrost -f azure-msi-values.yaml
```
</Tab>
</Tabs>
**Multi-region failover** (two deployments, different regions):
```yaml
bifrost:
  providers:
    azure:
      keys:
        - name: "eastus"
          value: "env.AZURE_KEY_EAST"
          weight: 1
          azure_key_config:
            endpoint: "env.AZURE_ENDPOINT_EAST"
            api_version: "2024-10-21"
          aliases:
            gpt-4o: "gpt-4o-eastus"
        - name: "westus"
          value: "env.AZURE_KEY_WEST"
          weight: 1
          azure_key_config:
            endpoint: "env.AZURE_ENDPOINT_WEST"
            api_version: "2024-10-21"
          aliases:
            gpt-4o: "gpt-4o-westus"
```
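The `aliases` map is a plain lookup from the model name a client sends to the Azure deployment name that receives the call. The sketch below shows that lookup with the names from the API-key example; `resolve_deployment` is a hypothetical helper, and passing an unmapped model through unchanged is an assumption rather than documented behaviour.

```bash
#!/usr/bin/env bash
# Illustration only: look up the Azure deployment name for a logical model.
# Requires bash 4+ for associative arrays.
declare -A ALIASES=(
  [gpt-4o]="gpt-4o-prod"
  [gpt-4o-mini]="gpt-4o-mini-prod"
  [text-embedding-3-small]="embeddings-prod"
)

resolve_deployment() {
  local model="$1"
  # Assumption: a model without an alias is passed through unchanged.
  printf '%s\n' "${ALIASES[$model]:-$model}"
}

resolve_deployment gpt-4o
resolve_deployment text-embedding-3-small
resolve_deployment some-unmapped-model
```

In the multi-region example each key carries its own map, so the same logical name `gpt-4o` resolves to a different deployment depending on which key is selected.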
</Tab>
<Tab title="AWS Bedrock">
### AWS Bedrock
Bedrock requires `bedrock_key_config` with at minimum a `region`. Three auth modes:
<Tabs>
<Tab title="Static Credentials">
```bash
kubectl create secret generic aws-credentials \
--from-literal=access-key-id='AKIAIOSFODNN7EXAMPLE' \
--from-literal=secret-access-key='wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY'
```
```yaml
# bedrock-static-values.yaml
image:
  tag: "v1.4.11"
bifrost:
  providers:
    bedrock:
      keys:
        - name: "bedrock-static"
          value: ""
          weight: 1
          models: ["*"]
          bedrock_key_config:
            region: "us-east-1"
            access_key: "env.AWS_ACCESS_KEY_ID"
            secret_key: "env.AWS_SECRET_ACCESS_KEY"
            deployments:
              # Logical name -> Bedrock inference profile
              anthropic.claude-3-5-sonnet: "us.anthropic.claude-3-5-sonnet-20240620-v1:0"
  providerSecrets:
    aws-access-key:
      existingSecret: "aws-credentials"
      key: "access-key-id"
      envVar: "AWS_ACCESS_KEY_ID"
    aws-secret-key:
      existingSecret: "aws-credentials"
      key: "secret-access-key"
      envVar: "AWS_SECRET_ACCESS_KEY"
```
```bash
helm install bifrost bifrost/bifrost -f bedrock-static-values.yaml
```
</Tab>
<Tab title="IRSA / EKS Pod Identity">
When only `region` is set, Bifrost inherits credentials from the AWS SDK default chain — IRSA (IAM Roles for Service Accounts), EC2 instance profile, or `AWS_*` env vars.
**Step 1 — Annotate the service account with the IAM role**
```bash
kubectl annotate serviceaccount bifrost \
eks.amazonaws.com/role-arn="arn:aws:iam::123456789012:role/BifrostBedrockRole"
```
```yaml
serviceAccount:
  annotations:
    eks.amazonaws.com/role-arn: "arn:aws:iam::123456789012:role/BifrostBedrockRole"
```
**Step 2 — Values file**
```yaml
# bedrock-irsa-values.yaml
image:
  tag: "v1.4.11"
serviceAccount:
  annotations:
    eks.amazonaws.com/role-arn: "arn:aws:iam::123456789012:role/BifrostBedrockRole"
bifrost:
  providers:
    bedrock:
      keys:
        - name: "bedrock-irsa"
          value: ""
          weight: 1
          models: ["*"]
          bedrock_key_config:
            region: "us-east-1"
            # No access_key / secret_key — SDK uses IRSA token automatically
```
```bash
helm install bifrost bifrost/bifrost -f bedrock-irsa-values.yaml
```
</Tab>
<Tab title="STS AssumeRole">
Assumes a cross-account role on top of the default credential chain.
```yaml
# bedrock-assumerole-values.yaml
image:
  tag: "v1.4.11"
bifrost:
  providers:
    bedrock:
      keys:
        - name: "bedrock-assumerole"
          value: ""
          weight: 1
          models: ["*"]
          bedrock_key_config:
            region: "us-west-2"
            # Source identity from pod's default chain, then assume this role
            role_arn: "env.AWS_ROLE_ARN"
            external_id: "env.AWS_EXTERNAL_ID"
            session_name: "bifrost-session"
```
```bash
kubectl create secret generic aws-role-config \
--from-literal=role-arn='arn:aws:iam::999999999999:role/CrossAccountBedrockRole' \
--from-literal=external-id='your-external-id'
```
```yaml
providerSecrets:
  aws-role-arn:
    existingSecret: "aws-role-config"
    key: "role-arn"
    envVar: "AWS_ROLE_ARN"
  aws-external-id:
    existingSecret: "aws-role-config"
    key: "external-id"
    envVar: "AWS_EXTERNAL_ID"
```
```bash
helm install bifrost bifrost/bifrost -f bedrock-assumerole-values.yaml
```
</Tab>
</Tabs>
**Batch API — S3 configuration**
```yaml
bedrock_key_config:
  region: "us-east-1"
  access_key: "env.AWS_ACCESS_KEY_ID"
  secret_key: "env.AWS_SECRET_ACCESS_KEY"
  batch_s3_config:
    buckets:
      - bucket_name: "my-bedrock-batch-bucket"
        prefix: "batch/"
        is_default: true
```
</Tab>
<Tab title="Google Vertex AI">
### Google Vertex AI
Vertex requires `vertex_key_config` with `project_id` and `region`. Two auth modes:
<Tabs>
<Tab title="Service Account Key">
```bash
# Base64-encode the service account JSON
SA_JSON=$(base64 -w 0 < service-account-key.json)
# The values file below reads both keys from this secret
kubectl create secret generic gcp-credentials \
  --from-literal=project-id='my-gcp-project' \
  --from-literal=service-account-json="${SA_JSON}"
```
```yaml
# vertex-sa-values.yaml
image:
  tag: "v1.4.11"
bifrost:
  providers:
    vertex:
      keys:
        - name: "vertex-sa-key"
          value: ""
          weight: 1
          models: ["*"]
          vertex_key_config:
            project_id: "env.VERTEX_PROJECT_ID"
            region: "us-central1"
            auth_credentials: "env.VERTEX_AUTH_CREDENTIALS"
  providerSecrets:
    vertex-project-id:
      existingSecret: "gcp-credentials"
      key: "project-id"
      envVar: "VERTEX_PROJECT_ID"
    vertex-sa:
      existingSecret: "gcp-credentials"
      key: "service-account-json"
      envVar: "VERTEX_AUTH_CREDENTIALS"
```
```bash
helm install bifrost bifrost/bifrost -f vertex-sa-values.yaml
```
</Tab>
<Tab title="GKE Workload Identity / ADC">
When `auth_credentials` is omitted, Bifrost calls `google.FindDefaultCredentials` — which resolves to:
- GKE Workload Identity (recommended)
- GCE metadata server (on Compute Engine / Cloud Run)
- `GOOGLE_APPLICATION_CREDENTIALS` path
- `gcloud auth application-default login` (developer machines)
**Step 1 — Annotate the service account** (GKE Workload Identity)
```bash
gcloud iam service-accounts add-iam-policy-binding \
bifrost-sa@my-project.iam.gserviceaccount.com \
--role roles/iam.workloadIdentityUser \
--member "serviceAccount:my-project.svc.id.goog[default/bifrost]"
```
```yaml
serviceAccount:
  annotations:
    iam.gke.io/gcp-service-account: "bifrost-sa@my-project.iam.gserviceaccount.com"
```
**Step 2 — Values file**
```yaml
# vertex-wli-values.yaml
image:
  tag: "v1.4.11"
serviceAccount:
  annotations:
    iam.gke.io/gcp-service-account: "bifrost-sa@my-project.iam.gserviceaccount.com"
bifrost:
  providers:
    vertex:
      keys:
        - name: "vertex-workload-identity"
          value: ""
          weight: 1
          models: ["*"]
          vertex_key_config:
            project_id: "my-gcp-project"
            region: "us-central1"
            # auth_credentials intentionally omitted → ADC lookup
```
```bash
helm install bifrost bifrost/bifrost -f vertex-wli-values.yaml
```
</Tab>
</Tabs>
</Tab>
<Tab title="Groq / Mistral / Gemini / Others">
### Standard API-Key Providers
These providers follow the same simple pattern — one or more keys with weights.
<Tabs>
<Tab title="Groq">
```bash
kubectl create secret generic groq-credentials \
--from-literal=api-key='gsk_your_groq_api_key'
```
```yaml
bifrost:
  providers:
    groq:
      keys:
        - name: "groq-primary"
          value: "env.GROQ_API_KEY"
          weight: 1
          models: ["*"]
  providerSecrets:
    groq-key:
      existingSecret: "groq-credentials"
      key: "api-key"
      envVar: "GROQ_API_KEY"
```
</Tab>
<Tab title="Gemini">
```bash
kubectl create secret generic gemini-credentials \
--from-literal=api-key='your-gemini-api-key'
```
```yaml
bifrost:
  providers:
    gemini:
      keys:
        - name: "gemini-main"
          value: "env.GEMINI_API_KEY"
          weight: 1
          models: ["*"]
  providerSecrets:
    gemini-key:
      existingSecret: "gemini-credentials"
      key: "api-key"
      envVar: "GEMINI_API_KEY"
```
</Tab>
<Tab title="Mistral">
```bash
kubectl create secret generic mistral-credentials \
--from-literal=api-key='your-mistral-api-key'
```
```yaml
bifrost:
  providers:
    mistral:
      keys:
        - name: "mistral-main"
          value: "env.MISTRAL_API_KEY"
          weight: 1
          models: ["*"]
  providerSecrets:
    mistral-key:
      existingSecret: "mistral-credentials"
      key: "api-key"
      envVar: "MISTRAL_API_KEY"
```
</Tab>
<Tab title="Cohere / Perplexity / xAI / Others">
All standard API-key providers follow the same pattern. Replace the provider name and env var name accordingly:
```yaml
bifrost:
  providers:
    cohere:
      keys:
        - name: "cohere-main"
          value: "env.COHERE_API_KEY"
          weight: 1
    perplexity:
      keys:
        - name: "perplexity-main"
          value: "env.PERPLEXITY_API_KEY"
          weight: 1
    xai:
      keys:
        - name: "xai-main"
          value: "env.XAI_API_KEY"
          weight: 1
    cerebras:
      keys:
        - name: "cerebras-main"
          value: "env.CEREBRAS_API_KEY"
          weight: 1
    openrouter:
      keys:
        - name: "openrouter-main"
          value: "env.OPENROUTER_API_KEY"
          weight: 1
    nebius:
      keys:
        - name: "nebius-main"
          value: "env.NEBIUS_API_KEY"
          weight: 1
```
</Tab>
</Tabs>
**Install command (any of the above)**
```bash
helm install bifrost bifrost/bifrost \
--set image.tag=v1.4.11 \
-f provider-values.yaml
```
</Tab>
<Tab title="Self-Hosted">
### Self-Hosted Providers
Self-hosted providers point to a URL you operate. No API key is typically required (`value: ""`).
<Tabs>
<Tab title="Ollama">
```yaml
# ollama-values.yaml
image:
  tag: "v1.4.11"
bifrost:
  providers:
    ollama:
      keys:
        - name: "ollama-local"
          value: ""
          weight: 1
          models: ["*"]
          ollama_key_config:
            url: "http://ollama.default.svc.cluster.local:11434"
```
```bash
helm install bifrost bifrost/bifrost -f ollama-values.yaml
```
Using an env var for the URL (useful across environments):
```bash
kubectl create secret generic ollama-config \
--from-literal=url='http://ollama.default.svc.cluster.local:11434'
```
```yaml
ollama_key_config:
  url: "env.OLLAMA_URL"
providerSecrets:
  ollama-url:
    existingSecret: "ollama-config"
    key: "url"
    envVar: "OLLAMA_URL"
```
</Tab>
<Tab title="vLLM">
vLLM instances are model-specific — one key per served model.
```yaml
# vllm-values.yaml
image:
  tag: "v1.4.11"
bifrost:
  providers:
    vllm:
      keys:
        - name: "vllm-llama3-70b"
          value: ""
          weight: 1
          models: ["llama-3-70b"]
          vllm_key_config:
            url: "http://vllm.default.svc.cluster.local:8000"
            model_name: "meta-llama/Meta-Llama-3-70B-Instruct"
        - name: "vllm-mistral"
          value: ""
          weight: 1
          models: ["mistral-7b"]
          vllm_key_config:
            url: "http://vllm-mistral.default.svc.cluster.local:8000"
            model_name: "mistralai/Mistral-7B-Instruct-v0.3"
```
```bash
helm install bifrost bifrost/bifrost -f vllm-values.yaml
```
</Tab>
<Tab title="SGLang">
```yaml
# sgl-values.yaml
image:
  tag: "v1.4.11"
bifrost:
  providers:
    sgl:
      keys:
        - name: "sgl-main"
          value: ""
          weight: 1
          models: ["*"]
          sgl_key_config:
            url: "http://sgl-router.default.svc.cluster.local:30000"
```
```bash
helm install bifrost bifrost/bifrost -f sgl-values.yaml
```
</Tab>
<Tab title="HuggingFace / Replicate">
These providers use `aliases` to map logical model names to provider-specific IDs.
```yaml
bifrost:
  providers:
    huggingface:
      keys:
        - name: "hf-main"
          value: "env.HF_API_KEY"
          weight: 1
          models: ["llama-3", "mixtral"]
          aliases:
            llama-3: "meta-llama/Meta-Llama-3-8B-Instruct"
            mixtral: "mistralai/Mixtral-8x7B-Instruct-v0.1"
    replicate:
      keys:
        - name: "replicate-main"
          value: "env.REPLICATE_API_KEY"
          weight: 1
          models: ["llama-3"]
          aliases:
            llama-3: "meta/meta-llama-3-70b-instruct"
          replicate_key_config:
            use_deployments_endpoint: false
```
</Tab>
</Tabs>
</Tab>
</Tabs>
---
## Multi-Provider Example
Combine providers in a single values file:
```yaml
# multi-provider-values.yaml
image:
  tag: "v1.4.11"
bifrost:
  providers:
    openai:
      keys:
        - name: "openai-primary"
          value: "env.OPENAI_API_KEY"
          weight: 2
          models: ["*"]
    anthropic:
      keys:
        - name: "anthropic-primary"
          value: "env.ANTHROPIC_API_KEY"
          weight: 1
          models: ["*"]
    groq:
      keys:
        - name: "groq-primary"
          value: "env.GROQ_API_KEY"
          weight: 1
          models: ["*"]
  providerSecrets:
    openai-key:
      existingSecret: "provider-keys"
      key: "openai"
      envVar: "OPENAI_API_KEY"
    anthropic-key:
      existingSecret: "provider-keys"
      key: "anthropic"
      envVar: "ANTHROPIC_API_KEY"
    groq-key:
      existingSecret: "provider-keys"
      key: "groq"
      envVar: "GROQ_API_KEY"
  plugins:
    logging:
      enabled: true
    governance:
      enabled: true
```
```bash
# Create a single secret with all provider keys
kubectl create secret generic provider-keys \
--from-literal=openai='sk-your-openai-key' \
--from-literal=anthropic='sk-ant-your-anthropic-key' \
--from-literal=groq='gsk_your-groq-key'
helm install bifrost bifrost/bifrost -f multi-provider-values.yaml
```

@@ -0,0 +1,550 @@
---
title: "Storage"
description: "Configure Bifrost storage backends in Helm — SQLite, PostgreSQL (embedded and external), per-store overrides, and S3/GCS object storage for logs"
icon: "database"
---
Bifrost persists two types of data — **config** (providers, virtual keys, governance rules) and **logs** (request/response records). Each has its own store, both defaulting to the top-level `storage.mode`.
| Parameter | Description | Default |
|-----------|-------------|---------|
| `storage.mode` | Default backend for both stores (`sqlite` or `postgres`) | `sqlite` |
| `storage.configStore.type` | Override backend for the config store | `""` (inherits `storage.mode`) |
| `storage.logsStore.type` | Override backend for the logs store | `""` (inherits `storage.mode`) |
<Note>
When any store uses SQLite the chart deploys a **StatefulSet** with a PVC. With PostgreSQL only (no SQLite) it deploys a **Deployment**. Mixing backends (e.g. config=postgres, logs=sqlite) still requires a StatefulSet.
</Note>
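The rule in the note can be written as a small decision function. The sketch below is illustrative bash, not chart code: `workload_kind` is a hypothetical helper that takes `storage.mode` plus the two optional per-store overrides, resolves empty overrides to the mode, and reports which workload kind the note implies.

```bash
#!/usr/bin/env bash
# Illustration only: which workload kind the rule in the note implies.
# An empty store type inherits storage.mode.
workload_kind() {
  local mode="$1"
  local config_type="${2:-$mode}" logs_type="${3:-$mode}"
  if [[ "$config_type" == "sqlite" || "$logs_type" == "sqlite" ]]; then
    echo "StatefulSet"
  else
    echo "Deployment"
  fi
}

workload_kind sqlite                    # chart defaults
workload_kind postgres                  # both stores on PostgreSQL
workload_kind sqlite postgres sqlite    # config=postgres, logs=sqlite
workload_kind postgres "" sqlite        # only the logs store overridden
```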
---
<Tabs>
<Tab title="SQLite">
### SQLite (Default)
Simplest setup — no external database required. Bifrost runs as a StatefulSet with a persistent volume for the SQLite files.
| Parameter | Description | Default |
|-----------|-------------|---------|
| `storage.persistence.enabled` | Create a PVC for SQLite data | `true` |
| `storage.persistence.size` | PVC size | `10Gi` |
| `storage.persistence.accessMode` | PVC access mode | `ReadWriteOnce` |
| `storage.persistence.storageClass` | Storage class (leave empty for cluster default) | `""` |
| `storage.persistence.existingClaim` | Reuse an existing PVC | `""` |
```yaml
# sqlite-values.yaml
image:
  tag: "v1.4.11"
storage:
  mode: sqlite
  persistence:
    enabled: true
    size: 20Gi
    # storageClass: "gp3"   # uncomment to pin storage class
bifrost:
  encryptionKey: "your-32-byte-encryption-key-here"
```
```bash
helm install bifrost bifrost/bifrost -f sqlite-values.yaml
```
**Reuse an existing PVC** (e.g. after a StatefulSet migration):
```yaml
storage:
  persistence:
    existingClaim: "bifrost-data"
```
<Warning>
Switching from SQLite to PostgreSQL requires a data migration because the two stores are not compatible. Plan the migration before changing `storage.mode` on a running deployment.
</Warning>
#### StatefulSet Migration (chart v2.0.0+)
Prior to v2.0.0, SQLite used a Deployment + manual PVC. v2.0.0 moved SQLite to a StatefulSet. If upgrading from an older chart:
```bash
# 1. Scale down the old deployment
kubectl scale deployment bifrost --replicas=0
# 2. Note the existing PVC name
kubectl get pvc
# 3. Upgrade the chart, pointing at the existing claim
helm upgrade bifrost bifrost/bifrost \
--reuse-values \
--set storage.persistence.existingClaim=<your-old-pvc-name> \
--set image.tag=v1.4.11
```
</Tab>
<Tab title="Embedded PostgreSQL">
### Embedded PostgreSQL
The chart can deploy a PostgreSQL instance alongside Bifrost. Good for simple production setups where you don't have an existing database.
| Parameter | Description | Default |
|-----------|-------------|---------|
| `storage.mode` | Set to `postgres` | `sqlite` |
| `postgresql.enabled` | Deploy PostgreSQL as a sub-deployment | `false` |
| `postgresql.auth.username` | Database user | `bifrost` |
| `postgresql.auth.password` | Database password | `bifrost_password` |
| `postgresql.auth.database` | Database name | `bifrost` |
| `postgresql.primary.persistence.size` | PVC size for PostgreSQL data | `8Gi` |
<Note>
Ensure the database is created with **UTF8 encoding**. The embedded PostgreSQL deployment handles this automatically. See [PostgreSQL UTF8 Requirement](/quickstart/gateway/setting-up#postgresql-utf8-requirement) for manual setups.
</Note>
```bash
kubectl create secret generic postgres-credentials \
--from-literal=password='your-secure-postgres-password'
```
```yaml
# embedded-postgres-values.yaml
image:
  tag: "v1.4.11"
storage:
  mode: postgres
postgresql:
  enabled: true
  auth:
    username: bifrost
    password: "your-secure-postgres-password"   # use existingSecret in production
    database: bifrost
  primary:
    persistence:
      enabled: true
      size: 50Gi
    resources:
      requests:
        cpu: 500m
        memory: 1Gi
      limits:
        cpu: 2000m
        memory: 4Gi
bifrost:
  encryptionKey: "your-32-byte-encryption-key-here"
```
```bash
helm install bifrost bifrost/bifrost -f embedded-postgres-values.yaml
```
**Verify the connection from Bifrost:**
```bash
kubectl exec -it deployment/bifrost -- nc -zv bifrost-postgresql 5432
```
</Tab>
<Tab title="External PostgreSQL">
### External PostgreSQL
Point Bifrost at an existing PostgreSQL instance — RDS, Cloud SQL, Azure Database, or self-managed.
| Parameter | Description | Default |
|-----------|-------------|---------|
| `postgresql.enabled` | Must be `false` | `false` |
| `postgresql.external.enabled` | Enable external connection | `false` |
| `postgresql.external.host` | Hostname or IP | `""` |
| `postgresql.external.port` | Port | `5432` |
| `postgresql.external.user` | Username | `bifrost` |
| `postgresql.external.database` | Database name | `bifrost` |
| `postgresql.external.sslMode` | SSL mode (`disable`, `require`, `verify-ca`, `verify-full`) | `disable` |
| `postgresql.external.existingSecret` | Secret name for the password | `""` |
| `postgresql.external.passwordKey` | Key within the secret | `"password"` |
```bash
kubectl create secret generic external-postgres-credentials \
--from-literal=password='your-external-postgres-password'
```
```yaml
# external-postgres-values.yaml
image:
  tag: "v1.4.11"
storage:
  mode: postgres
postgresql:
  enabled: false
  external:
    enabled: true
    host: "your-rds-endpoint.us-east-1.rds.amazonaws.com"
    port: 5432
    user: bifrost
    database: bifrost
    sslMode: require
    existingSecret: "external-postgres-credentials"
    passwordKey: "password"
bifrost:
  encryptionKey: "your-32-byte-encryption-key-here"
```
```bash
helm install bifrost bifrost/bifrost -f external-postgres-values.yaml
```
**Test connectivity before installing:**
```bash
kubectl run pg-test --image=postgres:16-alpine --rm -it --restart=Never -- \
psql "host=your-rds-endpoint.us-east-1.rds.amazonaws.com dbname=bifrost user=bifrost sslmode=require" \
-c "SELECT version();"
```
</Tab>
<Tab title="Mixed (Config=Postgres, Logs=SQLite)">
### Mixed Backend
Run the config store on PostgreSQL (fast lookups, shared across replicas) while keeping logs on SQLite (simpler, cheaper for append-heavy workloads).
```yaml
# mixed-values.yaml
image:
  tag: "v1.4.11"
storage:
  mode: sqlite # default fallback
  configStore:
    type: postgres # override: config uses postgres
  logsStore:
    type: sqlite # explicit: logs use sqlite
  persistence:
    enabled: true
    size: 20Gi # for the SQLite logs store
postgresql:
  external:
    enabled: true
    host: "your-postgres-host.example.com"
    port: 5432
    user: bifrost
    database: bifrost
    sslMode: require
    existingSecret: "postgres-credentials"
    passwordKey: "password"
bifrost:
  encryptionKey: "your-32-byte-encryption-key-here"
```
```bash
kubectl create secret generic postgres-credentials \
--from-literal=password='your-postgres-password'
helm install bifrost bifrost/bifrost -f mixed-values.yaml
```
<Note>
In mixed mode, the chart deploys Bifrost as a StatefulSet (because SQLite is in use), with a PostgreSQL connection for the config store and a local PVC for the SQLite logs store.
</Note>
**PostgreSQL connection pool tuning** (for high log volume, when the logs store is also on PostgreSQL rather than SQLite):
```yaml
storage:
  configStore:
    type: postgres
    maxIdleConns: 5
    maxOpenConns: 50
  logsStore:
    type: postgres
    maxIdleConns: 10
    maxOpenConns: 100
```
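When sizing these pools, it helps to check the worst-case connection count against the database's `max_connections` setting (typically `100` on a stock PostgreSQL install). The sketch below assumes each Bifrost replica keeps its own config-store and logs-store pools; `pool_budget` is a hypothetical helper for the arithmetic, not part of the chart.

```bash
# Worst-case PostgreSQL connections opened by a Bifrost release.
# Assumption: every replica can open up to
#   configStore.maxOpenConns + logsStore.maxOpenConns connections.
# pool_budget is a hypothetical helper, not a chart or Bifrost command.
pool_budget() {
  replicas=$1
  config_max=$2
  logs_max=$3
  echo $(( replicas * (config_max + logs_max) ))
}

# 3 replicas with the pool sizes shown above (50 + 100 per pod)
pool_budget 3 50 100   # prints 450
```

If the result exceeds what the database allows, either lower `maxOpenConns` or raise the server-side limit before scaling out.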
</Tab>
</Tabs>
---
## Object Storage for Logs
Offload large request/response payloads from the database to S3 or GCS. The DB retains only lightweight index records; payloads are fetched on demand.
<Tabs>
<Tab title="AWS S3">
```bash
kubectl create secret generic s3-credentials \
--from-literal=access-key-id='AKIAIOSFODNN7EXAMPLE' \
--from-literal=secret-access-key='wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY'
```
```yaml
storage:
  logsStore:
    objectStorage:
      enabled: true
      type: s3
      bucket: "bifrost-logs"
      prefix: "bifrost"
      compress: true # gzip compression
      # S3 configuration
      region: us-east-1
      accessKeyId: "env.S3_ACCESS_KEY_ID"
      secretAccessKey: "env.S3_SECRET_ACCESS_KEY"
      # endpoint: "" # Custom endpoint for MinIO / Cloudflare R2
      # forcePathStyle: false # Set true for MinIO
bifrost:
  # inject S3 credentials as env vars
  providerSecrets:
    s3-access-key:
      existingSecret: "s3-credentials"
      key: "access-key-id"
      envVar: "S3_ACCESS_KEY_ID"
    s3-secret-key:
      existingSecret: "s3-credentials"
      key: "secret-access-key"
      envVar: "S3_SECRET_ACCESS_KEY"
```
**Using IAM role (IRSA / instance profile) instead of static keys:**
```yaml
storage:
  logsStore:
    objectStorage:
      enabled: true
      type: s3
      bucket: "bifrost-logs"
      region: us-east-1
      # No accessKeyId / secretAccessKey: uses SDK default chain
      roleArn: "arn:aws:iam::123456789012:role/BifrostS3Role"
```
</Tab>
<Tab title="Google Cloud Storage">
```bash
kubectl create secret generic gcs-credentials \
--from-literal=service-account-json="$(cat service-account-key.json)"
```
```yaml
storage:
  logsStore:
    objectStorage:
      enabled: true
      type: gcs
      bucket: "bifrost-logs"
      prefix: "bifrost"
      compress: true
      # GCS configuration
      projectId: "my-gcp-project"
      credentialsJson: "env.GCS_CREDENTIALS_JSON" # omit for Workload Identity
bifrost:
  providerSecrets:
    gcs-creds:
      existingSecret: "gcs-credentials"
      key: "service-account-json"
      envVar: "GCS_CREDENTIALS_JSON"
```
</Tab>
<Tab title="MinIO (Self-Hosted)">
```yaml
storage:
  logsStore:
    objectStorage:
      enabled: true
      type: s3
      bucket: "bifrost-logs"
      prefix: "bifrost"
      compress: false
      region: us-east-1 # can be any value for MinIO
      endpoint: "http://minio.minio-ns.svc.cluster.local:9000"
      accessKeyId: "env.MINIO_ACCESS_KEY"
      secretAccessKey: "env.MINIO_SECRET_KEY"
      forcePathStyle: true # required for MinIO
```
</Tab>
</Tabs>
```bash
helm upgrade bifrost bifrost/bifrost \
--reuse-values \
-f object-storage-values.yaml
```
---
## Vector Store
A vector store is required for [semantic caching](/deployment-guides/helm/plugins). Choose from Weaviate, Redis, or Qdrant (embedded or external), or Pinecone (external only).
<Tabs>
<Tab title="Weaviate">
```yaml
vectorStore:
  enabled: true
  type: weaviate
  weaviate:
    enabled: true # deploy embedded Weaviate
    replicas: 1
    persistence:
      enabled: true
      size: 20Gi
    resources:
      requests:
        cpu: 500m
        memory: 1Gi
      limits:
        cpu: 2000m
        memory: 4Gi
```
**External Weaviate:**
```yaml
vectorStore:
  enabled: true
  type: weaviate
  weaviate:
    enabled: false
    external:
      enabled: true
      scheme: https
      host: "weaviate.example.com"
      apiKey: "env.WEAVIATE_API_KEY"
      grpcHost: "weaviate-grpc.example.com"
      grpcSecured: true
      existingSecret: "weaviate-credentials"
      apiKeyKey: "api-key"
```
</Tab>
<Tab title="Redis / Valkey">
```yaml
vectorStore:
  enabled: true
  type: redis
  redis:
    enabled: true # deploy embedded Redis
    auth:
      enabled: true
      password: "redis_password"
    master:
      persistence:
        size: 8Gi
```
**External Redis / AWS MemoryDB:**
```bash
kubectl create secret generic redis-credentials \
--from-literal=password='your-redis-password'
```
```yaml
vectorStore:
  enabled: true
  type: redis
  redis:
    enabled: false
    external:
      enabled: true
      host: "your-redis.cache.amazonaws.com"
      port: 6379
      useTls: true
      clusterMode: true # required for AWS MemoryDB
      existingSecret: "redis-credentials"
      passwordKey: "password"
```
</Tab>
<Tab title="Qdrant">
```yaml
vectorStore:
  enabled: true
  type: qdrant
  qdrant:
    enabled: true # deploy embedded Qdrant
    persistence:
      size: 10Gi
```
**External Qdrant:**
```bash
kubectl create secret generic qdrant-credentials \
--from-literal=api-key='your-qdrant-api-key'
```
```yaml
vectorStore:
  enabled: true
  type: qdrant
  qdrant:
    enabled: false
    external:
      enabled: true
      host: "qdrant.example.com"
      port: 6334
      useTls: true
      existingSecret: "qdrant-credentials"
      apiKeyKey: "api-key"
```
</Tab>
<Tab title="Pinecone">
Pinecone is external-only.
```bash
kubectl create secret generic pinecone-credentials \
--from-literal=api-key='your-pinecone-api-key'
```
```yaml
vectorStore:
  enabled: true
  type: pinecone
  pinecone:
    external:
      enabled: true
      indexHost: "your-index.svc.us-east1-gcp.pinecone.io"
      existingSecret: "pinecone-credentials"
      apiKeyKey: "api-key"
```
</Tab>
</Tabs>
```bash
helm install bifrost bifrost/bifrost \
--set image.tag=v1.4.11 \
-f storage-values.yaml
```


@@ -0,0 +1,401 @@
---
title: "Troubleshooting"
description: "Diagnose and fix common issues with Bifrost Helm deployments — pods, database, ingress, secrets, PVCs, and performance"
icon: "wrench"
---
This page covers the most common problems encountered when deploying Bifrost with Helm, along with diagnostic commands and fixes.
---
## Pod Not Starting
### Quick diagnostics
```bash
# Show pod status
kubectl get pods -l app.kubernetes.io/name=bifrost
# Show pod events (most useful first step)
kubectl describe pod -l app.kubernetes.io/name=bifrost
# Show pod logs (use --previous if the pod has already crashed)
kubectl logs -l app.kubernetes.io/name=bifrost
kubectl logs -l app.kubernetes.io/name=bifrost --previous
```
### Image pull errors (`ErrImagePull` / `ImagePullBackOff`)
```bash
# Check which image is being pulled
kubectl describe pod -l app.kubernetes.io/name=bifrost | grep "Image:"
# Verify imagePullSecrets are attached
kubectl get pod -l app.kubernetes.io/name=bifrost -o jsonpath='{.items[0].spec.imagePullSecrets}'
# Test secret manually
kubectl get secret <pull-secret-name> -o jsonpath='{.data.\.dockerconfigjson}' | base64 -d | jq .
```
Common causes:
- `image.tag` not set — the chart requires it; the pod will not start without it
- Pull secret missing or expired (ECR tokens expire after 12 hours)
- Incorrect `image.repository` for enterprise registry
```bash
# Fix: set the correct tag
helm upgrade bifrost bifrost/bifrost --reuse-values --set image.tag=v1.4.11
```
### PVC not binding (`Pending`)
```bash
# Check PVC status
kubectl get pvc -l app.kubernetes.io/instance=bifrost
# Show binding events
kubectl describe pvc -l app.kubernetes.io/instance=bifrost
```
Common causes:
- No Persistent Volume provisioner in the cluster
- `storageClass` set to a class that doesn't exist
- `ReadWriteOnce` access mode with multiple replicas (SQLite PVCs are single-node)
```bash
# List available storage classes
kubectl get storageclass
# Fix: pin to a valid storage class
helm upgrade bifrost bifrost/bifrost \
--reuse-values \
--set storage.persistence.storageClass=standard
```
### ConfigMap / Secret errors
```bash
# View the generated ConfigMap (contains rendered config.json)
kubectl get configmap bifrost-config -o yaml
# View secrets the pod depends on
kubectl get secret -l app.kubernetes.io/instance=bifrost
# Decode a specific secret value
kubectl get secret bifrost-encryption -o jsonpath='{.data.key}' | base64 -d
```
### CrashLoopBackOff
```bash
# Get last log lines before the crash
kubectl logs -l app.kubernetes.io/name=bifrost --previous --tail=50
# Common causes shown in logs:
# "encryption key is not initialized" → no key provided; optional, but data will be stored in plaintext
# "failed to connect to database" → see Database section below
# "image.tag is required" → set image.tag in values
```
---
## Database Connection Issues
### Embedded PostgreSQL
```bash
# Check if the PostgreSQL pod is running
kubectl get pods -l app.kubernetes.io/name=bifrost-postgresql
# Connect directly to inspect the database
kubectl exec -it deployment/bifrost-postgresql -- psql -U bifrost -d bifrost
# Test connectivity from the Bifrost pod
kubectl exec -it deployment/bifrost -- nc -zv bifrost-postgresql 5432
# Check PostgreSQL logs
kubectl logs deployment/bifrost-postgresql --tail=50
```
### External PostgreSQL
```bash
# Test connectivity from within the cluster
kubectl run pg-test --image=postgres:16-alpine --rm -it --restart=Never -- \
psql "host=your-db-host dbname=bifrost user=bifrost sslmode=require"
# Verify the secret value is correct
kubectl get secret postgres-credentials -o jsonpath='{.data.password}' | base64 -d
# Check that the external host/port is reachable
kubectl exec -it deployment/bifrost -- nc -zv your-db-host 5432
```
Common causes:
- `sslMode: disable` when the database requires SSL — set `sslMode: require`
- Password in secret doesn't match the database user
- Network policy blocking pod → database traffic
- Database not UTF8 encoded (see [PostgreSQL UTF8 Requirement](/quickstart/gateway/setting-up#postgresql-utf8-requirement))
```bash
# Fix: update the secret and restart
kubectl create secret generic postgres-credentials \
--from-literal=password='correct-password' \
--dry-run=client -o yaml | kubectl apply -f -
kubectl rollout restart deployment/bifrost
```
---
## Ingress Not Working
```bash
# Check ingress resource status
kubectl describe ingress bifrost
# Check if the ingress controller is running
kubectl get pods -n ingress-nginx -l app.kubernetes.io/name=ingress-nginx
# View ingress controller logs for routing errors
kubectl logs -n ingress-nginx -l app.kubernetes.io/name=ingress-nginx --tail=50
# Verify DNS resolves to the correct load balancer address (providers such as AWS publish .hostname instead of .ip)
nslookup bifrost.yourdomain.com
kubectl get ingress bifrost -o jsonpath='{.status.loadBalancer.ingress[0].ip}'
# Test without TLS first
curl -v http://bifrost.yourdomain.com/health
```
Common causes:
- `ingress.className` not set or set to a class not installed in the cluster
- TLS certificate not issued yet (cert-manager can take up to 60 seconds)
- Service port mismatch — Bifrost listens on `8080` by default
```bash
# Check cert-manager certificate status
kubectl get certificate -l app.kubernetes.io/instance=bifrost
kubectl describe certificate bifrost-tls
```
---
## Secret and Credential Issues
### Provider API key not resolving
If Bifrost logs show `env.OPENAI_API_KEY: not set` or similar:
```bash
# Check the env var is present in the running pod
kubectl exec -it deployment/bifrost -- env | grep OPENAI
# Verify the providerSecrets secret exists with the right key
kubectl get secret provider-api-keys -o yaml
# Check the providerSecrets configuration rendered correctly
kubectl get configmap bifrost-config -o yaml | grep -A5 providers
```
### Encryption key issues
```bash
# Verify the secret exists and contains the right key name
kubectl get secret bifrost-encryption -o yaml
# Check the exact key name matches encryptionKeySecret.key in values
# Default key name is "encryption-key" — if you used "key", set:
# bifrost.encryptionKeySecret.key: "key"
```
---
## High Memory Usage
```bash
# Check current resource usage
kubectl top pods -l app.kubernetes.io/name=bifrost
# Check if OOM kills are happening
kubectl describe pod -l app.kubernetes.io/name=bifrost | grep -A3 "OOMKilled\|Limits"
# View resource requests/limits on running pods
kubectl get pod -l app.kubernetes.io/name=bifrost \
-o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.spec.containers[0].resources}{"\n"}{end}'
```
**Increase resource limits:**
```bash
helm upgrade bifrost bifrost/bifrost \
--reuse-values \
--set resources.limits.memory=4Gi \
--set resources.requests.memory=1Gi
```
**Tune Go runtime** (see [Docker Tuning](/deployment-guides/docker-tuning)):
```yaml
env:
  - name: GOGC
    value: "200" # run GC less often
  - name: GOMEMLIMIT
    value: "3500MiB" # soft memory limit for the Go runtime, set slightly below the container limit
```
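The `3500MiB` figure is roughly 85% of the `4Gi` (4096 MiB) container limit used in the upgrade command on this page. A minimal sketch for deriving a comparable value from a different limit follows; the 85% headroom is an assumption taken from that ratio, and `gomemlimit_mib` is a hypothetical helper rather than anything shipped with Bifrost.

```bash
# Derive a GOMEMLIMIT value from a container memory limit given in MiB.
# Assumption: keep the Go soft limit at ~85% of the container limit so the
# runtime starts collecting more aggressively before the kernel OOM-kills the pod.
# gomemlimit_mib is a hypothetical helper, not part of Bifrost or the chart.
gomemlimit_mib() {
  limit_mib=$1
  percent=${2:-85}
  echo "$(( limit_mib * percent / 100 ))MiB"
}

# 4Gi container limit = 4096 MiB
gomemlimit_mib 4096   # prints 3481MiB
```

Because `GOMEMLIMIT` is a soft limit, the container limit itself still needs enough headroom for non-heap memory.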
---
## High CPU Usage / Latency
```bash
# Check CPU usage
kubectl top pods -l app.kubernetes.io/name=bifrost
# Check if HPA is scaling correctly
kubectl get hpa bifrost
kubectl describe hpa bifrost
```
Common causes:
- `initialPoolSize` too small, so goroutines queue up; increase it to between `500` and `1000`
- `dropExcessRequests: false` with a small pool — queue depth growing unboundedly
```bash
helm upgrade bifrost bifrost/bifrost \
--reuse-values \
--set bifrost.client.initialPoolSize=1000 \
--set bifrost.client.dropExcessRequests=true
```
---
## Autoscaling Issues
### HPA not scaling
```bash
# Check HPA status and current metrics
kubectl describe hpa bifrost
# Verify metrics server is installed
kubectl top nodes
kubectl top pods
# Common fix: metrics server not installed
# Install with:
kubectl apply -f https://github.com/kubernetes-sigs/metrics-server/releases/latest/download/components.yaml
```
### Pods scaling down too aggressively (drops active SSE streams)
The default `scaleDown.stabilizationWindowSeconds: 300` and `preStop` sleep of 15 seconds should prevent this. If streams are still being cut:
```yaml
terminationGracePeriodSeconds: 120 # increase if streams run longer than 90s (120s grace minus the 30s preStop sleep below)
autoscaling:
  behavior:
    scaleDown:
      stabilizationWindowSeconds: 600 # wait 10 min before scaling down
      policies:
        - type: Pods
          value: 1
          periodSeconds: 300 # remove at most 1 pod per 5 min
lifecycle:
  preStop:
    exec:
      command: ["sh", "-c", "sleep 30"] # give load balancer more time to drain
```
```bash
helm upgrade bifrost bifrost/bifrost --reuse-values -f graceful-shutdown-values.yaml
```
---
## SQLite / PVC Issues
### StatefulSet migration (upgrading from chart < v2.0.0)
Older chart versions used a Deployment + manual PVC. v2.0.0 moved SQLite to a StatefulSet. If upgrading:
```bash
# 1. Scale down the old deployment
kubectl scale deployment bifrost --replicas=0
# 2. Note the existing PVC name
kubectl get pvc
# 3. Upgrade, pointing at the existing claim
helm upgrade bifrost bifrost/bifrost \
--reuse-values \
--set storage.persistence.existingClaim=<your-old-pvc-name> \
--set image.tag=v1.4.11
```
### Data lost after upgrade
```bash
# Check if PVCs still exist (they persist after helm uninstall)
kubectl get pvc -l app.kubernetes.io/instance=bifrost
# Re-attach by setting existingClaim
helm upgrade bifrost bifrost/bifrost \
--reuse-values \
--set storage.persistence.existingClaim=<pvc-name>
```
---
## Cluster Mode Issues
### Peers not discovering each other
```bash
# Check gossip port is reachable between pods
kubectl exec -it bifrost-0 -- nc -zv bifrost-1.bifrost-headless 7946
# View gossip-related log lines
kubectl logs -l app.kubernetes.io/name=bifrost --tail=100 | grep -i gossip
# Check the headless service exists
kubectl get svc bifrost-headless
```
For Kubernetes-based discovery, verify the service account has pod list permissions:
```bash
kubectl auth can-i list pods --as=system:serviceaccount:default:bifrost
```
---
## Useful Diagnostic Commands
```bash
# Full state dump for a support ticket
kubectl get all -l app.kubernetes.io/instance=bifrost
kubectl describe pod -l app.kubernetes.io/name=bifrost > pod-describe.txt
kubectl logs -l app.kubernetes.io/name=bifrost --tail=200 > pod-logs.txt
# View the full rendered config.json
kubectl get configmap bifrost-config -o jsonpath='{.data.config\.json}' | jq .
# Check current Helm values (shows all overrides)
helm get values bifrost
# Check Helm release status
helm status bifrost
# View Helm release history
helm history bifrost
```
---
## Still Stuck?
- [GitHub Issues](https://github.com/maximhq/bifrost/issues) — search existing issues or open a new one
- [Enterprise Support](mailto:support@getmaxim.ai) — for enterprise customers with SLA


@@ -0,0 +1,718 @@
---
title: "Values Reference"
description: "Complete reference for Bifrost Helm chart values — key parameters, how to supply them, and links to example files"
icon: "sliders"
---
This page covers every top-level parameter group in the Bifrost Helm chart's `values.yaml`, how to supply values via `--set` vs `-f`, and where to find ready-made example files.
<Note>
The full values schema is available at [https://www.getbifrost.ai/schema](https://www.getbifrost.ai/schema). All `values.yaml` fields map directly to `config.json` fields generated by the chart.
</Note>
## Supplying Values
### One-liner with `--set`
Good for a single field or quick experiments:
```bash
helm install bifrost bifrost/bifrost \
--set image.tag=v1.4.11 \
--set replicaCount=3 \
--set bifrost.client.initialPoolSize=500
```
### Values file with `-f`
Recommended for anything beyond a couple of fields:
```bash
# Create your values file
cat > my-values.yaml <<'EOF'
image:
  tag: "v1.4.11"
replicaCount: 2
bifrost:
  encryptionKey: "your-32-byte-encryption-key-here"
  client:
    initialPoolSize: 500
    enableLogging: true
EOF
# Install
helm install bifrost bifrost/bifrost -f my-values.yaml
# Upgrade later
helm upgrade bifrost bifrost/bifrost -f my-values.yaml
# Upgrade and reuse all previously set values, overriding only one field
helm upgrade bifrost bifrost/bifrost \
--reuse-values \
--set replicaCount=5
```
### Multiple values files
Later files override earlier ones — useful for a base + environment-specific overlay:
```bash
helm install bifrost bifrost/bifrost \
-f base-values.yaml \
-f production-overrides.yaml
```
---
## Key Parameters Reference
### Image
| Parameter | Description | Default |
|-----------|-------------|---------|
| `image.repository` | Container image repository | `docker.io/maximhq/bifrost` |
| `image.tag` | **Required.** Image version (e.g. `v1.4.11`) | `""` |
| `image.pullPolicy` | Image pull policy | `IfNotPresent` |
| `imagePullSecrets` | List of pull secret names for private registries | `[]` |
```bash
# Always specify the tag; the pod will not start without it
helm install bifrost bifrost/bifrost --set image.tag=v1.4.11
```
### Replicas & Autoscaling
| Parameter | Description | Default |
|-----------|-------------|---------|
| `replicaCount` | Static replica count (ignored when HPA is enabled) | `1` |
| `autoscaling.enabled` | Enable Horizontal Pod Autoscaler | `false` |
| `autoscaling.minReplicas` | Minimum replicas | `1` |
| `autoscaling.maxReplicas` | Maximum replicas | `10` |
| `autoscaling.targetCPUUtilizationPercentage` | CPU target for scaling | `80` |
| `autoscaling.targetMemoryUtilizationPercentage` | Memory target for scaling | `80` |
| `autoscaling.behavior.scaleDown.stabilizationWindowSeconds` | Cooldown before scale-down (important for SSE streams) | `300` |
| `autoscaling.behavior.scaleDown.policies[0].value` | Max pods removed per period | `1` |
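
To reason about how these targets translate into replica counts, the Kubernetes HPA documentation gives the core calculation as `desiredReplicas = ceil(currentReplicas * currentMetricValue / desiredMetricValue)`. The sketch below reproduces only that formula with integer arithmetic; it ignores the HPA's tolerance band, the `minReplicas`/`maxReplicas` clamp, and the scale-down policies, and `hpa_desired` is a hypothetical helper name.

```bash
# Core HPA calculation: ceil(currentReplicas * currentMetric / targetMetric).
# Simplification: no tolerance band, no min/max clamp, no behavior policies.
# hpa_desired is a hypothetical helper for illustration only.
hpa_desired() {
  current=$1
  metric=$2
  target=$3
  # Integer ceiling division
  echo $(( (current * metric + target - 1) / target ))
}

# 3 replicas averaging 120% CPU against the default 80% target
hpa_desired 3 120 80   # prints 5
```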
### Resources
| Parameter | Description | Default |
|-----------|-------------|---------|
| `resources.requests.cpu` | CPU request | `500m` |
| `resources.requests.memory` | Memory request | `512Mi` |
| `resources.limits.cpu` | CPU limit | `2000m` |
| `resources.limits.memory` | Memory limit | `2Gi` |
### Service
| Parameter | Description | Default |
|-----------|-------------|---------|
| `service.type` | `ClusterIP`, `LoadBalancer`, or `NodePort` | `ClusterIP` |
| `service.port` | Service port | `8080` |
### Ingress
| Parameter | Description | Default |
|-----------|-------------|---------|
| `ingress.enabled` | Enable ingress | `false` |
| `ingress.className` | Ingress class (e.g. `nginx`, `traefik`) | `""` |
| `ingress.annotations` | Ingress annotations | `{}` |
| `ingress.hosts` | Host rules | see values.yaml |
| `ingress.tls` | TLS configuration | `[]` |
```yaml
ingress:
  enabled: true
  className: nginx
  annotations:
    cert-manager.io/cluster-issuer: letsencrypt-prod
    nginx.ingress.kubernetes.io/proxy-body-size: "100m"
  hosts:
    - host: bifrost.yourdomain.com
      paths:
        - path: /
          pathType: Prefix
  tls:
    - secretName: bifrost-tls
      hosts:
        - bifrost.yourdomain.com
```
### Probes
| Parameter | Description | Default |
|-----------|-------------|---------|
| `livenessProbe.initialDelaySeconds` | Seconds before first liveness check | `30` |
| `livenessProbe.periodSeconds` | Liveness check interval | `30` |
| `readinessProbe.initialDelaySeconds` | Seconds before first readiness check | `10` |
| `readinessProbe.periodSeconds` | Readiness check interval | `10` |
Both probes hit `GET /health`.
### Graceful Shutdown
Bifrost supports long-lived SSE streaming connections. The default `preStop` hook and termination grace period let in-flight streams finish before the pod is killed:
| Parameter | Description | Default |
|-----------|-------------|---------|
| `terminationGracePeriodSeconds` | Total grace period | `60` |
| `lifecycle.preStop.exec.command` | Sleep before SIGTERM so load balancer drains | `["sh", "-c", "sleep 15"]` |
Increase `terminationGracePeriodSeconds` if your typical stream responses take longer than 45 seconds.
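
The 45-second figure comes from subtracting the `preStop` sleep from the grace period, since the sleep counts against the same clock. A small sketch of that arithmetic, with hypothetical helper names (`stream_budget`, `required_grace`) that are not part of the chart:

```bash
# Time left for in-flight streams once the preStop sleep has finished.
# Both helpers are hypothetical and only illustrate the arithmetic.
stream_budget() {
  grace=$1
  prestop=$2
  echo $(( grace - prestop ))
}

# Grace period needed for a given preStop sleep and longest expected stream.
required_grace() {
  prestop=$1
  longest_stream=$2
  echo $(( prestop + longest_stream ))
}

stream_budget 60 15     # prints 45 (chart defaults)
required_grace 15 105   # prints 120
```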
### Service Account
| Parameter | Description | Default |
|-----------|-------------|---------|
| `serviceAccount.create` | Create a dedicated service account | `true` |
| `serviceAccount.annotations` | Annotations (e.g. for IRSA, Workload Identity) | `{}` |
| `serviceAccount.name` | Override the generated name | `""` |
### Pod Scheduling
```yaml
# Spread replicas across nodes
affinity:
  podAntiAffinity:
    requiredDuringSchedulingIgnoredDuringExecution:
      - labelSelector:
          matchLabels:
            app.kubernetes.io/name: bifrost
        topologyKey: kubernetes.io/hostname
# Pin to specific node pool
nodeSelector:
  node-type: ai-workload
# Tolerate GPU taints
tolerations:
  - key: "gpu"
    operator: "Equal"
    value: "true"
    effect: "NoSchedule"
```
### Extra Environment Variables
Three ways to inject env vars:
```yaml
# Inline key/value pairs
env:
  - name: HTTP_PROXY
    value: "http://proxy.corp.example.com:3128"
# Map syntax (appended after env)
extraEnv:
  NO_PROXY: "169.254.169.254,10.0.0.0/8"
# Bulk-load from existing Secrets or ConfigMaps
envFrom:
  - secretRef:
      name: my-corp-secrets
  - configMapRef:
      name: my-app-config
```
### Init Containers
```yaml
initContainers:
  - name: wait-for-db
    image: busybox:1.35
    command: ["sh", "-c", "until nc -z postgres-svc 5432; do sleep 2; done"]
```
---
## Values Examples
The chart ships ready-made example files under [`helm-charts/bifrost/values-examples/`](https://github.com/maximhq/bifrost/tree/main/helm-charts/bifrost/values-examples):
| File | Use case |
|------|----------|
| `sqlite-only.yaml` | Minimal local/dev setup |
| `postgres-only.yaml` | Single-store Postgres |
| `production-ha.yaml` | HA: 3 replicas, Postgres, Weaviate, HPA, Ingress |
| `providers-and-virtual-keys.yaml` | All 23 providers + 7 virtual key patterns |
| `secrets-from-k8s.yaml` | All sensitive values from Kubernetes Secrets |
| `external-postgres.yaml` | Point at an existing Postgres instance |
| `postgres-redis.yaml` | Postgres + Redis vector store |
| `postgres-weaviate.yaml` | Postgres + Weaviate vector store |
| `postgres-qdrant.yaml` | Postgres + Qdrant vector store |
| `semantic-cache-secret-example.yaml` | Semantic cache with secret injection |
| `mixed-backend.yaml` | Config store = postgres, logs store = sqlite |
Install from an example file directly:
```bash
helm install bifrost bifrost/bifrost \
-f https://raw.githubusercontent.com/maximhq/bifrost/main/helm-charts/bifrost/values-examples/production-ha.yaml \
--set image.tag=v1.4.11
```
---
## Helm Operations
### View current values
```bash
helm get values bifrost
```
### Diff before upgrading (requires helm-diff plugin)
```bash
helm diff upgrade bifrost bifrost/bifrost -f my-values.yaml
```
### Rollback
```bash
helm history bifrost
helm rollback bifrost # to previous revision
helm rollback bifrost 2 # to revision 2
```
### Uninstall
```bash
helm uninstall bifrost
# Also remove PVCs (deletes all data)
kubectl delete pvc -l app.kubernetes.io/instance=bifrost
```
---
## All Key Parameters
A quick-reference table of the most commonly used top-level parameters:
| Parameter | Description | Default |
|-----------|-------------|---------|
| `image.tag` | **Required.** Bifrost image version (e.g., `v1.4.11`) | `""` |
| `replicaCount` | Number of replicas | `1` |
| `storage.mode` | Storage backend (`sqlite` or `postgres`) | `sqlite` |
| `storage.persistence.size` | PVC size for SQLite | `10Gi` |
| `postgresql.enabled` | Deploy embedded PostgreSQL | `false` |
| `vectorStore.enabled` | Enable vector store | `false` |
| `vectorStore.type` | Vector store type (`weaviate`, `redis`, `qdrant`, `pinecone`) | `none` |
| `bifrost.encryptionKey` | Optional encryption key (use `encryptionKeySecret` in production). If omitted, data is stored in plaintext. | `""` |
| `ingress.enabled` | Enable ingress | `false` |
| `autoscaling.enabled` | Enable HPA | `false` |
### Secret Reference Parameters
Use existing Kubernetes Secrets instead of plain-text values. Every sensitive field in the chart has a corresponding `existingSecret` / `secretRef` alternative:
| Parameter | Description | Default |
|-----------|-------------|---------|
| `bifrost.encryptionKeySecret.name` | Secret name for encryption key | `""` |
| `bifrost.encryptionKeySecret.key` | Key within the secret | `"encryption-key"` |
| `postgresql.external.existingSecret` | Secret name for PostgreSQL password | `""` |
| `postgresql.external.passwordKey` | Key within the secret | `"password"` |
| `vectorStore.redis.external.existingSecret` | Secret name for Redis password | `""` |
| `vectorStore.redis.external.passwordKey` | Key within the secret | `"password"` |
| `vectorStore.weaviate.external.existingSecret` | Secret name for Weaviate API key | `""` |
| `vectorStore.weaviate.external.apiKeyKey` | Key within the secret | `"api-key"` |
| `vectorStore.qdrant.external.existingSecret` | Secret name for Qdrant API key | `""` |
| `vectorStore.qdrant.external.apiKeyKey` | Key within the secret | `"api-key"` |
| `bifrost.plugins.maxim.secretRef.name` | Secret name for Maxim API key | `""` |
| `bifrost.plugins.maxim.secretRef.key` | Key within the secret | `"api-key"` |
| `bifrost.providerSecrets.<provider>.existingSecret` | Secret name for provider API key | `""` |
| `bifrost.providerSecrets.<provider>.key` | Key within the secret | `"api-key"` |
| `bifrost.providerSecrets.<provider>.envVar` | Environment variable name to inject | `""` |
---
## Advanced Configuration
### Comprehensive Example
A production-ready values file combining the most common settings:
```yaml
# my-values.yaml
image:
  tag: "v1.4.11"
replicaCount: 3
storage:
  mode: postgres
postgresql:
  enabled: true
  auth:
    password: "secure-password" # use existingSecret in production
autoscaling:
  enabled: true
  minReplicas: 3
  maxReplicas: 10
ingress:
  enabled: true
  className: nginx
  hosts:
    - host: bifrost.example.com
      paths:
        - path: /
          pathType: Prefix
bifrost:
  encryptionKeySecret:
    name: "bifrost-encryption"
    key: "key"
  providers:
    openai:
      keys:
        - name: "primary"
          value: "env.OPENAI_API_KEY"
          weight: 1
  providerSecrets:
    openai:
      existingSecret: "provider-api-keys"
      key: "openai-api-key"
      envVar: "OPENAI_API_KEY"
```
```bash
helm install bifrost bifrost/bifrost -f my-values.yaml
```
### Node Affinity & Scheduling
Deploy to specific nodes and spread replicas across hosts:
```yaml
nodeSelector:
  node-type: ai-workload
affinity:
  podAntiAffinity:
    requiredDuringSchedulingIgnoredDuringExecution:
      - labelSelector:
          matchLabels:
            app.kubernetes.io/name: bifrost
        topologyKey: kubernetes.io/hostname
tolerations:
  - key: "gpu"
    operator: "Equal"
    value: "true"
    effect: "NoSchedule"
```
### Deployment & Pod Annotations
Useful for tooling like [Keel](https://keel.sh) for automatic image updates or Datadog APM injection:
```yaml
deploymentAnnotations:
  keel.sh/policy: force
  keel.sh/trigger: poll
podAnnotations:
  ad.datadoghq.com/bifrost.logs: '[{"source":"bifrost","service":"bifrost"}]'
```
---
## Common Patterns
Ready-made values files for the most common deployment scenarios. Each pattern builds on the [quickstart](/deployment-guides/helm).
<Tabs>
<Tab title="Development">
Simple setup for local testing. SQLite, single replica, no autoscaling.
```bash
helm install bifrost bifrost/bifrost \
--set image.tag=v1.4.11 \
--set 'bifrost.providers.openai.keys[0].name=dev-key' \
--set 'bifrost.providers.openai.keys[0].value=sk-your-key' \
--set 'bifrost.providers.openai.keys[0].weight=1'
```
```bash
# Access
kubectl port-forward svc/bifrost 8080:8080
```
</Tab>
<Tab title="Multi-Provider">
Multiple LLM providers with weighted load balancing.
```bash
kubectl create secret generic provider-keys \
--from-literal=openai-api-key='sk-...' \
--from-literal=anthropic-api-key='sk-ant-...' \
--from-literal=gemini-api-key='your-gemini-key'
```
```yaml
# multi-provider.yaml
image:
  tag: "v1.4.11"
bifrost:
  encryptionKey: "your-encryption-key"
  client:
    enableLogging: true
    allowDirectKeys: false
  providers:
    openai:
      keys:
        - name: "openai-primary"
          value: "env.OPENAI_API_KEY"
          weight: 2 # 50% of traffic
    anthropic:
      keys:
        - name: "anthropic-primary"
          value: "env.ANTHROPIC_API_KEY"
          weight: 1 # 25%
    gemini:
      keys:
        - name: "gemini-primary"
          value: "env.GEMINI_API_KEY"
          weight: 1 # 25%
  providerSecrets:
    openai:
      existingSecret: "provider-keys"
      key: "openai-api-key"
      envVar: "OPENAI_API_KEY"
    anthropic:
      existingSecret: "provider-keys"
      key: "anthropic-api-key"
      envVar: "ANTHROPIC_API_KEY"
    gemini:
      existingSecret: "provider-keys"
      key: "gemini-api-key"
      envVar: "GEMINI_API_KEY"
  plugins:
    telemetry:
      enabled: true
    logging:
      enabled: true
```
```bash
helm install bifrost bifrost/bifrost -f multi-provider.yaml
```
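The percentages in the inline comments follow from dividing each weight by the sum of all weights. The sketch below only reproduces that arithmetic, assuming the weights are normalized across the three listed entries as the comments indicate; `weight_share` is a hypothetical helper and does not reflect how Bifrost implements selection internally.

```bash
# Percentage share for one weight out of a set of weights.
# Usage: weight_share <this_weight> <all weights...>
# Assumption: shares are weight / sum(weights), as the comments in
# multi-provider.yaml suggest. weight_share is a hypothetical helper.
weight_share() {
  mine=$1
  shift
  total=0
  for w in "$@"; do
    total=$(( total + w ))
  done
  echo $(( 100 * mine / total ))
}

weight_share 2 2 1 1   # prints 50
weight_share 1 2 1 1   # prints 25
```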
</Tab>
<Tab title="External Database">
Use an existing PostgreSQL instance — RDS, Cloud SQL, Azure Database, or self-managed.
```bash
kubectl create secret generic postgres-credentials \
--from-literal=password='your-external-postgres-password'
```
```yaml
# external-db.yaml
image:
  tag: "v1.4.11"
storage:
  mode: postgres
postgresql:
  enabled: false
  external:
    enabled: true
    host: "your-rds-endpoint.us-east-1.rds.amazonaws.com"
    port: 5432
    user: "bifrost"
    database: "bifrost"
    sslMode: "require"
    existingSecret: "postgres-credentials"
    passwordKey: "password"
bifrost:
  encryptionKey: "your-encryption-key"
  providers:
    openai:
      keys:
        - name: "openai-primary"
          value: "sk-..."
          weight: 1
```
```bash
helm install bifrost bifrost/bifrost -f external-db.yaml
```
</Tab>
<Tab title="AI Workloads">
Semantic response caching for high-volume AI inference.
```bash
kubectl create secret generic bifrost-encryption \
--from-literal=key='your-32-byte-encryption-key'
kubectl create secret generic provider-keys \
--from-literal=openai-api-key='sk-your-key'
```
```yaml
# ai-workload.yaml
image:
  tag: "v1.4.11"
storage:
  mode: postgres
postgresql:
  enabled: true
  auth:
    password: "secure-password"
  primary:
    persistence:
      size: 50Gi
vectorStore:
  enabled: true
  type: weaviate
  weaviate:
    enabled: true
    persistence:
      size: 50Gi
bifrost:
  encryptionKeySecret:
    name: "bifrost-encryption"
    key: "key"
  providers:
    openai:
      keys:
        - name: "openai-primary"
          value: "env.OPENAI_API_KEY"
          weight: 1
  providerSecrets:
    openai:
      existingSecret: "provider-keys"
      key: "openai-api-key"
      envVar: "OPENAI_API_KEY"
  plugins:
    semanticCache:
      enabled: true
      config:
        provider: "openai"
        keys:
          - value: "env.OPENAI_API_KEY"
            weight: 1
        embedding_model: "text-embedding-3-small"
        dimension: 1536
        threshold: 0.85
        ttl: "1h"
        cache_by_model: true
        cache_by_provider: true
```
```bash
helm install bifrost bifrost/bifrost -f ai-workload.yaml
```
</Tab>
<Tab title="Kubernetes Secrets Only">
Zero credentials in values files — all sensitive data in Kubernetes Secrets.
```bash
kubectl create secret generic postgres-credentials \
--from-literal=password='your-postgres-password'
kubectl create secret generic bifrost-encryption \
--from-literal=key='your-encryption-key'
kubectl create secret generic provider-keys \
--from-literal=openai-api-key='sk-...' \
--from-literal=anthropic-api-key='sk-ant-...'
kubectl create secret generic qdrant-credentials \
--from-literal=api-key='your-qdrant-api-key'
```
```yaml
# secrets-only.yaml
image:
  tag: "v1.4.11"
storage:
  mode: postgres
postgresql:
  enabled: false
  external:
    enabled: true
    host: "postgres.example.com"
    port: 5432
    user: "bifrost"
    database: "bifrost"
    sslMode: "require"
    existingSecret: "postgres-credentials"
    passwordKey: "password"
vectorStore:
  enabled: true
  type: qdrant
  qdrant:
    enabled: false
    external:
      enabled: true
      host: "qdrant.example.com"
      port: 6334
      existingSecret: "qdrant-credentials"
      apiKeyKey: "api-key"
bifrost:
  encryptionKeySecret:
    name: "bifrost-encryption"
    key: "key"
  providers:
    openai:
      keys:
        - name: "openai-primary"
          value: "env.OPENAI_API_KEY"
          weight: 1
    anthropic:
      keys:
        - name: "anthropic-primary"
          value: "env.ANTHROPIC_API_KEY"
          weight: 1
  providerSecrets:
    openai:
      existingSecret: "provider-keys"
      key: "openai-api-key"
      envVar: "OPENAI_API_KEY"
    anthropic:
      existingSecret: "provider-keys"
      key: "anthropic-api-key"
      envVar: "ANTHROPIC_API_KEY"
```
```bash
helm install bifrost bifrost/bifrost -f secrets-only.yaml
```
</Tab>
</Tabs>


@@ -0,0 +1,77 @@
---
title: "Install make command"
description: "This guide explains how to install the make command."
icon: "compact-disc"
---
## Windows
### Option A: Chocolatey (easy)
```
# Run in an elevated PowerShell (Run as Administrator)
choco install make
# verify
make --version
```
### Option B: Scoop (no admin needed)
```
# In a normal PowerShell
Set-ExecutionPolicy -Scope CurrentUser RemoteSigned
iwr get.scoop.sh -useb | iex
scoop install make
make --version
```
### Option C: MSYS2 (full Unix-like env)
```
# 1) Install MSYS2 from https://www.msys2.org/
# 2) In "MSYS2 MSYS" terminal:
pacman -Syu # then reopen terminal if asked
pacman -S make
make --version
```
<Note> Visual Studio's `nmake` is a different tool (not GNU make). </Note>
## Ubuntu / Debian
```
sudo apt update
# Pulls in compilers and common build tools, including make
sudo apt install build-essential
# (or just) sudo apt install make
make --version
```
## macOS
### Option A: Xcode Command Line Tools (most common)
```
xcode-select --install # follow the prompt
make --version
```
This provides the `make` that Apple bundles (an older GNU Make, version 3.81), which is fine for most projects.
### Option B: Homebrew (get GNU make ≥ 4.x as gmake)
```
# Install Homebrew if needed: https://brew.sh
brew install make
gmake --version
```
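If a project specifically requires GNU make as `make`, you can alias it in your shell: `echo 'alias make="gmake"' >> ~/.zshrc && source ~/.zshrc`. The alias only applies to interactive zsh sessions; scripts still resolve `make` from your PATH.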
## Troubleshooting tips
- If `make` isn't found, restart your terminal (or on Windows, open a new PowerShell) so your PATH updates.
- Run `which make` (`where make` on Windows) to confirm which binary you're using.
- For Windows builds that depend on Unix tools (sed, grep, etc.), prefer MSYS2 or WSL for a smoother experience.
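To check whether the `make` on your PATH is GNU make, you can inspect the first line of its version output. The sketch below assumes GNU make prints a first line starting with `GNU Make`; `make_flavor` is a hypothetical helper name, and anything else (including empty output) is reported as `other`.

```bash
# Hypothetical helper: classify `make --version` output read from stdin.
# Prints "gnu <version>" for GNU make and "other" for anything else.
make_flavor() {
  local first_line
  IFS= read -r first_line || true
  case "$first_line" in
    "GNU Make "*) echo "gnu ${first_line#GNU Make }" ;;
    *)            echo "other" ;;
  esac
}

# Demo with sample text instead of a real binary
echo "GNU Make 3.81" | make_flavor
```

In practice you would run `make --version 2>/dev/null | make_flavor` (or the same with `gmake`).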


@@ -0,0 +1,444 @@
---
title: "Multinode Deployment"
description: "Deploy multiple Bifrost nodes with shared configuration for high availability in OSS deployments"
icon: "layer-group"
---
## Overview
Running multiple Bifrost nodes provides high availability, load distribution, and fault tolerance for your AI gateway. This guide covers the recommended approach for deploying multiple Bifrost nodes in OSS deployments.
<Warning>
Running multiple OSS Bifrost nodes with a Postgres backend is not supported.
Here is the short technical explanation:
- Bifrost is designed to keep all critical information in memory, including provider configs, API keys, budgets, usage, and traffic distribution.
- Once a node is initialized, it does not read this information back from the database.
- In the Enterprise version, we use a slightly modified version of RAFT to synchronize this state in real time across nodes, while the database acts only as a dumb store.
- Based on our current view, OSS is sufficient for startups and medium-scale teams, and can easily handle around 3,000 to 5,000 RPS on a single instance.
- If you need high availability and enterprise capabilities such as real-time synchronization, the Enterprise plan is the right fit.
- And yes, that is part of how we draw the OSS vs Enterprise line 💰.
</Warning>
### OSS vs Enterprise
| Aspect | OSS Approach | Enterprise Approach |
|--------|--------------|---------------------|
| **Configuration Source** | Shared `config.json` file | Database with P2P sync |
| **Sync Mechanism** | File sharing (ConfigMap, volumes) | Gossip protocol (real-time) |
| **Config Updates** | Modify file + restart nodes | UI/API with automatic propagation |
---
## How It Works
All configuration in Bifrost is loaded into memory at startup. For OSS multinode deployments, the recommended approach is to use `config.json` **without** `config_store` enabled.
### `config.json` as Single Source of Truth
When you deploy without `config_store`:
- **No database involved** - `config.json` is the only configuration source
- **Shared file** - All nodes read from the same `config.json` file
- **Identical configuration** - Since the source is shared, all nodes automatically have the same configuration
- **No sync needed** - The shared file itself ensures consistency
<Frame>
<img src="/media/oss-multinode.png" alt="OSS multi-node setup" />
</Frame>
---
## Why not use `config_store` for Multinode OSS?
Using `config_store` (database-backed configuration) with multiple nodes in OSS creates a **synchronization problem**:
1. **Config changes are local** - When you update configuration via the UI or API, it updates the database and the in-memory config on that specific node only
2. **No propagation mechanism** - Other nodes don't know about the change; they keep their existing in-memory configuration
3. **Nodes become out of sync** - Different nodes end up with different configurations
4. **Restart required** - You'd have to restart all nodes after every config change to bring them back in sync
This defeats the purpose of having database-backed configuration with real-time updates.
<Warning>
Without P2P clustering (Enterprise feature), there's no mechanism to notify other nodes of configuration changes. For OSS multinode deployments, use the shared `config.json` approach instead.
</Warning>
### Enterprise Solution
Bifrost Enterprise includes **P2P clustering** with a gossip protocol that automatically syncs configuration changes across all nodes in real time. See the [Clustering documentation](/enterprise/clustering) for details.
---
## Setting Up Multinode OSS Deployment
### Example config.json
Create a `config.json` with `config_store` disabled. `logs_store` is optional; the example below enables it with PostgreSQL:
<Note>
If you use PostgreSQL for `logs_store`, ensure the target database is UTF8 encoded. See [PostgreSQL UTF8 Requirement](../../quickstart/gateway/setting-up#postgresql-utf8-requirement).
</Note>
```json
{
"$schema": "https://www.getbifrost.ai/schema",
"client": {
"drop_excess_requests": false,
"enable_logging": false
},
"config_store": {
"enabled": false
},
"logs_store": {
"enabled": true,
"type": "postgres",
"config": {...}
},
"providers": {
"openai": {
"keys": [
{
"name": "openai-primary",
"value": "env.OPENAI_API_KEY",
"models": ["gpt-4o", "gpt-4o-mini"],
"weight": 1.0
}
]
},
"anthropic": {
"keys": [
{
"name": "anthropic-primary",
"value": "env.ANTHROPIC_API_KEY",
"models": ["claude-sonnet-4-20250514", "claude-3-5-haiku-20241022"],
"weight": 1.0
}
]
}
}
}
```
<Note>
Notice `config_store` is disabled. This ensures all configuration comes from the file only.
</Note>
### Kubernetes Deployment
Use a ConfigMap to share the same configuration across all pods:
```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: bifrost-config
  namespace: default
data:
  config.json: |
    {
      "$schema": "https://www.getbifrost.ai/schema",
      "client": {
        "drop_excess_requests": false,
        "enable_logging": false
      },
      "config_store": {
        "enabled": false
      },
      "logs_store": {
        "enabled": true,
        "type": "postgres",
        "config": {...}
      },
      "providers": {
        "openai": {
          "keys": [
            {
              "name": "openai-primary",
              "value": "env.OPENAI_API_KEY",
              "models": ["gpt-4o", "gpt-4o-mini"],
              "weight": 1.0
            }
          ]
        }
      }
    }
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: bifrost
  namespace: default
spec:
  replicas: 3
  selector:
    matchLabels:
      app: bifrost
  template:
    metadata:
      labels:
        app: bifrost
    spec:
      containers:
        - name: bifrost
          image: maximhq/bifrost:latest
          ports:
            - containerPort: 8080
              name: http
          env:
            - name: OPENAI_API_KEY
              valueFrom:
                secretKeyRef:
                  name: provider-secrets
                  key: openai-api-key
          volumeMounts:
            - name: config
              # Mount only config.json (same path as the Docker Compose example)
              # so the rest of /app in the image is not hidden by the volume.
              mountPath: /app/config.json
              subPath: config.json
              readOnly: true
          resources:
            requests:
              cpu: 250m
              memory: 256Mi
            limits:
              cpu: 1000m
              memory: 1Gi
          livenessProbe:
            httpGet:
              path: /health
              port: 8080
            initialDelaySeconds: 10
            periodSeconds: 10
          readinessProbe:
            httpGet:
              path: /health
              port: 8080
            initialDelaySeconds: 5
            periodSeconds: 5
      volumes:
        - name: config
          configMap:
            name: bifrost-config
---
apiVersion: v1
kind: Service
metadata:
  name: bifrost
  namespace: default
spec:
  type: LoadBalancer
  selector:
    app: bifrost
  ports:
    - port: 80
      targetPort: 8080
      protocol: TCP
      name: http
```
### Docker Compose
Share the configuration using a bind mount:
```yaml
version: '3.8'
services:
  nginx:
    image: nginx:alpine
    ports:
      - "80:80"
    volumes:
      - ./nginx.conf:/etc/nginx/nginx.conf:ro
    depends_on:
      - bifrost-1
      - bifrost-2
      - bifrost-3
  bifrost-1:
    image: maximhq/bifrost:latest
    environment:
      - OPENAI_API_KEY=${OPENAI_API_KEY}
      - ANTHROPIC_API_KEY=${ANTHROPIC_API_KEY}
    volumes:
      - ./config.json:/app/config.json:ro
    expose:
      - "8080"
  bifrost-2:
    image: maximhq/bifrost:latest
    environment:
      - OPENAI_API_KEY=${OPENAI_API_KEY}
      - ANTHROPIC_API_KEY=${ANTHROPIC_API_KEY}
    volumes:
      - ./config.json:/app/config.json:ro
    expose:
      - "8080"
  bifrost-3:
    image: maximhq/bifrost:latest
    environment:
      - OPENAI_API_KEY=${OPENAI_API_KEY}
      - ANTHROPIC_API_KEY=${ANTHROPIC_API_KEY}
    volumes:
      - ./config.json:/app/config.json:ro
    expose:
      - "8080"
```
**nginx.conf** for load balancing:
```nginx
events {
worker_connections 1024;
}
http {
upstream bifrost {
least_conn;
server bifrost-1:8080;
server bifrost-2:8080;
server bifrost-3:8080;
}
server {
listen 80;
location / {
proxy_pass http://bifrost;
proxy_set_header Host $host;
proxy_set_header X-Real-IP $remote_addr;
proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
proxy_connect_timeout 60s;
proxy_send_timeout 60s;
proxy_read_timeout 60s;
}
location /health {
access_log off;
return 200 "healthy\n";
}
}
}
```
### Bare Metal / VM Deployment
For bare metal or VM deployments, distribute the configuration file using:
- **NFS mount** - Mount a shared NFS directory containing `config.json`
- **rsync** - Sync the config file from a central location to all nodes
- **Configuration management** - Use Ansible, Chef, or Puppet to deploy identical configs
Example with rsync:
```bash
# On config server - push to all nodes
for node in node1 node2 node3; do
rsync -avz /etc/bifrost/config.json $node:/etc/bifrost/config.json
done
# Restart nodes after config update
for node in node1 node2 node3; do
ssh $node "systemctl restart bifrost"
done
```
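Because every node must run from an identical file, it can be worth confirming after a push that the copies really match. The sketch below compares SHA-256 digests of local copies; `configs_match` is a hypothetical helper, and it assumes GNU coreutils `sha256sum` is available (on macOS, `shasum -a 256` would be the closest equivalent).

```bash
# Hypothetical helper: succeed (exit 0) only if all given files have the
# same SHA-256 digest. Returns 2 if no files are given or one is unreadable.
configs_match() {
  local f count
  [ "$#" -ge 1 ] || return 2
  for f in "$@"; do
    [ -r "$f" ] || return 2
  done
  count=$(sha256sum "$@" | awk '{ print $1 }' | sort -u | wc -l)
  [ "$count" -eq 1 ]
}

# Example usage (node names and paths follow the rsync example above):
#   for node in node1 node2 node3; do
#     ssh "$node" cat /etc/bifrost/config.json > "/tmp/$node.json"
#   done
#   configs_match /tmp/node1.json /tmp/node2.json /tmp/node3.json \
#     && echo "configs identical" || echo "configs differ"
```

Running this check before restarting the nodes avoids bringing up replicas with diverging configuration.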
---
## Updating Configuration
To update configuration in a multinode OSS deployment:
1. **Modify the shared `config.json` file**
- Update the ConfigMap (Kubernetes)
- Edit the shared file (Docker Compose / bare metal)
2. **Restart the nodes**
- Rolling restart is supported - nodes can be restarted one at a time
- Each node picks up the new configuration on startup
### Kubernetes Rolling Restart
```bash
# Update ConfigMap
kubectl apply -f configmap.yaml
# Trigger rolling restart
kubectl rollout restart deployment/bifrost
# Watch the rollout
kubectl rollout status deployment/bifrost
```
### Docker Compose Restart
```bash
# After updating config.json
docker-compose restart bifrost-1
docker-compose restart bifrost-2
docker-compose restart bifrost-3
```
---
## Best Practices
### Use Environment Variables for Secrets
Never put API keys directly in `config.json`. Use the `env.` prefix to reference environment variables:
```json
{
"providers": {
"openai": {
"keys": [
{
"value": "env.OPENAI_API_KEY"
}
]
}
}
}
```
Then provide the actual keys via environment variables or Kubernetes secrets.
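Before starting a node, you may want to confirm that every `env.` reference in the file has a matching variable. The sketch below is one way to do that with standard tools; `missing_env_refs` is a hypothetical helper, and it assumes references appear as whole JSON string values of the form `"env.NAME"`.

```bash
# Hypothetical helper: print the names referenced as "env.NAME" in a config
# file that are unset or empty in the current environment.
missing_env_refs() {
  local name
  grep -o '"env\.[A-Za-z_][A-Za-z0-9_]*"' "$1" |
    tr -d '"' |
    sed 's/^env\.//' |
    sort -u |
    while IFS= read -r name; do
      # printenv only sees exported variables, which is what a child
      # process such as Bifrost would inherit.
      [ -n "$(printenv "$name")" ] || echo "$name"
    done
}

# Example usage:
#   missing_env_refs config.json
```

The function prints nothing when every referenced variable is exported and non-empty.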
### Load Balancer Configuration
Always put a load balancer in front of your Bifrost nodes:
- **Kubernetes**: Use a Service with `type: LoadBalancer` or an Ingress
- **Docker/VMs**: Use nginx, HAProxy, or a cloud load balancer
### Health Checks
Configure health checks to ensure traffic only goes to healthy nodes:
- **Liveness endpoint**: `GET /health`
- **Readiness endpoint**: `GET /health`
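Outside Kubernetes, the same endpoint can gate a manual restart: wait until a node answers on `/health` before moving to the next one. The sketch below is a generic retry loop; `wait_until` and the `WAIT_INTERVAL` variable are hypothetical names introduced here, not Bifrost features.

```bash
# Hypothetical helper: retry a command until it succeeds or attempts run out.
# Usage: wait_until ATTEMPTS COMMAND [ARGS...]
# WAIT_INTERVAL (seconds between attempts) defaults to 1.
wait_until() {
  local attempts=$1 i
  shift
  for ((i = 1; i <= attempts; i++)); do
    "$@" >/dev/null 2>&1 && return 0
    [ "$i" -lt "$attempts" ] && sleep "${WAIT_INTERVAL:-1}"
  done
  return 1
}

# Example usage (port 8080 and /health as used throughout this guide):
#   wait_until 30 curl -fsS http://localhost:8080/health \
#     && echo "node is healthy" || echo "node did not become healthy"
```

`curl -f` makes HTTP error responses count as failures, so the loop only stops early on a successful response.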
### Resource Allocation
For production deployments:
```yaml
resources:
  requests:
    cpu: 500m
    memory: 512Mi
  limits:
    cpu: 2000m
    memory: 2Gi
```
---
## Summary
| Scenario | Recommendation |
|----------|----------------|
| Single node | Use `config_store` for UI access |
| Multinode OSS | Use shared `config.json` without `config_store` |
| Multinode Enterprise | Use P2P clustering with `config_store` |
For OSS multinode deployments, the shared `config.json` approach provides a simple, reliable way to keep all nodes in sync without the complexity of database synchronization.


@@ -0,0 +1,185 @@
---
title: "Nginx reverse proxy"
description: "Run Bifrost behind NGINX with streaming-safe settings for SSE and WebSocket traffic"
icon: "shuffle"
---
This guide shows how to put NGINX in front of Bifrost for TLS termination, centralized routing, and load balancing.
<Note>
Incoming reverse-proxy behavior is configured in your infrastructure layer (NGINX/Ingress), not in `config.json`.
</Note>
---
## When to use this setup
- You want HTTPS termination in front of Bifrost.
- You run multiple Bifrost replicas and want L7 load balancing.
- You need one stable gateway URL for SDKs and agent clients.
---
## Docker Compose deployment
Use this when Bifrost and NGINX run as services in the same Compose project.
```yaml
services:
  nginx:
    image: nginx:alpine
    ports:
      - "80:80"
    volumes:
      - ./nginx.conf:/etc/nginx/nginx.conf:ro
    depends_on:
      - bifrost-1
      - bifrost-2
      - bifrost-3
  bifrost-1:
    image: maximhq/bifrost:latest
    expose:
      - "8080"
  bifrost-2:
    image: maximhq/bifrost:latest
    expose:
      - "8080"
  bifrost-3:
    image: maximhq/bifrost:latest
    expose:
      - "8080"
```
```nginx
events {
worker_connections 1024;
}
http {
upstream bifrost_backend {
least_conn;
server bifrost-1:8080;
server bifrost-2:8080;
server bifrost-3:8080;
}
server {
listen 80;
location / {
proxy_pass http://bifrost_backend;
# Preserve original request context
proxy_set_header Host $host;
proxy_set_header X-Real-IP $remote_addr;
proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
proxy_set_header X-Forwarded-Proto $scheme;
# Keep streaming responses stable
proxy_http_version 1.1;
proxy_buffering off;
proxy_request_buffering off;
proxy_read_timeout 300s;
proxy_send_timeout 300s;
}
}
}
```
If you expose WebSocket traffic through the same endpoint, add upgrade headers in the same `location /` block:
```nginx
proxy_set_header Upgrade $http_upgrade;
proxy_set_header Connection "upgrade";
```
---
## VM or bare-metal deployment
Use the same NGINX `location /` settings as above, and point `upstream` servers to hostnames/IPs reachable from that VM.
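If the list of Bifrost hosts changes often, the `upstream` block can be generated rather than edited by hand. The sketch below prints a block in the same shape as the Compose example; `render_upstream` is a hypothetical helper, the IP addresses are placeholders, and it assumes every node listens on port 8080.

```bash
# Hypothetical helper: print an NGINX upstream block for the given hosts,
# assuming each Bifrost node listens on port 8080 as in the examples above.
render_upstream() {
  local host
  printf 'upstream bifrost_backend {\n    least_conn;\n'
  for host in "$@"; do
    printf '    server %s:8080;\n' "$host"
  done
  printf '}\n'
}

# Placeholder addresses; replace with hostnames/IPs reachable from the VM
render_upstream 10.0.0.11 10.0.0.12
```

The output can be written to a file that the main `nginx.conf` pulls in with an `include` directive inside the `http` block.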
If you terminate TLS directly on NGINX, add:
```nginx
listen 443 ssl;
server_name bifrost.example.com;
ssl_certificate /etc/nginx/certs/fullchain.pem;
ssl_certificate_key /etc/nginx/certs/privkey.pem;
```
---
## Kubernetes (NGINX Ingress)
If you deploy with Helm, use Ingress values instead of a standalone NGINX config:
```yaml
ingress:
  enabled: true
  className: nginx
  annotations:
    cert-manager.io/cluster-issuer: letsencrypt-prod
    nginx.ingress.kubernetes.io/proxy-body-size: "100m"
    nginx.ingress.kubernetes.io/proxy-read-timeout: "300"
    nginx.ingress.kubernetes.io/proxy-send-timeout: "300"
    nginx.ingress.kubernetes.io/proxy-buffering: "off"
  hosts:
    - host: bifrost.example.com
      paths:
        - path: /
          pathType: Prefix
  tls:
    - secretName: bifrost-tls
      hosts:
        - bifrost.example.com
```
---
## Verify the proxy path
```bash
# Docker Compose: render the final Compose file and validate its syntax
docker compose config
# Kubernetes: validate ingress manifest locally
kubectl apply --dry-run=client -f ingress.yaml
```
```bash
# Health check through reverse proxy
curl -i http://bifrost.example.com/health
# Streaming check through NGINX
curl -N http://bifrost.example.com/v1/chat/completions \
-H "Content-Type: application/json" \
-d '{
"model": "gpt-4o-mini",
"stream": true,
"messages": [{"role": "user", "content": "test stream"}]
}'
```
If streaming responses arrive in delayed bursts, confirm buffering is disabled in NGINX or Ingress annotations.
---
## Related guides
- [Helm quick start](/deployment-guides/helm)
- [Helm values reference](/deployment-guides/helm/values)
- [Multinode deployment](/deployment-guides/how-to/multinode)
---
## Runnable example files
Use the complete Docker Compose + Helm/Kubernetes example in the repository:
- [docker-compose.yml](https://github.com/maximhq/bifrost/blob/main/examples/configs/withnginxreverseproxy/docker-compose.yml)
- [helm-values.yaml](https://github.com/maximhq/bifrost/blob/main/examples/configs/withnginxreverseproxy/helm-values.yaml)
- [k8s-ingress.yaml](https://github.com/maximhq/bifrost/blob/main/examples/configs/withnginxreverseproxy/k8s-ingress.yaml)

File diff suppressed because it is too large