first commit

2026-04-26 21:52:23 +03:00
commit 880f412e2c
2662 changed files with 866266 additions and 0 deletions
--- a/docs/deployment-guides/config-json.mdx
+++ b/docs/deployment-guides/config-json.mdx
@@ -0,0 +1,416 @@
+---
+title: "Quick Start"
+description: "Configure Bifrost using a config.json file — GitOps-friendly, no-UI deployments, and multinode OSS setups"
+icon: "file-code"
+---
+
+<Note>
+**Full schema reference:** [`https://www.getbifrost.ai/schema`](https://www.getbifrost.ai/schema)
+</Note>
+
+`config.json` lets you configure every aspect of Bifrost through a single declarative file. It is the right choice for GitOps workflows, CI/CD pipelines, headless deployments, and multinode OSS setups where a central configuration file is shared across all replicas.
+
+---
+
+## Two Configuration Modes
+
+Bifrost supports **two mutually exclusive modes**. You cannot run both at the same time.
+
+| Mode | When | Behaviour |
+|------|------|-----------|
+| **Web UI / database** | No `config.json`, or `config.json` with `config_store` enabled | Full UI available, configuration stored in SQLite or PostgreSQL |
+| **File-based (`config.json`)** | `config.json` present, `config_store` disabled | UI disabled, all config loaded from file at startup, restart required for changes |
+
+<Note>
+See [Setting Up](/quickstart/gateway/setting-up#two-configuration-modes) for a full explanation of both modes and how `config_store` bootstrapping works.
+</Note>
+
+---
+
+## Minimal Working Example
+
+```json
+{
+  "$schema": "https://www.getbifrost.ai/schema",
+  "encryption_key": "env.BIFROST_ENCRYPTION_KEY",
+  "client": {
+    "drop_excess_requests": false,
+    "enable_logging": true
+  },
+  "providers": {
+    "openai": {
+      "keys": [
+        {
+          "name": "openai-primary",
+          "value": "env.OPENAI_API_KEY",
+          "models": ["*"],
+          "weight": 1.0
+        }
+      ]
+    }
+  },
+  "config_store": {
+    "enabled": false
+  }
+}
+```
+
+Save this as `config.json` in your app directory and start Bifrost:
+
+```bash
+# NPX (export OPENAI_API_KEY and BIFROST_ENCRYPTION_KEY in your shell first)
+npx -y @maximhq/bifrost -app-dir ./data
+
+# Docker
+docker run -p 8080:8080 \
+  -v "$(pwd)/data:/app/data" \
+  -e OPENAI_API_KEY=sk-... \
+  -e BIFROST_ENCRYPTION_KEY=your-32-byte-key \
+  maximhq/bifrost
+```
+
+Make your first call:
+
+```bash
+curl http://localhost:8080/v1/chat/completions \
+  -H "Content-Type: application/json" \
+  -d '{
+    "model": "openai/gpt-4o-mini",
+    "messages": [{"role": "user", "content": "Hello!"}]
+  }'
+```
+
+---
+
+## Environment Variable References
+
+Never put secrets directly in `config.json`. Use the `env.` prefix to reference any environment variable:
+
+```json
+{
+  "encryption_key": "env.BIFROST_ENCRYPTION_KEY",
+  "providers": {
+    "openai": {
+      "keys": [
+        {
+          "name": "primary",
+          "value": "env.OPENAI_API_KEY",
+          "weight": 1.0
+        }
+      ]
+    }
+  }
+}
+```
+
+Set the actual values through your deployment platform — shell environment, Docker `-e`, Kubernetes Secrets mounted as env vars, or a `.env` file.
+
+---
+
+## Schema Validation
+
+Add `$schema` to every `config.json` for IDE autocomplete and inline validation:
+
+```json
+{
+  "$schema": "https://www.getbifrost.ai/schema"
+}
+```
+
+Editors (VS Code, JetBrains, Neovim with LSP) will show completions and flag invalid fields as you type.
+
+---
+
+## Production Example
+
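+A production-oriented file with PostgreSQL storage for both the config store and the logs store, two providers, and hardened client settings (auth enforced on inference, direct keys disabled, restricted CORS origins):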
+
+```json
+{
+  "$schema": "https://www.getbifrost.ai/schema",
+  "encryption_key": "env.BIFROST_ENCRYPTION_KEY",
+
+  "client": {
+    "initial_pool_size": 500,
+    "drop_excess_requests": true,
+    "enable_logging": true,
+    "log_retention_days": 90,
+    "enforce_auth_on_inference": true,
+    "allow_direct_keys": false,
+    "allowed_origins": ["https://app.yourcompany.com"]
+  },
+
+  "providers": {
+    "openai": {
+      "keys": [
+        {
+          "name": "openai-primary",
+          "value": "env.OPENAI_API_KEY",
+          "models": ["*"],
+          "weight": 1.0
+        }
+      ],
+      "network_config": {
+        "default_request_timeout_in_seconds": 120,
+        "max_retries": 3
+      }
+    },
+    "anthropic": {
+      "keys": [
+        {
+          "name": "anthropic-primary",
+          "value": "env.ANTHROPIC_API_KEY",
+          "models": ["*"],
+          "weight": 1.0
+        }
+      ]
+    }
+  },
+
+  "config_store": {
+    "enabled": true,
+    "type": "postgres",
+    "config": {
+      "host": "env.PG_HOST",
+      "port": "5432",
+      "user": "env.PG_USER",
+      "password": "env.PG_PASSWORD",
+      "db_name": "bifrost",
+      "ssl_mode": "require"
+    }
+  },
+
+  "logs_store": {
+    "enabled": true,
+    "type": "postgres",
+    "config": {
+      "host": "env.PG_HOST",
+      "port": "5432",
+      "user": "env.PG_USER",
+      "password": "env.PG_PASSWORD",
+      "db_name": "bifrost",
+      "ssl_mode": "require"
+    }
+  }
+}
+```
+
+---
+
+## Enterprise Example: Postgres + etcd + Access Profiles
+
+Use this pattern when you want enterprise access-profile configuration to be seeded directly from `config.json`, while running clustered nodes with etcd discovery.
+
+```json
+{
+  "$schema": "https://www.getbifrost.ai/schema",
+  "cluster_config": {
+    "enabled": true,
+    "discovery": {
+      "enabled": true,
+      "type": "etcd",
+      "service_name": "bifrost-cluster",
+      "etcd_endpoints": ["http://localhost:2379"]
+    }
+  },
+  "config_store": {
+    "enabled": true,
+    "type": "postgres",
+    "config": {
+      "host": "localhost",
+      "port": "5432",
+      "user": "postgres",
+      "password": "env.PG_PASSWORD",
+      "db_name": "bifrost-config",
+      "ssl_mode": "disable"
+    }
+  },
+  "logs_store": {
+    "enabled": true,
+    "type": "postgres",
+    "config": {
+      "host": "localhost",
+      "port": "5432",
+      "user": "postgres",
+      "password": "env.PG_PASSWORD",
+      "db_name": "bifrost-config",
+      "ssl_mode": "disable"
+    }
+  },
+  "mcp": {
+    "client_configs": [
+      {
+        "client_id": "echo_http",
+        "name": "echo_http",
+        "connection_type": "http",
+        "connection_string": "https://mcpplaygroundonline.com/mcp-echo-server",
+        "auth_type": "none",
+        "tools_to_execute": ["echo"]
+      }
+    ]
+  },
+  "access_profiles": [
+    {
+      "name": "platform-default",
+      "description": "Default profile for enterprise access-profile testing",
+      "is_active": true,
+      "tags": ["platform", "test"],
+      "provider_configs": [
+        {
+          "provider_name": "openai",
+          "all_models_allowed": false,
+          "allowed_models": ["gpt-4o-mini"]
+        }
+      ]
+    },
+    {
+      "name": "platform-readonly-mcp",
+      "description": "Profile for validating MCP include/exclude behavior",
+      "is_active": true,
+      "tags": ["mcp", "test"],
+      "mcp_servers": [
+        {
+          "mcp_server_id": "echo_http"
+        }
+      ],
+      "mcp_tool_overrides": [
+        {
+          "mcp_client_id": "echo_http",
+          "tool_name": "echo",
+          "action": "include"
+        },
+        {
+          "mcp_client_id": "github",
+          "tool_name": "create_pull_request",
+          "action": "exclude"
+        }
+      ]
+    }
+  ]
+}
+```
+
+<Note>
+`access_profiles` is an enterprise capability. For OSS-only deployments, use `governance.virtual_keys` and related governance resources instead.
+</Note>
+
+---
+
+## Example Configs
+
+Ready-to-use reference configurations from the [examples/configs](https://github.com/maximhq/bifrost/tree/main/examples/configs) directory on GitHub:
+
+<AccordionGroup>
+
+<Accordion title="Minimal / File-only">
+
+| Example | Description |
+|---------|-------------|
+| [noconfigstorenologstore](https://github.com/maximhq/bifrost/blob/main/examples/configs/noconfigstorenologstore/config.json) | Bare-minimum file-only mode — no database, no UI, providers loaded from file |
+| [partial](https://github.com/maximhq/bifrost/blob/main/examples/configs/partial/config.json) | SQLite config store with a minimal provider setup |
+| [v1compat](https://github.com/maximhq/bifrost/blob/main/examples/configs/v1compat/config.json) | `"version": 1` for v1.4.x array semantics (empty = allow all) |
+
+</Accordion>
+
+<Accordion title="Storage">
+
+| Example | Description |
+|---------|-------------|
+| [withconfigstore](https://github.com/maximhq/bifrost/blob/main/examples/configs/withconfigstore/config.json) | SQLite config store (Web UI enabled) |
+| [withconfigstorelogsstorepostgres](https://github.com/maximhq/bifrost/blob/main/examples/configs/withconfigstorelogsstorepostgres/config.json) | PostgreSQL for both config store and logs store |
+| [withlogstore](https://github.com/maximhq/bifrost/blob/main/examples/configs/withlogstore/config.json) | SQLite logs store |
+| [withobjectstorages3](https://github.com/maximhq/bifrost/blob/main/examples/configs/withobjectstorages3/config.json) | S3 object storage offload for logs |
+| [withobjectstoragegcs](https://github.com/maximhq/bifrost/blob/main/examples/configs/withobjectstoragegcs/config.json) | GCS object storage offload for logs |
+| [withvectorstoreweaviate](https://github.com/maximhq/bifrost/blob/main/examples/configs/withvectorstoreweaviate/config.json) | Weaviate vector store (with [docker-compose](https://github.com/maximhq/bifrost/blob/main/examples/configs/withvectorstoreweaviate/docker-compose.yml)) |
+
+</Accordion>
+
+<Accordion title="Semantic Cache">
+
+| Example | Description |
+|---------|-------------|
+| [withsemanticcache](https://github.com/maximhq/bifrost/blob/main/examples/configs/withsemanticcache/config.json) | Semantic cache backed by Weaviate |
+| [withsemanticcachevalkey](https://github.com/maximhq/bifrost/blob/main/examples/configs/withsemanticcachevalkey/config.json) | Semantic cache backed by Valkey / Redis |
+
+</Accordion>
+
+<Accordion title="Governance">
+
+| Example | Description |
+|---------|-------------|
+| [withauth](https://github.com/maximhq/bifrost/blob/main/examples/configs/withauth/config.json) | Admin username/password auth (`governance.auth_config`) |
+| [withvirtualkeys](https://github.com/maximhq/bifrost/blob/main/examples/configs/withvirtualkeys/config.json) | Virtual keys with provider/model allowlists |
+| [withteamscustomers](https://github.com/maximhq/bifrost/blob/main/examples/configs/withteamscustomers/config.json) | Teams and customers with budgets and rate limits |
+| [withroutingrules](https://github.com/maximhq/bifrost/blob/main/examples/configs/withroutingrules/config.json) | CEL-based routing rules for dynamic provider/model selection |
+| [withpricingoverridesnostore](https://github.com/maximhq/bifrost/blob/main/examples/configs/withpricingoverridesnostore/config.json) | Pricing overrides in file-only mode |
+| [withpricingoverridessqlite](https://github.com/maximhq/bifrost/blob/main/examples/configs/withpricingoverridessqlite/config.json) | Pricing overrides with SQLite config store |
+
+</Accordion>
+
+<Accordion title="Observability">
+
+| Example | Description |
+|---------|-------------|
+| [withobservability](https://github.com/maximhq/bifrost/blob/main/examples/configs/withobservability/config.json) | Prometheus metrics (telemetry always active, custom labels via `client.prometheus_labels`) |
+| [withprompushgateway](https://github.com/maximhq/bifrost/blob/main/examples/configs/withprompushgateway/config.json) | Prometheus Push Gateway for multi-instance deployments |
+| [withotel](https://github.com/maximhq/bifrost/blob/main/examples/configs/withotel/config.json) | OpenTelemetry traces and metrics |
+
+</Accordion>
+
+<Accordion title="Plugins & Advanced">
+
+| Example | Description |
+|---------|-------------|
+| [withdynamicplugin](https://github.com/maximhq/bifrost/blob/main/examples/configs/withdynamicplugin/config.json) | Loading a custom `.so` plugin at startup |
+| [withcompat](https://github.com/maximhq/bifrost/blob/main/examples/configs/withcompat/config.json) | SDK compatibility shims (`should_drop_params`, `convert_text_to_chat`) |
+| [withframework](https://github.com/maximhq/bifrost/blob/main/examples/configs/withframework/config.json) | Custom model pricing catalog URL and sync interval |
+| [withlargepayload](https://github.com/maximhq/bifrost/blob/main/examples/configs/withlargepayload/config.json) | Large payload optimization (streaming without full materialisation) |
+| [withwebsocket](https://github.com/maximhq/bifrost/blob/main/examples/configs/withwebsocket/config.json) | WebSocket / Realtime API connection pool tuning |
+| [withnginxreverseproxy](https://github.com/maximhq/bifrost/blob/main/examples/configs/withnginxreverseproxy/config.json) | 3-node Bifrost behind NGINX reverse proxy (includes [docker-compose](https://github.com/maximhq/bifrost/blob/main/examples/configs/withnginxreverseproxy/docker-compose.yml), [nginx.conf](https://github.com/maximhq/bifrost/blob/main/examples/configs/withnginxreverseproxy/nginx.conf), [helm values](https://github.com/maximhq/bifrost/blob/main/examples/configs/withnginxreverseproxy/helm-values.yaml), and [k8s ingress](https://github.com/maximhq/bifrost/blob/main/examples/configs/withnginxreverseproxy/k8s-ingress.yaml)) |
+| [withpostgresmcpclientsinconfig](https://github.com/maximhq/bifrost/blob/main/examples/configs/withpostgresmcpclientsinconfig/config.json) | MCP client definitions seeded from config.json with PostgreSQL store |
+| [encryptionmigration](https://github.com/maximhq/bifrost/blob/main/examples/configs/encryptionmigration/config.json) | Migrating to a new encryption key |
+
+</Accordion>
+
+</AccordionGroup>
+
+---
+
+## Configuration Guides
+
+<CardGroup cols={2}>
+  <Card title="Schema Reference" icon="brackets-curly" href="/deployment-guides/config-json/schema-reference">
+    Every top-level key, its type, default, and where it is documented
+  </Card>
+  <Card title="Client Configuration" icon="gear" href="/deployment-guides/config-json/client">
+    Pool size, logging, CORS, header filtering, compat shims, MCP settings
+  </Card>
+  <Card title="Provider Setup" icon="plug" href="/deployment-guides/config-json/providers">
+    OpenAI, Anthropic, Azure, Bedrock, Vertex, Groq, self-hosted
+  </Card>
+  <Card title="Storage" icon="database" href="/deployment-guides/config-json/storage">
+    config_store, logs_store, vector_store — SQLite, PostgreSQL, object storage
+  </Card>
+  <Card title="Plugins" icon="puzzle-piece" href="/deployment-guides/config-json/plugins">
+    Semantic cache, OTel, Maxim, Datadog, custom plugins
+  </Card>
+  <Card title="Cluster" icon="circle-nodes" href="/deployment-guides/config-json/cluster">
+    Cluster mode with static peers or discovery backends (enterprise)
+  </Card>
+  <Card title="Governance" icon="shield-check" href="/deployment-guides/config-json/governance">
+    Virtual keys, budgets, rate limits, routing rules, admin auth
+  </Card>
+  <Card title="Guardrails" icon="shield-halved" href="/deployment-guides/config-json/guardrails">
+    Content moderation providers and CEL-based rules (enterprise)
+  </Card>
+</CardGroup>
+
+---
+
+## Next Steps
+
+1. Configure [provider keys](/providers/supported-providers/overview)
+2. Enable [plugins](/plugins/getting-started)
+3. Set up [observability](/features/observability/default)
+4. Configure [governance](/features/governance/virtual-keys)
+5. Deploy [multiple nodes](/deployment-guides/how-to/multinode) with a shared `config.json`
--- a/docs/deployment-guides/config-json/client.mdx
+++ b/docs/deployment-guides/config-json/client.mdx
@@ -0,0 +1,276 @@
+---
+title: "Client Configuration"
+description: "Configure the Bifrost client in config.json — connection pool, logging, CORS, header filtering, compat shims, and MCP settings"
+icon: "gear"
+---
+
+The `client` block controls how Bifrost manages its internal worker pool, request logging, authentication enforcement, header policies, SDK compatibility shims, and MCP agent behaviour.
+
+---
+
+## Connection Pool
+
+| Field | Type | Default | Description |
+|-------|------|---------|-------------|
+| `initial_pool_size` | integer | `300` | Pre-allocated worker goroutines per provider queue |
+| `drop_excess_requests` | boolean | `false` | Drop requests when queue is full instead of waiting (returns HTTP 429) |
+
+A larger pool reduces latency spikes under burst load at the cost of higher baseline memory. `500–1000` is a common starting point for production workloads with multiple providers.
+
+```json
+{
+  "client": {
+    "initial_pool_size": 1000,
+    "drop_excess_requests": true
+  }
+}
+```
+
+---
+
+## Request & Response Logging
+
+| Field | Type | Default | Description |
+|-------|------|---------|-------------|
+| `enable_logging` | boolean | — | Log all LLM requests and responses |
+| `disable_content_logging` | boolean | `false` | Strip message content from logs (keeps metadata only) |
+| `log_retention_days` | integer | `365` | Days to retain log entries in the store |
+| `logging_headers` | array of strings | `[]` | HTTP request headers to capture in log metadata |
+
+Set `disable_content_logging: true` for HIPAA / PCI compliance workloads where message content must not be persisted.
+
+```json
+{
+  "client": {
+    "enable_logging": true,
+    "disable_content_logging": true,
+    "log_retention_days": 90,
+    "logging_headers": ["x-request-id", "x-user-id"]
+  }
+}
+```
+
+---
+
+## Security & CORS
+
+| Field | Type | Default | Description |
+|-------|------|---------|-------------|
+| `allowed_origins` | array | `["*"]` | CORS allowed origins (use URIs or `"*"`) |
+| `allow_direct_keys` | boolean | `false` | Allow callers to pass provider keys directly in requests |
+| `enforce_auth_on_inference` | boolean | `false` | Require auth (virtual key, API key, or user token) on `/v1/*` inference routes |
+| `max_request_body_size_mb` | integer | `100` | Maximum allowed request body size in MB |
+| `whitelisted_routes` | array of strings | `[]` | Routes that bypass auth middleware |
+| `allowed_headers` | array of strings | `[]` | Additional headers permitted for CORS and WebSocket |
+
+```json
+{
+  "client": {
+    "allowed_origins": [
+      "https://app.yourcompany.com",
+      "https://admin.yourcompany.com"
+    ],
+    "allow_direct_keys": false,
+    "enforce_auth_on_inference": true,
+    "max_request_body_size_mb": 50,
+    "whitelisted_routes": ["/health", "/metrics"]
+  }
+}
+```
+
+---
+
+## Header Filtering
+
+Controls which `x-bf-eh-*` extra headers are forwarded to upstream LLM providers.
+
+| Field | Type | Default | Description |
+|-------|------|---------|-------------|
+| `header_filter_config.allowlist` | array of strings | `[]` | Only these headers are forwarded (whitelist mode) |
+| `header_filter_config.denylist` | array of strings | `[]` | These headers are always blocked |
+| `required_headers` | array of strings | `[]` | Headers that must be present on every request (rejected with 400 if missing) |
+
+When both `allowlist` and `denylist` are empty, all `x-bf-eh-*` headers pass through. Specifying an `allowlist` enables strict whitelist mode — only listed headers are forwarded.
+
+```json
+{
+  "client": {
+    "header_filter_config": {
+      "allowlist": [
+        "x-bf-eh-anthropic-version",
+        "x-bf-eh-openai-beta"
+      ],
+      "denylist": []
+    },
+    "required_headers": ["x-request-id"]
+  }
+}
+```
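+
+For the opposite case, a sketch of denylist-only filtering is shown below. The header name is a hypothetical placeholder, and the behaviour described assumes that an empty `allowlist` leaves whitelist mode off, as the text above suggests.
+
+```json
+{
+  "client": {
+    "header_filter_config": {
+      "allowlist": [],
+      "denylist": ["x-bf-eh-internal-debug"]
+    }
+  }
+}
+```
+
+Under that assumption, every `x-bf-eh-*` header except the listed one would still be forwarded upstream.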
+
+---
+
+## Compat Shims
+
+Compatibility flags that let Bifrost silently adapt request/response shapes for SDK integrations.
+
+| Field | Type | Default | Description |
+|-------|------|---------|-------------|
+| `compat.convert_text_to_chat` | boolean | `false` | Wrap legacy `/v1/completions` text requests as chat messages |
+| `compat.convert_chat_to_responses` | boolean | `false` | Translate chat completions to Responses API format |
+| `compat.should_drop_params` | boolean | `false` | Silently drop unsupported parameters instead of erroring |
+| `compat.should_convert_params` | boolean | `false` | Auto-convert parameter values across provider schemas |
+
+```json
+{
+  "client": {
+    "compat": {
+      "should_drop_params": true,
+      "convert_text_to_chat": true
+    }
+  }
+}
+```
+
+---
+
+## MCP Agent Settings
+
+| Field | Type | Default | Description |
+|-------|------|---------|-------------|
+| `mcp_agent_depth` | integer | `10` | Maximum tool-call recursion depth for MCP agent mode |
+| `mcp_tool_execution_timeout` | integer | `30` | Timeout per MCP tool execution in seconds |
+| `mcp_code_mode_binding_level` | string | — | Code mode binding level: `"server"` or `"tool"` |
+| `mcp_tool_sync_interval` | integer | `10` | Global tool sync interval in minutes (`0` = disabled) |
+| `mcp_disable_auto_tool_inject` | boolean | `false` | When `true`, MCP tools are not automatically injected into requests |
+
+```json
+{
+  "client": {
+    "mcp_agent_depth": 15,
+    "mcp_tool_execution_timeout": 60,
+    "mcp_tool_sync_interval": 10
+  }
+}
+```
+
+---
+
+## Async Jobs & Health Checks
+
+| Field | Type | Default | Description |
+|-------|------|---------|-------------|
+| `async_job_result_ttl` | integer | `3600` | TTL (seconds) for async job results |
+| `disable_db_pings_in_health` | boolean | `false` | Exclude database connectivity from `/health` endpoint checks |
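+
+As a minimal sketch, the two fields above could be set together as follows. It assumes both live under the `client` block, as `async_job_result_ttl` does in the full example on this page; the values are illustrative, not recommendations.
+
+```json
+{
+  "client": {
+    "async_job_result_ttl": 7200,
+    "disable_db_pings_in_health": true
+  }
+}
+```
+
+With these values, async job results would be kept for two hours, and `/health` would no longer check database connectivity.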
+
+---
+
+## Prometheus Labels
+
+Add custom labels to every Prometheus metric emitted by Bifrost:
+
+```json
+{
+  "client": {
+    "prometheus_labels": ["environment=production", "region=us-east-1"]
+  }
+}
+```
+
+---
+
+## Authentication
+
+`governance.auth_config` protects the Bifrost dashboard and management API with username/password auth.
+
+| Field | Type | Default | Description |
+|-------|------|---------|-------------|
+| `is_enabled` | boolean | `false` | Enable username/password auth |
+| `admin_username` | string | — | Admin username |
+| `admin_password` | string | — | Admin password (use `env.` reference) |
+| `disable_auth_on_inference` | boolean | `false` | Skip auth check on `/v1/*` inference routes |
+
+```json
+{
+  "governance": {
+    "auth_config": {
+      "is_enabled": true,
+      "admin_username": "env.BIFROST_ADMIN_USERNAME",
+      "admin_password": "env.BIFROST_ADMIN_PASSWORD",
+      "disable_auth_on_inference": false
+    }
+  }
+}
+```
+
+<Note>
+A top-level `auth_config` is also accepted for backwards compatibility, but `governance.auth_config` is the preferred location.
+</Note>
+
+---
+
+## Encryption Key
+
+```json
+{
+  "encryption_key": "env.BIFROST_ENCRYPTION_KEY"
+}
+```
+
+- Accepts any string; Bifrost derives a 32-byte AES-256 key using Argon2id.
+- Can also be set via the `BIFROST_ENCRYPTION_KEY` environment variable.
+- Once set and the database is populated, the key cannot be changed without clearing the database.
+- Omitting the key stores data in plain text, which is not recommended for production.
+
+---
+
+## Full Example
+
+```json
+{
+  "$schema": "https://www.getbifrost.ai/schema",
+  "encryption_key": "env.BIFROST_ENCRYPTION_KEY",
+
+  "governance": {
+    "auth_config": {
+      "is_enabled": true,
+      "admin_username": "env.BIFROST_ADMIN_USERNAME",
+      "admin_password": "env.BIFROST_ADMIN_PASSWORD",
+      "disable_auth_on_inference": false
+    }
+  },
+
+  "client": {
+    "initial_pool_size": 1000,
+    "drop_excess_requests": true,
+
+    "enable_logging": true,
+    "disable_content_logging": false,
+    "log_retention_days": 90,
+    "logging_headers": ["x-request-id", "x-user-id"],
+
+    "allowed_origins": ["https://app.yourcompany.com"],
+    "allow_direct_keys": false,
+    "enforce_auth_on_inference": true,
+    "max_request_body_size_mb": 100,
+
+    "header_filter_config": {
+      "allowlist": [],
+      "denylist": []
+    },
+    "required_headers": [],
+
+    "compat": {
+      "should_drop_params": false
+    },
+
+    "prometheus_labels": ["environment=production"],
+
+    "mcp_agent_depth": 10,
+    "mcp_tool_execution_timeout": 30,
+
+    "async_job_result_ttl": 3600
+  }
+}
+```
--- a/docs/deployment-guides/config-json/cluster.mdx
+++ b/docs/deployment-guides/config-json/cluster.mdx
@@ -0,0 +1,154 @@
+---
+title: "Cluster"
+description: "Configure enterprise cluster mode in config.json using peers or automatic discovery"
+icon: "circle-nodes"
+---
+
+<Warning>
+`cluster_config` is an enterprise capability. OSS builds ignore this section.
+
+</Warning>
+
+`cluster_config` enables multi-node Bifrost enterprise clustering with gossip-based membership and optional automatic node discovery.
+
+You can form a cluster in two ways:
+
+- Define static `peers` (`host:port`)
+- Enable `discovery` with one of: `kubernetes`, `dns`, `udp`, `consul`, `etcd`, `mdns`
+
+<Tip>
+At least one of `peers` or `discovery.enabled: true` must be configured when `cluster_config.enabled` is true.
+</Tip>
+
+---
+
+## Minimal Runnable Config
+
+```json
+{
+  "cluster_config": {
+    "enabled": true,
+    "discovery": {
+      "enabled": true,
+      "type": "mdns",
+      "service_name": "bifrost-cluster"
+    }
+  }
+}
+```
+
+Use this for local testing. At startup, cluster init requires either:
+
+- non-empty `peers`, or
+- `discovery.enabled: true`
+
+If neither is set, cluster initialization fails.
+
+---
+
+## Static Peers
+
+```json
+{
+  "cluster_config": {
+    "enabled": true,
+    "region": "us-east-1",
+    "peers": [
+      "10.0.1.10:10101",
+      "10.0.1.11:10101"
+    ],
+    "gossip": {
+      "port": 10101,
+      "config": {
+        "timeout_seconds": 10,
+        "success_threshold": 3,
+        "failure_threshold": 3
+      }
+    }
+  }
+}
+```
+
+---
+
+## Discovery Example (etcd)
+
+```json
+{
+  "cluster_config": {
+    "enabled": true,
+    "region": "us-east-1",
+    "gossip": {
+      "port": 10101,
+      "config": {
+        "timeout_seconds": 10,
+        "success_threshold": 3,
+        "failure_threshold": 3
+      }
+    },
+    "discovery": {
+      "enabled": true,
+      "type": "etcd",
+      "service_name": "bifrost-cluster",
+      "etcd_endpoints": [
+        "http://etcd-1:2379",
+        "http://etcd-2:2379"
+      ],
+      "dial_timeout": "10s"
+    }
+  }
+}
+```
+
+---
+
+## Field Reference
+
+### `cluster_config`
+
+| Field | Type | Description |
+|-------|------|-------------|
+| `enabled` | boolean | Enables cluster mode |
+| `region` | string | Region label for this node (defaults to `"unknown"` at runtime when omitted) |
+| `peers` | array of strings | Static peer addresses in `host:port` format |
+| `gossip` | object | Gossip/memberlist settings |
+| `discovery` | object | Automatic node discovery settings |
+
+### `cluster_config.gossip`
+
+| Field | Type | Description |
+|-------|------|-------------|
+| `port` | integer | Gossip port for this node |
+| `config.timeout_seconds` | integer | Liveness check timeout, in seconds |
+| `config.success_threshold` | integer | Number of successful checks before a node is considered healthy |
+| `config.failure_threshold` | integer | Number of failed checks before a node is considered unhealthy |
+
+### `cluster_config.discovery`
+
+| Field | Type | Description |
+|-------|------|-------------|
+| `enabled` | boolean | Enables discovery process |
+| `type` | string | `kubernetes`, `dns`, `udp`, `consul`, `etcd`, `mdns` |
+| `service_name` | string | Service identifier (required for `consul`, `etcd`, and `udp`; typically set for `mdns`; optional for `kubernetes` and `dns`) |
+| `bind_port` | integer | Port appended to discovered hosts if missing |
+| `dial_timeout` | string | Go duration string (`"5s"`, `"30s"`, `"1m"`) |
+| `allowed_address_space` | array of strings | CIDR filters for discovered nodes |
+| `k8s_namespace` | string | Kubernetes namespace for pod discovery |
+| `k8s_label_selector` | string | Kubernetes label selector |
+| `dns_names` | array of strings | DNS names to resolve |
+| `udp_broadcast_port` | integer | UDP broadcast port (required for `udp`) |
+| `consul_address` | string | Consul address |
+| `etcd_endpoints` | array of strings | etcd endpoint URLs |
+| `mdns_service` | string | Optional mDNS service type override (e.g. `"_bifrost-cluster._tcp"`) |
+
+<Note>
+For `discovery.type: "mdns"`, `service_name` is sufficient for most setups. When `mdns_service` is omitted, Bifrost derives the mDNS service type as `"_<service_name>._tcp"`. If you set `mdns_service`, it **overrides** the derived value and is used for both mDNS registration and browsing.
+</Note>
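+
+A sketch of the override described in the note above, using the service type from the field reference as an illustrative value:
+
+```json
+{
+  "cluster_config": {
+    "enabled": true,
+    "discovery": {
+      "enabled": true,
+      "type": "mdns",
+      "service_name": "bifrost-cluster",
+      "mdns_service": "_bifrost-cluster._tcp"
+    }
+  }
+}
+```
+
+Here `mdns_service` matches what Bifrost would derive from `service_name` anyway, so it only illustrates where the field goes. Set a different value only when nodes must register and browse under a custom service type.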
+
+<Warning>
+For `discovery.type: "udp"`, configure both `udp_broadcast_port` and `allowed_address_space`.
+</Warning>
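+
+A minimal sketch of a UDP discovery block that sets both fields. The port number and CIDR range are hypothetical placeholders; `service_name` is included because the field reference lists it as required for `udp`.
+
+```json
+{
+  "cluster_config": {
+    "enabled": true,
+    "discovery": {
+      "enabled": true,
+      "type": "udp",
+      "service_name": "bifrost-cluster",
+      "udp_broadcast_port": 9999,
+      "allowed_address_space": ["10.0.0.0/8"]
+    }
+  }
+}
+```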
+
+---
+
+For discovery-method deep dives and deployment patterns, see [Enterprise Clustering](/enterprise/clustering).
--- a/docs/deployment-guides/config-json/governance.mdx
+++ b/docs/deployment-guides/config-json/governance.mdx
@@ -0,0 +1,333 @@
+---
+title: "Governance"
+description: "Seed virtual keys, budgets, rate limits, routing rules, and admin auth in config.json"
+icon: "shield-check"
+---
+
+The `governance` block lets you seed all governance resources directly in `config.json`. On startup, Bifrost loads these into the configuration store. This is the recommended approach for GitOps workflows where governance state is managed as code.
+
+<Note>
+**Governance enforcement is always active** in OSS; you do not need a plugin entry to enable it. To require a virtual key on every inference request, set `client.enforce_auth_on_inference: true`. This setting acts as the global default for inference auth: a more specific flag, such as `governance.auth_config.disable_auth_on_inference`, overrides it, and when no such override is set, `client.enforce_auth_on_inference` applies.
+</Note>
+
+---
+
+## Admin Authentication
+
+Protect the Bifrost dashboard and management API with username/password auth:
+
+```json
+{
+  "governance": {
+    "auth_config": {
+      "is_enabled": true,
+      "admin_username": "env.BIFROST_ADMIN_USERNAME",
+      "admin_password": "env.BIFROST_ADMIN_PASSWORD",
+      "disable_auth_on_inference": false
+    }
+  }
+}
+```
+
+| Field | Default | Description |
+|-------|---------|-------------|
+| `is_enabled` | `false` | Enable admin username/password auth |
+| `admin_username` | — | Admin username (supports `env.` prefix) |
+| `admin_password` | — | Admin password (supports `env.` prefix) |
+| `disable_auth_on_inference` | `false` | Skip auth check on `/v1/*` inference routes |
+
+---
+
+## Virtual Keys
+
+Virtual keys are issued to clients and act as scoped API tokens. Each key specifies which providers, models, and API keys the bearer is allowed to use.
+
+```json
+{
+  "governance": {
+    "virtual_keys": [
+      {
+        "id": "vk-team-platform",
+        "name": "platform-team",
+        "value": "env.VK_PLATFORM_TEAM",
+        "is_active": true,
+        "provider_configs": [
+          {
+            "provider": "openai",
+            "allowed_models": ["gpt-4o", "gpt-4o-mini"],
+            "key_ids": ["*"],
+            "weight": 1
+          },
+          {
+            "provider": "anthropic",
+            "allowed_models": ["*"],
+            "key_ids": ["*"],
+            "weight": 1
+          }
+        ]
+      }
+    ]
+  }
+}
+```
+
+### Virtual Key Fields
+
+| Field | Required | Description |
+|-------|----------|-------------|
+| `id` | Yes | Unique virtual key ID (referenced by budgets / rate limits) |
+| `name` | Yes | Human-readable name |
+| `value` | No | The key token sent by clients (use `env.` prefix). Auto-generated if omitted |
+| `is_active` | No | Default `true`. Set `false` to disable without deleting |
+| `team_id` | No | Associate with a team (mutually exclusive with `customer_id`) |
+| `customer_id` | No | Associate with a customer |
+| `rate_limit_id` | No | Attach a rate limit |
+| `calendar_aligned` | No | Snap budget resets to day/week/month/year boundaries |
+| `provider_configs` | No | Allowed provider/model/key combinations (empty = deny all) |
+
+### Provider Config Fields
+
+| Field | Required | Description |
+|-------|----------|-------------|
+| `provider` | Yes | Provider name (e.g. `"openai"`) |
+| `allowed_models` | No | Model allow-list. `["*"]` = all models; `[]` = deny all |
+| `key_ids` | No | Provider key names allowed for this VK. `["*"]` = all keys; `[]` = deny all. Use key `name` values (not UUIDs) in `config.json` |
+| `weight` | No | Load-balancing weight when multiple provider configs are present |
+| `rate_limit_id` | No | Attach a per-provider-config rate limit |
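+
+As a sketch of how `allowed_models` and `key_ids` combine, the virtual key below may use only the provider key named `openai-primary`, and only for `gpt-4o-mini`. The IDs `vk-ci-bot` and `team-ml` and the `VK_CI_BOT` variable are illustrative assumptions; `openai-primary` must match a key `name` under `providers.openai.keys`, and `team-ml` is assumed to be defined under `governance.teams`.
+
+```json
+{
+  "governance": {
+    "virtual_keys": [
+      {
+        "id": "vk-ci-bot",
+        "name": "ci-bot",
+        "value": "env.VK_CI_BOT",
+        "team_id": "team-ml",
+        "provider_configs": [
+          {
+            "provider": "openai",
+            "allowed_models": ["gpt-4o-mini"],
+            "key_ids": ["openai-primary"],
+            "weight": 1
+          }
+        ]
+      }
+    ]
+  }
+}
+```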
+
+---
+
+## Budgets
+
+Budgets cap cumulative spend (in USD) for a virtual key or provider config. Spend accumulates until the window set by `reset_duration` elapses, then the counter resets:
+
+```json
+{
+  "governance": {
+    "budgets": [
+      {
+        "id": "budget-platform-monthly",
+        "max_limit": 500.00,
+        "reset_duration": "1M",
+        "virtual_key_id": "vk-team-platform"
+      }
+    ]
+  }
+}
+```
+
+| Field | Required | Description |
+|-------|----------|-------------|
+| `id` | Yes | Unique budget ID |
+| `max_limit` | Yes | Maximum spend in USD |
+| `reset_duration` | Yes | Window length: `"30s"`, `"5m"`, `"1h"`, `"1d"`, `"1w"`, `"1M"`, `"1Y"` |
+| `virtual_key_id` | No | Attach to a virtual key (mutually exclusive with `provider_config_id`) |
+| `provider_config_id` | No | Attach to a provider config ID |
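+
+The sketch below pairs a daily budget with a virtual key whose resets are calendar-aligned. It assumes that `calendar_aligned` on the virtual key governs the budgets attached to it, as the Virtual Key Fields table suggests. The IDs and the `VK_BATCH_JOBS` variable are illustrative.
+
+```json
+{
+  "governance": {
+    "budgets": [
+      {
+        "id": "budget-batch-daily",
+        "max_limit": 25.00,
+        "reset_duration": "1d",
+        "virtual_key_id": "vk-batch-jobs"
+      }
+    ],
+    "virtual_keys": [
+      {
+        "id": "vk-batch-jobs",
+        "name": "batch-jobs",
+        "value": "env.VK_BATCH_JOBS",
+        "calendar_aligned": true,
+        "provider_configs": [
+          { "provider": "openai", "allowed_models": ["*"], "key_ids": ["*"], "weight": 1 }
+        ]
+      }
+    ]
+  }
+}
+```
+
+If alignment works as described, this budget would reset at the day boundary rather than 24 hours after the window opened.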
+
+---
+
+## Rate Limits
+
+Rate limits cap the number of requests or tokens within a window. Each counter resets after its own `request_reset_duration` or `token_reset_duration`:
+
+```json
+{
+  "governance": {
+    "rate_limits": [
+      {
+        "id": "rl-platform-hourly",
+        "request_max_limit": 1000,
+        "request_reset_duration": "1h",
+        "token_max_limit": 1000000,
+        "token_reset_duration": "1h"
+      }
+    ]
+  }
+}
+```
+
+| Field | Required | Description |
+|-------|----------|-------------|
+| `id` | Yes | Unique rate limit ID |
+| `request_max_limit` | No | Maximum requests in window |
+| `request_reset_duration` | No | Window for request counter |
+| `token_max_limit` | No | Maximum tokens (input + output) in window |
+| `token_reset_duration` | No | Window for token counter |
+
+Attach a rate limit to a virtual key via `virtual_keys[].rate_limit_id`, or to a provider config via `virtual_keys[].provider_configs[].rate_limit_id`.
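+
+A minimal sketch of both attachment points follows. All IDs are illustrative, and how overlapping limits interact is not stated here, so treat the counters as independent only as an assumption. Because every limit field is optional, one limit caps only requests and the other only tokens.
+
+```json
+{
+  "governance": {
+    "rate_limits": [
+      { "id": "rl-vk-requests", "request_max_limit": 600, "request_reset_duration": "1h" },
+      { "id": "rl-openai-tokens", "token_max_limit": 200000, "token_reset_duration": "1h" }
+    ],
+    "virtual_keys": [
+      {
+        "id": "vk-support-bot",
+        "name": "support-bot",
+        "rate_limit_id": "rl-vk-requests",
+        "provider_configs": [
+          {
+            "provider": "openai",
+            "allowed_models": ["gpt-4o-mini"],
+            "key_ids": ["*"],
+            "weight": 1,
+            "rate_limit_id": "rl-openai-tokens"
+          }
+        ]
+      }
+    ]
+  }
+}
+```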
+
+---
+
+## Routing Rules
+
+Routing rules dynamically select the provider and model for each request based on a [CEL](https://cel.dev) expression. They are evaluated in priority order before the request is dispatched.
+
+```json
+{
+  "governance": {
+    "routing_rules": [
+      {
+        "id": "route-gpt4-to-azure",
+        "name": "Redirect GPT-4o to Azure",
+        "cel_expression": "request.model == 'gpt-4o'",
+        "targets": [
+          { "provider": "azure", "model": "gpt-4o", "weight": 1.0 }
+        ]
+      },
+      {
+        "id": "route-cost-split",
+        "name": "Split traffic 70/30 between providers",
+        "cel_expression": "true",
+        "targets": [
+          { "provider": "openai",    "weight": 0.7 },
+          { "provider": "anthropic", "weight": 0.3 }
+        ]
+      }
+    ]
+  }
+}
+```
+
+### Rule Fields
+
+| Field | Required | Description |
+|-------|----------|-------------|
+| `id` | Yes | Unique rule ID |
+| `name` | Yes | Human-readable name |
+| `cel_expression` | No | CEL expression. `"true"` matches every request |
+| `targets` | Yes | Weighted target list (weights must sum to `1.0`) |
+| `enabled` | No | Default `true` |
+| `priority` | No | Evaluation order within scope — lower numbers run first |
+| `scope` | No | `"global"` (default), `"team"`, `"customer"`, `"virtual_key"` |
+| `scope_id` | Conditional | Required when `scope` is not `"global"` |
+| `chain_rule` | No | If `true`, re-evaluates the chain after this rule matches |
+| `fallbacks` | No | Ordered fallback provider list if primary target fails |
+
+### Target Fields
+
+| Field | Required | Description |
+|-------|----------|-------------|
+| `weight` | Yes | Fraction of traffic (all weights in a rule must sum to `1.0`) |
+| `provider` | No | Target provider. Omit to keep the incoming request's provider |
+| `model` | No | Target model. Omit to keep the incoming request's model |
+| `key_id` | No | Pin a specific API key by name |
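+
+The optional rule fields can be combined as in the following sketch: a rule scoped to a single virtual key, evaluated early through a low `priority`, pinned to one provider key, with a fallback provider. `vk-team-platform` comes from the Virtual Keys example; the rule ID and the key name `azure-primary` are assumptions and must match your own configuration.
+
+```json
+{
+  "governance": {
+    "routing_rules": [
+      {
+        "id": "route-platform-gpt4o",
+        "name": "Platform team: GPT-4o via Azure",
+        "enabled": true,
+        "priority": 10,
+        "scope": "virtual_key",
+        "scope_id": "vk-team-platform",
+        "cel_expression": "request.model == 'gpt-4o'",
+        "targets": [
+          { "provider": "azure", "model": "gpt-4o", "key_id": "azure-primary", "weight": 1.0 }
+        ],
+        "fallbacks": ["openai"]
+      }
+    ]
+  }
+}
+```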
+
+---
+
+## Customers & Teams
+
+Define organizational entities and attach budgets or rate limits to them:
+
+```json
+{
+  "governance": {
+    "customers": [
+      {
+        "id": "customer-acme",
+        "name": "Acme Corp",
+        "budget_id": "budget-acme-monthly",
+        "rate_limit_id": "rl-acme-hourly"
+      }
+    ],
+    "teams": [
+      {
+        "id": "team-ml",
+        "name": "ML Team",
+        "customer_id": "customer-acme",
+        "budget_id": "budget-team-ml"
+      }
+    ]
+  }
+}
+```
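+
+The `budget_id` and `rate_limit_id` values above must refer to entries defined elsewhere under `governance`. A sketch of matching definitions follows. The limits and durations are placeholder values, and it assumes that a budget referenced through `budget_id` needs neither `virtual_key_id` nor `provider_config_id`, since both are optional.
+
+```json
+{
+  "governance": {
+    "budgets": [
+      { "id": "budget-acme-monthly", "max_limit": 2000.00, "reset_duration": "1M" },
+      { "id": "budget-team-ml", "max_limit": 500.00, "reset_duration": "1M" }
+    ],
+    "rate_limits": [
+      {
+        "id": "rl-acme-hourly",
+        "request_max_limit": 10000,
+        "request_reset_duration": "1h"
+      }
+    ]
+  }
+}
+```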
+
+---
+
+## Full Governance Example
+
+```json
+{
+  "$schema": "https://www.getbifrost.ai/schema",
+  "encryption_key": "env.BIFROST_ENCRYPTION_KEY",
+
+  "client": {
+    "enforce_auth_on_inference": true
+  },
+
+  "governance": {
+    "auth_config": {
+      "is_enabled": true,
+      "admin_username": "env.BIFROST_ADMIN_USERNAME",
+      "admin_password": "env.BIFROST_ADMIN_PASSWORD"
+    },
+
+    "budgets": [
+      {
+        "id": "budget-platform",
+        "max_limit": 1000.00,
+        "reset_duration": "1M",
+        "virtual_key_id": "vk-platform"
+      }
+    ],
+
+    "rate_limits": [
+      {
+        "id": "rl-platform",
+        "request_max_limit": 5000,
+        "request_reset_duration": "1h",
+        "token_max_limit": 5000000,
+        "token_reset_duration": "1h"
+      }
+    ],
+
+    "virtual_keys": [
+      {
+        "id": "vk-platform",
+        "name": "platform-key",
+        "value": "env.VK_PLATFORM",
+        "is_active": true,
+        "rate_limit_id": "rl-platform",
+        "provider_configs": [
+          {
+            "provider": "openai",
+            "allowed_models": ["*"],
+            "key_ids": ["*"],
+            "weight": 1
+          }
+        ]
+      }
+    ],
+
+    "routing_rules": [
+      {
+        "id": "fallback-to-anthropic",
+        "name": "Fallback on error",
+        "cel_expression": "true",
+        "targets": [{ "provider": "openai", "weight": 1.0 }],
+        "fallbacks": ["anthropic"]
+      }
+    ]
+  },
+
+  "providers": {
+    "openai": {
+      "keys": [{ "name": "openai-primary", "value": "env.OPENAI_API_KEY", "models": ["*"], "weight": 1.0 }]
+    },
+    "anthropic": {
+      "keys": [{ "name": "anthropic-primary", "value": "env.ANTHROPIC_API_KEY", "models": ["*"], "weight": 1.0 }]
+    }
+  },
+
+  "config_store": {
+    "enabled": true,
+    "type": "postgres",
+    "config": {
+      "host": "env.PG_HOST",
+      "port": "5432",
+      "user": "env.PG_USER",
+      "password": "env.PG_PASSWORD",
+      "db_name": "bifrost"
+    }
+  }
+}
+```
--- a/docs/deployment-guides/config-json/guardrails.mdx
+++ b/docs/deployment-guides/config-json/guardrails.mdx
@@ -0,0 +1,291 @@
+---
+title: "Guardrails"
+description: "Configure content moderation and policy enforcement in config.json using guardrails_config"
+icon: "shield-halved"
+---
+
+<Note>
+Guardrails are an **enterprise-only** feature and require the enterprise Bifrost image.
+</Note>
+
+Guardrails are configured under `guardrails_config` in `config.json`. The configuration has two parts:
+
+- **`guardrail_providers`** — the backends that perform the checks. Rules link to providers by `id`.
+- **`guardrail_rules`** — CEL expressions that control when and where providers are invoked.
+
+---
+
+## Providers
+
+<Tabs>
+<Tab title="Regex">
+
+Runs entirely in-process with no external dependency. Patterns use RE2 syntax. Supports optional per-pattern flags: `i` (case-insensitive), `m` (multiline), `s` (dot-all).
+
+```json
+{
+  "guardrails_config": {
+    "guardrail_providers": [
+      {
+        "id": 1,
+        "provider_name": "regex",
+        "policy_name": "block-secrets",
+        "enabled": true,
+        "timeout": 5,
+        "config": {
+          "patterns": [
+            { "pattern": "sk-[A-Za-z0-9]{20,}", "description": "OpenAI API key" },
+            { "pattern": "AKIA[0-9A-Z]{16}", "description": "AWS access key" },
+            { "pattern": "gh[ps]_[A-Za-z0-9]{36}", "description": "GitHub token", "flags": "i" }
+          ],
+          "mode": "block"
+        }
+      }
+    ]
+  }
+}
+```
+
+</Tab>
+<Tab title="AWS Bedrock">
+
+```json
+{
+  "guardrails_config": {
+    "guardrail_providers": [
+      {
+        "id": 2,
+        "provider_name": "bedrock",
+        "policy_name": "content-filter",
+        "enabled": true,
+        "timeout": 15,
+        "config": {
+          "guardrail_arn": "arn:aws:bedrock:us-east-1:123456789012:guardrail/abc123",
+          "guardrail_version": "DRAFT",
+          "region": "us-east-1",
+          "access_key": "env.AWS_ACCESS_KEY_ID",
+          "secret_key": "env.AWS_SECRET_ACCESS_KEY"
+        }
+      }
+    ]
+  }
+}
+```
+
+</Tab>
+<Tab title="Azure Content Safety">
+
+```json
+{
+  "guardrails_config": {
+    "guardrail_providers": [
+      {
+        "id": 3,
+        "provider_name": "azure",
+        "policy_name": "azure-content-safety",
+        "enabled": true,
+        "timeout": 10,
+        "config": {
+          "endpoint": "https://your-resource.cognitiveservices.azure.com",
+          "api_key": "env.AZURE_CONTENT_SAFETY_KEY",
+          "analyze_enabled": true,
+          "analyze_severity_threshold": "medium",
+          "jailbreak_shield_enabled": true,
+          "indirect_attack_shield_enabled": true,
+          "copyright_enabled": false,
+          "text_blocklist_enabled": false,
+          "blocklist_names": []
+        }
+      }
+    ]
+  }
+}
+```
+
+`analyze_severity_threshold` accepts `"low"`, `"medium"`, or `"high"`.
+
+</Tab>
+<Tab title="Gray Swan">
+
+```json
+{
+  "guardrails_config": {
+    "guardrail_providers": [
+      {
+        "id": 4,
+        "provider_name": "grayswan",
+        "policy_name": "grayswan-jailbreak",
+        "enabled": true,
+        "timeout": 15,
+        "config": {
+          "api_key": "env.GRAYSWAN_API_KEY",
+          "violation_threshold": 0.7,
+          "reasoning_mode": "standard",
+          "policy_id": "",
+          "policy_ids": [],
+          "rules": {}
+        }
+      }
+    ]
+  }
+}
+```
+
+</Tab>
+</Tabs>
+
+### Provider Fields
+
+| Field | Required | Description |
+|-------|----------|-------------|
+| `id` | Yes | Unique integer ID — referenced by rules via `provider_config_ids` |
+| `provider_name` | Yes | Backend: `"regex"`, `"bedrock"`, `"azure"`, `"grayswan"` |
+| `policy_name` | Yes | Human-readable policy label |
+| `enabled` | Yes | `true` to activate |
+| `timeout` | No | Execution timeout in seconds |
+| `config` | No | Provider-specific configuration object |
+
+---
+
+## Rules
+
+Each rule pairs a CEL expression with the providers to invoke, and fires when the expression evaluates to `true`. Available CEL variables:
+
+| Variable | Type | Description |
+|----------|------|-------------|
+| `model` | `string` | Model name from the request |
+| `provider` | `string` | Provider name (e.g. `"openai"`) |
+| `headers` | `map<string,string>` | HTTP request headers |
+| `params` | `map<string,string>` | Query parameters |
+| `customer` | `string` | Customer ID |
+| `team` | `string` | Team ID |
+| `user` | `string` | User ID |
+
+```json
+{
+  "guardrails_config": {
+    "guardrail_rules": [
+      {
+        "id": 101,
+        "name": "block-secrets-input",
+        "description": "Block prompts containing credentials",
+        "enabled": true,
+        "cel_expression": "true",
+        "apply_to": "input",
+        "sampling_rate": 100,
+        "timeout": 10,
+        "provider_config_ids": [1]
+      },
+      {
+        "id": 102,
+        "name": "content-safety-gpt4o-output",
+        "enabled": true,
+        "cel_expression": "model == 'gpt-4o'",
+        "apply_to": "output",
+        "sampling_rate": 100,
+        "timeout": 15,
+        "provider_config_ids": [3]
+      },
+      {
+        "id": 103,
+        "name": "grayswan-openai-partial",
+        "enabled": true,
+        "cel_expression": "provider == 'openai'",
+        "apply_to": "input",
+        "sampling_rate": 50,
+        "timeout": 20,
+        "provider_config_ids": [4]
+      }
+    ]
+  }
+}
+```
+
+### Rule Fields
+
+| Field | Required | Description |
+|-------|----------|-------------|
+| `id` | Yes | Unique integer ID |
+| `name` | Yes | Human-readable name |
+| `description` | No | Optional description |
+| `enabled` | Yes | `true` to activate |
+| `cel_expression` | Yes | CEL boolean expression. `"true"` matches every request |
+| `apply_to` | Yes | `"input"`, `"output"`, or `"both"` |
+| `sampling_rate` | No | `0`–`100`; percentage of requests to evaluate (default: `100`) |
+| `timeout` | No | Rule timeout in seconds |
+| `provider_config_ids` | No | `id` values of providers to invoke when this rule matches. Multiple providers run in parallel |
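+
+To illustrate the CEL variables and parallel providers, the sketch below runs the regex and Azure providers (IDs `1` and `3` from the Providers examples) on 25% of input traffic from one team in production. The `x-env` header and the `team-ml` ID are hypothetical. The `in` check guards against a missing header key; whether header names are case-normalized is not stated here.
+
+```json
+{
+  "guardrails_config": {
+    "guardrail_rules": [
+      {
+        "id": 104,
+        "name": "ml-team-prod-input",
+        "description": "Sampled input checks for one team in production",
+        "enabled": true,
+        "cel_expression": "team == 'team-ml' && 'x-env' in headers && headers['x-env'] == 'prod'",
+        "apply_to": "input",
+        "sampling_rate": 25,
+        "timeout": 15,
+        "provider_config_ids": [1, 3]
+      }
+    ]
+  }
+}
+```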
+
+---
+
+## Full Example
+
+```json
+{
+  "$schema": "https://www.getbifrost.ai/schema",
+  "encryption_key": "env.BIFROST_ENCRYPTION_KEY",
+
+  "providers": {
+    "openai": {
+      "keys": [{ "name": "primary", "value": "env.OPENAI_API_KEY", "models": ["*"], "weight": 1.0 }]
+    }
+  },
+
+  "guardrails_config": {
+    "guardrail_providers": [
+      {
+        "id": 1,
+        "provider_name": "regex",
+        "policy_name": "block-secrets",
+        "enabled": true,
+        "timeout": 5,
+        "config": {
+          "patterns": [
+            { "pattern": "sk-[A-Za-z0-9]{20,}", "description": "OpenAI API key" },
+            { "pattern": "AKIA[0-9A-Z]{16}", "description": "AWS access key" }
+          ],
+          "mode": "block"
+        }
+      },
+      {
+        "id": 2,
+        "provider_name": "azure",
+        "policy_name": "content-safety",
+        "enabled": true,
+        "timeout": 10,
+        "config": {
+          "endpoint": "https://your-resource.cognitiveservices.azure.com",
+          "api_key": "env.AZURE_CONTENT_SAFETY_KEY",
+          "analyze_enabled": true,
+          "analyze_severity_threshold": "medium",
+          "jailbreak_shield_enabled": true,
+          "indirect_attack_shield_enabled": false
+        }
+      }
+    ],
+    "guardrail_rules": [
+      {
+        "id": 101,
+        "name": "block-secrets-input",
+        "description": "Block prompts leaking credentials",
+        "enabled": true,
+        "cel_expression": "true",
+        "apply_to": "input",
+        "sampling_rate": 100,
+        "timeout": 10,
+        "provider_config_ids": [1]
+      },
+      {
+        "id": 102,
+        "name": "content-safety-both",
+        "description": "Azure content safety on all traffic",
+        "enabled": true,
+        "cel_expression": "true",
+        "apply_to": "both",
+        "sampling_rate": 100,
+        "timeout": 15,
+        "provider_config_ids": [2]
+      }
+    ]
+  }
+}
+```
--- a/docs/deployment-guides/config-json/plugins.mdx
+++ b/docs/deployment-guides/config-json/plugins.mdx
@@ -0,0 +1,318 @@
+---
+title: "Plugins"
+description: "Configure Bifrost plugins in config.json — semantic cache, OpenTelemetry, Maxim, Datadog, and custom plugins"
+icon: "puzzle-piece"
+---
+
+<Note>
+**The `plugins` array only controls explicitly opt-in plugins**: `semantic_cache`, `otel`, `maxim`, `datadog` (enterprise), and custom plugins.
+
+**Telemetry, logging, and governance are auto-loaded built-ins** — they are always active and configured via the `client` block and dedicated top-level keys, not the `plugins` array.
+</Note>
+
+---
+
+## Auto-Loaded Built-ins
+
+These plugins start automatically. You do **not** add them to the `plugins` array.
+
+| Plugin | Always active? | How to configure |
+|--------|---------------|-----------------|
+| **Telemetry** (Prometheus `/metrics`) | Yes, always | `client.prometheus_labels` for custom labels; push gateway via `plugins` entry once DB-backed mode is running |
+| **Logging** | When `client.enable_logging: true` and `logs_store` is configured | `client.enable_logging`, `client.disable_content_logging`, `client.logging_headers` |
+| **Governance** | Yes, always (OSS) | `client.enforce_auth_on_inference` for VK enforcement; `governance.*` for virtual keys / budgets / routing rules |
+
+See [Client Configuration](/deployment-guides/config-json/client) and [Governance](/deployment-guides/config-json/governance) for full details.
+
+---
+
+## Plugin Array Structure
+
+Every entry in the `plugins` array supports these common fields:
+
+| Field | Type | Required | Description |
+|-------|------|----------|-------------|
+| `name` | string | Yes | Plugin name |
+| `enabled` | boolean | Yes | Enable or disable this plugin |
+| `config` | object | Varies | Plugin-specific configuration |
+| `path` | string | No | Path to a custom plugin binary or WASM file |
+| `version` | integer | No | 🛑 **DB-Backed Only.** Plugin metadata persisted on `TablePlugin`. In DB-backed sync, higher values trigger replacement/reload. Valid range: `1` to `32767`. |
+| `placement` | string | No | 🛑 **DB-Backed Only.** Execution metadata (`"pre_builtin"`, `"builtin"`, `"post_builtin"`) persisted on `TablePlugin` and used for ordering behavior. |
+| `order` | integer | No | 🛑 **DB-Backed Only.** Execution metadata persisted on `TablePlugin`; within a placement group, lower values run earlier. |
+
+<Note>
+`name`, `enabled`, `path`, and `config` are the core plugin config fields. In DB-backed mode, `version`, `placement`, and `order` are persisted on `TablePlugin` and used during sync/runtime ordering.
+</Note>
+
+---
+
+<Tabs>
+
+<Tab title="Semantic Cache">
+
+### Semantic Cache
+
+Caches LLM responses by semantic similarity. Returns a cached response when an incoming request is semantically close enough to a previous one.
+
+Requires a [vector store](/deployment-guides/config-json/storage#vector_store) to be configured.
+
+| Field | Required | Default | Description |
+|-------|----------|---------|-------------|
+| `config.dimension` | Yes | — | Embedding dimension. Use `1` for hash-based (exact) caching without an embedding provider |
+| `config.provider` | No | — | Provider for generating embeddings (required for semantic mode) |
+| `config.embedding_model` | No | — | Model for embeddings (required when `provider` is set) |
+| `config.threshold` | No | `0.8` | Cosine similarity threshold for a cache hit (0.0–1.0) |
+| `config.ttl` | No | `300` | Cache entry TTL in seconds (or a duration string like `"1h"`) |
+| `config.cache_by_model` | No | `true` | Include model in cache key |
+| `config.cache_by_provider` | No | `true` | Include provider in cache key |
+| `config.exclude_system_prompt` | No | `false` | Exclude system prompt from cache key |
+| `config.conversation_history_threshold` | No | `3` | Skip caching for requests with more messages than this |
+| `config.default_cache_key` | No | — | Default cache key when no `x-bf-cache-key` header is sent |
+
+**Semantic mode** (embedding-based similarity search):
+
+```json
+{
+  "plugins": [
+    {
+      "name": "semantic_cache",
+      "enabled": true,
+      "config": {
+        "provider": "openai",
+        "embedding_model": "text-embedding-3-small",
+        "dimension": 1536,
+        "threshold": 0.85,
+        "ttl": 300,
+        "cache_by_model": true,
+        "cache_by_provider": true
+      }
+    }
+  ]
+}
+```
+
+**Hash mode** (exact-match caching, no embedding provider needed):
+
+```json
+{
+  "plugins": [
+    {
+      "name": "semantic_cache",
+      "enabled": true,
+      "config": {
+        "dimension": 1,
+        "ttl": 1800
+      }
+    }
+  ]
+}
+```
+
+<Note>
+You must also configure a `vector_store` in `config.json`. See [Storage — vector_store](/deployment-guides/config-json/storage#vector_store).
+</Note>
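+
+The remaining tunables can be combined as in this sketch, which assumes a cache shared across providers and chats of up to five messages. The values, including the `default_cache_key` name, are illustrative rather than recommended.
+
+```json
+{
+  "plugins": [
+    {
+      "name": "semantic_cache",
+      "enabled": true,
+      "config": {
+        "provider": "openai",
+        "embedding_model": "text-embedding-3-small",
+        "dimension": 1536,
+        "threshold": 0.9,
+        "ttl": "1h",
+        "cache_by_model": true,
+        "cache_by_provider": false,
+        "exclude_system_prompt": true,
+        "conversation_history_threshold": 5,
+        "default_cache_key": "shared-default"
+      }
+    }
+  ]
+}
+```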
+
+</Tab>
+
+<Tab title="OpenTelemetry">
+
+### OpenTelemetry (OTel)
+
+Exports distributed traces to any OTel-compatible collector (Jaeger, Zipkin, Tempo, Datadog via OTLP, etc.).
+
+| Field | Required | Default | Description |
+|-------|----------|---------|-------------|
+| `config.collector_url` | Yes | — | OTLP collector endpoint |
+| `config.trace_type` | Yes | — | Trace format: `"genai_extension"`, `"vercel"`, or `"open_inference"` |
+| `config.protocol` | Yes | — | `"http"` or `"grpc"` |
+| `config.service_name` | No | `"bifrost"` | Service name reported to the collector |
+| `config.metrics_enabled` | No | `false` | Enable push-based OTLP metrics export |
+| `config.metrics_endpoint` | No | — | OTLP metrics endpoint URL |
+| `config.metrics_push_interval` | No | `15` | Metrics push interval in seconds |
+| `config.headers` | No | — | Custom headers for the collector (supports `env.` prefix) |
+| `config.insecure` | No | `false` | Skip TLS verification |
+| `config.tls_ca_cert` | No | — | Path to TLS CA certificate |
+
+```json
+{
+  "plugins": [
+    {
+      "name": "otel",
+      "enabled": true,
+      "config": {
+        "collector_url": "http://otel-collector:4318",
+        "trace_type": "genai_extension",
+        "protocol": "http",
+        "service_name": "bifrost-gateway"
+      }
+    }
+  ]
+}
+```
+
+**With authentication headers:**
+
+```json
+{
+  "plugins": [
+    {
+      "name": "otel",
+      "enabled": true,
+      "config": {
+        "collector_url": "https://otel.example.com:4318",
+        "trace_type": "open_inference",
+        "protocol": "http",
+        "service_name": "bifrost",
+        "headers": {
+          "Authorization": "env.OTEL_AUTH_HEADER"
+        }
+      }
+    }
+  ]
+}
+```
+
+**With OTLP metrics export:**
+
+```json
+{
+  "plugins": [
+    {
+      "name": "otel",
+      "enabled": true,
+      "config": {
+        "collector_url": "http://otel-collector:4318",
+        "trace_type": "genai_extension",
+        "protocol": "http",
+        "metrics_enabled": true,
+        "metrics_endpoint": "http://otel-collector:4318/v1/metrics",
+        "metrics_push_interval": 30
+      }
+    }
+  ]
+}
+```
+
+</Tab>
+
+<Tab title="Maxim">
+
+### Maxim Observability
+
+Sends request traces to the [Maxim](https://www.getmaxim.ai) observability platform.
+
+| Field | Required | Description |
+|-------|----------|-------------|
+| `config.api_key` | Yes | Maxim API key (use `env.` prefix) |
+| `config.log_repo_id` | No | Default Maxim logger repository ID |
+
+```json
+{
+  "plugins": [
+    {
+      "name": "maxim",
+      "enabled": true,
+      "config": {
+        "api_key": "env.MAXIM_API_KEY",
+        "log_repo_id": "your-log-repo-id"
+      }
+    }
+  ]
+}
+```
+
+</Tab>
+
+<Tab title="Datadog">
+
+### Datadog
+
+<Note>
+Datadog is an **enterprise-only** plugin and is silently ignored in OSS builds.
+</Note>
+
+Sends APM traces and metrics to a Datadog Agent.
+
+| Field | Default | Description |
+|-------|---------|-------------|
+| `config.agent_addr` | `"localhost:8126"` | Datadog Agent address for APM traces |
+| `config.service_name` | `"bifrost"` | Service name in Datadog |
+| `config.env` | — | Environment tag (e.g. `"production"`, `"staging"`) |
+| `config.version` | — | Service version tag |
+| `config.enable_traces` | `true` | Enable APM trace collection |
+| `config.custom_tags` | `{}` | Additional key/value tags for all traces and metrics |
+
+```json
+{
+  "plugins": [
+    {
+      "name": "datadog",
+      "enabled": true,
+      "config": {
+        "agent_addr": "datadog-agent:8126",
+        "service_name": "bifrost",
+        "env": "production",
+        "enable_traces": true,
+        "custom_tags": {
+          "team": "platform",
+          "region": "us-east-1"
+        }
+      }
+    }
+  ]
+}
+```
+
+</Tab>
+
+</Tabs>
+
+---
+
+## Custom / Dynamic Plugins
+
+Load a custom Go plugin binary or WASM plugin at startup using the `path` field. Custom plugins must implement one of the Bifrost plugin interfaces.
+
+```json
+{
+  "plugins": [
+    {
+      "name": "my-custom-auth",
+      "enabled": true,
+      "path": "/app/plugins/my-custom-auth.so",
+      "config": {
+        "auth_endpoint": "env.AUTH_SERVICE_URL"
+      }
+    }
+  ]
+}
+```
+
+**WASM plugin:**
+
+```json
+{
+  "plugins": [
+    {
+      "name": "my-wasm-plugin",
+      "enabled": true,
+      "path": "/app/plugins/my-plugin.wasm",
+      "config": {}
+    }
+  ]
+}
+```
+
+See [Writing Go Plugins](/plugins/writing-go-plugin) and [Writing WASM Plugins](/plugins/writing-wasm-plugin) for implementation guides.
+
+**Placement and ordering (DB-backed only):**
+
+In DB-backed mode, plugin metadata such as `version` (`1` to `32767`), `placement`, and `order` can be managed via config sync and DB/UI workflows:
+
+| `placement` | When it runs |
+|-------------|-------------|
+| `pre_builtin` | Before all built-in plugins |
+| `builtin` | Alongside built-in plugins (by `order`) |
+| `post_builtin` | After all built-in plugins (default) |
+
+Within a placement group, lower `order` values run earlier.
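+
+As a sketch, the entry below would run the custom auth plugin from the earlier example before all built-in plugins, ahead of any other `pre_builtin` plugin with a higher `order`. The `version`, `placement`, and `order` values are illustrative, and they only take effect in DB-backed mode.
+
+```json
+{
+  "plugins": [
+    {
+      "name": "my-custom-auth",
+      "enabled": true,
+      "path": "/app/plugins/my-custom-auth.so",
+      "version": 2,
+      "placement": "pre_builtin",
+      "order": 1,
+      "config": {
+        "auth_endpoint": "env.AUTH_SERVICE_URL"
+      }
+    }
+  ]
+}
+```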
--- a/docs/deployment-guides/config-json/providers.mdx
+++ b/docs/deployment-guides/config-json/providers.mdx
@@ -0,0 +1,755 @@
+---
+title: "Provider Setup"
+description: "Configure LLM providers in config.json — API keys, cloud-native auth, per-provider network settings, and self-hosted endpoints"
+icon: "plug"
+---
+
+All providers are configured under `providers` in `config.json`. Each provider entry contains a `keys` array where every key has a `name`, `value`, `models`, and `weight`, plus optional provider-specific config objects.
+
+**Supplying credentials:**
+
+Use the `env.` prefix to reference environment variables — never put API keys directly in `config.json`:
+
+```json
+{
+  "providers": {
+    "openai": {
+      "keys": [
+        {
+          "name": "primary",
+          "value": "env.OPENAI_API_KEY",
+          "models": ["*"],
+          "weight": 1.0
+        }
+      ]
+    }
+  }
+}
+```
+
+---
+
+## Common Provider Fields
+
+Every key object supports these fields:
+
+| Field | Type | Description |
+|-------|------|-------------|
+| `name` | string | Unique name for this key (shown in logs and referenced by virtual key `key_ids` to pin the key) |
+| `value` | string | API key value or `env.VAR_NAME` reference |
+| `models` | array | Models this key serves. `["*"]` = all models |
+| `weight` | float | Load balancing weight. Higher = more traffic |
+| `aliases` | object | Map logical name → actual model name for this key |
+| `use_for_batch_api` | boolean | Mark key as eligible for batch API calls |
+
+Per-provider `network_config` options (these apply to all standard providers):
+
+| Field | Type | Description |
+|-------|------|-------------|
+| `default_request_timeout_in_seconds` | integer | Per-request timeout |
+| `max_retries` | integer | Retry attempts on transient errors |
+| `retry_backoff_initial` | integer | Initial backoff in milliseconds |
+| `retry_backoff_max` | integer | Maximum backoff in milliseconds |
+| `max_conns_per_host` | integer | Max TCP connections to the provider endpoint (default: 5000) |
+| `extra_headers` | object | Static headers added to every provider request |
+| `stream_idle_timeout_in_seconds` | integer | Idle timeout per stream chunk (default: 60) |
+| `insecure_skip_verify` | boolean | Disable TLS verification (last resort only) |
+| `ca_cert_pem` | string | PEM-encoded CA for self-signed or private CA endpoints |
+
+Concurrency and buffering per provider:
+
+| Field | Type | Description |
+|-------|------|-------------|
+| `concurrency_and_buffer_size.concurrency` | integer | Max concurrent requests to this provider |
+| `concurrency_and_buffer_size.buffer_size` | integer | Request queue depth |
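+
+A sketch combining both tables, assuming `network_config` and `concurrency_and_buffer_size` sit side by side at the provider level, as `network_config` does in the provider examples in this guide. The header name and all numeric values are placeholders.
+
+```json
+{
+  "providers": {
+    "openai": {
+      "keys": [
+        { "name": "primary", "value": "env.OPENAI_API_KEY", "models": ["*"], "weight": 1.0 }
+      ],
+      "network_config": {
+        "default_request_timeout_in_seconds": 60,
+        "max_retries": 2,
+        "retry_backoff_initial": 250,
+        "retry_backoff_max": 4000,
+        "stream_idle_timeout_in_seconds": 30,
+        "extra_headers": { "X-Request-Source": "bifrost" }
+      },
+      "concurrency_and_buffer_size": {
+        "concurrency": 100,
+        "buffer_size": 500
+      }
+    }
+  }
+}
+```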
+
+---
+
+<Tabs>
+
+<Tab title="OpenAI">
+
+### OpenAI
+
+Supports multiple keys with weighted load balancing. Mark one key with `use_for_batch_api: true` to designate it for the Batch API.
+
+```json
+{
+  "providers": {
+    "openai": {
+      "keys": [
+        {
+          "name": "openai-primary",
+          "value": "env.OPENAI_KEY_1",
+          "models": ["*"],
+          "weight": 2.0
+        },
+        {
+          "name": "openai-secondary",
+          "value": "env.OPENAI_KEY_2",
+          "models": ["gpt-4o-mini"],
+          "weight": 1.0
+        },
+        {
+          "name": "openai-batch",
+          "value": "env.OPENAI_KEY_BATCH",
+          "models": ["*"],
+          "weight": 1.0,
+          "use_for_batch_api": true
+        }
+      ],
+      "network_config": {
+        "default_request_timeout_in_seconds": 120,
+        "max_retries": 3,
+        "retry_backoff_initial": 500,
+        "retry_backoff_max": 5000
+      }
+    }
+  }
+}
+```
+
+</Tab>
+
+<Tab title="Anthropic">
+
+### Anthropic
+
+```json
+{
+  "providers": {
+    "anthropic": {
+      "keys": [
+        {
+          "name": "anthropic-primary",
+          "value": "env.ANTHROPIC_KEY_1",
+          "models": ["*"],
+          "weight": 1.0
+        },
+        {
+          "name": "anthropic-secondary",
+          "value": "env.ANTHROPIC_KEY_2",
+          "models": ["*"],
+          "weight": 1.0
+        }
+      ],
+      "network_config": {
+        "default_request_timeout_in_seconds": 180
+      }
+    }
+  }
+}
+```
+
+**Override Anthropic beta headers** (optional):
+
+```json
+{
+  "providers": {
+    "anthropic": {
+      "keys": [
+        {
+          "name": "primary",
+          "value": "env.ANTHROPIC_API_KEY",
+          "models": ["*"],
+          "weight": 1.0
+        }
+      ],
+      "network_config": {
+        "beta_header_overrides": {
+          "redact-thinking-": true
+        }
+      }
+    }
+  }
+}
+```
+
+</Tab>
+
+<Tab title="Azure OpenAI">
+
+### Azure OpenAI
+
+Azure requires `azure_key_config` on every key with `endpoint` and `api_version`. List your Azure deployment names in `models` — Bifrost routes requests using the model name as the deployment name. If your deployment names differ from the model names you use in requests, add an `aliases` map on the key.
+
+<Tabs>
+<Tab title="API Key">
+
+```json
+{
+  "providers": {
+    "azure": {
+      "keys": [
+        {
+          "name": "azure-primary",
+          "value": "env.AZURE_API_KEY",
+          "models": ["gpt-4o", "gpt-4o-mini"],
+          "weight": 1.0,
+          "azure_key_config": {
+            "endpoint": "env.AZURE_ENDPOINT",
+            "api_version": "env.AZURE_API_VERSION"
+          }
+        }
+      ]
+    }
+  }
+}
+```
+
+Set environment variables:
+
+```bash
+export AZURE_API_KEY="your-azure-api-key"
+export AZURE_ENDPOINT="https://your-resource.openai.azure.com"
+export AZURE_API_VERSION="2024-10-21"
+```
+
+</Tab>
+<Tab title="Managed Identity / DefaultAzureCredential">
+
+When `value` is empty or omitted, Bifrost uses `DefaultAzureCredential` — which resolves credentials from Workload Identity, VM managed identity, or `az login`.
+
+```json
+{
+  "providers": {
+    "azure": {
+      "keys": [
+        {
+          "name": "azure-workload-identity",
+          "value": "",
+          "models": ["gpt-4o"],
+          "weight": 1.0,
+          "azure_key_config": {
+            "endpoint": "env.AZURE_ENDPOINT",
+            "api_version": "env.AZURE_API_VERSION"
+          }
+        }
+      ]
+    }
+  }
+}
+```
+
+</Tab>
+</Tabs>
+
+**Deployment name aliases** — when your Azure deployment names differ from the model names in requests, use `aliases`:
+
+```json
+{
+  "providers": {
+    "azure": {
+      "keys": [
+        {
+          "name": "azure-primary",
+          "value": "env.AZURE_API_KEY",
+          "models": ["gpt-4o"],
+          "weight": 1.0,
+          "aliases": {
+            "gpt-4o": "gpt-4o-prod-deployment"
+          },
+          "azure_key_config": {
+            "endpoint": "env.AZURE_ENDPOINT",
+            "api_version": "env.AZURE_API_VERSION"
+          }
+        }
+      ]
+    }
+  }
+}
+```
+
+**Multi-region failover** (two keys, different regions):
+
+```json
+{
+  "providers": {
+    "azure": {
+      "keys": [
+        {
+          "name": "eastus",
+          "value": "env.AZURE_KEY_EAST",
+          "models": ["gpt-4o"],
+          "weight": 1.0,
+          "azure_key_config": {
+            "endpoint": "env.AZURE_ENDPOINT_EAST",
+            "api_version": "env.AZURE_API_VERSION"
+          }
+        },
+        {
+          "name": "westus",
+          "value": "env.AZURE_KEY_WEST",
+          "models": ["gpt-4o"],
+          "weight": 1.0,
+          "azure_key_config": {
+            "endpoint": "env.AZURE_ENDPOINT_WEST",
+            "api_version": "env.AZURE_API_VERSION"
+          }
+        }
+      ]
+    }
+  }
+}
+```
+
+</Tab>
+
+<Tab title="AWS Bedrock">
+
+### AWS Bedrock
+
+Bedrock requires `bedrock_key_config` with at minimum a `region`. Three auth modes:
+
+<Tabs>
+<Tab title="Static Credentials">
+
+```json
+{
+  "providers": {
+    "bedrock": {
+      "keys": [
+        {
+          "name": "bedrock-static",
+          "value": "",
+          "models": ["*"],
+          "weight": 1.0,
+          "bedrock_key_config": {
+            "region": "us-east-1",
+            "access_key": "env.AWS_ACCESS_KEY_ID",
+            "secret_key": "env.AWS_SECRET_ACCESS_KEY"
+          }
+        }
+      ]
+    }
+  }
+}
+```
+
+</Tab>
+<Tab title="IAM Role (instance profile / IRSA)">
+
+When only `region` is set, Bifrost inherits credentials from the AWS SDK default chain — IRSA (IAM Roles for Service Accounts), EC2 instance profile, or `AWS_*` env vars.
+
+```json
+{
+  "providers": {
+    "bedrock": {
+      "keys": [
+        {
+          "name": "bedrock-iam",
+          "value": "",
+          "models": ["*"],
+          "weight": 1.0,
+          "bedrock_key_config": {
+            "region": "us-east-1"
+          }
+        }
+      ]
+    }
+  }
+}
+```
+
+</Tab>
+<Tab title="STS AssumeRole">
+
+```json
+{
+  "providers": {
+    "bedrock": {
+      "keys": [
+        {
+          "name": "bedrock-assumerole",
+          "value": "",
+          "models": ["*"],
+          "weight": 1.0,
+          "bedrock_key_config": {
+            "region": "us-west-2",
+            "role_arn": "env.AWS_ROLE_ARN",
+            "external_id": "env.AWS_EXTERNAL_ID",
+            "session_name": "bifrost-session"
+          }
+        }
+      ]
+    }
+  }
+}
+```
+
+</Tab>
+</Tabs>
+
+**Model aliases** (map logical names to Bedrock inference profile IDs):
+
+```json
+{
+  "bedrock_key_config": {
+    "region": "us-east-1"
+  },
+  "aliases": {
+    "claude-sonnet": "us.anthropic.claude-3-5-sonnet-20241022-v2:0",
+    "claude-haiku":  "us.anthropic.claude-3-5-haiku-20241022-v1:0"
+  }
+}
+```
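+
+The fragment above shows only the two relevant fields. As a sketch of where they sit, the config below places `aliases` next to `bedrock_key_config` inside a key entry, mirroring the Azure alias example. The key name `bedrock-aliased` is a placeholder, and listing the logical names in `models` is an assumption carried over from that example.
+
+```json
+{
+  "providers": {
+    "bedrock": {
+      "keys": [
+        {
+          "name": "bedrock-aliased",
+          "value": "",
+          "models": ["claude-sonnet", "claude-haiku"],
+          "weight": 1.0,
+          "aliases": {
+            "claude-sonnet": "us.anthropic.claude-3-5-sonnet-20241022-v2:0",
+            "claude-haiku": "us.anthropic.claude-3-5-haiku-20241022-v1:0"
+          },
+          "bedrock_key_config": {
+            "region": "us-east-1"
+          }
+        }
+      ]
+    }
+  }
+}
+```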
+
+**Batch API — S3 configuration:**
+
+```json
+{
+  "bedrock_key_config": {
+    "region": "us-east-1",
+    "access_key": "env.AWS_ACCESS_KEY_ID",
+    "secret_key": "env.AWS_SECRET_ACCESS_KEY",
+    "batch_s3_config": {
+      "buckets": [
+        {
+          "bucket_name": "my-bedrock-batch-bucket",
+          "prefix": "batch/",
+          "is_default": true
+        }
+      ]
+    }
+  }
+}
+```
+
+</Tab>
+
+<Tab title="Google Vertex AI">
+
+### Google Vertex AI
+
+Vertex requires `vertex_key_config` with `project_id` and `region`. Two auth modes:
+
+<Tabs>
+<Tab title="Service Account Key">
+
+```json
+{
+  "providers": {
+    "vertex": {
+      "keys": [
+        {
+          "name": "vertex-sa",
+          "value": "",
+          "models": ["*"],
+          "weight": 1.0,
+          "vertex_key_config": {
+            "project_id": "env.VERTEX_PROJECT_ID",
+            "region": "us-central1",
+            "auth_credentials": "env.VERTEX_AUTH_CREDENTIALS"
+          }
+        }
+      ]
+    }
+  }
+}
+```
+
+`VERTEX_AUTH_CREDENTIALS` should contain the base64-encoded service account JSON.
+
+</Tab>
+<Tab title="GKE Workload Identity / ADC">
+
+When `auth_credentials` is omitted, Bifrost calls `google.FindDefaultCredentials` — which resolves to GKE Workload Identity, GCE metadata server, or `gcloud auth application-default login`.
+
+```json
+{
+  "providers": {
+    "vertex": {
+      "keys": [
+        {
+          "name": "vertex-workload-identity",
+          "value": "",
+          "models": ["*"],
+          "weight": 1.0,
+          "vertex_key_config": {
+            "project_id": "my-gcp-project",
+            "region": "us-central1"
+          }
+        }
+      ]
+    }
+  }
+}
+```
+
+</Tab>
+</Tabs>
+
+</Tab>
+
+<Tab title="Groq / Gemini / Mistral / Others">
+
+### Standard API-Key Providers
+
+These providers follow the same simple pattern — one or more keys with weights. Replace the provider name and env var name accordingly.
+
+```json
+{
+  "providers": {
+    "groq": {
+      "keys": [
+        {
+          "name": "groq-primary",
+          "value": "env.GROQ_API_KEY",
+          "models": ["*"],
+          "weight": 1.0
+        }
+      ]
+    },
+    "gemini": {
+      "keys": [
+        {
+          "name": "gemini-primary",
+          "value": "env.GEMINI_API_KEY",
+          "models": ["*"],
+          "weight": 1.0
+        }
+      ]
+    },
+    "mistral": {
+      "keys": [
+        {
+          "name": "mistral-primary",
+          "value": "env.MISTRAL_API_KEY",
+          "models": ["*"],
+          "weight": 1.0
+        }
+      ]
+    },
+    "cohere": {
+      "keys": [{ "name": "cohere-main", "value": "env.COHERE_API_KEY", "models": ["*"], "weight": 1.0 }]
+    },
+    "perplexity": {
+      "keys": [{ "name": "perplexity-main", "value": "env.PERPLEXITY_API_KEY", "models": ["*"], "weight": 1.0 }]
+    },
+    "xai": {
+      "keys": [{ "name": "xai-main", "value": "env.XAI_API_KEY", "models": ["*"], "weight": 1.0 }]
+    },
+    "cerebras": {
+      "keys": [{ "name": "cerebras-main", "value": "env.CEREBRAS_API_KEY", "models": ["*"], "weight": 1.0 }]
+    },
+    "openrouter": {
+      "keys": [{ "name": "openrouter-main", "value": "env.OPENROUTER_API_KEY", "models": ["*"], "weight": 1.0 }]
+    },
+    "nebius": {
+      "keys": [{ "name": "nebius-main", "value": "env.NEBIUS_API_KEY", "models": ["*"], "weight": 1.0 }]
+    }
+  }
+}
+```
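+
+Because each provider accepts more than one key, a second key can be added with a different weight. The sketch below assumes weights are relative shares among the keys of one provider, so 3:1 would send roughly three quarters of requests to the first key. `groq-secondary` and `GROQ_API_KEY_SECONDARY` are placeholder names.
+
+```json
+{
+  "providers": {
+    "groq": {
+      "keys": [
+        {
+          "name": "groq-primary",
+          "value": "env.GROQ_API_KEY",
+          "models": ["*"],
+          "weight": 3.0
+        },
+        {
+          "name": "groq-secondary",
+          "value": "env.GROQ_API_KEY_SECONDARY",
+          "models": ["*"],
+          "weight": 1.0
+        }
+      ]
+    }
+  }
+}
+```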
+
+</Tab>
+
+<Tab title="Self-Hosted">
+
+### Self-Hosted Providers
+
+Self-hosted providers point to a URL you operate. They typically need no API key, so `value` can be left empty (`"value": ""`).
+
+<Tabs>
+<Tab title="Ollama">
+
+```json
+{
+  "providers": {
+    "ollama": {
+      "keys": [
+        {
+          "name": "ollama-local",
+          "value": "",
+          "models": ["*"],
+          "weight": 1.0,
+          "ollama_key_config": {
+            "url": "http://localhost:11434"
+          }
+        }
+      ]
+    }
+  }
+}
+```
+
+Using an env var for the URL (useful across environments):
+
+```json
+{
+  "ollama_key_config": {
+    "url": "env.OLLAMA_URL"
+  }
+}
+```
+
+</Tab>
+<Tab title="vLLM">
+
+vLLM instances are model-specific — one key per served model:
+
+```json
+{
+  "providers": {
+    "vllm": {
+      "keys": [
+        {
+          "name": "vllm-llama3-70b",
+          "value": "",
+          "models": ["llama-3-70b"],
+          "weight": 1.0,
+          "vllm_key_config": {
+            "url": "http://vllm-server:8000",
+            "model_name": "meta-llama/Meta-Llama-3-70B-Instruct"
+          }
+        },
+        {
+          "name": "vllm-mistral",
+          "value": "",
+          "models": ["mistral-7b"],
+          "weight": 1.0,
+          "vllm_key_config": {
+            "url": "http://vllm-mistral:8000",
+            "model_name": "mistralai/Mistral-7B-Instruct-v0.3"
+          }
+        }
+      ]
+    }
+  }
+}
+```
+
+</Tab>
+<Tab title="SGLang">
+
+```json
+{
+  "providers": {
+    "sgl": {
+      "keys": [
+        {
+          "name": "sgl-main",
+          "value": "",
+          "models": ["*"],
+          "weight": 1.0,
+          "sgl_key_config": {
+            "url": "http://sgl-router:30000"
+          }
+        }
+      ]
+    }
+  }
+}
+```
+
+</Tab>
+<Tab title="HuggingFace / Replicate">
+
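+Unlike the other entries in this tab group, HuggingFace and Replicate are hosted services that take an API key. Both use `aliases` to map logical model names to provider-specific IDs: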
+
+```json
+{
+  "providers": {
+    "huggingface": {
+      "keys": [
+        {
+          "name": "hf-main",
+          "value": "env.HF_API_KEY",
+          "models": ["llama-3", "mixtral"],
+          "weight": 1.0,
+          "aliases": {
+            "llama-3": "meta-llama/Meta-Llama-3-8B-Instruct",
+            "mixtral": "mistralai/Mixtral-8x7B-Instruct-v0.1"
+          }
+        }
+      ]
+    },
+    "replicate": {
+      "keys": [
+        {
+          "name": "replicate-main",
+          "value": "env.REPLICATE_API_KEY",
+          "models": ["llama-3"],
+          "weight": 1.0,
+          "aliases": {
+            "llama-3": "meta/meta-llama-3-70b-instruct"
+          },
+          "replicate_key_config": {
+            "use_deployments_endpoint": false
+          }
+        }
+      ]
+    }
+  }
+}
+```
+
+</Tab>
+</Tabs>
+
+</Tab>
+
+</Tabs>
+
+---
+
+## Proxy Configuration
+
+Route provider traffic through an HTTP or SOCKS5 proxy:
+
+```json
+{
+  "providers": {
+    "openai": {
+      "keys": [
+        { "name": "primary", "value": "env.OPENAI_API_KEY", "models": ["*"], "weight": 1.0 }
+      ],
+      "proxy_config": {
+        "type": "http",
+        "url": "http://proxy.corp.example.com:3128",
+        "username": "env.PROXY_USER",
+        "password": "env.PROXY_PASS"
+      }
+    }
+  }
+}
+```
+
+| Field | Type | Options |
+|-------|------|---------|
+| `proxy_config.type` | string | `"none"`, `"http"`, `"socks5"`, `"environment"` |
+| `proxy_config.url` | string | Proxy server URL |
+| `proxy_config.username` | string | Proxy auth username (`env.` supported) |
+| `proxy_config.password` | string | Proxy auth password (`env.` supported) |
+| `proxy_config.ca_cert_pem` | string | PEM CA for TLS-intercepting proxies |
+
+Use `"type": "environment"` to pick up `HTTP_PROXY` / `HTTPS_PROXY` env vars automatically.
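+
+As an illustration of that mode, the fragment below sets only `type` and leaves the proxy address to the environment. It assumes `url`, `username`, and `password` can be omitted when `type` is `"environment"`, which this page does not state explicitly.
+
+```json
+{
+  "providers": {
+    "openai": {
+      "keys": [
+        { "name": "primary", "value": "env.OPENAI_API_KEY", "models": ["*"], "weight": 1.0 }
+      ],
+      "proxy_config": {
+        "type": "environment"
+      }
+    }
+  }
+}
+```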
+
+---
+
+## Multi-Provider Example
+
+```json
+{
+  "$schema": "https://www.getbifrost.ai/schema",
+  "providers": {
+    "openai": {
+      "keys": [
+        { "name": "openai-primary", "value": "env.OPENAI_API_KEY", "models": ["*"], "weight": 2.0 }
+      ]
+    },
+    "anthropic": {
+      "keys": [
+        { "name": "anthropic-primary", "value": "env.ANTHROPIC_API_KEY", "models": ["*"], "weight": 1.0 }
+      ]
+    },
+    "groq": {
+      "keys": [
+        { "name": "groq-primary", "value": "env.GROQ_API_KEY", "models": ["*"], "weight": 1.0 }
+      ]
+    }
+  }
+}
+```
+
+With three providers and the weights above, traffic is distributed: 50% OpenAI, 25% Anthropic, 25% Groq. If any provider returns an error, Bifrost automatically retries on the next key or provider.
--- a/docs/deployment-guides/config-json/schema-reference.mdx
+++ b/docs/deployment-guides/config-json/schema-reference.mdx
@@ -0,0 +1,252 @@
+---
+title: "Schema Reference"
+description: "All top-level keys available in config.json, their types, and where each is documented"
+icon: "brackets-curly"
+---
+
+<Note>
+The live schema is published at [`https://www.getbifrost.ai/schema`](https://www.getbifrost.ai/schema). Add `"$schema": "https://www.getbifrost.ai/schema"` to your `config.json` for IDE autocomplete and inline validation.
+</Note>
+
+This page is a concise reference for every top-level key in `config.json`. Click the **Guide** links for full field-by-field documentation.
+
+---
+
+## Top-Level Keys
+
+| Key | Type | Description | Guide |
+|-----|------|-------------|-------|
+| `$schema` | string | Schema URL for IDE validation. Set to `"https://www.getbifrost.ai/schema"` | — |
+| `version` | integer | Controls how empty allow-list arrays are interpreted (`2` by default, `1` for v1.4.x compatibility) | [version](#version) |
+| `encryption_key` | string | Optional AES-256 key (derived via Argon2id). Accepts `env.VAR` prefix and is also read from `BIFROST_ENCRYPTION_KEY`. If omitted, data is stored in plaintext. | [Client](/deployment-guides/config-json/client#encryption-key) |
+| `client` | object | Worker pool, logging, CORS, auth enforcement, header filtering, MCP, compat shims | [Client](/deployment-guides/config-json/client) |
+| `providers` | object | LLM provider API keys, network settings, concurrency | [Providers](/deployment-guides/config-json/providers) |
+| `governance` | object | Admin auth, virtual keys, budgets, rate limits, routing rules, customers, teams | [Governance](/deployment-guides/config-json/governance) |
+| `guardrails_config` | object | Content moderation providers and CEL-based rules *(enterprise only)* | [Guardrails](/deployment-guides/config-json/guardrails) |
+| `access_profiles` | array | Access profile templates for enterprise RBAC/governance controls *(enterprise only)* | [Enterprise Governance](/enterprise/advanced-governance) |
+| `cluster_config` | object | Cluster mode settings: gossip, peers, and auto-discovery backends *(enterprise only)* | [Cluster](/deployment-guides/config-json/cluster) |
+| `config_store` | object | Configuration database backend — SQLite, PostgreSQL, or disabled (file-only mode) | [Storage](/deployment-guides/config-json/storage#config_store) |
+| `logs_store` | object | Request/response log database — SQLite, PostgreSQL + optional S3/GCS offload | [Storage](/deployment-guides/config-json/storage#logs_store) |
+| `vector_store` | object | Vector database for semantic cache — Weaviate, Redis, Qdrant, Pinecone, Valkey | [Storage](/deployment-guides/config-json/storage#vector_store) |
+| `plugins` | array | Opt-in plugins: `semantic_cache`, `otel`, `maxim`, `datadog`, custom | [Plugins](/deployment-guides/config-json/plugins) |
+| `framework` | object | Model pricing catalog URL and sync interval | [Framework](#framework) |
+| `mcp` | object | MCP server and tool configuration | — |
+| `websocket` | object | WebSocket / Realtime API connection pool tuning | [WebSocket](#websocket) |
+| `auth_config` | object | **Deprecated** — use `governance.auth_config` | [Client](/deployment-guides/config-json/client#authentication) |
+
+---
+
+## `version`
+
+Controls how empty arrays in allow-list fields (`models`, `allowed_models`, `key_ids`, `tools_to_execute`) are interpreted:
+
+| Value | Behaviour |
+|-------|-----------|
+| `2` *(default, v1.5.0+)* | Empty array = **deny all**; `["*"]` = allow all |
+| `1` *(v1.4.x compat)* | Empty array = **allow all** |
+
+Omitting `version` uses v2 semantics. Set `"version": 1` only if you are migrating from v1.4.x and need the old behaviour temporarily.
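+
+To make the difference concrete, here is a sketch of a config pinned to the v1.4.x behaviour. Under `"version": 1` the empty `models` array on the key is read as allow all; with `version` omitted (v2 semantics) the same key would match no models. The key name `legacy-key` is a placeholder.
+
+```json
+{
+  "version": 1,
+  "providers": {
+    "openai": {
+      "keys": [
+        { "name": "legacy-key", "value": "env.OPENAI_API_KEY", "models": [], "weight": 1.0 }
+      ]
+    }
+  }
+}
+```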
+
+---
+
+## `client`
+
+Controls the worker pool, logging pipeline, security, and SDK shims. All fields are optional.
+
+| Field | Type | Default | Description |
+|-------|------|---------|-------------|
+| `initial_pool_size` | integer | `300` | Pre-allocated goroutines per provider queue |
+| `drop_excess_requests` | boolean | `false` | Return HTTP 429 when queue is full |
+| `enable_logging` | boolean | `true`* | Persist request/response logs (`*` auto-enabled when `logs_store` is set) |
+| `disable_content_logging` | boolean | `false` | Strip message content from logs |
+| `log_retention_days` | integer | `365` | Days to retain log entries |
+| `logging_headers` | array | `[]` | HTTP headers to capture in log metadata |
+| `enforce_auth_on_inference` | boolean | `false` | Require a virtual key on every `/v1/*` request |
+| `allow_direct_keys` | boolean | `false` | Allow callers to pass provider API keys directly |
+| `allowed_origins` | array | `["*"]` | CORS allowed origins |
+| `max_request_body_size_mb` | integer | `100` | Maximum request body in MB |
+| `whitelisted_routes` | array | `[]` | Routes that bypass auth middleware |
+| `allowed_headers` | array | `[]` | Additional headers permitted for CORS/WebSocket |
+| `required_headers` | array | `[]` | Headers that must be present on every request |
+| `header_filter_config` | object | — | `allowlist` / `denylist` for `x-bf-eh-*` forwarded headers |
+| `prometheus_labels` | array | `[]` | Custom labels for all Prometheus metrics |
+| `compat` | object | — | SDK compatibility shims (`should_drop_params`, `convert_text_to_chat`, etc.) |
+| `mcp_agent_depth` | integer | `10` | Max tool-call recursion depth |
+| `mcp_tool_execution_timeout` | integer | `30` | Per-tool execution timeout in seconds |
+| `mcp_tool_sync_interval` | integer | `10` | Tool sync interval in minutes (`0` = disabled) |
+| `mcp_disable_auto_tool_inject` | boolean | `false` | Disable automatic MCP tool injection |
+| `async_job_result_ttl` | integer | `3600` | TTL for async job results in seconds |
+| `disable_db_pings_in_health` | boolean | `false` | Exclude DB connectivity from `/health` |
+| `routing_chain_max_depth` | integer | `10` | Max routing rule chain evaluation depth |
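+
+A short sketch of a `client` block that overrides a few of the defaults listed above. The values are illustrative rather than recommendations, and `https://app.example.com` is a placeholder origin.
+
+```json
+{
+  "client": {
+    "initial_pool_size": 500,
+    "drop_excess_requests": true,
+    "disable_content_logging": true,
+    "log_retention_days": 30,
+    "enforce_auth_on_inference": true,
+    "allowed_origins": ["https://app.example.com"],
+    "max_request_body_size_mb": 20,
+    "mcp_tool_execution_timeout": 60
+  }
+}
+```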
+
+Full documentation: [Client Configuration](/deployment-guides/config-json/client).
+
+---
+
+## `providers`
+
+Keyed by provider name. Each entry contains a `keys` array and, optionally, `network_config`, `concurrency_and_buffer_size`, and `proxy_config`.
+
+Supported provider keys: `openai`, `anthropic`, `azure`, `bedrock`, `vertex`, `gemini`, `mistral`, `groq`, `cohere`, `perplexity`, `xai`, `cerebras`, `openrouter`, `nebius`, `fireworks`, `parasail`, `huggingface`, `replicate`, `ollama`, `vllm`, `sgl`, `elevenlabs`, `runway`.
+
+Full documentation: [Provider Setup](/deployment-guides/config-json/providers).
+
+---
+
+## `governance`
+
+Seeds governance resources at startup. All sub-keys are optional. Apart from `auth_config`, which describes a single admin login, each sub-key is an array.
+
+| Sub-key | Description |
+|---------|-------------|
+| `auth_config` | Admin username/password auth for the dashboard |
+| `virtual_keys` | Scoped API tokens with provider/model allowlists |
+| `budgets` | Spend caps in USD over a rolling window |
+| `rate_limits` | Request and token rate limits |
+| `customers` | Customer entities (attach budgets/rate limits) |
+| `teams` | Team entities (attach to customers, budgets, rate limits) |
+| `routing_rules` | CEL-based dynamic provider/model routing |
+| `pricing_overrides` | Scoped per-model pricing overrides |
+| `model_configs` | Per-model rate limit and budget configurations |
+
+Full documentation: [Governance](/deployment-guides/config-json/governance).
+
+---
+
+## `guardrails_config`
+
+Enterprise-only. Two sub-keys: `guardrail_providers` (array) and `guardrail_rules` (array).
+
+Full documentation: [Guardrails](/deployment-guides/config-json/guardrails).
+
+---
+
+## `access_profiles`
+
+Enterprise-only. Defines access profile templates that can later be attached to roles/users.
+
+```json
+{
+  "access_profiles": [
+    {
+      "name": "platform-default",
+      "description": "Default platform profile",
+      "is_active": true,
+      "tags": ["platform", "default"],
+      "provider_configs": [
+        {
+          "provider_name": "openai",
+          "all_models_allowed": false,
+          "allowed_models": ["gpt-4o", "gpt-4o-mini"]
+        }
+      ],
+      "mcp_servers": [
+        { "mcp_server_id": "github" }
+      ],
+      "mcp_tool_overrides": [
+        { "mcp_client_id": "github", "tool_name": "create_pull_request", "action": "include" }
+      ]
+    }
+  ]
+}
+```
+
+---
+
+## `cluster_config`
+
+Enterprise-only clustering settings for multi-node deployments.
+
+| Sub-key | Description |
+|---------|-------------|
+| `enabled` | Enables cluster mode |
+| `region` | Region label used by enterprise clustering |
+| `peers` | Static peer list (`host:port`) |
+| `gossip` | Gossip/memberlist port + liveness thresholds |
+| `discovery` | Auto-discovery configuration (`kubernetes`, `dns`, `udp`, `consul`, `etcd`, `mdns`) |
+
+Full documentation: [Cluster](/deployment-guides/config-json/cluster).
+
+---
+
+## `config_store`, `logs_store`, `vector_store`
+
+Storage backends. Each has `enabled` (boolean), `type` (string), and `config` (object).
+
+| Store | Types |
+|-------|-------|
+| `config_store` | `"sqlite"`, `"postgres"` |
+| `logs_store` | `"sqlite"`, `"postgres"` (+ optional `object_storage`) |
+| `vector_store` | `"weaviate"`, `"redis"`, `"qdrant"`, `"pinecone"` (`"redis"` also covers Valkey-compatible endpoints) |
+
+Full documentation: [Storage](/deployment-guides/config-json/storage).
+
+---
+
+## `framework`
+
+Controls model pricing catalog sync:
+
+```json
+{
+  "framework": {
+    "pricing": {
+      "pricing_url": "https://raw.githubusercontent.com/BerriAI/litellm/main/model_prices_and_context_window.json",
+      "pricing_sync_interval": 86400
+    }
+  }
+}
+```
+
+| Field | Default | Description |
+|-------|---------|-------------|
+| `pricing.pricing_url` | LiteLLM catalog | URL of a model pricing JSON file |
+| `pricing.pricing_sync_interval` | `86400` | Sync interval in seconds (minimum: `3600`) |
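+
+For instance, to refresh pricing hourly (the stated minimum) while keeping the default catalog, a config might set only the interval. This sketch assumes `pricing_url` falls back to its default when omitted.
+
+```json
+{
+  "framework": {
+    "pricing": {
+      "pricing_sync_interval": 3600
+    }
+  }
+}
+```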
+
+---
+
+## `websocket`
+
+Optional tuning for the WebSocket gateway (Responses API WebSocket mode, Realtime API). WebSocket is always enabled.
+
+```json
+{
+  "websocket": {
+    "max_connections_per_user": 100,
+    "transcript_buffer_size": 100,
+    "pool": {
+      "max_idle_per_key": 50,
+      "max_total_connections": 1000,
+      "idle_timeout_seconds": 600,
+      "max_connection_lifetime_seconds": 7200
+    }
+  }
+}
+```
+
+| Field | Default | Description |
+|-------|---------|-------------|
+| `max_connections_per_user` | `100` | Max concurrent WebSocket connections per user |
+| `transcript_buffer_size` | `100` | Transcript entries buffered for Realtime API mid-session fallback |
+| `pool.max_idle_per_key` | `50` | Max idle upstream connections per provider/key |
+| `pool.max_total_connections` | `1000` | Max total idle upstream connections |
+| `pool.idle_timeout_seconds` | `600` | Evict idle connections after this many seconds |
+| `pool.max_connection_lifetime_seconds` | `7200` | Max lifetime of any upstream connection |
+
+---
+
+## Minimal Valid Config
+
+```json
+{
+  "$schema": "https://www.getbifrost.ai/schema",
+  "encryption_key": "env.BIFROST_ENCRYPTION_KEY",
+  "providers": {
+    "openai": {
+      "keys": [
+        { "name": "primary", "value": "env.OPENAI_API_KEY", "models": ["*"], "weight": 1.0 }
+      ]
+    }
+  },
+  "config_store": { "enabled": false }
+}
+```
--- a/docs/deployment-guides/config-json/storage.mdx
+++ b/docs/deployment-guides/config-json/storage.mdx
@@ -0,0 +1,540 @@
+---
+title: "Storage"
+description: "Configure Bifrost storage backends in config.json — config_store, logs_store, vector_store, and object storage for logs"
+icon: "database"
+---
+
+Bifrost persists two types of data — **config** (providers, virtual keys, governance rules) and **logs** (request/response records). Each has its own store. A **vector store** is required for semantic caching.
+
+| Store | Purpose | Backends |
+|-------|---------|---------|
+| `config_store` | Provider configs, virtual keys, governance rules | SQLite, PostgreSQL |
+| `logs_store` | Request/response logs shown in UI | SQLite, PostgreSQL + optional S3/GCS offload |
+| `vector_store` | Semantic response caching | Weaviate, Redis, Valkey, Qdrant, Pinecone |
+
+<Note>
+If you use PostgreSQL for any store, the target database must be **UTF8 encoded**. See [PostgreSQL UTF8 Requirement](/quickstart/gateway/setting-up#postgresql-utf8-requirement).
+</Note>
+
+---
+
+## config_store
+
+<Note>
+When `config_store` is disabled (or absent), all configuration is loaded from `config.json` at startup only — the Web UI is disabled and changes require a restart. See [Two Configuration Modes](/deployment-guides/config-json#two-configuration-modes).
+</Note>
+
+<Tabs>
+
+<Tab title="SQLite">
+
+### SQLite (Default)
+
+Simplest setup — no external database required. Bifrost stores configuration in a local SQLite file.
+
+```json
+{
+  "config_store": {
+    "enabled": true,
+    "type": "sqlite",
+    "config": {
+      "path": "./config.db"
+    }
+  }
+}
+```
+
+| Field | Description |
+|-------|-------------|
+| `config.path` | Path to the SQLite file (relative to app-dir, or absolute) |
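+
+Since `config.path` also accepts an absolute path, a containerised deployment might point it at a mounted volume. The path below is a hypothetical mount point, not a Bifrost default.
+
+```json
+{
+  "config_store": {
+    "enabled": true,
+    "type": "sqlite",
+    "config": {
+      "path": "/var/lib/bifrost/config.db"
+    }
+  }
+}
+```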
+
+</Tab>
+
+<Tab title="PostgreSQL">
+
+### PostgreSQL
+
+Production-grade storage suitable for high-availability and high-throughput deployments.
+
+```json
+{
+  "config_store": {
+    "enabled": true,
+    "type": "postgres",
+    "config": {
+      "host": "env.PG_HOST",
+      "port": "5432",
+      "user": "env.PG_USER",
+      "password": "env.PG_PASSWORD",
+      "db_name": "bifrost",
+      "ssl_mode": "require",
+      "max_idle_conns": 5,
+      "max_open_conns": 50
+    }
+  }
+}
+```
+
+| Field | Default | Description |
+|-------|---------|-------------|
+| `host` | — | PostgreSQL host (supports `env.` prefix) |
+| `port` | — | PostgreSQL port (as string) |
+| `user` | — | Database user (supports `env.` prefix) |
+| `password` | — | Database password (supports `env.` prefix). Leave empty for IAM role auth. |
+| `db_name` | — | Database name |
+| `ssl_mode` | — | `"disable"`, `"require"`, `"verify-ca"`, `"verify-full"` |
+| `max_idle_conns` | `5` | Maximum idle connections in the pool |
+| `max_open_conns` | `50` | Maximum open connections to the database |
+
+</Tab>
+
+<Tab title="Disabled">
+
+### Disabled (file-only mode)
+
+Use this when you want Bifrost to read all configuration from `config.json` only — no database, no Web UI.
+
+```json
+{
+  "config_store": {
+    "enabled": false
+  }
+}
+```
+
+This is the recommended setup for [multinode OSS deployments](/deployment-guides/how-to/multinode) where a shared `config.json` is the single source of truth.
+
+</Tab>
+
+</Tabs>
+
+---
+
+## logs_store
+
+<Tabs>
+
+<Tab title="SQLite">
+
+### SQLite
+
+```json
+{
+  "logs_store": {
+    "enabled": true,
+    "type": "sqlite",
+    "config": {
+      "path": "./logs.db"
+    }
+  }
+}
+```
+
+</Tab>
+
+<Tab title="PostgreSQL">
+
+### PostgreSQL
+
+```json
+{
+  "logs_store": {
+    "enabled": true,
+    "type": "postgres",
+    "config": {
+      "host": "env.PG_HOST",
+      "port": "5432",
+      "user": "env.PG_USER",
+      "password": "env.PG_PASSWORD",
+      "db_name": "bifrost",
+      "ssl_mode": "require",
+      "max_idle_conns": 10,
+      "max_open_conns": 100
+    }
+  }
+}
+```
+
+For high log volumes, increase `max_open_conns`:
+
+```json
+{
+  "logs_store": {
+    "enabled": true,
+    "type": "postgres",
+    "config": {
+      "host": "env.PG_HOST",
+      "port": "5432",
+      "user": "env.PG_USER",
+      "password": "env.PG_PASSWORD",
+      "db_name": "bifrost",
+      "ssl_mode": "require",
+      "max_idle_conns": 10,
+      "max_open_conns": 200
+    },
+    "retention_days": 90
+  }
+}
+```
+
+</Tab>
+
+<Tab title="Disabled">
+
+```json
+{
+  "logs_store": {
+    "enabled": false
+  }
+}
+```
+
+</Tab>
+
+</Tabs>
+
+### Log Retention
+
+Set `retention_days` to automatically purge old log entries. `0` disables retention-based cleanup.
+
+```json
+{
+  "logs_store": {
+    "enabled": true,
+    "type": "postgres",
+    "config": { "...": "..." },
+    "retention_days": 90
+  }
+}
+```
+
+### Object Storage for Logs
+
+Offload large request/response payloads from the database to S3 or GCS. The database retains only lightweight index records; payloads are fetched on demand.
+
+<Tabs>
+<Tab title="AWS S3">
+
+```json
+{
+  "logs_store": {
+    "enabled": true,
+    "type": "postgres",
+    "config": { "...": "..." },
+    "object_storage": {
+      "type": "s3",
+      "bucket": "env.S3_BUCKET",
+      "prefix": "bifrost",
+      "compress": true,
+      "region": "us-east-1",
+      "access_key_id": "env.S3_ACCESS_KEY_ID",
+      "secret_access_key": "env.S3_SECRET_ACCESS_KEY"
+    }
+  }
+}
+```
+
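+**IAM role (instance profile / IRSA)**: omit `access_key_id` and `secret_access_key` so that Bifrost falls back to the default credential chain. Add `role_arn` only if those credentials should then assume a role through STS: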
+
+```json
+{
+  "object_storage": {
+    "type": "s3",
+    "bucket": "bifrost-logs",
+    "region": "us-east-1",
+    "compress": true,
+    "role_arn": "arn:aws:iam::123456789012:role/BifrostS3Role"
+  }
+}
+```
+
+| Field | Description |
+|-------|-------------|
+| `bucket` | S3 bucket name (supports `env.` prefix) |
+| `prefix` | Key prefix for stored objects (default: `"bifrost"`) |
+| `compress` | Enable gzip compression (default: `false`) |
+| `region` | AWS region |
+| `access_key_id` | AWS access key ID (omit for default credential chain) |
+| `secret_access_key` | AWS secret access key |
+| `session_token` | STS temporary credentials session token |
+| `role_arn` | IAM role ARN for STS AssumeRole |
+| `endpoint` | Custom endpoint for MinIO / Cloudflare R2 |
+| `force_path_style` | Use path-style URLs (required for MinIO, default: `false`) |
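+
+As a further sketch using the fields in the table, short-lived STS credentials can be supplied by adding `session_token` alongside the key pair. The environment variable names are placeholders, and whether `session_token` accepts the `env.` prefix is an assumption.
+
+```json
+{
+  "object_storage": {
+    "type": "s3",
+    "bucket": "bifrost-logs",
+    "region": "us-east-1",
+    "access_key_id": "env.S3_ACCESS_KEY_ID",
+    "secret_access_key": "env.S3_SECRET_ACCESS_KEY",
+    "session_token": "env.S3_SESSION_TOKEN"
+  }
+}
+```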
+
+</Tab>
+<Tab title="Google Cloud Storage">
+
+```json
+{
+  "logs_store": {
+    "enabled": true,
+    "type": "postgres",
+    "config": { "...": "..." },
+    "object_storage": {
+      "type": "gcs",
+      "bucket": "bifrost-logs",
+      "prefix": "bifrost",
+      "compress": true,
+      "project_id": "env.GCP_PROJECT_ID",
+      "credentials_json": "env.GCS_CREDENTIALS_JSON"
+    }
+  }
+}
+```
+
+Omit `credentials_json` to use Application Default Credentials (Workload Identity, GCE metadata, `gcloud auth`).
+
+| Field | Description |
+|-------|-------------|
+| `project_id` | GCP project ID (supports `env.` prefix) |
+| `credentials_json` | Service account JSON or path — omit for ADC |
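+
+A sketch of the ADC variant: the same block with `credentials_json` left out, so credentials come from the environment Bifrost runs in. The bucket name is a placeholder.
+
+```json
+{
+  "object_storage": {
+    "type": "gcs",
+    "bucket": "bifrost-logs",
+    "prefix": "bifrost",
+    "compress": true,
+    "project_id": "env.GCP_PROJECT_ID"
+  }
+}
+```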
+
+</Tab>
+<Tab title="MinIO (Self-Hosted)">
+
+```json
+{
+  "object_storage": {
+    "type": "s3",
+    "bucket": "bifrost-logs",
+    "prefix": "bifrost",
+    "compress": false,
+    "region": "us-east-1",
+    "endpoint": "http://minio.internal:9000",
+    "access_key_id": "env.MINIO_ACCESS_KEY",
+    "secret_access_key": "env.MINIO_SECRET_KEY",
+    "force_path_style": true
+  }
+}
+```
+
+</Tab>
+</Tabs>
+
+---
+
+## vector_store
+
+A vector store is required for [semantic caching](/features/semantic-caching). Choose from Weaviate, Redis/Valkey, Qdrant, or Pinecone.
+
+<Tabs>
+
+<Tab title="Weaviate">
+
+```json
+{
+  "vector_store": {
+    "enabled": true,
+    "type": "weaviate",
+    "config": {
+      "scheme": "http",
+      "host": "localhost:8080",
+      "api_key": "env.WEAVIATE_API_KEY",
+      "grpc_config": {
+        "host": "localhost:50051",
+        "secured": false
+      }
+    }
+  }
+}
+```
+
+| Field | Required | Description |
+|-------|----------|-------------|
+| `scheme` | Yes | `"http"` or `"https"` |
+| `host` | Yes | Weaviate server host and port |
+| `api_key` | No | Weaviate API key (supports `env.` prefix) |
+| `grpc_config.host` | No | gRPC host for faster vector operations |
+| `grpc_config.secured` | No | Use TLS for gRPC connection |
+
+</Tab>
+
+<Tab title="Redis / Valkey">
+
+```json
+{
+  "vector_store": {
+    "enabled": true,
+    "type": "redis",
+    "config": {
+      "addr": "env.REDIS_ADDR",
+      "password": "env.REDIS_PASSWORD",
+      "db": 0,
+      "use_tls": false
+    }
+  }
+}
+```
+
+**AWS MemoryDB (cluster mode):**
+
+```json
+{
+  "vector_store": {
+    "enabled": true,
+    "type": "redis",
+    "config": {
+      "addr": "env.MEMORYDB_ENDPOINT",
+      "password": "env.MEMORYDB_PASSWORD",
+      "use_tls": true,
+      "cluster_mode": true
+    }
+  }
+}
+```
+
+| Field | Default | Description |
+|-------|---------|-------------|
+| `addr` | — | Redis/Valkey address `host:port` (supports `env.` prefix) |
+| `password` | — | Redis AUTH password (supports `env.` prefix) |
+| `db` | `0` | Redis database number |
+| `use_tls` | `false` | Enable TLS |
+| `cluster_mode` | `false` | Enable cluster mode (required for MemoryDB; `db` must be `0`) |
+| `pool_size` | — | Maximum socket connections |
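+
+`pool_size` does not appear in the examples above. A sketch that caps the connection pool might look like this; the value `20` is illustrative, not a documented default.
+
+```json
+{
+  "vector_store": {
+    "enabled": true,
+    "type": "redis",
+    "config": {
+      "addr": "env.REDIS_ADDR",
+      "password": "env.REDIS_PASSWORD",
+      "db": 0,
+      "use_tls": true,
+      "pool_size": 20
+    }
+  }
+}
+```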
+
+</Tab>
+
+<Tab title="Qdrant">
+
+```json
+{
+  "vector_store": {
+    "enabled": true,
+    "type": "qdrant",
+    "config": {
+      "host": "env.QDRANT_HOST",
+      "port": 6334,
+      "api_key": "env.QDRANT_API_KEY",
+      "use_tls": false
+    }
+  }
+}
+```
+
+| Field | Default | Description |
+|-------|---------|-------------|
+| `host` | — | Qdrant server host (supports `env.` prefix) |
+| `port` | `6334` | gRPC port |
+| `api_key` | — | API key (supports `env.` prefix) |
+| `use_tls` | `false` | Enable TLS |
+
+</Tab>
+
+<Tab title="Pinecone">
+
+Pinecone is external-only: it is a hosted service, so Bifrost always connects to an index outside your deployment.
+
+```json
+{
+  "vector_store": {
+    "enabled": true,
+    "type": "pinecone",
+    "config": {
+      "api_key": "env.PINECONE_API_KEY",
+      "index_host": "env.PINECONE_INDEX_HOST"
+    }
+  }
+}
+```
+
+| Field | Description |
+|-------|-------------|
+| `api_key` | Pinecone API key (supports `env.` prefix) |
+| `index_host` | Index host from Pinecone console (e.g. `your-index.svc.us-east1-gcp.pinecone.io`) |
+
+</Tab>
+
+</Tabs>
+
+---
+
+## Mixed Backend Example
+
+Run the config store on PostgreSQL (for UI) while keeping logs on SQLite (simpler, cheaper for append-heavy workloads):
+
+```json
+{
+  "$schema": "https://www.getbifrost.ai/schema",
+  "encryption_key": "env.BIFROST_ENCRYPTION_KEY",
+
+  "config_store": {
+    "enabled": true,
+    "type": "postgres",
+    "config": {
+      "host": "env.PG_HOST",
+      "port": "5432",
+      "user": "env.PG_USER",
+      "password": "env.PG_PASSWORD",
+      "db_name": "bifrost",
+      "ssl_mode": "require"
+    }
+  },
+
+  "logs_store": {
+    "enabled": true,
+    "type": "sqlite",
+    "config": {
+      "path": "./logs.db"
+    }
+  }
+}
+```
+
+---
+
+## Full Storage Example
+
+```json
+{
+  "$schema": "https://www.getbifrost.ai/schema",
+  "encryption_key": "env.BIFROST_ENCRYPTION_KEY",
+
+  "config_store": {
+    "enabled": true,
+    "type": "postgres",
+    "config": {
+      "host": "env.PG_HOST",
+      "port": "5432",
+      "user": "env.PG_USER",
+      "password": "env.PG_PASSWORD",
+      "db_name": "bifrost",
+      "ssl_mode": "require",
+      "max_idle_conns": 5,
+      "max_open_conns": 50
+    }
+  },
+
+  "logs_store": {
+    "enabled": true,
+    "type": "postgres",
+    "config": {
+      "host": "env.PG_HOST",
+      "port": "5432",
+      "user": "env.PG_USER",
+      "password": "env.PG_PASSWORD",
+      "db_name": "bifrost",
+      "ssl_mode": "require",
+      "max_idle_conns": 10,
+      "max_open_conns": 100
+    },
+    "retention_days": 90,
+    "object_storage": {
+      "type": "s3",
+      "bucket": "env.S3_BUCKET",
+      "region": "us-east-1",
+      "compress": true,
+      "access_key_id": "env.S3_ACCESS_KEY_ID",
+      "secret_access_key": "env.S3_SECRET_ACCESS_KEY"
+    }
+  },
+
+  "vector_store": {
+    "enabled": true,
+    "type": "weaviate",
+    "config": {
+      "scheme": "http",
+      "host": "weaviate:8080"
+    }
+  }
+}
+```
--- a/docs/deployment-guides/docker-tuning.mdx
+++ b/docs/deployment-guides/docker-tuning.mdx
@@ -0,0 +1,440 @@
+---
+title: "Docker Performance Tuning"
+description: "Optimize Bifrost container performance with Go runtime tuning, resource limits, and system configuration"
+icon: "docker"
+---
+
+This guide covers performance tuning for Bifrost when running in Docker containers. Proper tuning lets Bifrost size itself to the CPU and memory the container is actually given, rather than to the host, and sustain higher throughput.
+
+<Note>
+These optimizations apply to Docker, Docker Compose, Kubernetes, and any container runtime using cgroups for resource management.
+</Note>
+
+## Quick Start
+
+For most production deployments, add these settings to your container:
+
+```yaml
+services:
+  bifrost:
+    image: maximhq/bifrost:latest
+    environment:
+      - GOGC=200
+      - GOMEMLIMIT=3600MiB  # 90% of 4GB memory limit
+    ulimits:
+      nofile:
+        soft: 65536
+        hard: 65536
+    deploy:
+      resources:
+        limits:
+          cpus: '4'
+          memory: 4G
+```
+
+---
+
+## Go Runtime Tuning
+
+### GOMAXPROCS (Automatic)
+
+Bifrost automatically detects container CPU limits using [automaxprocs](https://github.com/uber-go/automaxprocs). This sets `GOMAXPROCS` to match your container's CPU quota from cgroups (v1 and v2).
+
+**No configuration needed** — this works automatically. You'll see a log line at startup:
+
+```
+maxprocs: Updating GOMAXPROCS=4: determined from CPU quota
+```
+
+<Warning>
+Without automaxprocs, Go would detect all host CPUs (e.g., 64 on an EC2 instance) even when the container is limited to 4 CPUs, causing excessive context switching and degraded performance.
+</Warning>
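
To cross-check the startup log line, you can read the CPU quota yourself. The sketch below assumes cgroup v2, where the quota is exposed in `/sys/fs/cgroup/cpu.max` as `<quota> <period>` (or `max <period>` when no limit is set). The helper name `cpus_from_cpu_max` is hypothetical, and rounding down to a whole CPU is an assumption about the library's default behaviour rather than something this guide states.

```shell
# Convert the contents of a cgroup v2 cpu.max file into a whole CPU count.
# $1: file contents, e.g. "400000 100000" or "max 100000".
cpus_from_cpu_max() {
  local quota="${1%% *}"    # text before the first space
  local period="${1##* }"   # text after the last space
  if [ "$quota" = "max" ]; then
    echo "unlimited"
    return
  fi
  local cpus=$(( quota / period ))   # integer division rounds down
  if [ "$cpus" -lt 1 ]; then
    cpus=1                           # assumed floor of one CPU
  fi
  echo "$cpus"
}

cpus_from_cpu_max "400000 100000"   # container limited to 4 CPUs
cpus_from_cpu_max "max 100000"      # no CPU limit set
```

Inside a running container, `cpus_from_cpu_max "$(cat /sys/fs/cgroup/cpu.max)"` should agree with the `GOMAXPROCS` value in the log.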
+
+### GOGC (Garbage Collection)
+
+`GOGC` controls garbage collection frequency. The default is `100`, meaning a collection starts once the heap has grown 100% beyond the live heap left by the previous collection.
+
+| Scenario | Recommended GOGC | Trade-off |
+|----------|------------------|-----------|
+| Memory constrained | 50-100 | More frequent GC, lower memory |
+| High throughput, memory available | 200-400 | Less GC overhead, higher memory |
+| Latency sensitive | 50-100 | More predictable latency |
+
+```yaml
+environment:
+  - GOGC=200
+```
+
+<Tip>
+For high-throughput API gateways, `GOGC=200` or `GOGC=400` typically provides the best balance of throughput and memory usage.
+</Tip>
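
As a rough illustration of the trade-off in the table, the heap size that triggers the next collection grows linearly with `GOGC`. This is a simplified sketch: it ignores GC roots such as stacks and globals, which newer Go versions also take into account, and `next_gc_target_mib` is a hypothetical helper name.

```shell
# Approximate heap size (MiB) at which the next GC cycle starts.
# $1: live heap left after the previous collection, in MiB; $2: GOGC value.
next_gc_target_mib() {
  local live_mib=$1
  local gogc=$2
  echo $(( live_mib + live_mib * gogc / 100 ))
}

next_gc_target_mib 500 100   # default GOGC
next_gc_target_mib 500 200   # fewer collections, larger heap
```

With a 500 MiB live heap, raising `GOGC` from 100 to 200 moves the trigger point from about 1000 MiB to about 1500 MiB, which is why higher values need spare memory.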
+
+### GOMEMLIMIT (Memory Limit)
+
+`GOMEMLIMIT` sets a soft memory limit for the Go runtime. When approaching this limit, Go becomes more aggressive about garbage collection.
+
+**Best practice:** Set to ~90% of your container's memory limit to leave headroom for memory the Go runtime does not manage (cgo allocations, the mapped binary, and memory the kernel holds on the process's behalf).
+
+| Container Memory | Recommended GOMEMLIMIT |
+|------------------|------------------------|
+| 512 MB | 450MiB |
+| 1 GB | 900MiB |
+| 2 GB | 1800MiB |
+| 4 GB | 3600MiB |
+| 8 GB | 7200MiB |
+
+```yaml
+environment:
+  - GOMEMLIMIT=3600MiB
+```
+
+<Note>
+When both `GOGC` and `GOMEMLIMIT` are set, the Go runtime starts a collection as soon as either threshold is reached. For high-throughput workloads, set `GOGC=200` or higher and let `GOMEMLIMIT` be the primary constraint.
+</Note>
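
The ~90% rule can be computed directly from the container limit. The sketch below is illustrative only: `gomemlimit_for` is a hypothetical helper, and it takes the limit in MiB, so its results are exact binary values (for example `3686MiB` for a 4096 MiB limit) rather than the rounded figures in the table.

```shell
# Print a GOMEMLIMIT value equal to a percentage of the container memory limit.
# $1: container memory limit in MiB; $2: percentage to keep (defaults to 90).
gomemlimit_for() {
  local limit_mib=$1
  local percent=${2:-90}
  echo "$(( limit_mib * percent / 100 ))MiB"
}

gomemlimit_for 4096      # 4 GiB container at 90%
gomemlimit_for 2048 85   # 2 GiB container at 85%
```

On a cgroup v2 host with a memory limit set, the limit in bytes is typically readable from `/sys/fs/cgroup/memory.max`; divide it by 1048576 before passing it in.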
+
+---
+
+## System Limits
+
+### File Descriptor Limits (ulimits)
+
+Each HTTP connection requires a file descriptor. The default container limit (often 1024) is too low for high-concurrency workloads.
+
+```yaml
+ulimits:
+  nofile:
+    soft: 65536
+    hard: 65536
+```
+
+| Expected Concurrent Connections | Recommended nofile |
+|--------------------------------|-------------------|
+| < 1000 | 4096 |
+| 1000-5000 | 16384 |
+| 5000-10000 | 32768 |
+| > 10000 | 65536+ |
+
+<Warning>
+If you see errors like `too many open files` or connections being refused under load, increase your `nofile` limit.
+</Warning>
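
The table can be expressed as a small lookup. This is a sketch, not part of Bifrost: `recommended_nofile` is a hypothetical helper, and because the table's ranges overlap at their edges, placing exactly 5000 and 10000 connections in the lower tier is an assumption.

```shell
# Map expected concurrent connections to the nofile value from the table.
# $1: expected concurrent connections.
recommended_nofile() {
  local conns=$1
  if [ "$conns" -lt 1000 ]; then
    echo 4096
  elif [ "$conns" -le 5000 ]; then
    echo 16384
  elif [ "$conns" -le 10000 ]; then
    echo 32768
  else
    echo 65536   # the table lists this tier as a minimum ("65536+")
  fi
}

recommended_nofile 3000
```

Compare the result with the output of `ulimit -n` inside the running container.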
+
+### Resource Limits
+
+Set CPU and memory limits to match your expected workload:
+
+```yaml
+deploy:
+  resources:
+    limits:
+      cpus: '4'
+      memory: 4G
+    reservations:
+      cpus: '2'
+      memory: 2G
+```
+
+**Sizing guidance:**
+
+| Expected RPS | Recommended CPUs | Recommended Memory |
+|--------------|------------------|-------------------|
+| 100-500 | 1-2 | 512MB-1GB |
+| 500-2000 | 2-4 | 1-2GB |
+| 2000-5000 | 4-8 | 2-4GB |
+| 5000+ | 8+ | 4GB+ |
+
+---
+
+## Docker Compose Examples
+
+### Development
+
+```yaml
+services:
+  bifrost:
+    image: maximhq/bifrost:latest
+    ports:
+      - "8080:8080"
+    volumes:
+      - ./data:/app/data
+    environment:
+      - LOG_LEVEL=debug
+```
+
+### Production (Single Node)
+
+```yaml
+services:
+  bifrost:
+    image: maximhq/bifrost:latest
+    ports:
+      - "8080:8080"
+    volumes:
+      - bifrost-data:/app/data
+    environment:
+      - LOG_LEVEL=info
+      - LOG_STYLE=json
+      - GOGC=200
+      - GOMEMLIMIT=3600MiB
+    ulimits:
+      nofile:
+        soft: 65536
+        hard: 65536
+    deploy:
+      resources:
+        limits:
+          cpus: '4'
+          memory: 4G
+        reservations:
+          cpus: '2'
+          memory: 2G
+    healthcheck:
+      test: ["CMD", "wget", "--no-verbose", "--tries=1", "-O", "/dev/null", "http://localhost:8080/health"]
+      interval: 30s
+      timeout: 10s
+      retries: 3
+    restart: unless-stopped
+
+volumes:
+  bifrost-data:
+```
+
+### Production (Multi-Node with PostgreSQL)
+
+<Note>
+If you use PostgreSQL for Bifrost storage, ensure the database is UTF8 encoded. See [PostgreSQL UTF8 Requirement](/quickstart/gateway/setting-up#postgresql-utf8-requirement).
+</Note>
+
+```yaml
+services:
+  bifrost-1:
+    image: maximhq/bifrost:latest
+    ports:
+      - "8081:8080"
+    environment:
+      - LOG_LEVEL=info
+      - GOGC=200
+      - GOMEMLIMIT=1800MiB
+      - BIFROST_DB_TYPE=postgres
+      - BIFROST_DB_DSN=postgres://user:pass@postgres:5432/bifrost?sslmode=disable
+    ulimits:
+      nofile:
+        soft: 65536
+        hard: 65536
+    deploy:
+      resources:
+        limits:
+          cpus: '2'
+          memory: 2G
+    depends_on:
+      - postgres
+
+  bifrost-2:
+    image: maximhq/bifrost:latest
+    ports:
+      - "8082:8080"
+    environment:
+      - LOG_LEVEL=info
+      - GOGC=200
+      - GOMEMLIMIT=1800MiB
+      - BIFROST_DB_TYPE=postgres
+      - BIFROST_DB_DSN=postgres://user:pass@postgres:5432/bifrost?sslmode=disable
+    ulimits:
+      nofile:
+        soft: 65536
+        hard: 65536
+    deploy:
+      resources:
+        limits:
+          cpus: '2'
+          memory: 2G
+    depends_on:
+      - postgres
+
+  postgres:
+    image: postgres:16-alpine
+    environment:
+      - POSTGRES_USER=user
+      - POSTGRES_PASSWORD=pass
+      - POSTGRES_DB=bifrost
+    volumes:
+      - postgres-data:/var/lib/postgresql/data
+
+volumes:
+  postgres-data:
+```
+
+---
+
+## Kubernetes Configuration
+
+### Basic Deployment
+
+```yaml
+apiVersion: apps/v1
+kind: Deployment
+metadata:
+  name: bifrost
+spec:
+  replicas: 3
+  selector:
+    matchLabels:
+      app: bifrost
+  template:
+    metadata:
+      labels:
+        app: bifrost
+    spec:
+      containers:
+        - name: bifrost
+          image: maximhq/bifrost:latest
+          ports:
+            - containerPort: 8080
+          env:
+            - name: GOGC
+              value: "200"
+            - name: GOMEMLIMIT
+              value: "3600MiB"
+          resources:
+            limits:
+              cpu: "4"
+              memory: "4Gi"
+            requests:
+              cpu: "2"
+              memory: "2Gi"
+          livenessProbe:
+            httpGet:
+              path: /health
+              port: 8080
+            initialDelaySeconds: 5
+            periodSeconds: 10
+          readinessProbe:
+            httpGet:
+              path: /health
+              port: 8080
+            initialDelaySeconds: 5
+            periodSeconds: 5
+```
+
+### File Descriptor Limits in Kubernetes
+
+File descriptor limits in Kubernetes are typically set at the node level. Options include:
+
+1. **Node-level configuration** (recommended): Set `fs.file-max` and ulimits in your node configuration
+2. **Init container**: Use an init container with elevated privileges to set limits
+3. **Security context**: Some clusters allow adding the `SYS_RESOURCE` capability, which lets the process raise its own limits
+
+```yaml
+securityContext:
+  capabilities:
+    add: ["SYS_RESOURCE"]
+```
+
+<Note>
+Check your current limits inside a container with: `cat /proc/sys/fs/file-max` and `ulimit -n`
+</Note>
+
+---
+
+## Bifrost Application Settings
+
+Align Bifrost's internal settings with your container resources:
+
+### Concurrency and Buffer Size
+
+Configure per provider in `config.json`:
+
+```json
+{
+  "providers": {
+    "openai": {
+      "concurrency_and_buffer_size": {
+        "concurrency": 1000,
+        "buffer_size": 1500
+      }
+    }
+  }
+}
+```
+
+**Formula:**
+- `concurrency` = expected RPS per provider
+- `buffer_size` = 1.5 × concurrency
+
+### Initial Pool Size
+
+Configure globally in `config.json`:
+
+```json
+{
+  "client": {
+    "initial_pool_size": 3000
+  }
+}
+```
+
+**Formula:** `initial_pool_size` = 1.5 × total expected RPS across all providers
+
+<Tip>
+See the [Performance Tuning](/providers/performance) guide for detailed sizing recommendations.
+</Tip>
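
The two formulas can be applied together when several providers are configured. A minimal sketch, assuming you already know the expected RPS for each provider; `bifrost_sizing` is a hypothetical helper and only prints suggested values, it does not write `config.json`.

```shell
# Print sizing values derived from expected RPS per provider.
# Arguments: one expected-RPS number per provider, e.g. `bifrost_sizing 1000 1000`.
bifrost_sizing() {
  local total=0
  local rps
  for rps in "$@"; do
    # concurrency = expected RPS, buffer_size = 1.5 x concurrency
    echo "provider: concurrency=${rps} buffer_size=$(( rps * 3 / 2 ))"
    total=$(( total + rps ))
  done
  # initial_pool_size = 1.5 x total expected RPS across all providers
  echo "client: initial_pool_size=$(( total * 3 / 2 ))"
}

bifrost_sizing 1000 1000
```

Two providers at 1000 RPS each reproduce the values used in the examples above: `concurrency` 1000, `buffer_size` 1500, and `initial_pool_size` 3000.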
+
+---
+
+## Tuning Checklist
+
+<Steps>
+  <Step title="Set container resource limits">
+    Define CPU and memory limits based on expected workload. Start with 2 CPUs / 2GB for moderate loads.
+  </Step>
+  <Step title="Configure GOMEMLIMIT">
+    Set to 90% of container memory limit (e.g., `1800MiB` for 2GB container).
+  </Step>
+  <Step title="Tune GOGC">
+    Start with `GOGC=200` for throughput; reduce to 100 if memory pressure is high.
+  </Step>
+  <Step title="Set file descriptor limits">
+    Set `nofile` ulimit to at least 2× your expected concurrent connections.
+  </Step>
+  <Step title="Align Bifrost settings">
+    Match `concurrency` and `buffer_size` to your container's CPU count and expected RPS.
+  </Step>
+  <Step title="Monitor and adjust">
+    Watch memory usage, GC pause times, and request latencies. Adjust settings based on observed behavior.
+  </Step>
+</Steps>
+
+---
+
+## Troubleshooting
+
+### High Memory Usage
+
+- Reduce `GOGC` (e.g., from 200 to 100)
+- Ensure `GOMEMLIMIT` is set
+- Reduce `buffer_size` and `initial_pool_size`
+
+### High Latency Spikes
+
+- May indicate GC pauses; try reducing `GOGC`
+- Check if container is hitting CPU limits
+- Verify `GOMAXPROCS` matches container CPU quota (check startup logs)
+
+### Connection Errors Under Load
+
+- Increase `nofile` ulimit
+- Ensure `buffer_size` is large enough for traffic spikes
+- Check provider rate limits
+
+### Container OOM Killed
+
+- Reduce `GOMEMLIMIT` to 85% of container memory
+- Reduce `GOGC` to trigger more frequent GC
+- Reduce `buffer_size` and `initial_pool_size`
+
+---
+
+## Related Documentation
+
+- **[Performance Tuning](/providers/performance)** - Bifrost-specific performance configuration
+- **[Helm Deployment](/deployment-guides/helm)** - Kubernetes deployment with Helm
+- **[Multi-Node Setup](/deployment-guides/how-to/multinode)** - Scaling across multiple instances
--- a/docs/deployment-guides/ecs.mdx
+++ b/docs/deployment-guides/ecs.mdx
--- a/docs/deployment-guides/enterprise/aws.mdx
+++ b/docs/deployment-guides/enterprise/aws.mdx
@@ -0,0 +1,378 @@
+---
+title: "AWS Deployment"
+description: "Deploy Bifrost Enterprise on AWS using ECR with IRSA or IAM Task Roles"
+icon: "aws"
+---
+
+Bifrost Enterprise images for AWS customers are distributed through AWS ECR, enabling native IAM integration for secure, credential-less authentication.
+
+## Architecture
+
+```mermaid
+flowchart LR
+    subgraph AWS[AWS Account]
+        subgraph EKS[EKS Cluster]
+            Pod[Bifrost Pod]
+            KSA[K8s ServiceAccount]
+        end
+        IAMRole[IAM Role]
+        ECR[AWS ECR<br/>Bifrost Images]
+    end
+    
+    KSA -->|Annotated with| IAMRole
+    Pod -->|Assumes| IAMRole
+    IAMRole -->|Pull Permission| ECR
+    ECR -->|Image| Pod
+```
+
+## Prerequisites
+
+- EKS cluster (v1.23+) or ECS cluster
+- AWS CLI configured with appropriate permissions
+- `kubectl` configured for your EKS cluster
+- Your AWS account ID allowlisted by the Bifrost team
+
+<Note>
+Contact the Bifrost team to get your AWS account ID and IAM role ARN allowlisted for ECR access.
+</Note>
+
+## IRSA (Recommended)
+
+IAM Roles for Service Accounts (IRSA) provides the most secure authentication method for EKS deployments.
+
+### Step 1: Create IAM Policy
+
+Create an IAM policy that grants ECR pull access to the Bifrost repository.
+
+```json
+{
+  "Version": "2012-10-17",
+  "Statement": [
+    {
+      "Sid": "ECRAuth",
+      "Effect": "Allow",
+      "Action": [
+        "ecr:GetAuthorizationToken"
+      ],
+      "Resource": "*"
+    },
+    {
+      "Sid": "ECRPullFromBifrost",
+      "Effect": "Allow",
+      "Action": [
+        "ecr:BatchGetImage",
+        "ecr:GetDownloadUrlForLayer",
+        "ecr:BatchCheckLayerAvailability"
+      ],
+      "Resource": "arn:aws:ecr:us-east-1:BIFROST_ACCOUNT_ID:repository/YOUR_HUB_SLUG"
+    }
+  ]
+}
+```
+
+<Warning>
+Replace `BIFROST_ACCOUNT_ID` and `YOUR_HUB_SLUG` with the values provided by the Bifrost team.
+</Warning>
+
+Save this policy as `bifrost-ecr-pull-policy.json` and create it:
+
+```bash
+aws iam create-policy \
+  --policy-name BifrostECRPullPolicy \
+  --policy-document file://bifrost-ecr-pull-policy.json
+```
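
The two placeholders appear in fixed formats: the repository ARN in the policy's `Resource` field and the image reference used at deploy time. A minimal sketch of how both are assembled, using made-up example values (`111122223333`, `example-hub`) and hypothetical helper names:

```shell
# Build the ECR repository ARN used in the policy's Resource field.
# $1: registry account ID, $2: region, $3: repository name.
ecr_repo_arn() {
  echo "arn:aws:ecr:$2:$1:repository/$3"
}

# Build the image reference for the same repository.
# $1: registry account ID, $2: region, $3: repository name, $4: tag (defaults to "latest").
ecr_image_uri() {
  echo "$1.dkr.ecr.$2.amazonaws.com/$3:${4:-latest}"
}

ecr_repo_arn 111122223333 us-east-1 example-hub
ecr_image_uri 111122223333 us-east-1 example-hub
```

Substitute the account ID and repository name supplied by the Bifrost team; the example values are placeholders only.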
+
+### Step 2: Create IAM Role with OIDC Trust
+
+Create an IAM role that can be assumed by your Kubernetes ServiceAccount.
+
+First, get your OIDC provider URL:
+
+```bash
+aws eks describe-cluster \
+  --name YOUR_CLUSTER_NAME \
+  --query "cluster.identity.oidc.issuer" \
+  --output text
+```
+
+Create the trust policy (`trust-policy.json`):
+
+```json
+{
+  "Version": "2012-10-17",
+  "Statement": [
+    {
+      "Effect": "Allow",
+      "Principal": {
+        "Federated": "arn:aws:iam::YOUR_ACCOUNT_ID:oidc-provider/oidc.eks.REGION.amazonaws.com/id/OIDC_ID"
+      },
+      "Action": "sts:AssumeRoleWithWebIdentity",
+      "Condition": {
+        "StringEquals": {
+          "oidc.eks.REGION.amazonaws.com/id/OIDC_ID:aud": "sts.amazonaws.com",
+          "oidc.eks.REGION.amazonaws.com/id/OIDC_ID:sub": "system:serviceaccount:NAMESPACE:bifrost-sa"
+        }
+      }
+    }
+  ]
+}
+```
+
+Create the role and attach the policy:
+
+```bash
+# Create the role
+aws iam create-role \
+  --role-name BifrostECRPullRole \
+  --assume-role-policy-document file://trust-policy.json
+
+# Attach the policy
+aws iam attach-role-policy \
+  --role-name BifrostECRPullRole \
+  --policy-arn arn:aws:iam::YOUR_ACCOUNT_ID:policy/BifrostECRPullPolicy
+```
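
The placeholders in `trust-policy.json` can be derived from the issuer URL returned by `describe-cluster`. The sketch below assumes the issuer has the usual `https://oidc.eks.REGION.amazonaws.com/id/OIDC_ID` shape; the helper names, the account ID, and the issuer value are illustrative, not real values.

```shell
# Strip the scheme from an EKS OIDC issuer URL to get the provider path.
# $1: issuer URL.
oidc_provider_path() {
  echo "${1#https://}"
}

# Value for Principal.Federated. $1: AWS account ID, $2: issuer URL.
oidc_provider_arn() {
  echo "arn:aws:iam::$1:oidc-provider/$(oidc_provider_path "$2")"
}

# Value for the ":sub" condition. $1: namespace, $2: ServiceAccount name.
irsa_subject() {
  echo "system:serviceaccount:$1:$2"
}

ISSUER="https://oidc.eks.us-east-1.amazonaws.com/id/EXAMPLED539D4633E53DE1B71EXAMPLE"
oidc_provider_arn 111122223333 "$ISSUER"
irsa_subject bifrost bifrost-sa
```

The `:aud` and `:sub` condition keys are the provider path followed by that suffix, so `$(oidc_provider_path "$ISSUER"):sub` gives the key for the subject condition.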
+
+### Step 3: Provide Role ARN to Bifrost
+
+Send your IAM role ARN to the Bifrost team for allowlisting:
+
+```
+arn:aws:iam::YOUR_ACCOUNT_ID:role/BifrostECRPullRole
+```
+
+### Step 4: Create Namespace and ServiceAccount
+
+```bash
+kubectl create namespace bifrost
+```
+
+```yaml
+apiVersion: v1
+kind: ServiceAccount
+metadata:
+  name: bifrost-sa
+  namespace: bifrost
+  annotations:
+    eks.amazonaws.com/role-arn: arn:aws:iam::YOUR_ACCOUNT_ID:role/BifrostECRPullRole
+```
+
+### Step 5: Deploy Bifrost
+
+```yaml
+apiVersion: apps/v1
+kind: Deployment
+metadata:
+  name: bifrost
+  namespace: bifrost
+spec:
+  replicas: 2
+  selector:
+    matchLabels:
+      app: bifrost
+  template:
+    metadata:
+      labels:
+        app: bifrost
+    spec:
+      serviceAccountName: bifrost-sa
+      containers:
+      - name: bifrost
+        image: BIFROST_ACCOUNT_ID.dkr.ecr.us-east-1.amazonaws.com/YOUR_HUB_SLUG:latest
+        ports:
+        - containerPort: 8080
+          name: http
+        resources:
+          requests:
+            cpu: "250m"
+            memory: "512Mi"
+          limits:
+            cpu: "1000m"
+            memory: "2Gi"
+        livenessProbe:
+          httpGet:
+            path: /health
+            port: 8080
+          initialDelaySeconds: 30
+          periodSeconds: 10
+        readinessProbe:
+          httpGet:
+            path: /health
+            port: 8080
+          initialDelaySeconds: 10
+          periodSeconds: 5
+        volumeMounts:
+        - name: config
+          mountPath: /app/data/config.json
+          subPath: config.json
+      volumes:
+      - name: config
+        secret:
+          secretName: bifrost-config
+---
+apiVersion: v1
+kind: Service
+metadata:
+  name: bifrost
+  namespace: bifrost
+spec:
+  selector:
+    app: bifrost
+  ports:
+  - port: 80
+    targetPort: 8080
+    protocol: TCP
+  type: ClusterIP
+```
+
+## ECS Task Roles
+
+For ECS deployments, authentication uses IAM roles attached to the task; image pulls from ECR go through the task execution role.
+
+### Step 1: Create Task Execution Role
+
+The task execution role allows ECS to pull images from ECR.
+
+```json
+{
+  "Version": "2012-10-17",
+  "Statement": [
+    {
+      "Effect": "Allow",
+      "Action": [
+        "ecr:GetAuthorizationToken"
+      ],
+      "Resource": "*"
+    },
+    {
+      "Effect": "Allow",
+      "Action": [
+        "ecr:BatchCheckLayerAvailability",
+        "ecr:GetDownloadUrlForLayer",
+        "ecr:BatchGetImage"
+      ],
+      "Resource": "arn:aws:ecr:us-east-1:BIFROST_ACCOUNT_ID:repository/YOUR_HUB_SLUG"
+    },
+    {
+      "Effect": "Allow",
+      "Action": [
+        "logs:CreateLogStream",
+        "logs:PutLogEvents"
+      ],
+      "Resource": "*"
+    }
+  ]
+}
+```
+
+### Step 2: Create ECS Task Definition
+
+```json
+{
+  "family": "bifrost",
+  "networkMode": "awsvpc",
+  "requiresCompatibilities": ["FARGATE"],
+  "cpu": "512",
+  "memory": "1024",
+  "executionRoleArn": "arn:aws:iam::YOUR_ACCOUNT_ID:role/BifrostECSExecutionRole",
+  "containerDefinitions": [
+    {
+      "name": "bifrost",
+      "image": "BIFROST_ACCOUNT_ID.dkr.ecr.us-east-1.amazonaws.com/YOUR_HUB_SLUG:latest",
+      "portMappings": [
+        {
+          "containerPort": 8080,
+          "protocol": "tcp"
+        }
+      ],
+      "healthCheck": {
+        "command": ["CMD-SHELL", "curl -f http://localhost:8080/health || exit 1"],
+        "interval": 30,
+        "timeout": 5,
+        "retries": 3,
+        "startPeriod": 60
+      },
+      "logConfiguration": {
+        "logDriver": "awslogs",
+        "options": {
+          "awslogs-group": "/ecs/bifrost",
+          "awslogs-region": "us-east-1",
+          "awslogs-stream-prefix": "bifrost"
+        }
+      }
+    }
+  ]
+}
+```
+
+### Step 3: Create ECS Service
+
+```bash
+aws ecs create-service \
+  --cluster your-cluster \
+  --service-name bifrost \
+  --task-definition bifrost \
+  --desired-count 2 \
+  --launch-type FARGATE \
+  --network-configuration "awsvpcConfiguration={subnets=[subnet-xxx],securityGroups=[sg-xxx],assignPublicIp=ENABLED}"
+```
+
+## Verifying Access
+
+### Test ECR Authentication
+
+```bash
+# Get ECR login token
+aws ecr get-login-password --region us-east-1 | \
+  docker login --username AWS --password-stdin \
+  BIFROST_ACCOUNT_ID.dkr.ecr.us-east-1.amazonaws.com
+
+# Pull test
+docker pull BIFROST_ACCOUNT_ID.dkr.ecr.us-east-1.amazonaws.com/YOUR_HUB_SLUG:latest
+```
+
+### Verify IRSA Configuration
+
+```bash
+# Check ServiceAccount annotation
+kubectl get sa bifrost-sa -n bifrost -o yaml
+
+# Verify pod can assume role
+kubectl exec -it deployment/bifrost -n bifrost -- \
+  aws sts get-caller-identity
+```
+
+## Troubleshooting
+
+### ImagePullBackOff Errors
+
+1. **Check IAM Role trust policy**: Ensure the OIDC provider and ServiceAccount match
+2. **Verify ECR permissions**: Confirm the role has `ecr:BatchGetImage` permission
+3. **Check allowlisting**: Ensure your role ARN is allowlisted by the Bifrost team
+
+```bash
+# Check pod events
+kubectl describe pod -l app=bifrost -n bifrost
+
+# Check IRSA token
+kubectl exec -it deployment/bifrost -n bifrost -- \
+  cat /var/run/secrets/eks.amazonaws.com/serviceaccount/token
+```
+
+### Authentication Errors
+
+```bash
+# Verify OIDC provider is configured
+aws iam list-open-id-connect-providers
+
+# Check role assumption
+aws sts assume-role-with-web-identity \
+  --role-arn arn:aws:iam::YOUR_ACCOUNT_ID:role/BifrostECRPullRole \
+  --role-session-name test \
+  --web-identity-token file:///path/to/token
+```
+
+## Next Steps
+
+- Configure [Bifrost settings](/quickstart/gateway/setting-up) for your use case
+- Set up [observability](/features/observability/default) for monitoring
+- Enable [clustering](/enterprise/clustering) for high availability
--- a/docs/deployment-guides/enterprise/azure.mdx
+++ b/docs/deployment-guides/enterprise/azure.mdx
@@ -0,0 +1,451 @@
+---
+title: "Azure Deployment"
+description: "Deploy Bifrost Enterprise on Azure AKS using Workload Identity Federation to GCP Artifact Registry"
+icon: "microsoft"
+---
+
+Bifrost Enterprise images for Azure customers are distributed through GCP Artifact Registry, using Azure Workload Identity Federation for secure, credential-less authentication.
+
+## Architecture
+
+```mermaid
+flowchart LR
+    subgraph Azure[Azure Subscription]
+        subgraph AKS[AKS Cluster]
+            Pod[Bifrost Pod]
+            KSA[K8s ServiceAccount]
+        end
+        MI[Managed Identity]
+    end
+    
+    subgraph GCP[GCP Project]
+        WIF[Workload Identity<br/>Federation Pool]
+        GSA[GCP Service Account]
+        AR[Artifact Registry<br/>Bifrost Images]
+    end
+    
+    KSA -->|Federated| MI
+    MI -->|OIDC Token| WIF
+    WIF -->|Exchange| GSA
+    GSA -->|Pull Permission| AR
+    AR -->|Image| Pod
+```
+
+## How It Works
+
+Azure Workload Identity Federation allows Azure Managed Identities to authenticate to GCP without storing long-lived GCP credentials:
+
+1. **AKS Pod** requests a token using its Kubernetes ServiceAccount
+2. **Azure AD** issues an OIDC token for the Managed Identity
+3. **GCP Workload Identity Federation** validates the Azure token
+4. **GCP STS** exchanges it for a GCP access token
+5. **Pod** uses the GCP token to pull images from Artifact Registry
+
+## Prerequisites
+
+- AKS cluster (v1.24+) with Workload Identity enabled
+- Azure CLI configured with appropriate permissions
+- `kubectl` configured for your AKS cluster
+- Your Azure Tenant ID and Managed Identity Client ID provided to the Bifrost team
+
+<Note>
+Contact the Bifrost team with your Azure Tenant ID and Managed Identity Client IDs to get access configured.
+</Note>
+
+## Step 1: Enable Workload Identity on AKS
+
+If not already enabled, enable Workload Identity on your AKS cluster:
+
+```bash
+# For existing cluster
+az aks update \
+  --resource-group YOUR_RESOURCE_GROUP \
+  --name YOUR_CLUSTER_NAME \
+  --enable-oidc-issuer \
+  --enable-workload-identity
+
+# Get the OIDC issuer URL
+az aks show \
+  --resource-group YOUR_RESOURCE_GROUP \
+  --name YOUR_CLUSTER_NAME \
+  --query "oidcIssuerProfile.issuerUrl" -o tsv
+```
+
+## Step 2: Create Azure Managed Identity
+
+```bash
+# Create Managed Identity
+az identity create \
+  --name bifrost-pull-identity \
+  --resource-group YOUR_RESOURCE_GROUP \
+  --location YOUR_LOCATION
+
+# Get the Client ID
+CLIENT_ID=$(az identity show \
+  --name bifrost-pull-identity \
+  --resource-group YOUR_RESOURCE_GROUP \
+  --query clientId -o tsv)
+
+echo "Client ID: $CLIENT_ID"
+```
+
+## Step 3: Create Federated Credential
+
+Link the Kubernetes ServiceAccount to the Azure Managed Identity:
+
+```bash
+# Get AKS OIDC issuer
+AKS_OIDC_ISSUER=$(az aks show \
+  --resource-group YOUR_RESOURCE_GROUP \
+  --name YOUR_CLUSTER_NAME \
+  --query "oidcIssuerProfile.issuerUrl" -o tsv)
+
+# Create federated credential
+az identity federated-credential create \
+  --name bifrost-federated-credential \
+  --identity-name bifrost-pull-identity \
+  --resource-group YOUR_RESOURCE_GROUP \
+  --issuer "$AKS_OIDC_ISSUER" \
+  --subject "system:serviceaccount:bifrost:bifrost-sa" \
+  --audience "api://AzureADTokenExchange"
+```
+
+## Step 4: Provide Details to Bifrost Team
+
+Send the following information to the Bifrost team:
+
+```bash
+# Get Tenant ID
+az account show --query tenantId -o tsv
+
+# Get Client ID
+az identity show \
+  --name bifrost-pull-identity \
+  --resource-group YOUR_RESOURCE_GROUP \
+  --query clientId -o tsv
+```
+
+The Bifrost team will configure GCP Workload Identity Federation to trust your Azure Managed Identity.
+
+## Step 5: Store GCP Credential Configuration
+
+After the Bifrost team configures access, they will provide a credential configuration. Store it as a ConfigMap:
+
+```yaml
+apiVersion: v1
+kind: ConfigMap
+metadata:
+  name: gcp-credential-config
+  namespace: bifrost
+data:
+  credential-config.json: |
+    {
+      "type": "external_account",
+      "audience": "//iam.googleapis.com/projects/BIFROST_PROJECT_NUMBER/locations/global/workloadIdentityPools/YOUR_HUB_SLUG-azure-pool/providers/YOUR_HUB_SLUG-azure-provider",
+      "subject_token_type": "urn:ietf:params:oauth:token-type:jwt",
+      "service_account_impersonation_url": "https://iamcredentials.googleapis.com/v1/projects/-/serviceAccounts/BIFROST_SA@BIFROST_PROJECT.iam.gserviceaccount.com:generateAccessToken",
+      "token_url": "https://sts.googleapis.com/v1/token",
+      "credential_source": {
+        "file": "/var/run/secrets/azure/tokens/azure-identity-token",
+        "format": {
+          "type": "text"
+        }
+      }
+    }
+```
+
+<Warning>
+The Bifrost team will provide the exact values for `BIFROST_PROJECT_NUMBER`, `YOUR_HUB_SLUG`, and `BIFROST_SA`.
+</Warning>
+
+## Step 6: Create Kubernetes ServiceAccount
+
+```yaml
+apiVersion: v1
+kind: ServiceAccount
+metadata:
+  name: bifrost-sa
+  namespace: bifrost
+  annotations:
+    azure.workload.identity/client-id: YOUR_MANAGED_IDENTITY_CLIENT_ID
+  labels:
+    azure.workload.identity/use: "true"
+```
+
+## Step 7: Create Image Pull Secret with Token Refresh
+
+Create a CronJob to refresh the imagePullSecret using the federated identity:
+
+```yaml
+apiVersion: batch/v1
+kind: CronJob
+metadata:
+  name: refresh-ar-secret
+  namespace: bifrost
+spec:
+  schedule: "*/30 * * * *"  # Every 30 minutes
+  successfulJobsHistoryLimit: 1
+  failedJobsHistoryLimit: 3
+  jobTemplate:
+    spec:
+      template:
+        metadata:
+          labels:
+            azure.workload.identity/use: "true"
+        spec:
+          serviceAccountName: bifrost-sa
+          containers:
+          - name: token-refresh
+            image: google/cloud-sdk:slim
+            command: ["/bin/bash", "-c"]
+            args:
+            - |
+              set -e
+              
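+              # Set GCP credential config
+              export GOOGLE_APPLICATION_CREDENTIALS=/etc/gcp/credential-config.json
+              
+              # Authenticate gcloud with the federated credential config
+              gcloud auth login --cred-file="$GOOGLE_APPLICATION_CREDENTIALS" --quiet
+              
+              # Get GCP access token via federation
+              TOKEN=$(gcloud auth print-access-token)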
+              
+              # Delete existing secret if it exists
+              kubectl delete secret ar-pull-secret --ignore-not-found -n bifrost
+              
+              # Create new imagePullSecret
+              kubectl create secret docker-registry ar-pull-secret \
+                --docker-server=REGION-docker.pkg.dev \
+                --docker-username=oauth2accesstoken \
+                --docker-password="$TOKEN" \
+                -n bifrost
+              
+              echo "Secret refreshed at $(date)"
+            volumeMounts:
+            - name: gcp-credential-config
+              mountPath: /etc/gcp
+              readOnly: true
+            - name: azure-identity-token
+              mountPath: /var/run/secrets/azure/tokens
+              readOnly: true
+          volumes:
+          - name: gcp-credential-config
+            configMap:
+              name: gcp-credential-config
+          - name: azure-identity-token
+            projected:
+              sources:
+              - serviceAccountToken:
+                  path: azure-identity-token
+                  expirationSeconds: 3600
+                  audience: api://AzureADTokenExchange
+          restartPolicy: OnFailure
+---
+apiVersion: rbac.authorization.k8s.io/v1
+kind: Role
+metadata:
+  name: secret-manager
+  namespace: bifrost
+rules:
+- apiGroups: [""]
+  resources: ["secrets"]
+  verbs: ["get", "create", "delete"]
+---
+apiVersion: rbac.authorization.k8s.io/v1
+kind: RoleBinding
+metadata:
+  name: secret-manager-binding
+  namespace: bifrost
+subjects:
+- kind: ServiceAccount
+  name: bifrost-sa
+  namespace: bifrost
+roleRef:
+  kind: Role
+  name: secret-manager
+  apiGroup: rbac.authorization.k8s.io
+```
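
For reference, the secret the CronJob recreates holds a small Docker config document. The sketch below shows roughly what that payload looks like for an access token; `ar_dockerconfigjson` is a hypothetical helper, the token is a dummy value, and the exact fields written by `kubectl create secret docker-registry` may differ slightly.

```shell
# Print a .dockerconfigjson-style payload for an Artifact Registry host.
# $1: registry host, $2: access token (assumed to contain no JSON-special characters).
ar_dockerconfigjson() {
  local auth
  # "auth" is base64 of "username:password"; strip line wraps added by base64.
  auth=$(printf '%s' "oauth2accesstoken:$2" | base64 | tr -d '\n')
  printf '{"auths":{"%s":{"username":"oauth2accesstoken","password":"%s","auth":"%s"}}}\n' \
    "$1" "$2" "$auth"
}

ar_dockerconfigjson REGION-docker.pkg.dev example-token
```

Because the embedded token is short-lived, the payload goes stale unless the secret is rebuilt on a schedule, which is the job of the CronJob above.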
+
+## Step 8: Deploy Bifrost
+
+```yaml
+apiVersion: apps/v1
+kind: Deployment
+metadata:
+  name: bifrost
+  namespace: bifrost
+spec:
+  replicas: 2
+  selector:
+    matchLabels:
+      app: bifrost
+  template:
+    metadata:
+      labels:
+        app: bifrost
+        azure.workload.identity/use: "true"
+    spec:
+      serviceAccountName: bifrost-sa
+      imagePullSecrets:
+      - name: ar-pull-secret
+      containers:
+      - name: bifrost
+        image: REGION-docker.pkg.dev/BIFROST_PROJECT/YOUR_HUB_SLUG/bifrost:latest
+        ports:
+        - containerPort: 8080
+          name: http
+        resources:
+          requests:
+            cpu: "250m"
+            memory: "512Mi"
+          limits:
+            cpu: "1000m"
+            memory: "2Gi"
+        livenessProbe:
+          httpGet:
+            path: /health
+            port: 8080
+          initialDelaySeconds: 30
+          periodSeconds: 10
+        readinessProbe:
+          httpGet:
+            path: /health
+            port: 8080
+          initialDelaySeconds: 10
+          periodSeconds: 5
+        volumeMounts:
+        - name: config
+          mountPath: /app/data/config.json
+          subPath: config.json
+      volumes:
+      - name: config
+        secret:
+          secretName: bifrost-config
+---
+apiVersion: v1
+kind: Service
+metadata:
+  name: bifrost
+  namespace: bifrost
+spec:
+  selector:
+    app: bifrost
+  ports:
+  - port: 80
+    targetPort: 8080
+    protocol: TCP
+  type: ClusterIP
+```
+
+## Bootstrap: Initial Secret Creation
+
+Before the first deployment, manually trigger the CronJob or create the secret:
+
+```bash
+# Create namespace
+kubectl create namespace bifrost
+
+# Apply all configurations
+kubectl apply -f configmap.yaml
+kubectl apply -f serviceaccount.yaml
+kubectl apply -f cronjob.yaml
+
+# Manually trigger the CronJob
+kubectl create job --from=cronjob/refresh-ar-secret initial-refresh -n bifrost
+
+# Wait for completion
+kubectl wait --for=condition=complete job/initial-refresh -n bifrost --timeout=120s
+
+# Verify secret was created
+kubectl get secret ar-pull-secret -n bifrost
+```
+
+## Verifying Access
+
+### Check Workload Identity Configuration
+
+```bash
+# Verify AKS has Workload Identity enabled
+az aks show \
+  --resource-group YOUR_RESOURCE_GROUP \
+  --name YOUR_CLUSTER_NAME \
+  --query "oidcIssuerProfile.enabled" -o tsv
+
+# Check federated credential
+az identity federated-credential show \
+  --name bifrost-federated-credential \
+  --identity-name bifrost-pull-identity \
+  --resource-group YOUR_RESOURCE_GROUP
+```
+
+### Verify Token Exchange
+
+```bash
+# Check CronJob ran successfully
+kubectl get jobs -n bifrost
+
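+# View logs for a refresh Job (substitute a Job name from the list above;
+# CronJob-created Jobs are named refresh-ar-secret-<timestamp>)
+kubectl logs job/JOB_NAME -n bifrost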
+
+# Verify imagePullSecret exists
+kubectl get secret ar-pull-secret -n bifrost -o yaml
+```
+
+## Troubleshooting
+
+### ImagePullBackOff Errors
+
+1. **Check imagePullSecret exists**: `kubectl get secret ar-pull-secret -n bifrost`
+2. **Verify CronJob succeeded**: `kubectl get jobs -n bifrost`
+3. **Check Azure Workload Identity**: Ensure the pod template carries the `azure.workload.identity/use: "true"` label and the ServiceAccount annotations are correct
+
+```bash
+# Check pod events
+kubectl describe pod -l app=bifrost -n bifrost
+
+# Check ServiceAccount has correct annotations
+kubectl get sa bifrost-sa -n bifrost -o yaml
+```
+
+### Token Exchange Failures
+
+```bash
+# Check refresh Job logs for errors (find JOB_NAME with `kubectl get jobs -n bifrost`)
+kubectl logs job/JOB_NAME -n bifrost
+
+# Common issues:
+# - "audience mismatch": Check credential-config.json audience field
+# - "subject mismatch": Verify federated credential subject matches SA
+# - "permission denied": Contact Bifrost team to verify WIF configuration
+```
+
+### Azure Workload Identity Issues
+
+```bash
+# Verify Managed Identity exists
+az identity show \
+  --name bifrost-pull-identity \
+  --resource-group YOUR_RESOURCE_GROUP
+
+# Check federated credentials
+az identity federated-credential list \
+  --identity-name bifrost-pull-identity \
+  --resource-group YOUR_RESOURCE_GROUP
+
+# Verify pod has identity token mounted
+kubectl exec -it deployment/bifrost -n bifrost -- \
+  ls -la /var/run/secrets/azure/tokens/
+```
+
+## Summary
+
+| Component | Value |
+|-----------|-------|
+| Registry | GCP Artifact Registry |
+| Authentication | Azure WIF -> GCP WIF -> GCP SA |
+| Token Lifetime | 60 minutes (auto-refreshed every 30 min) |
+| Secret Name | `ar-pull-secret` |
+
+## Next Steps
+
+- Configure [Bifrost settings](/quickstart/gateway/setting-up) for your use case
+- Set up [observability](/features/observability/default) for monitoring
+- Enable [clustering](/enterprise/clustering) for high availability
--- a/docs/deployment-guides/enterprise/gcp.mdx
+++ b/docs/deployment-guides/enterprise/gcp.mdx
@@ -0,0 +1,386 @@
+---
+title: "GCP Deployment"
+description: "Deploy Bifrost Enterprise on GCP using Artifact Registry with Workload Identity"
+icon: "google"
+---
+
+Bifrost Enterprise images for GCP customers are distributed through GCP Artifact Registry, enabling native Workload Identity for secure, keyless authentication.
+
+## Architecture
+
+```mermaid
+flowchart LR
+    subgraph GCP[GCP Project]
+        subgraph GKE[GKE Cluster]
+            Pod[Bifrost Pod]
+            KSA[K8s ServiceAccount]
+        end
+        GSA[GCP Service Account]
+        AR[Artifact Registry<br/>Bifrost Images]
+    end
+    
+    KSA -->|Workload Identity| GSA
+    Pod -->|Impersonates| GSA
+    GSA -->|Pull Permission| AR
+    AR -->|Image| Pod
+```
+
+## Prerequisites
+
+- GKE cluster (v1.24+) with Workload Identity enabled
+- `gcloud` CLI configured with appropriate permissions
+- `kubectl` configured for your GKE cluster
+- Your GCP project allowlisted by the Bifrost team
+
+<Note>
+Contact the Bifrost team with your GCP project ID and service account email to get access configured.
+</Note>
+
+## Workload Identity (Recommended)
+
+Workload Identity provides the most secure authentication method for GKE deployments by eliminating the need for service account keys.
+
+### Step 1: Enable Workload Identity on GKE
+
+If not already enabled, enable Workload Identity on your cluster:
+
+```bash
+# For existing cluster
+gcloud container clusters update YOUR_CLUSTER_NAME \
+  --region=YOUR_REGION \
+  --workload-pool=YOUR_PROJECT_ID.svc.id.goog
+
+# Verify Workload Identity is enabled
+gcloud container clusters describe YOUR_CLUSTER_NAME \
+  --region=YOUR_REGION \
+  --format="value(workloadIdentityConfig.workloadPool)"
+```
+
+### Step 2: Create GCP Service Account
+
+Create a service account that will be used to pull images:
+
+```bash
+# Create service account
+gcloud iam service-accounts create bifrost-pull-sa \
+  --display-name="Bifrost Image Pull SA" \
+  --project=YOUR_PROJECT_ID
+```
+
+### Step 3: Request Access from Bifrost Team
+
+Provide the following to the Bifrost team:
+- Your GCP project ID
+- Service account email: `bifrost-pull-sa@YOUR_PROJECT_ID.iam.gserviceaccount.com`
+
+The Bifrost team will grant the necessary permissions to pull images from the registry.
+
+### Step 4: Create Namespace and ServiceAccount
+
+```bash
+kubectl create namespace bifrost
+```
+
+```yaml
+apiVersion: v1
+kind: ServiceAccount
+metadata:
+  name: bifrost-sa
+  namespace: bifrost
+  annotations:
+    iam.gke.io/gcp-service-account: bifrost-pull-sa@YOUR_PROJECT_ID.iam.gserviceaccount.com
+```
+
+### Step 5: Bind Kubernetes SA to GCP SA
+
+Allow the Kubernetes ServiceAccount to impersonate the GCP Service Account:
+
+```bash
+gcloud iam service-accounts add-iam-policy-binding \
+  bifrost-pull-sa@YOUR_PROJECT_ID.iam.gserviceaccount.com \
+  --role=roles/iam.workloadIdentityUser \
+  --member="serviceAccount:YOUR_PROJECT_ID.svc.id.goog[bifrost/bifrost-sa]"
+```
+
+### Step 6: Create Image Pull Secret with Token Refresh
+
+Artifact Registry tokens expire after 60 minutes. Use a CronJob to refresh the imagePullSecret:
+
+```yaml
+apiVersion: batch/v1
+kind: CronJob
+metadata:
+  name: refresh-ar-secret
+  namespace: bifrost
+spec:
+  schedule: "*/30 * * * *"  # Every 30 minutes
+  successfulJobsHistoryLimit: 1
+  failedJobsHistoryLimit: 3
+  jobTemplate:
+    spec:
+      template:
+        spec:
+          serviceAccountName: bifrost-sa
+          containers:
+          - name: token-refresh
+            image: google/cloud-sdk:latest  # full image; the slim variant does not ship kubectl
+            command: ["/bin/bash", "-c"]
+            args:
+            - |
+              set -e
+              
+              # Get access token using Workload Identity
+              TOKEN=$(gcloud auth print-access-token)
+              
+              # Delete existing secret if it exists
+              kubectl delete secret ar-pull-secret --ignore-not-found -n bifrost
+              
+              # Create new imagePullSecret
+              kubectl create secret docker-registry ar-pull-secret \
+                --docker-server=REGION-docker.pkg.dev \
+                --docker-username=oauth2accesstoken \
+                --docker-password="$TOKEN" \
+                -n bifrost
+              
+              echo "Secret refreshed at $(date)"
+          restartPolicy: OnFailure
+---
+apiVersion: rbac.authorization.k8s.io/v1
+kind: Role
+metadata:
+  name: secret-manager
+  namespace: bifrost
+rules:
+- apiGroups: [""]
+  resources: ["secrets"]
+  verbs: ["get", "create", "delete"]
+---
+apiVersion: rbac.authorization.k8s.io/v1
+kind: RoleBinding
+metadata:
+  name: secret-manager-binding
+  namespace: bifrost
+subjects:
+- kind: ServiceAccount
+  name: bifrost-sa
+  namespace: bifrost
+roleRef:
+  kind: Role
+  name: secret-manager
+  apiGroup: rbac.authorization.k8s.io
+```
+
+<Warning>
+Replace `REGION` with your Artifact Registry region (e.g., `us-central1`).
+</Warning>
+
+### Step 7: Deploy Bifrost
+
+```yaml
+apiVersion: apps/v1
+kind: Deployment
+metadata:
+  name: bifrost
+  namespace: bifrost
+spec:
+  replicas: 2
+  selector:
+    matchLabels:
+      app: bifrost
+  template:
+    metadata:
+      labels:
+        app: bifrost
+    spec:
+      serviceAccountName: bifrost-sa
+      imagePullSecrets:
+      - name: ar-pull-secret
+      containers:
+      - name: bifrost
+        image: REGION-docker.pkg.dev/BIFROST_PROJECT/YOUR_HUB_SLUG/bifrost:latest
+        ports:
+        - containerPort: 8080
+          name: http
+        resources:
+          requests:
+            cpu: "250m"
+            memory: "512Mi"
+          limits:
+            cpu: "1000m"
+            memory: "2Gi"
+        livenessProbe:
+          httpGet:
+            path: /health
+            port: 8080
+          initialDelaySeconds: 30
+          periodSeconds: 10
+        readinessProbe:
+          httpGet:
+            path: /health
+            port: 8080
+          initialDelaySeconds: 10
+          periodSeconds: 5
+        volumeMounts:
+        - name: config
+          mountPath: /app/data/config.json
+          subPath: config.json
+      volumes:
+      - name: config
+        secret:
+          secretName: bifrost-config
+---
+apiVersion: v1
+kind: Service
+metadata:
+  name: bifrost
+  namespace: bifrost
+spec:
+  selector:
+    app: bifrost
+  ports:
+  - port: 80
+    targetPort: 8080
+    protocol: TCP
+  type: ClusterIP
+```
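+
+The Deployment above mounts a Secret named `bifrost-config` with a `config.json` key, so that Secret needs to exist before the pods can start. A minimal sketch of one way to create it, assuming your configuration lives in a local `config.json`; the helper name `create_bifrost_config_secret` is hypothetical and only wraps a standard `kubectl create secret generic` call:
+
+```bash
+# Hypothetical helper: create the bifrost-config Secret from a local file.
+# The Secret name and the config.json key match the volume in the Deployment above.
+create_bifrost_config_secret() {
+  local config_file="${1:-config.json}"
+  kubectl create secret generic bifrost-config \
+    --from-file=config.json="$config_file" \
+    --namespace=bifrost
+}
+```
+
+Run `create_bifrost_config_secret ./config.json` once before applying the Deployment.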
+
+### Bootstrap: Initial Secret Creation
+
+Before the first deployment, manually create the initial imagePullSecret:
+
+```bash
+# Authenticate gcloud
+gcloud auth login
+
+# Create initial secret
+kubectl create secret docker-registry ar-pull-secret \
+  --docker-server=REGION-docker.pkg.dev \
+  --docker-username=oauth2accesstoken \
+  --docker-password="$(gcloud auth print-access-token)" \
+  -n bifrost
+```
+
+## Service Account Impersonation
+
+For cross-project deployments or when you need to use an existing service account:
+
+### Configure Impersonation
+
+```bash
+# Grant impersonation permission
+gcloud iam service-accounts add-iam-policy-binding \
+  BIFROST_PROVIDED_SA@BIFROST_PROJECT.iam.gserviceaccount.com \
+  --role=roles/iam.serviceAccountTokenCreator \
+  --member="serviceAccount:bifrost-pull-sa@YOUR_PROJECT_ID.iam.gserviceaccount.com"
+```
+
+### Token Refresh with Impersonation
+
+Update the CronJob to use impersonation:
+
+```yaml
+args:
+- |
+  set -e
+  
+  # Get access token by impersonating the Bifrost SA
+  TOKEN=$(gcloud auth print-access-token \
+    --impersonate-service-account=BIFROST_PROVIDED_SA@BIFROST_PROJECT.iam.gserviceaccount.com)
+  
+  kubectl delete secret ar-pull-secret --ignore-not-found -n bifrost
+  kubectl create secret docker-registry ar-pull-secret \
+    --docker-server=REGION-docker.pkg.dev \
+    --docker-username=oauth2accesstoken \
+    --docker-password="$TOKEN" \
+    -n bifrost
+```
+
+## Service Account Key (Legacy)
+
+<Warning>
+Service account keys are not recommended for production. Use Workload Identity instead.
+</Warning>
+
+For environments that cannot use Workload Identity:
+
+```bash
+# Obtain the service account key file (sa-key.json) from the Bifrost team
+# and store it securely
+
+# Create imagePullSecret
+kubectl create secret docker-registry ar-pull-secret \
+  --docker-server=REGION-docker.pkg.dev \
+  --docker-username=_json_key \
+  --docker-password="$(cat sa-key.json)" \
+  -n bifrost
+```
+
+## Verifying Access
+
+### Test Artifact Registry Authentication
+
+```bash
+# Configure docker for Artifact Registry
+gcloud auth configure-docker REGION-docker.pkg.dev
+
+# Pull test (requires impersonation or direct access)
+docker pull REGION-docker.pkg.dev/BIFROST_PROJECT/YOUR_HUB_SLUG/bifrost:latest
+```
+
+### Verify Workload Identity Configuration
+
+```bash
+# Check ServiceAccount annotation
+kubectl get sa bifrost-sa -n bifrost -o yaml
+
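+# Verify Workload Identity from a throwaway pod running as bifrost-sa
+# (works even if the Bifrost image does not bundle gcloud)
+kubectl run wi-test --rm -it --restart=Never -n bifrost \
+  --image=google/cloud-sdk:slim \
+  --overrides='{"spec":{"serviceAccountName":"bifrost-sa"}}' \
+  -- gcloud auth print-access-token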
+
+# Check token refresh CronJob
+kubectl get cronjob refresh-ar-secret -n bifrost
+kubectl get jobs -n bifrost
+```
+
+## Troubleshooting
+
+### ImagePullBackOff Errors
+
+1. **Check imagePullSecret exists**: `kubectl get secret ar-pull-secret -n bifrost`
+2. **Verify token is valid**: Check if CronJob ran successfully
+3. **Check Workload Identity binding**: Ensure GCP SA is bound to K8s SA
+
+```bash
+# Check pod events
+kubectl describe pod -l app=bifrost -n bifrost
+
+# Manually refresh token
+kubectl create job --from=cronjob/refresh-ar-secret manual-refresh -n bifrost
+```
+
+### Workload Identity Issues
+
+```bash
+# Verify Workload Identity pool
+gcloud container clusters describe YOUR_CLUSTER_NAME \
+  --region=YOUR_REGION \
+  --format="value(workloadIdentityConfig.workloadPool)"
+
+# Check IAM binding
+gcloud iam service-accounts get-iam-policy \
+  bifrost-pull-sa@YOUR_PROJECT_ID.iam.gserviceaccount.com
+```
+
+### Token Expiration
+
+If pods fail to pull images after 60 minutes:
+
+1. Verify CronJob is running: `kubectl get cronjob -n bifrost`
+2. Check refresh Job logs: `kubectl logs job/JOB_NAME -n bifrost` (list Job names with `kubectl get jobs -n bifrost`)
+3. Manually trigger refresh: `kubectl create job --from=cronjob/refresh-ar-secret manual-refresh -n bifrost`
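+
+Because the refresh CronJob deletes and recreates `ar-pull-secret` on every run, the Secret's `creationTimestamp` shows when the token was last refreshed. The sketch below turns that timestamp into an age in minutes; anything approaching 60 suggests the refresh has stalled. The helper name `secret_age_minutes` is hypothetical, and the script assumes GNU `date` (the `-d` flag):
+
+```bash
+# Hypothetical helper: minutes elapsed between an RFC 3339 timestamp ($1)
+# and a Unix epoch in seconds ($2). Assumes GNU date; BSD/macOS date differs.
+secret_age_minutes() {
+  local created_epoch
+  created_epoch=$(date -u -d "$1" +%s) || return 1
+  echo $(( ($2 - created_epoch) / 60 ))
+}
+
+# Usage against a cluster (shown as comments so the helper has no side effects):
+#   created=$(kubectl get secret ar-pull-secret -n bifrost \
+#     -o jsonpath='{.metadata.creationTimestamp}')
+#   secret_age_minutes "$created" "$(date +%s)"
+```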
+
+## Next Steps
+
+- Configure [Bifrost settings](/quickstart/gateway/setting-up) for your use case
+- Set up [observability](/features/observability/default) for monitoring
+- Enable [clustering](/enterprise/clustering) for high availability
--- a/docs/deployment-guides/enterprise/on-premise.mdx
+++ b/docs/deployment-guides/enterprise/on-premise.mdx
@@ -0,0 +1,541 @@
+---
+title: "On-Premise Deployment"
+description: "Deploy Bifrost Enterprise in on-premise or air-gapped environments using Docker credentials"
+icon: "server"
+---
+
+Bifrost Enterprise supports on-premise deployments for environments that cannot use cloud-native identity federation. Images are pulled from GCP Artifact Registry using username/password authentication.
+
+## Architecture
+
+```mermaid
+flowchart LR
+    subgraph OnPrem[On-Premise Environment]
+        subgraph K8s[Kubernetes Cluster]
+            Pod[Bifrost Pod]
+            Secret[imagePullSecret]
+        end
+        Docker[Docker Daemon]
+    end
+    
+    subgraph GCP[GCP]
+        AR[Artifact Registry<br/>Bifrost Images]
+    end
+    
+    Secret -->|Credentials| Pod
+    Pod -->|Pull| AR
+    Docker -->|Pull| AR
+    AR -->|Image| Pod
+    AR -->|Image| Docker
+```
+
+## Prerequisites
+
+- Kubernetes cluster (v1.23+) or Docker runtime
+- Network access to `us-central1-docker.pkg.dev` (or your designated region)
+- Docker credentials provided by Bifrost team
+
+<Note>
+Contact the Bifrost team to receive your Docker username and password credentials.
+</Note>
+
+## Credentials
+
+The Bifrost team will provide you with:
+
+| Credential | Description |
+|------------|-------------|
+| **Username** | `_json_key` (fixed value for GCP Artifact Registry) |
+| **Password** | Service account JSON key (base64 encoded or raw JSON) |
+| **Registry** | `REGION-docker.pkg.dev` (e.g., `us-central1-docker.pkg.dev`) |
+| **Repository** | `REGION-docker.pkg.dev/BIFROST_PROJECT/YOUR_HUB_SLUG` |
+
+<Warning>
+Store credentials securely. Never commit them to version control or expose them in logs.
+</Warning>
+
+## Docker Deployment
+
+### Step 1: Login to Registry
+
+```bash
+# Using the JSON key file
+cat bifrost-credentials.json | docker login -u _json_key --password-stdin https://REGION-docker.pkg.dev
+
+# Or passing the password as an argument (less secure: it can end up in shell history and process listings)
+docker login -u _json_key -p "$(cat bifrost-credentials.json)" https://REGION-docker.pkg.dev
+```
+
+### Step 2: Pull the Image
+
+```bash
+docker pull REGION-docker.pkg.dev/BIFROST_PROJECT/YOUR_HUB_SLUG/bifrost:latest
+```
+
+### Step 3: Run Bifrost
+
+```bash
+docker run -d \
+  --name bifrost \
+  -p 8080:8080 \
+  -v /path/to/config.json:/app/data/config.json:ro \
+  -v /path/to/data:/app/data \
+  REGION-docker.pkg.dev/BIFROST_PROJECT/YOUR_HUB_SLUG/bifrost:latest
+```
+
+## Kubernetes Deployment
+
+### Step 1: Create Namespace
+
+```bash
+kubectl create namespace bifrost
+```
+
+### Step 2: Create imagePullSecret
+
+<Tabs>
+<Tab title="From JSON Key File">
+```bash
+kubectl create secret docker-registry bifrost-pull-secret \
+  --docker-server=REGION-docker.pkg.dev \
+  --docker-username=_json_key \
+  --docker-password="$(cat bifrost-credentials.json)" \
+  --namespace=bifrost
+```
+</Tab>
+<Tab title="From Base64 Key">
+```bash
+# If you received a base64-encoded key
+kubectl create secret docker-registry bifrost-pull-secret \
+  --docker-server=REGION-docker.pkg.dev \
+  --docker-username=_json_key \
+  --docker-password="$(echo 'BASE64_ENCODED_KEY' | base64 -d)" \
+  --namespace=bifrost
+```
+</Tab>
+<Tab title="Using YAML">
+```yaml
+apiVersion: v1
+kind: Secret
+metadata:
+  name: bifrost-pull-secret
+  namespace: bifrost
+type: kubernetes.io/dockerconfigjson
+data:
+  .dockerconfigjson: <BASE64_ENCODED_DOCKER_CONFIG>
+```
+
+Generate the base64-encoded config:
+
+```bash
+# Create docker config. jq escapes the JSON key so it can be embedded as a
+# string; pasting the raw key into a heredoc would produce invalid JSON.
+jq -n --arg sa_key "$(cat bifrost-credentials.json)" '{
+  auths: {
+    "REGION-docker.pkg.dev": {
+      username: "_json_key",
+      password: $sa_key,
+      auth: ("_json_key:" + $sa_key | @base64)
+    }
+  }
+}' > docker-config.json
+
+# Base64 encode for secret
+base64 -w 0 < docker-config.json
+```
+</Tab>
+</Tabs>
+
+### Step 3: Create Bifrost Configuration
+
+<Note>
+If you use PostgreSQL for `config_store` or `logs_store`, ensure the target database is UTF8 encoded. See [PostgreSQL UTF8 Requirement](../../quickstart/gateway/setting-up#postgresql-utf8-requirement).
+</Note>
+
+```yaml
+apiVersion: v1
+kind: Secret
+metadata:
+  name: bifrost-config
+  namespace: bifrost
+type: Opaque
+stringData:
+  config.json: |
+    {
+      "config_store": {
+        "enabled": true,
+        "type": "postgres",
+        "config": {
+          "host": "postgres.bifrost.svc.cluster.local",
+          "port": "5432",
+          "user": "bifrost",
+          "password": "YOUR_PASSWORD",
+          "db_name": "bifrost",
+          "ssl_mode": "disable"
+        }
+      },
+      "logs_store": {
+        "enabled": true,
+        "type": "postgres",
+        "config": {
+          "host": "postgres.bifrost.svc.cluster.local",
+          "port": "5432",
+          "user": "bifrost",
+          "password": "YOUR_PASSWORD",
+          "db_name": "bifrost",
+          "ssl_mode": "disable"
+        }
+      }
+    }
+```
+
+### Step 4: Deploy Bifrost
+
+```yaml
+apiVersion: apps/v1
+kind: Deployment
+metadata:
+  name: bifrost
+  namespace: bifrost
+spec:
+  replicas: 2
+  selector:
+    matchLabels:
+      app: bifrost
+  template:
+    metadata:
+      labels:
+        app: bifrost
+    spec:
+      imagePullSecrets:
+      - name: bifrost-pull-secret
+      containers:
+      - name: bifrost
+        image: REGION-docker.pkg.dev/BIFROST_PROJECT/YOUR_HUB_SLUG/bifrost:latest
+        ports:
+        - containerPort: 8080
+          name: http
+        resources:
+          requests:
+            cpu: "250m"
+            memory: "512Mi"
+          limits:
+            cpu: "1000m"
+            memory: "2Gi"
+        livenessProbe:
+          httpGet:
+            path: /health
+            port: 8080
+          initialDelaySeconds: 30
+          periodSeconds: 10
+        readinessProbe:
+          httpGet:
+            path: /health
+            port: 8080
+          initialDelaySeconds: 10
+          periodSeconds: 5
+        volumeMounts:
+        - name: config
+          mountPath: /app/data/config.json
+          subPath: config.json
+        - name: data
+          mountPath: /app/data
+      volumes:
+      - name: config
+        secret:
+          secretName: bifrost-config
+      - name: data
+        persistentVolumeClaim:
+          claimName: bifrost-data
+---
+apiVersion: v1
+kind: Service
+metadata:
+  name: bifrost
+  namespace: bifrost
+spec:
+  selector:
+    app: bifrost
+  ports:
+  - port: 80
+    targetPort: 8080
+    protocol: TCP
+  type: ClusterIP
+---
+apiVersion: v1
+kind: PersistentVolumeClaim
+metadata:
+  name: bifrost-data
+  namespace: bifrost
+spec:
+  accessModes:
+  - ReadWriteOnce
+  resources:
+    requests:
+      storage: 10Gi
+```
+
+### Step 5: Expose Bifrost (Optional)
+
+<Tabs>
+<Tab title="Ingress">
+```yaml
+apiVersion: networking.k8s.io/v1
+kind: Ingress
+metadata:
+  name: bifrost
+  namespace: bifrost
+  annotations:
+    nginx.ingress.kubernetes.io/proxy-body-size: "50m"
+spec:
+  ingressClassName: nginx
+  rules:
+  - host: bifrost.your-domain.com
+    http:
+      paths:
+      - path: /
+        pathType: Prefix
+        backend:
+          service:
+            name: bifrost
+            port:
+              number: 80
+  tls:
+  - hosts:
+    - bifrost.your-domain.com
+    secretName: bifrost-tls
+```
+</Tab>
+<Tab title="LoadBalancer">
+```yaml
+apiVersion: v1
+kind: Service
+metadata:
+  name: bifrost-lb
+  namespace: bifrost
+spec:
+  selector:
+    app: bifrost
+  ports:
+  - port: 80
+    targetPort: 8080
+    protocol: TCP
+  type: LoadBalancer
+```
+</Tab>
+<Tab title="NodePort">
+```yaml
+apiVersion: v1
+kind: Service
+metadata:
+  name: bifrost-nodeport
+  namespace: bifrost
+spec:
+  selector:
+    app: bifrost
+  ports:
+  - port: 80
+    targetPort: 8080
+    nodePort: 30080
+    protocol: TCP
+  type: NodePort
+```
+</Tab>
+</Tabs>
+
+## Docker Compose Deployment
+
+For simpler deployments without Kubernetes:
+
+```yaml
+version: '3.8'
+
+services:
+  bifrost:
+    image: REGION-docker.pkg.dev/BIFROST_PROJECT/YOUR_HUB_SLUG/bifrost:latest
+    container_name: bifrost
+    ports:
+      - "8080:8080"
+    volumes:
+      - ./config.json:/app/data/config.json:ro
+      - bifrost-data:/app/data
+    environment:
+      - BIFROST_LOG_LEVEL=info
+    healthcheck:
+      test: ["CMD", "wget", "--no-verbose", "--tries=1", "--spider", "http://localhost:8080/health"]
+      interval: 30s
+      timeout: 10s
+      retries: 3
+      start_period: 40s
+    restart: unless-stopped
+
+  postgres:
+    image: postgres:15-alpine
+    container_name: bifrost-postgres
+    environment:
+      - POSTGRES_USER=bifrost
+      - POSTGRES_PASSWORD=YOUR_PASSWORD
+      - POSTGRES_DB=bifrost
+    volumes:
+      - postgres-data:/var/lib/postgresql/data
+    healthcheck:
+      test: ["CMD-SHELL", "pg_isready -U bifrost"]
+      interval: 10s
+      timeout: 5s
+      retries: 5
+    restart: unless-stopped
+
+volumes:
+  bifrost-data:
+  postgres-data:
+```
+
+Log in to the registry before running:
+
+```bash
+cat bifrost-credentials.json | docker login -u _json_key --password-stdin https://REGION-docker.pkg.dev
+docker compose up -d
+```
+
+## Air-Gapped Environments
+
+For environments without internet access, you can mirror the image to your internal registry.
+
+### Step 1: Pull Image (Internet-Connected Machine)
+
+```bash
+# Login and pull
+cat bifrost-credentials.json | docker login -u _json_key --password-stdin https://REGION-docker.pkg.dev
+docker pull REGION-docker.pkg.dev/BIFROST_PROJECT/YOUR_HUB_SLUG/bifrost:latest
+
+# Save to tar file
+docker save REGION-docker.pkg.dev/BIFROST_PROJECT/YOUR_HUB_SLUG/bifrost:latest > bifrost-image.tar
+```
+
+### Step 2: Transfer and Load (Air-Gapped Machine)
+
+```bash
+# Load image
+docker load < bifrost-image.tar
+
+# Tag for internal registry
+docker tag REGION-docker.pkg.dev/BIFROST_PROJECT/YOUR_HUB_SLUG/bifrost:latest \
+  internal-registry.company.com/bifrost:latest
+
+# Push to internal registry
+docker push internal-registry.company.com/bifrost:latest
+```
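+
+The archive moves between machines in the two steps above, so it may be worth confirming it arrived intact before running `docker load`. A minimal sketch, assuming `sha256sum` from GNU coreutils is available on both sides; the helper names `write_checksum` and `verify_checksum` are hypothetical:
+
+```bash
+# Hypothetical helpers for checking an image archive after transfer.
+# Assumes sha256sum (GNU coreutils) is installed on both machines.
+
+# Run on the internet-connected machine, in the directory holding the tar file.
+write_checksum() {
+  sha256sum "$1" > "$1.sha256"
+}
+
+# Run on the air-gapped machine from the directory holding both files.
+# Exits non-zero if the archive does not match the recorded checksum.
+verify_checksum() {
+  sha256sum --check --quiet "$1.sha256"
+}
+```
+
+For example, run `write_checksum bifrost-image.tar` after `docker save`, transfer both files together, then run `verify_checksum bifrost-image.tar` before `docker load`.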
+
+### Step 3: Update Kubernetes Manifests
+
+Update the image reference in your deployment:
+
+```yaml
+containers:
+- name: bifrost
+  image: internal-registry.company.com/bifrost:latest
+```
+
+## Credential Rotation
+
+When the Bifrost team rotates your credentials:
+
+### Update Docker Login
+
+```bash
+cat new-credentials.json | docker login -u _json_key --password-stdin https://REGION-docker.pkg.dev
+```
+
+### Update Kubernetes Secret
+
+```bash
+# Delete old secret
+kubectl delete secret bifrost-pull-secret -n bifrost
+
+# Create new secret
+kubectl create secret docker-registry bifrost-pull-secret \
+  --docker-server=REGION-docker.pkg.dev \
+  --docker-username=_json_key \
+  --docker-password="$(cat new-credentials.json)" \
+  --namespace=bifrost
+
+# Restart deployment to pick up new secret
+kubectl rollout restart deployment/bifrost -n bifrost
+```
+
+## Verifying Access
+
+### Test Docker Authentication
+
+```bash
+# Verify login
+docker login -u _json_key -p "$(cat bifrost-credentials.json)" https://REGION-docker.pkg.dev
+
+# Test pull
+docker pull REGION-docker.pkg.dev/BIFROST_PROJECT/YOUR_HUB_SLUG/bifrost:latest
+```
+
+### Verify Kubernetes Secret
+
+```bash
+# Check secret exists
+kubectl get secret bifrost-pull-secret -n bifrost
+
+# Verify secret content (base64 encoded)
+kubectl get secret bifrost-pull-secret -n bifrost -o jsonpath='{.data.\.dockerconfigjson}' | base64 -d
+```
+
+## Troubleshooting
+
+### ImagePullBackOff Errors
+
+```bash
+# Check pod events
+kubectl describe pod -l app=bifrost -n bifrost
+
+# Common issues:
+# - "unauthorized": Invalid credentials - check username/password
+# - "not found": Wrong repository path - verify with Bifrost team
+# - "connection refused": Network issue - check firewall rules
+```
+
+### Network Connectivity
+
+```bash
+# Test DNS resolution
+nslookup REGION-docker.pkg.dev
+
+# Test HTTPS connectivity
+curl -v https://REGION-docker.pkg.dev/v2/
+
+# Required outbound access:
+# - REGION-docker.pkg.dev:443
+# - oauth2.googleapis.com:443 (for token refresh)
+```
+
+### Credential Issues
+
+```bash
+# Verify JSON key format
+cat bifrost-credentials.json | jq .
+
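+# Print the key ID (the key file carries no expiry date; share this ID with
+# the Bifrost team to confirm the key is still active)
+cat bifrost-credentials.json | jq '.private_key_id'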
+
+# Contact Bifrost team if credentials are invalid
+```
+
+## Security Best Practices
+
+1. **Store credentials securely**: Use a secrets manager (Vault, AWS Secrets Manager) for credential storage
+2. **Limit access**: Only grant imagePullSecret access to required namespaces
+3. **Rotate regularly**: Request credential rotation from Bifrost team periodically
+4. **Audit access**: Monitor image pull logs for unauthorized access attempts
+5. **Network isolation**: Restrict outbound access to only required registry endpoints
+
+## Next Steps
+
+- Configure [Bifrost settings](/quickstart/gateway/setting-up) for your use case
+- Set up [observability](/features/observability/default) for monitoring
+- Enable [clustering](/enterprise/clustering) for high availability
--- a/docs/deployment-guides/enterprise/overview.mdx
+++ b/docs/deployment-guides/enterprise/overview.mdx
@@ -0,0 +1,141 @@
+---
+title: "Overview"
+description: "Deploy Bifrost Enterprise in your cloud environment with secure, private container image distribution"
+icon: "info-circle"
+---
+
+Bifrost Enterprise provides private container image distribution through dedicated registries, enabling secure deployments in AWS, GCP, Azure, and on-premise environments.
+
+## Architecture
+
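+Bifrost uses a hub-and-spoke model: a single CI/CD pipeline pushes images to two container registries (AWS ECR and GCP Artifact Registry), and each customer environment pulls from the registry that integrates best with its platform: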
+
+```mermaid
+flowchart TB
+    subgraph BifrostInfra[Bifrost Infrastructure]
+        CICD[CI/CD Pipeline]
+        GCR[GCP Artifact Registry]
+        ECR[AWS ECR]
+    end
+    
+    subgraph Customers[Customer Environments]
+        subgraph AWSCustomer[AWS Customers]
+            EKS[EKS Cluster]
+            ECS[ECS Service]
+        end
+        subgraph GCPCustomer[GCP Customers]
+            GKE[GKE Cluster]
+        end
+        subgraph AzureCustomer[Azure Customers]
+            AKS[AKS Cluster]
+        end
+        subgraph OnPrem[On-Premise]
+            K8S[Kubernetes]
+            Docker[Docker]
+        end
+    end
+    
+    CICD -->|Push| GCR
+    CICD -->|Push| ECR
+    
+    ECR -->|IRSA| EKS
+    ECR -->|Task Role| ECS
+    GCR -->|Workload Identity| GKE
+    GCR -->|Azure WIF| AKS
+    GCR -->|Basic Auth| OnPrem
+```
+
+### Registry Distribution
+
+| Customer Cloud | Registry Source | Why |
+|----------------|-----------------|-----|
+| AWS | AWS ECR | Native IAM integration, lowest latency within AWS |
+| GCP | GCP Artifact Registry | Native Workload Identity, lowest latency within GCP |
+| Azure | GCP Artifact Registry | Workload Identity Federation from Azure to GCP |
+| On-Premise | GCP Artifact Registry | Basic auth with username/password credentials |
+
+## Authentication Methods
+
+Choose the authentication method based on your deployment environment:
+
+| Environment | Method | Security Level | Setup Complexity |
+|-------------|--------|----------------|------------------|
+| AWS EKS | [IRSA](/deployment-guides/enterprise/aws#irsa-recommended) | High | Medium |
+| AWS ECS | [IAM Task Roles](/deployment-guides/enterprise/aws#ecs-task-roles) | High | Low |
+| GCP GKE | [Workload Identity](/deployment-guides/enterprise/gcp#workload-identity-recommended) | High | Low |
+| Azure AKS | [Azure WIF](/deployment-guides/enterprise/azure) | High | Medium |
+| On-Premise | [Basic Auth](/deployment-guides/enterprise/on-premise) | Medium | Low |
+
+<Note>
+Cloud-native identity federation (IRSA, Workload Identity, Azure WIF) is recommended over static credentials for production deployments.
+</Note>
+
+## Security Features
+
+### Encryption
+- **In-Transit**: All registry communication uses TLS 1.3
+- **At-Rest**: Images encrypted using cloud-native encryption (AWS KMS, GCP CMEK)
+
+### Access Control
+- **IAM-based**: Fine-grained permissions using cloud IAM policies
+- **Audit Logging**: All image pull operations are logged for compliance
+- **IP Restrictions**: Optional VPC Service Controls (GCP) or VPC endpoints (AWS)
+
+### Image Security
+- **Vulnerability Scanning**: Automatic scanning on push
+- **Immutable Tags**: Optional tag immutability to prevent overwrites
+- **Signed Images**: Container image signatures for verification
+
+## Prerequisites
+
+Before deploying Bifrost Enterprise, ensure you have:
+
+<Tabs>
+<Tab title="AWS">
+- AWS account with ECR access
+- EKS cluster (v1.23+) or ECS cluster
+- IAM permissions to create roles and policies
+- `kubectl` and `aws` CLI configured
+</Tab>
+<Tab title="GCP">
+- GCP project with Artifact Registry API enabled
+- GKE cluster (v1.24+) with Workload Identity enabled
+- IAM permissions for service account management
+- `kubectl` and `gcloud` CLI configured
+</Tab>
+<Tab title="Azure">
+- Azure subscription with AKS
+- AKS cluster (v1.24+) with Workload Identity enabled
+- Permissions to create Managed Identities
+- `kubectl` and `az` CLI configured
+</Tab>
+<Tab title="On-Premise">
+- Kubernetes cluster (v1.23+) or Docker runtime
+- Network access to `us-central1-docker.pkg.dev`
+- Docker credentials provided by Bifrost team
+</Tab>
+</Tabs>
+
+## Getting Started
+
+<CardGroup cols={2}>
+  <Card title="AWS Deployment" icon="aws" href="/deployment-guides/enterprise/aws">
+    Deploy on EKS with IRSA or on ECS with IAM task roles
+  </Card>
+  <Card title="GCP Deployment" icon="google" href="/deployment-guides/enterprise/gcp">
+    Deploy on GKE with Workload Identity
+  </Card>
+  <Card title="Azure Deployment" icon="microsoft" href="/deployment-guides/enterprise/azure">
+    Deploy on AKS with Azure Workload Identity Federation
+  </Card>
+  <Card title="On-Premise" icon="server" href="/deployment-guides/enterprise/on-premise">
+    Deploy anywhere with Docker credentials
+  </Card>
+</CardGroup>
+
+## Support
+
+For enterprise deployment assistance:
+- **Email**: [contact@getmaxim.ai](mailto:contact@getmaxim.ai)
+- **Slack**: Connect via Slack Connect for real-time support
+- **Documentation**: Platform-specific guides linked above
--- a/docs/deployment-guides/fly.mdx
+++ b/docs/deployment-guides/fly.mdx
@@ -0,0 +1,34 @@
+---
+title: fly.io
+description: "This guide explains how to deploy Bifrost on fly.io"
+icon: "fly"
+---
+
+Bifrost is built from multiple sub-modules (`core`, `framework`, etc.) and embeds the front-end into a single binary via `embed.FS`, so deployment uses a custom Docker build step before handing off to `flyctl`.
+
+There are two ways to deploy Bifrost on Fly.io:
+
+1. By cloning the repo
+2. Using flyctl + Docker Hub image
+
+## By cloning the repo
+
+1. Clone https://github.com/maximhq/bifrost
+2. Ensure [Make](/deployment-guides/how-to/install-make) is installed.
+3. Run `make deploy-to-fly-io APP_NAME=<your-fly-app-name>`
+
+
+## Using flyctl + Docker Hub image
+
+Either reference the Bifrost Docker Hub image in your `fly.toml` and run `fly deploy`:
+
+```toml
+[build]
+image = "maximhq/bifrost:latest"
+```
+
+Or pass the image path directly on the command line:
+
+```bash
+fly deploy --app <your-app-name> --image docker.io/maximhq/bifrost:latest
+```
--- a/docs/deployment-guides/helm.mdx
+++ b/docs/deployment-guides/helm.mdx
@@ -0,0 +1,639 @@
+---
+title: "Quick Start"
+description: "Deploy Bifrost on Kubernetes using the official Helm chart — quickstart for OSS and Enterprise"
+icon: "server"
+---
+
+<Note>
+**Latest Chart Version**: [View on Artifact Hub](https://artifacthub.io/packages/helm/bifrost/bifrost)
+</Note>
+
+<Tabs>
+
+<Tab title="OSS">
+
+## Prerequisites
+
+- Kubernetes cluster (v1.19+)
+- `kubectl` configured
+- Helm 3.2.0+ installed
+- Persistent Volume provisioner (required for SQLite; optional for Postgres-only)
+
+<Note>
+If you use PostgreSQL for Bifrost storage, ensure the database is UTF8 encoded. See [PostgreSQL UTF8 Requirement](/quickstart/gateway/setting-up#postgresql-utf8-requirement).
+</Note>
+
+## Step 1 — Add the Helm Repository
+
+```bash
+helm repo add bifrost https://maximhq.github.io/bifrost/helm-charts
+helm repo update
+```
+
+## Step 2 — Install
+
+<Note>
+The Helm chart ships ready-made values files under `helm-charts/bifrost/values-examples/`.
+For example: `sqlite-only.yaml`, `production-ha.yaml`, `external-postgres.yaml`, and `secrets-from-k8s.yaml`.
+See the full list here: https://github.com/maximhq/bifrost/tree/main/helm-charts/bifrost/values-examples
+</Note>
+
+<Tabs>
+<Tab title="Minimal (SQLite)">
+
+Fastest way to get running. Bifrost deploys as a StatefulSet with a 10Gi PVC for SQLite.
+
+```bash
+kubectl create secret generic bifrost-encryption-key \
+  --from-literal=encryption-key="$(openssl rand -base64 32)"
+
+helm install bifrost bifrost/bifrost \
+  --set image.tag=v1.4.11 \
+  --set bifrost.encryptionKeySecret.name="bifrost-encryption-key" \
+  --set bifrost.encryptionKeySecret.key="encryption-key"
+```
+
+</Tab>
+<Tab title="With a Provider Key">
+
+Add your first provider key at install time:
+
+```bash
+kubectl create secret generic bifrost-encryption-key \
+  --from-literal=encryption-key="$(openssl rand -base64 32)"
+
+kubectl create secret generic provider-keys \
+  --from-literal=openai-api-key='sk-your-key'
+
+helm install bifrost bifrost/bifrost \
+  --set image.tag=v1.4.11 \
+  --set bifrost.encryptionKeySecret.name="bifrost-encryption-key" \
+  --set bifrost.encryptionKeySecret.key="encryption-key" \
+  --set 'bifrost.providers.openai.keys[0].name=primary' \
+  --set 'bifrost.providers.openai.keys[0].value=env.OPENAI_API_KEY' \
+  --set 'bifrost.providers.openai.keys[0].weight=1' \
+  --set bifrost.providerSecrets.openai.existingSecret="provider-keys" \
+  --set bifrost.providerSecrets.openai.key="openai-api-key" \
+  --set bifrost.providerSecrets.openai.envVar="OPENAI_API_KEY"
+```
+
+</Tab>
+<Tab title="Production (PostgreSQL + HA)">
+
+High-availability setup — 3 replicas, PostgreSQL, autoscaling, ingress.
+
+```bash
+# 1. Create secrets
+kubectl create secret generic bifrost-encryption-key \
+  --from-literal=encryption-key="$(openssl rand -base64 32)"
+
+kubectl create secret generic postgres-credentials \
+  --from-literal=password="$(openssl rand -base64 32)"
+
+kubectl create secret generic provider-keys \
+  --from-literal=openai-api-key='sk-...'
+```
+
+```yaml
+# production.yaml
+image:
+  tag: "v1.4.11"
+
+replicaCount: 3
+
+storage:
+  mode: postgres
+
+postgresql:
+  enabled: true
+  auth:
+    username: bifrost
+    database: bifrost
+    existingSecret: "postgres-credentials"
+    secretKeys:
+      adminPasswordKey: "password"
+  primary:
+    persistence:
+      size: 50Gi
+    resources:
+      requests:
+        cpu: 500m
+        memory: 1Gi
+      limits:
+        cpu: 2000m
+        memory: 2Gi
+
+autoscaling:
+  enabled: true
+  minReplicas: 3
+  maxReplicas: 10
+  targetCPUUtilizationPercentage: 70
+  targetMemoryUtilizationPercentage: 80
+
+ingress:
+  enabled: true
+  className: nginx
+  annotations:
+    cert-manager.io/cluster-issuer: letsencrypt-prod
+  hosts:
+    - host: bifrost.yourdomain.com
+      paths:
+        - path: /
+          pathType: Prefix
+  tls:
+    - secretName: bifrost-tls
+      hosts:
+        - bifrost.yourdomain.com
+
+resources:
+  requests:
+    cpu: 500m
+    memory: 1Gi
+  limits:
+    cpu: 2000m
+    memory: 2Gi
+
+bifrost:
+  encryptionKeySecret:
+    name: "bifrost-encryption-key"
+    key: "encryption-key"
+
+  client:
+    initialPoolSize: 500
+    dropExcessRequests: true
+    enableLogging: true
+
+  providers:
+    openai:
+      keys:
+        - name: "openai-primary"
+          value: "env.OPENAI_API_KEY"
+          weight: 1
+
+  providerSecrets:
+    openai:
+      existingSecret: "provider-keys"
+      key: "openai-api-key"
+      envVar: "OPENAI_API_KEY"
+
+  plugins:
+    telemetry:
+      enabled: true
+      version: 1
+    logging:
+      enabled: true
+      version: 1
+    governance:
+      enabled: true
+      version: 1
+```
+
+```bash
+# 2. Install
+helm install bifrost bifrost/bifrost -f production.yaml
+```
+
+</Tab>
+</Tabs>
+
+<Note>
+`image.tag` is required — the chart will not start without it. Check [Docker Hub](https://hub.docker.com/r/maximhq/bifrost/tags) for available versions.
+</Note>
+
+## Step 3 — Verify
+
+```bash
+# Check pods are running
+kubectl get pods -l app.kubernetes.io/name=bifrost
+
+# Port forward (this command blocks, so run it in a separate terminal)
+kubectl port-forward svc/bifrost 8080:8080
+
+# Hit the health endpoint
+curl http://localhost:8080/health
+
+# Check Prometheus metrics
+curl http://localhost:8080/metrics
+```
+
+## Step 4 — Configure Providers & Plugins
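+
+With a provider key configured and the port-forward from Step 3 still running, send a test request:
+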
+```bash
+# Make your first inference call
+curl http://localhost:8080/v1/chat/completions \
+  -H "Content-Type: application/json" \
+  -d '{
+    "model": "gpt-4o-mini",
+    "messages": [{"role": "user", "content": "Hello from Bifrost!"}]
+  }'
+```
+
+Continue with [Next Steps](#next-steps).
+
+</Tab>
+
+<Tab title="Enterprise">
+
+Enterprise customers receive dedicated container images in a private registry, along with additional features, SLAs, and compliance documentation.
+
+<Note>
+[Book a demo](https://calendly.com/maximai/bifrost-demo) to learn more about our enterprise features.
+</Note>
+
+## Prerequisites
+
+- Kubernetes cluster (v1.19+)
+- `kubectl` configured
+- Helm 3.2.0+ installed
+- Enterprise registry credentials (provided by Maxim)
+
+## Step 1 — Add the Helm Repository
+
+```bash
+helm repo add bifrost https://maximhq.github.io/bifrost/helm-charts
+helm repo update
+```
+
+## Step 2 — Create Pull Secret
+
+Create a Kubernetes image pull secret for our private enterprise registry:
+
+<Tabs>
+<Tab title="Google Artifact Registry">
+
+```bash
+kubectl create secret docker-registry enterprise-registry-secret \
+  --docker-server=us-west1-docker.pkg.dev \
+  --docker-username=_json_key \
+  --docker-password="$(cat service-account-key.json)" \
+  --docker-email=your-email@example.com
+```
+
+</Tab>
+<Tab title="AWS ECR">
+
+```bash
+kubectl create secret docker-registry enterprise-registry-secret \
+  --docker-server=123456789012.dkr.ecr.us-east-1.amazonaws.com \
+  --docker-username=AWS \
+  --docker-password="$(aws ecr get-login-password --region us-east-1)"
+```
+
+<Note>
+ECR tokens expire after 12 hours. Use the [ECR Credential Helper](https://github.com/awslabs/amazon-ecr-credential-helper) or [ECR Registry Creds operator](https://github.com/upmc-enterprises/registry-creds) for automatic refresh.
+</Note>
+
+</Tab>
+<Tab title="Azure ACR">
+
+```bash
+kubectl create secret docker-registry enterprise-registry-secret \
+  --docker-server=yourregistry.azurecr.io \
+  --docker-username=<service-principal-id> \
+  --docker-password=<service-principal-password>
+```
+
+</Tab>
+<Tab title="Self-Hosted Registry">
+
+```bash
+kubectl create secret docker-registry enterprise-registry-secret \
+  --docker-server=registry.yourcompany.com \
+  --docker-username=<username> \
+  --docker-password=<password>
+```
+
+</Tab>
+</Tabs>
+
+## Step 3 — Create Required Secrets
+
+```bash
+# Encryption key
+kubectl create secret generic bifrost-encryption \
+  --from-literal=key="$(openssl rand -base64 32)"
+
+# Provider API keys
+kubectl create secret generic provider-keys \
+  --from-literal=openai-api-key='sk-...' \
+  --from-literal=anthropic-api-key='sk-ant-...'
+
+# Admin credentials (for dashboard + governance)
+kubectl create secret generic bifrost-admin-credentials \
+  --from-literal=username='admin' \
+  --from-literal=password='secure-admin-password'
+```
+
+## Step 4 — Install
+
+```yaml
+# enterprise.yaml
+image:
+  # Registry URL provided by Maxim
+  repository: us-west1-docker.pkg.dev/bifrost-enterprise/your-org/bifrost
+  tag: "latest"   # pin a specific version in production
+
+imagePullSecrets:
+  - name: enterprise-registry-secret
+
+replicaCount: 3
+
+resources:
+  requests:
+    cpu: 1000m
+    memory: 2Gi
+  limits:
+    cpu: 4000m
+    memory: 8Gi
+
+autoscaling:
+  enabled: true
+  minReplicas: 3
+  maxReplicas: 20
+  targetCPUUtilizationPercentage: 70
+  targetMemoryUtilizationPercentage: 80
+
+storage:
+  mode: postgres
+
+postgresql:
+  enabled: true
+  auth:
+    password: "secure-password"   # use existingSecret in production
+  primary:
+    persistence:
+      size: 100Gi
+    resources:
+      requests:
+        cpu: 1000m
+        memory: 2Gi
+      limits:
+        cpu: 4000m
+        memory: 8Gi
+
+vectorStore:
+  enabled: true
+  type: weaviate
+  weaviate:
+    enabled: true
+    persistence:
+      size: 100Gi
+
+ingress:
+  enabled: true
+  className: nginx
+  annotations:
+    cert-manager.io/cluster-issuer: letsencrypt-prod
+    nginx.ingress.kubernetes.io/proxy-body-size: "100m"
+  hosts:
+    - host: bifrost.yourcompany.com
+      paths:
+        - path: /
+          pathType: Prefix
+  tls:
+    - secretName: bifrost-tls
+      hosts:
+        - bifrost.yourcompany.com
+
+bifrost:
+  encryptionKeySecret:
+    name: "bifrost-encryption"
+    key: "key"
+
+  client:
+    initialPoolSize: 1000
+    dropExcessRequests: true
+    enableLogging: true
+    disableContentLogging: false    # set true for HIPAA/compliance
+    logRetentionDays: 365
+    enforceGovernanceHeader: true
+    allowDirectKeys: false
+    maxRequestBodySizeMb: 100
+    allowedOrigins:
+      - "https://yourcompany.com"
+      - "https://*.yourcompany.com"
+
+  providers:
+    openai:
+      keys:
+        - name: "openai-primary"
+          value: "env.OPENAI_API_KEY"
+          weight: 1
+    anthropic:
+      keys:
+        - name: "anthropic-primary"
+          value: "env.ANTHROPIC_API_KEY"
+          weight: 1
+
+  providerSecrets:
+    openai:
+      existingSecret: "provider-keys"
+      key: "openai-api-key"
+      envVar: "OPENAI_API_KEY"
+    anthropic:
+      existingSecret: "provider-keys"
+      key: "anthropic-api-key"
+      envVar: "ANTHROPIC_API_KEY"
+
+  governance:
+    authConfig:
+      isEnabled: true
+      disableAuthOnInference: false
+      existingSecret: "bifrost-admin-credentials"
+      usernameKey: "username"
+      passwordKey: "password"
+
+  plugins:
+    telemetry:
+      enabled: true
+      version: 1
+    logging:
+      enabled: true
+      version: 1
+    governance:
+      enabled: true
+      version: 1
+      config:
+        is_vk_mandatory: true
+    semanticCache:
+      enabled: true
+      version: 1
+      config:
+        provider: "openai"
+        embedding_model: "text-embedding-3-small"
+        dimension: 1536
+        threshold: 0.85
+        ttl: "1h"
+
+affinity:
+  podAntiAffinity:
+    requiredDuringSchedulingIgnoredDuringExecution:
+      - labelSelector:
+          matchLabels:
+            app.kubernetes.io/name: bifrost
+        topologyKey: kubernetes.io/hostname
+```
+
+```bash
+helm install bifrost bifrost/bifrost -f enterprise.yaml
+```
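+
+The inline `postgresql.auth.password` in `enterprise.yaml` is there for brevity. A minimal sketch of the secret-backed variant, reusing the keys from the OSS production example and assuming a `postgres-credentials` Secret with a `password` key already exists in the release namespace:
+
+```yaml
+# enterprise.yaml (fragment): replaces the inline postgresql.auth.password
+# Assumes the Secret was created beforehand, for example:
+#   kubectl create secret generic postgres-credentials --from-literal=password=...
+postgresql:
+  enabled: true
+  auth:
+    username: bifrost
+    database: bifrost
+    existingSecret: "postgres-credentials"
+    secretKeys:
+      adminPasswordKey: "password"
+```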
+
+Continue with [Next Steps](#next-steps).
+
+<Note>
+For DB-backed deployments, built-in plugins support a top-level `version` field (for example: `telemetry`, `logging`, `governance`, `semanticCache`, `otel`, `maxim`, `datadog`). Increase this number when you want config from Helm to overwrite an older plugin record in the DB.
+</Note>
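+
+As an illustration of the note above, bumping a plugin's `version` might look like the following sketch. It assumes a release that previously deployed `semanticCache` with `version: 1`; the changed `threshold` is only an example value.
+
+```yaml
+# Values fragment: re-apply semantic cache settings over the stored DB record
+bifrost:
+  plugins:
+    semanticCache:
+      enabled: true
+      version: 2          # was 1; the higher number lets Helm config overwrite the DB record
+      config:
+        provider: "openai"
+        embedding_model: "text-embedding-3-small"
+        dimension: 1536
+        threshold: 0.9    # example change
+        ttl: "1h"
+```
+
+Roll it out as part of your usual `helm upgrade`.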
+
+## Enterprise Support
+
+Enterprise customers have access to:
+- Dedicated Slack channel for support
+- Priority bug fixes and feature requests
+- Custom feature development
+- SLA guarantees
+- Compliance documentation (SOC2, HIPAA, etc.)
+
+Contact [support@getmaxim.ai](mailto:support@getmaxim.ai) for support.
+
+</Tab>
+
+</Tabs>
+
+---
+
+## Operations
+
+### Upgrade
+
+```bash
+helm repo update
+
+# Upgrade reusing all existing values
+helm upgrade bifrost bifrost/bifrost --reuse-values
+
+# Upgrade with new values
+helm upgrade bifrost bifrost/bifrost -f your-values.yaml
+
+# Upgrade and override a single field
+helm upgrade bifrost bifrost/bifrost \
+  --reuse-values \
+  --set image.tag=v1.4.11
+```
+
+### Rollback
+
+```bash
+helm history bifrost
+helm rollback bifrost          # to previous revision
+helm rollback bifrost 2        # to specific revision
+```
+
+### Scale
+
+```bash
+# Applies to PostgreSQL-backed installs; SQLite is single-node only
+kubectl scale deployment bifrost --replicas=5
+
+# Or via Helm
+helm upgrade bifrost bifrost/bifrost \
+  --reuse-values \
+  --set replicaCount=5
+```
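+
+When `autoscaling.enabled` is `true`, the HorizontalPodAutoscaler owns the replica count and will typically override a manual `kubectl scale`. In that case, adjusting the autoscaling bounds is the more reliable route. A sketch with illustrative numbers and a hypothetical file name:
+
+```yaml
+# scale-values.yaml (hypothetical file name)
+autoscaling:
+  enabled: true
+  minReplicas: 5     # floor the HPA maintains
+  maxReplicas: 15    # ceiling under load
+  targetCPUUtilizationPercentage: 70
+```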
+
+### Uninstall
+
+```bash
+helm uninstall bifrost
+
+# Also remove PVCs (permanently deletes all data)
+kubectl delete pvc -l app.kubernetes.io/instance=bifrost
+```
+
+---
+
+## Monitoring
+
+### Prometheus Metrics
+
+Bifrost exposes Prometheus metrics at `/metrics`.
+
+Enable the ServiceMonitor for automatic scraping (requires the Prometheus Operator CRDs in the cluster):
+
+```yaml
+serviceMonitor:
+  enabled: true
+  interval: 30s
+  scrapeTimeout: 10s
+```
+
+### Health Checks
+
+Check pod health:
+
+```bash
+# View pod status
+kubectl get pods -l app.kubernetes.io/name=bifrost
+
+# Check logs
+kubectl logs -l app.kubernetes.io/name=bifrost --tail=100
+
+# Describe pod
+kubectl describe pod -l app.kubernetes.io/name=bifrost
+```
+
+### Metrics Endpoints
+
+```bash
+# Port forward (this command blocks, so run it in a separate terminal)
+kubectl port-forward svc/bifrost 8080:8080
+
+# Check metrics
+curl http://localhost:8080/metrics
+
+# Check health
+curl http://localhost:8080/health
+```
+
+---
+
+## Configuration Guides
+
+<CardGroup cols={3}>
+  <Card title="Values Reference" icon="sliders" href="/deployment-guides/helm/values">
+    All parameters, secret references, advanced config, example patterns
+  </Card>
+  <Card title="Client Configuration" icon="gear" href="/deployment-guides/helm/client">
+    Pool size, logging, CORS, header filtering, compat shims, MCP settings
+  </Card>
+  <Card title="Provider Setup" icon="plug" href="/deployment-guides/helm/providers">
+    OpenAI, Anthropic, Azure, Bedrock, Vertex, Groq, self-hosted
+  </Card>
+  <Card title="Storage" icon="database" href="/deployment-guides/helm/storage">
+    SQLite, PostgreSQL, object storage for logs, vector stores
+  </Card>
+  <Card title="Plugins" icon="puzzle-piece" href="/deployment-guides/helm/plugins">
+    Telemetry, logging, semantic cache, OTel, Datadog, governance
+  </Card>
+  <Card title="Governance" icon="shield" href="/deployment-guides/helm/governance">
+    Budgets, rate limits, virtual keys, routing rules
+  </Card>
+  <Card title="Cluster Mode" icon="network-wired" href="/deployment-guides/helm/cluster">
+    Multi-replica HA, gossip, peer discovery
+  </Card>
+  <Card title="Troubleshooting" icon="wrench" href="/deployment-guides/helm/troubleshooting">
+    Pod startup, database, ingress, PVC, secrets, performance
+  </Card>
+</CardGroup>
+
+---
+
+## Resources
+
+- [Helm Chart Repository](https://github.com/maximhq/bifrost/tree/main/helm-charts)
+- [Artifact Hub](https://artifacthub.io/packages/helm/bifrost/bifrost)
+- [Example Configurations](https://github.com/maximhq/bifrost/tree/main/helm-charts/bifrost/values-examples)
+- [GitHub Issues](https://github.com/maximhq/bifrost/issues)
+
+## Next Steps
+
+1. Configure [provider keys](/providers/supported-providers/overview)
+2. Enable [plugins](/plugins/getting-started)
+3. Set up [observability](/features/observability/default)
+4. Configure [governance](/features/governance/virtual-keys)
--- a/docs/deployment-guides/helm/client.mdx
+++ b/docs/deployment-guides/helm/client.mdx
@@ -0,0 +1,316 @@
+---
+title: "Client Configuration"
+description: "Configure the Bifrost client: connection pool, logging, CORS, header filtering, compat shims, and MCP settings"
+icon: "gear"
+---
+
+The `bifrost.client` block controls how Bifrost manages its internal worker pool, request logging, authentication enforcement, header policies, SDK compatibility shims, and MCP agent behaviour. All settings map directly to the `client` section of the rendered `config.json`.
+
+---
+
+## Connection Pool
+
+| Parameter | Description | Default |
+|-----------|-------------|---------|
+| `bifrost.client.initialPoolSize` | Pre-allocated worker goroutines per provider queue | `300` |
+| `bifrost.client.dropExcessRequests` | Drop requests when queue is full instead of waiting | `false` |
+
+A larger pool reduces latency spikes under burst load at the cost of higher baseline memory. For production workloads with multiple providers, `1000` is a common starting point.
+
+```yaml
+# client-pool.yaml
+image:
+  tag: "v1.4.11"
+
+bifrost:
+  client:
+    initialPoolSize: 1000
+    dropExcessRequests: true   # Return 429 instead of queuing indefinitely
+```
+
+```bash
+helm install bifrost bifrost/bifrost -f client-pool.yaml
+
+# Or set inline
+helm upgrade bifrost bifrost/bifrost \
+  --reuse-values \
+  --set bifrost.client.initialPoolSize=1000 \
+  --set bifrost.client.dropExcessRequests=true
+```
+
+---
+
+## Request & Response Logging
+
+| Parameter | Description | Default |
+|-----------|-------------|---------|
+| `bifrost.client.enableLogging` | Log all LLM requests and responses | `true` |
+| `bifrost.client.disableContentLogging` | Strip message content from logs (keeps metadata) | `false` |
+| `bifrost.client.logRetentionDays` | Days to retain log entries in the store | `365` |
+| `bifrost.client.loggingHeaders` | HTTP request headers to capture in log metadata | `[]` |
+
+Set `disableContentLogging: true` for HIPAA / PCI compliance workloads where message content must not be persisted.
+
+```yaml
+bifrost:
+  client:
+    enableLogging: true
+    disableContentLogging: true    # PII / compliance: store metadata only
+    logRetentionDays: 90
+    loggingHeaders:
+      - "x-request-id"
+      - "x-user-id"
+```
+
+```bash
+helm upgrade bifrost bifrost/bifrost \
+  --reuse-values \
+  --set bifrost.client.disableContentLogging=true \
+  --set bifrost.client.logRetentionDays=90
+```
+
+---
+
+## Security & CORS
+
+| Parameter | Description | Default |
+|-----------|-------------|---------|
+| `bifrost.client.allowedOrigins` | CORS allowed origins | `["*"]` |
+| `bifrost.client.allowDirectKeys` | Allow callers to pass provider keys directly in requests | `false` |
+| `bifrost.client.enforceGovernanceHeader` | Require `x-bf-vk` virtual-key header on every request | `false` |
+| `bifrost.client.maxRequestBodySizeMb` | Maximum allowed request body size | `100` |
+| `bifrost.client.whitelistedRoutes` | Routes that bypass auth middleware | `[]` |
+
+```yaml
+bifrost:
+  client:
+    allowedOrigins:
+      - "https://app.yourdomain.com"
+      - "https://admin.yourdomain.com"
+    allowDirectKeys: false         # Prevent callers from supplying raw provider keys
+    enforceGovernanceHeader: true  # Every request must carry a virtual key
+    maxRequestBodySizeMb: 50
+    whitelistedRoutes:
+      - "/health"
+      - "/metrics"
+```
+
+```bash
+helm install bifrost bifrost/bifrost \
+  --set image.tag=v1.4.11 \
+  --set bifrost.client.enforceGovernanceHeader=true \
+  --set bifrost.client.allowDirectKeys=false
+```
+
+---
+
+## Header Filtering
+
+Controls which `x-bf-eh-*` headers are forwarded to upstream LLM providers.
+
+| Parameter | Description | Default |
+|-----------|-------------|---------|
+| `bifrost.client.headerFilterConfig.allowlist` | Only these headers are forwarded (whitelist mode) | `[]` |
+| `bifrost.client.headerFilterConfig.denylist` | These headers are always blocked | `[]` |
+| `bifrost.client.requiredHeaders` | Headers that must be present on every request | `[]` |
+| `bifrost.client.allowedHeaders` | Additional headers permitted for CORS and WebSocket | `[]` |
+
+When both lists are empty, all `x-bf-eh-*` headers pass through. Specifying an `allowlist` enables strict whitelist mode — only listed headers are forwarded.
+
+```yaml
+bifrost:
+  client:
+    headerFilterConfig:
+      allowlist:
+        - "x-bf-eh-anthropic-version"
+        - "x-bf-eh-openai-beta"
+      denylist: []
+    requiredHeaders:
+      - "x-request-id"
+```
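+
+Going by the table above, a denylist-only configuration should forward every `x-bf-eh-*` header except the listed ones. A sketch, where `x-bf-eh-x-internal-trace` is a made-up header name used only for illustration:
+
+```yaml
+bifrost:
+  client:
+    headerFilterConfig:
+      allowlist: []                     # empty, so strict whitelist mode stays off
+      denylist:
+        - "x-bf-eh-x-internal-trace"    # hypothetical header to block
+```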
+
+---
+
+## Authentication
+
+| Parameter | Description | Default |
+|-----------|-------------|---------|
+| `bifrost.authConfig.isEnabled` | Enable username/password auth for the API and dashboard | `false` |
+| `bifrost.authConfig.adminUsername` | Admin username (plain text, prefer secret) | `""` |
+| `bifrost.authConfig.adminPassword` | Admin password (plain text, prefer secret) | `""` |
+| `bifrost.authConfig.existingSecret` | Kubernetes Secret name for credentials | `""` |
+| `bifrost.authConfig.usernameKey` | Key within the secret for username | `"username"` |
+| `bifrost.authConfig.passwordKey` | Key within the secret for password | `"password"` |
+| `bifrost.authConfig.disableAuthOnInference` | Skip auth check on `/v1/*` inference routes | `false` |
+
+```bash
+# Create secret first
+kubectl create secret generic bifrost-admin \
+  --from-literal=username='admin' \
+  --from-literal=password='your-secure-password'
+```
+
+```yaml
+# auth-values.yaml
+bifrost:
+  authConfig:
+    isEnabled: true
+    disableAuthOnInference: false
+    existingSecret: "bifrost-admin"
+    usernameKey: "username"
+    passwordKey: "password"
+```
+
+```bash
+helm upgrade bifrost bifrost/bifrost \
+  --reuse-values \
+  -f auth-values.yaml
+```
+
+---
+
+## Encryption
+
+| Parameter | Description | Default |
+|-----------|-------------|---------|
+| `bifrost.encryptionKey` | Optional encryption key (plain text — use `encryptionKeySecret` in production). If omitted, data is stored in plaintext. | `""` |
+| `bifrost.encryptionKeySecret.name` | Kubernetes Secret name containing the key | `""` |
+| `bifrost.encryptionKeySecret.key` | Key within the secret | `"encryption-key"` |
+
+Always use a Kubernetes Secret in production:
+
+```bash
+kubectl create secret generic bifrost-encryption \
+  --from-literal=encryption-key='your-32-byte-encryption-key-here'
+```
+
+```yaml
+# encryption-values.yaml
+bifrost:
+  encryptionKeySecret:
+    name: "bifrost-encryption"
+    key: "encryption-key"
+```
+
+```bash
+helm install bifrost bifrost/bifrost \
+  --set image.tag=v1.4.11 \
+  -f encryption-values.yaml
+```
+
+---
+
+## Async Jobs & Database Pings
+
+| Parameter | Description | Default |
+|-----------|-------------|---------|
+| `bifrost.client.disableDbPingsInHealth` | Exclude DB connectivity from `/health` checks | `false` |
+| `bifrost.client.asyncJobResultTTL` | TTL (seconds) for async job results | `3600` |
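+
+For example, to keep `/health` independent of database connectivity and retain async job results for two hours (both values are illustrative):
+
+```yaml
+bifrost:
+  client:
+    disableDbPingsInHealth: true   # /health skips the DB connectivity check
+    asyncJobResultTTL: 7200        # seconds; 2 hours
+```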
+
+---
+
+## Compat Shims
+
+Compatibility flags that let Bifrost silently adapt request/response shapes for SDK integrations:
+
+| Parameter | Description | Default |
+|-----------|-------------|---------|
+| `bifrost.client.compat.convertTextToChat` | Wrap legacy text completions as chat messages | `false` |
+| `bifrost.client.compat.convertChatToResponses` | Translate chat completions to Responses API format | `false` |
+| `bifrost.client.compat.shouldDropParams` | Silently drop unsupported parameters instead of erroring | `false` |
+| `bifrost.client.compat.shouldConvertParams` | Auto-convert parameter names across provider schemas | `false` |
+
+```yaml
+bifrost:
+  client:
+    compat:
+      shouldDropParams: true     # Useful when proxying mixed SDK traffic
+      convertTextToChat: true    # For clients using the legacy /v1/completions endpoint
+```
+
+---
+
+## Prometheus Labels
+
+Add custom labels to every Prometheus metric emitted by Bifrost:
+
+```yaml
+bifrost:
+  client:
+    prometheusLabels:
+      - name: "environment"
+        value: "production"
+      - name: "region"
+        value: "us-east-1"
+```
+
+---
+
+## MCP Agent Settings
+
+| Parameter | Description | Default |
+|-----------|-------------|---------|
+| `bifrost.client.mcpAgentDepth` | Maximum tool-call recursion depth for MCP agent mode | `10` |
+| `bifrost.client.mcpToolExecutionTimeout` | Timeout per tool execution in seconds | `30` |
+| `bifrost.client.mcpCodeModeBindingLevel` | Code mode binding level (`server` or `tool`) | `""` |
+| `bifrost.client.mcpToolSyncInterval` | Global tool sync interval in minutes (`0` = disabled) | `0` |
+
+```yaml
+bifrost:
+  client:
+    mcpAgentDepth: 15
+    mcpToolExecutionTimeout: 60
+```
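+
+The remaining two parameters follow the same pattern. A sketch with illustrative values, using the units stated in the table:
+
+```yaml
+bifrost:
+  client:
+    mcpCodeModeBindingLevel: "server"   # or "tool"
+    mcpToolSyncInterval: 10             # minutes; 0 disables sync
+```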
+
+---
+
+## Full Example
+
+```yaml
+# client-full.yaml
+image:
+  tag: "v1.4.11"
+
+bifrost:
+  encryptionKeySecret:
+    name: "bifrost-encryption"
+    key: "encryption-key"
+
+  authConfig:
+    isEnabled: true
+    disableAuthOnInference: false
+    existingSecret: "bifrost-admin"
+    usernameKey: "username"
+    passwordKey: "password"
+
+  client:
+    initialPoolSize: 1000
+    dropExcessRequests: true
+    allowedOrigins:
+      - "https://app.yourdomain.com"
+    enableLogging: true
+    disableContentLogging: false
+    logRetentionDays: 90
+    enforceGovernanceHeader: true
+    allowDirectKeys: false
+    maxRequestBodySizeMb: 100
+    headerFilterConfig:
+      allowlist: []
+      denylist: []
+    prometheusLabels:
+      - name: "environment"
+        value: "production"
+    mcpAgentDepth: 10
+    mcpToolExecutionTimeout: 30
+```
+
+```bash
+# Create prerequisites
+kubectl create secret generic bifrost-encryption \
+  --from-literal=encryption-key='your-32-byte-encryption-key-here'
+
+kubectl create secret generic bifrost-admin \
+  --from-literal=username='admin' \
+  --from-literal=password='your-secure-password'
+
+# Install
+helm install bifrost bifrost/bifrost -f client-full.yaml
+```
--- a/docs/deployment-guides/helm/cluster.mdx
+++ b/docs/deployment-guides/helm/cluster.mdx
@@ -0,0 +1,523 @@
+---
+title: "Cluster Mode & HA"
+description: "Run Bifrost in a multi-replica cluster with gossip-based peer discovery, distributed state sync, and high-availability configuration"
+icon: "network-wired"
+---
+
+Cluster mode enables multiple Bifrost replicas to share state — rate limits, budget counters, and governance data — across pods. When `bifrost.cluster.enabled` is `false` (the default), each replica operates independently and state is only shared via the database.
+
+<Note>
+Cluster mode requires **PostgreSQL** as the storage backend. SQLite is single-node only.
+</Note>
+
+<Warning>
+`bifrost.cluster.*` is an enterprise capability. OSS images accept these values but do not run cluster mode at runtime.
+</Warning>
+
+## When to Use Cluster Mode
+
+| Scenario | Recommendation |
+|----------|---------------|
+| Single replica | Not needed |
+| Multiple replicas, shared DB only | Optional — DB provides eventual consistency |
+| Multiple replicas with strict per-minute rate limiting | **Enable cluster mode** — in-memory counters are synced via gossip |
+| Geographic multi-region | Enable cluster mode with DNS or Consul discovery |
+
+---
+
+## Basic Cluster Setup
+
+```yaml
+# cluster-values.yaml
+image:
+  tag: "v1.4.11"
+
+replicaCount: 3
+
+storage:
+  mode: postgres
+
+postgresql:
+  external:
+    enabled: true
+    host: "your-postgres-host.example.com"
+    port: 5432
+    user: bifrost
+    database: bifrost
+    sslMode: require
+    existingSecret: "postgres-credentials"
+    passwordKey: "password"
+
+bifrost:
+  encryptionKeySecret:
+    name: "bifrost-encryption"
+    key: "encryption-key"
+
+  cluster:
+    enabled: true
+    gossip:
+      port: 7946
+      config:
+        timeoutSeconds: 10
+        successThreshold: 3
+        failureThreshold: 3
+
+# Spread replicas across nodes for true HA
+affinity:
+  podAntiAffinity:
+    requiredDuringSchedulingIgnoredDuringExecution:
+      - labelSelector:
+          matchLabels:
+            app.kubernetes.io/name: bifrost
+        topologyKey: kubernetes.io/hostname
+
+# Conservative scale-down: avoid killing pods mid-stream
+autoscaling:
+  enabled: true
+  minReplicas: 3
+  maxReplicas: 10
+  targetCPUUtilizationPercentage: 70
+  behavior:
+    scaleDown:
+      stabilizationWindowSeconds: 300
+      policies:
+        - type: Pods
+          value: 1
+          periodSeconds: 120
+
+# Give in-flight SSE streams time to drain
+terminationGracePeriodSeconds: 90
+lifecycle:
+  preStop:
+    exec:
+      command: ["sh", "-c", "sleep 20"]
+```
+
+```bash
+kubectl create secret generic postgres-credentials \
+  --from-literal=password='your-postgres-password'
+
+kubectl create secret generic bifrost-encryption \
+  --from-literal=encryption-key='your-32-byte-encryption-key'
+
+helm install bifrost bifrost/bifrost -f cluster-values.yaml
+```
+
+---
+
+## Peer Discovery
+
+Bifrost uses a gossip protocol (memberlist) for peer-to-peer state sync. Configure how peers find each other:
+
+<Note>
+For `consul`, `etcd`, and `udp` discovery, set `bifrost.cluster.discovery.serviceName` so nodes register/discover under a stable service identity.
+</Note>
+
+<Tabs>
+
+<Tab title="Kubernetes (Recommended)">
+
+Bifrost queries the Kubernetes API to find other Bifrost pods by label selector. No static peer list needed — works with HPA.
+
+```yaml
+# cluster-k8s-discovery-values.yaml
+bifrost:
+  cluster:
+    enabled: true
+    discovery:
+      enabled: true
+      type: kubernetes
+      k8sNamespace: "default"           # namespace where Bifrost runs
+      k8sLabelSelector: "app.kubernetes.io/name=bifrost"
+    gossip:
+      port: 7946
+```
+
+The service account needs permission to list pods:
+
+```yaml
+serviceAccount:
+  create: true
+  annotations: {}
+```
+
+```bash
+# Create a namespaced Role and RoleBinding for pod discovery (apply once per namespace)
+kubectl apply -f - <<'EOF'
+apiVersion: rbac.authorization.k8s.io/v1
+kind: Role
+metadata:
+  name: bifrost-pod-discovery
+  namespace: default
+rules:
+  - apiGroups: [""]
+    resources: ["pods"]
+    verbs: ["list", "get", "watch"]
+---
+apiVersion: rbac.authorization.k8s.io/v1
+kind: RoleBinding
+metadata:
+  name: bifrost-pod-discovery
+  namespace: default
+subjects:
+  - kind: ServiceAccount
+    name: bifrost
+    namespace: default
+roleRef:
+  kind: Role
+  name: bifrost-pod-discovery
+  apiGroup: rbac.authorization.k8s.io
+EOF
+```
+
+```bash
+helm install bifrost bifrost/bifrost -f cluster-k8s-discovery-values.yaml
+```
+
+</Tab>
+
+<Tab title="DNS">
+
+Uses a headless service DNS name to resolve peer IPs. Works well with StatefulSets (predictable pod DNS names).
+
+```yaml
+bifrost:
+  cluster:
+    enabled: true
+    discovery:
+      enabled: true
+      type: dns
+      dnsNames:
+        - "bifrost-headless.default.svc.cluster.local"
+    gossip:
+      port: 7946
+```
+
+The chart automatically creates a headless service (`bifrost-headless`) when cluster mode is enabled with a StatefulSet. For Deployments, create it manually:
+
+```bash
+kubectl apply -f - <<'EOF'
+apiVersion: v1
+kind: Service
+metadata:
+  name: bifrost-headless
+spec:
+  clusterIP: None
+  selector:
+    app.kubernetes.io/name: bifrost
+  ports:
+    - name: gossip
+      port: 7946
+      protocol: TCP
+EOF
+```
+
+```bash
+helm install bifrost bifrost/bifrost -f cluster-dns-discovery-values.yaml
+```
+
+</Tab>
+
+<Tab title="Static Peers">
+
+Enumerate peer addresses explicitly. Use when discovery mechanisms are unavailable or you want deterministic membership.
+
+```yaml
+bifrost:
+  cluster:
+    enabled: true
+    peers:
+      - "bifrost-0.bifrost-headless.default.svc.cluster.local:7946"
+      - "bifrost-1.bifrost-headless.default.svc.cluster.local:7946"
+      - "bifrost-2.bifrost-headless.default.svc.cluster.local:7946"
+    gossip:
+      port: 7946
+```
+
+<Note>
+Static peers rely on stable pod DNS names, which a StatefulSet provides. This approach doesn't adapt to HPA-driven scaling; use Kubernetes or DNS discovery for dynamic replica counts.
+</Note>
+
+</Tab>
+
+<Tab title="Consul">
+
+```yaml
+bifrost:
+  cluster:
+    enabled: true
+    discovery:
+      enabled: true
+      type: consul
+      serviceName: "bifrost-cluster"
+      consulAddress: "consul.consul.svc.cluster.local:8500"
+    gossip:
+      port: 7946
+```
+
+```bash
+helm install bifrost bifrost/bifrost -f cluster-consul-discovery-values.yaml
+```
+
+</Tab>
+
+<Tab title="etcd">
+
+```yaml
+bifrost:
+  cluster:
+    enabled: true
+    discovery:
+      enabled: true
+      type: etcd
+      serviceName: "bifrost-cluster"
+      etcdEndpoints:
+        - "http://etcd-0.etcd.default.svc.cluster.local:2379"
+        - "http://etcd-1.etcd.default.svc.cluster.local:2379"
+        - "http://etcd-2.etcd.default.svc.cluster.local:2379"
+    gossip:
+      port: 7946
+```
+
+</Tab>
+
+<Tab title="mDNS">
+
+Best for local development or bare-metal clusters where multicast is available.
+
+```yaml
+bifrost:
+  cluster:
+    enabled: true
+    discovery:
+      enabled: true
+      type: mdns
+      mdnsService: "_bifrost._tcp"
+    gossip:
+      port: 7946
+```
+
+</Tab>
+
+</Tabs>
+
+---
+
+## Allowed Address Space
+
+Restrict gossip to a specific subnet (useful in multi-tenant clusters):
+
+```yaml
+bifrost:
+  cluster:
+    discovery:
+      enabled: true
+      type: kubernetes
+      k8sNamespace: "default"
+      k8sLabelSelector: "app.kubernetes.io/name=bifrost"
+      allowedAddressSpace:
+        - "10.0.0.0/8"
+        - "172.16.0.0/12"
+```
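+
+As a narrower variant, the sketch below restricts gossip to a single pod network. The `10.244.0.0/16` range is an assumption for illustration (it is a common default pod CIDR in some CNI setups); substitute the range your cluster actually assigns to pods.
+
+```yaml
+bifrost:
+  cluster:
+    discovery:
+      enabled: true
+      type: kubernetes
+      k8sNamespace: "default"
+      k8sLabelSelector: "app.kubernetes.io/name=bifrost"
+      allowedAddressSpace:
+        - "10.244.0.0/16"   # assumed pod CIDR; covers 10.244.0.0 through 10.244.255.255
+```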
+
+---
+
+## Region-Aware Routing
+
+Tag replicas with a region identifier for latency-aware routing:
+
+```yaml
+bifrost:
+  cluster:
+    enabled: true
+    region: "us-east-1"
+```
+
+---
+
+## Full HA Production Example
+
+```yaml
+# ha-production-values.yaml
+image:
+  tag: "v1.4.11"
+
+replicaCount: 3
+
+resources:
+  requests:
+    cpu: 1000m
+    memory: 1Gi
+  limits:
+    cpu: 4000m
+    memory: 4Gi
+
+autoscaling:
+  enabled: true
+  minReplicas: 3
+  maxReplicas: 15
+  targetCPUUtilizationPercentage: 70
+  targetMemoryUtilizationPercentage: 75
+  behavior:
+    scaleDown:
+      stabilizationWindowSeconds: 300
+      policies:
+        - type: Pods
+          value: 1
+          periodSeconds: 120
+    scaleUp:
+      stabilizationWindowSeconds: 30
+
+terminationGracePeriodSeconds: 90
+lifecycle:
+  preStop:
+    exec:
+      command: ["sh", "-c", "sleep 20"]
+
+ingress:
+  enabled: true
+  className: nginx
+  annotations:
+    cert-manager.io/cluster-issuer: letsencrypt-prod
+    nginx.ingress.kubernetes.io/proxy-body-size: "100m"
+    nginx.ingress.kubernetes.io/proxy-read-timeout: "300"
+  hosts:
+    - host: bifrost.yourdomain.com
+      paths:
+        - path: /
+          pathType: Prefix
+  tls:
+    - secretName: bifrost-tls
+      hosts:
+        - bifrost.yourdomain.com
+
+storage:
+  mode: postgres
+
+postgresql:
+  external:
+    enabled: true
+    host: "rds.us-east-1.amazonaws.com"
+    port: 5432
+    user: bifrost
+    database: bifrost
+    sslMode: require
+    existingSecret: "postgres-credentials"
+    passwordKey: "password"
+
+bifrost:
+  encryptionKeySecret:
+    name: "bifrost-encryption"
+    key: "encryption-key"
+
+  client:
+    initialPoolSize: 1000
+    dropExcessRequests: true
+    enableLogging: true
+    enforceGovernanceHeader: true
+
+  cluster:
+    enabled: true
+    region: "us-east-1"
+    discovery:
+      enabled: true
+      type: kubernetes
+      k8sNamespace: "default"
+      k8sLabelSelector: "app.kubernetes.io/name=bifrost"
+    gossip:
+      port: 7946
+      config:
+        timeoutSeconds: 10
+        successThreshold: 3
+        failureThreshold: 3
+
+  plugins:
+    telemetry:
+      enabled: true
+      config:
+        push_gateway:
+          enabled: true
+          push_gateway_url: "http://prometheus-pushgateway.monitoring.svc.cluster.local:9091"
+          push_interval: 15
+    logging:
+      enabled: true
+    governance:
+      enabled: true
+      config:
+        is_vk_mandatory: true
+
+affinity:
+  podAntiAffinity:
+    requiredDuringSchedulingIgnoredDuringExecution:
+      - labelSelector:
+          matchLabels:
+            app.kubernetes.io/name: bifrost
+        topologyKey: kubernetes.io/hostname
+
+serviceAccount:
+  create: true
+  annotations: {}
+```
+
+```bash
+# Prerequisites
+kubectl create secret generic postgres-credentials \
+  --from-literal=password='your-secure-postgres-password'
+
+kubectl create secret generic bifrost-encryption \
+  --from-literal=encryption-key='your-32-byte-encryption-key'
+
+# RBAC for Kubernetes pod discovery
+kubectl apply -f - <<'EOF'
+apiVersion: rbac.authorization.k8s.io/v1
+kind: Role
+metadata:
+  name: bifrost-pod-discovery
+  namespace: default
+rules:
+  - apiGroups: [""]
+    resources: ["pods"]
+    verbs: ["list", "get", "watch"]
+---
+apiVersion: rbac.authorization.k8s.io/v1
+kind: RoleBinding
+metadata:
+  name: bifrost-pod-discovery
+  namespace: default
+subjects:
+  - kind: ServiceAccount
+    name: bifrost
+    namespace: default
+roleRef:
+  kind: Role
+  name: bifrost-pod-discovery
+  apiGroup: rbac.authorization.k8s.io
+EOF
+
+# Install
+helm install bifrost bifrost/bifrost -f ha-production-values.yaml
+
+# Verify all peers have found each other (check logs)
+kubectl logs -l app.kubernetes.io/name=bifrost --tail=50 | grep -i gossip
+```
+
+---
+
+## Verifying Cluster Health
+
+```bash
+# Check all pods are running
+kubectl get pods -l app.kubernetes.io/name=bifrost
+
+# Check gossip port is reachable between pods (pod names assume a StatefulSet; requires nc in the image)
+kubectl exec -it bifrost-0 -- nc -zv bifrost-1.bifrost-headless 7946
+
+# Check health endpoint
+kubectl port-forward svc/bifrost 8080:8080 &
+curl http://localhost:8080/health
+
+# View HPA status
+kubectl get hpa bifrost
+
+# Scale manually during maintenance (an active HPA will override this; use statefulset/bifrost if deployed as a StatefulSet)
+kubectl scale deployment bifrost --replicas=5
+```
--- a/docs/deployment-guides/helm/governance.mdx
+++ b/docs/deployment-guides/helm/governance.mdx
@@ -0,0 +1,446 @@
+---
+title: "Governance"
+description: "Configure Bifrost governance in Helm — budgets, rate limits, virtual keys, routing rules, and admin authentication"
+icon: "shield"
+---
+
+Governance lets you control who can call which providers, how much they can spend, how fast they can go, and how traffic is routed. Everything is declared under `bifrost.governance` in your values file and seeded into the database at startup.
+
+<Note>
+The governance **plugin** must also be enabled for enforcement to take effect:
+
+```yaml
+bifrost:
+  plugins:
+    governance:
+      enabled: true
+```
+
+See the [Plugins](/deployment-guides/helm/plugins) page for plugin configuration details.
+</Note>
+
+---
+
+## Admin Authentication
+
+Protect the Bifrost dashboard and management API with username/password auth.
+
+```bash
+kubectl create secret generic bifrost-admin-credentials \
+  --from-literal=username='admin' \
+  --from-literal=password='your-secure-admin-password'
+```
+
+```yaml
+bifrost:
+  governance:
+    authConfig:
+      isEnabled: true
+      disableAuthOnInference: false   # keep auth on inference routes
+      existingSecret: "bifrost-admin-credentials"
+      usernameKey: "username"
+      passwordKey: "password"
+```
+
+```bash
+helm upgrade bifrost bifrost/bifrost --reuse-values -f governance-auth-values.yaml
+```
+
+---
+
+## Budgets
+
+Spending caps that reset on a configurable period. Budgets are referenced by ID from virtual keys, teams, customers, or providers.
+
+| Reset duration | Syntax |
+|----------------|--------|
+| 30 seconds | `"30s"` |
+| 5 minutes | `"5m"` |
+| 1 hour | `"1h"` |
+| 1 day | `"1d"` |
+| 1 week | `"1w"` |
+| 1 month | `"1M"` |
+| 1 year | `"1Y"` |
+
+```yaml
+bifrost:
+  governance:
+    budgets:
+      - id: "budget-dev"
+        max_limit: 50          # $50 per month
+        reset_duration: "1M"
+
+      - id: "budget-production"
+        max_limit: 500         # $500 per month
+        reset_duration: "1M"
+
+      - id: "budget-testing"
+        max_limit: 10          # $10 per day
+        reset_duration: "1d"
+
+      - id: "budget-enterprise"
+        max_limit: 5000        # $5000 per month
+        reset_duration: "1M"
+```
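+
+The units in the table combine with `max_limit` to set the spending window. As a small sketch, a weekly cap could look like this; the `budget-weekly-sandbox` ID is a hypothetical name, not one used elsewhere in this guide.
+
+```yaml
+bifrost:
+  governance:
+    budgets:
+      - id: "budget-weekly-sandbox"   # hypothetical ID
+        max_limit: 25                 # $25 per week
+        reset_duration: "1w"
+```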
+
+---
+
+## Rate Limits
+
+Token and request-count caps per time window. Referenced by ID from virtual keys, teams, customers, or providers.
+
+```yaml
+bifrost:
+  governance:
+    rateLimits:
+      - id: "rate-limit-standard"
+        token_max_limit: 100000       # 100K tokens per hour
+        token_reset_duration: "1h"
+        request_max_limit: 1000       # 1000 requests per hour
+        request_reset_duration: "1h"
+
+      - id: "rate-limit-high"
+        token_max_limit: 500000       # 500K tokens per hour
+        token_reset_duration: "1h"
+        request_max_limit: 5000
+        request_reset_duration: "1h"
+
+      - id: "rate-limit-burst"
+        token_max_limit: 50000        # 50K tokens per minute (burst)
+        token_reset_duration: "1m"
+        request_max_limit: 500
+        request_reset_duration: "1m"
+
+      - id: "rate-limit-testing"
+        token_max_limit: 10000
+        token_reset_duration: "1h"
+        request_max_limit: 100
+        request_reset_duration: "1h"
+```
+
+---
+
+## Customers & Teams
+
+Optional organizational hierarchy. Virtual keys can be assigned to customers or teams, inheriting their budgets and rate limits.
+
+```yaml
+bifrost:
+  governance:
+    customers:
+      - id: "customer-acme"
+        name: "Acme Corp"
+        budget_id: "budget-production"
+        rate_limit_id: "rate-limit-high"
+
+      - id: "customer-startup"
+        name: "Startup Inc"
+        budget_id: "budget-dev"
+        rate_limit_id: "rate-limit-standard"
+
+    teams:
+      - id: "team-platform"
+        name: "Platform Team"
+        customer_id: "customer-acme"
+        budget_id: "budget-enterprise"
+        rate_limit_id: "rate-limit-high"
+
+      - id: "team-ml"
+        name: "ML Team"
+        customer_id: "customer-acme"
+        budget_id: "budget-production"
+        rate_limit_id: "rate-limit-standard"
+```
+
+---
+
+## Virtual Keys
+
+Virtual keys are the primary access tokens issued to callers. They scope which providers, models, and underlying API keys are accessible.
+
+```yaml
+bifrost:
+  governance:
+    virtualKeys:
+      # 1. Unrestricted dev key — access to every provider
+      - id: "vk-dev-all"
+        name: "Dev: all providers"
+        value: "vk-dev-all-secret-token"
+        is_active: true
+        budget_id: "budget-dev"
+        rate_limit_id: "rate-limit-standard"
+        # No provider_configs → all providers allowed
+
+      # 2. OpenAI only — restricted to two models
+      - id: "vk-openai-prod"
+        name: "OpenAI Production"
+        value: "vk-openai-prod-secret-token"
+        is_active: true
+        budget_id: "budget-production"
+        rate_limit_id: "rate-limit-high"
+        provider_configs:
+          - provider: "openai"
+            weight: 1
+            allowed_models: ["gpt-4o", "gpt-4o-mini"]
+
+      # 3. Multi-provider with weighted routing
+      - id: "vk-multi"
+        name: "Multi-provider weighted"
+        value: "vk-multi-secret-token"
+        is_active: true
+        budget_id: "budget-production"
+        rate_limit_id: "rate-limit-high"
+        provider_configs:
+          - provider: "openai"
+            weight: 2         # 50%
+            allowed_models: ["*"]
+          - provider: "anthropic"
+            weight: 1         # 25%
+            allowed_models: ["*"]
+          - provider: "groq"
+            weight: 1         # 25%
+            allowed_models: ["*"]
+
+      # 4. Team-scoped key
+      - id: "vk-platform-team"
+        name: "Platform Team Key"
+        value: "vk-platform-team-token"
+        is_active: true
+        team_id: "team-platform"       # inherits team budget/rate-limit
+        provider_configs:
+          - provider: "openai"
+            weight: 1
+            allowed_models: ["*"]
+            key_ids: ["openai-primary"]  # pin to specific configured key by name
+
+      # 5. Restricted testing key
+      - id: "vk-testing"
+        name: "Testing (gpt-4o-mini only)"
+        value: "vk-testing-token"
+        is_active: true
+        budget_id: "budget-testing"
+        rate_limit_id: "rate-limit-testing"
+        provider_configs:
+          - provider: "openai"
+            weight: 1
+            allowed_models: ["gpt-4o-mini"]
+
+      # 6. Batch API key
+      - id: "vk-batch"
+        name: "Batch API workloads"
+        value: "vk-batch-token"
+        is_active: true
+        budget_id: "budget-production"
+        rate_limit_id: "rate-limit-burst"
+        provider_configs:
+          - provider: "openai"
+            weight: 1
+            allowed_models: ["*"]
+            key_ids: ["openai-batch"]    # only the batch-flagged key
+```
+
+Helm values accept both `provider_configs[].key_ids` and `provider_configs[].keys`. Prefer `key_ids` for parity with `config.json`; its entries are provider key names (for example `openai-primary`).
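+
+The percentages in the weighted example (key 3) follow from dividing each weight by the sum of the weights on that key. The sketch below repeats the arithmetic with different numbers, assuming traffic share is `weight / sum(weights)` as those comments suggest. The `vk-weighted-sketch` ID, name, and token are hypothetical.
+
+```yaml
+bifrost:
+  governance:
+    virtualKeys:
+      - id: "vk-weighted-sketch"          # hypothetical ID
+        name: "Weighted sketch"
+        value: "vk-weighted-sketch-token" # hypothetical token
+        is_active: true
+        budget_id: "budget-dev"
+        rate_limit_id: "rate-limit-standard"
+        provider_configs:
+          - provider: "openai"
+            weight: 3         # 3 / (3 + 1) = 75%
+            allowed_models: ["*"]
+          - provider: "anthropic"
+            weight: 1         # 1 / (3 + 1) = 25%
+            allowed_models: ["*"]
+```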
+
+**Use a virtual key in API calls:**
+
+```bash
+curl http://localhost:8080/v1/chat/completions \
+  -H "x-bf-vk: vk-openai-prod-secret-token" \
+  -H "Content-Type: application/json" \
+  -d '{"model":"gpt-4o","messages":[{"role":"user","content":"Hello"}]}'
+```
+
+---
+
+## Model Configs
+
+Apply budgets and rate limits at the model level, independent of virtual keys:
+
+```yaml
+bifrost:
+  governance:
+    modelConfigs:
+      - id: "model-gpt4o"
+        model_name: "gpt-4o"
+        provider: "openai"
+        budget_id: "budget-production"
+        rate_limit_id: "rate-limit-high"
+
+      - id: "model-claude"
+        model_name: "claude-3-5-sonnet-20241022"
+        provider: "anthropic"
+        rate_limit_id: "rate-limit-standard"
+```
+
+---
+
+## Provider Governance
+
+Apply budgets and rate limits at the provider level:
+
+```yaml
+bifrost:
+  governance:
+    providers:
+      - name: "openai"
+        budget_id: "budget-production"
+        rate_limit_id: "rate-limit-high"
+        send_back_raw_request: false
+        send_back_raw_response: false
+
+      - name: "anthropic"
+        budget_id: "budget-production"
+        rate_limit_id: "rate-limit-standard"
+```
+
+---
+
+## Routing Rules
+
+CEL-expression-based routing rules redirect requests to different providers or models based on request attributes.
+
+| Field | Description |
+|-------|-------------|
+| `cel_expression` | CEL expression evaluated against the request; if `true`, rule fires |
+| `targets` | Provider/model targets with weights |
+| `fallbacks` | Providers to try if all targets fail |
+| `scope` | `global`, `team`, `customer`, or `virtual_key` |
+| `scope_id` | Required for non-global scopes |
+| `priority` | Lower number = evaluated first |
+
+```yaml
+bifrost:
+  governance:
+    routingRules:
+      # Route all GPT requests to Azure
+      - id: "route-gpt-to-azure"
+        name: "GPT → Azure"
+        description: "Route all GPT model requests to Azure OpenAI"
+        enabled: true
+        cel_expression: "model.startsWith('gpt-')"
+        targets:
+          - provider: "azure"
+            model: ""        # empty = use original model name
+            weight: 1.0
+        fallbacks: ["openai"]
+        scope: "global"
+        priority: 0
+
+      # Route gpt-4o requests with a large max_tokens value to Groq
+      - id: "route-heavy-to-groq"
+        name: "Large context → Groq"
+        enabled: true
+        cel_expression: "model == 'gpt-4o' && request_body.max_tokens > 4000"
+        targets:
+          - provider: "groq"
+            model: "llama-3.3-70b-versatile"
+            weight: 1.0
+        fallbacks: ["openai"]
+        scope: "global"
+        priority: 1
+
+      # Team-scoped rule
+      - id: "route-ml-team-bedrock"
+        name: "ML Team → Bedrock"
+        enabled: true
+        cel_expression: "true"    # match all requests for this scope
+        targets:
+          - provider: "bedrock"
+            model: ""
+            weight: 1.0
+        fallbacks: ["openai"]
+        scope: "team"
+        scope_id: "team-ml"
+        priority: 0
+```
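+
+The field table lists `virtual_key` as a scope, which none of the rules above use. A sketch of such a rule follows, assuming `scope_id` takes the virtual key's `id` (here `vk-dev-all` from the Virtual Keys section). The rule ID and name are hypothetical.
+
+```yaml
+bifrost:
+  governance:
+    routingRules:
+      # Virtual-key-scoped rule: only requests made with vk-dev-all are affected
+      - id: "route-dev-key-mini"          # hypothetical ID
+        name: "Dev key → gpt-4o-mini"
+        enabled: true
+        cel_expression: "model == 'gpt-4o'"
+        targets:
+          - provider: "openai"
+            model: "gpt-4o-mini"
+            weight: 1.0
+        fallbacks: ["anthropic"]
+        scope: "virtual_key"
+        scope_id: "vk-dev-all"            # assumed to reference the virtual key id
+        priority: 0
+```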
+
+---
+
+## Full Example
+
+```yaml
+# governance-full-values.yaml
+image:
+  tag: "v1.4.11"
+
+bifrost:
+  encryptionKeySecret:
+    name: "bifrost-encryption"
+    key: "encryption-key"
+
+  plugins:
+    governance:
+      enabled: true
+      config:
+        is_vk_mandatory: true
+
+  governance:
+    authConfig:
+      isEnabled: true
+      existingSecret: "bifrost-admin-credentials"
+      usernameKey: "username"
+      passwordKey: "password"
+
+    budgets:
+      - id: "budget-production"
+        max_limit: 500
+        reset_duration: "1M"
+      - id: "budget-dev"
+        max_limit: 50
+        reset_duration: "1M"
+
+    rateLimits:
+      - id: "rate-limit-standard"
+        token_max_limit: 100000
+        token_reset_duration: "1h"
+        request_max_limit: 1000
+        request_reset_duration: "1h"
+
+    virtualKeys:
+      - id: "vk-production"
+        name: "Production"
+        value: "vk-prod-secret-token"
+        is_active: true
+        budget_id: "budget-production"
+        rate_limit_id: "rate-limit-standard"
+        provider_configs:
+          - provider: "openai"
+            weight: 1
+            allowed_models: ["gpt-4o", "gpt-4o-mini"]
+```
+
+```bash
+kubectl create secret generic bifrost-encryption \
+  --from-literal=encryption-key='your-32-byte-key'
+
+kubectl create secret generic bifrost-admin-credentials \
+  --from-literal=username='admin' \
+  --from-literal=password='secure-admin-password'
+
+helm install bifrost bifrost/bifrost -f governance-full-values.yaml
+```
+
+---
+
+## Access Profiles (Enterprise)
+
+You can seed enterprise `access_profiles` directly from Helm values. The chart renders `bifrost.accessProfiles` into top-level `access_profiles` in `config.json`.
+
+```yaml
+bifrost:
+  accessProfiles:
+    - name: "platform-default"
+      description: "Default profile for platform users"
+      is_active: true
+      tags: ["platform", "default"]
+      provider_configs:
+        - provider_name: "openai"
+          all_models_allowed: false
+          allowed_models: ["gpt-4o", "gpt-4o-mini"]
+      mcp_servers:
+        - mcp_server_id: "github"
+      mcp_tool_overrides:
+        - mcp_client_id: "github"
+          tool_name: "create_pull_request"
+          action: "include"
+```
--- a/docs/deployment-guides/helm/guardrails.mdx
+++ b/docs/deployment-guides/helm/guardrails.mdx
@@ -0,0 +1,262 @@
+---
+title: "Guardrails"
+description: "Configure guardrails providers and rules in Bifrost Helm deployments"
+icon: "shield-halved"
+---
+
+<Note>
+Guardrails are an **enterprise-only** feature. They require the enterprise Bifrost image.
+</Note>
+
+Guardrails are configured under `bifrost.guardrails` in your values file. The configuration has two parts:
+
+- **`providers`** — the backend that performs the check. Rules link to providers by `id`.
+- **`rules`** — CEL expressions that control when and where providers are invoked.
+
+---
+
+## Providers
+
+<Tabs>
+<Tab title="Regex">
+
+Runs entirely in-process with no external dependency. Patterns use RE2 syntax. Supports optional per-pattern flags: `i` (case-insensitive), `m` (multiline), `s` (dot-all).
+
+```yaml
+bifrost:
+  guardrails:
+    providers:
+      - id: 1
+        provider_name: "regex"
+        policy_name: "block-secrets"
+        enabled: true
+        timeout: 5
+        config:
+          patterns:
+            - pattern: "sk-[A-Za-z0-9]{20,}"
+              description: "OpenAI API key"
+            - pattern: "AKIA[0-9A-Z]{16}"
+              description: "AWS access key"
+              flags: "i"
+            - pattern: "gh[ps]_[A-Za-z0-9]{36}"
+              description: "GitHub token"
+```
+
+</Tab>
+<Tab title="AWS Bedrock">
+
+```yaml
+bifrost:
+  guardrails:
+    providers:
+      - id: 2
+        provider_name: "bedrock"
+        policy_name: "content-filter"
+        enabled: true
+        timeout: 15
+        config:
+          guardrail_arn: "arn:aws:bedrock:us-east-1:123456789012:guardrail/abc123"
+          guardrail_version: "DRAFT"          # or a published version number
+          region: "us-east-1"
+          access_key: "env.AWS_ACCESS_KEY_ID" # omit to use instance role
+          secret_key: "env.AWS_SECRET_ACCESS_KEY"
+```
+
+</Tab>
+<Tab title="Azure Content Safety">
+
+```yaml
+bifrost:
+  guardrails:
+    providers:
+      - id: 3
+        provider_name: "azure"
+        policy_name: "azure-content-safety"
+        enabled: true
+        timeout: 10
+        config:
+          endpoint: "https://your-resource.cognitiveservices.azure.com"
+          api_key: "env.AZURE_CONTENT_SAFETY_KEY"
+          analyze_enabled: true
+          analyze_severity_threshold: "medium"  # low | medium | high
+          jailbreak_shield_enabled: true
+          indirect_attack_shield_enabled: true
+          copyright_enabled: false
+          text_blocklist_enabled: false
+          blocklist_names: []
+```
+
+</Tab>
+<Tab title="Gray Swan">
+
+```yaml
+bifrost:
+  guardrails:
+    providers:
+      - id: 4
+        provider_name: "grayswan"
+        policy_name: "grayswan-jailbreak"
+        enabled: true
+        timeout: 15
+        config:
+          api_key: "env.GRAYSWAN_API_KEY"
+          violation_threshold: 0.7        # 0.0–1.0; higher = more permissive
+          reasoning_mode: "standard"      # standard | fast
+          policy_id: ""                   # optional: single policy ID
+          policy_ids: []                  # optional: multiple policy IDs
+          rules: {}                       # optional: inline rule map
+```
+
+</Tab>
+</Tabs>
+
+---
+
+## Rules
+
+Rules are CEL expressions that fire when their condition is met. Available CEL variables:
+
+| Variable | Type | Description |
+|----------|------|-------------|
+| `model` | `string` | Model name from the request |
+| `provider` | `string` | Provider name (e.g. `"openai"`) |
+| `headers` | `map<string,string>` | HTTP request headers |
+| `params` | `map<string,string>` | Query parameters |
+| `customer` | `string` | Customer ID |
+| `team` | `string` | Team ID |
+| `user` | `string` | User ID |
+
+Rule fields:
+
+| Field | Required | Description |
+|-------|----------|-------------|
+| `id` | Yes | Unique integer ID |
+| `name` | Yes | Human-readable name |
+| `description` | No | Optional description |
+| `enabled` | Yes | `true` to activate |
+| `cel_expression` | Yes | CEL boolean expression; `"true"` matches all requests |
+| `apply_to` | Yes | `"input"`, `"output"`, or `"both"` |
+| `sampling_rate` | No | `0`–`100`; percentage of requests to check (default: 100) |
+| `timeout` | No | Rule timeout in seconds |
+| `provider_config_ids` | No | Provider `id`s to invoke when this rule matches |
+
+```yaml
+bifrost:
+  guardrails:
+    rules:
+      - id: 101
+        name: "block-secrets-input"
+        description: "Block prompts containing API keys"
+        enabled: true
+        cel_expression: "true"
+        apply_to: "input"
+        sampling_rate: 100
+        timeout: 10
+        provider_config_ids: [1]
+
+      - id: 102
+        name: "azure-output-gpt4o"
+        description: "Scan GPT-4o responses"
+        enabled: true
+        cel_expression: "model == 'gpt-4o'"
+        apply_to: "output"
+        sampling_rate: 100
+        timeout: 15
+        provider_config_ids: [3]
+
+      - id: 103
+        name: "grayswan-openai-input"
+        enabled: true
+        cel_expression: "provider == 'openai'"
+        apply_to: "input"
+        sampling_rate: 50
+        timeout: 20
+        provider_config_ids: [4]
+
+      - id: 104
+        name: "strict-team-check"
+        enabled: true
+        cel_expression: "team == 'team-platform'"
+        apply_to: "both"
+        sampling_rate: 100
+        timeout: 30
+        provider_config_ids: [1, 3]   # multiple providers run in parallel
+```
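+
+The rules above use `model`, `provider`, and `team`. The sketch below uses the `headers` map instead, assuming a caller-supplied `x-env` header (a hypothetical name) and that header keys are matched exactly as written. The `in` check guards against a CEL evaluation error when the header is absent.
+
+```yaml
+bifrost:
+  guardrails:
+    rules:
+      - id: 105
+        name: "regex-prod-header-input"
+        description: "Check prompts only for requests tagged as production"
+        enabled: true
+        # 'x-env' is a hypothetical header name; test for presence before indexing
+        cel_expression: "'x-env' in headers && headers['x-env'] == 'prod'"
+        apply_to: "input"
+        sampling_rate: 100
+        timeout: 10
+        provider_config_ids: [1]
+```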
+
+---
+
+## Full Example
+
+```yaml
+# guardrails-values.yaml
+image:
+  tag: "latest"
+
+bifrost:
+  encryptionKeySecret:
+    name: "bifrost-encryption"
+    key: "encryption-key"
+
+  guardrails:
+    providers:
+      - id: 1
+        provider_name: "regex"
+        policy_name: "block-secrets"
+        enabled: true
+        timeout: 5
+        config:
+          patterns:
+            - pattern: "sk-[A-Za-z0-9]{20,}"
+              description: "OpenAI API key"
+            - pattern: "AKIA[0-9A-Z]{16}"
+              description: "AWS access key"
+            - pattern: "gh[ps]_[A-Za-z0-9]{36}"
+              description: "GitHub token"
+
+      - id: 2
+        provider_name: "azure"
+        policy_name: "content-safety"
+        enabled: true
+        timeout: 10
+        config:
+          endpoint: "https://your-resource.cognitiveservices.azure.com"
+          api_key: "env.AZURE_CONTENT_SAFETY_KEY"
+          analyze_enabled: true
+          analyze_severity_threshold: "medium"
+          jailbreak_shield_enabled: true
+          indirect_attack_shield_enabled: false
+          copyright_enabled: false
+          text_blocklist_enabled: false
+
+    rules:
+      - id: 101
+        name: "block-secrets-input"
+        description: "Block prompts leaking credentials"
+        enabled: true
+        cel_expression: "true"
+        apply_to: "input"
+        sampling_rate: 100
+        timeout: 10
+        provider_config_ids: [1]
+
+      - id: 102
+        name: "content-safety-both"
+        description: "Azure content safety on input and output"
+        enabled: true
+        cel_expression: "true"
+        apply_to: "both"
+        sampling_rate: 100
+        timeout: 15
+        provider_config_ids: [2]
+```
+
+```bash
+kubectl create secret generic azure-content-safety \
+  --from-literal=key='your-azure-content-safety-api-key'
+
+# Quote the indexed --set paths so shells such as zsh do not treat [0] as a glob
+helm install bifrost bifrost/bifrost \
+  -f guardrails-values.yaml \
+  --set 'env[0].name=AZURE_CONTENT_SAFETY_KEY' \
+  --set 'env[0].valueFrom.secretKeyRef.name=azure-content-safety' \
+  --set 'env[0].valueFrom.secretKeyRef.key=key'
+```
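+
+The three `--set` flags can also be written in the values file, which may suit GitOps workflows better. This is a sketch of what those flags expand to, assuming the chart passes its top-level `env` list through to the Bifrost container as the command above implies.
+
+```yaml
+# Append to guardrails-values.yaml instead of passing --set flags
+env:
+  - name: AZURE_CONTENT_SAFETY_KEY
+    valueFrom:
+      secretKeyRef:
+        name: azure-content-safety   # Secret created with kubectl above
+        key: key
+```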
--- a/docs/deployment-guides/helm/plugins.mdx
+++ b/docs/deployment-guides/helm/plugins.mdx
@@ -0,0 +1,549 @@
+---
+title: "Plugins"
+description: "Configure Bifrost plugins in Helm — telemetry, logging, semantic cache, OpenTelemetry, Datadog, governance, and custom plugins"
+icon: "puzzle-piece"
+---
+
+Plugins are configured under `bifrost.plugins`. Each plugin is independently enabled/disabled. Pre-hooks run in registration order; post-hooks run in reverse order.
+
+<Note>
+**Telemetry, logging, and governance are auto-loaded built-ins** — they are always active and do not need to be explicitly enabled. Their configuration lives in `bifrost.client.*` and `bifrost.governance.*`, not in the `plugins` block.
+
+The `plugins` block controls the opt-in plugins: `semanticCache`, `otel`, `datadog`, `maxim`, and custom plugins.
+</Note>
+
+```yaml
+bifrost:
+  plugins:
+    semanticCache:
+      enabled: false
+    otel:
+      enabled: false
+    datadog:
+      enabled: false
+```
+
+```bash
+# Enable an opt-in plugin at install time
+helm install bifrost bifrost/bifrost \
+  --set image.tag=v1.4.11 \
+  --set bifrost.plugins.otel.enabled=true
+
+# Or upgrade to enable a plugin without touching other values
+helm upgrade bifrost bifrost/bifrost \
+  --reuse-values \
+  --set bifrost.plugins.semanticCache.enabled=true
+```
+
+---
+
+<Tabs>
+
+<Tab title="Telemetry">
+
+### Telemetry (Prometheus)
+
+<Note>
+Telemetry is **always active** — it cannot be disabled. You do not need to set `bifrost.plugins.telemetry.enabled`.
+</Note>
+
+Exposes Prometheus metrics at `GET /metrics`. Custom labels are set via `bifrost.client.prometheusLabels`:
+
+```yaml
+bifrost:
+  client:
+    prometheusLabels:
+      - "environment=production"
+      - "region=us-east-1"
+```
+
+```bash
+# Verify metrics are exposed
+kubectl port-forward svc/bifrost 8080:8080 &
+curl http://localhost:8080/metrics | head -30
+```
+
+**With Prometheus Push Gateway** (recommended for multi-replica / HA setups where pull-based scraping can miss pods):
+
+```yaml
+bifrost:
+  plugins:
+    telemetry:
+      enabled: true
+      config:
+        push_gateway:
+          enabled: true
+          push_gateway_url: "http://prometheus-pushgateway.monitoring.svc.cluster.local:9091"
+          job_name: "bifrost"
+          instance_id: ""      # auto-derived from pod name if empty
+          push_interval: 15
+          basic_auth:
+            username: ""
+            password: ""
+```
+
+**ServiceMonitor for Prometheus Operator:**
+
+```yaml
+serviceMonitor:
+  enabled: true
+  interval: 30s
+  scrapeTimeout: 10s
+  namespace: monitoring     # namespace where Prometheus is deployed
+```
+
+</Tab>
+
+<Tab title="Logging">
+
+### Request/Response Logging
+
+<Note>
+Logging is **auto-loaded** when `bifrost.client.enableLogging: true` and a log store is configured. You do not need to set `bifrost.plugins.logging.enabled`.
+</Note>
+
+Configure logging via the `client` block:
+
+| Parameter | Description | Default |
+|-----------|-------------|---------|
+| `bifrost.client.enableLogging` | Enable request/response logging | `true` |
+| `bifrost.client.disableContentLogging` | Strip message body from logs (HIPAA/PCI) | `false` |
+| `bifrost.client.loggingHeaders` | HTTP headers to capture in log metadata | `[]` |
+
+```yaml
+bifrost:
+  client:
+    enableLogging: true
+    disableContentLogging: false   # set true for HIPAA/compliance
+    loggingHeaders:
+      - "x-request-id"
+      - "x-user-id"
+      - "x-team-id"
+```
+
+```bash
+# Verify logs are being written
+kubectl port-forward svc/bifrost 8080:8080 &
+curl -s "http://localhost:8080/api/logs?limit=5" | jq .
+```
+
+See [Client Configuration](/deployment-guides/helm/client) for the full reference.
+
+</Tab>
+
+<Tab title="Governance">
+
+### Governance
+
+<Note>
+Governance is **always active** for OSS deployments. You do not need to set `bifrost.plugins.governance.enabled`.
+</Note>
+
+Virtual key enforcement is controlled by the `client` block:
+
+| Parameter | Description | Default |
+|-----------|-------------|---------|
+| `bifrost.client.enforceAuthOnInference` | Require a virtual key (`x-bf-vk`) on every inference request | `false` |
+
+```yaml
+bifrost:
+  client:
+    enforceAuthOnInference: true   # require virtual key on all inference requests
+```
+
+Define virtual keys, budgets, rate limits, and routing rules in `bifrost.governance.*`. See the [Governance](/deployment-guides/helm/governance) page.
+
+</Tab>
+
+<Tab title="Semantic Cache">
+
+### Semantic Cache
+
+Caches LLM responses using vector similarity so semantically equivalent prompts return cached answers.
+
+Two modes:
+- **Semantic mode** (`dimension > 1`): uses an embedding model + vector store for similarity search
+- **Direct / hash mode** (`dimension: 1`): exact-match hash-based caching, no embedding model needed
+
+| Parameter | Description | Default |
+|-----------|-------------|---------|
+| `bifrost.plugins.semanticCache.enabled` | Enable semantic caching | `false` |
+| `bifrost.plugins.semanticCache.version` | Plugin config version for DB-backed update tracking (`1` to `32767`) | `1` |
+| `bifrost.plugins.semanticCache.config.provider` | Embedding provider | `"openai"` |
+| `bifrost.plugins.semanticCache.config.embedding_model` | Embedding model name | `"text-embedding-3-small"` |
+| `bifrost.plugins.semanticCache.config.dimension` | Embedding dimension (`1` = direct/hash mode) | `1536` |
+| `bifrost.plugins.semanticCache.config.threshold` | Cosine similarity threshold (0–1) | `0.8` |
+| `bifrost.plugins.semanticCache.config.ttl` | Cache entry TTL (Go duration) | `"5m"` |
+| `bifrost.plugins.semanticCache.config.conversation_history_threshold` | Number of past messages to include in cache key | `3` |
+| `bifrost.plugins.semanticCache.config.cache_by_model` | Include model name in cache key | `true` |
+| `bifrost.plugins.semanticCache.config.cache_by_provider` | Include provider name in cache key | `true` |
+| `bifrost.plugins.semanticCache.config.exclude_system_prompt` | Exclude system prompt from cache key | `false` |
+| `bifrost.plugins.semanticCache.config.cleanup_on_shutdown` | Delete cache data on pod shutdown | `false` |
+
+**Semantic mode (with OpenAI embeddings + Weaviate):**
+
+```bash
+kubectl create secret generic semantic-cache-secret \
+  --from-literal=openai-key='sk-your-openai-embedding-key'
+```
+
+```yaml
+# semantic-cache-values.yaml
+image:
+  tag: "v1.4.11"
+
+vectorStore:
+  enabled: true
+  type: weaviate
+  weaviate:
+    enabled: true
+    persistence:
+      size: 20Gi
+
+bifrost:
+  plugins:
+    semanticCache:
+      enabled: true
+      config:
+        provider: "openai"
+        keys:
+          - value: "env.SEMANTIC_CACHE_OPENAI_KEY"
+            weight: 1
+        embedding_model: "text-embedding-3-small"
+        dimension: 1536
+        threshold: 0.85
+        ttl: "1h"
+        conversation_history_threshold: 5
+        cache_by_model: true
+        cache_by_provider: true
+
+  providerSecrets:
+    semantic-cache-key:
+      existingSecret: "semantic-cache-secret"
+      key: "openai-key"
+      envVar: "SEMANTIC_CACHE_OPENAI_KEY"
+```
+
+```bash
+helm install bifrost bifrost/bifrost -f semantic-cache-values.yaml
+```
+
+**Direct / hash mode** (no embedding provider needed):
+
+```yaml
+bifrost:
+  plugins:
+    semanticCache:
+      enabled: true
+      config:
+        dimension: 1          # triggers hash-based exact matching
+        ttl: "30m"
+        cache_by_model: true
+        cache_by_provider: true
+```
+
+<Note>
+The vector store (`vectorStore.*`) must be configured and enabled for semantic mode. Direct/hash mode works without a vector store but still requires a storage backend.
+</Note>
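+
+Two options from the parameter table, `exclude_system_prompt` and `cleanup_on_shutdown`, do not appear in the examples above. A minimal sketch in direct/hash mode, assuming you want prompts that differ only in their system prompt to share a cache entry and the cache to be discarded when the pod stops:
+
+```yaml
+bifrost:
+  plugins:
+    semanticCache:
+      enabled: true
+      config:
+        dimension: 1                  # direct/hash mode
+        ttl: "30m"
+        exclude_system_prompt: true   # leave the system prompt out of the cache key
+        cleanup_on_shutdown: true     # delete cache data on pod shutdown
+```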
+
+</Tab>
+
+<Tab title="OpenTelemetry">
+
+### OpenTelemetry (OTel)
+
+Sends distributed traces and push-based metrics to any OTLP-compatible collector (Jaeger, Tempo, Honeycomb, etc.).
+
+| Parameter | Description | Default |
+|-----------|-------------|---------|
+| `bifrost.plugins.otel.enabled` | Enable OTel tracing | `false` |
+| `bifrost.plugins.otel.version` | Plugin config version for DB-backed update tracking (`1` to `32767`) | `1` |
+| `bifrost.plugins.otel.config.service_name` | Service name in traces | `"bifrost"` |
+| `bifrost.plugins.otel.config.collector_url` | OTLP collector endpoint | `""` |
+| `bifrost.plugins.otel.config.trace_type` | Trace type (`genai_extension`, `vercel`, or `open_inference`) | `"genai_extension"` |
+| `bifrost.plugins.otel.config.protocol` | Transport protocol (`grpc` or `http`) | `"grpc"` |
+| `bifrost.plugins.otel.config.metrics_enabled` | Enable OTLP push-based metrics | `false` |
+| `bifrost.plugins.otel.config.metrics_endpoint` | OTLP metrics endpoint | `""` |
+| `bifrost.plugins.otel.config.metrics_push_interval` | Push interval in seconds | `15` |
+| `bifrost.plugins.otel.config.headers` | Custom headers for the collector | `{}` |
+| `bifrost.plugins.otel.config.insecure` | Skip TLS verification | `false` |
+| `bifrost.plugins.otel.config.tls_ca_cert` | Path to CA cert for TLS | `""` |
+
+```yaml
+# otel-values.yaml
+image:
+  tag: "v1.4.11"
+
+bifrost:
+  plugins:
+    otel:
+      enabled: true
+      config:
+        service_name: "bifrost-production"
+        collector_url: "otel-collector.observability.svc.cluster.local:4317"
+        trace_type: "genai_extension"
+        protocol: "grpc"
+        insecure: true        # set false in production with a proper cert
+        metrics_enabled: true
+        metrics_endpoint: "otel-collector.observability.svc.cluster.local:4317"
+        metrics_push_interval: 15
+        headers:
+          x-honeycomb-team: "env.HONEYCOMB_API_KEY"   # this env var must be present in the pod
+```
+
+```bash
+helm upgrade bifrost bifrost/bifrost --reuse-values -f otel-values.yaml
+```
+
+**With authentication headers from a Kubernetes Secret:**
+
+```bash
+kubectl create secret generic otel-credentials \
+  --from-literal=api-key='your-honeycomb-or-grafana-key'
+```
+
+```yaml
+bifrost:
+  plugins:
+    otel:
+      enabled: true
+      config:
+        collector_url: "api.honeycomb.io:443"
+        protocol: "grpc"
+        headers:
+          x-honeycomb-team: "env.OTEL_API_KEY"
+
+  providerSecrets:
+    otel-key:
+      existingSecret: "otel-credentials"
+      key: "api-key"
+      envVar: "OTEL_API_KEY"
+```
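+
+The `tls_ca_cert` parameter from the table can be paired with `insecure: false` when the collector uses a private CA. The sketch below assumes a hypothetical Secret named `otel-ca` containing `ca.crt`, mounted through the chart's `volumes` and `volumeMounts` values; the Secret name and mount path are assumptions, not chart defaults:
+
+```yaml
+volumes:
+  - name: otel-ca
+    secret:
+      secretName: otel-ca           # hypothetical Secret holding ca.crt
+
+volumeMounts:
+  - name: otel-ca
+    mountPath: /etc/otel/tls        # hypothetical mount path
+    readOnly: true
+
+bifrost:
+  plugins:
+    otel:
+      enabled: true
+      config:
+        collector_url: "otel-collector.observability.svc.cluster.local:4317"
+        protocol: "grpc"
+        insecure: false
+        tls_ca_cert: "/etc/otel/tls/ca.crt"
+```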
+
+</Tab>
+
+<Tab title="Datadog">
+
+### Datadog APM
+
+Sends traces to a Datadog Agent running in the cluster.
+
+| Parameter | Description | Default |
+|-----------|-------------|---------|
+| `bifrost.plugins.datadog.enabled` | Enable Datadog tracing | `false` |
+| `bifrost.plugins.datadog.version` | Plugin config version for DB-backed update tracking (`1` to `32767`) | `1` |
+| `bifrost.plugins.datadog.config.service_name` | Service name | `"bifrost"` |
+| `bifrost.plugins.datadog.config.agent_addr` | Datadog Agent address | `"localhost:8126"` |
+| `bifrost.plugins.datadog.config.env` | Deployment environment tag | `""` |
+| `bifrost.plugins.datadog.config.version` | Version tag | `""` |
+| `bifrost.plugins.datadog.config.enable_traces` | Enable trace collection | `true` |
+| `bifrost.plugins.datadog.config.custom_tags` | Extra tags on all spans | `{}` |
+
+The Datadog Agent is typically deployed via the [Datadog Helm chart](https://docs.datadoghq.com/containers/kubernetes/installation/) as a DaemonSet, making it available at the node's hostIP.
+
+```yaml
+# datadog-values.yaml
+image:
+  tag: "v1.4.11"
+
+bifrost:
+  plugins:
+    datadog:
+      enabled: true
+      config:
+        service_name: "bifrost"
+        agent_addr: "$(HOST_IP):8126"   # uses Datadog DaemonSet pattern
+        env: "production"
+        version: "v1.4.11"
+        enable_traces: true
+        custom_tags:
+          team: "platform"
+          region: "us-east-1"
+
+# Inject HOST_IP so Bifrost can reach the DaemonSet agent on the same node
+env:
+  - name: HOST_IP
+    valueFrom:
+      fieldRef:
+        fieldPath: status.hostIP
+```
+
+```bash
+helm upgrade bifrost bifrost/bifrost --reuse-values -f datadog-values.yaml
+```
+
+</Tab>
+
+<Tab title="Maxim">
+
+### Maxim Observability
+
+Sends LLM request/response data to [Maxim](https://getmaxim.ai) for tracing, evaluation, and observability.
+
+| Parameter | Description | Default |
+|-----------|-------------|---------|
+| `bifrost.plugins.maxim.enabled` | Enable Maxim plugin | `false` |
+| `bifrost.plugins.maxim.version` | Plugin config version for DB-backed update tracking (`1` to `32767`) | `1` |
+| `bifrost.plugins.maxim.config.api_key` | Maxim API key (plain text, prefer secret) | `""` |
+| `bifrost.plugins.maxim.config.log_repo_id` | Maxim log repository ID | `""` |
+| `bifrost.plugins.maxim.secretRef.name` | Kubernetes Secret name for API key | `""` |
+| `bifrost.plugins.maxim.secretRef.key` | Key within the secret | `"api-key"` |
+
+```bash
+kubectl create secret generic maxim-credentials \
+  --from-literal=api-key='your-maxim-api-key'
+```
+
+```yaml
+# maxim-values.yaml
+image:
+  tag: "v1.4.11"
+
+bifrost:
+  plugins:
+    maxim:
+      enabled: true
+      config:
+        log_repo_id: "your-log-repo-id"
+      secretRef:
+        name: "maxim-credentials"
+        key: "api-key"
+```
+
+```bash
+helm upgrade bifrost bifrost/bifrost --reuse-values -f maxim-values.yaml
+```
+
+</Tab>
+
+<Tab title="Custom Plugin">
+
+### Custom / Dynamic Plugins
+
+Load a custom Go plugin (compiled `.so` file) at runtime.
+
+| Parameter | Description | Default |
+|-----------|-------------|---------|
+| `bifrost.plugins.custom[].name` | Unique plugin name | `""` |
+| `bifrost.plugins.custom[].enabled` | Enable custom plugin | `false` |
+| `bifrost.plugins.custom[].path` | Path to compiled `.so` file in the container | `""` |
+| `bifrost.plugins.custom[].version` | Plugin config version (`1` to `32767`) | `1` |
+| `bifrost.plugins.custom[].config` | Arbitrary plugin-specific configuration | `{}` |
+
+```yaml
+bifrost:
+  plugins:
+    custom:
+      - name: "my-custom-plugin"
+        enabled: true
+        path: "/plugins/my-plugin.so"
+        version: 1
+        config:
+          api_endpoint: "https://my-service.example.com"
+          timeout: 5000
+```
+
+Mount the `.so` file via a volume. A ConfigMap holds at most 1 MiB of data, so this only suits small plugin binaries:
+
+```yaml
+volumes:
+  - name: custom-plugins
+    configMap:
+      name: bifrost-custom-plugins
+
+volumeMounts:
+  - name: custom-plugins
+    mountPath: /plugins
+```
+
+Or use an init container to download the plugin binary:
+
+```yaml
+initContainers:
+  - name: download-plugin
+    image: curlimages/curl:8.6.0
+    command:
+      - sh
+      - -c
+      - |
+        curl -fsSL https://plugins.example.com/my-plugin.so \
+          -o /plugins/my-plugin.so
+    volumeMounts:
+      - name: plugin-dir
+        mountPath: /plugins
+
+volumes:
+  - name: plugin-dir
+    emptyDir: {}
+
+volumeMounts:
+  - name: plugin-dir
+    mountPath: /plugins
+```
+
+```bash
+helm upgrade bifrost bifrost/bifrost --reuse-values -f custom-plugin-values.yaml
+```
+
+</Tab>
+
+</Tabs>
+
+---
+
+## All Plugins Together
+
+```yaml
+# all-plugins-values.yaml
+image:
+  tag: "v1.4.11"
+
+bifrost:
+  encryptionKeySecret:
+    name: "bifrost-encryption"
+    key: "encryption-key"
+
+  plugins:
+    telemetry:
+      enabled: true
+      config:
+        custom_labels:
+          - name: "environment"
+            value: "production"
+
+    # Optional: logging auto-loads when bifrost.client.enableLogging is true
+    logging:
+      enabled: true
+      config:
+        disable_content_logging: false
+        logging_headers:
+          - "x-request-id"
+
+    # Optional: governance is always active for OSS deployments
+    governance:
+      enabled: true
+      config:
+        is_vk_mandatory: true
+
+    semanticCache:
+      enabled: true
+      config:
+        provider: "openai"
+        keys:
+          - value: "env.CACHE_OPENAI_KEY"
+            weight: 1
+        embedding_model: "text-embedding-3-small"
+        dimension: 1536
+        threshold: 0.85
+        ttl: "1h"
+
+    otel:
+      enabled: true
+      config:
+        service_name: "bifrost"
+        collector_url: "otel-collector.observability.svc.cluster.local:4317"
+        protocol: "grpc"
+        insecure: true
+```
+
+```bash
+helm install bifrost bifrost/bifrost -f all-plugins-values.yaml
+```
--- a/docs/deployment-guides/helm/providers.mdx
+++ b/docs/deployment-guides/helm/providers.mdx
@@ -0,0 +1,941 @@
+---
+title: "Provider Setup"
+description: "Configure LLM providers in the Bifrost Helm chart — API keys, cloud-native auth, and self-hosted endpoints"
+icon: "plug"
+---
+
+All providers are configured under `bifrost.providers` in your values file. Each provider entry contains a `keys` list where each key has a `name`, `value`, `weight`, an optional `models` list, and optional provider-specific config.
+
+**Two ways to supply credentials:**
+
+- **Direct value** — `value: "sk-..."` (fine for dev; avoid in production)
+- **Kubernetes Secret + env var** — store the key in a Secret, inject as an env var, and reference it with `value: "env.VAR_NAME"`
+
+The `providerSecrets` block handles the Secret → env var injection automatically:
+
+```yaml
+bifrost:
+  providers:
+    openai:
+      keys:
+        - name: "primary"
+          value: "env.OPENAI_API_KEY"   # resolved at runtime
+          weight: 1
+
+  providerSecrets:
+    openai:
+      existingSecret: "my-openai-secret"
+      key: "api-key"
+      envVar: "OPENAI_API_KEY"          # injected into the pod
+```
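+
+In Kubernetes terms, this kind of injection normally corresponds to a container `env` entry backed by a `secretKeyRef`. The fragment below sketches what the rendered pod spec would plausibly contain for the example above; it is not chart values, and the exact output of the chart's templates is an assumption:
+
+```yaml
+# Sketch of the container env entry in the rendered Deployment (assumed shape)
+env:
+  - name: OPENAI_API_KEY
+    valueFrom:
+      secretKeyRef:
+        name: my-openai-secret   # providerSecrets.openai.existingSecret
+        key: api-key             # providerSecrets.openai.key
+```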
+
+---
+
+<Tabs>
+
+<Tab title="OpenAI">
+
+### OpenAI
+
+Supports multiple keys with weighted load balancing. The key with `use_for_batch_api: true` is eligible for the Batch API.
+
+**Step 1 — Create secret**
+
+```bash
+kubectl create secret generic openai-credentials \
+  --from-literal=api-key-1='sk-your-primary-key' \
+  --from-literal=api-key-2='sk-your-secondary-key' \
+  --from-literal=api-key-batch='sk-your-batch-key'
+```
+
+**Step 2 — Values file**
+
+```yaml
+# openai-values.yaml
+image:
+  tag: "v1.4.11"
+
+bifrost:
+  providers:
+    openai:
+      keys:
+        - name: "openai-primary"
+          value: "env.OPENAI_KEY_1"
+          weight: 2               # 50% of traffic
+          models: ["*"]
+        - name: "openai-secondary"
+          value: "env.OPENAI_KEY_2"
+          weight: 1               # 25%
+          models: ["gpt-4o-mini"] # restrict to cheaper model
+        - name: "openai-batch"
+          value: "env.OPENAI_KEY_BATCH"
+          weight: 1               # 25%
+          models: ["*"]
+          use_for_batch_api: true
+
+  providerSecrets:
+    openai-key-1:
+      existingSecret: "openai-credentials"
+      key: "api-key-1"
+      envVar: "OPENAI_KEY_1"
+    openai-key-2:
+      existingSecret: "openai-credentials"
+      key: "api-key-2"
+      envVar: "OPENAI_KEY_2"
+    openai-key-batch:
+      existingSecret: "openai-credentials"
+      key: "api-key-batch"
+      envVar: "OPENAI_KEY_BATCH"
+```
+
+**Step 3 — Install**
+
+```bash
+helm install bifrost bifrost/bifrost -f openai-values.yaml
+```
+
+**Optional — per-provider network config**
+
+```yaml
+bifrost:
+  providers:
+    openai:
+      keys:
+        - name: "primary"
+          value: "env.OPENAI_KEY_1"
+          weight: 1
+      network_config:
+        default_request_timeout_in_seconds: 120
+        max_retries: 3
+        retry_backoff_initial_ms: 500
+        retry_backoff_max_ms: 5000
+        max_conns_per_host: 5000
+```
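+
+The percentage comments in the Step 2 values file suggest that each key's share of traffic is its weight divided by the sum of all weights. Assuming that holds, a 90/10 split between two keys could be sketched as follows; the key names are hypothetical:
+
+```yaml
+bifrost:
+  providers:
+    openai:
+      keys:
+        - name: "openai-stable"      # hypothetical name
+          value: "env.OPENAI_KEY_1"
+          weight: 9                  # 9 / (9 + 1) = 90%
+          models: ["*"]
+        - name: "openai-canary"      # hypothetical name
+          value: "env.OPENAI_KEY_2"
+          weight: 1                  # 1 / (9 + 1) = 10%
+          models: ["*"]
+```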
+
+</Tab>
+
+<Tab title="Anthropic">
+
+### Anthropic
+
+```bash
+kubectl create secret generic anthropic-credentials \
+  --from-literal=api-key-1='sk-ant-your-primary-key' \
+  --from-literal=api-key-2='sk-ant-your-secondary-key'
+```
+
+```yaml
+# anthropic-values.yaml
+image:
+  tag: "v1.4.11"
+
+bifrost:
+  providers:
+    anthropic:
+      keys:
+        - name: "anthropic-primary"
+          value: "env.ANTHROPIC_KEY_1"
+          weight: 1
+          models: ["*"]
+        - name: "anthropic-secondary"
+          value: "env.ANTHROPIC_KEY_2"
+          weight: 1
+          models: ["*"]
+
+  providerSecrets:
+    anthropic-key-1:
+      existingSecret: "anthropic-credentials"
+      key: "api-key-1"
+      envVar: "ANTHROPIC_KEY_1"
+    anthropic-key-2:
+      existingSecret: "anthropic-credentials"
+      key: "api-key-2"
+      envVar: "ANTHROPIC_KEY_2"
+```
+
+```bash
+helm install bifrost bifrost/bifrost -f anthropic-values.yaml
+```
+
+**Override Anthropic beta headers** (optional):
+
+```yaml
+bifrost:
+  providers:
+    anthropic:
+      keys:
+        - name: "primary"
+          value: "env.ANTHROPIC_KEY_1"
+          weight: 1
+      network_config:
+        beta_header_overrides:
+          redact-thinking-: true
+```
+
+</Tab>
+
+<Tab title="Azure OpenAI">
+
+### Azure OpenAI
+
+Azure requires `azure_key_config` with `endpoint` and `api_version` on every key. Use `aliases`, set on the key itself next to `azure_key_config`, to map logical model names to Azure deployment names.
+
+Two auth modes are supported:
+
+<Tabs>
+<Tab title="API Key">
+
+**Step 1 — Create secret**
+
+```bash
+kubectl create secret generic azure-credentials \
+  --from-literal=api-key='your-azure-openai-api-key' \
+  --from-literal=endpoint='https://your-resource.openai.azure.com'
+```
+
+**Step 2 — Values file**
+
+```yaml
+# azure-apikey-values.yaml
+image:
+  tag: "v1.4.11"
+
+bifrost:
+  providers:
+    azure:
+      keys:
+        - name: "azure-primary"
+          value: "env.AZURE_API_KEY"
+          weight: 1
+          models: ["gpt-4o", "gpt-4o-mini", "text-embedding-3-small"]
+          azure_key_config:
+            endpoint: "env.AZURE_ENDPOINT"
+            api_version: "2024-10-21"
+          aliases:
+            gpt-4o: "gpt-4o-prod"
+            gpt-4o-mini: "gpt-4o-mini-prod"
+            text-embedding-3-small: "embeddings-prod"
+
+  providerSecrets:
+    azure-api-key:
+      existingSecret: "azure-credentials"
+      key: "api-key"
+      envVar: "AZURE_API_KEY"
+    azure-endpoint:
+      existingSecret: "azure-credentials"
+      key: "endpoint"
+      envVar: "AZURE_ENDPOINT"
+```
+
+**Step 3 — Install**
+
+```bash
+helm install bifrost bifrost/bifrost -f azure-apikey-values.yaml
+```
+
+</Tab>
+<Tab title="Managed Identity / Workload Identity">
+
+When `value` is empty, Bifrost uses `DefaultAzureCredential` — which automatically resolves credentials from:
+- AKS Workload Identity (recommended for production)
+- Azure VM managed identity
+- `az login` (developer machines)
+
+**Step 1 — Annotate the service account** (AKS Workload Identity)
+
+```bash
+# Associate the Kubernetes service account with your Azure managed identity
+kubectl annotate serviceaccount bifrost \
+  azure.workload.identity/client-id="<MANAGED_IDENTITY_CLIENT_ID>"
+```
+
+```yaml
+serviceAccount:
+  annotations:
+    azure.workload.identity/client-id: "<MANAGED_IDENTITY_CLIENT_ID>"
+```
+
+**Step 2 — Values file**
+
+```bash
+kubectl create secret generic azure-config \
+  --from-literal=endpoint='https://your-resource.openai.azure.com'
+```
+
+```yaml
+# azure-msi-values.yaml
+image:
+  tag: "v1.4.11"
+
+serviceAccount:
+  annotations:
+    azure.workload.identity/client-id: "<MANAGED_IDENTITY_CLIENT_ID>"
+
+bifrost:
+  providers:
+    azure:
+      keys:
+        - name: "azure-workload-identity"
+          value: ""                          # empty = DefaultAzureCredential
+          weight: 1
+          models: ["gpt-4o"]
+          azure_key_config:
+            endpoint: "env.AZURE_ENDPOINT"
+            api_version: "2024-10-21"
+          aliases:
+            gpt-4o: "gpt-4o-prod"
+
+  providerSecrets:
+    azure-endpoint:
+      existingSecret: "azure-config"
+      key: "endpoint"
+      envVar: "AZURE_ENDPOINT"
+```
+
+**Step 3 — Install**
+
+```bash
+helm install bifrost bifrost/bifrost -f azure-msi-values.yaml
+```
+
+</Tab>
+</Tabs>
+
+**Multi-region failover** (two deployments, different regions):
+
+```yaml
+bifrost:
+  providers:
+    azure:
+      keys:
+        - name: "eastus"
+          value: "env.AZURE_KEY_EAST"
+          weight: 1
+          azure_key_config:
+            endpoint: "env.AZURE_ENDPOINT_EAST"
+            api_version: "2024-10-21"
+          aliases:
+            gpt-4o: "gpt-4o-eastus"
+        - name: "westus"
+          value: "env.AZURE_KEY_WEST"
+          weight: 1
+          azure_key_config:
+            endpoint: "env.AZURE_ENDPOINT_WEST"
+            api_version: "2024-10-21"
+          aliases:
+            gpt-4o: "gpt-4o-westus"
+```
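+
+The multi-region example references four environment variables without showing where they come from. A sketch of the matching `providerSecrets` entries, following the same pattern as the single-region example and assuming a hypothetical Secret named `azure-multiregion` with the keys shown:
+
+```yaml
+bifrost:
+  providerSecrets:
+    azure-key-east:
+      existingSecret: "azure-multiregion"   # hypothetical Secret name
+      key: "api-key-east"
+      envVar: "AZURE_KEY_EAST"
+    azure-endpoint-east:
+      existingSecret: "azure-multiregion"
+      key: "endpoint-east"
+      envVar: "AZURE_ENDPOINT_EAST"
+    azure-key-west:
+      existingSecret: "azure-multiregion"
+      key: "api-key-west"
+      envVar: "AZURE_KEY_WEST"
+    azure-endpoint-west:
+      existingSecret: "azure-multiregion"
+      key: "endpoint-west"
+      envVar: "AZURE_ENDPOINT_WEST"
+```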
+
+</Tab>
+
+<Tab title="AWS Bedrock">
+
+### AWS Bedrock
+
+Bedrock requires `bedrock_key_config` with at minimum a `region`. Three auth modes:
+
+<Tabs>
+<Tab title="Static Credentials">
+
+```bash
+kubectl create secret generic aws-credentials \
+  --from-literal=access-key-id='AKIAIOSFODNN7EXAMPLE' \
+  --from-literal=secret-access-key='wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY'
+```
+
+```yaml
+# bedrock-static-values.yaml
+image:
+  tag: "v1.4.11"
+
+bifrost:
+  providers:
+    bedrock:
+      keys:
+        - name: "bedrock-static"
+          value: ""
+          weight: 1
+          models: ["*"]
+          bedrock_key_config:
+            region: "us-east-1"
+            access_key: "env.AWS_ACCESS_KEY_ID"
+            secret_key: "env.AWS_SECRET_ACCESS_KEY"
+            deployments:
+              # Logical name -> Bedrock inference profile
+              anthropic.claude-3-5-sonnet: "us.anthropic.claude-3-5-sonnet-20240620-v1:0"
+
+  providerSecrets:
+    aws-access-key:
+      existingSecret: "aws-credentials"
+      key: "access-key-id"
+      envVar: "AWS_ACCESS_KEY_ID"
+    aws-secret-key:
+      existingSecret: "aws-credentials"
+      key: "secret-access-key"
+      envVar: "AWS_SECRET_ACCESS_KEY"
+```
+
+```bash
+helm install bifrost bifrost/bifrost -f bedrock-static-values.yaml
+```
+
+</Tab>
+<Tab title="IRSA / EKS Pod Identity">
+
+When only `region` is set, Bifrost inherits credentials from the AWS SDK default chain — IRSA (IAM Roles for Service Accounts), EC2 instance profile, or `AWS_*` env vars.
+
+**Step 1 — Annotate the service account with the IAM role**
+
+```bash
+kubectl annotate serviceaccount bifrost \
+  eks.amazonaws.com/role-arn="arn:aws:iam::123456789012:role/BifrostBedrockRole"
+```
+
+```yaml
+serviceAccount:
+  annotations:
+    eks.amazonaws.com/role-arn: "arn:aws:iam::123456789012:role/BifrostBedrockRole"
+```
+
+**Step 2 — Values file**
+
+```yaml
+# bedrock-irsa-values.yaml
+image:
+  tag: "v1.4.11"
+
+serviceAccount:
+  annotations:
+    eks.amazonaws.com/role-arn: "arn:aws:iam::123456789012:role/BifrostBedrockRole"
+
+bifrost:
+  providers:
+    bedrock:
+      keys:
+        - name: "bedrock-irsa"
+          value: ""
+          weight: 1
+          models: ["*"]
+          bedrock_key_config:
+            region: "us-east-1"
+            # No access_key / secret_key — SDK uses IRSA token automatically
+```
+
+```bash
+helm install bifrost bifrost/bifrost -f bedrock-irsa-values.yaml
+```
+
+</Tab>
+<Tab title="STS AssumeRole">
+
+Assumes a cross-account role on top of the default credential chain.
+
+```yaml
+# bedrock-assumerole-values.yaml
+image:
+  tag: "v1.4.11"
+
+bifrost:
+  providers:
+    bedrock:
+      keys:
+        - name: "bedrock-assumerole"
+          value: ""
+          weight: 1
+          models: ["*"]
+          bedrock_key_config:
+            region: "us-west-2"
+            # Source identity from pod's default chain, then assume this role
+            role_arn: "env.AWS_ROLE_ARN"
+            external_id: "env.AWS_EXTERNAL_ID"
+            session_name: "bifrost-session"
+```
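+
+Create the Secret that backs `env.AWS_ROLE_ARN` and `env.AWS_EXTERNAL_ID`, then add the matching `providerSecrets` entries under `bifrost:` in the same values file:
+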
+```bash
+kubectl create secret generic aws-role-config \
+  --from-literal=role-arn='arn:aws:iam::999999999999:role/CrossAccountBedrockRole' \
+  --from-literal=external-id='your-external-id'
+```
+
+```yaml
+  providerSecrets:
+    aws-role-arn:
+      existingSecret: "aws-role-config"
+      key: "role-arn"
+      envVar: "AWS_ROLE_ARN"
+    aws-external-id:
+      existingSecret: "aws-role-config"
+      key: "external-id"
+      envVar: "AWS_EXTERNAL_ID"
+```
+
+```bash
+helm install bifrost bifrost/bifrost -f bedrock-assumerole-values.yaml
+```
+
+</Tab>
+</Tabs>
+
+**Batch API — S3 configuration**
+
+```yaml
+bedrock_key_config:
+  region: "us-east-1"
+  access_key: "env.AWS_ACCESS_KEY_ID"
+  secret_key: "env.AWS_SECRET_ACCESS_KEY"
+  batch_s3_config:
+    buckets:
+      - bucket_name: "my-bedrock-batch-bucket"
+        prefix: "batch/"
+        is_default: true
+```
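+
+The fragment above belongs inside a key entry. A sketch of its placement in a full values file, reusing the static-credentials setup; the key name `bedrock-batch` is hypothetical:
+
+```yaml
+bifrost:
+  providers:
+    bedrock:
+      keys:
+        - name: "bedrock-batch"        # hypothetical key name
+          value: ""
+          weight: 1
+          models: ["*"]
+          bedrock_key_config:
+            region: "us-east-1"
+            access_key: "env.AWS_ACCESS_KEY_ID"
+            secret_key: "env.AWS_SECRET_ACCESS_KEY"
+            batch_s3_config:
+              buckets:
+                - bucket_name: "my-bedrock-batch-bucket"
+                  prefix: "batch/"
+                  is_default: true
+```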
+
+</Tab>
+
+<Tab title="Google Vertex AI">
+
+### Google Vertex AI
+
+Vertex requires `vertex_key_config` with `project_id` and `region`. Two auth modes:
+
+<Tabs>
+<Tab title="Service Account Key">
+
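+```bash
+# Base64-encode the service account JSON (-w 0 disables line wrapping in GNU base64)
+SA_JSON=$(base64 -w 0 < service-account-key.json)
+
+# project-id is stored too, because providerSecrets below reads it from this Secret
+kubectl create secret generic gcp-credentials \
+  --from-literal=project-id='my-gcp-project' \
+  --from-literal=service-account-json="${SA_JSON}"
+```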
+
+```yaml
+# vertex-sa-values.yaml
+image:
+  tag: "v1.4.11"
+
+bifrost:
+  providers:
+    vertex:
+      keys:
+        - name: "vertex-sa-key"
+          value: ""
+          weight: 1
+          models: ["*"]
+          vertex_key_config:
+            project_id: "env.VERTEX_PROJECT_ID"
+            region: "us-central1"
+            auth_credentials: "env.VERTEX_AUTH_CREDENTIALS"
+
+  providerSecrets:
+    vertex-project-id:
+      existingSecret: "gcp-credentials"
+      key: "project-id"
+      envVar: "VERTEX_PROJECT_ID"
+    vertex-sa:
+      existingSecret: "gcp-credentials"
+      key: "service-account-json"
+      envVar: "VERTEX_AUTH_CREDENTIALS"
+```
+
+```bash
+helm install bifrost bifrost/bifrost -f vertex-sa-values.yaml
+```
+
+</Tab>
+<Tab title="GKE Workload Identity / ADC">
+
+When `auth_credentials` is omitted, Bifrost calls `google.FindDefaultCredentials` — which resolves to:
+- GKE Workload Identity (recommended)
+- GCE metadata server (on Compute Engine / Cloud Run)
+- `GOOGLE_APPLICATION_CREDENTIALS` path
+- `gcloud auth application-default login` (developer machines)
+
+**Step 1 — Annotate the service account** (GKE Workload Identity)
+
+```bash
+gcloud iam service-accounts add-iam-policy-binding \
+  bifrost-sa@my-project.iam.gserviceaccount.com \
+  --role roles/iam.workloadIdentityUser \
+  --member "serviceAccount:my-project.svc.id.goog[default/bifrost]"
+```
+
+```yaml
+serviceAccount:
+  annotations:
+    iam.gke.io/gcp-service-account: "bifrost-sa@my-project.iam.gserviceaccount.com"
+```
+
+**Step 2 — Values file**
+
+```yaml
+# vertex-wli-values.yaml
+image:
+  tag: "v1.4.11"
+
+serviceAccount:
+  annotations:
+    iam.gke.io/gcp-service-account: "bifrost-sa@my-project.iam.gserviceaccount.com"
+
+bifrost:
+  providers:
+    vertex:
+      keys:
+        - name: "vertex-workload-identity"
+          value: ""
+          weight: 1
+          models: ["*"]
+          vertex_key_config:
+            project_id: "my-gcp-project"
+            region: "us-central1"
+            # auth_credentials intentionally omitted → ADC lookup
+```
+
+```bash
+helm install bifrost bifrost/bifrost -f vertex-wli-values.yaml
+```
+
+</Tab>
+</Tabs>
+
+</Tab>
+
+<Tab title="Groq / Mistral / Gemini / Others">
+
+### Standard API-Key Providers
+
+These providers follow the same simple pattern — one or more keys with weights.
+
+<Tabs>
+<Tab title="Groq">
+
+```bash
+kubectl create secret generic groq-credentials \
+  --from-literal=api-key='gsk_your_groq_api_key'
+```
+
+```yaml
+bifrost:
+  providers:
+    groq:
+      keys:
+        - name: "groq-primary"
+          value: "env.GROQ_API_KEY"
+          weight: 1
+          models: ["*"]
+
+  providerSecrets:
+    groq-key:
+      existingSecret: "groq-credentials"
+      key: "api-key"
+      envVar: "GROQ_API_KEY"
+```
+
+</Tab>
+<Tab title="Gemini">
+
+```bash
+kubectl create secret generic gemini-credentials \
+  --from-literal=api-key='your-gemini-api-key'
+```
+
+```yaml
+bifrost:
+  providers:
+    gemini:
+      keys:
+        - name: "gemini-main"
+          value: "env.GEMINI_API_KEY"
+          weight: 1
+          models: ["*"]
+
+  providerSecrets:
+    gemini-key:
+      existingSecret: "gemini-credentials"
+      key: "api-key"
+      envVar: "GEMINI_API_KEY"
+```
+
+</Tab>
+<Tab title="Mistral">
+
+```bash
+kubectl create secret generic mistral-credentials \
+  --from-literal=api-key='your-mistral-api-key'
+```
+
+```yaml
+bifrost:
+  providers:
+    mistral:
+      keys:
+        - name: "mistral-main"
+          value: "env.MISTRAL_API_KEY"
+          weight: 1
+          models: ["*"]
+
+  providerSecrets:
+    mistral-key:
+      existingSecret: "mistral-credentials"
+      key: "api-key"
+      envVar: "MISTRAL_API_KEY"
+```
+
+</Tab>
+<Tab title="Cohere / Perplexity / xAI / Others">
+
+All standard API-key providers follow the same pattern. Replace the provider name and env var name accordingly:
+
+```yaml
+bifrost:
+  providers:
+    cohere:
+      keys:
+        - name: "cohere-main"
+          value: "env.COHERE_API_KEY"
+          weight: 1
+    perplexity:
+      keys:
+        - name: "perplexity-main"
+          value: "env.PERPLEXITY_API_KEY"
+          weight: 1
+    xai:
+      keys:
+        - name: "xai-main"
+          value: "env.XAI_API_KEY"
+          weight: 1
+    cerebras:
+      keys:
+        - name: "cerebras-main"
+          value: "env.CEREBRAS_API_KEY"
+          weight: 1
+    openrouter:
+      keys:
+        - name: "openrouter-main"
+          value: "env.OPENROUTER_API_KEY"
+          weight: 1
+    nebius:
+      keys:
+        - name: "nebius-main"
+          value: "env.NEBIUS_API_KEY"
+          weight: 1
+```
+
+</Tab>
+</Tabs>
+
+**Install command (any of the above)**
+
+```bash
+helm install bifrost bifrost/bifrost \
+  --set image.tag=v1.4.11 \
+  -f provider-values.yaml
+```
+
+</Tab>
+
+<Tab title="Self-Hosted">
+
+### Self-Hosted Providers
+
+Self-hosted providers point to a URL you operate. No API key is typically required (`value: ""`).
+
+<Tabs>
+<Tab title="Ollama">
+
+```yaml
+# ollama-values.yaml
+image:
+  tag: "v1.4.11"
+
+bifrost:
+  providers:
+    ollama:
+      keys:
+        - name: "ollama-local"
+          value: ""
+          weight: 1
+          models: ["*"]
+          ollama_key_config:
+            url: "http://ollama.default.svc.cluster.local:11434"
+```
+
+```bash
+helm install bifrost bifrost/bifrost -f ollama-values.yaml
+```
+
+Using an env var for the URL (useful across environments):
+
+```bash
+kubectl create secret generic ollama-config \
+  --from-literal=url='http://ollama.default.svc.cluster.local:11434'
+```
+
+```yaml
+          ollama_key_config:
+            url: "env.OLLAMA_URL"
+
+  providerSecrets:
+    ollama-url:
+      existingSecret: "ollama-config"
+      key: "url"
+      envVar: "OLLAMA_URL"
+```
+
+</Tab>
+<Tab title="vLLM">
+
+vLLM instances are model-specific — one key per served model.
+
+```yaml
+# vllm-values.yaml
+image:
+  tag: "v1.4.11"
+
+bifrost:
+  providers:
+    vllm:
+      keys:
+        - name: "vllm-llama3-70b"
+          value: ""
+          weight: 1
+          models: ["llama-3-70b"]
+          vllm_key_config:
+            url: "http://vllm.default.svc.cluster.local:8000"
+            model_name: "meta-llama/Meta-Llama-3-70B-Instruct"
+        - name: "vllm-mistral"
+          value: ""
+          weight: 1
+          models: ["mistral-7b"]
+          vllm_key_config:
+            url: "http://vllm-mistral.default.svc.cluster.local:8000"
+            model_name: "mistralai/Mistral-7B-Instruct-v0.3"
+```
+
+```bash
+helm install bifrost bifrost/bifrost -f vllm-values.yaml
+```
+
+</Tab>
+<Tab title="SGLang">
+
+```yaml
+# sgl-values.yaml
+image:
+  tag: "v1.4.11"
+
+bifrost:
+  providers:
+    sgl:
+      keys:
+        - name: "sgl-main"
+          value: ""
+          weight: 1
+          models: ["*"]
+          sgl_key_config:
+            url: "http://sgl-router.default.svc.cluster.local:30000"
+```
+
+```bash
+helm install bifrost bifrost/bifrost -f sgl-values.yaml
+```
+
+</Tab>
+<Tab title="HuggingFace / Replicate">
+
+These providers use `aliases` to map logical model names to provider-specific IDs.
+
+```yaml
+bifrost:
+  providers:
+    huggingface:
+      keys:
+        - name: "hf-main"
+          value: "env.HF_API_KEY"
+          weight: 1
+          models: ["llama-3", "mixtral"]
+          aliases:
+            llama-3: "meta-llama/Meta-Llama-3-8B-Instruct"
+            mixtral: "mistralai/Mixtral-8x7B-Instruct-v0.1"
+
+    replicate:
+      keys:
+        - name: "replicate-main"
+          value: "env.REPLICATE_API_KEY"
+          weight: 1
+          models: ["llama-3"]
+          aliases:
+            llama-3: "meta/meta-llama-3-70b-instruct"
+          replicate_key_config:
+            use_deployments_endpoint: false
+```
+
+</Tab>
+</Tabs>
+
+</Tab>
+
+</Tabs>
+
+---
+
+## Multi-Provider Example
+
+Combine providers in a single values file:
+
+```yaml
+# multi-provider-values.yaml
+image:
+  tag: "v1.4.11"
+
+bifrost:
+  providers:
+    openai:
+      keys:
+        - name: "openai-primary"
+          value: "env.OPENAI_API_KEY"
+          weight: 2
+          models: ["*"]
+    anthropic:
+      keys:
+        - name: "anthropic-primary"
+          value: "env.ANTHROPIC_API_KEY"
+          weight: 1
+          models: ["*"]
+    groq:
+      keys:
+        - name: "groq-primary"
+          value: "env.GROQ_API_KEY"
+          weight: 1
+          models: ["*"]
+
+  providerSecrets:
+    openai-key:
+      existingSecret: "provider-keys"
+      key: "openai"
+      envVar: "OPENAI_API_KEY"
+    anthropic-key:
+      existingSecret: "provider-keys"
+      key: "anthropic"
+      envVar: "ANTHROPIC_API_KEY"
+    groq-key:
+      existingSecret: "provider-keys"
+      key: "groq"
+      envVar: "GROQ_API_KEY"
+
+  plugins:
+    logging:
+      enabled: true
+    governance:
+      enabled: true
+```
+
+```bash
+# Create a single secret with all provider keys
+kubectl create secret generic provider-keys \
+  --from-literal=openai='sk-your-openai-key' \
+  --from-literal=anthropic='sk-ant-your-anthropic-key' \
+  --from-literal=groq='gsk_your-groq-key'
+
+helm install bifrost bifrost/bifrost -f multi-provider-values.yaml
+```
--- a/docs/deployment-guides/helm/storage.mdx
+++ b/docs/deployment-guides/helm/storage.mdx
@@ -0,0 +1,550 @@
+---
+title: "Storage"
+description: "Configure Bifrost storage backends in Helm — SQLite, PostgreSQL (embedded and external), per-store overrides, and S3/GCS object storage for logs"
+icon: "database"
+---
+
+Bifrost persists two types of data — **config** (providers, virtual keys, governance rules) and **logs** (request/response records). Each has its own store, both defaulting to the top-level `storage.mode`.
+
+| Parameter | Description | Default |
+|-----------|-------------|---------|
+| `storage.mode` | Default backend for both stores (`sqlite` or `postgres`) | `sqlite` |
+| `storage.configStore.type` | Override backend for the config store | `""` (inherits `storage.mode`) |
+| `storage.logsStore.type` | Override backend for the logs store | `""` (inherits `storage.mode`) |
+
+<Note>
+When any store uses SQLite the chart deploys a **StatefulSet** with a PVC. With PostgreSQL only (no SQLite) it deploys a **Deployment**. Mixing backends (e.g. config=postgres, logs=sqlite) still requires a StatefulSet.
+</Note>
+
+---
+
+<Tabs>
+
+<Tab title="SQLite">
+
+### SQLite (Default)
+
+Simplest setup — no external database required. Bifrost runs as a StatefulSet with a persistent volume for the SQLite files.
+
+| Parameter | Description | Default |
+|-----------|-------------|---------|
+| `storage.persistence.enabled` | Create a PVC for SQLite data | `true` |
+| `storage.persistence.size` | PVC size | `10Gi` |
+| `storage.persistence.accessMode` | PVC access mode | `ReadWriteOnce` |
+| `storage.persistence.storageClass` | Storage class (leave empty for cluster default) | `""` |
+| `storage.persistence.existingClaim` | Reuse an existing PVC | `""` |
+
+```yaml
+# sqlite-values.yaml
+image:
+  tag: "v1.4.11"
+
+storage:
+  mode: sqlite
+  persistence:
+    enabled: true
+    size: 20Gi
+    # storageClass: "gp3"   # uncomment to pin storage class
+
+bifrost:
+  encryptionKey: "your-32-byte-encryption-key-here"
+```
+
+```bash
+helm install bifrost bifrost/bifrost -f sqlite-values.yaml
+```
+
+**Reuse an existing PVC** (e.g. after a StatefulSet migration):
+
+```yaml
+storage:
+  persistence:
+    existingClaim: "bifrost-data"
+```
+
+<Warning>
+Switching from SQLite to PostgreSQL requires a data migration because the two backends are not compatible. Plan for this before changing `storage.mode` on a running deployment.
+</Warning>
+
+#### StatefulSet Migration (chart v2.0.0+)
+
+Prior to v2.0.0, SQLite used a Deployment + manual PVC. v2.0.0 moved SQLite to a StatefulSet. If upgrading from an older chart:
+
+```bash
+# 1. Scale down the old deployment
+kubectl scale deployment bifrost --replicas=0
+
+# 2. Note the existing PVC name
+kubectl get pvc
+
+# 3. Upgrade the chart, pointing at the existing claim
+helm upgrade bifrost bifrost/bifrost \
+  --reuse-values \
+  --set storage.persistence.existingClaim=<your-old-pvc-name> \
+  --set image.tag=v1.4.11
+```
+
+</Tab>
+
+<Tab title="Embedded PostgreSQL">
+
+### Embedded PostgreSQL
+
+The chart can deploy a PostgreSQL instance alongside Bifrost. Good for simple production setups where you don't have an existing database.
+
+| Parameter | Description | Default |
+|-----------|-------------|---------|
+| `storage.mode` | Set to `postgres` | `sqlite` |
+| `postgresql.enabled` | Deploy PostgreSQL as a sub-deployment | `false` |
+| `postgresql.auth.username` | Database user | `bifrost` |
+| `postgresql.auth.password` | Database password | `bifrost_password` |
+| `postgresql.auth.database` | Database name | `bifrost` |
+| `postgresql.primary.persistence.size` | PVC size for PostgreSQL data | `8Gi` |
+
+<Note>
+Ensure the database is created with **UTF8 encoding**. The embedded PostgreSQL deployment handles this automatically. See [PostgreSQL UTF8 Requirement](/quickstart/gateway/setting-up#postgresql-utf8-requirement) for manual setups.
+</Note>
+
+```bash
+kubectl create secret generic postgres-credentials \
+  --from-literal=password='your-secure-postgres-password'
+```
+
+```yaml
+# embedded-postgres-values.yaml
+image:
+  tag: "v1.4.11"
+
+storage:
+  mode: postgres
+
+postgresql:
+  enabled: true
+  auth:
+    username: bifrost
+    password: "your-secure-postgres-password"   # use existingSecret in production
+    database: bifrost
+  primary:
+    persistence:
+      enabled: true
+      size: 50Gi
+    resources:
+      requests:
+        cpu: 500m
+        memory: 1Gi
+      limits:
+        cpu: 2000m
+        memory: 4Gi
+
+bifrost:
+  encryptionKey: "your-32-byte-encryption-key-here"
+```
+
+```bash
+helm install bifrost bifrost/bifrost -f embedded-postgres-values.yaml
+```
+
+**Verify the connection from Bifrost:**
+
+```bash
+kubectl exec -it deployment/bifrost -- nc -zv bifrost-postgresql 5432
+```
+
+</Tab>
+
+<Tab title="External PostgreSQL">
+
+### External PostgreSQL
+
+Point Bifrost at an existing PostgreSQL instance — RDS, Cloud SQL, Azure Database, or self-managed.
+
+| Parameter | Description | Default |
+|-----------|-------------|---------|
+| `postgresql.enabled` | Must be `false` | `false` |
+| `postgresql.external.enabled` | Enable external connection | `false` |
+| `postgresql.external.host` | Hostname or IP | `""` |
+| `postgresql.external.port` | Port | `5432` |
+| `postgresql.external.user` | Username | `bifrost` |
+| `postgresql.external.database` | Database name | `bifrost` |
+| `postgresql.external.sslMode` | SSL mode (`disable`, `require`, `verify-ca`, `verify-full`) | `disable` |
+| `postgresql.external.existingSecret` | Secret name for the password | `""` |
+| `postgresql.external.passwordKey` | Key within the secret | `"password"` |
+
+```bash
+kubectl create secret generic external-postgres-credentials \
+  --from-literal=password='your-external-postgres-password'
+```
+
+```yaml
+# external-postgres-values.yaml
+image:
+  tag: "v1.4.11"
+
+storage:
+  mode: postgres
+
+postgresql:
+  enabled: false
+  external:
+    enabled: true
+    host: "your-rds-endpoint.us-east-1.rds.amazonaws.com"
+    port: 5432
+    user: bifrost
+    database: bifrost
+    sslMode: require
+    existingSecret: "external-postgres-credentials"
+    passwordKey: "password"
+
+bifrost:
+  encryptionKey: "your-32-byte-encryption-key-here"
+```
+
+```bash
+helm install bifrost bifrost/bifrost -f external-postgres-values.yaml
+```
+
+**Test connectivity before installing:**
+
+```bash
+kubectl run pg-test --image=postgres:16-alpine --rm -it --restart=Never -- \
+  psql "host=your-rds-endpoint.us-east-1.rds.amazonaws.com dbname=bifrost user=bifrost sslmode=require" \
+  -c "SELECT version();"
+```
+
+</Tab>
+
+<Tab title="Mixed (Config=Postgres, Logs=SQLite)">
+
+### Mixed Backend
+
+Run the config store on PostgreSQL (fast lookups, shared across replicas) while keeping logs on SQLite (simpler, cheaper for append-heavy workloads).
+
+```yaml
+# mixed-values.yaml
+image:
+  tag: "v1.4.11"
+
+storage:
+  mode: sqlite           # default fallback
+  configStore:
+    type: postgres       # override: config uses postgres
+  logsStore:
+    type: sqlite         # explicit: logs use sqlite
+  persistence:
+    enabled: true
+    size: 20Gi           # for the SQLite logs store
+
+postgresql:
+  external:
+    enabled: true
+    host: "your-postgres-host.example.com"
+    port: 5432
+    user: bifrost
+    database: bifrost
+    sslMode: require
+    existingSecret: "postgres-credentials"
+    passwordKey: "password"
+
+bifrost:
+  encryptionKey: "your-32-byte-encryption-key-here"
+```
+
+```bash
+kubectl create secret generic postgres-credentials \
+  --from-literal=password='your-postgres-password'
+
+helm install bifrost bifrost/bifrost -f mixed-values.yaml
+```
+
+<Note>
+In mixed mode, Bifrost deploys a StatefulSet (because SQLite is in use) with both a PostgreSQL connection and a local PVC for the SQLite log store.
+</Note>
+
+**PostgreSQL connection pool tuning** (for high log volume; shown here with both stores on PostgreSQL):
+
+```yaml
+storage:
+  configStore:
+    type: postgres
+    maxIdleConns: 5
+    maxOpenConns: 50
+  logsStore:
+    type: postgres
+    maxIdleConns: 10
+    maxOpenConns: 100
+```
+
+</Tab>
+
+</Tabs>
+
+---
+
+## Object Storage for Logs
+
+Offload large request/response payloads from the database to S3 or GCS. The DB retains only lightweight index records; payloads are fetched on demand.
+
+<Tabs>
+<Tab title="AWS S3">
+
+```bash
+kubectl create secret generic s3-credentials \
+  --from-literal=access-key-id='AKIAIOSFODNN7EXAMPLE' \
+  --from-literal=secret-access-key='wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY'
+```
+
+```yaml
+storage:
+  logsStore:
+    objectStorage:
+      enabled: true
+      type: s3
+      bucket: "bifrost-logs"
+      prefix: "bifrost"
+      compress: true           # gzip compression
+
+      # S3 configuration
+      region: us-east-1
+      accessKeyId: "env.S3_ACCESS_KEY_ID"
+      secretAccessKey: "env.S3_SECRET_ACCESS_KEY"
+      # endpoint: ""           # Custom endpoint for MinIO / Cloudflare R2
+      # forcePathStyle: false  # Set true for MinIO
+
+bifrost:
+  # inject S3 credentials as env vars
+  providerSecrets:
+    s3-access-key:
+      existingSecret: "s3-credentials"
+      key: "access-key-id"
+      envVar: "S3_ACCESS_KEY_ID"
+    s3-secret-key:
+      existingSecret: "s3-credentials"
+      key: "secret-access-key"
+      envVar: "S3_SECRET_ACCESS_KEY"
+```
+
+**Using IAM role (IRSA / instance profile) instead of static keys:**
+
+```yaml
+storage:
+  logsStore:
+    objectStorage:
+      enabled: true
+      type: s3
+      bucket: "bifrost-logs"
+      region: us-east-1
+      # No accessKeyId / secretAccessKey — uses SDK default chain
+      roleArn: "arn:aws:iam::123456789012:role/BifrostS3Role"
+```
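+
+For IRSA on EKS, the role is normally bound to the pod through an annotation on its service account. A minimal sketch, assuming the chart's `serviceAccount.annotations` value is the place to set it and reusing the example role ARN from above; whether `roleArn` must also remain set under `objectStorage` is not covered here:
+
+```yaml
+# Assumption: IRSA is wired through the chart-managed service account.
+serviceAccount:
+  create: true
+  annotations:
+    # Standard EKS annotation that links a service account to an IAM role
+    eks.amazonaws.com/role-arn: "arn:aws:iam::123456789012:role/BifrostS3Role"
+```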
+
+</Tab>
+<Tab title="Google Cloud Storage">
+
+```bash
+kubectl create secret generic gcs-credentials \
+  --from-literal=service-account-json="$(cat service-account-key.json)"
+```
+
+```yaml
+storage:
+  logsStore:
+    objectStorage:
+      enabled: true
+      type: gcs
+      bucket: "bifrost-logs"
+      prefix: "bifrost"
+      compress: true
+
+      # GCS configuration
+      projectId: "my-gcp-project"
+      credentialsJson: "env.GCS_CREDENTIALS_JSON"   # omit for Workload Identity
+
+bifrost:
+  providerSecrets:
+    gcs-creds:
+      existingSecret: "gcs-credentials"
+      key: "service-account-json"
+      envVar: "GCS_CREDENTIALS_JSON"
+```
+
+</Tab>
+<Tab title="MinIO (Self-Hosted)">
+
+```yaml
+storage:
+  logsStore:
+    objectStorage:
+      enabled: true
+      type: s3
+      bucket: "bifrost-logs"
+      prefix: "bifrost"
+      compress: false
+
+      region: us-east-1          # can be any value for MinIO
+      endpoint: "http://minio.minio-ns.svc.cluster.local:9000"
+      accessKeyId: "env.MINIO_ACCESS_KEY"
+      secretAccessKey: "env.MINIO_SECRET_KEY"
+      forcePathStyle: true        # required for MinIO
+```
+
+</Tab>
+</Tabs>
+
+```bash
+helm upgrade bifrost bifrost/bifrost \
+  --reuse-values \
+  -f object-storage-values.yaml
+```
+
+---
+
+## Vector Store
+
+A vector store is required for [semantic caching](/deployment-guides/helm/plugins). Choose from Weaviate, Redis, or Qdrant (embedded or external), or Pinecone (external only).
+
+<Tabs>
+<Tab title="Weaviate">
+
+```yaml
+vectorStore:
+  enabled: true
+  type: weaviate
+  weaviate:
+    enabled: true          # deploy embedded Weaviate
+    replicas: 1
+    persistence:
+      enabled: true
+      size: 20Gi
+    resources:
+      requests:
+        cpu: 500m
+        memory: 1Gi
+      limits:
+        cpu: 2000m
+        memory: 4Gi
+```
+
+**External Weaviate:**
+
+```yaml
+vectorStore:
+  enabled: true
+  type: weaviate
+  weaviate:
+    enabled: false
+    external:
+      enabled: true
+      scheme: https
+      host: "weaviate.example.com"
+      apiKey: "env.WEAVIATE_API_KEY"
+      grpcHost: "weaviate-grpc.example.com"
+      grpcSecured: true
+      existingSecret: "weaviate-credentials"
+      apiKeyKey: "api-key"
+```
+
+</Tab>
+<Tab title="Redis / Valkey">
+
+```yaml
+vectorStore:
+  enabled: true
+  type: redis
+  redis:
+    enabled: true          # deploy embedded Redis
+    auth:
+      enabled: true
+      password: "redis_password"
+    master:
+      persistence:
+        size: 8Gi
+```
+
+**External Redis / AWS MemoryDB:**
+
+```bash
+kubectl create secret generic redis-credentials \
+  --from-literal=password='your-redis-password'
+```
+
+```yaml
+vectorStore:
+  enabled: true
+  type: redis
+  redis:
+    enabled: false
+    external:
+      enabled: true
+      host: "your-redis.cache.amazonaws.com"
+      port: 6379
+      useTls: true
+      clusterMode: true          # required for AWS MemoryDB
+      existingSecret: "redis-credentials"
+      passwordKey: "password"
+```
+
+</Tab>
+<Tab title="Qdrant">
+
+```yaml
+vectorStore:
+  enabled: true
+  type: qdrant
+  qdrant:
+    enabled: true          # deploy embedded Qdrant
+    persistence:
+      size: 10Gi
+```
+
+**External Qdrant:**
+
+```bash
+kubectl create secret generic qdrant-credentials \
+  --from-literal=api-key='your-qdrant-api-key'
+```
+
+```yaml
+vectorStore:
+  enabled: true
+  type: qdrant
+  qdrant:
+    enabled: false
+    external:
+      enabled: true
+      host: "qdrant.example.com"
+      port: 6334
+      useTls: true
+      existingSecret: "qdrant-credentials"
+      apiKeyKey: "api-key"
+```
+
+</Tab>
+<Tab title="Pinecone">
+
+Pinecone is external-only.
+
+```bash
+kubectl create secret generic pinecone-credentials \
+  --from-literal=api-key='your-pinecone-api-key'
+```
+
+```yaml
+vectorStore:
+  enabled: true
+  type: pinecone
+  pinecone:
+    external:
+      enabled: true
+      indexHost: "your-index.svc.us-east1-gcp.pinecone.io"
+      existingSecret: "pinecone-credentials"
+      apiKeyKey: "api-key"
+```
+
+</Tab>
+</Tabs>
+
+```bash
+helm install bifrost bifrost/bifrost \
+  --set image.tag=v1.4.11 \
+  -f storage-values.yaml
+```
--- a/docs/deployment-guides/helm/troubleshooting.mdx
+++ b/docs/deployment-guides/helm/troubleshooting.mdx
@@ -0,0 +1,401 @@
+---
+title: "Troubleshooting"
+description: "Diagnose and fix common issues with Bifrost Helm deployments — pods, database, ingress, secrets, PVCs, and performance"
+icon: "wrench"
+---
+
+This page covers the most common problems encountered when deploying Bifrost with Helm, along with diagnostic commands and fixes.
+
+---
+
+## Pod Not Starting
+
+### Quick diagnostics
+
+```bash
+# Show pod status
+kubectl get pods -l app.kubernetes.io/name=bifrost
+
+# Show pod events (most useful first step)
+kubectl describe pod -l app.kubernetes.io/name=bifrost
+
+# Show pod logs (use --previous if the pod has already crashed)
+kubectl logs -l app.kubernetes.io/name=bifrost
+kubectl logs -l app.kubernetes.io/name=bifrost --previous
+```
+
+### Image pull errors (`ErrImagePull` / `ImagePullBackOff`)
+
+```bash
+# Check which image is being pulled
+kubectl describe pod -l app.kubernetes.io/name=bifrost | grep "Image:"
+
+# Verify imagePullSecrets are attached
+kubectl get pod -l app.kubernetes.io/name=bifrost -o jsonpath='{.items[0].spec.imagePullSecrets}'
+
+# Test secret manually
+kubectl get secret <pull-secret-name> -o jsonpath='{.data.\.dockerconfigjson}' | base64 -d | jq .
+```
+
+Common causes:
+- `image.tag` not set — the chart requires it; the pod will not start without it
+- Pull secret missing or expired (ECR tokens expire after 12 hours)
+- Incorrect `image.repository` for enterprise registry
+
+```bash
+# Fix: set the correct tag
+helm upgrade bifrost bifrost/bifrost --reuse-values --set image.tag=v1.4.11
+```
+
+### PVC not binding (`Pending`)
+
+```bash
+# Check PVC status
+kubectl get pvc -l app.kubernetes.io/instance=bifrost
+
+# Show binding events
+kubectl describe pvc -l app.kubernetes.io/instance=bifrost
+```
+
+Common causes:
+- No Persistent Volume provisioner in the cluster
+- `storageClass` set to a class that doesn't exist
+- `ReadWriteOnce` access mode with multiple replicas (SQLite PVCs are single-node)
+
+```bash
+# List available storage classes
+kubectl get storageclass
+
+# Fix: pin to a valid storage class
+helm upgrade bifrost bifrost/bifrost \
+  --reuse-values \
+  --set storage.persistence.storageClass=standard
+```
+
+### ConfigMap / Secret errors
+
+```bash
+# View the generated ConfigMap (contains rendered config.json)
+kubectl get configmap bifrost-config -o yaml
+
+# View secrets the pod depends on
+kubectl get secret -l app.kubernetes.io/instance=bifrost
+
+# Decode a specific secret value (replace "key" with the key name used in your secret)
+kubectl get secret bifrost-encryption -o jsonpath='{.data.key}' | base64 -d
+```
+
+### CrashLoopBackOff
+
+```bash
+# Get last log lines before the crash
+kubectl logs -l app.kubernetes.io/name=bifrost --previous --tail=50
+
+# Common causes shown in logs:
+# "encryption key is not initialized" → no key provided; optional, but data will be stored in plaintext
+# "failed to connect to database" → see Database section below
+# "image.tag is required" → set image.tag in values
+```
+
+---
+
+## Database Connection Issues
+
+### Embedded PostgreSQL
+
+```bash
+# Check if the PostgreSQL pod is running
+kubectl get pods -l app.kubernetes.io/name=bifrost-postgresql
+
+# Connect directly to inspect the database
+kubectl exec -it deployment/bifrost-postgresql -- psql -U bifrost -d bifrost
+
+# Test connectivity from the Bifrost pod
+kubectl exec -it deployment/bifrost -- nc -zv bifrost-postgresql 5432
+
+# Check PostgreSQL logs
+kubectl logs deployment/bifrost-postgresql --tail=50
+```
+
+### External PostgreSQL
+
+```bash
+# Test connectivity from within the cluster
+kubectl run pg-test --image=postgres:16-alpine --rm -it --restart=Never -- \
+  psql "host=your-db-host dbname=bifrost user=bifrost sslmode=require"
+
+# Verify the secret value is correct
+kubectl get secret postgres-credentials -o jsonpath='{.data.password}' | base64 -d
+
+# Check that the external host/port is reachable
+kubectl exec -it deployment/bifrost -- nc -zv your-db-host 5432
+```
+
+Common causes:
+- `sslMode: disable` when the database requires SSL — set `sslMode: require`
+- Password in secret doesn't match the database user
+- Network policy blocking pod → database traffic
+- Database not UTF8 encoded (see [PostgreSQL UTF8 Requirement](/quickstart/gateway/setting-up#postgresql-utf8-requirement))
+
+```bash
+# Fix: update the secret and restart
+kubectl create secret generic postgres-credentials \
+  --from-literal=password='correct-password' \
+  --dry-run=client -o yaml | kubectl apply -f -
+
+kubectl rollout restart deployment/bifrost
+```
+
+---
+
+## Ingress Not Working
+
+```bash
+# Check ingress resource status
+kubectl describe ingress bifrost
+
+# Check if the ingress controller is running
+kubectl get pods -n ingress-nginx -l app.kubernetes.io/name=ingress-nginx
+
+# View ingress controller logs for routing errors
+kubectl logs -n ingress-nginx -l app.kubernetes.io/name=ingress-nginx --tail=50
+
+# Verify DNS resolves to the correct load balancer IP
+nslookup bifrost.yourdomain.com
+kubectl get ingress bifrost -o jsonpath='{.status.loadBalancer.ingress[0].ip}'
+
+# Test without TLS first
+curl -v http://bifrost.yourdomain.com/health
+```
+
+Common causes:
+- `ingress.className` not set or set to a class not installed in the cluster
+- TLS certificate not issued yet (cert-manager issuance is not instant and can take a minute or longer, especially with DNS-01 challenges)
+- Service port mismatch — Bifrost listens on `8080` by default
+
+```bash
+# Check cert-manager certificate status
+kubectl get certificate -l app.kubernetes.io/instance=bifrost
+kubectl describe certificate bifrost-tls
+```
+
+---
+
+## Secret and Credential Issues
+
+### Provider API key not resolving
+
+If Bifrost logs show `env.OPENAI_API_KEY: not set` or similar:
+
+```bash
+# Check the env var is present in the running pod
+kubectl exec -it deployment/bifrost -- env | grep OPENAI
+
+# Verify the secret referenced in providerSecrets exists and contains the expected key
+kubectl get secret <provider-secret-name> -o yaml
+
+# Check the providerSecrets configuration rendered correctly
+kubectl get configmap bifrost-config -o yaml | grep -A5 providers
+```
+
+### Encryption key issues
+
+```bash
+# Verify the secret exists and contains the right key name
+kubectl get secret bifrost-encryption -o yaml
+
+# Check the exact key name matches encryptionKeySecret.key in values
+# Default key name is "encryption-key" — if you used "key", set:
+#   bifrost.encryptionKeySecret.key: "key"
+```
+
+---
+
+## High Memory Usage
+
+```bash
+# Check current resource usage
+kubectl top pods -l app.kubernetes.io/name=bifrost
+
+# Check if OOM kills are happening
+kubectl describe pod -l app.kubernetes.io/name=bifrost | grep -A3 "OOMKilled\|Limits"
+
+# View resource requests/limits on running pods
+kubectl get pod -l app.kubernetes.io/name=bifrost \
+  -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.spec.containers[0].resources}{"\n"}{end}'
+```
+
+**Increase resource limits:**
+
+```bash
+helm upgrade bifrost bifrost/bifrost \
+  --reuse-values \
+  --set resources.limits.memory=4Gi \
+  --set resources.requests.memory=1Gi
+```
+
+**Tune Go runtime** (see [Docker Tuning](/deployment-guides/docker-tuning)):
+
+```yaml
+env:
+  - name: GOGC
+    value: "200"          # run GC less often
+  - name: GOMEMLIMIT
+    value: "3500MiB"      # soft memory limit for the Go runtime, set slightly below the container limit
+```
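+
+As a rough guide, the `3500MiB` value above is about 85% of the `4Gi` (4096 MiB) limit set in the previous command. A sketch that applies the same ratio to a `2Gi` limit; the 85% figure is an assumption inferred from this example, not a documented recommendation:
+
+```yaml
+# Assumption: keep GOMEMLIMIT at roughly 85% of resources.limits.memory.
+resources:
+  limits:
+    memory: 2Gi             # 2048 MiB
+
+env:
+  - name: GOMEMLIMIT
+    value: "1750MiB"        # roughly 85% of 2048 MiB
+```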
+
+---
+
+## High CPU Usage / Latency
+
+```bash
+# Check CPU usage
+kubectl top pods -l app.kubernetes.io/name=bifrost
+
+# Check if HPA is scaling correctly
+kubectl get hpa bifrost
+kubectl describe hpa bifrost
+```
+
+Common causes:
+- `initialPoolSize` too small — goroutines queuing up; increase to `500`–`1000`
+- `dropExcessRequests: false` with a small pool — queue depth growing unboundedly
+
+```bash
+helm upgrade bifrost bifrost/bifrost \
+  --reuse-values \
+  --set bifrost.client.initialPoolSize=1000 \
+  --set bifrost.client.dropExcessRequests=true
+```
+
+---
+
+## Autoscaling Issues
+
+### HPA not scaling
+
+```bash
+# Check HPA status and current metrics
+kubectl describe hpa bifrost
+
+# Verify metrics server is installed
+kubectl top nodes
+kubectl top pods
+
+# Common fix: metrics server not installed
+# Install with:
+kubectl apply -f https://github.com/kubernetes-sigs/metrics-server/releases/latest/download/components.yaml
+```
+
+### Pods scaling down too aggressively (drops active SSE streams)
+
+The default `scaleDown.stabilizationWindowSeconds: 300` and `preStop` sleep of 15 seconds should prevent this. If streams are still being cut:
+
+```yaml
+terminationGracePeriodSeconds: 120   # increase if streams run longer than 90s (120s minus the 30s preStop sleep below)
+
+autoscaling:
+  behavior:
+    scaleDown:
+      stabilizationWindowSeconds: 600  # wait 10 min before scaling down
+      policies:
+        - type: Pods
+          value: 1
+          periodSeconds: 300           # remove at most 1 pod per 5 min
+
+lifecycle:
+  preStop:
+    exec:
+      command: ["sh", "-c", "sleep 30"]  # give load balancer more time to drain
+```
+
+```bash
+helm upgrade bifrost bifrost/bifrost --reuse-values -f graceful-shutdown-values.yaml
+```
+
+---
+
+## SQLite / PVC Issues
+
+### StatefulSet migration (upgrading from chart < v2.0.0)
+
+Older chart versions used a Deployment + manual PVC. v2.0.0 moved SQLite to a StatefulSet. If upgrading:
+
+```bash
+# 1. Scale down the old deployment
+kubectl scale deployment bifrost --replicas=0
+
+# 2. Note the existing PVC name
+kubectl get pvc
+
+# 3. Upgrade, pointing at the existing claim
+helm upgrade bifrost bifrost/bifrost \
+  --reuse-values \
+  --set storage.persistence.existingClaim=<your-old-pvc-name> \
+  --set image.tag=v1.4.11
+```
+
+### Data lost after upgrade
+
+```bash
+# Check if PVCs still exist (they persist after helm uninstall)
+kubectl get pvc -l app.kubernetes.io/instance=bifrost
+
+# Re-attach by setting existingClaim
+helm upgrade bifrost bifrost/bifrost \
+  --reuse-values \
+  --set storage.persistence.existingClaim=<pvc-name>
+```
+
+---
+
+## Cluster Mode Issues
+
+### Peers not discovering each other
+
+```bash
+# Check gossip port is reachable between pods
+kubectl exec -it bifrost-0 -- nc -zv bifrost-1.bifrost-headless 7946
+
+# View gossip-related log lines
+kubectl logs -l app.kubernetes.io/name=bifrost --tail=100 | grep -i gossip
+
+# Check the headless service exists
+kubectl get svc bifrost-headless
+```
+
+For Kubernetes-based discovery, verify the service account has pod list permissions:
+
+```bash
+kubectl auth can-i list pods --as=system:serviceaccount:default:bifrost
+```
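+
+If the command prints `no`, the service account lacks the permission. A minimal sketch of a Role and RoleBinding that would grant it, assuming the `default` namespace and the `bifrost` service account from the command above. The `bifrost-pod-reader` name is hypothetical, and the chart may already create equivalent RBAC, so check for existing roles first:
+
+```yaml
+apiVersion: rbac.authorization.k8s.io/v1
+kind: Role
+metadata:
+  name: bifrost-pod-reader        # hypothetical name
+  namespace: default
+rules:
+  - apiGroups: [""]               # core API group, where pods live
+    resources: ["pods"]
+    verbs: ["get", "list", "watch"]
+---
+apiVersion: rbac.authorization.k8s.io/v1
+kind: RoleBinding
+metadata:
+  name: bifrost-pod-reader        # hypothetical name
+  namespace: default
+subjects:
+  - kind: ServiceAccount
+    name: bifrost
+    namespace: default
+roleRef:
+  apiGroup: rbac.authorization.k8s.io
+  kind: Role
+  name: bifrost-pod-reader
+```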
+
+---
+
+## Useful Diagnostic Commands
+
+```bash
+# Full state dump for a support ticket
+kubectl get all -l app.kubernetes.io/instance=bifrost
+kubectl describe pod -l app.kubernetes.io/name=bifrost > pod-describe.txt
+kubectl logs -l app.kubernetes.io/name=bifrost --tail=200 > pod-logs.txt
+
+# View the full rendered config.json
+kubectl get configmap bifrost-config -o jsonpath='{.data.config\.json}' | jq .
+
+# Check current Helm values (shows all overrides)
+helm get values bifrost
+
+# Check Helm release status
+helm status bifrost
+
+# View Helm release history
+helm history bifrost
+```
+
+---
+
+## Still Stuck?
+
+- [GitHub Issues](https://github.com/maximhq/bifrost/issues) — search existing issues or open a new one
+- [Enterprise Support](mailto:support@getmaxim.ai) — for enterprise customers with SLA
--- a/docs/deployment-guides/helm/values.mdx
+++ b/docs/deployment-guides/helm/values.mdx
@@ -0,0 +1,718 @@
+---
+title: "Values Reference"
+description: "Complete reference for Bifrost Helm chart values — key parameters, how to supply them, and links to example files"
+icon: "sliders"
+---
+
+This page covers every top-level parameter group in the Bifrost Helm chart's `values.yaml`, how to supply values via `--set` vs `-f`, and where to find ready-made example files.
+
+<Note>
+The full values schema is available at [https://getbifrost.ai/schema](https://getbifrost.ai/schema). All `values.yaml` fields map directly to `config.json` fields generated by the chart.
+</Note>
+
+## Supplying Values
+
+### One-liner with `--set`
+
+Good for a single field or quick experiments:
+
+```bash
+helm install bifrost bifrost/bifrost \
+  --set image.tag=v1.4.11 \
+  --set replicaCount=3 \
+  --set bifrost.client.initialPoolSize=500
+```
+
+### Values file with `-f`
+
+Recommended for anything beyond a couple of fields:
+
+```bash
+# Create your values file
+cat > my-values.yaml <<'EOF'
+image:
+  tag: "v1.4.11"
+
+replicaCount: 2
+
+bifrost:
+  encryptionKey: "your-32-byte-encryption-key-here"
+  client:
+    initialPoolSize: 500
+    enableLogging: true
+EOF
+
+# Install
+helm install bifrost bifrost/bifrost -f my-values.yaml
+
+# Upgrade later
+helm upgrade bifrost bifrost/bifrost -f my-values.yaml
+
+# Upgrade and reuse all previously set values, overriding only one field
+helm upgrade bifrost bifrost/bifrost \
+  --reuse-values \
+  --set replicaCount=5
+```
+
+### Multiple values files
+
+Later files override earlier ones — useful for a base + environment-specific overlay:
+
+```bash
+helm install bifrost bifrost/bifrost \
+  -f base-values.yaml \
+  -f production-overrides.yaml
+```
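+
+A sketch of what such a pair might contain, using only values documented on this page. The file contents are illustrative assumptions, not files shipped with the chart:
+
+```yaml
+# base-values.yaml: shared by every environment
+image:
+  tag: "v1.4.11"
+replicaCount: 1
+resources:
+  limits:
+    memory: 2Gi
+```
+
+```yaml
+# production-overrides.yaml: only the fields that differ in production
+replicaCount: 3
+resources:
+  limits:
+    memory: 4Gi
+```
+
+Helm merges maps key by key, so with both files applied the release gets `replicaCount: 3` and a `4Gi` memory limit, while `image.tag` still comes from the base file.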
+
+---
+
+## Key Parameters Reference
+
+### Image
+
+| Parameter | Description | Default |
+|-----------|-------------|---------|
+| `image.repository` | Container image repository | `docker.io/maximhq/bifrost` |
+| `image.tag` | **Required.** Image version (e.g. `v1.4.11`) | `""` |
+| `image.pullPolicy` | Image pull policy | `IfNotPresent` |
+| `imagePullSecrets` | List of pull secret names for private registries | `[]` |
+
+```bash
+# Always specify the tag; the pod will not start without it
+helm install bifrost bifrost/bifrost --set image.tag=v1.4.11
+```
+
+### Replicas & Autoscaling
+
+| Parameter | Description | Default |
+|-----------|-------------|---------|
+| `replicaCount` | Static replica count (ignored when HPA is enabled) | `1` |
+| `autoscaling.enabled` | Enable Horizontal Pod Autoscaler | `false` |
+| `autoscaling.minReplicas` | Minimum replicas | `1` |
+| `autoscaling.maxReplicas` | Maximum replicas | `10` |
+| `autoscaling.targetCPUUtilizationPercentage` | CPU target for scaling | `80` |
+| `autoscaling.targetMemoryUtilizationPercentage` | Memory target for scaling | `80` |
+| `autoscaling.behavior.scaleDown.stabilizationWindowSeconds` | Cooldown before scale-down (important for SSE streams) | `300` |
+| `autoscaling.behavior.scaleDown.policies[0].value` | Max pods removed per period | `1` |
+
+### Resources
+
+| Parameter | Description | Default |
+|-----------|-------------|---------|
+| `resources.requests.cpu` | CPU request | `500m` |
+| `resources.requests.memory` | Memory request | `512Mi` |
+| `resources.limits.cpu` | CPU limit | `2000m` |
+| `resources.limits.memory` | Memory limit | `2Gi` |
+
+### Service
+
+| Parameter | Description | Default |
+|-----------|-------------|---------|
+| `service.type` | `ClusterIP`, `LoadBalancer`, or `NodePort` | `ClusterIP` |
+| `service.port` | Service port | `8080` |
+
+### Ingress
+
+| Parameter | Description | Default |
+|-----------|-------------|---------|
+| `ingress.enabled` | Enable ingress | `false` |
+| `ingress.className` | Ingress class (e.g. `nginx`, `traefik`) | `""` |
+| `ingress.annotations` | Ingress annotations | `{}` |
+| `ingress.hosts` | Host rules | see values.yaml |
+| `ingress.tls` | TLS configuration | `[]` |
+
+```yaml
+ingress:
+  enabled: true
+  className: nginx
+  annotations:
+    cert-manager.io/cluster-issuer: letsencrypt-prod
+    nginx.ingress.kubernetes.io/proxy-body-size: "100m"
+  hosts:
+    - host: bifrost.yourdomain.com
+      paths:
+        - path: /
+          pathType: Prefix
+  tls:
+    - secretName: bifrost-tls
+      hosts:
+        - bifrost.yourdomain.com
+```
+
+### Probes
+
+| Parameter | Description | Default |
+|-----------|-------------|---------|
+| `livenessProbe.initialDelaySeconds` | Seconds before first liveness check | `30` |
+| `livenessProbe.periodSeconds` | Liveness check interval | `30` |
+| `readinessProbe.initialDelaySeconds` | Seconds before first readiness check | `10` |
+| `readinessProbe.periodSeconds` | Readiness check interval | `10` |
+
+Both probes hit `GET /health`.
+
+### Graceful Shutdown
+
+Bifrost supports long-lived SSE streaming connections. The default `preStop` hook and termination grace period let in-flight streams finish before the pod is killed:
+
+| Parameter | Description | Default |
+|-----------|-------------|---------|
+| `terminationGracePeriodSeconds` | Total grace period | `60` |
+| `lifecycle.preStop.exec.command` | Sleep before SIGTERM so the load balancer can drain connections | `["sh", "-c", "sleep 15"]` |
+
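+With the defaults, the 15-second `preStop` sleep counts against the 60-second grace period, which leaves about 45 seconds for in-flight streams. Increase `terminationGracePeriodSeconds` if your typical stream responses take longer than 45 seconds.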
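+
+A worked sketch of that arithmetic, assuming your longest typical stream runs for about 120 seconds (an illustrative figure): add the `preStop` sleep to the stream duration and use the sum as the grace period.
+
+```yaml
+# Sketch: let streams of up to ~120 s finish before the pod is killed
+terminationGracePeriodSeconds: 135   # 15 s preStop sleep + 120 s for in-flight streams
+lifecycle:
+  preStop:
+    exec:
+      command: ["sh", "-c", "sleep 15"]   # chart default, repeated for clarity
+```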
+
+### Service Account
+
+| Parameter | Description | Default |
+|-----------|-------------|---------|
+| `serviceAccount.create` | Create a dedicated service account | `true` |
+| `serviceAccount.annotations` | Annotations (e.g. for IRSA, Workload Identity) | `{}` |
+| `serviceAccount.name` | Override the generated name | `""` |
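+
+For example, the annotations field is where a cloud IAM binding would typically go. This is a sketch: the account ID, role name, and project name are placeholders, and the IAM role itself must already exist:
+
+```yaml
+serviceAccount:
+  create: true
+  annotations:
+    # EKS IRSA; the account ID and role name are placeholders
+    eks.amazonaws.com/role-arn: "arn:aws:iam::123456789012:role/bifrost-gateway"
+    # GKE Workload Identity equivalent (use instead of the line above):
+    # iam.gke.io/gcp-service-account: "bifrost@my-project.iam.gserviceaccount.com"
+```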
+
+### Pod Scheduling
+
+```yaml
+# Spread replicas across nodes
+affinity:
+  podAntiAffinity:
+    requiredDuringSchedulingIgnoredDuringExecution:
+      - labelSelector:
+          matchLabels:
+            app.kubernetes.io/name: bifrost
+        topologyKey: kubernetes.io/hostname
+
+# Pin to specific node pool
+nodeSelector:
+  node-type: ai-workload
+
+# Tolerate GPU taints
+tolerations:
+  - key: "gpu"
+    operator: "Equal"
+    value: "true"
+    effect: "NoSchedule"
+```
+
+### Extra Environment Variables
+
+Three ways to inject env vars:
+
+```yaml
+# Inline key/value pairs
+env:
+  - name: HTTP_PROXY
+    value: "http://proxy.corp.example.com:3128"
+
+# Map syntax (appended after env)
+extraEnv:
+  NO_PROXY: "169.254.169.254,10.0.0.0/8"
+
+# Bulk-load from existing Secrets or ConfigMaps
+envFrom:
+  - secretRef:
+      name: my-corp-secrets
+  - configMapRef:
+      name: my-app-config
+```
+
+### Init Containers
+
+```yaml
+initContainers:
+  - name: wait-for-db
+    image: busybox:1.35
+    command: ["sh", "-c", "until nc -z postgres-svc 5432; do sleep 2; done"]
+```
+
+---
+
+## Values Examples
+
+The chart ships ready-made example files under [`helm-charts/bifrost/values-examples/`](https://github.com/maximhq/bifrost/tree/main/helm-charts/bifrost/values-examples):
+
+| File | Use case |
+|------|----------|
+| `sqlite-only.yaml` | Minimal local/dev setup |
+| `postgres-only.yaml` | Single-store Postgres |
+| `production-ha.yaml` | HA: 3 replicas, Postgres, Weaviate, HPA, Ingress |
+| `providers-and-virtual-keys.yaml` | All 23 providers + 7 virtual key patterns |
+| `secrets-from-k8s.yaml` | All sensitive values from Kubernetes Secrets |
+| `external-postgres.yaml` | Point at an existing Postgres instance |
+| `postgres-redis.yaml` | Postgres + Redis vector store |
+| `postgres-weaviate.yaml` | Postgres + Weaviate vector store |
+| `postgres-qdrant.yaml` | Postgres + Qdrant vector store |
+| `semantic-cache-secret-example.yaml` | Semantic cache with secret injection |
+| `mixed-backend.yaml` | Config store = postgres, logs store = sqlite |
+
+Install from an example file directly:
+
+```bash
+helm install bifrost bifrost/bifrost \
+  -f https://raw.githubusercontent.com/maximhq/bifrost/main/helm-charts/bifrost/values-examples/production-ha.yaml \
+  --set image.tag=v1.4.11
+```
+
+---
+
+## Helm Operations
+
+### View current values
+
+```bash
+helm get values bifrost
+```
+
+### Diff before upgrading (requires helm-diff plugin)
+
+```bash
+helm diff upgrade bifrost bifrost/bifrost -f my-values.yaml
+```
+
+### Rollback
+
+```bash
+helm history bifrost
+helm rollback bifrost       # to previous revision
+helm rollback bifrost 2     # to revision 2
+```
+
+### Uninstall
+
+```bash
+helm uninstall bifrost
+
+# Also remove PVCs (deletes all data)
+kubectl delete pvc -l app.kubernetes.io/instance=bifrost
+```
+
+---
+
+## All Key Parameters
+
+A quick-reference table of the most commonly used top-level parameters:
+
+| Parameter | Description | Default |
+|-----------|-------------|---------|
+| `image.tag` | **Required.** Bifrost image version (e.g., `v1.4.11`) | `""` |
+| `replicaCount` | Number of replicas | `1` |
+| `storage.mode` | Storage backend (`sqlite` or `postgres`) | `sqlite` |
+| `storage.persistence.size` | PVC size for SQLite | `10Gi` |
+| `postgresql.enabled` | Deploy embedded PostgreSQL | `false` |
+| `vectorStore.enabled` | Enable vector store | `false` |
+| `vectorStore.type` | Vector store type (`weaviate`, `redis`, `qdrant`) | `none` |
+| `bifrost.encryptionKey` | Optional encryption key (use `encryptionKeySecret` in production). If omitted, data is stored in plaintext. | `""` |
+| `ingress.enabled` | Enable ingress | `false` |
+| `autoscaling.enabled` | Enable HPA | `false` |
+
+### Secret Reference Parameters
+
+Use existing Kubernetes Secrets instead of plain-text values. Every sensitive field in the chart has a corresponding `existingSecret` / `secretRef` alternative:
+
+| Parameter | Description | Default |
+|-----------|-------------|---------|
+| `bifrost.encryptionKeySecret.name` | Secret name for encryption key | `""` |
+| `bifrost.encryptionKeySecret.key` | Key within the secret | `"encryption-key"` |
+| `postgresql.external.existingSecret` | Secret name for PostgreSQL password | `""` |
+| `postgresql.external.passwordKey` | Key within the secret | `"password"` |
+| `vectorStore.redis.external.existingSecret` | Secret name for Redis password | `""` |
+| `vectorStore.redis.external.passwordKey` | Key within the secret | `"password"` |
+| `vectorStore.weaviate.external.existingSecret` | Secret name for Weaviate API key | `""` |
+| `vectorStore.weaviate.external.apiKeyKey` | Key within the secret | `"api-key"` |
+| `vectorStore.qdrant.external.existingSecret` | Secret name for Qdrant API key | `""` |
+| `vectorStore.qdrant.external.apiKeyKey` | Key within the secret | `"api-key"` |
+| `bifrost.plugins.maxim.secretRef.name` | Secret name for Maxim API key | `""` |
+| `bifrost.plugins.maxim.secretRef.key` | Key within the secret | `"api-key"` |
+| `bifrost.providerSecrets.<provider>.existingSecret` | Secret name for provider API key | `""` |
+| `bifrost.providerSecrets.<provider>.key` | Key within the secret | `"api-key"` |
+| `bifrost.providerSecrets.<provider>.envVar` | Environment variable name to inject | `""` |
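+
+As an illustration of the table above, an external Redis vector store could read its password from a Secret. The Secret name `redis-credentials` and the hostname are hypothetical, and the `enabled` and `host` fields are assumed to mirror the Qdrant external block shown elsewhere in this reference; check `values.yaml` before relying on them:
+
+```yaml
+# Sketch: Redis password taken from an existing Kubernetes Secret
+vectorStore:
+  enabled: true
+  type: redis
+  redis:
+    external:
+      enabled: true                 # assumed, by analogy with qdrant.external.enabled
+      host: "redis.example.com"     # placeholder; field name assumed
+      existingSecret: "redis-credentials"   # hypothetical Secret name
+      passwordKey: "password"
+```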
+
+---
+
+## Advanced Configuration
+
+### Comprehensive Example
+
+A production-ready values file combining the most common settings:
+
+```yaml
+# my-values.yaml
+image:
+  tag: "v1.4.11"
+
+replicaCount: 3
+
+storage:
+  mode: postgres
+
+postgresql:
+  enabled: true
+  auth:
+    password: "secure-password"   # use existingSecret in production
+
+autoscaling:
+  enabled: true
+  minReplicas: 3
+  maxReplicas: 10
+
+ingress:
+  enabled: true
+  className: nginx
+  hosts:
+    - host: bifrost.example.com
+      paths:
+        - path: /
+          pathType: Prefix
+
+bifrost:
+  encryptionKeySecret:
+    name: "bifrost-encryption"
+    key: "key"
+  providers:
+    openai:
+      keys:
+        - name: "primary"
+          value: "env.OPENAI_API_KEY"
+          weight: 1
+  providerSecrets:
+    openai:
+      existingSecret: "provider-api-keys"
+      key: "openai-api-key"
+      envVar: "OPENAI_API_KEY"
+```
+
+```bash
+helm install bifrost bifrost/bifrost -f my-values.yaml
+```
+
+### Node Affinity & Scheduling
+
+Deploy to specific nodes and spread replicas across hosts:
+
+```yaml
+nodeSelector:
+  node-type: ai-workload
+
+affinity:
+  podAntiAffinity:
+    requiredDuringSchedulingIgnoredDuringExecution:
+      - labelSelector:
+          matchLabels:
+            app.kubernetes.io/name: bifrost
+        topologyKey: kubernetes.io/hostname
+
+tolerations:
+  - key: "gpu"
+    operator: "Equal"
+    value: "true"
+    effect: "NoSchedule"
+```
+
+### Deployment & Pod Annotations
+
+Useful for tooling like [Keel](https://keel.sh) for automatic image updates or Datadog APM injection:
+
+```yaml
+deploymentAnnotations:
+  keel.sh/policy: force
+  keel.sh/trigger: poll
+
+podAnnotations:
+  ad.datadoghq.com/bifrost.logs: '[{"source":"bifrost","service":"bifrost"}]'
+```
+
+---
+
+## Common Patterns
+
+Ready-made values files for the most common deployment scenarios. Each pattern builds on the [quickstart](/deployment-guides/helm).
+
+<Tabs>
+<Tab title="Development">
+
+Simple setup for local testing. SQLite, single replica, no autoscaling.
+
+```bash
+helm install bifrost bifrost/bifrost \
+  --set image.tag=v1.4.11 \
+  --set 'bifrost.providers.openai.keys[0].name=dev-key' \
+  --set 'bifrost.providers.openai.keys[0].value=sk-your-key' \
+  --set 'bifrost.providers.openai.keys[0].weight=1'
+```
+
+```bash
+# Access
+kubectl port-forward svc/bifrost 8080:8080
+```
+
+</Tab>
+<Tab title="Multi-Provider">
+
+Multiple LLM providers with weighted load balancing.
+
+```bash
+kubectl create secret generic provider-keys \
+  --from-literal=openai-api-key='sk-...' \
+  --from-literal=anthropic-api-key='sk-ant-...' \
+  --from-literal=gemini-api-key='your-gemini-key'
+```
+
+```yaml
+# multi-provider.yaml
+image:
+  tag: "v1.4.11"
+
+bifrost:
+  encryptionKey: "your-encryption-key"
+
+  client:
+    enableLogging: true
+    allowDirectKeys: false
+
+  providers:
+    openai:
+      keys:
+        - name: "openai-primary"
+          value: "env.OPENAI_API_KEY"
+          weight: 2    # 50% of traffic
+    anthropic:
+      keys:
+        - name: "anthropic-primary"
+          value: "env.ANTHROPIC_API_KEY"
+          weight: 1    # 25%
+    gemini:
+      keys:
+        - name: "gemini-primary"
+          value: "env.GEMINI_API_KEY"
+          weight: 1    # 25%
+
+  providerSecrets:
+    openai:
+      existingSecret: "provider-keys"
+      key: "openai-api-key"
+      envVar: "OPENAI_API_KEY"
+    anthropic:
+      existingSecret: "provider-keys"
+      key: "anthropic-api-key"
+      envVar: "ANTHROPIC_API_KEY"
+    gemini:
+      existingSecret: "provider-keys"
+      key: "gemini-api-key"
+      envVar: "GEMINI_API_KEY"
+
+  plugins:
+    telemetry:
+      enabled: true
+    logging:
+      enabled: true
+```
+
+```bash
+helm install bifrost bifrost/bifrost -f multi-provider.yaml
+```
+
+</Tab>
+<Tab title="External Database">
+
+Use an existing PostgreSQL instance — RDS, Cloud SQL, Azure Database, or self-managed.
+
+```bash
+kubectl create secret generic postgres-credentials \
+  --from-literal=password='your-external-postgres-password'
+```
+
+```yaml
+# external-db.yaml
+image:
+  tag: "v1.4.11"
+
+storage:
+  mode: postgres
+
+postgresql:
+  enabled: false
+  external:
+    enabled: true
+    host: "your-rds-endpoint.us-east-1.rds.amazonaws.com"
+    port: 5432
+    user: "bifrost"
+    database: "bifrost"
+    sslMode: "require"
+    existingSecret: "postgres-credentials"
+    passwordKey: "password"
+
+bifrost:
+  encryptionKey: "your-encryption-key"
+
+  providers:
+    openai:
+      keys:
+        - name: "openai-primary"
+          value: "sk-..."
+          weight: 1
+```
+
+```bash
+helm install bifrost bifrost/bifrost -f external-db.yaml
+```
+
+</Tab>
+<Tab title="AI Workloads">
+
+Semantic response caching for high-volume AI inference.
+
+```bash
+kubectl create secret generic bifrost-encryption \
+  --from-literal=key='your-32-byte-encryption-key'
+
+kubectl create secret generic provider-keys \
+  --from-literal=openai-api-key='sk-your-key'
+```
+
+```yaml
+# ai-workload.yaml
+image:
+  tag: "v1.4.11"
+
+storage:
+  mode: postgres
+
+postgresql:
+  enabled: true
+  auth:
+    password: "secure-password"
+  primary:
+    persistence:
+      size: 50Gi
+
+vectorStore:
+  enabled: true
+  type: weaviate
+  weaviate:
+    enabled: true
+    persistence:
+      size: 50Gi
+
+bifrost:
+  encryptionKeySecret:
+    name: "bifrost-encryption"
+    key: "key"
+
+  providers:
+    openai:
+      keys:
+        - name: "openai-primary"
+          value: "env.OPENAI_API_KEY"
+          weight: 1
+
+  providerSecrets:
+    openai:
+      existingSecret: "provider-keys"
+      key: "openai-api-key"
+      envVar: "OPENAI_API_KEY"
+
+  plugins:
+    semanticCache:
+      enabled: true
+      config:
+        provider: "openai"
+        keys:
+          - value: "env.OPENAI_API_KEY"
+            weight: 1
+        embedding_model: "text-embedding-3-small"
+        dimension: 1536
+        threshold: 0.85
+        ttl: "1h"
+        cache_by_model: true
+        cache_by_provider: true
+```
+
+```bash
+helm install bifrost bifrost/bifrost -f ai-workload.yaml
+```
+
+</Tab>
+<Tab title="Kubernetes Secrets Only">
+
+Zero credentials in values files — all sensitive data in Kubernetes Secrets.
+
+```bash
+kubectl create secret generic postgres-credentials \
+  --from-literal=password='your-postgres-password'
+
+kubectl create secret generic bifrost-encryption \
+  --from-literal=key='your-encryption-key'
+
+kubectl create secret generic provider-keys \
+  --from-literal=openai-api-key='sk-...' \
+  --from-literal=anthropic-api-key='sk-ant-...'
+
+kubectl create secret generic qdrant-credentials \
+  --from-literal=api-key='your-qdrant-api-key'
+```
+
+```yaml
+# secrets-only.yaml
+image:
+  tag: "v1.4.11"
+
+storage:
+  mode: postgres
+
+postgresql:
+  enabled: false
+  external:
+    enabled: true
+    host: "postgres.example.com"
+    port: 5432
+    user: "bifrost"
+    database: "bifrost"
+    sslMode: "require"
+    existingSecret: "postgres-credentials"
+    passwordKey: "password"
+
+vectorStore:
+  enabled: true
+  type: qdrant
+  qdrant:
+    enabled: false
+    external:
+      enabled: true
+      host: "qdrant.example.com"
+      port: 6334
+      existingSecret: "qdrant-credentials"
+      apiKeyKey: "api-key"
+
+bifrost:
+  encryptionKeySecret:
+    name: "bifrost-encryption"
+    key: "key"
+
+  providers:
+    openai:
+      keys:
+        - name: "openai-primary"
+          value: "env.OPENAI_API_KEY"
+          weight: 1
+    anthropic:
+      keys:
+        - name: "anthropic-primary"
+          value: "env.ANTHROPIC_API_KEY"
+          weight: 1
+
+  providerSecrets:
+    openai:
+      existingSecret: "provider-keys"
+      key: "openai-api-key"
+      envVar: "OPENAI_API_KEY"
+    anthropic:
+      existingSecret: "provider-keys"
+      key: "anthropic-api-key"
+      envVar: "ANTHROPIC_API_KEY"
+```
+
+```bash
+helm install bifrost bifrost/bifrost -f secrets-only.yaml
+```
+
+</Tab>
+</Tabs>
--- a/docs/deployment-guides/how-to/install-make.mdx
+++ b/docs/deployment-guides/how-to/install-make.mdx
@@ -0,0 +1,77 @@
+---
+title: "Install make command"
+description: "This guide explains how to install the make command on Windows, Ubuntu/Debian, and macOS."
+icon: "compact-disc"
+---
+
+
+## Windows
+
+### Option A: Chocolatey (easy)
+
+```powershell
+# Run in an elevated PowerShell (Run as Administrator)
+choco install make
+# verify
+make --version
+```
+
+### Option B: Scoop (no admin needed)
+
+```powershell
+# In a normal PowerShell
+Set-ExecutionPolicy -Scope CurrentUser RemoteSigned
+iwr get.scoop.sh -useb | iex
+scoop install make
+make --version
+```
+
+### Option C: MSYS2 (full Unix-like env)
+
+```bash
+# 1) Install MSYS2 from https://www.msys2.org/
+# 2) In "MSYS2 MSYS" terminal:
+pacman -Syu         # then reopen terminal if asked
+pacman -S make
+make --version
+```
+
+<Note> Visual Studio’s nmake is a different tool (not GNU make). </Note>
+
+## Ubuntu / Debian
+
+```bash
+sudo apt update
+# Pulls in compilers and common build tools, including make
+sudo apt install build-essential
+# (or just) sudo apt install make
+make --version
+```
+
+## macOS
+
+### Option A: Xcode Command Line Tools (most common)
+
+```bash
+xcode-select --install   # follow the prompt
+make --version
+```
+
+This provides the older GNU make that Apple ships (version 3.81), which is fine for most projects.
+
+### Option B: Homebrew (get GNU make ≥ 4.x as gmake)
+
+```bash
+# Install Homebrew if needed: https://brew.sh
+brew install make
+gmake --version
+```
+
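+If a project specifically requires GNU make to be invoked as `make`, you can add a shell alias:
+
+`echo 'alias make="gmake"' >> ~/.zshrc && source ~/.zshrc`
+
+The alias only applies to interactive zsh sessions; scripts that call `make` still get the system version.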
+
+## Troubleshooting tips
+
+- If make isn’t found, restart your terminal (or on Windows, open a new PowerShell) so your PATH updates.
+- Run `which make` (`where.exe make` or `Get-Command make` in PowerShell, where plain `where` is an alias for `Where-Object`) to confirm which binary you’re using.
+- For Windows builds that depend on Unix tools (sed, grep, etc.), prefer MSYS2 or WSL for a smoother experience.
--- a/docs/deployment-guides/how-to/multinode.mdx
+++ b/docs/deployment-guides/how-to/multinode.mdx
@@ -0,0 +1,444 @@
+---
+title: "Multinode Deployment"
+description: "Deploy multiple Bifrost nodes with shared configuration for high availability in OSS deployments"
+icon: "layer-group"
+---
+
+## Overview
+
+Running multiple Bifrost nodes provides high availability, load distribution, and fault tolerance for your AI gateway. This guide covers the recommended approach for deploying multiple Bifrost nodes in OSS deployments.
+
+<Warning>
+  Running multiple OSS Bifrost nodes with a Postgres backend is not supported.
+  
+  Here is the short technical explanation:
+  
+  - Bifrost is designed to keep all critical information in memory, including provider configs, API keys, budgets, usage, and traffic distribution.
+  - Once a node is initialized, it does not read this information back from the database.
+  - In the Enterprise version, we use a slightly modified version of Raft to synchronize this state in real time across nodes, while the database acts only as a dumb store.
+  - Based on our current view, OSS is sufficient for startups and medium-scale teams, and can easily handle around 3,000–5,000 RPS on a single instance.
+  - If you need high availability and enterprise capabilities such as real-time synchronization, the Enterprise plan is the right fit. 
+  - And yes, that is part of how we draw the OSS vs Enterprise line 💰.
+</Warning>
+
+### OSS vs Enterprise
+
+| Aspect | OSS Approach | Enterprise Approach |
+|--------|--------------|---------------------|
+| **Configuration Source** | Shared `config.json` file | Database with P2P sync |
+| **Sync Mechanism** | File sharing (ConfigMap, volumes) | Gossip protocol (real-time) |
+| **Config Updates** | Modify file + restart nodes | UI/API with automatic propagation |
+
+---
+
+## How It Works
+
+All configuration in Bifrost is loaded into memory at startup. For OSS multinode deployments, the recommended approach is to use `config.json` **without** `config_store` enabled.
+
+### `config.json` as Single Source of Truth
+
+When you deploy without `config_store`:
+
+- **No database involved** - `config.json` is the only configuration source
+- **Shared file** - All nodes read from the same `config.json` file
+- **Identical configuration** - Since the source is shared, all nodes automatically have the same configuration
+- **No sync needed** - The shared file itself ensures consistency
+
+<Frame>
+<img src="/media/oss-multinode.png" alt="OSS multi-node setup" />
+</Frame>
+
+---
+
+## Why Not Use `config_store` for Multinode OSS?
+
+Using `config_store` (database-backed configuration) with multiple nodes in OSS creates a **synchronization problem**:
+
+1. **Config changes are local** - When you update configuration via the UI or API, it updates the database and the in-memory config on that specific node only
+2. **No propagation mechanism** - Other nodes don't know about the change; they keep their existing in-memory configuration
+3. **Nodes become out of sync** - Different nodes end up with different configurations
+4. **Restart required** - You'd have to restart all nodes after every config change to bring them back in sync
+
+This defeats the purpose of having database-backed configuration with real-time updates.
+
+<Warning>
+Without P2P clustering (Enterprise feature), there's no mechanism to notify other nodes of configuration changes. For OSS multinode deployments, use the shared `config.json` approach instead.
+</Warning>
+
+### Enterprise Solution
+
+Bifrost Enterprise includes **P2P clustering** with a gossip protocol that automatically syncs configuration changes across all nodes in real time. See the [Clustering documentation](/enterprise/clustering) for details.
+
+---
+
+## Setting Up Multinode OSS Deployment
+
+### Example config.json
+
+Create a `config.json` with `config_store` disabled:
+
+<Note>
+If you use PostgreSQL for `logs_store`, ensure the target database is UTF8 encoded. See [PostgreSQL UTF8 Requirement](../../quickstart/gateway/setting-up#postgresql-utf8-requirement).
+</Note>
+
+```json
+{
+  "$schema": "https://www.getbifrost.ai/schema",
+  "client": {
+    "drop_excess_requests": false,
+    "enable_logging": false    
+  },
+  "config_store": {
+    "enabled": false
+  },
+  "logs_store": {
+    "enabled": true,
+    "type": "postgres",
+    "config": {...}
+  },
+  "providers": {
+    "openai": {
+      "keys": [
+        {
+          "name": "openai-primary",
+          "value": "env.OPENAI_API_KEY",
+          "models": ["gpt-4o", "gpt-4o-mini"],
+          "weight": 1.0
+        }
+      ]
+    },
+    "anthropic": {
+      "keys": [
+        {
+          "name": "anthropic-primary",
+          "value": "env.ANTHROPIC_API_KEY",
+          "models": ["claude-sonnet-4-20250514", "claude-3-5-haiku-20241022"],
+          "weight": 1.0
+        }
+      ]
+    }
+  }
+}
+```
+
+<Note>
+Notice `config_store` is disabled. This ensures all configuration comes from the file only.
+</Note>
+
+### Kubernetes Deployment
+
+Use a ConfigMap to share the same configuration across all pods:
+
+```yaml
+apiVersion: v1
+kind: ConfigMap
+metadata:
+  name: bifrost-config
+  namespace: default
+data:
+  config.json: |
+    {
+      "$schema": "https://www.getbifrost.ai/schema",
+      "client": {
+        "drop_excess_requests": false,
+        "enable_logging": false        
+      },
+      "config_store": {
+        "enabled": false
+      },
+      "logs_store": {
+        "enabled": true,
+        "type": "postgres",
+        "config": {...}
+      },
+      "providers": {
+        "openai": {
+          "keys": [
+            {
+              "name": "openai-primary",
+              "value": "env.OPENAI_API_KEY",
+              "models": ["gpt-4o", "gpt-4o-mini"],
+              "weight": 1.0
+            }
+          ]
+        }
+      }
+    }
+---
+apiVersion: apps/v1
+kind: Deployment
+metadata:
+  name: bifrost
+  namespace: default
+spec:
+  replicas: 3
+  selector:
+    matchLabels:
+      app: bifrost
+  template:
+    metadata:
+      labels:
+        app: bifrost
+    spec:
+      containers:
+      - name: bifrost
+        image: maximhq/bifrost:latest
+        ports:
+        - containerPort: 8080
+          name: http
+        env:
+        - name: OPENAI_API_KEY
+          valueFrom:
+            secretKeyRef:
+              name: provider-secrets
+              key: openai-api-key
+        volumeMounts:
+        - name: config
+          mountPath: /app
+          readOnly: true
+        resources:
+          requests:
+            cpu: 250m
+            memory: 256Mi
+          limits:
+            cpu: 1000m
+            memory: 1Gi
+        livenessProbe:
+          httpGet:
+            path: /health
+            port: 8080
+          initialDelaySeconds: 10
+          periodSeconds: 10
+        readinessProbe:
+          httpGet:
+            path: /health
+            port: 8080
+          initialDelaySeconds: 5
+          periodSeconds: 5
+      volumes:
+      - name: config
+        configMap:
+          name: bifrost-config
+---
+apiVersion: v1
+kind: Service
+metadata:
+  name: bifrost
+  namespace: default
+spec:
+  type: LoadBalancer
+  selector:
+    app: bifrost
+  ports:
+  - port: 80
+    targetPort: 8080
+    protocol: TCP
+    name: http
+```
+
+### Docker Compose
+
+Share the configuration using a bind mount:
+
+```yaml
+version: '3.8'
+
+services:
+  nginx:
+    image: nginx:alpine
+    ports:
+      - "80:80"
+    volumes:
+      - ./nginx.conf:/etc/nginx/nginx.conf:ro
+    depends_on:
+      - bifrost-1
+      - bifrost-2
+      - bifrost-3
+
+  bifrost-1:
+    image: maximhq/bifrost:latest
+    environment:
+      - OPENAI_API_KEY=${OPENAI_API_KEY}
+      - ANTHROPIC_API_KEY=${ANTHROPIC_API_KEY}
+    volumes:
+      - ./config.json:/app/config.json:ro
+    expose:
+      - "8080"
+
+  bifrost-2:
+    image: maximhq/bifrost:latest
+    environment:
+      - OPENAI_API_KEY=${OPENAI_API_KEY}
+      - ANTHROPIC_API_KEY=${ANTHROPIC_API_KEY}
+    volumes:
+      - ./config.json:/app/config.json:ro
+    expose:
+      - "8080"
+
+  bifrost-3:
+    image: maximhq/bifrost:latest
+    environment:
+      - OPENAI_API_KEY=${OPENAI_API_KEY}
+      - ANTHROPIC_API_KEY=${ANTHROPIC_API_KEY}
+    volumes:
+      - ./config.json:/app/config.json:ro
+    expose:
+      - "8080"
+```
+
+**nginx.conf** for load balancing:
+
+```nginx
+events {
+    worker_connections 1024;
+}
+
+http {
+    upstream bifrost {
+        least_conn;
+        server bifrost-1:8080;
+        server bifrost-2:8080;
+        server bifrost-3:8080;
+    }
+
+    server {
+        listen 80;
+
+        location / {
+            proxy_pass http://bifrost;
+            proxy_set_header Host $host;
+            proxy_set_header X-Real-IP $remote_addr;
+            proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
+            proxy_connect_timeout 60s;
+            proxy_send_timeout 60s;
+            proxy_read_timeout 60s;
+        }
+
+        location /health {
+            access_log off;
+            return 200 "healthy\n";
+        }
+    }
+}
+```
+
+### Bare Metal / VM Deployment
+
+For bare metal or VM deployments, distribute the configuration file using:
+
+- **NFS mount** - Mount a shared NFS directory containing `config.json`
+- **rsync** - Sync the config file from a central location to all nodes
+- **Configuration management** - Use Ansible, Chef, or Puppet to deploy identical configs
+
+Example with rsync:
+
+```bash
+# On config server - push to all nodes
+for node in node1 node2 node3; do
+  rsync -avz /etc/bifrost/config.json $node:/etc/bifrost/config.json
+done
+
+# Restart nodes after config update
+for node in node1 node2 node3; do
+  ssh $node "systemctl restart bifrost"
+done
+```
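+
+The same push-and-restart flow could also be expressed with a configuration management tool. The following is a minimal Ansible sketch, not a tested playbook: the inventory group `bifrost_nodes` and the `files/config.json` source path are assumptions, while the destination path and service name follow the rsync example above.
+
+```yaml
+# playbook.yml - hypothetical inventory group "bifrost_nodes"
+- name: Distribute Bifrost config and restart nodes one at a time
+  hosts: bifrost_nodes
+  become: true
+  serial: 1                      # rolling: finish one node before starting the next
+  tasks:
+    - name: Copy the shared config.json
+      ansible.builtin.copy:
+        src: files/config.json   # path on the control machine (assumed layout)
+        dest: /etc/bifrost/config.json
+        mode: "0644"
+      notify: Restart bifrost
+  handlers:
+    - name: Restart bifrost
+      ansible.builtin.systemd:
+        name: bifrost
+        state: restarted
+```
+
+Because the restart is a handler, a node is only restarted when the copied file actually changed.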
+
+---
+
+## Updating Configuration
+
+To update configuration in a multinode OSS deployment:
+
+1. **Modify the shared `config.json` file**
+   - Update the ConfigMap (Kubernetes)
+   - Edit the shared file (Docker Compose / bare metal)
+
+2. **Restart the nodes**
+   - Rolling restart is supported - nodes can be restarted one at a time
+   - Each node picks up the new configuration on startup
+
+### Kubernetes Rolling Restart
+
+```bash
+# Update ConfigMap
+kubectl apply -f configmap.yaml
+
+# Trigger rolling restart
+kubectl rollout restart deployment/bifrost
+
+# Watch the rollout
+kubectl rollout status deployment/bifrost
+```
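+
+To keep capacity steady during the rollout, the Deployment can declare an explicit rolling update strategy. This fragment is a sketch to merge into the Deployment `spec` shown earlier; the surge and unavailability values are illustrative:
+
+```yaml
+# Fragment of the Deployment spec (merge with the manifest above)
+spec:
+  replicas: 3
+  strategy:
+    type: RollingUpdate
+    rollingUpdate:
+      maxUnavailable: 0   # never drop below the desired replica count
+      maxSurge: 1         # bring up one new pod at a time
+```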
+
+### Docker Compose Restart
+
+```bash
+# After updating config.json
+docker-compose restart bifrost-1
+docker-compose restart bifrost-2
+docker-compose restart bifrost-3
+```
+
+---
+
+## Best Practices
+
+### Use Environment Variables for Secrets
+
+Never put API keys directly in `config.json`. Use the `env.` prefix to reference environment variables:
+
+```json
+{
+  "providers": {
+    "openai": {
+      "keys": [
+        {
+          "value": "env.OPENAI_API_KEY"
+        }
+      ]
+    }
+  }
+}
+```
+
+Then provide the actual keys via environment variables or Kubernetes secrets.
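+
+On Kubernetes, a Secret matching the `secretKeyRef` used in the Deployment earlier in this guide (`provider-secrets` / `openai-api-key`) might look like the sketch below. The key value is a placeholder:
+
+```yaml
+apiVersion: v1
+kind: Secret
+metadata:
+  name: provider-secrets
+  namespace: default
+type: Opaque
+stringData:
+  openai-api-key: "sk-..."   # placeholder; never commit a real key
+```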
+
+### Load Balancer Configuration
+
+Always put a load balancer in front of your Bifrost nodes:
+
+- **Kubernetes**: Use a Service with `type: LoadBalancer` or an Ingress
+- **Docker/VMs**: Use nginx, HAProxy, or a cloud load balancer
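+
+For the Ingress option, a minimal manifest in front of the `bifrost` Service defined earlier could look like this. It is a sketch that assumes an NGINX ingress controller is installed and uses a placeholder hostname; with an Ingress in place, the Service would typically be `ClusterIP` rather than `LoadBalancer`:
+
+```yaml
+apiVersion: networking.k8s.io/v1
+kind: Ingress
+metadata:
+  name: bifrost
+  namespace: default
+spec:
+  ingressClassName: nginx          # assumes an NGINX ingress controller
+  rules:
+    - host: bifrost.example.com    # placeholder hostname
+      http:
+        paths:
+          - path: /
+            pathType: Prefix
+            backend:
+              service:
+                name: bifrost
+                port:
+                  number: 80       # Service port from the manifest above
+```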
+
+### Health Checks
+
+Configure health checks to ensure traffic only goes to healthy nodes:
+
+- **Liveness endpoint**: `GET /health`
+- **Readiness endpoint**: `GET /health`
+
+### Resource Allocation
+
+For production deployments:
+
+```yaml
+resources:
+  requests:
+    cpu: 500m
+    memory: 512Mi
+  limits:
+    cpu: 2000m
+    memory: 2Gi
+```
+
+---
+
+## Summary
+
+| Scenario | Recommendation |
+|----------|----------------|
+| Single node | Use `config_store` for UI access |
+| Multinode OSS | Use shared `config.json` without `config_store` |
+| Multinode Enterprise | Use P2P clustering with `config_store` |
+
+For OSS multinode deployments, the shared `config.json` approach provides a simple, reliable way to keep all nodes in sync without the complexity of database synchronization.
--- a/docs/deployment-guides/how-to/nginx-reverse-proxy.mdx
+++ b/docs/deployment-guides/how-to/nginx-reverse-proxy.mdx
@@ -0,0 +1,185 @@
+---
+title: "Nginx reverse proxy"
+description: "Run Bifrost behind NGINX with streaming-safe settings for SSE and WebSocket traffic"
+icon: "shuffle"
+---
+
+This guide shows how to put NGINX in front of Bifrost for TLS termination, centralized routing, and load balancing.
+
+<Note>
+Incoming reverse-proxy behavior is configured in your infrastructure layer (NGINX/Ingress), not in `config.json`.
+</Note>
+
+---
+
+## When to use this setup
+
+- You want HTTPS termination in front of Bifrost.
+- You run multiple Bifrost replicas and want L7 load balancing.
+- You need one stable gateway URL for SDKs and agent clients.
+
+---
+
+## Docker Compose deployment
+
+Use this when Bifrost and NGINX run as services in the same Compose project.
+
+```yaml
+services:
+  nginx:
+    image: nginx:alpine
+    ports:
+      - "80:80"
+    volumes:
+      - ./nginx.conf:/etc/nginx/nginx.conf:ro
+    depends_on:
+      - bifrost-1
+      - bifrost-2
+      - bifrost-3
+
+  bifrost-1:
+    image: maximhq/bifrost:latest
+    expose:
+      - "8080"
+
+  bifrost-2:
+    image: maximhq/bifrost:latest
+    expose:
+      - "8080"
+
+  bifrost-3:
+    image: maximhq/bifrost:latest
+    expose:
+      - "8080"
+```
+
+```nginx
+events {
+    worker_connections 1024;
+}
+
+http {
+    upstream bifrost_backend {
+        least_conn;
+        server bifrost-1:8080;
+        server bifrost-2:8080;
+        server bifrost-3:8080;
+    }
+
+    server {
+        listen 80;
+
+        location / {
+            proxy_pass http://bifrost_backend;
+
+            # Preserve original request context
+            proxy_set_header Host $host;
+            proxy_set_header X-Real-IP $remote_addr;
+            proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
+            proxy_set_header X-Forwarded-Proto $scheme;
+
+            # Keep streaming responses stable
+            proxy_http_version 1.1;
+            proxy_buffering off;
+            proxy_request_buffering off;
+            proxy_read_timeout 300s;
+            proxy_send_timeout 300s;
+        }
+    }
+}
+```
+
+If you expose WebSocket traffic through the same endpoint, add upgrade headers in the same `location /` block:
+
+```nginx
+proxy_set_header Upgrade $http_upgrade;
+proxy_set_header Connection "upgrade";
+```
+
+---
+
+## VM or bare-metal deployment
+
+Use the same NGINX `location /` settings as above, and point `upstream` servers to hostnames/IPs reachable from that VM.
+
+If you terminate TLS directly on NGINX, add:
+
+```nginx
+listen 443 ssl;
+server_name bifrost.example.com;
+ssl_certificate /etc/nginx/certs/fullchain.pem;
+ssl_certificate_key /etc/nginx/certs/privkey.pem;
+```
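+
+If NGINX runs inside the Compose project from the previous section rather than directly on the host, the container also needs port 443 published and the certificate files mounted. A sketch, assuming a hypothetical `./certs` directory on the host that holds `fullchain.pem` and `privkey.pem`:
+
+```yaml
+# Sketch: nginx service with TLS port and certificates mounted
+services:
+  nginx:
+    image: nginx:alpine
+    ports:
+      - "80:80"
+      - "443:443"
+    volumes:
+      - ./nginx.conf:/etc/nginx/nginx.conf:ro
+      - ./certs:/etc/nginx/certs:ro   # hypothetical host directory with the certificate files
+```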
+
+---
+
+## Kubernetes (NGINX Ingress)
+
+If you deploy with Helm, use Ingress values instead of a standalone NGINX config:
+
+```yaml
+ingress:
+  enabled: true
+  className: nginx
+  annotations:
+    cert-manager.io/cluster-issuer: letsencrypt-prod
+    nginx.ingress.kubernetes.io/proxy-body-size: "100m"
+    nginx.ingress.kubernetes.io/proxy-read-timeout: "300"
+    nginx.ingress.kubernetes.io/proxy-send-timeout: "300"
+    nginx.ingress.kubernetes.io/proxy-buffering: "off"
+  hosts:
+    - host: bifrost.example.com
+      paths:
+        - path: /
+          pathType: Prefix
+  tls:
+    - secretName: bifrost-tls
+      hosts:
+        - bifrost.example.com
+```
+
+---
+
+## Verify the proxy path
+
+```bash
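+# Docker Compose: render the merged Compose file and validate its syntax
+docker compose config
+
+# Docker Compose: validate the NGINX config inside the running container
+docker compose exec nginx nginx -t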
+
+# Kubernetes: validate ingress manifest locally
+kubectl apply --dry-run=client -f ingress.yaml
+```
+
+```bash
+# Health check through reverse proxy
+curl -i http://bifrost.example.com/health
+
+# Streaming check through NGINX
+curl -N http://bifrost.example.com/v1/chat/completions \
+  -H "Content-Type: application/json" \
+  -d '{
+    "model": "gpt-4o-mini",
+    "stream": true,
+    "messages": [{"role": "user", "content": "test stream"}]
+  }'
+```
+
+If streaming responses arrive in delayed bursts, confirm buffering is disabled in NGINX or Ingress annotations.
+
+---
+
+## Related guides
+
+- [Helm quick start](/deployment-guides/helm)
+- [Helm values reference](/deployment-guides/helm/values)
+- [Multinode deployment](/deployment-guides/how-to/multinode)
+
+---
+
+## Runnable example files
+
+Use the complete Docker Compose + Helm/Kubernetes example in the repository:
+
+- [docker-compose.yml](https://github.com/maximhq/bifrost/blob/main/examples/configs/withnginxreverseproxy/docker-compose.yml)
+- [helm-values.yaml](https://github.com/maximhq/bifrost/blob/main/examples/configs/withnginxreverseproxy/helm-values.yaml)
+- [k8s-ingress.yaml](https://github.com/maximhq/bifrost/blob/main/examples/configs/withnginxreverseproxy/k8s-ingress.yaml)
--- a/docs/deployment-guides/k8s.mdx
+++ b/docs/deployment-guides/k8s.mdx