first commit

2026-04-26 21:52:23 +03:00
commit 880f412e2c
2662 changed files with 866266 additions and 0 deletions
--- a/docs/mcp/agent-mode.mdx
+++ b/docs/mcp/agent-mode.mdx
@@ -0,0 +1,475 @@
+---
+title: "Agent Mode (Auto-Execution)"
+sidebarTitle: "Agent Mode"
+description: "Enable autonomous tool execution with configurable auto-approval for building AI agents."
+icon: "robot"
+---
+
+<Note>
+This feature is only available on `v1.4.0-prerelease1` and above.
+</Note>
+
+## Overview
+
+**Agent Mode** enables Bifrost to automatically execute tool calls without requiring explicit execution API calls for each tool. This transforms Bifrost from a simple gateway into an autonomous agent runtime.
+
+<Warning>
+**Streaming Not Supported**: Agent Mode is not compatible with streaming operations (`chat_stream` and `responses_stream`). The autonomous tool execution loop needs a complete response before it can proceed to the next iteration, and buffering every streamed chunk in memory in case a tool call appears would defeat the purpose of streaming. Use the non-streaming endpoints (`chat` and `responses`) when Agent Mode is enabled.
+</Warning>
+
+When Agent Mode is enabled:
+1. The LLM returns tool calls in its response
+2. Bifrost automatically executes **auto-executable** tools
+3. Results are fed back to the LLM
+4. The loop continues until there are no more tool calls OR max depth is reached
+5. Non-auto-executable tools are returned to your application for approval
+
+<Warning>
+Agent Mode requires explicit configuration. Tools must be marked as auto-executable via `tools_to_auto_execute`. By default, no tools are auto-executed.
+</Warning>
+
+---
+
+## Configuration
+
+Agent Mode requires two configurations:
+
+1. **`tools_to_execute`**: Which tools are available (whitelist)
+2. **`tools_to_auto_execute`**: Which tools can run automatically (subset of above)
+
+### Tools To Execute vs Tools To Auto Execute
+
+| Field | Purpose | Semantics |
+|-------|---------|-----------|
+| `tools_to_execute` | Tools available to the LLM | `["*"]` = all, `[]` = none, `["a", "b"]` = specific |
+| `tools_to_auto_execute` | Tools that run without approval | Same semantics, must be subset of `tools_to_execute` |
+
+<Note>
+A tool in `tools_to_auto_execute` that is NOT in `tools_to_execute` will be ignored. The execute list takes precedence.
+</Note>
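+
+The precedence rule can be sketched in a few lines. This is an illustrative model of the semantics described above, not Bifrost's implementation; the function names `resolve` and `effective_auto_execute` are hypothetical.
+
+```python
+def resolve(allowlist, available):
+    """Expand a tools list: ["*"] means all, [] means none."""
+    if "*" in allowlist:
+        return set(available)
+    return set(allowlist) & set(available)
+
+
+def effective_auto_execute(available, tools_to_execute, tools_to_auto_execute):
+    """Tools that may run without approval: the auto list limited to the execute list."""
+    executable = resolve(tools_to_execute, available)
+    return resolve(tools_to_auto_execute, available) & executable
+
+
+available = ["read_file", "write_file", "list_directory"]
+# The wildcard auto list covers write_file, but write_file is not in the
+# execute list, so it is ignored.
+auto = effective_auto_execute(available, ["read_file", "list_directory"], ["*"])
+print(sorted(auto))
+```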
+
+---
+
+## Gateway Setup
+
+<Tabs>
+<Tab title="Web UI">
+
+### Configuring Auto-Execute Tools
+
+1. Navigate to **MCP Gateway** in the left sidebar
+2. Click on a client to open its configuration sheet
+3. Scroll to the **Available Tools** section
+4. For each tool, toggle the **Automatically execute tool** switch
+5. Click **Save Changes** to apply
+
+The auto-execute configuration is managed per-client, allowing fine-grained control over which tools run automatically vs. requiring manual approval.
+
+### Global Agent Settings
+
+Configure max depth and other agent settings via:
+
+**Gateway API:**
+```bash
+# Update tool manager config
+curl -X PUT http://localhost:8080/api/settings/mcp/tool-manager-config \
+  -H "Content-Type: application/json" \
+  -d '{
+    "max_agent_depth": 15,
+    "tool_execution_timeout": "45s",
+    "code_mode_binding_level": "tool"
+  }'
+```
+
+**config.json:**
+```json
+{
+  "mcp": {
+    "tool_manager_config": {
+      "max_agent_depth": 15,
+      "tool_execution_timeout": "45s",
+      "code_mode_binding_level": "tool"
+    }
+  }
+}
+```
+
+</Tab>
+<Tab title="API">
+
+### Add Client with Auto-Execute Tools
+
+```bash
+curl -X POST http://localhost:8080/api/mcp/client \
+  -H "Content-Type: application/json" \
+  -d '{
+    "name": "filesystem",
+    "connection_type": "stdio",
+    "stdio_config": {
+      "command": "npx",
+      "args": ["-y", "@anthropic/mcp-filesystem"]
+    },
+    "tools_to_execute": ["*"],
+    "tools_to_auto_execute": ["read_file", "list_directory"]
+  }'
+```
+
+### Update Existing Client
+
+```bash
+curl -X PUT http://localhost:8080/api/mcp/client/{id} \
+  -H "Content-Type: application/json" \
+  -d '{
+    "name": "filesystem",
+    "connection_type": "stdio",
+    "stdio_config": {
+      "command": "npx",
+      "args": ["-y", "@anthropic/mcp-filesystem"]
+    },
+    "tools_to_execute": ["*"],
+    "tools_to_auto_execute": ["*"]
+  }'
+```
+
+</Tab>
+<Tab title="config.json">
+
+```json
+{
+  "mcp": {
+    "client_configs": [
+      {
+        "name": "filesystem",
+        "connection_type": "stdio",
+        "stdio_config": {
+          "command": "npx",
+          "args": ["-y", "@anthropic/mcp-filesystem"]
+        },
+        "tools_to_execute": ["*"],
+        "tools_to_auto_execute": ["read_file", "list_directory"]
+      },
+      {
+        "name": "web_search",
+        "connection_type": "http",
+        "connection_string": "http://localhost:3001/mcp",
+        "tools_to_execute": ["search"],
+        "tools_to_auto_execute": ["search"]
+      }
+    ],
+    "tool_manager_config": {
+      "max_agent_depth": 10,
+      "tool_execution_timeout": "30s"
+    }
+  }
+}
+```
+
+</Tab>
+</Tabs>
+
+---
+
+## Go SDK Setup
+
+```go
+package main
+
+import (
+    "context"
+    "time"
+
+    bifrost "github.com/maximhq/bifrost/core"
+    "github.com/maximhq/bifrost/core/schemas"
+)
+
+func main() {
+    mcpConfig := &schemas.MCPConfig{
+        ClientConfigs: []schemas.MCPClientConfig{
+            {
+                Name:           "filesystem",
+                ConnectionType: schemas.MCPConnectionTypeSTDIO,
+                StdioConfig: &schemas.MCPStdioConfig{
+                    Command: "npx",
+                    Args:    []string{"-y", "@anthropic/mcp-filesystem"},
+                },
+                // All tools available
+                ToolsToExecute: []string{"*"},
+                // Only read operations auto-execute
+                ToolsToAutoExecute: []string{"read_file", "list_directory"},
+            },
+        },
+        ToolManagerConfig: &schemas.MCPToolManagerConfig{
+            MaxAgentDepth:        10,                      // Max iterations
+            ToolExecutionTimeout: 30 * time.Second,        // Per-tool timeout
+        },
+    }
+
+    // `account` is your Account implementation (provider keys and configs), defined elsewhere.
+    client, err := bifrost.Init(context.Background(), schemas.BifrostConfig{
+        Account:   account,
+        MCPConfig: mcpConfig,
+    })
+    if err != nil {
+        panic(err)
+    }
+
+    // Make request - agent mode runs automatically
+    request := &schemas.BifrostChatRequest{
+        Provider: schemas.OpenAI,
+        Model:    "gpt-4o",
+        Input: []schemas.ChatMessage{
+            {
+                Role: schemas.ChatMessageRoleUser,
+                Content: schemas.ChatMessageContent{
+                    ContentStr: bifrost.Ptr("List all Go files in the project and summarize their purpose"),
+                },
+            },
+        },
+    }
+
+    // This will:
+    // 1. Get tool calls from LLM
+    // 2. Auto-execute list_directory, read_file
+    // 3. Feed results back to LLM
+    // 4. Return final response
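+    // A separate variable is used for the request error because it may not share the type of `err` above.
+    response, bifrostErr := client.ChatCompletionRequest(schemas.NewBifrostContext(context.Background(), schemas.NoDeadline), request)
+    if bifrostErr != nil {
+        panic(bifrostErr)
+    }
+    _ = response // consume the final response here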
+}
+```
+
+---
+
+## Agent Mode Behavior
+
+### Max Depth
+
+The `max_agent_depth` setting limits how many iterations the agent can perform:
+
+- **Default**: 10 iterations
+- Each LLM call that produces tool calls counts as one iteration
+- When max depth is reached, the current response is returned (may contain pending tool calls)
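+
+The loop and its depth limit can be sketched as follows. This is a simplified model under stated assumptions, not Bifrost's code: `run_agent`, `call_llm`, and `execute_tool` are hypothetical names, and responses are plain dicts with `content` and `tool_calls` keys.
+
+```python
+def run_agent(call_llm, execute_tool, auto_tools, max_agent_depth=10):
+    """Loop until the LLM stops requesting tools, approval is needed, or depth runs out."""
+    history = []
+    response = call_llm(history)
+    for _ in range(max_agent_depth):
+        calls = response.get("tool_calls", [])
+        if not calls:
+            return response  # no tool calls: this is the final answer
+        auto = [c for c in calls if c["name"] in auto_tools]
+        pending = [c for c in calls if c["name"] not in auto_tools]
+        for call in auto:
+            history.append({"tool": call["name"], "result": execute_tool(call)})
+        if pending:
+            # Hand the remaining calls back to the application for approval.
+            return {"content": response.get("content", ""), "tool_calls": pending}
+        response = call_llm(history)
+    return response  # depth reached: may still contain tool calls
+
+
+def fake_llm(history):
+    """Stand-in model: asks for read_file twice, then answers."""
+    if len(history) < 2:
+        return {"content": "", "tool_calls": [{"name": "read_file"}]}
+    return {"content": "done", "tool_calls": []}
+
+
+final = run_agent(fake_llm, lambda call: "ok", {"read_file"})
+print(final["content"])
+```
+
+With `max_agent_depth=1` the same stand-in model returns a response that still carries a `read_file` call, which mirrors the "may contain pending tool calls" case.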
+
+### Parallel Execution
+
+Auto-executable tools are executed **in parallel** for performance:
+
+```mermaid
+graph TD
+    Start["<b>LLM returns tools</b><br/>[tool_1, tool_2, tool_3]<br/><i>all auto-executable</i>"]
+
+    Tool1["Execute tool_1"]
+    Tool2["Execute tool_2"]
+    Tool3["Execute tool_3"]
+
+    Collect["<b>Collect Results</b><br/>Continue to next LLM call"]
+
+    Start --> Tool1
+    Start --> Tool2
+    Start --> Tool3
+
+    Tool1 --> Collect
+    Tool2 --> Collect
+    Tool3 --> Collect
+
+    style Start fill:#FFF3E0,stroke:#BF360C,stroke-width:2.5px,color:#1A1A1A
+    style Tool1 fill:#E3F2FD,stroke:#0D47A1,stroke-width:2.5px,color:#1A1A1A
+    style Tool2 fill:#E3F2FD,stroke:#0D47A1,stroke-width:2.5px,color:#1A1A1A
+    style Tool3 fill:#E3F2FD,stroke:#0D47A1,stroke-width:2.5px,color:#1A1A1A
+    style Collect fill:#E8F5E9,stroke:#1B5E20,stroke-width:2.5px,color:#1A1A1A
+```
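+
+The fan-out and collect pattern in the diagram can be illustrated with a thread pool. This is a generic concurrency sketch, not Bifrost's implementation; `execute_parallel` and the stand-in `execute_tool` callable are hypothetical.
+
+```python
+from concurrent.futures import ThreadPoolExecutor
+
+
+def execute_parallel(tool_calls, execute_tool):
+    """Run every call concurrently and return results in the original order."""
+    if not tool_calls:
+        return []
+    with ThreadPoolExecutor(max_workers=len(tool_calls)) as pool:
+        # map() preserves input order even when calls finish out of order.
+        return list(pool.map(execute_tool, tool_calls))
+
+
+results = execute_parallel(["tool_1", "tool_2", "tool_3"], lambda name: name + ":done")
+print(results)
+```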
+
+### Mixed Auto/Non-Auto Tools
+
+When a response contains both auto-executable and non-auto-executable tools:
+
+1. Auto-executable tools are executed first
+2. The response is returned with:
+   - A text `content` field containing the executed tool results as JSON
+   - Pending non-auto-executable tool calls in `tool_calls`
+   - `finish_reason` set to `"stop"`
+
+```json
+{
+  "choices": [{
+    "index": 0,
+    "finish_reason": "stop",
+    "message": {
+      "role": "assistant",
+      "content": "The Output from allowed tools calls is - {\"filesystem_list_directory\":\"[\\\"file1.go\\\", \\\"file2.go\\\"]\"}\n\nNow I shall call these tools next...",
+      "tool_calls": [{
+        "id": "call_pending",
+        "type": "function",
+        "function": {
+          "name": "filesystem_write_file",
+          "arguments": "{\"path\": \"output.txt\", \"content\": \"...\"}"
+        }
+      }]
+    }
+  }]
+}
+```
+
+<Note>
+The `content` field contains a JSON summary of executed tool results. The `tool_calls` array contains only the non-auto-executable tools that require your approval. The `finish_reason` is set to `"stop"` to exit the agent loop.
+</Note>
+
+Your application should then:
+1. Parse the `content` field to see what was already executed
+2. Review the pending non-auto-executable tools in `tool_calls`
+3. Execute or reject them manually
+4. Continue the conversation with results
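+
+A client-side sketch of the review step, assuming a response shaped like the example above. The `triage` helper and the `APPROVED` set are hypothetical application code, not part of Bifrost.
+
+```python
+import json
+
+# Trimmed response shaped like the example above.
+response = {
+    "choices": [{
+        "finish_reason": "stop",
+        "message": {
+            "role": "assistant",
+            "content": "summary of executed tool results",
+            "tool_calls": [{
+                "id": "call_pending",
+                "type": "function",
+                "function": {
+                    "name": "filesystem_write_file",
+                    "arguments": json.dumps({"path": "output.txt", "content": "..."}),
+                },
+            }],
+        },
+    }]
+}
+
+# Hypothetical policy: tools a human has approved for this request.
+APPROVED = {"filesystem_write_file"}
+
+
+def triage(response, approved):
+    """Split pending tool calls into approved and rejected lists."""
+    message = response["choices"][0]["message"]
+    to_run, rejected = [], []
+    for call in message.get("tool_calls") or []:
+        name = call["function"]["name"]
+        args = json.loads(call["function"]["arguments"])
+        (to_run if name in approved else rejected).append((call["id"], name, args))
+    return to_run, rejected
+
+
+to_run, rejected = triage(response, APPROVED)
+print(to_run, rejected)
+```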
+
+---
+
+## Security Considerations
+
+<Warning>
+Be careful which tools you mark as auto-executable. Dangerous operations like `write_file`, `delete_file`, `execute_command` should typically require human approval.
+</Warning>
+
+### Recommended Patterns
+
+**Safe for Auto-Execute:**
+- Read operations (`read_file`, `list_directory`)
+- Search/query operations (`search`, `fetch_url`)
+- Non-destructive information gathering
+
+**Require Human Approval:**
+- Write operations (`write_file`, `create_file`)
+- Delete operations (`delete_file`, `delete_record`)
+- Execute operations (`run_command`, `execute_script`)
+- Operations with side effects (sending emails, making purchases)
+
+### Example: Safe Configuration
+
+```json
+{
+  "tools_to_execute": ["*"],
+  "tools_to_auto_execute": [
+    "read_file",
+    "list_directory",
+    "search",
+    "get_weather"
+  ]
+}
+```
+
+---
+
+## Tool Execution Timeout
+
+Individual tool executions are bounded by `tool_execution_timeout`:
+
+- **Default**: 30 seconds
+- If a tool exceeds the timeout, an error result is returned
+- The agent loop continues with the error result
+
+```json
+{
+  "tool_manager_config": {
+    "tool_execution_timeout": "60s"
+  }
+}
+```
+
+---
+
+## Advanced: Agent Loop Internals
+
+### Iteration Tracking
+
+When Agent Mode executes, each iteration through the LLM and tool execution cycle increments a counter. You can track this for logging and debugging:
+
+```go
+// During iteration 1 -> Request made with max_tokens adjustment
+// Tool results collected and added to history
+// During iteration 2 -> Another LLM call with history
+// Process continues until no more tool calls or max_agent_depth reached
+```
+
+The `max_agent_depth` setting controls maximum iterations:
+- **Default:** 10
+- **Range:** 1-50 (configurable)
+- When reached, current response returned as-is (may contain pending tool calls)
+
+### Custom Request ID Management
+
+For complex workflows, track each iteration with unique request IDs:
+
+```go
+mcpConfig := &schemas.MCPConfig{
+    ToolManagerConfig: &schemas.MCPToolManagerConfig{
+        MaxAgentDepth: 10,
+    },
+    FetchNewRequestIDFunc: func(ctx context.Context) string {
+        // Called before each LLM invocation.
+        // Comma-ok assertions avoid a panic if a value is missing or has another type.
+        baseID, _ := ctx.Value(schemas.BifrostContextKeyRequestID).(string)
+        iterationNum, _ := ctx.Value("iteration").(int) // 0 if unset
+        return fmt.Sprintf("%s-iter-%d", baseID, iterationNum)
+    },
+}
+```
+
+This enables:
+- Audit trail of intermediate steps
+- Correlation of tool executions to iterations
+- Detailed observability for agent behavior
+
+### Parallel vs Sequential Execution
+
+**Auto-executable tools** run in parallel for performance:
+```
+Iteration N:
+  ├─ Execute tool_1 ───┐
+  ├─ Execute tool_2 ───┼─── Parallel (simultaneous)
+  └─ Execute tool_3 ───┘
+        ↓
+  Collect results → Feed to next iteration
+```
+
+**Non-auto-executable tools** return immediately:
+```
+Iteration N:
+  Auto tools executed in parallel
+  Non-auto tools returned in response
+  Application reviews & approves non-auto tools
+  Application calls execute endpoint manually
+  Results fed back in next iteration
+```
+
+### Response Format in Agent Mode
+
+When Agent Mode finds mixed auto/non-auto tools:
+
+```json
+{
+  "choices": [{
+    "message": {
+      "role": "assistant",
+      "content": "Executed tools: filesystem_list_directory returned [...]",
+      "tool_calls": [{
+        "id": "call_abc",
+        "type": "function",
+        "function": {
+          "name": "filesystem_write_file",
+          "arguments": "..."
+        }
+      }]
+    },
+    "finish_reason": "stop"
+  }]
+}
+```
+
+The `content` field contains a JSON summary of executed tool results. The `tool_calls` array contains only non-auto-executable tools.
+
+---
+
+## Next Steps
+
+<CardGroup cols={2}>
+  <Card title="Code Mode" icon="code" href="./code-mode">
+    Let AI write code to orchestrate multiple tools
+  </Card>
+  <Card title="Tool Filtering" icon="filter" href="./filtering">
+    Control tool availability per request
+  </Card>
+</CardGroup>
--- a/docs/mcp/code-mode.mdx
+++ b/docs/mcp/code-mode.mdx
@@ -0,0 +1,629 @@
+---
+title: "Code Mode"
+sidebarTitle: "Code Mode"
+description: "AI writes Python to orchestrate tools. Reduces token usage by 50%+ when using multiple MCP servers."
+icon: "code"
+---
+
+<Note>
+This feature is only available on `v1.4.0-prerelease1` and above.
+</Note>
+
+## Overview
+
+**Code Mode** is a transformative approach to using MCP that solves a critical problem at scale:
+
+> **The Problem:** When you connect 8-10 MCP servers (150+ tools), every single request includes all tool definitions in the context. The LLM spends most of its budget reading tool catalogs instead of doing actual work.
+
+**The Solution:** Instead of exposing 150 tools directly, Code Mode exposes just **four generic tools**. The LLM uses those tools to write Python code (Starlark) that orchestrates everything else in a sandbox.
+
+### The Impact
+
+Compare a workflow across 5 MCP servers with ~100 tools:
+
+**Classic MCP Flow:**
+- 6 LLM turns
+- 100 tools in context **every turn** (600 tool definitions sent in total)
+- All intermediate results flow through the model
+
+**Code Mode Flow:**
+- 3-4 LLM turns
+- Only 4 tools + definitions on-demand
+- Intermediate results processed in sandbox
+
+**Result: ~50% cost reduction + 30-40% faster execution**
+
+Code Mode provides four meta-tools to the AI:
+1. **`listToolFiles`** - Discover available MCP servers
+2. **`readToolFile`** - Load Python stub signatures on-demand
+3. **`getToolDocs`** - Get detailed documentation for a specific tool
+4. **`executeToolCode`** - Execute Python code with full tool bindings
+
+## When to Use Code Mode
+
+**Enable Code Mode if you have:**
+- ✅ 3+ MCP servers connected
+- ✅ Complex multi-step workflows
+- ✅ Concerns about token costs or latency
+- ✅ Tools that need to interact with each other
+
+**Keep Classic MCP if you have:**
+- ✅ Only 1-2 small MCP servers
+- ✅ Simple, direct tool calls
+- ✅ Very latency-sensitive use cases (though Code Mode is usually faster)
+
+**You can mix both:** Enable Code Mode for "heavy" servers (web, documents, databases) and keep small utilities as direct tools.
+
+---
+
+## How Code Mode Works
+
+### The Four Tools
+
+Instead of seeing 150+ tool definitions, the model sees four generic tools:
+
+```mermaid
+graph LR
+    LLM["<b>LLM Context</b><br/><i>Compact & Efficient</i>"]
+
+    List["<b>listToolFiles</b><br/>Discover servers"]
+    Read["<b>readToolFile</b><br/>Load signatures"]
+    Docs["<b>getToolDocs</b><br/>Get detailed docs"]
+    Execute["<b>executeToolCode</b><br/>Run code with bindings"]
+
+    Hidden["<i>All other MCP servers<br/>hidden behind these 4 tools</i>"]
+
+    LLM --> List
+    LLM --> Read
+    LLM --> Docs
+    LLM --> Execute
+
+    List -.-> Hidden
+    Read -.-> Hidden
+    Docs -.-> Hidden
+    Execute -.-> Hidden
+
+    style LLM fill:#E3F2FD,stroke:#0D47A1,stroke-width:2.5px,color:#1A1A1A
+    style List fill:#E8F5E9,stroke:#1B5E20,stroke-width:2.5px,color:#1A1A1A
+    style Read fill:#FFF3E0,stroke:#BF360C,stroke-width:2.5px,color:#1A1A1A
+    style Docs fill:#E1F5FE,stroke:#0288D1,stroke-width:2.5px,color:#1A1A1A
+    style Execute fill:#F3E5F5,stroke:#4A148C,stroke-width:2.5px,color:#1A1A1A
+    style Hidden fill:#EEEEEE,stroke:#424242,stroke-width:1.5px,stroke-dasharray: 5 5,color:#1A1A1A
+```
+
+### The Execution Flow
+
+```mermaid
+graph LR
+    User["<b>1. User Request</b><br/>Search YouTube<br/>& save to file"]
+
+    Discover["<b>2. Discover Tools</b><br/>listToolFiles()"]
+
+    GetDefs["<b>3. Load Definitions</b><br/>readToolFile()"]
+
+    Write["<b>4. Write Code</b><br/>Python<br/>in sandbox"]
+
+    Execute["<b>5. Execute</b><br/>Real MCP calls<br/>contained in VM"]
+
+    Result["<b>6. Compact Result</b><br/>{saved:10}"]
+
+    Response["<b>7. Final Response</b><br/>Found & saved<br/>10 videos"]
+
+    User --> Discover
+    Discover --> GetDefs
+    GetDefs --> Write
+    Write --> Execute
+    Execute --> Result
+    Result --> Response
+
+    style User fill:#E3F2FD,stroke:#0D47A1,stroke-width:2.5px,color:#1A1A1A
+    style Discover fill:#F3E5F5,stroke:#4A148C,stroke-width:2.5px,color:#1A1A1A
+    style GetDefs fill:#F3E5F5,stroke:#4A148C,stroke-width:2.5px,color:#1A1A1A
+    style Write fill:#FFF3E0,stroke:#BF360C,stroke-width:2.5px,color:#1A1A1A
+    style Execute fill:#E8F5E9,stroke:#1B5E20,stroke-width:3px,color:#1A1A1A
+    style Result fill:#FFFDE7,stroke:#F57F17,stroke-width:2.5px,color:#1A1A1A
+    style Response fill:#E8F5E9,stroke:#1B5E20,stroke-width:2.5px,color:#1A1A1A
+```
+
+**Key insight:** All the complex orchestration happens inside the sandbox. The LLM only receives the final, compact result—not every intermediate step.
+
+---
+
+## Why This Matters at Scale
+
+### Classic MCP with 5 servers (100 tools):
+
+```
+Turn 1: Prompt + search query + [100 tool definitions]
+Turn 2: Prompt + search result + [100 tool definitions]
+Turn 3: Prompt + channel list + [100 tool definitions]
+Turn 4: Prompt + video list + [100 tool definitions]
+Turn 5: Prompt + summaries + [100 tool definitions]
+Turn 6: Prompt + doc result + [100 tool definitions]
+
+Total: 6 LLM calls, ~600 tool definitions sent in total
+```
+
+### Code Mode with same 5 servers:
+
+```
+Turn 1: Prompt + 4 tools (listToolFiles, readToolFile, getToolDocs, executeToolCode)
+Turn 2: Prompt + server list + 4 tools
+Turn 3: Prompt + selected definitions + 4 tools + [EXECUTES CODE]
+        [YouTube search, channel list, videos, summaries, doc creation all happen in sandbox]
+Turn 4: Prompt + final result + 4 tools
+
+Total: 3-4 LLM calls, 12-16 meta-tool definitions sent in total (plus definitions read on demand)
+Result: ~50% cost reduction, roughly 1.5-2x fewer LLM round trips
+```
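+
+Counting tool definitions rather than tokens, the comparison works out as follows. The turn counts come from the two traces above; the helper name `definitions_sent` is hypothetical, and definitions loaded on demand through `readToolFile` are not included.
+
+```python
+def definitions_sent(turns, tools_per_turn):
+    """Total tool definitions placed in context across all turns."""
+    return turns * tools_per_turn
+
+
+classic = definitions_sent(turns=6, tools_per_turn=100)
+code_mode = definitions_sent(turns=4, tools_per_turn=4)
+print(classic, code_mode)  # 600 16
+```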
+
+---
+
+## Enabling Code Mode
+
+Code Mode must be enabled **per MCP client**. Once enabled, that client's tools are accessed through the four meta-tools rather than exposed directly.
+
+**Best practice:** Enable Code Mode for 3+ servers or any "heavy" server (web search, documents, databases).
+
+<Tabs>
+<Tab title="Web UI">
+
+### Enable Code Mode for a Client
+
+1. Navigate to **MCP Gateway** in the sidebar
+2. Click on a client row to open the configuration sheet
+
+<Frame>
+  <img src="/media/ui-mcp-edit-server.png" alt="MCP Client Configuration" />
+</Frame>
+
+3. In the **Basic Information** section, toggle **Code Mode Client** to enabled
+4. Click **Save Changes**
+
+Once enabled:
+- This client's tools are no longer in the default tool list
+- They become accessible through `listToolFiles()` and `readToolFile()`
+- The AI can write code using `executeToolCode()` to call them
+
+</Tab>
+<Tab title="API">
+
+```bash
+# When adding a new client
+curl -X POST http://localhost:8080/api/mcp/client \
+  -H "Content-Type: application/json" \
+  -d '{
+    "name": "youtube",
+    "connection_type": "http",
+    "connection_string": "http://localhost:3001/mcp",
+    "tools_to_execute": ["*"],
+    "is_code_mode_client": true
+  }'
+
+# Or update an existing client
+curl -X PUT http://localhost:8080/api/mcp/client/{id} \
+  -H "Content-Type: application/json" \
+  -d '{
+    "name": "youtube",
+    "connection_type": "http",
+    "connection_string": "http://localhost:3001/mcp",
+    "tools_to_execute": ["*"],
+    "is_code_mode_client": true
+  }'
+```
+
+</Tab>
+<Tab title="config.json">
+
+```json
+{
+  "mcp": {
+    "client_configs": [
+      {
+        "name": "youtube",
+        "connection_type": "http",
+        "connection_string": "http://localhost:3001/mcp",
+        "tools_to_execute": ["*"],
+        "is_code_mode_client": true
+      },
+      {
+        "name": "filesystem",
+        "connection_type": "stdio",
+        "stdio_config": {
+          "command": "npx",
+          "args": ["-y", "@anthropic/mcp-filesystem"]
+        },
+        "tools_to_execute": ["*"],
+        "is_code_mode_client": true
+      }
+    ]
+  }
+}
+```
+
+</Tab>
+</Tabs>
+
+### Go SDK Setup
+
+```go
+mcpConfig := &schemas.MCPConfig{
+    ClientConfigs: []schemas.MCPClientConfig{
+        {
+            Name:             "youtube",
+            ConnectionType:   schemas.MCPConnectionTypeHTTP,
+            ConnectionString: bifrost.Ptr("http://localhost:3001/mcp"),
+            ToolsToExecute:   []string{"*"},
+            IsCodeModeClient: true, // Enable code mode
+        },
+        {
+            Name:           "filesystem",
+            ConnectionType: schemas.MCPConnectionTypeSTDIO,
+            StdioConfig: &schemas.MCPStdioConfig{
+                Command: "npx",
+                Args:    []string{"-y", "@anthropic/mcp-filesystem"},
+            },
+            ToolsToExecute:   []string{"*"},
+            IsCodeModeClient: true, // Enable code mode
+        },
+    },
+}
+```
+
+---
+
+## The Four Code Mode Tools
+
+When Code Mode clients are connected, Bifrost automatically adds four meta-tools to every request:
+
+### 1. listToolFiles
+
+Lists all available virtual `.pyi` stub files for connected code mode servers.
+
+**Example output (Server-level binding):**
+```
+servers/
+  youtube.pyi
+  filesystem.pyi
+```
+
+**Example output (Tool-level binding):**
+```
+servers/
+  youtube/
+    search.pyi
+    get_video.pyi
+  filesystem/
+    read_file.pyi
+    write_file.pyi
+```
+
+### 2. readToolFile
+
+Reads a virtual `.pyi` file to get compact Python function signatures for tools.
+
+**Parameters:**
+- `fileName` (required): Path like `servers/youtube.pyi` or `servers/youtube/search.pyi`
+- `startLine` (optional): 1-based starting line for partial reads
+- `endLine` (optional): 1-based ending line for partial reads
+
+**Example output:**
+```python
+# youtube server tools
+# Usage: youtube.tool_name(param=value)
+# For detailed docs: use getToolDocs(server="youtube", tool="tool_name")
+
+def search(query: str, maxResults: int = None) -> dict:  # Search for videos
+def get_video(id: str) -> dict:  # Get video details
+```
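+
+A partial read with `startLine` and `endLine` can be pictured as a 1-based slice. This sketch assumes `endLine` is inclusive, which the parameter list does not state; `read_lines` is a hypothetical helper, not the Bifrost implementation.
+
+```python
+def read_lines(text, start_line=None, end_line=None):
+    """Return lines start_line..end_line (1-based, end assumed inclusive)."""
+    lines = text.splitlines()
+    start = (start_line or 1) - 1
+    end = end_line if end_line is not None else len(lines)
+    return "\n".join(lines[start:end])
+
+
+stub = "# youtube server tools\ndef search(query): ...\ndef get_video(id): ..."
+print(read_lines(stub, start_line=2, end_line=2))
+```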
+
+### 3. getToolDocs
+
+Get detailed documentation for a specific tool when the compact signature from `readToolFile` is not sufficient.
+
+**Parameters:**
+- `server` (required): The server name (e.g., `"youtube"`)
+- `tool` (required): The tool name (e.g., `"search"`)
+
+**Example output:**
+```python
+# ============================================================================
+# Documentation for youtube.search tool
+# ============================================================================
+#
+# USAGE INSTRUCTIONS:
+# Call tools using: result = youtube.tool_name(param=value)
+# No async/await needed - calls are synchronous.
+#
+# CRITICAL - HANDLING RESPONSES:
+# Tool responses are dicts. To avoid runtime errors:
+# 1. Use print(result) to inspect the response structure first
+# 2. Access dict values with brackets: result["key"] NOT result.key
+# 3. Use .get() for safe access: result.get("key", default)
+# ============================================================================
+
+def search(query: str, maxResults: int = None) -> dict:
+    """
+    Search for videos on YouTube.
+
+    Args:
+        query (str): Search query (required)
+        maxResults (int): Max results to return (optional)
+
+    Returns:
+        dict: Response from the tool. Structure varies by tool.
+              Use print(result) to inspect the actual structure.
+
+    Example:
+        result = youtube.search(query="...")
+        print(result)  # Always inspect response first!
+        value = result.get("key", default)  # Safe access
+    """
+    ...
+```
+
+### 4. executeToolCode
+
+Executes Python code in a sandboxed Starlark interpreter with access to all code mode server tools.
+
+**Parameters:**
+- `code` (required): Python code to execute
+
+**Execution Environment:**
+- Python code runs in a Starlark interpreter (Python subset)
+- All code mode servers are exposed as global objects (e.g., `youtube`, `filesystem`)
+- Tool calls are **synchronous** - no async/await needed
+- Use `print()` for logging (output captured in logs)
+- Assign to `result` variable to return a value
+- Tool execution timeout applies (default 30s)
+
+**Syntax notes:**
+- Use keyword arguments: `server.tool(param="value")` NOT `server.tool({"param": "value"})`
+- Access dict values with brackets: `result["key"]` NOT `result.key`
+- List comprehensions work: `[x for x in items if x["active"]]`
+
+**Example code:**
+```python
+# Search YouTube and return formatted results
+results = youtube.search(query="AI news", maxResults=5)
+titles = [item["snippet"]["title"] for item in results["items"]]
+print("Found", len(titles), "videos")
+result = {"titles": titles, "count": len(titles)}
+```
+
+---
+
+## Binding Levels
+
+Code Mode supports two binding levels that control how tools are organized in the virtual file system:
+
+### Server-Level Binding (Default)
+
+All tools from a server are grouped into a single `.pyi` file.
+
+```
+servers/
+  youtube.pyi        ← Contains all youtube tools
+  filesystem.pyi     ← Contains all filesystem tools
+```
+
+**Best for:**
+- Servers with few tools
+- When you want to see all tools at once
+- Simpler discovery workflow
+
+### Tool-Level Binding
+
+Each tool gets its own `.pyi` file.
+
+```
+servers/
+  youtube/
+    search.pyi
+    get_video.pyi
+    get_channel.pyi
+  filesystem/
+    read_file.pyi
+    write_file.pyi
+    list_directory.pyi
+```
+
+**Best for:**
+- Servers with many tools
+- When tools have large/complex schemas
+- More focused documentation per tool
+
+### Configuring Binding Level
+
+Binding level is a **global setting** that controls how Code Mode's virtual file system is organized. It affects how the AI discovers and loads tool definitions.
+
+<Tabs>
+<Tab title="Web UI">
+
+Binding level can be viewed in the MCP configuration overview:
+
+<Frame>
+  <img src="/media/ui-mcp-config.png" alt="MCP Gateway Configuration" />
+</Frame>
+
+- **Server-level (default)**: One `.pyi` file per MCP server
+  - Use when: 5-20 tools per server, want simple discovery
+  - Example: `servers/youtube.pyi` contains all YouTube tools
+
+- **Tool-level**: One `.pyi` file per individual tool
+  - Use when: 30+ tools per server, want minimal context bloat
+  - Example: `servers/youtube/search.pyi`, `servers/youtube/list_channels.pyi`
+
+Both modes use the same four-tool interface (`listToolFiles`, `readToolFile`, `getToolDocs`, `executeToolCode`). The choice is purely about **context efficiency per read operation**.
+
+</Tab>
+<Tab title="config.json">
+
+```json
+{
+  "mcp": {
+    "tool_manager_config": {
+      "code_mode_binding_level": "server"
+    }
+  }
+}
+```
+
+Options: `"server"` (default) or `"tool"`
+
+</Tab>
+<Tab title="Go SDK">
+
+```go
+mcpConfig := &schemas.MCPConfig{
+    ToolManagerConfig: &schemas.MCPToolManagerConfig{
+        CodeModeBindingLevel: schemas.CodeModeBindingLevelTool, // or CodeModeBindingLevelServer
+    },
+    ClientConfigs: []schemas.MCPClientConfig{
+        // ... clients
+    },
+}
+```
+
+</Tab>
+</Tabs>
+
+---
+
+## Auto-Execution with Code Mode
+
+Code Mode tools can be auto-executed in [Agent Mode](./agent-mode), but with **additional validation**:
+
+1. The `listToolFiles` and `readToolFile` tools are always auto-executable (they're read-only)
+2. The `executeToolCode` tool is auto-executable **only if** all tool calls within the code are allowed
+
+### How Validation Works
+
+When `executeToolCode` is called in agent mode:
+
+1. Bifrost parses the Python code
+2. Extracts all `serverName.toolName()` calls
+3. Checks each call against `tools_to_auto_execute` for that server
+4. If ALL calls are allowed → auto-execute
+5. If ANY call is not allowed → return to user for approval
+
+**Example:**
+```json
+{
+  "name": "youtube",
+  "tools_to_execute": ["*"],
+  "tools_to_auto_execute": ["search"],
+  "is_code_mode_client": true
+}
+```
+
+```python
+# This code WILL auto-execute (only uses search)
+results = youtube.search(query="AI")
+result = results
+
+# This code will NOT auto-execute (uses delete_video which is not in auto-execute list)
+youtube.delete_video(id="abc123")
+```
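+
+The validation steps above can be approximated with Python's standard `ast` module. This is a sketch under stated assumptions: it uses the CPython parser as a stand-in for a Starlark parser, `extract_tool_calls` and `can_auto_execute` are hypothetical names, and attribute calls on names that are not known servers (for example `result.get(...)`) are skipped.
+
+```python
+import ast
+
+
+def extract_tool_calls(code):
+    """Return the set of (object, attribute) pairs called as obj.attr(...)."""
+    calls = set()
+    for node in ast.walk(ast.parse(code)):
+        if (isinstance(node, ast.Call)
+                and isinstance(node.func, ast.Attribute)
+                and isinstance(node.func.value, ast.Name)):
+            calls.add((node.func.value.id, node.func.attr))
+    return calls
+
+
+def can_auto_execute(code, auto_by_server):
+    """True only if every server.tool call is in that server's auto-execute list."""
+    for server, tool in extract_tool_calls(code):
+        if server not in auto_by_server:
+            continue  # not a server object, e.g. a method call on a local dict
+        allowed = auto_by_server[server]
+        if "*" not in allowed and tool not in allowed:
+            return False
+    return True
+
+
+auto = {"youtube": ["search"]}
+print(can_auto_execute('results = youtube.search(query="AI")\nresult = results', auto))  # True
+print(can_auto_execute('youtube.delete_video(id="abc123")', auto))  # False
+```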
+
+---
+
+## Code Execution Environment
+
+### Available APIs
+
+| Available | Not Available |
+|-----------|---------------|
+| Python-like syntax | `import` statements |
+| Synchronous tool calls | Classes (use dicts) |
+| `print()` for logging | File I/O |
+| Dict/List operations | Network access |
+| List comprehensions | `random`, `time` modules |
+
+### Runtime Environment Details
+
+**Engine:** Starlark interpreter (Python subset)
+
+**Tool Exposure:** Tools from code mode clients are exposed as global objects:
+```python
+# If you have a 'youtube' code mode client with a 'search' tool
+results = youtube.search(query="AI news")
+```
+
+**Code Processing:**
+1. Code is validated for syntax errors
+2. Tool calls are extracted and validated
+3. Code executes in isolated Starlark context
+4. The `result` variable is automatically serialized to JSON
+
+**Execution Limits:**
+- Default timeout: 30 seconds per tool execution
+- Memory isolation: Each execution gets its own context
+- No access to host file system or network
+- Logs captured from print() calls
+
+### Error Handling
+
+Bifrost provides detailed error messages with hints:
+
+```python
+# Error: youtube is not defined
+# Hints:
+# - Variable or identifier 'youtube' is not defined
+# - Available server keys: youtubeAPI, filesystem
+# - Use one of the available server keys as the object name
+```
+
+### Timeouts
+
+- Default: 30 seconds per tool call
+- Configure via `tool_execution_timeout` in `tool_manager_config`
+- Long-running operations are interrupted with a timeout error
+
+---
+
+## Real-World Impact Comparison
+
+### Scenario: E-commerce Assistant with Multiple Services
+
+**Setup:**
+- 10 MCP servers (product catalog, inventory, payments, shipping, chat, analytics, docs, images, calendar, notifications)
+- Average 15 tools per server = **150 total tools**
+- Complex multi-step task: "Find matching products, check inventory, compare prices, get shipping estimate, create quote"
+
+### Classic MCP Results
+
+| Metric | Value |
+|--------|-------|
+| LLM Turns | 8-10 |
+| Tokens in Tool Defs | ~2,400 per turn |
+| Avg Request Tokens | 4,000-5,000 |
+| Avg Total Cost | $3.20-4.00 |
+| Latency | 18-25 seconds |
+
+**Problem:** Most of the context goes to tool definitions, the model makes redundant tool calls, and every intermediate result travels back through the LLM.
+
+### Code Mode Results
+
+| Metric | Value |
+|--------|-------|
+| LLM Turns | 3-4 |
+| Tokens in Tool Defs | ~100-300 per turn |
+| Avg Request Tokens | 1,500-2,000 |
+| Avg Total Cost | $1.20-1.80 |
+| Latency | 8-12 seconds |
+
+**Benefit:** The model writes a single Python-like (Starlark) script. All orchestration happens in the sandbox, and only the compact result is returned to the LLM.
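+
+The two tables imply the relative savings below. This is only a sketch of the arithmetic: it uses the midpoint of each reported range, which is a choice made here for illustration and not part of the original measurements.
+
+```python
+# Midpoints of the ranges reported in the two tables above.
+classic = {"turns": (8 + 10) / 2, "request_tokens": (4000 + 5000) / 2,
+           "cost_usd": (3.20 + 4.00) / 2, "latency_s": (18 + 25) / 2}
+code_mode = {"turns": (3 + 4) / 2, "request_tokens": (1500 + 2000) / 2,
+             "cost_usd": (1.20 + 1.80) / 2, "latency_s": (8 + 12) / 2}
+
+# Relative reduction per metric, as a percentage rounded to one decimal.
+reduction = {
+    metric: round(100 * (1 - code_mode[metric] / classic[metric]), 1)
+    for metric in classic
+}
+
+for metric, pct in reduction.items():
+    print(f"{metric}: {pct}% lower with Code Mode")
+```
+
+On these midpoints, every metric drops by roughly 53 to 61 percent.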
+
+---
+
+## Next Steps
+
+<CardGroup cols={2}>
+  <Card title="Agent Mode" icon="robot" href="./agent-mode">
+    Combine Code Mode with auto-execution
+  </Card>
+  <Card title="MCP Gateway URL" icon="server" href="./gateway-url">
+    Expose your tools to external clients
+  </Card>
+</CardGroup>
--- a/docs/mcp/connecting-to-servers.mdx
+++ b/docs/mcp/connecting-to-servers.mdx
@@ -0,0 +1,950 @@
+---
+title: "Connecting to MCP Servers"
+sidebarTitle: "Connecting to Servers"
+description: "Connect Bifrost to external MCP servers via STDIO, HTTP, or SSE protocols."
+icon: "plug"
+---
+
+## Overview
+
+Bifrost can connect to any MCP-compatible server to discover and execute tools. Each connection is called an **MCP Client** in Bifrost terminology.
+
+## Connection Types
+
+Bifrost supports three connection protocols, each with different authentication options:
+
+| Type | Description | Best For | Auth Support |
+|------|-------------|----------|--------------|
+| **STDIO** | Spawns a subprocess and communicates via stdin/stdout | Local tools, CLI utilities, scripts | None |
+| **HTTP** | Sends requests to an HTTP endpoint | Remote APIs, microservices, cloud functions | Headers, OAuth 2.0, Per-User OAuth |
+| **SSE** | Server-Sent Events for persistent connections | Real-time data, streaming tools | Headers, OAuth 2.0, Per-User OAuth |
+
+### STDIO Connections
+
+STDIO connections launch external processes and communicate via standard input/output. Best for local tools and scripts.
+
+```json
+{
+  "name": "filesystem",
+  "connection_type": "stdio",
+  "stdio_config": {
+    "command": "npx",
+    "args": ["-y", "@anthropic/mcp-filesystem"],
+    "envs": ["HOME", "PATH"]
+  },
+  "tools_to_execute": ["*"]
+}
+```
+
+**Use Cases:**
+- Local filesystem operations
+- Python/Node.js MCP servers
+- CLI utilities and scripts
+- Database tools with local credentials
+
+<Warning>
+**Docker Users:** When running Bifrost in Docker, STDIO connections may not work if the required commands (e.g., `npx`, `python`) are not installed in the container. For STDIO-based MCP servers, build a custom Docker image that includes the necessary dependencies, or use HTTP/SSE connections to externally hosted MCP servers.
+</Warning>
+
+### HTTP Connections
+
+HTTP connections communicate with MCP servers via HTTP requests. Ideal for remote services and microservices.
+
+HTTP connections support three authentication methods:
+- **Header-based authentication**: Static headers (API keys, custom tokens)
+- **OAuth 2.0**: Shared token managed by an admin, with automatic refresh
+- **Per-User OAuth**: Each end-user authenticates with their own credentials
+
+#### Header-Based Authentication
+
+Use static headers for API keys and custom authentication tokens:
+
+```json
+{
+  "name": "web_search",
+  "connection_type": "http",
+  "connection_string": "https://mcp-server.example.com/mcp",
+  "auth_type": "headers",
+  "headers": {
+    "Authorization": "Bearer your-api-key",
+    "X-Custom-Header": "value"
+  },
+  "tools_to_execute": ["*"]
+}
+```
+
+**Use Cases:**
+- Static API keys
+- Bearer token authentication
+- Custom header-based auth schemes
+
+#### OAuth 2.0 Authentication
+
+Use OAuth 2.0 for a shared, admin-managed connection with automatic token refresh:
+
+```json
+{
+  "name": "web_search",
+  "connection_type": "http",
+  "connection_string": "https://mcp-server.example.com/mcp",
+  "auth_type": "oauth",
+  "oauth_config": {
+    "client_id": "your-client-id",
+    "client_secret": "your-client-secret",
+    "authorize_url": "https://auth.example.com/authorize",
+    "token_url": "https://auth.example.com/token",
+    "scopes": ["read", "write"]
+  },
+  "tools_to_execute": ["*"]
+}
+```
+
+**Features:**
+- Automatic token refresh before expiration
+- PKCE support for public clients
+- Dynamic client registration (RFC 7591)
+- OAuth discovery from server URLs
+
+[→ Learn more about OAuth authentication →](./oauth)
+
+**Use Cases:**
+- Shared service integrations where all users access the same account
+- Admin-managed third-party connections
+- Compliance with OAuth 2.0 standards
+
+#### Per-User OAuth
+
+Use per-user OAuth when each end-user should access the upstream service under their own account (e.g., each user's personal Notion workspace or GitHub repos). Bifrost acts as an OAuth 2.1 Authorization Server — users authenticate through a consent flow and their tokens are stored per-identity.
+
+Per-user OAuth is configured through the Web UI only (Bifrost runs a test OAuth flow and pre-fetches tools at setup time).
+
+[→ Learn more about Per-User OAuth →](./per-user-oauth)
+
+**Use Cases:**
+- Multi-tenant apps where users access their own data
+- Personal integrations (Notion, GitHub, Google Drive)
+- Scenarios requiring per-user audit trails and token isolation
+
+**Overall HTTP Use Cases:**
+- Remote API integrations
+- Cloud-hosted MCP services
+- Microservice architectures
+- Third-party tool providers
+
+### SSE Connections
+
+Server-Sent Events (SSE) connections provide real-time, persistent connections to MCP servers. Like HTTP connections, SSE supports header-based authentication, OAuth 2.0, and per-user OAuth.
+
+#### Header-Based Authentication
+
+```json
+{
+  "name": "live_data",
+  "connection_type": "sse",
+  "connection_string": "https://stream.example.com/mcp/sse",
+  "auth_type": "headers",
+  "headers": {
+    "Authorization": "Bearer your-api-key"
+  },
+  "tools_to_execute": ["*"]
+}
+```
+
+#### OAuth 2.0 Authentication
+
+```json
+{
+  "name": "live_data",
+  "connection_type": "sse",
+  "connection_string": "https://stream.example.com/mcp/sse",
+  "auth_type": "oauth",
+  "oauth_config": {
+    "client_id": "your-client-id",
+    "authorize_url": "https://auth.example.com/authorize",
+    "token_url": "https://auth.example.com/token",
+    "scopes": ["stream:read"]
+  },
+  "tools_to_execute": ["*"]
+}
+```
+
+**Use Cases:**
+- Real-time market data
+- Live system monitoring
+- Event-driven workflows
+- Shared streaming connections managed by an admin
+
+#### Per-User OAuth
+
+Same as HTTP — each user authenticates under their own account. Configured through the Web UI only.
+
+[→ Learn more about Per-User OAuth →](./per-user-oauth)
+
+[→ Learn more about OAuth authentication →](./oauth)
+
+---
+
+## Gateway Setup
+
+<Tabs>
+<Tab title="Web UI">
+
+### Adding an MCP Client
+
+1. Navigate to **MCP Gateway** in the sidebar - you'll see a table of all registered servers
+
+<Frame>
+  <img src="/media/ui-mcp-servers-table.png" alt="MCP Servers Table" />
+</Frame>
+
+2. Click **New MCP Server** button to open the creation form
+
+3. Fill in the connection details:
+
+<Frame>
+  <img src="/media/ui-mcp-new-server.png" alt="Add MCP Client Form" />
+</Frame>
+
+**Fields:**
+- **Name**: Unique identifier (ASCII only, no spaces or hyphens, cannot start with a number)
+- **Connection Type**: STDIO, HTTP, or SSE
+- **For STDIO**: Command, arguments, and environment variables
+- **For HTTP/SSE**: Connection URL
+
+4. Click **Create** to connect
+
+### Viewing and Managing Connected Tools
+
+Once connected, click on any client row to open the configuration sheet:
+
+<Frame>
+  <img src="/media/ui-mcp-tool-config.png" alt="MCP Client Configuration and Tools" />
+</Frame>
+
+Here you can:
+- View all discovered tools with their descriptions and parameters
+- Enable/disable individual tools via toggle switches
+- Configure auto-execution for specific tools
+- Edit custom headers for HTTP/SSE connections
+- View the full connection configuration as JSON
+
+</Tab>
+<Tab title="API">
+
+### Add STDIO Client
+
+```bash
+curl -X POST http://localhost:8080/api/mcp/client \
+  -H "Content-Type: application/json" \
+  -d '{
+    "name": "filesystem",
+    "connection_type": "stdio",
+    "stdio_config": {
+      "command": "npx",
+      "args": ["-y", "@anthropic/mcp-filesystem"],
+      "envs": ["HOME", "PATH"]
+    },
+    "tools_to_execute": ["*"]
+  }'
+```
+
+### Add HTTP Client
+
+```bash
+curl -X POST http://localhost:8080/api/mcp/client \
+  -H "Content-Type: application/json" \
+  -d '{
+    "name": "web_search",
+    "connection_type": "http",
+    "connection_string": "http://localhost:3001/mcp",
+    "tools_to_execute": ["*"]
+  }'
+```
+
+### Add SSE Client
+
+```bash
+curl -X POST http://localhost:8080/api/mcp/client \
+  -H "Content-Type: application/json" \
+  -d '{
+    "name": "realtime_data",
+    "connection_type": "sse",
+    "connection_string": "https://api.example.com/mcp/sse",
+    "tools_to_execute": ["*"]
+  }'
+```
+
+### List All Clients
+
+```bash
+curl http://localhost:8080/api/mcp/clients
+```
+
+Response:
+```json
+[
+  {
+    "config": {
+      "id": "abc123",
+      "name": "filesystem",
+      "connection_type": "stdio",
+      "stdio_config": {
+        "command": "npx",
+        "args": ["-y", "@anthropic/mcp-filesystem"]
+      }
+    },
+    "tools": [
+      {"name": "read_file", "description": "Read contents of a file"},
+      {"name": "write_file", "description": "Write contents to a file"},
+      {"name": "list_directory", "description": "List directory contents"}
+    ],
+    "state": "connected"
+  }
+]
+```
+
+</Tab>
+<Tab title="config.json">
+
+Configure MCP clients in your `config.json`:
+
+```json
+{
+  "mcp": {
+    "client_configs": [
+      {
+        "name": "filesystem",
+        "connection_type": "stdio",
+        "is_ping_available": true,
+        "stdio_config": {
+          "command": "npx",
+          "args": ["-y", "@anthropic/mcp-filesystem"],
+          "envs": ["HOME", "PATH"]
+        },
+        "tools_to_execute": ["*"]
+      },
+      {
+        "name": "web_search",
+        "connection_type": "http",
+        "connection_string": "env.WEB_SEARCH_MCP_URL",
+        "is_ping_available": false,
+        "tools_to_execute": ["search", "fetch_url"]
+      },
+      {
+        "name": "database",
+        "connection_type": "sse",
+        "connection_string": "https://db-mcp.example.com/sse",
+        "is_ping_available": true,
+        "tools_to_execute": []
+      }
+    ]
+  }
+}
+```
+
+<Note>
+Use `env.VARIABLE_NAME` syntax to reference environment variables for sensitive values like URLs with API keys.
+</Note>
+
+</Tab>
+</Tabs>
+
+---
+
+## Go SDK Setup
+
+Configure MCP in your Bifrost initialization:
+
+```go
+package main
+
+import (
+    "context"
+    bifrost "github.com/maximhq/bifrost/core"
+    "github.com/maximhq/bifrost/core/schemas"
+)
+
+func main() {
+    mcpConfig := &schemas.MCPConfig{
+        ClientConfigs: []schemas.MCPClientConfig{
+            {
+                Name:             "filesystem",
+                ConnectionType:   schemas.MCPConnectionTypeSTDIO,
+                IsPingAvailable:  true,  // Use lightweight ping for health checks
+                StdioConfig: &schemas.MCPStdioConfig{
+                    Command: "npx",
+                    Args:    []string{"-y", "@anthropic/mcp-filesystem"},
+                    Envs:    []string{"HOME", "PATH"},
+                },
+                ToolsToExecute: []string{"*"},
+            },
+            {
+                Name:             "web_search",
+                ConnectionType:   schemas.MCPConnectionTypeHTTP,
+                ConnectionString: bifrost.Ptr("http://localhost:3001/mcp"),
+                IsPingAvailable:  false,  // Use listTools for health checks
+                ToolsToExecute:   []string{"search", "fetch_url"},
+            },
+        },
+    }
+
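+    // `account` is your own schemas.Account implementation (provider keys and configs).
+    client, err := bifrost.Init(context.Background(), schemas.BifrostConfig{
+        Account:   account,
+        MCPConfig: mcpConfig,
+        Logger:    bifrost.NewDefaultLogger(schemas.LogLevelInfo),
+    })
+    if err != nil {
+        panic(err)
+    }
+    _ = client // use client to send requests; avoids Go's "declared and not used" error
+}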
+```
+
+### Tools To Execute Semantics
+
+The `ToolsToExecute` field controls which tools from the client are available:
+
+| Value | Behavior |
+|-------|----------|
+| `["*"]` | All tools from this client are included |
+| `[]` or `nil` | No tools included (deny-by-default) |
+| `["tool1", "tool2"]` | Only specified tools are included |
+
+### Tools To Auto Execute (Agent Mode)
+
+The `ToolsToAutoExecute` field controls which tools can be automatically executed in [Agent Mode](./agent-mode):
+
+| Value | Behavior |
+|-------|----------|
+| `["*"]` | All tools are auto-executed |
+| `[]` or `nil` | No tools are auto-executed (manual approval required) |
+| `["tool1", "tool2"]` | Only specified tools are auto-executed |
+
+<Note>
+A tool must be in **both** `ToolsToExecute` and `ToolsToAutoExecute` to be auto-executed. If a tool is in `ToolsToAutoExecute` but not in `ToolsToExecute`, it will be skipped.
+</Note>
+
+**Example configuration:**
+
+```go
+{
+    Name:           "filesystem",
+    ConnectionType: schemas.MCPConnectionTypeSTDIO,
+    StdioConfig: &schemas.MCPStdioConfig{
+        Command: "npx",
+        Args:    []string{"-y", "@anthropic/mcp-filesystem"},
+    },
+    ToolsToExecute:     []string{"*"},                              // All tools available
+    ToolsToAutoExecute: []string{"read_file", "list_directory"},    // Only these auto-execute
+}
+```
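+
+The semantics of the two lists can be sketched in a few lines. This is an illustration of the tables and note above, not Bifrost's implementation, and the function names are hypothetical.
+
+```python
+def is_available(tool, tools_to_execute):
+    """Apply the allowlist semantics from the tables above."""
+    if not tools_to_execute:            # [] or None: deny-by-default
+        return False
+    if tools_to_execute == ["*"]:       # wildcard: every tool is included
+        return True
+    return tool in tools_to_execute     # explicit allowlist
+
+
+def is_auto_executable(tool, tools_to_execute, tools_to_auto_execute):
+    """A tool auto-executes only if it passes both lists."""
+    return is_available(tool, tools_to_execute) and is_available(
+        tool, tools_to_auto_execute
+    )
+
+
+# Mirrors the example configuration above.
+execute = ["*"]
+auto = ["read_file", "list_directory"]
+print(is_auto_executable("read_file", execute, auto))   # True
+print(is_auto_executable("write_file", execute, auto))  # False
+```
+
+With this configuration, `write_file` stays available to the model but is returned to your application for approval instead of running automatically.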
+
+---
+
+## Environment Variables
+
+Use environment variables for sensitive configuration values:
+
+**Gateway (config.json):**
+```json
+{
+  "name": "secure_api",
+  "connection_type": "http",
+  "connection_string": "env.SECURE_MCP_URL"
+}
+```
+
+**Go SDK:**
+```go
+{
+    Name:             "secure_api",
+    ConnectionType:   schemas.MCPConnectionTypeHTTP,
+    ConnectionString: bifrost.Ptr(os.Getenv("SECURE_MCP_URL")),
+}
+```
+
+Environment variables are:
+- Automatically resolved during client connection
+- Redacted in API responses and UI for security
+- Validated at startup to ensure all required variables are set
+
+---
+
+## Forwarding Request Headers to MCP Servers
+
+<Info>
+Header Forwarding is available in **v1.5.0-prerelease1 and above**.
+</Info>
+
+By default, Bifrost does not forward incoming request headers to MCP servers during tool execution. The `allowed_extra_headers` field lets you define a per-client allowlist of headers that callers may inject at request time and have forwarded to that MCP server when tools are executed.
+
+This is separate from the static `headers` field used for authentication:
+
+| Field | Purpose | When sent |
+|-------|---------|-----------|
+| `headers` | Static auth credentials (API keys, tokens) | Always, on every tool call |
+| `allowed_extra_headers` | Dynamic per-request headers from callers | Only when the caller provides them, and only if they match the allowlist |
+
+**Common use cases:**
+- Forwarding a user's auth token to an MCP server that enforces per-user authorization
+- Passing a tenant or org ID to a multi-tenant MCP server
+- Propagating trace or correlation IDs for end-to-end observability
+
+### How It Works
+
+1. An incoming request carries one or more headers matching a client's `allowed_extra_headers` pattern
+2. Bifrost captures those headers from the request (using the union of all clients' allowlists)
+3. At tool execution time, each client **re-checks** the header against its own allowlist — so the same header can be forwarded to one MCP server but not another
+
+<Note>
+Headers are matched case-insensitively. The only wildcard supported is a standalone `"*"` (allow all headers) — partial patterns like `x-tenant-*` are not supported. If `"*"` is used, it must be the only entry in the list.
+</Note>
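+
+The per-client check can be pictured as follows. This is a minimal sketch of the matching rules stated above (case-insensitive names, standalone `"*"` only), with a hypothetical function name. Whether Bifrost excludes any headers even under `"*"` is not covered here.
+
+```python
+def filter_extra_headers(incoming, allowed_extra_headers):
+    """Keep only the incoming headers that the client's allowlist permits."""
+    if not allowed_extra_headers:
+        return {}                       # no allowlist: forward nothing
+    if allowed_extra_headers == ["*"]:
+        return dict(incoming)           # standalone wildcard: forward everything
+    allowed = {name.lower() for name in allowed_extra_headers}
+    return {k: v for k, v in incoming.items() if k.lower() in allowed}
+
+
+request_headers = {"X-User-Token": "eyJhbGci...", "x-tenant-id": "acme-corp",
+                   "x-debug": "1"}
+# The same request is filtered separately for each client's allowlist.
+print(filter_extra_headers(request_headers, ["x-user-token", "x-tenant-id"]))
+print(filter_extra_headers(request_headers, ["x-tenant-id"]))
+```
+
+Because the check runs per client, the first client receives both the token and the tenant ID, while the second receives only the tenant ID.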
+
+<Tabs>
+<Tab title="UI">
+
+**Configure:** Navigate to **MCP Gateway**, open the configuration sheet for an HTTP or SSE client, and set the **Allowed Extra Headers** field:
+
+<Frame>
+  <img src="/media/ui-mcp-allowed-extra-headers.png" alt="Allowed Extra Headers configuration in the MCP client edit sheet" />
+</Frame>
+
+**Send headers:** Include the allowed headers in any inference request to the LLM gateway:
+
+```bash
+curl -X POST http://localhost:8080/v1/chat/completions \
+  -H "Content-Type: application/json" \
+  -H "x-user-token: eyJhbGci..." \
+  -H "x-tenant-id: acme-corp" \
+  -d '{
+    "model": "openai/gpt-4o",
+    "messages": [{"role": "user", "content": "Look up my account details"}]
+  }'
+```
+
+</Tab>
+<Tab title="Management API">
+
+**Configure:** Include `allowed_extra_headers` when creating or updating a client:
+
+```bash
+curl -X POST http://localhost:8080/api/mcp/client \
+  -H "Content-Type: application/json" \
+  -d '{
+    "name": "my_api",
+    "connection_type": "http",
+    "connection_string": "https://mcp.example.com/mcp",
+    "auth_type": "headers",
+    "headers": {
+      "Authorization": "Bearer service-token"
+    },
+    "allowed_extra_headers": ["x-user-token", "x-tenant-id", "x-request-id"],
+    "tools_to_execute": ["*"]
+  }'
+```
+
+**Send headers:** Include the allowed headers in any inference request:
+
+```bash
+curl -X POST http://localhost:8080/v1/chat/completions \
+  -H "Content-Type: application/json" \
+  -H "x-user-token: eyJhbGci..." \
+  -H "x-tenant-id: acme-corp" \
+  -d '{
+    "model": "openai/gpt-4o",
+    "messages": [{"role": "user", "content": "Look up my account details"}]
+  }'
+```
+
+</Tab>
+<Tab title="Config File">
+
+**Configure:**
+
+```json
+{
+  "mcp": {
+    "client_configs": [
+      {
+        "name": "my_api",
+        "connection_type": "http",
+        "connection_string": "https://mcp.example.com/mcp",
+        "auth_type": "headers",
+        "headers": {
+          "Authorization": "Bearer service-token"
+        },
+        "allowed_extra_headers": ["x-user-token", "x-tenant-id", "x-request-id"],
+        "tools_to_execute": ["*"]
+      }
+    ]
+  }
+}
+```
+
+**Send headers:** Include the allowed headers in any inference request:
+
+```bash
+curl -X POST http://localhost:8080/v1/chat/completions \
+  -H "Content-Type: application/json" \
+  -H "x-user-token: eyJhbGci..." \
+  -H "x-tenant-id: acme-corp" \
+  -d '{
+    "model": "openai/gpt-4o",
+    "messages": [{"role": "user", "content": "Look up my account details"}]
+  }'
+```
+
+</Tab>
+<Tab title="MCP Gateway (/mcp)">
+
+**Configure** the client as above (Web UI, Management API, or config.json).
+
+**Send headers:** When an external MCP client (e.g., Claude Desktop, Cursor) connects to Bifrost's `/mcp` endpoint, include the allowed headers in that HTTP request. Bifrost forwards them during any tool call made within that session:
+
+```json
+{
+  "mcpServers": {
+    "bifrost": {
+      "url": "http://localhost:8080/mcp",
+      "headers": {
+        "x-user-token": "eyJhbGci...",
+        "x-tenant-id": "acme-corp"
+      }
+    }
+  }
+}
+```
+
+<Note>
+Header support in MCP client config varies by client. The above JSON format applies to clients that support custom headers (e.g., Claude Desktop, Cursor). Check your MCP client's documentation for the exact configuration syntax.
+</Note>
+
+</Tab>
+<Tab title="Go SDK">
+
+**Configure:**
+
+```go
+schemas.MCPClientConfig{
+    Name:             "my_api",
+    ConnectionType:   schemas.MCPConnectionTypeHTTP,
+    ConnectionString: bifrost.Ptr("https://mcp.example.com/mcp"),
+    AuthType:         schemas.MCPAuthTypeHeaders,
+    Headers: map[string]schemas.EnvVar{
+        "Authorization": {Value: "Bearer service-token"},
+    },
+    AllowedExtraHeaders: schemas.WhiteList{"x-user-token", "x-tenant-id", "x-request-id"},
+    ToolsToExecute: []string{"*"},
+}
+```
+
+**Send headers:** Set `BifrostContextKeyMCPExtraHeaders` on the context before calling `ChatCompletionRequest` or `ExecuteChatMCPTool`:
+
+```go
+bifrostCtx := schemas.NewBifrostContext(context.Background(), schemas.NoDeadline)
+bifrostCtx.SetValue(schemas.BifrostContextKeyMCPExtraHeaders, map[string][]string{
+    "x-user-token": {"eyJhbGci..."},
+    "x-tenant-id":  {"acme-corp"},
+})
+
+response, err := client.ChatCompletionRequest(bifrostCtx, request)
+```
+
+</Tab>
+</Tabs>
+
+---
+
+## Client State Management
+
+### Connection States
+
+| State | Description |
+|-------|-------------|
+| `connected` | Client is active and tools are available |
+| `connecting` | Client is establishing connection |
+| `disconnected` | Client lost connection but can be reconnected |
+| `error` | Client configuration or connection failed |
+
+### Managing Clients at Runtime
+
+<Tabs>
+<Tab title="Gateway API">
+
+**Reconnect a client:**
+```bash
+curl -X POST http://localhost:8080/api/mcp/client/{id}/reconnect
+```
+
+**Edit client configuration:**
+```bash
+curl -X PUT http://localhost:8080/api/mcp/client/{id} \
+  -H "Content-Type: application/json" \
+  -d '{
+    "name": "filesystem",
+    "connection_type": "stdio",
+    "stdio_config": {
+      "command": "npx",
+      "args": ["-y", "@anthropic/mcp-filesystem"]
+    },
+    "tools_to_execute": ["read_file", "list_directory"]
+  }'
+```
+
+**Remove a client:**
+```bash
+curl -X DELETE http://localhost:8080/api/mcp/client/{id}
+```
+
+</Tab>
+<Tab title="Go SDK">
+
+```go
+// Get all connected clients
+clients, err := client.GetMCPClients()
+for _, mcpClient := range clients {
+    fmt.Printf("Client: %s, State: %s, Tools: %d\n",
+        mcpClient.Config.Name,
+        mcpClient.State,
+        len(mcpClient.Tools))
+}
+
+// Reconnect a disconnected client
+err = client.ReconnectMCPClient("filesystem")
+
+// Add new client at runtime
+err = client.AddMCPClient(schemas.MCPClientConfig{
+    Name:           "new_client",
+    ConnectionType: schemas.MCPConnectionTypeHTTP,
+    ConnectionString: bifrost.Ptr("http://localhost:3002/mcp"),
+    ToolsToExecute: []string{"*"},
+})
+
+// Remove a client
+err = client.RemoveMCPClient("old_client")
+
+// Edit client tools
+err = client.EditMCPClientTools("filesystem", []string{"read_file", "list_directory"})
+```
+
+</Tab>
+</Tabs>
+
+---
+
+## Health Monitoring
+
+Bifrost automatically monitors MCP client health with periodic checks every 10 seconds by default.
+
+### Health Check Methods
+
+By default, Bifrost uses the lightweight **ping method** for health checks. However, you can configure the health check method based on your MCP server's capabilities:
+
+| Method | When to Use | Overhead | Notes |
+|--------|------------|----------|----------|
+| **Ping** (default) | Server supports MCP ping protocol | Minimal | Best for most servers |
+| **ListTools** | Server doesn't support ping, or you need heavier checks | Higher | More resource-intensive |
+
+### Configuring Health Check Method
+
+You can toggle the `is_ping_available` setting for each client:
+
+#### Via Web UI
+
+1. Navigate to **MCP Gateway** and select a server
+2. In the configuration panel, toggle **"Ping Available for Health Check"**:
+   - **Enabled**: Uses lightweight ping for health checks
+   - **Disabled**: Uses the listTools method for health checks instead
+
+<Frame>
+  <img src="/media/ui-mcp-ping-available.png" alt="Ping Available Toggle" />
+</Frame>
+
+#### Via API
+
+```bash
+curl -X PUT http://localhost:8080/api/mcp/client/{id} \
+  -H "Content-Type: application/json" \
+  -d '{
+    "name": "my_server",
+    "is_ping_available": false
+  }'
+```
+
+#### Via config.json
+
+```json
+{
+  "mcp": {
+    "client_configs": [
+      {
+        "name": "filesystem",
+        "connection_type": "stdio",
+        "is_ping_available": true,
+        "stdio_config": {
+          "command": "npx",
+          "args": ["-y", "@anthropic/mcp-filesystem"]
+        }
+      }
+    ]
+  }
+}
+```
+
+#### Via Go SDK
+
+```go
+err := client.EditMCPClient(context.Background(), schemas.MCPClientConfig{
+    ID:               "filesystem",
+    Name:             "filesystem",
+    IsPingAvailable:  false,  // Use listTools instead of ping
+    ToolsToExecute:   []string{"*"},
+})
+```
+
+### Health Check Behavior
+
+When a client disconnects:
+1. State changes to `disconnected`
+2. Tools from that client become unavailable
+3. Bifrost attempts to reconnect automatically in the background; you can also reconnect manually via API or UI
+
+**Note:** Changing `is_ping_available` takes effect immediately without requiring a client reconnection.
+
+---
+
+## Connection Resilience and Retry Logic
+
+Bifrost automatically implements **exponential backoff retry logic** to handle transient network failures and temporary service unavailability. This ensures that brief connection issues don't immediately cause tool unavailability.
+
+<Warning>
+**Important:** Bifrost only retries on transient errors (network failures, timeouts, temporary service unavailability). Permanent errors like authentication failures, configuration errors, and missing commands fail immediately without retry.
+</Warning>
+
+### Automatic Retry Strategy
+
+Bifrost retries failed operations using the following strategy, as implemented by `ExecuteWithRetry` and `DefaultRetryConfig` in the MCP layer:
+
+| Parameter | Value | Description |
+|-----------|-------|-------------|
+| **Max Retries** (`DefaultRetryConfig.MaxRetries`) | 5 | Retries after the initial attempt (6 attempts total) |
+| **Initial Backoff** (`DefaultRetryConfig.InitialBackoff`) | 1 second | Starting backoff duration before doubling |
+| **Max Backoff** (`DefaultRetryConfig.MaxBackoff`) | 30 seconds | Maximum wait time between retries |
+| **Backoff Multiplier** | 2x | Exponential growth between attempts |
+
+**Backoff Progression** (matches `ExecuteWithRetry` with `DefaultRetryConfig`):
+- Attempt 1: Initial attempt (no wait)
+- Attempt 2: Wait 1s, then retry (and double backoff to 2s)
+- Attempt 3: Wait 2s, then retry (and double backoff to 4s)
+- Attempt 4: Wait 4s, then retry (and double backoff to 8s)
+- Attempt 5: Wait 8s, then retry (and double backoff to 16s)
+- Attempt 6: Wait 16s, then make the final retry (with the defaults, no wait reaches the 30s cap)
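+
+The progression above can be reproduced with a short sketch. It is not the actual `ExecuteWithRetry` code, and the function and parameter names are chosen here for illustration. The default values come from the table above.
+
+```python
+def backoff_schedule(max_retries=5, initial_backoff=1.0, max_backoff=30.0,
+                     multiplier=2.0):
+    """Return the wait (in seconds) before each retry, in order."""
+    waits = []
+    backoff = initial_backoff
+    for _ in range(max_retries):
+        waits.append(min(backoff, max_backoff))  # never wait longer than the cap
+        backoff *= multiplier                    # exponential growth
+    return waits
+
+
+waits = backoff_schedule()
+print(waits)       # [1.0, 2.0, 4.0, 8.0, 16.0]
+print(sum(waits))  # 31.0 seconds of waiting across all 5 retries
+```
+
+The cap only matters if the retry count is raised: with 7 retries, the sixth and seventh waits would both be limited to 30 seconds.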
+
+### Error Classification
+
+Bifrost classifies each error as either **transient** (retryable) or **permanent** (fail immediately):
+
+**Transient Errors (Retried):**
+- Connection timeouts or refused connections
+- Network unreachable errors
+- DNS resolution failures
+- HTTP 5xx errors (500, 502, 503, 504)
+- HTTP 429 (Too Many Requests)
+- I/O errors and broken pipes
+- Temporary service unavailability
+
+**Permanent Errors (Fail Immediately - No Retry):**
+- **Context deadline exceeded or cancelled** - Retrying won't help if time limit is reached
+- Authentication failures (401, 403)
+- Authorization denied
+- Configuration errors (invalid auth, invalid config)
+- File or command not found (e.g., "command not found: npx")
+- Bad request errors (400, 405, 422)
+- Command execution permission denied
+- Invalid credentials
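+
+For the HTTP status codes in the two lists, the classification can be sketched like this. The function name is hypothetical, and treating unlisted codes as permanent is an assumption made here. Non-HTTP errors such as DNS failures or context cancellation are outside this sketch.
+
+```python
+TRANSIENT_STATUS = {429, 500, 502, 503, 504}   # retried with backoff
+PERMANENT_STATUS = {400, 401, 403, 405, 422}   # fail immediately
+
+
+def is_retryable_status(status_code):
+    """Classify an HTTP status code using the lists above."""
+    if status_code in TRANSIENT_STATUS:
+        return True
+    if status_code in PERMANENT_STATUS:
+        return False
+    # Assumption: codes not listed in the docs are treated as permanent here.
+    return False
+
+
+for code in (503, 429, 401, 422):
+    print(code, "retry" if is_retryable_status(code) else "fail immediately")
+```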
+
+### What Operations Are Retried
+
+Bifrost applies retry logic to these critical operations:
+
+1. **Connection Creation** - Establishing initial connection to the MCP server (with error classification)
+2. **Transport Start** - Starting the transport layer (STDIO, HTTP, SSE)
+3. **Client Initialization** - Initializing the MCP client protocol
+4. **Tool Discovery** - Retrieving available tools from the server
+5. **Automatic Reconnection** - When health checks detect disconnection
+
+### Reconnection on Health Check Failure
+
+When a client reaches 5 consecutive health check failures:
+
+1. Client state changes to `disconnected`
+2. Bifrost automatically attempts reconnection **in the background**
+3. Reconnection uses the same exponential backoff retry logic
+4. Once reconnected, health checks resume normal operation
+5. Tools become available again without manual intervention
+
+This automatic reconnection happens asynchronously and doesn't block other operations.
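+
+The failure threshold can be modeled as a small state transition. This is a simplified sketch with hypothetical names: it only tracks the counter and state, and it leaves out the background reconnection that Bifrost would start once the client is marked `disconnected`.
+
+```python
+FAILURE_THRESHOLD = 5  # consecutive failures before the client is marked disconnected
+
+
+def next_state(state, consecutive_failures, check_ok):
+    """Return (state, consecutive_failures) after one health check."""
+    if check_ok:
+        return "connected", 0            # any success resets the counter
+    consecutive_failures += 1
+    if consecutive_failures >= FAILURE_THRESHOLD:
+        return "disconnected", consecutive_failures  # reconnection would start here
+    return state, consecutive_failures
+
+
+state, failures = "connected", 0
+for ok in [False, False, True, False, False, False, False, False]:
+    state, failures = next_state(state, failures, ok)
+print(state, failures)
+```
+
+The single success in the middle resets the counter, so the client is only marked `disconnected` after the five failures that follow it.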
+
+### Manual Reconnection
+
+You can also trigger manual reconnection at any time:
+
+<Tabs>
+<Tab title="Gateway API">
+
+```bash
+curl -X POST http://localhost:8080/api/mcp/client/{id}/reconnect
+```
+
+Manual reconnection also uses the retry logic for robustness.
+
+</Tab>
+<Tab title="Go SDK">
+
+```go
+// Reconnect with automatic retry logic
+err := client.ReconnectMCPClient("filesystem")
+if err != nil {
+    log.Printf("Reconnection failed after retries: %v", err)
+}
+```
+
+</Tab>
+</Tabs>
+
+### Benefits
+
+- **Handles transient failures**: Brief network hiccups won't cause tool unavailability
+- **Prevents server overload**: Exponential backoff prevents hammering servers
+- **Automatic recovery**: Disconnected clients reconnect automatically
+- **Production-ready**: No manual intervention needed for temporary issues
+- **Transparent logging**: Detailed retry attempts logged for debugging
+
+---
+
+## Naming Conventions
+
+MCP client names have specific requirements:
+
+<Warning>
+- Must contain only ASCII characters
+- Cannot contain hyphens (`-`) or spaces
+- Cannot start with a number
+- Must be unique across all clients
+</Warning>
+
+**Valid names:** `filesystem`, `web_search`, `myAPI`, `tool123`
+
+**Invalid names:** `my-tools`, `web search`, `123tools`, `datos-api`
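+
+A pre-flight check for these rules might look like the sketch below. The function name is hypothetical, rejecting an empty name is an assumption, and Bifrost may enforce further restrictions that are not listed above.
+
+```python
+def is_valid_client_name(name, existing_names=()):
+    """Check a candidate MCP client name against the rules above."""
+    if not name or not name.isascii():
+        return False                     # must be ASCII only
+    if "-" in name or " " in name:
+        return False                     # no hyphens or spaces
+    if name[0].isdigit():
+        return False                     # cannot start with a number
+    return name not in existing_names    # must be unique
+
+
+for candidate in ["filesystem", "web_search", "myAPI", "tool123",
+                  "my-tools", "web search", "123tools", "datos-api"]:
+    print(candidate, is_valid_client_name(candidate))
+```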
+
+---
+
+## Next Steps
+
+<CardGroup cols={2}>
+  <Card title="Tool Execution" icon="play" href="./tool-execution">
+    Learn how to execute tools from connected MCP servers
+  </Card>
+  <Card title="Agent Mode" icon="robot" href="./agent-mode">
+    Enable autonomous tool execution with auto-approval
+  </Card>
+</CardGroup>
--- a/docs/mcp/filtering.mdx
+++ b/docs/mcp/filtering.mdx
@@ -0,0 +1,495 @@
+---
+title: "Tool Filtering"
+sidebarTitle: "Filtering"
+description: "Control which MCP tools are available at the client, request, and virtual key levels."
+icon: "filter"
+---
+
+## Overview
+
+Bifrost provides **three levels of tool filtering** to control which MCP tools are available:
+
+1. **Client Configuration** - Set which tools a client can execute (`tools_to_execute`)
+2. **Request Headers** - Filter tools per-request via HTTP headers or context
+3. **Virtual Key Configuration** - Control tools per virtual key (Gateway only)
+
+These levels stack: a tool must pass all applicable filters to be available.
+
+```mermaid
+graph LR
+    All["<b>Available<br/>Tools</b>"]
+    Client["<b>Client Config</b><br/>tools_to_execute"]
+    Request["<b>Request<br/>Headers</b>"]
+    VK["<b>Virtual Key<br/>Filter</b>"]
+    Final["<b>Tools<br/>for LLM</b>"]
+
+    All --> Client
+    Client --> Request
+    Request --> VK
+    VK --> Final
+
+    style All fill:#EEEEEE,stroke:#424242,stroke-width:2.5px,color:#1A1A1A
+    style Client fill:#E3F2FD,stroke:#0D47A1,stroke-width:2.5px,color:#1A1A1A
+    style Request fill:#F3E5F5,stroke:#4A148C,stroke-width:2.5px,color:#1A1A1A
+    style VK fill:#E8F5E9,stroke:#1B5E20,stroke-width:2.5px,color:#1A1A1A
+    style Final fill:#FFFDE7,stroke:#F57F17,stroke-width:2.5px,color:#1A1A1A
+```
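+
+The stacking behaves like an intersection of allowlists. The sketch below is illustrative only: the function names are hypothetical, tools are identified by bare name for brevity, and representing an unused request or virtual key filter as `None` is an assumption.
+
+```python
+def passes(tool, allowlist):
+    """None means the level is not applied; ["*"] allows everything."""
+    if allowlist is None:
+        return True
+    return allowlist == ["*"] or tool in allowlist
+
+
+def tools_for_llm(discovered, client_tools, request_tools=None, vk_tools=None):
+    """Keep only the tools that pass every applicable level."""
+    client_tools = client_tools or []    # client level is deny-by-default
+    return [
+        t for t in discovered
+        if passes(t, client_tools)
+        and passes(t, request_tools)
+        and passes(t, vk_tools)
+    ]
+
+
+discovered = ["read_file", "write_file", "list_directory"]
+print(tools_for_llm(discovered, ["read_file", "list_directory"]))
+print(tools_for_llm(discovered, ["read_file", "list_directory"],
+                    request_tools=["read_file"]))
+```
+
+A later level can only narrow the set: a request filter cannot re-enable `write_file` once the client configuration has excluded it.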
+
+---
+
+## Level 1: Client Configuration
+
+The `tools_to_execute` field on each MCP client config defines the **baseline** of available tools.
+
+### Semantics
+
+| Value | Behavior |
+|-------|----------|
+| `["*"]` | All tools from this client are available |
+| `[]` or omitted | No tools available (deny-by-default) |
+| `["tool1", "tool2"]` | Only specified tools are available |
+
+### Configuration
+
+<Tabs>
+<Tab title="Gateway">
+
+```bash
+curl -X POST http://localhost:8080/api/mcp/client \
+  -H "Content-Type: application/json" \
+  -d '{
+    "name": "filesystem",
+    "connection_type": "stdio",
+    "stdio_config": {
+      "command": "npx",
+      "args": ["-y", "@anthropic/mcp-filesystem"]
+    },
+    "tools_to_execute": ["read_file", "list_directory"]
+  }'
+```
+
+</Tab>
+<Tab title="Go SDK">
+
+```go
+mcpConfig := &schemas.MCPConfig{
+    ClientConfigs: []schemas.MCPClientConfig{
+        {
+            Name:           "filesystem",
+            ConnectionType: schemas.MCPConnectionTypeSTDIO,
+            StdioConfig: &schemas.MCPStdioConfig{
+                Command: "npx",
+                Args:    []string{"-y", "@anthropic/mcp-filesystem"},
+            },
+            ToolsToExecute: []string{"read_file", "list_directory"}, // Only these tools
+        },
+    },
+}
+```
+
+</Tab>
+<Tab title="config.json">
+
+```json
+{
+  "mcp": {
+    "client_configs": [
+      {
+        "name": "filesystem",
+        "connection_type": "stdio",
+        "stdio_config": {
+          "command": "npx",
+          "args": ["-y", "@anthropic/mcp-filesystem"]
+        },
+        "tools_to_execute": ["read_file", "list_directory"]
+      }
+    ]
+  }
+}
+```
+
+</Tab>
+</Tabs>
+
+---
+
+## Level 2: Request-Level Filtering
+
+Filter tools dynamically on a per-request basis using headers (Gateway) or context values (SDK).
+
+### Available Filters
+
+| Filter | Purpose |
+|--------|---------|
+| `mcp-include-clients` | Only include tools from specified clients |
+| `mcp-include-tools` | Only include specified tools (format: `clientName-toolName`) |
+
+### Gateway Headers
+
+```bash
+# Include only specific clients
+curl -X POST http://localhost:8080/v1/chat/completions \
+  -H "x-bf-mcp-include-clients: filesystem,web_search" \
+  -d '...'
+
+# Include only specific tools
+curl -X POST http://localhost:8080/v1/chat/completions \
+  -H "x-bf-mcp-include-tools: filesystem-read_file,web_search-search" \
+  -d '...'
+
+# Include all tools from one client, specific tools from another
+curl -X POST http://localhost:8080/v1/chat/completions \
+  -H "x-bf-mcp-include-tools: filesystem-*,web_search-search" \
+  -d '...'
+
+# Include internal tools registered via RegisterTool()
+curl -X POST http://localhost:8080/v1/chat/completions \
+  -H "x-bf-mcp-include-tools: bifrostInternal-echo,bifrostInternal-calculator" \
+  -d '...'
+
+# Empty clients filter blocks ALL tools - no tools available to LLM
+curl -X POST http://localhost:8080/v1/chat/completions \
+  -H "x-bf-mcp-include-clients:" \
+  -d '...'
+# Result: No MCP tools available (deny-all)
+
+# Empty tools filter also blocks ALL tools
+curl -X POST http://localhost:8080/v1/chat/completions \
+  -H "x-bf-mcp-include-tools:" \
+  -d '...'
+# Result: No MCP tools available (deny-all)
+```
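+
+The same headers can be set from any HTTP client. The sketch below builds the request with Go's standard `net/http` package; `newFilteredRequest` is a hypothetical helper, and the convention that a `nil` slice omits the header while an empty slice sends an empty header (deny-all) is an assumption chosen to mirror the curl examples above.
+
+```go
+package main
+
+import (
+    "fmt"
+    "net/http"
+    "strings"
+)
+
+// newFilteredRequest builds a chat completion request with optional MCP
+// filter headers. Hypothetical helper: a nil slice means "do not send the
+// header", a non-nil empty slice sends an empty header value (deny-all).
+func newFilteredRequest(baseURL, body string, clients, tools []string) (*http.Request, error) {
+    req, err := http.NewRequest(http.MethodPost, baseURL+"/v1/chat/completions", strings.NewReader(body))
+    if err != nil {
+        return nil, err
+    }
+    req.Header.Set("Content-Type", "application/json")
+    if clients != nil {
+        req.Header.Set("x-bf-mcp-include-clients", strings.Join(clients, ","))
+    }
+    if tools != nil {
+        req.Header.Set("x-bf-mcp-include-tools", strings.Join(tools, ","))
+    }
+    return req, nil
+}
+
+func main() {
+    req, err := newFilteredRequest("http://localhost:8080", "{}", nil,
+        []string{"filesystem-*", "web_search-search"})
+    if err != nil {
+        fmt.Println("build request:", err)
+        return
+    }
+    fmt.Println(req.Header.Get("x-bf-mcp-include-tools"))
+}
+```
+
+Pass the returned request to `http.DefaultClient.Do` (or your own client) to send it.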
+
+### Go SDK Context Values
+
+```go
+// Include only specific clients
+ctx := context.WithValue(context.Background(),
+    schemas.BifrostContextKey("mcp-include-clients"),
+    []string{"filesystem", "web_search"})
+
+// Include only specific tools
+ctx = context.WithValue(ctx,
+    schemas.BifrostContextKey("mcp-include-tools"),
+    []string{"filesystem-read_file", "web_search-search"})
+
+// Wildcard for all tools from a client
+ctx = context.WithValue(ctx,
+    schemas.BifrostContextKey("mcp-include-tools"),
+    []string{"filesystem-*", "web_search-search"})
+
+// Include all internal tools (registered via RegisterTool)
+ctx = context.WithValue(ctx,
+    schemas.BifrostContextKey("mcp-include-tools"),
+    []string{"bifrostInternal-*"})
+
+response, err := client.ChatCompletionRequest(schemas.NewBifrostContext(ctx, schemas.NoDeadline), request)
+
+// Empty include-clients blocks ALL tools - no tools available
+ctx = context.WithValue(context.Background(),
+    schemas.BifrostContextKey("mcp-include-clients"),
+    []string{})  // Empty slice = deny-all
+// Result: No MCP tools available to LLM
+
+// Empty include-tools also blocks ALL tools
+ctx = context.WithValue(context.Background(),
+    schemas.BifrostContextKey("mcp-include-tools"),
+    []string{})  // Empty slice = deny-all
+// Result: No MCP tools available to LLM
+```
+
+### Wildcard Support
+
+| Pattern | Meaning |
+|---------|---------|
+| `*` (in include-clients) | Include all clients |
+| `clientName-*` (in include-tools) | Include all tools from that client |
+| `clientName-toolName` | Include specific tool |
+
+### Tool Naming Convention
+
+**Important:** All MCP tools follow a consistent naming convention using the **prefixed format** `clientName-toolName`:
+
+- **External MCP Clients** (HTTP, SSE, STDIO): Tools use the format `clientName-toolName`
+  - Example: `filesystem-read_file`, `web_search-search`
+  - The `clientName` is the name configured for the MCP client
+
+- **Internal (In-Process) Tools**: Tools registered via `RegisterTool()` in the SDK use the prefix `bifrostInternal-`
+  - Example: `bifrostInternal-echo`, `bifrostInternal-my_custom_tool`
+
+This consistent naming convention ensures clear separation between tools from different clients and prevents naming conflicts across all MCP client types.
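+
+To make the wildcard and naming rules concrete, here is a minimal sketch of how an `mcp-include-tools` filter could be matched against a prefixed tool name. `matchesIncludeTools` is a hypothetical function, not Bifrost's matcher; it assumes a `nil` filter means "no filter set" and a non-nil empty filter means deny-all, as described above.
+
+```go
+package main
+
+import "fmt"
+
+// matchesIncludeTools reports whether client's tool passes an include-tools
+// filter. Hypothetical sketch: nil filter = no filtering, empty = deny-all.
+func matchesIncludeTools(filter []string, client, tool string) bool {
+    if filter == nil {
+        return true
+    }
+    full := client + "-" + tool
+    for _, pattern := range filter {
+        // Either the exact prefixed name or the per-client wildcard matches.
+        if pattern == full || pattern == client+"-*" {
+            return true
+        }
+    }
+    return false
+}
+
+func main() {
+    filter := []string{"filesystem-*", "web_search-search"}
+    fmt.Println(matchesIncludeTools(filter, "filesystem", "read_file"))
+    fmt.Println(matchesIncludeTools(filter, "web_search", "fetch"))
+    fmt.Println(matchesIncludeTools([]string{}, "filesystem", "read_file"))
+}
+```
+
+The sketch builds the prefixed name from its parts rather than splitting the pattern on `-`, which keeps it correct even when a client or tool name itself contains a hyphen.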
+
+---
+
+## Level 3: Virtual Key Filtering (Gateway Only)
+
+Virtual Keys can have their own MCP tool access configuration, which **takes precedence** over request-level headers.
+
+<Note>
+When a Virtual Key has no MCP configurations, **no MCP tools are available** (deny-by-default). You must explicitly add MCP client configurations to allow tools. When a Virtual Key has MCP configurations, it generates the `x-bf-mcp-include-tools` header automatically, overriding any manually sent header.
+</Note>
+
+### Configuration
+
+<Tabs>
+<Tab title="Web UI">
+
+1. Navigate to **Virtual Keys** in the governance section
+2. Create or edit a Virtual Key
+3. In **MCP Client Configurations**, add the clients and tools this VK can access
+
+<Frame>
+  <img src="/media/ui-virtual-key-mcp-filter.png" alt="Virtual Key MCP Configuration" />
+</Frame>
+
+</Tab>
+<Tab title="API">
+
+```bash
+curl -X POST http://localhost:8080/api/governance/virtual-keys \
+  -H "Content-Type: application/json" \
+  -d '{
+    "name": "support-team-key",
+    "mcp_configs": [
+      {
+        "mcp_client_name": "knowledge_base",
+        "tools_to_execute": ["search", "get_article"]
+      },
+      {
+        "mcp_client_name": "ticketing",
+        "tools_to_execute": ["*"]
+      }
+    ]
+  }'
+```
+
+</Tab>
+<Tab title="config.json">
+
+```json
+{
+  "governance": {
+    "virtual_keys": [
+      {
+        "name": "support-team-key",
+        "mcp_configs": [
+          {
+            "mcp_client_name": "knowledge_base",
+            "tools_to_execute": ["search", "get_article"]
+          },
+          {
+            "mcp_client_name": "ticketing",
+            "tools_to_execute": ["*"]
+          }
+        ]
+      }
+    ]
+  }
+}
+```
+
+</Tab>
+</Tabs>
+
+### Virtual Key MCP Config Semantics
+
+| Configuration | Result |
+|---------------|--------|
+| `tools_to_execute: ["*"]` | All tools from this client |
+| `tools_to_execute: []` | No tools from this client |
+| `tools_to_execute: ["a", "b"]` | Only specified tools |
+| Client not configured | All tools blocked from that client |
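+
+As noted earlier, a Virtual Key with MCP configs produces the `x-bf-mcp-include-tools` value automatically. The sketch below shows one way such a value could be derived from `mcp_configs`; the `vkMCPConfig` type and `includeToolsForVK` function are hypothetical and only illustrate the mapping, not Bifrost's internal code.
+
+```go
+package main
+
+import (
+    "fmt"
+    "strings"
+)
+
+// vkMCPConfig mirrors one entry of a Virtual Key's mcp_configs array.
+// Hypothetical type for illustration.
+type vkMCPConfig struct {
+    MCPClientName  string
+    ToolsToExecute []string
+}
+
+// includeToolsForVK expands VK configs into clientName-toolName entries.
+// A "*" tool naturally becomes the clientName-* wildcard.
+func includeToolsForVK(configs []vkMCPConfig) []string {
+    entries := []string{} // non-nil, so a VK without configs yields deny-all
+    for _, cfg := range configs {
+        for _, tool := range cfg.ToolsToExecute {
+            entries = append(entries, cfg.MCPClientName+"-"+tool)
+        }
+    }
+    return entries
+}
+
+func main() {
+    configs := []vkMCPConfig{
+        {MCPClientName: "knowledge_base", ToolsToExecute: []string{"search", "get_article"}},
+        {MCPClientName: "ticketing", ToolsToExecute: []string{"*"}},
+    }
+    fmt.Println(strings.Join(includeToolsForVK(configs), ","))
+}
+```
+
+A client that is missing from `mcp_configs`, or listed with an empty `tools_to_execute`, contributes no entries, which matches the "blocked" rows in the table.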
+
+Learn more in [MCP Tool Filtering for Virtual Keys](../features/governance/mcp-tools).
+
+---
+
+## Filtering Logic
+
+### How Filters Combine
+
+1. **Client config** is the baseline (must include the tool)
+2. **Request filters** further narrow down (if specified)
+3. **VK filters** override request filters (if VK has MCP configs)
+
+### Example Scenario
+
+**Setup:**
+- Client `filesystem` has `tools_to_execute: ["read_file", "write_file", "delete_file"]`
+- Virtual Key `prod-key` has `mcp_configs: [{ mcp_client_name: "filesystem", tools_to_execute: ["read_file"] }]`
+
+**Request with `prod-key`:**
+```bash
+# The x-bf-mcp-include-tools header below is IGNORED because prod-key has MCP configs
+curl -X POST http://localhost:8080/v1/chat/completions \
+  -H "Authorization: Bearer vk_prod_key" \
+  -H "x-bf-mcp-include-tools: filesystem-write_file" \
+  -d '...'
+```
+
+**Result:** Only `read_file` is available (VK config overrides request header)
+
+**Request without VK (if allowed):**
+```bash
+curl -X POST http://localhost:8080/v1/chat/completions \
+  -H "x-bf-mcp-include-tools: filesystem-write_file" \
+  -d '...'
+```
+
+**Result:** Only `write_file` is available (request header applies)
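+
+The two outcomes above can be reproduced with a short sketch of the combination rules for a single client. All names here (`requestFilters`, `effectiveTools`) are hypothetical, tool names are unprefixed for brevity, and wildcards are left out; this illustrates the documented behavior rather than Bifrost's implementation.
+
+```go
+package main
+
+import "fmt"
+
+// requestFilters holds the Level 2 and Level 3 inputs for one client.
+type requestFilters struct {
+    HeaderTools []string // nil when x-bf-mcp-include-tools is absent
+    HasVK       bool     // true when the request carries a Virtual Key
+    VKTools     []string // tools the VK allows for this client
+}
+
+func contains(list []string, item string) bool {
+    for _, v := range list {
+        if v == item {
+            return true
+        }
+    }
+    return false
+}
+
+// effectiveTools narrows the client baseline by the active filter.
+func effectiveTools(clientTools []string, f requestFilters) []string {
+    filter := f.HeaderTools
+    if f.HasVK {
+        // The VK config replaces the request header; a VK with no entry
+        // for this client allows nothing.
+        filter = f.VKTools
+        if filter == nil {
+            filter = []string{}
+        }
+    }
+    result := []string{}
+    for _, tool := range clientTools {
+        if filter == nil || contains(filter, tool) {
+            result = append(result, tool)
+        }
+    }
+    return result
+}
+
+func main() {
+    baseline := []string{"read_file", "write_file", "delete_file"}
+    header := []string{"write_file"}
+    fmt.Println(effectiveTools(baseline, requestFilters{HeaderTools: header, HasVK: true, VKTools: []string{"read_file"}}))
+    fmt.Println(effectiveTools(baseline, requestFilters{HeaderTools: header}))
+}
+```
+
+In both calls the client baseline is applied last, so a filter can never expose a tool that `tools_to_execute` does not already allow.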
+
+---
+
+## Common Patterns
+
+### Read-Only Access
+
+Allow only read operations:
+
+```json
+{
+  "tools_to_execute": ["read_file", "list_directory", "get_file_info"]
+}
+```
+
+### Environment-Based Filtering
+
+Use different VKs for different environments:
+
+```json
+{
+  "virtual_keys": [
+    {
+      "name": "development",
+      "mcp_configs": [
+        { "mcp_client_name": "filesystem", "tools_to_execute": ["*"] },
+        { "mcp_client_name": "database", "tools_to_execute": ["*"] }
+      ]
+    },
+    {
+      "name": "production",
+      "mcp_configs": [
+        { "mcp_client_name": "filesystem", "tools_to_execute": ["read_file"] },
+        { "mcp_client_name": "database", "tools_to_execute": ["query"] }
+      ]
+    }
+  ]
+}
+```
+
+### Per-User Tool Access
+
+Create VKs for different user roles:
+
+```json
+{
+  "virtual_keys": [
+    {
+      "name": "viewer-role",
+      "mcp_configs": [
+        { "mcp_client_name": "documents", "tools_to_execute": ["view", "search"] }
+      ]
+    },
+    {
+      "name": "editor-role",
+      "mcp_configs": [
+        { "mcp_client_name": "documents", "tools_to_execute": ["view", "search", "edit", "create"] }
+      ]
+    },
+    {
+      "name": "admin-role",
+      "mcp_configs": [
+        { "mcp_client_name": "documents", "tools_to_execute": ["*"] }
+      ]
+    }
+  ]
+}
+```
+
+---
+
+## Advanced: Context-Based Filtering
+
+For SDK users, filtering can be applied at the context level, enabling per-request tool customization:
+
+### Go SDK Context Filtering
+
+```go
+import (
+    "context"
+    "github.com/maximhq/bifrost/core/schemas"
+)
+
+// Filter to specific clients
+ctx := context.WithValue(
+    context.Background(),
+    schemas.BifrostContextKey("mcp-include-clients"),
+    []string{"filesystem", "web_search"},
+)
+
+// Or filter to specific tools
+ctx = context.WithValue(
+    ctx,
+    schemas.BifrostContextKey("mcp-include-tools"),
+    []string{"filesystem-read_file", "web_search-search"},
+)
+
+// Request will only see filtered tools
+response, err := client.ChatCompletionRequest(schemas.NewBifrostContext(ctx, schemas.NoDeadline), request)
+if err != nil {
+    // Handle the request error instead of discarding it
+}
+```
+
+### Filter Precedence
+
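+When multiple filters apply, a tool must pass every one of them (AND logic). On the Gateway, a Virtual Key with MCP configs replaces the request-level `x-bf-mcp-include-tools` header instead of intersecting with it, so the request term below drops out whenever such a Virtual Key is used: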
+
+```
+Client Config Tools ∩ Request Filters ∩ VK Filters = Available Tools
+```
+
+**Example:**
+- Client config allows: [read_file, write_file, delete_file]
+- Request header specifies: [read_file, write_file]
+- VK config restricts to: [read_file]
+- **Result:** Only [read_file] available
+
+---
+
+## Debugging Tool Availability
+
+### Check Available Tools
+
+**Gateway API:**
+```bash
+curl http://localhost:8080/api/mcp/clients
+```
+
+**Response shows tools per client:**
+```json
+[
+  {
+    "config": { "name": "filesystem", "tools_to_execute": ["read_file", "write_file"] },
+    "tools": [
+      { "name": "read_file", "description": "Read file contents" },
+      { "name": "write_file", "description": "Write to file" }
+    ],
+    "state": "connected"
+  }
+]
+```
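+
+For scripted checks, the response can be decoded with the standard `encoding/json` package. The sketch below assumes the response has exactly the shape shown in the sample above; the struct and function names are hypothetical, and real responses may carry additional fields.
+
+```go
+package main
+
+import (
+    "encoding/json"
+    "fmt"
+)
+
+// mcpClientStatus covers only the fields shown in the sample response.
+type mcpClientStatus struct {
+    Config struct {
+        Name           string   `json:"name"`
+        ToolsToExecute []string `json:"tools_to_execute"`
+    } `json:"config"`
+    Tools []struct {
+        Name        string `json:"name"`
+        Description string `json:"description"`
+    } `json:"tools"`
+    State string `json:"state"`
+}
+
+// sampleResponse is the example body from the documentation above.
+const sampleResponse = `[{"config":{"name":"filesystem","tools_to_execute":["read_file","write_file"]},
+"tools":[{"name":"read_file","description":"Read file contents"},
+{"name":"write_file","description":"Write to file"}],"state":"connected"}]`
+
+// toolsByClient returns the tool names of connected clients, keyed by client name.
+func toolsByClient(body []byte) (map[string][]string, error) {
+    var clients []mcpClientStatus
+    if err := json.Unmarshal(body, &clients); err != nil {
+        return nil, err
+    }
+    out := make(map[string][]string)
+    for _, c := range clients {
+        if c.State != "connected" {
+            continue // tools of disconnected clients are unavailable
+        }
+        for _, t := range c.Tools {
+            out[c.Config.Name] = append(out[c.Config.Name], t.Name)
+        }
+    }
+    return out, nil
+}
+
+func main() {
+    tools, err := toolsByClient([]byte(sampleResponse))
+    if err != nil {
+        fmt.Println("decode:", err)
+        return
+    }
+    fmt.Println(tools["filesystem"])
+}
+```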
+
+### Check What LLM Receives
+
+The tools included in a chat request depend on all active filters. To see what tools are available for a specific request, check the request body sent to the LLM provider in your logs or observability platform.
+
+---
+
+## Next Steps
+
+<CardGroup cols={2}>
+  <Card title="Virtual Key MCP Tools" icon="key" href="../features/governance/mcp-tools">
+    Detailed VK tool configuration
+  </Card>
+  <Card title="Agent Mode" icon="robot" href="./agent-mode">
+    Configure auto-execution for filtered tools
+  </Card>
+</CardGroup>
--- a/docs/mcp/gateway-url.mdx
+++ b/docs/mcp/gateway-url.mdx
@@ -0,0 +1,380 @@
+---
+title: "MCP Gateway URL"
+sidebarTitle: "Gateway URL"
+description: "Expose Bifrost as an MCP server for Claude Desktop and other MCP clients."
+icon: "server"
+---
+
+<Note>
+This feature is only available on `v1.4.0-prerelease1` and above.
+</Note>
+
+<Info>
+This feature is only available in the **Gateway** deployment. It is not available when using Bifrost as a Go SDK.
+</Info>
+
+## Overview
+
+Bifrost can act as an **MCP server**, exposing all your connected MCP tools to external MCP clients like Claude Desktop, Cursor, or any other MCP-compatible application.
+
+This enables a powerful pattern:
+- Connect Bifrost to multiple MCP servers (filesystem, web search, databases, etc.)
+- Expose all those tools through a single MCP endpoint
+- External clients connect to Bifrost and get access to all aggregated tools
+
+```mermaid
+graph TD
+    Clients["<b>External MCP Clients</b><br/>Claude Desktop, Cursor<br/>Custom Apps"]
+
+    Gateway["<b>Bifrost Gateway</b>"]
+    Endpoints["<b>Endpoints</b><br/>POST /mcp: JSON-RPC<br/>GET /mcp: SSE Stream"]
+    Registry["<b>Aggregated Tool Registry</b><br/>filesystem • web search<br/>databases • custom tools"]
+
+    Servers["<b>External MCP Servers</b><br/>filesystem • web-search<br/>databases • custom"]
+
+    Clients -->|MCP Protocol<br/>HTTP/SSE| Gateway
+    Gateway --> Endpoints
+    Gateway --> Registry
+    Gateway -->|MCP Protocol| Servers
+
+    style Clients fill:#F3E5F5,stroke:#4A148C,stroke-width:2.5px,color:#1A1A1A
+    style Gateway fill:#E8F5E9,stroke:#1B5E20,stroke-width:2.5px,color:#1A1A1A
+    style Endpoints fill:#E3F2FD,stroke:#0D47A1,stroke-width:2.5px,color:#1A1A1A
+    style Registry fill:#FFF3E0,stroke:#BF360C,stroke-width:2.5px,color:#1A1A1A
+    style Servers fill:#FFFDE7,stroke:#F57F17,stroke-width:2.5px,color:#1A1A1A
+```
+
+---
+
+## Endpoints
+
+| Endpoint | Method | Purpose |
+|----------|--------|---------|
+| `/mcp` | POST | JSON-RPC 2.0 messages for tool discovery and execution |
+| `/mcp` | GET | Server-Sent Events (SSE) for persistent connections |
+
+### POST /mcp (JSON-RPC)
+
+Handle JSON-RPC 2.0 messages for tool listing and execution:
+
+```bash
+# List available tools
+curl -X POST http://localhost:8080/mcp \
+  -H "Content-Type: application/json" \
+  -d '{
+    "jsonrpc": "2.0",
+    "id": 1,
+    "method": "tools/list"
+  }'
+
+# Call a tool
+curl -X POST http://localhost:8080/mcp \
+  -H "Content-Type: application/json" \
+  -d '{
+    "jsonrpc": "2.0",
+    "id": 2,
+    "method": "tools/call",
+    "params": {
+      "name": "filesystem_read_file",
+      "arguments": {
+        "path": "/tmp/test.txt"
+      }
+    }
+  }'
+```
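+
+The same JSON-RPC 2.0 payloads can be built programmatically. This is a minimal sketch using only `encoding/json`; the `rpcRequest` and `toolCallParams` types are hypothetical and model just the fields used in the curl examples above.
+
+```go
+package main
+
+import (
+    "encoding/json"
+    "fmt"
+)
+
+// rpcRequest models the JSON-RPC 2.0 envelope used in the curl examples.
+type rpcRequest struct {
+    JSONRPC string `json:"jsonrpc"`
+    ID      int    `json:"id"`
+    Method  string `json:"method"`
+    Params  any    `json:"params,omitempty"`
+}
+
+// toolCallParams is the params object of a tools/call request.
+type toolCallParams struct {
+    Name      string         `json:"name"`
+    Arguments map[string]any `json:"arguments"`
+}
+
+// newToolsList encodes a tools/list request (no params).
+func newToolsList(id int) ([]byte, error) {
+    return json.Marshal(rpcRequest{JSONRPC: "2.0", ID: id, Method: "tools/list"})
+}
+
+// newToolCall encodes a tools/call request for the given tool and arguments.
+func newToolCall(id int, name string, args map[string]any) ([]byte, error) {
+    return json.Marshal(rpcRequest{
+        JSONRPC: "2.0",
+        ID:      id,
+        Method:  "tools/call",
+        Params:  toolCallParams{Name: name, Arguments: args},
+    })
+}
+
+func main() {
+    // Tool name copied from the curl example above.
+    body, err := newToolCall(2, "filesystem_read_file", map[string]any{"path": "/tmp/test.txt"})
+    if err != nil {
+        fmt.Println("encode:", err)
+        return
+    }
+    fmt.Println(string(body))
+}
+```
+
+Send the resulting bytes as the body of a `POST /mcp` request with `Content-Type: application/json`.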
+
+### GET /mcp (SSE)
+
+Establish a persistent SSE connection for real-time communication:
+
+```bash
+curl -N http://localhost:8080/mcp \
+  -H "Accept: text/event-stream"
+```
+
+On this connection, Bifrost:
+- Sends a `connection/opened` message when the client connects
+- Keeps the connection alive until the client disconnects
+
+---
+
+## External MCP Client Integration
+
+The `/mcp` endpoint supports any MCP-compatible client that can communicate via HTTP or SSE:
+
+- **Claude Desktop** - macOS and Windows desktop application
+- **Cursor** - IDE with MCP support
+- **Custom Applications** - Any app implementing the MCP protocol
+- **Browser Extensions** - Tools with MCP client capability
+
+To connect an external MCP client, configure it to connect to:
+```
+http://your-bifrost-gateway/mcp
+```
+
+Include any required Virtual Key authentication headers if governance is enabled.
+
+---
+
+## Virtual Key Authentication
+
+Bifrost supports per-Virtual Key MCP servers, allowing you to expose different tools to different clients.
+
+### Global Server (No Virtual Key)
+
+When `enforce_auth_on_inference` is `false`, requests without a Virtual Key use the global MCP server with all available tools.
+
+### Virtual Key-Specific Servers
+
+When using Virtual Keys, each VK gets its own MCP server with filtered tools based on its configuration.
+
+**Authenticate with Virtual Key:**
+
+```bash
+# Via Authorization header
+curl -X POST http://localhost:8080/mcp \
+  -H "Authorization: Bearer vk_your_virtual_key" \
+  -H "Content-Type: application/json" \
+  -d '{"jsonrpc": "2.0", "id": 1, "method": "tools/list"}'
+
+# Via X-Api-Key header
+curl -X POST http://localhost:8080/mcp \
+  -H "X-Api-Key: vk_your_virtual_key" \
+  -H "Content-Type: application/json" \
+  -d '{"jsonrpc": "2.0", "id": 1, "method": "tools/list"}'
+
+# Via x-bf-virtual-key header
+curl -X POST http://localhost:8080/mcp \
+  -H "x-bf-virtual-key: vk_your_virtual_key" \
+  -H "Content-Type: application/json" \
+  -d '{"jsonrpc": "2.0", "id": 1, "method": "tools/list"}'
+```
+
+**Claude Desktop with Virtual Key:**
+
+```json
+{
+  "mcpServers": {
+    "bifrost-production": {
+      "url": "http://localhost:8080/mcp",
+      "headers": {
+        "Authorization": "Bearer vk_your_production_key"
+      }
+    },
+    "bifrost-development": {
+      "url": "http://localhost:8080/mcp",
+      "headers": {
+        "Authorization": "Bearer vk_your_development_key"
+      }
+    }
+  }
+}
+```
+
+---
+
+## Tool Filtering for MCP Clients
+
+Control which tools are exposed to MCP clients using Virtual Keys:
+
+### Per-Virtual Key Tool Access
+
+Configure which tools each Virtual Key can access:
+
+```json
+{
+  "governance": {
+    "virtual_keys": [
+      {
+        "name": "production-key",
+        "mcp_configs": [
+          {
+            "mcp_client_name": "filesystem",
+            "tools_to_execute": ["read_file", "list_directory"]
+          },
+          {
+            "mcp_client_name": "web_search",
+            "tools_to_execute": ["*"]
+          }
+        ]
+      },
+      {
+        "name": "admin-key",
+        "mcp_configs": [
+          {
+            "mcp_client_name": "filesystem",
+            "tools_to_execute": ["*"]
+          },
+          {
+            "mcp_client_name": "database",
+            "tools_to_execute": ["*"]
+          }
+        ]
+      }
+    ]
+  }
+}
+```
+
+Learn more about Virtual Key tool filtering in [MCP Tool Filtering](../features/governance/mcp-tools).
+
+---
+
+## Advanced Gateway Features
+
+### Health Monitoring
+
+Bifrost automatically monitors the health of connected MCP clients:
+
+**How it works:**
+- **Ping Mechanism:** Every 10 seconds (configurable), sends a ping to each connected client
+- **Check Timeout:** Each ping has a 5-second timeout
+- **Failure Threshold:** After 5 consecutive failed pings, client is marked as `disconnected`
+- **State Tracking:** Real-time state updates (connected ↔ disconnected)
+- **Manual Reconnection:** Once disconnected, requires manual reconnect via API or UI
+
+**Configuration:**
+```json
+{
+  "mcp": {
+    "health_monitor_config": {
+      "check_interval": "10s",
+      "check_timeout": "5s",
+      "max_consecutive_failures": 5
+    }
+  }
+}
+```
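+
+The failure-threshold behavior described above can be sketched as a small state tracker. `healthTracker` is a hypothetical type for illustration; it assumes a successful ping resets the failure count (implied by "consecutive") and that a disconnected client stays disconnected until it is reconnected manually.
+
+```go
+package main
+
+import "fmt"
+
+// healthTracker counts consecutive failed pings for one MCP client.
+// Hypothetical sketch, not Bifrost's health monitor.
+type healthTracker struct {
+    maxConsecutiveFailures int
+    failures               int
+    disconnected           bool
+}
+
+// record registers one ping result and returns the resulting state.
+func (h *healthTracker) record(pingOK bool) string {
+    if h.disconnected {
+        return "disconnected" // no automatic recovery
+    }
+    if pingOK {
+        h.failures = 0
+        return "connected"
+    }
+    h.failures++
+    if h.failures >= h.maxConsecutiveFailures {
+        h.disconnected = true
+        return "disconnected"
+    }
+    return "connected"
+}
+
+// reconnect models a manual reconnect via the API or UI.
+func (h *healthTracker) reconnect() {
+    h.failures = 0
+    h.disconnected = false
+}
+
+func main() {
+    h := &healthTracker{maxConsecutiveFailures: 5}
+    for i := 1; i <= 5; i++ {
+        fmt.Println(i, h.record(false))
+    }
+}
+```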
+
+When a client is disconnected after 5 consecutive failed health checks, tools from that client become unavailable. You can manually reconnect using the API or Go SDK:
+
+**Gateway API:**
+```bash
+# Replace {id} with the ID of the disconnected MCP client
+curl -X POST "http://localhost:8080/api/mcp/client/{id}/reconnect"
+```
+
+**Go SDK:**
+```go
+// Reconnect a disconnected MCP client
+err := client.ReconnectMCPClient(context.Background(), clientID)
+if err != nil {
+    // Handle reconnection error
+    log.Printf("Failed to reconnect client: %v", err)
+}
+```
+
+### Request ID Tracking
+
+For Agent Mode operations, Bifrost can track intermediate tool executions:
+
+```go
+mcpConfig := &schemas.MCPConfig{
+    FetchNewRequestIDFunc: func(ctx context.Context) string {
+        // Generate unique ID per agent iteration
+        return fmt.Sprintf("agent-%s-%d", ctx.Value("original-id"), time.Now().UnixMilli())
+    },
+}
+```
+
+This enables detailed audit trails for autonomous tool execution.
+
+### Dynamic Tool Discovery
+
+Tools are discovered from MCP servers during:
+1. **Client Connection** - Initial ListTools request
+2. **Runtime Updates** - When server tool list changes
+3. **Configuration Changes** - When tools_to_execute is updated
+
+The MCP Server dynamically updates its tool registry from the tool manager.
+
+---
+
+## Per-User OAuth for MCP Clients
+
+When at least one MCP server is configured with `per_user_oauth`, the `/mcp` endpoint automatically advertises OAuth support via standard discovery headers. OAuth-capable MCP clients (Claude Code, Cursor, and others) detect this automatically — no manual configuration is needed on the client side.
+
+When an unauthenticated client connects, Bifrost responds with a `401` that points to the discovery endpoints:
+
+```
+WWW-Authenticate: Bearer resource_metadata="https://your-bifrost-domain/.well-known/oauth-protected-resource"
+```
+
+The client then fetches the OAuth metadata and kicks off a consent flow where the user can attach an identity and connect their upstream services. Subsequent requests use a Bifrost-issued session token as the Bearer credential.
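+
+For a custom MCP client, the discovery URL has to be pulled out of the `WWW-Authenticate` header shown above. The following is a deliberately simple sketch that only looks for the `resource_metadata="..."` parameter; `resourceMetadataURL` is a hypothetical helper and does not implement full HTTP challenge parsing.
+
+```go
+package main
+
+import (
+    "fmt"
+    "strings"
+)
+
+// resourceMetadataURL extracts the quoted resource_metadata parameter from
+// a WWW-Authenticate header value. Hypothetical helper; it does not handle
+// escaped quotes or multiple challenges.
+func resourceMetadataURL(header string) (string, bool) {
+    const key = `resource_metadata="`
+    start := strings.Index(header, key)
+    if start < 0 {
+        return "", false
+    }
+    rest := header[start+len(key):]
+    end := strings.Index(rest, `"`)
+    if end < 0 {
+        return "", false
+    }
+    return rest[:end], true
+}
+
+func main() {
+    header := `Bearer resource_metadata="https://your-bifrost-domain/.well-known/oauth-protected-resource"`
+    if url, ok := resourceMetadataURL(header); ok {
+        fmt.Println(url)
+    }
+}
+```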
+
+See [Per-User OAuth →](./per-user-oauth) for the full flow and identity options.
+
+---
+
+## Security Considerations
+
+<Warning>
+The MCP Gateway exposes tools to external clients. Consider these security measures:
+</Warning>
+
+### 1. Enable Virtual Key Enforcement
+
+Always enable `enforce_auth_on_inference` in production:
+
+```json
+{
+  "client": {
+    "enforce_auth_on_inference": true
+  }
+}
+```
+
+This ensures all MCP requests require a valid Virtual Key.
+
+### 2. Use HTTPS
+
+Deploy Bifrost behind a reverse proxy (nginx, Cloudflare, etc.) with TLS enabled:
+
+```
+MCP Client → HTTPS → Reverse Proxy → HTTP → Bifrost Gateway
+```
+
+### 3. Limit Tool Access
+
+Use Virtual Keys to limit which tools each client can access. Follow the principle of least privilege.
+
+### 4. Network Restrictions
+
+Consider network-level restrictions to limit which IPs can access the MCP endpoint.
+
+---
+
+## Troubleshooting
+
+<AccordionGroup>
+  <Accordion title="Claude Desktop not connecting">
+    1. Verify the URL is correct and Bifrost is running
+    2. Check if Bifrost is accessible from Claude Desktop's network
+    3. Restart Claude Desktop after configuration changes
+    4. Check Bifrost logs for connection attempts
+  </Accordion>
+
+  <Accordion title="No tools showing up">
+    1. Verify MCP clients are connected in Bifrost
+    2. Check that `tools_to_execute` includes the expected tools
+    3. If using Virtual Keys, verify the VK has MCP tool access configured
+  </Accordion>
+
+  <Accordion title="Virtual Key authentication failing">
+    1. Ensure the Virtual Key exists and is active
+    2. Check the header format (Bearer prefix for Authorization)
+    3. Verify `enforce_auth_on_inference` setting matches your setup
+  </Accordion>
+</AccordionGroup>
+
+---
+
+## Next Steps
+
+<CardGroup cols={2}>
+  <Card title="Tool Filtering" icon="filter" href="./filtering">
+    Control which tools are available per request
+  </Card>
+  <Card title="Virtual Key MCP Tools" icon="key" href="../features/governance/mcp-tools">
+    Configure per-VK tool access
+  </Card>
+</CardGroup>
--- a/docs/mcp/oauth.mdx
+++ b/docs/mcp/oauth.mdx
@@ -0,0 +1,483 @@
+---
+title: "OAuth 2.0 Authentication"
+sidebarTitle: "OAuth Authentication"
+description: "Configure OAuth 2.0 authentication for MCP HTTP and SSE connections. Support for automatic token refresh, PKCE, and dynamic client registration."
+icon: "lock"
+---
+
+<Info>
+This page covers **server-level OAuth**, where an admin authenticates once and the token is shared across all requests to that MCP server. If you need each end-user to authenticate with their own credentials (e.g., personal Notion or GitHub accounts), see [Per-User OAuth](./per-user-oauth).
+</Info>
+
+## Overview
+
+OAuth 2.0 authentication enables secure, user-delegated access to MCP servers. Bifrost handles:
+
+- **Automatic token refresh** - Tokens are refreshed before expiration
+- **PKCE support** - For public clients without client secrets
+- **Dynamic registration** - Automatic client registration (RFC 7591)
+- **OAuth discovery** - Discover endpoints from server URLs
+- **Token management** - Store and revoke OAuth tokens
+
+This is ideal for integrations that need user-based access, require periodic re-authorization, or must comply with OAuth 2.0 standards.
+
+## OAuth Flow
+
+Bifrost implements the **Authorization Code** flow, the widely supported OAuth 2.0 flow for applications that can keep tokens server-side:
+
+```mermaid
+sequenceDiagram
+    participant User
+    participant App["Your App"]
+    participant Bifrost as "Bifrost"
+    participant AuthServer as "OAuth Server"
+    participant MCPServer as "MCP Server"
+
+    User->>App: "Add new MCP tool"
+    App->>Bifrost: POST /api/mcp/client (with auth_type: oauth)
+
+    Bifrost->>AuthServer: Request authorization code
+    AuthServer-->>Bifrost: authorize_url + state
+
+    Bifrost-->>App: Return authorize_url
+    App->>User: Redirect to authorize_url
+    User->>AuthServer: Click authorize
+
+    AuthServer-->>User: Redirect to /api/oauth/callback?code=xxx&state=yyy
+    User->>Bifrost: Follow redirect
+
+    Bifrost->>AuthServer: Exchange code for token
+    AuthServer-->>Bifrost: access_token + refresh_token
+    Bifrost->>Bifrost: Store token securely
+
+    App->>Bifrost: POST /api/mcp/client/{id}/complete-oauth
+    Bifrost->>MCPServer: Use access_token for requests
+    MCPServer-->>Bifrost: Tool execution with OAuth auth
+
+    Bifrost-->>App: MCP client connected
+    App->>User: MCP tools now available
+```
+
+## Configuration
+
+### Basic OAuth Setup
+
+Configure OAuth authentication when creating an MCP client:
+
+<Tabs>
+<Tab title="Web UI">
+
+1. Navigate to **MCP Gateway** and click **New MCP Server**
+2. Select **HTTP** or **SSE** as connection type
+3. Set **Auth Type** to **OAuth 2.0**
+4. Provide OAuth configuration:
+   - **Client ID**: Your OAuth application's client ID
+   - **Client Secret**: (Optional for PKCE) Your OAuth application's secret
+   - **Authorize URL**: OAuth provider's authorization endpoint
+   - **Token URL**: OAuth provider's token endpoint
+   - **Scopes**: Comma-separated list of requested scopes
+5. Click **Authorize** to start the OAuth flow
+6. Complete the authorization in the browser
+7. MCP client will be created with the OAuth token
+
+</Tab>
+<Tab title="API">
+
+```bash
+curl -X POST http://localhost:8080/api/mcp/client \
+  -H "Content-Type: application/json" \
+  -d '{
+    "name": "authenticated-service",
+    "connection_type": "http",
+    "connection_string": "https://api.example.com/mcp",
+    "auth_type": "oauth",
+    "oauth_config": {
+      "client_id": "your-client-id",
+      "client_secret": "your-client-secret",
+      "authorize_url": "https://auth.example.com/oauth/authorize",
+      "token_url": "https://auth.example.com/oauth/token",
+      "scopes": ["mcp:read", "mcp:write"]
+    },
+    "tools_to_execute": ["*"]
+  }'
+```
+
+This returns:
+```json
+{
+  "status": "pending_oauth",
+  "message": "OAuth authorization required",
+  "oauth_config_id": "oauth_cfg_abc123",
+  "authorize_url": "https://auth.example.com/oauth/authorize?client_id=...&state=xyz",
+  "expires_at": "2026-01-24T12:30:00Z",
+  "mcp_client_id": "mcp_client_abc123"
+}
+```
+
+Redirect the user to `authorize_url`. After authorization, complete the flow:
+
+```bash
+curl -X POST http://localhost:8080/api/mcp/client/mcp_client_abc123/complete-oauth
+```
+
+</Tab>
+<Tab title="Go SDK">
+
+```go
+import "github.com/maximhq/bifrost/core/schemas"
+
+mcpConfig := &schemas.MCPClientConfig{
+    Name:           "authenticated-service",
+    ConnectionType: schemas.MCPConnectionTypeHTTP,
+    ConnectionString: schemas.EnvVar{
+        Value: "https://api.example.com/mcp",
+    },
+    AuthType: schemas.MCPAuthTypeOauth,
+    OauthConfigID: &oauthConfigID, // Set after OAuth flow
+    ToolsToExecute: []string{"*"},
+}
+```
+
+</Tab>
+</Tabs>
+
+### Advanced OAuth Configuration
+
+#### PKCE for Public Clients
+
+For applications without a client secret, use PKCE (Proof Key for Code Exchange):
+
+```json
+{
+  "name": "public-client-service",
+  "connection_type": "http",
+  "connection_string": "https://api.example.com/mcp",
+  "auth_type": "oauth",
+  "oauth_config": {
+    "client_id": "your-public-client-id",
+    "authorize_url": "https://auth.example.com/oauth/authorize",
+    "token_url": "https://auth.example.com/oauth/token",
+    "scopes": ["mcp:read"]
+  },
+  "tools_to_execute": ["*"]
+}
+```
+
+Bifrost automatically generates and manages PKCE code verifiers.
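+
+For readers unfamiliar with PKCE, the sketch below shows how a code verifier and its `S256` challenge are derived per RFC 7636. It illustrates the mechanism only: the function names are hypothetical, and this page does not state which challenge method or verifier length Bifrost uses internally.
+
+```go
+package main
+
+import (
+    "crypto/rand"
+    "crypto/sha256"
+    "encoding/base64"
+    "fmt"
+)
+
+// newCodeVerifier returns a random, URL-safe verifier (43 characters from
+// 32 random bytes, within RFC 7636's 43-128 character range).
+func newCodeVerifier() (string, error) {
+    buf := make([]byte, 32)
+    if _, err := rand.Read(buf); err != nil {
+        return "", err
+    }
+    return base64.RawURLEncoding.EncodeToString(buf), nil
+}
+
+// codeChallengeS256 computes BASE64URL(SHA256(verifier)) without padding.
+func codeChallengeS256(verifier string) string {
+    sum := sha256.Sum256([]byte(verifier))
+    return base64.RawURLEncoding.EncodeToString(sum[:])
+}
+
+func main() {
+    verifier, err := newCodeVerifier()
+    if err != nil {
+        fmt.Println("generate verifier:", err)
+        return
+    }
+    fmt.Println("code_verifier: ", verifier)
+    fmt.Println("code_challenge:", codeChallengeS256(verifier))
+}
+```
+
+The challenge is sent with the authorization request and the verifier with the token request, so an intercepted authorization code cannot be exchanged without the verifier.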
+
+#### Dynamic Client Registration
+
+If your OAuth server supports RFC 7591, Bifrost can automatically register a client:
+
+```json
+{
+  "name": "auto-registered-service",
+  "connection_type": "http",
+  "connection_string": "https://api.example.com/mcp",
+  "auth_type": "oauth",
+  "oauth_config": {
+    "registration_url": "https://auth.example.com/oauth/register",
+    "server_url": "https://api.example.com",
+    "scopes": ["mcp:read", "mcp:write"]
+  },
+  "tools_to_execute": ["*"]
+}
+```
+
+Bifrost will:
+1. Discover OAuth endpoints from `server_url`
+2. Register a new client using `registration_url`
+3. Use the registered client ID for authorization
+
+#### OAuth Discovery
+
+Bifrost can automatically discover OAuth endpoints from your MCP server's metadata:
+
+```json
+{
+  "name": "discovered-service",
+  "connection_type": "http",
+  "connection_string": "https://api.example.com/mcp",
+  "auth_type": "oauth",
+  "oauth_config": {
+    "client_id": "your-client-id",
+    "server_url": "https://api.example.com",
+    "scopes": ["mcp:read"]
+  },
+  "tools_to_execute": ["*"]
+}
+```
+
+If OAuth endpoints aren't provided, Bifrost will check:
+1. `/.well-known/oauth-authorization-server` (RFC 8414)
+2. `/.well-known/openid-configuration`
+3. Server MCP metadata
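+
+The first two lookups in the list above use well-known paths. A minimal sketch of building those candidate URLs from `server_url` follows; `discoveryURLs` is a hypothetical helper, it assumes `server_url` has no path component beyond an optional trailing slash, and it leaves out the third step because this page does not specify how server MCP metadata is fetched.
+
+```go
+package main
+
+import (
+    "fmt"
+    "strings"
+)
+
+// discoveryURLs returns the well-known metadata URLs to try, in order.
+// Hypothetical helper for illustration.
+func discoveryURLs(serverURL string) []string {
+    base := strings.TrimRight(serverURL, "/")
+    return []string{
+        base + "/.well-known/oauth-authorization-server", // RFC 8414
+        base + "/.well-known/openid-configuration",       // OpenID Connect Discovery
+    }
+}
+
+func main() {
+    for _, u := range discoveryURLs("https://api.example.com/") {
+        fmt.Println(u)
+    }
+}
+```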
+
+## Token Management
+
+### View OAuth Token Status
+
+Check the status of an OAuth configuration:
+
+```bash
+curl http://localhost:8080/api/oauth/config/oauth_cfg_abc123/status
+```
+
+Response:
+```json
+{
+  "id": "oauth_cfg_abc123",
+  "status": "authorized",
+  "created_at": "2026-01-24T10:00:00Z",
+  "expires_at": "2026-01-31T10:00:00Z",
+  "token_id": "oauth_token_xyz",
+  "token_expires_at": "2026-01-25T10:00:00Z",
+  "token_scopes": ["mcp:read", "mcp:write"]
+}
+```
+
+**Status values:**
+- `pending`: User hasn't authorized yet
+- `authorized`: Token is valid and active
+- `failed`: Authorization failed or token is invalid
+
+### Automatic Token Refresh
+
+Bifrost automatically refreshes OAuth tokens before expiration. No action required - tokens are refreshed transparently during tool execution.
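+
+"Before expiration" usually means refreshing once the token is within some safety margin of its expiry. The sketch below illustrates that check; `needsRefresh` and the five-minute `defaultLeeway` are assumptions made for the example, since this page does not state the margin Bifrost uses.
+
+```go
+package main
+
+import (
+    "fmt"
+    "time"
+)
+
+// defaultLeeway is an assumed safety margin, not a documented Bifrost value.
+const defaultLeeway = 5 * time.Minute
+
+// needsRefresh reports whether now falls within leeway of expiresAt
+// (or past it).
+func needsRefresh(expiresAt, now time.Time, leeway time.Duration) bool {
+    return !now.Before(expiresAt.Add(-leeway))
+}
+
+// mustParse parses an RFC 3339 timestamp such as token_expires_at.
+func mustParse(ts string) time.Time {
+    t, err := time.Parse(time.RFC3339, ts)
+    if err != nil {
+        panic(err)
+    }
+    return t
+}
+
+func main() {
+    expiresAt := mustParse("2026-01-25T10:00:00Z") // token_expires_at from the sample
+    fmt.Println(needsRefresh(expiresAt, mustParse("2026-01-25T09:00:00Z"), defaultLeeway))
+    fmt.Println(needsRefresh(expiresAt, mustParse("2026-01-25T09:56:00Z"), defaultLeeway))
+}
+```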
+
+### Revoke OAuth Token
+
+Revoke an OAuth token when you want to disconnect:
+
+```bash
+curl -X DELETE http://localhost:8080/api/oauth/config/oauth_cfg_abc123
+```
+
+This:
+- Revokes the token with the OAuth provider
+- Deletes the token from Bifrost
+- Removes the OAuth configuration
+
+The MCP client itself is not deleted and can still be used if its `auth_type` is changed.
+
+## Common OAuth Providers
+
+### GitHub
+
+<Tabs>
+<Tab title="Configuration">
+
+```json
+{
+  "name": "github-integration",
+  "connection_type": "http",
+  "connection_string": "https://github.example.com/api/v1/mcp",
+  "auth_type": "oauth",
+  "oauth_config": {
+    "client_id": "your-github-app-id",
+    "client_secret": "your-github-app-secret",
+    "authorize_url": "https://github.com/login/oauth/authorize",
+    "token_url": "https://github.com/login/oauth/access_token",
+    "scopes": ["repo", "user"]
+  },
+  "tools_to_execute": ["*"]
+}
+```
+
+</Tab>
+<Tab title="Setup Steps">
+
+1. Go to Settings → Developer settings → OAuth Apps
+2. Click "New OAuth App"
+3. Fill in:
+   - **Application name**: Bifrost MCP
+   - **Homepage URL**: `https://your-bifrost-domain.com`
+   - **Authorization callback URL**: `https://your-bifrost-domain.com/api/oauth/callback`
+4. Copy Client ID and Client Secret
+5. Use in Bifrost configuration above
+
+</Tab>
+</Tabs>
+
+### Google
+
+<Tabs>
+<Tab title="Configuration">
+
+```json
+{
+  "name": "google-api",
+  "connection_type": "http",
+  "connection_string": "https://mcp.example.com/api",
+  "auth_type": "oauth",
+  "oauth_config": {
+    "client_id": "your-google-client-id.apps.googleusercontent.com",
+    "client_secret": "your-google-client-secret",
+    "authorize_url": "https://accounts.google.com/o/oauth2/v2/auth",
+    "token_url": "https://oauth2.googleapis.com/token",
+    "scopes": ["openid", "email", "profile"]
+  },
+  "tools_to_execute": ["*"]
+}
+```
+
+</Tab>
+<Tab title="Setup Steps">
+
+1. Go to [Google Cloud Console](https://console.cloud.google.com)
+2. Create a new project
+3. Configure the OAuth consent screen
+4. Create OAuth 2.0 Client ID (Web application)
+5. Add Authorized redirect URIs:
+   - `https://your-bifrost-domain.com/api/oauth/callback`
+6. Copy Client ID and Client Secret
+7. Use in Bifrost configuration above
+
+</Tab>
+</Tabs>
+
+### Custom OAuth Server
+
+For your own OAuth server:
+
+```json
+{
+  "name": "custom-oauth-service",
+  "connection_type": "http",
+  "connection_string": "https://mcp.yourcompany.com/mcp",
+  "auth_type": "oauth",
+  "oauth_config": {
+    "client_id": "bifrost-client-id",
+    "client_secret": "bifrost-client-secret",
+    "authorize_url": "https://auth.yourcompany.com/authorize",
+    "token_url": "https://auth.yourcompany.com/token",
+    "registration_url": "https://auth.yourcompany.com/register",
+    "server_url": "https://mcp.yourcompany.com",
+    "scopes": ["mcp:full"]
+  },
+  "tools_to_execute": ["*"]
+}
+```
+
+## Troubleshooting
+
+### OAuth Flow Doesn't Start
+
+**Problem:** `authorize_url` not returned when creating MCP client
+
+**Solutions:**
+- Ensure `auth_type` is set to `"oauth"`
+- Check that `oauth_config` is provided in the request
+- Verify `authorize_url` is specified or `server_url` is provided for discovery
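+
+As a rough illustration of the last point, a discovery-based `oauth_config` might omit the explicit endpoints and supply only `server_url`. This is a sketch, not a verified configuration: it assumes the server publishes OAuth metadata that Bifrost can discover, the hostname and client ID are placeholders, and whether `token_url` may also be left out in your version should be checked against the API reference below.
+
+```json
+{
+  "name": "custom-oauth-service",
+  "connection_type": "http",
+  "connection_string": "https://mcp.yourcompany.com/mcp",
+  "auth_type": "oauth",
+  "oauth_config": {
+    "client_id": "bifrost-client-id",
+    "server_url": "https://mcp.yourcompany.com",
+    "scopes": ["mcp:full"]
+  },
+  "tools_to_execute": ["*"]
+}
+```
+
+If the create request still returns no `authorize_url` with a config like this, discovery is the likely failure point, and setting `authorize_url` and `token_url` explicitly is the simpler fallback.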
+
+### Token Refresh Fails
+
+**Problem:** Tools fail with "OAuth token expired" or "OAuth token invalid"
+
+**Solutions:**
+- Check if the refresh token is still valid
+- Revoke and re-authorize: `DELETE /api/oauth/config/{id}` then create a new client
+- Verify the OAuth provider hasn't revoked the token
+- Check that scopes are still sufficient
+
+### Authorization Callback Hangs
+
+**Problem:** Redirect to `/api/oauth/callback` doesn't complete
+
+**Solutions:**
+- Ensure Bifrost is accessible at the registered callback URL
+- Check network connectivity between Bifrost and OAuth provider
+- Verify the `state` parameter matches (for CSRF protection)
+- Check Bifrost logs for errors: `grep -i oauth /var/log/bifrost`
+
+### MCP Client Won't Connect with OAuth
+
+**Problem:** MCP client shows "error" state with OAuth configured
+
+**Solutions:**
+- Verify OAuth token is still valid: `GET /api/oauth/config/{id}/status`
+- Check that OAuth token has required scopes
+- Ensure MCP server accepts the `Authorization: Bearer {token}` header
+- Test HTTP connectivity to MCP server
+
+## API Reference
+
+### Create MCP Client with OAuth
+
+**POST** `/api/mcp/client`
+
+```json
+{
+  "name": "string",
+  "connection_type": "http|sse",
+  "connection_string": "string",
+  "auth_type": "oauth",
+  "oauth_config": {
+    "client_id": "string",
+    "client_secret": "string (optional)",
+    "authorize_url": "string",
+    "token_url": "string",
+    "registration_url": "string (optional)",
+    "server_url": "string (optional for discovery)",
+    "scopes": ["string"]
+  },
+  "tools_to_execute": ["*"]
+}
+```
+
+**Response:** `OAuthFlowInitiation` with `authorize_url`
+
+### Complete OAuth Flow
+
+**POST** `/api/mcp/client/{mcp_client_id}/complete-oauth`
+
+Call this endpoint after the user has authorized and been redirected back. Bifrost handles the authorization-code exchange automatically.
+
+**Response:** `SuccessResponse`
+
+### Get OAuth Config Status
+
+**GET** `/api/oauth/config/{oauth_config_id}/status`
+
+Returns current status of OAuth configuration and token information.
+
+**Response:** `OAuthConfigStatus`
+
+### Revoke OAuth Token
+
+**DELETE** `/api/oauth/config/{oauth_config_id}`
+
+Revokes the token and removes OAuth configuration.
+
+**Response:** `SuccessResponse`
+
+## Best Practices
+
+1. **Use HTTPS** - Always use HTTPS for OAuth flows. Most OAuth providers reject plain-HTTP callback URLs in production.
+
+2. **Secure Client Secrets** - Store client secrets in environment variables or secure vaults, not in version control.
+
+3. **Rotate Tokens** - Periodically revoke and re-authorize OAuth tokens for enhanced security.
+
+4. **Monitor Token Status** - Check token status regularly, especially before critical operations.
+
+5. **Handle Refresh Failures** - If token refresh fails, prompt user to re-authorize rather than silently failing.
+
+6. **Limit Scopes** - Request only the scopes your MCP tools actually need.
+
+7. **Log OAuth Operations** - Keep audit logs of OAuth authorizations and token usage.
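+
+As a sketch of the **Limit Scopes** practice, the GitHub configuration shown earlier could request a narrower scope and expose only the tools it needs instead of `"*"`. The scope choice and the tool name `get_user_profile` are illustrative assumptions, not values defined by Bifrost or by any particular MCP server; substitute the ones your server actually requires.
+
+```json
+{
+  "name": "github-integration",
+  "connection_type": "http",
+  "connection_string": "https://github.example.com/api/v1/mcp",
+  "auth_type": "oauth",
+  "oauth_config": {
+    "client_id": "your-github-app-id",
+    "client_secret": "your-github-app-secret",
+    "authorize_url": "https://github.com/login/oauth/authorize",
+    "token_url": "https://github.com/login/oauth/access_token",
+    "scopes": ["read:user"]
+  },
+  "tools_to_execute": ["get_user_profile"]
+}
+```
+
+Narrowing both `scopes` and `tools_to_execute` limits what a leaked or misused token can do, at the cost of re-authorizing when a new tool needs a broader scope.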
+
+## Security Considerations
+
+- **Token Storage** - Bifrost stores OAuth tokens encrypted in the database. Never log or expose tokens.
+- **PKCE Requirement** - For public clients, PKCE is automatically enabled and verified.
+- **State Parameter** - CSRF protection via state parameter is enforced in OAuth flows.
+- **Token Expiration** - Tokens are automatically refreshed, reducing the window of vulnerability.
+- **Revocation Support** - Tokens can be revoked immediately if compromised.
+
+---
+
+## Next Steps
+
+- [Connect to MCP Servers →](./connecting-to-servers)
+- [Tool Execution →](./tool-execution)
+- [Agent Mode →](./agent-mode)
--- a/docs/mcp/overview.mdx
+++ b/docs/mcp/overview.mdx
@@ -0,0 +1,142 @@
+---
+title: "Overview"
+sidebarTitle: "Overview"
+description: "Enable AI models to discover and execute external tools dynamically. Transform static chat models into action-capable agents."
+icon: "circle-info"
+---
+
+## What is MCP?
+
+**Model Context Protocol (MCP)** is an open standard that enables AI models to seamlessly discover and execute external tools at runtime. Instead of being limited to text generation, AI models can interact with filesystems, search the web, query databases, and execute custom business logic through external MCP servers.
+
+Bifrost provides a comprehensive MCP integration that goes beyond simple tool execution:
+
+- **MCP Client**: Connect to any MCP-compatible server (filesystem tools, web search, databases, etc.)
+- **MCP Server**: Expose your connected tools to external MCP clients (like Claude Desktop)
+- **Agent Mode**: Autonomous tool execution with configurable auto-approval
+- **Code Mode**: Let AI write and execute Python to orchestrate multiple tools
+
+## Security-First Design
+
+<Note>
+By default, Bifrost does NOT automatically execute tool calls. All tool execution requires explicit API calls, ensuring human oversight for potentially dangerous operations. However, you can enable [Agent Mode](./agent-mode) to allow automatic execution of specific tools via the `tools_to_auto_execute` configuration.
+</Note>
+
+**Key Security Principles:**
+
+| Principle | Description |
+|-----------|-------------|
+| **Explicit Execution** | Tool calls from LLMs are suggestions only; execution requires a separate API call |
+| **Granular Control** | Filter tools per-request, per-client, or per-virtual-key |
+| **Opt-in Auto-execution** | Agent mode with auto-execution must be explicitly configured |
+| **Stateless Design** | Each API call is independent - your app controls conversation state |
+
+## Key Capabilities
+
+<CardGroup cols={2}>
+  <Card title="Connect to MCP Servers" icon="plug" href="./connecting-to-servers">
+    Connect to external MCP servers via STDIO, HTTP, or SSE protocols with automatic retry logic
+  </Card>
+  <Card title="OAuth Authentication" icon="lock" href="./oauth">
+    Secure OAuth 2.0 authentication with automatic token refresh
+  </Card>
+  <Card title="Per-User OAuth" icon="users" href="./per-user-oauth">
+    Let each end-user authenticate with upstream services under their own credentials
+  </Card>
+  <Card title="Tool Execution" icon="play" href="./tool-execution">
+    Execute tools with full control over approval and conversation flow
+  </Card>
+  <Card title="Agent Mode" icon="robot" href="./agent-mode">
+    Enable autonomous tool execution with configurable auto-approval
+  </Card>
+  <Card title="Code Mode" icon="code" href="./code-mode">
+    Let AI write Python to orchestrate multiple tools in one request
+  </Card>
+  <Card title="Connection Resilience" icon="shield-check" href="./connecting-to-servers#connection-resilience-and-retry-logic">
+    Automatic exponential backoff retry logic handles transient failures gracefully
+  </Card>
+  <Card title="MCP Gateway URL" icon="server" href="./gateway-url">
+    Expose Bifrost as an MCP server for Claude Desktop and other clients
+  </Card>
+  <Card title="Tool Hosting" icon="toolbox" href="./tool-hosting">
+    Register custom tools directly in your Go application
+  </Card>
+  <Card title="Tool Filtering" icon="filter" href="./filtering">
+    Control which tools are available per request or per virtual key
+  </Card>
+</CardGroup>
+
+## How MCP Works in Bifrost
+
+Bifrost acts as both an **MCP client** (connecting to external tool servers) and optionally as an **MCP server** (exposing tools to external clients like Claude Desktop).
+
+```mermaid
+graph TB
+    App["<b>Your Application</b>"]
+    Gateway["<b>Bifrost Gateway</b><br/>MCP Client | MCP Server<br/>Tool Filtering & Agent Mode"]
+    Servers["<b>MCP Servers</b><br/>filesystem, web search,<br/>databases, etc."]
+    Clients["<b>MCP Clients</b><br/>Claude Desktop,<br/>other apps"]
+
+    App -->|Connect| Gateway
+    Gateway -->|Connect to| Servers
+    Clients -->|Connect to| Gateway
+
+    style App fill:#E3F2FD,stroke:#0D47A1,stroke-width:2.5px,color:#1A1A1A
+    style Gateway fill:#E8F5E9,stroke:#1B5E20,stroke-width:2.5px,color:#1A1A1A
+    style Servers fill:#FFF3E0,stroke:#BF360C,stroke-width:2.5px,color:#1A1A1A
+    style Clients fill:#F3E5F5,stroke:#4A148C,stroke-width:2.5px,color:#1A1A1A
+```
+
+For detailed architecture information, see the [MCP Architecture](/architecture/core/mcp) documentation.
+
+## Basic Tool Calling Flow
+
+The default tool calling pattern in Bifrost is **stateless** with explicit execution:
+
+```
+1. POST /v1/chat/completions
+   → LLM returns tool call suggestions (NOT executed)
+
+2. Your app reviews the tool calls
+   → Apply security rules, get user approval if needed
+
+3. POST /v1/mcp/tool/execute
+   → Execute approved tool calls explicitly
+
+4. POST /v1/chat/completions
+   → Continue conversation with tool results
+```
+
+This pattern ensures:
+- No unintended API calls to external services
+- No accidental data modification or deletion
+- Full audit trail of all tool operations
+- Human oversight for sensitive operations
+
+## Why Code Mode Matters
+
+If you're planning to use **3+ MCP servers**, read the [Code Mode](./code-mode) documentation carefully.
+
+Code Mode reduces token usage by **50%+ and execution latency by 40-50%** compared to classic MCP by having the AI write Python code to orchestrate tools in a sandbox, rather than exposing 100+ tool definitions directly to the LLM.
+
+---
+
+## Next Steps
+
+<Steps>
+  <Step title="Connect to MCP Servers">
+    [Set up your first MCP client connection →](./connecting-to-servers)
+  </Step>
+  <Step title="Choose Authentication (if needed)">
+    [Learn about header-based and OAuth 2.0 authentication →](./oauth)
+  </Step>
+  <Step title="Enable Code Mode (for 3+ servers)">
+    [Learn how Code Mode reduces token usage by 50%+ →](./code-mode)
+  </Step>
+  <Step title="Execute Tools">
+    [Learn the tool execution workflow →](./tool-execution)
+  </Step>
+  <Step title="Enable Agent Mode">
+    [Configure autonomous tool execution →](./agent-mode)
+  </Step>
+</Steps>
--- a/docs/mcp/per-user-oauth.mdx
+++ b/docs/mcp/per-user-oauth.mdx
@@ -0,0 +1,194 @@
+---
+title: "Per-User OAuth"
+sidebarTitle: "Per-User OAuth"
+description: "Let each end-user authenticate with upstream MCP services under their own credentials. Works with both the MCP Gateway and LLM Gateway."
+icon: "users"
+---
+
+## Overview
+
+<Info>Per-user OAuth is available in **Bifrost v1.5.0-prerelease2 and above**.</Info>
+
+**Per-user OAuth** lets each end-user connect to upstream MCP services (Notion, GitHub, etc.) using their own credentials. Instead of a single shared admin token, every user gets their own access — scoped to their account, their data.
+
+This is different from [server-level OAuth](./oauth), where an admin authenticates once and every request uses the same shared token:
+
+| | Server-level OAuth | Per-user OAuth |
+|---|---|---|
+| Who authenticates | Admin, once | Each end-user individually |
+| Token scope | Shared across all requests | Per-user, per-service |
+| Identity required | No | Yes (VK, User ID, or session) |
+| Persists across sessions | Yes (background refresh) | Yes, when tied to VK or User ID |
+| Works with MCP Gateway | Yes | Yes |
+| Works with LLM Gateway | Yes | Yes |
+
+---
+
+## Setup
+
+Per-user OAuth is configured through the Web UI only. During setup, Bifrost runs a test OAuth flow and pre-fetches the available tools from the upstream service — this is why file-based config is not supported for this auth type.
+
+<Tabs>
+<Tab title="Web UI">
+
+1. Navigate to **MCP Gateway** and click **New MCP Server**
+2. Select **HTTP** or **SSE** as the connection type and enter the server URL
+3. Set **Auth Type** to **Per-User OAuth**
+4. Fill in the OAuth application credentials:
+   - **Client ID** — your upstream OAuth app's client ID
+   - **Client Secret** — optional for PKCE flows
+   - **Authorize URL** — upstream authorization endpoint (or leave blank for auto-discovery)
+   - **Token URL** — upstream token endpoint (or leave blank for auto-discovery)
+   - **Scopes** — comma-separated list of requested scopes
+5. Click **Create** — Bifrost runs a test OAuth flow to validate the config and pre-fetches the tool list
+6. Complete the authorization in your browser
+7. Save the MCP client
+
+![Per-User OAuth MCP Server Configuration](../media/ui-mcp-per-user-oauth-setup.png)
+
+</Tab>
+</Tabs>
+
+<Info>
+If your upstream server supports OAuth Discovery (RFC 8414), you can leave the authorize and token URLs blank and provide only the **Server URL**. Bifrost will discover the endpoints automatically.
+</Info>
+
+---
+
+## How it works: MCP Gateway
+
+When you expose Bifrost as an MCP server (via the `/mcp` endpoint) and at least one MCP client is configured with `per_user_oauth`, Bifrost becomes an **OAuth 2.1 Authorization Server**. OAuth-capable MCP clients like Claude Code and Cursor detect this automatically — no manual configuration required on the client side.
+
+The full flow involves three distinct phases: **discovery** (the client finds Bifrost's OAuth endpoints), **consent** (the user attaches an identity and connects upstream services), and **authenticated use** (all subsequent tool calls carry the user's tokens transparently). The diagram below shows all three phases end to end.
+
+![MCP Gateway per-user OAuth flow — discovery, consent, and authenticated tool execution](../media/ui-mcp-per-user-oauth-flow-mcp.svg)
+
+### First connection: the consent flow
+
+The first time a client connects, Bifrost walks the user through a two-step consent screen and then issues a session token:
+
+**Step 1 — Identity selection**
+
+The user chooses how to identify themselves for this session:
+
+- **Virtual Key** — ties upstream tokens to the VK permanently; tokens survive session restarts and work across the LLM Gateway too
+- **User ID** — a self-declared identifier with the same persistence guarantees as a VK
+- **Skip** — no identity attached; tokens are scoped to this session only and won't carry over to other sessions or the LLM Gateway
+
+![Consent identity selection screen](../media/ui-mcp-per-user-oauth-consent-identity.png)
+
+**Step 2 — Connect upstream services**
+
+The user sees all per-user OAuth MCP servers available on their Virtual Key. They can connect all of them at once or just the ones they want right now.
+
+![MCP services connection screen](../media/ui-mcp-per-user-oauth-consent-mcps.png)
+
+For each selected service, the user is redirected to the upstream OAuth provider (Notion, GitHub, etc.) to authorize access. After authorizing, they return to Bifrost and can connect additional services or finish.
+
+**Step 3 — Done**
+
+Bifrost issues a 24-hour session token. The MCP client receives this token and proceeds normally. All subsequent tool calls use the user's upstream tokens transparently.
+
+### Lazy auth for skipped services
+
+If the user skips a service during consent — or a new per-user MCP server is added later — Bifrost handles it lazily. When a tool call hits a service the user hasn't authenticated with yet, Bifrost returns an auth URL in the tool result instead of executing the tool:
+
+```
+Authentication required for Notion. Open this URL to connect:
+https://your-bifrost-domain.com/api/oauth/per-user/upstream/authorize?...
+```
+
+![Auth URL returned inline in a tool result in Claude Code](../media/ui-mcp-per-user-oauth-llm-prompt-mcp.png)
+
+The user opens the URL, completes the upstream OAuth flow, and Bifrost saves the token against their session identity. The next tool call proceeds without any re-auth. This lazy pattern is the same one used by the LLM Gateway — the only difference is the auth URL surfaces as a tool result message rather than an API response field.
+
+---
+
+## How it works: LLM Gateway
+
+When using per-user OAuth through the LLM Gateway (`/v1/chat/completions`), there is no upfront consent screen. Auth is **entirely lazy**: Bifrost waits until a tool actually needs a token before asking for one. This is the same pattern used when a service is skipped during MCP Gateway consent.
+
+The pattern is simple: every request carries an identity header, and any tool call to an unauthenticated service returns an auth URL instead of a result. The user completes auth once at that URL; all subsequent calls to that service execute normally. The diagram below shows the full cycle.
+
+![LLM Gateway per-user OAuth flow — lazy auth on first tool call, transparent execution on retry](../media/ui-mcp-per-user-oauth-flow-llm.svg)
+
+1. The user makes a request with an identity header attached (required — see below)
+2. The LLM suggests a tool call to a per-user OAuth service
+3. If no token exists for that user + service, Bifrost returns an `mcp_auth_required` response with an `authorize_url` **instead of executing the tool** — the rest of the LLM response still comes through normally
+
+![API response showing mcp_auth_required with an authorize_url when the user has not yet authenticated](../media/ui-mcp-per-user-oauth-llm-prompt-llm.png)
+
+4. The user opens the URL and completes the upstream OAuth flow
+5. Bifrost saves the token against their identity — no action needed on your side
+6. On the next request, the tool call executes normally — no re-auth, no special handling required
+
+### Identity is required
+
+The LLM Gateway has no session management, so an identity must be declared on every request. Without one, Bifrost has no stable key to look up or store tokens against.
+
+Pass one of:
+
+```bash
+# Virtual Key (recommended — also works with MCP Gateway)
+-H "x-bf-virtual-key: vk_your_key"
+
+# Self-declared User ID
+-H "X-Bf-User-Id: user_123"
+```
+
+<Note>
+**Enterprise**: When enterprise user identity is configured, the user's identity is automatically attached as the User ID — no manual header required.
+</Note>
+
+---
+
+## Cross-gateway token sharing
+
+Tokens are stored against an **identity** (Virtual Key or User ID), not against a gateway. This means:
+
+- Authenticate via the **LLM Gateway** with a VK → that token is immediately usable on the **MCP Gateway** with the same VK
+- Authenticate via the **MCP Gateway** consent flow with a VK → that VK works on the **LLM Gateway** with no re-auth needed
+
+The only exception is **Skip** (session-only) auth: those tokens are not associated with any persistent identity and cannot be used from the LLM Gateway.
+
+| Identity mode | Set via | Cross-gateway portable | Persists across sessions |
+|---|---|---|---|
+| Virtual Key | Consent screen or `x-bf-virtual-key` header | Yes | Yes |
+| User ID | Consent screen or `X-Bf-User-Id` header | Yes | Yes |
+| Skip (MCP Gateway only) | Consent screen | No | No |
+
+---
+
+## Config reference
+
+Per-user OAuth is configured on the MCP client via `auth_type`. When `auth_type` is `per_user_oauth`, an `oauth_config_id` linking to the OAuth credentials is required (set automatically during UI setup):
+
+```json
+{
+  "mcp": {
+    "mcp_clients": [
+      {
+        "name": "notion",
+        "connection_type": "sse",
+        "connection_string": "https://mcp.notion.so/sse",
+        "auth_type": "per_user_oauth",
+        "oauth_config_id": "oauth_cfg_abc123",
+        "tools_to_execute": ["*"]
+      }
+    ]
+  }
+}
+```
+
+| Field | Type | Description |
+|---|---|---|
+| `auth_type` | string | Set to `"per_user_oauth"` |
+| `oauth_config_id` | string | ID of the OAuth config created during UI setup |
+
+---
+
+## Next Steps
+
+- [Server-level OAuth →](./oauth) — admin authenticates once, shared token for all requests
+- [MCP Gateway URL →](./gateway-url) — expose Bifrost as an MCP server for Claude Code and Cursor
+- [Tool Filtering →](./filtering) — control which per-user tools are available per Virtual Key
--- a/docs/mcp/tool-execution.mdx
+++ b/docs/mcp/tool-execution.mdx
@@ -0,0 +1,414 @@
+---
+title: "Tool Execution"
+sidebarTitle: "Tool Execution"
+description: "Execute MCP tools with full control over approval and conversation flow."
+icon: "play"
+---
+
+## Overview
+
+When an LLM returns tool calls in its response, Bifrost does **not** automatically execute them. Instead, your application explicitly calls the tool execution API, giving you full control over:
+
+- Which tool calls to execute
+- User approval workflows
+- Security validation
+- Audit logging
+
+The basic flow is: **Chat Request → Review Tool Calls → Execute Tools → Continue Conversation**. For detailed architecture diagrams, see the [MCP Architecture](/architecture/core/mcp#tool-execution-engine) documentation.
+
+---
+
+## Authentication
+
+The `/v1/mcp/tool/execute` endpoint uses the same authentication as other inference endpoints like `/v1/chat/completions`:
+
+| Auth Configuration | Behavior |
+|--------------------|----------|
+| `disable_auth_on_inference: true` | No auth required |
+| `disable_auth_on_inference: false` | Auth required |
+
+Virtual keys and authentication are independent layers that work together. For details on how to use virtual keys with authentication, see [Authentication and Virtual Keys](/features/governance/virtual-keys#authentication-and-virtual-keys).
+
+---
+
+## End-to-End Example
+
+<Tabs>
+<Tab title="Gateway">
+
+### Step 1: Send Chat Request
+
+```bash
+curl -X POST http://localhost:8080/v1/chat/completions \
+  -H "Content-Type: application/json" \
+  -d '{
+    "model": "openai/gpt-4o",
+    "messages": [
+      {
+        "role": "user",
+        "content": "List files in the current directory"
+      }
+    ]
+  }'
+```
+
+**Response with tool calls:**
+```json
+{
+  "id": "chatcmpl-abc123",
+  "choices": [{
+    "index": 0,
+    "message": {
+      "role": "assistant",
+      "content": null,
+      "tool_calls": [{
+        "id": "call_xyz789",
+        "type": "function",
+        "function": {
+          "name": "filesystem_list_directory",
+          "arguments": "{\"path\": \".\"}"
+        }
+      }]
+    },
+    "finish_reason": "tool_calls"
+  }]
+}
+```
+
+<Note>
+Tool names are prefixed with the MCP client name (e.g., `filesystem_list_directory`). This ensures uniqueness across multiple MCP clients.
+</Note>
+
+### Step 2: Execute the Tool
+
+The request body matches the tool call object from the response:
+
+```bash
+curl -X POST http://localhost:8080/v1/mcp/tool/execute \
+  -H "Content-Type: application/json" \
+  -d '{
+    "id": "call_xyz789",
+    "type": "function",
+    "function": {
+      "name": "filesystem_list_directory",
+      "arguments": "{\"path\": \".\"}"
+    }
+  }'
+```
+
+**Tool result response:**
+```json
+{
+  "role": "tool",
+  "content": "[\"config.json\", \"main.go\", \"README.md\"]",
+  "tool_call_id": "call_xyz789"
+}
+```
+
+### Step 3: Continue the Conversation
+
+Assemble the full conversation history and continue:
+
+```bash
+curl -X POST http://localhost:8080/v1/chat/completions \
+  -H "Content-Type: application/json" \
+  -d '{
+    "model": "openai/gpt-4o",
+    "messages": [
+      {
+        "role": "user",
+        "content": "List files in the current directory"
+      },
+      {
+        "role": "assistant",
+        "content": null,
+        "tool_calls": [{
+          "id": "call_xyz789",
+          "type": "function",
+          "function": {
+            "name": "filesystem_list_directory",
+            "arguments": "{\"path\": \".\"}"
+          }
+        }]
+      },
+      {
+        "role": "tool",
+        "content": "[\"config.json\", \"main.go\", \"README.md\"]",
+        "tool_call_id": "call_xyz789"
+      }
+    ]
+  }'
+```
+
+**Final response:**
+```json
+{
+  "choices": [{
+    "message": {
+      "role": "assistant",
+      "content": "The current directory contains 3 files:\n\n1. **config.json** - Configuration file\n2. **main.go** - Go source file\n3. **README.md** - Documentation"
+    },
+    "finish_reason": "stop"
+  }]
+}
+```
+
+</Tab>
+<Tab title="Go SDK">
+
+```go
+package main
+
+import (
+    "context"
+    "fmt"
+    bifrost "github.com/maximhq/bifrost/core"
+    "github.com/maximhq/bifrost/core/schemas"
+)
+
+func main() {
+    // Initialize Bifrost with MCP; `config` is built as shown in Connecting to Servers
+    client, initErr := bifrost.Init(context.Background(), config)
+    if initErr != nil {
+        panic(initErr)
+    }
+
+    // Step 1: Send initial request
+    firstMessage := schemas.ChatMessage{
+        Role: schemas.ChatMessageRoleUser,
+        Content: schemas.ChatMessageContent{
+            ContentStr: bifrost.Ptr("List files in the current directory"),
+        },
+    }
+
+    request := &schemas.BifrostChatRequest{
+        Provider: schemas.OpenAI,
+        Model:    "gpt-4o",
+        Input:    []schemas.ChatMessage{firstMessage},
+    }
+
+    response, err := client.ChatCompletionRequest(schemas.NewBifrostContext(context.Background(), schemas.NoDeadline), request)
+    if err != nil {
+        panic(err)
+    }
+
+    // Build conversation history
+    history := []schemas.ChatMessage{firstMessage}
+
+    // Step 2: Process tool calls
+    if response.Choices[0].Message.ToolCalls != nil {
+        assistantMessage := response.Choices[0].Message
+        history = append(history, assistantMessage)
+
+        for _, toolCall := range *assistantMessage.ToolCalls {
+            fmt.Printf("Tool requested: %s\n", *toolCall.Function.Name)
+
+            // YOUR APPROVAL LOGIC HERE
+            // - Validate arguments
+            // - Check permissions
+            // - Get user confirmation if needed
+
+            // Step 3: Execute the tool
+            toolResult, err := client.ExecuteChatMCPTool(context.Background(), toolCall)
+            if err != nil {
+                fmt.Printf("Tool execution failed: %v\n", err)
+                continue
+            }
+
+            fmt.Printf("Tool result: %s\n", *toolResult.Content.ContentStr)
+            history = append(history, *toolResult)
+        }
+    }
+
+    // Step 4: Continue conversation with results
+    finalRequest := &schemas.BifrostChatRequest{
+        Provider: schemas.OpenAI,
+        Model:    "gpt-4o",
+        Input:    history,
+    }
+
+    finalResponse, err := client.ChatCompletionRequest(schemas.NewBifrostContext(context.Background(), schemas.NoDeadline), finalRequest)
+    if err != nil {
+        panic(err)
+    }
+
+    fmt.Printf("Final response: %s\n", *finalResponse.Choices[0].Message.Content.ContentStr)
+}
+```
+
+</Tab>
+</Tabs>
+
+---
+
+## Response Formats
+
+Bifrost supports two API formats for tool execution:
+
+### Chat Format (Default)
+
+Use `?format=chat` or omit the parameter:
+
+```bash
+POST /v1/mcp/tool/execute?format=chat
+```
+
+**Request:**
+```json
+{
+  "id": "call_xyz789",
+  "type": "function",
+  "function": {
+    "name": "filesystem_read_file",
+    "arguments": "{\"path\": \"config.json\"}"
+  }
+}
+```
+
+**Response:**
+```json
+{
+  "role": "tool",
+  "content": "{\"key\": \"value\"}",
+  "tool_call_id": "call_xyz789"
+}
+```
+
+### Responses Format
+
+Use `?format=responses` for the Responses API format:
+
+```bash
+POST /v1/mcp/tool/execute?format=responses
+```
+
+**Request:**
+```json
+{
+  "type": "function_call",
+  "call_id": "call_xyz789",
+  "name": "filesystem_read_file",
+  "arguments": "{\"path\": \"config.json\"}"
+}
+```
+
+**Response:**
+```json
+{
+  "type": "function_call_output",
+  "call_id": "call_xyz789",
+  "output": "{\"key\": \"value\"}"
+}
+```
+
+---
+
+## Multiple Tool Calls
+
+LLMs often request multiple tools in a single response. Execute them in sequence or parallel:
+
+<Tabs>
+<Tab title="Sequential">
+
+```go
+for _, toolCall := range *response.Choices[0].Message.ToolCalls {
+    result, err := client.ExecuteChatMCPTool(ctx, toolCall)
+    if err != nil {
+        // Handle error
+        continue
+    }
+    history = append(history, *result)
+}
+```
+
+</Tab>
+<Tab title="Parallel">
+
+```go
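+// Requires the "sync" package
+toolCalls := *response.Choices[0].Message.ToolCalls
+results := make([]*schemas.ChatMessage, len(toolCalls))
+var wg sync.WaitGroup
+
+for i, toolCall := range toolCalls {
+    wg.Add(1)
+    go func(idx int, tc schemas.ChatAssistantMessageToolCall) {
+        defer wg.Done()
+        result, err := client.ExecuteChatMCPTool(ctx, tc)
+        if err != nil {
+            // Failed calls leave results[idx] nil and are skipped below;
+            // log or collect the error here if you need to report it
+            return
+        }
+        // Each goroutine writes only its own index, so no mutex is needed
+        results[idx] = result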
+    }(i, toolCall)
+}
+
+wg.Wait()
+
+for _, result := range results {
+    if result != nil {
+        history = append(history, *result)
+    }
+}
+```
+
+</Tab>
+</Tabs>
+
+---
+
+## Error Handling
+
+Tool execution can fail for various reasons:
+
+```go
+result, err := client.ExecuteChatMCPTool(ctx, toolCall)
+if err != nil {
+    switch {
+    case errors.Is(err, context.DeadlineExceeded):
+        // Tool execution timed out
+    case strings.Contains(err.Error(), "tool not found"):
+        // Tool doesn't exist or client disconnected
+    case strings.Contains(err.Error(), "not allowed"):
+        // Tool filtered out by configuration
+    default:
+        // Other execution error
+    }
+}
+```
+
+**Gateway error responses:**
+```json
+{
+  "error": {
+    "type": "tool_execution_error",
+    "message": "Tool 'filesystem_delete_file' is not allowed for this request"
+  }
+}
+```
+
+---
+
+## Copy-Pastable Responses
+
+Tool execution responses are designed to be directly appended to your conversation history:
+
+```go
+// Tool result is already in the correct format
+toolResult, _ := client.ExecuteChatMCPTool(ctx, toolCall)
+
+// Just append it directly
+history = append(history, *toolResult)
+```
+
+The response includes:
+- Correct `role` field (`"tool"`)
+- Matching `tool_call_id` for correlation
+- Properly formatted `content`
+
+---
+
+## Next Steps
+
+<CardGroup cols={2}>
+  <Card title="Agent Mode" icon="robot" href="./agent-mode">
+    Enable autonomous tool execution with auto-approval
+  </Card>
+  <Card title="Tool Filtering" icon="filter" href="./filtering">
+    Control which tools are available per request
+  </Card>
+</CardGroup>
--- a/docs/mcp/tool-hosting.mdx
+++ b/docs/mcp/tool-hosting.mdx
@@ -0,0 +1,517 @@
+---
+title: "Tool Hosting"
+sidebarTitle: "Tool Hosting"
+description: "Register custom tools directly in your Go application without external MCP servers."
+icon: "toolbox"
+---
+
+<Info>
+This feature is only available when using Bifrost as a **Go SDK**. It is not available in the Gateway deployment.
+</Info>
+
+## Overview
+
+**Tool Hosting** allows you to register custom tools directly within your Go application. These tools run in-process with zero network overhead, making them ideal for:
+
+- Application-specific business logic
+- High-performance operations
+- Testing and development
+- Tools that need access to application state
+
+Bifrost automatically creates an internal MCP server (`bifrostInternal`) when you register your first tool.
+
+---
+
+## Basic Usage
+
+### Step 1: Define Your Tool Schema
+
+Create a schema that describes your tool's parameters:
+
+```go
+import "github.com/maximhq/bifrost/core/schemas"
+
+// Define the tool schema
+calculatorSchema := schemas.ChatTool{
+    Type: schemas.ChatToolTypeFunction,
+    Function: &schemas.ChatToolFunction{
+        Name:        "calculator",
+        Description: schemas.Ptr("Perform basic arithmetic operations"),
+        Parameters: &schemas.ToolFunctionParameters{
+            Type: "object",
+            Properties: &schemas.OrderedMap{
+                "operation": map[string]interface{}{
+                    "type":        "string",
+                    "description": "The arithmetic operation to perform",
+                    "enum":        []string{"add", "subtract", "multiply", "divide"},
+                },
+                "a": map[string]interface{}{
+                    "type":        "number",
+                    "description": "First operand",
+                },
+                "b": map[string]interface{}{
+                    "type":        "number",
+                    "description": "Second operand",
+                },
+            },
+            Required: []string{"operation", "a", "b"},
+        },
+    },
+}
+```
+
+### Step 2: Implement the Handler
+
+Create a function that handles tool execution:
+
+```go
+func calculatorHandler(args any) (string, error) {
+    // Parse arguments
+    argsMap, ok := args.(map[string]interface{})
+    if !ok {
+        return "", fmt.Errorf("invalid arguments")
+    }
+
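+    operation, ok := argsMap["operation"].(string)
+    if !ok {
+        return "", fmt.Errorf("operation must be a string")
+    }
+    // JSON numbers decode to float64; reject missing or non-numeric operands
+    a, okA := argsMap["a"].(float64)
+    b, okB := argsMap["b"].(float64)
+    if !okA || !okB {
+        return "", fmt.Errorf("a and b must be numbers")
+    }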
+
+    var result float64
+    switch operation {
+    case "add":
+        result = a + b
+    case "subtract":
+        result = a - b
+    case "multiply":
+        result = a * b
+    case "divide":
+        if b == 0 {
+            return "", fmt.Errorf("division by zero")
+        }
+        result = a / b
+    default:
+        return "", fmt.Errorf("unknown operation: %s", operation)
+    }
+
+    return fmt.Sprintf("%.2f", result), nil
+}
+```
+
+### Step 3: Register the Tool
+
+Register your tool with Bifrost:
+
+```go
+import (
+    "context"
+    "fmt"
+
+    bifrost "github.com/maximhq/bifrost/core"
+    "github.com/maximhq/bifrost/core/schemas"
+)
+
+func main() {
+    // Initialize Bifrost with MCP enabled (even empty config is fine)
+    client, err := bifrost.Init(context.Background(), schemas.BifrostConfig{
+        Account: account,
+        MCPConfig: &schemas.MCPConfig{}, // Required for tool registration
+    })
+    if err != nil {
+        panic(err)
+    }
+
+    // Register the calculator tool
+    err = client.RegisterMCPTool(
+        "calculator",
+        "Perform basic arithmetic operations",
+        calculatorHandler,
+        calculatorSchema,
+    )
+    if err != nil {
+        panic(fmt.Sprintf("Failed to register tool: %v", err))
+    }
+
+    // Now the tool is available in all chat requests
+}
+```
+
+---
+
+## Complete Example
+
+Here's a complete example with multiple tools:
+
+```go
+package main
+
+import (
+    "context"
+    "fmt"
+    "time"
+
+    bifrost "github.com/maximhq/bifrost/core"
+    "github.com/maximhq/bifrost/core/schemas"
+)
+
+func main() {
+    // Initialize with empty MCP config to enable tool registration
+    client, err := bifrost.Init(context.Background(), schemas.BifrostConfig{
+        Account: schemas.Account{
+            Provider: schemas.OpenAI,
+            APIKey:   "your-api-key",
+        },
+        MCPConfig: &schemas.MCPConfig{},
+    })
+    if err != nil {
+        panic(err)
+    }
+
+    // Register a calculator tool
+    registerCalculator(client)
+
+    // Register a time tool
+    registerTimeTool(client)
+
+    // Make a request - tools are automatically available
+    response, err := client.ChatCompletionRequest(schemas.NewBifrostContext(context.Background(), schemas.NoDeadline), &schemas.BifrostChatRequest{
+        Provider: schemas.OpenAI,
+        Model:    "gpt-4o",
+        Input: []schemas.ChatMessage{
+            {
+                Role: schemas.ChatMessageRoleUser,
+                Content: schemas.ChatMessageContent{
+                    ContentStr: bifrost.Ptr("What is 15 * 7? Also, what time is it?"),
+                },
+            },
+        },
+    })
+
+    if err != nil {
+        panic(err)
+    }
+
+    // Handle tool calls in the response...
+    _ = response // placeholder so the example compiles until the response is used
+}
+
+func registerCalculator(client *bifrost.Bifrost) {
+    schema := schemas.ChatTool{
+        Type: schemas.ChatToolTypeFunction,
+        Function: &schemas.ChatToolFunction{
+            Name:        "calculator",
+            Description: schemas.Ptr("Perform arithmetic: add, subtract, multiply, divide"),
+            Parameters: &schemas.ToolFunctionParameters{
+                Type: "object",
+                Properties: &schemas.OrderedMap{
+                    "operation": map[string]interface{}{
+                        "type": "string",
+                        "enum": []string{"add", "subtract", "multiply", "divide"},
+                    },
+                    "a": map[string]interface{}{"type": "number"},
+                    "b": map[string]interface{}{"type": "number"},
+                },
+                Required: []string{"operation", "a", "b"},
+            },
+        },
+    }
+
+    handler := func(args any) (string, error) {
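+        // Use checked assertions: an unchecked one panics on malformed arguments
+        m, ok := args.(map[string]interface{})
+        if !ok {
+            return "", fmt.Errorf("invalid arguments")
+        }
+        op, _ := m["operation"].(string)
+        a, okA := m["a"].(float64)
+        b, okB := m["b"].(float64)
+        if !okA || !okB {
+            return "", fmt.Errorf("a and b must be numbers")
+        }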
+
+        var result float64
+        switch op {
+        case "add":
+            result = a + b
+        case "subtract":
+            result = a - b
+        case "multiply":
+            result = a * b
+        case "divide":
+            if b == 0 {
+                return "", fmt.Errorf("cannot divide by zero")
+            }
+            result = a / b
+        default:
+            return "", fmt.Errorf("unknown operation: %s", op)
+        }
+        return fmt.Sprintf("%.2f", result), nil
+    }
+
+    if err := client.RegisterMCPTool("calculator", "Arithmetic calculator", handler, schema); err != nil {
+        panic(err)
+    }
+}
+
+func registerTimeTool(client *bifrost.Bifrost) {
+    schema := schemas.ChatTool{
+        Type: schemas.ChatToolTypeFunction,
+        Function: &schemas.ChatToolFunction{
+            Name:        "get_current_time",
+            Description: schemas.Ptr("Get the current date and time"),
+            Parameters: &schemas.ToolFunctionParameters{
+                Type: "object",
+                Properties: &schemas.OrderedMap{
+                    "timezone": map[string]interface{}{
+                        "type":        "string",
+                        "description": "Timezone (e.g., 'America/New_York', 'UTC')",
+                    },
+                },
+                Required: []string{},
+            },
+        },
+    }
+
+    handler := func(args any) (string, error) {
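+        // timezone is optional, so tolerate missing arguments;
+        // reading from a nil map is safe in Go
+        m, _ := args.(map[string]interface{})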
+        tzName, _ := m["timezone"].(string)
+
+        var loc *time.Location
+        var err error
+        if tzName != "" {
+            loc, err = time.LoadLocation(tzName)
+            if err != nil {
+                return "", fmt.Errorf("invalid timezone: %s", tzName)
+            }
+        } else {
+            loc = time.UTC
+        }
+
+        now := time.Now().In(loc)
+        return now.Format("2006-01-02 15:04:05 MST"), nil
+    }
+
+    if err := client.RegisterMCPTool("get_current_time", "Get current time", handler, schema); err != nil {
+        panic(err)
+    }
+}
+```
+
+---
+
+## Typed Handlers
+
+For better type safety, use typed structs with JSON marshaling:
+
+```go
+// Define typed arguments
+type WeatherArgs struct {
+    City  string `json:"city"`
+    Units string `json:"units,omitempty"` // celsius or fahrenheit
+}
+
+type WeatherResponse struct {
+    City        string  `json:"city"`
+    Temperature float64 `json:"temperature"`
+    Units       string  `json:"units"`
+    Condition   string  `json:"condition"`
+}
+
+func weatherHandler(args any) (string, error) {
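+    // Round-trip through JSON to parse into the typed struct
+    argsBytes, err := json.Marshal(args)
+    if err != nil {
+        return "", fmt.Errorf("invalid arguments: %w", err)
+    }
+    var typedArgs WeatherArgs
+    if err := json.Unmarshal(argsBytes, &typedArgs); err != nil {
+        return "", fmt.Errorf("invalid arguments: %w", err)
+    }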
+
+    // Default units
+    if typedArgs.Units == "" {
+        typedArgs.Units = "celsius"
+    }
+
+    // Your weather logic here...
+    response := WeatherResponse{
+        City:        typedArgs.City,
+        Temperature: 22.5,
+        Units:       typedArgs.Units,
+        Condition:   "sunny",
+    }
+
+    // Return as JSON string
+    result, err := json.Marshal(response)
+    if err != nil {
+        return "", fmt.Errorf("encoding response: %w", err)
+    }
+    return string(result), nil
+}
+```
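+
+The marshal-then-unmarshal step above can be factored into a reusable generic helper so each handler does not repeat it. This is a sketch under the assumption that handlers receive arguments as decoded JSON values (for example `map[string]interface{}`); `decodeArgs` is a hypothetical helper name, not part of the Bifrost API, and it requires Go 1.18 or later for generics.
+
+```go
+package main
+
+import (
+    "encoding/json"
+    "fmt"
+)
+
+// decodeArgs converts loosely typed tool arguments into T via a JSON round trip.
+func decodeArgs[T any](args any) (T, error) {
+    var out T
+    raw, err := json.Marshal(args)
+    if err != nil {
+        return out, fmt.Errorf("encoding arguments: %w", err)
+    }
+    if err := json.Unmarshal(raw, &out); err != nil {
+        return out, fmt.Errorf("decoding arguments: %w", err)
+    }
+    return out, nil
+}
+
+// WeatherArgs matches the typed arguments used in the handler above.
+type WeatherArgs struct {
+    City  string `json:"city"`
+    Units string `json:"units,omitempty"`
+}
+
+func main() {
+    args := map[string]interface{}{"city": "Paris"}
+    typed, err := decodeArgs[WeatherArgs](args)
+    if err != nil {
+        panic(err)
+    }
+    fmt.Printf("%+v\n", typed)
+}
+```
+
+A handler can then start with `typedArgs, err := decodeArgs[WeatherArgs](args)` and return the error directly when the model sends arguments of the wrong type.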
+
+---
+
+## Tool Naming
+
+Tool names from `RegisterMCPTool` are prefixed with `bifrostInternal_` when exposed to LLMs:
+
+| Registered Name | LLM Sees |
+|-----------------|----------|
+| `calculator` | `bifrostInternal_calculator` |
+| `get_weather` | `bifrostInternal_get_weather` |
+
+This prevents naming conflicts with tools from external MCP servers.
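+
+If your application needs to map between the two forms, for example when logging tool calls returned by the LLM, the prefix can be applied and stripped with plain string helpers. This is a minimal sketch; `exposedName` and `registeredName` are hypothetical helpers introduced for illustration, and the only assumption is the `bifrostInternal_` prefix from the table above.
+
+```go
+package main
+
+import (
+    "fmt"
+    "strings"
+)
+
+// internalPrefix is the prefix shown in the table above.
+const internalPrefix = "bifrostInternal_"
+
+// exposedName returns the name an LLM would see for a registered tool.
+func exposedName(registered string) string {
+    return internalPrefix + registered
+}
+
+// registeredName strips the prefix from a tool-call name. ok is false
+// when the name does not carry the hosted-tool prefix.
+func registeredName(exposed string) (name string, ok bool) {
+    if !strings.HasPrefix(exposed, internalPrefix) {
+        return exposed, false
+    }
+    return strings.TrimPrefix(exposed, internalPrefix), true
+}
+
+func main() {
+    fmt.Println(exposedName("calculator"))
+    name, ok := registeredName("bifrostInternal_get_weather")
+    fmt.Println(name, ok)
+}
+```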
+
+---
+
+## Error Handling
+
+Return errors from your handler to indicate tool execution failures:
+
+```go
+func myHandler(args any) (string, error) {
+    // Validation errors
+    if args == nil {
+        return "", fmt.Errorf("arguments required")
+    }
+
+    // Business logic errors
+    if someCondition {
+        return "", fmt.Errorf("operation not permitted: %s", reason)
+    }
+
+    // External service errors
+    result, err := callExternalService()
+    if err != nil {
+        return "", fmt.Errorf("service error: %w", err)
+    }
+
+    return result, nil
+}
+```
+
+Errors are returned to the LLM as tool error messages, allowing it to handle the failure gracefully.
+
+---
+
+## Accessing Application State
+
+Since tools run in-process, they can access your application's state:
+
+```go
+type AppContext struct {
+    DB        *sql.DB
+    Cache     *redis.Client
+    UserID    string
+    SessionID string
+}
+
+func createUserTool(appCtx *AppContext) func(args any) (string, error) {
+    return func(args any) (string, error) {
+        // Access database
+        rows, err := appCtx.DB.Query("SELECT * FROM users WHERE id = ?", appCtx.UserID)
+        if err != nil {
+            return "", err
+        }
+        defer rows.Close()
+
+        // Access cache
+        cached, _ := appCtx.Cache.Get(context.Background(), "user:"+appCtx.UserID).Result()
+
+        // Return result
+        return fmt.Sprintf("User data: %s", cached), nil
+    }
+}
+
+// Usage
+appCtx := &AppContext{
+    DB:     db,
+    Cache:  redisClient,
+    UserID: "user123",
+}
+client.RegisterMCPTool("get_user_data", "Get current user data", createUserTool(appCtx), schema)
+```
+
+---
+
+## Best Practices
+
+<AccordionGroup>
+  <Accordion title="Validate inputs">
+    Always validate arguments before processing:
+    ```go
+    func handler(args any) (string, error) {
+        m, ok := args.(map[string]interface{})
+        if !ok {
+            return "", fmt.Errorf("expected object arguments")
+        }
+
+        required := []string{"field1", "field2"}
+        for _, field := range required {
+            if _, exists := m[field]; !exists {
+                return "", fmt.Errorf("missing required field: %s", field)
+            }
+        }
+        // ...
+    }
+    ```
+  </Accordion>
+
+  <Accordion title="Return structured data">
+    Return JSON for complex responses:
+    ```go
+    func handler(args any) (string, error) {
+        result := map[string]interface{}{
+            "status": "success",
+            "data": []string{"item1", "item2"},
+            "count": 2,
+        }
+        out, err := json.Marshal(result)
+        if err != nil {
+            return "", err
+        }
+        return string(out), nil
+    }
+    ```
+  </Accordion>
+
+  <Accordion title="Handle timeouts">
+    Use context for long-running operations:
+    ```go
+    func handler(args any) (string, error) {
+        ctx, cancel := context.WithTimeout(context.Background(), 10*time.Second)
+        defer cancel()
+
+        result, err := longOperation(ctx)
+        if errors.Is(err, context.DeadlineExceeded) {
+            return "", fmt.Errorf("operation timed out")
+        }
+        return result, err
+    }
+    ```
+  </Accordion>
+
+  <Accordion title="Log for debugging">
+    Add logging for troubleshooting:
+    ```go
+    func handler(args any) (string, error) {
+        log.Printf("Tool called with args: %+v", args)
+
+        result, err := doWork(args)
+        if err != nil {
+            log.Printf("Tool error: %v", err)
+            return "", err
+        }
+
+        log.Printf("Tool result: %s", result)
+        return result, nil
+    }
+    ```
+  </Accordion>
+</AccordionGroup>
+
+---
+
+## Comparison with External MCP Servers
+
+| Aspect | Tool Hosting (In-Process) | External MCP Server |
+|--------|---------------------------|---------------------|
+| Latency | ~0.1ms (no network) | 10-500ms (network dependent) |
+| Deployment | Part of your app | Separate process/service |
+| Language | Go only | Any language |
+| Configuration | Code only | config.json, API, or UI |
+| State Access | Direct access | Via APIs |
+| Scaling | Scales with app | Independent scaling |
+
+---
+
+## Next Steps
+
+<CardGroup cols={2}>
+  <Card title="Tool Execution" icon="play" href="./tool-execution">
+    Learn how tool execution works
+  </Card>
+  <Card title="Agent Mode" icon="robot" href="./agent-mode">
+    Enable auto-execution for hosted tools
+  </Card>
+</CardGroup>