first commit

2026-04-26 21:52:23 +03:00
commit 880f412e2c
2662 changed files with 866266 additions and 0 deletions
--- a/docs/quickstart/go-sdk/context-keys.mdx
+++ b/docs/quickstart/go-sdk/context-keys.mdx
@@ -0,0 +1,388 @@
+---
+title: "Context Keys"
+description: "Use context keys to configure request behavior, pass metadata, and access response information throughout the request lifecycle."
+icon: "key"
+---
+
+Bifrost uses `BifrostContext` — a custom `context.Context` — to pass configuration and metadata through the request lifecycle. Context keys allow you to customize request behavior, pass request-specific settings, and read metadata set by Bifrost.
+
+The idiomatic pattern is to create a `BifrostContext` and call `SetValue` (or the chainable `WithValue`) directly on it:
+
+```go
+bfCtx := schemas.NewBifrostContext(context.Background(), schemas.NoDeadline)
+bfCtx.SetValue(schemas.BifrostContextKeyRequestID, "req-001")
+
+response, err := client.ChatCompletionRequest(bfCtx, &schemas.BifrostChatRequest{...})
+```
+
+## Request Configuration Keys
+
+These keys can be set before making a request to customize behavior.
+
+### Virtual Key
+
+Pass a virtual key identifier to the governance plugin for budget and rate-limit enforcement.
+
+```go
+bfCtx := schemas.NewBifrostContext(context.Background(), schemas.NoDeadline)
+bfCtx.SetValue(schemas.BifrostContextKeyVirtualKey, "vk-my-team")
+```
+
+### Extra Headers
+
+Pass custom headers with individual requests. Headers are automatically propagated to the provider.
+
+```go
+bfCtx := schemas.NewBifrostContext(context.Background(), schemas.NoDeadline)
+bfCtx.SetValue(schemas.BifrostContextKeyExtraHeaders, map[string][]string{
+    "user-id":    {"user-123"},
+    "session-id": {"session-abc"},
+})
+
+response, err := client.ChatCompletionRequest(bfCtx, &schemas.BifrostChatRequest{
+    Provider: schemas.OpenAI,
+    Model:    "gpt-4o-mini",
+    Input:    messages,
+})
+```
+
+<Note>
+See [Custom Headers Per Request](./provider-configuration#custom-headers-per-request) for detailed information on header handling and security restrictions.
+</Note>
+
+### API Key Selection
+
+Bifrost supports selecting a specific key by **ID** or **name**. When both are present, ID takes priority.
+
+#### By ID
+
+Explicitly select a key by its unique ID.
+
+```go
+bfCtx.SetValue(schemas.BifrostContextKeyAPIKeyID, "key-uuid-1234")
+```
+
+#### By Name
+
+Explicitly select a named API key from your configured keys.
+
+```go
+bfCtx.SetValue(schemas.BifrostContextKeyAPIKeyName, "premium-key")
+```
+
+### Direct Key
+
+Provide credentials directly, bypassing Bifrost's key selection entirely. Useful for dynamic or per-request key scenarios.
+
+```go
+bfCtx.SetValue(schemas.BifrostContextKeyDirectKey, schemas.Key{
+    Value:  "sk-direct-api-key",
+    Models: []string{"gpt-4o"},
+    Weight: 1.0,
+})
+```
+
+### Skip Key Selection
+
+Skip key selection entirely and pass an empty key to the provider. Useful for providers that don't require authentication or when using ambient credentials (e.g., IAM roles).
+
+```go
+bfCtx.SetValue(schemas.BifrostContextKeySkipKeySelection, true)
+```
+
+### Session Stickiness (Session ID)
+
+Bind a session to a specific API key so that requests with the same session ID consistently use the same key. Useful for predictable rate-limit buckets, cost attribution per user, and consistent model routing per session.
+
+On the first request for a session ID, Bifrost selects a key (via weighted random) and caches the binding in the KV store. Subsequent requests with the same session ID reuse the cached key as long as it remains valid. If the cached key is no longer in the supported set (disabled, removed, or model support changed), Bifrost re-selects and overwrites the cache.
+
+<Note>
+Session stickiness requires a `KVStore` to be configured in `BifrostConfig`.
+</Note>
+
+```go
+bfCtx.SetValue(schemas.BifrostContextKeySessionID, "user-123-session-abc")
+```
+
+### Session TTL
+
+Controls how long the session-to-key binding is cached. If not set, Bifrost uses a default TTL of 1 hour. The TTL is refreshed on each request so active sessions do not expire.
+
+```go
+bfCtx.SetValue(schemas.BifrostContextKeySessionTTL, 30*time.Minute)
+```
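The selection flow described in the two sections above (weighted random pick, cached binding, re-selection when the cached key drops out of the supported set, TTL refreshed on each use) can be modelled with the standard library alone. This is an illustrative sketch, not Bifrost's implementation: `Key`, `StickySelector`, `DefaultTTL`, and the in-memory map standing in for the `KVStore` are all hypothetical names.

```go
package main

import (
	"errors"
	"fmt"
	"math/rand"
	"time"
)

// DefaultTTL mirrors the 1-hour default described above.
const DefaultTTL = time.Hour

// Key is a hypothetical stand-in for a configured provider key.
type Key struct {
	ID     string
	Weight float64
}

// binding records which key a session is pinned to and when that expires.
type binding struct {
	keyID     string
	expiresAt time.Time
}

// StickySelector is an illustrative in-memory model; a real deployment
// would keep bindings in the configured KV store instead of a map.
type StickySelector struct {
	ttl      time.Duration
	bindings map[string]binding
	rng      *rand.Rand
}

// NewStickySelector falls back to DefaultTTL when ttl is not positive.
func NewStickySelector(ttl time.Duration, seed int64) *StickySelector {
	if ttl <= 0 {
		ttl = DefaultTTL
	}
	return &StickySelector{
		ttl:      ttl,
		bindings: make(map[string]binding),
		rng:      rand.New(rand.NewSource(seed)),
	}
}

// weightedPick returns a key with probability proportional to its weight.
func (s *StickySelector) weightedPick(keys []Key) Key {
	total := 0.0
	for _, k := range keys {
		total += k.Weight
	}
	r := s.rng.Float64() * total
	for _, k := range keys {
		r -= k.Weight
		if r < 0 {
			return k
		}
	}
	return keys[len(keys)-1] // guards against rounding and all-zero weights
}

// Select reuses the cached key while it is unexpired and still supported;
// otherwise it picks a new key and overwrites the binding.
func (s *StickySelector) Select(sessionID string, supported []Key) (Key, error) {
	if len(supported) == 0 {
		return Key{}, errors.New("no supported keys")
	}
	now := time.Now()
	if b, ok := s.bindings[sessionID]; ok && now.Before(b.expiresAt) {
		for _, k := range supported {
			if k.ID == b.keyID {
				// Refresh the TTL so active sessions stay pinned.
				s.bindings[sessionID] = binding{keyID: k.ID, expiresAt: now.Add(s.ttl)}
				return k, nil
			}
		}
	}
	k := s.weightedPick(supported)
	s.bindings[sessionID] = binding{keyID: k.ID, expiresAt: now.Add(s.ttl)}
	return k, nil
}

func main() {
	sel := NewStickySelector(30*time.Minute, 42)
	keys := []Key{{ID: "key-a", Weight: 0.7}, {ID: "key-b", Weight: 0.3}}

	first, _ := sel.Select("user-123-session-abc", keys)
	second, _ := sel.Select("user-123-session-abc", keys)
	fmt.Println(first.ID == second.ID) // same session, same key
}
```

Passing the currently supported keys on every call is what lets the sketch notice a disabled or removed key and re-select without a separate invalidation step.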
+
+### Request ID
+
+Set a custom request ID for tracking and correlation.
+
+```go
+bfCtx.SetValue(schemas.BifrostContextKeyRequestID, "req-12345-abc")
+```
+
+### Custom URL Path
+
+Append a custom path to the provider's base URL. Useful for accessing provider-specific endpoints.
+
+```go
+bfCtx.SetValue(schemas.BifrostContextKeyURLPath, "/custom/endpoint")
+```
+
+### Stream Idle Timeout
+
+Set a per-chunk idle timeout for streaming responses. If no chunk arrives within this duration, the stream is considered stalled and cancelled.
+
+```go
+bfCtx.SetValue(schemas.BifrostContextKeyStreamIdleTimeout, 10*time.Second)
+```
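To make "per-chunk" concrete, here is a minimal standard-library sketch of an idle timeout over a channel of chunks. It only models the behavior described above and is not Bifrost's streaming code; `readWithIdleTimeout` and `ErrStreamStalled` are hypothetical names.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// ErrStreamStalled is a hypothetical sentinel for an idle-timeout cancellation.
var ErrStreamStalled = errors.New("stream stalled: no chunk within idle timeout")

// readWithIdleTimeout drains chunks until the channel closes. The timeout is
// per chunk: every loop iteration starts a fresh idle window, so a long
// stream is fine as long as chunks keep arriving.
func readWithIdleTimeout(chunks <-chan string, idle time.Duration) ([]string, error) {
	var received []string
	for {
		select {
		case chunk, ok := <-chunks:
			if !ok {
				return received, nil // stream ended normally
			}
			received = append(received, chunk)
		case <-time.After(idle):
			return received, ErrStreamStalled
		}
	}
}

func main() {
	chunks := make(chan string)
	go func() {
		chunks <- "Hel"
		chunks <- "lo"
		// The producer stalls here and never closes the channel.
	}()

	got, err := readWithIdleTimeout(chunks, 50*time.Millisecond)
	fmt.Println(got, err)
}
```

The idle window bounds the gap between chunks, not the total stream duration, which is why it complements rather than replaces a request deadline.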
+
+### Raw Request Body
+
+Send a raw request body instead of Bifrost's standardized format. The provider receives your payload as-is. You must both set the context key AND populate `RawRequestBody` on the request.
+
+```go
+rawPayload := []byte(`{
+    "model": "gpt-4o",
+    "messages": [{"role": "user", "content": "Hello!"}],
+    "custom_field": "provider-specific-value"
+}`)
+
+bfCtx := schemas.NewBifrostContext(context.Background(), schemas.NoDeadline)
+bfCtx.SetValue(schemas.BifrostContextKeyUseRawRequestBody, true)
+
+response, err := client.ChatCompletionRequest(bfCtx, &schemas.BifrostChatRequest{
+    Provider:       schemas.OpenAI,
+    Model:          "gpt-4o",
+    RawRequestBody: rawPayload,
+})
+```
+
+<Note>
+When using raw request body, Bifrost bypasses its request conversion and sends your payload directly to the provider. You are responsible for ensuring the payload matches the provider's expected format.
+</Note>
+
+### Send Back Raw Request/Response
+
+Include the original request or response bytes in `ExtraFields` for debugging.
+
+```go
+bfCtx := schemas.NewBifrostContext(context.Background(), schemas.NoDeadline)
+bfCtx.SetValue(schemas.BifrostContextKeySendBackRawRequest, true)
+bfCtx.SetValue(schemas.BifrostContextKeySendBackRawResponse, true)
+
+response, err := client.ChatCompletionRequest(bfCtx, request)
+if err != nil {
+    log.Printf("Request failed: %v", err)
+    return
+}
+if response.ChatResponse != nil {
+    extra := response.ChatResponse.ExtraFields
+    fmt.Printf("Raw request: %v\n", extra.RawRequest)
+    fmt.Printf("Raw response: %v\n", extra.RawResponse)
+}
+```
+
+### Passthrough Extra Parameters
+
+When enabled, any parameters in `ExtraParams` are merged directly into the JSON body sent to the provider, bypassing Bifrost's parameter filtering. Useful for provider-specific parameters that Bifrost doesn't natively support.
+
+```go
+bfCtx := schemas.NewBifrostContext(context.Background(), schemas.NoDeadline)
+bfCtx.SetValue(schemas.BifrostContextKeyPassthroughExtraParams, true)
+
+response, err := client.ChatCompletionRequest(bfCtx, &schemas.BifrostChatRequest{
+    Provider: schemas.OpenAI,
+    Model:    "gpt-4o-mini",
+    Input:    messages,
+    Params: &schemas.ChatParameters{
+        ExtraParams: map[string]interface{}{
+            "custom_param": "value",
+            "another_param": 123,
+            "nested_param": map[string]interface{}{
+                "nested_key": "nested_value",
+            },
+        },
+    },
+})
+```
+
+<Note>
+- This feature only works for JSON requests, not multipart/form-data requests
+- Parameters already handled by Bifrost are not duplicated — they appear in their proper location
+- Nested parameters are merged recursively with existing nested structures
+</Note>
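The recursive merge mentioned in the note can be sketched as follows. This is an illustration under stated assumptions rather than Bifrost's actual merge logic: `mergeExtraParams` is a hypothetical name, and keeping the existing body value on a non-map conflict is an assumption of the sketch.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// mergeExtraParams copies extra into body. Nested maps are merged
// recursively; for any other conflict the value already in body is kept
// (an assumption of this sketch, not documented Bifrost behavior).
func mergeExtraParams(body, extra map[string]any) {
	for key, extraVal := range extra {
		bodyVal, exists := body[key]
		if !exists {
			body[key] = extraVal
			continue
		}
		bodyMap, bodyIsMap := bodyVal.(map[string]any)
		extraMap, extraIsMap := extraVal.(map[string]any)
		if bodyIsMap && extraIsMap {
			mergeExtraParams(bodyMap, extraMap)
		}
	}
}

func main() {
	// body stands in for the JSON request already built for the provider.
	body := map[string]any{
		"model":        "gpt-4o-mini",
		"nested_param": map[string]any{"existing_key": "kept"},
	}
	extra := map[string]any{
		"custom_param": "value",
		"nested_param": map[string]any{"nested_key": "nested_value"},
	}

	mergeExtraParams(body, extra)

	out, _ := json.Marshal(body)
	fmt.Println(string(out))
}
```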
+
+## MCP Context Keys
+
+These keys control MCP tool execution behavior on a per-request basis. Request-level filtering takes priority over client-level configuration.
+
+### Include Clients
+
+Restrict which MCP clients can provide tools for this request. Pass `[]string{"*"}` to include all clients, or an empty slice to exclude all.
+
+```go
+bfCtx.SetValue(schemas.MCPContextKeyIncludeClients, []string{"github", "filesystem"})
+```
+
+### Include Tools
+
+Restrict which tools are available for this request. Use `"clientName-toolName"` format for individual tools or `"clientName-*"` as a wildcard for all tools from a client.
+
+```go
+// Allow only the search tool from the github client
+bfCtx.SetValue(schemas.MCPContextKeyIncludeTools, []string{"github-search_repositories"})
+
+// Allow all tools from filesystem client
+bfCtx.SetValue(schemas.MCPContextKeyIncludeTools, []string{"filesystem-*"})
+```
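The matching rules for both allowlists can be modelled in a few lines. The sketch below is illustrative only: `clientAllowed` and `toolAllowed` are hypothetical helpers, not Bifrost functions, and they cover only the case where the corresponding context key has been set.

```go
package main

import "fmt"

// clientAllowed models the include-clients rule: "*" admits every client,
// an empty list admits none, otherwise the name must be listed.
func clientAllowed(include []string, client string) bool {
	for _, entry := range include {
		if entry == "*" || entry == client {
			return true
		}
	}
	return false
}

// toolAllowed models the include-tools rule for "clientName-toolName"
// entries and the "clientName-*" wildcard.
func toolAllowed(include []string, client, tool string) bool {
	exact := client + "-" + tool
	wildcard := client + "-*"
	for _, entry := range include {
		if entry == exact || entry == wildcard {
			return true
		}
	}
	return false
}

func main() {
	clients := []string{"github", "filesystem"}
	tools := []string{"github-search_repositories", "filesystem-*"}

	fmt.Println(clientAllowed(clients, "github"))
	fmt.Println(toolAllowed(tools, "github", "create_issue"))
	fmt.Println(toolAllowed(tools, "filesystem", "read_file"))
}
```

Building the candidate strings from the client and tool names, instead of splitting each entry on `-`, keeps the comparison unambiguous when a name itself contains a hyphen.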
+
+### MCP Extra Headers
+
+Forward additional headers to MCP servers during tool execution. Only headers present in the MCP client's configured allowlist are forwarded.
+
+```go
+bfCtx.SetValue(schemas.BifrostContextKeyMCPExtraHeaders, map[string][]string{
+    "x-user-id":    {"user-123"},
+    "x-session-id": {"session-abc"},
+})
+```
+
+## Response Metadata Keys
+
+These keys are set by Bifrost and can be read from the context after a request completes. They are particularly useful in plugins and post-hooks.
+
+### Selected Key Information
+
+After Bifrost selects an API key, it stores the selection details in the context.
+
+```go
+keyID := ctx.Value(schemas.BifrostContextKeySelectedKeyID).(string)
+keyName := ctx.Value(schemas.BifrostContextKeySelectedKeyName).(string)
+```
+
+### Retry and Fallback Information
+
+Track retry attempts and fallback progression.
+
+```go
+// Number of retries attempted (0 = first attempt)
+retries := ctx.Value(schemas.BifrostContextKeyNumberOfRetries).(int)
+
+// Fallback index (0 = primary, 1 = first fallback, etc.)
+fallbackIdx := ctx.Value(schemas.BifrostContextKeyFallbackIndex).(int)
+
+// Request ID used for the fallback attempt
+fallbackReqID := ctx.Value(schemas.BifrostContextKeyFallbackRequestID).(string)
+```
+
+### Stream End Indicator
+
+For streaming responses, indicates when the stream has completed. Set by Bifrost automatically.
+
+```go
+isStreamEnd := ctx.Value(schemas.BifrostContextKeyStreamEndIndicator).(bool)
+```
+
+<Note>
+Plugin developers: When implementing a short-circuit streaming response in `PreLLMHook` or `PostLLMHook`, set `BifrostContextKeyStreamEndIndicator` to `true` on the last chunk to trigger proper cleanup.
+</Note>
+
+### Integration Type
+
+Identifies which SDK integration format is in use (useful in gateway plugins).
+
+```go
+integrationType := ctx.Value(schemas.BifrostContextKeyIntegrationType).(string)
+// e.g., "openai", "anthropic", "bedrock"
+```
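A plain type assertion on a context value panics when the key was never set, for example when a plugin runs before key selection or no fallback occurred. A small generic helper avoids that. The sketch below runs on the standard library only; `valueOr`, `sampleContext`, and the `ctxKey` constants are hypothetical stand-ins for the `schemas.BifrostContextKey...` constants.

```go
package main

import (
	"context"
	"fmt"
)

// ctxKey and the two keys below are hypothetical stand-ins for the
// schemas.BifrostContextKey... constants, so the sketch runs on its own.
type ctxKey string

const (
	keySelectedKeyID   ctxKey = "selected-key-id"
	keyNumberOfRetries ctxKey = "number-of-retries"
)

// valueOr returns the context value for key when it is present and has
// type T; otherwise it returns fallback instead of panicking.
func valueOr[T any](ctx context.Context, key any, fallback T) T {
	if v, ok := ctx.Value(key).(T); ok {
		return v
	}
	return fallback
}

// sampleContext simulates a context in which only the selected key ID was set.
func sampleContext() context.Context {
	return context.WithValue(context.Background(), keySelectedKeyID, "key-uuid-1234")
}

func main() {
	ctx := sampleContext()

	keyID := valueOr(ctx, keySelectedKeyID, "unknown")
	retries := valueOr(ctx, keyNumberOfRetries, 0) // key absent: falls back to 0

	fmt.Println(keyID, retries)
}
```

Because `BifrostContext` is described above as a custom `context.Context`, the same helper should accept it directly with the real key constants.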
+
+## Complete Example
+
+```go
+package main
+
+import (
+    "context"
+    "fmt"
+    "log"
+
+    "github.com/maximhq/bifrost"
+    "github.com/maximhq/bifrost/core/schemas"
+)
+
+func makeRequest(client *bifrost.Bifrost) {
+    bfCtx := schemas.NewBifrostContext(context.Background(), schemas.NoDeadline)
+
+    // Request tracking
+    bfCtx.SetValue(schemas.BifrostContextKeyRequestID, "req-001")
+
+    // Custom headers forwarded to the provider
+    bfCtx.SetValue(schemas.BifrostContextKeyExtraHeaders, map[string][]string{
+        "x-correlation-id": {"corr-12345"},
+        "x-tenant-id":      {"tenant-abc"},
+    })
+
+    // Include raw provider response for debugging
+    bfCtx.SetValue(schemas.BifrostContextKeySendBackRawResponse, true)
+
+    // Restrict MCP tools to a specific client
+    bfCtx.SetValue(schemas.MCPContextKeyIncludeClients, []string{"filesystem"})
+
+    messages := []schemas.ChatMessage{
+        {
+            Role:    schemas.ChatMessageRoleUser,
+            Content: &schemas.ChatMessageContent{ContentStr: schemas.Ptr("Hello!")},
+        },
+    }
+
+    response, err := client.ChatCompletionRequest(bfCtx, &schemas.BifrostChatRequest{
+        Provider: schemas.OpenAI,
+        Model:    "gpt-4o-mini",
+        Input:    messages,
+    })
+    if err != nil {
+        log.Printf("Request failed: %v", err)
+        return
+    }
+
+    if response.ChatResponse != nil {
+        extra := response.ChatResponse.ExtraFields
+        fmt.Printf("Provider: %s\n", extra.Provider)
+        fmt.Printf("Latency: %dms\n", extra.Latency)
+        if extra.RawResponse != nil {
+            fmt.Println("Raw response captured for debugging")
+        }
+    }
+}
+```
+
+## Context Keys Reference
+
+| Key | Type | Direction | Description |
+|-----|------|-----------|-------------|
+| `BifrostContextKeyVirtualKey` | `string` | Set | Virtual key identifier for governance |
+| `BifrostContextKeyAPIKeyName` | `string` | Set | Explicit API key name selection |
+| `BifrostContextKeyAPIKeyID` | `string` | Set | Explicit API key ID selection (priority over name) |
+| `BifrostContextKeyRequestID` | `string` | Set | Custom request ID for tracking |
+| `BifrostContextKeyFallbackRequestID` | `string` | Read | Request ID used for fallback attempt |
+| `BifrostContextKeyDirectKey` | `schemas.Key` | Set | Provide credentials directly, bypassing key selection |
+| `BifrostContextKeySkipKeySelection` | `bool` | Set | Skip key selection entirely |
+| `BifrostContextKeySessionID` | `string` | Set | Session ID for key stickiness (requires KV store) |
+| `BifrostContextKeySessionTTL` | `time.Duration` | Set | TTL for session-to-key cache (default: 1 hour) |
+| `BifrostContextKeyExtraHeaders` | `map[string][]string` | Set | Custom headers forwarded to the provider |
+| `BifrostContextKeyURLPath` | `string` | Set | Custom URL path appended to provider base URL |
+| `BifrostContextKeyStreamIdleTimeout` | `time.Duration` | Set | Per-chunk idle timeout for streaming responses |
+| `BifrostContextKeyUseRawRequestBody` | `bool` | Set | Send raw request body directly to provider |
+| `BifrostContextKeySendBackRawRequest` | `bool` | Set | Include raw request in `ExtraFields` |
+| `BifrostContextKeySendBackRawResponse` | `bool` | Set | Include raw provider response in `ExtraFields` |
+| `BifrostContextKeyPassthroughExtraParams` | `bool` | Set | Merge `ExtraParams` directly into provider request |
+| `MCPContextKeyIncludeClients` | `[]string` | Set | Allowlist of MCP client names for this request |
+| `MCPContextKeyIncludeTools` | `[]string` | Set | Allowlist of MCP tools (`"client-tool"` or `"client-*"`) |
+| `BifrostContextKeyMCPExtraHeaders` | `map[string][]string` | Set | Extra headers forwarded to MCP servers during tool execution |
+| `BifrostContextKeySelectedKeyID` | `string` | Read | ID of the key selected by Bifrost |
+| `BifrostContextKeySelectedKeyName` | `string` | Read | Name of the key selected by Bifrost |
+| `BifrostContextKeyNumberOfRetries` | `int` | Read | Number of retry attempts made |
+| `BifrostContextKeyFallbackIndex` | `int` | Read | Current fallback index (0 = primary) |
+| `BifrostContextKeyStreamEndIndicator` | `bool` | Read | Whether the stream has completed |
+| `BifrostContextKeyIntegrationType` | `string` | Read | SDK integration format in use (e.g. `"openai"`) |
+| `BifrostContextKeyUserAgent` | `string` | Read | User agent of the incoming request |
+
+## Next Steps
+
+- **[Provider Configuration](./provider-configuration)** - Configure providers and keys
+- **[Streaming Responses](./streaming)** - Real-time response handling
+- **[Tool Calling](./tool-calling)** - Enable AI function calling
+- **[Core Features](../../features/)** - Advanced Bifrost capabilities
--- a/docs/quickstart/go-sdk/logger.mdx
+++ b/docs/quickstart/go-sdk/logger.mdx
@@ -0,0 +1,303 @@
+---
+title: "Logging"
+description: "Configure logging for debugging, monitoring, and troubleshooting your Bifrost integration."
+icon: "file-lines"
+---
+
+Bifrost provides a flexible logging system with configurable log levels and output formats. You can use the built-in default logger or implement your own custom logger.
+
+## Using the Default Logger
+
+Bifrost includes a `DefaultLogger` that writes to stdout/stderr with timestamps. Create one with your desired log level:
+
+```go
+import (
+    "github.com/maximhq/bifrost"
+    "github.com/maximhq/bifrost/core/schemas"
+)
+
+func main() {
+    // Create logger with desired level
+    logger := bifrost.NewDefaultLogger(schemas.LogLevelInfo)
+
+    // Initialize Bifrost with the logger
+    client, err := bifrost.Init(schemas.BifrostConfig{
+        Account: &MyAccount{},
+        Logger:  logger,
+    })
+    if err != nil {
+        panic(err)
+    }
+}
+```
+
+## Log Levels
+
+Bifrost supports four log levels, from most to least verbose:
+
+| Level | Constant | Description |
+|-------|----------|-------------|
+| Debug | `schemas.LogLevelDebug` | Detailed debugging information for development |
+| Info | `schemas.LogLevelInfo` | General operational messages |
+| Warn | `schemas.LogLevelWarn` | Potentially harmful situations |
+| Error | `schemas.LogLevelError` | Serious problems requiring attention |
+
+```go
+// Debug level - most verbose, includes all messages
+logger := bifrost.NewDefaultLogger(schemas.LogLevelDebug)
+
+// Info level - general operational messages
+logger := bifrost.NewDefaultLogger(schemas.LogLevelInfo)
+
+// Warn level - only warnings and errors
+logger := bifrost.NewDefaultLogger(schemas.LogLevelWarn)
+
+// Error level - only errors (least verbose)
+logger := bifrost.NewDefaultLogger(schemas.LogLevelError)
+```
+
+You can change the log level at runtime:
+
+```go
+logger.SetLevel(schemas.LogLevelDebug)
+```
+
+## Output Formats
+
+The default logger supports two output formats:
+
+### JSON Output (Default)
+
+Structured JSON logs, ideal for log aggregation systems:
+
+```go
+logger := bifrost.NewDefaultLogger(schemas.LogLevelInfo)
+logger.SetOutputType(schemas.LoggerOutputTypeJSON)
+```
+
+Output example:
+```json
+{"level":"info","time":"2024-01-15T10:30:00Z","message":"Request completed"}
+```
+
+### Pretty Output
+
+Human-readable colored output, ideal for development:
+
+```go
+logger := bifrost.NewDefaultLogger(schemas.LogLevelInfo)
+logger.SetOutputType(schemas.LoggerOutputTypePretty)
+```
+
+Output example:
+```
+10:30:00 INF Request completed
+```
+
+## Custom Logger Implementation
+
+Implement the `Logger` interface to integrate with your existing logging infrastructure:
+
+```go
+type Logger interface {
+    Debug(msg string, args ...any)
+    Info(msg string, args ...any)
+    Warn(msg string, args ...any)
+    Error(msg string, args ...any)
+    Fatal(msg string, args ...any)
+    SetLevel(level schemas.LogLevel)
+    SetOutputType(outputType schemas.LoggerOutputType)
+}
+```
+
+### Example: Zap Logger Integration
+
+```go
+import (
+    "go.uber.org/zap"
+    "github.com/maximhq/bifrost/core/schemas"
+)
+
+type ZapLogger struct {
+    logger *zap.SugaredLogger
+    level  zap.AtomicLevel
+}
+
+func NewZapLogger() *ZapLogger {
+    level := zap.NewAtomicLevelAt(zap.InfoLevel)
+    config := zap.NewProductionConfig()
+    config.Level = level
+    logger, err := config.Build()
+    if err != nil {
+        panic(err) // fail fast: a nil logger would panic later on Sugar()
+    }
+    return &ZapLogger{
+        logger: logger.Sugar(),
+        level:  level,
+    }
+}
+
+func (l *ZapLogger) Debug(msg string, args ...any) {
+    l.logger.Debugf(msg, args...)
+}
+
+func (l *ZapLogger) Info(msg string, args ...any) {
+    l.logger.Infof(msg, args...)
+}
+
+func (l *ZapLogger) Warn(msg string, args ...any) {
+    l.logger.Warnf(msg, args...)
+}
+
+func (l *ZapLogger) Error(msg string, args ...any) {
+    l.logger.Errorf(msg, args...)
+}
+
+func (l *ZapLogger) Fatal(msg string, args ...any) {
+    l.logger.Fatalf(msg, args...)
+}
+
+func (l *ZapLogger) SetLevel(level schemas.LogLevel) {
+    switch level {
+    case schemas.LogLevelDebug:
+        l.level.SetLevel(zap.DebugLevel)
+    case schemas.LogLevelInfo:
+        l.level.SetLevel(zap.InfoLevel)
+    case schemas.LogLevelWarn:
+        l.level.SetLevel(zap.WarnLevel)
+    case schemas.LogLevelError:
+        l.level.SetLevel(zap.ErrorLevel)
+    }
+}
+
+func (l *ZapLogger) SetOutputType(outputType schemas.LoggerOutputType) {
+    // Zap handles output format via encoder configuration
+}
+```
+
+### Example: Logrus Integration
+
+```go
+import (
+    "github.com/sirupsen/logrus"
+    "github.com/maximhq/bifrost/core/schemas"
+)
+
+type LogrusLogger struct {
+    logger *logrus.Logger
+}
+
+func NewLogrusLogger() *LogrusLogger {
+    logger := logrus.New()
+    logger.SetLevel(logrus.InfoLevel)
+    return &LogrusLogger{logger: logger}
+}
+
+func (l *LogrusLogger) Debug(msg string, args ...any) {
+    l.logger.Debugf(msg, args...)
+}
+
+func (l *LogrusLogger) Info(msg string, args ...any) {
+    l.logger.Infof(msg, args...)
+}
+
+func (l *LogrusLogger) Warn(msg string, args ...any) {
+    l.logger.Warnf(msg, args...)
+}
+
+func (l *LogrusLogger) Error(msg string, args ...any) {
+    l.logger.Errorf(msg, args...)
+}
+
+func (l *LogrusLogger) Fatal(msg string, args ...any) {
+    l.logger.Fatalf(msg, args...)
+}
+
+func (l *LogrusLogger) SetLevel(level schemas.LogLevel) {
+    switch level {
+    case schemas.LogLevelDebug:
+        l.logger.SetLevel(logrus.DebugLevel)
+    case schemas.LogLevelInfo:
+        l.logger.SetLevel(logrus.InfoLevel)
+    case schemas.LogLevelWarn:
+        l.logger.SetLevel(logrus.WarnLevel)
+    case schemas.LogLevelError:
+        l.logger.SetLevel(logrus.ErrorLevel)
+    }
+}
+
+func (l *LogrusLogger) SetOutputType(outputType schemas.LoggerOutputType) {
+    switch outputType {
+    case schemas.LoggerOutputTypeJSON:
+        l.logger.SetFormatter(&logrus.JSONFormatter{})
+    case schemas.LoggerOutputTypePretty:
+        l.logger.SetFormatter(&logrus.TextFormatter{
+            FullTimestamp: true,
+        })
+    }
+}
+```
+
+## Using Your Custom Logger
+
+Pass your custom logger to Bifrost during initialization:
+
+```go
+client, err := bifrost.Init(schemas.BifrostConfig{
+    Account: &MyAccount{},
+    Logger:  NewZapLogger(),  // or NewLogrusLogger()
+})
+```
+
+## Disabling Logging
+
+To disable logging, implement a no-op logger:
+
+```go
+type NoOpLogger struct{}
+
+func (l *NoOpLogger) Debug(msg string, args ...any)                   {}
+func (l *NoOpLogger) Info(msg string, args ...any)                    {}
+func (l *NoOpLogger) Warn(msg string, args ...any)                    {}
+func (l *NoOpLogger) Error(msg string, args ...any)                   {}
+func (l *NoOpLogger) Fatal(msg string, args ...any)                   {}
+func (l *NoOpLogger) SetLevel(level schemas.LogLevel)                 {}
+func (l *NoOpLogger) SetOutputType(outputType schemas.LoggerOutputType) {}
+
+// Use it
+client, err := bifrost.Init(schemas.BifrostConfig{
+    Account: &MyAccount{},
+    Logger:  &NoOpLogger{},
+})
+```
+
+## Best Practices
+
+### Development vs Production
+
+```go
+func createLogger(env string) schemas.Logger {
+    logger := bifrost.NewDefaultLogger(schemas.LogLevelInfo)
+
+    if env == "development" {
+        logger.SetLevel(schemas.LogLevelDebug)
+        logger.SetOutputType(schemas.LoggerOutputTypePretty)
+    } else {
+        logger.SetLevel(schemas.LogLevelInfo)
+        logger.SetOutputType(schemas.LoggerOutputTypeJSON)
+    }
+
+    return logger
+}
+```
+
+### Log Level Guidelines
+
+- **Debug**: Use during development to trace request flow, inspect payloads, and diagnose issues
+- **Info**: Use for normal operational events like successful requests, provider switches
+- **Warn**: Use for recoverable issues like retries, fallback activations, deprecated usage
+- **Error**: Use for failures that need attention but don't crash the application
+
+## Next Steps
+
+- **[Context Keys](./context-keys)** - Pass metadata through requests
+- **[Provider Configuration](./provider-configuration)** - Configure multiple providers
+- **[Streaming Responses](./streaming)** - Real-time response handling
+- **[Core Features](../../features/)** - Advanced Bifrost capabilities
--- a/docs/quickstart/go-sdk/multimodal.mdx
+++ b/docs/quickstart/go-sdk/multimodal.mdx
@@ -0,0 +1,393 @@
+---
+title: "Multimodal Support"
+description: "Process multiple types of content including images, audio, and text with AI models. Bifrost supports vision analysis, image generation, speech synthesis, and audio transcription across various providers."
+icon: "images"
+---
+
+## Vision: Analyzing Images with AI
+
+Send images to vision-capable models for analysis, description, and understanding. This example shows how to analyze an image from a URL using GPT-4o with high detail processing for better accuracy.
+
+```go
+response, err := client.ChatCompletionRequest(schemas.NewBifrostContext(context.Background(), schemas.NoDeadline), &schemas.BifrostChatRequest{
+	Provider: schemas.OpenAI,
+	Model:    "gpt-4o", // Using vision-capable model
+	Input: []schemas.ChatMessage{
+		{
+			Role: schemas.ChatMessageRoleUser,
+			Content: &schemas.ChatMessageContent{
+				ContentBlocks: []schemas.ChatContentBlock{
+					{
+						Type: schemas.ChatContentBlockTypeText,
+						Text: schemas.Ptr("What do you see in this image? Please describe it in detail."),
+					},
+					{
+						Type: schemas.ChatContentBlockTypeImage,
+						ImageURLStruct: &schemas.ChatInputImage{
+							URL:    "https://upload.wikimedia.org/wikipedia/commons/thumb/d/dd/Gfp-wisconsin-madison-the-nature-boardwalk.jpg/2560px-Gfp-wisconsin-madison-the-nature-boardwalk.jpg",
+							Detail: schemas.Ptr("high"), // Optional: can be "low", "high", or "auto"
+						},
+					},
+				},
+			},
+		},
+	},
+})
+
+if err != nil {
+	panic(err)
+}
+
+fmt.Println("Response:", *response.Choices[0].Message.Content.ContentStr)
+```
+
+## Image Generation: Generating Images with AI
+
+Generate images from text prompts using OpenAI-compatible image generation models via the Go SDK.
+
+```go
+response, err := client.ImageGenerationRequest(schemas.NewBifrostContext(context.Background(), schemas.NoDeadline), &schemas.BifrostImageGenerationRequest{
+	Provider: schemas.OpenAI,
+	Model:    "dall-e-3",
+	Input: &schemas.ImageGenerationInput{
+		Prompt: "A futuristic city skyline at sunset with flying cars",
+	},
+	Params: &schemas.ImageGenerationParameters{
+		Size:           schemas.Ptr("1024x1024"),
+		ResponseFormat: schemas.Ptr("url"),
+	},
+})
+
+if err != nil {
+	panic(err)
+}
+
+// Handle image generation response
+if len(response.Data) > 0 {
+	imageData := response.Data[0]
+	
+	// Handle URL response (when response_format is "url")
+	if imageData.URL != "" {
+		fmt.Printf("Generated image URL: %s\n", imageData.URL)
+	}
+	
+	// Handle base64-encoded response (when response_format is "b64_json")
+	if imageData.B64JSON != "" {
+		fmt.Printf("Generated base64 image (length: %d)\n", len(imageData.B64JSON))
+	}
+	
+	// Handle revised prompt if present
+	if imageData.RevisedPrompt != "" {
+		fmt.Printf("Revised prompt: %s\n", imageData.RevisedPrompt)
+	}
+}
+
+// Handle usage metrics
+// Note: For image generation endpoints, response.Usage and Usage.TotalTokens may be empty/not populated
+// as token-based usage metrics are not provided by some image-generation providers
+if response.Usage != nil {
+	fmt.Printf("Usage: %d tokens\n", response.Usage.TotalTokens)
+}
+```
+
+## Audio Understanding: Analyzing Audio with AI
+
+If your chat application already handles text, you can add audio input and output by including `audio` in the request's modalities and using an audio-capable model such as `gpt-4o-audio-preview`.
+
+### Audio Input to Model
+
+```go
+response, err := client.ChatCompletionRequest(schemas.NewBifrostContext(context.Background(), schemas.NoDeadline), &schemas.BifrostChatRequest{
+	Provider: schemas.OpenAI,
+	Model:    "gpt-4o-audio-preview",
+	Input: []schemas.ChatMessage{
+		{
+			Role: schemas.ChatMessageRoleUser,
+			Content: &schemas.ChatMessageContent{
+				ContentBlocks: []schemas.ChatContentBlock{
+					{
+						Type: schemas.ChatContentBlockTypeText,
+						Text: schemas.Ptr("Please analyze this audio recording and summarize what was discussed."),
+					},
+					{
+						Type: schemas.ChatContentBlockTypeInputAudio,
+						InputAudio: &schemas.ChatInputAudio{
+							Data:   []byte("base64-encoded audio data containing the word 'Affirmative'"),
+							Format: "wav",
+						},
+					},
+				},
+			},
+		},
+	},
+})
+```
+
+## Text-to-Speech: Converting Text to Audio
+
+Convert text into natural-sounding speech using AI voice models. This example demonstrates generating an MP3 audio file from text using the "alloy" voice. The result is saved to a local file for playback.
+
+```go
+response, err := client.SpeechRequest(schemas.NewBifrostContext(context.Background(), schemas.NoDeadline), &schemas.BifrostSpeechRequest{
+	Provider: schemas.OpenAI,
+	Model:    "tts-1", // Using text-to-speech model
+	Input: &schemas.SpeechInput{
+		Input: "Hello! This is a sample text that will be converted to speech using Bifrost's speech synthesis capabilities. The weather today is wonderful, and I hope you're having a great day!",
+	},
+	Params: &schemas.SpeechParameters{
+		VoiceConfig: &schemas.SpeechVoiceInput{
+			Voice: schemas.Ptr("alloy"),
+		},
+		ResponseFormat: schemas.Ptr("mp3"),
+	},
+})
+
+if err != nil {
+	panic(err)
+}
+
+// Handle speech synthesis response
+if response.Speech != nil && len(response.Speech.Audio) > 0 {
+	// Save the audio to a file
+	filename := "output.mp3"
+	err := os.WriteFile(filename, response.Speech.Audio, 0644)
+	if err != nil {
+		panic(fmt.Sprintf("Failed to save audio file: %v", err))
+	}
+
+	fmt.Printf("Speech synthesis successful! Audio saved to %s, file size: %d bytes\n", filename, len(response.Speech.Audio))
+}
+```
+
+## Speech-to-Text: Transcribing Audio Files
+
+Convert audio files into text using AI transcription models. This example shows how to transcribe an MP3 file using OpenAI's Whisper model, with an optional context prompt to improve accuracy.
+
+```go
+// Read the audio file for transcription
+audioFilename := "output.mp3"
+audioData, err := os.ReadFile(audioFilename)
+if err != nil {
+	panic(fmt.Sprintf("Failed to read audio file %s: %v. Please make sure the file exists.", audioFilename, err))
+}
+
+fmt.Printf("Loaded audio file %s (%d bytes) for transcription...\n", audioFilename, len(audioData))
+
+response, err := client.TranscriptionRequest(schemas.NewBifrostContext(context.Background(), schemas.NoDeadline), &schemas.BifrostTranscriptionRequest{
+	Provider: schemas.OpenAI,
+	Model:    "whisper-1", // Using Whisper model for transcription
+	Input: &schemas.TranscriptionInput{
+		File: audioData,
+	},
+	Params: &schemas.TranscriptionParameters{
+		Prompt: schemas.Ptr("This is a sample audio transcription from Bifrost speech synthesis."), // Optional: provide context
+	},
+})
+
+if err != nil {
+	panic(err)
+}
+
+fmt.Printf("Transcription Result: %s\n", response.Transcribe.Text)
+```
+
+## Advanced Vision Examples
+
+### Multiple Images
+
+Send multiple images in a single request for comparison or analysis. This is useful for comparing products, analyzing changes over time, or understanding relationships between different visual elements.
+
+```go
+response, err := client.ChatCompletionRequest(schemas.NewBifrostContext(context.Background(), schemas.NoDeadline), &schemas.BifrostChatRequest{
+	Provider: schemas.OpenAI,
+	Model:    "gpt-4o",
+	Input: []schemas.ChatMessage{
+		{
+			Role: schemas.ChatMessageRoleUser,
+			Content: &schemas.ChatMessageContent{
+				ContentBlocks: []schemas.ChatContentBlock{
+					{
+						Type: schemas.ChatContentBlockTypeText,
+						Text: schemas.Ptr("Compare these two images. What are the differences?"),
+					},
+					{
+						Type: schemas.ChatContentBlockTypeImage,
+						ImageURLStruct: &schemas.ChatInputImage{
+							URL: "https://example.com/image1.jpg",
+						},
+					},
+					{
+						Type: schemas.ChatContentBlockTypeImage,
+						ImageURLStruct: &schemas.ChatInputImage{
+							URL: "https://example.com/image2.jpg",
+						},
+					},
+				},
+			},
+		},
+	},
+})
+```
+
+### Base64 Images
+
+Process local images by encoding them as base64 data URLs. This approach is ideal when you need to analyze images stored locally on your system without uploading them to external URLs first.
+
+```go
+// Read and encode image
+imageData, err := os.ReadFile("local_image.jpg")
+if err != nil {
+	panic(err)
+}
+base64Image := base64.StdEncoding.EncodeToString(imageData)
+dataURL := fmt.Sprintf("data:image/jpeg;base64,%s", base64Image)
+
+response, err := client.ChatCompletionRequest(schemas.NewBifrostContext(context.Background(), schemas.NoDeadline), &schemas.BifrostChatRequest{
+	Provider: schemas.OpenAI,
+	Model:    "gpt-4o",
+	Input: []schemas.ChatMessage{
+		{
+			Role: schemas.ChatMessageRoleUser,
+			Content: &schemas.ChatMessageContent{
+				ContentBlocks: []schemas.ChatContentBlock{
+					{
+						Type: schemas.ChatContentBlockTypeText,
+						Text: schemas.Ptr("Analyze this image and describe what you see."),
+					},
+					{
+						Type: schemas.ChatContentBlockTypeImage,
+						ImageURLStruct: &schemas.ChatInputImage{
+							URL:    dataURL,
+							Detail: schemas.Ptr("high"),
+						},
+					},
+				},
+			},
+		},
+	},
+})
+```
+
+## Audio Configuration Options
+
+### Voice Selection for Speech Synthesis
+
+OpenAI provides six distinct voice options, each with different characteristics. This example generates sample audio files for each voice so you can compare and choose the one that best fits your application.
+
+```go
+// Available voices: alloy, echo, fable, onyx, nova, shimmer
+voices := []string{"alloy", "echo", "fable", "onyx", "nova", "shimmer"}
+
+for _, voice := range voices {
+	response, err := client.SpeechRequest(schemas.NewBifrostContext(context.Background(), schemas.NoDeadline), &schemas.BifrostSpeechRequest{
+		Provider: schemas.OpenAI,
+		Model:    "tts-1",
+		Input: &schemas.SpeechInput{
+			Input: fmt.Sprintf("This is the %s voice speaking.", voice),
+		},
+		Params: &schemas.SpeechParameters{
+			VoiceConfig: &schemas.SpeechVoiceInput{
+				Voice: schemas.Ptr(voice),
+			},
+			ResponseFormat: schemas.Ptr("mp3"),
+		},
+	})
+	
+	if err == nil && response.Speech != nil {
+		filename := fmt.Sprintf("sample_%s.mp3", voice)
+		os.WriteFile(filename, response.Speech.Audio, 0644)
+		fmt.Printf("Generated %s\n", filename)
+	}
+}
+```
+
+### Audio Formats
+
+Generate audio in different formats depending on your use case: MP3 for general use, Opus for web streaming, AAC for mobile apps, and FLAC for high-quality audio applications.
+
+```go
+formats := []string{"mp3", "opus", "aac", "flac"}
+
+for _, format := range formats {
+	response, err := client.SpeechRequest(schemas.NewBifrostContext(context.Background(), schemas.NoDeadline), &schemas.BifrostSpeechRequest{
+		Provider: schemas.OpenAI,
+		Model:    "tts-1",
+		Input: &schemas.SpeechInput{
+			Input: "Testing different audio formats.",
+		},
+		Params: &schemas.SpeechParameters{
+			VoiceConfig: &schemas.SpeechVoiceInput{
+				Voice: schemas.Ptr("alloy"),
+			},
+			ResponseFormat: schemas.Ptr(format),
+		},
+	})
+	
+	if err == nil && response.Speech != nil {
+		filename := fmt.Sprintf("output.%s", format)
+		os.WriteFile(filename, response.Speech.Audio, 0644)
+	}
+}
+```
+
+## Transcription Options
+
+### Language Specification
+
+Improve transcription accuracy by specifying the source language. This is particularly helpful for non-English audio or when the audio contains technical terms or specific domain vocabulary.
+
+```go
+response, err := client.TranscriptionRequest(schemas.NewBifrostContext(context.Background(), schemas.NoDeadline), &schemas.BifrostTranscriptionRequest{
+	Provider: schemas.OpenAI,
+	Model:    "whisper-1",
+	Input: &schemas.TranscriptionInput{
+		File: audioData,
+	},
+	Params: &schemas.TranscriptionParameters{
+		Language: schemas.Ptr("es"), // Spanish
+		Prompt:   schemas.Ptr("This is a Spanish audio recording about technology."),
+	},
+})
+```
+
+### Response Formats
+
+Choose between simple text output or detailed JSON responses with timestamps. The verbose JSON format provides word-level and segment-level timing information, useful for creating subtitles or analyzing speech patterns.
+
+```go
+// Text only
+response, err := client.TranscriptionRequest(schemas.NewBifrostContext(context.Background(), schemas.NoDeadline), &schemas.BifrostTranscriptionRequest{
+	Provider: schemas.OpenAI,
+	Model:    "whisper-1",
+	Input: &schemas.TranscriptionInput{
+		File: audioData,
+	},
+	Params: &schemas.TranscriptionParameters{
+		ResponseFormat: schemas.Ptr("text"),
+	},
+})
+
+// JSON with timestamps
+response, err = client.TranscriptionRequest(schemas.NewBifrostContext(context.Background(), schemas.NoDeadline), &schemas.BifrostTranscriptionRequest{
+	Provider: schemas.OpenAI,
+	Model:    "whisper-1",
+	Input: &schemas.TranscriptionInput{
+		File: audioData,
+	},
+	Params: &schemas.TranscriptionParameters{
+		ResponseFormat:         schemas.Ptr("verbose_json"),
+		TimestampGranularities: []string{"word", "segment"},
+	},
+})
+```
+
+<Info>
+Check the [Supported Providers](/providers/supported-providers/overview) page for more information on multimodal capabilities supported by each provider.
+</Info>
+
+## Next Steps
+
+- **[Streaming Responses](./streaming)** - Real-time multimodal processing
+- **[Tool Calling](./tool-calling)** - Combine with external tools
+- **[Provider Configuration](./provider-configuration)** - Multiple providers for different capabilities
+- **[Core Features](../../features/)** - Advanced Bifrost capabilities
--- a/docs/quickstart/go-sdk/provider-configuration.mdx
+++ b/docs/quickstart/go-sdk/provider-configuration.mdx
@@ -0,0 +1,504 @@
+---
+title: "Provider Configuration"
+description: "Configure multiple AI providers with custom concurrency, queue sizes, proxy settings, and more."
+icon: "sliders"
+---
+
+## Multi-Provider Setup
+
+Configure multiple providers to seamlessly switch between them. This example shows how to configure OpenAI, Anthropic, and Mistral providers.
+
+```go
+type MyAccount struct{}
+
+func (a *MyAccount) GetConfiguredProviders() ([]schemas.ModelProvider, error) {
+    return []schemas.ModelProvider{schemas.OpenAI, schemas.Anthropic, schemas.Mistral}, nil
+}
+
+func (a *MyAccount) GetKeysForProvider(ctx *context.Context, provider schemas.ModelProvider) ([]schemas.Key, error) {
+    switch provider {
+    case schemas.OpenAI:
+        return []schemas.Key{{
+            Value:  os.Getenv("OPENAI_API_KEY"),
+            Models: []string{},
+            Weight: 1.0,
+        }}, nil
+    case schemas.Anthropic:
+        return []schemas.Key{{
+            Value:  os.Getenv("ANTHROPIC_API_KEY"),
+            Models: []string{},
+            Weight: 1.0,
+        }}, nil
+    case schemas.Mistral:
+        return []schemas.Key{{
+            Value:  os.Getenv("MISTRAL_API_KEY"),
+            Models: []string{},
+            Weight: 1.0,
+        }}, nil
+    }
+    return nil, fmt.Errorf("provider %s not supported", provider)
+}
+
+func (a *MyAccount) GetConfigForProvider(provider schemas.ModelProvider) (*schemas.ProviderConfig, error) {
+    // Return same config for all providers
+    return &schemas.ProviderConfig{
+        NetworkConfig:            schemas.DefaultNetworkConfig,
+        ConcurrencyAndBufferSize: schemas.DefaultConcurrencyAndBufferSize,
+    }, nil
+}
+```
+
+> If Bifrost receives a request for a provider that `GetConfiguredProviders()` did not return during `bifrost.Init()`, it sets up that provider at runtime using `GetConfigForProvider()`, which may delay the first request to that provider.
+
+## Making Requests
+
+Once providers are configured, you can make requests to any specific provider. This example shows how to send a request directly to Mistral's latest vision model. Bifrost handles the provider-specific API formatting automatically.
+
+```go
+response, err := client.ChatCompletionRequest(schemas.NewBifrostContext(context.Background(), schemas.NoDeadline), &schemas.BifrostChatRequest{
+    Provider: schemas.Mistral,
+    Model:    "pixtral-12b-latest",
+    Input:    messages,
+})
+```
+
+## Environment Variables
+
+Set up your API keys for the providers you want to use:
+
+```bash
+export OPENAI_API_KEY="your-openai-api-key"
+export ANTHROPIC_API_KEY="your-anthropic-api-key"
+export CEREBRAS_API_KEY="your-cerebras-api-key"
+export MISTRAL_API_KEY="your-mistral-api-key"
+export GROQ_API_KEY="your-groq-api-key"
+export COHERE_API_KEY="your-cohere-api-key"
+```
+
+## Advanced Configuration
+
+### Weighted Load Balancing
+
+Distribute requests across multiple API keys or providers based on custom weights. This example shows how to split traffic 70/30 between two OpenAI keys, useful for managing rate limits or costs across different accounts.
+
+```go
+func (a *MyAccount) GetKeysForProvider(ctx *context.Context, provider schemas.ModelProvider) ([]schemas.Key, error) {
+    switch provider {
+    case schemas.OpenAI:
+        return []schemas.Key{
+            {
+                Value:  os.Getenv("OPENAI_API_KEY_1"),
+                Models: []string{},
+                Weight: 0.7, // 70% of requests
+            },
+            {
+                Value:  os.Getenv("OPENAI_API_KEY_2"),
+                Models: []string{},
+                Weight: 0.3, // 30% of requests
+            },
+        }, nil
+    }
+    return nil, fmt.Errorf("provider %s not supported", provider)
+}
+```
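+
+To make the weighting concrete, here is a minimal, library-independent sketch of weighted random selection. It is not Bifrost's actual key selector, whose algorithm is not shown in this guide; `weightedKey`, `pickKey`, and `pickRandomKey` are hypothetical names used only for illustration.
+
+```go
+package main
+
+import "math/rand"
+
+// weightedKey is a hypothetical stand-in for schemas.Key, used only in this sketch.
+type weightedKey struct {
+	Name   string
+	Weight float64
+}
+
+// pickKey maps r, a value in [0, 1), onto the keys in proportion to their
+// weights. keys must be non-empty.
+func pickKey(keys []weightedKey, r float64) weightedKey {
+	total := 0.0
+	for _, k := range keys {
+		total += k.Weight
+	}
+	target := r * total
+	for _, k := range keys {
+		if target < k.Weight {
+			return k
+		}
+		target -= k.Weight
+	}
+	return keys[len(keys)-1] // guards against floating-point rounding
+}
+
+// pickRandomKey draws r from math/rand and delegates to pickKey.
+func pickRandomKey(keys []weightedKey) weightedKey {
+	return pickKey(keys, rand.Float64())
+}
+```
+
+With weights 0.7 and 0.3, any `r` below 0.7 selects the first key and the rest select the second, so over many requests the split approaches 70/30.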
+
+### Model-Specific Keys
+
+Use different API keys for specific models, allowing you to manage access controls and billing separately. This example uses a premium key for advanced reasoning models (o1-preview, o1-mini) and a standard key for regular GPT models.
+
+```go
+func (a *MyAccount) GetKeysForProvider(ctx *context.Context, provider schemas.ModelProvider) ([]schemas.Key, error) {
+    switch provider {
+    case schemas.OpenAI:
+        return []schemas.Key{
+            {
+                Value:  os.Getenv("OPENAI_API_KEY"),
+                Models: []string{"gpt-4o", "gpt-4o-mini"},
+                Weight: 1.0,
+            },
+            {
+                Value:  os.Getenv("OPENAI_API_KEY_PREMIUM"),
+                Models: []string{"o1-preview", "o1-mini"},
+                Weight: 1.0,
+            },
+        }, nil
+    }
+    return nil, fmt.Errorf("provider %s not supported", provider)
+}
+```
+
+### Custom Base URL
+
+Override the default API endpoint for a provider. This is useful for connecting to self-hosted models, local development servers, or OpenAI-compatible APIs like vLLM, Ollama, or LiteLLM.
+
+```go
+func (a *MyAccount) GetConfigForProvider(provider schemas.ModelProvider) (*schemas.ProviderConfig, error) {
+	switch provider {
+	case schemas.OpenAI:
+		return &schemas.ProviderConfig{
+			NetworkConfig: schemas.NetworkConfig{
+				BaseURL: "http://localhost:8000/v1", // Custom endpoint
+			},
+			ConcurrencyAndBufferSize: schemas.DefaultConcurrencyAndBufferSize,
+		}, nil
+	}
+	return nil, fmt.Errorf("provider %s not supported", provider)
+}
+```
+
+<Note>
+For self-hosted providers like Ollama and SGL, `BaseURL` is required. For standard providers, it's optional and overrides the default endpoint.
+</Note>
+
+### Managing Retries
+
+Configure retry behavior for handling temporary failures and rate limits. This example sets up exponential backoff with up to 5 retries, starting with a 1ms delay and capping at 10 seconds, which suits transient network issues.
+
+<Info>
+For a full explanation of how retries work, key rotation on rate limits, and how retries connect with fallbacks, see [Retries & Fallbacks](/features/retries-and-fallbacks).
+</Info>
+
+```go
+func (a *MyAccount) GetConfigForProvider(provider schemas.ModelProvider) (*schemas.ProviderConfig, error) {
+	switch provider {
+	case schemas.OpenAI:
+		return &schemas.ProviderConfig{
+			NetworkConfig: schemas.NetworkConfig{
+				MaxRetries:          5,
+				RetryBackoffInitial: 1 * time.Millisecond,
+				RetryBackoffMax:     10 * time.Second,
+			},
+			ConcurrencyAndBufferSize: schemas.DefaultConcurrencyAndBufferSize,
+		}, nil
+	}
+	return nil, fmt.Errorf("provider %s not supported", provider)
+}
+```
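+
+As a rough illustration of how these three settings interact, the sketch below computes a retry schedule under the assumption that the delay simply doubles on each attempt. Bifrost's real schedule may differ (for example, by adding jitter); `backoffDelay` and `backoffSchedule` are hypothetical helpers, not part of the SDK.
+
+```go
+package main
+
+import "time"
+
+// backoffDelay returns the delay before retry number attempt (0-based),
+// doubling from initial and capping at maxDelay. The doubling rule is an
+// assumption made for this sketch.
+func backoffDelay(attempt int, initial, maxDelay time.Duration) time.Duration {
+	delay := initial
+	for i := 0; i < attempt; i++ {
+		delay *= 2
+		if delay >= maxDelay {
+			return maxDelay
+		}
+	}
+	if delay > maxDelay {
+		return maxDelay
+	}
+	return delay
+}
+
+// backoffSchedule lists the delays for maxRetries consecutive retries.
+func backoffSchedule(maxRetries int, initial, maxDelay time.Duration) []time.Duration {
+	delays := make([]time.Duration, 0, maxRetries)
+	for attempt := 0; attempt < maxRetries; attempt++ {
+		delays = append(delays, backoffDelay(attempt, initial, maxDelay))
+	}
+	return delays
+}
+```
+
+Under this doubling assumption, the configuration above would wait 1ms, 2ms, 4ms, 8ms, and 16ms across its five retries, so the 10-second cap only matters with a larger initial delay or more retries.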
+
+### Custom Concurrency and Buffer Size
+
+Fine-tune performance by adjusting worker concurrency and queue sizes per provider (defaults are 1000 workers and 5000 queue size). This example sets both providers below the defaults: OpenAI gets the larger limits (100 workers, 500 queue) for higher throughput, while Anthropic gets more conservative limits (25 workers, 100 queue) to respect its rate limits.
+
+```go
+func (a *MyAccount) GetConfigForProvider(provider schemas.ModelProvider) (*schemas.ProviderConfig, error) {
+    switch provider {
+    case schemas.OpenAI:
+        return &schemas.ProviderConfig{
+            NetworkConfig: schemas.DefaultNetworkConfig,
+            ConcurrencyAndBufferSize: schemas.ConcurrencyAndBufferSize{
+                MaxConcurrency: 100, // Max number of concurrent requests (number of workers)
+                BufferSize:     500, // Max number of requests in the buffer (queue size)
+            },
+        }, nil
+    case schemas.Anthropic:
+        return &schemas.ProviderConfig{
+            NetworkConfig: schemas.DefaultNetworkConfig,
+            ConcurrencyAndBufferSize: schemas.ConcurrencyAndBufferSize{
+                MaxConcurrency: 25,
+                BufferSize:     100,
+            },
+        }, nil
+    }
+    return nil, fmt.Errorf("provider %s not supported", provider)
+}
+```
+
+### Custom Headers
+
+Bifrost supports two ways to add custom headers to provider requests: **static headers** configured at the provider level, and **dynamic headers** passed per-request via context.
+
+#### Static Headers (Provider Level)
+
+Configure headers that are automatically included in every request to a specific provider using `NetworkConfig.ExtraHeaders`:
+
+```go
+func (a *MyAccount) GetConfigForProvider(provider schemas.ModelProvider) (*schemas.ProviderConfig, error) {
+	switch provider {
+	case schemas.OpenAI:
+		return &schemas.ProviderConfig{
+			NetworkConfig: schemas.NetworkConfig{
+				ExtraHeaders: map[string]string{
+					"x-custom-org":  "my-organization",
+					"x-environment": "production",
+				},
+			},
+			ConcurrencyAndBufferSize: schemas.DefaultConcurrencyAndBufferSize,
+		}, nil
+	}
+	return nil, fmt.Errorf("provider %s not supported", provider)
+}
+```
+
+#### Dynamic Headers (Per Request)
+
+Send custom headers with individual requests by adding them to the request context. Headers are automatically propagated to the provider:
+
+```go
+import (
+    "context"
+    "github.com/maximhq/bifrost/core/schemas"
+)
+
+func makeRequestWithCustomHeaders() {
+    // Create base context
+    ctx := context.Background()
+
+    // Add custom headers using BifrostContextKeyExtraHeaders
+    extraHeaders := map[string][]string{
+        "user-id":         {"user-123"},
+        "session-id":      {"session-abc"},
+        "custom-metadata": {"value1", "value2"}, // Multiple values supported
+    }
+    ctx = context.WithValue(ctx, schemas.BifrostContextKeyExtraHeaders, extraHeaders)
+
+    // Make request with custom headers
+    response, err := client.ChatCompletionRequest(schemas.NewBifrostContext(ctx, schemas.NoDeadline), &schemas.BifrostChatRequest{
+        Provider: schemas.OpenAI,
+        Model:    "gpt-4o-mini",
+        Input:    messages,
+    })
+    if err != nil {
+        // Handle error
+    }
+}
+```
+
+**How it works:**
+- Headers are stored as `map[string][]string` in the context
+- Multiple values per header name are supported
+- Header names are case-insensitive and normalized to lowercase
+- Headers are accessible throughout the request lifecycle
+
+**Example use cases:**
+- User identification: `user-id`, `tenant-id`
+- Request tracking: `correlation-id`, `trace-id`
+- Custom metadata: `department`, `cost-center`
+- A/B testing: `experiment-id`, `variant`
+
+#### Security Denylist
+
+Bifrost maintains a security denylist of headers that are never forwarded to providers, regardless of configuration:
+
+```go
+denylist := map[string]bool{
+    "proxy-authorization": true,
+    "cookie":              true,
+    "host":                true,
+    "content-length":      true,
+    "connection":          true,
+    "transfer-encoding":   true,
+
+    // prevent auth/key overrides
+    "x-api-key":      true,
+    "x-goog-api-key": true,
+    "x-bf-api-key":   true,
+    "x-bf-vk":        true,
+}
+```
+
+This denylist is applied to both static and dynamic headers to prevent security vulnerabilities.
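+
+The filtering step can be pictured with the following sketch, which lowercases header names and drops denied ones. It only illustrates the idea; `denied` is a shortened local copy of the list above and `filterHeaders` is a hypothetical helper, not the function Bifrost uses internally.
+
+```go
+package main
+
+import "strings"
+
+// denied mirrors a few entries of the denylist shown above. It is a local
+// copy for this sketch, not the variable Bifrost uses internally.
+var denied = map[string]bool{
+	"cookie":    true,
+	"host":      true,
+	"x-api-key": true,
+}
+
+// filterHeaders lowercases header names and drops any that are denied.
+// Values of names that differ only by case are merged under one key.
+func filterHeaders(in map[string][]string) map[string][]string {
+	out := make(map[string][]string, len(in))
+	for name, values := range in {
+		key := strings.ToLower(name)
+		if denied[key] {
+			continue
+		}
+		out[key] = append(out[key], values...)
+	}
+	return out
+}
+```
+
+Normalizing to lowercase before the lookup matters here: without it, a caller could bypass the check by sending `Cookie` or `X-API-Key` with different casing.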
+
+### Setting Up a Proxy
+
+Route requests through proxies for compliance, security, or geographic requirements. This example shows both an HTTP proxy for OpenAI and an authenticated SOCKS5 proxy for Anthropic, which is useful for corporate environments or regional access.
+
+```go
+func (a *MyAccount) GetConfigForProvider(provider schemas.ModelProvider) (*schemas.ProviderConfig, error) {
+	switch provider {
+	case schemas.OpenAI:
+		return &schemas.ProviderConfig{
+			NetworkConfig:            schemas.DefaultNetworkConfig,
+			ConcurrencyAndBufferSize: schemas.DefaultConcurrencyAndBufferSize,
+			ProxyConfig: &schemas.ProxyConfig{
+				Type: schemas.HttpProxy,
+				URL:  "http://localhost:8000", // Proxy URL
+			},
+		}, nil
+	case schemas.Anthropic:
+		return &schemas.ProviderConfig{
+			NetworkConfig:            schemas.DefaultNetworkConfig,
+			ConcurrencyAndBufferSize: schemas.DefaultConcurrencyAndBufferSize,
+			ProxyConfig: &schemas.ProxyConfig{
+				Type:     schemas.Socks5Proxy,
+				URL:      "http://localhost:8000", // Proxy URL
+				Username: "user",
+				Password: "password",
+			},
+		}, nil
+	}
+	return nil, fmt.Errorf("provider %s not supported", provider)
+}
+```
+
+### Send Back Raw Response
+
+Include the original provider response alongside Bifrost's standardized response format. Useful for debugging and accessing provider-specific metadata.
+
+**Provider-level default** (applies to all requests for this provider):
+
+```go
+func (a *MyAccount) GetConfigForProvider(provider schemas.ModelProvider) (*schemas.ProviderConfig, error) {
+    return &schemas.ProviderConfig{
+        NetworkConfig: schemas.DefaultNetworkConfig,
+        ConcurrencyAndBufferSize: schemas.DefaultConcurrencyAndBufferSize,
+        SendBackRawResponse: true,
+    }, nil
+}
+```
+
+**Per-request override** (overrides the provider default for a single request):
+
+```go
+ctx := context.Background()
+ctx = context.WithValue(ctx, schemas.BifrostContextKeySendBackRawResponse, true) // or false to suppress
+
+response, err := client.ChatCompletionRequest(schemas.NewBifrostContext(ctx, schemas.NoDeadline), &schemas.BifrostChatRequest{
+    Provider: schemas.OpenAI,
+    Model:    "gpt-4o-mini",
+    Input:    messages,
+})
+
+if response.ChatResponse != nil {
+    rawResp := response.ChatResponse.ExtraFields.RawResponse // original provider JSON
+}
+```
+
+When enabled, the raw provider response appears in `ExtraFields.RawResponse`:
+
+```go
+type BifrostChatResponse struct {
+	ID                string                     `json:"id"`
+	Choices           []BifrostResponseChoice    `json:"choices"`
+	Created           int                        `json:"created"` // The Unix timestamp (in seconds).
+	Model             string                     `json:"model"`
+	Object            string                     `json:"object"` // "chat.completion" or "chat.completion.chunk"
+	ServiceTier       string                     `json:"service_tier"`
+	SystemFingerprint string                     `json:"system_fingerprint"`
+	Usage             *BifrostLLMUsage           `json:"usage"`
+	ExtraFields       BifrostResponseExtraFields `json:"extra_fields"`
+}
+
+type BifrostResponseExtraFields struct {
+	RequestType    RequestType        `json:"request_type"`
+	Provider       ModelProvider      `json:"provider"`
+	ModelRequested string             `json:"model_requested"`
+	Latency        int64              `json:"latency"`     // in milliseconds (for streaming responses this will be each chunk latency, and the last chunk latency will be the total latency)
+	ChunkIndex     int                `json:"chunk_index"` // used for streaming responses to identify the chunk index, will be 0 for non-streaming responses
+	RawResponse    interface{}        `json:"raw_response,omitempty"`
+	CacheDebug     *BifrostCacheDebug `json:"cache_debug,omitempty"`
+}
+```
+
+### Send Back Raw Request
+
+Include the original request sent to the provider alongside Bifrost's response. Useful for debugging request transformations and verifying what was actually sent to the provider.
+
+**Provider-level default** (applies to all requests for this provider):
+
+```go
+func (a *MyAccount) GetConfigForProvider(provider schemas.ModelProvider) (*schemas.ProviderConfig, error) {
+    return &schemas.ProviderConfig{
+        NetworkConfig: schemas.DefaultNetworkConfig,
+        ConcurrencyAndBufferSize: schemas.DefaultConcurrencyAndBufferSize,
+        SendBackRawRequest: true,
+    }, nil
+}
+```
+
+**Per-request override** (overrides the provider default for a single request):
+
+```go
+ctx := context.Background()
+ctx = context.WithValue(ctx, schemas.BifrostContextKeySendBackRawRequest, true) // or false to suppress
+
+response, err := client.ChatCompletionRequest(schemas.NewBifrostContext(ctx, schemas.NoDeadline), &schemas.BifrostChatRequest{
+    Provider: schemas.OpenAI,
+    Model:    "gpt-4o-mini",
+    Input:    messages,
+})
+
+if response.ChatResponse != nil {
+    rawReq := response.ChatResponse.ExtraFields.RawRequest // exact JSON sent to the provider
+}
+```
+
+When enabled, the raw provider request appears in `ExtraFields.RawRequest`:
+
+```go
+type BifrostResponseExtraFields struct {
+	// ... other fields
+	RawRequest     interface{}        `json:"raw_request,omitempty"`
+	RawResponse    interface{}        `json:"raw_response,omitempty"`
+}
+```
+
+### Store Raw Request/Response
+
+Persist the raw provider request and response in the log record without necessarily returning them in the API response. This is orthogonal to the send-back flags — enabling this does not affect what the caller receives, and enabling send-back does not automatically store data in logs. Enable both to do both.
+
+**Provider-level default** (applies to all requests for this provider):
+
+```go
+func (a *MyAccount) GetConfigForProvider(provider schemas.ModelProvider) (*schemas.ProviderConfig, error) {
+    return &schemas.ProviderConfig{
+        NetworkConfig: schemas.DefaultNetworkConfig,
+        ConcurrencyAndBufferSize: schemas.DefaultConcurrencyAndBufferSize,
+        StoreRawRequestResponse: true,
+    }, nil
+}
+```
+
+**Per-request override** (overrides the provider default for a single request):
+
+```go
+ctx := context.Background()
+ctx = context.WithValue(ctx, schemas.BifrostContextKeyStoreRawRequestResponse, true) // or false to disable
+
+response, err := client.ChatCompletionRequest(schemas.NewBifrostContext(ctx, schemas.NoDeadline), &schemas.BifrostChatRequest{
+    Provider: schemas.OpenAI,
+    Model:    "gpt-4o-mini",
+    Input:    messages,
+})
+// Raw data is persisted in the log record.
+// ExtraFields.RawRequest/RawResponse are nil unless send-back flags are also enabled.
+```
+
+<Note>
+`StoreRawRequestResponse` only has effect when the logging plugin is active — raw data is written to the log record by the logging plugin. Without it, enabling this flag captures the data but nothing persists it.
+
+`StoreRawRequestResponse`, `SendBackRawRequest`, and `SendBackRawResponse` are orthogonal controls — enabling any one does not imply the others. Enable any combination depending on whether you need raw data in logs, in the response, or both.
+</Note>
+
+## Best Practices
+
+### Performance Considerations
+
+Keys are fetched from your `GetKeysForProvider` implementation on every request. Ensure your implementation is optimized for speed to avoid adding latency:
+
+```go
+func (a *MyAccount) GetKeysForProvider(ctx *context.Context, provider schemas.ModelProvider) ([]schemas.Key, error) {
+    // ✅ Good: Fast in-memory lookup
+    switch provider {
+    case schemas.OpenAI:
+        return a.cachedOpenAIKeys, nil  // Pre-cached keys
+    }
+    
+    // ❌ Avoid: Database queries, API calls, complex algorithms
+    // This will add latency to every AI request
+    // keys := fetchKeysFromDatabase(provider)  // Too slow!
+    // return processWithComplexLogic(keys)     // Too slow!
+    
+    return nil, fmt.Errorf("provider %s not supported", provider)
+}
+```
+
+**Recommendations:**
+- Cache keys in memory during application startup
+- Use simple switch statements or map lookups
+- Avoid database queries, file I/O, or network calls
+- Keep complex key processing logic outside the request path
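+
+One way to follow these recommendations is to read every key once at startup and serve lookups from a map. The sketch below uses only the standard library; `cachedKey`, `keyCache`, `newKeyCache`, and `keysFor` are hypothetical names, and `cachedKey` stands in for `schemas.Key`.
+
+```go
+package main
+
+import "fmt"
+
+// cachedKey is a hypothetical stand-in for schemas.Key.
+type cachedKey struct {
+	Value  string
+	Weight float64
+}
+
+// keyCache holds keys loaded once at startup, indexed by provider name.
+type keyCache struct {
+	byProvider map[string][]cachedKey
+}
+
+// newKeyCache reads each provider's key once. getenv is injected so the
+// sketch is easy to test; an application would typically pass os.Getenv.
+func newKeyCache(getenv func(string) string, envByProvider map[string]string) *keyCache {
+	cache := &keyCache{byProvider: make(map[string][]cachedKey)}
+	for provider, envName := range envByProvider {
+		if value := getenv(envName); value != "" {
+			cache.byProvider[provider] = []cachedKey{{Value: value, Weight: 1.0}}
+		}
+	}
+	return cache
+}
+
+// keysFor is the request-path lookup: a single map read with no I/O.
+func (c *keyCache) keysFor(provider string) ([]cachedKey, error) {
+	keys, ok := c.byProvider[provider]
+	if !ok {
+		return nil, fmt.Errorf("provider %s not supported", provider)
+	}
+	return keys, nil
+}
+```
+
+A `GetKeysForProvider` implementation could then delegate to a cache like this, keeping environment reads and any slower key sources out of the per-request path.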
+
+## Next Steps
+
+- **[Streaming Responses](./streaming)** - Real-time response generation
+- **[Tool Calling](./tool-calling)** - Enable AI to use external functions
+- **[Multimodal AI](./multimodal)** - Process images, audio, and text
+- **[Core Features](../../features/)** - Advanced Bifrost capabilities
--- a/docs/quickstart/go-sdk/reranking.mdx
+++ b/docs/quickstart/go-sdk/reranking.mdx
@@ -0,0 +1,88 @@
+---
+title: "Reranking"
+description: "Rerank documents with Bifrost Go SDK using client.RerankRequest."
+icon: "book-open-cover"
+---
+
+Use the Go SDK to rank candidate documents by relevance to a query.
+
+Provider/model examples:
+- Cohere: `Provider: schemas.Cohere`, `Model: "rerank-v3.5"`
+- vLLM: `Provider: schemas.VLLM`, `Model: "BAAI/bge-reranker-v2-m3"`
+
+## Basic Example
+
+```go
+package main
+
+import (
+	"context"
+	"fmt"
+
+	bifrost "github.com/maximhq/bifrost/core"
+	"github.com/maximhq/bifrost/core/schemas"
+)
+
+func main() {
+	client, err := bifrost.Init(context.Background(), schemas.BifrostConfig{
+		Account: &MyAccount{},
+	})
+	if err != nil {
+		panic(err)
+	}
+	defer client.Shutdown()
+
+	request := &schemas.BifrostRerankRequest{
+		Provider: schemas.Cohere,
+		Model:    "rerank-v3.5",
+		Query:    "What is Bifrost?",
+		Documents: []schemas.RerankDocument{
+			{Text: "Bifrost is an AI gateway that unifies many LLM providers."},
+			{Text: "Paris is the capital of France."},
+			{Text: "Bifrost exposes an OpenAI-compatible API."},
+		},
+		Params: &schemas.RerankParameters{
+			TopN:            bifrost.Ptr(2),
+			ReturnDocuments: bifrost.Ptr(true),
+		},
+	}
+
+	resp, bfErr := client.RerankRequest(schemas.NewBifrostContext(context.Background(), schemas.NoDeadline), request)
+	if bfErr != nil {
+		panic(bfErr.Error.Message)
+	}
+
+	for _, result := range resp.Results {
+		fmt.Printf("index=%d score=%.4f\n", result.Index, result.RelevanceScore)
+	}
+}
+```
+
+## Parameters
+
+- `Provider`, `Model`: provider/model to use for rerank
+- `Query`: query text
+- `Documents`: documents to score (`text`, optional `id`, `meta`)
+- `Params.TopN`: max result count
+- `Params.MaxTokensPerDoc`: provider-dependent token cap
+- `Params.Priority`: provider-dependent priority hint
+- `Params.ReturnDocuments`: include source document in each result
+- `Fallbacks`: fallback provider/model choices
+
+For vLLM, set `Provider` to `schemas.VLLM` and use the upstream model ID as `Model` (without the `vllm/` prefix that is used in Gateway HTTP requests).
+
+## Response
+
+`BifrostRerankResponse` includes:
+
+- `Results []RerankResult` (`index`, `relevance_score`, optional `document`)
+- `Model`
+- optional `Usage`
+- `ExtraFields` metadata (`provider`, `latency`, `request_type`, etc.)
+
+## Next Steps
+
+- **[Streaming Responses](./streaming)** - Real-time response processing
+- **[Tool Calling](./tool-calling)** - Enable AI to use external functions
+- **[Multimodal AI](./multimodal)** - Process images and multimedia content
+- **[Core Features](../../features/)** - Advanced Bifrost capabilities
--- a/docs/quickstart/go-sdk/setting-up.mdx
+++ b/docs/quickstart/go-sdk/setting-up.mdx
@@ -0,0 +1,144 @@
+---
+title: "Setting Up"
+description: "Get Bifrost running in your Go application in 30 seconds with minimal setup and direct code integration."
+icon: "play"
+---
+
+<video width="100%" controls>
+  <source src="https://github.com/maximhq/bifrost/raw/refs/heads/main/docs/media/package-demo.mp4" type="video/mp4" />
+  Your browser does not support the video tag.
+</video>
+
+
+## 30-Second Setup
+
+Get Bifrost running in your Go application with minimal setup. This guide shows you how to integrate multiple AI providers through a single, unified interface.
+
+### 1. Install Package
+
+```bash
+go mod init my-bifrost-app
+go get github.com/maximhq/bifrost/core
+```
+
+### 2. Set Environment Variable
+
+```bash
+export OPENAI_API_KEY="your-openai-api-key"
+```
+
+### 3. Create `main.go`
+
+```go
+package main
+
+import (
+    "context"
+    "fmt"
+    "os"
+
+    bifrost "github.com/maximhq/bifrost/core"
+    "github.com/maximhq/bifrost/core/schemas"
+)
+
+type MyAccount struct{}
+
+// MyAccount must implement these 3 methods to satisfy the Account interface
+func (a *MyAccount) GetConfiguredProviders() ([]schemas.ModelProvider, error) {
+    return []schemas.ModelProvider{schemas.OpenAI}, nil
+}
+
+func (a *MyAccount) GetKeysForProvider(ctx *context.Context, provider schemas.ModelProvider) ([]schemas.Key, error) {
+    if provider == schemas.OpenAI {
+        return []schemas.Key{{
+            Value:  os.Getenv("OPENAI_API_KEY"),
+            Models: schemas.WhiteList{"*"}, // Keep Models ["*"] to use any model
+            Weight: 1.0,
+        }}, nil
+    }
+    return nil, fmt.Errorf("provider %s not supported", provider)
+}
+
+func (a *MyAccount) GetConfigForProvider(provider schemas.ModelProvider) (*schemas.ProviderConfig, error) {
+    if provider == schemas.OpenAI {
+        // Return default config (can be customized for advanced use cases)
+        return &schemas.ProviderConfig{
+            NetworkConfig:            schemas.DefaultNetworkConfig,
+            ConcurrencyAndBufferSize: schemas.DefaultConcurrencyAndBufferSize,
+        }, nil
+    }
+    return nil, fmt.Errorf("provider %s not supported", provider)
+}
+
+// main initializes Bifrost and makes a single chat completion request
+func main() {
+	client, initErr := bifrost.Init(context.Background(), schemas.BifrostConfig{
+		Account: &MyAccount{},
+	})
+	if initErr != nil {
+		panic(initErr)
+	}
+	defer client.Shutdown()
+
+	messages := []schemas.ChatMessage{
+		{
+			Role: schemas.ChatMessageRoleUser,
+			Content: &schemas.ChatMessageContent{
+				ContentStr: schemas.Ptr("Hello, Bifrost!"),
+			},
+		},
+	}
+
+	response, err := client.ChatCompletionRequest(schemas.NewBifrostContext(context.Background(), schemas.NoDeadline), &schemas.BifrostChatRequest{
+		Provider: schemas.OpenAI,
+		Model:    "gpt-4o-mini",
+		Input:    messages,
+	})
+
+	if err != nil {
+		panic(err)
+	}
+
+	fmt.Println("Response:", *response.Choices[0].Message.Content.ContentStr)
+}
+```
+
+### 4. Run Your App
+
+```bash
+go run main.go
+# Example output (actual text will vary): Response: Hello! I'm Bifrost, your AI model gateway...
+```
+
+**🎉 That's it!** You're now running Bifrost in your Go application.
+
+### What Just Happened?
+
+1. **Account Interface**: `MyAccount` provides API keys and a list of providers to Bifrost for initialisation and key lookups.
+2. **Provider Resolution**: `schemas.OpenAI` tells Bifrost to use OpenAI as the provider.
+3. **Model Selection**: `"gpt-4o-mini"` specifies which model to use.
+4. **Unified API**: The same interface works for any provider/model combination (OpenAI, Anthropic, Vertex, etc.).
+
+---
+
+## Next Steps
+
+Now that you have Bifrost running, explore these focused guides:
+
+### Essential Topics
+
+- **[Provider Configuration](./provider-configuration)** - Multiple providers & automatic failovers
+- **[Streaming Responses](./streaming)** - Real-time chat, audio, and transcription
+- **[Tool Calling](./tool-calling)** - Functions & MCP server integration  
+- **[Multimodal AI](./multimodal)** - Images, speech synthesis, and vision
+
+### Advanced Topics
+
+- **[Core Features](../../features/)** - Caching, observability, and governance
+- **[Integrations](../../integrations/)** - Drop-in replacements for existing SDKs
+- **[Architecture](../../architecture/)** - How Bifrost works internally
+- **[Deployment](../../deployment-guides)** - Production setup and scaling
+
+---
+
+**Happy coding with Bifrost!** 🚀
--- a/docs/quickstart/go-sdk/streaming.mdx
+++ b/docs/quickstart/go-sdk/streaming.mdx
@@ -0,0 +1,300 @@
+---
+title: "Streaming Responses"
+description: "Receive AI responses in real-time as they're generated. Perfect for chat applications, audio processing, and real-time transcription where you want immediate results."
+icon: "water"
+---
+
+## Streaming Text Completion
+
+Stream plain text completions as they are generated, ideal for autocomplete, summaries, and single-output generation.
+
+```go
+stream, err := client.TextCompletionStreamRequest(schemas.NewBifrostContext(context.Background(), schemas.NoDeadline), &schemas.BifrostTextCompletionRequest{
+	Provider: schemas.OpenAI,
+	Model:    "gpt-4o-mini",
+	Input: &schemas.TextCompletionInput{
+		PromptStr: bifrost.Ptr("A for apple and B for"),
+	},
+})
+
+if err != nil {
+	log.Printf("Streaming request failed: %v", err)
+	return
+}
+
+for chunk := range stream {
+	// Handle errors in stream
+	if chunk.BifrostError != nil {
+		log.Printf("Stream error: %v", chunk.BifrostError)
+		break
+	}
+
+	// Process response chunks
+	if chunk.BifrostTextCompletionResponse != nil && len(chunk.BifrostTextCompletionResponse.Choices) > 0 {
+		choice := chunk.BifrostTextCompletionResponse.Choices[0]
+
+		// Check for streaming content
+		if choice.TextCompletionResponseChoice != nil &&
+			choice.TextCompletionResponseChoice.Text != nil {
+			content := *choice.TextCompletionResponseChoice.Text
+			fmt.Print(content) // Print content as it arrives
+		}
+	}
+}
+```
+
+## Streaming Chat Responses
+
+Receive incremental chat deltas in real-time. Append delta content to progressively render assistant messages.
+
+```go
+stream, err := client.ChatCompletionStreamRequest(schemas.NewBifrostContext(context.Background(), schemas.NoDeadline), &schemas.BifrostChatRequest{
+	Provider: schemas.OpenAI,
+	Model:    "gpt-4o-mini",
+	Input:    messages,
+})
+
+if err != nil {
+	log.Printf("Streaming request failed: %v", err)
+	return
+}
+
+for chunk := range stream {
+	// Handle errors in stream
+	if chunk.BifrostError != nil {
+		log.Printf("Stream error: %v", chunk.BifrostError)
+		break
+	}
+
+	// Process response chunks
+	if chunk.BifrostChatResponse != nil && len(chunk.BifrostChatResponse.Choices) > 0 {
+		choice := chunk.BifrostChatResponse.Choices[0]
+
+		// Check for streaming content
+		if choice.ChatStreamResponseChoice != nil &&
+			choice.ChatStreamResponseChoice.Delta != nil &&
+			choice.ChatStreamResponseChoice.Delta.Content != nil {
+
+			content := *choice.ChatStreamResponseChoice.Delta.Content
+			fmt.Print(content) // Print content as it arrives
+		}
+	}
+}
+```
+
+> **Note:** Streaming requests also follow the default timeout setting defined in provider configuration, which defaults to **30 seconds**.
+
+<Note>
+Bifrost standardizes all stream responses so that content arrives in the earlier chunks, while usage and finish reason are sent only in the final chunk.
+</Note>
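+
+Because content arrives as deltas and the final chunk may carry no content at all, it is often convenient to collect the deltas into one string. The following is a minimal sketch using only the standard library. The `accumulateDeltas` helper and the `<-chan *string` stand-in for the stream are assumptions made for illustration and are not part of the Bifrost API.
+
+```go
+package main
+
+import (
+	"fmt"
+	"strings"
+)
+
+// accumulateDeltas joins streamed text deltas into the full message.
+// Nil deltas (for example, a final chunk that carries only usage data) are skipped.
+func accumulateDeltas(deltas <-chan *string) string {
+	var sb strings.Builder
+	for delta := range deltas {
+		if delta == nil {
+			continue
+		}
+		sb.WriteString(*delta)
+	}
+	return sb.String()
+}
+
+// ptr is a small local helper that returns a pointer to s.
+func ptr(s string) *string { return &s }
+
+func main() {
+	// Simulated stream: three content deltas followed by a content-free final chunk.
+	deltas := make(chan *string, 4)
+	deltas <- ptr("Hello")
+	deltas <- ptr(", ")
+	deltas <- ptr("Bifrost!")
+	deltas <- nil
+	close(deltas)
+
+	fmt.Printf("full message: %q\n", accumulateDeltas(deltas))
+}
+```
+
+In a real handler, the loop body would read `Delta.Content` from each chunk as in the example above instead of receiving from a simulated channel.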
+
+## Responses API Streaming
+
+Use the OpenAI-style Responses API with streaming for unified flows. Events arrive via SSE; accumulate text deltas until completion.
+
+```go
+messages := []schemas.ResponsesMessage{
+	{
+		Role: bifrost.Ptr(schemas.ResponsesInputMessageRoleUser),
+		Content: &schemas.ResponsesMessageContent{
+			ContentStr: bifrost.Ptr("Hello, Bifrost!"),
+		},
+	},
+}
+
+stream, err := client.ResponsesStreamRequest(schemas.NewBifrostContext(context.Background(), schemas.NoDeadline), &schemas.BifrostResponsesRequest{
+	Provider: schemas.OpenAI,
+	Model:    "gpt-4o-mini",
+	Input:    messages,
+})
+
+if err != nil {
+	log.Printf("Streaming request failed: %v", err)
+	return
+}
+
+for chunk := range stream {
+	// Handle errors in stream
+	if chunk.BifrostError != nil {
+		log.Printf("Stream error: %v", chunk.BifrostError)
+		break
+	}
+
+	// Process response chunks
+	if chunk.BifrostResponsesStreamResponse != nil {
+		delta := chunk.BifrostResponsesStreamResponse.Delta
+
+		// Check for streaming content
+		if delta != nil {
+			fmt.Print(*delta) // Print content as it arrives
+		}
+	}
+}
+```
+
+## Text-to-Speech Streaming: Real-time Audio Generation
+
+Stream audio generation in real-time as text is converted to speech. Ideal for long texts or when you need immediate audio playback.
+
+```go
+stream, err := client.SpeechStreamRequest(schemas.NewBifrostContext(context.Background(), schemas.NoDeadline), &schemas.BifrostSpeechRequest{
+	Provider: schemas.OpenAI,
+	Model:    "tts-1", // Using text-to-speech model
+	Input: &schemas.SpeechInput{
+		Input: "Hello! This is a sample text that will be converted to speech using Bifrost's speech synthesis capabilities. The weather today is wonderful, and I hope you're having a great day!",
+	},
+	Params: &schemas.SpeechParameters{
+		VoiceConfig: &schemas.SpeechVoiceInput{
+			Voice: schemas.Ptr("alloy"),
+		},
+		ResponseFormat: schemas.Ptr("mp3"),
+	},
+})
+
+if err != nil {
+	panic(err)
+}
+
+// Handle speech synthesis stream
+var audioData []byte
+var totalChunks int
+filename := "output.mp3"
+
+for chunk := range stream {
+	if chunk.BifrostError != nil {
+		panic(fmt.Sprintf("Stream error: %s", chunk.BifrostError.Error.Message))
+	}
+
+	if chunk.BifrostSpeechStreamResponse != nil {
+		// Accumulate audio data from each chunk
+		audioData = append(audioData, chunk.BifrostSpeechStreamResponse.Audio...)
+		totalChunks++
+		fmt.Printf("Received chunk %d, size: %d bytes\n", totalChunks, len(chunk.BifrostSpeechStreamResponse.Audio))
+	}
+}
+
+if len(audioData) > 0 {
+	// Save the accumulated audio to a file
+	err := os.WriteFile(filename, audioData, 0644)
+	if err != nil {
+		panic(fmt.Sprintf("Failed to save audio file: %v", err))
+	}
+
+	fmt.Printf("Speech synthesis streaming complete! Audio saved to %s\n", filename)
+	fmt.Printf("Total chunks received: %d, final file size: %d bytes\n", totalChunks, len(audioData))
+}
+```
+
+## Speech-to-Text Streaming: Real-time Audio Transcription
+
+Stream audio transcription results as they're processed. Get immediate text output for real-time applications or long audio files.
+
+```go
+// Read the audio file for transcription
+audioFilename := "output.mp3"
+audioData, err := os.ReadFile(audioFilename)
+if err != nil {
+	panic(fmt.Sprintf("Failed to read audio file %s: %v. Please make sure the file exists.", audioFilename, err))
+}
+
+fmt.Printf("Loaded audio file %s (%d bytes) for transcription...\n", audioFilename, len(audioData))
+
+stream, err := client.TranscriptionStreamRequest(schemas.NewBifrostContext(context.Background(), schemas.NoDeadline), &schemas.BifrostTranscriptionRequest{
+	Provider: schemas.OpenAI,
+	Model:    "whisper-1", // Using Whisper model for transcription
+	Input: &schemas.TranscriptionInput{
+		File: audioData,
+	},
+	Params: &schemas.TranscriptionParameters{
+		Prompt: schemas.Ptr("This is a sample audio transcription from Bifrost speech synthesis."), // Optional: provide context
+	},
+})
+
+if err != nil {
+	panic(err)
+}
+
+for chunk := range stream {
+	if chunk.BifrostError != nil {
+		panic(fmt.Sprintf("Stream error: %s", chunk.BifrostError.Error.Message))
+	}
+
+	if chunk.BifrostTranscriptionStreamResponse != nil && chunk.BifrostTranscriptionStreamResponse.Delta != nil {
+		// Print each chunk of text as it arrives
+		fmt.Print(*chunk.BifrostTranscriptionStreamResponse.Delta)
+	}
+}
+```
+
+## Streaming Best Practices
+
+### Buffering for Audio
+
+For audio streaming, consider buffering chunks before saving:
+
+```go
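+const bufferSize = 1024 * 1024 // 1MB buffer
+
+var audioBuffer bytes.Buffer
+lastSave := time.Now() // start the timer now so the first chunk is not flushed immediately
+
+// flush appends the buffered audio to the output file and resets the buffer.
+flush := func() error {
+	if audioBuffer.Len() == 0 {
+		return nil
+	}
+	file, err := os.OpenFile("streaming_audio.mp3", os.O_APPEND|os.O_CREATE|os.O_WRONLY, 0644)
+	if err != nil {
+		return err
+	}
+	defer file.Close()
+	if _, err := file.Write(audioBuffer.Bytes()); err != nil {
+		return err
+	}
+	audioBuffer.Reset()
+	lastSave = time.Now()
+	return nil
+}
+
+for chunk := range stream {
+	if chunk.BifrostSpeechStreamResponse != nil {
+		audioBuffer.Write(chunk.BifrostSpeechStreamResponse.Audio)
+
+		// Save every second or when buffer is full
+		if time.Since(lastSave) > time.Second || audioBuffer.Len() > bufferSize {
+			if err := flush(); err != nil {
+				log.Printf("Failed to write audio: %v", err)
+			}
+		}
+	}
+}
+
+// Write whatever is still buffered once the stream ends
+if err := flush(); err != nil {
+	log.Printf("Failed to write audio: %v", err)
+}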
+```
+
+### Context and Cancellation
+
+Use context to control streaming duration:
+
+```go
+ctx, cancel := context.WithTimeout(context.Background(), 30*time.Second)
+defer cancel()
+
+stream, err := client.ChatCompletionStreamRequest(schemas.NewBifrostContext(ctx, schemas.NoDeadline), &schemas.BifrostChatRequest{
+	// ... your request
+})
+
+// Stream will automatically stop after 30 seconds
+```
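+
+On the consuming side, a `select` loop lets the reader stop as soon as the context is done instead of waiting for the next chunk. The sketch below uses only the standard library. The `consume` helper and the `<-chan string` stand-in for the stream are hypothetical and only illustrate the pattern.
+
+```go
+package main
+
+import (
+	"context"
+	"fmt"
+	"time"
+)
+
+// consume reads from stream until it is closed or ctx is done, whichever comes first.
+// It returns the chunks received so far and ctx.Err() if the context ended the loop.
+func consume(ctx context.Context, stream <-chan string) ([]string, error) {
+	var received []string
+	for {
+		select {
+		case <-ctx.Done():
+			return received, ctx.Err()
+		case chunk, ok := <-stream:
+			if !ok {
+				return received, nil // stream finished normally
+			}
+			received = append(received, chunk)
+		}
+	}
+}
+
+func main() {
+	ctx, cancel := context.WithTimeout(context.Background(), 50*time.Millisecond)
+	defer cancel()
+
+	// Simulated stream that keeps producing until the context ends.
+	stream := make(chan string)
+	go func() {
+		for i := 0; ; i++ {
+			select {
+			case stream <- fmt.Sprintf("chunk-%d", i):
+				time.Sleep(10 * time.Millisecond)
+			case <-ctx.Done():
+				return
+			}
+		}
+	}()
+
+	chunks, err := consume(ctx, stream)
+	fmt.Printf("received %d chunks, err: %v\n", len(chunks), err)
+}
+```
+
+Returning the partial result together with `ctx.Err()` lets the caller decide whether to keep or discard what arrived before the timeout.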
+
+## Voice Options
+
+OpenAI TTS supports these voices:
+
+- `alloy` - Balanced, natural voice
+- `echo` - Deep, resonant voice
+- `fable` - Expressive, storytelling voice
+- `onyx` - Strong, confident voice
+- `nova` - Bright, energetic voice
+- `shimmer` - Gentle, soothing voice
+
+```go
+// Different voice example
+VoiceConfig: &schemas.SpeechVoiceInput{
+	Voice: bifrost.Ptr("nova"),
+},
+```
+
+> **Note:** Please check each model's documentation to see if it supports the corresponding streaming features. Not all providers support all streaming capabilities.
+
+## Next Steps
+
+- **[Tool Calling](./tool-calling)** - Enable AI to use external functions
+- **[Multimodal AI](./multimodal)** - Process images and multimedia content
+- **[Provider Configuration](./provider-configuration)** - Multiple providers for redundancy
+- **[Core Features](../../features/)** - Advanced Bifrost capabilities
--- a/docs/quickstart/go-sdk/tool-calling.mdx
+++ b/docs/quickstart/go-sdk/tool-calling.mdx
@@ -0,0 +1,268 @@
+---
+title: "Tool Calling"
+description: "Enable AI models to use external functions and services by defining tool schemas or connecting to Model Context Protocol (MCP) servers. This allows AI to interact with databases, APIs, file systems, and more."
+icon: "wrench"
+---
+
+## Function Calling with Custom Tools
+
+Enable AI models to use external functions by defining tool schemas. Based on the user's request, the model can then return tool calls that name a function and supply its arguments; your application is responsible for executing them.
+
+```go
+// Define a tool for the calculator
+calculatorTool := schemas.ChatTool{
+	Type: schemas.ChatToolTypeFunction,
+	Function: &schemas.ChatToolFunction{
+		Name:        "calculator",
+		Description: schemas.Ptr("A calculator tool"),
+		Parameters: &schemas.ToolFunctionParameters{
+			Type: "object",
+			Properties: map[string]interface{}{
+				"operation": map[string]interface{}{
+					"type":        "string",
+					"description": "The operation to perform",
+					"enum":        []string{"add", "subtract", "multiply", "divide"},
+				},
+				"a": map[string]interface{}{
+					"type":        "number",
+					"description": "The first number",
+				},
+				"b": map[string]interface{}{
+					"type":        "number",
+					"description": "The second number",
+				},
+			},
+			Required: []string{"operation", "a", "b"},
+		},
+	},
+}
+
+response, err := client.ChatCompletionRequest(schemas.NewBifrostContext(context.Background(), schemas.NoDeadline), &schemas.BifrostChatRequest{
+	Provider: schemas.OpenAI,
+	Model:    "gpt-4o-mini",
+	Input: []schemas.ChatMessage{
+		{
+			Role: schemas.ChatMessageRoleUser,
+			Content: &schemas.ChatMessageContent{
+				ContentStr: schemas.Ptr("What is 2+2? Use the calculator tool."),
+			},
+		},
+	},
+	Params: &schemas.ChatParameters{
+		Tools: []schemas.ChatTool{calculatorTool},
+	},
+})
+
+if err != nil {
+	panic(err)
+}
+
+if response.Choices[0].Message.ChatAssistantMessage != nil && response.Choices[0].Message.ChatAssistantMessage.ToolCalls != nil {
+	for _, toolCall := range response.Choices[0].Message.ChatAssistantMessage.ToolCalls {
+		fmt.Printf("Tool call in response - %s: %s\n", *toolCall.ID, *toolCall.Function.Name)
+		fmt.Printf("Tool call arguments - %s\n", toolCall.Function.Arguments)
+	}
+}
+```
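+
+The example above only prints the tool call. To act on it, the application needs to decode the arguments and run the function itself. The following is a minimal sketch using only the standard library, assuming the arguments arrive as a JSON string that matches the `calculator` schema. The `calculatorArgs` type and `runCalculator` function are hypothetical helpers and are not part of Bifrost.
+
+```go
+package main
+
+import (
+	"encoding/json"
+	"errors"
+	"fmt"
+)
+
+// calculatorArgs mirrors the "calculator" tool schema (hypothetical helper type).
+type calculatorArgs struct {
+	Operation string  `json:"operation"`
+	A         float64 `json:"a"`
+	B         float64 `json:"b"`
+}
+
+// runCalculator decodes the JSON arguments of a tool call and performs the operation.
+func runCalculator(rawArgs string) (float64, error) {
+	var args calculatorArgs
+	if err := json.Unmarshal([]byte(rawArgs), &args); err != nil {
+		return 0, fmt.Errorf("invalid calculator arguments: %w", err)
+	}
+	switch args.Operation {
+	case "add":
+		return args.A + args.B, nil
+	case "subtract":
+		return args.A - args.B, nil
+	case "multiply":
+		return args.A * args.B, nil
+	case "divide":
+		if args.B == 0 {
+			return 0, errors.New("division by zero")
+		}
+		return args.A / args.B, nil
+	default:
+		return 0, fmt.Errorf("unsupported operation %q", args.Operation)
+	}
+}
+
+func main() {
+	// Stand-in for toolCall.Function.Arguments from a model response.
+	result, err := runCalculator(`{"operation":"add","a":2,"b":2}`)
+	if err != nil {
+		fmt.Println("tool error:", err)
+		return
+	}
+	fmt.Println(result) // prints 4
+}
+```
+
+The result would typically be sent back to the model as a tool message in a follow-up request so it can produce the final answer.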
+
+## Connecting to MCP Servers
+
+Connect to Model Context Protocol (MCP) servers to give AI models access to external tools and services without manually defining each function.
+
+```go
+client, initErr := bifrost.Init(context.Background(), schemas.BifrostConfig{
+	Account: &MyAccount{},
+	MCPConfig: &schemas.MCPConfig{
+		ClientConfigs: []schemas.MCPClientConfig{
+			// Sample youtube-mcp server
+			{
+				Name:             "youtube-mcp",
+				ConnectionType:   schemas.MCPConnectionTypeHTTP,
+				ConnectionString: schemas.Ptr("http://your-youtube-mcp-url"),
+				ToolsToExecute:   []string{"*"}, // Allow all tools from this client
+			},
+		},
+	},
+})
+if initErr != nil {
+	panic(initErr)
+}
+defer client.Shutdown()
+
+response, err := client.ChatCompletionRequest(schemas.NewBifrostContext(context.Background(), schemas.NoDeadline), &schemas.BifrostChatRequest{
+	Provider: schemas.OpenAI,
+	Model:    "gpt-4o-mini",
+	Input: []schemas.ChatMessage{
+		{
+			Role: schemas.ChatMessageRoleUser,
+			Content: &schemas.ChatMessageContent{
+				ContentStr: schemas.Ptr("What do you see when you search for 'bifrost' on youtube?"),
+			},
+		},
+	},
+})
+
+if err != nil {
+	panic(err)
+}
+
+if response.Choices[0].Message.ChatAssistantMessage != nil && response.Choices[0].Message.ChatAssistantMessage.ToolCalls != nil {
+	for _, toolCall := range response.Choices[0].Message.ChatAssistantMessage.ToolCalls {
+		fmt.Printf("Tool call in response - %s: %s\n", *toolCall.ID, *toolCall.Function.Name)
+		fmt.Printf("Tool call arguments - %s\n", toolCall.Function.Arguments)
+	}
+}
+```
+
+Read more about MCP connections and in-house tool registration via a local MCP server in the [MCP Features](../../mcp/overview) section.
+
+## Advanced Tool Examples
+
+### Weather API Tool
+
+```go
+weatherTool := schemas.ChatTool{
+	Type: schemas.ChatToolTypeFunction,
+	Function: &schemas.ChatToolFunction{
+		Name:        "get_weather",
+		Description: schemas.Ptr("Get the current weather for a location"),
+		Parameters: &schemas.ToolFunctionParameters{
+			Type: "object",
+			Properties: map[string]interface{}{
+				"location": map[string]interface{}{
+					"type":        "string",
+					"description": "The city and state, e.g. San Francisco, CA",
+				},
+				"unit": map[string]interface{}{
+					"type":        "string",
+					"description": "Temperature unit",
+					"enum":        []string{"celsius", "fahrenheit"},
+				},
+			},
+			Required: []string{"location"},
+		},
+	},
+}
+```
+
+### Database Query Tool
+
+```go
+databaseTool := schemas.ChatTool{
+	Type: schemas.ChatToolTypeFunction,
+	Function: &schemas.ChatToolFunction{
+		Name:        "query_database",
+		Description: schemas.Ptr("Execute a SQL query on the customer database"),
+		Parameters: &schemas.ToolFunctionParameters{
+			Type: "object",
+			Properties: map[string]interface{}{
+				"query": map[string]interface{}{
+					"type":        "string",
+					"description": "The SQL query to execute",
+				},
+				"table": map[string]interface{}{
+					"type":        "string",
+					"description": "The table to query",
+					"enum":        []string{"customers", "orders", "products"},
+				},
+			},
+			Required: []string{"query", "table"},
+		},
+	},
+}
+```
+
+### File System Tool
+
+```go
+fileSystemTool := schemas.ChatTool{
+	Type: schemas.ChatToolTypeFunction,
+	Function: &schemas.ChatToolFunction{
+		Name:        "read_file",
+		Description: schemas.Ptr("Read the contents of a file"),
+		Parameters: &schemas.ToolFunctionParameters{
+			Type: "object",
+			Properties: map[string]interface{}{
+				"path": map[string]interface{}{
+					"type":        "string",
+					"description": "The file path to read",
+				},
+				"encoding": map[string]interface{}{
+					"type":        "string",
+					"description": "File encoding",
+					"enum":        []string{"utf-8", "ascii", "base64"},
+					"default":     "utf-8",
+				},
+			},
+			Required: []string{"path"},
+		},
+	},
+}
+```
+
+## Multiple Tool Support
+
+Use multiple tools in a single request:
+
+```go
+response, err := client.ChatCompletionRequest(schemas.NewBifrostContext(context.Background(), schemas.NoDeadline), &schemas.BifrostChatRequest{
+	Provider: schemas.OpenAI,
+	Model:    "gpt-4o-mini",
+	Input: []schemas.ChatMessage{
+		{
+			Role: schemas.ChatMessageRoleUser,
+			Content: &schemas.ChatMessageContent{
+				ContentStr: schemas.Ptr("What's the weather in New York and calculate 15% tip for a $50 bill?"),
+			},
+		},
+	},
+	Params: &schemas.ChatParameters{
+		Tools: []schemas.ChatTool{weatherTool, calculatorTool},
+		ToolChoice: &schemas.ChatToolChoice{
+			ChatToolChoiceStr: schemas.Ptr("auto"), // Let AI decide which tools to use
+		},
+	},
+})
+```
+
+## Tool Choice Options
+
+Control how the AI uses tools:
+
+```go
+// Force use of a specific tool
+Params: &schemas.ChatParameters{
+	Tools: []schemas.ChatTool{calculatorTool},
+	ToolChoice: &schemas.ChatToolChoice{
+		ChatToolChoiceStruct: &schemas.ChatToolChoiceStruct{
+			Type: schemas.ChatToolChoiceTypeFunction,
+			Function: &schemas.ChatToolChoiceFunction{
+				Name: "calculator",
+			},
+		},
+	},
+}
+
+// Let AI decide automatically
+Params: &schemas.ChatParameters{
+	Tools: []schemas.ChatTool{calculatorTool, weatherTool},
+	ToolChoice: &schemas.ChatToolChoice{
+		ChatToolChoiceStr: schemas.Ptr("auto"),
+	},
+}
+
+// Disable tool usage
+Params: &schemas.ChatParameters{
+	Tools: []schemas.ChatTool{calculatorTool},
+	ToolChoice: &schemas.ChatToolChoice{
+		ChatToolChoiceStr: schemas.Ptr("none"),
+	},
+}
+```
+
+## Next Steps
+
+- **[Multimodal AI](./multimodal)** - Process images, audio, and multimedia content
+- **[Streaming Responses](./streaming)** - Real-time response generation
+- **[Provider Configuration](./provider-configuration)** - Multiple providers for redundancy
+- **[MCP Features](../../mcp/overview)** - Advanced MCP server management