first commit

2026-04-26 21:52:23 +03:00
commit 880f412e2c
2662 changed files with 866266 additions and 0 deletions
--- a/docs/.mintignore
+++ b/docs/.mintignore
@@ -0,0 +1,4 @@
+# Ignore modular OpenAPI source files
+openapi/paths/
+openapi/schemas/
+openapi/openapi.yaml
--- a/docs/README.md
+++ b/docs/README.md
@@ -0,0 +1,3 @@
+# Bifrost documentation
+
+For better accessibility we have moved documentation [here](https://www.getmaxim.ai/bifrost/docs).
--- a/docs/architecture/README.mdx
+++ b/docs/architecture/README.mdx
--- a/docs/architecture/core/concurrency.mdx
+++ b/docs/architecture/core/concurrency.mdx
@@ -0,0 +1,764 @@
+---
+title: "Concurrency"
+description: "Deep dive into Bifrost's advanced concurrency architecture - worker pools, goroutine management, channel-based communication, and resource isolation patterns."
+icon: "traffic-light"
+---
+
+## Concurrency Philosophy
+
+### **Core Principles**
+
+| Principle                          | Implementation                         | Benefit                                |
+| ---------------------------------- | -------------------------------------- | -------------------------------------- |
+| **Provider Isolation**          | Independent worker pools per provider  | Fault tolerance, no cascade failures   |
+| **Channel-Based Communication** | Go channels for all async operations   | Type-safe hand-off between goroutines  |
+| **Resource Pooling**            | Object pools with lifecycle management | Predictable memory usage, minimal GC   |
+| **Non-Blocking Operations**     | Async processing throughout pipeline   | Maximum concurrency, no blocking waits |
+| **Backpressure Handling**       | Configurable buffers and flow control  | Graceful degradation under load        |
+
+### **Threading Architecture Overview**
+
+```mermaid
+graph TB
+    subgraph "Main Thread"
+        Main[Main Process<br/>HTTP Server]
+        Router[Request Router<br/>Goroutine]
+        PluginMgr[Plugin Manager<br/>Goroutine]
+    end
+
+    subgraph "Provider Worker Pools"
+        subgraph "OpenAI Pool"
+            OAI1[Worker 1<br/>Goroutine]
+            OAI2[Worker 2<br/>Goroutine]
+            OAIN[Worker N<br/>Goroutine]
+        end
+        subgraph "Anthropic Pool"
+            ANT1[Worker 1<br/>Goroutine]
+            ANT2[Worker 2<br/>Goroutine]
+            ANTN[Worker N<br/>Goroutine]
+        end
+        subgraph "Bedrock Pool"
+            BED1[Worker 1<br/>Goroutine]
+            BED2[Worker 2<br/>Goroutine]
+            BEDN[Worker N<br/>Goroutine]
+        end
+    end
+
+    subgraph "Memory Pools"
+        ChannelPool[Channel Pool<br/>sync.Pool]
+        MessagePool[Message Pool<br/>sync.Pool]
+        ResponsePool[Response Pool<br/>sync.Pool]
+    end
+
+    Main --> Router
+    Router --> PluginMgr
+    PluginMgr --> OAI1
+    PluginMgr --> ANT1
+    PluginMgr --> BED1
+
+    OAI1 --> ChannelPool
+    ANT1 --> MessagePool
+    BED1 --> ResponsePool
+```
+
+---
+
+## Worker Pool Architecture
+
+### **Provider-Isolated Worker Pools**
+
+```mermaid
+stateDiagram-v2
+    [*] --> PoolInit: Worker Pool Creation
+    PoolInit --> WorkerSpawn: Spawn Worker Goroutines
+    WorkerSpawn --> Listening: Workers Listen on Channels
+
+    Listening --> Processing: Job Received
+    Processing --> API_Call: Provider API Request
+    API_Call --> Response: Process Response
+    Response --> Listening: Job Complete
+
+    Listening --> Shutdown: Graceful Shutdown
+    Processing --> Shutdown: Complete Current Job
+    Shutdown --> [*]: Pool Destroyed
+```
+
+**Worker Pool Architecture:**
+
+Each provider gets its own worker pool, which keeps resource usage bounded and prevents one provider's slowdowns from affecting the others:
+
+**Key Components:**
+
+- **Worker Pool Management** - Pre-spawned workers reduce startup latency
+- **Job Queue System** - Buffered channels provide smooth load balancing
+- **Resource Pools** - HTTP clients and API keys are pooled for efficiency
+- **Health Monitoring** - Circuit breakers detect and isolate failing providers
+- **Graceful Shutdown** - Workers complete current jobs before terminating
+
+**Startup Process:**
+
+1. **Worker Pre-spawning** - Workers are created during pool initialization
+2. **Channel Setup** - Job queues and worker channels are established
+3. **Resource Allocation** - HTTP clients and API keys are distributed
+4. **Health Checks** - Initial connectivity tests verify provider availability
+5. **Ready State** - Pool becomes available for request processing
+
+**Job Dispatch Logic:**
+
+- **Round-Robin Assignment** - Jobs are distributed evenly across available workers
+- **Load Balancing** - Worker availability determines job assignment
+- **Overflow Handling** - Excess jobs are queued or dropped based on configuration
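+
+The startup and dispatch steps above can be sketched as follows. This is a minimal illustration, not Bifrost's implementation: `Job`, `Result`, and `runPool` are hypothetical names, and the sketch uses a single shared buffered channel, so jobs go to whichever worker is idle rather than in strict round-robin order.
+
+```go
+package main
+
+import (
+    "fmt"
+    "sync"
+)
+
+// Job and Result are hypothetical types used only for this sketch.
+type Job struct{ ID int }
+type Result struct{ JobID, WorkerID int }
+
+// runPool pre-spawns `workers` goroutines that share one buffered job queue.
+// An idle worker picks up the next job, so assignment follows availability.
+func runPool(workers, queueSize int, jobs []Job) []Result {
+    queue := make(chan Job, queueSize)
+    results := make(chan Result, len(jobs))
+
+    var wg sync.WaitGroup
+    for w := 1; w <= workers; w++ {
+        wg.Add(1)
+        go func(id int) {
+            defer wg.Done()
+            // The loop ends once the queue is closed and drained, so every
+            // accepted job finishes before the worker exits.
+            for job := range queue {
+                results <- Result{JobID: job.ID, WorkerID: id}
+            }
+        }(w)
+    }
+
+    for _, job := range jobs {
+        queue <- job
+    }
+    close(queue) // graceful shutdown signal
+    wg.Wait()
+    close(results)
+
+    out := make([]Result, 0, len(jobs))
+    for r := range results {
+        out = append(out, r)
+    }
+    return out
+}
+
+func main() {
+    res := runPool(3, 8, []Job{{1}, {2}, {3}, {4}, {5}})
+    fmt.Println("completed jobs:", len(res))
+}
+```
+
+Closing the queue instead of signalling each worker individually keeps shutdown simple: workers drain what was already accepted and then exit on their own.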
+
+### **Worker Lifecycle Management**
+
+```mermaid
+sequenceDiagram
+    participant Pool
+    participant Worker
+    participant HTTPClient
+    participant Provider
+    participant Metrics
+
+    Pool->>Worker: Start()
+    Worker->>Worker: Initialize HTTP Client
+    Worker->>Pool: Ready Signal
+
+    loop Job Processing
+        Pool->>Worker: Job Assignment
+        Worker->>HTTPClient: Prepare Request
+        HTTPClient->>Provider: API Call
+        Provider-->>HTTPClient: Response
+        HTTPClient-->>Worker: Parsed Response
+        Worker->>Metrics: Record Performance
+        Worker->>Pool: Job Complete
+    end
+
+    Pool->>Worker: Shutdown Signal
+    Worker->>Worker: Complete Current Job
+    Worker-->>Pool: Shutdown Confirmed
+```
+
+---
+
+## Channel-Based Communication
+
+### **Channel Architecture**
+
+```mermaid
+graph TB
+    subgraph "Channel Types"
+        JobQueue[Job Queue<br/>Buffered Channel]
+        WorkerPool[Worker Pool<br/>Buffered Channel]
+        ResultChan[Result Channel<br/>Buffered Channel]
+        QuitChan[Quit Channel<br/>Unbuffered]
+    end
+
+    subgraph "Flow Control"
+        BackPressure[Backpressure<br/>Buffer Limits]
+        Timeout[Timeout<br/>Context Cancellation]
+        Graceful[Graceful Shutdown<br/>Channel Closing]
+    end
+
+    JobQueue --> BackPressure
+    WorkerPool --> Timeout
+    ResultChan --> Graceful
+```
+
+**Channel Configuration Principles:**
+
+Bifrost's channel system balances throughput and memory usage through careful buffer sizing:
+
+**Job Queuing Configuration:**
+
+- **Job Queue Buffer** - Sized based on expected burst traffic (100-1000 jobs)
+- **Worker Pool Size** - Matches provider concurrency limits (10-100 workers)
+- **Result Buffer** - Accommodates response processing delays (50-500 responses)
+
+**Flow Control Parameters:**
+
+- **Queue Wait Limits** - Maximum time jobs wait before timeout (1-10 seconds)
+- **Processing Timeouts** - Per-job execution limits (30-300 seconds)
+- **Shutdown Timeouts** - Graceful termination periods (5-30 seconds)
+
+**Backpressure Policies:**
+
+- **Drop Policy** - Discard excess jobs when queues are full
+- **Block Policy** - Wait for queue space with timeout
+- **Error Policy** - Immediately return error for full queues
+
+**Channel Type Selection:**
+
+- **Buffered Channels** - Used for async job processing and result handling
+- **Unbuffered Channels** - Used for synchronization signals (quit, done)
+- **Context Cancellation** - Used for timeout and cancellation propagation
+
+### **Backpressure and Flow Control**
+
+```mermaid
+flowchart TD
+    Request[Incoming Request] --> QueueCheck{Queue Full?}
+    QueueCheck -->|No| Queue[Add to Queue]
+    QueueCheck -->|Yes| Policy{Backpressure Policy?}
+
+    Policy -->|Drop| Drop[Drop Request<br/>Return Error]
+    Policy -->|Block| Block[Block Until Space<br/>With Timeout]
+    Policy -->|Error| Error[Return Queue Full Error]
+
+    Queue --> Worker[Assign to Worker]
+    Block --> TimeoutCheck{Timeout?}
+    TimeoutCheck -->|Yes| Error
+    TimeoutCheck -->|No| Queue
+
+    Worker --> Processing[Process Request]
+    Processing --> Complete[Complete]
+
+    Drop --> Client[Client Response]
+    Error --> Client
+    Complete --> Client
+```
+
+**Backpressure Implementation Strategy:**
+
+The backpressure system protects Bifrost from being overwhelmed while maintaining service availability:
+
+**Non-Blocking Job Submission:**
+
+- **Immediate Queue Check** - Jobs are submitted without blocking on queue space
+- **Success Path** - Available queue space allows immediate job acceptance
+- **Overflow Detection** - Full queues trigger backpressure policies
+- **Metrics Collection** - All queue operations are tracked for monitoring
+
+**Backpressure Policy Execution:**
+
+- **Drop Policy** - Immediately rejects excess jobs with meaningful error messages
+- **Block Policy** - Waits for queue space with configurable timeout limits
+- **Error Policy** - Returns queue full errors for immediate client feedback
+- **Metrics Tracking** - Dropped, blocked, and successful submissions are measured
+
+**Timeout Management:**
+
+- **Context-Based Timeouts** - All blocking operations respect timeout boundaries
+- **Graceful Degradation** - Timeouts result in controlled error responses
+- **Resource Protection** - Prevents goroutine leaks from infinite waits
+
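+```go
+// Sketch of policy-based job submission. The enclosing function was elided
+// from the original fragment, so the signature, pool.policy, and
+// IncDroppedJobs are assumptions.
+func (pool *ProviderWorkerPool) Submit(ctx context.Context, job *Job) error {
+    // Fast path: enqueue without blocking when the buffer has space.
+    select {
+    case pool.jobQueue <- job:
+        pool.metrics.IncQueuedJobs()
+        return nil
+    default:
+    }
+
+    // Queue is full: apply the configured backpressure policy.
+    switch pool.policy {
+    case "drop":
+        pool.metrics.IncDroppedJobs()
+        return errors.New("queue full, job dropped")
+    case "block":
+        select {
+        case pool.jobQueue <- job:
+            pool.metrics.IncQueuedJobs()
+            return nil
+        case <-ctx.Done():
+            pool.metrics.IncTimeoutJobs()
+            return errors.New("queue full, timeout waiting")
+        }
+    case "error":
+        pool.metrics.IncRejectedJobs()
+        return errors.New("queue full, job rejected")
+    default:
+        return errors.New("unknown queue policy")
+    }
+}
+```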
+
+---
+
+## Memory Pool Concurrency
+
+### **Thread-Safe Object Pools**
+
+```mermaid
+graph TD
+    subgraph "sync.Pool Lifecycle"
+        direction LR
+        GetObject[Get Object<br/>sync.Pool.Get]
+        PoolCheck{Is Pool Empty?}
+        NewObject[New Object<br/>Factory Function]
+        UseObject[Use Object<br/>Application Logic]
+        ResetObject[Reset Object<br/>Clear State]
+        ReturnObject[Return Object<br/>sync.Pool.Put]
+
+        GetObject --> PoolCheck
+        PoolCheck -- Yes --> NewObject
+        PoolCheck -- No --> UseObject
+        NewObject --> UseObject
+        UseObject --> ResetObject
+        ResetObject --> ReturnObject
+        ReturnObject --> GetObject
+    end
+
+    subgraph "GC Interaction"
+        direction TB
+        GCRun[GC Runs]
+        PoolCleanup[Pool Cleanup<br/>Removes idle objects]
+
+        GCRun --> PoolCleanup
+    end
+```
+
+**Thread-Safe Pool Architecture:**
+
+Bifrost's memory pool system ensures thread-safe object reuse across multiple goroutines:
+
+**Pool Structure Design:**
+
+- **Multiple Pool Types** - Separate pools for channels, messages, responses, and buffers
+- **Factory Functions** - Dynamic object creation when pools are empty
+- **Statistics Tracking** - Comprehensive metrics for pool performance monitoring
+- **Thread Safety** - Synchronized access using Go's sync.Pool and read-write mutexes
+
+**Object Lifecycle Management:**
+
+- **Pool Initialization** - Factory functions define object creation patterns
+- **Unique Identification** - Each pooled object gets a unique ID for tracking
+- **Timestamp Tracking** - Creation, acquisition, and return times are recorded
+- **Reusability Flags** - Objects can be marked as non-reusable for single-use scenarios
+
+**Acquisition Strategy:**
+
+- **Request Tracking** - All pool requests are counted for monitoring
+- **Hit/Miss Tracking** - Pool effectiveness is measured through hit ratios
+- **Fallback Creation** - New objects are created when pools are empty
+- **Performance Metrics** - Acquisition times and patterns are monitored
+
+**Return and Reset Process:**
+
+- **State Validation** - Only reusable objects are returned to pools
+- **Object Reset** - All object state is cleared before returning to pool
+- **Return Tracking** - Return operations are counted and timed
+- **Pool Replenishment** - Returned objects become available for reuse
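+
+The acquire, use, reset, and return cycle described above maps directly onto Go's `sync.Pool`. The sketch below is illustrative only: `bufferPool` and `render` are hypothetical names, and a `bytes.Buffer` stands in for whatever object type is being pooled.
+
+```go
+package main
+
+import (
+    "bytes"
+    "fmt"
+    "sync"
+)
+
+// bufferPool is a hypothetical pool of reusable buffers. New is the factory
+// function that runs when the pool has nothing to hand out.
+var bufferPool = sync.Pool{
+    New: func() any { return new(bytes.Buffer) },
+}
+
+// render borrows a buffer, uses it, and resets it before returning it.
+func render(name string) string {
+    buf := bufferPool.Get().(*bytes.Buffer)
+    defer func() {
+        buf.Reset() // clear state so the next user never sees stale data
+        bufferPool.Put(buf)
+    }()
+
+    buf.WriteString("hello, ")
+    buf.WriteString(name)
+    return buf.String() // String copies the bytes, so the reset is safe
+}
+
+func main() {
+    fmt.Println(render("bifrost"))
+    fmt.Println(render("pool"))
+}
+```
+
+Resetting inside the deferred function guarantees the object is cleared on every return path, including early returns.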
+
+### **Pool Performance Monitoring**
+
+Comprehensive metrics provide insights into pool efficiency and system health:
+
+**Usage Statistics Collection:**
+- **Request Counting** - Track total pool requests by object type
+- **Creation Tracking** - Monitor new object allocations when pools are empty
+- **Hit/Miss Ratios** - Measure pool effectiveness through reuse rates
+- **Return Monitoring** - Track successful object returns to pools
+
+**Performance Metrics Analysis:**
+- **Acquisition Times** - Measure how long it takes to get objects from pools
+- **Reset Performance** - Track time spent cleaning objects for reuse
+- **Hit Ratio Calculation** - Determine percentage of requests served from pools
+- **Memory Efficiency** - Calculate memory savings from object reuse
+
+**Key Performance Indicators:**
+- **Channel Pool Hit Ratio** - Typically 85-95% in steady state
+- **Message Pool Efficiency** - Usually 80-90% reuse rate
+- **Response Pool Utilization** - Often 70-85% hit ratio
+- **Total Memory Savings** - Measured reduction in garbage collection pressure
+
+**Monitoring Integration:**
+- **Thread-Safe Access** - All metrics collection is synchronized
+- **Real-Time Updates** - Statistics are updated with each pool operation
+- **Export Capability** - Metrics are available in JSON format for monitoring systems
+- **Alerting Support** - Low hit ratios can trigger performance alerts
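+
+`sync.Pool` does not report hits or misses itself, so the counters above have to be maintained by the caller (for example, by recording a miss inside the factory function). The following sketch shows one way to do that with atomic counters and a JSON export. `PoolStats` and its field names are hypothetical, and `atomic.Int64` assumes Go 1.19 or newer.
+
+```go
+package main
+
+import (
+    "encoding/json"
+    "fmt"
+    "sync/atomic"
+)
+
+// PoolStats is a hypothetical counter set; the names are illustrative.
+type PoolStats struct {
+    requests atomic.Int64 // total acquisitions
+    misses   atomic.Int64 // acquisitions that had to allocate a new object
+}
+
+func (s *PoolStats) RecordHit()  { s.requests.Add(1) }
+func (s *PoolStats) RecordMiss() { s.requests.Add(1); s.misses.Add(1) }
+
+// HitRatio returns the fraction of requests served by reuse (0 when idle).
+// The two loads are not one atomic snapshot, so the value is approximate
+// while other goroutines are still recording.
+func (s *PoolStats) HitRatio() float64 {
+    total := s.requests.Load()
+    if total == 0 {
+        return 0
+    }
+    return float64(total-s.misses.Load()) / float64(total)
+}
+
+// MarshalJSON exports a point-in-time view for monitoring systems.
+func (s *PoolStats) MarshalJSON() ([]byte, error) {
+    return json.Marshal(struct {
+        Requests int64   `json:"requests"`
+        Misses   int64   `json:"misses"`
+        HitRatio float64 `json:"hit_ratio"`
+    }{s.requests.Load(), s.misses.Load(), s.HitRatio()})
+}
+
+func main() {
+    stats := &PoolStats{}
+    stats.RecordMiss() // the first request allocates
+    for i := 0; i < 9; i++ {
+        stats.RecordHit()
+    }
+    out, _ := json.Marshal(stats)
+    fmt.Println(string(out))
+}
+```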
+
+---
+
+## Goroutine Management
+
+### **Goroutine Lifecycle Patterns**
+
+```mermaid
+stateDiagram-v2
+    [*] --> Created: go routine()
+    Created --> Running: Execute Function
+    Running --> Waiting: Channel/Mutex Block
+    Waiting --> Running: Unblocked
+    Running --> Syscall: Network I/O
+    Syscall --> Running: I/O Complete
+    Running --> GCAssist: GC Triggered
+    GCAssist --> Running: GC Complete
+    Running --> Terminated: Function Exit
+    Terminated --> [*]: Cleanup
+```
+
+**Goroutine Pool Management Strategy:**
+
+Bifrost's goroutine management ensures optimal resource usage while preventing goroutine leaks:
+
+**Pool Configuration Management:**
+
+- **Goroutine Limits** - Maximum concurrent goroutines prevent resource exhaustion
+- **Active Counting** - Atomic counters track currently running goroutines
+- **Idle Timeouts** - Unused goroutines are cleaned up after configured periods
+- **Resource Boundaries** - Hard limits prevent runaway goroutine creation
+
+**Lifecycle Orchestration:**
+
+- **Spawn Channels** - New goroutine creation is tracked through channels
+- **Completion Monitoring** - Finished goroutines signal completion for cleanup
+- **Shutdown Coordination** - Graceful shutdown ensures all goroutines complete properly
+- **Health Monitoring** - Continuous monitoring tracks goroutine health and performance
+
+**Worker Creation Process:**
+
+- **Limit Enforcement** - Creation fails when maximum goroutine count is reached
+- **Unique Identification** - Each goroutine gets a unique ID for tracking and debugging
+- **Lifecycle Tracking** - Start times and names enable performance analysis
+- **Atomic Operations** - Thread-safe counters prevent race conditions
+
+**Panic Recovery and Error Handling:**
+
+- **Panic Isolation** - Goroutine panics don't crash the entire system
+- **Error Logging** - Panic details are logged with goroutine context
+- **Metrics Updates** - Panic counts are tracked for monitoring and alerting
+- **Resource Cleanup** - Failed goroutines are properly cleaned up and counted
+
+**Health Monitoring System:**
+
+- **Periodic Health Checks** - Regular intervals check goroutine pool health
+- **Completion Tracking** - Finished goroutines are recorded for performance analysis
+- **Shutdown Handling** - Clean shutdown process ensures no goroutine leaks
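+
+Limit enforcement, atomic counting, and panic isolation can be combined in a small helper. The sketch below is an assumption about how such a helper might look, not Bifrost's code: `Spawner`, `NewSpawner`, and `ErrLimitReached` are hypothetical names, and `atomic.Int64` assumes Go 1.19 or newer.
+
+```go
+package main
+
+import (
+    "errors"
+    "fmt"
+    "sync"
+    "sync/atomic"
+)
+
+// Spawner is a hypothetical helper that caps concurrent goroutines.
+type Spawner struct {
+    slots  chan struct{} // one token per allowed goroutine
+    active atomic.Int64
+    panics atomic.Int64
+    wg     sync.WaitGroup
+}
+
+var ErrLimitReached = errors.New("goroutine limit reached")
+
+func NewSpawner(limit int) *Spawner {
+    return &Spawner{slots: make(chan struct{}, limit)}
+}
+
+// Go runs fn in a new goroutine, or fails fast when the limit is reached.
+func (s *Spawner) Go(fn func()) error {
+    select {
+    case s.slots <- struct{}{}:
+    default:
+        return ErrLimitReached
+    }
+    s.wg.Add(1)
+    s.active.Add(1)
+    go func() {
+        defer func() {
+            if r := recover(); r != nil {
+                s.panics.Add(1) // isolate the panic and count it
+            }
+            s.active.Add(-1)
+            <-s.slots // free the slot for the next goroutine
+            s.wg.Done()
+        }()
+        fn()
+    }()
+    return nil
+}
+
+// Wait blocks until every spawned goroutine has finished.
+func (s *Spawner) Wait() { s.wg.Wait() }
+
+func main() {
+    s := NewSpawner(2)
+    release := make(chan struct{})
+    _ = s.Go(func() { <-release })
+    _ = s.Go(func() { <-release; panic("boom") })
+    fmt.Println("third spawn:", s.Go(func() {}))
+
+    close(release)
+    s.Wait()
+    fmt.Println("panics:", s.panics.Load(), "active:", s.active.Load())
+}
+```
+
+Taking the slot before starting the goroutine means the limit is enforced synchronously, so a burst of callers cannot overshoot it.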
+
+### **Resource Leak Prevention**
+
+```mermaid
+flowchart TD
+    GoroutineStart[Goroutine Start] --> ResourceCheck[Resource Allocation Check]
+    ResourceCheck --> Timeout[Set Timeout Context]
+    Timeout --> Work[Execute Work]
+
+    Work --> Complete{Work Complete?}
+    Complete -->|Yes| Cleanup[Cleanup Resources]
+    Complete -->|No| TimeoutCheck{Timeout?}
+
+    TimeoutCheck -->|Yes| ForceCleanup[Force Cleanup]
+    TimeoutCheck -->|No| Work
+
+    Cleanup --> Return[Return Resources to Pool]
+    ForceCleanup --> Return
+    Return --> End[Goroutine End]
+```
+
+**Resource Leak Prevention:**
+
+```go
+func (worker *Worker) ExecuteWithCleanup(job *Job) {
+    // Set timeout context
+    ctx, cancel := context.WithTimeout(
+        context.Background(),
+        worker.config.ProcessTimeout,
+    )
+    defer cancel()
+
+    // Acquire resources with timeout
+    resources, err := worker.acquireResources(ctx)
+    if err != nil {
+        job.resultChan <- &Result{Error: err}
+        return
+    }
+
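+    // Ensure cleanup happens (runs before the deferred cancel above)
+    defer func() {
+        // Always return resources
+        worker.returnResources(resources)
+
+        // Handle panics without blocking forever on a receiver that has gone away
+        if r := recover(); r != nil {
+            worker.metrics.IncPanics()
+            select {
+            case job.resultChan <- &Result{Error: fmt.Errorf("worker panic: %v", r)}:
+            case <-ctx.Done():
+                worker.metrics.IncTimeouts()
+            }
+        }
+    }()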
+
+    // Execute job with context
+    result := worker.processJob(ctx, job, resources)
+
+    // Return result
+    select {
+    case job.resultChan <- result:
+        // Success
+    case <-ctx.Done():
+        // Timeout - the receiver stopped waiting, so drop the result
+        worker.metrics.IncTimeouts()
+    }
+}
+```
+
+---
+
+## Concurrency Optimization Strategies
+
+### **Load-Based Worker Scaling** (Planned)
+
+```mermaid
+graph TB
+    subgraph "Load Monitoring"
+        QueueDepth[Queue Depth<br/>Monitoring]
+        ResponseTime[Response Time<br/>Tracking]
+        WorkerUtil[Worker Utilization<br/>Metrics]
+    end
+
+    subgraph "Scaling Decisions"
+        ScaleUp{Scale Up?<br/>Load > 80%}
+        ScaleDown{Scale Down?<br/>Load < 30%}
+        Maintain[Maintain<br/>Current Size]
+    end
+
+    subgraph "Actions"
+        AddWorkers[Spawn Additional<br/>Workers]
+        RemoveWorkers[Graceful Worker<br/>Shutdown]
+        NoAction[No Action<br/>Monitor Continue]
+    end
+
+    QueueDepth --> ScaleUp
+    ResponseTime --> ScaleUp
+    WorkerUtil --> ScaleDown
+
+    ScaleUp -->|Yes| AddWorkers
+    ScaleUp -->|No| ScaleDown
+    ScaleDown -->|Yes| RemoveWorkers
+    ScaleDown -->|No| Maintain
+
+    Maintain --> NoAction
+```
+
+**Adaptive Scaling Implementation:**
+
+```go
+type AdaptiveScaler struct {
+    pool           *ProviderWorkerPool
+    config         ScalingConfig
+    metrics        *ScalingMetrics
+    lastScaleTime  time.Time
+    scalingMutex   sync.Mutex
+}
+
+func (scaler *AdaptiveScaler) EvaluateScaling() {
+    scaler.scalingMutex.Lock()
+    defer scaler.scalingMutex.Unlock()
+
+    // Prevent frequent scaling
+    if time.Since(scaler.lastScaleTime) < scaler.config.MinScaleInterval {
+        return
+    }
+
+    current := scaler.getCurrentMetrics()
+
+    // Scale up conditions
+    if current.QueueUtilization > scaler.config.ScaleUpThreshold ||
+       current.AvgResponseTime > scaler.config.MaxResponseTime {
+
+        scaler.scaleUp(current)
+        return
+    }
+
+    // Scale down conditions
+    if current.QueueUtilization < scaler.config.ScaleDownThreshold &&
+       current.AvgResponseTime < scaler.config.TargetResponseTime {
+
+        scaler.scaleDown(current)
+        return
+    }
+}
+
+func (scaler *AdaptiveScaler) scaleUp(metrics *CurrentMetrics) {
+    currentWorkers := scaler.pool.GetWorkerCount()
+    targetWorkers := int(float64(currentWorkers) * scaler.config.ScaleUpFactor)
+
+    // Respect maximum limits
+    if targetWorkers > scaler.config.MaxWorkers {
+        targetWorkers = scaler.config.MaxWorkers
+    }
+
+    additionalWorkers := targetWorkers - currentWorkers
+    if additionalWorkers > 0 {
+        scaler.pool.AddWorkers(additionalWorkers)
+        scaler.lastScaleTime = time.Now()
+        scaler.metrics.RecordScaleUp(additionalWorkers)
+    }
+}
+```
+
+### **Provider-Specific Optimization**
+
+```go
+type ProviderOptimization struct {
+    // Provider characteristics
+    ProviderName     string        `json:"provider_name"`
+    RateLimit        int           `json:"rate_limit"`        // Requests per second
+    AvgLatency       time.Duration `json:"avg_latency"`       // Average response time
+    ErrorRate        float64       `json:"error_rate"`        // Historical error rate
+
+    // Optimal configuration
+    OptimalWorkers   int           `json:"optimal_workers"`
+    OptimalBuffer    int           `json:"optimal_buffer"`
+    TimeoutConfig    time.Duration `json:"timeout_config"`
+    RetryStrategy    RetryConfig   `json:"retry_strategy"`
+}
+
+func CalculateOptimalConcurrency(provider ProviderOptimization) ConcurrencyConfig {
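+    // Little's Law: in-flight requests = rate (req/s) * latency (s).
+    // Compute in floating point (needs the math package) so that
+    // sub-second latencies do not truncate to zero workers.
+    inFlight := float64(provider.RateLimit) * provider.AvgLatency.Seconds()
+
+    // Adjust for error rate (more workers for higher error rate)
+    errorAdjustment := 1.0 + provider.ErrorRate
+    optimalWorkers := int(math.Ceil(inFlight * errorAdjustment))
+    if optimalWorkers < 1 {
+        optimalWorkers = 1 // always keep at least one worker
+    }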
+
+    // Buffer should be 2-3x worker count for smooth operation
+    optimalBuffer := optimalWorkers * 3
+
+    return ConcurrencyConfig{
+        Concurrency: optimalWorkers,
+        BufferSize:  optimalBuffer,
+        Timeout:     provider.AvgLatency * 2, // 2x avg latency for timeout
+    }
+}
+```
+
+---
+
+## Concurrency Monitoring & Metrics
+
+### **Key Concurrency Metrics**
+
+```mermaid
+graph TB
+    subgraph "Worker Metrics"
+        ActiveWorkers[Active Workers<br/>Current Count]
+        IdleWorkers[Idle Workers<br/>Available Count]
+        BusyWorkers[Busy Workers<br/>Processing Count]
+    end
+
+    subgraph "Queue Metrics"
+        QueueDepth[Queue Depth<br/>Pending Jobs]
+        QueueThroughput[Queue Throughput<br/>Jobs/Second]
+        QueueWaitTime[Queue Wait Time<br/>Average Delay]
+    end
+
+    subgraph "Performance Metrics"
+        GoroutineCount[Goroutine Count<br/>Total Active]
+        MemoryUsage[Memory Usage<br/>Pool Utilization]
+        GCPressure[GC Pressure<br/>Collection Frequency]
+    end
+
+    subgraph "Health Metrics"
+        ErrorRate[Error Rate<br/>Failed Jobs %]
+        PanicCount[Panic Count<br/>Crashed Goroutines]
+        DeadlockDetection[Deadlock Detection<br/>Blocked Operations]
+    end
+```
+
+**Metrics Collection Strategy:**
+
+Comprehensive concurrency monitoring provides operational insights and performance optimization data:
+
+**Worker Pool Monitoring:**
+
+- **Total Worker Tracking** - Monitor configured vs actual worker counts
+- **Active Worker Monitoring** - Track workers currently processing requests
+- **Idle Worker Analysis** - Identify unused capacity and optimization opportunities
+- **Queue Depth Monitoring** - Track pending job backlog and processing delays
+
+**Performance Data Collection:**
+
+- **Throughput Metrics** - Measure jobs processed per second across all pools
+- **Wait Time Analysis** - Track how long jobs wait in queues before processing
+- **Memory Pool Performance** - Monitor hit/miss ratios for memory pool effectiveness
+- **Goroutine Count Tracking** - Ensure goroutine counts remain within healthy limits
+
+**Health and Reliability Metrics:**
+
+- **Panic Recovery Tracking** - Count and analyze worker panic occurrences
+- **Timeout Monitoring** - Track jobs that exceed processing time limits
+- **Circuit Breaker Events** - Monitor provider isolation events and recoveries
+- **Error Rate Analysis** - Track failure patterns for capacity planning
+
+**Real-Time Updates:**
+
+- **Live Metric Updates** - Worker metrics are updated continuously during operation
+- **Processing Event Recording** - Each job completion updates relevant metrics
+- **Performance Correlation** - Queue times and processing times are correlated for analysis
+- **Success/Failure Tracking** - All job outcomes are recorded for reliability analysis
+
+---
+
+## Deadlock Prevention & Detection
+
+### **Deadlock Prevention Strategies**
+
+```mermaid
+flowchart TD
+    Strategy1[Lock Ordering<br/>Consistent Acquisition]
+    Strategy2[Timeout-Based Locks<br/>Context Cancellation]
+    Strategy3[Channel Select<br/>Non-blocking Operations]
+    Strategy4[Resource Hierarchy<br/>Layered Locking]
+
+    Prevention[Deadlock Prevention<br/>Design Patterns]
+
+    Prevention --> Strategy1
+    Prevention --> Strategy2
+    Prevention --> Strategy3
+    Prevention --> Strategy4
+
+    Strategy1 --> Success[No Deadlocks<br/>Guaranteed Order]
+    Strategy2 --> Success
+    Strategy3 --> Success
+    Strategy4 --> Success
+```
+
+**Deadlock Prevention Implementation Strategy:**
+
+Bifrost employs multiple complementary strategies to prevent deadlocks in concurrent operations:
+
+**Lock Ordering Management:**
+
+- **Consistent Acquisition Order** - All locks are acquired in a predetermined order
+- **Global Lock Registry** - Centralized registry maintains lock ordering relationships
+- **Order Enforcement** - Lock acquisition automatically sorts by predetermined order
+- **Dependency Tracking** - Lock dependencies are mapped to prevent circular waits
+
+**Timeout-Based Protection:**
+
+- **Default Timeouts** - All lock acquisitions have reasonable timeout limits
+- **Context Cancellation** - Operations respect context cancellation for cleanup
+- **Maximum Timeout Limits** - Upper bounds prevent indefinite blocking
+- **Graceful Timeout Handling** - Timeout errors provide meaningful context
+
+**Multi-Lock Acquisition Process:**
+
+- **Ordered Sorting** - Multiple locks are sorted before acquisition attempts
+- **Progressive Acquisition** - Locks are acquired one by one in sorted order
+- **Failure Recovery** - Failed acquisitions trigger automatic cleanup of held locks
+- **Resource Tracking** - All acquired locks are tracked for proper release
+
+**Lock Acquisition Safety:**
+
+- **Non-Blocking Detection** - Channel-based lock attempts prevent indefinite blocking
+- **Timeout Enforcement** - All lock attempts respect configured timeout limits
+- **Error Propagation** - Lock failures are properly propagated with context
+- **Cleanup Guarantees** - Failed operations always clean up partially acquired resources
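+
+The ordered, timeout-aware multi-lock process described above can be sketched with channel-backed locks. Everything here is hypothetical (`OrderedLock`, `Rank`, `AcquireAll`). It illustrates sorted acquisition with cleanup on failure and is not a description of Bifrost's internal lock registry.
+
+```go
+package main
+
+import (
+    "context"
+    "fmt"
+    "sort"
+    "time"
+)
+
+// OrderedLock is a hypothetical channel-backed lock with a global rank.
+type OrderedLock struct {
+    Rank int
+    ch   chan struct{} // capacity 1: a full channel means "locked"
+}
+
+func NewOrderedLock(rank int) *OrderedLock {
+    return &OrderedLock{Rank: rank, ch: make(chan struct{}, 1)}
+}
+
+func (l *OrderedLock) Unlock() { <-l.ch }
+
+// AcquireAll locks every lock in ascending rank order. If the context ends
+// first, it releases whatever it already holds and returns the error.
+func AcquireAll(ctx context.Context, locks ...*OrderedLock) (release func(), err error) {
+    sorted := append([]*OrderedLock(nil), locks...)
+    sort.Slice(sorted, func(i, j int) bool { return sorted[i].Rank < sorted[j].Rank })
+
+    held := make([]*OrderedLock, 0, len(sorted))
+    unlockHeld := func() {
+        for i := len(held) - 1; i >= 0; i-- {
+            held[i].Unlock()
+        }
+    }
+    for _, l := range sorted {
+        select {
+        case l.ch <- struct{}{}:
+            held = append(held, l)
+        case <-ctx.Done():
+            unlockHeld() // failure recovery: drop partially acquired locks
+            return nil, ctx.Err()
+        }
+    }
+    return unlockHeld, nil
+}
+
+func main() {
+    a, b := NewOrderedLock(1), NewOrderedLock(2)
+
+    // Argument order does not matter; acquisition is always a, then b.
+    release, err := AcquireAll(context.Background(), b, a)
+    fmt.Println("first acquire error:", err)
+
+    ctx, cancel := context.WithTimeout(context.Background(), 20*time.Millisecond)
+    defer cancel()
+    _, err = AcquireAll(ctx, a, b) // a is still held, so this times out
+    fmt.Println("second acquire error:", err)
+
+    release()
+}
+```
+
+Because every caller sorts by the same rank, two goroutines can never hold locks in opposite orders, which removes the circular wait that a deadlock requires.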
+
+**Deadlock Detection and Recovery:**
+
+- **Active Monitoring** - Continuous monitoring for potential deadlock conditions
+- **Automatic Recovery** - Detected deadlocks trigger automatic resolution procedures
+- **Resource Release** - Deadlock resolution involves strategic resource release
+- **Prevention Learning** - Deadlock patterns inform prevention strategy improvements
+
+---
+
+## Related Architecture Documentation
+
+- **[Request Flow](./request-flow)** - How concurrency fits in request processing
+- **[Benchmarks](../../benchmarking/getting-started)** - Concurrency performance characteristics
+- **[Plugin System](./plugins)** - Plugin concurrency considerations
+- **[MCP System](./mcp)** - MCP concurrency and worker integration
+
+## Usage Documentation
+
+- **[Provider Configuration](../../quickstart/gateway/provider-configuration)** - Configure concurrency settings per provider
+- **[Performance Analysis](../../benchmarking/getting-started)** - Memory pool configuration and optimization
+- **[Performance Monitoring](../../features/telemetry)** - Monitor concurrency metrics and health
+- **[Go SDK Usage](../../quickstart/go-sdk/setting-up)** - Use Bifrost concurrency in Go applications
+- **[Gateway Setup](../../quickstart/gateway/setting-up)** - Deploy Bifrost with optimal concurrency settings
+
+---
+
+**🎯 Next Step:** Understand how plugins integrate with the concurrency model in **[Plugin System](./plugins)**.
--- a/docs/architecture/core/mcp.mdx
+++ b/docs/architecture/core/mcp.mdx
@@ -0,0 +1,985 @@
+---
+title: "Model Context Protocol (MCP)"
+description: "Deep dive into Bifrost's Model Context Protocol (MCP) integration - how external tool discovery, execution, and integration work internally."
+icon: "toolbox"
+---
+
+## MCP Architecture Overview
+
+### **What is MCP in Bifrost?**
+
+The Model Context Protocol (MCP) system in Bifrost enables AI models to seamlessly discover and execute external tools, transforming static chat models into dynamic, action-capable agents. This architecture bridges the gap between AI reasoning and real-world tool execution.
+
+**Core MCP Principles:**
+
+- **Dynamic Discovery** - Tools are discovered at runtime, not hardcoded
+- **Client-Side Execution** - Bifrost controls all tool execution for security
+- **Multi-Protocol Support** - InProcess, STDIO, HTTP, and SSE connection types
+- **Request-Level Filtering** - Granular control over tool availability
+- **Async Execution** - Non-blocking tool invocation and response handling
+
+### **MCP System Components**
+
+```mermaid
+graph TB
+    subgraph "MCP Management Layer"
+        MCPMgr[MCP Manager<br/>Central Controller]
+        ClientRegistry[Client Registry<br/>Connection Management]
+        ToolDiscovery[Tool Discovery<br/>Runtime Registration]
+    end
+
+    subgraph "MCP Execution Layer"
+        ToolFilter[Tool Filter<br/>Access Control]
+        ToolExecutor[Tool Executor<br/>Invocation Engine]
+        ResultProcessor[Result Processor<br/>Response Handling]
+    end
+
+    subgraph "Connection Types"
+        STDIOConn[STDIO Connections<br/>Command-line Tools]
+        HTTPConn[HTTP Connections<br/>Web Services]
+        SSEConn[SSE Connections<br/>Real-time Streams]
+    end
+
+    subgraph "External MCP Servers"
+        FileSystem[Filesystem Tools<br/>File Operations]
+        WebSearch[Web Search<br/>Information Retrieval]
+        Database[Database Tools<br/>Data Access]
+        Custom[Custom Tools<br/>Business Logic]
+    end
+
+    MCPMgr --> ClientRegistry
+    ClientRegistry --> ToolDiscovery
+    ToolDiscovery --> ToolFilter
+    ToolFilter --> ToolExecutor
+    ToolExecutor --> ResultProcessor
+
+    ClientRegistry --> STDIOConn
+    ClientRegistry --> HTTPConn
+    ClientRegistry --> SSEConn
+
+    STDIOConn --> FileSystem
+    HTTPConn --> WebSearch
+    HTTPConn --> Database
+    STDIOConn --> Custom
+```
+
+---
+
+## MCP Connection Architecture
+
+### **Multi-Protocol Connection System**
+
+Bifrost supports four MCP connection types, each optimized for different tool deployment patterns:
+
+```mermaid
+graph TB
+    subgraph "InProcess Connections"
+        InProcess[In-Memory Tools<br/>Same Process]
+        InProcessEx[Examples:<br/>• Embedded tools<br/>• High-perf operations<br/>• Testing tools]
+    end
+
+    subgraph "STDIO Connections"
+        STDIO[Command Line Tools<br/>Local Execution]
+        STDIOEx[Examples:<br/>• Filesystem tools<br/>• Local scripts<br/>• CLI utilities]
+    end
+
+    subgraph "HTTP Connections"
+        HTTP[Web Service Tools<br/>Remote APIs]
+        HTTPEx[Examples:<br/>• Web search APIs<br/>• Database services<br/>• External integrations]
+    end
+
+    subgraph "SSE Connections"
+        SSE[Real-time Tools<br/>Streaming Data]
+        SSEEx[Examples:<br/>• Live data feeds<br/>• Real-time monitoring<br/>• Event streams]
+    end
+
+    subgraph "Connection Characteristics"
+        Latency[Latency:<br/>InProcess < STDIO < HTTP < SSE]
+        Security[Security:<br/>InProcess/Local > HTTP > SSE]
+        Scalability[Scalability:<br/>HTTP > SSE > STDIO > InProcess]
+        Complexity[Complexity:<br/>InProcess < STDIO < HTTP < SSE]
+    end
+
+    InProcess --> Latency
+    STDIO --> Latency
+    HTTP --> Security
+    SSE --> Scalability
+    HTTP --> Complexity
+```
+
+### **Connection Type Details**
+
+**InProcess Connections (In-Memory Tools):**
+
+- **Use Case:** Embedded tools, high-performance operations, testing
+- **Performance:** Lowest possible latency (~0.1ms) with no IPC overhead
+- **Security:** Highest security as tools run in the same process
+- **Limitations:** Go package only, cannot be configured via JSON
+
+**STDIO Connections (Local Tools):**
+
+- **Use Case:** Command-line tools, local scripts, filesystem operations
+- **Performance:** Low latency (~1-10ms) due to local execution
+- **Security:** High security with full local control
+- **Limitations:** Single-server deployment, resource sharing
+
+**HTTP Connections (Remote Services):**
+
+- **Use Case:** Web APIs, microservices, cloud functions
+- **Performance:** Network-dependent latency (~10-500ms)
+- **Security:** Configurable with authentication and encryption
+- **Advantages:** Scalable, multi-server deployment, service isolation
+
+**SSE Connections (Streaming Tools):**
+
+- **Use Case:** Real-time data feeds, live monitoring, event streams
+- **Performance:** Variable latency depending on stream frequency
+- **Security:** Similar to HTTP with streaming capabilities
+- **Benefits:** Real-time updates, persistent connections, event-driven
+
+> **MCP Configuration:** [MCP Setup Guide →](../../mcp/overview)
+
+---
+
+## Tool Discovery & Registration
+
+### **Dynamic Tool Discovery Process**
+
+The MCP system discovers tools at runtime rather than requiring static configuration, so tools can be added or removed while Bifrost is running:
+
+```mermaid
+sequenceDiagram
+    participant Bifrost
+    participant MCPManager
+    participant MCPServer
+    participant ToolRegistry
+    participant AIModel
+
+    Note over Bifrost: System Startup
+    Bifrost->>MCPManager: Initialize MCP System
+    MCPManager->>MCPServer: Establish Connection
+    MCPServer-->>MCPManager: Connection Ready
+
+    MCPManager->>MCPServer: List Available Tools
+    MCPServer-->>MCPManager: Tool Definitions
+    MCPManager->>ToolRegistry: Register Tools
+
+    Note over Bifrost: Runtime Request Processing
+    AIModel->>MCPManager: Request Available Tools
+    MCPManager->>ToolRegistry: Query Tools
+    ToolRegistry-->>MCPManager: Filtered Tool List
+    MCPManager-->>AIModel: Available Tools
+
+    AIModel->>MCPManager: Execute Tool Call
+    MCPManager->>MCPServer: Tool Invocation
+    MCPServer->>MCPServer: Execute Tool Logic
+    MCPServer-->>MCPManager: Tool Result
+    MCPManager-->>AIModel: Enhanced Response
+```
+
+### **Tool Registry Management**
+
+**Registration Process:**
+
+1. **Connection Establishment** - MCP client connects to configured servers
+2. **Capability Exchange** - Server announces available tools and schemas
+3. **Tool Validation** - Bifrost validates tool definitions and security
+4. **Registry Update** - Tools are registered in the internal tool registry
+5. **Availability Notification** - Tools become available for AI model use
+
+**Registry Features:**
+
+- **Dynamic Updates** - Tools can be added/removed during runtime
+- **Version Management** - Support for tool versioning and compatibility
+- **Access Control** - Request-level tool filtering and permissions
+- **Health Monitoring** - Continuous tool availability checking
+
+**Tool Metadata Structure:**
+
+- **Name & Description** - Human-readable tool identification
+- **Parameters Schema** - JSON schema for tool input validation
+- **Return Schema** - Expected response format definition
+- **Capabilities** - Tool feature flags and limitations
+- **Authentication** - Required credentials and permissions
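+
+The registration steps and registry features above can be sketched as a small concurrency-safe registry. This is a minimal illustration, not Bifrost's implementation: the `Tool` struct, its field names, and the `Registry` methods are all assumptions made for this example.
+
+```go
+package main
+
+import (
+    "sort"
+    "sync"
+)
+
+// Tool mirrors part of the metadata listed above. The struct and its
+// field names are illustrative assumptions, not Bifrost's schema types.
+type Tool struct {
+    Name        string
+    Description string
+    Parameters  map[string]any // JSON schema for tool inputs
+}
+
+// Registry is a concurrency-safe map from tool name to definition.
+type Registry struct {
+    mu    sync.RWMutex
+    tools map[string]Tool
+}
+
+func NewRegistry() *Registry {
+    return &Registry{tools: make(map[string]Tool)}
+}
+
+// Register validates and stores a tool, replacing any earlier definition.
+// It reports false when the definition is rejected.
+func (r *Registry) Register(t Tool) bool {
+    if t.Name == "" {
+        return false
+    }
+    r.mu.Lock()
+    defer r.mu.Unlock()
+    r.tools[t.Name] = t
+    return true
+}
+
+// Remove drops a tool at runtime, for example when its server disconnects.
+func (r *Registry) Remove(name string) {
+    r.mu.Lock()
+    defer r.mu.Unlock()
+    delete(r.tools, name)
+}
+
+// Names returns the registered tool names in sorted order.
+func (r *Registry) Names() []string {
+    r.mu.RLock()
+    defer r.mu.RUnlock()
+    names := make([]string, 0, len(r.tools))
+    for n := range r.tools {
+        names = append(names, n)
+    }
+    sort.Strings(names)
+    return names
+}
+```
+
+A read-write mutex suits this access pattern because lookups on the request path are far more frequent than registrations.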
+
+---
+
+## Tool Filtering & Access Control
+
+### **Multi-Level Filtering System**
+
+Bifrost controls tool availability at two levels, filtering first by MCP client and then by individual tool:
+
+```mermaid
+flowchart TD
+    Request[Incoming Request] --> GlobalFilter{Global MCP Filter}
+    GlobalFilter -->|Enabled| ClientFilter[MCP Client Filtering]
+    GlobalFilter -->|Disabled| NoMCP[No MCP Tools]
+
+    ClientFilter --> IncludeClients{Include Clients?}
+    IncludeClients -->|Yes| IncludeList[Include Specified<br/>MCP Clients]
+    IncludeClients -->|No| AllClients[All MCP Clients]
+
+    IncludeList --> ExcludeClients{Exclude Clients?}
+    AllClients --> ExcludeClients
+    ExcludeClients -->|Yes| RemoveClients[Remove Excluded<br/>MCP Clients]
+    ExcludeClients -->|No| ClientsFiltered[Filtered Clients]
+
+    RemoveClients --> ToolFilter[Tool-Level Filtering]
+    ClientsFiltered --> ToolFilter
+
+    ToolFilter --> IncludeTools{Include Tools?}
+    IncludeTools -->|Yes| IncludeSpecific[Include Specified<br/>Tools Only]
+    IncludeTools -->|No| AllTools[All Available Tools]
+
+    IncludeSpecific --> ExcludeTools{Exclude Tools?}
+    AllTools --> ExcludeTools
+    ExcludeTools -->|Yes| RemoveTools[Remove Excluded<br/>Tools]
+    ExcludeTools -->|No| FinalTools[Final Tool Set]
+
+    RemoveTools --> FinalTools
+    FinalTools --> AIModel[Available to AI Model]
+    NoMCP --> AIModel
+```
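+
+The include-then-exclude logic in the flowchart can be sketched in Go as follows. The helper names `parseList` and `applyFilter` are hypothetical and chosen for this example; the same function can serve both the client level and the tool level.
+
+```go
+package main
+
+import "strings"
+
+// parseList splits a comma-separated value such as "filesystem,websearch"
+// into a set. An empty value yields an empty set.
+func parseList(value string) map[string]bool {
+    set := make(map[string]bool)
+    for _, item := range strings.Split(value, ",") {
+        if item = strings.TrimSpace(item); item != "" {
+            set[item] = true
+        }
+    }
+    return set
+}
+
+// applyFilter keeps names in the include set (or every name when the
+// include set is empty) and then drops names in the exclude set.
+func applyFilter(names []string, include, exclude map[string]bool) []string {
+    var kept []string
+    for _, n := range names {
+        if len(include) > 0 && !include[n] {
+            continue
+        }
+        if exclude[n] {
+            continue
+        }
+        kept = append(kept, n)
+    }
+    return kept
+}
+```
+
+Applying the exclude set after the include set means an exclusion always wins when a name appears in both.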
+
+### **Filtering Configuration Levels**
+
+**Request-Level Filtering:**
+
+```bash
+# Include only specific MCP clients
+curl -X POST http://localhost:8080/v1/chat/completions \
+  -H "Content-Type: application/json" \
+  -H "x-bf-mcp-include-clients: filesystem,websearch" \
+  -d '{"model": "gpt-4o-mini", "messages": [...]}'
+
+# Include only specific tools
+curl -X POST http://localhost:8080/v1/chat/completions \
+  -H "Content-Type: application/json" \
+  -H "x-bf-mcp-include-tools: filesystem-read_file,websearch-search" \
+  -d '{"model": "gpt-4o-mini", "messages": [...]}'
+```
+
+**Configuration-Level Filtering:**
+
+- **Client Selection** - Choose which MCP servers to connect to
+- **Tool Blacklisting** - Permanently disable dangerous or unwanted tools
+- **Permission Mapping** - Map user roles to available tool sets
+- **Environment-Based** - Different tool sets for development vs production
+
+**Security Benefits:**
+
+- **Principle of Least Privilege** - Only necessary tools are exposed
+- **Dynamic Access Control** - Per-request tool availability
+- **Audit Trail** - Track which tools are used by which requests
+- **Risk Mitigation** - Prevent access to dangerous operations
+
+> **📖 Tool Filtering:** [MCP Tool Control →](../../mcp/filtering)
+
+---
+
+## Tool Execution Engine
+
+### **Async Tool Execution Architecture**
+
+The MCP execution engine handles tool invocation asynchronously to maintain system responsiveness and enable complex multi-tool workflows:
+
+```mermaid
+sequenceDiagram
+    participant AIModel
+    participant ExecutionEngine
+    participant ToolInvoker
+    participant MCPServer
+    participant ResultProcessor
+
+    AIModel->>ExecutionEngine: Tool Call Request
+    ExecutionEngine->>ExecutionEngine: Validate Tool Call
+    ExecutionEngine->>ToolInvoker: Queue Tool Execution
+
+    Note over ToolInvoker: Async Tool Execution
+    ToolInvoker->>MCPServer: Invoke Tool
+    MCPServer->>MCPServer: Execute Tool Logic
+    MCPServer-->>ToolInvoker: Raw Tool Result
+
+    ToolInvoker->>ResultProcessor: Process Result
+    ResultProcessor->>ResultProcessor: Format & Validate
+    ResultProcessor-->>ExecutionEngine: Processed Result
+
+    ExecutionEngine-->>AIModel: Tool Execution Complete
+
+    Note over AIModel: Multi-turn Conversation
+    AIModel->>ExecutionEngine: Continue with Tool Results
+    ExecutionEngine->>ExecutionEngine: Merge Results into Context
+    ExecutionEngine-->>AIModel: Enhanced Response
+```
+
+### **Execution Flow Characteristics**
+
+**Validation Phase:**
+
+- **Parameter Validation** - Ensure tool arguments match expected schema
+- **Permission Checking** - Verify tool access permissions for the request
+- **Rate Limiting** - Apply per-tool and per-user rate limits
+- **Security Scanning** - Check for potentially dangerous operations
+
+**Execution Phase:**
+
+- **Timeout Management** - Bounded execution time to prevent hanging
+- **Error Handling** - Graceful handling of tool failures and timeouts
+- **Result Streaming** - Support for tools that return streaming responses
+- **Resource Monitoring** - Track tool resource usage and performance
+
+**Response Phase:**
+
+- **Result Formatting** - Convert tool outputs to consistent format
+- **Error Enrichment** - Add context and suggestions for tool failures
+- **Multi-Result Aggregation** - Combine multiple tool outputs coherently
+- **Context Integration** - Merge tool results into conversation context
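+
+Timeout management in the execution phase can be illustrated with Go's `context` package. This is a sketch under assumptions: `ToolFunc`, `ErrToolTimeout`, and `runWithTimeout` are hypothetical names for this example, not Bifrost APIs.
+
+```go
+package main
+
+import (
+    "context"
+    "errors"
+    "time"
+)
+
+// ToolFunc stands in for a single tool invocation (hypothetical type).
+type ToolFunc func(ctx context.Context) (string, error)
+
+// ErrToolTimeout is returned when a tool exceeds its time budget.
+var ErrToolTimeout = errors.New("tool execution timed out")
+
+// runWithTimeout runs fn in a goroutine and waits for either its result
+// or the deadline, whichever comes first.
+func runWithTimeout(parent context.Context, timeout time.Duration, fn ToolFunc) (string, error) {
+    ctx, cancel := context.WithTimeout(parent, timeout)
+    defer cancel()
+
+    type outcome struct {
+        result string
+        err    error
+    }
+    // Buffered so the goroutine can deliver its result and exit even
+    // after the caller has stopped waiting.
+    done := make(chan outcome, 1)
+    go func() {
+        res, err := fn(ctx)
+        done <- outcome{res, err}
+    }()
+
+    select {
+    case out := <-done:
+        return out.result, out.err
+    case <-ctx.Done():
+        if errors.Is(ctx.Err(), context.DeadlineExceeded) {
+            return "", ErrToolTimeout
+        }
+        return "", ctx.Err() // the parent context was cancelled
+    }
+}
+```
+
+The caller is unblocked at the deadline, but a tool that ignores `ctx` keeps running in the background, so tool implementations should still honor cancellation.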
+
+### **Multi-Turn Conversation Support**
+
+The MCP system supports multi-turn conversations that proceed through the following stages:
+
+1. **Initial Tool Discovery** - Request available tools for a given context
+2. **Tool Execution** - Execute one or more tools based on user request
+3. **Result Analysis** - Analyze tool outputs and determine next steps
+4. **Follow-up Actions** - Execute additional tools based on previous results
+5. **Response Synthesis** - Combine tool results into coherent user response
+
+**Example Multi-Turn Flow:**
+
+```
+User: "Find recent news about AI and save interesting articles"
+AI: → Execute web_search("AI news recent")
+AI: → Analyze search results
+AI: → Execute save_article() for each interesting result
+AI: → Respond with summary of saved articles
+```
+
+### **Complete User-Controlled Tool Execution Flow**
+
+The following diagram shows the end-to-end user experience with MCP tool execution, highlighting the critical user control points and decision-making process:
+
+```mermaid
+flowchart TD
+    A["👤 User Message<br/>\"List files in current directory\""] --> B["🤖 Bifrost Core"]
+
+    B --> C["🔧 MCP Manager<br/>Auto-discovers and adds<br/>available tools to request"]
+
+    C --> D["🌐 LLM Provider<br/>(OpenAI, Anthropic, etc.)"]
+
+    D --> E{"🔍 Response contains<br/>tool_calls?"}
+
+    E -->|No| F["✅ Final Response<br/>Display to user"]
+
+    E -->|Yes| G["📝 Add assistant message<br/>with tool_calls to history"]
+
+    G --> H["🛡️ YOUR EXECUTION LOGIC<br/>(Security, Approval, Logging)"]
+
+    H --> I{"🤔 User Decision Point<br/>Execute this tool?"}
+
+    I -->|Deny| J["❌ Create denial result<br/>Add to conversation history"]
+
+    I -->|Approve| K["⚙️ client.ExecuteMCPTool()<br/>Bifrost executes via MCP"]
+
+    K --> L["📊 Tool Result<br/>Add to conversation history"]
+
+    J --> M["🔄 Continue conversation loop<br/>Send updated history back to LLM"]
+    L --> M
+
+    M --> D
+
+    style A fill:#e1f5fe
+    style F fill:#e8f5e8
+    style H fill:#fff3e0
+    style I fill:#fce4ec
+    style K fill:#f3e5f5
+```
+
+**Key Flow Characteristics:**
+
+**User Control Points:**
+
+- **Security Layer** - Your application controls all tool execution decisions
+- **Approval Gate** - Users can approve or deny each tool execution
+- **Transparency** - Full visibility into what tools will be executed and why
+- **Conversation Continuity** - Tool results seamlessly integrate into conversation flow
+
+**Security Benefits:**
+
+- **No Automatic Execution** - In this flow, tools never execute without explicit approval (auto-execution is opt-in through Agent Mode)
+- **Audit Trail** - Complete logging of all tool execution decisions
+- **Contextual Security** - Approval decisions can consider full conversation context
+- **Graceful Denials** - Denied tools result in informative responses, not errors
+
+**Implementation Patterns:**
+
+```go
+// Example tool execution control in your application.
+// client is your initialized Bifrost client. UserContext and the helpers
+// (requiresApproval, promptUserForApproval, createDenialResponse,
+// handleToolError, addToolResultToHistory) are application-defined.
+func handleToolExecution(ctx context.Context, toolCall schemas.ChatToolCall, userContext UserContext) error {
+    // YOUR SECURITY AND APPROVAL LOGIC HERE
+    if !userContext.HasPermission(toolCall.Function.Name) {
+        return createDenialResponse("Tool not permitted for user role")
+    }
+
+    // Ask the user only when this tool call requires approval
+    if requiresApproval(toolCall) && !promptUserForApproval(toolCall) {
+        return createDenialResponse("User denied tool execution")
+    }
+
+    // Execute the tool via Bifrost
+    result, err := client.ExecuteMCPTool(ctx, toolCall)
+    if err != nil {
+        return handleToolError(err)
+    }
+
+    return addToolResultToHistory(result)
+}
+```
+
+This flow lets AI models discover and request tools while every actual execution stays under your application's control.
+
+---
+
+## Agent Mode Architecture
+
+Agent Mode transforms Bifrost into an autonomous agent runtime by automatically executing pre-approved tools. This section details the internal architecture of the agent execution loop.
+
+### **Agent Execution Loop**
+
+Agent mode runs as an iterative loop that continues until the LLM returns no tool calls, a tool call requires manual approval, or the maximum depth is reached:
+
+```mermaid
+flowchart TD
+    subgraph "Agent Mode Entry"
+        A["📥 Incoming Chat Request"] --> B{"🔍 Check MCP Config<br/>Any tools_to_auto_execute?"}
+        B -->|No| C["📤 Standard Flow<br/>Return tool_calls for manual execution"]
+        B -->|Yes| D["🤖 Enter Agent Loop"]
+    end
+
+    subgraph "Agent Execution Loop"
+        D --> E["🌐 Send to LLM Provider<br/>With available tools"]
+        E --> F{"🔧 Response has<br/>tool_calls?"}
+        F -->|No| G["✅ Return Final Response<br/>No more tools needed"]
+        F -->|Yes| H["📋 Classify Tool Calls"]
+
+        H --> I{"🔐 Separate by<br/>auto-execute status"}
+        I --> J["⚡ Auto-Executable Tools"]
+        I --> K["🛡️ Non-Auto-Executable Tools"]
+
+        J --> L["🔄 Execute in Parallel<br/>Via ToolsManager"]
+        L --> M["📊 Collect Results"]
+
+        K --> N{"Any non-auto<br/>tools found?"}
+        N -->|Yes| O["🛑 Exit Loop Early<br/>Return mixed response"]
+        N -->|No| P{"⏱️ Max depth<br/>reached?"}
+
+        M --> P
+        P -->|Yes| Q["⚠️ Return Current State<br/>May have pending tools"]
+        P -->|No| R["📝 Add results to history"]
+        R --> E
+    end
+
+    subgraph "Response Handling"
+        O --> S["📦 Create Mixed Response<br/>• Content: executed results JSON<br/>• tool_calls: pending tools<br/>• finish_reason: stop"]
+        G --> T["📦 Standard Response<br/>Final answer from LLM"]
+        Q --> U["📦 Depth Limit Response<br/>Current state with any pending"]
+    end
+
+    style D fill:#e3f2fd
+    style L fill:#e8f5e9
+    style O fill:#fff3e0
+    style S fill:#fce4ec
+```
+
+### **Tool Classification System**
+
+When the LLM returns tool calls, Bifrost classifies each tool based on the client configuration:
+
+```mermaid
+flowchart LR
+    subgraph "Tool Call Classification"
+        TC["🔧 Tool Call<br/>from LLM Response"] --> CHECK{"Tool in<br/>tools_to_execute?"}
+        CHECK -->|No| SKIP["❌ Skip<br/>Not allowed"]
+        CHECK -->|Yes| AUTO{"Tool in<br/>tools_to_auto_execute?"}
+        AUTO -->|Yes| EXEC["⚡ Auto-Execute<br/>Run immediately"]
+        AUTO -->|No| MANUAL["🛡️ Manual<br/>Return to caller"]
+    end
+
+    subgraph "Configuration Example"
+        CONFIG["MCPClientConfig"]
+        CONFIG --> TE["tools_to_execute: [*]<br/>All tools available"]
+        CONFIG --> TAE["tools_to_auto_execute:<br/>[read_file, list_dir]"]
+    end
+
+    style EXEC fill:#c8e6c9
+    style MANUAL fill:#fff9c4
+    style SKIP fill:#ffcdd2
+```
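+
+The two checks in the classification diagram can be written as a short Go function. This is an illustrative sketch: the `Decision` type and helper names are hypothetical, and treating `*` as a wildcard in both lists is an assumption (the configuration example only shows it for `tools_to_execute`).
+
+```go
+package main
+
+// Decision is the outcome of classifying one tool call.
+type Decision int
+
+const (
+    Skip        Decision = iota // not in tools_to_execute
+    Manual                      // allowed, but returned to the caller
+    AutoExecute                 // allowed and run immediately
+)
+
+// contains reports whether list holds name or the "*" wildcard.
+func contains(list []string, name string) bool {
+    for _, entry := range list {
+        if entry == "*" || entry == name {
+            return true
+        }
+    }
+    return false
+}
+
+// classify applies the two checks from the diagram in order.
+func classify(name string, toExecute, toAutoExecute []string) Decision {
+    if !contains(toExecute, name) {
+        return Skip
+    }
+    if contains(toAutoExecute, name) {
+        return AutoExecute
+    }
+    return Manual
+}
+```
+
+With `tools_to_execute: ["*"]` and `tools_to_auto_execute: ["read_file", "list_dir"]`, a call to `write_file` is classified as `Manual` in this sketch.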
+
+### **Mixed Tool Response Format**
+
+When a response contains both auto-executable and non-auto-executable tools, the agent executes the former and returns a mixed response in the following format:
+
+<AccordionGroup>
+  <Accordion title="Chat API Response Format" icon="message" defaultOpen>
+
+```json
+{
+  "id": "chatcmpl-abc123",
+  "choices": [{
+    "index": 0,
+    "finish_reason": "stop",
+    "message": {
+      "role": "assistant",
+      "content": "The Output from allowed tools calls is - {\"filesystem_read_file\":\"file contents here\",\"filesystem_list_directory\":\"[\\\"file1.txt\\\",\\\"file2.txt\\\"]\"}\n\nNow I shall call these tools next...",
+      "tool_calls": [
+        {
+          "id": "call_write_123",
+          "type": "function",
+          "function": {
+            "name": "filesystem_write_file",
+            "arguments": "{\"path\":\"output.txt\",\"content\":\"...\"}"
+          }
+        }
+      ]
+    }
+  }]
+}
+```
+
+<Note>
+The `content` field contains JSON-formatted results from auto-executed tools. The `tool_calls` array contains only non-auto-executable tools awaiting approval. Setting `finish_reason` to `"stop"` ensures the agent loop exits.
+</Note>
+
+  </Accordion>
+
+  <Accordion title="Responses API Format" icon="code">
+
+```json
+{
+  "id": "resp-abc123",
+  "output": [
+    {
+      "type": "message",
+      "role": "assistant",
+      "content": [{
+        "type": "text",
+        "text": "The Output from allowed tools calls is - {...}\n\nNow I shall call these tools next..."
+      }]
+    },
+    {
+      "type": "function_call",
+      "role": "assistant",
+      "call_id": "call_write_123",
+      "name": "filesystem_write_file",
+      "arguments": "{\"path\":\"output.txt\",\"content\":\"...\"}"
+    }
+  ]
+}
+```
+
+  </Accordion>
+</AccordionGroup>
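+
+Because a mixed response carries `finish_reason: "stop"` together with pending `tool_calls`, a client should inspect `tool_calls` rather than rely on `finish_reason` alone. The sketch below decodes only the Chat API fields shown above; the `chatResponse` struct and `pendingToolNames` helper are hypothetical names for this example.
+
+```go
+package main
+
+import "encoding/json"
+
+// chatResponse models only the fields of the Chat API response shown above.
+type chatResponse struct {
+    Choices []struct {
+        FinishReason string `json:"finish_reason"`
+        Message      struct {
+            Content   string `json:"content"`
+            ToolCalls []struct {
+                ID       string `json:"id"`
+                Function struct {
+                    Name      string `json:"name"`
+                    Arguments string `json:"arguments"`
+                } `json:"function"`
+            } `json:"tool_calls"`
+        } `json:"message"`
+    } `json:"choices"`
+}
+
+// pendingToolNames returns the names of tool calls still awaiting approval
+// in the first choice of a response.
+func pendingToolNames(body []byte) ([]string, error) {
+    var resp chatResponse
+    if err := json.Unmarshal(body, &resp); err != nil {
+        return nil, err
+    }
+    if len(resp.Choices) == 0 {
+        return nil, nil
+    }
+    var names []string
+    for _, call := range resp.Choices[0].Message.ToolCalls {
+        names = append(names, call.Function.Name)
+    }
+    return names, nil
+}
+```
+
+An empty result means nothing is pending, so the response can be shown to the user as a final answer.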
+
+### **Agent Depth Control**
+
+The `max_agent_depth` setting prevents infinite loops and controls resource usage:
+
+```mermaid
+graph LR
+    subgraph "Depth Tracking"
+        D0["Depth 0<br/>Initial Request"] --> D1["Depth 1<br/>First tool execution"]
+        D1 --> D2["Depth 2<br/>Second iteration"]
+        D2 --> D3["Depth 3<br/>..."]
+        D3 --> DN["Depth N<br/>Max reached"]
+    end
+
+    DN --> EXIT["🛑 Force Exit<br/>Return current state"]
+
+    subgraph "Configuration"
+        CFG["MCPToolManagerConfig"]
+        CFG --> MAX["max_agent_depth: 10<br/>(default)"]
+        CFG --> TIMEOUT["tool_execution_timeout:<br/>30s per tool"]
+    end
+```
+
+<Warning>
+When max depth is reached, the response may contain pending tool calls that weren't executed. Your application should check the response for remaining `tool_calls` and decide whether to execute them, continue the conversation, or stop.
+</Warning>
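+
+The loop and its three exit conditions can be sketched as below. This is a simplified model, not Bifrost's code: `Turn`, `LLM`, and `runAgent` are hypothetical, tools run sequentially here rather than in parallel, and the depth counter counts completed tool-execution rounds.
+
+```go
+package main
+
+// Turn is one LLM response: a final answer or a batch of tool calls.
+// All types here are hypothetical stand-ins for this sketch.
+type Turn struct {
+    Answer    string
+    ToolCalls []string
+}
+
+// LLM maps the accumulated tool results to the next turn.
+type LLM func(history []string) Turn
+
+// runAgent loops until the model stops requesting tools, a tool needs
+// manual approval, or maxDepth rounds have run. It returns the last turn
+// and the number of completed tool-execution rounds.
+func runAgent(llm LLM, auto map[string]bool, run func(tool string) string, maxDepth int) (Turn, int) {
+    var history []string
+    var turn Turn
+    for depth := 0; depth < maxDepth; depth++ {
+        turn = llm(history)
+        if len(turn.ToolCalls) == 0 {
+            return turn, depth // final answer
+        }
+        pending := false
+        for _, tool := range turn.ToolCalls {
+            if auto[tool] {
+                history = append(history, run(tool))
+            } else {
+                pending = true
+            }
+        }
+        if pending {
+            return turn, depth // mixed response: exit early
+        }
+    }
+    return turn, maxDepth // depth limit: turn may hold pending tool calls
+}
+```
+
+The final `return` is the case the warning above describes: the caller receives a turn whose tool calls were requested by the model in the last round and has to decide what to do next.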
+
+---
+
+## Code Mode Architecture
+
+Code Mode enables AI models to write and execute Starlark code (a Python-like language) that orchestrates multiple MCP tools in a single request, so multi-step workflows run without a separate LLM round-trip for each tool call.
+
+### **Code Mode System Overview**
+
+```mermaid
+graph TB
+    subgraph "Code Mode Components"
+        VM["🖥️ Starlark Interpreter<br/>Python-like Runtime"]
+        VFS["📁 Virtual File System<br/>Tool Definitions as .pyi"]
+        EXEC["⚙️ Code Executor<br/>Sandboxed Execution"]
+    end
+
+    subgraph "Meta Tools"
+        LIST["listToolFiles()<br/>Discover available servers"]
+        READ["readToolFile(fileName)<br/>Get tool signatures"]
+        DOCS["getToolDocs(server, tool)<br/>Get detailed docs"]
+        CODE["executeToolCode(code)<br/>Run Starlark code"]
+    end
+
+    subgraph "MCP Integration"
+        TOOLS["🔧 Connected MCP Tools"]
+        RESULTS["📊 Tool Results"]
+    end
+
+    LLM["🤖 LLM"] --> LIST
+    LIST --> VFS
+    VFS --> LLM
+    LLM --> READ
+    READ --> VFS
+    VFS --> LLM
+    LLM --> DOCS
+    DOCS --> VFS
+    VFS --> LLM
+    LLM --> CODE
+    CODE --> VM
+    VM --> EXEC
+    EXEC --> TOOLS
+    TOOLS --> RESULTS
+    RESULTS --> LLM
+
+    style VM fill:#e8eaf6
+    style VFS fill:#e3f2fd
+    style CODE fill:#e8f5e9
+```
+
+### **Virtual File System (VFS)**
+
+Code Mode generates Python stub files (`.pyi`) for all connected MCP tools, providing compact function signatures:
+
+<Tabs>
+  <Tab title="Server-Level Binding">
+
+When `code_mode_binding_level: "server"` (default), tools are grouped by MCP client:
+
+```
+servers/
+├── filesystem.pyi      → All filesystem tools
+├── web_search.pyi      → All web search tools
+└── database.pyi        → All database tools
+```
+
+**Generated Stub Example:**
+```python
+# servers/filesystem.pyi
+# Usage: filesystem.tool_name(param=value)
+# For detailed docs: use getToolDocs(server="filesystem", tool="tool_name")
+
+def read_file(path: str) -> dict:  # Read contents of a file
+def write_file(path: str, content: str) -> dict:  # Write content to a file
+def list_directory(path: str) -> dict:  # List directory contents
+```
+
+**Usage in Code:**
+```python
+files = filesystem.list_directory(path=".")
+content = filesystem.read_file(path=files["entries"][0])
+result = content
+```
+
+  </Tab>
+  <Tab title="Tool-Level Binding">
+
+When `code_mode_binding_level: "tool"`, each tool gets its own file:
+
+```
+servers/
+├── filesystem/
+│   ├── read_file.pyi
+│   ├── write_file.pyi
+│   └── list_directory.pyi
+├── web_search/
+│   └── search.pyi
+└── database/
+    └── query.pyi
+```
+
+**Generated Stub Example:**
+```python
+# servers/filesystem/read_file.pyi
+# Usage: filesystem.read_file(param=value)
+
+def read_file(path: str) -> dict:  # Read contents of a file
+```
+
+**Usage in Code:**
+```python
+content = filesystem.read_file(path="config.json")
+result = content
+```
+
+  </Tab>
+</Tabs>
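+
+To make the stub format concrete, here is a rough Go sketch that renders signatures in the style shown in the tabs above. `Param`, `ToolDef`, `stubLine`, and `serverStub` are hypothetical names, and the sketch emits only the first header comment, not the usage hints.
+
+```go
+package main
+
+import (
+    "fmt"
+    "strings"
+)
+
+// Param and ToolDef are hypothetical, simplified tool descriptions.
+type Param struct {
+    Name string
+    Type string // already mapped to a Python type name, e.g. "str"
+}
+
+type ToolDef struct {
+    Name        string
+    Description string
+    Params      []Param
+}
+
+// stubLine renders one signature line with a trailing description comment.
+func stubLine(t ToolDef) string {
+    params := make([]string, len(t.Params))
+    for i, p := range t.Params {
+        params[i] = fmt.Sprintf("%s: %s", p.Name, p.Type)
+    }
+    return fmt.Sprintf("def %s(%s) -> dict:  # %s",
+        t.Name, strings.Join(params, ", "), t.Description)
+}
+
+// serverStub renders a server-level file: a header comment followed by
+// one signature line per tool.
+func serverStub(server string, tools []ToolDef) string {
+    var b strings.Builder
+    fmt.Fprintf(&b, "# servers/%s.pyi\n", server)
+    for _, t := range tools {
+        b.WriteString(stubLine(t))
+        b.WriteString("\n")
+    }
+    return b.String()
+}
+```
+
+Keeping each tool to a single line is what makes the stubs compact enough to hand to the model in place of full JSON schemas.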
+
+### **Code Execution Flow**
+
+```mermaid
+sequenceDiagram
+    participant LLM as 🤖 LLM
+    participant CM as 📝 Code Mode Handler
+    participant VM as 🖥️ Starlark Interpreter
+    participant TM as 🔧 Tools Manager
+    participant MCP as 🌐 MCP Servers
+
+    LLM->>CM: executeToolCode({ code: "..." })
+    CM->>VM: Initialize sandbox
+    CM->>VM: Inject tool bindings
+    CM->>VM: Execute Starlark code
+
+    loop For each tool call in code
+        VM->>TM: server.tool(param=value)
+        TM->>MCP: Execute tool
+        MCP-->>TM: Tool result
+        TM-->>VM: Return result
+    end
+
+    VM-->>CM: Execution result
+    CM-->>LLM: { result, logs }
+```
+
+### **Starlark Sandbox**
+
+Code runs inside a Starlark sandbox. Starlark is a Python-like language designed for configuration and embedded scripting:
+
+<AccordionGroup>
+  <Accordion title="Available Features" icon="check" defaultOpen>
+
+  - ✅ **Python-like syntax** - Familiar Python syntax and semantics
+  - ✅ **Synchronous calls** - No async/await needed, direct function calls
+  - ✅ **List comprehensions** - `[x for x in items if condition]`
+  - ✅ **print()** - Output captured and returned in logs
+  - ✅ **Dict/List operations** - Standard Python data structures
+  - ✅ **Tool bindings** - All connected MCP tools as globals
+  </Accordion>
+
+  <Accordion title="Restricted Features" icon="ban">
+
+  - ❌ **Imports** - No `import` statements (tools are pre-bound)
+  - ❌ **Classes** - Use dicts and functions instead
+  - ❌ **File I/O** - No direct filesystem access (use MCP tools)
+  - ❌ **Network** - No direct network access (use MCP tools)
+  - ❌ **Randomness/Time** - Deterministic execution only
+
+  </Accordion>
+</AccordionGroup>
+
+### **Code Mode Security Model**
+
+```mermaid
+graph TB
+    subgraph "Security Layers"
+        L1["🔒 Code Validation<br/>Syntax checking before execution"]
+        L2["🛡️ Sandboxed Runtime<br/>No external module access"]
+        L3["⏱️ Execution Timeout<br/>Bounded runtime"]
+        L4["🔐 Tool ACL<br/>Only allowed tools accessible"]
+    end
+
+    subgraph "Execution Boundaries"
+        B1["No filesystem access<br/>(except via MCP tools)"]
+        B2["No network access<br/>(except via MCP tools)"]
+        B3["No process spawning"]
+        B4["Memory isolation enforced"]
+    end
+
+    L1 --> L2 --> L3 --> L4
+    L4 --> B1
+    L4 --> B2
+    L4 --> B3
+    L4 --> B4
+```
+
+### **Code Mode Configuration**
+
+<Tabs>
+  <Tab title="Gateway (config.json)">
+
+```json
+{
+  "mcp": {
+    "client_configs": [
+      {
+        "name": "filesystem",
+        "is_code_mode_client": true,
+        "connection_type": "stdio",
+        "stdio_config": {
+          "command": "npx",
+          "args": ["-y", "@anthropic/mcp-filesystem"]
+        },
+        "tools_to_execute": ["*"]
+      }
+    ],
+    "tool_manager_config": {
+      "code_mode_binding_level": "server",
+      "tool_execution_timeout": "30s"
+    }
+  }
+}
+```
+
+  </Tab>
+  <Tab title="Go SDK">
+
+```go
+mcpConfig := &schemas.MCPConfig{
+    ClientConfigs: []schemas.MCPClientConfig{
+        {
+            Name:             "filesystem",
+            IsCodeModeClient: true,
+            ConnectionType:   schemas.MCPConnectionTypeSTDIO,
+            StdioConfig: &schemas.MCPStdioConfig{
+                Command: "npx",
+                Args:    []string{"-y", "@anthropic/mcp-filesystem"},
+            },
+            ToolsToExecute: []string{"*"},
+        },
+    },
+    ToolManagerConfig: &schemas.MCPToolManagerConfig{
+        CodeModeBindingLevel: schemas.CodeModeBindingLevelServer,
+        ToolExecutionTimeout: 30 * time.Second,
+    },
+}
+```
+
+  </Tab>
+</Tabs>
+
+### **Code Mode vs Agent Mode**
+
+| Aspect | Agent Mode | Code Mode |
+|--------|------------|-----------|
+| **Execution Model** | LLM chooses tool calls turn by turn | LLM writes code orchestrating multiple tools |
+| **Iterations** | Multiple LLM round-trips | Single LLM call, code handles orchestration |
+| **Complexity** | Simple tool chains | Complex workflows with conditionals/loops |
+| **Latency** | Higher (multiple LLM calls) | Lower (single LLM call + code execution) |
+| **Control** | Per-tool approval possible | Code runs atomically |
+| **Best For** | Interactive agents | Batch operations, complex data processing |
+
+---
+
+## MCP Integration Patterns
+
+### **Common Integration Scenarios**
+
+**1. Filesystem Operations**
+
+- **Tools:** `list_files`, `read_file`, `write_file`, `create_directory`
+- **Use Cases:** Code analysis, document processing, file management
+- **Security:** Sandboxed file access, path validation, permission checks
+- **Performance:** Local execution for fast file operations
+
+**2. Web Search & Information Retrieval**
+
+- **Tools:** `web_search`, `fetch_url`, `extract_content`, `summarize`
+- **Use Cases:** Research assistance, fact-checking, content gathering
+- **Integration:** External search APIs, content parsing services
+- **Caching:** Response caching for repeated queries
+
+**3. Database Operations**
+
+- **Tools:** `query_database`, `insert_record`, `update_record`, `schema_info`
+- **Use Cases:** Data analysis, report generation, database administration
+- **Security:** Read-only access by default, query validation, injection prevention
+- **Performance:** Connection pooling, query optimization
+
+**4. API Integrations**
+
+- **Tools:** Custom business logic tools, third-party service integration
+- **Use Cases:** CRM operations, payment processing, notification sending
+- **Authentication:** API key management, OAuth token handling
+- **Error Handling:** Retry logic, fallback mechanisms
+
+### **MCP Server Development Patterns**
+
+**Simple STDIO Server:**
+
+- **Language:** Any language that can read and write JSON-RPC messages over stdin/stdout
+- **Deployment:** Single executable, minimal dependencies
+- **Use Case:** Local tools, development utilities, simple scripts
+
+**HTTP Service Server:**
+
+- **Architecture:** HTTP service exposing MCP protocol endpoints
+- **Scalability:** Horizontal scaling, load balancing
+- **Use Case:** Shared tools, enterprise integrations, cloud services
+
+**Hybrid Approach:**
+
+- **Local + Remote:** Combine STDIO tools for local operations with HTTP for remote services
+- **Failover:** Use local fallbacks when remote services are unavailable
+- **Optimization:** Route tool calls to most appropriate execution environment
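+
+The failover idea in the hybrid approach can be sketched as a wrapper that prefers a remote executor and falls back to a local one. The `Executor` type and `withFallback` function are hypothetical names for this example.
+
+```go
+package main
+
+import "errors"
+
+// Executor runs a named tool and returns its output (hypothetical type).
+type Executor func(tool string) (string, error)
+
+// withFallback tries the remote executor first and uses the local one
+// only when the remote call fails.
+func withFallback(remote, local Executor) Executor {
+    return func(tool string) (string, error) {
+        out, remoteErr := remote(tool)
+        if remoteErr == nil {
+            return out, nil
+        }
+        out, localErr := local(tool)
+        if localErr != nil {
+            // Report both failures so neither is lost.
+            return "", errors.Join(remoteErr, localErr)
+        }
+        return out, nil
+    }
+}
+```
+
+Since the wrapper returns an `Executor`, callers do not need to know whether a tool ran remotely or locally. `errors.Join` requires Go 1.20 or later.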
+
+> **📖 MCP Development:** [Tool Development Guide →](../../mcp/overview)
+
+---
+
+## Security & Safety Considerations
+
+### **MCP Security Architecture**
+
+```mermaid
+graph TB
+    subgraph "Security Layers"
+        L1[Connection Security<br/>Authentication & Encryption]
+        L2[Tool Validation<br/>Schema & Permission Checks]
+        L3[Execution Security<br/>Sandboxing & Limits]
+        L4[Result Security<br/>Output Validation & Filtering]
+    end
+
+    subgraph "Threat Mitigation"
+        T1[Malicious Tools<br/>Code Injection Prevention]
+        T2[Resource Abuse<br/>Rate Limiting & Quotas]
+        T3[Data Exposure<br/>Output Sanitization]
+        T4[System Access<br/>Privilege Isolation]
+    end
+
+    L1 --> T1
+    L2 --> T2
+    L3 --> T4
+    L4 --> T3
+```
+
+**Security Measures:**
+
+**Connection Security:**
+
+- **Authentication** - API keys, certificates, or token-based auth for HTTP/SSE
+- **Encryption** - TLS for HTTP connections; STDIO uses process-local pipes with no network exposure
+- **Network Isolation** - Firewall rules and network segmentation
+
+**Execution Security:**
+
+- **Sandboxing** - Isolated execution environments for tools
+- **Resource Limits** - CPU, memory, and time constraints
+- **Permission Model** - Principle of least privilege for tool access
+
+**Operational Security:**
+
+- **Regular Updates** - Keep MCP servers and tools updated
+- **Monitoring** - Continuous security monitoring and alerting
+- **Incident Response** - Procedures for security incidents involving tools
+
+---
+
+## Related Architecture Documentation
+
+- **[Request Flow](./request-flow)** - MCP integration in request processing
+- **[Concurrency Model](./concurrency)** - MCP concurrency and worker integration
+- **[Plugin System](./plugins)** - Integration between MCP and plugin systems
+- **[Benchmarks](../../benchmarking/getting-started)** - MCP performance impact and optimization
+
+
+
--- a/docs/architecture/core/plugins.mdx
+++ b/docs/architecture/core/plugins.mdx
@@ -0,0 +1,552 @@
+---
+title: "Plugins"
+description: "Deep dive into Bifrost's extensible plugin architecture - how plugins work internally, lifecycle management, execution model, and integration patterns."
+icon: "puzzle-piece"
+---
+
+## Plugin Architecture Philosophy
+
+### **Core Design Principles**
+
+Bifrost's plugin system is built around five key principles that ensure extensibility without compromising performance or reliability:
+
+| Principle                     | Implementation                                   | Benefit                                          |
+| ----------------------------- | ------------------------------------------------ | ------------------------------------------------ |
+| **Plugin-First Design**    | Core logic designed around plugin hook points    | Maximum extensibility without core modifications |
+| **Zero-Copy Integration**  | Direct memory access to request/response objects | Minimal performance overhead                     |
+| **Lifecycle Management**   | Complete plugin lifecycle with automatic cleanup | Resource safety and leak prevention              |
+| **Interface-Based Safety** | Well-defined interfaces for type safety          | Compile-time validation and consistency          |
+| **Failure Isolation**      | Plugin errors don't crash the core system        | Fault tolerance and system stability             |
+
+### **Plugin System Overview**
+
+```mermaid
+graph TB
+    subgraph "Plugin Management Layer"
+        PluginMgr[Plugin Manager<br/>Central Controller]
+        Registry[Plugin Registry<br/>Discovery & Loading]
+        Lifecycle[Lifecycle Manager<br/>State Management]
+    end
+
+    subgraph "Plugin Execution Layer"
+        Pipeline[Plugin Pipeline<br/>Execution Orchestrator]
+        PreHooks[Pre-Processing Hooks<br/>Request Modification]
+        PostHooks[Post-Processing Hooks<br/>Response Enhancement]
+    end
+
+    subgraph "Plugin Categories"
+        Auth[Authentication<br/>& Authorization]
+        RateLimit[Rate Limiting<br/>& Throttling]
+        Transform[Data Transformation<br/>& Validation]
+        Monitor[Monitoring<br/>& Analytics]
+        Custom[Custom Business<br/>Logic]
+    end
+
+    PluginMgr --> Registry
+    Registry --> Lifecycle
+    Lifecycle --> Pipeline
+
+    Pipeline --> PreHooks
+    Pipeline --> PostHooks
+
+    PreHooks --> Auth
+    PreHooks --> RateLimit
+    PostHooks --> Transform
+    PostHooks --> Monitor
+    PostHooks --> Custom
+```
+
+---
+
+## Plugin Lifecycle Management
+
+### **Complete Lifecycle States**
+
+Every plugin goes through a well-defined lifecycle that ensures proper resource management and error handling:
+
+```mermaid
+stateDiagram-v2
+    [*] --> PluginInit: Plugin Creation
+    PluginInit --> Registered: Add to BifrostConfig
+    Registered --> PreHookCall: Request Received
+
+    PreHookCall --> ModifyRequest: Normal Flow
+    PreHookCall --> ShortCircuitResponse: Return Response
+    PreHookCall --> ShortCircuitError: Return Error
+
+    ModifyRequest --> ProviderCall: Send to Provider
+    ProviderCall --> PostHookCall: Receive Response
+
+    ShortCircuitResponse --> PostHookCall: Skip Provider
+    ShortCircuitError --> PostHookCall: Pipeline Symmetry
+
+    PostHookCall --> ModifyResponse: Process Result
+    PostHookCall --> RecoverError: Error Recovery
+    PostHookCall --> FallbackCheck: Check AllowFallbacks
+    PostHookCall --> ResponseReady: Pass Through
+
+    FallbackCheck --> TryFallback: AllowFallbacks=true/nil
+    FallbackCheck --> ResponseReady: AllowFallbacks=false
+    TryFallback --> PreHookCall: Next Provider
+
+    ModifyResponse --> ResponseReady: Modified
+    RecoverError --> ResponseReady: Recovered
+    ResponseReady --> [*]: Return to Client
+
+    Registered --> CleanupCall: Bifrost Shutdown
+    CleanupCall --> [*]: Plugin Destroyed
+```
+
+### **Lifecycle Phase Details**
+
+**Discovery Phase:**
+
+- **Purpose:** Find and catalog available plugins
+- **Sources:** Command line, environment variables, JSON configuration, directory scanning
+- **Validation:** Basic existence and format checks
+- **Output:** Plugin descriptors with metadata
+
+**Loading Phase:**
+
+- **Purpose:** Load plugin binaries into memory
+- **Security:** Digital signature verification and checksum validation
+- **Compatibility:** Interface implementation validation
+- **Resource:** Memory and capability assessment
+
+**Initialization Phase:**
+
+- **Purpose:** Configure plugin with runtime settings
+- **Timeout:** Bounded initialization time to prevent hanging
+- **Dependencies:** External service connectivity verification
+- **State:** Internal state setup and resource allocation
+
+**Runtime Phase:**
+
+- **Purpose:** Active request processing
+- **Monitoring:** Continuous health checking and performance tracking
+- **Recovery:** Automatic error recovery and degraded mode handling
+- **Metrics:** Real-time performance and health metrics collection
+
+> **Plugin Lifecycle:** [Plugin Management →](../../enterprise/custom-plugins)
+
+---
+
+## Plugin Execution Pipeline
+
+### **Request Processing Flow**
+
+The plugin pipeline ensures consistent, predictable execution while maintaining high performance:
+
+#### **Normal Execution Flow (No Short-Circuit)**
+
+```mermaid
+sequenceDiagram
+    participant Client
+    participant Bifrost
+    participant Plugin1
+    participant Plugin2
+    participant Provider
+
+    Client->>Bifrost: Request
+    Bifrost->>Plugin1: PreLLMHook(request)
+    Plugin1-->>Bifrost: modified request
+    Bifrost->>Plugin2: PreLLMHook(request)
+    Plugin2-->>Bifrost: modified request
+    Bifrost->>Provider: API Call
+    Provider-->>Bifrost: response
+    Bifrost->>Plugin2: PostLLMHook(response)
+    Plugin2-->>Bifrost: modified response
+    Bifrost->>Plugin1: PostLLMHook(response)
+    Plugin1-->>Bifrost: modified response
+    Bifrost-->>Client: Final Response
+```
+
+**Execution Order:**
+
+1. **PreHooks:** Execute in registration order (1 → 2 → N)
+2. **Provider Call:** If no short-circuit occurred
+3. **PostHooks:** Execute in reverse order (N → 2 → 1)
+
+#### **Short-Circuit Response Flow (Cache Hit)**
+
+```mermaid
+sequenceDiagram
+    participant Client
+    participant Bifrost
+    participant Cache
+    participant Auth
+    participant Provider
+
+    Client->>Bifrost: Request
+    Bifrost->>Auth: PreLLMHook(request)
+    Auth-->>Bifrost: modified request
+    Bifrost->>Cache: PreLLMHook(request)
+    Cache-->>Bifrost: LLMPluginShortCircuit{Response}
+    Note over Provider: Provider call skipped
+    Bifrost->>Cache: PostLLMHook(response)
+    Cache-->>Bifrost: modified response
+    Bifrost->>Auth: PostLLMHook(response)
+    Auth-->>Bifrost: modified response
+    Bifrost-->>Client: Cached Response
+```
+
+#### **Streaming Response Flow**
+
+For streaming responses, the plugin pipeline executes post-hooks for every delta/chunk received from the provider:
+
+```mermaid
+sequenceDiagram
+    participant Client
+    participant Bifrost
+    participant Plugin1
+    participant Plugin2
+    participant Provider
+
+    Client->>Bifrost: Stream Request
+    Bifrost->>Plugin1: PreLLMHook(request)
+    Plugin1-->>Bifrost: modified request
+    Bifrost->>Plugin2: PreLLMHook(request)
+    Plugin2-->>Bifrost: modified request
+    Bifrost->>Provider: Stream API Call
+
+    loop For Each Delta
+        Provider-->>Bifrost: stream delta
+        Bifrost->>Plugin2: PostLLMHook(delta)
+        Plugin2-->>Bifrost: modified delta
+        Bifrost->>Plugin1: PostLLMHook(delta)
+        Plugin1-->>Bifrost: modified delta
+        Bifrost-->>Client: Send Delta
+    end
+
+    Provider-->>Bifrost: final chunk (finish reason)
+    Bifrost->>Plugin2: PostLLMHook(final)
+    Plugin2-->>Bifrost: modified final
+    Bifrost->>Plugin1: PostLLMHook(final)
+    Plugin1-->>Bifrost: modified final
+    Bifrost-->>Client: Final Chunk
+```
+
+**Streaming Execution Characteristics:**
+
+1. **Delta Processing:**
+   - Each stream delta (chunk) goes through all post-hooks
+   - Plugins can modify/transform each delta before it reaches the client
+   - Deltas can contain: text content, tool calls, role changes, or usage info
+
+2. **Special Delta Types:**
+   - **Start Event:** Initial delta with role information
+   - **Content Delta:** Regular text or tool call content
+   - **Usage Update:** Token usage statistics (if enabled)
+   - **Final Chunk:** Contains finish reason and any final metadata
+
+3. **Plugin Considerations:**
+   - Plugins must handle streaming responses efficiently
+   - Each delta should be processed quickly to maintain stream responsiveness
+   - Plugins can track state across deltas using context
+   - Heavy processing should be done asynchronously
+
+4. **Error Handling:**
+   - If a post-hook returns an error, it's sent as an error stream chunk
+   - Stream is terminated after error chunks
+   - Plugins can recover from errors by providing valid responses
+
+5. **Performance Optimization:**
+   - Lightweight delta processing to minimize latency
+   - Object pooling for common data structures
+   - Non-blocking operations for logging and metrics
+   - Efficient memory management for stream processing
+
+> **Streaming Details:** [Streaming Guide →](../../quickstart/gateway/streaming)
+
+**Short-Circuit Rules:**
+
+- **Provider Skipped:** When plugin returns short-circuit response/error
+- **PostLLMHook Guarantee:** All executed PreHooks get corresponding PostLLMHook calls
+- **Reverse Order:** PostHooks execute in reverse order of PreHooks
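+
+The ordering and short-circuit rules above can be sketched in a few lines of Go. This is a minimal illustration under simplifying assumptions, not Bifrost's implementation: the `Plugin` interface, `runPipeline`, and the `tag` plugin are hypothetical names, and requests and responses are plain strings rather than Bifrost's schema types.
+
+```go
+package main
+
+import "fmt"
+
+// Plugin is a simplified, hypothetical stand-in for a Bifrost plugin.
+type Plugin interface {
+    // PreHook returns the (possibly modified) request, or a non-nil
+    // short-circuit response that causes the provider call to be skipped.
+    PreHook(req string) (string, *string)
+    PostHook(resp string) string
+}
+
+// runPipeline runs pre-hooks in registration order and post-hooks in reverse,
+// calling post-hooks only for plugins whose pre-hook actually ran.
+func runPipeline(plugins []Plugin, req string, provider func(string) string) string {
+    executed := 0
+    var resp *string
+    for _, p := range plugins {
+        executed++
+        var sc *string
+        req, sc = p.PreHook(req)
+        if sc != nil {
+            resp = sc // short-circuit: later pre-hooks and the provider are skipped
+            break
+        }
+    }
+    if resp == nil {
+        r := provider(req)
+        resp = &r
+    }
+    out := *resp
+    for i := executed - 1; i >= 0; i-- {
+        out = plugins[i].PostHook(out)
+    }
+    return out
+}
+
+// tag is a demo plugin that marks the request and response with its name.
+type tag struct {
+    name string
+    hit  bool // when true, PreHook short-circuits with a cached response
+}
+
+func (t tag) PreHook(req string) (string, *string) {
+    if t.hit {
+        cached := "cached"
+        return req, &cached
+    }
+    return req + ">" + t.name, nil
+}
+
+func (t tag) PostHook(resp string) string { return resp + "<" + t.name }
+
+func main() {
+    provider := func(req string) string { return "[" + req + "]" }
+
+    // Normal flow: prints [req>auth>cache]<cache<auth
+    fmt.Println(runPipeline([]Plugin{tag{name: "auth"}, tag{name: "cache"}}, "req", provider))
+
+    // Cache hit: prints cached<cache<auth
+    fmt.Println(runPipeline([]Plugin{tag{name: "auth"}, tag{name: "cache", hit: true}, tag{name: "log"}}, "req", provider))
+}
+```
+
+In the second call the `log` plugin's pre-hook never runs, so its post-hook is skipped as well, while `cache` and `auth` still unwind in reverse order.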
+
+#### **Short-Circuit Error Flow (Allow Fallbacks)**
+
+```mermaid
+sequenceDiagram
+    participant Client
+    participant Bifrost
+    participant Plugin1
+    participant Provider1
+    participant Provider2
+
+    Client->>Bifrost: Request (Provider1 + Fallback Provider2)
+    Bifrost->>Plugin1: PreLLMHook(request)
+    Plugin1-->>Bifrost: LLMPluginShortCircuit{Error, AllowFallbacks=true}
+    Note over Provider1: Provider1 call skipped
+    Bifrost->>Plugin1: PostLLMHook(error)
+    Plugin1-->>Bifrost: error unchanged
+
+    Note over Bifrost: Try fallback provider
+    Bifrost->>Plugin1: PreLLMHook(request for Provider2)
+    Plugin1-->>Bifrost: modified request
+    Bifrost->>Provider2: API Call
+    Provider2-->>Bifrost: response
+    Bifrost->>Plugin1: PostLLMHook(response)
+    Plugin1-->>Bifrost: modified response
+    Bifrost-->>Client: Final Response
+```
+
+#### **Error Recovery Flow**
+
+```mermaid
+sequenceDiagram
+    participant Client
+    participant Bifrost
+    participant Plugin1
+    participant Plugin2
+    participant Provider
+    participant RecoveryPlugin
+
+    Client->>Bifrost: Request
+    Bifrost->>Plugin1: PreLLMHook(request)
+    Plugin1-->>Bifrost: modified request
+    Bifrost->>Plugin2: PreLLMHook(request)
+    Plugin2-->>Bifrost: modified request
+    Bifrost->>RecoveryPlugin: PreLLMHook(request)
+    RecoveryPlugin-->>Bifrost: modified request
+    Bifrost->>Provider: API Call
+    Provider-->>Bifrost: error
+    Bifrost->>RecoveryPlugin: PostLLMHook(error)
+    RecoveryPlugin-->>Bifrost: recovered response
+    Bifrost->>Plugin2: PostLLMHook(response)
+    Plugin2-->>Bifrost: modified response
+    Bifrost->>Plugin1: PostLLMHook(response)
+    Plugin1-->>Bifrost: modified response
+    Bifrost-->>Client: Recovered Response
+```
+
+**Error Recovery Features:**
+
+- **Error Transformation:** Plugins can convert errors to successful responses
+- **Graceful Degradation:** Provide fallback responses for service failures
+- **Context Preservation:** Error context is maintained through recovery process
+
+### **Complex Plugin Decision Flow**
+
+The following flow shows how authentication, rate limiting, and caching plugins interact in practice, each with its own decision path:
+
+```mermaid
+graph TD
+    A["Client Request"] --> B["Bifrost"]
+    B --> C["Auth Plugin PreLLMHook"]
+    C --> D{"Authenticated?"}
+    D -->|No| E["Return Auth Error<br/>AllowFallbacks=false"]
+    D -->|Yes| F["RateLimit Plugin PreLLMHook"]
+    F --> G{"Rate Limited?"}
+    G -->|Yes| H["Return Rate Error<br/>AllowFallbacks=nil"]
+    G -->|No| I["Cache Plugin PreLLMHook"]
+    I --> J{"Cache Hit?"}
+    J -->|Yes| K["Return Cached Response"]
+    J -->|No| L["Provider API Call"]
+    L --> M["Cache Plugin PostLLMHook"]
+    M --> N["Store in Cache"]
+    N --> O["RateLimit Plugin PostLLMHook"]
+    O --> P["Auth Plugin PostLLMHook"]
+    P --> Q["Final Response"]
+
+    E --> R["Skip Fallbacks"]
+    H --> S["Try Fallback Provider"]
+    K --> T["Skip Provider Call"]
+```
+
+### **Execution Characteristics**
+
+**Symmetric Execution Pattern:**
+
+- **Pre-processing:** Plugins execute in registration order (first registered runs first)
+- **Post-processing:** Plugins execute in reverse registration order (last registered runs first)
+- **Rationale:** Ensures proper cleanup and state management (last in, first out)
+
+**Performance Optimizations:**
+
+- **Timeout Boundaries:** Each plugin has configurable execution timeouts
+- **Panic Recovery:** Plugin panics are caught and logged without crashing the system
+- **Resource Limits:** Memory and CPU limits prevent runaway plugins
+- **Circuit Breaking:** Repeated failures trigger plugin isolation
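+
+Timeout boundaries and panic recovery are commonly combined in Go by running the hook in a goroutine guarded by `recover` and a context deadline. The sketch below assumes a hypothetical `safeCall` helper and a simplified hook signature; it shows the general technique rather than Bifrost's actual code.
+
+```go
+package main
+
+import (
+    "context"
+    "errors"
+    "fmt"
+    "time"
+)
+
+// safeCall runs a hypothetical plugin hook with a deadline and converts
+// panics into ordinary errors so the caller keeps running.
+func safeCall(
+    ctx context.Context,
+    timeout time.Duration,
+    hook func(context.Context) (string, error),
+) (string, error) {
+    ctx, cancel := context.WithTimeout(ctx, timeout)
+    defer cancel()
+
+    type result struct {
+        val string
+        err error
+    }
+    done := make(chan result, 1) // buffered so the goroutine never blocks after a timeout
+
+    go func() {
+        defer func() {
+            if r := recover(); r != nil {
+                done <- result{err: fmt.Errorf("plugin panicked: %v", r)}
+            }
+        }()
+        v, err := hook(ctx)
+        done <- result{val: v, err: err}
+    }()
+
+    select {
+    case res := <-done:
+        return res.val, res.err
+    case <-ctx.Done():
+        return "", fmt.Errorf("plugin hook aborted: %w", ctx.Err())
+    }
+}
+
+func main() {
+    ctx := context.Background()
+
+    // A panicking hook surfaces as an error instead of crashing the process.
+    _, err := safeCall(ctx, time.Second, func(context.Context) (string, error) {
+        panic("boom")
+    })
+    fmt.Println(err) // prints: plugin panicked: boom
+
+    // A hook that outlives its deadline is reported as a timeout.
+    _, err = safeCall(ctx, 10*time.Millisecond, func(c context.Context) (string, error) {
+        <-c.Done() // a well-behaved hook stops when its deadline passes
+        return "", c.Err()
+    })
+    fmt.Println(errors.Is(err, context.DeadlineExceeded)) // prints: true
+}
+```
+
+Go cannot forcibly stop a goroutine, so a hook that ignores its context keeps running in the background after the timeout; the wrapper only guarantees that the caller is released on time.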
+
+**Error Handling Strategies:**
+
+- **Continue:** Use original request/response if plugin fails
+- **Fail Fast:** Return error immediately if critical plugin fails
+- **Retry:** Attempt plugin execution with exponential backoff
+- **Fallback:** Use alternative plugin or default behavior
+
+> **Plugin Execution:** [Request Flow →](./request-flow#stage-3-plugin-pipeline-processing)
+
+---
+
+## Security & Validation
+
+### **Multi-Layer Security Model**
+
+Plugin security operates at multiple layers to ensure system integrity:
+
+```mermaid
+graph TB
+    subgraph "Security Validation Layers"
+        L1[Layer 1: Binary Validation<br/>Signature & Checksum]
+        L2[Layer 2: Interface Validation<br/>Type Safety & Compatibility]
+        L3[Layer 3: Runtime Validation<br/>Resource Limits & Timeouts]
+        L4[Layer 4: Execution Isolation<br/>Panic Recovery & Error Handling]
+    end
+
+    subgraph "Security Benefits"
+        Integrity[Code Integrity<br/>Verified Authenticity]
+        Safety[Type Safety<br/>Compile-time Checks]
+        Stability[System Stability<br/>Isolated Failures]
+        Performance[Performance Protection<br/>Resource Limits]
+    end
+
+    L1 --> Integrity
+    L2 --> Safety
+    L3 --> Performance
+    L4 --> Stability
+```
+
+### **Validation Process**
+
+**Binary Security:**
+
+- **Digital Signatures:** Cryptographic verification of plugin authenticity
+- **Checksum Validation:** File integrity verification
+- **Source Verification:** Trusted source requirements
+
+**Interface Security:**
+
+- **Type Safety:** Interface implementation verification
+- **Version Compatibility:** Plugin API version checking
+- **Memory Safety:** Safe memory access patterns
+
+**Runtime Security:**
+
+- **Resource Quotas:** Memory and CPU usage limits
+- **Execution Timeouts:** Bounded execution time
+- **Sandbox Execution:** Isolated execution environment
+
+**Operational Security:**
+
+- **Health Monitoring:** Continuous plugin health assessment
+- **Error Tracking:** Plugin error rate monitoring
+- **Automatic Recovery:** Failed plugin restart and recovery
+
+---
+
+## Plugin Performance & Monitoring
+
+### **Comprehensive Metrics System**
+
+Bifrost provides detailed metrics for plugin performance and health monitoring:
+
+```mermaid
+graph TB
+    subgraph "Execution Metrics"
+        ExecTime[Execution Time<br/>Latency per Plugin]
+        ExecCount[Execution Count<br/>Request Volume]
+        SuccessRate[Success Rate<br/>Error Percentage]
+        Throughput[Throughput<br/>Requests/Second]
+    end
+
+    subgraph "Resource Metrics"
+        MemoryUsage[Memory Usage<br/>Per Plugin Instance]
+        CPUUsage[CPU Utilization<br/>Processing Time]
+        IOMetrics[I/O Operations<br/>Network/Disk Activity]
+        PoolUtilization[Pool Utilization<br/>Resource Efficiency]
+    end
+
+    subgraph "Health Metrics"
+        ErrorRate[Error Rate<br/>Failed Executions]
+        PanicCount[Panic Recovery<br/>Crash Events]
+        TimeoutCount[Timeout Events<br/>Slow Executions]
+        RecoveryRate[Recovery Success<br/>Failure Handling]
+    end
+
+    subgraph "Business Metrics"
+        AddedLatency[Added Latency<br/>Plugin Overhead]
+        SystemImpact[System Impact<br/>Overall Performance]
+        FeatureUsage[Feature Usage<br/>Plugin Utilization]
+        CostImpact[Cost Impact<br/>Resource Consumption]
+    end
+```
+
+### **Performance Characteristics**
+
+**Plugin Execution Performance:**
+
+- **Typical Overhead:** 1-10μs per plugin for simple operations
+- **Authentication Plugins:** 1-5μs for key validation
+- **Rate Limiting Plugins:** 500ns for quota checks
+- **Monitoring Plugins:** 200ns for metric collection
+- **Transformation Plugins:** 2-10μs depending on complexity
+
+**Resource Usage Patterns:**
+
+- **Memory Efficiency:** Object pooling reduces allocations
+- **CPU Optimization:** Minimal processing overhead
+- **Network Impact:** Configurable external service calls
+- **Storage Overhead:** Minimal for stateless plugins
+
+---
+
+## Plugin Integration Patterns
+
+### **Common Integration Scenarios**
+
+**1. Authentication & Authorization**
+
+- **Pre-processing Hook:** Validate API keys or JWT tokens
+- **Configuration:** External identity provider integration
+- **Error Handling:** Return 401/403 responses for invalid credentials
+- **Performance:** Sub-5μs validation with caching
+
+**2. Rate Limiting & Quotas**
+
+- **Pre-processing Hook:** Check request quotas and limits
+- **Storage:** Redis or in-memory rate limit tracking
+- **Algorithms:** Token bucket, sliding window, fixed window
+- **Responses:** 429 Too Many Requests with retry headers
+
+**3. Request/Response Transformation**
+
+- **Dual Hooks:** Pre-processing for requests, post-processing for responses
+- **Use Cases:** Data format conversion, field mapping, content filtering
+- **Performance:** Streaming transformations for large payloads
+- **Compatibility:** Provider-specific format adaptations
+
+**4. Monitoring & Analytics**
+
+- **Post-processing Hook:** Collect metrics and logs after request completion
+- **Destinations:** Prometheus, DataDog, custom analytics systems
+- **Data:** Request/response metadata, performance metrics, error tracking
+- **Privacy:** Configurable data sanitization and filtering
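+
+As an illustration of the rate-limiting scenario above, a token bucket can be written in a few lines. The `TokenBucket` type and its methods are hypothetical and kept in memory for simplicity; a pre-processing hook could call `Allow` and return a 429-style short-circuit error when it reports `false`. Time is passed in explicitly so the behavior is easy to test.
+
+```go
+package main
+
+import (
+    "fmt"
+    "math"
+    "sync"
+    "time"
+)
+
+// TokenBucket is an illustrative in-memory limiter, not Bifrost's implementation.
+type TokenBucket struct {
+    mu       sync.Mutex
+    capacity float64 // maximum burst size
+    tokens   float64 // tokens currently available
+    rate     float64 // tokens added per second
+    last     time.Time
+}
+
+// NewTokenBucket returns a full bucket whose refill clock starts at now.
+func NewTokenBucket(capacity, rate float64, now time.Time) *TokenBucket {
+    return &TokenBucket{capacity: capacity, tokens: capacity, rate: rate, last: now}
+}
+
+// Allow refills the bucket for the time elapsed since the last call and
+// consumes one token if one is available.
+func (b *TokenBucket) Allow(now time.Time) bool {
+    b.mu.Lock()
+    defer b.mu.Unlock()
+
+    if elapsed := now.Sub(b.last).Seconds(); elapsed > 0 {
+        b.tokens = math.Min(b.capacity, b.tokens+elapsed*b.rate)
+        b.last = now
+    }
+    if b.tokens < 1 {
+        return false
+    }
+    b.tokens--
+    return true
+}
+
+func main() {
+    t0 := time.Now()
+    b := NewTokenBucket(2, 1, t0) // burst of 2, refills 1 token per second
+
+    fmt.Println(b.Allow(t0), b.Allow(t0), b.Allow(t0)) // prints: true true false
+    fmt.Println(b.Allow(t0.Add(time.Second)))          // prints: true
+}
+```
+
+A deployment with several gateway instances would need shared storage such as Redis for the counters, as the storage bullet for this scenario suggests; the refill arithmetic stays the same.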
+
+### **Plugin Communication Patterns**
+
+**Plugin-to-Plugin Communication:**
+
+- **Shared Context:** Plugins can store data in request context for downstream plugins
+- **Event System:** Plugin can emit events for other plugins to consume
+- **Data Passing:** Structured data exchange between related plugins
+
+**Plugin-to-External Service Communication:**
+
+- **HTTP Clients:** Built-in HTTP client pools for external API calls
+- **Database Connections:** Connection pooling for database access
+- **Message Queues:** Integration with message queue systems
+- **Caching Systems:** Redis, Memcached integration for state storage
+
+> **📖 Integration Examples:** [Plugin Development Guide →](../../enterprise/custom-plugins)
+
+---
+
+## Related Architecture Documentation
+
+- **[Request Flow](./request-flow)** - Plugin execution in request processing pipeline
+- **[Concurrency Model](./concurrency)** - Plugin concurrency and threading considerations
+- **[Benchmarks](../../benchmarking/getting-started)** - Plugin performance characteristics and optimization
+- **[MCP System](./mcp)** - Integration between plugins and MCP system
+
--- a/docs/architecture/core/providers.mdx
+++ b/docs/architecture/core/providers.mdx
--- a/docs/architecture/core/request-flow.mdx
+++ b/docs/architecture/core/request-flow.mdx
@@ -0,0 +1,527 @@
+---
+title: "Request Flow"
+description: "Deep dive into Bifrost's request processing pipeline - from transport layer ingestion through provider execution to response delivery."
+icon: "route"
+---
+
+## Stage 1: Transport Layer Processing
+
+### **HTTP Transport Flow**
+
+```mermaid
+sequenceDiagram
+    participant Client
+    participant HTTPTransport
+    participant Router
+    participant Validation
+
+    Client->>HTTPTransport: POST /v1/chat/completions
+    HTTPTransport->>HTTPTransport: Parse Headers
+    HTTPTransport->>HTTPTransport: Extract Body
+    HTTPTransport->>Validation: Validate JSON Schema
+    Validation->>Router: BifrostRequest
+    Router-->>HTTPTransport: Processing Started
+    HTTPTransport-->>Client: HTTP 200 (async processing)
+```
+
+**Key Processing Steps:**
+
+1. **Request Reception** - FastHTTP server receives request
+2. **Header Processing** - Extract authentication, content-type, custom headers
+3. **Body Parsing** - JSON unmarshaling with schema validation
+4. **Request Transformation** - Convert to internal `BifrostRequest` schema
+5. **Context Creation** - Build request context with metadata
+
+**Performance Characteristics:**
+
+- **Parsing Time:** ~2.1μs for typical requests
+- **Validation Overhead:** ~400ns for schema checks
+- **Memory Allocation:** Zero-copy where possible
+
+### **Go SDK Flow**
+
+```mermaid
+sequenceDiagram
+    participant Application
+    participant SDK
+    participant Core
+    participant Validation
+
+    Application->>SDK: bifrost.ChatCompletion(req)
+    SDK->>SDK: Type Validation
+    SDK->>Core: Direct Function Call
+    Core->>Validation: Schema Validation
+    Validation-->>Core: Validated Request
+    Core-->>SDK: Processing Result
+    SDK-->>Application: Typed Response
+```
+
+**Advantages:**
+
+- **Zero Serialization** - Direct Go struct passing
+- **Type Safety** - Compile-time validation
+- **Lower Latency** - No HTTP/JSON overhead
+- **Memory Efficiency** - No intermediate allocations
+
+---
+
+## Stage 2: Request Routing & Load Balancing
+
+### **Provider Selection Logic**
+
+```mermaid
+flowchart TD
+    Request[Incoming Request] --> ModelCheck{Model Available?}
+    ModelCheck -->|Yes| ProviderDirect[Use Specified Provider]
+    ModelCheck -->|No| ModelMapping[Model → Provider Mapping]
+
+    ProviderDirect --> KeyPool[API Key Pool]
+    ModelMapping --> KeyPool
+
+    KeyPool --> WeightedSelect[Weighted Random Selection]
+    WeightedSelect --> HealthCheck{Provider Healthy?}
+
+    HealthCheck -->|Yes| AssignWorker[Assign Worker]
+    HealthCheck -->|No| CircuitBreaker[Circuit Breaker]
+
+    CircuitBreaker --> FallbackCheck{Fallback Available?}
+    FallbackCheck -->|Yes| FallbackProvider[Try Fallback]
+    FallbackCheck -->|No| ErrorResponse[Return Error]
+
+    FallbackProvider --> KeyPool
+```
+
+**Key Selection Algorithm:**
+
+```go
+// Weighted random key selection
+type KeySelector struct {
+    keys    []APIKey
+    weights []float64
+    total   float64
+}
+
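+func (ks *KeySelector) SelectKey() *APIKey {
+    if len(ks.keys) == 0 {
+        return nil // No keys configured for this provider
+    }
+
+    // Pick a point in [0, total) and walk the cumulative weights (math/rand)
+    r := rand.Float64() * ks.total
+    cumulative := 0.0
+
+    for i, weight := range ks.weights {
+        cumulative += weight
+        if r <= cumulative {
+            return &ks.keys[i]
+        }
+    }
+    // Floating-point rounding can leave r above the last cumulative sum
+    return &ks.keys[len(ks.keys)-1]
+}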
+```
+
+**Performance Metrics:**
+
+- **Key Selection Time:** ~10ns for typical key counts (the scan is linear in the number of keys)
+- **Health Check Overhead:** ~50ns (cached results)
+- **Fallback Decision:** ~25ns (configuration lookup)
+
+---
+
+## Stage 3: Plugin Pipeline Processing
+
+### **Pre-Processing Hooks**
+
+```mermaid
+sequenceDiagram
+    participant Request
+    participant AuthPlugin
+    participant RateLimitPlugin
+    participant TransformPlugin
+    participant Core
+
+    Request->>AuthPlugin: ProcessRequest()
+    AuthPlugin->>AuthPlugin: Validate API Key
+    AuthPlugin->>RateLimitPlugin: Authorized Request
+
+    RateLimitPlugin->>RateLimitPlugin: Check Rate Limits
+    RateLimitPlugin->>TransformPlugin: Allowed Request
+
+    TransformPlugin->>TransformPlugin: Modify Request
+    TransformPlugin->>Core: Final Request
+```
+
+**Plugin Execution Model:**
+
+```go
+type PluginManager struct {
+    plugins []Plugin
+}
+
+func (pm *PluginManager) ExecutePreHooks(
+    ctx BifrostContext,
+    req *BifrostRequest,
+) (*BifrostRequest, *BifrostError) {
+    for _, plugin := range pm.plugins {
+        modifiedReq, err := plugin.ProcessRequest(ctx, req)
+        if err != nil {
+            return nil, err
+        }
+        req = modifiedReq
+    }
+    return req, nil
+}
+```
+
+**Plugin Types & Performance:**
+
+| Plugin Type           | Processing Time | Memory Impact | Failure Mode           |
+| --------------------- | --------------- | ------------- | ---------------------- |
+| **Authentication**    | ~1-5μs          | Minimal       | Reject request         |
+| **Rate Limiting**     | ~500ns          | Cache-based   | Throttle/reject        |
+| **Request Transform** | ~2-10μs         | Copy-on-write | Continue with original |
+| **Monitoring**        | ~200ns          | Append-only   | Continue silently      |
+
+---
+
+## Stage 4: MCP Tool Discovery & Integration
+
+### **Tool Discovery Process**
+
+```mermaid
+flowchart TD
+    Request[Request with Model] --> MCPCheck{MCP Enabled?}
+    MCPCheck -->|No| SkipMCP[Skip MCP Processing]
+    MCPCheck -->|Yes| ClientLookup[MCP Client Lookup]
+
+    ClientLookup --> ToolFilter[Tool Filtering]
+    ToolFilter --> ToolInject[Inject Tools into Request]
+
+    ToolFilter --> IncludeCheck{Include Filter?}
+    ToolFilter --> ExcludeCheck{Exclude Filter?}
+
+    IncludeCheck -->|Yes| IncludeTools[Include Specified Tools]
+    IncludeCheck -->|No| AllTools[Include All Tools]
+
+    ExcludeCheck -->|Yes| RemoveTools[Remove Excluded Tools]
+    ExcludeCheck -->|No| KeepFiltered[Keep Filtered Tools]
+
+    IncludeTools --> ToolInject
+    AllTools --> ToolInject
+    RemoveTools --> ToolInject
+    KeepFiltered --> ToolInject
+
+    ToolInject --> EnhancedRequest[Request with Tools]
+    SkipMCP --> EnhancedRequest
+```
+
+**Tool Integration Algorithm:**
+
+```go
+func (mcpm *MCPManager) EnhanceRequest(
+    ctx BifrostContext,
+    req *BifrostChatRequest,
+) (*BifrostChatRequest, error) {
+    // Extract tool filtering from context
+    includeClients := ctx.GetStringSlice("mcp-include-clients")
+    includeTools := ctx.GetStringSlice("mcp-include-tools")
+
+    // Get available tools
+    availableTools := mcpm.getAvailableTools(includeClients)
+
+    // Filter tools
+    filteredTools := mcpm.filterTools(availableTools, includeTools)
+
+    // Inject into request
+    if req.Params == nil {
+        req.Params = &ChatParameters{}
+    }
+    req.Params.Tools = append(req.Params.Tools, filteredTools...)
+
+    return req, nil
+}
+```
+
+**MCP Performance Impact:**
+
+- **Tool Discovery:** ~100-500μs (cached after first request)
+- **Tool Filtering:** ~50-200ns per tool
+- **Request Enhancement:** ~1-5μs depending on tool count
+
+---
+
+## Stage 5: Memory Pool Management
+
+### **Object Pool Lifecycle**
+
+```mermaid
+stateDiagram-v2
+    [*] --> PoolInit: System Startup
+    PoolInit --> Available: Objects Pre-allocated
+
+    Available --> Acquired: Request Processing
+    Acquired --> InUse: Object Populated
+    InUse --> Processing: Worker Processing
+    Processing --> Completed: Processing Done
+    Completed --> Reset: Object Cleanup
+    Reset --> Available: Return to Pool
+
+    Available --> Expansion: Pool Exhaustion
+    Expansion --> Available: New Objects Created
+
+    Reset --> GC: Pool Full
+    GC --> [*]: Garbage Collection
+```
+
+**Memory Pool Implementation:**
+
+```go
+type MemoryPools struct {
+    channelPool  sync.Pool
+    messagePool  sync.Pool
+    responsePool sync.Pool
+    bufferPool   sync.Pool
+}
+
+func (mp *MemoryPools) GetChannel() *ProcessingChannel {
+    if ch := mp.channelPool.Get(); ch != nil {
+        return ch.(*ProcessingChannel)
+    }
+    return NewProcessingChannel()
+}
+
+func (mp *MemoryPools) ReturnChannel(ch *ProcessingChannel) {
+    ch.Reset() // Clear previous data
+    mp.channelPool.Put(ch)
+}
+```
+
+---
+
+## Stage 6: Worker Pool Processing
+
+### **Worker Assignment & Execution**
+
+```mermaid
+sequenceDiagram
+    participant Queue
+    participant WorkerPool
+    participant Worker
+    participant Provider
+    participant Circuit
+
+    Queue->>WorkerPool: Enqueue Request
+    WorkerPool->>Worker: Assign Available Worker
+    Worker->>Circuit: Check Circuit Breaker
+    Circuit->>Provider: Forward Request
+
+    Provider-->>Circuit: Response/Error
+    Circuit->>Circuit: Update Health Metrics
+    Circuit-->>Worker: Provider Response
+    Worker-->>WorkerPool: Release Worker
+    WorkerPool-->>Queue: Request Completed
+```
+
+**Worker Pool Architecture:**
+
+```go
+type ProviderWorkerPool struct {
+    workers    chan *Worker
+    queue      chan *ProcessingJob
+    config     WorkerPoolConfig
+    metrics    *PoolMetrics
+}
+
+func (pwp *ProviderWorkerPool) ProcessRequest(job *ProcessingJob) {
+    // Get worker from pool
+    worker := <-pwp.workers
+
+    go func() {
+        defer func() {
+            // Return worker to pool
+            pwp.workers <- worker
+        }()
+
+        // Process request
+        result := worker.Execute(job)
+        job.ResultChan <- result
+    }()
+}
+```
+
+---
+
+## Stage 7: Provider API Communication
+
+### **HTTP Request Execution**
+
+```mermaid
+sequenceDiagram
+    participant Worker
+    participant HTTPClient
+    participant Provider
+    participant CircuitBreaker
+    participant Metrics
+
+    Worker->>HTTPClient: PrepareRequest()
+    HTTPClient->>HTTPClient: Add Headers & Auth
+    HTTPClient->>CircuitBreaker: CheckHealth()
+    CircuitBreaker->>Provider: HTTP Request
+
+    Provider-->>CircuitBreaker: HTTP Response
+    CircuitBreaker->>Metrics: Record Metrics
+    CircuitBreaker-->>HTTPClient: Response/Error
+    HTTPClient-->>Worker: Parsed Response
+```
+
+**Request Preparation Pipeline:**
+
+```go
+func (w *ProviderWorker) ExecuteRequest(job *ProcessingJob) *ProviderResponse {
+    // Prepare HTTP request
+    httpReq := w.prepareHTTPRequest(job.Request)
+
+    // Add authentication
+    w.addAuthentication(httpReq, job.APIKey)
+
+    // Execute with timeout
+    ctx, cancel := context.WithTimeout(context.Background(), job.Timeout)
+    defer cancel()
+
+    httpResp, err := w.httpClient.Do(httpReq.WithContext(ctx))
+    if err != nil {
+        return w.handleError(err, job)
+    }
+    defer httpResp.Body.Close() // Release the connection once parsing is done
+
+    // Parse response
+    return w.parseResponse(httpResp, job)
+}
+```
+
+---
+
+## Stage 8: Tool Execution & Response Processing
+
+### **MCP Tool Execution Flow**
+
+```mermaid
+sequenceDiagram
+    participant Provider
+    participant MCPProcessor
+    participant MCPServer
+    participant ToolExecutor
+    participant ResponseBuilder
+
+    Provider->>MCPProcessor: Response with Tool Calls
+    MCPProcessor->>MCPProcessor: Extract Tool Calls
+
+    loop For each tool call
+        MCPProcessor->>MCPServer: Execute Tool
+        MCPServer->>ToolExecutor: Tool Invocation
+        ToolExecutor-->>MCPServer: Tool Result
+        MCPServer-->>MCPProcessor: Tool Response
+    end
+
+    MCPProcessor->>ResponseBuilder: Combine Results
+    ResponseBuilder-->>Provider: Enhanced Response
+```
+
+**Tool Execution Pipeline:**
+
+```go
+func (mcp *MCPProcessor) ProcessToolCalls(
+    response *ProviderResponse,
+) (*ProviderResponse, error) {
+    toolCalls := mcp.extractToolCalls(response)
+    if len(toolCalls) == 0 {
+        return response, nil
+    }
+
+    // Execute tools concurrently
+    results := make(chan ToolResult, len(toolCalls))
+    for _, toolCall := range toolCalls {
+        go func(tc ToolCall) {
+            result := mcp.executeTool(tc)
+            results <- result
+        }(toolCall)
+    }
+
+    // Collect results
+    toolResults := make([]ToolResult, 0, len(toolCalls))
+    for i := 0; i < len(toolCalls); i++ {
+        toolResults = append(toolResults, <-results)
+    }
+
+    // Enhance response
+    return mcp.enhanceResponse(response, toolResults), nil
+}
+```
+
+---
+
+## Stage 9: Post-Processing & Response Formation
+
+### **Plugin Post-Processing**
+
+```mermaid
+sequenceDiagram
+    participant CoreResponse
+    participant LoggingPlugin
+    participant CachePlugin
+    participant MetricsPlugin
+    participant Transport
+
+    CoreResponse->>LoggingPlugin: ProcessResponse()
+    LoggingPlugin->>LoggingPlugin: Log Request/Response
+    LoggingPlugin->>CachePlugin: Response + Logs
+
+    CachePlugin->>CachePlugin: Cache Response
+    CachePlugin->>MetricsPlugin: Cached Response
+
+    MetricsPlugin->>MetricsPlugin: Record Metrics
+    MetricsPlugin->>Transport: Final Response
+```
+
+**Response Enhancement Pipeline:**
+
+```go
+func (pm *PluginManager) ExecutePostHooks(
+    ctx BifrostContext,
+    req *BifrostRequest,
+    resp *BifrostResponse,
+) (*BifrostResponse, error) {
+    // Post-hooks run in reverse registration order (last in, first out)
+    for i := len(pm.plugins) - 1; i >= 0; i-- {
+        plugin := pm.plugins[i]
+        enhancedResp, err := plugin.ProcessResponse(ctx, req, resp)
+        if err != nil {
+            // Log error but continue processing
+            pm.logger.Warn("Plugin post-processing error", "plugin", plugin.Name(), "error", err)
+            continue
+        }
+        resp = enhancedResp
+    }
+    return resp, nil
+}
+```
+
+### **Response Serialization**
+
+```mermaid
+flowchart TD
+    Response[BifrostResponse] --> Format{Response Format}
+    Format -->|HTTP| JSONSerialize[JSON Serialization]
+    Format -->|SDK| DirectReturn[Direct Go Struct]
+
+    JSONSerialize --> Compress[Compression]
+    DirectReturn --> TypeCheck[Type Validation]
+
+    Compress --> Headers[Set Headers]
+    TypeCheck --> Return[Return Response]
+
+    Headers --> HTTPResponse[HTTP Response]
+    HTTPResponse --> Client[Client Response]
+    Return --> Client
+```
+
+---
+
+## Related Architecture Documentation
+
+- **[Concurrency Model](./concurrency)** - Worker pools and threading details
+- **[Plugin System](./plugins)** - Plugin execution and lifecycle
+- **[MCP System](./mcp)** - Tool discovery and execution internals
+- **[Benchmarks](../../benchmarking/getting-started)** - Detailed performance analysis
--- a/docs/architecture/framework/config-store.mdx
+++ b/docs/architecture/framework/config-store.mdx
@@ -0,0 +1,161 @@
+---
+title: "Config Store"
+description: "A persistent and flexible configuration management system for Bifrost, supporting multiple database backends."
+icon: "gear"
+---
+
+The ConfigStore is a critical component of the Bifrost framework, providing a centralized and persistent storage solution for all gateway configurations. It abstracts the underlying database, offering a unified API for managing everything from provider settings and virtual keys to governance policies and plugin configurations.
+
+## Core Features
+
+- **Unified Configuration API**: A single interface (`ConfigStore`) for all configuration CRUD (Create, Read, Update, Delete) operations.
+- **Multiple Backend Support**: Out-of-the-box support for SQLite and PostgreSQL, with an extensible architecture for adding new database backends.
+- **Comprehensive Data Management**: Manages a wide range of configuration data, including:
+    - Provider and key settings
+    - Virtual keys and governance rules (budgets, rate limits)
+    - Customer and team information for multi-tenancy
+    - Plugin configurations
+    - Vector store and log store settings
+    - Model pricing information
+- **Transactional Operations**: Ensures data consistency by supporting atomic transactions for complex configuration changes.
+- **Database Migrations**: Integrated migration system to manage schema evolution across different versions of Bifrost.
+- **Environment Variable Handling**: Securely manages sensitive data like API keys by storing references to environment variables instead of raw values.
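+
+One way to picture the environment variable handling is a resolver that expands a stored reference only when the value is read. The sketch below is illustrative: the `env.` prefix and the `resolveSecret` helper are assumptions made for this example, not part of the documented ConfigStore API.
+
+```go
+package main
+
+import (
+    "fmt"
+    "os"
+    "strings"
+)
+
+// envPrefix marks a stored value as a reference to an environment variable.
+// The prefix is an assumption made for this sketch.
+const envPrefix = "env."
+
+// resolveSecret returns literal values unchanged and looks up the referenced
+// environment variable when the stored value starts with envPrefix.
+func resolveSecret(stored string) (string, error) {
+    if !strings.HasPrefix(stored, envPrefix) {
+        return stored, nil
+    }
+    name := strings.TrimPrefix(stored, envPrefix)
+    value, ok := os.LookupEnv(name)
+    if !ok {
+        return "", fmt.Errorf("environment variable %q is not set", name)
+    }
+    return value, nil
+}
+
+func main() {
+    os.Setenv("EXAMPLE_API_KEY", "sk-example")
+    key, err := resolveSecret("env.EXAMPLE_API_KEY")
+    fmt.Println(key, err)
+}
+```
+
+With this shape, only the reference is persisted in the database, and the raw secret stays in the process environment.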
+
+## Architecture
+
+The ConfigStore is designed around the `ConfigStore` interface, which defines all the methods for interacting with the configuration data. The primary implementation is `RDBConfigStore`, which uses [GORM](https://gorm.io/) as an ORM to communicate with relational databases.
+
+### Supported Backends
+
+- **SQLite**: The default, file-based database, perfect for local development, testing, and single-node deployments. It requires no external services.
+- **PostgreSQL**: A robust, production-grade database suitable for large-scale, high-availability deployments.
+
+The backend is selected and configured in Bifrost's main configuration file.
+
+### Initialization
+
+The ConfigStore is initialized at startup based on the provided configuration.
+
+```go
+import (
+    "context"
+
+    "github.com/maximhq/bifrost/framework/configstore"
+    "github.com/maximhq/bifrost/core/schemas"
+)
+
+// Example: Initialize a SQLite-based ConfigStore
+config := &configstore.Config{
+    Enabled: true,
+    Type:    configstore.ConfigStoreTypeSQLite,
+    Config: &configstore.SQLiteConfig{
+        File: "/path/to/config.db",
+    },
+}
+
+var logger schemas.Logger // Assume logger is initialized
+store, err := configstore.NewConfigStore(context.Background(), config, logger)
+if err != nil {
+    // Handle error
+}
+```
+
+Here is an example for initializing a PostgreSQL-based `ConfigStore`:
+```go
+// Example: Initialize a PostgreSQL-based ConfigStore
+pgConfig := &configstore.Config{
+    Enabled: true,
+    Type:    configstore.ConfigStoreTypePostgres,
+    Config: &configstore.PostgresConfig{
+        Host:         "localhost",
+        Port:         "5432",
+        User:         "postgres",
+        Password:     "secret",
+        DBName:       "bifrost",
+        SSLMode:      "disable",
+        MaxIdleConns: 5,  // Optional: Maximum idle connections (default: 5)
+        MaxOpenConns: 50, // Optional: Maximum open connections (default: 50)
+    },
+}
+
+store, err = configstore.NewConfigStore(context.Background(), pgConfig, logger)
+if err != nil {
+    // Handle error
+}
+```
+
+<Note>
+PostgreSQL databases used by Bifrost stores must be UTF8 encoded. See [PostgreSQL UTF8 Requirement](../../quickstart/gateway/setting-up#postgresql-utf8-requirement).
+</Note>
+
+### Connection Pool Configuration
+
+For PostgreSQL backends, you can configure the database connection pool to optimize performance based on your workload:
+
+- **MaxIdleConns**: Maximum number of idle connections in the pool (default: 5)
+- **MaxOpenConns**: Maximum number of open connections to the database (default: 50)
+
+These parameters help manage database connection resources effectively. Increase them for high-traffic deployments or decrease them for resource-constrained environments.
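+
+As a rough sketch of how these two settings map onto Go's standard `database/sql` pool, the example below fills in the documented defaults and applies them to a `*sql.DB` (with GORM, that handle is typically obtained via `db.DB()`). The `poolSettings` type and the rule that a non-positive value means "use the default" are assumptions for illustration.
+
+```go
+package main
+
+import (
+    "database/sql"
+    "fmt"
+)
+
+// poolSettings is a hypothetical holder for the two pool options.
+type poolSettings struct {
+    MaxIdleConns int
+    MaxOpenConns int
+}
+
+// withDefaults fills unset (non-positive) values with the documented defaults.
+func (p poolSettings) withDefaults() poolSettings {
+    if p.MaxIdleConns <= 0 {
+        p.MaxIdleConns = 5
+    }
+    if p.MaxOpenConns <= 0 {
+        p.MaxOpenConns = 50
+    }
+    return p
+}
+
+// applyPool configures the connection pool of an already opened database handle.
+func applyPool(db *sql.DB, p poolSettings) {
+    p = p.withDefaults()
+    db.SetMaxIdleConns(p.MaxIdleConns)
+    db.SetMaxOpenConns(p.MaxOpenConns)
+}
+
+func main() {
+    // Only MaxOpenConns is overridden; MaxIdleConns falls back to the default.
+    fmt.Printf("%+v\n", poolSettings{MaxOpenConns: 100}.withDefaults())
+}
+```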
+
+## Data Models
+
+The ConfigStore manages a variety of data models, which are defined as GORM tables in the `framework/configstore/tables` directory. Some of the key models include:
+
+- `TableVirtualKey`: Represents a virtual key with its associated governance rules, keys, and metadata.
+- `TableProvider` & `TableKey`: Store provider-specific configurations and the physical API keys.
+- `TableBudget` & `TableRateLimit`: Define spending limits and request rate limits for governance.
+- `TableCustomer` & `TableTeam`: Enable multi-tenant configurations.
+- `TableModelPricing`: Caches model pricing information for cost calculation.
+- `TablePlugin`: Stores configuration for loaded plugins.
+
+## Usage
+
+The `ConfigStore` interface provides a rich set of methods for managing Bifrost's configuration.
+
+### Managing Virtual Keys
+
+```go
+// Create a new virtual key
+newKey := &tables.TableVirtualKey{
+    ID: "vk-12345",
+    Name: "My Test Key",
+    // ... other fields
+}
+err := store.CreateVirtualKey(ctx, newKey)
+
+// Retrieve a virtual key
+virtualKey, err := store.GetVirtualKey(ctx, "vk-12345")
+```
+
+### Managing Providers
+
+```go
+// Get all provider configurations
+providers, err := store.GetProvidersConfig(ctx)
+
+// Update a specific provider
+providerConfig := providers[schemas.OpenAI]
+providerConfig.NetworkConfig.TimeoutSeconds = 120
+err = store.UpdateProvider(ctx, schemas.OpenAI, providerConfig, envKeys)
+```
+
+### Executing Transactions
+
+For operations that require multiple database writes, you can use a transaction to ensure atomicity.
+
+```go
+err := store.ExecuteTransaction(ctx, func(tx *gorm.DB) error {
+    // Perform multiple operations within this transaction
+    if err := store.CreateBudget(ctx, budget1, tx); err != nil {
+        return err // Rollback
+    }
+    if err := store.UpdateRateLimit(ctx, limit1, tx); err != nil {
+        return err // Rollback
+    }
+    return nil // Commit
+})
+```
+
+## Migrations
+
+The ConfigStore includes a migration system to handle database schema changes between Bifrost versions. Migrations are automatically applied at startup, ensuring the database schema is always up-to-date. This process is managed by the `migrator` package and is transparent to the user.
+
+The ConfigStore is a powerful and flexible component that provides the backbone for Bifrost's dynamic configuration capabilities. Its support for multiple backends and transactional operations makes it suitable for both small-scale and large-scale production environments.
--- a/docs/architecture/framework/log-store.mdx
+++ b/docs/architecture/framework/log-store.mdx
@@ -0,0 +1,176 @@
+---
+title: "Log Store"
+description: "A robust and queryable system for persisting API request and response logs, with support for multiple database backends."
+icon: "clipboard-list"
+---
+
+The LogStore is a core component of the Bifrost framework responsible for capturing, storing, and retrieving detailed logs of API requests and responses. It provides a persistent, queryable audit trail of all activity passing through the gateway, which is essential for debugging, monitoring, analytics, and compliance.
+
+## Core Features
+
+- **Persistent Logging**: Automatically saves detailed information about each API request, including input, output, status, latency, and cost.
+- **Multiple Backend Support**: Comes with built-in support for SQLite and PostgreSQL, allowing you to choose the best storage solution for your deployment needs.
+- **Rich Querying and Filtering**: A powerful search API allows you to filter and sort logs based on a wide range of criteria such as provider, model, status, latency, cost, and content.
+- **Performance Analytics**: The search functionality also provides aggregated statistics, including total requests, success rate, average latency, total tokens, and total cost for the queried data.
+- **Structured Data Model**: Logs are stored in a structured format, with complex objects like message history and tool calls serialized as JSON for efficient storage and retrieval.
+- **Automatic Data Management**: Includes GORM hooks to automatically handle JSON serialization/deserialization and to build a searchable content summary.
+
+## Architecture
+
+The LogStore is built around the `LogStore` interface, which defines the standard methods for interacting with the log database. The primary implementation, `RDBLogStore`, uses GORM to provide an abstraction over relational databases.
+
+### Supported Backends
+
+- **SQLite**: The default, file-based database, ideal for local development and smaller, single-node deployments.
+- **PostgreSQL**: A production-ready database for scalable and high-availability deployments.
+
+The backend is configured in Bifrost's main configuration file.
+
+### Initialization
+
+The LogStore is initialized at startup based on the provided configuration.
+
+```go
+import (
+    "context"
+
+    "github.com/maximhq/bifrost/framework/logstore"
+    "github.com/maximhq/bifrost/core/schemas"
+)
+
+// Example: Initialize a SQLite-based LogStore
+config := &logstore.Config{
+    Enabled: true,
+    Type:    logstore.LogStoreTypeSQLite,
+    Config: &logstore.SQLiteConfig{
+        File: "/path/to/logs.db",
+    },
+}
+
+var logger schemas.Logger // Assume logger is initialized
+store, err := logstore.NewLogStore(context.Background(), config, logger)
+if err != nil {
+    // Handle error
+}
+```
+
+Here is an example for initializing a PostgreSQL-based `LogStore`:
+```go
+// Example: Initialize a PostgreSQL-based LogStore
+pgConfig := &logstore.Config{
+    Enabled: true,
+    Type:    logstore.LogStoreTypePostgres,
+    Config: &logstore.PostgresConfig{
+        Host:         "localhost",
+        Port:         "5432",
+        User:         "postgres",
+        Password:     "secret",
+        DBName:       "bifrost_logs",
+        SSLMode:      "disable",
+        MaxIdleConns: 5,  // Optional: Maximum idle connections (default: 5)
+        MaxOpenConns: 50, // Optional: Maximum open connections (default: 50)
+    },
+}
+
+store, err = logstore.NewLogStore(context.Background(), pgConfig, logger)
+if err != nil {
+    // Handle error
+}
+```
+
+<Note>
+PostgreSQL databases used by Bifrost stores must be UTF8 encoded. See [PostgreSQL UTF8 Requirement](../../quickstart/gateway/setting-up#postgresql-utf8-requirement).
+</Note>
+
+### Connection Pool Configuration
+
+For PostgreSQL backends, you can configure the database connection pool to optimize performance based on your workload:
+
+- **MaxIdleConns**: Maximum number of idle connections in the pool (default: 5)
+- **MaxOpenConns**: Maximum number of open connections to the database (default: 50)
+
+These parameters help manage database connection resources effectively. Increase them for high-traffic deployments or decrease them for resource-constrained environments.
+
+## Data Model
+
+The core of the LogStore is the `Log` struct, which represents a single log entry in the `logs` table.
+
+```go
+// Log represents a complete log entry for a request/response cycle
+type Log struct {
+    ID                  string    `gorm:"primaryKey;type:varchar(255)"`
+    Timestamp           time.Time `gorm:"index;not null"`
+    Object              string    `gorm:"type:varchar(255);index;not null;column:object_type"`
+    Provider            string    `gorm:"type:varchar(255);index;not null"`
+    Model               string    `gorm:"type:varchar(255);index;not null"`
+    Latency             *float64
+    Cost                *float64  `gorm:"index"`
+    Status              string    `gorm:"type:varchar(50);index;not null"` // "processing", "success", or "error"
+    Stream              bool      `gorm:"default:false"`
+
+    // Denormalized token fields for easier querying
+    PromptTokens     int `gorm:"default:0"`
+    CompletionTokens int `gorm:"default:0"`
+    TotalTokens      int `gorm:"default:0"`
+
+    // JSON serialized fields
+    InputHistory        string `gorm:"type:text"`
+    OutputMessage       string `gorm:"type:text"`
+    TokenUsage          string `gorm:"type:text"`
+    ErrorDetails        string `gorm:"type:text"`
+    // ... and many more for different data types
+}
+```
+Complex data like message arrays and tool calls are serialized into JSON strings for storage and are automatically deserialized back into their struct forms when retrieved.
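+
+The round trip can be sketched with plain `encoding/json`. This is a simplified illustration, not the LogStore's actual code: the `message` and `logRow` types and the `serialize`/`deserialize` method names are hypothetical stand-ins for what the GORM hooks do before saving and after loading a row.
+
+```go
+package main
+
+import (
+    "encoding/json"
+    "fmt"
+)
+
+// message is a simplified stand-in for a chat message.
+type message struct {
+    Role    string `json:"role"`
+    Content string `json:"content"`
+}
+
+// logRow pairs a text column with its parsed, in-memory form.
+type logRow struct {
+    InputHistory       string    // JSON text stored in the database
+    InputHistoryParsed []message // parsed form, not persisted
+}
+
+// serialize fills the text column from the parsed field (the before-save step).
+func (l *logRow) serialize() error {
+    data, err := json.Marshal(l.InputHistoryParsed)
+    if err != nil {
+        return err
+    }
+    l.InputHistory = string(data)
+    return nil
+}
+
+// deserialize restores the parsed field from the text column (the after-load step).
+func (l *logRow) deserialize() error {
+    if l.InputHistory == "" {
+        return nil
+    }
+    return json.Unmarshal([]byte(l.InputHistory), &l.InputHistoryParsed)
+}
+
+func main() {
+    row := &logRow{InputHistoryParsed: []message{{Role: "user", Content: "hello"}}}
+    if err := row.serialize(); err != nil {
+        panic(err)
+    }
+    fmt.Println(row.InputHistory)
+
+    loaded := &logRow{InputHistory: row.InputHistory}
+    if err := loaded.deserialize(); err != nil {
+        panic(err)
+    }
+    fmt.Println(loaded.InputHistoryParsed[0].Content)
+}
+```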
+
+## Usage
+
+### Creating Log Entries
+
+A log entry is created by populating a `Log` struct and passing it to the `Create` method. This is typically handled internally by Bifrost's logging plugins.
+
+```go
+logEntry := &logstore.Log{
+    ID:        "req-xyz123",
+    Timestamp: time.Now(),
+    Provider:  "openai",
+    Model:     "gpt-4",
+    Status:    "success",
+    // ... other fields
+}
+err := store.Create(ctx, logEntry)
+```
+
+### Searching and Filtering Logs
+
+The `SearchLogs` method provides a powerful way to query logs with fine-grained filters and pagination.
+
+```go
+// Define search criteria
+filters := logstore.SearchFilters{
+    Providers: []string{"openai", "anthropic"},
+    Status:    []string{"error"},
+    StartTime: &startTime, // time.Time pointer
+}
+
+pagination := logstore.PaginationOptions{
+    Limit:  50,
+    Offset: 0,
+    SortBy: "timestamp",
+    Order:  "desc",
+}
+
+// Execute the search
+results, err := store.SearchLogs(ctx, filters, pagination)
+if err != nil {
+    // Handle error
+}
+
+// Process the results
+for _, log := range results.Logs {
+    fmt.Printf("Found log: %s\n", log.ID)
+}
+
+// Access aggregated stats (computed over the filtered set, so this counts the matching error logs)
+fmt.Printf("Matching error requests: %d\n", results.Stats.TotalRequests)
+```
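+
+To make the aggregated statistics concrete, the sketch below computes the same kind of summary over an in-memory slice. It is an assumption-laden illustration: the `logSummary` and `stats` types are hypothetical, and the real LogStore most likely computes these values in the database rather than in Go.
+
+```go
+package main
+
+import "fmt"
+
+// logSummary holds the few fields needed for aggregation.
+type logSummary struct {
+    Status      string
+    Latency     *float64 // nil while the request is still processing
+    TotalTokens int
+}
+
+// stats mirrors the kind of summary returned next to search results.
+type stats struct {
+    TotalRequests  int
+    SuccessRate    float64 // percentage of requests with status "success"
+    AverageLatency float64 // mean over logs that have a latency value
+    TotalTokens    int
+}
+
+// aggregate folds a slice of logs into summary statistics.
+func aggregate(logs []logSummary) stats {
+    var s stats
+    var successes, withLatency int
+    var latencySum float64
+    for _, l := range logs {
+        s.TotalRequests++
+        s.TotalTokens += l.TotalTokens
+        if l.Status == "success" {
+            successes++
+        }
+        if l.Latency != nil {
+            latencySum += *l.Latency
+            withLatency++
+        }
+    }
+    if s.TotalRequests > 0 {
+        s.SuccessRate = float64(successes) / float64(s.TotalRequests) * 100
+    }
+    if withLatency > 0 {
+        s.AverageLatency = latencySum / float64(withLatency)
+    }
+    return s
+}
+
+func main() {
+    fast, slow := 100.0, 300.0
+    logs := []logSummary{
+        {Status: "success", Latency: &fast, TotalTokens: 10},
+        {Status: "success", Latency: &slow, TotalTokens: 20},
+        {Status: "error"},
+    }
+    fmt.Printf("%+v\n", aggregate(logs))
+}
+```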
+
+The LogStore is an indispensable tool for observability in Bifrost, providing the detailed audit trail needed to monitor, debug, and analyze AI application performance and behavior effectively.
--- a/docs/architecture/framework/model-catalog.mdx
+++ b/docs/architecture/framework/model-catalog.mdx
@@ -0,0 +1,412 @@
+---
+title: "Model Catalog"
+description: "A centralized system for managing model information, pricing, and capabilities across all supported AI providers."
+icon: "book-open"
+---
+
+The Model Catalog is a foundational component of Bifrost that provides a unified interface for managing AI models, including their pricing, capabilities, and availability. It serves as a centralized repository for all model-related information, enabling dynamic cost calculation, intelligent model routing, and efficient resource management.
+
+<Info>
+**Related Documentation**: The Model Catalog powers Bifrost's intelligent routing system. See [Provider Routing](/providers/provider-routing) for detailed examples of how governance and load balancing use the catalog to make routing decisions, including cross-provider scenarios and weighted routing via proxy providers.
+</Info>
+
+## Core Features
+
+### **1. Automatic Pricing Synchronization**
+The Model Catalog manages pricing data through a two-phase approach:
+
+**Startup Behavior:**
+- **With ConfigStore**: Downloads the pricing sheet from Maxim's datasheet URL, persists it to the config store, and then loads it into memory for fast lookups.
+- **Without ConfigStore**: Downloads the pricing sheet directly into memory on every startup.
+
+**Ongoing Synchronization:**
+- When ConfigStore is available, an automatic sync occurs every 24 hours to keep pricing data current.
+- All pricing data is cached in memory for O(1) lookup performance during cost calculations.
+
+This keeps cost calculations within one sync interval of the latest published pricing while keeping lookups fast.
+
+### **2. Multi-Modal Cost Calculation**
+It supports diverse pricing models across different AI operation types:
+- **Text Operations**: Token-based pricing for chat completions, text completions, responses, and embeddings. Cache-read/cache-write pricing applies to chat/text/responses when providers surface prompt cache token details.
+- **Audio Processing**: Character-based, token-based, and duration-based pricing for speech synthesis and transcription, with audio token detail breakdown. Speech responses populate `usage.input_chars` so speech can be billed by input characters in addition to tokens/duration.
+- **Image Processing**: Per-image (`input_cost_per_image`/`output_cost_per_image`), per-pixel (`input_cost_per_pixel`/`output_cost_per_pixel`), or token-based pricing with text/image token breakdown.
+- **Video Processing**: Token-based or duration-based pricing. Input can use prompt tokens or `input_cost_per_video_per_second`; output can use completion tokens or fall back to `output_cost_per_video_per_second` / `output_cost_per_second`.
+- **Reranking**: Input/output token pricing with search query cost support.
+- **Prompt Caching**: Separate rates for cache-read tokens (`cached_read_tokens`) and cache-creation tokens (`cached_write_tokens`), both surfaced under `prompt_tokens_details` (see [Prompt Cache Cost Calculation](#prompt-cache-cost-calculation)).
+
+### **3. Model Information Management**
+The Model Catalog maintains a pool of available models for each provider, populated from both pricing data and provider list models APIs. This enables:
+- **Model Discovery**: Listing all available models for a given provider
+- **Provider Discovery**: Finding all providers that support a specific model with intelligent cross-provider resolution (OpenRouter, Vertex, Groq, Bedrock)
+- **Model Validation**: Checking if a model is allowed for a provider based on allowed models lists (supports provider-prefixed entries)
+
+### **4. Intelligent Cache Cost Handling**
+It integrates with semantic caching to provide accurate cost calculations:
+- **Cache Hits**: Zero cost for direct cache hits, and embedding cost only for semantic matches.
+- **Cache Misses**: Combined cost of the base model usage plus the embedding generation cost for cache storage.
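+
+The three outcomes above reduce to a small decision, sketched here. The `cacheOutcome` type and `billedCost` function are hypothetical names for this example; in Bifrost the same rules are applied inside `CalculateCost`.
+
+```go
+package main
+
+import "fmt"
+
+// cacheOutcome describes how the semantic cache handled a request.
+type cacheOutcome int
+
+const (
+    cacheMiss   cacheOutcome = iota // request went to the provider
+    directHit                       // exact match served from cache
+    semanticHit                     // similar request matched via embeddings
+)
+
+// billedCost applies the cache-aware rules: direct hits are free, semantic
+// hits pay only for the embedding lookup, and misses pay for both the model
+// call and the embedding generated for cache storage.
+func billedCost(outcome cacheOutcome, baseCost, embeddingCost float64) float64 {
+    switch outcome {
+    case directHit:
+        return 0
+    case semanticHit:
+        return embeddingCost
+    default:
+        return baseCost + embeddingCost
+    }
+}
+
+func main() {
+    fmt.Println(billedCost(directHit, 0.5, 0.25))
+    fmt.Println(billedCost(semanticHit, 0.5, 0.25))
+    fmt.Println(billedCost(cacheMiss, 0.5, 0.25))
+}
+```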
+
+### **5. Tiered Pricing Support**
+The system automatically applies different pricing rates for high-token contexts, reflecting real provider pricing models. Two tiers are supported: above 128k tokens and above 200k tokens, with the higher tier taking precedence when both are configured.
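+
+A minimal sketch of the tier selection follows. Several details are assumptions for illustration: the `tieredRates` type and its field names are hypothetical, the thresholds are compared against a single token count, and the selected rate is treated as applying to the whole request.
+
+```go
+package main
+
+import "fmt"
+
+// tieredRates holds a base input rate plus optional high-context overrides.
+type tieredRates struct {
+    Base      float64
+    Above128k *float64 // nil when the tier is not configured
+    Above200k *float64 // nil when the tier is not configured
+}
+
+// inputRate picks the per-token rate for a request of the given size.
+// The higher tier is checked first, so it wins when both are configured.
+func inputRate(r tieredRates, tokens int) float64 {
+    if tokens > 200000 && r.Above200k != nil {
+        return *r.Above200k
+    }
+    if tokens > 128000 && r.Above128k != nil {
+        return *r.Above128k
+    }
+    return r.Base
+}
+
+func main() {
+    mid, high := 2.0, 4.0
+    rates := tieredRates{Base: 1, Above128k: &mid, Above200k: &high}
+    for _, tokens := range []int{1000, 150000, 250000} {
+        fmt.Println(tokens, inputRate(rates, tokens))
+    }
+}
+```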
+
+## Configuration
+
+The `ModelCatalog` can be configured during initialization by passing a `Config` struct.
+
+```go
+type Config struct {
+	PricingURL          *string        `json:"pricing_url,omitempty"`
+	PricingSyncInterval *time.Duration `json:"pricing_sync_interval,omitempty"`
+}
+```
+
+- **`PricingURL`**: Overrides the default URL (`https://getbifrost.ai/datasheet`) for downloading the pricing sheet.
+- **`PricingSyncInterval`**: Customizes the interval for periodic pricing data synchronization. The default is 24 hours.
+
+This configuration is passed during the initialization of the `ModelCatalog`:
+
+```go
+// PricingURL is a *string, so take the address of a variable.
+pricingURL := "https://my-custom-url.com/pricing.json"
+config := &modelcatalog.Config{
+    PricingURL: &pricingURL,
+}
+modelCatalog, err := modelcatalog.Init(context.Background(), config, configStore, logger)
+```
+
+## Architecture
+
+### ModelCatalog
+The `ModelCatalog` is the central component that handles all model and pricing operations:
+
+```go
+type ModelCatalog struct {
+    configStore configstore.ConfigStore
+    logger      schemas.Logger
+
+    pricingURL          string
+    pricingSyncInterval time.Duration
+
+    // In-memory cache for fast access
+    pricingData map[string]configstoreTables.TableModelPricing
+    mu          sync.RWMutex
+
+    modelPool map[schemas.ModelProvider][]string
+
+    // Background sync worker
+    syncTicker *time.Ticker
+    done       chan struct{}
+    wg         sync.WaitGroup
+    syncCtx    context.Context
+    syncCancel context.CancelFunc
+}
+```
+
+### Pricing Data Structure
+Each model's pricing information includes comprehensive cost metrics, supporting various modalities and tiered pricing:
+
+```go
+// PricingEntry represents a single model's pricing information.
+// The fields below are an excerpt — see framework/modelcatalog/main.go for the full definition.
+type PricingEntry struct {
+    BaseModel string `json:"base_model,omitempty"`
+    Provider  string `json:"provider"`
+    Mode      string `json:"mode"`
+
+    // Costs - Text
+    InputCostPerToken                 float64  `json:"input_cost_per_token"`
+    OutputCostPerToken                float64  `json:"output_cost_per_token"`
+    InputCostPerTokenBatches          *float64 `json:"input_cost_per_token_batches,omitempty"`
+    OutputCostPerTokenBatches         *float64 `json:"output_cost_per_token_batches,omitempty"`
+    InputCostPerTokenPriority         *float64 `json:"input_cost_per_token_priority,omitempty"`
+    OutputCostPerTokenPriority        *float64 `json:"output_cost_per_token_priority,omitempty"`
+    InputCostPerTokenAbove200kTokens  *float64 `json:"input_cost_per_token_above_200k_tokens,omitempty"`
+    OutputCostPerTokenAbove200kTokens *float64 `json:"output_cost_per_token_above_200k_tokens,omitempty"`
+
+    // Costs - Cache
+    CacheCreationInputTokenCost                        *float64 `json:"cache_creation_input_token_cost,omitempty"`
+    CacheReadInputTokenCost                            *float64 `json:"cache_read_input_token_cost,omitempty"`
+    CacheCreationInputTokenCostAbove200kTokens         *float64 `json:"cache_creation_input_token_cost_above_200k_tokens,omitempty"`
+    CacheReadInputTokenCostAbove200kTokens             *float64 `json:"cache_read_input_token_cost_above_200k_tokens,omitempty"`
+    CacheCreationInputTokenCostAbove1hr                *float64 `json:"cache_creation_input_token_cost_above_1hr,omitempty"`
+    CacheCreationInputTokenCostAbove1hrAbove200kTokens *float64 `json:"cache_creation_input_token_cost_above_1hr_above_200k_tokens,omitempty"`
+    CacheCreationInputAudioTokenCost                   *float64 `json:"cache_creation_input_audio_token_cost,omitempty"`
+    CacheReadInputTokenCostPriority                    *float64 `json:"cache_read_input_token_cost_priority,omitempty"`
+
+    // Costs - Image
+    InputCostPerImage                             *float64 `json:"input_cost_per_image,omitempty"`
+    InputCostPerPixel                             *float64 `json:"input_cost_per_pixel,omitempty"`
+    OutputCostPerImage                            *float64 `json:"output_cost_per_image,omitempty"`
+    OutputCostPerPixel                            *float64 `json:"output_cost_per_pixel,omitempty"`
+    OutputCostPerImagePremiumImage                *float64 `json:"output_cost_per_image_premium_image,omitempty"`
+    OutputCostPerImageAbove512x512Pixels          *float64 `json:"output_cost_per_image_above_512_and_512_pixels,omitempty"`
+    OutputCostPerImageAbove512x512PixelsPremium   *float64 `json:"output_cost_per_image_above_512_and_512_pixels_and_premium_image,omitempty"`
+    OutputCostPerImageAbove1024x1024Pixels        *float64 `json:"output_cost_per_image_above_1024_and_1024_pixels,omitempty"`
+    OutputCostPerImageAbove1024x1024PixelsPremium *float64 `json:"output_cost_per_image_above_1024_and_1024_pixels_and_premium_image,omitempty"`
+    OutputCostPerImageAbove2048x2048Pixels        *float64 `json:"output_cost_per_image_above_2048_and_2048_pixels,omitempty"`
+    OutputCostPerImageAbove4096x4096Pixels        *float64 `json:"output_cost_per_image_above_4096_and_4096_pixels,omitempty"`
+    OutputCostPerImageLowQuality                  *float64 `json:"output_cost_per_image_low_quality,omitempty"`
+    OutputCostPerImageMediumQuality               *float64 `json:"output_cost_per_image_medium_quality,omitempty"`
+    OutputCostPerImageHighQuality                 *float64 `json:"output_cost_per_image_high_quality,omitempty"`
+    OutputCostPerImageAutoQuality                 *float64 `json:"output_cost_per_image_auto_quality,omitempty"`
+    // Costs - Audio/Video
+    InputCostPerAudioToken      *float64 `json:"input_cost_per_audio_token,omitempty"`
+    InputCostPerAudioPerSecond  *float64 `json:"input_cost_per_audio_per_second,omitempty"`
+    InputCostPerSecond          *float64 `json:"input_cost_per_second,omitempty"`
+    InputCostPerVideoPerSecond  *float64 `json:"input_cost_per_video_per_second,omitempty"`
+    OutputCostPerAudioToken     *float64 `json:"output_cost_per_audio_token,omitempty"`
+    OutputCostPerVideoPerSecond *float64 `json:"output_cost_per_video_per_second,omitempty"`
+    OutputCostPerSecond         *float64 `json:"output_cost_per_second,omitempty"`
+
+    // Costs - Other
+    SearchContextCostPerQuery     *float64 `json:"search_context_cost_per_query,omitempty"`
+    CodeInterpreterCostPerSession *float64 `json:"code_interpreter_cost_per_session,omitempty"`
+}
+```
+
+## Usage in Plugins
+
+The Model Catalog is designed to be shared across all Bifrost plugins, providing consistent model information and validation logic for governance, load balancing, and other routing mechanisms.
+
+<Note>
+**Governance & Load Balancing**: Both plugins delegate model validation to the Model Catalog's `IsModelAllowedForProvider` method, ensuring consistent handling of cross-provider scenarios and provider-prefixed allowed models. See [Provider Routing](/providers/provider-routing) for configuration examples.
+</Note>
+
+### Initialization
+In Bifrost's gateway, the `ModelCatalog` is initialized once at the start and shared across all plugins:
+
+```go
+import "github.com/maximhq/bifrost/framework/modelcatalog"
+
+// Initialize model catalog with config store and logger
+modelCatalog, err := modelcatalog.Init(context.Background(), &modelcatalog.Config{}, configStore, logger)
+if err != nil {
+    return fmt.Errorf("failed to initialize model catalog: %w", err)
+}
+```
+
+### Basic Cost Calculation
+Calculate costs from a Bifrost response:
+
+```go
+// Calculate cost for a completed request
+cost := modelCatalog.CalculateCost(
+    result, // *schemas.BifrostResponse
+    nil,    // *PricingLookupScopes (nil = no scoped overrides)
+)
+
+logger.Info("Request cost: $%.6f", cost)
+```
+
+### Unified Cost Calculation
+`CalculateCost` is the single entry point for all cost calculations. It handles all request types, semantic cache billing, and tiered pricing automatically:
+
+```go
+// CalculateCost handles all cost scenarios including cache-aware pricing
+cost := modelCatalog.CalculateCost(result, nil) // *schemas.BifrostResponse, *PricingLookupScopes
+
+// Cache hits return 0 for direct hits, embedding cost for semantic matches
+// Cache misses return base model cost + embedding generation cost
+// Returns 0.0 if pricing data is not found (logs a debug message)
+```
+
+### Model Discovery
+The `ModelCatalog` provides several methods to query for model and provider information.
+
+#### Get Models for a Provider
+Retrieve a list of all models supported by a specific provider.
+```go
+openaiModels := modelCatalog.GetModelsForProvider(schemas.OpenAI)
+for _, model := range openaiModels {
+    logger.Info("Found OpenAI model: %s", model)
+}
+```
+
+**Thread-safe**: Uses read lock for concurrent access.
+
+#### Get Providers for a Model
+Find all providers that offer a specific model, including cross-provider resolution.
+
+```go
+gpt4Providers := modelCatalog.GetProvidersForModel("gpt-4o")
+for _, provider := range gpt4Providers {
+    logger.Info("gpt-4o is available from: %s", provider)
+}
+// Result: [openai, azure, groq] (includes cross-provider mappings)
+```
+
+**Cross-Provider Resolution**:
+
+This method implements intelligent cross-provider routing logic to discover all providers that can serve a model:
+
+1. **Direct Match**: Checks each provider's model list in `modelPool` for the exact model name
+2. **OpenRouter Format**: For models found in other providers, checks if `provider/model` exists in OpenRouter
+   - Example: `claude-3-5-sonnet` found in Anthropic → checks OpenRouter for `anthropic/claude-3-5-sonnet`
+3. **Vertex Format**: Similar check for Vertex with `provider/model` format
+4. **Groq OpenAI Compatibility**: For GPT models, checks if `openai/model` exists in Groq's catalog
+5. **Bedrock Claude Models**: For Claude models, flexible matching against Bedrock's full ARN format
+
+**Example**:
+```go
+providers := modelCatalog.GetProvidersForModel("claude-3-5-sonnet")
+// Returns: [anthropic, vertex, bedrock, openrouter]
+// Even though request was just "claude-3-5-sonnet" without provider prefix!
+```
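+
+Steps 1 to 3 of the resolution can be sketched over a plain map. This is a simplified illustration rather than the catalog's implementation: the `providersForModel` function, the string-keyed pool, and the fixed `proxyProviders` list are assumptions, and the Groq and Bedrock rules (steps 4 and 5) are left out.
+
+```go
+package main
+
+import (
+    "fmt"
+    "sort"
+)
+
+// proxyProviders list models as "provider/model". Treating exactly these two
+// as proxies is a simplification for this sketch.
+var proxyProviders = []string{"openrouter", "vertex"}
+
+// contains reports whether want is present in models.
+func contains(models []string, want string) bool {
+    for _, m := range models {
+        if m == want {
+            return true
+        }
+    }
+    return false
+}
+
+// providersForModel returns every provider that can serve the model, sorted by name.
+func providersForModel(pool map[string][]string, model string) []string {
+    found := map[string]bool{}
+    var direct []string
+    // Step 1: exact model name in a provider's own list.
+    for provider, models := range pool {
+        if contains(models, model) {
+            found[provider] = true
+            direct = append(direct, provider)
+        }
+    }
+    // Steps 2 and 3: look for "origin/model" in each proxy provider's list.
+    for _, proxy := range proxyProviders {
+        for _, origin := range direct {
+            if contains(pool[proxy], origin+"/"+model) {
+                found[proxy] = true
+            }
+        }
+    }
+    result := make([]string, 0, len(found))
+    for provider := range found {
+        result = append(result, provider)
+    }
+    sort.Strings(result)
+    return result
+}
+
+func main() {
+    pool := map[string][]string{
+        "anthropic":  {"claude-3-5-sonnet"},
+        "openai":     {"gpt-4o"},
+        "openrouter": {"anthropic/claude-3-5-sonnet", "openai/gpt-4o"},
+    }
+    fmt.Println(providersForModel(pool, "claude-3-5-sonnet"))
+}
+```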
+
+<Note>
+This cross-provider logic powers Bifrost's intelligent routing capabilities. See [Provider Routing](/providers/provider-routing#the-model-catalog) for detailed examples of how this enables features like weighted routing via proxy providers.
+</Note>
+
+#### Check Model Allowance for Provider
+Validate if a model is allowed for a specific provider based on an allowed models list. This method is used internally by governance and load balancing plugins.
+
+```go
+// ["*"] wildcard - uses catalog to determine support
+isAllowed := modelCatalog.IsModelAllowedForProvider(
+    schemas.OpenRouter,
+    "gpt-4o",
+    schemas.WhiteList{"*"}, // wildcard = check catalog
+)
+// Returns: true (catalog knows OpenRouter supports openai/gpt-4o)
+
+// Explicit allowedModels with provider prefix
+isAllowed = modelCatalog.IsModelAllowedForProvider(
+    schemas.OpenRouter,
+    "gpt-4o",
+    schemas.WhiteList{"openai/gpt-4o", "anthropic/claude-3-5-sonnet"},
+)
+// Returns: true (strips "openai/" prefix and matches "gpt-4o")
+
+// Explicit allowedModels without prefix
+isAllowed = modelCatalog.IsModelAllowedForProvider(
+    schemas.OpenAI,
+    "gpt-4o",
+    schemas.WhiteList{"gpt-4o", "gpt-4o-mini"},
+)
+// Returns: true (direct match)
+```
+
+**Behavior**:
+- **`["*"]` wildcard**: Delegates to `GetProvidersForModel` (includes cross-provider logic). This is the "allow all via catalog" mode.
+- **Non-empty explicit list**: Checks for both direct matches and provider-prefixed entries
+  - Direct: `"gpt-4o"` matches `"gpt-4o"`
+  - Prefixed: `"openai/gpt-4o"` matches request for `"gpt-4o"` (prefix stripped)
+- **Empty slice (`[]string{}` / empty `schemas.WhiteList`)**: Returns `false` (deny-all), mirroring the config deny-by-default semantics
+
+<Note>
+In `config.json` and the governance API, `allowed_models: []` (empty array) means **deny all models** (deny-by-default, v1.5.0+). The Go helper `IsModelAllowedForProvider` behaves the same way: an empty `allowedModels` slice also returns `false`. Use `["*"]` to allow all models validated through the catalog.
+</Note>
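+
+The three behaviors can be sketched as a standalone function. This is not the catalog's code: `isAllowed` and the `catalogSupports` callback are hypothetical, the wildcard is recognized only as the single-entry list `["*"]`, and stripping everything up to the first `/` is an assumption about how prefixes are handled.
+
+```go
+package main
+
+import (
+    "fmt"
+    "strings"
+)
+
+// isAllowed reports whether model passes the allow list. catalogSupports
+// stands in for the catalog lookup used in wildcard mode.
+func isAllowed(model string, allowed []string, catalogSupports func(string) bool) bool {
+    // Empty list: deny by default.
+    if len(allowed) == 0 {
+        return false
+    }
+    // Wildcard: defer to the catalog.
+    if len(allowed) == 1 && allowed[0] == "*" {
+        return catalogSupports(model)
+    }
+    for _, entry := range allowed {
+        // Direct match.
+        if entry == model {
+            return true
+        }
+        // Provider-prefixed match: "openai/gpt-4o" allows "gpt-4o".
+        if i := strings.Index(entry, "/"); i >= 0 && entry[i+1:] == model {
+            return true
+        }
+    }
+    return false
+}
+
+func main() {
+    inCatalog := func(model string) bool { return model == "gpt-4o" }
+    fmt.Println(isAllowed("gpt-4o", []string{"*"}, inCatalog))
+    fmt.Println(isAllowed("gpt-4o", []string{"openai/gpt-4o"}, inCatalog))
+    fmt.Println(isAllowed("gpt-4o", nil, inCatalog))
+}
+```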
+
+**Use Cases**:
+- **Governance Routing**: Validate if a model request is allowed for a provider configuration
+- **Load Balancing**: Filter providers based on allowed models before performance scoring
+- **Virtual Key Validation**: Check if a model can be used with a specific virtual key's provider configs
+
+<Tip>
+This method is the central validation point for both governance and load balancing plugins, ensuring consistent model allowance logic across all routing mechanisms. It handles all edge cases including proxy providers (OpenRouter, Vertex) and provider-prefixed model entries.
+</Tip>
+
+#### Dynamically Add Models
+You can dynamically add models to the catalog's pool from a `v1/models` compatible response structure. This is useful for providers that expose a model list endpoint.
+```go
+// response is *schemas.BifrostListModelsResponse
+modelCatalog.AddModelDataToPool(response)
+```
+The Bifrost gateway does this automatically during initialization for every supported provider.
+
+**When to use**:
+- After fetching models from a provider's `/v1/models` endpoint
+- When a new provider is dynamically added at runtime
+- For testing with custom model lists
+
+### Reloading Configuration
+You can reload the pricing configuration at runtime if you need to change the pricing URL or sync interval.
+```go
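+// PricingSyncInterval is a *time.Duration, so take the address of a variable.
+syncInterval := 12 * time.Hour
+newConfig := &modelcatalog.Config{
+    PricingSyncInterval: &syncInterval,
+}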
+err := modelCatalog.UpdateSyncConfig(ctx, newConfig)
+```
+
+## Error Handling and Fallbacks
+
+The Model Catalog handles missing pricing data gracefully with intelligent fallbacks:
+
+```go
+// resolvePricing resolves the pricing entry for a model, trying deployment as fallback.
+func (mc *ModelCatalog) resolvePricing(provider, model, deployment string, requestType schemas.RequestType) *configstoreTables.TableModelPricing {
+	pricing, exists := mc.getPricing(model, provider, requestType)
+	if exists {
+		return pricing
+	}
+	// If pricing not found for model, try the deployment name
+	if deployment != "" {
+		pricing, exists = mc.getPricing(deployment, provider, requestType)
+		if exists {
+			return pricing
+		}
+	}
+	return nil
+}
+
+// getPricing returns pricing information for a model (thread-safe).
+// It implements a multi-step fallback chain:
+//   1. Direct lookup by model + provider + mode
+//   2. Gemini → Vertex provider fallback
+//   3. Vertex "provider/model" prefix stripping
+//   4. Bedrock "anthropic." prefix addition for Claude models
+//   5. Responses → Chat mode fallback (at each step)
+//   6. ImageEdit / ImageVariation → ImageGeneration mode fallback
+func (mc *ModelCatalog) getPricing(model, provider string, requestType schemas.RequestType) (*configstoreTables.TableModelPricing, bool) {
+	mc.mu.RLock()
+	defer mc.mu.RUnlock()
+
+	mode := normalizeRequestType(requestType)
+
+	pricing, ok := mc.pricingData[makeKey(model, provider, mode)]
+	if ok {
+		return &pricing, true
+	}
+
+	// Provider-specific fallbacks (Gemini→Vertex, Vertex prefix strip, Bedrock anthropic. prefix)
+	// Each fallback also tries Responses→Chat mode if applicable
+	// ...
+
+	// Final fallback: Responses → Chat mode for any provider
+	if requestType == schemas.ResponsesRequest || requestType == schemas.ResponsesStreamRequest {
+		pricing, ok = mc.pricingData[makeKey(model, provider, normalizeRequestType(schemas.ChatCompletionRequest))]
+		if ok {
+			return &pricing, true
+		}
+	}
+
+	return nil, false
+}
+
+// When pricing is not found, CalculateCost returns 0.0 and logs a debug message.
+// This ensures operations continue smoothly without billing failures.
+```
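+
+The fallback idea can also be tried in isolation. The following is a minimal, self-contained sketch, not the catalog's actual code: `pricingKey`, `pricing`, `lookup`, `costOrZero`, and the mode strings are hypothetical names chosen for illustration. It shows an exact lookup, a Responses → Chat retry, and a zero cost when nothing matches.
+
+```go
+package main
+
+import "fmt"
+
+// pricingKey and the mode strings are hypothetical stand-ins for the
+// catalog's internal key format; they are not Bifrost identifiers.
+type pricingKey struct {
+    model, provider, mode string
+}
+
+// pricing is a reduced, hypothetical pricing entry.
+type pricing struct {
+    InputCostPerToken float64
+}
+
+// lookup tries the exact key first, then retries a "responses" request
+// in "chat" mode, mirroring the final fallback step described above.
+func lookup(table map[pricingKey]pricing, model, provider, mode string) (pricing, bool) {
+    if p, ok := table[pricingKey{model, provider, mode}]; ok {
+        return p, true
+    }
+    if mode == "responses" {
+        if p, ok := table[pricingKey{model, provider, "chat"}]; ok {
+            return p, true
+        }
+    }
+    return pricing{}, false
+}
+
+// costOrZero returns 0 when no pricing entry exists, so the caller can
+// continue instead of failing the request.
+func costOrZero(table map[pricingKey]pricing, model, provider, mode string, tokens int) float64 {
+    p, ok := lookup(table, model, provider, mode)
+    if !ok {
+        return 0
+    }
+    return float64(tokens) * p.InputCostPerToken
+}
+
+func main() {
+    table := map[pricingKey]pricing{
+        {"example-model", "example-provider", "chat"}: {InputCostPerToken: 0.5},
+    }
+    // Found through the Responses → Chat fallback.
+    fmt.Println(costOrZero(table, "example-model", "example-provider", "responses", 4))
+    // No entry at all: cost is reported as zero.
+    fmt.Println(costOrZero(table, "unknown-model", "example-provider", "chat", 4))
+}
+```
+
+Callers that need to tell "free" apart from "unpriced" would have to use the boolean from `lookup` rather than the zero cost alone.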
+
+
+## Cleanup and Lifecycle Management
+
+Properly clean up resources when shutting down:
+
+```go
+// Cleanup model catalog resources
+defer func() {
+    if err := modelCatalog.Cleanup(); err != nil {
+        logger.Error("Failed to cleanup model catalog: %v", err)
+    }
+}()
+```
+
+## Thread Safety
+
+All `ModelCatalog` operations are thread-safe, making it suitable for concurrent usage across multiple plugins and goroutines. The internal pricing data cache uses read-write mutexes for optimal performance during frequent lookups.
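+
+As a rough illustration of the read-write mutex pattern mentioned above, here is a minimal sketch. `priceCache` and its methods are hypothetical and far simpler than the real catalog; the point is only that lookups share a read lock while updates take an exclusive write lock.
+
+```go
+package main
+
+import (
+    "fmt"
+    "sync"
+)
+
+// priceCache is a hypothetical stand-in for the catalog's internal cache.
+type priceCache struct {
+    mu   sync.RWMutex
+    data map[string]float64
+}
+
+func newPriceCache() *priceCache {
+    return &priceCache{data: make(map[string]float64)}
+}
+
+// Get takes the read lock, so many lookups can proceed in parallel.
+func (c *priceCache) Get(key string) (float64, bool) {
+    c.mu.RLock()
+    defer c.mu.RUnlock()
+    v, ok := c.data[key]
+    return v, ok
+}
+
+// Set takes the write lock, which briefly excludes readers and other writers.
+func (c *priceCache) Set(key string, v float64) {
+    c.mu.Lock()
+    defer c.mu.Unlock()
+    c.data[key] = v
+}
+
+func main() {
+    cache := newPriceCache()
+    cache.Set("example-model", 0.25)
+
+    var wg sync.WaitGroup
+    for i := 0; i < 8; i++ {
+        wg.Add(1)
+        go func() {
+            defer wg.Done()
+            cache.Get("example-model") // concurrent reads share the read lock
+        }()
+    }
+    wg.Wait()
+
+    v, ok := cache.Get("example-model")
+    fmt.Println(v, ok)
+}
+```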
+
+## Best Practices
+
+1. **Shared Instance**: Use a single `ModelCatalog` instance across all plugins to avoid redundant data synchronization.
+2. **Error Handling**: Always handle the case where `CalculateCost` returns 0.0 because pricing data for the model is missing.
+3. **Logging**: Monitor pricing sync failures and missing model warnings in production.
+4. **Cache Awareness**: Use `CalculateCost` which automatically handles cache hits/misses and embedding costs.
+5. **Resource Cleanup**: Always call `Cleanup()` during application shutdown to prevent resource leaks.
+
+The Model Catalog provides a robust, production-ready foundation for implementing billing, budgeting, and cost monitoring features in Bifrost plugins.
--- a/docs/architecture/framework/streaming.mdx
+++ b/docs/architecture/framework/streaming.mdx
@@ -0,0 +1,130 @@
+---
+title: "Streaming"
+description: "Framework utility for aggregating and processing real-time stream chunks from AI providers"
+icon: "water"
+---
+
+## Overview
+
+The **Streaming** package (`framework/streaming`) is a core utility within Bifrost designed to handle real-time data streams from AI providers. It provides a robust and efficient mechanism for plugins like [Logging](/features/observability/default), [OTel](/features/observability/otel), and [Maxim](/features/observability/maxim) to process, aggregate, and format streaming responses for chat completions, transcriptions, and other real-time AI interactions.
+
+```mermaid
+sequenceDiagram
+    participant Plugin
+    participant BC as Bifrost Core
+    participant Accumulator
+
+    BC->>Plugin: PreLLMHook(StreamingRequest)
+    activate Plugin
+    Plugin->>Accumulator: CreateStreamAccumulator(requestID)
+    activate Accumulator
+    Accumulator-->>Plugin: ack
+    deactivate Accumulator
+    Plugin-->>BC: return
+    deactivate Plugin
+
+    loop For each response chunk
+        BC->>Plugin: PostLLMHook(StreamChunk)
+        activate Plugin
+        Plugin->>Accumulator: ProcessStreamingResponse(StreamChunk)
+        activate Accumulator
+        alt Is NOT Final Chunk
+            Accumulator-->>Plugin: return {Type: Delta}
+        else Is Final Chunk
+            Accumulator->>Accumulator: buildCompleteResponse()
+            Accumulator-->>Plugin: return {Type: Final, CompleteData}
+        end
+        deactivate Accumulator
+        Plugin-->>BC: return
+        deactivate Plugin
+    end
+
+```
+
+Its primary purpose is to hide the complexity of handling chunked data, so that plugins can work with complete, well-structured responses without implementing their own aggregation logic.
+
+
+## How It Works
+
+The streaming package uses an `Accumulator` to manage the lifecycle of a streaming operation. This process is designed to be highly efficient, using `sync.Pool` to reuse objects and minimize memory allocations.
+
+1.  **Initialization**: When a plugin that needs to process streams (like `logging` or `otel`) is initialized, it creates a new `streaming.Accumulator`.
+
+2.  **Stream Start**: In the `PreLLMHook` phase of a request, if the request is identified as a streaming type, the plugin calls `accumulator.CreateStreamAccumulator(requestID, timestamp)` to prepare a dedicated buffer for the incoming chunks of that request.
+
+3.  **Chunk Processing**: In the `PostLLMHook` phase, as each chunk of the streaming response arrives, the plugin passes it to `accumulator.ProcessStreamingResponse()`.
+    *   For each `delta` chunk, the accumulator appends it to the buffer associated with the request ID.
+    *   The accumulator handles different types of streams, including chat, audio, and transcriptions, using specialized logic to correctly piece together the data. For example, it accumulates text deltas, tool call argument deltas, and other parts of the message.
+
+4.  **Finalization**: When the final chunk of the stream is received (indicated by a `finish_reason` or other provider-specific signal), `ProcessStreamingResponse` performs the final assembly.
+    *   It reconstructs the complete `ChatMessage` or other response object from all the stored chunks.
+    *   It calculates total token usage, cost, and latency.
+    *   It returns a `ProcessedStreamResponse` object with `StreamResponseTypeFinal` and the complete, structured `AccumulatedData`.
+
+5.  **Cleanup**: Once the final response is processed, the accumulator cleans up all buffered chunks for that request ID, returning them to the `sync.Pool` for reuse.
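+
+The lifecycle above can be sketched in a few lines. This is a simplified model under stated assumptions, not the package's code: `chunk`, `accumulator`, `create`, and `process` are hypothetical names, the chunks carry only text, and chunks for a single request are assumed to arrive one at a time.
+
+```go
+package main
+
+import (
+    "fmt"
+    "strings"
+    "sync"
+)
+
+// chunk and accumulator are simplified, hypothetical types; the real
+// package's structs and method signatures differ.
+type chunk struct {
+    delta string
+}
+
+type accumulator struct {
+    streams sync.Map  // requestID -> *[]*chunk
+    pool    sync.Pool // recycled *chunk values
+}
+
+func newAccumulator() *accumulator {
+    a := &accumulator{}
+    a.pool.New = func() any { return &chunk{} }
+    return a
+}
+
+// create prepares an empty buffer for a request (the "Stream Start" step).
+func (a *accumulator) create(requestID string) {
+    a.streams.Store(requestID, &[]*chunk{})
+}
+
+// process buffers a delta. On the final chunk it assembles the full text,
+// returns every chunk to the pool, and drops the request's buffer.
+// It assumes chunks of one request are processed sequentially.
+func (a *accumulator) process(requestID, delta string, final bool) (string, bool) {
+    v, ok := a.streams.Load(requestID)
+    if !ok {
+        return "", false
+    }
+    buf := v.(*[]*chunk)
+    c := a.pool.Get().(*chunk)
+    c.delta = delta
+    *buf = append(*buf, c)
+    if !final {
+        return "", false
+    }
+    var sb strings.Builder
+    for _, c := range *buf {
+        sb.WriteString(c.delta)
+        c.delta = "" // reset before reuse
+        a.pool.Put(c)
+    }
+    a.streams.Delete(requestID)
+    return sb.String(), true
+}
+
+func main() {
+    acc := newAccumulator()
+    acc.create("req-1")
+    acc.process("req-1", "Hello, ", false)
+    full, done := acc.process("req-1", "world", true)
+    fmt.Println(full, done)
+}
+```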
+
+## Key Components
+
+### `Accumulator`
+
+The central component of the package. It is a thread-safe manager that:
+-   Tracks stream chunks for multiple concurrent requests using a `sync.Map`.
+-   Uses `sync.Pool` to recycle `*StreamChunk` objects, reducing garbage collection overhead.
+-   Provides methods to add chunks (`addChatStreamChunk`, `addAudioStreamChunk`, etc.).
+-   Includes a periodic cleanup worker to remove stale accumulators for incomplete or orphaned requests.
+
+### `ProcessStreamingResponse`
+
+This is the main entry point for plugins to process stream data. It inspects the response type and delegates to the appropriate handler:
+-   `processChatStreamingResponse`
+-   `processAudioStreamingResponse`
+-   `processTranscriptionStreamingResponse`
+-   `processResponsesStreamingResponse`
+
+It returns a `ProcessedStreamResponse`, which indicates whether the chunk is a `delta` or the `final` aggregated response.
+
+### Stream-Specific Builders
+
+The package includes internal logic to correctly build complete messages from chunks. For example, `buildCompleteMessageFromChatStreamChunks` iterates through the collected `ChatStreamChunk` objects, appending content deltas and assembling tool calls into a final, coherent `schemas.ChatMessage`.
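+
+To make the assembly step concrete, here is a minimal sketch of the general technique. The `streamDelta` type and `assemble` function are hypothetical and much flatter than Bifrost's real chunk types; they only show content deltas being concatenated and tool call argument fragments being grouped by index.
+
+```go
+package main
+
+import (
+    "fmt"
+    "sort"
+    "strings"
+)
+
+// streamDelta is a hypothetical, flattened view of one chat stream chunk.
+type streamDelta struct {
+    Content       string // text delta, may be empty
+    ToolCallIndex int    // tool call the fragment belongs to; -1 for none
+    ToolArgs      string // fragment of the tool call's JSON arguments
+}
+
+// assemble concatenates content deltas and groups argument fragments by
+// tool call index, returning the full text and per-call argument strings
+// ordered by index.
+func assemble(deltas []streamDelta) (string, []string) {
+    var content strings.Builder
+    args := map[int]*strings.Builder{}
+    for _, d := range deltas {
+        content.WriteString(d.Content)
+        if d.ToolCallIndex >= 0 {
+            b, ok := args[d.ToolCallIndex]
+            if !ok {
+                b = &strings.Builder{}
+                args[d.ToolCallIndex] = b
+            }
+            b.WriteString(d.ToolArgs)
+        }
+    }
+    idx := make([]int, 0, len(args))
+    for i := range args {
+        idx = append(idx, i)
+    }
+    sort.Ints(idx)
+    calls := make([]string, 0, len(idx))
+    for _, i := range idx {
+        calls = append(calls, args[i].String())
+    }
+    return content.String(), calls
+}
+
+func main() {
+    text, calls := assemble([]streamDelta{
+        {Content: "Checking ", ToolCallIndex: -1},
+        {Content: "the weather.", ToolCallIndex: -1},
+        {ToolCallIndex: 0, ToolArgs: `{"city":`},
+        {ToolCallIndex: 0, ToolArgs: `"Oslo"}`},
+    })
+    fmt.Println(text)
+    fmt.Println(calls)
+}
+```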
+
+## Usage Example
+
+The following snippet from the `logging` plugin shows how the `streaming` package is used in practice within a plugin's `PostLLMHook`.
+
+```go
+// In plugins/logging/main.go
+
+func (p *LoggerPlugin) PostLLMHook(ctx *schemas.BifrostContext, result *schemas.BifrostResponse, bifrostErr *schemas.BifrostError) (*schemas.BifrostResponse, *schemas.BifrostError, error) {
+    // ... setup, get requestID ...
+
+    go func() {
+        // ...
+        if bifrost.IsStreamRequestType(requestType) {
+            p.logger.Debug("[logging] processing streaming response")
+
+            // 1. Pass the response chunk to the accumulator
+            streamResponse, err := p.accumulator.ProcessStreamingResponse(ctx, result, bifrostErr)
+            if err != nil {
+                p.logger.Error("failed to process streaming response: %v", err)
+            // 2. Check if this is the final, aggregated response
+            } else if streamResponse != nil && streamResponse.Type == streaming.StreamResponseTypeFinal {
+                // Prepare final log data
+                logMsg.Operation = LogOperationStreamUpdate
+                logMsg.StreamResponse = streamResponse
+                
+                // 3. Update the log entry with the complete data
+                processingErr := retryOnNotFound(p.ctx, func() error {
+                    return p.updateStreamingLogEntry(p.ctx, logMsg.RequestID, logMsg.SemanticCacheDebug, logMsg.StreamResponse, true)
+                })
+                
+                // ... handle errors and callbacks ...
+            }
+        }
+        // ... handle non-streaming responses ...
+    }()
+
+    return result, bifrostErr, nil
+}
+```
+
+This demonstrates how a plugin can remain agnostic to the details of stream aggregation and simply react to the final, complete data returned by the `streaming` package. This greatly simplifies plugin development and ensures consistent data handling across the framework.
--- a/docs/architecture/framework/vector-store.mdx
+++ b/docs/architecture/framework/vector-store.mdx
@@ -0,0 +1,185 @@
+---
+title: "Vector Store"
+description: "Vector database implementations for semantic search, embeddings storage, and AI-powered features in Bifrost."
+icon: "diagram-project"
+---
+
+## Overview
+
+The VectorStore is a core component of Bifrost's framework package that provides a unified interface for vector database operations. It enables plugins to store embeddings, perform similarity searches, and build AI-powered features like semantic caching, content recommendations, and knowledge retrieval.
+
+**Key Capabilities:**
+- **Vector Similarity Search**: Find semantically similar content using embeddings
+- **Namespace Management**: Organize data into separate collections with custom schemas
+- **Flexible Filtering**: Query data with complex filters and pagination
+- **Multiple Backends**: Support for Weaviate, Redis/Valkey-compatible, Qdrant, and Pinecone vector stores
+- **High Performance**: Optimized for production workloads
+- **Scalable Storage**: Handle millions of vectors with efficient indexing
+
+## VectorStore Interface Usage
+
+### Creating Namespaces
+Create collections (namespaces) with custom schemas:
+
+```go
+// Define properties for your data
+properties := map[string]vectorstore.VectorStoreProperties{
+    "content": {
+        DataType:    vectorstore.VectorStorePropertyTypeString,
+        Description: "The main content text",
+    },
+    "category": {
+        DataType:    vectorstore.VectorStorePropertyTypeString,
+        Description: "Content category",
+    },
+    "tags": {
+        DataType:    vectorstore.VectorStorePropertyTypeStringArray,
+        Description: "Content tags",
+    },
+}
+
+// Create namespace
+err := store.CreateNamespace(ctx, "my_content", 1536, properties)
+if err != nil {
+    log.Fatal("Failed to create namespace:", err)
+}
+```
+
+### Storing Data with Embeddings
+Add data with vector embeddings for similarity search:
+
+```go
+// Your embedding data (typically from an embedding model)
+embedding := []float32{0.1, 0.2, 0.3} // shortened for illustration; a real vector should match the namespace dimension (1536 above)
+
+// Metadata associated with this vector
+metadata := map[string]interface{}{
+    "content":  "This is my content text",
+    "category": "documentation",
+    "tags":     []string{"guide", "tutorial"},
+}
+
+// Store in vector database
+err := store.Add(ctx, "my_content", "unique-id-123", embedding, metadata)
+if err != nil {
+    log.Fatal("Failed to add data:", err)
+}
+```
+
+### Similarity Search
+Find similar content using vector similarity:
+
+```go
+// Query embedding (from user query)
+queryEmbedding := []float32{0.15, 0.25, 0.35 /* ... */}
+
+// Optional filters
+filters := []vectorstore.Query{
+    {
+        Field:    "category",
+        Operator: vectorstore.QueryOperatorEqual,
+        Value:    "documentation",
+    },
+}
+
+// Perform similarity search
+results, err := store.GetNearest(
+    ctx,
+    "my_content",        // namespace
+    queryEmbedding,      // query vector
+    filters,             // optional filters
+    []string{"content", "category"}, // fields to return
+    0.7,                 // similarity threshold (0-1)
+    10,                  // limit
+)
+if err != nil {
+    log.Fatal("Failed to search:", err)
+}
+
+for _, result := range results {
+    fmt.Printf("Score: %.3f, Content: %s\n", *result.Score, result.Properties["content"])
+}
+```
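+
+To give a feel for what a similarity threshold between 0 and 1 means, the sketch below computes cosine similarity by hand. This is an assumption for illustration only: the metric actually used depends on the backend and its configuration, and `cosine` and `passes` are hypothetical helpers, not part of the `vectorstore` package.
+
+```go
+package main
+
+import (
+    "fmt"
+    "math"
+)
+
+// cosine returns the cosine similarity of two equal-length vectors,
+// or 0 if the lengths differ or either vector has zero magnitude.
+func cosine(a, b []float32) float64 {
+    if len(a) != len(b) {
+        return 0
+    }
+    var dot, na, nb float64
+    for i := range a {
+        dot += float64(a[i]) * float64(b[i])
+        na += float64(a[i]) * float64(a[i])
+        nb += float64(b[i]) * float64(b[i])
+    }
+    if na == 0 || nb == 0 {
+        return 0
+    }
+    return dot / (math.Sqrt(na) * math.Sqrt(nb))
+}
+
+// passes reports whether a candidate clears the similarity threshold.
+func passes(query, candidate []float32, threshold float64) bool {
+    return cosine(query, candidate) >= threshold
+}
+
+func main() {
+    query := []float32{0.15, 0.25, 0.35}
+    stored := []float32{0.1, 0.2, 0.3}
+    fmt.Printf("similarity=%.3f passes=%v\n", cosine(query, stored), passes(query, stored, 0.7))
+}
+```
+
+Raising the threshold returns fewer but closer matches; lowering it trades precision for recall.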
+
+### Data Retrieval and Management
+Query and manage stored data:
+
+```go
+// Get specific item by ID
+item, err := store.GetChunk(ctx, "my_content", "unique-id-123")
+if err != nil {
+    log.Fatal("Failed to get item:", err)
+}
+
+// Get all items with filtering and pagination
+allResults, cursor, err := store.GetAll(
+    ctx,
+    "my_content",
+    []vectorstore.Query{
+        {Field: "category", Operator: vectorstore.QueryOperatorEqual, Value: "documentation"},
+    },
+    []string{"content", "tags"}, // select fields
+    nil,  // cursor for pagination
+    50,   // limit
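+)
+if err != nil {
+    log.Fatal("Failed to list items:", err)
+}
+
+// Delete items
+err = store.Delete(ctx, "my_content", "unique-id-123")
+if err != nil {
+    log.Fatal("Failed to delete item:", err)
+}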
+```
+
+## Supported Vector Stores
+
+<CardGroup cols={2}>
+  <Card title="Weaviate" icon="database" href="/integrations/vector-databases/weaviate">
+    Production-ready vector database with gRPC support.
+  </Card>
+  <Card title="Redis / Valkey" icon="database" href="/integrations/vector-databases/redis">
+    High-performance in-memory vector store.
+  </Card>
+  <Card title="Qdrant" icon="database" href="/integrations/vector-databases/qdrant">
+    Rust-based vector search engine with advanced filtering.
+  </Card>
+  <Card title="Pinecone" icon="database" href="/integrations/vector-databases/pinecone">
+    Managed vector database with serverless options.
+  </Card>
+</CardGroup>
+
+---
+
+## Use Cases
+
+### [Semantic Caching](../../features/semantic-caching)
+Build intelligent caching systems that understand query intent rather than just exact matches.
+
+**Applications:**
+- Customer support systems with FAQ matching
+- Code completion and documentation search  
+- Content management with semantic deduplication
+
+### Knowledge Base & Search
+Create intelligent search systems that understand user queries contextually.
+
+**Applications:**
+- Document search and retrieval systems
+- Product recommendation engines
+- Research paper and knowledge discovery platforms
+
+### Content Classification
+Automatically categorize and tag content based on semantic similarity.
+
+**Applications:**
+- Email classification and routing
+- Content moderation and filtering
+- News article categorization and clustering
+
+### Recommendation Systems
+Build personalized recommendation engines using vector similarity.
+
+**Applications:**
+- Product recommendations based on user preferences
+- Content suggestions for media platforms
+- Similar document or article recommendations
+
+## Related Documentation
+
+| Topic | Documentation | Description |
+|-------|---------------|-------------|
+| **Framework Overview** | [What is Framework](./what-is-framework) | Understanding the framework package and VectorStore interface |
+| **Semantic Caching** | [Semantic Caching](../../features/semantic-caching) | Using VectorStore for AI response caching |
--- a/docs/architecture/framework/what-is-framework.mdx
+++ b/docs/architecture/framework/what-is-framework.mdx
@@ -0,0 +1,49 @@
+---
+title: "What is framework?"
+description: "Framework is Bifrost's shared storage and utilities SDK package that provides common database interfaces and logic for the plugin ecosystem."
+icon: "play"
+---
+
+Framework serves as the foundation layer that enables plugins to implement consistent data management patterns without reinventing storage solutions.
+
+## Installation
+
+```bash
+go get github.com/maximhq/bifrost/framework
+```
+
+## Purpose
+
+The framework package was designed to solve a fundamental challenge in plugin development: providing standardized, reliable storage and utility interfaces that plugins can depend on. Instead of each plugin implementing its own database logic, configuration management, or logging systems, framework offers battle-tested, shared implementations.
+
+## Core Components
+
+### ConfigStore
+A unified configuration persistence layer that provides consistent storage patterns for plugin settings, provider configurations, and system state. Plugins can leverage `ConfigStore` to manage their configuration data with built-in CRUD operations, transaction support, and schema management.
+
+### LogStore
+Standardized logging and audit trail capabilities that enable plugins to implement observability features. `LogStore` provides structured logging, search and filtering capabilities, pagination support, and automated data retention policies.
+
+### VectorStore
+Vector database operations designed for AI-powered plugins that need semantic capabilities. `VectorStore` handles embeddings management, similarity search operations, and namespace isolation, making it easy for plugins to add features like semantic caching, content search, and AI-powered recommendations.
+
+### Pricing Module
+Cost calculation and model pricing management tools that help plugins implement billing and usage tracking features. The pricing system supports multi-tier pricing models, real-time usage tracking, and dynamic pricing updates.
+
+## Benefits for Plugin Developers
+
+**Shared Logic**: Common patterns for configuration, logging, and data management are provided out-of-the-box, reducing development time and ensuring consistency across plugins.
+
+**Standardized Interfaces**: All framework components use consistent APIs, making it easier for developers to work across different plugins and maintain code quality.
+
+**Pluggable Architecture**: The interface-based design allows different storage backends to be used without changing plugin code, providing flexibility for different deployment scenarios.
+
+**Transaction Support**: Built-in transaction management and error handling ensure data integrity and provide reliable rollback capabilities.
+
+**Production Ready**: Framework components are battle-tested in production environments and include features like connection pooling, retry logic, and performance optimizations.
+
+## Integration with Bifrost
+
+Framework seamlessly integrates with the Bifrost ecosystem, providing the storage foundation that powers core features like provider management, request logging, semantic caching, and governance. When plugins use framework components, they automatically participate in Bifrost's unified data management strategy.
+
+The framework package enables plugin developers to focus on their core business logic while relying on robust, shared infrastructure for all storage and utility needs.
--- a/docs/architecture/plugins/governance.mdx
+++ b/docs/architecture/plugins/governance.mdx
--- a/docs/architecture/plugins/jsonparser.mdx
+++ b/docs/architecture/plugins/jsonparser.mdx
--- a/docs/architecture/plugins/logging.mdx
+++ b/docs/architecture/plugins/logging.mdx
--- a/docs/architecture/plugins/maxim.mdx
+++ b/docs/architecture/plugins/maxim.mdx
--- a/docs/architecture/plugins/mocker.mdx
+++ b/docs/architecture/plugins/mocker.mdx
--- a/docs/architecture/plugins/semantic-cache.mdx
+++ b/docs/architecture/plugins/semantic-cache.mdx
--- a/docs/architecture/plugins/telemetry.mdx
+++ b/docs/architecture/plugins/telemetry.mdx
--- a/docs/architecture/transports/in-memory-store.mdx
+++ b/docs/architecture/transports/in-memory-store.mdx
--- a/docs/baseUrlSwitcher.js
+++ b/docs/baseUrlSwitcher.js
@@ -0,0 +1,181 @@
+/**
+ * Bifrost docs — Base URL persistence
+ *
+ * The OpenAPI spec exposes the gateway base URL as a server variable
+ * (`{baseUrl}`), so Mintlify's API Reference playground renders an
+ * editable input for it. This script:
+ *
+ *   1. Preloads that input from localStorage on every page load /
+ *      SPA route change, so the user only has to type their URL once.
+ *   2. Persists any edit the user makes back to localStorage.
+ *   3. Rewrites every `<code>` block in the MDX docs that mentions the
+ *      default `http://localhost:8080`, so curl/SDK examples on the
+ *      regular doc pages also use the configured URL.
+ *
+ * Mintlify auto-injects any `.js` file in the docs root on every page,
+ * so no docs.json wiring is required.
+ */
+(function () {
+  if (typeof window === "undefined" || typeof document === "undefined") return;
+  if (window.__bifrostBaseUrlSwitcherLoaded) return;
+  window.__bifrostBaseUrlSwitcherLoaded = true;
+
+  var DEFAULT_URL = "http://localhost:8080";
+  var STORAGE_KEY = "bifrost_base_url";
+  // Per-element snapshot of original text-node values, keyed via a
+  // WeakMap so detached DOM nodes get GC'd cleanly.
+  var snapshots = new WeakMap();
+
+  function readStoredUrl() {
+    try {
+      var v = window.localStorage.getItem(STORAGE_KEY);
+      return v && v.trim() ? v.trim() : DEFAULT_URL;
+    } catch (e) {
+      return DEFAULT_URL;
+    }
+  }
+
+  function writeStoredUrl(url) {
+    try {
+      window.localStorage.setItem(STORAGE_KEY, url);
+    } catch (e) {
+      /* ignore quota / private mode */
+    }
+  }
+
+  function normalizeUrl(input) {
+    if (!input) return DEFAULT_URL;
+    var url = String(input).trim();
+    if (!url) return DEFAULT_URL;
+    if (!/^https?:\/\//i.test(url)) url = "http://" + url;
+    return url.replace(/\/+$/, "");
+  }
+
+  /**
+   * Snapshot every text node inside `el` and remember the original
+   * value, so subsequent URL changes can always rewrite from the
+   * canonical source. Returns the snapshot, or null if the block has
+   * no localhost reference (so we never visit it again).
+   */
+  function snapshotTextNodes(el) {
+    var walker = document.createTreeWalker(el, NodeFilter.SHOW_TEXT, null);
+    var entries = [];
+    var hasMatch = false;
+    var node;
+    while ((node = walker.nextNode())) {
+      var text = node.nodeValue || "";
+      if (text.indexOf("localhost:8080") !== -1) hasMatch = true;
+      entries.push({ node: node, original: text });
+    }
+    return hasMatch ? entries : null;
+  }
+
+  /**
+   * Rewrite every `<code>` block that mentions the default localhost
+   * URL. We only touch text nodes (`nodeValue`) — never `innerHTML` —
+   * so there is no path where a string is reinterpreted as HTML.
+   */
+  function rewriteCodeBlocks(currentUrl) {
+    var blocks = document.querySelectorAll("pre code, code");
+    var bareUrl = currentUrl.replace(/^https?:\/\//, "");
+    for (var i = 0; i < blocks.length; i++) {
+      var el = blocks[i];
+      var entries = snapshots.get(el);
+      if (entries === undefined) {
+        entries = snapshotTextNodes(el);
+        // Cache the result either way (null = "scanned, no match")
+        // so we don't re-walk this element on every observer tick.
+        snapshots.set(el, entries);
+      }
+      if (!entries) continue;
+      for (var j = 0; j < entries.length; j++) {
+        var entry = entries[j];
+        var next;
+        if (currentUrl === DEFAULT_URL) {
+          next = entry.original;
+        } else {
+          next = entry.original
+            .replace(/https?:\/\/localhost:8080/g, currentUrl)
+            .replace(/localhost:8080/g, bareUrl);
+        }
+        if (entry.node.nodeValue !== next) entry.node.nodeValue = next;
+      }
+    }
+  }
+
+  // ---------- API Reference playground sync ----------
+
+  /**
+   * Mintlify renders the server-variable field with a stable id of
+   * `api-playground-input`, so we can scope directly to it instead of
+   * heuristically scanning every text input on the page.
+   */
+  function findPlaygroundUrlInputs() {
+    var el = document.getElementById("api-playground-input");
+    if (!el || el.__bifrostPlaygroundBound) return [];
+    return [el];
+  }
+
+  function setNativeValue(el, value) {
+    // React overrides the input's value setter; bypass it so React's
+    // controlled state picks up the programmatic change.
+    var proto = Object.getPrototypeOf(el);
+    var descriptor = Object.getOwnPropertyDescriptor(proto, "value");
+    if (descriptor && descriptor.set) descriptor.set.call(el, value);
+    else el.value = value;
+    el.dispatchEvent(new Event("input", { bubbles: true }));
+    el.dispatchEvent(new Event("change", { bubbles: true }));
+  }
+
+  function syncPlaygroundInputs(state) {
+    var inputs = findPlaygroundUrlInputs();
+    for (var i = 0; i < inputs.length; i++) {
+      var el = inputs[i];
+      if (el.__bifrostPlaygroundBound) continue;
+      el.__bifrostPlaygroundBound = true;
+
+      // Persist on blur / change. Using `change` (not `input`) avoids
+      // fighting the user mid-keystroke.
+      el.addEventListener("change", function (e) {
+        var v = normalizeUrl(e.target.value);
+        state.currentUrl = v;
+        writeStoredUrl(v);
+        rewriteCodeBlocks(v);
+      });
+
+      // Preload from storage exactly once. After this, the input is
+      // user-owned — we never write to it again, otherwise typing would
+      // get clobbered by the next MutationObserver tick.
+      if (state.currentUrl !== DEFAULT_URL && el.value !== state.currentUrl) {
+        setNativeValue(el, state.currentUrl);
+      }
+    }
+  }
+
+  // ---------- Boot ----------
+
+  function boot() {
+    var state = { currentUrl: normalizeUrl(readStoredUrl()) };
+    rewriteCodeBlocks(state.currentUrl);
+    syncPlaygroundInputs(state);
+
+    // Mintlify is an SPA — re-run on any DOM mutation (debounced).
+    var pending = false;
+    var observer = new MutationObserver(function () {
+      if (pending) return;
+      pending = true;
+      window.requestAnimationFrame(function () {
+        pending = false;
+        rewriteCodeBlocks(state.currentUrl);
+        syncPlaygroundInputs(state);
+      });
+    });
+    observer.observe(document.body, { childList: true, subtree: true });
+  }
+
+  if (document.readyState === "loading") {
+    document.addEventListener("DOMContentLoaded", boot);
+  } else {
+    boot();
+  }
+})();
--- a/docs/benchmarking/getting-started.mdx
+++ b/docs/benchmarking/getting-started.mdx
@@ -0,0 +1,81 @@
+---
+title: "Getting Started"
+description: "Introduction to Bifrost's performance capabilities and how to choose the right instance size for your workload."
+icon: "rocket"
+---
+
+## Overview
+
+Bifrost has been rigorously tested under high load conditions to ensure optimal performance for production deployments. Our benchmark tests demonstrate exceptional performance characteristics at **5,000 requests per second (RPS)** across different AWS EC2 instance types.
+
+**Key Performance Highlights:**
+- **Perfect Success Rate**: 100% request success rate under high load
+- **Minimal Overhead**: About 11 µs of added latency per request on t3.xlarge (59 µs on t3.medium)
+- **Efficient Queue Management**: Queue wait times as low as 1.67 µs on t3.xlarge
+- **Fast Key Selection**: Near-instantaneous weighted API key selection (~10 ns)
+
+---
+
+## Test Environment Summary
+
+Bifrost was benchmarked on two primary AWS EC2 instance configurations:
+
+### **t3.medium (2 vCPUs, 4GB RAM)**
+- **Buffer Size**: 15,000
+- **Initial Pool Size**: 10,000
+- **Use Case**: Cost-effective option for moderate workloads
+
+### **t3.xlarge (4 vCPUs, 16GB RAM)**  
+- **Buffer Size**: 20,000
+- **Initial Pool Size**: 15,000
+- **Use Case**: High-performance option for demanding workloads
+
+---
+
+## Performance Comparison at a Glance
+
+| Metric | t3.medium | t3.xlarge | Improvement |
+|--------|-----------|-----------|-------------|
+| **Success Rate @ 5k RPS** | 100% | 100% | No failed requests |
+| **Bifrost Overhead** | 59 µs | 11 µs | **-81%** |
+| **Average Latency** | 2.12s | 1.61s | **-24%** |
+| **Queue Wait Time** | 47.13 µs | 1.67 µs | **-96%** |
+| **JSON Marshaling** | 63.47 µs | 26.80 µs | **-58%** |
+| **Response Parsing** | 11.30 ms | 2.11 ms | **-81%** |
+| **Peak Memory Usage** | 1,312.79 MB | 3,340.44 MB | +154% |
+
+> **Note**: t3.xlarge tests used significantly larger response payloads (~10 KB vs ~1 KB), yet still achieved better performance metrics.
+
+<Note>
+All benchmarks are on mocked OpenAI calls, whose latency and payload size are mentioned in the respective analysis pages.
+</Note>
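+
+The Improvement column is a plain relative change from the t3.medium value to the t3.xlarge value. The short sketch below recomputes a few rows from the numbers in the table; `percentChange` is a helper written for this example, not part of Bifrost.
+
+```go
+package main
+
+import "fmt"
+
+// percentChange returns the relative change from base to other, in percent.
+func percentChange(base, other float64) float64 {
+    return (other - base) / base * 100
+}
+
+func main() {
+    // Values copied from the comparison table (t3.medium, t3.xlarge).
+    fmt.Printf("overhead:    %+.0f%%\n", percentChange(59, 11))
+    fmt.Printf("queue wait:  %+.0f%%\n", percentChange(47.13, 1.67))
+    fmt.Printf("peak memory: %+.0f%%\n", percentChange(1312.79, 3340.44))
+}
+```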
+
+---
+
+## Configuration Flexibility
+
+One of Bifrost's key strengths is its **configuration flexibility**. You can fine-tune the speed ↔ memory trade-off based on your specific requirements:
+
+| Configuration Parameter | Effect |
+|------------------------|--------|
+| `initial_pool_size` | Higher values = faster performance, more memory usage |
+| `buffer_size` & `concurrency` | Control queue depth and max parallel workers (per provider) |
+| `retry` & `timeout` | Tune aggressiveness for each provider to meet your SLOs |
+
+**Configuration Philosophy:**
+- **Higher settings** (like t3.xlarge profile) prioritize raw speed
+- **Lower settings** (like t3.medium profile) optimize for memory efficiency  
+- **Custom tuning** lets you find the sweet spot for your specific workload
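+
+As a mental model for how queue depth and worker count interact, consider the sketch below. It is a hypothetical illustration, not Bifrost's implementation: `runPool` and its parameters are invented for this example, with `bufferSize` bounding how many jobs may wait and `concurrency` bounding how many run in parallel.
+
+```go
+package main
+
+import (
+    "fmt"
+    "sync"
+)
+
+// runPool is a hypothetical illustration, not Bifrost's implementation:
+// bufferSize bounds how many jobs may wait in the queue, and concurrency
+// bounds how many are processed in parallel. It returns the sum of all jobs.
+func runPool(bufferSize, concurrency int, jobs []int) int {
+    queue := make(chan int, bufferSize)
+    var (
+        wg    sync.WaitGroup
+        mu    sync.Mutex
+        total int
+    )
+    for w := 0; w < concurrency; w++ {
+        wg.Add(1)
+        go func() {
+            defer wg.Done()
+            for j := range queue {
+                mu.Lock()
+                total += j
+                mu.Unlock()
+            }
+        }()
+    }
+    for _, j := range jobs {
+        queue <- j // blocks once bufferSize jobs are already waiting
+    }
+    close(queue)
+    wg.Wait()
+    return total
+}
+
+func main() {
+    fmt.Println(runPool(4, 2, []int{1, 2, 3, 4, 5}))
+}
+```
+
+In this model a larger buffer absorbs bursts at the cost of memory, while more workers drain the queue faster at the cost of more concurrent work.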
+
+---
+
+## Next Steps
+
+### **Detailed Performance Analysis**
+- **[t3.medium Performance](./t3.medium)** - Deep dive into cost-effective performance
+- **[t3.xlarge Performance](./t3.xl)** - High-performance configuration analysis
+
+### **Run Your Own Tests**
+- **[Run Your Own Benchmarks](./run-your-own-benchmarks)** - Step-by-step guide to benchmark Bifrost in your environment
+
+Ready to dive deeper? Choose your instance type above or learn how to run your own performance tests.
--- a/docs/benchmarking/run-your-own-benchmarks.mdx
+++ b/docs/benchmarking/run-your-own-benchmarks.mdx
@@ -0,0 +1,355 @@
+---
+title: "Run Your Own Benchmarks"
+description: "Step-by-step guide to benchmark Bifrost in your own environment using the official benchmarking tool."
+icon: "stopwatch"
+---
+
+## Overview
+
+Want to see Bifrost's performance in your specific environment? The [**Bifrost Benchmarking Repository**](https://github.com/maximhq/bifrost-benchmarking) provides everything you need to conduct comprehensive performance tests tailored to your infrastructure and workload requirements.
+
+**What You Can Test:**
+- **Custom Instance Sizes** - Test on your preferred AWS/GCP/Azure instances  
+- **Your Workload Patterns** - Use your actual request/response sizes
+- **Different Configurations** - Compare various Bifrost settings
+- **Provider Comparisons** - Benchmark against other AI gateways
+- **Load Scenarios** - Test burst loads, sustained traffic, and endurance
+
+> **💡 Open Source**: The benchmarking tool is completely open source! Feel free to submit pull requests if you think anything is missing or could be improved.
+
+---
+
+## Prerequisites
+
+Before running benchmarks, ensure you have:
+
+- **Go 1.26.1+** installed on your testing machine
+- **Bifrost instance** running and accessible
+- **Target API providers** configured (OpenAI, Anthropic, etc.)
+- **Network access** between benchmark tool and Bifrost
+- **Sufficient resources** on the testing machine to generate load
+
+---
+
+## Quick Start
+
+### **1. Clone the Repository**
+
+```bash
+git clone https://github.com/maximhq/bifrost-benchmarking.git
+cd bifrost-benchmarking
+```
+
+### **2. Build the Benchmark Tool**
+
+```bash
+go build benchmark.go
+```
+
+This creates a `benchmark` executable (or `benchmark.exe` on Windows).
+
+### **3. Run Your First Benchmark**
+
+```bash
+# Basic benchmark: 500 RPS for 10 seconds
+./benchmark -provider bifrost -port 8080
+
+# Custom benchmark: 1000 RPS for 30 seconds  
+./benchmark -provider bifrost -port 8080 -rate 1000 -duration 30 -output my_results.json
+```
+
+---
+
+## Configuration Options
+
+The benchmark tool offers extensive configuration through command-line flags:
+
+### **Basic Configuration**
+
+| Flag | Required | Description | Default |
+|------|----------|-------------|---------|
+| `-provider <name>` | ✅ | Provider name (e.g., `bifrost`, `litellm`) | None |
+| `-port <number>` | ✅ | Port number of your Bifrost instance | None |
+| `-endpoint <path>` | ❌ | API endpoint path | `v1/chat/completions` |
+| `-rate <number>` | ❌ | Requests per second | `500` |
+| `-duration <seconds>` | ❌ | Test duration in seconds | `10` |
+| `-output <filename>` | ❌ | Results output file | `results.json` |
+
+### **Advanced Configuration**
+
+| Flag | Description | Default |
+|------|-------------|---------|
+| `-include-provider-in-request` | Include provider name in request payload | `false` |
+| `-big-payload` | Use larger, more complex request payloads | `false` |
+
+---
+
+## Benchmark Scenarios
+
+### **1. Basic Performance Test**
+
+Test standard performance with typical request sizes:
+
+```bash
+./benchmark -provider bifrost -port 8080 -rate 1000 -duration 60 -output basic_test.json
+```
+
+**Use Case**: General performance validation
+
+### **2. High-Load Stress Test**
+
+Push your instance to its limits:
+
+```bash
+./benchmark -provider bifrost -port 8080 -rate 5000 -duration 120 -output stress_test.json
+```
+
+**Use Case**: Capacity planning and SLA validation
+
+### **3. Large Payload Test**
+
+Test with bigger request/response sizes:
+
+```bash
+./benchmark -provider bifrost -port 8080 -rate 500 -duration 60 -big-payload=true -output large_payload.json
+```
+
+**Use Case**: Document processing, code generation workloads
+
+### **4. Endurance Test**
+
+Long-running stability test:
+
+```bash
+./benchmark -provider bifrost -port 8080 -rate 1000 -duration 1800 -output endurance_test.json
+```
+
+**Use Case**: Production readiness validation (30-minute test)
+
+### **5. Comparative Benchmarking**
+
+Compare Bifrost against other providers:
+
+```bash
+# Test Bifrost
+./benchmark -provider bifrost -port 8080 -rate 1000 -duration 60 -output bifrost_results.json
+
+# Test LiteLLM
+./benchmark -provider litellm -port 8000 -rate 1000 -duration 60 -output litellm_results.json
+
+# Test direct OpenAI (if available)
+./benchmark -provider openai -port 443 -endpoint chat/completions -rate 1000 -duration 60 -output openai_results.json
+```
+
+---
+
+## Understanding Results
+
+The benchmark tool generates detailed JSON results with comprehensive metrics:
+
+### **Key Metrics Explained**
+
+```json
+{
+  "bifrost": {
+    "request_counts": {
+      "total_sent": 30000,
+      "successful": 30000,
+      "failed": 0
+    },
+    "success_rate": 100.0,
+    "latency_metrics": {
+      "mean_ms": 245.5,
+      "p50_ms": 230.2,
+      "p99_ms": 520.8,
+      "max_ms": 845.3
+    },
+    "throughput_rps": 5000.0,
+    "memory_usage": {
+      "before_mb": 512.5,
+      "after_mb": 1312.8,
+      "peak_mb": 1405.2,
+      "average_mb": 1156.7
+    },
+    "timestamp": "2025-01-14T10:30:00Z",
+    "status_codes": {
+      "200": 30000
+    }
+  }
+}
+```
+
+### **Critical Performance Indicators**
+
+**Success Rate:**
+- **Target**: >99.9% for production readiness
+- **Excellent**: 100% (perfect reliability)
+
+**Latency Metrics:**
+- **P50 (Median)**: Typical user experience
+- **P99**: Worst-case user experience  
+- **Mean**: Overall average performance
+
+**Memory Usage:**
+- **Peak**: Maximum memory consumption
+- **Average**: Sustained memory usage
+- **After - Before**: Memory growth during test
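
The indicators above can be pulled out of a results file programmatically. The following is a minimal sketch, assuming the output has the shape of the sample JSON shown earlier; the `providerResult` type, the `summarize` helper, and the trimmed `sampleResults` constant are illustrative names, not part of the benchmark tool.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// providerResult mirrors only the fields needed here, as they appear
// in the sample results above.
type providerResult struct {
	RequestCounts struct {
		TotalSent  int `json:"total_sent"`
		Successful int `json:"successful"`
	} `json:"request_counts"`
	MemoryUsage struct {
		BeforeMB float64 `json:"before_mb"`
		AfterMB  float64 `json:"after_mb"`
	} `json:"memory_usage"`
}

// sampleResults is a trimmed copy of the example output above.
const sampleResults = `{
  "bifrost": {
    "request_counts": {"total_sent": 30000, "successful": 30000, "failed": 0},
    "memory_usage": {"before_mb": 512.5, "after_mb": 1312.8, "peak_mb": 1405.2}
  }
}`

// summarize returns the success rate (percent) and the memory growth (MB)
// for the named provider in a results document.
func summarize(data []byte, provider string) (float64, float64, error) {
	var results map[string]providerResult
	if err := json.Unmarshal(data, &results); err != nil {
		return 0, 0, err
	}
	r, ok := results[provider]
	if !ok {
		return 0, 0, fmt.Errorf("provider %q not found in results", provider)
	}
	if r.RequestCounts.TotalSent == 0 {
		return 0, 0, fmt.Errorf("provider %q sent no requests", provider)
	}
	rate := 100 * float64(r.RequestCounts.Successful) / float64(r.RequestCounts.TotalSent)
	growth := r.MemoryUsage.AfterMB - r.MemoryUsage.BeforeMB
	return rate, growth, nil
}

func main() {
	rate, growth, err := summarize([]byte(sampleResults), "bifrost")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("success rate: %.2f%%, memory growth: %.1f MB\n", rate, growth)
	if rate < 99.9 {
		fmt.Println("below the 99.9% production-readiness target")
	}
}
```

To use it on a real run, read the file passed to `-output` and hand its bytes to `summarize` instead of the embedded sample.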
+
+---
+
+## Instance Sizing Recommendations
+
+Based on your benchmark results, use these guidelines for production sizing:
+
+### **Resource Planning Matrix**
+
+| Target RPS | Memory Usage | Recommended Instance | Notes |
+|------------|--------------|---------------------|--------|
+| **< 1,000** | < 1GB | t3.small | Cost-effective for light loads |
+| **1,000 - 3,000** | 1-2GB | t3.medium | Balanced performance/cost |
+| **3,000 - 5,000** | 2-4GB | t3.large | High-performance production |
+| **5,000+** | 3-6GB | t3.xlarge+ | Enterprise/mission-critical |
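
The matrix can be expressed as a small lookup when sizing is scripted. This is a sketch only: `recommendInstance` is a hypothetical helper, and sending the shared boundary values (3,000 and 5,000 RPS) to the larger tier is an assumption, since the table lists them in both adjacent rows.

```go
package main

import "fmt"

// recommendInstance maps a target RPS to the instance listed in the matrix.
// Assumption: boundary values that appear in two rows go to the larger tier.
func recommendInstance(targetRPS int) string {
	switch {
	case targetRPS < 1000:
		return "t3.small"
	case targetRPS < 3000:
		return "t3.medium"
	case targetRPS < 5000:
		return "t3.large"
	default:
		return "t3.xlarge+"
	}
}

func main() {
	for _, rps := range []int{500, 2000, 4000, 8000} {
		fmt.Printf("%d RPS -> %s\n", rps, recommendInstance(rps))
	}
}
```

Treat the result as a starting point and confirm it against the memory usage measured in your own benchmark run.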
+
+### **Configuration Tuning Based on Results**
+
+**If seeing high latency:**
+- Increase `initial_pool_size`
+- Increase `buffer_size`
+- Consider larger instance
+
+**If memory usage is high:**
+- Decrease `initial_pool_size`
+- Optimize `buffer_size`
+- Monitor for memory leaks
+
+**If success rate < 100%:**
+- Reduce request rate
+- Increase timeout settings
+- Check provider limits
+
+---
+
+## Advanced Testing Scenarios
+
+### **Burst Load Testing**
+
+Simulate traffic spikes:
+
+```bash
+# Normal load
+./benchmark -provider bifrost -port 8080 -rate 1000 -duration 300 -output normal_load.json
+
+# Burst load (simulate 5x spike)
+./benchmark -provider bifrost -port 8080 -rate 5000 -duration 60 -output burst_load.json
+```
+
+### **Multi-Instance Testing**
+
+Test horizontal scaling:
+
+```bash
+# Instance 1
+./benchmark -provider bifrost-1 -port 8080 -rate 2500 -duration 120 -output instance_1.json &
+
+# Instance 2  
+./benchmark -provider bifrost-2 -port 8081 -rate 2500 -duration 120 -output instance_2.json &
+
+# Wait for both to complete
+wait
+```
+
+### **Different Payload Sizes**
+
+Compare performance across payload sizes:
+
+```bash
+# Small payloads (default)
+./benchmark -provider bifrost -port 8080 -rate 1000 -duration 60 -output small_payload.json
+
+# Large payloads
+./benchmark -provider bifrost -port 8080 -rate 1000 -duration 60 -big-payload=true -output large_payload.json
+```
+
+---
+
+## Continuous Benchmarking
+
+### **Automated Testing Pipeline**
+
+Set up regular performance regression testing:
+
+```bash
+#!/bin/bash
+# daily_benchmark.sh
+
+DATE=$(date +%Y%m%d_%H%M%S)
+OUTPUT_DIR="benchmarks/$DATE"
+mkdir -p "$OUTPUT_DIR"
+
+# Run standard benchmarks
+./benchmark -provider bifrost -port 8080 -rate 1000 -duration 300 -output "$OUTPUT_DIR/standard.json"
+./benchmark -provider bifrost -port 8080 -rate 3000 -duration 180 -output "$OUTPUT_DIR/high_load.json"  
+./benchmark -provider bifrost -port 8080 -rate 500 -duration 600 -big-payload=true -output "$OUTPUT_DIR/large_payload.json"
+
+echo "Benchmarks completed: $OUTPUT_DIR"
+```
+
+### **Performance Monitoring Integration**
+
+Monitor key metrics over time:
+- **Success rate trends**
+- **Latency percentile changes**
+- **Memory usage patterns**
+- **Throughput capacity**
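
One way to turn these trends into a pass/fail signal is to compare each run against a stored baseline. The sketch below assumes you have already extracted the metrics from two result files; the `runSummary` type, the `regressions` helper, the 10% tolerance, and the "current" numbers are all hypothetical.

```go
package main

import "fmt"

// runSummary holds the metrics tracked between runs. The field set is an
// illustrative choice, not the benchmark tool's output schema.
type runSummary struct {
	SuccessRate float64 // percent
	P99Ms       float64
	PeakMB      float64
}

// regressions lists the metrics that got worse. Any drop in success rate
// counts; latency and memory must exceed the baseline by more than
// tolerancePct to be reported.
func regressions(baseline, current runSummary, tolerancePct float64) []string {
	var found []string
	limit := 1 + tolerancePct/100
	if current.SuccessRate < baseline.SuccessRate {
		found = append(found, "success_rate")
	}
	if current.P99Ms > baseline.P99Ms*limit {
		found = append(found, "p99_ms")
	}
	if current.PeakMB > baseline.PeakMB*limit {
		found = append(found, "peak_mb")
	}
	return found
}

func main() {
	baseline := runSummary{SuccessRate: 100, P99Ms: 520.8, PeakMB: 1405.2}
	current := runSummary{SuccessRate: 100, P99Ms: 610.0, PeakMB: 1420.0} // made-up run
	fmt.Println(regressions(baseline, current, 10))
}
```

A non-empty result could fail the scheduled job so that regressions surface the day they are introduced.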
+
+---
+
+## Troubleshooting
+
+### **Common Issues**
+
+**Connection Refused:**
+```bash
+# Check if Bifrost is running
+curl http://localhost:8080/health
+
+# Verify port configuration
+netstat -an | grep 8080
+```
+- Check that `PORT` is defined in the `.env` file at the project root.
+
+**High Error Rates:**
+- Check provider API key limits
+- Verify Bifrost configuration
+- Monitor upstream provider status
+- Reduce request rate for baseline test
+
+**Memory Issues:**
+- Monitor system resources during testing
+- Check for memory leaks in long tests
+- Adjust Bifrost pool sizes
+
+**Inconsistent Results:**
+- Run multiple test iterations
+- Account for network variability  
+- Use longer test durations (60+ seconds)
+- Isolate testing environment
+- Try routing gateway requests to a mock provider to rule out upstream variability
+
+---
+
+## Next Steps
+
+### **After Running Benchmarks**
+
+1. **Analyze Results**: Compare against [official benchmarks](./getting-started)
+2. **Optimize Configuration**: Tune based on your specific results
+3. **Plan Capacity**: Size instances based on measured performance
+4. **Set Up Monitoring**: Track key metrics in production
+
+### **Compare Results**
+
+- **[t3.medium Performance](./t3.medium)** - Compare against medium instance results
+- **[t3.xlarge Performance](./t3.xl)** - Compare against high-performance configuration
+
+**Ready to benchmark? Clone the [repository](https://github.com/maximhq/bifrost-benchmarking) and start testing!**
--- a/docs/benchmarking/t3.medium.mdx
+++ b/docs/benchmarking/t3.medium.mdx
@@ -0,0 +1,127 @@
+---
+title: "t3.medium"
+description: "Detailed performance metrics and analysis for Bifrost running on AWS t3.medium instances (2 vCPUs, 4GB RAM)."
+icon: "server"
+---
+
+## Instance Configuration
+
+**AWS t3.medium Specifications:**
+- **vCPUs**: 2
+- **Memory**: 4GB RAM
+- **Network Performance**: Up to 5 Gigabit
+
+**Bifrost Configuration:**
+- **Buffer Size**: 15,000
+- **Initial Pool Size**: 10,000
+- **Test Load**: 5,000 requests per second (RPS)
+
+---
+
+## Performance Results
+
+### **Overall Performance Metrics**
+
+| Metric | Value | Notes |
+|--------|-------|--------|
+| **Success Rate** | 100.00% | Perfect reliability under high load |
+| **Average Request Size** | 0.13 KB | Lightweight request payload |
+| **Average Response Size** | 1.37 KB | Standard response size for testing |
+| **Average Latency** | 2.12s | Total end-to-end response time |
+| **Peak Memory Usage** | 1,312.79 MB | ~33% of available 4GB RAM |
+
+### **Detailed Performance Breakdown**
+
+| Operation | Latency | Performance Notes |
+|-----------|---------|-------------------|
+| **Queue Wait Time** | 47.13 µs | Time waiting in Bifrost's internal queue |
+| **Key Selection Time** | 16 ns | Weighted API key selection |
+| **Message Formatting** | 2.19 µs | Request message preparation |
+| **Params Preparation** | 436 ns | Parameter processing |
+| **Request Body Preparation** | 2.65 µs | HTTP request body assembly |
+| **JSON Marshaling** | 63.47 µs | JSON serialization time |
+| **Request Setup** | 6.59 µs | HTTP client configuration |
+| **HTTP Request** | 1.56s | Actual provider API call time |
+| **Error Handling** | 189 ns | Error processing overhead |
+| **Response Parsing** | 11.30 ms | JSON response deserialization |
+
+**Bifrost's Total Overhead: 59 µs***
+
+*\*Excludes JSON marshaling, the HTTP request, and response parsing, which are required in any implementation*
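
The 59 µs figure can be reproduced from the breakdown table by summing every row except JSON marshaling, the HTTP request, and response parsing. The sketch below does that arithmetic with the table's values; the `stage` type and `overheadNanos` helper are illustrative names.

```go
package main

import (
	"fmt"
	"time"
)

// stage is one row of the breakdown table.
type stage struct {
	name     string
	latency  time.Duration
	overhead bool // false for work any gateway must do (marshaling, HTTP, parsing)
}

var stages = []stage{
	{"Queue Wait Time", 47130 * time.Nanosecond, true},
	{"Key Selection Time", 16 * time.Nanosecond, true},
	{"Message Formatting", 2190 * time.Nanosecond, true},
	{"Params Preparation", 436 * time.Nanosecond, true},
	{"Request Body Preparation", 2650 * time.Nanosecond, true},
	{"JSON Marshaling", 63470 * time.Nanosecond, false},
	{"Request Setup", 6590 * time.Nanosecond, true},
	{"HTTP Request", 1560 * time.Millisecond, false},
	{"Error Handling", 189 * time.Nanosecond, true},
	{"Response Parsing", 11300 * time.Microsecond, false},
}

// overheadNanos sums the latencies of the stages counted as overhead.
func overheadNanos() int64 {
	var total time.Duration
	for _, s := range stages {
		if s.overhead {
			total += s.latency
		}
	}
	return total.Nanoseconds()
}

func main() {
	// The sum comes to roughly 59 µs, matching the reported total.
	fmt.Println(time.Duration(overheadNanos()))
}
```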
+
+---
+
+## Performance Analysis
+
+### **Strengths on t3.medium**
+
+1. **Perfect Reliability**: 100% success rate even at 5,000 RPS
+2. **Memory Efficiency**: Uses only 33% of available RAM (1,312.79 MB / 4GB)
+3. **Minimal Overhead**: Just 59 µs of added latency per request
+4. **Fast Operations**: Key selection, parameter preparation, and error handling each complete in under a microsecond
+
+### **Resource Utilization**
+
+- **Memory Usage**: Very efficient at 1,312.79 MB peak usage
+- **CPU Performance**: Handles 5,000 RPS workload effectively
+- **Queue Management**: 47.13 µs average wait time indicates good throughput
+
+---
+
+## Configuration Recommendations
+
+### **Optimal Settings for t3.medium**
+
+Based on test results, these configurations work well:
+
+```json
+{
+  "client": {
+    "initial_pool_size": 10000,
+    "buffer_size": 15000
+  }
+}
+```
+
+### **Tuning Opportunities**
+
+**For Lower Memory Usage:**
+- Reduce `initial_pool_size` to 7,500-8,000
+- Decrease `buffer_size` to 12,000-13,000
+- Trade-off: Slightly higher latency
+
+**For Better Performance:**
+- Increase `initial_pool_size` to 12,000-13,000  
+- Increase `buffer_size` to 17,000-18,000
+- Trade-off: Higher memory usage (monitor RAM limits)
+
+---
+
+## Comparison Context
+
+### **vs. t3.xlarge Performance**
+
+| Metric | t3.medium | t3.xlarge | Difference |
+|--------|-----------|-----------|------------|
+| **Bifrost Overhead** | 59 µs | 11 µs | 5.4x higher |
+| **Queue Wait Time** | 47.13 µs | 1.67 µs | 28x higher |
+| **JSON Marshaling** | 63.47 µs | 26.80 µs | 2.4x higher |
+| **Response Parsing** | 11.30 ms | 2.11 ms | 5.4x higher |
+| **Memory Usage** | 1,312.79 MB | 3,340.44 MB | 61% lower |
+
+**Key Insights:**
+- t3.medium uses **61% less memory** than t3.xlarge
+- Performance trade-offs are reasonable for cost savings
+- Most operations still complete in microseconds
+
+---
+
+## Next Steps
+
+**When to upgrade to t3.xlarge:**
+- Sustained load approaches 4,000+ RPS
+- Queue wait times consistently exceed 75 µs
+- Memory usage approaches 75% of available RAM
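
These criteria are simple enough to check automatically against monitoring data. A minimal sketch, assuming you already collect the three inputs; `upgradeReasons` is a hypothetical helper, and reading "approaches" as "at or above" is a simplification.

```go
package main

import "fmt"

// upgradeReasons returns the upgrade criteria that the given measurements
// meet. Thresholds are taken from the list above.
func upgradeReasons(sustainedRPS, queueWaitMicros, memUsedMB, memTotalMB float64) []string {
	var reasons []string
	if sustainedRPS >= 4000 {
		reasons = append(reasons, "sustained load at 4,000+ RPS")
	}
	if queueWaitMicros > 75 {
		reasons = append(reasons, "queue wait above 75 µs")
	}
	if memTotalMB > 0 && memUsedMB/memTotalMB >= 0.75 {
		reasons = append(reasons, "memory at 75%+ of RAM")
	}
	return reasons
}

func main() {
	// Made-up measurements: 2,500 RPS, 80 µs queue wait, 2,000 MB of 4,096 MB.
	fmt.Println(upgradeReasons(2500, 80, 2000, 4096))
}
```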
+
+- **[Run Your Own Benchmarks](./run-your-own-benchmarks)** to test with your specific workload
+- **[Compare with t3.xlarge](./t3.xl)** for performance scaling analysis
--- a/docs/benchmarking/t3.xl.mdx
+++ b/docs/benchmarking/t3.xl.mdx
@@ -0,0 +1,151 @@
+---
+title: "t3.xlarge"
+description: "Detailed performance metrics and analysis for Bifrost running on AWS t3.xlarge instances (4 vCPUs, 16GB RAM)."
+icon: "server"
+---
+
+## Instance Configuration
+
+**AWS t3.xlarge Specifications:**
+- **vCPUs**: 4
+- **Memory**: 16GB RAM
+- **Network Performance**: Up to 5 Gigabit
+
+**Bifrost Configuration:**
+- **Buffer Size**: 20,000
+- **Initial Pool Size**: 15,000
+- **Test Load**: 5,000 requests per second (RPS)
+
+---
+
+## Performance Results
+
+### **Overall Performance Metrics**
+
+| Metric | Value | Notes |
+|--------|-------|--------|
+| **Success Rate** | 100.00% | Perfect reliability under high load |
+| **Average Request Size** | 0.13 KB | Lightweight request payload |
+| **Average Response Size** | 10.32 KB | **Large response payload testing** |
+| **Average Latency** | 1.61s | Total end-to-end response time |
+| **Peak Memory Usage** | 3,340.44 MB | ~21% of available 16GB RAM |
+
+> **Note**: t3.xlarge tests used significantly larger response payloads (~10 KB vs ~1 KB on t3.medium) to stress-test performance with realistic production data sizes.
+
+### **Detailed Performance Breakdown**
+
+| Operation | Latency | Performance Notes |
+|-----------|---------|-------------------|
+| **Queue Wait Time** | 1.67 µs | **96% faster** than t3.medium |
+| **Key Selection Time** | 10 ns | **37% faster** weighted API key selection |
+| **Message Formatting** | 2.11 µs | Consistent with t3.medium performance |
+| **Params Preparation** | 417 ns | Slight improvement over t3.medium |
+| **Request Body Preparation** | 2.36 µs | **11% faster** request assembly |
+| **JSON Marshaling** | 26.80 µs | **58% faster** serialization |
+| **Request Setup** | 7.17 µs | Comparable to t3.medium |
+| **HTTP Request** | 1.50s | **4% faster** provider API calls |
+| **Error Handling** | 162 ns | **14% faster** error processing |
+| **Response Parsing** | 2.11 ms | **81% faster** despite 7.5x larger payloads |
+
+**Bifrost's Total Overhead: 11 µs***
+
+*\*Excludes JSON marshalling and HTTP calls, which are required in any implementation. 81% reduction compared to t3.medium (59 µs → 11 µs)*
+
+---
+
+## Performance Analysis
+
+### **Exceptional Performance Improvements**
+
+1. **Dramatic Overhead Reduction**: 81% lower Bifrost overhead (59 µs → 11 µs)
+2. **Superior Queue Management**: 96% faster queue wait times (47.13 µs → 1.67 µs)  
+3. **Faster JSON Processing**: 58% improvement in marshaling despite larger payloads
+4. **Efficient Response Parsing**: 81% faster parsing even with 7.5x larger responses
+5. **Perfect Reliability**: 100% success rate maintained under high load
+
+### **Resource Utilization**
+
+- **Memory Efficiency**: Uses only 21% of available RAM (3,340.44 MB / 16GB)
+- **CPU Performance**: Excellent multi-core utilization for 5,000 RPS
+- **Headroom**: Substantial capacity for traffic spikes and growth
+
+---
+
+## Scalability and Headroom
+
+### **Exceptional Scaling Characteristics**
+
+The t3.xlarge configuration demonstrates **excellent scaling potential**:
+
+**Current Utilization:**
+- **Memory**: 21% used (13GB available headroom)
+- **Queue Performance**: 1.67 µs wait time (near-optimal)
+- **Processing Speed**: Sub-microsecond key selection, parameter preparation, and error handling
+
+**Scaling Potential:**
+- **Traffic Spikes**: Can likely handle 15,000+ RPS bursts
+- **Response Size Growth**: Efficiently handles 10 KB responses
+- **Concurrent Users**: Supports thousands of simultaneous users
+
+---
+
+## Advanced Configuration
+
+### **Optimal Settings for t3.xlarge**
+
+Based on test results, these configurations provide excellent performance:
+
+```json
+{
+  "client": {
+    "initial_pool_size": 15000,
+    "buffer_size": 20000
+  }
+}
+```
+
+### **Performance Tuning Opportunities**
+
+**For Maximum Performance:**
+- Increase `initial_pool_size` to 18,000-20,000
+- Increase `buffer_size` to 25,000-30,000
+- Trade-off: Higher memory usage (still well within limits)
+
+**For Memory Optimization:**
+- Current config already very efficient at 21% RAM usage
+- Could reduce settings if needed, but performance gains would be lost
+
+**For Extreme Workloads:**
+- Consider `initial_pool_size` up to 25,000
+- Increase `buffer_size` to 35,000+
+- Monitor memory usage approaching 50% of available RAM
+
+---
+
+## Performance Comparison
+
+### **vs. t3.medium Performance**
+
+| Metric | t3.medium | t3.xlarge | Improvement |
+|--------|-----------|-----------|-------------|
+| **Bifrost Overhead** | 59 µs | 11 µs | **-81%** |
+| **Average Latency** | 2.12s | 1.61s | **-24%** |
+| **Queue Wait Time** | 47.13 µs | 1.67 µs | **-96%** |
+| **JSON Marshaling** | 63.47 µs | 26.80 µs | **-58%** |
+| **Response Parsing** | 11.30 ms | 2.11 ms | **-81%** |
+| **Response Size Handled** | 1.37 KB | 10.32 KB | **+7.5x** |
+| **Peak Memory Usage** | 1,312.79 MB | 3,340.44 MB | +154% |
+| **Memory Utilization** | 33% | 21% | **-36%** |
+
+**Key Insights:**
+- **81% overhead reduction** while handling 7.5x larger responses
+- **Exceptional efficiency** with only 21% memory utilization
+- **Dramatic queue performance** improvements
+- **Substantial headroom** for growth and traffic spikes
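
The percentages in the Improvement column are relative changes measured from the t3.medium value. The sketch below recomputes the latency rows from the table's numbers; `percentChange` is an illustrative helper.

```go
package main

import "fmt"

// percentChange returns the relative change from the t3.medium value to the
// t3.xlarge value, in percent. Negative means the t3.xlarge value is lower.
func percentChange(medium, xlarge float64) float64 {
	return (xlarge - medium) / medium * 100
}

func main() {
	rows := []struct {
		metric         string
		medium, xlarge float64
	}{
		{"Bifrost overhead (µs)", 59, 11},
		{"Average latency (s)", 2.12, 1.61},
		{"Queue wait time (µs)", 47.13, 1.67},
		{"JSON marshaling (µs)", 63.47, 26.80},
		{"Response parsing (ms)", 11.30, 2.11},
	}
	for _, r := range rows {
		fmt.Printf("%-24s %+.0f%%\n", r.metric, percentChange(r.medium, r.xlarge))
	}
}
```

Note that a reduction of 81% is not the same as the larger value being "81% slower": 59 µs is about 5.4 times 11 µs, so state which value is the baseline when quoting these figures.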
+
+---
+
+## Next Steps
+
+- **[Run Your Own Benchmarks](./run-your-own-benchmarks)** with your specific payload sizes
+- **[Compare with t3.medium](./t3.medium)** for cost-optimization analysis
--- a/docs/changelogs/cli-v0.10.0.mdx
+++ b/docs/changelogs/cli-v0.10.0.mdx
@@ -0,0 +1,18 @@
+---
+title: "v0.10.0"
+description: "v0.10.0 changelog"
+---
+
+<Update label="Bifrost CLI" description="v0.10.0">
+- feat: tabbed multiplexer for running multiple coding-agent sessions in a single terminal
+- feat: self-update flow with `bifrost update` command and background version checks
+- feat: `bifrost version` subcommand
+- feat: native config writing for Claude Code (~/.claude/settings.json) with confirmation prompt
+- feat: PTY-based process execution with SIGWINCH propagation for proper TUI rendering
+- feat: npx installer rewrite with persistent install to ~/.bifrost/bin/ and automatic shell PATH setup
+- feat: Claude Code simple terminal mode (CLAUDE_CODE_SIMPLE=1) for tab compatibility
+- fix: opencode harness model reference format and provider config (bifrost/ prefix, dedicated provider)
+- feat: opencode adaptive TUI theme injection and JSONC config parsing
+- fix: chooser TUI prompt cleanup and tab bar integration (ReservedRows, BackToTabs, Notify)
+
+</Update>
--- a/docs/changelogs/cli-v0.10.1.mdx
+++ b/docs/changelogs/cli-v0.10.1.mdx
@@ -0,0 +1,15 @@
+---
+title: "v0.10.1"
+description: "v0.10.1 changelog - 2026-03-13"
+---
+
+<Update label="Bifrost CLI" description="v0.10.1">
+
+- feat: added "edit session" functionality via ^B e to reopen chooser with prefilled values
+- feat: Claude harness now pins selected models across Sonnet, Opus, and Haiku tiers
+- fix: improved terminal cursor restoration on PTY exit
+- fix: enhanced error notice handling in command mode with sticky error states
+- fix: improved MCP client reconnection with exponential backoff and connection timeout
+
+</Update>
+
--- a/docs/changelogs/cli-v0.10.2.mdx
+++ b/docs/changelogs/cli-v0.10.2.mdx
@@ -0,0 +1,14 @@
+---
+title: "v0.10.2"
+description: "v0.10.2 changelog - 2026-03-14"
+---
+
+<Update label="Bifrost CLI" description="v0.10.2">
+- feat: added in-tab self-update via U key in command mode when update is available
+- feat: improved tab bar to show update hint when newer version is detected
+- fix: terminal resize handling with proper size normalization and scroll region reset
+- fix: improved chooser integration with tab bar rendering via TabBarLine callback
+- fix: enhanced cursor positioning with absolute origin mode after scroll region reset
+
+</Update>
+
--- a/docs/changelogs/cli-v0.10.3.mdx
+++ b/docs/changelogs/cli-v0.10.3.mdx
@@ -0,0 +1,10 @@
+---
+title: "v0.10.3"
+description: "v0.10.3 changelog - 2026-03-27"
+---
+
+<Update label="Bifrost CLI" description="v0.10.3">
+- feat: adds support for ANTHROPIC_AUTH_TOKEN
+
+</Update>
+
--- a/docs/changelogs/ent-v1.3.10.mdx
+++ b/docs/changelogs/ent-v1.3.10.mdx
@@ -0,0 +1,52 @@
+---
+title: "v1.3.10"
+description: "v1.3.10 changelog"
+---
+
+<Update label="Bifrost Enterprise" description="v1.3.10">
+
+
+## Changelog
+
+This release upgrades the base OSS version from v1.4.12 to v1.4.13, bringing plugin execution sequencing, Groq speech support, Azure GCC cloud environments, and connection pool management. On the enterprise side, this release adds Azure Entra ID support for GCC High environments, new customer deployments, and deployment pipeline improvements.
+
+## ✨ Features
+
+- **Plugin Sequencing** — Added plugin execution ordering with placement and priority controls for custom plugins relative to built-in plugins
+- **Groq Speech** — Added speech synthesis (TTS) and transcription (STT) support for Groq provider
+- **Gemini Model Metadata** — Added support for Gemini metadata endpoint (/v1beta/models/{model})
+- **Azure GCC High Integration** — Added Azure Entra ID support for GCC High and DoD cloud environments, including cloud-specific endpoints for SCIM provisioning and JWT validation
+- **Wildcard Header Forwarding** — Added wildcard pattern support in header forwarding configuration
+- **Log Metadata Columns** — Added metadata columns in logs and filters for richer observability
+- **Prompt Caching Improvements** — Preserved JSON key ordering for LLM prompt caching using byte-level operations
+- **Connection Pool Management** — Added connection lifetime limits and optimized pool behavior to prevent stale connections
+
+## 🐞 Fixed
+
+- **MCP Tool Headers** — Fixed MCP tools not passing required headers to the MCP server
+- **MCP Tool Call Detection** — Fixed tool calls not being detected in MCP agent mode when providers return "stop" finish reason
+- **Gemini Finish Reason** — Fixed Gemini models not returning correct "tool_calls" finish reason
+- **Prompt Cascade Deletion** — Fixed manual cascade deletion for prompt entities
+- **Deploy Maxim Workflow** — Fixed deployment workflow for Maxim environment
+- **Commit Message Parsing** — Fixed commit message parsing in enterprise build pipeline
+- **Customer License Expiry** — Updated license expiry configurations for customer deployments
+
+## 📀 Base OSS version
+
+`transports/v1.4.13`
+
+## 🔌 If you are compiling a plugin against this release, use the following deps
+
+```
+github.com/maximhq/bifrost/core v1.4.11
+github.com/maximhq/bifrost/framework v1.2.30
+github.com/maximhq/bifrost/plugins/governance v1.4.30
+github.com/maximhq/bifrost/plugins/logging v1.4.30
+github.com/maximhq/bifrost/transports v1.4.14
+github.com/weaviate/weaviate v1.36.5
+github.com/weaviate/weaviate-go-client/v5 v5.7.1
+google.golang.org/genproto/googleapis/api v0.0.0-20260203192932-546029d2fa20
+google.golang.org/genproto/googleapis/rpc v0.0.0-20260203192932-546029d2fa20
+```
+
+</Update>
--- a/docs/changelogs/ent-v1.3.11.mdx
+++ b/docs/changelogs/ent-v1.3.11.mdx
@@ -0,0 +1,47 @@
+---
+title: "v1.3.11"
+description: "v1.3.11 changelog"
+---
+
+<Update label="Bifrost Enterprise" description="v1.3.11">
+
+## Changelog
+
+This release upgrades the base OSS version from v1.4.14 to v1.4.15, bringing a custom SSE stream reader for smoother streaming, MCP config validation, configurable max open connections, and major dashboard improvements. On the enterprise side, this release adds new customer onboarding, Mantel authentication migration to username/password, and license management updates.
+
+## ✨ Features
+
+- **Custom SSE Stream Reader** — Replaced fasthttp's default stream reader with a custom implementation to reduce bursts in SSE streaming
+- **MCP Config Validation** — Added validation for MCP tool configurations in config.json
+- **Max Open Connections** — Exposed max-open-connections for provider domains as a configurable field
+- **Dashboard Improvements** — Added new tabs and graphs to the dashboard including Model Ranking, Cache usage, and MCP usage
+- **Dashboard & Logs Performance** — Improved LLM logs and Dashboard UI performance (~1400x faster) for large numbers of logs
+- **Anthropic Compaction** — Added compaction support for Anthropic provider
+
+## 🐞 Fixed
+
+- **Passthrough Streaming** — Fixed passthrough streaming responses being buffered instead of streamed
+- **MCP Notifications** — Fixed MCP notifications returning incorrect status code
+- **Streaming Function Calls** — Fixed function_call items not included in streaming response.completed output
+- **Bedrock API Key Auth** — Fixed Bedrock API key authentication without requiring bedrock_key_config
+- **Bedrock Token Count Fallback** — Added fallback to estimated token count when count-tokens API is unsupported
+- **Anthropic Thinking Fixes** — Fixed OpenAI-to-Anthropic-to-OpenAI thinking content conversion
+- **Anthropic Header Selection** — Fixed Anthropic header selection across providers
+- **Gemini OpenAI Integration** — Fixed Gemini flow for OpenAI-compatible integration
+- **Semantic Cache Hashing** — Fixed deterministic tools_hash and params_hash in semantic cache
+
+## 📀 Base OSS version
+
+`transports/v1.4.15`
+
+## 🔌 If you are compiling a plugin against this release, use the following deps
+
+```
+github.com/maximhq/bifrost/core v1.4.12
+github.com/maximhq/bifrost/framework v1.2.31
+github.com/maximhq/bifrost/plugins/governance v1.4.31
+github.com/maximhq/bifrost/plugins/logging v1.4.31
+github.com/maximhq/bifrost/transports v1.4.15
+```
+
+</Update>
--- a/docs/changelogs/ent-v1.3.12.mdx
+++ b/docs/changelogs/ent-v1.3.12.mdx
@@ -0,0 +1,36 @@
+---
+title: "v1.3.12"
+description: "v1.3.12 changelog"
+---
+
+<Update label="Bifrost Enterprise" description="v1.3.12">
+
+## Changelog
+
+This release upgrades the base OSS version from v1.4.15 to v1.4.16, fixing Responses API tool type routing, Postgres indexing deadlocks, and startup blocking. On the enterprise side, this release adds targeted release deployments via `--release-for` and fixes MCP tool group filtering.
+
+## ✨ Features
+
+- **Targeted Release Deployments** — Added `--release-for` flag to CI/CD pipeline, allowing releases to target specific environments by name instead of auto-detecting all environments
+
+## 🐞 Fixed
+
+- **Responses API Tool Types** — Normalized versioned/provider-specific tool type strings (e.g. `web_search_20250305`) to their canonical types for correct routing
+- **Provider Histogram Index** — Deferred provider histogram index creation to background goroutine to avoid blocking pod startup
+- **MCP Tool Group Filtering** — Fixed MCP tool include filter to use correct schema constant for proper tool group resolution
+
+## 📀 Base OSS version
+
+`transports/v1.4.16`
+
+## 🔌 If you are compiling a plugin against this release, use the following deps
+
+```
+github.com/maximhq/bifrost/core v1.4.13
+github.com/maximhq/bifrost/framework v1.2.32
+github.com/maximhq/bifrost/plugins/governance v1.4.32
+github.com/maximhq/bifrost/plugins/logging v1.4.32
+github.com/maximhq/bifrost/transports v1.4.16
+```
+
+</Update>
--- a/docs/changelogs/ent-v1.3.13.mdx
+++ b/docs/changelogs/ent-v1.3.13.mdx
@@ -0,0 +1,86 @@
+---
+title: "v1.3.13"
+description: "v1.3.13 changelog"
+---
+
+<Update label="Bifrost Enterprise" description="v1.3.13">
+
+## Changelog
+
+This release upgrades the base OSS version from v1.4.16 to v1.4.17, bringing denylist model support, numerous streaming and provider fixes, and WebSocket concurrency safety. On the enterprise side, the Datadog span type for LLM calls is updated to `llm.call` for correct Datadog LLM Observability categorization.
+
+## ✨ Features
+
+- **Denylist Models** — Provider keys now support a `blacklisted_models` field to exclude specific models from routing and from filtered list-models responses; the denylist takes precedence over the `models` allow list
+
+## 🐞 Fixed
+
+- **Datadog LLM Span Type** — Changed Datadog span type for LLM calls from `llm` to `llm.call` for proper Datadog LLM Observability integration
+- **MCP Gateway Headers** — Fixed support for `x-bf-mcp-include-clients` and `x-bf-mcp-include-tools` headers to filter MCP tools/list response
+- **Bedrock Duplicate Events** — Fixed duplicate `content_block_stop` events in Bedrock streaming responses
+- **Reasoning Content Marshaling** — Fixed `reasoning_content` JSON tag in OpenAI response types
+- **OTEL Streaming Traces** — Fixed response capture in OTEL tracing for streaming calls
+- **Broken Pipe Handling** — Added broken pipe detection to connection pool error handler
+- **Cache Token Streaming** — Fixed cache token capture for streaming calls across Anthropic and Bedrock providers
+- **Vertex Embedding URL** — Fixed global region URL construction in Vertex embedding method
+- **Bedrock Reasoning Merge** — Fixed reasoning content merge logic for Bedrock provider
+- **Bedrock HTTP/2 Toggle** — Fixed enforce HTTP/2 toggle behavior for Bedrock provider
+- **Codex Store Parameter** — Fixed `store` parameter handling for Codex conversations
+- **Gemini Duplicate Text** — Skipped `OutputTextDone` events to prevent duplicate text in Gemini GenAI streaming
+- **Gemini Thought Signatures** — Handled missing thought signatures in Gemini provider
+- **Replicate Model Slugs** — Refined Replicate model slug resolution in model catalog
+- **Logging Default** — Kept logging enabled by default for new configurations
+- **Gin Migration Deadlocks** — Moved all gin migrations to Go to avoid deadlocks
+- **WebSocket Concurrent Writes** — Fixed concurrent write safety in WebSocket Responses API sessions
+- **Persist Store Config** — Persisted store raw request/response config at provider level
+
+## 📀 Base OSS version
+
+`transports/v1.4.17`
+
+## 🗺️ Helm chart version
+
+2.0.14
+
+## 🔌 If you are compiling a plugin against this release, use the following deps
+
+```go
+cloud.google.com/go/bigquery v1.73.1
+github.com/DataDog/datadog-go/v5 v5.6.0
+github.com/DataDog/dd-trace-go/v2 v2.4.0
+github.com/aws/aws-sdk-go-v2/config v1.32.11
+github.com/aws/aws-sdk-go-v2/credentials v1.19.11
+github.com/aws/aws-sdk-go-v2/service/bedrockruntime v1.50.1
+github.com/bytedance/sonic v1.15.0
+github.com/coreos/go-oidc/v3 v3.12.0
+github.com/fasthttp/router v1.5.4
+github.com/golang-jwt/jwt/v5 v5.3.0
+github.com/google/cel-go v0.26.1
+github.com/google/uuid v1.6.0
+github.com/gorilla/websocket v1.5.4-0.20250319132907-e064f32e3674
+github.com/grandcat/zeroconf v1.0.0
+github.com/hashicorp/consul/api v1.22.0
+github.com/hashicorp/memberlist v0.5.4
+github.com/maximhq/bifrost/core v1.4.14
+github.com/maximhq/bifrost/framework v1.2.33
+github.com/maximhq/bifrost/plugins/governance v1.4.33
+github.com/maximhq/bifrost/plugins/logging v1.4.33
+github.com/maximhq/bifrost/transports v1.4.17
+github.com/nakabonne/tstorage v0.3.6
+github.com/stretchr/testify v1.11.1
+github.com/testcontainers/testcontainers-go v0.40.0
+github.com/tetratelabs/wazero v1.11.0
+github.com/valyala/fasthttp v1.68.0
+go.etcd.io/etcd/client/v3 v3.6.6
+golang.org/x/crypto v0.49.0
+golang.org/x/oauth2 v0.35.0
+google.golang.org/api v0.265.0
+google.golang.org/protobuf v1.36.11
+gorm.io/driver/sqlite v1.6.0
+gorm.io/gorm v1.31.1
+k8s.io/api v0.34.1
+k8s.io/apimachinery v0.34.1
+k8s.io/client-go v0.34.1
+```
+
+</Update>
--- a/docs/changelogs/ent-v1.3.14.mdx
+++ b/docs/changelogs/ent-v1.3.14.mdx
@@ -0,0 +1,46 @@
+---
+title: "v1.3.14"
+description: "v1.3.14 changelog"
+---
+
+<Update label="Bifrost Enterprise" description="v1.3.14">
+
+## Changelog
+
+This release adds support for Claude Office Suite (Excel add-on), calendar-aligned billing, and ANTHROPIC_AUTH_TOKEN authentication. It also includes Anthropic streaming usage and cache token fixes, CORS wildcard header handling, and enterprise-side improvements for secrets management and new customer onboarding.
+
+## ✨ Features
+
+- **Claude Office Suite Support** — Added support for the Claude Office Suite Excel add-on, including fixes for proper integration
+- **Calendar-Aligned Billing** — Added calendar alignment feature for billing periods with supporting migration
+- **ANTHROPIC_AUTH_TOKEN Support** — Added support for `ANTHROPIC_AUTH_TOKEN` as an authentication method
+- **URL-Based Log Selection** — Added URL-based log selection with keyboard navigation and cross-page browsing in the dashboard
+- **Secrets Limit Workaround** — Added ability to circumvent the 100 secrets limit in CI/CD pipelines
+- **Manual Image Overrides** — Added manual overrides for container images in deployment configurations
+- **New Customer Environments** — Onboarded Beckhoff, Dish, and Technarts with full Terraform and Dockerfile configurations
+
+## 🐞 Fixed
+
+- **Anthropic Streaming Usage** — Fixed usage reporting for Anthropic streaming responses
+- **Anthropic Cache Token Reporting** — Fixed cache token reporting for Anthropic provider
+- **Semantic Cache count_tokens** — Skipped unsupported `count_tokens` requests in semantic cache plugin
+- **CORS Wildcard Headers** — Fixed wildcard (`*`) allowed headers handling for CORS
+- **Greptile Integration** — Fixed issues with Greptile integration
+- **Dashboard Style Fixes** — Refined dashboard page styling and layout
+- **Ada Token Expiry** — Increased Ada environment token expiry duration
+
+## 📀 Base OSS version
+
+`transports/v1.4.18-0.20260327163039-277421844123`
+
+## 🔌 If you are compiling a plugin against this release, use the following deps
+
+```
+go get github.com/maximhq/bifrost/core@2774218441230eef858636ebe3b70552fb575a93
+go get github.com/maximhq/bifrost/framework@2774218441230eef858636ebe3b70552fb575a93
+go get github.com/maximhq/bifrost/plugins/governance@2774218441230eef858636ebe3b70552fb575a93
+go get github.com/maximhq/bifrost/plugins/logging@2774218441230eef858636ebe3b70552fb575a93
+go get github.com/maximhq/bifrost/transports@2774218441230eef858636ebe3b70552fb575a93
+```
+
+</Update>
--- a/docs/changelogs/ent-v1.3.15.mdx
+++ b/docs/changelogs/ent-v1.3.15.mdx
@@ -0,0 +1,358 @@
+---
+title: "v1.3.15"
+description: "v1.3.15 changelog"
+---
+
+<Update label="Bifrost Enterprise" description="v1.3.15">
+
+## Changelog
+
+This release pins Bifrost OSS dependencies to stable release tags (`transports/v1.4.18`) and adds calendar-aligned budgets, along with numerous streaming and caching fixes.
+
+## ✨ Features
+
+- **Calendar-Aligned Budgets** — Added calendar alignment support for budget periods in governance
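Calendar alignment is read here as follows: a budget period starts on a calendar boundary (for example, the first day of the month) instead of at the moment the budget was created. That reading is an assumption, and the sketch below is not Bifrost's governance code. `calendarMonthWindow` and `utcDay` are hypothetical helper names used only for illustration.

```go
package main

import (
	"fmt"
	"time"
)

// utcDay is a hypothetical helper that builds midnight UTC for a given date.
func utcDay(year int, month time.Month, day int) time.Time {
	return time.Date(year, month, day, 0, 0, 0, 0, time.UTC)
}

// calendarMonthWindow returns the start (inclusive) and end (exclusive) of the
// calendar month that contains t, evaluated in t's own location.
// Assumption: a "calendar-aligned" monthly budget resets at these boundaries.
func calendarMonthWindow(t time.Time) (time.Time, time.Time) {
	start := time.Date(t.Year(), t.Month(), 1, 0, 0, 0, 0, t.Location())
	// Adding one month to the first of a month never overflows into a later month.
	return start, start.AddDate(0, 1, 0)
}

func main() {
	start, end := calendarMonthWindow(utcDay(2026, 3, 17))
	fmt.Println(start.Format(time.RFC3339), end.Format(time.RFC3339))
}
```

Under this reading, usage recorded at any instant in `[start, end)` counts against the same period, regardless of when the budget was created.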
+
+## 🐞 Fixed
+
+- **SSE Error Events** — Handled SSE error events for 429 rate-limit and other error status codes during streaming
+- **Anthropic Max Tokens** — Picked max tokens for Anthropic from the model params cache instead of hardcoded values
+- **Anthropic Streaming Usage** — Fixed usage token reporting for Anthropic streaming responses
+- **Anthropic Cache Tokens** — Fixed Anthropic cache token reporting in non-streaming responses
+- **Embedding Precision** — Preserved provider precision in embedding responses instead of truncating float values
+- **Provider Caching** — Removed pending marshal-to-map to fix caching issues at provider level
+- **Claude Office Suite** — Fixed model routing for the Claude Office Suite add-on
+- **Semantic Cache Config** — Hardened direct-only config handling and aligned UI types for semantic cache
+- **Semantic Cache Count Tokens** — Skipped unsupported `count_tokens` requests in the semantic cache plugin
+- **Telemetry Events** — Removed reason field from telemetry events
+- **CORS Headers** — Fixed wildcard allowed headers for CORS
+- **UI Routing Display** — The UI now shows the selected virtual key and routing rule
+
+## 📀 Base OSS version
+
+`transports/v1.4.18`
+
+## 🔌 If you are compiling a plugin against this release, use the following deps
+
+```
+module github.com/maximhq/bifrost-enterprise
+
+go 1.26.1
+
+require (
+	cloud.google.com/go/bigquery v1.73.1
+	github.com/DataDog/datadog-go/v5 v5.6.0
+	github.com/DataDog/dd-trace-go/v2 v2.4.0
+	github.com/aws/aws-sdk-go-v2/config v1.32.11
+	github.com/aws/aws-sdk-go-v2/credentials v1.19.11
+	github.com/aws/aws-sdk-go-v2/service/bedrockruntime v1.50.1
+	github.com/bytedance/sonic v1.15.0
+	github.com/coreos/go-oidc/v3 v3.12.0
+	github.com/fasthttp/router v1.5.4
+	github.com/golang-jwt/jwt/v5 v5.3.0
+	github.com/google/cel-go v0.26.1
+	github.com/google/uuid v1.6.0
+	github.com/gorilla/websocket v1.5.4-0.20250319132907-e064f32e3674
+	github.com/grandcat/zeroconf v1.0.0
+	github.com/hashicorp/consul/api v1.22.0
+	github.com/hashicorp/memberlist v0.5.4
+	github.com/maximhq/bifrost/core v1.4.15
+	github.com/maximhq/bifrost/framework v1.2.34
+	github.com/maximhq/bifrost/plugins/governance v1.4.34
+	github.com/maximhq/bifrost/plugins/logging v1.4.34
+	github.com/maximhq/bifrost/transports v1.4.18
+	github.com/nakabonne/tstorage v0.3.6
+	github.com/stretchr/testify v1.11.1
+	github.com/testcontainers/testcontainers-go v0.40.0
+	github.com/tetratelabs/wazero v1.11.0
+	github.com/valyala/fasthttp v1.68.0
+	go.etcd.io/etcd/client/v3 v3.6.6
+	golang.org/x/crypto v0.49.0
+	golang.org/x/oauth2 v0.35.0
+	google.golang.org/api v0.265.0
+	google.golang.org/protobuf v1.36.11
+	gorm.io/driver/sqlite v1.6.0
+	gorm.io/gorm v1.31.1
+	k8s.io/api v0.34.1
+	k8s.io/apimachinery v0.34.1
+	k8s.io/client-go v0.34.1
+)
+
+require (
+	cel.dev/expr v0.25.1 // indirect
+	cloud.google.com/go v0.123.0 // indirect
+	cloud.google.com/go/auth v0.18.1 // indirect
+	cloud.google.com/go/auth/oauth2adapt v0.2.8 // indirect
+	cloud.google.com/go/compute/metadata v0.9.0 // indirect
+	cloud.google.com/go/iam v1.5.3 // indirect
+	dario.cat/mergo v1.0.2 // indirect
+	github.com/Azure/azure-sdk-for-go/sdk/azcore v1.20.0 // indirect
+	github.com/Azure/azure-sdk-for-go/sdk/azidentity v1.13.1 // indirect
+	github.com/Azure/azure-sdk-for-go/sdk/internal v1.11.2 // indirect
+	github.com/Azure/go-ansiterm v0.0.0-20250102033503-faa5f7b0171c // indirect
+	github.com/AzureAD/microsoft-authentication-library-for-go v1.6.0 // indirect
+	github.com/DataDog/datadog-agent/comp/core/tagger/origindetection v0.71.0 // indirect
+	github.com/DataDog/datadog-agent/pkg/obfuscate v0.71.0 // indirect
+	github.com/DataDog/datadog-agent/pkg/opentelemetry-mapping-go/otlp/attributes v0.71.0 // indirect
+	github.com/DataDog/datadog-agent/pkg/proto v0.71.0 // indirect
+	github.com/DataDog/datadog-agent/pkg/remoteconfig/state v0.73.0-rc.1 // indirect
+	github.com/DataDog/datadog-agent/pkg/trace v0.71.0 // indirect
+	github.com/DataDog/datadog-agent/pkg/util/log v0.71.0 // indirect
+	github.com/DataDog/datadog-agent/pkg/util/scrubber v0.71.0 // indirect
+	github.com/DataDog/datadog-agent/pkg/version v0.71.0 // indirect
+	github.com/DataDog/go-libddwaf/v4 v4.6.1 // indirect
+	github.com/DataDog/go-runtime-metrics-internal v0.0.4-0.20250721125240-fdf1ef85b633 // indirect
+	github.com/DataDog/go-sqllexer v0.1.8 // indirect
+	github.com/DataDog/go-tuf v1.1.1-0.5.2 // indirect
+	github.com/DataDog/sketches-go v1.4.7 // indirect
+	github.com/Masterminds/semver/v3 v3.3.1 // indirect
+	github.com/Microsoft/go-winio v0.6.2 // indirect
+	github.com/andybalholm/brotli v1.2.0 // indirect
+	github.com/antlr4-go/antlr/v4 v4.13.0 // indirect
+	github.com/apache/arrow/go/v15 v15.0.2 // indirect
+	github.com/apapsch/go-jsonmerge/v2 v2.0.0 // indirect
+	github.com/armon/go-metrics v0.4.1 // indirect
+	github.com/aws/aws-sdk-go-v2 v1.41.3 // indirect
+	github.com/aws/aws-sdk-go-v2/aws/protocol/eventstream v1.7.6 // indirect
+	github.com/aws/aws-sdk-go-v2/feature/ec2/imds v1.18.19 // indirect
+	github.com/aws/aws-sdk-go-v2/internal/configsources v1.4.19 // indirect
+	github.com/aws/aws-sdk-go-v2/internal/endpoints/v2 v2.7.19 // indirect
+	github.com/aws/aws-sdk-go-v2/internal/ini v1.8.5 // indirect
+	github.com/aws/aws-sdk-go-v2/internal/v4a v1.4.16 // indirect
+	github.com/aws/aws-sdk-go-v2/service/internal/accept-encoding v1.13.6 // indirect
+	github.com/aws/aws-sdk-go-v2/service/internal/checksum v1.9.7 // indirect
+	github.com/aws/aws-sdk-go-v2/service/internal/presigned-url v1.13.19 // indirect
+	github.com/aws/aws-sdk-go-v2/service/internal/s3shared v1.19.16 // indirect
+	github.com/aws/aws-sdk-go-v2/service/s3 v1.94.0 // indirect
+	github.com/aws/aws-sdk-go-v2/service/signin v1.0.7 // indirect
+	github.com/aws/aws-sdk-go-v2/service/sso v1.30.12 // indirect
+	github.com/aws/aws-sdk-go-v2/service/ssooidc v1.35.16 // indirect
+	github.com/aws/aws-sdk-go-v2/service/sts v1.41.8 // indirect
+	github.com/aws/smithy-go v1.24.2 // indirect
+	github.com/bahlo/generic-list-go v0.2.0 // indirect
+	github.com/beorn7/perks v1.0.1 // indirect
+	github.com/buger/jsonparser v1.1.2 // indirect
+	github.com/bytedance/gopkg v0.1.3 // indirect
+	github.com/bytedance/sonic/loader v0.5.0 // indirect
+	github.com/cenkalti/backoff v2.2.1+incompatible // indirect
+	github.com/cenkalti/backoff/v4 v4.3.0 // indirect
+	github.com/cenkalti/backoff/v5 v5.0.3 // indirect
+	github.com/cespare/xxhash/v2 v2.3.0 // indirect
+	github.com/cihub/seelog v0.0.0-20170130134532-f561c5e57575 // indirect
+	github.com/cloudwego/base64x v0.1.6 // indirect
+	github.com/containerd/errdefs v1.0.0 // indirect
+	github.com/containerd/errdefs/pkg v0.3.0 // indirect
+	github.com/containerd/log v0.1.0 // indirect
+	github.com/containerd/platforms v0.2.1 // indirect
+	github.com/coreos/go-semver v0.3.1 // indirect
+	github.com/coreos/go-systemd/v22 v22.5.0 // indirect
+	github.com/cpuguy83/dockercfg v0.3.2 // indirect
+	github.com/davecgh/go-spew v1.1.2-0.20180830191138-d8f796af33cc // indirect
+	github.com/dgryski/go-rendezvous v0.0.0-20200823014737-9f7001d12a5f // indirect
+	github.com/distribution/reference v0.6.0 // indirect
+	github.com/docker/docker v28.5.2+incompatible // indirect
+	github.com/docker/go-connections v0.6.0 // indirect
+	github.com/docker/go-units v0.5.0 // indirect
+	github.com/dustin/go-humanize v1.0.1 // indirect
+	github.com/ebitengine/purego v0.9.1 // indirect
+	github.com/emicklei/go-restful/v3 v3.12.2 // indirect
+	github.com/fasthttp/websocket v1.5.12 // indirect
+	github.com/fatih/color v1.17.0 // indirect
+	github.com/felixge/httpsnoop v1.0.4 // indirect
+	github.com/fxamacker/cbor/v2 v2.9.0 // indirect
+	github.com/go-jose/go-jose/v4 v4.1.3 // indirect
+	github.com/go-logr/logr v1.4.3 // indirect
+	github.com/go-logr/stdr v1.2.2 // indirect
+	github.com/go-ole/go-ole v1.3.0 // indirect
+	github.com/go-openapi/analysis v0.24.2 // indirect
+	github.com/go-openapi/errors v0.22.5 // indirect
+	github.com/go-openapi/jsonpointer v0.22.4 // indirect
+	github.com/go-openapi/jsonreference v0.21.4 // indirect
+	github.com/go-openapi/loads v0.23.2 // indirect
+	github.com/go-openapi/runtime v0.29.2 // indirect
+	github.com/go-openapi/spec v0.22.2 // indirect
+	github.com/go-openapi/strfmt v0.25.0 // indirect
+	github.com/go-openapi/swag v0.25.4 // indirect
+	github.com/go-openapi/swag/cmdutils v0.25.4 // indirect
+	github.com/go-openapi/swag/conv v0.25.4 // indirect
+	github.com/go-openapi/swag/fileutils v0.25.4 // indirect
+	github.com/go-openapi/swag/jsonname v0.25.4 // indirect
+	github.com/go-openapi/swag/jsonutils v0.25.4 // indirect
+	github.com/go-openapi/swag/loading v0.25.4 // indirect
+	github.com/go-openapi/swag/mangling v0.25.4 // indirect
+	github.com/go-openapi/swag/netutils v0.25.4 // indirect
+	github.com/go-openapi/swag/stringutils v0.25.4 // indirect
+	github.com/go-openapi/swag/typeutils v0.25.4 // indirect
+	github.com/go-openapi/swag/yamlutils v0.25.4 // indirect
+	github.com/go-openapi/validate v0.25.1 // indirect
+	github.com/go-viper/mapstructure/v2 v2.4.0 // indirect
+	github.com/goccy/go-json v0.10.5 // indirect
+	github.com/gogo/protobuf v1.3.2 // indirect
+	github.com/golang/groupcache v0.0.0-20241129210726-2c02b8208cf8 // indirect
+	github.com/golang/protobuf v1.5.4 // indirect
+	github.com/google/btree v1.1.3 // indirect
+	github.com/google/flatbuffers v23.5.26+incompatible // indirect
+	github.com/google/gnostic-models v0.7.0 // indirect
+	github.com/google/pprof v0.0.0-20251213031049-b05bdaca462f // indirect
+	github.com/google/s2a-go v0.1.9 // indirect
+	github.com/googleapis/enterprise-certificate-proxy v0.3.11 // indirect
+	github.com/googleapis/gax-go/v2 v2.17.0 // indirect
+	github.com/grpc-ecosystem/grpc-gateway/v2 v2.27.7 // indirect
+	github.com/hashicorp/errwrap v1.1.0 // indirect
+	github.com/hashicorp/go-cleanhttp v0.5.2 // indirect
+	github.com/hashicorp/go-hclog v1.6.3 // indirect
+	github.com/hashicorp/go-immutable-radix v1.3.1 // indirect
+	github.com/hashicorp/go-metrics v0.5.4 // indirect
+	github.com/hashicorp/go-msgpack/v2 v2.1.5 // indirect
+	github.com/hashicorp/go-multierror v1.1.1 // indirect
+	github.com/hashicorp/go-rootcerts v1.0.2 // indirect
+	github.com/hashicorp/go-sockaddr v1.0.7 // indirect
+	github.com/hashicorp/go-version v1.7.0 // indirect
+	github.com/hashicorp/golang-lru v1.0.2 // indirect
+	github.com/hashicorp/serf v0.10.1 // indirect
+	github.com/invopop/jsonschema v0.13.0 // indirect
+	github.com/jackc/pgpassfile v1.0.0 // indirect
+	github.com/jackc/pgservicefile v0.0.0-20240606120523-5a60cdf6a761 // indirect
+	github.com/jackc/pgx/v5 v5.7.6 // indirect
+	github.com/jackc/puddle/v2 v2.2.2 // indirect
+	github.com/jaswdr/faker/v2 v2.8.0 // indirect
+	github.com/jinzhu/inflection v1.0.0 // indirect
+	github.com/jinzhu/now v1.1.5 // indirect
+	github.com/json-iterator/go v1.1.12 // indirect
+	github.com/klauspost/compress v1.18.2 // indirect
+	github.com/klauspost/cpuid/v2 v2.3.0 // indirect
+	github.com/kylelemons/godebug v1.1.0 // indirect
+	github.com/lufia/plan9stats v0.0.0-20251013123823-9fd1530e3ec3 // indirect
+	github.com/magiconair/properties v1.8.10 // indirect
+	github.com/mailru/easyjson v0.9.1 // indirect
+	github.com/mark3labs/mcp-go v0.43.2 // indirect
+	github.com/mattn/go-colorable v0.1.14 // indirect
+	github.com/mattn/go-isatty v0.0.20 // indirect
+	github.com/mattn/go-sqlite3 v1.14.32 // indirect
+	github.com/maximhq/bifrost/plugins/litellmcompat v0.0.23 // indirect
+	github.com/maximhq/bifrost/plugins/maxim v1.5.33 // indirect
+	github.com/maximhq/bifrost/plugins/mocker v1.4.33 // indirect
+	github.com/maximhq/bifrost/plugins/otel v1.1.33 // indirect
+	github.com/maximhq/bifrost/plugins/semanticcache v1.4.32 // indirect
+	github.com/maximhq/bifrost/plugins/telemetry v1.4.34 // indirect
+	github.com/maximhq/maxim-go v0.2.0 // indirect
+	github.com/miekg/dns v1.1.68 // indirect
+	github.com/minio/simdjson-go v0.4.5 // indirect
+	github.com/mitchellh/go-homedir v1.1.0 // indirect
+	github.com/mitchellh/mapstructure v1.5.0 // indirect
+	github.com/moby/docker-image-spec v1.3.1 // indirect
+	github.com/moby/go-archive v0.1.0 // indirect
+	github.com/moby/patternmatcher v0.6.0 // indirect
+	github.com/moby/sys/sequential v0.6.0 // indirect
+	github.com/moby/sys/user v0.4.0 // indirect
+	github.com/moby/sys/userns v0.1.0 // indirect
+	github.com/moby/term v0.5.2 // indirect
+	github.com/modern-go/concurrent v0.0.0-20180306012644-bacd9c7ef1dd // indirect
+	github.com/modern-go/reflect2 v1.0.3-0.20250322232337-35a7c28c31ee // indirect
+	github.com/morikuni/aec v1.0.0 // indirect
+	github.com/munnerz/goautoneg v0.0.0-20191010083416-a7dc8b61c822 // indirect
+	github.com/oapi-codegen/runtime v1.1.1 // indirect
+	github.com/oklog/ulid v1.3.1 // indirect
+	github.com/opencontainers/go-digest v1.0.0 // indirect
+	github.com/opencontainers/image-spec v1.1.1 // indirect
+	github.com/outcaste-io/ristretto v0.2.3 // indirect
+	github.com/philhofer/fwd v1.2.0 // indirect
+	github.com/pierrec/lz4/v4 v4.1.21 // indirect
+	github.com/pinecone-io/go-pinecone/v5 v5.3.0 // indirect
+	github.com/pkg/browser v0.0.0-20240102092130-5ac0b6a4141c // indirect
+	github.com/pkg/errors v0.9.1 // indirect
+	github.com/planetscale/vtprotobuf v0.6.1-0.20240319094008-0393e58bdf10 // indirect
+	github.com/pmezard/go-difflib v1.0.1-0.20181226105442-5d4384ee4fb2 // indirect
+	github.com/power-devops/perfstat v0.0.0-20240221224432-82ca36839d55 // indirect
+	github.com/prometheus/client_golang v1.23.2 // indirect
+	github.com/prometheus/client_model v0.6.2 // indirect
+	github.com/prometheus/common v0.66.1 // indirect
+	github.com/prometheus/procfs v0.17.0 // indirect
+	github.com/puzpuzpuz/xsync/v3 v3.5.1 // indirect
+	github.com/qdrant/go-client v1.16.2 // indirect
+	github.com/redis/go-redis/v9 v9.17.2 // indirect
+	github.com/rs/zerolog v1.34.0 // indirect
+	github.com/santhosh-tekuri/jsonschema/v6 v6.0.2 // indirect
+	github.com/savsgio/gotils v0.0.0-20250408102913-196191ec6287 // indirect
+	github.com/sean-/seed v0.0.0-20170313163322-e2103e2c3529 // indirect
+	github.com/secure-systems-lab/go-securesystemslib v0.9.0 // indirect
+	github.com/shirou/gopsutil/v4 v4.25.10 // indirect
+	github.com/sirupsen/logrus v1.9.4 // indirect
+	github.com/spf13/cast v1.10.0 // indirect
+	github.com/stoewer/go-strcase v1.3.0 // indirect
+	github.com/theckman/httpforwarded v0.4.0 // indirect
+	github.com/tidwall/gjson v1.18.0 // indirect
+	github.com/tidwall/match v1.1.1 // indirect
+	github.com/tidwall/pretty v1.2.0 // indirect
+	github.com/tidwall/sjson v1.2.5 // indirect
+	github.com/tinylib/msgp v1.3.0 // indirect
+	github.com/tklauser/go-sysconf v0.3.16 // indirect
+	github.com/tklauser/numcpus v0.11.0 // indirect
+	github.com/twitchyliquid64/golang-asm v0.15.1 // indirect
+	github.com/valyala/bytebufferpool v1.0.0 // indirect
+	github.com/weaviate/weaviate v1.36.5 // indirect
+	github.com/weaviate/weaviate-go-client/v5 v5.7.1 // indirect
+	github.com/wk8/go-ordered-map/v2 v2.1.8 // indirect
+	github.com/x448/float16 v0.8.4 // indirect
+	github.com/yosida95/uritemplate/v3 v3.0.2 // indirect
+	github.com/yusufpapurcu/wmi v1.2.4 // indirect
+	github.com/zeebo/xxh3 v1.0.2 // indirect
+	go.etcd.io/etcd/api/v3 v3.6.6 // indirect
+	go.etcd.io/etcd/client/pkg/v3 v3.6.6 // indirect
+	go.mongodb.org/mongo-driver v1.17.6 // indirect
+	go.opencensus.io v0.24.0 // indirect
+	go.opentelemetry.io/auto/sdk v1.2.1 // indirect
+	go.opentelemetry.io/collector/component v1.39.0 // indirect
+	go.opentelemetry.io/collector/featuregate v1.39.0 // indirect
+	go.opentelemetry.io/collector/internal/telemetry v0.133.0 // indirect
+	go.opentelemetry.io/collector/pdata v1.39.0 // indirect
+	go.opentelemetry.io/contrib/bridges/otelzap v0.12.0 // indirect
+	go.opentelemetry.io/contrib/instrumentation/google.golang.org/grpc/otelgrpc v0.63.0 // indirect
+	go.opentelemetry.io/contrib/instrumentation/net/http/otelhttp v0.63.0 // indirect
+	go.opentelemetry.io/otel v1.40.0 // indirect
+	go.opentelemetry.io/otel/exporters/otlp/otlpmetric/otlpmetricgrpc v1.40.0 // indirect
+	go.opentelemetry.io/otel/exporters/otlp/otlpmetric/otlpmetrichttp v1.40.0 // indirect
+	go.opentelemetry.io/otel/log v0.14.0 // indirect
+	go.opentelemetry.io/otel/metric v1.40.0 // indirect
+	go.opentelemetry.io/otel/sdk v1.40.0 // indirect
+	go.opentelemetry.io/otel/sdk/metric v1.40.0 // indirect
+	go.opentelemetry.io/otel/trace v1.40.0 // indirect
+	go.opentelemetry.io/proto/otlp v1.9.0 // indirect
+	go.starlark.net v0.0.0-20260102030733-3fee463870c9 // indirect
+	go.uber.org/atomic v1.11.0 // indirect
+	go.uber.org/multierr v1.11.0 // indirect
+	go.uber.org/zap v1.27.0 // indirect
+	go.yaml.in/yaml/v2 v2.4.2 // indirect
+	go.yaml.in/yaml/v3 v3.0.4 // indirect
+	golang.org/x/arch v0.23.0 // indirect
+	golang.org/x/exp v0.0.0-20251113190631-e25ba8c21ef6 // indirect
+	golang.org/x/mod v0.33.0 // indirect
+	golang.org/x/net v0.52.0 // indirect
+	golang.org/x/sync v0.20.0 // indirect
+	golang.org/x/sys v0.42.0 // indirect
+	golang.org/x/telemetry v0.0.0-20260209163413-e7419c687ee4 // indirect
+	golang.org/x/term v0.41.0 // indirect
+	golang.org/x/text v0.35.0 // indirect
+	golang.org/x/time v0.14.0 // indirect
+	golang.org/x/tools v0.42.0 // indirect
+	golang.org/x/xerrors v0.0.0-20240903120638-7835f813f4da // indirect
+	google.golang.org/genproto v0.0.0-20260128011058-8636f8732409 // indirect
+	google.golang.org/genproto/googleapis/api v0.0.0-20260203192932-546029d2fa20 // indirect
+	google.golang.org/genproto/googleapis/rpc v0.0.0-20260319201613-d00831a3d3e7 // indirect
+	google.golang.org/grpc v1.79.3 // indirect
+	gopkg.in/evanphx/json-patch.v4 v4.12.0 // indirect
+	gopkg.in/inf.v0 v0.9.1 // indirect
+	gopkg.in/ini.v1 v1.67.0 // indirect
+	gopkg.in/yaml.v3 v3.0.1 // indirect
+	gorm.io/driver/postgres v1.6.0 // indirect
+	k8s.io/klog/v2 v2.130.1 // indirect
+	k8s.io/kube-openapi v0.0.0-20250710124328-f3f2b991d03b // indirect
+	k8s.io/utils v0.0.0-20250604170112-4c0f3b243397 // indirect
+	sigs.k8s.io/json v0.0.0-20241014173422-cfa47c3a1cc8 // indirect
+	sigs.k8s.io/randfill v1.0.0 // indirect
+	sigs.k8s.io/structured-merge-diff/v6 v6.3.0 // indirect
+	sigs.k8s.io/yaml v1.6.0 // indirect
+)
+```
+
+</Update>
--- a/docs/changelogs/ent-v1.3.16.mdx
+++ b/docs/changelogs/ent-v1.3.16.mdx
@@ -0,0 +1,80 @@
+---
+title: "v1.3.16"
+description: "v1.3.16 changelog"
+---
+
+<Update label="Bifrost Enterprise" description="v1.3.16">
+
+## Changelog
+
+This release adds a Model Details API endpoint, Anthropic beta headers support, and includes fixes for reasoning content handling, timeout status codes, and cross-provider caching.
+
+## ✨ Features
+
+- **Model Details API** — Added `/api/models/details` endpoint for querying model capability metadata
+- **Anthropic Beta Headers** — Support for Anthropic beta feature headers in requests
+
+
+## 🐞 Fixed
+
+- **Reasoning Content Leak** — Prevented reasoning text from leaking into Gemini response content
+- **Timeout Status Code** — Fixed timeout status code handling across all providers
+- **Cross-Provider Cache** — Preserved cached provider metadata on cross-provider cache hits
+- **Governance Virtual Keys** — Populated customer virtual_keys in governance APIs
+- **List Models Integration** — Removed default provider override on list models request in integrations
+- **Client Settings Headers** — Fixed the Client settings UI to accept `*` as allowed headers
+- **SCIM API Key Auth** — Clarified API key authentication flow in SCIM middleware to skip redundant validation
+
+## 📀 Base OSS version
+
+`transports/v1.4.19`
+
+## 🔌 If you are compiling a plugin against this release, use the following deps
+
+```
+module github.com/maximhq/bifrost-enterprise
+
+go 1.26.1
+
+require (
+	cloud.google.com/go/bigquery v1.73.1
+	github.com/DataDog/datadog-go/v5 v5.6.0
+	github.com/DataDog/dd-trace-go/v2 v2.4.0
+	github.com/aws/aws-sdk-go-v2/config v1.32.11
+	github.com/aws/aws-sdk-go-v2/credentials v1.19.11
+	github.com/aws/aws-sdk-go-v2/service/bedrockruntime v1.50.1
+	github.com/bytedance/sonic v1.15.0
+	github.com/coreos/go-oidc/v3 v3.12.0
+	github.com/fasthttp/router v1.5.4
+	github.com/golang-jwt/jwt/v5 v5.3.0
+	github.com/google/cel-go v0.26.1
+	github.com/google/uuid v1.6.0
+	github.com/gorilla/websocket v1.5.4-0.20250319132907-e064f32e3674
+	github.com/grandcat/zeroconf v1.0.0
+	github.com/hashicorp/consul/api v1.22.0
+	github.com/hashicorp/memberlist v0.5.4
+	github.com/maximhq/bifrost/core v1.4.16
+	github.com/maximhq/bifrost/framework v1.2.35
+	github.com/maximhq/bifrost/plugins/governance v1.4.35
+	github.com/maximhq/bifrost/plugins/logging v1.4.35
+	github.com/maximhq/bifrost/transports v1.4.19
+	github.com/nakabonne/tstorage v0.3.6
+	github.com/stretchr/testify v1.11.1
+	github.com/testcontainers/testcontainers-go v0.40.0
+	github.com/tetratelabs/wazero v1.11.0
+	github.com/valyala/fasthttp v1.68.0
+	go.etcd.io/etcd/client/v3 v3.6.6
+	golang.org/x/crypto v0.49.0
+	golang.org/x/oauth2 v0.35.0
+	google.golang.org/api v0.265.0
+	google.golang.org/protobuf v1.36.11
+	gorm.io/driver/sqlite v1.6.0
+	gorm.io/gorm v1.31.1
+	k8s.io/api v0.34.1
+	k8s.io/apimachinery v0.34.1
+	k8s.io/client-go v0.34.1
+)
+```
+
+
+</Update>
--- a/docs/changelogs/ent-v1.3.17.mdx
+++ b/docs/changelogs/ent-v1.3.17.mdx
@@ -0,0 +1,88 @@
+---
+title: "v1.3.17"
+description: "Enterprise v1.3.17 changelog"
+---
+
+<Update label="Bifrost Enterprise" description="v1.3.17">
+
+## Changelog
+
+This release introduces model blacklisting in load balancing, Fireworks AI provider support, cluster stability improvements with unique node IDs and leader visibility, and numerous OSS fixes including Bedrock streaming retries and Gemini thinking budget validation.
+
+## ✨ Features
+
+- **Model Blacklisting for Load Balancer** — Added ability to exclude specific models from provider selection in the load balancing plugin, with support for per-key blacklists, block-all (`["*"]`) wildcards, and provider-level intersection logic
+- **Fireworks AI Provider** — Added Fireworks AI as a first-class provider in the OSS transport layer
+- **Unified Models API** — Unified `/api/models` and `/api/models/details` listing behavior
+- **Unique Cluster Node IDs** — Auto-generate a UUID for each node's NodeID on config load, ensuring distinct cluster node identifiers
+- **Leader Badge in Cluster View** — Display a "Leader" badge with crown icon in the cluster node table, with sorting by node name
+- **Server Bootstrap Timer** — Added server bootstrap timer for performance monitoring
+- **Security Path Whitelisting** — Allow path whitelisting from security config
+- **Large Payload Optimizations** — Updated config schema for large payload optimizations
+- **Virtual Keys Table** — Added sorting and CSV export to virtual keys table
+
+## 🐞 Fixed
+
+- **Leader Election Interval** — Increased leader election check interval to 10 seconds for improved cluster stability
+- **Node ID Consistency** — Minor fixes for node ID consistency across cluster operations
+- **ECR Cross-Account Access** — Fixed IAM role ARN format for ECR pull principals and cleaned up unused AWS provider config
+- **Bedrock Streaming Retries** — Added retries for retryable AWS exceptions and stale/closed-connection errors
+- **Gemini Thinking Budget** — Fixed thinking budget validation for Gemini models
+- **Integration Data Race** — Fixed race condition in data reading from fasthttp request for integrations
+- **Beta Headers** — Fixed case-insensitive lookup in merge beta headers
+- **Deprecated Config Field** — Replaced `enforce_governance_header` with `enforce_auth_on_inference`
+- **Bedrock Config Schema** — Fixed config schema for Bedrock key config
+- **OpenAI Codex** — Fixed store flag for OpenAI Codex
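The beta-header fix in the list above concerns case-insensitive lookup while merging. The sketch below shows one way such a merge can work. It is an illustration under assumptions, not Bifrost's implementation: `lookupFold` and `mergeBetaValues` are hypothetical names, the header is assumed to carry a comma-separated list, and the feature values are placeholders.

```go
package main

import (
	"fmt"
	"strings"
)

// lookupFold returns the value stored under name, ignoring case in the key.
// If the map holds several keys that differ only by case, which one wins is
// unspecified because Go map iteration order is random.
func lookupFold(headers map[string]string, name string) (string, bool) {
	for k, v := range headers {
		if strings.EqualFold(k, name) {
			return v, true
		}
	}
	return "", false
}

// mergeBetaValues combines the existing comma-separated header value with
// extra entries, dropping blanks and duplicates while keeping first-seen order.
func mergeBetaValues(headers map[string]string, name string, extra []string) string {
	existing, _ := lookupFold(headers, name)
	seen := make(map[string]bool)
	var merged []string
	add := func(v string) {
		v = strings.TrimSpace(v)
		if v == "" || seen[v] {
			return
		}
		seen[v] = true
		merged = append(merged, v)
	}
	for _, v := range strings.Split(existing, ",") {
		add(v)
	}
	for _, v := range extra {
		add(v)
	}
	return strings.Join(merged, ",")
}

func main() {
	// The key is stored with mixed case but looked up in lower case.
	headers := map[string]string{"Anthropic-Beta": "feature-a, feature-b"}
	fmt.Println(mergeBetaValues(headers, "anthropic-beta", []string{"feature-b", "feature-c"}))
}
```

A plain `headers["anthropic-beta"]` lookup would miss the mixed-case key and silently drop the existing values, which is the kind of bug a case-insensitive lookup avoids.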
+
+## 📀 Base OSS version
+
+`transports/v1.4.20`
+
+## 🔌 If you are compiling a plugin against this release, use the following deps
+
+```
+module github.com/maximhq/bifrost-enterprise
+
+go 1.26.1
+
+require (
+	cloud.google.com/go/bigquery v1.73.1
+	github.com/DataDog/datadog-go/v5 v5.6.0
+	github.com/DataDog/dd-trace-go/v2 v2.4.0
+	github.com/aws/aws-sdk-go-v2/config v1.32.11
+	github.com/aws/aws-sdk-go-v2/credentials v1.19.11
+	github.com/aws/aws-sdk-go-v2/service/bedrockruntime v1.50.1
+	github.com/bytedance/sonic v1.15.0
+	github.com/coreos/go-oidc/v3 v3.12.0
+	github.com/fasthttp/router v1.5.4
+	github.com/golang-jwt/jwt/v5 v5.3.0
+	github.com/google/cel-go v0.26.1
+	github.com/google/uuid v1.6.0
+	github.com/gorilla/websocket v1.5.4-0.20250319132907-e064f32e3674
+	github.com/grandcat/zeroconf v1.0.0
+	github.com/hashicorp/consul/api v1.22.0
+	github.com/hashicorp/memberlist v0.5.4
+	github.com/maximhq/bifrost/core v1.4.17
+	github.com/maximhq/bifrost/framework v1.2.36
+	github.com/maximhq/bifrost/plugins/governance v1.4.36
+	github.com/maximhq/bifrost/plugins/logging v1.4.36
+	github.com/maximhq/bifrost/transports v1.4.20
+	github.com/nakabonne/tstorage v0.3.6
+	github.com/stretchr/testify v1.11.1
+	github.com/testcontainers/testcontainers-go v0.40.0
+	github.com/tetratelabs/wazero v1.11.0
+	github.com/valyala/fasthttp v1.68.0
+	go.etcd.io/etcd/client/v3 v3.6.6
+	golang.org/x/crypto v0.49.0
+	golang.org/x/oauth2 v0.35.0
+	google.golang.org/api v0.265.0
+	google.golang.org/protobuf v1.36.11
+	gorm.io/driver/sqlite v1.6.0
+	gorm.io/gorm v1.31.1
+	k8s.io/api v0.34.1
+	k8s.io/apimachinery v0.34.1
+	k8s.io/client-go v0.34.1
+)
+```
+
+</Update>
--- a/docs/changelogs/ent-v1.3.8.mdx
+++ b/docs/changelogs/ent-v1.3.8.mdx
@@ -0,0 +1,58 @@
+---
+title: "v1.3.8"
+description: "v1.3.8 changelog"
+---
+
+<Update label="Bifrost Enterprise" description="v1.3.8">
+
+This release upgrades the base OSS version from v1.4.10 to v1.4.11, bringing Anthropic cache control for tool calls, Helm graceful shutdown improvements, Codex compatibility fixes, and numerous streaming/serialization bug fixes. On the enterprise side, Gray Swan guardrails now support custom base URLs.
+
+This build also upgrades to Go 1.26.1, which fixes CVE-2026-25679, CVE-2026-27137, CVE-2026-27138, CVE-2026-27139, and CVE-2026-27142.
+
+### ✨ Features
+
+- Gray Swan Custom Base URL — Added support for custom base URLs in Gray Swan guardrails configuration
+- Anthropic Cache Control for Tool Calls — Added cache-control support for Anthropic tool calls
+- Helm Graceful Shutdown — Added graceful shutdown and HPA stabilization for streaming connections
+- Logstore Sonic Serialization — Replaced encoding/json with sonic for logstore serialization, improving performance
+- Maxim Attachments — Added attachment support to Maxim plugin
+
+### 🚨 Breaking changes
+
+Following our recent penetration testing, we have updated the configuration for open endpoints.
+
+1. The `/metrics` endpoint is now protected behind auth. Create an API key, add the Metrics scope to it, and configure your scraper to send the key in the `Authorization` header as `Bearer <api_key>`.
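The scraper change described above can be sketched as a small Go client. This is a minimal illustration, assuming the key is sent as a standard `Authorization: Bearer <api_key>` header. `scrapeMetrics` and `newFakeMetricsServer` are hypothetical names, and the fake server only stands in for a Bifrost instance so the example is self-contained.

```go
package main

import (
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
)

// scrapeMetrics fetches url and sends apiKey as a bearer token in the
// Authorization header. It returns the HTTP status code and response body.
func scrapeMetrics(client *http.Client, url, apiKey string) (int, string, error) {
	req, err := http.NewRequest(http.MethodGet, url, nil)
	if err != nil {
		return 0, "", err
	}
	req.Header.Set("Authorization", "Bearer "+apiKey)

	resp, err := client.Do(req)
	if err != nil {
		return 0, "", err
	}
	defer resp.Body.Close()

	body, err := io.ReadAll(resp.Body)
	if err != nil {
		return resp.StatusCode, "", err
	}
	return resp.StatusCode, string(body), nil
}

// newFakeMetricsServer is a stand-in for the real server: it rejects any
// request whose Authorization header does not carry the expected key.
func newFakeMetricsServer(validKey string) *httptest.Server {
	return httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		if r.Header.Get("Authorization") != "Bearer "+validKey {
			http.Error(w, "unauthorized", http.StatusUnauthorized)
			return
		}
		fmt.Fprintln(w, "# metrics payload placeholder")
	}))
}

func main() {
	srv := newFakeMetricsServer("example-key")
	defer srv.Close()

	status, _, err := scrapeMetrics(srv.Client(), srv.URL+"/metrics", "example-key")
	fmt.Println(status, err)
}
```

Most metrics scrapers expose an equivalent setting for a bearer token or custom authorization header, so the same header can usually be supplied through scraper configuration instead of code.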
+
+### 🐞 Fixed
+
+- Codex Compatibility — Fixed fallback handling and request decompression for Codex compatibility
+- Anthropic SSE Streaming — Use NewSSEScanner for Responses API streaming
+- Audio Filename Preservation — Preserve original audio filename in transcription requests
+- Proxy Override — Fixed proxy override handling
+- Raw Request Serialization — Fixed raw request serialization in SSE events
+- Key List Models — Fixed key list models serialization
+- Async Job Recovery — Fixed async jobs getting stuck in "processing" on marshal failure; they now correctly transition to "failed"
+- Valkey/Redis Vector Store — Improved Valkey Search compatibility and correctness in Redis vector store
+- Semanticcache Nil Check — Added nil check on message Content before accessing fields
+- Dashboard Overflow — Resolved dashboard and provider config overflow regressions
+- Config Schema Alignment — Fixed config schema and added test to verify Go model alignment
+- Key Selection Panic — Prevent panic in key selection when all keys have zero weight
+- Security Patches — Applied security patches including default Anthropic error type fix
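The key selection panic fixed above is typical of weighted selection code that divides by, or samples against, a total weight of zero. The sketch below shows one way to guard that case. It is an assumption-laden illustration, not Bifrost's code: `weightedKey` and `pickKey` are hypothetical names, and falling back to a uniform pick when all weights are zero is a design choice made for this example only.

```go
package main

import (
	"errors"
	"fmt"
	"math/rand"
)

// weightedKey is a hypothetical stand-in for a provider key with a weight.
type weightedKey struct {
	ID     string
	Weight float64
}

var errNoKeys = errors.New("no keys available")

// pickKey selects a key with probability proportional to its weight.
// r must be in [0, 1), for example a value from rand.Float64().
// When every weight is zero it falls back to a uniform pick instead of
// sampling against a zero total.
func pickKey(keys []weightedKey, r float64) (weightedKey, error) {
	if len(keys) == 0 {
		return weightedKey{}, errNoKeys
	}
	total := 0.0
	for _, k := range keys {
		if k.Weight > 0 {
			total += k.Weight
		}
	}
	if total == 0 {
		idx := int(r * float64(len(keys)))
		if idx >= len(keys) {
			idx = len(keys) - 1
		}
		return keys[idx], nil
	}
	target := r * total
	last := -1
	for i, k := range keys {
		if k.Weight <= 0 {
			continue
		}
		last = i
		if target < k.Weight {
			return k, nil
		}
		target -= k.Weight
	}
	// Floating-point rounding can exhaust the loop; use the last weighted key.
	return keys[last], nil
}

func main() {
	keys := []weightedKey{{ID: "key-a"}, {ID: "key-b"}}
	k, err := pickKey(keys, rand.Float64())
	fmt.Println(k.ID, err)
}
```

Passing the random value in as `r` keeps the function deterministic for tests; returning an error for the all-zero case would be an equally reasonable policy.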
+
+### 📀 Base OSS version
+
+```
+transports/v1.4.12-0.20260306144022-5ac7c2732345
+```
+
+### 🔌 If you are compiling a plugin against this release, use the following deps
+
+```
+github.com/maximhq/bifrost/core v1.4.8-0.20260306144022-5ac7c2732345
+github.com/maximhq/bifrost/framework v1.2.26
+github.com/maximhq/bifrost/plugins/governance v1.4.27
+github.com/maximhq/bifrost/plugins/logging v1.4.27
+github.com/maximhq/bifrost/transports v1.4.12-0.20260306144022-5ac7c2732345
+```
+
+</Update>
--- a/docs/changelogs/ent-v1.3.9.mdx
+++ b/docs/changelogs/ent-v1.3.9.mdx
@@ -0,0 +1,64 @@
+---
+title: "v1.3.9"
+description: "v1.3.9 changelog"
+---
+
+<Update label="Bifrost Enterprise" description="v1.3.9">
+
+## Changelog
+
+This release upgrades the base OSS version from v1.4.11 to v1.4.12, bringing a full-featured prompt repository with RBAC, large payload optimization, WebSocket-based responses API, Anthropic passthrough, session stickiness, and a unified pricing engine. On the enterprise side, this release adds KV store gossip protocol support, RBAC for the prompt repository, and build/deployment improvements.
+
+## ✨ Features
+
+- **Prompt Repository** — Full prompt management system with folders, prompts, versions, sessions, playground, versioning, deployment features, and Jinja2 variable support
+- **Prompt Repository RBAC** — Added role-based access control for prompt repository operations
+- **Large Payload Optimization** — End-to-end large payload support with streaming primitives, detection hooks, passthrough eligibility, provider support, plugin awareness, and enterprise settings UI
+- **WebSocket Responses API** — Added WebSocket transport for OpenAI responses API and realtime API support
+- **Anthropic Passthrough** — Added native Anthropic passthrough endpoint
+- **KV Store Gossip Protocol** — Added gossip-based KV store for distributed state synchronization
+- **Session Stickiness** — Added session stickiness in key selection for consistent routing
+- **Model Parameters API** — Added model parameters table and API endpoint with in-memory caching
+- **Virtual Key Limit Resets** — Added virtual key limit reset functionality
+- **Pricing Engine Refactor** — Unified cost calculation with quality-based image and video pricing
+- **Image Configuration** — Added size/aspect ratio config for Gemini and size-to-resolution conversion for Replicate
+- **Streaming Request Decompression** — Threshold-gated streaming decompression with pooled readers
+- **Raw Request/Response Storage** — Allow storing raw request/response without returning them to clients
+- **Weighted Routing Targets** — Added weighted routing targets for probabilistic routing rules with key selection support
+- **API Key Selection by ID** — Added API key selection by ID with priority over name selection
+- **TLS Configuration** — Added TLS configuration support for all providers and TLS termination inside Bifrost server
+- **K8s Deployment Workflow** — Added workflow to deploy Bifrost Enterprise to Maxim K8s cluster
+
+## 🐞 Fixed
+
+- **Deterministic Tool Schema** — Fixed deterministic tool schema serialization for Anthropic prompt caching
+- **CORS Wildcard** — Fixed CORS issue when allowing the `*` origin
+- **Bedrock toolChoice** — Fixed toolChoice silently dropped on Bedrock /converse and /converse-stream endpoints
+- **Count Tokens Passthrough** — Fixed request body passthrough for count tokens endpoint for Anthropic and Vertex
+- **Chat Finish Reason** — Map chat finish_reason to responses status and preserve terminal stream semantics
+- **Tool Call Indexes** — Fixed streaming tool call indices for parallel tool calls in chat completions stream
+- **Video Pricing** — Fixed video pricing calculation
+- **SQLite Migration** — Prevented CASCADE deletion during routing targets migration
+- **Log Serialization** — Reduced logstore serialization overhead and batch cost updates
+- **Log List Queries** — Avoid loading raw_request/raw_response in log list queries
+- **MCP Reconnection** — Improved MCP client reconnection with exponential backoff and connection timeout
+- **Create Manifest Flow** — Fixed create manifest flow
+- **Build Pipeline** — Fixed builds skipping latest changes
+- **BigQuery Import** — Fixed import for codeEditor in bigqueryFormFragment.tsx
+- **OSS Build Integration** — Support latest-main OSS build with go.mod replace directives
+
+## 📀 Base OSS version
+
+`transports/v1.4.12`
+
+## 🔌 If you are compiling a plugin against this release, use the following deps
+
+```
+github.com/maximhq/bifrost/core v1.4.8
+github.com/maximhq/bifrost/framework v1.2.27
+github.com/maximhq/bifrost/plugins/governance v1.4.28
+github.com/maximhq/bifrost/plugins/logging v1.4.28
+github.com/maximhq/bifrost/transports v1.4.12
+```
+
+</Update>
--- a/docs/changelogs/ent-v1.4.0-prerelease1.mdx
+++ b/docs/changelogs/ent-v1.4.0-prerelease1.mdx
@@ -0,0 +1,156 @@
+---
+title: "v1.4.0-prerelease1"
+description: "Enterprise v1.4.0-prerelease1"
+---
+
+<Update label="Bifrost Enterprise" description="v1.4.0-prerelease1">
+
+## Changelog
+
+This is a major release that introduces deny-by-default semantics across all allow-list fields (models, keys, tools, providers), a dedicated Provider Keys API, blacklist support in load balancing, redesigned adaptive routing UI, and scoped pricing overrides. **This release contains multiple breaking changes — please review the breaking changes section and migration checklist carefully before upgrading.**
+
+## ⚠️ Breaking Changes
+
+> **v1.5.0 OSS base flips the meaning of empty arrays across all allow-list fields.** Existing deployments with a database are protected by automatic migrations on startup, but any new configuration created after upgrading must follow the new semantics. **Back up your config store database before upgrading — this migration is not revertible.**
+
+| What you write | v1.4.x meaning | v1.5.0 meaning |
+|---|---|---|
+| `[]` (empty array) | Allow **all** | Allow **none** (deny by default) |
+| `["*"]` (wildcard) | N/A | Allow **all** |
+| `["a", "b"]` | Only a and b | Only a and b (unchanged) |
+
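The table above can be read as a single membership rule. The following is a minimal sketch of that rule in Go, not Bifrost's actual implementation; the function name `allows` is hypothetical and chosen only for illustration.

```go
package main

import "fmt"

// allows reports whether value is permitted by allowList under the
// v1.5.0 semantics described above: an empty (or nil) list denies
// everything, ["*"] allows everything, and any other list allows only
// its exact members. Hypothetical helper, not part of Bifrost's API.
func allows(allowList []string, value string) bool {
	for _, entry := range allowList {
		if entry == "*" || entry == value {
			return true
		}
	}
	return false
}

func main() {
	fmt.Println(allows([]string{}, "gpt-4o"))         // false: empty means deny all
	fmt.Println(allows([]string{"*"}, "gpt-4o"))      // true: wildcard allows all
	fmt.Println(allows([]string{"a", "b"}, "a"))      // true: listed value
	fmt.Println(allows([]string{"a", "b"}, "gpt-4o")) // false: not listed
}
```

Under the v1.4.x semantics the first call would have returned `true`, which is why configurations written with `[]` need to be reviewed before upgrading.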
+### 1. Provider Key `models` Field
+
+Empty `models` array now means "allow none" instead of "allow all". Use `["*"]` to allow a key to serve all models.
+
+### 2. Virtual Key `allowed_models` Field
+
+Missing or empty `allowed_models` on a VK provider config now blocks all models from that provider. Use `["*"]` to allow all.
+
+### 3. Virtual Key Provider Configs — Deny-by-Default
+
+Virtual Keys with empty or missing `provider_configs` now block all providers. Every VK must explicitly list its permitted providers.
+
+### 4. `allowed_keys` Renamed to `key_ids`
+
+Field renamed in VK provider configs. Same deny-by-default semantics — omitted or empty `key_ids` now blocks all keys. Use `["*"]` to allow all. **Note:** Unlike `allowed_models`, there is no automatic migration for `key_ids`.
+
+### 5. Virtual Key MCP `tools_to_execute` Field
+
+Empty `tools_to_execute` now blocks all tools. The `mcp_configs` list itself acts as a strict allow-list — no `mcp_configs` means all MCP tools are blocked for that VK.
+
+### 6. `weight` Field is Now Optional
+
+`weight` on VK provider configs is now nullable (`*float64`). `null` or omitted means the provider is excluded from weighted routing but still reachable via direct routing or fallbacks.
+
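For API consumers, a nullable `weight` means that "not set" and a numeric value have to be distinguished when decoding. A minimal sketch using the standard `encoding/json` package follows; the `providerConfig` struct, its `provider` field, and the `inWeightedRouting` helper are assumptions made for illustration and do not mirror Bifrost's real types.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// providerConfig is a hypothetical stand-in for a VK provider config.
// Only the nullable weight field is taken from the changelog text.
type providerConfig struct {
	Provider string   `json:"provider"`
	Weight   *float64 `json:"weight"`
}

// inWeightedRouting decodes raw JSON and reports whether a weight was
// supplied. A nil pointer covers both an omitted field and an explicit null.
func inWeightedRouting(raw string) (bool, error) {
	var cfg providerConfig
	if err := json.Unmarshal([]byte(raw), &cfg); err != nil {
		return false, err
	}
	return cfg.Weight != nil, nil
}

func main() {
	inputs := []string{
		`{"provider":"openai","weight":0.7}`,  // weight set
		`{"provider":"openai","weight":null}`, // explicit null
		`{"provider":"openai"}`,               // field omitted
	}
	for _, raw := range inputs {
		ok, err := inWeightedRouting(raw)
		fmt.Println(ok, err)
	}
}
```

A pointer type is the usual Go way to keep `null` separate from a numeric zero; a plain `float64` would collapse both into `0`.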
+### 7. Compat Plugin Configuration Changes
+
+- `enable_litellm_fallbacks` option **removed**
+- Replaced with: `compat.convert_text_to_chat`, `compat.convert_chat_to_responses`, `compat.should_drop_params`
+- Response field `extra_fields.litellm_compat` **removed**
+- New response fields: `extra_fields.dropped_compat_plugin_params`, `extra_fields.converted_request_type`
+
+### 8. Image Edits No Longer Supported on Replicate's Image Generation Endpoint
+
+`/v1/images/generations` on Replicate now only handles pure text-to-image generation. Image editing parameters must use `/v1/images/edits`. Note: `/v1/images/edits` on Replicate will also be removed in a follow-up release.
+
+### 9. Provider Keys API Separated from Provider API
+
+- `keys` field **removed** from provider create/update requests and responses
+- New dedicated endpoints: `GET/POST /api/providers/{provider}/keys`, `GET/PUT/DELETE /api/providers/{provider}/keys/{key_id}`
+- Create providers first, then add keys separately
+
+### New Validation: WhiteList Rules
+
+- Wildcard `["*"]` cannot be mixed with other values (HTTP 400)
+- No duplicate values allowed in allow-list fields
+- Applies to: `allowed_models`, `key_ids`, `models`, `tools_to_execute`, `tools_to_auto_execute`, `allowed_extra_headers`
+
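The two validation rules above can be checked before sending a config to the API. This is a minimal client-side sketch under the assumption that values are compared as exact strings; `validateAllowList` is a hypothetical name, and the server's own validation and error messages may differ.

```go
package main

import (
	"errors"
	"fmt"
)

// validateAllowList applies the two rules listed above to one allow-list
// field: no duplicate values, and "*" may only appear on its own.
// Hypothetical pre-flight check, not Bifrost's server-side validator.
func validateAllowList(values []string) error {
	seen := make(map[string]struct{}, len(values))
	for _, v := range values {
		if _, dup := seen[v]; dup {
			return fmt.Errorf("duplicate value %q", v)
		}
		seen[v] = struct{}{}
	}
	if _, hasWildcard := seen["*"]; hasWildcard && len(values) > 1 {
		return errors.New(`wildcard "*" cannot be mixed with other values`)
	}
	return nil
}

func main() {
	fmt.Println(validateAllowList([]string{"*"}))           // valid: allow all
	fmt.Println(validateAllowList([]string{}))              // valid: deny all
	fmt.Println(validateAllowList([]string{"*", "gpt-4o"})) // invalid: mixed wildcard
	fmt.Println(validateAllowList([]string{"a", "a"}))      // invalid: duplicate
}
```

An empty list passes validation here because it is a legal value that simply denies everything.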
+### Quick Migration Checklist
+
+1. Update provider key `models` in config.json — change `[]` to `["*"]`
+2. Add `allowed_models: ["*"]` to every VK provider config
+3. Ensure every VK has at least one provider config entry
+4. Rename `allowed_keys` to `key_ids` and set `["*"]` where needed
+5. Update `tools_to_execute` for MCP configs — change `[]` to `["*"]`
+6. Handle nullable `weight` in API consumers
+7. Fix any invalid WhiteList values (no mixing wildcards, no duplicates)
+8. Migrate key management to dedicated `/api/providers/{provider}/keys` endpoints
+
+## ✨ Features
+
+- **Dedicated Provider Keys API** — Keys are now managed via `/api/providers/{provider}/keys` endpoints instead of being embedded in provider create/update payloads
+- **Deny-by-Default Access Control** — Standardized empty array conventions across all allow-list fields; `[]` means deny all, `["*"]` means allow all
+- **VK Provider Config Key Wildcards** — `key_ids` now supports `["*"]` wildcard to allow all keys; handler resolves wildcard to AllowAllKeys flag without DB key lookups
+- **VK MCP Allow-List** — Virtual key MCP configs now act as an execution-time allow-list — tools not permitted by the VK are blocked at inference and MCP tool execution
+- **MCP Virtual Key Assignment** — MCP configuration now supports assigning virtual keys with per-tool access control, with an option to allow MCP clients to run on all virtual keys
+- **Disable Auto MCP Tool Injection** — Add option to disable automatic MCP tool injection per request
+- **MCP Request-Level Extra Headers** — Support for request-level extra headers in MCP tool execution
+- **MCP Gateway Filtering** — Support for `x-bf-mcp-include-clients` and `x-bf-mcp-include-tools` request headers to filter MCP tools/list response
+- **Scoped Pricing Overrides** — Support for pricing overrides at a scoped level
+- **StabilityAI on Bedrock** — Added StabilityAI provider support to Bedrock
+- **Plugin Trace Logging** — Plugins can now inject logs at trace level using `ctx.Log(schemas.LogLevelInfo, "Test log")`
+- **Blacklist Support in Load Balancing** — Added model blacklist support to the load balancing plugin
+- **Adaptive Routing UI Redesign** — Redesigned adaptive routing UI with improved layout and Sankey chart visualization
+- **Governance Refactor** — Governance module changes for improved structure
+- **Compat Plugin New Modes** — Chat-to-responses fallback and OpenAI-compatible parameter dropping modes added to compat plugin
+
+## 🐞 Fixed
+
+- **MCP Agent Usage Accumulation** — Fixed accumulated usage not being sent back in MCP agent mode
+- **OpenAI Transcription Formats** — Handle text, vtt, srt response formats in OpenAI transcription response
+- **HuggingFace Load Balancing** — Removed HuggingFace deployment handling from load balancing plugin
+- **Parallelized Model Listing** — Parallelized model listing for providers to speed up startup time
+
+## 📀 Base OSS version
+
+`transports/v1.5.0-prerelease1`
+
+## 🔌 If you are compiling a plugin against this release, use the following deps
+
+```
+module github.com/maximhq/bifrost-enterprise
+
+go 1.26.1
+
+require (
+	cloud.google.com/go/bigquery v1.73.1
+	github.com/DataDog/datadog-go/v5 v5.6.0
+	github.com/DataDog/dd-trace-go/v2 v2.4.0
+	github.com/aws/aws-sdk-go-v2/config v1.32.11
+	github.com/aws/aws-sdk-go-v2/credentials v1.19.11
+	github.com/aws/aws-sdk-go-v2/service/bedrockruntime v1.50.1
+	github.com/bytedance/sonic v1.15.0
+	github.com/coreos/go-oidc/v3 v3.12.0
+	github.com/fasthttp/router v1.5.4
+	github.com/golang-jwt/jwt/v5 v5.3.0
+	github.com/google/cel-go v0.26.1
+	github.com/google/uuid v1.6.0
+	github.com/gorilla/websocket v1.5.4-0.20250319132907-e064f32e3674
+	github.com/grandcat/zeroconf v1.0.0
+	github.com/hashicorp/consul/api v1.22.0
+	github.com/hashicorp/memberlist v0.5.4
+	github.com/maximhq/bifrost/core v1.5.0
+	github.com/maximhq/bifrost/framework v1.3.0
+	github.com/maximhq/bifrost/plugins/governance v1.5.0
+	github.com/maximhq/bifrost/plugins/logging v1.5.0
+	github.com/maximhq/bifrost/transports v1.5.0-prerelease1
+	github.com/nakabonne/tstorage v0.3.6
+	github.com/stretchr/testify v1.11.1
+	github.com/testcontainers/testcontainers-go v0.40.0
+	github.com/tetratelabs/wazero v1.11.0
+	github.com/valyala/fasthttp v1.68.0
+	go.etcd.io/etcd/client/v3 v3.6.6
+	golang.org/x/crypto v0.49.0
+	golang.org/x/oauth2 v0.35.0
+	google.golang.org/api v0.265.0
+	google.golang.org/protobuf v1.36.11
+	gorm.io/driver/sqlite v1.6.0
+	gorm.io/gorm v1.31.1
+	k8s.io/api v0.34.1
+	k8s.io/apimachinery v0.34.1
+	k8s.io/client-go v0.34.1
+)
+```
+
+</Update>
--- a/docs/changelogs/ent-v1.4.0-prerelease2.mdx
+++ b/docs/changelogs/ent-v1.4.0-prerelease2.mdx
@@ -0,0 +1,107 @@
+---
+title: "v1.4.0-prerelease2"
+description: "Enterprise v1.4.0-prerelease2"
+---
+
+<Update label="Bifrost Enterprise" description="v1.4.0-prerelease2">
+
+## Changelog
+
+This release introduces realtime (WebSocket/WebRTC) support, Fireworks AI as a new provider, a comprehensive SCIM provider expansion (Google Workspace, Keycloak, Zitadel, SailPoint), access profiles for fine-grained permission control, business units and teams for organizational hierarchy, a user ranking dashboard, and a guardrail verification flow.
+
+## ✨ Features
+
+- **Realtime Support** — WebSocket, WebRTC, and client secret handlers with session state management and transport context helpers for real-time streaming use cases
+- **Fireworks AI Provider** — Fireworks AI added as a first-class provider with native completions, responses, embeddings, and image generations
+- **Access Profiles** — Fine-grained permission control with access profiles for managing model access at team and business unit levels, including propagation dialogs and full CRUD UI
+- **SCIM Provider Expansion** — Added support for Google Workspace, Keycloak, Zitadel, and SailPoint identity providers with full SCIM provisioning, attribute mapping, and sync workflows
+- **Okta Custom Provider + Group Mapping** — Custom Okta provider configurations with attribute-to-role, team, and business unit mapping support
+- **Business Units & Teams** — New organizational hierarchy for managing users with business units, teams, sync dialogs, and detail sheets
+- **User Ranking Dashboard** — Dashboard for tracking and visualizing user activity and rankings
+- **Guardrail Verify Flow** — Verify guardrail configurations against providers (Azure, Bedrock, GraySwan) before deployment
+- **Per-User OAuth Consent** — Per-user OAuth consent flow with identity selection and MCP authentication
+- **Prompts Plugin** — New prompts plugin with direct key header resolver and selective message inclusion when committing prompt sessions
+- **Bedrock Embeddings & Image Gen** — Embeddings, image generation, edit, and variation support added to Bedrock provider
+- **Logging Tracking Fields** — Support for tracking userId, teamId, customerId, and businessUnitId in logging plugin
+- **Virtual Keys Export** — Sorting and CSV export added to virtual keys table
+- **Path Whitelisting** — Allow path whitelisting from security config
+- **Model Blacklist in Load Balancing** — Blacklist model support in the load balancing plugin to exclude specific models from routing
+- **Cluster Leader Badge** — Leader badge display added to cluster node view
+- **Server Bootstrap Timer** — Startup diagnostics with server bootstrap timer
+
+## 🐞 Fixed
+
+- **Traffic Distribution Label** — Added "last 10s" label to Traffic Distribution Sankey chart for clarity
+- **Node ID Consistency** — Generate unique node ID on config load with minor consistency fixes
+- **Leader Election Stability** — Increased leader election check interval to 10 seconds for improved stability
+- **Bedrock Tool Choice** — Fix bedrock tool choice conversion to auto
+- **Bedrock Streaming Retries** — Retry retryable AWS exceptions and stale/closed-connection errors in bedrock streaming
+- **Bedrock SigV4 Service** — Correct SigV4 service name for agent runtime rerank
+- **MCP Tool Logs** — Fix MCP tool logs not being captured correctly
+- **Routing Rule Targets** — Preserve routing rule targets for genai and bedrock paths
+- **Provider Budget Duplication** — Fix provider level multiline budget duplication issue
+- **Vertex Endpoint** — Fix vertex endpoint correction
+- **Gemini Thinking Budget** — Fix thinking budget validation for gemini models
+- **SQLite Migrations** — Fix SQLite migration connections, error handling, and disable foreign key checks during migration
+- **Tool Parameter Schemas** — Preserve explicit empty tool parameter schemas for openai passthrough
+- **List Models Output** — Include raw model ID in list-models output alongside aliases
+- **Config Schema** — Fix config schema for bedrock key config
+- **Data Race Fix** — Fix race in data reading from fasthttp request for integrations
+- **Model Listing** — Unify /api/models and /api/models/details listing behavior
+
+## 📀 Base OSS version
+
+`transports/v1.5.0-prerelease2`
+
+## 🔌 If you are compiling a plugin against this release, use the following deps
+
+```
+module github.com/maximhq/bifrost-enterprise
+
+go 1.26.1
+
+require (
+	cloud.google.com/go/bigquery v1.73.1
+	github.com/DataDog/datadog-go/v5 v5.6.0
+	github.com/DataDog/dd-trace-go/v2 v2.4.0
+	github.com/aws/aws-sdk-go-v2/config v1.32.11
+	github.com/aws/aws-sdk-go-v2/credentials v1.19.11
+	github.com/aws/aws-sdk-go-v2/service/bedrockruntime v1.50.1
+	github.com/bytedance/sonic v1.15.0
+	github.com/coreos/go-oidc/v3 v3.12.0
+	github.com/fasthttp/router v1.5.4
+	github.com/golang-jwt/jwt/v5 v5.3.0
+	github.com/google/cel-go v0.26.1
+	github.com/google/uuid v1.6.0
+	github.com/gorilla/websocket v1.5.4-0.20250319132907-e064f32e3674
+	github.com/grandcat/zeroconf v1.0.0
+	github.com/hashicorp/consul/api v1.22.0
+	github.com/hashicorp/memberlist v0.5.4
+	github.com/maximhq/bifrost/core v1.5.1
+	github.com/maximhq/bifrost/framework v1.3.1
+	github.com/maximhq/bifrost/plugins/governance v1.5.1
+	github.com/maximhq/bifrost/plugins/logging v1.5.1
+	github.com/maximhq/bifrost/transports v1.5.0-prerelease2
+	github.com/nakabonne/tstorage v0.3.6
+	github.com/stretchr/testify v1.11.1
+	github.com/testcontainers/testcontainers-go v0.40.0
+	github.com/tetratelabs/wazero v1.11.0
+	github.com/valyala/fasthttp v1.68.0
+	go.etcd.io/etcd/client/v3 v3.6.6
+	golang.org/x/crypto v0.49.0
+	golang.org/x/oauth2 v0.36.0
+	google.golang.org/api v0.265.0
+	google.golang.org/protobuf v1.36.11
+	gorm.io/driver/sqlite v1.6.0
+	gorm.io/gorm v1.31.1
+	k8s.io/api v0.34.1
+	k8s.io/apimachinery v0.34.1
+	k8s.io/client-go v0.34.1
+)
+```
+
+
+</Update>
+
+
+
--- a/docs/changelogs/v1.2.21.mdx
+++ b/docs/changelogs/v1.2.21.mdx
@@ -0,0 +1,64 @@
+---
+title: "v1.2.21"
+description: "v1.2.21 changelog"
+---
+<Tabs>
+  <Tab title="NPX">
+    ```bash
+    npx -y @maximhq/bifrost --transport-version v1.2.21
+    ```
+  </Tab>
+  <Tab title="Docker">
+    ```bash
+    docker pull maximhq/bifrost:v1.2.21
+    docker run -p 8080:8080 maximhq/bifrost:v1.2.21
+    ```
+  </Tab>
+</Tabs>
+
+<Update label="Bifrost(HTTP)" description="v1.2.21">
+
+- Fixes pricing computation for nested model names, e.g. groq/openai/gpt-oss-20b.
+
+</Update>
+<Update label="Framework" description="v1.2.21">
+
+- Pricing module now accommodates nested model names; e.g. groq/openai/gpt-oss-20b was previously skipped while computing costs.
+
+</Update>
+<Update label="governance" description="v1.2.21">
+
+- Upgrades framework to 1.0.23
+
+</Update>
+<Update label="jsonparser" description="v1.2.21">
+
+- Upgrades framework to 1.0.23
+
+</Update>
+<Update label="logging" description="v1.2.21">
+
+- Upgrades framework to 1.0.23
+- Fixes pricing computation for nested model names.
+
+</Update>
+<Update label="maxim" description="v1.2.21">
+
+- Upgrades framework to 1.0.23
+
+</Update>
+<Update label="mocker" description="v1.2.21">
+
+- Upgrades framework to 1.0.23
+
+</Update>
+<Update label="semantic_cache" description="v1.2.21">
+
+- Upgrades framework to 1.0.23
+
+</Update>
+<Update label="telemetry" description="v1.2.21">
+
+- Upgrades framework to 1.0.23
+
+</Update>
--- a/docs/changelogs/v1.2.22.mdx
+++ b/docs/changelogs/v1.2.22.mdx
@@ -0,0 +1,78 @@
+---
+title: "v1.2.22"
+description: "v1.2.22 changelog"
+---
+<Tabs>
+  <Tab title="NPX">
+    ```bash
+    npx -y @maximhq/bifrost --transport-version v1.2.22
+    ```
+  </Tab>
+  <Tab title="Docker">
+    ```bash
+    docker pull maximhq/bifrost:v1.2.22
+    docker run -p 8080:8080 maximhq/bifrost:v1.2.22
+    ```
+  </Tab>
+</Tabs>
+
+<Update label="Bifrost(HTTP)" description="v1.2.22">
+
+- Fix: Users can now delete custom providers from the UI
+- Fix: Token count no longer displays as N/A in certain streaming response cases
+- Fix: Streaming responses now properly display errors on the UI instead of getting stuck in processing state
+
+</Update>
+<Update label="Core" description="v1.2.22">
+
+- Fix: Updates token calculation for streaming responses. #520
+
+</Update>
+<Update label="Framework" description="v1.2.22">
+
+- upgrade: core upgrades to 1.1.38
+
+</Update>
+<Update label="governance" description="v1.2.22">
+
+- upgrade: core to 1.1.38
+- upgrade: framework to 1.0.24
+
+</Update>
+<Update label="jsonparser" description="v1.2.22">
+
+- upgrade: core to 1.1.38
+- upgrade: framework to 1.0.24
+
+</Update>
+<Update label="logging" description="v1.2.22">
+
+- fix: fixes error logging for streaming and non-streaming responses.
+- upgrade: core to 1.1.38
+- upgrade: framework to 1.0.24
+
+</Update>
+<Update label="maxim" description="v1.2.22">
+
+- upgrade: core to 1.1.38
+- upgrade: framework to 1.0.24
+
+</Update>
+<Update label="mocker" description="v1.2.22">
+
+- upgrade: core to 1.1.38
+- upgrade: framework to 1.0.24
+
+</Update>
+<Update label="semantic_cache" description="v1.2.22">
+
+- upgrade: core to 1.1.38
+- upgrade: framework to 1.0.24
+
+</Update>
+<Update label="telemetry" description="v1.2.22">
+
+- upgrade: core to 1.1.38
+- upgrade: framework to 1.0.24
+
+</Update>
--- a/docs/changelogs/v1.2.23.mdx
+++ b/docs/changelogs/v1.2.23.mdx
@@ -0,0 +1,76 @@
+---
+title: "v1.2.23"
+description: "v1.2.23 changelog"
+---
+<Tabs>
+  <Tab title="NPX">
+    ```bash
+    npx -y @maximhq/bifrost --transport-version v1.2.23
+    ```
+  </Tab>
+  <Tab title="Docker">
+    ```bash
+    docker pull maximhq/bifrost:v1.2.23
+    docker run -p 8080:8080 maximhq/bifrost:v1.2.23
+    ```
+  </Tab>
+</Tabs>
+
+<Update label="Bifrost(HTTP)" description="v1.2.23">
+
+- Fix: Fixes editing experience of weight for API keys.
+
+</Update>
+<Update label="Core" description="v1.2.23">
+
+- Fix: Updates token calculation for streaming responses. #520
+
+</Update>
+<Update label="Framework" description="v1.2.23">
+
+- upgrade: core upgrades to 1.1.38
+
+</Update>
+<Update label="governance" description="v1.2.23">
+
+- upgrade: core to 1.1.38
+- upgrade: framework to 1.0.24
+
+</Update>
+<Update label="jsonparser" description="v1.2.23">
+
+- upgrade: core to 1.1.38
+- upgrade: framework to 1.0.24
+
+</Update>
+<Update label="logging" description="v1.2.23">
+
+- fix: fixes error logging for streaming and non-streaming responses.
+- upgrade: core to 1.1.38
+- upgrade: framework to 1.0.24
+
+</Update>
+<Update label="maxim" description="v1.2.23">
+
+- upgrade: core to 1.1.38
+- upgrade: framework to 1.0.24
+
+</Update>
+<Update label="mocker" description="v1.2.23">
+
+- upgrade: core to 1.1.38
+- upgrade: framework to 1.0.24
+
+</Update>
+<Update label="semantic_cache" description="v1.2.23">
+
+- upgrade: core to 1.1.38
+- upgrade: framework to 1.0.24
+
+</Update>
+<Update label="telemetry" description="v1.2.23">
+
+- upgrade: core to 1.1.38
+- upgrade: framework to 1.0.24
+
+</Update>
--- a/docs/changelogs/v1.2.24.mdx
+++ b/docs/changelogs/v1.2.24.mdx
@@ -0,0 +1,77 @@
+---
+title: "v1.2.24"
+description: "v1.2.24 changelog"
+---
+<Tabs>
+  <Tab title="NPX">
+    ```bash
+    npx -y @maximhq/bifrost --transport-version v1.2.24
+    ```
+  </Tab>
+  <Tab title="Docker">
+    ```bash
+    docker pull maximhq/bifrost:v1.2.24
+    docker run -p 8080:8080 maximhq/bifrost:v1.2.24
+    ```
+  </Tab>
+</Tabs>
+
+<Update label="Bifrost(HTTP)" description="v1.2.24">
+
+- Fix: Adds `Base URL` input in custom provider creation dialog.
+- Fix: Fixes `x` button getting hidden behind dialog header.
+
+</Update>
+<Update label="Core" description="v1.2.24">
+
+- Fix: Updates token calculation for streaming responses. #520
+
+</Update>
+<Update label="Framework" description="v1.2.24">
+
+- upgrade: core upgrades to 1.1.38
+
+</Update>
+<Update label="governance" description="v1.2.24">
+
+- upgrade: core to 1.1.38
+- upgrade: framework to 1.0.24
+
+</Update>
+<Update label="jsonparser" description="v1.2.24">
+
+- upgrade: core to 1.1.38
+- upgrade: framework to 1.0.24
+
+</Update>
+<Update label="logging" description="v1.2.24">
+
+- fix: fixes error logging for streaming and non-streaming responses.
+- upgrade: core to 1.1.38
+- upgrade: framework to 1.0.24
+
+</Update>
+<Update label="maxim" description="v1.2.24">
+
+- upgrade: core to 1.1.38
+- upgrade: framework to 1.0.24
+
+</Update>
+<Update label="mocker" description="v1.2.24">
+
+- upgrade: core to 1.1.38
+- upgrade: framework to 1.0.24
+
+</Update>
+<Update label="semantic_cache" description="v1.2.24">
+
+- upgrade: core to 1.1.38
+- upgrade: framework to 1.0.24
+
+</Update>
+<Update label="telemetry" description="v1.2.24">
+
+- upgrade: core to 1.1.38
+- upgrade: framework to 1.0.24
+
+</Update>
--- a/docs/changelogs/v1.3.0-prerelease1.mdx
+++ b/docs/changelogs/v1.3.0-prerelease1.mdx
@@ -0,0 +1,95 @@
+---
+title: "v1.3.0-prerelease1"
+description: "v1.3.0-prerelease1 changelog"
+---
+<Tabs>
+  <Tab title="NPX">
+    ```bash
+    npx -y @maximhq/bifrost --transport-version v1.3.0-prerelease1
+    ```
+  </Tab>
+  <Tab title="Docker">
+    ```bash
+    docker pull maximhq/bifrost:v1.3.0-prerelease1
+    docker run -p 8080:8080 maximhq/bifrost:v1.3.0-prerelease1
+    ```
+  </Tab>
+</Tabs>
+
+<Update label="Bifrost(HTTP)" description="v1.3.0-prerelease1">
+
+- Fix: Token count no longer displays as N/A in certain streaming response cases
+- Fix: Streaming responses now properly display errors on the UI instead of getting stuck in processing state
+- Feat: UI for configuring external observability connectors
+- Feat: OTLP collector
+- Feat: UI-driven Maxim observability configuration
+- Fix: Fixes Bifrost specific error logging in first party and third party logging plugins
+
+</Update>
+<Update label="Core" description="v1.3.0-prerelease1">
+
+- Feature: Adds dynamic reloads for plugins. This removes the requirement for restarts when updating plugins.
+- Feature: Adds responses API support.
+- This release contains multiple breaking changes for Bifrost Core. These were necessary to ensure we incorporate responses without compromising on speed or architecture.
+
+</Update>
+<Update label="Framework" description="v1.3.0-prerelease1">
+
+- Chore: Adds ctx to each function to gracefully shut down ongoing tasks and improve concurrency management
+- Fix: Fixes pricing sync to make sure latest updates are synced at every restart.
+- Feat: Adds new accumulator for accumulating all streaming responses from LLMs.
+
+</Update>
+<Update label="governance" description="v1.3.0-prerelease1">
+
+- Feat: Bifrost now supports provider-level fallbacks
+- Chore: Dependency upgrades
+
+</Update>
+<Update label="jsonparser" description="v1.3.0-prerelease1">
+
+- Upgrade dependency: core to 1.2.0
+
+</Update>
+<Update label="logging" description="v1.3.0-prerelease1">
+
+- Fix: Captures Bifrost-specific errors in logs (e.g. provider not configured)
+- Fix: Fixes audio streaming captures
+- Upgrade dependency: core to 1.2.0
+- Upgrade dependency: framework to 1.1.0
+
+</Update>
+<Update label="maxim" description="v1.3.0-prerelease1">
+
+- Fix: Maxim plugin now captures Bifrost gateway specific errors.
+- Upgrade dependency: maxim-go to 0.1.11
+- Upgrade dependency: core to 1.2.0
+- Upgrade dependency: framework to 1.1.0
+
+</Update>
+<Update label="mocker" description="v1.3.0-prerelease1">
+
+- Upgrade dependency: core to 1.2.0
+- Upgrade dependency: framework to 1.1.0
+
+</Update>
+<Update label="otel" description="v1.3.0-prerelease1">
+
+- First version cut 🚀
+- Feature: Support OTLP collector over HTTP or gRPC protocol.
+
+</Update>
+<Update label="semantic_cache" description="v1.3.0-prerelease1">
+
+- Feat: Adds support for Responses and Text completions
+- Upgrade dependency: core to 1.2.0
+- Upgrade dependency: framework to 1.1.0
+
+</Update>
+<Update label="telemetry" description="v1.3.0-prerelease1">
+
+- Fix: Adds support for Responses and Text completions.
+- Upgrade dependency: core to 1.2.0
+- Upgrade dependency: framework to 1.2.0
+
+</Update>
--- a/docs/changelogs/v1.3.0-prerelease2.mdx
+++ b/docs/changelogs/v1.3.0-prerelease2.mdx
@@ -0,0 +1,83 @@
+---
+title: "v1.3.0-prerelease2"
+description: "v1.3.0-prerelease2 changelog"
+---
+<Tabs>
+  <Tab title="NPX">
+    ```bash
+    npx -y @maximhq/bifrost --transport-version v1.3.0-prerelease2
+    ```
+  </Tab>
+  <Tab title="Docker">
+    ```bash
+    docker pull maximhq/bifrost:v1.3.0-prerelease2
+    docker run -p 8080:8080 maximhq/bifrost:v1.3.0-prerelease2
+    ```
+  </Tab>
+</Tabs>
+
+<Update label="Bifrost(HTTP)" description="v1.3.0-prerelease2">
+
+- Added specific error handling for timeout scenarios (context.Canceled, context.DeadlineExceeded, fasthttp.ErrTimeout) across all providers
+- Created a dedicated error message for timeouts that guides users to adjust the timeout setting
+- Fixed validation in HTTP handlers for embeddings, speech, and text completion requests
+- Improved CORS wildcard pattern matching to support domain patterns like *.example.com
+- Fixed issues in the logging plugin to properly handle text completion responses
+- Enhanced UI form handling for network configuration with proper default values
+- Feat: Adds Text Completion Streaming support
+
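The CORS entry above mentions domain patterns such as `*.example.com`. One way such matching could work is sketched below; it is an assumption-laden illustration, not Bifrost's matcher. In particular, the function name `originMatches`, the choice to compare only the hostname, and the decision that `*.example.com` does not match the bare `example.com` are all assumptions of this sketch.

```go
package main

import (
	"fmt"
	"net/url"
	"strings"
)

// originMatches reports whether the Origin header value matches pattern.
// Supported patterns in this sketch: "*" (any origin), "*.domain"
// (any subdomain of domain), or an exact hostname.
func originMatches(pattern, origin string) bool {
	u, err := url.Parse(origin)
	if err != nil || u.Hostname() == "" {
		return false // not a well-formed origin such as https://host
	}
	host := strings.ToLower(u.Hostname())
	pattern = strings.ToLower(pattern)

	if pattern == "*" {
		return true
	}
	if strings.HasPrefix(pattern, "*.") {
		suffix := pattern[1:] // keep the leading dot, e.g. ".example.com"
		return len(host) > len(suffix) && strings.HasSuffix(host, suffix)
	}
	return host == pattern
}

func main() {
	fmt.Println(originMatches("*.example.com", "https://app.example.com")) // true
	fmt.Println(originMatches("*.example.com", "https://evilexample.com")) // false
	fmt.Println(originMatches("*.example.com", "https://example.com"))     // false
}
```

Keeping the leading dot in the suffix is what prevents a lookalike host such as `evilexample.com` from matching.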
+</Update>
+<Update label="Core" description="v1.3.0-prerelease2">
+
+- Added specific error handling for timeout scenarios (context.Canceled, context.DeadlineExceeded, fasthttp.ErrTimeout) across all providers
+- Created a dedicated error message for timeouts that guides users to adjust the timeout setting
+- Added Text Completion Streaming support
+
+</Update>
+<Update label="Framework" description="v1.3.0-prerelease2">
+
+- Feat: Adds Text Completion Streaming support
+
+</Update>
+<Update label="governance" description="v1.3.0-prerelease2">
+
+- Chore: using core 1.2.1 and framework 1.1.1
+
+</Update>
+<Update label="jsonparser" description="v1.3.0-prerelease2">
+
+- Upgrade dependency: core to 1.2.1 and framework to 1.1.1
+
+</Update>
+<Update label="logging" description="v1.3.0-prerelease2">
+
+- Feat: Adds Text Completion Streaming support
+- Upgrade dependency: core to 1.2.1 and framework to 1.1.1
+
+</Update>
+<Update label="maxim" description="v1.3.0-prerelease2">
+
+- Upgrade dependency: core to 1.2.1 and framework to 1.1.1
+
+</Update>
+<Update label="mocker" description="v1.3.0-prerelease2">
+
+- Upgrade dependency: core to 1.2.1 and framework to 1.1.1
+
+</Update>
+<Update label="otel" description="v1.3.0-prerelease2">
+
+- Upgrade dependency: core to 1.2.1 and framework to 1.1.1
+
+</Update>
+<Update label="semantic_cache" description="v1.3.0-prerelease2">
+
+- Feat: Adds Text Completion Streaming support
+- Upgrade dependency: core to 1.2.1 and framework to 1.1.1
+
+</Update>
+<Update label="telemetry" description="v1.3.0-prerelease2">
+
+- Upgrade dependency: core to 1.2.1 and framework to 1.1.1
+
+</Update>
--- a/docs/changelogs/v1.3.0-prerelease3.mdx
+++ b/docs/changelogs/v1.3.0-prerelease3.mdx
@@ -0,0 +1,74 @@
+---
+title: "v1.3.0-prerelease3"
+description: "v1.3.0-prerelease3 changelog"
+---
+<Tabs>
+  <Tab title="NPX">
+    ```bash
+    npx -y @maximhq/bifrost --transport-version v1.3.0-prerelease3
+    ```
+  </Tab>
+  <Tab title="Docker">
+    ```bash
+    docker pull maximhq/bifrost:v1.3.0-prerelease3
+    docker run -p 8080:8080 maximhq/bifrost:v1.3.0-prerelease3
+    ```
+  </Tab>
+</Tabs>
+
+<Update label="Bifrost(HTTP)" description="v1.3.0-prerelease3">
+
+- Fix: Fixes string input support for responses requests.
+- Feat: Adds responses endpoint to openai integration.
+
+</Update>
+<Update label="Core" description="v1.3.0-prerelease3">
+
+- Fix: Added string input transformation for responses requests.
+
+</Update>
+<Update label="Framework" description="v1.3.0-prerelease3">
+
+- Chore: core upgrades to 1.2.2
+
+</Update>
+<Update label="governance" description="v1.3.0-prerelease3">
+
+- Chore: using core 1.2.2 and framework 1.1.2
+
+</Update>
+<Update label="jsonparser" description="v1.3.0-prerelease3">
+
+- Upgrade dependency: core to 1.2.2 and framework to 1.1.2
+
+</Update>
+<Update label="logging" description="v1.3.0-prerelease3">
+
+- Upgrade dependency: core to 1.2.2 and framework to 1.1.2
+
+</Update>
+<Update label="maxim" description="v1.3.0-prerelease3">
+
+- Upgrade dependency: core to 1.2.2 and framework to 1.1.2
+
+</Update>
+<Update label="mocker" description="v1.3.0-prerelease3">
+
+- Upgrade dependency: core to 1.2.2 and framework to 1.1.2
+
+</Update>
+<Update label="otel" description="v1.3.0-prerelease3">
+
+- Upgrade dependency: core to 1.2.2 and framework to 1.1.2
+
+</Update>
+<Update label="semantic_cache" description="v1.3.0-prerelease3">
+
+- Upgrade dependency: core to 1.2.2 and framework to 1.1.2
+
+</Update>
+<Update label="telemetry" description="v1.3.0-prerelease3">
+
+- Upgrade dependency: core to 1.2.2 and framework to 1.1.2
+
+</Update>
--- a/docs/changelogs/v1.3.0-prerelease4.mdx
+++ b/docs/changelogs/v1.3.0-prerelease4.mdx
@@ -0,0 +1,73 @@
+---
+title: "v1.3.0-prerelease4"
+description: "v1.3.0-prerelease4 changelog"
+---
+<Tabs>
+  <Tab title="NPX">
+    ```bash
+    npx -y @maximhq/bifrost --transport-version v1.3.0-prerelease4
+    ```
+  </Tab>
+  <Tab title="Docker">
+    ```bash
+    docker pull maximhq/bifrost:v1.3.0-prerelease4
+    docker run -p 8080:8080 maximhq/bifrost:v1.3.0-prerelease4
+    ```
+  </Tab>
+</Tabs>
+
+<Update label="Bifrost(HTTP)" description="v1.3.0-prerelease4">
+
+- Feat: A new config called `Enable LiteLLM Fallback` that enables text_completion calls to fall back to chat_completions calls for the Groq provider. This is an anti-pattern, but we are adding this to help users migrate from LiteLLM easily. Reach out to us if you want us to enable any other quirky patterns LiteLLM has.
+
+</Update>
+<Update label="Core" description="v1.3.0-prerelease4">
+
+- Feat: Adds LiteLLM-specific fallbacks for text completion for Groq. This supports codebases that depend on this anti-pattern out of the box.
+
+</Update>
+<Update label="Framework" description="v1.3.0-prerelease4">
+
+- Chore: core upgrades to 1.2.3
+
+</Update>
+<Update label="governance" description="v1.3.0-prerelease4">
+
+- Chore: core upgrades to 1.2.3
+
+</Update>
+<Update label="jsonparser" description="v1.3.0-prerelease4">
+
+- Chore: core upgrades to 1.2.3
+
+</Update>
+<Update label="logging" description="v1.3.0-prerelease4">
+
+- Chore: core upgrades to 1.2.3
+
+</Update>
+<Update label="maxim" description="v1.3.0-prerelease4">
+
+- Chore: core upgrades to 1.2.3
+
+</Update>
+<Update label="mocker" description="v1.3.0-prerelease4">
+
+- Chore: core upgrades to 1.2.3
+
+</Update>
+<Update label="otel" description="v1.3.0-prerelease4">
+
+- Chore: core upgrades to 1.2.3
+
+</Update>
+<Update label="semantic_cache" description="v1.3.0-prerelease4">
+
+- Chore: core upgrades to 1.2.3
+
+</Update>
+<Update label="telemetry" description="v1.3.0-prerelease4">
+
+- Chore: core upgrades to 1.2.3
+
+</Update>
--- a/docs/changelogs/v1.3.0-prerelease5.mdx
+++ b/docs/changelogs/v1.3.0-prerelease5.mdx
@@ -0,0 +1,76 @@
+---
+title: "v1.3.0-prerelease5"
+description: "v1.3.0-prerelease5 changelog"
+---
+<Tabs>
+  <Tab title="NPX">
+    ```bash
+    npx -y @maximhq/bifrost --transport-version v1.3.0-prerelease5
+    ```
+  </Tab>
+  <Tab title="Docker">
+    ```bash
+    docker pull maximhq/bifrost:v1.3.0-prerelease5
+    docker run -p 8080:8080 maximhq/bifrost:v1.3.0-prerelease5
+    ```
+  </Tab>
+</Tabs>
+
+<Update label="Bifrost(HTTP)" description="v1.3.0-prerelease5">
+
+- Fix: Anthropic tool results aggregation logic (core 1.2.4)
+- Feat: Raw response saved in logs (framework 1.1.4)
+
+</Update>
+<Update label="Core" description="v1.3.0-prerelease5">
+
+- Fix: Anthropic tool results aggregation logic.
+
+</Update>
+<Update label="Framework" description="v1.3.0-prerelease5">
+
+- Feat: Raw response saved in logs.
+- Upgrade dependency: core to 1.2.4
+
+</Update>
+<Update label="governance" description="v1.3.0-prerelease5">
+
+- Chore: using core 1.2.4 and framework 1.1.4
+
+</Update>
+<Update label="jsonparser" description="v1.3.0-prerelease5">
+
+- Upgrade dependency: core to 1.2.4 and framework to 1.1.4
+
+</Update>
+<Update label="logging" description="v1.3.0-prerelease5">
+
+- Feat: Raw response saved in logs.
+- Upgrade dependency: core to 1.2.4 and framework to 1.1.4
+
+</Update>
+<Update label="maxim" description="v1.3.0-prerelease5">
+
+- Upgrade dependency: core to 1.2.4 and framework to 1.1.4
+
+</Update>
+<Update label="mocker" description="v1.3.0-prerelease5">
+
+- Upgrade dependency: core to 1.2.4 and framework to 1.1.4
+
+</Update>
+<Update label="otel" description="v1.3.0-prerelease5">
+
+- Upgrade dependency: core to 1.2.4 and framework to 1.1.4
+
+</Update>
+<Update label="semantic_cache" description="v1.3.0-prerelease5">
+
+- Upgrade dependency: core to 1.2.4 and framework to 1.1.4
+
+</Update>
+<Update label="telemetry" description="v1.3.0-prerelease5">
+
+- Upgrade dependency: core to 1.2.4 and framework to 1.1.4
+
+</Update>
--- a/docs/changelogs/v1.3.0-prerelease6.mdx
+++ b/docs/changelogs/v1.3.0-prerelease6.mdx
@@ -0,0 +1,87 @@
+---
+title: "v1.3.0-prerelease6"
+description: "v1.3.0-prerelease6 changelog"
+---
+<Tabs>
+  <Tab title="NPX">
+    ```bash
+    npx -y @maximhq/bifrost --transport-version v1.3.0-prerelease6
+    ```
+  </Tab>
+  <Tab title="Docker">
+    ```bash
+    docker pull maximhq/bifrost:v1.3.0-prerelease6
+    docker run -p 8080:8080 maximhq/bifrost:v1.3.0-prerelease6
+    ```
+  </Tab>
+</Tabs>
+
+<Update label="Bifrost(HTTP)" description="v1.3.0-prerelease6">
+
+- Upgrade dependency: core to 1.2.6 and framework to 1.1.6
+- Feat: Added Anthropic thinking parameter in responses API.
+- Feat: Added Anthropic text completion integration support.
+- Fix: Extra fields sent back in streaming responses.
+- Feat: Latency for all request types (with inter token latency for streaming requests) sent back in Extra fields.
+- Feat: UI websocket implementation generalized.
+- Feat: TokenInterceptor interface added to plugins.
+- Fix: Middlewares added to integrations route.
+
+</Update>
+<Update label="Core" description="v1.3.0-prerelease6">
+
+- Feat: Stream token latency sent back in extra fields.
+- Feat: Plugin interface extended with TransportInterceptor method.
+- Feat: Add Anthropic thinking parameter
+- Feat: Add Custom key selector logic and send back request latency in extra fields.
+- Fix: Fallbacks occasionally not working.
+
+</Update>
+<Update label="Framework" description="v1.3.0-prerelease6">
+
+- Upgrade dependency: core to 1.2.5
+- Feat: User table added to config store.
+
+</Update>
+<Update label="governance" description="v1.3.0-prerelease6">
+
+- Chore: using core 1.2.5 and framework 1.1.5
+- Feat: Added provider routing TransportInterceptor.
+
+</Update>
+<Update label="jsonparser" description="v1.3.0-prerelease6">
+
+- Upgrade dependency: core to 1.2.6 and framework to 1.1.6
+
+</Update>
+<Update label="logging" description="v1.3.0-prerelease6">
+
+- Upgrade dependency: core to 1.2.6 and framework to 1.1.6
+
+</Update>
+<Update label="maxim" description="v1.3.0-prerelease6">
+
+- Upgrade dependency: core to 1.2.6 and framework to 1.1.6
+
+</Update>
+<Update label="mocker" description="v1.3.0-prerelease6">
+
+- Upgrade dependency: core to 1.2.6 and framework to 1.1.6
+
+</Update>
+<Update label="otel" description="v1.3.0-prerelease6">
+
+- Upgrade dependency: core to 1.2.6 and framework to 1.1.6
+
+</Update>
+<Update label="semantic_cache" description="v1.3.0-prerelease6">
+
+- Upgrade dependency: core to 1.2.6 and framework to 1.1.6
+
+</Update>
+<Update label="telemetry" description="v1.3.0-prerelease6">
+
+- Upgrade dependency: core to 1.2.6 and framework to 1.1.6
+- Feat: Added First Token and Inter Token latency metrics for streaming requests.
+
+</Update>
--- a/docs/changelogs/v1.3.0-prerelease7.mdx
+++ b/docs/changelogs/v1.3.0-prerelease7.mdx
@@ -0,0 +1,81 @@
+---
+title: "v1.3.0-prerelease7"
+description: "v1.3.0-prerelease7 changelog"
+---
+<Tabs>
+  <Tab title="NPX">
+    ```bash
+    npx -y @maximhq/bifrost --transport-version v1.3.0-prerelease7
+    ```
+  </Tab>
+  <Tab title="Docker">
+    ```bash
+    docker pull maximhq/bifrost:v1.3.0-prerelease7
+    docker run -p 8080:8080 maximhq/bifrost:v1.3.0-prerelease7
+    ```
+  </Tab>
+</Tabs>
+
+<Update label="Bifrost(HTTP)" description="v1.3.0-prerelease7">
+
+- Upgrade dependency: core to 1.2.6 and framework to 1.1.6
+- Added Responses streaming across all providers.
+- Fixed bedrock chat streaming decoding issues.
+- Added raw response support for all streaming requests.
+- Removed last token's accumulated latency from inter token latency metric.
+
+</Update>
+<Update label="Core" description="v1.3.0-prerelease7">
+
+- Feat: Responses streaming added across all providers.
+- Fix: Bedrock chat streaming decoding fixes.
+- Feat: Added raw response support for all streaming requests.
+
+</Update>
+<Update label="Framework" description="v1.3.0-prerelease7">
+
+- Upgrade dependency: core to 1.2.6
+- Feat: Moved the migrator package to a more general location and added database migrations for the logstore to standardize object type values.
+
+</Update>
+<Update label="governance" description="v1.3.0-prerelease7">
+
+- Chore: using core 1.2.6 and framework 1.1.6
+
+</Update>
+<Update label="jsonparser" description="v1.3.0-prerelease7">
+
+- Upgrade dependency: core to 1.2.6 and framework to 1.1.6
+
+</Update>
+<Update label="logging" description="v1.3.0-prerelease7">
+
+- Upgrade dependency: core to 1.2.6 and framework to 1.1.6
+
+</Update>
+<Update label="maxim" description="v1.3.0-prerelease7">
+
+- Upgrade dependency: core to 1.2.6 and framework to 1.1.6
+
+</Update>
+<Update label="mocker" description="v1.3.0-prerelease7">
+
+- Upgrade dependency: core to 1.2.6 and framework to 1.1.6
+
+</Update>
+<Update label="otel" description="v1.3.0-prerelease7">
+
+- Upgrade dependency: core to 1.2.6 and framework to 1.1.6
+
+</Update>
+<Update label="semantic_cache" description="v1.3.0-prerelease7">
+
+- Upgrade dependency: core to 1.2.6 and framework to 1.1.6
+
+</Update>
+<Update label="telemetry" description="v1.3.0-prerelease7">
+
+- Upgrade dependency: core to 1.2.6 and framework to 1.1.6
+- Fix: Removed last token's accumulated latency from inter token latency metric.
+
+</Update>
--- a/docs/changelogs/v1.3.0.mdx
+++ b/docs/changelogs/v1.3.0.mdx
@@ -0,0 +1,119 @@
+---
+title: "v1.3.0"
+description: "v1.3.0 changelog"
+---
+<Tabs>
+  <Tab title="NPX">
+    ```bash
+    npx -y @maximhq/bifrost --transport-version v1.3.0
+    ```
+  </Tab>
+  <Tab title="Docker">
+    ```bash
+    docker pull maximhq/bifrost:v1.3.0
+    docker run -p 8080:8080 maximhq/bifrost:v1.3.0
+    ```
+  </Tab>
+</Tabs>
+
+<Update label="Bifrost(HTTP)" description="v1.3.0">
+
+We're excited to ship v1.3.0 with major quality, compatibility, and governance upgrades across OSS and Enterprise. 
+
+🌟 Highlights
+- OTel traces support (OSS): First-class support for OTLP collectors.  
+- Responses API (OSS): First-class support for the OpenAI-style Responses format, streaming + non-streaming.
+- Drop-in for LiteLLM (OSS): Config-level fallbacks to ease migrations.
+- Guardrails (Enterprise): Initial set with AWS Bedrock, Azure Content Moderator, and Patronus AI.
+- Provisioning (Enterprise): Okta SCIM now supported alongside Microsoft Entra.
+- Adaptive LB Dashboard (Enterprise, beta): Live traffic, weight shifts, and failover visibility.
+
+### Features
+- Added Anthropic thinking parameter in Responses API.
+- Added Anthropic text completion integration support.
+- Latency metrics for all request types now returned in extra (includes inter-token latency for streaming).
+- TokenInterceptor interface added to plugins.
+- Raw provider response saved in logs (framework v1.1.4).
+
+### Fixes
+
+- Removed extra fields erroneously sent in streaming responses.
+- Anthropic tool results aggregation corrected (core v1.2.4).
+- String input support fixed for Responses requests.
+- Added specific timeout error handling across all providers for `context.Canceled`, `context.DeadlineExceeded`, and `fasthttp.ErrTimeout`.
+- Pricing manager fixes.
+
+### Improvements
+
+- CORS wildcard matching improved to support domain patterns like `*.example.com`.
+
+## Closed tickets
+
+- [#605: [Bug]: UI Docker building errors](https://github.com/maximhq/bifrost/issues/605)
+- [#597: [Bug Report] Bedrock streaming has many missing chunks](https://github.com/maximhq/bifrost/issues/597)
+- [#567: Handling reasoning content](https://github.com/maximhq/bifrost/issues/567)
+- [#565: The "pricing not found for model ..." message is repeated for each request processed, which is too noisy for the warn level.](https://github.com/maximhq/bifrost/issues/565)
+- [#552: [Bug]: "index" not specified for tool calls in OpenAI chunks](https://github.com/maximhq/bifrost/issues/552)
+- [#543: [Bug]: Indicate timeouts in error response while logging](https://github.com/maximhq/bifrost/issues/543)
+- [#542: [Feature]: Logs should show timestamps in browser timezone](https://github.com/maximhq/bifrost/issues/542)
+- [#520: [Bug]: tokens and cost for "Chat Stream" requests is missing in logs](https://github.com/maximhq/bifrost/issues/520)
+- [#516: [Bug]: Can't delete custom provider from Web UI](https://github.com/maximhq/bifrost/issues/516)
+- [#504: [Bug]: cannot use self-hosted SGLang instance with http:// URLs only](https://github.com/maximhq/bifrost/issues/504)
+- [#497: [Feature]: Add full support for standard OpenTelemetry GenAI Observability](https://github.com/maximhq/bifrost/issues/497)
+- [#479: [Feature]: Support for API Key Authentication in Bedrock](https://github.com/maximhq/bifrost/issues/479)
+- [#463: [Feature]: Support for Thinking blocks](https://github.com/maximhq/bifrost/issues/463)
+- [#456: [Docs]: Update API reference docs](https://github.com/maximhq/bifrost/issues/456)
+- [#451: [Feature]: Offline usage](https://github.com/maximhq/bifrost/issues/451)
+
+</Update>
+<Update label="Core" description="v1.3.0">
+
+- Refactor: Bifrost Response structure segregated.
+
+</Update>
+<Update label="Framework" description="v1.3.0">
+
+- Upgrade dependency: core to 1.2.7
+- Fix: Added missing migration for `parent_request_id_column` in logs table.
+
+</Update>
+<Update label="governance" description="v1.3.0">
+
+- Chore: using core 1.2.7 and framework 1.1.7
+
+</Update>
+<Update label="jsonparser" description="v1.3.0">
+
+- Upgrade dependency: core to 1.2.7 and framework to 1.1.7
+
+</Update>
+<Update label="logging" description="v1.3.0">
+
+- Upgrade dependency: core to 1.2.7 and framework to 1.1.7
+
+</Update>
+<Update label="maxim" description="v1.3.0">
+
+- Upgrade dependency: core to 1.2.7 and framework to 1.1.7
+
+</Update>
+<Update label="mocker" description="v1.3.0">
+
+- Upgrade dependency: core to 1.2.7 and framework to 1.1.7
+
+</Update>
+<Update label="otel" description="v1.3.0">
+
+- Upgrade dependency: core to 1.2.6 and framework to 1.1.6
+
+</Update>
+<Update label="semantic_cache" description="v1.3.0">
+
+- Upgrade dependency: core to 1.2.7 and framework to 1.1.7
+
+</Update>
+<Update label="telemetry" description="v1.3.0">
+
+- Upgrade dependency: core to 1.2.7 and framework to 1.1.7
+
+</Update>
--- a/docs/changelogs/v1.3.1.mdx
+++ b/docs/changelogs/v1.3.1.mdx
@@ -0,0 +1,74 @@
+---
+title: "v1.3.1"
+description: "v1.3.1 changelog"
+---
+<Tabs>
+  <Tab title="NPX">
+    ```bash
+    npx -y @maximhq/bifrost --transport-version v1.3.1
+    ```
+  </Tab>
+  <Tab title="Docker">
+    ```bash
+    docker pull maximhq/bifrost:v1.3.1
+    docker run -p 8080:8080 maximhq/bifrost:v1.3.1
+    ```
+  </Tab>
+</Tabs>
+
+<Update label="Bifrost(HTTP)" description="v1.3.1">
+
+- Bug: "x-bf-vk" missing error fixed.
+
+</Update>
+<Update label="Core" description="v1.3.1">
+
+- Refactor: Bifrost Response structure segregated.
+
+</Update>
+<Update label="Framework" description="v1.3.1">
+
+- Upgrade dependency: core to 1.2.7
+- Fix: Added missing migration for `parent_request_id_column` in logs table.
+
+</Update>
+<Update label="governance" description="v1.3.1">
+
+- Chore: taking context key from core package instead of governance package
+
+</Update>
+<Update label="jsonparser" description="v1.3.1">
+
+- Upgrade dependency: core to 1.2.7 and framework to 1.1.7
+
+</Update>
+<Update label="logging" description="v1.3.1">
+
+- Upgrade dependency: core to 1.2.7 and framework to 1.1.7
+
+</Update>
+<Update label="maxim" description="v1.3.1">
+
+- Upgrade dependency: core to 1.2.7 and framework to 1.1.7
+
+</Update>
+<Update label="mocker" description="v1.3.1">
+
+- Upgrade dependency: core to 1.2.7 and framework to 1.1.7
+
+</Update>
+<Update label="otel" description="v1.3.1">
+
+- Upgrade dependency: core to 1.2.6 and framework to 1.1.6
+
+</Update>
+<Update label="semantic_cache" description="v1.3.1">
+
+- Upgrade dependency: core to 1.2.7 and framework to 1.1.7
+
+</Update>
+<Update label="telemetry" description="v1.3.1">
+
+- Upgrade dependency: core to 1.2.7 and framework to 1.1.7
+
+</Update>
--- a/docs/changelogs/v1.3.10.mdx
+++ b/docs/changelogs/v1.3.10.mdx
@@ -0,0 +1,93 @@
+---
+title: "v1.3.10"
+description: "v1.3.10 changelog"
+---
+<Tabs>
+  <Tab title="NPX">
+    ```bash
+    npx -y @maximhq/bifrost --transport-version v1.3.10
+    ```
+  </Tab>
+  <Tab title="Docker">
+    ```bash
+    docker pull maximhq/bifrost:v1.3.10
+    docker run -p 8080:8080 maximhq/bifrost:v1.3.10
+    ```
+  </Tab>
+</Tabs>
+
+<Update label="Bifrost(HTTP)" description="v1.3.10">
+
+- chore: version update core to 1.2.13 and framework to 1.1.15
+- feat: added headers support for OTel configuration. Values prefixed with `env.` are read from environment variables (`env.ENV_VAR_NAME`)
+- feat: emission of OTel resource spans is now fully async, which brings inference overhead down to under 1 µs
+- fix: added latency calculation for vertex native requests
+- feat: added cached tokens and reasoning tokens to the usage in ui
+- fix: cost calculation for vertex requests
+- feat: added global region support for vertex API
+- fix: added filter for extra fields in chat completions request for Mistral provider
+- fix: added wildcard validation for allowed origins in UI security settings
+- fix: fixed code field in pending_safety_checks for Responses API
+
+</Update>
+<Update label="Core" description="v1.3.10">
+
+- bug: fixed embedding request not being handled in `GetExtraFields()` method of `BifrostResponse`
+- fix: added latency calculation for vertex native requests
+- feat: added cached tokens and reasoning tokens to the usage metadata for chat completions
+- feat: added global region support for vertex API
+- fix: added filter for extra fields in chat completions request for Mistral provider
+- fix: fixed ResponsesComputerToolCallPendingSafetyCheck code field
+
+</Update>
+<Update label="Framework" description="v1.3.10">
+
+- chore: version update core to 1.2.13
+- feat: added support for vertex provider/model format in pricing lookup
+
+</Update>
+<Update label="governance" description="v1.3.10">
+
+- chore: version update core to 1.2.13 and framework to 1.1.15
+
+</Update>
+<Update label="jsonparser" description="v1.3.10">
+
+- chore: version update core to 1.2.13 and framework to 1.1.15
+
+</Update>
+<Update label="logging" description="v1.3.10">
+
+- chore: version update core to 1.2.13 and framework to 1.1.15
+
+</Update>
+<Update label="maxim" description="v1.3.10">
+
+- chore: version update core to 1.2.13 and framework to 1.1.15
+
+</Update>
+<Update label="mocker" description="v1.3.10">
+
+- chore: version update core to 1.2.13 and framework to 1.1.15
+- feat: added support for responses request
+- feat: added "skip-mocker" context key to skip mocker plugin per request
+
+</Update>
+<Update label="otel" description="v1.3.10">
+
+- chore: version update core to 1.2.13 and framework to 1.1.15
+- feat: added headers support for OTel configuration. Values prefixed with `env.` are read from environment variables (`env.ENV_VAR_NAME`)
+- feat: emission of OTel resource spans is now fully async, which brings inference overhead down to under 1 µs
+
+</Update>
+<Update label="semantic_cache" description="v1.3.10">
+
+- chore: version update core to 1.2.13 and framework to 1.1.15
+- tests: added mocker plugin to all chat/responses tests
+
+</Update>
+<Update label="telemetry" description="v1.3.10">
+
+- chore: version update core to 1.2.13 and framework to 1.1.15
+
+</Update>
--- a/docs/changelogs/v1.3.11.mdx
+++ b/docs/changelogs/v1.3.11.mdx
@@ -0,0 +1,75 @@
+---
+title: "v1.3.11"
+description: "v1.3.11 changelog"
+---
+<Tabs>
+  <Tab title="NPX">
+    ```bash
+    npx -y @maximhq/bifrost --transport-version v1.3.11
+    ```
+  </Tab>
+  <Tab title="Docker">
+    ```bash
+    docker pull maximhq/bifrost:v1.3.11
+    docker run -p 8080:8080 maximhq/bifrost:v1.3.11
+    ```
+  </Tab>
+</Tabs>
+
+<Update label="Bifrost(HTTP)" description="v1.3.11">
+
+- chore: version update core to 1.2.14 and framework to 1.1.16
+- feat: added `/v1/models` endpoint to list models of configured providers
+
+</Update>
+<Update label="Core" description="v1.3.11">
+
+- feat: added ListModels method to Provider interface
+- feat: enabled provider tracking in Bifrost core for API exposure
+
+</Update>
+<Update label="Framework" description="v1.3.11">
+
+- chore: version update core to 1.2.14
+
+</Update>
+<Update label="governance" description="v1.3.11">
+
+- chore: version update core to 1.2.14 and framework to 1.1.16
+
+</Update>
+<Update label="jsonparser" description="v1.3.11">
+
+- chore: version update core to 1.2.14 and framework to 1.1.16
+
+</Update>
+<Update label="logging" description="v1.3.11">
+
+- chore: version update core to 1.2.14 and framework to 1.1.16
+
+</Update>
+<Update label="maxim" description="v1.3.11">
+
+- chore: version update core to 1.2.14 and framework to 1.1.16
+
+</Update>
+<Update label="mocker" description="v1.3.11">
+
+- chore: version update core to 1.2.14 and framework to 1.1.16
+
+</Update>
+<Update label="otel" description="v1.3.11">
+
+- chore: version update core to 1.2.14 and framework to 1.1.16
+
+</Update>
+<Update label="semantic_cache" description="v1.3.11">
+
+- chore: version update core to 1.2.14 and framework to 1.1.16
+
+</Update>
+<Update label="telemetry" description="v1.3.11">
+
+- chore: version update core to 1.2.14 and framework to 1.1.16
+
+</Update>
--- a/docs/changelogs/v1.3.12.mdx
+++ b/docs/changelogs/v1.3.12.mdx
@@ -0,0 +1,89 @@
+---
+title: "v1.3.12"
+description: "v1.3.12 changelog"
+---
+<Tabs>
+  <Tab title="NPX">
+    ```bash
+    npx -y @maximhq/bifrost --transport-version v1.3.12
+    ```
+  </Tab>
+  <Tab title="Docker">
+    ```bash
+    docker pull maximhq/bifrost:v1.3.12
+    docker run -p 8080:8080 maximhq/bifrost:v1.3.12
+    ```
+  </Tab>
+</Tabs>
+
+<Update label="Bifrost(HTTP)" description="v1.3.12">
+
+- chore: version update core to 1.2.15 and framework to 1.1.17
+- feat: add azure provider native responses API support
+- chore: suppress irrelevant warnings in ListModels
+- feat: refactored all plugin operations to be fully async to prevent any blocking behavior
+- feat: added provider level budget and rate limits using virtual keys
+- feat: added streaming support in maxim plugin
+
+</Update>
+<Update label="Core" description="v1.3.12">
+
+- feat: add azure provider native responses API support
+- feat: improve retry logic for rate limiting errors
+- feat: add retries on list models request
+- chore: suppress irrelevant warnings in ListModels
+
+</Update>
+<Update label="Framework" description="v1.3.12">
+
+- chore: version update core to 1.2.15
+- [BREAKING] feat: renamed pricing module to modelcatalog and added list models population support for model pool
+- feat: added chunk index based sorting for streaming responses in streaming package
+- feat: added budget and rate limit to provider configs in virtual key table
+
+</Update>
+<Update label="governance" description="v1.3.12">
+
+- chore: version update core to 1.2.15 and framework to 1.1.17
+- feat: added provider level budget and rate limits
+
+</Update>
+<Update label="jsonparser" description="v1.3.12">
+
+- chore: version update core to 1.2.15 and framework to 1.1.17
+- feat: creates a deep copy of the response in PostHook to avoid modifying the original response pointer
+
+</Update>
+<Update label="logging" description="v1.3.12">
+
+- chore: version update core to 1.2.15 and framework to 1.1.17
+- feat: all operations moved async to prevent any blocking behavior
+
+</Update>
+<Update label="maxim" description="v1.3.12">
+
+- chore: version update core to 1.2.15 and framework to 1.1.17
+- feat: added support for streaming responses
+
+</Update>
+<Update label="mocker" description="v1.3.12">
+
+- chore: version update core to 1.2.15 and framework to 1.1.17
+
+</Update>
+<Update label="otel" description="v1.3.12">
+
+- chore: version update core to 1.2.15 and framework to 1.1.17
+- feat: all operations moved async to prevent any blocking behavior
+
+</Update>
+<Update label="semantic_cache" description="v1.3.12">
+
+- chore: version update core to 1.2.15 and framework to 1.1.17
+
+</Update>
+<Update label="telemetry" description="v1.3.12">
+
+- chore: version update core to 1.2.15 and framework to 1.1.17
+
+</Update>
--- a/docs/changelogs/v1.3.13.mdx
+++ b/docs/changelogs/v1.3.13.mdx
@@ -0,0 +1,78 @@
+---
+title: "v1.3.13"
+description: "v1.3.13 changelog"
+---
+<Tabs>
+  <Tab title="NPX">
+    ```bash
+    npx -y @maximhq/bifrost --transport-version v1.3.13
+    ```
+  </Tab>
+  <Tab title="Docker">
+    ```bash
+    docker pull maximhq/bifrost:v1.3.13
+    docker run -p 8080:8080 maximhq/bifrost:v1.3.13
+    ```
+  </Tab>
+</Tabs>
+
+<Update label="Bifrost(HTTP)" description="v1.3.13">
+
+- chore: version update framework to 1.1.18 and core to 1.2.16
+- Adds env variable support for postgres config
+- feat: standardize finish reason and single response handling across providers
+- feat: provider config hot reloading added (no need to restart Bifrost after updating provider configs now)
+
+</Update>
+<Update label="Core" description="v1.3.13">
+
+- feat: standardize finish reason and single response handling across providers
+- feat: provider config hot reloading added
+
+</Update>
+<Update label="Framework" description="v1.3.13">
+
+- Adds env variable resolution for postgres config
+- chore: Upgrades core to 1.2.16
+
+</Update>
+<Update label="governance" description="v1.3.13">
+
+- chore: version update core to 1.2.16 and framework to 1.1.18
+
+</Update>
+<Update label="jsonparser" description="v1.3.13">
+
+- chore: version update core to 1.2.16 and framework to 1.1.18
+
+</Update>
+<Update label="logging" description="v1.3.13">
+
+- chore: version update core to 1.2.16 and framework to 1.1.18
+
+</Update>
+<Update label="maxim" description="v1.3.13">
+
+- chore: version update core to 1.2.16 and framework to 1.1.18
+
+</Update>
+<Update label="mocker" description="v1.3.13">
+
+- chore: version update core to 1.2.16 and framework to 1.1.18
+
+</Update>
+<Update label="otel" description="v1.3.13">
+
+- chore: version update core to 1.2.16 and framework to 1.1.18
+
+</Update>
+<Update label="semantic_cache" description="v1.3.13">
+
+- chore: version update core to 1.2.16 and framework to 1.1.18
+
+</Update>
+<Update label="telemetry" description="v1.3.13">
+
+- chore: version update core to 1.2.16 and framework to 1.1.18
+
+</Update>
--- a/docs/changelogs/v1.3.14.mdx
+++ b/docs/changelogs/v1.3.14.mdx
@@ -0,0 +1,84 @@
+---
+title: "v1.3.14"
+description: "v1.3.14 changelog"
+---
+<Tabs>
+  <Tab title="NPX">
+    ```bash
+    npx -y @maximhq/bifrost --transport-version v1.3.14
+    ```
+  </Tab>
+  <Tab title="Docker">
+    ```bash
+    docker pull maximhq/bifrost:v1.3.14
+    docker run -p 8080:8080 maximhq/bifrost:v1.3.14
+    ```
+  </Tab>
+</Tabs>
+
+<Update label="Bifrost(HTTP)" description="v1.3.14">
+
+- chore: version update framework to 1.1.18 and core to 1.2.16
+- feat: Use all keys for list models request
+- fix: handled panic when using gemini models with openai integration responses API requests
+- chore: Added id, object, and model fields to Chat Completion responses from Bedrock and Cohere providers
+- feat: Adds support for dynamic plugins. Note that dynamic plugins are in beta
+- feat: Adds auth support for dashboard, inference APIs and dashboard APIs.
+
+</Update>
+<Update label="Core" description="v1.3.14">
+
+- feat: Use all keys for list models request
+- refactor: Cohere provider to use completeRequest and response pooling for all requests
+- chore: Added id, object, and model fields to Chat Completion responses from Bedrock and Cohere providers
+- feat: Moved all streaming calls to use fasthttp client for efficiency
+- feat: Adds support for auth
+
+</Update>
+<Update label="Framework" description="v1.3.14">
+
+- chore: Upgrades core to 1.2.17
+- feat: Adds dynamic plugins support
+- feat: Adds auth tables in config store
+
+</Update>
+<Update label="governance" description="v1.3.14">
+
+- chore: version update core to 1.2.17 and framework to 1.1.19
+
+</Update>
+<Update label="jsonparser" description="v1.3.14">
+
+- chore: version update core to 1.2.17 and framework to 1.1.19
+
+</Update>
+<Update label="logging" description="v1.3.14">
+
+- chore: version update core to 1.2.17 and framework to 1.1.19
+
+</Update>
+<Update label="maxim" description="v1.3.14">
+
+- chore: version update core to 1.2.17 and framework to 1.1.19
+
+</Update>
+<Update label="mocker" description="v1.3.14">
+
+- chore: version update core to 1.2.17 and framework to 1.1.19
+
+</Update>
+<Update label="otel" description="v1.3.14">
+
+- chore: version update core to 1.2.17 and framework to 1.1.19
+
+</Update>
+<Update label="semantic_cache" description="v1.3.14">
+
+- chore: version update core to 1.2.17 and framework to 1.1.19
+
+</Update>
+<Update label="telemetry" description="v1.3.14">
+
+- chore: version update core to 1.2.17 and framework to 1.1.19
+
+</Update>
--- a/docs/changelogs/v1.3.15.mdx
+++ b/docs/changelogs/v1.3.15.mdx
@@ -0,0 +1,75 @@
+---
+title: "v1.3.15"
+description: "v1.3.15 changelog"
+---
+<Tabs>
+  <Tab title="NPX">
+    ```bash
+    npx -y @maximhq/bifrost --transport-version v1.3.15
+    ```
+  </Tab>
+  <Tab title="Docker">
+    ```bash
+    docker pull maximhq/bifrost:v1.3.15
+    docker run -p 8080:8080 maximhq/bifrost:v1.3.15
+    ```
+  </Tab>
+</Tabs>
+
+<Update label="Bifrost(HTTP)" description="v1.3.15">
+
+- chore: version update core to 1.2.18 and framework to 1.1.21
+- enhancement: provider lookup enhancements in modelcatalog
+
+</Update>
+<Update label="Core" description="v1.3.15">
+
+- refactor: minor util changes
+
+</Update>
+<Update label="Framework" description="v1.3.15">
+
+- chore: Upgrades core to 1.2.18
+- enhancement: provider lookup enhancements
+
+</Update>
+<Update label="governance" description="v1.3.15">
+
+- chore: version update core to 1.2.18 and framework to 1.1.21
+
+</Update>
+<Update label="jsonparser" description="v1.3.15">
+
+- chore: version update core to 1.2.18 and framework to 1.1.21
+
+</Update>
+<Update label="logging" description="v1.3.15">
+
+- chore: version update core to 1.2.18 and framework to 1.1.21
+
+</Update>
+<Update label="maxim" description="v1.3.15">
+
+- chore: version update core to 1.2.18 and framework to 1.1.21
+
+</Update>
+<Update label="mocker" description="v1.3.15">
+
+- chore: version update core to 1.2.18 and framework to 1.1.21
+
+</Update>
+<Update label="otel" description="v1.3.15">
+
+- chore: version update core to 1.2.18 and framework to 1.1.21
+
+</Update>
+<Update label="semantic_cache" description="v1.3.15">
+
+- chore: version update core to 1.2.18 and framework to 1.1.21
+
+</Update>
+<Update label="telemetry" description="v1.3.15">
+
+- chore: version update core to 1.2.18 and framework to 1.1.21
+
+</Update>
--- a/docs/changelogs/v1.3.16.mdx
+++ b/docs/changelogs/v1.3.16.mdx
@@ -0,0 +1,79 @@
+---
+title: "v1.3.16"
+description: "v1.3.16 changelog"
+---
+<Tabs>
+  <Tab title="NPX">
+    ```bash
+    npx -y @maximhq/bifrost --transport-version v1.3.16
+    ```
+  </Tab>
+  <Tab title="Docker">
+    ```bash
+    docker pull maximhq/bifrost:v1.3.16
+    docker run -p 8080:8080 maximhq/bifrost:v1.3.16
+    ```
+  </Tab>
+</Tabs>
+
+<Update label="Bifrost(HTTP)" description="v1.3.16">
+
+- chore: version update core to 1.2.18 and framework to 1.1.21
+- feat: added Perplexity provider support
+- chore: version update core to 1.2.19 and framework to 1.1.22
+- feat: support for mistralai publisher endpoint in vertex provider
+- enhancement: stream handling for Anthropic's computer tool in the Responses API
+
+</Update>
+<Update label="Core" description="v1.3.16">
+
+- feat: support for mistralai publisher endpoint in vertex provider
+- enhancement: stream handling for Anthropic's computer tool in the Responses API
+- feat: added Perplexity provider support
+
+</Update>
+<Update label="Framework" description="v1.3.16">
+
+- chore: Upgrades core to 1.2.19
+
+</Update>
+<Update label="governance" description="v1.3.16">
+
+- chore: version update core to 1.2.19 and framework to 1.1.22
+
+</Update>
+<Update label="jsonparser" description="v1.3.16">
+
+- chore: version update core to 1.2.19 and framework to 1.1.22
+
+</Update>
+<Update label="logging" description="v1.3.16">
+
+- chore: version update core to 1.2.19 and framework to 1.1.22
+
+</Update>
+<Update label="maxim" description="v1.3.16">
+
+- chore: version update core to 1.2.19 and framework to 1.1.22
+
+</Update>
+<Update label="mocker" description="v1.3.16">
+
+- chore: version update core to 1.2.19 and framework to 1.1.22
+
+</Update>
+<Update label="otel" description="v1.3.16">
+
+- chore: version update core to 1.2.19 and framework to 1.1.22
+
+</Update>
+<Update label="semantic_cache" description="v1.3.16">
+
+- chore: version update core to 1.2.19 and framework to 1.1.22
+
+</Update>
+<Update label="telemetry" description="v1.3.16">
+
+- chore: version update core to 1.2.19 and framework to 1.1.22
+
+</Update>
--- a/docs/changelogs/v1.3.17.mdx
+++ b/docs/changelogs/v1.3.17.mdx
@@ -0,0 +1,72 @@
+---
+title: "v1.3.17"
+description: "v1.3.17 changelog"
+---
+<Tabs>
+  <Tab title="NPX">
+    ```bash
+    npx -y @maximhq/bifrost --transport-version v1.3.17
+    ```
+  </Tab>
+  <Tab title="Docker">
+    ```bash
+    docker pull maximhq/bifrost:v1.3.17
+    docker run -p 8080:8080 maximhq/bifrost:v1.3.17
+    ```
+  </Tab>
+</Tabs>
+
+<Update label="Bifrost(HTTP)" description="v1.3.17">
+
+- chore: version update framework to 1.1.24
+- fix: resolve MCP client deletion when attached to a virtual key
+- chore: allowed changing name when updating a virtual key
+- fix: vk team/customer association issue when updating a vk
+
+</Update>
+<Update label="Framework" description="v1.3.17">
+
+- fix: resolve MCP client deletion when attached to a virtual key
+- fix: vk team/customer association issue when updating a vk
+
+</Update>
+<Update label="governance" description="v1.3.17">
+
+- chore: version update framework to 1.1.23
+
+</Update>
+<Update label="jsonparser" description="v1.3.17">
+
+- chore: version update framework to 1.1.24
+
+</Update>
+<Update label="logging" description="v1.3.17">
+
+- chore: version update framework to 1.1.24
+
+</Update>
+<Update label="maxim" description="v1.3.17">
+
+- chore: version update framework to 1.1.24
+
+</Update>
+<Update label="mocker" description="v1.3.17">
+
+- chore: version update framework to 1.1.24
+
+</Update>
+<Update label="otel" description="v1.3.17">
+
+- chore: version update framework to 1.1.24
+
+</Update>
+<Update label="semantic_cache" description="v1.3.17">
+
+- chore: version update framework to 1.1.24
+
+</Update>
+<Update label="telemetry" description="v1.3.17">
+
+- chore: version update framework to 1.1.24
+
+</Update>
--- a/docs/changelogs/v1.3.18.mdx
+++ b/docs/changelogs/v1.3.18.mdx
@@ -0,0 +1,69 @@
+---
+title: "v1.3.18"
+description: "v1.3.18 changelog"
+---
+<Tabs>
+  <Tab title="NPX">
+    ```bash
+    npx -y @maximhq/bifrost --transport-version v1.3.18
+    ```
+  </Tab>
+  <Tab title="Docker">
+    ```bash
+    docker pull maximhq/bifrost:v1.3.18
+    docker run -p 8080:8080 maximhq/bifrost:v1.3.18
+    ```
+  </Tab>
+</Tabs>
+
+<Update label="Bifrost(HTTP)" description="v1.3.18">
+
+- change: health endpoint is whitelisted from auth middleware
+
+</Update>
+<Update label="Framework" description="v1.3.18">
+
+- fix: resolve MCP client deletion when attached to a virtual key
+- fix: vk team/customer association issue when updating a vk
+
+</Update>
+<Update label="governance" description="v1.3.18">
+
+- chore: version update framework to 1.1.23
+
+</Update>
+<Update label="jsonparser" description="v1.3.18">
+
+- chore: version update framework to 1.1.24
+
+</Update>
+<Update label="logging" description="v1.3.18">
+
+- chore: version update framework to 1.1.24
+
+</Update>
+<Update label="maxim" description="v1.3.18">
+
+- chore: version update framework to 1.1.24
+
+</Update>
+<Update label="mocker" description="v1.3.18">
+
+- chore: version update framework to 1.1.24
+
+</Update>
+<Update label="otel" description="v1.3.18">
+
+- chore: version update framework to 1.1.24
+
+</Update>
+<Update label="semantic_cache" description="v1.3.18">
+
+- chore: version update framework to 1.1.24
+
+</Update>
+<Update label="telemetry" description="v1.3.18">
+
+- chore: version update framework to 1.1.24
+
+</Update>
--- a/docs/changelogs/v1.3.19.mdx
+++ b/docs/changelogs/v1.3.19.mdx
@@ -0,0 +1,89 @@
+---
+title: "v1.3.19"
+description: "v1.3.19 changelog"
+---
+<Tabs>
+  <Tab title="NPX">
+    ```bash
+    npx -y @maximhq/bifrost --transport-version v1.3.19
+    ```
+  </Tab>
+  <Tab title="Docker">
+    ```bash
+    docker pull maximhq/bifrost:v1.3.19
+    docker run -p 8080:8080 maximhq/bifrost:v1.3.19
+    ```
+  </Tab>
+</Tabs>
+
+<Update label="Bifrost(HTTP)" description="v1.3.19">
+
+- chore: version update core to 1.2.20 and framework to 1.1.24
+- chore: allowed changing name when updating a virtual key
+- feat: add numberOfRetries, fallbackIndex, and selected key name and id to context and telemetry metrics
+- feat: add used virtual key name and id to telemetry metrics
+- feat: send model deployment back in response extra fields
+- feat: add selected key and virtual key to logs filter
+- feat: add headers to MCP client config
+- feat: add `is_success` label to upstream latency metrics
+
+</Update>
+<Update label="Core" description="v1.3.19">
+
+- feat: add numberOfRetries, fallbackIndex and selected key name to context
+- [BREAKING] changed BifrostContextKeySelectedKey to BifrostContextKeySelectedKeyID
+- feat: send model deployment back in response extra fields
+- feat: add headers to MCP client config
+
+</Update>
+<Update label="Framework" description="v1.3.19">
+
+- chore: Upgrades core to 1.2.20
+- feat: add selected key and virtual key to logs table
+- feat: add headers to MCP client config
+
+</Update>
+<Update label="governance" description="v1.3.19">
+
+- chore: version update core to 1.2.20 and framework to 1.1.24
+
+</Update>
+<Update label="jsonparser" description="v1.3.19">
+
+- chore: version update core to 1.2.20 and framework to 1.1.24
+
+</Update>
+<Update label="logging" description="v1.3.19">
+
+- chore: version update core to 1.2.20 and framework to 1.1.24
+- feat: add selected key and virtual key to logs
+
+</Update>
+<Update label="maxim" description="v1.3.19">
+
+- chore: version update core to 1.2.20 and framework to 1.1.24
+
+</Update>
+<Update label="mocker" description="v1.3.19">
+
+- chore: version update core to 1.2.20 and framework to 1.1.24
+
+</Update>
+<Update label="otel" description="v1.3.19">
+
+- chore: version update core to 1.2.20 and framework to 1.1.24
+
+</Update>
+<Update label="semantic_cache" description="v1.3.19">
+
+- chore: version update core to 1.2.20 and framework to 1.1.24
+
+</Update>
+<Update label="telemetry" description="v1.3.19">
+
+- chore: version update core to 1.2.20 and framework to 1.1.24
+- feat: add numberOfRetries, fallbackIndex, and selected key name and id to context and telemetry metrics
+- feat: add used virtual key name and id to telemetry metrics
+- feat: add `is_success` label to upstream latency metrics
+
+</Update>
--- a/docs/changelogs/v1.3.2.mdx
+++ b/docs/changelogs/v1.3.2.mdx
@@ -0,0 +1,84 @@
+---
+title: "v1.3.2"
+description: "v1.3.2 changelog"
+---
+<Tabs>
+  <Tab title="NPX">
+    ```bash
+    npx -y @maximhq/bifrost --transport-version v1.3.2
+    ```
+  </Tab>
+  <Tab title="Docker">
+    ```bash
+    docker pull maximhq/bifrost:v1.3.2
+    docker run -p 8080:8080 maximhq/bifrost:v1.3.2
+    ```
+  </Tab>
+</Tabs>
+
+<Update label="Bifrost(HTTP)" description="v1.3.2">
+
+- Refactor: Moves all context key types to schemas.BifrostContextKey
+- Fix: Fixes Maxim plugin bug where an external traceId was blocking new trace creation
+
+</Update>
+<Update label="Core" description="v1.3.2">
+
+- Chore: schemas.BifrostContextKey is now the only valid context key type throughout the project
+
+</Update>
+<Update label="Framework" description="v1.3.2">
+
+- Upgrade dependency: core to 1.2.8
+- Chore: Moves all context key types to schemas.BifrostContextKey
+- Chore: Adds new logs table migration to avoid missing any required columns in the DB
+
+</Update>
+<Update label="governance" description="v1.3.2">
+
+- Upgrade dependency: core to 1.2.8
+- Chore: Moves all context key types to schemas.BifrostContextKey
+
+</Update>
+<Update label="jsonparser" description="v1.3.2">
+
+- Upgrade dependency: core to 1.2.8
+- Chore: Moves all context key types to schemas.BifrostContextKey
+
+</Update>
+<Update label="logging" description="v1.3.2">
+
+- Upgrade dependency: core to 1.2.8
+- Chore: Moves all context key types to schemas.BifrostContextKey
+
+</Update>
+<Update label="maxim" description="v1.3.2">
+
+- Upgrade dependency: core to 1.2.8
+- Chore: Moves all context key types to schemas.BifrostContextKey
+- Fix: Fixes a bug where external trace id was blocking new trace creation
+
+</Update>
+<Update label="mocker" description="v1.3.2">
+
+- Upgrade dependency: core to 1.2.8
+
+</Update>
+<Update label="otel" description="v1.3.2">
+
+- Upgrade dependency: core to 1.2.8
+- Chore: Moves all context key types to schemas.BifrostContextKey
+
+</Update>
+<Update label="semantic_cache" description="v1.3.2">
+
+- Upgrade dependency: core to 1.2.8
+- Chore: Moves all context key types to schemas.BifrostContextKey
+
+</Update>
+<Update label="telemetry" description="v1.3.2">
+
+- Upgrade dependency: core to 1.2.8
+- Chore: Moves all context key types to schemas.BifrostContextKey
+
+</Update>
--- a/docs/changelogs/v1.3.20.mdx
+++ b/docs/changelogs/v1.3.20.mdx
@@ -0,0 +1,23 @@
+---
+title: "v1.3.20"
+description: "v1.3.20 changelog"
+---
+<Tabs>
+  <Tab title="NPX">
+    ```bash
+    npx -y @maximhq/bifrost --transport-version v1.3.20
+    ```
+  </Tab>
+  <Tab title="Docker">
+    ```bash
+    docker pull maximhq/bifrost:v1.3.20
+    docker run -p 8080:8080 maximhq/bifrost:v1.3.20
+    ```
+  </Tab>
+</Tabs>
+
+<Update label="Bifrost(HTTP)" description="v1.3.20">
+
+- fix: handle case when config store is nil in session and plugins handlers
+
+</Update>
--- a/docs/changelogs/v1.3.21.mdx
+++ b/docs/changelogs/v1.3.21.mdx
@@ -0,0 +1,24 @@
+---
+title: "v1.3.21"
+description: "v1.3.21 changelog"
+---
+<Tabs>
+  <Tab title="NPX">
+    ```bash
+    npx -y @maximhq/bifrost --transport-version v1.3.21
+    ```
+  </Tab>
+  <Tab title="Docker">
+    ```bash
+    docker pull maximhq/bifrost:v1.3.21
+    docker run -p 8080:8080 maximhq/bifrost:v1.3.21
+    ```
+  </Tab>
+</Tabs>
+
+<Update label="Bifrost(HTTP)" description="v1.3.21">
+
+- fix: handle case when config store is nil in session and plugins handlers
+- chore: adds integration tests for different config combinations
+
+</Update>
--- a/docs/changelogs/v1.3.22.mdx
+++ b/docs/changelogs/v1.3.22.mdx
@@ -0,0 +1,77 @@
+---
+title: "v1.3.22"
+description: "v1.3.22 changelog - 2025-11-09"
+---
+<Tabs>
+  <Tab title="NPX">
+    ```bash
+    npx -y @maximhq/bifrost --transport-version v1.3.22
+    ```
+  </Tab>
+  <Tab title="Docker">
+    ```bash
+    docker pull maximhq/bifrost:v1.3.22
+    docker run -p 8080:8080 maximhq/bifrost:v1.3.22
+    ```
+  </Tab>
+</Tabs>
+
+<Update label="Bifrost(HTTP)" description="v1.3.22">
+
+- feat: Adds option to disable authentication on inference calls
+- chore: Adds dark image for new version infographic
+
+</Update>
+<Update label="Core" description="v1.3.22">
+
+- feat: add numberOfRetries, fallbackIndex and selected key name to context
+- [BREAKING] changed BifrostContextKeySelectedKey to BifrostContextKeySelectedKeyID
+- feat: send model deployment back in response extra fields
+- feat: add headers to MCP client config
+
+</Update>
+<Update label="Framework" description="v1.3.22">
+
+- Adds DisableAuthOnInference to AuthConfig
+
+</Update>
+<Update label="governance" description="v1.3.22">
+
+- chore: version update framework to 1.1.25
+
+</Update>
+<Update label="jsonparser" description="v1.3.22">
+
+- chore: version update framework to 1.1.25
+
+</Update>
+<Update label="logging" description="v1.3.22">
+
+- chore: version update framework to 1.1.25
+
+</Update>
+<Update label="maxim" description="v1.3.22">
+
+- chore: version update framework to 1.1.25
+
+</Update>
+<Update label="mocker" description="v1.3.22">
+
+- chore: version update framework to 1.1.25
+
+</Update>
+<Update label="otel" description="v1.3.22">
+
+- chore: version update framework to 1.1.25
+
+</Update>
+<Update label="semantic_cache" description="v1.3.22">
+
+- chore: version update framework to 1.1.25
+
+</Update>
+<Update label="telemetry" description="v1.3.22">
+
+- chore: version update framework to 1.1.25
+
+</Update>
--- a/docs/changelogs/v1.3.23.mdx
+++ b/docs/changelogs/v1.3.23.mdx
@@ -0,0 +1,81 @@
+---
+title: "v1.3.23"
+description: "v1.3.23 changelog - 2025-11-10"
+---
+<Tabs>
+  <Tab title="NPX">
+    ```bash
+    npx -y @maximhq/bifrost --transport-version v1.3.23
+    ```
+  </Tab>
+  <Tab title="Docker">
+    ```bash
+    docker pull maximhq/bifrost:v1.3.23
+    docker run -p 8080:8080 maximhq/bifrost:v1.3.23
+    ```
+  </Tab>
+</Tabs>
+
+<Update label="Bifrost(HTTP)" description="v1.3.23">
+- chore: version update core to 1.2.21 and framework to 1.1.26
+- feat: add headers to MCP client config and provider config
+- feat: adds support for custom path overrides for custom providers
+- feat: adds support for keyless authentication for custom providers
+- feat: handles `response_schema` and `response_json_schema` parameter in gemini integration
+- refactor: better mcp client management
+- feat: option to disable content logging
+- feat: key selection and retries info sent in genai traces
+- feat: option to edit and reconnect mcp clients
+
+</Update>
+<Update label="Core" description="v1.3.23">
+
+- feat: add headers to MCP client config and provider config
+- feat: adds support for custom path overrides for custom providers
+- feat: adds support for keyless authentication for custom providers
+- feat: handles `response_schema` and `response_json_schema` parameter in gemini integration
+- [BREAKING] MCP client Public API now takes mcp client ids instead of names
+- refactor: better mcp client management
+
+</Update>
+<Update label="Framework" description="v1.3.23">
+- chore: version update core to 1.2.21
+- feat: add headers to MCP client config
+- refactor: mcp clients to use ids instead of names
+- feat: option to disable content logging
+
+</Update>
+<Update label="governance" description="v1.3.23">
+- chore: version update core to 1.2.21 and framework to 1.1.26
+
+</Update>
+<Update label="jsonparser" description="v1.3.23">
+- chore: version update core to 1.2.21 and framework to 1.1.26
+
+</Update>
+<Update label="logging" description="v1.3.23">
+- chore: version update core to 1.2.21 and framework to 1.1.26
+- feat: option to disable content logging
+
+</Update>
+<Update label="maxim" description="v1.3.23">
+- chore: version update core to 1.2.21 and framework to 1.1.26
+
+</Update>
+<Update label="mocker" description="v1.3.23">
+- chore: version update core to 1.2.21 and framework to 1.1.26
+
+</Update>
+<Update label="otel" description="v1.3.23">
+- chore: version update core to 1.2.21 and framework to 1.1.26
+- feat: key selection and retries info sent in genai traces
+
+</Update>
+<Update label="semantic_cache" description="v1.3.23">
+- chore: version update core to 1.2.21 and framework to 1.1.26
+
+</Update>
+<Update label="telemetry" description="v1.3.23">
+- chore: version update core to 1.2.21 and framework to 1.1.26
+
+</Update>
--- a/docs/changelogs/v1.3.24.mdx
+++ b/docs/changelogs/v1.3.24.mdx
@@ -0,0 +1,64 @@
+---
+title: "v1.3.24"
+description: "v1.3.24 changelog - 2025-11-11"
+---
+<Tabs>
+  <Tab title="NPX">
+    ```bash
+    npx -y @maximhq/bifrost --transport-version v1.3.24
+    ```
+  </Tab>
+  <Tab title="Docker">
+    ```bash
+    docker pull maximhq/bifrost:v1.3.24
+    docker run -p 8080:8080 maximhq/bifrost:v1.3.24
+    ```
+  </Tab>
+</Tabs>
+
+<Update label="Bifrost(HTTP)" description="v1.3.24">
+- chore: update core version to 1.2.22 and framework version to 1.1.27
+- feat: Adds input message in logs table for easier navigation
+
+</Update>
+<Update label="Core" description="v1.3.24">
+- chore: Adds index to ChatAssistantMessageToolCall
+- fix: responses text output standardization to content blocks
+
+</Update>
+<Update label="Framework" description="v1.3.24">
+- chore: update core version to 1.2.22
+
+</Update>
+<Update label="governance" description="v1.3.24">
+- chore: update core version to 1.2.22 and framework version to 1.1.27
+
+</Update>
+<Update label="jsonparser" description="v1.3.24">
+- chore: update core version to 1.2.22 and framework version to 1.1.27
+
+</Update>
+<Update label="logging" description="v1.3.24">
+- chore: update core version to 1.2.22 and framework version to 1.1.27
+
+</Update>
+<Update label="maxim" description="v1.3.24">
+- chore: update core version to 1.2.22 and framework version to 1.1.27
+
+</Update>
+<Update label="mocker" description="v1.3.24">
+- chore: update core version to 1.2.22 and framework version to 1.1.27
+
+</Update>
+<Update label="otel" description="v1.3.24">
+- chore: update core version to 1.2.22 and framework version to 1.1.27
+
+</Update>
+<Update label="semantic_cache" description="v1.3.24">
+- chore: update core version to 1.2.22 and framework version to 1.1.27
+
+</Update>
+<Update label="telemetry" description="v1.3.24">
+- chore: update core version to 1.2.22 and framework version to 1.1.27
+
+</Update>
--- a/docs/changelogs/v1.3.25.mdx
+++ b/docs/changelogs/v1.3.25.mdx
@@ -0,0 +1,78 @@
+---
+title: "v1.3.25"
+description: "v1.3.25 changelog - 2025-11-14"
+---
+<Tabs>
+  <Tab title="NPX">
+    ```bash
+    npx -y @maximhq/bifrost --transport-version v1.3.25
+    ```
+  </Tab>
+  <Tab title="Docker">
+    ```bash
+    docker pull maximhq/bifrost:v1.3.25
+    docker run -p 8080:8080 maximhq/bifrost:v1.3.25
+    ```
+  </Tab>
+</Tabs>
+
+<Update label="Bifrost(HTTP)" description="1.3.25">
+- chore: update core version to 1.2.23 and framework version to 1.1.28
+- feat: added unified streaming lifecycle events across all providers to fully align with OpenAI’s streaming response types.
+- chore: shift from `alpha/responses` to `v1/responses` in openrouter provider for responses API
+- feat: send back pricing data for models in list models response
+- fix: add support for keyless providers in list models request
+- feat: add support for custom fine-tuned models in vertex provider
+- feat: send deployment aliases in list models response for supported providers
+- feat: support for API Key auth in vertex provider
+- feat: support for system account in environment for vertex provider
+
+</Update>
+<Update label="Core" description="1.2.23">
+- feat: added unified streaming lifecycle events across all providers to fully align with OpenAI’s streaming response types.
+- chore: shift from `alpha/responses` to `v1/responses` in openrouter provider for responses API
+- fix: add support for keyless providers in list models request
+- feat: add support for custom fine-tuned models in vertex provider
+- fix: vertex provider list models now correctly returns the custom fine-tuned model ids in the response
+- feat: send deployment aliases in list models response for supported providers
+- feat: support for API Key auth in vertex provider
+
+</Update>
+<Update label="Framework" description="1.1.28">
+- chore: update core version to 1.2.23
+- feat: expose method to get pricing data for a model in model catalog
+- feat: add project number and deployments to vertex key config
+
+</Update>
+<Update label="governance" description="1.3.29">
+- chore: update core version to 1.2.23 and framework version to 1.1.28
+
+</Update>
+<Update label="jsonparser" description="1.3.29">
+- chore: update core version to 1.2.23 and framework version to 1.1.28
+
+</Update>
+<Update label="logging" description="1.3.29">
+- chore: update core version to 1.2.23 and framework version to 1.1.28
+
+</Update>
+<Update label="maxim" description="1.4.28">
+- chore: update core version to 1.2.23 and framework version to 1.1.28
+
+</Update>
+<Update label="mocker" description="1.3.28">
+- chore: update core version to 1.2.23 and framework version to 1.1.28
+
+</Update>
+<Update label="otel" description="1.0.28">
+- chore: update core version to 1.2.23 and framework version to 1.1.28
+
+</Update>
+<Update label="semantic_cache" description="1.3.28">
+- chore: update core version to 1.2.23 and framework version to 1.1.28
+
+</Update>
+<Update label="telemetry" description="1.3.28">
+- chore: update core version to 1.2.23 and framework version to 1.1.28
+
+</Update>
--- a/docs/changelogs/v1.3.26.mdx
+++ b/docs/changelogs/v1.3.26.mdx
@@ -0,0 +1,64 @@
+---
+title: "v1.3.26"
+description: "v1.3.26 changelog - 2025-11-16"
+---
+<Tabs>
+  <Tab title="NPX">
+    ```bash
+    npx -y @maximhq/bifrost --transport-version v1.3.26
+    ```
+  </Tab>
+  <Tab title="Docker">
+    ```bash
+    docker pull maximhq/bifrost:v1.3.26
+    docker run -p 8080:8080 maximhq/bifrost:v1.3.26
+    ```
+  </Tab>
+</Tabs>
+
+<Update label="Bifrost(HTTP)" description="1.3.26">
+- feat: adds support for ElevenLabs provider
+- fix: fixes security settings form submission with empty CORS domains
+- chore: minor ui enhancements
+
+</Update>
+<Update label="Core" description="1.2.24">
+- feat: Added ElevenLabs provider
+
+</Update>
+<Update label="Framework" description="1.1.29">
+- chore: update core version to 1.2.24
+
+</Update>
+<Update label="governance" description="1.3.30">
+- chore: update core version to 1.2.24 and framework version to 1.1.29
+
+</Update>
+<Update label="jsonparser" description="1.3.30">
+- chore: update core version to 1.2.24 and framework version to 1.1.29
+
+</Update>
+<Update label="logging" description="1.3.30">
+- chore: update core version to 1.2.24 and framework version to 1.1.29
+
+</Update>
+<Update label="maxim" description="1.4.29">
+- chore: update core version to 1.2.24 and framework version to 1.1.29
+
+</Update>
+<Update label="mocker" description="1.3.29">
+- chore: update core version to 1.2.24 and framework version to 1.1.29
+
+</Update>
+<Update label="otel" description="1.0.29">
+- chore: update core version to 1.2.24 and framework version to 1.1.29
+
+</Update>
+<Update label="semantic_cache" description="1.3.29">
+- chore: update core version to 1.2.24 and framework version to 1.1.29
+
+</Update>
+<Update label="telemetry" description="1.3.29">
+- chore: update core version to 1.2.24 and framework version to 1.1.29
+
+</Update>
--- a/docs/changelogs/v1.3.27.mdx
+++ b/docs/changelogs/v1.3.27.mdx
@@ -0,0 +1,62 @@
+---
+title: "v1.3.27"
+description: "v1.3.27 changelog - 2025-11-17"
+---
+<Tabs>
+  <Tab title="NPX">
+    ```bash
+    npx -y @maximhq/bifrost --transport-version v1.3.27
+    ```
+  </Tab>
+  <Tab title="Docker">
+    ```bash
+    docker pull maximhq/bifrost:v1.3.27
+    docker run -p 8080:8080 maximhq/bifrost:v1.3.27
+    ```
+  </Tab>
+</Tabs>
+
+<Update label="Bifrost(HTTP)" description="1.3.27">
+- fix: bedrock memory and streaming response parsing fixes
+
+</Update>
+<Update label="Core" description="1.2.25">
+- fix: bedrock memory and streaming response parsing fixes
+
+</Update>
+<Update label="Framework" description="1.1.30">
+- chore: update core version to 1.2.25
+
+</Update>
+<Update label="governance" description="1.3.31">
+- chore: update core version to 1.2.25 and framework version to 1.1.30
+
+</Update>
+<Update label="jsonparser" description="1.3.31">
+- chore: update core version to 1.2.25 and framework version to 1.1.30
+
+</Update>
+<Update label="logging" description="1.3.31">
+- chore: update core version to 1.2.25 and framework version to 1.1.30
+
+</Update>
+<Update label="maxim" description="1.4.30">
+- chore: update core version to 1.2.25 and framework version to 1.1.30
+
+</Update>
+<Update label="mocker" description="1.3.30">
+- chore: update core version to 1.2.25 and framework version to 1.1.30
+
+</Update>
+<Update label="otel" description="1.0.30">
+- chore: update core version to 1.2.25 and framework version to 1.1.30
+
+</Update>
+<Update label="semantic_cache" description="1.3.30">
+- chore: update core version to 1.2.25 and framework version to 1.1.30
+
+</Update>
+<Update label="telemetry" description="1.3.30">
+- chore: update core version to 1.2.25 and framework version to 1.1.30
+
+</Update>
--- a/docs/changelogs/v1.3.28.mdx
+++ b/docs/changelogs/v1.3.28.mdx
@@ -0,0 +1,58 @@
+---
+title: "v1.3.28"
+description: "v1.3.28 changelog - 2025-11-18"
+---
+<Tabs>
+  <Tab title="NPX">
+    ```bash
+    npx -y @maximhq/bifrost --transport-version v1.3.28
+    ```
+  </Tab>
+  <Tab title="Docker">
+    ```bash
+    docker pull maximhq/bifrost:v1.3.28
+    docker run -p 8080:8080 maximhq/bifrost:v1.3.28
+    ```
+  </Tab>
+</Tabs>
+
+<Update label="Bifrost(HTTP)" description="1.3.28">
+- feat: Improves log page loading performance for millions of logs stored on sqlite
+
+</Update>
+<Update label="Framework" description="1.1.31">
+- feat: splits logs APIs into `getStats` and `getLogs` to improve speed for sqlite
+
+</Update>
+<Update label="governance" description="1.3.32">
+- chore: update framework version to 1.1.31
+
+</Update>
+<Update label="jsonparser" description="1.3.32">
+- chore: update framework version to 1.1.31
+
+</Update>
+<Update label="logging" description="1.3.32">
+- chore: update framework version to 1.1.31
+
+</Update>
+<Update label="maxim" description="1.4.31">
+- chore: update framework version to 1.1.31
+
+</Update>
+<Update label="mocker" description="1.3.31">
+- chore: update framework version to 1.1.31
+
+</Update>
+<Update label="otel" description="1.0.31">
+- chore: update framework version to 1.1.31
+
+</Update>
+<Update label="semantic_cache" description="1.3.31">
+- chore: update framework version to 1.1.31
+
+</Update>
+<Update label="telemetry" description="1.3.31">
+- chore: update framework version to 1.1.31
+
+</Update>
--- a/docs/changelogs/v1.3.29.mdx
+++ b/docs/changelogs/v1.3.29.mdx
@@ -0,0 +1,71 @@
+---
+title: "v1.3.29"
+description: "v1.3.29 changelog - 2025-11-18"
+---
+<Tabs>
+  <Tab title="NPX">
+    ```bash
+    npx -y @maximhq/bifrost --transport-version v1.3.29
+    ```
+  </Tab>
+  <Tab title="Docker">
+    ```bash
+    docker pull maximhq/bifrost:v1.3.29
+    docker run -p 8080:8080 maximhq/bifrost:v1.3.29
+    ```
+  </Tab>
+</Tabs>
+
+<Update label="Bifrost(HTTP)" description="1.3.29">
+- fix: properly set bifrost version in metrics
+- feat: added team_id, team_name, customer_id and customer_name labels to otel metrics
+- fix: skip adding google/ prefix for custom fine-tuned models in vertex provider (for genai integration)
+- fix: deep copy inputs in semantic cache plugin to not mutate the original request
+
+</Update>
+<Update label="Core" description="1.2.26">
+- fix: skip adding google/ prefix for custom fine-tuned models in vertex provider
+- feat: added DeepCopy functions to schemas package
+
+</Update>
+<Update label="Framework" description="1.1.32">
+- chore: update core version to 1.2.26
+
+</Update>
+<Update label="governance" description="1.3.33">
+- chore: update core version to 1.2.26 and framework version to 1.1.32
+
+</Update>
+<Update label="jsonparser" description="1.3.33">
+- chore: update core version to 1.2.26 and framework version to 1.1.32
+
+</Update>
+<Update label="logging" description="1.3.33">
+- chore: update core version to 1.2.26 and framework version to 1.1.32
+
+</Update>
+<Update label="maxim" description="1.4.32">
+- chore: update core version to 1.2.26 and framework version to 1.1.32
+
+</Update>
+<Update label="mocker" description="1.3.32">
+- chore: update core version to 1.2.26 and framework version to 1.1.32
+
+</Update>
+<Update label="otel" description="1.0.32">
+- chore: update core version to 1.2.26 and framework version to 1.1.32
+- fix: properly set bifrost version in metrics
+- feat: added team_id, team_name, customer_id and customer_name labels to otel metrics
+
+</Update>
+<Update label="semantic_cache" description="1.3.32">
+- chore: update core version to 1.2.26 and framework version to 1.1.32
+- fix: deep copy inputs to not mutate the original request
+
+</Update>
+<Update label="telemetry" description="1.3.32">
+- chore: update core version to 1.2.26 and framework version to 1.1.32
+- feat: added filter for custom labels that are already default labels
+- feat: added team_id, team_name, customer_id and customer_name labels to telemetry metrics
+
+</Update>
--- a/docs/changelogs/v1.3.3.mdx
+++ b/docs/changelogs/v1.3.3.mdx
@@ -0,0 +1,75 @@
+---
+title: "v1.3.3"
+description: "v1.3.3 changelog"
+---
+<Tabs>
+  <Tab title="NPX">
+    ```bash
+    npx -y @maximhq/bifrost --transport-version v1.3.3
+    ```
+  </Tab>
+  <Tab title="Docker">
+    ```bash
+    docker pull maximhq/bifrost:v1.3.3
+    docker run -p 8080:8080 maximhq/bifrost:v1.3.3
+    ```
+  </Tab>
+</Tabs>
+
+<Update label="Bifrost(HTTP)" description="v1.3.3">
+
+- Upgrade dependency: core to 1.2.9
+- Fix: JSON serialization for error objects and tool function parameters
+
+</Update>
+<Update label="Core" description="v1.3.3">
+
+- Fix: Fixed JSON serialization for error objects and tool function parameters
+
+</Update>
+<Update label="Framework" description="v1.3.3">
+
+- Upgrade dependency: core to 1.2.9
+- Fix: JSON serialization for error objects and tool function parameters
+
+</Update>
+<Update label="governance" description="v1.3.3">
+
+- chore: version update core to 1.2.9
+
+</Update>
+<Update label="jsonparser" description="v1.3.3">
+
+- chore: version update core to 1.2.9
+
+</Update>
+<Update label="logging" description="v1.3.3">
+
+- chore: version update core to 1.2.9
+
+</Update>
+<Update label="maxim" description="v1.3.3">
+
+- chore: version update core to 1.2.9
+
+</Update>
+<Update label="mocker" description="v1.3.3">
+
+- chore: version update core to 1.2.9
+
+</Update>
+<Update label="otel" description="v1.3.3">
+
+- chore: version update core to 1.2.9
+
+</Update>
+<Update label="semantic_cache" description="v1.3.3">
+
+- chore: version update core to 1.2.9
+
+</Update>
+<Update label="telemetry" description="v1.3.3">
+
+- chore: version update core to 1.2.9
+
+</Update>
--- a/docs/changelogs/v1.3.30.mdx
+++ b/docs/changelogs/v1.3.30.mdx
@@ -0,0 +1,62 @@
+---
+title: "v1.3.30"
+description: "v1.3.30 changelog - 2025-11-18"
+---
+<Tabs>
+  <Tab title="NPX">
+    ```bash
+    npx -y @maximhq/bifrost --transport-version v1.3.30
+    ```
+  </Tab>
+  <Tab title="Docker">
+    ```bash
+    docker pull maximhq/bifrost:v1.3.30
+    docker run -p 8080:8080 maximhq/bifrost:v1.3.30
+    ```
+  </Tab>
+</Tabs>
+
+<Update label="Bifrost(HTTP)" description="1.3.30">
+- feat: adds migration for missing provider column in key table
+
+<Warning>
+Each entry in "keys" under "provider_config" in the `config.json` file requires a unique name. If any names collide, Bifrost won't be able to boot.
+</Warning>
+
+</Update>
+<Update label="Framework" description="1.1.33">
+feat: add migration for missing provider column in key table
+
+</Update>
+<Update label="governance" description="1.3.34">
+chore: update framework version to 1.1.33
+
+</Update>
+<Update label="jsonparser" description="1.3.34">
+chore: update framework version to 1.1.33
+
+</Update>
+<Update label="logging" description="1.3.34">
+chore: update framework version to 1.1.33
+
+</Update>
+<Update label="maxim" description="1.4.33">
+chore: update framework version to 1.1.33
+
+</Update>
+<Update label="mocker" description="1.3.33">
+chore: update framework version to 1.1.33
+
+</Update>
+<Update label="otel" description="1.0.33">
+chore: update framework version to 1.1.33
+
+</Update>
+<Update label="semantic_cache" description="1.3.33">
+chore: update framework version to 1.1.33
+
+</Update>
+<Update label="telemetry" description="1.3.33">
+chore: update framework version to 1.1.33
+
+</Update>
--- a/docs/changelogs/v1.3.31.mdx
+++ b/docs/changelogs/v1.3.31.mdx
@@ -0,0 +1,62 @@
+---
+title: "v1.3.31"
+description: "v1.3.31 changelog - 2025-11-19"
+---
+<Tabs>
+  <Tab title="NPX">
+    ```bash
+    npx -y @maximhq/bifrost --transport-version v1.3.31
+    ```
+  </Tab>
+  <Tab title="Docker">
+    ```bash
+    docker pull maximhq/bifrost:v1.3.31
+    docker run -p 8080:8080 maximhq/bifrost:v1.3.31
+    ```
+  </Tab>
+</Tabs>
+
+<Update label="Bifrost(HTTP)" description="1.3.31">
+fix: integration fixes for fallbacks
+
+</Update>
+<Update label="Core" description="1.2.27">
+fix: integration converter fixes for fallbacks
+
+</Update>
+<Update label="Framework" description="1.1.34">
+chore: update core version to 1.2.27
+
+</Update>
+<Update label="governance" description="1.3.35">
+chore: update core version to 1.2.27 and framework version to 1.1.34
+
+</Update>
+<Update label="jsonparser" description="1.3.35">
+chore: update core version to 1.2.27 and framework version to 1.1.34
+
+</Update>
+<Update label="logging" description="1.3.35">
+chore: update core version to 1.2.27 and framework version to 1.1.34
+
+</Update>
+<Update label="maxim" description="1.4.34">
+chore: update core version to 1.2.27 and framework version to 1.1.34
+
+</Update>
+<Update label="mocker" description="1.3.34">
+chore: update core version to 1.2.27 and framework version to 1.1.34
+
+</Update>
+<Update label="otel" description="1.0.34">
+chore: update core version to 1.2.27 and framework version to 1.1.34
+
+</Update>
+<Update label="semantic_cache" description="1.3.34">
+chore: update core version to 1.2.27 and framework version to 1.1.34
+
+</Update>
+<Update label="telemetry" description="1.3.34">
+chore: update core version to 1.2.27 and framework version to 1.1.34
+
+</Update>
--- a/docs/changelogs/v1.3.32.mdx
+++ b/docs/changelogs/v1.3.32.mdx
@@ -0,0 +1,72 @@
+---
+title: "v1.3.32"
+description: "v1.3.32 changelog - 2025-11-20"
+---
+<Tabs>
+  <Tab title="NPX">
+    ```bash
+    npx -y @maximhq/bifrost --transport-version v1.3.32
+    ```
+  </Tab>
+  <Tab title="Docker">
+    ```bash
+    docker pull maximhq/bifrost:v1.3.32
+    docker run -p 8080:8080 maximhq/bifrost:v1.3.32
+    ```
+  </Tab>
+</Tabs>
+
+<Update label="Bifrost(HTTP)" description="1.3.32">
+- feat: added support for structured output in the Anthropic provider
+- fix: Gemini thought signature preservation for multi-turn function calling (#879)
+- fix: responses API stream lifecycle events fixes
+- fix: embedding models usage with vertex provider using gemini integration
+- feat: support for anthropic passthrough in streaming for claude code
+- fix: lookup for virtual key in authorization and x-api-key headers for provider routing
+- fix: added responses stream passthrough for codex in openai integration
+
+</Update>
+<Update label="Core" description="1.2.28">
+- feat: added support for structured output in the Anthropic provider
+- fix: Gemini thought signature preservation for multi-turn function calling (#879)
+- fix: responses API stream lifecycle events fixes
+- feat: support for anthropic passthrough in streaming for claude code
+
+</Update>
+<Update label="Framework" description="1.1.35">
+chore: update core version to 1.2.28
+
+</Update>
+<Update label="governance" description="1.3.36">
+- chore: update core version to 1.2.28 and framework version to 1.1.35
+- fix: lookup for virtual key in authorization and x-api-key headers
+
+</Update>
+<Update label="jsonparser" description="1.3.36">
+chore: update core version to 1.2.28 and framework version to 1.1.35
+
+</Update>
+<Update label="logging" description="1.3.36">
+chore: update core version to 1.2.28 and framework version to 1.1.35
+
+</Update>
+<Update label="maxim" description="1.4.35">
+chore: update core version to 1.2.28 and framework version to 1.1.35
+
+</Update>
+<Update label="mocker" description="1.3.35">
+chore: update core version to 1.2.28 and framework version to 1.1.35
+
+</Update>
+<Update label="otel" description="1.0.35">
+chore: update core version to 1.2.28 and framework version to 1.1.35
+
+</Update>
+<Update label="semantic_cache" description="1.3.35">
+chore: update core version to 1.2.28 and framework version to 1.1.35
+
+</Update>
+<Update label="telemetry" description="1.3.35">
+chore: update core version to 1.2.28 and framework version to 1.1.35
+
+</Update>
--- a/docs/changelogs/v1.3.33.mdx
+++ b/docs/changelogs/v1.3.33.mdx
@@ -0,0 +1,65 @@
+---
+title: "v1.3.33"
+description: "v1.3.33 changelog - 2025-11-21"
+---
+<Tabs>
+  <Tab title="NPX">
+    ```bash
+    npx -y @maximhq/bifrost --transport-version v1.3.33
+    ```
+  </Tab>
+  <Tab title="Docker">
+    ```bash
+    docker pull maximhq/bifrost:v1.3.33
+    docker run -p 8080:8080 maximhq/bifrost:v1.3.33
+    ```
+  </Tab>
+</Tabs>
+
+<Update label="Bifrost(HTTP)" description="1.3.33">
+- feat: Adds a log retention config and a routine that cleans up logs daily based on it. The default retention period is 365 days.
+- fix: Added parsing for cached creation input tokens for Anthropic and Bedrock
+- fix: Handled cost calculation for cached tokens
+
+</Update>
+<Update label="Core" description="1.2.29">
+- fix: added parsing for cached creation input tokens for Anthropic and Bedrock
+
+</Update>
+<Update label="Framework" description="1.1.36">
+- fix: handled cost calculation for cached tokens
+- feat: adds support for log cleanup routine
+
+</Update>
+<Update label="governance" description="1.3.37">
+- chore: updates core version to 1.2.29 and framework version to 1.1.36
+
+</Update>
+<Update label="jsonparser" description="1.3.37">
+- chore: updates core version to 1.2.29 and framework version to 1.1.36
+
+</Update>
+<Update label="logging" description="1.3.37">
+- chore: updates core version to 1.2.29 and framework version to 1.1.36
+
+</Update>
+<Update label="maxim" description="1.4.36">
+- chore: updates core version to 1.2.29 and framework version to 1.1.36
+
+</Update>
+<Update label="mocker" description="1.3.36">
+- chore: updates core version to 1.2.29 and framework version to 1.1.36
+
+</Update>
+<Update label="otel" description="1.0.36">
+- chore: updates core version to 1.2.29 and framework version to 1.1.36
+
+</Update>
+<Update label="semantic_cache" description="1.3.36">
+- chore: updates core version to 1.2.29 and framework version to 1.1.36
+
+</Update>
+<Update label="telemetry" description="1.3.36">
+- chore: updates core version to 1.2.29 and framework version to 1.1.36
+
+</Update>
--- a/docs/changelogs/v1.3.34.mdx
+++ b/docs/changelogs/v1.3.34.mdx
@@ -0,0 +1,59 @@
+---
+title: "v1.3.34"
+description: "v1.3.34 changelog - 2025-11-21"
+---
+<Tabs>
+  <Tab title="NPX">
+    ```bash
+    npx -y @maximhq/bifrost --transport-version v1.3.34
+    ```
+  </Tab>
+  <Tab title="Docker">
+    ```bash
+    docker pull maximhq/bifrost:v1.3.34
+    docker run -p 8080:8080 maximhq/bifrost:v1.3.34
+    ```
+  </Tab>
+</Tabs>
+
+<Update label="Bifrost(HTTP)" description="1.3.34">
+- feat: Log view is enabled even if config_store is disabled
+- fix: Add missing cache and batch pricing columns to ensure we compute costs for those operations accurately.
+
+</Update>
+<Update label="Framework" description="1.1.37">
+hotfix: Adds missing batch and cache token pricing columns in config_store
+
+</Update>
+<Update label="governance" description="1.3.38">
+- chore: upgrades framework version to 1.1.37
+
+</Update>
+<Update label="jsonparser" description="1.3.38">
+- chore: upgrades framework version to 1.1.37
+
+</Update>
+<Update label="logging" description="1.3.38">
+- chore: upgrades framework version to 1.1.37
+
+</Update>
+<Update label="maxim" description="1.4.37">
+- chore: upgrades framework version to 1.1.37
+
+</Update>
+<Update label="mocker" description="1.3.37">
+- chore: upgrades framework version to 1.1.37
+
+</Update>
+<Update label="otel" description="1.0.37">
+- chore: upgrades framework version to 1.1.37
+
+</Update>
+<Update label="semantic_cache" description="1.3.37">
+- chore: upgrades framework version to 1.1.37
+
+</Update>
+<Update label="telemetry" description="1.3.37">
+- chore: upgrades framework version to 1.1.37
+
+</Update>
--- a/docs/changelogs/v1.3.35.mdx
+++ b/docs/changelogs/v1.3.35.mdx
@@ -0,0 +1,71 @@
+---
+title: "v1.3.35"
+description: "v1.3.35 changelog - 2025-11-24"
+---
+<Tabs>
+  <Tab title="NPX">
+    ```bash
+    npx -y @maximhq/bifrost --transport-version v1.3.35
+    ```
+  </Tab>
+  <Tab title="Docker">
+    ```bash
+    docker pull maximhq/bifrost:v1.3.35
+    docker run -p 8080:8080 maximhq/bifrost:v1.3.35
+    ```
+  </Tab>
+</Tabs>
+
+<Update label="Bifrost(HTTP)" description="1.3.35">
+- feat: Qdrant Vector Search Support (#893)
+- fix: bedrock responses streaming last chunk indicator fixes
+- fix: gemini nil content check fixes
+- fix: handle responses.incomplete event in openai responses streaming
+- fix: stream accumulator nil content check fixes
+
+</Update>
+<Update label="Core" description="1.2.30">
+- fix: bedrock responses streaming last chunk indicator fixes
+- fix: gemini nil content check fixes
+- fix: handle responses.incomplete event in openai responses streaming
+- enhancements: provider tests enhancements
+
+</Update>
+<Update label="Framework" description="1.1.38">
+- feat: Qdrant Vector Search Support (#893)
+- fix: stream accumulator nil content check fixes
+- enhancement: added transactions on provider config updates
+
+</Update>
+<Update label="governance" description="1.3.39">
+- chore: upgrades core to 1.2.30 and framework to 1.1.38
+
+</Update>
+<Update label="jsonparser" description="1.3.39">
+- chore: upgrades core to 1.2.30 and framework to 1.1.38
+
+</Update>
+<Update label="logging" description="1.3.39">
+- chore: upgrades core to 1.2.30 and framework to 1.1.38
+
+</Update>
+<Update label="maxim" description="1.4.38">
+- chore: upgrades core to 1.2.30 and framework to 1.1.38
+
+</Update>
+<Update label="mocker" description="1.3.38">
+- chore: upgrades core to 1.2.30 and framework to 1.1.38
+
+</Update>
+<Update label="otel" description="1.0.38">
+- chore: upgrades core to 1.2.30 and framework to 1.1.38
+
+</Update>
+<Update label="semantic_cache" description="1.3.38">
+- chore: upgrades core to 1.2.30 and framework to 1.1.38
+
+</Update>
+<Update label="telemetry" description="1.3.38">
+- chore: upgrades core to 1.2.30 and framework to 1.1.38
+
+</Update>
--- a/docs/changelogs/v1.3.36.mdx
+++ b/docs/changelogs/v1.3.36.mdx
@@ -0,0 +1,60 @@
+---
+title: "v1.3.36"
+description: "v1.3.36 changelog - 2025-11-25"
+---
+<Tabs>
+  <Tab title="NPX">
+    ```bash
+    npx -y @maximhq/bifrost --transport-version v1.3.36
+    ```
+  </Tab>
+  <Tab title="Docker">
+    ```bash
+    docker pull maximhq/bifrost:v1.3.36
+    docker run -p 8080:8080 maximhq/bifrost:v1.3.36
+    ```
+  </Tab>
+</Tabs>
+
+<Update label="Bifrost(HTTP)" description="1.3.36">
+- feat: opus 4.5 is supported
+- chore: changelog structure update
+- fix: race conditions in stream accumulator
+
+</Update>
+<Update label="Framework" description="1.1.39">
+- fix: Fixes race condition in accumulator
+
+</Update>
+<Update label="governance" description="1.3.40">
+- chore: upgrades framework version to 1.1.39
+
+</Update>
+<Update label="jsonparser" description="1.3.40">
+- chore: upgrades framework version to 1.1.39
+
+</Update>
+<Update label="logging" description="1.3.40">
+- chore: upgrades framework version to 1.1.39
+
+</Update>
+<Update label="maxim" description="1.4.39">
+- chore: upgrades framework version to 1.1.39
+
+</Update>
+<Update label="mocker" description="1.3.39">
+- chore: upgrades framework version to 1.1.39
+
+</Update>
+<Update label="otel" description="1.0.39">
+- chore: upgrades framework version to 1.1.39
+
+</Update>
+<Update label="semantic_cache" description="1.3.39">
+- chore: upgrades framework version to 1.1.39
+
+</Update>
+<Update label="telemetry" description="1.3.39">
+- chore: upgrades framework version to 1.1.39
+
+</Update>
--- a/docs/changelogs/v1.3.37.mdx
+++ b/docs/changelogs/v1.3.37.mdx
@@ -0,0 +1,78 @@
+---
+title: "v1.3.37"
+description: "v1.3.37 changelog - 2025-11-28"
+---
+<Tabs>
+  <Tab title="NPX">
+    ```bash
+    npx -y @maximhq/bifrost --transport-version v1.3.37
+    ```
+  </Tab>
+  <Tab title="Docker">
+    ```bash
+    docker pull maximhq/bifrost:v1.3.37
+    docker run -p 8080:8080 maximhq/bifrost:v1.3.37
+    ```
+  </Tab>
+</Tabs>
+
+<Update label="Bifrost(HTTP)" description="1.3.37">
+- feat: pydantic SDK support
+- feat: bedrock SDK support
+- feat: adds versioning support for plugins
+- **breaking change**: plugins now accept `*schemas.BifrostContext` instead of `*context.Context`
+- fix: gemini tts fixes with audio encoding for cross SDK compatibility
+- feat: improved virtual key configuration flows
+- chore: improved test coverage
+- feat: check allowed models from model catalog for provider routing using virtual keys
+- fix: log cleanup timestamp in UTC to match log entry timestamps for processing logs
+- fix: prompt caching issue fixes for openai chat completions
+
+</Update>
+<Update label="Core" description="1.2.31">
+- **breaking change**: plugins now accept `*schemas.BifrostContext` instead of `*context.Context`
+- feat: adds support for bedrock, pydantic and cohere SDK.
+- fix: minor fixes around audio streaming for gemini and vertex
+- fix: prompt caching issue fixes for openai chat completions
+- feat: add versioning support for plugins
+- **breaking change**: `ToolFunctionParameters.Properties` is now an `*OrderedMap` instead of `*map[string]interface{}`
+
+</Update>
+<Update label="Framework" description="1.1.40">
+- feat: adds audio encoding flows for gemini tts workflows
+
+</Update>
+<Update label="governance" description="1.3.41">
+- chore: upgrades core to 1.2.31 and framework to 1.1.40
+- feat: check allowed models from model catalog for provider configs
+
+</Update>
+<Update label="jsonparser" description="1.3.41">
+- chore: upgrades core to 1.2.31 and framework to 1.1.40
+
+</Update>
+<Update label="logging" description="1.3.41">
+- chore: upgrades core to 1.2.31 and framework to 1.1.40
+- fix: log cleanup timestamp in UTC to match log entry timestamps
+
+</Update>
+<Update label="maxim" description="1.4.40">
+- chore: upgrades core to 1.2.31 and framework to 1.1.40
+
+</Update>
+<Update label="mocker" description="1.3.40">
+- chore: upgrades core to 1.2.31 and framework to 1.1.40
+
+</Update>
+<Update label="otel" description="1.0.40">
+- chore: upgrades core to 1.2.31 and framework to 1.1.40
+
+</Update>
+<Update label="semantic_cache" description="1.3.40">
+- chore: upgrades core to 1.2.31 and framework to 1.1.40
+
+</Update>
+<Update label="telemetry" description="1.3.40">
+- chore: upgrades core to 1.2.31 and framework to 1.1.40
+
+</Update>
--- a/docs/changelogs/v1.3.38.mdx
+++ b/docs/changelogs/v1.3.38.mdx
@@ -0,0 +1,74 @@
+---
+title: "v1.3.38"
+description: "v1.3.38 changelog - 2025-12-01"
+---
+<Tabs>
+  <Tab title="NPX">
+    ```bash
+    npx -y @maximhq/bifrost --transport-version v1.3.38
+    ```
+  </Tab>
+  <Tab title="Docker">
+    ```bash
+    docker pull maximhq/bifrost:v1.3.38
+    docker run -p 8080:8080 maximhq/bifrost:v1.3.38
+    ```
+  </Tab>
+</Tabs>
+
+<Update label="Bifrost(HTTP)" description="1.3.38">
+- feat: added support for the Google Gemini-style `x-goog-api-key` header in virtual key lookup and direct API key bypass
+- feat: added support for Anthropic models in Azure
+- chore: version update core to 1.2.32 and framework to 1.1.41
+- fix: provider retry config time conversion issue
+- fix: cache read input token cost calculation bug
+- enhancement: made model lookup for pricing more robust
+
+</Update>
+<Update label="Core" description="1.2.32">
+- feat: added support for Anthropic models in Azure
+- enhancement: using native Anthropic converters for Vertex Anthropic responses and responses stream
+- **breaking change**: `NetworkConfig` retry backoff values (`RetryBackoffInitial` and `RetryBackoffMax`) are now expressed in milliseconds in JSON and stored as `time.Duration` internally. Custom `MarshalJSON`/`UnmarshalJSON` methods ensure values are always interpreted as milliseconds when serializing to or deserializing from JSON, fixing cases where values were incorrectly interpreted as nanoseconds.
+
+</Update>
+<Update label="Framework" description="1.1.41">
+- chore: version update core to 1.2.32
+- fix: cache read input token cost calculation bug
+- enhancement: made bedrock model lookup more robust
+- enhancement: added support for deployment lookup in pricing
+
+</Update>
+<Update label="governance" description="1.3.42">
+- feat: added support for the Google Gemini-style `x-goog-api-key` header
+- chore: version update core to 1.2.32 and framework to 1.1.41
+
+</Update>
+<Update label="jsonparser" description="1.3.42">
+- chore: version update core to 1.2.32 and framework to 1.1.41
+
+</Update>
+<Update label="logging" description="1.3.42">
+- chore: version update core to 1.2.32 and framework to 1.1.41
+- fix: log entry number of retries not being updated
+
+</Update>
+<Update label="maxim" description="1.4.41">
+- chore: version update core to 1.2.32 and framework to 1.1.41
+
+</Update>
+<Update label="mocker" description="1.3.41">
+- chore: version update core to 1.2.32 and framework to 1.1.41
+
+</Update>
+<Update label="otel" description="1.0.41">
+- chore: version update core to 1.2.32 and framework to 1.1.41
+
+</Update>
+<Update label="semantic_cache" description="1.3.41">
+- chore: version update core to 1.2.32 and framework to 1.1.41
+
+</Update>
+<Update label="telemetry" description="1.3.41">
+- chore: version update core to 1.2.32 and framework to 1.1.41
+
+</Update>
--- a/docs/changelogs/v1.3.39.mdx
+++ b/docs/changelogs/v1.3.39.mdx
@@ -0,0 +1,70 @@
+---
+title: "v1.3.39"
+description: "v1.3.39 changelog - 2025-12-04"
+---
+<Tabs>
+  <Tab title="NPX">
+    ```bash
+    npx -y @maximhq/bifrost --transport-version v1.3.39
+    ```
+  </Tab>
+  <Tab title="Docker">
+    ```bash
+    docker pull maximhq/bifrost:v1.3.39
+    docker run -p 8080:8080 maximhq/bifrost:v1.3.39
+    ```
+  </Tab>
+</Tabs>
+
+<Update label="Bifrost(HTTP)" description="1.3.39">
+- fix: vertex and bedrock usage aggregation improvements for streaming
+- fix: choice index fixed to 0 for anthropic and bedrock streaming
+- feat: model field added to responses api response
+- feat: check allowed models and deployments of key for list models
+- fix: UI breaking when the model list is empty in a virtual key provider config
+- chore: update core version to 1.2.33 and framework version to 1.1.42
+
+</Update>
+<Update label="Core" description="1.2.33">
+- fix: vertex and bedrock usage aggregation improvements for streaming
+- fix: choice index fixed to 0 for anthropic and bedrock streaming
+- feat: model field added to responses api response
+- feat: check allowed models and deployments of key for list models
+
+</Update>
+<Update label="Framework" description="1.1.42">
+- chore: update core version to 1.2.33
+
+</Update>
+<Update label="governance" description="1.3.43">
+- chore: update core version to 1.2.33 and framework version to 1.1.42
+
+</Update>
+<Update label="jsonparser" description="1.3.43">
+- chore: update core version to 1.2.33 and framework version to 1.1.42
+
+</Update>
+<Update label="logging" description="1.3.43">
+- chore: update core version to 1.2.33 and framework version to 1.1.42
+
+</Update>
+<Update label="maxim" description="1.4.42">
+- chore: update core version to 1.2.33 and framework version to 1.1.42
+
+</Update>
+<Update label="mocker" description="1.3.42">
+- chore: update core version to 1.2.33 and framework version to 1.1.42
+
+</Update>
+<Update label="otel" description="1.0.42">
+- chore: update core version to 1.2.33 and framework version to 1.1.42
+
+</Update>
+<Update label="semantic_cache" description="1.3.42">
+- chore: update core version to 1.2.33 and framework version to 1.1.42
+
+</Update>
+<Update label="telemetry" description="1.3.42">
+- chore: update core version to 1.2.33 and framework version to 1.1.42
+
+</Update>
--- a/docs/changelogs/v1.3.4.mdx
+++ b/docs/changelogs/v1.3.4.mdx
@@ -0,0 +1,81 @@
+---
+title: "v1.3.4"
+description: "v1.3.4 changelog"
+---
+<Tabs>
+  <Tab title="NPX">
+    ```bash
+    npx -y @maximhq/bifrost --transport-version v1.3.4
+    ```
+  </Tab>
+  <Tab title="Docker">
+    ```bash
+    docker pull maximhq/bifrost:v1.3.4
+    docker run -p 8080:8080 maximhq/bifrost:v1.3.4
+    ```
+  </Tab>
+</Tabs>
+
+<Update label="Bifrost(HTTP)" description="v1.3.4">
+
+- Upgrade dependency: core to 1.2.10 and framework to 1.1.10
+- Feat: Added virtual key level support for MCP tools to execute
+- Feat: Added names to keys
+- Fix: provider selection from url params
+
+</Update>
+<Update label="Core" description="v1.3.4">
+
+- Feat: Added key name field to account schema for external key management
+- Feat: Simplified MCP client management by removing the `toolsToSkip` field, allowing a wildcard (`*`) for all tools, and improving tool filtering logic.
+
+</Update>
+<Update label="Framework" description="v1.3.4">
+
+- Upgrade dependency: core to 1.2.10
+- Feat: Added key name column to config keys table
+- Feat: Removed tools_to_skip field from MCP client config table
+- Feat: Added virtual_key_mcp_config table to store MCP client configs for virtual keys along with its relationships
+
+</Update>
+<Update label="governance" description="v1.3.4">
+
+- chore: version update core to 1.2.10 and framework to 1.1.10
+- feat: added virtual key level support for MCP tools to execute
+
+</Update>
+<Update label="jsonparser" description="v1.3.4">
+
+- chore: version update core to 1.2.10 and framework to 1.1.10
+
+</Update>
+<Update label="logging" description="v1.3.4">
+
+- chore: version update core to 1.2.10 and framework to 1.1.10
+
+</Update>
+<Update label="maxim" description="v1.3.4">
+
+- chore: version update core to 1.2.10 and framework to 1.1.10
+
+</Update>
+<Update label="mocker" description="v1.3.4">
+
+- chore: version update core to 1.2.10 and framework to 1.1.10
+
+</Update>
+<Update label="otel" description="v1.3.4">
+
+- chore: version update core to 1.2.10 and framework to 1.1.10
+
+</Update>
+<Update label="semantic_cache" description="v1.3.4">
+
+- chore: version update core to 1.2.10
+
+</Update>
+<Update label="telemetry" description="v1.3.4">
+
+- chore: version update core to 1.2.10 and framework to 1.1.10
+
+</Update>
--- a/docs/changelogs/v1.3.40.mdx
+++ b/docs/changelogs/v1.3.40.mdx
@@ -0,0 +1,22 @@
+---
+title: "v1.3.40"
+description: "v1.3.40 changelog - 2025-12-04"
+---
+<Tabs>
+  <Tab title="NPX">
+    ```bash
+    npx -y @maximhq/bifrost --transport-version v1.3.40
+    ```
+  </Tab>
+  <Tab title="Docker">
+    ```bash
+    docker pull maximhq/bifrost:v1.3.40
+    docker run -p 8080:8080 maximhq/bifrost:v1.3.40
+    ```
+  </Tab>
+</Tabs>
+
+<Update label="Bifrost(HTTP)" description="1.3.40">
+- security: upgrades React and Next.js to patch [CVE-2025-66478](https://nextjs.org/blog/CVE-2025-66478)
+
+</Update>
--- a/docs/changelogs/v1.3.41.mdx
+++ b/docs/changelogs/v1.3.41.mdx
@@ -0,0 +1,26 @@
+---
+title: "v1.3.41"
+description: "v1.3.41 changelog - 2025-12-05"
+---
+<Tabs>
+  <Tab title="NPX">
+    ```bash
+    npx -y @maximhq/bifrost --transport-version v1.3.41
+    ```
+  </Tab>
+  <Tab title="Docker">
+    ```bash
+    docker pull maximhq/bifrost:v1.3.41
+    docker run -p 8080:8080 maximhq/bifrost:v1.3.41
+    ```
+  </Tab>
+</Tabs>
+
+<Update label="Bifrost(HTTP)" description="1.3.41">
+- fix: remove UPX binary compression from Docker build to resolve segmentation faults when combined with PIE (Position Independent Executable)
+
+</Update>
+<Update label="maxim" description="1.4.43">
+chore: Refactored the Maxim plugin to move tag handling from pre-hook to post-hook, improving the tag management process for generations.
+
+</Update>
--- a/docs/changelogs/v1.3.42.mdx
+++ b/docs/changelogs/v1.3.42.mdx
@@ -0,0 +1,63 @@
+---
+title: "v1.3.42"
+description: "v1.3.42 changelog - 2025-12-05"
+---
+<Tabs>
+  <Tab title="NPX">
+    ```bash
+    npx -y @maximhq/bifrost --transport-version v1.3.42
+    ```
+  </Tab>
+  <Tab title="Docker">
+    ```bash
+    docker pull maximhq/bifrost:v1.3.42
+    docker run -p 8080:8080 maximhq/bifrost:v1.3.42
+    ```
+  </Tab>
+</Tabs>
+
+<Update label="Bifrost(HTTP)" description="1.3.42">
+- fix: added region prefix check for bedrock list models
+- chore: update core version to 1.2.34 and framework version to 1.1.43
+
+</Update>
+<Update label="Core" description="1.2.34">
+- fix: added region prefix check for bedrock list models
+
+</Update>
+<Update label="Framework" description="1.1.43">
+- chore: upgraded core version to 1.2.34
+
+</Update>
+<Update label="governance" description="1.3.44">
+- chore: update core version to 1.2.34 and framework version to 1.1.43
+
+</Update>
+<Update label="jsonparser" description="1.3.44">
+- chore: update core version to 1.2.34 and framework version to 1.1.43
+
+</Update>
+<Update label="logging" description="1.3.44">
+- chore: update core version to 1.2.34 and framework version to 1.1.43
+
+</Update>
+<Update label="maxim" description="1.4.44">
+- chore: update core version to 1.2.34 and framework version to 1.1.43
+
+</Update>
+<Update label="mocker" description="1.3.43">
+- chore: update core version to 1.2.34 and framework version to 1.1.43
+
+</Update>
+<Update label="otel" description="1.0.43">
+- chore: update core version to 1.2.34 and framework version to 1.1.43
+
+</Update>
+<Update label="semantic_cache" description="1.3.43">
+- chore: update core version to 1.2.34 and framework version to 1.1.43
+
+</Update>
+<Update label="telemetry" description="1.3.43">
+- chore: update core version to 1.2.34 and framework version to 1.1.43
+
+</Update>
--- a/docs/changelogs/v1.3.43.mdx
+++ b/docs/changelogs/v1.3.43.mdx
@@ -0,0 +1,72 @@
+---
+title: "v1.3.43"
+description: "v1.3.43 changelog - 2025-12-09"
+---
+<Tabs>
+  <Tab title="NPX">
+    ```bash
+    npx -y @maximhq/bifrost --transport-version v1.3.43
+    ```
+  </Tab>
+  <Tab title="Docker">
+    ```bash
+    docker pull maximhq/bifrost:v1.3.43
+    docker run -p 8080:8080 maximhq/bifrost:v1.3.43
+    ```
+  </Tab>
+</Tabs>
+
+<Update label="Bifrost(HTTP)" description="1.3.43">
+- feat: adds global proxy support
+- feat: adds datadog native integration handling
+- feat: enterprise plugin handling for OSS
+- feat: adds support for `OTEL_RESOURCE_ATTRIBUTES` in the otel plugin
+- chore: some minor bug fixes
+
+</Update>
+<Update label="Core" description="1.2.35">
+- feat: added missing extrafields to errors in core
+- feat: adds global proxy support
+- feat: handle cached tokens in Anthropic streaming responses
+- fix: adds status field for responses API
+
+</Update>
+<Update label="Framework" description="1.1.44">
+- feat: adds global proxy support
+- feat: enterprise plugin handling
+
+</Update>
+<Update label="governance" description="1.3.45">
+- chore: updating core to 1.2.35 and framework to 1.1.44
+
+</Update>
+<Update label="jsonparser" description="1.3.45">
+- chore: updating core to 1.2.35 and framework to 1.1.44
+
+</Update>
+<Update label="logging" description="1.3.45">
+- chore: updating core to 1.2.35 and framework to 1.1.44
+
+</Update>
+<Update label="maxim" description="1.4.45">
+- chore: updating core to 1.2.35 and framework to 1.1.44
+
+</Update>
+<Update label="mocker" description="1.3.44">
+- chore: updating core to 1.2.35 and framework to 1.1.44
+
+</Update>
+<Update label="otel" description="1.0.44">
+- feat: add custom CA TLS cert support for protocols
+- feat: enterprise plugin handling
+- chore: updating core to 1.2.35 and framework to 1.1.44
+
+</Update>
+<Update label="semantic_cache" description="1.3.44">
+- chore: updating core to 1.2.35 and framework to 1.1.44
+
+</Update>
+<Update label="telemetry" description="1.3.44">
+- chore: updating core to 1.2.35 and framework to 1.1.44
+
+</Update>
--- a/docs/changelogs/v1.3.44.mdx
+++ b/docs/changelogs/v1.3.44.mdx
@@ -0,0 +1,60 @@
+---
+title: "v1.3.44"
+description: "v1.3.44 changelog - 2025-12-10"
+---
+<Tabs>
+  <Tab title="NPX">
+    ```bash
+    npx -y @maximhq/bifrost --transport-version v1.3.44
+    ```
+  </Tab>
+  <Tab title="Docker">
+    ```bash
+    docker pull maximhq/bifrost:v1.3.44
+    docker run -p 8080:8080 maximhq/bifrost:v1.3.44
+    ```
+  </Tab>
+</Tabs>
+
+<Update label="Bifrost(HTTP)" description="1.3.44">
+- feat: adds rbac support across all pages
+- fix: fixes config.json - config store streaming cases for virtual keys, providers and keys. Improved test coverage for this flow.
+- fix: adds support for text streaming logging
+
+</Update>
+<Update label="Framework" description="1.1.45">
+- fix: adds support for text streaming accumulation
+
+</Update>
+<Update label="governance" description="1.3.46">
+- chore: updates framework to 1.1.45
+
+</Update>
+<Update label="jsonparser" description="1.3.46">
+- chore: updates framework to 1.1.45
+
+</Update>
+<Update label="logging" description="1.3.46">
+- chore: updates framework to 1.1.45
+
+</Update>
+<Update label="maxim" description="1.4.46">
+- chore: updates framework to 1.1.45
+
+</Update>
+<Update label="mocker" description="1.3.45">
+- chore: updates framework to 1.1.45
+
+</Update>
+<Update label="otel" description="1.0.45">
+- chore: updates framework to 1.1.45
+
+</Update>
+<Update label="semantic_cache" description="1.3.45">
+- chore: updates framework to 1.1.45
+
+</Update>
+<Update label="telemetry" description="1.3.45">
+- chore: updates framework to 1.1.45
+
+</Update>
--- a/docs/changelogs/v1.3.45.mdx
+++ b/docs/changelogs/v1.3.45.mdx
@@ -0,0 +1,66 @@
+---
+title: "v1.3.45"
+description: "v1.3.45 changelog - 2025-12-11"
+---
+<Tabs>
+  <Tab title="NPX">
+    ```bash
+    npx -y @maximhq/bifrost --transport-version v1.3.45
+    ```
+  </Tab>
+  <Tab title="Docker">
+    ```bash
+    docker pull maximhq/bifrost:v1.3.45
+    docker run -p 8080:8080 maximhq/bifrost:v1.3.45
+    ```
+  </Tab>
+</Tabs>
+
+<Update label="Bifrost(HTTP)" description="1.3.45">
+- feat: complete config.json to config-store sync using hash
+- fix: structured output in bedrock, cohere and anthropic
+- fix: tool calls in bedrock chat completion
+
+</Update>
+<Update label="Core" description="1.2.36">
+- feat: complete config.json to config-store sync using hash
+- fix: structured output in bedrock, cohere and anthropic
+- fix: tool calls in bedrock chat completion
+
+</Update>
+<Update label="Framework" description="1.1.46">
+- chore: updating core to 1.2.36 and framework to 1.1.46
+
+</Update>
+<Update label="governance" description="1.3.47">
+- chore: updating core to 1.2.36 and framework to 1.1.46
+
+</Update>
+<Update label="jsonparser" description="1.3.47">
+- chore: updating core to 1.2.36 and framework to 1.1.46
+
+</Update>
+<Update label="logging" description="1.3.47">
+- chore: updating core to 1.2.36 and framework to 1.1.46
+
+</Update>
+<Update label="maxim" description="1.4.47">
+- chore: updating core to 1.2.36 and framework to 1.1.46
+
+</Update>
+<Update label="mocker" description="1.3.46">
+- chore: updating core to 1.2.36 and framework to 1.1.46
+
+</Update>
+<Update label="otel" description="1.0.46">
+- chore: updating core to 1.2.36 and framework to 1.1.46
+
+</Update>
+<Update label="semantic_cache" description="1.3.46">
+- chore: updating core to 1.2.36 and framework to 1.1.46
+
+</Update>
+<Update label="telemetry" description="1.3.46">
+- chore: updating core to 1.2.36 and framework to 1.1.46
+
+</Update>
--- a/docs/changelogs/v1.3.46.mdx
+++ b/docs/changelogs/v1.3.46.mdx
@@ -0,0 +1,22 @@
+---
+title: "v1.3.46"
+description: "v1.3.46 changelog - 2025-12-12"
+---
+<Tabs>
+  <Tab title="NPX">
+    ```bash
+    npx -y @maximhq/bifrost --transport-version v1.3.46
+    ```
+  </Tab>
+  <Tab title="Docker">
+    ```bash
+    docker pull maximhq/bifrost:v1.3.46
+    docker run -p 8080:8080 maximhq/bifrost:v1.3.46
+    ```
+  </Tab>
+</Tabs>
+
+<Update label="Bifrost(HTTP)" description="1.3.46">
+- hotfix: security patches for [react](https://react.dev/blog/2025/12/11/denial-of-service-and-source-code-exposure-in-react-server-components) and [nextjs](https://nextjs.org/blog/security-update-2025-12-11)
+
+</Update>
--- a/docs/changelogs/v1.3.47.mdx
+++ b/docs/changelogs/v1.3.47.mdx
@@ -0,0 +1,76 @@
+---
+title: "v1.3.47"
+description: "v1.3.47 changelog - 2025-12-12"
+---
+<Tabs>
+  <Tab title="NPX">
+    ```bash
+    npx -y @maximhq/bifrost --transport-version v1.3.47
+    ```
+  </Tab>
+  <Tab title="Docker">
+    ```bash
+    docker pull maximhq/bifrost:v1.3.47
+    docker run -p 8080:8080 maximhq/bifrost:v1.3.47
+    ```
+  </Tab>
+</Tabs>
+
+<Update label="Bifrost(HTTP)" description="1.3.47">
+- feat: support for raw response accumulation for streaming
+- feat: support for raw request logging and sending back in response
+- feat: added support for reasoning in chat completions
+- feat: enhanced reasoning support in responses api
+- enhancement: improved internal inter-provider conversions for integrations
+- feat: switched to gemini native api
+
+</Update>
+<Update label="Core" description="1.2.37">
+- feat: send back raw request in extra fields
+- feat: added support for reasoning in chat completions
+- feat: enhanced reasoning support in responses api
+- enhancement: improved internal inter-provider conversions for integrations
+- feat: switched to gemini native api
+- feat: fallback to supported request type for custom models used in integration
+
+</Update>
+<Update label="Framework" description="1.1.47">
+- feat: support raw response accumulation in stream accumulator
+- feat: support raw request configuration and logging
+- feat: added support for reasoning accumulation in stream accumulator
+- chore: updating core to 1.2.37 and framework to 1.1.47
+
+</Update>
+<Update label="governance" description="1.3.48">
+- chore: updating core to 1.2.37 and framework to 1.1.47
+
+</Update>
+<Update label="jsonparser" description="1.3.48">
+- chore: updating core to 1.2.37 and framework to 1.1.47
+
+</Update>
+<Update label="logging" description="1.3.48">
+- feat: support for raw request logging
+- chore: updating core to 1.2.37 and framework to 1.1.47
+
+</Update>
+<Update label="maxim" description="1.4.48">
+- chore: updating core to 1.2.37 and framework to 1.1.47
+
+</Update>
+<Update label="mocker" description="1.3.47">
+- chore: updating core to 1.2.37 and framework to 1.1.47
+
+</Update>
+<Update label="otel" description="1.0.47">
+- chore: updating core to 1.2.37 and framework to 1.1.47
+
+</Update>
+<Update label="semantic_cache" description="1.3.47">
+- chore: updating core to 1.2.37 and framework to 1.1.47
+
+</Update>
+<Update label="telemetry" description="1.3.47">
+- chore: updating core to 1.2.37 and framework to 1.1.47
+
+</Update>
--- a/docs/changelogs/v1.3.48.mdx
+++ b/docs/changelogs/v1.3.48.mdx
@@ -0,0 +1,22 @@
+---
+title: "v1.3.48"
+description: "v1.3.48 changelog - 2025-12-12"
+---
+<Tabs>
+  <Tab title="NPX">
+    ```bash
+    npx -y @maximhq/bifrost --transport-version v1.3.48
+    ```
+  </Tab>
+  <Tab title="Docker">
+    ```bash
+    docker pull maximhq/bifrost:v1.3.48
+    docker run -p 8080:8080 maximhq/bifrost:v1.3.48
+    ```
+  </Tab>
+</Tabs>
+
+<Update label="Bifrost(HTTP)" description="1.3.48">
+- chore: second round of security patches for Next.js and React
+
+</Update>
--- a/docs/changelogs/v1.3.49.mdx
+++ b/docs/changelogs/v1.3.49.mdx
@@ -0,0 +1,79 @@
+---
+title: "v1.3.49"
+description: "v1.3.49 changelog - 2025-12-16"
+---
+<Tabs>
+  <Tab title="NPX">
+    ```bash
+    npx -y @maximhq/bifrost --transport-version v1.3.49
+    ```
+  </Tab>
+  <Tab title="Docker">
+    ```bash
+    docker pull maximhq/bifrost:v1.3.49
+    docker run -p 8080:8080 maximhq/bifrost:v1.3.49
+    ```
+  </Tab>
+</Tabs>
+
+<Update label="Bifrost(HTTP)" description="1.3.49">
+- feat: add `x-bf-api-key` header to send requests with a key by name
+- feat: parse `x-bf-eh-*` request headers as extra headers
+- feat: added API endpoint `/api/pricing/force-sync`; support for raw response accumulation for streaming
+- feat: add support for enabling/disabling provider keys without deletion.
+- feat: add batch api support for OpenAI, Anthropic, Google Gemini and Bedrock <Badge color="blue">Beta</Badge>.
+- feat: new provider support - nebius.
+- feat: force refresh datasheet support.
+- fix: fixed minor issues with structured output support for Gemini and Bedrock.
+- fix: fixed token-usage base cost computation for models like Gemini
+- chore: CORS policy now allows `x-stainless-timeout`
+
+</Update>
+<Update label="Core" description="1.2.38">
+- feat: adds batch and files API support for bedrock, openai, anthropic and gemini
+- feat: new provider support - nebius
+- feat: structured output support
+- fix: vertex and bedrock usage aggregation improvements for streaming
+- fix: choice index fixed to 0 for anthropic and bedrock streaming
+
+</Update>
+<Update label="Framework" description="1.1.48">
+- feat: added force sync function in pricing and pricing according to 200k token
+- feat: adds logging support for batch and file requests
+- chore: upgrades core to 1.2.38 and framework to 1.1.48
+
+</Update>
+<Update label="governance" description="1.3.49">
+- chore: upgrades core to 1.2.38 and framework to 1.1.48
+
+</Update>
+<Update label="jsonparser" description="1.3.49">
+- chore: upgrades core to 1.2.38 and framework to 1.1.48
+
+</Update>
+<Update label="logging" description="1.3.49">
+- chore: upgrades core to 1.2.38 and framework to 1.1.48
+
+</Update>
+<Update label="maxim" description="1.4.49">
+- chore: upgrades core to 1.2.38 and framework to 1.1.48
+
+</Update>
+<Update label="mocker" description="1.3.48">
+- chore: upgrades core to 1.2.38 and framework to 1.1.48
+
+</Update>
+<Update label="otel" description="1.0.48">
+- feat: add batch and file request logging support; refactor centralized request handling
+- chore: upgrades core to 1.2.38 and framework to 1.1.48
+
+</Update>
+<Update label="semantic_cache" description="1.3.48">
+- chore: upgrades core to 1.2.38 and framework to 1.1.48
+
+</Update>
+<Update label="telemetry" description="1.3.48">
+- feat: adds logging support for batch and file requests
+- chore: upgrades core to 1.2.38 and framework to 1.1.48
+
+</Update>
--- a/docs/changelogs/v1.3.5.mdx
+++ b/docs/changelogs/v1.3.5.mdx
@@ -0,0 +1,75 @@
+---
+title: "v1.3.5"
+description: "v1.3.5 changelog"
+---
+<Tabs>
+  <Tab title="NPX">
+    ```bash
+    npx -y @maximhq/bifrost --transport-version v1.3.5
+    ```
+  </Tab>
+  <Tab title="Docker">
+    ```bash
+    docker pull maximhq/bifrost:v1.3.5
+    docker run -p 8080:8080 maximhq/bifrost:v1.3.5
+    ```
+  </Tab>
+</Tabs>
+
+<Update label="Bifrost(HTTP)" description="v1.3.5">
+
+- chore: version update framework to 1.1.11
+- fix: added missing migration for `cost` and `cache_debug` columns in logs table for old databases.
+
+</Update>
+<Update label="Core" description="v1.3.5">
+
+- feat: added key name field to account schema for external key management
+- feat: simplified MCP client management by removing the `toolsToSkip` field, allowing a wildcard (`*`) for all tools, and improving tool filtering logic.
+
+</Update>
+<Update label="Framework" description="v1.3.5">
+
+- fix: added missing migration for `cost` and `cache_debug` columns in logs table for old databases.
+
+</Update>
+<Update label="governance" description="v1.3.5">
+
+- chore: version update framework to 1.1.11
+
+</Update>
+<Update label="jsonparser" description="v1.3.5">
+
+- chore: version update framework to 1.1.11
+
+</Update>
+<Update label="logging" description="v1.3.5">
+
+- chore: version update framework to 1.1.11
+
+</Update>
+<Update label="maxim" description="v1.3.5">
+
+- chore: version update framework to 1.1.11
+
+</Update>
+<Update label="mocker" description="v1.3.5">
+
+- chore: version update framework to 1.1.11
+
+</Update>
+<Update label="otel" description="v1.3.5">
+
+- chore: version update framework to 1.1.11
+
+</Update>
+<Update label="semantic_cache" description="v1.3.5">
+
+- chore: version update framework to 1.1.11
+
+</Update>
+<Update label="telemetry" description="v1.3.5">
+
+- chore: version update framework to 1.1.11
+
+</Update>