first commit

Beyhan Oğur
2026-04-26 21:52:23 +03:00
commit 880f412e2c
2662 changed files with 866266 additions and 0 deletions

4
docs/.mintignore Normal file

@@ -0,0 +1,4 @@
# Ignore modular OpenAPI source files
openapi/paths/
openapi/schemas/
openapi/openapi.yaml

3
docs/README.md Normal file

@@ -0,0 +1,3 @@
# Bifrost documentation
For better accessibility we have moved documentation [here](https://www.getmaxim.ai/bifrost/docs).


@@ -0,0 +1,764 @@
---
title: "Concurrency"
description: "Deep dive into Bifrost's advanced concurrency architecture - worker pools, goroutine management, channel-based communication, and resource isolation patterns."
icon: "traffic-light"
---
## Concurrency Philosophy
### **Core Principles**
| Principle | Implementation | Benefit |
| ---------------------------------- | -------------------------------------- | -------------------------------------- |
| **Provider Isolation** | Independent worker pools per provider | Fault tolerance, no cascade failures |
| **Channel-Based Communication**    | Go channels for all async operations   | Type-safe hand-off without shared state |
| **Resource Pooling** | Object pools with lifecycle management | Predictable memory usage, minimal GC |
| **Non-Blocking Operations** | Async processing throughout pipeline | Maximum concurrency, no blocking waits |
| **Backpressure Handling** | Configurable buffers and flow control | Graceful degradation under load |
### **Threading Architecture Overview**
```mermaid
graph TB
subgraph "Main Thread"
Main[Main Process<br/>HTTP Server]
Router[Request Router<br/>Goroutine]
PluginMgr[Plugin Manager<br/>Goroutine]
end
subgraph "Provider Worker Pools"
subgraph "OpenAI Pool"
OAI1[Worker 1<br/>Goroutine]
OAI2[Worker 2<br/>Goroutine]
OAIN[Worker N<br/>Goroutine]
end
subgraph "Anthropic Pool"
ANT1[Worker 1<br/>Goroutine]
ANT2[Worker 2<br/>Goroutine]
ANTN[Worker N<br/>Goroutine]
end
subgraph "Bedrock Pool"
BED1[Worker 1<br/>Goroutine]
BED2[Worker 2<br/>Goroutine]
BEDN[Worker N<br/>Goroutine]
end
end
subgraph "Memory Pools"
ChannelPool[Channel Pool<br/>sync.Pool]
MessagePool[Message Pool<br/>sync.Pool]
ResponsePool[Response Pool<br/>sync.Pool]
end
Main --> Router
Router --> PluginMgr
PluginMgr --> OAI1
PluginMgr --> ANT1
PluginMgr --> BED1
OAI1 --> ChannelPool
ANT1 --> MessagePool
BED1 --> ResponsePool
```
---
## Worker Pool Architecture
### **Provider-Isolated Worker Pools**
```mermaid
stateDiagram-v2
[*] --> PoolInit: Worker Pool Creation
PoolInit --> WorkerSpawn: Spawn Worker Goroutines
WorkerSpawn --> Listening: Workers Listen on Channels
Listening --> Processing: Job Received
Processing --> API_Call: Provider API Request
API_Call --> Response: Process Response
Response --> Listening: Job Complete
Listening --> Shutdown: Graceful Shutdown
Processing --> Shutdown: Complete Current Job
Shutdown --> [*]: Pool Destroyed
```
**Worker Pool Architecture:**
The worker pool system balances resource efficiency against per-provider performance isolation:
**Key Components:**
- **Worker Pool Management** - Pre-spawned workers reduce startup latency
- **Job Queue System** - Buffered channels provide smooth load balancing
- **Resource Pools** - HTTP clients and API keys are pooled for efficiency
- **Health Monitoring** - Circuit breakers detect and isolate failing providers
- **Graceful Shutdown** - Workers complete current jobs before terminating
**Startup Process:**
1. **Worker Pre-spawning** - Workers are created during pool initialization
2. **Channel Setup** - Job queues and worker channels are established
3. **Resource Allocation** - HTTP clients and API keys are distributed
4. **Health Checks** - Initial connectivity tests verify provider availability
5. **Ready State** - Pool becomes available for request processing
**Job Dispatch Logic:**
- **Round-Robin Assignment** - Jobs are distributed evenly across available workers
- **Load Balancing** - Worker availability determines job assignment
- **Overflow Handling** - Excess jobs are queued or dropped based on configuration
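The components and startup steps above can be sketched as follows. This is a minimal illustration, not Bifrost's implementation: `WorkerPool`, `Job`, `Result`, and `runJobs` are hypothetical names, and the workers pull from one shared buffered queue instead of being assigned jobs round-robin.

```go
package main

import (
	"context"
	"sync"
)

// Job and Result are hypothetical types used only in this sketch.
type Job struct {
	ID     int
	Result chan Result // buffered with size 1 so a worker never blocks on delivery
}

type Result struct {
	JobID int
	Err   error
}

// WorkerPool is an illustrative per-provider pool, not Bifrost's actual type.
type WorkerPool struct {
	jobs chan *Job
	wg   sync.WaitGroup
}

// NewWorkerPool pre-spawns the workers; all of them read from one buffered queue.
func NewWorkerPool(workers, queueSize int, handle func(*Job) error) *WorkerPool {
	p := &WorkerPool{jobs: make(chan *Job, queueSize)}
	for i := 0; i < workers; i++ {
		p.wg.Add(1)
		go func() {
			defer p.wg.Done()
			// The loop ends once the queue is closed and drained.
			for job := range p.jobs {
				job.Result <- Result{JobID: job.ID, Err: handle(job)}
			}
		}()
	}
	return p
}

// Submit enqueues a job, or returns ctx.Err() if the context ends first.
// It must not be called after Shutdown (sending on a closed channel panics).
func (p *WorkerPool) Submit(ctx context.Context, job *Job) error {
	select {
	case p.jobs <- job:
		return nil
	case <-ctx.Done():
		return ctx.Err()
	}
}

// Shutdown stops intake and waits for queued and in-flight jobs to finish.
func (p *WorkerPool) Shutdown() {
	close(p.jobs)
	p.wg.Wait()
}

// runJobs submits n jobs to a 3-worker pool and collects the results in order.
func runJobs(n int) ([]Result, error) {
	pool := NewWorkerPool(3, n, func(*Job) error { return nil })
	jobs := make([]*Job, n)
	for i := range jobs {
		jobs[i] = &Job{ID: i, Result: make(chan Result, 1)}
		if err := pool.Submit(context.Background(), jobs[i]); err != nil {
			return nil, err
		}
	}
	pool.Shutdown()
	results := make([]Result, n)
	for i, j := range jobs {
		results[i] = <-j.Result
	}
	return results, nil
}
```

Creating one such pool per provider gives the isolation described earlier: a slow or failing provider can only fill its own queue.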
### **Worker Lifecycle Management**
```mermaid
sequenceDiagram
participant Pool
participant Worker
participant HTTPClient
participant Provider
participant Metrics
Pool->>Worker: Start()
Worker->>Worker: Initialize HTTP Client
Worker->>Pool: Ready Signal
loop Job Processing
Pool->>Worker: Job Assignment
Worker->>HTTPClient: Prepare Request
HTTPClient->>Provider: API Call
Provider-->>HTTPClient: Response
HTTPClient-->>Worker: Parsed Response
Worker->>Metrics: Record Performance
Worker->>Pool: Job Complete
end
Pool->>Worker: Shutdown Signal
Worker->>Worker: Complete Current Job
Worker-->>Pool: Shutdown Confirmed
```
---
## Channel-Based Communication
### **Channel Architecture**
```mermaid
graph TB
subgraph "Channel Types"
JobQueue[Job Queue<br/>Buffered Channel]
WorkerPool[Worker Pool<br/>Buffered Channel]
ResultChan[Result Channel<br/>Buffered Channel]
QuitChan[Quit Channel<br/>Unbuffered]
end
subgraph "Flow Control"
BackPressure[Backpressure<br/>Buffer Limits]
Timeout[Timeout<br/>Context Cancellation]
Graceful[Graceful Shutdown<br/>Channel Closing]
end
JobQueue --> BackPressure
WorkerPool --> Timeout
ResultChan --> Graceful
```
**Channel Configuration Principles:**
Bifrost's channel system balances throughput and memory usage through careful buffer sizing:
**Job Queuing Configuration:**
- **Job Queue Buffer** - Sized based on expected burst traffic (100-1000 jobs)
- **Worker Pool Size** - Matches provider concurrency limits (10-100 workers)
- **Result Buffer** - Accommodates response processing delays (50-500 responses)
**Flow Control Parameters:**
- **Queue Wait Limits** - Maximum time jobs wait before timeout (1-10 seconds)
- **Processing Timeouts** - Per-job execution limits (30-300 seconds)
- **Shutdown Timeouts** - Graceful termination periods (5-30 seconds)
**Backpressure Policies:**
- **Drop Policy** - Discard excess jobs when queues are full
- **Block Policy** - Wait for queue space with timeout
- **Error Policy** - Immediately return error for full queues
**Channel Type Selection:**
- **Buffered Channels** - Used for async job processing and result handling
- **Unbuffered Channels** - Used for synchronization signals (quit, done)
- **Context Cancellation** - Used for timeout and cancellation propagation
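As a rough illustration of the channel type selection above, the sketch below combines a buffered job queue, a buffered result channel, an unbuffered quit signal, and a context-based queue wait limit. All names (`enqueue`, `consume`, `channelDemo`, `ErrQueueWait`) and the tiny buffer sizes are assumptions chosen to keep the example small.

```go
package main

import (
	"context"
	"errors"
	"time"
)

// ErrQueueWait is a hypothetical sentinel for jobs that waited too long for queue space.
var ErrQueueWait = errors.New("queue wait limit exceeded")

// enqueue waits at most `wait` for space in the buffered job queue.
func enqueue(jobs chan<- string, job string, wait time.Duration) error {
	ctx, cancel := context.WithTimeout(context.Background(), wait)
	defer cancel()
	select {
	case jobs <- job:
		return nil
	case <-ctx.Done():
		return ErrQueueWait
	}
}

// consume processes jobs until quit is closed, then closes done to confirm it stopped.
// In this sketch the send on results may block if nobody reads the results.
func consume(jobs <-chan string, results chan<- string, quit <-chan struct{}, done chan<- struct{}) {
	defer close(done)
	for {
		select {
		case job := <-jobs:
			results <- "processed " + job
		case <-quit:
			return
		}
	}
}

// channelDemo wires the channel kinds together and returns the first result
// plus the error produced when the queue was full.
func channelDemo() (first string, overflowErr error) {
	jobs := make(chan string, 1)    // buffered: absorbs a burst of one job
	results := make(chan string, 1) // buffered: decouples worker from reader
	quit := make(chan struct{})     // unbuffered: pure synchronization signal
	done := make(chan struct{})     // unbuffered: shutdown confirmation

	// No consumer is running yet, so the second job finds the queue full.
	_ = enqueue(jobs, "a", 10*time.Millisecond)
	overflowErr = enqueue(jobs, "b", 10*time.Millisecond)

	go consume(jobs, results, quit, done)
	first = <-results
	close(quit)
	<-done
	return first, overflowErr
}
```

Closing `quit` instead of sending on it lets any number of goroutines observe the same shutdown signal.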
### **Backpressure and Flow Control**
```mermaid
flowchart TD
Request[Incoming Request] --> QueueCheck{Queue Full?}
QueueCheck -->|No| Queue[Add to Queue]
QueueCheck -->|Yes| Policy{Backpressure Policy?}
Policy -->|Drop| Drop[Drop Request<br/>Return Error]
Policy -->|Block| Block[Block Until Space<br/>With Timeout]
Policy -->|Error| Error[Return Queue Full Error]
Queue --> Worker[Assign to Worker]
Block --> TimeoutCheck{Timeout?}
TimeoutCheck -->|Yes| Error
TimeoutCheck -->|No| Queue
Worker --> Processing[Process Request]
Processing --> Complete[Complete]
Drop --> Client[Client Response]
Error --> Client
Complete --> Client
```
**Backpressure Implementation Strategy:**
The backpressure system protects Bifrost from being overwhelmed while maintaining service availability:
**Non-Blocking Job Submission:**
- **Immediate Queue Check** - Jobs are submitted without blocking on queue space
- **Success Path** - Available queue space allows immediate job acceptance
- **Overflow Detection** - Full queues trigger backpressure policies
- **Metrics Collection** - All queue operations are tracked for monitoring
**Backpressure Policy Execution:**
- **Drop Policy** - Immediately rejects excess jobs with meaningful error messages
- **Block Policy** - Waits for queue space with configurable timeout limits
- **Error Policy** - Returns queue full errors for immediate client feedback
- **Metrics Tracking** - Dropped, blocked, and successful submissions are measured
**Timeout Management:**
- **Context-Based Timeouts** - All blocking operations respect timeout boundaries
- **Graceful Degradation** - Timeouts result in controlled error responses
- **Resource Protection** - Prevents goroutine leaks from infinite waits
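```go
// NOTE: the opening of this listing is missing in the source. The signature, the fast path, and
// the "drop" branch (SubmitJob, QueuePolicy, IncDroppedJobs) are reconstructed assumptions.
func (pool *ProviderWorkerPool) SubmitJob(ctx context.Context, job *Job) error {
	select {
	case pool.jobQueue <- job:
		pool.metrics.IncQueuedJobs()
		return nil
	default:
		switch pool.config.QueuePolicy {
		case "drop":
			pool.metrics.IncDroppedJobs()
			return errors.New("queue full, job dropped")
		case "block":
			select {
			case pool.jobQueue <- job:
				pool.metrics.IncQueuedJobs()
				return nil
			case <-ctx.Done():
				pool.metrics.IncTimeoutJobs()
				return errors.New("queue full, timeout waiting")
			}
		case "error":
			pool.metrics.IncRejectedJobs()
			return errors.New("queue full, job rejected")
		default:
			return errors.New("unknown queue policy")
		}
	}
}
```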
---
## Memory Pool Concurrency
### **Thread-Safe Object Pools**
```mermaid
graph TD
subgraph "sync.Pool Lifecycle"
direction LR
GetObject[Get Object<br/>sync.Pool.Get]
PoolCheck{Is Pool Empty?}
NewObject[New Object<br/>Factory Function]
UseObject[Use Object<br/>Application Logic]
ResetObject[Reset Object<br/>Clear State]
ReturnObject[Return Object<br/>sync.Pool.Put]
GetObject --> PoolCheck
PoolCheck -- Yes --> NewObject
PoolCheck -- No --> UseObject
NewObject --> UseObject
UseObject --> ResetObject
ResetObject --> ReturnObject
ReturnObject --> GetObject
end
subgraph "GC Interaction"
direction TB
GCRun[GC Runs]
PoolCleanup[Pool Cleanup<br/>Removes idle objects]
GCRun --> PoolCleanup
end
```
**Thread-Safe Pool Architecture:**
Bifrost's memory pool system ensures thread-safe object reuse across multiple goroutines:
**Pool Structure Design:**
- **Multiple Pool Types** - Separate pools for channels, messages, responses, and buffers
- **Factory Functions** - Dynamic object creation when pools are empty
- **Statistics Tracking** - Comprehensive metrics for pool performance monitoring
- **Thread Safety** - Synchronized access using Go's sync.Pool and read-write mutexes
**Object Lifecycle Management:**
- **Pool Initialization** - Factory functions define object creation patterns
- **Unique Identification** - Each pooled object gets a unique ID for tracking
- **Timestamp Tracking** - Creation, acquisition, and return times are recorded
- **Reusability Flags** - Objects can be marked as non-reusable for single-use scenarios
**Acquisition Strategy:**
- **Request Tracking** - All pool requests are counted for monitoring
- **Hit/Miss Tracking** - Pool effectiveness is measured through hit ratios
- **Fallback Creation** - New objects are created when pools are empty
- **Performance Metrics** - Acquisition times and patterns are monitored
**Return and Reset Process:**
- **State Validation** - Only reusable objects are returned to pools
- **Object Reset** - All object state is cleared before returning to pool
- **Return Tracking** - Return operations are counted and timed
- **Pool Replenishment** - Returned objects become available for reuse
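The acquire, reset, and return cycle described above might look like the following with Go's `sync.Pool`. The `Message` and `MessagePool` types and the 1 KiB initial capacity are hypothetical; only `sync.Pool` and `sync/atomic` are real library APIs here.

```go
package main

import (
	"sync"
	"sync/atomic"
)

// Message is a hypothetical pooled object.
type Message struct {
	Role    string
	Content []byte
}

// Reset clears all state but keeps the backing array so it can be reused.
func (m *Message) Reset() {
	m.Role = ""
	m.Content = m.Content[:0]
}

// MessagePool wraps sync.Pool and counts requests and fresh allocations.
type MessagePool struct {
	pool    sync.Pool
	gets    atomic.Int64
	created atomic.Int64
}

// NewMessagePool installs the factory function used when the pool is empty.
func NewMessagePool() *MessagePool {
	mp := &MessagePool{}
	mp.pool.New = func() any {
		mp.created.Add(1)
		return &Message{Content: make([]byte, 0, 1024)}
	}
	return mp
}

// Get returns a clean Message, either reused or freshly created.
func (mp *MessagePool) Get() *Message {
	mp.gets.Add(1)
	return mp.pool.Get().(*Message)
}

// Put resets the object before returning it, so no state leaks between users.
func (mp *MessagePool) Put(m *Message) {
	m.Reset()
	mp.pool.Put(m)
}

// reuseDemo dirties an object, returns it, and inspects the next object handed out.
func reuseDemo() (role string, contentLen int, gets int64) {
	mp := NewMessagePool()
	m := mp.Get()
	m.Role = "user"
	m.Content = append(m.Content, "hello"...)
	mp.Put(m)
	again := mp.Get() // may or may not be the same object; either way it is clean
	return again.Role, len(again.Content), mp.gets.Load()
}
```

Because `sync.Pool` may drop idle objects at any garbage collection, callers should treat reuse as an optimization and never rely on getting a particular object back.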
### **Pool Performance Monitoring**
Comprehensive metrics provide insights into pool efficiency and system health:
**Usage Statistics Collection:**
- **Request Counting** - Track total pool requests by object type
- **Creation Tracking** - Monitor new object allocations when pools are empty
- **Hit/Miss Ratios** - Measure pool effectiveness through reuse rates
- **Return Monitoring** - Track successful object returns to pools
**Performance Metrics Analysis:**
- **Acquisition Times** - Measure how long it takes to get objects from pools
- **Reset Performance** - Track time spent cleaning objects for reuse
- **Hit Ratio Calculation** - Determine percentage of requests served from pools
- **Memory Efficiency** - Calculate memory savings from object reuse
**Key Performance Indicators:**
- **Channel Pool Hit Ratio** - Typically 85-95% in steady state
- **Message Pool Efficiency** - Usually 80-90% reuse rate
- **Response Pool Utilization** - Often 70-85% hit ratio
- **Total Memory Savings** - Measured reduction in garbage collection pressure
**Monitoring Integration:**
- **Thread-Safe Access** - All metrics collection is synchronized
- **Real-Time Updates** - Statistics are updated with each pool operation
- **Export Capability** - Metrics are available in JSON format for monitoring systems
- **Alerting Support** - Low hit ratios can trigger performance alerts
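One way to derive the hit ratio and JSON export mentioned above is sketched below. The `PoolStats` field names and the `poolCounters` type are assumptions made for illustration; the hit ratio is computed as the share of requests that did not require a new allocation.

```go
package main

import (
	"encoding/json"
	"sync/atomic"
)

// PoolStats is a hypothetical snapshot type; the field names are assumptions.
type PoolStats struct {
	Requests int64   `json:"requests"`
	Created  int64   `json:"created"`
	Returned int64   `json:"returned"`
	HitRatio float64 `json:"hit_ratio"`
}

// poolCounters uses atomics so the Get/Put hot paths never take a lock.
type poolCounters struct {
	requests atomic.Int64
	created  atomic.Int64
	returned atomic.Int64
}

// RecordGet counts a pool request; createdNew marks a miss (factory was called).
func (c *poolCounters) RecordGet(createdNew bool) {
	c.requests.Add(1)
	if createdNew {
		c.created.Add(1)
	}
}

// RecordPut counts an object returned to the pool.
func (c *poolCounters) RecordPut() { c.returned.Add(1) }

// Snapshot reads the counters and derives the hit ratio.
// The three loads are individually atomic, not one consistent transaction.
func (c *poolCounters) Snapshot() PoolStats {
	s := PoolStats{
		Requests: c.requests.Load(),
		Created:  c.created.Load(),
		Returned: c.returned.Load(),
	}
	if s.Requests > 0 {
		s.HitRatio = float64(s.Requests-s.Created) / float64(s.Requests)
	}
	return s
}

// ExportJSON renders the snapshot for a monitoring system.
func (c *poolCounters) ExportJSON() (string, error) {
	b, err := json.Marshal(c.Snapshot())
	return string(b), err
}
```

With one miss out of four requests, `Snapshot` reports a hit ratio of 0.75; an alert could compare that value against a configured floor.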
---
## Goroutine Management
### **Goroutine Lifecycle Patterns**
```mermaid
stateDiagram-v2
[*] --> Created: go routine()
Created --> Running: Execute Function
Running --> Waiting: Channel/Mutex Block
Waiting --> Running: Unblocked
Running --> Syscall: Network I/O
Syscall --> Running: I/O Complete
Running --> GCAssist: GC Triggered
GCAssist --> Running: GC Complete
Running --> Terminated: Function Exit
Terminated --> [*]: Cleanup
```
**Goroutine Pool Management Strategy:**
Bifrost's goroutine management ensures optimal resource usage while preventing goroutine leaks:
**Pool Configuration Management:**
- **Goroutine Limits** - Maximum concurrent goroutines prevent resource exhaustion
- **Active Counting** - Atomic counters track currently running goroutines
- **Idle Timeouts** - Unused goroutines are cleaned up after configured periods
- **Resource Boundaries** - Hard limits prevent runaway goroutine creation
**Lifecycle Orchestration:**
- **Spawn Channels** - New goroutine creation is tracked through channels
- **Completion Monitoring** - Finished goroutines signal completion for cleanup
- **Shutdown Coordination** - Graceful shutdown ensures all goroutines complete properly
- **Health Monitoring** - Continuous monitoring tracks goroutine health and performance
**Worker Creation Process:**
- **Limit Enforcement** - Creation fails when maximum goroutine count is reached
- **Unique Identification** - Each goroutine gets a unique ID for tracking and debugging
- **Lifecycle Tracking** - Start times and names enable performance analysis
- **Atomic Operations** - Thread-safe counters prevent race conditions
**Panic Recovery and Error Handling:**
- **Panic Isolation** - Goroutine panics don't crash the entire system
- **Error Logging** - Panic details are logged with goroutine context
- **Metrics Updates** - Panic counts are tracked for monitoring and alerting
- **Resource Cleanup** - Failed goroutines are properly cleaned up and counted
**Health Monitoring System:**
- **Periodic Health Checks** - Regular intervals check goroutine pool health
- **Completion Tracking** - Finished goroutines are recorded for performance analysis
- **Shutdown Handling** - Clean shutdown process ensures no goroutine leaks
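The limit enforcement, atomic counting, and panic isolation listed above can be combined in a small helper. This is a sketch under assumptions: `Spawner`, `ErrLimitReached`, and `spawnDemo` are hypothetical names, and panics are simply printed instead of going through a real logger.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
	"sync/atomic"
)

// ErrLimitReached is a hypothetical error for a full goroutine budget.
var ErrLimitReached = errors.New("goroutine limit reached")

// Spawner caps the number of concurrently running goroutines.
type Spawner struct {
	max    int64
	active atomic.Int64
	panics atomic.Int64
	wg     sync.WaitGroup
}

func NewSpawner(max int64) *Spawner { return &Spawner{max: max} }

// Go runs fn in a new goroutine unless the limit is already reached.
func (s *Spawner) Go(name string, fn func()) error {
	// Reserve a slot first and roll back if that exceeds the limit.
	if s.active.Add(1) > s.max {
		s.active.Add(-1)
		return ErrLimitReached
	}
	s.wg.Add(1)
	go func() {
		defer s.wg.Done()
		defer s.active.Add(-1)
		defer func() {
			// A panic in fn is recorded and contained to this goroutine.
			if r := recover(); r != nil {
				s.panics.Add(1)
				fmt.Printf("goroutine %q panicked: %v\n", name, r)
			}
		}()
		fn()
	}()
	return nil
}

// Wait blocks until every spawned goroutine has finished.
func (s *Spawner) Wait() { s.wg.Wait() }

// spawnDemo fills both slots, shows a rejected third spawn, then lets one goroutine panic.
func spawnDemo() (limitErr error, panics, active int64) {
	s := NewSpawner(2)
	release := make(chan struct{})
	_ = s.Go("blocked", func() { <-release })
	_ = s.Go("panics", func() { <-release; panic("boom") })
	limitErr = s.Go("third", func() {}) // rejected: both slots are in use
	close(release)
	s.Wait()
	return limitErr, s.panics.Load(), s.active.Load()
}
```

The slot is reserved before the goroutine starts, so the limit holds even when many callers race to spawn at once.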
### **Resource Leak Prevention**
```mermaid
flowchart TD
GoroutineStart[Goroutine Start] --> ResourceCheck[Resource Allocation Check]
ResourceCheck --> Timeout[Set Timeout Context]
Timeout --> Work[Execute Work]
Work --> Complete{Work Complete?}
Complete -->|Yes| Cleanup[Cleanup Resources]
Complete -->|No| TimeoutCheck{Timeout?}
TimeoutCheck -->|Yes| ForceCleanup[Force Cleanup]
TimeoutCheck -->|No| Work
Cleanup --> Return[Return Resources to Pool]
ForceCleanup --> Return
Return --> End[Goroutine End]
```
**Resource Leak Prevention:**
```go
func (worker *Worker) ExecuteWithCleanup(job *Job) {
	// Set timeout context
	ctx, cancel := context.WithTimeout(
		context.Background(),
		worker.config.ProcessTimeout,
	)
	defer cancel()

	// Acquire resources with timeout
	resources, err := worker.acquireResources(ctx)
	if err != nil {
		job.resultChan <- &Result{Error: err}
		return
	}

	// Ensure cleanup happens
	defer func() {
		// Always return resources
		worker.returnResources(resources)

		// Handle panics
		if r := recover(); r != nil {
			worker.metrics.IncPanics()
			job.resultChan <- &Result{
				Error: fmt.Errorf("worker panic: %v", r),
			}
		}
	}()

	// Execute job with context
	result := worker.processJob(ctx, job, resources)

	// Return result
	select {
	case job.resultChan <- result:
		// Success
	case <-ctx.Done():
		// Timeout: the caller may have stopped receiving, so the result is dropped
		worker.metrics.IncTimeouts()
	}
}
```
---
## Concurrency Optimization Strategies
### **Load-Based Worker Scaling** (Planned)
```mermaid
graph TB
subgraph "Load Monitoring"
QueueDepth[Queue Depth<br/>Monitoring]
ResponseTime[Response Time<br/>Tracking]
WorkerUtil[Worker Utilization<br/>Metrics]
end
subgraph "Scaling Decisions"
ScaleUp{Scale Up?<br/>Load > 80%}
ScaleDown{Scale Down?<br/>Load < 30%}
Maintain[Maintain<br/>Current Size]
end
subgraph "Actions"
AddWorkers[Spawn Additional<br/>Workers]
RemoveWorkers[Graceful Worker<br/>Shutdown]
NoAction[No Action<br/>Monitor Continue]
end
QueueDepth --> ScaleUp
ResponseTime --> ScaleUp
WorkerUtil --> ScaleDown
ScaleUp -->|Yes| AddWorkers
ScaleUp -->|No| ScaleDown
ScaleDown -->|Yes| RemoveWorkers
ScaleDown -->|No| Maintain
Maintain --> NoAction
```
**Adaptive Scaling Implementation:**
```go
type AdaptiveScaler struct {
	pool          *ProviderWorkerPool
	config        ScalingConfig
	metrics       *ScalingMetrics
	lastScaleTime time.Time
	scalingMutex  sync.Mutex
}

func (scaler *AdaptiveScaler) EvaluateScaling() {
	scaler.scalingMutex.Lock()
	defer scaler.scalingMutex.Unlock()

	// Prevent frequent scaling
	if time.Since(scaler.lastScaleTime) < scaler.config.MinScaleInterval {
		return
	}

	current := scaler.getCurrentMetrics()

	// Scale up conditions
	if current.QueueUtilization > scaler.config.ScaleUpThreshold ||
		current.AvgResponseTime > scaler.config.MaxResponseTime {
		scaler.scaleUp(current)
		return
	}

	// Scale down conditions
	if current.QueueUtilization < scaler.config.ScaleDownThreshold &&
		current.AvgResponseTime < scaler.config.TargetResponseTime {
		scaler.scaleDown(current)
		return
	}
}

func (scaler *AdaptiveScaler) scaleUp(metrics *CurrentMetrics) {
	currentWorkers := scaler.pool.GetWorkerCount()
	targetWorkers := int(float64(currentWorkers) * scaler.config.ScaleUpFactor)

	// Respect maximum limits
	if targetWorkers > scaler.config.MaxWorkers {
		targetWorkers = scaler.config.MaxWorkers
	}

	additionalWorkers := targetWorkers - currentWorkers
	if additionalWorkers > 0 {
		scaler.pool.AddWorkers(additionalWorkers)
		scaler.lastScaleTime = time.Now()
		scaler.metrics.RecordScaleUp(additionalWorkers)
	}
}
```
### **Provider-Specific Optimization**
```go
type ProviderOptimization struct {
	// Provider characteristics
	ProviderName string        `json:"provider_name"`
	RateLimit    int           `json:"rate_limit"`  // Requests per second
	AvgLatency   time.Duration `json:"avg_latency"` // Average response time
	ErrorRate    float64       `json:"error_rate"`  // Historical error rate

	// Optimal configuration
	OptimalWorkers int           `json:"optimal_workers"`
	OptimalBuffer  int           `json:"optimal_buffer"`
	TimeoutConfig  time.Duration `json:"timeout_config"`
	RetryStrategy  RetryConfig   `json:"retry_strategy"`
}

func CalculateOptimalConcurrency(provider ProviderOptimization) ConcurrencyConfig {
	// Calculate based on rate limits and latency (requests in flight = rate x latency).
	// Multiply as floats so that sub-second latencies are not truncated to zero.
	workers := float64(provider.RateLimit) * provider.AvgLatency.Seconds()

	// Adjust for error rate (more workers for higher error rate)
	workers *= 1.0 + provider.ErrorRate

	// Always keep at least one worker
	optimalWorkers := int(workers)
	if optimalWorkers < 1 {
		optimalWorkers = 1
	}

	// Buffer is 3x the worker count for smooth operation
	optimalBuffer := optimalWorkers * 3

	return ConcurrencyConfig{
		Concurrency: optimalWorkers,
		BufferSize:  optimalBuffer,
		Timeout:     provider.AvgLatency * 2, // 2x avg latency for timeout
	}
}
```
---
## Concurrency Monitoring & Metrics
### **Key Concurrency Metrics**
```mermaid
graph TB
subgraph "Worker Metrics"
ActiveWorkers[Active Workers<br/>Current Count]
IdleWorkers[Idle Workers<br/>Available Count]
BusyWorkers[Busy Workers<br/>Processing Count]
end
subgraph "Queue Metrics"
QueueDepth[Queue Depth<br/>Pending Jobs]
QueueThroughput[Queue Throughput<br/>Jobs/Second]
QueueWaitTime[Queue Wait Time<br/>Average Delay]
end
subgraph "Performance Metrics"
GoroutineCount[Goroutine Count<br/>Total Active]
MemoryUsage[Memory Usage<br/>Pool Utilization]
GCPressure[GC Pressure<br/>Collection Frequency]
end
subgraph "Health Metrics"
ErrorRate[Error Rate<br/>Failed Jobs %]
PanicCount[Panic Count<br/>Crashed Goroutines]
DeadlockDetection[Deadlock Detection<br/>Blocked Operations]
end
```
**Metrics Collection Strategy:**
Comprehensive concurrency monitoring provides operational insights and performance optimization data:
**Worker Pool Monitoring:**
- **Total Worker Tracking** - Monitor configured vs actual worker counts
- **Active Worker Monitoring** - Track workers currently processing requests
- **Idle Worker Analysis** - Identify unused capacity and optimization opportunities
- **Queue Depth Monitoring** - Track pending job backlog and processing delays
**Performance Data Collection:**
- **Throughput Metrics** - Measure jobs processed per second across all pools
- **Wait Time Analysis** - Track how long jobs wait in queues before processing
- **Memory Pool Performance** - Monitor hit/miss ratios for memory pool effectiveness
- **Goroutine Count Tracking** - Ensure goroutine counts remain within healthy limits
**Health and Reliability Metrics:**
- **Panic Recovery Tracking** - Count and analyze worker panic occurrences
- **Timeout Monitoring** - Track jobs that exceed processing time limits
- **Circuit Breaker Events** - Monitor provider isolation events and recoveries
- **Error Rate Analysis** - Track failure patterns for capacity planning
**Real-Time Updates:**
- **Live Metric Updates** - Worker metrics are updated continuously during operation
- **Processing Event Recording** - Each job completion updates relevant metrics
- **Performance Correlation** - Queue times and processing times are correlated for analysis
- **Success/Failure Tracking** - All job outcomes are recorded for reliability analysis
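Several of the metrics above can be read directly from Go's runtime and from the job channel itself. The sketch below assumes a hypothetical `QueueSnapshot` type; `len`, `cap`, and `runtime.NumGoroutine` are standard Go, and the values are point-in-time readings that may be stale a moment later.

```go
package main

import "runtime"

// QueueSnapshot is a hypothetical point-in-time view of one job queue.
type QueueSnapshot struct {
	Depth       int     // jobs currently waiting in the buffer
	Capacity    int     // configured buffer size
	Utilization float64 // Depth / Capacity, 0 for unbuffered channels
	Goroutines  int     // total goroutines in the process
}

// SnapshotQueue samples the queue depth and the process-wide goroutine count.
func SnapshotQueue[T any](jobs chan T) QueueSnapshot {
	s := QueueSnapshot{
		Depth:      len(jobs),
		Capacity:   cap(jobs),
		Goroutines: runtime.NumGoroutine(),
	}
	if s.Capacity > 0 {
		s.Utilization = float64(s.Depth) / float64(s.Capacity)
	}
	return s
}

// Saturated reports whether queue utilization has reached the given threshold.
func (s QueueSnapshot) Saturated(threshold float64) bool {
	return s.Utilization >= threshold
}
```

Sampling such a snapshot on a timer gives a cheap queue depth series without adding any locking to the job path.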
---
## Deadlock Prevention & Detection
### **Deadlock Prevention Strategies**
```mermaid
flowchart TD
Strategy1[Lock Ordering<br/>Consistent Acquisition]
Strategy2[Timeout-Based Locks<br/>Context Cancellation]
Strategy3[Channel Select<br/>Non-blocking Operations]
Strategy4[Resource Hierarchy<br/>Layered Locking]
Prevention[Deadlock Prevention<br/>Design Patterns]
Prevention --> Strategy1
Prevention --> Strategy2
Prevention --> Strategy3
Prevention --> Strategy4
Strategy1 --> Success[No Deadlocks<br/>Guaranteed Order]
Strategy2 --> Success
Strategy3 --> Success
Strategy4 --> Success
```
**Deadlock Prevention Implementation Strategy:**
Bifrost employs multiple complementary strategies to prevent deadlocks in concurrent operations:
**Lock Ordering Management:**
- **Consistent Acquisition Order** - All locks are acquired in a predetermined order
- **Global Lock Registry** - Centralized registry maintains lock ordering relationships
- **Order Enforcement** - Lock acquisition automatically sorts by predetermined order
- **Dependency Tracking** - Lock dependencies are mapped to prevent circular waits
**Timeout-Based Protection:**
- **Default Timeouts** - All lock acquisitions have reasonable timeout limits
- **Context Cancellation** - Operations respect context cancellation for cleanup
- **Maximum Timeout Limits** - Upper bounds prevent indefinite blocking
- **Graceful Timeout Handling** - Timeout errors provide meaningful context
**Multi-Lock Acquisition Process:**
- **Ordered Sorting** - Multiple locks are sorted before acquisition attempts
- **Progressive Acquisition** - Locks are acquired one by one in sorted order
- **Failure Recovery** - Failed acquisitions trigger automatic cleanup of held locks
- **Resource Tracking** - All acquired locks are tracked for proper release
**Lock Acquisition Safety:**
- **Non-Blocking Detection** - Channel-based lock attempts prevent indefinite blocking
- **Timeout Enforcement** - All lock attempts respect configured timeout limits
- **Error Propagation** - Lock failures are properly propagated with context
- **Cleanup Guarantees** - Failed operations always clean up partially acquired resources
**Deadlock Detection and Recovery:**
- **Active Monitoring** - Continuous monitoring for potential deadlock conditions
- **Automatic Recovery** - Detected deadlocks trigger automatic resolution procedures
- **Resource Release** - Deadlock resolution involves strategic resource release
- **Prevention Learning** - Deadlock patterns inform prevention strategy improvements
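Ordered, timeout-bounded multi-lock acquisition as described above could be sketched like this. `OrderedLock` and `AcquireAll` are hypothetical; a one-slot buffered channel stands in for a mutex so that an attempt can be abandoned when the context expires, which `sync.Mutex` does not support. Ranks are assumed to be unique.

```go
package main

import (
	"context"
	"sort"
	"time"
)

// OrderedLock is a hypothetical lock with a fixed rank in the global lock order.
type OrderedLock struct {
	Rank int
	sem  chan struct{} // one-slot channel used as a cancellable mutex
}

func NewOrderedLock(rank int) *OrderedLock {
	return &OrderedLock{Rank: rank, sem: make(chan struct{}, 1)}
}

// lock acquires the lock or gives up when ctx is done.
func (l *OrderedLock) lock(ctx context.Context) error {
	select {
	case l.sem <- struct{}{}:
		return nil
	case <-ctx.Done():
		return ctx.Err()
	}
}

// Unlock releases the lock; it must only be called by the current holder.
func (l *OrderedLock) Unlock() { <-l.sem }

// AcquireAll takes the locks in ascending rank regardless of argument order.
// On failure it releases everything acquired so far and returns the error.
func AcquireAll(ctx context.Context, locks ...*OrderedLock) (func(), error) {
	sorted := append([]*OrderedLock(nil), locks...)
	sort.Slice(sorted, func(i, j int) bool { return sorted[i].Rank < sorted[j].Rank })

	held := make([]*OrderedLock, 0, len(sorted))
	release := func() {
		for i := len(held) - 1; i >= 0; i-- {
			held[i].Unlock()
		}
	}
	for _, l := range sorted {
		if err := l.lock(ctx); err != nil {
			release()
			return nil, err
		}
		held = append(held, l)
	}
	return release, nil
}

// lockDemo holds both locks, then shows a second attempt timing out instead of hanging.
func lockDemo() (firstOK bool, secondErr error) {
	a, b := NewOrderedLock(1), NewOrderedLock(2)
	release, err := AcquireAll(context.Background(), b, a)
	if err != nil {
		return false, nil
	}
	defer release()

	ctx, cancel := context.WithTimeout(context.Background(), 20*time.Millisecond)
	defer cancel()
	_, secondErr = AcquireAll(ctx, a, b)
	return true, secondErr
}
```

Sorting by rank removes circular waits between callers that request the same locks in different orders, and the context bounds how long any single attempt can block.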
---
## Related Architecture Documentation
- **[Request Flow](./request-flow)** - How concurrency fits in request processing
- **[Benchmarks](../../benchmarking/getting-started)** - Concurrency performance characteristics
- **[Plugin System](./plugins)** - Plugin concurrency considerations
- **[MCP System](./mcp)** - MCP concurrency and worker integration
## Usage Documentation
- **[Provider Configuration](../../quickstart/gateway/provider-configuration)** - Configure concurrency settings per provider
- **[Performance Analysis](../../benchmarking/getting-started)** - Memory pool configuration and optimization
- **[Performance Monitoring](../../features/telemetry)** - Monitor concurrency metrics and health
- **[Go SDK Usage](../../quickstart/go-sdk/setting-up)** - Use Bifrost concurrency in Go applications
- **[Gateway Setup](../../quickstart/gateway/setting-up)** - Deploy Bifrost with optimal concurrency settings
---
**🎯 Next Step:** Understand how plugins integrate with the concurrency model in **[Plugin System](./plugins)**.


@@ -0,0 +1,985 @@
---
title: "Model Context Protocol (MCP)"
description: "Deep dive into Bifrost's Model Context Protocol (MCP) integration - how external tool discovery, execution, and integration work internally."
icon: "toolbox"
---
## MCP Architecture Overview
### **What is MCP in Bifrost?**
The Model Context Protocol (MCP) system in Bifrost enables AI models to seamlessly discover and execute external tools, transforming static chat models into dynamic, action-capable agents. This architecture bridges the gap between AI reasoning and real-world tool execution.
**Core MCP Principles:**
- **Dynamic Discovery** - Tools are discovered at runtime, not hardcoded
- **Client-Side Execution** - Bifrost controls all tool execution for security
- **Multi-Protocol Support** - InProcess, STDIO, HTTP, and SSE connection types
- **Request-Level Filtering** - Granular control over tool availability
- **Async Execution** - Non-blocking tool invocation and response handling
### **MCP System Components**
```mermaid
graph TB
subgraph "MCP Management Layer"
MCPMgr[MCP Manager<br/>Central Controller]
ClientRegistry[Client Registry<br/>Connection Management]
ToolDiscovery[Tool Discovery<br/>Runtime Registration]
end
subgraph "MCP Execution Layer"
ToolFilter[Tool Filter<br/>Access Control]
ToolExecutor[Tool Executor<br/>Invocation Engine]
ResultProcessor[Result Processor<br/>Response Handling]
end
subgraph "Connection Types"
STDIOConn[STDIO Connections<br/>Command-line Tools]
HTTPConn[HTTP Connections<br/>Web Services]
SSEConn[SSE Connections<br/>Real-time Streams]
end
subgraph "External MCP Servers"
FileSystem[Filesystem Tools<br/>File Operations]
WebSearch[Web Search<br/>Information Retrieval]
Database[Database Tools<br/>Data Access]
Custom[Custom Tools<br/>Business Logic]
end
MCPMgr --> ClientRegistry
ClientRegistry --> ToolDiscovery
ToolDiscovery --> ToolFilter
ToolFilter --> ToolExecutor
ToolExecutor --> ResultProcessor
ClientRegistry --> STDIOConn
ClientRegistry --> HTTPConn
ClientRegistry --> SSEConn
STDIOConn --> FileSystem
HTTPConn --> WebSearch
HTTPConn --> Database
STDIOConn --> Custom
```
---
## MCP Connection Architecture
### **Multi-Protocol Connection System**
Bifrost supports four MCP connection types, each optimized for different tool deployment patterns:
```mermaid
graph TB
subgraph "InProcess Connections"
InProcess[In-Memory Tools<br/>Same Process]
InProcessEx[Examples:<br/>• Embedded tools<br/>• High-perf operations<br/>• Testing tools]
end
subgraph "STDIO Connections"
STDIO[Command Line Tools<br/>Local Execution]
STDIOEx[Examples:<br/>• Filesystem tools<br/>• Local scripts<br/>• CLI utilities]
end
subgraph "HTTP Connections"
HTTP[Web Service Tools<br/>Remote APIs]
HTTPEx[Examples:<br/>• Web search APIs<br/>• Database services<br/>• External integrations]
end
subgraph "SSE Connections"
SSE[Real-time Tools<br/>Streaming Data]
SSEEx[Examples:<br/>• Live data feeds<br/>• Real-time monitoring<br/>• Event streams]
end
subgraph "Connection Characteristics"
Latency[Latency:<br/>InProcess < STDIO < HTTP < SSE]
Security[Security:<br/>InProcess/Local > HTTP > SSE]
Scalability[Scalability:<br/>HTTP > SSE > STDIO > InProcess]
Complexity[Complexity:<br/>InProcess < STDIO < HTTP < SSE]
end
InProcess --> Latency
STDIO --> Latency
HTTP --> Security
SSE --> Scalability
HTTP --> Complexity
```
### **Connection Type Details**
**InProcess Connections (In-Memory Tools):**
- **Use Case:** Embedded tools, high-performance operations, testing
- **Performance:** Lowest possible latency (~0.1ms) with no IPC overhead
- **Security:** Highest security as tools run in the same process
- **Limitations:** Go package only, cannot be configured via JSON
**STDIO Connections (Local Tools):**
- **Use Case:** Command-line tools, local scripts, filesystem operations
- **Performance:** Low latency (~1-10ms) due to local execution
- **Security:** High security with full local control
- **Limitations:** Single-server deployment, resource sharing
**HTTP Connections (Remote Services):**
- **Use Case:** Web APIs, microservices, cloud functions
- **Performance:** Network-dependent latency (~10-500ms)
- **Security:** Configurable with authentication and encryption
- **Advantages:** Scalable, multi-server deployment, service isolation
**SSE Connections (Streaming Tools):**
- **Use Case:** Real-time data feeds, live monitoring, event streams
- **Performance:** Variable latency depending on stream frequency
- **Security:** Similar to HTTP with streaming capabilities
- **Benefits:** Real-time updates, persistent connections, event-driven
> **MCP Configuration:** [MCP Setup Guide →](../../mcp/overview)
---
## Tool Discovery & Registration
### **Dynamic Tool Discovery Process**
The MCP system discovers tools at runtime rather than requiring static configuration, enabling flexible and adaptive tool availability:
```mermaid
sequenceDiagram
participant Bifrost
participant MCPManager
participant MCPServer
participant ToolRegistry
participant AIModel
Note over Bifrost: System Startup
Bifrost->>MCPManager: Initialize MCP System
MCPManager->>MCPServer: Establish Connection
MCPServer-->>MCPManager: Connection Ready
MCPManager->>MCPServer: List Available Tools
MCPServer-->>MCPManager: Tool Definitions
MCPManager->>ToolRegistry: Register Tools
Note over Bifrost: Runtime Request Processing
AIModel->>MCPManager: Request Available Tools
MCPManager->>ToolRegistry: Query Tools
ToolRegistry-->>MCPManager: Filtered Tool List
MCPManager-->>AIModel: Available Tools
AIModel->>MCPManager: Execute Tool Call
MCPManager->>MCPServer: Tool Invocation
MCPServer->>MCPServer: Execute Tool Logic
MCPServer-->>MCPManager: Tool Result
MCPManager-->>AIModel: Enhanced Response
```
### **Tool Registry Management**
**Registration Process:**
1. **Connection Establishment** - MCP client connects to configured servers
2. **Capability Exchange** - Server announces available tools and schemas
3. **Tool Validation** - Bifrost validates tool definitions and security
4. **Registry Update** - Tools are registered in the internal tool registry
5. **Availability Notification** - Tools become available for AI model use
**Registry Features:**
- **Dynamic Updates** - Tools can be added/removed during runtime
- **Version Management** - Support for tool versioning and compatibility
- **Access Control** - Request-level tool filtering and permissions
- **Health Monitoring** - Continuous tool availability checking
**Tool Metadata Structure:**
- **Name & Description** - Human-readable tool identification
- **Parameters Schema** - JSON schema for tool input validation
- **Return Schema** - Expected response format definition
- **Capabilities** - Tool feature flags and limitations
- **Authentication** - Required credentials and permissions
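A registry along the lines described above might be sketched as follows. The `Tool` fields cover only part of the metadata list, and all type and method names (`ToolRegistry`, `Register`, `RemoveClient`, `Names`) are assumptions for illustration, not Bifrost's API. Validation here only checks that a name is present and that the parameter schema is syntactically valid JSON.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
	"sort"
	"sync"
)

// Tool holds a subset of the metadata listed above; field names are hypothetical.
type Tool struct {
	Name        string
	Description string
	Parameters  json.RawMessage // JSON Schema describing the tool input
	Client      string          // MCP client that provides the tool
}

// ToolRegistry is a concurrency-safe map of registered tools.
type ToolRegistry struct {
	mu    sync.RWMutex
	tools map[string]Tool
}

func NewToolRegistry() *ToolRegistry {
	return &ToolRegistry{tools: make(map[string]Tool)}
}

// Register validates a tool definition and adds or replaces it.
func (r *ToolRegistry) Register(t Tool) error {
	if t.Name == "" {
		return errors.New("tool name is required")
	}
	if !json.Valid(t.Parameters) {
		return fmt.Errorf("tool %q: parameters schema is not valid JSON", t.Name)
	}
	r.mu.Lock()
	defer r.mu.Unlock()
	r.tools[t.Name] = t
	return nil
}

// RemoveClient drops every tool of a disconnected client and returns the count.
func (r *ToolRegistry) RemoveClient(client string) int {
	r.mu.Lock()
	defer r.mu.Unlock()
	removed := 0
	for name, t := range r.tools {
		if t.Client == client {
			delete(r.tools, name)
			removed++
		}
	}
	return removed
}

// Names returns the registered tool names in sorted order.
func (r *ToolRegistry) Names() []string {
	r.mu.RLock()
	defer r.mu.RUnlock()
	names := make([]string, 0, len(r.tools))
	for name := range r.tools {
		names = append(names, name)
	}
	sort.Strings(names)
	return names
}

// registryDemo registers two tools, rejects a malformed one, then removes a client.
func registryDemo() (names []string, rejected error, removed int) {
	r := NewToolRegistry()
	schema := json.RawMessage(`{"type":"object"}`)
	_ = r.Register(Tool{Name: "read_file", Client: "filesystem", Parameters: schema})
	_ = r.Register(Tool{Name: "search", Client: "websearch", Parameters: schema})
	rejected = r.Register(Tool{Name: "broken", Client: "websearch", Parameters: json.RawMessage(`{`)})
	removed = r.RemoveClient("websearch")
	return r.Names(), rejected, removed
}
```

The read-write mutex lets many requests list tools concurrently while registration and removal take the exclusive lock.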
---
## Tool Filtering & Access Control
### **Multi-Level Filtering System**
Bifrost provides granular control over tool availability by filtering first at the MCP client level and then at the individual tool level:
```mermaid
flowchart TD
Request[Incoming Request] --> GlobalFilter{Global MCP Filter}
GlobalFilter -->|Enabled| ClientFilter[MCP Client Filtering]
GlobalFilter -->|Disabled| NoMCP[No MCP Tools]
ClientFilter --> IncludeClients{Include Clients?}
IncludeClients -->|Yes| IncludeList[Include Specified<br/>MCP Clients]
IncludeClients -->|No| AllClients[All MCP Clients]
IncludeList --> ExcludeClients{Exclude Clients?}
AllClients --> ExcludeClients
ExcludeClients -->|Yes| RemoveClients[Remove Excluded<br/>MCP Clients]
ExcludeClients -->|No| ClientsFiltered[Filtered Clients]
RemoveClients --> ToolFilter[Tool-Level Filtering]
ClientsFiltered --> ToolFilter
ToolFilter --> IncludeTools{Include Tools?}
IncludeTools -->|Yes| IncludeSpecific[Include Specified<br/>Tools Only]
IncludeTools -->|No| AllTools[All Available Tools]
IncludeSpecific --> ExcludeTools{Exclude Tools?}
AllTools --> ExcludeTools
ExcludeTools -->|Yes| RemoveTools[Remove Excluded<br/>Tools]
ExcludeTools -->|No| FinalTools[Final Tool Set]
RemoveTools --> FinalTools
FinalTools --> AIModel[Available to AI Model]
NoMCP --> AIModel
```
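
The decision flow in the diagram can be approximated with a single pass over the candidate tools. The sketch below is illustrative only: `Tool`, `Filter`, and `ApplyFilter` are hypothetical names, and the `client-tool` identifier format is assumed from the request header examples in this section.

```go
package main

import "fmt"

// Tool identifies a tool by its owning MCP client and its name (hypothetical type).
type Tool struct {
	Client string
	Name   string
}

// Filter holds hypothetical include/exclude lists.
// An empty include list means "no restriction" at that level.
type Filter struct {
	MCPEnabled     bool
	IncludeClients []string
	ExcludeClients []string
	IncludeTools   []string // entries use the "client-tool" form
	ExcludeTools   []string
}

func toSet(items []string) map[string]bool {
	set := make(map[string]bool, len(items))
	for _, it := range items {
		set[it] = true
	}
	return set
}

// ApplyFilter walks the same checks as the diagram: global switch,
// client include/exclude, then tool include/exclude.
func ApplyFilter(tools []Tool, f Filter) []Tool {
	if !f.MCPEnabled {
		return nil // global filter disabled: no MCP tools at all
	}
	incC, excC := toSet(f.IncludeClients), toSet(f.ExcludeClients)
	incT, excT := toSet(f.IncludeTools), toSet(f.ExcludeTools)

	var out []Tool
	for _, t := range tools {
		id := t.Client + "-" + t.Name
		if len(incC) > 0 && !incC[t.Client] {
			continue
		}
		if excC[t.Client] {
			continue
		}
		if len(incT) > 0 && !incT[id] {
			continue
		}
		if excT[id] {
			continue
		}
		out = append(out, t)
	}
	return out
}

func main() {
	tools := []Tool{
		{"filesystem", "read_file"},
		{"filesystem", "write_file"},
		{"websearch", "search"},
	}
	f := Filter{
		MCPEnabled:     true,
		IncludeClients: []string{"filesystem"},
		ExcludeTools:   []string{"filesystem-write_file"},
	}
	fmt.Println(ApplyFilter(tools, f))
}
```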
### **Filtering Configuration Levels**
**Request-Level Filtering:**
```bash
# Include only specific MCP clients
curl -X POST http://localhost:8080/v1/chat/completions \
-H "x-bf-mcp-include-clients: filesystem,websearch" \
-d '{"model": "gpt-4o-mini", "messages": [...]}'
# Include only specific tools
curl -X POST http://localhost:8080/v1/chat/completions \
-H "x-bf-mcp-include-tools: filesystem-read_file,websearch-search" \
-d '{"model": "gpt-4o-mini", "messages": [...]}'
```
**Configuration-Level Filtering:**
- **Client Selection** - Choose which MCP servers to connect to
- **Tool Blacklisting** - Permanently disable dangerous or unwanted tools
- **Permission Mapping** - Map user roles to available tool sets
- **Environment-Based** - Different tool sets for development vs production
**Security Benefits:**
- **Principle of Least Privilege** - Only necessary tools are exposed
- **Dynamic Access Control** - Per-request tool availability
- **Audit Trail** - Track which tools are used by which requests
- **Risk Mitigation** - Prevent access to dangerous operations
> **📖 Tool Filtering:** [MCP Tool Control →](../../mcp/filtering)
---
## Tool Execution Engine
### **Async Tool Execution Architecture**
The MCP execution engine handles tool invocation asynchronously to maintain system responsiveness and enable complex multi-tool workflows:
```mermaid
sequenceDiagram
participant AIModel
participant ExecutionEngine
participant ToolInvoker
participant MCPServer
participant ResultProcessor
AIModel->>ExecutionEngine: Tool Call Request
ExecutionEngine->>ExecutionEngine: Validate Tool Call
ExecutionEngine->>ToolInvoker: Queue Tool Execution
Note over ToolInvoker: Async Tool Execution
ToolInvoker->>MCPServer: Invoke Tool
MCPServer->>MCPServer: Execute Tool Logic
MCPServer-->>ToolInvoker: Raw Tool Result
ToolInvoker->>ResultProcessor: Process Result
ResultProcessor->>ResultProcessor: Format & Validate
ResultProcessor-->>ExecutionEngine: Processed Result
ExecutionEngine-->>AIModel: Tool Execution Complete
Note over AIModel: Multi-turn Conversation
AIModel->>ExecutionEngine: Continue with Tool Results
ExecutionEngine->>ExecutionEngine: Merge Results into Context
ExecutionEngine-->>AIModel: Enhanced Response
```
### **Execution Flow Characteristics**
**Validation Phase:**
- **Parameter Validation** - Ensure tool arguments match expected schema
- **Permission Checking** - Verify tool access permissions for the request
- **Rate Limiting** - Apply per-tool and per-user rate limits
- **Security Scanning** - Check for potentially dangerous operations
**Execution Phase:**
- **Timeout Management** - Bounded execution time to prevent hanging
- **Error Handling** - Graceful handling of tool failures and timeouts
- **Result Streaming** - Support for tools that return streaming responses
- **Resource Monitoring** - Track tool resource usage and performance
**Response Phase:**
- **Result Formatting** - Convert tool outputs to consistent format
- **Error Enrichment** - Add context and suggestions for tool failures
- **Multi-Result Aggregation** - Combine multiple tool outputs coherently
- **Context Integration** - Merge tool results into conversation context
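
Timeout management in the execution phase can be sketched with the standard `context` package. This is a hedged illustration, not the engine's real code: `ToolFunc` and `ExecuteWithTimeout` are hypothetical names, and a real tool call would go to an MCP server rather than a local function.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"time"
)

// ToolFunc is a hypothetical signature for a single tool invocation.
type ToolFunc func(ctx context.Context) (string, error)

// ExecuteWithTimeout runs the tool in a goroutine and gives up once the
// timeout elapses, so a hanging tool cannot block the caller.
func ExecuteWithTimeout(parent context.Context, tool ToolFunc, timeout time.Duration) (string, error) {
	ctx, cancel := context.WithTimeout(parent, timeout)
	defer cancel()

	type outcome struct {
		result string
		err    error
	}
	done := make(chan outcome, 1) // buffered so the goroutine never blocks on send
	go func() {
		res, err := tool(ctx)
		done <- outcome{res, err}
	}()

	select {
	case o := <-done:
		return o.result, o.err
	case <-ctx.Done():
		return "", fmt.Errorf("tool timed out after %s: %w", timeout, ctx.Err())
	}
}

// demo runs a fast tool and a slow tool under the same time budget.
func demo() (string, error) {
	fast := func(ctx context.Context) (string, error) { return "ok", nil }
	slow := func(ctx context.Context) (string, error) {
		select {
		case <-time.After(2 * time.Second):
			return "late", nil
		case <-ctx.Done():
			return "", ctx.Err()
		}
	}
	res, _ := ExecuteWithTimeout(context.Background(), fast, 100*time.Millisecond)
	_, err := ExecuteWithTimeout(context.Background(), slow, 100*time.Millisecond)
	return res, err
}

func main() {
	res, err := demo()
	fmt.Println(res, errors.Is(err, context.DeadlineExceeded))
}
```

The timeout error wraps `ctx.Err()`, so callers can distinguish deadline expiry from other tool failures with `errors.Is`.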
### **Multi-Turn Conversation Support**
The MCP system enables sophisticated multi-turn conversations where AI models can:
1. **Initial Tool Discovery** - Request available tools for a given context
2. **Tool Execution** - Execute one or more tools based on user request
3. **Result Analysis** - Analyze tool outputs and determine next steps
4. **Follow-up Actions** - Execute additional tools based on previous results
5. **Response Synthesis** - Combine tool results into coherent user response
**Example Multi-Turn Flow:**
```
User: "Find recent news about AI and save interesting articles"
AI: → Execute web_search("AI news recent")
AI: → Analyze search results
AI: → Execute save_article() for each interesting result
AI: → Respond with summary of saved articles
```
### **Complete User-Controlled Tool Execution Flow**
The following diagram shows the end-to-end user experience with MCP tool execution, highlighting the critical user control points and decision-making process:
```mermaid
flowchart TD
A["👤 User Message<br/>\"List files in current directory\""] --> B["🤖 Bifrost Core"]
B --> C["🔧 MCP Manager<br/>Auto-discovers and adds<br/>available tools to request"]
C --> D["🌐 LLM Provider<br/>(OpenAI, Anthropic, etc.)"]
D --> E{"🔍 Response contains<br/>tool_calls?"}
E -->|No| F["✅ Final Response<br/>Display to user"]
E -->|Yes| G["📝 Add assistant message<br/>with tool_calls to history"]
G --> H["🛡️ YOUR EXECUTION LOGIC<br/>(Security, Approval, Logging)"]
H --> I{"🤔 User Decision Point<br/>Execute this tool?"}
I -->|Deny| J["❌ Create denial result<br/>Add to conversation history"]
I -->|Approve| K["⚙️ client.ExecuteMCPTool()<br/>Bifrost executes via MCP"]
K --> L["📊 Tool Result<br/>Add to conversation history"]
J --> M["🔄 Continue conversation loop<br/>Send updated history back to LLM"]
L --> M
M --> D
style A fill:#e1f5fe
style F fill:#e8f5e8
style H fill:#fff3e0
style I fill:#fce4ec
style K fill:#f3e5f5
```
**Key Flow Characteristics:**
**User Control Points:**
- **Security Layer** - Your application controls all tool execution decisions
- **Approval Gate** - Users can approve or deny each tool execution
- **Transparency** - Full visibility into what tools will be executed and why
- **Conversation Continuity** - Tool results seamlessly integrate into conversation flow
**Security Benefits:**
- **No Automatic Execution** - In this flow, tools never execute without explicit approval from your application
- **Audit Trail** - Complete logging of all tool execution decisions
- **Contextual Security** - Approval decisions can consider full conversation context
- **Graceful Denials** - Denied tools result in informative responses, not errors
**Implementation Patterns:**
```go
// Example tool execution control in your application.
// client is your initialized Bifrost client; UserContext and the helper
// functions (createDenialResponse, requiresApproval, promptUserForApproval,
// handleToolError, addToolResultToHistory) are defined by your application.
func handleToolExecution(ctx context.Context, toolCall schemas.ChatToolCall, userContext UserContext) error {
	// YOUR SECURITY AND APPROVAL LOGIC HERE
	if !userContext.HasPermission(toolCall.Function.Name) {
		return createDenialResponse("Tool not permitted for user role")
	}

	if requiresApproval(toolCall) {
		approved := promptUserForApproval(toolCall)
		if !approved {
			return createDenialResponse("User denied tool execution")
		}
	}

	// Execute the tool via Bifrost
	result, err := client.ExecuteMCPTool(ctx, toolCall)
	if err != nil {
		return handleToolError(err)
	}

	return addToolResultToHistory(result)
}
```
This flow lets AI models discover and request tools while all actual execution remains under user control, balancing AI capability with human oversight.
---
## Agent Mode Architecture
Agent Mode transforms Bifrost into an autonomous agent runtime by automatically executing pre-approved tools. This section details the internal architecture of the agent execution loop.
### **Agent Execution Loop**
The agent mode operates as an iterative loop that continues until one of the termination conditions is met:
```mermaid
flowchart TD
subgraph "Agent Mode Entry"
A["📥 Incoming Chat Request"] --> B{"🔍 Check MCP Config<br/>Any tools_to_auto_execute?"}
B -->|No| C["📤 Standard Flow<br/>Return tool_calls for manual execution"]
B -->|Yes| D["🤖 Enter Agent Loop"]
end
subgraph "Agent Execution Loop"
D --> E["🌐 Send to LLM Provider<br/>With available tools"]
E --> F{"🔧 Response has<br/>tool_calls?"}
F -->|No| G["✅ Return Final Response<br/>No more tools needed"]
F -->|Yes| H["📋 Classify Tool Calls"]
H --> I{"🔐 Separate by<br/>auto-execute status"}
I --> J["⚡ Auto-Executable Tools"]
I --> K["🛡️ Non-Auto-Executable Tools"]
J --> L["🔄 Execute in Parallel<br/>Via ToolsManager"]
L --> M["📊 Collect Results"]
K --> N{"Any non-auto<br/>tools found?"}
N -->|Yes| O["🛑 Exit Loop Early<br/>Return mixed response"]
N -->|No| P{"⏱️ Max depth<br/>reached?"}
M --> P
P -->|Yes| Q["⚠️ Return Current State<br/>May have pending tools"]
P -->|No| R["📝 Add results to history"]
R --> E
end
subgraph "Response Handling"
O --> S["📦 Create Mixed Response<br/>• Content: executed results JSON<br/>• tool_calls: pending tools<br/>• finish_reason: stop"]
G --> T["📦 Standard Response<br/>Final answer from LLM"]
Q --> U["📦 Depth Limit Response<br/>Current state with any pending"]
end
style D fill:#e3f2fd
style L fill:#e8f5e9
style O fill:#fff3e0
style S fill:#fce4ec
```
### **Tool Classification System**
When the LLM returns tool calls, Bifrost classifies each tool based on the client configuration:
```mermaid
flowchart LR
subgraph "Tool Call Classification"
TC["🔧 Tool Call<br/>from LLM Response"] --> CHECK{"Tool in<br/>tools_to_execute?"}
CHECK -->|No| SKIP["❌ Skip<br/>Not allowed"]
CHECK -->|Yes| AUTO{"Tool in<br/>tools_to_auto_execute?"}
AUTO -->|Yes| EXEC["⚡ Auto-Execute<br/>Run immediately"]
AUTO -->|No| MANUAL["🛡️ Manual<br/>Return to caller"]
end
subgraph "Configuration Example"
CONFIG["MCPClientConfig"]
CONFIG --> TE["tools_to_execute: [*]<br/>All tools available"]
CONFIG --> TAE["tools_to_auto_execute:<br/>[read_file, list_dir]"]
end
style EXEC fill:#c8e6c9
style MANUAL fill:#fff9c4
style SKIP fill:#ffcdd2
```
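
The classification rule in the diagram reduces to two list lookups. The following Go sketch is illustrative: `ClientConfig`, `Decision`, and `Classify` are hypothetical names, and treating `*` as a wildcard in both lists is an assumption of this example.

```go
package main

import "fmt"

// Decision is the outcome of classifying one tool call (hypothetical type).
type Decision string

const (
	Skip        Decision = "skip"
	AutoExecute Decision = "auto-execute"
	Manual      Decision = "manual"
)

// ClientConfig mirrors the two lists from the configuration example above.
type ClientConfig struct {
	ToolsToExecute     []string
	ToolsToAutoExecute []string
}

// allows reports whether the list names the tool or contains the "*" wildcard.
func allows(list []string, tool string) bool {
	for _, entry := range list {
		if entry == "*" || entry == tool {
			return true
		}
	}
	return false
}

// Classify applies the checks in diagram order: allowed first, then auto-execute.
func Classify(cfg ClientConfig, tool string) Decision {
	if !allows(cfg.ToolsToExecute, tool) {
		return Skip
	}
	if allows(cfg.ToolsToAutoExecute, tool) {
		return AutoExecute
	}
	return Manual
}

func main() {
	cfg := ClientConfig{
		ToolsToExecute:     []string{"*"},
		ToolsToAutoExecute: []string{"read_file", "list_dir"},
	}
	for _, tool := range []string{"read_file", "write_file"} {
		fmt.Println(tool, "->", Classify(cfg, tool))
	}
}
```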
### **Mixed Tool Response Format**
When a response contains both auto-executable and non-auto-executable tools, the agent creates a special response format:
<AccordionGroup>
<Accordion title="Chat API Response Format" icon="message" defaultOpen>
```json
{
  "id": "chatcmpl-abc123",
  "choices": [{
    "index": 0,
    "finish_reason": "stop",
    "message": {
      "role": "assistant",
      "content": "The Output from allowed tools calls is - {\"filesystem_read_file\":\"file contents here\",\"filesystem_list_directory\":\"[\\\"file1.txt\\\",\\\"file2.txt\\\"]\"}\n\nNow I shall call these tools next...",
      "tool_calls": [
        {
          "id": "call_write_123",
          "type": "function",
          "function": {
            "name": "filesystem_write_file",
            "arguments": "{\"path\":\"output.txt\",\"content\":\"...\"}"
          }
        }
      ]
    }
  }]
}
```
<Note>
The `content` field contains JSON-formatted results from auto-executed tools. The `tool_calls` array contains only non-auto-executable tools awaiting approval. Setting `finish_reason` to `"stop"` ensures the agent loop exits.
</Note>
</Accordion>
<Accordion title="Responses API Format" icon="code">
```json
{
  "id": "resp-abc123",
  "output": [
    {
      "type": "message",
      "role": "assistant",
      "content": [{
        "type": "text",
        "text": "The Output from allowed tools calls is - {...}\n\nNow I shall call these tools next..."
      }]
    },
    {
      "type": "function_call",
      "role": "assistant",
      "call_id": "call_write_123",
      "name": "filesystem_write_file",
      "arguments": "{\"path\":\"output.txt\",\"content\":\"...\"}"
    }
  ]
}
```
</Accordion>
</AccordionGroup>
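
On the caller side, a mixed Chat API response can be inspected with `encoding/json` to find the tool calls still awaiting approval. This is a sketch under stated assumptions: the structs cover only the fields shown in the example above, and `PendingToolCalls` is a hypothetical helper, not part of the Bifrost SDK.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// toolCall and chatResponse model only the fields shown in the example response.
type toolCall struct {
	ID       string `json:"id"`
	Type     string `json:"type"`
	Function struct {
		Name      string `json:"name"`
		Arguments string `json:"arguments"`
	} `json:"function"`
}

type chatResponse struct {
	Choices []struct {
		FinishReason string `json:"finish_reason"`
		Message      struct {
			Content   string     `json:"content"`
			ToolCalls []toolCall `json:"tool_calls"`
		} `json:"message"`
	} `json:"choices"`
}

// PendingToolCalls returns the names of tools the caller still has to approve.
func PendingToolCalls(raw []byte) ([]string, error) {
	var resp chatResponse
	if err := json.Unmarshal(raw, &resp); err != nil {
		return nil, err
	}
	var names []string
	for _, c := range resp.Choices {
		for _, tc := range c.Message.ToolCalls {
			names = append(names, tc.Function.Name)
		}
	}
	return names, nil
}

// sample is a shortened version of the mixed response shown above.
const sample = `{
  "id": "chatcmpl-abc123",
  "choices": [{
    "index": 0,
    "finish_reason": "stop",
    "message": {
      "role": "assistant",
      "content": "results of auto-executed tools",
      "tool_calls": [{
        "id": "call_write_123",
        "type": "function",
        "function": {"name": "filesystem_write_file", "arguments": "{\"path\":\"output.txt\"}"}
      }]
    }
  }]
}`

func main() {
	names, err := PendingToolCalls([]byte(sample))
	if err != nil {
		fmt.Println("decode error:", err)
		return
	}
	fmt.Println("pending approval:", names)
}
```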
### **Agent Depth Control**
The `max_agent_depth` setting prevents infinite loops and controls resource usage:
```mermaid
graph LR
subgraph "Depth Tracking"
D0["Depth 0<br/>Initial Request"] --> D1["Depth 1<br/>First tool execution"]
D1 --> D2["Depth 2<br/>Second iteration"]
D2 --> D3["Depth 3<br/>..."]
D3 --> DN["Depth N<br/>Max reached"]
end
DN --> EXIT["🛑 Force Exit<br/>Return current state"]
subgraph "Configuration"
CFG["MCPToolManagerConfig"]
CFG --> MAX["max_agent_depth: 10<br/>(default)"]
CFG --> TIMEOUT["tool_execution_timeout:<br/>30s per tool"]
end
```
<Warning>
When max depth is reached, the response may contain pending tool calls that weren't executed. Your application should handle this gracefully.
</Warning>
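
The loop and its three exit conditions (no tool calls, a non-auto-executable tool, depth limit) can be sketched as below. This is a simplified illustration, not Bifrost's agent implementation: all types are hypothetical, tools run sequentially rather than in parallel, and the result format is invented for the example.

```go
package main

import (
	"fmt"
	"strings"
)

// ToolCall and LLMResponse are hypothetical, simplified stand-ins.
type ToolCall struct{ Name, Args string }

type LLMResponse struct {
	Content   string
	ToolCalls []ToolCall
}

type LLM func(history []string) LLMResponse
type Executor func(call ToolCall) string

// RunAgent iterates until the model stops requesting tools, a tool needs
// manual approval, or maxDepth iterations have run.
func RunAgent(llm LLM, exec Executor, autoExec map[string]bool, maxDepth int) LLMResponse {
	var history []string
	var resp LLMResponse
	for depth := 0; depth < maxDepth; depth++ {
		resp = llm(history)
		if len(resp.ToolCalls) == 0 {
			return resp // final answer, no more tools needed
		}
		var pending []ToolCall
		var results []string
		for _, c := range resp.ToolCalls {
			if autoExec[c.Name] {
				results = append(results, c.Name+"="+exec(c))
			} else {
				pending = append(pending, c)
			}
		}
		if len(pending) > 0 {
			// Mixed response: executed results in Content, the rest left pending.
			return LLMResponse{Content: strings.Join(results, ";"), ToolCalls: pending}
		}
		history = append(history, results...)
	}
	return resp // depth limit reached: return the last response as-is
}

func main() {
	// Stub model that requests one read and one write on every turn.
	llm := func(history []string) LLMResponse {
		return LLMResponse{ToolCalls: []ToolCall{{Name: "read_file"}, {Name: "write_file"}}}
	}
	exec := func(c ToolCall) string { return "ok" }
	resp := RunAgent(llm, exec, map[string]bool{"read_file": true}, 10)
	fmt.Printf("content=%q pending=%d\n", resp.Content, len(resp.ToolCalls))
}
```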
---
## Code Mode Architecture
Code Mode enables AI models to write Python-like Starlark code that Bifrost executes to orchestrate multiple MCP tools in a single request. This adds a meta-layer for complex multi-tool workflows.
### **Code Mode System Overview**
```mermaid
graph TB
subgraph "Code Mode Components"
VM["🖥️ Starlark Interpreter<br/>Python-like Runtime"]
VFS["📁 Virtual File System<br/>Tool Definitions as .pyi"]
EXEC["⚙️ Code Executor<br/>Sandboxed Execution"]
end
subgraph "Meta Tools"
LIST["listToolFiles()<br/>Discover available servers"]
READ["readToolFile(fileName)<br/>Get tool signatures"]
DOCS["getToolDocs(server, tool)<br/>Get detailed docs"]
CODE["executeToolCode(code)<br/>Run Python code"]
end
subgraph "MCP Integration"
TOOLS["🔧 Connected MCP Tools"]
RESULTS["📊 Tool Results"]
end
LLM["🤖 LLM"] --> LIST
LIST --> VFS
VFS --> LLM
LLM --> READ
READ --> VFS
VFS --> LLM
LLM --> DOCS
DOCS --> VFS
VFS --> LLM
LLM --> CODE
CODE --> VM
VM --> EXEC
EXEC --> TOOLS
TOOLS --> RESULTS
RESULTS --> LLM
style VM fill:#e8eaf6
style VFS fill:#e3f2fd
style CODE fill:#e8f5e9
```
### **Virtual File System (VFS)**
Code Mode generates Python stub files (`.pyi`) for all connected MCP tools, providing compact function signatures:
<Tabs>
<Tab title="Server-Level Binding">
When `code_mode_binding_level: "server"` (default), tools are grouped by MCP client:
```
servers/
├── filesystem.pyi → All filesystem tools
├── web_search.pyi → All web search tools
└── database.pyi → All database tools
```
**Generated Stub Example:**
```python
# servers/filesystem.pyi
# Usage: filesystem.tool_name(param=value)
# For detailed docs: use getToolDocs(server="filesystem", tool="tool_name")
def read_file(path: str) -> dict: # Read contents of a file
def write_file(path: str, content: str) -> dict: # Write content to a file
def list_directory(path: str) -> dict: # List directory contents
```
**Usage in Code:**
```python
files = filesystem.list_directory(path=".")
content = filesystem.read_file(path=files["entries"][0])
result = content
```
</Tab>
<Tab title="Tool-Level Binding">
When `code_mode_binding_level: "tool"`, each tool gets its own file:
```
servers/
├── filesystem/
│   ├── read_file.pyi
│   ├── write_file.pyi
│   └── list_directory.pyi
├── web_search/
│   └── search.pyi
└── database/
    └── query.pyi
```
**Generated Stub Example:**
```python
# servers/filesystem/read_file.pyi
# Usage: filesystem.read_file(param=value)
def read_file(path: str) -> dict: # Read contents of a file
```
**Usage in Code:**
```python
content = filesystem.read_file(path="config.json")
result = content
```
</Tab>
</Tabs>
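
How a tool definition maps to a stub path and signature line under each binding level can be sketched as follows. The names `ToolDef`, `StubPath`, and `StubLine` are hypothetical; the output format simply imitates the stub examples above and is not taken from Bifrost's generator.

```go
package main

import (
	"fmt"
	"strings"
)

// Param and ToolDef are hypothetical inputs to stub generation.
type Param struct{ Name, Type string }

type ToolDef struct {
	Server      string
	Name        string
	Description string
	Params      []Param
}

// StubPath returns the virtual file that holds the tool's signature:
// one file per server, or one file per tool when level is "tool".
func StubPath(level string, t ToolDef) string {
	if level == "tool" {
		return "servers/" + t.Server + "/" + t.Name + ".pyi"
	}
	return "servers/" + t.Server + ".pyi"
}

// StubLine renders a compact one-line signature with the description as a comment.
func StubLine(t ToolDef) string {
	parts := make([]string, len(t.Params))
	for i, p := range t.Params {
		parts[i] = p.Name + ": " + p.Type
	}
	return fmt.Sprintf("def %s(%s) -> dict: # %s", t.Name, strings.Join(parts, ", "), t.Description)
}

func main() {
	tool := ToolDef{
		Server:      "filesystem",
		Name:        "write_file",
		Description: "Write content to a file",
		Params:      []Param{{"path", "str"}, {"content", "str"}},
	}
	fmt.Println(StubPath("server", tool))
	fmt.Println(StubPath("tool", tool))
	fmt.Println(StubLine(tool))
}
```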
### **Code Execution Flow**
```mermaid
sequenceDiagram
participant LLM as 🤖 LLM
participant CM as 📝 Code Mode Handler
participant VM as 🖥️ Starlark Interpreter
participant TM as 🔧 Tools Manager
participant MCP as 🌐 MCP Servers
LLM->>CM: executeToolCode({ code: "..." })
CM->>VM: Initialize sandbox
CM->>VM: Inject tool bindings
CM->>VM: Execute Python code
loop For each tool call in code
VM->>TM: server.tool(param=value)
TM->>MCP: Execute tool
MCP-->>TM: Tool result
TM-->>VM: Return result
end
VM-->>CM: Execution result
CM-->>LLM: { result, logs }
```
### **Starlark Sandbox**
The code execution environment is carefully sandboxed using Starlark, a Python-like language designed for configuration and embedded scripting:
<AccordionGroup>
<Accordion title="Available Features" icon="check" defaultOpen>
- ✅ **Python-like syntax** - Starlark is a dialect of Python with familiar syntax; not every Python feature is available
- ✅ **Synchronous calls** - No async/await needed, direct function calls
- ✅ **List comprehensions** - `[x for x in items if condition]`
- ✅ **print()** - Output captured and returned in logs
- ✅ **Dict/List operations** - Standard Python data structures
- ✅ **Tool bindings** - All connected MCP tools as globals
</Accordion>
<Accordion title="Restricted Features" icon="ban">
- ❌ **Imports** - No `import` statements (tools are pre-bound)
- ❌ **Classes** - Use dicts and functions instead
- ❌ **File I/O** - No direct filesystem access (use MCP tools)
- ❌ **Network** - No direct network access (use MCP tools)
- ❌ **Randomness/Time** - Deterministic execution only
</Accordion>
</AccordionGroup>
### **Code Mode Security Model**
```mermaid
graph TB
subgraph "Security Layers"
L1["🔒 Code Validation<br/>Syntax checking before execution"]
L2["🛡️ Sandboxed Runtime<br/>No external module access"]
L3["⏱️ Execution Timeout<br/>Bounded runtime"]
L4["🔐 Tool ACL<br/>Only allowed tools accessible"]
end
subgraph "Execution Boundaries"
B1["No filesystem access<br/>(except via MCP tools)"]
B2["No network access<br/>(except via MCP tools)"]
B3["No process spawning"]
B4["Memory isolation enforced"]
end
L1 --> L2 --> L3 --> L4
L4 --> B1
L4 --> B2
L4 --> B3
L4 --> B4
```
### **Code Mode Configuration**
<Tabs>
<Tab title="Gateway (config.json)">
```json
{
  "mcp": {
    "client_configs": [
      {
        "name": "filesystem",
        "is_code_mode_client": true,
        "connection_type": "stdio",
        "stdio_config": {
          "command": "npx",
          "args": ["-y", "@anthropic/mcp-filesystem"]
        },
        "tools_to_execute": ["*"]
      }
    ],
    "tool_manager_config": {
      "code_mode_binding_level": "server",
      "tool_execution_timeout": "30s"
    }
  }
}
```
</Tab>
<Tab title="Go SDK">
```go
mcpConfig := &schemas.MCPConfig{
	ClientConfigs: []schemas.MCPClientConfig{
		{
			Name:             "filesystem",
			IsCodeModeClient: true,
			ConnectionType:   schemas.MCPConnectionTypeSTDIO,
			StdioConfig: &schemas.MCPStdioConfig{
				Command: "npx",
				Args:    []string{"-y", "@anthropic/mcp-filesystem"},
			},
			ToolsToExecute: []string{"*"},
		},
	},
	ToolManagerConfig: &schemas.MCPToolManagerConfig{
		CodeModeBindingLevel: schemas.CodeModeBindingLevelServer,
		ToolExecutionTimeout: 30 * time.Second,
	},
}
```
</Tab>
</Tabs>
### **Code Mode vs Agent Mode**
| Aspect | Agent Mode | Code Mode |
|--------|------------|-----------|
| **Execution Model** | LLM decides one tool at a time | LLM writes code orchestrating multiple tools |
| **Iterations** | Multiple LLM round-trips | Single LLM call, code handles orchestration |
| **Complexity** | Simple tool chains | Complex workflows with conditionals/loops |
| **Latency** | Higher (multiple LLM calls) | Lower (single LLM call + code execution) |
| **Control** | Per-tool approval possible | Code runs atomically |
| **Best For** | Interactive agents | Batch operations, complex data processing |
---
## MCP Integration Patterns
### **Common Integration Scenarios**
**1. Filesystem Operations**
- **Tools:** `list_files`, `read_file`, `write_file`, `create_directory`
- **Use Cases:** Code analysis, document processing, file management
- **Security:** Sandboxed file access, path validation, permission checks
- **Performance:** Local execution for fast file operations
**2. Web Search & Information Retrieval**
- **Tools:** `web_search`, `fetch_url`, `extract_content`, `summarize`
- **Use Cases:** Research assistance, fact-checking, content gathering
- **Integration:** External search APIs, content parsing services
- **Caching:** Response caching for repeated queries
**3. Database Operations**
- **Tools:** `query_database`, `insert_record`, `update_record`, `schema_info`
- **Use Cases:** Data analysis, report generation, database administration
- **Security:** Read-only access by default, query validation, injection prevention
- **Performance:** Connection pooling, query optimization
**4. API Integrations**
- **Tools:** Custom business logic tools, third-party service integration
- **Use Cases:** CRM operations, payment processing, notification sending
- **Authentication:** API key management, OAuth token handling
- **Error Handling:** Retry logic, fallback mechanisms
### **MCP Server Development Patterns**
**Simple STDIO Server:**
- **Language:** Any language that can read/write JSON to stdin/stdout
- **Deployment:** Single executable, minimal dependencies
- **Use Case:** Local tools, development utilities, simple scripts
**HTTP Service Server:**
- **Architecture:** RESTful API with MCP protocol endpoints
- **Scalability:** Horizontal scaling, load balancing
- **Use Case:** Shared tools, enterprise integrations, cloud services
**Hybrid Approach:**
- **Local + Remote:** Combine STDIO tools for local operations with HTTP for remote services
- **Failover:** Use local fallbacks when remote services are unavailable
- **Optimization:** Route tool calls to most appropriate execution environment
> **📖 MCP Development:** [Tool Development Guide →](../../mcp/overview)
---
## Security & Safety Considerations
### **MCP Security Architecture**
```mermaid
graph TB
subgraph "Security Layers"
L1[Connection Security<br/>Authentication & Encryption]
L2[Tool Validation<br/>Schema & Permission Checks]
L3[Execution Security<br/>Sandboxing & Limits]
L4[Result Security<br/>Output Validation & Filtering]
end
subgraph "Threat Mitigation"
T1[Malicious Tools<br/>Code Injection Prevention]
T2[Resource Abuse<br/>Rate Limiting & Quotas]
T3[Data Exposure<br/>Output Sanitization]
T4[System Access<br/>Privilege Isolation]
end
L1 --> T1
L2 --> T2
L3 --> T4
L4 --> T3
```
**Security Measures:**
**Connection Security:**
- **Authentication** - API keys, certificates, or token-based auth for HTTP/SSE
- **Encryption** - TLS for HTTP connections, secure pipes for STDIO
- **Network Isolation** - Firewall rules and network segmentation
**Execution Security:**
- **Sandboxing** - Isolated execution environments for tools
- **Resource Limits** - CPU, memory, and time constraints
- **Permission Model** - Principle of least privilege for tool access
**Operational Security:**
- **Regular Updates** - Keep MCP servers and tools updated
- **Monitoring** - Continuous security monitoring and alerting
- **Incident Response** - Procedures for security incidents involving tools
---
## Related Architecture Documentation
- **[Request Flow](./request-flow)** - MCP integration in request processing
- **[Concurrency Model](./concurrency)** - MCP concurrency and worker integration
- **[Plugin System](./plugins)** - Integration between MCP and plugin systems
- **[Benchmarks](../../benchmarking/getting-started)** - MCP performance impact and optimization

@@ -0,0 +1,552 @@
---
title: "Plugins"
description: "Deep dive into Bifrost's extensible plugin architecture - how plugins work internally, lifecycle management, execution model, and integration patterns."
icon: "puzzle-piece"
---
## Plugin Architecture Philosophy
### **Core Design Principles**
Bifrost's plugin system is built around five key principles that ensure extensibility without compromising performance or reliability:
| Principle | Implementation | Benefit |
| ----------------------------- | ------------------------------------------------ | ------------------------------------------------ |
| **Plugin-First Design** | Core logic designed around plugin hook points | Maximum extensibility without core modifications |
| **Zero-Copy Integration** | Direct memory access to request/response objects | Minimal performance overhead |
| **Lifecycle Management** | Complete plugin lifecycle with automatic cleanup | Resource safety and leak prevention |
| **Interface-Based Safety** | Well-defined interfaces for type safety | Compile-time validation and consistency |
| **Failure Isolation** | Plugin errors don't crash the core system | Fault tolerance and system stability |
### **Plugin System Overview**
```mermaid
graph TB
subgraph "Plugin Management Layer"
PluginMgr[Plugin Manager<br/>Central Controller]
Registry[Plugin Registry<br/>Discovery & Loading]
Lifecycle[Lifecycle Manager<br/>State Management]
end
subgraph "Plugin Execution Layer"
Pipeline[Plugin Pipeline<br/>Execution Orchestrator]
PreHooks[Pre-Processing Hooks<br/>Request Modification]
PostHooks[Post-Processing Hooks<br/>Response Enhancement]
end
subgraph "Plugin Categories"
Auth[Authentication<br/>& Authorization]
RateLimit[Rate Limiting<br/>& Throttling]
Transform[Data Transformation<br/>& Validation]
Monitor[Monitoring<br/>& Analytics]
Custom[Custom Business<br/>Logic]
end
PluginMgr --> Registry
Registry --> Lifecycle
Lifecycle --> Pipeline
Pipeline --> PreHooks
Pipeline --> PostHooks
PreHooks --> Auth
PreHooks --> RateLimit
PostHooks --> Transform
PostHooks --> Monitor
PostHooks --> Custom
```
---
## Plugin Lifecycle Management
### **Complete Lifecycle States**
Every plugin goes through a well-defined lifecycle that ensures proper resource management and error handling:
```mermaid
stateDiagram-v2
[*] --> PluginInit: Plugin Creation
PluginInit --> Registered: Add to BifrostConfig
Registered --> PreHookCall: Request Received
PreHookCall --> ModifyRequest: Normal Flow
PreHookCall --> ShortCircuitResponse: Return Response
PreHookCall --> ShortCircuitError: Return Error
ModifyRequest --> ProviderCall: Send to Provider
ProviderCall --> PostHookCall: Receive Response
ShortCircuitResponse --> PostHookCall: Skip Provider
ShortCircuitError --> PostHookCall: Pipeline Symmetry
PostHookCall --> ModifyResponse: Process Result
PostHookCall --> RecoverError: Error Recovery
PostHookCall --> FallbackCheck: Check AllowFallbacks
PostHookCall --> ResponseReady: Pass Through
FallbackCheck --> TryFallback: AllowFallbacks=true/nil
FallbackCheck --> ResponseReady: AllowFallbacks=false
TryFallback --> PreHookCall: Next Provider
ModifyResponse --> ResponseReady: Modified
RecoverError --> ResponseReady: Recovered
ResponseReady --> [*]: Return to Client
Registered --> CleanupCall: Bifrost Shutdown
CleanupCall --> [*]: Plugin Destroyed
```
### **Lifecycle Phase Details**
**Discovery Phase:**
- **Purpose:** Find and catalog available plugins
- **Sources:** Command line, environment variables, JSON configuration, directory scanning
- **Validation:** Basic existence and format checks
- **Output:** Plugin descriptors with metadata
**Loading Phase:**
- **Purpose:** Load plugin binaries into memory
- **Security:** Digital signature verification and checksum validation
- **Compatibility:** Interface implementation validation
- **Resource:** Memory and capability assessment
**Initialization Phase:**
- **Purpose:** Configure plugin with runtime settings
- **Timeout:** Bounded initialization time to prevent hanging
- **Dependencies:** External service connectivity verification
- **State:** Internal state setup and resource allocation
**Runtime Phase:**
- **Purpose:** Active request processing
- **Monitoring:** Continuous health checking and performance tracking
- **Recovery:** Automatic error recovery and degraded mode handling
- **Metrics:** Real-time performance and health metrics collection
> **Plugin Lifecycle:** [Plugin Management →](../../enterprise/custom-plugins)
---
## Plugin Execution Pipeline
### **Request Processing Flow**
The plugin pipeline ensures consistent, predictable execution while maintaining high performance:
#### **Normal Execution Flow (No Short-Circuit)**
```mermaid
sequenceDiagram
participant Client
participant Bifrost
participant Plugin1
participant Plugin2
participant Provider
Client->>Bifrost: Request
Bifrost->>Plugin1: PreLLMHook(request)
Plugin1-->>Bifrost: modified request
Bifrost->>Plugin2: PreLLMHook(request)
Plugin2-->>Bifrost: modified request
Bifrost->>Provider: API Call
Provider-->>Bifrost: response
Bifrost->>Plugin2: PostLLMHook(response)
Plugin2-->>Bifrost: modified response
Bifrost->>Plugin1: PostLLMHook(response)
Plugin1-->>Bifrost: modified response
Bifrost-->>Client: Final Response
```
**Execution Order:**
1. **PreHooks:** Execute in registration order (1 → 2 → N)
2. **Provider Call:** If no short-circuit occurred
3. **PostHooks:** Execute in reverse order (N → 2 → 1)
#### **Short-Circuit Response Flow (Cache Hit)**
```mermaid
sequenceDiagram
participant Client
participant Bifrost
participant Cache
participant Auth
participant Provider
Client->>Bifrost: Request
Bifrost->>Auth: PreLLMHook(request)
Auth-->>Bifrost: modified request
Bifrost->>Cache: PreLLMHook(request)
Cache-->>Bifrost: LLMPluginShortCircuit{Response}
Note over Provider: Provider call skipped
Bifrost->>Cache: PostLLMHook(response)
Cache-->>Bifrost: modified response
Bifrost->>Auth: PostLLMHook(response)
Auth-->>Bifrost: modified response
Bifrost-->>Client: Cached Response
```
#### **Streaming Response Flow**
For streaming responses, the plugin pipeline executes post-hooks for every delta/chunk received from the provider:
```mermaid
sequenceDiagram
participant Client
participant Bifrost
participant Plugin1
participant Plugin2
participant Provider
Client->>Bifrost: Stream Request
Bifrost->>Plugin1: PreLLMHook(request)
Plugin1-->>Bifrost: modified request
Bifrost->>Plugin2: PreLLMHook(request)
Plugin2-->>Bifrost: modified request
Bifrost->>Provider: Stream API Call
loop For Each Delta
Provider-->>Bifrost: stream delta
Bifrost->>Plugin2: PostLLMHook(delta)
Plugin2-->>Bifrost: modified delta
Bifrost->>Plugin1: PostLLMHook(delta)
Plugin1-->>Bifrost: modified delta
Bifrost-->>Client: Send Delta
end
Provider-->>Bifrost: final chunk (finish reason)
Bifrost->>Plugin2: PostLLMHook(final)
Plugin2-->>Bifrost: modified final
Bifrost->>Plugin1: PostLLMHook(final)
Plugin1-->>Bifrost: modified final
Bifrost-->>Client: Final Chunk
```
**Streaming Execution Characteristics:**
1. **Delta Processing:**
- Each stream delta (chunk) goes through all post-hooks
- Plugins can modify/transform each delta before it reaches the client
- Deltas can contain: text content, tool calls, role changes, or usage info
2. **Special Delta Types:**
- **Start Event:** Initial delta with role information
- **Content Delta:** Regular text or tool call content
- **Usage Update:** Token usage statistics (if enabled)
- **Final Chunk:** Contains finish reason and any final metadata
3. **Plugin Considerations:**
- Plugins must handle streaming responses efficiently
- Each delta should be processed quickly to maintain stream responsiveness
- Plugins can track state across deltas using context
- Heavy processing should be done asynchronously
4. **Error Handling:**
- If a post-hook returns an error, it's sent as an error stream chunk
- Stream is terminated after error chunks
- Plugins can recover from errors by providing valid responses
5. **Performance Optimization:**
- Lightweight delta processing to minimize latency
- Object pooling for common data structures
- Non-blocking operations for logging and metrics
- Efficient memory management for stream processing
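
Per-delta post-hook processing can be sketched with a channel pipeline. This is an illustrative sketch only: `DeltaHook` and `StreamWithPostHooks` are hypothetical names, and deltas are modeled as plain strings rather than Bifrost's stream chunk types.

```go
package main

import "fmt"

// DeltaHook is a hypothetical post-hook applied to one stream delta.
type DeltaHook func(delta string) string

// StreamWithPostHooks applies every hook to each delta, in reverse
// registration order, before forwarding the delta to the client channel.
func StreamWithPostHooks(deltas <-chan string, hooks []DeltaHook) <-chan string {
	out := make(chan string)
	go func() {
		defer close(out) // closing signals end of stream to the consumer
		for d := range deltas {
			for i := len(hooks) - 1; i >= 0; i-- {
				d = hooks[i](d)
			}
			out <- d
		}
	}()
	return out
}

func main() {
	in := make(chan string, 3)
	for _, d := range []string{"Hel", "lo", "!"} {
		in <- d
	}
	close(in)

	hooks := []DeltaHook{
		func(d string) string { return d + "|p1" },
		func(d string) string { return d + "|p2" },
	}
	for d := range StreamWithPostHooks(in, hooks) {
		fmt.Println(d)
	}
}
```

Because every delta passes through every hook on the hot path, each hook should stay cheap; heavier work would typically be handed off to a separate goroutine.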
> **Streaming Details:** [Streaming Guide →](../../quickstart/gateway/streaming)
**Short-Circuit Rules:**
- **Provider Skipped:** When plugin returns short-circuit response/error
- **PostLLMHook Guarantee:** All executed PreHooks get corresponding PostLLMHook calls
- **Reverse Order:** PostHooks execute in reverse order of PreHooks
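
The short-circuit rules can be sketched as a small pipeline runner. The `Plugin` struct and `RunPipeline` function below are hypothetical simplifications (requests and responses are strings, and a short-circuit is a non-nil response pointer); they are not Bifrost's plugin interface.

```go
package main

import "fmt"

// Plugin is a hypothetical, simplified pair of hooks.
// Pre returns the (possibly modified) request and an optional short-circuit response.
type Plugin struct {
	Name string
	Pre  func(req string) (string, *string)
	Post func(resp string) string
}

// RunPipeline runs pre-hooks in order, skips the provider on short-circuit,
// and runs post-hooks in reverse for every plugin whose pre-hook executed.
func RunPipeline(plugins []Plugin, req string, provider func(string) string) (string, []string) {
	var (
		trace    []string
		resp     string
		executed int
		skipped  bool
	)
	for _, p := range plugins {
		executed++
		trace = append(trace, "pre:"+p.Name)
		next, shortCircuit := p.Pre(req)
		if shortCircuit != nil {
			resp, skipped = *shortCircuit, true
			break
		}
		req = next
	}
	if !skipped {
		trace = append(trace, "provider")
		resp = provider(req)
	}
	for i := executed - 1; i >= 0; i-- {
		trace = append(trace, "post:"+plugins[i].Name)
		resp = plugins[i].Post(resp)
	}
	return resp, trace
}

// passthrough builds a plugin that changes nothing.
func passthrough(name string) Plugin {
	return Plugin{
		Name: name,
		Pre:  func(req string) (string, *string) { return req, nil },
		Post: func(resp string) string { return resp },
	}
}

// cacheHit builds a plugin that always short-circuits with a cached response.
func cacheHit(cached string) Plugin {
	p := passthrough("cache")
	p.Pre = func(req string) (string, *string) { return req, &cached }
	return p
}

func main() {
	plugins := []Plugin{passthrough("auth"), cacheHit("cached answer"), passthrough("logger")}
	resp, trace := RunPipeline(plugins, "hello", func(string) string { return "live answer" })
	fmt.Println(resp)
	fmt.Println(trace)
}
```

In this run the `logger` plugin never sees the request, so it also gets no post-hook call, which matches the guarantee that only executed pre-hooks receive a corresponding post-hook.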
#### **Short-Circuit Error Flow (Allow Fallbacks)**
```mermaid
sequenceDiagram
participant Client
participant Bifrost
participant Plugin1
participant Provider1
participant Provider2
Client->>Bifrost: Request (Provider1 + Fallback Provider2)
Bifrost->>Plugin1: PreLLMHook(request)
Plugin1-->>Bifrost: LLMPluginShortCircuit{Error, AllowFallbacks=true}
Note over Provider1: Provider1 call skipped
Bifrost->>Plugin1: PostLLMHook(error)
Plugin1-->>Bifrost: error unchanged
Note over Bifrost: Try fallback provider
Bifrost->>Plugin1: PreLLMHook(request for Provider2)
Plugin1-->>Bifrost: modified request
Bifrost->>Provider2: API Call
Provider2-->>Bifrost: response
Bifrost->>Plugin1: PostLLMHook(response)
Plugin1-->>Bifrost: modified response
Bifrost-->>Client: Final Response
```
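
The fallback decision hinges on a tri-state flag: `AllowFallbacks` set to `true` or left unset permits fallbacks, while `false` stops them. A minimal Go sketch, assuming a hypothetical `HookError` type and `TryProviders` helper that are not part of the Bifrost SDK:

```go
package main

import "fmt"

// HookError is a hypothetical stand-in for a short-circuit error.
// A nil AllowFallbacks is treated the same as true.
type HookError struct {
	Message        string
	AllowFallbacks *bool
}

// ShouldTryFallback reports whether the next provider may be attempted.
func ShouldTryFallback(e *HookError) bool {
	return e.AllowFallbacks == nil || *e.AllowFallbacks
}

// TryProviders attempts each provider in order (primary first) and stops at
// the first success or at the first error that forbids fallbacks.
func TryProviders(providers []string, attempt func(provider string) (string, *HookError)) (string, *HookError) {
	var last *HookError
	for _, p := range providers {
		resp, herr := attempt(p)
		if herr == nil {
			return resp, nil
		}
		last = herr
		if !ShouldTryFallback(herr) {
			break
		}
	}
	return "", last
}

func main() {
	attempt := func(provider string) (string, *HookError) {
		if provider == "provider1" {
			// AllowFallbacks left nil: fallbacks are permitted.
			return "", &HookError{Message: "rate limited"}
		}
		return "response from " + provider, nil
	}
	resp, herr := TryProviders([]string{"provider1", "provider2"}, attempt)
	fmt.Println(resp, herr == nil)
}
```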
#### **Error Recovery Flow**
```mermaid
sequenceDiagram
participant Client
participant Bifrost
participant Plugin1
participant Plugin2
participant Provider
participant RecoveryPlugin
Client->>Bifrost: Request
Bifrost->>Plugin1: PreLLMHook(request)
Plugin1-->>Bifrost: modified request
Bifrost->>Plugin2: PreLLMHook(request)
Plugin2-->>Bifrost: modified request
Bifrost->>RecoveryPlugin: PreLLMHook(request)
RecoveryPlugin-->>Bifrost: modified request
Bifrost->>Provider: API Call
Provider-->>Bifrost: error
Bifrost->>RecoveryPlugin: PostLLMHook(error)
RecoveryPlugin-->>Bifrost: recovered response
Bifrost->>Plugin2: PostLLMHook(response)
Plugin2-->>Bifrost: modified response
Bifrost->>Plugin1: PostLLMHook(response)
Plugin1-->>Bifrost: modified response
Bifrost-->>Client: Recovered Response
```
**Error Recovery Features:**
- **Error Transformation:** Plugins can convert errors to successful responses
- **Graceful Degradation:** Provide fallback responses for service failures
- **Context Preservation:** Error context is maintained through recovery process
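
The recovery behaviour described above can be sketched as a post-hook that turns a provider error into a fallback response. This is a minimal illustration, not Bifrost's plugin API: `Response`, `PostHook`, `recoverWithFallback`, and `runPostHooks` are hypothetical names invented for the example.

```go
package main

import (
	"errors"
	"fmt"
)

// Response and PostHook are hypothetical simplifications for this sketch.
type Response struct{ Text string }

type PostHook func(resp *Response, err error) (*Response, error)

// errProvider stands in for a failed provider call.
var errProvider = errors.New("provider timeout")

// recoverWithFallback converts any incoming error into a canned response.
func recoverWithFallback(fallback string) PostHook {
	return func(resp *Response, err error) (*Response, error) {
		if err != nil {
			return &Response{Text: fallback}, nil
		}
		return resp, nil
	}
}

// passThrough leaves both the response and the error untouched.
func passThrough(resp *Response, err error) (*Response, error) { return resp, err }

// runPostHooks applies hooks in reverse registration order.
func runPostHooks(hooks []PostHook, resp *Response, err error) (*Response, error) {
	for i := len(hooks) - 1; i >= 0; i-- {
		resp, err = hooks[i](resp, err)
	}
	return resp, err
}

func main() {
	hooks := []PostHook{passThrough, recoverWithFallback("service temporarily unavailable")}
	resp, err := runPostHooks(hooks, nil, errProvider)
	fmt.Println(resp.Text, err)
}
```

Because the recovery hook is registered last, it runs first on the way back, so earlier plugins only ever see the recovered response.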
### **Complex Plugin Decision Flow**
Real-world plugin interactions involving authentication, rate limiting, and caching with different decision paths:
```mermaid
graph TD
A["Client Request"] --> B["Bifrost"]
B --> C["Auth Plugin PreLLMHook"]
C --> D{"Authenticated?"}
D -->|No| E["Return Auth Error<br/>AllowFallbacks=false"]
D -->|Yes| F["RateLimit Plugin PreLLMHook"]
F --> G{"Rate Limited?"}
G -->|Yes| H["Return Rate Error<br/>AllowFallbacks=nil"]
G -->|No| I["Cache Plugin PreLLMHook"]
I --> J{"Cache Hit?"}
J -->|Yes| K["Return Cached Response"]
J -->|No| L["Provider API Call"]
L --> M["Cache Plugin PostLLMHook"]
M --> N["Store in Cache"]
N --> O["RateLimit Plugin PostLLMHook"]
O --> P["Auth Plugin PostLLMHook"]
P --> Q["Final Response"]
E --> R["Skip Fallbacks"]
H --> S["Try Fallback Provider"]
K --> T["Skip Provider Call"]
```
### **Execution Characteristics**
**Symmetric Execution Pattern:**
- **Pre-processing:** Plugins execute in priority order (high to low)
- **Post-processing:** Plugins execute in reverse order (low to high)
- **Rationale:** Ensures proper cleanup and state management (last in, first out)
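
The symmetric pattern, including the guarantee that only plugins whose pre-hook ran get a post-hook, can be sketched as follows. The `Plugin` interface, `tagPlugin`, and `RunPipeline` are hypothetical and use plain strings instead of Bifrost's request and response types.

```go
package main

import "fmt"

// Plugin is a hypothetical, simplified hook interface used only for this sketch.
type Plugin interface {
	// PreHook returns shortCircuit=true to skip the provider call.
	PreHook(req string) (modified string, shortCircuit bool)
	PostHook(resp string) string
}

// tagPlugin appends its name so the execution order is visible in the output.
type tagPlugin struct {
	name  string
	short bool
}

func (p tagPlugin) PreHook(req string) (string, bool) { return req + ">" + p.name, p.short }
func (p tagPlugin) PostHook(resp string) string       { return resp + "<" + p.name }

// RunPipeline runs pre-hooks in order, then post-hooks in reverse order for
// exactly the plugins whose pre-hook was executed.
func RunPipeline(plugins []Plugin, req string, provider func(string) string) string {
	executed := 0
	shortCircuited := false
	for _, p := range plugins {
		var stop bool
		req, stop = p.PreHook(req)
		executed++
		if stop {
			shortCircuited = true
			break
		}
	}
	resp := req
	if !shortCircuited {
		resp = provider(req)
	}
	for i := executed - 1; i >= 0; i-- {
		resp = plugins[i].PostHook(resp)
	}
	return resp
}

func main() {
	plugins := []Plugin{
		tagPlugin{name: "auth"},
		tagPlugin{name: "cache", short: true}, // short-circuits, e.g. on a cache hit
		tagPlugin{name: "log"},                // never reached in this run
	}
	fmt.Println(RunPipeline(plugins, "req", func(r string) string { return r + "|provider" }))
}
```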
**Performance Optimizations:**
- **Timeout Boundaries:** Each plugin has configurable execution timeouts
- **Panic Recovery:** Plugin panics are caught and logged without crashing the system
- **Resource Limits:** Memory and CPU limits prevent runaway plugins
- **Circuit Breaking:** Repeated failures trigger plugin isolation
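
Timeout boundaries and panic recovery can be combined in one wrapper around each hook call. The sketch below uses only the standard library; `Hook`, `safeCall`, `guard`, and the 20ms timeout are assumptions made for illustration, not Bifrost's actual implementation.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"time"
)

// ErrHookPanic marks errors produced by a recovered plugin panic.
var ErrHookPanic = errors.New("plugin hook panicked")

// Hook is a hypothetical plugin entry point used only in this sketch.
type Hook func(ctx context.Context) error

// safeCall runs hook in its own goroutine, converts a panic into an error,
// and stops waiting once the timeout elapses. A hook that ignores ctx keeps
// running in the background; only the caller is unblocked.
func safeCall(ctx context.Context, timeout time.Duration, hook Hook) error {
	ctx, cancel := context.WithTimeout(ctx, timeout)
	defer cancel()

	done := make(chan error, 1) // buffered so the goroutine never blocks on send
	go func() {
		defer func() {
			if r := recover(); r != nil {
				done <- fmt.Errorf("%w: %v", ErrHookPanic, r)
			}
		}()
		done <- hook(ctx)
	}()

	select {
	case err := <-done:
		return err
	case <-ctx.Done():
		return ctx.Err()
	}
}

// guard applies a fixed, illustrative timeout to a hook.
func guard(hook Hook) error {
	return safeCall(context.Background(), 20*time.Millisecond, hook)
}

func isHookPanic(err error) bool { return errors.Is(err, ErrHookPanic) }

func okHook(context.Context) error      { return nil }
func panickyHook(context.Context) error { panic("boom") }
func slowHook(ctx context.Context) error {
	<-ctx.Done() // simulate a slow plugin that honors cancellation
	return ctx.Err()
}

func main() {
	fmt.Println(guard(okHook))
	fmt.Println(guard(panickyHook))
	fmt.Println(guard(slowHook))
}
```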
**Error Handling Strategies:**
- **Continue:** Use original request/response if plugin fails
- **Fail Fast:** Return error immediately if critical plugin fails
- **Retry:** Attempt plugin execution with exponential backoff
- **Fallback:** Use alternative plugin or default behavior
> **Plugin Execution:** [Request Flow →](./request-flow#stage-3-plugin-pipeline-processing)
---
## Security & Validation
### **Multi-Layer Security Model**
Plugin security operates at multiple layers to ensure system integrity:
```mermaid
graph TB
subgraph "Security Validation Layers"
L1[Layer 1: Binary Validation<br/>Signature & Checksum]
L2[Layer 2: Interface Validation<br/>Type Safety & Compatibility]
L3[Layer 3: Runtime Validation<br/>Resource Limits & Timeouts]
L4[Layer 4: Execution Isolation<br/>Panic Recovery & Error Handling]
end
subgraph "Security Benefits"
Integrity[Code Integrity<br/>Verified Authenticity]
Safety[Type Safety<br/>Compile-time Checks]
Stability[System Stability<br/>Isolated Failures]
Performance[Performance Protection<br/>Resource Limits]
end
L1 --> Integrity
L2 --> Safety
L3 --> Performance
L4 --> Stability
```
### **Validation Process**
**Binary Security:**
- **Digital Signatures:** Cryptographic verification of plugin authenticity
- **Checksum Validation:** File integrity verification
- **Source Verification:** Trusted source requirements
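
As an illustration of checksum validation, the sketch below compares a SHA-256 digest of the plugin bytes against an expected hex string. The choice of SHA-256 and the function names are assumptions; the document does not state which algorithm Bifrost uses.

```go
package main

import (
	"crypto/sha256"
	"crypto/subtle"
	"encoding/hex"
	"fmt"
)

// Checksum returns the hex-encoded SHA-256 digest of data.
func Checksum(data []byte) string {
	sum := sha256.Sum256(data)
	return hex.EncodeToString(sum[:])
}

// VerifyChecksum reports whether data hashes to expectedHex.
// Malformed hex is treated as a mismatch.
func VerifyChecksum(data []byte, expectedHex string) bool {
	expected, err := hex.DecodeString(expectedHex)
	if err != nil {
		return false
	}
	sum := sha256.Sum256(data)
	// Constant-time comparison avoids leaking how many bytes matched.
	return subtle.ConstantTimeCompare(sum[:], expected) == 1
}

func main() {
	plugin := []byte("plugin binary contents")
	published := Checksum(plugin) // normally obtained from a trusted source
	fmt.Println(VerifyChecksum(plugin, published))
	fmt.Println(VerifyChecksum([]byte("tampered"), published))
}
```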
**Interface Security:**
- **Type Safety:** Interface implementation verification
- **Version Compatibility:** Plugin API version checking
- **Memory Safety:** Safe memory access patterns
**Runtime Security:**
- **Resource Quotas:** Memory and CPU usage limits
- **Execution Timeouts:** Bounded execution time
- **Sandbox Execution:** Isolated execution environment
**Operational Security:**
- **Health Monitoring:** Continuous plugin health assessment
- **Error Tracking:** Plugin error rate monitoring
- **Automatic Recovery:** Failed plugin restart and recovery
---
## Plugin Performance & Monitoring
### **Comprehensive Metrics System**
Bifrost provides detailed metrics for plugin performance and health monitoring:
```mermaid
graph TB
subgraph "Execution Metrics"
ExecTime[Execution Time<br/>Latency per Plugin]
ExecCount[Execution Count<br/>Request Volume]
SuccessRate[Success Rate<br/>Error Percentage]
Throughput[Throughput<br/>Requests/Second]
end
subgraph "Resource Metrics"
MemoryUsage[Memory Usage<br/>Per Plugin Instance]
CPUUsage[CPU Utilization<br/>Processing Time]
IOMetrics[I/O Operations<br/>Network/Disk Activity]
PoolUtilization[Pool Utilization<br/>Resource Efficiency]
end
subgraph "Health Metrics"
ErrorRate[Error Rate<br/>Failed Executions]
PanicCount[Panic Recovery<br/>Crash Events]
TimeoutCount[Timeout Events<br/>Slow Executions]
RecoveryRate[Recovery Success<br/>Failure Handling]
end
subgraph "Business Metrics"
AddedLatency[Added Latency<br/>Plugin Overhead]
SystemImpact[System Impact<br/>Overall Performance]
FeatureUsage[Feature Usage<br/>Plugin Utilization]
CostImpact[Cost Impact<br/>Resource Consumption]
end
```
### **Performance Characteristics**
**Plugin Execution Performance:**
- **Typical Overhead:** 1-10μs per plugin for simple operations
- **Authentication Plugins:** 1-5μs for key validation
- **Rate Limiting Plugins:** 500ns for quota checks
- **Monitoring Plugins:** 200ns for metric collection
- **Transformation Plugins:** 2-10μs depending on complexity
**Resource Usage Patterns:**
- **Memory Efficiency:** Object pooling reduces allocations
- **CPU Optimization:** Minimal processing overhead
- **Network Impact:** Configurable external service calls
- **Storage Overhead:** Minimal for stateless plugins
---
## Plugin Integration Patterns
### **Common Integration Scenarios**
**1. Authentication & Authorization**
- **Pre-processing Hook:** Validate API keys or JWT tokens
- **Configuration:** External identity provider integration
- **Error Handling:** Return 401/403 responses for invalid credentials
- **Performance:** Sub-5μs validation with caching
**2. Rate Limiting & Quotas**
- **Pre-processing Hook:** Check request quotas and limits
- **Storage:** Redis or in-memory rate limit tracking
- **Algorithms:** Token bucket, sliding window, fixed window
- **Responses:** 429 Too Many Requests with retry headers
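
Of the algorithms listed, the token bucket is the simplest to sketch. The version below is in-memory only and takes the current time as a parameter so it stays deterministic; `TokenBucket`, `NewTokenBucket`, and the `at` helper are hypothetical names. The returned wait duration is what a plugin could expose as a retry hint alongside a 429 response. The refill rate is assumed to be greater than zero.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// TokenBucket is a minimal in-memory limiter; names are illustrative only.
type TokenBucket struct {
	mu       sync.Mutex
	capacity float64
	tokens   float64
	rate     float64 // tokens added per second, assumed > 0
	last     time.Time
}

func NewTokenBucket(capacity, ratePerSec float64, now time.Time) *TokenBucket {
	return &TokenBucket{capacity: capacity, tokens: capacity, rate: ratePerSec, last: now}
}

// Allow refills the bucket based on elapsed time, then tries to take one token.
// When it returns false, retryAfter is the time until one token is available.
func (b *TokenBucket) Allow(now time.Time) (ok bool, retryAfter time.Duration) {
	b.mu.Lock()
	defer b.mu.Unlock()

	if elapsed := now.Sub(b.last).Seconds(); elapsed > 0 {
		b.tokens += elapsed * b.rate
		if b.tokens > b.capacity {
			b.tokens = b.capacity
		}
		b.last = now
	}
	if b.tokens >= 1 {
		b.tokens--
		return true, 0
	}
	wait := (1 - b.tokens) / b.rate
	return false, time.Duration(wait * float64(time.Second))
}

// at builds a deterministic timestamp ms milliseconds after a fixed epoch.
func at(ms int) time.Time {
	return time.Unix(0, 0).Add(time.Duration(ms) * time.Millisecond)
}

func main() {
	b := NewTokenBucket(2, 1, at(0)) // burst of 2, refill 1 token per second
	for _, ms := range []int{0, 0, 0, 1000} {
		ok, retry := b.Allow(at(ms))
		fmt.Printf("t=%dms allowed=%v retryAfter=%v\n", ms, ok, retry)
	}
}
```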
**3. Request/Response Transformation**
- **Dual Hooks:** Pre-processing for requests, post-processing for responses
- **Use Cases:** Data format conversion, field mapping, content filtering
- **Performance:** Streaming transformations for large payloads
- **Compatibility:** Provider-specific format adaptations
**4. Monitoring & Analytics**
- **Post-processing Hook:** Collect metrics and logs after request completion
- **Destinations:** Prometheus, DataDog, custom analytics systems
- **Data:** Request/response metadata, performance metrics, error tracking
- **Privacy:** Configurable data sanitization and filtering
### **Plugin Communication Patterns**
**Plugin-to-Plugin Communication:**
- **Shared Context:** Plugins can store data in request context for downstream plugins
- **Event System:** Plugins can emit events for other plugins to consume
- **Data Passing:** Structured data exchange between related plugins
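
The shared-context pattern can be sketched with the standard library's `context` package standing in for Bifrost's own context type. The key name and both hook functions are hypothetical.

```go
package main

import (
	"context"
	"fmt"
)

// ctxKey is an unexported type so keys from other packages cannot collide.
type ctxKey string

const userIDKey ctxKey = "user-id"

// newRequestContext stands in for the per-request context the gateway creates.
func newRequestContext() context.Context { return context.Background() }

// authPreHook stores the authenticated user for downstream plugins.
func authPreHook(ctx context.Context, apiKey string) context.Context {
	return context.WithValue(ctx, userIDKey, "user-for-"+apiKey)
}

// rateLimitPreHook reads the value written by authPreHook, if any.
func rateLimitPreHook(ctx context.Context) (userID string, ok bool) {
	userID, ok = ctx.Value(userIDKey).(string)
	return userID, ok
}

func main() {
	ctx := authPreHook(newRequestContext(), "key-123")
	if id, ok := rateLimitPreHook(ctx); ok {
		fmt.Println("rate limiting per user:", id)
	}
}
```

The downstream plugin checks the `ok` flag rather than assuming the value exists, which keeps it usable when the upstream plugin is not loaded.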
**Plugin-to-External Service Communication:**
- **HTTP Clients:** Built-in HTTP client pools for external API calls
- **Database Connections:** Connection pooling for database access
- **Message Queues:** Integration with message queue systems
- **Caching Systems:** Redis, Memcached integration for state storage
> **📖 Integration Examples:** [Plugin Development Guide →](../../enterprise/custom-plugins)
---
## Related Architecture Documentation
- **[Request Flow](./request-flow)** - Plugin execution in request processing pipeline
- **[Concurrency Model](./concurrency)** - Plugin concurrency and threading considerations
- **[Benchmarks](../../benchmarking/getting-started)** - Plugin performance characteristics and optimization
- **[MCP System](./mcp)** - Integration between plugins and MCP system

@@ -0,0 +1,527 @@
---
title: "Request Flow"
description: "Deep dive into Bifrost's request processing pipeline - from transport layer ingestion through provider execution to response delivery."
icon: "route"
---
## Stage 1: Transport Layer Processing
### **HTTP Transport Flow**
```mermaid
sequenceDiagram
participant Client
participant HTTPTransport
participant Router
participant Validation
Client->>HTTPTransport: POST /v1/chat/completions
HTTPTransport->>HTTPTransport: Parse Headers
HTTPTransport->>HTTPTransport: Extract Body
HTTPTransport->>Validation: Validate JSON Schema
Validation->>Router: BifrostRequest
Router-->>HTTPTransport: Processing Started
HTTPTransport-->>Client: HTTP 200 (async processing)
```
**Key Processing Steps:**
1. **Request Reception** - FastHTTP server receives request
2. **Header Processing** - Extract authentication, content-type, custom headers
3. **Body Parsing** - JSON unmarshaling with schema validation
4. **Request Transformation** - Convert to internal `BifrostRequest` schema
5. **Context Creation** - Build request context with metadata
**Performance Characteristics:**
- **Parsing Time:** ~2.1μs for typical requests
- **Validation Overhead:** ~400ns for schema checks
- **Memory Allocation:** Zero-copy where possible
### **Go SDK Flow**
```mermaid
sequenceDiagram
participant Application
participant SDK
participant Core
participant Validation
Application->>SDK: bifrost.ChatCompletion(req)
SDK->>SDK: Type Validation
SDK->>Core: Direct Function Call
Core->>Validation: Schema Validation
Validation-->>Core: Validated Request
Core-->>SDK: Processing Result
SDK-->>Application: Typed Response
```
**Advantages:**
- **Zero Serialization** - Direct Go struct passing
- **Type Safety** - Compile-time validation
- **Lower Latency** - No HTTP/JSON overhead
- **Memory Efficiency** - No intermediate allocations
---
## Stage 2: Request Routing & Load Balancing
### **Provider Selection Logic**
```mermaid
flowchart TD
Request[Incoming Request] --> ModelCheck{Model Available?}
ModelCheck -->|Yes| ProviderDirect[Use Specified Provider]
ModelCheck -->|No| ModelMapping[Model → Provider Mapping]
ProviderDirect --> KeyPool[API Key Pool]
ModelMapping --> KeyPool
KeyPool --> WeightedSelect[Weighted Random Selection]
WeightedSelect --> HealthCheck{Provider Healthy?}
HealthCheck -->|Yes| AssignWorker[Assign Worker]
HealthCheck -->|No| CircuitBreaker[Circuit Breaker]
CircuitBreaker --> FallbackCheck{Fallback Available?}
FallbackCheck -->|Yes| FallbackProvider[Try Fallback]
FallbackCheck -->|No| ErrorResponse[Return Error]
FallbackProvider --> KeyPool
```
**Key Selection Algorithm:**
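```go
// Weighted random key selection.
// Requires "math/rand"; keys and weights must have the same length.
type KeySelector struct {
	keys    []APIKey
	weights []float64
	total   float64 // sum of all weights
}

func (ks *KeySelector) SelectKey() *APIKey {
	if len(ks.keys) == 0 {
		return nil // nothing to select from
	}
	// Pick a point in [0, total) and walk the cumulative weights.
	r := rand.Float64() * ks.total
	cumulative := 0.0
	for i, weight := range ks.weights {
		cumulative += weight
		if r < cumulative {
			return &ks.keys[i]
		}
	}
	// Floating-point rounding can leave r just past the last bucket.
	return &ks.keys[len(ks.keys)-1]
}
```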
**Performance Metrics:**
- **Key Selection Time:** ~10ns (linear in the number of keys; effectively constant for small key pools)
- **Health Check Overhead:** ~50ns (cached results)
- **Fallback Decision:** ~25ns (configuration lookup)
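
The selector above walks the cumulative weights on every call, which is linear in the number of keys. For large key pools, one possible alternative is to precompute prefix sums once and binary-search them. The sketch below is not taken from Bifrost; `PrefixSelector` and its methods are hypothetical, and it assumes a non-empty key list with one positive weight per key.

```go
package main

import (
	"fmt"
	"math/rand"
	"sort"
)

// PrefixSelector precomputes cumulative weights for O(log n) selection.
type PrefixSelector struct {
	keys   []string
	prefix []float64 // prefix[i] = weights[0] + ... + weights[i]
}

// NewPrefixSelector assumes len(keys) == len(weights) and len(keys) > 0.
func NewPrefixSelector(keys []string, weights []float64) *PrefixSelector {
	prefix := make([]float64, len(weights))
	sum := 0.0
	for i, w := range weights {
		sum += w
		prefix[i] = sum
	}
	return &PrefixSelector{keys: keys, prefix: prefix}
}

// pick maps a point r in [0, total) to a key via binary search.
func (s *PrefixSelector) pick(r float64) string {
	i := sort.Search(len(s.prefix), func(i int) bool { return s.prefix[i] > r })
	if i == len(s.prefix) {
		i = len(s.prefix) - 1 // guard against rounding at the upper edge
	}
	return s.keys[i]
}

// Select draws a key with probability proportional to its weight.
func (s *PrefixSelector) Select() string {
	total := s.prefix[len(s.prefix)-1]
	return s.pick(rand.Float64() * total)
}

func main() {
	s := NewPrefixSelector([]string{"key-a", "key-b"}, []float64{1, 3})
	counts := map[string]int{}
	for i := 0; i < 10000; i++ {
		counts[s.Select()]++
	}
	fmt.Println(counts) // key-b should be drawn roughly three times as often
}
```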
---
## Stage 3: Plugin Pipeline Processing
### **Pre-Processing Hooks**
```mermaid
sequenceDiagram
participant Request
participant AuthPlugin
participant RateLimitPlugin
participant TransformPlugin
participant Core
Request->>AuthPlugin: ProcessRequest()
AuthPlugin->>AuthPlugin: Validate API Key
AuthPlugin->>RateLimitPlugin: Authorized Request
RateLimitPlugin->>RateLimitPlugin: Check Rate Limits
RateLimitPlugin->>TransformPlugin: Allowed Request
TransformPlugin->>TransformPlugin: Modify Request
TransformPlugin->>Core: Final Request
```
**Plugin Execution Model:**
```go
type PluginManager struct {
	plugins []Plugin
}

func (pm *PluginManager) ExecutePreHooks(
	ctx BifrostContext,
	req *BifrostRequest,
) (*BifrostRequest, *BifrostError) {
	for _, plugin := range pm.plugins {
		modifiedReq, err := plugin.ProcessRequest(ctx, req)
		if err != nil {
			return nil, err
		}
		req = modifiedReq
	}
	return req, nil
}
```
**Plugin Types & Performance:**
| Plugin Type | Processing Time | Memory Impact | Failure Mode |
| --------------------- | --------------- | ------------- | ---------------------- |
| **Authentication** | ~1-5μs | Minimal | Reject request |
| **Rate Limiting** | ~500ns | Cache-based | Throttle/reject |
| **Request Transform** | ~2-10μs | Copy-on-write | Continue with original |
| **Monitoring** | ~200ns | Append-only | Continue silently |
---
## Stage 4: MCP Tool Discovery & Integration
### **Tool Discovery Process**
```mermaid
flowchart TD
Request[Request with Model] --> MCPCheck{MCP Enabled?}
MCPCheck -->|No| SkipMCP[Skip MCP Processing]
MCPCheck -->|Yes| ClientLookup[MCP Client Lookup]
ClientLookup --> ToolFilter[Tool Filtering]
ToolFilter --> ToolInject[Inject Tools into Request]
ToolFilter --> IncludeCheck{Include Filter?}
ToolFilter --> ExcludeCheck{Exclude Filter?}
IncludeCheck -->|Yes| IncludeTools[Include Specified Tools]
IncludeCheck -->|No| AllTools[Include All Tools]
ExcludeCheck -->|Yes| RemoveTools[Remove Excluded Tools]
ExcludeCheck -->|No| KeepFiltered[Keep Filtered Tools]
IncludeTools --> ToolInject
AllTools --> ToolInject
RemoveTools --> ToolInject
KeepFiltered --> ToolInject
ToolInject --> EnhancedRequest[Request with Tools]
SkipMCP --> EnhancedRequest
```
**Tool Integration Algorithm:**
```go
func (mcpm *MCPManager) EnhanceRequest(
	ctx BifrostContext,
	req *BifrostChatRequest,
) (*BifrostChatRequest, error) {
	// Extract tool filtering from context
	includeClients := ctx.GetStringSlice("mcp-include-clients")
	includeTools := ctx.GetStringSlice("mcp-include-tools")

	// Get available tools
	availableTools := mcpm.getAvailableTools(includeClients)

	// Filter tools
	filteredTools := mcpm.filterTools(availableTools, includeTools)

	// Inject into request
	if req.Params == nil {
		req.Params = &ChatParameters{}
	}
	req.Params.Tools = append(req.Params.Tools, filteredTools...)
	return req, nil
}
```
**MCP Performance Impact:**
- **Tool Discovery:** ~100-500μs (cached after first request)
- **Tool Filtering:** ~50-200ns per tool
- **Request Enhancement:** ~1-5μs depending on tool count
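
The include/exclude filtering shown in the flowchart can be sketched on plain tool names. The semantics assumed here (an empty include list means "all tools", and exclusion is applied afterwards) are an interpretation of the diagram, and `filterTools` in this form is hypothetical rather than the actual `MCPManager` method.

```go
package main

import "fmt"

// filterTools keeps tools that pass the include filter (if any) and are not
// excluded. The order of the available list is preserved.
func filterTools(available, include, exclude []string) []string {
	toSet := func(items []string) map[string]bool {
		set := make(map[string]bool, len(items))
		for _, it := range items {
			set[it] = true
		}
		return set
	}
	includeSet, excludeSet := toSet(include), toSet(exclude)

	var kept []string
	for _, tool := range available {
		if len(include) > 0 && !includeSet[tool] {
			continue // an include filter is set and this tool is not on it
		}
		if excludeSet[tool] {
			continue
		}
		kept = append(kept, tool)
	}
	return kept
}

func main() {
	available := []string{"read_file", "write_file", "web_search"}
	fmt.Println(filterTools(available, nil, []string{"write_file"}))
	fmt.Println(filterTools(available, []string{"read_file", "write_file"}, []string{"write_file"}))
}
```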
---
## Stage 5: Memory Pool Management
### **Object Pool Lifecycle**
```mermaid
stateDiagram-v2
[*] --> PoolInit: System Startup
PoolInit --> Available: Objects Pre-allocated
Available --> Acquired: Request Processing
Acquired --> InUse: Object Populated
InUse --> Processing: Worker Processing
Processing --> Completed: Processing Done
Completed --> Reset: Object Cleanup
Reset --> Available: Return to Pool
Available --> Expansion: Pool Exhaustion
Expansion --> Available: New Objects Created
Reset --> GC: Pool Full
GC --> [*]: Garbage Collection
```
**Memory Pool Implementation:**
```go
type MemoryPools struct {
	channelPool  sync.Pool
	messagePool  sync.Pool
	responsePool sync.Pool
	bufferPool   sync.Pool
}

func (mp *MemoryPools) GetChannel() *ProcessingChannel {
	if ch := mp.channelPool.Get(); ch != nil {
		return ch.(*ProcessingChannel)
	}
	return NewProcessingChannel()
}

func (mp *MemoryPools) ReturnChannel(ch *ProcessingChannel) {
	ch.Reset() // Clear previous data
	mp.channelPool.Put(ch)
}
```
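
An equivalent way to handle an empty pool is to give `sync.Pool` a `New` function, so `Get` never returns nil. The sketch below pools `bytes.Buffer` values as a stand-in for Bifrost's internal types; `bufferPool` and `render` are names chosen for the example.

```go
package main

import (
	"bytes"
	"fmt"
	"sync"
)

// bufferPool creates a fresh buffer whenever the pool is empty.
var bufferPool = sync.Pool{
	New: func() any { return new(bytes.Buffer) },
}

// render borrows a buffer, uses it, and returns it to the pool in a clean state.
func render(name string) string {
	buf := bufferPool.Get().(*bytes.Buffer)
	defer func() {
		buf.Reset() // clear previous data before the buffer is reused
		bufferPool.Put(buf)
	}()

	buf.WriteString("hello, ")
	buf.WriteString(name)
	return buf.String() // String copies, so the result outlives the buffer
}

func main() {
	fmt.Println(render("bifrost"))
	fmt.Println(render("gateway"))
}
```

Resetting on return rather than on acquisition mirrors `ReturnChannel` above and means a pooled object never carries data from a previous request.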
---
## Stage 6: Worker Pool Processing
### **Worker Assignment & Execution**
```mermaid
sequenceDiagram
participant Queue
participant WorkerPool
participant Worker
participant Provider
participant Circuit
Queue->>WorkerPool: Enqueue Request
WorkerPool->>Worker: Assign Available Worker
Worker->>Circuit: Check Circuit Breaker
Circuit->>Provider: Forward Request
Provider-->>Circuit: Response/Error
Circuit->>Circuit: Update Health Metrics
Circuit-->>Worker: Provider Response
Worker-->>WorkerPool: Release Worker
WorkerPool-->>Queue: Request Completed
```
**Worker Pool Architecture:**
```go
type ProviderWorkerPool struct {
	workers chan *Worker
	queue   chan *ProcessingJob
	config  WorkerPoolConfig
	metrics *PoolMetrics
}

func (pwp *ProviderWorkerPool) ProcessRequest(job *ProcessingJob) {
	// Get worker from pool
	worker := <-pwp.workers

	go func() {
		defer func() {
			// Return worker to pool
			pwp.workers <- worker
		}()

		// Process request
		result := worker.Execute(job)
		job.ResultChan <- result
	}()
}
```
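
A common variation on this design starts a fixed set of long-lived worker goroutines that read from a bounded queue, and rejects new work when the queue is full instead of blocking the caller. The sketch below shows that variation with integer jobs; `Pool`, `Job`, `Submit`, and `ErrQueueFull` are hypothetical and not part of Bifrost's API.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// ErrQueueFull is returned when the bounded queue cannot accept more work.
var ErrQueueFull = errors.New("queue full")

type Job struct {
	Input  int
	Result chan int
}

type Pool struct {
	queue chan *Job
	wg    sync.WaitGroup
}

// NewPool starts a fixed number of workers that drain a bounded queue.
func NewPool(workers, queueSize int, handle func(int) int) *Pool {
	p := &Pool{queue: make(chan *Job, queueSize)}
	for i := 0; i < workers; i++ {
		p.wg.Add(1)
		go func() {
			defer p.wg.Done()
			for job := range p.queue {
				job.Result <- handle(job.Input)
			}
		}()
	}
	return p
}

// Submit never blocks: a full queue is reported to the caller (backpressure).
func (p *Pool) Submit(input int) (<-chan int, error) {
	job := &Job{Input: input, Result: make(chan int, 1)}
	select {
	case p.queue <- job:
		return job.Result, nil
	default:
		return nil, ErrQueueFull
	}
}

// Close stops accepting work and waits for queued jobs to finish.
func (p *Pool) Close() {
	close(p.queue)
	p.wg.Wait()
}

func main() {
	pool := NewPool(2, 4, func(n int) int { return n * n })
	defer pool.Close()

	res, err := pool.Submit(7)
	if err != nil {
		fmt.Println("rejected:", err)
		return
	}
	fmt.Println(<-res)
}
```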
---
## Stage 7: Provider API Communication
### **HTTP Request Execution**
```mermaid
sequenceDiagram
participant Worker
participant HTTPClient
participant Provider
participant CircuitBreaker
participant Metrics
Worker->>HTTPClient: PrepareRequest()
HTTPClient->>HTTPClient: Add Headers & Auth
HTTPClient->>CircuitBreaker: CheckHealth()
CircuitBreaker->>Provider: HTTP Request
Provider-->>CircuitBreaker: HTTP Response
CircuitBreaker->>Metrics: Record Metrics
CircuitBreaker-->>HTTPClient: Response/Error
HTTPClient-->>Worker: Parsed Response
```
**Request Preparation Pipeline:**
```go
func (w *ProviderWorker) ExecuteRequest(job *ProcessingJob) *ProviderResponse {
	// Prepare HTTP request
	httpReq := w.prepareHTTPRequest(job.Request)

	// Add authentication
	w.addAuthentication(httpReq, job.APIKey)

	// Execute with timeout
	ctx, cancel := context.WithTimeout(context.Background(), job.Timeout)
	defer cancel()

	httpResp, err := w.httpClient.Do(httpReq.WithContext(ctx))
	if err != nil {
		return w.handleError(err, job)
	}

	// Parse response
	return w.parseResponse(httpResp, job)
}
```
---
## Stage 8: Tool Execution & Response Processing
### **MCP Tool Execution Flow**
```mermaid
sequenceDiagram
participant Provider
participant MCPProcessor
participant MCPServer
participant ToolExecutor
participant ResponseBuilder
Provider->>MCPProcessor: Response with Tool Calls
MCPProcessor->>MCPProcessor: Extract Tool Calls
loop For each tool call
MCPProcessor->>MCPServer: Execute Tool
MCPServer->>ToolExecutor: Tool Invocation
ToolExecutor-->>MCPServer: Tool Result
MCPServer-->>MCPProcessor: Tool Response
end
MCPProcessor->>ResponseBuilder: Combine Results
ResponseBuilder-->>Provider: Enhanced Response
```
**Tool Execution Pipeline:**
```go
func (mcp *MCPProcessor) ProcessToolCalls(
	response *ProviderResponse,
) (*ProviderResponse, error) {
	toolCalls := mcp.extractToolCalls(response)
	if len(toolCalls) == 0 {
		return response, nil
	}

	// Execute tools concurrently; the buffered channel lets every goroutine
	// send its result without blocking.
	results := make(chan ToolResult, len(toolCalls))
	for _, toolCall := range toolCalls {
		go func(tc ToolCall) {
			result := mcp.executeTool(tc)
			results <- result
		}(toolCall)
	}

	// Collect results (they arrive in completion order, not call order)
	toolResults := make([]ToolResult, 0, len(toolCalls))
	for i := 0; i < len(toolCalls); i++ {
		toolResults = append(toolResults, <-results)
	}

	// Enhance response
	return mcp.enhanceResponse(response, toolResults), nil
}
```
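
When results need to line up with the original tool-call order, one option is to write each result into its own slice slot and wait on a `sync.WaitGroup`. This is a sketch with simplified, hypothetical `ToolCall` and `ToolResult` types, not the `MCPProcessor` implementation.

```go
package main

import (
	"fmt"
	"strings"
	"sync"
)

// ToolCall and ToolResult are simplified stand-ins for this sketch.
type ToolCall struct{ ID, Name string }
type ToolResult struct{ CallID, Output string }

// executeAll runs every call concurrently and returns results in call order.
func executeAll(calls []ToolCall, run func(ToolCall) ToolResult) []ToolResult {
	results := make([]ToolResult, len(calls))
	var wg sync.WaitGroup
	for i, call := range calls {
		wg.Add(1)
		go func(i int, call ToolCall) {
			defer wg.Done()
			results[i] = run(call) // each goroutine writes only its own slot
		}(i, call)
	}
	wg.Wait()
	return results
}

// upperTool is a toy tool that upper-cases the tool name.
func upperTool(call ToolCall) ToolResult {
	return ToolResult{CallID: call.ID, Output: strings.ToUpper(call.Name)}
}

func main() {
	calls := []ToolCall{{ID: "1", Name: "search"}, {ID: "2", Name: "fetch"}}
	for _, r := range executeAll(calls, upperTool) {
		fmt.Println(r.CallID, r.Output)
	}
}
```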
---
## Stage 9: Post-Processing & Response Formation
### **Plugin Post-Processing**
```mermaid
sequenceDiagram
participant CoreResponse
participant LoggingPlugin
participant CachePlugin
participant MetricsPlugin
participant Transport
CoreResponse->>LoggingPlugin: ProcessResponse()
LoggingPlugin->>LoggingPlugin: Log Request/Response
LoggingPlugin->>CachePlugin: Response + Logs
CachePlugin->>CachePlugin: Cache Response
CachePlugin->>MetricsPlugin: Cached Response
MetricsPlugin->>MetricsPlugin: Record Metrics
MetricsPlugin->>Transport: Final Response
```
**Response Enhancement Pipeline:**
```go
func (pm *PluginManager) ExecutePostHooks(
	ctx BifrostContext,
	req *BifrostRequest,
	resp *BifrostResponse,
) (*BifrostResponse, error) {
	// Post-hooks run in reverse registration order (last in, first out).
	for i := len(pm.plugins) - 1; i >= 0; i-- {
		plugin := pm.plugins[i]
		enhancedResp, err := plugin.ProcessResponse(ctx, req, resp)
		if err != nil {
			// Log error but continue processing
			pm.logger.Warn("Plugin post-processing error", "plugin", plugin.Name(), "error", err)
			continue
		}
		resp = enhancedResp
	}
	return resp, nil
}
```
### **Response Serialization**
```mermaid
flowchart TD
Response[BifrostResponse] --> Format{Response Format}
Format -->|HTTP| JSONSerialize[JSON Serialization]
Format -->|SDK| DirectReturn[Direct Go Struct]
JSONSerialize --> Compress[Compression]
DirectReturn --> TypeCheck[Type Validation]
Compress --> Headers[Set Headers]
TypeCheck --> Return[Return Response]
Headers --> HTTPResponse[HTTP Response]
HTTPResponse --> Client[Client Response]
Return --> Client
```
---
## Related Architecture Documentation
- **[Concurrency Model](./concurrency)** - Worker pools and threading details
- **[Plugin System](./plugins)** - Plugin execution and lifecycle
- **[MCP System](./mcp)** - Tool discovery and execution internals
- **[Benchmarks](../../benchmarking/getting-started)** - Detailed performance analysis

@@ -0,0 +1,161 @@
---
title: "Config Store"
description: "A persistent and flexible configuration management system for Bifrost, supporting multiple database backends."
icon: "gear"
---
The ConfigStore is a critical component of the Bifrost framework, providing a centralized and persistent storage solution for all gateway configurations. It abstracts the underlying database, offering a unified API for managing everything from provider settings and virtual keys to governance policies and plugin configurations.
## Core Features
- **Unified Configuration API**: A single interface (`ConfigStore`) for all configuration CRUD (Create, Read, Update, Delete) operations.
- **Multiple Backend Support**: Out-of-the-box support for SQLite and PostgreSQL, with an extensible architecture for adding new database backends.
- **Comprehensive Data Management**: Manages a wide range of configuration data, including:
- Provider and key settings
- Virtual keys and governance rules (budgets, rate limits)
- Customer and team information for multi-tenancy
- Plugin configurations
- Vector store and log store settings
- Model pricing information
- **Transactional Operations**: Ensures data consistency by supporting atomic transactions for complex configuration changes.
- **Database Migrations**: Integrated migration system to manage schema evolution across different versions of Bifrost.
- **Environment Variable Handling**: Securely manages sensitive data like API keys by storing references to environment variables instead of raw values.
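
The environment variable handling can be pictured as storing a reference string and resolving it only when the value is needed. The sketch below assumes references are written with an `env.` prefix; that prefix, `ResolveSecret`, and the injectable lookup function are assumptions for illustration rather than the ConfigStore's actual API.

```go
package main

import (
	"fmt"
	"os"
	"strings"
)

// envPrefix is the assumed marker for "read this value from the environment".
const envPrefix = "env."

// ResolveSecret returns literal values unchanged and resolves references
// such as "env.OPENAI_API_KEY" through lookup (for example os.LookupEnv).
func ResolveSecret(value string, lookup func(string) (string, bool)) (string, error) {
	if !strings.HasPrefix(value, envPrefix) {
		return value, nil // literal value, used as-is
	}
	name := strings.TrimPrefix(value, envPrefix)
	resolved, ok := lookup(name)
	if !ok {
		return "", fmt.Errorf("environment variable %q is not set", name)
	}
	return resolved, nil
}

func main() {
	key, err := ResolveSecret("env.OPENAI_API_KEY", os.LookupEnv)
	if err != nil {
		fmt.Println("cannot resolve key:", err)
		return
	}
	fmt.Println("resolved a key of length", len(key))
}
```

Only the reference string would be persisted, so the raw secret never needs to be written to the database.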
## Architecture
The ConfigStore is designed around the `ConfigStore` interface, which defines all the methods for interacting with the configuration data. The primary implementation is `RDBConfigStore`, which uses [GORM](https://gorm.io/) as an ORM to communicate with relational databases.
### Supported Backends
- **SQLite**: The default, file-based database, perfect for local development, testing, and single-node deployments. It requires no external services.
- **PostgreSQL**: A robust, production-grade database suitable for large-scale, high-availability deployments.
The backend is selected and configured in Bifrost's main configuration file.
### Initialization
The ConfigStore is initialized at startup based on the provided configuration.
```go
import (
	"context"

	"github.com/maximhq/bifrost/core/schemas"
	"github.com/maximhq/bifrost/framework/configstore"
)

// Example: Initialize a SQLite-based ConfigStore
config := &configstore.Config{
	Enabled: true,
	Type:    configstore.ConfigStoreTypeSQLite,
	Config: &configstore.SQLiteConfig{
		File: "/path/to/config.db",
	},
}

var logger schemas.Logger // Assume logger is initialized
store, err := configstore.NewConfigStore(context.Background(), config, logger)
if err != nil {
	// Handle error
}
```
Here is an example for initializing a PostgreSQL-based `ConfigStore`:
```go
// Example: Initialize a PostgreSQL-based ConfigStore
pgConfig := &configstore.Config{
	Enabled: true,
	Type:    configstore.ConfigStoreTypePostgres,
	Config: &configstore.PostgresConfig{
		Host:         "localhost",
		Port:         "5432",
		User:         "postgres",
		Password:     "secret",
		DBName:       "bifrost",
		SSLMode:      "disable",
		MaxIdleConns: 5,  // Optional: Maximum idle connections (default: 5)
		MaxOpenConns: 50, // Optional: Maximum open connections (default: 50)
	},
}

store, err = configstore.NewConfigStore(context.Background(), pgConfig, logger)
if err != nil {
	// Handle error
}
```
<Note>
PostgreSQL databases used by Bifrost stores must be UTF8 encoded. See [PostgreSQL UTF8 Requirement](../../quickstart/gateway/setting-up#postgresql-utf8-requirement).
</Note>
### Connection Pool Configuration
For PostgreSQL backends, you can configure the database connection pool to optimize performance based on your workload:
- **MaxIdleConns**: Maximum number of idle connections in the pool (default: 5)
- **MaxOpenConns**: Maximum number of open connections to the database (default: 50)
These parameters help manage database connection resources effectively. Increase them for high-traffic deployments or decrease them for resource-constrained environments.
## Data Models
The ConfigStore manages a variety of data models, which are defined as GORM tables in the `framework/configstore/tables` directory. Some of the key models include:
- `TableVirtualKey`: Represents a virtual key with its associated governance rules, keys, and metadata.
- `TableProvider` & `TableKey`: Store provider-specific configurations and the physical API keys.
- `TableBudget` & `TableRateLimit`: Define spending limits and request rate limits for governance.
- `TableCustomer` & `TableTeam`: Enable multi-tenant configurations.
- `TableModelPricing`: Caches model pricing information for cost calculation.
- `TablePlugin`: Stores configuration for loaded plugins.
## Usage
The `ConfigStore` interface provides a rich set of methods for managing Bifrost's configuration.
### Managing Virtual Keys
```go
// Create a new virtual key
newKey := &tables.TableVirtualKey{
	ID:   "vk-12345",
	Name: "My Test Key",
	// ... other fields
}
err := store.CreateVirtualKey(ctx, newKey)

// Retrieve a virtual key
virtualKey, err := store.GetVirtualKey(ctx, "vk-12345")
```
### Managing Providers
```go
// Get all provider configurations
providers, err := store.GetProvidersConfig(ctx)
// Update a specific provider
providerConfig := providers[schemas.OpenAI]
providerConfig.NetworkConfig.TimeoutSeconds = 120
err = store.UpdateProvider(ctx, schemas.OpenAI, providerConfig, envKeys)
```
### Executing Transactions
For operations that require multiple database writes, you can use a transaction to ensure atomicity.
```go
err := store.ExecuteTransaction(ctx, func(tx *gorm.DB) error {
	// Perform multiple operations within this transaction
	if err := store.CreateBudget(ctx, budget1, tx); err != nil {
		return err // Rollback
	}
	if err := store.UpdateRateLimit(ctx, limit1, tx); err != nil {
		return err // Rollback
	}
	return nil // Commit
})
```
## Migrations
The ConfigStore includes a migration system to handle database schema changes between Bifrost versions. Migrations are automatically applied at startup, ensuring the database schema is always up-to-date. This process is managed by the `migrator` package and is transparent to the user.
The ConfigStore is the backbone of Bifrost's dynamic configuration capabilities. Its support for multiple backends and transactional operations makes it suitable for both small-scale deployments and large-scale production environments.

@@ -0,0 +1,176 @@
---
title: "Log Store"
description: "A robust and queryable system for persisting API request and response logs, with support for multiple database backends."
icon: "clipboard-list"
---
The LogStore is a core component of the Bifrost framework responsible for capturing, storing, and retrieving detailed logs of API requests and responses. It provides a persistent, queryable audit trail of all activity passing through the gateway, which is essential for debugging, monitoring, analytics, and compliance.
## Core Features
- **Persistent Logging**: Automatically saves detailed information about each API request, including input, output, status, latency, and cost.
- **Multiple Backend Support**: Comes with built-in support for SQLite and PostgreSQL, allowing you to choose the best storage solution for your deployment needs.
- **Rich Querying and Filtering**: A powerful search API allows you to filter and sort logs based on a wide range of criteria such as provider, model, status, latency, cost, and content.
- **Performance Analytics**: The search functionality also provides aggregated statistics, including total requests, success rate, average latency, total tokens, and total cost for the queried data.
- **Structured Data Model**: Logs are stored in a structured format, with complex objects like message history and tool calls serialized as JSON for efficient storage and retrieval.
- **Automatic Data Management**: Includes GORM hooks to automatically handle JSON serialization/deserialization and to build a searchable content summary.
## Architecture
The LogStore is built around the `LogStore` interface, which defines the standard methods for interacting with the log database. The primary implementation, `RDBLogStore`, uses GORM to provide an abstraction over relational databases.
### Supported Backends
- **SQLite**: The default, file-based database, ideal for local development and smaller, single-node deployments.
- **PostgreSQL**: A production-ready database for scalable and high-availability deployments.
The backend is configured in Bifrost's main configuration file.
### Initialization
The LogStore is initialized at startup based on the provided configuration.
```go
import (
	"context"

	"github.com/maximhq/bifrost/core/schemas"
	"github.com/maximhq/bifrost/framework/logstore"
)

// Example: Initialize a SQLite-based LogStore
config := &logstore.Config{
	Enabled: true,
	Type:    logstore.LogStoreTypeSQLite,
	Config: &logstore.SQLiteConfig{
		File: "/path/to/logs.db",
	},
}

var logger schemas.Logger // Assume logger is initialized
store, err := logstore.NewLogStore(context.Background(), config, logger)
if err != nil {
	// Handle error
}
```
Here is an example for initializing a PostgreSQL-based `LogStore`:
```go
// Example: Initialize a PostgreSQL-based LogStore
pgConfig := &logstore.Config{
	Enabled: true,
	Type:    logstore.LogStoreTypePostgres,
	Config: &logstore.PostgresConfig{
		Host:         "localhost",
		Port:         "5432",
		User:         "postgres",
		Password:     "secret",
		DBName:       "bifrost_logs",
		SSLMode:      "disable",
		MaxIdleConns: 5,  // Optional: Maximum idle connections (default: 5)
		MaxOpenConns: 50, // Optional: Maximum open connections (default: 50)
	},
}

store, err = logstore.NewLogStore(context.Background(), pgConfig, logger)
if err != nil {
	// Handle error
}
```
<Note>
PostgreSQL databases used by Bifrost stores must be UTF8 encoded. See [PostgreSQL UTF8 Requirement](../../quickstart/gateway/setting-up#postgresql-utf8-requirement).
</Note>
### Connection Pool Configuration
For PostgreSQL backends, you can configure the database connection pool to optimize performance based on your workload:
- **MaxIdleConns**: Maximum number of idle connections in the pool (default: 5)
- **MaxOpenConns**: Maximum number of open connections to the database (default: 50)
These parameters help manage database connection resources effectively. Increase them for high-traffic deployments or decrease them for resource-constrained environments.
## Data Model
The core of the LogStore is the `Log` struct, which represents a single log entry in the `logs` table.
```go
// Log represents a complete log entry for a request/response cycle
type Log struct {
	ID        string    `gorm:"primaryKey;type:varchar(255)"`
	Timestamp time.Time `gorm:"index;not null"`
	Object    string    `gorm:"type:varchar(255);index;not null;column:object_type"`
	Provider  string    `gorm:"type:varchar(255);index;not null"`
	Model     string    `gorm:"type:varchar(255);index;not null"`
	Latency   *float64
	Cost      *float64 `gorm:"index"`
	Status    string   `gorm:"type:varchar(50);index;not null"` // "processing", "success", or "error"
	Stream    bool     `gorm:"default:false"`

	// Denormalized token fields for easier querying
	PromptTokens     int `gorm:"default:0"`
	CompletionTokens int `gorm:"default:0"`
	TotalTokens      int `gorm:"default:0"`

	// JSON serialized fields
	InputHistory  string `gorm:"type:text"`
	OutputMessage string `gorm:"type:text"`
	TokenUsage    string `gorm:"type:text"`
	ErrorDetails  string `gorm:"type:text"`
	// ... and many more for different data types
}
```
Complex data like message arrays and tool calls are serialized into JSON strings for storage and are automatically deserialized back into their struct forms when retrieved.
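
The round trip between a structured field and its JSON text column can be sketched with `encoding/json` alone. `Message`, `LogRow`, and the two methods are hypothetical; in a GORM model, equivalent logic would typically be called from the `BeforeSave` and `AfterFind` hooks, which is presumably what the automatic handling mentioned above relies on.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Message is a simplified chat message for this sketch.
type Message struct {
	Role    string `json:"role"`
	Content string `json:"content"`
}

// LogRow mimics a row with a JSON text column and a parsed in-memory view.
type LogRow struct {
	InputHistory       string    // JSON text as stored in the database
	InputHistoryParsed []Message // populated on read, not persisted
}

// Serialize fills the text column from the parsed view before saving.
func (r *LogRow) Serialize() error {
	data, err := json.Marshal(r.InputHistoryParsed)
	if err != nil {
		return err
	}
	r.InputHistory = string(data)
	return nil
}

// Deserialize rebuilds the parsed view after loading; an empty column is a no-op.
func (r *LogRow) Deserialize() error {
	if r.InputHistory == "" {
		return nil
	}
	return json.Unmarshal([]byte(r.InputHistory), &r.InputHistoryParsed)
}

func main() {
	row := &LogRow{InputHistoryParsed: []Message{{Role: "user", Content: "hi"}}}
	if err := row.Serialize(); err != nil {
		fmt.Println("serialize failed:", err)
		return
	}
	fmt.Println(row.InputHistory)

	loaded := &LogRow{InputHistory: row.InputHistory}
	if err := loaded.Deserialize(); err != nil {
		fmt.Println("deserialize failed:", err)
		return
	}
	fmt.Println(loaded.InputHistoryParsed[0].Content)
}
```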
## Usage
### Creating Log Entries
A log entry is created by populating a `Log` struct and passing it to the `Create` method. This is typically handled internally by Bifrost's logging plugins.
```go
logEntry := &logstore.Log{
ID: "req-xyz123",
Timestamp: time.Now(),
Provider: "openai",
Model: "gpt-4",
Status: "success",
// ... other fields
}
err := store.Create(ctx, logEntry)
```
### Searching and Filtering Logs
The `SearchLogs` method provides a powerful way to query logs with fine-grained filters and pagination.
```go
// Define search criteria
filters := logstore.SearchFilters{
Providers: []string{"openai", "anthropic"},
Status: []string{"error"},
StartTime: &startTime, // time.Time pointer
}
pagination := logstore.PaginationOptions{
Limit: 50,
Offset: 0,
SortBy: "timestamp",
Order: "desc",
}
// Execute the search
results, err := store.SearchLogs(ctx, filters, pagination)
if err != nil {
// Handle error
}
// Process the results
for _, log := range results.Logs {
fmt.Printf("Found log: %s\n", log.ID)
}
// Access aggregated stats
fmt.Printf("Total matching requests: %d\n", results.Stats.TotalRequests)
```
The LogStore is an indispensable tool for observability in Bifrost, providing the detailed audit trail needed to monitor, debug, and analyze AI application performance and behavior effectively.

View File

@@ -0,0 +1,412 @@
---
title: "Model Catalog"
description: "A centralized system for managing model information, pricing, and capabilities across all supported AI providers."
icon: "book-open"
---
The Model Catalog is a foundational component of Bifrost that provides a unified interface for managing AI models, including their pricing, capabilities, and availability. It serves as a centralized repository for all model-related information, enabling dynamic cost calculation, intelligent model routing, and efficient resource management.
<Info>
**Related Documentation**: The Model Catalog powers Bifrost's intelligent routing system. See [Provider Routing](/providers/provider-routing) for detailed examples of how governance and load balancing use the catalog to make routing decisions, including cross-provider scenarios and weighted routing via proxy providers.
</Info>
## Core Features
### **1. Automatic Pricing Synchronization**
The Model Catalog manages pricing data through a two-phase approach:
**Startup Behavior:**
- **With ConfigStore**: Downloads the pricing sheet from Maxim's datasheet URL, persists it to the config store, and then loads it into memory for fast lookups.
- **Without ConfigStore**: Downloads the pricing sheet directly into memory on every startup.
**Ongoing Synchronization:**
- When ConfigStore is available, an automatic sync occurs every 24 hours to keep pricing data current.
- All pricing data is cached in memory for O(1) lookup performance during cost calculations.
This keeps cost calculations within one sync interval of the published pricing sheet while lookups stay in memory.
### **2. Multi-Modal Cost Calculation**
The Model Catalog supports different pricing models for each AI operation type:
- **Text Operations**: Token-based pricing for chat completions, text completions, responses, and embeddings. Cache-read/cache-write pricing applies to chat/text/responses when providers surface prompt cache token details.
- **Audio Processing**: Character-based, token-based, and duration-based pricing for speech synthesis and transcription, with audio token detail breakdown. Speech responses populate `usage.input_chars` so speech can be billed by input characters in addition to tokens/duration.
- **Image Processing**: Per-image (`input_cost_per_image`/`output_cost_per_image`), per-pixel (`input_cost_per_pixel`/`output_cost_per_pixel`), or token-based pricing with text/image token breakdown.
- **Video Processing**: Token-based or duration-based pricing. Input can use prompt tokens or `input_cost_per_video_per_second`; output can use completion tokens or fall back to `output_cost_per_video_per_second` / `output_cost_per_second`.
- **Reranking**: Input/output token pricing with search query cost support.
- **Prompt Caching**: Separate rates for cache-read tokens (`cached_read_tokens`) and cache-creation tokens (`cached_write_tokens`), both surfaced under `prompt_tokens_details` (see [Prompt Cache Cost Calculation](#prompt-cache-cost-calculation)).
### **3. Model Information Management**
The Model Catalog maintains a pool of available models for each provider, populated from both pricing data and provider list models APIs. This enables:
- **Model Discovery**: Listing all available models for a given provider
- **Provider Discovery**: Finding all providers that support a specific model with intelligent cross-provider resolution (OpenRouter, Vertex, Groq, Bedrock)
- **Model Validation**: Checking if a model is allowed for a provider based on allowed models lists (supports provider-prefixed entries)
### **4. Intelligent Cache Cost Handling**
The Model Catalog integrates with semantic caching to provide accurate cost calculations:
- **Cache Hits**: Zero cost for direct cache hits, and embedding cost only for semantic matches.
- **Cache Misses**: Combined cost of the base model usage plus the embedding generation cost for cache storage.
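The hit and miss rules above reduce to a three-way switch. The sketch below is a minimal illustration; `cacheOutcome` and `cacheAwareCost` are hypothetical names, and the real logic lives inside `CalculateCost`.

```go
package main

import "fmt"

// cacheOutcome describes how the semantic cache handled a request.
type cacheOutcome int

const (
	cacheMiss cacheOutcome = iota
	directHit
	semanticHit
)

// cacheAwareCost applies the billing rules described above:
// direct hit -> free, semantic hit -> embedding only, miss -> base + embedding.
func cacheAwareCost(outcome cacheOutcome, baseCost, embeddingCost float64) float64 {
	switch outcome {
	case directHit:
		return 0
	case semanticHit:
		return embeddingCost
	default:
		return baseCost + embeddingCost
	}
}

func main() {
	base, embedding := 0.012, 0.0004 // made-up dollar amounts
	fmt.Println("direct hit:  ", cacheAwareCost(directHit, base, embedding))
	fmt.Println("semantic hit:", cacheAwareCost(semanticHit, base, embedding))
	fmt.Println("miss:        ", cacheAwareCost(cacheMiss, base, embedding))
}
```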
### **5. Tiered Pricing Support**
The system automatically applies different pricing rates for high-token contexts, reflecting real provider pricing models. Two tiers are supported: above 128k tokens and above 200k tokens, with the higher tier taking precedence when both are configured.
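Tier selection can be sketched as follows. The function name `tieredRate` is hypothetical, and two details are assumptions rather than documented behaviour: the threshold is compared against the request's total token count with a strict "greater than", and the selected rate applies to every token in the request.

```go
package main

import "fmt"

// tieredRate picks the per-token rate for a request with totalTokens tokens.
// A nil tier pointer means that tier is not configured for the model.
// The 200k tier is checked first, so it wins when both tiers are set.
func tieredRate(base float64, above128k, above200k *float64, totalTokens int) float64 {
	if above200k != nil && totalTokens > 200_000 {
		return *above200k
	}
	if above128k != nil && totalTokens > 128_000 {
		return *above128k
	}
	return base
}

func main() {
	mid, high := 4e-6, 6e-6 // made-up tier rates
	for _, n := range []int{50_000, 150_000, 250_000} {
		fmt.Printf("%d tokens -> %g per token\n", n, tieredRate(3e-6, &mid, &high, n))
	}
}
```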
## Configuration
The `ModelCatalog` can be configured during initialization by passing a `Config` struct.
```go
type Config struct {
PricingURL *string `json:"pricing_url,omitempty"`
PricingSyncInterval *time.Duration `json:"pricing_sync_interval,omitempty"`
}
```
- **`PricingURL`**: Overrides the default URL (`https://getbifrost.ai/datasheet`) for downloading the pricing sheet.
- **`PricingSyncInterval`**: Customizes the interval for periodic pricing data synchronization. The default is 24 hours.
This configuration is passed during the initialization of the `ModelCatalog`:
```go
// PricingURL is a *string, so take the address of a variable
pricingURL := "https://my-custom-url.com/pricing.json"
config := &modelcatalog.Config{
PricingURL: &pricingURL,
}
modelCatalog, err := modelcatalog.Init(context.Background(), config, configStore, logger)
```
## Architecture
### ModelCatalog
The `ModelCatalog` is the central component that handles all model and pricing operations:
```go
type ModelCatalog struct {
configStore configstore.ConfigStore
logger schemas.Logger
pricingURL string
pricingSyncInterval time.Duration
// In-memory cache for fast access
pricingData map[string]configstoreTables.TableModelPricing
mu sync.RWMutex
modelPool map[schemas.ModelProvider][]string
// Background sync worker
syncTicker *time.Ticker
done chan struct{}
wg sync.WaitGroup
syncCtx context.Context
syncCancel context.CancelFunc
}
```
### Pricing Data Structure
Each model's pricing information includes comprehensive cost metrics, supporting various modalities and tiered pricing:
```go
// PricingEntry represents a single model's pricing information.
// The fields below are an excerpt — see framework/modelcatalog/main.go for the full definition.
type PricingEntry struct {
BaseModel string `json:"base_model,omitempty"`
Provider string `json:"provider"`
Mode string `json:"mode"`
// Costs - Text
InputCostPerToken float64 `json:"input_cost_per_token"`
OutputCostPerToken float64 `json:"output_cost_per_token"`
InputCostPerTokenBatches *float64 `json:"input_cost_per_token_batches,omitempty"`
OutputCostPerTokenBatches *float64 `json:"output_cost_per_token_batches,omitempty"`
InputCostPerTokenPriority *float64 `json:"input_cost_per_token_priority,omitempty"`
OutputCostPerTokenPriority *float64 `json:"output_cost_per_token_priority,omitempty"`
InputCostPerTokenAbove200kTokens *float64 `json:"input_cost_per_token_above_200k_tokens,omitempty"`
OutputCostPerTokenAbove200kTokens *float64 `json:"output_cost_per_token_above_200k_tokens,omitempty"`
// Costs - Cache
CacheCreationInputTokenCost *float64 `json:"cache_creation_input_token_cost,omitempty"`
CacheReadInputTokenCost *float64 `json:"cache_read_input_token_cost,omitempty"`
CacheCreationInputTokenCostAbove200kTokens *float64 `json:"cache_creation_input_token_cost_above_200k_tokens,omitempty"`
CacheReadInputTokenCostAbove200kTokens *float64 `json:"cache_read_input_token_cost_above_200k_tokens,omitempty"`
CacheCreationInputTokenCostAbove1hr *float64 `json:"cache_creation_input_token_cost_above_1hr,omitempty"`
CacheCreationInputTokenCostAbove1hrAbove200kTokens *float64 `json:"cache_creation_input_token_cost_above_1hr_above_200k_tokens,omitempty"`
CacheCreationInputAudioTokenCost *float64 `json:"cache_creation_input_audio_token_cost,omitempty"`
CacheReadInputTokenCostPriority *float64 `json:"cache_read_input_token_cost_priority,omitempty"`
// Costs - Image
InputCostPerImage *float64 `json:"input_cost_per_image,omitempty"`
InputCostPerPixel *float64 `json:"input_cost_per_pixel,omitempty"`
OutputCostPerImage *float64 `json:"output_cost_per_image,omitempty"`
OutputCostPerPixel *float64 `json:"output_cost_per_pixel,omitempty"`
OutputCostPerImagePremiumImage *float64 `json:"output_cost_per_image_premium_image,omitempty"`
OutputCostPerImageAbove512x512Pixels *float64 `json:"output_cost_per_image_above_512_and_512_pixels,omitempty"`
OutputCostPerImageAbove512x512PixelsPremium *float64 `json:"output_cost_per_image_above_512_and_512_pixels_and_premium_image,omitempty"`
OutputCostPerImageAbove1024x1024Pixels *float64 `json:"output_cost_per_image_above_1024_and_1024_pixels,omitempty"`
OutputCostPerImageAbove1024x1024PixelsPremium *float64 `json:"output_cost_per_image_above_1024_and_1024_pixels_and_premium_image,omitempty"`
OutputCostPerImageAbove2048x2048Pixels *float64 `json:"output_cost_per_image_above_2048_and_2048_pixels,omitempty"`
OutputCostPerImageAbove4096x4096Pixels *float64 `json:"output_cost_per_image_above_4096_and_4096_pixels,omitempty"`
OutputCostPerImageLowQuality *float64 `json:"output_cost_per_image_low_quality,omitempty"`
OutputCostPerImageMediumQuality *float64 `json:"output_cost_per_image_medium_quality,omitempty"`
OutputCostPerImageHighQuality *float64 `json:"output_cost_per_image_high_quality,omitempty"`
OutputCostPerImageAutoQuality *float64 `json:"output_cost_per_image_auto_quality,omitempty"`
// Costs - Audio/Video
InputCostPerAudioToken *float64 `json:"input_cost_per_audio_token,omitempty"`
InputCostPerAudioPerSecond *float64 `json:"input_cost_per_audio_per_second,omitempty"`
InputCostPerSecond *float64 `json:"input_cost_per_second,omitempty"`
InputCostPerVideoPerSecond *float64 `json:"input_cost_per_video_per_second,omitempty"`
OutputCostPerAudioToken *float64 `json:"output_cost_per_audio_token,omitempty"`
OutputCostPerVideoPerSecond *float64 `json:"output_cost_per_video_per_second,omitempty"`
OutputCostPerSecond *float64 `json:"output_cost_per_second,omitempty"`
// Costs - Other
SearchContextCostPerQuery *float64 `json:"search_context_cost_per_query,omitempty"`
CodeInterpreterCostPerSession *float64 `json:"code_interpreter_cost_per_session,omitempty"`
}
```
## Usage in Plugins
The Model Catalog is designed to be shared across all Bifrost plugins, providing consistent model information and validation logic for governance, load balancing, and other routing mechanisms.
<Note>
**Governance & Load Balancing**: Both plugins delegate model validation to the Model Catalog's `IsModelAllowedForProvider` method, ensuring consistent handling of cross-provider scenarios and provider-prefixed allowed models. See [Provider Routing](/providers/provider-routing) for configuration examples.
</Note>
### Initialization
In Bifrost's gateway, the `ModelCatalog` is initialized once at startup and shared across all plugins:
```go
import "github.com/maximhq/bifrost/framework/modelcatalog"
// Initialize model catalog with config store and logger
modelCatalog, err := modelcatalog.Init(context.Background(), &modelcatalog.Config{}, configStore, logger)
if err != nil {
return fmt.Errorf("failed to initialize model catalog: %w", err)
}
```
### Basic Cost Calculation
Calculate costs from a Bifrost response:
```go
// Calculate cost for a completed request
cost := modelCatalog.CalculateCost(
result, // *schemas.BifrostResponse
nil, // *PricingLookupScopes (nil = no scoped overrides)
)
logger.Info("Request cost: $%.6f", cost)
```
### Unified Cost Calculation
`CalculateCost` is the single entry point for all cost calculations. It handles all request types, semantic cache billing, and tiered pricing automatically:
```go
// CalculateCost handles all cost scenarios including cache-aware pricing
cost := modelCatalog.CalculateCost(result, nil) // *schemas.BifrostResponse, *PricingLookupScopes
// Cache hits return 0 for direct hits, embedding cost for semantic matches
// Cache misses return base model cost + embedding generation cost
// Returns 0.0 if pricing data is not found (logs a debug message)
```
### Model Discovery
The `ModelCatalog` provides several methods to query for model and provider information.
#### Get Models for a Provider
Retrieve a list of all models supported by a specific provider.
```go
openaiModels := modelCatalog.GetModelsForProvider(schemas.OpenAI)
for _, model := range openaiModels {
logger.Info("Found OpenAI model: %s", model)
}
```
**Thread-safe**: Uses a read lock for concurrent access.
#### Get Providers for a Model
Find all providers that offer a specific model, including cross-provider resolution.
```go
gpt4Providers := modelCatalog.GetProvidersForModel("gpt-4o")
for _, provider := range gpt4Providers {
logger.Info("gpt-4o is available from: %s", provider)
}
// Result: [openai, azure, groq] (includes cross-provider mappings)
```
**Cross-Provider Resolution**:
This method implements intelligent cross-provider routing logic to discover all providers that can serve a model:
1. **Direct Match**: Checks each provider's model list in `modelPool` for the exact model name
2. **OpenRouter Format**: For models found in other providers, checks if `provider/model` exists in OpenRouter
- Example: `claude-3-5-sonnet` found in Anthropic → checks OpenRouter for `anthropic/claude-3-5-sonnet`
3. **Vertex Format**: Similar check for Vertex with `provider/model` format
4. **Groq OpenAI Compatibility**: For GPT models, checks if `openai/model` exists in Groq's catalog
5. **Bedrock Claude Models**: For Claude models, flexible matching against Bedrock's full ARN format
**Example**:
```go
providers := modelCatalog.GetProvidersForModel("claude-3-5-sonnet")
// Returns: [anthropic, vertex, bedrock, openrouter]
// Even though request was just "claude-3-5-sonnet" without provider prefix!
```
<Note>
This cross-provider logic powers Bifrost's intelligent routing capabilities. See [Provider Routing](/providers/provider-routing#the-model-catalog) for detailed examples of how this enables features like weighted routing via proxy providers.
</Note>
#### Check Model Allowance for Provider
Validate if a model is allowed for a specific provider based on an allowed models list. This method is used internally by governance and load balancing plugins.
```go
// ["*"] wildcard - uses catalog to determine support
isAllowed := modelCatalog.IsModelAllowedForProvider(
schemas.OpenRouter,
"gpt-4o",
schemas.WhiteList{"*"}, // wildcard = check catalog
)
// Returns: true (catalog knows OpenRouter supports openai/gpt-4o)
// Explicit allowedModels with provider prefix
isAllowed = modelCatalog.IsModelAllowedForProvider(
schemas.OpenRouter,
"gpt-4o",
schemas.WhiteList{"openai/gpt-4o", "anthropic/claude-3-5-sonnet"},
)
// Returns: true (strips "openai/" prefix and matches "gpt-4o")
// Explicit allowedModels without prefix
isAllowed = modelCatalog.IsModelAllowedForProvider(
schemas.OpenAI,
"gpt-4o",
schemas.WhiteList{"gpt-4o", "gpt-4o-mini"},
)
// Returns: true (direct match)
```
**Behavior**:
- **`["*"]` wildcard**: Delegates to `GetProvidersForModel` (includes cross-provider logic). This is the "allow all via catalog" mode.
- **Non-empty explicit list**: Checks for both direct matches and provider-prefixed entries:
  - Direct: `"gpt-4o"` matches `"gpt-4o"`
  - Prefixed: `"openai/gpt-4o"` matches a request for `"gpt-4o"` (prefix stripped)
- **Empty slice (`[]string{}` / empty `schemas.WhiteList`)**: Returns `false` (deny-all), mirroring the config deny-by-default semantics
<Note>
In `config.json` and the governance API, `allowed_models: []` (empty array) means **deny all models** (deny-by-default, v1.5.0+). The Go helper `IsModelAllowedForProvider` behaves the same way: an empty `allowedModels` slice also returns `false`. Use `["*"]` to allow all models validated through the catalog.
</Note>
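The allowance rules in this section (wildcard, explicit list, empty list) can be sketched in a few lines. `isModelAllowed` is a hypothetical helper, and the `catalogSupports` callback stands in for the catalog lookup behind the wildcard; the real method takes a provider and a `schemas.WhiteList`.

```go
package main

import (
	"fmt"
	"strings"
)

// isModelAllowed applies the rules described above to a list of allowed models.
func isModelAllowed(model string, allowed []string, catalogSupports func(string) bool) bool {
	// Empty list: deny by default.
	if len(allowed) == 0 {
		return false
	}
	// ["*"]: defer to the catalog.
	if len(allowed) == 1 && allowed[0] == "*" {
		return catalogSupports(model)
	}
	for _, entry := range allowed {
		// Direct match.
		if entry == model {
			return true
		}
		// Provider-prefixed match: strip everything up to the first "/".
		if i := strings.Index(entry, "/"); i >= 0 && entry[i+1:] == model {
			return true
		}
	}
	return false
}

func main() {
	inCatalog := func(m string) bool { return m == "gpt-4o" }

	fmt.Println(isModelAllowed("gpt-4o", []string{"*"}, inCatalog))
	fmt.Println(isModelAllowed("gpt-4o", []string{"openai/gpt-4o"}, inCatalog))
	fmt.Println(isModelAllowed("gpt-4o", []string{}, inCatalog))
}
```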
**Use Cases**:
- **Governance Routing**: Validate if a model request is allowed for a provider configuration
- **Load Balancing**: Filter providers based on allowed models before performance scoring
- **Virtual Key Validation**: Check if a model can be used with a specific virtual key's provider configs
<Tip>
This method is the central validation point for both governance and load balancing plugins, ensuring consistent model allowance logic across all routing mechanisms. It handles all edge cases including proxy providers (OpenRouter, Vertex) and provider-prefixed model entries.
</Tip>
#### Dynamically Add Models
You can dynamically add models to the catalog's pool from a `v1/models` compatible response structure. This is useful for providers that expose a model list endpoint.
```go
// response is *schemas.BifrostListModelsResponse
modelCatalog.AddModelDataToPool(response)
```
The Bifrost gateway does this automatically during initialization for every supported provider.
**When to use**:
- After fetching models from a provider's `/v1/models` endpoint
- When a new provider is dynamically added at runtime
- For testing with custom model lists
### Reloading Configuration
You can reload the pricing configuration at runtime if you need to change the pricing URL or sync interval.
```go
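// PricingSyncInterval is a *time.Duration, so take the address of a variable
syncInterval := 12 * time.Hour
newConfig := &modelcatalog.Config{
PricingSyncInterval: &syncInterval,
}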
err := modelCatalog.UpdateSyncConfig(ctx, newConfig)
```
## Error Handling and Fallbacks
The Model Catalog handles missing pricing data gracefully with intelligent fallbacks:
```go
// resolvePricing resolves the pricing entry for a model, trying deployment as fallback.
func (mc *ModelCatalog) resolvePricing(provider, model, deployment string, requestType schemas.RequestType) *configstoreTables.TableModelPricing {
pricing, exists := mc.getPricing(model, provider, requestType)
if exists {
return pricing
}
// If pricing not found for model, try the deployment name
if deployment != "" {
pricing, exists = mc.getPricing(deployment, provider, requestType)
if exists {
return pricing
}
}
return nil
}
// getPricing returns pricing information for a model (thread-safe).
// It implements a multi-step fallback chain:
// 1. Direct lookup by model + provider + mode
// 2. Gemini → Vertex provider fallback
// 3. Vertex "provider/model" prefix stripping
// 4. Bedrock "anthropic." prefix addition for Claude models
// 5. Responses → Chat mode fallback (at each step)
// 6. ImageEdit / ImageVariation → ImageGeneration mode fallback
func (mc *ModelCatalog) getPricing(model, provider string, requestType schemas.RequestType) (*configstoreTables.TableModelPricing, bool) {
mc.mu.RLock()
defer mc.mu.RUnlock()
mode := normalizeRequestType(requestType)
pricing, ok := mc.pricingData[makeKey(model, provider, mode)]
if ok {
return &pricing, true
}
// Provider-specific fallbacks (Gemini→Vertex, Vertex prefix strip, Bedrock anthropic. prefix)
// Each fallback also tries Responses→Chat mode if applicable
// ...
// Final fallback: Responses → Chat mode for any provider
if requestType == schemas.ResponsesRequest || requestType == schemas.ResponsesStreamRequest {
pricing, ok = mc.pricingData[makeKey(model, provider, normalizeRequestType(schemas.ChatCompletionRequest))]
if ok {
return &pricing, true
}
}
return nil, false
}
// When pricing is not found, CalculateCost returns 0.0 and logs a debug message.
// This ensures operations continue smoothly without billing failures.
```
## Cleanup and Lifecycle Management
Properly clean up resources when shutting down:
```go
// Cleanup model catalog resources
defer func() {
if err := modelCatalog.Cleanup(); err != nil {
logger.Error("Failed to cleanup model catalog: %v", err)
}
}()
```
## Thread Safety
All `ModelCatalog` operations are thread-safe, making it suitable for concurrent usage across multiple plugins and goroutines. The internal pricing data cache uses read-write mutexes for optimal performance during frequent lookups.
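The read-write mutex pattern mentioned here can be sketched as below. `priceCache` and its key format are hypothetical; the point is only that many readers share `RLock` while a sync briefly takes the exclusive lock to swap in fresh data.

```go
package main

import (
	"fmt"
	"sync"
)

// priceCache is a hypothetical in-memory cache guarded by a read-write mutex.
type priceCache struct {
	mu   sync.RWMutex
	data map[string]float64
}

// get takes the shared read lock, so concurrent lookups do not block each other.
func (c *priceCache) get(key string) (float64, bool) {
	c.mu.RLock()
	defer c.mu.RUnlock()
	v, ok := c.data[key]
	return v, ok
}

// replace swaps the whole map under the exclusive lock, as a periodic sync might.
func (c *priceCache) replace(next map[string]float64) {
	c.mu.Lock()
	defer c.mu.Unlock()
	c.data = next
}

func main() {
	const key = "gpt-4o|openai|chat" // illustrative key format
	c := &priceCache{data: map[string]float64{key: 2.5e-6}}

	var wg sync.WaitGroup
	for i := 0; i < 8; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			c.get(key) // concurrent readers
		}()
	}
	c.replace(map[string]float64{key: 3e-6}) // concurrent writer
	wg.Wait()

	v, ok := c.get(key)
	fmt.Println(v, ok)
}
```

Replacing the map in one step, rather than mutating it entry by entry, keeps the write lock held for a very short time.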
## Best Practices
1. **Shared Instance**: Use a single `ModelCatalog` instance across all plugins to avoid redundant data synchronization.
2. **Error Handling**: Always handle the case where `CalculateCost` returns 0.0 because pricing data for the model is missing.
3. **Logging**: Monitor pricing sync failures and missing model warnings in production.
4. **Cache Awareness**: Use `CalculateCost` which automatically handles cache hits/misses and embedding costs.
5. **Resource Cleanup**: Always call `Cleanup()` during application shutdown to prevent resource leaks.
The Model Catalog provides a robust, production-ready foundation for implementing billing, budgeting, and cost monitoring features in Bifrost plugins.

View File

@@ -0,0 +1,130 @@
---
title: "Streaming"
description: "Framework utility for aggregating and processing real-time stream chunks from AI providers"
icon: "water"
---
## Overview
The **Streaming** package (`framework/streaming`) is a core utility within Bifrost designed to handle real-time data streams from AI providers. It provides a robust and efficient mechanism for plugins like [Logging](/features/observability/default), [OTel](/features/observability/otel), and [Maxim](/features/observability/maxim) to process, aggregate, and format streaming responses for chat completions, transcriptions, and other real-time AI interactions.
```mermaid
sequenceDiagram
participant Plugin
participant BC as Bifrost Core
participant Accumulator
BC->>Plugin: PreLLMHook(StreamingRequest)
activate Plugin
Plugin->>Accumulator: CreateStreamAccumulator(requestID, timestamp)
activate Accumulator
Accumulator-->>Plugin: ack
deactivate Accumulator
Plugin-->>BC: return
deactivate Plugin
loop For each response chunk
BC->>Plugin: PostLLMHook(StreamChunk)
activate Plugin
Plugin->>Accumulator: ProcessStreamingResponse(StreamChunk)
activate Accumulator
alt Is NOT Final Chunk
Accumulator-->>Plugin: return {Type: Delta}
else Is Final Chunk
Accumulator->>Accumulator: buildCompleteResponse()
Accumulator-->>Plugin: return {Type: Final, CompleteData}
end
deactivate Accumulator
Plugin-->>BC: return
deactivate Plugin
end
```
The package's primary purpose is to hide the complexity of handling chunked data, so that plugins can work with complete, well-structured responses without implementing their own aggregation logic.
## How It Works
The streaming package uses an `Accumulator` to manage the lifecycle of a streaming operation. This process is designed to be highly efficient, using `sync.Pool` to reuse objects and minimize memory allocations.
1. **Initialization**: When a plugin that needs to process streams (like `logging` or `otel`) is initialized, it creates a new `streaming.Accumulator`.
2. **Stream Start**: In the `PreLLMHook` phase of a request, if the request is identified as a streaming type, the plugin calls `accumulator.CreateStreamAccumulator(requestID, timestamp)` to prepare a dedicated buffer for the incoming chunks of that request.
3. **Chunk Processing**: In the `PostLLMHook` phase, as each chunk of the streaming response arrives, the plugin passes it to `accumulator.ProcessStreamingResponse()`.
* For each `delta` chunk, the accumulator appends it to the buffer associated with the request ID.
* The accumulator handles different types of streams, including chat, audio, and transcriptions, using specialized logic to correctly piece together the data. For example, it accumulates text deltas, tool call argument deltas, and other parts of the message.
4. **Finalization**: When the final chunk of the stream is received (indicated by a `finish_reason` or other provider-specific signal), `ProcessStreamingResponse` performs the final assembly.
* It reconstructs the complete `ChatMessage` or other response object from all the stored chunks.
* It calculates total token usage, cost, and latency.
* It returns a `ProcessedStreamResponse` object with `StreamResponseTypeFinal` and the complete, structured `AccumulatedData`.
5. **Cleanup**: Once the final response is processed, the accumulator cleans up all buffered chunks for that request ID, returning them to the `sync.Pool` for reuse.
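The lifecycle above (create, append deltas, finalize, clean up) can be sketched with `sync.Map` and `sync.Pool`. This is a simplified illustration for text deltas only; the `accumulator`, `stream`, and `chunk` types and their methods are hypothetical and do not reproduce the package's real API.

```go
package main

import (
	"fmt"
	"strings"
	"sync"
)

// chunk holds one text delta.
type chunk struct{ Delta string }

// stream buffers the chunks of a single request.
type stream struct {
	mu     sync.Mutex
	chunks []*chunk
}

// accumulator tracks streams by request ID and recycles chunk objects.
type accumulator struct {
	streams sync.Map  // requestID -> *stream
	pool    sync.Pool // recycles *chunk
}

func newAccumulator() *accumulator {
	return &accumulator{pool: sync.Pool{New: func() any { return new(chunk) }}}
}

// create prepares a buffer for a request (stream start).
func (a *accumulator) create(requestID string) {
	a.streams.LoadOrStore(requestID, &stream{})
}

// process appends a delta. On the final chunk it returns the assembled text,
// returns the chunks to the pool, and drops the stream.
func (a *accumulator) process(requestID, delta string, final bool) (string, bool) {
	v, ok := a.streams.Load(requestID)
	if !ok {
		return "", false
	}
	s := v.(*stream)
	c := a.pool.Get().(*chunk)
	c.Delta = delta

	s.mu.Lock()
	defer s.mu.Unlock()
	s.chunks = append(s.chunks, c)
	if !final {
		return "", false
	}
	var b strings.Builder
	for _, buffered := range s.chunks {
		b.WriteString(buffered.Delta)
		buffered.Delta = ""
		a.pool.Put(buffered)
	}
	s.chunks = nil
	a.streams.Delete(requestID)
	return b.String(), true
}

func main() {
	acc := newAccumulator()
	acc.create("req-1")
	acc.process("req-1", "Hel", false)
	text, done := acc.process("req-1", "lo", true)
	fmt.Println(text, done)
}
```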
## Key Components
### `Accumulator`
The central component of the package. It is a thread-safe manager that:
- Tracks stream chunks for multiple concurrent requests using a `sync.Map`.
- Uses `sync.Pool` to recycle `*StreamChunk` objects, reducing garbage collection overhead.
- Provides methods to add chunks (`addChatStreamChunk`, `addAudioStreamChunk`, etc.).
- Includes a periodic cleanup worker to remove stale accumulators for incomplete or orphaned requests.
### `ProcessStreamingResponse`
This is the main entry point for plugins to process stream data. It inspects the response type and delegates to the appropriate handler:
- `processChatStreamingResponse`
- `processAudioStreamingResponse`
- `processTranscriptionStreamingResponse`
- `processResponsesStreamingResponse`
It returns a `ProcessedStreamResponse`, which indicates whether the chunk is a `delta` or the `final` aggregated response.
### Stream-Specific Builders
The package includes internal logic to correctly build complete messages from chunks. For example, `buildCompleteMessageFromChatStreamChunks` iterates through the collected `ChatStreamChunk` objects, appending content deltas and assembling tool calls into a final, coherent `schemas.ChatMessage`.
## Usage Example
The following snippet from the `logging` plugin shows how the `streaming` package is used in practice within a plugin's `PostLLMHook`.
```go
// In plugins/logging/main.go
func (p *LoggerPlugin) PostLLMHook(ctx *schemas.BifrostContext, result *schemas.BifrostResponse, bifrostErr *schemas.BifrostError) (*schemas.BifrostResponse, *schemas.BifrostError, error) {
// ... setup, get requestID ...
go func() {
// ...
if bifrost.IsStreamRequestType(requestType) {
p.logger.Debug("[logging] processing streaming response")
// 1. Pass the response chunk to the accumulator
streamResponse, err := p.accumulator.ProcessStreamingResponse(ctx, result, bifrostErr)
if err != nil {
p.logger.Error("failed to process streaming response: %v", err)
// 2. Check if this is the final, aggregated response
} else if streamResponse != nil && streamResponse.Type == streaming.StreamResponseTypeFinal {
// Prepare final log data
logMsg.Operation = LogOperationStreamUpdate
logMsg.StreamResponse = streamResponse
// 3. Update the log entry with the complete data
processingErr := retryOnNotFound(p.ctx, func() error {
return p.updateStreamingLogEntry(p.ctx, logMsg.RequestID, logMsg.SemanticCacheDebug, logMsg.StreamResponse, true)
})
// ... handle errors and callbacks ...
}
}
// ... handle non-streaming responses ...
}()
return result, bifrostErr, nil
}
```
This demonstrates how a plugin can remain agnostic to the details of stream aggregation and simply react to the final, complete data returned by the `streaming` package. This greatly simplifies plugin development and ensures consistent data handling across the framework.

View File

@@ -0,0 +1,185 @@
---
title: "Vector Store"
description: "Vector database implementations for semantic search, embeddings storage, and AI-powered features in Bifrost."
icon: "diagram-project"
---
## Overview
The VectorStore is a core component of Bifrost's framework package that provides a unified interface for vector database operations. It enables plugins to store embeddings, perform similarity searches, and build AI-powered features like semantic caching, content recommendations, and knowledge retrieval.
**Key Capabilities:**
- **Vector Similarity Search**: Find semantically similar content using embeddings
- **Namespace Management**: Organize data into separate collections with custom schemas
- **Flexible Filtering**: Query data with complex filters and pagination
- **Multiple Backends**: Support for Weaviate, Redis/Valkey-compatible, Qdrant, and Pinecone vector stores
- **High Performance**: Optimized for production workloads
- **Scalable Storage**: Handle millions of vectors with efficient indexing
## VectorStore Interface Usage
### Creating Namespaces
Create collections (namespaces) with custom schemas:
```go
// Define properties for your data
properties := map[string]vectorstore.VectorStoreProperties{
"content": {
DataType: vectorstore.VectorStorePropertyTypeString,
Description: "The main content text",
},
"category": {
DataType: vectorstore.VectorStorePropertyTypeString,
Description: "Content category",
},
"tags": {
DataType: vectorstore.VectorStorePropertyTypeStringArray,
Description: "Content tags",
},
}
// Create namespace
err := store.CreateNamespace(ctx, "my_content", 1536, properties)
if err != nil {
log.Fatal("Failed to create namespace:", err)
}
```
### Storing Data with Embeddings
Add data with vector embeddings for similarity search:
```go
// Your embedding data (typically from an embedding model)
embedding := []float32{0.1, 0.2, 0.3} // shortened for illustration; a real vector must match the namespace dimension (1536 above)
// Metadata associated with this vector
metadata := map[string]interface{}{
"content": "This is my content text",
"category": "documentation",
"tags": []string{"guide", "tutorial"},
}
// Store in vector database
err := store.Add(ctx, "my_content", "unique-id-123", embedding, metadata)
if err != nil {
log.Fatal("Failed to add data:", err)
}
```
### Similarity Search
Find similar content using vector similarity:
```go
// Query embedding (from user query)
queryEmbedding := []float32{0.15, 0.25, 0.35 /* ... remaining dimensions */}
// Optional filters
filters := []vectorstore.Query{
{
Field: "category",
Operator: vectorstore.QueryOperatorEqual,
Value: "documentation",
},
}
// Perform similarity search
results, err := store.GetNearest(
ctx,
"my_content", // namespace
queryEmbedding, // query vector
filters, // optional filters
[]string{"content", "category"}, // fields to return
0.7, // similarity threshold (0-1)
10, // limit
)
if err != nil {
log.Fatal("Failed to search:", err)
}
for _, result := range results {
fmt.Printf("Score: %.3f, Content: %s\n", *result.Score, result.Properties["content"])
}
```
### Data Retrieval and Management
Query and manage stored data:
```go
// Get specific item by ID
item, err := store.GetChunk(ctx, "my_content", "unique-id-123")
if err != nil {
log.Fatal("Failed to get item:", err)
}
// Get all items with filtering and pagination
allResults, cursor, err := store.GetAll(
ctx,
"my_content",
[]vectorstore.Query{
{Field: "category", Operator: vectorstore.QueryOperatorEqual, Value: "documentation"},
},
[]string{"content", "tags"}, // select fields
nil, // cursor for pagination
50, // limit
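)
if err != nil {
log.Fatal("Failed to list items:", err)
}
// Delete items
if err := store.Delete(ctx, "my_content", "unique-id-123"); err != nil {
log.Fatal("Failed to delete item:", err)
}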
```
## Supported Vector Stores
<CardGroup cols={2}>
<Card title="Weaviate" icon="database" href="/integrations/vector-databases/weaviate">
Production-ready vector database with gRPC support.
</Card>
<Card title="Redis / Valkey" icon="database" href="/integrations/vector-databases/redis">
High-performance in-memory vector store.
</Card>
<Card title="Qdrant" icon="database" href="/integrations/vector-databases/qdrant">
Rust-based vector search engine with advanced filtering.
</Card>
<Card title="Pinecone" icon="database" href="/integrations/vector-databases/pinecone">
Managed vector database with serverless options.
</Card>
</CardGroup>
---
## Use Cases
### [Semantic Caching](../../features/semantic-caching)
Build intelligent caching systems that understand query intent rather than just exact matches.
**Applications:**
- Customer support systems with FAQ matching
- Code completion and documentation search
- Content management with semantic deduplication
### Knowledge Base & Search
Create intelligent search systems that understand user queries contextually.
**Applications:**
- Document search and retrieval systems
- Product recommendation engines
- Research paper and knowledge discovery platforms
### Content Classification
Automatically categorize and tag content based on semantic similarity.
**Applications:**
- Email classification and routing
- Content moderation and filtering
- News article categorization and clustering
### Recommendation Systems
Build personalized recommendation engines using vector similarity.
**Applications:**
- Product recommendations based on user preferences
- Content suggestions for media platforms
- Similar document or article recommendations
## Related Documentation
| Topic | Documentation | Description |
|-------|---------------|-------------|
| **Framework Overview** | [What is Framework](./what-is-framework) | Understanding the framework package and VectorStore interface |
| **Semantic Caching** | [Semantic Caching](../../features/semantic-caching) | Using VectorStore for AI response caching |

View File

@@ -0,0 +1,49 @@
---
title: "What is framework?"
description: "Framework is Bifrost's shared storage and utilities SDK package that provides common database interfaces and logic for the plugin ecosystem."
icon: "play"
---
Framework serves as the foundation layer that enables plugins to implement consistent data management patterns without reinventing storage solutions.
## Installation
```bash
go get github.com/maximhq/bifrost/framework
```
## Purpose
The framework package was designed to solve a fundamental challenge in plugin development: providing standardized, reliable storage and utility interfaces that plugins can depend on. Instead of each plugin implementing its own database logic, configuration management, or logging systems, framework offers battle-tested, shared implementations.
## Core Components
### ConfigStore
A unified configuration persistence layer that provides consistent storage patterns for plugin settings, provider configurations, and system state. Plugins can leverage `ConfigStore` to manage their configuration data with built-in CRUD operations, transaction support, and schema management.
### LogStore
Standardized logging and audit trail capabilities that enable plugins to implement observability features. `LogStore` provides structured logging, search and filtering capabilities, pagination support, and automated data retention policies.
### VectorStore
Vector database operations designed for AI-powered plugins that need semantic capabilities. `VectorStore` handles embeddings management, similarity search operations, and namespace isolation, making it easy for plugins to add features like semantic caching, content search, and AI-powered recommendations.
### Pricing Module
Cost calculation and model pricing management tools that help plugins implement billing and usage tracking features. The pricing system supports multi-tier pricing models, real-time usage tracking, and dynamic pricing updates.
## Benefits for Plugin Developers
**Shared Logic**: Common patterns for configuration, logging, and data management are provided out-of-the-box, reducing development time and ensuring consistency across plugins.
**Standardized Interfaces**: All framework components use consistent APIs, making it easier for developers to work across different plugins and maintain code quality.
**Pluggable Architecture**: The interface-based design allows different storage backends to be used without changing plugin code, providing flexibility for different deployment scenarios.
**Transaction Support**: Built-in transaction management and error handling ensure data integrity and provide reliable rollback capabilities.
**Production Ready**: Framework components are battle-tested in production environments and include features like connection pooling, retry logic, and performance optimizations.
## Integration with Bifrost
Framework seamlessly integrates with the Bifrost ecosystem, providing the storage foundation that powers core features like provider management, request logging, semantic caching, and governance. When plugins use framework components, they automatically participate in Bifrost's unified data management strategy.
The framework package enables plugin developers to focus on their core business logic while relying on robust, shared infrastructure for all storage and utility needs.

View File

181
docs/baseUrlSwitcher.js Normal file
View File

@@ -0,0 +1,181 @@
/**
* Bifrost docs — Base URL persistence
*
* The OpenAPI spec exposes the gateway base URL as a server variable
* (`{baseUrl}`), so Mintlify's API Reference playground renders an
* editable input for it. This script:
*
* 1. Preloads that input from localStorage on every page load /
* SPA route change, so the user only has to type their URL once.
* 2. Persists any edit the user makes back to localStorage.
* 3. Rewrites every `<code>` block in the MDX docs that mentions the
* default `http://localhost:8080`, so curl/SDK examples on the
* regular doc pages also use the configured URL.
*
* Mintlify auto-injects any `.js` file in the docs root on every page,
* so no docs.json wiring is required.
*/
(function () {
if (typeof window === "undefined" || typeof document === "undefined") return;
if (window.__bifrostBaseUrlSwitcherLoaded) return;
window.__bifrostBaseUrlSwitcherLoaded = true;
var DEFAULT_URL = "http://localhost:8080";
var STORAGE_KEY = "bifrost_base_url";
// Per-element snapshot of original text-node values, keyed via a
// WeakMap so detached DOM nodes get GC'd cleanly.
var snapshots = new WeakMap();
function readStoredUrl() {
try {
var v = window.localStorage.getItem(STORAGE_KEY);
return v && v.trim() ? v.trim() : DEFAULT_URL;
} catch (e) {
return DEFAULT_URL;
}
}
function writeStoredUrl(url) {
try {
window.localStorage.setItem(STORAGE_KEY, url);
} catch (e) {
/* ignore quota / private mode */
}
}
function normalizeUrl(input) {
if (!input) return DEFAULT_URL;
var url = String(input).trim();
if (!url) return DEFAULT_URL;
if (!/^https?:\/\//i.test(url)) url = "http://" + url;
return url.replace(/\/+$/, "");
}
/**
* Snapshot every text node inside `el` and remember the original
* value, so subsequent URL changes can always rewrite from the
* canonical source. Returns the snapshot, or null if the block has
* no localhost reference (so we never visit it again).
*/
function snapshotTextNodes(el) {
var walker = document.createTreeWalker(el, NodeFilter.SHOW_TEXT, null);
var entries = [];
var hasMatch = false;
var node;
while ((node = walker.nextNode())) {
var text = node.nodeValue || "";
if (text.indexOf("localhost:8080") !== -1) hasMatch = true;
entries.push({ node: node, original: text });
}
return hasMatch ? entries : null;
}
/**
* Rewrite every `<code>` block that mentions the default localhost
* URL. We only touch text nodes (`nodeValue`) — never `innerHTML` —
* so there is no path where a string is reinterpreted as HTML.
*/
function rewriteCodeBlocks(currentUrl) {
var blocks = document.querySelectorAll("pre code, code");
var bareUrl = currentUrl.replace(/^https?:\/\//, "");
for (var i = 0; i < blocks.length; i++) {
var el = blocks[i];
var entries = snapshots.get(el);
if (entries === undefined) {
entries = snapshotTextNodes(el);
// Cache the result either way (null = "scanned, no match")
// so we don't re-walk this element on every observer tick.
snapshots.set(el, entries);
}
if (!entries) continue;
for (var j = 0; j < entries.length; j++) {
var entry = entries[j];
var next;
if (currentUrl === DEFAULT_URL) {
next = entry.original;
} else {
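next = entry.original.replace(
/(https?:\/\/)?localhost:8080/g,
function (match, scheme) {
// Single pass with a function replacer: a custom URL that itself
// contains "localhost:8080" is never substituted twice, and "$"
// in the URL is not treated as a replacement pattern.
return scheme ? currentUrl : bareUrl;
}
);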
}
if (entry.node.nodeValue !== next) entry.node.nodeValue = next;
}
}
}
// ---------- API Reference playground sync ----------
/**
* Mintlify renders the server-variable field with a stable id of
* `api-playground-input`, so we can scope directly to it instead of
* heuristically scanning every text input on the page.
*/
function findPlaygroundUrlInputs() {
var el = document.getElementById("api-playground-input");
if (!el || el.__bifrostPlaygroundBound) return [];
return [el];
}
function setNativeValue(el, value) {
// React overrides the input's value setter; bypass it so React's
// controlled state picks up the programmatic change.
var proto = Object.getPrototypeOf(el);
var descriptor = Object.getOwnPropertyDescriptor(proto, "value");
if (descriptor && descriptor.set) descriptor.set.call(el, value);
else el.value = value;
el.dispatchEvent(new Event("input", { bubbles: true }));
el.dispatchEvent(new Event("change", { bubbles: true }));
}
function syncPlaygroundInputs(state) {
var inputs = findPlaygroundUrlInputs();
for (var i = 0; i < inputs.length; i++) {
var el = inputs[i];
if (el.__bifrostPlaygroundBound) continue;
el.__bifrostPlaygroundBound = true;
// Persist on blur / change. Using `change` (not `input`) avoids
// fighting the user mid-keystroke.
el.addEventListener("change", function (e) {
var v = normalizeUrl(e.target.value);
state.currentUrl = v;
writeStoredUrl(v);
rewriteCodeBlocks(v);
});
// Preload from storage exactly once. After this, the input is
// user-owned — we never write to it again, otherwise typing would
// get clobbered by the next MutationObserver tick.
if (state.currentUrl !== DEFAULT_URL && el.value !== state.currentUrl) {
setNativeValue(el, state.currentUrl);
}
}
}
// ---------- Boot ----------
function boot() {
var state = { currentUrl: normalizeUrl(readStoredUrl()) };
rewriteCodeBlocks(state.currentUrl);
syncPlaygroundInputs(state);
// Mintlify is an SPA, so re-run on DOM mutations (coalesced to one run per animation frame).
var pending = false;
var observer = new MutationObserver(function () {
if (pending) return;
pending = true;
window.requestAnimationFrame(function () {
pending = false;
rewriteCodeBlocks(state.currentUrl);
syncPlaygroundInputs(state);
});
});
observer.observe(document.body, { childList: true, subtree: true });
}
if (document.readyState === "loading") {
document.addEventListener("DOMContentLoaded", boot);
} else {
boot();
}
})();

View File

@@ -0,0 +1,81 @@
---
title: "Getting Started"
description: "Introduction to Bifrost's performance capabilities and how to choose the right instance size for your workload."
icon: "rocket"
---
## Overview
Bifrost has been load-tested at a sustained **5,000 requests per second (RPS)** on two AWS EC2 instance types. The results below summarize how it behaves under that load.
**Key Performance Highlights:**
- **Perfect Success Rate**: 100% request success rate under high load
- **Minimal Overhead**: 11 µs of added latency per request on t3.xlarge, 59 µs on t3.medium
- **Efficient Queue Management**: Average queue wait of 1.67 µs on t3.xlarge (47.13 µs on t3.medium)
- **Fast Key Selection**: Weighted API key selection in 10 to 16 ns
---
## Test Environment Summary
Bifrost was benchmarked on two primary AWS EC2 instance configurations:
### **t3.medium (2 vCPUs, 4GB RAM)**
- **Buffer Size**: 15,000
- **Initial Pool Size**: 10,000
- **Use Case**: Cost-effective option for moderate workloads
### **t3.xlarge (4 vCPUs, 16GB RAM)**
- **Buffer Size**: 20,000
- **Initial Pool Size**: 15,000
- **Use Case**: High-performance option for demanding workloads
---
## Performance Comparison at a Glance
| Metric | t3.medium | t3.xlarge | Improvement |
|--------|-----------|-----------|-------------|
| **Success Rate @ 5k RPS** | 100% | 100% | No failed requests |
| **Bifrost Overhead** | 59 µs | 11 µs | **-81%** |
| **Average Latency** | 2.12s | 1.61s | **-24%** |
| **Queue Wait Time** | 47.13 µs | 1.67 µs | **-96%** |
| **JSON Marshaling** | 63.47 µs | 26.80 µs | **-58%** |
| **Response Parsing** | 11.30 ms | 2.11 ms | **-81%** |
| **Peak Memory Usage** | 1,312.79 MB | 3,340.44 MB | +154% |
> **Note**: t3.xlarge tests used significantly larger response payloads (~10 KB vs ~1 KB), yet still achieved better performance metrics.
<Note>
All benchmarks use mocked OpenAI calls. The mock latency and payload sizes are listed on the respective analysis pages.
</Note>
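The Improvement column in the comparison table is the relative change from the t3.medium value to the t3.xlarge value. As a small worked sketch (the `percentChange` helper is ours, not part of Bifrost), the figures can be recomputed from the table like this:

```go
package main

import "fmt"

// percentChange returns the relative change from baseline to candidate,
// in percent. Negative values mean the candidate is lower.
func percentChange(baseline, candidate float64) float64 {
	return (candidate - baseline) / baseline * 100
}

func main() {
	// Values copied from the comparison table (t3.medium, then t3.xlarge).
	rows := []struct {
		metric string
		medium float64
		xlarge float64
	}{
		{"Bifrost overhead (µs)", 59, 11},
		{"Average latency (s)", 2.12, 1.61},
		{"Queue wait time (µs)", 47.13, 1.67},
		{"JSON marshaling (µs)", 63.47, 26.80},
		{"Response parsing (ms)", 11.30, 2.11},
		{"Peak memory (MB)", 1312.79, 3340.44},
	}
	for _, r := range rows {
		fmt.Printf("%-24s %+.1f%%\n", r.metric, percentChange(r.medium, r.xlarge))
	}
}
```

Because the two instance types were tested with different response sizes, these percentages compare two test runs rather than isolating the effect of the hardware alone.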
---
## Configuration Flexibility
One of Bifrost's key strengths is its **configuration flexibility**. You can fine-tune the speed ↔ memory trade-off based on your specific requirements:
| Configuration Parameter | Effect |
|------------------------|--------|
| `initial_pool_size` | Higher values = faster performance, more memory usage |
| `buffer_size` & `concurrency` | Controls queue depth and max parallel workers (per provider) |
| `retry` & `timeout` | Tune aggressiveness for each provider to meet your SLOs |
**Configuration Philosophy:**
- **Higher settings** (like t3.xlarge profile) prioritize raw speed
- **Lower settings** (like t3.medium profile) optimize for memory efficiency
- **Custom tuning** lets you find the sweet spot for your specific workload
---
## Next Steps
### **Detailed Performance Analysis**
- **[t3.medium Performance](./t3.medium)** - Deep dive into cost-effective performance
- **[t3.xlarge Performance](./t3.xl)** - High-performance configuration analysis
### **Run Your Own Tests**
- **[Run Your Own Benchmarks](./run-your-own-benchmarks)** - Step-by-step guide to benchmark Bifrost in your environment
Ready to dive deeper? Choose your instance type above or learn how to run your own performance tests.

View File

@@ -0,0 +1,355 @@
---
title: "Run Your Own Benchmarks"
description: "Step-by-step guide to benchmark Bifrost in your own environment using the official benchmarking tool."
icon: "stopwatch"
---
## Overview
Want to see Bifrost's performance in your specific environment? The [**Bifrost Benchmarking Repository**](https://github.com/maximhq/bifrost-benchmarking) provides everything you need to conduct comprehensive performance tests tailored to your infrastructure and workload requirements.
**What You Can Test:**
- **Custom Instance Sizes** - Test on your preferred AWS/GCP/Azure instances
- **Your Workload Patterns** - Use your actual request/response sizes
- **Different Configurations** - Compare various Bifrost settings
- **Provider Comparisons** - Benchmark against other AI gateways
- **Load Scenarios** - Test burst loads, sustained traffic, and endurance
> **💡 Open Source**: The benchmarking tool is completely open source! Feel free to submit pull requests if you think anything is missing or could be improved.
---
## Prerequisites
Before running benchmarks, ensure you have:
- **Go 1.26.1+** installed on your testing machine
- **Bifrost instance** running and accessible
- **Target API providers** configured (OpenAI, Anthropic, etc.)
- **Network access** between benchmark tool and Bifrost
- **Sufficient resources** on the testing machine to generate load
---
## Quick Start
### **1. Clone the Repository**
```bash
git clone https://github.com/maximhq/bifrost-benchmarking.git
cd bifrost-benchmarking
```
### **2. Build the Benchmark Tool**
```bash
go build benchmark.go
```
This creates a `benchmark` executable (or `benchmark.exe` on Windows).
### **3. Run Your First Benchmark**
```bash
# Basic benchmark: 500 RPS for 10 seconds
./benchmark -provider bifrost -port 8080
# Custom benchmark: 1000 RPS for 30 seconds
./benchmark -provider bifrost -port 8080 -rate 1000 -duration 30 -output my_results.json
```
---
## Configuration Options
The benchmark tool offers extensive configuration through command-line flags:
### **Basic Configuration**
| Flag | Required | Description | Default |
|------|----------|-------------|---------|
| `-provider <name>` | ✅ | Provider name (e.g., `bifrost`, `litellm`) | None |
| `-port <number>` | ✅ | Port number of your Bifrost instance | None |
| `-endpoint <path>` | ❌ | API endpoint path | `v1/chat/completions` |
| `-rate <number>` | ❌ | Requests per second | `500` |
| `-duration <seconds>` | ❌ | Test duration in seconds | `10` |
| `-output <filename>` | ❌ | Results output file | `results.json` |
### **Advanced Configuration**
| Flag | Description | Default |
|------|-------------|---------|
| `-include-provider-in-request` | Include provider name in request payload | `false` |
| `-big-payload` | Use larger, more complex request payloads | `false` |
---
## Benchmark Scenarios
### **1. Basic Performance Test**
Test standard performance with typical request sizes:
```bash
./benchmark -provider bifrost -port 8080 -rate 1000 -duration 60 -output basic_test.json
```
**Use Case**: General performance validation
### **2. High-Load Stress Test**
Push your instance to its limits:
```bash
./benchmark -provider bifrost -port 8080 -rate 5000 -duration 120 -output stress_test.json
```
**Use Case**: Capacity planning and SLA validation
### **3. Large Payload Test**
Test with bigger request/response sizes:
```bash
./benchmark -provider bifrost -port 8080 -rate 500 -duration 60 -big-payload=true -output large_payload.json
```
**Use Case**: Document processing, code generation workloads
### **4. Endurance Test**
Long-running stability test:
```bash
./benchmark -provider bifrost -port 8080 -rate 1000 -duration 1800 -output endurance_test.json
```
**Use Case**: Production readiness validation (30-minute test)
### **5. Comparative Benchmarking**
Compare Bifrost against other providers:
```bash
# Test Bifrost
./benchmark -provider bifrost -port 8080 -rate 1000 -duration 60 -output bifrost_results.json
# Test LiteLLM
./benchmark -provider litellm -port 8000 -rate 1000 -duration 60 -output litellm_results.json
# Test direct OpenAI (if available)
./benchmark -provider openai -port 443 -endpoint chat/completions -rate 1000 -duration 60 -output openai_results.json
```
---
## Understanding Results
The benchmark tool generates detailed JSON results with comprehensive metrics:
### **Key Metrics Explained**
```json
{
"bifrost": {
"request_counts": {
"total_sent": 30000,
"successful": 30000,
"failed": 0
},
"success_rate": 100.0,
"latency_metrics": {
"mean_ms": 245.5,
"p50_ms": 230.2,
"p99_ms": 520.8,
"max_ms": 845.3
},
"throughput_rps": 5000.0,
"memory_usage": {
"before_mb": 512.5,
"after_mb": 1312.8,
"peak_mb": 1405.2,
"average_mb": 1156.7
},
"timestamp": "2025-01-14T10:30:00Z",
"status_codes": {
"200": 30000
}
}
}
```
### **Critical Performance Indicators**
**Success Rate:**
- **Target**: >99.9% for production readiness
- **Excellent**: 100% (perfect reliability)
**Latency Metrics:**
- **P50 (Median)**: Typical user experience
- **P99**: Tail latency; 99% of requests complete at least this fast (the true worst case is `max_ms`)
- **Mean**: Overall average performance
**Memory Usage:**
- **Peak**: Maximum memory consumption
- **Average**: Sustained memory usage
- **After - Before**: Memory growth during test
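
The indicators above can be derived directly from the results file. The sketch below assumes the tool's output matches the sample JSON shown earlier (a map keyed by provider name); the struct and function names are ours, and only a subset of fields is decoded.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// providerResult mirrors a subset of the sample output shown above.
type providerResult struct {
	RequestCounts struct {
		TotalSent  int `json:"total_sent"`
		Successful int `json:"successful"`
	} `json:"request_counts"`
	LatencyMetrics struct {
		P50Ms float64 `json:"p50_ms"`
		P99Ms float64 `json:"p99_ms"`
	} `json:"latency_metrics"`
	MemoryUsage struct {
		BeforeMB float64 `json:"before_mb"`
		AfterMB  float64 `json:"after_mb"`
		PeakMB   float64 `json:"peak_mb"`
	} `json:"memory_usage"`
}

// summary holds the indicators derived from one provider's results.
type summary struct {
	SuccessRate    float64 // percent of sent requests that succeeded
	MemoryGrowthMB float64 // after_mb minus before_mb
	P99Ms          float64
}

// summarize parses a results document keyed by provider name.
func summarize(data []byte) (map[string]summary, error) {
	var results map[string]providerResult
	if err := json.Unmarshal(data, &results); err != nil {
		return nil, fmt.Errorf("parse results: %w", err)
	}
	out := make(map[string]summary, len(results))
	for name, r := range results {
		s := summary{
			MemoryGrowthMB: r.MemoryUsage.AfterMB - r.MemoryUsage.BeforeMB,
			P99Ms:          r.LatencyMetrics.P99Ms,
		}
		if r.RequestCounts.TotalSent > 0 {
			s.SuccessRate = float64(r.RequestCounts.Successful) /
				float64(r.RequestCounts.TotalSent) * 100
		}
		out[name] = s
	}
	return out, nil
}

// sample is a trimmed copy of the example output from this page.
const sample = `{
  "bifrost": {
    "request_counts": {"total_sent": 30000, "successful": 30000, "failed": 0},
    "latency_metrics": {"mean_ms": 245.5, "p50_ms": 230.2, "p99_ms": 520.8},
    "memory_usage": {"before_mb": 512.5, "after_mb": 1312.8, "peak_mb": 1405.2}
  }
}`

func main() {
	summaries, err := summarize([]byte(sample))
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	for name, s := range summaries {
		fmt.Printf("%s: success %.2f%%, p99 %.1f ms, memory growth %.1f MB\n",
			name, s.SuccessRate, s.P99Ms, s.MemoryGrowthMB)
	}
}
```

To run it against a real benchmark, replace `sample` with the contents of your output file (for example via `os.ReadFile("results.json")`).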
---
## Instance Sizing Recommendations
Based on your benchmark results, use these guidelines for production sizing:
### **Resource Planning Matrix**
| Target RPS | Memory Usage | Recommended Instance | Notes |
|------------|--------------|---------------------|--------|
| **< 1,000** | < 1GB | t3.small | Cost-effective for light loads |
| **1,000 - 3,000** | 1-2GB | t3.medium | Balanced performance/cost |
| **3,000 - 5,000** | 2-4GB | t3.large | High-performance production |
| **5,000+** | 3-6GB | t3.xlarge+ | Enterprise/mission-critical |
### **Configuration Tuning Based on Results**
**If seeing high latency:**
- Increase `initial_pool_size`
- Increase `buffer_size`
- Consider larger instance
**If memory usage is high:**
- Decrease `initial_pool_size`
- Decrease `buffer_size`
- Monitor for memory leaks
**If success rate < 100%:**
- Reduce request rate
- Increase timeout settings
- Check provider limits
---
## Advanced Testing Scenarios
### **Burst Load Testing**
Simulate traffic spikes:
```bash
# Normal load
./benchmark -provider bifrost -port 8080 -rate 1000 -duration 300 -output normal_load.json
# Burst load (simulate 5x spike)
./benchmark -provider bifrost -port 8080 -rate 5000 -duration 60 -output burst_load.json
```
### **Multi-Instance Testing**
Test horizontal scaling:
```bash
# Instance 1
./benchmark -provider bifrost-1 -port 8080 -rate 2500 -duration 120 -output instance_1.json &
# Instance 2
./benchmark -provider bifrost-2 -port 8081 -rate 2500 -duration 120 -output instance_2.json &
# Wait for both to complete
wait
```
### **Different Payload Sizes**
Compare performance across payload sizes:
```bash
# Small payloads (default)
./benchmark -provider bifrost -port 8080 -rate 1000 -duration 60 -output small_payload.json
# Large payloads
./benchmark -provider bifrost -port 8080 -rate 1000 -duration 60 -big-payload=true -output large_payload.json
```
---
## Continuous Benchmarking
### **Automated Testing Pipeline**
Set up regular performance regression testing:
```bash
#!/bin/bash
# daily_benchmark.sh
DATE=$(date +%Y%m%d_%H%M%S)
OUTPUT_DIR="benchmarks/$DATE"
mkdir -p "$OUTPUT_DIR"
# Run standard benchmarks
./benchmark -provider bifrost -port 8080 -rate 1000 -duration 300 -output "$OUTPUT_DIR/standard.json"
./benchmark -provider bifrost -port 8080 -rate 3000 -duration 180 -output "$OUTPUT_DIR/high_load.json"
./benchmark -provider bifrost -port 8080 -rate 500 -duration 600 -big-payload=true -output "$OUTPUT_DIR/large_payload.json"
echo "Benchmarks completed: $OUTPUT_DIR"
```
### **Performance Monitoring Integration**
Monitor key metrics over time:
- **Success rate trends**
- **Latency percentile changes**
- **Memory usage patterns**
- **Throughput capacity**
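
One way to track these metrics over time is to compare each new run against a stored baseline and flag anything that got worse. The sketch below is illustrative only: the `metrics` struct, the `regressions` function, and the 10% tolerance are assumptions, not part of the benchmarking tool.

```go
package main

import "fmt"

// metrics is the subset of a benchmark run tracked over time.
type metrics struct {
	SuccessRate   float64 // percent
	P99Ms         float64
	PeakMemoryMB  float64
	ThroughputRPS float64
}

// regressions lists the metrics in current that are worse than baseline by
// more than tolerance (a fraction, e.g. 0.10 for 10%). Any drop in success
// rate is reported regardless of tolerance.
func regressions(baseline, current metrics, tolerance float64) []string {
	var out []string
	if current.SuccessRate < baseline.SuccessRate {
		out = append(out, fmt.Sprintf("success rate fell from %.2f%% to %.2f%%",
			baseline.SuccessRate, current.SuccessRate))
	}
	if current.P99Ms > baseline.P99Ms*(1+tolerance) {
		out = append(out, fmt.Sprintf("p99 latency rose from %.1f ms to %.1f ms",
			baseline.P99Ms, current.P99Ms))
	}
	if current.PeakMemoryMB > baseline.PeakMemoryMB*(1+tolerance) {
		out = append(out, fmt.Sprintf("peak memory rose from %.1f MB to %.1f MB",
			baseline.PeakMemoryMB, current.PeakMemoryMB))
	}
	if current.ThroughputRPS < baseline.ThroughputRPS*(1-tolerance) {
		out = append(out, fmt.Sprintf("throughput fell from %.0f RPS to %.0f RPS",
			baseline.ThroughputRPS, current.ThroughputRPS))
	}
	return out
}

func main() {
	baseline := metrics{SuccessRate: 100, P99Ms: 520.8, PeakMemoryMB: 1405.2, ThroughputRPS: 5000}
	current := metrics{SuccessRate: 100, P99Ms: 600.0, PeakMemoryMB: 1410.0, ThroughputRPS: 4990}
	for _, r := range regressions(baseline, current, 0.10) {
		fmt.Println("regression:", r)
	}
}
```

A check like this could run at the end of a scheduled benchmark script and fail the job when the returned list is non-empty.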
---
## Troubleshooting
### **Common Issues**
**Connection Refused:**
```bash
# Check if Bifrost is running
curl http://localhost:8080/health
# Verify port configuration
netstat -an | grep 8080
```
- Check that `PORT` is defined in the `.env` file at the repository root.
**High Error Rates:**
- Check provider API key limits
- Verify Bifrost configuration
- Monitor upstream provider status
- Reduce request rate for baseline test
**Memory Issues:**
- Monitor system resources during testing
- Check for memory leaks in long tests
- Adjust Bifrost pool sizes
**Inconsistent Results:**
- Run multiple test iterations
- Account for network variability
- Use longer test durations (60+ seconds)
- Isolate testing environment
- Route gateway requests to a mock provider to rule out upstream variability
---
## Next Steps
### **After Running Benchmarks**
1. **Analyze Results**: Compare against [official benchmarks](./getting-started)
2. **Optimize Configuration**: Tune based on your specific results
3. **Plan Capacity**: Size instances based on measured performance
4. **Set Up Monitoring**: Track key metrics in production
### **Compare Results**
- **[t3.medium Performance](./t3.medium)** - Compare against medium instance results
- **[t3.xlarge Performance](./t3.xl)** - Compare against high-performance configuration
**Ready to benchmark? Clone the [repository](https://github.com/maximhq/bifrost-benchmarking) and start testing!**

View File

@@ -0,0 +1,127 @@
---
title: "t3.medium"
description: "Detailed performance metrics and analysis for Bifrost running on AWS t3.medium instances (2 vCPUs, 4GB RAM)."
icon: "server"
---
## Instance Configuration
**AWS t3.medium Specifications:**
- **vCPUs**: 2
- **Memory**: 4GB RAM
- **Network Performance**: Up to 5 Gigabit
**Bifrost Configuration:**
- **Buffer Size**: 15,000
- **Initial Pool Size**: 10,000
- **Test Load**: 5,000 requests per second (RPS)
---
## Performance Results
### **Overall Performance Metrics**
| Metric | Value | Notes |
|--------|-------|--------|
| **Success Rate** | 100.00% | Perfect reliability under high load |
| **Average Request Size** | 0.13 KB | Lightweight request payload |
| **Average Response Size** | 1.37 KB | Standard response size for testing |
| **Average Latency** | 2.12s | Total end-to-end response time |
| **Peak Memory Usage** | 1,312.79 MB | ~33% of available 4GB RAM |
### **Detailed Performance Breakdown**
| Operation | Latency | Performance Notes |
|-----------|---------|-------------------|
| **Queue Wait Time** | 47.13 µs | Time waiting in Bifrost's internal queue |
| **Key Selection Time** | 16 ns | Weighted API key selection |
| **Message Formatting** | 2.19 µs | Request message preparation |
| **Params Preparation** | 436 ns | Parameter processing |
| **Request Body Preparation** | 2.65 µs | HTTP request body assembly |
| **JSON Marshaling** | 63.47 µs | JSON serialization time |
| **Request Setup** | 6.59 µs | HTTP client configuration |
| **HTTP Request** | 1.56s | Actual provider API call time |
| **Error Handling** | 189 ns | Error processing overhead |
| **Response Parsing** | 11.30 ms | JSON response deserialization |
**Bifrost's Total Overhead: 59 µs***
*\*Excludes JSON marshaling, response parsing, and HTTP calls, which are required in any implementation*
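
The 59 µs total can be reproduced by summing the rows of the breakdown table that count as Bifrost's own work. This is a sketch under one assumption: besides JSON marshaling and the HTTP request, response parsing is treated as excluded as well, since its 11.30 ms could not fit in a 59 µs total. The `step` type and `totalOverhead` helper are ours.

```go
package main

import (
	"fmt"
	"time"
)

// step is one row of the breakdown table.
type step struct {
	name     string
	latency  time.Duration
	overhead bool // true if counted toward Bifrost's own overhead
}

// t3Medium holds the t3.medium values from the table above.
var t3Medium = []step{
	{"Queue Wait Time", 47130 * time.Nanosecond, true},
	{"Key Selection Time", 16 * time.Nanosecond, true},
	{"Message Formatting", 2190 * time.Nanosecond, true},
	{"Params Preparation", 436 * time.Nanosecond, true},
	{"Request Body Preparation", 2650 * time.Nanosecond, true},
	{"JSON Marshaling", 63470 * time.Nanosecond, false},
	{"Request Setup", 6590 * time.Nanosecond, true},
	{"HTTP Request", 1560 * time.Millisecond, false},
	{"Error Handling", 189 * time.Nanosecond, true},
	{"Response Parsing", 11300 * time.Microsecond, false},
}

// totalOverhead sums the latency of every step marked as overhead.
func totalOverhead(steps []step) time.Duration {
	var total time.Duration
	for _, s := range steps {
		if s.overhead {
			total += s.latency
		}
	}
	return total
}

func main() {
	total := totalOverhead(t3Medium)
	fmt.Printf("Bifrost overhead: %.1f µs\n", float64(total)/float64(time.Microsecond))
}
```

The included rows add up to 59,201 ns, which matches the reported 59 µs after rounding.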
---
## Performance Analysis
### **Strengths on t3.medium**
1. **Perfect Reliability**: 100% success rate even at 5,000 RPS
2. **Memory Efficiency**: Uses only 33% of available RAM (1,312.79 MB / 4GB)
3. **Minimal Overhead**: Just 59 µs of added latency per request
4. **Fast Operations**: Most internal operations complete in a few microseconds or less; key selection, params preparation, and error handling are sub-microsecond
### **Resource Utilization**
- **Memory Usage**: Very efficient at 1,312.79 MB peak usage
- **CPU Performance**: Handles 5,000 RPS workload effectively
- **Queue Management**: 47.13 µs average wait time indicates good throughput
---
## Configuration Recommendations
### **Optimal Settings for t3.medium**
Based on test results, these configurations work well:
```json
{
"client": {
"initial_pool_size": 10000,
"buffer_size": 15000
}
}
```
### **Tuning Opportunities**
**For Lower Memory Usage:**
- Reduce `initial_pool_size` to 7,500-8,000
- Decrease `buffer_size` to 12,000-13,000
- Trade-off: Slightly higher latency
**For Better Performance:**
- Increase `initial_pool_size` to 12,000-13,000
- Increase `buffer_size` to 17,000-18,000
- Trade-off: Higher memory usage (monitor RAM limits)
---
## Comparison Context
### **vs. t3.xlarge Performance**
| Metric | t3.medium | t3.xlarge | Difference |
|--------|-----------|-----------|------------|
| **Bifrost Overhead** | 59 µs | 11 µs | ~5.4× higher |
| **Queue Wait Time** | 47.13 µs | 1.67 µs | ~28× higher |
| **JSON Marshaling** | 63.47 µs | 26.80 µs | ~2.4× higher |
| **Response Parsing** | 11.30 ms | 2.11 ms | ~5.4× higher |
| **Memory Usage** | 1,312.79 MB | 3,340.44 MB | -61% usage |
**Key Insights:**
- t3.medium uses **61% less memory** than t3.xlarge
- Performance trade-offs are reasonable for cost savings
- Most operations still complete in microseconds
---
## Next Steps
**When to upgrade to t3.xlarge:**
- Sustained load approaches 4,000+ RPS
- Queue wait times consistently exceed 75 µs
- Memory usage approaches 75% of available RAM
- **[Run Your Own Benchmarks](./run-your-own-benchmarks)** to test with your specific workload
- **[Compare with t3.xlarge](./t3.xl)** for performance scaling analysis
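
The upgrade criteria listed above can be checked mechanically against your own measurements. The sketch below is a hypothetical helper, not part of Bifrost; it reads "approaches 4,000+ RPS" and "approaches 75%" as "at or above" those values, which is our interpretation.

```go
package main

import (
	"fmt"
	"time"
)

// observed holds the measurements the upgrade criteria refer to.
type observed struct {
	SustainedRPS  float64
	QueueWait     time.Duration
	MemoryUsedMB  float64
	MemoryTotalMB float64
}

// upgradeSignals returns the criteria that the observation meets.
// An empty result means none of them apply.
func upgradeSignals(o observed) []string {
	var signals []string
	if o.SustainedRPS >= 4000 {
		signals = append(signals, "sustained load at or above 4,000 RPS")
	}
	if o.QueueWait > 75*time.Microsecond {
		signals = append(signals, "queue wait above 75 µs")
	}
	if o.MemoryTotalMB > 0 && o.MemoryUsedMB/o.MemoryTotalMB >= 0.75 {
		signals = append(signals, "memory usage at or above 75% of RAM")
	}
	return signals
}

func main() {
	// Values from the t3.medium benchmark on this page (5,000 RPS test load).
	o := observed{
		SustainedRPS:  5000,
		QueueWait:     47130 * time.Nanosecond,
		MemoryUsedMB:  1312.79,
		MemoryTotalMB: 4096,
	}
	for _, s := range upgradeSignals(o) {
		fmt.Println("upgrade signal:", s)
	}
}
```

Whether one signal is enough to justify an upgrade, or several should coincide, is a judgment call the criteria themselves leave open.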

151
docs/benchmarking/t3.xl.mdx Normal file
View File

@@ -0,0 +1,151 @@
---
title: "t3.xlarge"
description: "Detailed performance metrics and analysis for Bifrost running on AWS t3.xlarge instances (4 vCPUs, 16GB RAM)."
icon: "server"
---
## Instance Configuration
**AWS t3.xlarge Specifications:**
- **vCPUs**: 4
- **Memory**: 16GB RAM
- **Network Performance**: Up to 5 Gigabit
**Bifrost Configuration:**
- **Buffer Size**: 20,000
- **Initial Pool Size**: 15,000
- **Test Load**: 5,000 requests per second (RPS)
---
## Performance Results
### **Overall Performance Metrics**
| Metric | Value | Notes |
|--------|-------|--------|
| **Success Rate** | 100.00% | Perfect reliability under high load |
| **Average Request Size** | 0.13 KB | Lightweight request payload |
| **Average Response Size** | 10.32 KB | **Large response payload testing** |
| **Average Latency** | 1.61s | Total end-to-end response time |
| **Peak Memory Usage** | 3,340.44 MB | ~21% of available 16GB RAM |
> **Note**: t3.xlarge tests used significantly larger response payloads (~10 KB vs ~1 KB on t3.medium) to stress-test performance with realistic production data sizes.
### **Detailed Performance Breakdown**
| Operation | Latency | Performance Notes |
|-----------|---------|-------------------|
| **Queue Wait Time** | 1.67 µs | **96% faster** than t3.medium |
| **Key Selection Time** | 10 ns | **37% faster** weighted API key selection |
| **Message Formatting** | 2.11 µs | Consistent with t3.medium performance |
| **Params Preparation** | 417 ns | Slight improvement over t3.medium |
| **Request Body Preparation** | 2.36 µs | **11% faster** request assembly |
| **JSON Marshaling** | 26.80 µs | **58% faster** serialization |
| **Request Setup** | 7.17 µs | Comparable to t3.medium |
| **HTTP Request** | 1.50s | **4% faster** provider API calls |
| **Error Handling** | 162 ns | **14% faster** error processing |
| **Response Parsing** | 2.11 ms | **81% faster** despite 7.5x larger payloads |
**Bifrost's Total Overhead: 11 µs***
*\*Excludes JSON marshaling, response parsing, and HTTP calls, which are required in any implementation. 81% reduction compared to t3.medium (59 µs → 11 µs)*
---
## Performance Analysis
### **Exceptional Performance Improvements**
1. **Dramatic Overhead Reduction**: 81% lower Bifrost overhead (59 µs → 11 µs)
2. **Superior Queue Management**: 96% faster queue wait times (47.13 µs → 1.67 µs)
3. **Faster JSON Processing**: 58% improvement in request marshaling (63.47 µs → 26.80 µs) at the same 0.13 KB request size
4. **Efficient Response Parsing**: 81% faster parsing even with 7.5x larger responses
5. **Perfect Reliability**: 100% success rate maintained under high load
### **Resource Utilization**
- **Memory Efficiency**: Uses only 21% of available RAM (3,340.44 MB / 16GB)
- **CPU Performance**: Excellent multi-core utilization for 5,000 RPS
- **Headroom**: Substantial capacity for traffic spikes and growth
---
## Scalability and Headroom
### **Exceptional Scaling Characteristics**
The t3.xlarge configuration demonstrates **excellent scaling potential**:
**Current Utilization:**
- **Memory**: 21% used (13GB available headroom)
- **Queue Performance**: 1.67 µs wait time (near-optimal)
- **Processing Speed**: A few microseconds or less for most internal operations
**Scaling Potential:**
- **Traffic Spikes**: Memory and queue headroom suggest capacity for bursts well above 5,000 RPS, although higher rates were not part of this benchmark
- **Response Size Growth**: Efficiently handles 10 KB responses
- **Concurrent Users**: Supports thousands of simultaneous users
---
## Advanced Configuration
### **Optimal Settings for t3.xlarge**
Based on test results, these configurations provide excellent performance:
```json
{
"client": {
"initial_pool_size": 15000,
"buffer_size": 20000
}
}
```
### **Performance Tuning Opportunities**
**For Maximum Performance:**
- Increase `initial_pool_size` to 18,000-20,000
- Increase `buffer_size` to 25,000-30,000
- Trade-off: Higher memory usage (still well within limits)
**For Memory Optimization:**
- Current config already very efficient at 21% RAM usage
- Could reduce settings if needed, but performance gains would be lost
**For Extreme Workloads:**
- Consider `initial_pool_size` up to 25,000
- Increase `buffer_size` to 35,000+
- Monitor memory usage approaching 50% of available RAM
---
## Performance Comparison
### **vs. t3.medium Performance**
| Metric | t3.medium | t3.xlarge | Improvement |
|--------|-----------|-----------|-------------|
| **Bifrost Overhead** | 59 µs | 11 µs | **-81%** |
| **Average Latency** | 2.12s | 1.61s | **-24%** |
| **Queue Wait Time** | 47.13 µs | 1.67 µs | **-96%** |
| **JSON Marshaling** | 63.47 µs | 26.80 µs | **-58%** |
| **Response Parsing** | 11.30 ms | 2.11 ms | **-81%** |
| **Response Size Handled** | 1.37 KB | 10.32 KB | **+7.5x** |
| **Peak Memory Usage** | 1,312.79 MB | 3,340.44 MB | +154% |
| **Memory Utilization** | 33% | 21% | **-36%** |
**Key Insights:**
- **81% overhead reduction** while handling 7.5x larger responses
- **Exceptional efficiency** with only 21% memory utilization
- **Dramatic queue performance** improvements
- **Substantial headroom** for growth and traffic spikes
---
## Next Steps
- **[Run Your Own Benchmarks](./run-your-own-benchmarks)** with your specific payload sizes
- **[Compare with t3.medium](./t3.medium)** for cost-optimization analysis

View File

@@ -0,0 +1,18 @@
---
title: "v0.10.0"
description: "v0.10.0 changelog"
---
<Update label="Bifrost CLI" description="v0.10.0">
- feat: tabbed multiplexer for running multiple coding-agent sessions in a single terminal
- feat: self-update flow with `bifrost update` command and background version checks
- feat: `bifrost version` subcommand
- feat: native config writing for Claude Code (~/.claude/settings.json) with confirmation prompt
- feat: PTY-based process execution with SIGWINCH propagation for proper TUI rendering
- feat: npx installer rewrite with persistent install to ~/.bifrost/bin/ and automatic shell PATH setup
- feat: Claude Code simple terminal mode (CLAUDE_CODE_SIMPLE=1) for tab compatibility
- fix: opencode harness model reference format and provider config (bifrost/ prefix, dedicated provider)
- feat: opencode adaptive TUI theme injection and JSONC config parsing
- fix: chooser TUI prompt cleanup and tab bar integration (ReservedRows, BackToTabs, Notify)
</Update>

View File

@@ -0,0 +1,15 @@
---
title: "v0.10.1"
description: "v0.10.1 changelog - 2026-03-13"
---
<Update label="Bifrost CLI" description="v0.10.1">
- feat: added "edit session" functionality via ^B e to reopen chooser with prefilled values
- feat: Claude harness now pins selected models across Sonnet, Opus, and Haiku tiers
- fix: improved terminal cursor restoration on PTY exit
- fix: enhanced error notice handling in command mode with sticky error states
- fix: improved MCP client reconnection with exponential backoff and connection timeout
</Update>

View File

@@ -0,0 +1,14 @@
---
title: "v0.10.2"
description: "v0.10.2 changelog - 2026-03-14"
---
<Update label="Bifrost CLI" description="v0.10.2">
- feat: added in-tab self-update via U key in command mode when update is available
- feat: improved tab bar to show update hint when newer version is detected
- fix: terminal resize handling with proper size normalization and scroll region reset
- fix: improved chooser integration with tab bar rendering via TabBarLine callback
- fix: enhanced cursor positioning with absolute origin mode after scroll region reset
</Update>

View File

@@ -0,0 +1,10 @@
---
title: "v0.10.3"
description: "v0.10.3 changelog - 2026-03-27"
---
<Update label="Bifrost CLI" description="v0.10.3">
- feat: added support for `ANTHROPIC_AUTH_TOKEN`
</Update>

View File

@@ -0,0 +1,52 @@
---
title: "v1.3.10"
description: "v1.3.10 changelog"
---
<Update label="Bifrost Enterprise" description="v1.3.10">
## Changelog
This release upgrades the base OSS version from v1.4.12 to v1.4.13, bringing plugin execution sequencing, Groq speech support, Azure GCC cloud environments, and connection pool management. On the enterprise side, this release adds Azure Entra ID support for GCC High environments, new customer deployments, and deployment pipeline improvements.
## ✨ Features
- **Plugin Sequencing** — Added plugin execution ordering with placement and priority controls for custom plugins relative to built-in plugins
- **Groq Speech** — Added speech synthesis (TTS) and transcription (STT) support for Groq provider
- **Gemini Model Metadata** — Added support for Gemini metadata endpoint (/v1beta/models/{model})
- **Azure GCC High Integration** — Added Azure Entra ID support for GCC High and DoD cloud environments, including cloud-specific endpoints for SCIM provisioning and JWT validation
- **Wildcard Header Forwarding** — Added wildcard pattern support in header forwarding configuration
- **Log Metadata Columns** — Added metadata columns in logs and filters for richer observability
- **Prompt Caching Improvements** — Preserved JSON key ordering for LLM prompt caching using byte-level operations
- **Connection Pool Management** — Added connection lifetime limits and optimized pool behavior to prevent stale connections
## 🐞 Fixed
- **MCP Tool Headers** — Fixed MCP tools not passing required headers to the MCP server
- **MCP Tool Call Detection** — Fixed tool calls not being detected in MCP agent mode when providers return "stop" finish reason
- **Gemini Finish Reason** — Fixed Gemini models not returning correct "tool_calls" finish reason
- **Prompt Cascade Deletion** — Fixed manual cascade deletion for prompt entities
- **Deploy Maxim Workflow** — Fixed deployment workflow for Maxim environment
- **Commit Message Parsing** — Fixed commit message parsing in enterprise build pipeline
- **Customer License Expiry** — Updated license expiry configurations for customer deployments
## 📀 Base OSS version
`transports/v1.4.13`
## 🔌 If you are compiling a plugin against this release, use the following deps
```
github.com/maximhq/bifrost/core v1.4.11
github.com/maximhq/bifrost/framework v1.2.30
github.com/maximhq/bifrost/plugins/governance v1.4.30
github.com/maximhq/bifrost/plugins/logging v1.4.30
github.com/maximhq/bifrost/transports v1.4.14
github.com/weaviate/weaviate v1.36.5
github.com/weaviate/weaviate-go-client/v5 v5.7.1
google.golang.org/genproto/googleapis/api v0.0.0-20260203192932-546029d2fa20
google.golang.org/genproto/googleapis/rpc v0.0.0-20260203192932-546029d2fa20
```
</Update>

View File

@@ -0,0 +1,47 @@
---
title: "v1.3.11"
description: "v1.3.11 changelog"
---
<Update label="Bifrost Enterprise" description="v1.3.11">
## Changelog
This release upgrades the base OSS version from v1.4.14 to v1.4.15, bringing a custom SSE stream reader for smoother streaming, MCP config validation, configurable max open connections, and major dashboard improvements. On the enterprise side, this release adds new customer onboarding, Mantel authentication migration to username/password, and license management updates.
## ✨ Features
- **Custom SSE Stream Reader** — Replaced fasthttp's default stream reader with a custom implementation to reduce bursts in SSE streaming
- **MCP Config Validation** — Added validation for MCP tool configurations in config.json
- **Max Open Connections** — Exposed max-open-connections for provider domains as a configurable field
- **Dashboard Improvements** — Added new tabs and graphs to the dashboard including Model Ranking, Cache usage, and MCP usage
- **Dashboard & Logs Performance** — Improved LLM logs and Dashboard UI performance (~1400x faster) for large numbers of logs
- **Anthropic Compaction** — Added compaction support for Anthropic provider
## 🐞 Fixed
- **Passthrough Streaming** — Fixed passthrough streaming responses being buffered instead of streamed
- **MCP Notifications** — Fixed MCP notifications returning incorrect status code
- **Streaming Function Calls** — Fixed function_call items not included in streaming response.completed output
- **Bedrock API Key Auth** — Fixed Bedrock API key authentication without requiring bedrock_key_config
- **Bedrock Token Count Fallback** — Added fallback to estimated token count when count-tokens API is unsupported
- **Anthropic Thinking Fixes** — Fixed OpenAI-to-Anthropic-to-OpenAI thinking content conversion
- **Anthropic Header Selection** — Fixed Anthropic header selection across providers
- **Gemini OpenAI Integration** — Fixed Gemini flow for OpenAI-compatible integration
- **Semantic Cache Hashing** — Fixed deterministic tools_hash and params_hash in semantic cache
## 📀 Base OSS version
`transports/v1.4.15`
## 🔌 If you are compiling a plugin against this release, use the following deps
```
github.com/maximhq/bifrost/core v1.4.12
github.com/maximhq/bifrost/framework v1.2.31
github.com/maximhq/bifrost/plugins/governance v1.4.31
github.com/maximhq/bifrost/plugins/logging v1.4.31
github.com/maximhq/bifrost/transports v1.4.15
```
</Update>

View File

@@ -0,0 +1,36 @@
---
title: "v1.3.12"
description: "v1.3.12 changelog"
---
<Update label="Bifrost Enterprise" description="v1.3.12">
## Changelog
This release upgrades the base OSS version from v1.4.15 to v1.4.16, fixing Responses API tool type routing, Postgres indexing deadlocks, and startup blocking. On the enterprise side, this release adds targeted release deployments via `--release-for` and fixes MCP tool group filtering.
## ✨ Features
- **Targeted Release Deployments** — Added `--release-for` flag to CI/CD pipeline, allowing releases to target specific environments by name instead of auto-detecting all environments
## 🐞 Fixed
- **Responses API Tool Types** — Normalized versioned/provider-specific tool type strings (e.g. `web_search_20250305`) to their canonical types for correct routing
- **Provider Histogram Index** — Deferred provider histogram index creation to background goroutine to avoid blocking pod startup
- **MCP Tool Group Filtering** — Fixed MCP tool include filter to use correct schema constant for proper tool group resolution
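
To make the tool type normalization in the first fix above more concrete, here is a minimal Go sketch. It assumes, based only on the `web_search_20250305` example, that a versioned type is the canonical name followed by an underscore and an eight-digit date; the `NormalizeToolType` function is hypothetical and is not Bifrost's actual implementation, which may also handle other provider-specific forms.

```go
package main

import (
	"fmt"
	"regexp"
)

// dateSuffix matches a trailing "_YYYYMMDD" version marker.
// Assumption: versioned tool types follow this shape.
var dateSuffix = regexp.MustCompile(`_\d{8}$`)

// NormalizeToolType strips the date suffix, if present, and returns
// the remaining name as the canonical tool type.
func NormalizeToolType(toolType string) string {
	return dateSuffix.ReplaceAllString(toolType, "")
}

func main() {
	for _, t := range []string{"web_search_20250305", "web_search"} {
		fmt.Printf("%s -> %s\n", t, NormalizeToolType(t))
	}
}
```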
## 📀 Base OSS version
`transports/v1.4.16`
## 🔌 If you are compiling a plugin against this release, use the following deps
```
github.com/maximhq/bifrost/core v1.4.13
github.com/maximhq/bifrost/framework v1.2.32
github.com/maximhq/bifrost/plugins/governance v1.4.32
github.com/maximhq/bifrost/plugins/logging v1.4.32
github.com/maximhq/bifrost/transports v1.4.16
```
</Update>

View File

@@ -0,0 +1,86 @@
---
title: "v1.3.13"
description: "v1.3.13 changelog"
---
<Update label="Bifrost Enterprise" description="v1.3.13">
## Changelog
This release upgrades the base OSS version from v1.4.16 to v1.4.17, bringing denylist model support, numerous streaming and provider fixes, and WebSocket concurrency safety. On the enterprise side, the Datadog span type for LLM calls is updated to `llm.call` for correct Datadog LLM Observability categorization.
## ✨ Features
- **Denylist Models** — Provider keys now support a `blacklisted_models` field to exclude specific models from routing and from filtered list-models output; the denylist takes precedence over the `models` allow list
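
The precedence rule can be sketched in Go as follows. The `KeyModels` type and its `Allows` method are hypothetical and only illustrate the ordering of the two checks; treating an empty allow list as "allow everything" is an assumption of this sketch, not something this changelog states.

```go
package main

import "fmt"

// KeyModels is a hypothetical stand-in for a provider key's model settings;
// it is not Bifrost's own type.
type KeyModels struct {
	Models            []string // allow list
	BlacklistedModels []string // denylist
}

// contains reports whether model appears in list.
func contains(list []string, model string) bool {
	for _, m := range list {
		if m == model {
			return true
		}
	}
	return false
}

// Allows reports whether the key may serve the model.
// The denylist is checked first, so it overrides the allow list.
// Assumption: an empty allow list permits every model that is not denied.
func (k KeyModels) Allows(model string) bool {
	if contains(k.BlacklistedModels, model) {
		return false
	}
	if len(k.Models) == 0 {
		return true
	}
	return contains(k.Models, model)
}

func main() {
	key := KeyModels{
		Models:            []string{"model-a", "model-b"},
		BlacklistedModels: []string{"model-b"},
	}
	for _, m := range []string{"model-a", "model-b", "model-c"} {
		fmt.Printf("%s allowed: %v\n", m, key.Allows(m))
	}
}
```

Checking the denylist before the allow list keeps the rule easy to reason about: a model listed in both is always excluded.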
## 🐞 Fixed
- **Datadog LLM Span Type** — Changed Datadog span type for LLM calls from `llm` to `llm.call` for proper Datadog LLM Observability integration
- **MCP Gateway Headers** — Fixed support for `x-bf-mcp-include-clients` and `x-bf-mcp-include-tools` headers to filter MCP tools/list response
- **Bedrock Duplicate Events** — Fixed duplicate `content_block_stop` events in Bedrock streaming responses
- **Reasoning Content Marshaling** — Fixed `reasoning_content` JSON tag in OpenAI response types
- **OTEL Streaming Traces** — Fixed response capture in OTEL tracing for streaming calls
- **Broken Pipe Handling** — Added broken pipe detection to connection pool error handler
- **Cache Token Streaming** — Fixed cache token capture for streaming calls across Anthropic and Bedrock providers
- **Vertex Embedding URL** — Fixed global region URL construction in Vertex embedding method
- **Bedrock Reasoning Merge** — Fixed reasoning content merge logic for Bedrock provider
- **Bedrock HTTP/2 Toggle** — Fixed enforce HTTP/2 toggle behavior for Bedrock provider
- **Codex Store Parameter** — Fixed `store` parameter handling for Codex conversations
- **Gemini Duplicate Text** — Skipped `OutputTextDone` events to prevent duplicate text in Gemini GenAI streaming
- **Gemini Thought Signatures** — Handled missing thought signatures in Gemini provider
- **Replicate Model Slugs** — Refined Replicate model slug resolution in model catalog
- **Logging Default** — Kept logging enabled by default for new configurations
- **Gin Migration Deadlocks** — Moved all gin migrations to Go to avoid deadlocks
- **WebSocket Concurrent Writes** — Fixed concurrent write safety in WebSocket Responses API sessions
- **Persist Store Config** — Persisted store raw request/response config at provider level
## 📀 Base OSS version
`transports/v1.4.17`
## 🗺️ Helm chart version
2.0.14
## 🔌 If you are compiling a plugin against this release, use the following deps
```
cloud.google.com/go/bigquery v1.73.1
github.com/DataDog/datadog-go/v5 v5.6.0
github.com/DataDog/dd-trace-go/v2 v2.4.0
github.com/aws/aws-sdk-go-v2/config v1.32.11
github.com/aws/aws-sdk-go-v2/credentials v1.19.11
github.com/aws/aws-sdk-go-v2/service/bedrockruntime v1.50.1
github.com/bytedance/sonic v1.15.0
github.com/coreos/go-oidc/v3 v3.12.0
github.com/fasthttp/router v1.5.4
github.com/golang-jwt/jwt/v5 v5.3.0
github.com/google/cel-go v0.26.1
github.com/google/uuid v1.6.0
github.com/gorilla/websocket v1.5.4-0.20250319132907-e064f32e3674
github.com/grandcat/zeroconf v1.0.0
github.com/hashicorp/consul/api v1.22.0
github.com/hashicorp/memberlist v0.5.4
github.com/maximhq/bifrost/core v1.4.14
github.com/maximhq/bifrost/framework v1.2.33
github.com/maximhq/bifrost/plugins/governance v1.4.33
github.com/maximhq/bifrost/plugins/logging v1.4.33
github.com/maximhq/bifrost/transports v1.4.17
github.com/nakabonne/tstorage v0.3.6
github.com/stretchr/testify v1.11.1
github.com/testcontainers/testcontainers-go v0.40.0
github.com/tetratelabs/wazero v1.11.0
github.com/valyala/fasthttp v1.68.0
go.etcd.io/etcd/client/v3 v3.6.6
golang.org/x/crypto v0.49.0
golang.org/x/oauth2 v0.35.0
google.golang.org/api v0.265.0
google.golang.org/protobuf v1.36.11
gorm.io/driver/sqlite v1.6.0
gorm.io/gorm v1.31.1
k8s.io/api v0.34.1
k8s.io/apimachinery v0.34.1
k8s.io/client-go v0.34.1
```
</Update>

View File

@@ -0,0 +1,46 @@
---
title: "v1.3.14"
description: "v1.3.14 changelog"
---
<Update label="Bifrost Enterprise" description="v1.3.14">
## Changelog
This release adds support for Claude Office Suite (Excel add-on), calendar-aligned billing, and ANTHROPIC_AUTH_TOKEN authentication. It also includes Anthropic streaming usage and cache token fixes, CORS wildcard header handling, and enterprise-side improvements for secrets management and new customer onboarding.
## ✨ Features
- **Claude Office Suite Support** — Added support for the Claude Office Suite Excel add-on, including fixes for proper integration
- **Calendar-Aligned Billing** — Added calendar alignment feature for billing periods with supporting migration
- **ANTHROPIC_AUTH_TOKEN Support** — Added support for `ANTHROPIC_AUTH_TOKEN` as an authentication method
- **URL-Based Log Selection** — Added URL-based log selection with keyboard navigation and cross-page browsing in the dashboard
- **Secrets Limit Workaround** — Added ability to circumvent the 100 secrets limit in CI/CD pipelines
- **Manual Image Overrides** — Added manual overrides for container images in deployment configurations
- **New Customer Environments** — Onboarded Beckhoff, Dish, and Technarts with full Terraform and Dockerfile configurations
## 🐞 Fixed
- **Anthropic Streaming Usage** — Fixed usage reporting for Anthropic streaming responses
- **Anthropic Cache Token Reporting** — Fixed cache token reporting for Anthropic provider
- **Semantic Cache count_tokens** — Skipped unsupported `count_tokens` requests in semantic cache plugin
- **CORS Wildcard Headers** — Fixed wildcard (`*`) allowed headers handling for CORS
- **Greptile Integration** — Fixed issues with Greptile integration
- **Dashboard Style Fixes** — Refined dashboard page styling and layout improvements
- **Ada Token Expiry** — Increased Ada environment token expiry duration
## 📀 Base OSS version
`transports/v1.4.18-0.20260327163039-277421844123`
## 🔌 If you are compiling a plugin against this release, use the following deps
```
go get github.com/maximhq/bifrost/core@2774218441230eef858636ebe3b70552fb575a93
go get github.com/maximhq/bifrost/framework@2774218441230eef858636ebe3b70552fb575a93
go get github.com/maximhq/bifrost/plugins/governance@2774218441230eef858636ebe3b70552fb575a93
go get github.com/maximhq/bifrost/plugins/logging@2774218441230eef858636ebe3b70552fb575a93
go get github.com/maximhq/bifrost/transports@2774218441230eef858636ebe3b70552fb575a93
```
</Update>

View File

@@ -0,0 +1,358 @@
---
title: "v1.3.15"
description: "v1.3.15 changelog"
---
<Update label="Bifrost Enterprise" description="v1.3.15">
## Changelog
This release pins Bifrost OSS dependencies to stable release tags (transports/v1.4.18) and includes calendar-aligned budgets along with numerous streaming and caching fixes.
## ✨ Features
- **Calendar-Aligned Budgets** — Added calendar alignment support for budget periods in governance
## 🐞 Fixed
- **SSE Error Events** — Handled SSE error events for 429 rate-limit and other error status codes during streaming
- **Anthropic Max Tokens** — Picked max tokens for Anthropic from the model params cache instead of hardcoded values
- **Anthropic Streaming Usage** — Fixed usage token reporting for Anthropic streaming responses
- **Anthropic Cache Tokens** — Fixed Anthropic cache token reporting in non-streaming responses
- **Embedding Precision** — Preserved provider precision in embedding responses instead of truncating float values
- **Provider Caching** — Removed pending marshal-to-map to fix caching issues at provider level
- **Claude Office Suite** — Fixed model routing for the Claude Office Suite add-on
- **Semantic Cache Config** — Hardened direct-only config handling and aligned UI types for semantic cache
- **Semantic Cache Count Tokens** — Skipped unsupported `count_tokens` requests in the semantic cache plugin
- **Telemetry Events** — Removed reason field from telemetry events
- **CORS Headers** — Fixed wildcard allowed headers for CORS
- **UI Routing Display** — The UI now shows the selected virtual key and routing rule
## 📀 Base OSS version
`transports/v1.4.18`
## 🔌 If you are compiling a plugin against this release, use the following deps
```
module github.com/maximhq/bifrost-enterprise
go 1.26.1
require (
cloud.google.com/go/bigquery v1.73.1
github.com/DataDog/datadog-go/v5 v5.6.0
github.com/DataDog/dd-trace-go/v2 v2.4.0
github.com/aws/aws-sdk-go-v2/config v1.32.11
github.com/aws/aws-sdk-go-v2/credentials v1.19.11
github.com/aws/aws-sdk-go-v2/service/bedrockruntime v1.50.1
github.com/bytedance/sonic v1.15.0
github.com/coreos/go-oidc/v3 v3.12.0
github.com/fasthttp/router v1.5.4
github.com/golang-jwt/jwt/v5 v5.3.0
github.com/google/cel-go v0.26.1
github.com/google/uuid v1.6.0
github.com/gorilla/websocket v1.5.4-0.20250319132907-e064f32e3674
github.com/grandcat/zeroconf v1.0.0
github.com/hashicorp/consul/api v1.22.0
github.com/hashicorp/memberlist v0.5.4
github.com/maximhq/bifrost/core v1.4.15
github.com/maximhq/bifrost/framework v1.2.34
github.com/maximhq/bifrost/plugins/governance v1.4.34
github.com/maximhq/bifrost/plugins/logging v1.4.34
github.com/maximhq/bifrost/transports v1.4.18
github.com/nakabonne/tstorage v0.3.6
github.com/stretchr/testify v1.11.1
github.com/testcontainers/testcontainers-go v0.40.0
github.com/tetratelabs/wazero v1.11.0
github.com/valyala/fasthttp v1.68.0
go.etcd.io/etcd/client/v3 v3.6.6
golang.org/x/crypto v0.49.0
golang.org/x/oauth2 v0.35.0
google.golang.org/api v0.265.0
google.golang.org/protobuf v1.36.11
gorm.io/driver/sqlite v1.6.0
gorm.io/gorm v1.31.1
k8s.io/api v0.34.1
k8s.io/apimachinery v0.34.1
k8s.io/client-go v0.34.1
)
require (
cel.dev/expr v0.25.1 // indirect
cloud.google.com/go v0.123.0 // indirect
cloud.google.com/go/auth v0.18.1 // indirect
cloud.google.com/go/auth/oauth2adapt v0.2.8 // indirect
cloud.google.com/go/compute/metadata v0.9.0 // indirect
cloud.google.com/go/iam v1.5.3 // indirect
dario.cat/mergo v1.0.2 // indirect
github.com/Azure/azure-sdk-for-go/sdk/azcore v1.20.0 // indirect
github.com/Azure/azure-sdk-for-go/sdk/azidentity v1.13.1 // indirect
github.com/Azure/azure-sdk-for-go/sdk/internal v1.11.2 // indirect
github.com/Azure/go-ansiterm v0.0.0-20250102033503-faa5f7b0171c // indirect
github.com/AzureAD/microsoft-authentication-library-for-go v1.6.0 // indirect
github.com/DataDog/datadog-agent/comp/core/tagger/origindetection v0.71.0 // indirect
github.com/DataDog/datadog-agent/pkg/obfuscate v0.71.0 // indirect
github.com/DataDog/datadog-agent/pkg/opentelemetry-mapping-go/otlp/attributes v0.71.0 // indirect
github.com/DataDog/datadog-agent/pkg/proto v0.71.0 // indirect
github.com/DataDog/datadog-agent/pkg/remoteconfig/state v0.73.0-rc.1 // indirect
github.com/DataDog/datadog-agent/pkg/trace v0.71.0 // indirect
github.com/DataDog/datadog-agent/pkg/util/log v0.71.0 // indirect
github.com/DataDog/datadog-agent/pkg/util/scrubber v0.71.0 // indirect
github.com/DataDog/datadog-agent/pkg/version v0.71.0 // indirect
github.com/DataDog/go-libddwaf/v4 v4.6.1 // indirect
github.com/DataDog/go-runtime-metrics-internal v0.0.4-0.20250721125240-fdf1ef85b633 // indirect
github.com/DataDog/go-sqllexer v0.1.8 // indirect
github.com/DataDog/go-tuf v1.1.1-0.5.2 // indirect
github.com/DataDog/sketches-go v1.4.7 // indirect
github.com/Masterminds/semver/v3 v3.3.1 // indirect
github.com/Microsoft/go-winio v0.6.2 // indirect
github.com/andybalholm/brotli v1.2.0 // indirect
github.com/antlr4-go/antlr/v4 v4.13.0 // indirect
github.com/apache/arrow/go/v15 v15.0.2 // indirect
github.com/apapsch/go-jsonmerge/v2 v2.0.0 // indirect
github.com/armon/go-metrics v0.4.1 // indirect
github.com/aws/aws-sdk-go-v2 v1.41.3 // indirect
github.com/aws/aws-sdk-go-v2/aws/protocol/eventstream v1.7.6 // indirect
github.com/aws/aws-sdk-go-v2/feature/ec2/imds v1.18.19 // indirect
github.com/aws/aws-sdk-go-v2/internal/configsources v1.4.19 // indirect
github.com/aws/aws-sdk-go-v2/internal/endpoints/v2 v2.7.19 // indirect
github.com/aws/aws-sdk-go-v2/internal/ini v1.8.5 // indirect
github.com/aws/aws-sdk-go-v2/internal/v4a v1.4.16 // indirect
github.com/aws/aws-sdk-go-v2/service/internal/accept-encoding v1.13.6 // indirect
github.com/aws/aws-sdk-go-v2/service/internal/checksum v1.9.7 // indirect
github.com/aws/aws-sdk-go-v2/service/internal/presigned-url v1.13.19 // indirect
github.com/aws/aws-sdk-go-v2/service/internal/s3shared v1.19.16 // indirect
github.com/aws/aws-sdk-go-v2/service/s3 v1.94.0 // indirect
github.com/aws/aws-sdk-go-v2/service/signin v1.0.7 // indirect
github.com/aws/aws-sdk-go-v2/service/sso v1.30.12 // indirect
github.com/aws/aws-sdk-go-v2/service/ssooidc v1.35.16 // indirect
github.com/aws/aws-sdk-go-v2/service/sts v1.41.8 // indirect
github.com/aws/smithy-go v1.24.2 // indirect
github.com/bahlo/generic-list-go v0.2.0 // indirect
github.com/beorn7/perks v1.0.1 // indirect
github.com/buger/jsonparser v1.1.2 // indirect
github.com/bytedance/gopkg v0.1.3 // indirect
github.com/bytedance/sonic/loader v0.5.0 // indirect
github.com/cenkalti/backoff v2.2.1+incompatible // indirect
github.com/cenkalti/backoff/v4 v4.3.0 // indirect
github.com/cenkalti/backoff/v5 v5.0.3 // indirect
github.com/cespare/xxhash/v2 v2.3.0 // indirect
github.com/cihub/seelog v0.0.0-20170130134532-f561c5e57575 // indirect
github.com/cloudwego/base64x v0.1.6 // indirect
github.com/containerd/errdefs v1.0.0 // indirect
github.com/containerd/errdefs/pkg v0.3.0 // indirect
github.com/containerd/log v0.1.0 // indirect
github.com/containerd/platforms v0.2.1 // indirect
github.com/coreos/go-semver v0.3.1 // indirect
github.com/coreos/go-systemd/v22 v22.5.0 // indirect
github.com/cpuguy83/dockercfg v0.3.2 // indirect
github.com/davecgh/go-spew v1.1.2-0.20180830191138-d8f796af33cc // indirect
github.com/dgryski/go-rendezvous v0.0.0-20200823014737-9f7001d12a5f // indirect
github.com/distribution/reference v0.6.0 // indirect
github.com/docker/docker v28.5.2+incompatible // indirect
github.com/docker/go-connections v0.6.0 // indirect
github.com/docker/go-units v0.5.0 // indirect
github.com/dustin/go-humanize v1.0.1 // indirect
github.com/ebitengine/purego v0.9.1 // indirect
github.com/emicklei/go-restful/v3 v3.12.2 // indirect
github.com/fasthttp/websocket v1.5.12 // indirect
github.com/fatih/color v1.17.0 // indirect
github.com/felixge/httpsnoop v1.0.4 // indirect
github.com/fxamacker/cbor/v2 v2.9.0 // indirect
github.com/go-jose/go-jose/v4 v4.1.3 // indirect
github.com/go-logr/logr v1.4.3 // indirect
github.com/go-logr/stdr v1.2.2 // indirect
github.com/go-ole/go-ole v1.3.0 // indirect
github.com/go-openapi/analysis v0.24.2 // indirect
github.com/go-openapi/errors v0.22.5 // indirect
github.com/go-openapi/jsonpointer v0.22.4 // indirect
github.com/go-openapi/jsonreference v0.21.4 // indirect
github.com/go-openapi/loads v0.23.2 // indirect
github.com/go-openapi/runtime v0.29.2 // indirect
github.com/go-openapi/spec v0.22.2 // indirect
github.com/go-openapi/strfmt v0.25.0 // indirect
github.com/go-openapi/swag v0.25.4 // indirect
github.com/go-openapi/swag/cmdutils v0.25.4 // indirect
github.com/go-openapi/swag/conv v0.25.4 // indirect
github.com/go-openapi/swag/fileutils v0.25.4 // indirect
github.com/go-openapi/swag/jsonname v0.25.4 // indirect
github.com/go-openapi/swag/jsonutils v0.25.4 // indirect
github.com/go-openapi/swag/loading v0.25.4 // indirect
github.com/go-openapi/swag/mangling v0.25.4 // indirect
github.com/go-openapi/swag/netutils v0.25.4 // indirect
github.com/go-openapi/swag/stringutils v0.25.4 // indirect
github.com/go-openapi/swag/typeutils v0.25.4 // indirect
github.com/go-openapi/swag/yamlutils v0.25.4 // indirect
github.com/go-openapi/validate v0.25.1 // indirect
github.com/go-viper/mapstructure/v2 v2.4.0 // indirect
github.com/goccy/go-json v0.10.5 // indirect
github.com/gogo/protobuf v1.3.2 // indirect
github.com/golang/groupcache v0.0.0-20241129210726-2c02b8208cf8 // indirect
github.com/golang/protobuf v1.5.4 // indirect
github.com/google/btree v1.1.3 // indirect
github.com/google/flatbuffers v23.5.26+incompatible // indirect
github.com/google/gnostic-models v0.7.0 // indirect
github.com/google/pprof v0.0.0-20251213031049-b05bdaca462f // indirect
github.com/google/s2a-go v0.1.9 // indirect
github.com/googleapis/enterprise-certificate-proxy v0.3.11 // indirect
github.com/googleapis/gax-go/v2 v2.17.0 // indirect
github.com/grpc-ecosystem/grpc-gateway/v2 v2.27.7 // indirect
github.com/hashicorp/errwrap v1.1.0 // indirect
github.com/hashicorp/go-cleanhttp v0.5.2 // indirect
github.com/hashicorp/go-hclog v1.6.3 // indirect
github.com/hashicorp/go-immutable-radix v1.3.1 // indirect
github.com/hashicorp/go-metrics v0.5.4 // indirect
github.com/hashicorp/go-msgpack/v2 v2.1.5 // indirect
github.com/hashicorp/go-multierror v1.1.1 // indirect
github.com/hashicorp/go-rootcerts v1.0.2 // indirect
github.com/hashicorp/go-sockaddr v1.0.7 // indirect
github.com/hashicorp/go-version v1.7.0 // indirect
github.com/hashicorp/golang-lru v1.0.2 // indirect
github.com/hashicorp/serf v0.10.1 // indirect
github.com/invopop/jsonschema v0.13.0 // indirect
github.com/jackc/pgpassfile v1.0.0 // indirect
github.com/jackc/pgservicefile v0.0.0-20240606120523-5a60cdf6a761 // indirect
github.com/jackc/pgx/v5 v5.7.6 // indirect
github.com/jackc/puddle/v2 v2.2.2 // indirect
github.com/jaswdr/faker/v2 v2.8.0 // indirect
github.com/jinzhu/inflection v1.0.0 // indirect
github.com/jinzhu/now v1.1.5 // indirect
github.com/json-iterator/go v1.1.12 // indirect
github.com/klauspost/compress v1.18.2 // indirect
github.com/klauspost/cpuid/v2 v2.3.0 // indirect
github.com/kylelemons/godebug v1.1.0 // indirect
github.com/lufia/plan9stats v0.0.0-20251013123823-9fd1530e3ec3 // indirect
github.com/magiconair/properties v1.8.10 // indirect
github.com/mailru/easyjson v0.9.1 // indirect
github.com/mark3labs/mcp-go v0.43.2 // indirect
github.com/mattn/go-colorable v0.1.14 // indirect
github.com/mattn/go-isatty v0.0.20 // indirect
github.com/mattn/go-sqlite3 v1.14.32 // indirect
github.com/maximhq/bifrost/plugins/litellmcompat v0.0.23 // indirect
github.com/maximhq/bifrost/plugins/maxim v1.5.33 // indirect
github.com/maximhq/bifrost/plugins/mocker v1.4.33 // indirect
github.com/maximhq/bifrost/plugins/otel v1.1.33 // indirect
github.com/maximhq/bifrost/plugins/semanticcache v1.4.32 // indirect
github.com/maximhq/bifrost/plugins/telemetry v1.4.34 // indirect
github.com/maximhq/maxim-go v0.2.0 // indirect
github.com/miekg/dns v1.1.68 // indirect
github.com/minio/simdjson-go v0.4.5 // indirect
github.com/mitchellh/go-homedir v1.1.0 // indirect
github.com/mitchellh/mapstructure v1.5.0 // indirect
github.com/moby/docker-image-spec v1.3.1 // indirect
github.com/moby/go-archive v0.1.0 // indirect
github.com/moby/patternmatcher v0.6.0 // indirect
github.com/moby/sys/sequential v0.6.0 // indirect
github.com/moby/sys/user v0.4.0 // indirect
github.com/moby/sys/userns v0.1.0 // indirect
github.com/moby/term v0.5.2 // indirect
github.com/modern-go/concurrent v0.0.0-20180306012644-bacd9c7ef1dd // indirect
github.com/modern-go/reflect2 v1.0.3-0.20250322232337-35a7c28c31ee // indirect
github.com/morikuni/aec v1.0.0 // indirect
github.com/munnerz/goautoneg v0.0.0-20191010083416-a7dc8b61c822 // indirect
github.com/oapi-codegen/runtime v1.1.1 // indirect
github.com/oklog/ulid v1.3.1 // indirect
github.com/opencontainers/go-digest v1.0.0 // indirect
github.com/opencontainers/image-spec v1.1.1 // indirect
github.com/outcaste-io/ristretto v0.2.3 // indirect
github.com/philhofer/fwd v1.2.0 // indirect
github.com/pierrec/lz4/v4 v4.1.21 // indirect
github.com/pinecone-io/go-pinecone/v5 v5.3.0 // indirect
github.com/pkg/browser v0.0.0-20240102092130-5ac0b6a4141c // indirect
github.com/pkg/errors v0.9.1 // indirect
github.com/planetscale/vtprotobuf v0.6.1-0.20240319094008-0393e58bdf10 // indirect
github.com/pmezard/go-difflib v1.0.1-0.20181226105442-5d4384ee4fb2 // indirect
github.com/power-devops/perfstat v0.0.0-20240221224432-82ca36839d55 // indirect
github.com/prometheus/client_golang v1.23.2 // indirect
github.com/prometheus/client_model v0.6.2 // indirect
github.com/prometheus/common v0.66.1 // indirect
github.com/prometheus/procfs v0.17.0 // indirect
github.com/puzpuzpuz/xsync/v3 v3.5.1 // indirect
github.com/qdrant/go-client v1.16.2 // indirect
github.com/redis/go-redis/v9 v9.17.2 // indirect
github.com/rs/zerolog v1.34.0 // indirect
github.com/santhosh-tekuri/jsonschema/v6 v6.0.2 // indirect
github.com/savsgio/gotils v0.0.0-20250408102913-196191ec6287 // indirect
github.com/sean-/seed v0.0.0-20170313163322-e2103e2c3529 // indirect
github.com/secure-systems-lab/go-securesystemslib v0.9.0 // indirect
github.com/shirou/gopsutil/v4 v4.25.10 // indirect
github.com/sirupsen/logrus v1.9.4 // indirect
github.com/spf13/cast v1.10.0 // indirect
github.com/stoewer/go-strcase v1.3.0 // indirect
github.com/theckman/httpforwarded v0.4.0 // indirect
github.com/tidwall/gjson v1.18.0 // indirect
github.com/tidwall/match v1.1.1 // indirect
github.com/tidwall/pretty v1.2.0 // indirect
github.com/tidwall/sjson v1.2.5 // indirect
github.com/tinylib/msgp v1.3.0 // indirect
github.com/tklauser/go-sysconf v0.3.16 // indirect
github.com/tklauser/numcpus v0.11.0 // indirect
github.com/twitchyliquid64/golang-asm v0.15.1 // indirect
github.com/valyala/bytebufferpool v1.0.0 // indirect
github.com/weaviate/weaviate v1.36.5 // indirect
github.com/weaviate/weaviate-go-client/v5 v5.7.1 // indirect
github.com/wk8/go-ordered-map/v2 v2.1.8 // indirect
github.com/x448/float16 v0.8.4 // indirect
github.com/yosida95/uritemplate/v3 v3.0.2 // indirect
github.com/yusufpapurcu/wmi v1.2.4 // indirect
github.com/zeebo/xxh3 v1.0.2 // indirect
go.etcd.io/etcd/api/v3 v3.6.6 // indirect
go.etcd.io/etcd/client/pkg/v3 v3.6.6 // indirect
go.mongodb.org/mongo-driver v1.17.6 // indirect
go.opencensus.io v0.24.0 // indirect
go.opentelemetry.io/auto/sdk v1.2.1 // indirect
go.opentelemetry.io/collector/component v1.39.0 // indirect
go.opentelemetry.io/collector/featuregate v1.39.0 // indirect
go.opentelemetry.io/collector/internal/telemetry v0.133.0 // indirect
go.opentelemetry.io/collector/pdata v1.39.0 // indirect
go.opentelemetry.io/contrib/bridges/otelzap v0.12.0 // indirect
go.opentelemetry.io/contrib/instrumentation/google.golang.org/grpc/otelgrpc v0.63.0 // indirect
go.opentelemetry.io/contrib/instrumentation/net/http/otelhttp v0.63.0 // indirect
go.opentelemetry.io/otel v1.40.0 // indirect
go.opentelemetry.io/otel/exporters/otlp/otlpmetric/otlpmetricgrpc v1.40.0 // indirect
go.opentelemetry.io/otel/exporters/otlp/otlpmetric/otlpmetrichttp v1.40.0 // indirect
go.opentelemetry.io/otel/log v0.14.0 // indirect
go.opentelemetry.io/otel/metric v1.40.0 // indirect
go.opentelemetry.io/otel/sdk v1.40.0 // indirect
go.opentelemetry.io/otel/sdk/metric v1.40.0 // indirect
go.opentelemetry.io/otel/trace v1.40.0 // indirect
go.opentelemetry.io/proto/otlp v1.9.0 // indirect
go.starlark.net v0.0.0-20260102030733-3fee463870c9 // indirect
go.uber.org/atomic v1.11.0 // indirect
go.uber.org/multierr v1.11.0 // indirect
go.uber.org/zap v1.27.0 // indirect
go.yaml.in/yaml/v2 v2.4.2 // indirect
go.yaml.in/yaml/v3 v3.0.4 // indirect
golang.org/x/arch v0.23.0 // indirect
golang.org/x/exp v0.0.0-20251113190631-e25ba8c21ef6 // indirect
golang.org/x/mod v0.33.0 // indirect
golang.org/x/net v0.52.0 // indirect
golang.org/x/sync v0.20.0 // indirect
golang.org/x/sys v0.42.0 // indirect
golang.org/x/telemetry v0.0.0-20260209163413-e7419c687ee4 // indirect
golang.org/x/term v0.41.0 // indirect
golang.org/x/text v0.35.0 // indirect
golang.org/x/time v0.14.0 // indirect
golang.org/x/tools v0.42.0 // indirect
golang.org/x/xerrors v0.0.0-20240903120638-7835f813f4da // indirect
google.golang.org/genproto v0.0.0-20260128011058-8636f8732409 // indirect
google.golang.org/genproto/googleapis/api v0.0.0-20260203192932-546029d2fa20 // indirect
google.golang.org/genproto/googleapis/rpc v0.0.0-20260319201613-d00831a3d3e7 // indirect
google.golang.org/grpc v1.79.3 // indirect
gopkg.in/evanphx/json-patch.v4 v4.12.0 // indirect
gopkg.in/inf.v0 v0.9.1 // indirect
gopkg.in/ini.v1 v1.67.0 // indirect
gopkg.in/yaml.v3 v3.0.1 // indirect
gorm.io/driver/postgres v1.6.0 // indirect
k8s.io/klog/v2 v2.130.1 // indirect
k8s.io/kube-openapi v0.0.0-20250710124328-f3f2b991d03b // indirect
k8s.io/utils v0.0.0-20250604170112-4c0f3b243397 // indirect
sigs.k8s.io/json v0.0.0-20241014173422-cfa47c3a1cc8 // indirect
sigs.k8s.io/randfill v1.0.0 // indirect
sigs.k8s.io/structured-merge-diff/v6 v6.3.0 // indirect
sigs.k8s.io/yaml v1.6.0 // indirect
)
```
</Update>


@@ -0,0 +1,80 @@
---
title: "v1.3.16"
description: "v1.3.16 changelog"
---
<Update label="Bifrost Enterprise" description="v1.3.16">
## Changelog
This release adds a Model Details API endpoint, Anthropic beta headers support, and includes fixes for reasoning content handling, timeout status codes, and cross-provider caching.
## ✨ Features
- **Model Details API** — Added /api/models/details endpoint for querying model capability metadata
- **Anthropic Beta Headers** — Support for Anthropic beta feature headers in requests
## 🐞 Fixed
- **Reasoning Content Leak** — Prevented reasoning text from leaking into Gemini response content
- **Timeout Status Code** — Fixed timeout status code handling across all providers
- **Cross-Provider Cache** — Preserved cached provider metadata on cross-provider cache hits
- **Governance Virtual Keys** — Populated customer virtual_keys in governance APIs
- **List Models Integration** — Removed default provider override on list models request in integrations
- **Client Settings Headers** — Fixed client settings UI to accept `*` as an allowed header
- **SCIM API Key Auth** — Clarified API key authentication flow in SCIM middleware to skip redundant validation
## 📀 Base OSS version
`transports/v1.4.19`
## 🔌 If you are compiling a plugin against this release, use the following deps
```
module github.com/maximhq/bifrost-enterprise
go 1.26.1
require (
cloud.google.com/go/bigquery v1.73.1
github.com/DataDog/datadog-go/v5 v5.6.0
github.com/DataDog/dd-trace-go/v2 v2.4.0
github.com/aws/aws-sdk-go-v2/config v1.32.11
github.com/aws/aws-sdk-go-v2/credentials v1.19.11
github.com/aws/aws-sdk-go-v2/service/bedrockruntime v1.50.1
github.com/bytedance/sonic v1.15.0
github.com/coreos/go-oidc/v3 v3.12.0
github.com/fasthttp/router v1.5.4
github.com/golang-jwt/jwt/v5 v5.3.0
github.com/google/cel-go v0.26.1
github.com/google/uuid v1.6.0
github.com/gorilla/websocket v1.5.4-0.20250319132907-e064f32e3674
github.com/grandcat/zeroconf v1.0.0
github.com/hashicorp/consul/api v1.22.0
github.com/hashicorp/memberlist v0.5.4
github.com/maximhq/bifrost/core v1.4.16
github.com/maximhq/bifrost/framework v1.2.35
github.com/maximhq/bifrost/plugins/governance v1.4.35
github.com/maximhq/bifrost/plugins/logging v1.4.35
github.com/maximhq/bifrost/transports v1.4.19
github.com/nakabonne/tstorage v0.3.6
github.com/stretchr/testify v1.11.1
github.com/testcontainers/testcontainers-go v0.40.0
github.com/tetratelabs/wazero v1.11.0
github.com/valyala/fasthttp v1.68.0
go.etcd.io/etcd/client/v3 v3.6.6
golang.org/x/crypto v0.49.0
golang.org/x/oauth2 v0.35.0
google.golang.org/api v0.265.0
google.golang.org/protobuf v1.36.11
gorm.io/driver/sqlite v1.6.0
gorm.io/gorm v1.31.1
k8s.io/api v0.34.1
k8s.io/apimachinery v0.34.1
k8s.io/client-go v0.34.1
)
```
</Update>


@@ -0,0 +1,88 @@
---
title: "v1.3.17"
description: "Enterprise v1.3.17 changelog"
---
<Update label="Bifrost Enterprise" description="v1.3.17">
## Changelog
This release introduces model blacklisting in load balancing, Fireworks AI provider support, cluster stability improvements with unique node IDs and leader visibility, and numerous OSS fixes including Bedrock streaming retries and Gemini thinking budget validation.
## ✨ Features
- **Model Blacklisting for Load Balancer** — Added ability to exclude specific models from provider selection in the load balancing plugin, with support for per-key blacklists, block-all (`["*"]`) wildcards, and provider-level intersection logic
- **Fireworks AI Provider** — Added Fireworks AI as a first-class provider in the OSS transport layer
- **Unified Models API** — Unified /api/models and /api/models/details listing behavior
- **Unique Cluster Node IDs** — Auto-generate a unique UUID for each node's NodeID on config load, ensuring distinct cluster node identifiers
- **Leader Badge in Cluster View** — Display a "Leader" badge with crown icon in the cluster node table, with sorting by node name
- **Server Bootstrap Timer** — Added server bootstrap timer for performance monitoring
- **Security Path Whitelisting** — Allow path whitelisting from security config
- **Large Payload Optimizations** — Updated config schema for large payload optimizations
- **Virtual Keys Table** — Added sorting and CSV export to virtual keys table
## 🐞 Fixed
- **Leader Election Interval** — Increased leader election check interval to 10 seconds for improved cluster stability
- **Node ID Consistency** — Minor fixes for node ID consistency across cluster operations
- **ECR Cross-Account Access** — Fixed IAM role ARN format for ECR pull principals and cleaned up unused AWS provider config
- **Bedrock Streaming Retries** — Retry retryable AWS exceptions and stale/closed-connection errors
- **Gemini Thinking Budget** — Fixed thinking budget validation for Gemini models
- **Integration Data Race** — Fixed race condition in data reading from fasthttp request for integrations
- **Beta Headers** — Fixed case-insensitive lookup in merge beta headers
- **Deprecated Config Field** — Replaced enforce_governance_header with enforce_auth_on_inference
- **Bedrock Config Schema** — Fixed config schema for Bedrock key config
- **OpenAI Codex** — Fixed store flag for OpenAI Codex
## 📀 Base OSS version
`transports/v1.4.20`
## 🔌 If you are compiling a plugin against this release, use the following deps
```
module github.com/maximhq/bifrost-enterprise
go 1.26.1
require (
cloud.google.com/go/bigquery v1.73.1
github.com/DataDog/datadog-go/v5 v5.6.0
github.com/DataDog/dd-trace-go/v2 v2.4.0
github.com/aws/aws-sdk-go-v2/config v1.32.11
github.com/aws/aws-sdk-go-v2/credentials v1.19.11
github.com/aws/aws-sdk-go-v2/service/bedrockruntime v1.50.1
github.com/bytedance/sonic v1.15.0
github.com/coreos/go-oidc/v3 v3.12.0
github.com/fasthttp/router v1.5.4
github.com/golang-jwt/jwt/v5 v5.3.0
github.com/google/cel-go v0.26.1
github.com/google/uuid v1.6.0
github.com/gorilla/websocket v1.5.4-0.20250319132907-e064f32e3674
github.com/grandcat/zeroconf v1.0.0
github.com/hashicorp/consul/api v1.22.0
github.com/hashicorp/memberlist v0.5.4
github.com/maximhq/bifrost/core v1.4.17
github.com/maximhq/bifrost/framework v1.2.36
github.com/maximhq/bifrost/plugins/governance v1.4.36
github.com/maximhq/bifrost/plugins/logging v1.4.36
github.com/maximhq/bifrost/transports v1.4.20
github.com/nakabonne/tstorage v0.3.6
github.com/stretchr/testify v1.11.1
github.com/testcontainers/testcontainers-go v0.40.0
github.com/tetratelabs/wazero v1.11.0
github.com/valyala/fasthttp v1.68.0
go.etcd.io/etcd/client/v3 v3.6.6
golang.org/x/crypto v0.49.0
golang.org/x/oauth2 v0.35.0
google.golang.org/api v0.265.0
google.golang.org/protobuf v1.36.11
gorm.io/driver/sqlite v1.6.0
gorm.io/gorm v1.31.1
k8s.io/api v0.34.1
k8s.io/apimachinery v0.34.1
k8s.io/client-go v0.34.1
)
```
</Update>


@@ -0,0 +1,58 @@
---
title: "v1.3.8"
description: "v1.3.8 changelog"
---
<Update label="Bifrost Enterprise" description="v1.3.8">
This release upgrades the base OSS version from v1.4.10 to v1.4.11, bringing Anthropic cache control for tool calls, Helm graceful shutdown improvements, Codex compatibility fixes, and numerous streaming/serialization bug fixes. On the enterprise side, Gray Swan guardrails now support custom base URLs.
This build also upgrades to Go 1.26.1, which fixes CVE-2026-25679, CVE-2026-27137, CVE-2026-27138, CVE-2026-27139, and CVE-2026-27142.
### ✨ Features
- Gray Swan Custom Base URL — Added support for custom base URLs in Gray Swan guardrails configuration
- Anthropic Cache Control for Tool Calls — Added cache-control support for Anthropic tool calls
- Helm Graceful Shutdown — Added graceful shutdown and HPA stabilization for streaming connections
- Logstore Sonic Serialization — Replaced encoding/json with sonic for logstore serialization, improving performance
- Maxim Attachments — Added attachment support to Maxim plugin
### 🚨 Breaking changes
Following our recent pentesting, we have updated the configuration for open endpoints.
1. The `/metrics` endpoint is now protected behind auth. Create an API key with the Metrics scope, then configure your scraper to send the `Authorization` header with the value `bearer api_key`.
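
As a rough illustration of the scraper-side change, the sketch below attaches an API key as a bearer token when requesting `/metrics`. The helper name `newMetricsRequest`, the `example-key` value, the capitalized `Bearer` scheme, and the stand-in test server are all assumptions made for this example; they are not taken from Bifrost itself.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
)

// newMetricsRequest builds a GET request for /metrics and attaches the API key
// as a bearer token. baseURL and apiKey are placeholders supplied by the caller.
func newMetricsRequest(baseURL, apiKey string) (*http.Request, error) {
	req, err := http.NewRequest(http.MethodGet, baseURL+"/metrics", nil)
	if err != nil {
		return nil, err
	}
	req.Header.Set("Authorization", "Bearer "+apiKey)
	return req, nil
}

func main() {
	// Stand-in server that mimics an auth-protected /metrics endpoint.
	srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		if r.Header.Get("Authorization") != "Bearer example-key" {
			w.WriteHeader(http.StatusUnauthorized)
			return
		}
		fmt.Fprintln(w, "ok")
	}))
	defer srv.Close()

	req, err := newMetricsRequest(srv.URL, "example-key")
	if err != nil {
		panic(err)
	}
	resp, err := http.DefaultClient.Do(req)
	if err != nil {
		panic(err)
	}
	defer resp.Body.Close()
	fmt.Println("status:", resp.StatusCode)
}
```

Most scrapers expose an equivalent setting for a static authorization header, so the same value can usually be supplied through configuration rather than code.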
### 🐞 Fixed
- Codex Compatibility — Fixed fallback handling and request decompression for Codex compatibility
- Anthropic SSE Streaming — Use NewSSEScanner for Responses API streaming
- Audio Filename Preservation — Preserve original audio filename in transcription requests
- Proxy Override — Fixed proxy override handling
- Raw Request Serialization — Fixed raw request serialization in SSE events
- Key List Models — Fixed key list models serialization
- Async Job Recovery — Fixed async jobs stuck in "processing" on marshal failure, now correctly transition to "failed"
- Valkey/Redis Vector Store — Improved Valkey Search compatibility and correctness in Redis vector store
- Semanticcache Nil Check — Added nil check on message Content before accessing fields
- Dashboard Overflow — Resolved dashboard and provider config overflow regressions
- Config Schema Alignment — Fixed config schema and added test to verify Go model alignment
- Key Selection Panic — Prevent panic in key selection when all keys have zero weight
- Security Patches — Applied security patches including default Anthropic error type fix
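
One way to guard weighted selection against the all-zero case is to fall back to a uniform pick when the weights sum to zero. The sketch below is illustrative only: the `apiKey` type, the `pickKey` function, and the uniform fallback are assumptions for this example and do not reflect Bifrost's actual key-selection code.

```go
package main

import (
	"fmt"
	"math/rand"
)

// apiKey is a hypothetical stand-in for a provider key with a routing weight.
type apiKey struct {
	ID     string
	Weight float64
}

// pickKey selects a key using roll, a random number in [0, 1).
// Weights are assumed to be non-negative. It returns false when keys is empty.
func pickKey(keys []apiKey, roll float64) (apiKey, bool) {
	if len(keys) == 0 {
		return apiKey{}, false
	}
	total := 0.0
	for _, k := range keys {
		total += k.Weight
	}
	if total <= 0 {
		// Assumed fallback: every key has zero weight, so pick uniformly.
		return keys[int(roll*float64(len(keys)))], true
	}
	target := roll * total
	for _, k := range keys {
		target -= k.Weight
		if target < 0 {
			return k, true
		}
	}
	// Floating-point rounding can leave target at exactly 0; use the last key.
	return keys[len(keys)-1], true
}

func main() {
	zeroWeighted := []apiKey{{ID: "a"}, {ID: "b"}}
	k, ok := pickKey(zeroWeighted, rand.Float64())
	fmt.Println(k.ID, ok)
}
```

Passing the random roll in as a parameter keeps the selection logic deterministic and easy to test.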
### 📀 Base OSS version
```
transports/v1.4.12-0.20260306144022-5ac7c2732345
```
### 🔌 If you are compiling a plugin against this release, use the following deps
```
github.com/maximhq/bifrost/core v1.4.8-0.20260306144022-5ac7c2732345
github.com/maximhq/bifrost/framework v1.2.26
github.com/maximhq/bifrost/plugins/governance v1.4.27
github.com/maximhq/bifrost/plugins/logging v1.4.27
github.com/maximhq/bifrost/transports v1.4.12-0.20260306144022-5ac7c2732345
```
</Update>


@@ -0,0 +1,64 @@
---
title: "v1.3.9"
description: "v1.3.9 changelog"
---
<Update label="Bifrost Enterprise" description="v1.3.9">
## Changelog
This release upgrades the base OSS version from v1.4.11 to v1.4.12, bringing a full-featured prompt repository with RBAC, large payload optimization, WebSocket-based responses API, Anthropic passthrough, session stickiness, and a unified pricing engine. On the enterprise side, this release adds KV store gossip protocol support, RBAC for the prompt repository, and build/deployment improvements.
## ✨ Features
- **Prompt Repository** — Full prompt management system with folders, prompts, versions, sessions, playground, versioning, deployment features, and Jinja2 variable support
- **Prompt Repository RBAC** — Added role-based access control for prompt repository operations
- **Large Payload Optimization** — End-to-end large payload support with streaming primitives, detection hooks, passthrough eligibility, provider support, plugin awareness, and enterprise settings UI
- **WebSocket Responses API** — Added WebSocket transport for OpenAI responses API and realtime API support
- **Anthropic Passthrough** — Added native Anthropic passthrough endpoint
- **KV Store Gossip Protocol** — Added gossip-based KV store for distributed state synchronization
- **Session Stickiness** — Added session stickiness in key selection for consistent routing
- **Model Parameters API** — Added model parameters table and API endpoint with in-memory caching
- **Virtual Key Limit Resets** — Added virtual key limit reset functionality
- **Pricing Engine Refactor** — Unified cost calculation with quality-based image and video pricing
- **Image Configuration** — Added size/aspect ratio config for Gemini and size-to-resolution conversion for Replicate
- **Streaming Request Decompression** — Threshold-gated streaming decompression with pooled readers
- **Raw Request/Response Storage** — Allow storing raw request/response without returning them to clients
- **Weighted Routing Targets** — Added weighted routing targets for probabilistic routing rules with key selection support
- **API Key Selection by ID** — Added API key selection by ID with priority over name selection
- **TLS Configuration** — Added TLS configuration support for all providers and TLS termination inside Bifrost server
- **K8s Deployment Workflow** — Added workflow to deploy Bifrost Enterprise to Maxim K8s cluster
## 🐞 Fixed
- **Deterministic Tool Schema** — Fixed deterministic tool schema serialization for Anthropic prompt caching
- **CORS Wildcard** — Fixed CORS issue when allowing the `*` origin
- **Bedrock toolChoice** — Fixed toolChoice silently dropped on Bedrock /converse and /converse-stream endpoints
- **Count Tokens Passthrough** — Fixed request body passthrough for count tokens endpoint for Anthropic and Vertex
- **Chat Finish Reason** — Map chat finish_reason to responses status and preserve terminal stream semantics
- **Tool Call Indexes** — Fixed streaming tool call indices for parallel tool calls in chat completions stream
- **Video Pricing** — Fixed video pricing calculation
- **SQLite Migration** — Prevented CASCADE deletion during routing targets migration
- **Log Serialization** — Reduced logstore serialization overhead and batch cost updates
- **Log List Queries** — Avoid loading raw_request/raw_response in log list queries
- **MCP Reconnection** — Improved MCP client reconnection with exponential backoff and connection timeout
- **Create Manifest Flow** — Fixed create manifest flow
- **Build Pipeline** — Fixed builds skipping latest changes
- **BigQuery Import** — Fixed import for codeEditor in bigqueryFormFragment.tsx
- **OSS Build Integration** — Support latest-main OSS build with go.mod replace directives
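
Exponential backoff, as mentioned for MCP reconnection above, typically doubles the wait between attempts up to a cap. A minimal sketch follows; the `backoffDelay` function, the 500 ms base, and the 30 s cap are chosen purely for illustration, since the release notes do not state the values Bifrost uses.

```go
package main

import (
	"fmt"
	"time"
)

// Illustrative defaults; the real values are not given in the release notes.
var (
	defaultBase = 500 * time.Millisecond
	defaultMax  = 30 * time.Second
)

// backoffDelay returns the wait before the given retry attempt (0-based),
// doubling from base on each attempt and never exceeding max.
func backoffDelay(attempt int, base, max time.Duration) time.Duration {
	d := base
	for i := 0; i < attempt; i++ {
		d *= 2
		if d >= max {
			return max
		}
	}
	if d > max {
		return max
	}
	return d
}

func main() {
	for attempt := 0; attempt < 8; attempt++ {
		fmt.Printf("attempt %d: wait %v\n", attempt, backoffDelay(attempt, defaultBase, defaultMax))
	}
}
```

Production implementations often add random jitter to each delay so that many clients do not reconnect at the same instant.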
## 📀 Base OSS version
`transports/v1.4.12`
## 🔌 If you are compiling a plugin against this release, use the following deps
```
github.com/maximhq/bifrost/core v1.4.8
github.com/maximhq/bifrost/framework v1.2.27
github.com/maximhq/bifrost/plugins/governance v1.4.28
github.com/maximhq/bifrost/plugins/logging v1.4.28
github.com/maximhq/bifrost/transports v1.4.12
```
</Update>


@@ -0,0 +1,156 @@
---
title: "v1.4.0-prerelease1"
description: "Enterprise v1.4.0-prerelease1"
---
<Update label="Bifrost Enterprise" description="v1.4.0-prerelease1">
## Changelog
This is a major release that introduces deny-by-default semantics across all allow-list fields (models, keys, tools, providers), a dedicated Provider Keys API, blacklist support in load balancing, redesigned adaptive routing UI, and scoped pricing overrides. **This release contains multiple breaking changes — please review the breaking changes section and migration checklist carefully before upgrading.**
## ⚠️ Breaking Changes
> **v1.5.0 OSS base flips the meaning of empty arrays across all allow-list fields.** Existing deployments with a database are protected by automatic migrations on startup, but any new configuration created after upgrading must follow the new semantics. **Back up your config store database before upgrading — this migration is not revertible.**
| What you write | v1.4.x meaning | v1.5.0 meaning |
|---|---|---|
| `[]` (empty array) | Allow **all** | Allow **none** (deny by default) |
| `["*"]` (wildcard) | N/A | Allow **all** |
| `["a", "b"]` | Only a and b | Only a and b (unchanged) |
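
The v1.5.0 column of the table can be read as a small membership check. The sketch below is one possible encoding of those rules; the function name `isAllowed` and the example model names are illustrative and not part of Bifrost's API.

```go
package main

import "fmt"

// isAllowed applies the v1.5.0 allow-list semantics from the table:
// an empty (or nil) list denies everything, ["*"] allows everything,
// and any other list allows only its members.
func isAllowed(allowList []string, value string) bool {
	if len(allowList) == 1 && allowList[0] == "*" {
		return true
	}
	for _, v := range allowList {
		if v == value {
			return true
		}
	}
	return false
}

func main() {
	fmt.Println(isAllowed([]string{}, "model-a"))                     // false: empty means deny all
	fmt.Println(isAllowed([]string{"*"}, "model-a"))                  // true: wildcard allows all
	fmt.Println(isAllowed([]string{"model-a", "model-b"}, "model-a")) // true: explicitly listed
	fmt.Println(isAllowed([]string{"model-a", "model-b"}, "model-c")) // false: not listed
}
```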
### 1. Provider Key `models` Field
Empty `models` array now means "allow none" instead of "allow all". Use `["*"]` to allow a key to serve all models.
### 2. Virtual Key `allowed_models` Field
Missing or empty `allowed_models` on a VK provider config now blocks all models from that provider. Use `["*"]` to allow all.
### 3. Virtual Key Provider Configs — Deny-by-Default
Virtual Keys with empty or missing `provider_configs` now block all providers. Every VK must explicitly list its permitted providers.
### 4. `allowed_keys` Renamed to `key_ids`
Field renamed in VK provider configs. Same deny-by-default semantics — omitted or empty `key_ids` now blocks all keys. Use `["*"]` to allow all. **Note:** Unlike `allowed_models`, there is no automatic migration for `key_ids`.
### 5. Virtual Key MCP `tools_to_execute` Field
Empty `tools_to_execute` now blocks all tools. The `mcp_configs` list itself acts as a strict allow-list — no `mcp_configs` means all MCP tools are blocked for that VK.
### 6. `weight` Field is Now Optional
`weight` on VK provider configs is now nullable (`*float64`). `null` or omitted means the provider is excluded from weighted routing but still reachable via direct routing or fallbacks.
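
For API consumers written in Go, a pointer field distinguishes "no weight" from an explicit `0`. The sketch below assumes a trimmed-down, hypothetical `providerConfig` struct whose `provider` field name is invented for this example; only the nullable `weight` (`*float64`) comes from the note above.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// providerConfig is a hypothetical, trimmed-down VK provider config.
// Only the nullable weight field is taken from the release notes.
type providerConfig struct {
	Provider string   `json:"provider"`
	Weight   *float64 `json:"weight"`
}

// parseProviderConfig decodes one provider config from raw JSON.
func parseProviderConfig(raw string) (providerConfig, error) {
	var cfg providerConfig
	err := json.Unmarshal([]byte(raw), &cfg)
	return cfg, err
}

// inWeightedRouting reports whether the config takes part in weighted routing:
// a nil weight (null or omitted) means it is excluded.
func inWeightedRouting(cfg providerConfig) bool {
	return cfg.Weight != nil
}

func main() {
	inputs := []string{
		`{"provider": "openai", "weight": 0.7}`,
		`{"provider": "anthropic", "weight": null}`,
		`{"provider": "bedrock"}`,
	}
	for _, raw := range inputs {
		cfg, err := parseProviderConfig(raw)
		if err != nil {
			panic(err)
		}
		fmt.Printf("%s weighted=%t\n", cfg.Provider, inWeightedRouting(cfg))
	}
}
```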
### 7. Compat Plugin Configuration Changes
- `enable_litellm_fallbacks` option **removed**
- Replaced with: `compat.convert_text_to_chat`, `compat.convert_chat_to_responses`, `compat.should_drop_params`
- Response field `extra_fields.litellm_compat` **removed**
- New response fields: `extra_fields.dropped_compat_plugin_params`, `extra_fields.converted_request_type`
### 8. Image Edits No Longer Supported on Replicate's Image Generation Endpoint
`/v1/images/generations` on Replicate now only handles pure text-to-image generation. Image editing parameters must use `/v1/images/edits`. Note: `/v1/images/edits` on Replicate will also be removed in a follow-up release.
### 9. Provider Keys API Separated from Provider API
- `keys` field **removed** from provider create/update requests and responses
- New dedicated endpoints: `GET/POST /api/providers/{provider}/keys`, `GET/PUT/DELETE /api/providers/{provider}/keys/{key_id}`
- Create providers first, then add keys separately
### New Validation: WhiteList Rules
- Wildcard `["*"]` cannot be mixed with other values (HTTP 400)
- No duplicate values allowed in allow-list fields
- Applies to: `allowed_models`, `key_ids`, `models`, `tools_to_execute`, `tools_to_auto_execute`, `allowed_extra_headers`
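
The two rules above can be checked before sending a request. This is a client-side sketch under the stated rules only; `validateAllowList` and its error messages are hypothetical, and the server remains the source of truth for what produces an HTTP 400.

```go
package main

import (
	"errors"
	"fmt"
)

// validateAllowList checks an allow-list field against the two rules above:
// no duplicate values, and "*" may not be combined with other values.
func validateAllowList(values []string) error {
	seen := make(map[string]struct{}, len(values))
	for _, v := range values {
		if _, dup := seen[v]; dup {
			return fmt.Errorf("duplicate value %q", v)
		}
		seen[v] = struct{}{}
	}
	if _, hasWildcard := seen["*"]; hasWildcard && len(values) > 1 {
		return errors.New(`wildcard "*" cannot be mixed with other values`)
	}
	return nil
}

func main() {
	candidates := [][]string{
		{"*"},
		{"model-a", "model-b"},
		{"*", "model-a"},
		{"model-a", "model-a"},
	}
	for _, c := range candidates {
		fmt.Printf("%v -> %v\n", c, validateAllowList(c))
	}
}
```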
### Quick Migration Checklist
1. Update provider key `models` in config.json — change `[]` to `["*"]`
2. Add `allowed_models: ["*"]` to every VK provider config
3. Ensure every VK has at least one provider config entry
4. Rename `allowed_keys` to `key_ids` and set `["*"]` where needed
5. Update `tools_to_execute` for MCP configs — change `[]` to `["*"]`
6. Handle nullable `weight` in API consumers
7. Fix any invalid WhiteList values (no mixing wildcards, no duplicates)
8. Migrate key management to dedicated `/api/providers/{provider}/keys` endpoints
## ✨ Features
- **Dedicated Provider Keys API** — Keys are now managed via `/api/providers/{provider}/keys` endpoints instead of being embedded in provider create/update payloads
- **Deny-by-Default Access Control** — Standardized empty array conventions across all allow-list fields; `[]` means deny all, `["*"]` means allow all
- **VK Provider Config Key Wildcards** — `key_ids` now supports `["*"]` wildcard to allow all keys; handler resolves wildcard to AllowAllKeys flag without DB key lookups
- **VK MCP Allow-List** — Virtual key MCP configs now act as an execution-time allow-list — tools not permitted by the VK are blocked at inference and MCP tool execution
- **MCP Virtual Key Assignment** — MCP configuration now supports assigning virtual keys with per-tool access control, with an option to allow MCP clients to run on all virtual keys
- **Disable Auto MCP Tool Injection** — Add option to disable automatic MCP tool injection per request
- **MCP Request-Level Extra Headers** — Support for request-level extra headers in MCP tool execution
- **MCP Gateway Filtering** — Support for `x-bf-mcp-include-clients` and `x-bf-mcp-include-tools` request headers to filter MCP tools/list response
- **Scoped Pricing Overrides** — Support for pricing overrides at a scoped level
- **StabilityAI on Bedrock** — Added StabilityAI provider support to Bedrock
- **Plugin Trace Logging** — Plugins can now inject logs at trace level using `ctx.Log(schemas.LogLevelInfo, "Test log")`
- **Blacklist Support in Load Balancing** — Added model blacklist support to the load balancing plugin
- **Adaptive Routing UI Redesign** — Redesigned adaptive routing UI with improved layout and Sankey chart visualization
- **Governance Refactor** — Governance module changes for improved structure
- **Compat Plugin New Modes** — Chat-to-responses fallback and OpenAI-compatible parameter dropping modes added to compat plugin
## 🐞 Fixed
- **MCP Agent Usage Accumulation** — Fixed accumulated usage not being sent back in MCP agent mode
- **OpenAI Transcription Formats** — Handle text, vtt, srt response formats in OpenAI transcription response
- **HuggingFace Load Balancing** — Removed HuggingFace deployment handling from load balancing plugin
- **Parallelized Model Listing** — Parallelized model listing for providers to speed up startup time
## 📀 Base OSS version
`transports/v1.5.0-prerelease1`
## 🔌 If you are compiling a plugin against this release, use the following deps
```
module github.com/maximhq/bifrost-enterprise
go 1.26.1
require (
cloud.google.com/go/bigquery v1.73.1
github.com/DataDog/datadog-go/v5 v5.6.0
github.com/DataDog/dd-trace-go/v2 v2.4.0
github.com/aws/aws-sdk-go-v2/config v1.32.11
github.com/aws/aws-sdk-go-v2/credentials v1.19.11
github.com/aws/aws-sdk-go-v2/service/bedrockruntime v1.50.1
github.com/bytedance/sonic v1.15.0
github.com/coreos/go-oidc/v3 v3.12.0
github.com/fasthttp/router v1.5.4
github.com/golang-jwt/jwt/v5 v5.3.0
github.com/google/cel-go v0.26.1
github.com/google/uuid v1.6.0
github.com/gorilla/websocket v1.5.4-0.20250319132907-e064f32e3674
github.com/grandcat/zeroconf v1.0.0
github.com/hashicorp/consul/api v1.22.0
github.com/hashicorp/memberlist v0.5.4
github.com/maximhq/bifrost/core v1.5.0
github.com/maximhq/bifrost/framework v1.3.0
github.com/maximhq/bifrost/plugins/governance v1.5.0
github.com/maximhq/bifrost/plugins/logging v1.5.0
github.com/maximhq/bifrost/transports v1.5.0-prerelease1
github.com/nakabonne/tstorage v0.3.6
github.com/stretchr/testify v1.11.1
github.com/testcontainers/testcontainers-go v0.40.0
github.com/tetratelabs/wazero v1.11.0
github.com/valyala/fasthttp v1.68.0
go.etcd.io/etcd/client/v3 v3.6.6
golang.org/x/crypto v0.49.0
golang.org/x/oauth2 v0.35.0
google.golang.org/api v0.265.0
google.golang.org/protobuf v1.36.11
gorm.io/driver/sqlite v1.6.0
gorm.io/gorm v1.31.1
k8s.io/api v0.34.1
k8s.io/apimachinery v0.34.1
k8s.io/client-go v0.34.1
)
```
</Update>


@@ -0,0 +1,107 @@
---
title: "v1.4.0-prerelease2"
description: "Enterprise v1.4.0-prerelease2"
---
<Update label="Bifrost Enterprise" description="v1.4.0-prerelease2">
## Changelog
This release introduces realtime (WebSocket/WebRTC) support, Fireworks AI as a new provider, a comprehensive SCIM provider expansion (Google Workspace, Keycloak, Zitadel, SailPoint), access profiles for fine-grained permission control, business units and teams for organizational hierarchy, a user ranking dashboard, and a guardrail verification flow.
## ✨ Features
- **Realtime Support** — WebSocket, WebRTC, and client secret handlers with session state management and transport context helpers for real-time streaming use cases
- **Fireworks AI Provider** — Fireworks AI added as a first-class provider with native completions, responses, embeddings, and image generations
- **Access Profiles** — Fine-grained permission control with access profiles for managing model access at team and business unit levels, including propagation dialogs and full CRUD UI
- **SCIM Provider Expansion** — Added support for Google Workspace, Keycloak, Zitadel, and SailPoint identity providers with full SCIM provisioning, attribute mapping, and sync workflows
- **Okta Custom Provider + Group Mapping** — Custom Okta provider configurations with attribute-to-role, team, and business unit mapping support
- **Business Units & Teams** — New organizational hierarchy for managing users with business units, teams, sync dialogs, and detail sheets
- **User Ranking Dashboard** — Dashboard for tracking and visualizing user activity and rankings
- **Guardrail Verify Flow** — Verify guardrail configurations against providers (Azure, Bedrock, GraySwan) before deployment
- **Per-User OAuth Consent** — Per-user OAuth consent flow with identity selection and MCP authentication
- **Prompts Plugin** — New prompts plugin with direct key header resolver and selective message inclusion when committing prompt sessions
- **Bedrock Embeddings & Image Gen** — Embeddings, image generation, edit, and variation support added to Bedrock provider
- **Logging Tracking Fields** — Support for tracking userId, teamId, customerId, and businessUnitId in logging plugin
- **Virtual Keys Export** — Sorting and CSV export added to virtual keys table
- **Path Whitelisting** — Allow path whitelisting from security config
- **Model Blacklist in Load Balancing** — Blacklist model support in the load balancing plugin to exclude specific models from routing
- **Cluster Leader Badge** — Leader badge display added to cluster node view
- **Server Bootstrap Timer** — Startup diagnostics with server bootstrap timer
## 🐞 Fixed
- **Traffic Distribution Label** — Added "last 10s" label to Traffic Distribution Sankey chart for clarity
- **Node ID Consistency** — Generate unique node ID on config load with minor consistency fixes
- **Leader Election Stability** — Increased leader election check interval to 10 seconds for improved stability
- **Bedrock Tool Choice** — Fix Bedrock tool choice conversion to auto
- **Bedrock Streaming Retries** — Retry retryable AWS exceptions and stale/closed-connection errors in Bedrock streaming
- **Bedrock SigV4 Service** — Correct SigV4 service name for agent runtime rerank
- **MCP Tool Logs** — Fix MCP tool logs not being captured correctly
- **Routing Rule Targets** — Preserve routing rule targets for genai and bedrock paths
- **Provider Budget Duplication** — Fix provider-level multiline budget duplication issue
- **Vertex Endpoint** — Fix Vertex endpoint correction
- **Gemini Thinking Budget** — Fix thinking budget validation for Gemini models
- **SQLite Migrations** — Fix SQLite migration connections, error handling, and disable foreign key checks during migration
- **Tool Parameter Schemas** — Preserve explicit empty tool parameter schemas for OpenAI passthrough
- **List Models Output** — Include raw model ID in list-models output alongside aliases
- **Config Schema** — Fix config schema for Bedrock key config
- **Data Race Fix** — Fix race in data reading from fasthttp request for integrations
- **Model Listing** — Unify /api/models and /api/models/details listing behavior
## 📀 Base OSS version
`transports/v1.5.0-prerelease2`
## 🔌 If you are compiling a plugin against this release, use the following deps
```
module github.com/maximhq/bifrost-enterprise
go 1.26.1
require (
cloud.google.com/go/bigquery v1.73.1
github.com/DataDog/datadog-go/v5 v5.6.0
github.com/DataDog/dd-trace-go/v2 v2.4.0
github.com/aws/aws-sdk-go-v2/config v1.32.11
github.com/aws/aws-sdk-go-v2/credentials v1.19.11
github.com/aws/aws-sdk-go-v2/service/bedrockruntime v1.50.1
github.com/bytedance/sonic v1.15.0
github.com/coreos/go-oidc/v3 v3.12.0
github.com/fasthttp/router v1.5.4
github.com/golang-jwt/jwt/v5 v5.3.0
github.com/google/cel-go v0.26.1
github.com/google/uuid v1.6.0
github.com/gorilla/websocket v1.5.4-0.20250319132907-e064f32e3674
github.com/grandcat/zeroconf v1.0.0
github.com/hashicorp/consul/api v1.22.0
github.com/hashicorp/memberlist v0.5.4
github.com/maximhq/bifrost/core v1.5.1
github.com/maximhq/bifrost/framework v1.3.1
github.com/maximhq/bifrost/plugins/governance v1.5.1
github.com/maximhq/bifrost/plugins/logging v1.5.1
github.com/maximhq/bifrost/transports v1.5.0-prerelease2
github.com/nakabonne/tstorage v0.3.6
github.com/stretchr/testify v1.11.1
github.com/testcontainers/testcontainers-go v0.40.0
github.com/tetratelabs/wazero v1.11.0
github.com/valyala/fasthttp v1.68.0
go.etcd.io/etcd/client/v3 v3.6.6
golang.org/x/crypto v0.49.0
golang.org/x/oauth2 v0.36.0
google.golang.org/api v0.265.0
google.golang.org/protobuf v1.36.11
gorm.io/driver/sqlite v1.6.0
gorm.io/gorm v1.31.1
k8s.io/api v0.34.1
k8s.io/apimachinery v0.34.1
k8s.io/client-go v0.34.1
)
```
</Update>


@@ -0,0 +1,64 @@
---
title: "v1.2.21"
description: "v1.2.21 changelog"
---
<Tabs>
<Tab title="NPX">
```bash
npx -y @maximhq/bifrost --transport-version v1.2.21
```
</Tab>
<Tab title="Docker">
```bash
docker pull maximhq/bifrost:v1.2.21
docker run -p 8080:8080 maximhq/bifrost:v1.2.21
```
</Tab>
</Tabs>
<Update label="Bifrost(HTTP)" description="v1.2.21">
- Fixes pricing computation for nested model names, e.g. `groq/openai/gpt-oss-20b`.
</Update>
<Update label="Framework" description="v1.2.21">
- Pricing module now accommodates nested model names; for example, `groq/openai/gpt-oss-20b` was previously skipped while computing costs.
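
A nested name such as `groq/openai/gpt-oss-20b` contains more than one `/`, so code that assumes exactly two segments can miss it. The sketch below shows one way to separate the leading provider from the rest of the name; `splitProviderModel` is a hypothetical helper, and the release notes do not say how the pricing module actually resolves such names.

```go
package main

import (
	"fmt"
	"strings"
)

// splitProviderModel splits "provider/model" at the first slash only, so any
// further slashes stay part of the model name. ok is false if there is no slash.
func splitProviderModel(name string) (provider, model string, ok bool) {
	return strings.Cut(name, "/")
}

func main() {
	provider, model, ok := splitProviderModel("groq/openai/gpt-oss-20b")
	fmt.Println(provider, model, ok) // groq openai/gpt-oss-20b true
}
```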
</Update>
<Update label="governance" description="v1.2.21">
- Upgrades framework to 1.0.23
</Update>
<Update label="jsonparser" description="v1.2.21">
- Upgrades framework to 1.0.23
</Update>
<Update label="logging" description="v1.2.21">
- Upgrades framework to 1.0.23
- Fixes pricing computation for nested model names.
</Update>
<Update label="maxim" description="v1.2.21">
- Upgrades framework to 1.0.23
</Update>
<Update label="mocker" description="v1.2.21">
- Upgrades framework to 1.0.23
</Update>
<Update label="semantic_cache" description="v1.2.21">
- Upgrades framework to 1.0.23
</Update>
<Update label="telemetry" description="v1.2.21">
- Upgrades framework to 1.0.23
</Update>


@@ -0,0 +1,78 @@
---
title: "v1.2.22"
description: "v1.2.22 changelog"
---
<Tabs>
<Tab title="NPX">
```bash
npx -y @maximhq/bifrost --transport-version v1.2.22
```
</Tab>
<Tab title="Docker">
```bash
docker pull maximhq/bifrost:v1.2.22
docker run -p 8080:8080 maximhq/bifrost:v1.2.22
```
</Tab>
</Tabs>
<Update label="Bifrost(HTTP)" description="v1.2.22">
- Fix: Users can now delete custom providers from the UI
- Fix: Token count no longer displays as N/A in certain streaming response cases
- Fix: Streaming responses now properly display errors on the UI instead of getting stuck in processing state
</Update>
<Update label="Core" description="v1.2.22">
- Fix: Updates token calculation for streaming responses. #520
</Update>
<Update label="Framework" description="v1.2.22">
- upgrade: core to 1.1.38
</Update>
<Update label="governance" description="v1.2.22">
- upgrade: core to 1.1.38
- upgrade: framework to 1.0.24
</Update>
<Update label="jsonparser" description="v1.2.22">
- upgrade: core to 1.1.38
- upgrade: framework to 1.0.24
</Update>
<Update label="logging" description="v1.2.22">
- fix: error logging for streaming and non-streaming responses.
- upgrade: core to 1.1.38
- upgrade: framework to 1.0.24
</Update>
<Update label="maxim" description="v1.2.22">
- upgrade: core to 1.1.38
- upgrade: framework to 1.0.24
</Update>
<Update label="mocker" description="v1.2.22">
- upgrade: core to 1.1.38
- upgrade: framework to 1.0.24
</Update>
<Update label="semantic_cache" description="v1.2.22">
- upgrade: core to 1.1.38
- upgrade: framework to 1.0.24
</Update>
<Update label="telemetry" description="v1.2.22">
- upgrade: core to 1.1.38
- upgrade: framework to 1.0.24
</Update>

View File

@@ -0,0 +1,76 @@
---
title: "v1.2.23"
description: "v1.2.23 changelog"
---
<Tabs>
<Tab title="NPX">
```bash
npx -y @maximhq/bifrost --transport-version v1.2.23
```
</Tab>
<Tab title="Docker">
```bash
docker pull maximhq/bifrost:v1.2.23
docker run -p 8080:8080 maximhq/bifrost:v1.2.23
```
</Tab>
</Tabs>
<Update label="Bifrost(HTTP)" description="v1.2.23">
- Fix: Fixes editing experience of weight for API keys.
</Update>
<Update label="Core" description="v1.2.23">
- Fix: Updates token calculation for streaming responses. #520
</Update>
<Update label="Framework" description="v1.2.23">
- upgrade: core upgrades to 1.1.38
</Update>
<Update label="governance" description="v1.2.23">
- upgrade: core to 1.1.38
- upgrade: framework to 1.0.24
</Update>
<Update label="jsonparser" description="v1.2.23">
- upgrade: core to 1.1.38
- upgrade: framework to 1.0.24
</Update>
<Update label="logging" description="v1.2.23">
- fix: fixes error logging for streaming and non-streaming responses.
- upgrade: core to 1.1.38
- upgrade: framework to 1.0.24
</Update>
<Update label="maxim" description="v1.2.23">
- upgrade: core to 1.1.38
- upgrade: framework to 1.0.24
</Update>
<Update label="mocker" description="v1.2.23">
- upgrade: core to 1.1.38
- upgrade: framework to 1.0.24
</Update>
<Update label="semantic_cache" description="v1.2.23">
- upgrade: core to 1.1.38
- upgrade: framework to 1.0.24
</Update>
<Update label="telemetry" description="v1.2.23">
- upgrade: core to 1.1.38
- upgrade: framework to 1.0.24
</Update>

View File

@@ -0,0 +1,77 @@
---
title: "v1.2.24"
description: "v1.2.24 changelog"
---
<Tabs>
<Tab title="NPX">
```bash
npx -y @maximhq/bifrost --transport-version v1.2.24
```
</Tab>
<Tab title="Docker">
```bash
docker pull maximhq/bifrost:v1.2.24
docker run -p 8080:8080 maximhq/bifrost:v1.2.24
```
</Tab>
</Tabs>
<Update label="Bifrost(HTTP)" description="v1.2.24">
- Fix: Adds `Base URL` input in custom provider creation dialog.
- Fix: Fixes `x` button getting hidden behind dialog header.
</Update>
<Update label="Core" description="v1.2.24">
- Fix: Updates token calculation for streaming responses. #520
</Update>
<Update label="Framework" description="v1.2.24">
- upgrade: core upgrades to 1.1.38
</Update>
<Update label="governance" description="v1.2.24">
- upgrade: core to 1.1.38
- upgrade: framework to 1.0.24
</Update>
<Update label="jsonparser" description="v1.2.24">
- upgrade: core to 1.1.38
- upgrade: framework to 1.0.24
</Update>
<Update label="logging" description="v1.2.24">
- fix: fixes error logging for streaming and non-streaming responses.
- upgrade: core to 1.1.38
- upgrade: framework to 1.0.24
</Update>
<Update label="maxim" description="v1.2.24">
- upgrade: core to 1.1.38
- upgrade: framework to 1.0.24
</Update>
<Update label="mocker" description="v1.2.24">
- upgrade: core to 1.1.38
- upgrade: framework to 1.0.24
</Update>
<Update label="semantic_cache" description="v1.2.24">
- upgrade: core to 1.1.38
- upgrade: framework to 1.0.24
</Update>
<Update label="telemetry" description="v1.2.24">
- upgrade: core to 1.1.38
- upgrade: framework to 1.0.24
</Update>

View File

@@ -0,0 +1,95 @@
---
title: "v1.3.0-prerelease1"
description: "v1.3.0-prerelease1 changelog"
---
<Tabs>
<Tab title="NPX">
```bash
npx -y @maximhq/bifrost --transport-version v1.3.0-prerelease1
```
</Tab>
<Tab title="Docker">
```bash
docker pull maximhq/bifrost:v1.3.0-prerelease1
docker run -p 8080:8080 maximhq/bifrost:v1.3.0-prerelease1
```
</Tab>
</Tabs>
<Update label="Bifrost(HTTP)" description="v1.3.0-prerelease1">
- Fix: Token count no longer displays as N/A in certain streaming response cases
- Fix: Streaming responses now properly display errors on the UI instead of getting stuck in processing state
- Feat: UI for configuring external observability connectors
- Feat: OTLP collector
- Feat: UI-driven Maxim observability configuration
- Fix: Fixes Bifrost specific error logging in first party and third party logging plugins
</Update>
<Update label="Core" description="v1.3.0-prerelease1">
- Feature: Adds dynamic reloads for plugins. This removes the requirement for restarts when updating plugins.
- Feature: Adds responses API support.
- This release contains multiple breaking changes for Bifrost Core. These were necessary to ensure we incorporate responses without compromising on speed or architecture.
</Update>
<Update label="Framework" description="v1.3.0-prerelease1">
- Chore: Adds ctx to each function to gracefully shut down ongoing tasks and improve concurrency management.
- Fix: Fixes pricing sync to make sure latest updates are synced at every restart.
- Feat: Adds new accumulator for accumulating all streaming responses from LLMs.
</Update>
<Update label="governance" description="v1.3.0-prerelease1">
- Feat: Bifrost now supports provider-level fallbacks.
- Chore: Dependency upgrades
</Update>
<Update label="jsonparser" description="v1.3.0-prerelease1">
- Upgrade dependency: core to 1.2.0
</Update>
<Update label="logging" description="v1.3.0-prerelease1">
- Fix: Captures Bifrost-specific errors in logs (e.g. provider not configured)
- Fix: Fixes audio streaming captures
- Upgrade dependency: core to 1.2.0
- Upgrade dependency: framework to 1.1.0
</Update>
<Update label="maxim" description="v1.3.0-prerelease1">
- Fix: Maxim plugin now captures Bifrost gateway specific errors.
- Upgrade dependency: maxim-go to 0.1.11
- Upgrade dependency: core to 1.2.0
- Upgrade dependency: framework to 1.1.0
</Update>
<Update label="mocker" description="v1.3.0-prerelease1">
- Upgrade dependency: core to 1.2.0
- Upgrade dependency: framework to 1.1.0
</Update>
<Update label="otel" description="v1.3.0-prerelease1">
- First version cut 🚀
- Feature: Support OTLP collector over HTTP or gRPC protocol.
</Update>
<Update label="semantic_cache" description="v1.3.0-prerelease1">
- Feat: Adds support for Responses and Text completions
- Upgrade dependency: core to 1.2.0
- Upgrade dependency: framework to 1.1.0
</Update>
<Update label="telemetry" description="v1.3.0-prerelease1">
- Fix: Adds support for Responses and Text completions.
- Upgrade dependency: core to 1.2.0
- Upgrade dependency: framework to 1.2.0
</Update>

View File

@@ -0,0 +1,83 @@
---
title: "v1.3.0-prerelease2"
description: "v1.3.0-prerelease2 changelog"
---
<Tabs>
<Tab title="NPX">
```bash
npx -y @maximhq/bifrost --transport-version v1.3.0-prerelease2
```
</Tab>
<Tab title="Docker">
```bash
docker pull maximhq/bifrost:v1.3.0-prerelease2
docker run -p 8080:8080 maximhq/bifrost:v1.3.0-prerelease2
```
</Tab>
</Tabs>
<Update label="Bifrost(HTTP)" description="v1.3.0-prerelease2">
- Added specific error handling for timeout scenarios (context.Canceled, context.DeadlineExceeded, fasthttp.ErrTimeout) across all providers
- Created a dedicated error message for timeouts that guides users to adjust the timeout setting
- Fixed validation in HTTP handlers for embeddings, speech, and text completion requests
- Improved CORS wildcard pattern matching to support domain patterns like *.example.com
- Fixed issues in the logging plugin to properly handle text completion responses
- Enhanced UI form handling for network configuration with proper default values
- Feat: Adds Text Completion Streaming support
</Update>
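As a rough illustration of the CORS wildcard matching mentioned above, here is a minimal sketch of how a pattern like `*.example.com` could be checked against a request origin. `matchOrigin` and `hostOf` are hypothetical names, and the rules (a bare `*` matches everything, `*.domain` matches subdomains only, ports are not handled) are assumptions rather than Bifrost's actual behaviour.

```go
package main

import (
	"fmt"
	"strings"
)

// hostOf strips an optional "scheme://" prefix from an origin.
func hostOf(origin string) string {
	if _, rest, ok := strings.Cut(origin, "://"); ok {
		return rest
	}
	return origin
}

// matchOrigin is a hypothetical matcher for illustration only.
// "*" allows any origin, "*.example.com" allows any subdomain of
// example.com, and anything else must match the origin exactly.
func matchOrigin(pattern, origin string) bool {
	if pattern == "*" {
		return true
	}
	if strings.HasPrefix(pattern, "*.") {
		suffix := pattern[1:] // ".example.com"
		host := hostOf(origin)
		// Require at least one character before the suffix so that the
		// bare domain and look-alike domains do not match.
		return len(host) > len(suffix) && strings.HasSuffix(host, suffix)
	}
	return pattern == origin
}

func main() {
	for _, origin := range []string{
		"https://app.example.com",
		"https://example.com",
		"https://evilexample.com",
	} {
		fmt.Println(origin, matchOrigin("*.example.com", origin))
	}
}
```

Keeping the leading dot in the suffix is what stops `evilexample.com` from matching a pattern intended for subdomains of `example.com`.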
<Update label="Core" description="v1.3.0-prerelease2">
- Added specific error handling for timeout scenarios (context.Canceled, context.DeadlineExceeded, fasthttp.ErrTimeout) across all providers
- Created a dedicated error message for timeouts that guides users to adjust the timeout setting
- Added Text Completion Streaming support
</Update>
<Update label="Framework" description="v1.3.0-prerelease2">
- Feat: Adds Text Completion Streaming support
</Update>
<Update label="governance" description="v1.3.0-prerelease2">
- Chore: using core 1.2.1 and framework 1.1.1
</Update>
<Update label="jsonparser" description="v1.3.0-prerelease2">
- Upgrade dependency: core to 1.2.1 and framework to 1.1.1
</Update>
<Update label="logging" description="v1.3.0-prerelease2">
- Feat: Adds Text Completion Streaming support
- Upgrade dependency: core to 1.2.1 and framework to 1.1.1
</Update>
<Update label="maxim" description="v1.3.0-prerelease2">
- Upgrade dependency: core to 1.2.1 and framework to 1.1.1
</Update>
<Update label="mocker" description="v1.3.0-prerelease2">
- Upgrade dependency: core to 1.2.1 and framework to 1.1.1
</Update>
<Update label="otel" description="v1.3.0-prerelease2">
- Upgrade dependency: core to 1.2.1 and framework to 1.1.1
</Update>
<Update label="semantic_cache" description="v1.3.0-prerelease2">
- Feat: Adds Text Completion Streaming support
- Upgrade dependency: core to 1.2.1 and framework to 1.1.1
</Update>
<Update label="telemetry" description="v1.3.0-prerelease2">
- Upgrade dependency: core to 1.2.1 and framework to 1.1.1
</Update>

View File

@@ -0,0 +1,74 @@
---
title: "v1.3.0-prerelease3"
description: "v1.3.0-prerelease3 changelog"
---
<Tabs>
<Tab title="NPX">
```bash
npx -y @maximhq/bifrost --transport-version v1.3.0-prerelease3
```
</Tab>
<Tab title="Docker">
```bash
docker pull maximhq/bifrost:v1.3.0-prerelease3
docker run -p 8080:8080 maximhq/bifrost:v1.3.0-prerelease3
```
</Tab>
</Tabs>
<Update label="Bifrost(HTTP)" description="v1.3.0-prerelease3">
- Fix: Fixes string input support for responses requests.
- Feat: Adds responses endpoint to openai integration.
</Update>
<Update label="Core" description="v1.3.0-prerelease3">
- Fix: Added string input transformation for responses requests.
</Update>
<Update label="Framework" description="v1.3.0-prerelease3">
- Chore: core upgrades to 1.2.2
</Update>
<Update label="governance" description="v1.3.0-prerelease3">
- Chore: using core 1.2.2 and framework 1.1.2
</Update>
<Update label="jsonparser" description="v1.3.0-prerelease3">
- Upgrade dependency: core to 1.2.2 and framework to 1.1.2
</Update>
<Update label="logging" description="v1.3.0-prerelease3">
- Upgrade dependency: core to 1.2.2 and framework to 1.1.2
</Update>
<Update label="maxim" description="v1.3.0-prerelease3">
- Upgrade dependency: core to 1.2.2 and framework to 1.1.2
</Update>
<Update label="mocker" description="v1.3.0-prerelease3">
- Upgrade dependency: core to 1.2.2 and framework to 1.1.2
</Update>
<Update label="otel" description="v1.3.0-prerelease3">
- Upgrade dependency: core to 1.2.2 and framework to 1.1.2
</Update>
<Update label="semantic_cache" description="v1.3.0-prerelease3">
- Upgrade dependency: core to 1.2.2 and framework to 1.1.2
</Update>
<Update label="telemetry" description="v1.3.0-prerelease3">
- Upgrade dependency: core to 1.2.2 and framework to 1.1.2
</Update>

View File

@@ -0,0 +1,73 @@
---
title: "v1.3.0-prerelease4"
description: "v1.3.0-prerelease4 changelog"
---
<Tabs>
<Tab title="NPX">
```bash
npx -y @maximhq/bifrost --transport-version v1.3.0-prerelease4
```
</Tab>
<Tab title="Docker">
```bash
docker pull maximhq/bifrost:v1.3.0-prerelease4
docker run -p 8080:8080 maximhq/bifrost:v1.3.0-prerelease4
```
</Tab>
</Tabs>
<Update label="Bifrost(HTTP)" description="v1.3.0-prerelease4">
- Feat: A new config called `Enable LiteLLM Fallback` that enables text_completion calls to fall back to chat_completions calls for the Groq provider. This is an anti-pattern, but we are adding this to help users migrate from LiteLLM easily. Reach out to us if you want us to enable any other quirky patterns LiteLLM has.
</Update>
<Update label="Core" description="v1.3.0-prerelease4">
- Feat: Adds LiteLLM-specific fallbacks for text completion for Groq. This supports, out of the box, users whose codebases rely on this anti-pattern.
</Update>
<Update label="Framework" description="v1.3.0-prerelease4">
- Chore: core upgrades to 1.2.3
</Update>
<Update label="governance" description="v1.3.0-prerelease4">
- Chore: core upgrades to 1.2.3
</Update>
<Update label="jsonparser" description="v1.3.0-prerelease4">
- Chore: core upgrades to 1.2.3
</Update>
<Update label="logging" description="v1.3.0-prerelease4">
- Chore: core upgrades to 1.2.3
</Update>
<Update label="maxim" description="v1.3.0-prerelease4">
- Chore: core upgrades to 1.2.3
</Update>
<Update label="mocker" description="v1.3.0-prerelease4">
- Chore: core upgrades to 1.2.3
</Update>
<Update label="otel" description="v1.3.0-prerelease4">
- Chore: core upgrades to 1.2.3
</Update>
<Update label="semantic_cache" description="v1.3.0-prerelease4">
- Chore: core upgrades to 1.2.3
</Update>
<Update label="telemetry" description="v1.3.0-prerelease4">
- Chore: core upgrades to 1.2.3
</Update>

View File

@@ -0,0 +1,76 @@
---
title: "v1.3.0-prerelease5"
description: "v1.3.0-prerelease5 changelog"
---
<Tabs>
<Tab title="NPX">
```bash
npx -y @maximhq/bifrost --transport-version v1.3.0-prerelease5
```
</Tab>
<Tab title="Docker">
```bash
docker pull maximhq/bifrost:v1.3.0-prerelease5
docker run -p 8080:8080 maximhq/bifrost:v1.3.0-prerelease5
```
</Tab>
</Tabs>
<Update label="Bifrost(HTTP)" description="v1.3.0-prerelease5">
- Fix: Anthropic tool results aggregation logic (core 1.2.4)
- Feat: Raw response saved in logs (framework 1.1.4)
</Update>
<Update label="Core" description="v1.3.0-prerelease5">
- Fix: Anthropic tool results aggregation logic.
</Update>
<Update label="Framework" description="v1.3.0-prerelease5">
- Feat: Raw response saved in logs.
- Upgrade dependency: core to 1.2.4
</Update>
<Update label="governance" description="v1.3.0-prerelease5">
- Chore: using core 1.2.4 and framework 1.1.4
</Update>
<Update label="jsonparser" description="v1.3.0-prerelease5">
- Upgrade dependency: core to 1.2.4 and framework to 1.1.4
</Update>
<Update label="logging" description="v1.3.0-prerelease5">
- Feat: Raw response saved in logs.
- Upgrade dependency: core to 1.2.4 and framework to 1.1.4
</Update>
<Update label="maxim" description="v1.3.0-prerelease5">
- Upgrade dependency: core to 1.2.4 and framework to 1.1.4
</Update>
<Update label="mocker" description="v1.3.0-prerelease5">
- Upgrade dependency: core to 1.2.4 and framework to 1.1.4
</Update>
<Update label="otel" description="v1.3.0-prerelease5">
- Upgrade dependency: core to 1.2.4 and framework to 1.1.4
</Update>
<Update label="semantic_cache" description="v1.3.0-prerelease5">
- Upgrade dependency: core to 1.2.4 and framework to 1.1.4
</Update>
<Update label="telemetry" description="v1.3.0-prerelease5">
- Upgrade dependency: core to 1.2.4 and framework to 1.1.4
</Update>

View File

@@ -0,0 +1,87 @@
---
title: "v1.3.0-prerelease6"
description: "v1.3.0-prerelease6 changelog"
---
<Tabs>
<Tab title="NPX">
```bash
npx -y @maximhq/bifrost --transport-version v1.3.0-prerelease6
```
</Tab>
<Tab title="Docker">
```bash
docker pull maximhq/bifrost:v1.3.0-prerelease6
docker run -p 8080:8080 maximhq/bifrost:v1.3.0-prerelease6
```
</Tab>
</Tabs>
<Update label="Bifrost(HTTP)" description="v1.3.0-prerelease6">
- Upgrade dependency: core to 1.2.6 and framework to 1.1.6
- Feat: Added Anthropic thinking parameter in responses API.
- Feat: Added Anthropic text completion integration support.
- Fix: Extra fields sent back in streaming responses.
- Feat: Latency for all request types (with inter token latency for streaming requests) sent back in Extra fields.
- Feat: UI websocket implementation generalized.
- Feat: TransportInterceptor interface added to plugins.
- Fix: Middlewares added to integrations route.
</Update>
<Update label="Core" description="v1.3.0-prerelease6">
- Feat: Stream token latency sent back in extra fields.
- Feat: Plugin interface extended with TransportInterceptor method.
- Feat: Add Anthropic thinking parameter
- Feat: Add Custom key selector logic and send back request latency in extra fields.
- Bug: Fixed fallbacks occasionally not working.
</Update>
<Update label="Framework" description="v1.3.0-prerelease6">
- Upgrade dependency: core to 1.2.5
- Feat: User table added to config store.
</Update>
<Update label="governance" description="v1.3.0-prerelease6">
- Chore: using core 1.2.5 and framework 1.1.5
- Feat: Added provider routing TransportInterceptor.
</Update>
<Update label="jsonparser" description="v1.3.0-prerelease6">
- Upgrade dependency: core to 1.2.6 and framework to 1.1.6
</Update>
<Update label="logging" description="v1.3.0-prerelease6">
- Upgrade dependency: core to 1.2.6 and framework to 1.1.6
</Update>
<Update label="maxim" description="v1.3.0-prerelease6">
- Upgrade dependency: core to 1.2.6 and framework to 1.1.6
</Update>
<Update label="mocker" description="v1.3.0-prerelease6">
- Upgrade dependency: core to 1.2.6 and framework to 1.1.6
</Update>
<Update label="otel" description="v1.3.0-prerelease6">
- Upgrade dependency: core to 1.2.6 and framework to 1.1.6
</Update>
<Update label="semantic_cache" description="v1.3.0-prerelease6">
- Upgrade dependency: core to 1.2.6 and framework to 1.1.6
</Update>
<Update label="telemetry" description="v1.3.0-prerelease6">
- Upgrade dependency: core to 1.2.6 and framework to 1.1.6
- Feat: Added First Token and Inter Token latency metrics for streaming requests.
</Update>

View File

@@ -0,0 +1,81 @@
---
title: "v1.3.0-prerelease7"
description: "v1.3.0-prerelease7 changelog"
---
<Tabs>
<Tab title="NPX">
```bash
npx -y @maximhq/bifrost --transport-version v1.3.0-prerelease7
```
</Tab>
<Tab title="Docker">
```bash
docker pull maximhq/bifrost:v1.3.0-prerelease7
docker run -p 8080:8080 maximhq/bifrost:v1.3.0-prerelease7
```
</Tab>
</Tabs>
<Update label="Bifrost(HTTP)" description="v1.3.0-prerelease7">
- Upgrade dependency: core to 1.2.6 and framework to 1.1.6
- Added Responses streaming across all providers.
- Fixed bedrock chat streaming decoding issues.
- Added raw response support for all streaming requests.
- Removed last token's accumulated latency from inter token latency metric.
</Update>
<Update label="Core" description="v1.3.0-prerelease7">
- Feat: Responses streaming added across all providers.
- Fix: Bedrock chat streaming decoding fixes.
- Feat: Added raw response support for all streaming requests.
</Update>
<Update label="Framework" description="v1.3.0-prerelease7">
- Upgrade dependency: core to 1.2.6
- Feat: Moved the migrator package to a more general location and added database migrations for the logstore to standardize object type values.
</Update>
<Update label="governance" description="v1.3.0-prerelease7">
- Chore: using core 1.2.6 and framework 1.1.6
</Update>
<Update label="jsonparser" description="v1.3.0-prerelease7">
- Upgrade dependency: core to 1.2.6 and framework to 1.1.6
</Update>
<Update label="logging" description="v1.3.0-prerelease7">
- Upgrade dependency: core to 1.2.6 and framework to 1.1.6
</Update>
<Update label="maxim" description="v1.3.0-prerelease7">
- Upgrade dependency: core to 1.2.6 and framework to 1.1.6
</Update>
<Update label="mocker" description="v1.3.0-prerelease7">
- Upgrade dependency: core to 1.2.6 and framework to 1.1.6
</Update>
<Update label="otel" description="v1.3.0-prerelease7">
- Upgrade dependency: core to 1.2.6 and framework to 1.1.6
</Update>
<Update label="semantic_cache" description="v1.3.0-prerelease7">
- Upgrade dependency: core to 1.2.6 and framework to 1.1.6
</Update>
<Update label="telemetry" description="v1.3.0-prerelease7">
- Upgrade dependency: core to 1.2.6 and framework to 1.1.6
- Fix: Removed last token's accumulated latency from inter token latency metric.
</Update>

119
docs/changelogs/v1.3.0.mdx Normal file
View File

@@ -0,0 +1,119 @@
---
title: "v1.3.0"
description: "v1.3.0 changelog"
---
<Tabs>
<Tab title="NPX">
```bash
npx -y @maximhq/bifrost --transport-version v1.3.0
```
</Tab>
<Tab title="Docker">
```bash
docker pull maximhq/bifrost:v1.3.0
docker run -p 8080:8080 maximhq/bifrost:v1.3.0
```
</Tab>
</Tabs>
<Update label="Bifrost(HTTP)" description="v1.3.0">
We're excited to ship v1.3.0 with major quality, compatibility, and governance upgrades across OSS and Enterprise.
### 🌟 Highlights
- OTel traces support (OSS): First-class support for OTLP collectors.
- Responses API (OSS): First-class support for the OpenAI-style Responses format, streaming + non-streaming.
- Drop-in for LiteLLM (OSS): Config-level fallbacks to ease migrations.
- Guardrails (Enterprise): Initial set with AWS Bedrock, Azure Content Moderator, and Patronus AI.
- Provisioning (Enterprise): Okta SCIM now supported alongside Microsoft Entra.
- Adaptive LB Dashboard (Enterprise, beta): Live traffic, weight shifts, and failover visibility.
### Features
- Added Anthropic thinking parameter in Responses API.
- Added Anthropic text completion integration support.
- Latency metrics for all request types now returned in extra (includes inter-token latency for streaming).
- TransportInterceptor interface added to plugins.
- Raw provider response saved in logs (framework v1.1.4).
### Fixes
- Removed extra fields erroneously sent in streaming responses.
- Anthropic tool results aggregation corrected (core v1.2.4).
- String input support fixed for Responses requests.
- Specific timeout error handling across all providers for context.Canceled, context.DeadlineExceeded, and fasthttp.ErrTimeout.
- Pricing manager fixes.
### Improvements
- CORS wildcard matching improved to support domain patterns like *.example.com.
## Closed tickets
- [#605: [Bug]: UI Docker building errors](https://github.com/maximhq/bifrost/issues/605)
- [#597: [Bug Report] Bedrock streaming has many missing chunks](https://github.com/maximhq/bifrost/issues/597)
- [#567: Handling reasoning content](https://github.com/maximhq/bifrost/issues/567)
- [#565: The "pricing not found for model ..." message is repeated for each request processed, which is too noisy for the warn level.](https://github.com/maximhq/bifrost/issues/565)
- [#552: [Bug]: "index" not specified for tool calls in OpenAI chunks](https://github.com/maximhq/bifrost/issues/552)
- [#543: [Bug]: Indicate timeouts in error response while logging](https://github.com/maximhq/bifrost/issues/543)
- [#542: [Feature]: Logs should show timestamps in browser timezone](https://github.com/maximhq/bifrost/issues/542)
- [#520: [Bug]: tokens and cost for "Chat Stream" requests is missing in logs](https://github.com/maximhq/bifrost/issues/520)
- [#516: [Bug]: Can't delete custom provider from Web UI](https://github.com/maximhq/bifrost/issues/516)
- [#504: [Bug]: cannot use self-hosted SGLang instance with http:// URLs only](https://github.com/maximhq/bifrost/issues/504)
- [#497: [Feature]: Add full support for standard OpenTelemetry GenAI Observability](https://github.com/maximhq/bifrost/issues/497)
- [#479: [Feature]: Support for API Key Authentication in Bedrock](https://github.com/maximhq/bifrost/issues/479)
- [#463: [Feature]: Support for Thinking blocks](https://github.com/maximhq/bifrost/issues/463)
- [#456: [Docs]: Update API reference docs](https://github.com/maximhq/bifrost/issues/456)
- [#451: [Feature]: Offline usage](https://github.com/maximhq/bifrost/issues/451)
</Update>
<Update label="Core" description="v1.3.0">
- Refactor: Bifrost Response structure segregated.
</Update>
<Update label="Framework" description="v1.3.0">
- Upgrade dependency: core to 1.2.7
- Fix: Added missing migration for `parent_request_id_column` in logs table.
</Update>
<Update label="governance" description="v1.3.0">
- Chore: using core 1.2.7 and framework 1.1.7
</Update>
<Update label="jsonparser" description="v1.3.0">
- Upgrade dependency: core to 1.2.7 and framework to 1.1.7
</Update>
<Update label="logging" description="v1.3.0">
- Upgrade dependency: core to 1.2.7 and framework to 1.1.7
</Update>
<Update label="maxim" description="v1.3.0">
- Upgrade dependency: core to 1.2.7 and framework to 1.1.7
</Update>
<Update label="mocker" description="v1.3.0">
- Upgrade dependency: core to 1.2.7 and framework to 1.1.7
</Update>
<Update label="otel" description="v1.3.0">
- Upgrade dependency: core to 1.2.6 and framework to 1.1.6
</Update>
<Update label="semantic_cache" description="v1.3.0">
- Upgrade dependency: core to 1.2.7 and framework to 1.1.7
</Update>
<Update label="telemetry" description="v1.3.0">
- Upgrade dependency: core to 1.2.7 and framework to 1.1.7
</Update>

View File

@@ -0,0 +1,74 @@
---
title: "v1.3.1"
description: "v1.3.1 changelog"
---
<Tabs>
<Tab title="NPX">
```bash
npx -y @maximhq/bifrost --transport-version v1.3.1
```
</Tab>
<Tab title="Docker">
```bash
docker pull maximhq/bifrost:v1.3.1
docker run -p 8080:8080 maximhq/bifrost:v1.3.1
```
</Tab>
</Tabs>
<Update label="Bifrost(HTTP)" description="v1.3.1">
- Bug: "x-bf-vk" missing error fixed.
</Update>
<Update label="Core" description="v1.3.1">
- Refactor: Bifrost Response structure segregated.
</Update>
<Update label="Framework" description="v1.3.1">
- Upgrade dependency: core to 1.2.7
- Fix: Added missing migration for `parent_request_id_column` in logs table.
</Update>
<Update label="governance" description="v1.3.1">
- Chore: taking context key from core package instead of governance package
</Update>
<Update label="jsonparser" description="v1.3.1">
- Upgrade dependency: core to 1.2.7 and framework to 1.1.7
</Update>
<Update label="logging" description="v1.3.1">
- Upgrade dependency: core to 1.2.7 and framework to 1.1.7
</Update>
<Update label="maxim" description="v1.3.1">
- Upgrade dependency: core to 1.2.7 and framework to 1.1.7
</Update>
<Update label="mocker" description="v1.3.1">
- Upgrade dependency: core to 1.2.7 and framework to 1.1.7
</Update>
<Update label="otel" description="v1.3.1">
- Upgrade dependency: core to 1.2.6 and framework to 1.1.6
</Update>
<Update label="semantic_cache" description="v1.3.1">
- Upgrade dependency: core to 1.2.7 and framework to 1.1.7
</Update>
<Update label="telemetry" description="v1.3.1">
- Upgrade dependency: core to 1.2.7 and framework to 1.1.7
</Update>

View File

@@ -0,0 +1,93 @@
---
title: "v1.3.10"
description: "v1.3.10 changelog"
---
<Tabs>
<Tab title="NPX">
```bash
npx -y @maximhq/bifrost --transport-version v1.3.10
```
</Tab>
<Tab title="Docker">
```bash
docker pull maximhq/bifrost:v1.3.10
docker run -p 8080:8080 maximhq/bifrost:v1.3.10
```
</Tab>
</Tabs>
<Update label="Bifrost(HTTP)" description="v1.3.10">
- chore: version update core to 1.2.13 and framework to 1.1.15
- feat: added headers support for OTel configuration. Values prefixed with `env.` are read from environment variables (for example, `env.ENV_VAR_NAME`)
- feat: emission of OTel resource spans is completely async - this brings inference overhead down to under 1 µs
- fix: added latency calculation for vertex native requests
- feat: added cached tokens and reasoning tokens to the usage in ui
- fix: cost calculation for vertex requests
- feat: added global region support for vertex API
- fix: added filter for extra fields in chat completions request for Mistral provider
- fix: added wildcard validation for allowed origins in UI security settings
- fix: fixed code field in pending_safety_checks for Responses API
</Update>
<Update label="Core" description="v1.3.10">
- bug: fixed embedding request not being handled in `GetExtraFields()` method of `BifrostResponse`
- fix: added latency calculation for vertex native requests
- feat: added cached tokens and reasoning tokens to the usage metadata for chat completions
- feat: added global region support for vertex API
- fix: added filter for extra fields in chat completions request for Mistral provider
- fix: fixed ResponsesComputerToolCallPendingSafetyCheck code field
</Update>
<Update label="Framework" description="v1.3.10">
- chore: version update core to 1.2.13
- feat: added support for vertex provider/model format in pricing lookup
</Update>
<Update label="governance" description="v1.3.10">
- chore: version update core to 1.2.13 and framework to 1.1.15
</Update>
<Update label="jsonparser" description="v1.3.10">
- chore: version update core to 1.2.13 and framework to 1.1.15
</Update>
<Update label="logging" description="v1.3.10">
- chore: version update core to 1.2.13 and framework to 1.1.15
</Update>
<Update label="maxim" description="v1.3.10">
- chore: version update core to 1.2.13 and framework to 1.1.15
</Update>
<Update label="mocker" description="v1.3.10">
- chore: version update core to 1.2.13 and framework to 1.1.15
- feat: added support for responses request
- feat: added "skip-mocker" context key to skip mocker plugin per request
</Update>
<Update label="otel" description="v1.3.10">
- chore: version update core to 1.2.13 and framework to 1.1.15
- feat: added headers support for OTel configuration. Values prefixed with `env.` are read from environment variables (for example, `env.ENV_VAR_NAME`)
- feat: emission of OTel resource spans is completely async - this brings inference overhead down to under 1 µs
</Update>
<Update label="semantic_cache" description="v1.3.10">
- chore: version update core to 1.2.13 and framework to 1.1.15
- tests: added mocker plugin to all chat/responses tests
</Update>
<Update label="telemetry" description="v1.3.10">
- chore: version update core to 1.2.13 and framework to 1.1.15
</Update>

View File

@@ -0,0 +1,75 @@
---
title: "v1.3.11"
description: "v1.3.11 changelog"
---
<Tabs>
<Tab title="NPX">
```bash
npx -y @maximhq/bifrost --transport-version v1.3.11
```
</Tab>
<Tab title="Docker">
```bash
docker pull maximhq/bifrost:v1.3.11
docker run -p 8080:8080 maximhq/bifrost:v1.3.11
```
</Tab>
</Tabs>
<Update label="Bifrost(HTTP)" description="v1.3.11">
- chore: version update core to 1.2.14 and framework to 1.1.16
- feat: added `/v1/models` endpoint to list models of configured providers
</Update>
<Update label="Core" description="v1.3.11">
- feat: added ListModels method to Provider interface
- feat: enabled provider tracking in Bifrost core for API exposure
</Update>
<Update label="Framework" description="v1.3.11">
- chore: version update core to 1.2.14
</Update>
<Update label="governance" description="v1.3.11">
- chore: version update core to 1.2.14 and framework to 1.1.16
</Update>
<Update label="jsonparser" description="v1.3.11">
- chore: version update core to 1.2.14 and framework to 1.1.16
</Update>
<Update label="logging" description="v1.3.11">
- chore: version update core to 1.2.14 and framework to 1.1.16
</Update>
<Update label="maxim" description="v1.3.11">
- chore: version update core to 1.2.14 and framework to 1.1.16
</Update>
<Update label="mocker" description="v1.3.11">
- chore: version update core to 1.2.14 and framework to 1.1.16
</Update>
<Update label="otel" description="v1.3.11">
- chore: version update core to 1.2.14 and framework to 1.1.16
</Update>
<Update label="semantic_cache" description="v1.3.11">
- chore: version update core to 1.2.14 and framework to 1.1.16
</Update>
<Update label="telemetry" description="v1.3.11">
- chore: version update core to 1.2.14 and framework to 1.1.16
</Update>

View File

@@ -0,0 +1,89 @@
---
title: "v1.3.12"
description: "v1.3.12 changelog"
---
<Tabs>
<Tab title="NPX">
```bash
npx -y @maximhq/bifrost --transport-version v1.3.12
```
</Tab>
<Tab title="Docker">
```bash
docker pull maximhq/bifrost:v1.3.12
docker run -p 8080:8080 maximhq/bifrost:v1.3.12
```
</Tab>
</Tabs>
<Update label="Bifrost(HTTP)" description="v1.3.12">
- chore: version update core to 1.2.15 and framework to 1.1.17
- feat: add azure provider native responses API support
- chore: suppress irrelevant warnings in ListModels
- feat: refactored all plugin operations to be fully asynchronous to prevent any blocking behavior
- feat: added provider level budget and rate limits using virtual keys
- feat: added streaming support in maxim plugin
</Update>
<Update label="Core" description="v1.3.12">
- feat: add azure provider native responses API support
- feat: improve retry logic for rate limiting errors
- feat: add retries on list models request
- chore: suppress irrelevant warnings in ListModels
</Update>
<Update label="Framework" description="v1.3.12">
- chore: version update core to 1.2.15
- [BREAKING] feat: renamed pricing module to modelcatalog and added list models population support for model pool
- feat: added chunk index based sorting for streaming responses in streaming package
- feat: added budget and rate limit to provider configs in virtual key table
</Update>
<Update label="governance" description="v1.3.12">
- chore: version update core to 1.2.15 and framework to 1.1.17
- feat: added provider level budget and rate limits
</Update>
<Update label="jsonparser" description="v1.3.12">
- chore: version update core to 1.2.15 and framework to 1.1.17
- feat: creates a deep copy of the response in PostHook to avoid modifying the original response pointer
</Update>
<Update label="logging" description="v1.3.12">
- chore: version update core to 1.2.15 and framework to 1.1.17
- feat: moved all operations to async to prevent blocking behavior
</Update>
<Update label="maxim" description="v1.3.12">
- chore: version update core to 1.2.15 and framework to 1.1.17
- feat: added support for streaming responses
</Update>
<Update label="mocker" description="v1.3.12">
- chore: version update core to 1.2.15 and framework to 1.1.17
</Update>
<Update label="otel" description="v1.3.12">
- chore: version update core to 1.2.15 and framework to 1.1.17
- feat: moved all operations to async to prevent blocking behavior
</Update>
<Update label="semantic_cache" description="v1.3.12">
- chore: version update core to 1.2.15 and framework to 1.1.17
</Update>
<Update label="telemetry" description="v1.3.12">
- chore: version update core to 1.2.15 and framework to 1.1.17
</Update>

@@ -0,0 +1,78 @@
---
title: "v1.3.13"
description: "v1.3.13 changelog"
---
<Tabs>
<Tab title="NPX">
```bash
npx -y @maximhq/bifrost --transport-version v1.3.13
```
</Tab>
<Tab title="Docker">
```bash
docker pull maximhq/bifrost:v1.3.13
docker run -p 8080:8080 maximhq/bifrost:v1.3.13
```
</Tab>
</Tabs>
<Update label="Bifrost(HTTP)" description="v1.3.13">
- chore: version update framework to 1.1.18 and core to 1.2.16
- Adds env variable support for postgres config
- feat: standardize finish reason and single response handling across providers
- feat: provider config hot reloading added (no need to restart Bifrost after updating provider configs now)
</Update>
<Update label="Core" description="v1.3.13">
- feat: standardize finish reason and single response handling across providers
- feat: provider config hot reloading added
</Update>
<Update label="Framework" description="v1.3.13">
- Adds env variable resolution for postgres config
- chore: Upgrades core to 1.2.16
</Update>
<Update label="governance" description="v1.3.13">
- chore: version update core to 1.2.16 and framework to 1.1.18
</Update>
<Update label="jsonparser" description="v1.3.13">
- chore: version update core to 1.2.16 and framework to 1.1.18
</Update>
<Update label="logging" description="v1.3.13">
- chore: version update core to 1.2.16 and framework to 1.1.18
</Update>
<Update label="maxim" description="v1.3.13">
- chore: version update core to 1.2.16 and framework to 1.1.18
</Update>
<Update label="mocker" description="v1.3.13">
- chore: version update core to 1.2.16 and framework to 1.1.18
</Update>
<Update label="otel" description="v1.3.13">
- chore: version update core to 1.2.16 and framework to 1.1.18
</Update>
<Update label="semantic_cache" description="v1.3.13">
- chore: version update core to 1.2.16 and framework to 1.1.18
</Update>
<Update label="telemetry" description="v1.3.13">
- chore: version update core to 1.2.16 and framework to 1.1.18
</Update>

@@ -0,0 +1,84 @@
---
title: "v1.3.14"
description: "v1.3.14 changelog"
---
<Tabs>
<Tab title="NPX">
```bash
npx -y @maximhq/bifrost --transport-version v1.3.14
```
</Tab>
<Tab title="Docker">
```bash
docker pull maximhq/bifrost:v1.3.14
docker run -p 8080:8080 maximhq/bifrost:v1.3.14
```
</Tab>
</Tabs>
<Update label="Bifrost(HTTP)" description="v1.3.14">
- chore: version update core to 1.2.17 and framework to 1.1.19
- feat: Use all keys for list models request
- fix: handled panic when using gemini models with openai integration responses API requests
- chore: Added id, object, and model fields to Chat Completion responses from Bedrock and Cohere providers
- feat: Adds support for dynamic plugins. Note that dynamic plugins are in beta
- feat: Adds auth support for dashboard, inference APIs and dashboard APIs.
</Update>
<Update label="Core" description="v1.3.14">
- feat: Use all keys for list models request
- refactor: Cohere provider to use completeRequest and response pooling for all requests
- chore: Added id, object, and model fields to Chat Completion responses from Bedrock and Cohere providers
- feat: Moved all streaming calls to use fasthttp client for efficiency
- feat: Adds support for auth
</Update>
<Update label="Framework" description="v1.3.14">
- chore: Upgrades core to 1.2.17
- feat: Adds dynamic plugins support
- feat: Adds auth tables in config store
</Update>
<Update label="governance" description="v1.3.14">
- chore: version update core to 1.2.17 and framework to 1.1.19
</Update>
<Update label="jsonparser" description="v1.3.14">
- chore: version update core to 1.2.17 and framework to 1.1.19
</Update>
<Update label="logging" description="v1.3.14">
- chore: version update core to 1.2.17 and framework to 1.1.19
</Update>
<Update label="maxim" description="v1.3.14">
- chore: version update core to 1.2.17 and framework to 1.1.19
</Update>
<Update label="mocker" description="v1.3.14">
- chore: version update core to 1.2.17 and framework to 1.1.19
</Update>
<Update label="otel" description="v1.3.14">
- chore: version update core to 1.2.17 and framework to 1.1.19
</Update>
<Update label="semantic_cache" description="v1.3.14">
- chore: version update core to 1.2.17 and framework to 1.1.19
</Update>
<Update label="telemetry" description="v1.3.14">
- chore: version update core to 1.2.17 and framework to 1.1.19
</Update>

@@ -0,0 +1,75 @@
---
title: "v1.3.15"
description: "v1.3.15 changelog"
---
<Tabs>
<Tab title="NPX">
```bash
npx -y @maximhq/bifrost --transport-version v1.3.15
```
</Tab>
<Tab title="Docker">
```bash
docker pull maximhq/bifrost:v1.3.15
docker run -p 8080:8080 maximhq/bifrost:v1.3.15
```
</Tab>
</Tabs>
<Update label="Bifrost(HTTP)" description="v1.3.15">
- chore: version update core to 1.2.18 and framework to 1.1.21
- enhancement: provider lookup enhancements in modelcatalog
</Update>
<Update label="Core" description="v1.3.15">
- refactor: minor util changes
</Update>
<Update label="Framework" description="v1.3.15">
- chore: Upgrades core to 1.2.18
- enhancement: provider lookup enhancements
</Update>
<Update label="governance" description="v1.3.15">
- chore: version update core to 1.2.18 and framework to 1.1.21
</Update>
<Update label="jsonparser" description="v1.3.15">
- chore: version update core to 1.2.18 and framework to 1.1.21
</Update>
<Update label="logging" description="v1.3.15">
- chore: version update core to 1.2.18 and framework to 1.1.21
</Update>
<Update label="maxim" description="v1.3.15">
- chore: version update core to 1.2.18 and framework to 1.1.21
</Update>
<Update label="mocker" description="v1.3.15">
- chore: version update core to 1.2.18 and framework to 1.1.21
</Update>
<Update label="otel" description="v1.3.15">
- chore: version update core to 1.2.18 and framework to 1.1.21
</Update>
<Update label="semantic_cache" description="v1.3.15">
- chore: version update core to 1.2.18 and framework to 1.1.21
</Update>
<Update label="telemetry" description="v1.3.15">
- chore: version update core to 1.2.18 and framework to 1.1.21
</Update>

@@ -0,0 +1,79 @@
---
title: "v1.3.16"
description: "v1.3.16 changelog"
---
<Tabs>
<Tab title="NPX">
```bash
npx -y @maximhq/bifrost --transport-version v1.3.16
```
</Tab>
<Tab title="Docker">
```bash
docker pull maximhq/bifrost:v1.3.16
docker run -p 8080:8080 maximhq/bifrost:v1.3.16
```
</Tab>
</Tabs>
<Update label="Bifrost(HTTP)" description="v1.3.16">
- chore: version update core to 1.2.18 and framework to 1.1.21
- feat: added Perplexity provider support
- chore: version update core to 1.2.19 and framework to 1.1.22
- feat: support for mistralai publisher endpoint in vertex provider
- enhancement: stream handling for Anthropic's computer tool in the Responses API
</Update>
<Update label="Core" description="v1.3.16">
- feat: support for mistralai publisher endpoint in vertex provider
- enhancement: stream handling for Anthropic's computer tool in the Responses API
- feat: added Perplexity provider support
</Update>
<Update label="Framework" description="v1.3.16">
- chore: Upgrades core to 1.2.19
</Update>
<Update label="governance" description="v1.3.16">
- chore: version update core to 1.2.19 and framework to 1.1.22
</Update>
<Update label="jsonparser" description="v1.3.16">
- chore: version update core to 1.2.19 and framework to 1.1.22
</Update>
<Update label="logging" description="v1.3.16">
- chore: version update core to 1.2.19 and framework to 1.1.22
</Update>
<Update label="maxim" description="v1.3.16">
- chore: version update core to 1.2.19 and framework to 1.1.22
</Update>
<Update label="mocker" description="v1.3.16">
- chore: version update core to 1.2.19 and framework to 1.1.22
</Update>
<Update label="otel" description="v1.3.16">
- chore: version update core to 1.2.19 and framework to 1.1.22
</Update>
<Update label="semantic_cache" description="v1.3.16">
- chore: version update core to 1.2.19 and framework to 1.1.22
</Update>
<Update label="telemetry" description="v1.3.16">
- chore: version update core to 1.2.19 and framework to 1.1.22
</Update>

@@ -0,0 +1,72 @@
---
title: "v1.3.17"
description: "v1.3.17 changelog"
---
<Tabs>
<Tab title="NPX">
```bash
npx -y @maximhq/bifrost --transport-version v1.3.17
```
</Tab>
<Tab title="Docker">
```bash
docker pull maximhq/bifrost:v1.3.17
docker run -p 8080:8080 maximhq/bifrost:v1.3.17
```
</Tab>
</Tabs>
<Update label="Bifrost(HTTP)" description="v1.3.17">
- chore: version update framework to 1.1.24
- fix: resolve MCP client deletion when attached to a virtual key
- chore: allowed changing name when updating a virtual key
- fix: virtual key team/customer association issue when updating a virtual key
</Update>
<Update label="Framework" description="v1.3.17">
- fix: resolve MCP client deletion when attached to a virtual key
- fix: virtual key team/customer association issue when updating a virtual key
</Update>
<Update label="governance" description="v1.3.17">
- chore: version update framework to 1.1.23
</Update>
<Update label="jsonparser" description="v1.3.17">
- chore: version update framework to 1.1.24
</Update>
<Update label="logging" description="v1.3.17">
- chore: version update framework to 1.1.24
</Update>
<Update label="maxim" description="v1.3.17">
- chore: version update framework to 1.1.24
</Update>
<Update label="mocker" description="v1.3.17">
- chore: version update framework to 1.1.24
</Update>
<Update label="otel" description="v1.3.17">
- chore: version update framework to 1.1.24
</Update>
<Update label="semantic_cache" description="v1.3.17">
- chore: version update framework to 1.1.24
</Update>
<Update label="telemetry" description="v1.3.17">
- chore: version update framework to 1.1.24
</Update>

@@ -0,0 +1,69 @@
---
title: "v1.3.18"
description: "v1.3.18 changelog"
---
<Tabs>
<Tab title="NPX">
```bash
npx -y @maximhq/bifrost --transport-version v1.3.18
```
</Tab>
<Tab title="Docker">
```bash
docker pull maximhq/bifrost:v1.3.18
docker run -p 8080:8080 maximhq/bifrost:v1.3.18
```
</Tab>
</Tabs>
<Update label="Bifrost(HTTP)" description="v1.3.18">
- change: health endpoint is now exempt from the auth middleware
</Update>
<Update label="Framework" description="v1.3.18">
- fix: resolve MCP client deletion when attached to a virtual key
- fix: virtual key team/customer association issue when updating a virtual key
</Update>
<Update label="governance" description="v1.3.18">
- chore: version update framework to 1.1.23
</Update>
<Update label="jsonparser" description="v1.3.18">
- chore: version update framework to 1.1.24
</Update>
<Update label="logging" description="v1.3.18">
- chore: version update framework to 1.1.24
</Update>
<Update label="maxim" description="v1.3.18">
- chore: version update framework to 1.1.24
</Update>
<Update label="mocker" description="v1.3.18">
- chore: version update framework to 1.1.24
</Update>
<Update label="otel" description="v1.3.18">
- chore: version update framework to 1.1.24
</Update>
<Update label="semantic_cache" description="v1.3.18">
- chore: version update framework to 1.1.24
</Update>
<Update label="telemetry" description="v1.3.18">
- chore: version update framework to 1.1.24
</Update>

@@ -0,0 +1,89 @@
---
title: "v1.3.19"
description: "v1.3.19 changelog"
---
<Tabs>
<Tab title="NPX">
```bash
npx -y @maximhq/bifrost --transport-version v1.3.19
```
</Tab>
<Tab title="Docker">
```bash
docker pull maximhq/bifrost:v1.3.19
docker run -p 8080:8080 maximhq/bifrost:v1.3.19
```
</Tab>
</Tabs>
<Update label="Bifrost(HTTP)" description="v1.3.19">
- chore: version update core to 1.2.20 and framework to 1.1.24
- chore: allowed changing name when updating a virtual key
- feat: add numberOfRetries, fallbackIndex, and selected key name and id from context to telemetry metrics
- feat: add used virtual key name and id to telemetry metrics
- feat: send model deployment back in response extra fields
- feat: add selected key and virtual key to logs filter
- feat: add headers to MCP client config
- feat: add `is_success` label to upstream latency metrics
</Update>
<Update label="Core" description="v1.3.19">
- feat: add numberOfRetries, fallbackIndex and selected key name to context
- [BREAKING] changed BifrostContextKeySelectedKey to BifrostContextKeySelectedKeyID
- feat: send model deployment back in response extra fields
- feat: add headers to MCP client config
</Update>
<Update label="Framework" description="v1.3.19">
- chore: Upgrades core to 1.2.20
- feat: add selected key and virtual key to logs table
- feat: add headers to MCP client config
</Update>
<Update label="governance" description="v1.3.19">
- chore: version update core to 1.2.20 and framework to 1.1.24
</Update>
<Update label="jsonparser" description="v1.3.19">
- chore: version update core to 1.2.20 and framework to 1.1.24
</Update>
<Update label="logging" description="v1.3.19">
- chore: version update core to 1.2.20 and framework to 1.1.24
- feat: add selected key and virtual key to logs
</Update>
<Update label="maxim" description="v1.3.19">
- chore: version update core to 1.2.20 and framework to 1.1.24
</Update>
<Update label="mocker" description="v1.3.19">
- chore: version update core to 1.2.20 and framework to 1.1.24
</Update>
<Update label="otel" description="v1.3.19">
- chore: version update core to 1.2.20 and framework to 1.1.24
</Update>
<Update label="semantic_cache" description="v1.3.19">
- chore: version update core to 1.2.20 and framework to 1.1.24
</Update>
<Update label="telemetry" description="v1.3.19">
- chore: version update core to 1.2.20 and framework to 1.1.24
- feat: add numberOfRetries, fallbackIndex, and selected key name and id from context to telemetry metrics
- feat: add used virtual key name and id to telemetry metrics
- feat: add `is_success` label to upstream latency metrics
</Update>

@@ -0,0 +1,84 @@
---
title: "v1.3.2"
description: "v1.3.2 changelog"
---
<Tabs>
<Tab title="NPX">
```bash
npx -y @maximhq/bifrost --transport-version v1.3.2
```
</Tab>
<Tab title="Docker">
```bash
docker pull maximhq/bifrost:v1.3.2
docker run -p 8080:8080 maximhq/bifrost:v1.3.2
```
</Tab>
</Tabs>
<Update label="Bifrost(HTTP)" description="v1.3.2">
- Refactor: Moves all context key types to schemas.BifrostContextKey
- Fix: Fixes Maxim plugin bug where an external traceId was blocking new trace creation
</Update>
<Update label="Core" description="v1.3.2">
- Chore: schemas.BifrostContextKey is now the only valid context key type throughout the project
</Update>
<Update label="Framework" description="v1.3.2">
- Upgrade dependency: core to 1.2.8
- Chore: Moves all context key types to schemas.BifrostContextKey
- Chore: Adds new logs table migration to avoid missing any required columns in the DB
</Update>
<Update label="governance" description="v1.3.2">
- Upgrade dependency: core to 1.2.8
- Chore: Moves all context key types to schemas.BifrostContextKey
</Update>
<Update label="jsonparser" description="v1.3.2">
- Upgrade dependency: core to 1.2.8
- Chore: Moves all context key types to schemas.BifrostContextKey
</Update>
<Update label="logging" description="v1.3.2">
- Upgrade dependency: core to 1.2.8
- Chore: Moves all context key types to schemas.BifrostContextKey
</Update>
<Update label="maxim" description="v1.3.2">
- Upgrade dependency: core to 1.2.8
- Chore: Moves all context key types to schemas.BifrostContextKey
- Fix: Fixes a bug where an external trace id was blocking new trace creation
</Update>
<Update label="mocker" description="v1.3.2">
- Upgrade dependency: core to 1.2.8
</Update>
<Update label="otel" description="v1.3.2">
- Upgrade dependency: core to 1.2.8
- Chore: Moves all context key types to schemas.BifrostContextKey
</Update>
<Update label="semantic_cache" description="v1.3.2">
- Upgrade dependency: core to 1.2.8
- Chore: Moves all context key types to schemas.BifrostContextKey
</Update>
<Update label="telemetry" description="v1.3.2">
- Upgrade dependency: core to 1.2.8
- Chore: Moves all context key types to schemas.BifrostContextKey
</Update>

@@ -0,0 +1,23 @@
---
title: "v1.3.20"
description: "v1.3.20 changelog"
---
<Tabs>
<Tab title="NPX">
```bash
npx -y @maximhq/bifrost --transport-version v1.3.20
```
</Tab>
<Tab title="Docker">
```bash
docker pull maximhq/bifrost:v1.3.20
docker run -p 8080:8080 maximhq/bifrost:v1.3.20
```
</Tab>
</Tabs>
<Update label="Bifrost(HTTP)" description="v1.3.20">
- fix: handle case when config store is nil in session and plugins handlers
</Update>

@@ -0,0 +1,24 @@
---
title: "v1.3.21"
description: "v1.3.21 changelog"
---
<Tabs>
<Tab title="NPX">
```bash
npx -y @maximhq/bifrost --transport-version v1.3.21
```
</Tab>
<Tab title="Docker">
```bash
docker pull maximhq/bifrost:v1.3.21
docker run -p 8080:8080 maximhq/bifrost:v1.3.21
```
</Tab>
</Tabs>
<Update label="Bifrost(HTTP)" description="v1.3.21">
- fix: handle case when config store is nil in session and plugins handlers
- chore: adds integration tests for different config combinations
</Update>

@@ -0,0 +1,77 @@
---
title: "v1.3.22"
description: "v1.3.22 changelog - 2025-11-09"
---
<Tabs>
<Tab title="NPX">
```bash
npx -y @maximhq/bifrost --transport-version v1.3.22
```
</Tab>
<Tab title="Docker">
```bash
docker pull maximhq/bifrost:v1.3.22
docker run -p 8080:8080 maximhq/bifrost:v1.3.22
```
</Tab>
</Tabs>
<Update label="Bifrost(HTTP)" description="v1.3.22">
- feat: Adds option to disable authentication on inference calls
- chore: Adds dark image for new version infographic
</Update>
<Update label="Core" description="v1.3.22">
- feat: add numberOfRetries, fallbackIndex and selected key name to context
- [BREAKING] changed BifrostContextKeySelectedKey to BifrostContextKeySelectedKeyID
- feat: send model deployment back in response extra fields
- feat: add headers to MCP client config
</Update>
<Update label="Framework" description="v1.3.22">
- Adds DisableAuthOnInference to AuthConfig
</Update>
<Update label="governance" description="v1.3.22">
- chore: version update framework to 1.1.25
</Update>
<Update label="jsonparser" description="v1.3.22">
- chore: version update framework to 1.1.25
</Update>
<Update label="logging" description="v1.3.22">
- chore: version update framework to 1.1.25
</Update>
<Update label="maxim" description="v1.3.22">
- chore: version update framework to 1.1.25
</Update>
<Update label="mocker" description="v1.3.22">
- chore: version update framework to 1.1.25
</Update>
<Update label="otel" description="v1.3.22">
- chore: version update framework to 1.1.25
</Update>
<Update label="semantic_cache" description="v1.3.22">
- chore: version update framework to 1.1.25
</Update>
<Update label="telemetry" description="v1.3.22">
- chore: version update framework to 1.1.25
</Update>

@@ -0,0 +1,81 @@
---
title: "v1.3.23"
description: "v1.3.23 changelog - 2025-11-10"
---
<Tabs>
<Tab title="NPX">
```bash
npx -y @maximhq/bifrost --transport-version v1.3.23
```
</Tab>
<Tab title="Docker">
```bash
docker pull maximhq/bifrost:v1.3.23
docker run -p 8080:8080 maximhq/bifrost:v1.3.23
```
</Tab>
</Tabs>
<Update label="Bifrost(HTTP)" description="v1.3.23">
- chore: version update core to 1.2.21 and framework to 1.1.26
- feat: add headers to MCP client config and provider config
- feat: adds support for custom path overrides for custom providers
- feat: adds support for keyless authentication for custom providers
- feat: handles `response_schema` and `response_json_schema` parameters in gemini integration
- refactor: better mcp client management
- feat: option to disable content logging
- feat: key selection and retries info sent in genai traces
- feat: option to edit and reconnect mcp clients
</Update>
<Update label="Core" description="v1.3.23">
- feat: add headers to MCP client config and provider config
- feat: adds support for custom path overrides for custom providers
- feat: adds support for keyless authentication for custom providers
- feat: handles `response_schema` and `response_json_schema` parameters in gemini integration
- [BREAKING] MCP client Public API now takes mcp client ids instead of names
- refactor: better mcp client management
</Update>
<Update label="Framework" description="v1.3.23">
- chore: version update core to 1.2.21
- feat: add headers to MCP client config
- refactor: mcp clients to use ids instead of names
- feat: option to disable content logging
</Update>
<Update label="governance" description="v1.3.23">
- chore: version update core to 1.2.21 and framework to 1.1.26
</Update>
<Update label="jsonparser" description="v1.3.23">
- chore: version update core to 1.2.21 and framework to 1.1.26
</Update>
<Update label="logging" description="v1.3.23">
- chore: version update core to 1.2.21 and framework to 1.1.26
- feat: option to disable content logging
</Update>
<Update label="maxim" description="v1.3.23">
- chore: version update core to 1.2.21 and framework to 1.1.26
</Update>
<Update label="mocker" description="v1.3.23">
- chore: version update core to 1.2.21 and framework to 1.1.26
</Update>
<Update label="otel" description="v1.3.23">
- chore: version update core to 1.2.21 and framework to 1.1.26
- feat: key selection and retries info sent in genai traces
</Update>
<Update label="semantic_cache" description="v1.3.23">
- chore: version update core to 1.2.21 and framework to 1.1.26
</Update>
<Update label="telemetry" description="v1.3.23">
- chore: version update core to 1.2.21 and framework to 1.1.26
</Update>

@@ -0,0 +1,64 @@
---
title: "v1.3.24"
description: "v1.3.24 changelog - 2025-11-11"
---
<Tabs>
<Tab title="NPX">
```bash
npx -y @maximhq/bifrost --transport-version v1.3.24
```
</Tab>
<Tab title="Docker">
```bash
docker pull maximhq/bifrost:v1.3.24
docker run -p 8080:8080 maximhq/bifrost:v1.3.24
```
</Tab>
</Tabs>
<Update label="Bifrost(HTTP)" description="v1.3.24">
- chore: update core version to 1.2.22 and framework version to 1.1.27
- feat: Adds input message in logs table for easier navigation
</Update>
<Update label="Core" description="v1.3.24">
- chore: Adds index to ChatAssistantMessageToolCall
- fix: responses text output standardization to content blocks
</Update>
<Update label="Framework" description="v1.3.24">
- chore: update core version to 1.2.22
</Update>
<Update label="governance" description="v1.3.24">
- chore: update core version to 1.2.22 and framework version to 1.1.27
</Update>
<Update label="jsonparser" description="v1.3.24">
- chore: update core version to 1.2.22 and framework version to 1.1.27
</Update>
<Update label="logging" description="v1.3.24">
- chore: update core version to 1.2.22 and framework version to 1.1.27
</Update>
<Update label="maxim" description="v1.3.24">
- chore: update core version to 1.2.22 and framework version to 1.1.27
</Update>
<Update label="mocker" description="v1.3.24">
- chore: update core version to 1.2.22 and framework version to 1.1.27
</Update>
<Update label="otel" description="v1.3.24">
- chore: update core version to 1.2.22 and framework version to 1.1.27
</Update>
<Update label="semantic_cache" description="v1.3.24">
- chore: update core version to 1.2.22 and framework version to 1.1.27
</Update>
<Update label="telemetry" description="v1.3.24">
- chore: update core version to 1.2.22 and framework version to 1.1.27
</Update>

@@ -0,0 +1,78 @@
---
title: "v1.3.25"
description: "v1.3.25 changelog - 2025-11-14"
---
<Tabs>
<Tab title="NPX">
```bash
npx -y @maximhq/bifrost --transport-version v1.3.25
```
</Tab>
<Tab title="Docker">
```bash
docker pull maximhq/bifrost:v1.3.25
docker run -p 8080:8080 maximhq/bifrost:v1.3.25
```
</Tab>
</Tabs>
<Update label="Bifrost(HTTP)" description="1.3.25">
- chore: update core version to 1.2.23 and framework version to 1.1.28
- feat: added unified streaming lifecycle events across all providers to fully align with OpenAI's streaming response types.
- chore: shift from `alpha/responses` to `v1/responses` in openrouter provider for responses API
- feat: send back pricing data for models in list models response
- fix: add support for keyless providers in list models request
- feat: add support for custom fine-tuned models in vertex provider
- feat: send deployment aliases in list models response for supported providers
- feat: support for API Key auth in vertex provider
- feat: support for system account in environment for vertex provider
</Update>
<Update label="Core" description="1.2.23">
- feat: added unified streaming lifecycle events across all providers to fully align with OpenAI's streaming response types.
- chore: shift from `alpha/responses` to `v1/responses` in openrouter provider for responses API
- fix: add support for keyless providers in list models request
- feat: add support for custom fine-tuned models in vertex provider
- fix: vertex provider list models now correctly returns the custom fine-tuned model ids in the response
- feat: send deployment aliases in list models response for supported providers
- feat: support for API Key auth in vertex provider
</Update>
<Update label="Framework" description="1.1.28">
- chore: update core version to 1.2.23
- feat: expose method to get pricing data for a model in model catalog
- feat: add project number and deployments to vertex key config
</Update>
<Update label="governance" description="1.3.29">
- chore: update core version to 1.2.23 and framework version to 1.1.28
</Update>
<Update label="jsonparser" description="1.3.29">
- chore: update core version to 1.2.23 and framework version to 1.1.28
</Update>
<Update label="logging" description="1.3.29">
- chore: update core version to 1.2.23 and framework version to 1.1.28
</Update>
<Update label="maxim" description="1.4.28">
- chore: update core version to 1.2.23 and framework version to 1.1.28
</Update>
<Update label="mocker" description="1.3.28">
- chore: update core version to 1.2.23 and framework version to 1.1.28
</Update>
<Update label="otel" description="1.0.28">
- chore: update core version to 1.2.23 and framework version to 1.1.28
</Update>
<Update label="semantic_cache" description="1.3.28">
- chore: update core version to 1.2.23 and framework version to 1.1.28
</Update>
<Update label="telemetry" description="1.3.28">
- chore: update core version to 1.2.23 and framework version to 1.1.28
</Update>

@@ -0,0 +1,64 @@
---
title: "v1.3.26"
description: "v1.3.26 changelog - 2025-11-16"
---
<Tabs>
<Tab title="NPX">
```bash
npx -y @maximhq/bifrost --transport-version v1.3.26
```
</Tab>
<Tab title="Docker">
```bash
docker pull maximhq/bifrost:v1.3.26
docker run -p 8080:8080 maximhq/bifrost:v1.3.26
```
</Tab>
</Tabs>
<Update label="Bifrost(HTTP)" description="1.3.26">
- feat: adds support for ElevenLabs provider
- fix: fixes security settings form submission with empty CORS domains.
- chore: minor ui enhancements
</Update>
<Update label="Core" description="1.2.24">
- feat: Added ElevenLabs provider
</Update>
<Update label="Framework" description="1.1.29">
- chore: update core version to 1.2.24
</Update>
<Update label="governance" description="1.3.30">
- chore: update core version to 1.2.24 and framework version to 1.1.29
</Update>
<Update label="jsonparser" description="1.3.30">
- chore: update core version to 1.2.24 and framework version to 1.1.29
</Update>
<Update label="logging" description="1.3.30">
- chore: update core version to 1.2.24 and framework version to 1.1.29
</Update>
<Update label="maxim" description="1.4.29">
- chore: update core version to 1.2.24 and framework version to 1.1.29
</Update>
<Update label="mocker" description="1.3.29">
- chore: update core version to 1.2.24 and framework version to 1.1.29
</Update>
<Update label="otel" description="1.0.29">
- chore: update core version to 1.2.24 and framework version to 1.1.29
</Update>
<Update label="semantic_cache" description="1.3.29">
- chore: update core version to 1.2.24 and framework version to 1.1.29
</Update>
<Update label="telemetry" description="1.3.29">
- chore: update core version to 1.2.24 and framework version to 1.1.29
</Update>

@@ -0,0 +1,62 @@
---
title: "v1.3.27"
description: "v1.3.27 changelog - 2025-11-17"
---
<Tabs>
<Tab title="NPX">
```bash
npx -y @maximhq/bifrost --transport-version v1.3.27
```
</Tab>
<Tab title="Docker">
```bash
docker pull maximhq/bifrost:v1.3.27
docker run -p 8080:8080 maximhq/bifrost:v1.3.27
```
</Tab>
</Tabs>
<Update label="Bifrost(HTTP)" description="1.3.27">
- fix: bedrock memory and streaming response parsing fixes
</Update>
<Update label="Core" description="1.2.25">
- fix: bedrock memory and streaming response parsing fixes
</Update>
<Update label="Framework" description="1.1.30">
- chore: update core version to 1.2.25
</Update>
<Update label="governance" description="1.3.31">
- chore: update core version to 1.2.25 and framework version to 1.1.30
</Update>
<Update label="jsonparser" description="1.3.31">
- chore: update core version to 1.2.25 and framework version to 1.1.30
</Update>
<Update label="logging" description="1.3.31">
- chore: update core version to 1.2.25 and framework version to 1.1.30
</Update>
<Update label="maxim" description="1.4.30">
- chore: update core version to 1.2.25 and framework version to 1.1.30
</Update>
<Update label="mocker" description="1.3.30">
- chore: update core version to 1.2.25 and framework version to 1.1.30
</Update>
<Update label="otel" description="1.0.30">
- chore: update core version to 1.2.25 and framework version to 1.1.30
</Update>
<Update label="semantic_cache" description="1.3.30">
- chore: update core version to 1.2.25 and framework version to 1.1.30
</Update>
<Update label="telemetry" description="1.3.30">
- chore: update core version to 1.2.25 and framework version to 1.1.30
</Update>

@@ -0,0 +1,58 @@
---
title: "v1.3.28"
description: "v1.3.28 changelog - 2025-11-18"
---
<Tabs>
<Tab title="NPX">
```bash
npx -y @maximhq/bifrost --transport-version v1.3.28
```
</Tab>
<Tab title="Docker">
```bash
docker pull maximhq/bifrost:v1.3.28
docker run -p 8080:8080 maximhq/bifrost:v1.3.28
```
</Tab>
</Tabs>
<Update label="Bifrost(HTTP)" description="1.3.28">
- feat: Improves log page loading performance for millions of logs stored on sqlite
</Update>
<Update label="Framework" description="1.1.31">
- feat: splits logs APIs into `getStats` and `getLogs` to improve speed for sqlite
</Update>
<Update label="governance" description="1.3.32">
- chore: update framework version to 1.1.31
</Update>
<Update label="jsonparser" description="1.3.32">
- chore: update framework version to 1.1.31
</Update>
<Update label="logging" description="1.3.32">
- chore: update framework version to 1.1.31
</Update>
<Update label="maxim" description="1.4.31">
- chore: update framework version to 1.1.31
</Update>
<Update label="mocker" description="1.3.31">
- chore: update framework version to 1.1.31
</Update>
<Update label="otel" description="1.0.31">
- chore: update framework version to 1.1.31
</Update>
<Update label="semantic_cache" description="1.3.31">
- chore: update framework version to 1.1.31
</Update>
<Update label="telemetry" description="1.3.31">
- chore: update framework version to 1.1.31
</Update>

View File

@@ -0,0 +1,71 @@
---
title: "v1.3.29"
description: "v1.3.29 changelog - 2025-11-18"
---
<Tabs>
<Tab title="NPX">
```bash
npx -y @maximhq/bifrost --transport-version v1.3.29
```
</Tab>
<Tab title="Docker">
```bash
docker pull maximhq/bifrost:v1.3.29
docker run -p 8080:8080 maximhq/bifrost:v1.3.29
```
</Tab>
</Tabs>
<Update label="Bifrost(HTTP)" description="1.3.29">
- fix: properly set bifrost version in metrics
- feat: added team_id, team_name, customer_id and customer_name labels to otel metrics
- fix: skip adding google/ prefix for custom fine-tuned models in vertex provider (for genai integration)
- fix: deep copy inputs in semantic cache plugin to not mutate the original request
</Update>
<Update label="Core" description="1.2.26">
- fix: skip adding google/ prefix for custom fine-tuned models in vertex provider
- feat: added DeepCopy functions to schemas package
</Update>
<Update label="Framework" description="1.1.32">
chore: update core version to 1.2.26
</Update>
<Update label="governance" description="1.3.33">
chore: update core version to 1.2.26 and framework version to 1.1.32
</Update>
<Update label="jsonparser" description="1.3.33">
chore: update core version to 1.2.26 and framework version to 1.1.32
</Update>
<Update label="logging" description="1.3.33">
chore: update core version to 1.2.26 and framework version to 1.1.32
</Update>
<Update label="maxim" description="1.4.32">
chore: update core version to 1.2.26 and framework version to 1.1.32
</Update>
<Update label="mocker" description="1.3.32">
chore: update core version to 1.2.26 and framework version to 1.1.32
</Update>
<Update label="otel" description="1.0.32">
- chore: update core version to 1.2.26 and framework version to 1.1.32
- fix: properly set bifrost version in metrics
- feat: added team_id, team_name, customer_id and customer_name labels to otel metrics
</Update>
<Update label="semantic_cache" description="1.3.32">
- chore: update core version to 1.2.26 and framework version to 1.1.32
- fix: deep copy inputs to not mutate the original request
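
The deep-copy fix above can be illustrated with a minimal sketch. The `Request` and `Message` types and the `DeepCopy` method here are hypothetical stand-ins, not the functions added to the schemas package; the sketch only shows why copying a struct by value is not enough when it holds a slice.

```go
package main

import "fmt"

// Message and Request are hypothetical types used only for illustration.
type Message struct {
	Role    string
	Content string
}

type Request struct {
	Model    string
	Messages []Message
}

// DeepCopy returns a copy whose Messages slice has its own backing array,
// so edits to the copy never reach the caller's request.
func (r *Request) DeepCopy() *Request {
	if r == nil {
		return nil
	}
	out := &Request{Model: r.Model}
	if r.Messages != nil {
		out.Messages = make([]Message, len(r.Messages))
		copy(out.Messages, r.Messages)
	}
	return out
}

func main() {
	original := &Request{Model: "m", Messages: []Message{{Role: "user", Content: "Hello"}}}

	// A plain struct copy still shares the Messages backing array.
	shallow := *original
	shallow.Messages[0].Content = "mutated"
	fmt.Println(original.Messages[0].Content)

	// A deep copy isolates the original from later edits.
	original.Messages[0].Content = "Hello"
	deep := original.DeepCopy()
	deep.Messages[0].Content = "normalized"
	fmt.Println(original.Messages[0].Content)
}
```

A plugin that normalizes inputs before building a cache key would work on the deep copy and leave the incoming request untouched.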
</Update>
<Update label="telemetry" description="1.3.32">
- chore: update core version to 1.2.26 and framework version to 1.1.32
- feat: added filter for custom labels that are already default labels
- feat: added team_id, team_name, customer_id and customer_name labels to telemetry metrics
</Update>

View File

@@ -0,0 +1,75 @@
---
title: "v1.3.3"
description: "v1.3.3 changelog"
---
<Tabs>
<Tab title="NPX">
```bash
npx -y @maximhq/bifrost --transport-version v1.3.3
```
</Tab>
<Tab title="Docker">
```bash
docker pull maximhq/bifrost:v1.3.3
docker run -p 8080:8080 maximhq/bifrost:v1.3.3
```
</Tab>
</Tabs>
<Update label="Bifrost(HTTP)" description="v1.3.3">
- Upgrade dependency: core to 1.2.9
- Fix: JSON serialization for error objects and tool function parameters
</Update>
<Update label="Core" description="v1.3.3">
- Fix: Fixed JSON serialization for error objects and tool function parameters
</Update>
<Update label="Framework" description="v1.3.3">
- Upgrade dependency: core to 1.2.9
- Fix: JSON serialization for error objects and tool function parameters
</Update>
<Update label="governance" description="v1.3.3">
- chore: version update core to 1.2.9
</Update>
<Update label="jsonparser" description="v1.3.3">
- chore: version update core to 1.2.9
</Update>
<Update label="logging" description="v1.3.3">
- chore: version update core to 1.2.9
</Update>
<Update label="maxim" description="v1.3.3">
- chore: version update core to 1.2.9
</Update>
<Update label="mocker" description="v1.3.3">
- chore: version update core to 1.2.9
</Update>
<Update label="otel" description="v1.3.3">
- chore: version update core to 1.2.9
</Update>
<Update label="semantic_cache" description="v1.3.3">
- chore: version update core to 1.2.9
</Update>
<Update label="telemetry" description="v1.3.3">
- chore: version update core to 1.2.9
</Update>

View File

@@ -0,0 +1,62 @@
---
title: "v1.3.30"
description: "v1.3.30 changelog - 2025-11-18"
---
<Tabs>
<Tab title="NPX">
```bash
npx -y @maximhq/bifrost --transport-version v1.3.30
```
</Tab>
<Tab title="Docker">
```bash
docker pull maximhq/bifrost:v1.3.30
docker run -p 8080:8080 maximhq/bifrost:v1.3.30
```
</Tab>
</Tabs>
<Update label="Bifrost(HTTP)" description="1.3.30">
- feat: adds migration for missing provider column in key table
<Warning>
Each entry in "keys" in "provider_config" in the `config.json` file requires a unique name. If there is any collision, Bifrost won't be able to boot.
</Warning>
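
A pre-flight check for the uniqueness requirement might look like the sketch below. The JSON shape (a map of provider name to an object with a `keys` array of named entries) and the assumption that names must be unique across all providers are illustrative guesses, not the documented `config.json` schema.

```go
package main

import (
	"encoding/json"
	"fmt"
	"sort"
)

// key and providerConfig model a hypothetical, simplified config shape.
type key struct {
	Name string `json:"name"`
}

type providerConfig struct {
	Keys []key `json:"keys"`
}

// duplicateKeyNames returns, in sorted order, every key name that appears
// more than once across all providers in the given JSON document.
func duplicateKeyNames(raw []byte) ([]string, error) {
	var providers map[string]providerConfig
	if err := json.Unmarshal(raw, &providers); err != nil {
		return nil, err
	}
	counts := map[string]int{}
	for _, p := range providers {
		for _, k := range p.Keys {
			counts[k.Name]++
		}
	}
	var dups []string
	for name, n := range counts {
		if n > 1 {
			dups = append(dups, name)
		}
	}
	sort.Strings(dups)
	return dups, nil
}

func main() {
	raw := []byte(`{
		"openai":    {"keys": [{"name": "primary"}, {"name": "backup"}]},
		"anthropic": {"keys": [{"name": "primary"}]}
	}`)
	dups, err := duplicateKeyNames(raw)
	if err != nil {
		panic(err)
	}
	fmt.Println("colliding key names:", dups)
}
```

Running a check like this before upgrading surfaces collisions while the previous version is still serving traffic.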
</Update>
<Update label="Framework" description="1.1.33">
feat: add migration for missing provider column in key table
</Update>
<Update label="governance" description="1.3.34">
chore: update framework version to 1.1.33
</Update>
<Update label="jsonparser" description="1.3.34">
chore: update framework version to 1.1.33
</Update>
<Update label="logging" description="1.3.34">
chore: update framework version to 1.1.33
</Update>
<Update label="maxim" description="1.4.33">
chore: update framework version to 1.1.33
</Update>
<Update label="mocker" description="1.3.33">
chore: update framework version to 1.1.33
</Update>
<Update label="otel" description="1.0.33">
chore: update framework version to 1.1.33
</Update>
<Update label="semantic_cache" description="1.3.33">
chore: update framework version to 1.1.33
</Update>
<Update label="telemetry" description="1.3.33">
chore: update framework version to 1.1.33
</Update>

View File

@@ -0,0 +1,62 @@
---
title: "v1.3.31"
description: "v1.3.31 changelog - 2025-11-19"
---
<Tabs>
<Tab title="NPX">
```bash
npx -y @maximhq/bifrost --transport-version v1.3.31
```
</Tab>
<Tab title="Docker">
```bash
docker pull maximhq/bifrost:v1.3.31
docker run -p 8080:8080 maximhq/bifrost:v1.3.31
```
</Tab>
</Tabs>
<Update label="Bifrost(HTTP)" description="1.3.31">
fix: integration fixes for fallbacks
</Update>
<Update label="Core" description="1.2.27">
fix: integration converter fixes for fallbacks
</Update>
<Update label="Framework" description="1.1.34">
chore: update core version to 1.2.27
</Update>
<Update label="governance" description="1.3.35">
chore: update core version to 1.2.27 and framework version to 1.1.34
</Update>
<Update label="jsonparser" description="1.3.35">
chore: update core version to 1.2.27 and framework version to 1.1.34
</Update>
<Update label="logging" description="1.3.35">
chore: update core version to 1.2.27 and framework version to 1.1.34
</Update>
<Update label="maxim" description="1.4.34">
chore: update core version to 1.2.27 and framework version to 1.1.34
</Update>
<Update label="mocker" description="1.3.34">
chore: update core version to 1.2.27 and framework version to 1.1.34
</Update>
<Update label="otel" description="1.0.34">
chore: update core version to 1.2.27 and framework version to 1.1.34
</Update>
<Update label="semantic_cache" description="1.3.34">
chore: update core version to 1.2.27 and framework version to 1.1.34
</Update>
<Update label="telemetry" description="1.3.34">
chore: update core version to 1.2.27 and framework version to 1.1.34
</Update>

View File

@@ -0,0 +1,72 @@
---
title: "v1.3.32"
description: "v1.3.32 changelog - 2025-11-20"
---
<Tabs>
<Tab title="NPX">
```bash
npx -y @maximhq/bifrost --transport-version v1.3.32
```
</Tab>
<Tab title="Docker">
```bash
docker pull maximhq/bifrost:v1.3.32
docker run -p 8080:8080 maximhq/bifrost:v1.3.32
```
</Tab>
</Tabs>
<Update label="Bifrost(HTTP)" description="1.3.32">
- feat: support added for structured output in the Anthropic provider
- fix: Gemini thought signature preservation for multi-turn function calling (#879)
- fix: responses API stream lifecycle events fixes
- fix: embedding models usage with vertex provider using gemini integration
- feat: support for anthropic passthrough in streaming for claude code
- fix: lookup for virtual key in authorization and x-api-key headers for provider routing
- fix: added responses stream passthrough for codex in openai integration
</Update>
<Update label="Core" description="1.2.28">
- feat: support added for structured output in the Anthropic provider
- fix: Gemini thought signature preservation for multi-turn function calling (#879)
- fix: responses API stream lifecycle events fixes
- feat: support for anthropic passthrough in streaming for claude code
</Update>
<Update label="Framework" description="1.1.35">
chore: update core version to 1.2.28
</Update>
<Update label="governance" description="1.3.36">
- chore: update core version to 1.2.28 and framework version to 1.1.35
- fix: lookup for virtual key in authorization and x-api-key headers
</Update>
<Update label="jsonparser" description="1.3.36">
chore: update core version to 1.2.28 and framework version to 1.1.35
</Update>
<Update label="logging" description="1.3.36">
chore: update core version to 1.2.28 and framework version to 1.1.35
</Update>
<Update label="maxim" description="1.4.35">
chore: update core version to 1.2.28 and framework version to 1.1.35
</Update>
<Update label="mocker" description="1.3.35">
chore: update core version to 1.2.28 and framework version to 1.1.35
</Update>
<Update label="otel" description="1.0.35">
chore: update core version to 1.2.28 and framework version to 1.1.35
</Update>
<Update label="semantic_cache" description="1.3.35">
chore: update core version to 1.2.28 and framework version to 1.1.35
</Update>
<Update label="telemetry" description="1.3.35">
chore: update core version to 1.2.28 and framework version to 1.1.35
</Update>

View File

@@ -0,0 +1,65 @@
---
title: "v1.3.33"
description: "v1.3.33 changelog - 2025-11-21"
---
<Tabs>
<Tab title="NPX">
```bash
npx -y @maximhq/bifrost --transport-version v1.3.33
```
</Tab>
<Tab title="Docker">
```bash
docker pull maximhq/bifrost:v1.3.33
docker run -p 8080:8080 maximhq/bifrost:v1.3.33
```
</Tab>
</Tabs>
<Update label="Bifrost(HTTP)" description="1.3.33">
- feat: Adds log retention config and a routine to cleanup logs daily based on the retention config. Default retention days are 365.
- fix: Added parsing for cached creation input tokens for Anthropic and Bedrock
- fix: Handled cost calculation for cached tokens
</Update>
<Update label="Core" description="1.2.29">
- fix: added parsing for cached creation input tokens for Anthropic and Bedrock
</Update>
<Update label="Framework" description="1.1.36">
- fix: handled cost calculation for cached tokens
- feat: adds support for log cleanup routine
</Update>
<Update label="governance" description="1.3.37">
- chore: updates core version to 1.2.29 and framework version to 1.1.36
</Update>
<Update label="jsonparser" description="1.3.37">
- chore: updates core version to 1.2.29 and framework version to 1.1.36
</Update>
<Update label="logging" description="1.3.37">
- chore: updates core version to 1.2.29 and framework version to 1.1.36
</Update>
<Update label="maxim" description="1.4.36">
- chore: updates core version to 1.2.29 and framework version to 1.1.36
</Update>
<Update label="mocker" description="1.3.36">
- chore: updates core version to 1.2.29 and framework version to 1.1.36
</Update>
<Update label="otel" description="1.0.36">
- chore: updates core version to 1.2.29 and framework version to 1.1.36
</Update>
<Update label="semantic_cache" description="1.3.36">
- chore: updates core version to 1.2.29 and framework version to 1.1.36
</Update>
<Update label="telemetry" description="1.3.36">
- chore: updates core version to 1.2.29 and framework version to 1.1.36
</Update>

View File

@@ -0,0 +1,59 @@
---
title: "v1.3.34"
description: "v1.3.34 changelog - 2025-11-21"
---
<Tabs>
<Tab title="NPX">
```bash
npx -y @maximhq/bifrost --transport-version v1.3.34
```
</Tab>
<Tab title="Docker">
```bash
docker pull maximhq/bifrost:v1.3.34
docker run -p 8080:8080 maximhq/bifrost:v1.3.34
```
</Tab>
</Tabs>
<Update label="Bifrost(HTTP)" description="1.3.34">
- feat: Log view is enabled even if config_store is disabled
- fix: Add missing cache and batch pricing columns to ensure we compute costs for those operations accurately.
</Update>
<Update label="Framework" description="1.1.37">
hotfix: Adds missing batch and cache token pricing columns in config_store
</Update>
<Update label="governance" description="1.3.38">
- chore: upgrades framework version to 1.1.37
</Update>
<Update label="jsonparser" description="1.3.38">
- chore: upgrades framework version to 1.1.37
</Update>
<Update label="logging" description="1.3.38">
- chore: upgrades framework version to 1.1.37
</Update>
<Update label="maxim" description="1.4.37">
- chore: upgrades framework version to 1.1.37
</Update>
<Update label="mocker" description="1.3.37">
- chore: upgrades framework version to 1.1.37
</Update>
<Update label="otel" description="1.0.37">
- chore: upgrades framework version to 1.1.37
</Update>
<Update label="semantic_cache" description="1.3.37">
- chore: upgrades framework version to 1.1.37
</Update>
<Update label="telemetry" description="1.3.37">
- chore: upgrades framework version to 1.1.37
</Update>

View File

@@ -0,0 +1,71 @@
---
title: "v1.3.35"
description: "v1.3.35 changelog - 2025-11-24"
---
<Tabs>
<Tab title="NPX">
```bash
npx -y @maximhq/bifrost --transport-version v1.3.35
```
</Tab>
<Tab title="Docker">
```bash
docker pull maximhq/bifrost:v1.3.35
docker run -p 8080:8080 maximhq/bifrost:v1.3.35
```
</Tab>
</Tabs>
<Update label="Bifrost(HTTP)" description="1.3.35">
- feat: Qdrant Vector Search Support (#893)
- fix: bedrock responses streaming last chunk indicator fixes
- fix: gemini nil content check fixes
- fix: handle responses.incomplete event in openai responses streaming
- fix: stream accumulator nil content check fixes
</Update>
<Update label="Core" description="1.2.30">
- fix: bedrock responses streaming last chunk indicator fixes
- fix: gemini nil content check fixes
- fix: handle responses.incomplete event in openai responses streaming
- enhancements: provider tests enhancements
</Update>
<Update label="Framework" description="1.1.38">
- feat: Qdrant Vector Search Support (#893)
- fix: stream accumulator nil content check fixes
- enhancement: added transactions on provider config updates
</Update>
<Update label="governance" description="1.3.39">
- chore: upgrades core to 1.2.30 and framework to 1.1.38
</Update>
<Update label="jsonparser" description="1.3.39">
- chore: upgrades core to 1.2.30 and framework to 1.1.38
</Update>
<Update label="logging" description="1.3.39">
- chore: upgrades core to 1.2.30 and framework to 1.1.38
</Update>
<Update label="maxim" description="1.4.38">
- chore: upgrades core to 1.2.30 and framework to 1.1.38
</Update>
<Update label="mocker" description="1.3.38">
- chore: upgrades core to 1.2.30 and framework to 1.1.38
</Update>
<Update label="otel" description="1.0.38">
- chore: upgrades core to 1.2.30 and framework to 1.1.38
</Update>
<Update label="semantic_cache" description="1.3.38">
- chore: upgrades core to 1.2.30 and framework to 1.1.38
</Update>
<Update label="telemetry" description="1.3.38">
- chore: upgrades core to 1.2.30 and framework to 1.1.38
</Update>

View File

@@ -0,0 +1,60 @@
---
title: "v1.3.36"
description: "v1.3.36 changelog - 2025-11-25"
---
<Tabs>
<Tab title="NPX">
```bash
npx -y @maximhq/bifrost --transport-version v1.3.36
```
</Tab>
<Tab title="Docker">
```bash
docker pull maximhq/bifrost:v1.3.36
docker run -p 8080:8080 maximhq/bifrost:v1.3.36
```
</Tab>
</Tabs>
<Update label="Bifrost(HTTP)" description="1.3.36">
- feat: opus 4.5 is supported
- chore: changelog structure update
- fix: race conditions in stream accumulator
</Update>
<Update label="Framework" description="1.1.39">
- fix: Fixes race condition in accumulator
</Update>
<Update label="governance" description="1.3.40">
- chore: upgrades framework version to 1.1.39
</Update>
<Update label="jsonparser" description="1.3.40">
- chore: upgrades framework version to 1.1.39
</Update>
<Update label="logging" description="1.3.40">
- chore: upgrades framework version to 1.1.39
</Update>
<Update label="maxim" description="1.4.39">
- chore: upgrades framework version to 1.1.39
</Update>
<Update label="mocker" description="1.3.39">
- chore: upgrades framework version to 1.1.39
</Update>
<Update label="otel" description="1.0.39">
- chore: upgrades framework version to 1.1.39
</Update>
<Update label="semantic_cache" description="1.3.39">
- chore: upgrades framework version to 1.1.39
</Update>
<Update label="telemetry" description="1.3.39">
- chore: upgrades framework version to 1.1.39
</Update>

View File

@@ -0,0 +1,78 @@
---
title: "v1.3.37"
description: "v1.3.37 changelog - 2025-11-28"
---
<Tabs>
<Tab title="NPX">
```bash
npx -y @maximhq/bifrost --transport-version v1.3.37
```
</Tab>
<Tab title="Docker">
```bash
docker pull maximhq/bifrost:v1.3.37
docker run -p 8080:8080 maximhq/bifrost:v1.3.37
```
</Tab>
</Tabs>
<Update label="Bifrost(HTTP)" description="1.3.37">
- feat: pydantic SDK support
- feat: bedrock SDK support
- feat: adds versioning support for plugins
- **breaking change**: plugins now accept `*schemas.BifrostContext` instead of `*context.Context`
- fix: gemini tts fixes with audio encoding for cross SDK compatibility
- feat: improved virtual key configuration flows
- chore: improved test coverage
- feat: check allowed models from model catalog for provider routing using virtual keys
- fix: log cleanup timestamp in UTC to match log entry timestamps for processing logs
- fix: prompt caching issue fixes for openai chat completions
</Update>
<Update label="Core" description="1.2.31">
- **breaking change**: plugins now accept `*schemas.BifrostContext` instead of `*context.Context`
- feat: adds support for bedrock, pydantic and cohere SDK.
- fix: minor fixes around audio streaming for gemini and vertex
- fix: prompt caching issue fixes for openai chat completions
- feat: add versioning support for plugins
- **breaking change**: `ToolFunctionParameters.Properties` is now an `*OrderedMap` instead of `*map[string]interface{}`
</Update>
<Update label="Framework" description="1.1.40">
- feat: adds audio encoding flows for gemini tts workflows
</Update>
<Update label="governance" description="1.3.41">
- chore: upgrades core to 1.2.31 and framework to 1.1.40
- feat: check allowed models from model catalog for provider configs
</Update>
<Update label="jsonparser" description="1.3.41">
- chore: upgrades core to 1.2.31 and framework to 1.1.40
</Update>
<Update label="logging" description="1.3.41">
- chore: upgrades core to 1.2.31 and framework to 1.1.40
- fix: log cleanup timestamp in UTC to match log entry timestamps
</Update>
<Update label="maxim" description="1.4.40">
- chore: upgrades core to 1.2.31 and framework to 1.1.40
</Update>
<Update label="mocker" description="1.3.40">
- chore: upgrades core to 1.2.31 and framework to 1.1.40
</Update>
<Update label="otel" description="1.0.40">
- chore: upgrades core to 1.2.31 and framework to 1.1.40
</Update>
<Update label="semantic_cache" description="1.3.40">
- chore: upgrades core to 1.2.31 and framework to 1.1.40
</Update>
<Update label="telemetry" description="1.3.40">
- chore: upgrades core to 1.2.31 and framework to 1.1.40
</Update>

View File

@@ -0,0 +1,74 @@
---
title: "v1.3.38"
description: "v1.3.38 changelog - 2025-12-01"
---
<Tabs>
<Tab title="NPX">
```bash
npx -y @maximhq/bifrost --transport-version v1.3.38
```
</Tab>
<Tab title="Docker">
```bash
docker pull maximhq/bifrost:v1.3.38
docker run -p 8080:8080 maximhq/bifrost:v1.3.38
```
</Tab>
</Tabs>
<Update label="Bifrost(HTTP)" description="1.3.38">
- feat: support added for the Google Gemini style `x-goog-api-key` header for virtual key lookup and direct API key bypass
- feat: added support for Anthropic models in Azure
- chore: version update core to 1.2.32 and framework to 1.1.41
- fix: provider retry config time conversion issue
- fix: cache read input token cost calculation bug
- enhancement: made model lookup for pricing more robust
</Update>
<Update label="Core" description="1.2.32">
- feat: added support for Anthropic models in Azure
- enhancement: using native Anthropic converters for Vertex Anthropic responses and responses stream
- **breaking change**: `NetworkConfig` retry backoff values (`RetryBackoffInitial` and `RetryBackoffMax`) are now expressed in milliseconds in JSON while being stored as `time.Duration` internally. Custom `MarshalJSON`/`UnmarshalJSON` methods ensure values are always interpreted as milliseconds when serializing to or deserializing from JSON, fixing cases where values were incorrectly interpreted as nanoseconds.
</Update>
<Update label="Framework" description="1.1.41">
- chore: version update core to 1.2.32
- fix: cache read input token cost calculation bug
- enhancement: made bedrock model lookup more robust
- enhancement: added support for deployment lookup in pricing
</Update>
<Update label="governance" description="1.3.42">
- feat: support added for x-goog-api-key header for Google Gemini style
- chore: version update core to 1.2.32 and framework to 1.1.41
</Update>
<Update label="jsonparser" description="1.3.42">
- chore: version update core to 1.2.32 and framework to 1.1.41
</Update>
<Update label="logging" description="1.3.42">
- chore: version update core to 1.2.32 and framework to 1.1.41
- fix: log entry number of retries not being updated
</Update>
<Update label="maxim" description="1.4.41">
- chore: version update core to 1.2.32 and framework to 1.1.41
</Update>
<Update label="mocker" description="1.3.41">
- chore: version update core to 1.2.32 and framework to 1.1.41
</Update>
<Update label="otel" description="1.0.41">
- chore: version update core to 1.2.32 and framework to 1.1.41
</Update>
<Update label="semantic_cache" description="1.3.41">
- chore: version update core to 1.2.32 and framework to 1.1.41
</Update>
<Update label="telemetry" description="1.3.41">
- chore: version update core to 1.2.32 and framework to 1.1.41
</Update>

View File

@@ -0,0 +1,70 @@
---
title: "v1.3.39"
description: "v1.3.39 changelog - 2025-12-04"
---
<Tabs>
<Tab title="NPX">
```bash
npx -y @maximhq/bifrost --transport-version v1.3.39
```
</Tab>
<Tab title="Docker">
```bash
docker pull maximhq/bifrost:v1.3.39
docker run -p 8080:8080 maximhq/bifrost:v1.3.39
```
</Tab>
</Tabs>
<Update label="Bifrost(HTTP)" description="1.3.39">
- fix: vertex and bedrock usage aggregation improvements for streaming
- fix: choice index fixed to 0 for anthropic and bedrock streaming
- feat: model field added to responses api response
- feat: check allowed models and deployments of key for list models
- fix: UI breaking when list models is empty on virtual key provider config
- chore: update core version to 1.2.33 and framework version to 1.1.42
</Update>
<Update label="Core" description="1.2.33">
- fix: vertex and bedrock usage aggregation improvements for streaming
- fix: choice index fixed to 0 for anthropic and bedrock streaming
- feat: model field added to responses api response
- feat: check allowed models and deployments of key for list models
</Update>
<Update label="Framework" description="1.1.42">
- chore: update core version to 1.2.33
</Update>
<Update label="governance" description="1.3.43">
- chore: update core version to 1.2.33 and framework version to 1.1.42
</Update>
<Update label="jsonparser" description="1.3.43">
- chore: update core version to 1.2.33 and framework version to 1.1.42
</Update>
<Update label="logging" description="1.3.43">
- chore: update core version to 1.2.33 and framework version to 1.1.42
</Update>
<Update label="maxim" description="1.4.42">
- chore: update core version to 1.2.33 and framework version to 1.1.42
</Update>
<Update label="mocker" description="1.3.42">
- chore: update core version to 1.2.33 and framework version to 1.1.42
</Update>
<Update label="otel" description="1.0.42">
- chore: update core version to 1.2.33 and framework version to 1.1.42
</Update>
<Update label="semantic_cache" description="1.3.42">
- chore: update core version to 1.2.33 and framework version to 1.1.42
</Update>
<Update label="telemetry" description="1.3.42">
- chore: update core version to 1.2.33 and framework version to 1.1.42
</Update>

View File

@@ -0,0 +1,81 @@
---
title: "v1.3.4"
description: "v1.3.4 changelog"
---
<Tabs>
<Tab title="NPX">
```bash
npx -y @maximhq/bifrost --transport-version v1.3.4
```
</Tab>
<Tab title="Docker">
```bash
docker pull maximhq/bifrost:v1.3.4
docker run -p 8080:8080 maximhq/bifrost:v1.3.4
```
</Tab>
</Tabs>
<Update label="Bifrost(HTTP)" description="v1.3.4">
- Upgrade dependency: core to 1.2.10 and framework to 1.1.10
- Feat: Added virtual key level support for MCP tools to execute
- Feat: Added names to keys
- Fix: provider selection from url params
</Update>
<Update label="Core" description="v1.3.4">
- Feat: Added key name field to account schema for external key management
- Feat: Simplified MCP client management by removing toolsToSkip field, allowing wildcard (*) for all tools, and better tool filtering logic.
</Update>
<Update label="Framework" description="v1.3.4">
- Upgrade dependency: core to 1.2.10
- Feat: Added key name column to config keys table
- Feat: Removed tools_to_skip field from MCP client config table
- Feat: Added virtual_key_mcp_config table to store MCP client configs for virtual keys along with its relationships
</Update>
<Update label="governance" description="v1.3.4">
- chore: version update core to 1.2.10 and framework to 1.1.10
- feat: added virtual key level support for MCP tools to execute
</Update>
<Update label="jsonparser" description="v1.3.4">
- chore: version update core to 1.2.10 and framework to 1.1.10
</Update>
<Update label="logging" description="v1.3.4">
- chore: version update core to 1.2.10 and framework to 1.1.10
</Update>
<Update label="maxim" description="v1.3.4">
- chore: version update core to 1.2.10 and framework to 1.1.10
</Update>
<Update label="mocker" description="v1.3.4">
- chore: version update core to 1.2.10 and framework to 1.1.10
</Update>
<Update label="otel" description="v1.3.4">
- chore: version update core to 1.2.10 and framework to 1.1.10
</Update>
<Update label="semantic_cache" description="v1.3.4">
- chore: version update core to 1.2.10
</Update>
<Update label="telemetry" description="v1.3.4">
- chore: version update core to 1.2.10 and framework to 1.1.10
</Update>

View File

@@ -0,0 +1,22 @@
---
title: "v1.3.40"
description: "v1.3.40 changelog - 2025-12-04"
---
<Tabs>
<Tab title="NPX">
```bash
npx -y @maximhq/bifrost --transport-version v1.3.40
```
</Tab>
<Tab title="Docker">
```bash
docker pull maximhq/bifrost:v1.3.40
docker run -p 8080:8080 maximhq/bifrost:v1.3.40
```
</Tab>
</Tabs>
<Update label="Bifrost(HTTP)" description="1.3.40">
- security: upgrades React and Next.js to patch [CVE-2025-66478](https://nextjs.org/blog/CVE-2025-66478)
</Update>

View File

@@ -0,0 +1,26 @@
---
title: "v1.3.41"
description: "v1.3.41 changelog - 2025-12-05"
---
<Tabs>
<Tab title="NPX">
```bash
npx -y @maximhq/bifrost --transport-version v1.3.41
```
</Tab>
<Tab title="Docker">
```bash
docker pull maximhq/bifrost:v1.3.41
docker run -p 8080:8080 maximhq/bifrost:v1.3.41
```
</Tab>
</Tabs>
<Update label="Bifrost(HTTP)" description="1.3.41">
- fix: remove UPX binary compression from Docker build to resolve segmentation faults when combined with PIE (Position Independent Executable)
</Update>
<Update label="maxim" description="1.4.43">
chore: Refactored the Maxim plugin to move tag handling from pre-hook to post-hook, improving the tag management process for generations.
</Update>

View File

@@ -0,0 +1,63 @@
---
title: "v1.3.42"
description: "v1.3.42 changelog - 2025-12-05"
---
<Tabs>
<Tab title="NPX">
```bash
npx -y @maximhq/bifrost --transport-version v1.3.42
```
</Tab>
<Tab title="Docker">
```bash
docker pull maximhq/bifrost:v1.3.42
docker run -p 8080:8080 maximhq/bifrost:v1.3.42
```
</Tab>
</Tabs>
<Update label="Bifrost(HTTP)" description="1.3.42">
- fix: added region prefix check for bedrock list models
- chore: update core version to 1.2.34 and framework version to 1.1.43
</Update>
<Update label="Core" description="1.2.34">
- fix: added region prefix check for bedrock list models
</Update>
<Update label="Framework" description="1.1.43">
- chore: upgraded core version to 1.2.34
</Update>
<Update label="governance" description="1.3.44">
- chore: update core version to 1.2.34 and framework version to 1.1.43
</Update>
<Update label="jsonparser" description="1.3.44">
- chore: update core version to 1.2.34 and framework version to 1.1.43
</Update>
<Update label="logging" description="1.3.44">
- chore: update core version to 1.2.34 and framework version to 1.1.43
</Update>
<Update label="maxim" description="1.4.44">
- chore: update core version to 1.2.34 and framework version to 1.1.43
</Update>
<Update label="mocker" description="1.3.43">
- chore: update core version to 1.2.34 and framework version to 1.1.43
</Update>
<Update label="otel" description="1.0.43">
- chore: update core version to 1.2.34 and framework version to 1.1.43
</Update>
<Update label="semantic_cache" description="1.3.43">
- chore: update core version to 1.2.34 and framework version to 1.1.43
</Update>
<Update label="telemetry" description="1.3.43">
- chore: update core version to 1.2.34 and framework version to 1.1.43
</Update>

View File

@@ -0,0 +1,72 @@
---
title: "v1.3.43"
description: "v1.3.43 changelog - 2025-12-09"
---
<Tabs>
<Tab title="NPX">
```bash
npx -y @maximhq/bifrost --transport-version v1.3.43
```
</Tab>
<Tab title="Docker">
```bash
docker pull maximhq/bifrost:v1.3.43
docker run -p 8080:8080 maximhq/bifrost:v1.3.43
```
</Tab>
</Tabs>
<Update label="Bifrost(HTTP)" description="1.3.43">
- feat: adds global proxy support
- feat: adds datadog native integration handling
- feat: enterprise plugin handling for OSS
- feat: adds support for `OTEL_RESOURCE_ATTRIBUTES` in the otel plugin
- chore: some minor bug fixes
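
`OTEL_RESOURCE_ATTRIBUTES` is the standard OpenTelemetry environment variable holding comma-separated `key=value` pairs. The sketch below shows one way such a value could be parsed; `parseResourceAttributes` is a hypothetical helper, not the otel plugin's code, and the example attribute values are made up.

```go
package main

import (
	"fmt"
	"net/url"
	"os"
	"strings"
)

// parseResourceAttributes splits a comma-separated list of key=value pairs.
// Entries without "=" or with an empty key are skipped, and percent-encoded
// values are decoded when possible.
func parseResourceAttributes(raw string) map[string]string {
	attrs := map[string]string{}
	for _, pair := range strings.Split(raw, ",") {
		k, v, ok := strings.Cut(pair, "=")
		k = strings.TrimSpace(k)
		if !ok || k == "" {
			continue
		}
		v = strings.TrimSpace(v)
		if decoded, err := url.PathUnescape(v); err == nil {
			v = decoded
		}
		attrs[k] = v
	}
	return attrs
}

func main() {
	raw := os.Getenv("OTEL_RESOURCE_ATTRIBUTES")
	if raw == "" {
		// Example value used only when the variable is not set.
		raw = "service.namespace=gateway,deployment.environment=staging"
	}
	for k, v := range parseResourceAttributes(raw) {
		fmt.Printf("%s=%s\n", k, v)
	}
}
```

A typical invocation would export the variable before starting the process, for example `OTEL_RESOURCE_ATTRIBUTES="deployment.environment=staging"`.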
</Update>
<Update label="Core" description="1.2.35">
- feat: added missing extrafields to errors in core
- feat: adds global proxy support
- feat: handle cached tokens in Anthropic streaming responses
- fix: adds status field for responses API
</Update>
<Update label="Framework" description="1.1.44">
- feat: adds global proxy support
- feat: enterprise plugin handling
</Update>
<Update label="governance" description="1.3.45">
- chore: updating core to 1.2.35 and framework to 1.1.44
</Update>
<Update label="jsonparser" description="1.3.45">
- chore: updating core to 1.2.35 and framework to 1.1.44
</Update>
<Update label="logging" description="1.3.45">
- chore: updating core to 1.2.35 and framework to 1.1.44
</Update>
<Update label="maxim" description="1.4.45">
- chore: updating core to 1.2.35 and framework to 1.1.44
</Update>
<Update label="mocker" description="1.3.44">
- chore: updating core to 1.2.35 and framework to 1.1.44
</Update>
<Update label="otel" description="1.0.44">
- feat: add custom CA TLS cert support for protocols
- feat: enterprise plugin handling
- chore: updating core to 1.2.35 and framework to 1.1.44
</Update>
<Update label="semantic_cache" description="1.3.44">
- chore: updating core to 1.2.35 and framework to 1.1.44
</Update>
<Update label="telemetry" description="1.3.44">
- chore: updating core to 1.2.35 and framework to 1.1.44
</Update>

View File

@@ -0,0 +1,60 @@
---
title: "v1.3.44"
description: "v1.3.44 changelog - 2025-12-10"
---
<Tabs>
<Tab title="NPX">
```bash
npx -y @maximhq/bifrost --transport-version v1.3.44
```
</Tab>
<Tab title="Docker">
```bash
docker pull maximhq/bifrost:v1.3.44
docker run -p 8080:8080 maximhq/bifrost:v1.3.44
```
</Tab>
</Tabs>
<Update label="Bifrost(HTTP)" description="1.3.44">
- feat: adds rbac support across all pages
- fix: fixes config.json - config store streaming cases for virtual keys, providers and keys. Improved test coverage for this flow.
- fix: adds support for text streaming logging
</Update>
<Update label="Framework" description="1.1.45">
- fix: adds support for text streaming accumulation
</Update>
<Update label="governance" description="1.3.46">
- chore: updates framework to 1.1.45
</Update>
<Update label="jsonparser" description="1.3.46">
- chore: updates framework to 1.1.45
</Update>
<Update label="logging" description="1.3.46">
- chore: updates framework to 1.1.45
</Update>
<Update label="maxim" description="1.4.46">
- chore: updates framework to 1.1.45
</Update>
<Update label="mocker" description="1.3.45">
- chore: updates framework to 1.1.45
</Update>
<Update label="otel" description="1.0.45">
- chore: updates framework to 1.1.45
</Update>
<Update label="semantic_cache" description="1.3.45">
- chore: updates framework to 1.1.45
</Update>
<Update label="telemetry" description="1.3.45">
- chore: updates framework to 1.1.45
</Update>


@@ -0,0 +1,66 @@
---
title: "v1.3.45"
description: "v1.3.45 changelog - 2025-12-11"
---
<Tabs>
<Tab title="NPX">
```bash
npx -y @maximhq/bifrost --transport-version v1.3.45
```
</Tab>
<Tab title="Docker">
```bash
docker pull maximhq/bifrost:v1.3.45
docker run -p 8080:8080 maximhq/bifrost:v1.3.45
```
</Tab>
</Tabs>
<Update label="Bifrost(HTTP)" description="1.3.45">
- feat: complete config.json to config-store sync using hash
- fix: structured output in bedrock, cohere and anthropic
- fix: tool calls in bedrock chat completion
</Update>
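
The changelog does not describe how the hash-based sync works internally. The sketch below only illustrates the general technique of detecting changes by comparing a content hash against a stored one. The file names, the `needs_sync` and `sync_config` helpers, and the use of `sha256sum` are all assumptions made for this example, not Bifrost's implementation.

```bash
# Illustration only: hypothetical file names and helpers, not Bifrost's code.
workdir="$(mktemp -d)"
config="$workdir/config.json"
stored_hash_file="$workdir/config.hash"

# Hash the file contents (assumes GNU coreutils' sha256sum is available).
hash_file() { sha256sum "$1" | cut -d' ' -f1; }

# Succeeds when no hash is stored yet or the stored hash differs from the file's.
needs_sync() {
  [ ! -f "$stored_hash_file" ] || [ "$(hash_file "$config")" != "$(cat "$stored_hash_file")" ]
}

# Record the new hash when the content changed; otherwise do nothing.
sync_config() {
  if needs_sync; then
    hash_file "$config" > "$stored_hash_file"
    echo "synced"
  else
    echo "unchanged"
  fi
}

printf '{"providers":{}}\n' > "$config"
first="$(sync_config)"    # first run: no stored hash yet
second="$(sync_config)"   # same content: nothing to do
printf '{"providers":{"openai":{}}}\n' > "$config"
third="$(sync_config)"    # content changed: sync again
echo "$first $second $third"
```

Comparing hashes rather than timestamps means a rewritten but identical file does not trigger a sync.
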
<Update label="Core" description="1.2.36">
- feat: complete config.json to config-store sync using hash
- fix: structured output in bedrock, cohere and anthropic
- fix: tool calls in bedrock chat completion
</Update>
<Update label="Framework" description="1.1.46">
- chore: updating core to 1.2.36 and framework to 1.1.46
</Update>
<Update label="governance" description="1.3.47">
- chore: updating core to 1.2.36 and framework to 1.1.46
</Update>
<Update label="jsonparser" description="1.3.47">
- chore: updating core to 1.2.36 and framework to 1.1.46
</Update>
<Update label="logging" description="1.3.47">
- chore: updating core to 1.2.36 and framework to 1.1.46
</Update>
<Update label="maxim" description="1.4.47">
- chore: updating core to 1.2.36 and framework to 1.1.46
</Update>
<Update label="mocker" description="1.3.46">
- chore: updating core to 1.2.36 and framework to 1.1.46
</Update>
<Update label="otel" description="1.0.46">
- chore: updating core to 1.2.36 and framework to 1.1.46
</Update>
<Update label="semantic_cache" description="1.3.46">
- chore: updating core to 1.2.36 and framework to 1.1.46
</Update>
<Update label="telemetry" description="1.3.46">
- chore: updating core to 1.2.36 and framework to 1.1.46
</Update>


@@ -0,0 +1,22 @@
---
title: "v1.3.46"
description: "v1.3.46 changelog - 2025-12-12"
---
<Tabs>
<Tab title="NPX">
```bash
npx -y @maximhq/bifrost --transport-version v1.3.46
```
</Tab>
<Tab title="Docker">
```bash
docker pull maximhq/bifrost:v1.3.46
docker run -p 8080:8080 maximhq/bifrost:v1.3.46
```
</Tab>
</Tabs>
<Update label="Bifrost(HTTP)" description="1.3.46">
- hotfix: security patches for [react](https://react.dev/blog/2025/12/11/denial-of-service-and-source-code-exposure-in-react-server-components) and [nextjs](https://nextjs.org/blog/security-update-2025-12-11)
</Update>


@@ -0,0 +1,76 @@
---
title: "v1.3.47"
description: "v1.3.47 changelog - 2025-12-12"
---
<Tabs>
<Tab title="NPX">
```bash
npx -y @maximhq/bifrost --transport-version v1.3.47
```
</Tab>
<Tab title="Docker">
```bash
docker pull maximhq/bifrost:v1.3.47
docker run -p 8080:8080 maximhq/bifrost:v1.3.47
```
</Tab>
</Tabs>
<Update label="Bifrost(HTTP)" description="1.3.47">
- feat: support for raw response accumulation for streaming
- feat: support for raw request logging and sending back in response
- feat: added support for reasoning in chat completions
- feat: enhanced reasoning support in responses api
- enhancement: improved internal inter-provider conversions for integrations
- feat: switched to the Gemini native API
</Update>
<Update label="Core" description="1.2.37">
- feat: send back raw request in extra fields
- feat: added support for reasoning in chat completions
- feat: enhanced reasoning support in responses api
- enhancement: improved internal inter-provider conversions for integrations
- feat: switched to the Gemini native API
- feat: fallback to supported request type for custom models used in integration
</Update>
<Update label="Framework" description="1.1.47">
- feat: support raw response accumulation in stream accumulator
- feat: support raw request configuration and logging
- feat: added support for reasoning accumulation in stream accumulator
- chore: updating core to 1.2.37 and framework to 1.1.47
</Update>
<Update label="governance" description="1.3.48">
- chore: updating core to 1.2.37 and framework to 1.1.47
</Update>
<Update label="jsonparser" description="1.3.48">
- chore: updating core to 1.2.37 and framework to 1.1.47
</Update>
<Update label="logging" description="1.3.48">
- feat: support for raw request logging
- chore: updating core to 1.2.37 and framework to 1.1.47
</Update>
<Update label="maxim" description="1.4.48">
- chore: updating core to 1.2.37 and framework to 1.1.47
</Update>
<Update label="mocker" description="1.3.47">
- chore: updating core to 1.2.37 and framework to 1.1.47
</Update>
<Update label="otel" description="1.0.47">
- chore: updating core to 1.2.37 and framework to 1.1.47
</Update>
<Update label="semantic_cache" description="1.3.47">
- chore: updating core to 1.2.37 and framework to 1.1.47
</Update>
<Update label="telemetry" description="1.3.47">
- chore: updating core to 1.2.37 and framework to 1.1.47
</Update>


@@ -0,0 +1,22 @@
---
title: "v1.3.48"
description: "v1.3.48 changelog - 2025-12-12"
---
<Tabs>
<Tab title="NPX">
```bash
npx -y @maximhq/bifrost --transport-version v1.3.48
```
</Tab>
<Tab title="Docker">
```bash
docker pull maximhq/bifrost:v1.3.48
docker run -p 8080:8080 maximhq/bifrost:v1.3.48
```
</Tab>
</Tabs>
<Update label="Bifrost(HTTP)" description="1.3.48">
- chore: second round of security patches for Next.js and React
</Update>


@@ -0,0 +1,79 @@
---
title: "v1.3.49"
description: "v1.3.49 changelog - 2025-12-16"
---
<Tabs>
<Tab title="NPX">
```bash
npx -y @maximhq/bifrost --transport-version v1.3.49
```
</Tab>
<Tab title="Docker">
```bash
docker pull maximhq/bifrost:v1.3.49
docker run -p 8080:8080 maximhq/bifrost:v1.3.49
```
</Tab>
</Tabs>
<Update label="Bifrost(HTTP)" description="1.3.49">
- feat: add `x-bf-api-key` header to send requests with a key by name
- feat: parse `x-bf-eh-*` request headers as extra headers
- feat: added API endpoint `/api/pricing/force-sync`
- feat: support for raw response accumulation for streaming
- feat: add support for enabling/disabling provider keys without deletion.
- feat: add batch api support for OpenAI, Anthropic, Google Gemini and Bedrock <Badge color="blue">Beta</Badge>.
- feat: new provider support - nebius.
- feat: force refresh datasheet support.
- fix: fixed minor issues with structured output support for Gemini and Bedrock.
- fix: fixed base cost computation from token usage for models like Gemini
- chore: CORS policy now allows `x-stainless-timeout`
</Update>
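
A minimal sketch of how the two new request headers might be used together. The changelog only states that `x-bf-api-key` selects a key by name and that `x-bf-eh-*` headers are parsed as extra headers. The endpoint path, key name, model string, and the `x-bf-eh-x-trace-source` header below are assumptions for illustration. The script prints the command by default instead of sending it.

```bash
# Assumptions (not from the changelog): gateway URL and path, key name,
# model string, and the extra header name are all placeholders.
BIFROST_URL="${BIFROST_URL:-http://localhost:8080}"
KEY_NAME="${KEY_NAME:-my-openai-key}"
DRY_RUN="${DRY_RUN:-1}"   # 1 = print the command, 0 = actually send it

bifrost_request() {
  # Assemble the curl invocation as this function's positional parameters.
  set -- curl -sS "${BIFROST_URL}/v1/chat/completions" \
    -H "Content-Type: application/json" \
    -H "x-bf-api-key: ${KEY_NAME}" \
    -H "x-bf-eh-x-trace-source: changelog-demo" \
    -d '{"model":"openai/gpt-4o-mini","messages":[{"role":"user","content":"hello"}]}'
  if [ "$DRY_RUN" = "1" ]; then
    printf '%s\n' "$*"   # show what would be sent
  else
    "$@"                 # send the request to a running gateway
  fi
}

bifrost_request
```

Setting `DRY_RUN=0` sends the request to a running gateway. How the `x-bf-eh-` prefix is handled before the header reaches the provider is not specified in this changelog.
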
<Update label="Core" description="1.2.38">
- feat: adds batch and files API support for bedrock, openai, anthropic and gemini
- feat: new provider support - nebius
- feat: structured output support
- fix: vertex and bedrock usage aggregation improvements for streaming
- fix: choice index fixed to 0 for anthropic and bedrock streaming
</Update>
<Update label="Framework" description="1.1.48">
- feat: added a force-sync function for pricing, plus pricing based on the 200k-token threshold
- feat: adds logging support for batch and file requests
- chore: upgrades core to 1.2.38 and framework to 1.1.48
</Update>
<Update label="governance" description="1.3.49">
- chore: upgrades core to 1.2.38 and framework to 1.1.48
</Update>
<Update label="jsonparser" description="1.3.49">
- chore: upgrades core to 1.2.38 and framework to 1.1.48
</Update>
<Update label="logging" description="1.3.49">
- chore: upgrades core to 1.2.38 and framework to 1.1.48
</Update>
<Update label="maxim" description="1.4.49">
- chore: upgrades core to 1.2.38 and framework to 1.1.48
</Update>
<Update label="mocker" description="1.3.48">
- chore: upgrades core to 1.2.38 and framework to 1.1.48
</Update>
<Update label="otel" description="1.0.48">
- feat: add batch and file request logging support; refactor centralized request handling
- chore: upgrades core to 1.2.38 and framework to 1.1.48
</Update>
<Update label="semantic_cache" description="1.3.48">
- chore: upgrades core to 1.2.38 and framework to 1.1.48
</Update>
<Update label="telemetry" description="1.3.48">
- feat: adds logging support for batch and file requests
- chore: upgrades core to 1.2.38 and framework to 1.1.48
</Update>


@@ -0,0 +1,75 @@
---
title: "v1.3.5"
description: "v1.3.5 changelog"
---
<Tabs>
<Tab title="NPX">
```bash
npx -y @maximhq/bifrost --transport-version v1.3.5
```
</Tab>
<Tab title="Docker">
```bash
docker pull maximhq/bifrost:v1.3.5
docker run -p 8080:8080 maximhq/bifrost:v1.3.5
```
</Tab>
</Tabs>
<Update label="Bifrost(HTTP)" description="v1.3.5">
- chore: version update framework to 1.1.11
- fix: added missing migration for `cost` and `cache_debug` columns in logs table for old databases.
</Update>
<Update label="Core" description="v1.3.5">
- feat: added key name field to account schema for external key management
- feat: simplified MCP client management by removing the `toolsToSkip` field, allowing a wildcard (`*`) for all tools, and improving tool filtering logic.
</Update>
<Update label="Framework" description="v1.3.5">
- fix: added missing migration for `cost` and `cache_debug` columns in logs table for old databases.
</Update>
<Update label="governance" description="v1.3.5">
- chore: version update framework to 1.1.11
</Update>
<Update label="jsonparser" description="v1.3.5">
- chore: version update framework to 1.1.11
</Update>
<Update label="logging" description="v1.3.5">
- chore: version update framework to 1.1.11
</Update>
<Update label="maxim" description="v1.3.5">
- chore: version update framework to 1.1.11
</Update>
<Update label="mocker" description="v1.3.5">
- chore: version update framework to 1.1.11
</Update>
<Update label="otel" description="v1.3.5">
- chore: version update framework to 1.1.11
</Update>
<Update label="semantic_cache" description="v1.3.5">
- chore: version update framework to 1.1.11
</Update>
<Update label="telemetry" description="v1.3.5">
- chore: version update framework to 1.1.11
</Update>
