first commit

2026-04-26 21:52:23 +03:00
commit 880f412e2c
2662 changed files with 866266 additions and 0 deletions
--- a/docs/benchmarking/getting-started.mdx
+++ b/docs/benchmarking/getting-started.mdx
@@ -0,0 +1,81 @@
+---
+title: "Getting Started"
+description: "Introduction to Bifrost's performance capabilities and how to choose the right instance size for your workload."
+icon: "rocket"
+---
+
+## Overview
+
+Bifrost has been benchmarked under sustained load at **5,000 requests per second (RPS)** on two AWS EC2 instance types, t3.medium and t3.xlarge, to characterize its performance for production deployments.
+
+**Key Performance Highlights:**
+- **Perfect Success Rate**: 100% request success rate at 5,000 RPS on both instance types
+- **Minimal Overhead**: 11 µs added latency per request on average on t3.xlarge (59 µs on t3.medium)
+- **Efficient Queue Management**: Queue wait times of 1.67 µs on t3.xlarge (47.13 µs on t3.medium)
+- **Fast Key Selection**: Near-instantaneous weighted API key selection (~10 ns)
+
+---
+
+## Test Environment Summary
+
+Bifrost was benchmarked on two primary AWS EC2 instance configurations:
+
+### **t3.medium (2 vCPUs, 4 GB RAM)**
+- **Buffer Size**: 15,000
+- **Initial Pool Size**: 10,000
+- **Use Case**: Cost-effective option for moderate workloads
+
+### **t3.xlarge (4 vCPUs, 16 GB RAM)**
+- **Buffer Size**: 20,000
+- **Initial Pool Size**: 15,000
+- **Use Case**: High-performance option for demanding workloads
+
+---
+
+## Performance Comparison at a Glance
+
+| Metric | t3.medium | t3.xlarge | Improvement |
+|--------|-----------|-----------|-------------|
+| **Success Rate @ 5k RPS** | 100% | 100% | No failed requests |
+| **Bifrost Overhead** | 59 µs | 11 µs | **-81%** |
+| **Average Latency** | 2.12s | 1.61s | **-24%** |
+| **Queue Wait Time** | 47.13 µs | 1.67 µs | **-96%** |
+| **JSON Marshaling** | 63.47 µs | 26.80 µs | **-58%** |
+| **Response Parsing** | 11.30 ms | 2.11 ms | **-81%** |
+| **Peak Memory Usage** | 1,312.79 MB | 3,340.44 MB | +154% |
+
+> **Note**: The t3.xlarge tests used roughly 10x larger response payloads (~10 KB vs ~1 KB) and still recorded lower values on every timing metric above.
+
+<Note>
+All benchmarks were run against mocked OpenAI calls. The mock latency and payload size for each run are listed on the respective analysis pages.
+</Note>
+
+---
+
+## Configuration Flexibility
+
+One of Bifrost's key strengths is its **configuration flexibility**. You can fine-tune the speed ↔ memory trade-off based on your specific requirements:
+
+| Configuration Parameter | Effect |
+|------------------------|--------|
+| `initial_pool_size` | Higher values = faster performance, more memory usage |
+| `buffer_size` & `concurrency` | Control queue depth and max parallel workers (per provider) |
+| `retry` & `timeout` | Set how aggressively each provider retries and times out, to meet your SLOs |
+
+**Configuration Philosophy:**
+- **Higher settings** (like t3.xlarge profile) prioritize raw speed
+- **Lower settings** (like t3.medium profile) optimize for memory efficiency
+- **Custom tuning** lets you find the sweet spot for your specific workload
+
+---
+
+## Next Steps
+
+### **Detailed Performance Analysis**
+- **[t3.medium Performance](./t3.medium)** - Deep dive into cost-effective performance
+- **[t3.xlarge Performance](./t3.xl)** - High-performance configuration analysis
+
+### **Run Your Own Tests**
+- **[Run Your Own Benchmarks](./run-your-own-benchmarks)** - Step-by-step guide to benchmark Bifrost in your environment
+
+Ready to dive deeper? Choose your instance type above or learn how to run your own performance tests.
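
The Improvement column in the comparison table of `getting-started.mdx` is the relative change from the t3.medium value to the t3.xlarge value. A minimal Python sketch for recomputing it from the two measurement columns follows. The function name `percent_change` and the dictionary layout are illustrative assumptions and not part of Bifrost; the numbers are copied from the table.

```python
def percent_change(baseline: float, candidate: float) -> float:
    """Relative change from baseline to candidate, in percent."""
    if baseline == 0:
        raise ValueError("baseline must be non-zero")
    return (candidate - baseline) / baseline * 100.0


# (t3.medium, t3.xlarge) pairs copied from the comparison table.
# The dictionary layout is a hypothetical convenience, not a Bifrost format.
metrics = {
    "Bifrost Overhead (µs)": (59.0, 11.0),
    "Average Latency (s)": (2.12, 1.61),
    "Queue Wait Time (µs)": (47.13, 1.67),
    "JSON Marshaling (µs)": (63.47, 26.80),
    "Response Parsing (ms)": (11.30, 2.11),
    "Peak Memory Usage (MB)": (1312.79, 3340.44),
}

if __name__ == "__main__":
    for name, (medium, xlarge) in metrics.items():
        # Negative values mean t3.xlarge recorded a lower number than t3.medium.
        print(f"{name}: {percent_change(medium, xlarge):+.0f}%")
```

A negative result is an improvement for the timing rows, while the positive memory figure reflects the larger pool and buffer settings of the t3.xlarge profile. The same function can be applied to results from your own benchmark runs.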