---
title: "Getting Started"
description: "Introduction to Bifrost's performance capabilities and how to choose the right instance size for your workload."
icon: "rocket"
---

## Overview

Bifrost has been benchmarked under sustained high load to validate it for production deployments. The tests ran at **5,000 requests per second (RPS)** on two AWS EC2 instance types.

**Key Performance Highlights:**

- **Perfect Success Rate**: 100% request success rate at 5,000 RPS on both instance types
- **Minimal Overhead**: 11 µs added latency per request on average on t3.xlarge (59 µs on t3.medium)
- **Efficient Queue Management**: Queue wait time of 1.67 µs on t3.xlarge (47.13 µs on t3.medium)
- **Fast Key Selection**: Near-instantaneous weighted API key selection (~10 ns)

---

## Test Environment Summary

Bifrost was benchmarked on two primary AWS EC2 instance configurations:

### **t3.medium (2 vCPUs, 4 GB RAM)**

- **Buffer Size**: 15,000
- **Initial Pool Size**: 10,000
- **Use Case**: Cost-effective option for moderate workloads

### **t3.xlarge (4 vCPUs, 16 GB RAM)**

- **Buffer Size**: 20,000
- **Initial Pool Size**: 15,000
- **Use Case**: High-performance option for demanding workloads

---

## Performance Comparison at a Glance

| Metric | t3.medium | t3.xlarge | Change (t3.xlarge vs. t3.medium) |
|--------|-----------|-----------|----------------------------------|
| **Success Rate @ 5k RPS** | 100% | 100% | No failed requests |
| **Bifrost Overhead** | 59 µs | 11 µs | **-81%** |
| **Average Latency** | 2.12 s | 1.61 s | **-24%** |
| **Queue Wait Time** | 47.13 µs | 1.67 µs | **-96%** |
| **JSON Marshaling** | 63.47 µs | 26.80 µs | **-58%** |
| **Response Parsing** | 11.30 ms | 2.11 ms | **-81%** |
| **Peak Memory Usage** | 1,312.79 MB | 3,340.44 MB | +154% |

> **Note**: t3.xlarge tests used significantly larger response payloads (~10 KB vs ~1 KB), yet still achieved lower latency and overhead on every timing metric above.

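
The percentage column of the comparison table is a relative change of the t3.xlarge value against the t3.medium value. The sketch below shows that arithmetic on the figures from the table; the helper names (`percent_change`, `change_column`) and the choice to round to whole percents are assumptions made for illustration, not part of Bifrost.

```python
# Values copied from the comparison table: (t3.medium, t3.xlarge).
METRICS = {
    "Bifrost overhead (µs)": (59.0, 11.0),
    "Average latency (s)": (2.12, 1.61),
    "Queue wait time (µs)": (47.13, 1.67),
    "JSON marshaling (µs)": (63.47, 26.80),
    "Response parsing (ms)": (11.30, 2.11),
    "Peak memory usage (MB)": (1312.79, 3340.44),
}


def percent_change(baseline: float, value: float) -> float:
    """Relative change of `value` against `baseline`, in percent."""
    return (value - baseline) / baseline * 100.0


def change_column(metrics: dict) -> dict:
    """Format the change for each metric as a signed whole percent."""
    return {
        name: f"{percent_change(medium, xlarge):+.0f}%"
        for name, (medium, xlarge) in metrics.items()
    }


if __name__ == "__main__":
    for name, change in change_column(METRICS).items():
        print(f"{name:<24} {change}")
```

A negative value means the t3.xlarge figure is lower than the t3.medium one, which is the desired direction for every timing metric; the memory row is the only one that grows.
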
All benchmarks use mocked OpenAI calls; the mock latency and payload size for each run are listed on the respective analysis pages.

---

## Configuration Flexibility

One of Bifrost's key strengths is its **configuration flexibility**. You can tune the speed ↔ memory trade-off to fit your requirements:

| Configuration Parameter | Effect |
|------------------------|--------|
| `initial_pool_size` | Higher values = faster performance, more memory usage |
| `buffer_size` & `concurrency` | Control queue depth and max parallel workers (per provider) |
| `retry` & `timeout` | Tune aggressiveness for each provider to meet your SLOs |

**Configuration Philosophy:**

- **Higher settings** (like the t3.xlarge profile) prioritize raw speed
- **Lower settings** (like the t3.medium profile) optimize for memory efficiency
- **Custom tuning** lets you find the sweet spot for your specific workload

---

## Next Steps

### **Detailed Performance Analysis**

- **[t3.medium Performance](./t3.medium)** - Deep dive into cost-effective performance
- **[t3.xlarge Performance](./t3.xl)** - High-performance configuration analysis

### **Run Your Own Tests**

- **[Run Your Own Benchmarks](./run-your-own-benchmarks)** - Step-by-step guide to benchmark Bifrost in your environment

Ready to dive deeper? Choose an instance type above or learn how to run your own performance tests.
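
As a starting point for custom tuning, the two benchmarked profiles can be kept side by side and emitted as a settings fragment. This is a minimal sketch: the `Profile` class, the `PROFILES` mapping, and the flat JSON layout are hypothetical and do not reflect Bifrost's actual configuration schema. Only `buffer_size` and `initial_pool_size` are included, because those are the only values this page reports for the benchmark runs.

```python
import json
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class Profile:
    """Hypothetical container for the two settings reported on this page."""
    buffer_size: int
    initial_pool_size: int


# Values taken from the Test Environment Summary.
PROFILES = {
    "t3.medium": Profile(buffer_size=15_000, initial_pool_size=10_000),
    "t3.xlarge": Profile(buffer_size=20_000, initial_pool_size=15_000),
}


def profile_json(name: str) -> str:
    """Render one profile as a flat JSON fragment (layout is an assumption)."""
    return json.dumps(asdict(PROFILES[name]), indent=2)


if __name__ == "__main__":
    print(profile_json("t3.medium"))
```

Values between the two profiles are a reasonable place to begin when neither the speed-first nor the memory-first profile fits; the benchmark guide linked above is the way to confirm the effect in your own environment.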