---
title: "Prometheus"
description: "Monitor Bifrost metrics with Prometheus scraping or Push Gateway for multi-node deployments"
icon: "chart-line"
---
## Overview
Bifrost exposes Prometheus metrics via two methods:
1. **Pull-based (Scraping)**: Traditional `/metrics` endpoint that Prometheus can scrape
2. **Push-based (Push Gateway)**: Push metrics to a Prometheus Push Gateway for cluster deployments

**For multi-node deployments**: Use the Push Gateway method to ensure accurate metric aggregation. Traditional scraping may miss nodes behind load balancers.

---
## Pull-based Scraping
Bifrost exposes a `/metrics` endpoint whenever the telemetry plugin is enabled, which it is by default. No additional configuration is needed.

When Bifrost authentication is enabled (`auth_config.is_enabled = true`), the `/metrics` endpoint requires Basic auth. Put the same `admin_username` and `admin_password` from your `auth_config` in the Prometheus scrape configuration. Otherwise Prometheus receives `401 Unauthorized` on every scrape, marks the target as down, and collects no Bifrost metrics.
### Prometheus Configuration
Add Bifrost to your Prometheus `prometheus.yml`:
```yaml
scrape_configs:
  - job_name: 'bifrost'
    static_configs:
      - targets: ['bifrost-host:8080']
    scrape_interval: 15s
```
If Bifrost authentication is enabled, add `basic_auth` to your scrape config:
```yaml
scrape_configs:
  - job_name: 'bifrost'
    static_configs:
      - targets: ['bifrost-host:8080']
    scrape_interval: 15s
    basic_auth:
      username: ''  # admin_username from auth_config
      password: ''  # admin_password from auth_config
```
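To check the credentials before wiring them into Prometheus, the same Basic auth request can be reproduced by hand. The following is a minimal Python sketch, not part of Bifrost: the address `http://bifrost-host:8080` is taken from the examples above, while the function names and the placeholder credentials are hypothetical.

```python
import base64
import urllib.request


def basic_auth_header(username: str, password: str) -> str:
    """Build the value of an HTTP Basic Authorization header."""
    token = base64.b64encode(f"{username}:{password}".encode("utf-8"))
    return "Basic " + token.decode("ascii")


def fetch_metrics(base_url: str, username: str, password: str) -> str:
    """GET <base_url>/metrics with Basic auth.

    urllib raises urllib.error.HTTPError if the server answers 401.
    """
    request = urllib.request.Request(
        base_url.rstrip("/") + "/metrics",
        headers={"Authorization": basic_auth_header(username, password)},
    )
    with urllib.request.urlopen(request, timeout=5) as response:
        return response.read().decode("utf-8")


# Example call (placeholder credentials; use the values from auth_config):
# print(fetch_metrics("http://bifrost-host:8080", "admin", "change-me")[:500])
```

If the call raises an `HTTPError` with status 401, the scrape job will fail the same way with those credentials.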
### Endpoint
```
GET /metrics
```
Returns metrics in Prometheus exposition format.
---
## Push-based (Push Gateway)
For multi-node cluster deployments, the telemetry plugin can push metrics to a [Prometheus Push Gateway](https://github.com/prometheus/pushgateway). This ensures every node's metrics are captured regardless of load balancer routing.
### Configuration
| Field | Type | Required | Default | Description |
|-------|------|----------|---------|-------------|
| `push_gateway_url` | `string` | ✅ Yes | - | Push Gateway URL (e.g., `http://pushgateway:9091`) |
| `job_name` | `string` | ❌ No | `bifrost` | Job label for pushed metrics |
| `instance_id` | `string` | ❌ No | hostname | Instance identifier for metric grouping |
| `push_interval` | `integer` | ❌ No | `15` | Push interval in seconds (1-300) |
| `basic_auth` | `object` | ❌ No | - | Basic auth credentials |
### Basic Auth Configuration
| Field | Type | Required | Description |
|-------|------|----------|-------------|
| `username` | `string` | ✅ Yes | Basic auth username |
| `password` | `string` | ✅ Yes | Basic auth password |
---
## Setup
1. Navigate to **Observability** → **Prometheus** in the Bifrost UI
2. The `/metrics` endpoint is shown at the top for scraping configuration
3. To enable Push Gateway:
   - Enter the **Push Gateway URL**
   - Configure **Job Name** and **Push Interval** as needed
   - Optionally set a custom **Instance ID**
   - Enable **Basic Authentication** if required
   - Toggle **Enable Push Gateway** on
   - Click **Save Prometheus Configuration**

The same settings expressed as plugin configuration in JSON (without authentication):
```json
{
  "plugins": [
    {
      "name": "telemetry",
      "enabled": true,
      "config": {
        "push_gateway": {
          "enabled": true,
          "push_gateway_url": "http://pushgateway:9091",
          "job_name": "bifrost",
          "push_interval": 15
        }
      }
    }
  ]
}
```
### With Basic Auth
```json
{
  "plugins": [
    {
      "name": "telemetry",
      "enabled": true,
      "config": {
        "push_gateway": {
          "enabled": true,
          "push_gateway_url": "http://pushgateway:9091",
          "job_name": "bifrost",
          "push_interval": 15,
          "instance_id": "bifrost-node-1",
          "basic_auth": {
            "username": "admin",
            "password": "secret"
          }
        }
      }
    }
  ]
}
```
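The defaults and constraints from the configuration tables can be checked before a config like the ones above is deployed. This is a hedged sketch that mirrors those tables, not Bifrost's own validation: `normalize_push_gateway` and `grouping_url` are hypothetical helpers. The path built by `grouping_url` follows Pushgateway's standard `/metrics/job/<job>/<label>/<value>` grouping scheme; that Bifrost pushes to exactly this path is an assumption.

```python
import socket
from urllib.parse import quote


def normalize_push_gateway(cfg: dict) -> dict:
    """Apply the documented defaults and checks to a push_gateway config."""
    url = cfg.get("push_gateway_url")
    if not isinstance(url, str) or not url:
        raise ValueError("push_gateway_url is required")

    interval = cfg.get("push_interval", 15)  # documented default: 15 seconds
    if isinstance(interval, bool) or not isinstance(interval, int):
        raise ValueError("push_interval must be an integer")
    if not 1 <= interval <= 300:
        raise ValueError("push_interval must be between 1 and 300 seconds")

    auth = cfg.get("basic_auth")
    if auth is not None and not (auth.get("username") and auth.get("password")):
        raise ValueError("basic_auth requires both username and password")

    return {
        "push_gateway_url": url.rstrip("/"),
        "job_name": cfg.get("job_name", "bifrost"),
        # Documented default for instance_id is the hostname.
        "instance_id": cfg.get("instance_id") or socket.gethostname(),
        "push_interval": interval,
        "basic_auth": auth,
    }


def grouping_url(cfg: dict) -> str:
    """Pushgateway grouping path for a normalized config (assumed layout).

    Values containing '/' are not handled by this sketch.
    """
    return "{}/metrics/job/{}/instance/{}".format(
        cfg["push_gateway_url"],
        quote(cfg["job_name"], safe=""),
        quote(cfg["instance_id"], safe=""),
    )


cfg = normalize_push_gateway(
    {"push_gateway_url": "http://pushgateway:9091", "instance_id": "bifrost-node-1"}
)
print(grouping_url(cfg))
# http://pushgateway:9091/metrics/job/bifrost/instance/bifrost-node-1
```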
---
## Available Metrics
The following metrics are available from both the `/metrics` endpoint and Push Gateway:
### HTTP Metrics
| Metric | Type | Description |
|--------|------|-------------|
| `http_requests_total` | Counter | Total HTTP requests by path, method, status |
| `http_request_duration_seconds` | Histogram | HTTP request latency |
| `http_request_size_bytes` | Histogram | Request body size |
| `http_response_size_bytes` | Histogram | Response body size |
### Bifrost LLM Metrics
| Metric | Type | Description |
|--------|------|-------------|
| `bifrost_upstream_requests_total` | Counter | Total requests to LLM providers |
| `bifrost_upstream_latency_seconds` | Histogram | Provider request latency |
| `bifrost_success_requests_total` | Counter | Successful provider requests |
| `bifrost_error_requests_total` | Counter | Failed provider requests |
| `bifrost_input_tokens_total` | Counter | Total input tokens processed |
| `bifrost_output_tokens_total` | Counter | Total output tokens generated |
| `bifrost_cost_total` | Counter | Total cost in USD |
| `bifrost_cache_hits_total` | Counter | Cache hits by type |
| `bifrost_stream_first_token_latency_seconds` | Histogram | Time to first token (streaming) |
| `bifrost_stream_inter_token_latency_seconds` | Histogram | Inter-token latency (streaming) |
| `bifrost_key_rotation_events_total` | Counter | Per-attempt retry/rotation events with key identifiers (v1.5.0-prerelease4+; see below) |
### Default Labels
All Bifrost metrics include these labels:
- `provider` - LLM provider name
- `model` - Model identifier
- `method` - Request type (chat, completion, embedding, etc.)
- `virtual_key_id` / `virtual_key_name` - Virtual key identifiers
- `selected_key_id` / `selected_key_name` - API key that successfully served the request (`""` when all attempts failed)
- `number_of_retries` - Total attempts minus one (across all keys)
- `fallback_index` - Fallback position
- `team_id` / `team_name` - Team identifiers (if governance enabled)
- `customer_id` / `customer_name` - Customer identifiers (if governance enabled)

**v1.5.0-prerelease4+**: `selected_key_id` / `selected_key_name` are only populated when the request succeeds. On a final error both are empty; use `bifrost_key_rotation_events_total` or the `attempt_trail` log field to see which keys were tried.
### Key Rotation Events (v1.5.0-prerelease4+)
`bifrost_key_rotation_events_total` is incremented once per **failed attempt** (not per request), giving you time-series visibility into retry pressure. It carries the following labels:
| Label | Values | Description |
|-------|--------|-------------|
| `provider` | e.g. `openai` | LLM provider |
| `requested_model` | e.g. `gpt-4o` | Model as requested (before any alias resolution) |
| `key_id` | UUID | The provider API key that failed on this attempt |
| `key_name` | string | Human-readable name of the provider API key |
| `fail_reason` | error type string | Provider error type (e.g. `rate_limit_error`, `network_error`) |
**Example queries:**
```promql
# Failed-attempt rate by provider and failure reason
sum by (provider, fail_reason) (
  rate(bifrost_key_rotation_events_total[5m])
)

# Keys that hit rate limits most often
topk(5, sum by (provider, key_name, fail_reason) (
  rate(bifrost_key_rotation_events_total{fail_reason="rate_limit_error"}[1h])
))
```
```
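Queries like these can also be run outside the Prometheus UI through its HTTP API (`GET /api/v1/query`). The sketch below is illustrative only: the address `http://prometheus:9090` and the helper names are assumptions, and the error-ratio query assumes that `bifrost_error_requests_total` and `bifrost_upstream_requests_total` can be compared per `provider`.

```python
import json
import urllib.request
from urllib.parse import urlencode

# Queries built only from metric and label names documented on this page.
QUERIES = {
    "failed_attempts_by_reason": (
        "sum by (provider, fail_reason) "
        "(rate(bifrost_key_rotation_events_total[5m]))"
    ),
    # Assumption: both counters share the provider label.
    "error_ratio_by_provider": (
        "sum by (provider) (rate(bifrost_error_requests_total[5m])) / "
        "sum by (provider) (rate(bifrost_upstream_requests_total[5m]))"
    ),
}


def instant_query_url(prometheus_url: str, promql: str) -> str:
    """Build the URL for Prometheus' instant-query endpoint."""
    return prometheus_url.rstrip("/") + "/api/v1/query?" + urlencode({"query": promql})


def run_query(prometheus_url: str, promql: str) -> list:
    """Run an instant query and return the 'result' list of the response."""
    url = instant_query_url(prometheus_url, promql)
    with urllib.request.urlopen(url, timeout=10) as response:
        body = json.load(response)
    if body.get("status") != "success":
        raise RuntimeError(body.get("error", "query failed"))
    return body["data"]["result"]


# Example call (hypothetical Prometheus address):
# for series in run_query("http://prometheus:9090", QUERIES["error_ratio_by_provider"]):
#     print(series["metric"], series["value"])
```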
---
## Push Gateway Setup
If you don't have a Push Gateway running, deploy one:
### Docker
```bash
docker run -d -p 9091:9091 prom/pushgateway
```
### Kubernetes (Helm)
```bash
helm repo add prometheus-community https://prometheus-community.github.io/helm-charts
helm install pushgateway prometheus-community/prometheus-pushgateway
```
### Configure Prometheus to Scrape Push Gateway
Add to your `prometheus.yml`:
```yaml
scrape_configs:
- job_name: 'pushgateway'
honor_labels: true
static_configs:
- targets: ['pushgateway:9091']
```
The `honor_labels: true` setting is important: it preserves the `job` and `instance` labels pushed by Bifrost. Without it, Prometheus attaches the scrape target's own `job` and `instance` values and renames the pushed ones to `exported_job` and `exported_instance`.
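
The effect of `honor_labels` on a pushed series can be modelled in a few lines. This is a simplified sketch of Prometheus' label-conflict rule, not its implementation; `merge_labels` and the label values are illustrative.

```python
def merge_labels(scraped: dict, target: dict, honor_labels: bool) -> dict:
    """Model how clashes between scraped and scrape-target labels resolve."""
    merged = dict(scraped)
    for name, value in target.items():
        if name not in scraped:
            # No clash: the target label is simply attached.
            merged[name] = value
        elif not honor_labels:
            # Clash without honor_labels: the scraped value is renamed
            # to exported_<name> and the target value wins.
            merged["exported_" + name] = scraped[name]
            merged[name] = value
        # Clash with honor_labels: the scraped value is kept as-is.
    return merged


pushed = {"job": "bifrost", "instance": "bifrost-node-1", "provider": "openai"}
target = {"job": "pushgateway", "instance": "pushgateway:9091"}

print(merge_labels(pushed, target, honor_labels=True))
# {'job': 'bifrost', 'instance': 'bifrost-node-1', 'provider': 'openai'}
print(merge_labels(pushed, target, honor_labels=False)["instance"])
# pushgateway:9091
```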
---
## Pull vs Push: When to Use Each
| Scenario | Recommended Method |
|----------|-------------------|
| Single Bifrost instance | Pull (scraping) |
| Multiple instances, direct access | Pull (scraping) |
| Multiple instances behind load balancer | **Push (Push Gateway)** |
| Kubernetes with service mesh | Pull or Push |
| Serverless / ephemeral instances | **Push (Push Gateway)** |
### Why Push for Clusters?
When multiple Bifrost instances run behind a load balancer:
1. **Scraping randomness**: Each scrape may hit different nodes, missing metrics from others
2. **Instance tracking**: Push Gateway properly tracks per-instance metrics via `instance` label
3. **Aggregation**: Downstream tools (Grafana, Datadog) can aggregate across all instances
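
The first point can be illustrated with a toy simulation. The counter values, node names, and round-robin routing below are all hypothetical, and the counters are held constant for simplicity; real load balancers may route differently.

```python
from itertools import cycle

# Hypothetical values of bifrost_upstream_requests_total on three nodes.
NODE_COUNTERS = {"node-a": 1200, "node-b": 300, "node-c": 750}


def scrape_via_load_balancer(scrapes: int) -> list:
    """Each scrape reaches one node (round-robin here), so a single series
    alternates between unrelated per-node counter values."""
    backends = cycle(NODE_COUNTERS)
    return [NODE_COUNTERS[next(backends)] for _ in range(scrapes)]


def scrape_push_gateway() -> dict:
    """Each node pushes under its own instance label, so one scrape of the
    Push Gateway returns a separate series per node."""
    return {f'instance="{node}"': value for node, value in NODE_COUNTERS.items()}


print(scrape_via_load_balancer(4))          # [1200, 300, 750, 1200]
print(sum(scrape_push_gateway().values()))  # 2250
```

In the first output the value drops from 1200 to 300 between scrapes, which Prometheus would interpret as a counter reset; in the second, all three nodes are present and can be summed.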
---
## Troubleshooting
### Push Gateway Connection Failed
```
failed to push metrics to push gateway: connection refused
```
- Verify the Push Gateway URL is correct and reachable from Bifrost
- Check firewall rules between Bifrost and Push Gateway
- Ensure Push Gateway is running: `curl http://pushgateway:9091/metrics`
### Metrics Not Appearing
- Verify the telemetry plugin is enabled (required for metrics collection)
- Check Bifrost logs for push errors
- Verify Prometheus is scraping the Push Gateway with `honor_labels: true`
### Authentication Failed
- Double-check username and password
- Ensure basic auth is configured on the Push Gateway side
- Check for special characters in the username or password that need escaping in the JSON configuration (for example `"` or `\`)