---
title: "Prometheus"
description: "Monitor Bifrost metrics with Prometheus scraping or Push Gateway for multi-node deployments"
icon: "chart-line"
---

## Overview

Bifrost exposes Prometheus metrics via two methods:

1. **Pull-based (Scraping)**: Traditional `/metrics` endpoint that Prometheus can scrape
2. **Push-based (Push Gateway)**: Push metrics to a Prometheus Push Gateway for cluster deployments

<Note>
  **For multi-node deployments**: Use the Push Gateway method to ensure accurate metric aggregation. Traditional scraping may miss nodes behind load balancers.
</Note>

---

## Pull-based Scraping

Bifrost automatically exposes a `/metrics` endpoint when the telemetry plugin is enabled (enabled by default). No additional configuration is needed.

<Info>
  When Bifrost's authentication is enabled (`auth_config.is_enabled = true`), the `/metrics` endpoint requires Basic auth credentials. You must include the same `admin_username` and `admin_password` from your `auth_config` in the Prometheus scrape configuration. Without them, Prometheus receives `401 Unauthorized` responses and marks the target as down, so no metrics are collected.
</Info>

### Prometheus Configuration

Add Bifrost to your Prometheus `prometheus.yml`:

```yaml
scrape_configs:
  - job_name: 'bifrost'
    static_configs:
      - targets: ['bifrost-host:8080']
    scrape_interval: 15s
```

If Bifrost authentication is enabled, add `basic_auth` to your scrape config:

```yaml
scrape_configs:
  - job_name: 'bifrost'
    static_configs:
      - targets: ['bifrost-host:8080']
    scrape_interval: 15s
    basic_auth:
      username: '<admin_username>'
      password: '<admin_password>'
```

### Endpoint

```
GET /metrics
```

Returns metrics in Prometheus exposition format.
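
To confirm the endpoint responds before wiring up Prometheus, the request can be scripted. The following is a minimal Python sketch using only the standard library. The helper names `build_metrics_request` and `fetch_metrics` are hypothetical, and the host and credentials are placeholders; the Basic auth header is only needed when Bifrost authentication is enabled.

```python
import base64
import urllib.request


def build_metrics_request(base_url, username=None, password=None):
    """Build a GET request for the /metrics endpoint (hypothetical helper)."""
    req = urllib.request.Request(base_url.rstrip("/") + "/metrics", method="GET")
    if username is not None and password is not None:
        # Standard HTTP Basic auth: base64("username:password")
        token = base64.b64encode(f"{username}:{password}".encode("utf-8"))
        req.add_header("Authorization", "Basic " + token.decode("ascii"))
    return req


def fetch_metrics(base_url, username=None, password=None, timeout=5.0):
    """Return the raw exposition text; raises urllib.error.HTTPError on 401."""
    req = build_metrics_request(base_url, username, password)
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return resp.read().decode("utf-8")
```

Calling `fetch_metrics("http://bifrost-host:8080", "<admin_username>", "<admin_password>")` should return the same text Prometheus would scrape, which makes it easy to tell an authentication problem apart from a network problem.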

---

## Push-based (Push Gateway)

For multi-node cluster deployments, the telemetry plugin can push metrics to a [Prometheus Push Gateway](https://github.com/prometheus/pushgateway). This ensures all nodes' metrics are captured regardless of load balancer routing.

### Configuration

| Field | Type | Required | Default | Description |
|-------|------|----------|---------|-------------|
| `push_gateway_url` | `string` | ✅ Yes | - | Push Gateway URL (e.g., `http://pushgateway:9091`) |
| `job_name` | `string` | ❌ No | `bifrost` | Job label for pushed metrics |
| `instance_id` | `string` | ❌ No | hostname | Instance identifier for metric grouping |
| `push_interval` | `integer` | ❌ No | `15` | Push interval in seconds (1-300) |
| `basic_auth` | `object` | ❌ No | - | Basic auth credentials |

### Basic Auth Configuration

| Field | Type | Required | Description |
|-------|------|----------|-------------|
| `username` | `string` | ✅ Yes | Basic auth username |
| `password` | `string` | ✅ Yes | Basic auth password |
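
The constraints in the two tables above can be checked before a config is deployed. The sketch below is illustrative only and is not Bifrost's own validation logic: the function name `normalize_push_gateway_config` is hypothetical, and the requirement that the URL use an `http` or `https` scheme is an assumption. The defaults and the 1-300 range come from the tables.

```python
import socket
from urllib.parse import urlparse


def normalize_push_gateway_config(cfg):
    """Validate a push_gateway config dict and fill in documented defaults."""
    url = cfg.get("push_gateway_url", "")
    parsed = urlparse(url)
    # Assumption: only http(s) URLs with a host are considered valid.
    if parsed.scheme not in ("http", "https") or not parsed.netloc:
        raise ValueError("push_gateway_url is required and must be an http(s) URL")

    interval = cfg.get("push_interval", 15)
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(interval, bool) or not isinstance(interval, int):
        raise ValueError("push_interval must be an integer")
    if not 1 <= interval <= 300:
        raise ValueError("push_interval must be between 1 and 300 seconds")

    basic_auth = cfg.get("basic_auth")
    if basic_auth is not None:
        # Both fields are required once basic_auth is present.
        if not basic_auth.get("username") or not basic_auth.get("password"):
            raise ValueError("basic_auth requires both username and password")

    return {
        "push_gateway_url": url,
        "job_name": cfg.get("job_name", "bifrost"),
        "instance_id": cfg.get("instance_id") or socket.gethostname(),
        "push_interval": interval,
        "basic_auth": basic_auth,
    }
```

Passing only `{"push_gateway_url": "http://pushgateway:9091"}` yields a config with `job_name` set to `bifrost`, `push_interval` set to `15`, and `instance_id` set to the local hostname.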

---

## Setup

<Tabs group="setup-method">
<Tab title="UI">

1. Navigate to **Observability** → **Prometheus** in the Bifrost UI
2. The `/metrics` endpoint is shown at the top for scraping configuration
3. To enable Push Gateway:
   - Enter the **Push Gateway URL**
   - Configure **Job Name** and **Push Interval** as needed
   - Optionally set a custom **Instance ID**
   - Enable **Basic Authentication** if required
   - Toggle **Enable Push Gateway** on
   - Click **Save Prometheus Configuration**

</Tab>
<Tab title="Config File">

```json
{
  "plugins": [
    {
      "name": "telemetry",
      "enabled": true,
      "config": {
        "push_gateway": {
          "enabled": true,
          "push_gateway_url": "http://pushgateway:9091",
          "job_name": "bifrost",
          "push_interval": 15
        }
      }
    }
  ]
}
```

### With Basic Auth

```json
{
  "plugins": [
    {
      "name": "telemetry",
      "enabled": true,
      "config": {
        "push_gateway": {
          "enabled": true,
          "push_gateway_url": "http://pushgateway:9091",
          "job_name": "bifrost",
          "push_interval": 15,
          "instance_id": "bifrost-node-1",
          "basic_auth": {
            "username": "admin",
            "password": "secret"
          }
        }
      }
    }
  ]
}
```

</Tab>
</Tabs>

---

## Available Metrics

The following metrics are available from both the `/metrics` endpoint and Push Gateway:

### HTTP Metrics

| Metric | Type | Description |
|--------|------|-------------|
| `http_requests_total` | Counter | Total HTTP requests by path, method, status |
| `http_request_duration_seconds` | Histogram | HTTP request latency |
| `http_request_size_bytes` | Histogram | Request body size |
| `http_response_size_bytes` | Histogram | Response body size |

### Bifrost LLM Metrics

| Metric | Type | Description |
|--------|------|-------------|
| `bifrost_upstream_requests_total` | Counter | Total requests to LLM providers |
| `bifrost_upstream_latency_seconds` | Histogram | Provider request latency |
| `bifrost_success_requests_total` | Counter | Successful provider requests |
| `bifrost_error_requests_total` | Counter | Failed provider requests |
| `bifrost_input_tokens_total` | Counter | Total input tokens processed |
| `bifrost_output_tokens_total` | Counter | Total output tokens generated |
| `bifrost_cost_total` | Counter | Total cost in USD |
| `bifrost_cache_hits_total` | Counter | Cache hits by type |
| `bifrost_stream_first_token_latency_seconds` | Histogram | Time to first token (streaming) |
| `bifrost_stream_inter_token_latency_seconds` | Histogram | Inter-token latency (streaming) |
| `bifrost_key_rotation_events_total` | Counter | Per-attempt retry/rotation events with key identifiers (see below) <sup>v1.5.0-prerelease4+</sup> |

### Default Labels

All Bifrost metrics include these labels:

- `provider` - LLM provider name
- `model` - Model identifier
- `method` - Request type (chat, completion, embedding, etc.)
- `virtual_key_id` / `virtual_key_name` - Virtual key identifiers
- `selected_key_id` / `selected_key_name` - API key that successfully served the request (`""` when all attempts failed)
- `number_of_retries` - Total attempts minus one (across all keys)
- `fallback_index` - Fallback position
- `team_id` / `team_name` - Team identifiers (if governance enabled)
- `customer_id` / `customer_name` - Customer identifiers (if governance enabled)

<Note>
  **v1.5.0-prerelease4+**: `selected_key_id` / `selected_key_name` are only populated when the request succeeds. On final errors both are empty — use `bifrost_key_rotation_events_total` or the `attempt_trail` log field to see which keys were tried.
</Note>
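
To illustrate how these labels appear in the exposition output, here is a small Python sketch that sums one metric by a chosen label. It is a simplified parser written for this example, not a full implementation of the exposition format, and the sample values in the usage note are made up. The names `sum_by_label`, `SAMPLE_RE`, and `LABEL_RE` are hypothetical.

```python
import re
from collections import defaultdict

# metric_name{label="value",...} 123  (the label block is optional)
SAMPLE_RE = re.compile(
    r'^(?P<name>[a-zA-Z_:][a-zA-Z0-9_:]*)(?:\{(?P<labels>.*)\})?\s+(?P<value>\S+)'
)
# One key="value" pair; the value may contain backslash-escaped characters.
LABEL_RE = re.compile(r'([a-zA-Z_][a-zA-Z0-9_]*)="((?:[^"\\]|\\.)*)"')


def sum_by_label(exposition_text, metric, label):
    """Sum all samples of `metric`, grouped by the value of `label`."""
    totals = defaultdict(float)
    for line in exposition_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and HELP/TYPE comments
        match = SAMPLE_RE.match(line)
        if match is None or match.group("name") != metric:
            continue
        labels = dict(LABEL_RE.findall(match.group("labels") or ""))
        totals[labels.get(label, "")] += float(match.group("value"))
    return dict(totals)
```

Given scraped text containing `bifrost_input_tokens_total{provider="openai",model="gpt-4o"} 1200` and a second `openai` sample with value `300`, `sum_by_label(text, "bifrost_input_tokens_total", "provider")` groups both under the `openai` key. In production, the equivalent aggregation is normally done in PromQL with `sum by (provider) (...)`.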

### Key Rotation Events <sup>v1.5.0-prerelease4+</sup>

`bifrost_key_rotation_events_total` is incremented once per **failed attempt** (not per request), giving you time-series visibility into retry pressure:

| Label | Values | Description |
|-------|--------|-------------|
| `provider` | e.g. `openai` | LLM provider |
| `requested_model` | e.g. `gpt-4o` | Model as requested (before any alias resolution) |
| `key_id` | UUID | The provider API key that failed on this attempt |
| `key_name` | string | Human-readable name of the provider API key |
| `fail_reason` | error type string | Provider error type (e.g. `rate_limit_error`, `network_error`) |

**Example queries:**

```promql
# Rate-limit events per provider over time
sum by (provider, fail_reason) (
  rate(bifrost_key_rotation_events_total[5m])
)

# Which specific keys are hitting rate limits most often
topk(5, sum by (provider, key_name, fail_reason) (
  rate(bifrost_key_rotation_events_total{fail_reason="rate_limit_error"}[1h])
))
```

---

## Push Gateway Setup

If you don't have a Push Gateway running, deploy one:

### Docker

```bash
docker run -d -p 9091:9091 prom/pushgateway
```

### Kubernetes (Helm)

```bash
helm repo add prometheus-community https://prometheus-community.github.io/helm-charts
helm install pushgateway prometheus-community/prometheus-pushgateway
```

### Configure Prometheus to Scrape Push Gateway

Add to your `prometheus.yml`:

```yaml
scrape_configs:
  - job_name: 'pushgateway'
    honor_labels: true
    static_configs:
      - targets: ['pushgateway:9091']
```

<Note>
  The `honor_labels: true` setting is important - it preserves the `job` and `instance` labels pushed by Bifrost. Without it, Prometheus applies the scrape job's own `job` and `instance` values and renames the pushed ones to `exported_job` and `exported_instance`.
</Note>

---

## Pull vs Push: When to Use Each

| Scenario | Recommended Method |
|----------|-------------------|
| Single Bifrost instance | Pull (scraping) |
| Multiple instances, direct access | Pull (scraping) |
| Multiple instances behind load balancer | **Push (Push Gateway)** |
| Kubernetes with service mesh | Pull or Push |
| Serverless / ephemeral instances | **Push (Push Gateway)** |

### Why Push for Clusters?

When multiple Bifrost instances run behind a load balancer:

1. **Scraping randomness**: When Prometheus scrapes the load balancer address, each scrape reaches whichever node the balancer selects, so metrics from the other nodes are missed
2. **Instance tracking**: Push Gateway properly tracks per-instance metrics via `instance` label
3. **Aggregation**: Downstream tools (Grafana, Datadog) can aggregate across all instances
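
Per-instance tracking relies on the Push Gateway's grouping key, which the Pushgateway HTTP API encodes in the URL path as `/metrics/job/<job>/<label>/<value>`. The Python sketch below builds that path from `job_name` and `instance_id`. Whether Bifrost pushes to exactly this path is an assumption, and the helper name `grouping_url` is hypothetical; the sketch is mainly useful for locating a specific node's group when debugging.

```python
from urllib.parse import quote


def grouping_url(push_gateway_url, job_name, instance_id):
    """Build the Pushgateway URL for one job/instance grouping key."""
    for field, value in (("job_name", job_name), ("instance_id", instance_id)):
        # Pushgateway needs a special base64 form for empty values or values
        # containing '/'; this sketch rejects them instead of encoding them.
        if not value or "/" in value:
            raise ValueError(f"{field} must be non-empty and must not contain '/'")
    base = push_gateway_url.rstrip("/")
    return (
        f"{base}/metrics/job/{quote(job_name, safe='')}"
        f"/instance/{quote(instance_id, safe='')}"
    )
```

For example, `grouping_url("http://pushgateway:9091", "bifrost", "bifrost-node-1")` returns `http://pushgateway:9091/metrics/job/bifrost/instance/bifrost-node-1`. The Push Gateway keeps a group's last pushed values until they are removed, so sending an HTTP `DELETE` to this URL is one way to clear metrics left behind by a decommissioned node.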

---

## Troubleshooting

### Push Gateway Connection Failed

```
failed to push metrics to push gateway: connection refused
```

- Verify the Push Gateway URL is correct and reachable from Bifrost
- Check firewall rules between Bifrost and Push Gateway
- Ensure Push Gateway is running: `curl http://pushgateway:9091/metrics`

### Metrics Not Appearing

- Verify the telemetry plugin is enabled (required for metrics collection)
- Check Bifrost logs for push errors
- Verify Prometheus is scraping the Push Gateway with `honor_labels: true`

### Authentication Failed

- Double-check username and password
- Ensure basic auth is configured on the Push Gateway side
- Check the password for special characters that need escaping in the JSON config (for example `"` or `\`)