Beyhan Oğur 880f412e2c first commit
2026-04-26 21:52:23 +03:00

---
title: "Prompts plugin"
description: "Use committed prompt templates from the Prompt Repository on inference requests via HTTP headers or custom resolvers."
icon: "puzzle-piece"
---
## Overview
The **Prompts** plugin connects the [Prompt Repository](/features/prompt-repository/playground) to inference. It loads committed prompt versions from the config store and **prepends** their messages to **Chat Completions** and **Responses** requests. It also **merges model parameters** from the stored version with the incoming request (request values take precedence).
**What it does:**
- Resolves which prompt and version to apply per request (default: HTTP headers).
- Injects the version's message history **before** the client's messages.
- Applies the version's `model` parameters as defaults, then overrides them with whatever the client sent for the same parameters.
---
## Prerequisites
- **Config store** with Prompt Repository tables (typically **PostgreSQL**). File-backed config alone does not store prompts.
- Prompts authored and **committed as versions** in the UI or via the `/api/prompt-repo/...` HTTP API (see `docs/openapi/openapi.yaml` in the repository).
- A **prompt ID** (UUID) for each prompt you reference at runtime. You can read it from the repository API or the playground.
---
## How it works
```mermaid
flowchart TB
Client([Client]) --> Gateway[Bifrost HTTP]
Gateway --> PreHook["HTTP transport pre-hook:<br/>copy x-bf-prompt-id / x-bf-prompt-version to context"]
PreHook --> PreLLM["PreLLM hook:<br/>resolve version, merge params,<br/>prepend template messages"]
PreLLM --> Provider[Provider]
```
1. **Transport (HTTP):** Incoming headers `x-bf-prompt-id` and `x-bf-prompt-version` are copied onto the Bifrost context (header name matching is case-insensitive).
2. **Resolve:** The plugin looks up the prompt and the requested version. If **`x-bf-prompt-version` is omitted**, the prompt's **latest committed version** is used.
3. **Parameters:** Version `model` parameters are merged into the request; any field already set on the request wins.
4. **Messages:** Messages from the committed version are **prepended** to `messages` (chat) or `input` (responses). Your request body adds the user turn(s) after the template.
If the prompt ID is missing, the plugin does nothing and the request passes through unchanged.
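The merge and prepend steps can be illustrated with a minimal Go sketch. It is not the plugin's implementation: `Message`, `mergeParams`, and `prependMessages` are hypothetical names, and parameters are modeled as a plain map purely for illustration.

```go
package main

import "fmt"

// Message is a hypothetical stand-in for a chat message; it is not Bifrost's type.
type Message struct {
	Role    string
	Content string
}

// mergeParams starts from the committed version's parameters and overlays the
// request's parameters, so any key present on the request wins.
func mergeParams(version, request map[string]any) map[string]any {
	merged := make(map[string]any, len(version)+len(request))
	for k, v := range version {
		merged[k] = v
	}
	for k, v := range request {
		merged[k] = v
	}
	return merged
}

// prependMessages places the template messages before the client's messages
// without mutating either input slice.
func prependMessages(template, client []Message) []Message {
	out := make([]Message, 0, len(template)+len(client))
	out = append(out, template...)
	return append(out, client...)
}

func main() {
	versionParams := map[string]any{"temperature": 0.2, "max_tokens": 512}
	requestParams := map[string]any{"temperature": 0.9}
	fmt.Println(mergeParams(versionParams, requestParams))

	template := []Message{{Role: "system", Content: "You are a concise assistant."}}
	client := []Message{{Role: "user", Content: "Tell me about Bifrost Gateway?"}}
	for _, m := range prependMessages(template, client) {
		fmt.Printf("%s: %s\n", m.Role, m.Content)
	}
}
```

In this sketch the request's `temperature` replaces the version's value, while `max_tokens` is inherited from the version because the request did not set it.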
---
## HTTP headers (gateway)
| Header | Required | Description |
|--------|----------|-------------|
| `x-bf-prompt-id` | Yes, to enable injection | UUID of the prompt in the repository. |
| `x-bf-prompt-version` | No | **Integer version number** (e.g. `3` for v3). If omitted, the **latest** committed version for that prompt is used. |
Invalid or unknown IDs / versions are logged as warnings; the request is **not** failed by the plugin (it proceeds without template injection).
---
## Example: Chat Completions
Use the same JSON body as a normal chat request. Only the headers select the template.
```bash
curl -X POST http://localhost:8080/v1/chat/completions \
-H "Content-Type: application/json" \
-H "x-bf-prompt-id: YOUR-PROMPT-UUID" \
-H "x-bf-vk: sk-bf-your-virtual-key" \
-d '{
"model": "openai/gpt-5.4",
"messages": [
{
"role": "user",
"content": "Tell me about Bifrost Gateway?"
}
]
}'
```
![Commit Version with Stream enabled in the playground](../../media/prompt-plugin-version-commit.png)
When you commit a version from the playground, the model parameters (temperature, max tokens, etc.) are saved with it. These parameters are merged into the outgoing request, with client-supplied values taking precedence.
![LLM log for the same request showing Type: Chat Stream](../../media/prompt-plugin-llm-log.png)
In **Logs**, that run shows the full conversation: the committed **system** template, your **user** message from the request body, and the assistant reply. The log also displays the **Selected Prompt** name and version number for easy traceability.
The provider receives the merged model parameters from the prompt version and the client request, with the committed version's messages prepended before the client's messages.
---
## Example: Responses API
```bash
curl -X POST http://localhost:8080/v1/responses \
-H "Content-Type: application/json" \
-H "x-bf-prompt-id: YOUR-PROMPT-UUID" \
-H "x-bf-prompt-version: 4" \
-H "x-bf-vk: sk-bf-your-virtual-key" \
-d '{
"model": "openai/gpt-5-nano-2025-08-07",
"input": "What is Pale Blue Dot?"
}'
```
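For Go clients, the same request can be built with the standard library. This is a sketch only: `newResponsesRequest` is a hypothetical helper, the base URL and model are copied from the curl example above, and the `x-bf-vk` header is left out for brevity.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"net/http"
	"strconv"
)

// newResponsesRequest (hypothetical) builds a POST to /v1/responses that
// selects a prompt through headers. A version of 0 omits the version header,
// so the gateway falls back to the latest committed version.
func newResponsesRequest(baseURL, promptID string, version int, model, input string) (*http.Request, error) {
	body, err := json.Marshal(map[string]string{"model": model, "input": input})
	if err != nil {
		return nil, err
	}
	req, err := http.NewRequest(http.MethodPost, baseURL+"/v1/responses", bytes.NewReader(body))
	if err != nil {
		return nil, err
	}
	req.Header.Set("Content-Type", "application/json")
	req.Header.Set("x-bf-prompt-id", promptID)
	if version > 0 {
		req.Header.Set("x-bf-prompt-version", strconv.Itoa(version))
	}
	return req, nil
}

func main() {
	req, err := newResponsesRequest(
		"http://localhost:8080", "YOUR-PROMPT-UUID", 4,
		"openai/gpt-5-nano-2025-08-07", "What is Pale Blue Dot?",
	)
	if err != nil {
		fmt.Println("build request:", err)
		return
	}
	// The request is only built here; send it with http.DefaultClient.Do(req).
	fmt.Println(req.Method, req.URL)
	fmt.Println("prompt id:", req.Header.Get("x-bf-prompt-id"))
	fmt.Println("prompt version:", req.Header.Get("x-bf-prompt-version"))
}
```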
---
## Streaming
Streaming is controlled entirely by the client request. If you want streaming, set `"stream": true` in the request body. The plugin merges model parameters from the committed version (request values take precedence), but does **not** override the transport-level streaming mode.
---
## Cache and updates
The plugin keeps an in-memory cache of prompts and versions (loaded with a small number of store queries at startup). When you create, update, or delete prompts or versions through the **gateway APIs**, the server **reloads** that cache so new commits are visible without a full process restart.
---
## Go SDK and custom resolution
For embedded Bifrost (Go SDK), register the plugin with `prompts.Init` and a **config store** that implements the prompt tables API. The default resolver reads the same logical keys from `BifrostContext`:
- `prompts.PromptIDKey` (`x-bf-prompt-id`)
- `prompts.PromptVersionKey` (`x-bf-prompt-version`)
Set them on the context you pass to `ChatCompletion` / `Responses` if you are not going through the HTTP transport hooks.
For advanced routing (for example, choosing a prompt from governance metadata), implement `prompts.PromptResolver` and use **`prompts.InitWithResolver`**. The interface is:
```go
type PromptResolver interface {
Resolve(ctx *schemas.BifrostContext, req *schemas.BifrostRequest) (promptID string, versionNumber int, err error)
}
```
Return an empty `promptID` to skip injection for a request. Return `versionNumber == 0` to use the prompt's **latest** committed version; any positive integer selects that specific version.
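A resolver that follows these return conventions might look like the sketch below. To keep it runnable on its own, it declares local stand-ins for `schemas.BifrostContext` and `schemas.BifrostRequest`; the `Model` field and the `modelResolver` type are hypothetical and do not reflect the real structs. In real code you would import the Bifrost `schemas` package and pass the resolver to `prompts.InitWithResolver`.

```go
package main

import (
	"errors"
	"fmt"
)

// Local stand-ins so the sketch compiles alone. The real types live in
// Bifrost's schemas package; the Model field is an assumption for illustration.
type BifrostContext struct{}
type BifrostRequest struct{ Model string }

// PromptResolver mirrors the interface shown above, using the stand-in types.
type PromptResolver interface {
	Resolve(ctx *BifrostContext, req *BifrostRequest) (promptID string, versionNumber int, err error)
}

// modelResolver (hypothetical) picks a prompt by model name and optionally
// pins a version per prompt ID.
type modelResolver struct {
	byModel map[string]string // model name -> prompt ID
	pinned  map[string]int    // prompt ID -> version; a missing entry yields 0 (latest)
}

func (r modelResolver) Resolve(_ *BifrostContext, req *BifrostRequest) (string, int, error) {
	if req == nil {
		return "", 0, errors.New("nil request")
	}
	id, ok := r.byModel[req.Model]
	if !ok {
		return "", 0, nil // empty promptID: skip injection
	}
	return id, r.pinned[id], nil
}

func main() {
	var resolver PromptResolver = modelResolver{
		byModel: map[string]string{"openai/gpt-5.4": "PROMPT-UUID-A"},
		pinned:  map[string]int{"PROMPT-UUID-A": 3},
	}
	id, version, err := resolver.Resolve(&BifrostContext{}, &BifrostRequest{Model: "openai/gpt-5.4"})
	fmt.Println(id, version, err)
}
```

Looking up a missing key in `pinned` returns Go's zero value, which conveniently matches the "0 means latest" convention.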
After injection, the plugin sets the following context keys (read by the logging plugin to populate log fields):
- `schemas.BifrostContextKeySelectedPromptID` — UUID of the applied prompt
- `schemas.BifrostContextKeySelectedPromptName` — Display name of the prompt
- `schemas.BifrostContextKeySelectedPromptVersion` — Version number as a string (e.g. `"3"`)
---
## Related
- [Playground](/features/prompt-repository/playground) — create folders, prompts, sessions, and committed versions.
- [Writing Go plugins](/plugins/writing-go-plugin) — plugin interfaces and lifecycle.
- Built-in plugin name in code: `prompts` (`github.com/maximhq/bifrost/plugins/prompts`).