first commit

2026-04-26 21:52:23 +03:00
commit 880f412e2c
2662 changed files with 866266 additions and 0 deletions
--- a/docs/features/litellm-compat.mdx
+++ b/docs/features/litellm-compat.mdx
@@ -0,0 +1,214 @@
+---
+title: "LiteLLM Compatibility"
+description: "Request and response transformations for LiteLLM proxy/SDK compatibility."
+icon: "train"
+---
+
+## Compatibility Transformations
+
+The LiteLLM compatibility plugin provides three transformations:
+
+1. **Text-to-Chat Conversion** - Automatically converts text completion requests to chat completion format for models that only support chat APIs
+2. **Chat-to-Responses Conversion** - Automatically converts chat completion requests to responses format for models that only support responses APIs
+3. **Drop Unsupported Params** - Automatically drops request parameters that the target model doesn't support
+
+When a request is converted, the response includes `extra_fields.converted_request_type: <transformed_request_type>`. If request parameters are dropped, their keys are listed in `extra_fields.dropped_compat_plugin_params`.
+
+---
+
+## 1. Text-to-Chat Conversion
+
+Many modern AI models (such as GPT-3.5-turbo, GPT-4, and Claude) only support the chat completion API and have no native text completion endpoint. LiteLLM compatibility mode handles this automatically by:
+
+1. Checking if the model supports text completion natively (using the model catalog)
+2. If not supported, converting your text prompt to chat message format
+3. Calling the chat completion endpoint internally
+4. Transforming the response back to text completion format
+5. Returning content in `choices[0].text` instead of `choices[0].message.content`
+
+<Note>
+**Smart Conversion**: The conversion only happens when the model doesn't support text completions natively. If a model has native text completion support (like OpenAI's davinci models), Bifrost uses the text completion endpoint directly without any conversion.
+</Note>
+
+This allows you to use a unified text completion interface across all providers, even those that only support chat completions.
+
+### How It Works
+
+When LiteLLM compatibility is enabled and you make a text completion request, Bifrost first checks if the model supports text completion:
+
+```mermaid
+flowchart LR
+A[Text Completion Request] --> B{Model Supports Text Completion?}
+B -->|Yes| C[Call Text Completion API]
+B -->|No| D[Convert to Chat Message]
+D --> E[Call Chat Completion API]
+E --> F[Transform Response]
+C --> G[Text Completion Response]
+F --> G
+```
+
+**Request Transformation:**
+- Your text prompt becomes a user message: `{"role": "user", "content": "your prompt"}`
+- Parameters like `max_tokens`, `temperature`, `top_p` are mapped to chat equivalents
+- Fallbacks are preserved
+
+**Response Transformation:**
+- `choices[0].message.content` → `choices[0].text`
+- `object: "chat.completion"` → `object: "text_completion"`
+- Usage statistics and metadata are preserved
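+
+As a rough illustration of the mapping above, consider a text completion request like the following. The model name, prompt, and parameter values are placeholders chosen for this sketch, not values taken from Bifrost:
+
+```json
+{
+  "model": "gpt-4",
+  "prompt": "Write a haiku about the sea.",
+  "max_tokens": 64,
+  "temperature": 0.7
+}
+```
+
+Under the rules listed above, it would reach the provider in roughly this chat form. The parameter names on the chat side are an assumption here, since the exact mapping can differ by provider:
+
+```json
+{
+  "model": "gpt-4",
+  "messages": [
+    { "role": "user", "content": "Write a haiku about the sea." }
+  ],
+  "max_tokens": 64,
+  "temperature": 0.7
+}
+```
+
+The caller would then get back a response shaped like a text completion. Only the fields named in this section are shown, and the text is a placeholder:
+
+```json
+{
+  "object": "text_completion",
+  "choices": [
+    { "index": 0, "text": "Waves fold into foam..." }
+  ]
+}
+```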
+
+## 2. Chat-to-Responses Conversion
+
+Some AI models (like OpenAI o1-pro) only support the responses API and don't support native chat completion endpoints. LiteLLM compatibility mode automatically handles this by:
+
+1. Checking if the model supports chat completion natively (using the model catalog)
+2. If not supported, converting your chat message to responses API format
+3. Calling the responses endpoint internally
+4. Transforming the response back to chat completion format
+
+<Note>
+**Smart Conversion**: The conversion only happens when the model doesn't support chat completions natively. If a model has native chat completion support (like OpenAI's gpt-4 models), Bifrost uses the chat completion endpoint directly without any conversion.
+</Note>
+
+This allows you to use a unified chat completion interface across all providers, even those that only support the responses API.
+
+### How It Works
+
+When LiteLLM compatibility is enabled and you make a chat completion request, Bifrost first checks if the model supports chat completion:
+
+```mermaid
+flowchart LR
+A[Chat Completion Request] --> B{Model Supports Chat Completion?}
+B -->|Yes| C[Call Chat Completion API]
+B -->|No| D[Convert to Responses Message]
+D --> E[Call Responses API]
+E --> F[Transform Response]
+C --> G[Chat Completion Response]
+F --> G
+```
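+
+To make the flow above concrete, here is a hedged sketch of the request shapes involved. A chat completion request such as this one (the prompt is a placeholder):
+
+```json
+{
+  "model": "o1-pro",
+  "messages": [
+    { "role": "user", "content": "Summarize this paragraph." }
+  ]
+}
+```
+
+would be forwarded in approximately this responses form, with the chat messages carried over as `input` items. This assumes the OpenAI-style `input` array; Bifrost's internal representation may differ:
+
+```json
+{
+  "model": "o1-pro",
+  "input": [
+    { "role": "user", "content": "Summarize this paragraph." }
+  ]
+}
+```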
+
+## Enabling LiteLLM Compatibility
+
+<Tabs group="litellm-compat">
+
+<Tab title="Gateway UI">
+
+1. Open the Bifrost dashboard
+2. Navigate to **Settings** → **Client Configuration**
+3. Expand **LiteLLM Compat** and enable the features you need:
+   - **Convert Text to Chat**: converts text completion requests to chat for models that only support chat
+   - **Convert Chat to Responses**: converts chat completion requests to responses for models that only support responses
+   - **Drop Unsupported Params**: drops unsupported parameters based on the model catalog allowlist
+4. Save your configuration
+
+</Tab>
+
+<Tab title="Configuration File">
+
+```json
+{
+  "client_config": {
+    "compat": {
+      "convert_text_to_chat": true,
+      "convert_chat_to_responses": true,
+      "should_drop_params": true
+    }
+  }
+}
+```
+
+</Tab>
+
+</Tabs>
+
+## Supported Providers
+
+Text completion to chat completion conversion works with any provider that supports chat completions but lacks native text completion support:
+
+| Provider | Native Text Completion | With Conversion |
+|----------|------------------------|-----------------|
+| OpenAI (GPT-4, GPT-3.5-turbo) | No | Yes |
+| Anthropic (Claude) | No | Yes |
+| Groq | No | Yes |
+| Gemini | No | Yes |
+| Mistral | No | Yes |
+| Bedrock | Varies by model | Yes |
+
+Chat completion to responses conversion works with any provider that supports responses but lacks native chat completion support:
+
+| Provider | Native Chat Completion | With Conversion |
+|----------|------------------------|-----------------|
+| OpenAI (o1-pro) | No | Yes |
+
+## Behavior Details
+
+**Model Capability Detection:**
+- Bifrost uses the model catalog to check if a model supports text completion
+- If the model has a "completion" mode in its pricing data, it supports text completion
+- Conversion only happens when the model lacks native text completion support
+
+## Transformations Reference
+
+### Transformation 1: Text-to-Chat Conversion
+
+**Applies to:** Text completion requests on chat-only models
+
+| Phase | Original | Transformed |
+|-------|----------|-------------|
+| Request | Text prompt (string) | Chat message with `role: "user"` |
+| Request | Array prompts | Concatenated into text content blocks |
+| Request | `text_completion` request type | `chat_completion` request type |
+| Request | `max_tokens`, `temperature`, `top_p` | Mapped to chat equivalents |
+| Response | `choices[0].message.content` | `choices[0].text` |
+| Response | `object: "chat.completion"` | `object: "text_completion"` |
+
+### Transformation 2: Chat-to-Responses Conversion
+
+**Applies to:** Chat completion requests on responses-only models
+
+| Phase | Original | Transformed |
+|-------|----------|-------------|
+| Request | Chat message with `role: "user"` | Responses input with `role: "user"` |
+| Request | `chat_completion` request type | `responses` request type |
+
+### Metadata Set on Transformed Responses
+
+When either transformation is applied:
+
+- `extra_fields.request_type`: Reflects the original request type
+- `extra_fields.original_model_requested`: The originally requested model
+- `extra_fields.resolved_model_used`: The actual provider API identifier used (equals `original_model_requested` when no alias mapping exists)
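+
+For orientation, the metadata on a converted text completion response might look roughly like the fragment below. The field names come from this page; the values, including the dropped parameter key, are hypothetical placeholders, and the real response may carry additional fields:
+
+```json
+{
+  "extra_fields": {
+    "request_type": "text_completion",
+    "converted_request_type": "chat_completion",
+    "original_model_requested": "gpt-4",
+    "resolved_model_used": "gpt-4",
+    "dropped_compat_plugin_params": ["some_unsupported_param"]
+  }
+}
+```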
+
+### Error Handling
+
+When errors occur on transformed requests:
+- Original request type and model are preserved in error metadata
+- `extra_fields.converted_request_type`: Set to the request type that the original request was converted to (i.e., `chat_completion` or `responses`)
+- `extra_fields.provider`: The provider that handled the request
+- `extra_fields.original_model_requested`: The originally requested model
+- `extra_fields.dropped_compat_plugin_params`: If any unsupported parameters were dropped, the keys are added here
+
+## What's Preserved
+
+- Model selection and fallback chain
+- Temperature, top_p, max_tokens, and other generation parameters
+- Stop sequences and frequency/presence penalties
+- Usage statistics and token counts
+
+## When to Use This
+
+**Good Use Cases:**
+- Migrating from LiteLLM to Bifrost without code changes
+- Maintaining backward compatibility with text completion or chat completion interfaces
+- Using a unified API across providers with different capabilities
+
+**Consider Alternatives When:**
+- You need chat-specific features (system messages, conversation history)
+- You want explicit control over message formatting
+- Performance is critical (direct chat requests avoid conversion overhead)
+
+## Related Features
+
+- [Fallbacks](/features/fallbacks) - Automatic provider failover
+- [Drop-in Replacement](/features/drop-in-replacement) - Use existing SDKs with Bifrost
+- [LiteLLM Integration](/integrations/litellm-sdk) - Using LiteLLM SDK with Bifrost