first commit

2026-04-26 21:52:23 +03:00
commit 880f412e2c
2662 changed files with 866266 additions and 0 deletions
--- a/docs/providers/supported-providers/fireworks.mdx
+++ b/docs/providers/supported-providers/fireworks.mdx
@@ -0,0 +1,179 @@
+---
+title: "Fireworks"
+description: "Fireworks API conversion guide covering native chat, responses, completions, embeddings, streaming, and Fireworks-specific parameter handling"
+icon: "sparkles"
+---
+
+## Overview
+
+Fireworks is an **OpenAI-compatible provider** in Bifrost with native support for:
+- **Chat Completions** via `/v1/chat/completions`
+- **Responses API** via `/v1/responses`
+- **Text Completions** via `/v1/completions`
+- **Embeddings** via `/v1/embeddings`
+- **Streaming** for chat, responses, and completions
+- **Tool calling** for chat and responses
+
+Unless noted below, Fireworks follows the standard OpenAI-compatible request and response behavior described in [OpenAI](./openai).
+
+### Supported Operations
+
+| Operation | Non-Streaming | Streaming | Endpoint |
+|-----------|---------------|-----------|----------|
+| Chat Completions | ✅ | ✅ | `/v1/chat/completions` |
+| Responses API | ✅ | ✅ | `/v1/responses` |
+| Text Completions | ✅ | ✅ | `/v1/completions` |
+| Embeddings | ✅ | ❌ | `/v1/embeddings` |
+| List Models | ✅ | - | `/v1/models` |
+| Images | ❌ | ❌ | - |
+| Speech / Transcription | ❌ | ❌ | - |
+| Files | ❌ | ❌ | - |
+| Batch | ❌ | ❌ | - |
+| Count Tokens | ❌ | ❌ | - |
+
+<Note>
+Fireworks Responses support is **native** in Bifrost. Requests are sent to Fireworks’ `/v1/responses` endpoint directly, so fields such as `previous_response_id`, `max_tool_calls`, and `store` are preserved.
+</Note>
+
+---
+
+# 1. Chat Completions
+
+Fireworks chat completions use the standard OpenAI-compatible wire format.
+
+## Fireworks-specific handling
+
+- `prediction` is preserved and forwarded.
+- Bifrost maps `prompt_cache_key` to Fireworks `prompt_cache_isolation_key` for chat-completion cache isolation.
+- Assistant `reasoning_content` is preserved for Fireworks chat-completion models that support reasoning history.
+
+## Filtered Parameters
+
+For Fireworks chat completions, Bifrost removes or rewrites a small set of OpenAI-specific fields before sending the request upstream:
+
+- `prompt_cache_key` is mapped to Fireworks `prompt_cache_isolation_key`
+- `prompt_cache_retention` is removed
+- `verbosity` is removed
+- `store` is removed
+- `web_search_options` is removed
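+
+As a hedged illustration of the mapping above, the request below sends `prompt_cache_key` through the Bifrost gateway. Per the list above, Bifrost would forward it upstream as `prompt_cache_isolation_key`. The key value `tenant-42` is a hypothetical placeholder, and the gateway address is assumed to match the other examples on this page.
+
+```bash
+# Sketch only: "tenant-42" is a made-up cache key used for illustration.
+PAYLOAD='{
+  "model": "fireworks/accounts/fireworks/models/deepseek-v3p2",
+  "prompt_cache_key": "tenant-42",
+  "messages": [
+    {"role": "user", "content": "Reply with exactly: cache ok"}
+  ]
+}'
+
+# Bifrost is expected to rewrite prompt_cache_key to prompt_cache_isolation_key upstream.
+curl -X POST http://localhost:8080/v1/chat/completions \
+  -H "Content-Type: application/json" \
+  -d "$PAYLOAD" \
+  || echo "Request failed; is the gateway running on localhost:8080?" >&2
+```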
+
+## Example
+
+```bash
+curl -X POST http://localhost:8080/v1/chat/completions \
+  -H "Content-Type: application/json" \
+  -d '{
+    "model": "fireworks/accounts/fireworks/models/deepseek-v3p2",
+    "messages": [
+      {"role": "user", "content": "Reply with exactly: fireworks ok"}
+    ]
+  }'
+```
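+
+Streaming is listed as supported for chat completions. The following is a minimal sketch, assuming the standard OpenAI-compatible `stream` flag is what enables it here. The prompt text is illustrative, and `-N` only disables curl's output buffering so chunks print as they arrive.
+
+```bash
+# Sketch only: assumes the OpenAI-style "stream": true flag is accepted by the gateway.
+PAYLOAD='{
+  "model": "fireworks/accounts/fireworks/models/deepseek-v3p2",
+  "stream": true,
+  "messages": [
+    {"role": "user", "content": "Count from one to five."}
+  ]
+}'
+
+# -N (--no-buffer) lets streamed chunks appear on the terminal as they arrive.
+curl -N -X POST http://localhost:8080/v1/chat/completions \
+  -H "Content-Type: application/json" \
+  -d "$PAYLOAD" \
+  || echo "Request failed; is the gateway running on localhost:8080?" >&2
+```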
+
+---
+
+# 2. Responses API
+
+Fireworks Responses use the native Fireworks endpoint:
+
+```text
+/v1/responses
+```
+
+This preserves Responses-only fields and semantics, including:
+- `previous_response_id`
+- `max_tool_calls`
+- `store`
+- native responses streaming
+
+## Example
+
+```bash
+curl -X POST http://localhost:8080/v1/responses \
+  -H "Content-Type: application/json" \
+  -d '{
+    "model": "fireworks/accounts/fireworks/models/deepseek-v3p2",
+    "input": [
+      {"role": "user", "content": "Reply with exactly: responses ok"}
+    ],
+    "max_tool_calls": 2
+  }'
+```
+
+To continue an earlier exchange, set `previous_response_id` to the `id` of a prior response; Bifrost preserves the field when forwarding the request to Fireworks.
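+
+A continuation request might look like the following sketch. The ID `resp_abc123` is a hypothetical placeholder; in practice it would be the `id` returned by an earlier `/v1/responses` call.
+
+```bash
+# Sketch only: "resp_abc123" is a placeholder, not a real response ID.
+PAYLOAD='{
+  "model": "fireworks/accounts/fireworks/models/deepseek-v3p2",
+  "previous_response_id": "resp_abc123",
+  "input": [
+    {"role": "user", "content": "Now repeat your last reply in uppercase."}
+  ]
+}'
+
+# previous_response_id is sent to the native Fireworks /v1/responses endpoint as-is.
+curl -X POST http://localhost:8080/v1/responses \
+  -H "Content-Type: application/json" \
+  -d "$PAYLOAD" \
+  || echo "Request failed; is the gateway running on localhost:8080?" >&2
+```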
+
+---
+
+# 3. Text Completions
+
+Fireworks text completions are sent to the native completions endpoint:
+
+```text
+/v1/completions
+```
+
+## Example
+
+```bash
+curl -X POST http://localhost:8080/v1/completions \
+  -H "Content-Type: application/json" \
+  -d '{
+    "model": "fireworks/accounts/fireworks/models/deepseek-v3p2",
+    "prompt": "In fruits, A is for apple and B is for"
+  }'
+```
+
+For Fireworks text completions, Bifrost extracts `prompt_cache_key` from `extra_params` and maps it to Fireworks `prompt_cache_isolation_key`.
+
+---
+
+# 4. Embeddings
+
+Fireworks embeddings are sent to:
+
+```text
+/v1/embeddings
+```
+
+Embedding models are generally separate from chat and completions models, so use a model ID that supports embeddings.
+
+## Example
+
+```bash
+curl -X POST http://localhost:8080/v1/embeddings \
+  -H "Content-Type: application/json" \
+  -d '{
+    "model": "fireworks/nomic-ai/nomic-embed-text-v1.5",
+    "input": "embedding test"
+  }'
+```
+
+Fireworks documents additional embedding-specific fields such as `prompt_template`, `return_logits`, and `normalize`. This page describes the standard embeddings flow currently covered by Bifrost.
+
+---
+
+# 5. Unsupported Features
+
+The following operations are not supported by the Fireworks provider in Bifrost:
+
+| Feature | Status |
+|---------|--------|
+| Image generation / editing / variations | ❌ |
+| Speech / TTS | ❌ |
+| Transcription / STT | ❌ |
+| Files | ❌ |
+| Batch | ❌ |
+| Count tokens | ❌ |
+| Rerank | ❌ |
+
+---
+
+# 6. Caveats
+
+<Accordion title="Prompt Caching Semantics">
+For Fireworks chat completions, Bifrost maps `prompt_cache_key` to Fireworks `prompt_cache_isolation_key`, which is the Fireworks body field for cache isolation. Fireworks also accepts the header form `x-prompt-cache-isolation-key`. For text completions, Bifrost extracts `prompt_cache_key` from `extra_params` and maps it to the same Fireworks body field.
+
+If you need Fireworks session-affinity behavior, use one of the following:
+- Pass `user`.
+- Configure `x-session-affinity` in provider extra headers.
+- Send it through the HTTP gateway via `x-bf-eh-x-session-affinity`.
+
+Live cache-hit behavior remains model and deployment dependent.
+</Accordion>
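+
+The gateway-header option for session affinity can be sketched as follows. The header name `x-bf-eh-x-session-affinity` comes from the caveat above; the session value `session-123` is a hypothetical placeholder.
+
+```bash
+# Sketch only: "session-123" is a made-up session identifier.
+SESSION_ID="session-123"
+HEADER="x-bf-eh-x-session-affinity: ${SESSION_ID}"
+
+# Per the caveat above, the x-bf-eh- form is how the header is sent through the HTTP gateway.
+curl -X POST http://localhost:8080/v1/chat/completions \
+  -H "Content-Type: application/json" \
+  -H "$HEADER" \
+  -d '{
+    "model": "fireworks/accounts/fireworks/models/deepseek-v3p2",
+    "messages": [
+      {"role": "user", "content": "Reply with exactly: affinity ok"}
+    ]
+  }' \
+  || echo "Request failed; is the gateway running on localhost:8080?" >&2
+```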
+
+<Accordion title="Reasoning History">
+Bifrost preserves assistant `reasoning_content` for Fireworks chat models that support reasoning history. Fireworks-specific reasoning controls such as `reasoning_history` do not receive special typed handling in this provider.
+</Accordion>