---
title: "Fireworks"
description: "Fireworks API conversion guide covering native chat, responses, completions, embeddings, streaming, and Fireworks-specific parameter handling"
icon: "sparkles"
---

## Overview

Fireworks is an **OpenAI-compatible provider** in Bifrost with native support for:

- **Chat Completions** via `/v1/chat/completions`
- **Responses API** via `/v1/responses`
- **Text Completions** via `/v1/completions`
- **Embeddings** via `/v1/embeddings`
- **Streaming** for chat, responses, and completions
- **Tool calling** for chat and responses

Unless noted below, Fireworks follows the standard OpenAI-compatible request and response behavior described in [OpenAI](./openai).

### Supported Operations

| Operation | Non-Streaming | Streaming | Endpoint |
|-----------|---------------|-----------|----------|
| Chat Completions | ✅ | ✅ | `/v1/chat/completions` |
| Responses API | ✅ | ✅ | `/v1/responses` |
| Text Completions | ✅ | ✅ | `/v1/completions` |
| Embeddings | ✅ | ❌ | `/v1/embeddings` |
| List Models | ✅ | - | `/v1/models` |
| Images | ❌ | ❌ | - |
| Speech / Transcription | ❌ | ❌ | - |
| Files | ❌ | ❌ | - |
| Batch | ❌ | ❌ | - |
| Count Tokens | ❌ | ❌ | - |

Fireworks Responses support is **native** in Bifrost. Requests are sent to Fireworks’ `/v1/responses` endpoint directly, so fields such as `previous_response_id`, `max_tool_calls`, and `store` are preserved.

---

# 1. Chat Completions

Fireworks chat completions use the standard OpenAI-compatible wire format.

## Fireworks-specific handling

- `prediction` is preserved and forwarded.
- Bifrost maps `prompt_cache_key` to Fireworks `prompt_cache_isolation_key` for chat-completion cache isolation.
- Assistant `reasoning_content` is preserved for Fireworks chat-completion models that support reasoning history.

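As a minimal sketch of the cache-key mapping listed above, the request below sets the OpenAI-style `prompt_cache_key` on a chat completion. It assumes a local Bifrost gateway on port 8080, as in the other examples on this page, and the key value `tenant-42` is a hypothetical placeholder. Per the handling described above, Bifrost would forward the value upstream as `prompt_cache_isolation_key`.

```bash
# Hypothetical cache key chosen only for illustration.
CACHE_KEY="tenant-42"

# Build the body with the OpenAI-style field name; the rename to
# prompt_cache_isolation_key happens inside Bifrost, not in the client.
PAYLOAD=$(cat <<EOF
{
  "model": "fireworks/accounts/fireworks/models/deepseek-v3p2",
  "prompt_cache_key": "${CACHE_KEY}",
  "messages": [
    {"role": "user", "content": "Reply with exactly: fireworks ok"}
  ]
}
EOF
)

# Send the request; print a hint instead of failing silently if the
# gateway is not reachable.
curl -X POST http://localhost:8080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d "$PAYLOAD" || echo "request failed; is the Bifrost gateway running on localhost:8080?" >&2
```

The client never sends the Fireworks field name itself, so the same request body can be reused against other OpenAI-compatible providers.
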
## Filtered Parameters

For Fireworks chat completions, Bifrost removes or rewrites a small set of OpenAI-specific fields before sending the request upstream:

- `prompt_cache_key` is mapped to Fireworks `prompt_cache_isolation_key`
- `prompt_cache_retention` is removed
- `verbosity` is removed
- `store` is removed
- `web_search_options` is removed

## Example

```bash
curl -X POST http://localhost:8080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "fireworks/accounts/fireworks/models/deepseek-v3p2",
    "messages": [
      {"role": "user", "content": "Reply with exactly: fireworks ok"}
    ]
  }'
```

---

# 2. Responses API

Fireworks Responses use the native Fireworks endpoint:

```text
/v1/responses
```

This preserves Responses-only fields and semantics, including:

- `previous_response_id`
- `max_tool_calls`
- `store`
- native responses streaming

## Example

```bash
curl -X POST http://localhost:8080/v1/responses \
  -H "Content-Type: application/json" \
  -d '{
    "model": "fireworks/accounts/fireworks/models/deepseek-v3p2",
    "input": [
      {"role": "user", "content": "Reply with exactly: responses ok"}
    ],
    "max_tool_calls": 2
  }'
```

For continuation requests, Fireworks also supports `previous_response_id`.

---

# 3. Text Completions

Fireworks text completions are sent to the native completions endpoint:

```text
/v1/completions
```

## Example

```bash
curl -X POST http://localhost:8080/v1/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "fireworks/accounts/fireworks/models/deepseek-v3p2",
    "prompt": "In fruits, A is for apple and B is for"
  }'
```

For Fireworks text completions, Bifrost extracts `prompt_cache_key` from `extra_params` and maps it to Fireworks `prompt_cache_isolation_key`.

---

# 4. Embeddings

Fireworks embeddings are sent to:

```text
/v1/embeddings
```

Embedding-capable models may be different from chat/completions models.

## Example

```bash
curl -X POST http://localhost:8080/v1/embeddings \
  -H "Content-Type: application/json" \
  -d '{
    "model": "fireworks/nomic-ai/nomic-embed-text-v1.5",
    "input": "embedding test"
  }'
```

Fireworks documents additional embedding-specific fields such as `prompt_template`, `return_logits`, and `normalize`. This page describes the standard embeddings flow currently covered by Bifrost.

---

# 5. Unsupported Features

The following operations are still unsupported by the Fireworks provider in Bifrost:

| Feature | Status |
|---------|--------|
| Image generation / editing / variations | ❌ |
| Speech / TTS | ❌ |
| Transcription / STT | ❌ |
| Files | ❌ |
| Batch | ❌ |
| Count tokens | ❌ |
| Rerank | ❌ |

---

# 6. Caveats

For Fireworks chat completions, Bifrost maps `prompt_cache_key` to Fireworks `prompt_cache_isolation_key`, which is the Fireworks body field for cache isolation. Fireworks also accepts the header form `x-prompt-cache-isolation-key`.

For text completions, Bifrost extracts `prompt_cache_key` from `extra_params` and maps it to the same Fireworks body field.

If you need Fireworks session-affinity behavior, pass `user`, configure `x-session-affinity` in provider extra headers, or send it through the HTTP gateway via `x-bf-eh-x-session-affinity`. Live cache-hit behavior remains model and deployment dependent.

Bifrost preserves assistant `reasoning_content` for Fireworks chat models that support reasoning history. Fireworks-specific reasoning controls such as `reasoning_history` are not given special typed handling in this provider page.
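
As a sketch of the gateway-header option for session affinity mentioned in these caveats, the request below sends `x-bf-eh-x-session-affinity` alongside an ordinary chat completion. The affinity value `session-123` is a hypothetical placeholder, and the gateway address is assumed to match the other examples on this page.

```bash
# Hypothetical session identifier used only for illustration.
SESSION_ID="session-123"

# Gateway header named in the caveats above for passing
# x-session-affinity through the HTTP gateway.
AFFINITY_HEADER="x-bf-eh-x-session-affinity: ${SESSION_ID}"

# Send the request; print a hint instead of failing silently if the
# gateway is not reachable.
curl -X POST http://localhost:8080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "$AFFINITY_HEADER" \
  -d '{
    "model": "fireworks/accounts/fireworks/models/deepseek-v3p2",
    "messages": [
      {"role": "user", "content": "Reply with exactly: fireworks ok"}
    ]
  }' || echo "request failed; is the Bifrost gateway running on localhost:8080?" >&2
```

Reusing the same value across related requests is the point of the header, but whether those requests actually produce cache hits still depends on the model and deployment.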