Beyhan Oğur 880f412e2c first commit
2026-04-26 21:52:23 +03:00

---
title: "Fireworks"
description: "Fireworks API conversion guide covering native chat, responses, completions, embeddings, streaming, and Fireworks-specific parameter handling"
icon: "sparkles"
---
## Overview
Fireworks is an **OpenAI-compatible provider** in Bifrost with native support for:
- **Chat Completions** via `/v1/chat/completions`
- **Responses API** via `/v1/responses`
- **Text Completions** via `/v1/completions`
- **Embeddings** via `/v1/embeddings`
- **Streaming** for chat, responses, and completions
- **Tool calling** for chat and responses
Unless noted below, Fireworks follows the standard OpenAI-compatible request and response behavior described in [OpenAI](./openai).
### Supported Operations
| Operation | Non-Streaming | Streaming | Endpoint |
|-----------|---------------|-----------|----------|
| Chat Completions | ✅ | ✅ | `/v1/chat/completions` |
| Responses API | ✅ | ✅ | `/v1/responses` |
| Text Completions | ✅ | ✅ | `/v1/completions` |
| Embeddings | ✅ | ❌ | `/v1/embeddings` |
| List Models | ✅ | - | `/v1/models` |
| Images | ❌ | ❌ | - |
| Speech / Transcription | ❌ | ❌ | - |
| Files | ❌ | ❌ | - |
| Batch | ❌ | ❌ | - |
| Count Tokens | ❌ | ❌ | - |
<Note>
Fireworks Responses support is **native** in Bifrost. Requests are sent directly to the Fireworks `/v1/responses` endpoint, so fields such as `previous_response_id`, `max_tool_calls`, and `store` are preserved.
</Note>
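The Supported Operations table marks chat completions as streamable. A minimal sketch of a streaming request follows, assuming the standard OpenAI-compatible `"stream": true` flag applies unchanged and that a Bifrost gateway is listening on `localhost:8080`. The helper names `stream_chat_body` and `stream_chat` are illustrative only.

```bash
# Request body with streaming enabled. "stream": true is the standard
# OpenAI-compatible flag; it is assumed to apply here unchanged.
stream_chat_body() {
  cat <<'EOF'
{
  "model": "fireworks/accounts/fireworks/models/deepseek-v3p2",
  "stream": true,
  "messages": [
    {"role": "user", "content": "Count from 1 to 5"}
  ]
}
EOF
}

# -N turns off curl's output buffering so chunks print as they arrive.
stream_chat() {
  curl -sS -N -X POST http://localhost:8080/v1/chat/completions \
    -H "Content-Type: application/json" \
    -d "$(stream_chat_body)"
}
```

Calling `stream_chat` sends the request. Nothing is sent until the function is invoked.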
---
# 1. Chat Completions
Fireworks chat completions use the standard OpenAI-compatible wire format.
## Fireworks-specific handling
- `prediction` is preserved and forwarded.
- Bifrost maps `prompt_cache_key` to Fireworks `prompt_cache_isolation_key` for chat-completion cache isolation.
- Assistant `reasoning_content` is preserved for Fireworks chat-completion models that support reasoning history.
## Filtered Parameters
For Fireworks chat completions, Bifrost removes or rewrites a small set of OpenAI-specific fields before sending the request upstream:
- `prompt_cache_key` is mapped to Fireworks `prompt_cache_isolation_key`
- `prompt_cache_retention` is removed
- `verbosity` is removed
- `store` is removed
- `web_search_options` is removed
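As a sketch of the rewrite described above, a client can keep sending the OpenAI-style `prompt_cache_key` and rely on Bifrost to forward it as `prompt_cache_isolation_key`. The key value `tenant-42` and the helper names are hypothetical, and the gateway URL is assumed to match the other examples on this page.

```bash
# Build a chat body that carries an OpenAI-style prompt_cache_key.
# Per the list above, Bifrost is expected to forward this upstream as
# "prompt_cache_isolation_key". $1 is the cache key; use a simple value
# without quotes or backslashes, since it is inserted into JSON as-is.
chat_body_with_cache_key() {
  cat <<EOF
{
  "model": "fireworks/accounts/fireworks/models/deepseek-v3p2",
  "prompt_cache_key": "$1",
  "messages": [
    {"role": "user", "content": "Reply with exactly: fireworks ok"}
  ]
}
EOF
}

# Send the body through a local Bifrost gateway.
send_chat_with_cache_key() {
  curl -sS -X POST http://localhost:8080/v1/chat/completions \
    -H "Content-Type: application/json" \
    -d "$(chat_body_with_cache_key "$1")"
}
```

For example, `send_chat_with_cache_key tenant-42` sends one request scoped to that key.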
## Example
```bash
curl -X POST http://localhost:8080/v1/chat/completions \
-H "Content-Type: application/json" \
-d '{
"model": "fireworks/accounts/fireworks/models/deepseek-v3p2",
"messages": [
{"role": "user", "content": "Reply with exactly: fireworks ok"}
]
}'
```
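The Overview lists tool calling for chat. A request with a tool definition might look like the sketch below, assuming the standard OpenAI `tools` format that this page says Fireworks follows. The `get_weather` function, its schema, and the helper names are hypothetical.

```bash
# Chat body declaring one callable tool in the OpenAI-compatible format.
# "get_weather" and its parameters are made up for illustration.
tool_chat_body() {
  cat <<'EOF'
{
  "model": "fireworks/accounts/fireworks/models/deepseek-v3p2",
  "messages": [
    {"role": "user", "content": "What is the weather in Istanbul?"}
  ],
  "tools": [
    {
      "type": "function",
      "function": {
        "name": "get_weather",
        "description": "Look up the current weather for a city",
        "parameters": {
          "type": "object",
          "properties": {"city": {"type": "string"}},
          "required": ["city"]
        }
      }
    }
  ]
}
EOF
}

# Send the body through a local Bifrost gateway.
send_tool_chat() {
  curl -sS -X POST http://localhost:8080/v1/chat/completions \
    -H "Content-Type: application/json" \
    -d "$(tool_chat_body)"
}
```

Whether the model actually emits a tool call depends on the model and the prompt.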
---
# 2. Responses API
Fireworks Responses use the native Fireworks endpoint:
```text
/v1/responses
```
This preserves Responses-only fields and semantics, including:
- `previous_response_id`
- `max_tool_calls`
- `store`
- native responses streaming
## Example
```bash
curl -X POST http://localhost:8080/v1/responses \
-H "Content-Type: application/json" \
-d '{
"model": "fireworks/accounts/fireworks/models/deepseek-v3p2",
"input": [
{"role": "user", "content": "Reply with exactly: responses ok"}
],
"max_tool_calls": 2
}'
```
For continuation requests, Fireworks also supports `previous_response_id`.
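A continuation request could be sketched as follows. The response ID `resp_abc123` used in the usage note is a placeholder, not a real identifier; in practice it would come from the `id` field of an earlier response. The helper names are illustrative.

```bash
# Build a Responses body that continues an earlier response.
# $1 is the ID of the previous response (a placeholder in this sketch).
continuation_body() {
  cat <<EOF
{
  "model": "fireworks/accounts/fireworks/models/deepseek-v3p2",
  "previous_response_id": "$1",
  "input": [
    {"role": "user", "content": "Now repeat your last reply in uppercase"}
  ]
}
EOF
}

# Send the continuation through a local Bifrost gateway.
send_continuation() {
  curl -sS -X POST http://localhost:8080/v1/responses \
    -H "Content-Type: application/json" \
    -d "$(continuation_body "$1")"
}
```

For example, `send_continuation resp_abc123` continues from that response, assuming the earlier response is still available to Fireworks.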
---
# 3. Text Completions
Fireworks text completions are sent to the native completions endpoint:
```text
/v1/completions
```
## Example
```bash
curl -X POST http://localhost:8080/v1/completions \
-H "Content-Type: application/json" \
-d '{
"model": "fireworks/accounts/fireworks/models/deepseek-v3p2",
"prompt": "In fruits, A is for apple and B is for"
}'
```
For Fireworks text completions, Bifrost extracts `prompt_cache_key` from `extra_params` and maps it to Fireworks `prompt_cache_isolation_key`.
---
# 4. Embeddings
Fireworks embeddings are sent to:
```text
/v1/embeddings
```
Embedding-capable models may differ from chat and completions models, so check that the model you pass supports embeddings.
## Example
```bash
curl -X POST http://localhost:8080/v1/embeddings \
-H "Content-Type: application/json" \
-d '{
"model": "fireworks/nomic-ai/nomic-embed-text-v1.5",
"input": "embedding test"
}'
```
Fireworks documents additional embedding-specific fields such as `prompt_template`, `return_logits`, and `normalize`. This page describes the standard embeddings flow currently covered by Bifrost.
---
# 5. Unsupported Features
The Fireworks provider in Bifrost does not yet support the following operations:
| Feature | Status |
|---------|--------|
| Image generation / editing / variations | ❌ |
| Speech / TTS | ❌ |
| Transcription / STT | ❌ |
| Files | ❌ |
| Batch | ❌ |
| Count tokens | ❌ |
| Rerank | ❌ |
---
# 6. Caveats
<Accordion title="Prompt Caching Semantics">
For Fireworks chat completions, Bifrost maps `prompt_cache_key` to `prompt_cache_isolation_key`, the Fireworks body field for cache isolation. Fireworks also accepts the header form `x-prompt-cache-isolation-key`. For text completions, Bifrost extracts `prompt_cache_key` from `extra_params` and maps it to the same body field. If you need Fireworks session-affinity behavior, you can pass `user`, configure `x-session-affinity` in the provider's extra headers, or send it through the HTTP gateway as `x-bf-eh-x-session-affinity`. Actual cache-hit behavior depends on the model and deployment.
</Accordion>
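The HTTP gateway option for session affinity can be sketched as below. The header name `x-bf-eh-x-session-affinity` comes from the caveat above; the session value `session-123` and the helper names are hypothetical.

```bash
# Format the gateway header that carries the session-affinity value.
# $1 is the session identifier (any stable string chosen by the client).
session_affinity_header() {
  printf 'x-bf-eh-x-session-affinity: %s' "$1"
}

# Send a chat request with the affinity header through a local gateway.
send_with_affinity() {
  curl -sS -X POST http://localhost:8080/v1/chat/completions \
    -H "Content-Type: application/json" \
    -H "$(session_affinity_header "$1")" \
    -d '{
      "model": "fireworks/accounts/fireworks/models/deepseek-v3p2",
      "messages": [
        {"role": "user", "content": "Reply with exactly: fireworks ok"}
      ]
    }'
}
```

For example, `send_with_affinity session-123` reuses the same affinity value across calls, which is the precondition for any cache benefit.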
<Accordion title="Reasoning History">
Bifrost preserves assistant `reasoning_content` for Fireworks chat models that support reasoning history. Fireworks-specific reasoning controls such as `reasoning_history` do not receive special typed handling in this provider.
</Accordion>
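A multi-turn request that carries earlier reasoning back to the model might look like the sketch below. The message contents are invented, and whether the model shown here supports reasoning history is an assumption; substitute a model that does.

```bash
# Chat body whose assistant turn includes reasoning_content from a
# previous reply, so Bifrost can pass it along to Fireworks.
reasoning_history_body() {
  cat <<'EOF'
{
  "model": "fireworks/accounts/fireworks/models/deepseek-v3p2",
  "messages": [
    {"role": "user", "content": "What is 17 + 25?"},
    {
      "role": "assistant",
      "content": "42",
      "reasoning_content": "17 + 25 = 42."
    },
    {"role": "user", "content": "Now double it."}
  ]
}
EOF
}

# Send the conversation through a local Bifrost gateway.
send_reasoning_history() {
  curl -sS -X POST http://localhost:8080/v1/chat/completions \
    -H "Content-Type: application/json" \
    -d "$(reasoning_history_body)"
}
```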