---
title: "Fireworks"
description: "Fireworks API conversion guide covering native chat, responses, completions, embeddings, streaming, and Fireworks-specific parameter handling"
icon: "sparkles"
---

## Overview

Fireworks is an **OpenAI-compatible provider** in Bifrost with native support for:
- **Chat Completions** via `/v1/chat/completions`
- **Responses API** via `/v1/responses`
- **Text Completions** via `/v1/completions`
- **Embeddings** via `/v1/embeddings`
- **Streaming** for chat, responses, and completions
- **Tool calling** for chat and responses

Unless noted below, Fireworks follows the standard OpenAI-compatible request and response behavior described in [OpenAI](./openai).

### Supported Operations

| Operation | Non-Streaming | Streaming | Endpoint |
|-----------|---------------|-----------|----------|
| Chat Completions | ✅ | ✅ | `/v1/chat/completions` |
| Responses API | ✅ | ✅ | `/v1/responses` |
| Text Completions | ✅ | ✅ | `/v1/completions` |
| Embeddings | ✅ | ❌ | `/v1/embeddings` |
| List Models | ✅ | - | `/v1/models` |
| Images | ❌ | ❌ | - |
| Speech / Transcription | ❌ | ❌ | - |
| Files | ❌ | ❌ | - |
| Batch | ❌ | ❌ | - |
| Count Tokens | ❌ | ❌ | - |

<Note>
Fireworks Responses support is **native** in Bifrost. Requests are sent to Fireworks’ `/v1/responses` endpoint directly, so fields such as `previous_response_id`, `max_tool_calls`, and `store` are preserved.
</Note>

---

# 1. Chat Completions

Fireworks chat completions use the standard OpenAI-compatible wire format.

## Fireworks-specific handling

- `prediction` is preserved and forwarded.
- Bifrost maps `prompt_cache_key` to Fireworks `prompt_cache_isolation_key` for chat-completion cache isolation.
- Assistant `reasoning_content` is preserved for Fireworks chat-completion models that support reasoning history.

## Filtered Parameters

For Fireworks chat completions, Bifrost removes or rewrites a small set of OpenAI-specific fields before sending the request upstream:

- `prompt_cache_key` is mapped to Fireworks `prompt_cache_isolation_key`
- `prompt_cache_retention` is removed
- `verbosity` is removed
- `store` is removed
- `web_search_options` is removed
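
The rewrite rules above can be modeled as a small pure function. This is an illustrative Python sketch of the documented behavior, not Bifrost's actual implementation; the helper name `to_fireworks_chat_body` is hypothetical, and other request handling (such as the provider prefix in `model`) is not modeled.

```python
# Illustrative model of the chat-completions rewrite rules listed above.
# Hypothetical helper; this is not Bifrost's actual code.

REMOVED_FIELDS = ("prompt_cache_retention", "verbosity", "store", "web_search_options")


def to_fireworks_chat_body(request: dict) -> dict:
    """Return a copy of an OpenAI-style chat request with the documented rewrites applied."""
    body = dict(request)  # shallow copy so the caller's dict is left untouched

    # prompt_cache_key is renamed rather than dropped.
    if "prompt_cache_key" in body:
        body["prompt_cache_isolation_key"] = body.pop("prompt_cache_key")

    # OpenAI-specific fields are removed before the request goes upstream.
    for field in REMOVED_FIELDS:
        body.pop(field, None)

    return body


if __name__ == "__main__":
    example = {
        "model": "fireworks/accounts/fireworks/models/deepseek-v3p2",
        "messages": [{"role": "user", "content": "Reply with exactly: fireworks ok"}],
        "prompt_cache_key": "tenant-42",  # placeholder value
        "store": True,
    }
    print(to_fireworks_chat_body(example))
```

Fields not named in the lists above, such as `prediction`, pass through this sketch unchanged, matching the handling described earlier.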

## Example

```bash
curl -X POST http://localhost:8080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "fireworks/accounts/fireworks/models/deepseek-v3p2",
    "messages": [
      {"role": "user", "content": "Reply with exactly: fireworks ok"}
    ]
  }'
```
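
Streaming uses the same endpoint. Below is a minimal Python sketch using only the standard library. It assumes that `"stream": true` in the body enables streaming and that the stream follows the usual OpenAI-compatible server-sent-events framing (`data: {json}` lines ending with `data: [DONE]`). The helper names `iter_content_deltas` and `stream_chat` are illustrative.

```python
import json
import urllib.request
from typing import Iterable, Iterator

# Same gateway address as the curl example above.
BIFROST_URL = "http://localhost:8080/v1/chat/completions"


def iter_content_deltas(lines: Iterable[bytes]) -> Iterator[str]:
    """Yield text fragments from OpenAI-style SSE lines (assumed framing)."""
    for raw in lines:
        line = raw.decode("utf-8").strip()
        if not line.startswith("data:"):
            continue  # skip blank separator lines and non-data fields
        data = line[len("data:"):].strip()
        if data == "[DONE]":
            break  # end-of-stream sentinel
        chunk = json.loads(data)
        for choice in chunk.get("choices", []):
            text = (choice.get("delta") or {}).get("content")
            if text:
                yield text


def stream_chat(prompt: str, model: str) -> Iterator[str]:
    """POST a streaming chat request to the gateway and yield text deltas."""
    body = json.dumps({
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "stream": True,
    }).encode("utf-8")
    request = urllib.request.Request(
        BIFROST_URL, data=body, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(request) as response:
        # The HTTP response object can be iterated line by line as bytes.
        yield from iter_content_deltas(response)
```

Iterating over `stream_chat("Reply with exactly: fireworks ok", "fireworks/accounts/fireworks/models/deepseek-v3p2")` and printing each fragment reproduces the reply incrementally, provided a gateway is running locally.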

---

# 2. Responses API

Fireworks Responses use the native Fireworks endpoint:

```text
/v1/responses
```

This preserves Responses-only fields and semantics, including:
- `previous_response_id`
- `max_tool_calls`
- `store`
- native responses streaming

## Example

```bash
curl -X POST http://localhost:8080/v1/responses \
  -H "Content-Type: application/json" \
  -d '{
    "model": "fireworks/accounts/fireworks/models/deepseek-v3p2",
    "input": [
      {"role": "user", "content": "Reply with exactly: responses ok"}
    ],
    "max_tool_calls": 2
  }'
```

To continue an earlier exchange, set `previous_response_id` to the ID of the prior response. Bifrost forwards the field to Fireworks unchanged.
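
A continuation can be sketched as a second request body that references the first response. This Python sketch assumes the response object exposes its identifier as a top-level `id` field, as in the OpenAI Responses format. `continuation_body` is a hypothetical helper, and the response ID in the usage example is a placeholder, not real output.

```python
def continuation_body(previous_response: dict, model: str, user_text: str) -> dict:
    """Build a follow-up /v1/responses body that chains onto an earlier response."""
    return {
        "model": model,
        # Assumed: the earlier response carries its identifier in a top-level "id".
        "previous_response_id": previous_response["id"],
        "input": [{"role": "user", "content": user_text}],
    }


if __name__ == "__main__":
    first_response = {"id": "resp_placeholder"}  # placeholder, not a real response
    print(continuation_body(
        first_response,
        "fireworks/accounts/fireworks/models/deepseek-v3p2",
        "Now reply with exactly: continued ok",
    ))
```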

---

# 3. Text Completions

Fireworks text completions are sent to the native completions endpoint:

```text
/v1/completions
```

## Example

```bash
curl -X POST http://localhost:8080/v1/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "fireworks/accounts/fireworks/models/deepseek-v3p2",
    "prompt": "In fruits, A is for apple and B is for"
  }'
```

For Fireworks text completions, Bifrost extracts `prompt_cache_key` from `extra_params` and maps it to Fireworks `prompt_cache_isolation_key`.

---

# 4. Embeddings

Fireworks embeddings are sent to:

```text
/v1/embeddings
```

Embedding models are often distinct from chat and completions models, so check that the model ID you pass supports embeddings.

## Example

```bash
curl -X POST http://localhost:8080/v1/embeddings \
  -H "Content-Type: application/json" \
  -d '{
    "model": "fireworks/nomic-ai/nomic-embed-text-v1.5",
    "input": "embedding test"
  }'
```
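
Reading the result can be sketched as follows, assuming the response uses the OpenAI-compatible embeddings shape: a `data` list whose items carry `index` and `embedding`. The helper name `extract_embeddings` and the sample vector values are illustrative only.

```python
from typing import List


def extract_embeddings(response: dict) -> List[List[float]]:
    """Return embedding vectors ordered by their `index` field."""
    items = sorted(response["data"], key=lambda item: item["index"])
    return [item["embedding"] for item in items]


if __name__ == "__main__":
    # Made-up response body for illustration; real vectors are much longer.
    sample = {
        "object": "list",
        "data": [{"object": "embedding", "index": 0, "embedding": [0.1, -0.2, 0.3]}],
    }
    vectors = extract_embeddings(sample)
    print(len(vectors), len(vectors[0]))
```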

Fireworks documents additional embedding-specific fields such as `prompt_template`, `return_logits`, and `normalize`. This page covers only the standard embeddings flow that Bifrost currently handles.

---

# 5. Unsupported Features

The Fireworks provider in Bifrost does not currently support the following operations:

| Feature | Status |
|---------|--------|
| Image generation / editing / variations | ❌ |
| Speech / TTS | ❌ |
| Transcription / STT | ❌ |
| Files | ❌ |
| Batch | ❌ |
| Count tokens | ❌ |
| Rerank | ❌ |

---

# 6. Caveats

<Accordion title="Prompt Caching Semantics">
For Fireworks chat completions, Bifrost maps `prompt_cache_key` to `prompt_cache_isolation_key`, the Fireworks body field for cache isolation. Fireworks also accepts the header form `x-prompt-cache-isolation-key`. For text completions, Bifrost extracts `prompt_cache_key` from `extra_params` and maps it to the same body field.

For Fireworks session-affinity behavior, use one of the following:

- pass `user` in the request
- configure `x-session-affinity` in the provider's extra headers
- send `x-bf-eh-x-session-affinity` through the HTTP gateway

Actual cache-hit behavior depends on the model and deployment.
</Accordion>
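
The cache-isolation key and the gateway session-affinity header described above can be combined in a single request. This is a minimal Python sketch; it assumes the gateway forwards the part of the header name after the `x-bf-eh-` prefix upstream, as the text above implies. `build_cached_chat_request` is a hypothetical helper, and the key and session values are placeholders.

```python
from typing import Dict, List, Tuple


def build_cached_chat_request(
    model: str, messages: List[dict], cache_key: str, session_id: str
) -> Tuple[Dict[str, str], dict]:
    """Return (headers, body) for a gateway chat request with cache isolation and session affinity."""
    headers = {
        "Content-Type": "application/json",
        # Gateway header form from the text above; assumed to reach Fireworks
        # as x-session-affinity.
        "x-bf-eh-x-session-affinity": session_id,
    }
    body = {
        "model": model,
        "messages": messages,
        # Mapped by Bifrost to prompt_cache_isolation_key for chat completions.
        "prompt_cache_key": cache_key,
    }
    return headers, body


if __name__ == "__main__":
    headers, body = build_cached_chat_request(
        "fireworks/accounts/fireworks/models/deepseek-v3p2",
        [{"role": "user", "content": "Reply with exactly: fireworks ok"}],
        cache_key="tenant-42",    # placeholder
        session_id="session-1",   # placeholder
    )
    print(headers)
    print(body)
```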

<Accordion title="Reasoning History">
Bifrost preserves assistant `reasoning_content` for Fireworks chat models that support reasoning history. Fireworks-specific reasoning controls such as `reasoning_history` do not receive dedicated typed handling in this provider.
</Accordion>