---
title: "Reasoning"
description: "Cross-provider reference for reasoning and thinking capabilities in AI models"
icon: "brain"
---
## Overview
Reasoning (also called "thinking" by some providers) lets AI models expose their step-by-step thought process before giving a final answer. The feature is available across multiple providers, each with its own request and response format.
Bifrost normalizes all provider-specific reasoning formats to a consistent OpenAI-compatible structure using `reasoning` in requests and `reasoning_details` in responses.
---
## Provider Support Matrix
| Provider | Request Field | Response Field | Min Budget | Effort Levels | Streaming |
|----------|--------------|----------------|------------|---------------|-----------|
| OpenAI | `reasoning` | `reasoning_details` | None | `minimal`, `low`, `medium`, `high` | ✅ |
| Anthropic | `thinking` | Content blocks | **1024 tokens** | `enabled` only | ✅ |
| Bedrock (Anthropic) | `thinking` | Content blocks | **1024 tokens** | `enabled` only | ✅ |
| Gemini 2.5+ | `thinking_config` | `thought` parts | 1024 | Budget-only | ✅ |
| Gemini 3.0+ | `thinking_config` | `thought` parts | 1024 | `minimal`, `low`, `medium`, `high` + Budget | ✅ |
---
## Request Configuration
### Chat Completions API
```json
{
"model": "provider/model-name",
"messages": [...],
"reasoning": {
"effort": "high",
"max_tokens": 4096
}
}
```
```go
package main

import (
	"github.com/maximhq/bifrost/core/schemas"
)

func main() {
	// Build a chat request that asks for high-effort reasoning.
	chatReq := &schemas.BifrostChatRequest{
		Provider: schemas.OpenAI,
		Model:    "gpt-4o",
		Input: []schemas.ChatMessage{
			{
				Role: schemas.ChatMessageRoleUser,
				Content: &schemas.ChatMessageContent{
					ContentStr: schemas.Ptr("Explain quantum computing"),
				},
			},
		},
		Params: &schemas.ChatParameters{
			MaxCompletionTokens: schemas.Ptr(4096),
			Reasoning: &schemas.ChatReasoning{
				Effort:    schemas.Ptr("high"),
				MaxTokens: schemas.Ptr(4096),
			},
		},
	}
	_ = chatReq // pass chatReq to the Bifrost client
}
```
### Responses API
```json
{
"model": "provider/model-name",
"input": [...],
"reasoning": {
"effort": "high",
"max_tokens": 4096,
"summary": "detailed"
}
}
```
```go
package main

import (
	"github.com/maximhq/bifrost/core/schemas"
)

func main() {
	// Build a Responses API request with reasoning and a detailed summary.
	responsesReq := &schemas.BifrostResponsesRequest{
		Provider: schemas.Anthropic,
		Model:    "claude-3-5-sonnet-20241022",
		Input: []schemas.ResponsesMessage{
			{
				Role: schemas.Ptr(schemas.ResponsesInputMessageRoleUser),
				Content: &schemas.ResponsesMessageContent{
					ContentStr: schemas.Ptr("Explain quantum computing"),
				},
			},
		},
		Params: &schemas.ResponsesParameters{
			MaxOutputTokens: schemas.Ptr(4096),
			Reasoning: &schemas.ResponsesParametersReasoning{
				Effort:    schemas.Ptr("high"),
				MaxTokens: schemas.Ptr(4096),
				Summary:   schemas.Ptr("detailed"),
			},
		},
	}
	_ = responsesReq // pass responsesReq to the Bifrost client
}
```
The Responses API takes `effort` and `max_tokens` exactly as Chat Completions does; `summary` is the only additional field.
### Parameter Reference
#### Chat Completions API Parameters
| Parameter | Type | Description |
|-----------|------|-------------|
| `effort` | `string` | Reasoning intensity level |
| `max_tokens` | `int` | Maximum tokens for reasoning (budget) |
#### Responses API Parameters
| Parameter | Type | Description |
|-----------|------|-------------|
| `effort` | `string` | Reasoning intensity level |
| `max_tokens` | `int` | Maximum tokens for reasoning (budget) |
| `summary` | `string` | Summary level: `brief`, `detailed`, or `json` |
**Responses API** accepts the same `effort` and `max_tokens` parameters as Chat Completions, but adds an optional `summary` parameter for reasoning output summarization.
---
## Provider-Specific Conversions
### OpenAI
OpenAI uses effort-based reasoning only. Bifrost applies priority logic:
1. If `reasoning.effort` is provided → use it directly
2. Else if `reasoning.max_tokens` is provided → estimate effort from it
3. The `max_tokens` field is cleared before sending to OpenAI
**Conversion Examples**:
```json
// Bifrost Request (with effort)
{
"reasoning": {
"effort": "high"
}
}
// OpenAI Request Sent
{
"reasoning": {
"effort": "high"
}
}
```
```go
// Bifrost request with effort (native field)
chatReq := &schemas.BifrostChatRequest{
Provider: schemas.OpenAI,
Model: "gpt-4o",
Input: messages,
Params: &schemas.ChatParameters{
MaxCompletionTokens: schemas.Ptr(4096),
Reasoning: &schemas.ChatReasoning{
Effort: schemas.Ptr("high"),
},
},
}
// OpenAI receives effort directly, max_tokens is cleared
```
```json
// Bifrost Request (with max_tokens only)
{
"max_completion_tokens": 4096,
"reasoning": {
"max_tokens": 3000
}
}
// Estimation: ratio = 3000/4096 ≈ 0.73 → "high"
// OpenAI Request Sent
{
"reasoning": {
"effort": "high"
}
}
```
```go
// Bifrost request with max_tokens only
chatReq := &schemas.BifrostChatRequest{
Provider: schemas.OpenAI,
Model: "gpt-4o",
Input: messages,
Params: &schemas.ChatParameters{
MaxCompletionTokens: schemas.Ptr(4096),
Reasoning: &schemas.ChatReasoning{
MaxTokens: schemas.Ptr(3000),
},
},
}
// Bifrost estimates effort from max_tokens
// ratio = 3000/4096 ≈ 0.73 → effort = "high"
// OpenAI receives effort, max_tokens cleared
```
**Supported Effort Levels**: `minimal`, `low`, `medium`, `high`
`minimal` is converted to `low` for non-OpenAI providers. When effort is estimated from `max_tokens`, OpenAI receives only `low`, `medium`, or `high`, because the estimator never returns `minimal`.
---
### Anthropic
Anthropic uses a `thinking` parameter with a different structure.
```json
// Bifrost Request
{
"reasoning": {
"effort": "high",
"max_tokens": 4096
}
}
// Anthropic Request
{
"thinking": {
"type": "enabled",
"budget_tokens": 4096
}
}
```
```go
// Using Bifrost Go SDK
chatReq := &schemas.BifrostChatRequest{
Provider: schemas.Anthropic,
Model: "claude-3-5-sonnet-20241022",
Input: messages,
Params: &schemas.ChatParameters{
MaxCompletionTokens: schemas.Ptr(4096),
Reasoning: &schemas.ChatReasoning{
MaxTokens: schemas.Ptr(4096), // Anthropic native field
},
},
}
// Bifrost converts to Anthropic format:
// {
// "thinking": {
// "type": "enabled",
// "budget_tokens": 4096
// }
// }
```
```json
// Anthropic Response (content blocks)
{
"content": [
{
"type": "thinking",
"thinking": "Let me analyze this step by step...",
"signature": "EqoBCkgIAR..."
},
{
"type": "text",
"text": "The answer is 42."
}
]
}
// Bifrost Response
{
"choices": [{
"message": {
"content": "The answer is 42.",
"reasoning": "Let me analyze this step by step...",
"reasoning_details": [{
"index": 0,
"type": "text",
"text": "Let me analyze this step by step...",
"signature": "EqoBCkgIAR..."
}]
}
}]
}
```
```go
// After calling Bifrost Chat Completions with reasoning
resp, err := client.ChatCompletionRequest(schemas.NewBifrostContext(ctx, schemas.NoDeadline), chatReq)
if err != nil {
log.Fatal(err)
}
// Extract reasoning from response
choice := resp.Choices[0]
message := choice.Message
// Access combined reasoning text (printed so the value is actually used)
fmt.Printf("Reasoning: %s\n", message.Reasoning)
// Access detailed reasoning blocks
for i, details := range message.ReasoningDetails {
fmt.Printf("Block %d: %s\n", i, details.Text)
if details.Signature != "" {
fmt.Printf(" Signature: %s\n", details.Signature)
}
}
```
**Conversion Rules**:
| Bifrost | Anthropic | Notes |
|---------|-----------|-------|
| `reasoning.effort` | `thinking.type` | Always mapped to `"enabled"` |
| `reasoning.max_tokens` | `thinking.budget_tokens` | Token budget for reasoning |
**Critical Constraint**: Anthropic requires `reasoning.max_tokens >= 1024`. Requests with lower values will **fail with an error**.
**Dynamic Budget Handling**:
| Input Value | Converted To |
|-------------|--------------|
| `-1` (dynamic) | `1024` (minimum default) |
| `< 1024` | **Error** |
| `>= 1024` | Pass-through |
**Code Reference**: `core/providers/anthropic/chat.go:104-134`
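
The budget table above amounts to a small validation step. The following is a minimal sketch of that step, not the code at the reference above: `normalizeAnthropicBudget` and `errBudgetTooSmall` are hypothetical names introduced for illustration, and the 1024 minimum is taken from this section.

```go
package main

import (
	"errors"
	"fmt"
)

// minAnthropicBudget mirrors the 1024-token minimum documented above.
const minAnthropicBudget = 1024

// errBudgetTooSmall is a hypothetical error used only in this sketch.
var errBudgetTooSmall = errors.New("reasoning.max_tokens must be >= 1024")

// normalizeAnthropicBudget (hypothetical helper) applies the table:
// -1 becomes the minimum, other values below the minimum are rejected,
// and everything else passes through unchanged.
func normalizeAnthropicBudget(maxTokens int) (int, error) {
	switch {
	case maxTokens == -1:
		return minAnthropicBudget, nil
	case maxTokens < minAnthropicBudget:
		return 0, errBudgetTooSmall
	default:
		return maxTokens, nil
	}
}

func main() {
	for _, v := range []int{-1, 500, 4096} {
		budget, err := normalizeAnthropicBudget(v)
		fmt.Println(v, budget, err)
	}
}
```

Running the same check on the client side before sending a request is one way to surface the error early.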
---
### Bedrock (Anthropic Models)
For Claude models, Bedrock takes the same `type` and `budget_tokens` fields as Anthropic, nested under `additionalModelRequestFields.reasoning_config`.
```json
// Bifrost Request
{
"reasoning": {
"effort": "high",
"max_tokens": 4096
}
}
// Bedrock Request (for Anthropic/Claude models)
{
"additionalModelRequestFields": {
"reasoning_config": {
"type": "enabled",
"budget_tokens": 4096
}
}
}
```
```go
// Using Bifrost Go SDK with Bedrock provider
chatReq := &schemas.BifrostChatRequest{
Provider: schemas.Bedrock,
Model: "us.anthropic.claude-3-5-sonnet-20241022-v2:0",
Input: messages,
Params: &schemas.ChatParameters{
MaxCompletionTokens: schemas.Ptr(4096),
Reasoning: &schemas.ChatReasoning{
MaxTokens: schemas.Ptr(4096), // Bedrock Anthropic native field
},
},
}
// Bifrost converts to Bedrock format with reasoning_config
```
The same 1024 minimum token budget constraint applies to Bedrock Anthropic models. Attempts to set `max_tokens` below 1024 will result in an error.
**Code Reference**: `core/providers/bedrock/utils.go:34-47`
---
### Bedrock (Nova Models)
Bedrock Nova models use an effort-based approach similar to OpenAI.
```json
// Bifrost Request
{
"reasoning": {
"effort": "high",
"max_tokens": 4096
}
}
// Bedrock Request (for Nova models)
{
"additionalModelRequestFields": {
"reasoningConfig": {
"type": "enabled",
"maxReasoningEffort": "high"
}
}
}
```
```go
// Using Bifrost Go SDK with Bedrock Nova
chatReq := &schemas.BifrostChatRequest{
Provider: schemas.Bedrock,
Model: "us.amazon.nova-pro-v1:0",
Input: messages,
Params: &schemas.ChatParameters{
MaxCompletionTokens: schemas.Ptr(4096),
Reasoning: &schemas.ChatReasoning{
Effort: schemas.Ptr("high"), // Nova native field
},
},
}
// Bifrost converts to Bedrock Nova format:
// reasoningConfig: {
// type: "enabled",
// maxReasoningEffort: "high"
// }
```
| Bifrost Effort | Nova Effort | Configuration |
|---|---|---|
| `minimal`, `low` | `"low"` | Normal parameters allowed |
| `medium` | `"medium"` | Normal parameters allowed |
| `high` | `"high"` | Clears `maxTokens`, `temperature`, `topP` |
**Key Differences from Anthropic**:
- No minimum token budget constraint
- Uses effort levels instead of token budgets
- High effort mode automatically clears conflicting parameters
**Code Reference**: `core/providers/bedrock/utils.go:48-89`
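
The Nova effort table can be read as a mapping plus one side effect for `high`. The sketch below illustrates that reading under stated assumptions: `novaRequest` and `applyNovaEffort` are hypothetical names, and the struct only models the fields mentioned in the table, not Bedrock's actual request type.

```go
package main

import "fmt"

// novaRequest is a hypothetical stand-in for the fields discussed above.
type novaRequest struct {
	MaxTokens   *int
	Temperature *float64
	TopP        *float64
	Effort      string // value sent as maxReasoningEffort
}

// applyNovaEffort maps a Bifrost effort to a Nova effort and, for "high",
// clears the parameters the table lists as conflicting.
// Unknown effort values leave the request untouched in this sketch.
func applyNovaEffort(req *novaRequest, effort string) {
	switch effort {
	case "minimal", "low":
		req.Effort = "low"
	case "medium":
		req.Effort = "medium"
	case "high":
		req.Effort = "high"
		req.MaxTokens, req.Temperature, req.TopP = nil, nil, nil
	}
}

func main() {
	maxTokens, temperature := 4096, 0.7
	req := &novaRequest{MaxTokens: &maxTokens, Temperature: &temperature}
	applyNovaEffort(req, "high")
	fmt.Println(req.Effort, req.MaxTokens == nil, req.Temperature == nil)
}
```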
---
### Gemini
Gemini uses `thinking_config` with dual support for both token budgets and effort levels, depending on the model version.
#### Model Version Support
| Gemini Version | `thinkingBudget` | `thinkingLevel` | Notes |
|----------------|------------------|-----------------|-------|
| **2.5+** | ✅ | ❌ | Budget-only models |
| **3.0+** | ✅ | ✅ | Support both budget and level |
**Important**: Only ONE parameter (`thinkingBudget` or `thinkingLevel`) should be sent to Gemini at a time. When both `reasoning.max_tokens` and `reasoning.effort` are provided in a Bifrost request, `max_tokens` takes priority and is converted to `thinkingBudget`.
#### Priority Rules
When both `reasoning.max_tokens` and `reasoning.effort` are present:
```
1. If max_tokens is provided → USE thinkingBudget (ignores effort)
2. Else if effort is provided:
- Gemini 3.0+ → USE thinkingLevel (more native)
- Gemini 2.5 → CONVERT effort to thinkingBudget
3. Else → disable reasoning
```
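
The priority rules above can be sketched as a single decision function. This is an illustrative sketch, not Bifrost's implementation: `thinkingConfig` and `resolveGeminiThinking` are hypothetical names, the `supportsLevel` flag stands in for version detection, and the estimator is passed in as a callback. Pro-model level conversions are not applied here.

```go
package main

import "fmt"

// thinkingConfig is a hypothetical, simplified view of Gemini's thinking_config.
// At most one of the two fields is set.
type thinkingConfig struct {
	Budget *int    // sent as thinking_budget
	Level  *string // sent as thinking_level
}

// resolveGeminiThinking follows the three priority rules:
// max_tokens wins, then effort (native level on 3.0+, estimated budget on 2.5),
// otherwise reasoning is disabled (nil).
func resolveGeminiThinking(maxTokens *int, effort *string, supportsLevel bool, estimate func(effort string) int) *thinkingConfig {
	switch {
	case maxTokens != nil:
		return &thinkingConfig{Budget: maxTokens}
	case effort != nil && supportsLevel:
		return &thinkingConfig{Level: effort}
	case effort != nil:
		budget := estimate(*effort)
		return &thinkingConfig{Budget: &budget}
	default:
		return nil
	}
}

func main() {
	// Placeholder estimator; a real one would apply the effort ratios.
	estimate := func(string) int { return 3482 }
	effort, budget := "high", 4096

	both := resolveGeminiThinking(&budget, &effort, true, estimate)
	fmt.Println(*both.Budget, both.Level == nil) // budget wins when both are set

	native := resolveGeminiThinking(nil, &effort, true, estimate)
	fmt.Println(*native.Level) // effort is sent as a native level
}
```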
```json
// Bifrost Request - Both fields provided
{
"model": "gemini-3.0-flash",
"reasoning": {
"effort": "high", // Ignored
"max_tokens": 4096 // Takes priority
}
}
// Gemini 3.0+ Request - Only budget sent
{
"generation_config": {
"thinking_config": {
"include_thoughts": true,
"thinking_budget": 4096
}
}
}
```
```json
// Bifrost Request - Effort only
{
"model": "gemini-3.0-flash",
"reasoning": {
"effort": "high"
}
}
// Gemini 3.0+ Request - Converted to level
{
"generation_config": {
"thinking_config": {
"include_thoughts": true,
"thinking_level": "high"
}
}
}
```
```json
// Bifrost Request - Effort only
{
"model": "gemini-2.5-flash",
"max_completion_tokens": 4096,
"reasoning": {
"effort": "high"
}
}
// Gemini 2.5 Request - Converted to budget
// Calculation: 1024 + (0.80 × (4096 - 1024)) = 3482
{
"generation_config": {
"thinking_config": {
"include_thoughts": true,
"thinking_budget": 3482
}
}
}
```
#### Model-Specific Level Conversions
Gemini Pro models have stricter constraints on thinking levels:
| Bifrost Effort | Non-Pro Models | Pro Models | Notes |
|----------------|----------------|------------|-------|
| `"none"` | Empty string | Empty string | Disables thinking |
| `"minimal"` | `"minimal"` | `"low"` | Pro doesn't support minimal |
| `"low"` | `"low"` | `"low"` | Supported on all |
| `"medium"` | `"medium"` | `"high"` | Pro doesn't support medium |
| `"high"` | `"high"` | `"high"` | Supported on all |
**Example**:
```
// For "gemini-3.0-flash-thinking-exp" (non-Pro)
effort: "medium" → thinkingLevel: "medium"
// For "gemini-3.0-pro" (Pro model)
effort: "medium" → thinkingLevel: "high" // Converted up
```
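
The level conversion table can be expressed as a small lookup. The sketch below is an assumption-laden illustration: `thinkingLevelFor` and `isProModel` are hypothetical helpers, and detecting Pro models by a `-pro` substring is a guess made for this example, since the detection rule is not stated here.

```go
package main

import (
	"fmt"
	"strings"
)

// isProModel is a hypothetical heuristic: it treats any model name
// containing "-pro" as a Pro model.
func isProModel(model string) bool {
	return strings.Contains(strings.ToLower(model), "-pro")
}

// thinkingLevelFor maps a Bifrost effort to a Gemini thinking level
// following the table above. "low" and "high" pass through; unknown
// values are also returned unchanged in this sketch.
func thinkingLevelFor(effort string, isPro bool) string {
	switch effort {
	case "none":
		return "" // empty string disables thinking
	case "minimal":
		if isPro {
			return "low"
		}
		return "minimal"
	case "medium":
		if isPro {
			return "high"
		}
		return "medium"
	default:
		return effort
	}
}

func main() {
	for _, model := range []string{"gemini-3.0-flash", "gemini-3.0-pro"} {
		fmt.Printf("%s: medium -> %q\n", model, thinkingLevelFor("medium", isProModel(model)))
	}
}
```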
#### Special Values
| Value | Field | Behavior | Use Case |
|-------|-------|----------|----------|
| `0` | `max_tokens` | `thinking_budget: 0`, `include_thoughts: false` | Explicitly disable reasoning |
| `-1` | `max_tokens` | `thinking_budget: -1` | **Dynamic budget** (Gemini decides) |
| `"none"` | `effort` | `thinking_budget: 0`, `include_thoughts: false` | Disable reasoning |
```json
// Bifrost Request - Dynamic budget
{
"reasoning": {
"max_tokens": -1
}
}
// Gemini Request - Sent as-is
{
"generation_config": {
"thinking_config": {
"include_thoughts": true,
"thinking_budget": -1
}
}
}
```
```json
// Bifrost Request - Method 1
{
"reasoning": {
"max_tokens": 0
}
}
// Bifrost Request - Method 2
{
"reasoning": {
"effort": "none"
}
}
// Gemini Request - Both become
{
"generation_config": {
"thinking_config": {
"include_thoughts": false,
"thinking_budget": 0
}
}
}
```
```go
// Using Bifrost Go SDK with Gemini
// Example 1: Dynamic budget
dynamicReq := &schemas.BifrostChatRequest{
	Provider: schemas.Gemini,
	Model:    "gemini-2.5-flash",
	Input:    messages,
	Params: &schemas.ChatParameters{
		MaxCompletionTokens: schemas.Ptr(4096),
		Reasoning: &schemas.ChatReasoning{
			MaxTokens: schemas.Ptr(-1), // Let Gemini decide
		},
	},
}
// Example 2: Effort-based for Gemini 3.0+
effortReq := &schemas.BifrostChatRequest{
	Provider: schemas.Gemini,
	Model:    "gemini-3.0-flash",
	Input:    messages,
	Params: &schemas.ChatParameters{
		MaxCompletionTokens: schemas.Ptr(4096),
		Reasoning: &schemas.ChatReasoning{
			Effort: schemas.Ptr("high"), // Converts to thinkingLevel
		},
	},
}
// Example 3: Budget-based (all versions)
budgetReq := &schemas.BifrostChatRequest{
	Provider: schemas.Gemini,
	Model:    "gemini-2.5-flash",
	Input:    messages,
	Params: &schemas.ChatParameters{
		MaxCompletionTokens: schemas.Ptr(4096),
		Reasoning: &schemas.ChatReasoning{
			MaxTokens: schemas.Ptr(3000), // Direct budget
		},
	},
}
```
#### Response Conversion
```json
// Gemini Response
{
"candidates": [{
"content": {
"parts": [
{
"thought": true,
"text": "Analyzing the problem..."
},
{
"text": "The answer is 42."
}
]
}
}]
}
// Bifrost Response
{
"choices": [{
"message": {
"content": "The answer is 42.",
"reasoning": "Analyzing the problem...",
"reasoning_details": [{
"index": 0,
"type": "text",
"text": "Analyzing the problem..."
}]
}
}]
}
```
```go
// After calling Bifrost Chat Completions with Gemini
resp, err := client.ChatCompletionRequest(schemas.NewBifrostContext(ctx, schemas.NoDeadline), chatReq)
if err != nil {
log.Fatal(err)
}
// Extract reasoning from response
choice := resp.Choices[0]
message := choice.Message
// Access combined reasoning text
fmt.Printf("Reasoning: %s\n", message.Reasoning)
// Access detailed reasoning blocks
for i, details := range message.ReasoningDetails {
if details.Type == "text" {
fmt.Printf("Thinking block %d:\n%s\n", i, details.Text)
}
}
// Access final answer
fmt.Printf("Answer:\n%s\n", message.Content)
```
#### Conversion Summary
**Bifrost → Gemini (Request)**:
| Input | Gemini 2.5 | Gemini 3.0+ | Note |
|-------|------------|-------------|------|
| `max_tokens: 4096` | `thinking_budget: 4096` | `thinking_budget: 4096` | Direct pass-through |
| `max_tokens: -1` | `thinking_budget: -1` | `thinking_budget: -1` | Dynamic budget |
| `max_tokens: 0` | `thinking_budget: 0` | `thinking_budget: 0` | Disabled |
| `effort: "high"` only | `thinking_budget: 3482`* | `thinking_level: "high"` | Estimated or native |
| `effort: "medium"` only | `thinking_budget: 2330`* | `thinking_level: "medium"` or `"high"`** | Estimated or native |
| Both `effort` + `max_tokens` | Uses `max_tokens` | Uses `max_tokens` | Priority rule |
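\* Assumes `max_completion_tokens: 4096` and the 1024 minimum, as in the Gemini 2.5 example above. With Gemini's 8192 default, the same formula gives 6758 (`high`) and 4070 (`medium`).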
\*\* Pro models convert `"medium"` to `"high"`
**Gemini → Bifrost (Response)**:
| Gemini Field | Bifrost Field | Conversion |
|--------------|---------------|------------|
| `thinking_budget` | `reasoning.max_tokens` | Direct mapping |
| `thinking_level` | `reasoning.effort` | Level → effort mapping |
| `thought: true` parts | `reasoning_details[]` | Array of reasoning blocks |
**Code References**:
- `core/providers/gemini/utils.go` (Chat Completions)
- `core/providers/gemini/responses.go` (Responses API)
- `core/providers/gemini/types.go` (Constants)
---
## Two Reasoning Methods: Effort vs. Max Tokens
Providers control reasoning in one of two ways: by effort level or by token budget. Bifrost supports both:
### Reasoning Model Types
| Model | Providers | Request Field | Native Format |
|-------|-----------|---------------|---------------|
| **Effort-Based** | OpenAI, AWS Bedrock Nova | `reasoning.effort` | `reasoning_effort` (Chat) / `effort` (Responses) |
| **Max-Tokens-Based** | Anthropic, Cohere, Gemini | `reasoning.max_tokens` | `thinking.budget_tokens` |
**Important**: Both effort and max_tokens can be specified in a single request. Bifrost uses a **priority hierarchy** to determine which field is used.
### Priority Logic: Native vs. Estimated
When both `effort` and `max_tokens` are present in a request, Bifrost prioritizes the **native compatible field** for the target provider:
#### **For Max-Tokens-Based Providers** (Anthropic, Cohere, Gemini)
```
1. If reasoning.max_tokens is provided → USE IT (native field)
2. Else if reasoning.effort is provided → ESTIMATE max_tokens from effort
3. Else → disable reasoning
```
**Example** (Cohere):
```json
// Request with both fields
{
"reasoning": {
"effort": "high",
"max_tokens": 2000
}
}
```
**Result**: Uses `max_tokens: 2000` directly, ignores `effort`
#### **For Effort-Based Providers** (OpenAI, AWS Bedrock Nova)
```
1. If reasoning.effort is provided → USE IT (native field)
2. Else if reasoning.max_tokens is provided → ESTIMATE effort from max_tokens
3. Else → disable reasoning
```
**Example** (OpenAI Chat Completions):
```json
// Request with both fields
{
"reasoning": {
"effort": "high",
"max_tokens": 2000
}
}
```
**Result**: Uses `effort: "high"` directly, strips `max_tokens` from JSON
Native fields take priority for three reasons:
- **Accuracy**: Native fields provide direct control without estimation loss
- **Consistency**: Using native fields preserves the exact user intent
- **Performance**: No conversion is needed when the native field is already provided
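
The two priority lists above differ only in which field counts as native. The following sketch makes that symmetry explicit; the `reasoning` struct, the `decision` type, and `decide` are hypothetical names used for illustration and are not part of the Bifrost schema.

```go
package main

import "fmt"

// reasoning is a hypothetical, minimal view of the request's reasoning object.
type reasoning struct {
	Effort    *string
	MaxTokens *int
}

// decision names the branch taken by the priority logic.
type decision string

const (
	useEffort         decision = "use effort"
	useMaxTokens      decision = "use max_tokens"
	estimateEffort    decision = "estimate effort from max_tokens"
	estimateMaxTokens decision = "estimate max_tokens from effort"
	disabled          decision = "disable reasoning"
)

// decide applies the native-field-first rule. effortNative is true for
// effort-based providers and false for max-tokens-based providers.
func decide(r reasoning, effortNative bool) decision {
	hasEffort, hasBudget := r.Effort != nil, r.MaxTokens != nil
	if effortNative {
		switch {
		case hasEffort:
			return useEffort
		case hasBudget:
			return estimateEffort
		}
		return disabled
	}
	switch {
	case hasBudget:
		return useMaxTokens
	case hasEffort:
		return estimateMaxTokens
	}
	return disabled
}

func main() {
	effort, budget := "high", 2000
	both := reasoning{Effort: &effort, MaxTokens: &budget}
	fmt.Println("effort-based provider:", decide(both, true))
	fmt.Println("max-tokens-based provider:", decide(both, false))
}
```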
---
## Estimator Functions
Bifrost provides two estimator functions to convert between reasoning methods. These are used when the native field is not available.
### Function 1: Effort → Max Tokens
**Function**: `GetBudgetTokensFromReasoningEffort()`
**File**: `core/providers/utils/utils.go:1350-1387`
**Signature**:
```go
func GetBudgetTokensFromReasoningEffort(
effort string, // "minimal", "low", "medium", "high"
minBudgetTokens int, // Provider-specific minimum (e.g., 1024 for Anthropic)
maxTokens int, // Total completion tokens available
) (int, error)
```
**Algorithm**:
```
1. Define ratio for effort level:
- "minimal" → 2.5% (0.025)
- "low" → 15% (0.15)
- "medium" → 42.5% (0.425)
- "high" → 80% (0.80)
2. Calculate budget:
budget = minBudgetTokens + (ratio × (maxTokens - minBudgetTokens))
3. Clamp to valid range:
if budget < minBudgetTokens → budget = minBudgetTokens
if budget > maxTokens → budget = maxTokens
```
**Conversion Examples** (with `minBudgetTokens=1024`, `maxTokens=4096`):
| Effort | Ratio | Calculation | Result |
|--------|-------|-------------|--------|
| `minimal` | 2.5% | 1024 + 0.025 × 3072 | 1101 |
| `low` | 15% | 1024 + 0.15 × 3072 | 1485 |
| `medium` | 42.5% | 1024 + 0.425 × 3072 | 2330 |
| `high` | 80% | 1024 + 0.80 × 3072 | 3482 |
Because the formula starts from `minBudgetTokens`, the result never falls below the minimum (here 1024) when `maxTokens >= minBudgetTokens`; the clamp in step 3 is a safeguard.
**Error Handling**:
```go
if minBudgetTokens > maxTokens {
return 0, fmt.Errorf("max_tokens must be > minBudgetTokens")
}
```
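
To make the algorithm concrete, here is a minimal re-implementation sketch. It is not the code at the file reference above: `budgetFromEffort` is a hypothetical name, the error for unknown effort values is an assumption, and rounding to the nearest token is an assumption chosen because it reproduces the table values; the real function may round differently by a token.

```go
package main

import (
	"fmt"
	"math"
)

// effortRatios holds the ratios listed in the algorithm above.
var effortRatios = map[string]float64{
	"minimal": 0.025,
	"low":     0.15,
	"medium":  0.425,
	"high":    0.80,
}

// budgetFromEffort (hypothetical) estimates a token budget from an effort level.
func budgetFromEffort(effort string, minBudget, maxTokens int) (int, error) {
	if minBudget > maxTokens {
		return 0, fmt.Errorf("max_tokens (%d) is below the minimum budget (%d)", maxTokens, minBudget)
	}
	ratio, ok := effortRatios[effort]
	if !ok {
		// Assumption: unknown effort values are rejected in this sketch.
		return 0, fmt.Errorf("unknown effort %q", effort)
	}
	// Assumption: round the scaled range to the nearest whole token.
	budget := minBudget + int(math.Round(ratio*float64(maxTokens-minBudget)))
	// Clamp to the valid range as a safeguard.
	if budget < minBudget {
		budget = minBudget
	}
	if budget > maxTokens {
		budget = maxTokens
	}
	return budget, nil
}

func main() {
	for _, effort := range []string{"minimal", "low", "medium", "high"} {
		budget, err := budgetFromEffort(effort, 1024, 4096)
		fmt.Println(effort, budget, err)
	}
}
```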
**Code Example**:
```go
// Cohere: Convert effort to token budget
budgetTokens, err := providerUtils.GetBudgetTokensFromReasoningEffort(
"high", // effort
1, // Cohere min
4096, // max completion tokens
)
// Returns: 3277 tokens
```
### Function 2: Max Tokens → Effort
**Function**: `GetReasoningEffortFromBudgetTokens()`
**File**: `core/providers/utils/utils.go:1308-1345`
**Signature**:
```go
func GetReasoningEffortFromBudgetTokens(
budgetTokens int, // Reasoning token budget
minBudgetTokens int, // Provider-specific minimum
maxTokens int, // Total completion tokens available
) string // Returns: "none", "low", "medium", or "high"
```
**Algorithm**:
```
1. Normalize budget to valid range:
if budget < min → budget = min
if budget > max → budget = max
2. Calculate ratio:
ratio = (budgetTokens - minBudgetTokens) / (maxTokens - minBudgetTokens)
3. Map ratio to effort level:
if ratio ≤ 0.25 → "low"
if ratio ≤ 0.60 → "medium"
if ratio > 0.60 → "high"
```
**Conversion Examples** (with `minBudgetTokens=1024`, `maxTokens=4096`):
| Budget Tokens | Ratio | Effort |
|---|---|---|
| 1024 | 0% | `low` |
| 1101 | 2.5% | `low` |
| 1500 | 15.5% | `low` |
| 1900 | 28.5% | `medium` |
| 2500 | 48.0% | `medium` |
| 3000 | 64.3% | `high` |
| 3400 | 77.3% | `high` |
**Defensive Defaults**:
```go
if budgetTokens <= 0 {
return "none"
}
if maxTokens <= 0 {
return "medium" // Safe default
}
if maxTokens <= minBudgetTokens {
return "high" // Can't calculate ratio
}
```
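
Combining the defensive defaults with the ratio thresholds gives the following sketch. `effortFromBudget` is a hypothetical name; the ordering of the checks (defaults first, then clamping, then thresholds) is an assumption based on the two snippets above rather than a copy of the referenced source.

```go
package main

import "fmt"

// effortFromBudget (hypothetical) estimates an effort level from a token budget.
func effortFromBudget(budget, minBudget, maxTokens int) string {
	// Defensive defaults, as listed above.
	if budget <= 0 {
		return "none"
	}
	if maxTokens <= 0 {
		return "medium"
	}
	if maxTokens <= minBudget {
		return "high"
	}
	// Normalize the budget into [minBudget, maxTokens].
	if budget < minBudget {
		budget = minBudget
	}
	if budget > maxTokens {
		budget = maxTokens
	}
	ratio := float64(budget-minBudget) / float64(maxTokens-minBudget)
	switch {
	case ratio <= 0.25:
		return "low"
	case ratio <= 0.60:
		return "medium"
	default:
		return "high"
	}
}

func main() {
	for _, budget := range []int{1024, 1900, 3000} {
		fmt.Println(budget, effortFromBudget(budget, 1024, 4096))
	}
}
```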
**Code Example**:
```go
// Convert Anthropic budget back to effort for display
effort := providerUtils.GetReasoningEffortFromBudgetTokens(
3000, // budget tokens from Anthropic response
1024, // Anthropic minimum
4096, // max tokens
)
// Returns: "high"
```
---
## Provider-Specific Constants
Different providers have different constraints on reasoning budget:
### Min Budget Constants
| Provider | File | MinBudgetTokens | Reason |
|----------|------|---|---|
| Anthropic | `core/providers/anthropic/types.go` | **1024** | Anthropic API requirement |
| Bedrock Anthropic | `core/providers/bedrock/types.go` | **1024** | Same as Anthropic |
| Bedrock Nova | `core/providers/bedrock/types.go` | 1 | More flexible |
| Cohere | `core/providers/cohere/types.go` | 1 | Flexible |
| Gemini | `core/providers/gemini/types.go` | 1024 | Default minimum for conversions |
### Default Completion Tokens (for ratio calculation)
When `max_completion_tokens` is not provided, these defaults are used for ratio calculations:
| Provider | Default | File |
|----------|---------|------|
| OpenAI, Anthropic, Cohere, Bedrock | 4096 | `core/providers/*/types.go` |
| Gemini | 8192 | `core/providers/gemini/types.go` |
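
As a rough illustration of how these defaults feed the ratio calculations, the sketch below falls back to a per-provider default when no `max_completion_tokens` is given. The lowercase provider keys and the helper name `completionTokensFor` are assumptions made for this example; only the numeric defaults come from the table.

```go
package main

import "fmt"

// defaultCompletionTokens mirrors the table above; the keys are hypothetical.
var defaultCompletionTokens = map[string]int{
	"openai":    4096,
	"anthropic": 4096,
	"cohere":    4096,
	"bedrock":   4096,
	"gemini":    8192,
}

// completionTokensFor returns the requested value when present, otherwise the
// provider default. The boolean is false when the provider is unknown.
func completionTokensFor(provider string, requested *int) (int, bool) {
	if requested != nil {
		return *requested, true
	}
	tokens, ok := defaultCompletionTokens[provider]
	return tokens, ok
}

func main() {
	requested := 2000
	fmt.Println(completionTokensFor("anthropic", &requested))
	fmt.Println(completionTokensFor("gemini", nil))
}
```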
---
## Effort-to-Token Conversion Examples
### Example 1: Estimate tokens from effort (Anthropic)
**Input**:
```json
{
"model": "anthropic/claude-3-5-sonnet",
"max_completion_tokens": 2000,
"reasoning": {
"effort": "high"
}
}
```
**Conversion Process**:
1. `effort = "high"` → `ratio = 0.80`
2. `minBudgetTokens = 1024` (Anthropic)
3. `maxCompletionTokens = 2000`
4. `budget = 1024 + (0.80 × (2000 - 1024))`
5. `budget = 1024 + (0.80 × 976)`
6. `budget = 1024 + 780`
7. **Result: 1804 tokens**
**Anthropic Request Generated**:
```json
{
"thinking": {
"type": "enabled",
"budget_tokens": 1804
}
}
```
```go
import (
"github.com/maximhq/bifrost/core/providers/utils"
"github.com/maximhq/bifrost/core/schemas"
)
// Using Bifrost Go SDK
chatReq := &schemas.BifrostChatRequest{
Provider: schemas.Anthropic,
Model: "claude-3-5-sonnet-20241022",
Input: messages,
Params: &schemas.ChatParameters{
MaxCompletionTokens: schemas.Ptr(2000),
Reasoning: &schemas.ChatReasoning{
Effort: schemas.Ptr("high"), // Effort provided, max_tokens not set
},
},
}
// Bifrost automatically converts effort to budget tokens:
// 1. Get ratio for "high": 0.80
// 2. Calculate: 1024 + (0.80 × (2000 - 1024)) = 1804
// 3. Send to Anthropic with budget_tokens: 1804
// Alternatively, manually call the estimator function:
budgetTokens, _ := utils.GetBudgetTokensFromReasoningEffort(
"high", // effort
1024, // Anthropic minimum
2000, // max completion tokens
)
// Returns: 1804
```
### Example 2: Estimate effort from tokens (Bedrock Nova)
**Input**:
```json
{
"model": "bedrock/us.amazon.nova-pro-v1:0",
"max_completion_tokens": 4096,
"reasoning": {
"max_tokens": 2000
}
}
```
**Conversion Process**:
1. `budgetTokens = 2000`
2. `minBudgetTokens = 1` (Nova)
3. `maxCompletionTokens = 4096`
4. `ratio = (2000 - 1) / (4096 - 1)`
5. `ratio = 1999 / 4095`
6. `ratio = 0.488` (48.8%)
7. Since `0.25 < 0.488 ≤ 0.60` → **Result: "medium"**
**Bedrock Nova Request Generated**:
```json
{
"reasoningConfig": {
"type": "enabled",
"maxReasoningEffort": "medium"
}
}
```
```go
import (
"github.com/maximhq/bifrost/core/providers/utils"
"github.com/maximhq/bifrost/core/schemas"
)
// Using Bifrost Go SDK with max_tokens (not effort)
chatReq := &schemas.BifrostChatRequest{
Provider: schemas.Bedrock,
Model: "us.amazon.nova-pro-v1:0",
Input: messages,
Params: &schemas.ChatParameters{
MaxCompletionTokens: schemas.Ptr(4096),
Reasoning: &schemas.ChatReasoning{
MaxTokens: schemas.Ptr(2000), // Max tokens provided, effort not set
},
},
}
// Bifrost automatically estimates effort from max_tokens:
// 1. Calculate ratio: (2000 - 1) / (4096 - 1) = 0.488
// 2. Since 0.25 < 0.488 ≤ 0.60 → "medium"
// 3. Send to Bedrock Nova with effort: "medium"
// Alternatively, manually call the estimator function:
effort := utils.GetReasoningEffortFromBudgetTokens(
2000, // budget tokens
1, // Nova minimum
4096, // max completion tokens
)
// Returns: "medium"
```
### Example 3: Both fields provided (priority used)
**Input**:
```json
{
"model": "anthropic/claude-3-5-sonnet",
"max_completion_tokens": 4096,
"reasoning": {
"effort": "medium",
"max_tokens": 2500
}
}
```
**Logic for Max-Tokens-Based Provider**:
1. Check: Is `max_tokens` provided? → **YES**
2. Use `max_tokens` directly (ignore `effort`)
3. Validate: `2500 >= 1024`? → **YES**
**Anthropic Request Generated**:
```json
{
"thinking": {
"type": "enabled",
"budget_tokens": 2500
}
}
```
**Note**: The `effort: "medium"` is completely ignored because `max_tokens` takes priority.
```go
import "github.com/maximhq/bifrost/core/schemas"
// Using Bifrost Go SDK with BOTH effort and max_tokens
chatReq := &schemas.BifrostChatRequest{
Provider: schemas.Anthropic,
Model: "claude-3-5-sonnet-20241022",
Input: messages,
Params: &schemas.ChatParameters{
MaxCompletionTokens: schemas.Ptr(4096),
Reasoning: &schemas.ChatReasoning{
Effort: schemas.Ptr("medium"), // Provided but ignored
MaxTokens: schemas.Ptr(2500), // This takes priority
},
},
}
// Bifrost Priority Logic:
// 1. For max-tokens-based providers (Anthropic):
// → Check if max_tokens is provided? YES
// → Use it directly: 2500
// → Ignore effort: "medium"
// → Validate: 2500 >= 1024? YES ✓
// 2. Send to Anthropic with budget_tokens: 2500
// Result: effort is completely ignored, max_tokens is used
```
---
## Response Format
### Bifrost Standard Response
All providers return reasoning in a normalized `reasoning_details` array:
```json
{
"choices": [{
"message": {
"role": "assistant",
"content": "Final response text",
"reasoning_details": [
{
"index": 0,
"type": "text",
"text": "Step-by-step reasoning content...",
"signature": "optional_signature_for_verification"
}
]
}
}]
}
```
### Reasoning Details Fields
| Field | Type | Description | Present In |
|-------|------|-------------|------------|
| `index` | `int` | Position in reasoning sequence | All |
| `type` | `string` | Content type (`text`, `encrypted`, `summary`) | All |
| `text` | `string` | Reasoning content | Chat Completions |
| `summary` | `string` | Reasoning summary | Responses API |
| `signature` | `string` | Cryptographic signature for verification | Anthropic, Bedrock |
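
For clients that consume the HTTP response directly, the fields above can be decoded with the standard `encoding/json` package. The struct and function names below are hypothetical and only cover the fields listed in the table; they are a sketch, not the Bifrost SDK types.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// reasoningDetail mirrors the fields in the table above (hypothetical type).
type reasoningDetail struct {
	Index     int    `json:"index"`
	Type      string `json:"type"`
	Text      string `json:"text,omitempty"`
	Summary   string `json:"summary,omitempty"`
	Signature string `json:"signature,omitempty"`
}

// chatResponse models only the parts of the response used here.
type chatResponse struct {
	Choices []struct {
		Message struct {
			Content          string            `json:"content"`
			ReasoningDetails []reasoningDetail `json:"reasoning_details"`
		} `json:"message"`
	} `json:"choices"`
}

// parseReasoning returns the reasoning details of the first choice, if any.
func parseReasoning(raw []byte) ([]reasoningDetail, error) {
	var resp chatResponse
	if err := json.Unmarshal(raw, &resp); err != nil {
		return nil, err
	}
	if len(resp.Choices) == 0 {
		return nil, nil
	}
	return resp.Choices[0].Message.ReasoningDetails, nil
}

func main() {
	raw := []byte(`{"choices":[{"message":{"role":"assistant","content":"Final response text",
		"reasoning_details":[{"index":0,"type":"text","text":"Step-by-step reasoning content..."}]}}]}`)
	details, err := parseReasoning(raw)
	if err != nil {
		fmt.Println("decode error:", err)
		return
	}
	for _, d := range details {
		fmt.Printf("block %d (%s): %s\n", d.Index, d.Type, d.Text)
	}
}
```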
### Type Mappings
| Reasoning Type | When Used | Source |
|---|---|---|
| `reasoning.text` | Direct thinking/reasoning content | Anthropic, Gemini, Bedrock |
| `reasoning.encrypted` | Signature-verified reasoning | Anthropic, Bedrock Nova |
| `reasoning.summary` | Summarized reasoning (Responses API) | All providers |
**OpenAI Implementation**: OpenAI (both Chat Completions and Responses API) is effort-based, following the standard priority logic: if `effort` is provided, it's used directly; if only `max_tokens` is provided, effort is estimated from it. The `max_tokens` field is then cleared before JSON serialization via `MarshalJSON` (`core/providers/openai/types.go:383-453`), since OpenAI's APIs don't accept it.
---
## Streaming
### Stream Event Types
| Provider | Reasoning Event | Signature Event |
|----------|-----------------|-----------------|
| OpenAI | `reasoning` (top-level) | N/A |
| Anthropic | `thinking_delta` | `signature_delta` |
| Bedrock | `thinking_delta` | `signature_delta` |
| Gemini | `thought` (in content) | `thought_signature` |
### Anthropic Streaming Example
```
// Stream events
event: content_block_start
data: {"type": "content_block_start", "content_block": {"type": "thinking"}}
event: content_block_delta
data: {"type": "content_block_delta", "delta": {"type": "thinking_delta", "thinking": "Let me"}}
event: content_block_delta
data: {"type": "content_block_delta", "delta": {"type": "thinking_delta", "thinking": " analyze..."}}
event: content_block_delta
data: {"type": "content_block_delta", "delta": {"type": "signature_delta", "signature": "EqoB..."}}
event: content_block_stop
data: {"type": "content_block_stop"}
```
### Bifrost Stream Response
```json
// Thinking delta
{
"choices": [{
"delta": {
"reasoning_details": [{
"index": 0,
"type": "text",
"text": "Let me analyze..."
}]
}
}]
}
// Signature delta
{
"choices": [{
"delta": {
"reasoning_details": [{
"index": 0,
"signature": "EqoB..."
}]
}
}]
}
```
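
A consumer of the stream usually needs to stitch these deltas back into complete reasoning blocks. The sketch below shows one way to do that, keyed by `index`. The types `detailDelta` and `reasoningBlock` and the function `merge` are hypothetical, and appending signature fragments (rather than replacing them) is an assumption of this example.

```go
package main

import "fmt"

// detailDelta is a hypothetical view of one streamed reasoning_details entry.
type detailDelta struct {
	Index     int
	Text      string
	Signature string
}

// reasoningBlock accumulates the text and signature for one index.
type reasoningBlock struct {
	Text      string
	Signature string
}

// merge folds one streamed delta into the block at its index,
// creating the block on first use.
func merge(blocks map[int]*reasoningBlock, d detailDelta) {
	block, ok := blocks[d.Index]
	if !ok {
		block = &reasoningBlock{}
		blocks[d.Index] = block
	}
	block.Text += d.Text
	block.Signature += d.Signature // assumption: fragments are concatenated
}

func main() {
	blocks := map[int]*reasoningBlock{}
	stream := []detailDelta{
		{Index: 0, Text: "Let me"},
		{Index: 0, Text: " analyze..."},
		{Index: 0, Signature: "EqoB..."},
	}
	for _, delta := range stream {
		merge(blocks, delta)
	}
	fmt.Printf("%q %q\n", blocks[0].Text, blocks[0].Signature)
}
```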
---
## Caveats Summary
| Severity | Behavior | Impact | Workaround / Note |
|----------|----------|--------|-------------------|
| High | `reasoning.max_tokens` must be >= 1024 | Requests with lower values fail with an error | Always set `max_tokens` >= 1024 for Anthropic/Bedrock |
| Medium | `reasoning.max_tokens = -1` is converted to `1024` | Dynamic budgeting is not available on Anthropic/Bedrock | Set an explicit token budget |
| Low | OpenAI's `minimal` is converted to `low` when routing to other providers | Slightly different reasoning behavior | N/A |
| Low | `signature` field is only present in Anthropic/Bedrock responses | Signature-based verification is only available for these providers | N/A |
| Low | Anthropic's `thinking.type` is always set to `"enabled"` regardless of effort | Thinking cannot be disabled once the reasoning parameter is present | N/A |
| Medium | When both `effort` and `max_tokens` are provided, only `thinkingBudget` is sent to Gemini (effort is dropped) | Effort value is completely ignored when `max_tokens` is present | Provide only the parameter you want to use |
| Medium | Gemini 2.5 only supports `thinkingBudget`, while 3.0+ supports both `thinkingBudget` and `thinkingLevel` | Effort-only requests on 2.5 are converted to a budget; on 3.0+ they use native levels | Bifrost automatically detects the version and uses the appropriate conversion |
| Low | Gemini Pro models only support `"low"` and `"high"` thinking levels | `"minimal"` → `"low"`, `"medium"` → `"high"` for Pro models | Non-Pro models support all four levels: `minimal`, `low`, `medium`, `high` |
---
## Complete Provider Comparison
### Reasoning Model
| Provider | Model Type | Budget Type | Min Budget | Signature Support |
|----------|-----------|-------------|------------|------------------|
| OpenAI | Effort-based | Effort-based | None | ❌ |
| Anthropic | Thinking blocks | Token budget | **1024** | ✅ |
| Bedrock (Anthropic) | Reasoning config | Token budget | **1024** | ✅ |
| Bedrock (Nova) | Reasoning config | Effort-based | None | ❌ |
| Gemini 2.5+ | Thinking config | Token budget | 1024 | ✅ |
| Gemini 3.0+ | Thinking config | Dual (budget + level) | 1024 | ✅ |
### Parameter Support
| Provider | `effort` | `max_tokens` | `summary` | Streaming |
|----------|----------|------------|----------|-----------|
| OpenAI | ✅ (4 levels) | ✅ | ❌ | ✅ |
| Anthropic | ❌ (binary) | ✅ | ✅ | ✅ |
| Bedrock (Anthropic) | ❌ (binary) | ✅ | ✅ | ✅ |
| Bedrock (Nova) | ✅ (3 levels) | ⚠️ (ignored) | ❌ | ✅ |
| Gemini 2.5+ | ⚠️ (converts to budget) | ✅ | ❌ | ✅ |
| Gemini 3.0+ | ✅ (4 levels) | ✅ | ❌ | ✅ |
---
## Troubleshooting
### Anthropic: "reasoning.max_tokens must be >= 1024"
**Cause**: Attempting to use reasoning with `max_tokens < 1024`
**Solution**: Ensure `reasoning.max_tokens >= 1024` for Anthropic/Bedrock Anthropic models
```json
// ❌ Invalid
{"reasoning": {"effort": "high", "max_tokens": 500}}
// ✅ Valid
{"reasoning": {"effort": "high", "max_tokens": 1024}}
```
### OpenAI: Model doesn't support reasoning
**Cause**: Using an older model that doesn't support reasoning (e.g., `gpt-4-turbo`)
**Solution**: Use a model with native reasoning support, such as the o1 series. Non-reasoning models such as `gpt-4o` and `gpt-4o-mini` do not accept reasoning parameters.
### Bedrock Nova: `max_tokens` parameter being ignored
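**Expected Behavior**: Bedrock Nova uses effort-based reasoning only. `max_tokens` is never sent as a token budget; when `effort` is absent, it is only used to estimate an effort level.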
**Solution**: Provide `effort` parameter instead of `max_tokens` for Nova models
```json
// ✅ Correct for Nova
{"reasoning": {"effort": "high"}}
```
---