# AGENTS.md — Bifrost AI Gateway > Context for AI agents (Claude Code, Copilot, Cursor, etc.) working on this codebase. Read this fully before making changes. ## What is Bifrost? Bifrost is a high-performance AI gateway that unifies 20+ LLM providers behind a single OpenAI-compatible API with ~11µs overhead at 5,000 RPS. It also serves as an MCP (Model Context Protocol) gateway, turning static chat models into tool-calling agents. GitHub: `maximhq/bifrost` --- ## Repository Layout ``` bifrost/ ├── core/ # Go core library — the engine │ ├── bifrost.go # Main struct, request queuing, provider lifecycle (~3.4K lines) │ ├── inference.go # Inference routing, fallbacks, streaming dispatch (~1.9K lines) │ ├── mcp.go # MCP integration entry point │ ├── schemas/ # ALL shared Go types — 41 files │ │ ├── bifrost.go # BifrostConfig, ModelProvider enum, RequestType enum, context keys │ │ ├── provider.go # Provider interface (30+ methods), NetworkConfig, ProviderConfig │ │ ├── plugin.go # LLMPlugin, MCPPlugin, HTTPTransportPlugin, ObservabilityPlugin │ │ ├── context.go # BifrostContext (custom context.Context with mutable values) │ │ ├── chatcompletions.go # Chat completion request/response types │ │ ├── responses.go # OpenAI Responses API types │ │ ├── embedding.go # Embedding types │ │ ├── images.go # Image generation types │ │ ├── batch.go # Batch operation types │ │ ├── files.go # File management types │ │ ├── mcp.go # MCP types │ │ ├── trace.go # Tracer interface │ │ └── logger.go # Logger interface │ ├── providers/ # 20+ provider implementations │ │ ├── openai/ # Reference implementation (largest, most complete) │ │ ├── anthropic/ # Non-OpenAI-compatible example │ │ ├── bedrock/ # AWS event-stream protocol │ │ ├── gemini/ # Google-specific API shape │ │ ├── groq/ # OpenAI-compatible (minimal, delegates to openai/) │ │ └── utils/ # Shared: HTTP client, SSE parsing, error handling, scanner pool │ ├── pool/ # Generic Pool[T] — dual-mode (prod: sync.Pool, debug: full tracking) │ │ ├── pool_prod.go # Zero-overhead sync.Pool wrapper (default build) │ │ └── pool_debug.go # Double-release/use-after-release/leak detection (-tags pooldebug) │ ├── mcp/ # MCP protocol implementation │ │ ├── agent.go # Agent orchestration loop (multi-turn tool calling) │ │ ├── clientmanager.go # MCP client lifecycle management │ │ ├── toolmanager.go # Tool registration, discovery, filtering │ │ ├── healthmonitor.go # Client health monitoring │ │ └── codemode/starlark/ # Starlark sandbox for code-mode execution │ └── internal/ │ ├── llmtests/ # LLM integration test infra (48 files, scenario-based) │ └── mcptests/ # MCP/Agent test infra (40+ files, mock-based) │ ├── framework/ # Data persistence, streaming, ecosystem utilities │ ├── configstore/ # Config storage backends (file, postgres) │ ├── logstore/ # Log storage backends (file, postgres) │ ├── vectorstore/ # Vector storage (Weaviate, Qdrant, Redis, Pinecone) │ ├── streaming/ # Streaming accumulator, delta copying, response marshaling │ │ ├── accumulator.go # Chunk accumulation into full response (~24KB) │ │ ├── chat.go # Chat stream handling (~17KB) │ │ └── responses.go # Response stream marshaling (~35KB) │ ├── modelcatalog/ # Model metadata registry │ ├── tracing/ # Distributed tracing helpers │ └── encrypt/ # Encryption utilities │ ├── transports/ │ ├── config.schema.json # JSON Schema — THE source of truth for config.json (~2700 lines) │ └── bifrost-http/ # HTTP gateway transport │ ├── server/ # Server lifecycle, route registration │ ├── handlers/ # 27 HTTP endpoint handlers │ │ ├── inference.go # Chat/text completions, responses API (~109KB) │ │ ├── mcpinference.go # MCP tool execution │ │ ├── governance.go # Virtual keys, teams, customers, budgets (~100KB) │ │ ├── providers.go # Provider CRUD, key management │ │ ├── mcp.go # MCP client registry management │ │ ├── logging.go # Log queries, stats, histograms │ │ ├── config.go # System configuration │ │ ├── plugins.go # Plugin CRUD │ │ ├── cache.go # Cache management │ │ ├── session.go # Auth/session management │ │ ├── health.go # Health checks │ │ ├── mcpserver.go # MCP server (SSE/streamable HTTP) │ │ ├── websocket.go # WebSocket handler │ │ ├── devpprof.go # Pool debug profiler endpoint (~23KB) │ │ └── middlewares.go # Middleware definitions │ ├── lib/ # ChainMiddlewares, config, context conversion │ └── integrations/ # SDK compatibility layers │ ├── openai.go # OpenAI SDK drop-in compatibility │ ├── anthropic.go # Anthropic SDK compatibility │ ├── bedrock.go # AWS Bedrock SDK compatibility │ ├── genai.go # Google GenAI SDK compatibility │ ├── langchain.go # LangChain compatibility │ ├── litellm.go # LiteLLM compatibility │ └── pydanticai.go # PydanticAI compatibility │ ├── plugins/ # Go plugins — each has own go.mod │ ├── governance/ # Budget, rate limiting, virtual keys, routing, RBAC │ ├── telemetry/ # Prometheus metrics, push gateway │ ├── logging/ # Request/response audit logging │ ├── semanticcache/ # Semantic response caching via vector store │ ├── otel/ # OpenTelemetry tracing │ ├── mocker/ # Mock responses for testing │ ├── jsonparser/ # JSON extraction utilities │ ├── maxim/ # Maxim observability │ └── compat/ # LiteLLM SDK compatibility (HTTP transport) │ ├── ui/ # React + vite web interface │ ├── app/workspace/ # Feature pages (20+ workspace sections) │ ├── components/ # Shared React components │ └── lib/ # Constants, utilities, types │ ├── tests/e2e/ # Playwright E2E tests │ ├── core/ # Fixtures, page objects, helpers, API actions │ └── features/ # Per-feature test suites │ ├── docs/ # Mintlify MDX documentation │ ├── docs.json # Navigation config │ ├── media/ # Screenshots (ui-*.png naming convention) │ └── (architecture|features|providers|mcp|plugins|enterprise|...) │ ├── .claude/skills/ # Claude Code skill definitions (4 skills) ├── go.work # Go workspace — requires Go 1.26.1 ├── Makefile # Build, test, dev commands (1300+ lines) └── terraform/ # Infrastructure as Code ``` --- ## Go Workspace Bifrost is a **multi-module Go workspace**. Each module has its own `go.mod`: ``` go.work ├── core/go.mod # github.com/maximhq/bifrost/core ├── framework/go.mod # github.com/maximhq/bifrost/framework ├── transports/go.mod # github.com/maximhq/bifrost/transports └── plugins/*/go.mod # 9 plugin modules (governance, telemetry, logging, etc.) ``` **Rules:** - Run `go mod tidy` in the **specific module directory**, not the root - Cross-module imports resolve via workspace locally, but need explicit `require` in `go.mod` for releases - The workspace requires **Go 1.26.1** (`go.work` directive) --- ## Build, Test & Dev Commands ```bash # Development make dev # Full local dev (UI + API with hot reload via air) make build # Build bifrost-http binary # Core tests (provider integration tests — hit live APIs) make test-core # All providers make test-core PROVIDER=openai # Specific provider make test-core PROVIDER=openai TESTCASE=TestSimpleChat # Specific test make test-core PATTERN=TestStreaming # Tests matching pattern make test-core DEBUG=1 # With Delve debugger on :2345 # MCP/Agent tests (mock-based, no live APIs) make test-mcp # All MCP tests make test-mcp TESTCASE=TestAgentLoop # Specific test make test-mcp TYPE=agent # By category (agent|tool|connection|codemode) # Plugin tests make test-plugins # All plugins make test-governance # Governance plugin specifically # Integration tests (SDK compatibility) make test-integrations-py # Python SDK tests make test-integrations-ts # TypeScript SDK tests # E2E tests (Playwright, requires running dev server) make run-e2e # All E2E tests make run-e2e FLOW=providers # Specific feature # Code quality make lint # Linting make fmt # Format code ``` --- ## Architecture ### Request Flow ``` Client HTTP Request → FastHTTP Transport (parsing, validation ~2µs) → SDK Integration Layer (OpenAI/Anthropic/Bedrock format → Bifrost format) → Middleware Chain (lib.ChainMiddlewares, applied per-route) → HTTPTransportPreHook (HTTP-level plugins, can short-circuit) → PreLLMHook Pipeline (auth, rate-limit, cache check — registration order) → MCP Tool Discovery & Injection (if tool_choice present) → Provider Queue (channel-based, per-provider isolation) → Worker picks up request → Key Selection (~10ns weighted random) → Provider API Call (fasthttp client, connection pooling) → Response / SSE Stream → PostLLMHook Pipeline (reverse order of PreLLMHooks) → Tool Execution Loop (if tool_calls in response, MCP agent loop) → HTTPTransportPostHook (reverse order) → Response Serialization → HTTP Response to Client ``` ### Design Principles - **Provider isolation**: Each provider has its own worker pool and queue. One provider going down doesn't cascade to others. - **Channel-based async**: Request routing uses Go channels (`chan *ChannelMessage`), not mutexes. The `ProviderQueue` struct manages channel lifecycle with atomic flags. - **Object pooling everywhere**: `sync.Pool` wrappers reduce GC pressure. Pools exist for: channel messages, response channels, error channels, stream channels, plugin pipelines, MCP requests, HTTP request/response objects, scanner buffers. - **Plugin pipeline symmetry**: Pre-hooks execute in registration order, post-hooks in **reverse** order (LIFO). For every pre-hook executed, the corresponding post-hook is guaranteed to run. - **Streaming**: SSE chunks flow through `chan chan *schemas.BifrostStreamChunk`. Accumulated into full response for post-hooks via `framework/streaming/accumulator.go`. ### BifrostContext — Custom Context `BifrostContext` (`core/schemas/context.go`) is a custom `context.Context` with **thread-safe mutable values**. Unlike standard Go contexts, values can be set after creation: ```go ctx := schemas.NewBifrostContext(parent, deadline) ctx.SetValue(key, value) // Thread-safe, uses RWMutex ctx.WithValue(key, value) // Chainable variant ``` **Reserved context keys** (set by Bifrost internals — DO NOT set manually): - `BifrostContextKeySelectedKeyID/Name` — Set by governance plugin - `BifrostContextKeyGovernance*` — Set by governance plugin - `BifrostContextKeyNumberOfRetries`, `BifrostContextKeyFallbackIndex` — Set by retry/fallback logic - `BifrostContextKeyStreamEndIndicator` — Set by streaming infrastructure - `BifrostContextKeyTrace*`, `BifrostContextKeySpan*` — Set by tracing middleware **User-settable keys** (plugins and handlers can set these): - `BifrostContextKeyVirtualKey` (`x-bf-vk`) — Virtual key for governance - `BifrostContextKeyAPIKeyName` (`x-bf-api-key`) — Explicit key selection by name - `BifrostContextKeyAPIKeyID` (`x-bf-api-key-id`) — Explicit key selection by ID (takes priority over name) - `BifrostContextKeyRequestID` — Request ID - `BifrostContextKeyExtraHeaders` — Extra headers to forward to provider - `BifrostContextKeyURLPath` — Custom URL path for provider - `BifrostContextKeySkipKeySelection` — Skip key selection (pass empty key) - `BifrostContextKeyUseRawRequestBody` — Send raw body directly to provider **Gotcha**: `BlockRestrictedWrites()` silently drops writes to reserved keys. This prevents plugins from accidentally overwriting internal state. --- ## Core Patterns ### Provider Implementation There are **two categories** of providers: **Category 1: Non-OpenAI-compatible** (Anthropic, Bedrock, Gemini, Cohere, HuggingFace, Replicate, ElevenLabs): ``` core/providers// ├── .go # Controller: constructor, interface methods, HTTP orchestration ├── _test.go # Tests ├── types.go # ALL provider-specific structs (PascalCase prefixed with provider name) ├── utils.go # Constants, base URLs, helpers (camelCase for unexported) ├── errors.go # Error parsing: provider HTTP error → *schemas.BifrostError ├── chat.go # Chat request/response converters ├── embedding.go # Embedding converters (if supported) ├── images.go # Image generation (if supported) ├── speech.go # TTS/STT (if supported) └── responses.go # Responses API + streaming converters ``` **Category 2: OpenAI-compatible** (Groq, Cerebras, Ollama, Perplexity, OpenRouter, Parasail, Nebius, xAI, SGL): ``` core/providers// ├── .go # Minimal — constructor + delegates to openai.HandleOpenAI* functions └── _test.go # Tests ``` **Converter function naming convention:** - `ToRequest()` — Bifrost schema → Provider API format - `ToBifrostResponse()` — Provider API format → Bifrost schema - These must be **pure transformation functions** — no HTTP calls, no logging, no side effects **Provider constructor pattern:** ```go func NewProvider(config schemas.ProviderConfig) (*Provider, error) { // Validate config, set up fasthttp.Client with connection pooling client := &fasthttp.Client{ MaxConnsPerHost: config.NetworkConfig.MaxConnsPerHost, // configurable, default 5000 MaxIdleConnDuration: 30 * time.Second, } // After ConfigureProxy/ConfigureDialer/ConfigureTLS, build a sibling client // for streaming. BuildStreamingClient zeros ReadTimeout/WriteTimeout/MaxConnDuration // so streams aren't killed by fasthttp's whole-response deadline; per-chunk idle // is enforced at the app layer via NewIdleTimeoutReader. streamingClient := providerUtils.BuildStreamingClient(client) return &Provider{client: client, streamingClient: streamingClient, ...}, nil } ``` **Streaming vs unary client:** Every provider holds two clients — `client` for unary requests (`ReadTimeout=30s` bounds the whole response) and `streamingClient` for SSE / EventStream / chunked paths (`ReadTimeout=0`; the per-chunk `NewIdleTimeoutReader` is the only governor). Pass `provider.streamingClient` to every `Handle*Streaming` / `Handle*StreamRequest` helper and to direct `Do` calls inside `*Stream` methods. For new providers, apply the same pattern — missing the switch means streams get killed at 30s. **Note:** Bedrock uses `net/http` (not fasthttp) with HTTP/2 support. Its `http.Transport` is configured with `ForceAttemptHTTP2: true` and `MaxConnsPerHost` from `NetworkConfig` to allow multiple HTTP/2 connections when the server's per-connection stream limit (100 for AWS Bedrock) is reached. Use `providerUtils.BuildStreamingHTTPClient(client)` to derive the streaming variant — it shares the base `Transport` (safe for concurrent reuse) but clears `Client.Timeout`. ### The Provider Interface `core/schemas/provider.go` defines the `Provider` interface with **30+ methods**. Every provider must implement all of them (returning "not supported" for unsupported operations). The interface covers: - `ListModels`, `ChatCompletion`, `ChatCompletionStream` - `Responses`, `ResponsesStream` (OpenAI Responses API) - `TextCompletion`, `TextCompletionStream` - `Embedding`, `Speech`, `SpeechStream`, `Transcription`, `TranscriptionStream` - `ImageGeneration`, `ImageGenerationStream`, `ImageEdit`, `ImageEditStream`, `ImageVariation` - `CountTokens` - `Batch*` (Create, List, Retrieve, Cancel, Results) - `File*` (Upload, List, Retrieve, Delete, Content) - `Container*` and `ContainerFile*` (Create, List, Retrieve, Delete, Content) **Streaming methods** receive a `PostHookRunner` callback and return `chan *BifrostStreamChunk`: ```go ChatCompletionStream(ctx *BifrostContext, postHookRunner PostHookRunner, key Key, request *BifrostChatRequest) (chan *BifrostStreamChunk, *BifrostError) ``` ### Error Handling Each provider has `errors.go` with an `ErrorConverter` function: ```go type ErrorConverter func(resp *fasthttp.Response, requestType schemas.RequestType, providerName schemas.ModelProvider, model string) *schemas.BifrostError ``` The shared utility `providerUtils.HandleProviderAPIError()` handles common HTTP error parsing. Provider-specific parsers add extra field mapping. Errors always carry metadata: ```go bifrostErr.ExtraFields.Provider = providerName bifrostErr.ExtraFields.ModelRequested = model bifrostErr.ExtraFields.RequestType = requestType ``` ### Plugin System Four plugin interfaces exist: | Interface | Hook Methods | When Called | |-----------|-------------|------------| | `LLMPlugin` | `PreLLMHook`, `PostLLMHook` | Every LLM request (SDK + HTTP) | | `MCPPlugin` | `PreMCPHook`, `PostMCPHook` | Every MCP tool execution | | `HTTPTransportPlugin` | `HTTPTransportPreHook`, `HTTPTransportPostHook`, `HTTPTransportStreamChunkHook` | HTTP gateway only (not Go SDK) | | `ObservabilityPlugin` | `Inject(ctx, trace)` | Async, after response written to wire | **Key plugin behaviors:** - Plugin errors are **logged as warnings**, never returned to the caller - Pre-hooks can **short-circuit** by returning `*LLMPluginShortCircuit` (cache hit, auth failure, rate limit) - Post-hooks receive both response and error — either can be nil. Plugins can **recover from errors** (set error to nil, provide response) or **invalidate responses** (set response to nil, provide error) - `BifrostError.AllowFallbacks` controls whether fallback providers are tried: `nil` or `&true` = allow, `&false` = block - `HTTPTransportStreamChunkHook` is called **per-chunk** during streaming — can modify, skip, or abort the stream ### Pool System `core/pool/` provides `Pool[T]` with two build modes: ```go // Production (default): zero-overhead sync.Pool wrapper // Debug (-tags pooldebug): tracks double-release, use-after-release, leaks with stack traces p := pool.New[MyType]("descriptive-name", func() *MyType { return &MyType{} }) obj := p.Get() // ... use obj ... // MUST reset ALL fields before Put — pool does not auto-reset p.Put(obj) ``` **Acquire/Release pattern** for types with complex reset logic (used in `schemas/plugin.go`): ```go req := schemas.AcquireHTTPRequest() // Get from pool, pre-allocated maps defer schemas.ReleaseHTTPRequest(req) // Clears all maps and fields, returns to pool ``` ### HTTP Transport Layer **Handler pattern:** Handlers are structs with injected dependencies: ```go type CompletionHandler struct { client *bifrost.Bifrost handlerStore lib.HandlerStore config *lib.Config } ``` **Route registration:** Each handler implements `RegisterRoutes(router, middlewares...)` — routes get middleware chains applied per-route via `lib.ChainMiddlewares()`. **SDK integration layers** (`transports/bifrost-http/integrations/`) provide request/response converters between provider-native SDK formats and Bifrost's internal format. This enables drop-in replacement of OpenAI SDK, Anthropic SDK, AWS Bedrock SDK, Google GenAI SDK, LangChain, and LiteLLM. --- ## Gotchas ### 1. Always Reset Pooled Objects Before Put Every pooled object must have **all** fields zeroed before `pool.Put()`. Stale data leaks between requests. The debug build catches double-release and use-after-release but **not** missing resets. ```go // WRONG — stale data from previous request leaks to next user pool.Put(msg) // RIGHT msg.Response = nil msg.Error = nil msg.Context = nil msg.ResponseStream = nil pool.Put(msg) ``` ### 2. Channel Lifecycle — ProviderQueue Pattern `ProviderQueue` uses atomic flags and `sync.Once` to prevent "send on closed channel" panics: ```go type ProviderQueue struct { queue chan *ChannelMessage done chan struct{} closing uint32 // atomic: 0=open, 1=closing signalOnce sync.Once // ensure signal fires only once closeOnce sync.Once // ensure close fires only once } ``` Always check the atomic closing flag before sending. Never close a channel without this pattern. ### 3. NetworkConfig Duration Serialization `RetryBackoffInitial` and `RetryBackoffMax` are `time.Duration` (nanoseconds) in Go but **milliseconds** (integers) in JSON. Custom `MarshalJSON`/`UnmarshalJSON` handles conversion. If adding new duration fields to any config struct, follow this pattern exactly. ### 4. ExtraHeaders — Defensive Map Copy `NetworkConfig.ExtraHeaders` is deep-copied in `CheckAndSetDefaults()` to prevent data races between concurrent requests. Apply the same `maps.Copy()` pattern to any new map fields in config structs. ### 5. Provider Interface Has 30+ Methods Adding a new operation type requires changes across the entire codebase: 1. Add method to `Provider` interface in `core/schemas/provider.go` 2. Implement in **all** 20+ providers (most return "not supported") 3. Add `RequestType` constant in `core/schemas/bifrost.go` 4. Add to `AllowedRequests` struct and `IsOperationAllowed()` switch 5. Add handler endpoint in `transports/bifrost-http/handlers/` 6. Wire up in `core/bifrost.go` and `core/inference.go` ### 6. OpenAI Provider Changes Cascade to 9+ Providers Groq, Cerebras, Ollama, Perplexity, OpenRouter, Parasail, Nebius, xAI, and SGL all delegate to `openai.HandleOpenAI*` functions. **Any change to OpenAI converter logic affects all of them.** Always test broadly: `make test-core` (all providers). ### 7. Scanner Buffer Pool Has a Capacity Cap The SSE scanner buffer pool in `core/providers/utils/utils.go` starts at 4KB. Buffers grow dynamically but those exceeding **64KB are discarded** (not returned to pool) to prevent memory bloat. Be aware when working with providers that send very large SSE events. ### 8. Plugin Execution Order is Meaningful Pre-hooks: registration order (first registered → first to run). Post-hooks: **reverse** order. This creates "wrapping" semantics — the first plugin registered is the outermost wrapper (its pre-hook runs first, post-hook runs last). Changing registration order changes behavior. ### 9. Fallbacks Re-execute the Full Plugin Pipeline When a provider fails and the request falls to a fallback, the **entire plugin pipeline** re-executes from scratch. Governance checks, caching, and logging all run again for each attempt. Intentional, but surprising when debugging request counts or cost tracking. ### 10. `AllowedRequests` Nil Semantics A **nil** `*AllowedRequests` means "all operations allowed." A **non-nil** value only allows fields explicitly set to `true`. This applies to both `ProviderConfig.AllowedRequests` and `CustomProviderConfig.AllowedRequests`. ### 11. BifrostContext Reserved Keys Are Silently Dropped When `BlockRestrictedWrites()` is active, writes to reserved keys (governance IDs, retry counts, fallback index, etc.) are **silently ignored** — no error. If your plugin needs to pass data through context, use your own custom key type. ### 12. `fasthttp`, Not `net/http` Bifrost uses `github.com/valyala/fasthttp` for provider HTTP calls. The API is different from `net/http`: - Use `fasthttp.AcquireRequest()`/`fasthttp.ReleaseRequest()` for lifecycle - `fasthttp.Client` pools connections per-host (`NetworkConfig.MaxConnsPerHost`, default 5000, 30s idle) - Request/response bodies accessed via `resp.Body()` (returns `[]byte`, not `io.Reader`) - **Exception:** Bedrock uses `net/http` (for AWS SigV4 signing) with `http.Transport` configured for HTTP/2 multi-connection support ### 13. `sonic`, Not `encoding/json` JSON marshaling in hot paths uses `github.com/bytedance/sonic` for performance. `core/schemas/` uses standard `encoding/json` for custom marshaling (e.g., `NetworkConfig`). Don't mix them accidentally. ### 14. Atomic Pointer for Hot Config Reload `Bifrost` uses `atomic.Pointer` for providers and plugins lists. On updates: create new slice → atomically swap pointer. **Never mutate the slice in place** — concurrent readers would see partial state. ### 15. MCP Tool Filtering is 4 Levels Deep Tool access follows: Global filter → Client-level filter → Tool-level filter → Per-request filter (HTTP headers). All four levels must agree for a tool to be available. Changes to filtering logic must respect this hierarchy. ### 16. `config.schema.json` is the Source of Truth `transports/config.schema.json` (~2700 lines) is the authoritative definition for all `config.json` fields. Documentation examples must match. When adding config fields: update schema first → handlers → docs. ### 17. UI `data-testid` Attributes Are Load-Bearing E2E tests depend on `data-testid` attributes. Convention: `data-testid="--"`. If you rename or remove one, search `tests/e2e/` for references. If you add new interactive elements, add `data-testid`. ### 18. E2E Tests — Never Marshal Payloads to Maps In `tests/e2e/core/`, **never marshal API payloads to a `Record`/`Map`/plain-object and then re-serialize**. Field ordering matters for backend validation and snapshot comparisons. Construct payloads as object literals with fields in the intended order and pass directly to Playwright's `request.post({ data })`. Avoid `Object.fromEntries()`, `JSON.parse(JSON.stringify(...))` round-trips, or destructuring into an intermediate `Record` — these can silently reorder fields. --- ## Adding a New Provider — Full Checklist 1. Create `core/providers//` with files per the pattern (see "Provider Implementation" above) 2. Add `ModelProvider` constant in `core/schemas/bifrost.go` 3. Add to `StandardProviders` list in `core/schemas/bifrost.go` 4. Register in `core/bifrost.go` — add import + case in provider init switch 5. **UI integration** (all required): - `ui/lib/constants/config.ts` — model placeholder + key requirement - `ui/lib/constants/icons.tsx` — provider icon - `ui/lib/constants/logs.ts` — provider display name (2 places) - `docs/openapi/openapi.json` — OpenAPI spec update - `transports/config.schema.json` — config schema (2 locations) 6. **CI/CD**: Add env vars to `.github/workflows/pr-tests.yml` and `release-pipeline.yml` (4 jobs) 7. **Docs**: Create `docs/providers/supported-providers/.mdx` 8. **Test**: `make test-core PROVIDER=` --- ## Testing ### Always prefer `make test-core` over raw `go test` for provider-level tests The `make test-core` target is the canonical harness for provider tests — it wires up env vars from `.env` (provider API keys), invokes the per-provider `{provider}_test.go` entrypoint in `core/providers//`, and routes through the shared `core/internal/llmtests/` scenario suite that validates end-to-end behavior (including streaming). Running bare `go test ./core/providers//...` only executes unit tests and skips the llmtests scenarios — so it won't catch regressions in streaming, tool-calling, or provider-specific response shapes. ```bash make test-core PROVIDER=anthropic TESTCASE=TestChatCompletionStream # exact test make test-core PROVIDER=openai PATTERN=Stream # substring match make test-core PROVIDER=bedrock # all scenarios for one provider make test-core DEBUG=1 PROVIDER=gemini TESTCASE=TestResponsesStream # attach Delve on :2345 ``` `PATTERN` and `TESTCASE` are mutually exclusive. Provider name must match a directory under `core/providers/` (e.g. `anthropic`, `openai`, `bedrock`, `vertex`, `azure`, `gemini`, `cohere`, `mistral`, `groq`, etc.). ### LLM Tests (`core/internal/llmtests/`) Scenario-based tests that run against **live provider APIs** with dual-API testing (Chat Completions + Responses API): ```go func RunMyScenarioTest(t *testing.T, client *bifrost.Bifrost, ctx context.Context, cfg ComprehensiveTestConfig) { // Use validation presets: BasicChatExpectations(), ToolCallExpectations(), etc. // Use retry framework for flaky assertions } ``` - Register in `tests.go` `testScenarios` slice - Add `Scenarios.MyScenario` flag to `ComprehensiveTestConfig` - Run: `make test-core PROVIDER= TESTCASE=` ### MCP Tests (`core/internal/mcptests/`) Mock-based tests with `DynamicLLMMocker` and declarative setup: ```go manager, mocker, ctx := SetupAgentTest(t, AgentTestConfig{ InProcessTools: []string{"echo", "calculator"}, AutoExecuteTools: []string{"*"}, MaxDepth: 5, }) // Queue mock LLM responses, assert tool execution order ``` Categories: `agent_*_test.go`, `tool_*_test.go`, `connection_*_test.go`, `codemode_*_test.go` Run: `make test-mcp TESTCASE=` ### E2E Tests (`tests/e2e/`) Playwright tests with page objects, data factories, fixtures: - Page objects extend `BasePage`, use `getByTestId()` as primary selector strategy - Data factories use `Date.now()` for unique names (prevents collision in parallel runs) - Track created resources in arrays, clean up in `afterEach` - Import `test`/`expect` from `../../core/fixtures/base.fixture` (never from `@playwright/test`) - **Never marshal API payloads to a `Record`/`Map`/plain-object and then re-serialize.** Field ordering matters for snapshot comparisons and some backend validations. Construct payloads as object literals with fields in the intended order and pass directly to Playwright's `request.post({ data })`. Do NOT destructure into an intermediate `Record` or use `Object.fromEntries()` / `JSON.parse(JSON.stringify(...))` round-trips, as these can reorder fields. Run: `make run-e2e FLOW=` --- ## Claude Code Skills Four skills are available via `/skill-name`: ### `/docs-writer ` Write, update, or review Mintlify MDX documentation. Researches UI code, Go handlers, and config schema. Validates `config.json` examples against `transports/config.schema.json`. Outputs docs with Web UI / API / config.json tabs. Variants: `/docs-writer update `, `/docs-writer review ` ### `/e2e-test ` Create, run, debug, audit, or auto-update Playwright E2E tests. Variants: - `/e2e-test fix ` — Debug and fix a failing test - `/e2e-test sync` — Detect UI changes, update affected tests automatically - `/e2e-test audit` — Scan specs for incorrect/weak assertions (P0-P6 severity scale) ### `/investigate-issue ` Investigate a GitHub issue from `maximhq/bifrost`. Fetches issue details, classifies by type/area, searches codebase, traces dependencies, analyzes side effects, suggests tests (LLM/MCP/E2E), and presents an implementation plan with per-change approval gates. ### `/resolve-pr-comments ` Systematically address unresolved PR review comments. Uses GraphQL to get unresolved threads, presents each with FIX/REPLY/SKIP options, collects fixes locally, and only posts replies **after code is pushed** to remote. --- ## Common Workflows ### Modify chat completions across all providers 1. Change types in `core/schemas/chatcompletions.go` 2. Update converter functions in each provider's `chat.go` 3. If streaming affected, update `framework/streaming/` (accumulator, delta copy) 4. Run `make test-core` (all providers) ### Add a new field to API responses 1. Add to schema type in `core/schemas/` 2. Map in provider response converter (`ToBifrost*Response`) 3. Handle in streaming accumulator if applicable 4. Update HTTP handler if field needs special serialization 5. Update `transports/config.schema.json` if configurable ### Add a new plugin 1. Create `plugins//` with its own `go.mod` 2. Implement `LLMPlugin`, `MCPPlugin`, or `HTTPTransportPlugin` interface 3. Add to `go.work` 4. Register in transport layer or Bifrost config 5. Add test targets to `Makefile` ### Modify a UI feature 1. Find workspace page: `ui/app/workspace//` 2. Check existing `data-testid` attributes — E2E tests depend on them 3. Add `data-testid` to new interactive elements 4. Run `make run-e2e FLOW=` to verify 5. If E2E tests break, use `/e2e-test sync` to update them --- ## Key Files Quick Reference | What | Where | |------|-------| | Main Bifrost struct & queuing | `core/bifrost.go` | | Inference routing & fallbacks | `core/inference.go` | | Provider interface (30+ methods) | `core/schemas/provider.go` | | ModelProvider enum & context keys | `core/schemas/bifrost.go` | | Plugin interfaces & pooled HTTP types | `core/schemas/plugin.go` | | BifrostContext (mutable context) | `core/schemas/context.go` | | Chat completion types | `core/schemas/chatcompletions.go` | | Responses API types | `core/schemas/responses.go` | | Object pool (prod + debug) | `core/pool/pool_prod.go`, `pool_debug.go` | | Shared provider utils & SSE parsing | `core/providers/utils/utils.go` | | Streaming accumulator | `framework/streaming/accumulator.go` | | HTTP inference handler | `transports/bifrost-http/handlers/inference.go` | | Governance handler | `transports/bifrost-http/handlers/governance.go` | | Config schema (source of truth) | `transports/config.schema.json` | | Pool debug profiler | `transports/bifrost-http/handlers/devpprof.go` | | LLM test infrastructure | `core/internal/llmtests/` | | MCP test infrastructure | `core/internal/mcptests/` | | E2E test infrastructure | `tests/e2e/core/` | | Docs navigation config | `docs/docs.json` | | CI/CD workflows | `.github/workflows/` | --- ## Code Style - **Go**: `gofmt`/`goimports`. No custom linter config. - **TypeScript/React**: Oxfmt. TanStack Router. - **JSON tags**: `snake_case` matching provider API conventions. - **Error strings**: Lowercase, no trailing punctuation (Go convention). - **Provider types**: Prefixed with provider name in PascalCase (`AnthropicChatRequest`, `GeminiEmbeddingResponse`). - **Converter functions**: Pure — no side effects, no logging, no HTTP. - **Pool names**: Descriptive string passed to `pool.New()` (e.g., `"channel-message"`, `"response-stream"`). - **Context keys**: Use `BifrostContextKey` type. Custom plugins should define their own key types to avoid collisions. - **Go filenames**: No underscores. The only permitted underscore is the `_test.go` suffix. Examples: `pluginpipeline.go`, `pluginpipeline_test.go` — never `plugin_pipeline.go` or `plugin_pipeline_race_test.go`. Concatenate words (lowercase, no separators) for multi-word filenames. # Frontend Code Guidelines & Patterns This document defines the standards, structure, and best practices for writing frontend code in this project. --- ## Tech Stack - **React** (with Vite) - **TypeScript** - **@tanstack/react-router** (type-safe routing) - **Tailwind CSS v4** - **Radix UI** (primitives) - **Local UI component library** (`ui/components/ui/`) built on Radix primitives --- ## Folder Structure ```text /ui ├── app # Routes & pages ├── components # Shared components │ └── ui # Core design system components ├── hooks # Custom React hooks ├── lib # Utilities, helpers, shared logic └── app/enterprise # Enterprise-specific code (via symlink) ``` ### Rules - All frontend code must live inside `/ui` - Routes and pages → `ui/app` - Shared/reusable components → `ui/components` - Core UI primitives → `ui/components/ui` - Utilities and libraries → `ui/lib` - Custom hooks → `ui/hooks` --- ## Libraries & Usage ### Core Libraries - `react` → UI library - `typescript` → Type safety - `tailwindcss` → Styling - `@tanstack/react-router` → Routing ### UI & Visualization - `@radix-ui/react-*` → UI primitives - `ui/components/ui/*` → Project's Radix-based component system - `recharts` → Charts - `monaco-editor` → Code editor ### Utilities - `date-fns` → Date/time formatting - `nuqs` → Query param state management ### Tooling - `Oxfmt` → Code formatting - `vitest` → Testing --- ## Routing Convention For every new route: ```text ui/app// ├── layout.tsx # Route definition using createFileRoute ├── page.tsx # Page content └── views/ # Optional: route-specific components ``` ### Rules - Folder name must match route name - Always use `createFileRoute` in `layout.tsx` - `page.tsx` should only handle composition (not heavy logic) - Route-specific components go inside `views/` --- ## Component Guidelines ### Reusability First - Always check if similar components/functions already exist - Prefer extending or refactoring existing code over duplication - Only create new components if reuse is not feasible --- ### Component Placement - Shared → `ui/components` - Route-specific → `views/` inside route folder --- ### JSX & Rendering - Avoid deeply nested conditional rendering - Break complex UI into smaller components - Keep components readable and maintainable --- ### Lists & Keys - Always use **stable, unique keys** - Never use array index as key (unless unavoidable) --- ## React Best Practices - Avoid unnecessary or unstable dependencies in hooks - Prevent infinite loops in `useEffect` - Keep dependency arrays accurate and minimal - Prefer derived state over duplicated state --- ## State Management ### Priority Order 1. Query Params (`nuqs`) → for persistent/shareable state 2. Local State → for UI-only state 3. Redux → only when truly necessary --- ### Query Params (`nuqs`) - Use for state that should persist across refresh/navigation - Use proper parsers like `parseAsString` or `parseAsInteger` - Do NOT mix query param state with local/redux state - Follow a single consistent pattern across the codebase --- ### Redux - Use only when global/shared state is required - Avoid unnecessary slices - Prefer simpler alternatives when possible --- ### RTK Query (`@reduxjs/toolkit/query`) - Use for API calls and caching - Use **granular tags** for cache invalidation - Avoid invalidating entire datasets unnecessarily - Implement **optimistic updates** where applicable --- ## Forms We use: - `react-hook-form` - `zod v4` (for schema validation) ### Rules - Always define a Zod schema - Include meaningful validation messages - Prefer **inline field errors** (not toast notifications) - Use `refine` / `superRefine` for complex validation - Store schemas in: `ui/lib/types/schemas.ts` --- ## Tables - Use `@tanstack/react-table` **only for large/complex datasets** - For simple tables → build custom lightweight components - Prioritize performance over abstraction --- ## ⚡ Performance Guidelines - Lazy load heavy or rarely-used libraries - Avoid unnecessary re-renders - Split large components into smaller ones - Keep bundle size minimal --- ## Dependency Rules - Do NOT add new dependencies unless absolutely necessary - Always pin exact versions (no `^` or `~`) - Prefer existing libraries in the codebase --- ## TypeScript Guidelines - Avoid using `any` unless absolutely unavoidable - Prefer strict typing and inference - Define reusable types in shared locations --- ## Code Quality & Formatting After writing code: ```bash cd ui && npm run format ```` Then verify build: ```bash cd ui && npm run build ``` * Code must pass formatting and build checks * Follow consistent naming and structure conventions --- ## Anti-Patterns to Avoid * Duplicate components without considering reuse * Mixing multiple state management approaches unnecessarily * Overusing Redux * Using unstable hook dependencies * Adding heavy libraries for simple use cases * Poorly structured or deeply nested JSX --- ## Summary * Prioritize **reusability, performance, and consistency** * Follow **strict folder structure and routing conventions** * Use **the right tool for the right problem** * Keep code **simple, predictable, and maintainable**