first commit
161
docs/architecture/framework/config-store.mdx
Normal file
@@ -0,0 +1,161 @@
---
title: "Config Store"
description: "A persistent and flexible configuration management system for Bifrost, supporting multiple database backends."
icon: "gear"
---

The ConfigStore is a critical component of the Bifrost framework, providing a centralized and persistent storage solution for all gateway configurations. It abstracts the underlying database, offering a unified API for managing everything from provider settings and virtual keys to governance policies and plugin configurations.
|
||||
|
||||
## Core Features
|
||||
|
||||
- **Unified Configuration API**: A single interface (`ConfigStore`) for all configuration CRUD (Create, Read, Update, Delete) operations.
|
||||
- **Multiple Backend Support**: Out-of-the-box support for SQLite and PostgreSQL, with an extensible architecture for adding new database backends.
|
||||
- **Comprehensive Data Management**: Manages a wide range of configuration data, including:
|
||||
- Provider and key settings
|
||||
- Virtual keys and governance rules (budgets, rate limits)
|
||||
- Customer and team information for multi-tenancy
|
||||
- Plugin configurations
|
||||
- Vector store and log store settings
|
||||
- Model pricing information
|
||||
- **Transactional Operations**: Ensures data consistency by supporting atomic transactions for complex configuration changes.
|
||||
- **Database Migrations**: Integrated migration system to manage schema evolution across different versions of Bifrost.
|
||||
- **Environment Variable Handling**: Securely manages sensitive data like API keys by storing references to environment variables instead of raw values.
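
As a rough illustration of the environment variable handling described in the last bullet, the sketch below resolves a stored reference to its value at read time. The `env.` prefix, the `resolveSecret` helper, and the lookup function are assumptions made for this example; they are not taken from the ConfigStore API.

```go
package main

import (
	"fmt"
	"os"
	"strings"
)

// envPrefix marks a stored value as a reference to an environment variable.
// The prefix is an assumption made for this sketch.
const envPrefix = "env."

// resolveSecret returns the stored value unchanged when it is a literal, or
// the value of the referenced environment variable when it starts with envPrefix.
func resolveSecret(stored string, lookup func(string) (string, bool)) (string, error) {
	if !strings.HasPrefix(stored, envPrefix) {
		return stored, nil
	}
	name := strings.TrimPrefix(stored, envPrefix)
	value, ok := lookup(name)
	if !ok {
		return "", fmt.Errorf("environment variable %q is not set", name)
	}
	return value, nil
}

func main() {
	// os.LookupEnv has exactly the signature the helper expects.
	value, err := resolveSecret("env.OPENAI_API_KEY", os.LookupEnv)
	if err != nil {
		fmt.Println("lookup failed:", err)
		return
	}
	fmt.Println("resolved a key of length", len(value))
}
```

Passing the lookup function as a parameter keeps the raw secret out of the stored configuration and makes the helper easy to exercise without touching the real environment.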
|
||||
|
||||
## Architecture
|
||||
|
||||
The ConfigStore is designed around the `ConfigStore` interface, which defines all the methods for interacting with the configuration data. The primary implementation is `RDBConfigStore`, which uses [GORM](https://gorm.io/) as an ORM to communicate with relational databases.
|
||||
|
||||
### Supported Backends
|
||||
|
||||
- **SQLite**: The default, file-based database, perfect for local development, testing, and single-node deployments. It requires no external services.
|
||||
- **PostgreSQL**: A robust, production-grade database suitable for large-scale, high-availability deployments.
|
||||
|
||||
The backend is selected and configured in Bifrost's main configuration file.
|
||||
|
||||
### Initialization
|
||||
|
||||
The ConfigStore is initialized at startup based on the provided configuration.
|
||||
|
||||
```go
import (
	"context"

	"github.com/maximhq/bifrost/framework/configstore"
	"github.com/maximhq/bifrost/core/schemas"
)

// Example: Initialize a SQLite-based ConfigStore
config := &configstore.Config{
	Enabled: true,
	Type:    configstore.ConfigStoreTypeSQLite,
	Config: &configstore.SQLiteConfig{
		File: "/path/to/config.db",
	},
}

var logger schemas.Logger // Assume logger is initialized
store, err := configstore.NewConfigStore(context.Background(), config, logger)
if err != nil {
	// Handle error
}
```
|
||||
|
||||
Here is an example for initializing a PostgreSQL-based `ConfigStore`:
|
||||
```go
// Example: Initialize a PostgreSQL-based ConfigStore
pgConfig := &configstore.Config{
	Enabled: true,
	Type:    configstore.ConfigStoreTypePostgres,
	Config: &configstore.PostgresConfig{
		Host:         "localhost",
		Port:         "5432",
		User:         "postgres",
		Password:     "secret",
		DBName:       "bifrost",
		SSLMode:      "disable",
		MaxIdleConns: 5,  // Optional: Maximum idle connections (default: 5)
		MaxOpenConns: 50, // Optional: Maximum open connections (default: 50)
	},
}

store, err = configstore.NewConfigStore(context.Background(), pgConfig, logger)
if err != nil {
	// Handle error
}
```
|
||||
|
||||
<Note>
|
||||
PostgreSQL databases used by Bifrost stores must be UTF8 encoded. See [PostgreSQL UTF8 Requirement](../../quickstart/gateway/setting-up#postgresql-utf8-requirement).
|
||||
</Note>
|
||||
|
||||
### Connection Pool Configuration
|
||||
|
||||
For PostgreSQL backends, you can configure the database connection pool to optimize performance based on your workload:
|
||||
|
||||
- **MaxIdleConns**: Maximum number of idle connections in the pool (default: 5)
|
||||
- **MaxOpenConns**: Maximum number of open connections to the database (default: 50)
|
||||
|
||||
These parameters help manage database connection resources effectively. Increase them for high-traffic deployments or decrease them for resource-constrained environments.
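
To make the defaults concrete, here is a minimal sketch of how two such values could be applied to a connection pool. It assumes that a zero value means "unset" and that the pool exposes `SetMaxIdleConns` and `SetMaxOpenConns`, as `*sql.DB` from Go's `database/sql` does. The `poolSettings` type, the `applyPoolSettings` helper, and the `recorder` stand-in are hypothetical and not part of Bifrost.

```go
package main

import "fmt"

// Defaults as stated in the documentation above.
const (
	defaultMaxIdleConns = 5
	defaultMaxOpenConns = 50
)

// poolConfigurer is the subset of *sql.DB used here; *sql.DB satisfies it.
type poolConfigurer interface {
	SetMaxIdleConns(n int)
	SetMaxOpenConns(n int)
}

// poolSettings mirrors the two optional fields (hypothetical type).
type poolSettings struct {
	MaxIdleConns int
	MaxOpenConns int
}

// applyPoolSettings fills in defaults for unset values, keeps the idle limit
// at or below the open limit, and applies both limits to the pool.
func applyPoolSettings(db poolConfigurer, s poolSettings) poolSettings {
	if s.MaxIdleConns <= 0 {
		s.MaxIdleConns = defaultMaxIdleConns
	}
	if s.MaxOpenConns <= 0 {
		s.MaxOpenConns = defaultMaxOpenConns
	}
	if s.MaxIdleConns > s.MaxOpenConns {
		s.MaxIdleConns = s.MaxOpenConns
	}
	db.SetMaxOpenConns(s.MaxOpenConns)
	db.SetMaxIdleConns(s.MaxIdleConns)
	return s
}

// recorder stands in for *sql.DB so the sketch runs without a database driver.
type recorder struct{ idle, open int }

func (r *recorder) SetMaxIdleConns(n int) { r.idle = n }
func (r *recorder) SetMaxOpenConns(n int) { r.open = n }

func main() {
	rec := &recorder{}
	effective := applyPoolSettings(rec, poolSettings{MaxOpenConns: 20})
	fmt.Printf("idle=%d open=%d\n", effective.MaxIdleConns, effective.MaxOpenConns)
}
```

Capping the idle limit at the open limit matches what `database/sql` does on its own, so the values reported back are the ones actually in effect.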
|
||||
|
||||
## Data Models
|
||||
|
||||
The ConfigStore manages a variety of data models, which are defined as GORM tables in the `framework/configstore/tables` directory. Some of the key models include:
|
||||
|
||||
- `TableVirtualKey`: Represents a virtual key with its associated governance rules, keys, and metadata.
|
||||
- `TableProvider` & `TableKey`: Store provider-specific configurations and the physical API keys.
|
||||
- `TableBudget` & `TableRateLimit`: Define spending limits and request rate limits for governance.
|
||||
- `TableCustomer` & `TableTeam`: Enable multi-tenant configurations.
|
||||
- `TableModelPricing`: Caches model pricing information for cost calculation.
|
||||
- `TablePlugin`: Stores configuration for loaded plugins.
|
||||
|
||||
## Usage
|
||||
|
||||
The `ConfigStore` interface provides a rich set of methods for managing Bifrost's configuration.
|
||||
|
||||
### Managing Virtual Keys
|
||||
|
||||
```go
// Create a new virtual key
newKey := &tables.TableVirtualKey{
	ID:   "vk-12345",
	Name: "My Test Key",
	// ... other fields
}
if err := store.CreateVirtualKey(ctx, newKey); err != nil {
	// Handle error
}

// Retrieve a virtual key
virtualKey, err := store.GetVirtualKey(ctx, "vk-12345")
if err != nil {
	// Handle error
}
```
|
||||
|
||||
### Managing Providers
|
||||
|
||||
```go
// Get all provider configurations
providers, err := store.GetProvidersConfig(ctx)
if err != nil {
	// Handle error
}

// Update a specific provider
providerConfig := providers[schemas.OpenAI]
providerConfig.NetworkConfig.TimeoutSeconds = 120
// envKeys is assumed to be prepared by the caller; it is not defined in this snippet.
err = store.UpdateProvider(ctx, schemas.OpenAI, providerConfig, envKeys)
```
|
||||
|
||||
### Executing Transactions
|
||||
|
||||
For operations that require multiple database writes, you can use a transaction to ensure atomicity.
|
||||
|
||||
```go
err := store.ExecuteTransaction(ctx, func(tx *gorm.DB) error {
	// Perform multiple operations within this transaction
	if err := store.CreateBudget(ctx, budget1, tx); err != nil {
		return err // Rollback
	}
	if err := store.UpdateRateLimit(ctx, limit1, tx); err != nil {
		return err // Rollback
	}
	return nil // Commit
})
```
|
||||
|
||||
## Migrations
|
||||
|
||||
The ConfigStore includes a migration system to handle database schema changes between Bifrost versions. Migrations are automatically applied at startup, ensuring the database schema is always up-to-date. This process is managed by the `migrator` package and is transparent to the user.
|
||||
|
||||
The ConfigStore is a powerful and flexible component that provides the backbone for Bifrost's dynamic configuration capabilities. Its support for multiple backends and transactional operations makes it suitable for both small-scale and large-scale production environments.
|
||||
176
docs/architecture/framework/log-store.mdx
Normal file
@@ -0,0 +1,176 @@
---
title: "Log Store"
description: "A robust and queryable system for persisting API request and response logs, with support for multiple database backends."
icon: "clipboard-list"
---

The LogStore is a core component of the Bifrost framework responsible for capturing, storing, and retrieving detailed logs of API requests and responses. It provides a persistent, queryable audit trail of all activity passing through the gateway, which is essential for debugging, monitoring, analytics, and compliance.
|
||||
|
||||
## Core Features
|
||||
|
||||
- **Persistent Logging**: Automatically saves detailed information about each API request, including input, output, status, latency, and cost.
|
||||
- **Multiple Backend Support**: Comes with built-in support for SQLite and PostgreSQL, allowing you to choose the best storage solution for your deployment needs.
|
||||
- **Rich Querying and Filtering**: A powerful search API allows you to filter and sort logs based on a wide range of criteria such as provider, model, status, latency, cost, and content.
|
||||
- **Performance Analytics**: The search functionality also provides aggregated statistics, including total requests, success rate, average latency, total tokens, and total cost for the queried data.
|
||||
- **Structured Data Model**: Logs are stored in a structured format, with complex objects like message history and tool calls serialized as JSON for efficient storage and retrieval.
|
||||
- **Automatic Data Management**: Includes GORM hooks to automatically handle JSON serialization/deserialization and to build a searchable content summary.
|
||||
|
||||
## Architecture
|
||||
|
||||
The LogStore is built around the `LogStore` interface, which defines the standard methods for interacting with the log database. The primary implementation, `RDBLogStore`, uses GORM to provide an abstraction over relational databases.
|
||||
|
||||
### Supported Backends
|
||||
|
||||
- **SQLite**: The default, file-based database, ideal for local development and smaller, single-node deployments.
|
||||
- **PostgreSQL**: A production-ready database for scalable and high-availability deployments.
|
||||
|
||||
The backend is configured in Bifrost's main configuration file.
|
||||
|
||||
### Initialization
|
||||
|
||||
The LogStore is initialized at startup based on the provided configuration.
|
||||
|
||||
```go
import (
	"context"

	"github.com/maximhq/bifrost/framework/logstore"
	"github.com/maximhq/bifrost/core/schemas"
)

// Example: Initialize a SQLite-based LogStore
config := &logstore.Config{
	Enabled: true,
	Type:    logstore.LogStoreTypeSQLite,
	Config: &logstore.SQLiteConfig{
		File: "/path/to/logs.db",
	},
}

var logger schemas.Logger // Assume logger is initialized
store, err := logstore.NewLogStore(context.Background(), config, logger)
if err != nil {
	// Handle error
}
```
|
||||
|
||||
Here is an example for initializing a PostgreSQL-based `LogStore`:
|
||||
```go
|
||||
// Example: Initialize a PostgreSQL-based LogStore
|
||||
pgConfig := &logstore.Config{
|
||||
Enabled: true,
|
||||
Type: logstore.LogStoreTypePostgres,
|
||||
Config: &logstore.PostgresConfig{
|
||||
Host: "localhost",
|
||||
Port: "5432",
|
||||
User: "postgres",
|
||||
Password: "secret",
|
||||
DBName: "bifrost_logs",
|
||||
SSLMode: "disable",
|
||||
MaxIdleConns: 5, // Optional: Maximum idle connections (default: 5)
|
||||
MaxOpenConns: 50, // Optional: Maximum open connections (default: 50)
|
||||
},
|
||||
}
|
||||
|
||||
store, err = logstore.NewLogStore(context.Background(), pgConfig, logger)
|
||||
if err != nil {
|
||||
// Handle error
|
||||
}
|
||||
```
|
||||
|
||||
<Note>
|
||||
PostgreSQL databases used by Bifrost stores must be UTF8 encoded. See [PostgreSQL UTF8 Requirement](../../quickstart/gateway/setting-up#postgresql-utf8-requirement).
|
||||
</Note>
|
||||
|
||||
### Connection Pool Configuration
|
||||
|
||||
For PostgreSQL backends, you can configure the database connection pool to optimize performance based on your workload:
|
||||
|
||||
- **MaxIdleConns**: Maximum number of idle connections in the pool (default: 5)
|
||||
- **MaxOpenConns**: Maximum number of open connections to the database (default: 50)
|
||||
|
||||
These parameters help manage database connection resources effectively. Increase them for high-traffic deployments or decrease them for resource-constrained environments.
|
||||
|
||||
## Data Model
|
||||
|
||||
The core of the LogStore is the `Log` struct, which represents a single log entry in the `logs` table.
|
||||
|
||||
```go
|
||||
// Log represents a complete log entry for a request/response cycle
|
||||
type Log struct {
|
||||
ID string `gorm:"primaryKey;type:varchar(255)"`
|
||||
Timestamp time.Time `gorm:"index;not null"`
|
||||
Object string `gorm:"type:varchar(255);index;not null;column:object_type"`
|
||||
Provider string `gorm:"type:varchar(255);index;not null"`
|
||||
Model string `gorm:"type:varchar(255);index;not null"`
|
||||
Latency *float64
|
||||
Cost *float64 `gorm:"index"`
|
||||
Status string `gorm:"type:varchar(50);index;not null"` // "processing", "success", or "error"
|
||||
Stream bool `gorm:"default:false"`
|
||||
|
||||
// Denormalized token fields for easier querying
|
||||
PromptTokens int `gorm:"default:0"`
|
||||
CompletionTokens int `gorm:"default:0"`
|
||||
TotalTokens int `gorm:"default:0"`
|
||||
|
||||
// JSON serialized fields
|
||||
InputHistory string `gorm:"type:text"`
|
||||
OutputMessage string `gorm:"type:text"`
|
||||
TokenUsage string `gorm:"type:text"`
|
||||
ErrorDetails string `gorm:"type:text"`
|
||||
// ... and many more for different data types
|
||||
}
|
||||
```
|
||||
Complex data like message arrays and tool calls are serialized into JSON strings for storage and are automatically deserialized back into their struct forms when retrieved.
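
The round trip described above can be sketched with `encoding/json`. This is a simplified illustration, not the LogStore implementation: the `message` and `logRow` types and the `serialize` / `deserialize` method names are assumptions, and in a GORM model these calls would typically sit inside hooks such as `BeforeCreate` and `AfterFind`.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// message is a simplified stand-in for a chat message (hypothetical type).
type message struct {
	Role    string `json:"role"`
	Content string `json:"content"`
}

// logRow keeps the JSON text column next to its parsed, in-memory form.
type logRow struct {
	InputHistory       string    // JSON text as stored in the database
	InputHistoryParsed []message // parsed form, not persisted
}

// serialize encodes the parsed history into the text column.
// A GORM BeforeCreate hook could call this before the row is written.
func (r *logRow) serialize() error {
	if r.InputHistoryParsed == nil {
		r.InputHistory = ""
		return nil
	}
	data, err := json.Marshal(r.InputHistoryParsed)
	if err != nil {
		return fmt.Errorf("serialize input history: %w", err)
	}
	r.InputHistory = string(data)
	return nil
}

// deserialize decodes the text column back into structs.
// A GORM AfterFind hook could call this after the row is read.
func (r *logRow) deserialize() error {
	if r.InputHistory == "" {
		r.InputHistoryParsed = nil
		return nil
	}
	var parsed []message
	if err := json.Unmarshal([]byte(r.InputHistory), &parsed); err != nil {
		return fmt.Errorf("deserialize input history: %w", err)
	}
	r.InputHistoryParsed = parsed
	return nil
}

func main() {
	row := &logRow{InputHistoryParsed: []message{{Role: "user", Content: "hi"}}}
	if err := row.serialize(); err != nil {
		fmt.Println(err)
		return
	}
	fmt.Println(row.InputHistory)

	loaded := &logRow{InputHistory: row.InputHistory}
	if err := loaded.deserialize(); err != nil {
		fmt.Println(err)
		return
	}
	fmt.Println(loaded.InputHistoryParsed[0].Content)
}
```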
|
||||
|
||||
## Usage
|
||||
|
||||
### Creating Log Entries
|
||||
|
||||
A log entry is created by populating a `Log` struct and passing it to the `Create` method. This is typically handled internally by Bifrost's logging plugins.
|
||||
|
||||
```go
|
||||
logEntry := &logstore.Log{
|
||||
ID: "req-xyz123",
|
||||
Timestamp: time.Now(),
|
||||
Provider: "openai",
|
||||
Model: "gpt-4",
|
||||
Status: "success",
|
||||
// ... other fields
|
||||
}
|
||||
err := store.Create(ctx, logEntry)
|
||||
```
|
||||
|
||||
### Searching and Filtering Logs
|
||||
|
||||
The `SearchLogs` method provides a powerful way to query logs with fine-grained filters and pagination.
|
||||
|
||||
```go
|
||||
// Define search criteria
|
||||
filters := logstore.SearchFilters{
|
||||
Providers: []string{"openai", "anthropic"},
|
||||
Status: []string{"error"},
|
||||
StartTime: &startTime, // time.Time pointer
|
||||
}
|
||||
|
||||
pagination := logstore.PaginationOptions{
|
||||
Limit: 50,
|
||||
Offset: 0,
|
||||
SortBy: "timestamp",
|
||||
Order: "desc",
|
||||
}
|
||||
|
||||
// Execute the search
|
||||
results, err := store.SearchLogs(ctx, filters, pagination)
|
||||
if err != nil {
|
||||
// Handle error
|
||||
}
|
||||
|
||||
// Process the results
|
||||
for _, log := range results.Logs {
|
||||
fmt.Printf("Found log: %s\n", log.ID)
|
||||
}
|
||||
|
||||
// Access aggregated stats
|
||||
fmt.Printf("Total requests matching the filters: %d\n", results.Stats.TotalRequests)
|
||||
```
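
For intuition about the aggregated statistics returned alongside the logs, the sketch below computes the same kind of figures (total requests, success rate, average latency, total tokens, total cost) over an in-memory slice. The `logRecord` and `searchStats` types and the `aggregate` function are hypothetical; a real store would more likely compute these in SQL.

```go
package main

import "fmt"

// logRecord is a trimmed-down log entry (hypothetical type).
type logRecord struct {
	Status      string
	LatencyMs   float64
	TotalTokens int
	Cost        float64
}

// searchStats holds the aggregates for one query (hypothetical type).
type searchStats struct {
	TotalRequests  int
	SuccessRate    float64 // percentage, 0 to 100
	AverageLatency float64 // same unit as LatencyMs
	TotalTokens    int
	TotalCost      float64
}

// aggregate folds the matching records into summary statistics.
func aggregate(records []logRecord) searchStats {
	stats := searchStats{TotalRequests: len(records)}
	if len(records) == 0 {
		return stats // avoid dividing by zero
	}
	var successes int
	var latencySum float64
	for _, r := range records {
		if r.Status == "success" {
			successes++
		}
		latencySum += r.LatencyMs
		stats.TotalTokens += r.TotalTokens
		stats.TotalCost += r.Cost
	}
	stats.SuccessRate = float64(successes) / float64(len(records)) * 100
	stats.AverageLatency = latencySum / float64(len(records))
	return stats
}

func main() {
	stats := aggregate([]logRecord{
		{Status: "success", LatencyMs: 100, TotalTokens: 10, Cost: 0.5},
		{Status: "error", LatencyMs: 300, TotalTokens: 20, Cost: 0.25},
	})
	fmt.Printf("%+v\n", stats)
}
```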
|
||||
|
||||
The LogStore is an indispensable tool for observability in Bifrost, providing the detailed audit trail needed to monitor, debug, and analyze AI application performance and behavior effectively.
|
||||
412
docs/architecture/framework/model-catalog.mdx
Normal file
@@ -0,0 +1,412 @@
---
title: "Model Catalog"
description: "A centralized system for managing model information, pricing, and capabilities across all supported AI providers."
icon: "book-open"
---

The Model Catalog is a foundational component of Bifrost that provides a unified interface for managing AI models, including their pricing, capabilities, and availability. It serves as a centralized repository for all model-related information, enabling dynamic cost calculation, intelligent model routing, and efficient resource management.
|
||||
|
||||
<Info>
|
||||
**Related Documentation**: The Model Catalog powers Bifrost's intelligent routing system. See [Provider Routing](/providers/provider-routing) for detailed examples of how governance and load balancing use the catalog to make routing decisions, including cross-provider scenarios and weighted routing via proxy providers.
|
||||
</Info>
|
||||
|
||||
## Core Features
|
||||
|
||||
### **1. Automatic Pricing Synchronization**
|
||||
The Model Catalog manages pricing data through a two-phase approach:
|
||||
|
||||
**Startup Behavior:**
|
||||
- **With ConfigStore**: Downloads the pricing datasheet published by Maxim, persists it to the config store, and then loads it into memory for fast lookups.
|
||||
- **Without ConfigStore**: Downloads the pricing sheet directly into memory on every startup.
|
||||
|
||||
**Ongoing Synchronization:**
|
||||
- When ConfigStore is available, an automatic sync occurs every 24 hours by default (configurable via `PricingSyncInterval`) to keep pricing data current.
|
||||
- All pricing data is cached in memory for O(1) lookup performance during cost calculations.
|
||||
|
||||
This ensures that cost calculations always use the latest pricing information from AI providers while maintaining optimal performance.
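
The pattern described above (load once at startup, refresh periodically, serve lookups from memory) can be sketched as follows. This is an illustration under assumptions, not the catalog's code: the `pricing` and `pricingCache` types, the `refresh` and `syncLoop` helpers, the fetch function, and the sample prices are all hypothetical.

```go
package main

import (
	"context"
	"fmt"
	"sync"
	"time"
)

// pricing holds two example rates for one model (hypothetical type).
type pricing struct {
	InputCostPerToken  float64
	OutputCostPerToken float64
}

// pricingCache is an in-memory map guarded by a read/write lock.
type pricingCache struct {
	mu   sync.RWMutex
	data map[string]pricing
}

// replace swaps in a freshly downloaded pricing table.
func (c *pricingCache) replace(fresh map[string]pricing) {
	c.mu.Lock()
	defer c.mu.Unlock()
	c.data = fresh
}

// lookup is a constant-time map read under the read lock.
func (c *pricingCache) lookup(model string) (pricing, bool) {
	c.mu.RLock()
	defer c.mu.RUnlock()
	p, ok := c.data[model]
	return p, ok
}

// refresh downloads the table and swaps it in; on failure the old data stays.
func refresh(fetch func() (map[string]pricing, error), cache *pricingCache) error {
	fresh, err := fetch()
	if err != nil {
		return err
	}
	cache.replace(fresh)
	return nil
}

// syncLoop refreshes the cache on every tick until ctx is cancelled.
func syncLoop(ctx context.Context, interval time.Duration, fetch func() (map[string]pricing, error), cache *pricingCache) {
	ticker := time.NewTicker(interval)
	defer ticker.Stop()
	for {
		select {
		case <-ctx.Done():
			return
		case <-ticker.C:
			if err := refresh(fetch, cache); err != nil {
				fmt.Println("pricing sync failed:", err)
			}
		}
	}
}

func main() {
	cache := &pricingCache{}
	fetch := func() (map[string]pricing, error) {
		// Illustrative numbers only, not real provider prices.
		return map[string]pricing{"example-model": {InputCostPerToken: 2e-6, OutputCostPerToken: 8e-6}}, nil
	}
	if err := refresh(fetch, cache); err != nil { // startup load
		fmt.Println("initial load failed:", err)
		return
	}
	// A short interval and timeout keep the demo quick; the documented default is 24 hours.
	ctx, cancel := context.WithTimeout(context.Background(), 50*time.Millisecond)
	defer cancel()
	syncLoop(ctx, 10*time.Millisecond, fetch, cache)
	_, ok := cache.lookup("example-model")
	fmt.Println("pricing available:", ok)
}
```

Keeping the previous table when a refresh fails means cost calculation continues to work through a temporary download outage, at the price of possibly stale rates.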
|
||||
|
||||
### **2. Multi-Modal Cost Calculation**
|
||||
It supports diverse pricing models across different AI operation types:
|
||||
- **Text Operations**: Token-based pricing for chat completions, text completions, responses, and embeddings. Cache-read/cache-write pricing applies to chat/text/responses when providers surface prompt cache token details.
|
||||
- **Audio Processing**: Character-based, token-based, and duration-based pricing for speech synthesis and transcription, with audio token detail breakdown. Speech responses populate `usage.input_chars` so speech can be billed by input characters in addition to tokens/duration.
|
||||
- **Image Processing**: Per-image (`input_cost_per_image`/`output_cost_per_image`), per-pixel (`input_cost_per_pixel`/`output_cost_per_pixel`), or token-based pricing with text/image token breakdown.
|
||||
- **Video Processing**: Token-based or duration-based pricing. Input can use prompt tokens or `input_cost_per_video_per_second`; output can use completion tokens or fall back to `output_cost_per_video_per_second` / `output_cost_per_second`.
|
||||
- **Reranking**: Input/output token pricing with search query cost support.
|
||||
- **Prompt Caching**: Separate rates for cache-read tokens (`cached_read_tokens`) and cache-creation tokens (`cached_write_tokens`), both surfaced under `prompt_tokens_details` (see [Prompt Cache Cost Calculation](#prompt-cache-cost-calculation)).
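
As a worked example of token-based pricing with separate prompt cache rates, the sketch below prices one text request. It assumes that cache-read and cache-write tokens are counted inside the prompt token total and are billed at their own rates instead of the regular input rate. That accounting, the type names, and the sample rates are assumptions for illustration and may differ from what `CalculateCost` does.

```go
package main

import "fmt"

// tokenUsage carries the token counts for one request (hypothetical type).
type tokenUsage struct {
	PromptTokens      int
	CompletionTokens  int
	CachedReadTokens  int
	CachedWriteTokens int
}

// tokenRates holds per-token prices in dollars (hypothetical type).
type tokenRates struct {
	Input      float64
	Output     float64
	CacheRead  float64
	CacheWrite float64
}

// textCost bills cached tokens at their own rates and the remaining prompt
// tokens at the regular input rate, then adds the completion tokens.
func textCost(u tokenUsage, r tokenRates) float64 {
	regular := u.PromptTokens - u.CachedReadTokens - u.CachedWriteTokens
	if regular < 0 {
		regular = 0 // guard against inconsistent usage data
	}
	return float64(regular)*r.Input +
		float64(u.CachedReadTokens)*r.CacheRead +
		float64(u.CachedWriteTokens)*r.CacheWrite +
		float64(u.CompletionTokens)*r.Output
}

func main() {
	usage := tokenUsage{PromptTokens: 1000, CompletionTokens: 200, CachedReadTokens: 600}
	// Illustrative rates only, not real provider prices.
	rates := tokenRates{Input: 3e-6, Output: 15e-6, CacheRead: 0.3e-6, CacheWrite: 3.75e-6}
	fmt.Printf("cost: $%.6f\n", textCost(usage, rates))
}
```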
|
||||
|
||||
### **3. Model Information Management**
|
||||
The Model Catalog maintains a pool of available models for each provider, populated from both pricing data and provider list models APIs. This enables:
|
||||
- **Model Discovery**: Listing all available models for a given provider
|
||||
- **Provider Discovery**: Finding all providers that support a specific model with intelligent cross-provider resolution (OpenRouter, Vertex, Groq, Bedrock)
|
||||
- **Model Validation**: Checking if a model is allowed for a provider based on allowed models lists (supports provider-prefixed entries)
|
||||
|
||||
### **4. Intelligent Cache Cost Handling**
|
||||
It integrates with semantic caching to provide accurate cost calculations:
|
||||
- **Cache Hits**: Zero cost for direct cache hits, and embedding cost only for semantic matches.
|
||||
- **Cache Misses**: Combined cost of the base model usage plus the embedding generation cost for cache storage.
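
The two rules above reduce to a small decision, sketched here. The `cacheOutcome` type and `requestCost` function are hypothetical names used only for this illustration.

```go
package main

import "fmt"

// cacheOutcome describes how the semantic cache handled a request.
type cacheOutcome int

const (
	cacheMiss   cacheOutcome = iota // request went to the model
	directHit                       // exact match served from cache
	semanticHit                     // similar request matched via embeddings
)

// requestCost applies the rules from the list above:
// direct hit = 0, semantic hit = embedding cost, miss = base + embedding.
func requestCost(outcome cacheOutcome, baseModelCost, embeddingCost float64) float64 {
	switch outcome {
	case directHit:
		return 0
	case semanticHit:
		return embeddingCost
	default:
		return baseModelCost + embeddingCost
	}
}

func main() {
	fmt.Println(requestCost(directHit, 0.5, 0.25))
	fmt.Println(requestCost(semanticHit, 0.5, 0.25))
	fmt.Println(requestCost(cacheMiss, 0.5, 0.25))
}
```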
|
||||
|
||||
### **5. Tiered Pricing Support**
|
||||
The system automatically applies different pricing rates for high-token contexts, reflecting real provider pricing models. Two tiers are supported: above 128k tokens and above 200k tokens, with the higher tier taking precedence when both are configured.
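
A minimal sketch of the tier selection follows. It assumes the threshold is compared against the request's total token count and that the selected rate applies to all tokens. Both points, along with the `tieredRate` type and its field names, are assumptions for illustration.

```go
package main

import "fmt"

// tieredRate holds a base per-token rate and optional high-context rates.
type tieredRate struct {
	Base      float64
	Above128k *float64 // nil when the tier is not configured
	Above200k *float64 // nil when the tier is not configured
}

// rateFor picks the rate for a request; the 200k tier is checked first so it
// takes precedence when both tiers are configured.
func (t tieredRate) rateFor(totalTokens int) float64 {
	if t.Above200k != nil && totalTokens > 200_000 {
		return *t.Above200k
	}
	if t.Above128k != nil && totalTokens > 128_000 {
		return *t.Above128k
	}
	return t.Base
}

// ptr is a small helper for building optional rates.
func ptr(v float64) *float64 { return &v }

func main() {
	rate := tieredRate{Base: 1, Above128k: ptr(2), Above200k: ptr(4)}
	for _, tokens := range []int{50_000, 150_000, 250_000} {
		fmt.Printf("%d tokens -> rate %v\n", tokens, rate.rateFor(tokens))
	}
}
```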
|
||||
|
||||
## Configuration
|
||||
|
||||
The `ModelCatalog` can be configured during initialization by passing a `Config` struct.
|
||||
|
||||
```go
|
||||
type Config struct {
|
||||
PricingURL *string `json:"pricing_url,omitempty"`
|
||||
PricingSyncInterval *time.Duration `json:"pricing_sync_interval,omitempty"`
|
||||
}
|
||||
```
|
||||
|
||||
- **`PricingURL`**: Overrides the default URL (`https://getbifrost.ai/datasheet`) for downloading the pricing sheet.
|
||||
- **`PricingSyncInterval`**: Customizes the interval for periodic pricing data synchronization. The default is 24 hours.
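
Because both fields are optional pointers, the effective values have to be resolved against the defaults. The sketch below shows one way to do that. The `catalogConfig` type and the `effectiveSettings` helper are hypothetical stand-ins, and treating an empty URL or a non-positive interval as "unset" is an assumption.

```go
package main

import (
	"fmt"
	"time"
)

// Defaults as stated in the documentation above.
const (
	defaultPricingURL          = "https://getbifrost.ai/datasheet"
	defaultPricingSyncInterval = 24 * time.Hour
)

// catalogConfig mirrors the two optional fields (hypothetical stand-in).
type catalogConfig struct {
	PricingURL          *string
	PricingSyncInterval *time.Duration
}

// effectiveSettings returns the configured values, falling back to the
// defaults for anything that is nil or empty.
func effectiveSettings(cfg *catalogConfig) (string, time.Duration) {
	pricingURL, interval := defaultPricingURL, defaultPricingSyncInterval
	if cfg == nil {
		return pricingURL, interval
	}
	if cfg.PricingURL != nil && *cfg.PricingURL != "" {
		pricingURL = *cfg.PricingURL
	}
	if cfg.PricingSyncInterval != nil && *cfg.PricingSyncInterval > 0 {
		interval = *cfg.PricingSyncInterval
	}
	return pricingURL, interval
}

func main() {
	custom := "https://my-custom-url.com/pricing.json"
	pricingURL, interval := effectiveSettings(&catalogConfig{PricingURL: &custom})
	fmt.Println(pricingURL, interval)
}
```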
|
||||
|
||||
This configuration is passed during the initialization of the `ModelCatalog`:
|
||||
|
||||
```go
pricingURL := "https://my-custom-url.com/pricing.json"
config := &modelcatalog.Config{
	PricingURL: &pricingURL, // PricingURL is a *string, so pass a pointer
}
modelCatalog, err := modelcatalog.Init(context.Background(), config, configStore, logger)
```
|
||||
|
||||
## Architecture
|
||||
|
||||
### ModelCatalog
|
||||
The `ModelCatalog` is the central component that handles all model and pricing operations:
|
||||
|
||||
```go
|
||||
type ModelCatalog struct {
|
||||
configStore configstore.ConfigStore
|
||||
logger schemas.Logger
|
||||
|
||||
pricingURL string
|
||||
pricingSyncInterval time.Duration
|
||||
|
||||
// In-memory cache for fast access
|
||||
pricingData map[string]configstoreTables.TableModelPricing
|
||||
mu sync.RWMutex
|
||||
|
||||
modelPool map[schemas.ModelProvider][]string
|
||||
|
||||
// Background sync worker
|
||||
syncTicker *time.Ticker
|
||||
done chan struct{}
|
||||
wg sync.WaitGroup
|
||||
syncCtx context.Context
|
||||
syncCancel context.CancelFunc
|
||||
}
|
||||
```
|
||||
|
||||
### Pricing Data Structure
|
||||
Each model's pricing information includes comprehensive cost metrics, supporting various modalities and tiered pricing:
|
||||
|
||||
```go
|
||||
// PricingEntry represents a single model's pricing information.
|
||||
// The fields below are an excerpt — see framework/modelcatalog/main.go for the full definition.
|
||||
type PricingEntry struct {
|
||||
BaseModel string `json:"base_model,omitempty"`
|
||||
Provider string `json:"provider"`
|
||||
Mode string `json:"mode"`
|
||||
|
||||
// Costs - Text
|
||||
InputCostPerToken float64 `json:"input_cost_per_token"`
|
||||
OutputCostPerToken float64 `json:"output_cost_per_token"`
|
||||
InputCostPerTokenBatches *float64 `json:"input_cost_per_token_batches,omitempty"`
|
||||
OutputCostPerTokenBatches *float64 `json:"output_cost_per_token_batches,omitempty"`
|
||||
InputCostPerTokenPriority *float64 `json:"input_cost_per_token_priority,omitempty"`
|
||||
OutputCostPerTokenPriority *float64 `json:"output_cost_per_token_priority,omitempty"`
|
||||
InputCostPerTokenAbove200kTokens *float64 `json:"input_cost_per_token_above_200k_tokens,omitempty"`
|
||||
OutputCostPerTokenAbove200kTokens *float64 `json:"output_cost_per_token_above_200k_tokens,omitempty"`
|
||||
|
||||
// Costs - Cache
|
||||
CacheCreationInputTokenCost *float64 `json:"cache_creation_input_token_cost,omitempty"`
|
||||
CacheReadInputTokenCost *float64 `json:"cache_read_input_token_cost,omitempty"`
|
||||
CacheCreationInputTokenCostAbove200kTokens *float64 `json:"cache_creation_input_token_cost_above_200k_tokens,omitempty"`
|
||||
CacheReadInputTokenCostAbove200kTokens *float64 `json:"cache_read_input_token_cost_above_200k_tokens,omitempty"`
|
||||
CacheCreationInputTokenCostAbove1hr *float64 `json:"cache_creation_input_token_cost_above_1hr,omitempty"`
|
||||
CacheCreationInputTokenCostAbove1hrAbove200kTokens *float64 `json:"cache_creation_input_token_cost_above_1hr_above_200k_tokens,omitempty"`
|
||||
CacheCreationInputAudioTokenCost *float64 `json:"cache_creation_input_audio_token_cost,omitempty"`
|
||||
CacheReadInputTokenCostPriority *float64 `json:"cache_read_input_token_cost_priority,omitempty"`
|
||||
|
||||
// Costs - Image
|
||||
InputCostPerImage *float64 `json:"input_cost_per_image,omitempty"`
|
||||
InputCostPerPixel *float64 `json:"input_cost_per_pixel,omitempty"`
|
||||
OutputCostPerImage *float64 `json:"output_cost_per_image,omitempty"`
|
||||
OutputCostPerPixel *float64 `json:"output_cost_per_pixel,omitempty"`
|
||||
OutputCostPerImagePremiumImage *float64 `json:"output_cost_per_image_premium_image,omitempty"`
|
||||
OutputCostPerImageAbove512x512Pixels *float64 `json:"output_cost_per_image_above_512_and_512_pixels,omitempty"`
|
||||
OutputCostPerImageAbove512x512PixelsPremium *float64 `json:"output_cost_per_image_above_512_and_512_pixels_and_premium_image,omitempty"`
|
||||
OutputCostPerImageAbove1024x1024Pixels *float64 `json:"output_cost_per_image_above_1024_and_1024_pixels,omitempty"`
|
||||
OutputCostPerImageAbove1024x1024PixelsPremium *float64 `json:"output_cost_per_image_above_1024_and_1024_pixels_and_premium_image,omitempty"`
|
||||
OutputCostPerImageAbove2048x2048Pixels *float64 `json:"output_cost_per_image_above_2048_and_2048_pixels,omitempty"`
|
||||
OutputCostPerImageAbove4096x4096Pixels *float64 `json:"output_cost_per_image_above_4096_and_4096_pixels,omitempty"`
|
||||
OutputCostPerImageLowQuality *float64 `json:"output_cost_per_image_low_quality,omitempty"`
|
||||
OutputCostPerImageMediumQuality *float64 `json:"output_cost_per_image_medium_quality,omitempty"`
|
||||
OutputCostPerImageHighQuality *float64 `json:"output_cost_per_image_high_quality,omitempty"`
|
||||
OutputCostPerImageAutoQuality *float64 `json:"output_cost_per_image_auto_quality,omitempty"`
|
||||
// Costs - Audio/Video
|
||||
InputCostPerAudioToken *float64 `json:"input_cost_per_audio_token,omitempty"`
|
||||
InputCostPerAudioPerSecond *float64 `json:"input_cost_per_audio_per_second,omitempty"`
|
||||
InputCostPerSecond *float64 `json:"input_cost_per_second,omitempty"`
|
||||
InputCostPerVideoPerSecond *float64 `json:"input_cost_per_video_per_second,omitempty"`
|
||||
OutputCostPerAudioToken *float64 `json:"output_cost_per_audio_token,omitempty"`
|
||||
OutputCostPerVideoPerSecond *float64 `json:"output_cost_per_video_per_second,omitempty"`
|
||||
OutputCostPerSecond *float64 `json:"output_cost_per_second,omitempty"`
|
||||
|
||||
// Costs - Other
|
||||
SearchContextCostPerQuery *float64 `json:"search_context_cost_per_query,omitempty"`
|
||||
CodeInterpreterCostPerSession *float64 `json:"code_interpreter_cost_per_session,omitempty"`
|
||||
}
|
||||
```
|
||||
|
||||
## Usage in Plugins
|
||||
|
||||
The Model Catalog is designed to be shared across all Bifrost plugins, providing consistent model information and validation logic for governance, load balancing, and other routing mechanisms.
|
||||
|
||||
<Note>
|
||||
**Governance & Load Balancing**: Both plugins delegate model validation to the Model Catalog's `IsModelAllowedForProvider` method, ensuring consistent handling of cross-provider scenarios and provider-prefixed allowed models. See [Provider Routing](/providers/provider-routing) for configuration examples.
|
||||
</Note>
|
||||
|
||||
### Initialization
|
||||
In Bifrost's gateway, the `ModelCatalog` is initialized once at startup and shared across all plugins:
|
||||
|
||||
```go
|
||||
import "github.com/maximhq/bifrost/framework/modelcatalog"
|
||||
|
||||
// Initialize model catalog with config store and logger
|
||||
modelCatalog, err := modelcatalog.Init(context.Background(), &modelcatalog.Config{}, configStore, logger)
|
||||
if err != nil {
|
||||
return fmt.Errorf("failed to initialize model catalog: %w", err)
|
||||
}
|
||||
```
|
||||
|
||||
### Basic Cost Calculation
|
||||
Calculate costs from a Bifrost response:
|
||||
|
||||
```go
|
||||
// Calculate cost for a completed request
|
||||
cost := modelCatalog.CalculateCost(
|
||||
result, // *schemas.BifrostResponse
|
||||
nil, // *PricingLookupScopes (nil = no scoped overrides)
|
||||
)
|
||||
|
||||
logger.Info("Request cost: $%.6f", cost)
|
||||
```
|
||||
|
||||
### Unified Cost Calculation
|
||||
`CalculateCost` is the single entry point for all cost calculations. It handles all request types, semantic cache billing, and tiered pricing automatically:
|
||||
|
||||
```go
|
||||
// CalculateCost handles all cost scenarios including cache-aware pricing
|
||||
cost := modelCatalog.CalculateCost(result, nil) // *schemas.BifrostResponse, *PricingLookupScopes
|
||||
|
||||
// Cache hits return 0 for direct hits, embedding cost for semantic matches
|
||||
// Cache misses return base model cost + embedding generation cost
|
||||
// Returns 0.0 if pricing data is not found (logs a debug message)
|
||||
```
|
||||
|
||||
### Model Discovery
|
||||
The `ModelCatalog` provides several methods to query for model and provider information.
|
||||
|
||||
#### Get Models for a Provider
|
||||
Retrieve a list of all models supported by a specific provider.
|
||||
```go
|
||||
openaiModels := modelCatalog.GetModelsForProvider(schemas.OpenAI)
|
||||
for _, model := range openaiModels {
|
||||
logger.Info("Found OpenAI model: %s", model)
|
||||
}
|
||||
```
|
||||
|
||||
**Thread-safe**: Uses a read lock, so concurrent callers can query the pool safely.
|
||||
|
||||
#### Get Providers for a Model
|
||||
Find all providers that offer a specific model, including cross-provider resolution.
|
||||
|
||||
```go
|
||||
gpt4Providers := modelCatalog.GetProvidersForModel("gpt-4o")
|
||||
for _, provider := range gpt4Providers {
|
||||
logger.Info("gpt-4o is available from: %s", provider)
|
||||
}
|
||||
// Result: [openai, azure, groq] (includes cross-provider mappings)
|
||||
```
|
||||
|
||||
**Cross-Provider Resolution**:
|
||||
|
||||
This method implements intelligent cross-provider routing logic to discover all providers that can serve a model:
|
||||
|
||||
1. **Direct Match**: Checks each provider's model list in `modelPool` for the exact model name
|
||||
2. **OpenRouter Format**: For models found in other providers, checks if `provider/model` exists in OpenRouter
|
||||
- Example: `claude-3-5-sonnet` found in Anthropic → checks OpenRouter for `anthropic/claude-3-5-sonnet`
|
||||
3. **Vertex Format**: Similar check for Vertex with `provider/model` format
|
||||
4. **Groq OpenAI Compatibility**: For GPT models, checks if `openai/model` exists in Groq's catalog
|
||||
5. **Bedrock Claude Models**: For Claude models, flexible matching against Bedrock's full ARN format
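
The five steps above can be sketched over a plain in-memory pool. This is a simplified approximation and not the catalog's implementation: the string-keyed `modelPool`, the `gpt-` and `claude` checks, and the substring match used for Bedrock identifiers are assumptions, and the pool contents are sample data.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// modelPool maps a provider name to the model names it lists.
type modelPool map[string][]string

func contains(models []string, name string) bool {
	for _, m := range models {
		if m == name {
			return true
		}
	}
	return false
}

// providersForModel returns, in sorted order, every provider that can serve model.
func providersForModel(pool modelPool, model string) []string {
	found := map[string]bool{}

	// Step 1: direct match on the exact model name.
	var direct []string
	for provider, models := range pool {
		if contains(models, model) {
			found[provider] = true
			direct = append(direct, provider)
		}
	}

	// Steps 2 and 3: proxy providers list models as "provider/model".
	for _, proxy := range []string{"openrouter", "vertex"} {
		for _, provider := range direct {
			if contains(pool[proxy], provider+"/"+model) {
				found[proxy] = true
			}
		}
	}

	// Step 4: Groq exposes OpenAI-compatible GPT models as "openai/model".
	if strings.HasPrefix(model, "gpt-") && contains(pool["groq"], "openai/"+model) {
		found["groq"] = true
	}

	// Step 5: Bedrock identifiers embed the Claude model name.
	if strings.Contains(model, "claude") {
		for _, entry := range pool["bedrock"] {
			if strings.Contains(entry, model) {
				found["bedrock"] = true
				break
			}
		}
	}

	result := make([]string, 0, len(found))
	for provider := range found {
		result = append(result, provider)
	}
	sort.Strings(result) // map iteration order is random, so sort for stable output
	return result
}

// examplePool returns sample data for the demo.
func examplePool() modelPool {
	return modelPool{
		"anthropic":  {"claude-3-5-sonnet"},
		"openai":     {"gpt-4o"},
		"openrouter": {"anthropic/claude-3-5-sonnet", "openai/gpt-4o"},
		"vertex":     {"anthropic/claude-3-5-sonnet"},
		"groq":       {"openai/gpt-4o"},
		"bedrock":    {"anthropic.claude-3-5-sonnet-20240620-v1:0"},
	}
}

func main() {
	pool := examplePool()
	fmt.Println(providersForModel(pool, "claude-3-5-sonnet")) // [anthropic bedrock openrouter vertex]
	fmt.Println(providersForModel(pool, "gpt-4o"))            // [groq openai openrouter]
}
```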
|
||||
|
||||
**Example**:
|
||||
```go
|
||||
providers := modelCatalog.GetProvidersForModel("claude-3-5-sonnet")
|
||||
// Returns: [anthropic, vertex, bedrock, openrouter]
|
||||
// Even though the request was just "claude-3-5-sonnet", without a provider prefix
|
||||
```
|
||||
|
||||
<Note>
|
||||
This cross-provider logic powers Bifrost's intelligent routing capabilities. See [Provider Routing](/providers/provider-routing#the-model-catalog) for detailed examples of how this enables features like weighted routing via proxy providers.
|
||||
</Note>
|
||||
|
||||
#### Check Model Allowance for Provider
|
||||
Validate if a model is allowed for a specific provider based on an allowed models list. This method is used internally by governance and load balancing plugins.
|
||||
|
||||
```go
|
||||
// ["*"] wildcard - uses catalog to determine support
|
||||
isAllowed := modelCatalog.IsModelAllowedForProvider(
|
||||
schemas.OpenRouter,
|
||||
"gpt-4o",
|
||||
schemas.WhiteList{"*"}, // wildcard = check catalog
|
||||
)
|
||||
// Returns: true (catalog knows OpenRouter supports openai/gpt-4o)
|
||||
|
||||
// Explicit allowedModels with provider prefix
|
||||
isAllowed = modelCatalog.IsModelAllowedForProvider(
|
||||
schemas.OpenRouter,
|
||||
"gpt-4o",
|
||||
schemas.WhiteList{"openai/gpt-4o", "anthropic/claude-3-5-sonnet"},
|
||||
)
|
||||
// Returns: true (strips "openai/" prefix and matches "gpt-4o")
|
||||
|
||||
// Explicit allowedModels without prefix
|
||||
isAllowed := modelCatalog.IsModelAllowedForProvider(
|
||||
schemas.OpenAI,
|
||||
"gpt-4o",
|
||||
schemas.WhiteList{"gpt-4o", "gpt-4o-mini"},
|
||||
)
|
||||
// Returns: true (direct match)
|
||||
```
|
||||
|
||||
**Behavior**:
- **`["*"]` wildcard**: Delegates to `GetProvidersForModel` (includes cross-provider logic); this is the "allow all via catalog" mode
- **Non-empty explicit list**: Checks for both direct matches and provider-prefixed entries
  - Direct: `"gpt-4o"` matches `"gpt-4o"`
  - Prefixed: `"openai/gpt-4o"` matches a request for `"gpt-4o"` (prefix stripped)
- **Empty slice (`[]string{}` / empty `schemas.WhiteList`)**: Returns `false` (deny-all), mirroring the config deny-by-default semantics

<Note>
In `config.json` and the governance API, `allowed_models: []` (empty array) means **deny all models** (deny-by-default, v1.5.0+). The Go helper `IsModelAllowedForProvider` behaves the same way: an empty `allowedModels` slice also returns `false`. Use `["*"]` to allow all models validated through the catalog.
</Note>

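The three behaviors described above can be condensed into a small decision function. The sketch below is illustrative only: `isModelAllowed` and the `inCatalog` callback are hypothetical names, with the callback standing in for a catalog lookup such as `GetProvidersForModel`.

```go
package main

import (
	"fmt"
	"strings"
)

// isModelAllowed is a hypothetical, simplified version of the allow-list check.
// inCatalog stands in for the catalog lookup used in wildcard mode.
func isModelAllowed(model string, allowed []string, inCatalog func(string) bool) bool {
	// Empty list: deny by default.
	if len(allowed) == 0 {
		return false
	}
	// ["*"]: defer to the catalog.
	if len(allowed) == 1 && allowed[0] == "*" {
		return inCatalog(model)
	}
	for _, entry := range allowed {
		// Direct match, e.g. "gpt-4o" == "gpt-4o".
		if entry == model {
			return true
		}
		// Prefixed match: strip "provider/" and compare the remainder.
		if i := strings.Index(entry, "/"); i >= 0 && entry[i+1:] == model {
			return true
		}
	}
	return false
}

func main() {
	catalog := func(model string) bool { return model == "gpt-4o" }

	fmt.Println(isModelAllowed("gpt-4o", []string{"*"}, catalog))             // wildcard
	fmt.Println(isModelAllowed("gpt-4o", []string{"openai/gpt-4o"}, catalog)) // prefixed entry
	fmt.Println(isModelAllowed("gpt-4o", []string{}, catalog))                // empty list
}
```

Keeping the empty-list check first makes the deny-by-default rule impossible to bypass through the later branches.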
**Use Cases**:
- **Governance Routing**: Validate if a model request is allowed for a provider configuration
- **Load Balancing**: Filter providers based on allowed models before performance scoring
- **Virtual Key Validation**: Check if a model can be used with a specific virtual key's provider configs

<Tip>
This method is the central validation point for both governance and load balancing plugins, ensuring consistent model allowance logic across all routing mechanisms. It handles all edge cases including proxy providers (OpenRouter, Vertex) and provider-prefixed model entries.
</Tip>

#### Dynamically Add Models

You can dynamically add models to the catalog's pool from a `v1/models`-compatible response structure. This is useful for providers that expose a model list endpoint.
```go
// response is *schemas.BifrostListModelsResponse
modelCatalog.AddModelDataToPool(response)
```
The Bifrost gateway does this automatically during initialization for every provider it supports.

**When to use**:
- After fetching models from a provider's `/v1/models` endpoint
- When a new provider is dynamically added at runtime
- For testing with custom model lists

### Reloading Configuration

You can reload the pricing configuration at runtime if you need to change the pricing URL or sync interval.
```go
newConfig := &modelcatalog.Config{
	PricingSyncInterval: 12 * time.Hour,
}
err := modelCatalog.UpdateSyncConfig(ctx, newConfig)
if err != nil {
	// Handle or log the error
}
```

## Error Handling and Fallbacks

The Model Catalog handles missing pricing data gracefully with intelligent fallbacks:

```go
// resolvePricing resolves the pricing entry for a model, trying deployment as fallback.
func (mc *ModelCatalog) resolvePricing(provider, model, deployment string, requestType schemas.RequestType) *configstoreTables.TableModelPricing {
	pricing, exists := mc.getPricing(model, provider, requestType)
	if exists {
		return pricing
	}
	// If pricing not found for model, try the deployment name
	if deployment != "" {
		pricing, exists = mc.getPricing(deployment, provider, requestType)
		if exists {
			return pricing
		}
	}
	return nil
}

// getPricing returns pricing information for a model (thread-safe).
// It implements a multi-step fallback chain:
// 1. Direct lookup by model + provider + mode
// 2. Gemini → Vertex provider fallback
// 3. Vertex "provider/model" prefix stripping
// 4. Bedrock "anthropic." prefix addition for Claude models
// 5. Responses → Chat mode fallback (at each step)
// 6. ImageEdit / ImageVariation → ImageGeneration mode fallback
func (mc *ModelCatalog) getPricing(model, provider string, requestType schemas.RequestType) (*configstoreTables.TableModelPricing, bool) {
	mc.mu.RLock()
	defer mc.mu.RUnlock()

	mode := normalizeRequestType(requestType)

	pricing, ok := mc.pricingData[makeKey(model, provider, mode)]
	if ok {
		return &pricing, true
	}

	// Provider-specific fallbacks (Gemini→Vertex, Vertex prefix strip, Bedrock anthropic. prefix)
	// Each fallback also tries Responses→Chat mode if applicable
	// ...

	// Final fallback: Responses → Chat mode for any provider
	if requestType == schemas.ResponsesRequest || requestType == schemas.ResponsesStreamRequest {
		pricing, ok = mc.pricingData[makeKey(model, provider, normalizeRequestType(schemas.ChatCompletionRequest))]
		if ok {
			return &pricing, true
		}
	}

	return nil, false
}

// When pricing is not found, CalculateCost returns 0.0 and logs a debug message.
// This ensures operations continue smoothly without billing failures.
```

## Cleanup and Lifecycle Management

Properly clean up resources when shutting down:

```go
// Cleanup model catalog resources
defer func() {
	if err := modelCatalog.Cleanup(); err != nil {
		logger.Error("Failed to cleanup model catalog: %v", err)
	}
}()
```

## Thread Safety

All `ModelCatalog` operations are thread-safe, making it suitable for concurrent usage across multiple plugins and goroutines. The internal pricing data cache uses read-write mutexes for optimal performance during frequent lookups.

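The read-write mutex pattern mentioned above can be illustrated with a minimal cache. This is a sketch, not the catalog's code: `priceCache`, its methods, and the placeholder key and value are hypothetical. Many readers can hold the read lock at once, while a sync that swaps in fresh data takes the exclusive write lock.

```go
package main

import (
	"fmt"
	"sync"
)

// priceCache is a hypothetical cache guarded by a read-write mutex.
type priceCache struct {
	mu   sync.RWMutex
	data map[string]float64
}

func newPriceCache() *priceCache {
	return &priceCache{data: map[string]float64{}}
}

// get takes the shared read lock, so concurrent lookups do not block each other.
func (c *priceCache) get(key string) (float64, bool) {
	c.mu.RLock()
	defer c.mu.RUnlock()
	v, ok := c.data[key]
	return v, ok
}

// replace takes the exclusive write lock and swaps in a new data set,
// as a periodic sync might do.
func (c *priceCache) replace(next map[string]float64) {
	c.mu.Lock()
	defer c.mu.Unlock()
	c.data = next
}

func main() {
	c := newPriceCache()
	c.replace(map[string]float64{"model-a": 1.0}) // placeholder value, not real pricing

	var wg sync.WaitGroup
	for i := 0; i < 8; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			c.get("model-a") // concurrent readers share the lock
		}()
	}
	wg.Wait()

	v, ok := c.get("model-a")
	fmt.Println(v, ok)
}
```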
## Best Practices

1. **Shared Instance**: Use a single `ModelCatalog` instance across all plugins to avoid redundant data synchronization.
2. **Error Handling**: Always handle the case where `CalculateCost` returns 0.0 because pricing data for the model is missing.
3. **Logging**: Monitor pricing sync failures and missing model warnings in production.
4. **Cache Awareness**: Use `CalculateCost`, which automatically handles cache hits/misses and embedding costs.
5. **Resource Cleanup**: Always call `Cleanup()` during application shutdown to prevent resource leaks.

The Model Catalog provides a robust, production-ready foundation for implementing billing, budgeting, and cost monitoring features in Bifrost plugins.
130
docs/architecture/framework/streaming.mdx
Normal file
@@ -0,0 +1,130 @@
---
title: "Streaming"
description: "Framework utility for aggregating and processing real-time stream chunks from AI providers"
icon: "water"
---

## Overview

The **Streaming** package (`framework/streaming`) is a core utility within Bifrost designed to handle real-time data streams from AI providers. It provides a robust and efficient mechanism for plugins like [Logging](/features/observability/default), [OTel](/features/observability/otel), and [Maxim](/features/observability/maxim) to process, aggregate, and format streaming responses for chat completions, transcriptions, and other real-time AI interactions.

```mermaid
sequenceDiagram
    participant Plugin
    participant BC as Bifrost Core
    participant Accumulator

    BC->>Plugin: PreLLMHook(StreamingRequest)
    activate Plugin
    Plugin->>Accumulator: CreateStreamAccumulator(requestID)
    activate Accumulator
    Accumulator-->>Plugin: ack
    deactivate Accumulator
    Plugin-->>BC: return
    deactivate Plugin

    loop For each response chunk
        BC->>Plugin: PostLLMHook(StreamChunk)
        activate Plugin
        Plugin->>Accumulator: ProcessStreamingResponse(StreamChunk)
        activate Accumulator
        alt Is NOT Final Chunk
            Accumulator-->>Plugin: return {Type: Delta}
        else Is Final Chunk
            Accumulator->>Accumulator: buildCompleteResponse()
            Accumulator-->>Plugin: return {Type: Final, CompleteData}
        end
        deactivate Accumulator
        Plugin-->>BC: return
        deactivate Plugin
    end
```

Its primary purpose is to simplify the complexity of handling chunked data, ensuring that plugins can work with complete, well-structured responses without needing to implement their own aggregation logic.

## How It Works

The streaming package uses an `Accumulator` to manage the lifecycle of a streaming operation. To keep memory allocations low, it reuses chunk objects through a `sync.Pool`.

1. **Initialization**: When a plugin that needs to process streams (like `logging` or `otel`) is initialized, it creates a new `streaming.Accumulator`.

2. **Stream Start**: In the `PreLLMHook` phase of a request, if the request is identified as a streaming type, the plugin calls `accumulator.CreateStreamAccumulator(requestID, timestamp)` to prepare a dedicated buffer for the incoming chunks of that request.

3. **Chunk Processing**: In the `PostLLMHook` phase, as each chunk of the streaming response arrives, the plugin passes it to `accumulator.ProcessStreamingResponse()`.
   * For each `delta` chunk, the accumulator appends it to the buffer associated with the request ID.
   * The accumulator handles different types of streams, including chat, audio, and transcriptions, using specialized logic to correctly piece together the data. For example, it accumulates text deltas, tool call argument deltas, and other parts of the message.

4. **Finalization**: When the final chunk of the stream is received (indicated by a `finish_reason` or other provider-specific signal), `ProcessStreamingResponse` performs the final assembly.
   * It reconstructs the complete `ChatMessage` or other response object from all the stored chunks.
   * It calculates total token usage, cost, and latency.
   * It returns a `ProcessedStreamResponse` object with `StreamResponseTypeFinal` and the complete, structured `AccumulatedData`.

5. **Cleanup**: Once the final response is processed, the accumulator cleans up all buffered chunks for that request ID, returning them to the `sync.Pool` for reuse.

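The lifecycle above (create a buffer, append chunks, assemble on the final chunk, then recycle) can be sketched in a few lines. The types and method names below (`chunk`, `stream`, `accumulator`, `create`, `process`) are hypothetical and much simpler than the real package; they only illustrate how `sync.Map` and `sync.Pool` fit together.

```go
package main

import (
	"fmt"
	"strings"
	"sync"
)

// chunk is a hypothetical, minimal stream chunk.
type chunk struct {
	text string
}

// stream buffers the chunks of one request.
type stream struct {
	mu     sync.Mutex
	chunks []*chunk
}

// accumulator tracks one stream per request ID and recycles chunk objects.
type accumulator struct {
	streams sync.Map  // requestID -> *stream
	pool    sync.Pool // recycles *chunk values
}

func newAccumulator() *accumulator {
	return &accumulator{pool: sync.Pool{New: func() any { return new(chunk) }}}
}

// create prepares a buffer for a request (the "Stream Start" step).
func (a *accumulator) create(requestID string) {
	a.streams.LoadOrStore(requestID, &stream{})
}

// process buffers one chunk. On the final chunk it returns the assembled
// text and true, returns all chunks to the pool, and drops the stream.
func (a *accumulator) process(requestID, text string, final bool) (string, bool) {
	v, ok := a.streams.Load(requestID)
	if !ok {
		return "", false
	}
	s := v.(*stream)

	c := a.pool.Get().(*chunk)
	c.text = text

	s.mu.Lock()
	defer s.mu.Unlock()
	s.chunks = append(s.chunks, c)
	if !final {
		return "", false
	}

	var b strings.Builder
	for _, buffered := range s.chunks {
		b.WriteString(buffered.text)
		*buffered = chunk{} // reset before reuse
		a.pool.Put(buffered)
	}
	s.chunks = nil
	a.streams.Delete(requestID)
	return b.String(), true
}

func main() {
	acc := newAccumulator()
	acc.create("req-1")
	acc.process("req-1", "Hel", false)
	out, done := acc.process("req-1", "lo", true)
	fmt.Println(out, done) // prints: Hello true
}
```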
## Key Components

### `Accumulator`

The central component of the package. It is a thread-safe manager that:
- Tracks stream chunks for multiple concurrent requests using a `sync.Map`.
- Uses `sync.Pool` to recycle `*StreamChunk` objects, reducing garbage collection overhead.
- Provides methods to add chunks (`addChatStreamChunk`, `addAudioStreamChunk`, etc.).
- Includes a periodic cleanup worker to remove stale accumulators for incomplete or orphaned requests.

### `ProcessStreamingResponse`

This is the main entry point for plugins to process stream data. It inspects the response type and delegates to the appropriate handler:
- `processChatStreamingResponse`
- `processAudioStreamingResponse`
- `processTranscriptionStreamingResponse`
- `processResponsesStreamingResponse`

It returns a `ProcessedStreamResponse`, which indicates whether the chunk is a `delta` or the `final` aggregated response.

### Stream-Specific Builders

The package includes internal logic to correctly build complete messages from chunks. For example, `buildCompleteMessageFromChatStreamChunks` iterates through the collected `ChatStreamChunk` objects, appending content deltas and assembling tool calls into a final, coherent `schemas.ChatMessage`.

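To make the tool-call assembly step more concrete, here is a small sketch of merging argument fragments that arrive across several chunks. The `toolCallDelta` and `toolCall` types and the `assembleToolCalls` helper are hypothetical and do not mirror the real schema types; the sketch assumes each fragment carries the index of the tool call it belongs to and that the name arrives on the first fragment.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// toolCallDelta is a hypothetical fragment of a streamed tool call.
type toolCallDelta struct {
	Index     int    // which tool call this fragment belongs to
	Name      string // assumed to be set only on the first fragment
	Arguments string // partial JSON text
}

// toolCall is the assembled result.
type toolCall struct {
	Name      string
	Arguments string
}

// assembleToolCalls concatenates argument fragments per index, in arrival
// order, and returns the calls sorted by index.
func assembleToolCalls(deltas []toolCallDelta) []toolCall {
	names := map[int]string{}
	args := map[int]*strings.Builder{}
	for _, d := range deltas {
		if d.Name != "" {
			names[d.Index] = d.Name
		}
		if args[d.Index] == nil {
			args[d.Index] = &strings.Builder{}
		}
		args[d.Index].WriteString(d.Arguments)
	}

	indexes := make([]int, 0, len(args))
	for i := range args {
		indexes = append(indexes, i)
	}
	sort.Ints(indexes)

	calls := make([]toolCall, 0, len(indexes))
	for _, i := range indexes {
		calls = append(calls, toolCall{Name: names[i], Arguments: args[i].String()})
	}
	return calls
}

func main() {
	calls := assembleToolCalls([]toolCallDelta{
		{Index: 0, Name: "get_weather", Arguments: `{"city":`},
		{Index: 0, Arguments: `"Paris"}`},
	})
	fmt.Println(calls[0].Name, calls[0].Arguments)
	// prints: get_weather {"city":"Paris"}
}
```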
## Usage Example

The following snippet from the `logging` plugin shows how the `streaming` package is used in practice within a plugin's `PostLLMHook`.

```go
// In plugins/logging/main.go

func (p *LoggerPlugin) PostLLMHook(ctx *schemas.BifrostContext, result *schemas.BifrostResponse, bifrostErr *schemas.BifrostError) (*schemas.BifrostResponse, *schemas.BifrostError, error) {
	// ... setup, get requestID ...

	go func() {
		// ...
		if bifrost.IsStreamRequestType(requestType) {
			p.logger.Debug("[logging] processing streaming response")

			// 1. Pass the response chunk to the accumulator
			streamResponse, err := p.accumulator.ProcessStreamingResponse(ctx, result, bifrostErr)
			if err != nil {
				p.logger.Error("failed to process streaming response: %v", err)
			// 2. Check if this is the final, aggregated response
			} else if streamResponse != nil && streamResponse.Type == streaming.StreamResponseTypeFinal {
				// Prepare final log data
				logMsg.Operation = LogOperationStreamUpdate
				logMsg.StreamResponse = streamResponse

				// 3. Update the log entry with the complete data
				processingErr := retryOnNotFound(p.ctx, func() error {
					return p.updateStreamingLogEntry(p.ctx, logMsg.RequestID, logMsg.SemanticCacheDebug, logMsg.StreamResponse, true)
				})

				// ... handle errors and callbacks ...
			}
		}
		// ... handle non-streaming responses ...
	}()

	return result, bifrostErr, nil
}
```

This demonstrates how a plugin can remain agnostic to the details of stream aggregation and simply react to the final, complete data returned by the `streaming` package. This greatly simplifies plugin development and ensures consistent data handling across the framework.
185
docs/architecture/framework/vector-store.mdx
Normal file
@@ -0,0 +1,185 @@
---
title: "Vector Store"
description: "Vector database implementations for semantic search, embeddings storage, and AI-powered features in Bifrost."
icon: "diagram-project"
---

## Overview

The VectorStore is a core component of Bifrost's framework package that provides a unified interface for vector database operations. It enables plugins to store embeddings, perform similarity searches, and build AI-powered features like semantic caching, content recommendations, and knowledge retrieval.

**Key Capabilities:**
- **Vector Similarity Search**: Find semantically similar content using embeddings
- **Namespace Management**: Organize data into separate collections with custom schemas
- **Flexible Filtering**: Query data with complex filters and pagination
- **Multiple Backends**: Support for Weaviate, Redis/Valkey-compatible, Qdrant, and Pinecone vector stores
- **High Performance**: Optimized for production workloads
- **Scalable Storage**: Handle millions of vectors with efficient indexing

## VectorStore Interface Usage

### Creating Namespaces
Create collections (namespaces) with custom schemas:

```go
// Define properties for your data
properties := map[string]vectorstore.VectorStoreProperties{
	"content": {
		DataType:    vectorstore.VectorStorePropertyTypeString,
		Description: "The main content text",
	},
	"category": {
		DataType:    vectorstore.VectorStorePropertyTypeString,
		Description: "Content category",
	},
	"tags": {
		DataType:    vectorstore.VectorStorePropertyTypeStringArray,
		Description: "Content tags",
	},
}

// Create namespace
err := store.CreateNamespace(ctx, "my_content", 1536, properties)
if err != nil {
	log.Fatal("Failed to create namespace:", err)
}
```

### Storing Data with Embeddings
Add data with vector embeddings for similarity search:

```go
// Your embedding data (typically from an embedding model).
// Shortened to 3 values for readability; a real vector is expected to match
// the dimension the namespace was created with (1536 in the example above).
embedding := []float32{0.1, 0.2, 0.3}

// Metadata associated with this vector
metadata := map[string]interface{}{
	"content":  "This is my content text",
	"category": "documentation",
	"tags":     []string{"guide", "tutorial"},
}

// Store in vector database
err := store.Add(ctx, "my_content", "unique-id-123", embedding, metadata)
if err != nil {
	log.Fatal("Failed to add data:", err)
}
```

### Similarity Search
Find similar content using vector similarity:

```go
// Query embedding (from user query)
queryEmbedding := []float32{0.15, 0.25, 0.35 /* ... remaining values ... */}

// Optional filters
filters := []vectorstore.Query{
	{
		Field:    "category",
		Operator: vectorstore.QueryOperatorEqual,
		Value:    "documentation",
	},
}

// Perform similarity search
results, err := store.GetNearest(
	ctx,
	"my_content",                    // namespace
	queryEmbedding,                  // query vector
	filters,                         // optional filters
	[]string{"content", "category"}, // fields to return
	0.7,                             // similarity threshold (0-1)
	10,                              // limit
)
if err != nil {
	log.Fatal("Failed to search:", err)
}

for _, result := range results {
	fmt.Printf("Score: %.3f, Content: %s\n", *result.Score, result.Properties["content"])
}
```

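For intuition about what a similarity threshold and a limit do, the sketch below ranks a few in-memory vectors against a query. It assumes cosine similarity as the metric, which is an assumption for illustration; the actual metric depends on the backend and its configuration. The `scored` type and the `cosine` and `nearest` helpers are hypothetical and unrelated to the `VectorStore` API.

```go
package main

import (
	"fmt"
	"math"
	"sort"
)

// scored pairs an item ID with its similarity to the query.
type scored struct {
	ID    string
	Score float64
}

// cosine returns the cosine similarity of a and b, or 0 when the vectors
// differ in length or either one has zero magnitude.
func cosine(a, b []float32) float64 {
	if len(a) != len(b) {
		return 0
	}
	var dot, na, nb float64
	for i := range a {
		dot += float64(a[i]) * float64(b[i])
		na += float64(a[i]) * float64(a[i])
		nb += float64(b[i]) * float64(b[i])
	}
	if na == 0 || nb == 0 {
		return 0
	}
	return dot / (math.Sqrt(na) * math.Sqrt(nb))
}

// nearest keeps items scoring at or above threshold, sorts them by
// descending score, and truncates the result to limit entries.
func nearest(query []float32, items map[string][]float32, threshold float64, limit int) []scored {
	var out []scored
	for id, vec := range items {
		if s := cosine(query, vec); s >= threshold {
			out = append(out, scored{ID: id, Score: s})
		}
	}
	sort.Slice(out, func(i, j int) bool { return out[i].Score > out[j].Score })
	if len(out) > limit {
		out = out[:limit]
	}
	return out
}

func main() {
	items := map[string][]float32{
		"a": {1, 0},
		"b": {0, 1},
		"c": {1, 1},
	}
	for _, r := range nearest([]float32{1, 0}, items, 0.7, 10) {
		fmt.Printf("%s %.3f\n", r.ID, r.Score)
	}
}
```

With the query `{1, 0}`, item `b` is orthogonal to the query and falls below the threshold, so only `a` and `c` are returned.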
### Data Retrieval and Management
Query and manage stored data:

```go
// Get specific item by ID
item, err := store.GetChunk(ctx, "my_content", "unique-id-123")
if err != nil {
	log.Fatal("Failed to get item:", err)
}

// Get all items with filtering and pagination
allResults, cursor, err := store.GetAll(
	ctx,
	"my_content",
	[]vectorstore.Query{
		{Field: "category", Operator: vectorstore.QueryOperatorEqual, Value: "documentation"},
	},
	[]string{"content", "tags"}, // select fields
	nil,                         // cursor for pagination
	50,                          // limit
)

// Delete items
err = store.Delete(ctx, "my_content", "unique-id-123")
```

## Supported Vector Stores

<CardGroup cols={2}>
  <Card title="Weaviate" icon="database" href="/integrations/vector-databases/weaviate">
    Production-ready vector database with gRPC support.
  </Card>
  <Card title="Redis / Valkey" icon="database" href="/integrations/vector-databases/redis">
    High-performance in-memory vector store.
  </Card>
  <Card title="Qdrant" icon="database" href="/integrations/vector-databases/qdrant">
    Rust-based vector search engine with advanced filtering.
  </Card>
  <Card title="Pinecone" icon="database" href="/integrations/vector-databases/pinecone">
    Managed vector database with serverless options.
  </Card>
</CardGroup>

---

## Use Cases

### [Semantic Caching](../../features/semantic-caching)
Build intelligent caching systems that understand query intent rather than just exact matches.

**Applications:**
- Customer support systems with FAQ matching
- Code completion and documentation search
- Content management with semantic deduplication

### Knowledge Base & Search
Create intelligent search systems that understand user queries contextually.

**Applications:**
- Document search and retrieval systems
- Product recommendation engines
- Research paper and knowledge discovery platforms

### Content Classification
Automatically categorize and tag content based on semantic similarity.

**Applications:**
- Email classification and routing
- Content moderation and filtering
- News article categorization and clustering

### Recommendation Systems
Build personalized recommendation engines using vector similarity.

**Applications:**
- Product recommendations based on user preferences
- Content suggestions for media platforms
- Similar document or article recommendations

## Related Documentation

| Topic | Documentation | Description |
|-------|---------------|-------------|
| **Framework Overview** | [What is Framework](./what-is-framework) | Understanding the framework package and VectorStore interface |
| **Semantic Caching** | [Semantic Caching](../../features/semantic-caching) | Using VectorStore for AI response caching |
49
docs/architecture/framework/what-is-framework.mdx
Normal file
@@ -0,0 +1,49 @@
---
title: "What is framework?"
description: "Framework is Bifrost's shared storage and utilities SDK package that provides common database interfaces and logic for the plugin ecosystem."
icon: "play"
---

Framework serves as the foundation layer that enables plugins to implement consistent data management patterns without reinventing storage solutions.

## Installation

```bash
go get github.com/maximhq/bifrost/framework
```

## Purpose

The framework package was designed to solve a fundamental challenge in plugin development: providing standardized, reliable storage and utility interfaces that plugins can depend on. Instead of each plugin implementing its own database logic, configuration management, or logging systems, framework offers battle-tested, shared implementations.

## Core Components

### ConfigStore
A unified configuration persistence layer that provides consistent storage patterns for plugin settings, provider configurations, and system state. Plugins can leverage `ConfigStore` to manage their configuration data with built-in CRUD operations, transaction support, and schema management.

### LogStore
Standardized logging and audit trail capabilities that enable plugins to implement observability features. `LogStore` provides structured logging, search and filtering capabilities, pagination support, and automated data retention policies.

### VectorStore
Vector database operations designed for AI-powered plugins that need semantic capabilities. `VectorStore` handles embeddings management, similarity search operations, and namespace isolation, making it easy for plugins to add features like semantic caching, content search, and AI-powered recommendations.

### Pricing Module
Cost calculation and model pricing management tools that help plugins implement billing and usage tracking features. The pricing system supports multi-tier pricing models, real-time usage tracking, and dynamic pricing updates.

## Benefits for Plugin Developers

**Shared Logic**: Common patterns for configuration, logging, and data management are provided out-of-the-box, reducing development time and ensuring consistency across plugins.

**Standardized Interfaces**: All framework components use consistent APIs, making it easier for developers to work across different plugins and maintain code quality.

**Pluggable Architecture**: The interface-based design allows different storage backends to be used without changing plugin code, providing flexibility for different deployment scenarios.

**Transaction Support**: Built-in transaction management and error handling ensure data integrity and provide reliable rollback capabilities.

**Production Ready**: Framework components are battle-tested in production environments and include features like connection pooling, retry logic, and performance optimizations.

## Integration with Bifrost

Framework seamlessly integrates with the Bifrost ecosystem, providing the storage foundation that powers core features like provider management, request logging, semantic caching, and governance. When plugins use framework components, they automatically participate in Bifrost's unified data management strategy.

The framework package enables plugin developers to focus on their core business logic while relying on robust, shared infrastructure for all storage and utility needs.