{/* Semantic Cache Toggle */}

Enable Semantic Caching

Enable semantic caching for requests. Send the x-bf-cache-key header with each request to use semantic caching.{" "} {!isVectorStoreEnabled && ( Requires a vector store to be configured and enabled in config.json. )} {!providersLoading && providers?.length === 0 && ( Requires at least one provider to be configured. )}

{ if (isVectorStoreEnabled) { handleSemanticCacheToggle(checked); } }} /> {(isSemanticCacheEnabled || originalCacheEnabled) && ( )}

{/* Cache Configuration (only show when enabled) */} {originalCacheEnabled && isVectorStoreEnabled && (providersLoading ? (

) : (

{loadedDirectOnlyConfig && (

This plugin was loaded in direct-only mode via config.json. The Web UI currently edits provider-backed semantic cache settings; keep using config.json if you want to stay in direct-only mode.

)} {hasInvalidProviderBackedDimension && (

You selected a provider while keeping dimension: 1. That is only valid for direct-only mode. Set the embedding model's real dimension before saving, or remove the provider to stay in direct-only mode.

)} {/* Provider and Model Settings */}

Provider and Model Settings

Configured Providers

Embedding Model* updateCacheConfigLocal({ embedding_model: e.target.value })} />

{/* Cache Settings */}

Cache Settings

TTL (seconds) { const value = e.target.value; if (value === "") { updateCacheConfigLocal({ ttl_seconds: undefined }); return; } const parsed = parseInt(value, 10); if (!Number.isNaN(parsed)) { updateCacheConfigLocal({ ttl_seconds: parsed }); } }} />

Similarity Threshold { const value = e.target.value; if (value === "") { updateCacheConfigLocal({ threshold: undefined }); return; } const parsed = parseFloat(value); if (!Number.isNaN(parsed)) { updateCacheConfigLocal({ threshold: parsed }); } }} />

Dimension { const value = e.target.value; if (value === "") { updateCacheConfigLocal({ dimension: undefined }); return; } const parsed = parseInt(value, 10); if (!Number.isNaN(parsed)) { updateCacheConfigLocal({ dimension: parsed }); } }} />
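The three numeric handlers above (TTL, similarity threshold, dimension) follow the same pattern: an empty input clears the field, a parseable number updates it, and anything else is ignored. A minimal sketch of that pattern as a shared helper follows; `parseNumericInput` and `ParseResult` are hypothetical names that do not appear in the component.

```typescript
// Hypothetical helper; not part of the original component.
type ParseResult =
  | { kind: "clear" }                // empty input: reset the field to undefined
  | { kind: "set"; value: number }   // valid number: store it
  | { kind: "ignore" };              // unparseable input: keep the previous value

function parseNumericInput(raw: string, mode: "int" | "float"): ParseResult {
  if (raw === "") {
    return { kind: "clear" };
  }
  // Always pass radix 10 so the input is read as a decimal integer.
  const parsed = mode === "int" ? parseInt(raw, 10) : parseFloat(raw);
  if (Number.isNaN(parsed)) {
    return { kind: "ignore" };
  }
  return { kind: "set", value: parsed };
}
```

A handler could then switch on `kind` and call `updateCacheConfigLocal` only for `clear` and `set`. Note that `parseInt` and `parseFloat` accept a leading numeric prefix, so an input such as `12abc` still parses as `12`.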

API keys for the embedding provider are inherited from the main provider configuration, so the semantic cache uses the configured provider's keys automatically. Changes to keys take effect after Bifrost restarts.

{/* Conversation Settings */}

Conversation Settings

Conversation History Threshold updateCacheConfigLocal({ conversation_history_threshold: parseInt(e.target.value, 10) || 3 })} />

Skip caching for conversations with more than this many messages (prevents false-positive cache hits)
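The rule described above can be sketched as a simple length check. This illustrates the stated behavior only and is not Bifrost's actual implementation; `shouldSkipCache` and the `Message` shape are hypothetical.

```typescript
// Hypothetical message shape, for illustration only.
interface Message {
  role: string;
  content: string;
}

// Returns true when the conversation has more messages than the threshold,
// meaning the request should bypass the cache.
function shouldSkipCache(messages: Message[], historyThreshold: number): boolean {
  return messages.length > historyThreshold;
}
```

With the UI's fallback threshold of 3, a conversation of exactly three messages would still be cached under this reading, and a fourth message would cause caching to be skipped.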

Exclude System Prompt

Exclude system messages from cache key generation

updateCacheConfigLocal({ exclude_system_prompt: checked })} size="md" />

{/* Cache Behavior */}

Cache Behavior

Cache by Model

Include model name in cache key

updateCacheConfigLocal({ cache_by_model: checked })} size="md" />

Cache by Provider

Include provider name in cache key

updateCacheConfigLocal({ cache_by_provider: checked })} size="md" />

Notes

Pass the x-bf-cache-ttl header with a request to set a request-specific TTL.
Pass the x-bf-cache-threshold header with a request to set a request-specific similarity threshold.
Pass the x-bf-cache-type header with "direct" or "semantic" to control cache behavior.
Pass the x-bf-cache-no-store header with "true" to disable response caching.
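As a rough illustration of the notes above, the sketch below assembles the per-request cache headers from a small options object. Only the header names and the "direct", "semantic", and "true" values come from this component's text. The `CacheOptions` type, the `buildCacheHeaders` function, and the string format of the TTL value are assumptions made for the example.

```typescript
// Hypothetical options type; field names are illustrative only.
interface CacheOptions {
  cacheKey: string;                   // sent as x-bf-cache-key
  ttl?: string;                       // sent as x-bf-cache-ttl (value format is an assumption)
  threshold?: number;                 // sent as x-bf-cache-threshold
  cacheType?: "direct" | "semantic";  // sent as x-bf-cache-type
  noStore?: boolean;                  // sent as x-bf-cache-no-store: "true"
}

function buildCacheHeaders(opts: CacheOptions): Record<string, string> {
  const headers: Record<string, string> = { "x-bf-cache-key": opts.cacheKey };
  if (opts.ttl !== undefined) {
    headers["x-bf-cache-ttl"] = opts.ttl;
  }
  if (opts.threshold !== undefined) {
    headers["x-bf-cache-threshold"] = String(opts.threshold);
  }
  if (opts.cacheType !== undefined) {
    headers["x-bf-cache-type"] = opts.cacheType;
  }
  // Only emit the no-store header when caching of the response is disabled.
  if (opts.noStore) {
    headers["x-bf-cache-no-store"] = "true";
  }
  return headers;
}
```

The returned object can be merged into the headers of whatever HTTP client sends requests to Bifrost. Optional headers are omitted unless set, so the configured defaults apply.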

))}