--- title: "Vertex AI" description: "Google Vertex AI API conversion guide - multi-model support, OAuth2 authentication, project/region configuration" icon: "v" --- ## Overview Vertex AI is Google's unified ML platform providing access to Google's Gemini models, Anthropic Claude models, and other third-party LLMs through a single API. Bifrost performs conversions including: - **Multi-model support** - Unified interface for Gemini, Anthropic, and third-party models - **OAuth2 authentication** - Service account credentials with automatic token refresh - **Project and region management** - Automatic endpoint construction from GCP project/region - **Model routing** - Automatic provider detection (Gemini vs Anthropic) based on model name - **Request conversion** - Conversion to underlying provider format (Gemini or Anthropic) - **Embeddings support** - Vector generation with task type and truncation options - **Model discovery** - Paginated model listing with deployment information ### Supported Operations | Operation | Non-Streaming | Streaming | Endpoint | | -------------------- | ------------- | --------- | ----------------------------------------- | | Chat Completions | ✅ | ✅ | `/generate` | | Responses API | ✅ | ✅ | `/messages` | | Embeddings | ✅ | - | `/embeddings` | | Image Generation | ✅ | - | `/generateContent` or `/predict` (Imagen) | | Image Edit | ✅ | - | `/generateContent` or `/predict` (Imagen) | | Video Generation | ✅ | - | `/predictLongRunning` (Veo models only) | | Image Variation | ❌ | - | Not supported | | List Models | ✅ | - | `/models` | | Text Completions | ❌ | ❌ | - | | Speech (TTS) | ❌ | ❌ | - | | Transcriptions (STT) | ❌ | ❌ | - | | Files | ❌ | ❌ | - | | Batch | ❌ | ❌ | - | **Unsupported Operations** (❌): Text Completions, Speech, Transcriptions, Files, and Batch are not supported by Vertex AI. These return `UnsupportedOperationError`. **Vertex-specific**: Endpoints vary by model type. Responses API available for both Gemini and Anthropic models. --- ## Setup & Configuration Vertex AI requires Google Cloud project configuration and authentication credentials. Three authentication methods are supported. The `aliases` field (mapping model names to fine-tuned model IDs or endpoint identifiers) requires **v1.5.0-prerelease2 or later**. On v1.4.x, use `deployments` inside `vertex_key_config` instead — see the [v1.5.0 Migration Guide](/migration-guides/v1.5.0#breaking-change-9-provider-deployments-removed-migrate-to-aliases) for details. ### 1. Service Account JSON (Recommended for Production) Provide a credential JSON string in `auth_credentials`. The JSON must contain a `type` field. Supported types: `service_account` (most common), `impersonated_service_account`, `authorized_user`, `external_account`, `external_account_authorized_user`. Google Vertex AI Service Account (JSON) authentication setup in the Bifrost Web UI showing Project ID, Region, and Auth Credentials fields 1. Navigate to **"Model Providers"** → **"Configurations"** → **"Google Vertex"** 2. Click **"Add Key"** (or edit an existing key) 3. Under **Authentication Method**, select **"Service Account (JSON)"** 4. Set **Project ID**: Your Google Cloud project ID 5. Set **Project Number** (Required only for fine-tuned models): Your GCP project number; leave blank for standard models 6. Set **Region**: e.g., `us-central1` 7. Set **Auth Credentials**: Paste your service account JSON or reference an env var (e.g., `env.VERTEX_CREDENTIALS`) 8. Configure **Aliases**: Map model names to fine-tuned model IDs (if using fine-tuned models) 9. Save ```bash # Step 1: Create the provider curl -X POST http://localhost:8080/api/providers \ -H "Content-Type: application/json" \ -d '{"provider": "vertex"}' # Step 2: Create a key (Service Account JSON) curl -X POST http://localhost:8080/api/providers/vertex/keys \ -H "Content-Type: application/json" \ -d '{ "name": "vertex-sa-key", "value": "", "models": ["*"], "weight": 1.0, "vertex_key_config": { "project_id": "env.VERTEX_PROJECT_ID", "region": "us-central1", "auth_credentials": "env.VERTEX_CREDENTIALS" } }' ``` **On v1.4.x**, two differences apply: - Pass `keys` directly in the `POST /api/providers` body — there is no separate `/api/providers/{provider}/keys` endpoint. - Use `deployments` inside `vertex_key_config` instead of the top-level `aliases` field for fine-tuned model mappings. ```json { "providers": { "vertex": { "keys": [ { "name": "vertex-sa-key", "value": "", "models": ["*"], "weight": 1.0, "vertex_key_config": { "project_id": "env.VERTEX_PROJECT_ID", "region": "us-central1", "auth_credentials": "env.VERTEX_CREDENTIALS" } } ] } } } ``` On **v1.4.x**, use `deployments` inside `vertex_key_config` instead of the top-level `aliases` field for fine-tuned model mappings. ```go func (a *MyAccount) GetKeysForProvider(ctx *context.Context, provider schemas.ModelProvider) ([]schemas.Key, error) { switch provider { case schemas.Vertex: return []schemas.Key{ { Value: schemas.EnvVar{}, // Leave empty when using service account credentials Models: []string{"*"}, Weight: 1.0, VertexKeyConfig: &schemas.VertexKeyConfig{ ProjectID: *schemas.NewEnvVar("env.VERTEX_PROJECT_ID"), Region: *schemas.NewEnvVar("us-central1"), AuthCredentials: *schemas.NewEnvVar("env.VERTEX_CREDENTIALS"), // full service account JSON }, }, }, nil } return nil, fmt.Errorf("provider %s not supported", provider) } ``` ### 2. Application Default Credentials Leave `auth_credentials` empty. Bifrost calls `google.FindDefaultCredentials()` — Google's ADC library — which resolves credentials in this order: 1. `GOOGLE_APPLICATION_CREDENTIALS` env var (path to a JSON credential file) 2. Application default credential file (`~/.config/gcloud/application_default_credentials.json`, written by `gcloud auth application-default login`) 3. GCE/GKE/Cloud Run/App Engine metadata server (attached service account or Workload Identity) Google Vertex AI Application Default Credentials setup in the Bifrost Web UI showing Project ID and Region fields with no credential inputs 1. Navigate to **"Model Providers"** → **"Configurations"** → **"Google Vertex"** 2. Click **"Add Key"** (or edit an existing key) 3. Under **Authentication Method**, select **"Service Account (Attached)"** 4. Set **Project ID**: Your Google Cloud project ID 5. Set **Project Number** (Required only for fine-tuned models): Your GCP project number; leave blank for standard models 6. Set **Region**: e.g., `us-central1` 7. Configure **Aliases** if needed 8. Save Ensure `GOOGLE_APPLICATION_CREDENTIALS` is set in your environment, or that Workload Identity / gcloud is configured. ```bash # Step 1: Create the provider curl -X POST http://localhost:8080/api/providers \ -H "Content-Type: application/json" \ -d '{"provider": "vertex"}' # Step 2: Create a key (Application Default Credentials) curl -X POST http://localhost:8080/api/providers/vertex/keys \ -H "Content-Type: application/json" \ -d '{ "name": "vertex-adc-key", "value": "", "models": ["*"], "weight": 1.0, "vertex_key_config": { "project_id": "env.VERTEX_PROJECT_ID", "region": "us-central1", "auth_credentials": "" } }' ``` **On v1.4.x**, pass `keys` directly in the `POST /api/providers` body — there is no separate `/api/providers/{provider}/keys` endpoint. ```json { "providers": { "vertex": { "keys": [ { "name": "vertex-adc-key", "value": "", "models": ["*"], "weight": 1.0, "vertex_key_config": { "project_id": "env.VERTEX_PROJECT_ID", "region": "us-central1", "auth_credentials": "" } } ] } } } ``` ```go func (a *MyAccount) GetKeysForProvider(ctx *context.Context, provider schemas.ModelProvider) ([]schemas.Key, error) { switch provider { case schemas.Vertex: return []schemas.Key{ { Value: schemas.EnvVar{}, Models: []string{"*"}, Weight: 1.0, VertexKeyConfig: &schemas.VertexKeyConfig{ ProjectID: *schemas.NewEnvVar("env.VERTEX_PROJECT_ID"), Region: *schemas.NewEnvVar("us-central1"), // Leave AuthCredentials empty — uses Application Default Credentials }, }, }, nil } return nil, fmt.Errorf("provider %s not supported", provider) } ``` ### 3. API Key (Gemini and Fine-Tuned Models Only) Set `value` to your Vertex API key. API key authentication is supported only for Gemini models and fine-tuned Gemini models. For Anthropic models on Vertex, use Service Account or Application Default Credentials. Google Vertex AI API Key authentication setup in the Bifrost Web UI showing API Key, Project ID, Region, and Project Number fields 1. Navigate to **"Model Providers"** → **"Configurations"** → **"Google Vertex"** 2. Click **"Add Key"** (or edit an existing key) 3. Under **Authentication Method**, select **"API Key"** 4. Set **API Key**: Your Vertex AI API key 5. Set **Project ID**: Your Google Cloud project ID 6. Set **Project Number** (Required only for fine-tuned models): Your GCP project number; leave blank for standard models 7. Set **Region**: e.g., `us-central1` 8. Configure **Aliases**: Map short names to fine-tuned model IDs (e.g., `my-model` → `123456789`) 9. Save ```bash # Step 1: Create the provider curl -X POST http://localhost:8080/api/providers \ -H "Content-Type: application/json" \ -d '{"provider": "vertex"}' # Step 2: Create a key (API Key — Gemini + fine-tuned models) curl -X POST http://localhost:8080/api/providers/vertex/keys \ -H "Content-Type: application/json" \ -d '{ "name": "vertex-api-key", "value": "env.VERTEX_API_KEY", "models": ["gemini-pro", "gemini-2.0-flash", "my-fine-tuned-model"], "weight": 1.0, "aliases": { "my-fine-tuned-model": "123456789" }, "vertex_key_config": { "project_id": "env.VERTEX_PROJECT_ID", "project_number": "env.VERTEX_PROJECT_NUMBER", "region": "us-central1" } }' ``` **On v1.4.x**, two differences apply: - Pass `keys` directly in the `POST /api/providers` body — there is no separate `/api/providers/{provider}/keys` endpoint. - Replace the top-level `aliases` with `"deployments"` inside `vertex_key_config`: ```json "vertex_key_config": { "project_id": "env.VERTEX_PROJECT_ID", "region": "us-central1", "deployments": { "my-fine-tuned-model": "123456789" } } ``` ```json { "providers": { "vertex": { "keys": [ { "name": "vertex-api-key", "value": "env.VERTEX_API_KEY", "models": ["gemini-pro", "gemini-2.0-flash", "my-fine-tuned-model"], "weight": 1.0, "aliases": { "my-fine-tuned-model": "123456789" }, "vertex_key_config": { "project_id": "env.VERTEX_PROJECT_ID", "project_number": "env.VERTEX_PROJECT_NUMBER", "region": "us-central1" } } ] } } } ``` On **v1.4.x**, use `deployments` inside `vertex_key_config` instead of the top-level `aliases` field. ```go func (a *MyAccount) GetKeysForProvider(ctx *context.Context, provider schemas.ModelProvider) ([]schemas.Key, error) { switch provider { case schemas.Vertex: return []schemas.Key{ { Value: *schemas.NewEnvVar("env.VERTEX_API_KEY"), // only when using Gemini or fine-tuned models Models: []string{"gemini-pro", "gemini-2.0-flash", "my-fine-tuned-model"}, Weight: 1.0, Aliases: schemas.KeyAliases{ "my-fine-tuned-model": "123456789", }, VertexKeyConfig: &schemas.VertexKeyConfig{ ProjectID: *schemas.NewEnvVar("env.VERTEX_PROJECT_ID"), ProjectNumber: *schemas.NewEnvVar("env.VERTEX_PROJECT_NUMBER"), // required for fine-tuned models Region: *schemas.NewEnvVar("us-central1"), }, }, }, nil } return nil, fmt.Errorf("provider %s not supported", provider) } ``` Vertex AI support for fine-tuned models is currently in beta. Requests to non-Gemini fine-tuned models may fail, so please test and report any issues. **`vertex_key_config` fields:** | Field | Required | Description | | ------------------ | -------- | ------------------------------------------------------ | | `project_id` | Yes | Google Cloud project ID | | `region` | Yes | GCP region (e.g., `us-central1`, `eu-west1`, `global`) | | `auth_credentials` | No | Service account JSON string (leave empty for ADC) | | `project_number` | No | GCP project number (required for fine-tuned models) | **Key-level fields:** | Field | Required | Description | | --------- | -------- | ----------------------------------------------------------------------------------------- | | `value` | No | Vertex API key (Gemini and fine-tuned models only; leave empty for Service Account / ADC) | | `aliases` | No | Map model names to fine-tuned model IDs or endpoint identifiers (v1.5.0-prerelease2+) | | `models` | Yes | Models this key can serve; use `["*"]` to allow all | --- ## Beta Headers For Anthropic models on Vertex AI, Bifrost validates `anthropic-beta` headers and drops unsupported headers from the request. **Supported**: `computer-use-*`, `compact-*`, `context-management-*`, `interleaved-thinking-*`, `context-1m-*` **Not supported**: `structured-outputs-*`, `advanced-tool-use-*`, `mcp-client-*`, `prompt-caching-scope-*`, `files-api-*`, `skills-*`, `fast-mode-*`, `redact-thinking-*` You can override these defaults per provider via the **Beta Headers** tab in provider configuration or via [`beta_header_overrides`](/quickstart/gateway/provider-configuration#beta-header-overrides). See the full support matrix in the [Anthropic provider docs](/providers/supported-providers/anthropic#beta-headers). Vertex AI Beta Headers configuration tab showing supported and unsupported Anthropic beta features with override options --- # 1. Chat Completions ## Request Parameters ### Core Parameter Mapping | Parameter | Vertex Handling | Notes | | ---------------- | ------------------------- | ---------------------------------------------------- | | `model` | Maps to Vertex model ID | Region-specific endpoint constructed automatically | | All other params | Model-specific conversion | Converted per underlying provider (Gemini/Anthropic) | ### Key Configuration The key configuration for Vertex requires Google Cloud credentials: ```json { "vertex_key_config": { "project_id": "my-gcp-project", "region": "us-central1", "auth_credentials": "{service-account-json}" } } ``` **Configuration Details**: - `project_id` - GCP project ID (required) - `region` - GCP region for API endpoints (required) - Examples: `us-central1`, `us-west1`, `eu-west1`, `global` - `auth_credentials` - Service account JSON credentials (optional if using default credentials) ### Authentication Methods 1. **Service Account JSON** (recommended for production) ```json { "auth_credentials": "{full-service-account-json}" } ``` 2. **Application Default Credentials** (for local development) - Requires `GOOGLE_APPLICATION_CREDENTIALS` environment variable - Leave `auth_credentials` empty ## Gemini Models When using Google's Gemini models, Bifrost converts requests to Gemini's API format. ### Parameter Mapping for Gemini All Gemini-compatible parameters are supported. Special handling includes: - **System prompts**: Converted to Gemini's system message format - **Tool usage**: Mapped to Gemini's function calling format - **Streaming**: Uses Gemini's streaming protocol Refer to [Gemini documentation](/providers/supported-providers/gemini) for detailed conversion details. ## Anthropic Models (Claude) When using Anthropic models through Vertex AI, Bifrost converts requests to Anthropic's message format. ### Parameter Mapping for Anthropic All Anthropic-standard parameters are supported: - **Reasoning/Thinking**: `reasoning` parameters converted to `thinking` structure - **System messages**: Extracted and placed in separate `system` field - **Tool message grouping**: Consecutive tool messages merged - **API version**: Automatically set to `vertex-2023-10-16` for Anthropic models Refer to [Anthropic documentation](/providers/supported-providers/anthropic) for detailed conversion details. ### Special Notes for Vertex + Anthropic - Responses API uses special `/v1/messages` endpoint - `anthropic_version` automatically set to `vertex-2023-10-16` - Minimum reasoning budget: 1024 tokens - Model field removed from request (Vertex uses different identification) ## Region Selection The region determines the API endpoint: | Region | Endpoint | Purpose | | ------------- | --------------------------------------- | ------------------------- | | `us-central1` | `us-central1-aiplatform.googleapis.com` | US Central | | `us-west1` | `us-west1-aiplatform.googleapis.com` | US West | | `eu-west1` | `eu-west1-aiplatform.googleapis.com` | Europe West | | `global` | `aiplatform.googleapis.com` | Global (no region prefix) | Availability varies by region. Check [GCP documentation](https://cloud.google.com/vertex-ai/docs/general/locations) for model availability. ## Streaming Streaming format depends on model type: - **Gemini models**: Standard Gemini streaming with server-sent events - **Anthropic models**: Anthropic message streaming format --- # 2. Responses API The Responses API is available for both Anthropic (Claude) and Gemini models on Vertex AI. ## Request Parameters ### Core Parameter Mapping | Parameter | Vertex Handling | Notes | | ------------------- | ---------------------------- | --------------------------------- | | `instructions` | Becomes system message | Model-specific conversion | | `input` | Converted to messages | String or array support | | `max_output_tokens` | Model-specific field mapping | Gemini vs Anthropic conversion | | All other params | Model-specific conversion | Converted per underlying provider | ### Gemini Models For Gemini models, conversion follows Gemini's Responses API format. ### Anthropic Models (Claude) For Anthropic models, conversion follows Anthropic's message format: - `instructions` becomes system message - `reasoning` mapped to `thinking` structure ### Configuration ```bash curl -X POST http://localhost:8080/v1/responses \ -H "Content-Type: application/json" \ -d '{ "model": "vertex/claude-3-5-sonnet", "input": "What is AI?", "instructions": "You are a helpful assistant", "project_id": "my-gcp-project", "region": "us-central1" }' \ -H "X-Goog-Authorization: Bearer {token}" ``` ```go resp, err := client.ResponsesRequest(schemas.NewBifrostContext(ctx, schemas.NoDeadline), &schemas.BifrostResponsesRequest{ Provider: schemas.Vertex, Model: "claude-3-5-sonnet", Input: messages, Params: &schemas.ResponsesParameters{ Instructions: schemas.Ptr("You are a helpful assistant"), }, }) ``` ### Special Handling - Endpoint: `/v1/messages` (Anthropic format) - `anthropic_version` set to `vertex-2023-10-16` automatically - Model and region fields removed from request - Raw request body passthrough supported Refer to [Anthropic Responses API](/providers/supported-providers/anthropic#2-responses-api) for parameter details. --- # 3. Embeddings Embeddings are supported for Gemini and other models that support embedding generation. ## Request Parameters ### Core Parameters | Parameter | Vertex Mapping | Notes | | ------------ | --------------------------------- | -------------------- | | `input` | `instances[].content` | Text to embed | | `dimensions` | `parameters.outputDimensionality` | Optional output size | ### Advanced Parameters Use `extra_params` for embedding-specific options: ```bash curl -X POST http://localhost:8080/v1/embeddings \ -H "Content-Type: application/json" \ -d '{ "model": "text-embedding-004", "input": ["text to embed"], "dimensions": 256, "task_type": "RETRIEVAL_DOCUMENT", "title": "Document title", "project_id": "my-gcp-project", "region": "us-central1", "autoTruncate": true }' ``` ```go resp, err := client.EmbeddingRequest(schemas.NewBifrostContext(ctx, schemas.NoDeadline), &schemas.BifrostEmbeddingRequest{ Provider: schemas.Vertex, Model: "text-embedding-004", Input: &schemas.EmbeddingInput{ Texts: []string{"text to embed"}, }, Params: &schemas.EmbeddingParameters{ Dimensions: schemas.Ptr(256), ExtraParams: map[string]interface{}{ "task_type": "RETRIEVAL_DOCUMENT", "title": "Document title", "autoTruncate": true, }, }, }) ``` #### Embedding Parameters | Parameter | Type | Description | | -------------- | ------- | ------------------------------------------------------------------------------------------------------------------------- | | `task_type` | string | Task type hint: `RETRIEVAL_QUERY`, `RETRIEVAL_DOCUMENT`, `SEMANTIC_SIMILARITY`, `CLASSIFICATION`, `CLUSTERING` (optional) | | `title` | string | Optional title to help model produce better embeddings (used with task_type) | | `autoTruncate` | boolean | Auto-truncate input to max tokens (defaults to true) | ### Task Type Effects Different task types optimize embeddings for specific use cases: - `RETRIEVAL_DOCUMENT` - Optimized for documents in retrieval systems - `RETRIEVAL_QUERY` - Optimized for queries searching documents - `SEMANTIC_SIMILARITY` - Optimized for semantic similarity tasks - `CLASSIFICATION` - For classification tasks - `CLUSTERING` - For clustering tasks ## Response Conversion Embeddings response includes vectors and truncation information: ```json { "embeddings": [ { "values": [0.1234, -0.5678, ...], "statistics": { "token_count": 15, "truncated": false } } ] } ``` **Response Fields**: - `values` - Embedding vector as floats - `statistics.token_count` - Input token count - `statistics.truncated` - Whether input was truncated due to length --- # 4. Image Generation Image Generation is supported for Gemini and Imagen on Vertex AI. The provider automatically routes to the appropriate format based on the model type. ## Request Parameters ### Core Parameter Mapping | Parameter | Vertex Handling | Notes | | ---------------- | ------------------------------------- | ------------------------------------------------- | | `model` | Mapped to deployment/model identifier | Model type detected automatically | | `prompt` | Model-specific conversion | Converted per underlying provider (Gemini/Imagen) | | All other params | Model-specific conversion | Converted per underlying provider | ### Model Type Detection Vertex automatically detects the model type and uses the appropriate conversion: 1. **Gemini Models**: Uses Gemini format (same as [Gemini Image Generation](/providers/supported-providers/gemini#8-image-generation)) 2. **Imagen Models**: Uses Imagen format (detected via `IsImagenModel()`) ### Configuration ```bash curl -X POST http://localhost:8080/v1/images/generations \ -H "Content-Type: application/json" \ -d '{ "model": "vertex/imagen-4.0-generate-001", "prompt": "A sunset over the mountains", "size": "1024x1024", "n": 2, "project_id": "my-gcp-project", "region": "us-central1" }' \ -H "X-Goog-Authorization: Bearer {token}" ``` ```go resp, err := client.ImageGenerationRequest(schemas.NewBifrostContext(ctx, schemas.NoDeadline), &schemas.BifrostImageGenerationRequest{ Provider: schemas.Vertex, Model: "imagen-4.0-generate-001", Input: &schemas.ImageGenerationInput{ Prompt: "A sunset over the mountains", }, Params: &schemas.ImageGenerationParameters{ Size: schemas.Ptr("1024x1024"), N: schemas.Ptr(2), }, }) ``` ## Request Conversion Vertex converts requests based on model type: - **Gemini Models**: Uses `gemini.ToGeminiImageGenerationRequest()` - same conversion as standard Gemini (see [Gemini Image Generation](/providers/supported-providers/gemini#8-image-generation)) - **Imagen Models**: Uses `gemini.ToImagenImageGenerationRequest()` - Imagen-specific format with size/aspect ratio conversion All request bodies are converted to `map[string]interface{}` and the `region` field is removed before sending to Vertex API. ## Response Conversion - **Gemini Models**: Responses converted using `GenerateContentResponse.ToBifrostImageGenerationResponse()` - same as standard Gemini - **Imagen Models**: Responses converted using `GeminiImagenResponse.ToBifrostImageGenerationResponse()` - Imagen-specific format ## Endpoint Selection The provider automatically selects the endpoint based on model type: - **Fine-tuned models**: `/v1beta1/projects/{projectNumber}/locations/{region}/endpoints/{deployment}:generateContent` - **Imagen models**: `/v1/projects/{projectID}/locations/{region}/publishers/google/models/{model}:predict` - **Gemini models**: `/v1/projects/{projectID}/locations/{region}/publishers/google/models/{model}:generateContent` ## Streaming Image generation streaming is not supported by Vertex AI. --- # 5. Image Edit Requests use **multipart/form-data**, not JSON. Image Edit is supported for Gemini and Imagen models on Vertex AI. The provider automatically routes to the appropriate format based on the model type. **Request Parameters** | Parameter | Type | Required | Notes | | -------------------- | ------ | -------- | --------------------------------------------------------------------------------------------------------------------------------------------------------------- | | `model` | string | ✅ | Model identifier (must be Gemini or Imagen model) | | `prompt` | string | ✅ | Text description of the edit | | `image[]` | binary | ✅ | Image file(s) to edit (supports multiple images) | | `mask` | binary | ❌ | Mask image file | | `type` | string | ❌ | Edit type: `"inpainting"`, `"outpainting"`, `"inpaint_removal"`, `"bgswap"` (Imagen only) | | `n` | int | ❌ | Number of images to generate (1-10) | | `output_format` | string | ❌ | Output format: `"png"`, `"webp"`, `"jpeg"` | | `output_compression` | int | ❌ | Compression level (0-100%) | | `seed` | int | ❌ | Seed for reproducibility (via `ExtraParams["seed"]`) | | `negative_prompt` | string | ❌ | Negative prompt (via `ExtraParams["negativePrompt"]`) | | `maskMode` | string | ❌ | Mask mode (via `ExtraParams["maskMode"]`, Imagen only): `"MASK_MODE_USER_PROVIDED"`, `"MASK_MODE_BACKGROUND"`, `"MASK_MODE_FOREGROUND"`, `"MASK_MODE_SEMANTIC"` | | `dilation` | float | ❌ | Mask dilation (via `ExtraParams["dilation"]`, Imagen only): Range [0, 1] | | `maskClasses` | int[] | ❌ | Mask classes (via `ExtraParams["maskClasses"]`, Imagen only): For `MASK_MODE_SEMANTIC` | --- **Request Conversion** Vertex uses the same conversion functions as Gemini: 1. **Gemini Models**: Uses `gemini.ToGeminiImageEditRequest()` - same conversion as standard Gemini (see [Gemini Image Edit](/providers/supported-providers/gemini#9-image-edit)) 2. **Imagen Models**: Uses `gemini.ToImagenImageEditRequest()` - Imagen-specific format with edit mode mapping and mask configuration (see [Gemini Image Edit](/providers/supported-providers/gemini#9-image-edit)) **Model Validation**: Only Gemini and Imagen models are supported. Other models return `ConfigurationError`. **Request Body Processing**: - All request bodies are converted to `map[string]interface{}` for Vertex API compatibility - The `region` field is removed before sending to Vertex API - For Gemini models, unsupported fields are stripped via `stripVertexGeminiUnsupportedFields()` (removes `id` from function_call and function_response) **Response Conversion** - **Gemini Models**: Responses converted using `GenerateContentResponse.ToBifrostImageGenerationResponse()` - same as standard Gemini - **Imagen Models**: Responses converted using `GeminiImagenResponse.ToBifrostImageGenerationResponse()` - Imagen-specific format **Endpoint Selection** The provider automatically selects the endpoint based on model type: - **Gemini models**: `/v1/projects/{projectID}/locations/{region}/publishers/google/models/{model}:generateContent` - **Imagen models**: `/v1/projects/{projectID}/locations/{region}/publishers/google/models/{model}:predict` **Streaming** Image edit streaming is not supported by Vertex AI. **Image Variation** Image variation is not supported by Vertex AI. --- # 6. List Models ## Request Parameters None required. Automatically uses project_id and region from key config. ## Response Conversion Lists models available in the specified project and region with metadata and deployment information: ```json { "models": [ { "name": "projects/{project}/locations/{region}/models/gemini-2.0-flash", "display_name": "Gemini 2.0 Flash", "description": "Fast multimodal model", "version_id": "1", "version_aliases": ["latest", "stable"], "capabilities": [...], "deployed_models": [...] } ], "next_page_token": "..." } ``` ## Custom vs Non-Custom Models **Important**: Vertex AI's List Models API **only returns custom fine-tuned models** that have been deployed to your project. It does NOT return standard foundation models (Gemini, Claude, etc.). To provide a complete model listing experience, Bifrost performs **multi-pass model discovery**: ### Three-Pass Model Discovery 1. **First Pass - Custom Models from API Response** - Queries Vertex AI's List Models API - Returns only custom fine-tuned models deployed to your project - Custom models are identified by having deployment values that contain only digits - Example: `"deployment": "1234567890"` 2. **Second Pass - Non-Custom Models from Aliases** - Adds standard foundation models from your `aliases` configuration - Non-custom models have alphanumeric deployment values (e.g., `gemini-pro`, `claude-3-5-sonnet`) - Filters by the key-level `models` allowlist, if specified - Example: `"deployment": "gemini-2.0-flash"` 3. **Third Pass - Allowed Models Not in Aliases** - Adds models specified in `models` that weren't in the `aliases` map - Ensures all explicitly allowed models appear in the list - Uses the model name itself as the deployment value - Skips digit-only model IDs (reserved for custom models) ### Model Filtering Logic - **If `models` is empty and no aliases are configured**: No models are returned - **If `models` is empty but aliases are configured**: Only aliased models are returned - **If `models` is `["*"]`**: All models from all three passes are included (unrestricted) - **If `models` is non-empty**: Only models/aliases whose request names appear in `models` are included - **Duplicate Prevention**: Each model ID is tracked to prevent duplicates across passes ### Model Name Formatting Non-custom models from aliases and allowed models are automatically formatted for display: - `gemini-pro` → "Gemini Pro" - `claude-3-5-sonnet` → "Claude 3 5 Sonnet" - `gemini_2_flash` → "Gemini 2 Flash" Formatting uses title case and converts hyphens/underscores to spaces. ### Example Configuration ```json { "aliases": { "my-gemini-ft": "1234567890", "my-claude-ft": "9876543210" }, "vertex_key_config": { "project_id": "my-project", "region": "us-central1" } } ``` This returns only your custom fine-tuned models from the API. ```json { "aliases": { "gemini-2.0-flash": "gemini-2.0-flash", "claude-3-5-sonnet": "claude-3-5-sonnet-v2@20241022" }, "vertex_key_config": { "project_id": "my-project", "region": "us-central1" } } ``` This returns both custom models AND foundation models from aliases. ```json { "models": ["gemini-2.0-flash", "claude-3-5-sonnet"], "aliases": { "gemini-2.0-flash": "gemini-2.0-flash", "claude-3-5-sonnet": "claude-3-5-sonnet-v2@20241022", "gemini-1.5-pro": "gemini-1.5-pro" }, "vertex_key_config": { "project_id": "my-project", "region": "us-central1" } } ``` Only returns `gemini-2.0-flash` and `claude-3-5-sonnet`, excluding `gemini-1.5-pro`. ### Pagination Model listing is paginated automatically. If more than 100 models exist, `next_page_token` will be present. Bifrost handles pagination internally. --- ## Caveats **Severity**: High **Behavior**: Both project_id and region required for all operations **Impact**: Request fails without valid GCP project/region configuration **Code**: `vertex.go:127-138` **Severity**: Medium **Behavior**: Tokens cached and automatically refreshed when expired **Impact**: First request slightly slower due to auth; cached for subsequent requests **Code**: `vertex.go:34-55` **Severity**: Medium **Behavior**: Automatic detection of Anthropic vs Gemini models **Impact**: Different conversion logic applied transparently **Code**: `vertex.go` chat/responses endpoints **Severity**: Low **Behavior**: Responses API automatically routes to Anthropic or Gemini implementation based on model **Impact**: Different conversion logic applied transparently per model **Code**: `vertex.go:836-1080` **Severity**: Low **Behavior**: `anthropic_version` always set to `vertex-2023-10-16` for Claude **Impact**: Cannot override Anthropic version for Claude on Vertex **Code**: `utils.go:33, 71` **Severity**: Low **Behavior**: Vertex returns float64 embeddings, and Bifrost preserves that precision in normalized embedding responses **Impact**: No precision loss in the `/v1/embeddings` response path **Code**: `embedding.go:84-91` **Severity**: High **Behavior**: Vertex AI's List Models API only returns custom fine-tuned models, NOT foundation models **Impact**: Bifrost performs three-pass discovery to include foundation models from aliases and the key-level `models` allowlist **Why**: This is a Vertex AI API limitation - foundation models must be explicitly configured **Code**: `models.go:76-217` --- ## Configuration **HTTP Settings**: OAuth2 authentication with automatic token refresh | Region-specific endpoints | Max Connections 5000 | Max Idle 60 seconds **Scope**: `https://www.googleapis.com/auth/cloud-platform` **Endpoint Format**: `https://{region}-aiplatform.googleapis.com/v1/projects/{project}/locations/{region}/{resource}` **Note**: For `global` region, endpoint is `https://aiplatform.googleapis.com/v1/projects/{project}/locations/global/{resource}` ## Video Generation Vertex AI routes video generation through Gemini's Veo models using the `predictLongRunning` endpoint. All parameters are identical to [Gemini Video Generation](/providers/supported-providers/gemini#video-generation). Only Veo models are supported (e.g., `veo-2.0-generate-001`). Passing a non-Veo model name returns a configuration error. **Supported Operations** | Operation | Supported | Notes | | --------- | --------- | ----------------------------- | | Generate | ✅ | `POST /v1/videos` | | Retrieve | ✅ | `GET /v1/videos/{id}` | | Download | ✅ | `GET /v1/videos/{id}/content` | | Delete | ❌ | Not supported | | List | ❌ | Not supported | | Remix | ❌ | Not supported |