---
title: "Vertex AI"
description: "Google Vertex AI API conversion guide - multi-model support, OAuth2 authentication, project/region configuration"
icon: "v"
---

## Overview

Vertex AI is Google's unified ML platform providing access to Google's Gemini models, Anthropic Claude models, and other third-party LLMs through a single API. Bifrost's Vertex provider handles:

- **Multi-model support** - Unified interface for Gemini, Anthropic, and third-party models
- **OAuth2 authentication** - Service account credentials with automatic token refresh
- **Project and region management** - Automatic endpoint construction from GCP project/region
- **Model routing** - Automatic provider detection (Gemini vs Anthropic) based on model name
- **Request conversion** - Conversion to underlying provider format (Gemini or Anthropic)
- **Embeddings support** - Vector generation with task type and truncation options
- **Model discovery** - Paginated model listing with deployment information

### Supported Operations

| Operation            | Non-Streaming | Streaming | Endpoint                                  |
| -------------------- | ------------- | --------- | ----------------------------------------- |
| Chat Completions     | ✅            | ✅        | `/generate`                               |
| Responses API        | ✅            | ✅        | `/messages`                               |
| Embeddings           | ✅            | -         | `/embeddings`                             |
| Image Generation     | ✅            | -         | `/generateContent` or `/predict` (Imagen) |
| Image Edit           | ✅            | -         | `/generateContent` or `/predict` (Imagen) |
| Video Generation     | ✅            | -         | `/predictLongRunning` (Veo models only)   |
| Image Variation      | ❌            | -         | Not supported                             |
| List Models          | ✅            | -         | `/models`                                 |
| Text Completions     | ❌            | ❌        | -                                         |
| Speech (TTS)         | ❌            | ❌        | -                                         |
| Transcriptions (STT) | ❌            | ❌        | -                                         |
| Files                | ❌            | ❌        | -                                         |
| Batch                | ❌            | ❌        | -                                         |

<Note>
**Unsupported Operations** (❌): Text Completions, Speech, Transcriptions, Files, and Batch are not supported by Vertex AI. These return `UnsupportedOperationError`.

**Vertex-specific**: Endpoints vary by model type. The Responses API is available for both Gemini and Anthropic models.

</Note>

---

## Setup & Configuration

Vertex AI requires Google Cloud project configuration and authentication credentials. Three authentication methods are supported.

<Note>
  The `aliases` field (mapping model names to fine-tuned model IDs or endpoint
  identifiers) requires **v1.5.0-prerelease2 or later**. On v1.4.x, use
  `deployments` inside `vertex_key_config` instead — see the [v1.5.0 Migration
  Guide](/migration-guides/v1.5.0#breaking-change-9-provider-deployments-removed-migrate-to-aliases)
  for details.
</Note>

### 1. Service Account JSON (Recommended for Production)

Provide a credential JSON string in `auth_credentials`. The JSON must contain a `type` field. Supported types: `service_account` (most common), `impersonated_service_account`, `authorized_user`, `external_account`, `external_account_authorized_user`.
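
The `type` requirement can be checked before a key is saved. The following is a minimal sketch, not Bifrost's actual validation code; `credentialType` and `supportedCredentialTypes` are hypothetical names introduced for illustration.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// supportedCredentialTypes lists the credential types named in this guide.
var supportedCredentialTypes = map[string]bool{
	"service_account":                  true,
	"impersonated_service_account":     true,
	"authorized_user":                  true,
	"external_account":                 true,
	"external_account_authorized_user": true,
}

// credentialType is a hypothetical helper that extracts and validates the
// "type" field of a credential JSON string.
func credentialType(raw string) (string, error) {
	var probe struct {
		Type string `json:"type"`
	}
	if err := json.Unmarshal([]byte(raw), &probe); err != nil {
		return "", fmt.Errorf("auth_credentials is not valid JSON: %w", err)
	}
	if probe.Type == "" {
		return "", fmt.Errorf("auth_credentials is missing the \"type\" field")
	}
	if !supportedCredentialTypes[probe.Type] {
		return "", fmt.Errorf("unsupported credential type %q", probe.Type)
	}
	return probe.Type, nil
}

func main() {
	t, err := credentialType(`{"type": "service_account", "project_id": "my-gcp-project"}`)
	fmt.Println(t, err)
}
```

Running a check like this locally can catch a truncated or mis-pasted credential before it reaches the gateway.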

<Tabs>

<Tab title="Web UI">

<Frame>
  <img
    src="/media/ui-vertex-service-account-auth-setup.png"
    alt="Google Vertex AI Service Account (JSON) authentication setup in the Bifrost Web UI showing Project ID, Region, and Auth Credentials fields"
  />
</Frame>

1. Navigate to **"Model Providers"** → **"Configurations"** → **"Google Vertex"**
2. Click **"Add Key"** (or edit an existing key)
3. Under **Authentication Method**, select **"Service Account (JSON)"**
4. Set **Project ID**: Your Google Cloud project ID
5. Set **Project Number** (Required only for fine-tuned models): Your GCP project number; leave blank for standard models
6. Set **Region**: e.g., `us-central1`
7. Set **Auth Credentials**: Paste your service account JSON or reference an env var (e.g., `env.VERTEX_CREDENTIALS`)
8. Configure **Aliases**: Map model names to fine-tuned model IDs (if using fine-tuned models)
9. Save

</Tab>

<Tab title="API">

```bash
# Step 1: Create the provider
curl -X POST http://localhost:8080/api/providers \
  -H "Content-Type: application/json" \
  -d '{"provider": "vertex"}'

# Step 2: Create a key (Service Account JSON)
curl -X POST http://localhost:8080/api/providers/vertex/keys \
  -H "Content-Type: application/json" \
  -d '{
    "name": "vertex-sa-key",
    "value": "",
    "models": ["*"],
    "weight": 1.0,
    "vertex_key_config": {
      "project_id": "env.VERTEX_PROJECT_ID",
      "region": "us-central1",
      "auth_credentials": "env.VERTEX_CREDENTIALS"
    }
  }'
```

<Note>
**On v1.4.x**, two differences apply:
- Pass `keys` directly in the `POST /api/providers` body; there is no separate `/api/providers/{provider}/keys` endpoint.
- Use `deployments` inside `vertex_key_config` instead of the top-level `aliases` field for fine-tuned model mappings.
</Note>

</Tab>

<Tab title="config.json">

```json
{
  "providers": {
    "vertex": {
      "keys": [
        {
          "name": "vertex-sa-key",
          "value": "",
          "models": ["*"],
          "weight": 1.0,
          "vertex_key_config": {
            "project_id": "env.VERTEX_PROJECT_ID",
            "region": "us-central1",
            "auth_credentials": "env.VERTEX_CREDENTIALS"
          }
        }
      ]
    }
  }
}
```

<Note>
  On **v1.4.x**, use `deployments` inside `vertex_key_config` instead of the
  top-level `aliases` field for fine-tuned model mappings.
</Note>

</Tab>

<Tab title="Go SDK">

```go
func (a *MyAccount) GetKeysForProvider(ctx *context.Context, provider schemas.ModelProvider) ([]schemas.Key, error) {
    switch provider {
    case schemas.Vertex:
        return []schemas.Key{
            {
                Value:  schemas.EnvVar{}, // Leave empty when using service account credentials
                Models: []string{"*"},
                Weight: 1.0,
                VertexKeyConfig: &schemas.VertexKeyConfig{
                    ProjectID:       *schemas.NewEnvVar("env.VERTEX_PROJECT_ID"),
                    Region:          *schemas.NewEnvVar("us-central1"),
                    AuthCredentials: *schemas.NewEnvVar("env.VERTEX_CREDENTIALS"), // full service account JSON
                },
            },
        }, nil
    }
    return nil, fmt.Errorf("provider %s not supported", provider)
}
```

</Tab>

</Tabs>

### 2. Application Default Credentials

Leave `auth_credentials` empty. Bifrost calls `google.FindDefaultCredentials()` — Google's ADC library — which resolves credentials in this order:

1. `GOOGLE_APPLICATION_CREDENTIALS` env var (path to a JSON credential file)
2. Application default credential file (`~/.config/gcloud/application_default_credentials.json`, written by `gcloud auth application-default login`)
3. GCE/GKE/Cloud Run/App Engine metadata server (attached service account or Workload Identity)
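
To see which source would win on a given machine, the lookup order can be mimicked with the standard library. This is a simplified sketch of the order listed above, not the ADC library itself; `adcSource` is a hypothetical helper, and the well-known path is the Linux/macOS location from the list.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// adcSource is a hypothetical helper that reports which credential source
// would be picked, following the lookup order listed above. The environment
// lookup and the file check are injected so the function is easy to test.
func adcSource(getenv func(string) string, fileExists func(string) bool, home string) string {
	// 1. Explicit credential file path from the environment.
	if p := getenv("GOOGLE_APPLICATION_CREDENTIALS"); p != "" {
		return "env:" + p
	}
	// 2. Well-known file written by `gcloud auth application-default login`.
	wellKnown := filepath.Join(home, ".config", "gcloud", "application_default_credentials.json")
	if fileExists(wellKnown) {
		return "file:" + wellKnown
	}
	// 3. Fall back to the metadata server (GCE, GKE, Cloud Run, App Engine).
	return "metadata-server"
}

func main() {
	home, _ := os.UserHomeDir()
	exists := func(p string) bool {
		_, err := os.Stat(p)
		return err == nil
	}
	fmt.Println(adcSource(os.Getenv, exists, home))
}
```

The sketch only reports the source; it does not load or validate the credential itself.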

<Tabs>

<Tab title="Web UI">

<Frame>
  <img
    src="/media/ui-vertex-default-service-account-auth-setup.png"
    alt="Google Vertex AI Application Default Credentials setup in the Bifrost Web UI showing Project ID and Region fields with no credential inputs"
  />
</Frame>

1. Navigate to **"Model Providers"** → **"Configurations"** → **"Google Vertex"**
2. Click **"Add Key"** (or edit an existing key)
3. Under **Authentication Method**, select **"Service Account (Attached)"**
4. Set **Project ID**: Your Google Cloud project ID
5. Set **Project Number** (Required only for fine-tuned models): Your GCP project number; leave blank for standard models
6. Set **Region**: e.g., `us-central1`
7. Configure **Aliases** if needed
8. Save

Ensure `GOOGLE_APPLICATION_CREDENTIALS` is set in your environment, or that Workload Identity / gcloud is configured.

</Tab>

<Tab title="API">

```bash
# Step 1: Create the provider
curl -X POST http://localhost:8080/api/providers \
  -H "Content-Type: application/json" \
  -d '{"provider": "vertex"}'

# Step 2: Create a key (Application Default Credentials)
curl -X POST http://localhost:8080/api/providers/vertex/keys \
  -H "Content-Type: application/json" \
  -d '{
    "name": "vertex-adc-key",
    "value": "",
    "models": ["*"],
    "weight": 1.0,
    "vertex_key_config": {
      "project_id": "env.VERTEX_PROJECT_ID",
      "region": "us-central1",
      "auth_credentials": ""
    }
  }'
```

<Note>
  **On v1.4.x**, pass `keys` directly in the `POST /api/providers` body — there
  is no separate `/api/providers/{provider}/keys` endpoint.
</Note>

</Tab>

<Tab title="config.json">

```json
{
  "providers": {
    "vertex": {
      "keys": [
        {
          "name": "vertex-adc-key",
          "value": "",
          "models": ["*"],
          "weight": 1.0,
          "vertex_key_config": {
            "project_id": "env.VERTEX_PROJECT_ID",
            "region": "us-central1",
            "auth_credentials": ""
          }
        }
      ]
    }
  }
}
```

</Tab>

<Tab title="Go SDK">

```go
func (a *MyAccount) GetKeysForProvider(ctx *context.Context, provider schemas.ModelProvider) ([]schemas.Key, error) {
    switch provider {
    case schemas.Vertex:
        return []schemas.Key{
            {
                Value:  schemas.EnvVar{},
                Models: []string{"*"},
                Weight: 1.0,
                VertexKeyConfig: &schemas.VertexKeyConfig{
                    ProjectID: *schemas.NewEnvVar("env.VERTEX_PROJECT_ID"),
                    Region:    *schemas.NewEnvVar("us-central1"),
                    // Leave AuthCredentials empty — uses Application Default Credentials
                },
            },
        }, nil
    }
    return nil, fmt.Errorf("provider %s not supported", provider)
}
```

</Tab>

</Tabs>

### 3. API Key (Gemini and Fine-Tuned Models Only)

Set `value` to your Vertex API key. API key authentication is supported only for Gemini models and fine-tuned Gemini models. For Anthropic models on Vertex, use Service Account or Application Default Credentials.

<Tabs>

<Tab title="Web UI">

<Frame>
  <img
    src="/media/ui-vertex-api-key-auth-setup.png"
    alt="Google Vertex AI API Key authentication setup in the Bifrost Web UI showing API Key, Project ID, Region, and Project Number fields"
  />
</Frame>

1. Navigate to **"Model Providers"** → **"Configurations"** → **"Google Vertex"**
2. Click **"Add Key"** (or edit an existing key)
3. Under **Authentication Method**, select **"API Key"**
4. Set **API Key**: Your Vertex AI API key
5. Set **Project ID**: Your Google Cloud project ID
6. Set **Project Number** (Required only for fine-tuned models): Your GCP project number; leave blank for standard models
7. Set **Region**: e.g., `us-central1`
8. Configure **Aliases**: Map short names to fine-tuned model IDs (e.g., `my-model` → `123456789`)
9. Save

</Tab>

<Tab title="API">

```bash
# Step 1: Create the provider
curl -X POST http://localhost:8080/api/providers \
  -H "Content-Type: application/json" \
  -d '{"provider": "vertex"}'

# Step 2: Create a key (API Key — Gemini + fine-tuned models)
curl -X POST http://localhost:8080/api/providers/vertex/keys \
  -H "Content-Type: application/json" \
  -d '{
    "name": "vertex-api-key",
    "value": "env.VERTEX_API_KEY",
    "models": ["gemini-pro", "gemini-2.0-flash", "my-fine-tuned-model"],
    "weight": 1.0,
    "aliases": {
      "my-fine-tuned-model": "123456789"
    },
    "vertex_key_config": {
      "project_id": "env.VERTEX_PROJECT_ID",
      "project_number": "env.VERTEX_PROJECT_NUMBER",
      "region": "us-central1"
    }
  }'
```

<Note>
**On v1.4.x**, two differences apply:
- Pass `keys` directly in the `POST /api/providers` body — there is no separate `/api/providers/{provider}/keys` endpoint.
- Replace the top-level `aliases` with `"deployments"` inside `vertex_key_config`:
```json
"vertex_key_config": {
  "project_id": "env.VERTEX_PROJECT_ID",
  "region": "us-central1",
  "deployments": {
    "my-fine-tuned-model": "123456789"
  }
}
```
</Note>

</Tab>

<Tab title="config.json">

```json
{
  "providers": {
    "vertex": {
      "keys": [
        {
          "name": "vertex-api-key",
          "value": "env.VERTEX_API_KEY",
          "models": ["gemini-pro", "gemini-2.0-flash", "my-fine-tuned-model"],
          "weight": 1.0,
          "aliases": {
            "my-fine-tuned-model": "123456789"
          },
          "vertex_key_config": {
            "project_id": "env.VERTEX_PROJECT_ID",
            "project_number": "env.VERTEX_PROJECT_NUMBER",
            "region": "us-central1"
          }
        }
      ]
    }
  }
}
```

<Note>
  On **v1.4.x**, use `deployments` inside `vertex_key_config` instead of the
  top-level `aliases` field.
</Note>

</Tab>

<Tab title="Go SDK">

```go
func (a *MyAccount) GetKeysForProvider(ctx *context.Context, provider schemas.ModelProvider) ([]schemas.Key, error) {
    switch provider {
    case schemas.Vertex:
        return []schemas.Key{
            {
                Value:  *schemas.NewEnvVar("env.VERTEX_API_KEY"), // only when using Gemini or fine-tuned models
                Models: []string{"gemini-pro", "gemini-2.0-flash", "my-fine-tuned-model"},
                Weight: 1.0,
                Aliases: schemas.KeyAliases{
                    "my-fine-tuned-model": "123456789",
                },
                VertexKeyConfig: &schemas.VertexKeyConfig{
                    ProjectID:     *schemas.NewEnvVar("env.VERTEX_PROJECT_ID"),
                    ProjectNumber: *schemas.NewEnvVar("env.VERTEX_PROJECT_NUMBER"), // required for fine-tuned models
                    Region:        *schemas.NewEnvVar("us-central1"),
                },
            },
        }, nil
    }
    return nil, fmt.Errorf("provider %s not supported", provider)
}
```

</Tab>

</Tabs>

<Note>
  Vertex AI support for fine-tuned models is currently in beta. Requests to
  non-Gemini fine-tuned models may fail, so please test and report any issues.
</Note>

**`vertex_key_config` fields:**

| Field              | Required | Description                                                |
| ------------------ | -------- | ---------------------------------------------------------- |
| `project_id`       | Yes      | Google Cloud project ID                                    |
| `region`           | Yes      | GCP region (e.g., `us-central1`, `europe-west1`, `global`) |
| `auth_credentials` | No       | Service account JSON string (leave empty for ADC)          |
| `project_number`   | No       | GCP project number (required for fine-tuned models)        |

**Key-level fields:**

| Field     | Required | Description                                                                               |
| --------- | -------- | ----------------------------------------------------------------------------------------- |
| `value`   | No       | Vertex API key (Gemini and fine-tuned models only; leave empty for Service Account / ADC) |
| `aliases` | No       | Map model names to fine-tuned model IDs or endpoint identifiers (v1.5.0-prerelease2+)     |
| `models`  | Yes      | Models this key can serve; use `["*"]` to allow all                                       |
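
The field rules in the two tables above can be expressed as a small pre-flight check. This is a sketch under stated assumptions: `vertexKey`, `validateVertexKey`, and `authMethod` are hypothetical names (not the Bifrost schema types), a non-empty `aliases` map is treated as a signal that fine-tuned models are in use, and an API key is assumed to take precedence when several credentials are set.

```go
package main

import (
	"errors"
	"fmt"
)

// vertexKey mirrors the fields from the tables above. It is a hypothetical
// struct for illustration, not the Bifrost schema type.
type vertexKey struct {
	Value           string
	Models          []string
	Aliases         map[string]string
	ProjectID       string
	Region          string
	ProjectNumber   string
	AuthCredentials string
}

// validateVertexKey checks the required fields.
func validateVertexKey(k vertexKey) error {
	if k.ProjectID == "" {
		return errors.New("project_id is required")
	}
	if k.Region == "" {
		return errors.New("region is required")
	}
	if len(k.Models) == 0 {
		return errors.New(`models is required; use ["*"] to allow all`)
	}
	// Assumption: aliases indicate fine-tuned models, which need project_number.
	if len(k.Aliases) > 0 && k.ProjectNumber == "" {
		return errors.New("project_number is required for fine-tuned models")
	}
	return nil
}

// authMethod reports which of the three authentication methods a key selects.
func authMethod(k vertexKey) string {
	switch {
	case k.Value != "":
		return "api_key"
	case k.AuthCredentials != "":
		return "service_account_json"
	default:
		return "application_default_credentials"
	}
}

func main() {
	k := vertexKey{Models: []string{"*"}, ProjectID: "my-gcp-project", Region: "us-central1"}
	fmt.Println(validateVertexKey(k), authMethod(k))
}
```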

---

## Beta Headers

For Anthropic models on Vertex AI, Bifrost validates `anthropic-beta` headers and drops unsupported headers from the request.

**Supported**: `computer-use-*`, `compact-*`, `context-management-*`, `interleaved-thinking-*`, `context-1m-*`

**Not supported**: `structured-outputs-*`, `advanced-tool-use-*`, `mcp-client-*`, `prompt-caching-scope-*`, `files-api-*`, `skills-*`, `fast-mode-*`, `redact-thinking-*`

You can override these defaults per provider via the **Beta Headers** tab in provider configuration or via [`beta_header_overrides`](/quickstart/gateway/provider-configuration#beta-header-overrides). See the full support matrix in the [Anthropic provider docs](/providers/supported-providers/anthropic#beta-headers).

<Frame>
  <img
    src="/media/vertex-ai-setting-anthropic-beta-headers.png"
    alt="Vertex AI Beta Headers configuration tab showing supported and unsupported Anthropic beta features with override options"
  />
</Frame>
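
The default filtering can be pictured as a prefix match over the comma-separated `anthropic-beta` value. The sketch below is illustrative only: `filterBetaHeader` is a hypothetical helper, the prefixes are the default supported patterns listed above with the trailing `*` removed, and overrides are not modeled.

```go
package main

import (
	"fmt"
	"strings"
)

// supportedBetaPrefixes holds the default supported patterns from this
// section, with the trailing "*" removed.
var supportedBetaPrefixes = []string{
	"computer-use-",
	"compact-",
	"context-management-",
	"interleaved-thinking-",
	"context-1m-",
}

// filterBetaHeader is a hypothetical helper that keeps only the supported
// values from a comma-separated anthropic-beta header.
func filterBetaHeader(header string) string {
	var kept []string
	for _, value := range strings.Split(header, ",") {
		value = strings.TrimSpace(value)
		for _, prefix := range supportedBetaPrefixes {
			if strings.HasPrefix(value, prefix) {
				kept = append(kept, value)
				break
			}
		}
	}
	return strings.Join(kept, ",")
}

func main() {
	// The header values here are examples only.
	fmt.Println(filterBetaHeader("computer-use-2025-01-24, files-api-2025-04-14"))
}
```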

---

# 1. Chat Completions

## Request Parameters

### Core Parameter Mapping

| Parameter        | Vertex Handling           | Notes                                                |
| ---------------- | ------------------------- | ---------------------------------------------------- |
| `model`          | Maps to Vertex model ID   | Region-specific endpoint constructed automatically   |
| All other params | Model-specific conversion | Converted per underlying provider (Gemini/Anthropic) |

### Key Configuration

The key configuration for Vertex requires Google Cloud credentials:

```json
{
  "vertex_key_config": {
    "project_id": "my-gcp-project",
    "region": "us-central1",
    "auth_credentials": "{service-account-json}"
  }
}
```

**Configuration Details**:

- `project_id` - GCP project ID (required)
- `region` - GCP region for API endpoints (required)
  - Examples: `us-central1`, `us-west1`, `europe-west1`, `global`
- `auth_credentials` - Service account JSON credentials (optional if using default credentials)

### Authentication Methods

1. **Service Account JSON** (recommended for production)

   ```json
   { "auth_credentials": "{full-service-account-json}" }
   ```

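2. **Application Default Credentials** (local development or an attached service account)
   - Leave `auth_credentials` empty
   - Credentials resolve from the `GOOGLE_APPLICATION_CREDENTIALS` environment variable, the gcloud application default credential file, or the metadata server, in that order

3. **API Key** (Gemini and fine-tuned Gemini models only)
   - Set the key-level `value` field to your Vertex API key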

## Gemini Models

When using Google's Gemini models, Bifrost converts requests to Gemini's API format.

### Parameter Mapping for Gemini

All Gemini-compatible parameters are supported. Special handling includes:

- **System prompts**: Converted to Gemini's system message format
- **Tool usage**: Mapped to Gemini's function calling format
- **Streaming**: Uses Gemini's streaming protocol

Refer to [Gemini documentation](/providers/supported-providers/gemini) for detailed conversion details.

## Anthropic Models (Claude)

When using Anthropic models through Vertex AI, Bifrost converts requests to Anthropic's message format.

### Parameter Mapping for Anthropic

All Anthropic-standard parameters are supported:

- **Reasoning/Thinking**: `reasoning` parameters converted to `thinking` structure
- **System messages**: Extracted and placed in separate `system` field
- **Tool message grouping**: Consecutive tool messages merged
- **API version**: Automatically set to `vertex-2023-10-16` for Anthropic models

Refer to [Anthropic documentation](/providers/supported-providers/anthropic) for detailed conversion details.

### Special Notes for Vertex + Anthropic

- Responses API uses special `/v1/messages` endpoint
- `anthropic_version` automatically set to `vertex-2023-10-16`
- Minimum reasoning budget: 1024 tokens
- Model field removed from request (Vertex uses different identification)
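
The adjustments in this list can be sketched as a transformation over an Anthropic-format JSON body. This is an illustration, not Bifrost's implementation: `prepareVertexAnthropicBody` is a hypothetical helper, and rejecting a thinking budget below the minimum (rather than raising it) is an assumption.

```go
package main

import (
	"encoding/json"
	"fmt"
)

const (
	vertexAnthropicVersion = "vertex-2023-10-16"
	minThinkingBudget      = 1024
)

// prepareVertexAnthropicBody is a hypothetical helper that applies the
// Vertex-specific adjustments to an Anthropic-format request body.
func prepareVertexAnthropicBody(raw []byte) ([]byte, error) {
	var body map[string]interface{}
	if err := json.Unmarshal(raw, &body); err != nil {
		return nil, fmt.Errorf("invalid request body: %w", err)
	}
	// Vertex identifies the model through the URL, so drop it from the body.
	delete(body, "model")
	body["anthropic_version"] = vertexAnthropicVersion
	// Assumption: budgets under the minimum are rejected, not clamped.
	if thinking, ok := body["thinking"].(map[string]interface{}); ok {
		if budget, ok := thinking["budget_tokens"].(float64); ok && budget < minThinkingBudget {
			return nil, fmt.Errorf("thinking budget %v is below the minimum of %d", budget, minThinkingBudget)
		}
	}
	return json.Marshal(body)
}

func main() {
	out, err := prepareVertexAnthropicBody([]byte(`{"model":"claude-3-5-sonnet","max_tokens":256}`))
	fmt.Println(string(out), err)
}
```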

## Region Selection

The region determines the API endpoint:

| Region         | Endpoint                                 | Location                  |
| -------------- | ---------------------------------------- | ------------------------- |
| `us-central1`  | `us-central1-aiplatform.googleapis.com`  | US Central                |
| `us-west1`     | `us-west1-aiplatform.googleapis.com`     | US West                   |
| `europe-west1` | `europe-west1-aiplatform.googleapis.com` | Europe West               |
| `global`       | `aiplatform.googleapis.com`              | Global (no region prefix) |

Availability varies by region. Check [GCP documentation](https://cloud.google.com/vertex-ai/docs/general/locations) for model availability.
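
The host pattern in the table reduces to one rule: prefix the region unless it is `global`. A minimal sketch follows; `vertexHost` is a hypothetical helper name.

```go
package main

import "fmt"

// vertexHost is a hypothetical helper that builds the API host for a region,
// following the pattern in the table above.
func vertexHost(region string) string {
	if region == "global" {
		// The global endpoint has no region prefix.
		return "aiplatform.googleapis.com"
	}
	return region + "-aiplatform.googleapis.com"
}

func main() {
	fmt.Println(vertexHost("us-central1"))
	fmt.Println(vertexHost("global"))
}
```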

## Streaming

Streaming format depends on model type:

- **Gemini models**: Standard Gemini streaming with server-sent events
- **Anthropic models**: Anthropic message streaming format

---

# 2. Responses API

The Responses API is available for both Anthropic (Claude) and Gemini models on Vertex AI.

## Request Parameters

### Core Parameter Mapping

| Parameter           | Vertex Handling              | Notes                             |
| ------------------- | ---------------------------- | --------------------------------- |
| `instructions`      | Becomes system message       | Model-specific conversion         |
| `input`             | Converted to messages        | String or array support           |
| `max_output_tokens` | Model-specific field mapping | Gemini vs Anthropic conversion    |
| All other params    | Model-specific conversion    | Converted per underlying provider |
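
The "string or array" handling of `input` can be sketched as a two-step decode. This is a simplified illustration: `message` and `inputToMessages` are hypothetical names, a plain string is assumed to become a single user message, and array items are reduced to a role plus text content.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// message is a simplified chat message used only for this sketch.
type message struct {
	Role    string `json:"role"`
	Content string `json:"content"`
}

// inputToMessages is a hypothetical helper that accepts either a JSON string
// or a JSON array of messages and returns a message list.
func inputToMessages(raw []byte) ([]message, error) {
	// Case 1: a plain string becomes a single user message (assumption).
	var text string
	if err := json.Unmarshal(raw, &text); err == nil {
		return []message{{Role: "user", Content: text}}, nil
	}
	// Case 2: an array of messages is passed through.
	var msgs []message
	if err := json.Unmarshal(raw, &msgs); err != nil {
		return nil, fmt.Errorf("input must be a string or an array of messages: %w", err)
	}
	return msgs, nil
}

func main() {
	msgs, err := inputToMessages([]byte(`"What is AI?"`))
	fmt.Println(msgs, err)
}
```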

### Gemini Models

For Gemini models, conversion follows Gemini's Responses API format.

### Anthropic Models (Claude)

For Anthropic models, conversion follows Anthropic's message format:

- `instructions` becomes system message
- `reasoning` mapped to `thinking` structure

### Configuration

<Tabs>
<Tab title="Gateway">

```bash
curl -X POST http://localhost:8080/v1/responses \
  -H "Content-Type: application/json" \
  -d '{
    "model": "vertex/claude-3-5-sonnet",
    "input": "What is AI?",
    "instructions": "You are a helpful assistant",
    "project_id": "my-gcp-project",
    "region": "us-central1"
  }' \
  -H "X-Goog-Authorization: Bearer {token}"
```

</Tab>
<Tab title="Go SDK">

```go
resp, err := client.ResponsesRequest(schemas.NewBifrostContext(ctx, schemas.NoDeadline), &schemas.BifrostResponsesRequest{
    Provider: schemas.Vertex,
    Model:    "claude-3-5-sonnet",
    Input:    messages,
    Params: &schemas.ResponsesParameters{
        Instructions: schemas.Ptr("You are a helpful assistant"),
    },
})
```

</Tab>
</Tabs>

### Special Handling

- Endpoint: `/v1/messages` (Anthropic format)
- `anthropic_version` set to `vertex-2023-10-16` automatically
- Model and region fields removed from request
- Raw request body passthrough supported

Refer to [Anthropic Responses API](/providers/supported-providers/anthropic#2-responses-api) for parameter details.

---

# 3. Embeddings

Embeddings are supported for Gemini and other models that support embedding generation.

## Request Parameters

### Core Parameters

| Parameter    | Vertex Mapping                    | Notes                |
| ------------ | --------------------------------- | -------------------- |
| `input`      | `instances[].content`             | Text to embed        |
| `dimensions` | `parameters.outputDimensionality` | Optional output size |
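
The two mappings in the table can be sketched as a small request builder. The struct and function names below are hypothetical; only the JSON field names (`instances[].content` and `parameters.outputDimensionality`) come from the table.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// embeddingInstance carries one input text as instances[].content.
type embeddingInstance struct {
	Content string `json:"content"`
}

// embeddingParameters carries the optional output size.
type embeddingParameters struct {
	OutputDimensionality *int `json:"outputDimensionality,omitempty"`
}

// vertexEmbeddingRequest is a hypothetical request shape for this sketch.
type vertexEmbeddingRequest struct {
	Instances  []embeddingInstance  `json:"instances"`
	Parameters *embeddingParameters `json:"parameters,omitempty"`
}

// buildEmbeddingRequest maps `input` and `dimensions` onto the Vertex fields.
func buildEmbeddingRequest(input []string, dimensions *int) vertexEmbeddingRequest {
	req := vertexEmbeddingRequest{Instances: make([]embeddingInstance, 0, len(input))}
	for _, text := range input {
		req.Instances = append(req.Instances, embeddingInstance{Content: text})
	}
	// Only emit the parameters object when a dimension was requested.
	if dimensions != nil {
		req.Parameters = &embeddingParameters{OutputDimensionality: dimensions}
	}
	return req
}

func main() {
	dims := 256
	body, _ := json.Marshal(buildEmbeddingRequest([]string{"text to embed"}, &dims))
	fmt.Println(string(body))
}
```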

### Advanced Parameters

Use `extra_params` for embedding-specific options:

<Tabs>
<Tab title="Gateway">

```bash
curl -X POST http://localhost:8080/v1/embeddings \
  -H "Content-Type: application/json" \
  -d '{
    "model": "vertex/text-embedding-004",
    "input": ["text to embed"],
    "dimensions": 256,
    "task_type": "RETRIEVAL_DOCUMENT",
    "title": "Document title",
    "project_id": "my-gcp-project",
    "region": "us-central1",
    "autoTruncate": true
  }'
```

</Tab>
<Tab title="Go SDK">

```go
resp, err := client.EmbeddingRequest(schemas.NewBifrostContext(ctx, schemas.NoDeadline), &schemas.BifrostEmbeddingRequest{
    Provider: schemas.Vertex,
    Model:    "text-embedding-004",
    Input: &schemas.EmbeddingInput{
        Texts: []string{"text to embed"},
    },
    Params: &schemas.EmbeddingParameters{
        Dimensions: schemas.Ptr(256),
        ExtraParams: map[string]interface{}{
            "task_type": "RETRIEVAL_DOCUMENT",
            "title": "Document title",
            "autoTruncate": true,
        },
    },
})
```

</Tab>
</Tabs>

#### Embedding Parameters

| Parameter      | Type    | Description                                                                                                               |
| -------------- | ------- | ------------------------------------------------------------------------------------------------------------------------- |
| `task_type`    | string  | Task type hint: `RETRIEVAL_QUERY`, `RETRIEVAL_DOCUMENT`, `SEMANTIC_SIMILARITY`, `CLASSIFICATION`, `CLUSTERING` (optional) |
| `title`        | string  | Optional title to help model produce better embeddings (used with task_type)                                              |
| `autoTruncate` | boolean | Auto-truncate input to max tokens (defaults to true)                                                                      |

### Task Type Effects

Different task types optimize embeddings for specific use cases:

- `RETRIEVAL_DOCUMENT` - Optimized for documents in retrieval systems
- `RETRIEVAL_QUERY` - Optimized for queries searching documents
- `SEMANTIC_SIMILARITY` - Optimized for semantic similarity tasks
- `CLASSIFICATION` - For classification tasks
- `CLUSTERING` - For clustering tasks

## Response Conversion

The embeddings response includes vectors and truncation information:

```json
{
  "embeddings": [
    {
      "values": [0.1234, -0.5678, ...],
      "statistics": {
        "token_count": 15,
        "truncated": false
      }
    }
  ]
}
```

**Response Fields**:

- `values` - Embedding vector as floats
- `statistics.token_count` - Input token count
- `statistics.truncated` - Whether input was truncated due to length
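
A client can decode these fields into typed structs and flag truncated inputs. This sketch assumes the response shape shown above; the type and function names are hypothetical.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// embeddingStatistics mirrors the statistics object in the sample response.
type embeddingStatistics struct {
	TokenCount int  `json:"token_count"`
	Truncated  bool `json:"truncated"`
}

// embedding is one vector plus its statistics.
type embedding struct {
	Values     []float64           `json:"values"`
	Statistics embeddingStatistics `json:"statistics"`
}

// embeddingResponse is a hypothetical type for the response shape above.
type embeddingResponse struct {
	Embeddings []embedding `json:"embeddings"`
}

// truncatedInputs returns the indexes of inputs that were cut to fit.
func truncatedInputs(raw []byte) ([]int, error) {
	var resp embeddingResponse
	if err := json.Unmarshal(raw, &resp); err != nil {
		return nil, fmt.Errorf("decode embeddings response: %w", err)
	}
	var indexes []int
	for i, e := range resp.Embeddings {
		if e.Statistics.Truncated {
			indexes = append(indexes, i)
		}
	}
	return indexes, nil
}

func main() {
	sample := `{"embeddings":[{"values":[0.1234,-0.5678],"statistics":{"token_count":15,"truncated":false}}]}`
	fmt.Println(truncatedInputs([]byte(sample)))
}
```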

---

# 4. Image Generation

Image Generation is supported for Gemini and Imagen on Vertex AI. The provider automatically routes to the appropriate format based on the model type.

## Request Parameters

### Core Parameter Mapping

| Parameter        | Vertex Handling                       | Notes                                             |
| ---------------- | ------------------------------------- | ------------------------------------------------- |
| `model`          | Mapped to deployment/model identifier | Model type detected automatically                 |
| `prompt`         | Model-specific conversion             | Converted per underlying provider (Gemini/Imagen) |
| All other params | Model-specific conversion             | Converted per underlying provider                 |

### Model Type Detection

Vertex automatically detects the model type and uses the appropriate conversion:

1. **Gemini Models**: Uses Gemini format (same as [Gemini Image Generation](/providers/supported-providers/gemini#8-image-generation))
2. **Imagen Models**: Uses Imagen format (detected via `IsImagenModel()`)

### Configuration

<Tabs>
<Tab title="Gateway">

```bash
curl -X POST http://localhost:8080/v1/images/generations \
  -H "Content-Type: application/json" \
  -d '{
    "model": "vertex/imagen-4.0-generate-001",
    "prompt": "A sunset over the mountains",
    "size": "1024x1024",
    "n": 2,
    "project_id": "my-gcp-project",
    "region": "us-central1"
  }' \
  -H "X-Goog-Authorization: Bearer {token}"
```

</Tab>
<Tab title="Go SDK">

```go
resp, err := client.ImageGenerationRequest(schemas.NewBifrostContext(ctx, schemas.NoDeadline), &schemas.BifrostImageGenerationRequest{
    Provider: schemas.Vertex,
    Model:    "imagen-4.0-generate-001",
    Input: &schemas.ImageGenerationInput{
        Prompt: "A sunset over the mountains",
    },
    Params: &schemas.ImageGenerationParameters{
        Size: schemas.Ptr("1024x1024"),
        N:    schemas.Ptr(2),
    },
})
```

</Tab>
</Tabs>

## Request Conversion

Vertex converts requests based on model type:

- **Gemini Models**: Uses `gemini.ToGeminiImageGenerationRequest()` - same conversion as standard Gemini (see [Gemini Image Generation](/providers/supported-providers/gemini#8-image-generation))
- **Imagen Models**: Uses `gemini.ToImagenImageGenerationRequest()` - Imagen-specific format with size/aspect ratio conversion

All request bodies are converted to `map[string]interface{}` and the `region` field is removed before sending to Vertex API.

## Response Conversion

- **Gemini Models**: Responses converted using `GenerateContentResponse.ToBifrostImageGenerationResponse()` - same as standard Gemini
- **Imagen Models**: Responses converted using `GeminiImagenResponse.ToBifrostImageGenerationResponse()` - Imagen-specific format

## Endpoint Selection

The provider automatically selects the endpoint based on model type:

- **Fine-tuned models**: `/v1beta1/projects/{projectNumber}/locations/{region}/endpoints/{deployment}:generateContent`
- **Imagen models**: `/v1/projects/{projectID}/locations/{region}/publishers/google/models/{model}:predict`
- **Gemini models**: `/v1/projects/{projectID}/locations/{region}/publishers/google/models/{model}:generateContent`
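
The three cases above can be sketched as a single path builder. This is an illustration only: `imageGenerationPath` is a hypothetical helper, and `isImagenModel` is a stand-in that assumes a substring check on the model name; it is not the real `IsImagenModel()` logic.

```go
package main

import (
	"fmt"
	"strings"
)

// isImagenModel is a stand-in for the Imagen check mentioned earlier.
// Assumption: the model name contains "imagen".
func isImagenModel(model string) bool {
	return strings.Contains(strings.ToLower(model), "imagen")
}

// imageGenerationPath is a hypothetical helper that picks the endpoint path
// by model type. For fine-tuned models, `model` is the deployment identifier.
func imageGenerationPath(projectID, projectNumber, region, model string, fineTuned bool) string {
	switch {
	case fineTuned:
		return fmt.Sprintf("/v1beta1/projects/%s/locations/%s/endpoints/%s:generateContent",
			projectNumber, region, model)
	case isImagenModel(model):
		return fmt.Sprintf("/v1/projects/%s/locations/%s/publishers/google/models/%s:predict",
			projectID, region, model)
	default:
		return fmt.Sprintf("/v1/projects/%s/locations/%s/publishers/google/models/%s:generateContent",
			projectID, region, model)
	}
}

func main() {
	fmt.Println(imageGenerationPath("my-gcp-project", "", "us-central1", "imagen-4.0-generate-001", false))
}
```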

## Streaming

Image generation streaming is not supported by Vertex AI.

---

# 5. Image Edit

<Warning>Requests use **multipart/form-data**, not JSON.</Warning>

Image Edit is supported for Gemini and Imagen models on Vertex AI. The provider automatically routes to the appropriate format based on the model type.

**Request Parameters**

| Parameter            | Type   | Required | Notes                                                                                                                                                           |
| -------------------- | ------ | -------- | --------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| `model`              | string | ✅       | Model identifier (must be Gemini or Imagen model)                                                                                                               |
| `prompt`             | string | ✅       | Text description of the edit                                                                                                                                    |
| `image[]`            | binary | ✅       | Image file(s) to edit (supports multiple images)                                                                                                                |
| `mask`               | binary | ❌       | Mask image file                                                                                                                                                 |
| `type`               | string | ❌       | Edit type: `"inpainting"`, `"outpainting"`, `"inpaint_removal"`, `"bgswap"` (Imagen only)                                                                       |
| `n`                  | int    | ❌       | Number of images to generate (1-10)                                                                                                                             |
| `output_format`      | string | ❌       | Output format: `"png"`, `"webp"`, `"jpeg"`                                                                                                                      |
| `output_compression` | int    | ❌       | Compression level (0-100%)                                                                                                                                      |
| `seed`               | int    | ❌       | Seed for reproducibility (via `ExtraParams["seed"]`)                                                                                                            |
| `negative_prompt`    | string | ❌       | Negative prompt (via `ExtraParams["negativePrompt"]`)                                                                                                           |
| `maskMode`           | string | ❌       | Mask mode (via `ExtraParams["maskMode"]`, Imagen only): `"MASK_MODE_USER_PROVIDED"`, `"MASK_MODE_BACKGROUND"`, `"MASK_MODE_FOREGROUND"`, `"MASK_MODE_SEMANTIC"` |
| `dilation`           | float  | ❌       | Mask dilation (via `ExtraParams["dilation"]`, Imagen only): Range [0, 1]                                                                                        |
| `maskClasses`        | int[]  | ❌       | Mask classes (via `ExtraParams["maskClasses"]`, Imagen only): For `MASK_MODE_SEMANTIC`                                                                          |

---

**Request Conversion**

Vertex uses the same conversion functions as Gemini:

1. **Gemini Models**: Uses `gemini.ToGeminiImageEditRequest()` - same conversion as standard Gemini (see [Gemini Image Edit](/providers/supported-providers/gemini#9-image-edit))
2. **Imagen Models**: Uses `gemini.ToImagenImageEditRequest()` - Imagen-specific format with edit mode mapping and mask configuration (see [Gemini Image Edit](/providers/supported-providers/gemini#9-image-edit))

**Model Validation**: Only Gemini and Imagen models are supported. Requests for any other model return a `ConfigurationError`.

**Request Body Processing**:

- All request bodies are converted to `map[string]interface{}` for Vertex API compatibility
- The `region` field is removed before sending to Vertex API
- For Gemini models, unsupported fields are stripped via `stripVertexGeminiUnsupportedFields()` (removes `id` from `function_call` and `function_response`)
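
The processing steps above can be sketched as follows. This is a minimal illustration, not Bifrost's implementation: `prepareVertexBody` and `stripIDs` are hypothetical helpers, and the JSON key names they look for (both camelCase and snake_case spellings) are assumptions.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// stripKeys lists the (assumed) keys whose nested objects must not carry an "id".
var stripKeys = map[string]bool{
	"functionCall":      true,
	"functionResponse":  true,
	"function_call":     true,
	"function_response": true,
}

// stripIDs walks a decoded JSON value and deletes "id" from every
// function call / function response object it finds.
func stripIDs(v interface{}) {
	switch node := v.(type) {
	case map[string]interface{}:
		for key, child := range node {
			if inner, ok := child.(map[string]interface{}); ok && stripKeys[key] {
				delete(inner, "id")
			}
			stripIDs(child)
		}
	case []interface{}:
		for _, child := range node {
			stripIDs(child)
		}
	}
}

// prepareVertexBody is a hypothetical helper: it drops the top-level "region"
// field and, for Gemini models, strips the unsupported "id" fields.
func prepareVertexBody(body map[string]interface{}, isGemini bool) map[string]interface{} {
	delete(body, "region")
	if isGemini {
		stripIDs(body)
	}
	return body
}

func main() {
	raw := `{"region":"us-central1","contents":[{"parts":[{"functionCall":{"id":"call_1","name":"lookup"}}]}]}`
	var body map[string]interface{}
	if err := json.Unmarshal([]byte(raw), &body); err != nil {
		panic(err)
	}
	out, err := json.Marshal(prepareVertexBody(body, true))
	if err != nil {
		panic(err)
	}
	fmt.Println(string(out))
}
```

Working on the generic map form keeps the sketch independent of any typed request struct, which matches the "converted to `map[string]interface{}`" step described above.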

**Response Conversion**

- **Gemini Models**: Responses converted using `GenerateContentResponse.ToBifrostImageGenerationResponse()` - same as standard Gemini
- **Imagen Models**: Responses converted using `GeminiImagenResponse.ToBifrostImageGenerationResponse()` - Imagen-specific format

**Endpoint Selection**

The provider automatically selects the endpoint based on model type:

- **Gemini models**: `/v1/projects/{projectID}/locations/{region}/publishers/google/models/{model}:generateContent`
- **Imagen models**: `/v1/projects/{projectID}/locations/{region}/publishers/google/models/{model}:predict`
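
As a rough sketch of the selection above, the path can be built from the project, region, and model. The function name `imageEditPath` is hypothetical, and detecting Imagen models by an `imagen` substring is an assumption; the provider's actual detection logic may differ.

```go
package main

import (
	"fmt"
	"strings"
)

// imageEditPath returns the publisher-model path for an image edit request.
// Assumption: Imagen models are recognised by "imagen" appearing in the name.
func imageEditPath(projectID, region, model string) string {
	method := "generateContent" // Gemini models
	if strings.Contains(strings.ToLower(model), "imagen") {
		method = "predict" // Imagen models
	}
	return fmt.Sprintf("/v1/projects/%s/locations/%s/publishers/google/models/%s:%s",
		projectID, region, model, method)
}

func main() {
	// Model names here are placeholders for illustration.
	fmt.Println(imageEditPath("my-project", "us-central1", "gemini-image-model"))
	fmt.Println(imageEditPath("my-project", "us-central1", "imagen-edit-model"))
}
```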

**Streaming**

Image edit streaming is not supported by Vertex AI.

**Image Variation**

Image variation is not supported by Vertex AI.

---

# 6. List Models

## Request Parameters

None required. Bifrost automatically uses the `project_id` and `region` from the key configuration.

## Response Conversion

Lists models available in the specified project and region with metadata and deployment information:

```json
{
  "models": [
    {
      "name": "projects/{project}/locations/{region}/models/gemini-2.0-flash",
      "display_name": "Gemini 2.0 Flash",
      "description": "Fast multimodal model",
      "version_id": "1",
      "version_aliases": ["latest", "stable"],
      "capabilities": [...],
      "deployed_models": [...]
    }
  ],
  "next_page_token": "..."
}
```

## Custom vs Non-Custom Models

<Warning>
  **Important**: Vertex AI's List Models API **only returns custom fine-tuned
  models** that have been deployed to your project. It does NOT return standard
  foundation models (Gemini, Claude, etc.).
</Warning>

To provide a complete model listing experience, Bifrost performs **multi-pass model discovery**:

### Three-Pass Model Discovery

1. **First Pass - Custom Models from API Response**
   - Queries Vertex AI's List Models API
   - Returns only custom fine-tuned models deployed to your project
   - Custom models are identified by deployment values that contain only digits
   - Example: `"deployment": "1234567890"`

2. **Second Pass - Non-Custom Models from Aliases**
   - Adds standard foundation models from your `aliases` configuration
   - Non-custom models have alphanumeric deployment values (e.g., `gemini-pro`, `claude-3-5-sonnet`)
   - Filters by the key-level `models` allowlist, if specified
   - Example: `"deployment": "gemini-2.0-flash"`

3. **Third Pass - Allowed Models Not in Aliases**
   - Adds models specified in `models` that weren't in the `aliases` map
   - Ensures all explicitly allowed models appear in the list
   - Uses the model name itself as the deployment value
   - Skips digit-only model IDs (reserved for custom models)

### Model Filtering Logic

- **If `models` is empty and no aliases are configured**: No models are returned
- **If `models` is empty but aliases are configured**: Only aliased models are returned
- **If `models` is `["*"]`**: All models from all three passes are included (unrestricted)
- **If `models` is non-empty**: Only models/aliases whose request names appear in `models` are included
- **Duplicate Prevention**: Each model ID is tracked to prevent duplicates across passes
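
The second and third passes, together with the filtering rules above, can be sketched as below. This is an illustration under stated assumptions rather than the code in `models.go`: `configModels`, `listedModel`, and `isDigitsOnly` are hypothetical names, the first pass (API results) is left out, and aliases are sorted only to make the output deterministic.

```go
package main

import (
	"fmt"
	"sort"
)

// isDigitsOnly reports whether s is non-empty and consists solely of ASCII
// digits, which is how custom deployments are described above.
func isDigitsOnly(s string) bool {
	if s == "" {
		return false
	}
	for _, r := range s {
		if r < '0' || r > '9' {
			return false
		}
	}
	return true
}

// listedModel is a hypothetical result type for this sketch.
type listedModel struct {
	ID         string
	Deployment string
}

// configModels covers passes two and three only.
func configModels(allowed []string, aliases map[string]string) []listedModel {
	allowAll := false
	allowSet := map[string]bool{}
	for _, m := range allowed {
		if m == "*" {
			allowAll = true
			continue
		}
		allowSet[m] = true
	}
	unrestricted := allowAll || len(allowed) == 0

	seen := map[string]bool{} // duplicate prevention across passes
	var out []listedModel

	// Second pass: non-custom aliases, filtered by the allowlist.
	names := make([]string, 0, len(aliases))
	for name := range aliases {
		names = append(names, name)
	}
	sort.Strings(names)
	for _, name := range names {
		dep := aliases[name]
		if isDigitsOnly(dep) || seen[name] {
			continue // digit-only deployments belong to the first pass
		}
		if !unrestricted && !allowSet[name] {
			continue
		}
		seen[name] = true
		out = append(out, listedModel{ID: name, Deployment: dep})
	}

	// Third pass: allowed models missing from aliases use their own name.
	for _, m := range allowed {
		if m == "*" || seen[m] || isDigitsOnly(m) {
			continue
		}
		if _, aliased := aliases[m]; aliased {
			continue
		}
		seen[m] = true
		out = append(out, listedModel{ID: m, Deployment: m})
	}
	return out
}

func main() {
	allowed := []string{"gemini-2.0-flash", "claude-3-5-sonnet"}
	aliases := map[string]string{
		"gemini-2.0-flash":  "gemini-2.0-flash",
		"claude-3-5-sonnet": "claude-3-5-sonnet-v2@20241022",
		"gemini-1.5-pro":    "gemini-1.5-pro",
	}
	for _, m := range configModels(allowed, aliases) {
		fmt.Printf("%s -> %s\n", m.ID, m.Deployment)
	}
}
```

With this input only the two allowed aliases are listed; `gemini-1.5-pro` is filtered out because it does not appear in `models`.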

### Model Name Formatting

Non-custom models from aliases and allowed models are automatically formatted for display:

- `gemini-pro` → "Gemini Pro"
- `claude-3-5-sonnet` → "Claude 3 5 Sonnet"
- `gemini_2_flash` → "Gemini 2 Flash"

Formatting uses title case and converts hyphens/underscores to spaces.
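
A minimal sketch of this formatting rule, assuming ASCII model IDs; `formatModelName` is a hypothetical name, not necessarily the helper Bifrost uses.

```go
package main

import (
	"fmt"
	"strings"
)

// formatModelName splits on hyphens and underscores, capitalises the first
// letter of each word, and joins the words with spaces.
func formatModelName(id string) string {
	words := strings.FieldsFunc(id, func(r rune) bool {
		return r == '-' || r == '_'
	})
	for i, w := range words {
		// FieldsFunc never yields empty strings, so w[:1] is safe.
		words[i] = strings.ToUpper(w[:1]) + w[1:]
	}
	return strings.Join(words, " ")
}

func main() {
	for _, id := range []string{"gemini-pro", "claude-3-5-sonnet", "gemini_2_flash"} {
		fmt.Printf("%s -> %q\n", id, formatModelName(id))
	}
}
```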

### Example Configuration

<Tabs>
<Tab title="With Custom Models Only">

```json
{
  "aliases": {
    "my-gemini-ft": "1234567890",
    "my-claude-ft": "9876543210"
  },
  "vertex_key_config": {
    "project_id": "my-project",
    "region": "us-central1"
  }
}
```

This returns only your custom fine-tuned models from the API.

</Tab>
<Tab title="With Foundation Models">

```json
{
  "aliases": {
    "gemini-2.0-flash": "gemini-2.0-flash",
    "claude-3-5-sonnet": "claude-3-5-sonnet-v2@20241022"
  },
  "vertex_key_config": {
    "project_id": "my-project",
    "region": "us-central1"
  }
}
```

This returns both custom models AND foundation models from aliases.

</Tab>
<Tab title="With Allowed Models Filter">

```json
{
  "models": ["gemini-2.0-flash", "claude-3-5-sonnet"],
  "aliases": {
    "gemini-2.0-flash": "gemini-2.0-flash",
    "claude-3-5-sonnet": "claude-3-5-sonnet-v2@20241022",
    "gemini-1.5-pro": "gemini-1.5-pro"
  },
  "vertex_key_config": {
    "project_id": "my-project",
    "region": "us-central1"
  }
}
```

Returns only `gemini-2.0-flash` and `claude-3-5-sonnet`; `gemini-1.5-pro` is excluded because it is not in `models`.

</Tab>
</Tabs>

### Pagination

Model listing is paginated automatically. If more than 100 models exist, `next_page_token` will be present. Bifrost handles pagination internally.
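
The token-following loop can be sketched as below. The `page` type and the `fetchPage` callback are hypothetical stand-ins for the real HTTP call and response struct; only the loop shape is the point.

```go
package main

import "fmt"

// page is a hypothetical, trimmed-down view of one List Models response.
type page struct {
	Models        []string
	NextPageToken string
}

// fetchPage stands in for the HTTP request; an empty token asks for the first page.
type fetchPage func(pageToken string) (page, error)

// listAllModels keeps requesting pages until no next_page_token is returned.
func listAllModels(fetch fetchPage) ([]string, error) {
	var all []string
	token := ""
	for {
		p, err := fetch(token)
		if err != nil {
			return nil, err
		}
		all = append(all, p.Models...)
		if p.NextPageToken == "" {
			return all, nil
		}
		token = p.NextPageToken
	}
}

func main() {
	// Fake two-page backend used only to exercise the loop.
	pages := map[string]page{
		"":   {Models: []string{"1234567890"}, NextPageToken: "p2"},
		"p2": {Models: []string{"9876543210"}},
	}
	models, err := listAllModels(func(token string) (page, error) {
		return pages[token], nil
	})
	if err != nil {
		panic(err)
	}
	fmt.Println(models)
}
```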

---

## Caveats

<Accordion title="Project ID and Region Required">
  **Severity**: High **Behavior**: Both `project_id` and `region` are required
  for all operations **Impact**: Requests fail without a valid GCP
  project/region configuration **Code**: `vertex.go:127-138`
</Accordion>

<Accordion title="OAuth2 Token Management">
  **Severity**: Medium **Behavior**: Tokens cached and automatically refreshed
  when expired **Impact**: First request slightly slower due to auth; cached for
  subsequent requests **Code**: `vertex.go:34-55`
</Accordion>

<Accordion title="Anthropic Model Detection">
  **Severity**: Medium **Behavior**: Automatic detection of Anthropic vs Gemini
  models **Impact**: Different conversion logic applied transparently **Code**:
  `vertex.go` chat/responses endpoints
</Accordion>

<Accordion title="Model-Specific Responses API Handling">
  **Severity**: Low **Behavior**: Responses API automatically routes to
  Anthropic or Gemini implementation based on model **Impact**: Different
  conversion logic applied transparently per model **Code**:
  `vertex.go:836-1080`
</Accordion>

<Accordion title="Anthropic Version Lock">
  **Severity**: Low **Behavior**: `anthropic_version` always set to
  `vertex-2023-10-16` for Claude **Impact**: Cannot override Anthropic version
  for Claude on Vertex **Code**: `utils.go:33, 71`
</Accordion>

<Accordion title="Embeddings Precision Preservation">
  **Severity**: Low **Behavior**: Vertex returns float64 embeddings, and Bifrost
  preserves that precision in normalized embedding responses **Impact**: No
  precision loss in the `/v1/embeddings` response path **Code**:
  `embedding.go:84-91`
</Accordion>

<Accordion title="List Models API Returns Only Custom Models">
  **Severity**: High **Behavior**: Vertex AI's List Models API only returns
  custom fine-tuned models, NOT foundation models **Impact**: Bifrost performs
  three-pass discovery to include foundation models from aliases and the
  key-level `models` allowlist **Why**: This is a Vertex AI API limitation -
  foundation models must be explicitly configured **Code**: `models.go:76-217`
</Accordion>

---

## Configuration

**HTTP Settings**: OAuth2 authentication with automatic token refresh | Region-specific endpoints | Max Connections 5000 | Max Idle 60 seconds

**Scope**: `https://www.googleapis.com/auth/cloud-platform`

**Endpoint Format**: `https://{region}-aiplatform.googleapis.com/v1/projects/{project}/locations/{region}/{resource}`

**Note**: For the `global` region, the endpoint has no region prefix: `https://aiplatform.googleapis.com/v1/projects/{project}/locations/global/{resource}`
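
The two endpoint shapes can be combined in one small helper. This is a sketch only; `vertexEndpoint` is a hypothetical name and the resource string in the example is a placeholder.

```go
package main

import "fmt"

// vertexEndpoint builds the full URL for a resource, dropping the region
// prefix from the host when the region is "global".
func vertexEndpoint(project, region, resource string) string {
	host := region + "-aiplatform.googleapis.com"
	if region == "global" {
		host = "aiplatform.googleapis.com"
	}
	return fmt.Sprintf("https://%s/v1/projects/%s/locations/%s/%s",
		host, project, region, resource)
}

func main() {
	fmt.Println(vertexEndpoint("my-project", "us-central1", "models"))
	fmt.Println(vertexEndpoint("my-project", "global", "models"))
}
```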

## Video Generation

Vertex AI routes video generation through Gemini's Veo models using the `predictLongRunning` endpoint. All parameters are identical to [Gemini Video Generation](/providers/supported-providers/gemini#video-generation).

<Note>
  Only Veo models are supported (e.g., `veo-2.0-generate-001`). Passing a
  non-Veo model name returns a configuration error.
</Note>

**Supported Operations**

| Operation | Supported | Notes                         |
| --------- | --------- | ----------------------------- |
| Generate  | ✅        | `POST /v1/videos`             |
| Retrieve  | ✅        | `GET /v1/videos/{id}`         |
| Download  | ✅        | `GET /v1/videos/{id}/content` |
| Delete    | ❌        | Not supported                 |
| List      | ❌        | Not supported                 |
| Remix     | ❌        | Not supported                 |