# Bifrost Integration Tests
|
|
|
|
Production-ready end-to-end test suite for testing AI integrations through Bifrost proxy. This test suite provides uniform testing across multiple AI integrations with comprehensive coverage of chat, tool calling, image processing, embeddings, speech synthesis, and multimodal workflows.
|
|
|
|
## 🎯 Quick Start (TL;DR)
|
|
|
|
```bash
# 1. Install uv (if not already installed)
curl -LsSf https://astral.sh/uv/install.sh | sh

# 2. Install dependencies
cd bifrost/tests/integrations
uv sync

# 3. Set environment variables
export BIFROST_BASE_URL="http://localhost:8080"
export OPENAI_API_KEY="your-key"
export ANTHROPIC_API_KEY="your-key"

# 4. Run tests
uv run pytest                                         # All tests
uv run pytest tests/integrations/test_openai.py -v    # Specific integration
uv run pytest -k "tool_call" -v                       # By pattern
uv run pytest -n auto                                 # Parallel execution
```
|
|
|
|
**Note:** All `pytest` commands in this README can be prefixed with `uv run`. If you prefer traditional pip, run `pip install -r requirements.txt` and use `pytest` directly.
|
|
|
|
## 🌉 Architecture Overview
|
|
|
|
The Bifrost integration tests use a centralized configuration system that routes all AI integration requests through Bifrost as a gateway/proxy:
|
|
|
|
```text
┌─────────────────┐    ┌─────────────────┐    ┌─────────────────┐
│   Test Client   │───▶│ Bifrost Gateway │───▶│ AI Integration  │
│                 │    │ localhost:8080  │    │ (OpenAI, etc.)  │
└─────────────────┘    └─────────────────┘    └─────────────────┘
```
|
|
|
|
### URL Structure
|
|
|
|
- **Base URL**: `http://localhost:8080` (configurable via `BIFROST_BASE_URL`)
- **Integration Endpoints**:
  - OpenAI: `http://localhost:8080/openai`
  - Anthropic: `http://localhost:8080/anthropic`
  - Google: `http://localhost:8080/genai`
  - LiteLLM: `http://localhost:8080/litellm`
|
|
|
|
## 🚀 Features
|
|
|
|
- **🌉 Bifrost Gateway Integration**: All integrations route through Bifrost proxy
|
|
- **🤖 Centralized Configuration**: YAML-based configuration with environment variable support
|
|
- **🔧 Integration-Specific Clients**: Type-safe, integration-optimized implementations
|
|
- **📋 Comprehensive Test Coverage**: 30 scenarios across six categories covering all major AI functionality
|
|
- **⚙️ Flexible Execution**: Selective test running with command-line flags
|
|
- **🛡️ Robust Error Handling**: Graceful error handling and detailed error reporting
|
|
- **🎯 Production-Ready**: Async support, timeouts, retries, and logging
|
|
- **🎵 Speech & Audio Support**: Text-to-speech synthesis and speech-to-text transcription testing
|
|
- **🔗 Embeddings Support**: Text-to-vector conversion and similarity analysis testing
|
|
|
|
## 📋 Test Categories
|
|
|
|
The test suite defines 30 scenarios in total. The speech and embeddings scenarios currently run for OpenAI only, so the other integrations cover a subset:
|
|
|
|
### Core Chat & Conversation Tests
|
|
1. **Simple Chat** - Basic single-message conversations
|
|
2. **Multi-turn Conversation** - Conversation history and context retention
|
|
3. **Streaming** - Real-time streaming responses and tool calls
|
|
|
|
### Tool Calling & Function Tests
|
|
4. **Single Tool Call** - Basic function calling capabilities
|
|
5. **Multiple Tool Calls** - Multiple tools in single request
|
|
6. **End-to-End Tool Calling** - Complete tool workflow with results
|
|
7. **Automatic Function Calling** - Integration-managed tool execution
|
|
|
|
### Image & Vision Tests
|
|
8. **Image Analysis (URL)** - Image processing from URLs
|
|
9. **Image Analysis (Base64)** - Image processing from base64 data
|
|
10. **Multiple Images** - Multi-image analysis and comparison
|
|
|
|
### Speech & Audio Tests (OpenAI)
|
|
11. **Speech Synthesis** - Text-to-speech conversion with different voices
|
|
12. **Audio Transcription** - Speech-to-text conversion with multiple formats
|
|
13. **Transcription Streaming** - Real-time transcription processing
|
|
14. **Speech Round-Trip** - Complete text→speech→text workflow validation
|
|
15. **Speech Error Handling** - Invalid voice, model, and input error handling
|
|
16. **Transcription Error Handling** - Invalid audio format and model error handling
|
|
17. **Voice & Format Testing** - Multiple voices and audio format validation
|
|
|
|
### Embeddings Tests (OpenAI)
|
|
18. **Single Text Embedding** - Basic text-to-vector conversion
|
|
19. **Batch Text Embeddings** - Multiple text embeddings in single request
|
|
20. **Embedding Similarity Analysis** - Cosine similarity testing for similar texts
|
|
21. **Embedding Dissimilarity Analysis** - Validation of different topic embeddings
|
|
22. **Different Embedding Models** - Testing various embedding model capabilities
|
|
23. **Long Text Embedding** - Handling of longer text inputs and token usage
|
|
24. **Embedding Error Handling** - Invalid model and input error processing
|
|
25. **Dimensionality Reduction** - Custom embedding dimensions (if supported)
|
|
26. **Encoding Format Testing** - Different embedding output formats
|
|
27. **Usage Tracking** - Token consumption and batch processing validation
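
The similarity and dissimilarity scenarios above compare embedding vectors by cosine similarity. A minimal sketch of that measure in plain Python follows; `cosine_similarity` is a hypothetical helper for illustration, and the suite's actual utility may differ.

```python
import math
from typing import Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Return the cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)
```

Values near 1 indicate closely related texts and values near 0 indicate unrelated ones; any pass/fail threshold a test applies on top of this is a tuning choice.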
|
|
|
|
### Integration & Error Tests
|
|
28. **Complex End-to-End** - Comprehensive multimodal workflows
|
|
29. **Integration-Specific Features** - Integration-unique capabilities
|
|
30. **Error Handling** - Invalid request error processing and propagation
|
|
|
|
## 📁 Directory Structure
|
|
|
|
```text
integrations/
├── config.yml                  # Central configuration file
├── pyproject.toml              # Python project configuration (uv/pip)
├── requirements.txt            # Python dependencies (legacy compatibility)
├── .python-version             # Python version specification for uv
├── pytest.ini                  # Pytest configuration
├── tests/
│   ├── conftest.py             # Pytest configuration and fixtures
│   ├── utils/
│   │   ├── common.py           # Shared test utilities and fixtures
│   │   ├── config_loader.py    # Configuration system
│   │   └── models.py           # Model configurations (compatibility layer)
│   └── integrations/
│       ├── test_openai.py      # OpenAI integration tests
│       ├── test_anthropic.py   # Anthropic integration tests
│       ├── test_google.py      # Google AI integration tests
│       └── test_litellm.py     # LiteLLM integration tests
```
|
|
|
|
## ⚡ Quick Start
|
|
|
|
### 1. Installation
|
|
|
|
```bash
|
|
# Clone the repository
|
|
git clone <repository-url>
|
|
cd bifrost/tests/integrations
|
|
|
|
# Install uv (if not already installed)
|
|
curl -LsSf https://astral.sh/uv/install.sh | sh
|
|
|
|
# Install dependencies with uv (recommended - fastest)
|
|
uv sync
|
|
|
|
# Or with traditional pip
|
|
pip install -r requirements.txt
|
|
```
|
|
|
|
#### Why use uv?
|
|
|
|
[uv](https://github.com/astral-sh/uv) is an extremely fast Python package installer and resolver, written in Rust. It's 10-100x faster than pip and provides better dependency resolution.
|
|
|
|
```bash
|
|
# Install dependencies
|
|
uv sync
|
|
|
|
# Run all tests
|
|
uv run pytest
|
|
|
|
# Run specific integration tests
|
|
uv run pytest tests/integrations/test_openai.py -v
|
|
|
|
# Run specific test categories
|
|
uv run pytest -k "tool_call" -v
|
|
```
|
|
|
|
### 2. Configuration
|
|
|
|
The system uses `config.yml` for centralized configuration. Set up your environment variables:
|
|
|
|
```bash
|
|
# Required: Bifrost gateway
|
|
export BIFROST_BASE_URL="http://localhost:8080"
|
|
|
|
# Required: Integration API keys
|
|
export OPENAI_API_KEY="your-openai-key"
|
|
export ANTHROPIC_API_KEY="your-anthropic-key"
|
|
export GOOGLE_API_KEY="your-google-api-key"
|
|
|
|
# Optional: Integration-specific settings
|
|
export OPENAI_ORG_ID="org-..."
|
|
export OPENAI_PROJECT_ID="proj_..."
|
|
export GOOGLE_PROJECT_ID="your-project"
|
|
export GOOGLE_LOCATION="us-central1"
|
|
export TEST_ENV="development"
|
|
|
|
# Quick check using Makefile
|
|
make check-env
|
|
```
|
|
|
|
### 3. Verify Configuration
|
|
|
|
```bash
|
|
# Test the configuration system
|
|
uv run python tests/utils/config_loader.py
|
|
```
|
|
|
|
This will display:
|
|
|
|
- 🌉 Bifrost gateway URLs
|
|
- 🤖 Model configurations
|
|
- ⚙️ API settings
|
|
- ✅ Validation status
|
|
|
|
### 4. Pytest Configuration
|
|
|
|
The project includes a `pytest.ini` file with optimized settings:
|
|
|
|
```ini
[pytest]
# Test discovery
testpaths = .
python_files = test_*.py
python_classes = Test*
python_functions = test_*

# Output formatting
addopts =
    -v
    --tb=short
    --strict-markers
    --disable-warnings
    --color=yes

# Timeout settings (3 minutes per test)
timeout = 180

# Markers for test categorization
markers =
    integration: marks tests as integration tests
    slow: marks tests as slow running
    e2e: marks tests as end-to-end tests
    tool_calling: marks tests as tool calling tests
```
|
|
|
|
### 5. Run Tests
|
|
|
|
```bash
|
|
# Run all tests
|
|
uv run pytest
|
|
|
|
# Run all tests with verbose output
|
|
uv run pytest -v
|
|
|
|
# Run specific integration tests
|
|
uv run pytest tests/integrations/test_openai.py -v
|
|
uv run pytest tests/integrations/test_anthropic.py -v
|
|
uv run pytest tests/integrations/test_google.py -v
|
|
uv run pytest tests/integrations/test_litellm.py -v
|
|
|
|
# Run specific test by name
|
|
uv run pytest tests/integrations/test_openai.py::TestOpenAIIntegration::test_01_simple_chat -v
|
|
|
|
# Run tests by pattern/category
|
|
uv run pytest -k "tool_call" -v # All tool calling tests
|
|
uv run pytest -k "image" -v # All image tests
|
|
uv run pytest -k "speech or transcription" -v # All audio tests
|
|
uv run pytest -k "embedding" -v # All embedding tests
|
|
|
|
# Run tests in parallel (faster)
|
|
uv run pytest -n auto
|
|
|
|
# Run with coverage report
|
|
uv run pytest --cov=tests --cov-report=html
|
|
|
|
# Traditional pip usage (if not using uv)
|
|
pytest tests/integrations/test_openai.py -v
|
|
```
|
|
|
|
## 🚄 Using uv (Recommended)
|
|
|
|
[uv](https://github.com/astral-sh/uv) is an extremely fast Python package installer and resolver, written in Rust. It's 10-100x faster than pip and provides better dependency resolution, making it the recommended way to run these tests.
|
|
|
|
### Installation
|
|
|
|
```bash
|
|
# Install uv
|
|
curl -LsSf https://astral.sh/uv/install.sh | sh
|
|
|
|
# Or with pip
|
|
pip install uv
|
|
|
|
# Or with Homebrew (macOS)
|
|
brew install uv
|
|
```
|
|
|
|
### Quick Start with uv
|
|
|
|
```bash
|
|
# 1. Install dependencies
|
|
uv sync
|
|
|
|
# 2. Run all tests
|
|
uv run pytest
|
|
|
|
# 3. Run specific integration
|
|
uv run pytest tests/integrations/test_openai.py -v
|
|
|
|
# 4. Run tests by pattern
|
|
uv run pytest -k "tool_call" -v
|
|
```
|
|
|
|
### Common Commands
|
|
|
|
```bash
|
|
# Setup
|
|
uv sync # Install all dependencies from pyproject.toml
|
|
|
|
# Running tests
|
|
uv run pytest # Run all tests
|
|
uv run pytest -v # Verbose output
|
|
uv run pytest -n auto # Run tests in parallel
|
|
uv run pytest -k "pattern" # Run tests matching pattern
|
|
uv run pytest tests/integrations/test_openai.py # Run specific file
|
|
|
|
# Development
|
|
uv run black . # Format code
|
|
uv run flake8 . # Lint code
|
|
uv run mypy . # Type check
|
|
|
|
# Managing dependencies
|
|
uv add package-name # Add a new dependency
|
|
uv remove package-name # Remove a dependency
|
|
uv pip list # List installed packages
|
|
```
|
|
|
|
### Why use uv?
|
|
|
|
1. **Speed**: 10-100x faster than pip for package installation
|
|
2. **Reliability**: Better dependency resolution and conflict detection
|
|
3. **Simplicity**: Single tool for package management and running scripts
|
|
4. **Modern**: Built with Rust, designed for speed and efficiency
|
|
5. **Compatible**: Works with standard Python packaging (pyproject.toml, requirements.txt)
|
|
|
|
### Migration from pip
|
|
|
|
If you're currently using pip, migrating to uv is straightforward:
|
|
|
|
```bash
|
|
# Old way (pip)
|
|
pip install -r requirements.txt
|
|
pytest tests/integrations/test_openai.py -v
|
|
|
|
# New way (uv)
|
|
uv sync
|
|
uv run pytest tests/integrations/test_openai.py -v
|
|
```
|
|
|
|
All existing pytest commands work the same way, just prefix them with `uv run`.
|
|
|
|
## 🔧 Configuration System
|
|
|
|
### Configuration Files
|
|
|
|
#### 1. `config.yml` - Main Configuration
|
|
|
|
Central configuration file containing:
|
|
|
|
- Bifrost gateway settings and endpoints
|
|
- Model configurations for all integrations
|
|
- API settings (timeouts, retries)
|
|
- Test parameters and limits
|
|
- Environment-specific overrides
|
|
- Integration-specific settings
|
|
|
|
#### 2. `tests/utils/config_loader.py` - Configuration Loader
|
|
|
|
Python module that:
|
|
|
|
- Loads and parses `config.yml`
|
|
- Expands environment variables with `${VAR:-default}` syntax
|
|
- Provides convenience functions for URLs and models
|
|
- Validates configuration completeness
|
|
- Handles error scenarios
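
The `${VAR:-default}` expansion mentioned above follows shell conventions. The sketch below shows one way such expansion could work; `expand_env` is a hypothetical function and is not taken from `config_loader.py`, whose actual implementation may differ.

```python
import os
import re
from typing import Mapping, Optional

# Matches ${NAME} and ${NAME:-default}; the default may contain ':' and '/'.
_ENV_PATTERN = re.compile(r"\$\{([A-Za-z_][A-Za-z0-9_]*)(?::-([^}]*))?\}")


def expand_env(value: str, env: Optional[Mapping[str, str]] = None) -> str:
    """Replace ${VAR} and ${VAR:-default} placeholders in a string."""
    env = os.environ if env is None else env

    def _substitute(match: re.Match) -> str:
        name, default = match.group(1), match.group(2)
        current = env.get(name)
        if current:
            # Variable is set and non-empty: use it.
            return current
        # Shell ':-' semantics: unset or empty falls back to the default.
        return default if default is not None else ""

    return _ENV_PATTERN.sub(_substitute, value)
```

With this approach, `base_url: "${BIFROST_BASE_URL:-http://localhost:8080}"` resolves to the local gateway unless the environment variable overrides it.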
|
|
|
|
#### 3. `tests/utils/models.py` - Compatibility Layer
|
|
|
|
Maintains backward compatibility while delegating to the new config system.
|
|
|
|
### Key Configuration Sections
|
|
|
|
#### Bifrost Gateway
|
|
|
|
```yaml
bifrost:
  base_url: "${BIFROST_BASE_URL:-http://localhost:8080}"
  endpoints:
    openai: "openai"
    anthropic: "anthropic"
    google: "genai"
    litellm: "litellm"
```
|
|
|
|
#### Model Configurations
|
|
|
|
```yaml
models:
  openai:
    chat: "gpt-3.5-turbo"
    vision: "gpt-4o"
    tools: "gpt-3.5-turbo"
    speech: "tts-1"
    transcription: "whisper-1"
    alternatives: ["gpt-4", "gpt-4-turbo-preview", "gpt-4o", "gpt-4o-mini"]
    speech_alternatives: ["tts-1-hd"]
    transcription_alternatives: ["whisper-1"]
```
|
|
|
|
#### API Settings
|
|
|
|
```yaml
api:
  timeout: 30
  max_retries: 3
  retry_delay: 1
```
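
To illustrate how `max_retries` and `retry_delay` might be applied, here is a minimal retry wrapper. It is a sketch under stated assumptions: `call_with_retries` is a hypothetical helper, `max_retries` is taken to mean additional attempts after the first, and the delay is fixed rather than exponential. The suite and the underlying SDK clients may handle retries differently.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def call_with_retries(
    func: Callable[[], T],
    max_retries: int = 3,
    retry_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call func, retrying up to max_retries times with a fixed delay."""
    last_error: Exception | None = None
    for attempt in range(max_retries + 1):
        try:
            return func()
        except Exception as exc:  # Assumption: every exception is retryable.
            last_error = exc
            if attempt < max_retries:
                sleep(retry_delay)
    assert last_error is not None
    raise last_error
```

Injecting `sleep` keeps the helper testable without real delays.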
|
|
|
|
### Usage Examples
|
|
|
|
#### Getting Integration URLs
|
|
|
|
```python
from tests.utils.config_loader import get_integration_url

# Get the integration URL routed through Bifrost
openai_url = get_integration_url("openai")
# Returns: http://localhost:8080/openai
```
|
|
|
|
#### Getting Model Names
|
|
|
|
```python
|
|
from tests.utils.config_loader import get_model
|
|
|
|
# Get different model types
|
|
chat_model = get_model("openai", "chat") # "gpt-3.5-turbo"
|
|
vision_model = get_model("openai", "vision") # "gpt-4o"
|
|
speech_model = get_model("openai", "speech") # "tts-1"
|
|
transcription_model = get_model("openai", "transcription") # "whisper-1"
|
|
```
|
|
|
|
## 🎵 Speech & Transcription Testing
|
|
|
|
The test suite includes comprehensive speech synthesis and transcription testing for supported integrations (currently OpenAI).
|
|
|
|
### Speech & Audio Test Categories
|
|
|
|
#### 1. Speech Synthesis (Text-to-Speech)
|
|
- **Basic synthesis**: Convert text to audio with different voices
|
|
- **Format testing**: Multiple audio formats (MP3, WAV, Opus)
|
|
- **Voice validation**: Test all available voices (alloy, echo, fable, onyx, nova, shimmer)
|
|
- **Parameter testing**: Response format, voice settings, and quality options
|
|
|
|
#### 2. Speech Streaming
|
|
- **Real-time generation**: Streaming audio synthesis for large texts
|
|
- **Chunk validation**: Verify audio chunk integrity and format
|
|
- **Performance testing**: Measure streaming latency and throughput
|
|
|
|
#### 3. Audio Transcription (Speech-to-Text)
|
|
- **File format support**: WAV, MP3, and other audio formats
|
|
- **Language detection**: Multi-language transcription capabilities
|
|
- **Parameter testing**: Language hints, response formats, temperature settings
|
|
- **Quality validation**: Transcription accuracy and completeness
|
|
|
|
#### 4. Transcription Streaming
|
|
- **Real-time processing**: Streaming transcription for long audio files
|
|
- **Progressive results**: Incremental text output validation
|
|
- **Error handling**: Network interruption and recovery testing
|
|
|
|
#### 5. Round-Trip Testing
|
|
- **Complete workflow**: Text → Speech → Transcription → Text validation
|
|
- **Accuracy measurement**: Compare original text with round-trip result
|
|
- **Quality assessment**: Measure transcription fidelity and word preservation
|
|
|
|
### Running Speech & Transcription Tests
|
|
|
|
#### Quick Start
|
|
|
|
```bash
|
|
# Run all speech and transcription tests
|
|
uv run pytest -k "speech or transcription" -v
|
|
|
|
# Run with verbose output and show print statements
|
|
uv run pytest -k "speech or transcription" -v -s
|
|
|
|
# Run specific test
|
|
uv run pytest tests/integrations/test_openai.py::TestOpenAIIntegration::test_14_speech_synthesis -v
|
|
|
|
# List available tests
|
|
uv run pytest --collect-only -k "speech or transcription"
|
|
```
|
|
|
|
#### Individual Test Examples
|
|
|
|
```bash
|
|
# Test speech synthesis
|
|
pytest tests/integrations/test_openai.py::TestOpenAIIntegration::test_14_speech_synthesis -v
|
|
|
|
# Test transcription
|
|
pytest tests/integrations/test_openai.py::TestOpenAIIntegration::test_16_transcription_audio -v
|
|
|
|
# Test round-trip workflow
|
|
pytest tests/integrations/test_openai.py::TestOpenAIIntegration::test_18_speech_transcription_round_trip -v
|
|
|
|
# Test error handling
|
|
pytest tests/integrations/test_openai.py::TestOpenAIIntegration::test_19_speech_error_handling -v
|
|
pytest tests/integrations/test_openai.py::TestOpenAIIntegration::test_20_transcription_error_handling -v
|
|
```
|
|
|
|
#### Available Test Audio Types
|
|
|
|
1. **Sine Wave**: Pure tone audio for basic testing
|
|
2. **Chord**: Multi-frequency audio for complex signal testing
|
|
3. **Frequency Sweep**: Variable frequency audio for range testing
|
|
4. **White Noise**: Random audio for noise handling testing
|
|
5. **Silence**: Empty audio for edge case testing
|
|
6. **Various Durations**: Short (0.5s) to long (10s) audio files
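
As an illustration of the first audio type, the sketch below generates a mono 16-bit sine-wave WAV in memory using only the standard library and checks its RIFF/WAVE header. `generate_sine_wav` and `looks_like_wav` are hypothetical names; the suite's own generator in its utilities may use different parameters.

```python
import io
import math
import struct
import wave


def generate_sine_wav(
    frequency: float = 440.0,
    duration: float = 0.5,
    sample_rate: int = 16000,
    amplitude: float = 0.5,
) -> bytes:
    """Return WAV bytes containing a pure tone (mono, 16-bit PCM)."""
    n_frames = int(duration * sample_rate)
    frames = bytearray()
    for i in range(n_frames):
        sample = amplitude * math.sin(2 * math.pi * frequency * i / sample_rate)
        frames += struct.pack("<h", int(sample * 32767))  # little-endian int16

    buffer = io.BytesIO()
    with wave.open(buffer, "wb") as wav:
        wav.setnchannels(1)
        wav.setsampwidth(2)  # 2 bytes per sample = 16-bit
        wav.setframerate(sample_rate)
        wav.writeframes(bytes(frames))
    return buffer.getvalue()


def looks_like_wav(data: bytes) -> bool:
    """Check the RIFF container magic and the WAVE format tag."""
    return len(data) >= 12 and data[:4] == b"RIFF" and data[8:12] == b"WAVE"
```

Setting `amplitude=0.0` yields a silent clip, which covers the silence edge case with the same helper.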
|
|
|
|
### Speech & Transcription Configuration
|
|
|
|
#### Model Configuration
|
|
|
|
```yaml
models:
  openai:
    speech: "tts-1"                            # Default speech synthesis model
    transcription: "whisper-1"                 # Default transcription model
    speech_alternatives: ["tts-1-hd"]          # Higher quality speech model
    transcription_alternatives: ["whisper-1"]  # Alternative transcription models

# Model capabilities
model_capabilities:
  "tts-1":
    speech: true
    streaming: false  # Streaming support varies
    max_tokens: null
    context_window: null

  "whisper-1":
    transcription: true
    streaming: false  # Streaming support varies
    max_tokens: null
    context_window: null
```
|
|
|
|
#### Test Settings
|
|
|
|
```yaml
test_settings:
  max_tokens:
    speech: null         # Speech doesn't use token limits
    transcription: null  # Transcription doesn't use token limits

  timeouts:
    speech: 60           # Speech generation timeout
    transcription: 60    # Transcription processing timeout
```
|
|
|
|
### Speech Test Examples
|
|
|
|
#### Basic Speech Synthesis
|
|
|
|
```python
# Test basic speech synthesis
response = openai_client.audio.speech.create(
    model="tts-1",
    voice="alloy",
    input="Hello, this is a test of speech synthesis.",
)
audio_content = response.content
assert len(audio_content) > 1000  # Ensure substantial audio data
```
|
|
|
|
#### Transcription Testing
|
|
|
|
```python
# Test audio transcription
test_audio = generate_test_audio()  # Generate test WAV file
response = openai_client.audio.transcriptions.create(
    model="whisper-1",
    file=("test.wav", test_audio, "audio/wav"),
    language="en",
)
transcribed_text = response.text
assert len(transcribed_text) > 0  # Ensure transcription occurred
```
|
|
|
|
#### Round-Trip Validation
|
|
|
|
```python
# Complete round-trip test
original_text = "The quick brown fox jumps over the lazy dog."

# Step 1: Text to speech
speech_response = openai_client.audio.speech.create(
    model="tts-1",
    voice="alloy",
    input=original_text,
    response_format="wav",
)

# Step 2: Speech to text
transcription_response = openai_client.audio.transcriptions.create(
    model="whisper-1",
    file=("speech.wav", speech_response.content, "audio/wav"),
)

# Step 3: Validate similarity
transcribed_text = transcription_response.text
# Check for key word preservation (allowing for transcription variations)
```
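
The key-word preservation check in Step 3 is left as a comment above. One possible way to score it is the fraction of original words that survive the round trip, as sketched here. `normalize_words` and `word_preservation` are hypothetical helpers, and any acceptance threshold is an assumption to tune per voice and model.

```python
import re


def normalize_words(text: str) -> list[str]:
    """Lowercase the text and split it into words, dropping punctuation."""
    return re.findall(r"[a-z0-9']+", text.lower())


def word_preservation(original: str, transcribed: str) -> float:
    """Return the fraction of distinct original words found in the transcript."""
    original_words = set(normalize_words(original))
    if not original_words:
        return 1.0  # Nothing to preserve.
    transcribed_words = set(normalize_words(transcribed))
    return len(original_words & transcribed_words) / len(original_words)
```

A round-trip test could then assert, for example, `word_preservation(original_text, transcribed_text) >= 0.5`, which tolerates minor transcription variations while still catching garbled output.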
|
|
|
|
### Error Handling Tests
|
|
|
|
#### Speech Synthesis Errors
|
|
|
|
```python
# Test invalid voice
with pytest.raises(Exception):
    openai_client.audio.speech.create(
        model="tts-1",
        voice="invalid_voice",
        input="This should fail",
    )

# Test empty input
with pytest.raises(Exception):
    openai_client.audio.speech.create(
        model="tts-1",
        voice="alloy",
        input="",
    )
```
|
|
|
|
#### Transcription Errors
|
|
|
|
```python
# Test invalid audio format
invalid_audio = b"This is not audio data"
with pytest.raises(Exception):
    openai_client.audio.transcriptions.create(
        model="whisper-1",
        file=("invalid.wav", invalid_audio, "audio/wav"),
    )

# Test unsupported file type
with pytest.raises(Exception):
    openai_client.audio.transcriptions.create(
        model="whisper-1",
        file=("test.txt", b"text content", "text/plain"),
    )
```
|
|
|
|
### Integration Support Matrix
|
|
|
|
| Integration | Speech Synthesis | Transcription | Streaming | Notes |
|-------------|------------------|---------------|-----------|-------|
| OpenAI | ✅ Full Support | ✅ Full Support | 🔄 Varies | Complete implementation |
| Anthropic | ❌ Not Available | ❌ Not Available | ❌ No | No speech/audio APIs |
| Google | ❌ Not Available* | ❌ Not Available* | ❌ No | *Not through Gemini API |
| LiteLLM | ✅ Via OpenAI | ✅ Via OpenAI | 🔄 Varies | Proxies to OpenAI |
|
|
|
|
*Note: Google offers speech services through separate APIs (Cloud Speech-to-Text, Cloud Text-to-Speech) that are not currently integrated.*
|
|
|
|
### Performance Considerations
|
|
|
|
#### Speech Synthesis
|
|
- **File Size**: Generated audio files range from 50KB to 5MB depending on length and quality
|
|
- **Generation Time**: Typically 2-10 seconds for short texts, longer for complex content
|
|
- **Format Impact**: WAV files are larger but offer better compatibility; MP3 is more compressed
|
|
|
|
#### Transcription
|
|
- **Processing Time**: Usually 1-5 seconds for short audio files (under 30 seconds)
|
|
- **File Size Limits**: Most services support files up to 25MB
|
|
- **Accuracy Factors**: Audio quality, background noise, speaker clarity affect results
|
|
|
|
### Best Practices
|
|
|
|
#### For Speech Testing
|
|
1. **Use consistent test text** for reproducible results
|
|
2. **Test multiple voices** to ensure voice switching works
|
|
3. **Validate audio headers** to confirm proper format generation
|
|
4. **Check file sizes** to ensure reasonable audio generation
|
|
|
|
#### For Transcription Testing
|
|
1. **Use high-quality test audio** for consistent transcription results
|
|
2. **Test various audio formats** (WAV, MP3, etc.) for compatibility
|
|
3. **Include silence and noise** tests for edge case handling
|
|
4. **Validate response formats** (JSON, text) as needed
|
|
|
|
#### For Round-Trip Testing
|
|
1. **Use simple, clear phrases** to maximize transcription accuracy
|
|
2. **Allow for minor variations** in transcribed text
|
|
3. **Focus on key word preservation** rather than exact matches
|
|
4. **Test with different voices** to ensure consistency across voice models
|
|
|
|
### Troubleshooting
|
|
|
|
#### Common Issues
|
|
|
|
1. **Audio Format Errors**

   ```bash
   # Check audio file headers
   file test_audio.wav
   # Should show: RIFF (little-endian) data, WAVE audio
   ```

2. **API Key Issues**

   ```bash
   # Verify OpenAI API key
   export OPENAI_API_KEY="your-key-here"
   python test_audio.py --test test_14_speech_synthesis
   ```

3. **Bifrost Configuration**

   ```bash
   # Ensure Bifrost is running and accessible
   curl http://localhost:8080/openai/v1/audio/speech -I
   ```

4. **Model Availability**

   ```python
   # Check if speech/transcription models are available
   from tests.utils.config_loader import get_model
   print("Speech model:", get_model("openai", "speech"))
   print("Transcription model:", get_model("openai", "transcription"))
   ```
|
|
|
|
#### Debug Commands
|
|
|
|
```bash
|
|
# Test individual components
|
|
uv run pytest tests/integrations/test_openai.py::TestOpenAIIntegration::test_14_speech_synthesis -v -s
|
|
|
|
# Check Bifrost logs for audio endpoint requests
|
|
# (Check your Bifrost instance logs)
|
|
```
|
|
|
|
## Getting Model Names
|
|
|
|
```python
|
|
from tests.utils.config_loader import get_model
|
|
|
|
# Get chat model for OpenAI
|
|
chat_model = get_model("openai", "chat")
|
|
# Returns: gpt-3.5-turbo
|
|
|
|
# Get vision model for Anthropic
|
|
vision_model = get_model("anthropic", "vision")
|
|
# Returns: claude-3-haiku-20240307
|
|
```
|
|
|
|
## 🤖 Integration Support
|
|
|
|
### Currently Supported Integrations
|
|
|
|
#### OpenAI
|
|
|
|
- ✅ **Full Bifrost Integration**: Complete base URL support
|
|
- ✅ **Models**: gpt-3.5-turbo, gpt-4, gpt-4o, gpt-4o-mini, text-embedding-3-small, tts-1, whisper-1
|
|
- ✅ **Features**: Chat, tools, vision, speech synthesis, transcription, embeddings
|
|
- ✅ **Settings**: Organization/project IDs, timeouts, retries
|
|
- ✅ **All Test Categories**: 30/30 scenarios supported (including speech & embeddings)
|
|
|
|
#### Anthropic
|
|
|
|
- ✅ **Full Bifrost Integration**: Complete base URL support
|
|
- ✅ **Models**: claude-3-haiku-20240307, claude-3-sonnet-20240229, claude-3-opus-20240229, claude-3-5-sonnet-20241022
|
|
- ✅ **Features**: Chat, tools, vision
|
|
- ✅ **Settings**: API version headers, timeouts, retries
|
|
- ✅ **All Test Categories**: 11/11 scenarios supported
|
|
|
|
#### Google AI
|
|
|
|
- ✅ **Full Bifrost Integration**: Complete custom transport implementation
|
|
- ✅ **Models**: gemini-2.0-flash-001, gemini-1.5-pro, gemini-1.5-flash, gemini-1.0-pro
|
|
- ✅ **Features**: Chat, tools, vision, multimodal processing
|
|
- ✅ **Settings**: Project ID, location, API configuration
|
|
- ✅ **All Test Categories**: 11/11 scenarios supported
|
|
- ✅ **Custom Base64 Handling**: Resolved cross-language encoding compatibility
|
|
|
|
#### LiteLLM
|
|
|
|
- ✅ **Full Bifrost Integration**: Global base URL configuration
|
|
- ✅ **Models**: Supports all LiteLLM-compatible models
|
|
- ✅ **Features**: Chat, tools, vision (integration-dependent)
|
|
- ✅ **Settings**: Drop params, debug mode, integration-specific configs
|
|
- ✅ **All Test Categories**: 11/11 scenarios supported
|
|
- ✅ **Multi-Integration**: OpenAI, Anthropic, Google, Azure, Cohere, Mistral, etc.
|
|
|
|
## 🧪 Running Tests
|
|
|
|
### Test Execution Methods
|
|
|
|
#### Using pytest with uv
|
|
|
|
```bash
|
|
# Run all tests for an integration
|
|
uv run pytest tests/integrations/test_openai.py -v
|
|
|
|
# Run specific test categories
|
|
uv run pytest tests/integrations/test_openai.py::TestOpenAIIntegration::test_01_simple_chat -v
|
|
|
|
# Run with coverage
|
|
uv run pytest tests/integrations/ --cov=tests --cov-report=html
|
|
|
|
# Run with custom markers
|
|
uv run pytest tests/integrations/ -m "not slow" -v
|
|
```
|
|
|
|
#### Selective Test Execution
|
|
|
|
```bash
|
|
# Skip tests that require API keys you don't have
|
|
uv run pytest tests/integrations/test_openai.py -v # Will skip if OPENAI_API_KEY not set
|
|
|
|
# Run only specific test methods
|
|
uv run pytest tests/integrations/test_anthropic.py -k "tool_call" -v
|
|
|
|
# Run with timeout
|
|
uv run pytest tests/integrations/ --timeout=300 -v
|
|
```
|
|
|
|
### 🔍 Checking and Running Specific Tests
|
|
|
|
#### 🚀 Quick Commands (Most Common)
|
|
|
|
```bash
|
|
# Run specific test for specific integration
|
|
uv run pytest tests/integrations/test_google.py::TestGoogleIntegration::test_03_single_tool_call -v
|
|
|
|
# Run all tool calling tests across all integrations
|
|
uv run pytest -k "tool_call" -v
|
|
|
|
# Run all tests for one integration
|
|
uv run pytest tests/integrations/test_openai.py -v
|
|
|
|
# Run tests in parallel (faster)
|
|
uv run pytest -n auto
|
|
|
|
# Run with coverage
|
|
uv run pytest --cov=tests --cov-report=html -v
|
|
```
|
|
|
|
#### Quick Reference: Test Categories
|
|
|
|
```text
|
|
Test 01: Simple Chat - Basic single-message conversations
|
|
Test 02: Multi-turn Conversation - Conversation history and context
|
|
Test 03: Single Tool Call - Basic function calling
|
|
Test 04: Multiple Tool Calls - Multiple tools in one request
|
|
Test 05: End-to-End Tool Calling - Complete tool workflow with results
|
|
Test 06: Automatic Function Call - Integration-managed tool execution
|
|
Test 07: Image Analysis (URL) - Image processing from URLs
|
|
Test 08: Image Analysis (Base64) - Image processing from base64
|
|
Test 09: Multiple Images - Multi-image analysis and comparison
|
|
Test 10: Complex End-to-End - Comprehensive multimodal workflows
|
|
Test 11: Integration-Specific - Integration-unique features
|
|
```
|
|
|
|
#### Listing Available Tests
|
|
|
|
```bash
|
|
# List all tests for a specific integration
|
|
pytest tests/integrations/test_openai.py --collect-only
|
|
|
|
# List all test methods with descriptions
|
|
pytest tests/integrations/test_openai.py --collect-only -q
|
|
|
|
# Show test structure for all integrations
|
|
pytest tests/integrations/ --collect-only
|
|
```
|
|
|
|
#### Running Individual Test Categories
|
|
|
|
```bash
|
|
# Test 1: Simple Chat
|
|
pytest tests/integrations/test_openai.py::TestOpenAIIntegration::test_01_simple_chat -v

# Test 3: Single Tool Call
pytest tests/integrations/test_anthropic.py::TestAnthropicIntegration::test_03_single_tool_call -v

# Test 7: Image Analysis (URL)
pytest tests/integrations/test_google.py::TestGoogleIntegration::test_07_image_url -v

# Test 9: Multiple Images
pytest tests/integrations/test_litellm.py::TestLiteLLMIntegration::test_09_multiple_images -v

# Test 21: Single Text Embedding (OpenAI only)
pytest tests/integrations/test_openai.py::TestOpenAIIntegration::test_21_single_text_embedding -v

# Test 23: Embedding Similarity Analysis (OpenAI only)
pytest tests/integrations/test_openai.py::TestOpenAIIntegration::test_23_embedding_similarity_analysis -v
```

#### Running Test Categories by Pattern

```bash
# Run all simple chat tests across integrations
pytest tests/integrations/ -k "test_01_simple_chat" -v

# Run all tool calling tests (single and multiple)
pytest tests/integrations/ -k "tool_call" -v

# Run all image-related tests
pytest tests/integrations/ -k "image" -v

# Run all embedding tests (OpenAI only)
pytest tests/integrations/test_openai.py -k "embedding" -v

# Run all speech and audio tests (OpenAI only)
pytest tests/integrations/test_openai.py -k "speech or transcription" -v

# Run all end-to-end tests
pytest tests/integrations/ -k "end2end" -v

# Run integration-specific feature tests
pytest tests/integrations/ -k "integration_specific" -v
```

#### Running Tests by Integration

```bash
# Run all OpenAI tests
pytest tests/integrations/test_openai.py -v

# Run all Anthropic tests with detailed output
pytest tests/integrations/test_anthropic.py -v -s

# Run Google tests with coverage
pytest tests/integrations/test_google.py --cov=tests --cov-report=term-missing -v

# Run LiteLLM tests with timing
pytest tests/integrations/test_litellm.py --durations=10 -v
```

#### Advanced Test Selection

```bash
# Run tests 1-5 (basic functionality) for OpenAI
pytest tests/integrations/test_openai.py -k "test_01 or test_02 or test_03 or test_04 or test_05" -v

# Run only vision tests (tests 7, 8, 9, 10)
pytest tests/integrations/ -k "test_07 or test_08 or test_09 or test_10" -v

# Run tests excluding images (skip tests 7, 8, 9, 10)
pytest tests/integrations/ -k "not (test_07 or test_08 or test_09 or test_10)" -v

# Run only tool-related tests (tests 3, 4, 5, 6)
pytest tests/integrations/ -k "test_03 or test_04 or test_05 or test_06" -v
```
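
Long `-k` expressions like the ones above are easy to mistype. A minimal sketch of generating them, assuming test methods keep the zero-padded `test_NN` prefix used in this README; `k_expression` is a hypothetical helper and not part of the suite:

```python
from typing import Iterable


def k_expression(test_numbers: Iterable[int], exclude: bool = False) -> str:
    """Build a pytest -k expression that selects (or excludes) numbered tests."""
    # Deduplicate and sort so the expression is stable regardless of input order
    terms = [f"test_{number:02d}" for number in sorted(set(test_numbers))]
    if not terms:
        raise ValueError("at least one test number is required")
    expression = " or ".join(terms)
    # Wrap in "not (...)" to skip the listed tests instead of selecting them
    return f"not ({expression})" if exclude else expression
```

For example, `k_expression(range(1, 6))` gives the expression for tests 1-5, and `k_expression([7, 8, 9, 10], exclude=True)` gives the "excluding images" expression.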

#### Test Status and Validation

```bash
# Check which tests would run (dry run)
pytest tests/integrations/test_openai.py --collect-only --quiet

# Validate test setup without running
pytest tests/integrations/test_openai.py --setup-only -v

# Run tests with immediate failure reporting
pytest tests/integrations/ -x -v  # Stop on first failure

# Run tests with detailed failure information
pytest tests/integrations/ --tb=long -v
```

#### Integration-Specific Test Validation

```bash
# Check if integration supports all test categories
python -c "
from tests.integrations.test_openai import TestOpenAIIntegration
methods = [m for m in dir(TestOpenAIIntegration) if m.startswith('test_')]
print('OpenAI Test Methods:')
for i, method in enumerate(sorted(methods), 1):
    print(f'  {i:2d}. {method}')
print(f'Total: {len(methods)} tests')
"

# Verify integration configuration
python -c "
from tests.utils.config_loader import get_config, get_model
config = get_config()
integration = 'openai'
print(f'{integration.upper()} Configuration:')
for model_type in ['chat', 'vision', 'tools']:
    try:
        model = get_model(integration, model_type)
        print(f'  {model_type}: {model}')
    except Exception as e:
        print(f'  {model_type}: ERROR - {e}')
"
```

#### Test Results Analysis

```bash
# Run tests with detailed reporting (coverage flags require the pytest-cov plugin)
pytest tests/integrations/test_openai.py -v --tb=short --cov=tests --cov-report=term-missing

# Generate HTML test report (requires the pytest-html plugin)
pytest tests/integrations/ --html=test_report.html --self-contained-html

# Run tests with JSON output for analysis (requires the pytest-json-report plugin)
pytest tests/integrations/test_openai.py --json-report --json-report-file=openai_results.json

# Compare test results across integrations
pytest tests/integrations/ -v | grep -E "(PASSED|FAILED|SKIPPED)" | sort
```
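
The `grep | sort` pipeline above lists result lines but does not count them. A minimal sketch of tallying outcomes per test file from `pytest -v` output, assuming result lines look like the ones shown in this README (`path.py::Class::test PASSED [100%]`); `tally_outcomes` is a hypothetical helper:

```python
import re
from collections import Counter, defaultdict
from typing import Dict, Iterable

# Matches verbose result lines such as
# "tests/integrations/test_openai.py::TestOpenAIIntegration::test_01_simple_chat PASSED [100%]"
RESULT_LINE = re.compile(
    r"^(?P<file>\S+?\.py)::\S+\s+(?P<outcome>PASSED|FAILED|SKIPPED)\b"
)


def tally_outcomes(lines: Iterable[str]) -> Dict[str, Counter]:
    """Count PASSED/FAILED/SKIPPED results per test file."""
    totals: Dict[str, Counter] = defaultdict(Counter)
    for line in lines:
        match = RESULT_LINE.match(line.strip())
        if match:  # banner lines and summaries are ignored
            totals[match.group("file")][match.group("outcome")] += 1
    return dict(totals)
```

To use it on a real run, capture the output of `pytest tests/integrations/ -v` to a file and pass the open file object to `tally_outcomes`.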

#### Debugging Specific Tests

```bash
# Debug a failing test with full output
pytest tests/integrations/test_openai.py::TestOpenAIIntegration::test_03_single_tool_call -v -s --tb=long

# Run test with Python debugger
pytest tests/integrations/test_openai.py::TestOpenAIIntegration::test_03_single_tool_call --pdb

# Run test with custom logging
pytest tests/integrations/test_openai.py::TestOpenAIIntegration::test_03_single_tool_call --log-cli-level=DEBUG -s

# Test with environment variable override
OPENAI_API_KEY=sk-test pytest tests/integrations/test_openai.py::TestOpenAIIntegration::test_01_simple_chat -v
```

#### Practical Testing Scenarios

```bash
# Scenario 1: Test a new integration
# 1. Check configuration
uv run python tests/utils/config_loader.py

# 2. List available tests
uv run pytest tests/integrations/test_your_integration.py --collect-only

# 3. Run basic tests first
uv run pytest tests/integrations/test_your_integration.py -k "test_01 or test_02" -v

# 4. Test tool calling if supported
uv run pytest tests/integrations/test_your_integration.py -k "tool_call" -v

# Scenario 2: Debug a failing tool call test
# 1. Run with full debugging
uv run pytest tests/integrations/test_openai.py::TestOpenAIIntegration::test_03_single_tool_call -v -s --tb=long

# 2. Check tool extraction function
uv run python -c "
from tests.integrations.test_openai import extract_openai_tool_calls
print('Tool extraction function available:', callable(extract_openai_tool_calls))
"

# 3. Test with different model
OPENAI_CHAT_MODEL=gpt-4 uv run pytest tests/integrations/test_openai.py::TestOpenAIIntegration::test_03_single_tool_call -v

# Scenario 3: Compare integration capabilities
# Run the same test across all integrations
uv run pytest tests/integrations/ -k "test_01_simple_chat" -v --tb=short

# Scenario 4: Test only supported features
# For an integration that doesn't support images
uv run pytest tests/integrations/test_your_integration.py -k "not (test_07 or test_08 or test_09 or test_10)" -v

# Scenario 5: Performance testing
# Run with timing to identify slow tests
uv run pytest tests/integrations/test_openai.py --durations=0 -v

# Scenario 6: Continuous integration testing
# Run all tests with coverage and reports
uv run pytest tests/integrations/ --cov=tests --cov-report=xml --junit-xml=test_results.xml -v
```

#### Test Output Examples

```bash
# Successful test run
$ uv run pytest tests/integrations/test_openai.py::TestOpenAIIntegration::test_01_simple_chat -v
========================= test session starts =========================
tests/integrations/test_openai.py::TestOpenAIIntegration::test_01_simple_chat PASSED [100%]
✓ OpenAI simple chat test passed
Response: "Hello! I'm an AI assistant. How can I help you today?"

# Failed test with debugging info
$ uv run pytest tests/integrations/test_openai.py::TestOpenAIIntegration::test_03_single_tool_call -v -s
========================= FAILURES =========================
_____________ TestOpenAIIntegration.test_03_single_tool_call _____________
AssertionError: Expected tool calls but got none
Response content: "I can help with weather information, but I need a specific location."
Tool calls found: []

# Test collection output
$ uv run pytest tests/integrations/test_openai.py --collect-only -q
tests/integrations/test_openai.py::TestOpenAIIntegration::test_01_simple_chat
tests/integrations/test_openai.py::TestOpenAIIntegration::test_02_multi_turn_conversation
tests/integrations/test_openai.py::TestOpenAIIntegration::test_03_single_tool_call
tests/integrations/test_openai.py::TestOpenAIIntegration::test_04_multiple_tool_calls
tests/integrations/test_openai.py::TestOpenAIIntegration::test_05_end2end_tool_calling
tests/integrations/test_openai.py::TestOpenAIIntegration::test_06_automatic_function_calling
tests/integrations/test_openai.py::TestOpenAIIntegration::test_07_image_url
tests/integrations/test_openai.py::TestOpenAIIntegration::test_08_image_base64
tests/integrations/test_openai.py::TestOpenAIIntegration::test_09_multiple_images
tests/integrations/test_openai.py::TestOpenAIIntegration::test_10_complex_end2end
tests/integrations/test_openai.py::TestOpenAIIntegration::test_11_integration_specific_features
11 tests collected

# Running a single test with summary
$ uv run pytest tests/integrations/test_google.py::TestGoogleIntegration::test_03_single_tool_call -v
========================= test session starts =========================
tests/integrations/test_google.py::TestGoogleIntegration::test_03_single_tool_call PASSED [100%]
✅ All tests passed

# Running tests in parallel
$ uv run pytest -n auto
========================= test session starts =========================
plugins: xdist-3.5.0, forked-2.0.0
gw0 [11] / gw1 [11] / gw2 [11] / gw3 [11]
........... [100%]
========================= 11 passed in 5.21s =========================
```

### Environment Variables

#### Required Variables

```bash
# Bifrost gateway (required)
export BIFROST_BASE_URL="http://localhost:8080"

# Integration API keys (at least one required)
export OPENAI_API_KEY="sk-..."
export ANTHROPIC_API_KEY="sk-ant-..."
export GOOGLE_API_KEY="AIza..."
```

#### Optional Variables

```bash
# Integration-specific settings
export OPENAI_ORG_ID="org-..."
export OPENAI_PROJECT_ID="proj_..."
export GOOGLE_PROJECT_ID="your-project"
export GOOGLE_LOCATION="us-central1"

# Environment configuration
export TEST_ENV="development"  # or "production"
```
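
The rules above (one required gateway URL, at least one provider key) can be checked before a run. A minimal sketch, assuming only the variable names listed in this section; `environment_problems` is a hypothetical helper and not part of the suite:

```python
from typing import List, Mapping

REQUIRED_VARS = ("BIFROST_BASE_URL",)
API_KEY_VARS = ("OPENAI_API_KEY", "ANTHROPIC_API_KEY", "GOOGLE_API_KEY")


def environment_problems(env: Mapping[str, str]) -> List[str]:
    """Return human-readable problems; an empty list means the environment looks usable."""
    # Empty strings are treated the same as unset variables
    problems = [f"{name} is not set" for name in REQUIRED_VARS if not env.get(name)]
    if not any(env.get(name) for name in API_KEY_VARS):
        problems.append("set at least one of: " + ", ".join(API_KEY_VARS))
    return problems
```

Calling `environment_problems(os.environ)` (after `import os`) before invoking pytest surfaces missing variables in one place instead of as scattered test errors.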

### Test Output and Debugging

#### Understanding Test Results

```bash
# Successful test output
✓ OpenAI Integration Tests
  ✓ test_01_simple_chat - Response: "Hello! How can I help you today?"
  ✓ test_03_single_tool_call - Tool called: get_weather(location="New York")
  ✓ test_07_image_url - Image analyzed successfully

# Failed test output
✗ test_03_single_tool_call - AssertionError: Expected tool calls but got none
  Response content: "I can help with weather, but I need a specific location."
```

#### Debug Mode

```bash
# Enable verbose output
pytest tests/integrations/test_openai.py -v -s

# Show full tracebacks
pytest tests/integrations/test_openai.py --tb=long

# Enable debug logging
pytest tests/integrations/test_openai.py --log-cli-level=DEBUG
```

## 🔨 Adding New Integrations

### Step-by-Step Guide

#### 1. Update Configuration

Add your integration to `config.yml`:

```yaml
# Add to bifrost endpoints
bifrost:
  endpoints:
    your_integration: "/your_integration"

# Add model configuration
models:
  your_integration:
    chat: "your-chat-model"
    vision: "your-vision-model"
    tools: "your-tools-model"
    alternatives: ["alternative-model-1", "alternative-model-2"]

# Add model capabilities
model_capabilities:
  "your-chat-model":
    chat: true
    tools: true
    vision: false
    max_tokens: 4096
    context_window: 8192

# Add integration settings
integration_settings:
  your_integration:
    api_version: "v1"
    custom_header: "value"
```
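
Forgetting one of these sections is a common source of errors. A minimal sketch of a completeness check, assuming the configuration has already been parsed into a plain dictionary with the section names shown above (for example with PyYAML's `yaml.safe_load`); `missing_config_entries` is a hypothetical helper:

```python
from typing import Any, Dict, List

MODEL_TYPES = ("chat", "vision", "tools")


def missing_config_entries(config: Dict[str, Any], integration: str) -> List[str]:
    """List the config paths that still lack an entry for `integration`."""
    missing = []
    if integration not in config.get("bifrost", {}).get("endpoints", {}):
        missing.append("bifrost.endpoints")
    # Each model type used by the tests needs a model name
    models = config.get("models", {}).get(integration, {})
    for model_type in MODEL_TYPES:
        if not models.get(model_type):
            missing.append(f"models.{integration}.{model_type}")
    if integration not in config.get("integration_settings", {}):
        missing.append("integration_settings")
    return missing
```

An empty result means every section checked here has an entry; it does not validate the values themselves.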

#### 2. Create Integration Test File

Create `tests/integrations/test_your_integration.py`:

```python
"""
Your Integration Tests

Tests all 11 core scenarios using Your Integration SDK.
"""

import json
from typing import Any, Dict, List

import pytest
from your_integration_sdk import YourIntegrationClient

from ..utils.common import (
    Config,
    SIMPLE_CHAT_MESSAGES,
    MULTI_TURN_MESSAGES,
    # ... import all test fixtures and the assertion helpers used below
    # (e.g. assert_valid_chat_response)
    get_api_key,
    skip_if_no_api_key,
    get_model,
)


@pytest.fixture
def your_integration_client():
    """Create Your Integration client for testing"""
    from ..utils.config_loader import get_integration_url, get_config

    api_key = get_api_key("your_integration")
    base_url = get_integration_url("your_integration")

    # Get additional integration settings
    config = get_config()
    integration_settings = config.get_integration_settings("your_integration")
    api_config = config.get_api_config()

    client_kwargs = {
        "api_key": api_key,
        "base_url": base_url,
        "timeout": api_config.get("timeout", 30),
        "max_retries": api_config.get("max_retries", 3),
    }

    # Add integration-specific settings
    if integration_settings.get("api_version"):
        client_kwargs["api_version"] = integration_settings["api_version"]

    return YourIntegrationClient(**client_kwargs)


@pytest.fixture
def test_config():
    """Test configuration"""
    return Config()


class TestYourIntegrationIntegration:
    """Test suite for Your Integration covering all 11 core scenarios"""

    @skip_if_no_api_key("your_integration")
    def test_01_simple_chat(self, your_integration_client, test_config):
        """Test Case 1: Simple chat interaction"""
        response = your_integration_client.chat.create(
            model=get_model("your_integration", "chat"),
            messages=SIMPLE_CHAT_MESSAGES,
            max_tokens=100,
        )

        assert_valid_chat_response(response)
        assert response.content is not None
        assert len(response.content) > 0

    # ... implement all 11 test methods following the same pattern
    # See existing integration test files for complete examples


def extract_your_integration_tool_calls(response) -> List[Dict[str, Any]]:
    """Extract tool calls from Your Integration response format"""
    tool_calls = []

    # Implement based on your integration's response format
    if hasattr(response, 'tool_calls') and response.tool_calls:
        for tool_call in response.tool_calls:
            tool_calls.append({
                "name": tool_call.function.name,
                "arguments": json.loads(tool_call.function.arguments)
            })

    return tool_calls
```

#### 3. Update Common Utilities

Add your integration to `tests/utils/common.py`:

```python
import os


def get_api_key(integration: str) -> str:
    """Get API key for integration"""
    key_map = {
        "openai": "OPENAI_API_KEY",
        "anthropic": "ANTHROPIC_API_KEY",
        "google": "GOOGLE_API_KEY",
        "litellm": "LITELLM_API_KEY",
        "your_integration": "YOUR_INTEGRATION_API_KEY",  # Add this line
    }

    env_var = key_map.get(integration)
    if not env_var:
        raise ValueError(f"Unknown integration: {integration}")

    api_key = os.getenv(env_var)
    if not api_key:
        raise ValueError(f"{env_var} environment variable not set")

    return api_key
```

#### 4. Add Integration-Specific Tool Extraction

Update the tool extraction functions in your test file:

```python
from typing import Any, Dict, List


def extract_your_integration_tool_calls(response: Any) -> List[Dict[str, Any]]:
    """Extract tool calls from Your Integration response format"""
    tool_calls = []

    try:
        # Implement based on your integration's response structure
        # Example for a hypothetical integration:
        if hasattr(response, 'function_calls'):
            for fc in response.function_calls:
                tool_calls.append({
                    "name": fc.name,
                    "arguments": fc.parameters
                })

        return tool_calls

    except Exception as e:
        print(f"Error extracting tool calls: {e}")
        return []
```
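
Extraction functions can be exercised without calling a real API. A minimal sketch using stand-in response objects, assuming the two response shapes used in this guide (`tool_calls[i].function` with JSON-string arguments, and the hypothetical `function_calls[i]` with dict parameters); every name here is illustrative:

```python
import json
from types import SimpleNamespace
from typing import Any, Dict, List


def normalize_arguments(raw: Any) -> Dict[str, Any]:
    """Return tool arguments as a dict, whether the SDK yields JSON text or a mapping."""
    if isinstance(raw, str):
        return json.loads(raw) if raw.strip() else {}
    return dict(raw or {})


def extract_tool_calls(response: Any) -> List[Dict[str, Any]]:
    """Collect tool calls from either response shape into the common format."""
    calls = []
    # Shape from step 2: response.tool_calls[i].function.{name, arguments}
    for call in getattr(response, "tool_calls", None) or []:
        calls.append({
            "name": call.function.name,
            "arguments": normalize_arguments(call.function.arguments),
        })
    # Hypothetical shape from step 4: response.function_calls[i].{name, parameters}
    for call in getattr(response, "function_calls", None) or []:
        calls.append({"name": call.name, "arguments": normalize_arguments(call.parameters)})
    return calls


# Stand-in responses built by hand; no SDK or network access is involved
json_style = SimpleNamespace(tool_calls=[
    SimpleNamespace(function=SimpleNamespace(
        name="get_weather", arguments='{"location": "New York"}'))
])
dict_style = SimpleNamespace(function_calls=[
    SimpleNamespace(name="get_weather", parameters={"location": "Paris"})
])
```

Both stand-ins normalize to the same `{"name": ..., "arguments": {...}}` structure that the tool-calling assertions in this guide read.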

#### 5. Test Your Implementation

```bash
# Set up environment
export YOUR_INTEGRATION_API_KEY="your-api-key"
export BIFROST_BASE_URL="http://localhost:8080"

# Test configuration
python tests/utils/config_loader.py

# Run your integration tests
pytest tests/integrations/test_your_integration.py -v

# Run specific test
pytest tests/integrations/test_your_integration.py::TestYourIntegrationIntegration::test_01_simple_chat -v
```

### 🎯 Key Implementation Points

#### 1. **Follow the Pattern**

- Use existing integration test files as templates
- Implement all 11 test scenarios
- Follow the same naming conventions and structure

#### 2. **Handle Integration Differences**

```python
# Example: Different response formats
def assert_valid_chat_response(response):
    """Validate chat response - adapt for your integration"""
    if hasattr(response, 'choices'):  # OpenAI-style
        assert response.choices[0].message.content
    elif hasattr(response, 'content'):  # Anthropic-style
        assert response.content[0].text
    elif hasattr(response, 'text'):  # Google-style
        assert response.text
    # Add your integration's format here
```

#### 3. **Implement Tool Calling**

```python
from typing import Any, Dict, List


def convert_to_your_integration_tools(tools: List[Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Convert common tool format to your integration's format"""
    your_integration_tools = []

    for tool in tools:
        # Convert to your integration's tool schema
        your_integration_tools.append({
            "name": tool["name"],
            "description": tool["description"],
            "parameters": tool["parameters"],
            # Add integration-specific fields
        })

    return your_integration_tools
```
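
Tool schemas differ mainly in how the same three fields are wrapped or named. A minimal sketch of two concrete conversions, assuming the common format carries `name`, `description`, and `parameters` as above; the target shapes follow the OpenAI Chat Completions and Anthropic Messages tool schemas as publicly documented, so verify them against the SDK version you use. The `weather_tool` definition is a hypothetical example:

```python
from typing import Any, Dict, List

Tool = Dict[str, Any]


def to_openai_tools(tools: List[Tool]) -> List[Tool]:
    """Wrap each tool in a {"type": "function", "function": {...}} envelope."""
    return [
        {
            "type": "function",
            "function": {
                "name": tool["name"],
                "description": tool["description"],
                "parameters": tool["parameters"],
            },
        }
        for tool in tools
    ]


def to_anthropic_tools(tools: List[Tool]) -> List[Tool]:
    """Keep the tool flat, but rename "parameters" to "input_schema"."""
    return [
        {
            "name": tool["name"],
            "description": tool["description"],
            "input_schema": tool["parameters"],
        }
        for tool in tools
    ]


# Hypothetical tool in the common format
weather_tool: Tool = {
    "name": "get_weather",
    "description": "Get the current weather for a location",
    "parameters": {
        "type": "object",
        "properties": {"location": {"type": "string"}},
        "required": ["location"],
    },
}
```

Keeping one converter per integration, each starting from the same common format, means a new tool only has to be defined once.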

#### 4. **Handle Image Processing**

```python
from typing import Any, Dict, List


def convert_to_your_integration_messages(messages: List[Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Convert common message format to your integration's format"""
    your_integration_messages = []

    for msg in messages:
        if isinstance(msg.get("content"), list):
            # Handle multimodal content (text + images)
            content = []
            for item in msg["content"]:
                if item["type"] == "text":
                    content.append({"type": "text", "text": item["text"]})
                elif item["type"] == "image_url":
                    # Convert to your integration's image format
                    content.append({
                        "type": "image",
                        "source": item["image_url"]["url"]
                    })
            your_integration_messages.append({"role": msg["role"], "content": content})
        else:
            your_integration_messages.append(msg)

    return your_integration_messages
```
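
The `image_url` field may hold either a remote URL or a base64 `data:` URL, and integrations often want these handled differently. A minimal sketch of telling them apart; the output dictionaries are an illustrative shape, not any specific SDK's schema, and `to_image_source` is a hypothetical helper:

```python
import re
from typing import Any, Dict

# data:<media type>;base64,<payload>
DATA_URL = re.compile(
    r"^data:(?P<media_type>[\w.+-]+/[\w.+-]+);base64,(?P<data>.+)$", re.DOTALL
)


def to_image_source(url: str) -> Dict[str, Any]:
    """Split a base64 data URL into its parts; pass ordinary URLs through."""
    match = DATA_URL.match(url)
    if match:
        return {
            "type": "base64",
            "media_type": match.group("media_type"),
            "data": match.group("data"),
        }
    return {"type": "url", "url": url}
```

A message converter can call this for each `image_url` item and then map the result onto whatever image block the target integration expects.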

#### 5. **Error Handling**

```python
@skip_if_no_api_key("your_integration")
def test_03_single_tool_call(self, your_integration_client, test_config):
    """Test Case 3: Single tool call"""
    try:
        response = your_integration_client.chat.create(
            model=get_model("your_integration", "tools"),
            messages=SINGLE_TOOL_CALL_MESSAGES,
            tools=convert_to_your_integration_tools([WEATHER_TOOL]),
            max_tokens=100,
        )
    except Exception as e:
        # Skip only when the request itself fails (e.g. unsupported feature)
        pytest.skip(f"Tool calling not supported or failed: {e}")

    # Assertions stay outside the try block so that real failures are
    # reported as failures instead of being converted into skips
    assert_has_tool_calls(response, expected_count=1)
    tool_calls = extract_your_integration_tool_calls(response)
    assert tool_calls[0]["name"] == "get_weather"
    assert "location" in tool_calls[0]["arguments"]
```

### 🔍 Testing Checklist

Before submitting your integration implementation:

- [ ] **Configuration**: Integration added to `config.yml` with all required sections
- [ ] **Environment**: API key environment variable documented and tested
- [ ] **All 11 Tests**: Every test scenario implemented and passing
- [ ] **Tool Extraction**: Integration-specific tool call extraction function
- [ ] **Message Conversion**: Proper handling of multimodal messages
- [ ] **Error Handling**: Graceful handling of unsupported features
- [ ] **Documentation**: Integration added to README with capabilities
- [ ] **Bifrost Integration**: Base URL properly configured and tested

### 🚨 Common Pitfalls

1. **Incorrect Response Parsing**: Each integration has different response formats
2. **Tool Schema Differences**: Tool calling schemas vary significantly
3. **Image Format Handling**: Base64 vs URL handling differs per integration
4. **Missing Error Handling**: Some integrations don't support all features
5. **Configuration Errors**: Forgetting to add integration to all config sections

## 🔧 Troubleshooting

### Common Issues

#### 1. Configuration Problems

```bash
# Error: Configuration file not found
FileNotFoundError: Configuration file not found: config.yml

# Solution: Ensure config.yml exists in project root
ls -la config.yml
```

#### 2. Integration Connection Issues

```bash
# Error: Connection refused to Bifrost
ConnectionError: Connection refused to localhost:8080

# Solutions:
# 1. Check if Bifrost is running
curl http://localhost:8080/health

# 2. Ensure BIFROST_BASE_URL is set correctly
echo $BIFROST_BASE_URL
```
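
The same reachability check can be done from Python, for example in a pre-flight script. A minimal sketch using only the standard library, assuming the `/health` path shown in the `curl` command above; `health_url` and `bifrost_is_reachable` are hypothetical helpers:

```python
import urllib.request


def health_url(base_url: str) -> str:
    """Join the gateway base URL and the health path without doubling slashes."""
    return base_url.rstrip("/") + "/health"


def bifrost_is_reachable(base_url: str, timeout: float = 3.0) -> bool:
    """Return True when the health endpoint answers with a 2xx status."""
    try:
        with urllib.request.urlopen(health_url(base_url), timeout=timeout) as response:
            return 200 <= response.status < 300
    except OSError:
        # Refused connections, timeouts, and HTTP errors all derive from OSError
        return False
```

Passing the value of `BIFROST_BASE_URL` to `bifrost_is_reachable` before running the suite gives a single clear failure instead of one connection error per test.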

#### 3. API Key Issues

```bash
# Error: API key not set
ValueError: OPENAI_API_KEY environment variable not set

# Solution: Set required environment variables
export OPENAI_API_KEY="sk-..."
export ANTHROPIC_API_KEY="sk-ant-..."
export GOOGLE_API_KEY="AIza..."
```

#### 4. Model Configuration Errors

```bash
# Error: Unknown model type
ValueError: Unknown model type 'vision' for integration 'your_integration'

# Solution: Check config.yml has all model types defined
python tests/utils/config_loader.py
```

#### 5. Test Failures

```bash
# Error: Tool calls not found
AssertionError: Response should contain tool calls

# Debug steps:
# 1. Check if integration supports tool calling
# 2. Verify tool extraction function
# 3. Check integration-specific tool format
pytest tests/integrations/test_openai.py::TestOpenAIIntegration::test_03_single_tool_call -v -s
```

### Debug Mode

Enable comprehensive debugging:

```bash
# Full verbose output with debugging
pytest tests/integrations/test_openai.py -v -s --tb=long --log-cli-level=DEBUG

# Test configuration system
python tests/utils/config_loader.py

# Check specific integration URL
python -c "
from tests.utils.config_loader import get_integration_url, get_model
print('OpenAI URL:', get_integration_url('openai'))
print('OpenAI Chat Model:', get_model('openai', 'chat'))
"
```

## 📚 Additional Resources

### Configuration Examples

- See `config.yml` for complete configuration reference
- Check `tests/utils/config_loader.py` for usage examples
- Review integration test files for implementation patterns

### Contributing

1. Fork the repository
2. Create a feature branch: `git checkout -b feature/new-integration`
3. Follow the integration implementation guide above
4. Add comprehensive tests and documentation
5. Submit a pull request with test results

## 🆘 Support

For issues and questions:

- Create GitHub issues for bugs and feature requests
- Check existing issues for solutions
- Review integration-specific documentation
- Test configuration with `python tests/utils/config_loader.py`

---

**Note**: This test suite is designed for testing AI integrations through the Bifrost proxy. Ensure your Bifrost instance is properly configured and running before executing tests. The configuration system routes every integration request through the Bifrost base URL set in `BIFROST_BASE_URL`.