# Proxy Architecture and Extension Guide
## 📖 Table of Contents
- [How the Proxy Works](#how-the-proxy-works)
- [Architecture Overview](#architecture-overview)
- [Adding New Cloud Providers](#adding-new-cloud-providers)
- [Adding Local LLM Providers](#adding-local-llm-providers)
- [Message Format Conversion](#message-format-conversion)
- [Tool/Function Calling Support](#toolfunction-calling-support)
- [Testing Your Proxy](#testing-your-proxy)
- [Examples](#examples)
---
## How the Proxy Works
### The Problem
Claude Code and the Claude Agent SDK expect requests in **Anthropic's Messages API format**. When you want to use cheaper alternatives (OpenRouter, Gemini, local models), you need to:
1. Translate Anthropic request format → Provider's format
2. Forward request to the provider's API
3. Translate provider's response → Anthropic response format
4. Return the translated response to Claude Code/the SDK (which still believes it's talking to Anthropic)
### The Solution
A transparent HTTP proxy that sits between Claude Code and the LLM provider:
```
┌─────────────┐       ┌──────────────┐       ┌──────────────┐
│ Claude Code │──────▶│ Proxy Server │──────▶│   Provider   │
│    /SDK     │       │ (localhost)  │       │ (OpenRouter, │
│             │◀──────│              │◀──────│  Gemini, etc)│
└─────────────┘       └──────────────┘       └──────────────┘
 Anthropic API           Translates           Provider API
```
**Key Benefits:**
- ✅ No code changes to Claude Code or Agent SDK
- ✅ Up to ~99% cost savings with OpenRouter models
- ✅ Free usage within the Gemini free-tier limits
- ✅ All MCP tools work through the proxy
- ✅ Streaming support
- ✅ Function/tool calling support
---
## Architecture Overview
### File Structure
```
src/proxy/
├── anthropic-to-openrouter.ts # OpenRouter proxy
├── anthropic-to-gemini.ts # Gemini proxy
└── provider-instructions.ts # Model-specific configs
```
### Core Components
#### 1. **Express Server**
- Listens on port 3000 (configurable)
- Handles `/v1/messages` endpoint (Anthropic's Messages API)
- Health check at `/health`
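Stripped to its essentials, this layer is just an Express app with those two routes. The following is an illustrative sketch only (handler body elided), not the shipped code:
```typescript
import express, { Request, Response } from 'express';

// Minimal shape of the proxy's HTTP layer (illustrative sketch).
const app = express();
app.use(express.json({ limit: '50mb' }));

// Health check
app.get('/health', (req: Request, res: Response) => {
  res.json({ status: 'ok' });
});

// Anthropic Messages API endpoint
app.post('/v1/messages', (req: Request, res: Response) => {
  // 1. Convert Anthropic request → provider request
  // 2. Forward to the provider API
  // 3. Convert provider response → Anthropic response and return it
  res.status(501).json({ error: { type: 'not_implemented', message: 'sketch only' } });
});

app.listen(Number(process.env.PORT || 3000));
```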
#### 2. **Request Converter**
Translates Anthropic → Provider format:
```typescript
private convertAnthropicToOpenAI(anthropicReq: AnthropicRequest): OpenAIRequest {
  // 1. Extract system prompt
  // 2. Convert messages array
  // 3. Convert tools (if present)
  // 4. Map model names
  // 5. Apply provider-specific configs
}
```
#### 3. **Response Converter**
Translates Provider → Anthropic format:
```typescript
private convertOpenAIToAnthropic(openaiRes: any): any {
  // 1. Extract choice/candidate
  // 2. Convert tool_calls → tool_use blocks
  // 3. Extract text content
  // 4. Map finish reasons
  // 5. Convert usage stats
}
```
#### 4. **Streaming Handler**
For real-time token-by-token output:
```typescript
private convertOpenAIStreamToAnthropic(chunk: string): string {
  // Convert SSE format: OpenAI → Anthropic
}
```
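Concretely, this means re-framing each OpenAI SSE `data:` line as the corresponding Anthropic event. Below is a minimal sketch covering text deltas only; a full implementation also emits `message_start`, `content_block_start`, and `message_delta` events and tracks usage:
```typescript
// Hedged sketch: map one OpenAI streaming chunk to Anthropic SSE events.
function convertChunk(chunk: string): string {
  const out: string[] = [];
  for (const line of chunk.split('\n')) {
    if (!line.startsWith('data: ')) continue;
    const payload = line.slice('data: '.length).trim();

    // OpenAI signals the end of the stream with "[DONE]".
    if (payload === '[DONE]') {
      out.push('event: message_stop\ndata: {"type":"message_stop"}\n\n');
      continue;
    }

    // Each OpenAI delta with text becomes an Anthropic content_block_delta.
    const delta = JSON.parse(payload).choices?.[0]?.delta;
    if (delta?.content) {
      const event = {
        type: 'content_block_delta',
        index: 0,
        delta: { type: 'text_delta', text: delta.content }
      };
      out.push(`event: content_block_delta\ndata: ${JSON.stringify(event)}\n\n`);
    }
  }
  return out.join('');
}
```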
---
## Adding New Cloud Providers
### Example: Adding Mistral AI
**Step 1: Create proxy file**
`src/proxy/anthropic-to-mistral.ts`:
```typescript
import express, { Request, Response } from 'express';
import { logger } from '../utils/logger.js';

interface MistralMessage {
  role: 'system' | 'user' | 'assistant';
  content: string;
}

interface MistralRequest {
  model: string;
  messages: MistralMessage[];
  temperature?: number;
  max_tokens?: number;
  stream?: boolean;
}

export class AnthropicToMistralProxy {
  private app: express.Application;
  private mistralApiKey: string;
  private mistralBaseUrl: string;
  private defaultModel: string;

  constructor(config: {
    mistralApiKey: string;
    mistralBaseUrl?: string;
    defaultModel?: string;
  }) {
    this.app = express();
    this.mistralApiKey = config.mistralApiKey;
    this.mistralBaseUrl = config.mistralBaseUrl || 'https://api.mistral.ai/v1';
    this.defaultModel = config.defaultModel || 'mistral-large-latest';
    this.setupMiddleware();
    this.setupRoutes();
  }

  private setupMiddleware(): void {
    this.app.use(express.json({ limit: '50mb' }));
  }

  private setupRoutes(): void {
    // Health check
    this.app.get('/health', (req: Request, res: Response) => {
      res.json({ status: 'ok', service: 'anthropic-to-mistral-proxy' });
    });

    // Main conversion endpoint
    this.app.post('/v1/messages', async (req: Request, res: Response) => {
      try {
        const anthropicReq = req.body;

        // Convert Anthropic → Mistral
        const mistralReq = this.convertAnthropicToMistral(anthropicReq);

        // Forward to Mistral
        const response = await fetch(`${this.mistralBaseUrl}/chat/completions`, {
          method: 'POST',
          headers: {
            'Authorization': `Bearer ${this.mistralApiKey}`,
            'Content-Type': 'application/json'
          },
          body: JSON.stringify(mistralReq)
        });

        if (!response.ok) {
          const error = await response.text();
          logger.error('Mistral API error', { status: response.status, error });
          return res.status(response.status).json({
            error: { type: 'api_error', message: error }
          });
        }

        // Convert Mistral → Anthropic
        const mistralRes = await response.json();
        const anthropicRes = this.convertMistralToAnthropic(mistralRes);
        res.json(anthropicRes);
      } catch (error: any) {
        logger.error('Mistral proxy error', { error: error.message });
        res.status(500).json({
          error: { type: 'proxy_error', message: error.message }
        });
      }
    });
  }

  private convertAnthropicToMistral(anthropicReq: any): MistralRequest {
    const messages: MistralMessage[] = [];

    // Add system prompt if present
    if (anthropicReq.system) {
      messages.push({
        role: 'system',
        content: typeof anthropicReq.system === 'string'
          ? anthropicReq.system
          : anthropicReq.system.map((b: any) => b.text).join('\n')
      });
    }

    // Convert messages
    for (const msg of anthropicReq.messages) {
      messages.push({
        role: msg.role,
        content: typeof msg.content === 'string'
          ? msg.content
          : msg.content.filter((b: any) => b.type === 'text').map((b: any) => b.text).join('\n')
      });
    }

    return {
      model: this.defaultModel,
      messages,
      temperature: anthropicReq.temperature,
      max_tokens: anthropicReq.max_tokens || 4096,
      stream: anthropicReq.stream || false
    };
  }

  private convertMistralToAnthropic(mistralRes: any): any {
    const choice = mistralRes.choices?.[0];
    if (!choice) throw new Error('No choices in Mistral response');

    const content = choice.message?.content || '';

    return {
      id: mistralRes.id || `msg_${Date.now()}`,
      type: 'message',
      role: 'assistant',
      model: mistralRes.model,
      content: [{ type: 'text', text: content }],
      stop_reason: choice.finish_reason === 'stop' ? 'end_turn' : 'max_tokens',
      usage: {
        input_tokens: mistralRes.usage?.prompt_tokens || 0,
        output_tokens: mistralRes.usage?.completion_tokens || 0
      }
    };
  }

  public start(port: number): void {
    this.app.listen(port, () => {
      logger.info('Mistral proxy started', { port });
      console.log(`\n✅ Mistral Proxy running at http://localhost:${port}\n`);
    });
  }
}

// CLI entry point
if (import.meta.url === `file://${process.argv[1]}`) {
  const port = parseInt(process.env.PORT || '3000');
  const mistralApiKey = process.env.MISTRAL_API_KEY;

  if (!mistralApiKey) {
    console.error('❌ Error: MISTRAL_API_KEY environment variable required');
    process.exit(1);
  }

  const proxy = new AnthropicToMistralProxy({ mistralApiKey });
  proxy.start(port);
}
```
**Step 2: Update the TypeScript build**
Add the new file to `config/tsconfig.json` if needed (files under `src/` are usually picked up automatically).
**Step 3: Test the proxy**
```bash
# Terminal 1: Start proxy
export MISTRAL_API_KEY=your-key-here
npm run build
node dist/proxy/anthropic-to-mistral.js
# Terminal 2: Use with Claude Code
export ANTHROPIC_BASE_URL=http://localhost:3000
export ANTHROPIC_API_KEY=dummy-key
npx agentic-flow --agent coder --task "Write hello world"
```
---
## Adding Local LLM Providers
### Example: Adding Ollama Support
**Step 1: Create proxy file**
`src/proxy/anthropic-to-ollama.ts`:
```typescript
import express, { Request, Response } from 'express';
import { logger } from '../utils/logger.js';

export class AnthropicToOllamaProxy {
  private app: express.Application;
  private ollamaBaseUrl: string;
  private defaultModel: string;

  constructor(config: {
    ollamaBaseUrl?: string;
    defaultModel?: string;
  }) {
    this.app = express();
    this.ollamaBaseUrl = config.ollamaBaseUrl || 'http://localhost:11434';
    this.defaultModel = config.defaultModel || 'llama3.3:70b';
    this.setupMiddleware();
    this.setupRoutes();
  }

  private setupMiddleware(): void {
    this.app.use(express.json({ limit: '50mb' }));
  }

  private setupRoutes(): void {
    this.app.get('/health', (req: Request, res: Response) => {
      res.json({ status: 'ok', service: 'anthropic-to-ollama-proxy' });
    });

    this.app.post('/v1/messages', async (req: Request, res: Response) => {
      try {
        const anthropicReq = req.body;

        // Build prompt from messages
        let prompt = '';
        if (anthropicReq.system) {
          // The system field can be a string or an array of blocks
          const systemText = typeof anthropicReq.system === 'string'
            ? anthropicReq.system
            : anthropicReq.system.map((b: any) => b.text).join('\n');
          prompt += `System: ${systemText}\n\n`;
        }
        for (const msg of anthropicReq.messages) {
          const content = typeof msg.content === 'string'
            ? msg.content
            : msg.content.filter((b: any) => b.type === 'text').map((b: any) => b.text).join('\n');
          prompt += `${msg.role === 'user' ? 'Human' : 'Assistant'}: ${content}\n\n`;
        }
        prompt += 'Assistant: ';

        // Call Ollama API
        const response = await fetch(`${this.ollamaBaseUrl}/api/generate`, {
          method: 'POST',
          headers: { 'Content-Type': 'application/json' },
          body: JSON.stringify({
            model: this.defaultModel,
            prompt,
            stream: false,
            options: {
              temperature: anthropicReq.temperature || 0.7,
              num_predict: anthropicReq.max_tokens || 4096
            }
          })
        });

        if (!response.ok) {
          const error = await response.text();
          logger.error('Ollama API error', { status: response.status, error });
          return res.status(response.status).json({
            error: { type: 'api_error', message: error }
          });
        }

        const ollamaRes = await response.json();

        // Convert to Anthropic format
        const anthropicRes = {
          id: `msg_${Date.now()}`,
          type: 'message',
          role: 'assistant',
          model: this.defaultModel,
          content: [{ type: 'text', text: ollamaRes.response }],
          stop_reason: ollamaRes.done ? 'end_turn' : 'max_tokens',
          usage: {
            input_tokens: ollamaRes.prompt_eval_count || 0,
            output_tokens: ollamaRes.eval_count || 0
          }
        };

        res.json(anthropicRes);
      } catch (error: any) {
        logger.error('Ollama proxy error', { error: error.message });
        res.status(500).json({
          error: { type: 'proxy_error', message: error.message }
        });
      }
    });
  }

  public start(port: number): void {
    this.app.listen(port, () => {
      logger.info('Ollama proxy started', { port, ollamaBaseUrl: this.ollamaBaseUrl });
      console.log(`\n✅ Ollama Proxy running at http://localhost:${port}`);
      console.log(`   Ollama Server: ${this.ollamaBaseUrl}`);
      console.log(`   Default Model: ${this.defaultModel}\n`);
    });
  }
}

// CLI entry point
if (import.meta.url === `file://${process.argv[1]}`) {
  const port = parseInt(process.env.PORT || '3000');
  const proxy = new AnthropicToOllamaProxy({
    ollamaBaseUrl: process.env.OLLAMA_BASE_URL,
    defaultModel: process.env.OLLAMA_MODEL || 'llama3.3:70b'
  });
  proxy.start(port);
}
```
**Step 2: Start Ollama server**
```bash
# Install Ollama (https://ollama.ai)
curl -fsSL https://ollama.ai/install.sh | sh
# Pull a model
ollama pull llama3.3:70b
# Server starts automatically on port 11434
```
**Step 3: Use with Agentic Flow**
```bash
# Terminal 1: Start proxy
npm run build
node dist/proxy/anthropic-to-ollama.js
# Terminal 2: Use with agents
export ANTHROPIC_BASE_URL=http://localhost:3000
export ANTHROPIC_API_KEY=dummy-key
npx agentic-flow --agent coder --task "Write hello world"
```
---
## Message Format Conversion
### Anthropic Messages API Format
```json
{
  "model": "claude-3-5-sonnet-20241022",
  "messages": [
    {
      "role": "user",
      "content": "Hello!"
    }
  ],
  "system": "You are a helpful assistant",
  "max_tokens": 1024,
  "temperature": 0.7
}
```
### OpenAI Chat Completions Format
```json
{
  "model": "gpt-4",
  "messages": [
    {
      "role": "system",
      "content": "You are a helpful assistant"
    },
    {
      "role": "user",
      "content": "Hello!"
    }
  ],
  "max_tokens": 1024,
  "temperature": 0.7
}
```
### Gemini generateContent Format
```json
{
  "contents": [
    {
      "role": "user",
      "parts": [
        {
          "text": "System: You are a helpful assistant\n\nHello!"
        }
      ]
    }
  ],
  "generationConfig": {
    "temperature": 0.7,
    "maxOutputTokens": 1024
  }
}
```
### Key Differences
| Feature | Anthropic | OpenAI | Gemini |
|---------|-----------|--------|--------|
| **System Prompt** | Separate `system` field | First message with `role: "system"` | Prepended to first user message |
| **Message Content** | String or array of blocks | Always string | Array of `parts` with `text` |
| **Role Names** | `user`, `assistant` | `user`, `assistant`, `system` | `user`, `model` |
| **Max Tokens** | `max_tokens` | `max_tokens` | `generationConfig.maxOutputTokens` |
| **Response Format** | `content` array with typed blocks | `message.content` string | `candidates[0].content.parts[0].text` |
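The sketch below illustrates the Anthropic → Gemini mapping summarized in the table: roles are renamed (`assistant` → `model`), the system prompt is folded into the first user turn, and content becomes an array of `parts`. It is a simplified version of what `anthropic-to-gemini.ts` does, not the exact shipped code:
```typescript
// Hedged sketch: build a Gemini generateContent request body from an
// Anthropic Messages request (text content only).
function toGeminiRequest(anthropicReq: any) {
  const contents = anthropicReq.messages.map((msg: any, i: number) => {
    let text = typeof msg.content === 'string'
      ? msg.content
      : msg.content.filter((b: any) => b.type === 'text').map((b: any) => b.text).join('\n');

    // Gemini has no separate system field here, so prepend it to the first user turn.
    if (i === 0 && msg.role === 'user' && anthropicReq.system) {
      text = `System: ${anthropicReq.system}\n\n${text}`;
    }

    return {
      role: msg.role === 'assistant' ? 'model' : 'user',
      parts: [{ text }]
    };
  });

  return {
    contents,
    generationConfig: {
      temperature: anthropicReq.temperature,
      maxOutputTokens: anthropicReq.max_tokens
    }
  };
}
```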
---
## Tool/Function Calling Support
### Anthropic Tool Format
```json
{
  "tools": [
    {
      "name": "get_weather",
      "description": "Get weather for a location",
      "input_schema": {
        "type": "object",
        "properties": {
          "location": { "type": "string" }
        },
        "required": ["location"]
      }
    }
  ]
}
```
### OpenAI Tool Format
```json
{
  "tools": [
    {
      "type": "function",
      "function": {
        "name": "get_weather",
        "description": "Get weather for a location",
        "parameters": {
          "type": "object",
          "properties": {
            "location": { "type": "string" }
          },
          "required": ["location"]
        }
      }
    }
  ]
}
```
### Conversion Logic
```typescript
// Anthropic → OpenAI
if (anthropicReq.tools && anthropicReq.tools.length > 0) {
  openaiReq.tools = anthropicReq.tools.map(tool => ({
    type: 'function',
    function: {
      name: tool.name,
      description: tool.description || '',
      parameters: tool.input_schema || {
        type: 'object',
        properties: {},
        required: []
      }
    }
  }));
}

// OpenAI → Anthropic (tool_calls in response)
if (message.tool_calls && message.tool_calls.length > 0) {
  for (const toolCall of message.tool_calls) {
    contentBlocks.push({
      type: 'tool_use',
      id: toolCall.id,
      name: toolCall.function.name,
      input: JSON.parse(toolCall.function.arguments)
    });
  }
}
```
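The snippets above cover the outbound leg (tool definitions and tool calls). The return leg also needs converting: Anthropic sends tool results back as `tool_result` blocks inside a `user` message, while OpenAI-style APIs expect separate messages with `role: "tool"`. A hedged sketch of that conversion (illustrative, not the shipped converter):
```typescript
// Hedged sketch: expand one Anthropic user message containing tool_result
// blocks into the equivalent OpenAI-style messages.
function convertToolResults(msg: any): any[] {
  if (typeof msg.content === 'string') {
    return [{ role: msg.role, content: msg.content }];
  }

  const openaiMessages: any[] = [];
  for (const block of msg.content) {
    if (block.type === 'tool_result') {
      // Each tool_result becomes a "tool" message referencing the original call.
      openaiMessages.push({
        role: 'tool',
        tool_call_id: block.tool_use_id,
        content: typeof block.content === 'string'
          ? block.content
          : JSON.stringify(block.content)
      });
    } else if (block.type === 'text') {
      openaiMessages.push({ role: msg.role, content: block.text });
    }
  }
  return openaiMessages;
}
```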
---
## Testing Your Proxy
### Unit Tests
Create `tests/proxy-mistral.test.ts`:
```typescript
import { AnthropicToMistralProxy } from '../src/proxy/anthropic-to-mistral.js';
import fetch from 'node-fetch';

describe('Mistral Proxy', () => {
  let proxy: AnthropicToMistralProxy;
  const port = 3001;

  beforeAll(() => {
    proxy = new AnthropicToMistralProxy({
      mistralApiKey: process.env.MISTRAL_API_KEY || 'test-key'
    });
    proxy.start(port);
  });

  it('should convert Anthropic request to Mistral format', async () => {
    const response = await fetch(`http://localhost:${port}/v1/messages`, {
      method: 'POST',
      headers: { 'Content-Type': 'application/json' },
      body: JSON.stringify({
        model: 'claude-3-5-sonnet-20241022',
        messages: [{ role: 'user', content: 'Hello!' }],
        max_tokens: 100
      })
    });

    expect(response.ok).toBe(true);
    const data = await response.json();
    expect(data).toHaveProperty('content');
    expect(data.role).toBe('assistant');
  });
});
```
### Manual Testing
```bash
# Test health check
curl http://localhost:3000/health
# Test message endpoint
curl -X POST http://localhost:3000/v1/messages \
  -H "Content-Type: application/json" \
  -d '{
    "model": "claude-3-5-sonnet-20241022",
    "messages": [{"role": "user", "content": "Hello!"}],
    "max_tokens": 100
  }'
```
---
## Examples
### Complete Example: Adding Cohere
See full implementation: [examples/proxy-cohere.ts](../examples/proxy-cohere.ts)
### Integration with Agentic Flow
```typescript
// src/cli-proxy.ts - Add new provider option
if (options.provider === 'mistral' || process.env.USE_MISTRAL) {
  // Start Mistral proxy
  const proxy = new AnthropicToMistralProxy({
    mistralApiKey: process.env.MISTRAL_API_KEY!
  });
  proxy.start(3000);

  // Set environment for SDK
  process.env.ANTHROPIC_BASE_URL = 'http://localhost:3000';
  process.env.ANTHROPIC_API_KEY = 'dummy-key';
}
```
---
## Best Practices
1. **Error Handling**: Always catch and log errors with context
2. **Streaming**: Support both streaming and non-streaming modes
3. **Tool Calling**: Handle MCP tools via native function calling when possible
4. **Logging**: Use verbose logging during development, info in production
5. **API Keys**: Never hardcode keys, use environment variables
6. **Health Checks**: Always provide a `/health` endpoint
7. **Rate Limiting**: Respect provider rate limits
8. **Timeouts**: Set appropriate timeouts for API calls (see the sketch below)
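As an illustration of item 8, the upstream `fetch` call can be wrapped with an `AbortController`; the 120-second default here is an assumption to tune per provider and model size:
```typescript
// Illustrative timeout wrapper for the upstream provider call.
async function fetchWithTimeout(
  url: string,
  init: Parameters<typeof fetch>[1],
  timeoutMs = 120_000
) {
  const controller = new AbortController();
  const timer = setTimeout(() => controller.abort(), timeoutMs);
  try {
    // Abort the request if the provider does not respond in time.
    return await fetch(url, { ...init, signal: controller.signal });
  } finally {
    clearTimeout(timer);
  }
}

// Usage: const response = await fetchWithTimeout(url, { method: 'POST', body }, 60_000);
```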
---
## Resources
- [Anthropic Messages API](https://docs.anthropic.com/en/api/messages)
- [OpenAI Chat Completions](https://platform.openai.com/docs/api-reference/chat)
- [Google Gemini API](https://ai.google.dev/gemini-api/docs)
- [OpenRouter API](https://openrouter.ai/docs)
- [Ollama API](https://github.com/ollama/ollama/blob/main/docs/api.md)
---
## Support
Need help adding a provider? Open an issue: [GitHub Issues](https://github.com/ruvnet/agentic-flow/issues)