
Requesty.ai Integration Architecture

Architecture Overview

High-Level Design

The Requesty integration follows the same proxy pattern as the OpenRouter integration, with minimal modifications:

┌─────────────────────────────────────────────────────────────────┐
│                     Agentic Flow CLI                            │
│                  (cli-proxy.ts entry point)                     │
└────────────────────────┬────────────────────────────────────────┘
                         │
                         ├─ Parse CLI flags (--provider requesty)
                         ├─ Detect REQUESTY_API_KEY
                         └─ Route to appropriate handler
                         │
         ┌───────────────┴───────────────┐
         │                               │
         ▼                               ▼
┌────────────────┐              ┌────────────────┐
│   Direct API   │              │  Proxy Mode    │
│   (Anthropic)  │              │  (Requesty)    │
└────────────────┘              └────────┬───────┘
                                         │
                                         ▼
                        ┌────────────────────────────────┐
                        │ AnthropicToRequestyProxy       │
                        │ (Port 3000 local server)       │
                        ├────────────────────────────────┤
                        │ 1. Accept Anthropic format     │
                        │    (/v1/messages endpoint)     │
                        │ 2. Convert to OpenAI format    │
                        │ 3. Forward to Requesty router  │
                        │ 4. Convert response back       │
                        │ 5. Handle streaming/tools      │
                        └────────────┬───────────────────┘
                                     │
                                     │ HTTP POST
                                     ▼
                        ┌────────────────────────────────┐
                        │  Requesty Router               │
                        │  router.requesty.ai/v1         │
                        ├────────────────────────────────┤
                        │ • Auto-routing                 │
                        │ • Caching                      │
                        │ • Load balancing               │
                        │ • Cost optimization            │
                        └────────────┬───────────────────┘
                                     │
                                     ├─ Model Execution
                                     │
                        ┌────────────┴───────────────────┐
                        │                                │
                ┌───────▼──────┐              ┌─────────▼────────┐
                │  OpenAI      │              │  Anthropic       │
                │  (GPT-4o)    │              │  (Claude 3.5)    │
                └──────────────┘              └──────────────────┘
                        │                                │
                ┌───────▼──────┐              ┌─────────▼────────┐
                │  Google      │              │  DeepSeek        │
                │  (Gemini)    │              │  (Chat V3)       │
                └──────────────┘              └──────────────────┘

Component Breakdown

1. CLI Integration (src/cli-proxy.ts)

Responsibilities:

  • Detect --provider requesty flag
  • Check for REQUESTY_API_KEY environment variable
  • Initialize Requesty proxy server
  • Configure environment for Claude Agent SDK

Code Changes Required:

// Add to shouldUseRequesty() method
private shouldUseRequesty(options: any): boolean {
  if (options.provider === 'requesty' || process.env.PROVIDER === 'requesty') {
    return true;
  }

  if (process.env.USE_REQUESTY === 'true') {
    return true;
  }

  if (process.env.REQUESTY_API_KEY &&
      !process.env.ANTHROPIC_API_KEY &&
      !process.env.OPENROUTER_API_KEY &&
      !process.env.GOOGLE_GEMINI_API_KEY) {
    return true;
  }

  return false;
}

// Add to start() method
if (useRequesty) {
  console.log('🚀 Initializing Requesty proxy...');
  await this.startRequestyProxy(options.model);
}

// Add startRequestyProxy() method (clone from startOpenRouterProxy)
private async startRequestyProxy(modelOverride?: string): Promise<void> {
  const requestyKey = process.env.REQUESTY_API_KEY;

  if (!requestyKey) {
    console.error('❌ Error: REQUESTY_API_KEY required for Requesty models');
    console.error('Set it in .env or export REQUESTY_API_KEY=requesty-xxxxx');
    process.exit(1);
  }

  logger.info('Starting integrated Requesty proxy');

  const defaultModel = modelOverride ||
                      process.env.COMPLETION_MODEL ||
                      'openai/gpt-4o-mini';

  const capabilities = detectModelCapabilities(defaultModel);

  const proxy = new AnthropicToRequestyProxy({
    requestyApiKey: requestyKey,
    requestyBaseUrl: process.env.REQUESTY_BASE_URL,
    defaultModel,
    capabilities: capabilities
  });

  proxy.start(this.proxyPort);
  this.proxyServer = proxy;

  process.env.ANTHROPIC_BASE_URL = `http://localhost:${this.proxyPort}`;

  if (!process.env.ANTHROPIC_API_KEY) {
    process.env.ANTHROPIC_API_KEY = 'sk-ant-proxy-dummy-key';
  }

  console.log(`🔗 Proxy Mode: Requesty`);
  console.log(`🔧 Proxy URL: http://localhost:${this.proxyPort}`);
  console.log(`🤖 Default Model: ${defaultModel}`);

  if (capabilities.requiresEmulation) {
    console.log(`\n⚙  Detected: Model lacks native tool support`);
    console.log(`🔧 Using ${capabilities.emulationStrategy.toUpperCase()} emulation pattern`);
  }
  console.log('');

  await new Promise(resolve => setTimeout(resolve, 1500));
}

2. Proxy Server (src/proxy/anthropic-to-requesty.ts)

Based on: src/proxy/anthropic-to-openrouter.ts (95% identical)

Class Structure:

export class AnthropicToRequestyProxy {
  private app: express.Application;
  private requestyApiKey: string;
  private requestyBaseUrl: string;
  private defaultModel: string;
  private capabilities?: ModelCapabilities;

  constructor(config: {
    requestyApiKey: string;
    requestyBaseUrl?: string;
    defaultModel?: string;
    capabilities?: ModelCapabilities;
  }) {
    this.app = express();
    this.requestyApiKey = config.requestyApiKey;
    this.requestyBaseUrl = config.requestyBaseUrl ||
                           'https://router.requesty.ai/v1';
    this.defaultModel = config.defaultModel || 'openai/gpt-4o-mini';
    this.capabilities = config.capabilities;

    this.setupMiddleware();
    this.setupRoutes();
  }

  private setupRoutes(): void {
    // Health check
    this.app.get('/health', (req, res) => {
      res.json({ status: 'ok', service: 'anthropic-to-requesty-proxy' });
    });

    // Anthropic Messages API → Requesty Chat Completions
    this.app.post('/v1/messages', async (req, res) => {
      // Convert and forward request
      const result = await this.handleRequest(req.body, res);
      if (result) res.json(result);
    });
  }

  private async handleRequest(
    anthropicReq: AnthropicRequest,
    res: Response
  ): Promise<any> {
    const capabilities = this.capabilities ||
                        detectModelCapabilities(anthropicReq.model || this.defaultModel);

    if (capabilities.requiresEmulation && (anthropicReq.tools?.length ?? 0) > 0) {
      return this.handleEmulatedRequest(anthropicReq, capabilities);
    }

    return this.handleNativeRequest(anthropicReq, res);
  }

  private async handleNativeRequest(
    anthropicReq: AnthropicRequest,
    res: Response
  ): Promise<any> {
    // Convert Anthropic → OpenAI format
    const openaiReq = this.convertAnthropicToOpenAI(anthropicReq);

    // Forward to Requesty
    const response = await fetch(`${this.requestyBaseUrl}/chat/completions`, {
      method: 'POST',
      headers: {
        'Authorization': `Bearer ${this.requestyApiKey}`,
        'Content-Type': 'application/json',
        'HTTP-Referer': 'https://github.com/ruvnet/agentic-flow',
        'X-Title': 'Agentic Flow'
      },
      body: JSON.stringify(openaiReq)
    });

    if (!response.ok) {
      const error = await response.text();
      logger.error('Requesty API error', { status: response.status, error });
      res.status(response.status).json({
        error: { type: 'api_error', message: error }
      });
      return null;
    }

    // Handle streaming vs non-streaming
    if (anthropicReq.stream) {
      // Stream response
      res.setHeader('Content-Type', 'text/event-stream');
      const reader = response.body?.getReader();
      // ... streaming logic
      return null; // response is streamed directly to res, nothing to return
    } else {
      // Non-streaming
      const openaiRes = await response.json();
      return this.convertOpenAIToAnthropic(openaiRes);
    }
  }

  private convertAnthropicToOpenAI(req: AnthropicRequest): OpenAIRequest {
    // IDENTICAL to OpenRouter conversion
    // See anthropic-to-openrouter.ts lines 376-532
  }

  private convertOpenAIToAnthropic(res: any): any {
    // IDENTICAL to OpenRouter conversion
    // See anthropic-to-openrouter.ts lines 588-685
  }

  public start(port: number): void {
    this.app.listen(port, () => {
      logger.info('Anthropic to Requesty proxy started', {
        port,
        requestyBaseUrl: this.requestyBaseUrl,
        defaultModel: this.defaultModel
      });
      console.log(`\n✅ Anthropic Proxy running at http://localhost:${port}`);
      console.log(`   Requesty Base URL: ${this.requestyBaseUrl}`);
      console.log(`   Default Model: ${this.defaultModel}\n`);
    });
  }
}

Key Differences from OpenRouter Proxy:

| Component | OpenRouter | Requesty | Change Required |
|---|---|---|---|
| Class name | `AnthropicToOpenRouterProxy` | `AnthropicToRequestyProxy` | Rename |
| Base URL | `https://openrouter.ai/api/v1` | `https://router.requesty.ai/v1` | Update constant |
| API key variable | `openrouterApiKey` | `requestyApiKey` | Rename |
| Auth header | `Bearer sk-or-...` | `Bearer requesty-...` | No code change |
| Endpoint | `/chat/completions` | `/chat/completions` | Identical |
| Request format | OpenAI | OpenAI | Identical |
| Response format | OpenAI | OpenAI | Identical |
| Tool format | OpenAI functions | OpenAI functions | Identical |

Lines of Code to Copy: ~750 lines (95% reusable)

3. Agent Integration (src/agents/claudeAgent.ts)

Changes Required:

function getCurrentProvider(): string {
  // Add Requesty detection
  if (process.env.PROVIDER === 'requesty' || process.env.USE_REQUESTY === 'true') {
    return 'requesty';
  }
  // ... existing providers
}

function getModelForProvider(provider: string): {
  model: string;
  apiKey: string;
  baseURL?: string;
} {
  switch (provider) {
    case 'requesty':
      return {
        model: process.env.COMPLETION_MODEL || 'openai/gpt-4o-mini',
        apiKey: process.env.REQUESTY_API_KEY || process.env.ANTHROPIC_API_KEY || '',
        baseURL: process.env.PROXY_URL || undefined
      };
    // ... existing cases
  }
}

// In claudeAgent() function, add Requesty handling
if (provider === 'requesty' && process.env.REQUESTY_API_KEY) {
  envOverrides.ANTHROPIC_API_KEY = process.env.ANTHROPIC_API_KEY || 'proxy-key';
  envOverrides.ANTHROPIC_BASE_URL = process.env.ANTHROPIC_BASE_URL ||
                                   process.env.REQUESTY_PROXY_URL ||
                                   'http://localhost:3000';

  logger.info('Using Requesty proxy', {
    proxyUrl: envOverrides.ANTHROPIC_BASE_URL,
    model: finalModel
  });
}

4. Model Capabilities (src/utils/modelCapabilities.ts)

Add Requesty Model Definitions:

const MODEL_CAPABILITIES: Record<string, Partial<ModelCapabilities>> = {
  // Existing models...

  // Requesty - OpenAI models
  'openai/gpt-4o': {
    supportsNativeTools: true,
    contextWindow: 128000,
    requiresEmulation: false,
    emulationStrategy: 'none',
    costPerMillionTokens: 0.50,
    provider: 'requesty'
  },
  'openai/gpt-4o-mini': {
    supportsNativeTools: true,
    contextWindow: 128000,
    requiresEmulation: false,
    emulationStrategy: 'none',
    costPerMillionTokens: 0.03,
    provider: 'requesty'
  },

  // Requesty - Anthropic models
  'anthropic/claude-3.5-sonnet': {
    supportsNativeTools: true,
    contextWindow: 200000,
    requiresEmulation: false,
    emulationStrategy: 'none',
    costPerMillionTokens: 0.60,
    provider: 'requesty'
  },

  // Requesty - Google models
  'google/gemini-2.5-flash': {
    supportsNativeTools: true,
    contextWindow: 1000000,
    requiresEmulation: false,
    emulationStrategy: 'none',
    costPerMillionTokens: 0.0, // FREE
    provider: 'requesty'
  },

  // Requesty - DeepSeek models
  'deepseek/deepseek-chat-v3': {
    supportsNativeTools: true,
    contextWindow: 128000,
    requiresEmulation: false,
    emulationStrategy: 'none',
    costPerMillionTokens: 0.03,
    provider: 'requesty'
  },

  // ... add more Requesty models
};

5. Model Optimizer (src/utils/modelOptimizer.ts)

Add Requesty Models to Optimizer Database:

// In MODEL_DATABASE constant, add Requesty models
const MODEL_DATABASE: ModelInfo[] = [
  // Existing models...

  // Requesty models
  {
    provider: 'requesty',
    modelId: 'openai/gpt-4o',
    name: 'GPT-4o (Requesty)',
    contextWindow: 128000,
    maxOutput: 4096,
    qualityScore: 95,
    speedScore: 85,
    costPer1MTokens: { input: 0.50, output: 1.50 },
    capabilities: {
      toolCalling: true,
      streaming: true,
      vision: true,
      jsonMode: true
    },
    useCase: ['reasoning', 'coding', 'analysis'],
    requiresKey: 'REQUESTY_API_KEY'
  },
  {
    provider: 'requesty',
    modelId: 'openai/gpt-4o-mini',
    name: 'GPT-4o Mini (Requesty)',
    contextWindow: 128000,
    maxOutput: 4096,
    qualityScore: 80,
    speedScore: 95,
    costPer1MTokens: { input: 0.03, output: 0.06 },
    capabilities: {
      toolCalling: true,
      streaming: true,
      vision: false,
      jsonMode: true
    },
    useCase: ['coding', 'analysis', 'chat'],
    requiresKey: 'REQUESTY_API_KEY'
  },
  {
    provider: 'requesty',
    modelId: 'google/gemini-2.5-flash',
    name: 'Gemini 2.5 Flash (Requesty)',
    contextWindow: 1000000,
    maxOutput: 8192,
    qualityScore: 85,
    speedScore: 98,
    costPer1MTokens: { input: 0.0, output: 0.0 }, // FREE
    capabilities: {
      toolCalling: true,
      streaming: true,
      vision: true,
      jsonMode: true
    },
    useCase: ['coding', 'analysis', 'chat'],
    requiresKey: 'REQUESTY_API_KEY'
  },
  // ... add more Requesty models
];

Data Flow Diagrams

Request Flow - Chat Completion

User CLI Command
    │
    └─> npx agentic-flow --agent coder --task "Create API" --provider requesty
         │
         ├─> CLI Parser (cli-proxy.ts)
         │    ├─ Detect --provider requesty
         │    ├─ Load REQUESTY_API_KEY from env
         │    └─ Start AnthropicToRequestyProxy on port 3000
         │
         ├─> Set Environment Variables
         │    ├─ ANTHROPIC_BASE_URL = http://localhost:3000
         │    └─ ANTHROPIC_API_KEY = sk-ant-proxy-dummy-key
         │
         └─> Execute Agent (claudeAgent.ts)
              │
              └─> Claude Agent SDK query()
                   │
                   ├─> Reads ANTHROPIC_BASE_URL (proxy)
                   │
                   └─> POST http://localhost:3000/v1/messages
                        │
                        └─> AnthropicToRequestyProxy
                             │
                             ├─> Receive Anthropic format request
                             │    {
                             │      model: "openai/gpt-4o-mini",
                             │      messages: [...],
                             │      tools: [...]
                             │    }
                             │
                             ├─> Convert to OpenAI format
                             │    {
                             │      model: "openai/gpt-4o-mini",
                             │      messages: [...],
                             │      tools: [...]
                             │    }
                             │
                             ├─> POST https://router.requesty.ai/v1/chat/completions
                             │    Headers:
                             │      Authorization: Bearer requesty-xxxxx
                             │      Content-Type: application/json
                             │
                             └─> Requesty Router
                                  │
                                  ├─> Auto-route to optimal model
                                  ├─> Check cache
                                  ├─> Execute model
                                  │
                                  └─> Return OpenAI format response
                                       │
                                       └─> AnthropicToRequestyProxy
                                            │
                                            ├─> Convert to Anthropic format
                                            │    {
                                            │      id: "msg_xxx",
                                            │      role: "assistant",
                                            │      content: [...]
                                            │    }
                                            │
                                            └─> Return to Claude Agent SDK
                                                 │
                                                 └─> Display to user
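
The Anthropic-to-OpenAI conversion step in the middle of this flow can be sketched in a few lines. This is a simplified sketch with assumed shapes; the real converter in anthropic-to-openrouter.ts also maps tool_use/tool_result blocks and multi-part content:

```typescript
// Minimal sketch of the Anthropic -> OpenAI request mapping (simplified;
// tool blocks and image content are omitted here).
interface AnthropicMsg {
  role: 'user' | 'assistant';
  content: string | { type: string; text?: string }[];
}
interface AnthropicReq {
  model?: string;
  system?: string;
  messages: AnthropicMsg[];
  max_tokens?: number;
  stream?: boolean;
}

function convertAnthropicToOpenAI(req: AnthropicReq, defaultModel: string) {
  const messages: { role: string; content: string }[] = [];

  // Anthropic carries the system prompt as a top-level field;
  // OpenAI expects it as the first message.
  if (req.system) messages.push({ role: 'system', content: req.system });

  for (const m of req.messages) {
    const text = typeof m.content === 'string'
      ? m.content
      : m.content.filter(b => b.type === 'text').map(b => b.text ?? '').join('\n');
    messages.push({ role: m.role, content: text });
  }

  return {
    model: req.model || defaultModel,
    messages,
    max_tokens: req.max_tokens ?? 4096,
    stream: req.stream ?? false
  };
}

const sample = convertAnthropicToOpenAI(
  { system: 'Be terse.', messages: [{ role: 'user', content: 'hi' }] },
  'openai/gpt-4o-mini'
);
```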

Tool Calling Flow

User asks agent to read a file
    │
    └─> Agent determines tool call needed
         │
         └─> POST /v1/messages with tools array
              {
                "tools": [{
                  "type": "function",
                  "function": {
                    "name": "Read",
                    "parameters": {...}
                  }
                }]
              }
              │
              └─> Proxy converts to OpenAI format (no change needed)
                   │
                   └─> Requesty executes model
                        │
                        └─> Model returns tool_calls
                             {
                               "choices": [{
                                 "message": {
                                   "tool_calls": [{
                                     "id": "call_abc",
                                     "function": {
                                       "name": "Read",
                                       "arguments": "{...}"
                                     }
                                   }]
                                 }
                               }]
                             }
                             │
                             └─> Proxy converts to Anthropic format
                                  {
                                    "content": [{
                                      "type": "tool_use",
                                      "id": "call_abc",
                                      "name": "Read",
                                      "input": {...}
                                    }]
                                  }
                                  │
                                  └─> Claude Agent SDK executes tool
                                       │
                                       └─> Returns result to model
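
The tool_calls-to-tool_use mapping shown above reduces to a short transform. Shapes are simplified from the two public API formats; error handling for malformed argument JSON is omitted:

```typescript
// Sketch: map OpenAI tool_calls to Anthropic tool_use content blocks.
interface OpenAIToolCall {
  id: string;
  function: { name: string; arguments: string };
}

function toolCallsToToolUse(toolCalls: OpenAIToolCall[]) {
  return toolCalls.map(tc => ({
    type: 'tool_use' as const,
    id: tc.id,
    name: tc.function.name,
    // OpenAI sends arguments as a JSON string; Anthropic expects an object.
    input: JSON.parse(tc.function.arguments)
  }));
}

const blocks = toolCallsToToolUse([
  { id: 'call_abc', function: { name: 'Read', arguments: '{"path":"README.md"}' } }
]);
```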

File Organization

New Files

src/
└── proxy/
    └── anthropic-to-requesty.ts        (~750 lines, cloned from OpenRouter)

docs/
└── plans/
    └── requesty/
        ├── 00-overview.md
        ├── 01-api-research.md
        ├── 02-architecture.md          (this file)
        ├── 03-implementation-phases.md
        ├── 04-testing-strategy.md
        └── 05-migration-guide.md

Modified Files

src/
├── cli-proxy.ts                        (+ ~80 lines)
│   ├── shouldUseRequesty()
│   ├── startRequestyProxy()
│   └── Updated help text
│
├── agents/
│   └── claudeAgent.ts                  (+ ~15 lines)
│       ├── getCurrentProvider()
│       └── getModelForProvider()
│
└── utils/
    ├── modelCapabilities.ts            (+ ~50 lines)
    │   └── Add Requesty model definitions
    │
    └── modelOptimizer.ts               (+ ~100 lines)
        └── Add Requesty models to database

Total Code Impact

| Metric | Count |
|---|---|
| New files | 1 |
| Modified files | 4 |
| New lines of code | ~1,000 |
| Reused lines of code | ~750 (95% from OpenRouter) |
| Original code needed | ~250 |

Configuration Management

Environment Variables

# Required for Requesty
REQUESTY_API_KEY=requesty-xxxxxxxxxxxxx

# Optional overrides
REQUESTY_BASE_URL=https://router.requesty.ai/v1  # Custom base URL
REQUESTY_PROXY_URL=http://localhost:3000         # Proxy override
PROVIDER=requesty                                # Force Requesty
USE_REQUESTY=true                                # Alternative flag
COMPLETION_MODEL=openai/gpt-4o-mini              # Default model

# Proxy configuration
PROXY_PORT=3000                                  # Proxy server port

.env.example Update

# Add to .env.example
# ============================================
# Requesty Configuration
# ============================================
REQUESTY_API_KEY=                               # Get from https://app.requesty.ai
REQUESTY_BASE_URL=https://router.requesty.ai/v1 # Optional: Custom base URL
USE_REQUESTY=false                              # Set to 'true' to force Requesty

Config File Support

Consider adding ~/.agentic-flow/requesty.json:

{
  "apiKey": "requesty-xxxxx",
  "baseUrl": "https://router.requesty.ai/v1",
  "defaultModel": "openai/gpt-4o-mini",
  "autoRouting": true,
  "caching": {
    "enabled": true,
    "ttl": 3600
  },
  "fallback": {
    "enabled": true,
    "providers": ["openrouter", "anthropic"]
  }
}
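
If this file is adopted, a loader could resolve it with environment variables taking precedence over file values. This is a hypothetical helper; the filename and precedence order are assumptions from this plan, not existing code:

```typescript
import { readFileSync, existsSync } from 'node:fs';
import { homedir } from 'node:os';
import { join } from 'node:path';

interface RequestyConfig {
  apiKey?: string;
  baseUrl?: string;
  defaultModel?: string;
}

// Hypothetical loader: env vars win over the config file, mirroring how
// the rest of the CLI resolves settings.
function loadRequestyConfig(): RequestyConfig {
  const path = join(homedir(), '.agentic-flow', 'requesty.json');
  const fileConfig: RequestyConfig = existsSync(path)
    ? JSON.parse(readFileSync(path, 'utf8'))
    : {};

  return {
    apiKey: process.env.REQUESTY_API_KEY ?? fileConfig.apiKey,
    baseUrl: process.env.REQUESTY_BASE_URL
      ?? fileConfig.baseUrl
      ?? 'https://router.requesty.ai/v1',
    defaultModel: process.env.COMPLETION_MODEL
      ?? fileConfig.defaultModel
      ?? 'openai/gpt-4o-mini'
  };
}
```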

Error Handling Strategy

Error Mapping

// Map Requesty errors to user-friendly messages
private mapRequestyError(error: any): string {
  const errorMappings: Record<string, string> = {
    'invalid_api_key': 'Invalid REQUESTY_API_KEY. Check your API key.',
    'rate_limit_exceeded': 'Rate limit exceeded. Please wait and retry.',
    'model_not_found': 'Model not available. Check model ID.',
    'insufficient_quota': 'Insufficient Requesty credits.',
    'model_overloaded': 'Model temporarily overloaded. Retrying...',
    'timeout': 'Request timeout. Model took too long to respond.'
  };

  return errorMappings[error.code] || error.message;
}

Retry Logic

class NonRetryableError extends Error {}

private async callRequestyWithRetry(
  request: any,
  maxRetries: number = 3
): Promise<any> {
  for (let attempt = 1; attempt <= maxRetries; attempt++) {
    try {
      const response = await fetch(/* ... */);
      if (response.ok) return await response.json();

      // Retryable upstream errors: rate limit (429) or temporary unavailability
      if ([429, 503, 504].includes(response.status)) {
        const delay = Math.pow(2, attempt) * 1000; // Exponential backoff: 2s, 4s, 8s
        logger.warn(`Retrying after ${delay}ms (attempt ${attempt}/${maxRetries})`);
        await new Promise(resolve => setTimeout(resolve, delay));
        continue;
      }

      // Non-retryable HTTP error: fail fast instead of re-entering the loop
      throw new NonRetryableError(`Requesty API error: ${response.status}`);
    } catch (error) {
      if (error instanceof NonRetryableError) throw error;
      // Network-level failure: retry unless attempts are exhausted
      if (attempt === maxRetries) throw error;
    }
  }
  throw new Error(`Requesty request failed after ${maxRetries} attempts`);
}

Performance Considerations

Latency Optimization

  1. Keep-Alive Connections

    // Note: the `agent` option applies to node-fetch / http.request.
    // Node's built-in fetch (undici) takes a `dispatcher` option instead.
    import https from 'https';

    const agent = new https.Agent({
      keepAlive: true,
      maxSockets: 10
    });

    fetch(url, { agent });
    
  2. Request Pooling

    • Reuse HTTP connections
    • Connection pooling for concurrent requests
  3. Streaming

    • Enable streaming by default for large responses
    • Reduce time-to-first-token
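
On the wire, the streaming path re-emits OpenAI SSE deltas as Anthropic-style events. A simplified per-chunk translation sketch (event names follow Anthropic's streaming format; a full handler also tracks content-block indices, handles tool-call deltas, and emits message_start/message_stop framing):

```typescript
// Sketch: translate one OpenAI streaming chunk into an Anthropic-style
// content_block_delta SSE frame.
interface OpenAIChunk {
  choices: { delta: { content?: string } }[];
}

function chunkToAnthropicEvent(chunk: OpenAIChunk): string | null {
  const text = chunk.choices[0]?.delta?.content;
  if (!text) return null;

  const event = {
    type: 'content_block_delta',
    index: 0,
    delta: { type: 'text_delta', text }
  };
  // SSE framing: event line + data line + blank line.
  return `event: content_block_delta\ndata: ${JSON.stringify(event)}\n\n`;
}

const frame = chunkToAnthropicEvent({ choices: [{ delta: { content: 'Hello' } }] });
```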

Caching Strategy

Requesty has built-in caching, but we can add client-side caching too:

import crypto from 'node:crypto';

interface CacheEntry {
  key: string;
  value: any;
  timestamp: number;
  ttl: number;
}

class ResponseCache {
  private cache: Map<string, CacheEntry> = new Map();

  set(key: string, value: any, ttl: number = 3600): void {
    this.cache.set(key, {
      key,
      value,
      timestamp: Date.now(),
      ttl: ttl * 1000
    });
  }

  get(key: string): any | null {
    const entry = this.cache.get(key);
    if (!entry) return null;

    if (Date.now() - entry.timestamp > entry.ttl) {
      this.cache.delete(key);
      return null;
    }

    return entry.value;
  }

  generateKey(request: any): string {
    return crypto.createHash('sha256')
      .update(JSON.stringify(request))
      .digest('hex');
  }
}

Security Architecture

API Key Security

  1. Never log API keys

    logger.info('Request to Requesty', {
      apiKeyPresent: !!this.requestyApiKey,
      apiKeyPrefix: this.requestyApiKey?.substring(0, 10) // Only log prefix
    });
    
  2. Environment variable validation

    if (!requestyKey || !requestyKey.startsWith('requesty-')) {
      throw new Error('Invalid REQUESTY_API_KEY format');
    }
    
  3. Rate limit API key exposure

    • Don't include API key in error messages
    • Don't send API key to client in proxy responses
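
A small redaction helper can enforce the last two points before any message leaves the proxy. This is a hypothetical helper; the key pattern is an assumption based on the `requesty-` key format used throughout this plan:

```typescript
// Hypothetical redaction helper: strip anything that looks like a Requesty
// key from outbound error messages before they reach proxy clients or logs.
function redactSecrets(message: string): string {
  return message.replace(/requesty-[A-Za-z0-9_-]+/g, 'requesty-***');
}

const safe = redactSecrets('Auth failed for key requesty-abc123def');
```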

Request Validation

private validateRequest(req: AnthropicRequest): void {
  if (!req.messages || req.messages.length === 0) {
    throw new Error('Messages array cannot be empty');
  }

  if (req.max_tokens && req.max_tokens > 100000) {
    logger.warn('Unusually high max_tokens requested', {
      requested: req.max_tokens
    });
  }

  // Prevent injection attacks in system prompts
  if (req.system && typeof req.system === 'string') {
    this.sanitizeSystemPrompt(req.system);
  }
}

Monitoring and Observability

Logging Strategy

// Request logging
logger.info('Requesty request', {
  model: request.model,
  messageCount: request.messages.length,
  toolCount: request.tools?.length || 0,
  streaming: request.stream,
  maxTokens: request.max_tokens
});

// Response logging
logger.info('Requesty response', {
  id: response.id,
  model: response.model,
  finishReason: response.choices[0].finish_reason,
  tokensUsed: response.usage.total_tokens,
  latencyMs: Date.now() - startTime
});

// Error logging
logger.error('Requesty error', {
  errorType: error.type,
  errorCode: error.code,
  message: error.message,
  model: request.model,
  retryAttempt: attempt
});

Metrics Collection

interface RequestMetrics {
  requestId: string;
  model: string;
  startTime: number;
  endTime: number;
  latencyMs: number;
  tokensIn: number;
  tokensOut: number;
  tokensTotal: number;
  cost: number;
  success: boolean;
  errorType?: string;
}

class MetricsCollector {
  private metrics: RequestMetrics[] = [];

  recordRequest(metrics: RequestMetrics): void {
    this.metrics.push(metrics);

    // Optional: Send to analytics service
    if (process.env.ANALYTICS_ENABLED === 'true') {
      this.sendToAnalytics(metrics);
    }
  }

  getStats(period: '1h' | '24h' | '7d'): any {
    // Calculate aggregate stats
    const relevantMetrics = this.filterByPeriod(period);
    return {
      totalRequests: relevantMetrics.length,
      avgLatency: this.average(relevantMetrics.map(m => m.latencyMs)),
      totalTokens: this.sum(relevantMetrics.map(m => m.tokensTotal)),
      totalCost: this.sum(relevantMetrics.map(m => m.cost)),
      successRate: this.successRate(relevantMetrics)
    };
  }
}

Deployment Considerations

Standalone Proxy Mode

Support running Requesty proxy as standalone server:

# Terminal 1 - Run proxy
npx agentic-flow proxy --provider requesty --port 3000 --model "openai/gpt-4o-mini"

# Terminal 2 - Use with Claude Code
export ANTHROPIC_BASE_URL=http://localhost:3000
export ANTHROPIC_API_KEY=sk-ant-proxy-dummy-key
export REQUESTY_API_KEY=requesty-xxxxx
claude

Docker Support

# Add to existing Dockerfile
ENV REQUESTY_API_KEY=""
ENV REQUESTY_BASE_URL="https://router.requesty.ai/v1"
ENV USE_REQUESTY="false"

Health Checks

// Enhanced health check endpoint
this.app.get('/health', async (req, res) => {
  const health = {
    status: 'ok',
    service: 'anthropic-to-requesty-proxy',
    version: packageJson.version,
    uptime: process.uptime(),
    requesty: {
      baseUrl: this.requestyBaseUrl,
      apiKeyConfigured: !!this.requestyApiKey,
      defaultModel: this.defaultModel,
      apiReachable: false
    }
  };

  // Optional: Ping Requesty API
  try {
    const response = await fetch(`${this.requestyBaseUrl}/models`, {
      headers: { Authorization: `Bearer ${this.requestyApiKey}` }
    });
    health.requesty.apiReachable = response.ok;
  } catch (error) {
    health.requesty.apiReachable = false;
  }

  res.json(health);
});

Future Enhancements

Phase 2 Features

  1. Auto-Routing Integration

    • Support Requesty's auto-routing feature
    • Let Requesty choose optimal model based on request
  2. Caching Control

    • Expose cache control headers
    • Per-request cache configuration
  3. Analytics Dashboard

    • Local web UI showing Requesty usage stats
    • Cost tracking and optimization recommendations
  4. Fallback Chain

    • Automatic fallback to OpenRouter if Requesty fails
    • Configurable provider priority

Phase 3 Features

  1. Model Benchmarking

    • Compare same task across Requesty vs OpenRouter vs Anthropic
    • Quality/cost/speed metrics
  2. Smart Provider Selection

    • Automatically choose Requesty vs OpenRouter based on:
      • Current rate limits
      • Model availability
      • Cost optimization
      • Latency requirements
  3. Webhook Support

    • Async request processing
    • Long-running task support

Architecture Decision Records

ADR-001: Copy OpenRouter Proxy Pattern

Decision: Clone OpenRouter proxy implementation for Requesty

Rationale:

  • 95% code reuse
  • Proven pattern already tested
  • Minimal development time
  • Consistent user experience

Alternatives Considered:

  • Generic proxy factory (over-engineered for 2 providers)
  • Shared base class (adds complexity)

ADR-002: Same Port for All Proxies

Decision: Use port 3000 for all proxies (only one active at a time)

Rationale:

  • Simplifies configuration
  • Prevents port conflicts
  • Clear user experience

Alternatives Considered:

  • Different ports per provider (confusing)
  • Dynamic port allocation (complex)

ADR-003: OpenAI Format as Intermediate

Decision: Use OpenAI format for all proxy conversions

Rationale:

  • Industry standard
  • Most providers support it
  • Rich tool calling support

Alternatives Considered:

  • Direct Anthropic-to-Requesty (loses generalization)
  • Custom intermediate format (reinventing wheel)

Summary

The Requesty integration follows a proven, low-risk architecture:

  1. Clone OpenRouter proxy (~750 lines, 95% reusable)
  2. Update 4 existing files (~250 new lines total)
  3. Add model definitions (~100 lines for optimizer)
  4. Minimal testing overhead (reuse OpenRouter test suite)

Total Implementation Time: ~4 hours for core functionality

Risk Level: LOW (following established pattern)

Maintenance Burden: MINIMAL (almost identical to OpenRouter)