# Requesty.ai Integration Architecture

## Architecture Overview

### High-Level Design

The Requesty integration will follow the **exact same proxy pattern** as OpenRouter, with minimal modifications:
```
┌─────────────────────────────────────────────────────────────────┐
│                        Agentic Flow CLI                         │
│                   (cli-proxy.ts entry point)                    │
└────────────────────────┬────────────────────────────────────────┘
                         │
                         ├─ Parse CLI flags (--provider requesty)
                         ├─ Detect REQUESTY_API_KEY
                         └─ Route to appropriate handler
                         │
         ┌───────────────┴───────────────┐
         │                               │
         ▼                               ▼
┌────────────────┐              ┌────────────────┐
│   Direct API   │              │   Proxy Mode   │
│  (Anthropic)   │              │   (Requesty)   │
└────────────────┘              └────────┬───────┘
                                         │
                                         ▼
                        ┌────────────────────────────────┐
                        │   AnthropicToRequestyProxy     │
                        │   (Port 3000 local server)     │
                        ├────────────────────────────────┤
                        │ 1. Accept Anthropic format     │
                        │    (/v1/messages endpoint)     │
                        │ 2. Convert to OpenAI format    │
                        │ 3. Forward to Requesty router  │
                        │ 4. Convert response back       │
                        │ 5. Handle streaming/tools      │
                        └────────────┬───────────────────┘
                                     │
                                     │ HTTP POST
                                     ▼
                        ┌────────────────────────────────┐
                        │        Requesty Router         │
                        │     router.requesty.ai/v1      │
                        ├────────────────────────────────┤
                        │ • Auto-routing                 │
                        │ • Caching                      │
                        │ • Load balancing               │
                        │ • Cost optimization            │
                        └────────────┬───────────────────┘
                                     │
                                     ├─ Model Execution
                                     │
                        ┌────────────┴───────────────────┐
                        │                                │
                ┌───────▼──────┐               ┌─────────▼────────┐
                │    OpenAI    │               │    Anthropic     │
                │   (GPT-4o)   │               │   (Claude 3.5)   │
                └──────────────┘               └──────────────────┘
                        │                               │
                ┌───────▼──────┐               ┌─────────▼────────┐
                │    Google    │               │     DeepSeek     │
                │   (Gemini)   │               │    (Chat V3)     │
                └──────────────┘               └──────────────────┘
```

### Component Breakdown

#### 1. CLI Integration (`src/cli-proxy.ts`)

**Responsibilities:**
- Detect `--provider requesty` flag
- Check for `REQUESTY_API_KEY` environment variable
- Initialize Requesty proxy server
- Configure environment for Claude Agent SDK

**Code Changes Required:**

```typescript
// Add shouldUseRequesty() method
private shouldUseRequesty(options: any): boolean {
  if (options.provider === 'requesty' || process.env.PROVIDER === 'requesty') {
    return true;
  }

  if (process.env.USE_REQUESTY === 'true') {
    return true;
  }

  if (process.env.REQUESTY_API_KEY &&
      !process.env.ANTHROPIC_API_KEY &&
      !process.env.OPENROUTER_API_KEY &&
      !process.env.GOOGLE_GEMINI_API_KEY) {
    return true;
  }

  return false;
}

// Add to start() method
if (useRequesty) {
  console.log('🚀 Initializing Requesty proxy...');
  await this.startRequestyProxy(options.model);
}

// Add startRequestyProxy() method (clone from startOpenRouterProxy)
private async startRequestyProxy(modelOverride?: string): Promise<void> {
  const requestyKey = process.env.REQUESTY_API_KEY;

  if (!requestyKey) {
    console.error('❌ Error: REQUESTY_API_KEY required for Requesty models');
    console.error('Set it in .env or export REQUESTY_API_KEY=requesty-xxxxx');
    process.exit(1);
  }

  logger.info('Starting integrated Requesty proxy');

  const defaultModel = modelOverride ||
    process.env.COMPLETION_MODEL ||
    'openai/gpt-4o-mini';

  const capabilities = detectModelCapabilities(defaultModel);

  const proxy = new AnthropicToRequestyProxy({
    requestyApiKey: requestyKey,
    requestyBaseUrl: process.env.REQUESTY_BASE_URL,
    defaultModel,
    capabilities
  });

  proxy.start(this.proxyPort);
  this.proxyServer = proxy;

  process.env.ANTHROPIC_BASE_URL = `http://localhost:${this.proxyPort}`;

  if (!process.env.ANTHROPIC_API_KEY) {
    process.env.ANTHROPIC_API_KEY = 'sk-ant-proxy-dummy-key';
  }

  console.log(`🔗 Proxy Mode: Requesty`);
  console.log(`🔧 Proxy URL: http://localhost:${this.proxyPort}`);
  console.log(`🤖 Default Model: ${defaultModel}`);

  if (capabilities.requiresEmulation) {
    console.log(`\n⚙️ Detected: Model lacks native tool support`);
    console.log(`🔧 Using ${capabilities.emulationStrategy.toUpperCase()} emulation pattern`);
  }
  console.log('');

  // Give the proxy a moment to bind before the SDK starts sending requests
  await new Promise(resolve => setTimeout(resolve, 1500));
}
```

#### 2. Proxy Server (`src/proxy/anthropic-to-requesty.ts`)

**Based on:** `src/proxy/anthropic-to-openrouter.ts` (95% identical)

**Class Structure:**

```typescript
export class AnthropicToRequestyProxy {
  private app: express.Application;
  private requestyApiKey: string;
  private requestyBaseUrl: string;
  private defaultModel: string;
  private capabilities?: ModelCapabilities;

  constructor(config: {
    requestyApiKey: string;
    requestyBaseUrl?: string;
    defaultModel?: string;
    capabilities?: ModelCapabilities;
  }) {
    this.app = express();
    this.requestyApiKey = config.requestyApiKey;
    this.requestyBaseUrl = config.requestyBaseUrl ||
      'https://router.requesty.ai/v1';
    this.defaultModel = config.defaultModel || 'openai/gpt-4o-mini';
    this.capabilities = config.capabilities;

    this.setupMiddleware();
    this.setupRoutes();
  }

  private setupRoutes(): void {
    // Health check
    this.app.get('/health', (req, res) => {
      res.json({ status: 'ok', service: 'anthropic-to-requesty-proxy' });
    });

    // Anthropic Messages API → Requesty Chat Completions
    this.app.post('/v1/messages', async (req, res) => {
      // Convert and forward request
      const result = await this.handleRequest(req.body, res);
      if (result) res.json(result);
    });
  }

  private async handleRequest(
    anthropicReq: AnthropicRequest,
    res: Response
  ): Promise<any> {
    const capabilities = this.capabilities ||
      detectModelCapabilities(anthropicReq.model || this.defaultModel);

    if (capabilities.requiresEmulation && (anthropicReq.tools?.length ?? 0) > 0) {
      return this.handleEmulatedRequest(anthropicReq, capabilities);
    }

    return this.handleNativeRequest(anthropicReq, res);
  }

  private async handleNativeRequest(
    anthropicReq: AnthropicRequest,
    res: Response
  ): Promise<any> {
    // Convert Anthropic → OpenAI format
    const openaiReq = this.convertAnthropicToOpenAI(anthropicReq);

    // Forward to Requesty
    const response = await fetch(`${this.requestyBaseUrl}/chat/completions`, {
      method: 'POST',
      headers: {
        'Authorization': `Bearer ${this.requestyApiKey}`,
        'Content-Type': 'application/json',
        'HTTP-Referer': 'https://github.com/ruvnet/agentic-flow',
        'X-Title': 'Agentic Flow'
      },
      body: JSON.stringify(openaiReq)
    });

    if (!response.ok) {
      const error = await response.text();
      logger.error('Requesty API error', { status: response.status, error });
      res.status(response.status).json({
        error: { type: 'api_error', message: error }
      });
      return null;
    }

    // Handle streaming vs non-streaming
    if (anthropicReq.stream) {
      // Stream response
      res.setHeader('Content-Type', 'text/event-stream');
      const reader = response.body?.getReader();
      // ... streaming logic (see the sketch below)
    } else {
      // Non-streaming
      const openaiRes = await response.json();
      return this.convertOpenAIToAnthropic(openaiRes);
    }
  }

  private convertAnthropicToOpenAI(req: AnthropicRequest): OpenAIRequest {
    // IDENTICAL to OpenRouter conversion
    // See anthropic-to-openrouter.ts lines 376-532
  }

  private convertOpenAIToAnthropic(res: any): any {
    // IDENTICAL to OpenRouter conversion
    // See anthropic-to-openrouter.ts lines 588-685
  }

  public start(port: number): void {
    this.app.listen(port, () => {
      logger.info('Anthropic to Requesty proxy started', {
        port,
        requestyBaseUrl: this.requestyBaseUrl,
        defaultModel: this.defaultModel
      });
      console.log(`\n✅ Anthropic Proxy running at http://localhost:${port}`);
      console.log(`   Requesty Base URL: ${this.requestyBaseUrl}`);
      console.log(`   Default Model: ${this.defaultModel}\n`);
    });
  }
}
```
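
The streaming branch above is elided. A minimal sketch of what it has to do, assuming the OpenRouter proxy's approach of re-emitting OpenAI SSE chunks as Anthropic-style events; the function name and the exact event payloads are illustrative, not taken from the existing codebase:

```typescript
import type { Response } from 'express';

// Sketch: re-emit an OpenAI SSE stream as Anthropic-style SSE events.
// Assumes a web ReadableStream body (Node 18+ fetch).
async function pipeOpenAIStreamAsAnthropic(
  body: ReadableStream<Uint8Array>,
  res: Response
): Promise<void> {
  const writeEvent = (event: string, data: unknown) =>
    res.write(`event: ${event}\ndata: ${JSON.stringify(data)}\n\n`);

  writeEvent('message_start', { type: 'message_start' });
  writeEvent('content_block_start', {
    type: 'content_block_start',
    index: 0,
    content_block: { type: 'text', text: '' }
  });

  const reader = body.getReader();
  const decoder = new TextDecoder();
  let buffer = '';

  while (true) {
    const { done, value } = await reader.read();
    if (done) break;
    buffer += decoder.decode(value, { stream: true });

    // OpenAI streams newline-delimited "data: {...}" lines.
    const lines = buffer.split('\n');
    buffer = lines.pop() ?? ''; // keep any partial line for the next chunk
    for (const line of lines) {
      const payload = line.replace(/^data: /, '').trim();
      if (!payload || payload === '[DONE]') continue;
      const text = JSON.parse(payload).choices?.[0]?.delta?.content;
      if (text) {
        writeEvent('content_block_delta', {
          type: 'content_block_delta',
          index: 0,
          delta: { type: 'text_delta', text }
        });
      }
    }
  }

  writeEvent('content_block_stop', { type: 'content_block_stop', index: 0 });
  writeEvent('message_stop', { type: 'message_stop' });
  res.end();
}
```
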
**Key Differences from OpenRouter Proxy:**

| Component | OpenRouter | Requesty | Change Required |
|-----------|-----------|----------|-----------------|
| Class name | `AnthropicToOpenRouterProxy` | `AnthropicToRequestyProxy` | Rename |
| Base URL | `https://openrouter.ai/api/v1` | `https://router.requesty.ai/v1` | Update constant |
| API key variable | `openrouterApiKey` | `requestyApiKey` | Rename |
| Auth header | `Bearer sk-or-...` | `Bearer requesty-...` | No code change |
| Endpoint | `/chat/completions` | `/chat/completions` | Identical |
| Request format | OpenAI | OpenAI | Identical |
| Response format | OpenAI | OpenAI | Identical |
| Tool format | OpenAI functions | OpenAI functions | Identical |

**Lines of Code to Copy:** ~750 lines (95% reusable)

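Since the two conversion methods are stubs above, here is a minimal sketch of the request-side mapping, assuming the same approach as the OpenRouter proxy; the simplified types and the flattening of content blocks to plain text are illustrative only:

```typescript
// Simplified shapes for illustration; the real AnthropicRequest/OpenAIRequest
// types in the codebase carry more fields.
interface AnthropicRequestLike {
  model?: string;
  system?: string;
  max_tokens?: number;
  stream?: boolean;
  messages: Array<{ role: string; content: string | Array<{ type: string; text?: string }> }>;
  tools?: Array<{ name: string; description?: string; input_schema: object }>;
}

function convertAnthropicToOpenAI(req: AnthropicRequestLike, defaultModel: string) {
  const messages: Array<{ role: string; content: string }> = [];

  // Anthropic carries the system prompt as a top-level field;
  // OpenAI expects it as the first message.
  if (req.system) messages.push({ role: 'system', content: req.system });

  for (const msg of req.messages) {
    const text = typeof msg.content === 'string'
      ? msg.content
      : msg.content.filter(b => b.type === 'text').map(b => b.text ?? '').join('\n');
    messages.push({ role: msg.role, content: text });
  }

  return {
    model: req.model || defaultModel,
    messages,
    max_tokens: req.max_tokens,
    stream: req.stream ?? false,
    // Anthropic tool definitions keep their JSON schema in `input_schema`;
    // OpenAI nests it under `function.parameters`.
    tools: req.tools?.map(t => ({
      type: 'function' as const,
      function: { name: t.name, description: t.description, parameters: t.input_schema }
    }))
  };
}
```
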
#### 3. Agent Integration (`src/agents/claudeAgent.ts`)

**Changes Required:**

```typescript
function getCurrentProvider(): string {
  // Add Requesty detection
  if (process.env.PROVIDER === 'requesty' || process.env.USE_REQUESTY === 'true') {
    return 'requesty';
  }
  // ... existing providers
}

function getModelForProvider(provider: string): {
  model: string;
  apiKey: string;
  baseURL?: string;
} {
  switch (provider) {
    case 'requesty':
      return {
        model: process.env.COMPLETION_MODEL || 'openai/gpt-4o-mini',
        apiKey: process.env.REQUESTY_API_KEY || process.env.ANTHROPIC_API_KEY || '',
        baseURL: process.env.PROXY_URL || undefined
      };
    // ... existing cases
  }
}

// In claudeAgent() function, add Requesty handling
if (provider === 'requesty' && process.env.REQUESTY_API_KEY) {
  envOverrides.ANTHROPIC_API_KEY = process.env.ANTHROPIC_API_KEY || 'proxy-key';
  envOverrides.ANTHROPIC_BASE_URL = process.env.ANTHROPIC_BASE_URL ||
    process.env.REQUESTY_PROXY_URL ||
    'http://localhost:3000';

  logger.info('Using Requesty proxy', {
    proxyUrl: envOverrides.ANTHROPIC_BASE_URL,
    model: finalModel
  });
}
```

#### 4. Model Capabilities (`src/utils/modelCapabilities.ts`)

**Add Requesty Model Definitions:**

```typescript
const MODEL_CAPABILITIES: Record<string, Partial<ModelCapabilities>> = {
  // Existing models...

  // Requesty - OpenAI models
  'openai/gpt-4o': {
    supportsNativeTools: true,
    contextWindow: 128000,
    requiresEmulation: false,
    emulationStrategy: 'none',
    costPerMillionTokens: 0.50,
    provider: 'requesty'
  },
  'openai/gpt-4o-mini': {
    supportsNativeTools: true,
    contextWindow: 128000,
    requiresEmulation: false,
    emulationStrategy: 'none',
    costPerMillionTokens: 0.03,
    provider: 'requesty'
  },

  // Requesty - Anthropic models
  'anthropic/claude-3.5-sonnet': {
    supportsNativeTools: true,
    contextWindow: 200000,
    requiresEmulation: false,
    emulationStrategy: 'none',
    costPerMillionTokens: 0.60,
    provider: 'requesty'
  },

  // Requesty - Google models
  'google/gemini-2.5-flash': {
    supportsNativeTools: true,
    contextWindow: 1000000,
    requiresEmulation: false,
    emulationStrategy: 'none',
    costPerMillionTokens: 0.0, // FREE
    provider: 'requesty'
  },

  // Requesty - DeepSeek models
  'deepseek/deepseek-chat-v3': {
    supportsNativeTools: true,
    contextWindow: 128000,
    requiresEmulation: false,
    emulationStrategy: 'none',
    costPerMillionTokens: 0.03,
    provider: 'requesty'
  },

  // ... add more Requesty models
};
```

#### 5. Model Optimizer (`src/utils/modelOptimizer.ts`)

**Add Requesty Models to Optimizer Database:**

```typescript
// In MODEL_DATABASE constant, add Requesty models
const MODEL_DATABASE: ModelInfo[] = [
  // Existing models...

  // Requesty models
  {
    provider: 'requesty',
    modelId: 'openai/gpt-4o',
    name: 'GPT-4o (Requesty)',
    contextWindow: 128000,
    maxOutput: 4096,
    qualityScore: 95,
    speedScore: 85,
    costPer1MTokens: { input: 0.50, output: 1.50 },
    capabilities: {
      toolCalling: true,
      streaming: true,
      vision: true,
      jsonMode: true
    },
    useCase: ['reasoning', 'coding', 'analysis'],
    requiresKey: 'REQUESTY_API_KEY'
  },
  {
    provider: 'requesty',
    modelId: 'openai/gpt-4o-mini',
    name: 'GPT-4o Mini (Requesty)',
    contextWindow: 128000,
    maxOutput: 4096,
    qualityScore: 80,
    speedScore: 95,
    costPer1MTokens: { input: 0.03, output: 0.06 },
    capabilities: {
      toolCalling: true,
      streaming: true,
      vision: false,
      jsonMode: true
    },
    useCase: ['coding', 'analysis', 'chat'],
    requiresKey: 'REQUESTY_API_KEY'
  },
  {
    provider: 'requesty',
    modelId: 'google/gemini-2.5-flash',
    name: 'Gemini 2.5 Flash (Requesty)',
    contextWindow: 1000000,
    maxOutput: 8192,
    qualityScore: 85,
    speedScore: 98,
    costPer1MTokens: { input: 0.0, output: 0.0 }, // FREE
    capabilities: {
      toolCalling: true,
      streaming: true,
      vision: true,
      jsonMode: true
    },
    useCase: ['coding', 'analysis', 'chat'],
    requiresKey: 'REQUESTY_API_KEY'
  },
  // ... add more Requesty models
];
```

## Data Flow Diagrams

### Request Flow - Chat Completion

```
User CLI Command
  │
  └─> npx agentic-flow --agent coder --task "Create API" --provider requesty
        │
        ├─> CLI Parser (cli-proxy.ts)
        │     ├─ Detect --provider requesty
        │     ├─ Load REQUESTY_API_KEY from env
        │     └─ Start AnthropicToRequestyProxy on port 3000
        │
        ├─> Set Environment Variables
        │     ├─ ANTHROPIC_BASE_URL = http://localhost:3000
        │     └─ ANTHROPIC_API_KEY = sk-ant-proxy-dummy-key
        │
        └─> Execute Agent (claudeAgent.ts)
              │
              └─> Claude Agent SDK query()
                    │
                    ├─> Reads ANTHROPIC_BASE_URL (proxy)
                    │
                    └─> POST http://localhost:3000/v1/messages
                          │
                          └─> AnthropicToRequestyProxy
                                │
                                ├─> Receive Anthropic format request
                                │     {
                                │       model: "openai/gpt-4o-mini",
                                │       messages: [...],
                                │       tools: [...]
                                │     }
                                │
                                ├─> Convert to OpenAI format
                                │     {
                                │       model: "openai/gpt-4o-mini",
                                │       messages: [...],
                                │       tools: [...]
                                │     }
                                │
                                ├─> POST https://router.requesty.ai/v1/chat/completions
                                │     Headers:
                                │       Authorization: Bearer requesty-xxxxx
                                │       Content-Type: application/json
                                │
                                └─> Requesty Router
                                      │
                                      ├─> Auto-route to optimal model
                                      ├─> Check cache
                                      ├─> Execute model
                                      │
                                      └─> Return OpenAI format response
                                            │
                                            └─> AnthropicToRequestyProxy
                                                  │
                                                  ├─> Convert to Anthropic format
                                                  │     {
                                                  │       id: "msg_xxx",
                                                  │       role: "assistant",
                                                  │       content: [...]
                                                  │     }
                                                  │
                                                  └─> Return to Claude Agent SDK
                                                        │
                                                        └─> Display to user
```
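
With the proxy running, this flow can be smoke-tested without the SDK by posting an Anthropic-format request directly to the local endpoint. The request body below mirrors the shape in the diagram; the dummy key is a placeholder the proxy ignores, since Requesty auth happens on the outbound call:

```typescript
// Smoke test against the local proxy (Node 18+ for global fetch).
async function smokeTestProxy(): Promise<void> {
  const response = await fetch('http://localhost:3000/v1/messages', {
    method: 'POST',
    headers: {
      'Content-Type': 'application/json',
      'x-api-key': 'sk-ant-proxy-dummy-key' // placeholder; not forwarded to Requesty
    },
    body: JSON.stringify({
      model: 'openai/gpt-4o-mini',
      max_tokens: 256,
      messages: [{ role: 'user', content: 'Say hello in one sentence.' }]
    })
  });

  if (!response.ok) {
    throw new Error(`Proxy returned ${response.status}: ${await response.text()}`);
  }
  console.log(await response.json()); // Anthropic-format response from the proxy
}

smokeTestProxy().catch(console.error);
```
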
### Tool Calling Flow

```
User asks agent to read a file
  │
  └─> Agent determines tool call needed
        │
        └─> POST /v1/messages with tools array
              {
                "tools": [{
                  "type": "function",
                  "function": {
                    "name": "Read",
                    "parameters": {...}
                  }
                }]
              }
              │
              └─> Proxy converts to OpenAI format (no change needed)
                    │
                    └─> Requesty executes model
                          │
                          └─> Model returns tool_calls
                                {
                                  "choices": [{
                                    "message": {
                                      "tool_calls": [{
                                        "id": "call_abc",
                                        "function": {
                                          "name": "Read",
                                          "arguments": "{...}"
                                        }
                                      }]
                                    }
                                  }]
                                }
                                │
                                └─> Proxy converts to Anthropic format
                                      {
                                        "content": [{
                                          "type": "tool_use",
                                          "id": "call_abc",
                                          "name": "Read",
                                          "input": {...}
                                        }]
                                      }
                                      │
                                      └─> Claude Agent SDK executes tool
                                            │
                                            └─> Returns result to model
```
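
The response-side conversion at the heart of this flow is small; here is a sketch under the assumption that it matches the OpenRouter proxy's behavior, with simplified types for illustration:

```typescript
// OpenAI tool_calls → Anthropic tool_use content blocks.
interface OpenAIToolCall {
  id: string;
  function: { name: string; arguments: string }; // arguments arrive as a JSON string
}

interface AnthropicToolUseBlock {
  type: 'tool_use';
  id: string;
  name: string;
  input: unknown;
}

function convertToolCalls(toolCalls: OpenAIToolCall[]): AnthropicToolUseBlock[] {
  return toolCalls.map(call => ({
    type: 'tool_use',
    id: call.id,
    name: call.function.name,
    // OpenAI serializes arguments as a JSON string; Anthropic expects an object.
    input: JSON.parse(call.function.arguments || '{}')
  }));
}
```
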
## File Organization

### New Files

```
src/
└── proxy/
    └── anthropic-to-requesty.ts    (~750 lines, cloned from OpenRouter)

docs/
└── plans/
    └── requesty/
        ├── 00-overview.md
        ├── 01-api-research.md
        ├── 02-architecture.md      (this file)
        ├── 03-implementation-phases.md
        ├── 04-testing-strategy.md
        └── 05-migration-guide.md
```

### Modified Files

```
src/
├── cli-proxy.ts                    (+ ~80 lines)
│   ├── shouldUseRequesty()
│   ├── startRequestyProxy()
│   └── Updated help text
│
├── agents/
│   └── claudeAgent.ts              (+ ~15 lines)
│       ├── getCurrentProvider()
│       └── getModelForProvider()
│
└── utils/
    ├── modelCapabilities.ts        (+ ~50 lines)
    │   └── Add Requesty model definitions
    │
    └── modelOptimizer.ts           (+ ~100 lines)
        └── Add Requesty models to database
```

### Total Code Impact

| Metric | Count |
|--------|-------|
| New files | 1 |
| Modified files | 4 |
| New lines of code | ~1,000 |
| Reused lines of code | ~750 (95% from OpenRouter) |
| Original code needed | ~250 |

## Configuration Management

### Environment Variables

```bash
# Required for Requesty
REQUESTY_API_KEY=requesty-xxxxxxxxxxxxx

# Optional overrides
REQUESTY_BASE_URL=https://router.requesty.ai/v1   # Custom base URL
REQUESTY_PROXY_URL=http://localhost:3000          # Proxy override
PROVIDER=requesty                                 # Force Requesty
USE_REQUESTY=true                                 # Alternative flag
COMPLETION_MODEL=openai/gpt-4o-mini               # Default model

# Proxy configuration
PROXY_PORT=3000                                   # Proxy server port
```

### .env.example Update

```bash
# Add to .env.example
# ============================================
# Requesty Configuration
# ============================================
REQUESTY_API_KEY=                                 # Get from https://app.requesty.ai
REQUESTY_BASE_URL=https://router.requesty.ai/v1   # Optional: Custom base URL
USE_REQUESTY=false                                # Set to 'true' to force Requesty
```

### Config File Support

Consider adding `~/.agentic-flow/requesty.json`:

```json
{
  "apiKey": "requesty-xxxxx",
  "baseUrl": "https://router.requesty.ai/v1",
  "defaultModel": "openai/gpt-4o-mini",
  "autoRouting": true,
  "caching": {
    "enabled": true,
    "ttl": 3600
  },
  "fallback": {
    "enabled": true,
    "providers": ["openrouter", "anthropic"]
  }
}
```
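
A loader for that file could look like the sketch below. The file, its schema, and the env-over-file precedence are all proposals here, not shipped behavior:

```typescript
import fs from 'fs';
import os from 'os';
import path from 'path';

// Hypothetical config shape matching a subset of the JSON proposal above.
interface RequestyFileConfig {
  apiKey?: string;
  baseUrl?: string;
  defaultModel?: string;
}

function loadRequestyConfig(): RequestyFileConfig {
  const configPath = path.join(os.homedir(), '.agentic-flow', 'requesty.json');
  let fileConfig: RequestyFileConfig = {};

  if (fs.existsSync(configPath)) {
    fileConfig = JSON.parse(fs.readFileSync(configPath, 'utf-8'));
  }

  // Environment variables take precedence over the config file.
  return {
    apiKey: process.env.REQUESTY_API_KEY ?? fileConfig.apiKey,
    baseUrl: process.env.REQUESTY_BASE_URL ?? fileConfig.baseUrl ?? 'https://router.requesty.ai/v1',
    defaultModel: process.env.COMPLETION_MODEL ?? fileConfig.defaultModel ?? 'openai/gpt-4o-mini'
  };
}
```
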
## Error Handling Strategy

### Error Mapping

```typescript
// Map Requesty errors to user-friendly messages
private mapRequestyError(error: any): string {
  const errorMappings: Record<string, string> = {
    'invalid_api_key': 'Invalid REQUESTY_API_KEY. Check your API key.',
    'rate_limit_exceeded': 'Rate limit exceeded. Please wait and retry.',
    'model_not_found': 'Model not available. Check model ID.',
    'insufficient_quota': 'Insufficient Requesty credits.',
    'model_overloaded': 'Model temporarily overloaded. Retrying...',
    'timeout': 'Request timeout. Model took too long to respond.'
  };

  return errorMappings[error.code] || error.message;
}
```

### Retry Logic

```typescript
private async callRequestyWithRetry(
  request: any,
  maxRetries: number = 3
): Promise<any> {
  for (let attempt = 1; attempt <= maxRetries; attempt++) {
    try {
      const response = await fetch(/* ... */);
      if (response.ok) return await response.json();

      // Check if error is retryable
      if ([429, 503, 504].includes(response.status)) {
        const delay = Math.pow(2, attempt) * 1000; // Exponential backoff
        logger.warn(`Retrying after ${delay}ms (attempt ${attempt}/${maxRetries})`);
        await new Promise(resolve => setTimeout(resolve, delay));
        continue;
      }

      // Non-retryable error
      throw new Error(`Requesty API error: ${response.status}`);
    } catch (error) {
      if (attempt === maxRetries) throw error;
    }
  }

  // Every attempt hit a retryable error without a successful response
  throw new Error('Requesty request failed after maximum retries');
}
```

## Performance Considerations

### Latency Optimization

1. **Keep-Alive Connections**
   ```typescript
   import fetch from 'node-fetch';
   import https from 'https';

   // Note: the `agent` option is node-fetch specific; Node's built-in
   // fetch (undici) takes a `dispatcher` instead (see the sketch below).
   const agent = new https.Agent({
     keepAlive: true,
     maxSockets: 10
   });

   fetch(url, { agent });
   ```

2. **Request Pooling**
   - Reuse HTTP connections
   - Connection pooling for concurrent requests (see the sketch after this list)

3. **Streaming**
   - Enable streaming by default for large responses
   - Reduce time-to-first-token

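For pooling with Node's built-in fetch, undici (which backs it) exposes an `Agent` passed as a `dispatcher`; a sketch, with the timeout and connection counts as illustrative values:

```typescript
import { Agent, fetch as undiciFetch } from 'undici';

// One shared agent so every Requesty call reuses pooled sockets.
const requestyAgent = new Agent({
  keepAliveTimeout: 30_000, // keep idle sockets open between requests
  connections: 10           // cap concurrent sockets per origin
});

async function postToRequesty(apiKey: string, body: unknown): Promise<unknown> {
  const response = await undiciFetch('https://router.requesty.ai/v1/chat/completions', {
    method: 'POST',
    dispatcher: requestyAgent, // undici's equivalent of node-fetch's `agent`
    headers: {
      'Authorization': `Bearer ${apiKey}`,
      'Content-Type': 'application/json'
    },
    body: JSON.stringify(body)
  });
  return response.json();
}
```
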
### Caching Strategy

Requesty has built-in caching, but we can add client-side caching too:

```typescript
import crypto from 'crypto';

interface CacheEntry {
  key: string;
  value: any;
  timestamp: number;
  ttl: number;
}

class ResponseCache {
  private cache: Map<string, CacheEntry> = new Map();

  set(key: string, value: any, ttl: number = 3600): void {
    this.cache.set(key, {
      key,
      value,
      timestamp: Date.now(),
      ttl: ttl * 1000 // store TTL in milliseconds
    });
  }

  get(key: string): any | null {
    const entry = this.cache.get(key);
    if (!entry) return null;

    if (Date.now() - entry.timestamp > entry.ttl) {
      this.cache.delete(key);
      return null;
    }

    return entry.value;
  }

  generateKey(request: any): string {
    return crypto.createHash('sha256')
      .update(JSON.stringify(request))
      .digest('hex');
  }
}
```

## Security Architecture

### API Key Security

1. **Never log API keys**
   ```typescript
   logger.info('Request to Requesty', {
     apiKeyPresent: !!this.requestyApiKey,
     apiKeyPrefix: this.requestyApiKey?.substring(0, 10) // Only log prefix
   });
   ```

2. **Environment variable validation**
   ```typescript
   if (!requestyKey || !requestyKey.startsWith('requesty-')) {
     throw new Error('Invalid REQUESTY_API_KEY format');
   }
   ```

3. **Limit API key exposure**
   - Don't include API key in error messages
   - Don't send API key to client in proxy responses

### Request Validation

```typescript
private validateRequest(req: AnthropicRequest): void {
  if (!req.messages || req.messages.length === 0) {
    throw new Error('Messages array cannot be empty');
  }

  if (req.max_tokens && req.max_tokens > 100000) {
    logger.warn('Unusually high max_tokens requested', {
      requested: req.max_tokens
    });
  }

  // Prevent injection attacks in system prompts
  if (req.system && typeof req.system === 'string') {
    this.sanitizeSystemPrompt(req.system);
  }
}
```

## Monitoring and Observability

### Logging Strategy

```typescript
// Request logging
logger.info('Requesty request', {
  model: request.model,
  messageCount: request.messages.length,
  toolCount: request.tools?.length || 0,
  streaming: request.stream,
  maxTokens: request.max_tokens
});

// Response logging
logger.info('Requesty response', {
  id: response.id,
  model: response.model,
  finishReason: response.choices[0].finish_reason,
  tokensUsed: response.usage.total_tokens,
  latencyMs: Date.now() - startTime
});

// Error logging
logger.error('Requesty error', {
  errorType: error.type,
  errorCode: error.code,
  message: error.message,
  model: request.model,
  retryAttempt: attempt
});
```

### Metrics Collection

```typescript
interface RequestMetrics {
  requestId: string;
  model: string;
  startTime: number;
  endTime: number;
  latencyMs: number;
  tokensIn: number;
  tokensOut: number;
  tokensTotal: number;
  cost: number;
  success: boolean;
  errorType?: string;
}

class MetricsCollector {
  private metrics: RequestMetrics[] = [];

  recordRequest(metrics: RequestMetrics): void {
    this.metrics.push(metrics);

    // Optional: Send to analytics service
    if (process.env.ANALYTICS_ENABLED === 'true') {
      this.sendToAnalytics(metrics);
    }
  }

  getStats(period: '1h' | '24h' | '7d'): any {
    // Calculate aggregate stats (helpers sketched below)
    const relevantMetrics = this.filterByPeriod(period);
    return {
      totalRequests: relevantMetrics.length,
      avgLatency: this.average(relevantMetrics.map(m => m.latencyMs)),
      totalTokens: this.sum(relevantMetrics.map(m => m.tokensTotal)),
      totalCost: this.sum(relevantMetrics.map(m => m.cost)),
      successRate: this.successRate(relevantMetrics)
    };
  }
}
```
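
The aggregation helpers referenced in `getStats()` are elided above; a sketch of implementations to add inside `MetricsCollector`:

```typescript
// Inside MetricsCollector:

private filterByPeriod(period: '1h' | '24h' | '7d'): RequestMetrics[] {
  const windowMs = { '1h': 3_600_000, '24h': 86_400_000, '7d': 604_800_000 }[period];
  const cutoff = Date.now() - windowMs;
  return this.metrics.filter(m => m.endTime >= cutoff);
}

private sum(values: number[]): number {
  return values.reduce((total, v) => total + v, 0);
}

private average(values: number[]): number {
  return values.length === 0 ? 0 : this.sum(values) / values.length;
}

private successRate(metrics: RequestMetrics[]): number {
  if (metrics.length === 0) return 0;
  return metrics.filter(m => m.success).length / metrics.length;
}
```
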
## Deployment Considerations

### Standalone Proxy Mode

Support running the Requesty proxy as a standalone server:

```bash
# Terminal 1 - Run proxy
npx agentic-flow proxy --provider requesty --port 3000 --model "openai/gpt-4o-mini"

# Terminal 2 - Use with Claude Code
export ANTHROPIC_BASE_URL=http://localhost:3000
export ANTHROPIC_API_KEY=sk-ant-proxy-dummy-key
export REQUESTY_API_KEY=requesty-xxxxx
claude
```

### Docker Support

```dockerfile
# Add to existing Dockerfile
ENV REQUESTY_API_KEY=""
ENV REQUESTY_BASE_URL="https://router.requesty.ai/v1"
ENV USE_REQUESTY="false"
```

### Health Checks

```typescript
// Enhanced health check endpoint
this.app.get('/health', async (req, res) => {
  const health = {
    status: 'ok',
    service: 'anthropic-to-requesty-proxy',
    version: packageJson.version,
    uptime: process.uptime(),
    requesty: {
      baseUrl: this.requestyBaseUrl,
      apiKeyConfigured: !!this.requestyApiKey,
      defaultModel: this.defaultModel,
      apiReachable: false // updated below by the optional ping
    }
  };

  // Optional: Ping Requesty API
  try {
    const response = await fetch(`${this.requestyBaseUrl}/models`, {
      headers: { Authorization: `Bearer ${this.requestyApiKey}` }
    });
    health.requesty.apiReachable = response.ok;
  } catch (error) {
    health.requesty.apiReachable = false;
  }

  res.json(health);
});
```

## Future Enhancements

### Phase 2 Features

1. **Auto-Routing Integration**
   - Support Requesty's auto-routing feature
   - Let Requesty choose the optimal model based on the request

2. **Caching Control**
   - Expose cache control headers
   - Per-request cache configuration

3. **Analytics Dashboard**
   - Local web UI showing Requesty usage stats
   - Cost tracking and optimization recommendations

4. **Fallback Chain**
   - Automatic fallback to OpenRouter if Requesty fails
   - Configurable provider priority

### Phase 3 Features

1. **Model Benchmarking**
   - Compare the same task across Requesty vs OpenRouter vs Anthropic
   - Quality/cost/speed metrics

2. **Smart Provider Selection**
   - Automatically choose Requesty vs OpenRouter based on:
     - Current rate limits
     - Model availability
     - Cost optimization
     - Latency requirements

3. **Webhook Support**
   - Async request processing
   - Long-running task support

## Architecture Decision Records

### ADR-001: Copy OpenRouter Proxy Pattern

**Decision:** Clone the OpenRouter proxy implementation for Requesty

**Rationale:**
- 95% code reuse
- Proven pattern already tested
- Minimal development time
- Consistent user experience

**Alternatives Considered:**
- Generic proxy factory (over-engineered for 2 providers)
- Shared base class (adds complexity)

### ADR-002: Same Port for All Proxies

**Decision:** Use port 3000 for all proxies (only one active at a time)

**Rationale:**
- Simplifies configuration
- Prevents port conflicts
- Clear user experience

**Alternatives Considered:**
- Different ports per provider (confusing)
- Dynamic port allocation (complex)

### ADR-003: OpenAI Format as Intermediate

**Decision:** Use OpenAI format for all proxy conversions

**Rationale:**
- Industry standard
- Most providers support it
- Rich tool calling support

**Alternatives Considered:**
- Direct Anthropic-to-Requesty (loses generalization)
- Custom intermediate format (reinventing the wheel)

## Summary

The Requesty integration follows a **proven, low-risk architecture**:

1. **Clone OpenRouter proxy** (~750 lines, 95% reusable)
2. **Update 4 existing files** (~250 new lines total)
3. **Add model definitions** (~100 lines for optimizer)
4. **Minimal testing overhead** (reuse OpenRouter test suite)

**Total Implementation Time:** ~4 hours for core functionality

**Risk Level:** LOW (following established pattern)

**Maintenance Burden:** MINIMAL (almost identical to OpenRouter)