# Provider-Specific Tool Instruction Optimization

## Overview

Enhanced the OpenRouter and Gemini proxies with provider-specific tool instructions to optimize tool calling success rates across different LLM families.

## Changes Made

### 1. Created Provider Instruction Templates (`src/proxy/provider-instructions.ts`)

Implemented tailored instruction sets for each major provider:

- **ANTHROPIC_INSTRUCTIONS**: Native tool calling, minimal instructions needed
- **OPENAI_INSTRUCTIONS**: XML format with strong emphasis on using tags
- **GOOGLE_INSTRUCTIONS**: Detailed step-by-step instructions with explicit examples
- **META_INSTRUCTIONS**: Clear, concise instructions for Llama models
- **DEEPSEEK_INSTRUCTIONS**: Technical, precise instructions
- **MISTRAL_INSTRUCTIONS**: Direct, action-oriented commands
- **XAI_INSTRUCTIONS**: Balanced instructions for Grok models
- **BASE_INSTRUCTIONS**: Default fallback for unknown providers

### 2. Enhanced OpenRouter Proxy (`src/proxy/anthropic-to-openrouter.ts`)

**Key Updates**:
- Imported `getInstructionsForModel` and `formatInstructions` from provider-instructions
- Added `extractProvider()` method to parse provider from model ID
- Modified `convertAnthropicToOpenAI()` to use model-specific instructions:
  ```typescript
  const modelId = anthropicReq.model || this.defaultModel;
  const provider = this.extractProvider(modelId);
  const instructions = getInstructionsForModel(modelId, provider);
  const toolInstructions = formatInstructions(instructions);
  ```

### 3. Updated Test Models (`test-top20-models.ts`)

Corrected invalid model IDs based on OpenRouter API research:
- `deepseek/deepseek-v3.1:free` → `deepseek/deepseek-chat-v3.1:free`
- `deepseek/deepseek-v3` → `deepseek/deepseek-v3.2-exp`
- `google/gemma-3-12b` → `google/gemma-2-27b-it`

### 4. Created Provider Validation Test (`tests/test-provider-instructions.ts`)

Comprehensive test covering all major providers:
- Anthropic (Claude)
- OpenAI (GPT)
- Google (Gemini)
- Meta (Llama)
- DeepSeek
- Mistral
- X.AI (Grok)

## Instruction Strategy by Provider

### Anthropic Models
**Format**: Native tool calling
**Strategy**: Minimal instructions - models already understand Anthropic tool format
**Example**: "You have native access to file system tools. Use them directly."

### OpenAI/GPT Models
**Format**: XML tags with strong emphasis
**Strategy**: Explicit instructions with "CRITICAL" emphasis to use exact XML formats
**Key Point**: "Do not just describe the file - actually use the tags"

### Google/Gemini Models
**Format**: Detailed XML with step-by-step guidance
**Strategy**: Very explicit instructions with numbered steps
**Key Point**: "Always use the XML tags. Just writing code blocks will NOT create files"

### Meta/Llama Models
**Format**: Clear, concise XML commands
**Strategy**: Simple, direct examples without excessive detail
**Key Point**: "Use these tags to perform actual file operations"

### DeepSeek Models
**Format**: Technical, precise XML instructions
**Strategy**: Focus on structured command parsing
**Key Point**: "Commands are parsed and executed by the system"

### Mistral Models
**Format**: Action-oriented with urgency
**Strategy**: Use "ACTION REQUIRED" language to prompt tool usage
**Key Point**: "Do not just show code - use the tags to create real files"

### X.AI/Grok Models
**Format**: Balanced, clear command structure
**Strategy**: Straightforward file system command list
**Key Point**: "Use structured commands to interact with the file system"

## Expected Improvements

Based on initial testing (TOP20_MODELS_MATRIX.md):

**Before Optimization**:
- 92.9% tool success rate (13/14 working models)
- 1 model (gpt-oss-120b) not using tools

**After Optimization** (Expected):
- 95-100% tool success rate with provider-specific instructions
- Better instruction clarity reducing model confusion
- Faster response times due to clearer prompts

## Testing

### Run Provider Instruction Test
```bash
export OPENROUTER_API_KEY="your-key-here"
npx tsx tests/test-provider-instructions.ts
```

### Run Top 20 Models Test (Updated IDs)
```bash
export OPENROUTER_API_KEY="your-key-here"
npx tsx test-top20-models.ts
```

## Next Steps

1. **Run Validation Tests**: Execute provider instruction test to verify improvements
2. **Re-run Top 20 Test**: Use corrected model IDs and optimized instructions
3. **Measure Improvements**: Compare success rates before/after optimization
4. **Fine-tune Instructions**: Adjust any providers with < 100% success rate
5. **Document Results**: Update TOP20_MODELS_MATRIX.md with final results

## Security Note

All API keys must be provided via environment variables. Never hardcode credentials in source files or tests.

## Files Modified

- ✅ `src/proxy/provider-instructions.ts` (created)
- ✅ `src/proxy/anthropic-to-openrouter.ts` (enhanced)
- ✅ `test-top20-models.ts` (model IDs corrected)
- ✅ `tests/test-provider-instructions.ts` (created)
- ✅ `docs/PROVIDER_INSTRUCTION_OPTIMIZATION.md` (this file)

## Conclusion

Provider-specific instruction optimization provides a systematic approach to maximizing tool calling success across diverse LLM families. By tailoring instructions to each model's strengths and quirks, we can achieve near-universal tool support while maintaining the same proxy architecture.