# Requesty.ai Integration - Overview
## Executive Summary
This document outlines the plan to integrate Requesty.ai as a new provider in the agentic-flow project, following the same architectural pattern as the existing OpenRouter integration.
### What is Requesty.ai?
Requesty.ai is a unified AI gateway that provides:
- **300+ AI Models** - Access to models from OpenAI, Anthropic, Google, and other providers
- **OpenAI-Compatible API** - Drop-in replacement for OpenAI SDK with `base_url` override
- **Cost Optimization** - 80% cost savings through intelligent routing and caching
- **Built-in Analytics** - Real-time usage tracking and performance monitoring
- **Enterprise Features** - Zero downtime guarantee, automatic failover, load balancing
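Because the API is OpenAI-compatible, pointing a client at Requesty is mostly a matter of swapping the base URL and bearer token. A minimal sketch of what such a request would look like (the base URL is from this document; the model name is a placeholder, and the exact header requirements should be confirmed against Requesty's docs):

```typescript
// Sketch: building an OpenAI-style chat-completions request against
// Requesty's router. Model name is illustrative, not verified.
const REQUESTY_BASE_URL = "https://router.requesty.ai/v1";

interface ChatMessage {
  role: "system" | "user" | "assistant";
  content: string;
}

// Returns the fetch() options for a chat-completions call; kept as a
// pure function so the payload shape is easy to inspect and test.
function buildChatRequest(apiKey: string, model: string, messages: ChatMessage[]) {
  return {
    url: `${REQUESTY_BASE_URL}/chat/completions`,
    method: "POST" as const,
    headers: {
      Authorization: `Bearer ${apiKey}`,
      "Content-Type": "application/json",
    },
    body: JSON.stringify({ model, messages }),
  };
}
```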
### Integration Goals
1. **Provider Parity** - Requesty should work alongside Anthropic, OpenRouter, Gemini, and ONNX
2. **Minimal Code Changes** - Leverage existing proxy pattern from OpenRouter
3. **Model Flexibility** - Support 300+ models through Requesty's router
4. **Tool Support** - Maintain full MCP tool calling compatibility
5. **Cost Optimization** - Enable users to access cheaper models with same quality
### Key Differentiators from OpenRouter
| Feature | OpenRouter | Requesty.ai |
|---------|-----------|-------------|
| Model Count | 100+ | 300+ |
| API Format | OpenAI `/chat/completions` | OpenAI `/chat/completions` |
| Base URL | `https://openrouter.ai/api/v1` | `https://router.requesty.ai/v1` |
| Authentication | `Authorization: Bearer sk-or-...` | `Authorization: Bearer requesty-...` |
| Cost Savings | ~90% vs Claude | ~80% vs Claude |
| Tool Calling | Native OpenAI format | Native OpenAI format |
| Unique Features | Leaderboard tracking | Auto-routing, caching, analytics |
### Strategic Benefits
1. **Provider Diversity** - Reduces dependency on single gateway
2. **Model Access** - 300+ models vs OpenRouter's 100+
3. **Cost Flexibility** - Users can choose based on price/performance
4. **Redundancy** - Fallback option if OpenRouter has issues
5. **Enterprise Features** - Built-in analytics and monitoring
## Integration Approach
### Architecture Pattern
Requesty will follow the **same proxy architecture** as OpenRouter:
```
User Request (Anthropic Format)
        ↓
Agentic Flow CLI
        ↓
Provider Detection (--provider requesty)
        ↓
Anthropic-to-Requesty Proxy Server
        ↓
Format Conversion (Anthropic → OpenAI)
        ↓
Requesty Router (https://router.requesty.ai/v1)
        ↓
Model Execution (300+ models)
        ↓
Response Conversion (OpenAI → Anthropic)
        ↓
Return to User
```
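The "Format Conversion" step is the heart of the proxy. A minimal sketch of the request-side mapping, using only the public field shapes of the Anthropic Messages API and the OpenAI chat-completions API (tool blocks, streaming, and error handling are omitted; this is not the project's actual implementation):

```typescript
// Sketch of the Anthropic -> OpenAI request conversion the proxy performs.
interface AnthropicRequest {
  model: string;
  system?: string;
  max_tokens: number;
  messages: { role: "user" | "assistant"; content: string }[];
}

interface OpenAIRequest {
  model: string;
  max_tokens: number;
  messages: { role: "system" | "user" | "assistant"; content: string }[];
}

function anthropicToOpenAI(req: AnthropicRequest, targetModel: string): OpenAIRequest {
  const messages: OpenAIRequest["messages"] = [];
  // Anthropic carries the system prompt as a top-level field;
  // OpenAI expects it as the first message in the array.
  if (req.system) {
    messages.push({ role: "system", content: req.system });
  }
  messages.push(...req.messages);
  // The Anthropic-format model name is replaced by the Requesty model slug.
  return { model: targetModel, max_tokens: req.max_tokens, messages };
}
```

The response-side conversion (OpenAI → Anthropic) is the mirror image and follows the same pattern as the existing `anthropic-to-openrouter.ts` proxy.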
### File Structure
New files to create:
```
agentic-flow/
├── src/
│   └── proxy/
│       └── anthropic-to-requesty.ts       # NEW - Requesty proxy (clone from OpenRouter)
├── docs/
│   └── plans/
│       └── requesty/
│           ├── 00-overview.md             # This file
│           ├── 01-api-research.md
│           ├── 02-architecture.md
│           ├── 03-implementation-phases.md
│           ├── 04-testing-strategy.md
│           └── 05-migration-guide.md
```
Existing files to modify:
```
agentic-flow/
└── src/
    ├── cli-proxy.ts                 # Add Requesty provider detection
    ├── agents/claudeAgent.ts        # Add Requesty to provider list
    └── utils/
        ├── modelCapabilities.ts     # Add Requesty model mappings
        └── modelOptimizer.ts        # Include Requesty models in optimizer
```
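The provider-detection change in `cli-proxy.ts` amounts to mapping the `--provider` value onto a base URL and an API-key environment variable. A hypothetical sketch (the helper name and table shape are illustrative, not the project's actual API; `REQUESTY_API_KEY` is from this document, and the OpenRouter env var name is an assumption):

```typescript
// Hypothetical provider-detection table; names are illustrative.
interface ProviderConfig {
  baseUrl: string;
  apiKeyEnv: string;
}

const PROVIDERS: Record<string, ProviderConfig> = {
  openrouter: {
    baseUrl: "https://openrouter.ai/api/v1",
    apiKeyEnv: "OPENROUTER_API_KEY", // assumed name
  },
  requesty: {
    baseUrl: "https://router.requesty.ai/v1",
    apiKeyEnv: "REQUESTY_API_KEY",
  },
};

// Resolve a --provider flag value to its config, failing loudly on typos.
function resolveProvider(name: string): ProviderConfig {
  const config = PROVIDERS[name];
  if (!config) {
    throw new Error(`Unknown provider: ${name}`);
  }
  return config;
}
```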
## Success Criteria
### Must Have (MVP)
- [ ] Users can use `--provider requesty` flag
- [ ] Requesty API key via `REQUESTY_API_KEY` environment variable
- [ ] Chat completions work with at least 10 tested models
- [ ] Native tool calling support (MCP tools work)
- [ ] Streaming responses supported
- [ ] Error handling and logging
- [ ] Model override via `--model` flag
### Should Have (V1)
- [ ] Tool emulation for models without native support
- [ ] Model capability detection for Requesty models
- [ ] Integration with model optimizer (`--optimize`)
- [ ] Analytics and usage tracking
- [ ] Proxy mode for Claude Code/Cursor
- [ ] Cost estimation and reporting
### Nice to Have (Future)
- [ ] Auto-routing based on cost/performance
- [ ] Caching integration
- [ ] Fallback to other providers on error
- [ ] Model benchmarking and comparison
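The "fallback to other providers on error" item could be as simple as trying each provider's send function in order and returning the first success. A minimal sketch of that idea (the `SendFn` shape is illustrative, not an existing project type):

```typescript
// Illustrative fallback chain: try providers in order, surface the last
// error only if every provider fails.
type SendFn = (prompt: string) => Promise<string>;

async function sendWithFallback(prompt: string, providers: SendFn[]): Promise<string> {
  let lastError: unknown = new Error("No providers configured");
  for (const send of providers) {
    try {
      return await send(prompt);
    } catch (err) {
      lastError = err; // remember the failure and try the next provider
    }
  }
  throw lastError;
}
```

In practice each `SendFn` would wrap one proxy (Requesty, OpenRouter, etc.), so the chain reuses the multi-provider support already in place.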
## Timeline Estimate
| Phase | Tasks | Estimated Time |
|-------|-------|----------------|
| Phase 1 | Research & Planning | 2 hours (DONE) |
| Phase 2 | Core Proxy Implementation | 4 hours |
| Phase 3 | CLI Integration | 2 hours |
| Phase 4 | Testing & Validation | 3 hours |
| Phase 5 | Documentation | 2 hours |
| **Total** | | **13 hours** |
## Risk Assessment
### Technical Risks
| Risk | Probability | Impact | Mitigation |
|------|------------|--------|------------|
| API format differences from OpenRouter | Medium | Medium | Thorough testing, fallback handling |
| Model compatibility issues | Low | Medium | Model capability detection system |
| Tool calling format incompatibility | Low | High | Test with multiple models early |
| Rate limiting differences | Medium | Low | Document limits, add retry logic |
### Business Risks
| Risk | Probability | Impact | Mitigation |
|------|------------|--------|------------|
| Requesty API changes | Medium | Medium | Version pinning, changelog monitoring |
| Service availability issues | Low | High | Multi-provider support already in place |
| Cost model changes | Low | Medium | Document pricing, update optimizer |
## Next Steps
1. Read `01-api-research.md` for detailed API analysis
2. Review `02-architecture.md` for technical design
3. Follow `03-implementation-phases.md` for step-by-step implementation
4. Use `04-testing-strategy.md` for comprehensive testing
5. Reference `05-migration-guide.md` for user documentation
## Open Questions
1. Does Requesty support streaming for all models or only specific ones?
2. Are there any model-specific quirks in tool calling format?
3. What are the exact rate limits per tier?
4. Does Requesty offer a free tier for testing?
5. How does auto-routing work - can we control it programmatically?
## References
- Requesty.ai Documentation: https://docs.requesty.ai
- Requesty.ai Base URL: https://router.requesty.ai/v1
- OpenRouter Integration (reference): `src/proxy/anthropic-to-openrouter.ts`
- Gemini Integration (reference): `src/proxy/anthropic-to-gemini.ts`