# Requesty.ai Integration - Overview

## Executive Summary

This document outlines the plan to integrate Requesty.ai as a new provider in the agentic-flow project, following the same architectural pattern as the existing OpenRouter integration.

### What is Requesty.ai?

Requesty.ai is a unified AI gateway that provides:

- **300+ AI Models** - Access to models from OpenAI, Anthropic, Google, and other providers
- **OpenAI-Compatible API** - Drop-in replacement for the OpenAI SDK with a `base_url` override
- **Cost Optimization** - Up to ~80% cost savings through intelligent routing and caching
- **Built-in Analytics** - Real-time usage tracking and performance monitoring
- **Enterprise Features** - Zero-downtime guarantee, automatic failover, load balancing
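
Because the API is OpenAI-compatible, pointing a client at Requesty is essentially a base-URL and API-key swap. A minimal sketch of building such a request (the model ID format `openai/gpt-4o-mini` and exact header shape are assumptions to verify against Requesty's docs):

```typescript
// Build an OpenAI-style /chat/completions request against the Requesty router.
// Only the base URL and API key differ from a direct OpenAI call.
const REQUESTY_BASE_URL = "https://router.requesty.ai/v1";

interface ChatRequest {
  url: string;
  headers: Record<string, string>;
  body: string;
}

function buildChatRequest(apiKey: string, model: string, prompt: string): ChatRequest {
  return {
    url: `${REQUESTY_BASE_URL}/chat/completions`,
    headers: {
      "Authorization": `Bearer ${apiKey}`, // Requesty keys, e.g. "requesty-..."
      "Content-Type": "application/json",
    },
    body: JSON.stringify({
      model, // e.g. "openai/gpt-4o-mini" (assumed ID format)
      messages: [{ role: "user", content: prompt }],
    }),
  };
}

// Usage (network call omitted here):
// const req = buildChatRequest(process.env.REQUESTY_API_KEY!, "openai/gpt-4o-mini", "Hello");
// const res = await fetch(req.url, { method: "POST", headers: req.headers, body: req.body });
```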

### Integration Goals

1. **Provider Parity** - Requesty should work alongside Anthropic, OpenRouter, Gemini, and ONNX
2. **Minimal Code Changes** - Leverage the existing proxy pattern from OpenRouter
3. **Model Flexibility** - Support 300+ models through Requesty's router
4. **Tool Support** - Maintain full MCP tool-calling compatibility
5. **Cost Optimization** - Let users access cheaper models with comparable quality

### Key Differentiators from OpenRouter

| Feature | OpenRouter | Requesty.ai |
|---------|-----------|-------------|
| Model Count | 100+ | 300+ |
| API Format | OpenAI `/chat/completions` | OpenAI `/chat/completions` |
| Base URL | `https://openrouter.ai/api/v1` | `https://router.requesty.ai/v1` |
| Authentication | `Authorization: Bearer sk-or-...` | `Authorization: Bearer requesty-...` |
| Cost Savings | ~90% vs Claude | ~80% vs Claude |
| Tool Calling | Native OpenAI format | Native OpenAI format |
| Unique Features | Leaderboard tracking | Auto-routing, caching, analytics |

### Strategic Benefits

1. **Provider Diversity** - Reduces dependency on a single gateway
2. **Model Access** - 300+ models vs OpenRouter's 100+
3. **Cost Flexibility** - Users can choose based on price/performance
4. **Redundancy** - Fallback option if OpenRouter has issues
5. **Enterprise Features** - Built-in analytics and monitoring
## Integration Approach

### Architecture Pattern

Requesty will follow the **same proxy architecture** as OpenRouter:

```
User Request (Anthropic Format)
        ↓
Agentic Flow CLI
        ↓
Provider Detection (--provider requesty)
        ↓
Anthropic-to-Requesty Proxy Server
        ↓
Format Conversion (Anthropic → OpenAI)
        ↓
Requesty Router (https://router.requesty.ai/v1)
        ↓
Model Execution (300+ models)
        ↓
Response Conversion (OpenAI → Anthropic)
        ↓
Return to User
```
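
The format-conversion step in the middle of this pipeline is the heart of the proxy. A minimal sketch of the request-side mapping, using field names from the public Anthropic and OpenAI schemas (the real implementation, like `anthropic-to-openrouter.ts`, must also handle tool blocks, content arrays, and streaming):

```typescript
// Map the core fields of an Anthropic Messages API request onto the
// OpenAI chat/completions shape. The top-level system prompt becomes
// the first entry of the OpenAI messages array.
interface AnthropicRequest {
  model: string;
  system?: string;
  max_tokens: number;
  messages: { role: "user" | "assistant"; content: string }[];
}

interface OpenAIRequest {
  model: string;
  max_tokens: number;
  messages: { role: "system" | "user" | "assistant"; content: string }[];
}

function anthropicToOpenAI(req: AnthropicRequest, targetModel: string): OpenAIRequest {
  const messages: OpenAIRequest["messages"] = [];
  if (req.system) {
    messages.push({ role: "system", content: req.system });
  }
  messages.push(...req.messages);
  // The Anthropic model ID is replaced by the Requesty-routed target model.
  return { model: targetModel, max_tokens: req.max_tokens, messages };
}
```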

### File Structure

New files to create:

```
agentic-flow/
├── src/
│   └── proxy/
│       └── anthropic-to-requesty.ts    # NEW - Requesty proxy (clone from OpenRouter)
├── docs/
│   └── plans/
│       └── requesty/
│           ├── 00-overview.md          # This file
│           ├── 01-api-research.md
│           ├── 02-architecture.md
│           ├── 03-implementation-phases.md
│           ├── 04-testing-strategy.md
│           └── 05-migration-guide.md
```

Existing files to modify:

```
agentic-flow/
├── src/
│   ├── cli-proxy.ts                    # Add Requesty provider detection
│   ├── agents/claudeAgent.ts           # Add Requesty to provider list
│   └── utils/
│       ├── modelCapabilities.ts        # Add Requesty model mappings
│       └── modelOptimizer.ts           # Include Requesty models in optimizer
```
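
The `modelCapabilities.ts` change could register Requesty-routed models using the same capability shape the other providers use. The interface below is an assumption about that file, not its actual contents, and the model IDs shown assume Requesty uses `provider/model` identifiers:

```typescript
// Hypothetical capability entries for a couple of Requesty-routed models.
interface ModelCapabilities {
  provider: string;
  nativeToolCalling: boolean;
  streaming: boolean;
  contextWindow: number;
}

const REQUESTY_MODELS: Record<string, ModelCapabilities> = {
  "openai/gpt-4o-mini": {
    provider: "requesty", nativeToolCalling: true, streaming: true, contextWindow: 128_000,
  },
  "anthropic/claude-3-5-sonnet": {
    provider: "requesty", nativeToolCalling: true, streaming: true, contextWindow: 200_000,
  },
};

// Unknown models default to "no native tools", so the proxy can fall
// back to tool emulation (a V1 goal) rather than failing outright.
function supportsNativeTools(model: string): boolean {
  return REQUESTY_MODELS[model]?.nativeToolCalling ?? false;
}
```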

## Success Criteria

### Must Have (MVP)

- [ ] Users can use the `--provider requesty` flag
- [ ] Requesty API key via the `REQUESTY_API_KEY` environment variable
- [ ] Chat completions work with at least 10 tested models
- [ ] Native tool-calling support (MCP tools work)
- [ ] Streaming responses supported
- [ ] Error handling and logging
- [ ] Model override via the `--model` flag
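
The first two MVP items amount to a small detection step in `cli-proxy.ts`. A sketch of the idea (function and environment-variable names other than `REQUESTY_API_KEY` are illustrative, not the actual code):

```typescript
// Resolve the provider from CLI args and verify its API key is present.
// Mirrors how the existing provider flags are expected to work.
const PROVIDER_ENV_KEYS: Record<string, string> = {
  anthropic: "ANTHROPIC_API_KEY",
  openrouter: "OPENROUTER_API_KEY",
  requesty: "REQUESTY_API_KEY",
};

function resolveProvider(
  argv: string[],
  env: Record<string, string | undefined>,
): { provider: string; apiKey: string } {
  const i = argv.indexOf("--provider");
  const provider = i >= 0 ? argv[i + 1] : "anthropic"; // default provider assumed
  const envKey = PROVIDER_ENV_KEYS[provider];
  if (!envKey) throw new Error(`Unknown provider: ${provider}`);
  const apiKey = env[envKey];
  if (!apiKey) throw new Error(`Missing ${envKey} for provider "${provider}"`);
  return { provider, apiKey };
}
```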
### Should Have (V1)

- [ ] Tool emulation for models without native support
- [ ] Model capability detection for Requesty models
- [ ] Integration with the model optimizer (`--optimize`)
- [ ] Analytics and usage tracking
- [ ] Proxy mode for Claude Code/Cursor
- [ ] Cost estimation and reporting
### Nice to Have (Future)

- [ ] Auto-routing based on cost/performance
- [ ] Caching integration
- [ ] Fallback to other providers on error
- [ ] Model benchmarking and comparison
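
The fallback item can build on the multi-provider support already in place. A rough sketch of the shape, assuming each provider call can be wrapped in a common signature (the signature itself is hypothetical):

```typescript
// Try providers in order, falling through to the next on failure.
type ProviderCall = (prompt: string) => Promise<string>;

async function withFallback(providers: ProviderCall[], prompt: string): Promise<string> {
  let lastError: unknown;
  for (const call of providers) {
    try {
      return await call(prompt);
    } catch (err) {
      lastError = err; // e.g. a 5xx from the gateway; try the next provider
    }
  }
  throw new Error(`All providers failed: ${lastError}`);
}

// Usage idea: withFallback([callRequesty, callOpenRouter, callAnthropic], prompt)
```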
## Timeline Estimate

| Phase | Tasks | Estimated Time |
|-------|-------|----------------|
| Phase 1 | Research & Planning | 2 hours (DONE) |
| Phase 2 | Core Proxy Implementation | 4 hours |
| Phase 3 | CLI Integration | 2 hours |
| Phase 4 | Testing & Validation | 3 hours |
| Phase 5 | Documentation | 2 hours |
| **Total** | | **13 hours** |
## Risk Assessment

### Technical Risks

| Risk | Probability | Impact | Mitigation |
|------|------------|--------|------------|
| API format differences from OpenRouter | Medium | Medium | Thorough testing, fallback handling |
| Model compatibility issues | Low | Medium | Model capability detection system |
| Tool-calling format incompatibility | Low | High | Test with multiple models early |
| Rate limiting differences | Medium | Low | Document limits, add retry logic |
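
The retry-logic mitigation for rate limiting could take the usual exponential-backoff form (the delay values below are placeholders to tune against Requesty's actual limits once documented):

```typescript
// Retry a request with exponential backoff on retryable errors
// (e.g. HTTP 429 rate-limit responses).
async function withRetry<T>(
  fn: () => Promise<T>,
  isRetryable: (err: unknown) => boolean,
  maxAttempts = 3,
  baseDelayMs = 500, // placeholder; tune against real rate limits
): Promise<T> {
  for (let attempt = 0; ; attempt++) {
    try {
      return await fn();
    } catch (err) {
      if (attempt + 1 >= maxAttempts || !isRetryable(err)) throw err;
      const delay = baseDelayMs * 2 ** attempt; // 500ms, 1s, 2s, ...
      await new Promise((resolve) => setTimeout(resolve, delay));
    }
  }
}
```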
### Business Risks

| Risk | Probability | Impact | Mitigation |
|------|------------|--------|------------|
| Requesty API changes | Medium | Medium | Version pinning, changelog monitoring |
| Service availability issues | Low | High | Multi-provider support already in place |
| Cost model changes | Low | Medium | Document pricing, update optimizer |
## Next Steps

1. Read `01-api-research.md` for detailed API analysis
2. Review `02-architecture.md` for the technical design
3. Follow `03-implementation-phases.md` for step-by-step implementation
4. Use `04-testing-strategy.md` for comprehensive testing
5. Reference `05-migration-guide.md` for user documentation
## Open Questions

1. Does Requesty support streaming for all models or only specific ones?
2. Are there any model-specific quirks in the tool-calling format?
3. What are the exact rate limits per tier?
4. Does Requesty offer a free tier for testing?
5. How does auto-routing work - can we control it programmatically?
## References

- Requesty.ai documentation: https://docs.requesty.ai
- Requesty.ai base URL: https://router.requesty.ai/v1
- OpenRouter integration (reference): `src/proxy/anthropic-to-openrouter.ts`
- Gemini integration (reference): `src/proxy/anthropic-to-gemini.ts`