# Requesty.ai Integration - User Migration Guide
## Welcome to Requesty!
This guide will help you start using Requesty.ai with agentic-flow to access 300+ AI models with significant cost savings.
## Quick Start (3 Steps)
### 1. Get Your Requesty API Key
1. Visit [Requesty.ai](https://app.requesty.ai)
2. Sign up or log in to your account
3. Navigate to **API Keys** section
4. Click **Generate New Key**
5. Copy your API key (starts with `requesty-`)
### 2. Set Your API Key
#### Option A: Environment Variable
```bash
export REQUESTY_API_KEY="requesty-xxxxxxxxxxxxx"
```
#### Option B: .env File
```bash
# Add to .env file
REQUESTY_API_KEY=requesty-xxxxxxxxxxxxx
```
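Before running any commands, it can help to verify that the key is present and well-formed. A minimal Python sketch, assuming only the `requesty-` prefix convention described above (the helper name is illustrative, not part of agentic-flow):

```python
import os

def check_requesty_key(env=os.environ):
    """Return the Requesty API key, or raise if it is missing or malformed."""
    key = env.get("REQUESTY_API_KEY", "").strip()
    if not key:
        raise RuntimeError("REQUESTY_API_KEY is not set")
    if not key.startswith("requesty-"):
        raise RuntimeError("REQUESTY_API_KEY should start with 'requesty-'")
    return key

# Example with a placeholder key (never hard-code real keys):
print(check_requesty_key({"REQUESTY_API_KEY": "requesty-xxxxxxxxxxxxx"}))
```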
### 3. Run Your First Command
```bash
npx agentic-flow --agent coder \
--task "Create a hello world function" \
--provider requesty
```
That's it! You're now using Requesty with 300+ models.
---
## Why Use Requesty?
| Feature | Benefit |
|---------|---------|
| **300+ Models** | Access OpenAI, Anthropic, Google, Meta, DeepSeek, and more |
| **80% Cost Savings** | Significantly cheaper than direct API calls |
| **Auto-Routing** | Intelligent model selection based on your needs |
| **Built-in Caching** | Reduce redundant API calls |
| **Analytics Dashboard** | Track usage, costs, and performance |
| **Zero Downtime** | Automatic failover and load balancing |
---
## Usage Examples
### Basic Chat Completion
```bash
npx agentic-flow --agent coder \
--task "Explain async/await in JavaScript" \
--provider requesty
```
Uses default model: `openai/gpt-4o-mini`
### Specify a Model
```bash
npx agentic-flow --agent researcher \
--task "Research quantum computing trends" \
--provider requesty \
--model "anthropic/claude-3.5-sonnet"
```
### Enable Streaming
```bash
npx agentic-flow --agent coder \
--task "Write a detailed tutorial on React hooks" \
--provider requesty \
--model "openai/gpt-4o" \
--stream
```
### Use FREE Models
```bash
# Google Gemini 2.5 Flash - Completely FREE!
npx agentic-flow --agent coder \
--task "Create a REST API with Express" \
--provider requesty \
--model "google/gemini-2.5-flash"
```
### Cost-Optimized Models
```bash
# DeepSeek Chat V3 - Only $0.03 per 1M tokens
npx agentic-flow --agent coder \
--task "Build a calculator function" \
--provider requesty \
--model "deepseek/deepseek-chat-v3"
```
### Premium Quality Models
```bash
# GPT-4o - Top-tier quality
npx agentic-flow --agent architect \
--task "Design a microservices architecture" \
--provider requesty \
--model "openai/gpt-4o"
```
---
## Recommended Models
### For General Coding
| Model | Cost/1M Tokens | Speed | Quality | Best For |
|-------|----------------|-------|---------|----------|
| `openai/gpt-4o-mini` | $0.03 | Fast | Good | Quick tasks, debugging |
| `deepseek/deepseek-chat-v3` | $0.03 | Fast | Good | Cost-effective coding |
| `qwen/qwen-2.5-coder-32b-instruct` | $0.05 | Fast | Excellent | Specialized coding |
| `openai/gpt-4o` | $0.50 | Medium | Excellent | Complex architecture |
### For Research & Analysis
| Model | Cost/1M Tokens | Speed | Quality | Best For |
|-------|----------------|-------|---------|----------|
| `google/gemini-2.5-flash` | FREE | Very Fast | Good | General research |
| `anthropic/claude-3.5-sonnet` | $0.60 | Medium | Excellent | Deep analysis |
| `openai/gpt-4-turbo` | $1.00 | Medium | Excellent | Complex reasoning |
| `google/gemini-2.5-pro` | $0.10 | Fast | Very Good | Large context tasks |
### For Cost Optimization
| Model | Cost/1M Tokens | Savings vs Claude 3.5 | Quality |
|-------|----------------|----------------------|---------|
| `google/gemini-2.5-flash` | FREE | 100% | Good |
| `deepseek/deepseek-chat-v3` | $0.03 | 95% | Good |
| `meta-llama/llama-3.3-8b-instruct` | $0.02 | 97% | Good |
| `openai/gpt-4o-mini` | $0.03 | 95% | Very Good |
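The savings percentages above follow directly from the per-token prices: savings = 1 − (model price ÷ the $0.60/1M Claude 3.5 Sonnet baseline). A quick sanity check in Python, using the prices from the table:

```python
BASELINE = 0.60  # anthropic/claude-3.5-sonnet, $ per 1M tokens (from the table)

def savings_pct(price_per_1m: float, baseline: float = BASELINE) -> int:
    """Percent saved versus the baseline price, rounded to a whole percent."""
    return round((1 - price_per_1m / baseline) * 100)

print(savings_pct(0.00))  # google/gemini-2.5-flash -> 100
print(savings_pct(0.03))  # deepseek-chat-v3, gpt-4o-mini -> 95
print(savings_pct(0.02))  # llama-3.3-8b -> 97
```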
---
## Configuration Options
### Environment Variables
```bash
# Required
export REQUESTY_API_KEY="requesty-xxxxxxxxxxxxx"
# Optional
export REQUESTY_BASE_URL="https://router.requesty.ai/v1" # Custom base URL
export USE_REQUESTY="true" # Force Requesty provider
export COMPLETION_MODEL="openai/gpt-4o-mini" # Default model
export PROXY_PORT="3000" # Proxy server port
```
### CLI Flags
```bash
--provider requesty # Use Requesty provider
--model "model-id" # Specify model
--stream # Enable streaming
--temperature 0.7 # Creativity (0.0-1.0)
--max-tokens 4096 # Maximum output length
--verbose # Detailed logging
```
---
## Migration from Other Providers
### From Anthropic Direct
**Before:**
```bash
export ANTHROPIC_API_KEY="sk-ant-xxxxx"
npx agentic-flow --agent coder --task "Create function"
```
**After:**
```bash
export REQUESTY_API_KEY="requesty-xxxxx"
npx agentic-flow --agent coder \
--task "Create function" \
--provider requesty \
--model "anthropic/claude-3.5-sonnet"
```
**Benefits:**
- 80% cost savings
- Same Claude quality
- Access to 300+ other models
### From OpenRouter
**Before:**
```bash
export OPENROUTER_API_KEY="sk-or-xxxxx"
npx agentic-flow --agent coder \
--task "Create function" \
--provider openrouter \
--model "meta-llama/llama-3.1-8b-instruct"
```
**After:**
```bash
export REQUESTY_API_KEY="requesty-xxxxx"
npx agentic-flow --agent coder \
--task "Create function" \
--provider requesty \
--model "meta-llama/llama-3.3-70b-instruct"
```
**Benefits:**
- 200 more models (300 vs 100)
- Built-in analytics dashboard
- Auto-routing and caching
### From Google Gemini
**Before:**
```bash
export GOOGLE_GEMINI_API_KEY="xxxxx"
npx agentic-flow --agent coder \
--task "Create function" \
--provider gemini
```
**After:**
```bash
export REQUESTY_API_KEY="requesty-xxxxx"
npx agentic-flow --agent coder \
--task "Create function" \
--provider requesty \
--model "google/gemini-2.5-flash"
```
**Benefits:**
- Access to OpenAI, Anthropic, DeepSeek models
- Unified billing and analytics
- Model fallback support
---
## Advanced Usage
### Use with Claude Code
#### Terminal 1 - Start Proxy
```bash
npx agentic-flow proxy --provider requesty --port 3000
```
#### Terminal 2 - Configure Claude Code
```bash
export ANTHROPIC_BASE_URL="http://localhost:3000"
export ANTHROPIC_API_KEY="sk-ant-proxy-dummy-key"
export REQUESTY_API_KEY="requesty-xxxxx"
claude
```
Now Claude Code will use Requesty models!
### Auto-Start Proxy
```bash
# One command - proxy + Claude Code
npx agentic-flow claude-code --provider requesty "Create a React app"
```
### Model Optimization
```bash
# Let agentic-flow choose the best model automatically
npx agentic-flow --agent coder \
--task "Build API endpoint" \
--optimize \
--priority cost \
--provider requesty
```
**Priorities:**
- `quality` - Best results (Claude, GPT-4o)
- `balanced` - Good quality + cost (DeepSeek, Gemini)
- `cost` - Cheapest options (FREE tier models)
- `speed` - Fastest responses (Gemini Flash)
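This guide does not document how the optimizer actually chooses a model, but the priority tiers above suggest a simple mapping. A hypothetical sketch (candidate models are taken from this guide's tables; the real selection logic in agentic-flow may differ):

```python
# Hypothetical priority -> candidate-model mapping; the actual optimizer
# may use different logic and model lists.
PRIORITY_MODELS = {
    "quality": ["anthropic/claude-3.5-sonnet", "openai/gpt-4o"],
    "balanced": ["deepseek/deepseek-chat-v3", "google/gemini-2.5-flash"],
    "cost": ["google/gemini-2.5-flash", "deepseek/deepseek-chat-v3"],
    "speed": ["google/gemini-2.5-flash", "openai/gpt-4o-mini"],
}

def pick_model(priority: str) -> str:
    """Return the first candidate for a priority, defaulting to 'balanced'."""
    return PRIORITY_MODELS.get(priority, PRIORITY_MODELS["balanced"])[0]

print(pick_model("cost"))  # -> google/gemini-2.5-flash
```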
---
## Troubleshooting
### Issue: "REQUESTY_API_KEY required"
**Solution:**
```bash
# Check if API key is set
echo $REQUESTY_API_KEY
# If empty, set it
export REQUESTY_API_KEY="requesty-xxxxx"
```
### Issue: "Invalid API key"
**Solution:**
1. Verify your API key starts with `requesty-`
2. Check for typos or extra spaces
3. Generate a new key at https://app.requesty.ai
4. Make sure key is active (not revoked)
### Issue: "Rate limit exceeded"
**Solution:**
```bash
# Wait and retry (auto-retry is built-in)
# Or upgrade your Requesty tier
# Or use a different model temporarily
```
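The note above says auto-retry is built in; the usual pattern for rate limits is exponential backoff. An illustrative sketch of the delay schedule (the base delay, cap, and retry count here are assumptions, not agentic-flow's actual settings):

```python
def backoff_delays(retries: int = 5, base: float = 1.0, cap: float = 30.0):
    """Exponential backoff schedule: 1s, 2s, 4s, ... capped at `cap` seconds."""
    return [min(base * (2 ** attempt), cap) for attempt in range(retries)]

print(backoff_delays())  # -> [1.0, 2.0, 4.0, 8.0, 16.0]
```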
### Issue: "Model not found"
**Solution:**
```bash
# Check model ID format: <provider>/<model-name>
# Example: openai/gpt-4o-mini (correct)
# gpt-4o-mini (incorrect - missing provider)
# Verify model exists at https://app.requesty.ai/model-list
```
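A quick local check for the `<provider>/<model-name>` format can catch this class of error before making a request. A minimal sketch (it validates only the format described above, not whether the model actually exists on Requesty):

```python
def is_valid_model_id(model_id: str) -> bool:
    """True if the ID looks like '<provider>/<model-name>', both parts non-empty."""
    provider, sep, name = model_id.partition("/")
    return bool(provider) and bool(sep) and bool(name)

print(is_valid_model_id("openai/gpt-4o-mini"))  # True
print(is_valid_model_id("gpt-4o-mini"))         # False (missing provider)
```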
### Issue: Proxy won't start
**Solution:**
```bash
# Check if port 3000 is already in use
lsof -i :3000
# Use a different port
PROXY_PORT=8080 npx agentic-flow proxy --provider requesty --port 8080
```
### Issue: Response is slow
**Solution:**
```bash
# Use faster models
--model "google/gemini-2.5-flash" # Fastest
--model "openai/gpt-4o-mini" # Fast
--model "deepseek/deepseek-chat-v3" # Fast + cheap
# Enable streaming for perceived speed
--stream
```
### Issue: Tool calling not working
**Solution:**
```bash
# Some older models don't support tools
# Use known tool-compatible models:
--model "openai/gpt-4o-mini" # ✓ Tools
--model "anthropic/claude-3.5-sonnet" # ✓ Tools
--model "google/gemini-2.5-flash" # ✓ Tools
--model "deepseek/deepseek-chat-v3" # ✓ Tools
# Avoid older models like:
--model "mistralai/mistral-7b-instruct" # ✗ No tools (emulation used)
```
---
## Cost Comparison
### Agentic Flow Task: "Create a REST API with Express.js"
| Provider | Model | Tokens Used | Cost | Savings |
|----------|-------|-------------|------|---------|
| Anthropic Direct | claude-3.5-sonnet | 5,000 | $0.0150 | Baseline |
| Requesty | anthropic/claude-3.5-sonnet | 5,000 | $0.0030 | 80% |
| Requesty | openai/gpt-4o-mini | 5,000 | $0.00015 | 99% |
| Requesty | google/gemini-2.5-flash | 5,000 | $0.0000 | 100% |
| Requesty | deepseek/deepseek-chat-v3 | 5,000 | $0.00015 | 99% |
**Real Savings Example:**
- 100 tasks/day with Claude 3.5 Sonnet
- Direct: $1.50/day = $45/month
- Requesty: $0.30/day = $9/month
- **Savings: $36/month (80%)**
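The monthly figures above can be reproduced from the per-task costs in the table. A sanity check in Python (numbers taken from this section; a 30-day month is assumed):

```python
tasks_per_day = 100
days = 30  # assumed month length

direct_per_task = 0.0150    # Claude 3.5 Sonnet direct, from the table
requesty_per_task = 0.0030  # via Requesty, from the table

direct_monthly = direct_per_task * tasks_per_day * days      # 45.0
requesty_monthly = requesty_per_task * tasks_per_day * days  # 9.0
print(f"Savings: ${direct_monthly - requesty_monthly:.0f}/month "
      f"({1 - requesty_monthly / direct_monthly:.0%})")
```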
---
## Best Practices
### 1. Start with Free Tier
```bash
# Test Requesty with FREE models first
--model "google/gemini-2.5-flash"
```
### 2. Use Right Model for Task
```bash
# Simple tasks → cheap models
--model "deepseek/deepseek-chat-v3"
# Complex tasks → premium models
--model "openai/gpt-4o"
# Research → large context models
--model "google/gemini-2.5-pro"
```
### 3. Enable Streaming for UX
```bash
# Always stream for user-facing tasks
--stream
```
### 4. Monitor Costs
Visit [Requesty Dashboard](https://app.requesty.ai/dashboard) to:
- Track token usage
- Monitor spending
- Set budget alerts
- Compare model costs
### 5. Use Model Optimizer
```bash
# Let agentic-flow choose the best model
--optimize --priority balanced
```
---
## FAQ
### Q: Do I need both ANTHROPIC_API_KEY and REQUESTY_API_KEY?
**A:** No, only `REQUESTY_API_KEY` is needed when using `--provider requesty`.
### Q: Can I use Requesty and Anthropic together?
**A:** Yes! Use `--provider requesty` for some tasks and `--provider anthropic` for others.
### Q: Does Requesty work with all agentic-flow features?
**A:** Yes! Tool calling, streaming, MCP servers, and all agents work with Requesty.
### Q: Is my data secure with Requesty?
**A:** Yes. Requesty follows industry-standard security practices. Check their [privacy policy](https://requesty.ai/privacy).
### Q: Can I use Requesty with Claude Code/Cursor?
**A:** Yes! Use proxy mode:
```bash
npx agentic-flow proxy --provider requesty
```
### Q: How do I get support?
**A:**
- Requesty support: support@requesty.ai
- Agentic Flow issues: https://github.com/ruvnet/agentic-flow/issues
### Q: Are there usage limits?
**A:** Yes, limits depend on your Requesty tier. Free tier has lower limits. Upgrade for higher limits.
### Q: Can I use custom models?
**A:** If your custom model is available on Requesty's platform, yes! Check their model catalog.
### Q: Does Requesty support vision/image models?
**A:** Yes! Some models support vision:
```bash
--model "openai/gpt-4o" # Supports vision
--model "google/gemini-2.5-pro" # Supports vision
```
### Q: How does caching work?
**A:** Requesty automatically caches similar requests to reduce costs. No configuration needed.
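Requesty's internal caching is not documented here, but the idea can be illustrated with a simple cache keyed on the request. A hypothetical sketch (the keying and lack of eviction are simplifications, not Requesty's implementation):

```python
_cache: dict = {}

def cached_completion(model: str, prompt: str, call_api) -> str:
    """Serve repeated (model, prompt) requests from a local cache."""
    key = (model, prompt)
    if key not in _cache:
        _cache[key] = call_api(model, prompt)  # only pay for the first call
    return _cache[key]

calls = []
fake_api = lambda m, p: calls.append(p) or f"response to {p!r}"
cached_completion("openai/gpt-4o-mini", "hi", fake_api)
cached_completion("openai/gpt-4o-mini", "hi", fake_api)
print(len(calls))  # -> 1 (second request served from cache)
```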
---
## Model Catalog
### Full Model List
Visit [Requesty Model Library](https://app.requesty.ai/model-list) for the complete catalog of 300+ models.
### Popular Models Quick Reference
```bash
# OpenAI
openai/gpt-4o # Premium quality, $0.50/1M tokens
openai/gpt-4o-mini # Fast, cost-effective, $0.03/1M tokens
openai/gpt-4-turbo # High quality, $1.00/1M tokens
openai/gpt-3.5-turbo # Legacy, cheap, $0.05/1M tokens
# Anthropic
anthropic/claude-3.5-sonnet # Best reasoning, $0.60/1M tokens
anthropic/claude-3-opus # Premium, $1.50/1M tokens
anthropic/claude-3-sonnet # Balanced, $0.30/1M tokens
anthropic/claude-3-haiku # Fast, $0.08/1M tokens
# Google
google/gemini-2.5-pro # Large context, $0.10/1M tokens
google/gemini-2.5-flash # FREE tier, fast
# DeepSeek
deepseek/deepseek-chat-v3 # Cost-optimized, $0.03/1M tokens
deepseek/deepseek-coder # Coding-focused, $0.03/1M tokens
# Meta/Llama
meta-llama/llama-3.3-70b-instruct # Open source, $0.10/1M tokens
meta-llama/llama-3.3-8b-instruct # Fast, cheap, $0.02/1M tokens
# Qwen
qwen/qwen-2.5-coder-32b-instruct # Coding expert, $0.05/1M tokens
# Mistral
mistralai/mistral-large # European alternative, $0.20/1M tokens
```
---
## Getting Help
### Documentation
- Requesty Docs: https://docs.requesty.ai
- Agentic Flow Docs: https://github.com/ruvnet/agentic-flow
### Support Channels
- Email: support@requesty.ai
- Discord: [Requesty Discord](https://discord.gg/requesty)
- GitHub Issues: https://github.com/ruvnet/agentic-flow/issues
### Community
- Share tips and tricks
- Report bugs
- Request new features
---
## Next Steps
1. **Get Your API Key** - https://app.requesty.ai
2. **Try Free Models** - `google/gemini-2.5-flash`
3. **Test Premium Models** - `anthropic/claude-3.5-sonnet`
4. **Monitor Usage** - Check Requesty dashboard
5. **Optimize Costs** - Use `--optimize` flag
## Happy Coding with Requesty! 🚀
Save money, access 300+ models, and build amazing AI applications with agentic-flow + Requesty.ai.