[FEATURE] Agent Booster: Ultra-Fast Code Application Engine (200x faster than Morph LLM)
📋 Summary
Build Agent Booster - a Rust-based vector semantic code merging engine that replaces expensive LLM-based code application APIs (like Morph LLM) with deterministic, vector-based AST merging.
Key Performance Targets:
- ⚡ 200x faster than Morph LLM (30ms vs 6000ms)
- 💰 100% cost savings ($0 vs $0.01+ per edit)
- 📊 97-99% accuracy (comparable to Morph's 98%)
- 🔒 100% local (privacy-first, offline-capable)
- 🌍 Universal (Native Node.js, WASM, MCP server)
🎯 Motivation
Current Problem
Agentic-flow (and similar AI code assistants) currently rely on:
- LLM-based file rewrites - Slow (10-60 seconds), expensive ($0.01-0.10 per edit), non-deterministic
- Morph LLM API - Fast-ish (6 seconds), expensive ($0.01 per edit), requires API calls
Real-World Impact
For a typical development session with 100 code edits:
- Current cost: $1-10 in API fees
- Current time: 10-100 minutes of waiting
- Privacy: Code sent to external APIs
- Reliability: Depends on API availability
Proposed Solution
Agent Booster uses:
- Tree-sitter AST parsing - Understand code structure
- Vector embeddings - Semantic code understanding (pre-trained models)
- HNSW similarity search - Find matching code locations
- Smart merge strategies - Apply edits deterministically
- Rust performance - Native speed, WASM portability
Expected Results (100 edits):
- Cost: $0 (100% savings)
- Time: 3-5 seconds (95% faster)
- Privacy: 100% local
- Reliability: No external dependencies
🏆 Success Criteria
Must Have (MVP)
- 100x+ speedup vs Morph LLM
- 95%+ accuracy on simple/medium edits
- JavaScript & TypeScript support
- Native Node.js addon (napi-rs)
- NPM package published
- Agentic-flow integration via `.env`
- Comprehensive benchmarks vs Morph + Claude Sonnet 4
Should Have (v1.0)
- WASM support (browser + edge)
- MCP server for Claude Desktop/Cursor/VS Code
- Standalone CLI (`npx agent-booster`)
- 5+ language support (Python, Rust, Go, Java, C++)
- Fallback to Morph LLM for low-confidence cases
- Documentation site
Nice to Have (Future)
- Fine-tuned custom models
- Multi-file refactoring
- VS Code extension
- Real-time collaboration
- Browser extension
🏗️ Technical Approach
Architecture
User (Agentic-flow / CLI / MCP Client)
↓
JavaScript/TypeScript Layer
(Auto-detects: Native > WASM)
↓
Rust Core Library
↓
┌──────────┬──────────┬──────────┬─────────┐
│ Parser │ Embedder │ Vector │ Merger │
│ (Tree- │ (ONNX │ (HNSW) │ (Smart │
│ sitter) │ Runtime) │ │Strategy)│
└──────────┴──────────┴──────────┴─────────┘
Technology Stack
Core:
- Rust - Performance + memory safety
- Tree-sitter - AST parsing (40+ languages)
- ONNX Runtime - Local embedding inference
- HNSW - Vector similarity search
Bindings:
- napi-rs - Native Node.js addon (fastest)
- wasm-bindgen - WebAssembly (portable)
- TypeScript - Type-safe API
Models:
- jina-embeddings-v2-base-code - 768-dim, best accuracy
- all-MiniLM-L6-v2 - 384-dim, faster alternative
Data Flow
1. Parse original code → AST chunks (Tree-sitter)
2. Generate embeddings → 768-dim vectors (ONNX)
3. Build index → HNSW graph (in-memory)
4. Embed edit snippet → 768-dim vector
5. Vector search → Top-5 similar locations (cosine)
6. Select strategy → Based on similarity score
7. Apply merge → String manipulation + validation
8. Validate syntax → Tree-sitter re-parse
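Steps 5-7 above can be sketched in TypeScript. This is a minimal illustration, not the real engine: the toy 3-dim vectors stand in for the 768-dim embeddings, and the strategy names and similarity cutoffs are assumptions for this proposal (only the 0.65 confidence threshold comes from the `.env` defaults shown later).

```typescript
// Sketch of steps 5-7: cosine similarity over candidate chunk vectors,
// then a threshold-based merge-strategy pick. Thresholds and strategy
// names are illustrative assumptions, not final tuned values.

type Strategy = 'exact_replace' | 'fuzzy_replace' | 'insert_after' | 'reject';

function cosine(a: number[], b: number[]): number {
  let dot = 0, na = 0, nb = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    na += a[i] * a[i];
    nb += b[i] * b[i];
  }
  return dot / (Math.sqrt(na) * Math.sqrt(nb));
}

// Step 6: map the best similarity score to a merge strategy.
function selectStrategy(similarity: number): Strategy {
  if (similarity >= 0.95) return 'exact_replace';
  if (similarity >= 0.80) return 'fuzzy_replace';
  if (similarity >= 0.65) return 'insert_after'; // matches the .env threshold
  return 'reject'; // below threshold: fall back to Morph/LLM
}

// Toy 3-dim "embeddings" standing in for the real 768-dim vectors.
const chunkVecs = [[1, 0, 0], [0.9, 0.1, 0], [0, 1, 0]];
const editVec = [1, 0.05, 0];

const best = Math.max(...chunkVecs.map(v => cosine(v, editVec)));
console.log(selectStrategy(best)); // → "exact_replace"
```

In the real pipeline the candidate set comes from the HNSW top-5 search rather than a full scan, but the scoring and strategy dispatch follow this shape.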
📊 Detailed Benchmark Plan
Baseline: Morph LLM
Models to Test:
- Claude Sonnet 4 (production default)
- Claude Opus 4 (max accuracy)
- Claude Haiku 4 (max speed)
Dataset:
- 40 simple edits (function additions, renames, etc.)
- 40 medium edits (async conversion, type additions)
- 20 complex edits (refactoring, pattern changes)
Metrics:
- Latency (p50, p95, p99, max)
- Accuracy (exact match, semantic match, syntax valid)
- Cost (per edit, per 100 edits)
- Throughput (tokens/sec)
Agent Booster Benchmarks
Variants:
- Native addon (napi-rs) - Fastest
- WASM - Portable
- TypeScript fallback - Baseline
Metrics:
- Same as baseline for fair comparison
- Additional: Confidence scores, strategy distribution
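The latency metrics above (p50/p95/p99) can be collected with a small harness like the following sketch. `applyEditStub` is a placeholder for the real Agent Booster call; the percentile math uses the nearest-rank method.

```typescript
// Minimal latency harness for the p50/p95/p99 metrics. Nearest-rank
// percentile over a sorted sample; applyEditStub is a placeholder.

function percentile(samples: number[], p: number): number {
  const sorted = [...samples].sort((a, b) => a - b);
  const idx = Math.ceil((p / 100) * sorted.length) - 1;
  return sorted[Math.min(sorted.length - 1, Math.max(0, idx))];
}

async function measure(fn: () => Promise<void>, runs: number): Promise<number[]> {
  const latencies: number[] = [];
  for (let i = 0; i < runs; i++) {
    const start = performance.now();
    await fn();
    latencies.push(performance.now() - start);
  }
  return latencies;
}

// Usage with a stub standing in for booster.applyEdit():
const applyEditStub = async () => { /* real edit application goes here */ };
measure(applyEditStub, 100).then(l => {
  console.log({ p50: percentile(l, 50), p95: percentile(l, 95), p99: percentile(l, 99) });
});
```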
Expected Results
| Metric | Morph + Sonnet 4 | Agent Booster | Improvement |
|---|---|---|---|
| Latency (p50) | 5,800ms | 35ms | 166x |
| Accuracy | 98.0% | 96.8% | -1.2pp |
| Cost/edit | $0.01 | $0.00 | 100% |
| Throughput | 10,500 tok/s | 1M+ tok/s | 95x |
🔌 Integration Plan
1. Agentic-Flow Integration
Environment Variables:
AGENT_BOOSTER_ENABLED=true
AGENT_BOOSTER_MODEL=jina-code-v2
AGENT_BOOSTER_CONFIDENCE_THRESHOLD=0.65
AGENT_BOOSTER_FALLBACK_TO_MORPH=true
MORPH_API_KEY=sk-morph-xxx # Optional fallback
Tool Enhancement:
// Enhance edit_file tool (sketch: AgentBooster/applyEdit are the proposed
// API surface for this project, not a published package yet)
import { AgentBooster } from 'agent-booster';

const booster = new AgentBooster();
const threshold = Number(process.env.AGENT_BOOSTER_CONFIDENCE_THRESHOLD ?? '0.65');

export const editFileTool = {
  async execute(params) {
    // Try Agent Booster first if enabled
    if (process.env.AGENT_BOOSTER_ENABLED === 'true') {
      const result = await booster.applyEdit(params);
      if (result.confidence >= threshold) {
        return { method: 'agent-booster', ...result };
      }
      // Fall through to Morph/LLM on low confidence
    }
    // Original behavior
    return fallbackToLLM(params);
  }
};
2. MCP Server
Start Server:
npx agent-booster mcp
Client Config (Claude Desktop):
{
"mcpServers": {
"agent-booster": {
"command": "npx",
"args": ["agent-booster", "mcp"],
"env": {
"AGENT_BOOSTER_MODEL": "jina-code-v2"
}
}
}
}
Tools Exposed:
- agent_booster_apply - Single edit
- agent_booster_batch - Parallel batch edits
- agent_booster_analyze - Workspace analysis
- agent_booster_status - Server status
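Per the MCP specification, a client invokes one of these tools with a JSON-RPC 2.0 `tools/call` request. The sketch below shows the request shape; the `file`/`edit` argument names are assumptions for this proposal, not a finalized schema.

```typescript
// Illustrative JSON-RPC 2.0 `tools/call` request for agent_booster_apply,
// as an MCP client would send it over stdio. Argument names are assumed.
const request = {
  jsonrpc: '2.0' as const,
  id: 1,
  method: 'tools/call',
  params: {
    name: 'agent_booster_apply',
    arguments: {
      file: 'src/main.ts',
      edit: 'add error handling to fetchUser()',
    },
  },
};

console.log(JSON.stringify(request));
```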
3. Standalone CLI
# Apply single edit
npx agent-booster apply src/main.ts "add error handling"
# Batch processing
npx agent-booster batch edits.json
# Watch mode
npx agent-booster watch src/
# MCP server
npx agent-booster mcp --port 3000
📁 Project Structure
agent-booster/
├── Cargo.toml # Rust workspace
├── README.md # Main docs
├── LICENSE # MIT/Apache-2.0
├── crates/
│ ├── agent-booster/ # Core Rust library
│ ├── agent-booster-native/ # napi-rs bindings
│ └── agent-booster-wasm/ # WASM bindings
├── npm/
│ ├── agent-booster/ # NPM package (auto-detection)
│ └── agent-booster-cli/ # Standalone CLI
├── benchmarks/
│ ├── datasets/ # Test code samples
│ ├── baselines/ # Morph LLM baselines
│ └── results/ # Benchmark outputs
├── docs/
│ ├── architecture.md
│ ├── api.md
│ ├── benchmarks.md
│ └── integration.md
└── examples/
├── basic-usage.js
├── agentic-flow.js
└── cli-usage.sh
🗓️ Implementation Roadmap
Phase 1: Foundation (Week 1-2) - @assignee
- Setup Rust workspace (`cargo init`)
- Implement tree-sitter parsing for JS/TS
- Implement basic AST chunking
- Setup benchmark framework
- Run Morph LLM baseline benchmarks
- Document baseline results
Phase 2: Core Engine (Week 3-4) - @assignee
- Implement ONNX Runtime embedding generation
- Implement HNSW vector similarity search
- Implement merge strategies (replace, insert, append)
- Implement confidence scoring
- Add syntax validation
- Run accuracy tests vs Morph LLM
- Document accuracy comparison
Phase 3: Native Integration (Week 5) - @assignee
- Build napi-rs native addon
- Create NPM package with auto-detection
- Write TypeScript definitions
- Add comprehensive tests
- Benchmark native performance
- Document speedup results
Phase 4: WASM Support (Week 6) - @assignee
- Build WASM bindings (wasm-bindgen)
- Optimize WASM bundle size
- Add browser compatibility tests
- Benchmark WASM performance
- Create browser examples
Phase 5: Agentic-flow Integration (Week 7) - @assignee
- Design `.env` configuration
- Create agent-booster tool in agentic-flow
- Add fallback to Morph LLM
- Write integration tests
- Update agentic-flow documentation
- Test with real workflows
Phase 6: MCP Server (Week 8) - @assignee
- Implement MCP protocol server
- Add workspace detection
- Expose tools (apply, batch, analyze, status)
- Add metrics resource
- Test with Claude Desktop/Cursor/VS Code
- Document MCP setup
Phase 7: CLI & SDK (Week 9) - @assignee
- Build standalone CLI (`npx agent-booster`)
- Add commands (apply, batch, watch, mcp, dashboard)
- Add watch mode
- Add batch processing
- Create usage examples
- Write CLI documentation
Phase 8: Documentation & Release (Week 10) - @assignee
- Complete API documentation
- Write architecture deep dive
- Create comparison benchmarks
- Record demo videos
- Publish to crates.io
- Publish to npm
- Announce release (GitHub, Twitter, Reddit)
📚 Documentation Plan
README.md
- Badges (crates.io, npm, CI, docs)
- Quick start (3 examples: API, CLI, MCP)
- Performance comparison table
- Feature comparison vs Morph LLM
- When to use Agent Booster vs Morph
- Installation instructions
- Use cases
docs/architecture.md
- System architecture diagram
- Module breakdown (parser, embedder, vector, merger)
- Data flow diagram
- Performance optimizations
- Error handling strategy
- Memory management
docs/benchmarks.md
- Benchmark methodology
- Test dataset description
- Morph LLM baseline results
- Agent Booster results
- Comparison analysis
- Visualizations (charts, graphs)
docs/integration.md
- Agentic-flow setup
- MCP server setup
- CLI usage
- Environment variables
- Configuration options
- Metrics & monitoring
docs/api.md
- TypeScript API reference
- Rust API reference
- Configuration options
- Error types
- Usage examples
🧪 Testing Strategy
Unit Tests
- Parser module (AST extraction, chunking)
- Embeddings module (tokenization, inference)
- Vector search module (HNSW indexing, similarity)
- Merge module (strategy selection, application)
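A parser-module unit test could take the following shape. `splitTopLevel` is a deliberately naive stand-in for the Tree-sitter chunker (it only splits on top-level `function` declarations) so the test is runnable without the native bindings; the real tests would assert against actual AST chunks.

```typescript
// Sketch of a parser-module unit test. splitTopLevel is a naive
// stand-in for the Tree-sitter chunker, used here so the test shape
// is runnable without native bindings.

function splitTopLevel(source: string): string[] {
  return source
    .split(/\n(?=function )/)
    .map(chunk => chunk.trim())
    .filter(chunk => chunk.length > 0);
}

const src = [
  'function add(a, b) { return a + b; }',
  'function sub(a, b) { return a - b; }',
].join('\n');

const chunks = splitTopLevel(src);
console.log(chunks.length); // → 2
```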
Integration Tests
- End-to-end edit application
- Multi-language support
- Fallback to Morph LLM
- MCP protocol compliance
- CLI commands
Benchmark Tests
- Morph LLM baseline (Claude Sonnet/Opus/Haiku)
- Agent Booster variants (native/WASM/TypeScript)
- Accuracy validation
- Performance profiling
- Memory usage
🚀 Release Checklist
v0.1.0 (MVP)
- Core Rust library functional
- Native Node.js addon working
- NPM package published
- Basic documentation
- Benchmarks vs Morph LLM
- Agentic-flow integration tested
v0.2.0 (Production Ready)
- WASM support
- MCP server
- Standalone CLI
- Comprehensive docs
- 5+ language support
- CI/CD setup
v1.0.0 (Stable)
- API stability guarantee
- Full test coverage (>80%)
- Production deployments
- Community feedback incorporated
- Performance tuning complete
- Security audit
📊 Metrics for Success
Performance KPIs
- Latency (p50) < 50ms
- Latency (p95) < 100ms
- Throughput > 100 edits/sec
- Memory usage < 500MB
Quality KPIs
- Accuracy (simple) > 98%
- Accuracy (medium) > 95%
- Accuracy (complex) > 90%
- Syntax errors < 1%
Adoption KPIs
- 100+ GitHub stars
- 1,000+ npm downloads
- 10+ production users
- 5+ community contributions
🤔 Open Questions
1. Model Selection
- Ship with one model or support multiple?
- Should we fine-tune models for specific use cases?
- Can we quantize models for smaller downloads?
2. Fallback Strategy
- Default to fallback enabled or disabled?
- What confidence threshold is optimal?
- How to learn from fallback cases?
3. Language Support
- Which languages to prioritize after JS/TS?
- Should we support LSP for better parsing?
- How to handle non-tree-sitter languages?
4. Deployment
- Offer hosted version for convenience?
- Enterprise on-premise deployment guide?
- Edge/serverless support?
5. Business Model
- Fully open source (MIT/Apache)?
- Dual license (open + commercial)?
- SaaS offering for enterprises?
📝 Related Documentation
- Planning Overview - Full vision and objectives
- Architecture Design - Technical deep dive
- Integration Guide - Agentic-flow & MCP integration
- Benchmark Plan - Testing methodology
- NPM SDK Design - Package structure
🙋 Questions?
Please comment on this issue or join the discussion.
📄 License
Agent Booster will be dual-licensed under MIT OR Apache-2.0
Let's build the future of AI code editing! 🚀