521 lines
14 KiB
Markdown
521 lines
14 KiB
Markdown
# Federation Data Lifecycle: Persistent vs Ephemeral Storage
|
|
|
|
## Architecture Overview
|
|
|
|
The federation system uses a **hub-and-spoke** model with **persistent central storage** and **ephemeral agent storage**.
|
|
|
|
```
|
|
┌─────────────────────────────────────────────────────────┐
|
|
│ FEDERATION HUB (PERSISTENT) │
|
|
│ │
|
|
│ ┌────────────────────┐ ┌─────────────────────────┐ │
|
|
│ │ SQLite DB │ │ AgentDB │ │
|
|
│ │ (Metadata) │ │ (Vector Memory) │ │
|
|
│ │ │ │ │ │
|
|
│ │ • Episode metadata │ │ • Vector embeddings │ │
|
|
│ │ • Agent registry │ │ • HNSW index │ │
|
|
│ │ • Change log │ │ • Semantic search │ │
|
|
│ │ • Tenant isolation │ │ • Pattern storage │ │
|
|
│ └────────────────────┘ └─────────────────────────┘ │
|
|
│ │
|
|
│ Storage: /data/hub.db and /data/hub-agentdb.db │
|
|
│ Lifetime: PERMANENT (until manually deleted) │
|
|
└─────────────────────────────────────────────────────────┘
|
|
↑
|
|
│ WebSocket Sync
|
|
┌─────────────────┼─────────────────┐
|
|
│ │ │
|
|
↓ ↓ ↓
|
|
┌──────────────┐ ┌──────────────┐ ┌──────────────┐
|
|
│ Agent 1 │ │ Agent 2 │ │ Agent 3 │
|
|
│ (Ephemeral) │ │ (Ephemeral) │ │ (Ephemeral) │
|
|
│ │ │ │ │ │
|
|
│ Local AgentDB│ │ Local AgentDB│ │ Local AgentDB│
|
|
│ Storage: │ │ Storage: │ │ Storage: │
|
|
│ :memory: │ │ :memory: │ │ :memory: │
|
|
│ │ │ │ │ │
|
|
│ Lifetime: │ │ Lifetime: │ │ Lifetime: │
|
|
│ 5s - 15min │ │ 5s - 15min │ │ 5s - 15min │
|
|
└──────────────┘ └──────────────┘ └──────────────┘
|
|
↓ ↓ ↓
|
|
DESTROYED DESTROYED DESTROYED
|
|
(RAM freed) (RAM freed) (RAM freed)
|
|
```
|
|
|
|
---
|
|
|
|
## Data Flow: Complete Lifecycle
|
|
|
|
### Phase 1: Agent Spawns
|
|
|
|
```typescript
|
|
// Agent spawns with ephemeral memory
|
|
const agent = await EphemeralAgent.spawn({
|
|
tenantId: 'acme-corp',
|
|
lifetime: 300, // 5 minutes
|
|
hubEndpoint: 'ws://hub:8443'
|
|
});
|
|
|
|
// Local AgentDB created in memory
|
|
agent.agentDB = new AgentDB({ path: ':memory:' });
|
|
|
|
// Connect to hub
|
|
await agent.connect();
|
|
```
|
|
|
|
**Storage State:**
|
|
- Hub: Empty (or has old memories from previous agents)
|
|
- Agent: Empty `:memory:` database
|
|
|
|
### Phase 2: Agent Pulls Memories from Hub
|
|
|
|
```typescript
|
|
await agent.execute(async (db) => {
|
|
// PULL: Agent requests memories from hub
|
|
const memories = await agent.queryMemories('task-name', 10);
|
|
|
|
// Hub sends back all relevant memories from past agents
|
|
// Agent stores them locally for fast semantic search
|
|
});
|
|
```
|
|
|
|
**Storage State:**
|
|
- Hub: Contains 1000 memories from previous agents (PERSISTENT)
|
|
- Agent: Downloaded 10 relevant memories to local `:memory:` (TEMPORARY)
|
|
|
|
**Data Transfer:**
|
|
```
|
|
Hub (Disk: 1000 episodes)
|
|
→ WebSocket
|
|
→ Agent (RAM: 10 relevant episodes)
|
|
```
|
|
|
|
### Phase 3: Agent Works and Learns
|
|
|
|
```typescript
|
|
await agent.execute(async (db) => {
|
|
// Agent uses memories to inform decisions
|
|
const context = await agent.queryMemories('similar-task');
|
|
|
|
// Agent performs work
|
|
const result = await processTask(context);
|
|
|
|
// Agent stores NEW learning locally
|
|
await agent.storeEpisode({
|
|
task: 'task-name',
|
|
input: 'data',
|
|
output: result,
|
|
reward: 0.95
|
|
});
|
|
});
|
|
```
|
|
|
|
**Storage State:**
|
|
- Hub: Still 1000 memories (not updated yet)
|
|
- Agent: 10 old + 1 new = 11 memories in RAM
|
|
|
|
### Phase 4: Agent Syncs to Hub (PUSH)
|
|
|
|
```typescript
|
|
// Agent pushes new memories to hub
|
|
await agent.syncWithHub();
|
|
|
|
// Hub receives and stores permanently
|
|
hub.agentDB.storePattern({
|
|
sessionId: 'acme-corp/agent-001',
|
|
task: 'task-name',
|
|
...episode
|
|
});
|
|
```
|
|
|
|
**Storage State:**
|
|
- Hub: 1001 memories (NEW memory added to disk) ✅
|
|
- Agent: 11 memories in RAM
|
|
|
|
**Data Transfer:**
|
|
```
|
|
Agent (RAM: 1 new episode)
|
|
→ WebSocket
|
|
→ Hub (Disk: saves permanently)
|
|
```
|
|
|
|
### Phase 5: Agent Expires and Destroys
|
|
|
|
```typescript
|
|
// After 5 minutes (or manual cleanup)
|
|
await agent.destroy();
|
|
|
|
// Local memory is freed
|
|
agent.agentDB.close(); // :memory: database destroyed
|
|
```
|
|
|
|
**Storage State:**
|
|
- Hub: 1001 memories (PERSISTS on disk) ✅
|
|
- Agent: RAM freed, all local data GONE ❌
|
|
|
|
### Phase 6: New Agent Spawns (Hours/Days Later)
|
|
|
|
```typescript
|
|
// New agent spawns in the future
|
|
const newAgent = await EphemeralAgent.spawn({
|
|
tenantId: 'acme-corp', // Same tenant
|
|
lifetime: 300,
|
|
hubEndpoint: 'ws://hub:8443'
|
|
});
|
|
|
|
await newAgent.execute(async (db) => {
|
|
// NEW agent can access OLD memories!
|
|
const memories = await newAgent.queryMemories('task-name', 10);
|
|
|
|
// Returns memories from previous agents that died hours ago
|
|
console.log(memories.length); // 10 memories (including episode from Agent 1)
|
|
});
|
|
```
|
|
|
|
**Storage State:**
|
|
- Hub: 1001 memories (still on disk from previous agents) ✅
|
|
- New Agent: Downloads 10 relevant memories from hub (including work from Agent 1)
|
|
|
|
**Key Insight**: Memory outlives the agents! 🎉
|
|
|
|
---
|
|
|
|
## Storage Locations
|
|
|
|
### Hub Storage (PERSISTENT)
|
|
|
|
**SQLite Database: `/data/hub.db`**
|
|
```sql
|
|
CREATE TABLE episodes (
|
|
id INTEGER PRIMARY KEY,
|
|
tenant_id TEXT NOT NULL, -- Tenant isolation
|
|
agent_id TEXT NOT NULL, -- Which agent created this
|
|
session_id TEXT NOT NULL, -- Agent session
|
|
task TEXT NOT NULL, -- Task description
|
|
input TEXT NOT NULL, -- Task input
|
|
output TEXT NOT NULL, -- Task output
|
|
reward REAL NOT NULL, -- Success metric
|
|
created_at INTEGER NOT NULL -- Timestamp
|
|
);
|
|
```
|
|
|
|
**AgentDB Database: `/data/hub-agentdb.db`**
|
|
```typescript
|
|
{
|
|
sessionId: 'acme-corp/agent-001', // Tenant prefix for isolation
|
|
task: 'implement-feature',
|
|
embedding: [...384 dimensions...], // Vector for semantic search
|
|
reward: 0.95,
|
|
metadata: {
|
|
tenantId: 'acme-corp',
|
|
agentId: 'agent-001',
|
|
timestamp: 1234567890
|
|
}
|
|
}
|
|
```
|
|
|
|
**Lifetime**: PERMANENT until:
|
|
- Manually deleted
|
|
- Retention policy applied (e.g., delete after 90 days)
|
|
- Tenant requests data deletion (GDPR)
|
|
|
|
### Agent Storage (EPHEMERAL)
|
|
|
|
**Location**: `:memory:` (RAM only)
|
|
|
|
**Lifetime**: 5 seconds to 15 minutes
|
|
|
|
**Contents**:
|
|
- Downloaded memories from hub
|
|
- Local work in progress
|
|
- Temporary caches
|
|
|
|
**Destroyed when**:
|
|
- Agent reaches `lifetime` expiration
|
|
- Manual `agent.destroy()` call
|
|
- Process crash/restart
|
|
- Container shutdown
|
|
|
|
---
|
|
|
|
## Memory Persistence Guarantees
|
|
|
|
### ✅ What PERSISTS (Survives Agent Death)
|
|
|
|
1. **All Episodes**: Every `storeEpisode()` call that syncs to hub
|
|
2. **Vector Embeddings**: Semantic search index in hub AgentDB
|
|
3. **Metadata**: Agent ID, tenant ID, timestamps, rewards
|
|
4. **Tenant Isolation**: Sessions tagged with tenant prefix
|
|
5. **Change Log**: History of all modifications
|
|
|
|
### ❌ What is LOST (Agent Death)
|
|
|
|
1. **Local Cache**: Downloaded memories in agent's `:memory:` DB
|
|
2. **In-Progress Work**: Anything not yet synced to hub
|
|
3. **Temporary State**: Agent-specific runtime data
|
|
4. **Unsaved Episodes**: Episodes created but not synced
|
|
|
|
---
|
|
|
|
## Sync Timing: When Does Data Persist?
|
|
|
|
### Automatic Sync Points
|
|
|
|
```typescript
|
|
class EphemeralAgent {
|
|
async execute(task) {
|
|
// 1. PRE-SYNC: Pull latest from hub
|
|
await this.syncWithHub(); // Download new memories
|
|
|
|
// 2. WORK: Agent performs task
|
|
const result = await task(this.db);
|
|
|
|
// 3. POST-SYNC: Push new memories to hub
|
|
await this.syncWithHub(); // Upload new episodes ✅
|
|
|
|
return result;
|
|
}
|
|
|
|
async destroy() {
|
|
// 4. FINAL SYNC: Ensure everything is saved
|
|
await this.syncWithHub(); // Last chance to save ✅
|
|
|
|
// 5. Local cleanup
|
|
await this.agentDB.close(); // Memory freed
|
|
}
|
|
}
|
|
```
|
|
|
|
### Manual Sync
|
|
|
|
```typescript
|
|
// Developer can force sync anytime
|
|
await agent.syncWithHub(); // Pushes all local episodes to hub
|
|
```
|
|
|
|
**Guarantee**: Any episode stored before `syncWithHub()` is PERMANENT.
|
|
|
|
---
|
|
|
|
## Example: Multi-Generation Learning
|
|
|
|
### Day 1: First Agent
|
|
|
|
```typescript
|
|
// 10:00 AM - Agent 1 spawns
|
|
const agent1 = await EphemeralAgent.spawn({
|
|
tenantId: 'research-team',
|
|
lifetime: 300
|
|
});
|
|
|
|
await agent1.execute(async () => {
|
|
await agent1.storeEpisode({
|
|
task: 'analyze-data',
|
|
input: 'dataset-v1',
|
|
output: 'Found pattern X',
|
|
reward: 0.92
|
|
});
|
|
});
|
|
|
|
// 10:05 AM - Agent 1 destroyed
|
|
await agent1.destroy(); // Episode saved to hub ✅
|
|
```
|
|
|
|
**Hub Storage**: 1 episode
|
|
|
|
### Day 2: Second Agent
|
|
|
|
```typescript
|
|
// 9:00 AM (next day) - Agent 2 spawns
|
|
const agent2 = await EphemeralAgent.spawn({
|
|
tenantId: 'research-team', // Same tenant
|
|
lifetime: 300
|
|
});
|
|
|
|
await agent2.execute(async () => {
|
|
// Query memories (finds Agent 1's work from yesterday!)
|
|
const memories = await agent2.queryMemories('analyze-data');
|
|
|
|
console.log(memories[0].output); // "Found pattern X" ✅
|
|
console.log(memories[0].agentId); // "agent-001" (from yesterday)
|
|
|
|
// Build on previous work
|
|
await agent2.storeEpisode({
|
|
task: 'refine-pattern',
|
|
input: 'pattern-x',
|
|
output: 'Confirmed pattern X, found pattern Y',
|
|
reward: 0.96
|
|
});
|
|
});
|
|
|
|
await agent2.destroy();
|
|
```
|
|
|
|
**Hub Storage**: 2 episodes (Agent 1 + Agent 2)
|
|
|
|
### Day 30: Tenth Agent
|
|
|
|
```typescript
|
|
// 30 days later - Agent 10 spawns
|
|
const agent10 = await EphemeralAgent.spawn({
|
|
tenantId: 'research-team',
|
|
lifetime: 300
|
|
});
|
|
|
|
await agent10.execute(async () => {
|
|
// Query all past work
|
|
const memories = await agent10.queryMemories('pattern', 100);
|
|
|
|
console.log(memories.length); // 50+ episodes from 9 previous agents ✅
|
|
|
|
// Agent 10 learns from all past agents' successes
|
|
const bestPatterns = memories
|
|
.filter(m => m.reward > 0.90)
|
|
.map(m => m.output);
|
|
|
|
// Standing on the shoulders of giants 🚀
|
|
});
|
|
```
|
|
|
|
**Hub Storage**: 50+ episodes (cumulative learning)
|
|
|
|
---
|
|
|
|
## Retention Policies
|
|
|
|
### Default: Infinite Retention
|
|
|
|
Hub stores everything forever unless configured otherwise.
|
|
|
|
### Optional: Time-Based Retention
|
|
|
|
```typescript
|
|
// Delete episodes older than 90 days
|
|
hub.db.prepare(`
|
|
DELETE FROM episodes
|
|
WHERE created_at < ?
|
|
`).run(Date.now() - (90 * 24 * 60 * 60 * 1000));
|
|
|
|
// Delete from AgentDB too
|
|
await hub.agentDB.deleteOldPatterns({ maxAge: 90 * 24 * 60 * 60 });
|
|
```
|
|
|
|
### Optional: Reward-Based Retention
|
|
|
|
```typescript
|
|
// Keep only high-reward episodes
|
|
hub.db.prepare(`
|
|
DELETE FROM episodes
|
|
WHERE reward < 0.70
|
|
`).run();
|
|
```
|
|
|
|
---
|
|
|
|
## Disaster Recovery
|
|
|
|
### Hub Backup
|
|
|
|
```bash
|
|
# Backup hub databases
|
|
cp /data/hub.db /backup/hub-2025-10-31.db
|
|
cp /data/hub-agentdb.db /backup/hub-agentdb-2025-10-31.db
|
|
```
|
|
|
|
### Hub Restore
|
|
|
|
```bash
|
|
# Restore from backup
|
|
cp /backup/hub-2025-10-31.db /data/hub.db
|
|
cp /backup/hub-agentdb-2025-10-31.db /data/hub-agentdb.db
|
|
|
|
# Restart hub
|
|
docker restart federation-hub
|
|
```
|
|
|
|
### Agent Recovery
|
|
|
|
**Agents don't need backup** - They're ephemeral by design!
|
|
|
|
If an agent crashes, just spawn a new one:
|
|
|
|
```typescript
|
|
// Old agent crashed (no problem!)
|
|
// agent1 died unexpectedly ❌
|
|
|
|
// Spawn replacement
|
|
const agent2 = await EphemeralAgent.spawn({
|
|
tenantId: 'acme-corp', // Same tenant
|
|
lifetime: 300
|
|
});
|
|
|
|
// New agent has access to ALL old memories ✅
|
|
await agent2.execute(async () => {
|
|
const memories = await agent2.queryMemories('task');
|
|
// Gets memories from crashed agent + all previous agents
|
|
});
|
|
```
|
|
|
|
---
|
|
|
|
## Production Deployment
|
|
|
|
### Single Hub (Simple)
|
|
|
|
```yaml
|
|
# docker-compose.yml
|
|
services:
|
|
federation-hub:
|
|
image: federation-hub:latest
|
|
volumes:
|
|
- hub-data:/data # PERSISTENT volume
|
|
ports:
|
|
- "8443:8443"
|
|
|
|
volumes:
|
|
hub-data:
|
|
driver: local # Data survives container restarts
|
|
```
|
|
|
|
**Persistence**: Volume survives container restarts ✅
|
|
|
|
### Multi-Hub (High Availability)
|
|
|
|
```
|
|
┌──────────┐ ┌──────────┐ ┌──────────┐
|
|
│ Hub US │────▶│ Hub EU │────▶│ Hub AP │
|
|
│ (Primary)│ │(Replica) │ │(Replica) │
|
|
└──────────┘ └──────────┘ └──────────┘
|
|
│ │ │
|
|
└────────────────┴────────────────┘
|
|
Sync every 5 seconds
|
|
```
|
|
|
|
**Guarantee**: Data replicated across regions ✅
|
|
|
|
---
|
|
|
|
## Key Takeaways
|
|
|
|
1. **Hub = Permanent**: All memories stored on disk forever (until manually deleted)
|
|
|
|
2. **Agents = Temporary**: Local databases destroyed after 5-15 minutes
|
|
|
|
3. **Memory Outlives Agents**: New agents can access memories from agents that died hours/days/weeks ago
|
|
|
|
4. **Sync = Persist**: Any episode that syncs to hub is PERMANENT
|
|
|
|
5. **Tenant Isolation**: Memories are isolated by tenant, but persist across all agents in that tenant
|
|
|
|
6. **No Data Loss**: As long as hub is backed up, no memories are lost when agents die
|
|
|
|
7. **Infinite Generations**: Agents can learn from an unlimited chain of previous agents
|
|
|
|
8. **Docker Volumes**: Hub data persists across container restarts if using volumes
|
|
|
|
---
|
|
|
|
**Bottom Line**: The federation hub is the "source of truth" - it's a **persistent, centralized database** that outlives all ephemeral agents. Agents come and go, but the hub remembers everything. 🧠
|
|
|
|
This enables **continuous learning** where each new generation of agents builds on the collective knowledge of all previous generations, while maintaining the efficiency benefits of ephemeral, short-lived agents.
|