14 KiB
Federation Data Lifecycle: Persistent vs Ephemeral Storage
Architecture Overview
The federation system uses a hub-and-spoke model with persistent central storage and ephemeral agent storage.
┌─────────────────────────────────────────────────────────┐
│ FEDERATION HUB (PERSISTENT) │
│ │
│ ┌────────────────────┐ ┌─────────────────────────┐ │
│ │ SQLite DB │ │ AgentDB │ │
│ │ (Metadata) │ │ (Vector Memory) │ │
│ │ │ │ │ │
│ │ • Episode metadata │ │ • Vector embeddings │ │
│ │ • Agent registry │ │ • HNSW index │ │
│ │ • Change log │ │ • Semantic search │ │
│ │ • Tenant isolation │ │ • Pattern storage │ │
│ └────────────────────┘ └─────────────────────────┘ │
│ │
│ Storage: /data/hub.db and /data/hub-agentdb.db │
│ Lifetime: PERMANENT (until manually deleted) │
└─────────────────────────────────────────────────────────┘
↑
│ WebSocket Sync
┌─────────────────┼─────────────────┐
│ │ │
↓ ↓ ↓
┌──────────────┐ ┌──────────────┐ ┌──────────────┐
│ Agent 1 │ │ Agent 2 │ │ Agent 3 │
│ (Ephemeral) │ │ (Ephemeral) │ │ (Ephemeral) │
│ │ │ │ │ │
│ Local AgentDB│ │ Local AgentDB│ │ Local AgentDB│
│ Storage: │ │ Storage: │ │ Storage: │
│ :memory: │ │ :memory: │ │ :memory: │
│ │ │ │ │ │
│ Lifetime: │ │ Lifetime: │ │ Lifetime: │
│ 5s - 15min │ │ 5s - 15min │ │ 5s - 15min │
└──────────────┘ └──────────────┘ └──────────────┘
↓ ↓ ↓
DESTROYED DESTROYED DESTROYED
(RAM freed) (RAM freed) (RAM freed)
Data Flow: Complete Lifecycle
Phase 1: Agent Spawns
// Agent spawns with ephemeral memory
const agent = await EphemeralAgent.spawn({
tenantId: 'acme-corp',
lifetime: 300, // 5 minutes
hubEndpoint: 'ws://hub:8443'
});
// Local AgentDB created in memory
agent.agentDB = new AgentDB({ path: ':memory:' });
// Connect to hub
await agent.connect();
Storage State:
- Hub: Empty (or has old memories from previous agents)
- Agent: Empty
:memory:database
Phase 2: Agent Pulls Memories from Hub
await agent.execute(async (db) => {
// PULL: Agent requests memories from hub
const memories = await agent.queryMemories('task-name', 10);
// Hub sends back all relevant memories from past agents
// Agent stores them locally for fast semantic search
});
Storage State:
- Hub: Contains 1000 memories from previous agents (PERSISTENT)
- Agent: Downloaded 10 relevant memories to local
:memory:(TEMPORARY)
Data Transfer:
Hub (Disk: 1000 episodes)
→ WebSocket
→ Agent (RAM: 10 relevant episodes)
Phase 3: Agent Works and Learns
await agent.execute(async (db) => {
// Agent uses memories to inform decisions
const context = await agent.queryMemories('similar-task');
// Agent performs work
const result = await processTask(context);
// Agent stores NEW learning locally
await agent.storeEpisode({
task: 'task-name',
input: 'data',
output: result,
reward: 0.95
});
});
Storage State:
- Hub: Still 1000 memories (not updated yet)
- Agent: 10 old + 1 new = 11 memories in RAM
Phase 4: Agent Syncs to Hub (PUSH)
// Agent pushes new memories to hub
await agent.syncWithHub();
// Hub receives and stores permanently
hub.agentDB.storePattern({
sessionId: 'acme-corp/agent-001',
task: 'task-name',
...episode
});
Storage State:
- Hub: 1001 memories (NEW memory added to disk) ✅
- Agent: 11 memories in RAM
Data Transfer:
Agent (RAM: 1 new episode)
→ WebSocket
→ Hub (Disk: saves permanently)
Phase 5: Agent Expires and Destroys
// After 5 minutes (or manual cleanup)
await agent.destroy();
// Local memory is freed
agent.agentDB.close(); // :memory: database destroyed
Storage State:
- Hub: 1001 memories (PERSISTS on disk) ✅
- Agent: RAM freed, all local data GONE ❌
Phase 6: New Agent Spawns (Hours/Days Later)
// New agent spawns in the future
const newAgent = await EphemeralAgent.spawn({
tenantId: 'acme-corp', // Same tenant
lifetime: 300,
hubEndpoint: 'ws://hub:8443'
});
await newAgent.execute(async (db) => {
// NEW agent can access OLD memories!
const memories = await newAgent.queryMemories('task-name', 10);
// Returns memories from previous agents that died hours ago
console.log(memories.length); // 10 memories (including episode from Agent 1)
});
Storage State:
- Hub: 1001 memories (still on disk from previous agents) ✅
- New Agent: Downloads 10 relevant memories from hub (including work from Agent 1)
Key Insight: Memory outlives the agents! 🎉
Storage Locations
Hub Storage (PERSISTENT)
SQLite Database: /data/hub.db
CREATE TABLE episodes (
id INTEGER PRIMARY KEY,
tenant_id TEXT NOT NULL, -- Tenant isolation
agent_id TEXT NOT NULL, -- Which agent created this
session_id TEXT NOT NULL, -- Agent session
task TEXT NOT NULL, -- Task description
input TEXT NOT NULL, -- Task input
output TEXT NOT NULL, -- Task output
reward REAL NOT NULL, -- Success metric
created_at INTEGER NOT NULL -- Timestamp
);
AgentDB Database: /data/hub-agentdb.db
{
sessionId: 'acme-corp/agent-001', // Tenant prefix for isolation
task: 'implement-feature',
embedding: [...384 dimensions...], // Vector for semantic search
reward: 0.95,
metadata: {
tenantId: 'acme-corp',
agentId: 'agent-001',
timestamp: 1234567890
}
}
Lifetime: PERMANENT until:
- Manually deleted
- Retention policy applied (e.g., delete after 90 days)
- Tenant requests data deletion (GDPR)
Agent Storage (EPHEMERAL)
Location: :memory: (RAM only)
Lifetime: 5 seconds to 15 minutes
Contents:
- Downloaded memories from hub
- Local work in progress
- Temporary caches
Destroyed when:
- Agent reaches
lifetimeexpiration - Manual
agent.destroy()call - Process crash/restart
- Container shutdown
Memory Persistence Guarantees
✅ What PERSISTS (Survives Agent Death)
- All Episodes: Every
storeEpisode()call that syncs to hub - Vector Embeddings: Semantic search index in hub AgentDB
- Metadata: Agent ID, tenant ID, timestamps, rewards
- Tenant Isolation: Sessions tagged with tenant prefix
- Change Log: History of all modifications
❌ What is LOST (Agent Death)
- Local Cache: Downloaded memories in agent's
:memory:DB - In-Progress Work: Anything not yet synced to hub
- Temporary State: Agent-specific runtime data
- Unsaved Episodes: Episodes created but not synced
Sync Timing: When Does Data Persist?
Automatic Sync Points
class EphemeralAgent {
async execute(task) {
// 1. PRE-SYNC: Pull latest from hub
await this.syncWithHub(); // Download new memories
// 2. WORK: Agent performs task
const result = await task(this.db);
// 3. POST-SYNC: Push new memories to hub
await this.syncWithHub(); // Upload new episodes ✅
return result;
}
async destroy() {
// 4. FINAL SYNC: Ensure everything is saved
await this.syncWithHub(); // Last chance to save ✅
// 5. Local cleanup
await this.agentDB.close(); // Memory freed
}
}
Manual Sync
// Developer can force sync anytime
await agent.syncWithHub(); // Pushes all local episodes to hub
Guarantee: Any episode stored before syncWithHub() is PERMANENT.
Example: Multi-Generation Learning
Day 1: First Agent
// 10:00 AM - Agent 1 spawns
const agent1 = await EphemeralAgent.spawn({
tenantId: 'research-team',
lifetime: 300
});
await agent1.execute(async () => {
await agent1.storeEpisode({
task: 'analyze-data',
input: 'dataset-v1',
output: 'Found pattern X',
reward: 0.92
});
});
// 10:05 AM - Agent 1 destroyed
await agent1.destroy(); // Episode saved to hub ✅
Hub Storage: 1 episode
Day 2: Second Agent
// 9:00 AM (next day) - Agent 2 spawns
const agent2 = await EphemeralAgent.spawn({
tenantId: 'research-team', // Same tenant
lifetime: 300
});
await agent2.execute(async () => {
// Query memories (finds Agent 1's work from yesterday!)
const memories = await agent2.queryMemories('analyze-data');
console.log(memories[0].output); // "Found pattern X" ✅
console.log(memories[0].agentId); // "agent-001" (from yesterday)
// Build on previous work
await agent2.storeEpisode({
task: 'refine-pattern',
input: 'pattern-x',
output: 'Confirmed pattern X, found pattern Y',
reward: 0.96
});
});
await agent2.destroy();
Hub Storage: 2 episodes (Agent 1 + Agent 2)
Day 30: Tenth Agent
// 30 days later - Agent 10 spawns
const agent10 = await EphemeralAgent.spawn({
tenantId: 'research-team',
lifetime: 300
});
await agent10.execute(async () => {
// Query all past work
const memories = await agent10.queryMemories('pattern', 100);
console.log(memories.length); // 50+ episodes from 9 previous agents ✅
// Agent 10 learns from all past agents' successes
const bestPatterns = memories
.filter(m => m.reward > 0.90)
.map(m => m.output);
// Standing on the shoulders of giants 🚀
});
Hub Storage: 50+ episodes (cumulative learning)
Retention Policies
Default: Infinite Retention
Hub stores everything forever unless configured otherwise.
Optional: Time-Based Retention
// Delete episodes older than 90 days
hub.db.prepare(`
DELETE FROM episodes
WHERE created_at < ?
`).run(Date.now() - (90 * 24 * 60 * 60 * 1000));
// Delete from AgentDB too
await hub.agentDB.deleteOldPatterns({ maxAge: 90 * 24 * 60 * 60 });
Optional: Reward-Based Retention
// Keep only high-reward episodes
hub.db.prepare(`
DELETE FROM episodes
WHERE reward < 0.70
`).run();
Disaster Recovery
Hub Backup
# Backup hub databases
cp /data/hub.db /backup/hub-2025-10-31.db
cp /data/hub-agentdb.db /backup/hub-agentdb-2025-10-31.db
Hub Restore
# Restore from backup
cp /backup/hub-2025-10-31.db /data/hub.db
cp /backup/hub-agentdb-2025-10-31.db /data/hub-agentdb.db
# Restart hub
docker restart federation-hub
Agent Recovery
Agents don't need backup - They're ephemeral by design!
If an agent crashes, just spawn a new one:
// Old agent crashed (no problem!)
// agent1 died unexpectedly ❌
// Spawn replacement
const agent2 = await EphemeralAgent.spawn({
tenantId: 'acme-corp', // Same tenant
lifetime: 300
});
// New agent has access to ALL old memories ✅
await agent2.execute(async () => {
const memories = await agent2.queryMemories('task');
// Gets memories from crashed agent + all previous agents
});
Production Deployment
Single Hub (Simple)
# docker-compose.yml
services:
federation-hub:
image: federation-hub:latest
volumes:
- hub-data:/data # PERSISTENT volume
ports:
- "8443:8443"
volumes:
hub-data:
driver: local # Data survives container restarts
Persistence: Volume survives container restarts ✅
Multi-Hub (High Availability)
┌──────────┐ ┌──────────┐ ┌──────────┐
│ Hub US │────▶│ Hub EU │────▶│ Hub AP │
│ (Primary)│ │(Replica) │ │(Replica) │
└──────────┘ └──────────┘ └──────────┘
│ │ │
└────────────────┴────────────────┘
Sync every 5 seconds
Guarantee: Data replicated across regions ✅
Key Takeaways
-
Hub = Permanent: All memories stored on disk forever (until manually deleted)
-
Agents = Temporary: Local databases destroyed after 5-15 minutes
-
Memory Outlives Agents: New agents can access memories from agents that died hours/days/weeks ago
-
Sync = Persist: Any episode that syncs to hub is PERMANENT
-
Tenant Isolation: Memories are isolated by tenant, but persist across all agents in that tenant
-
No Data Loss: As long as hub is backed up, no memories are lost when agents die
-
Infinite Generations: Agents can learn from an unlimited chain of previous agents
-
Docker Volumes: Hub data persists across container restarts if using volumes
Bottom Line: The federation hub is the "source of truth" - it's a persistent, centralized database that outlives all ephemeral agents. Agents come and go, but the hub remembers everything. 🧠
This enables continuous learning where each new generation of agents builds on the collective knowledge of all previous generations, while maintaining the efficiency benefits of ephemeral, short-lived agents.