How to Add Persistent Memory to Your Paperclip AI Company
Paperclip has quickly become one of the most popular ways to orchestrate AI agents. With 36K+ GitHub stars and a thriving community, this open-source tool lets you spin up an entire AI-powered company — CEO, marketer, designer, engineer — all working together using Claude Code under the hood.
There's just one problem: every time you restart a session, your entire company gets collective amnesia.
Your AI CEO forgets the product roadmap it spent three hours refining. Your marketer loses the brand voice guidelines it developed. Your designer can't remember which color palette the team agreed on yesterday. You're back to square one, re-explaining everything, every single time.
In this tutorial, we'll fix that. You'll add persistent memory to every agent in your Paperclip company using 0Latency, so your AI employees remember everything across sessions — and can even share knowledge with each other.
The Memory Problem in Multi-Agent Orchestration
Paperclip manages multiple AI agents as employees within a simulated company structure. Each agent has a role, a set of responsibilities, and the ability to communicate with other agents. It's a powerful model — until you hit the context window wall.
Here's what actually happens without persistent memory:
- Session isolation: Each agent's context resets when the session ends. A decision made on Monday doesn't exist on Tuesday.
- No cross-agent memory: Even within a session, agents can only share information through explicit message passing. There's no shared knowledge base.
- Repeated onboarding: Every session starts with you re-explaining your company, product, customers, and goals. That's 10-20 minutes of your time, plus the context-window tokens the explanation consumes, per agent, per session.
- Context window pressure: Stuffing "memory" into system prompts eats into the tokens available for actual work. The more you remind agents about, the less room they have to think.
This isn't a Paperclip-specific limitation. It's a fundamental challenge in multi-agent orchestration. But with the right memory layer, it's completely solvable.
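To make the context window pressure concrete, here is a rough back-of-the-envelope sketch. The 200,000-token window and the four-characters-per-token ratio are assumptions chosen for illustration, not measured values, and `estimate_tokens` is a crude heuristic rather than a real tokenizer.

```python
# Rough context-budget arithmetic. All numbers are illustrative assumptions.
CONTEXT_WINDOW_TOKENS = 200_000  # assumed model context window
CHARS_PER_TOKEN = 4              # crude heuristic, not a real tokenizer


def estimate_tokens(text: str) -> int:
    """Approximate a token count from character length."""
    return max(1, len(text) // CHARS_PER_TOKEN)


def remaining_budget(reminders: list[str], window: int = CONTEXT_WINDOW_TOKENS) -> int:
    """Tokens left for actual work after reminders are pasted into the prompt."""
    used = sum(estimate_tokens(r) for r in reminders)
    return window - used


if __name__ == "__main__":
    # Placeholder text standing in for a pasted roadmap and brand guide.
    reminders = ["r" * 40_000, "b" * 20_000]
    print(remaining_budget(reminders))
```

Every reminder pasted into the prompt comes straight out of the budget the agent has left for the task itself.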
How 0Latency Solves This
0Latency is an API-first memory service built specifically for AI agents. Instead of cramming memory into prompts or files, your agents store and retrieve memories through API calls — with automatic entity extraction, relationship mapping, and semantic search built in.
For Paperclip companies, this means:
- Every agent remembers its decisions, conversations, and outputs across sessions
- Agents share a knowledge graph — when the CEO makes a strategic decision, the marketer can recall it without being explicitly told
- Automatic entity extraction pulls out people, companies, products, and relationships without any extra work
- Semantic recall lets agents search memory by meaning, not just keywords
Setup: Adding 0Latency to Your Paperclip Company
There are two ways to integrate: via the MCP (Model Context Protocol) server or the Python SDK. We'll cover both.
Step 1: Get Your API Key
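Sign up at 0latency.ai and grab your API key. The free tier gives you 10,000 memories and 3 agents, enough to try this with a small Paperclip company. (The five-agent setup shown in Step 2 exceeds the 3-agent limit, so it would need a paid tier.)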
Step 2: Configure MCP for Each Agent
Paperclip uses Claude Code for its agents, which supports MCP servers natively. Add the 0Latency MCP server to each agent's configuration:
{
  "mcpServers": {
    "0latency-memory": {
      "command": "npx",
      "args": ["-y", "@0latency/mcp-server"],
      "env": {
        "ZERO_LATENCY_API_KEY": "your-api-key-here",
        "ZEROLAT_AGENT_ID": "paperclip-ceo"
      }
    }
  }
}
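Since only the agent ID changes from one agent to the next, the block above can be generated rather than hand-edited. The following is a minimal sketch using only the standard library. The agent IDs are the ones listed later in this step, and the helper names `build_mcp_config` and `build_all_configs` are hypothetical. Where each agent's configuration file lives in Paperclip is not covered here, so the sketch only builds the dictionaries.

```python
import json

# Agent IDs for the standard company setup described in this step.
AGENT_IDS = [
    "paperclip-ceo",
    "paperclip-marketer",
    "paperclip-designer",
    "paperclip-engineer",
    "paperclip-analyst",
]


def build_mcp_config(agent_id: str, api_key: str) -> dict:
    """Return the MCP server config shown above for a single agent."""
    return {
        "mcpServers": {
            "0latency-memory": {
                "command": "npx",
                "args": ["-y", "@0latency/mcp-server"],
                "env": {
                    "ZERO_LATENCY_API_KEY": api_key,
                    "ZEROLAT_AGENT_ID": agent_id,
                },
            }
        }
    }


def build_all_configs(api_key: str) -> dict[str, dict]:
    """Map each agent ID to its own config dictionary."""
    return {agent_id: build_mcp_config(agent_id, api_key) for agent_id in AGENT_IDS}


if __name__ == "__main__":
    configs = build_all_configs(api_key="your-api-key-here")
    print(json.dumps(configs["paperclip-marketer"], indent=2))
```

In practice you would read the API key from your environment instead of hard-coding it.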
Each agent gets its own ZEROLAT_AGENT_ID. This lets 0Latency track who stored what, while still allowing cross-agent recall. Here's a typical Paperclip company setup:
// Agent IDs for a standard Paperclip company
"paperclip-ceo" // Strategic decisions, roadmap, priorities
"paperclip-marketer" // Brand voice, campaigns, audience insights
"paperclip-designer" // Design system, assets, style guidelines
"paperclip-engineer" // Architecture decisions, tech stack, code patterns
"paperclip-analyst" // Metrics, reports, market research
Step 3: Using Memory in Agent Prompts
Once MCP is configured, your agents get two new tools: memory_store and memory_recall. Add instructions to each agent's system prompt telling them when to use memory:
# Add to each Paperclip agent's system prompt
## Memory Protocol
- At the START of each session, recall relevant context:
    memory_recall("current priorities and recent decisions")
- After making any significant decision, store it:
    memory_store("Decided to pivot landing page to focus on enterprise")
- Before starting work, check what was done previously:
    memory_recall("last session progress on [current task]")
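If you assemble agent prompts in code, the protocol can be appended programmatically so every agent gets the same wording. This is a sketch under that assumption. `with_memory_protocol` is a hypothetical helper, and how Paperclip loads system prompts is not covered here.

```python
# Template mirroring the Memory Protocol above; {task} fills the
# "[current task]" placeholder.
MEMORY_PROTOCOL = """\
## Memory Protocol
- At the START of each session, recall relevant context:
    memory_recall("current priorities and recent decisions")
- After making any significant decision, store it:
    memory_store("<one-sentence summary of the decision>")
- Before starting work, check what was done previously:
    memory_recall("last session progress on {task}")
"""


def with_memory_protocol(base_prompt: str, task: str) -> str:
    """Append the memory protocol to an agent's existing system prompt."""
    return base_prompt.rstrip() + "\n\n" + MEMORY_PROTOCOL.format(task=task)


if __name__ == "__main__":
    print(with_memory_protocol("You are the CEO of the company.", "landing page redesign"))
```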
Step 4: Python SDK for Custom Workflows
If you're extending Paperclip with custom Python scripts (common for data pipelines, report generation, or tool integration), use the Python SDK directly:
from zerolat import ZeroLatency

# Initialize with your API key
zl = ZeroLatency(api_key="your-api-key-here")

# Store a memory from the CEO agent
zl.store(
    text="Q2 strategy: Focus on enterprise sales. Target 50 demos by June.",
    agent_id="paperclip-ceo",
    metadata={"category": "strategy", "quarter": "Q2-2026"}
)

# Recall memories across ALL agents (shared knowledge)
results = zl.recall(
    query="What is our current sales strategy?",
    agent_id="paperclip-marketer",  # Who's asking
    cross_agent=True                # Search all agents' memories
)

for memory in results:
    print(f"[{memory.agent_id}]: {memory.text}")
# [paperclip-ceo]: Q2 strategy: Focus on enterprise sales...
The cross_agent=True parameter is what makes this powerful for Paperclip. It lets any agent recall memories stored by any other agent in the same company, creating a true shared knowledge base.
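To see what the flag changes, here is a local stand-in that mimics the behavior described above. It is not the 0Latency SDK. `InMemoryStore` is a hypothetical class, keyword overlap stands in for semantic search, and the `cross_agent=False` default is an assumption made for the illustration.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Memory:
    agent_id: str
    text: str
    metadata: dict = field(default_factory=dict)


class InMemoryStore:
    """Hypothetical local stand-in for illustration; NOT the 0Latency SDK."""

    def __init__(self) -> None:
        self._memories: list[Memory] = []

    def store(self, text: str, agent_id: str, metadata: Optional[dict] = None) -> None:
        self._memories.append(Memory(agent_id, text, metadata or {}))

    def recall(self, query: str, agent_id: str, cross_agent: bool = False) -> list[Memory]:
        # Naive keyword overlap stands in for semantic search.
        terms = {w.strip("?.,:").lower() for w in query.split()}
        hits = []
        for m in self._memories:
            # Without cross_agent, an agent only sees its own memories.
            if not cross_agent and m.agent_id != agent_id:
                continue
            words = {w.strip("?.,:").lower() for w in m.text.split()}
            if terms & words:
                hits.append(m)
        return hits


if __name__ == "__main__":
    store = InMemoryStore()
    store.store(
        "Q2 strategy: Focus on enterprise sales. Target 50 demos by June.",
        agent_id="paperclip-ceo",
    )
    query = "What is our current sales strategy?"
    own = store.recall(query, agent_id="paperclip-marketer")
    shared = store.recall(query, agent_id="paperclip-marketer", cross_agent=True)
    print(len(own), len(shared))
```

With the flag off, the marketer finds nothing because it never stored that memory itself. With the flag on, the CEO's entry comes back.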
How Shared Memory Transforms Agent Coordination
Without shared memory, Paperclip agents coordinate through explicit message passing — the CEO has to directly tell the marketer about a new strategy, who then has to directly tell the designer to update the brand assets. It's a game of telephone that breaks down fast.
With 0Latency shared memory, coordination becomes implicit:
Scenario: Product Launch
- The CEO agent decides on a launch date and stores it: "Product launch scheduled for April 15. Key message: fastest AI memory on the market."
- The marketer agent starts a new session, recalls recent decisions, and immediately knows the launch date and positioning — no one had to tell it.
- The designer agent recalls the key message and generates launch graphics that match the positioning.
- The engineer agent recalls the launch date and prioritizes the feature freeze accordingly.
Each agent independently pulls the context it needs. No message chains. No re-explaining. No lost context.
Entity Graph in Action
0Latency automatically extracts entities and relationships from stored memories. After a few sessions, your Paperclip company builds up a rich knowledge graph:
// Automatically extracted from agent memories:
{
  "entities": [
    {"name": "Enterprise Sales Push", "type": "initiative", "confidence": 0.95},
    {"name": "April 15 Launch", "type": "event", "confidence": 0.98},
    {"name": "Acme Corp", "type": "company", "confidence": 0.92}
  ],
  "relationships": [
    {"from": "Enterprise Sales Push", "to": "April 15 Launch", "type": "drives"},
    {"from": "Acme Corp", "to": "Enterprise Sales Push", "type": "target_customer"}
  ]
}
When an agent recalls "enterprise strategy," it doesn't just get the text — it gets the full network of connected entities. The marketer asking about Acme Corp automatically surfaces the launch timeline and sales strategy that relate to it.
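The idea of surfacing connected entities can be illustrated with a plain breadth-first walk over the example graph. How 0Latency actually traverses or ranks its graph is not described here, so treat this as a sketch of the concept only. `connected_entities` is a hypothetical helper, and treating edges as undirected is an assumption.

```python
from collections import deque

# The example graph from above, as a Python dictionary.
GRAPH = {
    "entities": [
        {"name": "Enterprise Sales Push", "type": "initiative", "confidence": 0.95},
        {"name": "April 15 Launch", "type": "event", "confidence": 0.98},
        {"name": "Acme Corp", "type": "company", "confidence": 0.92},
    ],
    "relationships": [
        {"from": "Enterprise Sales Push", "to": "April 15 Launch", "type": "drives"},
        {"from": "Acme Corp", "to": "Enterprise Sales Push", "type": "target_customer"},
    ],
}


def connected_entities(graph: dict, start: str) -> list[str]:
    """Breadth-first walk over relationships, treating edges as undirected."""
    neighbors: dict[str, set[str]] = {}
    for rel in graph["relationships"]:
        neighbors.setdefault(rel["from"], set()).add(rel["to"])
        neighbors.setdefault(rel["to"], set()).add(rel["from"])

    seen = {start}
    order: list[str] = []
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for nxt in sorted(neighbors.get(node, ())):
            if nxt not in seen:
                seen.add(nxt)
                order.append(nxt)
                queue.append(nxt)
    return order


if __name__ == "__main__":
    print(connected_entities(GRAPH, "Acme Corp"))
```

Starting from Acme Corp, the walk reaches the sales initiative first and the launch date one hop later, which matches the marketer example.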
Production Tips
- Memory hygiene: Add a periodic "memory cleanup" instruction to your CEO agent. Have it review and consolidate outdated memories to keep recall results sharp.
- Scoped recall: Use metadata tags (like category: "strategy") to let agents filter recalls by topic. This keeps the engineer from drowning in marketing memories.
- Session summaries: At the end of each Paperclip session, have each agent store a summary of what it accomplished. This makes the next session startup nearly instant.
- Cost management: The Free tier (10K memories, 3 agents) handles most Paperclip setups. If you're running larger companies or storing high-frequency data, the Pro tier at $29/mo gives you 100K memories and 23 agents.
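As an illustration of scoped recall, the sketch below filters already-recalled memories by metadata on the client side. Whether the recall API accepts a metadata filter directly is not shown in this tutorial, so the filtering is done locally. `filter_by_metadata` is a hypothetical helper and the sample memories are made up for the example.

```python
# Sample recalled memories; contents are illustrative only.
MEMORIES = [
    {"agent_id": "paperclip-ceo",
     "text": "Q2 strategy: Focus on enterprise sales.",
     "metadata": {"category": "strategy", "quarter": "Q2-2026"}},
    {"agent_id": "paperclip-marketer",
     "text": "Brand voice: direct and technical.",
     "metadata": {"category": "marketing"}},
    {"agent_id": "paperclip-engineer",
     "text": "Feature freeze one week before launch.",
     "metadata": {"category": "engineering"}},
]


def filter_by_metadata(memories: list[dict], **required: str) -> list[dict]:
    """Keep only memories whose metadata matches every required key/value."""
    return [
        m for m in memories
        if all(m.get("metadata", {}).get(key) == value for key, value in required.items())
    ]


if __name__ == "__main__":
    for m in filter_by_metadata(MEMORIES, category="strategy"):
        print(f"[{m['agent_id']}]: {m['text']}")
```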
Ready to give your Paperclip company a memory?
Free tier includes 10,000 memories and 3 agents. No credit card required.
Get Your Free API Key →
FAQ
Does 0Latency work with all Paperclip agent types?
Yes. Any Paperclip agent that uses Claude Code (which is all of them by default) can use the MCP integration. Custom agents using the Python SDK work with any LLM backend.
How many memories does a typical Paperclip session generate?
A typical session generates 20-50 memories. On the Free tier (10,000 memories), that's roughly 200 to 500 full sessions before you'd need to upgrade. Most teams use the Pro plan for the extra headroom.
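The session estimate is simple division of the tier limit by memories per session. Here is a quick sketch using the figures quoted in this answer. `sessions_until_full` is a hypothetical helper.

```python
FREE_TIER_MEMORIES = 10_000  # Free tier limit quoted above


def sessions_until_full(memory_limit: int, memories_per_session: int) -> int:
    """Whole sessions that fit before the memory limit is reached."""
    return memory_limit // memories_per_session


if __name__ == "__main__":
    heavy = sessions_until_full(FREE_TIER_MEMORIES, 50)  # 50 memories per session
    light = sessions_until_full(FREE_TIER_MEMORIES, 20)  # 20 memories per session
    print(heavy, light)  # prints: 200 500
```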
Can agents from different Paperclip companies share memory?
Memory is scoped to your API key by default, so all agents under one key share a namespace. If you need isolation between companies, use separate API keys. If you want cross-company sharing, the Scale tier supports multi-namespace configurations.
What's the latency impact on agent responses?
Minimal. Memory recall averages under 100ms, and stores are fire-and-forget (async). Your agents won't feel slower — they'll actually feel faster because they spend less time being re-onboarded.
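If you want to check recall latency in your own environment, a small timing wrapper is enough. The sketch below uses only the standard library. `timed` is a hypothetical helper and `fake_recall` is a placeholder for whatever recall call you actually make.

```python
import time
from typing import Any, Callable


def timed(fn: Callable[..., Any], *args: Any, **kwargs: Any) -> tuple[Any, float]:
    """Call fn and return (result, elapsed milliseconds)."""
    start = time.perf_counter()
    result = fn(*args, **kwargs)
    elapsed_ms = (time.perf_counter() - start) * 1000
    return result, elapsed_ms


def fake_recall(query: str) -> list[str]:
    """Placeholder for a real recall call; replace with your own."""
    return [f"memory matching: {query}"]


if __name__ == "__main__":
    results, ms = timed(fake_recall, "current priorities")
    print(f"{len(results)} result(s) in {ms:.2f} ms")
```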
Is my company's data secure?
All data is encrypted in transit (TLS 1.3) and at rest (AES-256). Enterprise plans include SOC 2 compliance, dedicated infrastructure, and custom data retention policies. See our documentation for full security details.
Looking for more on multi-agent memory? Read our deep dive: Why Multi-Agent Systems Need a Memory Layer. Or check the full MCP integration guide for advanced configuration options.