March 28, 2026 < 8 min read · Paperclip Integration

How to Add Persistent Memory to Your Paperclip AI Company

Paperclip has quickly become one of the most popular ways to orchestrate AI agents. With 36K+ GitHub stars and a thriving community, this open-source tool lets you spin up an entire AI-powered company — CEO, marketer, designer, engineer — all working together using Claude Code under the hood.

There's just one problem: every time you restart a session, your entire company gets collective amnesia.

Your AI CEO forgets the product roadmap it spent three hours refining. Your marketer loses the brand voice guidelines it developed. Your designer can't remember which color palette the team agreed on yesterday. You're back to square one, re-explaining everything, every single time.

In this tutorial, we'll fix that. You'll add persistent memory to every agent in your Paperclip company using 0Latency, so your AI employees remember everything across sessions — and can even share knowledge with each other.

The Memory Problem in Multi-Agent Orchestration

Paperclip manages multiple AI agents as employees within a simulated company structure. Each agent has a role, a set of responsibilities, and the ability to communicate with other agents. It's a powerful model — until you hit the context window wall.

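Without persistent memory, every restart discards what each agent learned in earlier sessions: the CEO's roadmap, the marketer's brand voice guidelines, and the designer's agreed palette all have to be explained again.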

This isn't a Paperclip-specific limitation. It's a fundamental challenge in multi-agent orchestration. But with the right memory layer, it's completely solvable.

How 0Latency Solves This

0Latency is an API-first memory service built specifically for AI agents. Instead of cramming memory into prompts or files, your agents store and retrieve memories through API calls — with automatic entity extraction, relationship mapping, and semantic search built in.

For Paperclip companies, this means each agent keeps its context across sessions, and any agent can recall what another agent stored.

Setup: Adding 0Latency to Your Paperclip Company

There are two ways to integrate: via the MCP (Model Context Protocol) server or the Python SDK. We'll cover both.

Step 1: Get Your API Key

Sign up at 0latency.ai and grab your API key. The free tier gives you 10,000 memories and 3 agents, which is enough to get a small Paperclip company running.

Step 2: Configure MCP for Each Agent

Paperclip uses Claude Code for its agents, which supports MCP servers natively. Add the 0Latency MCP server to each agent's configuration:

{
  "mcpServers": {
    "0latency-memory": {
      "command": "npx",
      "args": ["-y", "@0latency/mcp-server"],
      "env": {
        "ZERO_LATENCY_API_KEY": "your-api-key-here",
        "ZEROLAT_AGENT_ID": "paperclip-ceo"
      }
    }
  }
}

Each agent gets its own ZEROLAT_AGENT_ID. This lets 0Latency track who stored what, while still allowing cross-agent recall. Here's a typical Paperclip company setup:

// Agent IDs for a standard Paperclip company
"paperclip-ceo"        // Strategic decisions, roadmap, priorities
"paperclip-marketer"   // Brand voice, campaigns, audience insights
"paperclip-designer"   // Design system, assets, style guidelines
"paperclip-engineer"   // Architecture decisions, tech stack, code patterns
"paperclip-analyst"    // Metrics, reports, market research
💡 Tip: Use consistent, descriptive agent IDs. They appear in the 0Latency dashboard and make it easy to trace which agent stored a particular memory.
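
Since only the agent ID changes between agents, the per-agent configs can be generated rather than written by hand. The following is a minimal sketch: it reuses the server name, package, and environment variable names from the JSON in Step 2, while the helper names `build_mcp_config` and `build_all_configs` are hypothetical. Where each config file lives depends on your Paperclip setup, so the sketch only prints the result.

```python
import json

# Agent IDs taken from the company setup listed above.
AGENT_IDS = [
    "paperclip-ceo",
    "paperclip-marketer",
    "paperclip-designer",
    "paperclip-engineer",
    "paperclip-analyst",
]


def build_mcp_config(agent_id: str, api_key: str) -> dict:
    """Return the MCP server block for one agent, mirroring the Step 2 JSON."""
    return {
        "mcpServers": {
            "0latency-memory": {
                "command": "npx",
                "args": ["-y", "@0latency/mcp-server"],
                "env": {
                    "ZERO_LATENCY_API_KEY": api_key,
                    "ZEROLAT_AGENT_ID": agent_id,
                },
            }
        }
    }


def build_all_configs(api_key: str) -> dict:
    """Map each agent ID to its own config; only the agent ID differs."""
    return {agent_id: build_mcp_config(agent_id, api_key) for agent_id in AGENT_IDS}


if __name__ == "__main__":
    configs = build_all_configs(api_key="your-api-key-here")
    # Print one config as JSON; writing it to disk is left to your setup.
    print(json.dumps(configs["paperclip-designer"], indent=2))
```

Generating the configs from one list keeps the agent IDs consistent with whatever you later pass to the SDK.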

Step 3: Using Memory in Agent Prompts

Once MCP is configured, your agents get two new tools: memory_store and memory_recall. Add instructions to each agent's system prompt telling them when to use memory:

# Add to each Paperclip agent's system prompt

## Memory Protocol
- At the START of each session, recall relevant context:
  memory_recall("current priorities and recent decisions")
- After making any significant decision, store it:
  memory_store("Decided to pivot landing page to focus on enterprise")
- Before starting work, check what was done previously:
  memory_recall("last session progress on [current task]")
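
If you assemble system prompts in code, the protocol can be appended programmatically so every agent gets the same wording. This is a sketch under assumptions: `with_memory_protocol` is a hypothetical helper, and how Paperclip loads system prompts is not covered here. The tool names `memory_store` and `memory_recall` are the ones introduced above.

```python
# Template for the memory protocol; {task} is filled in per agent.
MEMORY_PROTOCOL = """## Memory Protocol
- At the START of each session, recall relevant context:
  memory_recall("current priorities and recent decisions")
- After making any significant decision, store it with memory_store.
- Before starting work, check what was done previously:
  memory_recall("last session progress on {task}")
"""


def with_memory_protocol(base_prompt: str, task: str) -> str:
    """Append the memory protocol to an existing system prompt."""
    return base_prompt.rstrip() + "\n\n" + MEMORY_PROTOCOL.format(task=task)


if __name__ == "__main__":
    prompt = with_memory_protocol(
        "You are the marketer for this company.",
        task="landing page copy",
    )
    print(prompt)
```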

Step 4: Python SDK for Custom Workflows

If you're extending Paperclip with custom Python scripts (common for data pipelines, report generation, or tool integration), use the Python SDK directly:

from zerolat import ZeroLatency

# Initialize with your API key
zl = ZeroLatency(api_key="your-api-key-here")

# Store a memory from the CEO agent
zl.store(
    text="Q2 strategy: Focus on enterprise sales. Target 50 demos by June.",
    agent_id="paperclip-ceo",
    metadata={"category": "strategy", "quarter": "Q2-2026"}
)

# Recall memories across ALL agents (shared knowledge)
results = zl.recall(
    query="What is our current sales strategy?",
    agent_id="paperclip-marketer",  # Who's asking
    cross_agent=True                # Search all agents' memories
)

for memory in results:
    print(f"[{memory.agent_id}]: {memory.text}")
    # [paperclip-ceo]: Q2 strategy: Focus on enterprise sales...

The cross_agent=True parameter is what makes this powerful for Paperclip. It lets any agent recall memories stored by any other agent in the same company, creating a true shared knowledge base.
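
To make the scoping behavior concrete without calling the real service, here is a toy in-memory stand-in. It is not the 0Latency SDK: `LocalMemoryStore` and `Memory` are hypothetical names, and the matching is naive keyword overlap rather than semantic search. It only illustrates how a `cross_agent` flag could widen recall from one agent's memories to everyone's.

```python
from dataclasses import dataclass, field


@dataclass
class Memory:
    agent_id: str
    text: str


def _words(text: str) -> set:
    """Lowercase the text and strip basic punctuation from each word."""
    return {w.strip("?.,:").lower() for w in text.split()}


@dataclass
class LocalMemoryStore:
    """Toy stand-in for a shared memory service (assumption, not the SDK)."""

    memories: list = field(default_factory=list)

    def store(self, text: str, agent_id: str) -> None:
        self.memories.append(Memory(agent_id=agent_id, text=text))

    def recall(self, query: str, agent_id: str, cross_agent: bool = False) -> list:
        hits = []
        for memory in self.memories:
            # Without cross_agent, an agent only sees its own memories.
            if not cross_agent and memory.agent_id != agent_id:
                continue
            # Keyword overlap stands in for semantic search here.
            if _words(query) & _words(memory.text):
                hits.append(memory)
        return hits


if __name__ == "__main__":
    store = LocalMemoryStore()
    store.store("Q2 strategy: Focus on enterprise sales.", agent_id="paperclip-ceo")

    own = store.recall("sales strategy", agent_id="paperclip-marketer")
    shared = store.recall("sales strategy", agent_id="paperclip-marketer", cross_agent=True)
    print(len(own), len(shared))
```

In this sketch the marketer finds nothing in its own memories but sees the CEO's entry once `cross_agent=True` is set.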

How Shared Memory Transforms Agent Coordination

Without shared memory, Paperclip agents coordinate through explicit message passing — the CEO has to directly tell the marketer about a new strategy, who then has to directly tell the designer to update the brand assets. It's a game of telephone that breaks down fast.

With 0Latency shared memory, coordination becomes implicit:

Scenario: Product Launch

  1. The CEO agent decides on a launch date and stores it: "Product launch scheduled for April 15. Key message: fastest AI memory on the market."
  2. The marketer agent starts a new session, recalls recent decisions, and immediately knows the launch date and positioning — no one had to tell it.
  3. The designer agent recalls the key message and generates launch graphics that match the positioning.
  4. The engineer agent recalls the launch date and prioritizes the feature freeze accordingly.

Each agent independently pulls the context it needs. No message chains. No re-explaining. No lost context.

Entity Graph in Action

0Latency automatically extracts entities and relationships from stored memories. After a few sessions, your Paperclip company builds up a rich knowledge graph:

// Automatically extracted from agent memories:
{
  "entities": [
    {"name": "Enterprise Sales Push", "type": "initiative", "confidence": 0.95},
    {"name": "April 15 Launch", "type": "event", "confidence": 0.98},
    {"name": "Acme Corp", "type": "company", "confidence": 0.92}
  ],
  "relationships": [
    {"from": "Enterprise Sales Push", "to": "April 15 Launch", "type": "drives"},
    {"from": "Acme Corp", "to": "Enterprise Sales Push", "type": "target_customer"}
  ]
}

When an agent recalls "enterprise strategy," it doesn't just get the text — it gets the full network of connected entities. The marketer asking about Acme Corp automatically surfaces the launch timeline and sales strategy that relate to it.
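
One way to picture how related entities surface is a breadth-first walk over the extracted relationships. The sketch below runs on the example JSON above. How 0Latency actually traverses its graph is not described in this post, so treat `connected_entities` as a hypothetical illustration that ignores edge direction.

```python
from collections import deque

# The entity graph from the example above, trimmed to the fields used here.
GRAPH = {
    "entities": [
        {"name": "Enterprise Sales Push", "type": "initiative"},
        {"name": "April 15 Launch", "type": "event"},
        {"name": "Acme Corp", "type": "company"},
    ],
    "relationships": [
        {"from": "Enterprise Sales Push", "to": "April 15 Launch", "type": "drives"},
        {"from": "Acme Corp", "to": "Enterprise Sales Push", "type": "target_customer"},
    ],
}


def connected_entities(graph: dict, start: str) -> list:
    """Return entities reachable from start, nearest first, ignoring edge direction."""
    neighbors = {}
    for rel in graph["relationships"]:
        neighbors.setdefault(rel["from"], []).append(rel["to"])
        neighbors.setdefault(rel["to"], []).append(rel["from"])

    seen = {start}
    order = []
    queue = deque([start])
    while queue:
        current = queue.popleft()
        for nxt in neighbors.get(current, []):
            if nxt not in seen:
                seen.add(nxt)
                order.append(nxt)
                queue.append(nxt)
    return order


if __name__ == "__main__":
    print(connected_entities(GRAPH, "Acme Corp"))
    # ['Enterprise Sales Push', 'April 15 Launch']
```

Starting from Acme Corp, the walk reaches the sales initiative first and the launch event second, which matches the marketer example in the paragraph above.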

Production Tips

Ready to give your Paperclip company a memory?

Free tier includes 10,000 memories and 3 agents. No credit card required.

FAQ

Does 0Latency work with all Paperclip agent types?

Yes. Any Paperclip agent that uses Claude Code (which is all of them by default) can use the MCP integration. Custom agents using the Python SDK work with any LLM backend.

How many memories does a typical Paperclip session generate?

A typical session generates 20-50 memories. On the Free tier (10,000 memories), that's roughly 200 to 500 full sessions before you'd need to upgrade, depending on how much each session stores. Most teams use the Pro plan for unlimited peace of mind.
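
The session estimate is simple division, and you can rerun it with your own numbers. The figures below come from this answer; `sessions_until_limit` is a hypothetical helper name.

```python
FREE_TIER_MEMORIES = 10_000  # Free tier limit quoted above


def sessions_until_limit(memories_per_session: int, limit: int = FREE_TIER_MEMORIES) -> int:
    """Whole sessions that fit under the memory limit."""
    return limit // memories_per_session


if __name__ == "__main__":
    heavy = sessions_until_limit(50)  # sessions at the high end of 20-50
    light = sessions_until_limit(20)  # sessions at the low end of 20-50
    print(f"{heavy} to {light} sessions")  # 200 to 500 sessions
```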

Can agents from different Paperclip companies share memory?

Memory is scoped to your API key by default, so all agents under one key share a namespace. If you need isolation between companies, use separate API keys. If you want cross-company sharing, the Scale tier supports multi-namespace configurations.

What's the latency impact on agent responses?

Minimal. Memory recall averages under 100ms, and stores are fire-and-forget (async). Your agents won't feel slower — they'll actually feel faster because they spend less time being re-onboarded.
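
To show what fire-and-forget means in practice, here is a sketch using Python's standard thread pool. How the 0Latency SDK implements its async stores is not documented here, so `slow_store` is a hypothetical blocking call with simulated latency, and `STORED` stands in for the remote service.

```python
import time
from concurrent.futures import Future, ThreadPoolExecutor

STORED: list = []  # stands in for the remote memory service


def slow_store(text: str) -> None:
    """Hypothetical blocking store call; the sleep simulates network latency."""
    time.sleep(0.05)
    STORED.append(text)


def store_in_background(executor: ThreadPoolExecutor, text: str) -> Future:
    """Queue the store on a worker thread and return without waiting for it."""
    return executor.submit(slow_store, text)


if __name__ == "__main__":
    with ThreadPoolExecutor(max_workers=2) as executor:
        started = time.perf_counter()
        store_in_background(executor, "Decided to ship the enterprise landing page")
        queued_ms = (time.perf_counter() - started) * 1000
        print(f"store queued in {queued_ms:.1f} ms; the agent keeps working")
    # Leaving the with-block waits for pending stores before continuing.
    print(STORED)
```

The trade-off with this pattern is that a failed store goes unnoticed unless you keep the returned `Future` and check it later.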

Is my company's data secure?

All data is encrypted in transit (TLS 1.3) and at rest (AES-256). Enterprise plans include SOC 2 compliance, dedicated infrastructure, and custom data retention policies. See our documentation for full security details.

Looking for more on multi-agent memory? Read our deep dive: Why Multi-Agent Systems Need a Memory Layer. Or check the full MCP integration guide for advanced configuration options.