Agent Memory Systems Compared: Honcho vs Zep vs Mem0 vs Cognee
Agent memory is the difference between a chatbot that forgets everything and an assistant that knows your preferences, remembers past decisions, and builds context over time. I spent a day auditing the landscape — Honcho, Zep, Mem0, Cognee, Letta, and a few others — to figure out what actually works for a self-hosted setup. Here's what I found.
The Landscape
There are roughly three tiers of agent memory systems right now:
- Full-stack dialectic systems — Honcho, Zep+Graphiti, Mem0
- Graph-heavy knowledge systems — Cognee
- Lightweight / research-grade — Letta (MemGPT), Holographic, Hindsight
I evaluated them on three axes: vector search (finding relevant past context), graph relationships (connecting facts across sessions), and extraction method (how observations get created from raw messages).
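To make the first axis concrete, here is a minimal sketch of vector search as cosine-similarity ranking. The three-dimensional "embeddings" and the stored texts are made up for illustration. Real systems use an embedding model and an indexed vector store instead of a linear scan.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k(query_vec, memories, k=2):
    """Return the k stored texts whose embeddings are closest to the query."""
    scored = [(cosine_similarity(query_vec, vec), text) for text, vec in memories]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [text for _, text in scored[:k]]


# Toy 3-dimensional vectors; real models emit hundreds or thousands of dimensions.
MEMORIES = [
    ("User prefers LanceDB", [0.9, 0.1, 0.0]),
    ("User dislikes long meetings", [0.0, 0.2, 0.9]),
    ("User switched off PGVector", [0.8, 0.3, 0.1]),
]

results = top_k([1.0, 0.0, 0.0], MEMORIES, k=2)
print(results)
```

The two vector-store memories rank first because their vectors point in nearly the same direction as the query. The unrelated one scores zero.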
Honcho: The Dialectic Layer
Honcho is what I'm running now. It's a dialectic memory system — meaning it doesn't just store messages, it reasons about them. Messages go in, a "deriver" processes them into observations (facts about the user, facts about the agent, facts about the relationship), and a "dialectic" layer answers questions by synthesizing those observations.
Architecture:
- API container — handles chat, sessions, messages
- Deriver container — background worker that extracts observations from messages
- PostgreSQL — relational data (sessions, messages, observations)
- Vector store — LanceDB or PGVector for semantic search
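The message-to-observation-to-answer flow can be sketched in a few lines. This is not Honcho's code or API: `derive`, `dialectic`, and `Observation` are hypothetical names, and simple keyword rules stand in for the LLM calls a real deriver and dialectic layer would make.

```python
from dataclasses import dataclass


@dataclass
class Observation:
    subject: str  # "user", "agent", or "relationship"
    fact: str


def derive(messages):
    """Stand-in deriver: a real one would call an LLM; this uses one keyword rule."""
    prefix = "i prefer "
    observations = []
    for msg in messages:
        text = msg.strip()
        if text.lower().startswith(prefix):
            fact = "prefers " + text[len(prefix):].rstrip(".")
            observations.append(Observation("user", fact))
    return observations


def dialectic(question, observations):
    """Stand-in dialectic: return stored facts that share a word with the question."""
    words = {w.strip("?.,").lower() for w in question.split()}
    hits = [o.fact for o in observations if words & set(o.fact.lower().split())]
    return "; ".join(hits) if hits else "no observations yet"


observations = derive(["I prefer LanceDB.", "What's the weather?"])
answer = dialectic("What do we know about LanceDB?", observations)
print(answer)
```

The point of the split is that the question is answered from derived observations, not from the raw message log.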
What works well:
- Self-hosted, fully open source
- Multi-agent support (one Honcho instance, many agents)
- Dialectic responses are genuinely useful — it answers "what do we know about X?" with synthesized facts
- Flexible vector store (I switched from PGVector to LanceDB when I hit dimension constraints)
What doesn't:
- No graph layer — observations are isolated facts, not a connected knowledge graph
- Deriver configuration is finicky: `FLUSH_ENABLED` defaults to false for cost optimization, which means in low-volume personal use, observations never get created
- Hardcoded PGVector dimensions (1536): forces OpenAI embeddings unless you patch or switch stores
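The flush gotcha is easier to see with a toy model. The sketch below assumes the deriver waits for a batch to fill before processing unless flushing is enabled. That batching mechanism, the `BatchingDeriver` class, and the `batch_threshold` parameter are my assumptions for illustration, not Honcho's actual implementation.

```python
class BatchingDeriver:
    """Toy model of a batching worker; not Honcho's actual implementation."""

    def __init__(self, batch_threshold, flush_enabled=False):
        # Hypothetical knob: how many messages must queue up before a batch runs.
        self.batch_threshold = batch_threshold
        self.flush_enabled = flush_enabled
        self.pending = []
        self.observations = []

    def enqueue(self, message):
        self.pending.append(message)
        if self.flush_enabled or len(self.pending) >= self.batch_threshold:
            self._process()

    def _process(self):
        # A real deriver would call an LLM here; we record one entry per message.
        self.observations.extend(f"observation from: {m}" for m in self.pending)
        self.pending.clear()


messages = ["hi", "I switched to LanceDB"]

quiet = BatchingDeriver(batch_threshold=50)  # flush disabled, low volume
eager = BatchingDeriver(batch_threshold=50, flush_enabled=True)
for m in messages:
    quiet.enqueue(m)
    eager.enqueue(m)

print(len(quiet.observations), len(quiet.pending))
print(len(eager.observations), len(eager.pending))
```

With flushing off, two messages never reach the threshold, so they sit in the queue and no observations appear. That matches the symptom in a low-volume personal setup.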
Zep + Graphiti: Temporal Memory with Graphs
Zep is the closest competitor to Honcho. It also does dialectic memory, but adds temporal reasoning — it knows that "I was using PGVector" happened before "I switched to LanceDB."
Graphiti is Zep's open-source graph layer. It builds a knowledge graph from messages, connecting entities and tracking how relationships change over time.
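A rough sketch of what "temporal" means here: each fact is an edge with a validity interval, and a new fact closes the old one instead of overwriting it. This is not Graphiti's API. `TemporalEdge`, `record_fact`, and the dates are hypothetical, chosen only to mirror the PGVector-to-LanceDB example.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class TemporalEdge:
    subject: str
    relation: str
    obj: str
    valid_from: date
    valid_to: Optional[date] = None  # None means the fact still holds


def record_fact(edges, subject, relation, obj, when):
    """Close any open edge for (subject, relation), then record the new one."""
    for edge in edges:
        if edge.subject == subject and edge.relation == relation and edge.valid_to is None:
            edge.valid_to = when
    edges.append(TemporalEdge(subject, relation, obj, when))


def current_value(edges, subject, relation):
    """Return the object of the edge that is still open, if any."""
    for edge in edges:
        if edge.subject == subject and edge.relation == relation and edge.valid_to is None:
            return edge.obj
    return None


def when_changed_to(edges, subject, relation, obj):
    """Return the date a given value first became valid, if it was ever recorded."""
    for edge in edges:
        if (edge.subject, edge.relation, edge.obj) == (subject, relation, obj):
            return edge.valid_from
    return None


edges = []
record_fact(edges, "me", "uses_vector_store", "PGVector", date(2024, 1, 10))  # made-up date
record_fact(edges, "me", "uses_vector_store", "LanceDB", date(2024, 3, 2))    # made-up date

print(current_value(edges, "me", "uses_vector_store"))
print(when_changed_to(edges, "me", "uses_vector_store", "LanceDB"))
```

Because the old edge is kept with an end date, both "what do I use now?" and "when did I switch?" stay answerable.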
Where Zep wins:
- Temporal awareness — "When did we switch embedding models?" is answerable
- Graph relationships — entities are connected, not isolated
- Better documented, more mature SaaS offering
Where it loses:
- Graphiti is a separate component — more moving parts
- Zep's self-hosted path is less clear than Honcho's Docker Compose setup
- More complex to configure for simple use cases
Mem0: The Paid Multi-User Option
Mem0 is the most polished product in this space. It does user-specific memory, relationship tracking, and has a clean API. But it's $249/month for production use.
What you get:
- Multi-user memory with relationship tracking
- Managed infrastructure, no Docker wrangling
- Good developer experience and documentation
What you don't:
- Self-hosting — it's SaaS-only
- Cost scales with users
- Less control over extraction logic
Mem0 makes sense if you're building a product with thousands of users. For a personal agent setup, it's overkill and overpriced.
Cognee: Document-First Knowledge Graphs
Cognee is different. Instead of extracting observations from chat messages, it ingests documents — PDFs, codebases, notes — and builds a knowledge graph from them.
Key features:
- Tool call capture — records what tools the agent used and their outputs
- Graph as primary — not an add-on, it's the core data model
- Designed for RAG (Retrieval-Augmented Generation) over large document corpora
When to use it:
- You have 100+ PDFs or a large codebase to reason over
- You need entity relationships extracted from documents, not conversations
- Your agent's primary job is research or analysis, not chat
Cognee doesn't replace conversational memory — it complements it. I could see running both: Honcho for session context, Cognee for document knowledge.
Letta (MemGPT): Research-Grade Control
Letta (formerly MemGPT) is the most academically rigorous option. It gives you full control over the memory hierarchy — working memory, recall memory, archival memory — with explicit management functions.
It's powerful but complex. You define when to page data in and out, how to prioritize what to keep, and how to structure the memory tiers. This is great for research or custom architectures, but it's overkill for "remember what I told you yesterday."
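To give a feel for what "paging data in and out" involves, here is a toy memory with a bounded working set and an unbounded archive. It collapses Letta's three tiers into two for brevity and does not use Letta's API. `TieredMemory`, `remember`, and `page_in` are hypothetical names.

```python
from collections import deque


class TieredMemory:
    """Toy two-tier memory: a bounded working set plus an unbounded archive."""

    def __init__(self, working_capacity):
        self.working = deque()
        self.archive = []
        self.working_capacity = working_capacity

    def remember(self, item):
        self.working.append(item)
        while len(self.working) > self.working_capacity:
            # Page out the oldest item when the working set overflows.
            self.archive.append(self.working.popleft())

    def page_in(self, keyword):
        """Move archived items containing keyword back into working memory."""
        matches = [item for item in self.archive if keyword.lower() in item.lower()]
        self.archive = [item for item in self.archive if item not in matches]
        for item in matches:
            self.remember(item)
        return matches


mem = TieredMemory(working_capacity=2)
for note in ["a: uses LanceDB", "b: likes tea", "c: works nights"]:
    mem.remember(note)

paged = mem.page_in("lancedb")
print(list(mem.working), mem.archive)
```

Even this tiny version forces policy decisions (evict oldest first, match by substring). That is exactly the control, and the burden, that a managed hierarchy hands you.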
Holographic / Hindsight: Zero-Infra Options
These are ultra-lightweight alternatives — essentially clever prompt engineering with minimal persistence. They store compressed memory in the prompt itself or use tiny local databases.
Good for:
- Prototyping without infrastructure
- Edge deployments where you can't run PostgreSQL
- When you need some memory but not sophisticated extraction
Not suitable for long-term, multi-session context.
The Comparison
| System | Vector Search | Graph | Extraction | Self-Host | Best For |
|---|---|---|---|---|---|
| Honcho | ✅ | ❌ | Deriver (dialectic) | ✅ Easy | Daily agent memory, multi-agent |
| Zep + Graphiti | ✅ | ✅ | Temporal + graph | ⚠️ Complex | Temporal reasoning, relationship tracking |
| Mem0 | ✅ | ✅ | Automatic | ❌ SaaS | Multi-user products |
| Cognee | ✅ | ✅ (primary) | Document ingestion | ✅ | Document-heavy projects, RAG |
| Letta | ✅ | ⚠️ Manual | Controlled hierarchy | ✅ | Research, custom architectures |
| Holographic / Hindsight | ❌ | ❌ | Prompt compression | ✅ (none needed) | Prototyping, edge |
My Setup: Honcho + Future Cognee
I'm keeping Honcho as my base. It handles daily agent memory well, the dialectic layer is genuinely useful, and the self-hosted setup is straightforward once you know the gotchas (deriver flush, volume mounts, dimension config).
My planned expansion:
- Honcho — daily conversational memory, agent preferences, session context
- Cognee — document knowledge graph (when I start ingesting project docs)
- Zep + Graphiti — temporal tracking (if I need "when did we..." queries)
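If all three layers end up running side by side, something has to decide which one a query goes to. A minimal sketch of that routing follows, using crude keyword rules I made up for illustration. The `route` function and its backend labels are hypothetical, and a real router would more likely use a classifier or let the agent choose.

```python
def route(query):
    """Pick a memory backend by crude keyword rules (illustrative only)."""
    q = query.lower()
    if q.startswith("when ") or " when did " in q:
        return "zep+graphiti"  # temporal questions
    if any(word in q for word in ("document", "pdf", "codebase", "docs")):
        return "cognee"  # document knowledge
    return "honcho"  # default: conversational memory


for question in (
    "When did we switch embedding models?",
    "What does the design PDF say about auth?",
    "What editor do I prefer?",
):
    print(question, "->", route(question))
```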
Mem0 is off the table due to cost. Letta is interesting but too complex for now. Holographic is a fallback if I need something lighter.
Key Takeaways
- Honcho is the best self-hosted dialectic memory right now. It's not perfect (no graph, finicky deriver), but it works and improves with every session.
- Zep wins on temporal + graph. If you need to track how relationships change over time, Graphiti is the feature Honcho lacks.
- Mem0 is polished but expensive. $249/mo is hard to justify for personal use.
- Cognee is for documents, not chat. Different use case, potentially complementary.
- Letta is for researchers. Full control means full complexity.
- Your embedding model choice matters. 1024-dim models (bge-m3) are cheaper and faster, but not every vector store setup accepts arbitrary dimensions; Honcho's PGVector schema, for example, is hardcoded to 1536.
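A cheap guard against the dimension problem is to check vector length before writing to a fixed-width column. The sketch below uses the 1536 figure from the PGVector note earlier and bge-m3's 1024-dim output. `check_dimensions` is a hypothetical helper, not part of Honcho or any vector store client.

```python
def check_dimensions(embedding, column_dim):
    """Raise early if an embedding will not fit a fixed-width vector column."""
    if len(embedding) != column_dim:
        raise ValueError(
            f"embedding has {len(embedding)} dimensions, column expects {column_dim}"
        )
    return embedding


PGVECTOR_COLUMN_DIM = 1536  # the hardcoded size noted earlier
BGE_M3_DIM = 1024           # bge-m3 output size

try:
    check_dimensions([0.0] * BGE_M3_DIM, PGVECTOR_COLUMN_DIM)
    mismatch = None
except ValueError as exc:
    mismatch = str(exc)

print(mismatch)
```

Failing at write time with a clear message beats discovering the mismatch as a database error deep inside a background worker.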
Memory systems are still early. Honcho v1 is rough around the edges, but the dialectic approach — reasoning about observations rather than just retrieving messages — is the right direction. The next 12 months will likely see graph layers added, better extraction pipelines, and more mature self-hosted options. For now, Honcho plus a document layer (Cognee) covers most agent memory needs.