Agent Memory Systems Compared: Honcho vs Zep vs Mem0 vs Cognee

Honcho May 12, 2026

Agent memory is the difference between a chatbot that forgets everything and an assistant that knows your preferences, remembers past decisions, and builds context over time. I spent a day auditing the landscape — Honcho, Zep, Mem0, Cognee, Letta, and a few others — to figure out what actually works for a self-hosted setup. Here's what I found.

The Landscape

There are roughly three tiers of agent memory systems right now:

Full-stack dialectic systems — Honcho, Zep+Graphiti, Mem0
Graph-heavy knowledge systems — Cognee
Lightweight / research-grade — Letta (MemGPT), Holographic, Hindsight

I evaluated them on three axes: vector search (finding relevant past context), graph relationships (connecting facts across sessions), and extraction method (how observations get created from raw messages).

Honcho: The Dialectic Layer

Honcho is what I'm running now. It's a dialectic memory system — meaning it doesn't just store messages, it reasons about them. Messages go in, a "deriver" processes them into observations (facts about the user, facts about the agent, facts about the relationship), and a "dialectic" layer answers questions by synthesizing those observations.

Architecture:

API container — handles chat, sessions, messages
Deriver container — background worker that extracts observations from messages
PostgreSQL — relational data (sessions, messages, observations)
Vector store — LanceDB or PGVector for semantic search
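To make the message-to-observation-to-answer flow concrete, here is a toy sketch of the pipeline described above. It is not Honcho's API: `Observation`, `derive`, and `dialectic` are hypothetical names, and simple string rules stand in for the LLM calls a real deriver and dialectic layer would make.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Observation:
    subject: str  # "user", "agent", or "relationship"
    fact: str


def derive(messages: list[str]) -> list[Observation]:
    """Stand-in for the deriver: a real one would ask an LLM to extract facts."""
    observations = []
    for text in messages:
        lowered = text.lower()
        if lowered.startswith("i prefer "):
            observations.append(Observation("user", "prefers " + text[9:].rstrip(".")))
        elif lowered.startswith("i switched "):
            observations.append(Observation("user", "switched " + text[11:].rstrip(".")))
    return observations


def dialectic(question: str, observations: list[Observation]) -> str:
    """Stand-in for the dialectic layer: keyword overlap instead of LLM synthesis."""
    terms = {w.strip("?.,").lower() for w in question.split() if len(w) > 3}
    hits = [o for o in observations if terms & set(o.fact.lower().split())]
    if not hits:
        return "No observations match."
    return "; ".join(f"{o.subject} {o.fact}" for o in hits)


messages = [
    "I prefer LanceDB for vector search.",
    "What's the weather like?",
    "I switched from PGVector to LanceDB.",
]
observations = derive(messages)
answer = dialectic("What do we know about LanceDB?", observations)
print(answer)
```

The point of the split is that raw messages are never queried directly: the second message produces no observation at all, and the answer is assembled only from derived facts.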

What works well:

Self-hosted, fully open source
Multi-agent support (one Honcho instance, many agents)
Dialectic responses are genuinely useful — it answers "what do we know about X?" with synthesized facts
Flexible vector store (I switched from PGVector to LanceDB when I hit dimension constraints)

What doesn't:

No graph layer — observations are isolated facts, not a connected knowledge graph
Deriver configuration is finicky — FLUSH_ENABLED defaults to false for cost optimization, which means in low-volume personal use, observations never get created
Hardcoded PGVector dimensions (1536): this forces a 1536-dimension embedding model, which in practice means OpenAI embeddings, unless you patch or switch stores
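The flush problem is easier to see in a small model. The sketch below assumes the deriver waits for a full batch before doing any extraction unless flushing is enabled. That batching rule is my reading of the behavior, not Honcho's actual code, and `BatchingDeriver`, `batch_threshold`, and `flush_enabled` are hypothetical names.

```python
class BatchingDeriver:
    """Hypothetical worker that waits for a full batch before extracting."""

    def __init__(self, batch_threshold: int, flush_enabled: bool):
        self.batch_threshold = batch_threshold
        self.flush_enabled = flush_enabled
        self.pending: list[str] = []
        self.processed: list[str] = []

    def enqueue(self, message: str) -> None:
        self.pending.append(message)
        # With flushing on, process immediately; otherwise wait for a full batch.
        if self.flush_enabled or len(self.pending) >= self.batch_threshold:
            self.processed.extend(self.pending)
            self.pending.clear()


batched = BatchingDeriver(batch_threshold=50, flush_enabled=False)
flushing = BatchingDeriver(batch_threshold=50, flush_enabled=True)

for msg in ["hello", "I prefer LanceDB", "remind me tomorrow"]:
    batched.enqueue(msg)
    flushing.enqueue(msg)

print(len(batched.processed), len(flushing.processed))
```

With only a handful of messages per day, the batched worker never reaches its threshold, so nothing is ever extracted. That matches the symptom of an empty observation store in low-volume personal use.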

Zep + Graphiti: Temporal Memory with Graphs

Zep is the closest competitor to Honcho. It also does dialectic memory, but adds temporal reasoning — it knows that "I was using PGVector" happened before "I switched to LanceDB."

Graphiti is Zep's open-source graph layer. It builds a knowledge graph from messages, connecting entities and tracking how relationships change over time.
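A minimal sketch of what temporal edges buy you, assuming each fact carries a validity interval. This is not Graphiti's API: `Edge`, `TemporalGraph`, and the field names are hypothetical, and the dates are made up for illustration.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class Edge:
    subject: str
    relation: str
    obj: str
    valid_from: date
    invalid_from: Optional[date] = None  # None means the fact is still true


class TemporalGraph:
    def __init__(self):
        self.edges: list[Edge] = []

    def assert_fact(self, subject: str, relation: str, obj: str, when: date) -> None:
        # A new value for the same (subject, relation) closes the old edge
        # instead of deleting it, so history stays queryable.
        for edge in self.edges:
            if (edge.subject, edge.relation) == (subject, relation) and edge.invalid_from is None:
                edge.invalid_from = when
        self.edges.append(Edge(subject, relation, obj, when))

    def current(self, subject: str, relation: str) -> Optional[str]:
        for edge in self.edges:
            if (edge.subject, edge.relation) == (subject, relation) and edge.invalid_from is None:
                return edge.obj
        return None

    def as_of(self, subject: str, relation: str, when: date) -> Optional[str]:
        for edge in self.edges:
            if (edge.subject, edge.relation) != (subject, relation):
                continue
            if edge.valid_from <= when and (edge.invalid_from is None or when < edge.invalid_from):
                return edge.obj
        return None


graph = TemporalGraph()
graph.assert_fact("user", "uses_vector_store", "PGVector", date(2026, 3, 1))
graph.assert_fact("user", "uses_vector_store", "LanceDB", date(2026, 4, 15))

switched_on = graph.edges[-1].valid_from
print(graph.current("user", "uses_vector_store"), switched_on)
```

Because the PGVector edge is closed rather than overwritten, both "what do we use now?" and "what did we use on April 1?" have answers. A flat observation store keeps both facts but has no built-in way to order them.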

Where Zep wins:

Temporal awareness — "When did we switch embedding models?" is answerable
Graph relationships — entities are connected, not isolated
Better documented, more mature SaaS offering

Where it loses:

Graphiti is a separate component — more moving parts
Zep's self-hosted path is less clear than Honcho's Docker Compose setup
More complex to configure for simple use cases

Mem0: The Paid Multi-User Option

Mem0 is the most polished product in this space. It does user-specific memory, relationship tracking, and has a clean API. But it's $249/month for production use.

What you get:

Multi-user memory with relationship tracking
Managed infrastructure, no Docker wrangling
Good developer experience and documentation

What you don't:

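Self-hosting of the managed platform (Mem0 also publishes an open-source library, but the hosted product priced above is SaaS-only)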
Cost scales with users
Less control over extraction logic

Mem0 makes sense if you're building a product with thousands of users. For a personal agent setup, it's overkill and overpriced.

Cognee: Document-First Knowledge Graphs

Cognee is different. Instead of extracting observations from chat messages, it ingests documents — PDFs, codebases, notes — and builds a knowledge graph from them.

Key features:

Tool call capture — records what tools the agent used and their outputs
Graph as primary — not an add-on, it's the core data model
Designed for RAG (Retrieval-Augmented Generation) over large document corpora
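As a rough illustration of "graph as primary", the sketch below builds an entity graph from documents rather than from chat messages. It is not Cognee's API: `build_graph` and `KNOWN_ENTITIES` are hypothetical, and plain co-occurrence within a document stands in for the entity and relationship extraction a real pipeline would do.

```python
from collections import defaultdict
from itertools import combinations

# Illustrative vocabulary; a real system would discover entities itself.
KNOWN_ENTITIES = {"Honcho", "LanceDB", "PGVector", "PostgreSQL"}


def build_graph(documents: dict[str, str]) -> dict[str, set[str]]:
    """Link every pair of known entities that appear in the same document."""
    graph = defaultdict(set)
    for text in documents.values():
        words = {w.strip(".,;:()") for w in text.split()}
        found = sorted(KNOWN_ENTITIES & words)
        for a, b in combinations(found, 2):
            graph[a].add(b)
            graph[b].add(a)
    return dict(graph)


docs = {
    "setup.md": "Honcho stores sessions in PostgreSQL and vectors in LanceDB.",
    "migration.md": "We moved from PGVector to LanceDB after hitting dimension limits.",
}
graph = build_graph(docs)
print(sorted(graph["LanceDB"]))
```

Queries then walk edges instead of ranking text chunks: LanceDB connects to PGVector through one document and to Honcho and PostgreSQL through another, even though no single document mentions all four.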

When to use it:

You have 100+ PDFs or a large codebase to reason over
You need entity relationships extracted from documents, not conversations
Your agent's primary job is research or analysis, not chat

Cognee doesn't replace conversational memory — it complements it. I could see running both: Honcho for session context, Cognee for document knowledge.

Letta (MemGPT): Research-Grade Control

Letta (formerly MemGPT) is the most academically rigorous option. It gives you full control over the memory hierarchy — working memory, recall memory, archival memory — with explicit management functions.

It's powerful but complex. You define when to page data in and out, how to prioritize what to keep, and how to structure the memory tiers. This is great for research or custom architectures, but it's overkill for "remember what I told you yesterday."
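To show what "paging data in and out" means in practice, here is a two-tier simplification (working and archival, with recall memory left out for brevity). It is not Letta's API: `TieredMemory`, `remember`, and `page_in` are hypothetical names, and the oldest-first eviction policy is just one choice among many.

```python
from collections import deque


class TieredMemory:
    def __init__(self, working_capacity: int):
        self.working_capacity = working_capacity
        self.working: deque[str] = deque()  # what would sit in the prompt
        self.archival: list[str] = []       # everything paged out

    def remember(self, item: str) -> None:
        self.working.append(item)
        # Page the oldest items out once working memory is over capacity.
        while len(self.working) > self.working_capacity:
            self.archival.append(self.working.popleft())

    def page_in(self, keyword: str) -> list[str]:
        """Move matching archival items back into working memory."""
        matches = [item for item in self.archival if keyword.lower() in item.lower()]
        for item in matches:
            self.archival.remove(item)
            self.remember(item)
        return matches


mem = TieredMemory(working_capacity=2)
for note in ["user likes LanceDB", "meeting moved to Friday", "deploy uses Docker Compose"]:
    mem.remember(note)

recalled = mem.page_in("lancedb")
print(list(mem.working), mem.archival)
```

Every decision here (capacity, eviction order, what counts as a match) is one you would own in a system like this, which is where both the control and the complexity come from.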

Holographic / Hindsight: Zero-Infra Options

These are ultra-lightweight alternatives — essentially clever prompt engineering with minimal persistence. They store compressed memory in the prompt itself or use tiny local databases.

Good for:

Prototyping without infrastructure
Edge deployments where you can't run PostgreSQL
When you need some memory but not sophisticated extraction

Not suitable for long-term, multi-session context.

The Comparison

System	Vector Search	Graph	Extraction	Self-Host	Best For
Honcho	✅	❌	Deriver (dialectic)	✅ Easy	Daily agent memory, multi-agent
Zep + Graphiti	✅	✅	Temporal + graph	⚠️ Complex	Temporal reasoning, relationship tracking
Mem0	✅	✅	Automatic	❌ SaaS	Multi-user products
Cognee	✅	✅ (primary)	Document ingestion	✅	Document-heavy projects, RAG
Letta	✅	⚠️ Manual	Controlled hierarchy	✅	Research, custom architectures
Holographic / Hindsight	❌	❌	Prompt compression	✅ (none needed)	Prototyping, edge

My Setup: Honcho + Future Cognee

I'm keeping Honcho as my base. It handles daily agent memory well, the dialectic layer is genuinely useful, and the self-hosted setup is straightforward once you know the gotchas (deriver flush, volume mounts, dimension config).

My planned expansion:

Honcho — daily conversational memory, agent preferences, session context
Cognee — document knowledge graph (when I start ingesting project docs)
Zep + Graphiti — temporal tracking (if I need "when did we..." queries)

Mem0 is off the table due to cost. Letta is interesting but too complex for now. Holographic is a fallback if I need something lighter.

Key Takeaways

Honcho is the best self-hosted dialectic memory right now. It's not perfect (no graph, finicky deriver), but it works, and its answers get better as observations accumulate across sessions.
Zep wins on temporal + graph. If you need to track how relationships change over time, Graphiti is the feature Honcho lacks.
Mem0 is polished but expensive. $249/mo is hard to justify for personal use.
Cognee is for documents, not chat. Different use case, potentially complementary.
Letta is for researchers. Full control means full complexity.
Your embedding model choice matters. 1024-dim models (bge-m3) are cheaper to store and faster to search than 1536-dim ones, but not every vector store accepts arbitrary dimensions: Honcho's PGVector schema, for example, is fixed at 1536.
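The dimension point comes down to simple arithmetic plus a length check. The sketch below assumes float32 storage (4 bytes per value) and a store whose column is declared with a fixed size. `bytes_per_vector` and `check_dimensions` are hypothetical helpers, not part of any library mentioned here.

```python
def bytes_per_vector(dimensions: int, bytes_per_float: int = 4) -> int:
    """Raw storage for one embedding, ignoring index and row overhead."""
    return dimensions * bytes_per_float


def check_dimensions(embedding: list[float], column_dimensions: int) -> None:
    """Mimic a fixed-size vector column rejecting a mismatched embedding."""
    if len(embedding) != column_dimensions:
        raise ValueError(
            f"embedding has {len(embedding)} dimensions, column expects {column_dimensions}"
        )


small = bytes_per_vector(1024)  # 1024 * 4 = 4096 bytes
large = bytes_per_vector(1536)  # 1536 * 4 = 6144 bytes
savings = 1 - small / large     # fraction of raw storage saved

check_dimensions([0.0] * 1536, 1536)  # fits a 1536-dim column

try:
    check_dimensions([0.0] * 1024, 1536)  # a 1024-dim vector does not
    rejected = False
except ValueError:
    rejected = True

print(small, large, round(savings, 3), rejected)
```

A 1024-dim vector takes a third less raw storage than a 1536-dim one, but a column fixed at 1536 will reject it outright, so the cheaper model only helps if the store lets you choose the dimension.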

Memory systems are still early. Honcho v1 is rough around the edges, but the dialectic approach (reasoning about observations rather than just retrieving messages) is the right direction. The next 12 months will likely bring graph layers, better extraction pipelines, and more mature self-hosted options. For now, Honcho plus a document layer (Cognee) covers most agent memory needs.