Why I Switched from PGVector to LanceDB (And You Might Too)

LanceDB May 12, 2026

When I set up Honcho's vector store, I started with PGVector because it came bundled with the PostgreSQL container. Seemed logical — one less service to manage. Two hours later, I was switching to LanceDB. Here's what happened and why the choice between these two vector stores matters more than it first appears.

The Problem: Hardcoded Dimensions

Honcho's source code has a validation check for PGVector: exactly 1536 dimensions. No configuration option. No override. Just a hardcoded number buried in the source.

I was using baai/bge-m3 for embeddings via OpenRouter. It produces 1024-dimensional vectors. The moment Honcho tried to store them in PGVector, it threw:

Embedding dimension mismatch for openai:baai/bge-m3. Expected 1536, got 1024.
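To make the failure concrete, here is a minimal sketch of what a hardcoded check like this could look like, next to a configurable variant. This is not Honcho's actual code: `EXPECTED_DIM`, `check_hardcoded`, and `check_configured` are hypothetical names, and only the error wording is modeled on the message above.

```python
EXPECTED_DIM = 1536  # hypothetical constant standing in for the hardcoded value


def check_hardcoded(model: str, vector: list[float]) -> None:
    """Reject any vector whose length differs from the fixed constant."""
    if len(vector) != EXPECTED_DIM:
        raise ValueError(
            f"Embedding dimension mismatch for {model}. "
            f"Expected {EXPECTED_DIM}, got {len(vector)}."
        )


def check_configured(model: str, vector: list[float], expected_dim: int) -> None:
    """Same check, but the expected size is supplied by the caller (e.g. from config)."""
    if len(vector) != expected_dim:
        raise ValueError(
            f"Embedding dimension mismatch for {model}. "
            f"Expected {expected_dim}, got {len(vector)}."
        )


bge_vector = [0.0] * 1024  # stand-in for a baai/bge-m3 embedding

# The configurable check accepts the 1024-dim vector.
check_configured("openai:baai/bge-m3", bge_vector, expected_dim=1024)

# The hardcoded check rejects it, whatever the model.
try:
    check_hardcoded("openai:baai/bge-m3", bge_vector)
except ValueError as exc:
    print(exc)
```

The only difference between the two functions is where the expected size comes from, which is the whole problem: with a constant, no config change can make a 1024-dim model pass.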

My options were:

1. Switch to an embedding model that outputs 1536 dimensions (OpenAI's text-embedding-3-small)
2. Patch Honcho's source code
3. Switch to a vector store without dimension constraints

Option 1 was out — OpenAI embeddings are blocked on OpenRouter ("No allowed providers available"). Option 2 meant maintaining a fork. I went with option 3.

LanceDB: The Flexible Alternative

LanceDB is an embedded, file-based vector store. There is no server to run and, in Honcho, no hardcoded dimension: you point it at a directory and tell it what vector size to expect.

Switching took one short section in config.toml:

[vector_store]
TYPE = "lancedb"
DIMENSIONS = 1024
URI = "/app/lancedb_data"

That's it. No schema migrations. No hardcoded dimension check. It just works.

Comparing the Two

Aspect	PGVector	LanceDB
Type	PostgreSQL extension	File-based (Lance columnar format, built on Apache Arrow)
Dimensions	Fixed at 1536 (in Honcho)	Any — configured per setup
Server required	Yes (PostgreSQL)	No
Persistence	PostgreSQL volume	Files on disk
Swap effort	Schema changes	Change TYPE in config

The Permission Trap

Switching to LanceDB introduced a new problem: permissions. I initially used a bind mount:

volumes:
  - ./lancedb_data:/app/lancedb_data

The container runs as a non-root user. The host directory is owned by my local user. LanceDB couldn't write. I got silent failures — no errors in logs, just no vectors stored.

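The fix was switching to a named Docker volume. The service mounts it as lancedb_data:/app/lancedb_data instead of the ./lancedb_data bind path, and the volume is declared at the top level of the compose file: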

volumes:
  lancedb_data:

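Docker manages the volume's storage itself. When a new, empty named volume is first mounted, Docker copies the contents and ownership of the image's directory at that path into it, so as long as the image sets up /app/lancedb_data for its non-root user, that user can write to the volume. Problem solved.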
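Because the failure was silent, a startup probe that actually tries to write to the data directory can surface the problem early. The sketch below is one way to do that under my own assumptions, not something Honcho or LanceDB provides; `assert_writable` is a hypothetical helper, and `os.getuid()` makes it Unix-only, which is fine for a Linux container.

```python
import os
import tempfile


def assert_writable(directory: str) -> None:
    """Raise if a file cannot be created inside `directory`."""
    if not os.path.isdir(directory):
        raise FileNotFoundError(f"{directory} does not exist or is not a directory")
    try:
        # Creating and removing a real file exercises the same operation
        # a file-based store needs, rather than inspecting mode bits.
        with tempfile.NamedTemporaryFile(dir=directory, prefix=".write_probe_"):
            pass
    except OSError as exc:
        raise PermissionError(
            f"{directory} is not writable by uid {os.getuid()}: {exc}"
        ) from exc


# Demo against a throwaway directory so the example runs anywhere.
with tempfile.TemporaryDirectory() as demo_dir:
    assert_writable(demo_dir)
    print("writable:", demo_dir)
```

In the container, this would be called with "/app/lancedb_data" before the vector store is opened, turning a silent no-op into a loud error at startup.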

When to Use Which

Use PGVector if:

You're using a 1536-dim embedding model (such as OpenAI's text-embedding-3-small)
You want vectors in the same database as your relational data
You need ACID transactions across vectors and metadata
You don't mind the extra PostgreSQL resource usage

Use LanceDB if:

Your embedding model doesn't output 1536 dimensions
You want simpler ops (no separate vector server)
You need flexibility to change embedding models later
You're running on constrained infrastructure

The Bigger Lesson

This wasn't just about vector stores. It was about hardcoded assumptions in infrastructure code. Honcho's PGVector path assumes 1536-dim embeddings, presumably because that's what OpenAI's text-embedding-3-small outputs. But the embedding landscape is diverse:

baai/bge-m3 — 1024-dim, 8K context, $0.01/M
qwen/qwen3-embedding-8b — 4096-dim, 32K context, $0.01/M
intfloat/e5-base-v2 — 768-dim

Hardcoding dimensions is a portability trap. A store whose dimension is set in config, as with LanceDB here, isn't just convenient: it lets you change embedding models without patching source.
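To tie the list of models back to the config, the sketch below looks up each model's dimension and renders a matching [vector_store] section. The dimensions are the ones listed above; the `MODEL_DIMS` table and the `vector_store_section` helper are hypothetical and exist only for illustration.

```python
# Dimensions as listed above; the lookup table itself is a hypothetical helper.
MODEL_DIMS = {
    "baai/bge-m3": 1024,
    "qwen/qwen3-embedding-8b": 4096,
    "intfloat/e5-base-v2": 768,
}

HARDCODED_DIM = 1536  # the fixed value described earlier in the post


def vector_store_section(model: str) -> str:
    """Render a [vector_store] TOML section sized for the given model."""
    dim = MODEL_DIMS[model]
    return (
        "[vector_store]\n"
        'TYPE = "lancedb"\n'
        f"DIMENSIONS = {dim}\n"
        'URI = "/app/lancedb_data"\n'
    )


# None of the listed models would satisfy a 1536-only check.
for model, dim in MODEL_DIMS.items():
    passes = dim == HARDCODED_DIM
    print(f"{model}: {dim} dims, passes a 1536-only check: {passes}")

print(vector_store_section("qwen/qwen3-embedding-8b"))
```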

Key Takeaways

Check dimension constraints before choosing a vector store. The 1536-dim requirement isn't a PGVector limit; it's baked into Honcho's source.
LanceDB swaps in with one config change. No schema migrations, no data exports.
Use named Docker volumes for file-based stores. Avoids permission headaches with bind mounts.
Your embedding model choice affects your entire stack. 1024-dim models are common and cost-effective — make sure your infrastructure supports them.

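I kept PostgreSQL running for relational data (messages, sessions, observations) but moved vectors to LanceDB. The two coexist fine. If I ever need to switch embedding models again, the config side is just changing DIMENSIONS in config.toml and restarting; existing vectors would still need to be re-embedded, because vectors from different models aren't comparable.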


Gordon Jung
