Add long-term, semantic, and contextual memory to any AI system.
Open source. Self-hosted. Explainable. Framework-agnostic.
Report Bug • Request Feature • Discord server
OpenMemory is a self-hosted, modular AI memory engine designed to provide persistent, structured, and semantic memory for large language model (LLM) applications.
It enables AI agents, assistants, and copilots to remember user data, preferences, and prior interactions — securely and efficiently.
Unlike traditional vector databases or SaaS “memory layers”, OpenMemory implements a Hierarchical Memory Decomposition (HMD) architecture:
- One canonical node per memory (no data duplication)
- Multi-sector embeddings (episodic, semantic, procedural, emotional, reflective)
- Single-waypoint linking (sparse, biologically-inspired graph)
- Composite similarity retrieval (sector fusion + activation spreading)
This design offers better recall, lower latency, and explainable reasoning at a fraction of the cost.
| Feature / Metric | OpenMemory | Zep (Cloud) | Supermemory (SaaS) | Mem0 | OpenAI Memory | LangChain Memory | Vector DBs (Chroma / Weaviate / Pinecone) |
|---|---|---|---|---|---|---|---|
| Open-source | ✅ MIT | ❌ Closed (SaaS only) | ❌ Closed (Source available) | ✅ Apache | ❌ Closed | ✅ Apache | ✅ Varies |
| Self-hosted | ✅ | ❌ | ✅ With managed cloud | ✅ | ❌ | ✅ | ✅ |
| Architecture | HMD v2 (multi-sector + single-waypoint graph) | Flat embeddings (Postgres + FAISS) | Graph + Embeddings | Flat JSON memory | Proprietary long-term cache | Context cache | Vector index |
| Avg response time (100k nodes) | 110–130 ms | 280–350 ms | 50–150 ms on-prem, 250–400 ms cloud | 250 ms | 300 ms | 200 ms | 160 ms |
| Retrieval depth | Multi-sector fusion + 1-hop waypoint | Single embedding | Single embedding with graph relations | Single embedding | Unspecified | 1 session only | Single embedding |
| Explainable recall paths | ✅ | ❌ | ✅ | ❌ | ❌ | ❌ | ❌ |
| Cost per 1M tokens (with hosted embeddings) | ~$0.30–0.40 | ~$2.0–2.5 | ~$2.50+ | ~$1.20 | ~$3.00 | User-managed | User-managed |
| Local embeddings support | ✅ (Ollama / E5 / BGE) | ❌ | ✅ (Self-hosted tier) | ✅ | ❌ | Partial | ✅ |
| Ingestion | ✅ (pdf, docx, txt, audio, website) | ✅ (via API) | ✅ | ❌ | ❌ | ❌ | ❌ |
| Scalability model | Horizontally sharded by sector | Cloud-native (Postgres + FAISS shards) | Cloud-native (Postgres) | Single node | Vendor scale | In-memory | Horizontally scalable |
| Deployment | Local / Docker / Cloud | Cloud only | Docker/Cloud | Node app | Cloud | Python SDK | Docker / Cloud |
| Data ownership | 100% yours | Vendor | Self-hosting available | 100% yours | Vendor | Yours | Yours |
| Use-case fit | Long-term agent memory, assistants, journaling, enterprise copilots | Enterprise AI agents, retrieval-based assistants | Long-term agent memory, assistants, journaling, enterprise copilots | Basic agent memory | ChatGPT-only | LLM framework | Generic vector search |
OpenMemory delivers 2–3× faster contextual recall, 6–10× lower cost, and full transparency compared to hosted “memory APIs” like Zep or Supermemory.
Its multi-sector cognitive model allows explainable recall paths, hybrid embeddings (OpenAI / Gemini / Ollama / local), and real-time decay, making it ideal for developers seeking open, private, and interpretable long-term memory for LLMs.
For a more detailed comparison, see the "Performance and Cost Analysis" section below.
Prerequisites
- Node.js 20+
- SQLite 3.40+ (bundled)
- Optional: Ollama / OpenAI / Gemini embeddings
```bash
git clone https://github.com/caviraoss/openmemory.git
cd openmemory/backend
cp .env.example .env
npm install
npm run dev
```

Example `.env` configuration:
```
# Core server
OM_PORT=8080
OM_MODE=standard
OM_API_KEY=

# Metadata store
OM_METADATA_BACKEND=sqlite          # sqlite | postgres
OM_DB_PATH=./data/openmemory.sqlite # used when sqlite

# PostgreSQL (only when OM_METADATA_BACKEND=postgres or OM_VECTOR_BACKEND=pgvector)
OM_PG_HOST=localhost
OM_PG_PORT=5432
OM_PG_DB=openmemory
OM_PG_USER=postgres
OM_PG_PASSWORD=postgres
OM_PG_SCHEMA=public
OM_PG_TABLE=openmemory_memories
OM_PG_SSL=disable                   # disable | require

# Vector store
OM_VECTOR_BACKEND=sqlite            # sqlite | pgvector | weaviate
OM_VECTOR_TABLE=openmemory_vectors
OM_WEAVIATE_URL=
OM_WEAVIATE_API_KEY=
OM_WEAVIATE_CLASS=OpenMemory

# Embeddings
OM_EMBEDDINGS=openai
OM_VEC_DIM=768
OPENAI_API_KEY=
GEMINI_API_KEY=
OLLAMA_URL=http://localhost:11434
LOCAL_MODEL_PATH=
OM_MIN_SCORE=0.3
OM_DECAY_LAMBDA=0.02

# LangGraph integration (optional)
OM_LG_NAMESPACE=default
OM_LG_MAX_CONTEXT=50
OM_LG_REFLECTIVE=true
```

Start the server:

```bash
npx tsx src/server.ts
```

OpenMemory runs on http://localhost:8080.
Or run with Docker:

```bash
docker compose up --build -d
```

Defaults:

- Port `8080` → OpenMemory API
- Data persisted in `/data/openmemory.sqlite`
| Layer | Technology | Description |
|---|---|---|
| Backend | Typescript | REST API and orchestration |
| Storage | SQLite (default) / PostgreSQL | Memory metadata, vectors, waypoints |
| Embeddings | E5 / BGE / OpenAI / Gemini / Ollama | Sector-specific embeddings |
| Graph Logic | In-process | Single-waypoint associative graph |
| Scheduler | node-cron | Decay, pruning, log repair |
- User request → Text sectorized into 2–3 likely memory types
- Query embeddings generated for those sectors
- Search over sector vectors + optional mean cache
- Top-K matches → one-hop waypoint expansion
- Ranked by composite score: `0.6 × similarity + 0.2 × salience + 0.1 × recency + 0.1 × link weight`
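A minimal TypeScript sketch of this ranking step (the field and function names are illustrative, not the backend's actual internals):

```ts
// Hypothetical candidate shape surfaced by sector search + waypoint expansion.
interface Candidate {
  similarity: number; // fused multi-sector cosine similarity, 0..1
  salience: number;   // importance assigned when the memory was stored, 0..1
  recency: number;    // decayed recency, e.g. Math.exp(-lambda * ageDays), 0..1
  linkWeight: number; // weight of the waypoint edge that surfaced this node, 0..1
}

// Composite score: 0.6 × similarity + 0.2 × salience + 0.1 × recency + 0.1 × link weight
function compositeScore(c: Candidate): number {
  return 0.6 * c.similarity + 0.2 * c.salience + 0.1 * c.recency + 0.1 * c.linkWeight;
}

// Rank candidates and keep the top K results.
function rankCandidates(candidates: Candidate[], k: number): Candidate[] {
  return [...candidates]
    .sort((a, b) => compositeScore(b) - compositeScore(a))
    .slice(0, k);
}
```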
```
         [User / Agent]
                │
                ▼
        [OpenMemory API]
                │
┌───────────────┬───────────────┐
│ SQLite (meta) │  Vector Store │
│  memories.db  │ sector blobs  │
└───────────────┴───────────────┘
                │
                ▼
        [Waypoint Graph]
```
Full API documentation is available in OpenAPI 3.0 format: openapi.yaml
View the documentation:
- Online: upload `openapi.yaml` to the Swagger Editor
- Local: use Swagger UI or Redoc
- VS Code: install the OpenAPI (Swagger) Editor extension
| Method | Endpoint | Description |
|---|---|---|
| POST | `/memory/add` | Add a memory item |
| POST | `/memory/query` | Retrieve similar memories |
| GET | `/memory/all` | List all stored memories |
| DELETE | `/memory/:id` | Delete a memory |
| GET | `/health` | Health check |
Example:

```bash
curl -X POST http://localhost:8080/memory/add \
  -H "Content-Type: application/json" \
  -d '{"content": "User prefers dark mode"}'
```
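Queries follow the same pattern; a minimal sketch, assuming a `query` string and a `k` result count as in the MCP tool arguments shown later (see `openapi.yaml` for the exact schema):

```bash
curl -X POST http://localhost:8080/memory/query \
  -H "Content-Type: application/json" \
  -d '{"query": "What UI theme does the user prefer?", "k": 5}'
```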
Set the following environment variables to enable LangGraph integration:

```
OM_MODE=langgraph
OM_LG_NAMESPACE=default
OM_LG_MAX_CONTEXT=50
OM_LG_REFLECTIVE=true
```

When activated, OpenMemory mounts additional REST endpoints tailored for LangGraph nodes:
| Method | Endpoint | Purpose |
|---|---|---|
| POST | `/lgm/store` | Persist a LangGraph node output into HMD storage |
| POST | `/lgm/retrieve` | Retrieve memories scoped to a node/namespace/graph |
| POST | `/lgm/context` | Fetch a summarized multi-sector context for a graph session |
| POST | `/lgm/reflection` | Generate and store higher-level reflections |
| GET | `/lgm/config` | Inspect active LangGraph mode configuration |
Node outputs are mapped to sectors automatically:
| Node | Sector |
|---|---|
| `observe` | episodic |
| `plan` | semantic |
| `reflect` | reflective |
| `act` | procedural |
| `emotion` | emotional |
All LangGraph requests pass through the core HSG pipeline, benefiting from salience, decay, automatic waypointing, and optional auto-reflection.
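As a rough illustration (the payload field names here are assumptions; check `openapi.yaml` for the exact request schema), a LangGraph node output could be persisted like this:

```bash
curl -X POST http://localhost:8080/lgm/store \
  -H "Content-Type: application/json" \
  -d '{"namespace": "default", "node": "observe", "content": "User asked to refactor the auth module"}'
```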
OpenMemory ships with a zero-config Model Context Protocol endpoint so MCP-aware agents (Claude Desktop, VSCode extensions, custom SDKs) can connect immediately, with no SDK install required. The server advertises `protocolVersion: 2025-06-18` and `serverInfo.version: 2.1.0` for broad compatibility.
| Method | Endpoint | Purpose |
|---|---|---|
| POST | `/mcp` | Streamable HTTP MCP interactions |
Available server features:
- Tools: `openmemory.query`, `openmemory.store`, `openmemory.reinforce`, `openmemory.list`, `openmemory.get`
- Resource: `openmemory://config` (runtime, sector, and embedding snapshot)
Example MCP tool call (JSON-RPC):
```json
{
"jsonrpc": "2.0",
"id": "1",
"method": "tools/call",
"params": {
"name": "openmemory.query",
"arguments": {
"query": "preferred coding habits",
"k": 5
}
}
}
```

The MCP route is active as soon as the server starts and always responds with `Content-Type: application/json`, making it safe for curl, PowerShell, Claude, and other MCP runtimes.
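For example, the tool call above can be posted straight to the endpoint with curl (a minimal sketch based on the behavior described above):

```bash
curl -X POST http://localhost:8080/mcp \
  -H "Content-Type: application/json" \
  -d '{"jsonrpc": "2.0", "id": "1", "method": "tools/call", "params": {"name": "openmemory.query", "arguments": {"query": "preferred coding habits", "k": 5}}}'
```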
Claude / stdio usage
For clients that require a command-based stdio transport (e.g., Claude Desktop), point them at the compiled CLI:
```bash
node backend/dist/mcp/index.js
```

The CLI binds to stdin/stdout using the same toolset shown above, so HTTP and stdio clients share one implementation.
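For Claude Desktop specifically, a minimal `claude_desktop_config.json` entry might look like the following (the server name is arbitrary, and the path assumes the repository root as the working directory):

```json
{
  "mcpServers": {
    "openmemory": {
      "command": "node",
      "args": ["backend/dist/mcp/index.js"]
    }
  }
}
```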
| Metric | OpenMemory (self-hosted) | Zep (Cloud) | Supermemory (SaaS) | Mem0 | Vector DB (avg) |
|---|---|---|---|---|---|
| Query latency (100k nodes) | 110–130 ms (local) | 280–350 ms | 350–400 ms | 250 ms | 160 ms |
| Memory addition throughput | ~40 ops/s (local batch) | ~15 ops/s | ~10 ops/s | ~25 ops/s | ~35 ops/s |
| CPU usage | Moderate (vector math only) | Serverless (billed per req) | Serverless (billed) | Moderate | High |
| Storage cost (per 1M memories) | ~$75–100 | ~$60+ | ~$20 | ~$10–25 | |
| Hosted embedding cost | ~$0.30–0.40 / 1M tokens | ~$2.0–2.5 / 1M tokens | ~$2.50+ | ~$1.20 | User-managed |
| Local embedding cost | $0 (Ollama / E5 / BGE) | ❌ Not supported | ❌ Not supported | Partial | ✅ Supported |
| Expected monthly cost (100k memories) | ~$5–8 (self-hosted) | ~$80–150 (Cloud) | ~$60–120 | ~$25–40 | ~$15–40 |
| Reported accuracy (LongMemEval) | 94–97 % (avg) | 58–85 % (varies) | 82 % (claimed) | 74 % | 60–75 % |
| Median latency (LongMemEval) | ~2.1 s (GPT-4o) | 2.5–3.2 s (GPT-4o) | 3.1 s (GPT-4o) | 2.7 s | 2.4 s (avg) |
- OpenMemory is roughly 2.5× faster and 10–15× cheaper than Zep at the same memory scale when self-hosted.
- Zep Cloud offers simplicity and hosted infra but with slower ingestion, higher latency, and no local-model support.
- Mem0 balances cost and ease of use but lacks cognitive structure (no sectorized memory).
- Vector DBs remain efficient for raw similarity search but miss cognitive behaviors such as decay, episodic recall, and reflection.
- Bearer authentication required for write APIs
- Optional AES-GCM content encryption
- PII scrubbing and anonymization hooks
- Tenant isolation for multi-user deployments
- Full erasure via `DELETE /memory/:id` or `/memory/delete_all?tenant=X`
- No vendor data exposure; 100% local control
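For example, assuming write APIs are protected with `OM_API_KEY`, a single memory can be erased like this (the id value is illustrative):

```bash
curl -X DELETE http://localhost:8080/memory/mem_123 \
  -H "Authorization: Bearer $OM_API_KEY"
```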
| Phase | Focus | Status |
|---|---|---|
| v1.0 | Core HMD backend (multi-sector memory) | ✅ Complete |
| v1.1 | Pluggable vector backends (pgvector, Weaviate) | ✅ Complete |
| v1.2 | Dashboard (React) + metrics | ⏳ In progress |
| v1.3 | Learned sector classifier (Tiny Transformer) | 🔜 Planned |
| v1.4 | Federated multi-node mode | 🔜 Planned |
Contributions are welcome.
See CONTRIBUTING.md, GOVERNANCE.md, and CODE_OF_CONDUCT.md for guidelines.
```bash
make build
make test
```
Contributors: Morven, Sriram M, DoKoB0512, Jason Kneen, Muhammad Fiaz, Peter Chung, Brett Ammeson, Dhravya Shah, Joseph Goksu.
MIT License.
Copyright (c) 2025 OpenMemory.
Join our Discord community to connect, share ideas, and take part in exciting discussions!
PageLM: a community-driven alternative to NotebookLM and an education platform that transforms study materials into interactive resources such as quizzes, flashcards, notes, and podcasts.
Link: https://github.com/CaviraOSS/PageLM
OpenMemory aims to become the standard open-source memory layer for AI agents and assistants — combining persistent semantic storage, graph-based recall, and explainability in a system that runs anywhere.
It bridges the gap between vector databases and cognitive memory systems, delivering high-recall reasoning at low cost — a foundation for the next generation of intelligent, memory-aware AI.