Long-term memory for your AI
Capture every conversation. Distill durable knowledge. Recall compressed context on the next call.
Everything you need for AI memory
Six core capabilities that turn ephemeral AI conversations into persistent, searchable knowledge — deployed on your infrastructure.
On-Prem First
Runs on a single VM or a full cluster via Docker Compose or Helm. Your data never leaves your network — no cloud provider lock-in, no third-party data processing agreements required.
Knowledge Graph
Entities and relationships are extracted into an Apache AGE graph database. This enables multi-hop reasoning across conversations, surfacing connections that flat vector search misses entirely.
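To make "multi-hop" concrete, here is a minimal sketch of a bounded graph walk in plain Python. It does not use Apache AGE or Cypher. The adjacency dict, the entity names, and the `neighbors_within` helper are all hypothetical and only illustrate how a two-hop walk reaches entities that a lookup on the starting entity alone would not return.

```python
from collections import deque

def neighbors_within(graph, start, max_hops):
    """Breadth-first walk: map each reachable entity to its hop distance."""
    distance = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if distance[node] == max_hops:
            continue  # do not expand beyond the hop limit
        for nxt in graph.get(node, ()):
            if nxt not in distance:
                distance[nxt] = distance[node] + 1
                queue.append(nxt)
    distance.pop(start)  # the start entity is not its own neighbor
    return distance

# Hypothetical entities extracted from different conversations.
graph = {
    "auth-service": ["JWT", "Postgres"],
    "JWT": ["key-rotation"],
    "Postgres": ["pgvector"],
    "key-rotation": ["incident-42"],
}

related = neighbors_within(graph, "auth-service", max_hops=2)
print(related)
```

With `max_hops=2`, the walk reaches `key-rotation` and `pgvector` through intermediate entities but stops before `incident-42`, which is three hops away.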
Hybrid Retrieval
Combines vector similarity (pgvector), keyword matching (BM25 via Tantivy), and graph traversal into a single query. Results are merged using Reciprocal Rank Fusion for consistently relevant recall.
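Reciprocal Rank Fusion can be sketched in a few lines. The version below is an illustration, not Rohrpost's implementation: the function name, the capsule IDs, and the constant `k=60` (a commonly used default for RRF) are assumptions. Each retriever contributes `1 / (k + rank)` for every result it returns, and the summed scores decide the final order.

```python
from collections import defaultdict

def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of IDs into one list, best first."""
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Sort by descending score; break ties by ID for a stable order.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))

# Hypothetical result lists from the three retrievers.
vector_hits = ["c3", "c1", "c7"]
bm25_hits = ["c1", "c9", "c3"]
graph_hits = ["c1", "c4"]

fused = reciprocal_rank_fusion([vector_hits, bm25_hits, graph_hits])
print(fused)
```

Because RRF only looks at ranks, it needs no score normalization across retrievers whose raw scores are on different scales.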
MCP Native
Ships a Model Context Protocol server that lets IDE-based tools read and write Rohrpost memory directly. Any MCP-compatible client can query your knowledge base without custom integration code.
Stream Everything
All data flows through NATS JetStream with at-least-once delivery guarantees. Capsules are durably stored and replayable, so interactions captured during a downstream outage are processed once consumers recover.
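At-least-once delivery means a consumer may see the same capsule more than once, for example after a redelivery. A common way to handle this is to make processing idempotent. The sketch below is plain Python with no NATS client involved. The `IdempotentConsumer` class and the `id` field on capsules are hypothetical and stand in for whatever deduplication key a real deployment uses.

```python
class IdempotentConsumer:
    """Processes each capsule once, even if it is delivered repeatedly."""

    def __init__(self):
        self.seen_ids = set()
        self.processed = []

    def handle(self, capsule):
        capsule_id = capsule["id"]  # assumed unique per capsule
        if capsule_id in self.seen_ids:
            return False  # duplicate delivery: acknowledge and skip
        self.processed.append(capsule)
        self.seen_ids.add(capsule_id)
        return True

consumer = IdempotentConsumer()
deliveries = [
    {"id": "a1", "kind": "prompt"},
    {"id": "a2", "kind": "response"},
    {"id": "a1", "kind": "prompt"},  # redelivered after a missed ack
]
results = [consumer.handle(c) for c in deliveries]
print(results, len(consumer.processed))
```

In a real system the set of seen IDs would need to survive restarts, for instance as a unique key in the database.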
Built in Rust
A single static binary (~30 MB) with no runtime dependencies. Compiles to native code for sub-millisecond proxy overhead and runs comfortably on a 1 vCPU / 1 GB RAM instance.
From conversation to context in four steps
A durable pipeline that captures every AI interaction and transforms it into compressed, retrievable knowledge.
Capture
Your AI interactions produce capsules — prompts, responses, diffs, and errors — streamed durably via NATS JetStream with at-least-once delivery.
Transport
Capsules flow through the tube to the distiller. NATS JetStream guarantees delivery with full replay capability for reprocessing or auditing.
Distill
The distiller chunks text, generates vector embeddings and stores them in pgvector, builds BM25 indexes with Tantivy, and extracts entities into an Apache AGE knowledge graph.
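The chunking step can be pictured as a sliding window over the text. The sketch below is an assumption about how such a step might look, not the distiller's actual strategy: the `chunk_words` helper, the word-based window, and the default sizes are all hypothetical. Overlap between neighboring chunks keeps a sentence that straddles a boundary retrievable from either side.

```python
def chunk_words(text, size=200, overlap=40):
    """Split text into word windows of `size`, sharing `overlap` words."""
    if not 0 <= overlap < size:
        raise ValueError("overlap must be >= 0 and smaller than size")
    words = text.split()
    step = size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + size]))
        if start + size >= len(words):
            break  # the last window already reached the end of the text
    return chunks

sample = " ".join(f"w{i}" for i in range(10))
chunks = chunk_words(sample, size=4, overlap=1)
for chunk in chunks:
    print(chunk)
```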
Recall
On the next LLM call, hybrid retrieval (vector + BM25 + graph walk) merges results via Reciprocal Rank Fusion into a token-budgeted brief injected as context.
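A token-budgeted brief can be assembled by packing the highest-ranked snippets until the budget is used up. The sketch below is one simple way to do that and makes several assumptions: `build_brief` is a hypothetical helper, snippets arrive already ranked best first, and token counts are approximated by whitespace-separated words rather than a real tokenizer.

```python
def build_brief(ranked_snippets, token_budget, count_tokens=None):
    """Greedily pack ranked snippets into a brief that fits the budget."""
    if count_tokens is None:
        # Rough stand-in for a tokenizer: count whitespace-separated words.
        count_tokens = lambda text: len(text.split())
    selected, used = [], 0
    for snippet in ranked_snippets:
        cost = count_tokens(snippet)
        if used + cost > token_budget:
            continue  # too large for the remaining budget; try the next one
        selected.append(snippet)
        used += cost
    return "\n".join(selected), used

snippets = ["a b c", "d e f g h", "i j"]
brief, used = build_brief(snippets, token_budget=5)
print(brief)
print(used)
```

Skipping an oversized snippet instead of stopping lets smaller, lower-ranked snippets fill the remaining budget.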
Built for every scale
Whether you are a solo developer or an enterprise team, Rohrpost adapts to your workflow and infrastructure requirements.
Solo Developers
The Problem
You repeat the same context to your AI assistant every session. Previous conversations vanish, forcing you to re-explain architecture decisions, coding patterns, and project constraints from scratch.
The Solution
Rohrpost captures every interaction and recalls relevant context automatically. Persistent memory across sessions saves 40% on token costs because you no longer resend the same background context in every session.
Development Teams
The Problem
Knowledge lives in individual chat histories that teammates cannot access. Onboarding new developers means re-discovering decisions that were already made and explained to an AI months ago.
The Solution
A shared knowledge graph lets any team member query the collective AI memory. The OpenAI-compatible proxy integrates without code changes, so existing applications work immediately.
Enterprises
The Problem
Sensitive IP flows through third-party AI providers with no audit trail. Compliance teams cannot verify what data was shared or how AI responses influenced production decisions.
The Solution
A self-hosted deployment keeps all data on-premises with a full audit trail for compliance. Every capsule is durably stored and replayable, providing complete observability into AI interactions.
Deploy in under a minute
A single static binary (~30 MB) that runs on 1 vCPU and 1 GB RAM. No JVM, no runtime dependencies, no warm-up time.
docker compose up