docs and usage #1

Merged
jason merged 1 commit from develop into main 2026-05-09 11:00:27 -05:00
2 changed files with 331 additions and 95 deletions
+124 -95
@@ -1,136 +1,165 @@
# MemPalace — local fork
# MemPalace — server-mode fork
Local-first AI memory. Verbatim storage, pluggable backend, 96.6% R@5 raw on LongMemEval — zero API calls.
Local-first AI memory, deployed as a shared service across machines.
This is a personal fork configured for **server-mode deployment**: MemPalace runs as a Docker container (typically on Unraid) and multiple AI tools (Claude Code, Codex, Antigravity) connect to a single shared palace from any machine on the network.
This is a personal fork of [MemPalace](https://github.com/MemPalace/mempalace) configured for **server-mode deployment**: MemPalace runs as a Docker container (typically on Unraid) and multiple AI tools (Claude Code, Codex, Antigravity, or any MCP-compatible client) connect to a single shared palace from any machine on the network. Auto-save hooks on each client push session transcripts to the server over HTTPS with bearer auth.
The upstream project lives at <https://github.com/MemPalace/mempalace>; refer there for benchmark methodology, contribution guidelines, project history, and the public docs site at <https://mempalaceofficial.com>.
The upstream project remains local-first by design. This fork makes one deliberate trade: data crosses the LAN to a user-controlled server instead of staying on the originating machine. Privacy, verbatim storage, no-cloud, and no-telemetry properties are otherwise unchanged. See [CLAUDE.md](CLAUDE.md) for the architectural reasoning.
---
## What it is
## What's in this fork
MemPalace stores your conversation history as verbatim text and retrieves
it with semantic search. It does not summarize, extract, or paraphrase.
The index is structured — people and projects become *wings*, topics
become *rooms*, and original content lives in *drawers* — so searches
can be scoped rather than run against a flat corpus.
The retrieval layer is pluggable. The current default is ChromaDB; the
interface is defined in [`mempalace/backends/base.py`](mempalace/backends/base.py)
and alternative backends can be dropped in without touching the rest of
the system.
Nothing leaves your machine unless you opt in.
Architecture, concepts, and mining flows:
[mempalaceofficial.com/concepts/the-palace](https://mempalaceofficial.com/concepts/the-palace.html).
---
## Install
We recommend [`uv`](https://docs.astral.sh/uv/) — `uv tool install` puts
the `mempalace` CLI in an isolated environment on your PATH:
```bash
uv tool install mempalace
mempalace init ~/projects/myapp
```
```
              home LAN
┌───────────────────────────────────┐
│  Unraid (always on)               │
│  ┌────────────────────────────┐   │
│  │ caddy :8443 (TLS + auth)   │   │
│  │  ├─ /sse    → mcp-proxy    │   │
│  │  └─ /ingest → ingest API   │   │
│  │ mempalace (single process) │   │
│  │  ├─ mcp-proxy :8765        │   │
│  │  └─ ingest    :8766        │   │
│  └────────────────────────────┘   │
└───────────────────────────────────┘
     │           │           │
┌────┴───┐  ┌────┴───┐  ┌────┴─────┐
│ Claude │  │ Codex  │  │ Antigrav │
└────────┘  └────────┘  └──────────┘
```
If you prefer pip, `pip install mempalace` still works.
* **One palace, many clients.** Search and write target the same ChromaDB index regardless of which machine you're on.
* **Single bearer token gates everything.** Caddy sidecar terminates TLS and enforces `Authorization: Bearer <token>` at the edge.
* **Auto-save hooks work across machines.** Each client's `Stop` and `PreCompact` events POST the active transcript to the server's `/ingest/transcript` endpoint; the server-side miner runs the existing entity-detection / room-assignment / dedup pipeline.
* **Single ChromaDB writer.** The HTTP ingest endpoint runs as a daemon thread inside the same Python process as the MCP server — ChromaDB's HNSW index isn't safe across processes, so this is the safe shape.
What this fork is **not**: a multi-tenant cloud service. One palace, one token, no per-user isolation. Designed for a single user with multiple machines.
---
## Concepts
MemPalace stores conversation history as verbatim text and retrieves it with semantic search. It does not summarize, extract, or paraphrase. The index is structured:
* **Wings** — broad categories (people, projects, topics)
* **Rooms** — time-based or topical groupings (days, sessions, themes)
* **Drawers** — verbatim content chunks (your exact words)
* **AAAK compression** — symbolic dialect for the index layer; an LLM can scan thousands of entries in one prompt and know which drawer to open
Same palace, two ingest paths: **project mining** (code, docs, notes) and **conversation mining** (Claude Code / Codex JSONL transcripts).
---
## Quickstart
### 1. Deploy the server (Unraid)
```bash
# Mine content into the palace
mempalace mine ~/projects/myapp # project files
mempalace mine ~/.claude/projects/ --mode convos # Claude Code sessions (scope with --wing per project)
# On Unraid:
cd /mnt/user/system/build && git clone <this-repo> mempalace && cd mempalace/deploy/unraid
# Search
mempalace search "why did we switch to GraphQL"
TOKEN=$(openssl rand -hex 32)
echo "MEMPAL_TOKEN=$TOKEN" > .env
chmod 600 .env
# Load context for a new session
mempalace wake-up
mkdir -p /mnt/user/appdata/mempalace /mnt/user/appdata/mempalace-caddy/{data,config}
chown -R 99:100 /mnt/user/appdata/mempalace /mnt/user/appdata/mempalace-caddy
docker compose up -d --build
echo "Token: $TOKEN" # save to your password manager
```
For Claude Code, Gemini CLI, MCP-compatible tools, and local models, see
[mempalaceofficial.com/guide/getting-started](https://mempalaceofficial.com/guide/getting-started.html).
Verify: `curl -k https://<unraid-ip>:8443/healthz` returns `{"status":"ok",...}`.
Benchmark methodology and per-question result files live in the upstream repository — this fork has had the `benchmarks/` directory removed since it isn't needed for deployment.
Full deployment guide: [`deploy/unraid/README.md`](deploy/unraid/README.md).
### 2. Connect a client (per machine)
Install `mcp-proxy` once: `uv tool install mcp-proxy` (or `pip install mcp-proxy`).
Set environment variables:
```powershell
# Windows PowerShell:
[Environment]::SetEnvironmentVariable("MEMPAL_REMOTE_URL", "https://<unraid-ip>:8443", "User")
[Environment]::SetEnvironmentVariable("MEMPAL_REMOTE_TOKEN", "<the-token>", "User")
[Environment]::SetEnvironmentVariable("MEMPAL_REMOTE_INSECURE", "1", "User") # self-signed cert
```
Add to your AI tool's MCP config (Claude Code `~/.claude.json`, Codex `~/.codex/config.toml`, Antigravity MCP settings):
```json
{
"mcpServers": {
"mempalace": {
"command": "mcp-proxy",
"args": [
"https://<unraid-ip>:8443/sse",
"--headers", "Authorization", "Bearer <the-token>"
]
}
}
}
```
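If you manage several machines, the entry above can be generated rather than hand-edited. The following is a minimal sketch, not part of this repo: `mcp_server_entry` and `merge_into_config` are hypothetical helper names, and the address and token are placeholders. It reproduces the JSON shape shown above, which applies to JSON-based MCP configs; Codex's TOML config would need its own writer.

```python
import json


def mcp_server_entry(base_url: str, token: str) -> dict:
    """Build the `mempalace` entry for an MCP client's `mcpServers` map.

    Mirrors the JSON shown above: mcp-proxy bridges the remote SSE endpoint
    and sends the bearer token as an Authorization header.
    """
    return {
        "command": "mcp-proxy",
        "args": [
            f"{base_url.rstrip('/')}/sse",
            "--headers", "Authorization", f"Bearer {token}",
        ],
    }


def merge_into_config(config: dict, base_url: str, token: str) -> dict:
    """Return a copy of `config` with the mempalace server added or replaced."""
    merged = dict(config)
    servers = dict(merged.get("mcpServers", {}))
    servers["mempalace"] = mcp_server_entry(base_url, token)
    merged["mcpServers"] = servers
    return merged


if __name__ == "__main__":
    # Placeholder values; substitute your own server address and token.
    existing = {"mcpServers": {"other": {"command": "other-tool", "args": []}}}
    updated = merge_into_config(existing, "https://192.0.2.10:8443", "example-token")
    print(json.dumps(updated, indent=2))
```

Other servers already present in the config are left untouched; only the `mempalace` key is written.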
### 3. Wire up auto-save hooks
Point Claude Code's `Stop` and `PreCompact` hooks at [`hooks/mempal_save_hook_remote.sh`](hooks/mempal_save_hook_remote.sh) and [`hooks/mempal_precompact_hook_remote.sh`](hooks/mempal_precompact_hook_remote.sh). Same shape for Codex via `.codex/hooks.json`. See [`hooks/README.md`](hooks/README.md) for the JSON config and env-var contract.
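To make the hook contract concrete, here is a rough Python sketch of the request a remote hook sends. It is not the shipped shell script. The endpoint path and the three `MEMPAL_REMOTE_*` variables come from this README; the function names, the raw-JSONL body, and the `Content-Type` value are assumptions, so check `hooks/README.md` for the real payload format.

```python
import os
import ssl
import urllib.request
from typing import Mapping, Optional


def build_ingest_request(transcript: bytes, env: Mapping[str, str]) -> urllib.request.Request:
    """Build the authenticated POST for the server's /ingest/transcript endpoint."""
    base_url = env["MEMPAL_REMOTE_URL"].rstrip("/")
    token = env["MEMPAL_REMOTE_TOKEN"]
    return urllib.request.Request(
        url=f"{base_url}/ingest/transcript",
        data=transcript,
        method="POST",
        headers={
            "Authorization": f"Bearer {token}",
            # Assumption: the body is the raw JSONL transcript; the real hook may differ.
            "Content-Type": "application/x-ndjson",
        },
    )


def tls_context(env: Mapping[str, str]) -> ssl.SSLContext:
    """Verify certificates unless MEMPAL_REMOTE_INSECURE=1 (self-signed cert)."""
    ctx = ssl.create_default_context()
    if env.get("MEMPAL_REMOTE_INSECURE") == "1":
        ctx.check_hostname = False  # must be disabled before verify_mode
        ctx.verify_mode = ssl.CERT_NONE
    return ctx


def post_transcript(path: str, env: Optional[Mapping[str, str]] = None,
                    timeout: float = 30.0) -> int:
    """Send one transcript file and return the HTTP status code."""
    env = os.environ if env is None else env
    with open(path, "rb") as fh:
        request = build_ingest_request(fh.read(), env)
    with urllib.request.urlopen(request, context=tls_context(env), timeout=timeout) as response:
        return response.status
```

Keeping request construction separate from the network call makes the auth header and URL easy to inspect without a running server.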
---
## Server mode (Unraid / shared across machines)
## Repository layout
Most users run MemPalace locally on a single machine. If you work
across multiple machines and want one shared memory, you can deploy it
as a Docker container — typically on a home NAS like Unraid — and
point Claude Code, Codex, Antigravity, or any MCP client on each
machine at the same palace.
```
mempalace/ # Python package (source unchanged from upstream)
├── mcp_server.py # MCP stdio server — all read/write tools
├── ingest_server.py # HTTP transcript-ingest endpoint (server mode)
└── ... # see CLAUDE.md for full layout
The `deploy/unraid/` directory ships a complete two-container stack:
deploy/unraid/ # Server-mode deployment
├── docker-compose.yml # mempalace + caddy sidecar
├── Caddyfile # bearer-token auth, TLS, SSE-aware proxy
├── mempalace-server.xml # dockerMan template (no-auth fallback)
└── README.md # Full install / troubleshooting guide
* `mempalace` runs the existing MCP-over-SSE endpoint plus a small
HTTP transcript-ingest endpoint, both in a single process so there's
exactly one ChromaDB writer.
* `caddy` sidecar terminates TLS, enforces a bearer-token check on
every request, and reverse-proxies `/sse` and `/ingest`.
hooks/ # Hook scripts for AI clients
├── mempal_save_hook_remote.sh
├── mempal_precompact_hook_remote.sh
└── README.md
Auto-save hooks have remote-aware variants
(`hooks/mempal_save_hook_remote.sh`,
`hooks/mempal_precompact_hook_remote.sh`) that POST transcripts to the
server instead of running `mempalace mine` locally.
Dockerfile # Builds the server image
.dockerignore
```
Full install, client config, hook setup, and troubleshooting:
[`deploy/unraid/README.md`](deploy/unraid/README.md).
---
## Knowledge graph
MemPalace includes a temporal entity-relationship graph with validity
windows — add, query, invalidate, timeline — backed by local SQLite.
Usage and tool reference:
[mempalaceofficial.com/concepts/knowledge-graph](https://mempalaceofficial.com/concepts/knowledge-graph.html).
## MCP server
29 MCP tools cover palace reads/writes, knowledge-graph operations,
cross-wing navigation, drawer management, and agent diaries. Installation
and the full tool list:
[mempalaceofficial.com/reference/mcp-tools](https://mempalaceofficial.com/reference/mcp-tools.html).
## Agents
Each specialist agent gets its own wing and diary in the palace.
Discoverable at runtime via `mempalace_list_agents` — no bloat in your
system prompt:
[mempalaceofficial.com/concepts/agents](https://mempalaceofficial.com/concepts/agents.html).
## Auto-save hooks
Two hooks save periodically and before context compression. In this fork the **remote** variants ship — they POST the active transcript to the server's `/ingest/transcript` endpoint with bearer auth instead of running `mempalace mine` locally. Setup, env-var contract, and troubleshooting: [`hooks/README.md`](hooks/README.md).
For per-message recall on top of the file-level chunks the hooks produce, run `mempalace sweep <transcript-dir>` inside the container (`docker exec mempalace mempalace sweep ...`). It stores one verbatim drawer per user/assistant message and is idempotent and resume-safe.
MemPalace includes a temporal entity-relationship graph with validity windows (add, query, invalidate, timeline) backed by local SQLite. Accessible via the MCP tools (`mempalace_kg_*`) over the same SSE endpoint. Tool reference: [mempalaceofficial.com/concepts/knowledge-graph](https://mempalaceofficial.com/concepts/knowledge-graph.html).
---
## Requirements
- Python 3.9+ (server image uses 3.13)
- A vector-store backend (ChromaDB by default)
- ~300 MB disk for the default embedding model
- Docker + Compose Manager plugin on Unraid for the server-mode path
* **Server (Unraid):** Docker + Compose Manager plugin. Image uses Python 3.13-slim. ~300 MB disk for the embedding model after first request. ~22 MB repo working tree.
* **Clients:** [`mcp-proxy`](https://github.com/sparfenyuk/mcp-proxy) and `curl`. Python 3 on PATH only if you use the auto-save hooks (the hooks parse stdin JSON via Python stdlib).
No API key is required for any path.
No API key is required at any stage. The default embedding model (all-MiniLM-L6-v2 ONNX) runs on CPU on the server.
---
## Docs
- Server-mode deployment → [`deploy/unraid/README.md`](deploy/unraid/README.md)
- Hook setup (remote variants) → [`hooks/README.md`](hooks/README.md)
- Release notes → [`CHANGELOG.md`](CHANGELOG.md)
- Project conventions → [`CLAUDE.md`](CLAUDE.md)
- Upstream CLI / Python API reference → [mempalaceofficial.com](https://mempalaceofficial.com)
* **Agent usage guide** → [`docs/AGENT_GUIDE.md`](docs/AGENT_GUIDE.md) — feed this to Claude Code / Codex / Antigravity so they know what tools exist, when to use which, and the workflow patterns. Drop into `~/.claude/CLAUDE.md` for global scope or a project's `CLAUDE.md` for project scope.
* Server deployment → [`deploy/unraid/README.md`](deploy/unraid/README.md)
* Hook setup → [`hooks/README.md`](hooks/README.md)
* Project conventions and architecture (for editing this codebase) → [`CLAUDE.md`](CLAUDE.md)
* Release notes (this fork) → [`CHANGELOG.md`](CHANGELOG.md)
* Upstream CLI / Python API / concepts → [mempalaceofficial.com](https://mempalaceofficial.com)
---
## License
MIT — see [LICENSE](LICENSE).
MIT — see [LICENSE](LICENSE). Upstream copyright belongs to the MemPalace authors; modifications in this fork are also MIT.
+207
@@ -0,0 +1,207 @@
# MemPalace — Agent Usage Guide
> **For AI agents using MemPalace as memory.** Drop this file into `~/.claude/CLAUDE.md`, a project's `CLAUDE.md`, or paste into Codex / Antigravity system prompts. Tells the agent *that* memory exists, *how* it's structured, and *when* to use which tool.
You have access to **MemPalace**, a verbatim memory system reachable through MCP tools (the `mempalace_*` family). It stores everything the user has said to you previously, organized into a searchable structure. Use it. Don't pretend to forget what you've stored, and don't paraphrase or summarize what you find — return user content exactly as it was written.
---
## Mental model
```
WING (broad bucket — a project, a person, a topic)
└── ROOM (sub-bucket — backend, decisions, day-2026-05-09)
└── DRAWER (one verbatim chunk of text)
```
The palace is structured so searches can be **scoped** rather than brute-forced against a flat corpus. Filter by `wing` and `room` whenever you know which one the answer lives in — it's faster and more precise.
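As an illustration of what scoping means, the sketch below filters an in-memory list of drawers by wing and room before any ranking would happen. It is a toy model only: the `Drawer` class and `scope` function are hypothetical names, and the real palace applies these filters inside its vector store rather than in Python lists.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional


@dataclass(frozen=True)
class Drawer:
    wing: str
    room: str
    content: str  # verbatim text, never summarized


def scope(drawers: Iterable[Drawer], wing: Optional[str] = None,
          room: Optional[str] = None) -> List[Drawer]:
    """Keep only drawers matching the given wing and/or room (None means no filter)."""
    return [
        d for d in drawers
        if (wing is None or d.wing == wing) and (room is None or d.room == room)
    ]


palace = [
    Drawer("myapp", "backend", "we switched to GraphQL for the mobile client"),
    Drawer("myapp", "decisions", "picked redis for the session cache"),
    Drawer("garden", "decisions", "raised beds go on the south side"),
]

# Only the drawer filed under wing "myapp", room "decisions" survives the scope.
in_scope = scope(palace, wing="myapp", room="decisions")
```

Narrowing the candidate set first is what makes a scoped search both cheaper and less likely to surface a similarly worded drawer from an unrelated project.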
A separate **knowledge graph** layer stores facts as `subject — predicate — object` triples with `valid_from` / `valid_to` dates. Use it for explicit relationships ("Alice manages project X from 2025-03-01") rather than free-text recall.
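A small sketch of how a validity window behaves may help. The `Triple` class and `facts_as_of` function below are illustrative names, not the server's schema, and treating `valid_to` as exclusive is an assumption made for the example.

```python
from dataclasses import dataclass
from datetime import date
from typing import Iterable, List, Optional


@dataclass(frozen=True)
class Triple:
    subject: str
    predicate: str
    object: str
    valid_from: Optional[date] = None  # None means no known start
    valid_to: Optional[date] = None    # None means still true

    def holds_on(self, as_of: date) -> bool:
        """True if the fact is inside its validity window on `as_of`."""
        started = self.valid_from is None or self.valid_from <= as_of
        # Assumption: valid_to is exclusive, so a fact that ended on day D
        # no longer holds on D itself.
        not_ended = self.valid_to is None or as_of < self.valid_to
        return started and not_ended


def facts_as_of(triples: Iterable[Triple], as_of: date) -> List[Triple]:
    """Return the triples that hold on the given date."""
    return [t for t in triples if t.holds_on(as_of)]


manages = Triple("Alice", "manages", "project X", valid_from=date(2025, 3, 1))
```

With this model, `manages.holds_on(date(2025, 2, 28))` is false and any date from 2025-03-01 onward is true until someone closes the window.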
**Tunnels** link related wings together. Useful when projects share themes: a search in wing A can follow tunnels to find drawers in wing B.
---
## When to reach for memory
Call `mempalace_search` (or a more specific tool below) **before** answering when:
- The user references prior decisions, conversations, or work *("we talked about X last week")*
- The user asks about a person, project, or topic by name
- You're about to give an opinion on a codebase or domain you've discussed before
- A new session starts and you need context — start with `mempalace_status` then `mempalace_list_wings`
- The user contradicts something you said — search for the original to verify
**Don't search** for trivial things you already know from the current conversation, generic programming questions, or anything in the prompt context window.
---
## Tools by frequency of use
### Daily — these are the workhorses
| Tool | Use when |
|---|---|
| `mempalace_search(query, wing?, room?, limit?)` | You need to recall *anything* about a topic. Keep `query` short — keywords or one sentence, max 250 chars. Filter by `wing` if you know the project. |
| `mempalace_status()` | Start of session, when you want a sense of how much history exists. Returns total drawers + breakdown by wing/room. |
| `mempalace_list_wings()` | "What projects/people have I talked to this user about?" |
| `mempalace_list_rooms(wing)` | After picking a wing — "what aspects of this project are filed?" |
### Filing new content (less often than you think)
The auto-save hooks file your conversations automatically. You almost never need to call `mempalace_add_drawer` for chat content — it's already being captured. Use it when:
- The user explicitly says "remember this" or "save this for later"
- You produce a piece of structured output (a design decision, a config snippet, a quoted user preference) that warrants its own retrieval handle
- You're working from external content (a pasted doc, a tool result) that should outlive the conversation
```
mempalace_add_drawer(
wing="<project-name>",
room="decisions" | "config" | "preferences" | ...,
content="<exact text — never summarize>"
)
```
Always `mempalace_check_duplicate(content)` first if you're not sure whether you've filed something similar — the palace dedups but the check tells you what already exists.
### Knowledge graph — for facts, not for prose
| Tool | Use when |
|---|---|
| `mempalace_kg_query(subject?, predicate?, object?, as_of?)` | "What does the user own / work on / report to as of <date>?" |
| `mempalace_kg_add(subject, predicate, object, valid_from?, valid_to?)` | The user states a durable fact — relationships, employment, preferences with a clear scope. |
| `mempalace_kg_invalidate(subject, predicate, object, ended)` | A fact stopped being true — close the validity window instead of deleting. |
| `mempalace_kg_timeline(entity)` | Show all facts about an entity over time. |
| `mempalace_kg_stats()` | Sanity check — how many triples exist. |
Don't use the KG for vague things ("user likes minimalism"). Use it for crisp triples where each component is a real entity name.
### Cross-wing exploration
| Tool | Use when |
|---|---|
| `mempalace_traverse(start_wing, depth?)` | Walk the palace graph from a starting point. Useful for "what's adjacent to project X?" |
| `mempalace_find_tunnels(from_wing, to_wing)` | "Are these two projects connected, and how?" |
| `mempalace_follow_tunnels(wing)` | After a search lands in one wing, see what other wings are linked. |
| `mempalace_create_tunnel(wing_a, wing_b, kind?)` | The user mentions two projects share concepts/people — link them. |
| `mempalace_list_tunnels()` / `mempalace_delete_tunnel(...)` | Inspection / cleanup. |
### Drawer management
| Tool | Use when |
|---|---|
| `mempalace_get_drawer(drawer_id)` | You have an ID from a previous search and want the full content. |
| `mempalace_list_drawers(wing, room?)` | Browse a room. Use when search isn't returning what you expect. |
| `mempalace_update_drawer(drawer_id, content)` | Rare — only when correcting a stored drawer's content. |
| `mempalace_delete_drawer(drawer_id)` | The user explicitly asks to forget something. Irreversible. |
### Diary
| Tool | Use when |
|---|---|
| `mempalace_diary_write(agent, content)` | You want to leave a note for your future self that's distinct from user content. Each agent gets its own wing. |
| `mempalace_diary_read(agent, limit?)` | Read your prior diary entries — gives you continuity across sessions. |
### Maintenance — rarely needed
| Tool | Use when |
|---|---|
| `mempalace_get_taxonomy()` | Full wing → room → count tree. Heavy; prefer `list_wings` + `list_rooms`. |
| `mempalace_get_aaak_spec()` | You want to scan a compressed index instead of full-text searching. Advanced. |
| `mempalace_sync(apply?)` | Prune drawers whose source files were deleted/moved. Operator decision; don't apply without asking. |
| `mempalace_hook_settings()` | Read auto-save hook config. Diagnostic. |
| `mempalace_memories_filed_away()` | Returns a confirmation banner — used by the hooks, not by you directly. |
| `mempalace_reconnect()` | Force a cache refresh after external writes. Use only if results look stale. |
---
## Workflow patterns
### New session, unknown context
```
1. mempalace_status() → "23,000 drawers across 14 wings"
2. mempalace_list_wings() → see project names
3. (user mentions project X)
4. mempalace_list_rooms(wing="x") → see what's filed
5. mempalace_search(query="...", wing="x")
```
### User asks about a past decision
```
1. mempalace_search(query="why did we pick redis", wing="<project>")
2. If the top hit's distance is < 1.0, that's the answer; quote it verbatim.
3. If distance is 1.0–1.5, it's relevant but tangential; show it and ask if it's what they meant.
4. If > 1.5, say you don't have a stored decision on that and ask if they want to file one now.
```
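The thresholds above amount to a three-way classification. A minimal sketch, assuming the search returns hits ordered best-first with a numeric distance (the function names and return labels are made up for illustration):

```python
from typing import Sequence


def classify_hit(distance: float) -> str:
    """Map a search distance to one of the three actions in the steps above."""
    if distance < 1.0:
        return "answer"        # quote the drawer verbatim
    if distance <= 1.5:
        return "tangential"    # show it and ask whether it is what they meant
    return "no_match"          # say nothing is stored and offer to file it


def classify_results(distances: Sequence[float]) -> str:
    """Classify the top hit; an empty result list counts as no match."""
    if not distances:
        return "no_match"
    return classify_hit(distances[0])
```

The boundaries follow the text: exactly 1.0 and exactly 1.5 both fall in the tangential band.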
### User provides a durable fact
User: *"I've moved from Acme to Globex as of Monday."*
```
1. mempalace_kg_invalidate(
subject="user", predicate="works_at", object="Acme",
ended="2026-05-04" # or whatever Monday resolves to
)
2. mempalace_kg_add(
subject="user", predicate="works_at", object="Globex",
valid_from="2026-05-04"
)
```
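Resolving "Monday" to a concrete date is the step most likely to go wrong. The sketch below assumes the user means the most recent Monday (on or before today) and prepares the two tool calls as plain data; `most_recent_monday` and `job_change_calls` are hypothetical helpers, and if "Monday" could mean next week, ask the user instead of guessing.

```python
from datetime import date, timedelta


def most_recent_monday(today: date) -> date:
    """Return the Monday on or before `today` (a Monday maps to itself)."""
    return today - timedelta(days=today.weekday())


def job_change_calls(old: str, new: str, effective: date) -> list:
    """Return (tool_name, arguments) pairs in the order they should be issued."""
    day = effective.isoformat()
    return [
        # Close the old fact's validity window instead of deleting it.
        ("mempalace_kg_invalidate",
         {"subject": "user", "predicate": "works_at", "object": old, "ended": day}),
        # Then record the new fact starting on the same day.
        ("mempalace_kg_add",
         {"subject": "user", "predicate": "works_at", "object": new, "valid_from": day}),
    ]


# Saturday 2026-05-09 resolves to Monday 2026-05-04, matching the example above.
calls = job_change_calls("Acme", "Globex", most_recent_monday(date(2026, 5, 9)))
```

Invalidating before adding keeps the timeline free of overlapping `works_at` facts.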
### User shares a config or design doc
```
1. mempalace_check_duplicate(content="<the doc>")
2. If new:
mempalace_add_drawer(
wing="<project>",
room="config" | "design" | "specs",
content="<exact pasted content — do NOT reformat>"
)
3. Return the drawer_id so the user can reference it later.
```
---
## Anti-patterns
These violate the system's design and degrade memory quality:
- **Summarizing user content before filing.** Store exact words. The whole point of MemPalace is verbatim recall.
- **Paraphrasing search results when reporting them back.** Quote the drawer. Use blockquotes.
- **Filing every conversation turn manually.** The auto-save hooks (`Stop` / `PreCompact`) handle session capture. Filing manually duplicates work and can cause `MineAlreadyRunning` collisions.
- **Using free-text drawers for crisp facts.** A relationship like "Alice → manages → ProjectX from 2025-03-01" belongs in the knowledge graph (`mempalace_kg_add`), not as prose in a drawer.
- **Putting room names where wing names go.** Wings are top-level (people/projects); rooms live inside them. `wing="backend"` is almost always wrong unless "backend" is literally a project.
- **Long search queries.** `query` has a 250-char limit and works best with 3–10 keywords. Put background reasoning in `context`, not `query`.
- **Calling `mempalace_delete_drawer` without explicit user instruction.** Irreversible. Always confirm.
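For the query-length point, a client-side guard is easy to sketch. The helper below is hypothetical and not part of MemPalace: it collapses whitespace and trims at a word boundary so the result never exceeds 250 characters. It does not pick keywords for you; choosing the 3–10 terms that matter is still a judgment call.

```python
MAX_QUERY_CHARS = 250  # limit stated in this guide


def shorten_query(text: str, max_chars: int = MAX_QUERY_CHARS) -> str:
    """Collapse whitespace and cut at a word boundary so len(result) <= max_chars."""
    collapsed = " ".join(text.split())
    if len(collapsed) <= max_chars:
        return collapsed
    # Take one extra character so a word ending exactly at the limit is kept whole.
    cut = collapsed[: max_chars + 1]
    head, sep, _ = cut.rpartition(" ")
    # No space at all means one giant token; fall back to a hard cut.
    return head if sep else collapsed[:max_chars]
```

For example, `shorten_query("  why   did we\npick redis ")` yields `"why did we pick redis"`.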
---
## Things you don't need to worry about
- **Authentication.** The MCP transport handles bearer tokens transparently. Don't try to inject `Authorization` headers in tool args.
- **Token / data limits.** Drawer content is unlimited within reason; the server caps individual transcript uploads at 50 MB.
- **Session persistence.** Auto-save hooks capture transcripts every 15 user turns and at PreCompact. You can rely on it. If the user worries about losing context, point them at the hook log: `~/.mempalace/hook_state/hook.log`.
- **Cross-machine sync.** All clients (Claude Code, Codex, Antigravity, on any machine) point at the same palace. What you file from one machine is searchable from another instantly.
- **Embedding model.** Server-side, runs locally on CPU. No API key, no cloud round-trip.
---
## When the system is unavailable
If MCP tool calls fail (timeout, 401, connection refused), tell the user clearly: "I can't reach MemPalace right now — the server may be down or the token is wrong." Don't fall back to inventing memory. Don't pretend you remember things you'd normally retrieve.
Diagnostics the user can run:
```bash
# from any client machine:
curl -k "$MEMPAL_REMOTE_URL/healthz"
# expected: {"status":"ok","version":"3.3.x"}
```
If that fails, the server is down. If it succeeds but tool calls still fail, the bearer token is wrong or the MCP client config is stale.
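The same triage can be scripted. This is a sketch under stated assumptions: the `/healthz` path and the `{"status":"ok"}` body come from the curl example above, while `server_is_healthy` and `diagnose` are hypothetical names and `insecure=True` mirrors `curl -k` for a self-signed certificate.

```python
import json
import ssl
import urllib.error
import urllib.request


def server_is_healthy(base_url: str, insecure: bool = False, timeout: float = 5.0) -> bool:
    """Return True if GET <base_url>/healthz answers with {"status": "ok", ...}."""
    ctx = ssl.create_default_context()
    if insecure:  # same effect as curl -k
        ctx.check_hostname = False
        ctx.verify_mode = ssl.CERT_NONE
    url = f"{base_url.rstrip('/')}/healthz"
    try:
        with urllib.request.urlopen(url, context=ctx, timeout=timeout) as resp:
            body = json.load(resp)
    except (urllib.error.URLError, OSError, ValueError):
        # Connection refused, timeout, TLS failure, or a non-JSON response.
        return False
    return isinstance(body, dict) and body.get("status") == "ok"


def diagnose(health_ok: bool, tool_call_ok: bool) -> str:
    """Apply the two-step triage described above."""
    if not health_ok:
        return "server down or unreachable"
    if not tool_call_ok:
        return "bearer token wrong or MCP client config stale"
    return "ok"
```

A typical call would be `diagnose(server_is_healthy(url, insecure=True), tool_call_ok=False)` after a failed tool call.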