cleanup and remote only

2026-05-09 10:52:25 -05:00
parent 2fc47a52fc
commit 40e5e5e3cc
136 changed files with 1502 additions and 349529 deletions
@@ -0,0 +1,512 @@
+# MemPalace on Unraid — server-mode deployment
+
+This directory contains everything needed to run MemPalace as a shared
+memory server on an Unraid box and connect multiple AI tools (Claude
+Code, Codex, Antigravity, or any MCP-compatible client) to a single
+persistent palace.
+
+If you only use one machine, you don't need any of this — install
+mempalace locally per the main [README](../../README.md) and you're
+done. This guide is for users running the same AI tools across multiple
+machines who want one shared memory.
+
+---
+
+## What you get
+
+```
+                     home LAN
+   ┌───────────────────────────────────┐
+   │        Unraid (always on)         │
+   │   ┌────────────────────────────┐  │
+   │   │ caddy :8443 (TLS + auth)   │  │
+   │   │   ├─ /sse     → mcp-proxy  │  │
+   │   │   └─ /ingest  → ingest API │  │
+   │   │ mempalace (single process) │  │
+   │   │   ├─ mcp-proxy :8765       │  │
+   │   │   └─ ingest   :8766        │  │
+   │   └────────────────────────────┘  │
+   │   /mnt/user/appdata/mempalace/    │
+   │     ├─ palace/  ChromaDB          │
+   │     ├─ kg/      knowledge graph   │
+   │     └─ inbox/   uploaded sessions │
+   └───────────────────────────────────┘
+              │           │           │
+        ┌─────┴─┐    ┌────┴──┐    ┌───┴──────┐
+        │ box A │    │ box B │    │ box C    │
+        │ Claude│    │ Codex │    │ Antigrav │
+        └───────┘    └───────┘    └──────────┘
+```
+
+* **One palace, many clients.** Searches and writes hit the same
+  ChromaDB index regardless of which machine you're on.
+* **Auto-save hooks work across machines.** Each client's session
+  transcripts get pushed to the server on `Stop` and `PreCompact`
+  events; the server-side miner runs the existing `mine_convos`
+  pipeline (entity detection, room assignment, dedup, idempotency).
+* **Single shared secret.** One bearer token gates both MCP and
+  transcript ingest at the Caddy edge.
+
+What this is **not**: a multi-tenant cloud product. There's one palace,
+one token, no per-user isolation. It's designed for a single user with
+multiple machines.
+
+---
+
+## Files in this directory
+
+| File | Purpose |
+|---|---|
+| `docker-compose.yml` | Two-container stack: `mempalace` + `caddy` sidecar. |
+| `Caddyfile` | Caddy config: bearer-token auth, self-signed TLS, SSE-aware reverse proxy. |
+| `mempalace-server.xml` | dockerMan template for a single-container, **no-auth, LAN-trust-only** install (compose path is the recommended one). |
+| `README.md` | This file. |
+
+The `Dockerfile` and `.dockerignore` live at the repo root — the compose
+build context is `../..` so it can reach them.
+
+---
+
+## Prerequisites
+
+* Unraid 6.12+ with Docker enabled (default).
+* The **Compose Manager** plugin from Community Apps. Required for the
+  recommended (auth-enabled) path. The dockerMan template path doesn't
+  need it but has no auth.
+* `/mnt/user/appdata` set up (default on every Unraid).
+* Port `8443` free on the Unraid host (or change it in `docker-compose.yml`).
+
+You do **not** need Tailscale, WireGuard, a domain name, a public IP,
+SWAG, or NPM. The stack is self-contained.
+
+---
+
+## Install (recommended: compose with auth)
+
+### 1. Get the repo onto Unraid
+
+SSH to Unraid, pick a path on a regular share (not `/boot`, not
+`/mnt/cache` directly), and clone or copy the repo:
+
+```bash
+mkdir -p /mnt/user/system/build
+cd /mnt/user/system/build
+git clone <your-fork-or-rsync-source> mempalace
+cd mempalace/deploy/unraid
+```
+
+### 2. Mint a bearer token
+
+```bash
+TOKEN=$(openssl rand -hex 32)
+echo "MEMPAL_TOKEN=$TOKEN" > .env
+chmod 600 .env
+echo "Token: $TOKEN"   # save to a password manager — you'll set this on each client
+```
+
+`MEMPAL_TOKEN` is read from `.env` by `docker compose`. The same token
+is forwarded to:
+
+* Caddy, which checks `Authorization: Bearer <token>` on every request.
+* The in-container ingest server as `MEMPALACE_INGEST_TOKEN` for
+  defense-in-depth.
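
As a quick sanity check, the `.env` written above can be validated before starting the stack. This is a sketch, not part of the repo: `check_token_file` is a hypothetical helper, and the 64-hex-character test simply mirrors what `openssl rand -hex 32` emits.

```bash
# Hypothetical helper (not shipped with the repo): confirm that an env file
# holds a MEMPAL_TOKEN made of exactly 64 lowercase hex characters.
check_token_file() {
  local env_file="${1:-.env}"
  local token
  [ -r "$env_file" ] || { echo "cannot read $env_file" >&2; return 2; }
  # Take the value after the '=' on the first MEMPAL_TOKEN line.
  token=$(sed -n 's/^MEMPAL_TOKEN=//p' "$env_file" | head -n 1)
  if printf '%s' "$token" | grep -Eq '^[0-9a-f]{64}$'; then
    echo "token looks well-formed"
  else
    echo "MEMPAL_TOKEN missing or malformed in $env_file" >&2
    return 1
  fi
}
```

Run it as `check_token_file .env` from `deploy/unraid` before `docker compose up`.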
+
+### 3. Create the appdata directories
+
+```bash
+mkdir -p /mnt/user/appdata/mempalace \
+         /mnt/user/appdata/mempalace-caddy/data \
+         /mnt/user/appdata/mempalace-caddy/config
+chown -R 99:100 /mnt/user/appdata/mempalace
+chown -R 99:100 /mnt/user/appdata/mempalace-caddy
+```
+
+The Caddy data dir holds Caddy's auto-generated root CA — back it up
+so re-deploys keep the same cert (clients won't have to re-trust it).
+
+### 4. Build and start
+
+```bash
+docker compose up -d --build
+```
+
+First build downloads Python 3.13-slim and pip-installs `mempalace` +
+`mcp-proxy` (~3–5 min on a Celeron, faster on real hardware).
+
+### 5. Verify
+
+```bash
+# unauth'd liveness probe
+curl -k https://<unraid-ip>:8443/healthz
+# → {"status":"ok","version":"3.3.x"}
+
+# bearer-checked endpoint should 401 without the token
+curl -ki https://<unraid-ip>:8443/ingest/transcript
+# HTTP/2 401
+
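+# ...and stop rejecting it once the token is supplied.
+# /healthz answers without auth, so probe the gated path instead; any
+# status other than 401 means the token was accepted (a bodiless GET may
+# still be refused by the ingest API itself).
+curl -ki -H "Authorization: Bearer $TOKEN" https://<unraid-ip>:8443/ingest/transcript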
+```
+
+If you see all of the above, the server is up and the auth gate is
+working.
+
+### 6. (Optional) Trust Caddy's root CA on each client
+
+Caddy's `tls internal` directive auto-generates a self-signed root CA
+on first start. Clients must either trust that CA or skip TLS
+verification (`-k` for curl, `MEMPAL_REMOTE_INSECURE=1` for hooks,
+disabled SSL verify for the MCP client).
+
+To trust it once and stop seeing TLS warnings:
+
+```bash
+# On Unraid:
+cat /mnt/user/appdata/mempalace-caddy/data/caddy/pki/authorities/local/root.crt
+```
+
+Copy that PEM block to each Windows client and import it into the
+**Trusted Root Certification Authorities** store via `certmgr.msc`,
+or from an elevated PowerShell prompt (writing to `LocalMachine\Root`
+requires administrator rights):
+
+```powershell
+Import-Certificate -FilePath C:\path\to\root.crt -CertStoreLocation Cert:\LocalMachine\Root
+```
+
+---
+
+## Connect AI tools
+
+You'll need [`mcp-proxy`](https://github.com/sparfenyuk/mcp-proxy) on
+each client machine:
+
+```bash
+uv tool install mcp-proxy
+# or:
+pip install mcp-proxy
+```
+
+Set environment variables persistently. **PowerShell** (Windows):
+
+```powershell
+[Environment]::SetEnvironmentVariable("MEMPAL_REMOTE_URL",   "https://<unraid-ip>:8443", "User")
+[Environment]::SetEnvironmentVariable("MEMPAL_REMOTE_TOKEN", "<the-token>",              "User")
+# Drop this once you've trusted Caddy's root CA:
+[Environment]::SetEnvironmentVariable("MEMPAL_REMOTE_INSECURE", "1", "User")
+```
+
+**Bash/Zsh** (macOS/Linux): add the same three exports to
+`~/.zshrc` / `~/.bashrc`.
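
Spelled out, the shell-profile equivalent of the PowerShell block might look like the following. The values are the same placeholders used in the PowerShell example, not working defaults.

```bash
# Append to ~/.zshrc or ~/.bashrc, then open a new shell.
# Replace the placeholder values with your server address and token.
export MEMPAL_REMOTE_URL="https://<unraid-ip>:8443"
export MEMPAL_REMOTE_TOKEN="<the-token>"
# Drop this line once you've trusted Caddy's root CA:
export MEMPAL_REMOTE_INSECURE=1
```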
+
+### Claude Code
+
+Add to `~/.claude.json` (user-scoped) or `.mcp.json` in the project:
+
+```json
+{
+  "mcpServers": {
+    "mempalace": {
+      "command": "mcp-proxy",
+      "args": [
+        "https://<unraid-ip>:8443/sse",
+        "--headers", "Authorization", "Bearer <the-token>"
+      ],
+      "env": {
+        "PYTHONHTTPSVERIFY": "0"
+      }
+    }
+  }
+}
+```
+
+Drop the `env` block once Caddy's root CA is trusted on the client.
+
+### Codex CLI
+
+Add to `~/.codex/config.toml`:
+
+```toml
+[mcp_servers.mempalace]
+command = "mcp-proxy"
+args = [
+  "https://<unraid-ip>:8443/sse",
+  "--headers", "Authorization", "Bearer <the-token>",
+]
+
+[mcp_servers.mempalace.env]
+PYTHONHTTPSVERIFY = "0"
+```
+
+### Antigravity
+
+Antigravity uses the Windsurf-derived MCP layout. Open the IDE's
+MCP settings UI (Settings → AI → MCP Servers) and add:
+
+```json
+{
+  "mempalace": {
+    "command": "mcp-proxy",
+    "args": [
+      "https://<unraid-ip>:8443/sse",
+      "--headers", "Authorization", "Bearer <the-token>"
+    ]
+  }
+}
+```
+
+Or edit `~/.antigravity/mcp.json` directly with the same shape.
+
+### Verify each client
+
+In any of the three tools, start a session and call:
+
+> "Use mempalace_status to show palace stats."
+
+Expected: a JSON blob with `total_drawers`, wing/room breakdown, etc.
+A 401 means the token is wrong; a connection error means the
+URL/cert is wrong.
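
The same distinction can be checked from a terminal without involving an AI tool. The sketch below is illustrative only: `classify_status` and `probe_server` are hypothetical helpers, and the `000` case relies on curl printing `000` for `%{http_code}` when no HTTP response was received.

```bash
# Hypothetical helper: translate an HTTP status code into the likely cause.
classify_status() {
  case "$1" in
    000) echo "connection failed: check the URL, port, and TLS trust" ;;
    401) echo "unauthorized: token does not match the server" ;;
    *)   echo "reached the server without an auth rejection (HTTP $1)" ;;
  esac
}

# Probe the bearer-checked ingest path and classify the result.
# -k skips TLS verification; drop it once Caddy's root CA is trusted.
probe_server() {
  local code
  code=$(curl -k -s -o /dev/null -w '%{http_code}' \
    -H "Authorization: Bearer $MEMPAL_REMOTE_TOKEN" \
    "$MEMPAL_REMOTE_URL/ingest/transcript") || true
  classify_status "$code"
}
```

With `MEMPAL_REMOTE_URL` and `MEMPAL_REMOTE_TOKEN` exported, calling `probe_server` prints one of the three messages.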
+
+---
+
+## Set up auto-save hooks
+
+The `_remote.sh` hook variants in `../../hooks/` push transcripts to
+the server instead of running `mempalace mine` locally. They share the
+same env-var contract as the MCP client config above.
+
+### Claude Code
+
+Make the scripts executable:
+
+```bash
+chmod +x hooks/mempal_save_hook_remote.sh \
+         hooks/mempal_precompact_hook_remote.sh
+```
+
+Add to `.claude/settings.local.json`:
+
+```json
+{
+  "hooks": {
+    "Stop": [{
+      "matcher": "*",
+      "hooks": [{
+        "type": "command",
+        "command": "/abs/path/to/hooks/mempal_save_hook_remote.sh",
+        "timeout": 30
+      }]
+    }],
+    "PreCompact": [{
+      "hooks": [{
+        "type": "command",
+        "command": "/abs/path/to/hooks/mempal_precompact_hook_remote.sh",
+        "timeout": 60
+      }]
+    }]
+  }
+}
+```
+
+### Codex CLI
+
+Add to `.codex/hooks.json` with the same shape — the scripts are
+hook-host-agnostic.
+
+### What the hooks do
+
+| Hook | Trigger | Behavior |
+|---|---|---|
+| `mempal_save_hook_remote.sh` | Every 15 user messages (configurable via `SAVE_INTERVAL` env var) | Backgrounded `curl` POSTs the active transcript to `/ingest/transcript`. Returns immediately so the AI doesn't stall. Uploads are idempotent, so retrying after a failed attempt is safe. |
+| `mempal_precompact_hook_remote.sh` | Right before context compaction | Synchronous `curl` POST. Blocks until the upload completes (or the hook timeout fires) so memory is durable before context shrinks. |
+
+Both write logs to `~/.mempalace/hook_state/hook.log`. Tail it during
+setup to confirm uploads are landing.
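
The save hook's throttling can be pictured roughly as follows. This is a sketch of the idea only, not the shipped script: `should_save`, its arguments, and the notion of a stored last-saved count are assumptions made for illustration.

```bash
# Hypothetical illustration of interval-based throttling.
# should_save CURRENT_COUNT LAST_SAVED_COUNT succeeds (exit 0) when at least
# SAVE_INTERVAL user messages have accumulated since the last upload.
should_save() {
  local current="$1"
  local last_saved="$2"
  local interval="${SAVE_INTERVAL:-15}"
  [ $(( current - last_saved )) -ge "$interval" ]
}
```

With the default interval, `should_save 31 15` succeeds (16 new messages) while `should_save 20 15` does not.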
+
+### Environment variables
+
+| Variable | Default | Purpose |
+|---|---|---|
+| `MEMPAL_REMOTE_URL` | *(required)* | Server base URL, e.g. `https://unraid.local:8443`. |
+| `MEMPAL_REMOTE_TOKEN` | *(required)* | Bearer token. |
+| `MEMPAL_REMOTE_INSECURE` | unset | Set to `1` to skip TLS verification. Use only with `tls internal`. |
+| `MEMPAL_REMOTE_WING` | unset | Force a specific wing for this client's transcripts. Default: server derives wing from session id. |
+| `SAVE_INTERVAL` | `15` | Messages between save-hook fires. |
+
+---
+
+## Backfilling history
+
+The hooks only capture sessions going forward. To mine **past**
+transcripts into the remote palace, on each client run:
+
+```bash
+curl -k -X POST \
+  -H "Authorization: Bearer $MEMPAL_REMOTE_TOKEN" \
+  -H "X-Session-Id: backfill-$(hostname)-$(date +%s)" \
+  -H "X-Wing: backfill" \
+  --data-binary @/path/to/some-session.jsonl \
+  "$MEMPAL_REMOTE_URL/ingest/transcript"
+```
+
+For a whole directory of past sessions, loop:
+
+```bash
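+# bash 4+ needs globstar for ** to recurse; zsh expands ** natively
+shopt -s globstar 2>/dev/null || true
+for f in ~/.claude/projects/**/*.jsonl; do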
+  curl -k -X POST \
+    -H "Authorization: Bearer $MEMPAL_REMOTE_TOKEN" \
+    -H "X-Session-Id: $(basename "$f" .jsonl)" \
+    --data-binary @"$f" \
+    "$MEMPAL_REMOTE_URL/ingest/transcript"
+done
+```
+
+The server-side miner is idempotent — re-uploading the same transcript
+won't double-file.
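
Both examples pass `-k` unconditionally, and only the first sets a wing. A wrapper along these lines would instead honor the same environment variables the hooks use. It is a sketch under assumptions: `upload_transcript` is a hypothetical name, and mapping `MEMPAL_REMOTE_WING` onto the `X-Wing` header is inferred from the single-file example rather than confirmed by the server code.

```bash
# Hypothetical wrapper around the upload call shown above.
upload_transcript() {
  local file="$1"
  # Build the curl argument list in the function's positional parameters.
  set -- -sS -X POST \
    -H "Authorization: Bearer $MEMPAL_REMOTE_TOKEN" \
    -H "X-Session-Id: $(basename "$file" .jsonl)"
  # Skip TLS verification only when explicitly requested.
  if [ "${MEMPAL_REMOTE_INSECURE:-}" = "1" ]; then
    set -- "$@" -k
  fi
  # Assumption: the wing override travels in the X-Wing header.
  if [ -n "${MEMPAL_REMOTE_WING:-}" ]; then
    set -- "$@" -H "X-Wing: $MEMPAL_REMOTE_WING"
  fi
  curl "$@" --data-binary @"$file" "$MEMPAL_REMOTE_URL/ingest/transcript"
}
```

Inside the loop, `upload_transcript "$f"` would then replace the inline `curl` call.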
+
+---
+
+## Backups
+
+Everything that matters lives in `/mnt/user/appdata/mempalace/`:
+
+* `palace/` — ChromaDB vector index + SQLite metadata
+* `kg/` — knowledge-graph SQLite
+* `inbox/` — uploaded transcripts (kept for re-mining if needed)
+
+Add it to your **CA Backup / Appdata Backup** schedule. Losing this
+directory loses all memory.
+
+The Caddy data dir (`/mnt/user/appdata/mempalace-caddy/data/`) is also
+worth backing up — it contains the auto-generated root CA. Without it,
+re-deploys regenerate the CA and clients have to re-trust it.
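
For a manual snapshot outside the Appdata Backup plugin, something like the following would do. It is a sketch, not a supported tool: `snapshot_dir` is a hypothetical helper, and stopping the stack first is a precaution assumed here because copying SQLite files while they are being written can produce an inconsistent archive.

```bash
# Hypothetical helper: archive a directory into DEST as a timestamped tarball
# and print the archive path.
snapshot_dir() {
  local src="$1"
  local dest="$2"
  local name archive
  name=$(basename "$src")
  archive="$dest/$name-$(date +%Y%m%d-%H%M%S).tar.gz"
  mkdir -p "$dest"
  # -C keeps paths inside the archive relative to the parent of SRC.
  tar -czf "$archive" -C "$(dirname "$src")" "$name"
  echo "$archive"
}
```

A typical run would be `docker compose stop`, then `snapshot_dir /mnt/user/appdata/mempalace /mnt/user/backups` (the destination path is a placeholder), then `docker compose start`.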
+
+---
+
+## dockerMan template (no-auth, LAN-trust-only)
+
+If you don't want auth and trust your LAN absolutely (no other people,
+no untrusted IoT, no guests), the `mempalace-server.xml` template gives
+you a single-container, dockerMan-compatible install:
+
+```bash
+# Build the image:
+cd /mnt/user/system/build/mempalace
+docker build -t mempalace-server:latest .
+
+# Install the template:
+cp deploy/unraid/mempalace-server.xml \
+   /boot/config/plugins/dockerMan/templates-user/my-MemPalace.xml
+```
+
+Then in the Unraid WebUI: Docker → Add Container → "Select a template" →
+**MemPalace** → Apply.
+
+This path skips Caddy entirely. The MCP SSE endpoint is published bare
+on `:8765`, no TLS, no auth. Anyone on the LAN can read and write the
+palace. **Only use this if you understand and accept that.**
+
+---
+
+## Troubleshooting
+
+### `mcp-proxy` connects but tool calls hang
+
+Caddy is most likely buffering the SSE stream. Verify that
+`flush_interval -1` is set in the Caddyfile and that Caddy is version
+2.7 or newer (the compose file pulls `caddy:2-alpine`, which tracks
+the current 2.x release).
+
+### 401 from every request
+
+The token in the client's MCP config doesn't match the server's
+`MEMPAL_TOKEN`. Print both to confirm:
+
+```bash
+# On Unraid:
+grep MEMPAL_TOKEN /mnt/user/system/build/mempalace/deploy/unraid/.env
+
+# On client (PowerShell):
+[Environment]::GetEnvironmentVariable("MEMPAL_REMOTE_TOKEN", "User")
+```
+
+### `MineAlreadyRunning` errors in hook logs
+
+Two clients hit the ingest endpoint simultaneously. The server-side
+miner serializes via `mine_lock` and rejects the second one. The hook
+is idempotent — the next save catches up. If you see this constantly,
+raise `SAVE_INTERVAL` on the chattier client.
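
To judge whether the rejections are occasional or constant, the hook log can be tallied. This sketch assumes, without confirmation from the hook source, that each rejection leaves a line containing the literal string `MineAlreadyRunning` in `hook.log`; `count_lock_rejections` is a hypothetical name.

```bash
# Hypothetical helper: count log lines that mention MineAlreadyRunning.
count_lock_rejections() {
  local log="${1:-$HOME/.mempalace/hook_state/hook.log}"
  # grep -c prints 0 and exits 1 when nothing matches; keep the 0.
  grep -c 'MineAlreadyRunning' "$log" || true
}
```

Called with no argument, it reads the default client-side log path mentioned in the hooks section.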
+
+### Caddy logs `tls: handshake failure`
+
+Client doesn't trust the self-signed cert. Either trust the root CA
+(see step 6 in install) or set `MEMPAL_REMOTE_INSECURE=1` /
+`PYTHONHTTPSVERIFY=0` on that client.
+
+### Container can't start: "address already in use"
+
+Port 8443 is taken (commonly by Unraid's WebUI HTTPS or another
+service). Edit `docker-compose.yml` and change the host-side mapping:
+
+```yaml
+    ports:
+      - "9443:8443"   # change 9443 to whatever's free
+```
+
+Update `MEMPAL_REMOTE_URL` on every client to match.
+
+### Embedding model download stalls on first request
+
+The ~80 MB MiniLM ONNX model downloads from HuggingFace on first
+use. Slow connections can time out the initial mining call. Pre-warm
+it manually:
+
+```bash
+docker exec mempalace python -c \
+  "from chromadb.utils.embedding_functions import ONNXMiniLM_L6_V2; ONNXMiniLM_L6_V2()(['warmup'])"
+```
+
+Subsequent uses load from `/data/.cache/chroma/` — ~50 ms.
+
+### Logs
+
+```bash
+docker logs mempalace          # MCP server, ingest server
+docker logs mempalace-caddy    # auth gate, TLS, access logs
+tail -f ~/.mempalace/hook_state/hook.log   # client-side hook activity
+```
+
+---
+
+## Updating
+
+When this repo updates upstream:
+
+```bash
+cd /mnt/user/system/build/mempalace
+git pull
+cd deploy/unraid
+docker compose up -d --build
+```
+
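+`--build` rebuilds only the `mempalace` service, since Caddy runs from
+a published image. `caddy:2-alpine` is a floating tag and `up` reuses
+the locally cached copy, so to pick up a newer 2.x release run
+`docker compose pull caddy` first (assuming the service is named
+`caddy` in `docker-compose.yml`).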
+
+Your palace data and Caddy CA persist across rebuilds because they're
+on volumes outside the container.
+
+---
+
+## Going further
+
+* **Replace self-signed TLS with Let's Encrypt** — point a real domain at
+  Unraid (DDNS or otherwise), open port 80 for ACME challenge, and
+  change `tls internal` in `Caddyfile` to `tls your@email`. Caddy
+  handles the rest.
+* **Put behind SWAG / Nginx Proxy Manager** — drop the Caddy sidecar,
+  keep `mempalace` exposing 8765/8766 internally only, and add the
+  routes to your existing reverse proxy. Bearer-token auth and SSE
+  pass-through must be configured manually.
+* **Per-machine wings** — set `MEMPAL_REMOTE_WING=<machinename>` on
+  each client so transcripts file under separate wings; cross-wing
+  search still works via the palace graph.
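
For the per-machine wings option, the wing name could be derived from the hostname in the shell profile. This is a sketch only: `wing_from_host` is a hypothetical helper, and lowercasing plus replacing other characters with `-` is a conservative choice made here, not a documented requirement for wing names.

```bash
# Hypothetical helper: turn a hostname into a lowercase, dash-separated label.
wing_from_host() {
  printf '%s' "$1" | tr '[:upper:]' '[:lower:]' | tr -c 'a-z0-9' '-'
}
```

In `~/.zshrc` or `~/.bashrc` this could be used as `export MEMPAL_REMOTE_WING="$(wing_from_host "$(hostname -s)")"`.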