2026-05-09 10:52:25 -05:00
# MemPalace on Unraid — server-mode deployment
This directory contains everything needed to run MemPalace as a shared
memory server on an Unraid box and connect multiple AI tools (Claude
Code, Codex, Antigravity, or any MCP-compatible client) to a single
persistent palace.
If you only use one machine, you don't need any of this — install
mempalace locally per the main [README](../../README.md) and you're
done. This guide is for users running the same AI tools across multiple
machines who want one shared memory.
---
## What you get
```
home LAN
┌───────────────────────────────────┐
│ Unraid (always on) │
│ ┌────────────────────────────┐ │
│ │ caddy :8443 (TLS + auth) │ │
│ │ ├─ /sse → mcp-proxy │ │
│ │ └─ /ingest → ingest API │ │
│ │ mempalace (single process) │ │
│ │ ├─ mcp-proxy :8765 │ │
│ │ └─ ingest :8766 │ │
│ └────────────────────────────┘ │
│ /mnt/user/appdata/mempalace/ │
│ ├─ palace/ ChromaDB │
│ ├─ kg/ knowledge graph │
│ └─ inbox/ uploaded sessions │
└───────────────────────────────────┘
│ │ │
┌─────┴─┐ ┌────┴──┐ ┌───┴──────┐
│ box A │ │ box B │ │ box C │
│ Claude│ │ Codex │ │ Antigrav │
└───────┘ └───────┘ └──────────┘
```
* **One palace, many clients.** Searches and writes hit the same
ChromaDB index regardless of which machine you're on.
* **Auto-save hooks work across machines.** Each client's session
transcripts get pushed to the server on `Stop` and `PreCompact`
events; the server-side miner runs the existing `mine_convos`
pipeline (entity detection, room assignment, dedup, idempotency).
* **Single shared secret.** One bearer token gates both MCP and
transcript ingest at the Caddy edge.
What this is **not**: a multi-tenant cloud product. There's one palace,
one token, no per-user isolation. It's designed for a single user with
multiple machines.
---
## Files in this directory
| File | Purpose |
|---|---|
| `docker-compose.yml` | Two-container stack: `mempalace` + `caddy` sidecar. |
| `Caddyfile` | Caddy config: bearer-token auth, self-signed TLS, SSE-aware reverse proxy. |
| `mempalace-server.xml` | dockerMan template for a single-container, **no-auth, LAN-trust-only** install (compose path is the recommended one). |
| `README.md` | This file. |
The `Dockerfile` and `.dockerignore` live at the repo root — the compose
build context is `../..` so it can reach them.
---
## Prerequisites
* Unraid 6.12+ with Docker enabled (default).
* The **Compose Manager** plugin from Community Apps. Required for the
recommended (auth-enabled) path. The dockerMan template path doesn't
need it but has no auth.
* `/mnt/user/appdata` set up (default on every Unraid).
* Port `8443` free on the Unraid host (or change it in `docker-compose.yml`).
You do **not** need Tailscale, WireGuard, a domain name, a public IP,
SWAG, or NPM. The stack is self-contained.
---
## Install (recommended: compose with auth)
### 1. Get the repo onto Unraid
SSH to Unraid, pick a path on a regular share (not `/boot`, not
`/mnt/cache` directly), and clone or copy the repo:
```bash
mkdir -p /mnt/user/system/build
cd /mnt/user/system/build
git clone <your-fork-or-rsync-source> mempalace
cd mempalace/deploy/unraid
```
### 2. Mint a bearer token
```bash
TOKEN=$(openssl rand -hex 32)
echo "MEMPAL_TOKEN=$TOKEN" > .env
chmod 600 .env
echo "Token: $TOKEN" # save to a password manager — you'll set this on each client
```
`MEMPAL_TOKEN` is read from `.env` by `docker compose`. The same token
is forwarded to:
* Caddy, which checks `Authorization: Bearer <token>` on every request.
* The in-container ingest server as `MEMPALACE_INGEST_TOKEN` for
defense-in-depth.
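Because the same token gets pasted into several client configs, a quick format check can catch truncation or stray whitespace before it surfaces as a 401. A minimal sketch, assuming the token was minted with `openssl rand -hex 32` as above (64 lowercase hex characters); `is_valid_token` is a hypothetical helper, not part of MemPalace:

```bash
#!/usr/bin/env bash
# Hypothetical helper: succeeds only if the argument is exactly
# 64 lowercase hex characters (the shape `openssl rand -hex 32` emits).
is_valid_token() {
  [[ "$1" =~ ^[0-9a-f]{64}$ ]]
}

# Example: validate the value stored in .env, if the file is present.
if [ -f .env ]; then
  token=$(sed -n 's/^MEMPAL_TOKEN=//p' .env)
  if is_valid_token "$token"; then
    echo "token format OK"
  else
    echo "token format unexpected" >&2
  fi
fi
```

Run it from `deploy/unraid` so the relative `.env` path resolves.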
### 3. Create the appdata directories
```bash
mkdir -p /mnt/user/appdata/mempalace \
/mnt/user/appdata/mempalace-caddy/data \
/mnt/user/appdata/mempalace-caddy/config
chown -R 99:100 /mnt/user/appdata/mempalace
chown -R 99:100 /mnt/user/appdata/mempalace-caddy
```
The Caddy data dir holds Caddy's auto-generated root CA — back it up
so re-deploys keep the same cert (clients won't have to re-trust it).
### 4. Build and start
```bash
docker compose up -d --build
```
First build downloads Python 3.13-slim and pip-installs `mempalace` +
`mcp-proxy` (~35 min on a Celeron, faster on real hardware).
### 5. Verify
```bash
# unauth'd liveness probe
curl -k https://<unraid-ip>:8443/healthz
# → {"status":"ok","version":"3.3.x"}
# bearer-checked endpoint should 401 without the token
curl -ki https://<unraid-ip>:8443/ingest/transcript
# HTTP/2 401
# ...and pass the gate with it (any status other than 401 means the token was accepted)
curl -ki -H "Authorization: Bearer $TOKEN" https://<unraid-ip>:8443/ingest/transcript
```
If you see all of the above, the server is up and the auth gate is
working.
### 6. (Optional) Trust Caddy's root CA on each client
Caddy's `tls internal` directive auto-generates a self-signed root CA
on first start. Clients must either trust that CA or skip TLS
verification (`-k` for curl, `MEMPAL_REMOTE_INSECURE=1` for hooks,
disabled SSL verify for the MCP client).
To trust it once and stop seeing TLS warnings:
```bash
# On Unraid:
cat /mnt/user/appdata/mempalace-caddy/data/caddy/pki/authorities/local/root.crt
```
Copy that PEM block to each Windows client and import it into the
**Trusted Root Certification Authorities** store via `certmgr.msc`,
or via PowerShell (from an elevated prompt, since it writes to the machine store):
```powershell
Import-Certificate -FilePath C:\path\to\root.crt -CertStoreLocation Cert:\LocalMachine\Root
```
---
## Connect AI tools
You'll need [`mcp-proxy`](https://github.com/sparfenyuk/mcp-proxy) on
each client machine:
```bash
uv tool install mcp-proxy
# or:
pip install mcp-proxy
```
Set environment variables persistently. **PowerShell** (Windows):
```powershell
[Environment]::SetEnvironmentVariable("MEMPAL_REMOTE_URL", "https://<unraid-ip>:8443", "User")
[Environment]::SetEnvironmentVariable("MEMPAL_REMOTE_TOKEN", "<the-token>", "User")
# Drop this once you've trusted Caddy's root CA:
[Environment]::SetEnvironmentVariable("MEMPAL_REMOTE_INSECURE", "1", "User")
```
**Bash/Zsh** (macOS/Linux): add the same three exports to
`~/.zshrc` / `~/.bashrc`.
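Written out, the Bash/Zsh equivalent of the PowerShell lines above might look like this. The placeholders are the same ones used there and still need real values:

```bash
# ~/.bashrc or ~/.zshrc
export MEMPAL_REMOTE_URL="https://<unraid-ip>:8443"
export MEMPAL_REMOTE_TOKEN="<the-token>"
# Drop this once you've trusted Caddy's root CA:
export MEMPAL_REMOTE_INSECURE=1
```

Open a new shell (or `source` the file) before launching the AI tool so it inherits the variables.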
### Claude Code
Add to `~/.claude.json` (user-scoped) or `.mcp.json` in the project:
```json
{
"mcpServers": {
"mempalace": {
"command": "mcp-proxy",
"args": [
"https://<unraid-ip>:8443/sse",
"--headers", "Authorization", "Bearer <the-token>"
],
"env": {
"PYTHONHTTPSVERIFY": "0"
}
}
}
}
```
Drop the `env` block once Caddy's root CA is trusted on the client.
### Codex CLI
Add to `~/.codex/config.toml`:
```toml
[mcp_servers.mempalace]
command = "mcp-proxy"
args = [
"https://<unraid-ip>:8443/sse",
"--headers", "Authorization", "Bearer <the-token>",
]
[mcp_servers.mempalace.env]
PYTHONHTTPSVERIFY = "0"
```
### Antigravity
Antigravity uses the Windsurf-derived MCP layout. Open the IDE's
MCP settings UI (Settings → AI → MCP Servers) and add:
```json
{
"mempalace": {
"command": "mcp-proxy",
"args": [
"https://<unraid-ip>:8443/sse",
"--headers", "Authorization", "Bearer <the-token>"
]
}
}
```
Or edit `~/.antigravity/mcp.json` directly with the same shape.
### Verify each client
In any of the three tools, start a session and call:
> "Use mempalace_status to show palace stats."
Expected: a JSON blob with `total_drawers`, wing/room breakdown, etc.
A 401 means the token is wrong; a connection error means the
URL/cert is wrong.
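The same failure modes can be reproduced outside the AI tool with plain `curl`, which narrows down whether the problem is the token, the URL, or the certificate. A rough sketch: `diagnose` is a hypothetical helper, the exit codes it matches (6, 7, 35, 60) are curl's documented ones, and the probe reuses the `/ingest/transcript` endpoint from the install verification step.

```bash
#!/usr/bin/env bash
# Hypothetical helper: translate a curl exit code and HTTP status
# into the likely cause described above.
diagnose() {
  local curl_exit=$1 http_status=$2
  case "$curl_exit" in
    0)  ;;  # request completed; inspect the HTTP status below
    6)  echo "DNS: host in URL does not resolve"; return ;;
    7)  echo "connect: wrong IP/port or server down"; return ;;
    35|60) echo "TLS: root CA not trusted on this client"; return ;;
    *)  echo "curl error $curl_exit"; return ;;
  esac
  if [ "$http_status" = "401" ]; then
    echo "auth: token mismatch"
  else
    echo "reachable: HTTP $http_status"
  fi
}

# Usage sketch: probe the bearer-checked endpoint with the client's env vars.
if [ -n "${MEMPAL_REMOTE_URL:-}" ]; then
  status=$(curl -s -o /dev/null -w '%{http_code}' \
    -H "Authorization: Bearer ${MEMPAL_REMOTE_TOKEN:-}" \
    "$MEMPAL_REMOTE_URL/ingest/transcript")
  diagnose "$?" "$status"
fi
```

The probe deliberately omits `-k`, so an untrusted root CA shows up as a TLS result rather than being masked.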
---
## Set up auto-save hooks
The `_remote.sh` hook variants in `../../hooks/` push transcripts to
the server instead of running `mempalace mine` locally. They share the
same env-var contract as the MCP client config above.
### Claude Code
From the repo root, make the scripts executable:
```bash
chmod +x hooks/mempal_save_hook_remote.sh \
hooks/mempal_precompact_hook_remote.sh
```
Add to `.claude/settings.local.json`:
```json
{
"hooks": {
"Stop": [{
"matcher": "*",
"hooks": [{
"type": "command",
"command": "/abs/path/to/hooks/mempal_save_hook_remote.sh",
"timeout": 30
}]
}],
"PreCompact": [{
"hooks": [{
"type": "command",
"command": "/abs/path/to/hooks/mempal_precompact_hook_remote.sh",
"timeout": 60
}]
}]
}
}
```
### Codex CLI
Add to `.codex/hooks.json` with the same shape — the scripts are
hook-host-agnostic.
### What the hooks do
| Hook | Trigger | Behavior |
|---|---|---|
| `mempal_save_hook_remote.sh` | Every 15 user messages (configurable via `SAVE_INTERVAL` env var) | Backgrounded `curl` POSTs the active transcript to `/ingest/transcript`. Returns immediately so the AI doesn't stall. Uploads are idempotent, so retrying after a failure is safe. |
| `mempal_precompact_hook_remote.sh` | Right before context compaction | Synchronous `curl` POST. Blocks until the upload completes (or the hook timeout fires) so memory is durable before context shrinks. |
Both write logs to `~/.mempalace/hook_state/hook.log`. Tail it during
setup to confirm uploads are landing.
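To make the save-hook cadence concrete, here is an illustrative sketch of "fire every N messages, upload in the background". It is not the shipped script: `should_save` and `upload_async` are hypothetical names, and the idea that the hook keeps a running message count is an assumption.

```bash
#!/usr/bin/env bash
# Assumption: the hook tracks a running count of user messages.
SAVE_INTERVAL="${SAVE_INTERVAL:-15}"

# Succeeds on every SAVE_INTERVAL-th message (15, 30, 45, ... by default).
should_save() {
  local count=$1
  (( count > 0 && count % SAVE_INTERVAL == 0 ))
}

# Fire-and-forget upload: the trailing & backgrounds curl so the caller
# returns immediately; curl's output is appended to the given log file.
upload_async() {
  local transcript=$1 log=${2:-/dev/null}
  curl -s -X POST \
    -H "Authorization: Bearer ${MEMPAL_REMOTE_TOKEN:-}" \
    --data-binary @"$transcript" \
    "${MEMPAL_REMOTE_URL:-}/ingest/transcript" >>"$log" 2>&1 &
}
```

A pre-compaction variant would drop the `&` so the call blocks until the upload finishes, matching the synchronous behavior in the table.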
### Hook env vars
| Variable | Default | Purpose |
|---|---|---|
| `MEMPAL_REMOTE_URL` | *(required)* | Server base URL, e.g. `https://unraid.local:8443`. |
| `MEMPAL_REMOTE_TOKEN` | *(required)* | Bearer token. |
| `MEMPAL_REMOTE_INSECURE` | unset | Set to `1` to skip TLS verification. Use only with `tls internal`. |
| `MEMPAL_REMOTE_WING` | unset | Force a specific wing for this client's transcripts. Default: server derives wing from session id. |
| `SAVE_INTERVAL` | `15` | Messages between save-hook fires. |
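As an illustration of how these variables could translate into a request, the sketch below assembles `curl` arguments from them. `build_curl_args` is a hypothetical helper, and mapping `MEMPAL_REMOTE_WING` to the `X-Wing` header is an assumption; check the hook scripts for the actual behavior.

```bash
#!/usr/bin/env bash
# Hypothetical helper: fill the global array CURL_ARGS from the env vars
# in the table. Mapping MEMPAL_REMOTE_WING to X-Wing is an assumption.
build_curl_args() {
  # Abort with a message if either required variable is unset or empty.
  : "${MEMPAL_REMOTE_URL:?MEMPAL_REMOTE_URL is required}"
  : "${MEMPAL_REMOTE_TOKEN:?MEMPAL_REMOTE_TOKEN is required}"
  CURL_ARGS=(-s -X POST -H "Authorization: Bearer $MEMPAL_REMOTE_TOKEN")
  if [ "${MEMPAL_REMOTE_INSECURE:-}" = "1" ]; then
    CURL_ARGS+=(-k)   # skip TLS verification
  fi
  if [ -n "${MEMPAL_REMOTE_WING:-}" ]; then
    CURL_ARGS+=(-H "X-Wing: $MEMPAL_REMOTE_WING")
  fi
  CURL_ARGS+=("$MEMPAL_REMOTE_URL/ingest/transcript")
}
```

It would be used as `build_curl_args && curl "${CURL_ARGS[@]}" --data-binary @session.jsonl`. Keeping the arguments in an array avoids quoting problems with the header values.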
---
## Backfilling history
The hooks only capture sessions going forward. To mine **past**
transcripts into the remote palace, on each client run:
```bash
curl -k -X POST \
-H "Authorization: Bearer $MEMPAL_REMOTE_TOKEN" \
-H "X-Session-Id: backfill-$(hostname)-$(date +%s)" \
-H "X-Wing: backfill" \
--data-binary @/path/to/some-session.jsonl \
"$MEMPAL_REMOTE_URL/ingest/transcript"
```
For a whole directory of past sessions, loop:
```bash
# bash: enable recursive ** (zsh supports it natively, so skip this line there)
shopt -s globstar
for f in ~/.claude/projects/**/*.jsonl; do
curl -k -X POST \
-H "Authorization: Bearer $MEMPAL_REMOTE_TOKEN" \
-H "X-Session-Id: $(basename "$f" .jsonl)" \
--data-binary @"$f" \
"$MEMPAL_REMOTE_URL/ingest/transcript"
done
```
The server-side miner is idempotent — re-uploading the same transcript
won't double-file.
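Before a large backfill it can help to preview which files would be sent. The dry-run sketch below uses `find`, so it works at any directory depth without shell-specific globbing; `list_transcripts` and `session_id_for` are hypothetical helper names, and nothing here contacts the server.

```bash
#!/usr/bin/env bash
# Hypothetical helper: print every *.jsonl under a directory, one per
# line, at any depth. Uses find, so it does not depend on globstar.
list_transcripts() {
  find "$1" -type f -name '*.jsonl' | sort
}

# Session id as used in the loop above: the file name without .jsonl.
session_id_for() {
  basename "$1" .jsonl
}

# Dry run: show what would be uploaded without contacting the server.
# Reading line by line assumes no newlines in file names.
if [ -d "$HOME/.claude/projects" ]; then
  list_transcripts "$HOME/.claude/projects" | while IFS= read -r f; do
    echo "would upload $(session_id_for "$f") from $f"
  done
fi
```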
---
## Backups
Everything that matters lives in `/mnt/user/appdata/mempalace/`:
* `palace/` — ChromaDB vector index + SQLite metadata
* `kg/` — knowledge-graph SQLite
* `inbox/` — uploaded transcripts (kept for re-mining if needed)
Add it to your **CA Backup / Appdata Backup** schedule. Losing this
directory loses all memory.
The Caddy data dir (`/mnt/user/appdata/mempalace-caddy/data/`) is also
worth backing up — it contains the auto-generated root CA. Without it,
re-deploys regenerate the CA and clients have to re-trust it.
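If you prefer a manual snapshot alongside the scheduled backup, a timestamped tarball is one option. This is a sketch only: `backup_dir` is a hypothetical helper and the destination share in the usage comment is a placeholder. Stopping the container first is a precaution, since SQLite files copied mid-write may be inconsistent.

```bash
#!/usr/bin/env bash
# Hypothetical helper: archive a directory into a timestamped tarball
# and print the path of the archive it created.
backup_dir() {
  local src=$1 dest=$2
  local name stamp archive
  name=$(basename "$src")
  stamp=$(date +%Y%m%d-%H%M%S)
  archive="$dest/$name-$stamp.tar.gz"
  mkdir -p "$dest"
  # -C keeps paths in the archive relative to the parent of $src.
  tar -czf "$archive" -C "$(dirname "$src")" "$name"
  echo "$archive"
}

# Usage sketch (the destination share is a placeholder):
#   docker compose stop mempalace
#   backup_dir /mnt/user/appdata/mempalace /mnt/user/backups/mempalace
#   docker compose start mempalace
```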
---
## dockerMan template (no-auth, LAN-trust-only)
If you don't want auth and trust your LAN absolutely (no other people,
no untrusted IoT, no guests), the `mempalace-server.xml` template gives
you a single-container, dockerMan-compatible install:
```bash
# Build the image:
cd /mnt/user/system/build/mempalace
docker build -t mempalace-server:latest .
# Install the template:
cp deploy/unraid/mempalace-server.xml \
/boot/config/plugins/dockerMan/templates-user/my-MemPalace.xml
```
Then in the Unraid WebUI: Docker → Add Container → "Select a template" →
**MemPalace** → Apply.
This path skips Caddy entirely. The MCP SSE endpoint is published bare
on `:8765`, no TLS, no auth. Anyone on the LAN can read and write the
palace. **Only use this if you understand and accept that.**
---
## Troubleshooting
### `mcp-proxy` connects but tool calls hang
Caddy is buffering SSE responses. Verify `flush_interval -1` is set in
the Caddyfile and that Caddy version is 2.7+ (the compose pulls
`caddy:2-alpine` which is current).
### 401 from every request
The token in the client's MCP config doesn't match the server's
`MEMPAL_TOKEN`. Print both to confirm:
```bash
# On Unraid:
grep MEMPAL_TOKEN /mnt/user/system/build/mempalace/deploy/unraid/.env
# On client (PowerShell):
[Environment]::GetEnvironmentVariable("MEMPAL_REMOTE_TOKEN", "User")
```
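If you would rather not print the secret to a terminal, comparing short hashes works just as well. A small sketch, assuming GNU `sha256sum` is available (on macOS, `shasum -a 256` is the usual substitute); `token_fingerprint` is a hypothetical helper:

```bash
#!/usr/bin/env bash
# Hypothetical helper: print the first 12 hex chars of the SHA-256 of a
# token, so two machines can be compared without showing the secret.
token_fingerprint() {
  printf '%s' "$1" | sha256sum | cut -c1-12
}

# On a client:
#   token_fingerprint "$MEMPAL_REMOTE_TOKEN"
# On Unraid (from deploy/unraid):
#   token_fingerprint "$(sed -n 's/^MEMPAL_TOKEN=//p' .env)"
```

Matching fingerprints mean the tokens are identical; a mismatch points to a copy-paste or stale-config problem on one side.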
### `MineAlreadyRunning` errors in hook logs
Two clients hit the ingest endpoint simultaneously. The server-side
miner serializes via `mine_lock` and rejects the second one. The hook
is idempotent — the next save catches up. If you see this constantly,
raise `SAVE_INTERVAL` on the chattier client.
### Caddy logs `tls: handshake failure`
Client doesn't trust the self-signed cert. Either trust the root CA
(see step 6 in install) or set `MEMPAL_REMOTE_INSECURE=1` /
`PYTHONHTTPSVERIFY=0` on that client.
### Container can't start: "address already in use"
Port 8443 is taken (commonly by Unraid's WebUI HTTPS or another
service). Edit `docker-compose.yml` and change the host-side mapping:
```yaml
ports:
- "9443:8443" # change 9443 to whatever's free
```
Update `MEMPAL_REMOTE_URL` on every client to match.
### Embedding model download stalls on first request
ChromaDB downloads the ~80 MB MiniLM ONNX model on first use. On a
slow connection the initial mining call can time out. Pre-warm it
manually:
```bash
docker exec mempalace python -c \
"from chromadb.utils.embedding_functions import ONNXMiniLM_L6_V2; ONNXMiniLM_L6_V2()(['warmup'])"
```
Subsequent uses load the model from `/data/.cache/chroma/` in about 50 ms.
### Logs
```bash
docker logs mempalace # MCP server, ingest server
docker logs mempalace-caddy # auth gate, TLS, access logs
tail -f ~/.mempalace/hook_state/hook.log # client-side hook activity
```
---
## Updating
When this repo updates upstream:
```bash
cd /mnt/user/system/build/mempalace
git pull
cd deploy/unraid
docker compose up -d --build
```
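Only the `mempalace` image is rebuilt. Caddy uses the floating
`caddy:2-alpine` tag, but `up --build` does not re-pull an image that is
already present locally; run `docker compose pull caddy` first if you
want the latest release in the 2.x line.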
Your palace data and Caddy CA persist across rebuilds because they're
on volumes outside the container.
---
## Going further
* **Replace self-signed TLS with Let's Encrypt** — point a real domain at
Unraid (DDNS or otherwise), open port 80 for ACME challenge, and
change `tls internal` in `Caddyfile` to `tls your@email`. Caddy
handles the rest.
* **Put behind SWAG / Nginx Proxy Manager** — drop the Caddy sidecar,
keep `mempalace` exposing 8765/8766 internally only, and add the
routes to your existing reverse proxy. Bearer-token auth and SSE
pass-through must be configured manually.
* **Per-machine wings** — set `MEMPAL_REMOTE_WING=<machinename>` on
each client so transcripts file under separate wings; cross-wing
search still works via the palace graph.