mempalace

Author	SHA1	Message	Date
Igor Lins e Silva	11a35de5ac	test(palace): set USERPROFILE too so the lock-path test works on Windows os.path.expanduser("~") reads HOME on POSIX but USERPROFILE on Windows; the lock-body bound test was monkeypatching HOME only, so on test-windows the lock file landed in the runner's real ~/.mempalace and the tmp_path glob found nothing. Patch USERPROFILE in addition to HOME, and read the body as bytes so the byte-0 sentinel doesn't trip a UTF-8 decode warning. Assertion shifts from line-count to size-bound (still detects unbounded growth across re-acquires).	2026-05-08 01:34:46 -03:00
Igor Lins e Silva	ef8d83cc8a	fix(mine): identify lock holder + exit non-zero on contention When a `mempalace mine` collided with another writer (live mcp_server, another mine, anything taking mine_palace_lock), the operator saw a generic "another `mempalace mine` is already running" message and the CLI exited 0 — making the contention invisible to nohup or scripts checking $?. The reporter ran a `nohup mempalace mine ... & disown` and got a 200-byte log with only the auto-defaults warning, no clue that an MCP server was holding the store. palace.py: the lock file now records the holder's PID + first three argv tokens on acquire. A failed acquire reads the file and surfaces "palace <path> is held by PID N (mempalace mcp_server); wait for it to finish or stop the holder before retrying" in the MineAlreadyRunning message. Open mode changes from "w" to "a+" so the prior holder's identity survives long enough to be read. miner.mine() now lets MineAlreadyRunning propagate. cmd_mine catches it, prints the holder-aware message to stderr, and exits non-zero so shell wrappers detect the contention. Note: this is a behavior change for in-process callers that depended on miner.mine() silently swallowing MineAlreadyRunning. The silent swallow was the bug. Closes #1264	2026-05-08 01:00:00 -03:00
Igor Lins e Silva	d1e27b8c42	style: ruff format new test files (CI lint)	2026-05-06 01:47:46 -03:00
imtylervo	f30fdf2672	fix: serialize ChromaCollection writes through palace lock #976 protects `mempalace mine`, but MCP/direct backend writers still call ChromaCollection.add/upsert/update/delete without the palace lock. This moves the lock boundary to the Chroma backend seam so all Chroma writes share the same palace-level serialization, with a re-entrant guard for miner paths that already hold the lock. mine_palace_lock(palace_path) gains a per-thread re-entrant guard (threading.local + pid-tag against fork inheritance) so ChromaCollection write methods can take the lock without self-deadlocking when called from inside miner.mine()'s outer hold. ChromaCollection.__init__ accepts an optional palace_path; when set, add/upsert/update/delete wrap their underlying chromadb call with mine_palace_lock(palace_path). palace_path=None preserves the legacy no-lock behaviour for direct callers and tests. ChromaBackend's get_collection/create_collection pass palace_path through; mcp_server._get_collection forwards _config.palace_path so all MCP write tools inherit the wrapping. Tests: 5 new in tests/test_chroma_collection_lock.py covering opt-in, writer-blocks-during-mine, re-entrant-inside-mine, two-process serialization, and a source-level read-path-not-locked pin. Plus 1 new + 1 rewritten in tests/test_palace_locks.py for the re-entrant semantics. 52 passed in 1.01s including the existing test_backends.py regression suite. Refs #1161.	2026-04-27 14:16:20 +10:00
Felipe Truman	1998aede66	fix: Windows CI compat for palace lock tests and path normalization Addresses the two actionable Copilot comments from the 2nd review pass. tests/test_palace_locks.py (#7, #8) multiprocessing.get_context("fork") is unavailable on Windows, so the cross-process tests would crash the Windows CI runner. Added `_get_mp_context()` that picks "spawn" on Windows and "fork" elsewhere. Spawn re-imports the module in the child; it inherits os.environ (including the monkeypatched HOME), which is all these tests need. mempalace/palace.py (#10) The per-palace lock key was computed from os.path.abspath(palace_path). On Windows the filesystem is case-insensitive, so `C:\\Palace` and `c:\\palace` would hash to different keys and two concurrent mines could touch the same on-disk palace. Switched to `os.path.normcase(os.path.realpath(...))` so: * realpath resolves symlinks and `..` segments * normcase folds case on Windows (no-op on POSIX) Testing pytest tests/test_palace_locks.py tests/test_hooks_cli.py tests/test_backends.py tests/test_cli.py → 98 passed, 0 failed.	2026-04-25 04:34:30 -03:00
Felipe Truman	99b820cb42	fix: address PR review — per-palace lock, MCP server path, hook timeout, tests Addresses the six Copilot review comments on the initial commit. 1) #6 (critical) — mcp_server.py `_get_collection` bypassed ChromaBackend The MCP server creates its palace collection directly via `chromadb.PersistentClient.get_or_create_collection` in `_get_collection`, not through `ChromaBackend.get_collection`. That path was missing the `hnsw:num_threads=1` metadata, so the primary crash surface for #974 and #965 was untouched by the original patch. Fixed by passing `hnsw:num_threads=1` at the mcp_server create site too. Documented in a code comment that the setting is only honored at creation time — existing palaces created before this fix still need a `mempalace nuke` + re-mine to gain the protection. 2) #3 — mine_global_lock over-serialized mines across unrelated palaces Replaced the single global lock file `mine_global.lock` with a per-palace lock keyed by `sha256(os.path.abspath(palace_path))` (`mine_palace_<hash>.lock`). Mines against the same palace still collapse to a single runner (the correctness boundary), but mines against different palaces are now free to run in parallel. `mine_global_lock` is kept as a backward-compatible alias for `mine_palace_lock` so any external callers that imported the previous name keep working. 3) #1 — hook_precompact swallowed OSError but not subprocess.TimeoutExpired `subprocess.run(..., timeout=60)` raises `TimeoutExpired` on slow palaces. The previous `except OSError` clause didn't catch it, so the hook could raise and fail to emit any JSON decision — leaving the harness without a block/passthrough signal. Fixed by catching `(OSError, subprocess.TimeoutExpired)` together and always falling through to the block decision so the hook reliably emits a response. 
4) #2 + #4 — tests - tests/test_hooks_cli.py: added `test_precompact_first_two_attempts_block`, `test_precompact_passes_through_after_cap`, and `test_precompact_counter_is_per_session` to lock in the #955 deadlock fix. - tests/test_palace_locks.py (new): covers `mine_palace_lock` single-acquire, reuse-after-release, cross-process serialization on the same palace, non-interference across different palaces, path normalization, and the `mine_global_lock` back-compat alias. 5) #5 — known limitation, documented but not auto-fixed Copilot suggested detecting collections missing `hnsw:num_threads=1` and calling `collection.modify(metadata=...)` to retrofit existing palaces. Verified against chromadb 1.5.7: `modify(metadata=...)` replaces metadata rather than merging, and re-passing `hnsw:space="cosine"` then raises `ValueError: Changing the distance function of a collection once it is created is not supported currently.` The HNSW runtime configuration (`configuration_json`) also does not expose `num_threads` in chromadb 1.5.x, so the flag appears to be read only at creation time. Rather than paper over the limitation with a best-effort `modify` that silently drops `hnsw:space`, documented in the mcp_server comment that pre-existing palaces need a `mempalace nuke` + re-mine to gain the protection. Fresh palaces are always protected. Testing - pytest tests/test_palace_locks.py tests/test_hooks_cli.py tests/test_backends.py tests/test_cli.py → 98 passed, 0 failed. - Runtime validation with two concurrent `mempalace mine` calls: - Different palaces → both complete in parallel ✓ - Same palace → one completes, the other exits with "another `mine` is already running against <palace> — exiting cleanly." ✓	2026-04-25 04:34:30 -03:00
Felipe Truman	7e18a70796	fix: resolve hooks_cli.py merge conflict + add mine_global_lock tests - Resolve UU conflict in hooks_cli.py: take develop/HEAD approach (mine synchronously via _mine_sync, then pass through unconditionally). _mine_sync already catches subprocess.TimeoutExpired — fixes Copilot #1. - Add tests/test_palace_locks.py: 4 tests covering mine_global_lock non-blocking semantics (acquire, second-acquire raises MineAlreadyRunning, reusable after release, release on exception) — fixes Copilot #4. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-04-25 04:34:30 -03:00
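
The per-palace lock key described in 99b820cb42 and 1998aede66 can be sketched roughly as below. This is a minimal illustration, not the code in mempalace/palace.py: the function name `palace_lock_name`, the UTF-8 encoding step, and the use of the full hex digest are assumptions. Only the `mine_palace_<hash>.lock` naming and the `os.path.normcase(os.path.realpath(...))` normalization come from the commit messages.

```python
import hashlib
import os


def palace_lock_name(palace_path: str) -> str:
    """Return a lock-file name for a palace (hypothetical helper)."""
    # realpath resolves symlinks and ".." segments; normcase folds case on
    # Windows and is a no-op on POSIX, so equivalent spellings of the same
    # on-disk palace map to the same key.
    normalized = os.path.normcase(os.path.realpath(palace_path))
    # Assumption: the path is hashed as UTF-8 and the full digest is used.
    digest = hashlib.sha256(normalized.encode("utf-8")).hexdigest()
    return f"mine_palace_{digest}.lock"
```

Two mines against the same directory then contend for one lock file however the path was spelled, while mines against different palaces get different files and can run in parallel.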

7 Commits
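
The re-entrant guard mentioned in f30fdf2672 (threading.local plus a pid tag against fork inheritance) might look something like the sketch below. The names `reentrant_guard`, `_held`, `acquire`, and `release` are hypothetical, and the real `mine_palace_lock` takes a palace path rather than callbacks. The callbacks stand in for the actual file-lock acquire and release so the example stays self-contained.

```python
import contextlib
import os
import threading

# Per-thread record of which lock keys this thread already holds.
_held = threading.local()


@contextlib.contextmanager
def reentrant_guard(key, acquire, release):
    """Call acquire()/release() only at the outermost entry for `key`."""
    holds = getattr(_held, "holds", None)
    # A forked child inherits the parent's thread-local state but not the
    # lock itself, so state tagged with another pid is discarded.
    if holds is None or getattr(_held, "pid", None) != os.getpid():
        holds = {}
        _held.holds = holds
        _held.pid = os.getpid()

    depth = holds.get(key, 0)
    if depth == 0:
        acquire()  # outermost entry: take the real lock
    holds[key] = depth + 1
    try:
        yield
    finally:
        holds[key] -= 1
        if holds[key] == 0:
            del holds[key]
            release()  # outermost exit: drop the real lock
```

With a guard of this shape, a write method that takes the lock inside an outer hold on the same thread only bumps a counter instead of deadlocking on its own lock.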