mempalace

Author	SHA1	Message	Date
Igor Lins e Silva	c70d5182dd	Merge pull request #1413 from MemPalace/fix/1264-mine-lock-holder-diagnostics fix(mine): identify lock holder + exit non-zero on contention (#1264)	2026-05-08 01:59:00 -03:00
Igor Lins e Silva	bc7392aa29	Merge pull request #1412 from MemPalace/fix/1268-popen-detach-windows fix(hooks): detach Popen children so hook exits cleanly on Windows (#1268)	2026-05-08 01:38:16 -03:00
Igor Lins e Silva	11a35de5ac	test(palace): set USERPROFILE too so the lock-path test works on Windows os.path.expanduser("~") reads HOME on POSIX but USERPROFILE on Windows; the lock-body bound test was monkeypatching HOME only, so on test-windows the lock file landed in the runner's real ~/.mempalace and the tmp_path glob found nothing. Patch USERPROFILE in addition to HOME, and read the body as bytes so the byte-0 sentinel doesn't trip a UTF-8 decode warning. Assertion shifts from line-count to size-bound (still detects unbounded growth across re-acquires).	2026-05-08 01:34:46 -03:00
Igor Lins e Silva	d5ce97c7af	fix(palace): reserve byte 0 as lock sentinel for Windows portability Windows CI surfaced two bugs introduced by the holder-identity write: 1. msvcrt.locking(LK_NBLCK, 1) locks 1 byte at the current file position. Switching to "a+" mode put the position at end-of-file, so two contenders locked different bytes and silently both acquired (the test asserts saw [(ok, 1), (ok, 2)] instead of ok+busy). 2. With the byte-range lock active on Windows, the locked byte is read-blocked for other processes. A contender trying to read the holder identity from byte 0 would hit PermissionError. Switch to "r+" mode (after touch-create) and explicitly seek(0) before both lock and unlock. Then reserve byte 0 as a pure lock sentinel and write the holder identity from byte 1 onward. _read_lock_holder reads from byte 1+, so it never touches the locked byte. Also bound file growth across re-acquires: truncate to sentinel + len(ident) before writing so the file body stays the size of the current holder, never accumulating across runs. Linux fcntl.flock locks the whole file independent of byte position, so the seek(0) is harmless on POSIX. The shape works on both.	2026-05-08 01:28:42 -03:00
Igor Lins e Silva	ef8d83cc8a	fix(mine): identify lock holder + exit non-zero on contention When a `mempalace mine` collided with another writer (live mcp_server, another mine, anything taking mine_palace_lock), the operator saw a generic "another `mempalace mine` is already running" message and the CLI exited 0 — making the contention invisible to nohup or scripts checking $?. The reporter ran a `nohup mempalace mine ... & disown` and got a 200-byte log with only the auto-defaults warning, no clue that an MCP server was holding the store. palace.py: the lock file now records the holder's PID + first three argv tokens on acquire. A failed acquire reads the file and surfaces "palace <path> is held by PID N (mempalace mcp_server); wait for it to finish or stop the holder before retrying" in the MineAlreadyRunning message. Open mode changes from "w" to "a+" so the prior holder's identity survives long enough to be read. miner.mine() now lets MineAlreadyRunning propagate. cmd_mine catches it, prints the holder-aware message to stderr, and exits non-zero so shell wrappers detect the contention. Note: this is a behavior change for in-process callers that depended on miner.mine() silently swallowing MineAlreadyRunning. The silent swallow was the bug. Closes #1264	2026-05-08 01:00:00 -03:00
Igor Lins e Silva	71804c0aa5	fix(hooks): detach Popen children so the hook can exit on Windows The Stop hook spawns mining subprocesses via subprocess.Popen and then returns. On Windows the parent stays blocked at session end because the child inherits stdout/stderr handles and the OS waits for them to release before the parent can exit; the user-visible symptom is the "running stop hooks... 3/3" spinner hanging for minutes (#1268). Add _detached_popen_kwargs() helper that returns the right detach knobs per platform: - POSIX: start_new_session=True, stdin=DEVNULL, close_fds=True - Windows: creationflags=DETACHED_PROCESS|CREATE_NEW_PROCESS_GROUP|CREATE_BREAKAWAY_FROM_JOB, stdin=DEVNULL, close_fds=True Apply to all three fire-and-forget Popen sites in hooks_cli: _spawn_mine, _ingest_transcript, _desktop_toast. Leave _mine_sync's subprocess.run alone; that path is intentionally synchronous (the precompact hook must wait for the mine to finish). Note: the issue body references mempalace-stop.js, which does not exist in this repo (the plugin ships shell wrappers calling Python). The mechanism described (child holds parent open via inherited handles) is universal, so this fix targets the equivalent symptom in our Python hook path. Will follow up on the upstream JS file with the reporter.	2026-05-08 00:55:11 -03:00
Igor Lins e Silva	ea36a00f5f	Merge pull request #1406 from MemPalace/fix/925-diary-content-hash fix(diary): detect same-size edits via content hash (#925)	2026-05-07 17:53:24 -03:00
Igor Lins e Silva	26bc3d4f91	test(diary): write fixture with explicit utf-8 to fix Windows hash assert test_legacy_state_backfills_content_hash failed on test-windows because Path.write_text without an encoding uses the system locale (cp1252 on Windows). The em dash was written as 0x97, then read back by diary_ingest as UTF-8 with errors=replace — round-trip produced different bytes than the in-Python literal, so the assertion comparing the persisted hash to sha256(text.encode(utf-8)) diverged. Pin the write side to encoding=utf-8 so the on-disk bytes match what diary_ingest decodes. No production change.	2026-05-07 17:41:19 -03:00
Igor Lins e Silva	ba30ab6951	Merge pull request #1405 from MemPalace/fix/1156-exporter-reject-symlinks fix(exporter): refuse symlinks at export targets (#1156)	2026-05-07 17:38:23 -03:00
Igor Lins e Silva	83d53644bb	Merge pull request #1404 from MemPalace/fix/1155-call-llm-retry-on-json-decode fix(closet_llm): retry _call_llm on JSONDecodeError (#1155)	2026-05-07 17:38:04 -03:00
Igor Lins e Silva	2ff6283b32	fix(diary): rebuild closets on hash change + backfill legacy state Address Copilot review on #925: - Full closet rebuild whenever the content hash differs from prior state, not only on entry-count growth. Without this, an in-place edit (same entry count, different body) updated the drawer but left the closet/search index stale — defeats the verbatim guarantee at the search layer even if the drawer is correct. - Legacy size-only skip path now records the computed content_hash back into state so subsequent runs use the strict hash check instead of remaining on the size-only path indefinitely. - Test updates: typo direction in the regression test now matches the comment (typo "Teh" → fix "The"), assertion now also checks the closet collection reflects the edit, and a new test exercises the legacy-state backfill path.	2026-05-07 12:54:09 -03:00
Igor Lins e Silva	75452380a8	fix(exporter): refuse symlinks at file targets and skip tests on Windows Address Copilot review on #1156: - Per-file symlink check via new _safe_open_for_write() helper. Uses O_NOFOLLOW on POSIX (close TOCTOU window between islink check and open) and falls back to islink + open on Windows. Applied to room files and index.md, mirroring the existing dir-level check. - Tests now wrap os.symlink() in _try_symlink_or_skip() so Windows without Developer Mode and restricted CI sandboxes skip rather than hard-fail. Added two regression tests for the file-level cases (room file, index.md).	2026-05-07 12:51:47 -03:00
Igor Lins e Silva	8e21b5abd4	test(closet_llm): use _ for unused return values per copilot review	2026-05-07 12:49:27 -03:00
Igor Lins e Silva	0d1c1fbcaa	fix(diary): detect same-size edits via content hash The skip-if-unchanged check compared byte length only, so any in-place edit preserving total length (typo fix "teh"→"the", word swap) was silently dropped — a verbatim-storage violation: the user's actual words never reached the palace. Switch the gate to sha256(text). State entries gain a "content_hash" field; the legacy size-only path is preserved when prev_hash is missing so a post-upgrade run does not re-ingest every untouched diary. Closes #925	2026-05-07 12:42:02 -03:00
Igor Lins e Silva	40e2c8b056	fix(exporter): refuse symlinks at export targets A symlink pre-placed at the export output_dir or any wing subdirectory would redirect markdown writes to wherever the symlink points. The miner already rejects symlinked inputs via Path.is_symlink(); the exporter should apply the same caution to outputs. Add _reject_symlink() helper and call it before makedirs on both output_dir and each wing_dir. Refusal raises ValueError with a clear message rather than silently falling through. Closes #1156	2026-05-07 12:40:26 -03:00
Igor Lins e Silva	2a0ed0cb8f	fix(closet_llm): retry _call_llm on JSONDecodeError instead of bailing The retry loop already backs off on HTTP 429/503 and rate-limit-shaped exceptions, but JSONDecodeError exited on the first failure. Local LLM runtimes occasionally produce malformed JSON (truncated streams, partial chunks under load), and the retry was effectively dead for that path. Mirror the 429/503 branch: sleep with exponential backoff and continue through all 3 attempts, only returning None after the final failure. Closes #1155	2026-05-07 12:38:39 -03:00
Igor Lins e Silva	03ed4c45cf	Merge pull request #1403 from MemPalace/fix/sqlite-preflight-order fix(repair): run SQLite integrity preflight before chromadb open (follow-up to #1364)	2026-05-07 12:15:18 -03:00
Igor Lins e Silva	7b151039c9	test(repair): page-align corruption offset in preflight regression test Address Copilot review on #1403: the test seeked unconditionally to offset 40960 with only `pre_size > 16384` as a guard. If pre_size sat between 16384 and 40960 + 16384 = 57344 (e.g., on a chromadb version that allocated fewer pages on init, or a future schema change), the seek would extend the file with zero-padding and the original pages would stay intact — quick_check would still pass on the (untouched) real data, and the regression guard would silently skip detecting a preflight-ordering regression. Compute the offset from pre_size, page-aligned, with explicit asserts that the file is large enough to mangle 4 pages without truncating the header or extending past EOF.	2026-05-07 12:07:54 -03:00
Igor Lins e Silva	5134a635ed	fix(repair): run SQLite integrity preflight before chromadb open #1364 added the SQLite quick_check preflight to rebuild_index, but placed it AFTER backend.get_collection(...). On a SQLite-corrupt palace, chromadb's rust binding raises pyo3_runtime.PanicException — which is not a regular Exception subclass — so it propagates past the existing `except Exception` handlers and the user sees a 30-line stack trace instead of the friendly abort message #1364 was designed to deliver. Reproduced with `mempalace repair --yes` against a palace whose chroma.sqlite3 has 4 mangled pages: pre-fix, panic; post-fix, the clean abort message and exit code 1. Two changes: - mempalace/cli.py cmd_repair: run sqlite_integrity_errors() right after the basic palace-existence check, BEFORE the max_seq_id preflight (which itself opens sqlite3) and BEFORE backend = ChromaBackend(). Exit non-zero so unattended scripts and CI gates see the failure. - mempalace/repair.py rebuild_index: same move at the function level for direct callers (tests, MCP) that bypass cmd_repair. The new test test_rebuild_index_runs_sqlite_preflight_before_chromadb_open uses a real chromadb-built palace (no ChromaBackend mock) plus a real corrupt SQLite (16 KB of mangled pages) so the ordering is exercised end-to-end. The previously-shipping test for the abort path mocked both the backend and sqlite_integrity_errors, which is why the ordering bug shipped CI-green. Six existing test_cli.py cmd_repair tests used `(palace_dir / "chroma.sqlite3").write_text("db")` to fake the SQLite file. The new preflight correctly fails quick_check on those 2-byte stubs, so the tests now create empty real SQLite DBs the same way the test_repair.py fixtures already do.	2026-05-07 11:52:58 -03:00
Igor Lins e Silva	f38d9eb109	Merge pull request #1364 from fatkobra/fix/1362-repair-sqlite-integrity-preflight fix(repair): preflight SQLite integrity before rebuild	2026-05-07 11:07:52 -03:00
Igor Lins e Silva	aecd543d75	merge: develop into fix/1362-repair-sqlite-integrity-preflight (round 2) #1357 (max_seq_id preflight) merged into develop while this branch was in CI, opening a fresh conflict between the two preflight helpers. mempalace/repair.py: - Kept both: this branch's sqlite_integrity_errors() / print_sqlite_integrity_abort() AND develop's maybe_repair_poisoned_max_seq_id_before_rebuild() from #1357. They check for distinct corruption classes and run as separate preflights. tests/test_repair.py: - Kept both this branch's sqlite_integrity_errors test group and develop's max_seq_id preflight test group; non-overlapping coverage. Local: 1623 tests pass, ruff lint+format clean against 0.4.x CI pin.	2026-05-07 10:45:29 -03:00
Igor Lins e Silva	3893228e03	Merge pull request #1342 from fatkobra/fix/1274-missing-hnsw-metadata-gate fix(storage): quarantine partial HNSW flush without metadata (#1274)	2026-05-07 10:42:20 -03:00
Igor Lins e Silva	557b9b1908	Merge pull request #1357 from fatkobra/fix/1295-repair-max-seq-id-preflight fix(repair): preflight poisoned max_seq_id before rebuild (#1295)	2026-05-07 10:42:09 -03:00
Igor Lins e Silva	f2291b0320	merge: develop into fix/1362-repair-sqlite-integrity-preflight Conflicts opened by #1285 (temp-staging rebuild) and #1312 (collection_name in recovery paths) merging after this branch was authored. mempalace/repair.py: - Kept this branch's sqlite_integrity_errors() and print_sqlite_integrity_abort() helpers; took develop's rebuild_index signature with the collection_name parameter from #1312. Normalized the helper's print indent to 2 spaces to match the rest of the file. tests/test_repair.py: - Kept both this branch's sqlite_integrity_errors tests and develop's rebuild_from_sqlite + configured-collection coverage. - Replaced 7 sites of sqlite_path.write_text("fake") with sqlite3.connect(...).close() — write_text("fake") fails PRAGMA quick_check, so the new preflight aborts before the rebuild logic the tests actually exercise. An empty real SQLite DB passes quick_check and lets the tests run as intended. - Took develop's temp-staging assertion shape (delete/create the __repair_tmp collection in addition to the live drawers collection) for the existing test_rebuild_index_success test. Local: 1618 tests pass, ruff lint+format clean against 0.4.x CI pin.	2026-05-07 10:41:27 -03:00
Igor Lins e Silva	a1e19081d9	merge: develop into fix/1274-missing-hnsw-metadata-gate Two conflicts, both because #1339 (bloated link payloads) merged into develop after this branch was authored: - mempalace/backends/chroma.py: _segment_appears_healthy now stacks three checks: bloated-link from #1339 (top), missing-metadata-with-data-floor from this branch (middle), pickle format sniff (bottom). All three are complementary; #1339 catches structural payload corruption, this branch catches pickle truncation, the original catches pickle protocol-byte corruption. - tests/test_backends.py: kept both new imports (_segment_appears_healthy from this branch, quarantine_invalid_hnsw_metadata from #1285). Local: 1618 tests pass, ruff lint+format clean against 0.4.x CI pin.	2026-05-07 10:35:19 -03:00
Igor Lins e Silva	bdaac9d9a6	merge: develop into fix/1295-repair-max-seq-id-preflight Three conflicts, all from develop landing #1285/#1310/#1312 after this branch was authored: - mempalace/cli.py: keep both import sets — this branch's maybe_repair_poisoned_max_seq_id_before_rebuild plus develop's RebuildCollectionError / _close_chroma_handles / _extract_drawers / _rebuild_collection_via_temp added in #1285. - mempalace/repair.py: keep this branch's maybe_repair_poisoned_max_seq_id_before_rebuild definition; use develop's rebuild_index signature with the collection_name parameter added in #1312. Normalized print indent to 2 spaces matching the rest of the file. - tests/test_repair.py: keep both this branch's max_seq_id preflight tests and develop's rebuild_from_sqlite + configured-collection-name tests; they exercise distinct code paths and don't overlap. Local: 1617 tests pass, ruff lint+format clean against 0.4.x CI pin.	2026-05-07 10:32:29 -03:00
Igor Lins e Silva	88a2ebb85a	Merge pull request #1339 from fatkobra/fix/1218-hnsw-link-payload-health fix(storage): quarantine bloated HNSW link payloads (#1218)	2026-05-07 10:19:47 -03:00
Igor Lins e Silva	e272ed3fbf	Merge pull request #1359 from fatkobra/fix/1099-migrate-write-roundtrip fix(migrate): verify write roundtrip before bailout (#1099)	2026-05-07 09:59:58 -03:00
Igor Lins e Silva	72685f3aef	Merge pull request #1312 from mjc/stale-chroma-reconnect fix: use configured collection in recovery paths	2026-05-07 09:40:29 -03:00
Igor Lins e Silva	52c70c9f88	Merge pull request #1402 from MemPalace/fix/1296-windows-mine-resilience fix(miner): harden Windows mine against ONNX bad_alloc + silent partial exits (#1296)	2026-05-07 09:36:22 -03:00
Igor Lins e Silva	e9aee19433	fix(tests): apply ruff format after rebase resolution The collection_name plumbing rebase produced a few unformatted blocks in test_mcp_server.py and test_searcher.py; bringing them in line with the 0.4.x CI pin so test-windows / lint stay green.	2026-05-07 09:10:22 -03:00
Mika Cohen	ec6d2dde01	fix: use configured collection in recovery paths	2026-05-07 09:10:00 -03:00
Igor Lins e Silva	5488e7bb22	fix(miner): harden Windows mine against ONNX bad_alloc + silent partial exits Three small changes that together address the failure modes in #1296: 1. Add pnpm-lock.yaml and yarn.lock to SKIP_FILENAMES, mirroring the existing package-lock.json rule. A 24K-line pnpm-lock.yaml produced ~1124 chunks in one batch and tripped onnxruntime bad_alloc on Windows; pnpm/yarn lockfiles are no more useful to mine than npm's. 2. Skip any file that produces more than MAX_CHUNKS_PER_FILE (500) chunks, with a clear log line. Catches the broader class — generated CSV/JSON, build artifacts, etc. — that the named-file SKIP list will never fully cover. The cap is conservative (500 chunks * 800 chars ≈ 400 KB of source) so legitimate hand-written content still mines. 3. Print a partial-progress summary on any exception in _mine_impl, not just KeyboardInterrupt, then re-raise. Without this, an arbitrary exception (ONNX bad_alloc, chromadb HNSW error, OS fault) propagates silently — the operator sees only the last progress line and assumes the mine succeeded. The new path mirrors the KeyboardInterrupt summary (files_processed, drawers_filed, last_file) plus the exception type and message, then re-raises so the original traceback surfaces and the exit code is non-zero. Tests cover: SKIP_FILENAMES contents, the chunk-cap path returning (0, room) with no upserts, and the new mine-aborted summary surfacing both the partial counters and the exception class.	2026-05-07 08:56:41 -03:00
Igor Lins e Silva	88493acd0d	Merge pull request #1285 from mjc/hnsw-repair fix: harden Chroma repair preflight and rollback recovery	2026-05-07 08:06:59 -03:00
Igor Lins e Silva	7cf9b17582	fix(repair): quote ChromaBackend annotation for Python 3.9 compatibility `backend: ChromaBackend | None = None` evaluates the X | None union eagerly at function-definition time, which Python 3.9 rejects with TypeError: unsupported operand type(s) for |: 'ABCMeta' and 'NoneType' since the new union syntax is 3.10+. Quoting matches the existing forward-reference style in repair.py (sqlite_drawer_count, etc.) and defers evaluation, restoring 3.9 compatibility.	2026-05-07 07:53:28 -03:00
Igor Lins e Silva	be05a2e179	Merge pull request #1310 from potterdigital/fix/1308-rebuild-from-sqlite fix(repair): add --mode from-sqlite to recover palaces with corrupt HNSW (#1308)	2026-05-07 07:51:42 -03:00
Igor Lins e Silva	be6dc033fd	merge: develop into hnsw-repair (resolve chroma.py + test_backends.py conflicts) Develop (post-#1162 lock-plumbing era) refactored the per-open quarantine pass into ChromaBackend._prepare_palace_for_open. This branch's inline-expansion form added quarantine_invalid_hnsw_metadata as a third check, plus a "discard from _quarantined_paths on inode swap" guard so re-opens against a different physical DB re-run quarantine. Resolution merges both: - _prepare_palace_for_open now also calls quarantine_invalid_hnsw_metadata, gated by the same _quarantined_paths set. - _client keeps the inode_changed -> _quarantined_paths.discard() guard before calling the helper, so a fresh inode triggers a fresh pass. - make_client collapses to a single _prepare_palace_for_open() call. - test_backends.py keeps both the pickle (#1285) and shutil (develop) imports — both are used.	2026-05-07 07:48:45 -03:00
Igor Lins e Silva	670aba974f	test(repair): close ChromaBackend in _seed_palace to release Windows file locks The helper opened a chromadb PersistentClient via ChromaBackend and never closed it, leaving rust-side SQLite/HNSW file locks alive after the helper returned. On Windows that blocks the in-place archive rename inside rebuild_from_sqlite with WinError 32 on data_level0.bin, causing test_rebuild_from_sqlite_in_place_archives_when_opted_in and test_rebuild_from_sqlite_raises_on_upsert_failure to fail in the test-windows CI job. No test consumes the returned collection, so closing the backend in a try/finally is safe and drops the return.	2026-05-07 07:37:25 -03:00
Igor Lins e Silva	8d8f54a807	Merge remote-tracking branch 'origin/develop' into fix/1308-rebuild-from-sqlite	2026-05-07 07:30:56 -03:00
Igor Lins e Silva	435f0ad348	Merge pull request #1391 from MemPalace/docs/auto-save-tools-on-develop docs: add 30-day expiry callout + ship 4 auto-save tools	2026-05-06 20:18:44 -03:00
MillaJ	7c679ba625	fix(tools/render_jsonl): apply ruff format Earlier commit fixed ruff lint but missed the formatter check. This applies `ruff format` — adds standard PEP8 blank lines between functions, splits one inline list. No behavior change. Verified: both `ruff format --check` and `ruff check` pass cleanly. Tool still renders correctly.	2026-05-06 16:12:34 -07:00
MillaJ	921ff5a6fa	fix(tools/render_jsonl): split chained statements per ruff 0.4.x Addresses CI lint feedback on PR #1391. No behavior change. - Split `import json, sys` into separate lines (E401) - Split chained `print(...); sys.exit(1)` into two lines (E702, two occurrences) - Split inline `if ts: stamps.append(ts)` into two lines (E701) Verified: `ruff check tools/render_jsonl.py` reports "All checks passed!" Tool still renders correctly (3 turns from a real JSONL test, identical output to pre-fix).	2026-05-06 15:39:08 -07:00
MillaJ	bddba59ae3	docs: add 30-day expiry callout + ship 4 auto-save tools Adds a brief [!IMPORTANT] callout at the top of the README pointing users to the urgent announcement at #1388. Claude Code auto-deletes local JSONL transcripts after 30 days; users without the auto-save hooks wired are losing transcript data off the rolling window. Ships 4 small standalone tools at tools/: - backup_claude_jsonls.sh — rsync ~/.claude/projects/ to a safe folder - render_jsonl.py — convert JSONL transcripts to readable text - find_orphan_claude_jsonls.sh — scan backup locations for orphan Claude Code transcripts (multi-line shape detection + topic preview) - save.md — Claude Code slash command for manual /save into MemPalace Tools verified by independent agent against v3.3.4 source. Read-only on user data. POSIX bash + Python stdlib only.	2026-05-06 13:10:16 -07:00
Igor Lins e Silva	f0d236019a	Merge pull request #1377 from MemPalace/fix/get-collection-retry-on-exception fix(mcp): retry _get_collection once on transient failure (#1286)	2026-05-06 05:18:04 -03:00
Igor Lins e Silva	e334e257bf	fix(mcp): retry _get_collection once on transient failure (#1286) A transient chromadb exception inside `_get_collection` was swallowed by the bare `except Exception: return None`, leaving every subsequent tool call hitting the same poisoned cache silently. The fix wraps the body in a `for attempt in range(2)` loop: on attempt 0 failure, log via `logger.exception(...)` and clear `_client_cache` / `_collection_cache` / `_metadata_cache` so the next iteration forces `_get_client()` to rebuild from scratch; that path now re-runs `quarantine_stale_hnsw` (per #1322), so the second attempt heals the common stale-handle case automatically. If both attempts fail, return `None` (matches the prior contract for permanent failures). Two new tests in `tests/test_mcp_server.py::TestCacheInvalidation`: - `test_get_collection_retries_once_on_exception`: first attempt raises via a monkeypatched `_get_client`, second attempt succeeds; assert the caller gets the collection back, not None. - `test_get_collection_returns_none_after_two_failures`: both attempts fail, assert we exhaust the loop and return None (no infinite retry). Surgical extraction from PR #1286, which carried the same fix idea (plus a fork-sync bundle that couldn't be merged); credit to the original author below. Co-authored-by: Jeffrey Hein <jp@jphein.com>	2026-05-06 04:52:18 -03:00
Brian potter	d92c741084	fix(repair): address PR #1310 review feedback Five small hardening fixes for the from-sqlite rebuild path, all from mjc's review on #1310: - repair.py: drawers collection name now resolves from MempalaceConfig().collection_name via _drawers_collection_name() (closets stays fixed by design — AAAK index references drawer IDs by string). Lines up with the broader configured-collection work in #1312 so that PR can rebase cleanly on top. - repair.py: create_collection() moved inside the try block in _rebuild_one_collection so a Chroma "Collection already exists" failure surfaces as RebuildPartialError with archive_path, not an unstructured exception that strands the user without recovery instructions. - repair.py: rebuild_from_sqlite wraps backend lifetime in try/finally with backend.close() so PersistentClient handles to dest_palace are released on every exit path. The from-sqlite path post-dates #1285's lifecycle hardening of the legacy rebuild, so this needed its own cleanup. - cli.py: cmd_repair (from-sqlite mode) now exits non-zero when rebuild_from_sqlite returns {} (validation refusal sentinel), so unattended scripts/CI distinguish "invalid inputs" from a successful rebuild that legitimately found zero rows. - tests/test_repair.py: test_extract_via_sqlite_returns_all_rows_with_metadata now asserts every backing segment is scope='METADATA', locking in the segment-layout assumption against future regressions that point the JOIN at the VECTOR segment. New test coverage: - test_rebuild_from_sqlite_honors_configured_drawer_collection_name - test_cmd_repair_from_sqlite_validation_refusal_exits_nonzero - test_cmd_repair_from_sqlite_success_does_not_exit Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-06 04:37:22 -03:00
Brian potter	cb6bfd5231	chore: gitignore .envrc for direnv users Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-06 04:36:39 -03:00
Brian potter	a7c4ed24d7	fix(repair): add --mode from-sqlite to recover palaces with corrupt HNSW (#1308) Both `--mode legacy` and the inline `cli.cmd_repair` rebuild path call `Collection.count()` as their first read, the same call that raises `chromadb.errors.InternalError: Failed to apply logs to the hnsw segment writer` on the corruption class reported in #1308. Repair would print "Cannot recover — palace may need to be re-mined from source files" even though the underlying SQLite tables were fully intact. The new `--mode from-sqlite` reads `(id, document, metadata)` rows directly from `chroma.sqlite3` via `segments` → `embeddings` → `embedding_metadata` joins, never opens a chromadb client against the corrupt palace, and re-upserts everything into a fresh palace. - `--source PATH` extracts from a corrupt palace already moved aside - `--archive-existing` handles the in-place case by renaming the existing palace to `<palace>.pre-rebuild-<timestamp>` first - Partial-rebuild failures raise `RebuildPartialError` with the archive path so users can recover; CLI exits non-zero - In-place mode calls `SharedSystemClient.clear_system_cache()` to drop chromadb's process-wide System registry (cross-palace use does not, to limit blast radius for library callers) - Source validation runs before any destructive moves Verified end-to-end recovering a 52,300-row real-world corrupt palace. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-06 04:36:39 -03:00
Igor Lins e Silva	6741b6908e	Merge pull request #1138 from anthonyonazure/fix/bugbear-cleanup-and-endpoint-scheme fix: reject non-http(s) LLM endpoints + clear ruff bugbear/silent-except findings	2026-05-06 04:28:12 -03:00
Anthony Clendenen	ca5899e361	refactor: fix ruff bugbear and silent-except findings - B904: chain OSError/collection errors with "raise ... from e" in normalize.py and searcher.py so the original traceback is preserved. - B007: rename unused loop variables to _name in dedup, dialect, layers, and room_detector_local. - S110/S112: replace bare "try/except/pass" and "try/except/continue" with logger.debug(..., exc_info=True) in mcp_server, searcher, palace, palace_graph, miner, convo_miner, and fact_checker so background failures are observable without changing behaviour. A module-level logger ("mempalace_mcp", matching mcp_server/searcher) is added to the five files that didn't already have one. Configured ruff checks (E/F/W/C901) and ruff --select B, S110, S112 all pass. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-06 04:12:09 -03:00
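
The skip-if-unchanged gate described in commit 0d1c1fbcaa (compare sha256 of the diary text, falling back to the size-only check when no prior hash is recorded) might look roughly like the sketch below. This is an illustration only, not the project's code: the function names and the shape of the state entry (`size`, `content_hash` keys) are assumptions made for the example.

```python
import hashlib
from typing import Optional


def content_hash(text: str) -> str:
    """Hex SHA-256 of the UTF-8 encoded diary text."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def is_unchanged(text: str, prev: Optional[dict]) -> bool:
    """Return True when a diary can be skipped on this run.

    `prev` is a hypothetical state entry such as
    {"size": 13, "content_hash": "..."}; older entries may lack the hash.
    """
    if prev is None:
        return False  # never ingested before
    prev_hash = prev.get("content_hash")
    if prev_hash is None:
        # Legacy state: only the byte length is known, so compare sizes.
        return prev.get("size") == len(text.encode("utf-8"))
    # Strict path: any edit changes the hash, even a same-length one.
    return prev_hash == content_hash(text)


old = "Teh quick fox"
new = "The quick fox"  # same byte length, different content
state = {"size": len(old.encode("utf-8")), "content_hash": content_hash(old)}
legacy_state = {"size": len(old.encode("utf-8"))}

print(is_unchanged(old, state))         # hash matches, skip
print(is_unchanged(new, state))         # hash differs, re-ingest
print(is_unchanged(new, legacy_state))  # size-only path cannot see the edit
```

The last call shows why the follow-up commit 2ff6283b32 writes the computed hash back into legacy state: until that happens, a same-size edit still passes the size-only comparison.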


830 Commits