414aa3e20b12b02d72b37ac9093b9e9a6dfcf4c7
22 Commits
| Author | SHA1 | Message | Date | |
|---|---|---|---|---|
|
|
025dd03047 |
Merge pull request #1177 from jphein/fix/blob-seq-marker-guard
fix: skip _fix_blob_seq_ids sqlite open on already-migrated palaces (#1090) |
||
|
|
247744296d |
fix(blob-seq-marker): tests + style nit per @igorls #1177 review
Three new tests cover the marker contract: - test_fix_blob_seq_ids_writes_marker_after_blob_path — marker is written after a successful BLOB → INTEGER conversion. - test_fix_blob_seq_ids_writes_marker_when_already_integer — marker is written even on a no-op (already-INTEGER) path. The point of the marker is to skip sqlite3.connect() on subsequent calls, not to record that a conversion happened. - test_fix_blob_seq_ids_skips_sqlite_when_marker_present — verifies the load-bearing property via monkeypatched sqlite3.connect: when the marker exists, the function never opens sqlite. This is the bug #1090 fix — opening Python's sqlite3 against a live ChromaDB 1.5.x WAL DB corrupts the next PersistentClient call, and we must never do it again post-migration. Also folded in @igorls's style nit: - Path(marker).touch() instead of open(marker, "a").close(). Same effect, reads cleaner. Moved Path import to the top of the module since it didn't yet exist there. 35/35 backend tests pass. The "Production: fresh MCP server + stop hook + mempalace mine subprocess" checkbox in the PR body is checked in the PR reply — the canonical 151K palace has been running this fix since #1177 was filed (via Syncthing-replicated source on the disks daemon at v1.7.0); zero PersistentClient corruption since. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> |
||
|
|
942aae3ed5 |
fix(hnsw): address @igorls's #1173 review
Restore three `_pin_hnsw_threads` tests that the previous integrity-gate commit deleted. The function is still live code on develop (defined at chroma.py:207, called from chroma.py:705 + mcp_server.py), so the deletion left the import unused (ruff F401) and dropped coverage on a function unrelated to this PR's scope. Restored verbatim from main. Plus three nits @igorls flagged: - **Thread-safety doc**: `_quarantined_paths` mutation is lock-free; documented that idempotency of `quarantine_stale_hnsw` is the safety property (concurrent same-palace calls produce a benign redundant rename attempt that no-ops, no need for a lock). - **Pickle protocol assumption**: `_segment_appears_healthy` requires PROTO ≥ 2 (`0x80`). Documented; matches what chromadb writes today, and a future protocol-0/1 emission would conservatively quarantine + lazy-rebuild rather than mis-classify as healthy. - Class-level vs module-level scope: keeping class-level — the conftest reset is the controlled case, and module-level wouldn't remove the foot-gun, just relocate it. Conftest reset documented in the existing comment is the right pattern for test isolation. Style nit (`Path(marker).touch()` vs `open(marker, "a").close()`) deferred — that pattern lives in #1177's territory, not #1173's. 37/37 tests pass on the PR branch. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> |
||
|
|
74ff5e6b98 |
fix(hnsw): integrity gate in quarantine_stale_hnsw — corruption vs flush-lag
Previous: quarantine fired whenever sqlite_mtime - hnsw_mtime exceeded the (lowered, in #1173) 300s threshold. ChromaDB 1.5.x flushes HNSW asynchronously and a clean shutdown does not force-flush, so the on- disk HNSW is *always* meaningfully older than chroma.sqlite3 — that's the steady state, not corruption. Quarantine renamed valid HNSW segments on every cold-start, chromadb created empty replacements, vector recall went to 0/N until rebuild. Confirmed in production on the disks daemon journal, 2026-04-26 06:56:45: three of three HNSW segments quarantined on cold-start with 538-557s mtime gaps (post-clean-shutdown flush lag), leaving a 151,478-drawer palace with vector_ranked=0. Drift directories at *.drift-20260426-065645/ each contained a complete 253MB data_level0.bin plus 18MB index_metadata.pickle — clearly healthy indexes, renamed by the false-positive heuristic. Fix: two-stage gate. 1. mtime gate (existing) — gap > stale_seconds is necessary. 2. integrity gate (new) — sniff index_metadata.pickle for chromadb's expected protocol/terminator bytes (PROTO 0x80 head, STOP 0x2e tail) and a non-trivial size, WITHOUT deserializing the file. Healthy segment with mtime drift → keep in place; truncated / zero-filled / partial-flush → quarantine. Format-sniff is deliberately non-deserializing — pickle deserialization can execute arbitrary code, and the PROTO+STOP byte presence + size floor is sufficient to distinguish a complete chromadb write from truncation, zero-fill, or a partial flush during process kill. Real load failures (the rare case where the bytes look right but chromadb fails to load) still surface to palace-daemon's _auto_repair, which calls quarantine_stale_hnsw directly on observed HNSW errors and bypasses this gate. The cold-start gate from 70c4bc6 (row 24) remains as a perf optimization — even with the integrity check, repeating the sniff on every reconnect is unnecessary work — but its load-bearing role is now covered by this deeper fix. 4 new tests in test_backends.py: - test_quarantine_stale_hnsw_renames_corrupt_segment (drift + bad meta) - test_quarantine_stale_hnsw_leaves_healthy_segment_with_drift_alone (drift + valid meta — the production case at 06:24) - test_quarantine_stale_hnsw_leaves_segment_without_metadata_alone (fresh / never-flushed, no meta file) - test_quarantine_stale_hnsw_renames_truncated_metadata (under-floor size, partial-flush shape) Existing test_quarantine_stale_hnsw_renames_drifted_segment renamed to renames_corrupt_segment with explicit corrupt meta_bytes — the old "renames any drift" contract is gone. Suite 1366/1366 pass. Coordinated cross-repo with palace-daemon's auto-repair-on-startup workaround (separate agent's commit ed3a892). With this fork-side fix the auto-repair becomes belt-and-suspenders; the structural cause of empty-HNSW-on-restart is addressed at the quarantine layer. CLAUDE.md row 26 + README fork-change-queue row + test count 1363→1366. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> |
||
|
|
e5e7a57930 |
fix(hnsw): gate quarantine_stale_hnsw to cold-start, not every reconnect
Symptom on the canonical disks daemon: drift quarantines firing every 10–30 minutes throughout the day under steady write load. Logs show .drift-* directories accumulating despite the daemon being the only writer (no Syncthing replication of palace data). Root cause is a false-positive thrash in the quarantine heuristic: - chroma.sqlite3 mtime bumps on every write (millisecond cadence). - HNSW segment files (data_level0.bin) only flush to disk on chromadb's internal cadence, which can lag minutes behind sqlite under continuous write load. Once the gap exceeds the 300s threshold, quarantine_stale_hnsw renames a perfectly valid HNSW segment, chromadb rebuilds it from scratch, and the cycle repeats as soon as the next batch of writes lands. The 300s threshold (lowered from 3600s in PR #1173 after a 0.96h-drift production segfault) is correct for the cross-machine-replication failure mode it was designed for, but wrong for a daemon-strict deployment whose only "drift" source is its own benign flush lag. Fix: gate the proactive quarantine check to the first ``make_client()`` invocation per palace per process (``ChromaBackend._quarantined_paths`` set). Real cold-start drift (replication, partial restore, crashed-mid- write) still gets caught — that's exactly when a fresh daemon process opens the palace. Real runtime drift on observed HNSW errors still gets caught via palace-daemon's ``_auto_repair`` which calls ``quarantine_stale_hnsw`` directly, bypassing this gate. Two new tests in test_backends.py verify single-fire-per-palace and per-palace independence. Conftest clears the gate between tests. Suite 1362/1362 pass. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> |
||
|
|
db80f6e26c |
fix: call quarantine_stale_hnsw() in make_client(); lower threshold to 5min
make_client() called _fix_blob_seq_ids but skipped quarantine_stale_hnsw, so every fresh process (stop hook, precompact hook, CLI) opened a drifted palace and segfaulted in chromadb_rust_bindings before any write-path protection could fire. #1062 wires the quarantine call at MCP server startup (covers long-lived server processes). This fix adds it to make_client() itself — the call site that all callers (server, hooks, CLI, tests) pass through — so every fresh PersistentClient open is protected regardless of entry point. Also lowers stale_seconds default from 3600 to 300: a 0.96h-drifted segment caused production segfaults before the 1h threshold fired. ChromaDB's HNSW flush cadence means legitimate drift is seconds to low minutes; 5min gives headroom without admitting clearly corrupt segments. |
||
|
|
bc24aa14e2 |
fix: skip _fix_blob_seq_ids sqlite open on already-migrated palaces (#1090)
Opening chroma.sqlite3 via Python's sqlite3.connect() against a live ChromaDB 1.5.x WAL-mode database leaves state that segfaults the next PersistentClient call — the same failure mode tracked at #1090. _fix_blob_seq_ids runs unconditionally on every make_client() call, so every fresh process (MCP server, stop hook, CLI) re-triggers the sqlite open → corrupt → segfault cycle on palaces that have already completed the 0.6.x → 1.5.x seq_id migration. Guard with a .blob_seq_ids_migrated marker file in the palace directory: - If marker exists, return immediately — skip sqlite entirely - After successful migration (or confirmation that no BLOBs remain), write the marker so subsequent opens take the fast path - Palaces that never had BLOB seq_ids also get the marker on first open, so they too avoid the redundant sqlite open after that - Already-migrated palaces can touch the marker manually to opt in Test plan: Direct test — run _fix_blob_seq_ids twice against a fresh palace; second call returns immediately because marker exists. 1094 existing tests pass. |
||
|
|
7773432bca |
chore(rebase): reconcile with develop and apply ruff format
After rebasing onto current develop: - chroma.py: keep develop's quarantine_stale_hnsw + UnsupportedFilterError validation alongside this PR's _pin_hnsw_threads retrofit. - tests/test_backends.py: combine quarantine_stale_hnsw and _pin_hnsw_threads test sections; ruff format. - miner.py: propagate the new `files=` kwarg (added on develop in #1183 for the init -> mine flow) through _mine_impl so the caller can pass a pre-scanned file list under the global lock. |
||
|
|
8df944a54d |
fix: best-effort HNSW thread-pin retrofit + drop dead attempt-cap constant
Addresses remaining PR #976 review items after rebase on develop. `get_collection(create=False)` previously returned existing collections without re-applying `hnsw:num_threads=1`, so palaces created before the fix kept the unsafe parallel-insert path. Add `_pin_hnsw_threads()` helper that calls `collection.modify(configuration=UpdateCollectionConfiguration( hnsw=UpdateHNSWConfiguration(num_threads=1)))` best-effort on every `get_collection` call (including the MCP server's `_get_collection`). In chromadb 1.5.x the runtime config does not persist to disk across `PersistentClient` reopens, so the retrofit is re-applied each process start rather than being a one-shot migration. Fresh palaces keep the metadata-based pin as primary defense; legacy palaces now also get per-session protection without requiring `mempalace nuke` + re-mine. After the rebase on develop, `hook_precompact` delegates to `_mine_sync` and no longer emits `decision: block`, so the attempt-cap constant was orphaned. Grep confirms 0 usages in the repo — remove it. - `_pin_hnsw_threads` retrofits legacy collection (num_threads None -> 1) - `_pin_hnsw_threads` swallows all errors (never raises) - `ChromaBackend.get_collection(create=False)` applies retrofit on legacy palace - 62 tests pass (10 backends + 6 palace locks + 46 hooks_cli) |
||
|
|
7e18a70796 |
fix: resolve hooks_cli.py merge conflict + add mine_global_lock tests
- Resolve UU conflict in hooks_cli.py: take develop/HEAD approach (mine synchronously via _mine_sync, then pass through unconditionally). _mine_sync already catches subprocess.TimeoutExpired — fixes Copilot #1. - Add tests/test_palace_locks.py: 4 tests covering mine_global_lock non-blocking semantics (acquire, second-acquire raises MineAlreadyRunning, reusable after release, release on exception) — fixes Copilot #4. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> |
||
|
|
133dfbfb41 |
fix(search): BM25 hybrid rerank, legacy-metric warning, invariant tests
Three tightly-coupled search-quality fixes for v3.3.3: 1. CLI `mempalace search` now routes through the same `_hybrid_rank` the MCP path already used. Drawers whose text contains every query term but embed as file-tree noise (directory listings, diffs, log fragments) were scoring cosine distance >= 1.0 — the display formula `max(0, 1 - dist)` then floored every result to `Match: 0.0`, with no way for the user to tell a lexical match from a total miss. BM25 catches these cleanly; the display surfaces both `cosine=` and `bm25=` so users see which component is firing. 2. Legacy-palace distance-metric warning. Palaces created before `hnsw:space=cosine` was consistently set silently use ChromaDB's default L2 metric, which breaks the cosine-similarity formula (L2 distances routinely exceed 1.0 on normalized 384-dim vectors). The search path now detects this at query time and prints a one-line notice pointing at `mempalace repair`. Only fires for legacy palaces; new palaces already set cosine correctly. 3. Invariant tests pinning `hnsw:space=cosine` on every collection- creation path — legacy `get_or_create_collection`, legacy `create_collection`, RFC 001 `get_collection(create=True)`, the public `palace.get_collection`, and a round-trip through reopen. Locks down the correctness that new-user palaces already have so a future refactor can't silently regress it. Also adds a `metadata` property to `ChromaCollection` so callers can read the underlying hnsw:space without reaching into `_collection`. Tests: - New regression: simulate three candidates at distance 1.5 (cosine=0), one containing query terms — must rank first with non-zero bm25. - New: legacy metric (empty or non-cosine) produces stderr warning. - New: correctly-configured palace produces no warning. - New: all five creation paths pin cosine metadata. All existing tests still pass. |
||
|
|
a4868a3589 |
perf(mining): batch per-chunk upserts and add optional GPU acceleration
The miner upserted one drawer per ChromaDB call, paying tokenizer + ONNX session setup per chunk. The embedding device was CPU-only because no EmbeddingFunction was ever wired through the backend. Two changes, each a speedup in its own right; stacked they give ~10x end-to-end on a medium corpus (20 files, 568 drawers): 1. Batched upsert. `process_file` and `_file_chunks_locked` now collect all chunks of a file into a single `collection.upsert(...)` so the embedding model runs one forward pass per file instead of N. 2. Hardware-accelerated embedding function. New `mempalace/embedding.py` wraps `ONNXMiniLM_L6_V2` with configurable `preferred_providers`. `MEMPALACE_EMBEDDING_DEVICE` (or `embedding_device` in config.json) selects auto / cpu / cuda / coreml / dml. Unavailable accelerators log a warning and fall back to CPU. The factory subclasses `ONNXMiniLM_L6_V2` and spoofs its `name()` to `"default"` so the persisted EF identity matches existing palaces created with ChromaDB's bare `DefaultEmbeddingFunction` -- same model, same 384-dim vectors, no rebuild needed when turning GPU on. `ChromaBackend.get_collection` / `create_collection` now pass the resolved EF on every call so miner writes and searcher reads agree. Benchmarks (i9-12900KF + RTX 3090, medium scenario, 568 drawers): per-chunk + CPU 19.77s · 29 drw/s (baseline) batched + CPU 8.07s · 70 drw/s (2.4x) batched + CUDA 2.15s · 264 drw/s (9.2x) Reproducible via `benchmarks/mine_bench.py`. Install paths: pip install mempalace[gpu] # NVIDIA CUDA pip install mempalace[dml] # DirectML (Windows) pip install mempalace[coreml] # macOS Neural Engine Mine header now prints `Device: cpu|cuda|...` so users can confirm the accelerator engaged. |
||
|
|
035fe6d658 |
fix(llm): tighter refinement — word boundaries, JSON extraction, authoritative sources
Addresses issues found while reviewing the initial phase-2 implementation against real data: **Bug: uncertain bucket starved from the LLM.** `discover_entities` was dropping the regex-uncertain bucket whenever real git/manifest signal existed — which is exactly when `--llm` is most useful for cleaning up prose noise. The uncertain candidates never reached the refinement step. Fixed: only drop when `llm_provider is None`. **Context collection: word boundaries, not substring.** `_collect_contexts` used substring matching on lower-cased lines, so the name "Go" matched "good", "going", "forgot". Switched to a `(?<!\w)…(?!\w)` regex so short names only match at token boundaries. **Authoritative-source detection replaces confidence threshold.** Previously the refinement step skipped entries with `confidence >= 0.95` to avoid second-guessing manifest-backed projects. That threshold was fragile — the regex detector produces 0.99 confidence for things like `code file reference (5x)` on framework names (OpenAPI, etc.), so those skipped the LLM despite being regex-only noise. New helpers `_is_authoritative_person` / `_is_authoritative_project` look at the actual signal strings (commits, package.json, etc.) to decide. **Now also refines regex-derived people.** After #1148's high-pronoun-signal fix, the regex detector can promote non-people to the `people` bucket (e.g. a capitalized common noun that happened to appear near pronouns). The LLM now gets a chance to clean those up, while git-authored people are still skipped. **Robust JSON extraction.** Small local models routinely wrap JSON output in prose ("Sure, here's the classification: {…}"). The previous code-fence stripper failed on that. `_extract_json_candidates` now does balanced-bracket extraction with string-aware quote handling, so it recovers JSON from: - raw responses - markdown fenced blocks - JSON embedded inside surrounding text - multiple candidate objects/arrays **Prompt guidance for frameworks vs user projects.** Added an explicit instruction: frameworks, runtimes, APIs, cloud services, and third-party vendors (Angular, OpenAPI, Terraform, Bun, Google, etc.) are TOPIC unless the context clearly says it's the user's own codebase. Directly addresses a false-positive pattern observed during dev runs. **Defensive mtime.** `convo_scanner._safe_mtime` catches OSError during `stat()` — permission changes, filesystem races, broken symlinks — and sorts the affected file to the end of the newest-first order rather than crashing the scan. **Cosmetic:** merged two adjacent f-strings on the same line in `backends/chroma.py` and `llm_client.py` (no behaviour change). 15 new tests cover the OSError fallback, word-boundary matching, JSON extraction variants, authoritative-source helpers, refining high- confidence regex projects, and end-to-end LLM refinement preserving the uncertain bucket. |
||
|
|
65b17a6e0f |
fix: address Copilot review on release/3.3.2
Non-ASCII glyphs (regression of the #681 class of Windows UnicodeEncodeError): - mempalace/cli.py: "✗" → "ERROR:", "⚠" → "WARNING:", em dash → "-" - mempalace/sweeper.py: "⚠" → "WARNING:" Backend arg validation: - mempalace/backends/chroma.py: `_normalize_get_collection_args` now raises TypeError on unexpected trailing positional args instead of silently dropping them — surfaces call-site bugs early. Docs site: - website/.vitepress/config.mts: gate Google Analytics scripts behind MEMPALACE_DOCS_GA_ID env var (default off). Self-hosters no longer get GA injected unconditionally. Landing page SPA hygiene: - website/.vitepress/theme/landing/useLandingEffects.js: collect all IntersectionObserver disconnects and removeEventListener thunks in a shared `cleanups` registry; drain it in `onBeforeUnmount` so observers and form/replay listeners don't leak across SPA navigations. |
||
|
|
0c38deaab5 |
feat(backends): quarantine_stale_hnsw — recover from HNSW/sqlite drift
Add a helper that renames HNSW segment directories whose `data_level0.bin` is significantly older than `chroma.sqlite3`. Drift between the on-disk HNSW graph and the live embeddings table is the root cause of a segfault class where the Rust graph-walk dereferences dangling neighbor pointers for entries in the metadata segment that no longer exist in the HNSW index, crashing in a background thread on `count()` or `query()`. Issue #823 describes the same drift as a silent-staleness symptom (semantic search returns stale results after `add_drawer` because `data_level0.bin` lags the sqlite metadata under the default `sync_threshold=1000`). Under heavier load or after an interrupted write, the same drift can escalate from "silent stale results" to "SIGSEGV on next open," which is the failure mode observed at neo-cortex-mcp#2 (chromadb 1.5.5, Python 3.12) and acknowledged at chroma-core/chroma#2594. On one 135K-drawer palace where `index_metadata.pickle` claimed 137,813 elements against 135,464 rows in sqlite (2,349-entry drift), fresh Python processes crashed in `col.count()` 17/20 times; after renaming the segment dir out of the way and letting ChromaDB rebuild lazily, the same 20-run check went to 0 crashes. The recovery path #823 suggests (export / recreate / reimport) is heavy — it re-embeds every drawer. This helper is lighter: rename the segment dir so ChromaDB reopens without it, and the indexer rebuilds lazily on the next write. The original directory is renamed (not deleted) so the operator can recover if the heuristic misfires. If `chroma.sqlite3` is more than `stale_seconds` (default 3600) newer than the segment's `data_level0.bin`, the segment is considered suspect. One hour is deliberately conservative — normal HNSW flush cadence is seconds to minutes, so an hour of drift implies a crashed mid-write, not routine lag. - Additive: exposes `quarantine_stale_hnsw(palace_path, stale_seconds)` as a helper. Not wired into `_client()` / startup on this PR — the goal is to land the primitive first so operators and higher layers can opt in. A follow-up could call it automatically on palace open behind an env var or config flag. - Closes #823 by giving operators a first-class recovery path without having to install `chromadb-ops` or re-mine. Four new tests in `tests/test_backends.py`: - renames drifted segment, preserves original files under `.drift-TS` suffix - leaves fresh segments alone - no-op on missing palace path / missing `chroma.sqlite3` - skips already-quarantined (`.drift-` suffixed) directories `pytest tests/test_backends.py` → 11 passed. `ruff check` / `ruff format --check` — clean. |
||
|
|
42b940d263 |
fix(backends): address Copilot review on #995
Four defects surfaced by the automated review, fixed with targeted tests: 1. BaseCollection.update() default now validates that documents / metadatas / embeddings lengths match ids, raising ValueError instead of silently misaligning pairs or raising IndexError (base.py). 2. ChromaCollection.query() now rejects the two ambiguous input shapes up front — neither or both of query_texts / query_embeddings, and empty input lists — with clear ValueError messages rather than delegating to chromadb's less-obvious errors (chroma.py). 3. QueryResult.empty() accepts embeddings_requested=True to preserve the outer-query dimension with empty hit lists when the caller asked for embeddings, matching the spec rule that included fields carry the outer shape even when empty (base.py). ChromaCollection.query() threads this through on the empty-result path (chroma.py). 4. ChromaBackend cache-freshness check now matches the semantics from mcp_server._get_client (merged via #757) on three edge cases Copilot called out: (a) invalidate when chroma.sqlite3 disappears while a cached client is held, (b) treat a 0→nonzero stat transition as a change so a cache built when the DB did not yet exist is refreshed, (c) re-stat after PersistentClient constructs the DB lazily so freshness reflects the post-creation state (chroma.py). Tests: 978 passed (up from 970), 8 new tests covering the fixes. |
||
|
|
a17a8b734a |
refactor(backends): typed QueryResult/GetResult, PalaceRef, BaseBackend registry (RFC 001 §10)
Advances RFC 001 §10 cleanup so backend-author PRs (#574 LanceDB, #665 Postgres, #700 Qdrant, #697 hosted, #643 PalaceStore, #381 Qdrant) have a stable target to align against. Scope (this PR): - Typed QueryResult / GetResult dataclasses replace Chroma's dict shape at the BaseCollection boundary (§1.3). A transitional _DictCompatMixin keeps existing callers working while the attribute-access migration proceeds. - BaseCollection is now kwargs-only across add/upsert/query/get/delete/update with ABC defaults for estimated_count/close/health and a non-atomic default update() (§1.1–1.2). - PalaceRef replaces raw path strings at the backend boundary (§2.2). - BaseBackend ABC with get_collection/close_palace/close/health/detect (§2.3). - mempalace.backends entry-point group + in-tree registry with resolve_backend_for_palace priority order matching §3.2–3.3. - ChromaCollection normalizes chroma returns into typed results; unknown where-clause operators raise UnsupportedFilterError (no silent drop, §1.4). - ChromaBackend absorbs the inode/mtime client-cache freshness check previously duplicated in mcp_server._get_client() (§10 + PR #757). - searcher.py migrated to typed-attribute access as the reference call site; remaining callers land in a follow-up. - pyproject: chroma registered via [project.entry-points."mempalace.backends"]. Out of scope (explicit follow-ups): - Full caller migration off the dict-compat shim across palace.py, mcp_server.py, miner.py, convo_miner.py, dedup.py, repair.py, exporter.py, palace_graph.py, cli.py, closet_llm.py. - Embedder injection + three-state EmbedderIdentityMismatchError check (§1.5). - maintenance_state() / run_maintenance() benchmark hooks (§7.3). - AbstractBackendContractSuite full coverage (§7.1–7.2). - mempalace migrate / mempalace verify CLI rewrites through BaseCollection (§8). Tests: 970 passed (up from 967 on develop); new coverage for typed results, empty-result outer-shape preservation, \$regex rejection, registry lookup, priority resolver, and PalaceRef-kwarg ChromaBackend.get_collection. Refs: #743 (RFC 001), #989 (RFC 002 tracking issue). |
||
|
|
267a644f4f |
refactor: route all chromadb access through ChromaBackend
Prerequisite for RFC 001 (plugin spec, #743). Removes every direct `import chromadb` outside the ChromaDB backend itself so the core modules depend only on the backend abstraction layer. Extends ChromaBackend with make_client, get_or_create_collection, delete_collection, create_collection, and backend_version. Adds update() to the BaseCollection contract. Non-backend callers (mcp_server, dedup, repair, migrate, cli) now go through the abstraction; tests patch ChromaBackend instead of chromadb. With this landed, the RFC 001 spec can be enforced and PalaceStore (#643) can ship as a plugin without touching core modules. |
||
|
|
8dc5970ca9 | Fix: ruff format with CI-pinned version (0.4.x) | ||
|
|
1e86892e62 |
Fix: set cosine distance metadata on all collection creation sites
ChromaDB defaults HNSW index to L2 (Euclidean) distance, but
MemPalace scoring uses 1-distance which requires cosine (range 0-2).
Add metadata={"hnsw:space": "cosine"} to the 4 production and 3 test
call sites that were missing it.
Closes #218
|
||
|
|
abc99f4154 |
fix: auto-repair BLOB seq_ids from chromadb 0.6→1.5 migration (#664)
Note from code review: (1) silent exception swallow on migration failure means caller proceeds with potentially corrupt DB — consider returning a boolean or re-raising in a follow-up. (2) No blob length validation before int.from_bytes — malformed rows could produce wrong seq_id values. Both are edge cases; the fix is still valuable for the common chromadb 0.6→1.5 migration path. |
||
|
|
ae5196bc8d |
Мempalace backend seam (#413)
* refactor: add stage-1 backend abstraction seam Introduce the first upstreamable storage seam for MemPalace without bringing in the PostgreSQL spike or any benchmark artifacts. This change adds a small backend package with: - BaseCollection as the minimal collection contract - ChromaBackend/ChromaCollection as the default implementation It then routes the main runtime collection consumers through that seam: - palace.py - searcher.py - layers.py - palace_graph.py - mcp_server.py - miner.status() Behavioral constraints kept for stage 1: - ChromaDB remains the only backend and the default path - no config/env backend selection yet - no PostgreSQL code - no benchmark or research files - existing tests stay unchanged Important compatibility details: - read paths now call the seam with create=False so they still surface the existing 'no palace found' behavior instead of silently creating empty collections - write paths keep create=True semantics through palace.get_collection() - layers/searcher retain a chromadb module attribute so the existing mock-based tests can keep patching PersistentClient unchanged - ChromaBackend only creates palace directories on create=True, which preserves mocked read-path tests that use fake read-only paths Verification: - python3 -m py_compile mempalace/backends/__init__.py mempalace/backends/base.py mempalace/backends/chroma.py mempalace/palace.py mempalace/searcher.py mempalace/layers.py mempalace/palace_graph.py mempalace/mcp_server.py mempalace/miner.py - pytest -q # 529 passed, 106 deselected * refactor: clean up stage-1 seam compatibility shims Tighten the stage-1 backend abstraction branch after review. This follow-up does three small things: - keep the chromadb compatibility hook in searcher.py and layers.py, but express it through the backends.chroma module so it no longer reads like an accidental unused import - fix the palace_graph.py helper alias to avoid the local name collision flagged by ruff (imported helper vs local _get_collection wrapper) - preserve the existing mock-based test patch points unchanged while keeping the new backend seam intact Why this matters: - the direct form looked like a dead import in review, even though it was intentionally preserving the existing test seam ( and ) - palace_graph.py had a real lint issue ( redefinition) that was small but worth fixing before a public PR Verification: - /opt/homebrew/bin/ruff check mempalace/backends/__init__.py mempalace/backends/base.py mempalace/backends/chroma.py mempalace/palace.py mempalace/searcher.py mempalace/layers.py mempalace/palace_graph.py mempalace/mcp_server.py mempalace/miner.py - pytest -q tests/test_layers.py tests/test_searcher.py - pytest -q # 529 passed, 106 deselected * docs: explain backend shim imports in search paths Add short code comments in searcher.py and layers.py explaining why the module-level `chromadb` alias remains after the stage-1 backend seam refactor. The alias is intentional: it preserves the existing mock patch points used by the current test suite (`mempalace.searcher.chromadb.PersistentClient` and `mempalace.layers.chromadb.PersistentClient`) while the runtime logic now flows through the backend abstraction. This keeps the public PR easier to review because the apparent "unused import" now has an explicit reason next to it. Verification: - /opt/homebrew/bin/ruff check mempalace/searcher.py mempalace/layers.py - pytest -q tests/test_layers.py tests/test_searcher.py * refactor: reuse a default backend instance in palace helper Tighten the stage-1 backend seam by promoting the default Chroma backend adapter to a module-level singleton in `mempalace/palace.py`. This keeps the stage-1 scope unchanged — Chroma is still the only backend wired in this branch — but avoids constructing a fresh `ChromaBackend()` object on every `get_collection()` call. The backend is stateless today, so this is a readability/cleanup change rather than a behavioral one. Why this helps: - makes `palace.get_collection()` read like a real default factory instead of an inline constructor call - keeps the stage-1 branch a little cleaner before opening the public PR - does not widen the backend surface or change any config/runtime behavior Verification: - python3 -m py_compile mempalace/palace.py - pytest -q tests/test_miner.py tests/test_layers.py tests/test_searcher.py - pytest -q # 529 passed, 106 deselected * fix: harden read-only seam behavior and update seam tests Preserve the stage-1 backend abstraction while closing the real read-path regression surfaced in PR review. What changed: - make ChromaBackend.get_collection(create=False) fail fast when the palace directory does not exist instead of letting PersistentClient create it as a side effect - update miner.status() to call get_collection(..., create=False) so status keeps the historical 'No palace found' behavior - remove the temporary chromadb shim aliases from layers.py and searcher.py now that the tests patch the seam directly - add focused tests for the new backends package, including ChromaCollection delegation and ChromaBackend create=True/create=False behavior - retarget layer/searcher tests to patch the backend seam instead of patching chromadb.PersistentClient inside production modules - add a regression test that status() does not create an empty palace when the target path is missing Verification: - ruff check . - uv run pytest -q - uv run pytest -q tests/test_backends.py tests/test_cli.py tests/test_mcp_server.py tests/test_layers.py tests/test_searcher.py tests/test_miner.py Notes: - the separate benchmark/slow/stress layer was started as a soak but not used as the merge gate for this PR branch * refactor: drop duplicate mcp collection cache declaration Remove a redundant `_collection_cache = None` assignment in `mempalace/mcp_server.py` left over after the stage-1 backend seam refactor. This does not change behavior; it only trims review noise in the MCP server module after the read-path hardening pass. Verification: - ruff check mempalace/mcp_server.py - uv run pytest -q tests/test_mcp_server.py --------- Co-authored-by: Sergey Kuznetsov <sergey@iterudit.com> |